Can You Repeat That?
Query Clarification in Voice User Interfaces
Mar 04, 2019

In a standard text-entry search interface, understanding user intentions is hard enough. But when users can speak their search into a Voice User Interface (VUI), getting accurate results can be even harder. As voice search becomes more prevalent in everyday life, designers and engineers are still working out how to design systems that clarify a user’s intentions without negatively impacting their experience. Here, UX Designer Heather Graham shares new insights about designing search interfaces for VUIs based on recent academic research.

Challenges in Designing Voice User Interfaces

With the popularity of smart speakers, an increasing number of people now interact regularly with voice user interfaces. Since this technology is integrated into smartphones and home systems, users are becoming more accustomed to asking Google to look up movie times, or telling Alexa to turn on the lights. Currently, 43 million Americans own a smart speaker like Amazon Echo or Google Home [1].

Because designing VUIs presents a different set of challenges, academics in Human-Centered Computing, Human-Computer Interaction, and other UX-related fields have begun to study these types of interfaces. With no screen to scan, a VUI designer has to rely on a user’s ability to listen to and remember a list of items. With this in mind, how do we overcome designers’ and developers’ fears of reducing the quality of the user experience or annoying the user? What tactics can we use to verbally clarify ambiguous searches that may have multiple meanings?

Best Practices for Designing Text Search Interfaces

We can start by looking at established best practices for designing text-entry search interfaces that help users clarify and narrow search queries. Designers and developers can refer to these best practices to create usable search solutions.

Use autocomplete dropdowns to anticipate a user’s search.

[Image: Google search window showing autocomplete suggestions]

Auto-correct spelling errors to ensure quality results.

[Image: Google search results for pterodactyl images]

Allow filtering on search results to help users narrow down initial results. Show no more than 10 results per page to avoid overwhelming users. (Users often don’t look past the top 3 before clicking or adjusting their search.)

[Image: Google search results with filters; example query about bidets and toilets]

Provide links to related articles to promote browsing and help users find what they are looking for.

[Image: Google search results with links to related content]

Provide results or links for each meaning of an ambiguous search so that the user’s intended query is covered. (A word or phrase may have multiple meanings. For example, B-52 is an alcoholic beverage, a plane, and a musical group.)

[Image: Google search results for B-52 showing both the band and the bomber]

Keys to Building Functional VUIs

Now, how do we apply these best practices to Voice User Interfaces? The good news is that recent research [2], [3] helps us build a set of design guidelines. Here are some tips for building functional, high-quality VUIs.

Clarify users’ searches and requests.

Though it may seem cumbersome, research shows that users actually like it when the system clarifies their query by repeating it back to them. You can improve the user experience by allowing for clarification, especially when user queries are long and complex [2, pg. 1259].

  • To keep the clarification manageable, limit the list of options read to the user to 3 or fewer.
  • When more than 3 clarification options are necessary, ask users to specify the meaning of their search themselves.
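The two rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_clarification_prompt`, the constant `MAX_SPOKEN_OPTIONS`, and the exact wording of the spoken prompts are hypothetical and do not come from the cited research or from any particular voice platform.

```python
from typing import List

# Limit suggested in the guidelines above; the constant name is hypothetical.
MAX_SPOKEN_OPTIONS = 3


def build_clarification_prompt(query: str, meanings: List[str]) -> str:
    """Return the sentence a voice assistant might speak to clarify `query`."""
    if len(meanings) <= 1:
        # Nothing to disambiguate: simply repeat the query back to the user.
        return f"Searching for {query}."
    if len(meanings) <= MAX_SPOKEN_OPTIONS:
        # Few enough options to hold in memory: read them all out.
        if len(meanings) == 2:
            options = f"{meanings[0]} or {meanings[1]}"
        else:
            options = ", ".join(meanings[:-1]) + f", or {meanings[-1]}"
        return f"Did you mean {query} {options}?"
    # Too many options to list aloud: ask an open question instead.
    return f"There are several meanings of {query}. Which one are you looking for?"


print(build_clarification_prompt("B-52", ["the cocktail", "the plane", "the band"]))
```

With the B-52 example from earlier, the three meanings fit within the limit, so the assistant reads them all out. A query with four or more meanings falls through to the open question.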

Voice User Interfaces need to understand individuals.

VUIs should use machine learning to understand individual users’ voice patterns and habits. Users with accents and users who learned English as a second language find interacting with these interfaces more difficult, so the development of more personalized algorithms will help overcome these obstacles [2, pg. 1259].

Algorithms need to understand the context surrounding a search.

Natural language processing (NLP) needs to be able to understand users’ information needs. Users like to explain why they are searching for things. Build in opportunities for users to give context for their search, and use that context to return more accurate results and narrow down the search [3, pg. 326].
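As a toy illustration of using spoken context to narrow an ambiguous search, the sketch below ranks candidate meanings by keyword overlap with what the user said. Simple keyword matching stands in for real NLP here, and the function name `rank_meanings_by_context` and the keyword sets are hypothetical, chosen only for this example.

```python
from typing import Dict, List, Set


def rank_meanings_by_context(context: str, meanings: Dict[str, Set[str]]) -> List[str]:
    """Order candidate meanings by how many of their keywords occur in the context."""
    # Lowercase the utterance and strip basic punctuation from each word.
    spoken_words = {word.strip(".,!?") for word in context.lower().split()}

    def overlap(name: str) -> int:
        return len(meanings[name] & spoken_words)

    # sorted() is stable, so meanings with equal overlap keep their original order.
    return sorted(meanings, key=overlap, reverse=True)


# Hypothetical keyword sets for the B-52 example.
b52_meanings = {
    "bomber": {"plane", "aircraft", "military"},
    "cocktail": {"drink", "party", "recipe", "bar"},
    "band": {"music", "song", "concert"},
}

ranked = rank_meanings_by_context("I need a drink recipe for a party.", b52_meanings)
print(ranked[0])
```

Here the user's explanation mentions a drink, a recipe, and a party, so the cocktail meaning is ranked first and could be offered before the others.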

Conclusion

Voice user interfaces have been around for a long time, but the introduction of smart speakers has allowed a larger percentage of the population to have frequent interactions with these types of interfaces. By looking at academic research, the UX and development community can gain insights into how people interact with these smart speakers and better understand how to design usable solutions for these devices. Of course, a lot of these practices are easier said than done. In coming years, advancements in NLP and personalized algorithms will continue to make these interactions more natural and improve the overall user experience.

References

[1] Anon. 15 Important Statistics About Smart Speakers in 2018. Retrieved January 23, 2019 from https://www.convinceandconvert.com/content-marketing-research/15-important-statistics-about-smart-speakers-in-2018/

[2] Johannes Kiesel, Arefeh Bahrami, Benno Stein, Avishek Anand, and Matthias Hagen. 2018. Toward Voice Query Clarification. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR ’18). ACM, New York, NY, USA, 1257-1260. DOI: https://doi.org/10.1145/3209978.3210160

[3] Johanne R. Trippas, Damiano Spina, Lawrence Cavedon, and Mark Sanderson. 2017. How Do People Interact in Conversational Speech-Only Search Tasks: A Preliminary Analysis. In Proceedings of the 2017 Conference on Conference Human Information Interaction and Retrieval (CHIIR ’17). ACM, New York, NY, USA, 325-328. DOI: https://doi.org/10.1145/3020165.3022144