Using Voice Assistants to Conduct Qualitative Interviews


At the ESOMAR Asia-Pacific 2019 conference in Macao, Anne-Marie Moir described her proof of concept using voice assistants to conduct interviews. Anne-Marie owns the firm Consumer Behaviour in Melbourne and has been a qualitative researcher for 25 years. Her hypothesis is that “an AI powered voice tool is capable of eliciting a richer response than a text based survey tool and that it is able to extract key data from an unstructured voice response and ask a follow-up question.”

Why did she attempt to prove this? “You can’t help but sense that we are all getting used to talking to our cars and our voice assistants. I wanted to try this to offer a better experience for respondents, who would prefer it to ticking a box.”

She created a minimal viable concept that used an AI-powered voice bot to ask an open-ended question, extract key data from the answer, then ask a meaningful follow-up question. She collected 44 responses this way, which she contrasted with 44 responses collected from a Google Survey.

Her discussion guide was simple:

  1. Tell me about what you had for dinner last night.
  2. Why did you have {$Q1} last night?

“This is a benign question to ask and is typical of how I might start a discussion around dinner time. Always start very broad: people can answer it in as much or as little detail as desired.” One reason for this question was that Anne-Marie had used it many times in past groups and so knew what sort of data to expect.

The key technical challenge was having the system recognize possible meals in order to identify which response to pipe into the second question. For instance, one participant answered, “I had an omelet for dinner last night.” In this case, the system had to recognize that “omelet” was the key word to use in the follow-up question. Anne-Marie trained the system with 200 “entities” – common meal items.
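The mechanics can be sketched in a few lines. This is a minimal, hypothetical stand-in for the trained voice-bot pipeline, not Anne-Marie's actual system: a toy entity list replaces the 200 trained entities, and a simple longest-match lookup replaces the AI model's extraction step. It shows how a recognized entity gets piped into the `{$Q1}` slot of the follow-up question.

```python
import re
from typing import Optional

# Hypothetical entity list -- the real system was trained on ~200 meal items.
MEAL_ENTITIES = [
    "omelet", "fried rice", "vegetable biryani",
    "chicken burger", "noodle soup", "burger", "pizza",
]

def extract_meal(utterance: str) -> Optional[str]:
    """Return the longest known meal entity found in the utterance, if any."""
    text = utterance.lower()
    matches = [e for e in MEAL_ENTITIES
               if re.search(r"\b" + re.escape(e) + r"\b", text)]
    return max(matches, key=len) if matches else None

def follow_up(utterance: str) -> str:
    """Pipe the extracted entity into the second question ({$Q1} in the guide)."""
    meal = extract_meal(utterance)
    if meal is None:
        # No entity recognized: fall back to a generic probe.
        return "Could you tell me a bit more about your dinner?"
    return f"Why did you have {meal} last night?"
```

Preferring the longest match is what lets “I had a chicken burger and French fries” resolve to “chicken burger” rather than just “burger” – exactly the kind of disambiguation the lengthy, naturalistic answers below made difficult for the proof of concept.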

Unfortunately, the system correctly identified the meal item and asked the appropriate “Why” follow-up question only one time in three. Rarely was this due to problems with speech recognition – “a needle seat from Kmart” (a noodle soup?), “chicken lice”, and “pod vegetable biryani”. Once it mistook “KFC” for a meal item rather than a place, hearing “I had a burger from KFC” and following up with “Why did you have a burger and KFC last night?” More often, the problems were due to lengthy, naturalistic responses:

  • “I did have a pull up so vegetable pulao Indian dish actually I was about to make some fried rice so I Googled up some recipes so I found this brilliant recipe that uses a single spice so I took to cook it up.”
  • “I had chicken burger and French fries.”
  • “I ordered Fried Chicken through Deliveroo because it was easy and inexpensive and I felt like chicken.”

Machine-learning systems require extensive training and re-training, and these are the types of issues to be addressed when moving beyond a proof of concept. While the accuracy was low in the concept test, it did demonstrate that a voice assistant can “extract key data from an unstructured voice response and ask a follow-up question.”

Clearly the voice assistant encouraged people to speak naturally. Responses to the voice assistant averaged six words, versus 1.3 words for the Google intercept survey. The voice assistant yielded rich responses while the survey produced perfunctory ones: the most common survey answer was an unhelpful “food.”

Her proof of concept validated her hypothesis, and she believes voice assisted interviews have a future in certain applications:

  • Geo-triggered surveys from panelists would already provide three of the five W’s (who, when, and where), with the survey getting to the what and why. Imagine a hard-to-research topic like why people choose one gas station over another (hard to research because it is of low interest to participants, who often don’t give such decisions much thought, and who would need to recall the last time they went to a station). Instead, the geotrigger could prompt them to answer a few questions while they fill up their tank.
  • The classic two-question NPS system could be easily administered via a voice assistant.
  • Respondent-driven feedback could provide a virtual, voice-powered suggestion box, “allowing people to share something when they wanted to, in order to have companies get to know them and their wants and desires better.”
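The NPS idea in particular maps cleanly onto a voice flow, since the classic format is just a 0–10 rating followed by an open “why” probe. As a rough sketch (hypothetical helper names; a real skill would use the platform's slot-filling rather than a regex), the dialogue logic could look like this:

```python
import re
from typing import Optional

def parse_score(utterance: str) -> Optional[int]:
    """Pull a 0-10 rating out of a spoken answer, if one is present."""
    m = re.search(r"\b(10|[0-9])\b", utterance)
    return int(m.group(1)) if m else None

def nps_follow_up(utterance: str) -> str:
    """Classic two-question NPS: score first, then a tailored 'why' probe."""
    score = parse_score(utterance)
    if score is None:
        return "Sorry, could you give me a number from zero to ten?"
    if score >= 9:   # promoter
        return "Great! What do you like most about us?"
    if score >= 7:   # passive
        return "Thanks. What would make it a ten?"
    return "Sorry to hear that. What could we do better?"  # detractor
```

Branching the open-ended probe on the promoter/passive/detractor band is a common refinement; a minimal version could ask the same “why” question regardless of score.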

Interested in learning more about qualitative research? The new Principles Express online course Qualitative Market Research, authored by Jeff Walkowski, is perfect for novices looking for a solid background in how to conduct and evaluate qualitative research.

(Photo credit: Amazon. Used by permission.)

