Exploring the Boundaries of Passive Listening in Voice Assistants
Voice assistants in various forms, whether stand-alone devices or assistants built into smartphones, are becoming increasingly popular among consumers. Today, these systems respond only when directly addressed with a specific wake word, such as “Alexa,” “Siri,” or “Ok Google.” With advances in speech recognition, however, the next generation of voice assistants is expected to listen continuously to the acoustic environment and proactively offer services and recommendations based on human conversations or other audio signals, without being explicitly invoked. This is referred to as “passive listening.” For instance, a future voice assistant could recognize that a user is discussing dinner plans and suggest updating the calendar, inviting friends, or making a reservation. While such services could be very useful, constant listening raises privacy concerns that may hinder consumer adoption of these new devices.

Through a series of interviews, surveys, and experience sampling studies with users of current voice assistants, this project will apply the Theory of Contextual Integrity to build a classifier that detects when future devices should dynamically apply privacy controls. The researchers will address several questions: 1) How do users expect their voice data to be used? Specifically, what services do they expect the device to provide after listening to their conversations? 2) What are users’ privacy expectations, and what factors shape those expectations? Understanding these factors will let platforms monitor them systematically and allow or deny information flows (i.e., voice data) accordingly.
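To make the contextual-integrity framing concrete, the decision the project envisions can be pictured as a rule that allows or denies an information flow based on its contextual parameters (information type, recipient, purpose). The sketch below is purely illustrative: the norms, parameter names, and lookup-table approach are assumptions for exposition, not the project’s actual classifier, which would be learned from the interview and survey data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InformationFlow:
    """A candidate voice-data flow, described by contextual-integrity parameters.

    All field names and example values are hypothetical illustrations.
    """
    sender: str      # whose speech is involved, e.g. "user"
    recipient: str   # where the data would go, e.g. "calendar_app"
    info_type: str   # topic of the conversation, e.g. "dinner_plans"
    purpose: str     # why the flow occurs, e.g. "reservation_suggestion"

# Illustrative stand-in for norms that would be elicited from user studies:
# (info_type, recipient, purpose) combinations users deem acceptable.
ACCEPTABLE_NORMS = {
    ("dinner_plans", "calendar_app", "reservation_suggestion"),
    ("dinner_plans", "restaurant_service", "reservation_suggestion"),
}

def decide(flow: InformationFlow) -> str:
    """Allow a flow only if it matches a known acceptable norm; deny otherwise."""
    return "allow" if (flow.info_type, flow.recipient, flow.purpose) in ACCEPTABLE_NORMS else "deny"

# A flow matching user expectations is allowed; an unexpected one is denied.
print(decide(InformationFlow("user", "calendar_app", "dinner_plans", "reservation_suggestion")))  # allow
print(decide(InformationFlow("user", "ad_network", "dinner_plans", "advertising")))               # deny
```

The point of the sketch is the shape of the decision, not its mechanics: a real system would replace the fixed lookup table with a classifier trained on the factors that the studies identify as shaping users’ privacy expectations.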
This research is funded by a 2019 Mozilla Research grant.