When your persona talks: Mitigating linguistic bias in voice interaction design

12 Apr 2020

—

30min.

Linguistic bias is a problem many people are unaware of. Learn what it is, how it negatively impacts voice and multimodal experiences, and what design practices can help mitigate its effects.

Voice interfaces, such as Apple’s Siri, Google’s Assistant, and Amazon’s Alexa, present a rather intimate way of interacting with software, asking people to use their spoken language, something tied deeply to their identity, to talk with a machine. Yet at the heart of all voice interfaces are forms of linguistic bias that discriminate against classes of people and speakers.

Linguistic bias is a complex phenomenon. At its heart is the notion that an arbitrary variety of a language is superior to all other varieties. As a result, decisions are made that promote that variety, and people who do not accommodate to it or acquire it experience harassment or discrimination. Linguistic bias is pervasive in all modern societies, and unlike other forms of bias, is not usually viewed negatively by the larger society. As one example, in the United States, some regional accents are so heavily stigmatized that people moving outside their region find they have to change how they talk in order to succeed.

There are two primary forms of linguistic bias that negatively impact voice experiences. The first is a bias towards written language. The underlying technology relies on text strings, yet text strings have little to do with how people experience voice interfaces. Voice is ultimately experienced as audio. Moreover, teams working on voice interfaces are highly educated and have a facility with written forms of language that normally do not match the realities of how people listen or talk in spoken discourse. Teams tend not to notice this gap.

The second form of bias is a failure to accommodate the great linguistic diversity of speech communities. Teams tend to rely either on the varieties of language used in their own communities of practice, or on an arbitrarily chosen standard, in the countless layers that make up the linguistic part of a voice experience. Despite amazing advances in speech technology, this bias affects how well a wide range of dialects and speakers are recognized, how much vocabulary is covered, and how comfortable people are talking to voice interfaces. This lack of intra-language diversity also negatively impacts conceptual models and interaction flow.

Design practices that can help mitigate these forms of linguistic bias include a robust user discovery phase, a reliance on recording, transcribing, and analyzing spoken language as well as observational methods for user research, creating diverse characters for voice interfaces, and using systematic techniques that take the bias out of design decisions.