
Project Elevate Black Voices is a partnership between Google and Howard University to make it easier for people from the Black community to use voice technology. 

The team has completed data collection, creating a high-quality speech dataset of African American English (AAE) from speakers across the United States. The goal is to reduce racial disparities in automatic speech recognition (ASR) systems, the technology that powers voice assistants.


Howard University will retain ownership of the dataset and its licensing and will serve as steward of its responsible use, ensuring the data benefits Black communities. Google can also use the dataset to improve its own products, ensuring that its tools work for more people.

Howard University logo
Google logo

Our motivation

External studies and Google’s own research have found that Black people in the United States often have a worse experience with automatic speech recognition (ASR) technology than white speakers do, underscoring the need for a technical solution to this disparity in word error rates.

We've learned that Black users, in particular, shift their speech away from AAE in order to be understood by voice products. This practice is called "code-switching": people move away from the way they naturally speak so that the technology understands them.

We identified a number of barriers to improving ASR performance. One issue was the lack of natural AAE speech in existing speech data. Because Black users have been implicitly conditioned to change their voices when using ASR-based technology, in-product data rarely contains organic speech. Security and user privacy policies also serve as a self-imposed, positive constraint on collecting AAE speech data.

Even when data is available, in-product AAE data is difficult to use. Although we’ve made strides in identifying AAE data with dialect classifiers as a first step toward improving the technology, code-switching leaves AAE underrepresented in that data, and what remains is insufficient to address the challenge. We realized a novel approach was necessary to build a new, high-quality dataset of unaccommodated AAE for improving Black users’ experiences with ASR technology.

About AAE

African American English (AAE), also called African American Vernacular, Black English, Black talk, or Ebonics, is the rich, historically rooted language that Black people speak across the United States.

Black English has roots in Southern culture and has morphed into the colorful way that Black people and others have learned to address each other. It is seen as a linguistic celebration of Black heritage and is a language of home, safety, community, and joy.

AAE: African American English, African American Vernacular, Black English, Ebonics.