A Blog by Jonathan Low

 

Oct 31, 2020

The Reason Speech Recognition Software Struggles With Accents And Inflections

In a global economy, technology has to learn to recognize speech patterns other than unaccented English, which is evidently the source of most of its training data. JL

Claudia Lopez reports in Scientific American:

On average, all five programs from leading technology companies, including Apple and Microsoft, showed significant race disparities; they were roughly twice as likely to incorrectly transcribe audio from Black speakers compared with white speakers. There are many culprits for these disparities, but the most likely is the data used for training, which are predominantly from white, native speakers of American English. By using databases that are narrow both in the words that are used and how they are said, training systems exclude accents and other ways of speaking.


“Clow-dia,” I say once. Twice. A third time. Defeated, I say the Americanized version of my name: “Claw-dee-ah.” Finally, Siri recognizes it. Having to adapt our way of speaking to interact with speech-recognition technologies is a familiar experience for people whose first language is not English or who do not have conventionally American-sounding names. I have now stopped using Siri, Apple's voice-based virtual assistant, because of it.

The growth of this tech in the past decade—not just Siri but Alexa and Cortana and others—has unveiled a problem in it: racial bias. One recent study, published in the Proceedings of the National Academy of Sciences USA, showed that speech-recognition programs are biased against Black speakers. On average, the authors found, all five programs from leading technology companies, including Apple and Microsoft, showed significant race disparities; they were roughly twice as likely to incorrectly transcribe audio from Black speakers compared with white speakers.

This effectively censors voices that are not part of the “standard” languages or accents used to create these technologies. “I don't get to negotiate with these devices unless I adapt my language patterns,” says Halcyon Lawrence, an assistant professor of technical communication and information design at Towson University, who was not part of the study. “That is problematic.” For Lawrence, who has a Trinidad and Tobagonian accent, or for me as a Puerto Rican, part of our identity comes from speaking a particular language, having an accent or using a set of speech forms such as African American Vernacular English (AAVE). Having to change such an integral part of an identity to be able to be recognized is inherently cruel.

 
The inability to be understood impacts other marginalized communities, such as people with visual or movement disabilities who rely on voice recognition and speech-to-text tools, says Allison Koenecke, a computational graduate student and first author of the PNAS study. For someone with a disability who is dependent on these technologies, being misunderstood could have serious consequences. There are probably many culprits for these disparities, but Koenecke points to the most likely: the data used for training, which are predominantly from white, native speakers of American English. By using databases that are narrow both in the words that are used and how they are said, training systems exclude accents and other ways of speaking that have unique linguistic features. Humans, presumably including those who create these technologies, have accent and language biases. For example, research shows that the presence of an accent affects whether jurors find people guilty and whether patients find their doctors competent.

Recognizing these biases would be an important way to avoid implementing them in technologies. But developing more inclusive technology takes time, effort and money, and often the decisions to invest them are market-driven. (In response to several queries, only a Google spokesperson responded in time for publication, saying, in part, “We've been working on the challenge of accurately recognizing variations of speech for several years and will continue to do so.”)

Safiya Noble, an associate professor of information studies at the University of California, Los Angeles, admits that it's a tricky challenge. “Language is contextual,” says Noble, who was not involved in the study. “But that doesn't mean that companies shouldn't strive to decrease bias and disparities.” To do this, they need the input of humanists and social scientists who understand how language actually works.

From the tech side, feeding more diverse training data into the programs could close this gap, Koenecke says. Noble adds that tech companies should also test their products more widely and have more diverse workforces so people from different backgrounds and perspectives can directly influence the design of speech technologies. Koenecke suggests that automated speech-recognition companies use the PNAS study as a preliminary benchmark and keep using it to assess their systems over time.
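For readers wondering what such a benchmark might look like in practice, here is a minimal sketch in Python. The article does not name a metric, so this assumes word error rate (the word-level edit distance between a human transcript and the machine transcript, divided by the length of the human transcript), which is a common measure for speech recognition. The function names, group labels and sample sentences are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def mean_wer_by_group(samples):
    """Average the per-sample error rate for each speaker group.

    `samples` is an iterable of (group, reference, hypothesis) tuples.
    """
    rates = defaultdict(list)
    for group, reference, hypothesis in samples:
        rates[group].append(word_error_rate(reference, hypothesis))
    return {group: sum(values) / len(values) for group, values in rates.items()}


# Made-up transcripts, for illustration only.
samples = [
    ("group_a", "call my friend claudia", "call my friend claudia"),
    ("group_b", "call my friend claudia", "call my friend claw dia"),
]
print(mean_wer_by_group(samples))
```

Running the same comparison on the same recordings after each product update would show whether the gap between groups is narrowing, which is the kind of tracking over time that Koenecke describes.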

In the meantime, many of us will continue to struggle between identity and being understood when interacting with Alexa, Cortana or Siri. But Lawrence chooses identity every time: “I'm not switching,” she says. “I'm not doing it.”
