
June 11, 2019

Why people will beat machines in recognizing speech for a long time

by Patrycja Strycharczuk, The Conversation


Imagine a world in which Siri always understands you, Google Translate works perfectly, and the two of them create something akin to a Doctor Who style translation circuit. Imagine being able to communicate freely wherever you go (not having to mutter in school French to your Parisian waiter). It's an attractive, but still distant prospect. One of the bottlenecks in moving this reality forward is variation in language, especially spoken language. Technology cannot quite cope with it.

Humans, on the other hand, are amazingly good at dealing with variations in language. We are so good, in fact, that we really take note when things occasionally break down. When I visited New Zealand, I thought for a while that people were calling me "pet," a Newcastle-like term of endearment. They were, in fact, just saying my name, Pat. My aha moment happened in a coffee shop ("Flat white for pet!" gave me pause).

This story illustrates how different accents of English have slightly different vowels—a well-known fact. But let's try to understand what happened when I misheard the Kiwi pronunciation of Pat as pet. There is a certain range of sounds that we associate with vowels, like a or e. These ranges are not absolute. Rather, their boundaries vary, for instance between different accents. When listeners fail to adjust for this, as I did in this case, the mapping of sound to meaning can be distorted.
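To make the idea of shifting boundaries concrete, here is a minimal sketch in Python. It assumes that a vowel can be summarized by a single acoustic number, the first formant (F1), which tends to be higher for a than for e. The boundary values, the accent labels and the function name are all invented for illustration; they are not measurements from any real accent.

```python
# Illustrative sketch only: the numbers below are made up, not measured.
# Assumption: a vowel token is summarized by its first formant (F1, in Hz),
# and each accent places the boundary between "e" and "a" at a different F1.
BOUNDARY_HZ = {
    "speaker_accent": 550.0,   # hypothetical boundary used by the speaker's accent
    "listener_accent": 650.0,  # hypothetical boundary the listener is used to
}


def classify_vowel(f1_hz: float, accent: str) -> str:
    """Label a token 'a' (as in Pat) if its F1 is at or above the
    accent's boundary, otherwise 'e' (as in pet)."""
    return "a" if f1_hz >= BOUNDARY_HZ[accent] else "e"


# One and the same sound, interpreted with two different boundaries.
token_f1 = 600.0
intended = classify_vowel(token_f1, "speaker_accent")
perceived = classify_vowel(token_f1, "listener_accent")
print(f"intended: P{intended}t, perceived: p{perceived}t")
```

With these made-up numbers, the same sound falls on the a side of the speaker's boundary and on the e side of the listener's, which is the Pat/pet confusion in miniature. A listener who adjusts is, in effect, swapping in the speaker's boundary.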

One could, laboriously, teach different accents to a speech recognition system, but accent variation is just the tip of the iceberg. Vowel sounds can also vary depending on our age, gender, social class, ethnicity, sexual orientation, level of intoxication, how fast we are talking, whom we are talking to, whether or not we are in a noisy environment … the list just goes on, and on.

The crux/crooks of the matter

A recent study I was involved in showed that even moving house (or not) can affect one's vowels. Specifically, there is a correlation between how speakers of Northern English pronounce the vowel in words like crux and how many times they have moved in the last decade. People who have not moved at all are more likely to pronounce crux the same as crooks, which is the traditional Northern English pronunciation. But those who have moved four times or more are more likely to have different vowels in the two words, as speakers in the south of England do.

There is, of course, nothing about the act of moving that causes this. But moving house multiple times is correlated with other lifestyle factors, for instance interacting with more people, including people with different accents, which might influence the way we speak.

Other sources of variation may have to do with linguistic factors, such as word structure. A striking example comes from pairs of words such as ruler, meaning "measuring device" and ruler, meaning "leader."

These two words are superficially identical, but they differ at a deeper structural level. A rul-er is someone who rules, just like a sing-er is someone who sings, so we can analyze these words as consisting of two meaningful units. In contrast, ruler meaning "measuring device" cannot be decomposed further.

It turns out that, for many speakers of Southern British English, the two meanings of ruler are pronounced with different vowels, and the difference between the two words has increased in recent years: it is larger for younger speakers than it is for older speakers. So both hidden linguistic structure and speaker age can affect the way we pronounce certain vowels.

End never in sight

This illustrates another important property of language variation: it keeps changing. Language researchers therefore constantly have to review their understanding of variation, which in turn requires continuing to acquire new data, and updating the analysis. The way we do this in linguistics is being revolutionized by new technologies, advances in instrumental data analysis, and the ubiquity of recording equipment (in 2018, 82% of the UK adult population owned a recording device, otherwise known as a smartphone).

Modern-day linguistic projects can profit from these technological advances in various ways. For instance, the English Dialects App collects recordings remotely via smartphones to build a large and constantly updated corpus of modern-day English accents. That corpus is the source of the finding concerning the vowel in crux in Northern English, for example. Accumulating information from this and many other projects allows us to track variation with increased coverage, and to build ever more accurate models predicting the realization of individual sounds.

Can this newly refined linguistic understanding also improve speech recognition technology? Perhaps, but in order to improve, the technology needs to know a lot more about you.

Provided by The Conversation

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Citation: Why people will beat machines in recognizing speech for a long time (2019, June 11) retrieved 19 April 2024 from https://phys.org/news/2019-06-people-machines-speech.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
