AI can predict whether your relationship will last based on how you speak to your partner

September 29, 2017 by Ian McLoughlin, The Conversation
I’m TALKING. Credit: Roman Samborskyi/Shutterstock

Any child (or spouse) who has been scolded for their tone of voice – such as shouting or being sarcastic – knows that the way you speak to someone can be just as important as the words that you use. Voice artists and actors make great use of this – they are skilled at imparting meaning in the way that they speak, sometimes much more than the words alone would merit.

But just how much information is carried in our tone of voice and conversation patterns, and how does that impact our relationships with others? Computational systems can already establish who people are from their voices, so could they also tell us anything about our love life? Amazingly, it seems they can.

New research, just published in the journal PLOS ONE, has analysed the vocal characteristics of 134 couples undergoing therapy. Researchers from the University of Southern California used computers to extract standard speech analysis features from recordings of therapy session participants over two years. The features – including pitch, variation in pitch and intonation – all relate to voice aspects like tone and intensity.
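To give a flavour of what that kind of feature extraction looks like in practice, here is a minimal sketch in Python. It is not the researchers' actual pipeline: it assumes the librosa library, audio already separated per speaker into individual files, and a handful of hypothetical summary statistics standing in for the "standard speech analysis features" described above.

```python
# A minimal, illustrative sketch of per-speaker vocal feature extraction.
# Assumes: librosa and numpy installed; one WAV file per speaker's turns.
import numpy as np
import librosa

def vocal_features(wav_path):
    """Summarise pitch and intensity statistics for one speaker's audio."""
    y, sr = librosa.load(wav_path, sr=16000)

    # Fundamental frequency (pitch) track; unvoiced frames come back as NaN.
    f0, voiced_flag, voiced_prob = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
    )

    # Frame-level intensity (root-mean-square energy).
    rms = librosa.feature.rms(y=y)[0]

    return {
        "pitch_mean": np.nanmean(f0),          # average pitch
        "pitch_std": np.nanstd(f0),            # variation in pitch
        "voiced_ratio": np.mean(voiced_flag),  # rough proxy for intonation/voicing
        "intensity_mean": rms.mean(),
        "intensity_std": rms.std(),
    }
```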

An algorithm was then trained to learn the relationship between those vocal features and the eventual outcome of therapy. This wasn't as simple as detecting shouting or raised voices – it included the interplay of conversation, who spoke when and for how long, as well as the sound of the voices. It turned out that ignoring what was being said and considering only these patterns of speaking was sufficient to predict whether or not couples would stay together. This was purely data-driven, so it didn't relate outcomes to specific voice attributes.
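The training step itself can be sketched just as simply. The following is an illustrative example only, not the study's model: it uses scikit-learn, a plain logistic regression, and random placeholder numbers standing in for the per-couple vocal and turn-taking features and the stay-together labels.

```python
# A hedged sketch of training a classifier to map per-couple features to outcomes.
# Illustrative only; the study's own model, features and evaluation differ.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# X: one row per couple, e.g. pitch/intensity statistics for each partner plus
#    conversational features such as average turn length and interruption rate.
#    Placeholder random data here, standing in for real extracted features.
rng = np.random.default_rng(0)
X = rng.normal(size=(134, 10))       # 134 couples, 10 hypothetical features
y = rng.integers(0, 2, size=134)     # 1 = still together at follow-up

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, X, y, cv=5)   # held-out accuracy estimate
print(f"mean cross-validated accuracy: {scores.mean():.3f}")
```

Cross-validated accuracy on held-out couples is the natural way to report such a model, which is how figures like those quoted below can be compared against human experts.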

Interestingly, the full video recordings of the sessions were then given to experts to classify. Unlike the AI, they made their predictions using psychological assessment based on the vocal (and other) attributes – including the spoken words and body language. Surprisingly, their prediction of the eventual outcome (they were correct in 75.6% of the cases) was inferior to predictions made by the AI based only on vocal characteristics (79.3%). Clearly there are elements encoded in the way we speak that not even experts are aware of. But the best results came from combining the automated assessment with the experts' assessment (79.6% correct).

The significance of this is not so much about involving AI in marriage counselling or getting couples to speak more nicely to each other (however meritorious that would be). The significance is revealing how much information about our underlying feelings is encoded in the way we speak – some of it completely unknown to us.

Words written on a page or a screen have lexical meanings derived from their dictionary definitions. These are modified by the context of surrounding words. There can be great complexity in writing. But when words are read aloud, they take on additional meanings that are conveyed by word stress, volume, speaking rate and tone of voice. In a typical conversation there is also meaning in how long each speaker talks for, and how quickly one or the other might interject.

How a tone of voice can change the meaning of a few words.

Consider the simple question "Who are you?". Try speaking it with the stress on different words: "WHO are you?", "who ARE you?" and "who are YOU?". Listen to these – the meaning changes with how we say them, even though the words stay the same.

Computers reading 'leaking senses'?

It is unsurprising that words convey different meanings depending on how they are spoken. It is also unsurprising that computers can interpret some of the meaning behind how we choose to speak (maybe one day they will even be able to understand irony).

But this research takes matters further than just looking at the meaning conveyed by a sentence. It seems to reveal underlying attitudes and thoughts that lie behind the sentences. This is a much deeper level of understanding.

The therapy participants were not reading words like actors. They were just talking naturally – or as naturally as they could in a therapist's office. And yet the analysis revealed information about their mutual feelings that they were "leaking" inadvertently into their speech. This may be one of the first steps in using computers to determine what we are really thinking or feeling. Imagine for a moment conversing with future smartphones – will we "leak" information that they can pick up? How will they respond?

Could they advise us about potential partners by listening to us talking together? Could they detect a propensity towards antisocial behaviour, violence, depression or other conditions? It would not be a great leap to imagine the devices themselves as future therapists – interacting with us in various ways to track the effectiveness of interventions that they are delivering.

Don't worry just yet because we are years away from such a future, but it does raise privacy issues, especially as we interact more deeply with computers at the same time as they are becoming more powerful at analysing the world around them.

When we pause to consider the other human senses beyond sound (speech), perhaps we also leak information through sight (such as blushing), touch (temperature and movement) or even smell (pheromones). If smart devices can learn so much by listening to how we speak, one wonders how much more they could glean from the other senses.
