You can't always determine emotion from someone's facial movements, and neither can AI
If you saw a person with their brow furrowed, mouth turned down, and eyes squinted, would you guess they're angry? What if you found out they'd forgotten their reading glasses and were deciphering a restaurant menu?
Interpreting a person's facial movements can't be done in a vacuum; it depends on the context—something that Northeastern neuroscientist Lisa Feldman Barrett shows in a groundbreaking new study published Thursday in the scientific journal Nature Communications.
Barrett, a university distinguished professor of psychology at Northeastern, and colleagues from several other institutions around the world used photographs of professional actors portraying richly constructed scenarios to show that people not only use different facial movements to communicate different instances of the same emotion category (someone might scowl, frown, or even laugh when they're portraying anger), they also employ similar facial configurations to communicate a range of instances from different emotion categories (a scowl might sometimes express concentration, for example). The findings have serious implications for emotion recognition technology that purports to "read" emotions in the face.
"The implication of this study is that there is much more variability in the way that people express different instances of a given emotion category. And one facial configuration can express instances of anger, happiness, or other emotion categories, depending on the context," Barrett says.
People might widen their eyes because they're angry or because they're surprised, and the human brain depends on context to solve this puzzle.
Previous scientific studies of emotion expressions have relied on regular people or amateur actors to portray a single instance from each emotion category given an impoverished context: "Your cousin just died and you feel very sad. What expression would you make?"
Such renderings cue people to lean on stereotypical expressions of emotion (frowning in sadness), rather than expressions that reflect a richer emotional life, full of nuance and situated variation, Barrett says.
So for their study, Barrett and her co-authors used photographs of professional actors—people with "expertise about emotion" because their very livelihoods depend upon "their authentic portrayal of emotional experiences in movies, television, and theater," in a way that broadcasts believable information, the researchers write.
The actors were given a detailed, emotion-evoking scenario to act out, and then photographed by Howard Schatz (who also created the scenarios) for two published volumes: In Character: Actors Acting, and Caught in the Act: Actors Acting.
An example from Schatz's books: "He is a motorcycle dude coming out of a biker bar just as a guy in a Porsche backs into his gleaming Harley," according to the researchers' paper.
"What's important is that these famous actors were given a scenario without emotion words in it," Barrett says, which eliminates the immediate connection one might make between, for example, the word "sad" and the facial expression "frown."
The researchers used 604 of the 731 photographs in Schatz's books, eliminating only the ones in which the actors' facial poses couldn't be analyzed because their hands covered their faces or because their heads were extremely tilted.
They used those photos and scenarios to run two studies. In the first, the researchers asked 839 volunteers to judge the emotional meanings of the scenario descriptions alone. Each volunteer rated roughly 30 scenarios, using a 1-4 scale to indicate the extent to which one of 13 emotions was evoked in the description: amusement, anger, awe, contempt, disgust, embarrassment, fear, happiness, interest, pride, sadness, shame, and surprise.
They used the median rating of each scenario to classify it into one of those 13 emotion categories. The researchers also called upon three experts to code the 604 photographs using the Facial Action Coding System, which specifies a set of action units that each represent the movement of one or more facial muscles.
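As a rough illustration of the median-based classification step, here is a minimal Python sketch. It is not the researchers' code: the function name `classify_scenario`, the data layout, the rule of picking the emotion with the highest median, and the handling of ties are all assumptions, and the ratings are made up.

```python
from statistics import median


def classify_scenario(ratings_by_emotion):
    """Return the emotion(s) with the highest median rating, plus all medians.

    ratings_by_emotion maps an emotion name to a list of 1-4 ratings given
    by different volunteers (a hypothetical layout, for illustration only).
    """
    medians = {
        emotion: median(ratings)
        for emotion, ratings in ratings_by_emotion.items()
    }
    top = max(medians.values())
    # Assumption: keep every emotion tied for the top median, sorted by name.
    labels = sorted(e for e, m in medians.items() if m == top)
    return labels, medians


# Made-up ratings for a single scenario, using 3 of the 13 categories.
example = {
    "anger": [4, 3, 4, 4],
    "sadness": [1, 2, 1, 2],
    "surprise": [3, 2, 3, 3],
}

labels, medians = classify_scenario(example)
print(labels)
print(medians)
```

With these invented numbers, the scenario would be labeled as an instance of anger, because anger has the highest median rating across volunteers.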
According to one long-held hypothesis, certain emotion categories are consistently and specifically expressed with certain sets of facial movements. If that were the case, then all the scenario descriptions classified as evoking instances of a given emotion category should correspond to photographs that consistently portray a specific set of facial movements.
Or, as Barrett says, "If the facial configurations in question—scowling, smiling, frowning, and so on—are expressions that evolved to communicate specific emotions, you should see famous actors posing scowls when portraying instances of anger and only anger, posing frowns when portraying sadness, and so on."
The researchers ran machine learning analyses, which revealed that actors portrayed instances of the same emotion categories by contorting their faces in a variety of ways. Also, similar facial poses didn't reliably express the same emotional category.
To test whether facial movements, alone, carry any emotional information independent of context, the researchers asked two more groups of volunteers to judge the emotional meaning of each facial pose, either when presented alone or with its corresponding scenario.
The first group, 842 people, rated roughly 30 faces each. The second group, 845 people, rated roughly 30 face-and-scenario pairs. Both groups were asked to judge the extent to which their faces or face-and-scenario pairs belonged to each of the 13 emotion categories.
If facial movements carry emotional information independent of the context, then ratings of the faces alone should have been very similar to the ratings of face-scenario pairs. If the emotional meaning of facial movements comes primarily from the context that they are associated with, then the initial ratings of the scenarios alone would be more similar to the face-scenario ratings.
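The logic of that comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say which similarity measure the researchers used, so Pearson correlation stands in here, the variable names are hypothetical, and the rating profiles are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Invented mean ratings for one photograph across five emotion categories.
face_alone = [1.2, 3.1, 1.0, 2.0, 1.5]
scenario_alone = [3.6, 1.4, 1.1, 1.3, 2.9]
face_with_scenario = [3.4, 1.6, 1.0, 1.5, 2.7]

# How closely does each single source track the combined judgment?
face_similarity = pearson(face_alone, face_with_scenario)
context_similarity = pearson(scenario_alone, face_with_scenario)

# True when the scenario ratings track the combined ratings more closely.
context_dominates = context_similarity > face_similarity
print(face_similarity, context_similarity, context_dominates)
```

In this made-up case the scenario ratings track the face-and-scenario ratings far more closely than the face-alone ratings do, which is the pattern expected if emotional meaning comes mainly from context.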
The researchers found that people's judgments of facial poses alone didn't reliably match the ratings of the faces when they were viewed with the scenario; they also did not match the designated emotion category of the scenario. The emotional meanings of the facial poses came primarily from the scenarios they were paired with, i.e., the context.
"The present findings join other recent summaries of the empirical evidence to suggest that scowls, smiles, and other facial configurations belong to a larger, more variable repertoire of the meaningful ways in which people move their faces to express emotion," the researchers write.
In other words, Barrett says, "people infer the meaning of your smile, and their inferences are informed by context. When it comes to expressing emotion, a face does not speak for itself."
The researchers' findings have implications for the sorts of artificially intelligent systems that some engineers claim can decipher someone's emotion by tracking their facial movements alone.
Companies are already using AI-powered systems to gauge children's emotions as they learn, make judgments about potential job candidates, and guess at the would-be nefarious intentions of an airline passenger.
"Our research directly counters the traditional emotional AI approach," Barrett says. "Certain companies claim they have algorithms that can detect anger, for example, when what really they have—under optimal circumstances—are algorithms that can probably detect scowling, which may or may not be an expression of anger. It's important not to confuse the description of a facial configuration with inferences about its emotional meaning."