We know how Donald Trump feels about everyone through Twitter, but how do Twitter users feel about Donald Trump?
Computer scientists from the University of Utah's College of Engineering have developed what they call "sentiment analysis" software that can automatically determine how someone feels based on what they write or say. To test out the accuracy of this software's machine-learning model, the team used it to analyze the individual sentiments of more than 1.6 million (and counting) geo-tagged tweets about the U.S. presidential election over the last five months. A database of these tweets is then examined to determine whether states and their counties are leaning toward the Republicans or Democrats.
"With sentiment analysis, it will try to predict the emotions behind every human being when he or she is talking or writing something," says Debjyoti Paul, a doctoral student in the University of Utah's School of Computing and the project leader along with School of Computing associate professor Feifei Li. "With that in mind, we are not just trying to look at the information in the tweets. We are trying to incorporate the emotion with the information."
As a result of their work, the team has created an interactive website at http://www.estorm.org where users can find out whether the tweets coming out of their state and its counties are more positive or negative toward Republicans or Democrats during any defined period of time since June 5. The data also show the percentage of positive and negative tweets toward each political party and when a particular type of tweet surged during the last five months.
Some interesting facts about this year's U.S. presidential election based on a sample of what people are tweeting:
- Based on the number of positive tweets posted since June toward each party, the computer model predicts that Hillary Clinton will win the presidential election.
- Republicans sent out 17 percent more political tweets than Democrats.
- Delaware was the only state in which a majority of tweets from all counties in the state were positive toward the same party—in this case, the Democrats.
- For the Republicans, South Dakota had the highest percentage of counties in which most of their tweets were positive toward the party (73 percent of the counties).
- The biggest surges of positive tweets for Republicans came during the Republican National Convention, which opened July 18, and when the video of Donald Trump boasting about groping women was leaked Oct. 7 (presumably defenders of Trump tweeting their support of him).
- The largest surge of positive tweets for Democrats was after the last two presidential debates and after the New York Times published its story Oct. 1 that Donald Trump avoided paying federal taxes for nearly two decades.
- The last two debates and the Trump federal taxes story not only marked the peak in positive tweets for Democrats; they also drew the most negative tweets about the Democratic Party.
Analyzing the tweets
Paul and his team started with more than 250 million tweets posted around the world from June 5 to Oct. 30 and then used keyword-based filtering software to weed out all non-political tweets. They were left with more than 1.6 million political tweets posted in the U.S.
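A keyword filter of the kind described here might look roughly like the sketch below. The article does not publish the team's keyword lists or code, so the keyword sets, the function names and the tweet structure (a dictionary with a "text" field) are all assumptions made for illustration.

```python
import re

# Hypothetical keyword lists: the team's actual keywords are not published.
PARTY_KEYWORDS = {
    "republican": {"trump", "gop", "republican", "republicans"},
    "democrat": {"clinton", "hillary", "democrat", "democrats", "dnc"},
}

# Split text into lowercase word-like tokens, so "#Hillary" yields "hillary".
TOKEN_RE = re.compile(r"[a-z0-9']+")


def parties_mentioned(text):
    """Return the set of parties whose keywords appear in the tweet text."""
    tokens = set(TOKEN_RE.findall(text.lower()))
    return {party for party, words in PARTY_KEYWORDS.items() if tokens & words}


def filter_political(tweets):
    """Keep only tweets that mention at least one party keyword."""
    return [t for t in tweets if parties_mentioned(t["text"])]
```

Matching whole tokens rather than substrings keeps a word such as "trumpet" from counting as a mention of Trump, though a real filter would likely need a much larger and more carefully tuned vocabulary.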
Those tweets were then run through the team's "sentiment analysis" software, which analyzed each tweet and assigned it a score from 0 to 1, where 0 is the most negative sentiment, 1 is the most positive sentiment, and 0.5 is neutral. The scores are collected in a database that can calculate a state's or county's political leanings in real time based on the tweets. The database is constantly updated with new tweets.
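To make the scoring scale concrete, here is a minimal sketch of how scores between 0 and 1 could be rolled up into percentages of positive and negative tweets per state and party. Only the 0-to-1 scale and the neutral point of 0.5 come from the article; the function names, the (state, party, score) tuple layout and the rule for picking a leaning are assumptions, not the team's actual method.

```python
from collections import defaultdict

NEUTRAL = 0.5  # neutral point on the article's 0-to-1 scale


def label(score):
    """Map a sentiment score to 'positive', 'negative' or 'neutral'."""
    if score > NEUTRAL:
        return "positive"
    if score < NEUTRAL:
        return "negative"
    return "neutral"


def summarize(scored_tweets):
    """Return percentage of each label per (state, party).

    scored_tweets is an iterable of (state, party, score) tuples.
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
    for state, party, score in scored_tweets:
        counts[(state, party)][label(score)] += 1
    summary = {}
    for key, c in counts.items():
        total = sum(c.values())
        summary[key] = {k: 100.0 * v / total for k, v in c.items()}
    return summary


def leaning(summary, state):
    """Assumed rule: the party with the larger share of positive tweets.

    Returns None when there is no data for the state or the shares tie.
    """
    shares = {p: pct["positive"] for (s, p), pct in summary.items() if s == state}
    if not shares:
        return None
    best = max(shares.values())
    winners = [p for p, v in shares.items() if v == best]
    return winners[0] if len(winners) == 1 else None
```

The same roll-up could be keyed by county instead of state, or restricted to a date range before summarizing, to mirror the time-window views the website offers.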
To measure the accuracy of the model, the team compared its results to the New York Times Upshot election forecast website and found the state-by-state analysis was very similar.
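The article does not say how similarity to the forecast was measured. One simple possibility, shown here purely as an assumption, is the share of states on which the two sources agree; the function name and the state-to-party dictionaries are hypothetical.

```python
def agreement_rate(predicted, reference):
    """Fraction of states present in both mappings that have the same leaning.

    Both arguments map a state code to a party name.
    """
    shared = predicted.keys() & reference.keys()
    if not shared:
        raise ValueError("no states in common")
    matches = sum(1 for state in shared if predicted[state] == reference[state])
    return matches / len(shared)
```

A rate like this ignores how close each state is, which may be one reason Li mentions wanting more scientific measurements for the upcoming paper.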
"I think it works really well. It matches up with the major events that happened during this election season. That's a good indicator that the results are accurate," says Li. "We're hoping to develop some more scientific measurements to confirm this observation for an upcoming paper, but the early results are very positive."
Paul believes that their sentiment analysis software could be used to more accurately reflect the feelings of crowd-sourced opinions on the Internet, for example reviews of products on Amazon or restaurant reviews on Yelp, in which the software can "drill down to the individual sentences of the text" to determine a person's true feelings about something, he says.
He also said that voice-enabled assistants such as the iPhone's Siri could use such software to better determine what the user wants, based not just on what the user says but on how it is said.