A readability analysis of presidential candidate speeches by researchers in Carnegie Mellon University's Language Technologies Institute (LTI) finds most candidates using words and grammar typical of students in grades 6-8, though Donald Trump tends to lag behind the others.
A historical review of their word and grammar use suggests all five candidates in the analysis - Republicans Trump, Ted Cruz and Marco Rubio (who has since suspended his campaign), and Democrats Hillary Clinton and Bernie Sanders - have been using simpler language as the campaigns have progressed. Again, Trump is an outlier, with his grammar use spiking in his Iowa Caucus concession speech and his word and grammar use plummeting again during his Nevada Caucus victory speech.
"Win," after all, is more likely to appear in 3rd grade texts than "regrettably."
A comparison of the candidates with previous presidents shows President Lincoln outpacing them all, boasting grammar at the 11th grade level, while President George W. Bush's 5th grade grammar was below even that of Trump.
"Assessing the readability of campaign speeches is a little tricky because most measures are geared to the written word, yet text is very different from the spoken word," said Maxine Eskenazi, an LTI principal systems scientist, who performed the analysis with Elliot Schumacher, a graduate student in language technologies. "When we speak, we usually use less structured language with shorter sentences."
An earlier analysis by the Boston Globe used the Flesch-Kincaid readability test, which is based on average sentence length and average number of syllables per word, and found Trump speaking at a 4th grade level, two grade levels below his peers. Eskenazi and Schumacher used a readability model called REAP, which looks at how often words and grammatical constructs are used at each grade level and thus corresponds better to analysis of spoken language.
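For readers curious how a sentence-length-and-syllable measure works in practice, the following is a minimal Python sketch of the widely published Flesch-Kincaid grade-level formula. It is not the Boston Globe's code or the REAP model. The function names, the regular expressions used to split sentences and words, and the vowel-group syllable counter are all assumptions made for illustration; real tools typically rely on pronunciation dictionaries for more accurate syllable counts.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumed heuristic, not a dictionary lookup).

    Counts runs of consecutive vowels, then discounts a trailing silent "e"
    unless the word ends in "le" (as in "table").
    """
    lower = word.lower()
    count = len(re.findall(r"[aeiouy]+", lower))
    if lower.endswith("e") and not lower.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Return the Flesch-Kincaid grade level for a passage of text.

    grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Naive sentence split on terminal punctuation; transcripts of speech
    # may need more careful segmentation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


if __name__ == "__main__":
    # Hypothetical sample sentences, not drawn from any candidate's speech.
    print(flesch_kincaid_grade("We will win. We will win big."))
    print(flesch_kincaid_grade("Regrettably, the administration miscalculated."))
```

Because the formula sees only sentence length and syllable counts, short sentences built from one-syllable words can score very low, even below zero. That is one reason a measure designed for written text can understate the level of spoken language.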
Based on vocabulary, campaign trail speeches by past and present presidents - Lincoln, Reagan, Bill Clinton, George W. Bush and Obama - were at least on the 8th grade level, while the current candidates ranged from Trump's 7th grade level to Sanders' 10th grade level. Trump and Hillary Clinton's speeches showed the greatest variation, suggesting they may work harder than the others in tailoring speeches to particular audiences, Schumacher said.
In terms of grammar, none of the presidents and presidential candidates could compare with Lincoln's Gettysburg Address - an admittedly high standard, with grammar well above the 10th grade level. The current candidates generally had scores between 6th and 7th grades, with Trump just below 6th grade level. President Bush scored at a 5th grade level.
Analyzing campaign speeches is difficult because transcripts are often hard to obtain, Schumacher said. Automatic speech recognition (ASR) systems, such as those developed at LTI, can generate reliable transcripts from video when the speech took place in a quiet environment, but he and Eskenazi opted not to use today's automated methods because they were likely to introduce errors in the noisy environment of campaign rallies.
The study is available online at reap.cs.cmu.edu/Papers/Technical_report_16-001_Schumacher_Eskenazi.pdf.