Essential tones of music rooted in human speech

May 24, 2007

The use of 12 tone intervals in the music of many human cultures is rooted in the physics of how our vocal anatomy produces speech, according to researchers at the Duke University Center for Cognitive Neuroscience.

The particular notes used in music sound right to our ears because of the way our vocal apparatus makes the sounds used in all human languages, said Dale Purves, the George Barth Geller Professor for Research in Neurobiology.

It's not something one can hear directly, but when the sounds of speech are looked at with a spectrum analyzer, the relationships between the various frequencies that a speaker uses to make vowel sounds correspond neatly with the relationships between notes of the 12-tone chromatic scale of music, Purves said.

The work appeared online May 24 in the Proceedings of the National Academy of Sciences.

Purves and co-authors Deborah Ross and Jonathan Choi tested their idea by recording native English and Mandarin Chinese speakers uttering vowel sounds in both single words and a series of short monologues. They then compared the vocal frequency ratios to the numerical ratios that define notes in music.

Human vocalization begins with the vocal cords in the larynx (the Adam’s apple in the neck), which create a series of resonant power peaks in a stream of air coming up from the lungs. These power peaks are then modified in a spectacular variety of ways by the changing shape of the soft palate, tongue, lips and other parts of the vocal tract. Our vocal anatomy is rather like an organ pipe that can be pinched, stretched and widened on the fly, Purves said. English speakers generate about 50 different speech sounds this way.

Yet despite the wide variation in individual human anatomy, the speech sounds of different speakers and different languages show the same variety of vocal tract resonance ratios, Purves said.

The lowest two of these vocal tract resonances, called formants, account for the vowel sounds in speech. "Take away the first two formants and you can't understand what a person is saying," Purves said. The frequency of the first formant is between 200 and 1,000 cycles per second (hertz) and the second formant between 800 and 3,000 hertz.

When the Duke researchers looked at the ratios of the first two formants in speech spectra, they found that the ratios formed musical relationships. For example, the relationship of the first two formants in the English vowel /a/, as in "bod," might correspond with the musical interval between C and A on a piano keyboard.
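As a rough illustration of how a formant ratio can be lined up with a musical interval, the Python sketch below divides the second formant by the first, folds the result into a single octave, and picks the nearest interval from a table of just-intonation ratios. The interval table (one common set of ratios), the octave folding, and the example formant values of 600 and 1,000 hertz are assumptions chosen for illustration; they are not the researchers' data or their analysis procedure. C up to A is a major sixth, which has a ratio of 5:3 in just intonation.

```python
import math
from fractions import Fraction

# One common set of just-intonation ratios for the 12 chromatic intervals
# plus the octave. This table is an assumption; other choices exist for
# some intervals (for example the tritone and the minor seventh).
JUST_INTERVALS = {
    "unison": Fraction(1, 1),
    "minor second": Fraction(16, 15),
    "major second": Fraction(9, 8),
    "minor third": Fraction(6, 5),
    "major third": Fraction(5, 4),
    "perfect fourth": Fraction(4, 3),
    "tritone": Fraction(45, 32),
    "perfect fifth": Fraction(3, 2),
    "minor sixth": Fraction(8, 5),
    "major sixth": Fraction(5, 3),
    "minor seventh": Fraction(9, 5),
    "major seventh": Fraction(15, 8),
    "octave": Fraction(2, 1),
}


def fold_into_octave(ratio):
    """Halve or double a positive ratio until it lies in [1, 2)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))


def nearest_interval(f1_hz, f2_hz):
    """Return (interval name, deviation in cents) for the ratio f2/f1."""
    folded = cents(fold_into_octave(f2_hz / f1_hz))
    name = min(JUST_INTERVALS, key=lambda n: abs(folded - cents(JUST_INTERVALS[n])))
    return name, folded - cents(JUST_INTERVALS[name])


# Hypothetical formant values (not measurements from the study):
# 1000 / 600 = 5/3, the just-intonation major sixth.
print(nearest_interval(600.0, 1000.0))
```

A real analysis would also need a tolerance for deciding when a measured ratio counts as a match; that threshold is left out here because the article does not state one.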

"In about 70 percent of the speech sounds, these ratios were bang-on musical intervals," Purves said. "This predominance of musical intervals hidden in speech suggests that the chromatic scale notes in music sound right to our ears because they match the formant ratios we are exposed to all the time in speech, even though we are quite unaware of this exposure."

No music, except modern experimental pieces, uses all 12 tones. Most music uses the 7-tone or diatonic scale to divide octaves, and much of folk music uses five tones. These preferences correspond to the most prevalent formant ratios in speech. Purves and his collaborators are now working on whether a given culture's preference for one subset of the tones over another is related to the formant relationships that are especially prevalent in the native language of that group.

Purves and his collaborators also think these findings may shed light on a centuries-old debate in music over which tuning scheme for instruments works best. Ten of the 12 harmonic intervals identified in English and Mandarin speech occur in "just intonation" tuning, which sounds best to most trained musicians. They found fewer correspondences in other tuning systems, including the equal temperament tuning commonly used today.

Equal temperament tuning, in which each of the 12 interval distances in the chromatic scale is made exactly the same, is a scheme that enables an ensemble such as an orchestra to play together in different keys and across many octaves. Although equal temperament tuning sounds pretty good, it's a compromise on the more natural, vocally derived just intonation tuning system, Purves said.
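To make the compromise concrete, the Python sketch below compares each equal-tempered step, whose frequency ratio is 2 raised to the power k/12 for k semitones, with a just-intonation ratio for the same interval, and reports the gap in cents (hundredths of an equal-tempered semitone). The table of just ratios is an assumed, commonly cited set; other choices exist for some intervals, and none of these numbers come from the Duke study.

```python
import math

# Assumed just-intonation ratios for 0..12 semitones above the starting note.
JUST_RATIOS = [
    1 / 1, 16 / 15, 9 / 8, 6 / 5, 5 / 4, 4 / 3, 45 / 32,
    3 / 2, 8 / 5, 5 / 3, 9 / 5, 15 / 8, 2 / 1,
]


def equal_tempered_ratio(semitones):
    """Frequency ratio of an interval of `semitones` equal-tempered steps."""
    return 2 ** (semitones / 12)


def deviation_cents(semitones):
    """Equal-tempered size minus just size, in cents, for one interval."""
    equal = 1200 * math.log2(equal_tempered_ratio(semitones))
    just = 1200 * math.log2(JUST_RATIOS[semitones])
    return equal - just


# Print the gap for every interval in the octave.
for k in range(13):
    print(f"{k:2d} semitones: {deviation_cents(k):+7.2f} cents")
```

With this table the equal-tempered fifth (7 semitones) comes out about 2 cents narrower than the just fifth, while the equal-tempered major third (4 semitones) is nearly 14 cents wider than the just major third, which is one way to see why equal temperament is described as a compromise.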

The group's next study concerns our intuitive understanding that a musical piece tends to sound happy if it’s in a major key but relatively sad if it's in a minor key. That, too, may come from the characteristics of the human voice, Purves suggests.

Source: Duke University
