Research team develops mathematical model to explain harmony in music


Bernardo Spagnolo of the University of Palermo in Italy and his Russian colleagues have developed a model that they believe explains why we humans hear some notes as harmonious and others as dissonant. The team, as described in their paper in Physical Review Letters, says that such harmony can be explained by our auditory neural system.

Most people can hear the difference between harmony and general noise. It’s evident in a guitar chord: strike the notes C, E and G together and you get the familiar C major chord, so often heard in popular music. Mess up one note though, and everyone will wince. The same can be seen watching American (or other country) Idol; not when a contestant singing a cappella goes off key, but when a singer hits (or misses) a note that harmonizes with a note played on an accompanying instrument.

There have been many theories suggested over the years as to how and why we hear some groupings of notes as pleasing and others as wrong, or off. Some have suggested that our brains simply receive a stream of notes and make of it what we will. Spagnolo et al., however, disagree, and they have a model that they say proves it.

In their paper, the team says that we humans have different neurons in different parts of our ears that respond to different frequencies. Say one group responds to the C note on a guitar and another to an E, and so on; these are called sensory neurons. But that’s not enough to account for “liking” the two being heard at the same time. To explain this, the researchers suggest that we also have a third type of neuron called an interneuron. In their model, the sensory neurons send signals to the interneuron, which then sends signals to the brain based on what it has “heard” from them.

What’s more, the team says that the sensory neurons conform to the “leaky integrate-and-fire” equation, whereby the stimulus (in this case sound) drives up the voltage until it reaches a saturation point, at which it discharges its information (in this case to an interneuron), which then sends signals to the brain. If the sensory neurons were all to fire constantly, the interneurons would be inundated and unable to process all the information from multiple sensory neurons.
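To make the leaky integrate-and-fire idea concrete, here is a minimal sketch in Python. It is not the model from the paper: the function name `lif_spike_times`, the time constant, the threshold and the drive values are all illustrative assumptions, and the article's “saturation point” is treated here as a simple firing threshold.

```python
def lif_spike_times(stimulus, duration=0.5, dt=1e-4, tau=0.02,
                    threshold=1.0, v_reset=0.0):
    """Return spike times of a toy leaky integrate-and-fire neuron.

    stimulus: function of time (seconds) giving the input drive.
    All parameter values are illustrative assumptions, not taken from the paper.
    """
    v = v_reset
    spikes = []
    for i in range(int(duration / dt)):
        t = i * dt
        # Forward Euler step: the leak pulls v back toward zero,
        # the stimulus pushes it up.
        v += dt * (-v / tau + stimulus(t))
        if v >= threshold:
            spikes.append(t)   # the neuron "fires" ...
            v = v_reset        # ... and its voltage is reset
    return spikes


weak = lif_spike_times(lambda t: 10.0)     # voltage levels off below threshold
strong = lif_spike_times(lambda t: 100.0)  # voltage repeatedly reaches threshold
print(len(weak), len(strong))
```

With a constant weak drive the leak wins and the neuron stays silent; with a stronger drive it fires at evenly spaced times.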

The team then applied information theory, which says that the less random a signal is, the more information it has, and came up with a number they call regularity. It is this regularity that explains our “liking” different notes when heard together. “Good” notes played together result in a high regularity (because they carry more information) while dissonant notes produce lower regularity.
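The article does not give the formula for regularity. As a rough sketch only, the Python below assumes one plausible stand-in: the negative Shannon entropy of the binned intervals between successive spikes, so that a perfectly periodic spike train scores highest. The function name `regularity` and the 1 ms bin width are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def regularity(spike_times, bin_width=1e-3):
    """Negative Shannon entropy of the binned interspike intervals.

    This is an assumed stand-in for the paper's measure, not its definition.
    Higher values (closer to zero) mean a more regular spike train.
    """
    intervals = [b - a for a, b in zip(spike_times, spike_times[1:])]
    if not intervals:
        raise ValueError("need at least two spikes")
    # Histogram of intervals, one bin per bin_width seconds.
    counts = Counter(round(isi / bin_width) for isi in intervals)
    total = len(intervals)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return -entropy


# A perfectly periodic train versus one with randomly jittered intervals.
periodic = [0.01 * k for k in range(100)]
rng = random.Random(0)
jittered, t = [], 0.0
for _ in range(100):
    t += rng.uniform(0.002, 0.02)
    jittered.append(t)

print(regularity(periodic), regularity(jittered))
```

Under this assumed definition the periodic train has zero entropy and therefore the highest possible regularity, while the jittered train scores lower.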

And that is why we smile when listening to two or three people who harmonize perfectly together, but frown when hearing the results of those less gifted.


More information: Regularity of Spike Trains and Harmony Perception in a Model of the Auditory System, Phys. Rev. Lett. 107, 108103 (2011). DOI:10.1103/PhysRevLett.107.108103

Spike train regularity of the noisy neural auditory system model under the influence of two sinusoidal signals with different frequencies is investigated. For the increasing ratio m/n of the input signal frequencies (m, n are natural numbers) the linear growth of the regularity is found at the fixed difference (m-n). It is shown that the spike train regularity in the model is high for harmonious chords of input tones and low for dissonant ones.

via Focus

© 2011

Citation: Research team develops mathematical model to explain harmony in music (2011, September 12) retrieved 20 September 2019 from
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.


User comments

Sep 12, 2011
Try this on for size 0,o


Sep 12, 2011
It takes more information to describe a random signal than one which is structured out of pure tones. Pleasing chords are made out of combinations of tones which are closely harmonically related, with relatively simple periodic patterns. An extreme example -- listening to music vs listening to acoustic modem signals. The noise-like modem signal conveys more information for a given bandwidth than the pleasing sounds of Mozart.

Sep 12, 2011
Yes; poor dynamics unable to translate at point of interfacing, unable to maintain a mirror that reflects and registers for a genuine period, and especially maintain integrity of the photon.

further simulation will not work until these issues are addressed.

Sep 12, 2011
well then, you must be satisfied. all the answers. so where then is any music written?

the real tree is by far greater than anything I've heard from you or any body else.

Sep 12, 2011
still, writing the song is always the real challenge

Sep 12, 2011
and I know of no such bridge

Sep 13, 2011
Interesting article, but the informational entropy of harmonic relationships is simply a function of their complexity - specifically the size of the temporal integration window needed to resolve them.

Harmonic consonance and dissonance are a consequence of how we process information, specifically how we evaluate relationships about sensory information, or metadata.

Consider for example how the tones C3 and C6 are processed - we hear two distinct notes, one higher and the other lower. But their relationship is so consonant that they sound, almost paradoxically, the same (hence we have such a concept as pitch class).

This sensation of the relationship between them isn't merely "sameness", but also maximal consonance, and minimal dissonance.

Ergo, consonance and dissonance per se are but degrees of this more fundamental perception of sameness between frequencies synchronised by factors of two.

Sep 13, 2011
Likewise harmonic dissonance is an evaluation of difference - ie. "not sameness".

From this we can conclude that this sameness corresponding to frequencies in a factor of two relationship represents an informational "zero point" of no information, no difference. No metadata. Or indeed no entropy.

Octaves have no entropy. Fifths, at factors of 1.5 (aka the 3:2 interval) thus possess the minimal entropy, fourths at factors of 1.33 or the 4:3 ratio have a slightly higher entropy and so on... the tritone has very high entropy, because it resolves to a large temporal integration window.

Note i discount the unison here since it is an interval of zero and thus there is no information about its relative difference, no metadata to process.

I believe the temporal aliasing causing this whole system to form is systemic, correlating to minimal connective complexity in the associated resolving nuclei, and as such is as primitive as multicellular life itself.
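As a rough illustration of the "temporal integration window" idea in the comment above: for two pure tones whose frequencies are in the ratio m:n (in lowest terms, m > n), the combined waveform repeats after n cycles of the lower tone. The sketch below assumes just-intonation ratios, including 45:32 for the tritone, which is only one of several common tunings; the function name is hypothetical.

```python
from fractions import Fraction


def lower_tone_cycles_to_repeat(ratio):
    """Cycles of the lower tone before the two-tone waveform repeats.

    ratio is upper frequency / lower frequency as a Fraction. Fraction
    reduces to lowest terms, so the answer is simply the denominator.
    """
    return ratio.denominator


# Just-intonation ratios (assumed for illustration).
intervals = {
    "octave": Fraction(2, 1),
    "fifth": Fraction(3, 2),
    "fourth": Fraction(4, 3),
    "tritone": Fraction(45, 32),
}

for name, ratio in intervals.items():
    print(name, lower_tone_cycles_to_repeat(ratio))
```

The ordering matches the one given in the comment: the octave repeats fastest, then the fifth, then the fourth, with the tritone needing by far the longest window. An equal-tempered tritone has an irrational ratio (the square root of 2), so its waveform never repeats exactly.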

Sep 13, 2011
That is to say, all multicellular animals must be subject to the octave equivalence perception, and it must further be the basis of the key to the binding problem at large - the other sensory modalities, language, emotion and reason etc. - this same fundamental dynamic must be more pervasive than just audition.

Sep 13, 2011
Heh, incidentally i'm inclined to think a true AI that could savour experience as a living organism would likewise need to innately perceive the octave equivalence paradox.

Indeed, if there's musical aliens anywhere out there, you can bet their music will be discernable as such to our ears, as ours theirs... provided they're multicellular of course.

Sep 14, 2011
Informational entropy (per Shannon et al), not thermodynamic entropy. They're related concepts but nonetheless different.

Octaves (or more properly frequencies in a factor of two ratio) are perceived as "equivalent", such that the pattern of notes on say a piano repeats, with all A's, B's and C's etc. constituting the same "pitch class". This perception of comparative equality is maximally consonant, and minimally dissonant - in other words the meta-information about the relationship is registering "no difference" - no entropy - and as such they're processed as qualitatively the same note despite being clearly different pitches.

In a nutshell i'm saying that understanding the nature of music, especially harmony, requires addressing the octave equivalence paradox, and the way we resolve paradoxes is by carefully delineating the issues. This leads to the inescapable conclusion tonal consonance and dissonance are more fundamentally degrees of this anomalous percept of equality.

Sep 14, 2011
Some people prefer -on an emotional level- the overdriving distortion (lots of overtones /noise) sound of an electric guitar over the 'clean' guitar sound.

There are cultures that divide an octave in a lot more notes (=dissonant) than 'western' musicians do.

(how) does this model explain these phenomena?

Sep 14, 2011
@hush1, and FWIW i actually have first hand knowledge of multiple perpetual motion machines already, soon so will you and everyone else..! Besides which all sorts of things could be described in states of zero entropy, likewise in some systems entropy increases, in others it falls (we can extract energy from either gradient), and in others it will be at least temporarily stable, regardless of the direction of the overall change.

You and absolutely everything around you have emerged spontaneously from a quark-gluon-plasma. All of it may be slowly deteriorating, but regardless, negentropic gradients piggyback on entropic ones, and vice versa, and we wouldn't be here otherwise. Your application of the concept appears to amount to little more than a cynical if fuzzy notion of there being "no free lunch"... and maybe there ain't.

Then again, maybe there's nowt else... ;)

Sep 14, 2011

lol if i had a penny for everyone who's insisted to me Javanese folk can't hear octaves.

Obviously Slendro and Pelog for instance are culturally Gamelan and very different from Western 12TET. But the equivalence of octaves is far more fundamental than anything learned or invented, indeed more intrinsic than any tonotopic maps or any other physiological adaptations.

It isn't even an adaptation - we never "acquired" the ability to hear or discriminate octaves as equivalent, and neither did any of our ancestors.

On the contrary it's an inherent emergent effect of how all living things (multicellular ones anyway) process information in general.

So rest assured, all Javanese people can appreciate and understand the equivalence (and thus consonance) of octaves, just as they can tell that two tones in a tritone relationship are not-at-all equivalent, but discordant.

Sep 14, 2011
The use of the terms consonance and dissonance with respect to tonality shouldn't be confused with their similar use with respect to aesthetics, styles etc. - which sort of cadences sound tuneful etc. - that's cultural. Some folks may even prefer discordant or atonal stuff, but what makes it discordant and atonal is the fact that there is actually an underlying objective framework of harmonic congruity. Avoidance or postponement of resolution is of course as much to do with style and artistic intent as anything else.

And obviously the number of divisions into an octave one makes is fairly arbitrary too - as per some Gamelans, the interval sizes don't even need to be consistent across the scale.

Further, one could completely avoid the octave, never using it.

Sep 14, 2011
But it's still there, and defining the music if only by its absence. The divisions any particular tonal system makes into an octave may be cultural, but the octave itself - and its primal consonance - is a universal, an inevitability, and occurs for ultimately non-biological reasons.

The same core effect can be seen in the wagon wheel illusion, wherein the rotor appears to stop at factors of two of a given frequency, for example say the image appears static at 100RPM - as its speed increases it'll appear to accelerate one way then the other, before stopping again at 200RPM, and will follow the same cycle up to 400rpm and so on.

Similarly this is not a cultural or evolutionary adaptation to accelerating discs but a systemic limitation in how we sample and process information.

I suspect octave equivalence has analogues in vision, olfaction and gustation, and probably every other modality too...

Sep 14, 2011
...not least because the thalamocortical pathway through which they pass itself has an octave bandwidth (Miller et al). The waveforms comprising the sensory responses of the olfactory bulb likewise have an octave bandwidth (Axel & Buck et al). Our visuospatial (colour) bandwidth falls just within an octave, and visuotemporal bandwidth is demonstrated to be octave-bound by the wagon wheel illusion above. The male voice (for normal conversation) spans one octave, the female, two. Our peripheral nervous system has two octaves. Bats separate navigational and communicative calls into higher and lower octaves, respectively. Many different species have been shown to experience octave equivalence, although such verifications are just box ticking exercises if you understand my contention...

Sep 14, 2011
@hush1 lol, i only mentioned Shannon as an example of informational entropy, as opposed to the thermodynamic sort you seemed to think we were discussing... If you understand Shannon then this is a lot simpler; i'm merely relating harmonic entropy to the sense of "difference" with respect to phase shift complexity between pure tones and showing that it has a minimum 'ground state' of "no difference" in the simplicity of octaves, which by virtue of their equality (their lack of perceived difference) possess no such entropy as defined.

That answers your question. I've not discussed any aspect of Shannon's work, and if you really think 'Shannon entropy' has "nothing to do with the measure of entropy of a system" i've no wish to persuade you otherwise, thanks for your thoughts all the same... (and if i ever DO discover over-unity octaves (?) i am so going to rub your face in it)...

Sep 15, 2011
Thanks MrVibrating, interesting stuff.

Just wondering, do you see an evolutionary benefit in recognising octaves and other 'clean intervals'; why is the brain releasing chemicals that make us "smile when listening to two or three people who harmonize perfectly together"?

Feeling safer /happier when a pattern is recognised would be an obvious thought, however, a bit 'who's first, chicken or egg' if you ask me; this behaviour of three people who harmonize perfectly together could be seen as non-threatening play. But then... you have to be able to recognise it first (?)

Sep 15, 2011
@Nederlander To be honest i shy away from any sociological aspects - which is not to discount them of course, only i'm specifically interested in the objective form of information processing, ie. the key to the binding problem, for me that's the prize.

So psychomusicologists who suggest our penchant for rhythm is borne from womb nostalgia and a memory of our mother's heartbeat, or that people tend to "baby talk" to infants in cooing consonances, with lots of 4ths and 5ths etc. - these types of explanations are too woolly for me.

Are C-maj and A-min innately "uplifting" and "serious"? Are there fundamental truths in Plato's descriptions of the modes? The angle i present here is really concerned with the (tentative) bits and bytes of information processing, and i'm not sure how much it could say about aesthetics - i tend to agree with the general consensus regarding cultural variation.

Polyphony is obviously much more ancient than commonly acknowledged, though...

Sep 15, 2011

OU exists mathematically like 1 plus 1 = 3.

I do believe in "OU" but the scare quotes there tell you i haven't given up on CoE.

It's vacuum energy. Has to be. Rectifying virtual photons. Or powered by time (depending on your preference for QED / QCD). But the result depends only on Faraday, Lenz, Newton, a sprinkling of Rutherford and presto... kind of a reversed Casimir effect, only moreso.

The only relevance to octaves i see would be resonant coupling, with a concomitant loss in power transfer. These things tend to have optimum fundamentals and as far as i'm aware the relationships of interest aren't in any sense harmonic..

Maybe if you were slightly less cryptic i could comment more objectively..?

Sep 17, 2011
Still not satisfied, sorry. The quote in the article is claiming a biological and therefore evolutionary phenomenon:

"...the sensory neurons conform to the "leaky integrate-and-fire equation whereby the stimuli (in this case sound) drives up the voltage until it reaches a saturation point, whereby it then discharges its information (in this case to an interneuron), which then sends signals to the brain..."

....strike the notes C, E and G....
In most cases, the basilar membrane of the ear is mechanically unable to resonate at these 3 fundamental frequencies at the same time; they're too close, the so-called 'threshold of masking'. >> see part2

Sep 17, 2011
The perception of hearing the fundamental frequency E (which is masked mechanically due to deflection of curves on the basilar membrane by the C, if the C is loud enough) is synthesized /calculated somewhere else in the brain (most likely paralimbic brain regions) by extrapolating the resonant of E string formants downward to f1. It is not, and mechanically simply cannot be formed in the ear just by a "sensory neuron" picking up a resonant frequency, because the resonance itself doesn't occur. The membrane can't do it.

So that's why I asked if there is any biological /evolutionary /brain research angle in this research. Because given this information, the claim "And that is why we smile when listening to two or three people who harmonize perfectly together" is unlikely to me, considering the mechanical limitations of the ear, and evolution of the brain /hearing /language system.

Enlighten me, please.

Sep 17, 2011
Perhaps there is an auditory equivalent to the opponent process ?

Sep 17, 2011
my dog likes my guitar playing, some people do too. some people do not like it and point out my mistakes, which stand out to them like a sore thumb. why? personal preference of an untrained ear? music is another language and the ability is a secondary sub system that processes may be the primary system and the ability to understand spoken words could be the secondary. both display a high degree of processing by our brain and central nervous system, which uses chemicals to bridge the synapse. why did evolution choose the slower chemical pathways instead of the much faster electrical impulses that nerves send down axons? we will never know but must keep researching. amazing that perceptions can cause disharmony like bad sour notes. there are more than one visual systems at work also, that create circadian rhythm. its all about cycles, and frequency of cycles. CYCLES are key to the universe. vibrations are cyclic, strings vibrate, string theory vibrates. oops sorry for poking fun for fun.

Sep 17, 2011
Just another small note on Miller's findings and maths - he noted that the thalamocortical bandwidth was one octave, in such a way that ie. a 100Hz signal at thalamus would be reduced to a 50 Hz signal at cortex, a 2:1 compression of the data. As he also noted though, the pathway features myriad feedback and feed-forward loops - data is getting cycled around.

Maths being maths it turns out that there's yet another way (besides factors) to describe harmonic relationships; recursive subdivision by two...

So if we take a fundamental and divide it by two, then do the same to the result, the relationship of each successive result to the fundamental will follow the harmonic series, recursively, ad infinitum.

Should the auditory thalamocortical feedback loops be cycling L1 & L2 data like this, we would of course have a natural predisposed affinity for the harmonic series, quite irrespective of any tonotopic maps or other specialisations...

Sep 18, 2011
@hush1 The basilar membrane is mechanically limited. Don't care what you say, it is. Just google 'basilar membrane auditory masking' and you will find tons of evidence. Your 'hairs' are on this membrane and are influenced by the resonances of the membrane.

You are plain wrong on a kind of ignorant and tunnelvision one-trick-pony level stating "The limits of the ear are not mechanical in nature, the limits of the ear are biochemical in nature." that I am not going to waste my time arguing on this. You lack fundamental knowledge. Have a nice day.

Sep 18, 2011
on = along

Sep 18, 2011

Isn't tinnitus regarded as a failure of the inhibitory side of auditory opponent processing? I think it's a general feature of processing and not just of ie. colour perception or what have you..

Sep 18, 2011

Lee M. Miller, lotta cool research on auditory thalamocortical processing, example: http://jn.physiol...516.full

Sep 18, 2011

I too am sceptical regarding the limitations of the basilar membrane you allude to - it isn't a homogeneous mass and is designed to accommodate multiple resonances, and of course it is fluid motions against the cilia, driven by the tympanic membrane via the ossicles, that is ultimately responsible for the sensory response.

Could your contention be simplified to say two tones detuned by a few cents, thus causing a similar kind of supposed resonance conflict? Could we not then extend the question beyond the basilar membrane and suppose that other auditory apparatus of the middle ear etc. should be similarly limited in their modes of resonant response?

If we clearly do hear both tones then perhaps there is no mechanical conflict - no amplitude attenuation from destructive interference besides the usual 'beating' roughness of the dissonance as the phases drift in and out of sync.


Sep 18, 2011

At the extreme end of your argument - in a kind of reductio ad absurdum - would be the suggestion all we can hear is an averaged-out mean waveform from which the entire eight octave signal arriving at cortex has been magically reverse engineered. Or is that an unfair criticism? Presumably you accept that a given mass can resonate at harmonic fractions of its fundamental simultaneously, so with enough variety of resonating mass (in terms of morphology) perhaps we cover more spectrum mechanically than your source admits?
