Averaging the wisdom of crowds


The best decisions are based on the average of several estimates, as research by Dennie van Dolder and Martijn van den Assem, scientists at VU Amsterdam, confirms. Using data from Holland Casino promotional campaigns, they investigated whether the average of people's estimates really does lie relatively close to the true value. The results of the research have been published in Nature Human Behaviour.

During the last seven weeks of 2013, 2014 and 2015, Holland Casino visitors could participate in an estimation contest. The participant with the most accurate estimate of the number of pearls in a giant champagne glass won the tidy sum of €100,000. In total, no fewer than 1.2 million people participated over the three years. "For our research, we analysed the three enormous datasets of these promotional campaigns. The data showed that averaging all of the estimates yields significant accuracy gains," says Van den Assem. "We also looked at the estimates of people who participated multiple times."

Wisdom of crowds

More than a century ago, the famous British scientist Sir Francis Galton researched estimation contests that were very similar to the estimation contest at Holland Casino. At a cattle market, visitors could estimate the slaughter weight of an exhibited ox. Galton examined the estimates made by people and found that, surprisingly, the average estimate differed little from reality. The principle that averaging multiple estimates provides a relatively accurate outcome—often better than most underlying estimates and sometimes even better than all—has come to be known as the "Wisdom of Crowds principle." It is an important principle because accurate estimates are crucial for making good decisions.
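As a rough illustration of why averaging helps, here is a minimal simulation sketch. It is not based on the study's data: the crowd size, the "true" value and the spread of the guesses are all assumed for illustration. It merely shows that, when individual guesses scatter roughly symmetrically around the truth, their average tends to beat the large majority of the individual guesses.

```python
import numpy as np

rng = np.random.default_rng(0)

TRUE_VALUE = 1198          # assumed true quantity (e.g. an ox's weight in pounds)
N_GUESSERS = 800           # assumed crowd size

# Assume each guess is the truth plus idiosyncratic, roughly unbiased noise.
guesses = TRUE_VALUE + rng.normal(0, 75, size=N_GUESSERS)

crowd_average = guesses.mean()
crowd_error = abs(crowd_average - TRUE_VALUE)
individual_errors = np.abs(guesses - TRUE_VALUE)

# The crowd average typically beats most individual guesses.
beaten = np.mean(crowd_error < individual_errors)
print(f"crowd average: {crowd_average:.1f}, error: {crowd_error:.1f}")
print(f"fraction of individuals the average beats: {beaten:.1%}")
```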

Wisdom of inner crowds

Also important is the analysis of the estimates from people who participated multiple times. Researchers have recently suggested that it is also useful to average estimates that come from the same person. Van Dolder and Van den Assem find that averaging a person's own estimates does indeed improve accuracy, and that a 'wisdom of inner crowds' therefore also exists.

This is an attractive idea because it is often easier to make multiple estimates yourself than to involve other people. For issues that require a high degree of specialised expertise, and for private matters, decision makers have to rely on themselves. Which holiday will you book? Will you stay with your partner or not? Will you move to a particular city or not? The research suggests that to reach a good decision, it is better to think about it at different times of day, with a few nights of sleep in between.

In comparison, however, accuracy improves far more when you average estimates from different people: the average of a large number of estimates from the same person is hardly ever better than the average of two estimates from different people. Van Dolder: "For the quality of estimates, it is therefore better if two people are both engaged in the same two projects than when each focuses entirely on an individual project."
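One common way to make sense of this finding is to assume that a person's repeated estimates share a personal systematic bias on top of independent noise, while different people's biases tend to cancel out. The sketch below uses exactly that model, which is an assumption for illustration and not taken from the paper; the true value, bias spread and noise level are likewise made up. Under these assumptions it reproduces the pattern qualitatively: many estimates from one person help less than two estimates from two people.

```python
import numpy as np

rng = np.random.default_rng(1)

TRUE_VALUE = 100.0
N_TRIALS = 5_000
PERSON_BIAS_SD = 10.0      # assumed spread of each person's systematic bias
NOISE_SD = 5.0             # assumed trial-to-trial noise within a person

def person_estimates(n, rng):
    """n estimates from one person: shared bias + independent noise (model assumption)."""
    bias = rng.normal(0, PERSON_BIAS_SD)
    return TRUE_VALUE + bias + rng.normal(0, NOISE_SD, size=n)

err_inner = np.empty(N_TRIALS)   # error of the average of many estimates from one person
err_outer = np.empty(N_TRIALS)   # error of the average of two estimates from two people
for i in range(N_TRIALS):
    err_inner[i] = abs(person_estimates(10, rng).mean() - TRUE_VALUE)
    two = [person_estimates(1, rng)[0], person_estimates(1, rng)[0]]
    err_outer[i] = abs(np.mean(two) - TRUE_VALUE)

print(f"mean error, 10 estimates from one person : {err_inner.mean():.2f}")
print(f"mean error, 2 estimates from two people  : {err_outer.mean():.2f}")
```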

Two heads really are better than one, and for good decision making, taking the average of the estimates of various people remains the best approach.



More information: Dennie van Dolder et al, The wisdom of the inner crowd in three large natural experiments, Nature Human Behaviour (2017). DOI: 10.1038/s41562-017-0247-6
Journal information: Nature Human Behaviour

Citation: Averaging the wisdom of crowds (2017, December 12) retrieved 25 May 2019 from https://phys.org/news/2017-12-averaging-wisdom-crowds.html

User comments

Dec 12, 2017
Note that the effect relies on counting all answers.

Not just those that "seem to be close": the criteria you use to exclude some answers but not others will introduce a bias into your system. Instead of the answers being a random spread around a mean, you lop off one side of the distribution and shift its mean away from the correct answer.

I.e. if you take the "consensus" argument and say "97 out of a hundred of these people agree that this bull weighs more than 1000 pounds", and then listen to only those 97 people, you're liable to get the resulting estimate wrong(er).

Sure, the 3 remaining people can't shift the answer very much, but if you're doing something like predicting the climate or the economy, where the changes are cumulative (positive feedback) and your projections are exponential, the differences between projections are also exponential and your error grows exponentially the further ahead you're trying to predict.
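A minimal simulation sketch of the commenter's point (all numbers are assumed for illustration and are not from the article): dropping one tail of otherwise unbiased guesses shifts the average away from the true value.

```python
import numpy as np

rng = np.random.default_rng(2)

TRUE_WEIGHT = 1000.0                                        # assumed true bull weight (pounds)
guesses = TRUE_WEIGHT + rng.normal(0, 120, size=10_000)     # assumed unbiased guesses

# Keep only the ~97% of people whose guess exceeds some cutoff,
# i.e. lop off one tail of the distribution.
cutoff = np.quantile(guesses, 0.03)
kept = guesses[guesses > cutoff]

print(f"mean of all guesses : {guesses.mean():.1f}")   # close to the true value
print(f"mean of kept guesses: {kept.mean():.1f}")      # shifted upward by the trimming
```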

Dec 12, 2017
they have researched whether it is true that when people make estimates, the average of their estimates is relatively close to reality
...under the proviso that the crowd has at least some knowledge of the subject.

As a counterexample there's the story of the Chinese emperor's nose: the Chinese emperor lives in the Forbidden City and hence no one can see him. But people wanted to know how long his nose was. So they made questionnaires and averaged the results.

That this doesn't produce any useful data should be obvious.

http://imaginator...nose.htm

Similarly it's rather pointless to 'crowdsource' anything that requires specialized knowledge - particularly if it includes deeply counterintuitive subject matter (quantum mechanics, relativity, economics, probabilities, ... ).

Dec 12, 2017
Van Dolder and Van den Assem believe that averages from the same person do indeed work, and that therefore 'wisdom of inner crowds' also exists.

This is used, e.g., when assessing whether a new type of method (e.g. a postprocessing/classification algorithm that requires some user input) is useful: you do two types of test:

interoperator
and
intraoperator

Interoperator test works like this: you let a number of people do the task and quantify the accuracy and the precision in the gathered datasets.

Intraoperator test works like this: let a number of operators do each task several times and quantify the accuracy and precision for each operator *separately*.

Both results can tell you a lot about how useful your semiautomatic method is (Is it robust against different operators? Is it robust against one operator making small changes in his/her routine?)
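A minimal sketch of what such tests might look like in code (the data model, the numbers and the variable names are assumed for illustration, not taken from any particular study): each operator is modelled with a small systematic offset plus random variation, then accuracy and precision are computed pooled across operators (interoperator) and per operator (intraoperator).

```python
import numpy as np

rng = np.random.default_rng(3)

TRUE_VALUE = 50.0          # assumed ground truth for the measured quantity
N_OPERATORS = 5
N_REPEATS = 8              # each operator repeats the task this many times

# Assumed model: each operator has a systematic offset plus random variation.
results = {}
for op in range(N_OPERATORS):
    offset = rng.normal(0, 2.0)
    results[f"op{op}"] = TRUE_VALUE + offset + rng.normal(0, 1.0, size=N_REPEATS)

# Interoperator view: pool all measurements, overall accuracy and precision.
pooled = np.concatenate(list(results.values()))
print(f"interoperator accuracy (mean error): {pooled.mean() - TRUE_VALUE:+.2f}")
print(f"interoperator precision (std)      : {pooled.std(ddof=1):.2f}")

# Intraoperator view: accuracy and precision for each operator separately.
for name, vals in results.items():
    print(f"{name}: accuracy {vals.mean() - TRUE_VALUE:+.2f}, precision {vals.std(ddof=1):.2f}")
```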


Dec 13, 2017
"Democracy is based on the assumption that a million men are wiser than one man. How's that again? I missed something.
Autocracy is based on the assumption that one man is wiser than a million men. Let's play that over again, too. Who decides?" -Lazarus Long

Kylos
"These oligarchs are overthrown by the people who set up a democracy. Democracy soon becomes corrupt and degenerates into ochlocracy, beginning the cycle anew."

-Democracy cannot survive for long without being controlled by a hidden elite. See Plato's Republic.


Dec 16, 2017
As a counterexample there's the story of the Chinese emperor's nose: The chinese emperor lives in the forbidden city and hence no one can see him. But people wanted to know how long his nose was. So they made questionnaires and averaged the results.


That's not actually a counter-example, as the people do know something about the emperor: he's human - therefore the guesses will tend to average towards the typical length of a human nose, which is likely to be at least in the same ballpark as the correct answer.

The wisdom of the crowd relies in part on the effect where people have some hidden knowledge about the subject, something which can't necessarily be articulated but will influence their guess. In the case of the Chinese emperor, people may not have seen him, but that does not mean they can know nothing of him and therefore have no intuitions.


Dec 16, 2017
Similarly it's rather pointless to 'crowdsource' anything that requires specialized knowledge - particularly if it includes deeply counterintuitive subject matter (quantum mechanics, relativity, economics, probabilities, ... ).


I'd pick economics out of the list, as economics is the one subject which the specialists get consistently wrong and fail to predict anything over the long term, while the crowd gets it consistently right - because the crowd IS the economy. How could the crowd be wrong about itself?

If you want to know what's coming up, just ask everyone what they are about to do.
