Doing the math 'predicts' which movies will be box office hits

August 22, 2013
Researchers systematically charted the online buzz around certain films. Credit: Shutterstock

Researchers have devised a mathematical model that can predict whether films will become blockbusters or flops at the box office up to a month before the movie is released.

Their model is based on an analysis of the activity on Wikipedia pages about American films released in 2009 and 2010. They examined 312 films, taking into account the number of page views for the movie's article, the number of human editors contributing to the article, the number of edits made and the diversity of online users.

The researchers from Oxford University, the Central European University at Budapest, and Budapest University of Technology and Economics have published their findings in the journal PLOS ONE.

The model was applied retrospectively: the researchers systematically charted the online buzz on Wikipedia around particular films and compared it with the box office takings from the first weekend after release. Comparing the opening weekend revenue predicted by their mathematical model with the actual figures (published in the Internet Movie Database [IMDb]) showed a high degree of correlation.
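As a rough illustration of how predicted and actual opening-weekend figures might be compared, the sketch below computes a Pearson correlation coefficient. The article does not say which statistic the researchers used, so the choice of Pearson's r is an assumption, and the function name `pearson` is hypothetical.

```python
from math import sqrt


def pearson(predicted, actual):
    """Pearson correlation between two equal-length lists of numbers.

    Assumption: this statistic is chosen for illustration only; the
    article does not name the measure used in the study.
    """
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_p * var_a)
```

A value close to 1 would mean the predictions rise and fall together with the real takings; a value near 0 would mean they are unrelated.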

Their model allowed them to predict box office revenues with an overall accuracy of around 77%. The study authors say this level of accuracy is higher than the best existing predictive models applied by marketing firms (which they estimate to be at around 57%). They could predict the box office takings of six out of 312 films with 99% accuracy, meaning the predicted value was within 1% of the real value. Some 23 movies were predicted with 90% accuracy and 70 movies with an accuracy of 70% and above.
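The accuracy figures can be read as one minus the relative error of each prediction, so that "99% accuracy" corresponds to a prediction within 1% of the real value. The sketch below follows that reading; it is an interpretation of the article's wording rather than the study's own code, and the function names and revenue figures are made up for illustration.

```python
def relative_accuracy(predicted, actual):
    """Return 1 - |predicted - actual| / actual, floored at 0.

    Assumption: this is how "accuracy" is read from the article's
    phrase "within 1% of the real value".
    """
    if actual <= 0:
        raise ValueError("actual revenue must be positive")
    return max(0.0, 1.0 - abs(predicted - actual) / actual)


def count_at_least(pairs, threshold):
    """Count (predicted, actual) pairs whose accuracy meets the threshold."""
    return sum(1 for p, a in pairs if relative_accuracy(p, a) >= threshold)


# Hypothetical (predicted, actual) opening-weekend revenues, in millions.
films = [(100.5, 100.0), (80.0, 100.0), (45.0, 100.0), (99.5, 100.0)]

print(count_at_least(films, 0.99))  # films predicted to within 1%
print(count_at_least(films, 0.70))  # films predicted to within 30%
```

Applied to the study's 312 films, counts of this kind would correspond to the six, 23 and 70 films quoted above.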

The more successful the film, the more accurately the researchers were able to predict box office takings. In the study, they explain that this is possibly due to the increased amount of online data generated by films that turn out to be successes. The model correctly forecast the commercial success of Iron Man 2, Alice in Wonderland, Toy Story 3 and Inception, but failed to accurately forecast the financial return on the less successful movies Never Let Me Go and Animal Kingdom.

Dr Taha Yasseri, from the Oxford Internet Institute at the University of Oxford, said: 'These results can be of great value to marketing firms but more importantly for us, we were able to demonstrate how we can use socially generated online data to predict a lot about future human behaviour. The predicting power of the Wikipedia-based model, despite its simplicity compared with Twitter, is that many of the editors of the Wikipedia pages about the movies are committed movie-goers who gather and edit relevant material well before the release date. By contrast, the "mass" production of tweets occurs very close to the release time, and often these can be spun by marketing agencies rather than reflecting the feelings of the public.'

Co-author Professor János Kertész, from the Central European University of Budapest, Hungary, said: 'We have demonstrated for the first time that Wikipedia edit statistics provide us with another tool to predict social events. We studied the problem of predicting the financial success of movies and concluded that, in some aspects, forecasting based on Wikipedia outperforms tweets as Wikipedia activity has a longer timescale which enables earlier predictions.'

The study suggests that the efficiency of the predictions might be improved by applying more sophisticated statistical methods, such as including the controversy measure of an article. The model has not yet been applied to films that are not on release.


5 / 5 (1) Aug 22, 2013
The study authors say this level of accuracy is higher than the best existing predictive models applied by marketing firms (which they estimate to be at around 57%)

57% ? That's slightly better than a wild guess. These marketing firms aren't worth their money.
1 / 5 (2) Aug 22, 2013
The choice of hits depends on many socio-psychological factors, for example at the time of economical crisis the frustrated people are getting more satisfied with simpler enjoyment providing movies and vice-versa. You cannot get these connections with analysis of movie content only.

Indeed. You can do your best but ultimately all models and algorithms are at the "mercy" of reality. To use a somewhat simplistic example: if you predict a movie is going to gross about a billion dollars and a huge solar flare knocks out power in North America and Europe on its opening night and lasts for two weeks, your model is going to be off by about 990 million dollars give or take....
4.7 / 5 (3) Aug 22, 2013
if only they'd put this kind of effort into making GOOD movies, instead of these gazillion dollar "blockbusters" where all the money goes to big names, over the top cgi, nauseating 3d, and not to writing, directing, or decent camera work.
would anyone even recognize a good, well told story?
not rated yet Aug 22, 2013
would anyone even recognize a good, well told story?

That's probably why the money doesn't go to directing, writing and camera work anymore. They don't matter.

Case in point: Comic book adaptations are heralded as cinematic milestones.
Now I don't know about you, but Batman or the Avengers or even Watchmen (enjoyable as they are on their level) aren't Shakespeare. They aren't even George Lucas. They're intellectual tripe with big explosions and fancy costumes, with occasional pretensions to basic literacy and infantile philosophy.
There's a place for that in cinema, for sure. Heck, we all use movies sometimes to unwind - and you don't need Tolstoy type writing to do that.
But overhyping these movies into 'genius writing' and 'deep insights into the human psyche' is just ludicrous.
5 / 5 (1) Aug 22, 2013
"we were able to demonstrate how we can use socially generated online data to predict a lot about future human behaviour."

Reminds me of Asimov's "psychohistory". Apparently it doesn't require trillions of humans to be at least somewhat accurate...
not rated yet Aug 22, 2013
Ok, they developed a model based on 2009 and 2010. How well does the model predict 2011, 2012 and 2013?
not rated yet Aug 23, 2013
The single criterion for being a hit is making lots of money. Striving for that, producers focus on, spend lots of money on, do all sorts of things, that don't contribute to the quality of their movies.
1 / 5 (1) Aug 26, 2013
The study authors say this level of accuracy is higher than the best existing predictive models applied by marketing firms (which they estimate to be at around 57%)

57% ? That's slightly better than a wild guess. These marketing firms aren't worth their money.

and you're just figuring this out now?

but just you wait a while, it will get worse: as soon as this report hits their radar, they'll start swarming the Wikipedia and stacking all the other buzz-routes ten times as hard, their eyes glassed over with the mistaken belief that this hit-maker relationship is bilateral. there goes the neighbourhood.
not rated yet Aug 26, 2013
I can predict box office hits more than 90% of the time.
One useful predictor is to choose the opposite of what "hi-brow" critics tout.
Another useful predictor is lack of competition at release time.
Still another, and probably the best predictor, is a well written script. Gizmos and gadgets are neat but without a good plot and story line the movie is a Titanic waiting to happen.
Movies need the backing of a good Producer. They need the artistic craftsmanship of a good director who is trying to visually tell the story that the writer(s) intend(s). Acting is the necessary part which holds the other three together....the glue? Lastly, the movie needs a good support system such as all those unseen people who do the necessary things that make a film production go. Any one of these pieces, in and of themselves, are crucial but there is a synergy to the box office hit or lack of it in the box office dud.
