Google engineer creates application that monitors Wikipedia content bots

Feb 19, 2014 by Bob Yirka report
Screenshot of the application. Credit: arXiv:1402.0412 [cs.DL]

Thomas Steiner, a Customer Solutions Engineer at Google Germany GmbH in Hamburg, has created an application that shows clearly how much of Wikipedia's content is being created or edited by bots versus humans. He has also written a paper describing his efforts and posted it to the preprint server arXiv.

Many people may not realize it, but some of the content appearing on Wikipedia is put there by bots rather than human beings. This is because Wikipedia has grown too large to be managed by people alone, especially considering that it is still mostly a volunteer effort.

To keep entries coming and to keep them updated, bots have been created: they grab information from one place and post it in another. They are not actually writers or composers; they are more like auditors, updating files automatically. Many people may also not know that the folks at Wikipedia have created another information repository, Wikidata, a database whose sole purpose is to share data among the different language versions of Wikipedia. If a user in the U.S. enters the results of the New York Marathon into a Wikipedia entry, for example, that data can be automatically ported to Wikidata, where other bots can retrieve it, convert it to the pertinent language and post it to another language version of Wikipedia, all seamlessly from the reader's point of view.

Because of all this automation, some have begun to wonder what portion of Wikipedia's pages is generated by humans versus bots. That's where Steiner comes in: he has written an application that anyone can access and use to see, in real time, what percentage of edits are being made by humans versus bots.
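The core idea behind such monitoring can be sketched in a few lines. The following is a minimal illustration, not Steiner's actual code: in MediaWiki-based wikis, each entry returned by the recent-changes feed (e.g. `https://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rcprop=user|flags&format=json&formatversion=2`) carries a `bot` flag when the editing account is registered as a bot, so tallying bot versus human edits is a matter of counting on that flag. The sample records below are hypothetical.

```python
def tally_edits(changes):
    """Count bot vs. human edits in a list of recent-change records."""
    counts = {"bot": 0, "human": 0}
    for change in changes:
        # Records with a truthy "bot" flag were made by registered bot accounts.
        counts["bot" if change.get("bot") else "human"] += 1
    return counts

def bot_percentage(counts):
    """Share of edits made by bots, as a percentage."""
    total = counts["bot"] + counts["human"]
    return 100.0 * counts["bot"] / total if total else 0.0

# Hypothetical records in the shape the API returns (formatversion=2);
# in the live application these would stream in continuously.
sample = [
    {"user": "ClueBot NG", "bot": True},
    {"user": "ExampleEditor", "bot": False},
    {"user": "InterwikiBot", "bot": True},
    {"user": "AnotherEditor", "bot": False},
]

counts = tally_edits(sample)
print(counts)                                   # {'bot': 2, 'human': 2}
print(f"{bot_percentage(counts):.1f}% by bots")  # 50.0% by bots
```

Running the same tally against each language version's feed is what lets per-wiki comparisons like those in Steiner's application be made in real time.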

The application also reveals other aspects of Wikipedia. A quick glance shows, for example, that bots do much more of the work of adding information in the non-English language versions, which suggests that the majority of Wikipedia content is still being created by real human beings in the U.S. and the U.K. The application also monitors activity on Wikidata for those who are interested, and displays the data for both in a way that shows which bots are most active.

Steiner has also published the code for the application, making it open source. That should allow those who are interested in the murky world of bots to gain an insider's perspective, and perhaps, to add to the utility.

More information: Bots vs. Wikipedians, Anons vs. Logged-Ins, arXiv:1402.0412 [cs.DL]

Wikipedia is a global crowdsourced encyclopedia that at time of writing is available in 287 languages. Wikidata is a likewise global crowdsourced knowledge base that provides shared facts to be used by Wikipedias. In the context of this research, we have developed an application and an underlying Application Programming Interface (API) capable of monitoring realtime edit activity of all language versions of Wikipedia and Wikidata. This application allows us to easily analyze edits in order to answer questions such as "Bots vs. Wikipedians, who edits more?", "Which is the most anonymously edited Wikipedia?", or "Who are the bots and what do they edit?". To the best of our knowledge, this is the first time such an analysis could be done in realtime for Wikidata and for really all Wikipedias—large and small. Our application is available publicly online, and its code has been open-sourced under the Apache 2.0 license.
