London's tweets are mapped to see who speaks what, where

October 26, 2012 by Nancy Owano, weblog

A doctoral student and a lecturer in spatial analysis have collaborated to produce a language diversity map of London, based on 3.3 million tweets sent in the city over the course of this year's summer months.

Ed Manley, a PhD student at University College London, and James Cheshire, a lecturer at UCL's Centre for Advanced Spatial Analysis, detected no fewer than 66 languages in use, although English unsurprisingly dominated. Manley, whose work looks at movement around the city, especially how and why it forms and changes, said the data generation method was "quite simple," using Google's Chromium Compact Language Detector, an open-source Python library adapted from the Chrome algorithm.
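The tallying step described here can be sketched roughly as follows. This is a minimal illustration, not the researchers' code: `toy_detect` is a hypothetical stand-in for a real detector such as the Compact Language Detector, the sample tweets are invented, and all function names are assumptions.

```python
from collections import Counter
from typing import Callable, Dict, Iterable

# Hypothetical stand-in for a real language detector (for example a CLD binding).
# It only looks for a few common words and is NOT how CLD works.
STOPWORDS = {
    "en": {"the", "and", "is", "to"},
    "es": {"el", "la", "es", "y"},
    "fr": {"le", "et", "est", "je"},
}


def toy_detect(text: str) -> str:
    """Return the language whose common words appear most often, or 'unknown'."""
    words = set(text.lower().split())
    scores = {lang: len(words & vocab) for lang, vocab in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def language_shares(tweets: Iterable[str],
                    detect: Callable[[str], str]) -> Dict[str, float]:
    """Tally detected languages and return each one's share as a percentage."""
    counts = Counter(detect(t) for t in tweets)
    total = sum(counts.values())
    # most_common() orders languages from most to least frequent.
    return {lang: 100.0 * n / total for lang, n in counts.most_common()}


if __name__ == "__main__":
    sample = [  # invented example tweets
        "the bus is late",
        "je suis et le chat",
        "la vida es bella",
        "going to the match",
    ]
    for lang, share in language_shares(sample, toy_detect).items():
        print(f"{lang}: {share:.1f}%")
```

Passing the detector in as an argument means a real library could replace the toy function without changing the tallying logic.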

Basque, Haitian Creole and Swahili were among the languages detected. Tagalog, spoken in the Philippines, was the seventh most tweeted language. The study also revealed geographic concentrations: Turkish tweets appeared to be concentrated in the north, pockets of Russian tweets appeared in central London, and Arabic was concentrated in the west.
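One way to look for concentrations like these is to bin geotagged tweets into a coarse grid and find the most common non-English language in each cell. The sketch below assumes each tweet already carries a latitude, a longitude and a detected language code. The cell size, the function name and the sample coordinates are illustrative assumptions, not details from the study.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Tweet = Tuple[float, float, str]  # (latitude, longitude, language code)
Cell = Tuple[int, int]            # integer grid indices (row, column)


def top_language_per_cell(tweets: Iterable[Tweet],
                          cell_deg: float = 0.05,
                          skip: str = "en") -> Dict[Cell, str]:
    """Return the most frequent language in each grid cell, ignoring `skip`."""
    cells = defaultdict(Counter)
    for lat, lon, lang in tweets:
        if lang == skip:
            continue  # the dominant language would otherwise win every cell
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[cell][lang] += 1
    return {cell: counts.most_common(1)[0][0] for cell, counts in cells.items()}


if __name__ == "__main__":
    sample = [  # invented points, roughly within London
        (51.58, -0.11, "tr"),
        (51.59, -0.12, "tr"),
        (51.58, -0.11, "ru"),
        (51.58, -0.11, "en"),
        (51.51, -0.13, "ru"),
    ]
    for cell, lang in sorted(top_language_per_cell(sample).items()):
        print(cell, lang)
```

A grid in degrees is only a rough approximation, since a degree of longitude covers less ground than a degree of latitude at London's position; a projected coordinate system would give cells of equal size.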

The map is interesting, but numerous limitations prevent one from coming away with a true understanding of the languages used daily in London. Many tweets, from people who had a good signal and who were connected to the Internet, were along train commuter lines and from people at events.

It should also not be forgotten that the study covered the summer period; many of the languages detected could have come from the Olympic Games. Manley fully recognizes the limitations. "I won't dwell too much on discussing the results, only that Twitter appears to reveal itself here to be the severely skewed dataset we all always really knew it was," he said. "In total, 92.5% of tweets are detected as English, far above existing estimations (60%) of English speakers in London…languages you'd expect to score highly such as Bengali and Somali barely feature at all. Either people only tweet in English, or usage of Twitter varies significantly among language groups in London."

He realizes that "there is a great deal you can say about bias within the Twitter dataset, but I think I'll save that for another day."

Nonetheless, one can enjoy the fact that Twitter offers scientists opportunities to identify patterns, trends and social networking insights. In this instance, the Twitter exercise confirmed the global nature of London's population. This would not be the first time Twitter has been used for information mapping.

Earlier this year, the New England Complex Systems Institute (NECSI) set out to map the social, political, and geographical properties of news-sharing communities on Twitter. They tracked user-generated messages that linked to online articles from The New York Times. They labeled users according to the topics of the links they shared, their geographic location, and their self-descriptive keywords. With users clustered based on who follows whom on Twitter, they found social groups separated by whether they were interested in local (New York), national (US) or global issues.

Interestingly, while Twitter is said to create communication across continents, it may at the same time be strengthening walls that separate users into ideological camps. "A person who is cosmopolitan associates with others who are cosmopolitan, and a US liberal or conservative associates with others who are US liberal or conservative, creating separated social groups with those identities," said Yaneer Bar-Yam, NECSI president, commenting on the findings.


3 / 5 (3) Oct 26, 2012
A bias in the system resides in the fact that tweeting is only done in languages that use the Roman alphabet. No kana, kanji, cyrillic, etc.
not rated yet Oct 29, 2012
There was also Russian language, and it uses non-latin alphabet.
