London's tweets are mapped to see who speaks what, where

Oct 26, 2012 by Nancy Owano weblog

A doctoral student and a lecturer in spatial analysis have collaborated to produce a map of London's language diversity, based on 3.3 million tweets sent in the city over the course of this year's summer months.

Ed Manley, a PhD student at University College London, and James Cheshire, a lecturer at UCL's Centre for Advanced Spatial Analysis, detected no fewer than 66 languages in use, although unsurprisingly English dominated. Manley, whose work looks at movement around the city, especially how and why it forms and changes, said the data generation method was "quite simple," using Google's Chromium Compact Language Detector. The latter is an open-source Python library adapted from the language-detection algorithm in Chrome.
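The tallying step behind such a map can be sketched in a few lines of Python. This is only an illustrative sketch, not the researchers' code: `toy_detect` is a hypothetical keyword-based stand-in for a real detector such as the Compact Language Detector, `language_shares` is a name invented here, and the sample tweets are made up.

```python
from collections import Counter
from typing import Callable, Dict, Iterable


def toy_detect(text: str) -> str:
    """Hypothetical stand-in for a real language detector (assumption)."""
    lowered = text.lower()
    if "merhaba" in lowered:
        return "tr"
    if "bonjour" in lowered:
        return "fr"
    return "en"


def language_shares(
    tweets: Iterable[str], detect: Callable[[str], str]
) -> Dict[str, float]:
    """Return the percentage of tweets assigned to each language code."""
    counts = Counter(detect(text) for text in tweets)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders languages from most to least frequent
    return {lang: 100.0 * n / total for lang, n in counts.most_common()}


# Made-up sample tweets, for illustration only
sample = [
    "Good morning London",
    "Merhaba Londra",
    "Stuck on the train again",
    "Bonjour tout le monde",
]
shares = language_shares(sample, toy_detect)
```

Passing the detector in as a function keeps the counting logic separate from whichever detection library is actually used, so a real detector could be swapped in without changing the tally.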

Basque, Haitian Creole and Swahili were among the languages detected. Tagalog, spoken in the Philippines, was the seventh most tweeted language. The study also revealed geographic concentrations of tweets: Turkish appeared to be concentrated in the north, Russian in pockets of central London, and Arabic in the west.

The map is interesting, but numerous limitations prevent one from coming away with a true understanding of the languages used daily in London. Many of the tweets came from people with Internet connectivity along commuter train lines and from people at events.

It should also be noted that the study period covered the summer, so many of the languages detected could be attributable to the Olympic Games. Manley fully recognizes the limitations. "I won't dwell too much on discussing the results, only that Twitter appears to reveal itself here to be the severely skewed dataset we all always really knew it was," he said. "In total, 92.5% of [tweets] are detected as English, far above existing estimations (60%) of English speakers in London… languages you'd expect to score highly such as Bengali and Somali barely feature at all. Either people only tweet in English, or usage of Twitter varies significantly among language groups in London."

He realizes that "there is a great deal you can say about bias within the Twitter dataset, but I think I'll save that for another day."

Nonetheless, one can enjoy the fact that Twitter offers scientists opportunities to identify patterns, trends and social networking insights. In this instance, the Twitter exercise confirmed the global nature of London's population. This would not be the first time Twitter has been used for information mapping.

Earlier this year, the New England Complex Systems Institute (NECSI) set out to map the social, political, and geographical properties of news-sharing communities on Twitter. They tracked user-generated messages that linked to online articles from The New York Times. They labeled users according to the topics of the links they shared, their geographic location, and their self-descriptive keywords. With users clustered based on who follows whom on Twitter, they found social groups separated by whether they were interested in local (New York), national (US) or global issues.

Interestingly, while Twitter is said to create communication across continents, it may at the same time be strengthening walls that separate users into ideological camps. "A person who is cosmopolitan associates with others who are cosmopolitan, and a US liberal or conservative associates with others who are US liberal or conservative, creating separated social groups with those identities," said Yaneer Bar-Yam, NECSI president, commenting on the findings.

User comments : 2

3 / 5 (3) Oct 26, 2012
A bias in the system resides in the fact that tweeting is only done in languages that use the Roman alphabet. No kana, kanji, Cyrillic, etc.
not rated yet Oct 29, 2012
Russian was also detected, and it uses a non-Latin alphabet.
