First large-scale analysis of 'soft' censorship of social media in China

Mar 07, 2012

Researchers in Carnegie Mellon University's School of Computer Science analyzed millions of Chinese microblogs, or "weibos," to uncover a set of politically sensitive terms that draw the attention of Chinese censors. Individual messages containing the terms were often deleted at rates that could vary based on current events or geography.

The study is the first large-scale analysis of the censorship of political content in social media, a topic that drew attention and controversy earlier this year when Twitter announced a country-by-country policy for removing tweets that don't comply with local laws.

In China, where online censorship is highly developed, the researchers found that oft-censored terms included well-known hot buttons, such as Falun Gong, a spiritual movement banned by the Chinese government, and the names of activists Ai Weiwei and Liu Xiaobo. Others varied based on events; Lianghui, a term that normally refers to a joint meeting of China's parliament and its political advisory body, became subject to censorship when it emerged as a code word for "planned protest" during pro-democracy unrest that began in February 2011.

The CMU study also showed high rates of weibo censorship in certain provinces. The phenomenon was particularly notable in Tibet, a hotbed of political unrest, where up to 53 percent of locally generated microblogs were deleted.

The study by Noah Smith, associate professor in the Language Technologies Institute (LTI); David Bamman, a Ph.D. student in LTI; and Brendan O'Connor, a Ph.D. student in the Machine Learning Department, appears in the March issue of First Monday, a peer-reviewed, online journal.

"A lot of studies have focused on censorship that blocks access to Internet sites, but the practice of deleting individual messages is not yet well understood," Smith said. "The rise of domestic Chinese microblogging sites has provided a unique opportunity to systematically study [this practice] in detail."

The so-called Great Firewall of China, which prevents Chinese residents from accessing foreign websites such as Google and Facebook, is China's best known censorship tool. Other countries also are known to block Web access, such as when Egypt shut down Twitter and other social media sites during last year's Arab Spring protests.

But blocking access to all sites and services is impossible if China or any other country is to harness the Web's commercial and educational potential, Bamman said. An alternative is to allow access to sites, but police the content, eliminating messages deemed objectionable. Automated methods may be used to eliminate some messages, while others are deleted manually, he noted. Seldom are all weibos with a sensitive term deleted, but anecdotal evidence is overwhelming that certain messages are targeted.

"You even see some weibos where the writer asks, 'Is this going to be deleted?'" O'Connor said. In late 2010, New York Times columnist Nicholas Kristof opened an account on a Chinese microblog site; within an hour of sending a message about Falun Gong, his account was shut down.

To study this "soft" censorship, the CMU team analyzed almost 57 million messages posted on Sina Weibo, a domestic Chinese microblog site similar to Twitter that has more than 200 million users. They collected samples of weibos from June 27 to Sept. 30, 2011, using an application programming interface (API) that Sina Weibo provides to developers so they can build related services.

Using the same API, they later rechecked two subsets of weibos, a random sample and a set containing terms known to be politically sensitive, to see whether the messages still existed. If a weibo was deleted, Sina would return what the researchers came to regard as an ominous message: "target weibo does not exist."
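The recheck step can be pictured with a minimal sketch. This is not the researchers' code: the `fetch` callable, the message ids, and the stand-in store are hypothetical, and the only detail taken from the study is the error text Sina returned for a deleted message.

```python
# Error text quoted in the article for a deleted weibo.
DELETED_MARKER = "target weibo does not exist"


def is_deleted(response_text):
    """Treat a response containing the quoted error text as a deleted weibo."""
    return DELETED_MARKER in response_text


def recheck(message_ids, fetch):
    """Return the ids whose re-fetch reports the message as missing.

    `fetch` is a hypothetical callable standing in for an API lookup:
    it takes a message id and returns the response text.
    """
    return [mid for mid in message_ids if is_deleted(fetch(mid))]


# Stand-in for a real API, for illustration only (made-up ids and text).
fake_store = {101: "some message text", 103: "another message"}


def fake_fetch(mid):
    # Messages absent from the store are reported with the deletion marker.
    return fake_store.get(mid, DELETED_MARKER)


deleted = recheck([101, 102, 103], fake_fetch)
print(deleted)  # [102]
```

Passing the lookup in as a function keeps the sketch independent of any particular API client.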

In late June and early July, for instance, rumors began circulating of the death of Jiang Zemin, a former general secretary of the Communist Party of China who came to power during the Tiananmen Square protests of 1989. On July 6, at the height of the rumor, 64 of 83 messages containing his name were deleted; on July 7, 29 of 31 such messages were deleted.
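Expressed as rates, the counts for those two days work out as follows. The snippet is only a worked calculation on the figures quoted above; the variable and function names are illustrative.

```python
# (deleted, total) messages naming Jiang Zemin, as reported in the article.
reported = {"July 6": (64, 83), "July 7": (29, 31)}


def deletion_rate(deleted, total):
    """Return the fraction of messages that were deleted."""
    if total <= 0:
        raise ValueError("total must be positive")
    return deleted / total


rates = {day: deletion_rate(d, t) for day, (d, t) in reported.items()}
for day, rate in rates.items():
    print(f"{day}: {rate:.1%} deleted")
# July 6: 77.1% deleted
# July 7: 93.5% deleted
```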

As another check, the researchers compared the frequency of such messages on Sina Weibo with those on the Chinese language version of Twitter, which officially is blocked by China but can still be accessed by net-savvy people. On July 6, Jiang's name appeared in one out of every 75 tweets, but just one out of every 5,666 messages on Sina Weibo – another indication that the Jiang conversations on Sina Weibo are suppressed.
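The size of that gap is easier to see as a ratio. The short calculation below uses only the two July 6 frequencies quoted above; the variable names are illustrative.

```python
# Frequencies reported for July 6: the name appeared in one message out of N.
twitter_one_in = 75
weibo_one_in = 5666

twitter_freq = 1 / twitter_one_in
weibo_freq = 1 / weibo_one_in

# How many times more often the name appeared on Twitter than on Sina Weibo.
ratio = twitter_freq / weibo_freq
print(f"Jiang's name was about {ratio:.0f} times more frequent on Twitter")
# Jiang's name was about 76 times more frequent on Twitter
```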

Many weibos with high deletion rates included terms and names known to be politically sensitive, such as Fang Binxing, the architect of the Great Firewall of China, and references to state propaganda. Others reflect sensitivity to events; a term meaning "to ask someone to resign" became subject to deletion following the high-speed rail crash that killed 40 people in Wenzhou last July and apparently referenced the minister of railways.

Censored terms are not always political. Following the March 2011 Fukushima nuclear disaster in Japan, weibos containing such politically innocuous terms as iodized salt and radioactive iodine had high deletion rates. The researchers believe these deletions were the result of government efforts to quash false rumors about the nuclear accident causing salt contamination.

Not all deletions are necessarily state-instigated censorship, the researchers noted. Spam and pornographic content also are subject to deletion, just as they are in the United States.

By establishing a methodology for studying soft censorship in China, the researchers say they now have a means for actively monitoring social media censorship as it changes over time. They also may have the means to probe deeper, identifying code words and metaphors used to sidestep censors.
