US Library of Congress backtracks on complete Twitter archive

December 27, 2017

The US Library of Congress has scaled back plans to archive every message ever sent on Twitter, sparking debate on the importance of social media in historical records.

The library, which is believed to be the largest in the world, with a mission of preserving important national and global cultural records, announced this week that it would stop collecting the entire Twittersphere's tweets from January 2018.

"Effective January 1, 2018, the Library will acquire tweets on a selective basis—similar to our collections of web sites," the library's communications director Gayle Osterberg said in a blog post.

"The Library regularly reviews its collections practices to account for environmental shifts, diversity of collections and topics, cost effectiveness, use of collections and other factors. This change results from such a review."

Officials cited several reasons for the decision: the volume of the database is much bigger than it was a few years ago and the library lacks the capacity to deal with images and items other than text.

The library in 2010 began its tweet archive after receiving a "gift" from Twitter of the full database of public tweets dating from the first tweet in 2006, but has not determined when or how to make this public.

The archive "will remain embargoed until access issues can be resolved in a cost-effective and sustainable manner," Osterberg said.

A statement said the library from 2018 "will continue to acquire tweets but will do so on a very selective basis," adding that the collection will likely be "thematic and event-based, including events such as elections, or themes of ongoing national interest, e.g. public policy."

Still, it added that the 12-year archive of tweets already collected "may prove to be one of this generation's most significant legacies to future generations."

The move prompted considerable reaction—including on Twitter.

"Not good," tweeted CNBC news associate Mariam Amin.

"I want every tweet to be archived. In 40 years, I want to take my granddaughter to the Library of Congress and show her the madness I dealt with as a journalist...make every tweet count."

How to choose?

Others questioned how the institution would determine which tweets are historically important.

The library "now needs to decide which of all our tweets are a valuable record of events in public life," tweeted the consumer activist group Public Knowledge.

But the news site PoliticusUSA appeared to welcome the decision, tweeting that the library "would no longer be wasting its resources by trying to archive every single public post published on Twitter."

Some Twitter users lamented the quality of the messaging platform and the impact of prolific tweeter Donald Trump.

"Trump has managed to lower the value of Twitter," one user wrote.

"Now the Library of Congress won't archive every Tweet. #sorrytwitter. That must be because most of his tweets are embarrassing to the rest of the nation."

Jennifer Grygiel, communications professor at Syracuse University, said the move was disappointing from an institution "which is perceived to be the greatest library and in our country."

"This could lead to increased rhetoric about how there is too much user-generated social media content for companies to effectively manage or moderate," Grygiel told AFP

"Social media is not 'too big to moderate;' it takes time, money, and resources to effectively manage content."

7 comments

IronhorseA
5 / 5 (3) 11 hours ago
" it takes time, money, and resources"

Save them and donate them to a soup kitchen.
mackita
5 / 5 (3) 10 hours ago
The world is full of people whose meaning of life is pure reproduction without any value added (if at all). Sometimes it's better not to eat up all the money and to invest it in things of permanent value. I'm just not sure that Twitter posts belong among them, but at least they're brief and easy to archive.
LED Guy
5 / 5 (1) 10 hours ago
Some of the comments in the article are pure egoism.

Billions of mostly trivial messages every day aren't easy to archive. Archiving means you have to sort and catalog, and that takes resources. Diverting the resources necessary to continue archiving tweets could make a much bigger impact.
mackita
not rated yet 6 hours ago
LED Guy: Which part of the web would be less difficult to archive? Compared with web pages, which have versions, history, difficult formatting and huge/deep CSS/script/embedded file references, tweets are the prototype of a system that is easy to archive.
mackita
not rated yet 6 hours ago
The first tweet hit the internet on March 21, 2006, and it wasn't until 2009 that the firm reached its billionth tweet. Every second, on average, around 6,000 tweets are sent on Twitter, which corresponds to over 350,000 tweets per minute, 500 million tweets per day and around 200 billion tweets per year. But each tweet (140 characters max) needs at most 560 8-bit bytes, so only a few average hard disks (2 terabytes) could archive this whole amount of data (one engineer at Twitter gave a presentation that suggested it's about 200 bytes per tweet). Every day, 3.5 billion malicious Tweets spread spam and viruses, because each Tweet is sent to a large list of people.
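
As a rough check on the arithmetic in the comment above, here is a minimal sketch in Python. Every input is a figure quoted by the commenter, not a measured value. Two readings are assumptions made for this sketch: that 560 bytes means 140 characters at up to 4 bytes each, and that a "2 terabyte" disk holds 2 x 10^12 bytes. The function names are made up for illustration. The estimate covers tweet text only and ignores metadata, images, indexing and backups.

```python
# Back-of-envelope check of the figures quoted in the comment above.
# All inputs are the commenter's numbers, not measured values.

TWEETS_PER_SECOND = 6_000        # average rate quoted in the comment
BYTES_PER_TWEET_MAX = 560        # assumed: 140 characters at up to 4 bytes each
BYTES_PER_TWEET_TYPICAL = 200    # figure the comment attributes to a Twitter engineer
DISK_BYTES = 2 * 10**12          # assumed: one 2-terabyte disk, decimal terabytes

SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365


def tweets_per(seconds: int, rate: int = TWEETS_PER_SECOND) -> int:
    """Number of tweets sent in the given number of seconds at a constant rate."""
    return rate * seconds


def disks_per_year(bytes_per_tweet: int) -> float:
    """2 TB disks needed to hold one year of tweet text at the given size."""
    yearly_tweets = tweets_per(SECONDS_PER_DAY * DAYS_PER_YEAR)
    return yearly_tweets * bytes_per_tweet / DISK_BYTES


if __name__ == "__main__":
    print(f"per minute: {tweets_per(SECONDS_PER_MINUTE):,}")
    print(f"per day:    {tweets_per(SECONDS_PER_DAY):,}")
    print(f"per year:   {tweets_per(SECONDS_PER_DAY * DAYS_PER_YEAR):,}")
    print(f"disks/year at 200 bytes: {disks_per_year(BYTES_PER_TWEET_TYPICAL):.1f}")
    print(f"disks/year at 560 bytes: {disks_per_year(BYTES_PER_TWEET_MAX):.1f}")
```

Under these assumptions the rates come out at 360,000 tweets per minute, about 518 million per day and about 189 billion per year, in line with the rounded figures in the comment. One year of raw text would then need roughly 19 disks at 200 bytes per tweet, or roughly 53 at 560 bytes, which is somewhat more than "a few" but still a modest amount of hardware.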
Nik_2213
5 / 5 (1) 5 hours ago
Um, they could ask CERN how to filter good stuff from the noise 'on the fly'...
BendBob
not rated yet 2 hours ago
Now is the time for newspapers to follow the lead. We (I guess me at least) don't need to read a news story that gives the text of a twit'er user and then pastes the twit's text verbatim. Once is enough, please.

