US Library of Congress backtracks on complete Twitter archive

December 27, 2017

The US Library of Congress has scaled back plans to archive every message ever sent on Twitter, sparking debate on the importance of social media in historical records.

The library, which is believed to be the largest in the world with a mission of preserving important national and global cultural records, announced this week it would stop collecting the entire Twittersphere's tweets from January 2018.

"Effective January 1, 2018, the Library will acquire tweets on a selective basis—similar to our collections of web sites," the library's communications director Gayle Osterberg said in a blog post.

"The Library regularly reviews its collections practices to account for environmental shifts, diversity of collections and topics, cost effectiveness, use of collections and other factors. This change results from such a review."

Officials cited several reasons for the decision: the volume of the database is much bigger than it was a few years ago and the library lacks the capacity to deal with images and items other than text.

The library in 2010 began its tweet archive after receiving a "gift" from Twitter of the full database of public tweets dating from the first tweet in 2006, but has not determined when or how to make this public.

The archive "will remain embargoed until access issues can be resolved in a cost-effective and sustainable manner," Osterberg said.

A statement said the library from 2018 "will continue to acquire tweets but will do so on a very selective basis," adding that the collection will likely be "thematic and event-based, including events such as elections, or themes of ongoing national interest, e.g. public policy."

Still, it added that the 12-year archive of tweets already collected "may prove to be one of this generation's most significant legacies to future generations."

The move prompted considerable reaction—including on Twitter.

"Not good," tweeted CNBC news associate Mariam Amin.

"I want every tweet to be archived. In 40 years, I want to take my granddaughter to the Library of Congress and show her the madness I dealt with as a journalist...make every tweet count."

How to choose?

Others questioned how the institution would determine which tweets are historically important.

The library "now needs to decide which of all our tweets are a valuable record of events in public life," tweeted the consumer activist group Public Knowledge.

But the news site PoliticusUSA appeared to welcome the decision, tweeting that the library "would no longer be wasting its resources by trying to archive every single public post published on Twitter."

Some Twitter users lamented the quality of the messaging platform and the impact of prolific tweeter Donald Trump.

"Trump has managed to lower the value of Twitter," one user wrote.

"Now the Library of Congress won't archive every Tweet. #sorrytwitter. That must be because most of his tweets are embarrassing to the rest of the nation."

Jennifer Grygiel, communications professor at Syracuse University, said the move was disappointing from an institution "which is perceived to be the greatest library and in our country."

"This could lead to increased rhetoric about how there is too much user-generated social media content for companies to effectively manage or moderate," Grygiel told AFP.

"Social media is not 'too big to moderate;' it takes time, money, and resources to effectively manage content."



4.3 / 5 (4) Dec 27, 2017
" it takes time, money, and resources"

Save them and donate them to a soup kitchen.
4 / 5 (4) Dec 27, 2017
The world is full of people whose meaning of life is pure reproduction without any value added (if at all). Sometimes it's better not to eat all the money and to invest it in things of permanent value. I'm just not sure that Twitter posts belong among them - but at least they're brief and easy to archive.
5 / 5 (1) Dec 27, 2017
Some of the comments in the article are pure egoism.

Billions of mostly trivial messages every day aren't easy to archive. Archiving means you have to sort and catalog, and that takes resources. Diverting the resources needed to keep archiving tweets could make a much bigger impact.
not rated yet Dec 27, 2017
LED Guy: Which part of the web would be less difficult to archive? Compared with web pages, which have versions, history, difficult formatting and huge, deep references to CSS, scripts and embedded files, tweets are the prototype of a system for easy archiving.
not rated yet Dec 27, 2017
The first tweet hit the internet on March 21, 2006, and it wasn't until 2009 that the firm reached its billionth tweet. Every second, on average, around 6,000 tweets are tweeted on Twitter, which corresponds to over 350,000 tweets sent per minute, 500 million tweets per day and around 200 billion tweets per year. But each tweet (140 characters max) needs at most 560 8-bit bytes, so only a few average hard disks (2 terabytes) could archive this whole amount of data (one engineer at Twitter gave a presentation that suggested it's about 200 bytes per tweet). Every day, 3.5 billion malicious tweets spread spam and viruses, because each tweet is sent to a large list of people.
5 / 5 (1) Dec 27, 2017
Um, they could ask CERN how to filter good stuff from the noise 'on the fly'...
not rated yet Dec 27, 2017
Now is the time for newspapers to follow the lead. We (I guess me at least) don't need to read a news story with the text of a twit'er user, then pasting the twit's text verbatim. Once is enough, please.
