Business information consumption: 9,570,000,000,000,000,000,000 bytes per year

April 6, 2011

(PhysOrg.com) -- Three scientists at UC San Diego have rigorously estimated the annual amount of business-related information processed by the world's computer servers in terms that Gutenberg and Galileo would have appreciated: the digital equivalent of a 5.6-billion-mile-high stack of books from Earth to Neptune and back to Earth, repeated about 20 times a year.

The world's roughly 27 million enterprise servers processed 9.57 zettabytes of information in 2008, according to a paper to be presented April 7 at Storage Networking World's (SNW's) annual meeting in Santa Clara, Calif.

The first-of-its-kind rigorous estimate was generated with server-processing performance standards, server-industry reports, interviews with information technology experts, sales figures from server manufacturers and other sources. (One zettabyte is 10 to the 21st power bytes, or a million million gigabytes.)

The study estimated that enterprise server workloads are doubling about every two years, which means that by 2024 the world's enterprise servers will annually process the digital equivalent of a stack of books extending more than 4.37 light-years to Alpha Centauri, our closest neighboring star system in the Galaxy. (Each book is assumed to be 4.8 centimeters thick and contain 2.5 megabytes of information.)
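The book-stack comparisons can be checked with a few lines of arithmetic. The sketch below is not taken from the report; it uses only the figures quoted above (9.57 zettabytes, 2.5 megabytes and 4.8 centimeters per book, a 5.6-billion-mile round trip) plus standard unit conversions. Treating 2008 as the base year and applying an exact two-year doubling period through 2024 are assumptions made here for illustration.

```python
# Back-of-the-envelope check of the book-stack comparison.
# Inputs are the article's figures; the function names are illustrative only.

BYTES_PER_ZETTABYTE = 1e21      # decimal zettabyte, as defined in the article
BYTES_PER_BOOK = 2.5e6          # 2.5 megabytes per book (article's assumption)
BOOK_THICKNESS_CM = 4.8         # article's assumption
KM_PER_MILE = 1.609344          # standard conversion
KM_PER_LIGHT_YEAR = 9.4607e12   # standard conversion
ROUND_TRIP_MILES = 5.6e9        # Earth to Neptune and back, per the article


def stack_height_km(zettabytes: float) -> float:
    """Height of a stack of books holding the given amount of data."""
    books = zettabytes * BYTES_PER_ZETTABYTE / BYTES_PER_BOOK
    return books * BOOK_THICKNESS_CM / 100 / 1000  # cm -> m -> km


def projected_zettabytes(base_zb: float, base_year: int, target_year: int,
                         doubling_years: float = 2.0) -> float:
    """Extrapolate a workload that doubles every `doubling_years` years."""
    return base_zb * 2 ** ((target_year - base_year) / doubling_years)


height_2008_km = stack_height_km(9.57)
round_trips = height_2008_km / KM_PER_MILE / ROUND_TRIP_MILES

# Assumption: 2008 is the base year and the doubling period is exactly 2 years.
zb_2024 = projected_zettabytes(9.57, 2008, 2024)
height_2024_ly = stack_height_km(zb_2024) / KM_PER_LIGHT_YEAR

print(f"2008 stack: {height_2008_km:.3e} km, {round_trips:.1f} round trips")
print(f"2024 projection: {zb_2024:.0f} ZB, {height_2024_ly:.2f} light-years")
```

Under these assumptions the 2008 stack comes to about 20.4 Neptune round trips, and the 2024 projection to a little under 5 light-years, which agrees with "about 20 times" and "more than 4.37 light-years."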

"Most of this information is incredibly transient: it is created, used, and discarded in a few seconds without ever being seen by a person," said Roger Bohn, one of the report's co-authors and a professor of technology management at UC San Diego's School of International Relations and Pacific Studies. "It's the underwater base of the iceberg that runs the world that we see."

The authors of the report, titled "How Much Information: 2010 Report on Enterprise Server Information," are Bohn; James E. Short, a research scientist at UC San Diego's School of International Relations and Pacific Studies and research director of the HMI? project; and Chaitanya K. Baru, a distinguished scientist at the San Diego Supercomputer Center.

The paper follows an earlier report on information consumption by U.S. households as part of the How Much Information? project. The effort is designed to conduct a census of the world's information in 2008 and onward, and is supported by AT&T, Cisco Systems, IBM, Intel, LSI, Oracle and Seagate Technology. Early support was provided by the Alfred P. Sloan Foundation.

"The exploding growth in stored collections of numbers, images and other data is well known, but mere data becomes more important when it is actively processed by servers as representing meaningful information delivered for an ever-increasing number of uses," said Short. "As the capacity of servers to process the digital universe's expanding base of information continues to increase, the development itself creates unprecedented challenges and opportunities for corporate information officers."

The workload of all 27 million of the world's enterprise servers in use in 2008 was estimated by using cost and performance benchmarks for online transaction processing, Web services and virtual machine processing.

"Of course, we couldn't directly measure the allocation of workload to millions of servers worldwide, but we received important guidance from experts, industry data and our own judgment," Short said. "Since our capacity assumptions, methodology and calculations are complex, we have prepared a separate technical paper as background to explain our methodology and provide sample calculations."

Servers amount to the unseen, ubiquitous, humming computational infrastructure of modern economies. The study estimated that each of the 3.18 billion workers in the world's labor force received an average of 3 terabytes of information per year.
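The per-worker average follows directly from the two totals quoted in the article. A minimal check, assuming decimal units (1 terabyte = 10^12 bytes) and using illustrative variable names:

```python
# Average information processed per worker, from the article's totals.
TOTAL_BYTES = 9.57e21        # 9.57 zettabytes processed in 2008
WORKERS = 3.18e9             # world labor force, per the article
BYTES_PER_TERABYTE = 1e12    # decimal terabyte (assumption)

per_worker_tb = TOTAL_BYTES / WORKERS / BYTES_PER_TERABYTE
print(f"{per_worker_tb:.2f} TB per worker per year")
```

This gives roughly 3 terabytes per worker per year, matching the figure above.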

Rather than focusing on raw processing power, the new analysis focused on server performance per dollar invested as a more consistent yardstick across a wide array of server types and sizes. "While midrange servers doubled their Web processing and business application workloads every 2 years, they doubled their performance per dollar every 1.5 years," Bohn said.
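Doubling times are easier to compare when converted to annual growth rates. The sketch below assumes smooth exponential growth, which the report does not necessarily claim; the function name is hypothetical.

```python
# Convert a doubling time into an equivalent annual growth factor,
# assuming smooth exponential growth (an assumption for illustration).

def annual_growth_factor(doubling_years: float) -> float:
    return 2 ** (1 / doubling_years)


workload_growth = annual_growth_factor(2.0)         # workloads: every 2 years
perf_per_dollar_growth = annual_growth_factor(1.5)  # performance per dollar: every 1.5 years

print(f"Workload: +{(workload_growth - 1) * 100:.0f}% per year")
print(f"Performance per dollar: +{(perf_per_dollar_growth - 1) * 100:.0f}% per year")
```

On these assumptions, midrange workloads grow about 41 percent a year while performance per dollar grows about 59 percent a year.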

The 36-page "How Much Information?" report said total worldwide sales of all servers have remained stable at about $50-$55 billion per year for the five years ending in 2008, while new-server performance as measured by industry benchmarks went up five- to eight-fold during the same period. Entry-level servers costing less than $25,000 processed about 65 percent of the world's information in 2008, midrange servers processed 30 percent, and high-end servers costing $500,000 or more processed 5 percent.

The report's authors note that the estimated workload of the world's servers may be an underestimate because server-industry sales figures don't fully include the millions of servers built in-house from component parts by Google, Microsoft, Yahoo! and others.

The study estimated a sharp increase in virtualization beginning in 2006, in which many distinct "virtual servers" can run on one physical server. Virtualization is a way to improve energy efficiency, scalability and overall performance of large-scale information processing. One of its uses is for cloud computing in which server-processing power is provided as a centrally administered commodity that business clients can pay for as needed.

"Corporations and organizations that have huge and growing databases are compelled to rethink how they accomplish economies of scale, which is why many are now embracing cloud computing initiatives and green datacenters," said Baru. "In addition, a corporation's competitiveness will increasingly hinge on its ability to employ innovative search techniques that help users discover data and obtain useful results, and automatically offer recommendations for subsequent searches."

Measuring worldwide flows of information is an inexact science, and the How Much Information? project will issue additional analyses as improved metrics become available and accepted. In 2007, the International Data Corporation and EMC Corp. reported that the total digital universe of information created, captured or replicated digitally was 281 exabytes and would not reach 1 zettabyte until 2010. The study by Short, Bohn and Baru included estimates of the amount of data processed as input and delivered by servers as output. For example, one email message may flow through multiple servers and be counted multiple times.

The How Much Information? paper points to the importance of data archiving and digital-data preservation. "Preserving data is an increasingly important challenge for business organizations and arbitrary age limits make little sense," said Baru. "In the future, data archiving and preservation will require as much enthusiasm in research and industry settings as we have provided to data generation and data processing."

Provided by University of California - San Diego


Skeptic_Heretic
Apr 06, 2011

Rank: 3 / 5 (2)
And now that number is wrong.

That's the problem with this calculation, the rate of data influx is ever increasing, however, their method is fairly excellent, and as far as estimations go, I'd agree with the manner in which the work was done.
Quantum_Conundrum
Apr 06, 2011

Rank: 1 / 5 (1)
And now that number is wrong.

That's the problem with this calculation, the rate of data influx is ever increasing, however, their method is fairly excellent, and as far as estimations go, I'd agree with the manner in which the work was done.


Most of the "data" is redundant anyway. How many times per day does this web site get viewed?

All the submit, cancel buttons and other stuff is re-loaded every time the page reloads, the pictures and logos, etc, take up as much or more as anything else, and all represent data that is being processed, even if it is templates, switches and stuff like that.

Most of it is completely redundant, exact same thing over and over.

Heck, the majority may even be the actual markup for HTML and Javascript.
J-n
Apr 06, 2011

Rank: not rated yet
Redundant, yes. I suspect SYN/ACK packets account for more than the reloads on pages, though.
GSwift7
Apr 06, 2011

Rank: 3 / 5 (2)
the digital equivalent of a 5.6-billion-mile-high stack of books from Earth to Neptune and back to Earth


Wow, that's a really fantastic number. Impossible to conceptualize on our human scale.

..., repeated about 20 times a year.


I nearly fell out of my chair when I got to that part.
Skeptic_Heretic
Apr 06, 2011

Rank: 5 / 5 (1)
Most of the "data" is redundant anyway. How many times per day does this web site get viewed?

But that's temporary data. The important bit is how many times and when did you or I visit the article, then the dynamics and viewership stats. A single web page visit creates an astronomical amount of data that can be saved or discarded.

As for redundant data, redundant data still has a timestamp.
Quantum_Conundrum
Apr 06, 2011

Rank: 2.3 / 5 (3)
Actually, the numbers are not correct.

Though the metric scale goes up in words corresponding to groups of 1000, i.e. kilo, mega, giga, tera, in computer science, it goes up as multiples of 2^10, or 1024...

So kilobyte is not 1000, it is 1024
Megabyte is not 1,000,000, it is 1024^2, or 1048576, etc

1 Zettabyte is therefore: 1024^7 = 1,180,591,620,717,411,303,424

Doesn't look like a big deal, but if you divide that back into the number above, you find that it's actually only 8.106 Zettabytes...

by the time you get to the 7th power, those missing "24s" make that much difference...

So your stack of books, using the real definition of "N-bytes", including megabytes, but assuming their 4.8cm thickness, is actually:

148,425,292,969 kilometers

Which is 32.59 times the distance to Neptune, or 16 round trips, not 20.

Also, Neptune is nowhere near 5.6 billion miles, it is about half that, at 4.55 billion kilometers...

In general, bad writing here...
Skeptic_Heretic
Apr 06, 2011

Rank: 5 / 5 (1)
by the time you get to the 7th power, those missing "24s" make that much difference...

True, but they didn't make that mistake in the research, only in the journalism.

edit: damn broke my own rule of "read the whole post". You covered that, my bad.
Quantum_Conundrum
Apr 06, 2011

Rank: 1 / 5 (1)
If we doubled the data every 2 years, as this article claims, and for the sake of argument round up, then that would be multiply this amount of data by 2^7, or 128.

This would give 1037.5 Zettabytes (using 1024^n computer science formula.)

The distance this would be is:

1.8998E13 km

Which is only 2 light years...
Quantum_Conundrum
Apr 06, 2011

Rank: 1 / 5 (1)
Anyway, as computers continue to advance, we may need several more prefixes added to the standard terminology...

1037 Zettabytes by 2024...

We could do double prefixes, like this:

1 Kilo-Zettabyte (2024)
1 Mega-Zettabyte (2038?)
1 Giga-Zettabyte (2051?)
Bobamus_Prime
Apr 06, 2011

Rank: not rated yet
QC:
Actually, the numbers are not correct.


148,425,292,969 kilometers

Which is 32.59 times the distance to Neptune, or 16 round trips, not 20.

Also, Neptune is nowhere near 5.6 billion miles, it is about half that, at 4.55 billion kilometers...

In general, bad writing here...


If you read the headline
5.6-billion-mile-high stack of books from Earth to Neptune and back to Earth
It is correct. Just using an estimation of Neptune being located 30 AU away would give you 2.79 billion miles for a 1way trip, the headline said it was round trip so ya... 5.6 billion miles is spot on.
plasticpower
Apr 06, 2011

Rank: not rated yet
I don't like when bytes are compared to paper books. A single book can fit into 0.25Mb but will contain massive amounts of useful information compared to a 150Mb Youtube video.
Quantum_Conundrum
Apr 06, 2011

Rank: 1 / 5 (1)
Plasticpower:

Point taken.

Books are probably a more efficient medium for data than web pages also.

to get an idea how much of the data transmitted by servers is markup, just use the "view source" option in your web browser on this page. Before this post, I got 1278 lines, and the letter "A" appears 3870 times. A text copy of the markup for this page is 84kb, but this doesn't count the external javascripts, style sheets, and images which are linked to, which are generated dynamically by the virtual engine and template engines on the server, and also must be transmitted and loaded...
Cynical1
Apr 07, 2011

Rank: 3 / 5 (2)
Like my Grandpa used to say - you don't know how much useless crap is in the universe, 'til someone gets a grant to study it...
Actually, he didn't say that...
Jayman
Apr 07, 2011

Rank: not rated yet
Is that all? I account for 7-8 zeros myself !!
Quantum_Conundrum
Apr 07, 2011

Rank: 1 / 5 (1)
Is that all? I account for 7-8 zeros myself !!


Actually, based on world population and the fact many people don't use the internet at all, that amount distributed across the average person who actually uses the internet comes to around 500gigabytes per person.

So you actually count for around 11 to 11.5 of those zeros...we all do...
CWonPhysOrg
Apr 10, 2011

Rank: not rated yet
Sorry, but I can't forgo a chuckle about the term "Rigorously Estimated" ... as in Joe Sixpack "rigorously estimated" his IRS liability.

... just struck me funny.