To clear digital waste in computers, 'think green,' researchers say

September 1, 2011

Johns Hopkins computer scientists propose a real-world approach to the problem of managing digital detritus. Credit: Royce Faddis/JHU

A digital dumping ground lies inside most computers, a wasteland where old, rarely used and unneeded files pile up. Such data can deplete precious storage space, bog down the system's efficiency and sap its energy. Conventional rubbish trucks can't clear this invisible byte blight. But two researchers say real-world trash management tactics point the way to a new era of computer cleansing.

In a recent paper published on the scholarly website arXiv, Johns Hopkins University computer scientists Ragib Hasan and Randal Burns have suggested familiar "green" solutions to the digital waste data problem: reduce, reuse, recycle, recover and dispose.

"In everyday life, 'waste' is something we don't need or don't want or can't use anymore, so we look for ways to re-use it, recycle it or get rid of it," said Hasan, an adjunct assistant professor of computer science. "We decided to apply the same concepts to the waste data that builds up inside of our computers and storage devices."

With this goal in mind, Hasan and Burns, an associate professor of computer science, first needed to figure out what kind of data might qualify as "waste." They settled on these four categories:

  • Unintentional waste data, created as a side effect or by-product of a process, with no purpose.
  • Used data, which has served its purposes and is no longer useful to the owner.
  • Degraded data, which has deteriorated to a point where it is no longer useful.
  • Unwanted data, which was never useful to the computer user in the first place.

The researchers found no shortage of files and computer code that fit into these categories. "Our everyday data processing activities create massive amounts of data," their paper states. "Like physical waste and trash, unwanted and unused data also pollutes the digital environment. … We propose using the lessons from real life waste management in handling waste data."

The researchers say a user may not even be aware that much of this waste is piling up and impairing the computer's efficiency. "If you have a lot of debris in the street, traffic slows down," said Hasan. "And if you have too much waste data in your computer, your applications may slow down because they don't have the space they require."

Even though data storage has become less expensive, Hasan said, hard drives can still run out of room. In addition, flash-based systems, such as memory cards, possess a limited number of write-erase cycles, and frequent deleting of waste data can shorten their lifespan.

How then, can the clutter inside computers be curbed? To address the problem, Hasan and Burns devised a five-tier pyramid of options, inspired by real-world waste reduction tactics:

An illustration of the researchers' proposed approach to cleaning out computer data trash. Credit: Ragib Hasan and Randal Burns/JHU

Reduce: At the top of the pyramid, the most preferred option is to cut back on the amount of waste data that flows into a computer to begin with. This can be done, the Johns Hopkins researchers say, by encouraging software makers to design their programs to leave fewer unneeded files behind after a program is installed. To coax the software makers to comply, computers could be set up to "punish" programs that do excessive data dumping; such programs would be forced to run more slowly.

Reuse: Software makers also could break their complex strings of code into smaller modules that could serve double-duty. If two programs are found to utilize identical modules, one might be eliminated in a process called "data deduplication." This is the second-best option in the waste-management pyramid, the researchers said.
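As a rough illustration of the deduplication idea, here is a minimal Python sketch. It is not taken from the researchers' paper. It assumes modules can be compared as raw bytes and uses a SHA-256 hash of each one to spot identical copies; the function name `find_duplicate_modules` and the sample module names are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def find_duplicate_modules(modules: Dict[str, bytes]) -> List[List[str]]:
    """Group module names whose contents are byte-for-byte identical."""
    names_by_digest: Dict[str, List[str]] = defaultdict(list)
    for name, content in modules.items():
        # Identical content always produces the same digest.
        digest = hashlib.sha256(content).hexdigest()
        names_by_digest[digest].append(name)
    # Only groups with more than one member are candidates for removal.
    return [sorted(names) for names in names_by_digest.values() if len(names) > 1]


# Hypothetical sample data: two programs ship the same parser module.
modules = {
    "app_a/json_parser": b"def parse(text): ...",
    "app_b/json_parser": b"def parse(text): ...",
    "app_a/ui": b"def draw(): ...",
}
duplicates = find_duplicate_modules(modules)
```

In this sketch, one copy in each duplicate group could be dropped and replaced with a reference to the surviving copy. A real system would also have to confirm that the matching files are truly interchangeable before removing anything.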

Recycle: Just as discarded plastic can be refashioned into new soda bottles, some files could be repurposed. For example, when old software is about to be removed, the computer could look for useful pieces of the program that could be put to work in other applications.

Recover: Even when waste data can't be reused or recycled, these digital leftovers might yield information worth studying after private identification details are removed. In their paper, the researchers suggest that "obsolete data can also be mined to gather patterns about historical trends."

Dispose: Sitting at the bottom of the pyramid, this is the least desirable option, the researchers say, and the messiest, when you consider the energy used to completely eliminate old files or the real-world pollution created when one destroys an old hard drive or other form of storage media. One solution, however, the scientists say, could be a "digital landfill." This could be accomplished with a "semi-volatile storage device" that would provide a temporary home to data that is designed to automatically fade away over time, freeing up space for the next tenants.
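The "digital landfill" is described only in outline, but the fade-away behavior can be imitated in software. The Python sketch below is an illustration under stated assumptions, not the researchers' design: it gives every dumped item a fixed lifetime and drops it once that time has passed. The class name `DigitalLandfill`, its methods, and the 60-second lifetime are all hypothetical, and the clock is passed in by hand so the example is easy to follow.

```python
from typing import Dict, Optional, Tuple


class DigitalLandfill:
    """Toy store in which every entry fades away after a fixed lifetime."""

    def __init__(self, lifetime_seconds: float) -> None:
        self.lifetime_seconds = lifetime_seconds
        # Maps a key to (data, time at which the data expires).
        self._entries: Dict[str, Tuple[bytes, float]] = {}

    def dump(self, key: str, data: bytes, now: float) -> None:
        """Store waste data along with its expiry time."""
        self._entries[key] = (data, now + self.lifetime_seconds)

    def sweep(self, now: float) -> int:
        """Remove expired entries and return how many were removed."""
        expired = [k for k, (_, expires_at) in self._entries.items() if expires_at <= now]
        for k in expired:
            del self._entries[k]
        return len(expired)

    def fetch(self, key: str, now: float) -> Optional[bytes]:
        """Return the data if it has not faded yet, otherwise None."""
        self.sweep(now)
        entry = self._entries.get(key)
        return entry[0] if entry is not None else None


landfill = DigitalLandfill(lifetime_seconds=60.0)
landfill.dump("old_install_log", b"setup finished", now=0.0)
still_there = landfill.fetch("old_install_log", now=30.0)  # within its lifetime
faded = landfill.fetch("old_install_log", now=61.0)        # lifetime has passed
```

The data remains retrievable for a while after it is dumped, then disappears without an explicit delete request, which frees the slot for later arrivals.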

Although the research paper has shined a spotlight on the digital waste issue, Hasan acknowledges that most computer users haven't given much thought to the clutter piling up in their laptops, particularly when extra storage media and devices are relatively cheap. But he pointed out that more users are moving toward cloud computing, in which files are sent over the Internet to a site where an enormous number of files can be stored. As this continues, such central storage sites could find themselves drowning in waste data. "Someday, this could become a problem as we begin using up these storage resources," Hasan said. "Maybe we should start talking about it now."

More information: The research paper by Hasan and Burns, "The Life and Death of Unwanted Bits: Toward Proactive Waste Data Management in Digital Ecosystems," can be read online here: http://arxiv.org/P … 6.6062v2.pdf

Provided by Johns Hopkins University


Eikka
Sep 01, 2011

Rank: 5 / 5 (1)
I just take whatever files I need, copy them over to a backup drive, and then wipe the original.

Problem solved.

The argument that deleting files wears out flash-based drives is bunk, because what actually happens when you delete a file is that the file table entry for its physical address is removed and it's treated like empty drive space.

Nothing actually happens to it until you make a new file there. Then it is cleared to a known state (all zeroes or all ones) and new data is written on it.
Eikka
Sep 01, 2011

Rank: 5 / 5 (1)
And like they say about RAM, unused hard drive space is wasted hard drive space.

The real problem is in how we organize the physical data, because hard drives suffer from fragmentation where files physically split into tiny pieces when there's no free spot large enough for them to fit into. This slows down access speed because the mechanical reading arm has to hunt and peck for the data.

Solid state drives reduce this seeking to practically nil, but the limitation is still the old file system that assumes that the device is mechanical, so it virtually splits the file apart as if it was physically in pieces and sends many read requests for all the little pieces it thinks should exist. The drive then takes time responding to all the requests where one would have been enough to retrieve the data.

The hard drive and the file system should be re-thought to act more like a database server where the computer is agnostic about the physical aspects, and simply asks for a piece of data.
Eikka
Sep 01, 2011

Rank: 5 / 5 (1)
Of course, there's the workaround hacks. There's the TRIM command that some SSDs support, that is basically the operating system telling the drive about what bits of data are no longer in use so the drive can re-organize internally when needed.

That still leaves the file system, because the operating system has a fixed address list of where the data should be physically, even when it's not really. The operating system has to mind that the files are continuous and have spaces between them to allow them to change in size if they are modified etc. etc. and then tell the hard drive to move the data to different addresses to reflect this imaginary surface of a magnetic disc.

That bit is completely unnecessary when the data address on an SSD no longer matches its physical location in the chips anyhow. All you should need is an index number for a file, and the drive can figure out where to put it.
CHollman82
Oct 07, 2011

Rank: not rated yet
Agreed on all accounts, Eikka.
CHollman82
Oct 07, 2011

Rank: not rated yet
I've always thought we needed a better type of file system than the ones that exist today... one problem I have with the traditional PC file systems is that they limit organization by only allowing a file to exist in one folder at a time... sure, you can make shortcuts to it but that is tedious and ugly, I hate shortcuts.

For example, I use a firefox extension that lets me download an image on a website just by dragging it a little bit, I just drag it a few pixels and drop it and it is downloaded to the folder I set. Of course after a while browsing wallpapers and peopleofwalmart.com and pornography and whatever else this folder becomes quite full with a mish mash of unsorted stuff. So I was organizing it the other day, particularly my collection of 1920x1080 wallpapers, and found that quite frequently I wanted to put a single image into multiple categorized folders, but I couldn't do so without physically duplicating the file or making a shortcut...
CHollman82
Oct 07, 2011

Rank: not rated yet
This is a limitation of the file/folder organizational system that is in use on modern PCs.

I would prefer the file system to work more like a relational database, where objects (files) could be linked to any arbitrary number of other files or defined categories or "relations". That way I could say that this wallpaper picture belongs to both the "nature" category and the "dark" category, without having to either pick one or the other folders, or duplicate the file, or use a shortcut.

In thinking of this I actually started working on an app that would serve to replace Windows Explorer that lets you add custom tags to ANY file on your computer and then search for files based on a boolean tag search, so to find that example wallpaper I would search for something like "image wallpaper nature dark"... or any substring of the same... with the more tags further refining the results of the search.
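The tag search described in this comment can be sketched in a few lines of Python. This is not the commenter's app. It assumes "boolean" means that every word in the query must match one of a file's tags, so each added tag narrows the results. The class name `TagIndex` and the file names are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class TagIndex:
    """Toy index that lets one file carry any number of tags."""

    def __init__(self) -> None:
        self._tags_by_file: DefaultDict[str, Set[str]] = defaultdict(set)

    def tag(self, path: str, *tags: str) -> None:
        """Attach one or more tags to a file; tags are case-insensitive."""
        self._tags_by_file[path].update(t.lower() for t in tags)

    def search(self, query: str) -> List[str]:
        """Return files that carry every tag named in the query."""
        wanted = set(query.lower().split())
        return sorted(
            path for path, tags in self._tags_by_file.items() if wanted <= tags
        )


index = TagIndex()
index.tag("forest_night.jpg", "image", "wallpaper", "nature", "dark")
index.tag("beach.jpg", "image", "wallpaper", "nature")

broad = index.search("wallpaper nature")               # matches both files
narrow = index.search("image wallpaper nature dark")   # matches only one
```

Because tags live in the index and not in the folder tree, the same wallpaper can belong to both "nature" and "dark" without being copied or shortcut-linked.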