The end of sneakernet?

July 13, 2017 by Cory Nealon, University at Buffalo

Not everyone marvels at the speed of the internet.

For researchers and companies sharing extremely large datasets, such as genome maps or satellite imagery, it can be quicker to send documents by truck or airplane. The slowdown leads to everything from lost productivity to the inability to quickly warn people of natural disasters.

The University at Buffalo has received a $584,469 National Science Foundation grant to address this problem. Researchers will create a tool, dubbed OneDataShare, designed to work with the existing computing infrastructure to boost data transfer speeds by more than 10 times.

"Most users fail to obtain even a fraction of the theoretical speeds promised by existing networks. The bandwidth is there. We just need new tools the take advantage of it," says Tevfik Kosar, PhD, associate professor in UB's Department of Computer Science and Engineering, and the grant's principal investigator.

Large businesses and others can generate 1 petabyte (or much more) of data daily. Each petabyte is one million gigabytes, or roughly the equivalent of 20 million four-drawer filing cabinets filled with papers. Transferring this data online can take days, if not weeks, using standard high-speed networks.
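
To put that scale in perspective, here is a quick back-of-the-envelope calculation (a Python sketch; the link speeds are illustrative assumptions, not figures from the article) of how long a single petabyte takes to move at various network rates:

```python
# Back-of-the-envelope transfer times for 1 petabyte.
# The link speeds are illustrative assumptions, and real-world throughput
# is usually well below the raw line rate.

PETABYTE_BITS = 1e15 * 8  # 1 PB = 10^15 bytes = 8 * 10^15 bits

for label, gbps in {"1 Gbps": 1, "10 Gbps": 10, "100 Gbps": 100}.items():
    seconds = PETABYTE_BITS / (gbps * 1e9)
    print(f"{label}: {seconds / 86400:.1f} days")

# Prints roughly:
# 1 Gbps: 92.6 days
# 10 Gbps: 9.3 days
# 100 Gbps: 0.9 days
```

Even at ideal line rates, the transfer sits in the range of days or more, and the bottlenecks described below typically push real-world times far higher.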

This bottleneck is caused by several factors. Among them: substandard protocols, or rules, that govern how data is formatted and sent over the internet; problems with the routes that data takes from its point of origin to its destination; how information is stored; and the limitations of computers' processing power.

Rather than waiting to share data online, individuals and companies may opt to store the data on disks and simply deliver the information to its destination. This is sometimes called sneakernet—the idea that physically moving information is more efficient.
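
The arithmetic behind that intuition is simple. In the sketch below, the overnight delivery time is an illustrative assumption rather than a figure from the article, but it shows why a box of disks can outrun the wire:

```python
# Effective bandwidth of "sneakernet": shipping 1 PB of disks by
# overnight courier. The delivery time is an illustrative assumption.

data_bits = 1e15 * 8          # 1 PB in bits
courier_seconds = 24 * 3600   # overnight delivery

effective_gbps = data_bits / courier_seconds / 1e9
print(f"Effective shipping bandwidth: {effective_gbps:.0f} Gbps")

# Prints roughly 93 Gbps -- far more than most real transfers sustain
# over a standard high-speed network.
```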

Managed file transfer service providers such as Globus and B2SHARE help alleviate data sharing problems, but Kosar says they still suffer from slow transfer speeds, inflexibility, restricted protocol support and other shortcomings.

Government agencies, such as the NSF and the U.S. Department of Energy, want to address these limitations by developing high-performance and cost-efficient data access and sharing technology. The NSF, for example, said in a report that the cyberinfrastructure must "provide for reliable participation, access, analysis, interoperability, and data movement."

OneDataShare attempts to do that through a unique software and research platform. Its main goals are to:

  • Reduce the time needed to deliver data. It will accomplish this through application-level tuning and optimization of Transmission Control Protocol (TCP)-based data transfer protocols such as HTTP, SCP and others (see the sketch after this list).
  • Allow people to easily work with different datasets that traditionally haven't been compatible. In short, everyone's data is different, and it's often organized differently using different programs.
  • Decrease the uncertainty in real-time decision-making processes and improve delivery time predictions.
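
To make the first of those goals concrete, here is a minimal sketch of one widely used application-level tuning technique: splitting a single HTTP transfer into parallel byte-range requests so that several TCP streams share the work. The Python code, file URL and parallelism setting are illustrative assumptions; the article does not describe how OneDataShare itself implements its optimizations.

```python
# A minimal sketch of application-level tuning for an HTTP transfer:
# download a file in parallel byte-range chunks over several TCP streams.
# Illustrative only; not OneDataShare's actual implementation.
import concurrent.futures
import requests

URL = "https://example.org/large-dataset.bin"   # hypothetical file
NUM_STREAMS = 8                                  # tuning knob: parallelism

def fetch_range(start, end):
    """Fetch bytes [start, end] of the file over its own connection."""
    headers = {"Range": f"bytes={start}-{end}"}
    resp = requests.get(URL, headers=headers, timeout=60)
    resp.raise_for_status()
    return start, resp.content

def parallel_download():
    size = int(requests.head(URL, timeout=60).headers["Content-Length"])
    chunk = size // NUM_STREAMS + 1
    ranges = [(i, min(i + chunk - 1, size - 1)) for i in range(0, size, chunk)]

    buf = bytearray(size)
    with concurrent.futures.ThreadPoolExecutor(max_workers=NUM_STREAMS) as pool:
        for start, data in pool.map(lambda r: fetch_range(*r), ranges):
            buf[start:start + len(data)] = data   # reassemble in order
    return bytes(buf)

if __name__ == "__main__":
    payload = parallel_download()
    print(f"Downloaded {len(payload)} bytes over {NUM_STREAMS} streams")
```

The number of concurrent streams is exactly the kind of parameter such tools try to tune: too few streams leave bandwidth idle, while too many compete with one another and with other traffic.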

OneDataShare's combination of tools—improving data sharing speeds, the interaction between different data programs, and prediction services—will lead to numerous benefits, Kosar says.

"Anything that requires high-volume data transfer, from real-time weather conditions and to sharing genomic maps and real-time consumer behavior analysis, will benefit from OneDataShare," Kosar says.


