New technology enables high-speed data transfer

Jun 18, 2009

GridFTP, a protocol developed by researchers at Argonne National Laboratory, has been used to transfer unprecedented amounts of data over the Department of Energy's (DOE) Energy Sciences Network (ESnet), which provides a reliable, high-performance communications infrastructure to facilitate large-scale, collaborative science endeavors.

The Argonne-developed system proved key to enabling research groups at Oak Ridge National Laboratory in Tennessee and the National Energy Research Scientific Computing Center in California to move large data sets between the facilities at a rate of 200 megabytes per second.

The deployment of GridFTP at the two computing facilities is part of a major project to optimize wide-area network data transfers between sites hosting DOE leadership-class computers.

According to Ian Foster, co-director of the Globus Alliance project responsible for designing GridFTP, large-scale data transfer places an enormous burden on networks. "Conventional protocols have proven unable to handle the increasing demand of large-scale data transfer," he said. "The result has been delays in obtaining data, or even lost data as the network becomes overwhelmed. GridFTP changes that."

As large-scale collaborative science projects become increasingly common, the need to transfer unprecedented amounts of data is becoming critical. Having GridFTP on ESnet will enable the sharing of data between supercomputer centers in disciplines such as climate modeling and nuclear physics that require secure, robust, high-speed bulk data transfer.

"Our goal is to enable the scientists to rapidly move large-scale data sets between supercomputer centers as dictated by the needs of the science," said Eli Dart, a network engineer for ESnet, which is managed by Lawrence Berkeley National Laboratory. "High-performance networking has become critical to science due to the size of the data sets and the wide scope of collaboration characteristic of today's large science projects such as climate research and high energy physics."

GridFTP offers several advantages over other data transfer systems. For example, with Secure Copy (scp), bulk transfer of a 33-gigabyte dataset between the two remote hosts could take up to eight hours. With GridFTP, almost 20 times that amount of data can be transferred in the same amount of time. And unlike standard FTP, GridFTP uses multiple parallel data channels to improve transfer speed.
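To illustrate the parallel-data-channel idea, here is a minimal Python sketch that splits a file into byte ranges and moves each range over its own TCP connection to a receiver that reassembles them. This is a loopback demonstration of the concept only, not GridFTP itself (the real implementation adds striping, security, and restart markers, among much else); all names and parameters below are invented for the example.

```python
# Conceptual sketch only -- not GridFTP. It demonstrates the core idea of
# parallel data channels: split a file into byte ranges, send each range
# over its own TCP connection, and reassemble on the receiving side.
import os
import socket
import threading

CHANNELS = 4  # number of parallel data channels (arbitrary for the demo)

def recv_exact(conn, n):
    """Read exactly n bytes from a socket."""
    buf = b""
    while len(buf) < n:
        data = conn.recv(n - len(buf))
        if not data:
            raise ConnectionError("peer closed early")
        buf += data
    return buf

def send_range(path, offset, length, addr):
    """Send one byte range of the file over its own connection."""
    with socket.create_connection(addr) as sock, open(path, "rb") as f:
        f.seek(offset)
        # An 8-byte offset + 8-byte length header lets the receiver
        # place the range correctly regardless of arrival order.
        sock.sendall(offset.to_bytes(8, "big") + length.to_bytes(8, "big"))
        sock.sendall(f.read(length))

def receive(server, channels, out_path, total_size):
    """Accept one connection per channel and write each range into place."""
    with open(out_path, "wb") as out:
        out.truncate(total_size)  # pre-size the file so ranges can land anywhere
        for _ in range(channels):
            conn, _ = server.accept()
            with conn:
                offset = int.from_bytes(recv_exact(conn, 8), "big")
                length = int.from_bytes(recv_exact(conn, 8), "big")
                out.seek(offset)
                out.write(recv_exact(conn, length))

if __name__ == "__main__":
    src, dst = "source.bin", "copy.bin"
    with open(src, "wb") as f:           # create a 1 MB test file
        f.write(os.urandom(1_000_000))
    size = os.path.getsize(src)
    chunk = -(-size // CHANNELS)         # ceiling division

    server = socket.create_server(("127.0.0.1", 0))  # OS picks a free port
    addr = server.getsockname()
    rx = threading.Thread(target=receive, args=(server, CHANNELS, dst, size))
    rx.start()

    senders = [
        threading.Thread(
            target=send_range,
            args=(src, i * chunk, min(chunk, size - i * chunk), addr))
        for i in range(CHANNELS)
    ]
    for t in senders:
        t.start()
    for t in senders:
        t.join()
    rx.join()
    assert open(src, "rb").read() == open(dst, "rb").read()
    print("transferred", size, "bytes over", CHANNELS, "channels")
```

On a single machine the parallelism buys nothing, but over a high-latency wide-area path, multiple TCP streams can collectively fill bandwidth that a single stream's congestion window cannot.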

"The data tsunami problem has been a major bottleneck to scientific advancement," said Raj Kettimuthu, technical lead and technology coordinator of the GridFTP project at Argonne. "With GridFTP computational scientists can analyze their simulated and derived data in real time."

More information on GridFTP is available at www.globus.org/grid_software/data/gridftp.php.

Source: Argonne National Laboratory


User comments: 8

kasen
not rated yet Jun 19, 2009
So, kinda like bittorrent with beefed up security?
MatthiasF
not rated yet Jun 19, 2009
Like bittorrent? Not really. It's not downloading from multiple sources. It's splitting up the data and sending it over multiple routes. One source to one destination. Not really helpful for the rest of us.
Corvidae
not rated yet Jun 19, 2009
[Like bittorrent? Not really. It's not downloading from multiple sources. It's splitting up the data and sending it over multiple routes. One source to one destination. Not really helpful for the rest of us.]

There are times when it would be useful for the average end user to be able to multi-stream from a single site. A lot of sites have a per-connection speed cap to keep one person from hogging the bandwidth.

The real question is when someone will build an online data cache service. Cache parts of files on servers all over the country so when a subscriber goes to download one, they get the parts multi-streamed from all over. Like an on-demand bit-torrent for popular data using dedicated servers. It's just a question of the business model working with bit-torrent as a free competitor.
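[Editorial aside: the "multi-stream from a single site" idea Corvidae describes is essentially what download accelerators do with HTTP range requests. A minimal Python sketch, assuming the server supports byte-range requests and reports Content-Length on a HEAD request; the URL is a placeholder.]

```python
# Hedged sketch: fetch one file over several HTTP connections using Range
# requests -- the "multi-stream from a single site" idea from the comment.
import concurrent.futures
import urllib.request

URL = "https://example.com/big-file.bin"  # placeholder URL (assumption)
STREAMS = 4

def fetch_range(url, start, end):
    """Fetch bytes [start, end] of the file on its own connection."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return start, resp.read()

def parallel_download(url, out_path, streams=STREAMS):
    # Learn the file size first (assumes the server answers HEAD requests).
    head = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(head) as r:
        size = int(r.headers["Content-Length"])
    chunk = -(-size // streams)  # ceiling division
    ranges = [(i * chunk, min(size, (i + 1) * chunk) - 1)
              for i in range(streams)]
    with open(out_path, "wb") as out, \
         concurrent.futures.ThreadPoolExecutor(streams) as pool:
        futures = [pool.submit(fetch_range, url, s, e) for s, e in ranges]
        for fut in concurrent.futures.as_completed(futures):
            start, data = fut.result()
            out.seek(start)   # write each range into its place in the file
            out.write(data)

if __name__ == "__main__":
    parallel_download(URL, "big-file.bin")
```

This only helps against per-connection caps, as Corvidae notes; a per-IP cap (see MatthiasF's reply below) would throttle all the streams together.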
nick7201969
not rated yet Jun 19, 2009
[The real question is when someone will build an online data cache service.]

There was an article about five months ago saying that Microsoft was funding a company in or near Fresno, CA that was doing this very thing you mentioned.

I don't have the link anymore, but it should be easy to find since the Central Valley has very few technology companies.
foob
not rated yet Jun 19, 2009
Hmmmm... You guys mean Akamai, CacheFly, SimpleCDN, CDNetworks, et cetera? Plenty of commercial distributed caches out there already. Most of the high-load sites (like YouTube) distribute content through CDNs already.
MatthiasF
not rated yet Jun 20, 2009
[There are times when it would be useful for the average end user to be able to multi-stream from a single site. A lot of sites have a per-connection speed cap to keep one person from hogging the bandwidth.]

Most of the hosts you speak of will limit by IP, not by each connection. So this technology still wouldn't really help.
MatthiasF
not rated yet Jun 20, 2009
I don't think they're talking about content delivery networks, Foob. They mean transmission caching from users to a distant server, so the user uploads to the closest member in the system and it passes the data over to the desired destination using a faster connection.

This is pretty much a CDN in reverse, and I bet you could create an ad-hoc system using a current CDN to do just this, where instead the user uploads their file onto the CDN and then the server is told to download the file from the CDN.

This would still not avoid bandwidth caps on the destination server.
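[Editorial aside: the "CDN in reverse" flow MatthiasF outlines could look roughly like the Python sketch below. Every endpoint, URL, and response format here is hypothetical, purely to make the two-step flow concrete.]

```python
# Hypothetical sketch of the "CDN in reverse" flow: upload to a nearby
# cache node over the fast local hop, then ask the distant destination
# to pull the file from the cache. Both endpoints are invented.
import urllib.request

CACHE_NODE = "https://cache.example.net/upload"      # hypothetical cache node
DESTINATION = "https://origin.example.org/api/pull"  # hypothetical destination

def transfer_via_cache(path):
    # Step 1: push the file to the nearby cache node; assume the cache
    # replies with a URL from which the file can later be retrieved.
    with open(path, "rb") as f:
        payload = f.read()
    req = urllib.request.Request(CACHE_NODE, data=payload, method="PUT")
    with urllib.request.urlopen(req) as resp:
        cache_url = resp.read().decode()

    # Step 2: tell the destination server to pull the file from the
    # cache over the cache network's faster backbone links.
    notify = urllib.request.Request(DESTINATION,
                                    data=cache_url.encode(), method="POST")
    urllib.request.urlopen(notify).close()

if __name__ == "__main__":
    transfer_via_cache("results.dat")  # placeholder filename
```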
nick7201969
not rated yet Jun 20, 2009
Thanks foob, MatthiasF is correct.

The company I spoke of from Fresno is doing something different from the ones you mentioned. Microsoft thought it was unique enough that they put in some millions to assist with the equipment.

Wish I could find that link... argh.
