No downtime for communication: New framework allows for asynchronous communication in exascale machines

February 28, 2013
Schematic illustration of the push-data model using a put_notify call in Global Arrays.

The productivity of a group of colleagues on a project is always higher when required information is sent as soon as it becomes available, rather than requested only when it is needed. In the same way, computer algorithms that send data from one process to another as soon as the data becomes available will be more efficient than algorithms that request the data only when it is about to be used. To facilitate designing such algorithms within the Global Arrays programming model framework, DOE researchers at Pacific Northwest National Laboratory designed a new put_notify capability that allows a process to initiate and complete a data transfer to another process without synchronization. The novel feature is a notify element that the receiving process can use to asynchronously determine the completion of the data transfer.

To take advantage of the enormous resources that next-generation supercomputers are expected to have, scientific codes must adapt. For example, molecular dynamics (MD) simulation has evolved into a highly useful method for understanding and designing molecular systems. Sophisticated MD analyses can help scientists better understand biomolecular processes. The advantage of using this programming model is that the data transfer can take place while a receiving process is still working on other tasks, a mechanism often referred to as hiding communication behind computation.

To support the needed asynchronous communication and coordination for MD algorithms, the team designed and implemented the non-blocking put_notify capability in Global Arrays, a library-based Partitioned Global Address Space programming model. To do this, they created a two-stage process, a put message followed by a notification element, for data communication using a push-data instead of a pull-data model. The researchers were able to show there was discernible time spent between a process sending data and another process receiving it. This design reduces the communication bottleneck and the associated load imbalance.
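The two-stage push model can be pictured with a small sketch. This is only an analogy built on Python threads, not the Global Arrays API: the class `NotifySlot` and the methods `put_notify`, `test_notify`, and `wait_notify` are hypothetical names chosen for illustration. The sender deposits its data and then sets a notification flag, while the receiver keeps doing its own work and checks the flag between work units.

```python
import threading


class NotifySlot:
    """Hypothetical stand-in for a remotely writable buffer plus a notify element."""

    def __init__(self):
        self.data = None
        self._notify = threading.Event()

    def put_notify(self, payload):
        # Stage 1: the put deposits the payload in the receiver's buffer.
        self.data = list(payload)
        # Stage 2: the notify element is set only after the data is in place.
        self._notify.set()

    def test_notify(self):
        # Non-blocking check the receiver can make between units of its own work.
        return self._notify.is_set()

    def wait_notify(self):
        # Blocking fallback for when the receiver has nothing left to overlap.
        self._notify.wait()


def sender(slot, values):
    result = [v * v for v in values]  # produce the data
    slot.put_notify(result)           # push it as soon as it is available


def receiver(slot, local_tasks):
    local_sum = 0
    received = None
    for task in local_tasks:
        local_sum += task  # unrelated local computation
        if received is None and slot.test_notify():
            received = slot.data  # pick up the data without ever requesting it
    if received is None:
        slot.wait_notify()  # local work finished first, so wait for the push
        received = slot.data
    return received, local_sum


def run_demo():
    slot = NotifySlot()
    worker = threading.Thread(target=sender, args=(slot, [1, 2, 3]))
    worker.start()
    received, local_sum = receiver(slot, range(1000))
    worker.join()
    return received, local_sum
```

Calling `run_demo()` returns the pushed data together with the result of the receiver's local work. The receiver never sends a request; it learns that the transfer has completed only by checking the notify element, which is the push-data idea described above.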

Using novel data-centric capabilities provides unique opportunities to address primary challenges for parallel scalability of MD time-stepping algorithms. In future work, the algorithm will be expanded to include dynamic load-balancing through topology-aware assignment and periodic redistribution of tasks.

More information: Straatsma, T. and Chavarria-Miranda, D. 2013. On eliminating synchronous communication in molecular simulations to improve scalability. Computer Physics Communications, January 23. DOI: 10.1016/j.cpc.2013.01.009
