Radio astronomy backed by big data projects
As the leading edge of the Square Kilometre Array (SKA) project, the Murchison Widefield Array (MWA) radio telescope is at the forefront of the big data challenges facing radio astronomy, presenting and solving issues that will help research and industry across the world for years to come.
The MWA is a Curtin University-led project consisting of antennas in Murchison Shire (800 km north of Perth), optical fibre data connectivity to Perth, and a large-scale data archive based at the Pawsey Supercomputing Centre.
The MWA is designed to look back in time, to study the formation of the first stars and galaxies in the universe, less than one billion years after the Big Bang (the universe is 13.8 billion years old).
It is a smaller-scale prototype that tests the kinds of challenges the SKA proper will face and the technologies it will need, while also producing scientifically valuable results in its own right.
After approximately 18 months of operations, the MWA has collected over 4 petabytes of data, enough to fill roughly 8,000 computer hard drives of 500 gigabytes each.
It is therefore one of the first astronomy facilities to enter the era of big data.
The systems in the Murchison produce data streams of approximately 60 gigabits per second (the average Australian internet connection runs at 6.9 Mbps), which are processed in real time on site, using GPU-based signal processing as the first stage in a hierarchical data processing strategy.
This setup means the MWA can produce data almost 8,700 times faster than the average Australian internet connection can download it.
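The ratio quoted above follows directly from the two figures in the text. A minimal sketch of the arithmetic in Python is given here; the variable and function names are illustrative only and do not come from any MWA software, and decimal prefixes (1 Gbps = 1,000 Mbps) are assumed.

```python
# Figures quoted in the article; all names here are illustrative.
MWA_RATE_GBPS = 60.0        # approximate on-site data rate
AVG_CONNECTION_MBPS = 6.9   # average Australian internet connection, as quoted


def rate_ratio(rate_gbps: float, connection_mbps: float) -> float:
    """Return how many times faster rate_gbps is than connection_mbps."""
    # Assumption: decimal prefixes, so 1 Gbps = 1,000 Mbps.
    return rate_gbps * 1000.0 / connection_mbps


ratio = rate_ratio(MWA_RATE_GBPS, AVG_CONNECTION_MBPS)
print(f"{ratio:,.0f} times faster")  # prints "8,696 times faster"
```

The result, about 8,696, is consistent with the "almost 8,700" figure in the text.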
The output data streams from this processing are approximately an order of magnitude smaller than the input data streams and are transmitted over a dedicated 10 Gbps optical fibre network to the Pawsey Supercomputing Centre in Perth.
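As a rough check that the reduced stream fits on the fibre link, the figures above can be combined as follows. This is only a sketch: it assumes "an order of magnitude" means a factor of exactly 10, which the text does not state, and all names are illustrative.

```python
# Figures quoted in the article; all names here are illustrative.
INPUT_RATE_GBPS = 60.0      # approximate on-site input data rate
LINK_CAPACITY_GBPS = 10.0   # dedicated optical fibre link to Perth
REDUCTION_FACTOR = 10.0     # assumption: "order of magnitude" taken as exactly 10x


def output_rate(input_gbps: float, reduction: float) -> float:
    """Return the data rate remaining after on-site processing."""
    return input_gbps / reduction


def link_utilisation(rate_gbps: float, capacity_gbps: float) -> float:
    """Return the fraction of the link capacity that the stream occupies."""
    return rate_gbps / capacity_gbps


out_gbps = output_rate(INPUT_RATE_GBPS, REDUCTION_FACTOR)
utilisation = link_utilisation(out_gbps, LINK_CAPACITY_GBPS)
print(f"Output stream: about {out_gbps:.0f} Gbps, {utilisation:.0%} of the link")
```

Under that assumption the output stream is about 6 Gbps, which fits within the 10 Gbps link, whereas the raw 60 Gbps stream would not.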
At the centre, the archiving and curation of accumulated datasets represent the next big data challenge.
Ensuring the integrity and discoverability of the archive requires significant effort for such a large-scale dataset.
Finally, the Pawsey Supercomputing Centre allows scientists around the world to access MWA data from the archive and process them.
This is the data analytics aspect of big data, where scientists invent new algorithms to turn data into images of the sky and catalogues of objects, and to extract highly abstracted representations of the data in which they search for subtle statistical signals.
This is similar to the analytics problems familiar to the commercial world: finding the information carried by a small handful of important variables among a very large number of possible variables, across a large volume of data.
The MWA is one of three precursor telescopes for the much larger Square Kilometre Array (SKA), approximately half of which will be built in the Murchison within a decade.
The MWA is the first to be fully operational, placing Western Australian astronomers and engineers at the forefront of international SKA developments and engaging with industry partners such as CISCO, Woodside and SIRCA.