Astronomer Andrew Connolly discusses the promise of big data
Andrew Connolly is a professor in the University of Washington Department of Astronomy. He is one of several UW professors working on the Large Synoptic Survey Telescope, or LSST, which will begin scanning the sky in 2022 from its location atop Cerro Pachón, a mountain in northern Chile.
He has called it "one of the most exciting experiments in astrophysics today," adding, "it could completely transform our knowledge of the universe, from understanding how dark energy drives the expansion of the universe, to identifying asteroids that may one day impact the Earth."
Over the years, Connolly has worked on a number of areas in the design and construction of the LSST, from running the UW data management group that develops software to study information that will come from the telescope, to leading a team developing simulations of what this powerful new telescope might see. On his web page he says, "My science focuses on analyzing large astronomical data sets to study the formation and evolution of galaxies and cosmology."
Throughout his career he has been involved with big data projects. As a postdoctoral researcher he was involved in the Sloan Digital Sky Survey, or SDSS, a collaboration of about 200 astronomers at more than 40 institutions on four continents that has been scanning the sky and collecting data since 2000. During a sabbatical in 2006 at Google, Connolly was the project leader for Google Sky, which incorporated images from the Hubble Space Telescope and the SDSS into Google Earth.
Connolly answered a few questions about his work and about what big data and tools such as the LSST promise for astronomy.
Q: Where are you spending the year, and what are you working on?
A.C.: I am in Cambridge (the UK version) for a year. I'm working on a few different areas ranging from the detection of objects whose light has been bent (or gravitationally lensed) by distant galaxies, to studying how we can survey the sky to maximize how quickly we can get science from the LSST.
These may seem like very different questions and problems, but they are in fact related. They both involve searching for subtle signals in large, complex data sets. These signals are hard to extract, but if we can extract them, we might be able to understand how the universe evolves (driven by dark energy and dark matter).
We have a lot of different ways to look at the sky (different telescopes and instruments) and many tools that can be used when working with data, but it is only when you start applying these techniques to real observations that you can understand how well they will perform in practice. I'm trying to use some of the techniques that we will use on the LSST but on today's data sets.
So you could say that I am getting my hands dirty with data, which has been a lot of fun, especially with the LSST a few years away.
Q: In your TED talk you say that a single image from the LSST will be equivalent to 3,000 images from the Hubble Space Telescope. How is this achieved?
A.C.: The LSST isn't the biggest telescope in the world (unlike the new generation of telescopes that will have mirrors 30 meters across), nor does it have the highest-quality images (such as those from space-based telescopes like the Hubble).
What it does have is a very large field of view (one image covers an area seven times the width of the full moon) and the largest digital camera in the world (with 3.2 billion pixels). This means it can survey half of the sky every three nights to discover if anything has changed or moved (something Hubble would take about 120 years to do just once).
One of the great aspects of all of the telescopes and instruments we are building today is that they have different and complementary capabilities (e.g. the Hubble can look in great detail at very faint sources but can't cover large areas of the sky). Combined, they reveal both the big picture and the details of how the universe has evolved up to the present day.
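To put the field-of-view figure in this answer into rough numbers, here is a back-of-the-envelope sketch. It is not an LSST calculation: it assumes the full moon spans about half a degree, treats the field of view as a simple circle, and ignores overlap between images. The function and constant names are made up for illustration.

```python
import math

# Assumption (not from the interview): the full moon is about 0.5 degrees across.
FULL_MOON_DIAMETER_DEG = 0.5
# From the interview: one image is seven times the width of the full moon.
FOV_WIDTH_IN_MOONS = 7
# Area of the whole sky in square degrees: 4*pi steradians converted to degrees.
FULL_SKY_SQ_DEG = 4 * math.pi * (180 / math.pi) ** 2


def field_of_view_sq_deg(width_in_moons=FOV_WIDTH_IN_MOONS,
                         moon_diameter_deg=FULL_MOON_DIAMETER_DEG):
    """Area of a circular field of view, in square degrees."""
    radius_deg = width_in_moons * moon_diameter_deg / 2
    return math.pi * radius_deg ** 2


def pointings_for_half_sky(fov_sq_deg):
    """Minimum number of non-overlapping images needed to tile half the sky."""
    return math.ceil(0.5 * FULL_SKY_SQ_DEG / fov_sq_deg)


if __name__ == "__main__":
    fov = field_of_view_sq_deg()
    print(f"Field of view: about {fov:.1f} square degrees")
    print(f"Images to cover half the sky: about {pointings_for_half_sky(fov)}")
```

Under these assumptions a single image covers a little under 10 square degrees, so roughly two thousand images tile half the sky. That is consistent with a survey that revisits half the sky every three nights, though a real survey would need more images to allow for overlap and repeat visits.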
Q: What are the challenges that you face in order to answer these "big questions"?
A.C.: Within the next decade new telescopes (on Earth and in space), and new cameras and spectrographs will realize a 1,000-fold increase in the amount of data accessible to astronomers. The size of the data will enable us to answer some of the most fundamental questions in astrophysics today—questions we have been asking since we started looking up at the stars and wondering how they came into being.
Discoveries that might come from the data include:
- Measurements of the shapes of distant galaxies could reveal the properties of dark energy with an accuracy 10 times better than today. This could change our understanding of general relativity if it shows that gravity works differently on large scales.
- Surveys of the faint radio sky may detect the epoch at which stars and galaxies first began to form within the universe.
- Tracking the orbits of asteroids and comets could reveal if the environment in which the Sun formed was responsible for the distribution of the planets in our solar system or identify asteroids that might one day impact the Earth (at distances where we can do something about it).
Some of the most exciting discoveries will be answers to questions that today we don't even know how to ask.
But this data-rich era comes with a big challenge: Scientific discovery is beginning to be limited not by how we collect or store data, but by how we extract the knowledge it contains.
We are reaching a stage where our data are much richer than many of the analyses we apply to them, and where software and algorithms have the potential to become the next instrument for exploring the universe.
This gap between the science and the amount of data is something that we need to address. The increasing complexity and size of the data coming from these instruments mean astrophysics is becoming ever more dependent on developments in computing. It also means that there is a great opportunity for discovery if we can prepare the next generation of students and postdocs with the skills that are needed for an era rich in data.
Q: You also mention that "the smart use of data" and new tools will transform astronomy in coming years, "opening up a window in the universe—the window of time." What new understanding of the cosmos might this bring?
A.C.: There are so many things we know about the universe but don't understand. We know it is expanding and this expansion is getting faster, but we don't understand what causes the acceleration.
We know that the dynamics of the universe suggest that most of the matter is not visible, but we don't understand what particles might make up that matter. We can see the diversity of stars and galaxies that have formed in the universe, but we don't understand, in detail, the physical processes that drive the formation and evolution of galaxies or the formation of the first stars.
It is a great time to be an astronomer because a new generation of telescopes and surveys might help us unlock these answers by providing a view of the universe that has unprecedented detail. Data will answer these questions (hopefully) and this revolution in data will occur over the next decade.