Finding the unknowns in the universe
What have pulsars, quasars, dark matter and dark energy got in common? Answer: each of them took the discoverer by surprise. While much of science advances carefully and methodically, the majority of truly spectacular discoveries in astronomy are unexpected.
Many of our telescopes are built to discover the known unknowns: the things we know we don't know, such as identifying the stuff that makes up dark matter.
But the real breakthroughs are the unknown unknowns. These are the things we don't even suspect are out there until we accidentally find them.
For example, of the ten greatest discoveries by the Hubble space telescope, only one featured in the proposal used to justify its construction and launch. That one, measuring the rate of expansion of the universe, was a known unknown.
In other words, we had a question about something that we knew about, and we thought Hubble could answer the question. Most of the other discoveries are unknown unknowns: we didn't know what they were until we stumbled across them.
A chance discovery
Consider pulsars. They were discovered in the 1960s when a bright young PhD student in the UK, Jocelyn Bell Burnell, was studying the twinkling of radio waves by electrons in space (a known unknown).
She noticed what she called "bits of scruff" on her chart recorder, and realised they were something much more startling than mere tractor interference. She had discovered pulsars (an unknown unknown), for which her supervisor Antony Hewish won the 1974 Nobel prize for physics.
So how did she make that discovery?
Apart from being a bright, persistent, open-minded student, Bell Burnell was also observing the universe in a way in which it had never been observed before. By looking at rapid changes in the radio waves, she was observing the universe using a parameter – in this case short timescale observations – that hadn't been used before.
Other discoveries happen when people observe along a different parameter, such as faintness or area of sky, in a range that hasn't been explored before. Together, these parameters make up our parameter space.
Most major astronomical discoveries seem to happen when somebody observes a new part of parameter space: that is, observes the universe in a way it hasn't been observed before.
This new way might consist of looking more deeply, or with better resolution, or on a larger scale, or maybe just seeing much more of the universe. Extending any of these parameters into their unexplored regions is likely to lead to an unexpected discovery.
Right now several next-generation telescopes are being built, boldly going where no telescope has gone before. They will significantly expand the volume of observational parameter space, and should in principle discover unexpected new phenomena and new types of object.
For example, CSIRO's A$165-million ASKAP telescope, now nearing completion, is exploring several areas of uncharted parameter space, with an excellent chance of stumbling across a major unexpected discovery that could shake the scientific world.
But will we recognise it when we see it? Probably not.
Bell Burnell discovered pulsars by laboriously sifting through all her data and noticing a tiny anomaly that didn't fit her understanding of the telescope.
How much data?
If Bell Burnell were observing with ASKAP, she would have to sift through about 80 petabytes of data a year, from a machine that is so complex that nobody truly understands every bit of it. Sorry, not even Bell Burnell's brain is up to the task of sifting through that amount of data.
We cannot possibly examine all that data by eye. So the way we do our science is that we decide on the scientific question we are asking, and turn it into a data query.
We then mine the database looking for those bits of data that will answer our question.
This is a very efficient way of answering the known unknowns. Sadly, it is useless at finding the unknown unknowns. We only receive answers to the questions that we ask, and not to the questions that we didn't know we ought to ask.
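To make that limitation concrete, here is a minimal sketch of a query-driven search. The catalogue, its field names (`flux_mjy`, `varies_in_seconds`) and the `query` helper are all invented for illustration and are not taken from any real survey database.

```python
# Hypothetical catalogue: names, fields and values are invented for illustration.
catalogue = [
    {"name": "src-1", "flux_mjy": 250.0, "varies_in_seconds": False},
    {"name": "src-2", "flux_mjy": 12.0, "varies_in_seconds": False},
    {"name": "src-3", "flux_mjy": 3.0, "varies_in_seconds": True},  # the surprise
    {"name": "src-4", "flux_mjy": 180.0, "varies_in_seconds": False},
]


def query(rows, predicate):
    """Return the names of the rows that satisfy the question encoded in predicate."""
    return [row["name"] for row in rows if predicate(row)]


# A known unknown turned into a query: "which sources are brighter than 100 mJy?"
bright = query(catalogue, lambda row: row["flux_mjy"] > 100.0)
print(bright)  # prints ['src-1', 'src-4']
```

The faint, rapidly varying `src-3` never appears in the result, because nobody asked a question that would select it.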
Remember The Hitchhiker's Guide to the Galaxy, the science fiction series by Douglas Adams? When a giant computer, Deep Thought, found the answer to "life, the universe, and everything" to be 42, another, even bigger, computer had to be built to find out what the actual question was.
So can we design a machine, or a piece of software, to replicate Bell Burnell's brain in detecting unknown unknowns but working comfortably with petabytes of data and unbelievably complex telescopes?
WTF into the unknowns
I think we can, and we've already started a project called WTF, which stands for Widefield ouTlier Finder, with the progress so far published just last month. The WTF machine will sift through the petabytes of data, searching for something unexpected, without knowing exactly what it's looking for.
The trick is to use machine learning techniques, where we teach the software about all the things we know about, and then ask it to find things we don't know about.
For example, it might plot a graph of radio brightness against optical colour. On that graph, it would find a cluster of quasars grouped together, another cluster of galaxies like the Milky Way, and so on.
Maybe it will find another cluster of objects that we didn't expect and didn't know about. Our puny brains couldn't make more than a small dent in all the possible graphs that need to be plotted, but WTF will take these in its stride.
This process won't be easy. At first, WTF will probably turn up things we forgot to tell it, and it will also find radio interference and instrumental artefacts.
As we gradually teach it what these are, it will start to recognise truly new objects and phenomena. More significantly, it will start to learn new things from the data: patterns that the sheer multidimensional complexity of the data hides from our brains, but that will be grist to the mill for WTF.
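The idea of teaching software the classes we know and then asking it for whatever doesn't fit can be sketched in a few lines. This is only a toy illustration, not the method WTF actually uses: the two axes (radio brightness and optical colour), the sample values, the function names and the five-sigma threshold are all assumptions chosen for the example.

```python
import math


def class_stats(points):
    """Mean and per-axis standard deviation of a list of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sx = math.sqrt(sum((p[0] - mx) ** 2 for p in points) / n)
    sy = math.sqrt(sum((p[1] - my) ** 2 for p in points) / n)
    return mx, my, sx, sy


def scaled_distance(point, stats):
    """Distance from a class centre, measured in units of that class's spread."""
    mx, my, sx, sy = stats
    return math.hypot((point[0] - mx) / sx, (point[1] - my) / sy)


def find_outliers(objects, known_classes, threshold=5.0):
    """Return the objects that lie far from every known class."""
    stats = [class_stats(points) for points in known_classes.values()]
    return [
        obj for obj in objects
        if min(scaled_distance(obj, s) for s in stats) > threshold
    ]


# Invented training examples: (radio brightness, optical colour) in arbitrary units.
offsets = range(-2, 3)
known_classes = {
    "quasars": [(2.0 + 0.1 * i, 0.3 + 0.05 * j) for i in offsets for j in offsets],
    "galaxies": [(0.5 + 0.1 * i, 1.2 + 0.1 * j) for i in offsets for j in offsets],
}

# Two ordinary objects and one that resembles nothing in the training set.
survey = [(2.1, 0.35), (0.4, 1.3), (4.0, 3.0)]
print(find_outliers(survey, known_classes))  # prints [(4.0, 3.0)]
```

In practice the flagged list would at first be dominated by interference and artefacts, which is why the known classes have to keep growing as the software is taught what those look like.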
We expect WTF to become smarter than us, able to find those rare discoveries buried in the data. Perhaps WTF will even win the first non-human Nobel prize.