What was billed as one of the biggest science stories for years is now the focus of huge controversy, amid claims it was hyped by the researchers.
Earlier this year, Harvard University held a press conference at which a team of scientists revealed what they claimed was the first detection of cosmic gravitational waves – ripples in space-time triggered by the birth of the universe.
On the face of it, the results were confirmation of the leading theory for what caused the Big Bang 14 billion years ago. Known as inflation, this predicts the creation of gravitational waves that may still be detectable in the radiation left over from the explosion.
And in March a team led by Professor John Kovac at Harvard claimed to have found the tell-tale swirling patterns in the radiation – just as predicted.
Some of the world’s leading scientists quickly hailed the discovery. But now it is being attacked as premature at best, and possibly flat wrong.
The resulting controversy provides a rare glimpse into how cutting-edge science is done – and its limitations.
Even at the time of the original announcement, sceptics warned of the need for confirmation from independent sources. Their concerns focused on some odd features in the swirling patterns of radiation.
For example, the strength of the signal was higher than many expected. Yet most concern surrounded the possibility that much if not all of the patterns were due to something other than gravitational waves.
The prime suspect is dust and particles circulating in our own galaxy, with which the radiation may interact to give similar patterns.
As this newspaper reported at the time, these doubts would be assuaged if the findings were confirmed by data gathered by Planck, an orbiting observatory built by the European Space Agency.
The Planck findings are expected to be published later this year. But for some the verdict is already in. The leading science journal Nature has published a comment article by a highly respected cosmologist dismissing the claim as “premature hype”.
According to its author, Prof Paul Steinhardt of Princeton University, the claim has already been debunked by a detailed analysis by his colleagues.
He could barely hide his contempt for how the original claims were made public: via a press conference. “Announcements should be made after submission to journals and vetting by expert referees”, he said. “If there must be a press conference, hopefully the scientific community and the media will demand that it is accompanied by a complete set of documents … to enable objective verification”.
Prof Steinhardt got his wish last week, with the publication of the original claims in the prestigious journal Physical Review Letters.
Or, to be more precise, those claims that survived the scrutiny of the expert referees. For it seems that they have been distinctly underwhelmed, and have insisted on changes.
The result has been a substantial loss of confidence in the reliability of the discovery.
Even some of the team members are now admitting to being less sure of themselves. While they made efforts to account for the risk of dust being the real cause of the swirling patterns, it now seems they didn’t try hard enough.
Even before the PRL paper appeared, other researchers were warning that the team may have seriously under-estimated the effect of dust.
Now in the paper itself, the team has conceded that preliminary data from the Planck satellite points to a much stronger effect from dust.
“Has my confidence gone down? Yes”, team member Dr Clement Pryke of the University of Minnesota told New Scientist magazine.
That’s not just an emotional statement, however. In science, confidence has a very precise meaning; it can even be quantified. Indeed, it’s what convinced the team and many other scientists to take the claim seriously in the first place.
As such, it will be the focus of much of the argument that will rage in the coming months. But it also has major implications for research beyond the esoteric realm of cosmology.
That is because the dirty truth about science is that none of its claims can ever be put beyond all possible doubt. Only in mathematics can anything be proved absolutely.
Instead, scientists have to be satisfied with different levels of confidence in their findings, quantified by a number from statistical theory called a “sigma level”.
In essence, a sigma level measures the strength of evidence, as determined by how hard it would be to get the same evidence by fluke alone.
The higher the sigma level, the more compelling the evidence. Researchers in most areas of science, such as medicine, are happy with anything higher than 2-sigma evidence. Roughly speaking, that means the chance of getting at least as impressive a finding by fluke alone is less than 5 per cent.
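For readers who want to check the arithmetic, here is a minimal sketch of how a sigma level can be turned into a fluke probability. It assumes the evidence follows a normal (bell-curve) distribution and counts flukes in either direction, a "two-sided" test; the article does not say which convention is meant, and the function name is purely illustrative.

```python
import math


def fluke_probability(sigma: float) -> float:
    """Chance of a result at least `sigma` standard deviations from the
    expected value, in either direction, arising by fluke alone.

    Assumes a normal distribution and a two-sided test (an assumption,
    not something the article specifies).
    """
    return math.erfc(sigma / math.sqrt(2))


for level in (2.0, 5.0):
    print(f"{level}-sigma: fluke probability about {fluke_probability(level):.2e}")
```

On these assumptions, 2-sigma comes out a little under 5 per cent, which matches the rough figure quoted for most areas of science.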
As scientists have discovered to their cost, however, the 2-sigma level isn’t a very demanding standard. Many “breakthroughs” based on it, from nutritional advice to cancer therapies, have quickly been debunked.
This has led physicists to set a far higher standard of evidence to claim a discovery: 5-sigma. The theory behind sigma values shows that this isn’t just 2.5 times more demanding than 2-sigma, but over 80,000 times.
When they announced their claim in March, Prof Kovac and his team said their result was even more impressive than this, at 5.9-sigma. This is a staggering 13 million times more impressive than the standard used in most areas of science.
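The comparisons with the everyday standard can be reproduced, at least approximately, with a short calculation. This is a sketch under stated assumptions rather than the method behind the published figures: it takes the baseline to be the 5 per cent threshold, assumes a normal distribution with a two-sided test, and uses illustrative function names.

```python
import math

# Assumed baseline: the "less than 5 per cent" threshold used in most fields.
BASELINE = 0.05


def fluke_probability(sigma: float) -> float:
    """Two-sided normal tail probability beyond `sigma` standard deviations."""
    return math.erfc(sigma / math.sqrt(2))


def times_more_demanding(sigma: float, baseline: float = BASELINE) -> float:
    """How many times smaller the fluke probability is than the baseline."""
    return baseline / fluke_probability(sigma)


for level in (5.0, 5.9):
    print(f"{level}-sigma is about {times_more_demanding(level):,.0f} "
          f"times more demanding than the 5 per cent standard")
```

Under these assumptions the ratios come out at roughly 87,000 for 5-sigma and a little under 14 million for 5.9-sigma, the same order as the figures quoted above. A different convention, such as a one-sided test or an exact 2-sigma baseline, shifts the numbers somewhat.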
Yet despite this, it now seems the original claim could be debunked.
Whether the headlines really were unjustified will become clearer later this year. But if they were based on hype, those who cry “We told you so” should reflect on what the debacle tells us about the standards of evidence used in research.
Most areas of science have adopted a far lower standard, and don’t get this level of scrutiny either.
Given that much of that research has literally life or death consequences, that’s a chilling thought.
Robert Matthews is visiting reader in science at Aston University, Birmingham