Is medical evidence accurate?

Modern computing, data generation, scientific software, email, the internet, and social media have all placed strain on the scientific publication infrastructure.

Evidence-based medicine is the cornerstone of modern health care, but what if the system of proof was wrong and most published research is false? That is the question being posed at NYU Abu Dhabi.

Evidence-based medicine is the cornerstone of modern health care. In the bad old days, treatments were based on a doctor's personal clinical experience or traditional, time-honoured practice. Today, treatment decisions are based on evidence-based medicine, better known in the medical profession as simply EBM. The phrase had its first public outing as recently as the spring of 1991, in an essay by a young Canadian doctor in the Journal of the American College of Physicians.

Evidence-based medicine, wrote Gordon Guyatt, head of the intern programme at McMaster University, Ontario, was “the way of the future”.

Assisted by the introduction of what were still being called desktop “microcomputers”, linked by telephone line to Medline, the newly searchable database of medical literature run by the United States national library of medicine, doctors could quickly track down studies relevant to their case, and determine “the optimal management of the individual patient” rather than consult a textbook, expert or senior physician.

It was, without doubt, a revolution in health care, and it has led to an avalanche of published research, estimated at about 2.5 million papers a year, appearing in print and online in an ever-growing number of journals.

It all sounds good. But what if most published research is false?

That is the startling premise that will be explored in Abu Dhabi on Tuesday by Jeff Leek, associate professor of biostatistics at Johns Hopkins Bloomberg School of Public Health, in a provocative talk at the NYU Abu Dhabi Institute.

As Prof Leek wrote in a paper in April, the suggestion that most published research findings are false “seems absurd on the first reading”. After all, scientific research is conducted by skilled scientists, vetted through peer review, and publicly scrutinised.

What’s more, the entire scientific publishing infrastructure “was originally conceived to prevent the publication of incorrect results and provide a forum for correcting false discoveries”.

But “modern computing, data generation, scientific software, email, the internet, and social media have all placed strain on the scientific publication infrastructure”.

It sounds counterintuitive – surely all these innovations can only assist researchers in reaching more accurate and reliable conclusions?

Well, yes and no, says Prof Leek. “In many ways these innovations really help researchers,” he says. “But the landscape of research changed so quickly that training had a hard time keeping up.”

The sheer volume of data generated makes it hard to analyse and store, and not every scientist is trained in the statistics that play a major role in every study. What’s more, new outlets for research, such as online-only journals and “preprints” – previews of work – that have sped up the publishing rate, are not vetted or peer reviewed. And the limits of social media, Prof Leek says, can pressure researchers to oversimplify their results.

Conventional media also often misrepresent findings. A story based on a small study that has not been vetted or confirmed may be reported as fact, rather than as an interesting idea – think about health claims or warnings about salt, coffee, red wine and so on. When the finding is later disproved, “people will think scientists can’t make up their minds”.

All of these things, he says, “apply unexpected pressures to researchers that make it more difficult to execute dispassionate science”.

The alarm was first sounded in 2005 by John Ioannidis, a medical statistician at Tufts University School of Medicine, Boston, who is now professor of health research and statistics at Stanford University.

There was, he wrote in a paper published in PLoS Medicine in August 2005, "increasing concern that in modern research, false findings may be the majority or even the vast majority of published research claims".

What followed was an exhaustive and highly technical analysis of the statistical flaws and assumptions which, Prof Ioannidis said, undermined so many research conclusions.

Since then, “there has been major progress on many fronts”, Prof Ioannidis says.

Increasingly, clinical trials are registered with central bodies, meaning that results both positive and disappointing are reported, leading to a fuller picture, and more data that supports conclusions is being shared, often online.

“But I fully agree with Jeff that much work needs to be done,” says Prof Ioannidis. “I cannot say that research has become less reliable over the 12 years [since his 2005 paper was published].

“Some fields have become more reliable, and a few fields that were extremely unreliable – for example, genetic associations – have become extremely reliable.

“But at the same time, we have an explosion of data, often poor data, and thus many more results that get derived from working with poor data.”

Inevitably, there could be ramifications for those on the receiving end of medical research – the patients.

“For health care, we clearly need robust inferences,” Prof Ioannidis says, but in most medical applications “we have less than optimal evidence.

“This doesn’t mean that we have no evidence, but we can clearly do better.”

A stark example of the potentially deadly consequences of flawed research surfaced in 2014, when the British Medical Journal was forced to retract an article that reached a false conclusion on a drug affecting millions of people.

In October 2013 John D Abramson, a healthcare policy expert at Harvard Medical School, and three co-authors, published a peer-reviewed article in which they stated that 20 per cent of patients who took statins, which are widely prescribed to lower cholesterol, suffered side-effects including diabetes and muscle pain.

Sir Rory Collins, professor of medicine and epidemiology at the University of Oxford, revealed that they had overestimated the side effects of statins by more than 20 times and this “may have meant people stopped taking them or high-risk patients didn’t start taking them”.

It was months before The BMJ retracted the article and Dr Abramson and colleagues admitted they had made an "error of interpretation".

Overstating the dangers of a drug is one thing, but well-designed and correctly interpreted trials can also fail to reveal negative side effects.

When it comes to new drugs that come on the market, says Prof Leek, there are two main types of research: exploratory and confirmatory.

“In exploratory research, scientists are looking for new ideas and trying them out,” he says.

“Sometimes they work really well, sometimes a little bit, and sometimes not at all. But we wouldn’t go out and start offering a new treatment to the public based on a single exploratory study.”

That’s where confirmatory research comes in. This involves “carefully vetting pre-specified hypotheses – like that a certain drug is a safe and effective treatment for a particular disease”.

Regulatory agencies, such as the FDA in the US, provide oversight to make sure any resulting drugs are safe and effective.

But, Prof Leek cautions, while “the system is set up to be very conservative about approving drugs, that doesn’t mean there will never be mistakes”.

Before being approved by regulatory bodies for adoption by the medical profession, proposed new drugs or other treatments are first tested in human trials, which are carried out according to a set of stringent, standardised protocols.

The “gold standard” is the so-called randomised, double-blind, placebo-controlled trial, in which patients with the same diagnosis are divided at random into two groups, one of which is given the new treatment for the condition, and the other a placebo. Neither the patients nor the researchers treating them know who is in which arm of the trial.

The results – a range of predetermined outcomes, up to and including death – are then subjected to sophisticated statistical analysis, which reveals whether the treatment is better or worse than existing treatments.
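For readers who want to see the mechanics, the two steps described above (random allocation, then a statistical comparison of outcomes) can be sketched in a few lines of Python. This is a minimal illustration, not the method of any trial mentioned in this article: the function names `randomise` and `two_proportion_z_test` are hypothetical, the patient counts are invented, and a simple two-proportion z-test stands in for the far more sophisticated analysis a real trial would pre-specify.

```python
import math
import random


def randomise(patient_ids, seed=0):
    """Shuffle the patients and split them into two arms of (near) equal size."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test for a difference between two event rates.

    Returns (z, p_value). A small p_value suggests the difference between
    the arms is unlikely to be due to chance alone.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # No events, or events in every patient: the arms cannot be told apart.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Invented numbers: 200 patients, with 30 bad outcomes among the treated
    # and 45 among those given the placebo.
    treatment, placebo = randomise(range(200))
    z, p = two_proportion_z_test(30, len(treatment), 45, len(placebo))
    print(f"z = {z:.2f}, p = {p:.3f}")
```

A result like this shows only that the two arms differed during the trial itself. It says nothing about harms that are rare or slow to appear, which is one reason a drug can pass its trials and still prove dangerous years later.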

But this system is far from infallible.

Unnervingly for patients, it remains a sobering fact that drugs can be approved and in use for years before statistical evidence emerges to show that, long-term, they do more harm than good.

• Prof Leek's talk on Tuesday from 6.30pm-8pm at the NYU Abu Dhabi Conference Centre (A6) is free and open to the public. Online registration is required.


Four cases where side effects were greater than the drug’s benefits

• In 2011, drug company Eli Lilly and Co withdrew a product called Xigris, designed to treat severe cases of septic shock, in which the body’s organs start to shut down in an overreaction to infection. The decision, a decade after the drug’s approval in Europe and the United States, was based on findings that the chance of death was greater in those who took the drug than in those who took the placebo.

• In 2014,, a US non-profit educational charity, identified 35 prescription drugs that had been approved by the FDA but were later withdrawn from the market, often only after many years, because they had been found to do more harm than good.

• Accutane, a treatment for acne, one of the 35, above, had been on the market for 27 years before it was withdrawn in 2009 for causing side effects including birth defects, miscarriages and premature births when used by pregnant women.

• Darvon, an opioid painkiller, was in use for 55 years before being banned in Britain in 2005 and the US in 2010. Toxic to the heart, it was thought to have caused thousands of deaths.