Under the microscope

What is the value of research if the results are so fragile?

Artillery Row

The Covid pandemic has brought science into the spotlight, as scientists take a leading role in advising government policy. The post-mortem into the quality of their data, models and policy proposals has already begun.

The political controversies are only the tip of the iceberg, however. The wider issue underlying scientific advice is that science itself has severe problems in the way it is carried out and in the way its results are disseminated. The issue hinges on reproducibility.

Reproducibility is fundamental to science. An experiment is performed, data collected and analysed, and a conclusion reached. This is then submitted for publication and only accepted after other scientists have scrutinised the work, picked it apart, looked for errors. Only if they are satisfied does it get published, subsequently cited by other scientists in their ongoing work as a validated step on the march of knowledge.

The faith in peer-review is such that not many experiments are repeated these days. Money is tight and asking for it to do what others have already done is not always a winning strategy, especially for a young researcher seeking to establish themselves or for a high-flying institute with a reputation to maintain.

About a decade ago a group of scientists from two drug companies, Bayer and Amgen, set out to reproduce the findings of other scientists' pre-clinical experiments into cancer treatment, such as those investigating compounds and molecules that could attack cancer cells and shrink tumours. The study cost $2 million and took eight years, longer than intended. The final results have just been released, shocking many, while others must confess themselves unsurprised.

The scientists identified 53 high-profile cancer studies published in prestigious journals between 2010 and 2012. They immediately ran into problems. Despite being published in top journals, none of those papers contained nearly enough information to carry out the experiment without further details. The group had hoped to include at least one of the original researchers in their work, but that didn’t happen. All too often the original researchers had not recorded or archived the correct data. Meanwhile, a third of those contacted about their papers were either not reachable, didn’t respond or just said “no”. The number of experiments they were able to replicate kept going down and down. In almost 70 per cent of the papers, they were unable to obtain key data. Eventually they were left with just 23 studies.

Nearly 90 per cent of those papers reported stronger, more statistically significant results than could be reproduced. At best, only 43 per cent of the reproducible studies could be verified to some degree. It is a shocking indictment of the way science has worked in this instance. Technically it doesn’t mean that the original papers were wrong. There could be many reasons why they could not be replicated, some subtle or statistical factors beyond the researchers’ control. But it does raise the question, what is the value of such research if the results are so fragile?

Reproducibility is increasingly acknowledged as a practical problem, not merely a philosophical underpinning of the scientific process. The UK now has a National Reproducibility Network, as do many other countries.

It’s not just a problem in pre-clinical cancer studies. It’s everywhere: geoscience, climate change, chemistry, animal behaviour. A 2016 study of economic research showed only 60 per cent could be reproduced, and it is thought that a simple software error in the computers that control MRI brain-imaging scanners could have invalidated over ten per cent of the studies that relied on their data. Thank goodness someone eventually checked. Two years ago, marine biologists attempted to confirm a well-publicised study on ocean acidification due to climate change and its detrimental effects on fish behaviour. The reanalysis found no such effects.

In medicine, it affects our understanding of heart attacks, strokes and allergic responses, to give a few examples. How many still trust the famous Stanford prison experiment, a simulated jail in which students acted as guards and prisoners? The guards eventually behaved sadistically towards the inmates, and the study was held up, alongside Milgram’s obedience experiment with its fake electric shocks, as an example of good people being led to do bad things by authority. It was used in courts for decades, but has since been discredited as biased to the point of outright fraudulence.

The reproducibility problem raises the question of what it has influenced that we don’t know about. It’s intertwined with the way we do science and report it: the pressure to publish quickly and often to get a foothold on the scientific career ladder, the reluctance of young scientists to speak up lest they ruin their careers, and the media’s overemphasis on low-quality studies.

The contradiction, and the challenge, is that science thrives on things that are wrong. Being wrong is valuable information, but only if we recognise it. Inevitably, given the way science works, there will be talent, careers and money wasted pursuing dead ends. But we can do things better, with better record-keeping, better management of statistical data, more exacting standards demanded by journals and a change in our cultural mindset. Being wrong isn’t a scientific crime unless you realise it and do nothing.
