Scientific tests carry more weight when they have been replicated.

February 5, 2012

Why would any scientist want to repeat an experiment that’s been done before?

This is one of ScienceOrNot’s Hallmarks of science. See them all here.

In short…

The scientific community will have greater confidence in a model if the supporting tests are repeated and confirmed by scientists who are independent of the original testers.


Reproducibility—the independent verification of prior findings—is at the core of “the spirit of science”

Ben Santer, Tom Wigley & Karl Taylor, climate scientists, 2011

What does replication mean?

Replication is a way of checking results. A particular test is replicated when it is repeated by researchers who are independent of those who did the original test. The scientists doing the replication check whether they obtain results similar to those reported by the earlier investigators.

(The terms ‘replicable’ and ‘reproducible’ tend to be used interchangeably – and confusingly – in science. Here, I will stick to ‘replicable’.)

How does replication work?

When scientists publish the results of their model-testing in scientific journals, they are expected to describe in detail how the tests were performed. They should also be prepared to supply any further details on request. This means that their tests are replicable – they can be repeated by other scientists as a check.

There is no general requirement that all scientific tests must be replicated. Many never are. Replication usually happens after scientists report findings that are controversial. If the results of the replication are similar to the original results, they are said to be commensurate. The results are not expected to be exactly the same, since all scientific results have uncertainties.
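To make "similar, but not exactly the same" concrete, here is a minimal sketch of one common way to compare two results that each carry an uncertainty: divide their difference by the combined uncertainty and see how large it is. This is only an illustration, not the method used in any of the studies mentioned on this page. The function name, the cut-off of 2 combined standard uncertainties, and the numbers are all assumptions made up for the example.

```python
import math


def is_commensurate(value_a, sigma_a, value_b, sigma_b, max_sigmas=2.0):
    """Return True if two results agree within their combined uncertainty.

    value_a, value_b: the two reported results (same units).
    sigma_a, sigma_b: their standard uncertainties.
    max_sigmas: hypothetical cut-off; 2.0 is an assumption, not a standard.
    """
    # Independent uncertainties are combined in quadrature.
    combined_sigma = math.sqrt(sigma_a ** 2 + sigma_b ** 2)
    if combined_sigma == 0:
        raise ValueError("at least one uncertainty must be non-zero")
    # How many combined uncertainties apart are the two results?
    difference_in_sigmas = abs(value_a - value_b) / combined_sigma
    return difference_in_sigmas <= max_sigmas


# Hypothetical numbers: an original result and an independent replication.
original = (0.90, 0.05)
replication = (0.84, 0.04)
print(is_commensurate(*original, *replication))
```

With these made-up numbers the two results differ by less than one combined uncertainty, so the check treats them as commensurate. A much larger gap, relative to the uncertainties, would suggest a problem worth investigating.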

Note that scientists usually repeat their own tests many times, if possible. This is not the same as replication, which must be done independently of the original scientists and their equipment.

Why replication is needed

Replication plays an important role in weeding out mistakes in testing models. Successful replication, with commensurate results, increases our confidence in the tests. Non-commensurate results show that there is a problem. They can indicate scientific fraud, but more often they arise because the original researchers made honest mistakes.

Examples

  • The Berkeley Earth Surface Temperature project was set up in 2011 to replicate analyses of global temperature trends. Its findings were commensurate with those of earlier studies, showing that the Earth’s temperature has increased and that neither the urban heat island effect nor the inclusion of poor-quality measuring stations had any significant effect. For full results, see this paper.
  • In December 2010, the journal Science published an online paper by a NASA scientist and colleagues which described a bacterium that could incorporate arsenic into its DNA instead of the usual phosphorus. Many biologists disputed the findings, claiming that DNA incorporating arsenic would be unstable, and that the arsenic found in the DNA was more likely a contaminant. An independent team replicated the study and reported in January 2012 that it could detect no arsenic in the bacterial DNA.
  • In December 2011, Science retracted a 2009 paper that linked chronic fatigue syndrome to a mouse retrovirus, after numerous studies had failed to replicate the findings.
  • Gravitational waves, travelling ripples in space-time, were predicted by Albert Einstein in 1916. In the 1960s, Joe Weber, an American physicist, built a gravitational wave detector and claimed to have detected them. Other physicists were skeptical of his results and tried to replicate them without success. Although physicists now dismiss Weber’s claims to have detected gravitational waves, the search for the waves continues, and his detectors laid the foundations for most current detecting instruments.
  • In 2011, The Journal of Personality and Social Psychology published a paper by psychologist Daryl J. Bem, which claimed to show that human performance can be influenced by future events. The way the data was analysed in the study was criticised, and subsequent attempts to replicate Bem’s findings have failed (here and here). Steven Novella at Science-Based Medicine has the full story.

The Santer, Wigley and Taylor quote is from “The Reproducibility of Observational Estimates of Surface and Atmospheric Temperature Change”, in Science, 2 Dec 2011. DOI: 10.1126/science.1216273

This page reviewed and updated: 2013/10/16
