A landmark study involving 270 scientists from around the world has tried to replicate 100 recent findings from highly ranked psychology journals, and by one measure, only 36 percent turned up the same results. That means that for over half the studies, when scientists used the same methodology, they could not come up with the same results.
"A large portion of replications produced weaker evidence for the original findings despite using materials provided by the original authors, review in advance for methodological fidelity, and high statistical power to detect the original effect sizes," the team reports in Science today.
For the study, teams from around the world each selected an experiment from a 2008 edition of one of three leading psychology journals and then followed the original methodology as closely as they could. They were also told to contact the lead authors where possible, so they could get a better insight into how things were done the first time around.
While 97 of the 100 studies originally reported statistically significant results - which Ed Yong explains at The Atlantic as "if you did the study again, your odds of fluking your way to the same results (or better) would be less than 1 in 20" - only 36 of these results could be replicated as statistically significant the second time around.
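To make the "less than 1 in 20" threshold concrete, here is a minimal Python sketch. It is not the method used in the Science paper; it simply simulates many hypothetical studies in which there is no real effect at all and counts how often they still cross the conventional 0.05 significance line by chance. The function names, the sample size of 30, and the simplified z-test (which assumes the measurement spread is known) are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean


def two_sided_p(sample):
    """Two-sided p-value for 'the true mean is zero'.

    Illustrative z-test that assumes each measurement has a known
    standard deviation of 1 (an assumption, not from the study).
    """
    z = fmean(sample) * len(sample) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(trials=5000, n=30, alpha=0.05, seed=0):
    """Share of simulated no-effect studies that still reach p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Every simulated study measures pure noise: there is no effect to find.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p(sample) < alpha:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Expect a value close to 0.05, i.e. roughly 1 study in 20.
    print(f"{false_positive_rate():.3f}")
```

Under these assumptions, roughly one simulated study in 20 looks "significant" even though nothing is there, which is one reason a single significant result is not the same as a confirmed finding.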
And these papers were taken from the best journals - the hardest ones to get published in. If the studies were taken from all available psychology journals, the results would probably have been even worse.
"The success rate is lower than I would have thought," Stanford University's John Ioannidis, author of the widely cited paper, Why Most Published Research Findings are False, told Yong. "I feel bad to see that some of my predictions have been validated. I wish they'd been proven wrong."
But this doesn't mean that the results from two-thirds of the highest-profile psychological studies from 2008 were incorrect. Even when a result couldn't be replicated, there's likely still something to the original finding, but as with all studies that haven't been independently verified and replicated, it needs to be taken with a grain of salt.
Studies in social psychology, which look at how certain things influence behaviour, are known to be less reproducible than cognitive studies, which look at how the brain functions when it's storing memories, learning new things, and so on. But psychology isn't the only field in science that suffers when put through the replication wringer.
Just yesterday we reported that a review of climate contrarian papers - journal articles that disagree with the consensus that climate change is likely caused by human activity - found that they were riddled with methodological errors, which would have made their findings impossible to replicate. And earlier this year, a separate study found that the prevalence of irreproducible preclinical research exceeds 50 percent, "resulting in approximately US$28,000,000,000/year spent on preclinical research that is not reproducible - in the United States alone".
So today's study shouldn't be seen as an indication that psychology is a less reliable science. Science at its most basic level is about the process of hypothesising, testing, validating, and retesting, and while it should have been done in the first place with these 2008 experiments, better late than never. "Indeed, the fact these researchers are trying to analyse the credibility of findings from their own discipline is surely an indicator of a commitment to scientific rigour," Victoria Turk points out at Motherboard.
The reasons why these studies are published without replication are, as you can imagine, incredibly complex, but Ed Yong tackles the main ones in his Atlantic article. A lot of it comes down to humans wanting answers, which science is in no way obliged to give. As the authors of today's Science study point out: "Humans desire certainty, and science infrequently provides it. As much as we might wish it to be otherwise, a single study almost never provides definitive resolution for or against an effect and its explanation."