Last month a long thread developed on Hacker News, a popular discussion forum, in response to my Nautilus article, “We Should Not Accept Scientific Results That Have Not Been Repeated.” Much to my delight, it generated a rich conversation involving scientists and non-scientists alike. That’s fitting, since our inability to independently replicate our results, I argue, threatens to undermine trust in the scientific enterprise—something every citizen has a stake in.
Before I respond to the arguments against my position from the Hacker News community, allow me to quickly summarize my view. I think the irreproducibility crisis we are currently witnessing needs to be seen in a particular light. Specifically, the crisis is an unintended consequence of a misalignment between the incentives scientists are offered to advance their careers and the quality of the work they produce. The problem, as I see it, is that academic scientists, without any real devious motive, can become famous and respected within their field without producing the finest work. In my article, I proposed that linking recognition to an objective measure of quality, such as whether a study has been replicated, would improve scientific standards and lead to more skepticism and diligence.
Here are some of the noteworthy arguments against me, as well as some questions, and my responses to them.
“Taking papers at face value, without considering whether they’ve been replicated, is really only a problem in science reporting and at (very) sub-par institutions/venues.”
It is true that science reporters have consistently overstated findings to appeal to the wider public. After all, most scientific studies make extremely narrow claims that the public might not find interesting. However, scientists, too, are not immune to taking papers at face value. They view high-tier journals with less skepticism than low-tier journals, consistently assuming that editors and peer reviewers have already vetted the papers they are reading and caught any mistakes in study design or logic. More dangerously, skepticism is often suspended when newly published findings support one’s own line of research and might therefore give that scientist a competitive advantage in securing funding.
“Who’s going to pay for the repeated research to prove the first one?”
Much research is already repeated as part of the scientific process; researchers check others’ work in order to advance their own. Replication may therefore not require an increase in overall funding. The problem, however, is that the outcomes of those confirmatory studies, positive or negative, are rarely published, due to a lack of interest from academics (publication would not advance their careers) and from scholarly journals (it would not increase their readership or impact factor). This discourages researchers from executing confirmatory studies with the same rigor as “innovative” ones.
“There are anti-incentives to reproducing other people’s results. All scientists want to see it, but nobody is able to actually do it, because they’ll lose status, publication opportunities, and funding…What things could be done to make reproducing results actually an attractive activity to scientists?”
Currently, there are several movements advocating for more transparency in science, like the Center for Open Science, or attempting to innovate in publishing and peer review, like F1000. One particularly interesting approach is to have studies pre-registered, which emphasizes study design over results. In this model, studies are peer reviewed before the experiments are conducted. If they are accepted, they are published regardless of what the results turn out to be. I recently proposed a publishing model that may increase the tendency to publish negative and confirmatory results by decreasing the minimum amount of information a new study needs to provide. However, all of these attempts will fail if the scientific community doesn’t embrace them.
Just how to make scientific culture more accepting of confirmatory studies is a difficult question and will, I think, require experimentation. Unfortunately, the scientific community is conservative by nature; we are trained to think like our mentors, who were trained to think like their mentors. Additionally, the hyper-competitive nature of academia discourages scientists from experimenting with how best to do science. No junior scientist is going to spend time preparing a small paper on a confirmatory study while universities are seeking to hire and promote faculty with papers showing novel results, in the hopes of securing future funding. A paradigm shift toward reproducibility, in other words, is the type of cultural change that will require the involvement of the entire scientific ecosystem.
“Repetition is good but isn’t a replacement for good science in the first place. It does no good to replicate bad science.”
The argument that the irreproducibility crisis is a consequence of poor scientific training has some merit. Take, for example, the case of young scientists, new to the system. Currently, many universities are holding workshops and seminars to educate them about statistical and psychological biases that may influence their work. I’ve been to some of those talks; they’re very helpful. Being aware of bias is the first step toward eliminating it. Nonetheless, the rooms where these talks are held are rarely full, because postdocs and graduate students are expected to be in the lab, producing data. If irreproducibility were primarily due to poor training, it would be less widespread and limited to specific institutions. Rather, irreproducibility is a structural problem, and it therefore requires a structural-level solution, not an individual-level one.
Trying to improve scientific training without addressing the misalignment between academic incentives and research quality will only lead to short-term, incremental improvements. So as we teach young scientists to be aware of their own cognitive biases, and how to better use statistical methods, it is important to build an incentive system that rewards high-quality findings more than merely well-marketed ones. If young scientists seek recognition in a system that allows it to be achieved without ensuring quality, nothing will change. Of course, this does not mean un-replicated research should be penalized or removed from scientific discourse. But besides replication, there is currently no other objective metric capable of assessing the quality of a scientific finding.
Establishing replication as the primary metric for evaluating science would naturally produce better science overall, from conception to publication. Quality-based recognition will encourage positive behavioral shifts within the scientific community and further emphasize rigor, attentiveness, and skepticism.
Ahmed Alkhateeb is a postdoctoral research fellow at Harvard Medical School and Massachusetts General Hospital. His research focuses on stromal-tumor interactions in pancreatic cancer.
The lead photograph is courtesy of duncan c via Flickr.