Data sharing, reproducibility and peer review

I just reviewed my first manuscript where the authors provided a reproducible analysis (i.e., they shared their data and analysis script with the reviewers). This is something my coauthors and I have tried to provide with our recent studies, but it was my first time experiencing it as a referee.

I think it really helped, but it also raised new questions about traditional peer review.

The background: I was brought in to review this particular study of primate calling behaviour as the third reviewer (it had already been through one round of review). I was surprised to see that it had two major flaws that the other two reviewers had not mentioned. First, the authors had compared three statistical models for the social context in which the calls occur, under the premise that each model was based on a different hypothesis about the call’s adaptive function. However, they didn’t include a fourth “null” model (with no social variables). If the null model were a better fit to the data, it would invalidate their conclusion that the data supported one of the adaptive hypotheses. Second, they showed that social variables explained between 4% and 8% of the variation in calling behaviour, but they provided no estimates of uncertainty for those numbers (e.g., how sure can we be that the true value isn’t actually 0%?). So I wrote out these comments (along with many other, more specific ones) and sent in my review.

A couple of months later, I got the authors’ revised manuscript. I was pretty disappointed to see that, while they had carefully replied to all of my comments, they did not change their analysis. They did, however, include a copy of their data and R script. So I was able to run the analyses myself and attach the results to my second review. What did I find? None of the authors’ three candidate models fit the data better than a null model. Furthermore, a randomization test revealed that the variance explained by their social variables was no better than that obtained from completely randomized data. This means we can’t really draw any conclusions about the social function of the call from their results. The only conclusion is that perhaps we need better functional hypotheses.
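To make the null-model point concrete: here is a minimal, self-contained sketch of that kind of check, in Python rather than the authors’ R, using simulated data in which the “social” predictor has no real effect by construction. All names and numbers are illustrative, not the authors’ actual variables.

```python
import math
import random

def rss(y, yhat):
    """Residual sum of squares."""
    return sum((a - b) ** 2 for a, b in zip(y, yhat))

def gaussian_aic(n, k, rss_value):
    """AIC for a Gaussian model, up to an additive constant: n*ln(RSS/n) + 2k."""
    return n * math.log(rss_value / n) + 2 * k

def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(1)
n = 200
social = [random.gauss(0, 1) for _ in range(n)]  # hypothetical social predictor
calls = [random.gauss(0, 1) for _ in range(n)]   # calling rate, unrelated by construction

# Null model: intercept only (2 parameters: mean and residual variance)
mean_calls = sum(calls) / n
aic_null = gaussian_aic(n, 2, rss(calls, [mean_calls] * n))

# Candidate model: one social predictor (3 parameters)
b0, b1 = fit_line(social, calls)
aic_social = gaussian_aic(n, 3, rss(calls, [b0 + b1 * s for s in social]))

print(f"null AIC: {aic_null:.1f}, social AIC: {aic_social:.1f}")
# When the predictor carries no signal, the null model usually has the lower AIC.
```

If the null model comes out on top in a comparison like this, the “support” for any of the candidate adaptive hypotheses evaporates.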

As a referee, I was glad to have a chance to back up my claims and demonstrate my points to the editor and authors. But it raised a number of new questions about peer review. For example:

  • How do we credit intellectual contributions during peer review? On the first round, I made suggestions via written comments. On the second round, I actually wrote a script, ran a number of tests, and built figures to get my point across. I did this with the understanding (and hope) that the authors might actually use what I had done. Traditionally, reviewers sometimes get a mention in the acknowledgments as thanks, but is this enough if the reviewer wrote analyses or created figures?
  • On a related note: anonymity. This particular journal uses blind (anonymous) peer review, but given the extent of what I had contributed, I signed my R script.
  • How can shared data and scripts be used? According to the publisher’s policies for reviewers, I have to treat the materials shared with me as confidential documents. Suppose the authors’ flawed analysis gets accepted for publication. Am I allowed to use my work (and their data) to publish a response?
  • Why aren’t randomization tests more common in behavioural research? They are relatively easy to run, given modern computing power and scriptable statistical software. I think they should be standard, but I almost never see them.
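To illustrate that last point, a basic permutation (randomization) test really does take only a few lines. This is a hedged sketch in Python rather than R; `r_squared` and `permutation_p_value` are names I have made up for illustration, not functions from any library.

```python
import random

def r_squared(x, y):
    """Proportion of variance in y explained by a least-squares line on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Shuffle y relative to x to build the null distribution of R^2.

    Returns the fraction of shuffles whose R^2 matches or beats the
    observed one (with a +1 correction so p is never exactly zero).
    """
    rng = random.Random(seed)
    observed = r_squared(x, y)
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        if r_squared(x, y_shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

If the observed variance explained sits comfortably inside the shuffled distribution, it is indistinguishable from noise — the same logic as the randomization test I described above.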

Peer reviewers have always made important intellectual contributions, at substantial time cost, and without getting direct benefits (I do think there are many indirect benefits to peer review, but that’s another story). In an era of shared data and reproducible analyses, it’s possible to make more direct contributions than ever before, and the system may need to change. I’m just not sure how.

The one thing I’m certain about is that sharing the reproducible analysis had major benefits for the peer review process in this case. Because the authors shared their work, I could provide evidence for my claims as a reviewer, and I could communicate those claims more efficiently. It might also help prevent papers from being published with obvious, avoidable errors.