“If you want science to move forward, you have to share it”
EMBO Director Fiona Watt discusses preprints, data sharing, and evaluation in light of EMBL’s new Open Science policy
In December 2021, EMBL announced its Open Science policy as part of its ongoing commitment to drive trust, transparency, and more inclusive research across the life sciences.
In light of the new policy, Victoria Yan, member of the EMBL Office for Scientific Information, talked to EMBO Director and EMBL Group Leader Fiona Watt about the advantages and challenges for scientists in adopting more open practices.
You served on the bioRxiv Scientific Advisory Board and as Deputy Editor of eLife, a journal that now exclusively reviews preprints. A requirement in the EMBL Open Science policy for publications is to post a preprint. Why are preprints important and what are the advantages they bring for researchers?
I really started thinking deeply about preprints several years ago, when I was asked to speak at the Wellcome Sanger Institute. It was a debate about whether posting preprints should be compulsory. I was speaking because of my involvement in bioRxiv.
If you want science to move forward, you have to share it. It should be more than just the glossy final publication; we should learn something of the process by which we get there, and the preprint version is part of that process.
The challenge in some fields is that scientists feel that their main advance is conceptual rather than data sharing. At the debate I attended, scientists who usually generate large amounts of data, for example genomics scientists, were much more in favour of preprints than scientists who do more conceptual work, for example in cancer and stem cells. Of course, posting preprints should protect your ideas. Ideally, if you are going to a conference and have posted a preprint, you should be more confident in sharing your work prior to final publication.
Another practical benefit of preprints is that, if you are on the job market, you can point to a preprint in your job and grant applications, instead of saying your paper is ‘submitted’. Assessors can then judge for themselves how mature the work is.
Posting a preprint allows scientists to interact earlier with fellow researchers. Evaluation and peer review of preprints without involving a journal is a new concept for many researchers. What is your perspective and experience with it?
One of the nicest things that has happened from a preprint interaction was that I received an email from a student in California, who had reviewed my preprint in class. The student wrote to me: “I really liked your preprint, and I have some suggestions for how to improve it. I just looked and you have posted another version, and you fixed some of those things. Good luck with your paper.” This was where I saw the educational value of preprint feedback and the opportunity for multiple people to improve the quality of your work.
I have learned a lot about refereed preprints since I came to EMBO. One aspect that we often discuss is mistakes in publications and how to address them. Sometimes mistakes in the published literature are almost perceived as crimes that should be punished. However, an honest mistake could be picked up at the preprint stage. This means you are not retracting a paper, but rather correcting a mistake as early as possible, and well before the final publication.
At EMBO we are working on Refereed Preprints with funding from HHMI. This platform is part of Review Commons, where you submit your paper, receive your referee reports, and then consider which journal you would like to submit to. You separate an honest evaluation of the quality of the work from other considerations. You will now have to post the preprint and the referee reports. We are interested in how researchers will respond, but I have a suspicion that there will be few objections.
You have submitted a paper to Review Commons. How was your own experience?
I’ve sent one paper to Review Commons so far. The reviewers were agnostic as to which journal the paper would end up in, so their comments were not about whether the paper was sufficiently good for a given journal. Instead, the reviews focused on whether the work was robust. My experience was completely positive: we ended up publishing the paper in our top-choice journal.
It is interesting to break down the publishing process into steps. You could even imagine that essentially you have a pool of peer-reviewed work, which journals would be able to improve further and curate.
There is a lot of conversation about whether those who do reviews get credit and of course that is important. At the same time, we have to protect reviewers to be able to give their honest feedback. There’s that aspect of peer review that we must not lose sight of.
An analogy for receiving feedback would be submitting a grant, when you ask your colleagues to review it. Some will simply say ‘nice job’, and some will go into considerable detail to help you improve the grant. The person who is really critical is the one who helps you more. Proper standards in peer review should be helpful for improving quality, rather than simply expressing an opinion on whether the topic is interesting.
EMBL researchers are required to have a Data Management Plan and publish the data on which a discovery and published paper are based. What are some challenges for generating and publishing FAIR data? How do you choose the right repository?
It’s really important that we don’t lose data. We have to think about wanting to access it now, and wanting to go back in a few years’ time. Years after publishing our work we may want to look at our own data differently, or to re-analyse other people’s data. This is the findability angle.
The challenge is relying on one institution to archive the data. There may be changes to the institution, and then accessibility may be lost. It is important to use data repositories that are well-established and respected in particular fields. I personally choose the most stable and accessible option for my own data, typically guided by the journal where we publish or by our funders’ requirements.
When it comes to interoperability, it’s challenging because many scientists are not experts in all aspects of technical interoperability. We all generate different types of data. For specific types of data like images, it’s more challenging than – say – genomic data. My lab relies on histology sections. We not only have many physical sections but we also scan the slides and save the data electronically. You end up with a massive amount of scanned data. There have been initiatives to deal with this kind of experimental data, and we are involved in high content imaging initiatives. We always share the pipeline for the analysis, but sharing data is challenging – it simply isn’t feasible to keep a record of every single cell or section. EMBL-EBI, for example, is working on image data repositories amongst many other databases, and I am supportive of these efforts.
Researchers publishing their data on trusted community repositories can use ORCID to be credited for their work. Could increasing the visibility of FAIR data and Open Software provide incentives for researchers to share more openly?
I suppose the challenge here is that, although these preprint review and open science activities are received positively by researchers, ultimately scientists can be quite conservative, and are focusing their efforts on what is essential for their careers and funding. The question for many researchers is whether it really matters that they were a referee for a preprint, or rather that they published a paper in a high-profile journal. I think we have a long way to go to convince everyone that we should judge scientists in a broader way. But I am optimistic.
EMBL Open Science policy
EMBL’s Open Science policy is part of its ongoing commitment to drive trust, transparency, and more inclusive research across the life sciences. It will expand on existing practice, and contribute to positive culture change across EMBL and more widely. To ensure this, the policy covers research assessment and fair attribution of credit. The policy also puts in place guidelines for EMBL staff regarding open and timely access to research results via publications, data, and software.
The policy includes the following requirements:
- All EMBL research publications should be made openly available in Europe PMC within 6 months of publication
- Publication of all manuscripts in a preprint server indexed by Europe PMC
- Publication of all articles with an Open Access licence (CC-BY)
- Submission of data as complete datasets that comply with FAIR data principles
- Software created as a research output or to support EMBL services should be open source by default
- All EMBL staff publishing research will be required to maintain an ORCID iD
- Publications should acknowledge EMBL – both as affiliation and source of funding – as well as any external grants used to conduct the research
For further information on Open Science at EMBL, visit our Open Science website.