Here, Professor Thornton explains why organisations should embrace open science sooner rather than later, and what it can help them achieve.
What is open science?
Open science is an approach that aims to spread knowledge in a FAIR – Findable, Accessible, Interoperable and Reusable – way.
Open science covers research results, which are often published in scientific papers, as well as data, software, protocols, and metadata that make the research easy to reproduce.
EMBL’s European Bioinformatics Institute (EMBL-EBI) is at the heart of open science because its data resources are freely available for everyone in the world to use. Researchers submit data in a standard format, and experts at EMBL-EBI curate, annotate, and store it, making it openly available for anyone to reuse. This facilitates rapid progress, essentially ‘building on the shoulders of giants’.
Open access literature is another essential part of open science and a topic that has been the subject of much discussion and action in the last 20 years. Some institutes, including EMBL, now require all their publications to be open access – available to read free of charge. This can be expensive for the authors, but is particularly appropriate for research funded by public money.
Why is open science important?
Validating results by repeating experiments in different laboratories is important in the life sciences because of the complexity of the work. We need to be confident that our interpretations are correct, and the transparent approach of open science helps us achieve this.
To support this approach, EMBL has recently developed an Open Science Policy which lays out best practice and encourages positive culture change. The policy covers research assessment and fair attribution of credit. It also puts in place guidelines for EMBL staff regarding open and timely access to research results via publications, data, and software.
Open science also speeds up scientific progress on a global scale, creating a rich and collaborative ecosystem spanning academia, government, and the private sector.
What has open science helped you achieve?
My research has been totally dependent on open science and open data. My group’s work would be impossible without the Worldwide Protein Data Bank (wwPDB), which gives open access to protein structure data. This database, which is now 50 years old, has allowed my group to explore how protein sequences and structures determine their functions and reveal their evolution. Ultimately, this incredible resource has led to accurate structure predictions for millions of protein sequences, which are now openly available and provide many biological insights. One example is the AlphaFold Protein Structure Database, a collaborative project from DeepMind and EMBL-EBI.
My group also developed software and databases for structure validation (PROCHECK), for graphical summaries characterising each PDB file (PDBsum), for enzymes and their mechanisms (M-CSA), and for variants in 3D (VarSite). We initially developed these for our own research, but by making them freely available, we have empowered research worldwide. This has been good for science, but also good for our group, and I strongly encourage everyone to embrace this openness.
What can researchers do to make science more open?
The key is to think about how you can make your work as transparent as possible.
Firstly, publish in a way that ensures the papers are open access. Secondly, make your data and software freely available to all. You can do this by submitting to open access data resources, like the ones managed by EMBL-EBI. You should also place software on central hubs and submit papers as preprints. As a last resort, you can make your data and tools available on your local computing system, so that others can download them.
Researchers also have a role to play in raising awareness of the benefits of open science. This may involve lobbying your institute or government. Showing support for open data resources, which are usually publicly funded, is important for impressing on funders and governments how critical these resources are.
On a wider scale, organisations like ELIXIR and the Global Biodata Coalition (GBC) make a difference to how open science and open data are coordinated worldwide. The GBC is an international effort to identify ‘core’ biological data resources, which are widely used and indispensable, and raise funds to sustain them. This organisation deserves support if we are to reach an open science equilibrium. Charging for data limits progress and is inefficient, especially on a global scale – so data must remain open whenever possible.
Lastly, I encourage everyone to be open about their science – it is good for others, but fundamentally it is also good for you as a scientist. Of course, it is sensible to publish – and patent, if appropriate – first, but sharing data and tools will propel our science forward to address some of the major global challenges we are facing.
How researchers can make their science open
– Publish in open access journals
– Submit papers as preprints
– Submit your data to public data resources
– Place software in central hubs
– Lobby your institute to support open science
– Be open and transparent about your science
– Get involved with science communication and public engagement