EMBL-EBI Senior Scientist Rob Finn explains why data coordination and sharing are fundamental for a sustainable blue economy
For the life sciences to make major contributions to the blue economy, data needs to be systematically collected, openly shared and carefully analysed.
The blue economy harnesses the enormous potential of marine resources for economic growth and improved livelihoods, while also avoiding overexploitation and negative environmental impacts. The concept covers a wide range of economic activities, including aquaculture, renewable energy, and biotechnology.
By Rob Finn, EMBL Senior Scientist and Head of Microbiome Informatics at EMBL-EBI
A report by the Organisation for Economic Co-operation and Development (OECD) projected that the blue economy would double in size between 2010 and 2030, reaching three trillion USD and employing 40 million people. The possibilities are endless, as are the challenges. In recent years, EMBL has been exploring different ways of leveraging molecular data for the benefit of the blue economy.
An ocean of data
The first challenge has been extracting knowledge from the huge volumes of metagenomic data being collected worldwide. This requires systematic data collection, ensuring each sample is accompanied by metadata: a combination of contextual information, such as when and where a sample was collected, and environmental parameters, such as temperature, depth, and pH. Logging metadata is time-consuming, but it ensures the scientific data can be compared in context with other samples and helps researchers understand differences between the organisms found in each sample. Rich metadata enables data reanalysis and reuse to answer new research questions.
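As a rough illustration of what logging metadata involves, the sketch below models one sample record and reports which fields were left unrecorded. The class, the field names, and the example values are all invented for illustration; they are not the schema of any EMBL-EBI resource.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SampleMetadata:
    """Hypothetical sample record; field names are illustrative only."""
    sample_id: str
    # Contextual information: when and where the sample was collected.
    collection_date: Optional[date] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    # Environmental parameters measured at the sampling site.
    temperature_c: Optional[float] = None
    depth_m: Optional[float] = None
    ph: Optional[float] = None


def missing_fields(sample: SampleMetadata) -> list[str]:
    """Return the names of metadata fields that were not recorded."""
    return [name for name, value in vars(sample).items() if value is None]


# Invented example values, not real measurements.
sample = SampleMetadata(
    sample_id="example-001",
    collection_date=date(2023, 6, 1),
    latitude=48.7,
    longitude=-3.9,
    temperature_c=14.2,
)
print(missing_fields(sample))  # ['depth_m', 'ph']
```

A check like this, run at submission time, is one simple way to catch gaps that would later make a sample hard to compare with others.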
What is metagenomics?
Metagenomics is the study of the structure and function of DNA isolated and analysed from all the organisms (typically microbes) in a bulk sample. Metagenomics is often used to study a specific community of microorganisms, such as those residing on human skin, in the soil or in a water sample.
EMBL-EBI’s public data resources are akin to digital libraries where any researcher can see what data has already been collected and access relevant information for their research question. The European Nucleotide Archive, for example, holds billions of DNA sequences, including from major marine studies, such as the Tara Oceans mission. My team manages a resource called MGnify, which enables access to analysed microbiome data and adds value by systematically organising the data into large collections of different data types (e.g. proteins) ready for data mining.
EMBL-EBI is supporting a range of scientific endeavours collecting molecular data from our oceans and seas and cataloguing them so we know where the samples came from and what they represent, and so that they are easier to find and interpret using computational methods. Computational analysis is the only way to work with such datasets, which are much too big to sift through manually.
A recent example of a major scientific endeavour is EMBL’s ‘TRaversing European Coastlines’ (TREC) expedition. This is the first pan-European initiative to systematically collect molecular data from 120 coastal sampling sites across 22 European countries, with the ultimate aim of studying coastal ecosystems and their response to the environment. The sampling began in spring 2023 and will continue until summer 2024. The huge volumes of data collected will be made publicly available so researchers around the world can explore a wide range of questions, and develop new solutions for the blue economy.
Reusing fish bones, better fish feed, and carbon sequestration
Another challenge is asking the right research questions and concentrating research efforts on applications that will benefit humans and the environment alike. One of the more advanced research areas in the blue economy is aquaculture – the farming of aquatic organisms, including fish, shellfish, and plants, to produce food sustainably.
In the circular blue economy, a key activity is making new, valuable products out of what would otherwise be waste. As humans, we enjoy eating fish filleted, but the bones, skin and other parts of the fish go to waste. What if there was a way to break down waste products, such as fish bones, into something more useful, to improve the profitability and sustainability of aquaculture? This is one of the questions the BlueRemediomics project, funded by Horizon Europe, is exploring.
Another promising research question is whether we can identify more sustainable fishmeal. Currently, fishmeal consists mostly of forage fish, such as anchovies and herring, and harvesting them has caused the collapse of some fish populations.
The HoloFood project, funded by Horizon 2020, has used microbiome and other molecular data types to understand the links between different salmon feeds, the microbes living in the fish’s stomach, salmon health, and meat quality. These complex connections were explored by leveraging an increasingly recognised scientific concept – the holobiont, which is a host animal considered together with the microbes living in and around it and the surrounding environment. So if we take a salmon, for example, its holobiont consists of the salmon itself, the microbes living inside and around it, and its environment.
Another very different but interesting application is carbon sequestration. The ocean is full of marine microbes that are primary producers, which means they are organisms that acquire their energy from sunlight. Like plants, they absorb carbon dioxide from the atmosphere through photosynthesis, oxygenating the planet in the process. When these organisms die, they sink to the bottom of the ocean, safely locking away the carbon. The BlueRemediomics project is exploring whether algae can be used more widely for carbon capture, and looking into society’s appetite for using such methods to combat climate change.
Impact of human activity on our oceans
Beyond the economic potential of the blue economy, there is also the impact that human activity has on our oceans. By sampling aerosol, soil, and seawater from the European coastlines as part of the TREC expedition, researchers are trying to understand the effects of changing environments on organisms and communities, at cellular and molecular levels.
This will be the first dataset of its kind: multiomics data, systematically collected on land and sea across the European coastlines. With the right curation and analysis tools, it holds the key to exploring fundamental questions about our oceans and coastlines, including the effects of organic and inorganic pollutants (such as the UV blockers used in sun cream), the impact of global warming and acidification on ecosystems, the discovery of new antibiotics, and tracking antimicrobial resistance spread.
Through the lens of data
Big data means big potential, but also big challenges. EMBL-EBI and partners are exploring new ways to store and curate the increasing amount of data submitted to our resources, and to present it in a multitude of different ways so it can answer specific research questions. We are also ensuring that the provenance of the data is clear so any access and benefits can be traced according to the relevant protocols.
My team’s immediate challenge is to present data in manageable bites for the ocean research community to work on – for example, if we’re talking about a specific enzyme with interesting properties, such as plastic degradation. The MGnify data resource contains 3 billion proteins from metagenomic studies. This is orders of magnitude larger than ‘traditional’ protein databases. So there is lots of potential novelty in this huge dataset, but it is hard to handle, and many of the tools currently used by scientists do not scale. My team of software engineers, web developers, and bioinformaticians are working on new ways to overcome the current issues and make sure the data is usable by all.
We need to enable researchers to see specific ‘slices’ of this dataset, for example, seeing enzyme diversity for marine samples, maybe for a specific temperature and pH. This is all possible, but it can take some time to search through the data, so we are focusing on supporting projects that have a specific application, whether that is plastic degradation, breaking down toxic substances, cosmetic products with bioactive ingredients, or something completely different.
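To make the idea of a "slice" concrete, here is a minimal sketch that counts enzyme families among marine records within a chosen temperature and pH window. The records, field names, and the `family_counts` function are invented for illustration; this is not how MGnify stores or serves its data, and a real query over billions of proteins would need indexed storage rather than a Python loop.

```python
from collections import Counter

# Invented example rows; field names and values are illustrative only.
RECORDS = [
    {"protein": "p1", "family": "hydrolase", "biome": "marine", "temperature_c": 4.0, "ph": 8.1},
    {"protein": "p2", "family": "hydrolase", "biome": "marine", "temperature_c": 15.0, "ph": 8.0},
    {"protein": "p3", "family": "oxidase", "biome": "marine", "temperature_c": 16.5, "ph": 7.9},
    {"protein": "p4", "family": "oxidase", "biome": "soil", "temperature_c": 15.0, "ph": 6.5},
    {"protein": "p5", "family": "lipase", "biome": "marine", "temperature_c": 14.0, "ph": None},
]


def family_counts(records, biome, temp_range, ph_range):
    """Count enzyme families for one biome within temperature and pH ranges."""
    low_t, high_t = temp_range
    low_ph, high_ph = ph_range
    counts = Counter()
    for record in records:
        if record["biome"] != biome:
            continue
        temperature, ph = record["temperature_c"], record["ph"]
        # Records with missing metadata cannot be placed in the slice.
        if temperature is None or ph is None:
            continue
        if low_t <= temperature <= high_t and low_ph <= ph <= high_ph:
            counts[record["family"]] += 1
    return counts


result = family_counts(RECORDS, "marine", (10.0, 20.0), (7.5, 8.5))
print(dict(result))  # {'hydrolase': 1, 'oxidase': 1}
```

Note that the record with no pH value drops out of the slice entirely, which is why complete metadata matters for this kind of search.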
Finally, there is only so much we can achieve with data, and the next steps of discovery require wet lab experiments, proof of concept, scalable solutions, legislation etc. But robust data that is easily accessible, carefully curated, and that can be mined is the first step towards speeding up discoveries for the blue economy.
The TREC expedition and the two projects funded by Horizon Europe – BlueRemediomics and BIOcean5D – are part of EMBL’s Planetary Biology and Microbial Ecosystems transversal themes, which aim to understand how microbes, plants, and animals respond to each other and to their environment.