Annual Report 2021

A year of exceptional life science research, training, service, industry collaboration, and integration of European life science research.

The power of open data

EMBL continues its leadership in providing open-access data.

The Darwin Tree of Life project uses genomic sequencing to catalogue biodiversity. Credit: Spencer Phillips/EMBL

EMBL’s commitment to creating and curating databases and resources for biological research is deeply rooted at EMBL-EBI.

For example, the EMBL-EBI Ensembl team addressed fundamental questions about biodiversity through its ongoing involvement in the Darwin Tree of Life project. This initiative is set to sequence the genomes of all 70,000 species of eukaryotic organisms in Britain and Ireland. EMBL-EBI is contributing its expertise in data coordination and genomic data analysis to annotate the diverse array of genomes and build the Darwin Tree of Life data portal.

In 2021, together with an international consortium, Genome 10K, EMBL-EBI established the most accurate way to assemble reference genomes. The study confirms that long-read sequencing methods are crucial for maximising genome quality. Access to more accurate reference genomes can help scientists answer fundamental questions about biology, disease, and biodiversity.

A new EMBL-EBI database was created to provide access to published polygenic risk scores (PGSs), which help assess a person’s inherited risk for certain diseases. The joint UK-US team also released a set of complementary guidelines to promote the validity, transparency, and reproducibility of PGS data.

A new €12 million Horizon Europe-funded project, BeYond-COVID (BY-COVID), continues to tackle the data challenges that can hinder effective pandemic response. BY-COVID expands upon the successful European COVID-19 Data Platform, a data resource EMBL-EBI initiated and set up early in the pandemic to ensure data can be mobilised, found, and used by a broad community of scientists.

“The value of open data has never been clearer,” said Rolf Apweiler, director of EMBL-EBI. “Open data and international collaboration are the bedrock of the unprecedented scientific effort to understand and fight the SARS-CoV-2 virus.”

In the coming years, EMBL’s data expertise and services will continue to grow, strengthening its role as a leader in the life sciences in Europe and globally.

“The Ensembl genome annotation pipelines have been redesigned to enable processing and release to the public of 250 genomes a month, 80 times faster than previously, in support of biodiversity projects in Europe, which aim to sequence 10,000 genomes by 2027.”

– Johanna McEntyre, Associate Director of EMBL-EBI Services

“Particularly memorable for me from 2021 is the renewed funding for the UniProt Consortium, a global collaboration. This continued grant funding underpins an ELIXIR Core Data Resource, and I have had the great pleasure of providing administrative support for UniProt since I started at EMBL-EBI in 2013 – essentially my entire time at EMBL.”

– Emma Sinha, Grants Office Supervisor at EBI administration


Rhie A et al. (2021). Towards complete and error-free genome assemblies of all vertebrate species. Nature, 28 April 2021, DOI: 10.1038/s41586-021-03451-0

Wand H et al. (2021). Improving reporting standards for polygenic scores in risk prediction studies. Nature, 10 April 2021, DOI: 10.1038/s41586-021-03243-6

Lambert SA et al. (2020). The Polygenic Score Catalog: an open database for reproducibility and systematic evaluation. medRxiv, 23 May 2020, DOI: 10.1101/2020.05.20.20108217

