To foster both research and services, the EMBL Data Science Centre has been established to ensure that data generated as part of the Molecules to Ecosystems Programme are expertly curated, annotated, managed, integrated, visualised, and shared.

Across EMBL’s scientific activities, biodata must be coordinated to enable cutting-edge, data-driven research and efficient services. This includes the growing volume and heterogeneity of biological and environmental data arising from molecular biology research that studies life in context. In line with EMBL’s Open Science policy, the Data Science Centre aims to make all data findable, accessible, interoperable, and reusable (FAIR) throughout the data life cycle at EMBL. The use of standardised workflows for recurrent research activities, such as those in large-scale consortia studies, will be extended. EMBL scientists are developing machine learning and artificial intelligence approaches in the life sciences and other disciplines so that successful techniques are carried forward. Additionally, EMBL fosters technical infrastructure that allows data services to be shared publicly. To ensure these goals are met, EMBL trains staff and fellows to use these data science methods and tools effectively, while also preparing them for future careers. From research through to services and training, the Data Science Centre aims to act as a model for life science institutions that face similar data-driven challenges.

[Infographic: data sciences integration]
EMBL’s new Data Sciences Programme aims to connect data science centres focused on five priority areas at all EMBL sites. Credit: Spencer Philips/EMBL

“The exponential increase in the volume of biological data every 18 months is a challenge that many institutions face. It underscores the need for this data sciences plan. We want to tackle data challenges by working together with our staff and fellows and ultimately share our solutions with others.”

