Researchers across EMBL are helping to make artificial intelligence (AI) models for bioimaging analysis interoperable and openly available to the scientific community.
A new project aims to empower life science researchers to harness the full potential of AI and machine learning methods for bioimage analysis.
Using AI approaches for bioimage analysis helps to increase the pace of life science and medical research. Automating parts of image analysis enables researchers to analyse their microscopy data quickly and easily. This ultimately increases our understanding of health, disease, and how life functions at a molecular level.
AI4LIFE – funded by the European Commission and orchestrated by Euro-BioImaging – brings together a consortium of ten partners from a variety of computational and life science backgrounds.
This international consortium includes partners from EMBL Heidelberg, EMBL-EBI UK, University Carlos III Madrid, KTH Royal Institute of Technology Stockholm, Institute Gulbenkian Lisbon and Human Technopole Milan.
The project aims to create ways to use and share AI tools and methods and help deliver these to researchers through outreach and advanced training.
“One important aim of the AI4LIFE project is to reach a diverse and global user base, who need AI solutions to take their research questions to the next level,” said Antje Keppler, Euro-BioImaging Bio-Hub Director. “Together with four other research infrastructure partners, we intend to provide dedicated AI services to the biological imaging, structural biology, marine biology, plant phenotyping communities, among others.”
Euro-BioImaging is a European research infrastructure for biological and biomedical imaging. The European Molecular Biology Laboratory (EMBL) hosts the Euro-BioImaging Bio-Hub and general data services.
Bridging disciplines for better AI models
Incomplete documentation of bioimaging tools and AI models can make them difficult to use. Also, insufficient communication across scientific communities – life scientists producing and studying bioimages and computational biologists creating image processing models – means there can be a disconnect between the people creating and using these methods.
AI4LIFE aims to address these challenges by providing:
AI-based image analysis methods as Findable, Accessible, Interoperable and Reusable (FAIR) services
integrated access to cloud computing resources for the evaluation of pre-trained models
standards for submission, storage of, and FAIR access to reference data, ground truth data annotations, trained AI models, and trainable AI methods
simple model deployment, sharing, and dissemination of AI-based methods as a new developer-facing service
open calls and challenges for outstanding image analysis problems in the EU Mission areas of the Horizon Europe framework programme
outreach and training for life scientists using image analysis
What are ground truth data annotations?
Ground truth is a term used to describe a component of the data used to train an AI model. These data are considered to be ‘true’ (i.e. matching the result that would be obtained from a perfect manual observation). Using data with ground truth annotations can be considered the gold standard for training an AI model.
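To make the idea concrete, here is a minimal sketch of how a model's output might be scored against a ground truth annotation. It assumes a segmentation task in which each image is a small grid of pixels labelled 1 (cell) or 0 (background). The masks, the function name, and the choice of intersection over union as the score are illustrative assumptions, not part of the AI4LIFE project.

```python
# Illustrative only: tiny 4x4 binary masks, 1 = cell, 0 = background.
# The values are invented for this sketch, not taken from any real dataset.
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
prediction = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]


def intersection_over_union(truth, predicted):
    """Return overlap / union of the foreground pixels in two same-sized masks."""
    overlap = 0
    union = 0
    for truth_row, predicted_row in zip(truth, predicted):
        for t, p in zip(truth_row, predicted_row):
            overlap += 1 if (t and p) else 0
            union += 1 if (t or p) else 0
    # Two empty masks agree perfectly, so define their score as 1.0.
    return overlap / union if union else 1.0


score = intersection_over_union(ground_truth, prediction)
print(f"IoU against ground truth: {score:.2f}")
```

A score of 1 would mean the prediction matches the annotation pixel for pixel. Lower scores indicate disagreement, which is why the quality of the annotation itself matters so much.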
“It’s vital to bring together the expertise of researchers from a multitude of disciplines if we want to create robust, interoperable AI-based tools for bioimage analysis,” said Anna Kreshuk, Group Leader at EMBL and co-coordinator of the AI4LIFE project. “There is so much we can gain from using these AI models, but we also need to make sure that they’re accessible to everyone and that the models we create are user-friendly and transparent, by providing access to the data used to train them.”
Access to AI models for image analysis
There are many open repositories – such as the BioImage Model Zoo created by the AI4LIFE consortium – where researchers can share their trained AI models and tools for life science imaging data. One challenge with developing and sharing such AI models is that the amount of data used to train them tends to be very large. It is, however, critically important to share these data alongside the AI models to make the models interoperable and reproducible.
“AI models are only as good as the data they’re trained on. EMBL-EBI’s role within the AI4LIFE project is to help host the image data and the ground truth annotations that go into making AI models,” said Matthew Hartley, BioImage Archive Team Leader at EMBL-EBI. “It’s very important that the raw images and annotations used to train an AI model are stored together and are FAIR, partly for model reproducibility, but also to enable researchers to train new and different models using these existing data.”
As part of the AI4LIFE project, Hartley and his team will help to develop standards, infrastructure, and pipelines for hosting the data used to train AI models, as well as a way to link these data to the models themselves. Robust standards and clear links between models and data will help to increase the interoperability and usability of the AI models.