The first cohort study linking data across categories was recently added to the Pathogens Portal Cohort Browser and is also accessible from the BioSamples browser. This pilot study comes from Erasmus Medical Centre through the ReCoDID project, one of the COVID-19 Data Platform's supporting projects.

The ReCoDID project is coordinated by Heidelberg University Hospital, where the clinical-epidemiological data are also curated before being uploaded to EMBL-EBI databases.

Now that cohort data linking has been shown to be possible, the Pathogens Portal Cohort Browser is actively seeking new submissions of cohort data and is keen to work with partners to support data sharing.

This work was made possible through funding from the European Union’s Horizon 2020 Research and Innovation Programme, with the innovative aim of linking clinical-epidemiological data and high-dimensional laboratory data in a centralised model. The goal is to build a long-term, sustainable platform for the storage, curation, and analysis of the complex data sets collected by infectious disease cohorts.

“Disease development is incredibly complex and can’t be untangled using just one data type,” said Marion Koopmans, Head of the Department of Viroscience at Erasmus Medical Centre. “We need connected datasets, such as this one, and we need a multitude of them. This enables us to peer into what is happening at a molecular level to understand the disease, how and why it affects individuals differently, and what treatment to recommend.”

Thanks to the efforts of several researchers at the Department of Viroscience at Erasmus MC, the study links clinical-epidemiological data with viral genomes and analytical laboratory data. This is the first such study in the public domain and a significant achievement following the coordinated efforts of multiple teams at Erasmus MC, EMBL-EBI and Heidelberg University Hospital.

“This is a real achievement and the first example that I know of for a connected datasets study, linking clinical data to laboratory datasets, in the public domain,” said Clara Amid from Erasmus MC. “To have a data platform that makes it possible to link different datasets is a fundamental step to facilitate data analysis for combined research from pathogen to host response.”

“Cohort studies produce invaluable data typically including a range of different data types,” explained Gabriele Rinck, Data Coordination Officer at EMBL-EBI. “Linking these data types on a participant level adds more depth to the dataset by bringing the pathogen/host data into context, allowing a more comprehensive analysis. This represents a huge potential to provide a better understanding of the impact of host factors on the severity of disease, potential risk factors, or efficacy of new treatments, vaccines and other interventions. Linking and sharing cohort data with the scientific community is key to maximising a study’s research output. This pilot project is paving the way for using this approach more widely for other cohort studies.”

“Personalised medicine involves using granular data generated to find the reasons for the differences in clinical severity and transmissibility. When we don’t have access to data from enough cases, we all suffer,” explained Thomas Jänisch, Senior Scientist at Heidelberg University Hospital. “Sharing data is an additional step, but it’s incredibly worthwhile, and the EMBL-EBI team really made a difference and were very helpful throughout the process.”

To submit your cohort data or find out more about the process, email cohort-dataflow@ebi.ac.uk.

Read the original announcement on the COVID-19 Data Portal.

To enquire about collaborating on the European COVID-19 platform, email ecovid19@ebi.ac.uk.

For further questions about sharing your data on the COVID-19 Data Portal, email virus-dataflow@ebi.ac.uk.
