{"id":55518,"date":"2023-02-09T11:05:28","date_gmt":"2023-02-09T10:05:28","guid":{"rendered":"https:\/\/www.embl.org\/news\/?p=55518"},"modified":"2024-11-26T11:59:12","modified_gmt":"2024-11-26T10:59:12","slug":"alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe","status":"publish","type":"post","link":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/","title":{"rendered":"Case study: AlphaFold uses open data and AI to discover the 3D protein universe"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Challenge: Predicting how proteins fold<\/h2>\n\n\n\n<p>Proteins make up all living things. There are millions of proteins and each one has a unique shape, also called a structure. But protein structures can be elusive, and it can take years of experimental work to determine them.<\/p>\n\n\n<div\n  class=\"vf-embed vf-embed--custom-ratio\"\n\n  style=\"--vf-embed-max-width: 100%;\n    --vf-embed-custom-ratio-x: 640;\n    --vf-embed-custom-ratio-y: 360;\"><iframe loading=\"lazy\" width=\"640\" height=\"360\" src=\"https:\/\/www.youtube.com\/embed\/KpedmJdrTpY\" frameborder=\"0\" allow=\"accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe><\/div>\n\n\n\n<div style=\"height:18px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>Since 1958, when John Kendrew determined the world\u2019s first protein structure by decoding&nbsp;myoglobin \u2013 a protein found in the heart and skeletal muscle tissue of mammals \u2013 almost 200,000 protein structures have been demonstrated using experimental methods. 
But this is a drop in the ocean compared with the more than 200 million known proteins.<\/p>\n\n\n\n<figure class=\"vf-figure wp-block-image size-full is-style-default\"><img loading=\"lazy\" decoding=\"async\" width=\"1000\" height=\"600\" class=\"vf-figure__image\" src=\"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/Kendrew_protein.jpg\" alt=\"\" class=\"wp-image-55524\" srcset=\"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/Kendrew_protein.jpg 1000w, https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/Kendrew_protein-300x180.jpg 300w, https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/Kendrew_protein-768x461.jpg 768w\" sizes=\"auto, (max-width: 1000px) 100vw, 1000px\" \/><figcaption class=\"vf-figure__caption\">John Kendrew (left) and a myoglobin structure available in the Protein Data Bank in Europe (PDBe) (right). Credit: Wikipedia and PDBe. <a href=\"https:\/\/www.ebi.ac.uk\/pdbe\/entry\/pdb\/5xkw\" target=\"_blank\" rel=\"noreferrer noopener\">PMID: 28795725<\/a>.<\/figcaption><\/figure>\n\n\n\n<p>To address this challenge, scientists turned to computational methods to help predict protein structures. How a protein folds itself into its unique shape is one of biology\u2019s greatest mysteries \u2013&nbsp;and it remains unsolved to this day.&nbsp;<\/p>\n\n\n\n<article class=\"vf-card vf-card--brand vf-card--bordered vf-u-margin__bottom--800\" default>\n  <div class=\"vf-card__content | vf-stack vf-stack--400\">\n      <h3 class=\"vf-card__heading\">\n      Why are protein structures important?    <\/h3>\n                <p class=\"vf-card__text\"><span style=\"font-weight: 400;\">Protein structures help us understand how proteins work and what function they fulfil. This, in turn, helps researchers develop hypotheses about how to control or modify a protein\u2019s function. For example, in the case of proteins linked to diseases, the protein\u2019s structure is useful for developing drugs to treat the disease. 
Similarly, many diseases happen when proteins don&#8217;t fold correctly.\u00a0<\/span><\/p>\n      <\/div>\n<\/article>\n\n\n\n\n<p>In 1994, computational biologist John Moult set up the <a href=\"https:\/\/predictioncenter.org\/index.cgi\">Critical Assessment of protein Structure Prediction (CASP)<\/a>, an international competition where groups of researchers develop computational models to predict protein folding. For many years, computational methods struggled to match the accuracy of the more reliable, but time-consuming, experimental methods. But 2020 marked a shift in the tide.&nbsp;<\/p>\n\n\n\n<p>At the end of the <a href=\"https:\/\/www.predictioncenter.org\/casp14\/index.cgi\">CASP14<\/a> competition that year, the judges announced to a stunned audience that the AlphaFold 2 system developed by London-based AI company <a href=\"https:\/\/www.deepmind.com\/\">DeepMind<\/a> had achieved a level of accuracy comparable to experimental methods. AlphaFold 2\u2019s predictions were, on average, less than one tenth of a nanometer off the positions determined by experiments. This had never been achieved before.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Solution: AlphaFold &#8211; built on open data<\/strong><\/h2>\n\n\n\n<p>AlphaFold 2 is an attention-based neural network system, trained end-to-end, meaning all of the different steps of the process are trained simultaneously rather than sequentially. This gives more flexibility to the network, allowing it to dynamically learn interactions between non-neighboring nodes. AlphaFold 2 also uses evolutionarily related protein sequences, multiple sequence alignments, and a representation of amino acid residue pairs to refine its predictions.&nbsp;<\/p>\n\n\n\n<p>Additionally, AlphaFold 2 offers two reliability metrics which provide confidence estimates in the predicted protein structure. 
This tells scientists how accurate the predictions are likely to be.<\/p>\n\n\n\n<blockquote class=\"vf-blockquote | vf-u-margin__bottom--600 vf-u-margin__top--600\">\n  <div>\n    <div>\n      &#8220;Public data were essential to the development of AlphaFold.\u201d    <\/div>\n    \n          <footer class=\"vf-u-margin__top--600\">\n      \n      <div class=\"vf-blockquote_author\">\n        John Jumper, Senior Staff Research Scientist at DeepMind &#038; AlphaFold Team Lead      <\/div>\n\n      \n      <div class=\"vf-blockquote_author__details\"><\/div>\n    <\/footer>\n      <\/div>\n<\/blockquote>\n\n\n\n<p><br \/>Like any AI method, AlphaFold required a lot of training data to \u2018learn\u2019 how to make accurate predictions. DeepMind trained AlphaFold on publicly-available data, such as those managed and supported by EMBL\u2019s European Bioinformatics Institute. DeepMind used, among other data sources:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Experimentally-determined protein structures from the <a href=\"https:\/\/wwpdb.org\/\">Protein Data Bank<\/a><\/li>\n\n\n\n<li>Protein sequences and annotations from <a href=\"https:\/\/www.uniprot.org\/\">UniProt<\/a><\/li>\n\n\n\n<li>Metagenomics data from <a href=\"https:\/\/www.ebi.ac.uk\/metagenomics\/\">MGnify<\/a><\/li>\n<\/ul>\n\n\n\n<p>\u201cPublic data were essential to the development of AlphaFold,\u201d said John Jumper, Senior Staff Research Scientist at DeepMind and AlphaFold Team Lead. 
\u201cThe careful curation of such large data resources, representing the collective output of an entire subfield of biology, is exactly what enables our machine learning models to generalise well across such a huge range of proteins, enabling further breakthroughs in machine learning in other scientific areas.&nbsp;<\/p>\n\n\n\n<p>\u201cWe expect that the expansion of metagenomic efforts like MGnify will be very important for increasing both AlphaFold accuracy and the impact of predicted structures on discovering novel enzymes in metagenomic samples.<\/p>\n\n\n\n<p>\u201cIn addition to training resources, we found the annotations provided important to understanding and debugging model performance during the development of AlphaFold. Particularly, the residue annotations in UniProt were essential in establishing the link between AlphaFold confidence and protein disorder. Before consulting these UniProt disorder annotations, we had no idea how to interpret the long stretches of low confidence that would accompany many AlphaFold predictions.\u201d<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>What followed: Sharing AlphaFold predictions with the world<\/strong><\/h2>\n\n\n\n<p>During the months following DeepMind\u2019s success at CASP14, there was endless speculation about what the company would do with its revolutionary AI, specifically whether it would make the code open source for the community to use.<\/p>\n\n\n\n<p>In 2021, DeepMind announced that not only would it publish the AlphaFold 2 source code, but it would also make its predicted structures freely-available to all through the <a href=\"https:\/\/alphafold.ebi.ac.uk\/\">AlphaFold Protein Structure Database<\/a>, a collaborative project with EMBL-EBI.&nbsp;<\/p>\n\n\n\n<blockquote class=\"vf-blockquote | vf-u-margin__bottom--600 vf-u-margin__top--600\">\n  <div>\n    <div>\n      \u201cEMBL\u2019s support in developing the AlphaFold Database was crucial; it significantly amplified the impact and 
reach of AlphaFold predictions across the global scientific community.\u201d    <\/div>\n    \n          <footer class=\"vf-u-margin__top--600\">\n      \n      <div class=\"vf-blockquote_author\">\n        John Jumper, Senior Staff Research Scientist at DeepMind &#038; AlphaFold Team Lead      <\/div>\n\n      \n      <div class=\"vf-blockquote_author__details\"><\/div>\n    <\/footer>\n      <\/div>\n<\/blockquote>\n\n\n\n<p>DeepMind approached EMBL-EBI to support the development of the database because of the institute&#8217;s decades of experience in managing the world\u2019s biological data. EMBL-EBI could ensure the AlphaFold Database would fulfil the requirements of the scientific community. It could also support the integration of AlphaFold prediction with other molecular databases, to help give researchers the most comprehensive information possible on the protein of their choice.&nbsp;<\/p>\n\n\n\n<p>\u201cEMBL-EBI champions open data, so when DeepMind approached us about developing a database for AlphaFold predictions, we were quick to respond,\u201d explained <a href=\"https:\/\/www.ebi.ac.uk\/people\/person\/sameer-velankar\/\" target=\"_blank\" rel=\"noreferrer noopener\">Sameer Velankar, EMBL-EBI Team Leader for the Protein Data Bank in Europe<\/a>. 
\u201cOur technical and scientific teams worked together and we managed to develop, test, and launch the database in record time.\u201d<\/p>\n\n\n\n<p>EMBL-EBI\u2019s <a href=\"https:\/\/www.ebi.ac.uk\/pdbe\/\">Protein Data Bank in Europe (PDBe)<\/a> team worked closely with DeepMind to develop the AlphaFold Database, which launched with approximately 350,000 protein structure predictions, including the vast majority of human proteins \u2013&nbsp;a key dataset for healthcare research.&nbsp;<\/p>\n\n\n\n<div class=\"vf-box vf-box--normal vf-box-theme--primary | vf-u-margin__bottom--400\">\n      <h3 class=\"vf-box__heading\">\n                EMBL-EBI\u2019s contribution:                  <\/h3> \n        <ul class=\"box-list\">\n<li>Co-development of the AlphaFold Database<\/li>\n<li>Data standards \u2013 to ensure high quality<\/li>\n<li>Data curation \u2013 to make predictions easy to find and analyse<\/li>\n<li>Data integration \u2013 cross-linking AlphaFold predictions with other biomolecular data resources<\/li>\n<\/ul>\n<\/div>\n\n\n<p>And this was only the beginning. The DeepMind and EMBL-EBI teams continued to work together to update the database. 
One of the challenges they faced was the sheer size of the dataset, and the need to quickly scale up to make millions more structure predictions available.&nbsp;<\/p>\n\n\n\n<p>In January 2022, they added protein structure predictions for 17 organisms on the <a href=\"https:\/\/www.ebi.ac.uk\/about\/news\/updates-from-data-resources\/alphafold-update-neglected-tropical-diseases\/\">World Health Organisation\u2019s neglected tropical diseases list<\/a> and 10 organisms on its antimicrobial resistance list, bringing the total number of predictions in the database up to almost 1 million.&nbsp;&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Exceeding expectations: 200 million protein structure predictions<\/h2>\n\n\n\n<p>Just one year after the launch, in a gargantuan effort, EMBL-EBI and DeepMind released a major update, covering over <a href=\"https:\/\/www.ebi.ac.uk\/about\/news\/technology-and-innovation\/alphafold-200-million\/\">200 million protein structure predictions. <\/a>This includes almost all \u2018known\u2019 proteins that have sequences in the UniProt database.&nbsp;<\/p>\n\n\n\n<p>This meant that for the first time ever, scientists had a comprehensive view of the protein universe in 3D, and a prediction for almost every protein that has ever been catalogued. 
In early 2022, the number of researchers who had accessed the database since launch reached nearly one million, with these users viewing 3 million protein structures.<\/p>\n\n\n\n<div class=\"vf-box vf-box--is-link vf-box--normal vf-box-theme--primary | vf-u-margin__bottom--400\">\n      <h3 class=\"vf-box__heading\">\n              <a class=\"vf-box__link\" href=\"https:\/\/www.embl.org\/people\/person\/edith-heard\/\">\n                    AlphaFold Database in numbers                <\/a>\n                  <\/h3> \n        <ul class=\"box-list\">\n<li>200 million protein structure predictions<\/li>\n<li>1 million organisms<\/li>\n<li>2 million users in 190 countries<\/li>\n<li>AlphaFold Nature papers cited more than 28,000 times<\/li>\n<\/ul>\n<p class=\"vf-box__text\">*between July 2021 and January 2023<\/p>\n<\/div>\n\n\n<blockquote class=\"vf-blockquote | vf-u-margin__bottom--600 vf-u-margin__top--600\">\n  <div>\n    <div>\n      \u201cThe popularity and growth of the AlphaFold Database is testament to the success of the collaboration between DeepMind and EMBL. It shows us a glimpse of the power of multidisciplinary science.\u201d    <\/div>\n    \n          <footer class=\"vf-u-margin__top--600\">\n      \n      <div class=\"vf-blockquote_author\">\n        Edith Heard, Director General of EMBL      <\/div>\n\n      \n      <div class=\"vf-blockquote_author__details\"><\/div>\n    <\/footer>\n      <\/div>\n<\/blockquote>\n\n\n\n<p>EMBL-EBI has also helped to integrate the AlphaFold 2 protein structure predictions into other public data resources, so the global scientific community can use this additional data source to gain new insights and answer pressing questions. 
Making AlphaFold predictions FAIR (Findable, Accessible, Interoperable and Reusable) is essential for maximising the impact of the data.&nbsp;<\/p>\n\n\n\n<div class=\"vf-box vf-box--normal vf-box-theme--primary | vf-u-margin__bottom--400\">\n      <h3 class=\"vf-box__heading\">\n                AlphaFold data flow                  <\/h3> \n        <p class=\"vf-box__text\"><span style=\"font-weight: 400;\">AlphaFold was trained on\u00a0<\/span><a href=\"https:\/\/www.uniprot.org\/\"><span style=\"font-weight: 400;\">UniProt<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/www.ebi.ac.uk\/metagenomics\/\"><span style=\"font-weight: 400;\">MGnify<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/www.rcsb.org\/\"><span style=\"font-weight: 400;\">RCSB PDB<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/uniclust.mmseqs.com\/\"><span style=\"font-weight: 400;\">Uniclust<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/bfd.mmseqs.com\/\"><span style=\"font-weight: 400;\">BFD.<\/span><\/a><\/p>\n<p class=\"vf-box__text\"><span style=\"font-weight: 400;\">AlphaFold data feeds into a large number of data resources, including <\/span><a href=\"https:\/\/www.uniprot.org\/\"><span style=\"font-weight: 400;\">UniProt<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/www.ebi.ac.uk\/pdbe\/pdbe-kb\/\"><span style=\"font-weight: 400;\">PDBe-KB<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/www.ebi.ac.uk\/interpro\/\"><span style=\"font-weight: 400;\">InterPro<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/www.ensembl.org\/index.html\"><span style=\"font-weight: 400;\">Ensembl<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/www.opentargets.org\/\"><span style=\"font-weight: 400;\">Open Targets<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a 
href=\"https:\/\/www.ebi.ac.uk\/pdbe\/pdbe-kb\/3dbeacons\/\"><span style=\"font-weight: 400;\">3D-Beacons<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/swissmodel.expasy.org\/repository\/\"><span style=\"font-weight: 400;\">Swiss-Model Repository<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/www.jalview.org\/\"><span style=\"font-weight: 400;\">Jalview<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/www.rcsb.org\/\"><span style=\"font-weight: 400;\">RCSB PDB<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/mobidb.org\/\"><span style=\"font-weight: 400;\">MobiDB<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/wheat.pw.usda.gov\/GG3\/\"><span style=\"font-weight: 400;\">GrainGenes<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/string-db.org\/\"><span style=\"font-weight: 400;\">STRING.<\/span><\/a><\/p>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\"><strong>Only the beginning: AlphaFold\u2019s impact so far<\/strong><\/h2>\n\n\n\n<p>The AlphaFold system and database have disrupted the way biology is done, and have stimulated new directions in the field of structural biology.&nbsp;<\/p>\n\n\n\n<p>AlphaFold predictions have been used, among other things, to:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/unfolded.deepmind.com\/stories\/unlocking-a-decade-of-data-to-fight-antibiotic-resistance\">Fight antibiotic resistance<\/a><\/li>\n\n\n\n<li>Advance research into both cures and vaccines for diseases, from <a href=\"https:\/\/stories.dndi.org\/five-ways-innovation-changing-fight-against-neglected-tropical-diseases\/?utm_source=dndi&amp;utm_medium=website&amp;utm_campaign=worldntdday2022#group-section-5-Accelerating-scientific-discovery-isttuscYwO\">leishmaniasis<\/a>, to <a 
href=\"https:\/\/unfolded.deepmind.com\/stories\/matthew-higgins-is-unlocking-a-new-path-to-stop-malaria-in-its-tracks\">malaria<\/a><\/li>\n\n\n\n<li>Understand the <a href=\"https:\/\/www.embl.org\/news\/science\/puzzling-out-the-structure-of-a-molecular-giant\/\">nuclear pore complex&nbsp;<\/a><\/li>\n\n\n\n<li>Accelerate the <a href=\"https:\/\/unfolded.deepmind.com\/stories\/accelerating-the-fight-against-plastic-pollution\">fight against plastic pollution<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/unfolded.deepmind.com\/stories\/it-took-me-two-days-to-do-something-that-could-have-taken-me-years\">Understand and protect honey bees<\/a><\/li>\n\n\n\n<li>Develop bioinformatics tools, such as <a href=\"https:\/\/www.biorxiv.org\/content\/10.1101\/2022.02.07.479398v2\">Foldseek<\/a> and <a href=\"https:\/\/academic.oup.com\/nar\/advance-article\/doi\/10.1093\/nar\/gkac387\/6591528?login=true\">Dali<\/a>, which enable users to search for entries similar to a given protein<\/li>\n<\/ul>\n\n\n\n<p>\u201cScientists build on the shoulders of giants. In fact, most often, those shoulders are data,\u201d said <a href=\"https:\/\/www.ebi.ac.uk\/people\/person\/janet-thornton\/\" target=\"_blank\" rel=\"noreferrer noopener\">Janet Thornton, Director Emeritus at EMBL-EBI<\/a>. \u201cHaving these millions of structure predictions will change the face of biology. This is useful to medicine, agriculture, biotech, everything \u2013 it\u2019s just fantastic.\u201d&nbsp;<\/p>\n\n\n<div\n  class=\"vf-embed vf-embed--custom-ratio\"\n\n  style=\"--vf-embed-max-width: 100%;\n    --vf-embed-custom-ratio-x: 640;\n    --vf-embed-custom-ratio-y: 360;\"><iframe loading=\"lazy\" width=\"640\" height=\"360\" src=\"https:\/\/www.youtube.com\/embed\/70E7Q_g-eiU\" frameborder=\"0\" allow=\"accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe><\/div>\n\n\n\n<p>Video credit: DeepMind. 
<\/p>\n\n\n\n<style>\n.box-list li {\ncolor: #fff !important;\n}\n<\/style>\n","protected":false},"excerpt":{"rendered":"<p>Open data played a pivotal role in the development of the AlphaFold AI. The same open principles now apply to AlphaFold predictions.<\/p>\n","protected":false},"author":47,"featured_media":55808,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[11054,2],"tags":[12758,4718,28,17689,782,36,315,35],"embl_taxonomy":[2906,11980],"class_list":["post-55518","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-perspectives","category-science","tag-alphafold","tag-artificial-intelligence","tag-bioinformatics","tag-case-study","tag-database","tag-embl-ebi","tag-open-data","tag-structural-biology","embl_taxonomy-embl-ebi","embl_taxonomy-protein-data-bank-in-europe"],"acf":{"featured":true,"show_featured_image":false,"field_target_display":"both","field_article_language":{"value":"english","label":"English"},"article_intro":"<p>Open data stored at EMBL-EBI played a pivotal role in the development of the AlphaFold AI.<\/p>\n","related_links":[{"link_description":"AlphaFold predicts structure of almost every catalogued protein known to science\r\n","link_url":"https:\/\/www.ebi.ac.uk\/about\/news\/technology-and-innovation\/alphafold-200-million\/"},{"link_description":"Accessible 3D protein models to accelerate scientific discovery\r\n","link_url":"https:\/\/www.ebi.ac.uk\/about\/news\/perspectives\/Thornton-alphafold\/"},{"link_description":"Great expectations \u2013 the potential impacts of AlphaFold 
DB\r\n","link_url":"https:\/\/www.ebi.ac.uk\/about\/news\/perspectives\/alphafold-potential-impacts\/"}],"source_article":false,"in_this_article":false,"press_contact":"None","article_translations":false,"languages":"","vf_locked":false,"vfwp-news_embl_taxonomy":[2906,11980]},"embl_taxonomy_terms":[{"uuid":"a:3:{i:0;s:36:\"b14d3f13-5670-44fb-8970-e54dfd9c921a\";i:1;s:36:\"89e00fee-87f4-482e-a801-4c3548bb6a58\";i:2;s:36:\"a99d1a7c-ca83-4c00-ab61-d082d3e41ce3\";}","parents":[],"name":["EMBL-EBI"],"slug":"embl-ebi","description":"Where &gt; All EMBL sites &gt; EMBL-EBI"},{"uuid":"a:3:{i:0;s:36:\"302cfdf7-365b-462a-be65-82c7b783ebf7\";i:1;s:36:\"18699e63-ed43-40c6-8d1c-203db7ed72ee\";i:2;s:36:\"36fe2f11-e6a3-4311-b9f6-7d5839d64b07\";}","parents":[],"name":["Protein Data Bank in Europe"],"slug":"protein-data-bank-in-europe","description":"What &gt; EMBL-EBI Services &gt; Protein Data Bank in Europe"}],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Case study: AlphaFold uses open data and AI to discover the 3D protein universe | EMBL<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Case study: AlphaFold uses open data and AI to discover the 3D protein universe | EMBL\" \/>\n<meta property=\"og:description\" content=\"Open data played a pivotal role in the development of the AlphaFold AI. 
The same open principles now apply to AlphaFold predictions.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/\" \/>\n<meta property=\"og:site_name\" content=\"EMBL\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/embl.org\/\" \/>\n<meta property=\"article:published_time\" content=\"2023-02-09T10:05:28+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-11-26T10:59:12+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1000\" \/>\n\t<meta property=\"og:image:height\" content=\"600\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Oana Stroe\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@embl\" \/>\n<meta name=\"twitter:site\" content=\"@embl\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Oana Stroe\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/\"},\"author\":{\"name\":\"Oana Stroe\",\"@id\":\"https:\/\/www.embl.org\/news\/#\/schema\/person\/95d28f359cc138357c5dcf79319ef414\"},\"headline\":\"Case study: AlphaFold uses open data and AI to discover the 3D protein universe\",\"datePublished\":\"2023-02-09T10:05:28+00:00\",\"dateModified\":\"2024-11-26T10:59:12+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/\"},\"wordCount\":1237,\"publisher\":{\"@id\":\"https:\/\/www.embl.org\/news\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg\",\"keywords\":[\"alphafold\",\"artificial intelligence\",\"bioinformatics\",\"case study\",\"database\",\"embl-ebi\",\"open data\",\"structural biology\"],\"articleSection\":[\"Perspectives\",\"Science\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/\",\"url\":\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/\",\"name\":\"Case study: AlphaFold uses open data and AI to discover the 3D protein universe | 
EMBL\",\"isPartOf\":{\"@id\":\"https:\/\/www.embl.org\/news\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg\",\"datePublished\":\"2023-02-09T10:05:28+00:00\",\"dateModified\":\"2024-11-26T10:59:12+00:00\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/#primaryimage\",\"url\":\"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg\",\"contentUrl\":\"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg\",\"width\":1000,\"height\":600,\"caption\":\"Credit: Nuclear pore complex prediction by AlphaFold. Edited by Karen Arnott\/EMBL-EBI. 
Background image from Adobe Stock Images.\"},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.embl.org\/news\/#website\",\"url\":\"https:\/\/www.embl.org\/news\/\",\"name\":\"European Molecular Biology Laboratory News\",\"description\":\"News from the European Molecular Biology Laboratory\",\"publisher\":{\"@id\":\"https:\/\/www.embl.org\/news\/#organization\"},\"alternateName\":\"EMBL News\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.embl.org\/news\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.embl.org\/news\/#organization\",\"name\":\"European Molecular Biology Laboratory\",\"alternateName\":\"EMBL\",\"url\":\"https:\/\/www.embl.org\/news\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.embl.org\/news\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2025\/09\/EMBL_logo_colour-1-300x144-1.png\",\"contentUrl\":\"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2025\/09\/EMBL_logo_colour-1-300x144-1.png\",\"width\":300,\"height\":144,\"caption\":\"European Molecular Biology Laboratory\"},\"image\":{\"@id\":\"https:\/\/www.embl.org\/news\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/embl.org\/\",\"https:\/\/x.com\/embl\",\"https:\/\/www.instagram.com\/embl_org\/\",\"https:\/\/www.linkedin.com\/company\/15813\/\",\"https:\/\/www.youtube.com\/user\/emblmedia\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.embl.org\/news\/#\/schema\/person\/95d28f359cc138357c5dcf79319ef414\",\"name\":\"Oana 
Stroe\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.embl.org\/news\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/105fb83bbeaf64062ef1f4608964e5163ac52f590628df2ffde6de9a0d85af80?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/105fb83bbeaf64062ef1f4608964e5163ac52f590628df2ffde6de9a0d85af80?s=96&d=mm&r=g\",\"caption\":\"Oana Stroe\"},\"description\":\"Oana Stroe is a PR specialist and copywriter with experience in engineering, IT and life sciences. She now collects and shares remarkable science stories.\",\"url\":\"https:\/\/www.embl.org\/news\/author\/oana-stroe-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-3\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Case study: AlphaFold uses open data and AI to discover the 3D protein universe | EMBL","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/","og_locale":"en_US","og_type":"article","og_title":"Case study: AlphaFold uses open data and AI to discover the 3D protein universe | EMBL","og_description":"Open data played a pivotal role in the development of the AlphaFold AI. 
The same open principles now apply to AlphaFold predictions.","og_url":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/","og_site_name":"EMBL","article_publisher":"https:\/\/www.facebook.com\/embl.org\/","article_published_time":"2023-02-09T10:05:28+00:00","article_modified_time":"2024-11-26T10:59:12+00:00","og_image":[{"width":1000,"height":600,"url":"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg","type":"image\/jpeg"}],"author":"Oana Stroe","twitter_card":"summary_large_image","twitter_creator":"@embl","twitter_site":"@embl","twitter_misc":{"Written by":"Oana Stroe","Est. reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/#article","isPartOf":{"@id":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/"},"author":{"name":"Oana Stroe","@id":"https:\/\/www.embl.org\/news\/#\/schema\/person\/95d28f359cc138357c5dcf79319ef414"},"headline":"Case study: AlphaFold uses open data and AI to discover the 3D protein universe","datePublished":"2023-02-09T10:05:28+00:00","dateModified":"2024-11-26T10:59:12+00:00","mainEntityOfPage":{"@id":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/"},"wordCount":1237,"publisher":{"@id":"https:\/\/www.embl.org\/news\/#organization"},"image":{"@id":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/#primaryimage"},"thumbnailUrl":"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg","keywords":["alphafold","artificial intelligence","bioinformatics","case study","database","embl-ebi","open data","structural 
biology"],"articleSection":["Perspectives","Science"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/","url":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/","name":"Case study: AlphaFold uses open data and AI to discover the 3D protein universe | EMBL","isPartOf":{"@id":"https:\/\/www.embl.org\/news\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/#primaryimage"},"image":{"@id":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/#primaryimage"},"thumbnailUrl":"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg","datePublished":"2023-02-09T10:05:28+00:00","dateModified":"2024-11-26T10:59:12+00:00","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.embl.org\/news\/science\/alphafold-using-open-data-and-ai-to-discover-the-3d-protein-universe\/#primaryimage","url":"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg","contentUrl":"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg","width":1000,"height":600,"caption":"Credit: Nuclear pore complex prediction by AlphaFold. Edited by Karen Arnott\/EMBL-EBI. 
Background image from Adobe Stock Images."},{"@type":"WebSite","@id":"https:\/\/www.embl.org\/news\/#website","url":"https:\/\/www.embl.org\/news\/","name":"European Molecular Biology Laboratory News","description":"News from the European Molecular Biology Laboratory","publisher":{"@id":"https:\/\/www.embl.org\/news\/#organization"},"alternateName":"EMBL News","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.embl.org\/news\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.embl.org\/news\/#organization","name":"European Molecular Biology Laboratory","alternateName":"EMBL","url":"https:\/\/www.embl.org\/news\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.embl.org\/news\/#\/schema\/logo\/image\/","url":"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2025\/09\/EMBL_logo_colour-1-300x144-1.png","contentUrl":"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2025\/09\/EMBL_logo_colour-1-300x144-1.png","width":300,"height":144,"caption":"European Molecular Biology Laboratory"},"image":{"@id":"https:\/\/www.embl.org\/news\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/embl.org\/","https:\/\/x.com\/embl","https:\/\/www.instagram.com\/embl_org\/","https:\/\/www.linkedin.com\/company\/15813\/","https:\/\/www.youtube.com\/user\/emblmedia\/"]},{"@type":"Person","@id":"https:\/\/www.embl.org\/news\/#\/schema\/person\/95d28f359cc138357c5dcf79319ef414","name":"Oana 
Stroe","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.embl.org\/news\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/105fb83bbeaf64062ef1f4608964e5163ac52f590628df2ffde6de9a0d85af80?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/105fb83bbeaf64062ef1f4608964e5163ac52f590628df2ffde6de9a0d85af80?s=96&d=mm&r=g","caption":"Oana Stroe"},"description":"Oana Stroe is a PR specialist and copywriter with experience in engineering, IT and life sciences. She now collects and shares remarkable science stories.","url":"https:\/\/www.embl.org\/news\/author\/oana-stroe-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-3\/"}]}},"field_target_display":"both","field_article_language":{"value":"english","label":"English"},"fimg_url":"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg","featured_image_src":"https:\/\/www.embl.org\/news\/wp-content\/uploads\/2023\/01\/AlphaFold-case-study-1000x600-1.jpg","_links":{"self":[{"href":"https:\/\/www.embl.org\/news\/wp-json\/wp\/v2\/posts\/55518","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.embl.org\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.embl.org\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.embl.org\/news\/wp-json\/wp\/v2\/users\/47"}],"replies":[{"embeddable":true,"href":"https:\/\/www.embl.org\/news\/wp-json\/wp\/v2\/comments?post=55518"}],"version-history":[{"count":146,"href":"https:\/\/www.embl.org\/news\/wp-json\/wp\/v2\/posts\/55518\/revisions"}],"predecessor-version":[{"id":72099,"href":"https:\/\/www.embl.org\/news\/wp-json\/wp\/v2\/posts\/55518\/revisions\/72099"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.embl.org\/news\/wp-json\/wp\/v2\/media\/55808"}],"wp:attachment":[{"href":"https:\/\/www.embl.org\/news\/wp-json\/wp\/v2\/media?parent=55518"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\
/\/www.embl.org\/news\/wp-json\/wp\/v2\/categories?post=55518"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.embl.org\/news\/wp-json\/wp\/v2\/tags?post=55518"},{"taxonomy":"embl_taxonomy","embeddable":true,"href":"https:\/\/www.embl.org\/news\/wp-json\/wp\/v2\/embl_taxonomy?post=55518"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}