Establishing the ELIXIR Microbiome Community

Finn RD, Balech B, Burgin J, Chua P, Corre E, Cox CJ, Donati C, dos Santos VM, Fosso B, Hancock J, Heil KF, Ishaque N, Kale V, Kunath BJ, Médigue C, Pafilis E, Pesole G, Richardson L, Santamaria M, Van Den Bossche T, Vizcaíno JA, Zafeiropoulos H, Willassen NP, Pelletier E, Batut B.

2024

doi:10.12688/f1000research.144515.1.

Ensembl 2024.

Harrison PW, Amode MR, Austine-Orimoloye O, Azov AG, Barba M, Barnes I, Becker A, Bennett R, Berry A, Bhai J, Bhurji SK, Boddu S, Branco Lins PR, Brooks L, Ramaraju SB, Campbell LI, Martinez MC, Charkhchi M, Chougule K, Cockburn A, Davidson C, De Silva NH, Dodiya K, Donaldson S, El Houdaigui B, Naboulsi TE, Fatima R, Giron CG, Genez T, Grigoriadis D, Ghattaoraya GS, Martinez JG, Gurbich TA, Hardy M, Hollis Z, Hourlier T, Hunt T, Kay M, Kaykala V, Le T, Lemos D, Lodha D, Marques-Coelho D, Maslen G, Merino GA, Mirabueno LP, Mushtaq A, Hossain SN, Ogeh DN, Sakthivel MP, Parker A, Perry M, Piližota I, Poppleton D, Prosovetskaia I, Raj S, Pérez-Silva JG, Salam AIA, Saraf S, Saraiva-Agostinho N, Sheppard D, Sinha S, Sipos B, Sitnik V, Stark W, Steed E, Suner MM, Surapaneni L, Sutinen K, Tricomi FF, Urbina-Gómez D, Veidenberg A, Walsh TA, Ware D, Wass E, Willhoft NL, Allen J, Alvarez-Jarreta J, Chakiachvili M, Flint B, Giorgetti S, Haggerty L, Ilsley GR, Keatley J, Loveland JE, Moore B, Mudge JM, Naamati G, Tate J, Trevanion SJ, Winterbottom A, Frankish A, Hunt SE, Cunningham F, Dyer S, Finn RD, Martin FJ, Yates AD.

Nucleic Acids Res, 2024

doi:10.1093/nar/gkad1049.

SPIRE: a Searchable, Planetary-scale mIcrobiome REsource.

Schmidt TSB, Fullam A, Ferretti P, Orakov A, Maistrenko OM, Ruscheweyh HJ, Letunic I, Duan Y, Van Rossum T, Sunagawa S, Mende DR, Finn RD, Kuhn M, Pedro Coelho L, Bork P.

Nucleic Acids Res, 2024

doi:10.1093/nar/gkad943.

plastiC: A pipeline for recovery and characterization of plastid genomes from metagenomic datasets

Cameron ES, Blaxter ML, Finn RD.

2023

doi:10.12688/wellcomeopenres.19589.1.

Challenges and opportunities in sharing microbiome data and analyses.

Huttenhower C, Finn RD, McHardy AC.

Nat Microbiol, 2023

doi:10.1038/s41564-023-01484-x.

VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models.

Rangel-Pineros G, Almeida A, Beracochea M, Sakharova E, Marz M, Reyes Muñoz A, Hölzer M, Finn RD.

PLoS Comput Biol, 2023

doi:10.1371/journal.pcbi.1011422.

Expansion of novel biosynthetic gene clusters from diverse environments using SanntiS

Sanchez S, Rogers JD, Rogers AB, Nassar M, McEntyre J, Welch M, Hollfelder F, Finn RD.

2023

doi:10.1101/2023.05.23.540769.

Staphylococcal diversity in atopic dermatitis from an individual to a global scale.

Saheb Kashaf S, Harkins CP, Deming C, Joglekar P, Conlan S, Holmes CJ, NISC Comparative Sequencing Program, Almeida A, Finn RD, Segre JA, Kong HH.

Cell Host Microbe, 2023

doi:10.1016/j.chom.2023.03.010.

Exploring microbial functional biodiversity at the protein family level-From metagenomic sequence reads to annotated protein clusters.

Baltoumas FA, Karatzas E, Paez-Espino D, Venetsianou NK, Aplakidou E, Oulas A, Finn RD, Ovchinnikov S, Pafilis E, Kyrpides NC, Pavlopoulos GA.

Front Bioinform, 2023

doi:10.3389/fbinf.2023.1157956.

MGnify Genomes: A Resource for Biome-specific Microbial Genome Catalogues.

Gurbich TA, Almeida A, Beracochea M, Burdett T, Burgin J, Cochrane G, Raj S, Richardson L, Rogers AB, Sakharova E, Salazar GA, Finn RD.

J Mol Biol, 2023

doi:10.1016/j.jmb.2023.168016.

Evidence for a core set of microbial lichen symbionts from a global survey of metagenomes

Tagirdzhanova G, Saary P, Cameron ES, Garber AI, Díaz Escandón D, Goyette S, Nogerius VT, Passo A, Mayrhofer H, Holien H, Tønsberg T, Stein LY, Finn RD, Spribille T.

2023

doi:10.1101/2023.02.02.524463.

MGnify: the microbiome sequence data analysis resource in 2023.

Richardson L, Allen B, Baldi G, Beracochea M, Bileschi ML, Burdett T, Burgin J, Caballero-Pérez J, Cochrane G, Colwell LJ, Curtis T, Escobar-Zepeda A, Gurbich TA, Kale V, Korobeynikov A, Raj S, Rogers AB, Sakharova E, Sanchez S, Wilkinson DJ, Finn RD.

Nucleic Acids Res, 2023

doi:10.1093/nar/gkac1080.

Ensembl 2023.

Martin FJ, Amode MR, Aneja A, Austine-Orimoloye O, Azov AG, Barnes I, Becker A, Bennett R, Berry A, Bhai J, Bhurji SK, Bignell A, Boddu S, Branco Lins PR, Brooks L, Ramaraju SB, Charkhchi M, Cockburn A, Da Rin Fiorretto L, Davidson C, Dodiya K, Donaldson S, El Houdaigui B, El Naboulsi T, Fatima R, Giron CG, Genez T, Ghattaoraya GS, Martinez JG, Guijarro C, Hardy M, Hollis Z, Hourlier T, Hunt T, Kay M, Kaykala V, Le T, Lemos D, Marques-Coelho D, Marugán JC, Merino GA, Mirabueno LP, Mushtaq A, Hossain SN, Ogeh DN, Sakthivel MP, Parker A, Perry M, Piližota I, Prosovetskaia I, Pérez-Silva JG, Salam AIA, Saraiva-Agostinho N, Schuilenburg H, Sheppard D, Sinha S, Sipos B, Stark W, Steed E, Sukumaran R, Sumathipala D, Suner MM, Surapaneni L, Sutinen K, Szpak M, Tricomi FF, Urbina-Gómez D, Veidenberg A, Walsh TA, Walts B, Wass E, Willhoft N, Allen J, Alvarez-Jarreta J, Chakiachvili M, Flint B, Giorgetti S, Haggerty L, Ilsley GR, Loveland JE, Moore B, Mudge JM, Tate J, Thybert D, Trevanion SJ, Winterbottom A, Frankish A, Hunt SE, Ruffier M, Cunningham F, Dyer S, Finn RD, Howe KL, Harrison PW, Yates AD, Flicek P.

Nucleic Acids Res, 2023

doi:10.1093/nar/gkac958.

metaGOflow: a workflow for the analysis of marine Genomic Observatories shotgun metagenomics data.

Zafeiropoulos H, Beracochea M, Ninidakis S, Exter K, Potirakis A, De Moro G, Richardson L, Corre E, Machado J, Pafilis E, Kotoulas G, Santi I, Finn RD, Cox CJ, Pavloudi C.

Gigascience, 2022

doi:10.1093/gigascience/giad078.

plastiC: A pipeline for recovery and characterization of plastid genomes from metagenomic datasets

Cameron ES, Blaxter ML, Finn RD.

2022

doi:10.1101/2022.12.23.521586.

SymbNET: From metagenomics to metabolic interactions: course materials

Rodrigues Araujo D, Finn R, Zimmerman M.

2022

doi:10.6019/tol.symbnetmaterials-t.2022.00001.1.

Screening of global microbiomes implies ecological boundaries impacting the distribution and dissemination of clinically relevant antimicrobial resistance genes.

Lin Q, Xavier BB, Alako BTF, Mitchell AL, Rajakani SG, Glupczynski Y, Finn RD, Cochrane G, Malhotra-Kumar S.

Commun Biol, 2022

doi:10.1038/s42003-022-04187-x.

MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data.

Shafranskaya D, Kale V, Finn R, Lapidus AL, Korobeynikov A, Prjibelski AD.

Front Microbiol, 2022

doi:10.3389/fmicb.2022.981458.

Novel strategies to improve chicken performance and welfare by unveiling host-microbiota interactions through hologenomics.

Tous N, Marcos S, Goodarzi Boroojeni F, Pérez de Rozas A, Zentek J, Estonba A, Sandvang D, Gilbert MTP, Esteve-Garcia E, Finn R, Alberdi A, Tarradas J.

Front Physiol, 2022

doi:10.3389/fphys.2022.884925.

VIRify: an integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models

Rangel-Pineros G, Almeida A, Beracochea M, Sakharova E, Marz M, Muñoz AR, Hölzer M, Finn RD.

2022

doi:10.1101/2022.08.22.504484.

A machine learning framework for discovery and enrichment of metagenomics metadata from open access publications.

Nassar M, Rogers AB, Talo' F, Sanchez S, Shafique Z, Finn RD, McEntyre J.

Gigascience, 2022

doi:10.1093/gigascience/giac077.

Priorities for ocean microbiome research.

Tara Ocean Foundation, Tara Oceans, European Molecular Biology Laboratory (EMBL), European Marine Biological Resource Centre - European Research Infrastructure Consortium (EMBRC-ERIC).

Nat Microbiol, 2022

doi:10.1038/s41564-022-01145-5.

Unifying the known and unknown microbial coding sequence space.

Vanni C, Schechter MS, Acinas SG, Barberán A, Buttigieg PL, Casamayor EO, Delmont TO, Duarte CM, Eren AM, Finn RD, Kottmann R, Mitchell A, Sánchez P, Siren K, Steinegger M, Gloeckner FO, Fernàndez-Guerra A.

Elife, 2022

doi:10.7554/elife.67667.

Large-scale analysis reveals the distribution of novel cellular microbes across multiple biomes and kingdoms

Saary P, Kale V, Finn R.

2022

doi:10.21203/rs.3.rs-1441815/v1.

A mouse model of occult intestinal colonization demonstrating antibiotic-induced outgrowth of carbapenem-resistant Enterobacteriaceae.

Sim CK, Kashaf SS, Stacy A, Proctor DM, Almeida A, Bouladoux N, Chen M, NISC Comparative Sequencing Program, Finn RD, Belkaid Y, Conlan S, Segre JA.

Microbiome, 2022

doi:10.1186/s40168-021-01207-6.

A machine learning framework for discovery and enrichment of metagenomics metadata from open access publications

Nassar M, Rogers AB, Talo' F, Sanchez S, Shafique Z, Finn RD, McEntyre J.

2022

doi:10.21203/rs.3.rs-1396476/v1.

Publisher Correction: A catalogue of 1,167 genomes from the human gut archaeome.

Chibani CM, Mahnert A, Borrel G, Almeida A, Werner A, Brugère JF, Gribaldo S, Finn RD, Schmitz RA, Moissl-Eichinger C.

Nat Microbiol, 2022

doi:10.1038/s41564-022-01061-8.

A catalogue of 1,167 genomes from the human gut archaeome.

Chibani CM, Mahnert A, Borrel G, Almeida A, Werner A, Brugère JF, Gribaldo S, Finn RD, Schmitz RA, Moissl-Eichinger C.

Nat Microbiol, 2022

doi:10.1038/s41564-021-01020-9.

Integrating cultivation and metagenomics for a multi-kingdom view of skin microbiome diversity and functions.

Saheb Kashaf S, Proctor DM, Deming C, Saary P, Hölzer M, NISC Comparative Sequencing Program, Taylor ME, Kong HH, Segre JA, Almeida A, Finn RD.

Nat Microbiol, 2022

doi:10.1038/s41564-021-01011-w.

Metagenomics approach for Polymyxa betae genome assembly enables comparative analysis towards deciphering the intracellular parasitic lifestyle of the plasmodiophorids.

Decroës A, Li JM, Richardson L, Mutasa-Gottgens E, Lima-Mendez G, Mahillon M, Bragard C, Finn RD, Legrève A.

Genomics, 2022

doi:10.1016/j.ygeno.2021.11.018.

Ensembl Genomes 2022: an expanding genome resource for non-vertebrates.

Yates AD, Allen J, Amode RM, Azov AG, Barba M, Becerra A, Bhai J, Campbell LI, Carbajo Martinez M, Chakiachvili M, Chougule K, Christensen M, Contreras-Moreira B, Cuzick A, Da Rin Fioretto L, Davis P, De Silva NH, Diamantakis S, Dyer S, Elser J, Filippi CV, Gall A, Grigoriadis D, Guijarro-Clarke C, Gupta P, Hammond-Kosack KE, Howe KL, Jaiswal P, Kaikala V, Kumar V, Kumari S, Langridge N, Le T, Luypaert M, Maslen GL, Maurel T, Moore B, Muffato M, Mushtaq A, Naamati G, Naithani S, Olson A, Parker A, Paulini M, Pedro H, Perry E, Preece J, Quinton-Tulloch M, Rodgers F, Rosello M, Ruffier M, Seager J, Sitnik V, Szpak M, Tate J, Tello-Ruiz MK, Trevanion SJ, Urban M, Ware D, Wei S, Williams G, Winterbottom A, Zarowiecki M, Finn RD, Flicek P.

Nucleic Acids Res, 2022

doi:10.1093/nar/gkab1007.

The Ensembl COVID-19 resource: ongoing integration of public SARS-CoV-2 data.

De Silva NH, Bhai J, Chakiachvili M, Contreras-Moreira B, Cummins C, Frankish A, Gall A, Genez T, Howe KL, Hunt SE, Martin FJ, Moore B, Ogeh D, Parker A, Parton A, Ruffier M, Sakthivel MP, Sheppard D, Tate J, Thormann A, Thybert D, Trevanion SJ, Winterbottom A, Zerbino DR, Finn RD, Flicek P, Yates AD.

Nucleic Acids Res, 2022

doi:10.1093/nar/gkab889.

Reporting guidelines for human microbiome research: the STORMS checklist.

Mirzayi C, Renson A, Genomic Standards Consortium, Massive Analysis and Quality Control Society, Zohra F, Elsafoury S, Geistlinger L, Kasselman LJ, Eckenrode K, van de Wijgert J, Loughman A, Marques FZ, MacIntyre DA, Arumugam M, Azhar R, Beghini F, Bergstrom K, Bhatt A, Bisanz JE, Braun J, Bravo HC, Buck GA, Bushman F, Casero D, Clarke G, Collado MC, Cotter PD, Cryan JF, Demmer RT, Devkota S, Elinav E, Escobar JS, Fettweis J, Finn RD, Fodor AA, Forslund S, Franke A, Furlanello C, Gilbert J, Grice E, Haibe-Kains B, Handley S, Herd P, Holmes S, Jacobs JP, Karstens L, Knight R, Knights D, Koren O, Kwon DS, Langille M, Lindsay B, McGovern D, McHardy AC, McWeeney S, Mueller NT, Nezi L, Olm M, Palm N, Pasolli E, Raes J, Redinbo MR, Rühlemann M, Balfour Sartor R, Schloss PD, Schriml L, Segal E, Shardell M, Sharpton T, Smirnova E, Sokol H, Sonnenburg JL, Srinivasan S, Thingholm LB, Turnbaugh PJ, Upadhyay V, Walls RL, Wilmes P, Yamada T, Zeller G, Zhang M, Zhao N, Zhao L, Bao W, Culhane A, Devanarayan V, Dopazo J, Fan X, Fischer M, Jones W, Kusko R, Mason CE, Mercer TR, Sansone SA, Scherer A, Shi L, Thakkar S, Tong W, Wolfinger R, Hunter C, Segata N, Huttenhower C, Dowd JB, Jones HE, Waldron L.

Nat Med, 2021

doi:10.1038/s41591-021-01552-x.

COP26 scientist: How much carbon can the ocean absorb?

Tollefson J.

Nature, 2021

doi:10.1038/d41586-021-03029-w.

Erratum to: Predicted Input of Uncultured Fungal Symbionts to a Lichen Symbiosis from Metagenome-Assembled Genomes.

Tagirdzhanova G, Saary P, Tingley JP, Díaz-Escandón D, Abbott DW, Finn RD, Spribille T.

Genome Biol Evol, 2021

doi:10.1093/gbe/evab129.

R2DT is a framework for predicting and visualising RNA secondary structure using templates.

Sweeney BA, Hoksza D, Nawrocki EP, Ribas CE, Madeira F, Cannone JJ, Gutell R, Maddala A, Meade CD, Williams LD, Petrov AS, Chan PP, Lowe TM, Finn RD, Petrov AI.

Nat Commun, 2021

doi:10.1038/s41467-021-23555-5.

An inter-laboratory study to investigate the impact of the bioinformatics component on microbiome analysis using mock communities.

O'Sullivan DM, Doyle RM, Temisak S, Redshaw N, Whale AS, Logan G, Huang J, Fischer N, Amos GCA, Preston MD, Marchesi JR, Wagner J, Parkhill J, Motro Y, Denise H, Finn RD, Harris KA, Kay GL, O'Grady J, Ransom-Jones E, Wu H, Laing E, Studholme DJ, Benavente ED, Phelan J, Clark TG, Moran-Gilad J, Huggett JF.

Sci Rep, 2021

doi:10.1038/s41598-021-89881-2.

Recovering prokaryotic genomes from host-associated, short-read shotgun metagenomic sequencing data.

Saheb Kashaf S, Almeida A, Segre JA, Finn RD.

Nat Protoc, 2021

doi:10.1038/s41596-021-00508-2.

Predicted Input of Uncultured Fungal Symbionts to a Lichen Symbiosis from Metagenome-Assembled Genomes.

Tagirdzhanova G, Saary P, Tingley JP, Díaz-Escandón D, Abbott DW, Finn RD, Spribille T.

Genome Biol Evol, 2021

doi:10.1093/gbe/evab047.

A mouse model of occult intestinal colonization demonstrating antibiotic-induced outgrowth of carbapenem-resistant Enterobacteriaceae

Sim CK, Kashaf SS, Conlan S, Stacy A, Proctor DM, Almeida A, Bouladoux N, Chen M, Finn RD, Belkaid Y, Segre JA, NISC Comparative Sequencing Program.

2021

doi:10.1101/2021.02.24.432587.

Massive expansion of human gut bacteriophage diversity.

Camarillo-Guerrero LF, Almeida A, Rangel-Pineros G, Finn RD, Lawley TD.

Cell, 2021

doi:10.1016/j.cell.2021.01.029.

ELIXIR-EXCELERATE: establishing Europe's data infrastructure for the life science research of the future.

Harrow J, Hancock J, ELIXIR-EXCELERATE Community, Blomberg N.

EMBO J, 2021

doi:10.15252/embj.2020107409.

The Gene Ontology resource: enriching a GOld mine.

Gene Ontology Consortium.

Nucleic Acids Res, 2021

doi:10.1093/nar/gkaa1113.

Rfam 14: expanded coverage of metagenomic, viral and microRNA families.

Kalvari I, Nawrocki EP, Ontiveros-Palacios N, Argasinska J, Lamkiewicz K, Marz M, Griffiths-Jones S, Toffano-Nioche C, Gautheret D, Weinberg Z, Rivas E, Eddy SR, Finn RD, Bateman A, Petrov AI.

Nucleic Acids Res, 2021

doi:10.1093/nar/gkaa1047.

The InterPro protein families and domains database: 20 years on.

Blum M, Chang HY, Chuguransky S, Grego T, Kandasaamy S, Mitchell A, Nuka G, Paysan-Lafosse T, Qureshi M, Raj S, Richardson L, Salazar GA, Williams L, Bork P, Bridge A, Gough J, Haft DH, Letunic I, Marchler-Bauer A, Mi H, Natale DA, Necci M, Orengo CA, Pandurangan AP, Rivoire C, Sigrist CJA, Sillitoe I, Thanki N, Thomas PD, Tosatto SCE, Wu CH, Bateman A, Finn RD.

Nucleic Acids Res, 2021

doi:10.1093/nar/gkaa977.

Computational strategies to combat COVID-19: useful tools to accelerate SARS-CoV-2 and coronavirus research.

Hufsky F, Lamkiewicz K, Almeida A, Aouacheria A, Arighi C, Bateman A, Baumbach J, Beerenwinkel N, Brandt C, Cacciabue M, Chuguransky S, Drechsel O, Finn RD, Fritz A, Fuchs S, Hattab G, Hauschild AC, Heider D, Hoffmann M, Hölzer M, Hoops S, Kaderali L, Kalvari I, von Kleist M, Kmiecinski R, Kühnert D, Lasso G, Libin P, List M, Löchel HF, Martin MJ, Martin R, Matschinske J, McHardy AC, Mendes P, Mistry J, Navratil V, Nawrocki EP, O'Toole ÁN, Ontiveros-Palacios N, Petrov AI, Rangel-Pineros G, Redaschi N, Reimering S, Reinert K, Reyes A, Richardson L, Robertson DL, Sadegh S, Singer JB, Theys K, Upton C, Welzel M, Williams L, Marz M.

Brief Bioinform, 2021

doi:10.1093/bib/bbaa232.

Pfam: The protein families database in 2021.

Mistry J, Chuguransky S, Williams L, Qureshi M, Salazar GA, Sonnhammer ELL, Tosatto SCE, Paladin L, Raj S, Richardson LJ, Finn RD, Bateman A.

Nucleic Acids Res, 2021

doi:10.1093/nar/gkaa913.

RNAcentral 2021: secondary structure integration, improved sequence search and new member databases.

RNAcentral Consortium.

Nucleic Acids Res, 2021

doi:10.1093/nar/gkaa921.

A unified catalog of 204,938 reference genomes from the human gut microbiome.

Almeida A, Nayfach S, Boland M, Strozzi F, Beracochea M, Shi ZJ, Pollard KS, Sakharova E, Parks DH, Hugenholtz P, Segata N, Kyrpides NC, Finn RD.

Nat Biotechnol, 2021

doi:10.1038/s41587-020-0603-3.

Microbiota Characterization of Agricultural Green Waste-Based Suppressive Composts Using Omics and Classic Approaches

Scotti R, Mitchell AL, Pane C, Finn RD, Zaccardelli M.

Agriculture, 2020

doi:10.3390/agriculture10030061.

The Ensembl COVID-19 resource: Ongoing integration of public SARS-CoV-2 data

De Silva NH, Bhai J, Chakiachvili M, Contreras-Moreira B, Cummins C, Frankish A, Gall A, Genez T, Howe KL, Hunt SE, Martin FJ, Moore B, Ogeh D, Parker A, Parton A, Ruffier M, Sakthivel MP, Sheppard D, Tate J, Thormann A, Thybert D, Trevanion SJ, Winterbottom A, Zerbino DR, Finn RD, Flicek P, Yates AD.

2020

doi:10.1101/2020.12.18.422865.

A comprehensive analysis of the global human gut archaeome from a thousand genome catalogue

Chibani CM, Mahnert A, Borrel G, Almeida A, Werner A, Brugere J, Gribaldo S, Finn RD, Schmitz RA, Moissl-Eichinger C.

2020

doi:10.1101/2020.11.21.392621.

Estimating the quality of eukaryotic genomes recovered from metagenomic analysis with EukCC.

Saary P, Mitchell AL, Finn RD.

Genome Biol, 2020

doi:10.1186/s13059-020-02155-4.

R2DT: computational framework for template-based RNA secondary structure visualisation across non-coding RNA types

Sweeney BA, Hoksza D, Nawrocki EP, Ribas CE, Madeira F, Cannone JJ, Gutell R, Maddala A, Meade C, Williams LD, Petrov AS, Chan PP, Lowe TM, Finn RD, Petrov AI.

2020

doi:10.1101/2020.09.10.290924.

Massive expansion of human gut bacteriophage diversity

Camarillo-Guerrero LF, Almeida A, Rangel-Pineros G, Finn RD, Lawley TD.

2020

doi:10.1101/2020.09.03.280214.

Exploring Non-Coding RNAs in RNAcentral.

Sweeney BA, Tagmazian AA, Ribas CE, Finn RD, Bateman A, Petrov AI.

Curr Protoc Bioinformatics, 2020

doi:10.1002/cpbi.104.

Unifying the known and unknown microbial coding sequence space

Vanni C, Schechter MS, Acinas SG, Barberán A, Buttigieg PL, Casamayor EO, Delmont TO, Duarte CM, Eren AM, Finn RD, Kottmann R, Mitchell A, Sanchez P, Siren K, Steinegger M, Glöckner FO, Fernandez-Guerra A.

2020

doi:10.1101/2020.06.30.180448.

COVID-19 pandemic reveals the peril of ignoring metadata standards.

Schriml LM, Chuvochina M, Davies N, Eloe-Fadrosh EA, Finn RD, Hugenholtz P, Hunter CI, Hurwitz BL, Kyrpides NC, Meyer F, Mizrachi IK, Sansone SA, Sutton G, Tighe S, Walls R.

Sci Data, 2020

doi:10.1038/s41597-020-0524-5.

Phylogenomics of expanding uncultured environmental Tenericutes provides insights into their pathogenicity and evolutionary relationship with Bacilli.

Wang Y, Huang JM, Zhou YL, Almeida A, Finn RD, Danchin A, He LS.

BMC Genomics, 2020

doi:10.1186/s12864-020-06807-4.

Computational Strategies to Combat COVID-19: Useful Tools to Accelerate SARS-CoV-2 and Coronavirus Research

Hufsky F, Lamkiewicz K, Almeida A, Aouacheria A, Arighi C, Bateman A, Baumbach J, Beerenwinkel N, Brandt C, Cacciabue M, Chuguransky S, Drechsel O, Finn RD, Fritz A, Fuchs S, Hattab G, Hauschild A, Heider D, Hoffmann M, Hölzer M, Hoops S, Kaderali L, Kalvari I, von Kleist M, Kmiecinski R, Kühnert D, Lasso G, Libin P, List M, Löchel HF, Martin MJ, Martin R, Matschinske J, McHardy AC, Mendes P, Mistry J, Navratil V, Nawrocki E, O'Toole ÁN, Palacios-Ontiveros N, Petrov AI, Rangel-Piñeros G, Redaschi N, Reimering S, Reinert K, Reyes A, Richardson L, Robertson DL, Sadegh S, Singer JB, Theys K, Upton C, Welzel M, Williams L, Marz M.

2020

doi:10.20944/preprints202005.0376.v1.

Phylogenomics of expanding uncultured environmental Tenericutes provides insights into their pathogenicity and evolutionary relationship with Bacilli

Wang Y, Huang J, Zhou Y, Almeida A, Finn RD, Danchin A, He L.

2020

doi:10.1101/2020.01.21.914887.

Microbial composition of Kombucha determined using amplicon sequencing and shotgun metagenomics.

Arıkan M, Mitchell AL, Finn RD, Gürel F.

J Food Sci, 2020

doi:10.1111/1750-3841.14992.

The ELIXIR Core Data Resources: fundamental infrastructure for the life sciences.

Drysdale R, Cook CE, Petryszak R, Baillie-Gerritsen V, Barlow M, Gasteiger E, Gruhl F, Haas J, Lanfear J, Lopez R, Redaschi N, Stockinger H, Teixeira D, Venkatesan A, Elixir Core Data Resource Forum, Blomberg N, Durinx C, McEntyre J.

Bioinformatics, 2020

doi:10.1093/bioinformatics/btz959.

Genome3D: integrating a collaborative data pipeline to expand the depth and breadth of consensus protein structure annotation.

Sillitoe I, Andreeva A, Blundell TL, Buchan DWA, Finn RD, Gough J, Jones D, Kelley LA, Paysan-Lafosse T, Lam SD, Murzin AG, Pandurangan AP, Salazar GA, Skwark MJ, Sternberg MJE, Velankar S, Orengo C.

Nucleic Acids Res, 2020

doi:10.1093/nar/gkz967.

MGnify: the microbiome analysis resource in 2020.

Mitchell AL, Almeida A, Beracochea M, Boland M, Burgin J, Cochrane G, Crusoe MR, Kale V, Potter SC, Richardson LJ, Sakharova E, Scheremetjew M, Korobeynikov A, Shlemov A, Kunyavskaya O, Lapidus A, Finn RD.

Nucleic Acids Res, 2020

doi:10.1093/nar/gkz1035.

Metagenomics Bioinformatics

Mitchell A, Finn R, Kale V.

2019

doi:10.6019/tol.metagenomics-t.2019.00001.1.

Estimating the quality of eukaryotic genomes recovered from metagenomic analysis

Saary P, Mitchell AL, Finn RD.

2019

doi:10.1101/2019.12.19.882753.

A unified sequence catalogue of over 280,000 genomes obtained from the human gut microbiome

Almeida A, Nayfach S, Boland M, Strozzi F, Beracochea M, Shi ZJ, Pollard KS, Parks DH, Hugenholtz P, Segata N, Kyrpides NC, Finn RD.

2019

doi:10.1101/762682.

Workflow systems turn raw data into scientific knowledge.

Perkel JM.

Nature, 2019

doi:10.1038/d41586-019-02619-z.

Microbial community drivers of PK/NRP gene diversity in selected global soils.

Borsetto C, Amos GCA, da Rocha UN, Mitchell AL, Finn RD, Laidi RF, Vallin C, Pearce DA, Newsham KK, Wellington EMH.

Microbiome, 2019

doi:10.1186/s40168-019-0692-8.

Consent insufficient for data release-Response.

Amann RI, Baichoo S, Blencowe BJ, Bork P, Borodovsky M, Brooksbank C, Chain PSG, Colwell RR, Daffonchio DG, Danchin A, de Lorenzo V, Dorrestein PC, Finn RD, Fraser CM, Gilbert JA, Hallam SJ, Hugenholtz P, Ioannidis JPA, Jansson JK, Kim JF, Klenk HP, Klotz MG, Knight R, Konstantinidis KT, Kyrpides NC, Mason CE, McHardy AC, Meyer F, Ouzounis CA, Patrinos AAN, Podar M, Pollard KS, Ravel J, Muñoz AR, Roberts RJ, Rosselló-Móra R, Sansone SA, Schloss PD, Schriml LM, Setubal JC, Sorek R, Stevens RL, Tiedje JM, Turjanski A, Tyson GW, Ussery DW, Weinstock GM, White O, Whitman WB, Xenarios I.

Science, 2019

doi:10.1126/science.aax7509.

The EMBL-EBI search and sequence analysis tools APIs in 2019.

Madeira F, Park YM, Lee J, Buso N, Gur T, Madhusoodanan N, Basutkar P, Tivey ARN, Potter SC, Finn RD, Lopez R.

Nucleic Acids Res, 2019

doi:10.1093/nar/gkz268.

Microbial abundance, activity and population genomic profiling with mOTUs2.

Milanese A, Mende DR, Paoli L, Salazar G, Ruscheweyh HJ, Cuenca M, Hingamp P, Alves R, Costea PI, Coelho LP, Schmidt TSB, Almeida A, Mitchell AL, Finn RD, Huerta-Cepas J, Bork P, Zeller G, Sunagawa S.

Nat Commun, 2019

doi:10.1038/s41467-019-08844-4.

A new genomic blueprint of the human gut microbiota.

Almeida A, Mitchell AL, Boland M, Forster SC, Gloor GB, Tarkowska A, Lawley TD, Finn RD.

Nature, 2019

doi:10.1038/s41586-019-0965-1.

A human gut bacterial genome and culture collection for improved metagenomic analyses.

Forster SC, Kumar N, Anonye BO, Almeida A, Viciani E, Stares MD, Dunn M, Mkandawire TT, Zhu A, Shao Y, Pike LJ, Louie T, Browne HP, Mitchell AL, Neville BA, Finn RD, Lawley TD.

Nat Biotechnol, 2019

doi:10.1038/s41587-018-0009-7.

Toward unrestricted use of public genomic data.

Amann RI, Baichoo S, Blencowe BJ, Bork P, Borodovsky M, Brooksbank C, Chain PSG, Colwell RR, Daffonchio DG, Danchin A, de Lorenzo V, Dorrestein PC, Finn RD, Fraser CM, Gilbert JA, Hallam SJ, Hugenholtz P, Ioannidis JPA, Jansson JK, Kim JF, Klenk HP, Klotz MG, Knight R, Konstantinidis KT, Kyrpides NC, Mason CE, McHardy AC, Meyer F, Ouzounis CA, Patrinos AAN, Podar M, Pollard KS, Ravel J, Muñoz AR, Roberts RJ, Rosselló-Móra R, Sansone SA, Schloss PD, Schriml LM, Setubal JC, Sorek R, Stevens RL, Tiedje JM, Turjanski A, Tyson GW, Ussery DW, Weinstock GM, White O, Whitman WB, Xenarios I.

Science, 2019

doi:10.1126/science.aaw1280.

Identifying accurate metagenome and amplicon software via a meta-analysis of sequence to taxonomy benchmarking studies.

Gardner PP, Watson RJ, Morgan XC, Draper JL, Finn RD, Morales SE, Stott MB.

PeerJ, 2019

doi:10.7717/peerj.6160.

RNAcentral: a hub of information for non-coding RNA sequences.

The RNAcentral Consortium.

Nucleic Acids Res, 2019

doi:10.1093/nar/gky1206.

InterPro in 2019: improving coverage, classification and access to protein sequence annotations.

Mitchell AL, Attwood TK, Babbitt PC, Blum M, Bork P, Bridge A, Brown SD, Chang HY, El-Gebali S, Fraser MI, Gough J, Haft DR, Huang H, Letunic I, Lopez R, Luciani A, Madeira F, Marchler-Bauer A, Mi H, Natale DA, Necci M, Nuka G, Orengo C, Pandurangan AP, Paysan-Lafosse T, Pesseat S, Potter SC, Qureshi MA, Rawlings ND, Redaschi N, Richardson LJ, Rivoire C, Salazar GA, Sangrador-Vegas A, Sigrist CJA, Sillitoe I, Sutton GG, Thanki N, Thomas PD, Tosatto SCE, Yong SY, Finn RD.

Nucleic Acids Res, 2019

doi:10.1093/nar/gky1100.

The Gene Ontology Resource: 20 years and still GOing strong.

The Gene Ontology Consortium.

Nucleic Acids Res, 2019

doi:10.1093/nar/gky1055.

RNAcentral: a hub of information for non-coding RNA sequences.

The RNAcentral Consortium.

Nucleic Acids Res, 2019

doi:10.1093/nar/gky1034.

Genome properties in 2019: a new companion database to InterPro for the inference of complete functional attributes.

Richardson LJ, Rawlings ND, Salazar GA, Almeida A, Haft DR, Ducq G, Sutton GG, Finn RD.

Nucleic Acids Res, 2019

doi:10.1093/nar/gky1013.

The Pfam protein families database in 2019.

El-Gebali S, Mistry J, Bateman A, Eddy SR, Luciani A, Potter SC, Qureshi M, Richardson LJ, Salazar GA, Smart A, Sonnhammer ELL, Hirsh L, Paladin L, Piovesan D, Tosatto SCE, Finn RD.

Nucleic Acids Res, 2019

doi:10.1093/nar/gky995.

TreeGrafter: phylogenetic tree-based annotation of proteins with Gene Ontology terms and other annotations.

Tang H, Finn RD, Thomas PD.

Bioinformatics, 2019

doi:10.1093/bioinformatics/bty625.

3DPatch: fast 3D structure visualization with residue conservation.

Jakubec D, Vondrášek J, Finn RD.

Bioinformatics, 2019

doi:10.1093/bioinformatics/bty464.

HMMER: Fast and sensitive sequence similarity searches

Finn R.

2018

doi:10.6019/tol.hmmer-w.2018.00001.1.

Pfam Database: Creating Protein Families

El-Gebali S, Richardson L, Finn R.

2018

doi:10.6019/tol.pfam_fams-t.2018.00001.1.

Repeats in Pfam

El-Gebali S, Richardson L, Finn R.

2018

doi:10.6019/tol.pfam_repeats-t.2018.00001.1.

Eleven quick tips to build a usable REST API for life sciences.

Tarkowska A, Carvalho-Silva D, Cook CE, Turner E, Finn RD, Yates AD.

PLoS Comput Biol, 2018

doi:10.1371/journal.pcbi.1006542.

Corrigendum: Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea.

Bowers RM, Kyrpides NC, Stepanauskas R, Harmon-Smith M, Doud D, Reddy TBK, Schulz F, Jarett J, Rivers AR, Eloe-Fadrosh EA, Tringe SG, Ivanova NN, Copeland A, Clum A, Becraft ED, Malmstrom RR, Birren B, Podar M, Bork P, Weinstock GM, Garrity GM, Dodsworth JA, Yooseph S, Sutton G, Glöckner FO, Gilbert JA, Nelson WC, Hallam SJ, Jungbluth SP, Ettema TJG, Tighe S, Konstantinidis KT, Liu WT, Baker BJ, Rattei T, Eisen JA, Hedlund B, McMahon KD, Fierer N, Knight R, Finn R, Cochrane G, Karsch-Mizrachi I, Tyson GW, Rinke C, Genome Standards Consortium, Lapidus A, Meyer F, Yilmaz P, Parks DH, Eren AM, Schriml L, Banfield JF, Hugenholtz P, Woyke T.

Nat Biotechnol, 2018

doi:10.1038/nbt0718-660a.

Non-Coding RNA Analysis Using the Rfam Database.

Kalvari I, Nawrocki EP, Argasinska J, Quinones-Olvera N, Finn RD, Bateman A, Petrov AI.

Curr Protoc Bioinformatics, 2018

doi:10.1002/cpbi.51.

HMMER web server: 2018 update.

Potter SC, Luciani A, Eddy SR, Park Y, Lopez R, Finn RD.

Nucleic Acids Res, 2018

doi:10.1093/nar/gky448.

Benchmarking taxonomic assignments based on 16S rRNA gene profiling of the microbiota from commonly sampled environments.

Almeida A, Mitchell AL, Tarkowska A, Finn RD.

Gigascience, 2018

doi:10.1093/gigascience/giy054.

Corrigendum: Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea.

Bowers RM, Kyrpides NC, Stepanauskas R, Harmon-Smith M, Doud D, Reddy TBK, Schulz F, Jarett J, Rivers AR, Eloe-Fadrosh EA, Tringe SG, Ivanova NN, Copeland A, Clum A, Becraft ED, Malmstrom RR, Birren B, Podar M, Bork P, Weinstock GM, Garrity GM, Dodsworth JA, Yooseph S, Sutton G, Glöckner FO, Gilbert JA, Nelson WC, Hallam SJ, Jungbluth SP, Ettema TJG, Tighe S, Konstantinidis KT, Liu WT, Baker BJ, Rattei T, Eisen JA, Hedlund B, McMahon KD, Fierer N, Knight R, Finn R, Cochrane G, Karsch-Mizrachi I, Tyson GW, Rinke C, Genome Standards Consortium, Lapidus A, Meyer F, Yilmaz P, Parks DH, Eren AM, Schriml L, Banfield JF, Hugenholtz P, Woyke T.

Nat Biotechnol, 2018

doi:10.1038/nbt0218-196a.

The MEROPS database of proteolytic enzymes, their substrates and inhibitors in 2017 and a comparison with peptidases in the PANTHER database.

Rawlings ND, Barrett AJ, Thomas PD, Huang X, Bateman A, Finn RD.

Nucleic Acids Res, 2018

doi:10.1093/nar/gkx1134.

Rfam 13.0: shifting to a genome-centric resource for non-coding RNA families.

Kalvari I, Argasinska J, Quinones-Olvera N, Nawrocki EP, Rivas E, Eddy SR, Bateman A, Finn RD, Petrov AI.

Nucleic Acids Res, 2018

doi:10.1093/nar/gkx1038.

Ensembl Genomes 2018: an integrated omics infrastructure for non-vertebrate species.

Kersey PJ, Allen JE, Allot A, Barba M, Boddu S, Bolt BJ, Carvalho-Silva D, Christensen M, Davis P, Grabmueller C, Kumar N, Liu Z, Maurel T, Moore B, McDowall MD, Maheswari U, Naamati G, Newman V, Ong CK, Paulini M, Pedro H, Perry E, Russell M, Sparrow H, Tapanari E, Taylor K, Vullo A, Williams G, Zadissia A, Olson A, Stein J, Wei S, Tello-Ruiz M, Ware D, Luciani A, Potter S, Finn RD, Urban M, Hammond-Kosack KE, Bolser DM, De Silva N, Howe KL, Langridge N, Maslen G, Staines DM, Yates A.

Nucleic Acids Res, 2018

doi:10.1093/nar/gkx1011.

EBI Metagenomics in 2017: enriching the analysis of microbial communities, from sequence reads to assemblies.

Mitchell AL, Scheremetjew M, Denise H, Potter S, Tarkowska A, Qureshi M, Salazar GA, Pesseat S, Boland MA, Hunter FMI, Ten Hoopen P, Alako B, Amid C, Wilkinson DJ, Curtis TP, Cochrane G, Finn RD.

Nucleic Acids Res, 2018

doi:10.1093/nar/gkx967.

Identifying accurate metagenome and amplicon software via a meta-analysis of sequence to taxonomy benchmarking studies

Gardner PP, Watson RJ, Morgan XC, Draper JL, Finn RD, Morales SE, Stott MB.

2017

doi:10.1101/202077.

The HMMER Web Server for Protein Sequence Similarity Search.

Prakash A, Jeffryes M, Bateman A, Finn RD.

Curr Protoc Bioinformatics, 2017

doi:10.1002/cpbi.40.

EMBL-EBI, programmatically: take a REST from manual searches

Burke M, Armstrong D, Carvalho-Silva D, Castro L, Cowley A, Finn R, Foix A, Katuri J, Laird M, Lee J, Levchenko M, Lopez R, Nightingale A, Nowotka M, Perry E, Pichler K, Pundir S, Morgan S, Saunders G, Garcia P, Squizzato S.

2017

doi:10.6019/tol.ebiprogrammatically-w.2017.00001.1.

Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea.

Bowers RM, Kyrpides NC, Stepanauskas R, Harmon-Smith M, Doud D, Reddy TBK, Schulz F, Jarett J, Rivers AR, Eloe-Fadrosh EA, Tringe SG, Ivanova NN, Copeland A, Clum A, Becraft ED, Malmstrom RR, Birren B, Podar M, Bork P, Weinstock GM, Garrity GM, Dodsworth JA, Yooseph S, Sutton G, Glöckner FO, Gilbert JA, Nelson WC, Hallam SJ, Jungbluth SP, Ettema TJG, Tighe S, Konstantinidis KT, Liu WT, Baker BJ, Rattei T, Eisen JA, Hedlund B, McMahon KD, Fierer N, Knight R, Finn R, Cochrane G, Karsch-Mizrachi I, Tyson GW, Rinke C, Genome Standards Consortium, Lapidus A, Meyer F, Yilmaz P, Parks DH, Eren AM, Schriml L, Banfield JF, Hugenholtz P, Woyke T.

Nat Biotechnol, 2017

doi:10.1038/nbt.3893.

The metagenomic data life-cycle: standards and best practices.

Ten Hoopen P, Finn RD, Bongo LA, Corre E, Fosso B, Meyer F, Mitchell A, Pelletier E, Pesole G, Santamaria M, Willassen NP, Cochrane G.

Gigascience, 2017

doi:10.1093/gigascience/gix047.

ELIXIR pilot action: Marine metagenomics - towards a domain specific set of sustainable services.

Robertsen EM, Denise H, Mitchell A, Finn RD, Bongo LA, Willassen NP.

F1000Res, 2017

doi:10.12688/f1000research.10443.1.

InterPro in 2017-beyond protein family and domain annotations.

Finn RD, Attwood TK, Babbitt PC, Bateman A, Bork P, Bridge AJ, Chang HY, Dosztányi Z, El-Gebali S, Fraser M, Gough J, Haft D, Holliday GL, Huang H, Huang X, Letunic I, Lopez R, Lu S, Marchler-Bauer A, Mi H, Mistry J, Natale DA, Necci M, Nuka G, Orengo CA, Park Y, Pesseat S, Piovesan D, Potter SC, Rawlings ND, Redaschi N, Richardson L, Rivoire C, Sangrador-Vegas A, Sigrist C, Sillitoe I, Smithers B, Squizzato S, Sutton G, Thanki N, Thomas PD, Tosatto SC, Wu CH, Xenarios I, Yeh LS, Young SY, Mitchell AL.

Nucleic Acids Res, 2017

doi:10.1093/nar/gkw1107.

RNAcentral: a comprehensive database of non-coding RNA sequences.

The RNAcentral Consortium, Petrov AI, Kay SJE, Kalvari I, Howe KL, Gray KA, Bruford EA, Kersey PJ, Cochrane G, Finn RD, Bateman A, Kozomara A, Griffiths-Jones S, Frankish A, Zwieb CW, Lau BY, Williams KP, Chan PP, Lowe TM, Cannone JJ, Gutell R, Machnicka MA, Bujnicki JM, Yoshihama M, Kenmochi N, Chai B, Cole JR, Szymanski M, Karlowski WM, Wood V, Huala E, Berardini TZ, Zhao Y, Chen R, Zhu W, Paraskevopoulou MD, Vlachos IS, Hatzigeorgiou AG, Ma L, Zhang Z, Puetz J, Stadler PF, McDonald D, Basu S, Fey P, Engel SR, Cherry JM, Volders PJ, Mestdagh P, Wower J, Clark MB, Quek XC, Dinger ME.

Nucleic Acids Res, 2017

doi:10.1093/nar/gkw1008.

Pfam: Quick tour

El-Gebali S, Richardson L, Finn R.

2016

doi:10.6019/tol.pfam-qt.2016.00001.1.

Cache Domains That are Homologous to, but Different from PAS Domains Comprise the Largest Superfamily of Extracellular Sensors in Prokaryotes.

Upadhyay AA, Fleetwood AD, Adebali O, Finn RD, Zhulin IB.

PLoS Comput Biol, 2016

doi:10.1371/journal.pcbi.1004862.

GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations.

Sangrador-Vegas A, Mitchell AL, Chang HY, Yong SY, Finn RD.

Database (Oxford), 2016

doi:10.1093/database/baw027.

The European Bioinformatics Institute in 2016: Data growth and integration.

Cook CE, Bergman MT, Finn RD, Cochrane G, Birney E, Apweiler R.

Nucleic Acids Res, 2016

doi:10.1093/nar/gkv1352.

The Pfam protein families database: towards a more sustainable future.

Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, Potter SC, Punta M, Qureshi M, Sangrador-Vegas A, Salazar GA, Tate J, Bateman A.

Nucleic Acids Res, 2016

doi:10.1093/nar/gkv1344.

The Dfam database of repetitive DNA families.

Hubley R, Finn RD, Clements J, Eddy SR, Jones TA, Bao W, Smit AF, Wheeler TJ.

Nucleic Acids Res, 2016

doi:10.1093/nar/gkv1272.

EBI metagenomics in 2016--an expanding and evolving resource for the analysis and archiving of metagenomic data.

Mitchell A, Bucchini F, Cochrane G, Denise H, ten Hoopen P, Fraser M, Pesseat S, Potter S, Scheremetjew M, Sterk P, Finn RD.

Nucleic Acids Res, 2016

doi:10.1093/nar/gkv1195.

HPMCD: the database of human microbial communities from metagenomic datasets and microbial reference genomes.

Forster SC, Browne HP, Kumar N, Hunt M, Denise H, Mitchell A, Finn RD, Lawley TD.

Nucleic Acids Res, 2016

doi:10.1093/nar/gkv1216.

Twenty years of the MEROPS database of proteolytic enzymes, their substrates and inhibitors.

Rawlings ND, Barrett AJ, Finn R.

Nucleic Acids Res, 2016

doi:10.1093/nar/gkv1118.

Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat.

Babbitt PC, Bagos PG, Bairoch A, Bateman A, Chatonnet A, Chen MJ, Craik DJ, Finn RD, Gloriam D, Haft DH, Henrissat B, Holliday GL, Isberg V, Kaas Q, Landsman D, Lenfant N, Manning G, Nagano N, Srinivasan N, O'Donovan C, Pruitt KD, Sowdhamini R, Rawlings ND, Saier MH, Sharman JL, Spedding M, Tsirigos KD, Vastermark A, Vriend G.

Database (Oxford), 2015

doi:10.1093/database/bav063.

HMMER web server: 2015 update.

Finn RD, Clements J, Arndt W, Miller BL, Wheeler TJ, Schreiber F, Bateman A, Eddy SR.

Nucleic Acids Res, 2015

doi:10.1093/nar/gkv397.

Key challenges for the creation and maintenance of specialist protein resources.

Holliday GL, Bairoch A, Bagos PG, Chatonnet A, Craik DJ, Finn RD, Henrissat B, Landsman D, Manning G, Nagano N, O'Donovan C, Pruitt KD, Rawlings ND, Saier M, Sowdhamini R, Spedding M, Srinivasan N, Vriend G, Babbitt PC, Bateman A.

Proteins, 2015

doi:10.1002/prot.24803.

The complexity, challenges and benefits of comparing two transporter classification systems in TCDB and Pfam.

Chiang Z, Vastermark A, Punta M, Coggill PC, Mistry J, Finn RD, Saier MH.

Brief Bioinform, 2015

doi:10.1093/bib/bbu053.

The InterPro protein families database: the classification resource after 15 years.

Mitchell A, Chang HY, Daugherty L, Fraser M, Hunter S, Lopez R, McAnulla C, McMenamin C, Nuka G, Pesseat S, Sangrador-Vegas A, Scheremetjew M, Rato C, Yong SY, Bateman A, Punta M, Attwood TK, Sigrist CJ, Redaschi N, Rivoire C, Xenarios I, Kahn D, Guyot D, Bork P, Letunic I, Gough J, Oates M, Haft D, Huang H, Natale DA, Wu CH, Orengo C, Sillitoe I, Mi H, Thomas PD, Finn RD.

Nucleic Acids Res, 2015

doi:10.1093/nar/gku1243.

Gene Ontology Consortium: going forward.

Gene Ontology Consortium.

Nucleic Acids Res, 2015

doi:10.1093/nar/gku1179.

Rfam 12.0: updates to the RNA families database.

Nawrocki EP, Burge SW, Bateman A, Daub J, Eberhardt RY, Eddy SR, Floden EW, Gardner PP, Jones TA, Tate J, Finn RD.

Nucleic Acids Res, 2015

doi:10.1093/nar/gku1063.

RNAcentral: an international database of ncRNA sequences.

RNAcentral Consortium, Petrov AI, Kay SJE, Gibson R, Kulesha E, Staines D, Bruford EA, Wright MW, Burge S, Finn RD, Kersey PJ, Cochrane G, Bateman A, Griffiths-Jones S, Harrow J, Chan PP, Lowe TM, Zwieb CW, Wower J, Williams KP, Hudson CM, Gutell R, Clark MB, Dinger M, Quek XC, Bujnicki JM, Chua NH, Liu J, Wang H, Skogerbø G, Zhao Y, Chen R, Zhu W, Cole JR, Chai B, Huang HD, Huang HY, Cherry JM, Hatzigeorgiou A, Pruitt KD.

Nucleic Acids Res, 2015

doi:10.1093/nar/gku991.

Structure and computational analysis of a novel protein with metallopeptidase-like and circularly permuted winged-helix-turn-helix domains reveals a possible role in modified polysaccharide biosynthesis.

Das D, Murzin AG, Rawlings ND, Finn RD, Coggill P, Bateman A, Godzik A, Aravind L.

BMC Bioinformatics, 2014

doi:10.1186/1471-2105-15-75.

Skylign: a tool for creating informative, interactive logos representing sequence alignments and profile hidden Markov models.

Wheeler TJ, Clements J, Finn RD.

BMC Bioinformatics, 2014

doi:10.1186/1471-2105-15-7.

iPfam: a database of protein family and domain interactions found in the Protein Data Bank.

Finn RD, Miller BL, Clements J, Bateman A.

Nucleic Acids Res, 2014

doi:10.1093/nar/gkt1210.

Pfam: the protein families database.

Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, Sonnhammer EL, Tate J, Punta M.

Nucleic Acids Res, 2014

doi:10.1093/nar/gkt1223.

The challenge of increasing Pfam coverage of the human proteome.

Mistry J, Coggill P, Eberhardt RY, Deiana A, Giansanti A, Finn RD, Bateman A, Punta M.

Database (Oxford), 2013

doi:10.1093/database/bat023.

The first structure in a family of peptidase inhibitors reveals an unusual Ig-like fold.

Rigden DJ, Xu Q, Chang Y, Eberhardt RY, Finn RD, Rawlings ND.

F1000Res, 2013

doi:10.12688/f1000research.2-154.v2.

Two Pfam protein families characterized by a crystal structure of protein lpg2210 from Legionella pneumophila.

Coggill P, Eberhardt RY, Finn RD, Chang Y, Jaroszewski L, Godzik A, Das D, Xu Q, Axelrod HL, Aravind L, Murzin AG, Bateman A.

BMC Bioinformatics, 2013

doi:10.1186/1471-2105-14-265.

Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions.

Mistry J, Finn RD, Eddy SR, Bateman A, Punta M.

Nucleic Acids Res, 2013

doi:10.1093/nar/gkt263.

Dfam: a database of repetitive DNA based on profile hidden Markov models.

Wheeler TJ, Clements J, Eddy SR, Hubley R, Jones TA, Jurka J, Smit AF, Finn RD.

Nucleic Acids Res, 2013

doi:10.1093/nar/gks1265.

Recent advances in biocuration: meeting report from the Fifth International Biocuration Conference.

Gaudet P, Arighi C, Bastian F, Bateman A, Blake JA, Cherry MJ, D'Eustachio P, Finn R, Giglio M, Hirschman L, Kania R, Klimke W, Martin MJ, Karsch-Mizrachi I, Munoz-Torres M, Natale D, O'Donovan C, Ouellette F, Pruitt KD, Robinson-Rechavi M, Sansone SA, Schofield P, Sutton G, Van Auken K, Vasudevan S, Wu C, Young J, Mazumder R.

Database (Oxford), 2012

doi:10.1093/database/bas036.

InterPro in 2011: new developments in the family and domain prediction database

Nucleic Acids Res, 2012

doi:10.1093/nar/gks456.

Making your database available through Wikipedia: the pros and cons.

Finn RD, Gardner PP, Bateman A.

Nucleic Acids Res, 2012

doi:10.1093/nar/gkr1195.

The Pfam protein families database.

Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, Pang N, Forslund K, Ceric G, Clements J, Heger A, Holm L, Sonnhammer EL, Eddy SR, Bateman A, Finn RD.

Nucleic Acids Res, 2012

doi:10.1093/nar/gkr1065.

InterPro in 2011: new developments in the family and domain prediction database.

Hunter S, Jones P, Mitchell A, Apweiler R, Attwood TK, Bateman A, Bernard T, Binns D, Bork P, Burge S, de Castro E, Coggill P, Corbett M, Das U, Daugherty L, Duquenne L, Finn RD, Fraser M, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, McMenamin C, Mi H, Mutowo-Muellenet P, Mulder N, Natale D, Orengo C, Pesseat S, Punta M, Quinn AF, Rivoire C, Sangrador-Vegas A, Selengut JD, Sigrist CJ, Scheremetjew M, Tate J, Thimmajanarthanan M, Thomas PD, Wu CH, Yeats C, Yong SY.

Nucleic Acids Res, 2012

doi:10.1093/nar/gkr948.

HMMER web server: interactive sequence similarity searching.

Finn RD, Clements J, Eddy SR.

Nucleic Acids Res, 2011

doi:10.1093/nar/gkr367.

Clustered coding variants in the glutamate receptor complexes of individuals with schizophrenia and bipolar disorder.

Frank RA, McRae AF, Pocklington AJ, van de Lagemaat LN, Navarro P, Croning MD, Komiyama NH, Bradley SJ, Challiss RA, Armstrong JD, Finn RD, Malloy MP, MacLean AW, Harris SE, Starr JM, Bhaskar SS, Howard EK, Hunt SE, Coffey AJ, Ranganath V, Deloukas P, Rogers J, Muir WJ, Deary IJ, Blackwood DH, Visscher PM, Grant SG.

PLoS One, 2011

doi:10.1371/journal.pone.0019011.

Representative proteomes: a stable, scalable and unbiased proteome set for sequence analysis and functional annotation.

Chen C, Natale DA, Finn RD, Huang H, Zhang J, Wu CH, Mazumder R.

PLoS One, 2011

doi:10.1371/journal.pone.0018910.

Rfam: Wikipedia, clans and the "decimal" release.

Gardner PP, Daub J, Tate J, Moore BL, Osuch IH, Griffiths-Jones S, Finn RD, Nawrocki EP, Kolbe DL, Eddy SR, Bateman A.

Nucleic Acids Res, 2011

doi:10.1093/nar/gkq1129.

The structure of BVU2987 from Bacteroides vulgatus reveals a superfamily of bacterial periplasmic proteins with possible inhibitory function.

Das D, Finn RD, Carlton D, Miller MD, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Chen C, Chiu HJ, Chiu M, Clayton T, Deller MC, Duan L, Ellrott K, Ernst D, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Marciano D, McMullan D, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Rife CL, Sefcovic N, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, Wilson IA.

Acta Crystallogr Sect F Struct Biol Cryst Commun, 2010

doi:10.1107/s1744309109046788.

DUFs: families in search of function.

Bateman A, Coggill P, Finn RD.

Acta Crystallogr Sect F Struct Biol Cryst Commun, 2010

doi:10.1107/s1744309110001685.

The crystal structure of a bacterial Sufu-like protein defines a novel group of bacterial proteins that are similar to the N-terminal domain of human Sufu.

Das D, Finn RD, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Cai X, Carlton D, Chen C, Chiu HJ, Chiu M, Clayton T, Deller MC, Duan L, Ellrott K, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Lam WW, Marciano D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Xu Q, Yeh A, Zhou J, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, Wilson IA.

Protein Sci, 2010

doi:10.1002/pro.497.

The Pfam protein families database.

Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K, Holm L, Sonnhammer EL, Eddy SR, Bateman A.

Nucleic Acids Res, 2010

doi:10.1093/nar/gkp985.

Bacterial pleckstrin homology domains: a prokaryotic origin for the PH domain.

Xu Q, Bateman A, Finn RD, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Carlton D, Chen C, Chiu HJ, Chiu M, Clayton T, Das D, Deller MC, Duan L, Ellrott K, Ernst D, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Marciano D, McMullan D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Rife CL, Sefcovic N, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, Wilson IA.

J Mol Biol, 2010

doi:10.1016/j.jmb.2009.11.006.

The structure of pyogenecin immunity protein, a novel bacteriocin-like immunity protein from Streptococcus pyogenes.

Chang C, Coggill P, Bateman A, Finn RD, Cymborowski M, Otwinowski Z, Minor W, Volkart L, Joachimiak A.

BMC Struct Biol, 2009

doi:10.1186/1472-6807-9-75.

DASMI: exchanging, annotating and assessing molecular interaction data.

Blankenburg H, Finn RD, Prlić A, Jenkinson AM, Ramírez F, Emig D, Schelhorn SE, Büch J, Lengauer T, Albrecht M.

Bioinformatics, 2009

doi:10.1093/bioinformatics/btp142.

Phospholipid scramblases and Tubby-like proteins belong to a new superfamily of membrane tethered transcription factors.

Bateman A, Finn RD, Sims PJ, Wiedmer T, Biegert A, Söding J.

Bioinformatics, 2009

doi:10.1093/bioinformatics/btn595.

Rfam: updates to the RNA families database.

Gardner PP, Daub J, Tate JG, Nawrocki EP, Kolbe DL, Lindgreen S, Wilkinson AC, Finn RD, Griffiths-Jones S, Eddy SR, Bateman A.

Nucleic Acids Res, 2009

doi:10.1093/nar/gkn766.

InterPro: the integrative protein signature database.

Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJ, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C.

Nucleic Acids Res, 2009

doi:10.1093/nar/gkn785.

Modifier effects between regulatory and protein-coding variation.

Dimas AS, Stranger BE, Beazley C, Finn RD, Ingle CE, Forrest MS, Ritchie ME, Deloukas P, Tavaré S, Dermitzakis ET.

PLoS Genet, 2008

doi:10.1371/journal.pgen.1000244.

The genome of the simian and human malaria parasite Plasmodium knowlesi.

Pain A, Böhme U, Berry AE, Mungall K, Finn RD, Jackson AP, Mourier T, Mistry J, Pasini EM, Aslett MA, Balasubrammaniam S, Borgwardt K, Brooks K, Carret C, Carver TJ, Cherevach I, Chillingworth T, Clark TG, Galinski MR, Hall N, Harper D, Harris D, Hauser H, Ivens A, Janssen CS, Keane T, Larke N, Lapp S, Marti M, Moule S, Meyer IM, Ormond D, Peters N, Sanders M, Sanders S, Sargeant TJ, Simmonds M, Smith F, Squares R, Thurston S, Tivey AR, Walker D, White B, Zuiderwijk E, Churcher C, Quail MA, Cowman AF, Turner CM, Rajandream MA, Kocken CH, Thomas AW, Newbold CI, Barrell BG, Berriman M.

Nature, 2008

doi:10.1038/nature07306.

Identifying protein domains with the Pfam database.

Coggill P, Finn RD, Bateman A.

Curr Protoc Bioinformatics, 2008

doi:10.1002/0471250953.bi0205s23.

Integrating biological data--the Distributed Annotation System.

Jenkinson AM, Albrecht M, Birney E, Blankenburg H, Down T, Finn RD, Hermjakob H, Hubbard TJ, Jimenez RC, Jones P, Kähäri A, Kulesha E, Macías JR, Reeves GA, Prlić A.

BMC Bioinformatics, 2008

doi:10.1186/1471-2105-9-s8-s3.

Experience using web services for biological sequence analysis.

Stockinger H, Attwood T, Chohan SN, Côté R, Cudré-Mauroux P, Falquet L, Fernandes P, Finn RD, Hupponen T, Korpelainen E, Labarga A, Laugraud A, Lima T, Pafilis E, Pagni M, Pettifer S, Phan I, Rahman N.

Brief Bioinform, 2008

doi:10.1093/bib/bbn029.

Pfam 10 years on: 10,000 families and still growing.

Sammut SJ, Finn RD, Bateman A.

Brief Bioinform, 2008

doi:10.1093/bib/bbn010.

The Pfam protein families database.

Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR, Ceric G, Forslund K, Eddy SR, Sonnhammer EL, Bateman A.

Nucleic Acids Res, 2008

doi:10.1093/nar/gkm960.

Pfam: a domain-centric method for analyzing proteins and proteomes.

Mistry J, Finn R.

Methods Mol Biol, 2007

doi:10.1007/978-1-59745-515-2_4.

Integrating sequence and structural biology with DAS.

Prlić A, Down TA, Kulesha E, Finn RD, Kähäri A, Hubbard TJ.

BMC Bioinformatics, 2007

doi:10.1186/1471-2105-8-333.

Predicting active site residue annotations in the Pfam database.

Mistry J, Bateman A, Finn RD.

BMC Bioinformatics, 2007

doi:10.1186/1471-2105-8-298.

SCOOP: a simple method for identification of novel protein superfamily relationships.

Bateman A, Finn RD.

Bioinformatics, 2007

doi:10.1093/bioinformatics/btm034.

ProServer: a simple, extensible Perl DAS server.

Finn RD, Stalker JW, Jackson DK, Kulesha E, Clements J, Pettett R.

Bioinformatics, 2007

doi:10.1093/bioinformatics/btl650.

New developments in the InterPro database.

Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Buillard V, Cerutti L, Copley R, Courcelle E, Das U, Daugherty L, Dibley M, Finn R, Fleischmann W, Gough J, Haft D, Hulo N, Hunter S, Kahn D, Kanapin A, Kejariwal A, Labarga A, Langendijk-Genevaux PS, Lonsdale D, Lopez R, Letunic I, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Nikolskaya AN, Orchard S, Orengo C, Petryszak R, Selengut JD, Sigrist CJ, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C.

Nucleic Acids Res, 2007

doi:10.1093/nar/gkl841.

Pfam: clans, web tools and services.

Finn RD, Mistry J, Schuster-Böckler B, Griffiths-Jones S, Hollich V, Lassmann T, Moxon S, Marshall M, Khanna A, Durbin R, Eddy SR, Sonnhammer EL, Bateman A.

Nucleic Acids Res, 2006

doi:10.1093/nar/gkj149.

Conformational changes of Escherichia coli sigma54-RNA-polymerase upon closed-promoter complex formation.

Ray P, Hall RJ, Finn RD, Chen S, Patwardhan A, Buck M, van Heel M.

J Mol Biol, 2005

doi:10.1016/j.jmb.2005.09.057.

The second paradigm for activation of transcription.

Wigneshweraraj SR, Burrows PC, Bordes P, Schumacher J, Rappas M, Finn RD, Cannon WV, Zhang X, Buck M.

Prog Nucleic Acid Res Mol Biol, 2005

doi:10.1016/s0079-6603(04)79007-8.

iPfam: visualization of protein-protein interactions in PDB at domain and amino acid resolutions.

Finn RD, Marshall M, Bateman A.

Bioinformatics, 2005

doi:10.1093/bioinformatics/bti011.

The Pfam protein families database.

Bateman A, Coin L, Durbin R, Finn RD, Hollich V, Griffiths-Jones S, Khanna A, Marshall M, Moxon S, Sonnhammer EL, Studholme DJ, Yeats C, Eddy SR.

Nucleic Acids Res, 2004

doi:10.1093/nar/gkh121.

Identifying protein domains with the Pfam database.

Finn R, Griffiths-Jones S, Bateman A.

Curr Protoc Bioinformatics, 2003

doi:10.1002/0471250953.bi0205s01.

Determination of Escherichia coli RNA polymerase structure by single particle cryoelectron microscopy.

Ray P, Klaholz BP, Finn RD, Orlova EV, Burrows PC, Gowen B, Buck M, van Heel M.

Methods Enzymol, 2003

doi:10.1016/s0076-6879(03)70003-2.

The PASTA domain: a beta-lactam-binding domain.

Yeats C, Finn RD, Bateman A.

Trends Biochem Sci, 2002

doi:10.1016/s0968-0004(02)02164-3.

Single-particle electron cryo-microscopy: towards atomic resolution.

van Heel M, Gowen B, Matadeen R, Orlova EV, Finn R, Pape T, Cohen D, Stark H, Schmidt R, Schatz M, Patwardhan A.

Q Rev Biophys, 2000

doi:10.1017/s0033583500003644.

Escherichia coli RNA polymerase core and holoenzyme structures.

Finn RD, Orlova EV, Gowen B, Buck M, van Heel M.

EMBO J, 2000

doi:10.1093/emboj/19.24.6833.

The C-terminal 12 amino acids of sigma(N) are required for structure and function.

Studholme DJ, Finn RD, Chaney MK, Buck M.

Arch Biochem Biophys, 1999

doi:10.1006/abbi.1999.1426.