Biological researchers require two crucial sources of information: scientific literature published in peer-reviewed journals, and databases storing key biological data such as DNA and protein sequences, the functions and structures of molecules, and microarray data. Tools that integrate these two sources of information are desperately sought.
EMBL-EBI scientists have developed CiteXplore, a tool that links electronic literature resources to bioinformatics databases, to fulfil this need. It integrates abstracts from various resources including the US National Library of Medicine’s MEDLINE database of abstracts from peer-reviewed biomedical journals, biological abstracts from patent applications from the European Patent Office, and Chinese Biological Abstracts from the Shanghai Information Center for Life Sciences, Chinese Academy of Sciences. From these abstracts CiteXplore links to full-text articles at various locations such as PubMedCentral and publisher websites.
CiteXplore also provides a direct link between the scientific literature and the EMBL-EBI’s biological databases. “When you are reading an abstract describing a specific gene or protein, typically you want more information on it, for example its sequence or its function, as well as easy access to the full paper,” says Peter Stoehr, who coordinates CiteXplore. CiteXplore uses powerful text-mining tools developed by EMBL-EBI researchers to link literature and databases automatically, so that at the touch of a button the biological terms are identified in the text and you can call up the record of the molecule that you are looking for.
In future, the range of literature resources hosted in CiteXplore will be extended to improve coverage of other domains such as plant science and the agricultural and food sciences. CiteXplore will also be integrated with UK PubMedCentral, a recently launched project led by a consortium comprising the British Library, the University of Manchester and EMBL-EBI.