Pilot phase completion yields valuable insights into nature of human genetic variation
The 1000 Genomes Project, a major international collaboration to build a detailed map of human genetic variation, has completed its pilot phase. The results are now published in the journal Nature and freely available through the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI) and the US National Center for Biotechnology Information (NCBI). The study provides valuable insights into the nature of human genetic variation and will underpin the next phase of human genetic research.
Since the human genome was sequenced, over 1000 regions on the genome have been associated with traits such as disease susceptibility, response to medication or physical characteristics. But recent technological advances have highlighted important gaps in the databases that contain all this genetic information. To fill the gaps, the 1000 Genomes Project has undertaken a thorough and systematic investigation of genetic variation between individuals and populations.
A team of researchers at the EMBL-EBI, led by Paul Flicek, helped to determine the best strategy for characterising more than 95% of the genetic variants that can be found in 1% or more of three different geographic population groups (Europeans, East Asians and West Africans). Such information could shed light on how a person’s genetic makeup may contribute to specific illnesses.
The project partners, working in nine different centres, plan to sequence the genomes of more than 2500 people from five large population groups by the project’s completion in 2012. Considering that one person’s genome contains around 3 billion DNA base pairs, that’s a lot of data. In this pilot phase alone, a total of 4.9 terabases of DNA sequence were generated (1 terabase is 1000 gigabases, about the size of 300 human genomes).
“The amount of information delivered by this first stage of the project is remarkable,” said Richard Durbin of the Sanger Institute in the UK. “In less than two years, we identified 15 million single-letter changes, 1 million small deletions or insertions and 20,000 larger variants. The majority of these variants – around 8 million – had never been seen before. This is the largest catalogue of its kind, and having it in the public domain will help maximise the efficiency of human genetics research.”
Thanks to innovations in DNA sequencing technology, genomic data are being generated at rates previously unimaginable to life scientists. This poses significant challenges not only for storing the information and moving it among partners, but also for analysing it. The EMBL-EBI group developed a robust new computing platform and several software innovations that made this pioneering project possible and that will pave the way for other sequencing projects on an even larger scale.
“Having a systematic catalogue of human variation changes the way we can study human genetics, much in the same way as having a catalogue of human genes did,” said Dr Flicek. “Among other things, it also gives us a platform for analysing the connections between genes and an individual’s disease risks.” The results of the collaboration extend well beyond the scope of the 1000 Genomes Project, he said, and represent the beginning of a new era in human genetics using genome-wide sequencing.
“This work shows the power of very recent advances in sequencing to generate maps of genetic variation that bridge different scales,” added Jan Korbel from EMBL in Heidelberg, Germany, who helped analyse the larger variants. “It’s an exciting first step, which paves the way for looking at the relationship between genetic variations and diseases like cancer.”
All of the variants described in the pilot study can now be tested for association with any given disease or trait (e.g. susceptibility to addictive behaviour such as smoking). Indeed, the data are already being used to inform a number of medical studies. The results offer a much deeper, more uniform picture of human genetic variation than was previously available, along with new insights into functional variation, genetic association and natural selection in humans.
The study was financed by the Wellcome Trust and several national funding bodies including those in China and Germany as well as the US National Institutes of Health (NIH).
Data available from www.1000genomes.org.
The 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature, published online on 28 October 2010. DOI: 10.1038/nature09534.