These releases are huge in many respects, so it was difficult to decide which news to put first! Let’s start with some exciting news from our annotators.

Mouse Genome Annotation Milestone

Ensembl/GENCODE annotators have completed a first-pass ‘walk’ across the entire reference mouse genome that started in 2012, investigating the sequence, aligned data and computational predictions for each BAC clone in turn. This is the GENCODE M21 gene set.

Having completed the first pass, we are now targeting specific loci, for example to identify unannotated protein-coding and lncRNA genes, or alternatively spliced transcripts, and to reassess older protein-coding gene annotation in the light of current data.

Release of Ensembl-RefSeq MANE Select v0.5 Transcripts

Our new joint initiative with the NCBI – the Matched Annotation from the NCBI and EMBL-EBI (MANE) project – aims to define a genome-wide transcript set that is matched between RefSeq and Ensembl/GENCODE (MANE transcripts).

We are releasing phase 1, which aims to identify one well-supported transcript for every protein-coding locus in the human genome (the MANE Select set). This first set, versioned 0.5, contains a MANE Select transcript for 53% of human protein-coding genes.

If you want to learn more about this transcript set, check out our previous blog or watch our recorded webinar.

New human GENCODE Gene Set

We have updated the human gene set to GENCODE 30.

Joint REST Server for Ensembl and Ensembl Genomes, and Changes to the FTP Directory Layout

We are in the process of combining the databases for Ensembl and Ensembl Genomes.

As an important step towards this aim, we have merged the Ensembl and Ensembl Genomes REST servers into a single server (rest.ensembl.org) and retired rest.ensemblgenomes.org. As part of the merge, we have changed the comparative genomics (compara), genomes and info/species endpoints. Don’t worry too much though – simply replacing rest.ensemblgenomes.org with rest.ensembl.org in your REST call should work as before in most cases. If it doesn’t, please check the details in our blog outlining the changes.
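If you have scripts with the old host name hard-coded, the swap described above can be sketched in a few lines of Python. This is only an illustration: the helper name `migrate_rest_url` and the example lookup URL are our own assumptions, and calls to the changed compara, genomes and info/species endpoints may still need manual adjustment.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "rest.ensemblgenomes.org"  # retired server
NEW_HOST = "rest.ensembl.org"         # merged server


def migrate_rest_url(url: str) -> str:
    """Point a REST URL at the merged server (hypothetical helper).

    Only the host name is swapped; the path and query string are kept
    unchanged. URLs for any other host are returned as they are.
    """
    parts = urlsplit(url)
    if parts.hostname != OLD_HOST:
        return url
    # Keep an explicit port if the original URL had one.
    netloc = NEW_HOST if parts.port is None else f"{NEW_HOST}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    # Illustrative URL only; check that the endpoint you use is unchanged.
    old = "https://rest.ensemblgenomes.org/lookup/id/AT3G52430?content-type=application/json"
    print(migrate_rest_url(old))
```

The function does not contact either server, so it is safe to run over a list of saved URLs before testing them one by one.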

In a similar move to ensure consistency between Ensembl and Ensembl Genomes, we made changes to the structure of the Ensembl Genomes FTP directory layout. These affect the ‘gvf’, ‘vcf’ and ‘vep’ directories as well as the whole genome alignment files. We have provided the details of all changes in another blog.

New Genomes

Tweet tweet tweet! Have you heard? This spring release brings you lots of bird genomes, including from three kiwis.

But that’s not everything – we have many other new vertebrate genomes too. We are particularly pleased to bring you the annotated genome of Lonesome George, the last known individual of the Abingdon island giant tortoises. In his final years of life, before he sadly died in 2012, he was known as the rarest creature in the world.

Here’s the full list of new genomes in this release:

Birds:

Reptiles:

Primates:

Rodents:

Other mammals:

New Assemblies and Annotation

In addition to the new genomes, we have updated the assembly and annotation of four vertebrate and two plant species:

New Interface for Configuration of Regulation Tracks

The Regulatory Builds for both human and mouse have been updated within the past year, in Ensembl 95 and 93, respectively. We now have data for 123 human and 79 mouse cells/tissues. With this much data, our previous interface for configuring regulation tracks became difficult to use and, importantly, would not be suitable for the data we expect in the future.

That’s why we’re introducing a new interface in this release! It allows you to select the cell/tissue and the data you would like to see with a few clicks.

You can access the interface on the Regulation tab, e.g. here. Click on the ‘Details by cell type’ icon at the top, then the ‘Configure Cell/Tissue’ button.

You can also access it on the Location or Gene tab, e.g. here. Click on the ‘Configure this page’ button, then on ‘Features by Cell/Tissues’ in the pop-up window.

Our short YouTube video shows you how to use our new interface for configuration of regulation tracks.

Updates on Variation Data and Displays

This release brings variation data for Chlorocebus sabaeus (Vervet). We added 31,779 markers from the 35K Axiom SNP array to Triticum aestivum (Bread wheat). This SNP array is widely used by breeders for marker-assisted selection, so adding this variation data to the IWGSC RefSeq v1.0 wheat assembly is important. We have also added a polyploid view for Triticum dicoccoides (Emmer Zavitan wheat). At the same time, we will discontinue the Drosophila melanogaster (Fruitfly) variation data in Ensembl Metazoa.

The Variant Effect Predictor (VEP) now provides additional phenotype annotations, both via the web interface and the REST server. The web tool also shows the location of a variant on relevant 3D protein structures from PDBe for human and mouse, where these models are available. The VEP and the browser now provide gnomAD version 2.1 frequency data, with an improved mapping to GRCh38. 

Finally, the Variant Recoder will support the SPDI genomic format, and variant pages in the browser display GERP scores for all vertebrate species and CADD scores for human, to provide an indication of how tolerant a locus is to change.
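If you have not met SPDI before, it writes a variant as four colon-separated fields: sequence, position, deletion and insertion. The sketch below splits such a string into its parts. It assumes the NCBI convention of a 0-based position and only handles a numeric position; the helper names and the example variant are made up for illustration and are not part of the Variant Recoder.

```python
from typing import NamedTuple


class Spdi(NamedTuple):
    sequence: str   # sequence identifier, e.g. a RefSeq accession
    position: int   # assumed 0-based, following the NCBI convention
    deletion: str   # deleted sequence (or its length, kept as text)
    insertion: str  # inserted sequence


def parse_spdi(text: str) -> Spdi:
    """Split an SPDI string into its four fields (illustrative helper)."""
    fields = text.strip().split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(fields)}")
    sequence, position, deletion, insertion = fields
    if not sequence:
        raise ValueError("sequence identifier is empty")
    if not position.isdigit():
        raise ValueError(f"position is not a non-negative integer: {position!r}")
    return Spdi(sequence, int(position), deletion, insertion)


if __name__ == "__main__":
    # Made-up variant, for illustration only.
    print(parse_spdi("NC_000001.11:99:A:G"))
```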

Other Updates

Find out more

If you would like to learn more about Ensembl 96 and Ensembl Genomes 43, watch a guided tour or put questions to our team, please register for the release webinar on Wednesday 17th April 2019 at 16:00 BST.
