This treasure hunt was designed to introduce participants to bioinformatics approaches that can be used to identify and analyse DNA and protein sequences. In the activity, we will analyse the DNA and amino acid sequences of the Green Fluorescent Protein (GFP) and closely related molecules using web-based databases and analysis tools. Biological data (for example, DNA and protein sequences or structural information) are stored and curated in databases that are made available to scientists. Because most of these data are produced with public money, they are also publicly available. Similarly, bioinformatics tools are designed by scientists funded by public money and are therefore available to other scientists and the public. In this activity, we will use such databases and tools for our analysis.
Astex Viewer requires Java to be installed on your computer and enabled in your web browser. Java can be installed for free via java.com, which also provides information on how to enable Java in your web browser.
ENA (European Nucleotide Archive)
Database providing a comprehensive record of the world’s nucleotide sequencing information, covering raw sequencing data, sequence assembly information and functional annotation. ENA is made up of a number of distinct databases, including EMBL-Bank, the Sequence Read Archive (SRA) and the Trace Archive.
MUSCLE (Multiple Sequence Comparison by Log-Expectation)
Tool to align and compare multiple sequences, particularly suitable for amino acid sequences.
EMBOSS Transeq (European Molecular Biology Open Software Suite Transeq)
Tool which translates nucleic acid sequences to the corresponding peptide sequences.
Protein Data Bank in Europe (PDBe)
Database containing information on the structure of biological macromolecules.
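To make the translation step performed by a tool such as EMBOSS Transeq more concrete, here is a minimal Python sketch of how a DNA sequence can be read three bases at a time and converted into amino acids. It is not the Transeq implementation. The function name `translate`, the `frame` parameter and the short input sequence are assumptions chosen for illustration, and the sketch uses only the standard genetic code.

```python
from itertools import product

# Standard genetic code, with amino acids listed in T, C, A, G codon order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=1):
    """Translate a DNA (or RNA) string into one-letter amino acid codes.

    `frame` (hypothetical parameter) is 1, 2 or 3 and selects the forward
    reading frame. Codons containing unknown characters become "X".
    """
    # Normalise: drop whitespace, use upper case, treat RNA "U" as "T".
    seq = "".join(dna.split()).upper().replace("U", "T")
    start = frame - 1
    # Step through the sequence in complete triplets only.
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    example = "ATGAGTAAAGGAGAAGAACTTTTC"  # short illustrative input
    print(translate(example))
```

Shifting `frame` to 2 or 3 starts reading one or two bases later, which is why the same DNA sequence can give three different peptide sequences in the forward direction.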
Glossary: check out the ELLS Glossary for a growing number of bioinformatics-related terms.
The navigation menu below shows the individual steps of the activity. Start the treasure hunt by clicking on “GFP treasure hunt introduction” in that menu.
Topic area: Bioinformatics, Genome biology
Age group: 16-19
Author: Philipp Gebhardt