Science Education

Formerly known as European Learning Laboratory for the Life Sciences

Our inspiring educational experiences share the scientific discoveries of EMBL with young learners aged 10-19 years and teachers in Europe and beyond. We belong to EMBL’s Science Education and Public Engagement office.


Exploring the evolution of light-sensitive proteins


In this activity, we use bioinformatics approaches to learn about the molecular evolution of genes. We first use BLAST to identify an “unknown” protein and find basic information about its biological function. To gain insight into the evolution of this protein family, we then align multiple homologous protein sequences and explore which regions are conserved and which vary, in order to identify functionally important residues. Next, we construct a phylogenetic tree to understand how homologues, paralogues and orthologues are related and to investigate the evolutionary relationships within the protein family. The final part of the activity looks at the three-dimensional structure of a selected member of the protein family.
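
To make the idea of conservation and variability more concrete, here is a minimal Python sketch that scores each column of a small alignment. The four sequences and their names are invented for illustration (they are not real light-sensitive proteins), and the function name `column_conservation` is our own. Alignment viewers calculate conservation with more refined measures, so treat this only as a rough picture of the principle.

```python
from collections import Counter

# Toy, made-up aligned sequences (hypothetical, not real protein data).
# '-' marks a gap introduced by the alignment.
alignment = {
    "seq_A": "MKTAYLAK-F",
    "seq_B": "MKTSYLAKQF",
    "seq_C": "MRTAYIAKQF",
    "seq_D": "MKTAYLGK-F",
}


def column_conservation(seqs):
    """For each column, return (most common residue, fraction of sequences with it)."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    result = []
    for column in zip(*seqs):
        residues = [r for r in column if r != "-"]  # ignore gaps when counting
        if not residues:
            result.append(("-", 0.0))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # Divide by the number of sequences, so gaps lower the score.
        result.append((residue, count / len(column)))
    return result


conservation = column_conservation(list(alignment.values()))

# Positions (counted from 1) where every sequence carries the same residue.
fully_conserved = [i + 1 for i, (_, f) in enumerate(conservation) if f == 1.0]
print(fully_conserved)
```

Positions that stay identical across many distantly related sequences are good candidates for functionally important residues, which is the reasoning used later in the activity.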

Technical requirements

This activity has been designed for Windows-based operating systems. Mac users are likely to encounter some Java compatibility issues when using JalView and Astex Viewer. The recommended web browser to use during the activity is Mozilla Firefox.
JalView and Astex Viewer require Java to be installed on your computer and enabled in your web browser. Java can be installed for free via java.com. Information on how to enable Java in your web browser can be found here.

Bioinformatics tools

MUSCLE (Multiple Sequence Comparison by Log-Expectation)

Tool to align and compare multiple sequences, particularly suitable for amino acid sequences.
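
Once sequences are aligned, a simple way to compare them is to count how many aligned positions are identical. The Python sketch below illustrates this with three invented sequences. The names, the sequences and the choice to skip columns containing a gap are all assumptions made for this example, and the code does not reproduce how MUSCLE itself works.

```python
from itertools import combinations

# Toy, made-up aligned sequences (hypothetical); '-' marks a gap.
aligned = {
    "seq_A": "MKTAYLAK-F",
    "seq_B": "MKTSYLAKQF",
    "seq_C": "MRTAYIAKQF",
}


def percent_identity(s1, s2):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(s1) != len(s2):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(s1, s2) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Compare every pair of sequences once.
identity = {
    (n1, n2): percent_identity(aligned[n1], aligned[n2])
    for n1, n2 in combinations(aligned, 2)
}

for (n1, n2), value in identity.items():
    print(f"{n1} vs {n2}: {value:.1f}% identical")
```

Pairs with a higher percentage identity are, as a first approximation, more closely related. Distance-based methods for building phylogenetic trees start from pairwise comparisons of this kind.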

Protein BLAST (Basic Local Alignment Search Tool)

Tool that searches protein databases using a protein query. It identifies regions of local similarity between protein sequences and can thus be used to identify an unknown protein by matching it to entries in the database.
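
What “local similarity” means can be illustrated with the classic Smith-Waterman algorithm, sketched below in Python. This is not how BLAST is implemented: BLAST uses faster heuristics and amino acid substitution matrices, whereas this sketch checks all possibilities and uses simple made-up scores (match, mismatch and gap values chosen for illustration). The two sequences are also invented.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman: best score of any local alignment between a and b."""
    best = 0
    previous = [0] * (len(b) + 1)  # scores for the previous row of the table
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            same = a[i - 1] == b[j - 1]
            diagonal = previous[j - 1] + (match if same else mismatch)
            # A score never drops below 0, so an alignment can start anywhere.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


# Two hypothetical sequences that differ everywhere except for a shared "WYLK" stretch.
query = "AAAWYLKAAA"
subject = "GGGWYLKGGG"
print(local_alignment_score(query, subject))
```

The score comes entirely from the short shared region, even though the rest of the two sequences does not match. This is why a local search can detect a conserved domain inside otherwise different proteins.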


UniProt (Universal Protein Resource)

Catalogue of protein information, including protein sequences and functions.

Protein Data Bank in Europe (PDBe) 

Database containing information on the structure of biological macromolecules.


Check out the ELLS Glossary for a growing number of bioinformatics-related terms.


Topic area:  Evolutionary biology, Bioinformatics

Age group:  16-19

Authors: Pavel Vopalensky, Eva Haas