European Learning Laboratory for the Life Sciences

Our inspiring educational experiences share the scientific discoveries of EMBL with young learners aged 10-19 years and teachers in Europe and beyond.


Part 1: Search for protein identity


Imagine we have cloned a new gene and have just received the sequencing results. We translated the nucleotide sequence into an amino acid sequence, but we still know nothing about the function of the gene. We will therefore search a database of known protein sequences for very similar sequences, to see whether our protein resembles a known protein and what that protein does.
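The translation step mentioned above can be sketched in a few lines of Python. This is only an illustration, assuming the standard genetic code and a reading frame that starts at the first base. The function name `translate` and the short DNA string are made up for this example and are not part of the activity.

```python
# Standard genetic code, written compactly: the 64 codons are ordered
# by first, second and third base, each running through T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon.

    Assumes the reading frame starts at the first base (an assumption made
    for this sketch). Codons with unexpected letters become "X".
    """
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if amino_acid == "*":  # "*" marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example: ATG GCC AAG TGA -> Met, Ala, Lys, stop
print(translate("ATGGCCAAGTGA"))  # prints MAK
```

Real sequencing results usually need extra care, for example choosing the correct reading frame and strand, which this sketch does not attempt.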

To start the activity, just follow the instructions in the “Your Task” tab and try to answer the activity questions.

Your task

Proceed as described below:

1.   The “Sequence” tab contains the amino acid sequence of the unknown protein “Bovine_Protein”.
2.   Copy the sequence.
3.   Follow the instructions in the “Protein BLAST” tab to search for the identity of the protein, then try to answer the questions in the “Questions” tab.


Your input sequence:


Protein BLAST

1.   Open the Protein BLAST (Basic Local Alignment Search Tool) search below.
2.   Paste your amino acid sequence into the query box, including the header line (the greater-than symbol “>” followed by the sequence name). Enter a descriptive job title (such as “Unknown Protein”). In the “Parameters” field, ensure that “blastp” is selected. Run the search by clicking “Submit”.
3.   Using the results table, try to answer the task questions.
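A sequence that starts with a greater-than symbol and a name, as described in step 2, is in what is commonly called FASTA format. The short Python sketch below shows how such text splits into a name and a sequence. The function name `parse_fasta` and the example record are invented for illustration; the example is not the “Bovine_Protein” sequence.

```python
def parse_fasta(text):
    """Return a dict mapping each sequence name to its sequence.

    A line starting with ">" begins a new record; the first word after
    ">" is taken as the name. All following lines belong to that record.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip empty lines
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line)
    # Join the sequence lines of each record into one string.
    return {n: "".join(parts) for n, parts in records.items()}


# Made-up record for illustration only.
example = """>Example_Protein
MKTAYIAKQR
QISFVKSHFS
"""
print(parse_fasta(example))
```

If the header line is left out when pasting, the search tool has no name to show for the query, which is why the instructions ask you to copy it too.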


Questions

1.   Which protein gives the best hit in the protein database?
2.   What percentage of identity does your “unknown protein” share with the best hit?
3.   What is the known biological function of the best hit? To find out, click the link in the second column of the results table (DB:ID) and scroll to “General annotation (Comments)”.
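To make the idea of “percentage of identity” in question 2 concrete, here is a minimal Python sketch. It assumes the two sequences have already been aligned, with “-” marking gaps, and it counts identical positions over the full alignment length, which is how BLAST-style identity is usually reported. The function name `percent_identity` and the two short sequences are made up for this example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns where both sequences carry the same residue.

    Both inputs must be aligned already, so they have equal length;
    "-" marks a gap and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Made-up alignment: 10 columns, 8 identical residues, 1 gap, 1 mismatch.
print(percent_identity("MKT-AYIAKQ", "MKTWAYLAKQ"))  # prints 80.0
```

A value close to 100 means the two proteins are nearly the same at every aligned position, which is a strong hint that they are the same protein or close relatives.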

