{"id":36085,"date":"2023-07-06T09:46:44","date_gmt":"2023-07-06T09:46:44","guid":{"rendered":"https:\/\/www.embl.org\/ells\/?post_type=teachingbase&#038;p=36085"},"modified":"2023-07-14T07:21:51","modified_gmt":"2023-07-14T07:21:51","slug":"part-1-from-dna-to-protein-sequence","status":"publish","type":"teachingbase","link":"https:\/\/www.embl.org\/ells\/teachingbase\/the-mysterious-protein-a-bioinformatics-expedition\/part-1-from-dna-to-protein-sequence\/","title":{"rendered":"Part 1: From DNA to protein sequence"},"content":{"rendered":"\n<div class=\"vf-tabs\"><ul class=\"vf-tabs__list\" data-vf-js-tabs=\"true\"><li class=\"vf-tabs__item\"><a class=\"vf-tabs__link\" href=\"#vf-tabs__section-2129e4bc-4dbb-4088-888a-7bd1e045f1af\" data-vf-js-location-nearest-activation-target=\"\">Overview<\/a><\/li><li class=\"vf-tabs__item\"><a class=\"vf-tabs__link\" href=\"#vf-tabs__section-a9142171-26ff-4311-afb3-62d9d220f38f\" data-vf-js-location-nearest-activation-target=\"\">Your task<\/a><\/li><li class=\"vf-tabs__item\"><a class=\"vf-tabs__link\" href=\"#vf-tabs__section-660723b7-6020-4cbc-91c5-196e645a4341\" data-vf-js-location-nearest-activation-target=\"\">Sequence<\/a><\/li><li class=\"vf-tabs__item\"><a class=\"vf-tabs__link\" href=\"#vf-tabs__section-9f99344e-ee89-486f-89f7-aebe7d287e9c\" data-vf-js-location-nearest-activation-target=\"\">EMBOSS Transeq<\/a><\/li><li class=\"vf-tabs__item\"><a class=\"vf-tabs__link\" href=\"#vf-tabs__section-e8c6c6fe-71a4-4aa0-942c-8b86bc87a5c7\" data-vf-js-location-nearest-activation-target=\"\">Question<\/a><\/li><li class=\"vf-tabs__item\"><a class=\"vf-tabs__link\" href=\"#vf-tabs__section-58cd64e9-ed27-45d8-8406-341f94a64590\" data-vf-js-location-nearest-activation-target=\"\">Activity navigation<\/a><\/li><\/ul><div class=\"vf-tabs-content\" data-vf-js-tabs-content=\"true\">\n<section class=\"vf-tabs__section\" id=\"vf-tabs__section-2129e4bc-4dbb-4088-888a-7bd1e045f1af\"><h2>Overview<\/h2>\n<p>We just received an email from 
the TREC researchers containing the DNA sequence. To understand which protein it encodes, the first task is to convert this DNA sequence into an amino acid sequence, also known as the protein\u2019s primary structure.<\/p>\n\n\n\n<p>We will use the bioinformatics tool EMBOSS Transeq to assist us in this process. EMBOSS Transeq combines the biological processes of <a href=\"https:\/\/www.embl.org\/ells\/teachingbase\/ells-glossary\/\" target=\"_blank\" rel=\"noreferrer noopener\">transcription<\/a> (DNA to RNA) and <a href=\"https:\/\/www.embl.org\/ells\/teachingbase\/ells-glossary\/\" target=\"_blank\" rel=\"noreferrer noopener\">translation<\/a> (RNA to protein) in a single step, producing an amino acid sequence directly from the DNA sequence.<\/p>\n\n\n\n<p>Begin the activity by following the instructions provided in the \u201cYour task\u201d tab and attempt to answer the accompanying questions.<\/p>\n<\/section>\n\n\n\n<section class=\"vf-tabs__section\" id=\"vf-tabs__section-a9142171-26ff-4311-afb3-62d9d220f38f\"><h2>Your task<\/h2>\n<p><strong>Please follow the steps outlined below:<\/strong><\/p>\n\n\n\n<p><strong>1. 
<\/strong>In the \u201cSequence\u201d tab, you will find the unidentified DNA sequence labelled as \u201cUnknown_DNA\u201d.<br><strong>2.<\/strong> Copy the sequence and proceed to the &#8220;EMBOSS Transeq&#8221; tab, where you will find instructions on how to identify the amino acid sequence of the unknown protein using the EMBOSS Transeq tool.<br><strong>3.<\/strong> Try to answer the question in the \u201cQuestion\u201d tab.<\/p>\n<\/section>\n\n\n\n<section class=\"vf-tabs__section\" id=\"vf-tabs__section-660723b7-6020-4cbc-91c5-196e645a4341\"><h2>Sequence<\/h2>\n<p>Input sequence:<\/p>\n\n\n\n<pre class=\"wp-block-code vf-u-text--break\"><code>\n>New species (Unknown) \nATGAACGGCACCGAGGGCCCCTTCGGCTACATCCCCATGAGCAACGCCACCGGCCTGGTG\nAGGAGCCCCTACGACTACCCCCAGTACTACCTGGTGCCCCCCTGGGGCTACGCCTGCCTG\nGCCGCCTACATGTTCCTGCTGATCCTGACCGGCTTCCCCGTGAACTTCCTGACCCTGTAC\nGTGACCATCGAGCACAAGAAGCTGAGGAGCCCCCTGAACTACATCCTGCTGAACCTGGCC\nGTGGCCGACCTGTTCATGGTGATCGGCGGCTTCACCACCACCATGTGGACCAGCCTGGAC\nGGCTACTTCGTGTTCGGCAGGATGGGCTGCAACATCGAGGGCTTCTTCGCCACCCTGGGC\nGGCGAGATCGCCCTGTGGAGCCTGGTGGTGCTGAGCATGGAGAGGTGGATCGTGGTGTGC\nAAGCCCATCAGCAACTTCAGGTTCGGCGAGAACCACGCCGTGATGGGCGTGGCCTTCAGC\nTGGTTCATGGCCGCCGCCTGCGCCGTGCCCCCCCTGGTGGGCTGGAGCAGGTACATCCCC\nGAGGGCATGCAGTGCAGCTGCGGCATCGACTACTACACCAGGGCCGAGGGCTTCAACAAC\nGAGAGCTTCGTGATCTACATGTTCGTGGTGTTCTTCACCTGCCCCCTGACCATCATCACC\nTTCTGCTACGGCAGGCTGGTGTGCACCGTGAAGGAGGCCGCCGCCCAGCAGCAGGAGAGC\nGAGACCACCCAGAGGGCCGAGAGGGAGGTGACCAGGATGGTGATCATCACCTTCGTGGCC\nTTCCTGGCCTGCTGGGTGCCCTACGCCAGCGTGGCCTGGTACATCTTCACCCACCAGGGC\nAGCGAGTTCGGCCCCGTGTTCATGACCATCCCCGCCTTCTTCGCCAAGAGCAGCGCCGTG\nTACAACCCCGTGATCTACATCTGCCTGAACAAGCAGTTCAGGCACTGCATGATCACCACC\nCTGTGCTGCGGCAAGAACCCCTTCGAGGAGGAGGAGGGCAGCACCACCGCCAGCAAGACC\nGAGGCCAGCAGCGTGTGCAGCGTGAGCCCCCACGCC\n<\/code><\/pre>\n<\/section>\n\n\n\n<section class=\"vf-tabs__section\" id=\"vf-tabs__section-9f99344e-ee89-486f-89f7-aebe7d287e9c\"><h2>EMBOSS Transeq<\/h2>\n<p><strong>1.<\/strong> Access the EMBOSS Transeq tool in the 
window below.<br><strong>2.<\/strong>&nbsp;Paste your DNA sequence (including the header line, which starts with a greater-than symbol (&gt;) followed by the sequence name) in the query box (STEP 1). In the \u201cParameters\u201d field (STEP 2), make sure to select \u201cframe=1\u201d and \u201cCodon table=Standard codon\u201d. Then, submit your search by clicking on &#8220;Submit&#8221;.<br><strong>3.<\/strong> Examine the data table that appears and try to answer the question in the \u201cQuestion\u201d tab. For better visualisation, click on \u201cShow Colors\u201d.<br><strong>4.<\/strong> Download or copy the amino acid sequence for the next task.<\/p>\n\n\n\n<p><strong>Note:<\/strong> The colours used in the output correspond to specific physicochemical properties of the <a href=\"https:\/\/www.embl.org\/ells\/teachingbase\/ells-glossary\/\" target=\"_blank\" rel=\"noreferrer noopener\">amino acids<\/a>. The amino acids in each group are listed in brackets using their one-letter codes:<br> &#8211; Small + hydrophobic amino acids (<span style=\"color:red\">AVFPMILW<\/span>): <span style=\"color:red\">Red<\/span><br> &#8211; Acidic amino acids (<span style=\"color:blue\">DE<\/span>): <span style=\"color:blue\">Blue<\/span><br> &#8211; Basic amino acids (<span style=\"color: rgb(255,0,255);\">RK<\/span>): <span style=\"color: rgb(255,0,255);\">Magenta<\/span><br> &#8211; Amino acids with hydroxyl, sulfhydryl or amine groups + Glycine (<span style=\"color:green\">STYHCNGQ<\/span>): <span style=\"color:green\">Green<\/span><\/p>\n\n\n\n<div class=\"vf-embed vf-embed--16x9 | vf-u-margin__bottom--400\"\n>\n<iframe src=\"https:\/\/www.ebi.ac.uk\/Tools\/st\/emboss_transeq\/\" frameborder=\"0\" controls allowfullscreen><\/iframe><\/div>\n\n\n<\/section>\n\n\n\n<section class=\"vf-tabs__section\" id=\"vf-tabs__section-e8c6c6fe-71a4-4aa0-942c-8b86bc87a5c7\"><h2>Question<\/h2>\n<p>What are the most prevalent types of amino acids present in the protein sequence? 
<\/p>\n\n\n\n<p>You can click on \u201cShow Colors\u201d to get information about the physicochemical properties of the individual amino acids.<\/p>\n<\/section>\n\n\n\n<section class=\"vf-tabs__section\" id=\"vf-tabs__section-58cd64e9-ed27-45d8-8406-341f94a64590\"><h2>Activity navigation<\/h2>\n<ul class=\"wp-block-list\" id=\"block-5e336a28-6156-49cd-aa22-7046e0d3c5eb\">\n<li><a href=\"https:\/\/www.embl.org\/ells\/teachingbase\/the-mysterious-protein-a-bioinformatics-expedition\/\" data-type=\"URL\" data-id=\"https:\/\/www.embl.org\/ells\/teachingbase\/the-mysterious-protein-a-bioinformatics-expedition\/\" target=\"_blank\" rel=\"noreferrer noopener\">Introductory<\/a><a href=\"https:\/\/www.embl.org\/ells\/teachingbase\/the-mysterious-protein-a-bioinformatics-expedition\/\" data-type=\"URL\" data-id=\"https:\/\/www.embl.org\/ells\/teachingbase\/the-mysterious-protein-a-bioinformatics-expedition\/\"> page<\/a><\/li>\n\n\n\n<li><a href=\"http:\/\/embl.org\/ells\/teachingbase\/part-1-from-dna-to-protein-sequence\/\"><strong>Part 1: From DNA to protein sequence<\/strong><\/a><\/li>\n\n\n\n<li><a rel=\"noreferrer noopener\" href=\"https:\/\/www.embl.org\/ells\/teachingbase\/part-2-protein-identity-and-function\/\" target=\"_blank\">Part 2: Protein identity and function<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.embl.org\/ells\/teachingbase\/part-3-multiple-sequence-alignment\/\">Part 3: Multiple sequence alignment&nbsp;<\/a><\/li>\n\n\n\n<li><a rel=\"noreferrer noopener\" href=\"https:\/\/www.embl.org\/ells\/teachingbase\/part-4-phylogenetic-analysis\/\" target=\"_blank\">Part 4: Phylogenetic analysis&nbsp;<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.embl.org\/ells\/teachingbase\/part-5-structural-analysis\/\">Part 5: Structural 
analysis<\/a><\/li>\n<\/ul>\n<\/section>\n<\/div><\/div>\n","protected":false},"excerpt":{"rendered":"","protected":false},"featured_media":36159,"parent":36087,"menu_order":1,"template":"","class_list":["post-36085","teachingbase","type-teachingbase","status-publish","has-post-thumbnail","hentry"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.embl.org\/ells\/wp-json\/wp\/v2\/teachingbase\/36085","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.embl.org\/ells\/wp-json\/wp\/v2\/teachingbase"}],"about":[{"href":"https:\/\/www.embl.org\/ells\/wp-json\/wp\/v2\/types\/teachingbase"}],"up":[{"embeddable":true,"href":"https:\/\/www.embl.org\/ells\/wp-json\/wp\/v2\/teachingbase\/36087"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.embl.org\/ells\/wp-json\/wp\/v2\/media\/36159"}],"wp:attachment":[{"href":"https:\/\/www.embl.org\/ells\/wp-json\/wp\/v2\/media?parent=36085"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}