Protein Expression and Purification Core Facility

PEPCF expresses proteins in bacteria, insect and mammalian cells and uses a variety of chromatographic and biophysical techniques for protein purification and characterization.

Choice of expression plasmids

Once you have decided to express a protein in E. coli, you’ll need to choose an expression vector.

The following features are commonly found in E. coli expression plasmids:

  • Promoter: initiates transcription and drives expression of the recombinant protein
  • Terminator: the transcription terminator reduces unwanted transcription and increases plasmid and mRNA stability
  • Shine-Dalgarno sequence: ribosome binding site upstream of the start codon that helps recruit the ribosome to the mRNA to initiate protein synthesis (consensus sequence: 5’-TAAGGAGGT-3’)
  • Origin of replication (ori): controls the plasmid copy number
  • Selection marker: usually an antibiotic resistance marker is used to provide selective pressure to avoid the host organism losing the plasmid
  • Start and stop codons: to initiate and terminate translation
  • Regulatory gene (repressor): to prevent leaky expression of the protein before induction
  • Signal sequence: required for periplasmic expression
  • Tags: N- or C-terminal fusion tags can act as affinity tags for protein purification, and can also help to improve solubility
  • Protease cleavage site: to remove fusion tags during the protein purification
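As an illustration of how the Shine-Dalgarno sequence relates to the start codon, the following minimal sketch looks for the core of the consensus sequence in the region directly upstream of a start codon and reports the spacing between the two. The function name, the 20-base search window, the choice to match only the AGGAGG core and the example sequence are all assumptions made for this sketch; they are not part of any vector specification.

```python
# Core of the Shine-Dalgarno consensus 5'-TAAGGAGGT-3' given in the list above.
# Matching only this 6-base core is a simplification assumed for this sketch.
SD_CORE = "AGGAGG"


def find_sd_spacing(dna, start_codon_index, window=20):
    """Return the number of bases between the SD core and the start codon.

    Only the `window` bases directly upstream of the start codon are searched.
    Returns None if the core is not found there.
    """
    dna = dna.upper()
    upstream = dna[max(0, start_codon_index - window):start_codon_index]
    pos = upstream.rfind(SD_CORE)  # match closest to the start codon
    if pos == -1:
        return None
    return len(upstream) - (pos + len(SD_CORE))


# Hypothetical 5' region of an expression cassette (not a real vector).
example = "GGATCCTAAGGAGGTATATACATATGAAAGGC"
start = example.find("ATG")  # index of the first ATG in this toy sequence
print(find_sd_spacing(example, start))
```

A real check would also tolerate mismatches to the consensus; exact matching is used here only to keep the example short.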

Commonly used promoters:

T7 promoter: requires T7 RNA polymerase for transcription and is therefore generally used in combination with E. coli BL21(DE3) strains. This is a very strong promoter and is found, for example, in the popular pET vectors.

lac promoter: promoter of the lac operon (lactose metabolism)

araBAD promoter: promoter of the L-arabinose operon

tac, trc promoters: hybrid of E. coli trp (-35) and lac (-10) promoters

T5 promoter: recognized by E. coli RNA polymerase

pL promoter: temperature-sensitive promoter

tetA promoter: tetracycline-inducible promoter

Commonly used antibiotics:

Antibiotic        Stock concentration   Working concentration   Remarks
Ampicillin        100 mg/ml             100 µg/ml
Carbenicillin     100 mg/ml             100 µg/ml               More stable than ampicillin
Kanamycin         30 mg/ml              30 µg/ml
Chloramphenicol   33 mg/ml              33 µg/ml                Prepare stock in ethanol
Gentamicin        20 mg/ml              20 µg/ml
Spectinomycin     100 mg/ml             100 µg/ml
Streptomycin      100 mg/ml             100 µg/ml
Tetracycline      10 mg/ml              10 µg/ml                Light sensitive, short-term stability
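All stocks in the table are 1000x concentrated relative to the working concentration, so the volume to add follows from C1 x V1 = C2 x V2. The sketch below does this arithmetic for a given culture volume. The dictionary and function names are hypothetical, and the small volume contributed by the stock itself is ignored.

```python
# (stock in mg/ml, working concentration in µg/ml), copied from the table above.
ANTIBIOTICS = {
    "ampicillin": (100, 100),
    "carbenicillin": (100, 100),
    "kanamycin": (30, 30),
    "chloramphenicol": (33, 33),
    "gentamicin": (20, 20),
    "spectinomycin": (100, 100),
    "streptomycin": (100, 100),
    "tetracycline": (10, 10),
}


def stock_volume_ul(antibiotic, culture_ml):
    """Volume of stock (in µl) needed to reach the working concentration.

    Uses C1 * V1 = C2 * V2 and neglects the volume added by the stock.
    """
    stock_mg_per_ml, working_ug_per_ml = ANTIBIOTICS[antibiotic]
    stock_ug_per_ml = stock_mg_per_ml * 1000  # mg/ml -> µg/ml
    volume_ml = working_ug_per_ml * culture_ml / stock_ug_per_ml
    return volume_ml * 1000  # ml -> µl


# Example: kanamycin for a 500 ml culture.
print(stock_volume_ul("kanamycin", 500))
```

In practice this comes down to adding 1 µl of stock per ml of culture for every antibiotic listed.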

Start and stop codon

In E. coli the main start codon is ATG. GTG is used in roughly 8% of the cases. TTG and ATT are hardly used at all.

There are 3 possible stop codons but TAA is preferred because it is less prone to read-through than TAG and TGA. The efficiency of termination can be increased by using 2 or 3 stop codons in series.
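The recommendations on start and stop codons can be turned into a quick sanity check on an open reading frame before cloning. The sketch below is only an illustration: the function names and the wording of the messages are hypothetical, and the check looks at nothing beyond the first and last codon and the reading-frame length.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")


def check_orf(orf):
    """Return a list of warnings for an ORF given as a DNA string."""
    orf = orf.upper()
    issues = []
    if len(orf) % 3 != 0:
        issues.append("length is not a multiple of 3")
    if orf[:3] != "ATG":
        issues.append("does not start with ATG")
    if orf[-3:] not in STOP_CODONS:
        issues.append("does not end with a stop codon")
    elif orf[-3:] != "TAA":
        issues.append("stop codon is " + orf[-3:] + "; TAA is less prone to read-through")
    return issues


def with_double_stop(orf):
    """Replace any trailing stop codon(s) with two TAA codons in series."""
    orf = orf.upper()
    while orf[-3:] in STOP_CODONS:
        orf = orf[:-3]
    return orf + "TAATAA"


# Toy ORF (hypothetical): ATG AAA TGA
print(check_orf("ATGAAATGA"))
print(with_double_stop("ATGAAATGA"))
```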

Regulatory gene (repressor)

Many promoters are leaky, i.e. the gene product is already expressed at a low level before the inducer is added. This becomes a problem when the gene product is toxic to the host. Leaky expression can be prevented by the constitutive expression of a repressor protein.

The lac-derived promoters are especially leaky. These promoters can be controlled by the insertion of a lac operator sequence downstream of the promoter and the expression of the lacI repressor protein from the same plasmid (or from a helper plasmid). If no repressor sequence is present on the plasmid, it’s also possible to use a host strain carrying the lacIq allele. Alternatively, repression can also be achieved by the addition of 1% glucose to the culture medium.

Signal sequence

If you want to express your protein in the periplasm, you’ll need to add an N-terminal periplasmic signal sequence. Commonly used signal sequences are, for example, those from OmpA, OmpF, DsbA, MalE and PelB.

Protein tags

Small affinity tags such as His6, His10, (twin)StrepII, Flag, Myc and Spot can be added to the N- or C-terminus of your protein of interest to facilitate protein purification and/or detection. In some cases, they can also be added to internal loop regions, although this is rather rare. Larger solubility-enhancing tags such as SUMO, Trx, NusA, DsbA and DsbC are usually added to the N-terminus of the protein and are often combined with a small affinity tag to facilitate the purification. Some solubility-enhancing tags, such as GST and MBP, can also be used directly as affinity tags.

Fluorescent tags (e.g. eGFP, mCherry, YFP, CFP, …) can be placed at the N- or C-terminus of proteins and can be useful for imaging or studying interactions using biophysical techniques based on fluorescence.

Modular tags such as HALO, SNAP and CLIP can be added to the N- or C-terminus as well. They allow the attachment of different chemical functionalities and can be used to couple the protein covalently to a fluorescent dye, an affinity handle or a solid surface. The HALO-tag is based on a modified haloalkane dehalogenase and covalently binds synthetic ligands with a chloroalkane linker. The SNAP- and CLIP-tag are both derived from O6-alkylguanine-DNA-alkyltransferase and react with O6-benzylguanine and O2-benzylcytosine derivatives, respectively.

Protease cleavage sites

Affinity tags or solubility-enhancing tags can be removed during the protein purification when a specific protease cleavage site is included between the tag and the protein of interest.

After the affinity chromatography step in the purification, you can add the specific protease to your sample. Protease cleavage can be performed either on-column or in solution. For on-column protease cleavage, you add the protease to your sample while it’s still bound to the resin. After cleavage, the cleaved protein is eluted from the resin. For in-solution protease cleavage, you first elute your protein from the resin and then add the protease to your sample. If you use a protease carrying the same affinity tag as your protein of interest, a reverse affinity chromatography step will allow you to easily separate your untagged (cleaved) protein from the still tagged (uncleaved) protein and the tagged protease.

At EMBL PEPCF, we prepare a number of highly specific proteases ourselves, which we routinely use in our protein purification processes. The proteases we produce are His6-tagged and GST-tagged TEV protease, His6-tagged and GST-tagged HRV 3C protease and His6-tagged SenP2 protease. Other commonly used highly specific proteases are, for example, Thrombin, Enterokinase and Factor Xa. Most of these proteases are commercially available from various manufacturers.

Protease               Protease cleavage site
HRV 3C (PreScission)   LEVLFQ ↓ GP
SenP2                  SUMO3-GG ↓
Thrombin               LVPR ↓ GS
Enterokinase           DDDDK ↓
Factor Xa              I(E/D)GR ↓
Frequently used proteases and their specific recognition sites. The arrows indicate the specific cleavage sites.
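To check a construct design, the linear recognition sites in the table can be searched for in the protein sequence, which also reveals unwanted internal sites. The following is a minimal sketch under stated assumptions: the dictionary, the function names and the example construct are hypothetical, the sites are matched exactly as listed in the table, and SenP2 is left out because it is listed with SUMO3 itself rather than a short linear motif.

```python
import re

# Recognition sites from the table above, split at the cleavage position
# into (residues before the cut, residues after the cut).
SITES = {
    "HRV 3C": ("LEVLFQ", "GP"),
    "Thrombin": ("LVPR", "GS"),
    "Enterokinase": ("DDDDK", ""),
    "Factor Xa": ("I[ED]GR", ""),
}


def find_cleavage_positions(protein, protease):
    """Return the indices in `protein` at which the protease would cut."""
    before, after = SITES[protease]
    pattern = before + ("(?=" + after + ")" if after else "")
    return [m.end() for m in re.finditer(pattern, protein.upper())]


def cleave(protein, protease):
    """Split `protein` into the fragments expected after complete cleavage."""
    protein = protein.upper()
    cuts = find_cleavage_positions(protein, protease)
    bounds = [0] + cuts + [len(protein)]
    return [protein[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical construct: Met-His6 tag, HRV 3C site, then a made-up protein.
construct = "MHHHHHH" + "LEVLFQGP" + "MKTAYIAK"
print(find_cleavage_positions(construct, "HRV 3C"))
print(cleave(construct, "HRV 3C"))
```

In this toy construct the second fragment keeps the residues GP at its N-terminus, because HRV 3C cuts inside its recognition site.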