Monday, 27 June 2016

Origin of Replication

Now that we know all about antibiotic resistance genes, let’s consider another basic element of any plasmid: the origin of replication/replicon. The replicon is comprised of the origin of replication (ORI) and all of its control elements. The ORI is the place where DNA replication begins, enabling a plasmid to reproduce itself as it must to survive within cells.
The replicons of plasmids are generally different from those used to replicate the host's chromosomal DNA, but they still rely on the host machinery to make additional copies. ORI sequences are generally high in As and Ts. Why, you ask? Well, A-T base pairs are held together by two hydrogen bonds, not three as G-C pairs are. As a result, stretches of DNA that are rich in A-T pairs melt more readily at lower temperatures. When DNA melts, it gives the replication machinery room to come in and get busy making copies.
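To make the A-T argument concrete, here is a minimal Python sketch that scans a sequence for its most AT-rich window. The function names, the window size, and the toy sequence are illustrative assumptions, not part of any published ORI-finding tool, and AT content alone does not identify a real origin.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A/T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


def most_at_rich_window(seq: str, window: int = 20) -> tuple[int, float]:
    """Return (start index, AT fraction) of the most AT-rich window.

    Ties are resolved in favour of the earliest window.
    """
    if window > len(seq):
        raise ValueError("window longer than sequence")
    scores = [
        (at_fraction(seq[i:i + window]), i)
        for i in range(len(seq) - window + 1)
    ]
    best_fraction, best_start = max(scores, key=lambda s: (s[0], -s[1]))
    return best_start, best_fraction


# Toy sequence (hypothetical): a GC-rich stretch, an AT-only stretch, more GC.
toy = "GCGCGCGC" + "ATATTTAAAT" + "GCCGGCGC"
print(most_at_rich_window(toy, window=10))
```

On the toy sequence the best window is the ten-base AT-only stretch starting at index 8.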

So Many Origins, So Little Time

There are lots of ORIs out there so, for simplicity’s sake, we've ignored those used in eukaryotic cells and viruses and focused only on those found in bacteria. Some common ones you might see include ColE1, pMB1 (which comes in a few slightly different but well-known derivatives), pSC101, R6K, and p15A. Not all origins of replication are created equal. Some will produce many plasmid copies and others produce just a few, depending on how they are regulated. Generally, control of replication is referred to as "relaxed" or "stringent" depending on whether the ORI is positively regulated by RNA or proteins, respectively. A plasmid's copy number depends on the balance between positive and negative regulation and can be manipulated with mutations in the replicon. For example, the pMB1 ORI maintains about 20 copies per cell, while pUC – which differs by only two mutations – will produce as many as 700 copies per cell.
So, how do you choose? Addgene Senior Scientist Marcy Patrick says researchers can ask themselves a few questions to get started: Will the plasmid be used exclusively in E. coli? Gram negative bacteria in general? Both Gram negatives and Gram positives? Will you have only one plasmid type in your cells at a time? Do you want to make a lot of your plasmid? Is the gene toxic in high amounts? It is always good to keep in mind that plasmids with low to medium copy numbers can still express massive amounts of protein given the proper promoter and growth conditions.

Choose Your Origin of Replication Wisely

In other words, the best choice of ORI depends on how many plasmid copies you want to maintain, which host or hosts you intend to use, and whether or not you need to consider your plasmid's compatibility with one or more other plasmids. Generally speaking, plasmids with the same ORIs are incompatible because they will compete for the same machinery, creating an unstable and unpredictable environment. As a rule, plasmids from the same group should not be co-transformed, so if you require two plasmids for an experiment, make sure they have "compatible" ORIs. See the table below for more details.
Common Vectors | Copy Number+ | ORI | Incompatibility Group | Control
pUC | ~500-700 | pMB1 (derivative) | A | Relaxed
pBR322 | ~15-20 | pMB1 | A | Relaxed
pET | ~15-20 | pBR322 | A | Relaxed
pGEX | ~15-20 | pBR322 | A | Relaxed
pColE1 | ~15-20 | ColE1 | A | Relaxed
pR6K | ~15-20 | R6K* | B | Stringent
pACYC | ~10 | p15A | B | Relaxed
pSC101 | ~5 | pSC101 | C | Stringent
pBluescript | ~300-500 | ColE1 (derivative) and F1** | A | Relaxed
pGEM | ~300-500 | pUC and F1** | A | Relaxed
This table lists common cloning vectors, their copy numbers, ORIs, and incompatibility groups. Note that the A-C compatibility grouping is an arbitrary designation, and plasmids from the same incompatibility group should not be co-transformed.
+Actual copy number varies. See below for additional considerations.
*Requires pir gene for replication (reference).
**F1 is a phage-derived ORI that allows for the replication and packaging of ssDNA into phage particles. Plasmids with phage-derived ORIs are referred to as phagemids.
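The co-transformation rule can be sketched as a small lookup. The dictionary below simply copies the incompatibility groups from the table above; the function name `are_compatible` is a hypothetical helper, and the check is only as good as the arbitrary A-C grouping it encodes.

```python
# Incompatibility groups copied from the table above (A-C are arbitrary labels).
INCOMPATIBILITY_GROUP = {
    "pUC": "A", "pBR322": "A", "pET": "A", "pGEX": "A", "pColE1": "A",
    "pBluescript": "A", "pGEM": "A",
    "pR6K": "B", "pACYC": "B",
    "pSC101": "C",
}


def are_compatible(*vectors: str) -> bool:
    """Return True if no two of the given vectors share an incompatibility group."""
    groups = [INCOMPATIBILITY_GROUP[v] for v in vectors]
    return len(set(groups)) == len(groups)


print(are_compatible("pET", "pACYC"))            # different groups
print(are_compatible("pUC", "pET"))              # both group A
print(are_compatible("pET", "pACYC", "pSC101"))  # three different groups
```

A vector that is not in the table raises a `KeyError`, which is deliberate: the sketch should not guess the group of an unknown backbone.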

Other Factors that Affect Copy Number 

Although the sequence and regulation of the ORI dramatically affect the copy number of a plasmid, other external factors contribute as well. These considerations are especially useful to keep in mind if you are planning to purify your plasmid DNA:
The insert:
  • Bacteria tend to maintain fewer copies of plasmids if they contain large inserts or genes that create a toxic product.
The E. coli strain:
  • Most E. coli strains can be used to propagate plasmids, but endA- E. coli are best for high yields of plasmids.
Growth conditions:
  • The amount of aeration, temperature, culture volume, antibiotic, and medium can all affect copy number. Some ORIs are temperature sensitive; other ORIs can be "tricked" into amplifying more copies with the addition of chloramphenicol – make sure your growth conditions aren't working against you!
The culture inoculum:
  • Freshly streaked bacteria have higher copy numbers – for optimal results always pick a single colony and do not subculture directly from glycerol stocks, agar stabs, or liquid cultures.
  • Incubation for 12-16 hours tends to give higher copy numbers since the bacteria have just reached stationary phase, but the cells have not started to die off.

Plasmid Replication

For a piece of circular dsDNA to be propagated in bacteria, it needs to be replicated by host machinery. There is a sequence in the plasmid that directs the cell to begin replication. Important considerations are host range, compatibility, and copy number. Host range refers to which species of bacteria will recognize the origin of replication and thus allow replication. Compatibility refers to a plasmid's ability to coexist with another plasmid in the same cell. Copy number refers to the average or expected number of copies of the plasmid per cell.
There are three main mechanisms for plasmid replication: Rolling Circle, Strand Displacement, and Theta.

Strand displacement replication

RepC binds repeat sequences and recruits RepA (a helicase) to melt an AT-rich region. This exposes two single-stranded origins, ssiA and ssiB. RepB synthesizes primers at these origins. DNA polymerization then proceeds in each direction, displacing the non-template strand.
Strand displacement is associated with broad host range vectors, possibly because it does not require any of the normal host machinery (DnaA, DnaB, DnaC, and DnaG).

Rolling circle replication

A nick is made by the Rep protein at the "double strand origin" of a dsDNA plasmid. The free 3'-OH is extended, displacing the nicked strand as it progresses. After one unit length of displacement, replication is terminated, yielding one dsDNA plasmid and one unit length of ssDNA. The displaced strand then serves as a template for replication from a "single strand origin." Since each strand is replicated independently, it is possible for the ssDNA form to accumulate.
This mechanism is found in gram-positive bacteria like Staphylococcus aureus and Streptomyces lividans as well as many bacteriophages.

Theta replication

DnaA (often with the help of other proteins) binds the origin at DnaA boxes, which promotes melting of the origin. This allows DnaC to load the DnaB helicase, opening the origin further. DnaG is then recruited to form a short RNA primer.
DNA polymerase III extends this primer. If there is only one leading primer, a single fork circumnavigates the entire plasmid until the origin is reached, and the daughter plasmids separate. In bidirectional replication, two forks propagate and meet on the far side of the plasmid before resolution.
Theta is the most common form of DNA replication, used by most plasmids as well as chromosomes. It is particularly associated with gram-negative bacteria. ColE1, P15A, RK2, F, and P1 all use theta replication.

Host range

Plasmids are classified as having a narrow or broad host range.
  • ColE1 and pMB1 are limited to E. coli and a few close relatives.
  • RK2 plasmids can be used in most gram-negative bacteria.
  • RSF1010 can be used in most gram-negative bacteria, and some gram-positive bacteria.
  • Plasmids from gram-positive bacteria tend to function well in other gram-positive bacteria.

Compatibility groups

If two plasmids have the same (or very similar) origins of replication, they will compete with each other for replication machinery. This results in an unstable situation. If the two plasmids possess different selectable markers, both can be maintained for several generations, but eventually one of the plasmids will be lost. For scenarios in which multiple plasmids are necessary, one must be careful to choose plasmids with compatible origins. The most common dual-plasmid pair is ColE1 (or pMB1) and p15A. The most common plasmid triplet is ColE1 (or pMB1), p15A, and pSC101. Tolia and Joshua-Tor suggest the following groups:
  • ColE1/pMB1 (eg pET, pUC, pBR322, pGEX, pMAL)
  • P15A (eg pBad, pACYC)
  • CloDF13
  • ColA
  • RSF1030

Copy number

An important consideration in choosing which plasmid backbone to use is the copy number. For example, cloning is best done with a high copy plasmid (e.g. pUC), as plasmid preps will have a higher yield. Expressing a toxic gene is better from a low to medium copy plasmid (e.g. pET, which uses the pBR322 origin), as there are fewer copies.
  • ColE1: 15-20 copies
  • pMB1: 20-700 copies
    • pUC: 500-700 copies
    • pBR322: ~20 copies
  • pSC101: ~5 copies
  • P15A: 10-12 copies
  • RK2: 4-7 copies
  • F1: ~1 copy
  • CloDF13: 20-40 copies
  • ColA: 20-40 copies
  • RSF1030: >100 copies
  • P1: ~1 copy
  • R6K: 15-30 copies

Control of initiation/copy number

There are several mechanisms by which copy number is controlled. In all cases, some negative-regulating element (RNA or protein) is expressed from the plasmid. As the plasmid concentration increases, so too does the negative regulator. This provides a negative feedback, which stabilizes the copy number. Two plasmids that are regulated by each other's regulator will not be compatible.
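The feedback idea can be illustrated with a toy simulation. Everything in this sketch is an assumption made for illustration: the functional form (per-plasmid replication rate falling as `r_max / (1 + n / k_inhibit)`), the parameter values, and the function name. It is not a measured model of any real replicon. Plasmids are diluted by cell growth at rate `growth_rate`, so the copy number settles where replication balances dilution, at `k_inhibit * (r_max / growth_rate - 1)`.

```python
def simulate_copy_number(n0: float,
                         r_max: float = 3.0,
                         k_inhibit: float = 10.0,
                         growth_rate: float = 1.0,
                         dt: float = 0.01,
                         t_end: float = 50.0) -> float:
    """Integrate a toy negative-feedback model with forward Euler steps.

    n is the average plasmid copy number per cell. The inhibitor level is
    assumed to be proportional to n, so replication slows as n rises.
    """
    n = n0
    for _ in range(round(t_end / dt)):
        replication = r_max / (1.0 + n / k_inhibit)  # per-plasmid rate
        n += dt * n * (replication - growth_rate)    # replication minus dilution
    return n


# Starting below or above the set point, the copy number converges to the
# same value (20 for these illustrative parameters).
print(round(simulate_copy_number(1.0), 2))
print(round(simulate_copy_number(100.0), 2))
```

In this picture, weakening the inhibitor (a larger `k_inhibit`) raises the steady-state copy number, which is qualitatively how a high-copy mutant of a replicon behaves.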

RNA regulation

ColE1/pMB1: The origin contains regions promoting the synthesis of RNA I and RNA II. RNA II hybridizes to the DNA, yielding a DNA/RNA hybrid that can serve as a substrate for RNase H. Digestion of RNA II by RNase H yields the primer for replication. RNA I binds and sequesters RNA II, so it is unavailable for RNase H digestion. As the copy number increases, so does the concentration of RNA I. This provides negative feedback for replication and sets the average number of plasmids per cell.
Additionally, the Rop protein helps lower the copy number by stabilizing the RNA I/RNA II duplex. Deletion of Rop, together with a point mutation that weakens the RNA I/RNA II duplex, accounts for the higher copy number of pUC (a pMB1 derivative).
P15A, ColA, RSF1030, and CloDF13 are similar, but with versions of RNA I and RNA II that are sufficiently different to allow for compatibility.

RNA and protein regulation

On the R1 plasmid, OriR is bound by RepA, which promotes replication by recruiting DnaA. RepA can be expressed from two different promoters. A proximal promoter (pRepA) drives only RepA, while a distal promoter (pCopB) drives both CopB and RepA. CopB represses pRepA, so once there are enough plasmids around, CopB levels become high enough to limit RepA expression to the pCopB promoter. Plasmid-encoded CopA is an antisense RNA that is complementary to, and thus binds, the 5' end of the transcript originating from the pCopB promoter. The resulting dsRNA is a substrate for RNase III.

Iteron regulation

Like the above examples, pSC101's replication is positively regulated by RepA binding the origin. RepA is also used to control copy number, by two mechanisms.
Firstly, RepA negatively regulates its own transcription, so RepA protein levels (and RepA's ability to promote replication) are confined to narrow limits.
Secondly, the plasmid contains several (3-7) repeats of a 17-22 bp sequence called iterons. RepA binds the iterons, and at higher plasmid concentrations this can lead to "handcuffing" of two plasmids. Interestingly, adding extra iteron sequences on other plasmids can reduce the copy number by this handcuffing mechanism.
F, R6K, P1, RK2, and RP4 also use iterons, but the regulating proteins and origins differ.
pETcoco is an interesting plasmid made by Novagen. It can be maintained as a single copy plasmid using the origin and positive regulator from the F plasmid (oriS and RepE). It can be switched to a medium copy plasmid using the machinery from the RK2 plasmid (oriV and trfA). The switch is achieved by inducing the trfA protein, which binds an iteron on oriV, thus promoting initiation from this origin by aiding in melting and recruitment of DnaB.

Thursday, 19 November 2015

Some tips to improve protein solubility
1. Reduce the rate of recombinant protein expression:
a. Lower the culture temperature.
b. Use a weak promoter.
c. Use a low copy number expression vector.
d. Reduce the concentration of inducer.

2. Change the medium:
a. Add factors to the medium that help the protein fold.
b. Add buffer to keep the pH stable.
c. Add 1% glucose to repress the lac promoter.
d. Add sorbitol or other factors that can stabilize the native structure of the protein.
e. Add ethanol, thiols, or disulfides.

3. Co-express molecular chaperones or folding enzymes. Commonly used molecular chaperones are GroES-GroEL, DnaK-DnaJ-GrpE, and ClpB. Commonly used folding enzymes are peptidyl prolyl cis/trans isomerases (PPIs), disulfide oxidoreductase (DsbA), disulfide isomerase (DsbC), and protein disulfide isomerase (PDI).

4. Use secretion expression, so that the target protein is secreted into the periplasmic space.

5. Use specific strains, such as AD494, which has a mutation in thioredoxin reductase (trxB), or Origami, a double mutant in thioredoxin reductase (trxB) and glutathione reductase (gor).

6. Add a soluble fusion protein partner.

7. Express only a fragment of the target protein. Proteins larger than 70 kDa are difficult to express in E. coli.

8. Unfold and refold the protein in vitro.

Thursday, 7 May 2015

To calculate transformation efficiency

  • For each batch of competent cells you make, it's a good idea to calculate the transformation efficiency.
  • Transform the cells with a known amount of DNA (e.g. 1 μg DNA into 100 μL cells).
  • Plate several volumes of cells onto selective plates (10 μL, 50 μL, 100 μL cells per plate).

Transformation efficiency = # colonies / amount of DNA on the plate (μg)
e.g. you count 1500 colonies on a plate

100 μL cells
+ 1 μL DNA @ 1 μg/μL
+ 600 μL SOC

701 μL → plated 10 μL

10/701 = 0.014 of the transformation, so 0.014 μg = "amount of DNA on the plate"
1500 colonies / 0.014 μg = ~1 x 10^5 colonies per μg
Ideally, "ultra-competent cells" will have a competency of about 10^7 or 10^8.
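The same arithmetic can be wrapped in a small helper. This is a minimal sketch; the function and parameter names are made up for illustration, and the example call assumes 1 μg of DNA in total and a recovery volume of 701 μL (100 μL cells + 1 μL DNA + 600 μL SOC).

```python
def transformation_efficiency(colonies: int,
                              dna_ug: float,
                              total_volume_ul: float,
                              plated_volume_ul: float) -> float:
    """Return colonies per microgram of DNA.

    The DNA on the plate is the total DNA scaled by the fraction of the
    transformation mix that was actually plated.
    """
    dna_on_plate_ug = dna_ug * plated_volume_ul / total_volume_ul
    return colonies / dna_on_plate_ug


efficiency = transformation_efficiency(
    colonies=1500, dna_ug=1.0, total_volume_ul=701.0, plated_volume_ul=10.0
)
print(f"{efficiency:.2e} colonies per ug")  # about 1 x 10^5
```

Counting plates from more than one plated volume and averaging the results gives a more robust estimate than a single plate.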

Saturday, 28 March 2015

Plasmids 101: Protein tags

Posted by Eric J. Perkins | Dec 11, 2014 11:26:08 AM
    
Protein tags are usually smallish peptides incorporated into a translated protein. As depicted in the accompanying cartoon, they have a multitude of uses including (but not limited to) purification, detection, solubilization, localization, or protease protection. Thus far Plasmids 101 has covered GFP and its related fluorescent proteins, which are sometimes used as tags for detection; however, those are just one (admittedly large) class of common fusion protein tags. Biochemists and molecular biologists who need to overexpress and purify proteins can face any number of technical challenges depending on their protein of interest. After several decades of trying to address these challenges, researchers have amassed a considerable molecular tool box of tags and fusion proteins to aid in the expression and purification of recombinant proteins.

Tags for Stability and Solubility

What are some of the hurdles to overcome in order to overexpress a recombinant protein? It is not generally in a cell’s best interest to overexpress a protein. Energy and cellular resources are being spent to make something the cell doesn’t need to make. Eukaryotes and some bacteria deploy proteasomes to degrade what the cell might consider junk protein. Though there are a number of chemical and peptide-based proteasome inhibitors, glutathione S-transferase (GST), which can be fused to recombinant proteins for one-step purification with glutathione, can also protect against proteolysis.
That’s one form of instability. Prokaryotes can also have a hard time folding eukaryotic proteins. You can get your bacteria to produce massive amounts of protein, but if it’s not folded correctly, there’s no point in crystallizing it or testing its function. Small ubiquitin-related modifier (SUMO) can help with folding and stabilization, as can maltose-binding protein (MBP). Overexpression can also lead to insolubility, and aggregated protein is not useful protein. MBP tags can help with solubility issues, but scientists may also choose to add smaller proteins, such as Thioredoxin A (TrxA), that improve disulfide bond formation in order to help keep the protein soluble.

Tags for Affinity and Purification

An affinity tag, generally a relatively small sequence of amino acids, is basically a molecular leash for your protein. If you’re working with an uncharacterized protein, or a protein for which a good antibody has not been developed (and just because your protein has a commercially available antibody, that doesn’t mean it’s a good one), then your first step towards detecting, immunoprecipitating, or purifying that protein may be to fuse an affinity tag to it. The FLAG, hemagglutinin antigen (HA), and c-myc tags have been the workhorses of the affinity tag world for years, and deciding which one to use will depend on your application (see table below). The antibodies available for these tags really are good and can be used for western blots, IP, and affinity purification.
Arguably the simplest affinity tag is the polyhistidine (His) tag. Small and unlikely to affect function, His-tagged proteins can be purified using metal-affinity chromatography, usually using a Ni2+ column. Like other affinity tags, a His tag can be fused to either the N- or C-terminus of a protein. Unlike other epitope tags – which when doubled or tripled increase the tag size quickly – modifying the length of a polyhistidine tract does not greatly alter the size of the tag.

Table 1: Common protein tags

Tag | Epitope | Mass (kDa) | Function | Notes
CBP | KRRWKKNFIAVSAANRFKKISSSGAL | 4 | Affinity and Purification | Binding and elution steps use very moderate buffer conditions
FLAG | DYKDDDD or DYKDDDDK or DYKDDDK | 1 | Affinity and Purification | Good for antibody-based purification; has inherent enterokinase cleavage site
GST | Large Protein | 26 | Purification and Stability | Good for purification with glutathione; protects against proteolysis, but may reduce solubility
HA | YPYDVPDYA or YAYDVPDYA or YDVPDYASL | 1.1 | Affinity | Frequently used for western blots, IP, co-IP, IF, flow cytometry; can occasionally interfere with protein folding
HBH | HHHHHHAGKA GEGEIPAPLA GTVSKILVKE GDTVKAGQTV LVLEAMKMET EINAPTDGKV EKVLVKERDA VQGGQGLIKI GVHHHHHH | 9 | Combo | Consists of a bacterially derived in-vivo biotinylation signaling peptide (Bio), flanked by hexahistidine motifs (6xHis)
MBP | Large Protein | 40 | Solubility and Purification | Can improve solubility and folding of eukaryotic proteins in prokaryotes; single-step purification with amylose, but wicked huge
Myc | EQKLISEEDL | 1.2 | Affinity | Frequently used for western blots, IP, co-IP, IF, flow cytometry, but rarely used for purification as elution requires low pH
poly His | HHHHHH | 0.8 | Affinity and Purification | Very small size, rarely affects function
S-tag | KETAAAKFERQHMDS | 1.8 | Solubility and Affinity | Abundance of charged and polar residues improves solubility; good for antibody-based detection
SUMO | ~100 amino acid protein | 12 | Stability | At N-terminus, promotes folding and structural integrity; cleavable. Not great for purification; too cleavable in eukaryotes
TAP | GRRIPGLINP WKRRWKKNFI AVSAANRFKK ISSSGALDYD IPTTASENLY FQGEFGLAQH DEAVDNKFNK EQQNAFYEIL HLPNLNEEQR NAFIQSLKDD PSQSANLLAE AKKLNDAQAP KVDNKFNKEQ QNAFYEILHL PNLNEEQRNA FIQSLKDDPS QSANLLAEAK KLNDAQAPKV DANHQ | 21 | Combo | See text
TRX | MSDKIIHLTD DSFDTDVLKA DGAILVDFWA EWCGPCKMIA PILDEIADEY QGKLTVAKLN IDQNPGTAPK YGIRGIPTLL LFKNGEVAAT KVGALSKGQL KEFLDANLAG SGSGHMHHHH HHSSGLVPRG | 12 | Solubility | Assists in proper folding
V5 | GKPIPNPLLGLDST | 1.4 | Affinity and Purification | Good for antibody-based purification
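The "Mass (kDa)" column can be approximated directly from a peptide sequence. The sketch below sums average residue masses and adds one water for the free termini. The residue masses are standard textbook values supplied here as an assumption (they do not appear in the post), and the function name is hypothetical.

```python
# Average amino acid residue masses in daltons (standard values, rounded).
RESIDUE_MASS_DA = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_DA = 18.02  # one water per chain for the free N- and C-termini


def peptide_mass_kda(sequence: str) -> float:
    """Return the approximate average mass of a peptide in kDa.

    Spaces are ignored so the grouped sequences from the table can be
    pasted in as-is.
    """
    residues = sequence.replace(" ", "").upper()
    total_da = sum(RESIDUE_MASS_DA[aa] for aa in residues) + WATER_DA
    return total_da / 1000.0


for name, seq in [("Myc", "EQKLISEEDL"), ("FLAG", "DYKDDDDK"), ("poly His", "HHHHHH")]:
    print(f"{name}: {peptide_mass_kda(seq):.1f} kDa")
```

For these three tags the estimates round to 1.2, 1.0, and 0.8 kDa, in line with the table.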

Combo and Cleavage Tags

Frequently, a single tag is not enough. What if you need one tag to increase solubility and one tag for purification? Or you want to combine a fluorophore with a tag that localizes your protein to the nucleus? Or you want multiple rounds of purification to get your protein as pure as possible? Vectors that offer different combinations of tags are readily available, and though adding too many tags and fusion proteins to your protein of interest would eventually get ridiculous (you generally don’t want more tag than protein), 2-3 tags is increasingly common. Tandem affinity purification (TAP) once referred specifically to a combo tag comprised of a calmodulin binding peptide (CBP), a TEV cleavage site (more on that in a moment), and 2 ProtA IgG-binding domains. TAP has since come to encompass several other tag combinations, though frequently those combinations still include at least one element from the original TAP tag. The terms dual-labeling and dual-tagging are also used. Due to their small size and the ease with which they can be added to a purification scheme, His tags are frequently combined with other tags for dual-labeling.
The problem with all these tags is that many of them serve a one-time purpose, and you don’t necessarily want them to stick around after that purpose has been served. At this point, proteases can be your friend rather than your enemy. Two common tags (SUMO and FLAG) are cleaved by specific proteases without requiring the addition of an independent cleavage recognition site. In fact, SUMO cannot be used in eukaryotes because there is already too much SUMO protease around, but it is convenient when used with purified protein since the enzyme cleaves the SUMO tag in the same manner as it would have in the context of a cell. FLAG tags can be cleaved by enterokinase, which recognizes DDDDK^X, cleaving after the lysine. The efficiency of this cleavage depends on the identity of X.
A number of other proteases are available, but scientists would need to incorporate their recognition sites into their protein tag in order to use them effectively. One of the best optimized is the tobacco etch virus (TEV) protease. A TEV protease cleavage site is frequently placed between two tags being used for two rounds of purification, with the cleavage reaction taking place between column runs. The TEV protease itself, with various mutations used to increase its stability and activity, can be readily purified using plasmids found in this paper (available at Addgene).

Table 2: Protease recognition sites commonly used with tags

Protease | Recognition site | Notes
TEV | ENLYFQS | Cleaves between the Gln and Ser residues
Thrombin | LVPRGS | Cleaves between the Arg and Gly residues
PreScission | LEVLFQGP | Cleaves between the Gln and Gly residues
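As a rough illustration of how these sites are used, the sketch below cuts a fusion protein sequence at each occurrence of a recognition site from Table 2. The `cleave` function and the example fusion (a His tag, a TEV site, and a made-up payload sequence) are hypothetical, and the exact-match search ignores the sequence tolerance and accessibility effects that matter for real proteases.

```python
# Recognition sites from Table 2. The offset is the number of residues of the
# site that remain on the N-terminal fragment after cleavage.
PROTEASE_SITES = {
    "TEV": ("ENLYFQS", 6),           # cleaves between Gln and Ser
    "Thrombin": ("LVPRGS", 4),       # cleaves between Arg and Gly
    "PreScission": ("LEVLFQGP", 6),  # cleaves between Gln and Gly
}


def cleave(protein: str, protease: str) -> list[str]:
    """Split a protein sequence at every exact match of the protease site."""
    site, offset = PROTEASE_SITES[protease]
    fragments = []
    start = 0
    pos = protein.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(protein[start:cut])
        start = cut
        pos = protein.find(site, cut)
    fragments.append(protein[start:])
    return fragments


# Hypothetical fusion: 6xHis tag + TEV site + an invented payload sequence.
fusion = "HHHHHH" + "ENLYFQS" + "MKTAYIAK"
print(cleave(fusion, "TEV"))       # ['HHHHHHENLYFQ', 'SMKTAYIAK']
print(cleave(fusion, "Thrombin"))  # no site present, sequence returned whole
```

Note that the residues downstream of the cut (here the Ser of the TEV site) stay on the released protein, which is worth keeping in mind when designing the junction.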

This article is not a comprehensive guide to all tags, but rather a quick overview of why scientists use tags, with a few time-tested tags and fusion proteins as examples. The tables list more common tags than are described in the post, but have been categorized to help you better assess their function. More detailed information and some protocols can be found in the references provided.

Tuesday, 10 March 2015

E. coli Strains for Protein Expression

Many challenges can arise when over-expressing a foreign protein in E. coli. We will review the potential pitfalls of recombinant protein expression and some of the most popular commercial strains designed to avoid them.
Why do I need an expression strain?
Protein expression from high-copy number plasmids and powerful promoters will greatly exceed that of any native host protein, using up valuable resources in the cell and thus slowing growth. Additionally, some protein products may be toxic to the host when expressed, particularly those that are insoluble, act on DNA, or are enzymatically active. For this reason, recombinant proteins are typically expressed in E. coli engineered to accommodate high protein loads using inducible promoter systems (which will be discussed later). In addition to the basic genotypes outlined below, certain specialized strains are available to confer greater transcriptional control, assist with proper protein folding, and deal with sub-optimal codon usage (Table 1).
A few mutations are common to all or most expression strains to accommodate high protein levels, including:
  • ompT: Strains harboring this mutation are deficient in outer membrane protease VII, which reduces proteolysis of the expressed recombinant proteins.
  • lon protease: Strains where this is completely deleted (designated lon or Δlon) similarly reduce proteolysis of the expressed proteins.
  • hsdSB (rB- mB-): These strains have an inactivated native restriction/methylation system. This means the strain can neither restrict nor methylate DNA.
  • dcm: Similarly, strains with this mutation are unable to methylate cytosine within a particular sequence.
Table 1: E. coli Expression Strains 
Note: All strains are derived from the E. coli B strain, except ** which are K12
Strain | Resistance | Key Features | Genotype | Use
BL21 (DE3) | - | Basic IPTG-inducible strain containing T7 RNAP (DE3) | F- ompT lon hsdSB(rB- mB-) gal dcm (DE3) | General protein expression
BL21 (DE3) pLysS* | Chloramphenicol (pLysS) | pLysS expresses T7 lysozyme to reduce basal expression levels; expression vector cannot have p15A origin of replication | F- ompT lon hsdSB(rB- mB-) gal dcm(DE3) pLysS (CamR) | Expression of toxic proteins
BL21 (DE3) pLysE* | Chloramphenicol (pLysE) | pLysE has higher T7 lysozyme expression than pLysS; expression vector cannot have p15A origin of replication | F- ompT lon hsdSB(rB- mB-) gal dcm(DE3) pLysE (CamR) | Expression of toxic proteins
BL21 star (DE3) | - | Lacks functional RNaseE, which results in longer transcript half-life | F- ompT lon hsdSB(rB- mB-) gal dcm rne131 (DE3) | General expression; not recommended for toxic proteins
BL21-A1 | Tetracycline | Arabinose-inducible expression of T7 RNAP; IPTG may still be required for expression | F- ompT lon hsdSB(rB- mB-) gal dcm araB::T7RNAP-tetA | General protein expression
BLR (DE3) | Tetracycline | RecA-deficient; best for plasmids with repetitive sequences | F- ompT lon hsdSB(rB- mB-) gal dcm(DE3) Δ(srl-recA)306::Tn10 (TetR) | Expression of unstable proteins
HMS174 (DE3)** | Rifampicin | RecA-deficient; allows for cloning and expression in same strain | F- recA1 hsdR(rK12- mK12+) (DE3) (RifR) | Expression of unstable proteins
Tuner (DE3) | - | Contains mutated lac permease, which allows for linear control of expression | F- ompT lon hsdSB(rB- mB-) gal dcm lacY1(DE3) | Expression of toxic or insoluble proteins
Origami2 (DE3)** | Streptomycin and Tetracycline | Carries mutations in thioredoxin reductase (trxB) and glutathione reductase (gor) to facilitate proper folding; may increase multimer formation | Δ(ara-leu)7697 ΔlacX74 ΔphoA PvuII phoR araD139 ahpC galE galK rpsL F′[lac+ lacIq pro] (DE3) gor522::Tn10 trxB (StrR, TetR) | Expression of insoluble proteins
Rosetta2 (DE3)* | Chloramphenicol (pRARE) | Good for “universal” translation; contains 7 additional tRNAs for rare codons not normally used in E. coli. Expression vector cannot have p15A origin of replication | F- ompT hsdSB(rB- mB-) gal dcm (DE3) pRARE2 (CamR) | Expression of eukaryotic proteins
Lemo21 (DE3)* | Chloramphenicol (pLemo) | Rhamnose-tunable T7 RNAP expression alleviates inclusion body formation. Expression vector cannot have p15A origin of replication | fhuA2 [lon] ompT gal (λ DE3) [dcm] ∆hsdS/ pLemo (CamR) | Expression of toxic, insoluble, or membrane proteins
T7 Express | - | IPTG-inducible expression of T7 RNAP from the genome; does not restrict methylated DNA | fhuA2 lacZ::T7 gene1 [lon] ompT gal sulA11 R(mcr-73::miniTn10--TetS)2 [dcm] R(zgb-210::Tn10--TetS) | General protein expression
m15 pREP4*, ** | Kanamycin (pREP4) | Cis-repression of the E. coli T5 promoter (found on vectors such as pQE or similar), inducible under IPTG (lac repressor on the pREP4 plasmid). Expression vector cannot have p15A origin of replication | F-, Φ80ΔlacM15, thi, lac-, mtl-, recA+, KmR | Expression of toxic proteins
* Denotes the presence of an additional plasmid; make sure to maintain it by growing on appropriate selective media. Note: Purifying your expression plasmid from these strains is not recommended, as these auxiliary plasmids may be co-isolated during the prepping process.
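To make the footnote concrete, here is a minimal Python sketch of how one might work out which antibiotics a culture needs. The helper plasmids and markers are copied from the starred rows of Table 1; the function name, the dictionary, and the rule that the expression plasmid should carry a different marker from the helper plasmid are illustrative assumptions, not part of any published tool.

```python
# Helper plasmids and their selection markers, taken from the starred (*) rows of Table 1.
# The dictionary and function names are hypothetical, chosen only for this sketch.
HELPER_PLASMIDS = {
    "BL21 (DE3) pLysS": ("pLysS", "chloramphenicol"),
    "BL21 (DE3) pLysE": ("pLysE", "chloramphenicol"),
    "Rosetta2 (DE3)": ("pRARE2", "chloramphenicol"),
    "Lemo21 (DE3)": ("pLemo", "chloramphenicol"),
    "M15 pREP4": ("pREP4", "kanamycin"),
}


def selection_antibiotics(strain, expression_marker):
    """Return the antibiotics needed to keep every plasmid in the culture.

    `expression_marker` is the antibiotic your expression plasmid confers
    resistance to. Strains without a helper plasmid need only that antibiotic.
    """
    expression_marker = expression_marker.lower()
    helper = HELPER_PLASMIDS.get(strain)
    if helper is None:
        return [expression_marker]

    helper_name, helper_marker = helper
    if helper_marker == expression_marker:
        # Assumption: with identical markers, one antibiotic cannot select
        # for both plasmids independently, so flag the combination.
        raise ValueError(
            f"{helper_name} and the expression plasmid both use "
            f"{helper_marker}; choose a different marker or strain"
        )
    return [expression_marker, helper_marker]


if __name__ == "__main__":
    print(selection_antibiotics("BL21 (DE3) pLysS", "ampicillin"))
    # -> ['ampicillin', 'chloramphenicol']
```

Chromosomal resistances listed in the table (for example tetracycline in BLR) are deliberately left out, since the footnote only concerns maintaining the additional plasmid.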
How does inducible expression work?
As mentioned above, many expression plasmids utilize inducible promoters, which are 'inactive' until an inducer such as IPTG is added to the growth medium. Induction timing is important, as you typically want to make sure your cells have first reached an appropriate density. Cells in the exponential growth phase are alive and healthy, which makes them ideal for protein expression. If you induce too late, your culture will start accumulating dead cells; if you induce too early, there will not be enough cells in the culture to make much protein.
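As a rough planning aid, the time remaining until a culture reaches its induction density can be estimated from the standard exponential growth relation, t = doubling time × log2(target density / current density). The sketch below assumes growth stays exponential over the whole interval. The specific numbers in the example (an OD600 target of 0.5 and a 30-minute doubling time) are illustrative assumptions, not values from this post, so always confirm with an actual reading before inducing.

```python
import math


def minutes_until_density(od_now, od_target, doubling_time_min):
    """Estimate minutes until a culture reaches `od_target`.

    Assumes purely exponential growth with a constant doubling time, which
    only holds while the culture is in the exponential phase.
    """
    if od_now <= 0 or od_target <= 0 or doubling_time_min <= 0:
        raise ValueError("densities and doubling time must be positive")
    if od_now >= od_target:
        return 0.0  # already at or past the target density
    return doubling_time_min * math.log2(od_target / od_now)


if __name__ == "__main__":
    # Hypothetical readings: OD600 of 0.125 now, induce at 0.5, 30 min doubling.
    # Two doublings are needed, so the estimate is about 60 minutes.
    print(round(minutes_until_density(0.125, 0.5, 30)))
```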
The DE3 lysogen/T7 promoter combination is the most popular induction system. The DE3 lysogen expresses T7 RNA polymerase (RNAP) from the bacterial genome under control of the lac repressor, which is inducible by the addition of IPTG. T7 RNAP is then available to transcribe the gene of interest from a T7 promoter on the plasmid. Many commercial strains carry the DE3 lysogen, as indicated by the name of the strain. Conversely, other strains such as M15(pREP4) use a lac repressor to act directly on the expression plasmid in order to repress transcription from a hybrid promoter.
Although the DE3/T7 RNAP system works well for most experiments, the lac promoter can “leak,” meaning that a low level of expression exists even without the addition of IPTG. This is mostly a problem for toxic protein products, which can prevent the culture from reaching the desired density within a reasonable time frame. For these cases, some strains carry an additional measure of control such as the pLys plasmid, which suppresses basal T7 expression by producing T7 lysozyme, a natural inhibitor of T7 RNAP. The pLys plasmid contains a chloramphenicol resistance cassette for positive selection and a p15A origin of replication, making it incompatible with other p15A plasmids. pLys comes in two flavors, pLysS and pLysE; the latter expresses more T7 lysozyme and therefore provides tighter control of basal expression.
What if I don't see protein overexpression?
The strains described above should generate sufficient expression levels for most purposes, but what do you do when you’ve tried a common strain and don’t get the desired level of (or any) protein expression? Low expression can result from a variety of sources, so fear not: there are a few simple troubleshooting measures that can help get you back on track.

  • Compatibility: Double-check your plasmid backbone and expression strain to make sure they are compatible. An arabinose-inducible plasmid will not express in an IPTG induction strain, for example, nor will a p15A plasmid be compatible with a pLys strain. Your strain may require additional antibiotic selection or a special growth medium. If your plasmid is low-copy, consider reducing the antibiotic concentration.
  • Growth Temperature: Analyze your expression conditions by setting up a small-scale expression experiment to test variables such as temperature, time, and media conditions. Many recombinant proteins express better at 30°C or room temperature. To achieve this, grow your culture to the desired density at 37°C, then reduce the temperature or move the culture to a bench-top shaker 10-20 minutes before adding the inducer.
  • Growth Media: Changing media is tricky because there can be a trade-off between growth rate and protein quality. For many proteins, a rich medium such as TB or 2XYT is optimal because of the high cell density it supports; however, minimal medium supplemented with M9 salts may be preferable if the protein product is secreted into the medium or if slow expression is required due to solubility concerns.
  • Insoluble and Secreted Proteins: The most common purification protocols are designed for soluble, cytosolic protein products, but this is not always achievable. Proteins that contain hydrophobic regions or multiple disulfide bonds may aggregate and become insoluble. These insoluble globs of misfolded protein are known as inclusion bodies, and can be recovered and purified using a special protocol. Alternatively, reducing the concentration of inducer or adding an affinity tag such as GST may help with solubility issues.
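The compatibility check in the first bullet can be sketched as a small Python routine. It encodes only the two rules stated above: the inducer must match, and the expression plasmid must not share an origin of replication with the strain's helper plasmid. The `Strain` and `Plasmid` classes, their fields, and the example plasmid names are hypothetical and deliberately simplified; real compatibility depends on more than these two properties.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Strain:
    name: str
    inducer: str                      # e.g. "IPTG" or "arabinose"
    helper_ori: Optional[str] = None  # origin of any helper plasmid, e.g. "p15A"


@dataclass(frozen=True)
class Plasmid:
    name: str
    inducer: str  # inducer the plasmid's promoter system responds to
    ori: str      # origin of replication


def compatibility_problems(strain: Strain, plasmid: Plasmid) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if plasmid.inducer.lower() != strain.inducer.lower():
        problems.append(
            f"{plasmid.name} is {plasmid.inducer}-inducible but "
            f"{strain.name} is a {strain.inducer} induction strain"
        )
    if strain.helper_ori is not None and plasmid.ori.lower() == strain.helper_ori.lower():
        problems.append(
            f"{plasmid.name} shares the {plasmid.ori} origin with the "
            f"helper plasmid in {strain.name}"
        )
    return problems


if __name__ == "__main__":
    plyss = Strain("BL21 (DE3) pLysS", inducer="IPTG", helper_ori="p15A")
    # "pExampleA" and "pExampleB" are made-up plasmids for illustration only.
    for plasmid in (
        Plasmid("pExampleA", inducer="IPTG", ori="p15A"),
        Plasmid("pExampleB", inducer="IPTG", ori="pMB1"),
    ):
        issues = compatibility_problems(plyss, plasmid)
        print(plasmid.name, "->", issues or "no problems found")
```

Returning a list of problems rather than a single yes/no answer keeps every mismatch visible at once, which is handy when troubleshooting a failed expression run.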