Why overexpress a gene? It is often desirable to manipulate cells to produce a massive quantity of a particular protein ("overexpress the gene" = "overproduce the protein") for biochemical studies or for commercial purposes. Why not simply purify the protein from cells without overexpression?
Consider the lac repressor. There are ~10 molecules of this protein in each cell. For biochemical and structural studies, we'd need about 1 µmol of lac repressor total. Using Avogadro's number (6 x 10^23 molecules/mole), we find that 1 µmol corresponds to 6 x 10^17 molecules. Thus, to get our 1 µmol, we'd need about 6 x 10^16 cells. A typical E. coli culture at stationary phase has ~4 x 10^8 cells/mL (or ~4 x 10^11 cells/L). Therefore, assuming 100% recovery of the protein, we'd need about 150,000 liters of culture to get our 1 µmol!! Obviously, this is not practical. The reason we overexpress a gene is simply to increase the number of molecules of the desired protein per cell so that we can acquire a decent quantity of product in reasonably sized cultures.
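The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and its `recovery` parameter are made up for illustration, and the 100,000 molecules per cell used for the overexpressing strain is a hypothetical figure, not a number from these notes.

```python
AVOGADRO = 6e23  # molecules per mole, rounded as in the text


def culture_volume_liters(target_mol, molecules_per_cell, cells_per_ml, recovery=1.0):
    """Liters of culture needed to obtain `target_mol` moles of a protein."""
    molecules_needed = target_mol * AVOGADRO
    # Each cell yields molecules_per_cell * recovery usable molecules.
    cells_needed = molecules_needed / (molecules_per_cell * recovery)
    cells_per_liter = cells_per_ml * 1000
    return cells_needed / cells_per_liter


# Native lac repressor level from the text: ~10 molecules per cell.
native = culture_volume_liters(1e-6, 10, 4e8)

# Hypothetical overexpressing strain (assumed value, for illustration only).
overexpressed = culture_volume_liters(1e-6, 1e5, 4e8)

print(f"native: {native:,.0f} L, overexpressed: {overexpressed:,.0f} L")
```

With the numbers from the text the first value comes out at about 150,000 L. Raising the number of molecules per cell by some factor shrinks the required volume by the same factor.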
A. Minimal Requirements for the Insert for Overexpression
1. The insert must have clonable ends, compatible with the vector, usually generated by restriction digestion. If the ends are incompatible with the vector, they can be filled in with a polymerase to produce blunt ends and ligated to a vector also possessing blunt ends.
Non-directional cloning - if the insert contains two identical ends, half of the clones recovered will be in the wrong orientation. Candidate clones will need to be screened to find those with the insert in the desired orientation.
2. The gene within the insert must begin with a "start" codon and end with a "stop" codon.
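Requirement 2 is easy to check on a candidate sequence before any cloning is done. The sketch below assumes the gene is given as its coding strand, 5' to 3', and accepts only ATG as the start codon (some bacterial genes use other start codons, so this is a simplification). The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_complete_orf(gene):
    """True if `gene` starts with ATG, ends with a stop codon, is a whole
    number of codons long, and has no premature in-frame stop codon."""
    seq = gene.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon before the last codon would truncate the protein.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


print(has_complete_orf("ATGAAAGCTTAA"))  # start + 2 codons + stop
print(has_complete_orf("ATGAAAGCT"))     # no stop codon at the end
```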
B. "Bells and Whistles" for the Insert - Tricks to Increase the Likelihood of a Successful Cloning
1. Directional cloning. The insert is prepared with two different restriction ends, both of which are compatible with the vector. This "forces" the insert to join with the vector in only the desired orientation. A convenient way to generate directional ends for cloning is to use PCR with primers containing two different restriction enzyme recognition sites.
2. Modifications to the stop codon.
a. Multiple stop codons are sometimes used.
b. Stop codons in all three reading frames are sometimes supplied.
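Both tricks in this section can be sketched in code. The first function builds a pair of PCR primers that carry two different restriction sites on their 5' ends; EcoRI (GAATTC) and BamHI (GGATCC) are used only as an example pair, and the two extra "GC" bases in front of each site and the 18-base annealing length are arbitrary illustrative choices. The second function reports which of the three forward reading frames of a sequence contain a stop codon. All function and parameter names are hypothetical.

```python
ECORI = "GAATTC"   # example site for the forward primer
BAMHI = "GGATCC"   # example site for the reverse primer
STOP_CODONS = ("TAA", "TAG", "TGA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def directional_primers(template, anneal_len=18,
                        fwd_site=ECORI, rev_site=BAMHI, clamp="GC"):
    """Return (forward, reverse) primers, written 5'->3'.

    Each primer is: extra bases + restriction site + annealing region.
    `template` is the coding strand of the region to amplify.
    """
    template = template.upper()
    forward = clamp + fwd_site + template[:anneal_len]
    reverse = clamp + rev_site + reverse_complement(template[-anneal_len:])
    return forward, reverse


def frames_with_stop(seq):
    """Set of forward reading frames (0, 1, 2) that contain a stop codon."""
    seq = seq.upper()
    return {frame
            for frame in range(3)
            for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS}


gene = "ATG" + "GCT" * 10 + "TAA"
fwd, rev = directional_primers(gene)
print(fwd)
print(rev)

# One short sequence that supplies a stop codon in every frame.
print(frames_with_stop("TAATTAATTAA"))
```

The sketch does not check whether either recognition site also occurs inside the insert; in practice that must be ruled out, or the digest will cut the gene itself.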
C. Minimal Requirements for the Vector for Overexpression.
D. "Bells and Whistles" for the Vector - Embellishments in Vector Design to Make Cloning and Expression Easier.
1. Modifications to the cloning site
a. Polylinker regions contain multiple cloning sites very near to each other. Each site is unique on the plasmid and so is an option for insertion of foreign DNA. By having many different restriction enzyme recognition sites close together, polylinkers offer many choices in regard to which enzymes to use to prepare the vector and insert for cloning.
b. Alpha complementation provides a simple method to determine if DNA has been inserted into a vector. Invariably, many of the clones recovered will not contain inserts. The polylinker is strategically positioned within the open reading frame of a truncated lacZ gene called lacZ(alpha). This fragment can be complemented by the "remaining portion" of lacZ encoded by the host gene lacZ(beta). Thus, the protein products of these two genes can unite to form an active enzyme capable of converting the substrate X-gal to a blue product. The idea is to disrupt the lacZ(alpha) gene by cloning inserts into it within the polylinker region. If a vector acquires an insert, the lacZ(alpha) gene will not be functional and there will be no complementation with lacZ(beta), no active enzyme produced, and the colonies will remain white on plates containing X-gal. If a vector does not acquire an insert, then alpha complementation occurs and the colonies are blue.
2. Modifications to the origin of replication
a. Multiple origins of replication may be included on the vector. Most origins of replication are species-specific, i.e. different origin sequences are required for different species. Shuttle vectors contain two origins that will allow replication in two different species, one of which is usually E. coli. The advantage of this system is that it allows one to do all the cloning, amplification and manipulation in a familiar organism like E. coli before introducing the vector into the organism under investigation.
b. The copy number (simply the number of copies of vector per cell) can be varied by using different types of origins of replication.
3. Modifications to the selectable markers
a. Multiple antibiotic resistance genes may be present on the vector. Shuttle vectors usually contain two different markers so that the plasmid may be selected in two different species.
b. Rather than using antibiotic resistance genes, some vectors employ genes that can complement a nutritional auxotrophy as selectable markers.
4. Modifications to the promoter
a. The ideal promoter is close to the consensus sequence so that expression is strong.
b. It is also desirable to use a tightly controllable, inducible promoter. The idea is to repress expression of the cloned inserts while the cells are growing and then to induce when cells are at mid- to late-exponential growth. This strategy ensures that expression of the insert does not hinder the growth rate during the early stages of growth.
5. Modifications to the ribosome binding site
a. The ideal ribosome binding site is close to the consensus sequence so that translation is strong.
b. Multiple ribosome binding sequences are sometimes used to increase translation levels.
If you are just fussy, or if you are desperate to get expression and the above modifications are insufficient:
6. The addition of transcriptional stop signals (inverted repeats that form a stem/loop structure, followed by a poly(T) sequence) or of rho sites sometimes boosts expression.
7. The codon usage within the insert can be changed. Although the genetic code is degenerate, organisms characteristically "prefer" certain codons when there is a choice of several. Expression of cloned inserts can be enhanced by changing codons to match the "favorite" codons of the host.
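Item 7 amounts to a codon-by-codon substitution that leaves the encoded protein unchanged. The sketch below assumes E. coli as the host and uses a deliberately tiny table of six codons that are commonly cited as rare in E. coli, each mapped to a frequently used synonymous codon. The table is illustrative only; real work would use a full codon usage table for the actual host strain. The function name is hypothetical.

```python
# Rare codon -> commonly used synonymous codon (illustrative subset, E. coli).
RARE_TO_PREFERRED = {
    "AGA": "CGT",  # Arg
    "AGG": "CGT",  # Arg
    "ATA": "ATC",  # Ile
    "CTA": "CTG",  # Leu
    "CCC": "CCG",  # Pro
    "GGA": "GGT",  # Gly
}


def recode_rare_codons(orf, table=RARE_TO_PREFERRED):
    """Replace rare codons in an in-frame ORF with synonymous preferred ones."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    # Codons not listed in the table are left unchanged.
    return "".join(table.get(codon, codon) for codon in codons)


print(recode_rare_codons("ATGAGAATACTATAA"))
```

Because every substitution is synonymous, the amino acid sequence is unchanged; only the DNA sequence, and therefore the demand on the host's tRNA pool, differs.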
How do we know we have the right clone inserted into the vector? The presence and the identity of the insert may be verified by restriction mapping, sequencing, hybridization, etc.
E. A Recent (and Very Popular) Addition to Expression Technology
Affinity tags are short sequences added to genes inserted into expression vectors so that the protein product may be selectively purified.
e.g. Polyhistidine tags consist of a stretch of six consecutive histidines. Proteins bearing such tags can bind to columns containing divalent metals like Ni2+. The strategy is to apply a crude mixture of proteins containing the polyhistidine-tagged product to these columns, wash away unwanted proteins that do not bind the column, and finally to elute the desired protein from the column using free histidine.
The advantage of using affinity tags is that protein purification is simple and gentle. The disadvantage is that adding such tags may change the structure and therefore the function of the protein. However, vectors have recently been introduced that allow the proteolytic removal of the affinity tag after the protein has been purified.
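At the DNA level, adding a polyhistidine tag means adding six histidine codons in frame with the gene. The sketch below places the tag at the C-terminus, just before the stop codon. That placement is an assumption made for illustration (tags can also go at the N-terminus), as are the function name and the choice of CAC rather than CAT as the histidine codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
HIS_CODON = "CAC"  # CAT also encodes histidine


def add_c_terminal_his_tag(orf, n_his=6):
    """Insert `n_his` histidine codons immediately before the stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("expected an in-frame ORF ending in a stop codon")
    # Keep the original stop codon so translation still terminates after the tag.
    return orf[:-3] + HIS_CODON * n_his + orf[-3:]


tagged = add_c_terminal_his_tag("ATGAAAGCTTAA")
print(tagged)
```

Vectors that allow proteolytic removal of the tag would additionally place a protease recognition sequence between the protein and the histidines; that step is not shown here.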
F. Two Examples of Popular Expression Systems
1. Protein expression driven by host RNA polymerase using the tac promoter.
2. Protein expression driven by T7 RNA polymerase using a T7 promoter.
Lambda expression vectors are able to hold large inserts, a distinct advantage when cloning eukaryotic sequences. To clone a desired eukaryotic gene in lambda, the following strategy is frequently used.