Services - DNA Sequencing Electropherogram
Figure 11
Electropherograms that have "the spread".

"The spread" is characterized by delayed migration of the extension products in the capillary during electrophoresis and by a premature, rapid widening or "spreading" of the DNA bands (see the figures below). The peaks overlap and are assigned bases incorrectly, which results in "N" basecalls in the sequence. This problem arises during capillary electrophoresis, not during the thermal cycling extension step. It is caused by an as yet unidentified small ionic molecule produced by E. coli that copurifies with the plasmid DNA when solid phase extraction kits are used, e.g. Qiaprep and Wizard kits. For more information about "the spread", see below.
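Since the spread shows up in the sequence text as "N" basecalls, a quick tally of them can help when comparing an original run with a rerun. The following is only a minimal sketch in Python; the function name `n_call_summary` and the returned fields are hypothetical and are not part of any facility software.

```python
def n_call_summary(sequence: str) -> dict:
    """Summarize ambiguous ("N") basecalls in a sequence string.

    Hypothetical helper for illustration only.
    """
    seq = sequence.upper()
    total = len(seq)
    n_count = seq.count("N")
    return {
        "length": total,
        "n_count": n_count,
        # Fraction of all basecalls that are "N" (0.0 for an empty sequence).
        "n_fraction": n_count / total if total else 0.0,
        # Zero-based index of the first "N", or -1 if there is none.
        "first_n_index": seq.find("N"),
    }
```

A first "N" that appears early in the read, together with a high fraction of "N" calls, is consistent with the overlap described above, although other problems can also produce "N" calls.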

There are three recommended solutions for this problem. First, the facility can treat the purified extension products with EDTA and rerun them. This always improves the results, but may only gain a few hundred bases if the spread is severe. There is also a loss of signal, so if the electropherogram had low signal the first time, there may be a problem with the rerun sequence. Second, the client can purify the plasmid again using an alkaline-lysis/phenol-chloroform protocol and then resubmit the template for sequencing. This protocol always reduces the spread, but it is more difficult than using solid phase extraction kits. In addition, the plasmid yield is usually lower and of reduced quality, and residual chemicals can inhibit the sequencing reaction. Third, the plasmid can be used as a template in PCR and the resulting amplicon can be submitted for the redo reaction. We recommend diluting the plasmid at least 1000-fold before adding it to the PCR reaction to dilute the contaminant, and purifying the amplicon with a solid phase extraction kit before submitting it for the sequencing reaction.
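The arithmetic behind the 1000-fold dilution in the third solution can be sketched as follows. This is an illustration in Python under stated assumptions: the function names and the example volumes and concentrations are hypothetical, and only the 1000-fold minimum comes from the recommendation above.

```python
def diluent_volume_ul(template_volume_ul: float, fold: float = 1000.0) -> float:
    """Volume of diluent (in microliters) to add to a template volume
    to reach the requested fold dilution.

    A fold of 1000 means the final volume is 1000 times the template volume.
    """
    if fold < 1:
        raise ValueError("fold dilution must be at least 1")
    return template_volume_ul * (fold - 1)


def diluted_concentration(stock_concentration: float, fold: float = 1000.0) -> float:
    """Concentration after dilution, in the same units as the stock."""
    if fold < 1:
        raise ValueError("fold dilution must be at least 1")
    return stock_concentration / fold
```

For example, 1 µL of plasmid needs 999 µL of diluent for a 1000-fold dilution, and a hypothetical 100 ng/µL stock would end up at 0.1 ng/µL. The same dilution can be reached in steps, such as a 10-fold dilution followed by a 100-fold dilution, which avoids pipetting into a large volume.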



Fig. 11a, Electropherogram that has "the spread".


Fig. 11b, Electropherogram that has "the spread".


Additional Explanation of "the spread"

The molecule that causes "the spread" has not been identified, but we do know a considerable amount about it. First, it is a small molecule that can be removed by size exclusion chromatography in the presence of a large amount of EDTA, which presumably chelates it from the DNA. Second, it is negatively charged, because it migrates through the capillary during electrophoresis under denaturing conditions. Third, it probably has positive charges as well, which would allow it to bind to the DNA during plasmid purification. Fourth, it has some solubility in organic solvents, because phenol/chloroform removes it from the DNA and its effects are minimized when extension product cleanup procedures that incorporate ethanol precipitation are used.

The molecule is generated by the E. coli in response to stress, i.e. when the cells are very "unhappy". Plasmids that are large, multihost, and designed for expressing proteins are the usual culprits. For example, one of the worst cases seen involved a yeast expression vector that carried an integral membrane protein. The E. coli culture carrying this plasmid died if the temperature rose above 30 °C, and it grew very, very slowly. The absolute worst case was a column-purified amplicon in which the PCR template was a plaque from an E. coli lawn. It is hard to imagine the cells being any more "unhappy" than when they are being lysed by a bacteriophage.

The mechanism by which it alters electrophoresis is unknown, but it does cause "the spread" in a concentration-dependent manner. For most electropherograms the first peak is at approximately scan line 1200. With a minor case of the spread, the first peak might be at around scan line 2000 to 2400. If significantly less template is used in the reaction, the resulting sequence will have a normal start point and 500 to 700 bases before the peaks overlap too much. For a reaction in which the first peak is at scan line 5000 (worthless sequence), significantly reducing the amount of template moves the start to around scan line 2500 to 3000, which still gives low-quality sequence. Of course, the lower the amount of template, the lower the amount of signal, which also affects the quality of the sequencing results.
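The scan line figures above can be turned into a rough triage of a run based on where its first peak appears. The sketch below is in Python and is not a facility tool. The reference points (about 1200 for a normal run, 2000 to 2400 for a minor case, 5000 for worthless sequence) come from the text, but the exact cutoffs between categories and the category names are assumptions made for illustration.

```python
NORMAL_FIRST_PEAK = 1200  # approximate first-peak scan line for most runs (from the text)


def classify_first_peak(scan_line: int) -> str:
    """Roughly classify the severity of the spread from the first-peak scan line.

    The boundaries between categories are illustrative assumptions.
    """
    if scan_line < 2000:
        return "normal"           # near the usual start of about 1200
    if scan_line <= 2400:
        return "minor spread"     # 2000 to 2400 is described as a minor case
    if scan_line < 5000:
        return "moderate spread"  # e.g. 2500 to 3000, still low-quality sequence
    return "severe spread"        # 5000 is described as worthless sequence


def migration_delay(scan_line: int) -> int:
    """Scan lines by which the first peak is delayed relative to a normal run."""
    return scan_line - NORMAL_FIRST_PEAK
```

For instance, a first peak at scan line 2200 falls in the minor category with a delay of about 1000 scan lines, which is the kind of run where reducing the template may recover a normal start point.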

If you have any idea as to the identity of this molecule, then we would really like to hear your suggestion and reasoning.