Patent application title: SECRETION OF FATTY ACIDS BY PHOTOSYNTHETIC MICROORGANISMS
Inventors:
Paul Gordon Roessler (San Diego, CA, US)
You Chen (San Diego, CA, US)
Bo Liu (San Diego, CA, US)
Corey Neal Dodge (Cardiff, CA, US)
IPC8 Class: AC12P764FI
USPC Class:
435134
Class name: Micro-organism, tissue cell culture or enzyme using process to synthesize a desired chemical compound or composition preparing oxygen-containing organic compound fat; fatty oil; ester-type wax; higher fatty acid (i.e., having at least seven carbon atoms in an unbroken chain bound to a carboxyl group); oxidized oil or fat
Publication date: 2009-12-03
Patent application number: 20090298143
Agents:
Synthetic Genomics c/o MoFo
Assignees:
Origin: SAN DIEGO, CA US
Abstract:
Recombinant photosynthetic microorganisms that convert inorganic carbon to
secreted fatty acids are described. Methods to recover the secreted fatty
acids from the culture medium without the need for cell harvesting are
also described.
Claims:
1. A cell culture of a recombinant photosynthetic microorganism, said
microorganism modified to contain a nucleic acid molecule comprising at
least one recombinant expression system that produces at least one
exogenous acyl-ACP thioesterase, wherein said acyl-ACP thioesterase
preferentially liberates a fatty acid chain that contains 6-20 carbons,
and wherein the culture medium provides inorganic carbon as substantially
the sole carbon source and wherein said microorganism secretes the fatty
acid liberated by the acyl-ACP thioesterase into the culture medium.
2. The culture of claim 1, wherein the at least one exogenous acyl-ACP thioesterase is a Fat B thioesterase.
3. The culture of claim 1, wherein the at least one exogenous acyl-ACP thioesterase is a Fat B thioesterase derived from the genus Cuphea.
4. The culture of claim 1, wherein the at least one exogenous acyl-ACP thioesterase is ChFatB2.
5. The culture of claim 1, wherein the recombinant photosynthetic microorganism has further been modified to produce an exogenous β-ketoacyl synthase (KAS).
6. The culture of claim 5, wherein the exogenous KAS preferentially produces acyl-ACPs having the chain length for which the thioesterase has preferred activity.
7. The culture of claim 1, wherein the recombinant photosynthetic microorganism is further modified so that one or more genes encoding beta-oxidation pathway enzymes are inactivated or downregulated, or said enzymes are inhibited.
8. The culture of claim 1, wherein the recombinant photosynthetic microorganism is further modified so that one or more genes encoding acyl-ACP synthetases are inactivated or downregulated, or said synthetases are inhibited.
9. The culture of claim 1, wherein the recombinant photosynthetic microorganism is further modified so that one or more genes encoding an enzyme involved in carbohydrate biosynthesis are inactivated or downregulated, or said enzymes are inhibited.
10. The culture of claim 9, wherein the enzyme involved in carbohydrate biosynthesis is a branching enzyme.
11. A method to convert inorganic carbon to fatty acids, said method comprising: incubating the culture of claim 1 such that the recombinant photosynthetic microorganism therein secretes the fatty acid into the culture medium; and recovering the secreted fatty acids from the culture medium.
12. The method of claim 11, wherein the fatty acids are recovered from the culture by contacting the medium with particulate adsorbents.
13. The method of claim 12, wherein the particulate adsorbents circulate in the medium.
14. The method of claim 12, wherein the particulate adsorbents are contained in a fixed bed column.
15. The method of claim 14, wherein the pH of the medium is lowered during said contacting.
16. The method of claim 15, wherein said pH lowering process comprises adding CO2.
17. The method of claim 16, wherein the medium is recirculated to the culture.
18. The method of claim 12, wherein the particulate adsorbents are lipophilic.
19. The method of claim 12, wherein the particulate adsorbents are ion exchange resins.
20. A composition comprising a fatty acid produced by the culture of claim 1.
21. The composition of claim 20, wherein the composition is used to produce another compound.
22. The composition of claim 20, wherein the composition is a biocrude.
23. A composition comprising a derivative of a fatty acid produced by the culture of claim 1.
24. The composition of claim 23, wherein the composition is a finished fuel or fuel additive.
25. The composition of claim 23, wherein the composition is a biological substitute for a petrochemical product.
26. The composition of claim 23, wherein the derivative is an alcohol, an alkane, or an alkene.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001]This application claims benefit of provisional application 61/007,333 filed 11 Dec. 2007. The contents of this application are incorporated herein by reference.
TECHNICAL FIELD
[0002]This invention relates to photosynthetic microorganisms that convert inorganic carbon to fatty acids and secrete them into the culture medium, methods of production of fatty acids using such organisms, and uses thereof. The fatty acids may be used directly or may be further modified to alternate forms such as esters, reduced forms such as alcohols, or hydrocarbons, for applications in different industries, including fuels and chemicals.
BACKGROUND ART
[0003]Photosynthetic microorganisms, including eukaryotic algae and cyanobacteria, contain various lipids, including polar lipids and neutral lipids. Polar lipids (e.g., phospholipids, glycolipids, sulfolipids) are typically present in structural membranes whereas neutral lipids (e.g., triacylglycerols, wax esters) accumulate in cytoplasmic oil bodies or oil globules. A substantial research effort has been devoted to the development of methods to produce lipid-based fuels and chemicals from photosynthetic microorganisms. Typically, eukaryotic microalgae are grown under nutrient-replete conditions until a certain cell density is achieved, after which the cells are subjected to growth under nutrient-deficient conditions, which often leads to the accumulation of neutral lipids. The cells are then harvested by various means (e.g., settling, which can be facilitated by the addition of flocculants, followed by centrifugation), dried, and then the lipids are extracted from the cells by the use of various non-polar solvents. Harvesting of the cells and extraction of the lipids are cost-intensive steps. It would be desirable to obtain lipids from photosynthetic microorganisms without the requirement for cell harvesting and extraction.
[0004]PCT publication numbers WO2007/136762 and WO2008/119082 describe the production of biofuel components using microorganisms. These documents disclose the production by these organisms of fatty acid derivatives which are, apparently, short and long chain alcohols, hydrocarbons, fatty alcohols and esters including waxes, fatty acid esters or fatty esters. To the extent that fatty acid production is described, it is proposed as an intermediate to these derivatives, and the fatty acids are therefore not secreted. Further, there is no disclosure of converting inorganic carbon directly to secreted fatty acids using a photosynthetic organism grown in a culture medium containing inorganic carbon as the primary carbon source. The present invention takes advantage of the efficiency of photosynthetic organisms in secreting fatty acids into the medium in order to recover these valuable compounds.
[0005]The invention includes the expression of heterologous acyl-ACP thioesterase (TE) genes in photosynthetic microbes. Many of these genes, along with their use to alter lipid metabolism in oilseeds, have been described previously. Genes encoding the proteins that catalyze various steps in the synthesis and further metabolism of fatty acids have also been extensively described.
[0006]The two functional classes of plant acyl-ACP thioesterases (unsaturated fatty acid-recognizing FatA versus saturated fatty acid-recognizing FatB) can be clustered based on amino acid sequence alignments as well as function. FatAs show marked preference for 18:1-ACP with minor activity towards 18:0- and 16:0-ACPs, and FatBs hydrolyze primarily saturated acyl-ACPs with chain lengths that vary between 8 and 16 carbons. Several studies have focused on engineering plant thioesterases with perfected or altered substrate specificities as a strategy for tailoring specialty seed oils.
[0007]As shown in FIG. 1, fatty acid synthetase catalyzes a repeating cycle wherein malonyl-acyl carrier protein (ACP) is condensed with a substrate, initially acetyl-CoA, to form acetoacetyl-ACP, liberating CO2. The acetoacetyl-ACP is then reduced, dehydrated, and reduced further to butyryl-ACP which can then itself be condensed with malonyl-ACP, and the cycle repeated, adding a 2-carbon unit at each turn. The production of free fatty acids would therefore be enhanced by a thioesterase that would liberate the fatty acid itself from ACP, breaking the cycle. That is, the acyl-ACP is prevented from reentering the cycle. Production of the fatty acid would also be encouraged by enhancing the levels of fatty acid synthetase and inhibiting any enzymes which result in degradation or further metabolism of the fatty acid.
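The cycle described above can be sketched as a toy model. The following Python is a minimal illustration, not a kinetic simulation: it assumes a two-carbon acetyl primer, a net gain of two carbons per condensation with malonyl-ACP, and a thioesterase that releases the chain the first time it reaches a preferred length. The function name, the `max_length` cutoff of 18 carbons, and the all-or-nothing release rule are assumptions made for illustration only.

```python
def elongate_until_release(te_preferred_lengths, max_length=18):
    """Toy model of the fatty acid synthesis cycle.

    te_preferred_lengths: set of chain lengths (in carbons) at which a
        hypothetical thioesterase liberates the fatty acid from ACP.
    max_length: assumed chain length at which elongation stops when no
        thioesterase acts earlier (illustrative value, not from the source).

    Returns (chain_length, cycles): the length of the released chain and
    the number of condensation cycles that produced it.
    """
    chain_length = 2  # acetyl primer derived from acetyl-CoA
    cycles = 0
    while chain_length < max_length:
        # Condensation with malonyl-ACP: three carbons in, one lost as CO2,
        # so the acyl chain grows by two carbons per turn of the cycle.
        chain_length += 2
        cycles += 1
        # The thioesterase breaks the cycle by releasing the free fatty acid.
        if chain_length in te_preferred_lengths:
            break
    return chain_length, cycles


if __name__ == "__main__":
    for preferred in ({8, 10}, {12}, set()):
        length, cycles = elongate_until_release(preferred)
        print(f"TE preference {sorted(preferred)}: C{length} after {cycles} cycles")
```

Under these assumptions, a thioesterase preferring 8- and 10-carbon chains releases an 8-carbon chain after three condensation cycles. Real enzymes compete with continued elongation, so a mixture of chain lengths would be expected in practice.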
[0008]FIG. 2 presents a more detailed description of the sequential formation of acyl-ACPs of longer and longer chains. As shown, the thioesterase enzymes listed in FIG. 2 liberate the fatty acid from the ACP thioester.
[0009]Taking advantage of this principle, Dehesh, K., et al., The Plant Journal (1996) 9:167-172, describe "Production of high levels of octanoic (8:0) and decanoic (10:0) fatty acids in transgenic canola by overexpression of ChFatB2, a thioesterase cDNA from Cuphea hookeriana." Dehesh, K., et al., Plant Physiology (1996) 110:203-210, report "Two novel thioesterases are key determinants of the bimodal distribution of acyl chain length of Cuphea palustris seed oil."
[0010]Voelker, T., et al., Science (1992) 257:72-74, describe "Fatty acid biosynthesis redirected to medium chains in transgenic oilseed plants." Voelker, T., and Davies, M., Journal of Bacteriology (1994) 176:7320-7327, describe "Alteration of the specificity and regulation of fatty acid synthesis of Escherichia coli by expression of a plant medium-chain acyl-acyl carrier protein thioesterase."
DISCLOSURE OF THE INVENTION
[0011]The present invention is directed to the production of recombinant photosynthetic microorganisms that are able to secrete fatty acids derived from inorganic carbon into the culture medium. Methods to remove the secreted fatty acids from the culture medium without the need for cell harvesting are also provided. It is anticipated that these improvements will lead to lower costs for producing lipid-based fuels and chemicals from photosynthetic microorganisms. In addition, this invention enables the production of fatty acids of defined chain length, thus allowing their use in the formulation of a variety of different products, including fuels and chemicals.
[0012]Carbon dioxide (which, along with carbonic acid, bicarbonate and/or carbonate define the term "inorganic carbon") is converted in the photosynthetic process to organic compounds. The inorganic carbon source includes any way of delivering inorganic carbon, optionally in admixture with any other combination of compounds which do not serve as the primary carbon feedstock, but only as a mixture or carrier (for example, emissions from biofuel (e.g., ethanol) plants, power plants, petroleum-based refineries, as well as atmospheric and subterranean sources).
[0013]One embodiment of the invention relates to a culture of recombinant photosynthetic microorganisms, said organisms comprising at least one recombinant expression vector encoding at least one exogenous acyl-ACP thioesterase, wherein the at least one exogenous acyl-ACP thioesterase preferentially liberates fatty acid chains containing 6 to 20 carbons from these ACP thioesters. The fatty acids are formed from inorganic carbon as their carbon source and the culture contains substantially only inorganic carbon as a carbon source. The presence of the exogenous thioesterase will increase the secretion levels of desired fatty acids by at least 2-4 fold.
[0014]Specifically, in one embodiment, the invention is directed to a cell culture of a recombinant photosynthetic microorganism where the microorganism has been modified to contain a nucleic acid molecule comprising at least one recombinant expression system that produces at least one exogenous acyl-ACP thioesterase, wherein said acyl-ACP thioesterase preferentially liberates a fatty acid chain that contains 6-20 carbons, and wherein the culture medium provides inorganic carbon as substantially the sole carbon source and wherein said microorganism secretes the fatty acid liberated by the acyl-ACP thioesterase into the medium. In alternative embodiments, the thioesterase preferentially liberates a fatty acid chain that contains 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 carbons.
[0015]In other aspects, the invention is directed to a method to produce fatty acids of desired chain lengths by incubating these cultures and recovering these secreted fatty acids from the cultures. In one embodiment, the recovery employs solid particulate adsorbents to harvest the secreted fatty acids. The fatty acids thus recovered can be further modified synthetically or used directly as components of biofuels or chemicals.
BRIEF DESCRIPTION OF THE DRAWINGS
[0016]FIG. 1 is a diagram of the pathway of fatty acid synthesis as is known in the art.
[0017]FIG. 2 is a more detailed diagram of the synthesis of fatty acids of multiple chain lengths as is known in the art.
[0018]FIG. 3 is an enzymatic overview of fatty acid biosynthesis identifying enzymatic classes for the production of various chain length fatty acids.
[0019]FIG. 4 is a schematic diagram of a recovery system for fatty acids from the medium.
[0020]FIG. 5 shows an experimental system based on the principles in FIG. 4.
[0021]FIG. 6 shows representative acyl-ACP thioesterases from a variety of organisms.
MODES OF CARRYING OUT THE INVENTION
[0022]The present invention provides photosynthetic microorganisms that secrete fatty acids into the culture medium, along with methods to adsorb the fatty acids from the culture medium and collect them for processing into fuels and chemicals. The invention thereby eliminates or greatly reduces the need to harvest and extract the cells, resulting in substantially reduced production costs.
[0023]FIG. 2 is an overview of one aspect of the invention. As shown in FIG. 2, carbon dioxide is converted to acetyl-CoA using the multiple steps in the photosynthetic process. The acetyl-CoA is then converted to malonyl-CoA by the action of acetyl-CoA carboxylase. The malonyl-CoA is then converted to malonyl-ACP by the action of malonyl-CoA:ACP transacylase which, upon progressive action of fatty acid synthetase, results in successive additions of two carbon units. In one embodiment of the invention, the process is essentially halted at carbon chain lengths of 6 or 8 or 10 or 12 or 14 or 16 or 18 carbons by supplying the appropriate thioesterase (shown in FIG. 2 as FatB). To the extent that further conversions to longer chain fatty acids occur in this embodiment, the cell biomass can be harvested as well. The secreted fatty acids can be converted to various other forms including, for example, methyl esters, alkanes, alkenes, alpha-olefins and fatty alcohols.
[0024]Thioesterases (Acyl-ACP TEs)
[0025]In order to effect secretion of the free fatty acids, the organism is provided at least one expression system for at least one thioesterase that operates preferentially to liberate fatty acids of the desired length. Many genes encoding such thioesterases are available in the art. Some of these are subjects of U.S. patents as follows:
[0026]Examples include U.S. Pat. No. 5,298,421, entitled "Plant medium-chain-preferring acyl-ACP thioesterases and related methods," which describes the isolation of an acyl-ACP thioesterase and the gene that encodes it from the immature seeds of Umbellularia californica. Other sources for such thioesterases and their encoding genes include U.S. Pat. No. 5,304,481, entitled "Plant thioesterase having preferential hydrolase activity toward C12 acyl-ACP substrate," U.S. Pat. No. 5,344,771, entitled "Plant thioesterases," U.S. Pat. No. 5,455,167, entitled "Medium-chain thioesterases in plants," U.S. Pat. No. 5,512,482, entitled "Plant thioesterases," U.S. Pat. No. 5,530,186, entitled "Nucleotide sequences of soybean acyl-ACP thioesterase genes," U.S. Pat. No. 5,639,790, entitled "Plant medium-chain thioesterases," U.S. Pat. No. 5,667,997, entitled "C8 and C10 medium-chain thioesterases in plants," U.S. Pat. No. 5,723,761, entitled "Plant acyl-ACP thioesterase sequences," U.S. Pat. No. 5,807,893, entitled "Plant thioesterases and use for modification of fatty acid composition in plant seed oils," U.S. Pat. No. 5,850,022, entitled "Production of myristate in plant cells," U.S. Pat. No. 5,910,631, entitled "Middle chain-specific thioesterase genes from Cuphea lanceolata," U.S. Pat. No. 5,945,585, entitled "Specific for palmitoyl, stearoyl and oleoyl-alp thioesters nucleic acid fragments encoding acyl-ACP thioesterase enzymes and the use of these fragments in altering plant oil composition," U.S. Pat. No. 5,955,329, entitled "Engineering plant thioesterases for altered substrate specificity," U.S. Pat. No. 5,955,650, entitled "Nucleotide sequences of canola and soybean palmitoyl-ACP thioesterase genes and their use in the regulation of fatty acid content of the oils of soybean and canola plants," and U.S. Pat. No. 6,331,664, entitled "Acyl-ACP thioesterase nucleic acids from maize and methods of altering palmitic acid levels in transgenic plants therewith."
[0027]Others are described in the open literature as follows:
[0028]Dormann, P., et al., Planta (1993) 189:425-432, describe "Characterization of two acyl-acyl carrier protein thioesterases from developing Cuphea seeds specific for medium-chain and oleoyl-acyl carrier protein." Dormann, P., et al., Biochimica et Biophysica Acta (1994) 1212:134-136, describe "Cloning and expression in Escherichia coli of a cDNA coding for the oleoyl-acyl carrier protein thioesterase from coriander (Coriandrum sativum L.)." Filichkin, S., et al., European Journal of Lipid Science and Technology (2006) 108:979-990, describe "New FATB thioesterases from a high-laurate Cuphea species: Functional and complementation analyses." Jones, A., et al., Plant Cell (1995) 7:359-371, describe "Palmitoyl-acyl carrier protein (ACP) thioesterase and the evolutionary origin of plant acyl-ACP thioesterases." Knutzon, D. S., et al., Plant Physiology (1992) 100:1751-1758, describe "Isolation and characterization of two safflower oleoyl-acyl carrier protein thioesterase cDNA clones." Slabaugh, M., et al., The Plant Journal (1998) 13:611-620, describe "Condensing enzymes from Cuphea wrightii associated with medium chain fatty acid biosynthesis."
[0029]Additional genes, not previously isolated, that encode these acyl-ACP TEs can be isolated from plants that naturally contain large amounts of medium-chain fatty acids in their seed oil, including certain plants in the Lauraceae, Lythraceae, Rutaceae, Ulmaceae, and Vochysiaceae families. Typically, the fatty acids produced by the seeds of these plants are esterified to glycerol and retained inside the cells. The seeds containing the products can then be harvested and processed to isolate the fatty acids. Other sources of these enzymes, such as bacteria, may also be used.
[0030]The known acyl-ACP TEs from plants can be divided into two main classes, based on their amino acid sequences and their specificity for acyl-ACPs of differing chain lengths and degrees of unsaturation. The "FatA" type of plant acyl-ACP TE has preferential activity on oleoyl-ACP, thereby releasing oleic acid, an 18-carbon fatty acid with a single double bond nine carbons distal to the carboxyl group. The "FatB" type of plant acyl-ACP TE has preferential activity on saturated acyl-ACPs, and can have broad or narrow chain length specificities. For example, FatB enzymes from different species of Cuphea have been shown to release fatty acids ranging from eight carbons in length to sixteen carbons in length from the corresponding acyl-ACPs. Listed below in Table 1 are several plant acyl-ACP TEs along with their substrate preferences. (Fatty acids are designated by standard shorthand notation, wherein the number preceding the colon represents the acyl chain length and the number after the colon represents the number of double bonds in the acyl chain.)
TABLE 1
Plant                      Acyl-ACP thioesterase   Substrate preference
Garcinia mangostana        FatA                    18:1 and 18:0
Carthamus tinctorius       FatA                    18:1
Coriandrum sativum         FatA                    18:1
Cuphea hookeriana          FatB1                   16:0
Cuphea hookeriana          FatB2                   8:0 and 10:0
Cuphea wrightii            FatB1                   12:0 to 16:0
Cuphea palustris           FatB1                   8:0 and 10:0
Cuphea palustris           FatB2                   14:0 and 16:0
Cuphea calophylla          FatB1                   12:0 to 16:0
Umbellularia californica   FatB1                   12:0
Ulmus americana            FatB1                   8:0 and 10:0
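As an illustration of how the shorthand notation and Table 1 might be used to shortlist a thioesterase for a target fatty acid, here is a minimal Python sketch. The table entries are copied from Table 1; the function names (`parse_shorthand`, `parse_preference`, `candidates`) are hypothetical, and treating "12:0 to 16:0" as an inclusive range of chain lengths is an assumption about how the table should be read.

```python
# Rows copied from Table 1: (plant, enzyme, substrate preference as printed).
TABLE_1 = [
    ("Garcinia mangostana", "FatA", "18:1 and 18:0"),
    ("Carthamus tinctorius", "FatA", "18:1"),
    ("Coriandrum sativum", "FatA", "18:1"),
    ("Cuphea hookeriana", "FatB1", "16:0"),
    ("Cuphea hookeriana", "FatB2", "8:0 and 10:0"),
    ("Cuphea wrightii", "FatB1", "12:0 to 16:0"),
    ("Cuphea palustris", "FatB1", "8:0 and 10:0"),
    ("Cuphea palustris", "FatB2", "14:0 and 16:0"),
    ("Cuphea calophylla", "FatB1", "12:0 to 16:0"),
    ("Umbellularia californica", "FatB1", "12:0"),
    ("Ulmus americana", "FatB1", "8:0 and 10:0"),
]


def parse_shorthand(code):
    """Split 'chain_length:double_bonds' (e.g. '18:1') into two integers."""
    length, double_bonds = code.split(":")
    return int(length), int(double_bonds)


def parse_preference(text):
    """Return a list of (min_length, max_length, double_bonds) tuples.

    Assumption: 'A to B' is read as an inclusive range of chain lengths
    with the degree of unsaturation given by A.
    """
    if " to " in text:
        low, high = (parse_shorthand(part) for part in text.split(" to "))
        return [(low[0], high[0], low[1])]
    parsed = (parse_shorthand(part) for part in text.split(" and "))
    return [(length, length, bonds) for length, bonds in parsed]


def candidates(chain_length, double_bonds=0):
    """List (plant, enzyme) pairs whose stated preference covers the target."""
    hits = []
    for plant, enzyme, preference in TABLE_1:
        for low, high, bonds in parse_preference(preference):
            if bonds == double_bonds and low <= chain_length <= high:
                hits.append((plant, enzyme))
                break
    return hits


if __name__ == "__main__":
    print(candidates(8))      # thioesterases listed as preferring 8:0
    print(candidates(18, 1))  # thioesterases listed as preferring 18:1
```

The lookup only reflects the preferences stated in Table 1; it says nothing about relative activity or about how an enzyme behaves in a particular host.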
[0031]The enzymes listed in Table 1 are exemplary and many additional genes encoding acyl-ACP TEs can be isolated and used in this invention, including but not limited to genes such as those that encode the following acyl-ACP TEs (referred to by GenPept Accession Numbers):
[0032]CAA52069.1, CAA52070.1, CAA54060.1, CAA85387.1, CAA85388.1, CAB60830.1, CAC19933.1, CAC19934.1, CAC39106.1, CAC80370.1, CAC80371.1, CAD32683.1, CAL50570.1, CAN60643.1, CAN81819.1, CAO17726.1, CAO42218.1, CAO65585.1, CAO68322.1, AAA33019.1, AAA33020.1, AAB51523.1, AAB51524.1, AAB51525.1, AAB71729.1, AAB71730.1, AAB71731.1, AAB88824.1, AAC49001.1, AAC49002.1, AAC49179.1, AAC49180.1, AAC49269.1, AAC49783.1, AAC49784.1, AAC72881.1, AAC72882.1, AAC72883.1, AAD01982.1, AAD28187.1, AAD33870.1, AAD42220.2, AAG35064.1, AAG43857.1, AAG43858.1, AAG43859.1, AAG43860.1, AAG43861.1, AAL15645.1, AAL77443.1, AAL77445.1, AAL79361.1, AAM09524.1, AAN17328.1, AAQ08202.1, AAQ08223.1, AAX51636.1, AAX51637.1, ABB71579.1, ABB71581.1, ABC47311.1, ABD83939.1, ABE01139.1, ABH11710.1, ABI18986.1, ABI20759.1, ABI20760.1, ABL85052.1, ABU96744.1, EAY74210.1, EAY86874.1, EAY86877.1, EAY86884.1, EAY99617.1, EAZ01545.1, EAZ09668.1, EAZ12044.1, EAZ23982.1, EAZ37535.1, EAZ45287.1, NP_001047567.1, NP_001056776.1, NP_001057985.1, NP_001063601.1, NP_001068400.1, NP_172327.1, NP_189147.1, NP_193041.1, XP_001415703.1, Q39473, Q39513, Q41635, Q42712, Q9SQI3, NP_189147.1, AAC49002, CAA52070.1, CAA52069.1, 193041.1, CAC39106, CAO17726, AAC72883, AAA33020, AAL79361, AAQ08223.1, AAB51523, AAL77443, AAA33019, AAG35064, and AAL77445. Additional sources of acyl-ACP TEs that are useful in the present invention include: Arabidopsis thaliana (At); Bradyrhizobium japonicum (Bj); Brassica napus (Bn); Cinnamomum camphorum (Cc); Capsicum chinense (Cch); Cuphea hookeriana (Ch); Cuphea lanceolata (Cl); Cuphea palustris (Cp); Coriandrum sativum (Cs); Carthamus tinctorius (Ct); Cuphea wrightii (Cw); Elaeis guineensis (Eg); Gossypium hirsutum (Gh); Garcinia mangostana (Gm); Helianthus annuus (Ha); Iris germanica (Ig); Iris tectorum (It); Myristica fragrans (Mf); Triticum aestivum (Ta); Ulmus americana (Ua); and Umbellularia californica (Uc). Exemplary TEs are shown in FIG. 6 with corresponding NCBI accession numbers.
[0033]In one embodiment, the present invention contemplates the specific production of an individual length of medium-chain fatty acid, for example, predominantly producing C8 fatty acids in one culture of recombinant photosynthetic microorganisms. In another embodiment, the present invention contemplates the production of a combination of two or more different length fatty acids, for example, both C8 and C10 fatty acids in one culture of recombinant photosynthetic microorganisms.
[0034]Illustrated below are manipulations of these art-known genes to construct suitable expression systems that result in production of effective amounts of the thioesterases in selected recombinant photosynthetic organisms. In such constructions, it may be desirable to remove the portion of the gene that encodes the plastid transit peptide region, as this region is inappropriate in prokaryotes. Alternatively, if expression is to take place in eukaryotic cells, a plastid transit peptide-encoding region appropriate to the host organism may be substituted. Preferred codons may also be employed, depending on the host.
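The removal of a transit-peptide-encoding region can be pictured with a short sequence-handling sketch. This is a minimal Python illustration under stated assumptions: the function name `strip_transit_peptide` is hypothetical, the number of transit peptide codons must be supplied by the user (no such lengths are given here), and the example sequence is an arbitrary toy string, not a real thioesterase gene.

```python
def strip_transit_peptide(cds, n_transit_codons):
    """Remove the first n codons of a coding sequence and restore a start codon.

    cds: coding sequence as a DNA string, in frame from the start codon.
    n_transit_codons: number of leading codons (including the original ATG)
        assumed to encode the plastid transit peptide; a user-supplied,
        hypothetical value.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not 0 <= n_transit_codons < len(cds) // 3:
        raise ValueError("n_transit_codons must leave at least one codon")
    mature = cds[3 * n_transit_codons:]
    # A prokaryotic host still needs an initiation codon on the truncated gene.
    if not mature.startswith("ATG"):
        mature = "ATG" + mature
    return mature


if __name__ == "__main__":
    toy_cds = "atggcttctaaaggttaa"  # toy sequence: ATG GCT TCT AAA GGT TAA
    print(strip_transit_peptide(toy_cds, 3))
```

Codon preference for the chosen host is a separate step that this sketch does not attempt; it would require a host-specific codon usage table.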
[0035]Other Modifications
[0036]In addition to providing an expression system for one or more appropriate acyl-ACP TE genes, further alterations in the photosynthetic host may be made. For example, the host may be modified to include an expression system for a heterologous gene that encodes a β-ketoacyl synthase (KAS) that preferentially produces acyl-ACPs having medium chain lengths. Such KAS enzymes have been described from several plants, including various species of Cuphea (see Dehesh, K., et al., The Plant Journal (1998) 15:383-390, "KAS IV: a 3-ketoacyl-ACP synthase from Cuphea sp. is a medium chain specific condensing enzyme"; Slabaugh, M., et al., The Plant Journal (1998) 13:611-620), and would serve to increase the availability of acyl-ACP molecules of the proper length for recognition and cleavage by the heterologous medium-chain acyl-ACP TE. Another example is that the photosynthetic host cell containing a heterologous acyl-ACP TE gene may be further modified to include an expression system for a heterologous gene that encodes a multifunctional acetyl-CoA carboxylase or a set of heterologous genes that encode the various subunits of a multi-subunit type of acetyl-CoA carboxylase. Other heterologous genes that encode additional enzymes or components of the fatty acid biosynthesis pathway could also be introduced and expressed in acyl-ACP TE-containing host cells.
[0037]The photosynthetic microorganism may also be modified such that one or more genes that encode beta-oxidation pathway enzymes have been inactivated or downregulated, or the enzymes themselves may be inhibited. This would prevent the degradation of fatty acids released from acyl-ACPs, thus enhancing the yield of secreted fatty acids. In cases where the desired products are medium-chain fatty acids, the inactivation or downregulation of genes that encode acyl-CoA synthetase and/or acyl-CoA oxidase enzymes that preferentially use these chain lengths as substrates would be beneficial. Mutations in the genes encoding medium-chain-specific acyl-CoA synthetase and/or medium-chain-specific acyl-CoA oxidase enzymes such that the activity of the enzymes is diminished would also be effective in increasing the yield of secreted fatty acids. An additional modification inactivates or downregulates the acyl-ACP synthetase gene, or inhibits the encoded protein. Mutations in the genes can be introduced either by recombinant or non-recombinant methods. These enzymes and their genes are well known, and may be targeted specifically by disruption, deletion, generation of antisense sequences, generation of ribozymes or other recombinant approaches known to the practitioner. Inactivation of the genes can also be accomplished by random mutation techniques such as UV irradiation, with the resulting cells screened for successful mutants. The proteins themselves can be inhibited by intracellular generation of appropriate antibodies or intracellular generation of peptide inhibitors.
[0038]The photosynthetic microorganism may also be modified such that one or more genes that encode storage carbohydrate or polyhydroxyalkanoate (PHA) biosynthesis pathway enzymes have been inactivated or down-regulated, or the enzymes themselves may be inhibited. Examples include enzymes involved in glycogen, starch, or chrysolaminarin synthesis, including glucan synthases and branching enzymes. Other examples include enzymes involved in PHA biosynthesis such as acetoacetyl-CoA synthase and PHA synthase.
[0039]Expression Systems
[0040]Expression of heterologous genes in cyanobacteria and eukaryotic algae is enabled by the introduction of appropriate expression vectors. For transformation of cyanobacteria, a variety of promoters that function in cyanobacteria can be utilized, including, but not limited to the lac, tac, and trc promoters and derivatives that are inducible by the addition of isopropyl β-D-1-thiogalactopyranoside (IPTG), promoters that are naturally associated with transposon- or bacterial chromosome-borne antibiotic resistance genes (neomycin phosphotransferase, chloramphenicol acetyltransferase, spectinomycin adenyltransferase, etc.), promoters associated with various heterologous bacterial and native cyanobacterial genes, promoters from viruses and phages, and synthetic promoters. Promoters isolated from cyanobacteria that have been used successfully include the following:
[0041]secA (secretion; controlled by the redox state of the cell)
[0042]rbc (Rubisco operon)
[0043]psaAB (PS I reaction center proteins; light regulated)
[0044]psbA (D1 protein of PSII; light-inducible)
[0045]Likewise, a wide variety of transcriptional terminators can be used for expression vector construction. Examples of possible terminators include, but are not limited to, psbA, psaAB, rbc, secA, and T7 coat protein.
[0046]Expression vectors are introduced into the cyanobacterial strains by standard methods, including, but not limited to, natural DNA uptake, conjugation, electroporation, particle bombardment, and abrasion with glass beads, SiC fibers, or other particles. The vectors can be: 1) targeted for integration into the cyanobacterial chromosome by including flanking sequences that enable homologous recombination into the chromosome, 2) targeted for integration into endogenous cyanobacterial plasmids by including flanking sequences that enable homologous recombination into the endogenous plasmids, or 3) designed such that the expression vectors replicate within the chosen host.
[0047]For transformation of green algae, a variety of gene promoters and terminators that function in green algae can be utilized, including, but not limited to promoters and terminators from Chlamydomonas and other algae, promoters and terminators from viruses, and synthetic promoters and terminators.
[0048]Expression vectors are introduced into the green algal strains by standard methods, including, but not limited to, electroporation, particle bombardment, and abrasion with glass beads, SiC fibers, or other particles. The vectors can be 1) targeted for site-specific integration into the green algal chloroplast chromosome by including flanking sequences that enable homologous recombination into the chromosome, or 2) targeted for integration into the cellular (nucleus-localized) chromosome.
[0049]For transformation of diatoms, a variety of gene promoters that function in diatoms can be utilized in these expression vectors, including, but not limited to, promoters from Thalassiosira and other heterokont algae, promoters from viruses, and synthetic promoters. Promoters from Thalassiosira pseudonana that would be suitable for use in expression vectors include an alpha-tubulin promoter (SEQ ID NO:1), a beta-tubulin promoter (SEQ ID NO:2), and an actin promoter (SEQ ID NO:3). Promoters from Phaeodactylum tricornutum that would be suitable for use in expression vectors include an alpha-tubulin promoter (SEQ ID NO:4), a beta-tubulin promoter (SEQ ID NO:5), and an actin promoter (SEQ ID NO:6). These sequences are deduced from the genomic sequences of the relevant organisms available in public databases and are merely exemplary of the wide variety of promoters that can be used. The terminators associated with these and other genes, or particular heterologous genes, can be used to stop transcription and provide the appropriate signal for polyadenylation, and can be derived in a similar manner or are known in the art.
[0050]Expression vectors are introduced into the diatom strains by standard methods, including, but not limited to, electroporation, particle bombardment, and abrasion with glass beads, SiC fibers, or other particles. The vectors can be 1) targeted for site-specific integration into the diatom chloroplast chromosome by including flanking sequences that enable homologous recombination into the chromosome, or 2) targeted for integration into the cellular (nucleus-localized) chromosome.
[0051]Host Organisms
[0052]The host cells used to prepare the cultures of the invention include any photosynthetic organism which is able to convert inorganic carbon into a substrate that is in turn converted to fatty acid derivatives. These organisms include prokaryotes as well as eukaryotic organisms such as algae and diatoms.
[0053]Host organisms include eukaryotic algae and cyanobacteria (blue-green algae). Representative algae include green algae (chlorophytes), red algae, diatoms, prasinophytes, glaucophytes, chlorarachniophytes, euglenophytes, chromophytes, and dinoflagellates. A number of cyanobacterial species are known and have been manipulated using molecular biological techniques, including the unicellular cyanobacteria Synechocystis sp. PCC 6803 and Synechococcus elongatus PCC 7942, whose genomes have been completely sequenced.
[0054]The following genera of cyanobacteria may be used. One group includes:
Chamaesiphon, Chroococcus, Cyanobacterium, Cyanobium, Cyanothece, Dactylococcopsis, Gloeobacter, Gloeocapsa, Gloeothece, Microcystis, Prochlorococcus, Prochloron, Synechococcus, Synechocystis
[0055]Another group includes:
Cyanocystis, Dermocarpella, Stanieria, Xenococcus, Chroococcidiopsis, Myxosarcina, Pleurocapsa
[0056]Still another group includes:
Arthrospira, Borzia, Crinalium, Geitlerinema, Halospirulina, Leptolyngbya, Limnothrix, Lyngbya, Microcoleus, Oscillatoria, Planktothrix, Prochlorothrix, Pseudanabaena, Spirulina, Starria, Symploca, Trichodesmium, Tychonema
[0057]Still another group includes:
Anabaena, Anabaenopsis, Aphanizomenon, Calothrix, Cyanospira, Cylindrospermopsis, Cylindrospermum, Nodularia, Nostoc, Rivularia, Scytonema, Tolypothrix
[0058]And another group includes:
Chlorogloeopsis, Fischerella, Geitleria, Iyengariella, Nostochopsis, Stigonema
[0059]In addition, various algae, including diatoms and green algae can be employed.
[0060]Desirable qualities of the host strain include high potential growth rate and lipid productivity at 25-50° C., high light intensity tolerance, growth in brackish or saline water, i.e., in a wide range of water types, resistance to growth inhibition by high O2 concentrations, filamentous morphology to aid harvesting by screens, resistance to predation, ability to be flocculated (by chemicals or "on-demand autoflocculation"), excellent inorganic carbon uptake characteristics, virus or cyanophage resistance, tolerance to free fatty acids or other compounds associated with the invention method, and ability to undergo metabolic engineering.
[0061]Metabolic engineering is facilitated by the ability to take up DNA by electroporation or conjugation, lack of a restriction system, and efficient homologous recombination in the event that gene replacement or gene knockouts are required.
[0062]Fatty Acid Adsorption, Removal, and Recovery
[0063]The fatty acids secreted into the culture medium by the recombinant photosynthetic microorganisms described above can be recovered in a variety of ways. A straightforward isolation method by partition using immiscible solvents may be employed. In one embodiment, particulate adsorbents can be employed. These may be lipophilic particulates or ion exchange resins, depending on the design of the recovery method. They may be circulating in the separated medium and then collected, or the medium may be passed over a fixed bed column, for example, a chromatographic column containing these particulates. The fatty acids are then eluted from the particulate adsorbents by the use of an appropriate solvent. Evaporation of the solvent, followed by further processing of the isolated fatty acids and lipids can then be carried out to yield chemicals and fuels that can be used for a variety of commercial purposes.
[0064]The particulate adsorbents may have average diameters ranging from 0.5 mm to 30 mm and can be manufactured from various materials including, but not limited to, polyethylene and derivatives, polystyrene and derivatives, polyamide and derivatives, polyester and derivatives, polyurethane and derivatives, polyacrylates and derivatives, silicone and derivatives, and polysaccharide and derivatives. Certain glass and ceramic materials can also be used as the solid support component of the particulate adsorbents. The surfaces of the particulate adsorbents may be modified so that they are better able to bind fatty acids and lipids. An example of such modification is the introduction of ether-linked alkyl groups having various chain lengths, preferably 8-30 carbons. In another example, acyl chains of various lengths can be attached to the surface of the particulate adsorbents via ester, thioester, or amide linkages.
[0065]In one embodiment, the particulate adsorbents are coated with inorganic compounds known to bind fatty acids and lipids. Examples of such compounds include but are not limited to aluminum hydroxide, graphite, anthracite, and silica.
[0066]The particles used may also be magnetized or otherwise derivatized to facilitate recovery. For instance, the particles may be coupled to one member of a binding pair and then adsorbed to a substrate containing the relevant binding partner.
[0067]The fatty acids may be eluted from the particulate adsorbents by the use of an appropriate solvent such as hexane or ethanol. The particulate adsorbents may be reused by returning them to the culture medium or used in a regenerated column. The solvent containing the dissolved fatty acids is then evaporated, leaving the fatty acids in a purified state for further conversion to chemicals and fuels. The particulate adsorbents can be designed to be neutrally buoyant or positively buoyant to enhance circulation in the culture medium. A continuous cycle of fatty acid removal and recovery can be implemented by utilizing the steps outlined above. The recovered fatty acids may be converted to alternative organic compounds, used directly, or mixed with other components. Chemical methods for such conversions are well understood in the art, and developments of biological methods for such conversions are also contemplated.
[0068]The present invention further contemplates a variety of compositions comprising the fatty acids produced by the recombinant photosynthetic microorganisms described herein, and uses thereof. The composition may comprise the fatty acids themselves, or further derivatives of the fatty acids, such as alcohols, alkanes, and alkenes, which can be generated from the fatty acids produced by the microorganisms by any methods that are known in the art, as well as by development of biological methods of conversion. For example, fatty acids may be converted to alkenes by catalytic hydrogenation and catalytic dehydration.
[0069]The composition may serve, for example, as a biocrude. The biocrude can be processed through refineries that will convert the composition compounds to various petroleum and petrochemical replacements, including alkanes, olefins and aromatics through processes including hydrotreatment, decarboxylation, isomerization and catalytic cracking and reforming. The biocrude can be also converted to ester-based fuels, such as fatty acid methyl ester (commercially known as biodiesel), through established chemical processes including transesterification and esterification.
[0070]In addition, one of skill in the art could contemplate a variety of other uses for the fatty acids of the present invention, and derivatives thereof, that are well known in the art, for example, the production of chemicals, soaps, surfactants, detergents, lubricants, nutraceuticals, pharmaceuticals, cosmetics, etc. For example, derivatives of the fatty acids of the present invention include C8 chemicals, such as octanol, used in the manufacture of esters for cosmetics and flavors as well as for various medical applications, and octene, used primarily as a co-monomer in production of polyethylene. Derivatives of the fatty acids of the present invention may also include C10 chemicals, such as decanol, used in the manufacture of plasticizers, surfactants and solvents, and decene, used in the manufacture of lubricants.
[0071]Biocrudes are biologically produced compounds or a mix of different biologically produced compounds that are used as a feedstock for refineries in replacement of, or in complement to, crude oil or other forms of petroleum. In general, but not necessarily, these feedstocks have been pre-processed through biological, chemical, mechanical or thermal processes in order to be in a liquid state that is adequate for introduction in a petroleum refinery.
[0072]The fatty acids of the present invention can be a biocrude, and further processed to a biofuel composition. The biofuel can then perform as a finished fuel or a fuel additive.
[0073]"Finished fuel" is defined as a chemical compound or a mix of chemical compounds (produced through chemical, thermochemical or biological routes) that is in an adequate chemical and physical state to be used directly as a neat fuel or fuel additive in an engine. In many cases, but not always, the suitability of a finished fuel for use in an engine application is determined by a specification which describes the necessary physical and chemical properties that need to be met. Some examples of engines are: internal combustion engine, gas turbine, steam turbine, external combustion engine, and steam boiler. Some examples of finished fuels include: diesel fuel to be used in a compression-ignited (diesel) internal combustion engine, jet fuel to be used in an aviation turbine, fuel oil to be used in a boiler to generate steam or in an external combustion engine, and ethanol to be used in a flex-fuel engine. Examples of fuel specifications are the ASTM standards, mainly used in the US, and the EN standards, mainly used in Europe.
[0074]"Fuel additive" refers to a compound or composition that is used in combination with another fuel for a variety of reasons, which include but are not limited to complying with mandates on the use of biofuels, reducing the consumption of fossil fuel-derived products or enhancing the performance of a fuel or engine. For example, fuel additives can be used to alter the freezing/gelling point, cloud point, lubricity, viscosity, oxidative stability, ignition quality, octane level, and flash point. Additives can further function as antioxidants, demulsifiers, oxygenates, thermal stability improvers, cetane improvers, stabilizers, cold flow improvers, combustion improvers, anti-foams, anti-haze additives, icing inhibitors, injector cleanliness additives, smoke suppressants, drag reducing additives, metal deactivators, dispersants, detergents, dyes, markers, static dissipaters, biocides, and/or corrosion inhibitors.
[0075]The following examples are offered to illustrate but not to limit the invention.
Example 1
Secretion of Fatty Acids by Strains Derived from the Unicellular Photoautotrophic Cyanobacterium Synechococcus elongatus PCC 7942
[0076]The Cuphea hookeriana FatB2 gene encoding an acyl-ACP thioesterase (ChFatB2) enzyme was modified for optimized expression in Synechococcus elongatus PCC 7942. First, the portion of the gene that encodes the plastid transit peptide region of the native ChFatB2 protein was removed. The remainder of the coding region was then codon-optimized using the "Gene Designer" software program (version 1.1.4.1) provided by DNA2.0, Inc. The nucleotide sequence of this derivative of the ChFatB2 gene (hereafter ChFatB2-7942) is provided as SEQ ID NO:7. The protein sequence encoded by this gene is provided in SEQ ID NO:8.
[0077]Two different versions of the trc promoter, trc (Amann, E., et al., Gene (1983) 25:167-178) and "enhanced trc" (hereafter trcE, from pTrcHis A, Invitrogen) were used to drive the expression of ChFatB2-7942 in S. elongatus PCC 7942. The trc promoter is repressed by the Lac repressor protein encoded by the lacIq gene and can be induced by the addition of isopropyl β-D-1-thiogalactopyranoside (IPTG). The trcE promoter is a derivative of trc designed to facilitate expression of eukaryotic proteins in E. coli and is also inducible by IPTG.
[0078]The fusion fragments of ChFatB2-7942 operably linked to trc or trcE, together with the lacIq gene, were cloned into the shuttle vector pAM2314 (Mackey, S. R., et al., Methods Mol. Biol. (2007) 362:115-129), which enables transformation of S. elongatus PCC 7942 via double homologous recombination-mediated integration into the "NS1" site of the chromosome. The constructed plasmid containing the trcE::ChFatB2-7942 expression cassette and lacIq gene is designated pSGI-YC01. SEQ ID NO:9 represents the sequence between and including the NS1 recombination sites of pSGI-YC01. The constructed plasmid containing the trc::ChFatB2-7942 expression cassette and lacIq gene is designated pSGI-YC09. SEQ ID NO: 10 represents the sequence between and including the NS1 recombination sites of pSGI-YC09.
[0079]Each of the plasmids pSGI-YC01 and pSGI-YC09, as well as the control vector pAM2314, was introduced into wild-type S. elongatus PCC 7942 cells as described by Golden and Sherman (J. Bacteriol. (1984) 158:36-42). Both recombinant and control strains were pre-cultivated in 100 mL of BG-11 medium supplied with spectinomycin (5 mg/L) to late-log phase (OD730 nm=1.0) on a rotary shaker (150 rpm) at 30° C. with constant illumination (60 μE m⁻² sec⁻¹). Cultures were then subcultured at initial OD730 nm=0.4-0.5 in BG-11 and cultivated overnight to OD730 nm=0.7-0.9. For the time-course study, 60-mL aliquots of the culture were transferred into 250-mL flasks and induced by adding IPTG (final conc.=1 mM) if applicable. Cultures were sampled 0, 48, 96, and 168 hours after IPTG induction and then filtered through Whatman® GF/F filters using a Millipore vacuum filter manifold. Filtrates were collected in screw top culture tubes for gas chromatographic (GC) analysis.
[0080]Free fatty acids (FFAs) were separated from filtered cell cultures using liquid-liquid extraction. Five mL of the filtrate were mixed with 125 μL of 1 M H3PO4 and 0.25 mL of 5 M NaCl, followed by addition of 2 mL of hexane and thorough mixing. For GC-FID analysis, a 0.2 μL sample of the hexane phase was injected using a 40:1 split ratio onto a DB-FFAP column (J&W Scientific, 15 m×250 μm×0.25 μm), with a temperature profile starting at 150° C. for 0.5 min, then heating at 15° C./min to 230° C. and holding for 7.1 min (1.1 mL/min He).
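As a quick check on the oven program described above, the total GC run time follows from the initial hold, the single ramp, and the final hold. The sketch below is illustrative only; the function name and its parameters are hypothetical and are not part of the described method.

```python
def gc_run_time_min(start_c, start_hold_min, ramp_c_per_min, final_c, final_hold_min):
    """Total run time (minutes) for a single-ramp GC oven program."""
    # Time spent ramping from the starting to the final temperature.
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return start_hold_min + ramp_min + final_hold_min


# Values taken from the temperature profile in the text:
# 150 C for 0.5 min, 15 C/min to 230 C, hold 7.1 min.
total_min = gc_run_time_min(150.0, 0.5, 15.0, 230.0, 7.1)
print(f"Total oven program time: {total_min:.2f} min")
```

Under these values the ramp takes a little over 5 minutes, so each injection occupies the oven for roughly 13 minutes before cool-down.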
[0081]GC analysis results indicating the levels of medium-chain FFAs (8:0 and 10:0) in cultures containing various Synechococcus elongatus strains 168 hours after IPTG induction are shown in Table 1-1.
TABLE 1-1 Medium-chain fatty acid secretion in various strains of S. elongatus
Strain      Parent Strain  Plasmid Added  Transgenes          8:0 (mg/L)  10:0 (mg/L)
SGC-YC2-5   PCC 7942       pAM2314        none                ND          ND
SGC-YC1-2   PCC 7942       pSGI-YC01      trcE::ChFatB2-7942  1.5         3.5
SGC-YC14-4  PCC 7942       pSGI-YC09      trc::ChFatB2-7942   5.1         10.1
Note: ND represents "not detected" (<1 mg/L).
Example 2
Secretion of Fatty Acids by Strains Derived from the Unicellular Photoheterotrophic Cyanobacterium Synechocystis sp. PCC 6803
[0082]The trcE::ChFatB2-7942 and trc::ChFatB2-7942 fusion fragments, together with the lacIq gene, were cloned into the shuttle vector pSGI-YC03 (SEQ ID NO:11), which enables transformation of Synechocystis sp. PCC 6803 via double homologous recombination-mediated integration into the "RS1" site of the chromosome (Williams, Methods Enzymol. (1988) 167:766-778). The constructed plasmid containing the trcE::ChFatB2-7942 expression cassette and lacIq gene is designated pSGI-YC08. SEQ ID NO:12 represents the sequence between and including the RS1 recombination sites of pSGI-YC08. The constructed plasmid containing the trc::ChFatB2-7942 expression cassette and lacIq gene is designated pSGI-YC14. SEQ ID NO:13 represents the sequence between and including the RS1 recombination sites of pSGI-YC14.
[0083]Each of the plasmids pSGI-YC08 and pSGI-YC14, as well as the control vector pSGI-YC03, was introduced into wild-type Synechocystis PCC 6803 cells, as described by Zang, X. et al., J. Microbiol. (2007) 45:241-245. Both recombinant and control strains were pre-cultivated in 100 mL of BG-11 medium supplied with kanamycin (10 mg/L) to late-log phase (OD730 nm=1.0) on a rotary shaker (150 rpm) at 30° C. with constant illumination (60 μE m⁻² sec⁻¹). Cultures were then subcultured at initial OD730 nm=0.4-0.5 in BG-11 and cultivated overnight to OD730 nm=0.7-0.9. For time-course studies, 60-mL aliquots of the culture were transferred into 250-mL flasks and induced by adding IPTG (final conc.=1 mM) when applicable. Cultures were sampled 0, 72, and 144 hours after IPTG induction and then filtered through Whatman® GF/B filters using a Millipore vacuum filter manifold. Filtrates were collected in screw top culture tubes for gas chromatographic (GC) analysis. Free fatty acids (FFAs) were separated from the filtered culture supernatant solutions by liquid-liquid extraction. For each sample, 2 mL of filtered culture was extracted with a mixture of 50 μL phosphoric acid (1 M), 100 μL NaCl (5 M) and 2 mL hexane. A 0.2 μL sample was injected using a 40:1 split ratio onto a DB-FFAP column (J&W Scientific, 15 m×250 μm×0.25 μm), with a temperature profile starting at 150° C. for 0.5 min, then heating at 15° C./min to 230° C. and holding for 7.1 min (1.1 mL/min He).
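The extraction recipes in Examples 1 and 2 use different sample volumes (5 mL versus 2 mL of filtrate), so it can help to express each per mL of filtrate. The sketch below is an illustration only: the function names are hypothetical, and the back-conversion assumes complete transfer of the fatty acids into the hexane phase, which the text does not state.

```python
def per_ml_ratios(filtrate_ml, acid_ul, nacl_ul, hexane_ml):
    """Express an extraction recipe per mL of filtered culture."""
    return {
        "acid_ul_per_ml": acid_ul / filtrate_ml,
        "nacl_ul_per_ml": nacl_ul / filtrate_ml,
        "hexane_ml_per_ml": hexane_ml / filtrate_ml,
    }


# Volumes as given in Example 1 and Example 2, respectively.
example_1 = per_ml_ratios(filtrate_ml=5.0, acid_ul=125.0, nacl_ul=250.0, hexane_ml=2.0)
example_2 = per_ml_ratios(filtrate_ml=2.0, acid_ul=50.0, nacl_ul=100.0, hexane_ml=2.0)


def culture_mg_per_l(hexane_mg_per_l, ratios):
    """Convert a concentration measured in hexane back to the filtrate.

    ASSUMPTION: all of the fatty acid moves into the hexane phase.
    """
    return hexane_mg_per_l * ratios["hexane_ml_per_ml"]


print(example_1)
print(example_2)
```

The acid and salt additions come out the same per mL in both recipes, while the hexane-to-filtrate ratio differs, so a given reading in the hexane phase corresponds to a different culture concentration in each example.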
[0084]GC analysis results indicating the levels of medium-chain FFAs (8:0 and 10:0) in cultures 144 hours after IPTG induction are shown in Table 2-1.
TABLE 2-1 Medium-chain fatty acid secretion in various strains of Synechocystis
Strain      Parent Strain  Plasmid Added  Transgenes          8:0 (mg/L)  10:0 (mg/L)
SGC-YC9-8   PCC 6803       pSGI-YC03      none                ND          ND
SGC-YC10-5  PCC 6803       pSGI-YC08      trcE::ChFatB2-7942  61.3        52.7
SGC-YC16-2  PCC 6803       pSGI-YC14      trc::ChFatB2-7942   2.7         5.8
Note: ND represents "not detected" (<1 mg/L).
Example 3
Secretion of Fatty Acids by Strains Derived from the Filamentous Cyanobacterium Anabaena variabilis ATCC 29413
[0085]The trc::ChFatB2-7942 and trcE::ChFatB2-7942 fusion fragments, together with the lacIq gene, were PCR amplified using primers RS3-3F (SEQ ID NO:14) and 4YC-rrnBter-3 (SEQ ID NO:15) from pSGI-YC14 and pSGI-YC08, respectively, and then cloned into the shuttle vector pEL17, which enables transformation of A. variabilis ATCC 29413 via double homologous recombination-mediated integration into the nifU1 locus of the chromosome (Lyons and Thiel, J. Bacteriol. (1995) 177:1570-1575). The constructed plasmids are designated pSGI-YC69 and pSGI-YC70 for trc::ChFatB2-7942 and trcE::ChFatB2-7942, respectively.
[0086]Each of the plasmids pSGI-YC69 and pSGI-YC70, as well as the control vector pEL17, is introduced into wild-type A. variabilis ATCC 29413 cells via tri-parental conjugation, as described by Elhai and Wolk (Methods Enzymol. (1988) 167:747-754). Both recombinant and control strains are pre-cultivated in 100 mL of BG-11 medium supplied with 5 mM NH4Cl and spectinomycin (3 mg/L) to late-log phase (OD730 nm=1.0) on a rotary shaker (150 rpm) at 30° C. with constant illumination (60 μE m⁻² sec⁻¹). Cultures are then subcultured at initial OD730 nm=0.4-0.5 in BG-11 and cultivated overnight to OD730 nm=0.7-0.9. For time-course studies, 60-mL aliquots of the culture are transferred into 250-mL flasks and induced by adding IPTG (final conc.=1 mM) if applicable. Cultures are sampled every 72 hours and then filtered through Whatman® GF/F filters using a Millipore vacuum filter manifold. Filtrates are collected in screw top culture tubes for gas chromatographic (GC) analysis as described in Example 1.
Example 4
Secretion of Fatty Acids in Strains Derived from Synechococcus elongatus PCC 7942 Containing an Inactivated Acyl-ACP Synthetase Gene
[0087]A putative acyl-ACP synthetase gene in S. elongatus PCC 7942, synpcc7942_0918 (Cyanobase gene designation), was disrupted by replacing an internal 422-bp portion of its coding region with a 1,741-bp DNA sequence carrying the chloramphenicol resistance marker gene, cat (which encodes chloramphenicol acetyltransferase). Primer pairs 918-15 (SEQ ID NO:16)/918-13 (SEQ ID NO:17) and 918-25 (SEQ ID NO:18)/918-23 (SEQ ID NO:19) were used to amplify two DNA fragments corresponding to a 5' portion (1-480 bp) and a 3' portion (903-1521 bp) of the coding region of synpcc7942_0918, respectively. The cat fragment was amplified from plasmid pAM1573 (Mackey et al., Methods Mol. Biol. 362:115-29) using PCR with primers NS21-3 Cm (SEQ ID NO:20) and ter-3 Cm (SEQ ID NO:21), which overlap primers 918-13 and 918-25, respectively. The recombinant chimeric PCR technique was then used to amplify the complete disruption cassette from the three aforementioned PCR fragments using primers 918-15 and 918-23. The resulting 2,840-bp blunt-end PCR fragment (SEQ ID NO:22) was then ligated into pUC19 (Yanisch-Perron et al., Gene 33:103-119), which had been digested with both HindIII and EcoRI to remove the multiple cloning sites and subsequently blunted with T4 DNA polymerase, to yield plasmid pSGI-YC04.
[0088]Plasmid pSGI-YC04 was introduced into S. elongatus strain SGC-YC1-2, which harbors a copy of trcE::ChFatB2-7942 integrated into NS1 (see Example 1). The resulting strain was designated SGC-YC4-7. Fatty acid production assays and GC analyses were performed as described in Example 1. The results of GC analyses indicating the levels of FFAs in cultures of various S. elongatus strains 168 hours after IPTG induction are shown in Table 4-1. It is possible that inactivation of the acyl-ACP synthetase gene has a larger impact on secretion of long-chain fatty acids than on secretion of medium-chain fatty acids.
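The fragment sizes quoted above can be cross-checked with simple coordinate arithmetic. The sketch below is illustrative only; it assumes the coordinates are 1-based and inclusive and that the three fragments join end to end with no net gain or loss of bases from the overlapping primer regions. The variable and function names are hypothetical.

```python
# Coordinates within the coding region, 1-based and inclusive, as given in the text.
FIVE_PRIME = (1, 480)
THREE_PRIME = (903, 1521)
CAT_FRAGMENT_BP = 1741  # length of the inserted cat marker sequence


def span_len(span):
    """Length in bp of an inclusive (start, end) coordinate pair."""
    start, end = span
    return end - start + 1


# Bases of the coding region lying between the two retained portions.
deleted_bp = THREE_PRIME[0] - FIVE_PRIME[1] - 1

# Expected size of the assembled disruption cassette (ASSUMES end-to-end joining).
cassette_bp = span_len(FIVE_PRIME) + CAT_FRAGMENT_BP + span_len(THREE_PRIME)

print(f"Deleted internal portion: {deleted_bp} bp")
print(f"Disruption cassette: {cassette_bp} bp")
```

Under these assumptions the computed values agree with the 422-bp deletion and the 2,840-bp PCR fragment stated in the paragraph.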
TABLE 4-1 Medium-chain fatty acid secretion in various strains of S. elongatus
Strain     Parent Strain  Plasmid Added  Transgenes          Deletions        8:0 (mg/L)  10:0 (mg/L)  16:0 (mg/L)  16:1 (mg/L)
SGC-YC2-5  PCC 7942       pAM2314        none                none             ND          ND           ND           1.4
SGC-YC1-2  PCC 7942       pSGI-YC01      trcE::ChFatB2-7942  none             1.4         4.2          ND           1.6
SGC-YC4-7  SGC-YC1-2      pSGI-YC04      trcE::ChFatB2-7942  synpcc7942_0918  1.0         3.1          1.1          3.9
Note: ND represents "not detected" (<1 mg/L).
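The suggestion that the acyl-ACP synthetase knockout affects long-chain more than medium-chain fatty acid secretion can be illustrated by comparing the two ChFatB2-containing strains in Table 4-1. The sketch below is a rough illustration only: the data structures and function name are hypothetical, "ND" values are left without a number rather than assigned one, and single measurements like these do not establish significance.

```python
ND = None  # "not detected" (<1 mg/L); no numeric value is assumed for it

# Values in mg/L copied from Table 4-1.
parent = {"8:0": 1.4, "10:0": 4.2, "16:0": ND, "16:1": 1.6}     # SGC-YC1-2
knockout = {"8:0": 1.0, "10:0": 3.1, "16:0": 1.1, "16:1": 3.9}  # SGC-YC4-7


def fold_change(before, after):
    """Ratio after/before, or None when either value was not detected."""
    if before is ND or after is ND:
        return None
    return after / before


changes = {fa: fold_change(parent[fa], knockout[fa]) for fa in parent}

for fatty_acid, change in changes.items():
    label = "n/a (ND in one strain)" if change is None else f"{change:.2f}x"
    print(f"{fatty_acid}: {label}")
```

In this comparison the 8:0 and 10:0 levels are somewhat lower in the knockout strain, while 16:1 more than doubles and 16:0 goes from not detected to detectable.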
Example 5
Secretion of Fatty Acids in Strains Derived from Synechocystis sp. PCC6803 Containing an Inactivated Acyl-ACP Synthetase Gene
[0089]A ~1.7-kbp DNA fragment spanning an area upstream and into the coding region of the acyl-ACP synthetase-encoding gene, slr1609 (Cyanobase gene designation), from Synechocystis sp. PCC 6803 was amplified from genomic DNA using PCR with primers NB001 (SEQ ID NO:23) and NB002 (SEQ ID NO:24). This fragment was cloned into the pCR2.1 vector (Invitrogen) to yield plasmid pSGI-NB3 and subsequently cut with the restriction enzyme MfeI. A chloramphenicol resistance marker cassette containing the cat gene and associated regulatory control sequences was amplified from plasmid pAM1573 (Andersson, et al., Methods Enzymol. (2000) 305:527-542) to contain flanking MfeI restriction sites using PCR with primers NB010 (SEQ ID NO:25) and NB011 (SEQ ID NO:26). The cat gene expression cassette was then inserted into the MfeI site of pSGI-NB3 to yield pSGI-NB5 (SEQ ID NO:27).
[0090]The pSGI-NB5 vector was transformed into the trcE::ChFatB2-7942-containing Synechocystis strain SGC-YC10-5 (see Example 2) according to Zang et al., J. Microbiol. (2007) 45:241-245. Insertion of the chloramphenicol resistance marker into the slr1609 gene through homologous recombination was verified by PCR screening of the insert and insertion site. The resulting strain was designated SGC-NB10-4 and was tested in liquid BG-11 medium for fatty acid secretion. All liquid medium growth conditions used a rotary shaker (150 rpm) at 30° C. with constant illumination (60 μE m⁻² sec⁻¹). Cultures were inoculated in 25 mL of BG-11 medium containing chloramphenicol and/or kanamycin (5 μg/mL) as appropriate and grown to a sufficient density (minimal OD730 nm=1.6-2). Cultures were then used to inoculate 100 mL of BG-11 medium in 250-mL polycarbonate flasks to OD730 nm=0.4-0.5 and incubated overnight. Aliquots (45 mL) of overnight culture at OD730 nm=0.7-0.9 were added to new 250-mL flasks and either induced with 1 mM IPTG or left uninduced as controls. Samples (5 mL) were taken at 0, 72 and 144 hours post induction and processed as described in Example 2.
[0091]Free fatty acids (FFAs) were separated from the filtered culture supernatant solutions by liquid-liquid extraction for GC/FID (flame ionization detector) analysis. For each sample, 2 mL of filtered culture was extracted with a mixture of 50 μL phosphoric acid (1 M), 100 μL NaCl (5 M) and 2 mL hexane. A 0.2 μL sample was injected using a 40:1 split ratio onto a DB-FFAP column (J&W Scientific, 15 m×250 μm×0.25 μm), with a temperature profile starting at 150° C. for 0.5 min, then heating at 15° C./min to 230° C. and holding for 7.1 min (1.1 mL/min He).
[0092]GC results indicating secreted levels of free fatty acids after 144 hours are shown in Table 5-1.
TABLE 5-1 Medium-chain fatty acid secretion in various strains of Synechocystis
Strain      Parent Strain  Plasmid Added  Transgenes          Deletions  8:0 (mg/L)  10:0 (mg/L)
SGC-YC10-5  PCC 6803       pSGI-YC08      trcE::ChFatB2-7942  none       58.3        67.7
SGC-NB10-4  SGC-YC10-5     pSGI-NB5       trcE::ChFatB2-7942  slr1609    57.7        73.7
Note: ND represents "not detected" (<1 mg/L).
Example 6
Expression of Cuphea lanceolata Kas-IV and Helianthus annuus Kas-III genes in Synechocystis sp.
[0093]A DNA fragment comprising a functional operon was synthesized such that it contained the following elements in the given order: the trc promoter, the Cuphea lanceolata 3-ketoacyl-acyl carrier protein synthase IV gene (ClKas-IV, GenBank Accession No. CAC59946) codon-optimized for expression in Synechococcus elongatus PCC 7942, and the rps14 terminator (SEQ ID NO:28) from Synechococcus sp. WH8102. The nucleotide sequence of this entire functional operon, along with various flanking restriction enzyme recognition sites, is provided in SEQ ID NO:29.
[0094]Another DNA fragment comprising a functional operon was synthesized such that it contained the following elements in the given order: the trc promoter, the Helianthus annuus 3-ketoacyl-acyl carrier protein synthase III gene (HaKas-III, GenBank Accession No. ABP93352) codon-optimized for expression in both Synechococcus elongatus PCC 7942 and Synechocystis sp. PCC 6803, and the rps14 terminator from Synechococcus sp. WH8102. The nucleotide sequence of this functional operon, along with various flanking restriction enzyme recognition sites, is provided in SEQ ID NO:30.
[0095]Codon optimization was performed by the use of the "Gene Designer" (version 1.1.4.1) software program provided by DNA2.0, Inc. The functional operon (expression cassette) containing the codon-modified ClKas-IV gene as represented in SEQ ID NO:29 was digested by the restriction enzymes SpeI and XbaI and inserted into plasmid pSGI-YC39 between the restriction sites SpeI and XbaI to form plasmid pSGI-BL26, which enables integration of the functional operon into the Synechocystis sp. PCC 6803 chromosome at the "RS2" recombination site (Aoki, et al., J. Bacteriol (1995) 177:5606-5611). The plasmid pSGI-BL27 containing the DNA fragment represented in SEQ ID NO:30 was constructed in the same way.
[0096]Plasmid pSGI-BL43 contains the trcE promoter, the codon-optimized ClKas-IV gene, and the rps14 terminator as represented in SEQ ID NO:31 and was made by inserting a SpeI/NcoI trcE fragment from pTrcHis A (Invitrogen) into SpeI/NcoI-digested pSGI-BL26. An additional plasmid, pSGI-BL44, contains the trcE promoter, the optimized ClKas-IV gene, the S. elongatus PCC 7942 kaiBC intergenic region, the optimized HaKas-III gene, and the rps14 terminator as represented in SEQ ID NO:32 and was made by inserting a BamHI/SacI fragment (containing the S. elongatus kaiBC intergenic region, the HaKas-III gene, and the rps14 terminator) generated via PCR amplification into BglII/SacI-digested pSGI-BL43. The PCR primers used to generate the DNA fragment containing the kaiBC region, HaKas-III, and rps14 terminator are provided as SEQ ID NO:33 and SEQ ID NO:34.
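Inserting a BamHI/SacI fragment into a BglII/SacI-digested vector works because BamHI and BglII leave identical cohesive ends. The sketch below illustrates this using the standard recognition sites (BamHI G^GATCC, BglII A^GATCT, SacI GAGCT^C); the helper names are hypothetical, and the junction example assumes, for illustration only, that the BglII half-site lies on the vector side.

```python
# Recognition sequence (5'->3', top strand) and top-strand cut index.
# For example, BamHI cuts G^GATCC, so its cut index is 1.
ENZYMES = {
    "BamHI": ("GGATCC", 1),
    "BglII": ("AGATCT", 1),
    "SacI": ("GAGCTC", 5),
}


def overhang(name):
    """Return (overhang sequence, overhang type) for a palindromic site."""
    site, cut = ENZYMES[name]
    mirror = len(site) - cut  # cut on the opposite strand, in top-strand coordinates
    low, high = sorted((cut, mirror))
    kind = "5'" if cut < mirror else "3'"
    return site[low:high], kind


def compatible(enzyme_a, enzyme_b):
    """True when both enzymes leave the same overhang sequence and polarity."""
    return overhang(enzyme_a) == overhang(enzyme_b)


def hybrid_junction(vector_enzyme, insert_enzyme):
    """Sequence formed when a vector half-site is ligated to an insert half-site."""
    vector_site, vector_cut = ENZYMES[vector_enzyme]
    insert_site, insert_cut = ENZYMES[insert_enzyme]
    return vector_site[:vector_cut] + insert_site[insert_cut:]


junction = hybrid_junction("BglII", "BamHI")
print(overhang("BamHI"), overhang("BglII"), overhang("SacI"))
print(compatible("BamHI", "BglII"), junction)
```

Because the hybrid junction matches neither original recognition sequence, the ligated BamHI/BglII joint is not recut by either enzyme, whereas the SacI end regenerates a SacI site.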
[0097]Wild-type Synechocystis PCC 6803 cells and transgenic Synechocystis strain SGC-YC10-5, which contains the ChFatB2-7942 gene, were transformed with plasmids pSGI-BL26, pSGI-BL27, pSGI-BL43 and pSGI-BL44 as described by Zang, X. et al., J. Microbiol. (2007) 45:241-245. Both recombinant and wild-type control strains were pre-cultivated in 20 mL of BG-11 medium to mid-log phase (OD730 nm=0.7-0.9) on a rotary shaker (150 rpm) at 30° C. with constant illumination (60 μE m⁻² sec⁻¹). Kanamycin (5 μg/mL) and/or spectinomycin (10 μg/mL) were included in recombinant cultures as appropriate. Cultures were then subcultured at initial OD730 nm=0.4-0.5 in BG-11 and cultivated overnight to OD730 nm=0.7-0.9. For a time-course study, 45-mL aliquots of the culture were transferred into 250-mL flasks and induced by adding IPTG (final conc.=1 mM) when applicable. Cultures were sampled 0, 72, and 144 hours after IPTG induction and then filtered through Whatman® GF/B filters using a Millipore vacuum filter manifold. Filtrates were collected in screw top culture tubes for gas chromatographic (GC) analysis as described in Example 2.
[0098]Results indicating the levels of secreted octanoic acid and decanoic acid in culture supernatants 144 hours after culture inoculation are shown in Table 6-1. The ClKas-IV and HaKas-III genes present in the indicated strains were under the control of the trc promoter.
TABLE 6-1 Medium-chain fatty acid secretion (in mg/L) in various Synechocystis sp. strains
Strain      Parent Strain  Plasmid Added  Transgenes                        8:0 (mg/L)  10:0 (mg/L)
PCC 6803    n/a            n/a            None                              ND          ND
SGC-YC10-5  PCC 6803       pSGI-YC08      trcE-ChFatB2-7942                 69.8        68.4
SGC-BL26-3  PCC 6803       pSGI-BL26      trc-ClKas-IV                      ND          ND
SGC-BL26-5  SGC-YC10-5     pSGI-BL26      trcE-ChFatB2-7942, trc-ClKas-IV   69.5        71.9
SGC-BL27-1  PCC 6803       pSGI-BL27      trc-HaKas-III                     ND          ND
SGC-BL27-2  SGC-YC10-5     pSGI-BL27      trcE-ChFatB2-7942, trc-HaKas-III  65.7        66.6
Note: ND represents "not detected" (<1 mg/L).
[0099]To allow a more meaningful comparison of fatty acid secretion among these strains, the fatty acid secretion data shown in Table 6-1 were normalized to cell culture density, measured as optical density at 730 nm (OD730 nm); these data are presented in Table 6-2. Other experiments described in this application could be normalized in a similar fashion.
TABLE 6-2 Normalized medium-chain fatty acid secretion (mg/L/OD730 nm) in various Synechocystis sp. strains
Strain      Parent Strain  Plasmid Added  Transgenes                        8:0   10:0
PCC 6803    n/a            n/a            None                              ND    ND
SGC-YC10-5  PCC 6803       pSGI-YC08      trcE-ChFatB2-7942                 11.7  11.4
SGC-BL26-3  PCC 6803       pSGI-BL26      trc-ClKas-IV                      ND    ND
SGC-BL26-5  SGC-YC10-5     pSGI-BL26      trcE-ChFatB2-7942, trc-ClKas-IV   11.7  12.1
SGC-BL27-1  PCC 6803       pSGI-BL27      trc-HaKas-III                     ND    ND
SGC-BL27-2  SGC-YC10-5     pSGI-BL27      trcE-ChFatB2-7942, trc-HaKas-III  12.2  12.3
Note: ND represents "not detected" (<1 mg/L).
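The normalization relating Tables 6-1 and 6-2 is a division of each secreted concentration by the culture's OD730 nm. The culture densities themselves are not reported, so the sketch below back-calculates the density implied by each pair of table entries purely as a consistency check; these implied values are derived estimates, not measured data, and the function names are hypothetical.

```python
def normalize(mg_per_l, od730):
    """Secreted fatty acid per unit culture density (mg/L/OD730 nm)."""
    return mg_per_l / od730


def implied_od(mg_per_l, normalized):
    """Culture density implied by a raw value and its normalized value."""
    return mg_per_l / normalized


# (8:0, 10:0) values copied from Table 6-1 (mg/L) and Table 6-2 (mg/L/OD730 nm).
raw = {
    "SGC-YC10-5": (69.8, 68.4),
    "SGC-BL26-5": (69.5, 71.9),
    "SGC-BL27-2": (65.7, 66.6),
}
normalized = {
    "SGC-YC10-5": (11.7, 11.4),
    "SGC-BL26-5": (11.7, 12.1),
    "SGC-BL27-2": (12.2, 12.3),
}

implied = {
    strain: tuple(implied_od(r, n) for r, n in zip(raw[strain], normalized[strain]))
    for strain in raw
}

for strain, (od_from_c8, od_from_c10) in implied.items():
    print(f"{strain}: implied OD730 from 8:0 = {od_from_c8:.2f}, from 10:0 = {od_from_c10:.2f}")
```

For each strain the two fatty acids imply nearly the same culture density, as expected if both columns were divided by a single OD730 nm reading per culture; small differences are attributable to rounding in the tables.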
[0100]Results indicating the levels of secreted octanoic acid and decanoic acid in culture supernatants of additional strains 120 hours after culture inoculation are shown in Table 6-3. The ClKas-IV and HaKas-III genes present in the indicated strains were under the control of the trcE promoter.
TABLE-US-00013 TABLE 6-3 Medium-chain fatty acid secretion (in mg/L) in various Synechocystis sp. strains Fatty Acids Plasmid (mg/L) Strain Parent Strain Added Transgenes 8:0 10:0 SGC-YC10-5 PCC 6803 pSGI-YC08 trcE-ChFatB2-7942 34.8 43.5 SGC-BL44 PCC 6803 pSGI-BL44 trcE-ClKAS-IV + HaKAS-III ND ND SGC-YC10- SGC-YC10-5 pSGI-BL43 trcE-ChFatB2-7942 40.0 48.1 5-BL43 trcE-ClKas-IV SGC-YC10- SGC-YC10-5 pSGI-BL44 trcE-ChfatB2-7942 38.5 47.1 5-BL44 trcE-ClKAS-IV + HaKAS-III Note: ND represents "not detected" (<1 mg/L).
[0101]To allow comparison between cultures of differing density, the fatty acid secretion data shown in Table 6-3 were normalized to cell culture density, measured as optical density at 730 nm (OD730 nm); these data are presented in Table 6-4.
TABLE 6-4 Normalized medium-chain fatty acid secretion (mg/L/OD730 nm) in various Synechocystis sp. strains

Strain          | Parent Strain | Plasmid Added | Transgenes                                   | 8:0 | 10:0
SGC-YC10-5      | PCC 6803      | pSGI-YC08     | trcE-ChFatB2-7942                            | 6.8 | 8.5
SGC-BL44        | PCC 6803      | pSGI-BL44     | trcE-ClKAS-IV + HaKAS-III                    | ND  | ND
SGC-YC10-5-BL43 | SGC-YC10-5    | pSGI-BL43     | trcE-ChFatB2-7942, trcE-ClKas-IV             | 7.4 | 8.9
SGC-YC10-5-BL44 | SGC-YC10-5    | pSGI-BL44     | trcE-ChFatB2-7942, trcE-ClKAS-IV + HaKAS-III | 8.3 | 10.2
Example 7
Introduction of a Heterologous Acyl-ACP Thioesterase Gene into a Diatom
[0102]A synthetic gene that encodes a derivative of the ChFatB2 enzyme with specificity for medium-chain (8:0-10:0) acyl-ACPs is expressed in various diatoms (Bacillariophyceae) by constructing and utilizing expression vectors comprising the ChFatB2 gene operably linked to gene regulatory regions (promoters and terminators) that function in diatoms. In a preferred embodiment, the gene is optimized for expression in specific diatom species and the portion of the gene that encodes the plastid transit peptide region of the native ChFatB2 protein is replaced with a plastid transit peptide that functions optimally in diatoms. The nucleotide sequence provided as SEQ ID NO:35 represents a synthetic derivative of the ChFatB2 gene that has been optimized for expression in Thalassiosira pseudonana and in which the native plastid transit peptide-encoding region of the gene has been replaced with the plastid transit peptide (including coupled signal sequence) associated with the gamma subunit of the coupling factor portion (CF1) of the chloroplast ATP synthase from T. pseudonana (JGI Identifier=jgi/Thaps3/40156/est Ext_gwp_gwl.C_chr_40019). The protein encoded by this gene, referred to hereafter as ChFatB2-Thal, is provided in SEQ ID NO:36.
[0103]To produce an expression vector for T. pseudonana, the ChFatB2-Thal gene was placed between the T. pseudonana alpha-tubulin promoter and terminator regulatory sequences. The alpha-tubulin promoter was amplified from genomic DNA isolated from T. pseudonana CCMP 1335 by use of primers PR1 (SEQ ID NO:37) and PR3 (SEQ ID NO:38), whereas the alpha-tubulin terminator was amplified by use of primers PR4 (SEQ ID NO:39) and PR8 (SEQ ID NO:40). The KpnI/BamHI fragment from the alpha-tubulin promoter amplicon, the BamHI/XbaI fragment from the alpha-tubulin terminator and the large fragment from KpnI/XbaI-cut pUC118 (Vieira and Messing, Meth. Enzymol. (1987) 153:3-11) were then combined to form pSGI-PR5. The NcoI/BamHI fragment from the ChFatB2-Thal gene was then inserted into NcoI/BamHI-digested pSGI-PR5 to form pSGI-PR16. In addition, a codon-optimized gene that encodes the nourseothricin acetyltransferase (NAT) enzyme from Streptomyces noursei (SEQ ID NO:41) (Krugel, et al., Gene (1993) 127:127-131) was synthesized and the NcoI/BamHI fragment from this NAT-encoding DNA molecule was inserted into the large NcoI/BamHI fragment from pSGI-PR5 to form pSGI-PR7, which upon introduction into T. pseudonana and other diatoms can provide resistance to the antibiotic nourseothricin.
[0104]pSGI-PR16 and pSGI-PR7 were co-transformed into T. pseudonana CCMP 1335 by means of particle bombardment essentially as described by Poulsen, et al. (J. Phycol. (2006) 42:1059-1065). Transformed cells were selected on agar plates in the presence of 100 mg/L nourseothricin (ClonNAT, obtained from Werner BioAgents, Germany). The presence of the ChFatB2-Thal gene in cells was confirmed by the use of PCR. Transformants were grown in ASW liquid medium (Darley and Volcani, Exp. Cell Res. (1964) 58:334) on a rotary shaker (150 rpm) at 18° C. with constant illumination (60 μE m⁻² sec⁻¹). Samples were removed seven days after inoculation and the culture medium was tested for the presence of FFAs as described in Example 1.
[0105]Although no fatty acid secretion was detected under these particular experimental conditions, optimization of the ChFatB2-Thal gene and diatom host strain can be performed to achieve fatty acid secretion in diatoms, which are known to have relatively impervious cell walls.
Example 8
Secretion of Fatty Acids by Green Algae
[0106]A synthetic gene that encodes a derivative of the ChFatB2 enzyme with specificity for medium-chain (8:0-10:0) acyl-ACPs is expressed in green algae (Chlorophyceae) by constructing and utilizing expression vectors comprising the ChFatB2 gene operably linked to gene regulatory regions (promoters and terminators) that function in green algae. The gene is optimized for expression in specific green algal species and the portion of the gene that encodes the plastid transit peptide region of the native ChFatB2 protein is replaced with a plastid transit peptide that functions optimally in green algae. The nucleotide sequence provided as SEQ ID NO:42 represents a derivative of the ChFatB2 gene optimized for expression in Chlamydomonas reinhardtii and in which the native plastid transit peptide-encoding region of the gene has been replaced with the plastid transit peptide associated with the gamma subunit of the coupling factor portion (CF1) of the chloroplast ATP synthase from C. reinhardtii (GenPept Accession No. XP_001696335). The protein encoded by this gene is provided in SEQ ID NO:43.
Example 9
Secretion of Fatty Acids in Strains of Synechocystis sp. Containing a Disrupted 1,4-alpha-Glucan Branching Enzyme Gene
[0107]A 1.4-kbp DNA fragment spanning an area upstream and into the coding region of the 1,4-alpha-glucan branching enzyme gene (glgB, Cyanobase gene designation=sll0158) from Synechocystis sp. PCC 6803 was amplified from genomic DNA using PCR with primers glgB-5 (SEQ ID NO:44) and glgB-3 (SEQ ID NO:45). This fragment was cloned into the pCR4-Topo vector (Invitrogen) to yield plasmid pSGI-BL32, which was subsequently cut with the restriction enzyme AvaI. A spectinomycin resistance marker cassette containing the aadA gene and associated regulatory control sequences was excised from plasmid pSGI-BL27 by digestion with HindIII. Both of the linear fragments were treated with the Quick Blunting® Kit (New England Biolabs). The aadA gene expression cassette was then inserted into the AvaI site of pSGI-BL32 to yield pSGI-BL33. The portion of pSGI-BL33 that inserts into and inactivates the glgB gene is provided as SEQ ID NO:46.
[0108]The pSGI-BL33 vector was transformed into wild-type Synechocystis PCC 6803 and into trcE::ChFatB2-7942-containing Synechocystis strain SGC-YC10-5 (see Example 1) according to Zang, et al., J. Microbiology (2007) 45:241-245. Insertion of the spectinomycin resistance marker into the sll0158 (glgB) gene via homologous recombination was verified by PCR screening of insert and insertion site. Verified knockout strains were tested in liquid BG-11 medium for secretion of fatty acids. All liquid medium growth conditions used a rotary shaker (150 rpm) at 30° C. with constant illumination (60 μE m⁻² sec⁻¹). Cultures were inoculated in 25 mL of BG-11 medium containing spectinomycin (10 μg/mL) and/or kanamycin (5 μg/mL) as appropriate and grown to a sufficient density (minimal OD730 nm=1.6-2). Cultures were then used to inoculate 100 mL BG-11 medium in 250-mL polycarbonate flasks to OD730 nm=0.4-0.5 and incubated overnight. Forty-five mL of overnight culture at OD730 nm=0.5 were added to new 250-mL flasks; some cultures were induced with 1 mM IPTG and others were used as uninduced controls. Samples (0.5 mL) were taken at 0, 72, 144, and 216 hours post induction and processed as described in Example 2.
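The inoculation step above (seeding 100 mL of medium to a target OD730 from a denser starter culture) follows the usual dilution relation C1·V1 = C2·V2. The Python sketch below is a hypothetical helper, not part of the described method, and it assumes that OD730 scales linearly with cell density over the range used.

```python
def seed_volume_ml(seed_od: float, target_od: float, final_volume_ml: float) -> float:
    """Volume of starter culture needed to reach target_od in final_volume_ml.

    Assumes OD730 is proportional to cell density (C1 * V1 = C2 * V2).
    """
    if not 0 < target_od <= seed_od:
        raise ValueError("target OD must be positive and no greater than the seed OD")
    return target_od * final_volume_ml / seed_od


# Illustrative values taken from the ranges in paragraph [0108]:
# starter culture at OD730 = 2.0, target OD730 = 0.5, final volume 100 mL.
seed = seed_volume_ml(seed_od=2.0, target_od=0.5, final_volume_ml=100.0)
fresh_medium = 100.0 - seed
print(f"starter culture: {seed:.1f} mL, fresh BG-11: {fresh_medium:.1f} mL")
```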
[0109]Free fatty acids (FFA) were separated from the filtered culture supernatant solutions by liquid-liquid extraction for GC/FID analysis. For each sample, 2 mL of filtered culture were extracted with a mixture of 50 μL phosphoric acid (1 M), 100 μL NaCl (5 M) and 2 mL hexane. A 0.2 μL sample was injected using a 40:1 split ratio onto a DB-FFAP column (J&W Scientific, 15 m×250 μm×0.25 μm), with a temperature profile starting at 150° C. for 0.5 min, then heating at 15° C./min to 230° C. and holding for 7.1 min (1.1 mL/min He).
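As a quick check on the oven program described above, the total GC run time can be worked out from the two holds and the ramp. The helper below is a hypothetical Python sketch; only the temperatures, ramp rate, and hold times come from the text.

```python
def gc_run_time_min(start_c: float, initial_hold_min: float,
                    ramp_c_per_min: float, final_c: float,
                    final_hold_min: float) -> float:
    """Total run time for a single-ramp oven program: hold, linear ramp, hold."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# 150 C for 0.5 min, then 15 C/min to 230 C, then a 7.1 min hold.
total = gc_run_time_min(150.0, 0.5, 15.0, 230.0, 7.1)
print(f"total run time: {total:.2f} min")
```

With the stated program the ramp takes (230 − 150)/15 ≈ 5.3 min, so each injection occupies roughly 12.9 min of oven time.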
[0110]GC results indicating secreted levels of free fatty acids after 216 hours are shown in Table 9-1.
TABLE 9-1 Medium-chain Fatty Acid Secretion (in mg/L) in Various Synechocystis sp. Strains

Strain     | Parent Strain | Plasmid Added | Deletion       | Transgenes        | 8:0  | 10:0
PCC 6803   | n/a           | n/a           | None           | None              | ND   | ND
SGC-BL33-1 | PCC 6803      | pSGI-BL33     | Sll0158 (glgB) | None              | ND   | ND
SGC-YC10-5 | PCC 6803      | pSGI-YC08     | None           | trcE-ChFatB2-7942 | 70.0 | 68.7
SGC-BL33-2 | SGC-YC10-5    | pSGI-BL33     | Sll0158 (glgB) | trcE-ChFatB2-7942 | 66.2 | 68.1

Note: ND represents "not detected" (<1 mg/L).
[0111]To allow comparison between cultures of differing density, the fatty acid secretion data shown in Table 9-1 were normalized to cell culture density, measured as optical density at 730 nm (OD730 nm); these data are presented in Table 9-2. Other experiments described in this application could be normalized in a similar fashion.
TABLE 9-2 Normalized Medium-chain Fatty Acid Secretion (mg/L/OD730 nm) in Various Synechocystis sp. Strains

Strain     | Parent Strain | Plasmid Added | Deletion       | Transgenes        | 8:0  | 10:0
PCC 6803   | n/a           | n/a           | None           | None              | ND   | ND
SGC-BL33-1 | PCC 6803      | pSGI-BL33     | Sll0158 (glgB) | None              | ND   | ND
SGC-YC10-5 | PCC 6803      | pSGI-YC08     | None           | trcE-ChFatB2-7942 | 9.8  | 9.7
SGC-BL33-2 | SGC-YC10-5    | pSGI-BL33     | Sll0158 (glgB) | trcE-ChFatB2-7942 | 10.4 | 10.7

Note: ND represents "not detected" (<1 mg/L).
Example 10
Capture of Free Fatty Acids from Model Solutions with Hydrophobic Adsorbent Resins
[0112]A spike solution was formulated by dissolving 75 mg/L octanoic acid and 75 mg/L decanoic acid in BG-11 medium supplemented with 300 mM NaCl and adjusting the pH to 5.8. 50 mg of each of the resins listed in Table 10-1 were weighed into a 50 mL centrifuge tube, combined with 1.0 mL of methanol, and shaken gently. The excess methanol was decanted and the resins were dried overnight at room temperature under a 25 in Hg vacuum. 50 mL of the spike solution was then added to each of the resins and incubated with gentle shaking at 31° C. for 24 hours. Following incubation, the resins were removed by filtering over a Whatman® GF/F glass fiber filter and the filtrates were analyzed for octanoic acid and decanoic acid content by gas chromatography as described in Example 2. The capacity of each resin for octanoic and decanoic acid was then determined from the difference in the concentration of each fatty acid before and after incubation with the resin. The results are shown in Table 10-1 below.
TABLE 10-1 Adsorption capacities of several commercially-available adsorbents

                                                                                                | Adsorption Capacity (mg/g)
Description                            | Resin type                                               | Octanoic acid | Decanoic acid | Total free fatty acids
Dowex Optipore® V503 (Dow Chemical)    | Post cross-linked macroporous polystyrene divinyl benzene | 26.3          | 69.8          | 96.0
Lewatit 1064 MD (LanXess)              | Macroporous polystyrene divinyl benzene                  | 1.1           | 46.7          | 47.8
Zeolyst CBV 28014 (Zeolyst)            | Very low-alumina zeolite                                 | 17.4          | 74.7          | 92.0
Zeolyst CBV 901 (Zeolyst)              | Low-alumina zeolite                                      | 5.4           | 64.8          | 70.1
Hisiv 3000 Silicalite (UOP Honeywell)  | Hydrophobic silicalite                                   | 15.3          | 23.7          | 39.1
Lipidex 5000 (Packard Instrument Co.)  | Alkylated sephadex gel                                   | 0.00          | 18.6          | 18.6
Norit ROW 0.8 (Fluka)                  | Extruded activated charcoal                              | 40.2          | 71.8          | 112.1
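The capacity calculation described in paragraph [0112] is a mass balance on the liquid phase: the fatty acid lost from solution is attributed to the resin. The Python sketch below illustrates that arithmetic. The function names are hypothetical, and the residual concentration in the example is back-calculated from a tabulated capacity rather than taken from any reported measurement.

```python
def adsorption_capacity_mg_per_g(c_initial_mg_l: float, c_final_mg_l: float,
                                 volume_l: float, resin_mass_g: float) -> float:
    """Capacity = (fatty acid removed from solution) / (resin mass)."""
    if resin_mass_g <= 0 or volume_l <= 0:
        raise ValueError("volume and resin mass must be positive")
    return (c_initial_mg_l - c_final_mg_l) * volume_l / resin_mass_g


def max_measurable_capacity_mg_per_g(c_initial_mg_l: float, volume_l: float,
                                     resin_mass_g: float) -> float:
    """Upper bound reached if the resin removes all of the fatty acid supplied."""
    return adsorption_capacity_mg_per_g(c_initial_mg_l, 0.0, volume_l, resin_mass_g)


# Example 10 conditions: 75 mg/L of each acid, 50 mL of spike solution, 50 mg of resin.
# A residual of 48.7 mg/L octanoic acid is the value implied by 26.3 mg/g
# (Dowex Optipore V503); it is not a figure reported in the application.
print(adsorption_capacity_mg_per_g(75.0, 48.7, 0.050, 0.050))
print(max_measurable_capacity_mg_per_g(75.0, 0.050, 0.050))
```

Under these conditions no more than 75 mg/g of either acid can be observed, because that is all the acid supplied per gram of resin. Decanoic acid values close to that figure may therefore reflect depletion of the spike rather than the true capacity of the adsorbent.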
[0113]Elution of free fatty acids from the hydrophobic adsorbents was also investigated. Dowex® Optipore® V503, Zeolyst CBV 28014, Zeolyst CBV 901, and Norit® ROW were incubated with 1.0 mL of spike solution per mg of adsorbent as described above. After the incubation period, the adsorbents were rinsed and combined with 0.1, 0.5, or 1.0 mL methanol per mg of adsorbent and shaken gently at room temperature for 4 hours. The methanol eluates and post-adsorption spikes were analyzed for free fatty acid concentration by gas chromatography. The results are listed in Table 10-2 below.
TABLE 10-2 Desorption of free fatty acids in methanol

                     | % Desorption, by mL MeOH/mg resin
Resin                | 0.1 mL/mg | 0.5 mL/mg | 1.0 mL/mg
Dowex Optipore® V503 | 92%       | 84%       | 100%
CBV 28014            | 53%       | 76%       | 84%
CBV 901              | 78%       | 76%       | 57%
Norit® ROW           | 44%       | 85%       | 77%
[0114]The effect of pH on adsorbent capacity was studied utilizing Dowex® Optipore® V503. 40 mg of the resin were combined with 40 mL of BG-11 medium spiked with 150 mg/L of octanoic and decanoic acid and adjusted to a pH of 10.0, 7.5, 4.8, or 2.8. The pH 10 spike was buffered with 5 mM CAPS. The pH 7.5 and 2.8 spikes were buffered with 5 mM phosphate, and the pH 4.8 spike was buffered naturally by the dissolved fatty acids, with 5 mM NaCl added to maintain consistent conductivity. The spikes were incubated with resin as described above. Free fatty acid concentrations were measured with an enzymatic assay purchased from Zen-Bio. The results are displayed in Table 10-3 below. From these results, it is clear that hydrophobic adsorption of free fatty acids is possible over a wide range of pH, although capacity decreases as the pH increases.
TABLE 10-3 Adsorption capacity of Dowex® Optipore® V503 at various pH values

pH  | Adsorption Capacity (mg FFA/g resin)
10  | 42 ± 13
7.5 | 64 ± 4
4.8 | 172 ± 4
2.8 | 259 ± 1

[0115]Reported values are the mean of two experimental replicates ± one standard deviation.
Example 11
In Vivo Capture of Free Fatty Acids from Cultures of Synechocystis Strain SGC-YC10-5
[0116]Synechocystis sp. strain SGC-YC10-5, which contains the ChFatB2-7942 gene as described in Example 1, was cultured in BG-11 with and without Dowex® Optipore® V503 resin. 400 mL of fresh culture was induced with 5 mM IPTG and incubated at room temperature for 1 hour to allow for uptake of the inducer. The culture was then divided into four 1,000 mL baffled Erlenmeyer flasks with PTFE vent caps. To two of the flasks, approximately 400 mg of Dowex® Optipore® V503 were added. The adsorbent resin in the test flasks was recovered and exchanged for fresh resin daily for 10 days. The recovered resin was washed liberally with deionized water and eluted with 2 mL of methanol. Samples of culture medium from the test flasks and control flasks were also taken daily. The samples were measured for OD730 nm and filtered over a Whatman® GF/B glass fiber filter and analyzed for octanoic acid and decanoic acid content by gas chromatography as previously described in Example 2. The results are presented in Table 11-1.
TABLE 11-1 In vivo capture of free fatty acids from Synechocystis SGC-YC10-5 cultures

Condition     | Avg. Specific Growth Rate (d⁻¹) | Average Free Fatty Acid Productivity (mg L⁻¹ d⁻¹)
Without Dowex | 0.090 ± 0.005                   | 16 ± 0.8
With Dowex    | 0.090 ± 0.010                   | 31 ± 3

[0117]Reported values are the mean of two biological duplicates ± one standard deviation.
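For readers less familiar with the units in Table 11-1, the sketch below shows how a specific growth rate in d⁻¹ relates to culture density and to doubling time. It is a Python illustration that assumes simple exponential growth, which the application does not state explicitly, and the function names are hypothetical.

```python
import math


def specific_growth_rate(od_start: float, od_end: float, days: float) -> float:
    """Specific growth rate (per day), assuming exponential growth between two OD730 readings."""
    if od_start <= 0 or od_end <= 0 or days <= 0:
        raise ValueError("OD readings and elapsed time must be positive")
    return math.log(od_end / od_start) / days


def doubling_time_days(mu_per_day: float) -> float:
    """Doubling time corresponding to a specific growth rate."""
    if mu_per_day <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / mu_per_day


# The growth rate of 0.090 per day reported in Table 11-1 corresponds to a
# doubling time of roughly 7.7 days under the exponential-growth assumption.
print(f"doubling time: {doubling_time_days(0.090):.1f} days")
```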
Example 12
Integration of CO2 Delivery and Product Recovery as a Means for Enhancing the Efficiency and Economy of Both
[0118]Table 10-3 above reveals a clear relationship between free fatty acid adsorption capacity and pH. This relationship results from the inefficiency of extraction of the ionized form of the free fatty acids. Many potential production hosts require a pH significantly higher than the pKa of free fatty acids in order to survive and reproduce. An extreme example of this would be the alkalophilic cyanobacteria such as those belonging to the genera Synechococcus, Synechocystis, Spirulina, and many others, which prefer a pH between 9 and 11 for optimum growth. FIG. 5 outlines an embodiment of the invention wherein this problem is solved by recycling a portion of the culture first through a vessel where it is contacted with concentrated CO2 gas to lower the pH, then through a stationary adsorbent column wherein the protonated free fatty acids are captured.
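The dependence on pH described above follows from the Henderson-Hasselbalch relation, which gives the fraction of a weak acid present in its protonated (un-ionized) form. The Python sketch below assumes a pKa of about 4.9 for medium-chain fatty acids; that value is an approximation introduced here for illustration and is not taken from the application.

```python
ASSUMED_PKA = 4.9  # assumption: approximate pKa of octanoic/decanoic acid


def fraction_protonated(ph: float, pka: float = ASSUMED_PKA) -> float:
    """Fraction of the acid in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pH values used in Table 10-3.
for ph in (10.0, 7.5, 4.8, 2.8):
    print(f"pH {ph:>4}: {fraction_protonated(ph):.4%} protonated")
```

On this simple model almost none of the acid is protonated at pH 10, while nearly all of it is at pH 2.8, which is qualitatively consistent with the trend in Table 10-3. The measurable capacity that remains at high pH indicates that the model does not capture every contribution to adsorption.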
[0119]The CO2-enriched, free fatty acid-depleted suspension is then returned to the bulk culture. The pressure inside the gas-liquid contactor can be controlled independently to provide a constant pH in the stream exiting the adsorption column. Further, the pressure of the post-column flash vessel can be controlled so as to provide a supply of CO2 which is titrated to the CO2 consumption rate of the bulk culture through PID control of pH, dissolved CO2, off-gas CO2, or any combination of the three. The excess CO2 can then be recycled.
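The PID control of pH mentioned above can be pictured with a minimal discrete controller. The Python sketch below is a generic textbook PID loop, not the control scheme actually used by the inventors. The class name, gains, setpoint, and the idea of returning a normalized CO2 valve opening between 0 and 1 are all assumptions made for illustration.

```python
class PHController:
    """Minimal discrete PID controller mapping a pH reading to a CO2 valve opening (0 to 1)."""

    def __init__(self, kp: float, ki: float, kd: float, setpoint: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None

    def update(self, measured_ph: float, dt: float) -> float:
        if dt <= 0:
            raise ValueError("dt must be positive")
        # pH above the setpoint calls for more CO2, so the error is measured - setpoint.
        error = measured_ph - self.setpoint
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        output = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Clamp to the physical range of the valve (no anti-windup in this sketch).
        return min(max(output, 0.0), 1.0)


# Hypothetical gains and readings, for illustration only.
controller = PHController(kp=0.5, ki=0.05, kd=0.0, setpoint=7.0)
for reading in (8.2, 7.8, 7.3, 7.0):
    print(f"pH {reading}: valve opening {controller.update(reading, dt=1.0):.2f}")
```

A practical controller would also need integral anti-windup and tuning against the real gas-liquid dynamics, neither of which is addressed here.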
[0120]In order to demonstrate proof of concept for the invention described above, an experimental system was constructed as displayed in FIG. 5.
[0121]Vessel E-1 was filled with 4 L of a spike solution containing 700 mg/L octanoic acid dissolved in 100 mM NaCl, pH 11.1. Column C1 was filled with 45.2 g of Dowex® Optipore® V503 polymeric resin. The resin was activated with two column volumes of methanol, followed by a wash of three column volumes of 100 mM NaCl, pH 11.1. Liquid-gas contact vessel E-2 was then filled with 200 mL of spike solution and pressurized with CO2 to 34.7 psia. When the pH of the spike solution inside E-2 had decreased to between 5 and 6 (as determined by a slip of pH paper contained within E-2), peristaltic pumps P-1 and P-2 were set to the same flow rate and column loading was initiated. Valve V-2 was adjusted as needed to increase the column pressure and prevent the formation of gas bubbles.
[0122]Fractions of the flow through were taken at periodic intervals of 70-100 mL and assayed for octanoic acid by a commercially-available free fatty acid assay purchased from Zen-Bio. Two superficial linear flow rates were evaluated: 16.3 cm/min and 6.1 cm/min. For both flow rates, a control run was performed whereby vessel E-2 was bypassed and the column was loaded directly at a pH of 11.1. Table 12-1 below displays the results of this experiment. For both flow rates, column dynamic binding capacity was approximately 4-fold greater when CO2 was used to lower the pH of the load.
TABLE 12-1 Dynamic binding capacity with and without CO2-mediated load acidification

                       | Dynamic Binding Capacity (mg/g)
Flow velocity (cm/min) | +34.7 psia CO2 | Control (pH 11.1), 0 psia CO2
6.1                    | 43.5           | 10.5
16.3                   | 7.2            | 1.9
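The "approximately 4-fold" improvement stated in paragraph [0122] can be checked directly against Table 12-1. The short Python sketch below uses only the tabulated values; the dictionary layout and function name are illustrative choices.

```python
# Dynamic binding capacities from Table 12-1, in mg/g,
# keyed by superficial linear flow velocity in cm/min.
TABLE_12_1 = {
    6.1: {"with_co2": 43.5, "control": 10.5},
    16.3: {"with_co2": 7.2, "control": 1.9},
}


def fold_improvement(with_co2: float, control: float) -> float:
    """Ratio of binding capacity with CO2 acidification to the pH 11.1 control."""
    if control <= 0:
        raise ValueError("control capacity must be positive")
    return with_co2 / control


for velocity, row in TABLE_12_1.items():
    ratio = fold_improvement(row["with_co2"], row["control"])
    print(f"{velocity} cm/min: {ratio:.1f}-fold")
```

The same table shows that the slower flow rate gives a several-fold higher capacity than the faster one under both loading conditions, so flow velocity matters at least as much as acidification in this experiment.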
Example 13
Secretion of Oleic Acid by Photosynthetic Microorganisms
[0123]A synthetic gene that encodes a derivative of a FatA-type plant acyl-ACP TE enzyme with specificity for oleoyl-ACP is expressed in various photosynthetic microorganisms by constructing and utilizing expression vectors comprising a FatA gene operably linked to gene regulatory regions (promoters and terminators) that function in the host photosynthetic microorganism. The gene is optimized for expression in the host photosynthetic microorganism and the portion of the gene that encodes the plastid transit peptide region of the native FatA protein is removed for expression in cyanobacteria or replaced with a plastid transit peptide that functions effectively in the host eukaryotic photosynthetic microorganisms.
[0124]Genes that could be used for this purpose include, but are not limited to, those that encode the following acyl-ACP TEs (referred to by GenPept Accession Numbers): NP_189147.1, AAC49002, CAA52070.1, CAA52069.1, 193041.1, CAC39106, CAO17726, AAC72883, AAA33020, AAL79361, AAQ08223.1, AAB51523, AAL77443, AAA33019, AAG35064, and AAL77445.
[0125]The following is a sequence listing of all sequences referred to above. SEQ ID NO:1
Sequence CWU
1
461723DNAThalassiosira pseudonana 1atatcgtgga gtatatcaat ggtggggagg
tgtggtgtag tagttgcgag caaagatgac 60acttggtaaa ctgatgcgac gtggatactg
cgacgaagat tggccgtaca cacgtcggat 120ttgaatgaac atatgtgttt tattcaaacc
aatttgacta gtttgaggaa ccttcacgtg 180tttcgctctc aaactttgag acaacagcct
ccgaatccaa atgaatgact tttaaacaca 240agctaggagc tggtgatata taatatgctg
gttgtatgaa agagactaat cgtgtgaaat 300aaatgatggc tcgccctagt gaatgctcct
cagagacgct cattcgtcca agtgttcgtc 360acttctgtca ttgtttcctc cgaggccaag
gtggtcgagt aggtagatac cagctattct 420cttgcttctt ttactttatc tccctctacc
aaaaacagca cgttattatc tcctttccat 480tccacgcaat aacaagaggc aatcggtaaa
gaggcacaaa caagagaaca aagaccccgg 540ctgcttctct cgtccgtccg ccgcccctaa
acttcaagtt ttacttcaag ttcaatctgt 600tttttggcgc aaaaagcgcc gttgctccgc
cgtcctccgc acttttcagt tctctgtcgt 660cgaggactgt tatcaacttc caagatctcc
atctcttctc ctatcctccc ctaacaaagt 720acg
7232700DNAThalassiosira pseudonana
2gttcaatgcc tttggtgttg tcgtcaatag gcacttcgac tttgctcttg gttccgttat
60cccaaacttg aacgagcgcc acggtcctct cggtttcggt ggtatcccag gacctctcgt
120agttgatgca gggttcagaa tcgagataac tcatgttgtc gtttgttgtt ttgttgattt
180taccttgctt ccagctttcg gtctgtaatt acagtgacac gctgtactag aaatgatgta
240cgtttgatgg aatctctaaa attatgagct atttatgaac acaggagttc tcatcaactt
300tccatcgaaa tccgtaggag aattctaatg tcctcttcgg acgagagaca gacgtatcag
360gagtcacttg aaggttccaa gattctatct tcatgaggtc tggatatgac agtcctgcct
420tcgaggcaag ccctgtcact gtgacctttt cgcgtcgtca ataattttag gaacgcaagg
480atagggattc tccatagtaa ggactattgt ttgacccctg aaacttcaac ctttacccca
540agaatggggc attcataagt gaaaaacgtt tgttatgtat gccccaattc ctacacagga
600ataggtattg aatcacgtag aaaatgatcg ttgcgccgca agcaaacaca ccggctctct
660tccgccgcac tctcttccaa tccaacaaac aaacgcaacc
7003700DNAThalassiosira pseudonana 3acgcagatag tgtatatttg cgtcacagtc
tcttgtcgtc ataggagagg agaactagag 60aacaaaaagc gtcatgtaat aaatgttgga
tgttggcatg tcgtcccagc cagtatccaa 120aacaccgaat tgtcgaggtt cgtgagcttg
cagcactcat ggcaacggct aatttcatat 180ctatgttatc aatgttatct gtaacactaa
tgctaagtaa tgcgtcaaca acttatctcc 240tccggctctt cactccactt cgctgacgtc
gtttgatatt ttatctgctc tattattcaa 300gttgaatctg cagttgaggc attctctaac
ttagccgaga aatcaagacg gtgactttga 360atttacaagt acagttacgc ttacacaaga
tacctttctc acaaaaaaga ttccgttggc 420tcccactgcg cattgctact tggtactatt
cccatgtgga actggatttg ggggaaagag 480ggagtctgag tttgtaaatg tacatttgtt
attcccttca ttatcgacaa catcactaac 540tcatcgtgca tacagagaaa aacaatctcc
actttctcaa caaaagtggc cacaatgtgc 600ctccgacaca gcctcaagag ccgaccgatc
gttgcatttt tcactctcga acacacacac 660acacacacac ccacacacca ccacctctct
ttatccaacc 7004779DNAPhaeodactylum tricornutum
4agtcggattg aaaacagcga atgtacgcca ttccaaaggc gctcagcaaa aggagacata
60tgcacacatc cagcggaagt aagtacgaca cttgaacaag agcatgacct gtcaaagcat
120gctgccatcg tcgcttcgct tctattccca atgacacttt ggtcaccacg acttgaaaaa
180cggcaatcag caaaataagc gatagaccct gaccaacggc agctttcatc ttttatgaac
240ggcagatatt cgcatcctct tttatcgata cagcaaacac gcagaatttc tgttctcttt
300caagacgaca agcacgaatt tcggtacgct gtcataattt attgactatg ttagataaca
360caactctcat gcgctttgaa aatctgctta cttcacagta aagagacaag ctctttgcac
420tgactgcgac agagatggaa aaaaggaatt ctaccggcaa ttgacagact gatgtgaaaa
480cagagagtaa ccgtaaacaa gtaccggtaa gtatgcgcgc aacctttact tgttccgttg
540gcgtctgtca tttgatgtca cgcagacttg aaaagtcgtt cgctccattg tgaaaaatat
600catgcgacaa cgttcagaaa ggccggcgtg caatcggttt gccttgtttc tgatccgctg
660ctttttgagc aacgacctgc ggaggaccac aatgatcttt ctcttgtcgt gagagctagt
720tctattacct gttcaattac ctgctttctt gtattactcg aagctctcgt tcttctatc
7795807DNAPhaeodactylum tricornutum 5ttttgtaatt cgccactacc tttacgcaag
taagaacgtt tcatgctgga gtcgtggacc 60aatcgtaagg tatacgttag tcataccgcg
cctgtactat ttacgacacg agagaaagcc 120actgcagttc tgggatggga tcagatgctt
gctcctttca ctgcgctggc aaactgtatg 180ctagacacga ctcggatcgg atatcgaaat
caaacggcgg agaatgggtt cggatgactg 240tccggagcta cctaggaaaa gcttcttttt
cgtttcggac caccaagagg gaagcgctgc 300ctgtactcgt gcgataggaa gcatcagacg
tatttgttcg gatgagatca caccagaact 360agccaggcag ccagctagct attgtcatct
acagatttcg aaccaaacgt ggatactaga 420aagcatggga ttgactgtga ctgtgatttg
tgttgcacac tttataccta ccctcgacct 480cgtactttgt gtagtagcaa aatgtggatt
gtgcgttgaa atgtagaagg gtttggggtt 540gacacgggtt cattcatatc cgggtactcg
aaaatgaccg caacgatact catcgatcga 600gatacggtgt acacgtagac tacgtagaaa
acctacgagg aagcagatat gattttccgg 660tccgcagcat ccacccagcc aacgtcggca
aacaaccaaa caacctcgtc gccccttgtt 720gttcaagatc tgcattccat tgacagcctt
ttcaacgaaa ccgttcgctc gtttgattcc 780atacgtcttt gaataccaac agaaaat
8076791DNAPhaeodactylum tricornutum
6aaagtatcaa tagcttattc cagatttttg tgatgttagc ctacttgtaa agcagcggag
60gtctgtcatg acggtgtagt ggctggtttc gctccggaaa ttaagttctg gttttatatc
120tcaacataac tagagataaa gttacaggca cgttactgta agtccgcaga ttgctaatgc
180tttgcttcgg tgtccgtaaa gcttatgtta ctgttctaga ttagagtggt atccacgatt
240ttcaaacgaa agtgacatat tgcgaattgt gcagtatcag aaaatctcca aagcaggagc
300atacattagt ttggccgtat tgcaacgagt agctctcctg aagatgcaag taatagaggc
360tgtgagcgtg aataatgaat ttgcctgttt agaagctggg gatcacatct cgtgctcccc
420aaaagtctct cagtaaatca agaatgttcc tattttcgaa aacattgcta tttatttagt
480taaccggctt cgtcctccca tttaaataaa gattttcaaa aatgacacca ccaacgtccg
540caagatcacg attcgagagg attcttcttt gtcccaacca tggatgacct ctcctattaa
600cacgtatatg aagtaccgct gctggtaccc ggaaaagaga ggacattcct tgtgggagag
660tcatcgatgc gctgccaatc gaaaaaaatg ccaaggcgag aaaagcgcag ttcgttctta
720taatccaatt ttgagtttca agacatactc gttgctacct tcccaccttc ccaaccaaac
780cactcgcaac c
79171093DNAArtificial SequenceSynthetic construct 7ccatggcgaa tggttctgca
gtctctttga aatctggaag cttgaatacg caggaggata 60ctagttccag tccccctcct
cggacgtttt tgcatcagct gcccgactgg agtcgcttgc 120tgaccgccat cacaacagtg
tttgtcaaat ctaaacgacc ggacatgcat gatcggaaaa 180gcaagcgccc agatatgctc
gtcgatagtt tcggactcga gtctactgtg caggacggcc 240tggtgttccg tcaatccttc
agcatccgaa gctacgagat tggtacggac cgtaccgcta 300gcattgaaac gttgatgaac
catctccaag aaaccagttt gaaccactgc aagagcacgg 360gcatcctgct ggatggtttt
ggccgcacat tggaaatgtg caagcgagac ttgatctggg 420tggtcattaa aatgcagatc
aaagttaatc gatacccggc ctggggagat accgttgaga 480tcaatacacg cttttcccgt
ttgggcaaaa ttggcatggg tcgcgattgg ctgatctccg 540actgcaacac cggtgagatc
ttggtccgtg caacgtctgc gtacgcgatg atgaatcaaa 600agacgcgtcg gttgagtaag
ctgccgtatg aagttcacca agaaattgtt ccattgttcg 660ttgatagtcc cgttatcgag
gattctgacc tcaaagtcca caagtttaaa gtcaagactg 720gcgattccat ccagaagggc
ctgacgccag gttggaacga tctggatgtg aaccaacacg 780ttagcaacgt taagtatatc
ggctggatct tggaaagtat gcctacggaa gtcctggaga 840cgcaggaact ctgcagtctc
gctctggagt accgccgtga gtgtggccgt gattccgtgc 900tcgagtccgt cactgcgatg
gaccctagca aagtgggtgt tcgcagtcaa taccaacacc 960tcttgcggct cgaagatggg
accgccattg tgaacggcgc gaccgaatgg cgccccaaaa 1020atgccggcgc taacggggca
attagtaccg ggaaaacctc caatggaaac agcgtcagct 1080aatgatagga tcc
10938359PRTArtificial
SequenceSynthetic construct 8Met Ala Asn Gly Ser Ala Val Ser Leu Lys Ser
Gly Ser Leu Asn Thr1 5 10
15Gln Glu Asp Thr Ser Ser Ser Pro Pro Pro Arg Thr Phe Leu His Gln
20 25 30Leu Pro Asp Trp Ser Arg Leu
Leu Thr Ala Ile Thr Thr Val Phe Val 35 40
45Lys Ser Lys Arg Pro Asp Met His Asp Arg Lys Ser Lys Arg Pro
Asp 50 55 60Met Leu Val Asp Ser Phe
Gly Leu Glu Ser Thr Val Gln Asp Gly Leu65 70
75 80Val Phe Arg Gln Ser Phe Ser Ile Arg Ser Tyr
Glu Ile Gly Thr Asp 85 90
95Arg Thr Ala Ser Ile Glu Thr Leu Met Asn His Leu Gln Glu Thr Ser
100 105 110Leu Asn His Cys Lys Ser
Thr Gly Ile Leu Leu Asp Gly Phe Gly Arg 115 120
125Thr Leu Glu Met Cys Lys Arg Asp Leu Ile Trp Val Val Ile
Lys Met 130 135 140Gln Ile Lys Val Asn
Arg Tyr Pro Ala Trp Gly Asp Thr Val Glu Ile145 150
155 160Asn Thr Arg Phe Ser Arg Leu Gly Lys Ile
Gly Met Gly Arg Asp Trp 165 170
175Leu Ile Ser Asp Cys Asn Thr Gly Glu Ile Leu Val Arg Ala Thr Ser
180 185 190Ala Tyr Ala Met Met
Asn Gln Lys Thr Arg Arg Leu Ser Lys Leu Pro 195
200 205Tyr Glu Val His Gln Glu Ile Val Pro Leu Phe Val
Asp Ser Pro Val 210 215 220Ile Glu Asp
Ser Asp Leu Lys Val His Lys Phe Lys Val Lys Thr Gly225
230 235 240Asp Ser Ile Gln Lys Gly Leu
Thr Pro Gly Trp Asn Asp Leu Asp Val 245
250 255Asn Gln His Val Ser Asn Val Lys Tyr Ile Gly Trp
Ile Leu Glu Ser 260 265 270Met
Pro Thr Glu Val Leu Glu Thr Gln Glu Leu Cys Ser Leu Ala Leu 275
280 285Glu Tyr Arg Arg Glu Cys Gly Arg Asp
Ser Val Leu Glu Ser Val Thr 290 295
300Ala Met Asp Pro Ser Lys Val Gly Val Arg Ser Gln Tyr Gln His Leu305
310 315 320Leu Arg Leu Glu
Asp Gly Thr Ala Ile Val Asn Gly Ala Thr Glu Trp 325
330 335Arg Pro Lys Asn Ala Gly Ala Asn Gly Ala
Ile Ser Thr Gly Lys Thr 340 345
350Ser Asn Gly Asn Ser Val Ser 35597259DNAArtificial
SequenceSynthetic construct 9cgccggggct ggcagcttag tcctgcgcaa tctctactac
atctgccaac ccagtgaaat 60tttgatcttt gctggcagta gtcgccgcag tagtgatggc
cgccgagttg gctatcgctt 120ggtcaagggc ggcagcagcc tgcgggtacc tctgctggaa
aaagcgctcc gcatggatct 180gaccaacatg atcattgagt tgcgcgtttc caatgccttc
tccaagggcg gcattcccct 240gactgttgaa ggcgttgcca atatcaagat tgctggggaa
gaaccgacca tccacaacgc 300gatcgagcgg ctgcttggca aaaaccgtaa ggaaatcgag
caaattgcca aggagaccct 360cgaaggcaac ttgcgtggtg ttttagccag cctcacgccg
gagcagatca acgaggacaa 420aattgccttt gccaaaagtc tgctggaaga ggcggaggat
gaccttgagc agctgggtct 480agtcctcgat acgctgcaag tccagaacat ttccgatgag
gtcggttatc tctcggctag 540tggacgcaag cagcgggctg atctgcagcg agatgcccga
attgctgaag ccgatgccca 600ggctgcctct gcgatccaaa cggccgaaaa tgacaagatc
acggccctgc gtcggatcga 660tcgcgatgta gcgatcgccc aagccgaggc cgagcgccgg
attcaggatg cgttgacgcg 720gcgcgaagcg gtggtggccg aagctgaagc ggacattgct
accgaagtcg ctcgtagcca 780agcagaactc cctgtgcagc aggagcggat caaacaggtg
cagcagcaac ttcaagccga 840tgtgatcgcc ccagctgagg cagcttgtaa acgggcgatc
gcggaagcgc ggggggccgc 900cgcccgtatc gtcgaagatg gaaaagctca agcggaaggg
acccaacggc tggcggaggc 960ttggcagacc gctggtgcta atgcccgcga catcttcctg
ctccagaagc tcgaaattcg 1020agctcggtac catttacgtt gacaccatcg aatggtgcaa
aacctttcgc ggtatggcat 1080gatagcgccc ggaagagagt caattcaggg tggtgaatgt
gaaaccagta acgttatacg 1140atgtcgcaga gtatgccggt gtctcttatc agaccgtttc
ccgcgtggtg aaccaggcca 1200gccacgtttc tgcgaaaacg cgggaaaaag tggaagcggc
gatggcggag ctgaattaca 1260ttcccaaccg cgtggcacaa caactggcgg gcaaacagtc
gttgctgatt ggcgttgcca 1320cctccagtct ggccctgcac gcgccgtcgc aaattgtcgc
ggcgattaaa tctcgcgccg 1380atcaactggg tgccagcgtg gtggtgtcga tggtagaacg
aagcggcgtc gaagcctgta 1440aagcggcggt gcacaatctt ctcgcgcaac gcgtcagtgg
gctgatcatt aactatccgc 1500tggatgacca ggatgccatt gctgtggaag ctgcctgcac
taatgttccg gcgttatttc 1560ttgatgtctc tgaccagaca cccatcaaca gtattatttt
ctcccatgaa gacggtacgc 1620gactgggcgt ggagcatctg gtcgcattgg gtcaccagca
aatcgcgctg ttagcgggcc 1680cattaagttc tgtctcggcg cgtctgcgtc tggctggctg
gcataaatat ctcactcgca 1740atcaaattca gccgatagcg gaacgggaag gcgactggag
tgccatgtcc ggttttcaac 1800aaaccatgca aatgctgaat gagggcatcg ttcccactgc
gatgctggtt gccaacgatc 1860agatggcgct gggcgcaatg cgcgccatta ccgagtccgg
gctgcgcgtt ggtgcggata 1920tctcggtagt gggatacgac gataccgaag acagctcatg
ttatatcccg ccgttaacca 1980ccatcaaaca ggattttcgc ctgctggggc aaaccagcgt
ggaccgcttg ctgcaactct 2040ctcagggcca ggcggtgaag ggcaatcagc tgttgcccgt
ctcactggtg aaaagaaaaa 2100ccaccctggc gcccaatacg caaaccgcct ctccccgcgc
gttggccgat tcattaatgc 2160agctggcacg acaggtttcc cgactggaaa gcgggcagtg
agcgcaacgc aattaatgta 2220agttagcgcg aattgatctg gtttgacagc ttatcatcga
ctgcacggtg caccaatgct 2280tctggcgtca ggcagccatc ggaagctgtg gtatggctgt
gcaggtcgta aatcactgca 2340taattcgtgt cgctcaaggc gcactcccgt tctggataat
gttttttgcg ccgacatcat 2400aacggttctg gcaaatattc tgaaatgagc tgttgacaat
taatcatccg gctcgtataa 2460tgtgtggaat tgtgagcgga taacaatttc acacaggaaa
cagcgccgct gagaaaaagc 2520gaagcggcac tgctctttaa caatttatca gacaatctgt
gtgggcactc gaccggaatt 2580atcgattaac tttattatta aaaattaaag aggtatatat
taatgtatcg attaaataag 2640gaggaataaa ccatggcgaa tggttctgca gtctctttga
aatctggaag cttgaatacg 2700caggaggata ctagttccag tccccctcct cggacgtttt
tgcatcagct gcccgactgg 2760agtcgcttgc tgaccgccat cacaacagtg tttgtcaaat
ctaaacgacc ggacatgcat 2820gatcggaaaa gcaagcgccc agatatgctc gtcgatagtt
tcggactcga gtctactgtg 2880caggacggcc tggtgttccg tcaatccttc agcatccgaa
gctacgagat tggtacggac 2940cgtaccgcta gcattgaaac gttgatgaac catctccaag
aaaccagttt gaaccactgc 3000aagagcacgg gcatcctgct ggatggtttt ggccgcacat
tggaaatgtg caagcgagac 3060ttgatctggg tggtcattaa aatgcagatc aaagttaatc
gatacccggc ctggggagat 3120accgttgaga tcaatacacg cttttcccgt ttgggcaaaa
ttggcatggg tcgcgattgg 3180ctgatctccg actgcaacac cggtgagatc ttggtccgtg
caacgtctgc gtacgcgatg 3240atgaatcaaa agacgcgtcg gttgagtaag ctgccgtatg
aagttcacca agaaattgtt 3300ccattgttcg ttgatagtcc cgttatcgag gattctgacc
tcaaagtcca caagtttaaa 3360gtcaagactg gcgattccat ccagaagggc ctgacgccag
gttggaacga tctggatgtg 3420aaccaacacg ttagcaacgt taagtatatc ggctggatct
tggaaagtat gcctacggaa 3480gtcctggaga cgcaggaact ctgcagtctc gctctggagt
accgccgtga gtgtggccgt 3540gattccgtgc tcgagtccgt cactgcgatg gaccctagca
aagtgggtgt tcgcagtcaa 3600taccaacacc tcttgcggct cgaagatggg accgccattg
tgaacggcgc gaccgaatgg 3660cgccccaaaa atgccggcgc taacggggca attagtaccg
ggaaaacctc caatggaaac 3720agcgtcagct aatgatagga tccgagctcg agatctgcag
ctggtaccat atgggaattc 3780gaagcttggc tgttttggcg gatgagagaa gattttcagc
ctgatacaga ttaaatcaga 3840acgcagaagc ggtctgataa aacagaattt gcctggcggc
agtagcgcgg tggtcccacc 3900tgaccccatg ccgaactcag aagtgaaacg ccgtagcgcc
gatggtagtg tggggtctcc 3960ccatgcgaga gtagggaact gccaggcatc aaataaaacg
aaaggctcag tcgaaagact 4020gggcctttcg ttttatctgt tgtttgtcgg tgaacgctct
cctgagtagg acaaatccgc 4080cgggagcgga tttgaacgtt gcgaagcaac ggcccggagg
gtggcgggca ggacgcccgc 4140cataaactgc caggcatcaa attaagcaga aggccatcct
gacggatggc ctttttgcgt 4200ttctacaaac tcttttgttt atttttctaa atacattcaa
atatgtatcc gctcatgggg 4260atccgactag taggcctcga ggaattcacg cgtacgtaga
tctccgcggc cgccgatcct 4320ctagtatgct tgtaaaccgt tttgtgaaaa aatttttaaa
ataaaaaagg ggacctctag 4380ggtccccaat taattagtaa tataatctat taaaggtcat
tcaaaaggtc atccaccgga 4440tcagcttagt aaagccctcg ctagatttta atgcggatgt
tgcgattact tcgccaacta 4500ttgcgataac aagaaaaagc cagcctttca tgatatatct
cccaatttgt gtagggctta 4560ttatgcacgc ttaaaaataa taaaagcaga cttgacctga
tagtttggct gtgagcaatt 4620atgtgcttag tgcatctaac gcttgagtta agccgcgccg
cgaagcggcg tcggcttgaa 4680cgaattgtta gacattattt gccgactacc ttggtgatct
cgcctttcac gtagtggaca 4740aattcttcca actgatctgc gcgcgaggcc aagcgatctt
cttcttgtcc aagataagcc 4800tgtctagctt caagtatgac gggctgatac tgggccggca
ggcgctccat tgcccagtcg 4860gcagcgacat ccttcggcgc gattttgccg gttactgcgc
tgtaccaaat gcgggacaac 4920gtaagcacta catttcgctc atcgccagcc cagtcgggcg
gcgagttcca tagcgttaag 4980gtttcattta gcgcctcaaa tagatcctgt tcaggaaccg
gatcaaagag ttcctccgcc 5040gctggaccta ccaaggcaac gctatgttct cttgcttttg
tcagcaagat agccagatca 5100atgtcgatcg tggctggctc gaagatacct gcaagaatgt
cattgcgctg ccattctcca 5160aattgcagtt cgcgcttagc tggataacgc cacggaatga
tgtcgtcgtg cacaacaatg 5220gtgacttcta cagcgcggag aatctcgctc tctccagggg
aagccgaagt ttccaaaagg 5280tcgttgatca aagctcgccg cgttgtttca tcaagcctta
cggtcaccgt aaccagcaaa 5340tcaatatcac tgtgtggctt caggccgcca tccactgcgg
agccgtacaa atgtacggcc 5400agcaacgtcg gttcgagatg gcgctcgatg acgccaacta
cctctgatag ttgagtcgat 5460acttcggcga tcaccgcttc cctcatgatg tttaactttg
ttttagggcg actgccctgc 5520tgcgtaacat cgttgctgct ccataacatc aaacatcgac
ccacggcgta acgcgcttgc 5580tgcttggatg cccgaggcat agactgtacc ccaaaaaaac
agtcataaca agccatgaaa 5640accgccactg cgccgttacc accgctgcgt tcggtcaagg
ttctggacca gttgcgtgag 5700cgcatacgct acttgcatta cagcttacga accgaacagg
cttatgtcca ctgggttcgt 5760gccttcatcc gtttccacgg tgtgcgtcac ccggcaacct
tgggcagcag cgaagtcgag 5820gcatttctgt cctggctggc gaacgagcgc aaggtttcgg
tctccacgca tcgtcaggca 5880ttggcggcct tgctgttctt ctacggcaag gtgctgtgca
cggatctgcc ctggcttcag 5940gagatcggaa gacctcggcc gtcgcggcgc ttgccggtgg
tgctgacccc ggatgaagtg 6000gttcgcatcc tcggttttct ggaaggcgag catcgtttgt
tcgcccagct tctgtatgga 6060acgggcatgc ggatcagtga gggtttgcaa ctgcgggtca
aggatctgga tttcgatcac 6120ggcacgatca tcgtgcggga gggcaagggc tccaaggatc
gggccttgat gttacccgag 6180agcttggcac ccagcctgcg cgagcagggg aattgatccg
gtggatgacc ttttgaatga 6240cctttaatag attatattac taattaattg gggaccctag
aggtcccctt ttttatttta 6300aaaatttttt cacaaaacgg tttacaagca taaagctcta
gagtcgacct gcaggcatgc 6360aagcttcgag tccctgctcg tcacgctttc aggcaccgtg
ccagatatcg acgtggagtc 6420gatcactgtg attggcgaag gggaaggcag cgctacccaa
atcgctagct tgctggagaa 6480gctgaaacaa accacgggca ttgatctggc gaaatcccta
ccgggtcaat ccgactcgcc 6540cgctgcgaag tcctaagaga tagcgatgtg accgcgatcg
cttgtcaaga atcccagtga 6600tcccgaacca taggaaggca agctcaatgc ttgcctcgtc
ttgaggacta tctagatgtc 6660tgtggaacgc acatttattg ccatcaagcc cgatggcgtt
cagcggggtt tggtcggtac 6720gatcatcggc cgctttgagc aaaaaggctt caaactggtg
ggcctaaagc agctgaagcc 6780cagtcgcgag ctggccgaac agcactatgc tgtccaccgc
gagcgcccct tcttcaatgg 6840cctcgtcgag ttcatcacct ctgggccgat cgtggcgatc
gtcttggaag gcgaaggcgt 6900tgtggcggct gctcgcaagt tgatcggcgc taccaatccg
ctgacggcag aaccgggcac 6960catccgtggt gattttggtg tcaatattgg ccgcaacatc
atccatggct cggatgcaat 7020cgaaacagca caacaggaaa ttgctctctg gtttagccca
gcagagctaa gtgattggac 7080ccccacgatt caaccctggc tgtacgaata aggtctgcat
tccttcagag agacattgcc 7140atgcccgtgc tgcgatcgcc cttccaagct gccttgcccc
gctgtttcgg gctggcagcc 7200ctggcgttgg ggctggcgac cgcttgccaa gaaagcagcg
ctccgccggc tgccggatc 7259
SEQ ID NO 10; length 7113; DNA; Artificial Sequence; Synthetic construct
cgccggggct ggcagcttag tcctgcgcaa tctctactac atctgccaac
ccagtgaaat 60tttgatcttt gctggcagta gtcgccgcag tagtgatggc cgccgagttg
gctatcgctt 120ggtcaagggc ggcagcagcc tgcgggtacc tctgctggaa aaagcgctcc
gcatggatct 180gaccaacatg atcattgagt tgcgcgtttc caatgccttc tccaagggcg
gcattcccct 240gactgttgaa ggcgttgcca atatcaagat tgctggggaa gaaccgacca
tccacaacgc 300gatcgagcgg ctgcttggca aaaaccgtaa ggaaatcgag caaattgcca
aggagaccct 360cgaaggcaac ttgcgtggtg ttttagccag cctcacgccg gagcagatca
acgaggacaa 420aattgccttt gccaaaagtc tgctggaaga ggcggaggat gaccttgagc
agctgggtct 480agtcctcgat acgctgcaag tccagaacat ttccgatgag gtcggttatc
tctcggctag 540tggacgcaag cagcgggctg atctgcagcg agatgcccga attgctgaag
ccgatgccca 600ggctgcctct gcgatccaaa cggccgaaaa tgacaagatc acggccctgc
gtcggatcga 660tcgcgatgta gcgatcgccc aagccgaggc cgagcgccgg attcaggatg
cgttgacgcg 720gcgcgaagcg gtggtggccg aagctgaagc ggacattgct accgaagtcg
ctcgtagcca 780agcagaactc cctgtgcagc aggagcggat caaacaggtg cagcagcaac
ttcaagccga 840tgtgatcgcc ccagctgagg cagcttgtaa acgggcgatc gcggaagcgc
ggggggccgc 900cgcccgtatc gtcgaagatg gaaaagctca agcggaaggg acccaacggc
tggcggaggc 960ttggcagacc gctggtgcta atgcccgcga catcttcctg ctccagaagc
tcgaaattcg 1020agctcggtac catttacgtt gacaccatcg aatggtgcaa aacctttcgc
ggtatggcat 1080gatagcgccc ggaagagagt caattcaggg tggtgaatgt gaaaccagta
acgttatacg 1140atgtcgcaga gtatgccggt gtctcttatc agaccgtttc ccgcgtggtg
aaccaggcca 1200gccacgtttc tgcgaaaacg cgggaaaaag tggaagcggc gatggcggag
ctgaattaca 1260ttcccaaccg cgtggcacaa caactggcgg gcaaacagtc gttgctgatt
ggcgttgcca 1320cctccagtct ggccctgcac gcgccgtcgc aaattgtcgc ggcgattaaa
tctcgcgccg 1380atcaactggg tgccagcgtg gtggtgtcga tggtagaacg aagcggcgtc
gaagcctgta 1440aagcggcggt gcacaatctt ctcgcgcaac gcgtcagtgg gctgatcatt
aactatccgc 1500tggatgacca ggatgccatt gctgtggaag ctgcctgcac taatgttccg
gcgttatttc 1560ttgatgtctc tgaccagaca cccatcaaca gtattatttt ctcccatgaa
gacggtacgc 1620gactgggcgt ggagcatctg gtcgcattgg gtcaccagca aatcgcgctg
ttagcgggcc 1680cattaagttc tgtctcggcg cgtctgcgtc tggctggctg gcataaatat
ctcactcgca 1740atcaaattca gccgatagcg gaacgggaag gcgactggag tgccatgtcc
ggttttcaac 1800aaaccatgca aatgctgaat gagggcatcg ttcccactgc gatgctggtt
gccaacgatc 1860agatggcgct gggcgcaatg cgcgccatta ccgagtccgg gctgcgcgtt
ggtgcggata 1920tctcggtagt gggatacgac gataccgaag acagctcatg ttatatcccg
ccgttaacca 1980ccatcaaaca ggattttcgc ctgctggggc aaaccagcgt ggaccgcttg
ctgcaactct 2040ctcagggcca ggcggtgaag ggcaatcagc tgttgcccgt ctcactggtg
aaaagaaaaa 2100ccaccctggc gcccaatacg caaaccgcct ctccccgcgc gttggccgat
tcattaatgc 2160agctggcacg acaggtttcc cgactggaaa gcgggcagtg agcgcaacgc
aattaatgta 2220agttagcgcg aattgatctg gtttgacagc ttatcatcga ctgcacggtg
caccaatgct 2280tctggcgtca ggcagccatc ggaagctgtg gtatggctgt gcaggtcgta
aatcactgca 2340taattcgtgt cgctcaaggc gcactcccgt tctggataat gttttttgcg
ccgacatcat 2400aacggttctg gcaaatattc tgaaatgagc tgttgacaat taatcatccg
gctcgtataa 2460tgtgtggaat tgtgagcgga taacaatttc acacaggaaa cagaccatgg
cgaatggttc 2520tgcagtctct ttgaaatctg gaagcttgaa tacgcaggag gatactagtt
ccagtccccc 2580tcctcggacg tttttgcatc agctgcccga ctggagtcgc ttgctgaccg
ccatcacaac 2640agtgtttgtc aaatctaaac gaccggacat gcatgatcgg aaaagcaagc
gcccagatat 2700gctcgtcgat agtttcggac tcgagtctac tgtgcaggac ggcctggtgt
tccgtcaatc 2760cttcagcatc cgaagctacg agattggtac ggaccgtacc gctagcattg
aaacgttgat 2820gaaccatctc caagaaacca gtttgaacca ctgcaagagc acgggcatcc
tgctggatgg 2880ttttggccgc acattggaaa tgtgcaagcg agacttgatc tgggtggtca
ttaaaatgca 2940gatcaaagtt aatcgatacc cggcctgggg agataccgtt gagatcaata
cacgcttttc 3000ccgtttgggc aaaattggca tgggtcgcga ttggctgatc tccgactgca
acaccggtga 3060gatcttggtc cgtgcaacgt ctgcgtacgc gatgatgaat caaaagacgc
gtcggttgag 3120taagctgccg tatgaagttc accaagaaat tgttccattg ttcgttgata
gtcccgttat 3180cgaggattct gacctcaaag tccacaagtt taaagtcaag actggcgatt
ccatccagaa 3240gggcctgacg ccaggttgga acgatctgga tgtgaaccaa cacgttagca
acgttaagta 3300tatcggctgg atcttggaaa gtatgcctac ggaagtcctg gagacgcagg
aactctgcag 3360tctcgctctg gagtaccgcc gtgagtgtgg ccgtgattcc gtgctcgagt
ccgtcactgc 3420gatggaccct agcaaagtgg gtgttcgcag tcaataccaa cacctcttgc
ggctcgaaga 3480tgggaccgcc attgtgaacg gcgcgaccga atggcgcccc aaaaatgccg
gcgctaacgg 3540ggcaattagt accgggaaaa cctccaatgg aaacagcgtc agctaatgat
aggatccgag 3600ctcgagatct gcagctggta ccatatggga attcgaagct tggctgtttt
ggcggatgag 3660agaagatttt cagcctgata cagattaaat cagaacgcag aagcggtctg
ataaaacaga 3720atttgcctgg cggcagtagc gcggtggtcc cacctgaccc catgccgaac
tcagaagtga 3780aacgccgtag cgccgatggt agtgtggggt ctccccatgc gagagtaggg
aactgccagg 3840catcaaataa aacgaaaggc tcagtcgaaa gactgggcct ttcgttttat
ctgttgtttg 3900tcggtgaacg ctctcctgag taggacaaat ccgccgggag cggatttgaa
cgttgcgaag 3960caacggcccg gagggtggcg ggcaggacgc ccgccataaa ctgccaggca
tcaaattaag 4020cagaaggcca tcctgacgga tggccttttt gcgtttctac aaactctttt
gtttattttt 4080ctaaatacat tcaaatatgt atccgctcat ggggatccga ctagtaggcc
tcgaggaatt 4140cacgcgtacg tagatctccg cggccgccga tcctctagta tgcttgtaaa
ccgttttgtg 4200aaaaaatttt taaaataaaa aaggggacct ctagggtccc caattaatta
gtaatataat 4260ctattaaagg tcattcaaaa ggtcatccac cggatcagct tagtaaagcc
ctcgctagat 4320tttaatgcgg atgttgcgat tacttcgcca actattgcga taacaagaaa
aagccagcct 4380ttcatgatat atctcccaat ttgtgtaggg cttattatgc acgcttaaaa
ataataaaag 4440cagacttgac ctgatagttt ggctgtgagc aattatgtgc ttagtgcatc
taacgcttga 4500gttaagccgc gccgcgaagc ggcgtcggct tgaacgaatt gttagacatt
atttgccgac 4560taccttggtg atctcgcctt tcacgtagtg gacaaattct tccaactgat
ctgcgcgcga 4620ggccaagcga tcttcttctt gtccaagata agcctgtcta gcttcaagta
tgacgggctg 4680atactgggcc ggcaggcgct ccattgccca gtcggcagcg acatccttcg
gcgcgatttt 4740gccggttact gcgctgtacc aaatgcggga caacgtaagc actacatttc
gctcatcgcc 4800agcccagtcg ggcggcgagt tccatagcgt taaggtttca tttagcgcct
caaatagatc 4860ctgttcagga accggatcaa agagttcctc cgccgctgga cctaccaagg
caacgctatg 4920ttctcttgct tttgtcagca agatagccag atcaatgtcg atcgtggctg
gctcgaagat 4980acctgcaaga atgtcattgc gctgccattc tccaaattgc agttcgcgct
tagctggata 5040acgccacgga atgatgtcgt cgtgcacaac aatggtgact tctacagcgc
ggagaatctc 5100gctctctcca ggggaagccg aagtttccaa aaggtcgttg atcaaagctc
gccgcgttgt 5160ttcatcaagc cttacggtca ccgtaaccag caaatcaata tcactgtgtg
gcttcaggcc 5220gccatccact gcggagccgt acaaatgtac ggccagcaac gtcggttcga
gatggcgctc 5280gatgacgcca actacctctg atagttgagt cgatacttcg gcgatcaccg
cttccctcat 5340gatgtttaac tttgttttag ggcgactgcc ctgctgcgta acatcgttgc
tgctccataa 5400catcaaacat cgacccacgg cgtaacgcgc ttgctgcttg gatgcccgag
gcatagactg 5460taccccaaaa aaacagtcat aacaagccat gaaaaccgcc actgcgccgt
taccaccgct 5520gcgttcggtc aaggttctgg accagttgcg tgagcgcata cgctacttgc
attacagctt 5580acgaaccgaa caggcttatg tccactgggt tcgtgccttc atccgtttcc
acggtgtgcg 5640tcacccggca accttgggca gcagcgaagt cgaggcattt ctgtcctggc
tggcgaacga 5700gcgcaaggtt tcggtctcca cgcatcgtca ggcattggcg gccttgctgt
tcttctacgg 5760caaggtgctg tgcacggatc tgccctggct tcaggagatc ggaagacctc
ggccgtcgcg 5820gcgcttgccg gtggtgctga ccccggatga agtggttcgc atcctcggtt
ttctggaagg 5880cgagcatcgt ttgttcgccc agcttctgta tggaacgggc atgcggatca
gtgagggttt 5940gcaactgcgg gtcaaggatc tggatttcga tcacggcacg atcatcgtgc
gggagggcaa 6000gggctccaag gatcgggcct tgatgttacc cgagagcttg gcacccagcc
tgcgcgagca 6060ggggaattga tccggtggat gaccttttga atgaccttta atagattata
ttactaatta 6120attggggacc ctagaggtcc ccttttttat tttaaaaatt ttttcacaaa
acggtttaca 6180agcataaagc tctagagtcg acctgcaggc atgcaagctt cgagtccctg
ctcgtcacgc 6240tttcaggcac cgtgccagat atcgacgtgg agtcgatcac tgtgattggc
gaaggggaag 6300gcagcgctac ccaaatcgct agcttgctgg agaagctgaa acaaaccacg
ggcattgatc 6360tggcgaaatc cctaccgggt caatccgact cgcccgctgc gaagtcctaa
gagatagcga 6420tgtgaccgcg atcgcttgtc aagaatccca gtgatcccga accataggaa
ggcaagctca 6480atgcttgcct cgtcttgagg actatctaga tgtctgtgga acgcacattt
attgccatca 6540agcccgatgg cgttcagcgg ggtttggtcg gtacgatcat cggccgcttt
gagcaaaaag 6600gcttcaaact ggtgggccta aagcagctga agcccagtcg cgagctggcc
gaacagcact 6660atgctgtcca ccgcgagcgc cccttcttca atggcctcgt cgagttcatc
acctctgggc 6720cgatcgtggc gatcgtcttg gaaggcgaag gcgttgtggc ggctgctcgc
aagttgatcg 6780gcgctaccaa tccgctgacg gcagaaccgg gcaccatccg tggtgatttt
ggtgtcaata 6840ttggccgcaa catcatccat ggctcggatg caatcgaaac agcacaacag
gaaattgctc 6900tctggtttag cccagcagag ctaagtgatt ggacccccac gattcaaccc
tggctgtacg 6960aataaggtct gcattccttc agagagacat tgccatgccc gtgctgcgat
cgcccttcca 7020agctgccttg ccccgctgtt tcgggctggc agccctggcg ttggggctgg
cgaccgcttg 7080ccaagaaagc agcgctccgc cggctgccgg atc
7113
SEQ ID NO 11; length 7173; DNA; Artificial Sequence; Synthetic construct
cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat
60cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga
120acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt
180ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt
240ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc
300gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa
360gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct
420ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta
480actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg
540gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc
600ctaactacgg ctacactaga aggacagtat ttggtatctg cgctctgctg aagccagtta
660ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg
720gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt
780tgatcttttc tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg
840tcatgagatt atcaaaaagg atcttcacct agatcctttt aaattaaaaa tgaagtttta
900aatcaatcta aagtatatat gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg
960aggcacctat ctcagcgatc tgtctatttc gttcatccat agttgcctga ctccccgtcg
1020tgtagataac tacgatacgg gagggcttac catctggccc cagtgctgca atgataccgc
1080gagacccacg ctcaccggct ccagatttat cagcaataaa ccagccagcc ggaagggccg
1140agcgcagaag tggtcctgca actttatccg cctccatcca gtctattaat tgttgccggg
1200aagctagagt aagtagttcg ccagttaata gtttgcgcaa cgttgttgcc attgctacag
1260gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt cagctccggt tcccaacgat
1320caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc
1380cgatcgttgt cagaagtaag ttggccgcag tgttatcact catggttatg gcagcactgc
1440ataattctct tactgtcatg ccatccgtaa gatgcttttc tgtgactggt gagtactcaa
1500ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac
1560gggataatac cgcgccacat agcagaactt taaaagtgct catcattgga aaacgttctt
1620cggggcgaaa actctcaagg atcttaccgc tgttgagatc cagttcgatg taacccactc
1680gtgcacccaa ctgatcttca gcatctttta ctttcaccag cgtttctggg tgagcaaaaa
1740caggaaggca aaatgccgca aaaaagggaa taagggcgac acggaaatgt tgaatactca
1800tactcttcct ttttcaatat tattgaagca tttatcaggg ttattgtctc atgagcggat
1860acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca tttccccgaa
1920aagtgccacc tgacgtctaa gaaaccatta ttatcatgac attaacctat aaaaataggc
1980gtatcacgag gccctttcgt ctcgcgcgtt tcggtgatga cggtgaaaac ctctgacaca
2040tgcagctccc ggagacggtc acagcttgtc tgtaagcgga tgccgggagc agacaagccc
2100gtcagggcgc gtcagcgggt gttggcgggt gtcggggctg gcttaactat gcggcatcag
2160agcagattgt actgagagtg caccataaaa ttgtaaacgt taatattttg ttaaaattcg
2220cgttaaattt ttgttaaatc agctcatttt ttaaccaata ggccgaaatc ggcaaaatcc
2280cttataaatc aaaagaatag cccgagatag ggttgagtgt tgttccagtt tggaacaaga
2340gtccactatt aaagaacgtg gactccaacg tcaaagggcg aaaaaccgtc tatcagggcg
2400atggcccact acgtgaacca tcacccaaat caagtttttt ggggtcgagg tgccgtaaag
2460cactaaatcg gaaccctaaa gggagccccc gatttagagc ttgacgggga aagccggcga
2520acgtggcgag aaaggaaggg aagaaagcga aaggagcggg cgctagggcg ctggcaagtg
2580tagcggtcac gctgcgcgta accaccacac ccgccgcgct taatgcgccg ctacagggcg
2640cgtactatgg ttgctttgac gtatgcggtg tgaaataccg cacagatgcg taaggagaaa
2700ataccgcatc aggcgccatt cgccattcag gctgcgcaac tgttgggaag ggcgatcggt
2760gcgggcctct tcgctattac gccagctggc gaaaggggga tgtgctgcaa ggcgattaag
2820ttgggtaacg ccagggtttt cccagtcacg acgttgtaaa acgacggcca gtgccaagct
2880attgctgaag cggaatccct ggttaatgcc gccgccgatg ccaattgcat tctccaagtg
2940gggcacattg aacgcttcaa cccggcattt ttagagctaa ccaaaattct caaaacggaa
3000gagttattgg cgatcgaagc ccatcgcatg agtccctatt cccagcgggc caatgatgtc
3060tccgtggtat tggatttgat gatccatgac attgacctgt tgctggaatt ggtgggttcg
3120gaagtggtta aactgtccgc cagtggcagt cgggcttctg ggtcaggata tttggattat
3180gtcaccgcta cgttaggctt ctcctccggc attgtggcca ccctcaccgc cagtaaggtc
3240acccatcgta aaattcgttc catcgccgcc cactgcaaaa attccctcac cgaagcggat
3300tttctcaata acgaaatttt gatccatcgc caaaccaccg ctgattggag cgcggactat
3360ggccaggtat tgtatcgcca ggatggtcta atcgaaaagg tttacaccag taatattgaa
3420cctctccacg ctgaattaga acattttatt cattgtgtta ggggaggtga tcaaccctca
3480gtggggggag aacaggccct caaggccctg aagttagcca gtttaattga agaaatggcc
3540ctggacagtc aggaatggca tgggggggaa gttgtgacag aatatcaaga tgccaccctg
3600gccctcagtg cgagtgttta aatcaactta attaatgcaa ttattgcgag ttcaaactcg
3660ataactttgt gaaatattac tgttgaatta atctatgact attcaataca cccccctagc
3720cgatcgcctg ttggcctacc tcgccgccga tcgcctaaat ctcagcgcca agagtagttc
3780cctcaacacc agtattctgc tcagcagtga cctattcaat caggaagggg gaattgtaac
3840agccaactat ggctttgatg gttatatggt accatatgca tgcgagctca gatctaccag
3900gttgtccttg gcgcagcgct tcccacgctg agagggtgta gcccgtcacg ggtaaccgat
3960atcgtcgaca ggcctctaga cccgggctcg agctagcaag cttggccgga tccggccgga
4020tccggagttt gtagaaacgc aaaaaggcca tccgtcagga tggccttctg cttaatttga
4080tgcctggcag tttatggcgg gcgtcctgcc cgccaccctc cgggccgttg ctccgcaacg
4140ttcaaatccg ctcccggcgg atttgtccta ctcaggagag cgttcaccga caaacaacag
4200ataaaacgaa aggcccagtc tttcgactga gcctttcgtt ttatttgatg cctggcagtt
4260ccctactctc gcatggggag accccacact accatcggcg ctacggcgtt tcacttctga
4320gttcggcatg gggtcaggtg ggaccaccgc gctactgccg ccaggcaaat tctgttttat
4380tgagccgtta ccccacctac tagctaatcc catctgggca catccgatgg caagaggccc
4440gaaggtcccc ctctttggtc ttgcgacgtt atgcggtatt agctaccgtt tccagtagtt
4500atccccctcc atcaggcagt ttcccagaca ttactcaccc gtccgccact cgtcagcaaa
4560gaagcaagct tagatcgacc tgcagggggg ggggggaaag ccacgttgtg tctcaaaatc
4620tctgatgtta cattgcacaa gataaaaata tatcatcatg aacaataaaa ctgtctgctt
4680acataaacag taatacaagg ggtgttatga gccatattca acgggaaacg tcttgctcga
4740ggccgcgatt aaattccaac atggatgctg atttatatgg gtataaatgg gctcgcgata
4800atgtcgggca atcaggtgcg acaatctatc gattgtatgg gaagcccgat gcgccagagt
4860tgtttctgaa acatggcaaa ggtagcgttg ccaatgatgt tacagatgag atggtcagac
4920taaactggct gacggaattt atgcctcttc cgaccatcaa gcattttatc cgtactcctg
4980atgatgcatg gttactcacc actgcgatcc ccgggaaaac agcattccag gtattagaag
5040aatatcctga ttcaggtgaa aatattgttg atgcgctggc agtgttcctg cgccggttgc
5100attcgattcc tgtttgtaat tgtcctttta acagcgatcg cgtatttcgt ctcgctcagg
5160cgcaatcacg aatgaataac ggtttggttg atgcgagtga ttttgatgac gagcgtaatg
5220gctggcctgt tgaacaagtc tggaaagaaa tgcataagct tttgccattc tcaccggatt
5280cagtcgtcac tcatggtgat ttctcacttg ataaccttat ttttgacgag gggaaattaa
5340taggttgtat tgatgttgga cgagtcggaa tcgcagaccg ataccaggat cttgccatcc
5400tatggaactg cctcggtgag ttttctcctt cattacagaa acggcttttt caaaaatatg
5460gtattgataa tcctgatatg aataaattgc agtttcattt gatgctcgat gagtttttct
5520aatcagaatt ggttaattgg ttgtaacact ggcagagcat tacgctgact tgacgggacg
5580gcggctttgt tgaataaatc gaacttttgc tgagttgaag gatcagatca cgcatcttcc
5640cgacaacgca gaccgttccg tggcaaagca aaagttcaaa atcaccaact ggtccaccta
5700caacaaagct ctcatcaacc gtggctccct cactttctgg ctggatgatg gggcgattca
5760ggcctggtat gagtcagcaa caccttcttc acgaggcaga cctcagcgcc cccccccccc
5820tgcaggtcga tctggtaacc ccagcgcggt tgctaccaag tagtgacccg cttcgtgatg
5880caaaatccgc tgacgatatt cgggcgatcg ctgctgaatg ccatcgagca gtaacgtggc
5940gaattcggta ccggtatgga tggcaccgat gcggaatccc aacagattgc ctttgacaac
6000aatgtggcct ggaataacct gggggatttg tccaccacca cccaacgggc ctacacttcg
6060gctattagca cagacacagt gcagagtgtt tatggcgtta atctggaaaa aaacgataac
6120attcccattg tttttgcgtg gcccattttt cccaccaccc ttaatcccac agattttcag
6180gtaatgctta acacggggga aattgtcacc ccggtgatcg cctctttgat tcccaacagt
6240gaatacaacg aacggcaaac ggtagtaatt acgggcaatt ttggtaatcg tttaacccca
6300ggcacggagg gagcgattta tcccgtttcc gtaggcacag tgttggacag tactcctttg
6360gaaatggtgg gacccaacgg cccggtcagt gcggtgggta ttaccattga tagtctcaac
6420ccctacgtgg ccggcaatgg tcccaaaatt gtcgccgcta agttagaccg cttcagtgac
6480ctgggggaag gggctcccct ctggttagcc accaatcaaa ataacagtgg cggggattta
6540tatggagacc aagcccaatt tcgtttgcga atttacacca gcgccggttt ttcccccgat
6600ggcattgcca gtttactacc cacagaattt gaacggtatt ttcaactcca agcggaagat
6660attacgggac ggacagttat cctaacccaa actggtgttg attatgaaat tcccggcttt
6720ggtctggtgc aggtgttggg gctggcggat ttggccgggg ttcaggacag ctatgacctg
6780acttacatcg aagatcatga caactattac gacattatcc tcaaagggga cgaagccgca
6840gttcgccaaa ttaagagggt tgctttgccc tccgaagggg attattcggc ggtttataat
6900cccggtggcc ccggcaatga tccagagaat ggtcccccaa attcgtaatc atgtcatagc
6960tgtttcctgt gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca
7020taaagtgtaa agcctggggt gcctaatgag tgagctaact cacattaatt gcgttgcgct
7080cactgcccgc tttccagtcg ggaaacctgt cgtgccagct gcattaatga atcggccaac
7140gcgcggggag aggcggtttg cgtattgggc gct
7173
SEQ ID NO 12; length 7029; DNA; Artificial Sequence; Synthetic construct
attgctgaag
cggaatccct ggttaatgcc gccgccgatg ccaattgcat tctccaagtg 60gggcacattg
aacgcttcaa cccggcattt ttagagctaa ccaaaattct caaaacggaa 120gagttattgg
cgatcgaagc ccatcgcatg agtccctatt cccagcgggc caatgatgtc 180tccgtggtat
tggatttgat gatccatgac attgacctgt tgctggaatt ggtgggttcg 240gaagtggtta
aactgtccgc cagtggcagt cgggcttctg ggtcaggata tttggattat 300gtcaccgcta
cgttaggctt ctcctccggc attgtggcca ccctcaccgc cagtaaggtc 360acccatcgta
aaattcgttc catcgccgcc cactgcaaaa attccctcac cgaagcggat 420tttctcaata
acgaaatttt gatccatcgc caaaccaccg ctgattggag cgcggactat 480ggccaggtat
tgtatcgcca ggatggtcta atcgaaaagg tttacaccag taatattgaa 540cctctccacg
ctgaattaga acattttatt cattgtgtta ggggaggtga tcaaccctca 600gtggggggag
aacaggccct caaggccctg aagttagcca gtttaattga agaaatggcc 660ctggacagtc
aggaatggca tgggggggaa gttgtgacag aatatcaaga tgccaccctg 720gccctcagtg
cgagtgttta aatcaactta attaatgcaa ttattgcgag ttcaaactcg 780ataactttgt
gaaatattac tgttgaatta atctatgact attcaataca cccccctagc 840cgatcgcctg
ttggcctacc tcgccgccga tcgcctaaat ctcagcgcca agagtagttc 900cctcaacacc
agtattctgc tcagcagtga cctattcaat caggaagggg gaattgtaac 960agccaactat
ggctttgatg gttatatggt accatatggt gcactctcag tacaatctgc 1020tctgatgccg
catagttaag ccagtataca ctccgctatc gctacgtgac tgggtcatgg 1080ctgcgccccg
acacccgcca acacccgctg acgcgccctg acgggcttgt ctgctcccgg 1140catccgctta
cagacaagct gtgaccgtct ccgggagctg catgtgtcag aggttttcac 1200cgtcatcacc
gaaacgcgcg aggcagcaga tcaattcgcg cgcgaaggcg aagcggcatg 1260catttacgtt
gacaccatcg aatggtgcaa aacctttcgc ggtatggcat gatagcgccc 1320ggaagagagt
caattcaggg tggtgaatgt gaaaccagta acgttatacg atgtcgcaga 1380gtatgccggt
gtctcttatc agaccgtttc ccgcgtggtg aaccaggcca gccacgtttc 1440tgcgaaaacg
cgggaaaaag tggaagcggc gatggcggag ctgaattaca ttcccaaccg 1500cgtggcacaa
caactggcgg gcaaacagtc gttgctgatt ggcgttgcca cctccagtct 1560ggccctgcac
gcgccgtcgc aaattgtcgc ggcgattaaa tctcgcgccg atcaactggg 1620tgccagcgtg
gtggtgtcga tggtagaacg aagcggcgtc gaagcctgta aagcggcggt 1680gcacaatctt
ctcgcgcaac gcgtcagtgg gctgatcatt aactatccgc tggatgacca 1740ggatgccatt
gctgtggaag ctgcctgcac taatgttccg gcgttatttc ttgatgtctc 1800tgaccagaca
cccatcaaca gtattatttt ctcccatgaa gacggtacgc gactgggcgt 1860ggagcatctg
gtcgcattgg gtcaccagca aatcgcgctg ttagcgggcc cattaagttc 1920tgtctcggcg
cgtctgcgtc tggctggctg gcataaatat ctcactcgca atcaaattca 1980gccgatagcg
gaacgggaag gcgactggag tgccatgtcc ggttttcaac aaaccatgca 2040aatgctgaat
gagggcatcg ttcccactgc gatgctggtt gccaacgatc agatggcgct 2100gggcgcaatg
cgcgccatta ccgagtccgg gctgcgcgtt ggtgcggata tctcggtagt 2160gggatacgac
gataccgaag acagctcatg ttatatcccg ccgttaacca ccatcaaaca 2220ggattttcgc
ctgctggggc aaaccagcgt ggaccgcttg ctgcaactct ctcagggcca 2280ggcggtgaag
ggcaatcagc tgttgcccgt ctcactggtg aaaagaaaaa ccaccctggc 2340gcccaatacg
caaaccgcct ctccccgcgc gttggccgat tcattaatgc agctggcacg 2400acaggtttcc
cgactggaaa gcgggcagtg agcgcaacgc aattaatgta agttagcgcg 2460aattgatctg
gtttgacagc ttatcatcga ctgcacggtg caccaatgct tctggcgtca 2520ggcagccatc
ggaagctgtg gtatggctgt gcaggtcgta aatcactgca taattcgtgt 2580cgctcaaggc
gcactcccgt tctggataat gttttttgcg ccgacatcat aacggttctg 2640gcaaatattc
tgaaatgagc tgttgacaat taatcatccg gctcgtataa tgtgtggaat 2700tgtgagcgga
taacaatttc acacaggaaa cagcgccgct gagaaaaagc gaagcggcac 2760tgctctttaa
caatttatca gacaatctgt gtgggcactc gaccggaatt atcgattaac 2820tttattatta
aaaattaaag aggtatatat taatgtatcg attaaataag gaggaataaa 2880ccatggcgaa
tggttctgca gtctctttga aatctggaag cttgaatacg caggaggata 2940ctagttccag
tccccctcct cggacgtttt tgcatcagct gcccgactgg agtcgcttgc 3000tgaccgccat
cacaacagtg tttgtcaaat ctaaacgacc ggacatgcat gatcggaaaa 3060gcaagcgccc
agatatgctc gtcgatagtt tcggactcga gtctactgtg caggacggcc 3120tggtgttccg
tcaatccttc agcatccgaa gctacgagat tggtacggac cgtaccgcta 3180gcattgaaac
gttgatgaac catctccaag aaaccagttt gaaccactgc aagagcacgg 3240gcatcctgct
ggatggtttt ggccgcacat tggaaatgtg caagcgagac ttgatctggg 3300tggtcattaa
aatgcagatc aaagttaatc gatacccggc ctggggagat accgttgaga 3360tcaatacacg
cttttcccgt ttgggcaaaa ttggcatggg tcgcgattgg ctgatctccg 3420actgcaacac
cggtgagatc ttggtccgtg caacgtctgc gtacgcgatg atgaatcaaa 3480agacgcgtcg
gttgagtaag ctgccgtatg aagttcacca agaaattgtt ccattgttcg 3540ttgatagtcc
cgttatcgag gattctgacc tcaaagtcca caagtttaaa gtcaagactg 3600gcgattccat
ccagaagggc ctgacgccag gttggaacga tctggatgtg aaccaacacg 3660ttagcaacgt
taagtatatc ggctggatct tggaaagtat gcctacggaa gtcctggaga 3720cgcaggaact
ctgcagtctc gctctggagt accgccgtga gtgtggccgt gattccgtgc 3780tcgagtccgt
cactgcgatg gaccctagca aagtgggtgt tcgcagtcaa taccaacacc 3840tcttgcggct
cgaagatggg accgccattg tgaacggcgc gaccgaatgg cgccccaaaa 3900atgccggcgc
taacggggca attagtaccg ggaaaacctc caatggaaac agcgtcagct 3960aatgatagga
tccgagctca gatctaccag gttgtccttg gcgcagcgct tcccacgctg 4020agagggtgta
gcccgtcacg ggtaaccgat atcgtcgaca ggcctctaga cccgggctcg 4080agctagcaag
cttggccgga tccggccgga tccggagttt gtagaaacgc aaaaaggcca 4140tccgtcagga
tggccttctg cttaatttga tgcctggcag tttatggcgg gcgtcctgcc 4200cgccaccctc
cgggccgttg cttcgcaacg ttcaaatccg ctcccggcgg atttgtccta 4260ctcaggagag
cgttcaccga caaacaacag ataaaacgaa aggcccagtc tttcgactga 4320gcctttcgtt
ttatttgatg cctggcagtt ccctactctc gcatggggag accccacact 4380accatcggcg
ctacggcgtt tcacttctga gttcggcatg gggtcaggtg ggaccaccgc 4440gctactgccg
ccaggcaaat tctgttttat tgagccgtta ccccacctac tagctaatcc 4500catctgggca
catccgatgg caagaggccc gaaggtcccc ctctttggtc ttgcgacgtt 4560atgcggtatt
agctaccgtt tccagtagtt atccccctcc atcaggcagt ttcccagaca 4620ttactcaccc
gtccgccact cgtcagcaaa gaagcaagct tagatcgacc tgcagggggg 4680ggggggaaag
ccacgttgtg tctcaaaatc tctgatgtta cattgcacaa gataaaaata 4740tatcatcatg
aacaataaaa ctgtctgctt acataaacag taatacaagg ggtgttatga 4800gccatattca
acgggaaacg tcttgctcga ggccgcgatt aaattccaac atggatgctg 4860atttatatgg
gtataaatgg gctcgcgata atgtcgggca atcaggtgcg acaatctatc 4920gattgtatgg
gaagcccgat gcgccagagt tgtttctgaa acatggcaaa ggtagcgttg 4980ccaatgatgt
tacagatgag atggtcagac taaactggct gacggaattt atgcctcttc 5040cgaccatcaa
gcattttatc cgtactcctg atgatgcatg gttactcacc actgcgatcc 5100ccgggaaaac
agcattccag gtattagaag aatatcctga ttcaggtgaa aatattgttg 5160atgcgctggc
agtgttcctg cgccggttgc attcgattcc tgtttgtaat tgtcctttta 5220acagcgatcg
cgtatttcgt ctcgctcagg cgcaatcacg aatgaataac ggtttggttg 5280atgcgagtga
ttttgatgac gagcgtaatg gctggcctgt tgaacaagtc tggaaagaaa 5340tgcataagct
tttgccattc tcaccggatt cagtcgtcac tcatggtgat ttctcacttg 5400ataaccttat
ttttgacgag gggaaattaa taggttgtat tgatgttgga cgagtcggaa 5460tcgcagaccg
ataccaggat cttgccatcc tatggaactg cctcggtgag ttttctcctt 5520cattacagaa
acggcttttt caaaaatatg gtattgataa tcctgatatg aataaattgc 5580agtttcattt
gatgctcgat gagtttttct aatcagaatt ggttaattgg ttgtaacact 5640ggcagagcat
tacgctgact tgacgggacg gcggctttgt tgaataaatc gaacttttgc 5700tgagttgaag
gatcagatca cgcatcttcc cgacaacgca gaccgttccg tggcaaagca 5760aaagttcaaa
atcaccaact ggtccaccta caacaaagct ctcatcaacc gtggctccct 5820cactttctgg
ctggatgatg gggcgattca ggcctggtat gagtcagcaa caccttcttc 5880acgaggcaga
cctcagcgcc cccccccccc tgcaggtcga tctggtaacc ccagcgcggt 5940tgctaccaag
tagtgacccg cttcgtgatg caaaatccgc tgacgatatt cgggcgatcg 6000ctgctgaatg
ccatcgagca gtaacgtggc gaattcggta ccggtatgga tggcaccgat 6060gcggaatccc
aacagattgc ctttgacaac aatgtggcct ggaataacct gggggatttg 6120tccaccacca
cccaacgggc ctacacttcg gctattagca cagacacagt gcagagtgtt 6180tatggcgtta
atctggaaaa aaacgataac attcccattg tttttgcgtg gcccattttt 6240cccaccaccc
ttaatcccac agattttcag gtaatgctta acacggggga aattgtcacc 6300ccggtgatcg
cctctttgat tcccaacagt gaatacaacg aacggcaaac ggtagtaatt 6360acgggcaatt
ttggtaatcg tttaacccca ggcacggagg gagcgattta tcccgtttcc 6420gtaggcacag
tgttggacag tactcctttg gaaatggtgg gacccaacgg cccggtcagt 6480gcggtgggta
ttaccattga tagtctcaac ccctacgtgg ccggcaatgg tcccaaaatt 6540gtcgccgcta
agttagaccg cttcagtgac ctgggggaag gggctcccct ctggttagcc 6600accaatcaaa
ataacagtgg cggggattta tatggagacc aagcccaatt tcgtttgcga 6660atttacacca
gcgccggttt ttcccccgat ggcattgcca gtttactacc cacagaattt 6720gaacggtatt
ttcaactcca agcggaagat attacgggac ggacagttat cctaacccaa 6780actggtgttg
attatgaaat tcccggcttt ggtctggtgc aggtgttggg gctggcggat 6840ttggccgggg
ttcaggacag ctatgacctg acttacatcg aagatcatga caactattac 6900gacattatcc
tcaaagggga cgaagccgca gttcgccaaa ttaagagggt tgctttgccc 6960tccgaagggg
attattcggc ggtttataat cccggtggcc ccggcaatga tccagagaat 7020ggtccccca
7029
SEQ ID NO 13; length 6883; DNA; Artificial Sequence; Synthetic construct
attgctgaag
cggaatccct ggttaatgcc gccgccgatg ccaattgcat tctccaagtg 60gggcacattg
aacgcttcaa cccggcattt ttagagctaa ccaaaattct caaaacggaa 120gagttattgg
cgatcgaagc ccatcgcatg agtccctatt cccagcgggc caatgatgtc 180tccgtggtat
tggatttgat gatccatgac attgacctgt tgctggaatt ggtgggttcg 240gaagtggtta
aactgtccgc cagtggcagt cgggcttctg ggtcaggata tttggattat 300gtcaccgcta
cgttaggctt ctcctccggc attgtggcca ccctcaccgc cagtaaggtc 360acccatcgta
aaattcgttc catcgccgcc cactgcaaaa attccctcac cgaagcggat 420tttctcaata
acgaaatttt gatccatcgc caaaccaccg ctgattggag cgcggactat 480ggccaggtat
tgtatcgcca ggatggtcta atcgaaaagg tttacaccag taatattgaa 540cctctccacg
ctgaattaga acattttatt cattgtgtta ggggaggtga tcaaccctca 600gtggggggag
aacaggccct caaggccctg aagttagcca gtttaattga agaaatggcc 660ctggacagtc
aggaatggca tgggggggaa gttgtgacag aatatcaaga tgccaccctg 720gccctcagtg
cgagtgttta aatcaactta attaatgcaa ttattgcgag ttcaaactcg 780ataactttgt
gaaatattac tgttgaatta atctatgact attcaataca cccccctagc 840cgatcgcctg
ttggcctacc tcgccgccga tcgcctaaat ctcagcgcca agagtagttc 900cctcaacacc
agtattctgc tcagcagtga cctattcaat caggaagggg gaattgtaac 960agccaactat
ggctttgatg gttatatggt accatatggt gcactctcag tacaatctgc 1020tctgatgccg
catagttaag ccagtataca ctccgctatc gctacgtgac tgggtcatgg 1080ctgcgccccg
acacccgcca acacccgctg acgcgccctg acgggcttgt ctgctcccgg 1140catccgctta
cagacaagct gtgaccgtct ccgggagctg catgtgtcag aggttttcac 1200cgtcatcacc
gaaacgcgcg aggcagcaga tcaattcgcg cgcgaaggcg aagcggcatg 1260catttacgtt
gacaccatcg aatggtgcaa aacctttcgc ggtatggcat gatagcgccc 1320ggaagagagt
caattcaggg tggtgaatgt gaaaccagta acgttatacg atgtcgcaga 1380gtatgccggt
gtctcttatc agaccgtttc ccgcgtggtg aaccaggcca gccacgtttc 1440tgcgaaaacg
cgggaaaaag tggaagcggc gatggcggag ctgaattaca ttcccaaccg 1500cgtggcacaa
caactggcgg gcaaacagtc gttgctgatt ggcgttgcca cctccagtct 1560ggccctgcac
gcgccgtcgc aaattgtcgc ggcgattaaa tctcgcgccg atcaactggg 1620tgccagcgtg
gtggtgtcga tggtagaacg aagcggcgtc gaagcctgta aagcggcggt 1680gcacaatctt
ctcgcgcaac gcgtcagtgg gctgatcatt aactatccgc tggatgacca 1740ggatgccatt
gctgtggaag ctgcctgcac taatgttccg gcgttatttc ttgatgtctc 1800tgaccagaca
cccatcaaca gtattatttt ctcccatgaa gacggtacgc gactgggcgt 1860ggagcatctg
gtcgcattgg gtcaccagca aatcgcgctg ttagcgggcc cattaagttc 1920tgtctcggcg
cgtctgcgtc tggctggctg gcataaatat ctcactcgca atcaaattca 1980gccgatagcg
gaacgggaag gcgactggag tgccatgtcc ggttttcaac aaaccatgca 2040aatgctgaat
gagggcatcg ttcccactgc gatgctggtt gccaacgatc agatggcgct 2100gggcgcaatg
cgcgccatta ccgagtccgg gctgcgcgtt ggtgcggata tctcggtagt 2160gggatacgac
gataccgaag acagctcatg ttatatcccg ccgttaacca ccatcaaaca 2220ggattttcgc
ctgctggggc aaaccagcgt ggaccgcttg ctgcaactct ctcagggcca 2280ggcggtgaag
ggcaatcagc tgttgcccgt ctcactggtg aaaagaaaaa ccaccctggc 2340gcccaatacg
caaaccgcct ctccccgcgc gttggccgat tcattaatgc agctggcacg 2400acaggtttcc
cgactggaaa gcgggcagtg agcgcaacgc aattaatgta agttagcgcg 2460aattgatctg
gtttgacagc ttatcatcga ctgcacggtg caccaatgct tctggcgtca 2520ggcagccatc
ggaagctgtg gtatggctgt gcaggtcgta aatcactgca taattcgtgt 2580cgctcaaggc
gcactcccgt tctggataat gttttttgcg ccgacatcat aacggttctg 2640gcaaatattc
tgaaatgagc tgttgacaat taatcatccg gctcgtataa tgtgtggaat 2700tgtgagcgga
taacaatttc acacaggaaa cagaccatgg cgaatggttc tgcagtctct 2760ttgaaatctg
gaagcttgaa tacgcaggag gatactagtt ccagtccccc tcctcggacg 2820tttttgcatc
agctgcccga ctggagtcgc ttgctgaccg ccatcacaac agtgtttgtc 2880aaatctaaac
gaccggacat gcatgatcgg aaaagcaagc gcccagatat gctcgtcgat 2940agtttcggac
tcgagtctac tgtgcaggac ggcctggtgt tccgtcaatc cttcagcatc 3000cgaagctacg
agattggtac ggaccgtacc gctagcattg aaacgttgat gaaccatctc 3060caagaaacca
gtttgaacca ctgcaagagc acgggcatcc tgctggatgg ttttggccgc 3120acattggaaa
tgtgcaagcg agacttgatc tgggtggtca ttaaaatgca gatcaaagtt 3180aatcgatacc
cggcctgggg agataccgtt gagatcaata cacgcttttc ccgtttgggc 3240aaaattggca
tgggtcgcga ttggctgatc tccgactgca acaccggtga gatcttggtc 3300cgtgcaacgt
ctgcgtacgc gatgatgaat caaaagacgc gtcggttgag taagctgccg 3360tatgaagttc
accaagaaat tgttccattg ttcgttgata gtcccgttat cgaggattct 3420gacctcaaag
tccacaagtt taaagtcaag actggcgatt ccatccagaa gggcctgacg 3480ccaggttgga
acgatctgga tgtgaaccaa cacgttagca acgttaagta tatcggctgg 3540atcttggaaa
gtatgcctac ggaagtcctg gagacgcagg aactctgcag tctcgctctg 3600gagtaccgcc
gtgagtgtgg ccgtgattcc gtgctcgagt ccgtcactgc gatggaccct 3660agcaaagtgg
gtgttcgcag tcaataccaa cacctcttgc ggctcgaaga tgggaccgcc 3720attgtgaacg
gcgcgaccga atggcgcccc aaaaatgccg gcgctaacgg ggcaattagt 3780accgggaaaa
cctccaatgg aaacagcgtc agctaatgat aggatccgag ctcagatcta 3840ccaggttgtc
cttggcgcag cgcttcccac gctgagaggg tgtagcccgt cacgggtaac 3900cgatatcgtc
gacaggcctc tagacccggg ctcgagctag caagcttggc cggatccggc 3960cggatccgga
gtttgtagaa acgcaaaaag gccatccgtc aggatggcct tctgcttaat 4020ttgatgcctg
gcagtttatg gcgggcgtcc tgcccgccac cctccgggcc gttgcttcgc 4080aacgttcaaa
tccgctcccg gcggatttgt cctactcagg agagcgttca ccgacaaaca 4140acagataaaa
cgaaaggccc agtctttcga ctgagccttt cgttttattt gatgcctggc 4200agttccctac
tctcgcatgg ggagacccca cactaccatc ggcgctacgg cgtttcactt 4260ctgagttcgg
catggggtca ggtgggacca ccgcgctact gccgccaggc aaattctgtt 4320ttattgagcc
gttaccccac ctactagcta atcccatctg ggcacatccg atggcaagag 4380gcccgaaggt
ccccctcttt ggtcttgcga cgttatgcgg tattagctac cgtttccagt 4440agttatcccc
ctccatcagg cagtttccca gacattactc acccgtccgc cactcgtcag 4500caaagaagca
agcttagatc gacctgcagg gggggggggg aaagccacgt tgtgtctcaa 4560aatctctgat
gttacattgc acaagataaa aatatatcat catgaacaat aaaactgtct 4620gcttacataa
acagtaatac aaggggtgtt atgagccata ttcaacggga aacgtcttgc 4680tcgaggccgc
gattaaattc caacatggat gctgatttat atgggtataa atgggctcgc 4740gataatgtcg
ggcaatcagg tgcgacaatc tatcgattgt atgggaagcc cgatgcgcca 4800gagttgtttc
tgaaacatgg caaaggtagc gttgccaatg atgttacaga tgagatggtc 4860agactaaact
ggctgacgga atttatgcct cttccgacca tcaagcattt tatccgtact 4920cctgatgatg
catggttact caccactgcg atccccggga aaacagcatt ccaggtatta 4980gaagaatatc
ctgattcagg tgaaaatatt gttgatgcgc tggcagtgtt cctgcgccgg 5040ttgcattcga
ttcctgtttg taattgtcct tttaacagcg atcgcgtatt tcgtctcgct 5100caggcgcaat
cacgaatgaa taacggtttg gttgatgcga gtgattttga tgacgagcgt 5160aatggctggc
ctgttgaaca agtctggaaa gaaatgcata agcttttgcc attctcaccg 5220gattcagtcg
tcactcatgg tgatttctca cttgataacc ttatttttga cgaggggaaa 5280ttaataggtt
gtattgatgt tggacgagtc ggaatcgcag accgatacca ggatcttgcc 5340atcctatgga
actgcctcgg tgagttttct ccttcattac agaaacggct ttttcaaaaa 5400tatggtattg
ataatcctga tatgaataaa ttgcagtttc atttgatgct cgatgagttt 5460ttctaatcag
aattggttaa ttggttgtaa cactggcaga gcattacgct gacttgacgg 5520gacggcggct
ttgttgaata aatcgaactt ttgctgagtt gaaggatcag atcacgcatc 5580ttcccgacaa
cgcagaccgt tccgtggcaa agcaaaagtt caaaatcacc aactggtcca 5640cctacaacaa
agctctcatc aaccgtggct ccctcacttt ctggctggat gatggggcga 5700ttcaggcctg
gtatgagtca gcaacacctt cttcacgagg cagacctcag cgcccccccc 5760cccctgcagg
tcgatctggt aaccccagcg cggttgctac caagtagtga cccgcttcgt 5820gatgcaaaat
ccgctgacga tattcgggcg atcgctgctg aatgccatcg agcagtaacg 5880tggcgaattc
ggtaccggta tggatggcac cgatgcggaa tcccaacaga ttgcctttga 5940caacaatgtg
gcctggaata acctggggga tttgtccacc accacccaac gggcctacac 6000ttcggctatt
agcacagaca cagtgcagag tgtttatggc gttaatctgg aaaaaaacga 6060taacattccc
attgtttttg cgtggcccat ttttcccacc acccttaatc ccacagattt 6120tcaggtaatg
cttaacacgg gggaaattgt caccccggtg atcgcctctt tgattcccaa 6180cagtgaatac
aacgaacggc aaacggtagt aattacgggc aattttggta atcgtttaac 6240cccaggcacg
gagggagcga tttatcccgt ttccgtaggc acagtgttgg acagtactcc 6300tttggaaatg
gtgggaccca acggcccggt cagtgcggtg ggtattacca ttgatagtct 6360caacccctac
gtggccggca atggtcccaa aattgtcgcc gctaagttag accgcttcag 6420tgacctgggg
gaaggggctc ccctctggtt agccaccaat caaaataaca gtggcgggga 6480tttatatgga
gaccaagccc aatttcgttt gcgaatttac accagcgccg gtttttcccc 6540cgatggcatt
gccagtttac tacccacaga atttgaacgg tattttcaac tccaagcgga 6600agatattacg
ggacggacag ttatcctaac ccaaactggt gttgattatg aaattcccgg 6660ctttggtctg
gtgcaggtgt tggggctggc ggatttggcc ggggttcagg acagctatga 6720cctgacttac
atcgaagatc atgacaacta ttacgacatt atcctcaaag gggacgaagc 6780cgcagttcgc
caaattaaga gggttgcttt gccctccgaa ggggattatt cggcggttta 6840taatcccggt
ggccccggca atgatccaga gaatggtccc cca
6883
SEQ ID NO: 14 (20 nt, DNA, Artificial Sequence, Synthetic construct)
accctggccc tcagtgcgag 20
SEQ ID NO: 15 (21 nt, DNA, Artificial Sequence, Synthetic construct)
tgcttctttg ctgacgagtg g 21
SEQ ID NO: 16 (19 nt, DNA, Artificial Sequence, Primer)
gtgactggaa ccgccctcg 19
SEQ ID NO: 17 (44 nt, DNA, Artificial Sequence, Primer)
ccatcgagca gtaacgtggc cgatagtgac gctaaaccag gctg 44
SEQ ID NO: 18 (40 nt, DNA, Artificial Sequence, Primer)
cgagtggcgg acgggtgagt ctacgagggc gtgcagaagc 40
SEQ ID NO: 19 (21 nt, DNA, Artificial Sequence, Primer)
caccaagttg ccttcaccga c 21
SEQ ID NO: 20 (44 nt, DNA, Artificial Sequence, Primer)
cagcctggtt tagcgtcact atcggccacg ttactgctcg atgg 44
SEQ ID NO: 21 (40 nt, DNA, Artificial Sequence, Primer)
gcttctgcac gccctcgtag actcacccgt ccgccactcg 40
SEQ ID NO: 22 (2840 nt, DNA, Artificial Sequence, Synthetic construct)
gtgactggaa ccgccctcgc gcaaccccgc gccattacgc
cccacgaaca gcagcttttg 60gccaaactga aaagctatcg cgatatccaa agcttgtcgc
aaatttgggg acgtgctgcc 120agtcaatttg gatcgatgcc ggctttggtt gcaccccatg
ccaaaccagc gatcaccctc 180agttatcaag aattggcgat tcagatccaa gcgtttgcag
ccggactgct cgcgctggga 240gtgcctacct ccacagccga tgactttccg cctcgcttgg
cgcagtttgc ggataacagc 300ccccgctggt tgattgctga ccaaggcacg ttgctggcag
gggctgccaa tgcggtgcgc 360ggcgcccaag ctgaagtatc ggagctgctc tacgtcttag
aggacagcgg ttcgatcggc 420ttgattgtcg aagacgcggc gctgctgaag aaactacagc
ctggtttagc gtcactatcg 480gccacgttac tgctcgatgg cattcagcag cgatcgcccg
aatatcgtca gcggattttg 540catcacgaag cgggtcacta cttggtagca accgcgctgg
ggttaccaga tccgtcgatc 600atatcgtcaa ttattacctc cacggggaga gcctgagcaa
actggcctca ggcatttgag 660aagcacacgg tcacactgct tccggtagtc aataaaccgg
taaaccagca atagacataa 720gcggctattt aacgaccctg ccctgaaccg acgaccgggt
cgaatttgct ttcgaatttc 780tgccattcat ccgcttatta tcacttattc aggcgtagca
ccaggcgttt aagggcacca 840ataactgcct taaaaaaatt acgccccgcc ctgccactca
tcgcagtact gttgtaattc 900attaagcatt ctgccgacat ggaagccatc acaaacggca
tgatgaacct gaatcgccag 960cggcatcagc accttgtcgc cttgcgtata atatttgccc
atggtgaaaa cgggggcgaa 1020gaagttgtcc atattggcca cgtttaaatc aaaactggtg
aaactcaccc agggattggc 1080tgagacgaaa aacatattct caataaaccc tttagggaaa
taggccaggt tttcaccgta 1140acacgccaca tcttgcgaat atatgtgtag aaactgccgg
aaatcgtcgt ggtattcact 1200ccagagcgat gaaaacgttt cagtttgctc atggaaaacg
gtgtaacaag ggtgaacact 1260atcccatatc accagctcac cgtctttcat tgccatacgg
aattccggat gagcattcat 1320caggcgggca agaatgtgaa taaaggccgg ataaaacttg
tgcttatttt tctttacggt 1380ctttaaaaag gccgtaatat ccagctgaac ggtctggtta
taggtacatt gagcaactga 1440ctgaaatgcc tcaaaatgtt ctttacgatg ccattgggat
atatcaacgg tggtatatcc 1500agtgattttt ttctccattt tagcttcctt agctcctgaa
aatctcgata actcaaaaaa 1560tacgcccggt agtgatctta tttcattatg gtgaaagttg
gaacctctta cgtgccgatc 1620aacgtctcat tttcgccaaa agttggccca gggcttcccg
gtatcaacag ggacaccagg 1680atttatttat tctgcgaagt gatcttccgt cacaggtatt
tattcgaaga cgaaagggcc 1740tcgtgatacg cctattttta taggttaatg tcatgataat
aatggtttct tagacgtcag 1800gtggcacttt tcggggaaat gtgcgcggaa cccctatttg
tttatttttc taaatacatt 1860caaatatgta tccgctcatg agacaataac cctgataaat
gcttcaataa tattgaaaaa 1920ggaagagtat gagtattcaa catttccgtg tcgcccttat
tccctttttt gcggcatttt 1980gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt
aaaagatgct gaagatcagt 2040tgggtgcacg agtgggttac atcgaactgg atctcaacag
cggtaagatc cttgagagtt 2100ttcgccccga agaacgtttt ccaatgatga gcacttttaa
agttctgcta tgtggcgcgg 2160tattatcccg tgtgacggat ctaagcttgc ttctttgctg
acgagtggcg gacgggtgag 2220tctacgaggg cgtgcagaag cagtttcgcg agcaaccggc
gaagaaacgt cgcttgatcg 2280ataccttctt tggcttgagt caacgctatg ttttggcacg
gcgccgctgg caaggactgg 2340atttgctggc actgaaccaa tccccagccc agcgcctcgc
tgagggtgtc cggatgttgg 2400cgctagcacc gttgcataag ctgggcgatc gcctcgtcta
cggcaaagta cgagaagcca 2460cgggtggccg aattcggcag gtgatcagtg gcggtggctc
actggcactg cacctcgata 2520ccttcttcga aattgttggt gttgatttgc tggtgggtta
tggcttgaca gaaacctcac 2580cagtgctgac ggggcgacgg ccttggcaca acctacgggg
ttcggccggt cagccgattc 2640caggtacggc gattcggatc gtcgatcctg aaacgaagga
aaaccgaccc agtggcgatc 2700gcggcttggt gctggcgaaa gggccgcaaa tcatgcaggg
ctacttcaat aaacccgagg 2760cgaccgcgaa agcgatcgat gccgaaggtt ggtttgacac
cggcgactta ggctacatcg 2820tcggtgaagg caacttggtg
2840
SEQ ID NO: 23 (25 nt, DNA, Artificial Sequence, Primer)
ctcgagcccc cgtgctatga ctagc 25
SEQ ID NO: 24 (28 nt, DNA, Artificial Sequence, Primer)
ctcgagcccg gaacgttttt tgtacccc 28
SEQ ID NO: 25 (29 nt, DNA, Artificial Sequence, Primer)
caattggtca cacgggataa taccgcgcc 29
SEQ ID NO: 26 (36 nt, DNA, Artificial Sequence, Primer)
caattggtcg atcatatcgt caattattac ctccac 36
SEQ ID NO: 27 (7224 nt, DNA, Artificial Sequence, Synthetic construct)
cccccgtgct atgactagcg gcgatcgcca taccggccac
gaccatttgc attggatccc 60caacggcggc cacaacttcc atggcattga gatgcgggga
atgatgttct agactctgac 120gcaccaaagc caatttttgt tgatggttgc aatggggatg
actactgttc actttgcccc 180cagcgtcaat gcctagacct agcagtaccc ccagggctgt
ggtagtgccc cccaccacgc 240attcgcttag cactaagtaa ctttcggcat gttcctgggc
taactgtgcg ccccactgca 300aaccctgctg aaaaagatgc tccaccaggg ccaacggtaa
cgcttgccct gtggaaagac 360agcgggcggg ttgtccgtct agattgatga ctggcaccgc
tgggggaatg ggtaaaccag 420agttaaataa ataaaccgga gtatggaggg catccaccaa
cgctttggtg atgaacactg 480gggaaacccc agaaatgagg ggaggtaagg gataggttgc
ccctgccgta gttcccttga 540ttaaaaattc cgcatcggcg atcgccgtca attttcgatc
agcgggggtt ttacccgccg 600cagaaatgcc cggaattaaa ccagtttccg taaagcccaa
cacacagaca aacaccggtg 660gacagtggcc atggcgctca atccaggata aagcttggtc
agactgggta taaactgtca 720acatatttct gcaagagtgg gcccaattgg gaaaatcaac
ctcaaatcca ttggaatagc 780cttttttcaa ccgtaaaaat ccaactttct ctcttccctt
cttccttcca tctgattatg 840gttacgccaa ttaactacca ttccatccat tgcctggcgg
atatctgggc tatcaccgga 900gaaaattttg ccgatattgt ggccctcaac gatcgccata
gtcatccccc cgtaacttta 960acctatgccc aattggtcac acgggataat accgcgccac
atagcagaac tttaaaagtg 1020ctcatcattg gaaaacgttc ttcggggcga aaactctcaa
ggatcttacc gctgttgaga 1080tccagttcga tgtaacccac tcgtgcaccc aactgatctt
cagcatcttt tactttcacc 1140agcgtttctg ggtgagcaaa aacaggaagg caaaatgccg
caaaaaaggg aataagggcg 1200acacggaaat gttgaatact catactcttc ctttttcaat
attattgaag catttatcag 1260ggttattgtc tcatgagcgg atacatattt gaatgtattt
agaaaaataa acaaataggg 1320gttccgcgca catttccccg aaaagtgcca cctgacgtct
aagaaaccat tattatcatg 1380acattaacct ataaaaatag gcgtatcacg aggccctttc
gtcttcgaat aaatacctgt 1440gacggaagat cacttcgcag aataaataaa tcctggtgtc
cctgttgata ccgggaagcc 1500ctgggccaac ttttggcgaa aatgagacgt tgatcggcac
gtaagaggtt ccaactttca 1560ccataatgaa ataagatcac taccgggcgt attttttgag
ttatcgagat tttcaggagc 1620taaggaagct aaaatggaga aaaaaatcac tggatatacc
accgttgata tatcccaatg 1680gcatcgtaaa gaacattttg aggcatttca gtcagttgct
caatgtacct ataaccagac 1740cgttcagctg gatattacgg cctttttaaa gaccgtaaag
aaaaataagc acaagtttta 1800tccggccttt attcacattc ttgcccgcct gatgaatgct
catccggaat tccgtatggc 1860aatgaaagac ggtgagctgg tgatatggga tagtgttcac
ccttgttaca ccgttttcca 1920tgagcaaact gaaacgtttt catcgctctg gagtgaatac
cacgacgatt tccggcagtt 1980tctacacata tattcgcaag atgtggcgtg ttacggtgaa
aacctggcct atttccctaa 2040agggtttatt gagaatatgt ttttcgtctc agccaatccc
tgggtgagtt tcaccagttt 2100tgatttaaac gtggccaata tggacaactt cttcgccccc
gttttcacca tgggcaaata 2160ttatacgcaa ggcgacaagg tgctgatgcc gctggcgatt
caggttcatc atgccgtttg 2220tgatggcttc catgtcggca gaatgcttaa tgaattacaa
cagtactgcg atgagtggca 2280gggcggggcg taattttttt aaggcagtta ttggtgccct
taaacgcctg gtgctacgcc 2340tgaataagtg ataataagcg gatgaatggc agaaattcga
aagcaaattc gacccggtcg 2400tcggttcagg gcagggtcgt taaatagccg cttatgtcta
ttgctggttt accggtttat 2460tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc
ctgaggccag tttgctcagg 2520ctctccccgt ggaggtaata attgacgata tgatcgacca
attgcgggaa gaaattacag 2580cttttgccgc tggcctacag agtttaggag ttacccccca
tcaacacctg gccattttcg 2640ccgacaacag cccccggtgg tttatcgccg atcaaggcag
tatgttggct ggagccgtca 2700acgccgtccg ttctgcccaa gcagagcgcc aggaattact
ctacatccta gaagacagca 2760acagccgtac tttaatcgca gaaaatcggc aaaccctaag
caaattggcc ctagatggcg 2820aaaccattga cctgaaacta atcatcctcc tcaccgatga
agaagtggca gaggacagcg 2880ccattcccca atataacttt gcccaggtca tggccctagg
ggccggcaaa atccccactc 2940ccgttccccg ccaggaagaa gatttagcca ccctgatcta
cacctccggc accacaggac 3000aacccaaagg ggtgatgctc agccacggta atttattgca
ccaagtacgg gaattggatt 3060cggtgattat tccccgcccc ggcgatcagg tgttgagcat
tttgccctgt tggcactccc 3120tagaaagaag cgccgaatat tttcttcttt cccggggctg
cacgatgaac tacaccagca 3180ttcgccattt caagggggat gtgaaggaca ttaaacccca
tcacattgtc ggtgtgcccc 3240ggctgtggga atccctctac gaaggggtac aaaaaacgtt
ccgggctaag ggcgaattct 3300gcagatatcc atcacactgg cggccgctcg agcatgcatc
tagagggccc aattcgccct 3360atagtgagtc gtattacaat tcactggccg tcgttttaca
acgtcgtgac tgggaaaacc 3420ctggcgttac ccaacttaat cgccttgcag cacatccccc
tttcgccagc tggcgtaata 3480gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg
cagcctgaat ggcgaatgga 3540cgcgccctgt agcggcgcat taagcgcggc gggtgtggtg
gttacgcgca gcgtgaccgc 3600tacacttgcc agcgccctag cgcccgctcc tttcgctttc
ttcccttcct ttctcgccac 3660gttcgccggc tttccccgtc aagctctaaa tcgggggctc
cctttagggt tccgatttag 3720tgctttacgg cacctcgacc ccaaaaaact tgattagggt
gatggttcac gtagtgggcc 3780atcgccctga tagacggttt ttcgcccttt gacgttggag
tccacgttct ttaatagtgg 3840actcttgttc caaactggaa caacactcaa ccctatctcg
gtctattctt ttgatttata 3900agggattttg ccgatttcgg cctattggtt aaaaaatgag
ctgatttaac aaaaatttaa 3960cgcgaatttt aacaaaattc agggcgcaag ggctgctaaa
ggaagcggaa cacgtagaaa 4020gccagtccgc agaaacggtg ctgaccccgg atgaatgtca
gctactgggc tatctggaca 4080agggaaaacg caagcgcaaa gagaaagcag gtagcttgca
gtgggcttac atggcgatag 4140ctagactggg cggttttatg gacagcaagc gaaccggaat
tgccagctgg ggcgccctct 4200ggtaaggttg ggaagccctg caaagtaaac tggatggctt
tcttgccgcc aaggatctga 4260tggcgcaggg gatcaagatc tgatcaagag acaggatgag
gatcgtttcg catgattgaa 4320caagatggat tgcacgcagg ttctccggcc gcttgggtgg
agaggctatt cggctatgac 4380tgggcacaac agacaatcgg ctgctctgat gccgccgtgt
tccggctgtc agcgcagggg 4440cgcccggttc tttttgtcaa gaccgacctg tccggtgccc
tgaatgaact gcaggacgag 4500gcagcgcggc tatcgtggct ggccacgacg ggcgttcctt
gcgcagctgt gctcgacgtt 4560gtcactgaag cgggaaggga ctggctgcta ttgggcgaag
tgccggggca ggatctcctg 4620tcatcccacc ttgctcctgc cgagaaagta tccatcatgg
ctgatgcaat gcggcggctg 4680catacgcttg atccggctac ctgcccattc gaccaccaag
cgaaacatcg catcgagcga 4740gcacgtactc ggatggaagc cggtcttgtc gatcaggatg
atctggacga agagcatcag 4800gggctcgcgc cagccgaact gttcgccagg ctcaaggcgc
gcatgcccga cggcgaggat 4860ctcgtcgtga cccatggcga tgcctgcttg ccgaatatca
tggtggaaaa tggccgcttt 4920tctggattca tcgactgtgg ccggctgggt gtggcggacc
gctatcagga catagcgttg 4980gctacccgtg atattgctga agagcttggc ggcgaatggg
ctgaccgctt cctcgtgctt 5040tacggtatcg ccgctcccga ttcgcagcgc atcgccttct
atcgccttct tgacgagttc 5100ttctgaattg aaaaaggaag agtatgagta ttcaacattt
ccgtgtcgcc cttattccct 5160tttttgcggc attttgcctt cctgtttttg ctcacccaga
aacgctggtg aaagtaaaag 5220atgctgaaga tcagttgggt gcacgagtgg gttacatcga
actggatctc aacagcggta 5280agatccttga gagttttcgc cccgaagaac gttttccaat
gatgagcact tttaaagttc 5340tgctatgtgg cgcggtatta tcccgtattg acgccgggca
agagcaactc ggtcgccgca 5400tacactattc tcagaatgac ttggttgagt actcaccagt
cacagaaaag catcttacgg 5460atggcatgac agtaagagaa ttatgcagtg ctgccataac
catgagtgat aacactgcgg 5520ccaacttact tctgacaacg atcggaggac cgaaggagct
aaccgctttt ttgcacaaca 5580tgggggatca tgtaactcgc cttgatcgtt gggaaccgga
gctgaatgaa gccataccaa 5640acgacgagcg tgacaccacg atgcctgtag caatggcaac
aacgttgcgc aaactattaa 5700ctggcgaact acttactcta gcttcccggc aacaattaat
agactggatg gaggcggata 5760aagttgcagg accacttctg cgctcggccc ttccggctgg
ctggtttatt gctgataaat 5820ctggagccgg tgagcgtggg tctcgcggta tcattgcagc
actggggcca gatggtaagc 5880cctcccgtat cgtagttatc tacacgacgg ggagtcaggc
aactatggat gaacgaaata 5940gacagatcgc tgagataggt gcctcactga ttaagcattg
gtaactgtca gaccaagttt 6000actcatatat actttagatt gatttaaaac ttcattttta
atttaaaagg atctaggtga 6060agatcctttt tgataatctc atgaccaaaa tcccttaacg
tgagttttcg ttccactgag 6120cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga
tccttttttt ctgcgcgtaa 6180tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt
ggtttgtttg ccggatcaag 6240agctaccaac tctttttccg aaggtaactg gcttcagcag
agcgcagata ccaaatactg 6300ttcttctagt gtagccgtag ttaggccacc acttcaagaa
ctctgtagca ccgcctacat 6360acctcgctct gctaatcctg ttaccagtgg ctgctgccag
tggcgataag tcgtgtctta 6420ccgggttgga ctcaagacga tagttaccgg ataaggcgca
gcggtcgggc tgaacggggg 6480gttcgtgcac acagcccagc ttggagcgaa cgacctacac
cgaactgaga tacctacagc 6540gtgagctatg agaaagcgcc acgcttcccg aagggagaaa
ggcggacagg tatccggtaa 6600gcggcagggt cggaacagga gagcgcacga gggagcttcc
agggggaaac gcctggtatc 6660tttatagtcc tgtcgggttt cgccacctct gacttgagcg
tcgatttttg tgatgctcgt 6720caggggggcg gagcctatgg aaaaacgcca gcaacgcggc
ctttttacgg ttcctggcct 6780tttgctggcc ttttgctcac atgttctttc ctgcgttatc
ccctgattct gtggataacc 6840gtattaccgc ctttgagtga gctgataccg ctcgccgcag
ccgaacgacc gagcgcagcg 6900agtcagtgag cgaggaagcg gaagagcgcc caatacgcaa
accgcctctc cccgcgcgtt 6960ggccgattca ttaatgcagc tggcacgaca ggtttcccga
ctggaaagcg ggcagtgagc 7020gcaacgcaat taatgtgagt tagctcactc attaggcacc
ccaggcttta cactttatgc 7080ttccggctcg tatgttgtgt ggaattgtga gcggataaca
atttcacaca ggaaacagct 7140atgaccatga ttacgccaag cttggtaccg agctcggatc
cactagtaac ggccgccagt 7200gtgctggaat tcgcccttct cgag
7224
SEQ ID NO: 28 (79 nt, DNA, Artificial Sequence, Synthetic construct)
gatccgctgt tgacccaaca gcatgagtcg ttatccaagg ggagcttcgg ctcccttttt 60
tcatgcgcgg atgcggtga 79
SEQ ID NO: 29 (1503 nt, DNA, Artificial Sequence, Synthetic construct)
ggatccacta gtcctgaggt
gttgacaatt aatcatccgg ctcgtataat gtgtggaatt 60gtgagcggat aacaatttca
cacaggaaac agaccatggc cgtcgcactg caaccagctc 120aagaagtcgc aactaagaaa
aagcctgcaa tcaaacagcg gcgcgtggtg gttaccggca 180tgggtgtggt gactcccctc
gggcatgaac cggatgtgtt ttacaacaat ctcctggatg 240gcgtgagcgg cattagtgag
atcgagaatt ttgactcgac gcagtttccc actcgcattg 300ccggcgaaat caagagtttc
agcaccgacg gctgggtcgc gcccaaattg agcaaacgga 360tggataaatt gatgctgtat
ctgctcaccg caggcaagaa agcgctggcc gatgcgggca 420tcacggatga tgtgatgaaa
gagctggata aacgcaaatg tggagttctg attggcagtg 480gcatgggcgg catgaagctg
ttctacgatg cgctcgaagc cctgaagatt tcgtatcgaa 540agatgaaccc attctgtgtg
ccttttgcga ccacgaatat gggtagcgcc atgctggcta 600tggatttggg gtggatgggg
ccgaattata gtatttccac cgcgtgcgca acctcgaact 660tctgcatctt gaacgcggct
aaccacatta tccgtggtga agcagacatg atgctctgcg 720gcggctccga tgcggtcatt
atccctatcg gtttgggcgg ctttgttgct tgccgcgcct 780tgagccaacg caataacgac
ccaaccaagg catcgcgccc gtgggacagc aatcgcgatg 840gcttcgtcat gggcgaggga
gccggggtgc tgctgttgga ggagctggaa cacgcgaaaa 900agcgaggcgc gacaatctat
gctgagttct tgggagggtc ctttacatgc gatgcctacc 960acatgacgga gcctcaccca
gagggcgcag gcgtgatctt gtgtatcgag aaggcaatgg 1020ctcaggcagg agtctctcgc
gaggatgtta actacattaa tgctcacgca acgtccacgc 1080cggctggtga catcaaggaa
taccaagctc tcgcccattg tttcggccag aactcggagc 1140tgcgggtcaa tagtacaaag
tccatgatcg gtcatctgct gggtgctgcc ggtggcgtcg 1200aagctgtgac agtcattcaa
gccatccgca ccggctggat tcaccctaat ctgaacctgg 1260aagacccgga caaggccgtt
gacgcaaaat tcctcgtcgg accggagaaa gaacgtctca 1320acgttaaagt cggattgagc
aatagtttcg gttttggtgg ccataactct agtatcctgt 1380ttgcacccta taattgataa
tagatctgat ccgctgttga cccaacagca tgagtcgtta 1440tccaagggga gcttcggctc
ccttttttca tgcgcggatg cggtgagagc tcacgtgtct 1500aga
1503
SEQ ID NO: 30 (1224 nt, DNA, Artificial Sequence, Synthetic construct)
ggatccacta gtcctgaggt gttgacaatt aatcatccgg
ctcgtataat gtgtggaatt 60gtgagcggat aacaatttca cacaggaaac agaccatggc
aagccgtgtt gttggtaaag 120gttgtaaact cgttggatgt ggtagtgccg tcccgaagtt
ggaggtgagt aacgacgacc 180tcagtaagat cgtggatact tccgatgaat ggatttctgt
tcggacggga atccgcaacc 240ggcgggtgat tactggtaag gataagatga cggggctggc
ggtcgaggca gcccagaaag 300ccctggaaat ggctgaagtc gatgctgacg atgtggactt
gctcctgttg tgcacctcca 360ccccagatga tctctttgga agtgcgccgc aaatccaggc
ggcactcggc tgcaaaggaa 420accctctggc atttgatatt acagccgctt gtagcggctt
cgttctgggt ctggtgagtg 480cttcctgcta tatccgcggc ggcgggttca agaacgtcct
ggttatcggc gcggacgcac 540tgagccgcta cgtcgattgg actgaccgcg gcacatgcat
tctctttggt gacgccgctg 600gcgctgtgtt ggtccaggcg tgtgagagcg aggacgacgg
cgtcttcggg tttgatctgc 660atagcgatgg agagggttat cgccacctgc atactgggat
caaggcgaac gaggagttcg 720ggacgaacgg ttccgttgtg gattttccgc ccaagcgcag
cagctactct tccatccaaa 780tgaatgggaa agaagtgttc cgtttcgcct gccgcgtcgt
gccccagtct attgagatcg 840cactcgagaa cgcgggcctc acacgttcta gcattgattg
gctgctgctc caccaagcaa 900accaacgaat cttggatgcc gtcgcaacgc gtctggaaat
tcccgcagac cgcgtgatta 960gtaacttggc taattacggc aatacttctg ccgccagcat
tccgttggca ctggatgaag 1020ccgtgcgcag cggtaaggtc aaacccggtc agactatcgc
aacttcgggg tttggagcag 1080gcttgacatg gggcagcgcg atcattcgct ggaattaatg
atagatctga tccgctgttg 1140acccaacagc atgagtcgtt atccaagggg agcttcggct
cccttttttc atgcgcggat 1200gcggtgagag ctcacgtgtc taga
1224
SEQ ID NO: 31 (1613 nt, DNA, Artificial Sequence, Synthetic construct)
tgttgacaat taatcatccg gctcgtataa tgtgtggaat tgtgagcgga taacaatttc
60acacaggaaa cagcgccgct gagaaaaagc gaagcggcac tgctctttaa caatttatca
120gacaatctgt gtgggcactc gaccggaatt atcgattaac tttattatta aaaattaaag
180aggtatatat taatgtatcg attaaataag gaggaataaa ccatggccgt cgcactgcaa
240ccagctcaag aagtcgcaac taagaaaaag cctgcaatca aacagcggcg cgtggtggtt
300accggcatgg gtgtggtgac tcccctcggg catgaaccgg atgtgtttta caacaatctc
360ctggatggcg tgagcggcat tagtgagatc gagaattttg actcgacgca gtttcccact
420cgcattgccg gcgaaatcaa gagtttcagc accgacggct gggtcgcgcc caaattgagc
480aaacggatgg ataaattgat gctgtatctg ctcaccgcag gcaagaaagc gctggccgat
540gcgggcatca cggatgatgt gatgaaagag ctggataaac gcaaatgtgg agttctgatt
600ggcagtggca tgggcggcat gaagctgttc tacgatgcgc tcgaagccct gaagatttcg
660tatcgaaaga tgaacccatt ctgtgtgcct tttgcgacca cgaatatggg tagcgccatg
720ctggctatgg atttggggtg gatggggccg aattatagta tttccaccgc gtgcgcaacc
780tcgaacttct gcatcttgaa cgcggctaac cacattatcc gtggtgaagc agacatgatg
840ctctgcggcg gctccgatgc ggtcattatc cctatcggtt tgggcggctt tgttgcttgc
900cgcgccttga gccaacgcaa taacgaccca accaaggcat cgcgcccgtg ggacagcaat
960cgcgatggct tcgtcatggg cgagggagcc ggggtgctgc tgttggagga gctggaacac
1020gcgaaaaagc gaggcgcgac aatctatgct gagttcttgg gagggtcctt tacatgcgat
1080gcctaccaca tgacggagcc tcacccagag ggcgcaggcg tgatcttgtg tatcgagaag
1140gcaatggctc aggcaggagt ctctcgcgag gatgttaact acattaatgc tcacgcaacg
1200tccacgccgg ctggtgacat caaggaatac caagctctcg cccattgttt cggccagaac
1260tcggagctgc gggtcaatag tacaaagtcc atgatcggtc atctgctggg tgctgccggt
1320ggcgtcgaag ctgtgacagt cattcaagcc atccgcaccg gctggattca ccctaatctg
1380aacctggaag acccggacaa ggccgttgac gcaaaattcc tcgtcggacc ggagaaagaa
1440cgtctcaacg ttaaagtcgg attgagcaat agtttcggtt ttggtggcca taactctagt
1500atcctgtttg caccctataa ttgataatag atctgatccg ctgttgaccc aacagcatga
1560gtcgttatcc aaggggagct tcggctccct tttttcatgc gcggatgcgg tga
1613
SEQ ID NO: 32 (2698 nt, DNA, Artificial Sequence, Synthetic construct)
cctgaggtgt
tgacaattaa tcatccggct cgtataatgt gtggaattgt gagcggataa 60caatttcaca
caggaaacag cgccgctgag aaaaagcgaa gcggcactgc tctttaacaa 120tttatcagac
aatctgtgtg ggcactcgac cggaattatc gattaacttt attattaaaa 180attaaagagg
tatatattaa tgtatcgatt aaataaggag gaataaacca tggccgtcgc 240actgcaacca
gctcaagaag tcgcaactaa gaaaaagcct gcaatcaaac agcggcgcgt 300ggtggttacc
ggcatgggtg tggtgactcc cctcgggcat gaaccggatg tgttttacaa 360caatctcctg
gatggcgtga gcggcattag tgagatcgag aattttgact cgacgcagtt 420tcccactcgc
attgccggcg aaatcaagag tttcagcacc gacggctggg tcgcgcccaa 480attgagcaaa
cggatggata aattgatgct gtatctgctc accgcaggca agaaagcgct 540ggccgatgcg
ggcatcacgg atgatgtgat gaaagagctg gataaacgca aatgtggagt 600tctgattggc
agtggcatgg gcggcatgaa gctgttctac gatgcgctcg aagccctgaa 660gatttcgtat
cgaaagatga acccattctg tgtgcctttt gcgaccacga atatgggtag 720cgccatgctg
gctatggatt tggggtggat ggggccgaat tatagtattt ccaccgcgtg 780cgcaacctcg
aacttctgca tcttgaacgc ggctaaccac attatccgtg gtgaagcaga 840catgatgctc
tgcggcggct ccgatgcggt cattatccct atcggtttgg gcggctttgt 900tgcttgccgc
gccttgagcc aacgcaataa cgacccaacc aaggcatcgc gcccgtggga 960cagcaatcgc
gatggcttcg tcatgggcga gggagccggg gtgctgctgt tggaggagct 1020ggaacacgcg
aaaaagcgag gcgcgacaat ctatgctgag ttcttgggag ggtcctttac 1080atgcgatgcc
taccacatga cggagcctca cccagagggc gcaggcgtga tcttgtgtat 1140cgagaaggca
atggctcagg caggagtctc tcgcgaggat gttaactaca ttaatgctca 1200cgcaacgtcc
acgccggctg gtgacatcaa ggaataccaa gctctcgccc attgtttcgg 1260ccagaactcg
gagctgcggg tcaatagtac aaagtccatg atcggtcatc tgctgggtgc 1320tgccggtggc
gtcgaagctg tgacagtcat tcaagccatc cgcaccggct ggattcaccc 1380taatctgaac
ctggaagacc cggacaaggc cgttgacgca aaattcctcg tcggaccgga 1440gaaagaacgt
ctcaacgtta aagtcggatt gagcaatagt ttcggttttg gtggccataa 1500ctctagtatc
ctgtttgcac cctataattg ataatagatc ctgtcgttaa ctgctttgtt 1560ggtactacct
gacttcaccc tcttttaaga tggcaagccg tgttgttggt aaaggttgta 1620aactcgttgg
atgtggtagt gccgtcccga agttggaggt gagtaacgac gacctcagta 1680agatcgtgga
tacttccgat gaatggattt ctgttcggac gggaatccgc aaccggcggg 1740tgattactgg
taaggataag atgacggggc tggcggtcga ggcagcccag aaagccctgg 1800aaatggctga
agtcgatgct gacgatgtgg acttgctcct gttgtgcacc tccaccccag 1860atgatctctt
tggaagtgcg ccgcaaatcc aggcggcact cggctgcaaa ggaaaccctc 1920tggcatttga
tattacagcc gcttgtagcg gcttcgttct gggtctggtg agtgcttcct 1980gctatatccg
cggcggcggg ttcaagaacg tcctggttat cggcgcggac gcactgagcc 2040gctacgtcga
ttggactgac cgcggcacat gcattctctt tggtgacgcc gctggcgctg 2100tgttggtcca
ggcgtgtgag agcgaggacg acggcgtctt cgggtttgat ctgcatagcg 2160atggagaggg
ttatcgccac ctgcatactg ggatcaaggc gaacgaggag ttcgggacga 2220acggttccgt
tgtggatttt ccgcccaagc gcagcagcta ctcttccatc caaatgaatg 2280ggaaagaagt
gttccgtttc gcctgccgcg tcgtgcccca gtctattgag atcgcactcg 2340agaacgcggg
cctcacacgt tctagcattg attggctgct gctccaccaa gcaaaccaac 2400gaatcttgga
tgccgtcgca acgcgtctgg aaattcccgc agaccgcgtg attagtaact 2460tggctaatta
cggcaatact tctgccgcca gcattccgtt ggcactggat gaagccgtgc 2520gcagcggtaa
ggtcaaaccc ggtcagacta tcgcaacttc ggggtttgga gcaggcttga 2580catggggcag
cgcgatcatt cgctggaatt aatgatagat ctgatccgct gttgacccaa 2640cagcatgagt
cgttatccaa ggggagcttc ggctcccttt tttcatgcgc ggatgcgg
2698
SEQ ID NO: 33 (89 nt, DNA, Artificial Sequence, Synthetic construct)
gtacgggatc cctgtcgtta actgctttgt tggtactacc tgacttcacc ctcttttaag 60
atggcaagcc gtgttgttgg taaaggttg 89
SEQ ID NO: 34 (29 nt, DNA, Artificial Sequence, Synthetic construct)
cacgtgagct ctcaccgcat ccgcgcatg 29
SEQ ID NO: 35 (1252 nt, DNA, Artificial Sequence, Synthetic construct)
tcatgaagtt ccttgtcgtc gccgtctcag cacttgcaac tgcatctgct
ttcacaacca 60gtcctgcctc tttcaccact gtcagcagtc cttcggtgaa caatgtgttc
ggacaggagg 120gaaatgctca caggaacagg agagctacca ttgtcatgga tggagctaac
ggaagtgcag 180tcagtttgaa aagtgggtca ttgaatacgc aggaggacac aagttcgtcg
ccaccgcccc 240gtacattcct tcaccaactc cctgattgga gcagattgct cactgccatc
acaaccgttt 300ttgttaaaag taagcgtccg gatatgcatg atcgtaagtc gaaaaggccg
gacatgctcg 360tggatagttt cgggttggag agtaccgttc aggatggact cgtgttccgt
caaagctttt 420cgatccgttc atatgagatt ggaactgatc gtacggcttc cattgagact
ttgatgaacc 480atcttcagga gacttccctc aaccattgta agagtacagg aattttgttg
gatggattcg 540gacgcacact cgaaatgtgt aagcgcgatt tgatttgggt cgtcattaaa
atgcagatca 600aggttaatag atacccggcc tggggcgata cagtagaaat caatactagg
ttcagcagac 660ttggtaagat cggcatgggt cgagattggc tcattagcga ctgcaatacc
ggtgagatcc 720tcgtcagggc aaccagcgcc tacgccatga tgaatcagaa gacccgaaga
ctctcgaagc 780ttccgtacga ggtccaccaa gagattgtcc ccctttttgt cgactccccc
gtaattgaag 840attcggatct caaggtccac aaattcaaag ttaaaacggg tgacagcatc
cagaagggac 900ttactcctgg ttggaacgac ctcgatgtga accaacatgt ttcgaacgtg
aaatatatcg 960gctggattct tgagagtatg ccaaccgagg tacttgagac gcaggaattg
tgctcgttgg 1020cattggagta tcgtcgtgag tgtgggcgag actcagtcct cgaaagtgta
acagcaatgg 1080acccaagcaa agttggtgtt cgttcacagt atcaacacct cctccgtctc
gaggatggaa 1140cagccattgt gaacggggcc acagagtgga ggccaaagaa cgctggcgct
aacggagcta 1200tctccacagg aaagaccagc aatggtaact ctgtgagtta atgataggat
cc 1252
SEQ ID NO: 36 (412 aa, PRT, Artificial Sequence, Synthetic construct)
Met Lys Phe Leu Val Val Ala Val Ser Ala Leu Ala Thr Ala Ser Ala 16
Phe Thr Thr Ser Pro Ala Ser Phe Thr Thr Val Ser Ser Pro Ser Val 32
Asn Asn Val Phe Gly Gln Glu Gly Asn Ala His Arg Asn Arg Arg Ala 48
Thr Ile Val Met Asp Gly Ala Asn Gly Ser Ala Val Ser Leu Lys Ser 64
Gly Ser Leu Asn Thr Gln Glu Asp Thr Ser Ser Ser Pro Pro Pro Arg 80
Thr Phe Leu His Gln Leu Pro Asp Trp Ser Arg Leu Leu Thr Ala Ile 96
Thr Thr Val Phe Val Lys Ser Lys Arg Pro Asp Met His Asp Arg Lys 112
Ser Lys Arg Pro Asp Met Leu Val Asp Ser Phe Gly Leu Glu Ser Thr 128
Val Gln Asp Gly Leu Val Phe Arg Gln Ser Phe Ser Ile Arg Ser Tyr 144
Glu Ile Gly Thr Asp Arg Thr Ala Ser Ile Glu Thr Leu Met Asn His 160
Leu Gln Glu Thr Ser Leu Asn His Cys Lys Ser Thr Gly Ile Leu Leu 176
Asp Gly Phe Gly Arg Thr Leu Glu Met Cys Lys Arg Asp Leu Ile Trp 192
Val Val Ile Lys Met Gln Ile Lys Val Asn Arg Tyr Pro Ala Trp Gly 208
Asp Thr Val Glu Ile Asn Thr Arg Phe Ser Arg Leu Gly Lys Ile Gly 224
Met Gly Arg Asp Trp Leu Ile Ser Asp Cys Asn Thr Gly Glu Ile Leu 240
Val Arg Ala Thr Ser Ala Tyr Ala Met Met Asn Gln Lys Thr Arg Arg 256
Leu Ser Lys Leu Pro Tyr Glu Val His Gln Glu Ile Val Pro Leu Phe 272
Val Asp Ser Pro Val Ile Glu Asp Ser Asp Leu Lys Val His Lys Phe 288
Lys Val Lys Thr Gly Asp Ser Ile Gln Lys Gly Leu Thr Pro Gly Trp 304
Asn Asp Leu Asp Val Asn Gln His Val Ser Asn Val Lys Tyr Ile Gly 320
Trp Ile Leu Glu Ser Met Pro Thr Glu Val Leu Glu Thr Gln Glu Leu 336
Cys Ser Leu Ala Leu Glu Tyr Arg Arg Glu Cys Gly Arg Asp Ser Val 352
Leu Glu Ser Val Thr Ala Met Asp Pro Ser Lys Val Gly Val Arg Ser 368
Gln Tyr Gln His Leu Leu Arg Leu Glu Asp Gly Thr Ala Ile Val Asn 384
Gly Ala Thr Glu Trp Arg Pro Lys Asn Ala Gly Ala Asn Gly Ala Ile 400
Ser Thr Gly Lys Thr Ser Asn Gly Asn Ser Val Ser 412
SEQ ID NO: 37 (26 nt, DNA, Artificial Sequence, Primer)
caggatccgg ggaggtgtgg tgtagt 26
SEQ ID NO: 38 (43 nt, DNA, Artificial Sequence, Primer)
taggatccag tggtgcccat ggtactttgt taggggagga tag 43
SEQ ID NO: 39 (27 nt, DNA, Artificial Sequence, Primer)
caggatcctc actctgtcgc gctgttg 27
SEQ ID NO: 40 (27 nt, DNA, Artificial Sequence, Primer)
catctagaga ggattgattt ccgagtc 27
SEQ ID NO: 41 (573 nt, DNA, Artificial Sequence, Synthetic construct)
atgggcacca ctctcgacga cacggcttac cgctaccgca ccagtgtgcc
gggggacgcc 60gaggccatcg aggcactgga tgggtccttc accaccgaca ccgtcttccg
cgtcaccgcc 120accggggacg gcttcaccct gcgggaggtg ccggtggacc cgcccctgac
caaggtgttc 180cccgacgacg agtcggacga cgagtcggac gacggggagg acggcgaccc
ggactcccgg 240acgttcgtcg cgtacgggga cgacggcgac ctggcgggct tcgtggtcgt
ctcgtactcc 300ggctggaacc gccggctgac cgtcgaggac atcgaggtcg ccccggagca
ccgggggcac 360ggggtcgggc gcgcgctgat ggggctcgcg acggagttcg cccgcgagcg
gggtgccggg 420cacctctggc tggaggtcac caacgtcaac gcaccggcga tccacgcgta
ccggcggatg 480gggttcaccc tctgcggcct ggacaccgcc ctgtacgacg gcaccgcctc
ggacggcgag 540caggcgctct acatgtccat gccctgcccc taa
573
SEQ ID NO: 42 (1198 nt, DNA, Artificial Sequence, Synthetic construct)
ccatggccgc tatgctcgcc tctaagcagg gcgccttcat gggccgcagc tcctttgccc
60ccgcccccaa gggcgtcgcc agccgcggct ccctgcaggt ggtggccggc gccaacggca
120gcgcggtgag cctgaagtcg ggttccctca acactcagga ggacacctcg tcctcgcccc
180cgccgcgcac gttcctgcac cagctgccgg actggtcccg cctgctgacg gctattacga
240ccgtgttcgt gaagtcgaag cgccccgaca tgcacgaccg caagagcaag cggcctgata
300tgctggtgga cagctttggc ctggagtcca cggtgcagga cggcctcgtg ttccggcaaa
360gcttcagcat ccgcagctac gagatcggca cggaccgcac cgcgtcgatc gagacgctca
420tgaaccacct ccaggagacg tcgctcaacc actgcaagtc caccggtatc ctgctggacg
480gctttggccg caccctggag atgtgcaagc gggatctgat ctgggtggtg atcaagatgc
540agatcaaggt gaaccgctat cccgcctggg gtgacaccgt cgagattaac acccgcttct
600cgcgcctggg caagatcggc atggggcgcg actggctgat ctcggactgc aacactggcg
660agatcctggt ccgggccacg tcggcctacg ccatgatgaa ccagaagact cggcggctga
720gcaagctgcc ttacgaggtg catcaggaga tcgtgccgct cttcgtggac agccccgtga
780tcgaggacag cgatctgaag gtgcacaagt tcaaggtcaa gaccggcgac agcatccaga
840agggcctgac tcccggctgg aacgacctgg acgtgaacca gcacgtctcg aacgtgaagt
900acatcggctg gattctggag tcgatgccca ccgaggtgct ggagacgcag gagctgtgct
960ccctggcgct ggagtatcgc cgcgagtgcg gccgcgactc cgtgctggag tccgtcaccg
1020cgatggaccc gtcgaaggtg ggtgtccgca gccagtacca acacctgctg cgcctcgagg
1080acggcaccgc cattgtgaac ggcgcgacgg agtggcggcc gaagaacgcg ggcgctaacg
1140gcgccatctc cacgggcaag acctccaacg gcaactcggt gagctaatga taggatcc
1198
<210> 43 <211> 394 <212> PRT <213> Artificial Sequence <223> Synthetic construct <400> 43
Met Ala Ala Met Leu
Ala Ser Lys Gln Gly Ala Phe Met Gly Arg Ser1 5
10 15Ser Phe Ala Pro Ala Pro Lys Gly Val Ala Ser
Arg Gly Ser Leu Gln 20 25
30Val Val Ala Gly Ala Asn Gly Ser Ala Val Ser Leu Lys Ser Gly Ser
35 40 45Leu Asn Thr Gln Glu Asp Thr Ser
Ser Ser Pro Pro Pro Arg Thr Phe 50 55
60Leu His Gln Leu Pro Asp Trp Ser Arg Leu Leu Thr Ala Ile Thr Thr65
70 75 80Val Phe Val Lys Ser
Lys Arg Pro Asp Met His Asp Arg Lys Ser Lys 85
90 95Arg Pro Asp Met Leu Val Asp Ser Phe Gly Leu
Glu Ser Thr Val Gln 100 105
110Asp Gly Leu Val Phe Arg Gln Ser Phe Ser Ile Arg Ser Tyr Glu Ile
115 120 125Gly Thr Asp Arg Thr Ala Ser
Ile Glu Thr Leu Met Asn His Leu Gln 130 135
140Glu Thr Ser Leu Asn His Cys Lys Ser Thr Gly Ile Leu Leu Asp
Gly145 150 155 160Phe Gly
Arg Thr Leu Glu Met Cys Lys Arg Asp Leu Ile Trp Val Val
165 170 175Ile Lys Met Gln Ile Lys Val
Asn Arg Tyr Pro Ala Trp Gly Asp Thr 180 185
190Val Glu Ile Asn Thr Arg Phe Ser Arg Leu Gly Lys Ile Gly
Met Gly 195 200 205Arg Asp Trp Leu
Ile Ser Asp Cys Asn Thr Gly Glu Ile Leu Val Arg 210
215 220Ala Thr Ser Ala Tyr Ala Met Met Asn Gln Lys Thr
Arg Arg Leu Ser225 230 235
240Lys Leu Pro Tyr Glu Val His Gln Glu Ile Val Pro Leu Phe Val Asp
245 250 255Ser Pro Val Ile Glu
Asp Ser Asp Leu Lys Val His Lys Phe Lys Val 260
265 270Lys Thr Gly Asp Ser Ile Gln Lys Gly Leu Thr Pro
Gly Trp Asn Asp 275 280 285Leu Asp
Val Asn Gln His Val Ser Asn Val Lys Tyr Ile Gly Trp Ile 290
295 300Leu Glu Ser Met Pro Thr Glu Val Leu Glu Thr
Gln Glu Leu Cys Ser305 310 315
320Leu Ala Leu Glu Tyr Arg Arg Glu Cys Gly Arg Asp Ser Val Leu Glu
325 330 335Ser Val Thr Ala
Met Asp Pro Ser Lys Val Gly Val Arg Ser Gln Tyr 340
345 350Gln His Leu Leu Arg Leu Glu Asp Gly Thr Ala
Ile Val Asn Gly Ala 355 360 365Thr
Glu Trp Arg Pro Lys Asn Ala Gly Ala Asn Gly Ala Ile Ser Thr 370
375 380Gly Lys Thr Ser Asn Gly Asn Ser Val
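Ser385 390
<210> 44 <211> 25 <212> DNA <213> Artificial Sequence <223> Primer <400> 44
ggtggaaaat gcctatgtgt taacg 25
<210> 45 <211> 25 <212> DNA <213> Artificial Sequence <223> Primer <400> 45
cgtaggcagt gtgcaaccag gagcc 25
<210> 46 <211> 3418 <212> DNA <213> Artificial Sequence <223> Synthetic construct <400> 46
ggtggaaaat gcctatgtgt taacggatct acaaaccagc accaaactct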
attacgaacc 60ccacggtttc cactctcccc aactgcaaga cttggggccc attgatgtgg
ttttaacccc 120cgtcattggc atcaatatcc tcggattcct gccggtgctc aatggccaga
aaaccaccct 180ggagctttgt cgcactgtcc atccccaggc gatcgtcccc acctctggag
ccgcagaatt 240gaactatagc ggtttactaa ctaaagtttt acgtttagac ggcgatctca
gtcaatttcg 300ccagtcccta attgacgaag ggatacaagc ttccctatgg gaaccccagg
tgggagtgcc 360cctcaatgtg ccccaatcca ccgttggcta ggttggaatg ttcaaatcac
tgtgcggtgt 420gatgcttgat aaatacagtg agccagggaa aactgcaaaa aagtgtataa
agtaggttta 480acttgaatca aaatcctttc tccgcagtca tagccaggag taggaagatt
accagcgaag 540caagttgtct tcccctagct ttgggcgggc aaaccccttg cagtattgcc
aacgtcaaaa 600aatcaccata gccgaatgac ctacaccatc aacgctgacc aagtccatca
gattgtccat 660aatcttcacc acgatccctt tgaagtgttg ggctgccatc ccctcggagc
tttatgcttg 720taaaccgttt tgtgaaaaaa tttttaaaat aaaaaagggg acctctaggg
tccccaatta 780attagtaata taatctatta aaggtcattc aaaaggtcat ccaccggatc
agcttagtaa 840agccctcgct agattttaat gcggatgttg cgattacttc gccaactatt
gcgataacaa 900gaaaaagcca gcctttcatg atatatctcc caatttgtgt agggcttatt
atgcacgctt 960aaaaataata aaagcagact tgacctgata gtttggctgt gagcaattat
gtgcttagtg 1020catctaacgc ttgagttaag ccgcgccgcg aagcggcgtc ggcttgaacg
aattgttaga 1080cattatttgc cgactacctt ggtgatctcg cctttcacgt agtggacaaa
ttcttccaac 1140tgatctgcgc gcgaggccaa gcgatcttct tcttgtccaa gataagcctg
tctagcttca 1200agtatgacgg gctgatactg ggccggcagg cgctccattg cccagtcggc
agcgacatcc 1260ttcggcgcga ttttgccggt tactgcgctg taccaaatgc gggacaacgt
aagcactaca 1320tttcgctcat cgccagccca gtcgggcggc gagttccata gcgttaaggt
ttcatttagc 1380gcctcaaata gatcctgttc aggaaccgga tcaaagagtt cctccgccgc
tggacctacc 1440aaggcaacgc tatgttctct tgcttttgtc agcaagatag ccagatcaat
gtcgatcgtg 1500gctggctcga agatacctgc aagaatgtca ttgcgctgcc attctccaaa
ttgcagttcg 1560cgcttagctg gataacgcca cggaatgatg tcgtcgtgca caacaatggt
gacttctaca 1620gcgcggagaa tctcgctctc tccaggggaa gccgaagttt ccaaaaggtc
gttgatcaaa 1680gctcgccgcg ttgtttcatc aagccttacg gtcaccgtaa ccagcaaatc
aatatcactg 1740tgtggcttca ggccgccatc cactgcggag ccgtacaaat gtacggccag
caacgtcggt 1800tcgagatggc gctcgatgac gccaactacc tctgatagtt gagtcgatac
ttcggcgatc 1860accgcttccc tcatgatgtt taactttgtt ttagggcgac tgccctgctg
cgtaacatcg 1920ttgctgctcc ataacatcaa acatcgaccc acggcgtaac gcgcttgctg
cttggatgcc 1980cgaccgaggc atagactgta ccccaaaaaa acagtcataa caagccatga
aaaccgccac 2040tgcgccgtta ccaccgctgc gttcggtcaa ggttctggac cagttgcgtg
agcgcatacg 2100ctacttgcat tacagcttac gaaccgaaca ggcttatgtc cactgggttc
gtgccttcat 2160ccgtttccac ggtgtgcgtc acccggcaac cttgggcagc agcgaagtcg
aggcatttct 2220gtcctggctg gcgaacgagc gcaaggtttc ggtctccacg catcgtcagg
cattggcggc 2280cttgctgttc ttctacggca aggtgctgtg cacggatctg ccctggcttc
aggagatcgg 2340aagacctcgg ccgtcgcggc gcttgccggt ggtgctgacc ccggatgaag
tggttcgcat 2400cctcggtttt ctggaaggcg agcatcgttt gttcgcccag cttctgtatg
gaacgggcat 2460gcggatcagt gagggtttgc aactgcgggt caaggatctg gatttcgatc
acggcacgat 2520catcgtgcgg gagggcaagg gctccaagga tcgggcctgg cacccagcct
gcgcgagcag 2580gggaattgat ccggtggatg accttttgaa tgacctttaa tagattatat
tactaattaa 2640ttggggaccc tagaggtccc cttttttatt ttaaaaattt tttcacaaaa
cggtttacaa 2700gcataaagct tcggggacca cggcaaggtc aatcaatggg tcattcgtgc
ctatttaccc 2760acggctgaag cggtaacggt gttgcttccc accgatcgcc gggaagtgat
tatgaccacg 2820gtccaccatc ccaacttttt tgaatgcgtg ttggagttgg aagaaccgaa
gaattatcaa 2880ttaagaatta ccgaaaatgg ccacgaaagg gtaatttatg acccctatgg
ttttaaaact 2940cccaaactga cggattttga cctccatgtg tttggggaag gcaaccacca
ccgtatttac 3000gaaaaactcg gtgctcacct gatgacggtg gatggagtta aaggggttta
ttttgctgtg 3060tgggccccca atgcccgcaa cgtttccatt ttgggggatt tcaacaactg
ggacggcaga 3120ttgcaccaaa tgcggaaacg caacaacatg gtgtgggaat tatttatccc
tgagttgggg 3180gtgggcactt cttataagta tgagattaaa aactgggaag ggcacatcta
cgaaaagact 3240gacccctacg gtttttacca agaagtacgc cccaaaaccg cttccattgt
ggcagacttg 3300gacggttacc aatggcacga cgaagattgg ttggaagcta ggcgcaccag
cgatcccctg 3360agcaaacccg tttccgttta cgaactccat ttaggctcct ggttgcacac
tgcctacg 3418