Patent application title: ACETYL-COA PRODUCING ENZYMES IN YEAST
Inventors:
Ulrike Maria Müller (Linnich, DE)
Liang Wu (Delft, NL)
Lourina Madeleine Raamsdonk (Nootdorp, NL)
Aaron Adriaan Winkler (Den Haag, NL)
Assignees:
DSM IP ASSETS B.V.
IPC8 Class: AC12Q168FI
USPC Class:
435 6
Class name: Chemistry: molecular biology and microbiology measuring or testing process involving enzymes or micro-organisms; composition or test strip therefore; processes of forming such composition or test strip involving nucleic acid
Publication date: 2010-09-30
Patent application number: 20100248233
Claims:
1. A method of identifying a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) a yeast cell comprising: providing a mutated yeast cell, wherein said mutation comprises an inactivation of at least one gene of the (PDH) by-pass, selected from the genes encoding the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS); transforming said mutated yeast cell with an expression vector comprising at least one heterologous nucleotide sequence operably linked to a promoter functional in yeast and said heterologous nucleotide sequence encoding a candidate polypeptide having potential enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA; testing said recombinant mutated yeast cell for its ability to grow on minimal medium containing glucose as sole carbon source, and identifying said candidate polypeptide as a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell when growth of said cell is observed.
2. Method according to claim 1, wherein said yeast cell is a cell of Saccharomyces cerevisiae and wherein said heterologous nucleotide sequence is codon pair optimized for expression in Saccharomyces cerevisiae.
3. Method according to claim 2, wherein said mutation comprises an inactivation of the gene for acetyl-CoA synthetase isoform 2 (acs2).
4. Method according to claim 1, wherein said candidate polypeptide having enzymatic activity for converting acetaldehyde into acetyl-CoA is a (putative) acetylating acetaldehyde dehydrogenase (acdh).
5. A vector for the expression of heterologous polypeptides in yeast, said vector comprising a heterologous nucleotide sequence operably linked to a promoter functional in yeast and said heterologous nucleotide sequence encoding a polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell.
6. Vector according to claim 5, wherein said polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA.
7. Vector according to claim 5, wherein said polypeptide has more than 50%, preferably more than 60%, 70%, 80%, 90%, or 95% sequence identity with the amino acid sequence selected from SEQ ID NO: 19, 22, 25, 28 and 52.
8. Vector according to claim 5 for expression in Saccharomyces cerevisiae, wherein said heterologous nucleotide sequence is codon pair optimized for expression in Saccharomyces cerevisiae.
9. Expression vector according to claim 8, wherein said heterologous nucleotide sequence is selected from SEQ ID NO: 20, 23, 26, 29 and 51.
10. A recombinant yeast cell comprising a vector of claim 5.
11. A recombinant yeast cell comprising a heterologous nucleotide sequence encoding a polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell.
12. Yeast cell according to claim 10, further comprising an inactivation of at least one gene of the (PDH) by-pass, selected from the genes encoding the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS).
13. Yeast cell according to claim 10, wherein the yeast cell comprises an inactivation of a gene encoding an acetyl-CoA synthase.
14. Yeast cell according to claim 10, wherein said cell shows growth on minimal medium containing glucose as sole carbon source.
15. Yeast cell according to claim 10, further comprising an inactivation of a gene encoding an enzyme that catalyses the conversion of acetaldehyde into ethanol, preferably an alcohol dehydrogenase.
16. Yeast cell according to claim 10, further comprising one or more introduced genes encoding a recombinant pathway for the formation of 1-butanol from acetyl-CoA.
17. Yeast cell according to claim 16, wherein said one or more introduced genes encode enzymes that produce acetoacetyl-CoA, 3-hydroxybutyryl-CoA, crotonyl-CoA, butyryl-CoA, butylaldehyde and/or 1-butanol.
18. Yeast cell according to claim 10, wherein said yeast is Saccharomyces cerevisiae.
19. A method of producing a fermentation product, comprising the steps of fermenting a suitable carbon substrate with a yeast cell according to claim 10 and recovering the fermentation product produced during said fermentation.
20. Method according to claim 19, wherein the fermentation product is butanol.
Description:
FIELD OF THE INVENTION
[0001]The present invention is in the field of metabolite production in yeast using heterologous expression systems. In particular, the present invention relates to the metabolic engineering of yeast strains capable of producing metabolites that require cytosolic acetyl-CoA as a precursor, such as butanol-producing yeast strains. The present invention relates to an assay system for identifying heterologous enzymes capable of converting pyruvate, acetaldehyde or acetate into cytosolic acetyl-CoA when expressed in the cytosol in yeast.
BACKGROUND OF THE INVENTION
[0002]Acetyl-coenzyme A (CoA) is an essential intermediate in numerous metabolic pathways, and is a key precursor in the synthesis of many industrially relevant compounds, such as fatty acids, carotenoids, isoprenoids, vitamins, amino acids, lipids, wax esters, (poly)saccharides, polyhydroxyalkanoates, statins, polyketides and acetic esters (such as ethyl acetate and isoamyl acetate). In particular, acetyl-CoA is also the precursor of the industrially important bulk chemical 1-butanol.
[0003]Compared to bacteria, such as E. coli, yeast cells provide a very suitable alternative to produce the above-mentioned acetyl-CoA derived products, in that yeast is not susceptible to phage or other infection since yeast-based processes may be run at low pH. Therefore, the use of yeast does not require a sterile process, thereby lowering the cost price of the product of interest.
[0004]When natural (wild type) yeast is not able to produce the acetyl-CoA-derived product of interest, the use of metabolic engineering can provide for yeast cells expressing heterologous genes that could support such a process. In such cases, the heterologous gene products are usually targeted to the cytosolic compartment of yeast. As the biosynthesis of the acetyl-CoA-derived product will take place completely or partially in the cytosol, the supply of sufficient amounts of the precursor acetyl-CoA in the cytosolic compartment is crucial. In Saccharomyces cerevisiae, biosynthesis of acetyl-CoA takes place in two separate compartments. In the mitochondria, acetyl-CoA is synthesized by oxidative decarboxylation of pyruvate catalyzed by the pyruvate dehydrogenase complex (PDH), with the following overall reaction stoichiometry:
Pyruvate (Pyr) + CoA + NAD+ = acetyl-CoA + CO2 + NADH + H+
[0005]In the cytosol, acetyl-CoA is synthesized via the pyruvate dehydrogenase (PDH) by-pass, involving the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS), with the following overall reaction stoichiometry:
Pyr + CoA + ATP + NAD(P)+ = acetyl-CoA + CO2 + NAD(P)H + AMP + PPi + H+
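The overall by-pass stoichiometry can be checked by summing the three enzyme steps. The sketch below is illustrative only: the step equations are textbook-style forms assumed here (they are not written out in this document), water and exact proton counts are omitted so that the sum lines up with the overall equation above, and the function names are hypothetical.

```python
from collections import Counter

# Step reactions as {species: coefficient}; negative = consumed, positive = produced.
# Assumption: simplified textbook forms; water is omitted, as in the overall equation.
PDC = {"Pyr": -1, "acetaldehyde": +1, "CO2": +1}
ALD = {"acetaldehyde": -1, "NAD(P)+": -1, "acetate": +1, "NAD(P)H": +1, "H+": +1}
ACS = {"acetate": -1, "CoA": -1, "ATP": -1, "acetyl-CoA": +1, "AMP": +1, "PPi": +1}


def net_reaction(*steps):
    """Sum step reactions and drop intermediates whose coefficients cancel."""
    total = Counter()
    for step in steps:
        for species, coeff in step.items():
            total[species] += coeff
    return {species: coeff for species, coeff in total.items() if coeff != 0}


def format_reaction(net):
    """Render a net reaction as 'substrates = products' (coefficient 1 is left implicit)."""
    def term(species, coeff):
        return species if abs(coeff) == 1 else f"{abs(coeff)} {species}"

    left = [term(s, c) for s, c in sorted(net.items()) if c < 0]
    right = [term(s, c) for s, c in sorted(net.items()) if c > 0]
    return " + ".join(left) + " = " + " + ".join(right)


bypass = net_reaction(PDC, ALD, ACS)
print(format_reaction(bypass))
```

Acetaldehyde and acetate cancel as intermediates, leaving one ATP converted to AMP and PPi per acetyl-CoA formed.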
[0006]A pyruvate-decarboxylase-negative (Pdc-) mutant of the yeast S. cerevisiae does not have a functional PDH by-pass, and cannot grow on minimal medium with glucose as the sole carbon source due to its inability to supply (sufficient) cytosolic acetyl-CoA for growth (Flikweert et al., (1996) Yeast 12:247-57). The PDH by-pass is therefore essential in providing acetyl-CoA in the cytosolic compartment. However, the PDH by-pass in yeast is not optimal with respect to the energy balance, as can be seen from the overall reaction stoichiometry: 2 moles of ATP are needed per mole of acetyl-CoA synthesized via the PDH by-pass, since in the acetyl-CoA synthetase reaction ATP is hydrolyzed to AMP. In contrast, the mitochondrial pathway via the PDH requires no ATP. The additional ATP requirement of the PDH by-pass can present a problem for synthesizing the product of interest from the cytosolic acetyl-CoA precursor, as more carbon source needs to be diverted for ATP generation, via e.g. oxidative phosphorylation and/or substrate-level phosphorylation (e.g. glycolysis), thereby lowering the overall yield of the product on carbon.
[0007]When yeast is metabolically engineered to produce 1-butanol, heterologous biosynthetic genes of 1-butanol can be expressed in the cytosol in yeast cells (WO 2007/041269). In general, 1 mole of glucose gives rise, via glycolysis, to 2 moles of acetyl-CoA, which are the precursors of 1 mole of butanol; hence a maximum of 1 mole of butanol can be synthesized per mole of glucose if cell growth and maintenance are not considered. However, when the PDH by-pass is used in combination with butanol biosynthesis, this maximal theoretical yield cannot be achieved due to energy imbalance: whereas 2 moles of ATP are generated per mole of glucose converted in glycolysis, a total of 4 moles (2 times 2 moles) of ATP are needed in the PDH by-pass to form 2 moles of acetyl-CoA, which are converted to 1 mole of butanol. Thus, there is a net shortage of ATP if the PDH by-pass were used to synthesize 1 mole of 1-butanol from 1 mole of glucose.
[0008]Thus, there is a need for the identification of possible alternative metabolic routes for producing cytosolic acetyl-CoA in yeast, for the production of acetyl-CoA-derived products, in particular butanol, wherein the PDH by-pass is not required.
[0009]Butanol is an important industrial chemical and is suitable as an alternative engine fuel having improved properties over ethanol. Butanol also finds use as a solvent for a wide variety of chemical and textile processes, in the organic synthesis of plastics, as a chemical intermediate and as a solvent in the coating and food and flavor industry. Butanol can be produced from biomass (biobutanol) as well as fossil fuels (petrobutanol).
[0010]The chemical synthesis of butanol in one of its isomers can be accomplished by a variety of available methods known in the art (see e.g. Ullmann's Encyclopedia of Industrial Chemistry, 6th edition, 2003, Wiley-VCH Verlag GmbH and Co., Weinheim, Germany, Vol. 5, pp. 716-719). These processes have the disadvantage that they are based on the use of petrochemical derivatives, are generally expensive, and are not environmentally friendly.
[0011]Biological synthesis of butanol can be achieved by fermentation using the acetone-butanol-ethanol (ABE) process carried out by the bacterium Clostridium acetobutylicum or other Clostridium species. An important disadvantage of the ABE process, however, is that it results in a mixture of acetone, 1-butanol and ethanol. Moreover, the use of bacteria requires sterile process conditions and generally renders the process susceptible to bacteriophage infection. Yeast cells thus provide a very suitable alternative as described above.
SUMMARY OF THE INVENTION
[0012]The present inventors have now identified alternative metabolic routes for increasing the production of cytosolic acetyl-CoA in yeast which can overcome the problems of the PDH by-pass.
[0013]One possible route includes the direct conversion of acetaldehyde to acetyl-CoA without ATP consumption, by use of an acetylating acetaldehyde dehydrogenase (E.C. 1.2.1.10) (see FIG. 2, reaction A, ACDH). Another route includes the direct conversion of pyruvate to acetyl-CoA by an enzyme or a multi-enzyme complex without ATP consumption, for instance, by use of a pyruvate:NADP oxidoreductase (E.C. 1.2.1.51) (see FIG. 2, reaction C, PNO). In these two possible routes, the formation of 1 mole of butanol per mole of glucose would result in the formation of 2 moles of ATP. Yet another route includes the conversion of acetate to acetyl-CoA with 1 ATP consumed per acetyl-CoA formed by an alternative enzyme or a combination of enzymes, for instance, by use of acetate:CoA ligase (ADP-forming, E.C. 6.2.1.13), or by use of ATP:acetate phosphotransferase (E.C. 2.7.2.1) in combination with acetyl-CoA:Pi acetyltransferase (E.C. 2.3.1.8). In this route, the formation of 1 mole of butanol per mole of glucose is ATP-balanced, i.e. no ATP will be formed. The present inventors have now found that such an alternative to the PDH by-pass can result in acetyl-CoA synthesis in the cytosol of the yeast, and that such acetyl-CoA can be used biosynthetically to produce higher amounts of desirable fermentation products, such as butanol.
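The ATP balances quoted for these routes follow from simple bookkeeping per mole of glucose: 2 moles of ATP from glycolysis, minus the ATP cost of forming the 2 moles of acetyl-CoA needed for 1 mole of butanol. A minimal sketch of that arithmetic, using the figures stated in this document (the constant and function names are hypothetical, and growth and maintenance are ignored):

```python
# ATP generated in glycolysis per mole of glucose (figure from the text).
GLYCOLYSIS_ATP_PER_GLUCOSE = 2
# One glucose yields two acetyl-CoA, which are converted to one butanol.
ACETYL_COA_PER_GLUCOSE = 2

# ATP equivalents consumed per acetyl-CoA formed, as stated in the text.
ATP_COST_PER_ACETYL_COA = {
    "PDH by-pass (PDC/ALD/ACS)": 2,
    "acetylating acetaldehyde dehydrogenase (E.C. 1.2.1.10)": 0,
    "pyruvate:NADP oxidoreductase (E.C. 1.2.1.51)": 0,
    "acetate:CoA ligase, ADP-forming (E.C. 6.2.1.13)": 1,
}


def net_atp_per_butanol(atp_cost_per_acetyl_coa):
    """Net moles of ATP per mole of glucose converted to one mole of 1-butanol."""
    return GLYCOLYSIS_ATP_PER_GLUCOSE - ACETYL_COA_PER_GLUCOSE * atp_cost_per_acetyl_coa


for route, cost in ATP_COST_PER_ACETYL_COA.items():
    print(f"{route}: net ATP per glucose = {net_atp_per_butanol(cost):+d}")
```

Under these assumptions the PDH by-pass comes out at a deficit of 2 ATP, the two ATP-independent routes at a surplus of 2 ATP, and the ADP-forming acetate route at zero.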
[0014]In a first aspect, the present invention provides a method of identifying a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) a yeast cell comprising:
[0015]providing a mutated yeast cell, wherein said mutation comprises an inactivation of at least one gene of the (PDH) by-pass, selected from the genes encoding the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS);
[0016]transforming said mutated yeast cell with an expression vector comprising at least one heterologous nucleotide sequence operably linked to a promoter functional in yeast and said at least one heterologous nucleotide sequence encoding at least one candidate polypeptide having potential enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA;
[0017]testing said recombinant mutated yeast cell for its ability to grow on minimal medium containing glucose as sole carbon source, and
[0018]identifying said candidate polypeptide as a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell when growth of said cell is observed.
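The identification step of this method amounts to a growth-based selection rule, which can be sketched as follows. This is only an illustration: the record type, the function name and the candidate names are hypothetical, the example observations are made up, and the empty-vector control is an assumption of the sketch that the method as described does not require.

```python
from dataclasses import dataclass


@dataclass
class GrowthTest:
    """One transformant tested on glucose minimal medium (hypothetical record)."""
    candidate: str          # candidate polypeptide expressed from the vector
    inactivated_gene: str   # by-pass gene inactivated in the host, e.g. "acs2"
    grows_on_glucose: bool  # growth with glucose as sole carbon source


def identify_hits(tests, control_name="empty vector"):
    """Return candidates whose transformants grow, provided the control does not.

    Assumption of this sketch: if the control strain already grows, growth
    says nothing about the candidate, so no hits are reported.
    """
    control_grows = any(
        t.grows_on_glucose for t in tests if t.candidate == control_name
    )
    if control_grows:
        return []
    return [
        t.candidate
        for t in tests
        if t.candidate != control_name and t.grows_on_glucose
    ]


# Made-up observations, for illustration only.
observations = [
    GrowthTest("empty vector", "acs2", False),
    GrowthTest("candidate_A", "acs2", True),
    GrowthTest("candidate_B", "acs2", False),
]
print(identify_hits(observations))
```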
[0019]In a preferred embodiment of said method the yeast cell is a cell of Saccharomyces cerevisiae and the heterologous nucleotide sequence is codon (pair) optimized for expression in Saccharomyces cerevisiae.
[0020]In another preferred embodiment, said mutation comprises an inactivation of the gene for acetyl-CoA synthetase isoform 2 (acs2).
[0021]In another preferred embodiment, said at least one candidate polypeptide having enzymatic activity for converting acetaldehyde into acetyl-CoA is a (putative) acetylating acetaldehyde dehydrogenase.
[0022]Alternatively, said at least one heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) a yeast cell may consist of two or more enzymes working together to achieve the desired conversion from pyruvate, acetaldehyde or acetate into acetyl-CoA.
[0023]In another aspect, the present invention provides an integration vector for the integration in a yeast genome of a heterologous nucleotide sequence encoding a polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA, and the subsequent expression of the heterologous polypeptide therefrom.
[0024]In another aspect, the present invention provides an expression vector expressing heterologous polypeptides in yeast, said expression vector comprising a heterologous nucleotide sequence operably linked to a promoter functional in yeast and said heterologous nucleotide sequence encoding a polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell.
[0025]In a preferred embodiment of said vector the polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA is identified by a method according to the present invention as described above.
[0026]In another preferred embodiment, said polypeptide is selected from SEQ ID NO: 19, 22, 25, 28 and 52 and functional homologues thereof.
[0027]In another preferred embodiment, said expression vector is for expression in Saccharomyces cerevisiae, wherein said heterologous nucleotide sequence is codon (pair) optimized for expression in Saccharomyces cerevisiae.
[0028]In another preferred embodiment, said heterologous nucleotide sequence is selected from SEQ ID NO: 20, 23, 26 and 29.
[0029]In another aspect, the present invention provides a recombinant yeast cell comprising the expression vector of the present invention as described above.
[0030]In a preferred embodiment, the recombinant yeast cell further comprises an inactivation of at least one gene of the (PDH) by-pass, selected from the genes encoding the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS).
[0031]Preferably, a yeast cell according to the present invention comprises an inactivation of a gene encoding an acetyl-CoA synthase.
[0032]In another preferred embodiment, the recombinant yeast cell further comprises an inactivation of a gene (nucleotide sequence) encoding an enzyme capable of catalysing the conversion of acetaldehyde to ethanol, preferably a gene encoding an alcohol dehydrogenase.
[0033]As used herein, inactivation of a gene (nucleotide sequence) encoding an enzyme may be achieved by mutation, deletion or disruption of (part of) a gene or nucleotide sequence encoding an enzyme.
[0034]Preferably a yeast cell according to the present invention shows growth on minimal medium containing glucose as sole carbon source.
[0035]In another preferred embodiment of a yeast cell of the invention, said yeast cell further comprises one or more introduced genes encoding a recombinant pathway for the formation of 1-butanol from cytosolic acetyl-CoA. Suitable recombinant pathways from acetyl-CoA to 1-butanol are known in the art, for instance from WO 2007/041269. Preferably said one or more introduced genes encode enzymes that produce acetoacetyl-CoA, 3-hydroxybutyryl-CoA, crotonyl-CoA, butyryl-CoA, butyraldehyde and/or 1-butanol. Said enzymes can be:
[0036]acetyl-CoA acetyltransferase (E.C. 2.3.1.9 [Enzyme Nomenclature 1992, Academic Press, San Diego]; although enzymes with a broader substrate range (E.C. 2.3.1.16) will be functional as well), which converts 2 moles of acetyl-CoA to acetoacetyl-CoA;
[0037]NADH-dependent or NADPH-dependent 3-hydroxybutyryl-CoA dehydrogenase (E.C. 1.1.1.35 or E.C. 1.1.1.30, resp. E.C. 1.1.1.157 or E.C. 1.1.1.36), which converts acetoacetyl-CoA to 3-hydroxybutyryl-CoA;
[0038]3-hydroxybutyryl-CoA dehydratase (also named crotonase; E.C. 4.2.1.17 or E.C. 4.2.1.55), which converts 3-hydroxybutyryl-CoA to crotonyl-CoA;
[0039]NADH-dependent or NADPH-dependent butyryl-CoA dehydrogenase (E.C. 1.3.1.44 resp. E.C. 1.3.1.38 or E.C. 1.3.99.2), which converts crotonyl-CoA to butyryl-CoA;
[0040]monofunctional NADH-dependent or NADPH-dependent aldehyde dehydrogenase (E.C. 1.2.1.10 or E.C. 1.2.1.57), which converts butyryl-CoA to butyraldehyde, and
[0041]NADH-dependent or NADPH-dependent butanol dehydrogenase (E.C. 1.1.1.-), which converts butyraldehyde to 1-butanol, or
[0042]bifunctional NADH-dependent or NADPH-dependent aldehyde/alcohol dehydrogenase (E.C. 1.1.1.1/1.2.1.10), which converts butyryl-CoA to 1-butanol via butyraldehyde.
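The enzyme list above describes a linear chain from acetyl-CoA to 1-butanol. As a rough illustration, the chain can be written down as data and checked for continuity; the E.C. numbers are those quoted in the list, while the data layout and function name are hypothetical, and the variant using the two monofunctional dehydrogenases is shown.

```python
# Each step: (enzyme, E.C. numbers quoted in the text, substrate, product).
BUTANOL_PATHWAY = [
    ("acetyl-CoA acetyltransferase", ["2.3.1.9", "2.3.1.16"],
     "acetyl-CoA", "acetoacetyl-CoA"),
    ("3-hydroxybutyryl-CoA dehydrogenase",
     ["1.1.1.35", "1.1.1.30", "1.1.1.157", "1.1.1.36"],
     "acetoacetyl-CoA", "3-hydroxybutyryl-CoA"),
    ("3-hydroxybutyryl-CoA dehydratase", ["4.2.1.17", "4.2.1.55"],
     "3-hydroxybutyryl-CoA", "crotonyl-CoA"),
    ("butyryl-CoA dehydrogenase", ["1.3.1.44", "1.3.1.38", "1.3.99.2"],
     "crotonyl-CoA", "butyryl-CoA"),
    ("aldehyde dehydrogenase", ["1.2.1.10", "1.2.1.57"],
     "butyryl-CoA", "butyraldehyde"),
    ("butanol dehydrogenase", ["1.1.1.-"],
     "butyraldehyde", "1-butanol"),
]


def trace_pathway(pathway, start):
    """Follow the steps in order and return the list of metabolites visited.

    Raises ValueError if a step's substrate is not the previous step's product.
    """
    route = [start]
    for enzyme, _ec_numbers, substrate, product in pathway:
        if substrate != route[-1]:
            raise ValueError(
                f"{enzyme} expects {substrate}, but the pathway is at {route[-1]}"
            )
        route.append(product)
    return route


print(" -> ".join(trace_pathway(BUTANOL_PATHWAY, "acetyl-CoA")))
```

A bifunctional aldehyde/alcohol dehydrogenase would replace the last two entries with a single step from butyryl-CoA to 1-butanol.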
[0043]In another preferred embodiment of the invention, the yeast cell is a Saccharomyces cerevisiae cell.
[0044]In another aspect, the present invention provides a method of producing butanol, comprising the steps of fermenting a suitable carbon substrate with a yeast cell according to the present invention and recovering the butanol produced during said fermentation.
BRIEF DESCRIPTION OF THE DRAWINGS
[0045]FIG. 1 is a schematic presentation of the PDH by-pass showing the enzymes pyruvate decarboxylase (PDC; E.C. 4.1.1.1), acetaldehyde dehydrogenase (ALD; E.C. 1.2.1.3, E.C. 1.2.1.4 and E.C. 1.2.1.5), and acetyl-CoA synthetase (ACS; E.C. 6.2.1.1).
[0046]FIG. 2 shows a schematic metabolic route for butanol production in Saccharomyces cerevisiae. Reactions 1-6 are the butanol biosynthesis steps from Clostridium acetobutylicum introduced in yeast. A, B, and C indicate alternative reactions for acetyl-CoA biosynthesis in the cytosol. B indicates part of the pyruvate dehydrogenase by-pass (pdc, ald and acs), the natural source of cytosolic acetyl-CoA in yeast. Glc, glucose; EtOH, ethanol; Pyr, pyruvate; AA, acetaldehyde; ACT, acetate; AcCoA, acetyl-CoA; AACoA, acetoacetyl-CoA; BuCoA, butyryl-CoA; Bual, butyraldehyde; BuOH, butanol; NAD(P)(H), nicotinamide adenine dinucleotide (phosphate) (in reduced form); ATP, adenosine triphosphate; AMP, adenosine monophosphate; TCA cycle, tricarboxylic acid cycle; PDH, pyruvate dehydrogenase; pdc, pyruvate decarboxylase; adh, alcohol dehydrogenase; acdh, acetylating acetaldehyde dehydrogenase; ald, acetaldehyde dehydrogenase; acs, acetyl-CoA synthetase; pno, pyruvate:NADP oxidoreductase. The enzymatic conversions indicated by reactions 1-6 represent a heterologous butanol pathway from Clostridium acetobutylicum: thlB (or ThL) encoding acetyl-CoA acetyltransferase or thiolase [E.C. 2.3.1.9] (SEQ ID NO:30); hbd, 3-hydroxybutyryl-CoA dehydrogenase [E.C. 1.1.1.157] (SEQ ID NO:31); crt, 3-hydroxybutyryl-CoA dehydratase [E.C. 4.2.1.55] (SEQ ID NO:32); ter, trans-enoyl CoA reductase; bcd, butyryl-CoA dehydrogenase [E.C. 1.3.99.2] (SEQ ID NO:33); etf αβ, heterodimeric electron transfer flavoprotein (etf α and etf β, SEQ ID NO:38 and SEQ ID NO:39, respectively); adhE/adhE1, aldehyde/alcohol dehydrogenase E and E1 [E.C. 1.1.1.1/1.2.1.10] (SEQ ID NO:34 and 35, respectively); bdhA/bdhB, NAD(P)H-dependent butanol dehydrogenase A and B [E.C. 1.1.1.-] (SEQ ID NO:36 and 37, respectively).
[0047]FIG. 3 shows the map of plasmid YEplac112PtdhTadh. The sequence of this plasmid is provided in SEQ ID NO:40.
[0048]FIG. 4 shows an example of a similarity tree based on amino acid sequences of proteins of the types 1 to 4 as described in Example 2 and indicates the branches.
DETAILED DESCRIPTION OF THE INVENTION
Definitions
[0049]The term "butanol" refers to n-butanol, or 1-butanol.
[0050]The term "yeast" refers to a phylogenetically diverse group of single-celled fungi, most of which are in the divisions Ascomycota and Basidiomycota. The budding yeasts ("true yeasts") are classified in the order Saccharomycetales, with Saccharomyces cerevisiae as the best-known species.
[0051]The term "recombinant yeast" as used herein, is defined as a cell which contains a nucleotide sequence and/or protein, or is transformed or genetically modified with a nucleotide sequence that does not naturally occur in the yeast, or it contains additional copy or copies of an endogenous nucleic acid sequence (or protein), or it contains a mutation, deletion or disruption of an endogenous nucleic acid sequence.
[0052]The term "mutated" as used herein regarding proteins or polypeptides means that at least one amino acid in the wild-type or naturally occurring protein or polypeptide sequence has been replaced with a different amino acid, or deleted from the sequence via mutagenesis of nucleic acids encoding these amino acids. Mutagenesis is a well-known method in the art, and includes, for example, site-directed mutagenesis by means of PCR or via oligonucleotide-mediated mutagenesis as described in Sambrook et al., Molecular Cloning-A Laboratory Manual, 2nd ed., Vol. 1-3 (1989). The term "mutated" as used herein regarding genes means that at least one nucleotide in the nucleotide sequence of that gene or a regulatory sequence thereof, has been replaced with a different nucleotide, or has been deleted from the sequence via mutagenesis, resulting in the transcription of a non-functional protein sequence or the knock-out of that gene.
[0053]The term "gene", as used herein, refers to a nucleic acid sequence containing a template for a nucleic acid polymerase, in eukaryotes, RNA polymerase II. Genes are transcribed into mRNAs that are then translated into protein.
[0054]The term "pyruvate dehydrogenase (PDH) by-pass" refers to the enzymatic cascade from pyruvate to acetyl-CoA in the cytosol of yeast, which consists of the following enzymes: pyruvate decarboxylase (PDC; E.C. 4.1.1.1), converting pyruvate into acetaldehyde; acetaldehyde dehydrogenase (ALD; E.C. 1.2.1.3, E.C. 1.2.1.4 and E.C. 1.2.1.5), converting acetaldehyde into acetate; and acetyl-CoA synthetase (ACS; E.C. 6.2.1.1), converting acetate into acetyl-CoA.
[0055]The term "nucleic acid" as used herein, includes reference to a deoxyribonucleotide or ribonucleotide polymer, i.e. a polynucleotide, in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues having the essential nature of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide nucleic acids). A polynucleotide can be full-length or a subsequence of a native or heterologous structural or regulatory gene. Unless otherwise indicated, the term includes reference to the specified sequence as well as the complementary sequence thereof. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are "polynucleotides" as that term is intended herein. Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, to name just two examples, are polynucleotides as the term is used herein. It will be appreciated that a great variety of modifications have been made to DNA and RNA that serve many useful purposes known to those of skill in the art. The term polynucleotide as it is employed herein embraces such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including among other things, simple and complex cells.
[0056]The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The essential nature of such analogues of naturally occurring amino acids is that, when incorporated into a protein, that protein is specifically reactive to antibodies elicited to the same protein but consisting entirely of naturally occurring amino acids. The terms "polypeptide", "peptide" and "protein" are also inclusive of modifications including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
[0057]Sequence identity is herein defined as a relationship between two or more amino acid (polypeptide or protein) sequences or two or more nucleic acid (polynucleotide) sequences, as determined by comparing the sequences. Usually, sequence identities are compared over the whole length of the sequences compared. In the art, "identity" also means the degree of sequence relatedness between amino acid or nucleic acid sequences, as the case may be, as determined by the match between strings of such sequences.
[0058]Preferred methods to determine identity are designed to give the largest match between the sequences tested. Methods to determine identity are codified in publicly available computer programs. Preferred computer program methods to determine identity and similarity between two sequences include BLASTP and BLASTN (Altschul, S. F. et al., J. Mol. Biol. 215:403-410 (1990)), publicly available from NCBI and other sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, Md. 20894). Preferred parameters for amino acid sequence comparison using BLASTP are gap open 11.0, gap extend 1, Blosum 62 matrix.
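To make the notion of percent identity concrete, the sketch below counts identical positions between two sequences that have already been aligned. It is not BLASTP and does not perform the alignment itself; the function name, the gap handling and the toy sequences are assumptions for illustration, and alignment programs may use a different denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical positions between two pre-aligned, equal-length sequences.

    Columns in which both sequences carry a gap are ignored; a gap opposite a
    residue counts as a mismatch. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy aligned sequences, not taken from the sequence listing.
identity = percent_identity("MKTAYIAK", "MKTA-IAR")
print(f"{identity:.1f}% identity")
```

A value computed this way could then be compared with a threshold such as the "more than 50%" sequence identity recited for the polypeptides of the vector.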
[0059]Every nucleic acid sequence herein that encodes a polypeptide also, by reference to the genetic code, describes every possible silent variation of the nucleic acid. The term "conservatively modified variants" applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or conservatively modified variants of the amino acid sequences due to the degeneracy of the genetic code. The term "degeneracy of the genetic code" refers to the fact that a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations" and represent one species of conservatively modified variation.
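The alanine example can be reproduced by enumerating synonymous codons from the standard genetic code. The sketch below assumes the standard code and RNA codons; the function names and the short example sequence are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered with U, C, A, G at each position;
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(rna):
    """Translate an RNA coding sequence whose length is a multiple of three."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def silent_variants(rna):
    """All coding sequences that encode the same polypeptide as `rna`."""
    choices = [synonymous_codons(aa) for aa in translate(rna)]
    return ["".join(codons) for codons in product(*choices)]


print(synonymous_codons("A"))      # the four alanine codons named in the text
print(silent_variants("AUGGCA"))   # Met-Ala: one Met codon times four Ala codons
```

Codon (pair) optimization for a given host selects among such silent variants; the selection criteria themselves are outside the scope of this sketch.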
[0060]"Expression" refers to the transcription of a gene into structural RNA (rRNA, tRNA) or messenger RNA (mRNA) with subsequent translation into a protein.
[0061]As used herein, "heterologous" in reference to a nucleic acid or protein is a nucleic acid or protein that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention. For example, a promoter operably linked to a heterologous structural gene is from a species different from that from which the structural gene was derived, or, if from the same species, one or both are substantially modified from their original form. A heterologous protein may originate from a foreign species or, if from the same species, is substantially modified from its original form by deliberate human intervention.
[0062]As used herein "promoter" is a DNA sequence that directs the transcription of a (structural) gene. Typically, a promoter is located in the 5'-region of a gene, proximal to the transcriptional start site of a (structural) gene. Promoter sequences may be constitutive, inducible or repressible. If a promoter is an inducible promoter, then the rate of transcription increases in response to an inducing agent.
[0063]The term "vector" as used herein, includes reference to an autosomal expression vector and to an integration vector used for integration into the chromosome.
[0064]The term "expression vector" refers to a DNA molecule, linear or circular, that comprises a segment encoding a polypeptide of interest under the control of (i.e., operably linked to) additional nucleic acid segments that provide for its transcription. Such additional segments may include promoter and terminator sequences, and may optionally include one or more origins of replication, one or more selectable markers, an enhancer, a polyadenylation signal, and the like. Expression vectors are generally derived from plasmid or viral DNA, or may contain elements of both. In particular an expression vector comprises a nucleotide sequence that comprises in the 5' to 3' direction and operably linked: (a) a yeast-recognized transcription and translation initiation region, (b) a coding sequence for a polypeptide of interest, and (c) a yeast-recognized transcription and translation termination region. "Plasmid" refers to autonomously replicating extrachromosomal DNA which is not integrated into a microorganism's genome and is usually circular in nature.
[0065]An "integration vector" refers to a DNA molecule, linear or circular, that can be incorporated in a microorganism's genome and provides for stable inheritance of a gene encoding a polypeptide of interest. The integration vector generally comprises one or more segments comprising a gene sequence encoding a polypeptide of interest under the control of (i.e., operably linked to) additional nucleic acid segments that provide for its transcription. Such additional segments may include promoter and terminator sequences, and one or more segments that drive the incorporation of the gene of interest into the genome of the target cell, usually by the process of homologous recombination. Typically, the integration vector will be one which can be transferred into the target cell, but which has a replicon which is nonfunctional in that organism. Integration of the segment comprising the gene of interest may be selected if an appropriate marker is included within that segment.
[0066]As used herein, the term "operably linked" refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. A control sequence "operably linked" to another control sequence and/or to a coding sequence is ligated in such a way that transcription and/or expression of the coding sequence is achieved under conditions compatible with the control sequence. Generally, operably linked means that the nucleic acid sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame.
[0067]By "host cell" is meant a cell which contains a vector and supports the replication and/or expression of the vector. Host cells may be prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells. Preferably, host cells are cells of the order of Actinomycetales, more preferably yeast cells, most preferably cells of Saccharomyces cerevisiae.
[0068]"Transformation" and "transforming", as used herein, refers to the insertion of an exogenous polynucleotide into a host cell, irrespective of the method used for the insertion, for example, direct uptake, transduction, f-mating or electroporation. The exogenous polynucleotide may be maintained as a non-integrated vector, for example, a plasmid, or alternatively, may be integrated into the host cell genome.
[0069]The term "oligonucleotide" refers to a short sequence of nucleotide monomers (usually 6 to 100 nucleotides) joined by phosphorus linkages (e.g., phosphodiester, alkyl and aryl-phosphate, phosphorothioate, phosphotriester), or non-phosphorus linkages (e.g., peptide, sulfamate and others). An oligonucleotide may contain modified nucleotides having modified bases (e.g., 5-methyl cytosine) and modified sugar groups (e.g., 2'-O-methyl ribosyl, 2'-O-methoxyethyl ribosyl, 2'-fluoro ribosyl, 2'-amino ribosyl, and the like). Oligonucleotides may be naturally-occurring or synthetic molecules of double- and single-stranded DNA and double- and single-stranded RNA with circular, branched or linear shapes and optionally including domains capable of forming stable secondary structures (e.g., stem-and-loop and loop-stem-loop structures).
[0070]The term "polynucleotide" as used herein refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. Thus, this term includes double- and single-stranded DNA and RNA.
[0071]The term "recombinant polynucleotide" as used herein intends a polynucleotide of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue of its origin or manipulation: (1) is not associated with all or a portion of a polynucleotide with which it is associated in nature; or (2) is linked to a polynucleotide other than that to which it is linked in nature; or (3) does not occur in nature.
[0072]The term "minimal medium" as used herein refers to a chemically defined medium, which includes only the nutrients that are required by the cells to survive and proliferate in culture. Typically, minimal medium is free of biological extracts, e.g., growth factors, serum, pituitary extract, or other substances, which are not necessary to support the survival and proliferation of a cell population in culture. For example, minimal medium generally includes as essential substances: at least one carbon source, such as glucose; at least one nitrogen source, such as ammonium, ammonium sulfate, ammonium chloride, ammonium nitrate or urea; inorganic salts, such as dipotassium hydrogenphosphate, potassium dihydrogen-phosphate and magnesium sulfate; and other nutrients, such as biotin and vitamins.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0073]A method of the present invention provides a method for identifying heterologous enzymes capable of producing acetyl-CoA in the cytosol of a yeast cell. The heterologous enzyme may produce the acetyl-CoA using pyruvate, acetaldehyde or acetate as a substrate, preferably in a single conversion step. Preferably, the heterologous enzyme produces the acetyl-CoA from acetaldehyde. An enzyme capable of catalyzing said reaction is acetylating acetaldehyde dehydrogenase (acdh; E.C. 1.2.1.10), also referred to as acetaldehyde:NAD+ oxidoreductase (CoA-acetylating). The conversion of acetaldehyde into acetyl-CoA by acetylating acetaldehyde dehydrogenase is reversible and runs in the direction of acetyl-CoA when acetaldehyde accumulates in the cytosol. Such an accumulation may for instance be achieved by deletion of alcohol dehydrogenase (adh; E.C. 1.1.1.1).
[0074]The heterologous enzyme may also produce the acetyl-CoA from pyruvate. An enzyme capable of catalyzing said reaction is a pyruvate:NADP oxidoreductase (pno; E.C. 1.2.1.51). The reaction is stoichiometrically identical to the mitochondrial pyruvate dehydrogenase except that pno uses NADPH as a cofactor as compared to PDH that uses NADH. Compared to acdh, an important disadvantage of the pno enzyme system is that pno is oxygen sensitive, and that it is a large multimeric enzyme, and hence, its successful genetic incorporation (a 5-6 kb gene) is much more difficult than that of acdh. For this reason, the use of acdh is preferred in embodiments of the present invention.
[0075]An important feature of a test cell capable of revealing the desired enzymatic activity of a test polypeptide is that the cell is prototrophic as a result of the introduced polypeptide. With this, it is meant that the cell's nutritional requirements do not exceed those of the corresponding wild-type strain and that it will proliferate on minimal medium (in contrast to the auxotroph). In fact, the production of acetyl-CoA as supported by the test polypeptide will cancel the effect of the block in the PDH by-pass caused by the deletion of the gene for pyruvate decarboxylase (pdc; E.C. 4.1.1.1), acetaldehyde dehydrogenase (ald; E.C. 1.2.1.3, E.C. 1.2.1.4 or E.C. 1.2.1.5), or acetyl-CoA synthetase (acs; E.C. 6.2.1.1). Such complementation assays are well known in the art. In aspects of the present invention the assay is used to identify suitable sources of heterologous enzymes capable of sustaining cytosolic acetyl-CoA production in yeast cells.
[0076]The complementation assay is based on the provision of alternative routes to overcome the deleted enzyme activity of the PDH by-pass. Methods for effecting deletion of genes in yeast are well known in the art, and can for instance be achieved by oligonucleotide-mediated mutagenesis. Good results may be obtained with the plasmid pUG6 carrying the loxP-kanMX-loxP gene disruption cassette (Guldener et al. [1996] Nucleic Acids Res. 24(13):2519-24; GenPept accession no. P30114). Thus, the skilled person will be able to provide a yeast strain having a deleted acetaldehyde dehydrogenase and/or acetyl-CoA synthetase gene for blocking the PDH by-pass therein.
[0077]Saccharomyces cerevisiae comprises two acetyl-CoA synthetase isoforms, Acs1p and Acs2p. Both are the nuclear source of acetyl-CoA for histone acetylation. The production of cytosolic acetyl-CoA is also required for lipid production. Acs activity is essential, since an acs1 acs2 double null mutant is non-viable. An acs1 null mutant can grow with ethanol as the sole carbon source. The mutated yeast cell used in aspects of the present invention preferably has an inactivation of the acs2 gene.
[0078]Saccharomyces cerevisiae mutants carrying an inactivation of the acs2 gene are not able to grow on glucose as sole carbon source, because ACS1 is repressed and the protein is actively degraded. Complementation of such a delta acs2 mutant with a plasmid based acs gene will restore the cell's ability to grow on glucose as single carbon source. In addition, growth of such a mutant is complemented by the expression of genes supporting alternative routes for the production of sufficient cytosolic acetyl-CoA. Thus, transformation of the delta acs2 mutant with a plasmid from which a functional (heterologous) acdh or pno can be expressed will restore the mutant's ability to grow on glucose as sole carbon source. It should be understood that in addition to the removal of the ACS2 locus, one may also remove the ACS1 locus. Although it is believed that this may in some instances prevent the occurrence of revertants (mutations in the ACS1 locus leading to reversion of the delta acs2 phenotype), this was not found to be essential. Double mutants (acs1/acs2Δ strains) would be wholly dependent on the introduced acdh or pno gene for the production of cytosolic acetyl-CoA.
[0079]An important advantage of a complementation assay of the present invention is that it can be performed as a plate screening assay wherein successful complementation is observed as colony growth. This is much faster than experiments that require the analysis for the production of a desired metabolic product.
[0080]For complementation of the mutation, the yeast cell having the inactivated ald and/or acs gene is then transformed with a suitable expression vector comprising a nucleotide sequence of a heterologous test polypeptide.
[0081]Yeast expression vectors are widely available from a variety of commercial suppliers. To date, functional complementation of yeast mutations by foreign homologues has become a standard practice in engineering of Saccharomyces cerevisiae. Suitable expression vectors for heterologous gene expression may be based on artificial, inducible promoters such as the GAL promoter, but are preferably based on constitutive promoters such as the TDH3 promoter. Suitable systems are exemplified in the examples below. In certain production systems, the use of an inducible promoter may be preferred, as it would allow for temporal separation of stages for biomass production (promoter not induced) and fermentation product production (promoter induced). In another highly preferred embodiment in certain production systems, the vector is an integration vector for stably integrating the heterologous genes in the genome of the yeast production strain.
[0082]In order to achieve optimal expression in yeast, the codon (pair) usage of the heterologous gene may be optimized by using any one of a variety of synthetic gene design software packages, for instance GeneOptimizer® from Geneart AG (Regensburg, Germany) for codon usage optimization or codon pair usage optimization as described in WO2008/000632. Such adaptation of codon usage ensures that the heterologous genes, which are for instance of bacterial origin, are effectively processed by the yeast transcription and translation machinery. Optimization of codon pair usage will result in enhanced protein expression in the yeast cell.
[0083]The optimized sequences may for instance be cloned into a high copy yeast expression plasmid, operably linked to a (preferably constitutive) promoter functional in yeast. Good results have been obtained with the plasmid YEplac112 (2μ TRP1) (Gietz & Sugino [1988] Gene 74(2):527-34).
[0084]Heterologous genes that encode a candidate polypeptide having potential enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA may be identified in silico. Suitable enzymes described as possessing the capacity to convert acetaldehyde into acetyl-CoA are acetylating acetaldehyde dehydrogenases (E.C. 1.2.1.10). The nucleotide and amino acid sequences of over 200 of these enzymes from a variety of microbial origins are described in various databases (e.g. the KEGG (Kyoto Encyclopedia of Genes and Genomes) database).
[0085]The present inventors have selected several acetylating acetaldehyde dehydrogenases and tested these in the delta acs2 mutant-based assay system of the present invention. Many of these, though not all, were functional in S. cerevisiae when codon pair usage was optimized.
[0086]Functional homologues to these proteins can also be used in aspects of the present invention. The term "functional homologues" as used herein refers to a protein comprising the amino acid sequence of SEQ ID NO:19, 22, 25 or the acetaldehyde dehydrogenase part of SEQ ID NOs: 28 and 52 in which one or more amino acids are substituted, deleted, added, and/or inserted, and which protein has the same enzymatic functionality for substrate conversion, for instance an acetylating acetaldehyde dehydrogenase homologue is capable of converting acetaldehyde into acetyl-CoA. This functionality may be tested by use of an assay system comprising a recombinant yeast cell comprising an expression vector for the expression of the homologue in yeast, said expression vector comprising a heterologous nucleotide sequence operably linked to a promoter functional in yeast and said heterologous nucleotide sequence encoding the homologous polypeptide of which enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl CoA in (the cytosol of) said yeast cell is to be tested, and performing a method for identifying a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) a yeast cell as described herein using said assay system. Candidate homologues may be identified by using in silico similarity analyses. A detailed example of such an analysis is described in Example 2 below. The skilled person will be able to derive therefrom how suitable candidate homologues may be found and, optionally upon codon (pair) optimization, will be able to test the required functionality of such candidate homologues using the assay system of the present invention as described above. 
A suitable homologue represents a polypeptide having an amino acid sequence identity to an acetylating acetaldehyde dehydrogenase of more than 50%, preferably more than 60%, more preferably more than 70%, 80%, 90% or more, for instance having such an amino acid sequence identity to SEQ ID NOs:19, 22, 25, or the acetaldehyde dehydrogenase part of SEQ ID NOs:28 and 52, and having the required enzymatic functionality for converting acetaldehyde into acetyl-CoA. Similarly, enzymes described for the direct conversion of pyruvate into acetyl-CoA and the functional homologues thereof, as well as enzymes described for the conversion of acetate to acetyl-CoA and the functional homologues thereof, can also be used, in the same manner as described for acetylating acetaldehyde dehydrogenase above.
[0087]A method of the present invention further comprises the step of testing the ability of the mutated and test-protein transformed yeast cell to grow on minimal medium containing glucose as sole carbon source. As stated earlier, this may suitably occur on solid (agar) media in Petri dishes (plates), where growth can be observed as growth of a colony. However, liquid media are equally suitable, and growth may then be detected by turbidity. Other methods for determining growth of the mutated and test-protein transformed yeast cell on minimal medium containing glucose as sole carbon source may also be used.
[0088]When the mutated and test-protein-transformed yeast cell is capable of growth on minimal medium with glucose, the candidate polypeptide is successfully identified as a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell. Growth may suitably be observed as colony formation on solid growth media, in particular minimal medium containing glucose.
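The growth-based readout described above can be sketched in code. The following is only an illustrative sketch: the function name, the dictionary layout and the control names `empty_vector` and `acs_plasmid` are hypothetical. The positive control corresponds to the plasmid-based acs gene mentioned in the description; the empty-vector negative control is an assumption of this sketch and is not stated in the text.

```python
def identify_complementing_candidates(growth,
                                      negative_control="empty_vector",
                                      positive_control="acs_plasmid"):
    """Return the candidates that restored growth on minimal glucose medium.

    `growth` maps a transformant name to True (colony growth observed) or
    False (no growth). The control names are hypothetical labels.
    """
    for control in (negative_control, positive_control):
        if control not in growth:
            raise KeyError(f"missing control: {control}")
    # The delta acs2 background without a complementing gene should not grow.
    if growth[negative_control]:
        raise ValueError("negative control grew; plate cannot be interpreted")
    # A plasmid-based acs gene is expected to restore growth on glucose.
    if not growth[positive_control]:
        raise ValueError("positive control failed to grow; plate cannot be interpreted")
    controls = {negative_control, positive_control}
    return sorted(name for name, grew in growth.items()
                  if grew and name not in controls)


plate = {
    "empty_vector": False,
    "acs_plasmid": True,
    "candidate_A": True,
    "candidate_B": False,
}
print(identify_complementing_candidates(plate))
```

Rejecting a plate whose controls misbehave keeps a contaminated or failed plate from being read as a positive identification.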
[0089]An expression vector for the expression of heterologous polypeptides in yeast, according to the present invention may be any expression vector suitable for transforming yeast. Innumerable examples are available in the art that can suitably be used to express heterologous nucleotide sequences in yeast. A very suitable vector in aspects of the invention is a plasmid. A highly preferred plasmid is YEplac112PtdhTadh (SEQ ID NO:40).
[0090]Generally, the heterologous nucleotide sequence encoding the polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl CoA in (the cytosol of) said yeast cell, will be placed under control of a promoter functional in yeast. Preferably the promoter is a constitutive promoter. The promoter on plasmid YEplac112PtdhTadh is the TDH3 promoter.
[0091]The heterologous nucleotide sequences incorporated in the expression vector of the present invention may be any pno, acdh or other enzyme capable of converting pyruvate, acetaldehyde or acetate (respectively) into acetyl-CoA in the cytosol of the yeast. Preferred nucleotide sequences are those as identified herein, namely the nucleotide sequences encoding:
[0092]the ethanolamine utilization protein EutE from E. coli HS (nucleotide sequences with SEQ ID NO:18);
[0093]the hypothetical protein Lin1129 from Listeria innocua similar to ethanolamine utilization protein EutE (nucleotide sequences with SEQ ID NO:21);
[0094]the acetaldehyde dehydrogenase EDK33116 from Clostridium kluyveri DSM 555 (nucleotide sequences with SEQ ID NO:24);
[0095]the adhE homologue of S. aureus (nucleotide sequences with SEQ ID NO:27) encoding a bifunctional acetaldehyde/alcohol dehydrogenase in Staphylococcus aureus subsp. aureus N315, or the acetaldehyde dehydrogenase functional part thereof; and
[0096]the adhE homologue of Piromyces sp. E2 (nucleotide sequence SEQ ID NO: 51) encoding a bifunctional acetaldehyde/alcohol dehydrogenase, or the acetaldehyde dehydrogenase part thereof.
[0097]Also suitable are functional homologues of these nucleotide sequences, or of the polypeptides that they encode. With this term is meant that a nucleic acid sequence having more than 80%, 90% or 95% sequence identity with the nucleotide sequences encoding the above acdh enzymes, or having more than 50%, preferably more than 60%, 70%, 80%, 90%, or 95% sequence identity with the amino acid sequence of the above acdh enzymes, with the proviso that the polypeptides encoded by the homologous sequences exhibit functional enzymatic acdh activity.
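The identity thresholds stated above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the two sequences are assumed to be already aligned (equal length, `-` as gap character), and identity is taken as identical residues divided by aligned columns. Other conventions for the denominator exist, and the text does not prescribe one. The function names and the example sequences are hypothetical.

```python
# Minimum (least preferred) identity thresholds as stated in the text.
MIN_IDENTITY = {"protein": 50.0, "nucleotide": 80.0}


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences.

    ASSUMPTION: identity = identical residues / aligned columns, where
    columns in which both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        columns += 1
        if a == b and a != gap:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def passes_identity_threshold(identity, kind):
    """True if identity is strictly greater than the minimum for `kind`."""
    return identity > MIN_IDENTITY[kind]


# Hypothetical toy alignment, not taken from the sequence listing.
identity = percent_identity("MKTA-L", "MKSAQL")
print(round(identity, 1), passes_identity_threshold(identity, "protein"))
```

Passing the identity threshold only nominates a candidate; as the paragraph above states, the encoded polypeptide must still exhibit functional acdh activity in the assay.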
[0098]As stated above, these nucleotide sequences can be optimized for expression in Saccharomyces cerevisiae by optimization of codon pair usage well known in the art. Codon pair optimized sequences for the SEQ ID NO:18, 21, 24, and 27 are provided in SEQ ID NO:20, 23, 26, and 29, respectively.
[0099]The expression vector of the invention may be used to transform a yeast cell. Methods of transformation include electroporation, glass bead and biolistic transformation, all of which are well known in the art and for instance described in Sambrook et al., Molecular Cloning-A Laboratory Manual, 2nd ed., Vol. 1-3 (1989).
[0100]A yeast cell according to the present invention comprises a heterologous nucleotide sequence encoding a polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell. Preferably, a yeast cell of the invention comprises a heterologous acdh or pno. The advantage of such a yeast cell is that it can produce acetyl-CoA by a metabolic route wherein the PDH by-pass is not required. This is energetically more favourable under anaerobic conditions, and may form the basis of any biological synthesis process using yeast cells under anaerobic conditions where acetyl-CoA is an intermediate. In addition to comprising the heterologous acdh or pno, the yeast cell of the invention may comprise various gene deletions or gene supplementations, depending on the intended use of the yeast.
[0101]Preferably a yeast cell according to the present invention comprises an inactivation of a nucleotide sequence (gene) encoding an enzyme capable of catalysing the conversion of acetaldehyde to ethanol, preferably an alcohol dehydrogenase, for instance to optimize acetaldehyde accumulation in the yeast cell.
[0102]If used in a method of screening for heterologous enzymes according to a method of the invention, the yeast cell comprises a deletion of at least one gene of the (PDH) by-pass, selected from the genes encoding the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS), preferably acetyl-CoA synthetase, most preferably acs2.
[0103]If used in a method of producing a fermentation product, the yeast cell may optionally comprise a number of (heterologous) gene supplementations supporting the metabolic pathway from acetyl-CoA to said fermentation product. Such a pathway may consist only of heterologous gene products, or may make use of a mixture of heterologous and endogenous gene products. In the event the fermentation product is butanol, use can be made of a yeast comprising genes encoding enzymes for the butanol pathway of e.g. Clostridium acetobutylicum as described herein and in FIG. 2. In the event the yeast cell according to the present invention comprises genes encoding enzymes for butanol production, the yeast preferably comprises a nucleotide sequence encoding a butyryl-CoA dehydrogenase and at least one nucleotide sequence encoding a heterologous electron transfer flavoprotein (ETF). It was found that a yeast cell comprising an ETF in addition to genes of the butanol pathway produces an increased amount of butanol.
[0104]A heterologous electron transfer flavoprotein in the eukaryotic cell according to the present invention may be a single protein or the ETF may comprise two or more subunits, for instance an alpha and a beta subunit. Preferably the ETF comprises an ETF alpha (SEQ ID NO: 38) and an ETF beta (SEQ ID NO: 39). The electron transfer flavoprotein may be derived from any suitable origin. Preferably, the ETF is derived from the same origin as the butyryl-CoA dehydrogenase. Preferably, the ETF is derived from prokaryotic origin preferably from a Clostridium sp., preferably a Clostridium acetobutylicum or a Clostridium beijerinckii.
[0105]A method for producing a fermentation product according to the present invention preferably comprises growing a yeast under anaerobic conditions on a suitable carbon and energy source. Suitable sources of carbon and energy are C5 and C6 sugars (monosaccharides) such as glucose and polysaccharides such as starch. Other raw materials such as sugarcane, maize, wheat, barley, sugarbeets, rapeseed, and sunflower are also suitable. In some instances the raw material may be pre-digested by enzymatic treatment. Most preferably the carbon source is lignocellulose, which is composed of mainly cellulose, hemicellulose, pectin, and lignin. Lignocellulose is found, for example, in the stems, leaves, hulls, husks, and cobs of plants. Hydrolysis of these polymers by specific enzymatic treatment releases a mixture of neutral sugars including glucose, xylose, mannose, galactose, and arabinose. Lignocellulosic materials, such as wood, herbaceous material, agricultural residues, corn fiber, waste paper, pulp and paper mill residues can be used to produce butanol. Hydrolysing enzymes are for instance enzymes acting on beta-linked glucans for the hydrolysis of cellulose (these enzymes include endoglucanases, cellobiohydrolases, glucohydrolases and beta-glucosidases); beta-glucosidases, which hydrolyze cellobiose; endo-acting and exo-acting hemicellulases and cellobiases for hydrolysis of hemicellulose; and acetylesterases and esterases that hydrolyze lignin glycoside bonds. These and other methods for hydrolysis of lignocellulose are well known in the art.
[0106]Variations and modifications of the embodiments disclosed herein are possible, and practical alternatives to and equivalents of the various elements of the embodiments would be understood by those of ordinary skill in the art upon study of this patent document. These and other variations and modifications of the embodiments disclosed herein may be made without departing from the scope and spirit of the invention.
[0107]The invention will now be illustrated by way of the following non-limiting examples.
EXAMPLES
[0108]The following examples illustrate the provision of a strain of Saccharomyces cerevisiae useful in assays and methods of the present invention, for instance in methods for identifying heterologous enzymes capable of forming cytosolic acetyl-CoA in S. cerevisiae. Such methods are useful in the identification of routes/enzymes which allow the cytosolic supply of acetyl-CoA in S. cerevisiae under anaerobic conditions.
[0109]In order to enhance cytosolic acetyl-CoA formation in our butanol production strain, a selection method was set up to identify heterologous enzymes forming cytosolic acetyl-CoA in S. cerevisiae. The test system is based on a delta acs2 yeast mutant deficient in cytosolic acetyl-CoA biosynthesis on glucose; such a strain is unable to grow on glucose as sole carbon source unless cytosolic acetyl-CoA formation is complemented. Complementation studies in such a strain can reveal which heterologous enzymes are suitable for use in butanol producing strains of Saccharomyces cerevisiae.
[0110]Acetylating acetaldehyde dehydrogenase was identified to be a good candidate for cytosolic acetyl-CoA supply over the homologous PDH by-pass because no ATP is dissipated. Twelve putative acetylating acetaldehyde dehydrogenases, identified based on sequence homology, were synthesized and checked for complementation of the delta acs2 yeast.
[0111]The codon pair optimized genes of the eutE homologues of E. coli, L. innocua and C. kluyveri and the adhE homologue of S. aureus were able to complement the acs2 yeast mutants (4 out of 7), resulting in growth of the acs2Δ S. cerevisiae host. The aim is to improve butanol biosynthesis in yeast by expression of one or more genes so identified.
[0112]In order to test if these heterologous routes for cytosolic acetyl-CoA supply work in S. cerevisiae, a screening system was developed based on Saccharomyces cerevisiae mutants carrying a deletion of the acs2 gene. These cells are not able to grow on glucose as sole carbon source unless the delta acs2 mutant is complemented with a plasmid based acs gene or complemented with the expression of any other gene generating sufficient cytosolic acetyl-CoA. So if it were to be transformed with a plasmid leading to active expression of acdh or pno, such a mutant should be able to grow again with glucose as single carbon source. The complementation studies were performed on plates. The following experiments were performed to set up and evaluate the test system.
Example 1
Construction of Delta acs2 Strain
[0113]The S. cerevisiae acs2 deleted strain (acs2Δ strain) was produced by first performing a PCR on plasmid pUG6 (Guldener et al., 1996, supra) with the following oligonucleotides:
TABLE-US-00001
5'acs2::Kanlox: 5'-tacacaaacagaatacaggaaagtaaatcaatacaataataaaacagctgaagcttcgtacgc-3'
3'acs2::Kanlox: 5'-tctcattacgaaatttttctcatttaagttatttctttttttgaggcataggccactagtggatctg-3'
[0114]The resulting 1.4 kb fragment, containing the KanMX marker which confers resistance to G418, was used to transform S. cerevisiae CEN.PK113-3C (MATA trp1-289). After transformation the strain was plated on YPD (10 g/l yeast extract (BD Difco), 20 g/l peptone (BD Difco), 10 g/l glucose) with 200 mg/ml Geneticin (G418). In resistant transformants, correct integration was verified by PCR using oligonucleotides:
TABLE-US-00002
5'ACS2: 5'-gatattcggtagccgattcc-3'
3'ACS2: 5'-ccgtaaccttctcgtaatgc-3'
ACS2internal: 5'-cggattcgtcatcagcttca-3'
KanA: 5'-cgcacgtcaagactgtcaag-3'
KanB: 5'-tcgtatgtgaatgctggtcg-3'
[0115]The phenotype was verified by testing for growth on YP with 1% glucose (YPD) or 1% ethanol+1% glycerol (YPEG) as the carbon source.
[0116]One transformant that had the correct PCR bands and did not grow on YP with glucose, but did grow on YP with ethanol and glycerol as the carbon sources, was picked and named RWB060 (MATA trp1-289 acs2::Kanlox).
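The structure of the two deletion oligonucleotides used in this example can be made explicit with a short sketch. It assumes, and the text does not state, that the 3' end of each oligonucleotide anneals to the pUG6 cassette while the remaining 5' part is homologous to the region flanking ACS2 and directs integration by homologous recombination. The two tail sequences below are that assumption; the full oligonucleotide sequences are copied from TABLE-US-00001 with line-wrap spaces removed. The function name is hypothetical.

```python
# Deletion oligonucleotides from TABLE-US-00001 (line-wrap spaces removed).
ACS2_KANLOX_FWD = "tacacaaacagaatacaggaaagtaaatcaatacaataataaaacagctgaagcttcgtacgc"
ACS2_KANLOX_REV = "tctcattacgaaatttttctcatttaagttatttctttttttgaggcataggccactagtggatctg"

# ASSUMPTION: 3' portions taken to anneal to the pUG6 template; the patent
# text does not annotate which part of each oligonucleotide binds pUG6.
PUG6_FWD_TAIL = "cagctgaagcttcgtacgc"
PUG6_REV_TAIL = "gcataggccactagtggatctg"


def split_deletion_oligo(oligo, template_tail):
    """Split an oligo into (genomic homology arm, template-annealing tail)."""
    oligo = oligo.lower()
    template_tail = template_tail.lower()
    if not oligo.endswith(template_tail):
        raise ValueError("oligo does not end with the expected template tail")
    return oligo[:-len(template_tail)], template_tail


for name, oligo, tail in (
    ("5'acs2::Kanlox", ACS2_KANLOX_FWD, PUG6_FWD_TAIL),
    ("3'acs2::Kanlox", ACS2_KANLOX_REV, PUG6_REV_TAIL),
):
    arm, annealing = split_deletion_oligo(oligo, tail)
    print(name, "homology arm:", len(arm), "nt; annealing tail:", len(annealing), "nt")
```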
Example 2
In Silico Identification of Putative Acetylating Acetaldehyde Dehydrogenases for Direct Conversion of Acetaldehyde to Acetyl-CoA
[0117]Enzymes described for the conversion of acetaldehyde to acetyl-CoA are the so-called acetylating acetaldehyde dehydrogenases (ACDH) (E.C. 1.2.1.10) catalysing the following reaction:
Acetaldehyde (AA) + NAD+ + CoA <=> Acetyl-CoA + NADH + H+
[0118]From literature four types of proteins have been described that have this activity:
[0119]1) Bifunctional proteins that catalyze the reversible conversion of acetyl-CoA to acetaldehyde, and the subsequent reversible conversion of acetaldehyde to ethanol. An example of this type of proteins is the AdhE protein in E. coli (GenBank No: NP_415757). AdhE appears to be the evolutionary product of a gene fusion. The NH2-terminal region of the AdhE protein is highly homologous to aldehyde:NAD+ oxidoreductases, whereas the COOH-terminal region is homologous to a family of Fe2+ dependent ethanol:NAD+ oxidoreductases (Membrillo-Hernandez et al., (2000) J. Biol. Chem. 275: 33869-33875). The E. coli AdhE is subject to metal-catalyzed oxidation and therefore oxygen-sensitive (Tamarit et al. (1998) J. Biol. Chem. 273:3027-32).
[0120]2) Proteins that catalyze the reversible conversion of acetyl-CoA to acetaldehyde in strictly or facultatively anaerobic micro-organisms but do not possess alcohol dehydrogenase activity. An example of this type of proteins has been reported in Clostridium kluyveri (Smith et al. (1980) Arch. Biochem. Biophys. 203: 663-675). An acetylating acetaldehyde dehydrogenase has been annotated in the genome of Clostridium kluyveri DSM 555 (GenBank No: EDK33116). A homologous protein AcdH is identified in the genome of Lactobacillus plantarum (GenBank No: NP_784141). Another example of this type of proteins is the ald gene product in Clostridium beijerinckii NRRL B593 (Toth et al. (1999) Appl. Environ. Microbiol. 65: 4973-4980, GenBank No: AAD31841).
[0121]3) Proteins that are involved in ethanolamine catabolism. Ethanolamine can be utilized both as carbon and nitrogen source by many enterobacteria (Stojiljkovic et al. (1995) J. Bacteriol. 177: 1357-1366). Ethanolamine is first converted by ethanolamine ammonia lyase to ammonia and acetaldehyde; subsequently, acetaldehyde is converted by acetylating acetaldehyde dehydrogenase to acetyl-CoA. An example of this type of acetylating acetaldehyde dehydrogenase is the EutE protein in Salmonella typhimurium (Stojiljkovic et al. (1995) J. Bacteriol. 177: 1357-1366, GenBank No: AAL21357). E. coli is also able to utilize ethanolamine (Scarlett et al. (1976) J. Gen. Microbiol. 95:173-176) and has an EutE protein (GenBank No: AAG57564) which is homologous to the EutE protein in S. typhimurium.
[0122]4) Proteins that are part of a bifunctional aldolase-dehydrogenase complex involved in 4-hydroxy-2-ketovalerate catabolism. Such bifunctional enzymes catalyze the final two steps of the meta-cleavage pathway for catechol, an intermediate in many bacterial species in the degradation of phenols, toluates, naphthalene, biphenyls and other aromatic compounds (Powlowski and Shingler (1994) Biodegradation 5, 219-236). 4-Hydroxy-2-ketovalerate is first converted by 4-hydroxy-2-ketovalerate aldolase to pyruvate and acetaldehyde; subsequently, acetaldehyde is converted by acetylating acetaldehyde dehydrogenase to acetyl-CoA. An example of this type of acetylating acetaldehyde dehydrogenase is the DmpF protein in Pseudomonas sp. CF600 (GenBank No: CAA43226) (Shingler et al. (1992) J. Bacteriol. 174:711-24). E. coli has a MhpF protein (Ferrandez et al. (1997) J. Bacteriol. 179: 2573-2581, GenBank No: NP_414885) that is homologous to the DmpF protein in Pseudomonas sp. CF600.
[0123]To identify the protein family members of acetylating acetaldehyde dehydrogenase, the amino acid sequences of the E. coli bifunctional AdhE protein (GenBank No: NP_415757), L. plantarum AcdH protein (acetylating) (GenBank No: NP_784141), the E. coli EutE protein (GenBank No: AAG57564) and the E. coli MhpF protein (GenBank No: NP_414885) were each run as a query sequence in a BLASTp search against the GenBank non-redundant protein database using default parameters. Amino acid sequences with an E-value smaller than or equal to 1e-20 were extracted. Redundant sequences were removed, the remaining sequences were aligned, and a similarity tree was built using Genedata Phylosopher protein analyzer software, version 6.5.2. A similarity tree provides information on organism sequence similarity. The tree is created independently of the ClustalW algorithm by pairwise comparison of the amino acid sequences per residue position. At each position, the similarity is rated and summed up to an overall score for each sequence pair. Based on these pairwise scores a hierarchical clustering is performed, which arranges the sequences in a tree. Note that the ald gene product of C. beijerinckii (GenBank no: AAD31841) clustered together with the EutE proteins from E. coli and S. typhimurium. From this similarity tree four major branches could be defined; each branch contains one amino acid sequence that was used as a query for the BLASTp search. FIG. 4 shows an example of such a similarity tree, containing all sequences that are mentioned in this example.
[0124]At least one amino acid sequence was selected from each branch for complementation tests in S. cerevisiae delta acs2. Preferably, the biochemical function of the selected amino acid sequences as acetylating acetaldehyde dehydrogenase is supported by experimental evidence. Such evidence can be found in public databases, such as the BRENDA, UniProt and NCBI Entrez databases.
Example 3
Construction of Expression Plasmids and Complementation Test
[0125]To test whether acetylating acetaldehyde dehydrogenases (ACDH) could complement the deletion of ACS2 in S. cerevisiae, several genes coding for a (putative) ACDH were chosen from a variety of databases as described above.
[0126]To achieve optimal expression in yeast, the codon usage of all genes was adapted by codon pair optimization. These sequences were synthesized at Geneart AG (Regensburg, Germany).
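A property that any codon-optimized gene must keep is that it encodes the same protein as the wild-type gene. The sketch below checks this for the first ten codons of the E. coli eutE wild-type sequence (SEQ ID NO:18) and its optimized counterpart (SEQ ID NO:20), as given in the sequence listing. It does not perform codon pair optimization itself; the function name and the choice of a ten-codon window are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(dna):
    """Split a DNA string (spaces ignored) into upper-case codons."""
    dna = dna.upper().replace(" ", "")
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def translate(dna):
    """Translate a DNA string with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


# First 30 nt of SEQ ID NO:18 (wild type) and SEQ ID NO:20 (optimized).
WILD_TYPE = "atg aat caa cag gat att gaa cag gtg gtg"
OPTIMIZED = "atgaaccaac aagatatcga acaagttgtc"

same_protein = translate(WILD_TYPE) == translate(OPTIMIZED)
changed = sum(w != o for w, o in zip(codons(WILD_TYPE), codons(OPTIMIZED)))
print(translate(WILD_TYPE), same_protein, changed)
```

Within this window several codons differ between the two sequences while the encoded peptide is unchanged, which is the expected signature of synonymous recoding.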
[0127]The optimized sequences were cloned into the high copy yeast expression plasmid YEplac112PtdhTadh (SEQ ID NO:40; based on YEplac112 (2μ TRP1) (Gietz & Sugino [1988] Gene 74(2):527-34)), allowing constitutive expression from the TDH3 promoter.
[0128]YEplac112PtdhTadh was made by cloning a KpnI-SacI fragment from p426GPD (Mumberg et al. [1995] Gene. 156(1):119-22), containing the TDH3 promoter and CYC1 terminator, into YEplac112 cut with KpnI-SacI. The resulting plasmid was cut with KpnI and SphI, and the ends were made blunt and then ligated to give YEplac112TDH. To obtain YEplac112PtdhTadh, YEplac112TDH was cut with PstI-HindIII and ligated to a 345 bp PstI-HindIII PCR fragment containing the ADH1 terminator (Tadh), thus replacing the CYC1 terminator and changing the polylinker between the promoter and terminator. The Tadh PCR fragment was generated using the following oligonucleotides:
TABLE-US-00003
MCS-5'Tadh: 5'-aaggtacctctagactagtcccgggctgcagtcgactcgagcgaatttcttatgatttatgatt-3'
Tadh1-Hind: 5'-aggaagcttaggcctgtgtggaagaacgattacaacagg-3'
[0129]PCR was done with VentR DNA polymerase, according to the manufacturer's specifications.
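How the two oligonucleotides change the polylinker can be made visible by scanning them for restriction enzyme recognition sites. The following is a minimal sketch: the enzyme panel is an illustrative selection (KpnI, SpeI, PstI, HindIII and StuI are named in this document; XbaI, SmaI, SalI and XhoI are added here as an assumption), and only the first occurrence of each site is reported.

```python
# Commonly cited recognition sequences (5'->3') for a small, hand-picked panel.
SITES = {
    "KpnI": "GGTACC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "SmaI": "CCCGGG",
    "PstI": "CTGCAG",
    "SalI": "GTCGAC",
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
    "StuI": "AGGCCT",
}

# Primer sequences as given in the text (SEQ ID NO:8 and SEQ ID NO:9).
MCS_5_TADH = "aaggtacctctagactagtcccgggctgcagtcgactcgagcgaatttcttatgatttatgatt"
TADH1_HIND = "aggaagcttaggcctgtgtggaagaacgattacaacagg"


def find_sites(primer):
    """Return the names of panel enzymes whose site occurs, in 5'->3' order."""
    seq = primer.upper()
    hits = [(seq.find(site), name) for name, site in SITES.items() if site in seq]
    return [name for _, name in sorted(hits)]


print("MCS-5'Tadh:", find_sites(MCS_5_TADH))
print("Tadh1-Hind:", find_sites(TADH1_HIND))
```

The forward primer carries a run of adjacent and partly overlapping sites, including the PstI site used for cloning the fragment, while the reverse primer carries the HindIII site.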
[0130]The synthetic constructs containing the ACDH genes were cut with SpeI-PstI and ligated into YEplac112PtdhTadh digested with the same enzymes, resulting in pBOL058 through to pBOL068 and pBOL082. The names of the final plasmids and the genes they contain are given in Table 1.
TABLE-US-00004 TABLE 1
Table 1: Overview of putative acetylating acetaldehyde dehydrogenases tested for complementation of the delta acs2 S. cerevisiae strain. Genes which resulted in complementation are given in bold. SEQ ID NOs are provided for the DNA sequence of the wild type gene, the protein expressed therefrom, and the codon pair optimized DNA sequence.

Organism                  Name          Group*  Size (kb)  SEQ ID NO. (DNA/PRT/OPT)
Escherichia coli          adhE          1       2.6
Entamoeba histolytica     adh2          1       2.6        48/50/49
Staphylococcus aureus     adhE          1       2.6        27/28/29
Piromyces sp. E2          adhE          1       2.6        51/52
Clostridium kluyveri      EDK33116      2       1.5        24/25/26
Lactobacillus plantarum   acdH          2       1.4
Escherichia coli          EutE          3       1.4        18/19/20
Listeria innocua          Lin1129       3       1.4        21/22/23
Pseudomonas putida        YP 001268189  4       1.0

*Group refers to the group of proteins having ACDH activity as defined in Example 2. Group 1: similar to bifunctional E. coli AdhE (AdhE-type of proteins); group 2: proteins having similarity to Lactobacillus plantarum AcdH (AcdH-type of proteins); group 3: similar to E. coli EutE (EutE-type of proteins); group 4: similar to E. coli MhpF (MhpF-type of proteins).
[0131]All plasmids were used to transform the delta acs2 yeast strain RWB060. As a negative control, the empty vector YEplac112 was used. Transformants were plated on mineral medium (Verduyn et al. [1992] Yeast 8: 501-517) containing either 1% glucose (MYD) or 1% ethanol+1% glycerol (MYEG) as carbon source.
[0132]While for all constructs several transformants could be selected on minimal medium with ethanol/glycerol, this was not the case on the glucose containing plates.
TABLE-US-00005 TABLE 2
Result of a complementation experiment for putative acetylating acetaldehyde dehydrogenases in delta acs2 S. cerevisiae strain RWB060. Genes resulting in complementation are given in bold. MYEG and MYD columns indicate the number of transformants on MYEG (ethanol/glycerol) and MYD (glucose) plates.

Organism                  Gene (GenPept accession)  Plasmid    MYEG  MYD
none                                                YEplac112  75    0
Escherichia coli          adhE                      pBOL059    6     0
Entamoeba histolytica     adh2                      pBOL061    54    0
Staphylococcus aureus     adhE (BAB41363)           pBOL064    36    39
Piromyces sp. E2          adhE                      pBOL139    32    3
Clostridium kluyveri      EDK33116 (EDK33116)       pBOL065    21    8
Lactobacillus plantarum   acdH                      pBOL058    6     0
Escherichia coli          EutE (ABV06849)           pBOL066    24    18
Listeria innocua          Lin1129 (CAC96360)        pBOL067    28    8
Pseudomonas putida        YP 001268189              pBOL068    32    0
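The read-out of the complementation test can be summarized in a few lines. The sketch below encodes the colony counts of Table 2 and flags a plasmid as complementing when at least one transformant grew on glucose (MYD); that threshold is an assumption made here for illustration, and the organism labels for pBOL064 to pBOL067 and pBOL139 follow the assignments given in the accompanying text.

```python
# (gene label, plasmid, colonies on MYEG, colonies on MYD), from Table 2.
TABLE_2 = [
    ("empty vector", "YEplac112", 75, 0),
    ("E. coli adhE", "pBOL059", 6, 0),
    ("E. histolytica adh2", "pBOL061", 54, 0),
    ("S. aureus adhE", "pBOL064", 36, 39),
    ("Piromyces sp. E2 adhE", "pBOL139", 32, 3),
    ("C. kluyveri EDK33116", "pBOL065", 21, 8),
    ("L. plantarum acdH", "pBOL058", 6, 0),
    ("E. coli EutE", "pBOL066", 24, 18),
    ("L. innocua Lin1129", "pBOL067", 28, 8),
    ("P. putida YP 001268189", "pBOL068", 32, 0),
]


def complementing(rows, min_colonies=1):
    """Plasmids with at least `min_colonies` transformants on glucose (MYD)."""
    return [plasmid for _, plasmid, _, myd in rows if myd >= min_colonies]


positives = complementing(TABLE_2)
candidates = [row for row in TABLE_2 if row[1] != "YEplac112"]
print(f"{len(positives)} of {len(candidates)} candidate genes complement: {positives}")
```

Every construct yielded colonies on ethanol/glycerol, so the MYEG column serves as a transformation control and the MYD column carries the actual signal.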
[0133]On the glucose containing plates, transformants could only be selected for plasmids pBOL064, pBOL065, pBOL066, and pBOL067, not the empty vector. There was also a clear difference in colony size, depending on the plasmid used. While construct pBOL066 (E. coli eutE) resulted in the largest colonies, colonies of pBOL067 (L. innocua lin1129) appeared a bit smaller and pBOL065 (C. kluyveri EDK33116) showed the smallest colonies. Transformations with plasmid pBOL064 (S. aureus adhE) and plasmid pBOL139 (Piromyces sp. E2 adhE) were done at a later date, so they could not be compared directly. Colonies containing pBOL064 seemed to be similar to colonies comprising pBOL066, and colonies comprising pBOL139 seemed to be similar to colonies comprising pBOL065.
[0134]To ensure that these results did not arise from spontaneous revertants, transformation experiments were repeated for some of the plasmids, giving the same results. In addition, for almost all plasmids four transformants were selected at random from the MYEG plates and restreaked onto MYD and MYEG plates.
[0135]In all experiments no growth was ever seen on glucose with the empty vector (YEplac112), while only pBOL065, pBOL066 and pBOL067 repeatedly gave good growth on glucose. Plasmid pBOL064 was not re-tested this way after the initial very positive result.
[0136]From these results, it was concluded that the codon pair optimized genes of the eutE homologues of:
[0137]E. coli (SEQ ID NO:20) encoding the ethanolamine utilization protein EutE from E. coli HS;
[0138]L. innocua (SEQ ID NO:23) encoding a hypothetical protein from L. innocua similar to ethanolamine utilization protein EutE, and
[0139]C. kluyveri (SEQ ID NO:26) encoding acetylating acetaldehyde dehydrogenase in Clostridium kluyveri DSM 555;
and the codon pair optimized gene of the adhE homologue of
[0140]S. aureus (SEQ ID NO:29) encoding a bifunctional acetaldehyde/alcohol dehydrogenase in Staphylococcus aureus subsp. aureus N315;
and the non codon pair optimized gene of the adhE homologue of
[0141]Piromyces sp. E2 (SEQ ID NO:51) encoding a bifunctional acetaldehyde/alcohol dehydrogenase
are able to complement the acs2 yeast mutants. These genes encode an enzymatic activity allowing the formation of cytosolic acetyl-CoA from acetaldehyde in yeast.
Conclusions
[0142]The supply of cytosolic acetyl-CoA is believed to be a bottleneck in the butanol production in yeast. In order to identify heterologous genes encoding for enzymes forming cytosolic acetyl-CoA in S. cerevisiae a test system based on a delta acs2 yeast mutant was established.
[0143]Due to its deficiency in cytosolic acetyl-CoA biosynthesis on glucose, the acs2Δ strain is unable to grow with glucose as sole carbon source.
[0144]Nine putative acetylating acetaldehyde dehydrogenases identified as candidates for cytosolic acetyl-CoA supply from acetaldehyde were expressed in the acs2Δ yeast. In total, five of these nine genes complemented growth of the acs2Δ strain with glucose as single carbon source. This demonstrated the use of the delta acs2 strain as a pre-selection tool for feasible routes for cytosolic supply of acetyl-CoA.
[0145]Four of the five acetylating acetaldehyde dehydrogenases identified thus far (the eutE homologues of E. coli, L. innocua and C. kluyveri and the adhE homologue of S. aureus, but not the adhE homologue of Piromyces sp. E2) were successfully integrated in butanol producing strains of S. cerevisiae. The effect on butanol production was investigated as described in the Examples below.
[0146]This test system may also be used to analyse whether pyruvate:NADP oxidoreductase can successfully be over-expressed in yeast. Because of the oxygen sensitivity of this enzyme, the test has to be performed anaerobically.
[0147]Examples 4-6 below describe the testing of 4 of the 5 selected ACDH genes from Example 3 for improvement of butanol production.
Example 4
Construction of a Butanol Producing Yeast Strain and Knocking Out the ADH1 and ADH2 Genes
[0148]The six Clostridium acetobutylicum genes involved in butanol biosynthesis from Acetyl-CoA are listed in Table 3. The genes were codon pair optimized for S. cerevisiae as described in WO2008/000632 and expressed from yeast promoters and terminators as listed in Table 3.
[0149]Two yeast integration vectors (pBOL34 [SEQ ID NO:41] and pBOL36 [SEQ ID NO:42]), each containing 3 of the six codon pair optimised genes from Clostridium acetobutylicum involved in butanol biosynthesis, were designed and synthesized at Geneart.
[0150]The genes ThiL, Hbd and Crt were expressed from pBOL34, which contains an AmdS selection marker. The final three genes, Bcd, BdhB and AdhE, were expressed from an integration vector with an AmdS selection marker named pBOL36.
TABLE-US-00006 TABLE 3
Genes used for butanol production in S. cerevisiae including the promoter (1000 bp) and terminator (500 bp)

Gene   Activity                                                               Promoter  Terminator
ThiL   acetyl CoA c-acetyltransferase [E.C. 2.3.1.9]                          ADH1      TDH1
Hbd    3-hydroxybutyryl-CoA dehydrogenase [E.C. 1.1.1.157]                    ENO1      PMA1
Crt    3-hydroxybutyryl-CoA dehydratase [E.C. 4.2.1.55]                       TDH1      ADH1
Bcd    butyryl-CoA dehydrogenase [E.C. 1.3.99.2]                              PDC1      TDH1
BdhB   NADH-dependent butanol dehydrogenase [E.C. 1.1.1.--]                   ENO1      PMA1
adhE   alcohol/acetaldehyde CoA dehydrogenase [E.C. 1.1.1.1/1.2.1.10]         TDH1      ADH2
[0151]For integration in the ADH2 locus, pBOL36 was linearized by a BsaBI digestion. S. cerevisiae CEN.PK113-5D (MATa MAL2-8c SUC2 ura3-52) was transformed with the linear fragment and grown on plates with YCB (Difco) and 5 mM acetamide as nitrogen source.
[0152]The AmdS marker was removed by recombination by growing the transformants for 6 hours in YEPD in 2 ml tubes at 30° C. Cells were subsequently plated on 1.8% agar medium containing YCB (Difco) and 40 mM fluoracetamide and 30 mM phosphate buffer pH 6.8 supporting growth only from cells that have lost the AmdS marker. Correct integration and recombination were confirmed by PCR. The correct integration of the fragment upstream was confirmed with the following primers:
TABLE-US-00007
P1: 5'-GAATTGAAGGATATCTACATCAAG-3' and
P2: 5'-CCCATCTACGGAACCCTGATCAAGC-3'.
[0153]The correct integration of the fragment downstream was confirmed with the following primers:
TABLE-US-00008
P3: 5'-GATGGTGTCACCATTACCAGGTCTAG-3' and
P4: 5'-GTTCTCTGGTCAAGTTGAAGTCCATTTTGATTGATTTGACTGTGTTATTTTGCGTG-3'.
[0154]The resulting strain was named BLT021.
[0155]pBOL34 was linearized by a PsiI digestion and integrated in the ADH1 locus of BLT021. The transformants were grown on plates containing YCB (Difco) and 5 mM acetamide. For removal of the AmdS selection marker, colonies were inoculated in YEPD and grown for 6 hours in 2 ml tubes at 30° C. The cells were plated on YCB (Difco) and 40 mM fluoracetamide and 0.1% ammonium sulphate.
[0156]Correct integration and recombination were confirmed by PCR. The correct integration of the fragment upstream was confirmed with the following primer set:
TABLE-US-00009
P5: 5'-GAACAATAGAGCGACCATGACCTTG-3' and
P6: 5'-GACATCAGCGTCACCAGCCTTGATG-3'.
[0157]The correct integration of the fragment downstream was confirmed with the following primer set:
TABLE-US-00010
P7: 5'-GATTGAAGGTTTCAAGAACAGGTGATG-3' and
P8: 5'-GGCGATCAGAGTTGAAAAAAAAATG-3'.
[0158]The resulting strain was named BLT057.
Example 5
Introducing ETFα and ETFβ in BLT057
[0159]The ETF genes and the Acdh genes as listed in Table 4 were codon pair optimized for S. cerevisiae as described in WO2008/000632 and expressed from yeast promoters and terminators as listed in Table 4.
TABLE-US-00011 TABLE 4
Promoters and terminators used for expression of codon pair optimized ETF genes and Acdh genes in S. cerevisiae

Gene                           Promoter  Terminator
Etfα (CpO)                     tef1      tdh2
Etfβ (CpO)                     tdh2      tef1
Acdh64 (AdhE S. aureus)        tdh3      adh
Acdh65 (Clostridium)           tdh3      adh
Acdh66 (EutE E. coli)          tdh3      adh
Acdh67 (Lin1129 L. innocua)    tdh3      adh
[0160]The integration vectors expressing ETFα and ETFβ only (pBOL113, [SEQ ID NO:43]) or ETFα and ETFβ combined with Acdh64 (pBOL115, [SEQ ID NO:44]), Acdh65 (pBOL116, [SEQ ID NO:45]), Acdh66 (pBOL118, [SEQ ID NO:46]) or Acdh67 (pBOL120, [SEQ ID NO:47]) were synthesized by Geneart AG.
[0161]The vectors, pBOL113, pBOL115, pBOL116, pBOL118 and pBOL120, were linearized with StuI and integrated in the ura3-52 locus of strain BLT057.
[0162]The transformants were grown in YNB (Difco) w/o amino acids+2% galactose to select for uracil prototrophic strains. The strains derived from strain BLT057 with pBOL113/115/116/118/120 integrated in the genome were designated strains: BLT071, BLT072, BLT073, BLT074 and BLT075, respectively.
Example 6
Improved Butanol Production by Expressing Positive Acdh Genes
[0163]Strains BLT071 through BLT075 as prepared in Example 5 were grown in Verduyn medium (Verduyn et al. (1992) Yeast 8: 501-517) in which the ammonium sulphate is replaced by 2 g/l urea and which further contains 4 wt. % galactose. Cells were grown in 100 ml shake flasks containing 50 ml of medium for 72 hours at 30° C. at 180 rpm in a rotary shaker.
[0164]The butanol concentration was determined in the supernatant of the culture. Samples were analysed on a HS-GC equipped with a flame ionisation detector and an automatic injection system. The column was a J&W DB-1 (length 30 m, id 0.53 mm, df 5 μm). The following conditions were used: helium as carrier gas with a flow rate of 5 ml/min. Column temperature was set at 110° C. The injector was set at 140° C. and the detector performed at 300° C. The data was obtained using Chromeleon software. Samples were heated at 60° C. for 20 min in the headspace sampler. One (1) ml of the headspace volatiles was automatically injected on the column.
[0165]1-Butanol production of the various strains was as follows:
[0166]BLT057: 120 mg/l
[0167]BLT071: 450 mg/l
[0168]BLT072: 500 mg/l
[0169]BLT073: 600 mg/l
[0170]BLT074: 670 mg/l
[0171]BLT075: 700 mg/l
[0172]The results show that introduction of electron transfer flavoproteins (ETF alpha and ETF beta) and/or introduction of acetylating acetaldehyde dehydrogenases, as identified by the complementation assay of Example 3, increases the butanol production level.
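The size of the effect can be read more easily when the titres are expressed relative to the parent strain and to the ETF-only strain. The following sketch does that arithmetic on the reported values; the strain-to-construct mapping follows Example 5, and the variable names are illustrative.

```python
# 1-Butanol titres in mg/l as reported in paragraphs [0166] to [0171].
titres = {
    "BLT057": 120,  # parent strain, butanol pathway only
    "BLT071": 450,  # + ETF alpha/beta (pBOL113)
    "BLT072": 500,  # + ETF + Acdh64 (pBOL115)
    "BLT073": 600,  # + ETF + Acdh65 (pBOL116)
    "BLT074": 670,  # + ETF + Acdh66 (pBOL118)
    "BLT075": 700,  # + ETF + Acdh67 (pBOL120)
}
PARENT, ETF_ONLY = "BLT057", "BLT071"

# Fold change of each strain relative to the parent strain.
fold_over_parent = {s: t / titres[PARENT] for s, t in titres.items()}

# Additional titre attributable to the Acdh gene on top of ETF alone.
gain_over_etf = {s: t - titres[ETF_ONLY]
                 for s, t in titres.items() if s not in (PARENT, ETF_ONLY)}

for strain in titres:
    extra = gain_over_etf.get(strain)
    note = f", +{extra} mg/l over ETF only" if extra is not None else ""
    print(f"{strain}: {titres[strain]} mg/l, {fold_over_parent[strain]:.2f}x parent{note}")
```

On these figures the ETF genes alone account for most of the increase over BLT057, and each of the four Acdh genes adds a further increment on top of the ETF-only strain.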
Sequence CWU
1
Number of SEQ ID NOs: 52
SEQ ID NO 1: 64 nt, DNA, Artificial, primer 5'acs2
atacacaaac agaatacagg aaagtaaatc aatacaataa taaaacagct gaagcttcgt 60
acgc 64
SEQ ID NO 2: 67 nt, DNA, Artificial, primer 3'acs2
tctcattacg aaatttttct catttaagtt atttcttttt ttgaggcata ggccactagt 60
ggatctg 67
SEQ ID NO 3: 20 nt, DNA, Artificial, probe 5'acs2
gatattcggt agccgattcc 20
SEQ ID NO 4: 20 nt, DNA, Artificial, probe 3'acs2
ccgtaacctt ctcgtaatgc 20
SEQ ID NO 5: 20 nt, DNA, Artificial, probe ACS2internal
cggattcgtc atcagcttca 20
SEQ ID NO 6: 20 nt, DNA, Artificial, probe KanA
cgcacgtcaa gactgtcaag 20
SEQ ID NO 7: 20 nt, DNA, Artificial, probe KanB
tcgtatgtga atgctggtcg 20
SEQ ID NO 8: 64 nt, DNA, Artificial, primer MCS-5'Tadh
aaggtacctc tagactagtc ccgggctgca gtcgactcga gcgaatttct tatgatttat 60
gatt 64
SEQ ID NO 9: 39 nt, DNA, Artificial, primer Tadh1-Hind
aggaagctta ggcctgtgtg gaagaacgat tacaacagg 39
SEQ ID NO 10: 24 nt, DNA, Artificial, primer P1
gaattgaagg atatctacat caag 24
SEQ ID NO 11: 25 nt, DNA, Artificial, primer P2
cccatctacg gaaccctgat caagc 25
SEQ ID NO 12: 26 nt, DNA, Artificial, primer P3
gatggtgtca ccattaccag gtctag 26
SEQ ID NO 13: 56 nt, DNA, Artificial, Primer P4
gttctctggt caagttgaag tccattttga ttgatttgac tgtgttattt tgcgtg 56
SEQ ID NO 14: 25 nt, DNA, Artificial, Primer P5
gaacaataga gcgaccatga ccttg 25
SEQ ID NO 15: 25 nt, DNA, Artificial, Primer P6
gacatcagcg tcaccagcct tgatg 25
SEQ ID NO 16: 27 nt, DNA, Artificial, Primer P7
gattgaaggt ttcaagaaca ggtgatg 27
SEQ ID NO 17: 25 nt, DNA, Artificial, Primer P8
ggcgatcaga gttgaaaaaa aaatg 25
SEQ ID NO 18: 1404 nt, DNA, Escherichia coli, CDS (1)..(1404)
18atg aat caa cag gat att gaa cag gtg gtg aaa gcg gta ctg ctg aaa
48Met Asn Gln Gln Asp Ile Glu Gln Val Val Lys Ala Val Leu Leu Lys1
5 10 15atg caa agc agt gac acg
ccg tcc gcc gcc gtt cat gag atg ggc gtt 96Met Gln Ser Ser Asp Thr
Pro Ser Ala Ala Val His Glu Met Gly Val 20 25
30ttc gcg tcc ctg gat gac gcc gtt gcg gca gcc aaa gtc
gcc cag caa 144Phe Ala Ser Leu Asp Asp Ala Val Ala Ala Ala Lys Val
Ala Gln Gln 35 40 45ggg tta aaa
agc gtg gca atg cgc cag tta gcc att gct gcc att cgt 192Gly Leu Lys
Ser Val Ala Met Arg Gln Leu Ala Ile Ala Ala Ile Arg 50
55 60gaa gca ggc gaa aaa cac gcc aga gat tta gcg gaa
ctt gcc gtc agt 240Glu Ala Gly Glu Lys His Ala Arg Asp Leu Ala Glu
Leu Ala Val Ser65 70 75
80gaa acc ggc atg ggg cgc gtt gaa gat aaa ttt gca aaa aac gtc gct
288Glu Thr Gly Met Gly Arg Val Glu Asp Lys Phe Ala Lys Asn Val Ala
85 90 95cag gcg cgc ggc aca cca
ggc gtt gag tgc ctc tct ccg caa gtg ctg 336Gln Ala Arg Gly Thr Pro
Gly Val Glu Cys Leu Ser Pro Gln Val Leu 100
105 110act ggc gac aac ggc ctg acc cta att gaa aac gca
ccc tgg ggc gtg 384Thr Gly Asp Asn Gly Leu Thr Leu Ile Glu Asn Ala
Pro Trp Gly Val 115 120 125gtg gct
tcg gtg acg cct tcc act aac ccg gcg gca acc gta att aac 432Val Ala
Ser Val Thr Pro Ser Thr Asn Pro Ala Ala Thr Val Ile Asn 130
135 140aac gcc atc agc ctg att gcc gcg ggc aac agc
gtc att ttt gcc ccg 480Asn Ala Ile Ser Leu Ile Ala Ala Gly Asn Ser
Val Ile Phe Ala Pro145 150 155
160cat ccg gcg gcg aaa aaa gtc tcc cag cgg gcg att acg ctg ctc aac
528His Pro Ala Ala Lys Lys Val Ser Gln Arg Ala Ile Thr Leu Leu Asn
165 170 175cag gcg att gtt gcc
gca ggt ggg ccg gaa aac tta ctg gtt act gtg 576Gln Ala Ile Val Ala
Ala Gly Gly Pro Glu Asn Leu Leu Val Thr Val 180
185 190gca aat ccg gat atc gaa acc gcg caa cgc ttg ttc
aag ttt ccg ggt 624Ala Asn Pro Asp Ile Glu Thr Ala Gln Arg Leu Phe
Lys Phe Pro Gly 195 200 205atc ggc
ctg ctg gtg gta acc ggc ggc gaa gcg gta gta gaa gcg gcg 672Ile Gly
Leu Leu Val Val Thr Gly Gly Glu Ala Val Val Glu Ala Ala 210
215 220cgt aaa cac acc aat aaa cgt ctg att gcc gca
ggc gct ggc aac ccg 720Arg Lys His Thr Asn Lys Arg Leu Ile Ala Ala
Gly Ala Gly Asn Pro225 230 235
240ccg gta gtg gtg gat gaa acc gcc gac ctc gcc cgt gcc gct cag tcc
768Pro Val Val Val Asp Glu Thr Ala Asp Leu Ala Arg Ala Ala Gln Ser
245 250 255atc gtc aaa ggc gct
tct ttc gat aac aac atc att tgt gcc gac gaa 816Ile Val Lys Gly Ala
Ser Phe Asp Asn Asn Ile Ile Cys Ala Asp Glu 260
265 270aag gta ctg att gtt gtt gat agc gta gcc gat gaa
ctg atg cgt ctg 864Lys Val Leu Ile Val Val Asp Ser Val Ala Asp Glu
Leu Met Arg Leu 275 280 285atg gaa
ggc cag cac gcg gtg aaa ctg acc gca gaa cag gcg cag cag 912Met Glu
Gly Gln His Ala Val Lys Leu Thr Ala Glu Gln Ala Gln Gln 290
295 300ctg caa ccg gtg ttg ctg aaa aat atc gac gag
cgc gga aaa ggc acc 960Leu Gln Pro Val Leu Leu Lys Asn Ile Asp Glu
Arg Gly Lys Gly Thr305 310 315
320gtc agc cgt gac tgg gtt ggt cgc gac gca ggc aaa atc gcg gcg gca
1008Val Ser Arg Asp Trp Val Gly Arg Asp Ala Gly Lys Ile Ala Ala Ala
325 330 335atc ggc ctt aaa gtt
ccg caa gaa acg cgc ctg ctg ttt gtg gaa acc 1056Ile Gly Leu Lys Val
Pro Gln Glu Thr Arg Leu Leu Phe Val Glu Thr 340
345 350acc gca gaa cat ccg ttt gcc gtg act gaa ctg atg
atg ccg gtg ttg 1104Thr Ala Glu His Pro Phe Ala Val Thr Glu Leu Met
Met Pro Val Leu 355 360 365ccc gtc
gtg cgc gtc gcc aac gtg gcg gat gcc att gcg cta gcg gtg 1152Pro Val
Val Arg Val Ala Asn Val Ala Asp Ala Ile Ala Leu Ala Val 370
375 380aaa ctg gaa ggc ggt tgc cac cac acg gcg gca
atg cac tcg cgc aac 1200Lys Leu Glu Gly Gly Cys His His Thr Ala Ala
Met His Ser Arg Asn385 390 395
400atc gaa aac atg aac cag atg gcg aat gct att gat acc agc att ttc
1248Ile Glu Asn Met Asn Gln Met Ala Asn Ala Ile Asp Thr Ser Ile Phe
405 410 415gtt aag aac gga ccg
tgc att gcc ggg ctg ggg ctg ggc ggg gaa ggc 1296Val Lys Asn Gly Pro
Cys Ile Ala Gly Leu Gly Leu Gly Gly Glu Gly 420
425 430tgg acc acc atg acc atc acc acg cca acc ggt gaa
ggg gta acc agc 1344Trp Thr Thr Met Thr Ile Thr Thr Pro Thr Gly Glu
Gly Val Thr Ser 435 440 445gcg cgt
acg ttt gtc cgt ctg cgt cgc tgt gta tta gtc gat gcg ttt 1392Ala Arg
Thr Phe Val Arg Leu Arg Arg Cys Val Leu Val Asp Ala Phe 450
455 460cgc att gtt taa
1404Arg Ile Val46519467PRTEscherichia coli 19Met
Asn Gln Gln Asp Ile Glu Gln Val Val Lys Ala Val Leu Leu Lys1
5 10 15Met Gln Ser Ser Asp Thr Pro
Ser Ala Ala Val His Glu Met Gly Val 20 25
30Phe Ala Ser Leu Asp Asp Ala Val Ala Ala Ala Lys Val Ala
Gln Gln 35 40 45Gly Leu Lys Ser
Val Ala Met Arg Gln Leu Ala Ile Ala Ala Ile Arg 50 55
60Glu Ala Gly Glu Lys His Ala Arg Asp Leu Ala Glu Leu
Ala Val Ser65 70 75
80Glu Thr Gly Met Gly Arg Val Glu Asp Lys Phe Ala Lys Asn Val Ala
85 90 95Gln Ala Arg Gly Thr Pro
Gly Val Glu Cys Leu Ser Pro Gln Val Leu 100
105 110Thr Gly Asp Asn Gly Leu Thr Leu Ile Glu Asn Ala
Pro Trp Gly Val 115 120 125Val Ala
Ser Val Thr Pro Ser Thr Asn Pro Ala Ala Thr Val Ile Asn 130
135 140Asn Ala Ile Ser Leu Ile Ala Ala Gly Asn Ser
Val Ile Phe Ala Pro145 150 155
160His Pro Ala Ala Lys Lys Val Ser Gln Arg Ala Ile Thr Leu Leu Asn
165 170 175Gln Ala Ile Val
Ala Ala Gly Gly Pro Glu Asn Leu Leu Val Thr Val 180
185 190Ala Asn Pro Asp Ile Glu Thr Ala Gln Arg Leu
Phe Lys Phe Pro Gly 195 200 205Ile
Gly Leu Leu Val Val Thr Gly Gly Glu Ala Val Val Glu Ala Ala 210
215 220Arg Lys His Thr Asn Lys Arg Leu Ile Ala
Ala Gly Ala Gly Asn Pro225 230 235
240Pro Val Val Val Asp Glu Thr Ala Asp Leu Ala Arg Ala Ala Gln
Ser 245 250 255Ile Val Lys
Gly Ala Ser Phe Asp Asn Asn Ile Ile Cys Ala Asp Glu 260
265 270Lys Val Leu Ile Val Val Asp Ser Val Ala
Asp Glu Leu Met Arg Leu 275 280
285Met Glu Gly Gln His Ala Val Lys Leu Thr Ala Glu Gln Ala Gln Gln 290
295 300Leu Gln Pro Val Leu Leu Lys Asn
Ile Asp Glu Arg Gly Lys Gly Thr305 310
315 320Val Ser Arg Asp Trp Val Gly Arg Asp Ala Gly Lys
Ile Ala Ala Ala 325 330
335Ile Gly Leu Lys Val Pro Gln Glu Thr Arg Leu Leu Phe Val Glu Thr
340 345 350Thr Ala Glu His Pro Phe
Ala Val Thr Glu Leu Met Met Pro Val Leu 355 360
365Pro Val Val Arg Val Ala Asn Val Ala Asp Ala Ile Ala Leu
Ala Val 370 375 380Lys Leu Glu Gly Gly
Cys His His Thr Ala Ala Met His Ser Arg Asn385 390
395 400Ile Glu Asn Met Asn Gln Met Ala Asn Ala
Ile Asp Thr Ser Ile Phe 405 410
415Val Lys Asn Gly Pro Cys Ile Ala Gly Leu Gly Leu Gly Gly Glu Gly
420 425 430Trp Thr Thr Met Thr
Ile Thr Thr Pro Thr Gly Glu Gly Val Thr Ser 435
440 445Ala Arg Thr Phe Val Arg Leu Arg Arg Cys Val Leu
Val Asp Ala Phe 450 455 460Arg Ile
Val465201401DNAArtificialoptimised sequence 20atgaaccaac aagatatcga
acaagttgtc aaggctgtct tgttgaaaat gcaatcttct 60gacactccat ctgctgctgt
ccacgaaatg ggtgttttcg cttctttgga cgacgctgtt 120gctgctgcca aggttgctca
acaaggtttg aaatctgttg ccatgagaca attggccatt 180gctgccatca gagaagctgg
tgaaaagcat gccagagact tggctgaatt ggctgtctcc 240gaaaccggta tgggtagagt
tgaagacaaa ttcgctaaga acgttgctca agctagaggt 300actccaggtg tcgaatgttt
gtctccacaa gtcttgaccg gtgataatgg tttgactttg 360attgaaaatg ctccatgggg
tgttgttgct tccgtcaccc catctaccaa cccagctgct 420actgtcatca acaacgccat
ctctttgatt gctgctggta actccgttat cttcgctcca 480cacccagctg ccaagaaggt
ttctcaaaga gccatcactc tattgaacca agccattgtt 540gctgctggtg gtccagaaaa
cttgttggtc actgttgcca acccagatat cgaaactgct 600caaagattat tcaagttccc
aggtatcggt ctattagtcg tcactggtgg tgaagctgtt 660gttgaagctg ccagaaagca
caccaacaag agattgattg ctgctggtgc tggtaaccct 720cctgttgttg tcgatgaaac
cgctgatttg gccagagctg ctcaatccat tgtcaagggt 780gcttctttcg acaacaacat
catctgtgct gacgaaaagg ttttgattgt tgttgactcc 840gttgctgacg aattgatgag
attgatggaa ggtcaacatg ccgtcaagtt gactgctgaa 900caagctcaac aattgcaacc
agttttgttg aagaacatcg atgaaagagg taagggtacc 960gtctccagag actgggttgg
tagagatgct ggtaagattg ctgctgccat cggtttgaag 1020gttccacaag aaaccagatt
attattcgtc gaaaccaccg ctgaacaccc atttgctgtc 1080actgaattga tgatgccagt
cttaccagtt gtccgtgttg ctaacgttgc tgacgctatt 1140gctttggctg tcaaattgga
aggtggttgt caccacactg ctgccatgca ctccagaaac 1200atcgaaaaca tgaaccaaat
ggctaacgcc attgacactt ccatctttgt caagaacggt 1260ccatgtatcg ctggtttggg
tttgggtggt gaaggttgga ccaccatgac catcaccacc 1320ccaactggtg aaggtgtcac
ttctgccaga actttcgtca gattacgtcg ttgtgttttg 1380gtcgatgctt tcagaattgt t
1401211410DNAListeria
innocuaCDS(1)..(1410) 21atg gaa tca tta gaa ctc gaa caa ctg gta aaa aaa
gtt ctc tta gaa 48Met Glu Ser Leu Glu Leu Glu Gln Leu Val Lys Lys
Val Leu Leu Glu1 5 10
15aaa tta gca gaa caa aaa gaa gta cca aca aaa aca act aca caa ggc
96Lys Leu Ala Glu Gln Lys Glu Val Pro Thr Lys Thr Thr Thr Gln Gly
20 25 30gcg aaa agt ggc gtt ttt gat
aca gtt gac gag gct gtt caa gca gca 144Ala Lys Ser Gly Val Phe Asp
Thr Val Asp Glu Ala Val Gln Ala Ala 35 40
45gtt ata gcg cag aat tgc tat aaa gaa aaa tca ctt gaa gaa cgc
cgc 192Val Ile Ala Gln Asn Cys Tyr Lys Glu Lys Ser Leu Glu Glu Arg
Arg 50 55 60aat gtt gta aaa gca att
cgt gaa gca ctt tat cca gaa att gaa aca 240Asn Val Val Lys Ala Ile
Arg Glu Ala Leu Tyr Pro Glu Ile Glu Thr65 70
75 80att gcg aca aga gca gtt gca gag act ggt atg
gga aat gtg aca gat 288Ile Ala Thr Arg Ala Val Ala Glu Thr Gly Met
Gly Asn Val Thr Asp 85 90
95aaa att ttg aaa aac acg tta gca atc gaa aaa acg cca ggg gta gaa
336Lys Ile Leu Lys Asn Thr Leu Ala Ile Glu Lys Thr Pro Gly Val Glu
100 105 110gat tta tat aca gaa gta
gct aca ggt gat aac ggt atg aca cta tat 384Asp Leu Tyr Thr Glu Val
Ala Thr Gly Asp Asn Gly Met Thr Leu Tyr 115 120
125gaa ctc tct ccg tat ggc gta att ggt gca gta gcg ccg agc
aca aac 432Glu Leu Ser Pro Tyr Gly Val Ile Gly Ala Val Ala Pro Ser
Thr Asn 130 135 140cca acg gaa aca ttg
att tgt aat tca atc ggt atg ctc gca gct gga 480Pro Thr Glu Thr Leu
Ile Cys Asn Ser Ile Gly Met Leu Ala Ala Gly145 150
155 160aat gcc gtt ttt tat agc cct cat cca ggg
gca aaa aac att tca ctg 528Asn Ala Val Phe Tyr Ser Pro His Pro Gly
Ala Lys Asn Ile Ser Leu 165 170
175tgg ttg att gaa aaa cta aac aca att gtt cgc gat agt tgt ggt ata
576Trp Leu Ile Glu Lys Leu Asn Thr Ile Val Arg Asp Ser Cys Gly Ile
180 185 190gat aat cta att gtc acc
gtg gct aaa cca tcc atc caa gca gct caa 624Asp Asn Leu Ile Val Thr
Val Ala Lys Pro Ser Ile Gln Ala Ala Gln 195 200
205gaa atg atg aac cat cca aaa gta ccg cta ctt gtt att aca
ggt ggt 672Glu Met Met Asn His Pro Lys Val Pro Leu Leu Val Ile Thr
Gly Gly 210 215 220ccg ggc gtt gtt ctc
caa gcg atg caa tca ggt aaa aaa gtg att gga 720Pro Gly Val Val Leu
Gln Ala Met Gln Ser Gly Lys Lys Val Ile Gly225 230
235 240gca gga gca ggg aac ccg cct tct att gtt
gac gaa aca gct aat atc 768Ala Gly Ala Gly Asn Pro Pro Ser Ile Val
Asp Glu Thr Ala Asn Ile 245 250
255gaa aaa gcg gct gct gac atc gta gac gga gca tct ttt gac cat aat
816Glu Lys Ala Ala Ala Asp Ile Val Asp Gly Ala Ser Phe Asp His Asn
260 265 270att tta tgt att gct gaa
aaa agt gtg gta gct gtt gat agc att gct 864Ile Leu Cys Ile Ala Glu
Lys Ser Val Val Ala Val Asp Ser Ile Ala 275 280
285gat ttc ttg tta ttc caa atg gaa aaa aat ggt gcc ctt cat
gtt act 912Asp Phe Leu Leu Phe Gln Met Glu Lys Asn Gly Ala Leu His
Val Thr 290 295 300aat cca agt gat att
caa aaa tta gaa aaa gta gcc gtt acc gat aaa 960Asn Pro Ser Asp Ile
Gln Lys Leu Glu Lys Val Ala Val Thr Asp Lys305 310
315 320ggt gta act aat aaa aaa tta gtc gga aaa
agt gca act gaa atc tta 1008Gly Val Thr Asn Lys Lys Leu Val Gly Lys
Ser Ala Thr Glu Ile Leu 325 330
335aaa gaa gca gga ata gct tgt gat ttt aca cca cgt tta atc att gtg
1056Lys Glu Ala Gly Ile Ala Cys Asp Phe Thr Pro Arg Leu Ile Ile Val
340 345 350gaa acg gag aaa tct cat
cca ttt gca aca gta gag cta tta atg cca 1104Glu Thr Glu Lys Ser His
Pro Phe Ala Thr Val Glu Leu Leu Met Pro 355 360
365atc gtt cca gtt gta agg gtg cct gat ttt gac gaa gcc ctt
gaa gtg 1152Ile Val Pro Val Val Arg Val Pro Asp Phe Asp Glu Ala Leu
Glu Val 370 375 380gct att gaa ctc gaa
caa ggc tta cat cat aca gca aca atg cat tca 1200Ala Ile Glu Leu Glu
Gln Gly Leu His His Thr Ala Thr Met His Ser385 390
395 400caa aat atc tcg aga tta aac aaa gct gca
aga gat atg caa act tcc 1248Gln Asn Ile Ser Arg Leu Asn Lys Ala Ala
Arg Asp Met Gln Thr Ser 405 410
415atc ttt gtc aaa aat ggt ccg tcc ttt gcg gga tta ggc ttt aga gga
1296Ile Phe Val Lys Asn Gly Pro Ser Phe Ala Gly Leu Gly Phe Arg Gly
420 425 430gaa ggt agt act act ttc
act att gca acg cct act gga gaa gga aca 1344Glu Gly Ser Thr Thr Phe
Thr Ile Ala Thr Pro Thr Gly Glu Gly Thr 435 440
445act aca gca cgt cat ttt gct aga cgc cgc cgc tgt gtt tta
aca gat 1392Thr Thr Ala Arg His Phe Ala Arg Arg Arg Arg Cys Val Leu
Thr Asp 450 455 460ggt ttt tcg att cgt
taa 1410Gly Phe Ser Ile
Arg46522469PRTListeria innocua 22Met Glu Ser Leu Glu Leu Glu Gln Leu Val
Lys Lys Val Leu Leu Glu1 5 10
15Lys Leu Ala Glu Gln Lys Glu Val Pro Thr Lys Thr Thr Thr Gln Gly
20 25 30Ala Lys Ser Gly Val Phe
Asp Thr Val Asp Glu Ala Val Gln Ala Ala 35 40
45Val Ile Ala Gln Asn Cys Tyr Lys Glu Lys Ser Leu Glu Glu
Arg Arg 50 55 60Asn Val Val Lys Ala
Ile Arg Glu Ala Leu Tyr Pro Glu Ile Glu Thr65 70
75 80Ile Ala Thr Arg Ala Val Ala Glu Thr Gly
Met Gly Asn Val Thr Asp 85 90
95Lys Ile Leu Lys Asn Thr Leu Ala Ile Glu Lys Thr Pro Gly Val Glu
100 105 110Asp Leu Tyr Thr Glu
Val Ala Thr Gly Asp Asn Gly Met Thr Leu Tyr 115
120 125Glu Leu Ser Pro Tyr Gly Val Ile Gly Ala Val Ala
Pro Ser Thr Asn 130 135 140Pro Thr Glu
Thr Leu Ile Cys Asn Ser Ile Gly Met Leu Ala Ala Gly145
150 155 160Asn Ala Val Phe Tyr Ser Pro
His Pro Gly Ala Lys Asn Ile Ser Leu 165
170 175Trp Leu Ile Glu Lys Leu Asn Thr Ile Val Arg Asp
Ser Cys Gly Ile 180 185 190Asp
Asn Leu Ile Val Thr Val Ala Lys Pro Ser Ile Gln Ala Ala Gln 195
200 205Glu Met Met Asn His Pro Lys Val Pro
Leu Leu Val Ile Thr Gly Gly 210 215
220Pro Gly Val Val Leu Gln Ala Met Gln Ser Gly Lys Lys Val Ile Gly225
230 235 240Ala Gly Ala Gly
Asn Pro Pro Ser Ile Val Asp Glu Thr Ala Asn Ile 245
250 255Glu Lys Ala Ala Ala Asp Ile Val Asp Gly
Ala Ser Phe Asp His Asn 260 265
270Ile Leu Cys Ile Ala Glu Lys Ser Val Val Ala Val Asp Ser Ile Ala
275 280 285Asp Phe Leu Leu Phe Gln Met
Glu Lys Asn Gly Ala Leu His Val Thr 290 295
300Asn Pro Ser Asp Ile Gln Lys Leu Glu Lys Val Ala Val Thr Asp
Lys305 310 315 320Gly Val
Thr Asn Lys Lys Leu Val Gly Lys Ser Ala Thr Glu Ile Leu
325 330 335Lys Glu Ala Gly Ile Ala Cys
Asp Phe Thr Pro Arg Leu Ile Ile Val 340 345
350Glu Thr Glu Lys Ser His Pro Phe Ala Thr Val Glu Leu Leu
Met Pro 355 360 365Ile Val Pro Val
Val Arg Val Pro Asp Phe Asp Glu Ala Leu Glu Val 370
375 380Ala Ile Glu Leu Glu Gln Gly Leu His His Thr Ala
Thr Met His Ser385 390 395
400Gln Asn Ile Ser Arg Leu Asn Lys Ala Ala Arg Asp Met Gln Thr Ser
405 410 415Ile Phe Val Lys Asn
Gly Pro Ser Phe Ala Gly Leu Gly Phe Arg Gly 420
425 430Glu Gly Ser Thr Thr Phe Thr Ile Ala Thr Pro Thr
Gly Glu Gly Thr 435 440 445Thr Thr
Ala Arg His Phe Ala Arg Arg Arg Arg Cys Val Leu Thr Asp 450
455 460Gly Phe Ser Ile
Arg465231407DNAArtificialoptimised sequence 23atggaatctt tggaattgga
acaattagtc aagaaggttt tgttggaaaa attggctgaa 60caaaaggaag ttccaaccaa
gaccaccacc caaggtgcca agtccggtgt tttcgatacc 120gtcgatgaag ctgtccaagc
tgccgtcatt gctcaaaact gttacaagga aaaatctttg 180gaagaaagaa gaaacgttgt
caaggccatc agagaagctt tatacccaga aatcgaaacc 240attgctacca gagctgttgc
tgaaaccggt atgggtaatg tcaccgataa aatcttgaag 300aacactttag ctatcgaaaa
gactccaggt gttgaagact tgtacactga agttgctacc 360ggtgacaacg gtatgacttt
atacgaatta tctccatacg gtgtcatcgg tgctgttgct 420ccatctacca acccaactga
aactttgatc tgtaactcca tcggtatgtt ggctgctggt 480aacgccgttt tctactctcc
tcacccaggt gccaagaaca tctctttatg gttgattgaa 540aagttgaaca ctatcgtcag
agattcttgt ggtattgaca acttgattgt caccgttgcc 600aagccatcta tccaagctgc
tcaagaaatg atgaaccacc caaaggttcc attgttggtc 660atcactggtg gtccaggtgt
tgtcttgcaa gctatgcaat ctggtaagaa ggttatcggt 720gctggtgctg gtaaccctcc
atccatcgtt gacgaaaccg ctaacattga aaaggctgct 780gctgacattg tcgacggtgc
ttcctttgac cataatatct tgtgtatcgc tgaaaagtct 840gttgttgccg ttgactccat
tgctgacttc ttgttgttcc aaatggaaaa gaacggtgct 900ttgcacgtca ctaacccatc
tgatatccaa aaattggaaa aggttgccgt cactgacaag 960ggtgtcacca acaagaaatt
ggttggtaag tctgccactg aaatcttgaa agaagctggt 1020attgcttgtg atttcacccc
aagattgatc attgtcgaaa ctgaaaagtc ccacccattc 1080gctactgttg aattgttgat
gccaattgtt ccagttgtca gagttccaga cttcgatgaa 1140gctttggaag ttgccattga
attggaacaa ggtctacatc acactgctac catgcactct 1200caaaacatct ccagattgaa
caaggctgcc cgtgacatgc aaacctccat ctttgtcaag 1260aacggtccat ctttcgctgg
tttaggtttc agaggtgaag gttccaccac tttcaccatt 1320gctactccaa ctggtgaagg
tactaccact gcccgtcact tcgctagaag aagaagatgt 1380gtcttgactg atggtttctc
cattaga 1407241476DNAClostridium
kluyveriCDS(1)..(1476) 24atg gag ata atg gat aag gac tta cag tca ata cag
gaa gta aga act 48Met Glu Ile Met Asp Lys Asp Leu Gln Ser Ile Gln
Glu Val Arg Thr1 5 10
15ctt ata gca aaa gca aag aaa gct caa gca gaa ttt aaa aat ttt tct
96Leu Ile Ala Lys Ala Lys Lys Ala Gln Ala Glu Phe Lys Asn Phe Ser
20 25 30caa gaa gct gta aac aag gta
ata gaa aaa ata gct aag gct aca gaa 144Gln Glu Ala Val Asn Lys Val
Ile Glu Lys Ile Ala Lys Ala Thr Glu 35 40
45gtt gaa gct gta aaa ctt gca aaa ttg gca tat gaa gat aca gga
tat 192Val Glu Ala Val Lys Leu Ala Lys Leu Ala Tyr Glu Asp Thr Gly
Tyr 50 55 60gga aaa tgg gaa gat aaa
gta ata aag aat aag ttt tca agt ata gta 240Gly Lys Trp Glu Asp Lys
Val Ile Lys Asn Lys Phe Ser Ser Ile Val65 70
75 80gtt tat aac tat att aaa gat ttg aaa acg gtt
gga att tta aaa gaa 288Val Tyr Asn Tyr Ile Lys Asp Leu Lys Thr Val
Gly Ile Leu Lys Glu 85 90
95gac aag gaa aag aaa tta ata gat ata gct gtt cca ctt gga gtt ata
336Asp Lys Glu Lys Lys Leu Ile Asp Ile Ala Val Pro Leu Gly Val Ile
100 105 110gca gga ctt ata cct tca
act aac cca act tca aca gca ata ttc aag 384Ala Gly Leu Ile Pro Ser
Thr Asn Pro Thr Ser Thr Ala Ile Phe Lys 115 120
125gta tta ata gca tta aag gca gga aat gca ata gta ttc tca
cca cat 432Val Leu Ile Ala Leu Lys Ala Gly Asn Ala Ile Val Phe Ser
Pro His 130 135 140cca aca gca gta aga
agt att aca gaa act gta aag ata atg cag aaa 480Pro Thr Ala Val Arg
Ser Ile Thr Glu Thr Val Lys Ile Met Gln Lys145 150
155 160gct gca gta gaa gca gga gca cca gat gga
tta atc caa tgt atg tca 528Ala Ala Val Glu Ala Gly Ala Pro Asp Gly
Leu Ile Gln Cys Met Ser 165 170
175ata ttg aca gta gaa ggt act gct gaa ttg atg aag aat aag gat aca
576Ile Leu Thr Val Glu Gly Thr Ala Glu Leu Met Lys Asn Lys Asp Thr
180 185 190gca ctt atc ctt gca aca
ggt gga gaa gga atg gta aga gca gct tac 624Ala Leu Ile Leu Ala Thr
Gly Gly Glu Gly Met Val Arg Ala Ala Tyr 195 200
205agt tca gga aca cca gct ata gga gtt gga cct gga aac ggc
cca tgc 672Ser Ser Gly Thr Pro Ala Ile Gly Val Gly Pro Gly Asn Gly
Pro Cys 210 215 220ttt att gaa aga aca
gca gat att cct aca gca gta aga aaa gta ata 720Phe Ile Glu Arg Thr
Ala Asp Ile Pro Thr Ala Val Arg Lys Val Ile225 230
235 240ggc agt gat act ttt gat aat gga gta ata
tgt gct tca gaa caa tca 768Gly Ser Asp Thr Phe Asp Asn Gly Val Ile
Cys Ala Ser Glu Gln Ser 245 250
255ata ata gca gag aca gta aag aaa gca gag ata att gaa gaa ttc aag
816Ile Ile Ala Glu Thr Val Lys Lys Ala Glu Ile Ile Glu Glu Phe Lys
260 265 270aga caa aaa gga tat ttc
tta aat gca gaa gaa tca gaa aaa gta ggc 864Arg Gln Lys Gly Tyr Phe
Leu Asn Ala Glu Glu Ser Glu Lys Val Gly 275 280
285aag att tta tta aga gct aat gga aca cca aac cca gca ata
gta gga 912Lys Ile Leu Leu Arg Ala Asn Gly Thr Pro Asn Pro Ala Ile
Val Gly 290 295 300aaa gat gtt caa gca
tta gca aaa tta gca gga ata agc ata cca agc 960Lys Asp Val Gln Ala
Leu Ala Lys Leu Ala Gly Ile Ser Ile Pro Ser305 310
315 320gat gcg gta ata tta ctt tca gag cag aca
gat gtg agt cca aag aac 1008Asp Ala Val Ile Leu Leu Ser Glu Gln Thr
Asp Val Ser Pro Lys Asn 325 330
335cct tat gca aag gaa aaa tta gct cca gta ctt gca ttc tat aca gta
1056Pro Tyr Ala Lys Glu Lys Leu Ala Pro Val Leu Ala Phe Tyr Thr Val
340 345 350gaa gac tgg cat gaa gca
tgt gaa aaa tcc tta gca ctt ctt cat aac 1104Glu Asp Trp His Glu Ala
Cys Glu Lys Ser Leu Ala Leu Leu His Asn 355 360
365caa gga agt gga cat aca tta ata att cac tca cag aat gaa
gaa atc 1152Gln Gly Ser Gly His Thr Leu Ile Ile His Ser Gln Asn Glu
Glu Ile 370 375 380ata aga gaa ttc gca
ttg aag aaa cca gta tca aga ata ctt gta aat 1200Ile Arg Glu Phe Ala
Leu Lys Lys Pro Val Ser Arg Ile Leu Val Asn385 390
395 400tca cct gga tca ctt gga gga ata ggt gga
gct aca aat ctt gta cca 1248Ser Pro Gly Ser Leu Gly Gly Ile Gly Gly
Ala Thr Asn Leu Val Pro 405 410
415tca ctt aca tta ggc tgt gga gca gta ggt gga agt gca act tca gat
1296Ser Leu Thr Leu Gly Cys Gly Ala Val Gly Gly Ser Ala Thr Ser Asp
420 425 430aac gta gga cca gaa aac
tta ttc aac ata aga aaa gta gct tat gga 1344Asn Val Gly Pro Glu Asn
Leu Phe Asn Ile Arg Lys Val Ala Tyr Gly 435 440
445act acg aca gta gaa gaa ata aga gaa gct ttt ggt gta gga
gca gct 1392Thr Thr Thr Val Glu Glu Ile Arg Glu Ala Phe Gly Val Gly
Ala Ala 450 455 460tca tca agt gca cca
gca gaa cca gaa gat aat gaa gat gta cag gct 1440Ser Ser Ser Ala Pro
Ala Glu Pro Glu Asp Asn Glu Asp Val Gln Ala465 470
475 480ata gta aaa gct ata atg gct aaa tta aat
ctt taa 1476Ile Val Lys Ala Ile Met Ala Lys Leu Asn
Leu 485 490
SEQ ID NO 25 (length 491, PRT, Clostridium kluyveri)
Met Glu Ile Met Asp Lys Asp Leu Gln Ser Ile Gln Glu Val Arg Thr1
5 10 15Leu Ile Ala Lys Ala Lys
Lys Ala Gln Ala Glu Phe Lys Asn Phe Ser 20 25
30Gln Glu Ala Val Asn Lys Val Ile Glu Lys Ile Ala Lys
Ala Thr Glu 35 40 45Val Glu Ala
Val Lys Leu Ala Lys Leu Ala Tyr Glu Asp Thr Gly Tyr 50
55 60Gly Lys Trp Glu Asp Lys Val Ile Lys Asn Lys Phe
Ser Ser Ile Val65 70 75
80Val Tyr Asn Tyr Ile Lys Asp Leu Lys Thr Val Gly Ile Leu Lys Glu
85 90 95Asp Lys Glu Lys Lys Leu
Ile Asp Ile Ala Val Pro Leu Gly Val Ile 100
105 110Ala Gly Leu Ile Pro Ser Thr Asn Pro Thr Ser Thr
Ala Ile Phe Lys 115 120 125Val Leu
Ile Ala Leu Lys Ala Gly Asn Ala Ile Val Phe Ser Pro His 130
135 140Pro Thr Ala Val Arg Ser Ile Thr Glu Thr Val
Lys Ile Met Gln Lys145 150 155
160Ala Ala Val Glu Ala Gly Ala Pro Asp Gly Leu Ile Gln Cys Met Ser
165 170 175Ile Leu Thr Val
Glu Gly Thr Ala Glu Leu Met Lys Asn Lys Asp Thr 180
185 190Ala Leu Ile Leu Ala Thr Gly Gly Glu Gly Met
Val Arg Ala Ala Tyr 195 200 205Ser
Ser Gly Thr Pro Ala Ile Gly Val Gly Pro Gly Asn Gly Pro Cys 210
215 220Phe Ile Glu Arg Thr Ala Asp Ile Pro Thr
Ala Val Arg Lys Val Ile225 230 235
240Gly Ser Asp Thr Phe Asp Asn Gly Val Ile Cys Ala Ser Glu Gln
Ser 245 250 255Ile Ile Ala
Glu Thr Val Lys Lys Ala Glu Ile Ile Glu Glu Phe Lys 260
265 270Arg Gln Lys Gly Tyr Phe Leu Asn Ala Glu
Glu Ser Glu Lys Val Gly 275 280
285Lys Ile Leu Leu Arg Ala Asn Gly Thr Pro Asn Pro Ala Ile Val Gly 290
295 300Lys Asp Val Gln Ala Leu Ala Lys
Leu Ala Gly Ile Ser Ile Pro Ser305 310
315 320Asp Ala Val Ile Leu Leu Ser Glu Gln Thr Asp Val
Ser Pro Lys Asn 325 330
335Pro Tyr Ala Lys Glu Lys Leu Ala Pro Val Leu Ala Phe Tyr Thr Val
340 345 350Glu Asp Trp His Glu Ala
Cys Glu Lys Ser Leu Ala Leu Leu His Asn 355 360
365Gln Gly Ser Gly His Thr Leu Ile Ile His Ser Gln Asn Glu
Glu Ile 370 375 380Ile Arg Glu Phe Ala
Leu Lys Lys Pro Val Ser Arg Ile Leu Val Asn385 390
395 400Ser Pro Gly Ser Leu Gly Gly Ile Gly Gly
Ala Thr Asn Leu Val Pro 405 410
415Ser Leu Thr Leu Gly Cys Gly Ala Val Gly Gly Ser Ala Thr Ser Asp
420 425 430Asn Val Gly Pro Glu
Asn Leu Phe Asn Ile Arg Lys Val Ala Tyr Gly 435
440 445Thr Thr Thr Val Glu Glu Ile Arg Glu Ala Phe Gly
Val Gly Ala Ala 450 455 460Ser Ser Ser
Ala Pro Ala Glu Pro Glu Asp Asn Glu Asp Val Gln Ala465
470 475 480Ile Val Lys Ala Ile Met Ala
Lys Leu Asn Leu 485
490
SEQ ID NO 26 (length 1473, DNA, Artificial; optimised sequence)
atggaaatca tggacaagga
tttgcaatcc atccaagaag ttagaacttt gattgccaag 60gccaagaagg ctcaagctga
attcaagaac ttttcccaag aagctgttaa caaggtcatc 120gaaaagatcg ccaaggctac
tgaagttgaa gctgtcaaat tggccaaatt ggcttacgaa 180gacaccggtt acggtaaatg
ggaagacaag gtcatcaaga acaaattctc ctccattgtt 240gtctacaact acatcaagga
tttgaagacc gttggtatct tgaaggaaga caaggaaaag 300aaattgattg acattgctgt
cccattaggt gtcattgctg gtttgattcc atctaccaac 360ccaacttcca ctgccatttt
caaggtcttg attgctttga aggctggtaa cgccattgtc 420ttctctccac acccaactgc
tgtccgttcc atcactgaaa ccgttaagat catgcaaaag 480gctgctgttg aagctggtgc
tccagatggt ttgatccaat gtatgtccat tttgaccgtt 540gaaggtactg ctgaattgat
gaagaacaag gacaccgctt tgatcttggc taccggtggt 600gaaggtatgg ttagagctgc
ttactcctct ggtactccag ccatcggtgt cggtccaggt 660aacggtccat gtttcatcga
aagaactgct gacattccaa ctgctgttag aaaggttatc 720ggttctgaca ctttcgacaa
cggtgtcatc tgtgcttctg aacaatccat cattgctgaa 780accgtcaaga aggctgaaat
catcgaagaa ttcaagagac aaaagggtta cttcttgaat 840gctgaagaat ctgaaaaggt
tggtaagatt ctattacgtg ccaacggtac tccaaaccca 900gccatcgttg gtaaggatgt
ccaagctttg gccaaattgg ctggtatttc cattccatct 960gatgctgtta tcttactatc
cgaacaaacc gatgtttctc ctaaaaatcc atacgctaag 1020gaaaaattgg ctccagtctt
ggctttctac accgtcgaag actggcatga agcttgtgaa 1080aagtctttgg ctttattgca
caaccaaggt tctggtcaca ctttgatcat ccactctcaa 1140aacgaagaaa tcattagaga
atttgctttg aagaagcctg tttccagaat tttggttaac 1200tctccaggtt ctttgggtgg
tatcggtggt gctaccaact tagtcccatc tttgacttta 1260ggttgtggtg ctgttggtgg
ttctgccacc tctgacaacg ttggtccaga aaacttgttc 1320aacatcagaa aggttgctta
cggtaccacc accgtcgaag aaatcagaga agctttcggt 1380gtcggtgctg cttcttcttc
tgctccagct gaaccagaag acaacgaaga tgttcaagcc 1440attgttaagg ccatcatggc
caaattgaac ttg 1473
SEQ ID NO 27 (length 2610, DNA, Staphylococcus aureus; feature: CDS, location (1)..(2610))
atg tta act ata cct gaa aaa gaa aat cgt gga tcg
aaa gaa caa gaa 48Met Leu Thr Ile Pro Glu Lys Glu Asn Arg Gly Ser
Lys Glu Gln Glu1 5 10
15gtg gca att atg att gat gct cta gct gac aaa ggg aaa aaa gca tta
96Val Ala Ile Met Ile Asp Ala Leu Ala Asp Lys Gly Lys Lys Ala Leu
20 25 30gaa gca tta tct aaa aag tca
caa gaa gaa att gat cat att gtt cat 144Glu Ala Leu Ser Lys Lys Ser
Gln Glu Glu Ile Asp His Ile Val His 35 40
45caa atg agc tta gca gct gtt gat caa cat atg gtg cta gca aaa
tta 192Gln Met Ser Leu Ala Ala Val Asp Gln His Met Val Leu Ala Lys
Leu 50 55 60gca cat gaa gaa act gga
aga ggt ata tac gaa gat aaa gcg att aaa 240Ala His Glu Glu Thr Gly
Arg Gly Ile Tyr Glu Asp Lys Ala Ile Lys65 70
75 80aat tta tac gct tct gaa tat ata tgg aat tca
ata aaa gac aat aag 288Asn Leu Tyr Ala Ser Glu Tyr Ile Trp Asn Ser
Ile Lys Asp Asn Lys 85 90
95aca gta ggg att att ggt gaa gat aaa gaa aaa gga tta acg tat gta
336Thr Val Gly Ile Ile Gly Glu Asp Lys Glu Lys Gly Leu Thr Tyr Val
100 105 110gcg gaa cca att ggt gtt
att tgt ggt gtt acg cca aca aca aat cct 384Ala Glu Pro Ile Gly Val
Ile Cys Gly Val Thr Pro Thr Thr Asn Pro 115 120
125acg tcg aca act att ttt aaa gcg atg att gca att aag aca
gga aat 432Thr Ser Thr Thr Ile Phe Lys Ala Met Ile Ala Ile Lys Thr
Gly Asn 130 135 140cca atc att ttt gca
ttc cat cca agt gca caa gaa tcg tcg aag cgt 480Pro Ile Ile Phe Ala
Phe His Pro Ser Ala Gln Glu Ser Ser Lys Arg145 150
155 160gca gca gaa gtt gta tta gaa gcg gca atg
aag gca ggt gca cct aaa 528Ala Ala Glu Val Val Leu Glu Ala Ala Met
Lys Ala Gly Ala Pro Lys 165 170
175gat att att cag tgg att gaa gtg cct tct atc gaa gca aca aaa caa
576Asp Ile Ile Gln Trp Ile Glu Val Pro Ser Ile Glu Ala Thr Lys Gln
180 185 190tta atg aat cac aaa ggt
att gca tta gtt cta gca aca ggt ggt tcg 624Leu Met Asn His Lys Gly
Ile Ala Leu Val Leu Ala Thr Gly Gly Ser 195 200
205ggc atg gtt aag tct gca tat tca act ggc aaa ccg gca tta
ggt gtg 672Gly Met Val Lys Ser Ala Tyr Ser Thr Gly Lys Pro Ala Leu
Gly Val 210 215 220gga cca ggt aac gtg
ccg tct tac att gaa aaa aca gca cac att aaa 720Gly Pro Gly Asn Val
Pro Ser Tyr Ile Glu Lys Thr Ala His Ile Lys225 230
235 240cgt gca gta aat gat atc att ggt tca aaa
aca ttt gat aat ggt atg 768Arg Ala Val Asn Asp Ile Ile Gly Ser Lys
Thr Phe Asp Asn Gly Met 245 250
255att tgt gct tct gaa caa gtt gta gtc att gat aaa gaa att tat aaa
816Ile Cys Ala Ser Glu Gln Val Val Val Ile Asp Lys Glu Ile Tyr Lys
260 265 270gat gtt act aat gaa ttt
aaa gca cat caa gca tac ttt gtt aaa aaa 864Asp Val Thr Asn Glu Phe
Lys Ala His Gln Ala Tyr Phe Val Lys Lys 275 280
285gat gaa tta caa cgc tta gaa aat gca att atg aat gaa caa
aaa aca 912Asp Glu Leu Gln Arg Leu Glu Asn Ala Ile Met Asn Glu Gln
Lys Thr 290 295 300ggt att aag cct gat
att gtc ggt aaa tct gca gtt gaa ata gct gaa 960Gly Ile Lys Pro Asp
Ile Val Gly Lys Ser Ala Val Glu Ile Ala Glu305 310
315 320tta gca ggt ata cct gtc ccc gaa aat aca
aaa ctt atc ata gcc gaa 1008Leu Ala Gly Ile Pro Val Pro Glu Asn Thr
Lys Leu Ile Ile Ala Glu 325 330
335att agc ggt gta ggt tca gac tat ccg tta tct cgt gaa aaa tta tct
1056Ile Ser Gly Val Gly Ser Asp Tyr Pro Leu Ser Arg Glu Lys Leu Ser
340 345 350cca gta tta gcc tta gta
aaa gcc caa tct aca aaa caa gca ttt caa 1104Pro Val Leu Ala Leu Val
Lys Ala Gln Ser Thr Lys Gln Ala Phe Gln 355 360
365att tgt gaa gac aca cta cat ttt ggt gga tta gga cac aca
gcc gtt 1152Ile Cys Glu Asp Thr Leu His Phe Gly Gly Leu Gly His Thr
Ala Val 370 375 380atc cat aca gaa gat
gaa aca tta caa aaa gat ttt gga cta aga atg 1200Ile His Thr Glu Asp
Glu Thr Leu Gln Lys Asp Phe Gly Leu Arg Met385 390
395 400aaa gct tgt cgt gta ctt gta aat aca cca
tca gcg gtt gga ggt att 1248Lys Ala Cys Arg Val Leu Val Asn Thr Pro
Ser Ala Val Gly Gly Ile 405 410
415ggt gat atg tat aac gaa ttg att ccg tct tta aca tta ggt tgt ggt
1296Gly Asp Met Tyr Asn Glu Leu Ile Pro Ser Leu Thr Leu Gly Cys Gly
420 425 430tcg tac ggt aga aac tca
att tca cat aat gtt agt gcg aca gat tta 1344Ser Tyr Gly Arg Asn Ser
Ile Ser His Asn Val Ser Ala Thr Asp Leu 435 440
445tta aac att aaa acg att gct aaa cga cgt aat aat act caa
att ttc 1392Leu Asn Ile Lys Thr Ile Ala Lys Arg Arg Asn Asn Thr Gln
Ile Phe 450 455 460aag gtg cct gct caa
att tat ttt gaa gaa aat gca atc atg agt cta 1440Lys Val Pro Ala Gln
Ile Tyr Phe Glu Glu Asn Ala Ile Met Ser Leu465 470
475 480aca aca atg gac aag att gaa aaa gtg atg
att gtc tgt gac cct ggt 1488Thr Thr Met Asp Lys Ile Glu Lys Val Met
Ile Val Cys Asp Pro Gly 485 490
495atg gta gaa ttc ggt tat aca aaa aca gtt gag aat gta tta aga caa
1536Met Val Glu Phe Gly Tyr Thr Lys Thr Val Glu Asn Val Leu Arg Gln
500 505 510aga acg gaa cag cct caa
att aaa ata ttt agc gaa gtc gaa ccg aac 1584Arg Thr Glu Gln Pro Gln
Ile Lys Ile Phe Ser Glu Val Glu Pro Asn 515 520
525cca tca act aat aca gta tat aaa ggt ctg gaa atg atg gtt
gat ttc 1632Pro Ser Thr Asn Thr Val Tyr Lys Gly Leu Glu Met Met Val
Asp Phe 530 535 540caa cca gat aca atc
att gca ctt ggt ggt ggt tca gcg atg gat gct 1680Gln Pro Asp Thr Ile
Ile Ala Leu Gly Gly Gly Ser Ala Met Asp Ala545 550
555 560gca aaa gca atg tgg atg ttc ttt gaa cac
cct gag aca tca ttc ttc 1728Ala Lys Ala Met Trp Met Phe Phe Glu His
Pro Glu Thr Ser Phe Phe 565 570
575ggt gct aaa caa aag ttc cta gac atc ggt aaa cgt act tat aaa ata
1776Gly Ala Lys Gln Lys Phe Leu Asp Ile Gly Lys Arg Thr Tyr Lys Ile
580 585 590ggc atg cct gaa aat gcg
acg ttc att tgt atc cct acg aca tca ggt 1824Gly Met Pro Glu Asn Ala
Thr Phe Ile Cys Ile Pro Thr Thr Ser Gly 595 600
605aca ggt tca gaa gta aca cca ttt gca gtt atc aca gat agt
gaa aca 1872Thr Gly Ser Glu Val Thr Pro Phe Ala Val Ile Thr Asp Ser
Glu Thr 610 615 620aat gta aaa tat ccg
ttg gct gat ttt gct tta aca cct gac gtt gca 1920Asn Val Lys Tyr Pro
Leu Ala Asp Phe Ala Leu Thr Pro Asp Val Ala625 630
635 640att att gac cct caa ttt gtg atg agt gtg
cca aaa agc gtt aca gca 1968Ile Ile Asp Pro Gln Phe Val Met Ser Val
Pro Lys Ser Val Thr Ala 645 650
655gat aca gga atg gat gta cta acg cat gca atg gaa tca tat gta tct
2016Asp Thr Gly Met Asp Val Leu Thr His Ala Met Glu Ser Tyr Val Ser
660 665 670gta atg gct tca gac tat
aca aga ggt ttg agt cta caa gcg att aaa 2064Val Met Ala Ser Asp Tyr
Thr Arg Gly Leu Ser Leu Gln Ala Ile Lys 675 680
685ttg acg ttc gaa tat tta aaa tca tct gtt gaa aag ggt gat
aaa gtt 2112Leu Thr Phe Glu Tyr Leu Lys Ser Ser Val Glu Lys Gly Asp
Lys Val 690 695 700tca aga gag aaa atg
cat aac gca tca act ttg gct ggt atg gca ttt 2160Ser Arg Glu Lys Met
His Asn Ala Ser Thr Leu Ala Gly Met Ala Phe705 710
715 720gca aat gca ttc tta ggc att gca cac tca
att gca cat aaa att ggt 2208Ala Asn Ala Phe Leu Gly Ile Ala His Ser
Ile Ala His Lys Ile Gly 725 730
735ggc gaa tat ggt att ccg cat ggt aga gcg aat gcg ata tta cta ccg
2256Gly Glu Tyr Gly Ile Pro His Gly Arg Ala Asn Ala Ile Leu Leu Pro
740 745 750cat att atc cgt tat aat
gcc aaa gac ccg caa aaa cat gca tta ttc 2304His Ile Ile Arg Tyr Asn
Ala Lys Asp Pro Gln Lys His Ala Leu Phe 755 760
765cct aaa tat gag ttc ttc aga gca gat aca gat tat gca gat
att gcc 2352Pro Lys Tyr Glu Phe Phe Arg Ala Asp Thr Asp Tyr Ala Asp
Ile Ala 770 775 780aaa ttc tta gga tta
aaa ggg aat acg aca gaa gca ctc gta gaa tca 2400Lys Phe Leu Gly Leu
Lys Gly Asn Thr Thr Glu Ala Leu Val Glu Ser785 790
795 800tta gct aaa gct gtc tac gaa tta ggt caa
tca gtc gga att gaa atg 2448Leu Ala Lys Ala Val Tyr Glu Leu Gly Gln
Ser Val Gly Ile Glu Met 805 810
815aat ttg aaa tca caa ggt gtg tct gaa gaa gaa tta aat gaa tca att
2496Asn Leu Lys Ser Gln Gly Val Ser Glu Glu Glu Leu Asn Glu Ser Ile
820 825 830gat aga atg gca gag ctc
gca ttt gaa gat caa tgt aca act gct aat 2544Asp Arg Met Ala Glu Leu
Ala Phe Glu Asp Gln Cys Thr Thr Ala Asn 835 840
845cct aaa gaa gca cta atc agt gaa atc aaa gat atc att caa
aca tca 2592Pro Lys Glu Ala Leu Ile Ser Glu Ile Lys Asp Ile Ile Gln
Thr Ser 850 855 860tat gat tat aag caa
taa 2610Tyr Asp Tyr Lys
Gln865
SEQ ID NO 28 (length 869, PRT, Staphylococcus aureus)
Met Leu Thr Ile Pro Glu Lys Glu Asn
Arg Gly Ser Lys Glu Gln Glu1 5 10
15Val Ala Ile Met Ile Asp Ala Leu Ala Asp Lys Gly Lys Lys Ala
Leu 20 25 30Glu Ala Leu Ser
Lys Lys Ser Gln Glu Glu Ile Asp His Ile Val His 35
40 45Gln Met Ser Leu Ala Ala Val Asp Gln His Met Val
Leu Ala Lys Leu 50 55 60Ala His Glu
Glu Thr Gly Arg Gly Ile Tyr Glu Asp Lys Ala Ile Lys65 70
75 80Asn Leu Tyr Ala Ser Glu Tyr Ile
Trp Asn Ser Ile Lys Asp Asn Lys 85 90
95Thr Val Gly Ile Ile Gly Glu Asp Lys Glu Lys Gly Leu Thr
Tyr Val 100 105 110Ala Glu Pro
Ile Gly Val Ile Cys Gly Val Thr Pro Thr Thr Asn Pro 115
120 125Thr Ser Thr Thr Ile Phe Lys Ala Met Ile Ala
Ile Lys Thr Gly Asn 130 135 140Pro Ile
Ile Phe Ala Phe His Pro Ser Ala Gln Glu Ser Ser Lys Arg145
150 155 160Ala Ala Glu Val Val Leu Glu
Ala Ala Met Lys Ala Gly Ala Pro Lys 165
170 175Asp Ile Ile Gln Trp Ile Glu Val Pro Ser Ile Glu
Ala Thr Lys Gln 180 185 190Leu
Met Asn His Lys Gly Ile Ala Leu Val Leu Ala Thr Gly Gly Ser 195
200 205Gly Met Val Lys Ser Ala Tyr Ser Thr
Gly Lys Pro Ala Leu Gly Val 210 215
220Gly Pro Gly Asn Val Pro Ser Tyr Ile Glu Lys Thr Ala His Ile Lys225
230 235 240Arg Ala Val Asn
Asp Ile Ile Gly Ser Lys Thr Phe Asp Asn Gly Met 245
250 255Ile Cys Ala Ser Glu Gln Val Val Val Ile
Asp Lys Glu Ile Tyr Lys 260 265
270Asp Val Thr Asn Glu Phe Lys Ala His Gln Ala Tyr Phe Val Lys Lys
275 280 285Asp Glu Leu Gln Arg Leu Glu
Asn Ala Ile Met Asn Glu Gln Lys Thr 290 295
300Gly Ile Lys Pro Asp Ile Val Gly Lys Ser Ala Val Glu Ile Ala
Glu305 310 315 320Leu Ala
Gly Ile Pro Val Pro Glu Asn Thr Lys Leu Ile Ile Ala Glu
325 330 335Ile Ser Gly Val Gly Ser Asp
Tyr Pro Leu Ser Arg Glu Lys Leu Ser 340 345
350Pro Val Leu Ala Leu Val Lys Ala Gln Ser Thr Lys Gln Ala
Phe Gln 355 360 365Ile Cys Glu Asp
Thr Leu His Phe Gly Gly Leu Gly His Thr Ala Val 370
375 380Ile His Thr Glu Asp Glu Thr Leu Gln Lys Asp Phe
Gly Leu Arg Met385 390 395
400Lys Ala Cys Arg Val Leu Val Asn Thr Pro Ser Ala Val Gly Gly Ile
405 410 415Gly Asp Met Tyr Asn
Glu Leu Ile Pro Ser Leu Thr Leu Gly Cys Gly 420
425 430Ser Tyr Gly Arg Asn Ser Ile Ser His Asn Val Ser
Ala Thr Asp Leu 435 440 445Leu Asn
Ile Lys Thr Ile Ala Lys Arg Arg Asn Asn Thr Gln Ile Phe 450
455 460Lys Val Pro Ala Gln Ile Tyr Phe Glu Glu Asn
Ala Ile Met Ser Leu465 470 475
480Thr Thr Met Asp Lys Ile Glu Lys Val Met Ile Val Cys Asp Pro Gly
485 490 495Met Val Glu Phe
Gly Tyr Thr Lys Thr Val Glu Asn Val Leu Arg Gln 500
505 510Arg Thr Glu Gln Pro Gln Ile Lys Ile Phe Ser
Glu Val Glu Pro Asn 515 520 525Pro
Ser Thr Asn Thr Val Tyr Lys Gly Leu Glu Met Met Val Asp Phe 530
535 540Gln Pro Asp Thr Ile Ile Ala Leu Gly Gly
Gly Ser Ala Met Asp Ala545 550 555
560Ala Lys Ala Met Trp Met Phe Phe Glu His Pro Glu Thr Ser Phe
Phe 565 570 575Gly Ala Lys
Gln Lys Phe Leu Asp Ile Gly Lys Arg Thr Tyr Lys Ile 580
585 590Gly Met Pro Glu Asn Ala Thr Phe Ile Cys
Ile Pro Thr Thr Ser Gly 595 600
605Thr Gly Ser Glu Val Thr Pro Phe Ala Val Ile Thr Asp Ser Glu Thr 610
615 620Asn Val Lys Tyr Pro Leu Ala Asp
Phe Ala Leu Thr Pro Asp Val Ala625 630
635 640Ile Ile Asp Pro Gln Phe Val Met Ser Val Pro Lys
Ser Val Thr Ala 645 650
655Asp Thr Gly Met Asp Val Leu Thr His Ala Met Glu Ser Tyr Val Ser
660 665 670Val Met Ala Ser Asp Tyr
Thr Arg Gly Leu Ser Leu Gln Ala Ile Lys 675 680
685Leu Thr Phe Glu Tyr Leu Lys Ser Ser Val Glu Lys Gly Asp
Lys Val 690 695 700Ser Arg Glu Lys Met
His Asn Ala Ser Thr Leu Ala Gly Met Ala Phe705 710
715 720Ala Asn Ala Phe Leu Gly Ile Ala His Ser
Ile Ala His Lys Ile Gly 725 730
735Gly Glu Tyr Gly Ile Pro His Gly Arg Ala Asn Ala Ile Leu Leu Pro
740 745 750His Ile Ile Arg Tyr
Asn Ala Lys Asp Pro Gln Lys His Ala Leu Phe 755
760 765Pro Lys Tyr Glu Phe Phe Arg Ala Asp Thr Asp Tyr
Ala Asp Ile Ala 770 775 780Lys Phe Leu
Gly Leu Lys Gly Asn Thr Thr Glu Ala Leu Val Glu Ser785
790 795 800Leu Ala Lys Ala Val Tyr Glu
Leu Gly Gln Ser Val Gly Ile Glu Met 805
810 815Asn Leu Lys Ser Gln Gly Val Ser Glu Glu Glu Leu
Asn Glu Ser Ile 820 825 830Asp
Arg Met Ala Glu Leu Ala Phe Glu Asp Gln Cys Thr Thr Ala Asn 835
840 845Pro Lys Glu Ala Leu Ile Ser Glu Ile
Lys Asp Ile Ile Gln Thr Ser 850 855
860Tyr Asp Tyr Lys Gln865
SEQ ID NO 29 (length 2607, DNA, Artificial; optimised sequence)
atgttgacca ttccagaaaa ggaaaacaga ggttccaagg aacaagaagt tgccatcatg
60attgatgctt tagctgacaa aggtaagaag gctttggaag ctttgtccaa gaagtctcaa
120gaagaaattg accacattgt ccaccaaatg tccttggctg ctgttgacca acacatggtt
180ttggccaagt tggctcatga agaaaccggt agaggtatct acgaagacaa ggctatcaag
240aacttatacg cctctgaata catctggaac tccatcaagg acaacaagac tgttggtatc
300attggtgaag acaaagaaaa gggtttgacc tacgttgctg aaccaattgg tgtcatctgt
360ggtgtcactc caaccaccaa cccaacttct accaccatct tcaaggctat gattgccatc
420aagactggta acccaattat tttcgctttc cacccatctg ctcaagaatc ttccaagaga
480gctgctgaag ttgttttgga agctgccatg aaggctggtg ctccaaagga tatcatccaa
540tggattgaag ttccatccat tgaagctacc aagcaattga tgaaccacaa gggtattgct
600ttagtcttgg ctaccggtgg ttctggtatg gttaagtctg cttactccac tggtaaacca
660gctttgggtg ttggtccagg taacgttcca tcttacatcg aaaagactgc tcatatcaag
720cgtgctgtca acgatatcat cggttccaag actttcgata atggtatgat ctgtgcttct
780gaacaagttg ttgtcattga caaggaaatc tacaaggatg tcaccaatga attcaaggct
840caccaagctt acttcgtcaa gaaggacgaa ttacaaagat tagaaaacgc catcatgaac
900gaacaaaaga ctggtatcaa gccagatatc gttggtaagt ctgctgttga aattgctgaa
960ttggccggta tcccagttcc agaaaacacc aaattgatca ttgctgaaat ctccggtgtc
1020ggttctgact acccattgtc cagagaaaag ttgtctccag ttttggcttt agtcaaggct
1080caatctacca agcaagcttt ccaaatctgt gaagacactt tgcacttcgg tggtttaggt
1140cacactgctg ttatccacac tgaagacgaa actttgcaaa aggatttcgg tctaagaatg
1200aaggcttgtc gtgttttggt caacactcca tctgctgttg gtggtatcgg tgacatgtac
1260aacgaattga ttccatcctt gactttgggt tgtggttctt acggtagaaa ctccatctcc
1320cacaacgtct ctgctaccga tttgttgaac atcaagacca ttgccaagag aagaaacaac
1380actcaaatct tcaaggttcc agctcaaatc tatttcgaag aaaacgctat catgtccttg
1440accaccatgg acaagattga aaaggtcatg atcgtttgtg acccaggtat ggttgaattt
1500ggttacacca aaaccgtcga aaacgtctta cgtcaaagaa ctgaacaacc tcaaatcaag
1560atcttctctg aagttgaacc aaatccatcc accaacactg tctacaaggg tttggaaatg
1620atggtcgatt tccaaccaga caccatcatt gctttgggtg gtggttctgc catggatgct
1680gccaaggcta tgtggatgtt cttcgaacat ccagaaactt ctttcttcgg tgccaagcaa
1740aaattcttgg acattggtaa gagaacctac aagattggta tgccagaaaa cgccactttc
1800atctgtattc caaccacttc tggtactggt tctgaagtca ctccatttgc tgttatcact
1860gactctgaaa ccaacgtcaa atacccattg gctgatttcg ctttgactcc agatgtcgcc
1920atcattgacc ctcaatttgt catgtccgtc ccaaaatctg tcactgctga taccggtatg
1980gacgttttga ctcacgctat ggaatcttac gtttctgtca tggcctccga ttacaccaga
2040ggtttgtccc tacaagctat caaattgacc tttgaatact tgaaatcttc cgttgaaaaa
2100ggtgacaagg tttccagaga aaagatgcac aacgcttcta ctttggccgg tatggccttt
2160gctaacgctt tcttgggtat tgctcactcc attgctcaca aaattggtgg tgaatacggt
2220attccacatg gtagagctaa cgccatcttg ttgcctcaca tcatcagata caacgccaag
2280gaccctcaaa agcacgcttt gttcccaaag tacgaattct tcagagctga caccgattac
2340gctgatatcg ccaagttctt aggtttgaaa ggtaacacca ctgaagcttt ggttgaatct
2400ttggccaagg ctgtctacga attaggtcaa tctgttggta ttgaaatgaa cttgaaatct
2460caaggtgtct ctgaagaaga attgaacgaa tccattgaca gaatggctga attggctttc
2520gaagaccaat gtaccactgc caacccaaag gaagctttga tttctgaaat caaggatatc
2580atccaaactt cttacgacta caagcag
2607
SEQ ID NO 30 (length 392, PRT, Clostridium acetobutylicum)
Met Lys Glu Val Val Ile Ala Ser
Ala Val Arg Thr Ala Ile Gly Ser1 5 10
15Tyr Gly Lys Ser Leu Lys Asp Val Pro Ala Val Asp Leu Gly
Ala Thr 20 25 30Ala Ile Lys
Glu Ala Val Lys Lys Ala Gly Ile Lys Pro Glu Asp Val 35
40 45Asn Glu Val Ile Leu Gly Asn Val Leu Gln Ala
Gly Leu Gly Gln Asn 50 55 60Pro Ala
Arg Gln Ala Ser Phe Lys Ala Gly Leu Pro Val Glu Ile Pro65
70 75 80Ala Met Thr Ile Asn Lys Val
Cys Gly Ser Gly Leu Arg Thr Val Ser 85 90
95Leu Ala Ala Gln Ile Ile Lys Ala Gly Asp Ala Asp Val
Ile Ile Ala 100 105 110Gly Gly
Met Glu Asn Met Ser Arg Ala Pro Tyr Leu Ala Asn Asn Ala 115
120 125Arg Trp Gly Tyr Arg Met Gly Asn Ala Lys
Phe Val Asp Glu Met Ile 130 135 140Thr
Asp Gly Leu Trp Asp Ala Phe Asn Asp Tyr His Met Gly Ile Thr145
150 155 160Ala Glu Asn Ile Ala Glu
Arg Trp Asn Ile Ser Arg Glu Glu Gln Asp 165
170 175Glu Phe Ala Leu Ala Ser Gln Lys Lys Ala Glu Glu
Ala Ile Lys Ser 180 185 190Gly
Gln Phe Lys Asp Glu Ile Val Pro Val Val Ile Lys Gly Arg Lys 195
200 205Gly Glu Thr Val Val Asp Thr Asp Glu
His Pro Arg Phe Gly Ser Thr 210 215
220Ile Glu Gly Leu Ala Lys Leu Lys Pro Ala Phe Lys Lys Asp Gly Thr225
230 235 240Val Thr Ala Gly
Asn Ala Ser Gly Leu Asn Asp Cys Ala Ala Val Leu 245
250 255Val Ile Met Ser Ala Glu Lys Ala Lys Glu
Leu Gly Val Lys Pro Leu 260 265
270Ala Lys Ile Val Ser Tyr Gly Ser Ala Gly Val Asp Pro Ala Ile Met
275 280 285Gly Tyr Gly Pro Phe Tyr Ala
Thr Lys Ala Ala Ile Glu Lys Ala Gly 290 295
300Trp Thr Val Asp Glu Leu Asp Leu Ile Glu Ser Asn Glu Ala Phe
Ala305 310 315 320Ala Gln
Ser Leu Ala Val Ala Lys Asp Leu Lys Phe Asp Met Asn Lys
325 330 335Val Asn Val Asn Gly Gly Ala
Ile Ala Leu Gly His Pro Ile Gly Ala 340 345
350Ser Gly Ala Arg Ile Leu Val Thr Leu Val His Ala Met Gln
Lys Arg 355 360 365Asp Ala Lys Lys
Gly Leu Ala Thr Leu Cys Ile Gly Gly Gly Gln Gly 370
375 380Thr Ala Ile Leu Leu Glu Lys Cys385
390
SEQ ID NO 31 (length 282, PRT, Clostridium acetobutylicum)
Met Lys Lys Val Cys Val Ile Gly
Ala Gly Thr Met Gly Ser Gly Ile1 5 10
15Ala Gln Ala Phe Ala Ala Lys Gly Phe Glu Val Val Leu Arg
Asp Ile 20 25 30Lys Asp Glu
Phe Val Asp Arg Gly Leu Asp Phe Ile Asn Lys Asn Leu 35
40 45Ser Lys Leu Val Lys Lys Gly Lys Ile Glu Glu
Ala Thr Lys Val Glu 50 55 60Ile Leu
Thr Arg Ile Ser Gly Thr Val Asp Leu Asn Met Ala Ala Asp65
70 75 80Cys Asp Leu Val Ile Glu Ala
Ala Val Glu Arg Met Asp Ile Lys Lys 85 90
95Gln Ile Phe Ala Asp Leu Asp Asn Ile Cys Lys Pro Glu
Thr Ile Leu 100 105 110Ala Ser
Asn Thr Ser Ser Leu Ser Ile Thr Glu Val Ala Ser Ala Thr 115
120 125Lys Arg Pro Asp Lys Val Ile Gly Met His
Phe Phe Asn Pro Ala Pro 130 135 140Val
Met Lys Leu Val Glu Val Ile Arg Gly Ile Ala Thr Ser Gln Glu145
150 155 160Thr Phe Asp Ala Val Lys
Glu Thr Ser Ile Ala Ile Gly Lys Asp Pro 165
170 175Val Glu Val Ala Glu Ala Pro Gly Phe Val Val Asn
Arg Ile Leu Ile 180 185 190Pro
Met Ile Asn Glu Ala Val Gly Ile Leu Ala Glu Gly Ile Ala Ser 195
200 205Val Glu Asp Ile Asp Lys Ala Met Lys
Leu Gly Ala Asn His Pro Met 210 215
220Gly Pro Leu Glu Leu Gly Asp Phe Ile Gly Leu Asp Ile Cys Leu Ala225
230 235 240Ile Met Asp Val
Leu Tyr Ser Glu Thr Gly Asp Ser Lys Tyr Arg Pro 245
250 255His Thr Leu Leu Lys Lys Tyr Val Arg Ala
Gly Trp Leu Gly Arg Lys 260 265
270Ser Gly Lys Gly Phe Tyr Asp Tyr Ser Lys 275
280
SEQ ID NO 32 (length 261, PRT, Clostridium acetobutylicum)
Met Glu Leu Asn Asn Val Ile Leu
Glu Lys Glu Gly Lys Val Ala Val1 5 10
15Val Thr Ile Asn Arg Pro Lys Ala Leu Asn Ala Leu Asn Ser
Asp Thr 20 25 30Leu Lys Glu
Met Asp Tyr Val Ile Gly Glu Ile Glu Asn Asp Ser Glu 35
40 45Val Leu Ala Val Ile Leu Thr Gly Ala Gly Glu
Lys Ser Phe Val Ala 50 55 60Gly Ala
Asp Ile Ser Glu Met Lys Glu Met Asn Thr Ile Glu Gly Arg65
70 75 80Lys Phe Gly Ile Leu Gly Asn
Lys Val Phe Arg Arg Leu Glu Leu Leu 85 90
95Glu Lys Pro Val Ile Ala Ala Val Asn Gly Phe Ala Leu
Gly Gly Gly 100 105 110Cys Glu
Ile Ala Met Ser Cys Asp Ile Arg Ile Ala Ser Ser Asn Ala 115
120 125Arg Phe Gly Gln Pro Glu Val Gly Leu Gly
Ile Thr Pro Gly Phe Gly 130 135 140Gly
Thr Gln Arg Leu Ser Arg Leu Val Gly Met Gly Met Ala Lys Gln145
150 155 160Leu Ile Phe Thr Ala Gln
Asn Ile Lys Ala Asp Glu Ala Leu Arg Ile 165
170 175Gly Leu Val Asn Lys Val Val Glu Pro Ser Glu Leu
Met Asn Thr Ala 180 185 190Lys
Glu Ile Ala Asn Lys Ile Val Ser Asn Ala Pro Val Ala Val Lys 195
200 205Leu Ser Lys Gln Ala Ile Asn Arg Gly
Met Gln Cys Asp Ile Asp Thr 210 215
220Ala Leu Ala Phe Glu Ser Glu Ala Phe Gly Glu Cys Phe Ser Thr Glu225
230 235 240Asp Gln Lys Asp
Ala Met Thr Ala Phe Ile Glu Lys Arg Lys Ile Glu 245
250 255Gly Phe Lys Asn Arg
260
SEQ ID NO 33 (length 379, PRT, Clostridium acetobutylicum)
Met Asp Phe Asn Leu Thr Arg Glu
Gln Glu Leu Val Arg Gln Met Val1 5 10
15Arg Glu Phe Ala Glu Asn Glu Val Lys Pro Ile Ala Ala Glu
Ile Asp 20 25 30Glu Thr Glu
Arg Phe Pro Met Glu Asn Val Lys Lys Met Gly Gln Tyr 35
40 45Gly Met Met Gly Ile Pro Phe Ser Lys Glu Tyr
Gly Gly Ala Gly Gly 50 55 60Asp Val
Leu Ser Tyr Ile Ile Ala Val Glu Glu Leu Ser Lys Val Cys65
70 75 80Gly Thr Thr Gly Val Ile Leu
Ser Ala His Thr Ser Leu Cys Ala Ser 85 90
95Leu Ile Asn Glu His Gly Thr Glu Glu Gln Lys Gln Lys
Tyr Leu Val 100 105 110Pro Leu
Ala Lys Gly Glu Lys Ile Gly Ala Tyr Gly Leu Thr Glu Pro 115
120 125Asn Ala Gly Thr Asp Ser Gly Ala Gln Gln
Thr Val Ala Val Leu Glu 130 135 140Gly
Asp His Tyr Val Ile Asn Gly Ser Lys Ile Phe Ile Thr Asn Gly145
150 155 160Gly Val Ala Asp Thr Phe
Val Ile Phe Ala Met Thr Asp Arg Thr Lys 165
170 175Gly Thr Lys Gly Ile Ser Ala Phe Ile Ile Glu Lys
Gly Phe Lys Gly 180 185 190Phe
Ser Ile Gly Lys Val Glu Gln Lys Leu Gly Ile Arg Ala Ser Ser 195
200 205Thr Thr Glu Leu Val Phe Glu Asp Met
Ile Val Pro Val Glu Asn Met 210 215
220Ile Gly Lys Glu Gly Lys Gly Phe Pro Ile Ala Met Lys Thr Leu Asp225
230 235 240Gly Gly Arg Ile
Gly Ile Ala Ala Gln Ala Leu Gly Ile Ala Glu Gly 245
250 255Ala Phe Asn Glu Ala Arg Ala Tyr Met Lys
Glu Arg Lys Gln Phe Gly 260 265
270Arg Ser Leu Asp Lys Phe Gln Gly Leu Ala Trp Met Met Ala Asp Met
275 280 285Asp Val Ala Ile Glu Ser Ala
Arg Tyr Leu Val Tyr Lys Ala Ala Tyr 290 295
300Leu Lys Gln Ala Gly Leu Pro Tyr Thr Val Asp Ala Ala Arg Ala
Lys305 310 315 320Leu His
Ala Ala Asn Val Ala Met Asp Val Thr Thr Lys Ala Val Gln
325 330 335Leu Phe Gly Gly Tyr Gly Tyr
Thr Lys Asp Tyr Pro Val Glu Arg Met 340 345
350Met Arg Asp Ala Lys Ile Thr Glu Ile Tyr Glu Gly Thr Ser
Glu Val 355 360 365Gln Lys Leu Val
Ile Ser Gly Lys Ile Phe Arg 370 375
SEQ ID NO 34 (length 858, PRT, Clostridium acetobutylicum)
Met Lys Val Thr Asn Gln Lys Glu Leu Lys Gln Lys Leu Asn
Glu Leu1 5 10 15Arg Glu
Ala Gln Lys Lys Phe Ala Thr Tyr Thr Gln Glu Gln Val Asp 20
25 30Lys Ile Phe Lys Gln Cys Ala Ile Ala
Ala Ala Lys Glu Arg Ile Asn 35 40
45Leu Ala Lys Leu Ala Val Glu Glu Thr Gly Ile Gly Leu Val Glu Asp 50
55 60Lys Ile Ile Lys Asn His Phe Ala Ala
Glu Tyr Ile Tyr Asn Lys Tyr65 70 75
80Lys Asn Glu Lys Thr Cys Gly Ile Ile Asp His Asp Asp Ser
Leu Gly 85 90 95Ile Thr
Lys Val Ala Glu Pro Ile Gly Ile Val Ala Ala Ile Val Pro 100
105 110Thr Thr Asn Pro Thr Ser Thr Ala Ile
Phe Lys Ser Leu Ile Ser Leu 115 120
125Lys Thr Arg Asn Ala Ile Phe Phe Ser Pro His Pro Arg Ala Lys Lys
130 135 140Ser Thr Ile Ala Ala Ala Lys
Leu Ile Leu Asp Ala Ala Val Lys Ala145 150
155 160Gly Ala Pro Lys Asn Ile Ile Gly Trp Ile Asp Glu
Pro Ser Ile Glu 165 170
175Leu Ser Gln Asp Leu Met Ser Glu Ala Asp Ile Ile Leu Ala Thr Gly
180 185 190Gly Pro Ser Met Val Lys
Ala Ala Tyr Ser Ser Gly Lys Pro Ala Ile 195 200
205Gly Val Gly Ala Gly Asn Thr Pro Ala Ile Ile Asp Glu Ser
Ala Asp 210 215 220Ile Asp Met Ala Val
Ser Ser Ile Ile Leu Ser Lys Thr Tyr Asp Asn225 230
235 240Gly Val Ile Cys Ala Ser Glu Gln Ser Ile
Leu Val Met Asn Ser Ile 245 250
255Tyr Glu Lys Val Lys Glu Glu Phe Val Lys Arg Gly Ser Tyr Ile Leu
260 265 270Asn Gln Asn Glu Ile
Ala Lys Ile Lys Glu Thr Met Phe Lys Asn Gly 275
280 285Ala Ile Asn Ala Asp Ile Val Gly Lys Ser Ala Tyr
Ile Ile Ala Lys 290 295 300Met Ala Gly
Ile Glu Val Pro Gln Thr Thr Lys Ile Leu Ile Gly Glu305
310 315 320Val Gln Ser Val Glu Lys Ser
Glu Leu Phe Ser His Glu Lys Leu Ser 325
330 335Pro Val Leu Ala Met Tyr Lys Val Lys Asp Phe Asp
Glu Ala Leu Lys 340 345 350Lys
Ala Gln Arg Leu Ile Glu Leu Gly Gly Ser Gly His Thr Ser Ser 355
360 365Leu Tyr Ile Asp Ser Gln Asn Asn Lys
Asp Lys Val Lys Glu Phe Gly 370 375
380Leu Ala Met Lys Thr Ser Arg Thr Phe Ile Asn Met Pro Ser Ser Gln385
390 395 400Gly Ala Ser Gly
Asp Leu Tyr Asn Phe Ala Ile Ala Pro Ser Phe Thr 405
410 415Leu Gly Cys Gly Thr Trp Gly Gly Asn Ser
Val Ser Gln Asn Val Glu 420 425
430Pro Lys His Leu Leu Asn Ile Lys Ser Val Ala Glu Arg Arg Glu Asn
435 440 445Met Leu Trp Phe Lys Val Pro
Gln Lys Ile Tyr Phe Lys Tyr Gly Cys 450 455
460Leu Arg Phe Ala Leu Lys Glu Leu Lys Asp Met Asn Lys Lys Arg
Ala465 470 475 480Phe Ile
Val Thr Asp Lys Asp Leu Phe Lys Leu Gly Tyr Val Asn Lys
485 490 495Ile Thr Lys Val Leu Asp Glu
Ile Asp Ile Lys Tyr Ser Ile Phe Thr 500 505
510Asp Ile Lys Ser Asp Pro Thr Ile Asp Ser Val Lys Lys Gly
Ala Lys 515 520 525Glu Met Leu Asn
Phe Glu Pro Asp Thr Ile Ile Ser Ile Gly Gly Gly 530
535 540Ser Pro Met Asp Ala Ala Lys Val Met His Leu Leu
Tyr Glu Tyr Pro545 550 555
560Glu Ala Glu Ile Glu Asn Leu Ala Ile Asn Phe Met Asp Ile Arg Lys
565 570 575Arg Ile Cys Asn Phe
Pro Lys Leu Gly Thr Lys Ala Ile Ser Val Ala 580
585 590Ile Pro Thr Thr Ala Gly Thr Gly Ser Glu Ala Thr
Pro Phe Ala Val 595 600 605Ile Thr
Asn Asp Glu Thr Gly Met Lys Tyr Pro Leu Thr Ser Tyr Glu 610
615 620Leu Thr Pro Asn Met Ala Ile Ile Asp Thr Glu
Leu Met Leu Asn Met625 630 635
640Pro Arg Lys Leu Thr Ala Ala Thr Gly Ile Asp Ala Leu Val His Ala
645 650 655Ile Glu Ala Tyr
Val Ser Val Met Ala Thr Asp Tyr Thr Asp Glu Leu 660
665 670Ala Leu Arg Ala Ile Lys Met Ile Phe Lys Tyr
Leu Pro Arg Ala Tyr 675 680 685Lys
Asn Gly Thr Asn Asp Ile Glu Ala Arg Glu Lys Met Ala His Ala 690
695 700Ser Asn Ile Ala Gly Met Ala Phe Ala Asn
Ala Phe Leu Gly Val Cys705 710 715
720His Ser Met Ala His Lys Leu Gly Ala Met His His Val Pro His
Gly 725 730 735Ile Ala Cys
Ala Val Leu Ile Glu Glu Val Ile Lys Tyr Asn Ala Thr 740
745 750Asp Cys Pro Thr Lys Gln Thr Ala Phe Pro
Gln Tyr Lys Ser Pro Asn 755 760
765Ala Lys Arg Lys Tyr Ala Glu Ile Ala Glu Tyr Leu Asn Leu Lys Gly 770
775 780Thr Ser Asp Thr Glu Lys Val Thr
Ala Leu Ile Glu Ala Ile Ser Lys785 790
795 800Leu Lys Ile Asp Leu Ser Ile Pro Gln Asn Ile Ser
Ala Ala Gly Ile 805 810
815Asn Lys Lys Asp Phe Tyr Asn Thr Leu Asp Lys Met Ser Glu Leu Ala
820 825 830Phe Asp Asp Gln Cys Thr
Thr Ala Asn Pro Arg Tyr Pro Leu Ile Ser 835 840
845Glu Leu Lys Asp Ile Tyr Ile Lys Ser Phe 850
855
SEQ ID NO: 35 (length 862, PRT, Clostridium acetobutylicum)
Met Lys Val Thr Thr Val Lys
Glu Leu Asp Glu Lys Leu Lys Val Ile1 5 10
15Lys Glu Ala Gln Lys Lys Phe Ser Cys Tyr Ser Gln Glu
Met Val Asp 20 25 30Glu Ile
Phe Arg Asn Ala Ala Met Ala Ala Ile Asp Ala Arg Ile Glu 35
40 45Leu Ala Lys Ala Ala Val Leu Glu Thr Gly
Met Gly Leu Val Glu Asp 50 55 60Lys
Val Ile Lys Asn His Phe Ala Gly Glu Tyr Ile Tyr Asn Lys Tyr65
70 75 80Lys Asp Glu Lys Thr Cys
Gly Ile Ile Glu Arg Asn Glu Pro Tyr Gly 85
90 95Ile Thr Lys Ile Ala Glu Pro Ile Gly Val Val Ala
Ala Ile Ile Pro 100 105 110Val
Thr Asn Pro Thr Ser Thr Thr Ile Phe Lys Ser Leu Ile Ser Leu 115
120 125Lys Thr Arg Asn Gly Ile Phe Phe Ser
Pro His Pro Arg Ala Lys Lys 130 135
140Ser Thr Ile Leu Ala Ala Lys Thr Ile Leu Asp Ala Ala Val Lys Ser145
150 155 160Gly Ala Pro Glu
Asn Ile Ile Gly Trp Ile Asp Glu Pro Ser Ile Glu 165
170 175Leu Thr Gln Tyr Leu Met Gln Lys Ala Asp
Ile Thr Leu Ala Thr Gly 180 185
190Gly Pro Ser Leu Val Lys Ser Ala Tyr Ser Ser Gly Lys Pro Ala Ile
195 200 205Gly Val Gly Pro Gly Asn Thr
Pro Val Ile Ile Asp Glu Ser Ala His 210 215
220Ile Lys Met Ala Val Ser Ser Ile Ile Leu Ser Lys Thr Tyr Asp
Asn225 230 235 240Gly Val
Ile Cys Ala Ser Glu Gln Ser Val Ile Val Leu Lys Ser Ile
245 250 255Tyr Asn Lys Val Lys Asp Glu
Phe Gln Glu Arg Gly Ala Tyr Ile Ile 260 265
270Lys Lys Asn Glu Leu Asp Lys Val Arg Glu Val Ile Phe Lys
Asp Gly 275 280 285Ser Val Asn Pro
Lys Ile Val Gly Gln Ser Ala Tyr Thr Ile Ala Ala 290
295 300Met Ala Gly Ile Lys Val Pro Lys Thr Thr Arg Ile
Leu Ile Gly Glu305 310 315
320Val Thr Ser Leu Gly Glu Glu Glu Pro Phe Ala His Glu Lys Leu Ser
325 330 335Pro Val Leu Ala Met
Tyr Glu Ala Asp Asn Phe Asp Asp Ala Leu Lys 340
345 350Lys Ala Val Thr Leu Ile Asn Leu Gly Gly Leu Gly
His Thr Ser Gly 355 360 365Ile Tyr
Ala Asp Glu Ile Lys Ala Arg Asp Lys Ile Asp Arg Phe Ser 370
375 380Ser Ala Met Lys Thr Val Arg Thr Phe Val Asn
Ile Pro Thr Ser Gln385 390 395
400Gly Ala Ser Gly Asp Leu Tyr Asn Phe Arg Ile Pro Pro Ser Phe Thr
405 410 415Leu Gly Cys Gly
Phe Trp Gly Gly Asn Ser Val Ser Glu Asn Val Gly 420
425 430Pro Lys His Leu Leu Asn Ile Lys Thr Val Ala
Glu Arg Arg Glu Asn 435 440 445Met
Leu Trp Phe Arg Val Pro His Lys Val Tyr Phe Lys Phe Gly Cys 450
455 460Leu Gln Phe Ala Leu Lys Asp Leu Lys Asp
Leu Lys Lys Lys Arg Ala465 470 475
480Phe Ile Val Thr Asp Ser Asp Pro Tyr Asn Leu Asn Tyr Val Asp
Ser 485 490 495Ile Ile Lys
Ile Leu Glu His Leu Asp Ile Asp Phe Lys Val Phe Asn 500
505 510Lys Val Gly Arg Glu Ala Asp Leu Lys Thr
Ile Lys Lys Ala Thr Glu 515 520
525Glu Met Ser Ser Phe Met Pro Asp Thr Ile Ile Ala Leu Gly Gly Thr 530
535 540Pro Glu Met Ser Ser Ala Lys Leu
Met Trp Val Leu Tyr Glu His Pro545 550
555 560Glu Val Lys Phe Glu Asp Leu Ala Ile Lys Phe Met
Asp Ile Arg Lys 565 570
575Arg Ile Tyr Thr Phe Pro Lys Leu Gly Lys Lys Ala Met Leu Val Ala
580 585 590Ile Thr Thr Ser Ala Gly
Ser Gly Ser Glu Val Thr Pro Phe Ala Leu 595 600
605Val Thr Asp Asn Asn Thr Gly Asn Lys Tyr Met Leu Ala Asp
Tyr Glu 610 615 620Met Thr Pro Asn Met
Ala Ile Val Asp Ala Glu Leu Met Met Lys Met625 630
635 640Pro Lys Gly Leu Thr Ala Tyr Ser Gly Ile
Asp Ala Leu Val Asn Ser 645 650
655Ile Glu Ala Tyr Thr Ser Val Tyr Ala Ser Glu Tyr Thr Asn Gly Leu
660 665 670Ala Leu Glu Ala Ile
Arg Leu Ile Phe Lys Tyr Leu Pro Glu Ala Tyr 675
680 685Lys Asn Gly Arg Thr Asn Glu Lys Ala Arg Glu Lys
Met Ala His Ala 690 695 700Ser Thr Met
Ala Gly Met Ala Ser Ala Asn Ala Phe Leu Gly Leu Cys705
710 715 720His Ser Met Ala Ile Lys Leu
Ser Ser Glu His Asn Ile Pro Ser Gly 725
730 735Ile Ala Asn Ala Leu Leu Ile Glu Glu Val Ile Lys
Phe Asn Ala Val 740 745 750Asp
Asn Pro Val Lys Gln Ala Pro Cys Pro Gln Tyr Lys Tyr Pro Asn 755
760 765Thr Ile Phe Arg Tyr Ala Arg Ile Ala
Asp Tyr Ile Lys Leu Gly Gly 770 775
780Asn Thr Asp Glu Glu Lys Val Asp Leu Leu Ile Asn Lys Ile His Glu785
790 795 800Leu Lys Lys Ala
Leu Asn Ile Pro Thr Ser Ile Lys Asp Ala Gly Val 805
810 815Leu Glu Glu Asn Phe Tyr Ser Ser Leu Asp
Arg Ile Ser Glu Leu Ala 820 825
830Leu Asp Asp Gln Cys Thr Gly Ala Asn Pro Arg Phe Pro Leu Thr Ser
835 840 845Glu Ile Lys Glu Met Tyr Ile
Asn Cys Phe Lys Lys Gln Pro 850 855
860
SEQ ID NO: 36 (length 389, PRT, Clostridium acetobutylicum)
Met Leu Ser Phe Asp Tyr Ser Ile
Pro Thr Lys Val Phe Phe Gly Lys1 5 10
15Gly Lys Ile Asp Val Ile Gly Glu Glu Ile Lys Lys Tyr Gly
Ser Arg 20 25 30Val Leu Ile
Val Tyr Gly Gly Gly Ser Ile Lys Arg Asn Gly Ile Tyr 35
40 45Asp Arg Ala Thr Ala Ile Leu Lys Glu Asn Asn
Ile Ala Phe Tyr Glu 50 55 60Leu Ser
Gly Val Glu Pro Asn Pro Arg Ile Thr Thr Val Lys Lys Gly65
70 75 80Ile Glu Ile Cys Arg Glu Asn
Asn Val Asp Leu Val Leu Ala Ile Gly 85 90
95Gly Gly Ser Ala Ile Asp Cys Ser Lys Val Ile Ala Ala
Gly Val Tyr 100 105 110Tyr Asp
Gly Asp Thr Trp Asp Met Val Lys Asp Pro Ser Lys Ile Thr 115
120 125Lys Val Leu Pro Ile Ala Ser Ile Leu Thr
Leu Ser Ala Thr Gly Ser 130 135 140Glu
Met Asp Gln Ile Ala Val Ile Ser Asn Met Glu Thr Asn Glu Lys145
150 155 160Leu Gly Val Gly His Asp
Asp Met Arg Pro Lys Phe Ser Val Leu Asp 165
170 175Pro Thr Tyr Thr Phe Thr Val Pro Lys Asn Gln Thr
Ala Ala Gly Thr 180 185 190Ala
Asp Ile Met Ser His Thr Phe Glu Ser Tyr Phe Ser Gly Val Glu 195
200 205Gly Ala Tyr Val Gln Asp Gly Ile Ala
Glu Ala Ile Leu Arg Thr Cys 210 215
220Ile Lys Tyr Gly Lys Ile Ala Met Glu Lys Thr Asp Asp Tyr Glu Ala225
230 235 240Arg Ala Asn Leu
Met Trp Ala Ser Ser Leu Ala Ile Asn Gly Leu Leu 245
250 255Ser Leu Gly Lys Asp Arg Lys Trp Ser Cys
His Pro Met Glu His Glu 260 265
270Leu Ser Ala Tyr Tyr Asp Ile Thr His Gly Val Gly Leu Ala Ile Leu
275 280 285Thr Pro Asn Trp Met Glu Tyr
Ile Leu Asn Asp Asp Thr Leu His Lys 290 295
300Phe Val Ser Tyr Gly Ile Asn Val Trp Gly Ile Asp Lys Asn Lys
Asp305 310 315 320Asn Tyr
Glu Ile Ala Arg Glu Ala Ile Lys Asn Thr Arg Glu Tyr Phe
325 330 335Asn Ser Leu Gly Ile Pro Ser
Lys Leu Arg Glu Val Gly Ile Gly Lys 340 345
350Asp Lys Leu Glu Leu Met Ala Lys Gln Ala Val Arg Asn Ser
Gly Gly 355 360 365Thr Ile Gly Ser
Leu Arg Pro Ile Asn Ala Glu Asp Val Leu Glu Ile 370
375 380Phe Lys Lys Ser Tyr385
SEQ ID NO: 37 (length 390, PRT, Clostridium acetobutylicum)
Met Val Asp Phe Glu Tyr Ser Ile Pro Thr Arg Ile Phe Phe
Gly Lys1 5 10 15Asp Lys
Ile Asn Val Leu Gly Arg Glu Leu Lys Lys Tyr Gly Ser Lys 20
25 30Val Leu Ile Val Tyr Gly Gly Gly Ser
Ile Lys Arg Asn Gly Ile Tyr 35 40
45Asp Lys Ala Val Ser Ile Leu Glu Lys Asn Ser Ile Lys Phe Tyr Glu 50
55 60Leu Ala Gly Val Glu Pro Asn Pro Arg
Val Thr Thr Val Glu Lys Gly65 70 75
80Val Lys Ile Cys Arg Glu Asn Gly Val Glu Val Val Leu Ala
Ile Gly 85 90 95Gly Gly
Ser Ala Ile Asp Cys Ala Lys Val Ile Ala Ala Ala Cys Glu 100
105 110Tyr Asp Gly Asn Pro Trp Asp Ile Val
Leu Asp Gly Ser Lys Ile Lys 115 120
125Arg Val Leu Pro Ile Ala Ser Ile Leu Thr Ile Ala Ala Thr Gly Ser
130 135 140Glu Met Asp Thr Trp Ala Val
Ile Asn Asn Met Asp Thr Asn Glu Lys145 150
155 160Leu Ile Ala Ala His Pro Asp Met Ala Pro Lys Phe
Ser Ile Leu Asp 165 170
175Pro Thr Tyr Thr Tyr Thr Val Pro Thr Asn Gln Thr Ala Ala Gly Thr
180 185 190Ala Asp Ile Met Ser His
Ile Phe Glu Val Tyr Phe Ser Asn Thr Lys 195 200
205Thr Ala Tyr Leu Gln Asp Arg Met Ala Glu Ala Leu Leu Arg
Thr Cys 210 215 220Ile Lys Tyr Gly Gly
Ile Ala Leu Glu Lys Pro Asp Asp Tyr Glu Ala225 230
235 240Arg Ala Asn Leu Met Trp Ala Ser Ser Leu
Ala Ile Asn Gly Leu Leu 245 250
255Thr Tyr Gly Lys Asp Thr Asn Trp Ser Val His Leu Met Glu His Glu
260 265 270Leu Ser Ala Tyr Tyr
Asp Ile Thr His Gly Val Gly Leu Ala Ile Leu 275
280 285Thr Pro Asn Trp Met Glu Tyr Ile Leu Asn Asn Asp
Thr Val Tyr Lys 290 295 300Phe Val Glu
Tyr Gly Val Asn Val Trp Gly Ile Asp Lys Glu Lys Asn305
310 315 320His Tyr Asp Ile Ala His Gln
Ala Ile Gln Lys Thr Arg Asp Tyr Phe 325
330 335Val Asn Val Leu Gly Leu Pro Ser Arg Leu Arg Asp
Val Gly Ile Glu 340 345 350Glu
Glu Lys Leu Asp Ile Met Ala Lys Glu Ser Val Lys Leu Thr Gly 355
360 365Gly Thr Ile Gly Asn Leu Arg Pro Val
Asn Ala Ser Glu Val Leu Gln 370 375
380Ile Phe Lys Lys Ser Val385 390
SEQ ID NO: 38 (length 336, PRT, Clostridium acetobutylicum)
Met Asn Lys Ala Asp Tyr Lys Gly Val Trp Val Phe Ala Glu
Gln Arg1 5 10 15Asp Gly
Glu Leu Gln Lys Val Ser Leu Glu Leu Leu Gly Lys Gly Lys 20
25 30Glu Met Ala Glu Lys Leu Gly Val Glu
Leu Thr Ala Val Leu Leu Gly 35 40
45His Asn Thr Glu Lys Met Ser Lys Asp Leu Leu Ser His Gly Ala Asp 50
55 60Lys Val Leu Ala Ala Asp Asn Glu Leu
Leu Ala His Phe Ser Thr Asp65 70 75
80Gly Tyr Ala Lys Val Ile Cys Asp Leu Val Asn Glu Arg Lys
Pro Glu 85 90 95Ile Leu
Phe Ile Gly Ala Thr Phe Ile Gly Arg Asp Leu Gly Pro Arg 100
105 110Ile Ala Ala Arg Leu Ser Thr Gly Leu
Thr Ala Asp Cys Thr Ser Leu 115 120
125Asp Ile Asp Val Glu Asn Arg Asp Leu Leu Ala Thr Arg Pro Ala Phe
130 135 140Gly Gly Asn Leu Ile Ala Thr
Ile Val Cys Ser Asp His Arg Pro Gln145 150
155 160Met Ala Thr Val Arg Pro Gly Val Phe Glu Lys Leu
Pro Val Asn Asp 165 170
175Ala Asn Val Ser Asp Asp Lys Ile Glu Lys Val Ala Ile Lys Leu Thr
180 185 190Ala Ser Asp Ile Arg Thr
Lys Val Ser Lys Val Val Lys Leu Ala Lys 195 200
205Asp Ile Ala Asp Ile Gly Glu Ala Lys Val Leu Val Ala Gly
Gly Arg 210 215 220Gly Val Gly Ser Lys
Glu Asn Phe Glu Lys Leu Glu Glu Leu Ala Ser225 230
235 240Leu Leu Gly Gly Thr Ile Ala Ala Ser Arg
Ala Ala Ile Glu Lys Glu 245 250
255Trp Val Asp Lys Asp Leu Gln Val Gly Gln Thr Gly Lys Thr Val Arg
260 265 270Pro Thr Leu Tyr Ile
Ala Cys Gly Ile Ser Gly Ala Ile Gln His Leu 275
280 285Ala Gly Met Gln Asp Ser Asp Tyr Ile Ile Ala Ile
Asn Lys Asp Val 290 295 300Glu Ala Pro
Ile Met Lys Val Ala Asp Leu Ala Ile Val Gly Asp Val305
310 315 320Asn Lys Val Val Pro Glu Leu
Ile Ala Gln Val Lys Ala Ala Asn Asn 325
330 335
SEQ ID NO: 39 (length 259, PRT, Clostridium acetobutylicum)
Met Asn Ile
Val Val Cys Leu Lys Gln Val Pro Asp Thr Ala Glu Val1 5
10 15Arg Ile Asp Pro Val Lys Gly Thr Leu
Ile Arg Glu Gly Val Pro Ser 20 25
30Ile Ile Asn Pro Asp Asp Lys Asn Ala Leu Glu Glu Ala Leu Val Leu
35 40 45Lys Asp Asn Tyr Gly Ala His
Val Thr Val Ile Ser Met Gly Pro Pro 50 55
60Gln Ala Lys Asn Ala Leu Val Glu Ala Leu Ala Met Gly Ala Asp Glu65
70 75 80Ala Val Leu Leu
Thr Asp Arg Ala Phe Gly Gly Ala Asp Thr Leu Ala 85
90 95Thr Ser His Thr Ile Ala Ala Gly Ile Lys
Lys Leu Lys Tyr Asp Ile 100 105
110Val Phe Ala Gly Arg Gln Ala Ile Asp Gly Asp Thr Ala Gln Val Gly
115 120 125Pro Glu Ile Ala Glu His Leu
Gly Ile Pro Gln Val Thr Tyr Val Glu 130 135
140Lys Val Glu Val Asp Gly Asp Thr Leu Lys Ile Arg Lys Ala Trp
Glu145 150 155 160Asp Gly
Tyr Glu Val Val Glu Val Lys Thr Pro Val Leu Leu Thr Ala
165 170 175Ile Lys Glu Leu Asn Val Pro
Arg Tyr Met Ser Val Glu Lys Ile Phe 180 185
190Gly Ala Phe Asp Lys Glu Val Lys Met Trp Thr Ala Asp Asp
Ile Asp 195 200 205Val Asp Lys Ala
Asn Leu Gly Leu Lys Gly Ser Pro Thr Lys Val Lys 210
215 220Lys Ser Ser Thr Lys Glu Val Lys Gly Gln Gly Glu
Val Ile Asp Lys225 230 235
240Pro Val Lys Glu Ala Ala Ala Tyr Val Val Ser Lys Leu Lys Glu Glu
245 250 255His Tyr
Ile
SEQ ID NO: 40 (length 5976, DNA, Artificial, plasmid YEplac112PtdhTadh)
gcccggggga tccactagtt
ctagaatccg tcgaaactaa gttctggtgt tttaaaacta 60aaaaaaagac taactataaa
agtagaattt aagaagttta agaaatagat ttacagaatt 120acaatcaata cctaccgtct
ttatatactt attagtcaag taggggaata atttcaggga 180actggtttca accttttttt
tcagcttttt ccaaatcaga gagagcagaa ggtaatagaa 240ggtgtaagaa aatgagatag
atacatgcgt gggtcaattg ccttgtgtca tcatttactc 300caggcaggtt gcatcactcc
attgaggttg tgcccgtttt ttgcctgttt gtgcccctgt 360tctctgtagt tgcgctaaga
gaatggacct atgaactgat ggttggtgaa gaaaacaata 420ttttggtgct gggattcttt
ttttttctgg atgccagctt aaaaagcggg ctccattata 480tttagtggat gccaggaata
aactgttcac ccagacacct acgatgttat atattctgtg 540taacccgccc cctattttgg
gcatgtacgg gttacagcag aattaaaagg ctaatttttt 600gactaaataa agttaggaaa
atcactacta ttaattattt acgtattctt tgaaatggcg 660agtattgata atgataaact
gagctcgaat tcactggccg tcgttttaca acgtcgtgac 720tgggaaaacc ctggcgttac
ccaacttaat cgccttgcag cacatccccc tttcgccagc 780tggcgtaata gcgaagaggc
ccgcaccgat cgcccttccc aacagttgcg cagcctgaat 840ggcgaatggc gcctgatgcg
gtattttctc cttacgcatc tgtgcggtat ttcacaccgc 900atatatcgga tcgtacttgt
tacccatcat tgaattttga acatccgaac ctgggagttt 960tccctgaaac agatagtata
tttgaacctg tataataata tatagtctag cgctttacgg 1020aagacaatgt atgtatttcg
gttcctggag aaactattgc atctattgca taggtaatct 1080tgcacgtcgc atccccggtt
cattttctgc gtttccatct tgcacttcaa tagcatatct 1140ttgttaacga agcatctgtg
cttcattttg tagaacaaaa atgcaacgcg agagcgctaa 1200tttttcaaac aaagaatctg
agctgcattt ttacagaaca gaaatgcaac gcgaaagcgc 1260tattttacca acgaagaatc
tgtgcttcat ttttgtaaaa caaaaatgca acgcgagagc 1320gctaattttt caaacaaaga
atctgagctg catttttaca gaacagaaat gcaacgcgag 1380agcgctattt taccaacaaa
gaatctatac ttcttttttg ttctacaaaa atgcatcccg 1440agagcgctat ttttctaaca
aagcatctta gattactttt tttctccttt gtgcgctcta 1500taatgcagtc tcttgataac
tttttgcact gtaggtccgt taaggttaga agaaggctac 1560tttggtgtct attttctctt
ccataaaaaa agcctgactc cacttcccgc gtttactgat 1620tactagcgaa gctgcgggtg
cattttttca agataaaggc atccccgatt atattctata 1680ccgatgtgga ttgcgcatac
tttgtgaaca gaaagtgata gcgttgatga ttcttcattg 1740gtcagaaaat tatgaacggt
ttcttctatt ttgtctctat atactacgta taggaaatgt 1800ttacattttc gtattgtttt
cgattcactc tatgaatagt tcttactaca atttttttgt 1860ctaaagagta atactagaga
taaacataaa aaatgtagag gtcgagttta gatgcaagtt 1920caaggagcga aaggtggatg
ggtaggttat atagggatat agcacagaga tatatagcaa 1980agagatactt ttgagcaatg
tttgtggaag cggtattcgc aatattttag tagctcgtta 2040cagtccggtg cgtttttggt
tttttgaaag tgcgtcttca gagcgctttt ggttttcaaa 2100agcgctctga agttcctata
ctttctagct agagaatagg aacttcggaa taggaacttc 2160aaagcgtttc cgaaaacgag
cgcttccgaa aatgcaacgc gagctgcgca catacagctc 2220actgttcacg tcgcacctat
atctgcgtgt tgcctgtata tatatataca tgagaagaac 2280ggcatagtgc gtgtttatgc
ttaaatgcgt acttatatgc gtctatttat gtaggatgaa 2340aggtagtcta gtacctcctg
tgatattatc ccattccatg cggggtatcg tatgcttcct 2400tcagcactac cctttagctg
ttctatatgc tgccactcct caattggatt agtctcatcc 2460ttcaatgcta tcatttcctt
tgatattgga tcgatccgat gataagctgt caaacatgag 2520aattgatctt ttatgcttgc
ttttcaaaag gcttgcaggc aagtgcacaa acaatactta 2580aataaatact actcagtaat
aacctatttc ttagcatttt tgacgaaatt tgctattttg 2640ttagagtctt ttacaccatt
tgtctccaca cctccgctta catcaacacc aataacgcca 2700tttaatctaa gcgcatcacc
aacattttct ggcgtcagtc caccagctaa cataaaatgt 2760aagctctcgg ggctctcttg
ccttccaacc cagtcagaaa tcgagttcca atccaaaagt 2820tcacctgtcc cacctgcttc
tgaatcaaac aagggaataa acgaatgagg tttctgtgaa 2880gctgcactga gtagtatgtt
gcagtctttt ggaaatacga gtcttttaat aactggcaaa 2940ccgaggaact cttggtattc
ttgccacgac tcatctccat gcagttggac gatatcaatg 3000ccgtaatcat tgaccagagc
caaaacatcc tccttaggtt gattacgaaa cacgccaacc 3060aagtatttcg gagtgcctga
actattttta tatgctttta caagacttga aattttcctt 3120gcaataaccg ggtcaattgt
tctctttcta ttgggcacac atataatacc cagcaagtca 3180gcatcggaat ctagtgcaca
ttctgcggcc tctgtgctct gcaagccgca aactttcacc 3240aatggaccag aactacctgt
gaaattaata acagacatac tccaagctgc ctttgtgtgc 3300ttaatcacgt atactcacgt
gctcaatagt caccaatgcc ctccctcttg gccctctcct 3360tttctttttt cgaccgaatt
aattcttgaa gacgaaaggg cctcgtgata cgcctatttt 3420tataggttaa tgtcatgata
ataatggttt cttagacgtc aggtggcact tttcggggaa 3480atgtgcgcgg aacccctatt
tgtttatttt tctaaataca ttcaaatatg tatccgctca 3540tgagacaata accctgataa
atgcttcaat aatattgaaa aaggaagagt atgagtattc 3600aacatttccg tgtcgccctt
attccctttt ttgcggcatt ttgccttcct gtttttgctc 3660acccagaaac gctggtgaaa
gtaaaagatg ctgaagatca gttgggtgca cgagtgggtt 3720acatcgaact ggatctcaac
agcggtaaga tccttgagag ttttcgcccc gaagaacgtt 3780ttccaatgat gagcactttt
aaagttctgc tatgtggcgc ggtattatcc cgtattgacg 3840ccgggcaaga gcaactcggt
cgccgcatac actattctca gaatgacttg gttgagtact 3900caccagtcac agaaaagcat
cttacggatg gcatgacagt aagagaatta tgcagtgctg 3960ccataaccat gagtgataac
actgcggcca acttacttct gacaacgatc ggaggaccga 4020aggagctaac cgcttttttg
cacaacatgg gggatcatgt aactcgcctt gatcgttggg 4080aaccggagct gaatgaagcc
ataccaaacg acgagcgtga caccacgatg cctgtagcaa 4140tggcaacaac gttgcgcaaa
ctattaactg gcgaactact tactctagct tcccggcaac 4200aattaataga ctggatggag
gcggataaag ttgcaggacc acttctgcgc tcggcccttc 4260cggctggctg gtttattgct
gataaatctg gagccggtga gcgtgggtct cgcggtatca 4320ttgcagcact ggggccagat
ggtaagccct cccgtatcgt agttatctac acgacgggga 4380gtcaggcaac tatggatgaa
cgaaatagac agatcgctga gataggtgcc tcactgatta 4440agcattggta actgtcagac
caagtttact catatatact ttagattgat ttaaaacttc 4500atttttaatt taaaaggatc
taggtgaaga tcctttttga taatctcatg accaaaatcc 4560cttaacgtga gttttcgttc
cactgagcgt cagaccccgt agaaaagatc aaaggatctt 4620cttgagatcc tttttttctg
cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac 4680cagcggtggt ttgtttgccg
gatcaagagc taccaactct ttttccgaag gtaactggct 4740tcagcagagc gcagatacca
aatactgtcc ttctagtgta gccgtagtta ggccaccact 4800tcaagaactc tgtagcaccg
cctacatacc tcgctctgct aatcctgtta ccagtggctg 4860ctgccagtgg cgataagtcg
tgtcttaccg ggttggactc aagacgatag ttaccggata 4920aggcgcagcg gtcgggctga
acggggggtt cgtgcacaca gcccagcttg gagcgaacga 4980cctacaccga actgagatac
ctacagcgtg agctatgaga aagcgccacg cttcccgaag 5040ggagaaaggc ggacaggtat
ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg 5100agcttccagg gggaaacgcc
tggtatcttt atagtcctgt cgggtttcgc cacctctgac 5160ttgagcgtcg atttttgtga
tgctcgtcag gggggcggag cctatggaaa aacgccagca 5220acgcggcctt tttacggttc
ctggcctttt gctggccttt tgctcacatg ttctttcctg 5280cgttatcccc tgattctgtg
gataaccgta ttaccgcctt tgagtgagct gataccgctc 5340gccgcagccg aacgaccgag
cgcagcgagt cagtgagcga ggaagcggaa gagcgcccaa 5400tacgcaaacc gcctctcccc
gcgcgttggc cgattcatta atgcagctgg cacgacaggt 5460ttcccgactg gaaagcgggc
agtgagcgca acgcaattaa tgtgagttag ctcactcatt 5520aggcacccca ggctttacac
tttatgcttc cggctcgtat gttgtgtgga attgtgagcg 5580gataacaatt tcacacagga
aacagctatg accatgatta cgccaagctt aggcctgtgt 5640ggaagaacga ttacaacagg
tgttgtcctc tgaggacata aaatacacac cgagattcat 5700caactcattg ctggagttag
catatctaca attgggtgaa atggggagcg atttgcaggc 5760atttgctcgg catgccggta
gaggtgtggt caataagagc gacctcatgc tatacctgag 5820aaagcaacct gacctacagg
aaagagttac tcaagaataa gaattttcgt tttaaaacct 5880aagagtcact ttaaaatttg
tatacactta ttttttttat aacttattta ataataaaaa 5940tcataaatca taagaaattc
gctcgagtcg actgca
5976
SEQ ID NO: 41 (length 13286, DNA, Artificial, pBOL34)
aagcttgcat gcctgcaggt cgacggcgcg
ccgggcccgt ttaaacggcc ggccaaggtg 60agacgcgcat aaccgctaga gtactttgaa
gaggaaacag caatagggtt gctaccagta 120taaatagaca ggtacataca acactggaaa
tggttgtctg tttgagtacg ctttcaattc 180atttgggtgt gcactttatt atgttacaat
atggaaggga actttacact tctcctatgc 240acatatatta attaaagtcc aatgctagta
gagaaggggg gtaacacccc tccgcgctct 300tttccgattt ttttctaaac cgtggaatat
ttcggatatc cttttgttgt ttccgggtgt 360acaatatgga cttcctcttt tctggcaacc
aaacccatac atcgggattc ctataatacc 420ttcgttggtc tccctaacat gtaggtggcg
gaggggagat atacaataga acagatacca 480gacaagacat aatgggctaa acaagactac
accaattaca ctgcctcatt gatggtggta 540cataacgaac taatactgta gccctagact
tgatagccat catcatatcg aagtttcact 600accctttttc catttgccat ctattgaagt
aataataggc gcatgcaact tcttttcttt 660ttttttcttt tctctctccc ccgttgttgt
ctcaccatat ccgcaatgac aaaaaaatga 720tggaagacac taaaggaaaa aattaacgac
aaagacagca ccaacagatg tcgttgttcc 780agagctgatg aggggtatct cgaagcacac
gaaacttttt ccttccttca ttcacgcaca 840ctactctcta atgagcaacg gtatacggcc
ttccttccag ttacttgaat ttgaaataaa 900aaaaagtttg ctgtcttgct atcaagtata
aatagacctg caattattaa tcttttgttt 960cctcgtcatt gttctcgttc cctttcttcc
ttgtttcttt ttctgcacaa tatttcaagc 1020tataccaagc atacaatcaa ctatctcata
tacaatgaag gaagttgtta ttgcttctgc 1080tgtcagaact gccattggtt cttacggtaa
gtctttgaag gacgtcccag ctgtcgactt 1140gggtgctacc gccatcaagg aagctgtcaa
gaaggctggt atcaagccag aagatgttaa 1200cgaagttatc ttaggtaacg ttttgcaagc
tggtttaggt caaaacccag ctcgtcaagc 1260ttctttcaag gctggtttgc cagttgaaat
tccagccatg accatcaaca aggtttgtgg 1320ttctggtttg agaactgttt ctttggctgc
tcaaatcatc aaggctggtg acgctgatgt 1380catcattgct ggtggtatgg aaaacatgtc
cagagctcca tacttggcta acaatgctag 1440atggggttac agaatgggta acgccaagtt
cgtcgatgaa atgatcactg acggtttatg 1500ggacgctttc aacgactacc acatgggtat
cactgctgaa aacattgctg aaagatggaa 1560catctccaga gaagaacaag atgaatttgc
tttggcttct caaaagaagg ctgaagaagc 1620catcaaatct ggtcaattca aggacgaaat
tgtcccagtt gtcatcaagg gtagaaaggg 1680tgaaaccgtt gtcgacaccg atgaacaccc
aagattcggt tccaccattg aaggtttggc 1740caagttgaaa ccagctttca agaaggatgg
taccgtcact gctggtaacg cttccggttt 1800gaacgactgt gctgctgttt tggttatcat
gtctgctgaa aaggccaagg aattgggtgt 1860caagccattg gccaagattg tctcctacgg
ttctgctggt gttgacccag ccatcatggg 1920ttacggtcct ttctacgcta ccaaggctgc
tatcgaaaag gctggttgga ccgttgacga 1980attggatttg attgaatcca acgaagcttt
cgctgctcaa tctttggctg ttgccaagga 2040cttgaaattc gacatgaaca aggtcaacgt
taacggtggt gccattgctt tgggtcaccc 2100aattggtgct tccggtgcca gaatcttggt
tactttagtc cacgctatgc aaaagcgtga 2160tgccaagaag ggtttggcta ctctatgtat
cggtggtggt caaggtactg ccatcttatt 2220ggaaaagtgt taggcccggg cataaagcaa
tcttgatgag gataatgatt tttttttgaa 2280tatacataaa tactaccgtt tttctgctag
attttgtgaa gacgtaaata agtacatatt 2340actttttaag ccaagacaag attaagcatt
aactttaccc ttttctcttc taagtttcaa 2400tactagttat cactgtttaa aagttatggc
gagaacgtcg gcggttaaaa tatattaccc 2460tgaacgtggt gaattgaagt tctaggatgg
tttaaagatt tttccttttt gggaaataag 2520taaacaatat attgctgcct ttgcaaaacg
cacataccca caatatgtga ctattggcaa 2580agaacgcatt atcctttgaa gaggtggata
ctgatactaa gagagtctct attccggctc 2640cacttttagt ccagagatta cttgtcttct
tacgtatcag aacaagaaag catttccaaa 2700gtaattgcat ttgcccttga gcagtatata
tatactaaga agtttaaaca tttaaacgtg 2760tgtgtgcatt atatatatta aaaattaaga
attagactaa ataaagtgtt tctaaaaaaa 2820tattaaagtt gaaatgtgcg tgttgtgaat
tgtgctctat tagaataatt atgacttgtg 2880tgcgtttcat attttaaaat aggaaataac
caagaaagaa aaagtaccat ccagagaaac 2940caattatatc aaatcaaata aaacaaccag
cttcggtgtg tgtgtgtgtg tgaagctaag 3000agttgatgcc atttaatcta aaaattttaa
ggtgtgtgtg tggataaaat attagaatga 3060caattcgaga tgaaatttta agcaaactct
agtaggaaat aagcggctta ttcttgttgg 3120ctcctaattc tttttagtgt atcagttccc
attgataaaa aaattaaaat taaaattaga 3180aaaattaaac cagaaaaatc aagttgatta
aaatgtgaca aaaattatga ttaaatgcta 3240cttcaacagg agcccgggcc tatttggagt
agtcgtagaa acccttacca gactttctac 3300ctaaccaacc agctctaacg tacttcttca
ataaagtgtg aggtctgtac ttagagtcac 3360cggtttcaga gtataagaca tccatgatgg
ccaaacagat atccaaaccg atgaagtcac 3420ctaattccaa tggacccatt gggtggttag
cacccaattt catggccttg tcgatatctt 3480caacagaagc aataccttca gccaaaatac
cgacagcttc gttgatcatt ggaatcaaga 3540ttctgttgac aacgaaacct ggagcttcag
caacttcaac tgggtcctta ccaatggcaa 3600tggaagtttc cttgacagca tcgaaagttt
cttgagaggt ggcaatacct ctgatgactt 3660cgaccaactt catgactgga gctgggttga
agaagtgcat accgataacc ttgtctggtc 3720tcttggtagc agaagcaact tcagtgatgg
acaaagaaga agtgttggaa gccaaaatgg 3780tttctggctt acagatgttg tccaaatcag
caaagatttg cttcttgatg tccattcttt 3840caacggcagc ttcaatgacc aaatcacagt
cagcagccat gttcaagtca acagtaccgg 3900agattctggt caagatttcg accttggtag
cttcttcaat cttacccttc ttgaccaact 3960tggacaagtt cttgttgatg aaatccaaac
cacggtcaac gaattcgtcc ttgatatctc 4020tcaaaacaac ttcgaaaccc ttggcagcga
aagcttgagc aataccagaa cccatggtac 4080cggcaccaat gacacaaacc ttcttcattt
tgatttagtg tttgtgtgtt gataagcagt 4140tgcttggttt tttatgaaaa atagctagaa
ggaataaggg attacaagag agatgttaca 4200agaaagaagt aaaataaatt tgattaatat
tgccattatc aaaagctatt tatatgttga 4260aatcgtggag atcatgtgtg ccagaaaagg
ccacagtttc cggggagagg cataccttga 4320ggtggctagg aatcacggag acctcttgac
ttgcagggta ggctagctag aattaagtga 4380ggtgacaagg tttccataca gttttgacct
tgagacgttg ctacttacga tttgcagtat 4440gcaagtctca tgctgcaaac aaaagaggac
cgctcaggta atcgctcaat tagtggacgt 4500tatcaggggc gggagaggcg aaagtggttt
ttggtggtgt aagtaaaggt cgtccaaata 4560tgcaggtgtt tgggtgctat cctagtggaa
gctcggatca gtagataacc cgcctagaag 4620cggtattttt cttttttttt cttccttctt
tttcgtcatt atttcaaacg cttttgcgtc 4680aagtaatgaa tatctggcgg ttccgcggta
atgcgacaat ttgtgatatg cactcttaaa 4740accccgccac gatgatcgca cgtgccggca
tttatagacg acttttctgg ttgtcccgct 4800tcacggcaca tgcatgcatc aatgaccgaa
ttcaggttgc tactaaccat tgtgttgtgt 4860tattgctgtg catgaggtgc tcaagtgccc
gcggcatctg actagtggta actctagacg 4920gcttcgatgc agagagttcc tcaaaatttt
tcttttcaat tgtttgcctg gtttccgcgg 4980cgtatatcag tttttggcga tatggtaacg
cgatactcta cggcaccttc acggtagatg 5040tcttttttaa aagtgactgt taattccagg
attgaaagga agtgtcgaat agtatagtat 5100gctttctagg ccggccgttt aaatgggccc
gcggcccgtt taaacggccg gcccttccct 5160tttacagtgc ttcggaaaag cacagcgttg
tccaagggaa caatttttct tcaagttaat 5220gcataagaaa tatctttttt tatgtttagc
taagtaaaag cagcttggag taaaaaaaaa 5280aatgagtaaa tttctcgatg gattagtttc
tcacaggtaa cataacaaaa accaagaaaa 5340gcccgcttct gaaaactaca gttgacttgt
atgctaaagg gccagactaa tgggaggaga 5400aaaagaaacg aatgtatatg ctcatttaca
ctctatatca ccatatggag gataagttgg 5460gctgagcttc tgatccaatt tattctatcc
attagttgct gatatgtccc accagccaac 5520acttgatagt atctactcgc cattcacttc
cagcagcgcc agtagggttg ttgagcttag 5580taaaaatgtg cgcaccacaa gcctacatga
ctccacgtca catgaaacca caccgtgggg 5640ccttgttgcg ctaggaatag gatatgcgac
gaagacgctt ctgcttagta accacaccac 5700attttcaggg ggtcgatctg cttgcttcct
ttactgtcac gagcggccca taatcgcgct 5760ttttttttaa aaggcgcgag acagcaaaca
ggaagctcgg gtttcaacct tcggagtggt 5820cgcagatctg gagactggat ctttacaata
cagtaaggca agccaccatc tgcttcttag 5880gtgcatgcga cggtatccac gtgcagaaca
acatagtctg aagaaggggg ggaggagcat 5940gttcattctc tgtagcagta agagcttggt
gataatgacc aaaactggag tctcgaaatc 6000atataaatag acaatatatt ttcacacaat
gagatttgta gtacagttct attctctctc 6060ttgcataaat aagaaattca tcaagaactt
ggtttgatat ttcaccaaca cacacaaaaa 6120acagtacttc actaaattta cacacaaaac
aaaatggaat tgaacaacgt tatcttggaa 6180aaggaaggta aggttgccgt tgtcaccatc
aacagaccaa aggctttgaa tgctttgaac 6240tctgacactt tgaaggaaat ggactacgtc
attggtgaaa ttgaaaacga ttctgaagtt 6300ttggctgtca tcttgaccgg tgccggtgaa
aagtctttcg ttgctggtgc tgatatctct 6360gaaatgaagg aaatgaacac cattgaaggt
agaaagttcg gtatcttagg taacaaggtt 6420ttcagaagat tggaattgtt ggaaaagcca
gtcattgctg ctgtcaacgg tttcgctttg 6480ggtggtggtt gtgaaattgc catgtcctgt
gacatcagaa ttgcttcttc taacgctcgt 6540ttcggtcaac cagaagtcgg tctaggtatc
actccaggtt tcggtggtac tcaaagatta 6600tccagattgg ttggtatggg tatggccaag
caattgatct tcaccgctca aaacatcaag 6660gctgacgaag ctttgagaat tggtttagtc
aacaaggttg ttgaaccatc tgaattgatg 6720aacactgcca aggaaattgc taacaagatc
gtctccaacg ctccagttgc tgtcaaattg 6780tccaagcaag ccatcaacag aggtatgcaa
tgtgatatcg acaccgcttt ggcctttgaa 6840tctgaagctt tcggtgaatg tttctccact
gaagaccaaa aggatgctat gaccgctttc 6900atcgaaaaga gaaagattga aggtttcaag
aacaggtgat gagcccgggc gcgaatttct 6960tatgatttat gatttttatt attaaataag
ttataaaaaa aataagtgta tacaaatttt 7020aaagtgactc ttaggtttta aaacgaaaat
tcttattctt gagtaactct ttcctgtagg 7080tcaggttgct ttctcaggta tagcatgagg
tcgctcttat tgaccacacc tctaccggca 7140tgccgagcaa atgcctgcaa atcgctcccc
atttcaccca attgtagata tgctaactcc 7200agcaatgagt tgatgaatct cggtgtgtat
tttatgtcct cagaggacaa cacctgttgt 7260aatcgttctt ccacacggat ccacagccta
gccttcagtt gggctctatc ttcatcgtca 7320ttcattgcat ctactagccc cttacctgag
cttcaagacg ttatatcgct tttatgtatc 7380atgatcttat cttgagatat gaatacataa
atatatttac tcaagtgtat acgtgcatgc 7440tttttttacg gtttaaacat ttaaatgggc
cgctctagag gatccccggg taccgagctc 7500gggcccagcg ctactagttc cggtaatttg
aaaacaaacc cggtctcgaa gcggagatcc 7560ggcgataatt accgcagaaa taaacccata
cacgagacgt agaaccagcc gcacatggcc 7620ggagaaactc ctgcgagaat ttcgtaaact
cgcgcgcatt gcatctgtat ttcctaatgc 7680ggcacttcca ggcctcgaga cctctgacat
gcttttgaca ggaatagaca ttttcagaat 7740gttatccata tgcctttcgg gtttttttcc
ttccttttcc atcatgaaaa atctctcgag 7800accgtttatc cattgctttt ttgttgtctt
tttccctcgt tcacagaaag tctgaagaag 7860ctatagtaga actatgagct ttttttgttt
ctgttttcct tttttttttt tttacctctg 7920tggaaattgt tactctcaca ctctttagtt
cgtttgtttg ttttgtttat tccaattatg 7980accggtgacg aaacgtggtc gatggtgggt
accgcttatg ctcccctcca ttagtttcga 8040ttatataaaa aggccaaata ttgtattatt
ttcaaatgtc ctatcattat cgtctaacat 8100ctaatttctc ttaaattttt tctctttctt
tcctataaca ccaatagtga aaatcttttt 8160ttcttctata tctacaaaaa cttttttttt
ctatcaacct cgttgataaa ttttttcttt 8220aacaatcgtt aataattaat taattggaaa
ataaccattt tttctctctt ttatacacac 8280attcaaaaga aagaaaaaaa atatacccca
gctagttaaa gaaaatcatt gaaaagaata 8340agaagataag aaagatttaa ttatcaaaca
atatcaatat gcctcaatcc tgggaagaac 8400tggccgctga taagcgcgcc cgcctcgcaa
aaaccatccc tgatgaatgg aaagtccaga 8460cgctgcctgc ggaagacagc gttattgatt
tcccaaagaa atcggggatc ctttcagagg 8520ccgaactgaa gatcacagag gcctccgctg
cagatcttgt gtccaagctg gcggccggag 8580agttgacctc ggtggaagtt acgctagcat
tctgtaaacg ggcagcaatc gcccagcagt 8640taacaaactg cgcccacgag ttcttccctg
acgccgctct cgcgcaggca agggaactcg 8700atgaatacta cgcaaagcac aagagacccg
ttggtccact ccatggcctc cccatctctc 8760tcaaagacca gcttcgagtc aagggctacg
aaacatcaat gggctacatc tcatggctaa 8820acaagtacga cgaaggggac tcggttctga
caaccatgct ccgcaaagcc ggtgccgtct 8880tctacgtcaa gacctctgtc ccgcagaccc
tgatggtctg cgagacagtc aacaacatca 8940tcgggcgcac cgtcaaccca cgcaacaaga
actggtcgtg cggcggcagt tctggtggtg 9000agggtgcgat cgttgggatt cgtggtggcg
tcatcggtgt aggaacggat atcggtggct 9060cgattcgagt gccggccgcg ttcaacttcc
tgtacggtct aaggccgagt catgggcggc 9120tgccgtatgc aaagatggcg aacagcatgg
agggtcagga gacggtgcac agcgttgtcg 9180ggccgattac gcactctgtt gaggacctcc
gcctcttcac caaatccgtc ctcggtcagg 9240agccatggaa atacgactcc aaggtcatcc
ccatgccctg gcgccagtcc gagtcggaca 9300ttattgcctc caagatcaag aacggcgggc
tcaatatcgg ctactacaac ttcgacggca 9360atgtccttcc acaccctcct atcctgcgcg
gcgtggaaac caccgtcgcc gcactcgcca 9420aagccggtca caccgtgacc ccgtggacgc
catacaagca cgatttcggc cacgatctca 9480tctcccatat ctacgcggct gacggcagcg
ccgacgtaat gcgcgatatc agtgcatccg 9540gcgagccggc gattccaaat atcaaagacc
tactgaaccc gaacatcaaa gctgttaaca 9600tgaacgagct ctgggacacg catctccaga
agtggaatta ccagatggag taccttgaga 9660aatggcggga ggctgaagaa aaggccggga
aggaactgga cgccatcatc gcgccgatta 9720cgcctaccgc tgcggtacgg catgaccagt
tccggtacta tgggtatgcc tctgtgatca 9780acctgctgga tttcacgagc gtggttgttc
cggttacctt tgcggataag aacatcgata 9840agaagaatga gagtttcaag gcggttagtg
agcttgatgc cctcgtgcag gaagagtatg 9900atccggaggc gtaccatggg gcaccggttg
cagtgcaggt tatcggacgg agactcagtg 9960aagagaggac gttggcgatt gcagaggaag
tggggaagtt gctgggaaat gtggtgactc 10020cataggtcga gaatttatac ttagataagt
atgtacttac aggtatattt ctatgagata 10080ctgatgtata catgcatgat aatatttaaa
cggttattag tgccgattgt cttgtgcgat 10140aatgacgttc ctatcaaagc aatacactta
ccacctatta catgggccaa gaaaatattt 10200tcgaacttgt ttagaatatt agcacagagt
atatgatgat atccgttaga ttatgcatga 10260ttcattccta caactttttc gtagcataag
gattaattac ttggatgcca ataaaaaaaa 10320aaaacatcga gaaaatttca gcatgctcag
aaacaattgc agtgtatcaa agtaaaaaaa 10380agattttcgc tacatgttcc ttttgaagaa
agaaaatcat ggaacattag atttacaaaa 10440atttaaccac cgctgattaa cgattagacc
gttaagcgca caacaggtta ttagtacaga 10500gaaagcattc tgtggtgttg ccccggactt
tcttttgcga cataggtaaa tcgaatacca 10560tcatactatc ttttccaatg actccctaaa
gaaagactct tcttcgatgt tgtatacgtt 10620ggagcatagg gcaagaattg tggcttgaga
tgaattcact ggccgtcgtt ttacaacgtc 10680gtgactggga aaaccctggc gttacccaac
ttaatcgcct tgcagcacat ccccctttcg 10740ccagctggcg taatagcgaa gaggcccgca
ccgatcgccc ttcccaacag ttgcgcagcc 10800tgaatggcga atggcgcctg atgcggtatt
ttctccttac gcatctgtgc ggtatttcac 10860accgcatatg gtgcactctc agtacaatct
gctctgatgc cgcatagtta agccagcccc 10920gacacccgcc aacacccgct gacgcgccct
gacgggcttg tctgctcccg gcatccgctt 10980acagacaagc tgtgaccgtc tccgggagct
gcatgtgtca gaggttttca ccgtcatcac 11040cgaaacgcgc gagacgaaag ggcctcgtga
tacgcctatt tttataggtt aatgtcatga 11100taataatggt ttcttagacg tcaggtggca
cttttcgggg aaatgtgcgc ggaaccccta 11160tttgtttatt tttctaaata cattcaaata
tgtatccgct catgagacaa taaccctgat 11220aaatgcttca ataatattga aaaaggaaga
gtatgagtat tcaacatttc cgtgtcgccc 11280ttattccctt ttttgcggca ttttgccttc
ctgtttttgc tcacccagaa acgctggtga 11340aagtaaaaga tgctgaagat cagttgggtg
cacgagtggg ttacatcgaa ctggatctca 11400acagcggtaa gatccttgag agttttcgcc
ccgaagaacg ttttccaatg atgagcactt 11460ttaaagttct gctatgtggc gcggtattat
cccgtattga cgccgggcaa gagcaactcg 11520gtcgccgcat acactattct cagaatgact
tggttgagta ctcaccagtc acagaaaagc 11580atcttacgga tggcatgaca gtaagagaat
tatgcagtgc tgccataacc atgagtgata 11640acactgcggc caacttactt ctgacaacga
tcggaggacc gaaggagcta accgcttttt 11700tgcacaacat gggggatcat gtaactcgcc
ttgatcgttg ggaaccggag ctgaatgaag 11760ccataccaaa cgacgagcgt gacaccacga
tgcctgtagc aatggcaaca acgttgcgca 11820aactattaac tggcgaacta cttactctag
cttcccggca acaattaata gactggatgg 11880aggcggataa agttgcagga ccacttctgc
gctcggccct tccggctggc tggtttattg 11940ctgataaatc tggagccggt gagcgtgggt
ctcgcggtat cattgcagca ctggggccag 12000atggtaagcc ctcccgtatc gtagttatct
acacgacggg gagtcaggca actatggatg 12060aacgaaatag acagatcgct gagataggtg
cctcactgat taagcattgg taactgtcag 12120accaagttta ctcatatata ctttagattg
atttaaaact tcatttttaa tttaaaagga 12180tctaggtgaa gatccttttt gataatctca
tgaccaaaat cccttaacgt gagttttcgt 12240tccactgagc gtcagacccc gtagaaaaga
tcaaaggatc ttcttgagat cctttttttc 12300tgcgcgtaat ctgctgcttg caaacaaaaa
aaccaccgct accagcggtg gtttgtttgc 12360cggatcaaga gctaccaact ctttttccga
aggtaactgg cttcagcaga gcgcagatac 12420caaatactgt ccttctagtg tagccgtagt
taggccacca cttcaagaac tctgtagcac 12480cgcctacata cctcgctctg ctaatcctgt
taccagtggc tgctgccagt ggcgataagt 12540cgtgtcttac cgggttggac tcaagacgat
agttaccgga taaggcgcag cggtcgggct 12600gaacgggggg ttcgtgcaca cagcccagct
tggagcgaac gacctacacc gaactgagat 12660acctacagcg tgagctatga gaaagcgcca
cgcttcccga agggagaaag gcggacaggt 12720atccggtaag cggcagggtc ggaacaggag
agcgcacgag ggagcttcca gggggaaacg 12780cctggtatct ttatagtcct gtcgggtttc
gccacctctg acttgagcgt cgatttttgt 12840gatgctcgtc aggggggcgg agcctatgga
aaaacgccag caacgcggcc tttttacggt 12900tcctggcctt ttgctggcct tttgctcaca
tgttctttcc tgcgttatcc cctgattctg 12960tggataaccg tattaccgcc tttgagtgag
ctgataccgc tcgccgcagc cgaacgaccg 13020agcgcagcga gtcagtgagc gaggaagcgg
aagagcgccc aatacgcaaa ccgcctctcc 13080ccgcgcgttg gccgattcat taatgcagct
ggcacgacag gtttcccgac tggaaagcgg 13140gcagtgagcg caacgcaatt aatgtgagtt
agctcactca ttaggcaccc caggctttac 13200actttatgct tccggctcgt atgttgtgtg
gaattgtgag cggataacaa tttcacacag 13260gaaacagcta tgaccatgat tacgcc
13286
SEQ ID NO 42 (pBOL36): length 16359, DNA, Artificial
aagcttgcat gcctgcaggt cgacggcgcg ccgggcccgt ttaaacaatg gcaaactgag
60cacaacaata ccagtccgga tcaactggca ccatctctcc cgtagtctca tctaattttt
120cttccggatg aggttccaga tataccgcaa cacctttatt atggtttccc tgagggaata
180atagaatgtc ccattcgaaa tcaccaattc taaacctggg cgaattgtat ttcgggtttg
240ttaactcgtt ccagtcagga atgttccacg tgaagctatc ttccagcaaa gtctccactt
300cttcatcaaa ttgtgggaga atactcccaa tgctcttatc tatgggactt ccgggaaaca
360cagtaccgat acttcccaat tcgtcttcag agctcattgt ttgtttgaag agactaatca
420aagaatcgtt ttctcaaaaa aattaatatc ttaactgata gtttgatcaa aggggcaaaa
480cgtaggggca aacaaacgga aaaatcgttt ctcaaatttt ctgatgccaa gaactctaac
540cagtcttatc taaaaattgc cttatgatcc gtctctccgg ttacagcctg tgtaactgat
600taatcctgcc tttctaatca ccattctaat gttttaatta agggattttg tcttcattaa
660cggctttcgc tcataaaaat gttatgacgt tttgcccgca ggcgggaaac catccacttc
720acgagactga tctcctctgc cggaacaccg ggcatctcca acttataagt tggagaaata
780agagaatttc agattgagag aatgaaaaaa aaaaaaaaaa aaaaggcaga ggagagcata
840gaaatggggt tcactttttg gtaaagctat agcatgccta tcacatataa atagagtgcc
900agtagcgact tttttcacac tcgaaatact cttactactg ctctcttgtt gtttttatca
960cttcttgttt cttcttggta aatagaatat caagctacaa aaagcataca atcaactatc
1020aactattaac tatatcgtaa tacacaggcc ggccaaaatg aaggccaaat caaggcggga
1080agggacaacc aggacgtaaa gggtagcctc cccataacat aaactcaata aaatatatag
1140tcttcaactt gaaaaaggaa caagctcatg caaagaggtg gtacccgcac gccgaaatgc
1200atgcaagtaa cctattcaaa gtaatatctc atacatgttt catgagggta acaacatgcg
1260actgggtgag catatgttcc gctgatgtga tgtgcaagat aaacaagcaa gacagaaact
1320aacttcttct tcatgtaata aacacacccc gcgtttattt acctatcttt aaacttcaac
1380accttatatc ataactaata tttcttgaga taagcacact gcacccatac cttccttaaa
1440aacgtagctt ccagtttttg gtggttctgg cttccttccc gattccgccc gctaaacgca
1500taattttgtt gcctggtggc atttgcaaaa tgcataacct atgcatttaa aagattatgt
1560atgctcttct gacttttcgt gtgatgaggc tcgtggaaaa aatgaataat ttatgaattt
1620gagaacaatt ttgtgttgtt acggtatttt actatggaat aatcaatcaa ttgaggattt
1680tatgcaaata tcgtttgaat atttttccga ccctttgagt acttttcttc ataattgcat
1740aatattgtcc gctgcccgtt tttctgttag acggtgtctt gatctacttg ctatcgttca
1800acaccacctt attttctaac tatttttttt ttagctcatt tgaatcagct tatggtgatg
1860gcacattttt gcataaacct agctgtcctc gttgaacata ggaaaaaaaa atatataaac
1920aaggctcttt cactctcctt ggaatcagat ttgggtttgt tccctttatt ttcatatttc
1980ttgtcatatt cttttctcaa ttattatctt ctactcataa cctcacgcaa aataacacag
2040tcaaatcaat caaaatggac ttcaacttga ccagagaaca agaattggtc agacaaatgg
2100ttagagaatt tgctgaaaac gaagttaagc caattgctgc tgaaatcgat gaaactgaaa
2160gattcccaat ggaaaacgtc aagaagatgg gtcaatacgg tatgatgggt attccattct
2220ctaaggaata cggtggtgct ggtggtgacg tcttgtctta catcattgct gtcgaagaat
2280tgtccaaggt ttgtggtacc actggtgtca tcttatctgc tcacacttct ctatgtgcct
2340ccttgatcaa cgaacacggt actgaagaac aaaagcaaaa gtacttggtt ccattggcca
2400agggtgaaaa gattggtgcc tacggtttga ctgaaccaaa cgctggtact gactctggtg
2460ctcaacaaac tgttgccgtt ttggaaggtg accactacgt catcaacggt tccaagatct
2520tcatcaccaa cggtggtgtt gctgacacct ttgtcatctt cgctatgacc gatcgtacca
2580agggtaccaa gggtatctct gctttcatta ttgaaaaggg tttcaagggt ttctccatcg
2640gtaaggtcga acaaaagttg ggtatcagag cttcctctac cactgaattg gttttcgaag
2700acatgattgt tccagttgaa aacatgatcg gtaaggaagg taagggtttc ccaattgcca
2760tgaagacttt agatggtggt agaattggta ttgctgctca agctttgggt attgctgaag
2820gtgccttcaa cgaagctaga gcttacatga aggaaagaaa gcaattcggt agatctttgg
2880acaaattcca aggtttggct tggatgatgg ctgacatgga cgttgccatc gaatctgctc
2940gttacttggt ctacaaggct gcttacttga agcaagctgg tttgccatac accgtcgatg
3000ctgccagagc taagttgcac gctgccaacg ttgccatgga tgtcaccacc aaggctgtcc
3060aattattcgg tggttacggt tacaccaagg actacccagt tgaaagaatg atgagagatg
3120ctaagatcac tgaaatctac gaaggtactt ctgaagttca aaagttggtt atctccggta
3180agatcttcag ataggcccgg gcataaagca atcttgatga ggataatgat ttttttttga
3240atatacataa atactaccgt ttttctgcta gattttgtga agacgtaaat aagtacatat
3300tactttttaa gccaagacaa gattaagcat taactttacc cttttctctt ctaagtttca
3360atactagtta tcactgttta aaagttatgg cgagaacgtc ggcggttaaa atatattacc
3420ctgaacgtgg tgaattgaag ttctaggatg gtttaaagat ttttcctttt tgggaaataa
3480gtaaacaata tattgctgcc tttgcaaaac gcacataccc acaatatgtg actattggca
3540aagaacgcat tatcctttga agaggtggat actgatacta agagagtctc tattccggct
3600ccacttttag tccagagatt acttgtcttc ttacgtatca gaacaagaaa gcatttccaa
3660agtaattgca tttgcccttg agcagtatat atatactaag aagtttaaac atttaaacgg
3720ccggcctaga aagcatacta tactattcga cacttccttt caatcctgga attaacagtc
3780acttttaaaa aagacatcta ccgtgaaggt gccgtagagt atcgcgttac catatcgcca
3840aaaactgata tacgccgcgg aaaccaggca aacaattgaa aagaaaaatt ttgaggaact
3900ctctgcatcg aagccgtcta gagttaccac tagtcagatg ccgcgggcac ttgagcacct
3960catgcacagc aataacacaa cacaatggtt agtagcaacc tgaattcggt cattgatgca
4020tgcatgtgcc gtgaagcggg acaaccagaa aagtcgtcta taaatgccgg cacgtgcgat
4080catcgtggcg gggttttaag agtgcatatc acaaattgtc gcattaccgc ggaaccgcca
4140gatattcatt acttgacgca aaagcgtttg aaataatgac gaaaaagaag gaagaaaaaa
4200aaagaaaaat accgcttcta ggcgggttat ctactgatcc gagcttccac taggatagca
4260cccaaacacc tgcatatttg gacgaccttt acttacacca ccaaaaacca ctttcgcctc
4320tcccgcccct gataacgtcc actaattgag cgattacctg agcggtcctc ttttgtttgc
4380agcatgagac ttgcatactg caaatcgtaa gtagcaacgt ctcaaggtca aaactgtatg
4440gaaaccttgt cacctcactt aattctagct agcctaccct gcaagtcaag aggtctccgt
4500gattcctagc cacctcaagg tatgcctctc cccggaaact gtggcctttt ctggcacaca
4560tgatctccac gatttcaaca tataaatagc ttttgataat ggcaatatta atcaaattta
4620ttttacttct ttcttgtaac atctctcttg taatccctta ttccttctag ctatttttca
4680taaaaaacca agcaactgct tatcaacaca caaacactaa atcaaaatgg tcgatttcga
4740atactctatc ccaaccagaa tcttcttcgg taaggacaag atcaacgttt tgggtagaga
4800attgaagaaa tacggttcca aggttttgat tgtctacggt ggtggttcca tcaagagaaa
4860cggtatctac gacaaggctg tctccatttt ggaaaagaac tctatcaaat tctacgaatt
4920ggctggtgtt gaaccaaacc caagagttac caccgtcgaa aagggtgtca agatctgtcg
4980tgaaaacggt gttgaagttg ttttggccat cggtggtggt tctgccattg actgtgccaa
5040ggtcattgct gctgcctgtg aatacgatgg taacccatgg gacattgtct tggatggttc
5100taagatcaag cgtgtcttac caattgcttc catcttgact atcgctgcta ctggttctga
5160aatggacacc tgggctgtta tcaacaacat ggacactaac gaaaagttga ttgctgctca
5220cccagatatg gccccaaagt tctctatttt ggacccaacc tacacttaca ctgttccaac
5280caaccaaact gctgctggta ctgctgatat catgtctcac atctttgaag tttacttctc
5340caacaccaag accgcttact tgcaagacag aatggctgaa gctctattaa gaacctgtat
5400caagtacggt ggtattgctt tggaaaagcc agatgactac gaagccagag ctaacttgat
5460gtgggcttcc tctttggcta tcaacggttt attgacttac ggtaaggaca ccaactggtc
5520cgttcatttg atggaacacg aattgtctgc ttactacgat atcactcacg gtgtcggttt
5580ggccatcttg actccaaact ggatggaata cattttgaac aacgacactg tctacaagtt
5640cgtcgaatac ggtgttaacg tctggggtat tgacaaggaa aagaaccact acgacattgc
5700tcaccaagcc atccaaaaga ccagagacta tttcgtcaac gttttgggtt taccatccag
5760attaagagat gttggtattg aagaagaaaa attggatatc atggctaagg aatctgtcaa
5820attgactggt ggtaccattg gtaacttgag acctgttaac gcttctgaag ttttgcaaat
5880cttcaagaaa tctgtttagg cccgggctcc tgttgaagta gcatttaatc ataatttttg
5940tcacatttta atcaacttga tttttctggt ttaatttttc taattttaat tttaattttt
6000ttatcaatgg gaactgatac actaaaaaga attaggagcc aacaagaata agccgcttat
6060ttcctactag agtttgctta aaatttcatc tcgaattgtc attctaatat tttatccaca
6120cacacacctt aaaattttta gattaaatgg catcaactct tagcttcaca cacacacaca
6180caccgaagct ggttgtttta tttgatttga tataattggt ttctctggat ggtacttttt
6240ctttcttggt tatttcctat tttaaaatat gaaacgcaca caagtcataa ttattctaat
6300agagcacaat tcacaacacg cacatttcaa ctttaatatt tttttagaaa cactttattt
6360agtctaattc ttaattttta atatatataa tgcacacaca cgtttaaatg ggcccgcggc
6420ccgtttaaac ggccggccct tcccttttac agtgcttcgg aaaagcacag cgttgtccaa
6480gggaacaatt tttcttcaag ttaatgcata agaaatatct ttttttatgt ttagctaagt
6540aaaagcagct tggagtaaaa aaaaaaatga gtaaatttct cgatggatta gtttctcaca
6600ggtaacatag caaaaaccaa gaaaagcccg cttctgaaaa ctacagttga cttgtatgct
6660aaagggccag actaatggga ggagaaaaag aaacgaatgt atatgctcat ttacactcta
6720tatcaccata tggaggataa gttgggctga gcttctgatc caatttattc tatccattag
6780ttgctgatat gtcccaccag ccaacacttg atagtatcta ctcgccattc acttccagca
6840gcgccagtag ggttgttgag cttagtaaaa atgtgcgcac cacaagccta catgactcca
6900cgtcacatga aaccacaccg tggggccttg ttgcgctagg aataggatat gcgacgaaga
6960cgcttctgct tagtaaccac accacatttt cagggggtcg atctgcttgc ttcctttact
7020gtcacgagcg gcccataatc gcgctttttt tttaaaaggc gcgagacagc aaacaggaag
7080ctcgggtttc aaccttcgga gtggtcgcag atctggagac tggatcttta caatacagta
7140aggcaagcca ccatctgctt cttaggtgca tgcgacggta tccacgtgca gaacaacata
7200gtctgaagaa gggggggagg agcatgttca ttctctgtag cagtaagagc ttggtgataa
7260tgaccaaaac tggagtctcg aaatcatata aatagacaat atattttcac acaatgagat
7320ttgtagtaca gttctattct ctctcttgca taaataagaa attcatcaag aacttggttt
7380gatatttcac caacacacac aaaaaacagt acttcactaa atttacacac aaaacaaaat
7440gaaggttacc aaccaaaagg aattgaagca aaagttgaac gaattgagag aagctcaaaa
7500gaagttcgct acctacactc aagaacaagt tgacaagatc ttcaagcaat gtgccattgc
7560tgctgccaag gaacgtatca acttggccaa gttggctgtc gaagaaaccg gtattggttt
7620ggttgaagac aagatcatca agaaccactt cgctgctgaa tacatctaca acaagtacaa
7680gaacgaaaag acctgtggta tcatcgacca cgatgactct ttgggtatca ccaaggttgc
7740tgaaccaatc ggtattgtcg ccgccattgt cccaaccact aacccaactt ccactgccat
7800cttcaaatct ttgatctcct tgaagaccag aaacgctatc ttcttctccc cacacccaag
7860agccaagaag tccaccattg ctgctgccaa attaatcttg gatgctgctg ttaaggctgg
7920tgccccaaag aacattattg gttggatcga tgaaccttcc attgaattgt ctcaagactt
7980gatgtctgaa gctgatatca tcttggctac cggtggtcca tccatggtca aggccgctta
8040ctcttctggt aagccagcta ttggtgttgg tgctggtaac actccagcta tcatcgatga
8100atctgctgac attgacatgg ctgtctcctc cattatcttg tccaagactt atgacaacgg
8160tgtcatctgt gcctctgaac aatccatctt ggttatgaac tctatctacg aaaaggtcaa
8220ggaagaattt gttaagagag gttcctacat cttaaaccaa aatgaaattg ccaagatcaa
8280ggaaaccatg ttcaagaacg gtgccatcaa cgctgacatt gtcggtaaat ctgcttacat
8340cattgccaag atggctggta ttgaagttcc acaaaccact aagattttga tcggtgaagt
8400tcaatctgtc gaaaagtctg aattattctc tcacgaaaag ttgtctccag tcttggctat
8460gtacaaggtc aaggatttcg acgaagcttt gaagaaggct caaagattaa ttgaattagg
8520tggttctggt cacacctctt ctctatacat tgactctcaa aacaacaagg acaaggtcaa
8580ggaattcggt ctagctatga agacttccag aactttcatc aacatgccat cttctcaagg
8640tgcttctggt gatttgtaca actttgccat tgctccatct ttcactttag gttgtggtac
8700ctggggtggt aactctgttt ctcaaaacgt tgaaccaaag catttgctaa acatcaagtc
8760cgttgctgaa agaagagaaa acatgttgtg gttcaaggtt ccacaaaaga tctacttcaa
8820atacggttgt ttgagatttg ctttgaagga attgaaagat atgaacaaga agcgtgcttt
8880catcgttact gacaaggatt tgttcaaatt gggttacgtt aacaagatca ctaaggtttt
8940ggatgaaatt gatatcaagt actccatctt cactgatatc aaatctgacc caaccattga
9000ctccgtcaag aagggtgcta aggaaatgtt gaacttcgaa ccagatacca ttatctccat
9060tggtggtggt tctccaatgg atgctgccaa ggttatgcat ttgttgtacg aatacccaga
9120agctgaaatc gaaaacttgg ccatcaactt catggacatc agaaagagaa tctgtaactt
9180cccaaagttg ggtaccaagg ccatttctgt tgccattcca accaccgctg gtaccggttc
9240tgaagctact ccatttgctg tcatcaccaa cgacgaaacc ggtatgaagt acccattgac
9300ctcttacgaa ttgactccaa acatggccat cattgacact gaattgatgt tgaacatgcc
9360aagaaagttg actgctgcta ccggtattga cgctttagtc cacgctatcg aagcttacgt
9420ctccgttatg gccactgact acactgacga attggctttg agagctatca agatgatctt
9480caagtacttg ccaagagctt acaagaacgg tactaacgat atcgaagctc gtgaaaagat
9540ggctcacgct tccaacattg ctggtatggc tttcgctaac gctttcttgg gtgtttgtca
9600ctccatggcc cacaagttgg gtgctatgca ccacgttcct cacggtattg cttgtgctgt
9660tttgattgaa gaagtcatca agtacaacgc tactgactgt ccaaccaagc aaactgcttt
9720cccacaatac aagtctccaa acgccaagag aaagtacgct gaaattgctg aatacttgaa
9780cttgaaaggt acttctgaca ctgaaaaggt cactgcttta atcgaagcta tctccaagtt
9840gaagattgac ttatctattc ctcaaaacat ctctgctgct ggtattaaca agaaggactt
9900ctacaacact ttagacaaga tgtccgaatt ggctttcgat gaccaatgta ccaccgctaa
9960cccaagatac ccattgatct ctgaattgaa ggatatctac atcaagtcct tttaagcccg
10020ggcgcggatc tcttatgtct ttacgattta tagttttcat tatcaagtat gcctatatta
10080gtatatagca tctttagatg acagtgttcg aagtttcacg aataaaagat aatattctac
10140tttttgctcc caccgcgttt gctagcacga gtgaacacca tccctcgcct gtgagttgta
10200cccattcctc taaactgtag acatggtagc ttcagcagtg ttcgttatgt acggcatcct
10260ccaacaaaca gtcggttata gtttgtcctg ctcctctgaa tcgtctccct cgatatttct
10320cattttcctt cgcatgccag cattgaaatg atcgaagttc aatgatgaaa cggtaattct
10380tctgtcattt actcatctca tctcatcaag ttatataatt ctatacggat gtaatttttc
10440acttttcgtc ttgacgtcca ccctataatt tcaattattg aaccctcaca aatgatgcac
10500tgcaatgtac acaccctcat atagtttaaa catttaaatg ggccgctcta gaggatcccc
10560gggtaccgag ctcgggccca gcgctactag ttccggtaat ttgaaaacaa acccggtctc
10620gaagcggaga tccggcgata attaccgcag aaataaaccc atacacgaga cgtagaacca
10680gccgcacatg gccggagaaa ctcctgcgag aatttcgtaa actcgcgcgc attgcatctg
10740tatttcctaa tgcggcactt ccaggcctcg agacctctga catgcttttg acaggaatag
10800acattttcag aatgttatcc atatgccttt cgggtttttt tccttccttt tccatcatga
10860aaaatctctc gagaccgttt atccattgct tttttgttgt ctttttccct cgttcacaga
10920aagtctgaag aagctatagt agaactatga gctttttttg tttctgtttt cctttttttt
10980ttttttacct ctgtggaaat tgttactctc acactcttta gttcgtttgt ttgttttgtt
11040tattccaatt atgaccggtg acgaaacgtg gtcgatggtg ggtaccgctt atgctcccct
11100ccattagttt cgattatata aaaaggccaa atattgtatt attttcaaat gtcctatcat
11160tatcgtctaa catctaattt ctcttaaatt ttttctcttt ctttcctata acaccaatag
11220tgaaaatctt tttttcttct atatctacaa aaactttttt tttctatcaa cctcgttgat
11280aaattttttc tttaacaatc gttaataatt aattaattgg aaaataacca ttttttctct
11340cttttataca cacattcaaa agaaagaaaa aaaatatacc ccagctagtt aaagaaaatc
11400attgaaaaga ataagaagat aagaaagatt taattatcaa acaatatcaa tatgcctcaa
11460tcctgggaag aactggccgc tgataagcgc gcccgcctcg caaaaaccat ccctgatgaa
11520tggaaagtcc agacgctgcc tgcggaagac agcgttattg atttcccaaa gaaatcgggg
11580atcctttcag aggccgaact gaagatcaca gaggcctccg ctgcagatct tgtgtccaag
11640ctggcggccg gagagttgac ctcggtggaa gttacgctag cattctgtaa acgggcagca
11700atcgcccagc agttaacaaa ctgcgcccac gagttcttcc ctgacgccgc tctcgcgcag
11760gcaagggaac tcgatgaata ctacgcaaag cacaagagac ccgttggtcc actccatggc
11820ctccccatct ctctcaaaga ccagcttcga gtcaagggct acgaaacatc aatgggctac
11880atctcatggc taaacaagta cgacgaaggg gactcggttc tgacaaccat gctccgcaaa
11940gccggtgccg tcttctacgt caagacctct gtcccgcaga ccctgatggt ctgcgagaca
12000gtcaacaaca tcatcgggcg caccgtcaac ccacgcaaca agaactggtc gtgcggcggc
12060agttctggtg gtgagggtgc gatcgttggg attcgtggtg gcgtcatcgg tgtaggaacg
12120gatatcggtg gctcgattcg agtgccggcc gcgttcaact tcctgtacgg tctaaggccg
12180agtcatgggc ggctgccgta tgcaaagatg gcgaacagca tggagggtca ggagacggtg
12240cacagcgttg tcgggccgat tacgcactct gttgaggacc tccgcctctt caccaaatcc
12300gtcctcggtc aggagccatg gaaatacgac tccaaggtca tccccatgcc ctggcgccag
12360tccgagtcgg acattattgc ctccaagatc aagaacggcg ggctcaatat cggctactac
12420aacttcgacg gcaatgtcct tccacaccct cctatcctgc gcggcgtgga aaccaccgtc
12480gccgcactcg ccaaagccgg tcacaccgtg accccgtgga cgccatacaa gcacgatttc
12540ggccacgatc tcatctccca tatctacgcg gctgacggca gcgccgacgt aatgcgcgat
12600atcagtgcat ccggcgagcc ggcgattcca aatatcaaag acctactgaa cccgaacatc
12660aaagctgtta acatgaacga gctctgggac acgcatctcc agaagtggaa ttaccagatg
12720gagtaccttg agaaatggcg ggaggctgaa gaaaaggccg ggaaggaact ggacgccatc
12780atcgcgccga ttacgcctac cgctgcggta cggcatgacc agttccggta ctatgggtat
12840gcctctgtga tcaacctgct ggatttcacg agcgtggttg ttccggttac ctttgcggat
12900aagaacatcg ataagaagaa tgagagtttc aaggcggtta gtgagcttga tgccctcgtg
12960caggaagagt atgatccgga ggcgtaccat ggggcaccgg ttgcagtgca ggttatcgga
13020cggagactca gtgaagagag gacgttggcg attgcagagg aagtggggaa gttgctggga
13080aatgtggtga ctccataggt cgagaattta tacttagata agtatgtact tacaggtata
13140tttctatgag atactgatgt atacatgcat gataatattt aaacggttat tagtgccgat
13200tgtcttgtgc gataatgacg ttcctatcaa agcaatacac ttaccaccta ttacatgggc
13260caagaaaata ttttcgaact tgtttagaat attagcacag agtatatgat gatatccgtt
13320agattatgca tgattcattc ctacaacttt ttcgtagcat aaggattaat tacttggatg
13380ccaataaaaa aaaaaaacat cgagaaaatt tcagcatgct cagaaacaat tgcagtgtat
13440caaagtaaaa aaaagatttt cgctacatgt tccttttgaa gaaagaaaat catggaacat
13500tagatttaca aaaatttaac caccgctgat taacgattag accgttaagc gcacaacagg
13560ttattagtac agagaaagca ttctgtggtg ttgccccgga ctttcttttg cgacataggt
13620aaatcgaata ccatcatact atcttttcca atgactccct aaagaaagac tcttcttcga
13680tgttgtatac gttggagcat agggcaagaa ttgtggcttg agatgaattc actggccgtc
13740gttttacaac gtcgtgactg ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca
13800catccccctt tcgccagctg gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa
13860cagttgcgca gcctgaatgg cgaatggcgc ctgatgcggt attttctcct tacgcatctg
13920tgcggtattt cacaccgcat atggtgcact ctcagtacaa tctgctctga tgccgcatag
13980ttaagccagc cccgacaccc gccaacaccc gctgacgcgc cctgacgggc ttgtctgctc
14040ccggcatccg cttacagaca agctgtgacc gtctccggga gctgcatgtg tcagaggttt
14100tcaccgtcat caccgaaacg cgcgagacga aagggcctcg tgatacgcct atttttatag
14160gttaatgtca tgataataat ggtttcttag acgtcaggtg gcacttttcg gggaaatgtg
14220cgcggaaccc ctatttgttt atttttctaa atacattcaa atatgtatcc gctcatgaga
14280caataaccct gataaatgct tcaataatat tgaaaaagga agagtatgag tattcaacat
14340ttccgtgtcg cccttattcc cttttttgcg gcattttgcc ttcctgtttt tgctcaccca
14400gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg gtgcacgagt gggttacatc
14460gaactggatc tcaacagcgg taagatcctt gagagttttc gccccgaaga acgttttcca
14520atgatgagca cttttaaagt tctgctatgt ggcgcggtat tatcccgtat tgacgccggg
14580caagagcaac tcggtcgccg catacactat tctcagaatg acttggttga gtactcacca
14640gtcacagaaa agcatcttac ggatggcatg acagtaagag aattatgcag tgctgccata
14700accatgagtg ataacactgc ggccaactta cttctgacaa cgatcggagg accgaaggag
14760ctaaccgctt ttttgcacaa catgggggat catgtaactc gccttgatcg ttgggaaccg
14820gagctgaatg aagccatacc aaacgacgag cgtgacacca cgatgcctgt agcaatggca
14880acaacgttgc gcaaactatt aactggcgaa ctacttactc tagcttcccg gcaacaatta
14940atagactgga tggaggcgga taaagttgca ggaccacttc tgcgctcggc ccttccggct
15000ggctggttta ttgctgataa atctggagcc ggtgagcgtg ggtctcgcgg tatcattgca
15060gcactggggc cagatggtaa gccctcccgt atcgtagtta tctacacgac ggggagtcag
15120gcaactatgg atgaacgaaa tagacagatc gctgagatag gtgcctcact gattaagcat
15180tggtaactgt cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt
15240taatttaaaa ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa
15300cgtgagtttt cgttccactg agcgtcagac cccgtagaaa agatcaaagg atcttcttga
15360gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa aaaaaccacc gctaccagcg
15420gtggtttgtt tgccggatca agagctacca actctttttc cgaaggtaac tggcttcagc
15480agagcgcaga taccaaatac tgtccttcta gtgtagccgt agttaggcca ccacttcaag
15540aactctgtag caccgcctac atacctcgct ctgctaatcc tgttaccagt ggctgctgcc
15600agtggcgata agtcgtgtct taccgggttg gactcaagac gatagttacc ggataaggcg
15660cagcggtcgg gctgaacggg gggttcgtgc acacagccca gcttggagcg aacgacctac
15720accgaactga gatacctaca gcgtgagcta tgagaaagcg ccacgcttcc cgaagggaga
15780aaggcggaca ggtatccggt aagcggcagg gtcggaacag gagagcgcac gagggagctt
15840ccagggggaa acgcctggta tctttatagt cctgtcgggt ttcgccacct ctgacttgag
15900cgtcgatttt tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc cagcaacgcg
15960gcctttttac ggttcctggc cttttgctgg ccttttgctc acatgttctt tcctgcgtta
16020tcccctgatt ctgtggataa ccgtattacc gcctttgagt gagctgatac cgctcgccgc
16080agccgaacga ccgagcgcag cgagtcagtg agcgaggaag cggaagagcg cccaatacgc
16140aaaccgcctc tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc
16200gactggaaag cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca
16260ccccaggctt tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa
16320caatttcaca caggaaacag ctatgaccat gattacgcc
16359
SEQ ID NO 43 (pBOL113): length 8684, DNA, Artificial
tcgcgcgttt cggtgatgac ggtgaaaacc
tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat gccgggagca
gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg
cggcatcaga gcagattgta ctgagagtgc 180accataccac agcttttcaa ttcaattcat
catttttttt ttattctttt ttttgatttc 240ggtttctttg aaattttttt gattcggtaa
tctccgaaca gaaggaagaa cgaaggaagg 300agcacagact tagattggta tatatacgca
tatgtagtgt tgaagaaaca tgaaattgcc 360cagtattctt aacccaactg cacagaacaa
aaacctgcag gaaacgaaga taaatcatgt 420cgaaagctac atataaggaa cgtgctgcta
ctcatcctag tcctgttgct gccaagctat 480ttaatatcat gcacgaaaag caaacaaact
tgtgtgcttc attggatgtt cgtaccacca 540aggaattact ggagttagtt gaagcattag
gtcccaaaat ttgtttacta aaaacacatg 600tggatatctt gactgatttt tccatggagg
gcacagttaa gccgctaaag gcattatccg 660ccaagtacaa ttttttactc ttcgaagaca
gaaaatttgc tgacattggt aatacagtca 720aattgcagta ctctgcgggt gtatacagaa
tagcagaatg ggcagacatt acgaatgcac 780acggtgtggt gggcccaggt attgttagcg
gtttgaagca ggcggcagaa gaagtaacaa 840aggaacctag aggccttttg atgttagcag
aattgtcatg caagggctcc ctatctactg 900gagaatatac taagggtact gttgacattg
cgaagagcga caaagatttt gttatcggct 960ttattgctca aagagacatg ggtggaagag
atgaaggtta cgattggttg attatgacac 1020ccggtgtggg tttagatgac aagggagacg
cattgggtca acagtataga accgtggatg 1080atgtggtctc tacaggatct gacattatta
ttgttggaag aggactattt gcaaagggaa 1140gggatgctaa ggtagagggt gaacgttaca
gaaaagcagg ctgggaagca tatttgagaa 1200gatgcggcca gcaaaactaa aaaactgtat
tataagtaaa tgcatgtata ctaaactcac 1260aaattagagc ttcaatttaa ttatatcagt
tattacccta tgcggtgtga aataccgcac 1320agatgcgtaa ggagaaaata ccgcatcagg
aaattgtaaa cgttaatatt ttgttaaaat 1380tcgcgttaaa tttttgttaa atcagctcat
tttttaacca ataggccgaa atcggcaaaa 1440tcccttataa atcaaaagaa tagaccgaga
tagggttgag tgttgttcca gtttggaaca 1500agagtccact attaaagaac gtggactcca
acgtcaaagg gcgaaaaacc gtctatcagg 1560gcgatggccc actacgtgaa ccatcaccct
aatcaagttt tttggggtcg aggtgccgta 1620aagcactaaa tcggaaccct aaagggagcc
cccgatttag agcttgacgg ggaaagccgg 1680cgaacgtggc gagaaaggaa gggaagaaag
cgaaaggagc gggcgctagg gcgctggcaa 1740gtgtagcggt cacgctgcgc gtaaccacca
cacccgccgc gcttaatgcg ccgctacagg 1800gcgcgtcgcg ccattcgcca ttcaggctgc
gcaactgttg ggaagggcga tcggtgcggg 1860cctcttcgct attacgccag ctggcgaaag
ggggatgtgc tgcaaggcga ttaagttggg 1920taacgccagg gttttcccag tcacgacgtt
gtaaaacgac ggccagtgag cgcgcgtaat 1980acgactcact atagggcgaa ttgggtaccg
ggccccccct cgaggtcgac ggtatcgata 2040agcttgatat cgaattcctg cagcccgggg
gatccactag ttctagagcg gcccatttaa 2100acggccggcc ctagatcaga gggtggtaaa
tgaagtgtaa tagtattcat ttttcttata 2160aatcatccct tccgtgattt atacaaaaga
agaggagaat atgctgaata cttggtatat 2220tactctacat tatactctta tcttgacggg
tattctgagc atcttactca gtttcaagat 2280cttttaatgt ccaaaaacat ttgagccgat
ctaaatactt ctgtgttttc attaatttat 2340aaattgtact cttttaagac atggaaagta
ccaacatcgg ttgaaacagt ttttcattta 2400cttatggttt attggttttt ccagtgaatg
attatttgtc gttacccttt cgtaaaagtt 2460caaacacgtt tttaagtatt gtttagttgc
tctttcgaca tatatgatta tccctgcgcg 2520gctaaagtta aggatgcaaa aaacataaga
caactgaagt taatttacgt caattaagtt 2580ttccagggta atgatgtttt gggcttccac
taattcaata agtatgtcat gaaatacgtt 2640gtgaagagca tccagaaata atgaaaagaa
acaacgaaac tgggtcggcc tgttgtttct 2700tttctttacc acgtgatctg cggcatttac
aggaagtcgc gcgttttgcg cagttgttgc 2760aacgcagcta cggctaacaa agcctagtgg
aactcgactg atgtgttagg gcctaaaact 2820ggtggtgaca gctgaagtga actattcaat
ccaatcatgt catggctgtc acaaagacct 2880tgcggaccgc acgtacgaac acatacgtat
gctaatatgt gttttgatag tacccagtga 2940tcgcagacct gcaatttttt tgtaggtttg
gaagaatata taaaggttgc actcattcaa 3000gatagttttt ttcttgtgtg tctattcatt
ttattattgt ttgtttaaat gttaaaaaaa 3060ccaagaactt agtttcaaat taaattcatc
acacaaacaa acaaaacaaa atgaacattg 3120ttgtttgttt gaagcaagtt ccagacactg
ctgaagtcag aattgaccca gtcaagggta 3180ctttaatcag agaaggtgtt ccatctatca
tcaacccaga cgacaagaac gctttggaag 3240aagctttggt tttgaaggac aactacggtg
ctcacgttac cgtcatttcc atgggtccac 3300ctcaagccaa gaacgctttg gttgaagctt
tggccatggg tgctgatgaa gctgtcttat 3360tgactgacag agctttcggt ggtgctgata
ctttagctac ctctcacacc attgctgctg 3420gtatcaagaa attgaaatac gatatcgtct
ttgccggtcg tcaagccatc gatggtgata 3480ccgctcaagt cggtccagaa attgctgaac
atttgggtat tccacaagtc acctacgttg 3540aaaaggttga agttgacggt gacactttga
agatcagaaa ggcttgggaa gacggttacg 3600aagttgttga agtcaagact ccagttctat
tgactgccat caaggaattg aacgttccaa 3660gatacatgtc cgttgaaaag atcttcggtg
ctttcgacaa ggaagtcaag atgtggactg 3720ctgatgatat cgatgtcgac aaggccaact
tgggtttgaa aggttctcca accaaggtca 3780agaaatcttc taccaaggaa gtcaagggtc
aaggtgaagt cattgacaaa ccagtcaagg 3840aagctgccgc ttacgttgtt tccaagttga
aggaagaaca ctacatctaa agcccgggcg 3900gagattgata agacttttct agttgcatat
cttttatatt taaatcttat ctattagtta 3960attttttgta atttatcctt atatatagtc
tggttattct aaaatatcat ttcagtatct 4020aaaaattccc ctcttttttc agttatatct
taacaggcga cagtccaaat gttgatttat 4080cccagtccga ttcatcaggg ttgtgaagca
ttttgtcaat ggtcgaaatc acatcagtaa 4140tagtgcctct tacttgcctc atagaatttc
tttctcttaa cgtcaccgtt tggtctttta 4200tagtttcgaa atctatggtg ataccaaatg
gtgttcccaa ttcatcgtta cgggcgtatt 4260ttttaccaat tgaagtattg gaatcgtcaa
ttttaaagta tatctctctt ttacgtaaag 4320cctgcgagat cctcttaagt atagcgggga
agccatcgtt attcgatatt gtcgtaacaa 4380atactttgat cggcgctatg tttaaatgtt
taaacatgga cagatatgcg atgaaaacgc 4440taagtgatac tccaaatggt gaaaggtacg
atgcttggaa acaatacttg gaaatcaccg 4500gaaacaccat atgcggcgaa aagccaatta
gtgtgatact aagtgcttta tcgaaaatcc 4560gtgatgccgg tccttcaggc atcaaatttc
agtggcctaa ttattcacag agttctcatg 4620tgacaagtat tgatgatagt agtgtcagtt
atgcttcagg ttatgttact ataggataat 4680gatcacggct aaaacggtcg aatgtaagca
tatatctttc gattgtataa ttgttcccaa 4740atactacagc atctcaagga aaaaaaaaca
aaaacttcca aaaaaatcga atccctgagg 4800aatctttaat acattttcaa tctatttaag
ttttataaac gtgtatatga gatgtcatga 4860gcatgaatta ttaataataa aaactaaatc
attaaagtaa cttaaggagt taaagcccgg 4920gctttaattg ttagcagcct tgacttgagc
aatcaattct ggaacaacct tgttgacatc 4980accgacaatg gccaaatcag caaccttcat
gattggagct tcgacatctt tgttgatggc 5040aatgatgtag tcagagtctt gcataccagc
caagtgttgg atggcaccag agataccaca 5100agcaatgtac aaagttggtc tgacggtctt
accggtttga ccgacttgca agtccttgtc 5160aacccattcc ttttcaatgg cagctctgga
agcagcaatg gtaccaccca acaaagaagc 5220taattcttcc aatttttcga agttttcctt
ggaaccaaca ccacgaccac cagcaaccaa 5280aaccttggct tcaccgatat cagcaatgtc
cttggccaat ttgacaacct tggaaacctt 5340ggttctgata tcagaagcag tcaatttgat
ggcaaccttt tcgatcttgt catcagaaac 5400gttagcatcg ttaactggca atttttcaaa
gacacctggt ctgacggtgg ccatttgagg 5460tctgtggtca gaacagacaa tggtagcaat
caagttacca ccgaaagctg gtctggtagc 5520caacaagtca cggttttcga catcgatatc
caaagaggta cagtcagcag tcaaaccagt 5580agacaatctg gcagcaattc ttggacccaa
gtctctaccg atgaaagtag caccgatgaa 5640taagatttct ggctttcttt cgttgaccaa
gtcacagata accttggcgt aaccgtcagt 5700ggagaaatga gctaataatt cgttgtcagc
agccaaaacc ttgtcagcac cgtgggacaa 5760caagtccttg gacatctttt cagtgttgtg
acccaataag acagcagtca attcaacacc 5820caatttttca gccatttcct tacccttacc
tagcaattcc aaagaaacct tttgtaattc 5880accatctctt tgttcagcga aaacccagac
acccttgtag tcagccttgt tcatgtttag 5940ttaattatag ttcgttgacc gtatattcta
aaaacaagta ctccttaaaa aaaaaccttg 6000aagggaataa acaagtagaa tagatagaga
gaaaaataga aaatgcaaga gaatttatat 6060attagaaaga gagaaagaaa aatggaaaaa
aaaaaatagg aaaagccaga aatagcacta 6120gaaggagcga caccagaaaa gaaggtgatg
gaaccaattt agctatatat agttaactac 6180cggctcgatc atctctgcct ccagcatagt
cgaagaagaa tttttttttt cttgaggctt 6240ctgtcagcaa ctcgtatttt ttctttcttt
tttggtgagc ctaaaaagtt cccacgttct 6300cttgtacgac gccgtcacaa acaaccttat
gggtaatttg tcgcggtctg ggtgtataaa 6360tgtgtgggtg caggccggcc gtttaaacgg
gccgccaccg cggtggagct ccagcttttg 6420ttccctttag tgagggttaa ttgcgcgctt
ggcgtaatca tggtcatagc tgtttcctgt 6480gtgaaattgt tatccgctca caattccaca
caacatagga gccggaagca taaagtgtaa 6540agcctggggt gcctaatgag tgaggtaact
cacattaatt gcgttgcgct cactgcccgc 6600tttccagtcg ggaaacctgt cgtgccagct
gcattaatga atcggccaac gcgcggggag 6660aggcggtttg cgtattgggc gctcttccgc
ttcctcgctc actgactcgc tgcgctcggt 6720cgttcggctg cggcgagcgg tatcagctca
ctcaaaggcg gtaatacggt tatccacaga 6780atcaggggat aacgcaggaa agaacatgtg
agcaaaaggc cagcaaaagg ccaggaaccg 6840taaaaaggcc gcgttgctgg cgtttttcca
taggctccgc ccccctgacg agcatcacaa 6900aaatcgacgc tcaagtcaga ggtggcgaaa
cccgacagga ctataaagat accaggcgtt 6960tccccctgga agctccctcg tgcgctctcc
tgttccgacc ctgccgctta ccggatacct 7020gtccgccttt ctcccttcgg gaagcgtggc
gctttctcat agctcacgct gtaggtatct 7080cagttcggtg taggtcgttc gctccaagct
gggctgtgtg cacgaacccc ccgttcagcc 7140cgaccgctgc gccttatccg gtaactatcg
tcttgagtcc aacccggtaa gacacgactt 7200atcgccactg gcagcagcca ctggtaacag
gattagcaga gcgaggtatg taggcggtgc 7260tacagagttc ttgaagtggt ggcctaacta
cggctacact agaaggacag tatttggtat 7320ctgcgctctg ctgaagccag ttaccttcgg
aaaaagagtt ggtagctctt gatccggcaa 7380acaaaccacc gctggtagcg gtggtttttt
tgtttgcaag cagcagatta cgcgcagaaa 7440aaaaggatct caagaagatc ctttgatctt
ttctacgggg tctgacgctc agtggaacga 7500aaactcacgt taagggattt tggtcatgag
attatcaaaa aggatcttca cctagatcct 7560tttaaattaa aaatgaagtt ttaaatcaat
ctaaagtata tatgagtaaa cttggtctga 7620cagttaccaa tgcttaatca gtgaggcacc
tatctcagcg atctgtctat ttcgttcatc 7680catagttgcc tgactccccg tcgtgtagat
aactacgata cgggagggct taccatctgg 7740ccccagtgct gcaatgatac cgcgagaccc
acgctcaccg gctccagatt tatcagcaat 7800aaaccagcca gccggaaggg ccgagcgcag
aagtggtcct gcaactttat ccgcctccat 7860ccagtctatt aattgttgcc gggaagctag
agtaagtagt tcgccagtta atagtttgcg 7920caacgttgtt gccattgcta caggcatcgt
ggtgtcacgc tcgtcgtttg gtatggcttc 7980attcagctcc ggttcccaac gatcaaggcg
agttacatga tcccccatgt tgtgcaaaaa 8040agcggttagc tccttcggtc ctccgatcgt
tgtcagaagt aagttggccg cagtgttatc 8100actcatggtt atggcagcac tgcataattc
tcttactgtc atgccatccg taagatgctt 8160ttctgtgact ggtgagtact caaccaagtc
attctgagaa tagtgtatgc ggcgaccgag 8220ttgctcttgc ccggcgtcaa tacgggataa
taccgcgcca catagcagaa ctttaaaagt 8280gctcatcatt ggaaaacgtt cttcggggcg
aaaactctca aggatcttac cgctgttgag 8340atccagttcg atgtaaccca ctcgtgcacc
caactgatct tcagcatctt ttactttcac 8400cagcgtttct gggtgagcaa aaacaggaag
gcaaaatgcc gcaaaaaagg gaataagggc 8460gacacggaaa tgttgaatac tcatactctt
cctttttcaa tattattgaa gcatttatca 8520gggttattgt ctcatgagcg gatacatatt
tgaatgtatt tagaaaaata aacaaatagg 8580ggttccgcgc acatttcccc gaaaagtgcc
acctgacgtc taagaaacca ttattatcat 8640gacattaacc tataaaaata ggcgtatcac
gaggcccttt cgtc 8684
44 12314 DNA Artificial pBOL115
44 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca
60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg
120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc
180accataccac agcttttcaa ttcaattcat catttttttt ttattctttt ttttgatttc
240ggtttctttg aaattttttt gattcggtaa tctccgaaca gaaggaagaa cgaaggaagg
300agcacagact tagattggta tatatacgca tatgtagtgt tgaagaaaca tgaaattgcc
360cagtattctt aacccaactg cacagaacaa aaacctgcag gaaacgaaga taaatcatgt
420cgaaagctac atataaggaa cgtgctgcta ctcatcctag tcctgttgct gccaagctat
480ttaatatcat gcacgaaaag caaacaaact tgtgtgcttc attggatgtt cgtaccacca
540aggaattact ggagttagtt gaagcattag gtcccaaaat ttgtttacta aaaacacatg
600tggatatctt gactgatttt tccatggagg gcacagttaa gccgctaaag gcattatccg
660ccaagtacaa ttttttactc ttcgaagaca gaaaatttgc tgacattggt aatacagtca
720aattgcagta ctctgcgggt gtatacagaa tagcagaatg ggcagacatt acgaatgcac
780acggtgtggt gggcccaggt attgttagcg gtttgaagca ggcggcagaa gaagtaacaa
840aggaacctag aggccttttg atgttagcag aattgtcatg caagggctcc ctatctactg
900gagaatatac taagggtact gttgacattg cgaagagcga caaagatttt gttatcggct
960ttattgctca aagagacatg ggtggaagag atgaaggtta cgattggttg attatgacac
1020ccggtgtggg tttagatgac aagggagacg cattgggtca acagtataga accgtggatg
1080atgtggtctc tacaggatct gacattatta ttgttggaag aggactattt gcaaagggaa
1140gggatgctaa ggtagagggt gaacgttaca gaaaagcagg ctgggaagca tatttgagaa
1200gatgcggcca gcaaaactaa aaaactgtat tataagtaaa tgcatgtata ctaaactcac
1260aaattagagc ttcaatttaa ttatatcagt tattacccta tgcggtgtga aataccgcac
1320agatgcgtaa ggagaaaata ccgcatcagg aaattgtaaa cgttaatatt ttgttaaaat
1380tcgcgttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa atcggcaaaa
1440tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca gtttggaaca
1500agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc gtctatcagg
1560gcgatggccc actacgtgaa ccatcaccct aatcaagttt tttggggtcg aggtgccgta
1620aagcactaaa tcggaaccct aaagggagcc cccgatttag agcttgacgg ggaaagccgg
1680cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg gcgctggcaa
1740gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg ccgctacagg
1800gcgcgtcgcg ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg
1860cctcttcgct attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg
1920taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtgag cgcgcgtaat
1980acgactcact atagggcgaa ttgggtaccg ggccccccct cgaggtcgac ggtatcgata
2040agcttgatat cgaattcctg cagcccgggg gatccactag ttctagagcg gcccatttaa
2100acggccggcc ctagatcaga gggtggtaaa tgaagtgtaa tagtattcat ttttcttata
2160aatcatccct tccgtgattt atacaaaaga agaggagaat atgctgaata cttggtatat
2220tactctacat tatactctta tcttgacggg tattctgagc atcttactca gtttcaagat
2280cttttaatgt ccaaaaacat ttgagccgat ctaaatactt ctgtgttttc attaatttat
2340aaattgtact cttttaagac atggaaagta ccaacatcgg ttgaaacagt ttttcattta
2400cttatggttt attggttttt ccagtgaatg attatttgtc gttacccttt cgtaaaagtt
2460caaacacgtt tttaagtatt gtttagttgc tctttcgaca tatatgatta tccctgcgcg
2520gctaaagtta aggatgcaaa aaacataaga caactgaagt taatttacgt caattaagtt
2580ttccagggta atgatgtttt gggcttccac taattcaata agtatgtcat gaaatacgtt
2640gtgaagagca tccagaaata atgaaaagaa acaacgaaac tgggtcggcc tgttgtttct
2700tttctttacc acgtgatctg cggcatttac aggaagtcgc gcgttttgcg cagttgttgc
2760aacgcagcta cggctaacaa agcctagtgg aactcgactg atgtgttagg gcctaaaact
2820ggtggtgaca gctgaagtga actattcaat ccaatcatgt catggctgtc acaaagacct
2880tgcggaccgc acgtacgaac acatacgtat gctaatatgt gttttgatag tacccagtga
2940tcgcagacct gcaatttttt tgtaggtttg gaagaatata taaaggttgc actcattcaa
3000gatagttttt ttcttgtgtg tctattcatt ttattattgt ttgtttaaat gttaaaaaaa
3060ccaagaactt agtttcaaat taaattcatc acacaaacaa acaaaacaaa atgaacattg
3120ttgtttgttt gaagcaagtt ccagacactg ctgaagtcag aattgaccca gtcaagggta
3180ctttaatcag agaaggtgtt ccatctatca tcaacccaga cgacaagaac gctttggaag
3240aagctttggt tttgaaggac aactacggtg ctcacgttac cgtcatttcc atgggtccac
3300ctcaagccaa gaacgctttg gttgaagctt tggccatggg tgctgatgaa gctgtcttat
3360tgactgacag agctttcggt ggtgctgata ctttagctac ctctcacacc attgctgctg
3420gtatcaagaa attgaaatac gatatcgtct ttgccggtcg tcaagccatc gatggtgata
3480ccgctcaagt cggtccagaa attgctgaac atttgggtat tccacaagtc acctacgttg
3540aaaaggttga agttgacggt gacactttga agatcagaaa ggcttgggaa gacggttacg
3600aagttgttga agtcaagact ccagttctat tgactgccat caaggaattg aacgttccaa
3660gatacatgtc cgttgaaaag atcttcggtg ctttcgacaa ggaagtcaag atgtggactg
3720ctgatgatat cgatgtcgac aaggccaact tgggtttgaa aggttctcca accaaggtca
3780agaaatcttc taccaaggaa gtcaagggtc aaggtgaagt cattgacaaa ccagtcaagg
3840aagctgccgc ttacgttgtt tccaagttga aggaagaaca ctacatctaa agcccgggcg
3900gagattgata agacttttct agttgcatat cttttatatt taaatcttat ctattagtta
3960attttttgta atttatcctt atatatagtc tggttattct aaaatatcat ttcagtatct
4020aaaaattccc ctcttttttc agttatatct taacaggcga cagtccaaat gttgatttat
4080cccagtccga ttcatcaggg ttgtgaagca ttttgtcaat ggtcgaaatc acatcagtaa
4140tagtgcctct tacttgcctc atagaatttc tttctcttaa cgtcaccgtt tggtctttta
4200tagtttcgaa atctatggtg ataccaaatg gtgttcccaa ttcatcgtta cgggcgtatt
4260ttttaccaat tgaagtattg gaatcgtcaa ttttaaagta tatctctctt ttacgtaaag
4320cctgcgagat cctcttaagt atagcgggga agccatcgtt attcgatatt gtcgtaacaa
4380atactttgat cggcgctatg tttaaatgtt taaacatgga cagatatgcg atgaaaacgc
4440taagtgatac tccaaatggt gaaaggtacg atgcttggaa acaatacttg gaaatcaccg
4500gaaacaccat atgcggcgaa aagccaatta gtgtgatact aagtgcttta tcgaaaatcc
4560gtgatgccgg tccttcaggc atcaaatttc agtggcctaa ttattcacag agttctcatg
4620tgacaagtat tgatgatagt agtgtcagtt atgcttcagg ttatgttact ataggataat
4680gatcacggct aaaacggtcg aatgtaagca tatatctttc gattgtataa ttgttcccaa
4740atactacagc atctcaagga aaaaaaaaca aaaacttcca aaaaaatcga atccctgagg
4800aatctttaat acattttcaa tctatttaag ttttataaac gtgtatatga gatgtcatga
4860gcatgaatta ttaataataa aaactaaatc attaaagtaa cttaaggagt taaagcccgg
4920gctttaattg ttagcagcct tgacttgagc aatcaattct ggaacaacct tgttgacatc
4980accgacaatg gccaaatcag caaccttcat gattggagct tcgacatctt tgttgatggc
5040aatgatgtag tcagagtctt gcataccagc caagtgttgg atggcaccag agataccaca
5100agcaatgtac aaagttggtc tgacggtctt accggtttga ccgacttgca agtccttgtc
5160aacccattcc ttttcaatgg cagctctgga agcagcaatg gtaccaccca acaaagaagc
5220taattcttcc aatttttcga agttttcctt ggaaccaaca ccacgaccac cagcaaccaa
5280aaccttggct tcaccgatat cagcaatgtc cttggccaat ttgacaacct tggaaacctt
5340ggttctgata tcagaagcag tcaatttgat ggcaaccttt tcgatcttgt catcagaaac
5400gttagcatcg ttaactggca atttttcaaa gacacctggt ctgacggtgg ccatttgagg
5460tctgtggtca gaacagacaa tggtagcaat caagttacca ccgaaagctg gtctggtagc
5520caacaagtca cggttttcga catcgatatc caaagaggta cagtcagcag tcaaaccagt
5580agacaatctg gcagcaattc ttggacccaa gtctctaccg atgaaagtag caccgatgaa
5640taagatttct ggctttcttt cgttgaccaa gtcacagata accttggcgt aaccgtcagt
5700ggagaaatga gctaataatt cgttgtcagc agccaaaacc ttgtcagcac cgtgggacaa
5760caagtccttg gacatctttt cagtgttgtg acccaataag acagcagtca attcaacacc
5820caatttttca gccatttcct tacccttacc tagcaattcc aaagaaacct tttgtaattc
5880accatctctt tgttcagcga aaacccagac acccttgtag tcagccttgt tcatgtttag
5940ttaattatag ttcgttgacc gtatattcta aaaacaagta ctccttaaaa aaaaaccttg
6000aagggaataa acaagtagaa tagatagaga gaaaaataga aaatgcaaga gaatttatat
6060attagaaaga gagaaagaaa aatggaaaaa aaaaaatagg aaaagccaga aatagcacta
6120gaaggagcga caccagaaaa gaaggtgatg gaaccaattt agctatatat agttaactac
6180cggctcgatc atctctgcct ccagcatagt cgaagaagaa tttttttttt cttgaggctt
6240ctgtcagcaa ctcgtatttt ttctttcttt tttggtgagc ctaaaaagtt cccacgttct
6300cttgtacgac gccgtcacaa acaaccttat gggtaatttg tcgcggtctg ggtgtataaa
6360tgtgtgggtg caggccggcc gtttaaacgg gccgccaccg cggtggagcc tgtgtggaag
6420aacgattaca acaggtgttg tcctctgagg acataaaata cacaccgaga ttcatcaact
6480cattgctgga gttagcatat ctacaattgg gtgaaatggg gagcgatttg caggcatttg
6540ctcggcatgc cggtagaggt gtggtcaata agagcgacct catgctatac ctgagaaagc
6600aacctgacct acaggaaaga gttactcaag aataagaatt ttcgttttaa aacctaagag
6660tcactttaaa atttgtatac acttattttt tttataactt atttaataat aaaaatcata
6720aatcataaga aattcgctcg agtcgactgc agtttactgc ttgtagtcgt aagaagtttg
6780gatgatatcc ttgatttcag aaatcaaagc ttcctttggg ttggcagtgg tacattggtc
6840ttcgaaagcc aattcagcca ttctgtcaat ggattcgttc aattcttctt cagagacacc
6900ttgagatttc aagttcattt caataccaac agattgacct aattcgtaga cagccttggc
6960caaagattca accaaagctt cagtggtgtt acctttcaaa cctaagaact tggcgatatc
7020agcgtaatcg gtgtcagctc tgaagaattc gtactttggg aacaaagcgt gcttttgagg
7080gtccttggcg ttgtatctga tgatgtgagg caacaagatg gcgttagctc taccatgtgg
7140aataccgtat tcaccaccaa ttttgtgagc aatggagtga gcaataccca agaaagcgtt
7200agcaaaggcc ataccggcca aagtagaagc gttgtgcatc ttttctctgg aaaccttgtc
7260acctttttca acggaagatt tcaagtattc aaaggtcaat ttgatagctt gtagggacaa
7320acctctggtg taatcggagg ccatgacaga aacgtaagat tccatagcgt gagtcaaaac
7380gtccataccg gtatcagcag tgacagattt tgggacggac atgacaaatt gagggtcaat
7440gatggcgaca tctggagtca aagcgaaatc agccaatggg tatttgacgt tggtttcaga
7500gtcagtgata acagcaaatg gagtgacttc agaaccagta ccagaagtgg ttggaataca
7560gatgaaagtg gcgttttctg gcataccaat cttgtaggtt ctcttaccaa tgtccaagaa
7620tttttgcttg gcaccgaaga aagaagtttc tggatgttcg aagaacatcc acatagcctt
7680ggcagcatcc atggcagaac caccacccaa agcaatgatg gtgtctggtt ggaaatcgac
7740catcatttcc aaacccttgt agacagtgtt ggtggatgga tttggttcaa cttcagagaa
7800gatcttgatt tgaggttgtt cagttctttg acgtaagacg ttttcgacgg ttttggtgta
7860accaaattca accatacctg ggtcacaaac gatcatgacc ttttcaatct tgtccatggt
7920ggtcaaggac atgatagcgt tttcttcgaa atagatttga gctggaacct tgaagatttg
7980agtgttgttt cttctcttgg caatggtctt gatgttcaac aaatcggtag cagagacgtt
8040gtgggagatg gagtttctac cgtaagaacc acaacccaaa gtcaaggatg gaatcaattc
8100gttgtacatg tcaccgatac caccaacagc agatggagtg ttgaccaaaa cacgacaagc
8160cttcattctt agaccgaaat ccttttgcaa agtttcgtct tcagtgtgga taacagcagt
8220gtgacctaaa ccaccgaagt gcaaagtgtc ttcacagatt tggaaagctt gcttggtaga
8280ttgagccttg actaaagcca aaactggaga caacttttct ctggacaatg ggtagtcaga
8340accgacaccg gagatttcag caatgatcaa tttggtgttt tctggaactg ggataccggc
8400caattcagca atttcaacag cagacttacc aacgatatct ggcttgatac cagtcttttg
8460ttcgttcatg atggcgtttt ctaatctttg taattcgtcc ttcttgacga agtaagcttg
8520gtgagccttg aattcattgg tgacatcctt gtagatttcc ttgtcaatga caacaacttg
8580ttcagaagca cagatcatac cattatcgaa agtcttggaa ccgatgatat cgttgacagc
8640acgcttgata tgagcagtct tttcgatgta agatggaacg ttacctggac caacacccaa
8700agctggttta ccagtggagt aagcagactt aaccatacca gaaccaccgg tagccaagac
8760taaagcaata cccttgtggt tcatcaattg cttggtagct tcaatggatg gaacttcaat
8820ccattggatg atatcctttg gagcaccagc cttcatggca gcttccaaaa caacttcagc
8880agctctcttg gaagattctt gagcagatgg gtggaaagcg aaaataattg ggttaccagt
8940cttgatggca atcatagcct tgaagatggt ggtagaagtt gggttggtgg ttggagtgac
9000accacagatg acaccaattg gttcagcaac gtaggtcaaa cccttttctt tgtcttcacc
9060aatgatacca acagtcttgt tgtccttgat ggagttccag atgtattcag aggcgtataa
9120gttcttgata gccttgtctt cgtagatacc tctaccggtt tcttcatgag ccaacttggc
9180caaaaccatg tgttggtcaa cagcagccaa ggacatttgg tggacaatgt ggtcaatttc
9240ttcttgagac ttcttggaca aagcttccaa agccttctta cctttgtcag ctaaagcatc
9300aatcatgatg gcaacttctt gttccttgga acctctgttt tccttttctg gaatggtcaa
9360cattttttac tagttctaga atccgtcgaa actaagttct ggtgttttaa aactaaaaaa
9420aagactaact ataaaagtag aatttaagaa gtttaagaaa tagatttaca gaattacaat
9480caatacctac cgtctttata tacttattag tcaagtaggg gaataatttc agggaactgg
9540tttcaacctt ttttttcagc tttttccaaa tcagagagag cagaaggtaa tagaaggtgt
9600aagaaaatga gatagataca tgcgtgggtc aattgccttg tgtcatcatt tactccaggc
9660aggttgcatc actccattga ggttgtgccc gttttttgcc tgtttgtgcc cctgttctct
9720gtagttgcgc taagagaatg gacctatgaa ctgatggttg gtgaagaaaa caatattttg
9780gtgctgggat tctttttttt tctggatgcc agcttaaaaa gcgggctcca ttatatttag
9840tggatgccag gaataaactg ttcacccaga cacctacgat gttatatatt ctgtgtaacc
9900cgccccctat tttgggcatg tacgggttac agcagaatta aaaggctaat tttttgacta
9960aataaagtta ggaaaatcac tactattaat tatttacgta ttctttgaaa tggcgagtat
10020tgataatgat aaactgagct ccagcttttg ttccctttag tgagggttaa ttgcgcgctt
10080ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca
10140caacatagga gccggaagca taaagtgtaa agcctggggt gcctaatgag tgaggtaact
10200cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct
10260gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc
10320ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca
10380ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg
10440agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca
10500taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa
10560cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc
10620tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc
10680gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct
10740gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg
10800tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag
10860gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta
10920cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg
10980aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt
11040tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt
11100ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag
11160attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat
11220ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc
11280tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat
11340aactacgata cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc
11400acgctcaccg gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag
11460aagtggtcct gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag
11520agtaagtagt tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt
11580ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg
11640agttacatga tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt
11700tgtcagaagt aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc
11760tcttactgtc atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc
11820attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa
11880taccgcgcca catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg
11940aaaactctca aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc
12000caactgatct tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag
12060gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt
12120cctttttcaa tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt
12180tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc
12240acctgacgtc taagaaacca ttattatcat gacattaacc tataaaaata ggcgtatcac
12300gaggcccttt cgtc 12314
45 11180 DNA Artificial pBOL116
45 tcgcgcgttt cggtgatgac ggtgaaaacc
tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat gccgggagca
gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg
cggcatcaga gcagattgta ctgagagtgc 180accataccac agcttttcaa ttcaattcat
catttttttt ttattctttt ttttgatttc 240ggtttctttg aaattttttt gattcggtaa
tctccgaaca gaaggaagaa cgaaggaagg 300agcacagact tagattggta tatatacgca
tatgtagtgt tgaagaaaca tgaaattgcc 360cagtattctt aacccaactg cacagaacaa
aaacctgcag gaaacgaaga taaatcatgt 420cgaaagctac atataaggaa cgtgctgcta
ctcatcctag tcctgttgct gccaagctat 480ttaatatcat gcacgaaaag caaacaaact
tgtgtgcttc attggatgtt cgtaccacca 540aggaattact ggagttagtt gaagcattag
gtcccaaaat ttgtttacta aaaacacatg 600tggatatctt gactgatttt tccatggagg
gcacagttaa gccgctaaag gcattatccg 660ccaagtacaa ttttttactc ttcgaagaca
gaaaatttgc tgacattggt aatacagtca 720aattgcagta ctctgcgggt gtatacagaa
tagcagaatg ggcagacatt acgaatgcac 780acggtgtggt gggcccaggt attgttagcg
gtttgaagca ggcggcagaa gaagtaacaa 840aggaacctag aggccttttg atgttagcag
aattgtcatg caagggctcc ctatctactg 900gagaatatac taagggtact gttgacattg
cgaagagcga caaagatttt gttatcggct 960ttattgctca aagagacatg ggtggaagag
atgaaggtta cgattggttg attatgacac 1020ccggtgtggg tttagatgac aagggagacg
cattgggtca acagtataga accgtggatg 1080atgtggtctc tacaggatct gacattatta
ttgttggaag aggactattt gcaaagggaa 1140gggatgctaa ggtagagggt gaacgttaca
gaaaagcagg ctgggaagca tatttgagaa 1200gatgcggcca gcaaaactaa aaaactgtat
tataagtaaa tgcatgtata ctaaactcac 1260aaattagagc ttcaatttaa ttatatcagt
tattacccta tgcggtgtga aataccgcac 1320agatgcgtaa ggagaaaata ccgcatcagg
aaattgtaaa cgttaatatt ttgttaaaat 1380tcgcgttaaa tttttgttaa atcagctcat
tttttaacca ataggccgaa atcggcaaaa 1440tcccttataa atcaaaagaa tagaccgaga
tagggttgag tgttgttcca gtttggaaca 1500agagtccact attaaagaac gtggactcca
acgtcaaagg gcgaaaaacc gtctatcagg 1560gcgatggccc actacgtgaa ccatcaccct
aatcaagttt tttggggtcg aggtgccgta 1620aagcactaaa tcggaaccct aaagggagcc
cccgatttag agcttgacgg ggaaagccgg 1680cgaacgtggc gagaaaggaa gggaagaaag
cgaaaggagc gggcgctagg gcgctggcaa 1740gtgtagcggt cacgctgcgc gtaaccacca
cacccgccgc gcttaatgcg ccgctacagg 1800gcgcgtcgcg ccattcgcca ttcaggctgc
gcaactgttg ggaagggcga tcggtgcggg 1860cctcttcgct attacgccag ctggcgaaag
ggggatgtgc tgcaaggcga ttaagttggg 1920taacgccagg gttttcccag tcacgacgtt
gtaaaacgac ggccagtgag cgcgcgtaat 1980acgactcact atagggcgaa ttgggtaccg
ggccccccct cgaggtcgac ggtatcgata 2040agcttgatat cgaattcctg cagcccgggg
gatccactag ttctagagcg gcccatttaa 2100acggccggcc ctagatcaga gggtggtaaa
tgaagtgtaa tagtattcat ttttcttata 2160aatcatccct tccgtgattt atacaaaaga
agaggagaat atgctgaata cttggtatat 2220tactctacat tatactctta tcttgacggg
tattctgagc atcttactca gtttcaagat 2280cttttaatgt ccaaaaacat ttgagccgat
ctaaatactt ctgtgttttc attaatttat 2340aaattgtact cttttaagac atggaaagta
ccaacatcgg ttgaaacagt ttttcattta 2400cttatggttt attggttttt ccagtgaatg
attatttgtc gttacccttt cgtaaaagtt 2460caaacacgtt tttaagtatt gtttagttgc
tctttcgaca tatatgatta tccctgcgcg 2520gctaaagtta aggatgcaaa aaacataaga
caactgaagt taatttacgt caattaagtt 2580ttccagggta atgatgtttt gggcttccac
taattcaata agtatgtcat gaaatacgtt 2640gtgaagagca tccagaaata atgaaaagaa
acaacgaaac tgggtcggcc tgttgtttct 2700tttctttacc acgtgatctg cggcatttac
aggaagtcgc gcgttttgcg cagttgttgc 2760aacgcagcta cggctaacaa agcctagtgg
aactcgactg atgtgttagg gcctaaaact 2820ggtggtgaca gctgaagtga actattcaat
ccaatcatgt catggctgtc acaaagacct 2880tgcggaccgc acgtacgaac acatacgtat
gctaatatgt gttttgatag tacccagtga 2940tcgcagacct gcaatttttt tgtaggtttg
gaagaatata taaaggttgc actcattcaa 3000gatagttttt ttcttgtgtg tctattcatt
ttattattgt ttgtttaaat gttaaaaaaa 3060ccaagaactt agtttcaaat taaattcatc
acacaaacaa acaaaacaaa atgaacattg 3120ttgtttgttt gaagcaagtt ccagacactg
ctgaagtcag aattgaccca gtcaagggta 3180ctttaatcag agaaggtgtt ccatctatca
tcaacccaga cgacaagaac gctttggaag 3240aagctttggt tttgaaggac aactacggtg
ctcacgttac cgtcatttcc atgggtccac 3300ctcaagccaa gaacgctttg gttgaagctt
tggccatggg tgctgatgaa gctgtcttat 3360tgactgacag agctttcggt ggtgctgata
ctttagctac ctctcacacc attgctgctg 3420gtatcaagaa attgaaatac gatatcgtct
ttgccggtcg tcaagccatc gatggtgata 3480ccgctcaagt cggtccagaa attgctgaac
atttgggtat tccacaagtc acctacgttg 3540aaaaggttga agttgacggt gacactttga
agatcagaaa ggcttgggaa gacggttacg 3600aagttgttga agtcaagact ccagttctat
tgactgccat caaggaattg aacgttccaa 3660gatacatgtc cgttgaaaag atcttcggtg
ctttcgacaa ggaagtcaag atgtggactg 3720ctgatgatat cgatgtcgac aaggccaact
tgggtttgaa aggttctcca accaaggtca 3780agaaatcttc taccaaggaa gtcaagggtc
aaggtgaagt cattgacaaa ccagtcaagg 3840aagctgccgc ttacgttgtt tccaagttga
aggaagaaca ctacatctaa agcccgggcg 3900gagattgata agacttttct agttgcatat
cttttatatt taaatcttat ctattagtta 3960attttttgta atttatcctt atatatagtc
tggttattct aaaatatcat ttcagtatct 4020aaaaattccc ctcttttttc agttatatct
taacaggcga cagtccaaat gttgatttat 4080cccagtccga ttcatcaggg ttgtgaagca
ttttgtcaat ggtcgaaatc acatcagtaa 4140tagtgcctct tacttgcctc atagaatttc
tttctcttaa cgtcaccgtt tggtctttta 4200tagtttcgaa atctatggtg ataccaaatg
gtgttcccaa ttcatcgtta cgggcgtatt 4260ttttaccaat tgaagtattg gaatcgtcaa
ttttaaagta tatctctctt ttacgtaaag 4320cctgcgagat cctcttaagt atagcgggga
agccatcgtt attcgatatt gtcgtaacaa 4380atactttgat cggcgctatg tttaaatgtt
taaacatgga cagatatgcg atgaaaacgc 4440taagtgatac tccaaatggt gaaaggtacg
atgcttggaa acaatacttg gaaatcaccg 4500gaaacaccat atgcggcgaa aagccaatta
gtgtgatact aagtgcttta tcgaaaatcc 4560gtgatgccgg tccttcaggc atcaaatttc
agtggcctaa ttattcacag agttctcatg 4620tgacaagtat tgatgatagt agtgtcagtt
atgcttcagg ttatgttact ataggataat 4680gatcacggct aaaacggtcg aatgtaagca
tatatctttc gattgtataa ttgttcccaa 4740atactacagc atctcaagga aaaaaaaaca
aaaacttcca aaaaaatcga atccctgagg 4800aatctttaat acattttcaa tctatttaag
ttttataaac gtgtatatga gatgtcatga 4860gcatgaatta ttaataataa aaactaaatc
attaaagtaa cttaaggagt taaagcccgg 4920gctttaattg ttagcagcct tgacttgagc
aatcaattct ggaacaacct tgttgacatc 4980accgacaatg gccaaatcag caaccttcat
gattggagct tcgacatctt tgttgatggc 5040aatgatgtag tcagagtctt gcataccagc
caagtgttgg atggcaccag agataccaca 5100agcaatgtac aaagttggtc tgacggtctt
accggtttga ccgacttgca agtccttgtc 5160aacccattcc ttttcaatgg cagctctgga
agcagcaatg gtaccaccca acaaagaagc 5220taattcttcc aatttttcga agttttcctt
ggaaccaaca ccacgaccac cagcaaccaa 5280aaccttggct tcaccgatat cagcaatgtc
cttggccaat ttgacaacct tggaaacctt 5340ggttctgata tcagaagcag tcaatttgat
ggcaaccttt tcgatcttgt catcagaaac 5400gttagcatcg ttaactggca atttttcaaa
gacacctggt ctgacggtgg ccatttgagg 5460tctgtggtca gaacagacaa tggtagcaat
caagttacca ccgaaagctg gtctggtagc 5520caacaagtca cggttttcga catcgatatc
caaagaggta cagtcagcag tcaaaccagt 5580agacaatctg gcagcaattc ttggacccaa
gtctctaccg atgaaagtag caccgatgaa 5640taagatttct ggctttcttt cgttgaccaa
gtcacagata accttggcgt aaccgtcagt 5700ggagaaatga gctaataatt cgttgtcagc
agccaaaacc ttgtcagcac cgtgggacaa 5760caagtccttg gacatctttt cagtgttgtg
acccaataag acagcagtca attcaacacc 5820caatttttca gccatttcct tacccttacc
tagcaattcc aaagaaacct tttgtaattc 5880accatctctt tgttcagcga aaacccagac
acccttgtag tcagccttgt tcatgtttag 5940ttaattatag ttcgttgacc gtatattcta
aaaacaagta ctccttaaaa aaaaaccttg 6000aagggaataa acaagtagaa tagatagaga
gaaaaataga aaatgcaaga gaatttatat 6060attagaaaga gagaaagaaa aatggaaaaa
aaaaaatagg aaaagccaga aatagcacta 6120gaaggagcga caccagaaaa gaaggtgatg
gaaccaattt agctatatat agttaactac 6180cggctcgatc atctctgcct ccagcatagt
cgaagaagaa tttttttttt cttgaggctt 6240ctgtcagcaa ctcgtatttt ttctttcttt
tttggtgagc ctaaaaagtt cccacgttct 6300cttgtacgac gccgtcacaa acaaccttat
gggtaatttg tcgcggtctg ggtgtataaa 6360tgtgtgggtg caggccggcc gtttaaacgg
gccgccaccg cggtggagcc tgtgtggaag 6420aacgattaca acaggtgttg tcctctgagg
acataaaata cacaccgaga ttcatcaact 6480cattgctgga gttagcatat ctacaattgg
gtgaaatggg gagcgatttg caggcatttg 6540ctcggcatgc cggtagaggt gtggtcaata
agagcgacct catgctatac ctgagaaagc 6600aacctgacct acaggaaaga gttactcaag
aataagaatt ttcgttttaa aacctaagag 6660tcactttaaa atttgtatac acttattttt
tttataactt atttaataat aaaaatcata 6720aatcataaga aattcgctcg agtcgactgc
agtttacaag ttcaatttgg ccatgatggc 6780cttaacaatg gcttgaacat cttcgttgtc
ttctggttca gctggagcag aagaagaagc 6840agcaccgaca ccgaaagctt ctctgatttc
ttcgacggtg gtggtaccgt aagcaacctt 6900tctgatgttg aacaagtttt ctggaccaac
gttgtcagag gtggcagaac caccaacagc 6960accacaacct aaagtcaaag atgggactaa
gttggtagca ccaccgatac cacccaaaga 7020acctggagag ttaaccaaaa ttctggaaac
aggcttcttc aaagcaaatt ctctaatgat 7080ttcttcgttt tgagagtgga tgatcaaagt
gtgaccagaa ccttggttgt gcaataaagc 7140caaagacttt tcacaagctt catgccagtc
ttcgacggtg tagaaagcca agactggagc 7200caatttttcc ttagcgtatg gatttttagg
agaaacatcg gtttgttcgg atagtaagat 7260aacagcatca gatggaatgg aaataccagc
caatttggcc aaagcttgga catccttacc 7320aacgatggct gggtttggag taccgttggc
acgtaataga atcttaccaa ccttttcaga 7380ttcttcagca ttcaagaagt aacccttttg
tctcttgaat tcttcgatga tttcagcctt 7440cttgacggtt tcagcaatga tggattgttc
agaagcacag atgacaccgt tgtcgaaagt 7500gtcagaaccg ataacctttc taacagcagt
tggaatgtca gcagttcttt cgatgaaaca 7560tggaccgtta cctggaccga caccgatggc
tggagtacca gaggagtaag cagctctaac 7620cataccttca ccaccggtag ccaagatcaa
agcggtgtcc ttgttcttca tcaattcagc 7680agtaccttca acggtcaaaa tggacataca
ttggatcaaa ccatctggag caccagcttc 7740aacagcagcc ttttgcatga tcttaacggt
ttcagtgatg gaacggacag cagttgggtg 7800tggagagaag acaatggcgt taccagcctt
caaagcaatc aagaccttga aaatggcagt 7860ggaagttggg ttggtagatg gaatcaaacc
agcaatgaca cctaatggga cagcaatgtc 7920aatcaatttc ttttccttgt cttccttcaa
gataccaacg gtcttcaaat ccttgatgta 7980gttgtagaca acaatggagg agaatttgtt
cttgatgacc ttgtcttccc atttaccgta 8040accggtgtct tcgtaagcca atttggccaa
tttgacagct tcaacttcag tagccttggc 8100gatcttttcg atgaccttgt taacagcttc
ttgggaaaag ttcttgaatt cagcttgagc 8160cttcttggcc ttggcaatca aagttctaac
ttcttggatg gattgcaaat ccttgtccat 8220gatttccatt ttttactagt tctagaatcc
gtcgaaacta agttctggtg ttttaaaact 8280aaaaaaaaga ctaactataa aagtagaatt
taagaagttt aagaaataga tttacagaat 8340tacaatcaat acctaccgtc tttatatact
tattagtcaa gtaggggaat aatttcaggg 8400aactggtttc aacctttttt ttcagctttt
tccaaatcag agagagcaga aggtaataga 8460aggtgtaaga aaatgagata gatacatgcg
tgggtcaatt gccttgtgtc atcatttact 8520ccaggcaggt tgcatcactc cattgaggtt
gtgcccgttt tttgcctgtt tgtgcccctg 8580ttctctgtag ttgcgctaag agaatggacc
tatgaactga tggttggtga agaaaacaat 8640attttggtgc tgggattctt tttttttctg
gatgccagct taaaaagcgg gctccattat 8700atttagtgga tgccaggaat aaactgttca
cccagacacc tacgatgtta tatattctgt 8760gtaacccgcc ccctattttg ggcatgtacg
ggttacagca gaattaaaag gctaattttt 8820tgactaaata aagttaggaa aatcactact
attaattatt tacgtattct ttgaaatggc 8880gagtattgat aatgataaac tgagctccag
cttttgttcc ctttagtgag ggttaattgc 8940gcgcttggcg taatcatggt catagctgtt
tcctgtgtga aattgttatc cgctcacaat 9000tccacacaac ataggagccg gaagcataaa
gtgtaaagcc tggggtgcct aatgagtgag 9060gtaactcaca ttaattgcgt tgcgctcact
gcccgctttc cagtcgggaa acctgtcgtg 9120ccagctgcat taatgaatcg gccaacgcgc
ggggagaggc ggtttgcgta ttgggcgctc 9180ttccgcttcc tcgctcactg actcgctgcg
ctcggtcgtt cggctgcggc gagcggtatc 9240agctcactca aaggcggtaa tacggttatc
cacagaatca ggggataacg caggaaagaa 9300catgtgagca aaaggccagc aaaaggccag
gaaccgtaaa aaggccgcgt tgctggcgtt 9360tttccatagg ctccgccccc ctgacgagca
tcacaaaaat cgacgctcaa gtcagaggtg 9420gcgaaacccg acaggactat aaagatacca
ggcgtttccc cctggaagct ccctcgtgcg 9480ctctcctgtt ccgaccctgc cgcttaccgg
atacctgtcc gcctttctcc cttcgggaag 9540cgtggcgctt tctcatagct cacgctgtag
gtatctcagt tcggtgtagg tcgttcgctc 9600caagctgggc tgtgtgcacg aaccccccgt
tcagcccgac cgctgcgcct tatccggtaa 9660ctatcgtctt gagtccaacc cggtaagaca
cgacttatcg ccactggcag cagccactgg 9720taacaggatt agcagagcga ggtatgtagg
cggtgctaca gagttcttga agtggtggcc 9780taactacggc tacactagaa ggacagtatt
tggtatctgc gctctgctga agccagttac 9840cttcggaaaa agagttggta gctcttgatc
cggcaaacaa accaccgctg gtagcggtgg 9900tttttttgtt tgcaagcagc agattacgcg
cagaaaaaaa ggatctcaag aagatccttt 9960gatcttttct acggggtctg acgctcagtg
gaacgaaaac tcacgttaag ggattttggt 10020catgagatta tcaaaaagga tcttcaccta
gatcctttta aattaaaaat gaagttttaa 10080atcaatctaa agtatatatg agtaaacttg
gtctgacagt taccaatgct taatcagtga 10140ggcacctatc tcagcgatct gtctatttcg
ttcatccata gttgcctgac tccccgtcgt 10200gtagataact acgatacggg agggcttacc
atctggcccc agtgctgcaa tgataccgcg 10260agacccacgc tcaccggctc cagatttatc
agcaataaac cagccagccg gaagggccga 10320gcgcagaagt ggtcctgcaa ctttatccgc
ctccatccag tctattaatt gttgccggga 10380agctagagta agtagttcgc cagttaatag
tttgcgcaac gttgttgcca ttgctacagg 10440catcgtggtg tcacgctcgt cgtttggtat
ggcttcattc agctccggtt cccaacgatc 10500aaggcgagtt acatgatccc ccatgttgtg
caaaaaagcg gttagctcct tcggtcctcc 10560gatcgttgtc agaagtaagt tggccgcagt
gttatcactc atggttatgg cagcactgca 10620taattctctt actgtcatgc catccgtaag
atgcttttct gtgactggtg agtactcaac 10680caagtcattc tgagaatagt gtatgcggcg
accgagttgc tcttgcccgg cgtcaatacg 10740ggataatacc gcgccacata gcagaacttt
aaaagtgctc atcattggaa aacgttcttc 10800ggggcgaaaa ctctcaagga tcttaccgct
gttgagatcc agttcgatgt aacccactcg 10860tgcacccaac tgatcttcag catcttttac
tttcaccagc gtttctgggt gagcaaaaac 10920aggaaggcaa aatgccgcaa aaaagggaat
aagggcgaca cggaaatgtt gaatactcat 10980actcttcctt tttcaatatt attgaagcat
ttatcagggt tattgtctca tgagcggata 11040catatttgaa tgtatttaga aaaataaaca
aataggggtt ccgcgcacat ttccccgaaa 11100agtgccacct gacgtctaag aaaccattat
tatcatgaca ttaacctata aaaataggcg 11160tatcacgagg ccctttcgtc 11180
46 11108 DNA Artificial pBOL118
46 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca
60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg
120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc
180accataccac agcttttcaa ttcaattcat catttttttt ttattctttt ttttgatttc
240ggtttctttg aaattttttt gattcggtaa tctccgaaca gaaggaagaa cgaaggaagg
300agcacagact tagattggta tatatacgca tatgtagtgt tgaagaaaca tgaaattgcc
360cagtattctt aacccaactg cacagaacaa aaacctgcag gaaacgaaga taaatcatgt
420cgaaagctac atataaggaa cgtgctgcta ctcatcctag tcctgttgct gccaagctat
480ttaatatcat gcacgaaaag caaacaaact tgtgtgcttc attggatgtt cgtaccacca
540aggaattact ggagttagtt gaagcattag gtcccaaaat ttgtttacta aaaacacatg
600tggatatctt gactgatttt tccatggagg gcacagttaa gccgctaaag gcattatccg
660ccaagtacaa ttttttactc ttcgaagaca gaaaatttgc tgacattggt aatacagtca
720aattgcagta ctctgcgggt gtatacagaa tagcagaatg ggcagacatt acgaatgcac
780acggtgtggt gggcccaggt attgttagcg gtttgaagca ggcggcagaa gaagtaacaa
840aggaacctag aggccttttg atgttagcag aattgtcatg caagggctcc ctatctactg
900gagaatatac taagggtact gttgacattg cgaagagcga caaagatttt gttatcggct
960ttattgctca aagagacatg ggtggaagag atgaaggtta cgattggttg attatgacac
1020ccggtgtggg tttagatgac aagggagacg cattgggtca acagtataga accgtggatg
1080atgtggtctc tacaggatct gacattatta ttgttggaag aggactattt gcaaagggaa
1140gggatgctaa ggtagagggt gaacgttaca gaaaagcagg ctgggaagca tatttgagaa
1200gatgcggcca gcaaaactaa aaaactgtat tataagtaaa tgcatgtata ctaaactcac
1260aaattagagc ttcaatttaa ttatatcagt tattacccta tgcggtgtga aataccgcac
1320agatgcgtaa ggagaaaata ccgcatcagg aaattgtaaa cgttaatatt ttgttaaaat
1380tcgcgttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa atcggcaaaa
1440tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca gtttggaaca
1500agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc gtctatcagg
1560gcgatggccc actacgtgaa ccatcaccct aatcaagttt tttggggtcg aggtgccgta
1620aagcactaaa tcggaaccct aaagggagcc cccgatttag agcttgacgg ggaaagccgg
1680cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg gcgctggcaa
1740gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg ccgctacagg
1800gcgcgtcgcg ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg
1860cctcttcgct attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg
1920taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtgag cgcgcgtaat
1980acgactcact atagggcgaa ttgggtaccg ggccccccct cgaggtcgac ggtatcgata
2040agcttgatat cgaattcctg cagcccgggg gatccactag ttctagagcg gcccatttaa
2100acggccggcc ctagatcaga gggtggtaaa tgaagtgtaa tagtattcat ttttcttata
2160aatcatccct tccgtgattt atacaaaaga agaggagaat atgctgaata cttggtatat
2220tactctacat tatactctta tcttgacggg tattctgagc atcttactca gtttcaagat
2280cttttaatgt ccaaaaacat ttgagccgat ctaaatactt ctgtgttttc attaatttat
2340aaattgtact cttttaagac atggaaagta ccaacatcgg ttgaaacagt ttttcattta
2400cttatggttt attggttttt ccagtgaatg attatttgtc gttacccttt cgtaaaagtt
2460caaacacgtt tttaagtatt gtttagttgc tctttcgaca tatatgatta tccctgcgcg
2520gctaaagtta aggatgcaaa aaacataaga caactgaagt taatttacgt caattaagtt
2580ttccagggta atgatgtttt gggcttccac taattcaata agtatgtcat gaaatacgtt
2640gtgaagagca tccagaaata atgaaaagaa acaacgaaac tgggtcggcc tgttgtttct
2700tttctttacc acgtgatctg cggcatttac aggaagtcgc gcgttttgcg cagttgttgc
2760aacgcagcta cggctaacaa agcctagtgg aactcgactg atgtgttagg gcctaaaact
2820ggtggtgaca gctgaagtga actattcaat ccaatcatgt catggctgtc acaaagacct
2880tgcggaccgc acgtacgaac acatacgtat gctaatatgt gttttgatag tacccagtga
2940tcgcagacct gcaatttttt tgtaggtttg gaagaatata taaaggttgc actcattcaa
3000gatagttttt ttcttgtgtg tctattcatt ttattattgt ttgtttaaat gttaaaaaaa
3060ccaagaactt agtttcaaat taaattcatc acacaaacaa acaaaacaaa atgaacattg
3120ttgtttgttt gaagcaagtt ccagacactg ctgaagtcag aattgaccca gtcaagggta
3180ctttaatcag agaaggtgtt ccatctatca tcaacccaga cgacaagaac gctttggaag
3240aagctttggt tttgaaggac aactacggtg ctcacgttac cgtcatttcc atgggtccac
3300ctcaagccaa gaacgctttg gttgaagctt tggccatggg tgctgatgaa gctgtcttat
3360tgactgacag agctttcggt ggtgctgata ctttagctac ctctcacacc attgctgctg
3420gtatcaagaa attgaaatac gatatcgtct ttgccggtcg tcaagccatc gatggtgata
3480ccgctcaagt cggtccagaa attgctgaac atttgggtat tccacaagtc acctacgttg
3540aaaaggttga agttgacggt gacactttga agatcagaaa ggcttgggaa gacggttacg
3600aagttgttga agtcaagact ccagttctat tgactgccat caaggaattg aacgttccaa
3660gatacatgtc cgttgaaaag atcttcggtg ctttcgacaa ggaagtcaag atgtggactg
3720ctgatgatat cgatgtcgac aaggccaact tgggtttgaa aggttctcca accaaggtca
3780agaaatcttc taccaaggaa gtcaagggtc aaggtgaagt cattgacaaa ccagtcaagg
3840aagctgccgc ttacgttgtt tccaagttga aggaagaaca ctacatctaa agcccgggcg
3900gagattgata agacttttct agttgcatat cttttatatt taaatcttat ctattagtta
3960attttttgta atttatcctt atatatagtc tggttattct aaaatatcat ttcagtatct
4020aaaaattccc ctcttttttc agttatatct taacaggcga cagtccaaat gttgatttat
4080cccagtccga ttcatcaggg ttgtgaagca ttttgtcaat ggtcgaaatc acatcagtaa
4140tagtgcctct tacttgcctc atagaatttc tttctcttaa cgtcaccgtt tggtctttta
4200tagtttcgaa atctatggtg ataccaaatg gtgttcccaa ttcatcgtta cgggcgtatt
4260ttttaccaat tgaagtattg gaatcgtcaa ttttaaagta tatctctctt ttacgtaaag
4320cctgcgagat cctcttaagt atagcgggga agccatcgtt attcgatatt gtcgtaacaa
4380atactttgat cggcgctatg tttaaatgtt taaacatgga cagatatgcg atgaaaacgc
4440taagtgatac tccaaatggt gaaaggtacg atgcttggaa acaatacttg gaaatcaccg
4500gaaacaccat atgcggcgaa aagccaatta gtgtgatact aagtgcttta tcgaaaatcc
4560gtgatgccgg tccttcaggc atcaaatttc agtggcctaa ttattcacag agttctcatg
4620tgacaagtat tgatgatagt agtgtcagtt atgcttcagg ttatgttact ataggataat
4680gatcacggct aaaacggtcg aatgtaagca tatatctttc gattgtataa ttgttcccaa
4740atactacagc atctcaagga aaaaaaaaca aaaacttcca aaaaaatcga atccctgagg
4800aatctttaat acattttcaa tctatttaag ttttataaac gtgtatatga gatgtcatga
4860gcatgaatta ttaataataa aaactaaatc attaaagtaa cttaaggagt taaagcccgg
4920gctttaattg ttagcagcct tgacttgagc aatcaattct ggaacaacct tgttgacatc
4980accgacaatg gccaaatcag caaccttcat gattggagct tcgacatctt tgttgatggc
5040aatgatgtag tcagagtctt gcataccagc caagtgttgg atggcaccag agataccaca
5100agcaatgtac aaagttggtc tgacggtctt accggtttga ccgacttgca agtccttgtc
5160aacccattcc ttttcaatgg cagctctgga agcagcaatg gtaccaccca acaaagaagc
5220taattcttcc aatttttcga agttttcctt ggaaccaaca ccacgaccac cagcaaccaa
5280aaccttggct tcaccgatat cagcaatgtc cttggccaat ttgacaacct tggaaacctt
5340ggttctgata tcagaagcag tcaatttgat ggcaaccttt tcgatcttgt catcagaaac
5400gttagcatcg ttaactggca atttttcaaa gacacctggt ctgacggtgg ccatttgagg
5460tctgtggtca gaacagacaa tggtagcaat caagttacca ccgaaagctg gtctggtagc
5520caacaagtca cggttttcga catcgatatc caaagaggta cagtcagcag tcaaaccagt
5580agacaatctg gcagcaattc ttggacccaa gtctctaccg atgaaagtag caccgatgaa
5640taagatttct ggctttcttt cgttgaccaa gtcacagata accttggcgt aaccgtcagt
5700ggagaaatga gctaataatt cgttgtcagc agccaaaacc ttgtcagcac cgtgggacaa
5760caagtccttg gacatctttt cagtgttgtg acccaataag acagcagtca attcaacacc
5820caatttttca gccatttcct tacccttacc tagcaattcc aaagaaacct tttgtaattc
5880accatctctt tgttcagcga aaacccagac acccttgtag tcagccttgt tcatgtttag
5940ttaattatag ttcgttgacc gtatattcta aaaacaagta ctccttaaaa aaaaaccttg
6000aagggaataa acaagtagaa tagatagaga gaaaaataga aaatgcaaga gaatttatat
6060attagaaaga gagaaagaaa aatggaaaaa aaaaaatagg aaaagccaga aatagcacta
6120gaaggagcga caccagaaaa gaaggtgatg gaaccaattt agctatatat agttaactac
6180cggctcgatc atctctgcct ccagcatagt cgaagaagaa tttttttttt cttgaggctt
6240ctgtcagcaa ctcgtatttt ttctttcttt tttggtgagc ctaaaaagtt cccacgttct
6300cttgtacgac gccgtcacaa acaaccttat gggtaatttg tcgcggtctg ggtgtataaa
6360tgtgtgggtg caggccggcc gtttaaacgg gccgccaccg cggtggagcc tgtgtggaag
6420aacgattaca acaggtgttg tcctctgagg acataaaata cacaccgaga ttcatcaact
6480cattgctgga gttagcatat ctacaattgg gtgaaatggg gagcgatttg caggcatttg
6540ctcggcatgc cggtagaggt gtggtcaata agagcgacct catgctatac ctgagaaagc
6600aacctgacct acaggaaaga gttactcaag aataagaatt ttcgttttaa aacctaagag
6660tcactttaaa atttgtatac acttattttt tttataactt atttaataat aaaaatcata
6720aatcataaga aattcgctcg agtcgactgc agtttaaaca attctgaaag catcgaccaa
6780aacacaacga cgtaatctga cgaaagttct ggcagaagtg acaccttcac cagttggggt
6840ggtgatggtc atggtggtcc aaccttcacc acccaaaccc aaaccagcga tacatggacc
6900gttcttgaca aagatggaag tgtcaatggc gttagccatt tggttcatgt tttcgatgtt
6960tctggagtgc atggcagcag tgtggtgaca accaccttcc aatttgacag ccaaagcaat
7020agcgtcagca acgttagcaa cacggacaac tggtaagact ggcatcatca attcagtgac
7080agcaaatggg tgttcagcgg tggtttcgac gaataataat ctggtttctt gtggaacctt
7140caaaccgatg gcagcagcaa tcttaccagc atctctacca acccagtctc tggagacggt
7200acccttacct ctttcatcga tgttcttcaa caaaactggt tgcaattgtt gagcttgttc
7260agcagtcaac ttgacggcat gttgaccttc catcaatctc atcaattcgt cagcaacgga
7320gtcaacaaca atcaaaacct tttcgtcagc acagatgatg ttgttgtcga aagaagcacc
7380cttgacaatg gattgagcag ctctggccaa atcagcggtt tcatcgacaa caacaggagg
7440gttaccagca ccagcagcaa tcaatctctt gttggtgtgc tttctggcag cttcaacaac
7500agcttcacca ccagtgacga ctaatagacc gatacctggg aacttgaata atctttgagc
7560agtttcgata tctgggttgg caacagtgac caacaagttt tctggaccac cagcagcaac
7620aatggcttgg ttcaatagag tgatggctct ttgagaaacc ttcttggcag ctgggtgtgg
7680agcgaagata acggagttac cagcagcaat caaagagatg gcgttgttga tgacagtagc
7740agctgggttg gtagatgggg tgacggaagc aacaacaccc catggagcat tttcaatcaa
7800agtcaaacca ttatcaccgg tcaagacttg tggagacaaa cattcgacac ctggagtacc
7860tctagcttga gcaacgttct tagcgaattt gtcttcaact ctacccatac cggtttcgga
7920gacagccaat tcagccaagt ctctggcatg cttttcacca gcttctctga tggcagcaat
7980ggccaattgt ctcatggcaa cagatttcaa accttgttga gcaaccttgg cagcagcaac
8040agcgtcgtcc aaagaagcga aaacacccat ttcgtggaca gcagcagatg gagtgtcaga
8100agattgcatt ttcaacaaga cagccttgac aacttgttcg atatcttgtt ggttcatttt
8160ttactagttc tagaatccgt cgaaactaag ttctggtgtt ttaaaactaa aaaaaagact
8220aactataaaa gtagaattta agaagtttaa gaaatagatt tacagaatta caatcaatac
8280ctaccgtctt tatatactta ttagtcaagt aggggaataa tttcagggaa ctggtttcaa
8340cctttttttt cagctttttc caaatcagag agagcagaag gtaatagaag gtgtaagaaa
8400atgagataga tacatgcgtg ggtcaattgc cttgtgtcat catttactcc aggcaggttg
8460catcactcca ttgaggttgt gcccgttttt tgcctgtttg tgcccctgtt ctctgtagtt
8520gcgctaagag aatggaccta tgaactgatg gttggtgaag aaaacaatat tttggtgctg
8580ggattctttt tttttctgga tgccagctta aaaagcgggc tccattatat ttagtggatg
8640ccaggaataa actgttcacc cagacaccta cgatgttata tattctgtgt aacccgcccc
8700ctattttggg catgtacggg ttacagcaga attaaaaggc taattttttg actaaataaa
8760gttaggaaaa tcactactat taattattta cgtattcttt gaaatggcga gtattgataa
8820tgataaactg agctccagct tttgttccct ttagtgaggg ttaattgcgc gcttggcgta
8880atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat
8940aggagccgga agcataaagt gtaaagcctg gggtgcctaa tgagtgaggt aactcacatt
9000aattgcgttg cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta
9060atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc
9120gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa
9180ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa
9240aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct
9300ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac
9360aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc
9420gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc
9480tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg
9540tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga
9600gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta acaggattag
9660cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta
9720cactagaagg acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag
9780agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg
9840caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac
9900ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc
9960aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag
10020tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc
10080agcgatctgt ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac
10140gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc
10200accggctcca gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg
10260tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag
10320tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc
10380acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac
10440atgatccccc atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag
10500aagtaagttg gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac
10560tgtcatgcca tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg
10620agaatagtgt atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc
10680gccacatagc agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact
10740ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg
10800atcttcagca tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa
10860tgccgcaaaa aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt
10920tcaatattat tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg
10980tatttagaaa aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga
11040cgtctaagaa accattatta tcatgacatt aacctataaa aataggcgta tcacgaggcc
11100ctttcgtc
11108
47 11114 DNA Artificial pBOL120
47 tcgcgcgttt cggtgatgac ggtgaaaacc
tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat gccgggagca
gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg
cggcatcaga gcagattgta ctgagagtgc 180accataccac agcttttcaa ttcaattcat
catttttttt ttattctttt ttttgatttc 240ggtttctttg aaattttttt gattcggtaa
tctccgaaca gaaggaagaa cgaaggaagg 300agcacagact tagattggta tatatacgca
tatgtagtgt tgaagaaaca tgaaattgcc 360cagtattctt aacccaactg cacagaacaa
aaacctgcag gaaacgaaga taaatcatgt 420cgaaagctac atataaggaa cgtgctgcta
ctcatcctag tcctgttgct gccaagctat 480ttaatatcat gcacgaaaag caaacaaact
tgtgtgcttc attggatgtt cgtaccacca 540aggaattact ggagttagtt gaagcattag
gtcccaaaat ttgtttacta aaaacacatg 600tggatatctt gactgatttt tccatggagg
gcacagttaa gccgctaaag gcattatccg 660ccaagtacaa ttttttactc ttcgaagaca
gaaaatttgc tgacattggt aatacagtca 720aattgcagta ctctgcgggt gtatacagaa
tagcagaatg ggcagacatt acgaatgcac 780acggtgtggt gggcccaggt attgttagcg
gtttgaagca ggcggcagaa gaagtaacaa 840aggaacctag aggccttttg atgttagcag
aattgtcatg caagggctcc ctatctactg 900gagaatatac taagggtact gttgacattg
cgaagagcga caaagatttt gttatcggct 960ttattgctca aagagacatg ggtggaagag
atgaaggtta cgattggttg attatgacac 1020ccggtgtggg tttagatgac aagggagacg
cattgggtca acagtataga accgtggatg 1080atgtggtctc tacaggatct gacattatta
ttgttggaag aggactattt gcaaagggaa 1140gggatgctaa ggtagagggt gaacgttaca
gaaaagcagg ctgggaagca tatttgagaa 1200gatgcggcca gcaaaactaa aaaactgtat
tataagtaaa tgcatgtata ctaaactcac 1260aaattagagc ttcaatttaa ttatatcagt
tattacccta tgcggtgtga aataccgcac 1320agatgcgtaa ggagaaaata ccgcatcagg
aaattgtaaa cgttaatatt ttgttaaaat 1380tcgcgttaaa tttttgttaa atcagctcat
tttttaacca ataggccgaa atcggcaaaa 1440tcccttataa atcaaaagaa tagaccgaga
tagggttgag tgttgttcca gtttggaaca 1500agagtccact attaaagaac gtggactcca
acgtcaaagg gcgaaaaacc gtctatcagg 1560gcgatggccc actacgtgaa ccatcaccct
aatcaagttt tttggggtcg aggtgccgta 1620aagcactaaa tcggaaccct aaagggagcc
cccgatttag agcttgacgg ggaaagccgg 1680cgaacgtggc gagaaaggaa gggaagaaag
cgaaaggagc gggcgctagg gcgctggcaa 1740gtgtagcggt cacgctgcgc gtaaccacca
cacccgccgc gcttaatgcg ccgctacagg 1800gcgcgtcgcg ccattcgcca ttcaggctgc
gcaactgttg ggaagggcga tcggtgcggg 1860cctcttcgct attacgccag ctggcgaaag
ggggatgtgc tgcaaggcga ttaagttggg 1920taacgccagg gttttcccag tcacgacgtt
gtaaaacgac ggccagtgag cgcgcgtaat 1980acgactcact atagggcgaa ttgggtaccg
ggccccccct cgaggtcgac ggtatcgata 2040agcttgatat cgaattcctg cagcccgggg
gatccactag ttctagagcg gcccatttaa 2100acggccggcc ctagatcaga gggtggtaaa
tgaagtgtaa tagtattcat ttttcttata 2160aatcatccct tccgtgattt atacaaaaga
agaggagaat atgctgaata cttggtatat 2220tactctacat tatactctta tcttgacggg
tattctgagc atcttactca gtttcaagat 2280cttttaatgt ccaaaaacat ttgagccgat
ctaaatactt ctgtgttttc attaatttat 2340aaattgtact cttttaagac atggaaagta
ccaacatcgg ttgaaacagt ttttcattta 2400cttatggttt attggttttt ccagtgaatg
attatttgtc gttacccttt cgtaaaagtt 2460caaacacgtt tttaagtatt gtttagttgc
tctttcgaca tatatgatta tccctgcgcg 2520gctaaagtta aggatgcaaa aaacataaga
caactgaagt taatttacgt caattaagtt 2580ttccagggta atgatgtttt gggcttccac
taattcaata agtatgtcat gaaatacgtt 2640gtgaagagca tccagaaata atgaaaagaa
acaacgaaac tgggtcggcc tgttgtttct 2700tttctttacc acgtgatctg cggcatttac
aggaagtcgc gcgttttgcg cagttgttgc 2760aacgcagcta cggctaacaa agcctagtgg
aactcgactg atgtgttagg gcctaaaact 2820ggtggtgaca gctgaagtga actattcaat
ccaatcatgt catggctgtc acaaagacct 2880tgcggaccgc acgtacgaac acatacgtat
gctaatatgt gttttgatag tacccagtga 2940tcgcagacct gcaatttttt tgtaggtttg
gaagaatata taaaggttgc actcattcaa 3000gatagttttt ttcttgtgtg tctattcatt
ttattattgt ttgtttaaat gttaaaaaaa 3060ccaagaactt agtttcaaat taaattcatc
acacaaacaa acaaaacaaa atgaacattg 3120ttgtttgttt gaagcaagtt ccagacactg
ctgaagtcag aattgaccca gtcaagggta 3180ctttaatcag agaaggtgtt ccatctatca
tcaacccaga cgacaagaac gctttggaag 3240aagctttggt tttgaaggac aactacggtg
ctcacgttac cgtcatttcc atgggtccac 3300ctcaagccaa gaacgctttg gttgaagctt
tggccatggg tgctgatgaa gctgtcttat 3360tgactgacag agctttcggt ggtgctgata
ctttagctac ctctcacacc attgctgctg 3420gtatcaagaa attgaaatac gatatcgtct
ttgccggtcg tcaagccatc gatggtgata 3480ccgctcaagt cggtccagaa attgctgaac
atttgggtat tccacaagtc acctacgttg 3540aaaaggttga agttgacggt gacactttga
agatcagaaa ggcttgggaa gacggttacg 3600aagttgttga agtcaagact ccagttctat
tgactgccat caaggaattg aacgttccaa 3660gatacatgtc cgttgaaaag atcttcggtg
ctttcgacaa ggaagtcaag atgtggactg 3720ctgatgatat cgatgtcgac aaggccaact
tgggtttgaa aggttctcca accaaggtca 3780agaaatcttc taccaaggaa gtcaagggtc
aaggtgaagt cattgacaaa ccagtcaagg 3840aagctgccgc ttacgttgtt tccaagttga
aggaagaaca ctacatctaa agcccgggcg 3900gagattgata agacttttct agttgcatat
cttttatatt taaatcttat ctattagtta 3960attttttgta atttatcctt atatatagtc
tggttattct aaaatatcat ttcagtatct 4020aaaaattccc ctcttttttc agttatatct
taacaggcga cagtccaaat gttgatttat 4080cccagtccga ttcatcaggg ttgtgaagca
ttttgtcaat ggtcgaaatc acatcagtaa 4140tagtgcctct tacttgcctc atagaatttc
tttctcttaa cgtcaccgtt tggtctttta 4200tagtttcgaa atctatggtg ataccaaatg
gtgttcccaa ttcatcgtta cgggcgtatt 4260ttttaccaat tgaagtattg gaatcgtcaa
ttttaaagta tatctctctt ttacgtaaag 4320cctgcgagat cctcttaagt atagcgggga
agccatcgtt attcgatatt gtcgtaacaa 4380atactttgat cggcgctatg tttaaatgtt
taaacatgga cagatatgcg atgaaaacgc 4440taagtgatac tccaaatggt gaaaggtacg
atgcttggaa acaatacttg gaaatcaccg 4500gaaacaccat atgcggcgaa aagccaatta
gtgtgatact aagtgcttta tcgaaaatcc 4560gtgatgccgg tccttcaggc atcaaatttc
agtggcctaa ttattcacag agttctcatg 4620tgacaagtat tgatgatagt agtgtcagtt
atgcttcagg ttatgttact ataggataat 4680gatcacggct aaaacggtcg aatgtaagca
tatatctttc gattgtataa ttgttcccaa 4740atactacagc atctcaagga aaaaaaaaca
aaaacttcca aaaaaatcga atccctgagg 4800aatctttaat acattttcaa tctatttaag
ttttataaac gtgtatatga gatgtcatga 4860gcatgaatta ttaataataa aaactaaatc
attaaagtaa cttaaggagt taaagcccgg 4920gctttaattg ttagcagcct tgacttgagc
aatcaattct ggaacaacct tgttgacatc 4980accgacaatg gccaaatcag caaccttcat
gattggagct tcgacatctt tgttgatggc 5040aatgatgtag tcagagtctt gcataccagc
caagtgttgg atggcaccag agataccaca 5100agcaatgtac aaagttggtc tgacggtctt
accggtttga ccgacttgca agtccttgtc 5160aacccattcc ttttcaatgg cagctctgga
agcagcaatg gtaccaccca acaaagaagc 5220taattcttcc aatttttcga agttttcctt
ggaaccaaca ccacgaccac cagcaaccaa 5280aaccttggct tcaccgatat cagcaatgtc
cttggccaat ttgacaacct tggaaacctt 5340ggttctgata tcagaagcag tcaatttgat
ggcaaccttt tcgatcttgt catcagaaac 5400gttagcatcg ttaactggca atttttcaaa
gacacctggt ctgacggtgg ccatttgagg 5460tctgtggtca gaacagacaa tggtagcaat
caagttacca ccgaaagctg gtctggtagc 5520caacaagtca cggttttcga catcgatatc
caaagaggta cagtcagcag tcaaaccagt 5580agacaatctg gcagcaattc ttggacccaa
gtctctaccg atgaaagtag caccgatgaa 5640taagatttct ggctttcttt cgttgaccaa
gtcacagata accttggcgt aaccgtcagt 5700ggagaaatga gctaataatt cgttgtcagc
agccaaaacc ttgtcagcac cgtgggacaa 5760caagtccttg gacatctttt cagtgttgtg
acccaataag acagcagtca attcaacacc 5820caatttttca gccatttcct tacccttacc
tagcaattcc aaagaaacct tttgtaattc 5880accatctctt tgttcagcga aaacccagac
acccttgtag tcagccttgt tcatgtttag 5940ttaattatag ttcgttgacc gtatattcta
aaaacaagta ctccttaaaa aaaaaccttg 6000aagggaataa acaagtagaa tagatagaga
gaaaaataga aaatgcaaga gaatttatat 6060attagaaaga gagaaagaaa aatggaaaaa
aaaaaatagg aaaagccaga aatagcacta 6120gaaggagcga caccagaaaa gaaggtgatg
gaaccaattt agctatatat agttaactac 6180cggctcgatc atctctgcct ccagcatagt
cgaagaagaa tttttttttt cttgaggctt 6240ctgtcagcaa ctcgtatttt ttctttcttt
tttggtgagc ctaaaaagtt cccacgttct 6300cttgtacgac gccgtcacaa acaaccttat
gggtaatttg tcgcggtctg ggtgtataaa 6360tgtgtgggtg caggccggcc gtttaaacgg
gccgccaccg cggtggagcc tgtgtggaag 6420aacgattaca acaggtgttg tcctctgagg
acataaaata cacaccgaga ttcatcaact 6480cattgctgga gttagcatat ctacaattgg
gtgaaatggg gagcgatttg caggcatttg 6540ctcggcatgc cggtagaggt gtggtcaata
agagcgacct catgctatac ctgagaaagc 6600aacctgacct acaggaaaga gttactcaag
aataagaatt ttcgttttaa aacctaagag 6660tcactttaaa atttgtatac acttattttt
tttataactt atttaataat aaaaatcata 6720aatcataaga aattcgctcg agtcgactgc
agtttatcta atggagaaac catcagtcaa 6780gacacatctt cttcttctag cgaagtgacg
ggcagtggta gtaccttcac cagttggagt 6840agcaatggtg aaagtggtgg aaccttcacc
tctgaaacct aaaccagcga aagatggacc 6900gttcttgaca aagatggagg tttgcatgtc
acgggcagcc ttgttcaatc tggagatgtt 6960ttgagagtgc atggtagcag tgtgatgtag
accttgttcc aattcaatgg caacttccaa 7020agcttcatcg aagtctggaa ctctgacaac
tggaacaatt ggcatcaaca attcaacagt 7080agcgaatggg tgggactttt cagtttcgac
aatgatcaat cttggggtga aatcacaagc 7140aataccagct tctttcaaga tttcagtggc
agacttacca accaatttct tgttggtgac 7200acccttgtca gtgacggcaa ccttttccaa
tttttggata tcagatgggt tagtgacgtg 7260caaagcaccg ttcttttcca tttggaacaa
caagaagtca gcaatggagt caacggcaac 7320aacagacttt tcagcgatac acaagatatt
atggtcaaag gaagcaccgt cgacaatgtc 7380agcagcagcc ttttcaatgt tagcggtttc
gtcaacgatg gatggagggt taccagcacc 7440agcaccgata accttcttac cagattgcat
agcttgcaag acaacacctg gaccaccagt 7500gatgaccaac aatggaacct ttgggtggtt
catcatttct tgagcagctt ggatagatgg 7560cttggcaacg gtgacaatca agttgtcaat
accacaagaa tctctgacga tagtgttcaa 7620cttttcaatc aaccataaag agatgttctt
ggcacctggg tgaggagagt agaaaacggc 7680gttaccagca gccaacatac cgatggagtt
acagatcaaa gtttcagttg ggttggtaga 7740tggagcaaca gcaccgatga caccgtatgg
agataattcg tataaagtca taccgttgtc 7800accggtagca acttcagtgt acaagtcttc
aacacctgga gtcttttcga tagctaaagt 7860gttcttcaag attttatcgg tgacattacc
cataccggtt tcagcaacag ctctggtagc 7920aatggtttcg atttctgggt ataaagcttc
tctgatggcc ttgacaacgt ttcttctttc 7980ttccaaagat ttttccttgt aacagttttg
agcaatgacg gcagcttgga cagcttcatc 8040gacggtatcg aaaacaccgg acttggcacc
ttgggtggtg gtcttggttg gaacttcctt 8100ttgttcagcc aatttttcca acaaaacctt
cttgactaat tgttccaatt ccaaagattc 8160cattttttac tagttctaga atccgtcgaa
actaagttct ggtgttttaa aactaaaaaa 8220aagactaact ataaaagtag aatttaagaa
gtttaagaaa tagatttaca gaattacaat 8280caatacctac cgtctttata tacttattag
tcaagtaggg gaataatttc agggaactgg 8340tttcaacctt ttttttcagc tttttccaaa
tcagagagag cagaaggtaa tagaaggtgt 8400aagaaaatga gatagataca tgcgtgggtc
aattgccttg tgtcatcatt tactccaggc 8460aggttgcatc actccattga ggttgtgccc
gttttttgcc tgtttgtgcc cctgttctct 8520gtagttgcgc taagagaatg gacctatgaa
ctgatggttg gtgaagaaaa caatattttg 8580gtgctgggat tctttttttt tctggatgcc
agcttaaaaa gcgggctcca ttatatttag 8640tggatgccag gaataaactg ttcacccaga
cacctacgat gttatatatt ctgtgtaacc 8700cgccccctat tttgggcatg tacgggttac
agcagaatta aaaggctaat tttttgacta 8760aataaagtta ggaaaatcac tactattaat
tatttacgta ttctttgaaa tggcgagtat 8820tgataatgat aaactgagct ccagcttttg
ttccctttag tgagggttaa ttgcgcgctt 8880ggcgtaatca tggtcatagc tgtttcctgt
gtgaaattgt tatccgctca caattccaca 8940caacatagga gccggaagca taaagtgtaa
agcctggggt gcctaatgag tgaggtaact 9000cacattaatt gcgttgcgct cactgcccgc
tttccagtcg ggaaacctgt cgtgccagct 9060gcattaatga atcggccaac gcgcggggag
aggcggtttg cgtattgggc gctcttccgc 9120ttcctcgctc actgactcgc tgcgctcggt
cgttcggctg cggcgagcgg tatcagctca 9180ctcaaaggcg gtaatacggt tatccacaga
atcaggggat aacgcaggaa agaacatgtg 9240agcaaaaggc cagcaaaagg ccaggaaccg
taaaaaggcc gcgttgctgg cgtttttcca 9300taggctccgc ccccctgacg agcatcacaa
aaatcgacgc tcaagtcaga ggtggcgaaa 9360cccgacagga ctataaagat accaggcgtt
tccccctgga agctccctcg tgcgctctcc 9420tgttccgacc ctgccgctta ccggatacct
gtccgccttt ctcccttcgg gaagcgtggc 9480gctttctcat agctcacgct gtaggtatct
cagttcggtg taggtcgttc gctccaagct 9540gggctgtgtg cacgaacccc ccgttcagcc
cgaccgctgc gccttatccg gtaactatcg 9600tcttgagtcc aacccggtaa gacacgactt
atcgccactg gcagcagcca ctggtaacag 9660gattagcaga gcgaggtatg taggcggtgc
tacagagttc ttgaagtggt ggcctaacta 9720cggctacact agaaggacag tatttggtat
ctgcgctctg ctgaagccag ttaccttcgg 9780aaaaagagtt ggtagctctt gatccggcaa
acaaaccacc gctggtagcg gtggtttttt 9840tgtttgcaag cagcagatta cgcgcagaaa
aaaaggatct caagaagatc ctttgatctt 9900ttctacgggg tctgacgctc agtggaacga
aaactcacgt taagggattt tggtcatgag 9960attatcaaaa aggatcttca cctagatcct
tttaaattaa aaatgaagtt ttaaatcaat 10020ctaaagtata tatgagtaaa cttggtctga
cagttaccaa tgcttaatca gtgaggcacc 10080tatctcagcg atctgtctat ttcgttcatc
catagttgcc tgactccccg tcgtgtagat 10140aactacgata cgggagggct taccatctgg
ccccagtgct gcaatgatac cgcgagaccc 10200acgctcaccg gctccagatt tatcagcaat
aaaccagcca gccggaaggg ccgagcgcag 10260aagtggtcct gcaactttat ccgcctccat
ccagtctatt aattgttgcc gggaagctag 10320agtaagtagt tcgccagtta atagtttgcg
caacgttgtt gccattgcta caggcatcgt 10380ggtgtcacgc tcgtcgtttg gtatggcttc
attcagctcc ggttcccaac gatcaaggcg 10440agttacatga tcccccatgt tgtgcaaaaa
agcggttagc tccttcggtc ctccgatcgt 10500tgtcagaagt aagttggccg cagtgttatc
actcatggtt atggcagcac tgcataattc 10560tcttactgtc atgccatccg taagatgctt
ttctgtgact ggtgagtact caaccaagtc 10620attctgagaa tagtgtatgc ggcgaccgag
ttgctcttgc ccggcgtcaa tacgggataa 10680taccgcgcca catagcagaa ctttaaaagt
gctcatcatt ggaaaacgtt cttcggggcg 10740aaaactctca aggatcttac cgctgttgag
atccagttcg atgtaaccca ctcgtgcacc 10800caactgatct tcagcatctt ttactttcac
cagcgtttct gggtgagcaa aaacaggaag 10860gcaaaatgcc gcaaaaaagg gaataagggc
gacacggaaa tgttgaatac tcatactctt 10920cctttttcaa tattattgaa gcatttatca
gggttattgt ctcatgagcg gatacatatt 10980tgaatgtatt tagaaaaata aacaaatagg
ggttccgcgc acatttcccc gaaaagtgcc 11040acctgacgtc taagaaacca ttattatcat
gacattaacc tataaaaata ggcgtatcac 11100gaggcccttt cgtc
11114
48 2613 DNA Entamoeba histolytica
48 atgtcaacac aacaaactat gactgtagat gaacatatta atcaacttgt tgctaaagca
60caagttgcac ttaaagaata tcttaaacca gaatatacac aagaaaaaat agattatatt
120gtaaagaaag catcagttgc agcacttgat caacattgtg cacttgcagc agctgcagtt
180gaagaaacag gaagaggtat ttttgaagat aaagctacta aaaatatatt tgcatgtgaa
240catgttacac atgaaatgag acatgctaaa acagttggta ttattaatgt agatccactt
300tatggaatta cagaaattgc agaaccagtt ggagttgttt gtggagttac accagttact
360aatccaacat caacagctat tttcaagtca cttatttcaa ttaaaacaag aaatccaatt
420gtattttcat tccatccatc agcacttaaa tgttctatta tggcagctaa aattgttaga
480gatgcagcta ttgcagcagg agcaccagaa aattgtattc aatggattga atttggagga
540attgaagcat caaataaatt aatgaatcat ccaggagttg ctactattct tgctacagga
600ggaaatgcta tggttaaagc agcatattca tcaggaaaac cagcacttgg agtaggagca
660ggaaatgtac caacatatat tgaaaaaaca tgtaatatta aacaagcagc aaatgatgta
720gttatgtcaa aatcatttga taatggtatg atttgtgcat cagaacaagc agcaattatt
780gataaagaaa tttatgatca agtagttgaa gaaatgaaaa cacttggagc atatttcatt
840aatgaagaag aaaaagctaa attagaaaag tttatgtttg gagttaatgc atattcagca
900gatgttaata atgcaagact taatccaaaa tgtccaggta tgtcaccaca atggtttgct
960gaacaagttg gaattaaagt tccagaagat tgtaatatta tttgtgcagt ttgtaaagaa
1020gttggaccaa atgaaccatt aacaagagaa aaattatcac cagttcttgc tattcttaaa
1080gcagaaaata cacaagatgg tattgataaa gctgaagcta tggttgaatt taatggtaga
1140ggacattcag cagctattca ttcgaatgat aaagcagtag ttgaaaagta tgcacttaca
1200atgaaagcat gcagaatttt acataataca ccatcatcac aaggaggaat tggatcaatt
1260tataactata tttggccatc atttacactt ggatgtggat catatggagg aaattcggta
1320tcagctaatg ttacatatca taatttatta aatattaaaa gacttgcaga tagaagaaac
1380aaccttcaat ggttcagagt tccaccaaag attttctttg aaccacattc tattagatat
1440cttgctgaac ttaaggaact tagtaaaata ttcattgttt cagatagaat gatgtataaa
1500ttaggatatg tagatagagt tatggatgta ttgaaaagaa gaagtaatga agtagaaatt
1560gaaattttca ttgatgtaga accagatcca tctattcaaa ccgttcaaaa aggacttgct
1620gttatgaata catttggacc agataatatt attgctattg gaggaggatc agctatggat
1680gcagctaaga ttatgtggtt actttatgaa catccagaag ccgatttctt tgcaatgaaa
1740caaaaattca ttgatcttag aaagagagca tttaaattcc caacaatggg taagaaagct
1800agattaattt gtattccaac aacatcagga actggatcag aagttacacc atttgcagtt
1860atttcagatc atgaaacagg taagaaatat ccacttgctg attattcact tacaccatca
1920gttgctattg ttgatccaat gtttactatg tcacttccaa agagagctat tgctgatact
1980ggacttgatg tattggttca tgcaacagaa gcatatgttt cagttatggc taatgaatat
2040actgatggac ttgctagaga agcagttaaa ttagtctttg aaaatcttct taaatcatat
2100aatggagatt tagaagcaag agaaaagatg cacaatgctg caacaattgc aggtatggca
2160tttgcatcag cattccttgg tatggaccat tccatggcac ataaagttgg agcagcattc
2220catcttccac atggtagatg tgtagcagta ttattaccac atgtcattag atataatgga
2280caaaaaccaa gaaagcttgc aatgtggcca aaatataatt tctataaggc agaccaaaga
2340tatatggaac ttgcacaaat ggttggactt aaatgtaata caccagctga aggagttgaa
2400gcatttgcta aagcatgtga agaattaatg aaagccacag agactattac tggattcaag
2460caagcaaata ttgatgaagc agcatggatg agtaaagtac cagaaatggc acttcttgca
2520tttgaagatc aatgttcacc agctaatcca agagtcccaa tggttaagga tatggaaaag
2580attctcaaag ctgcatatta tccaattgct tga
2613
49 2610 DNA Artificial sequence adh2 E. histolytica codon pair optimised
49 atgtccactc aacaaaccat gaccgttgat gaacacatta accaattggt cagaaaggct
60caagttgctt tgaaggaata cttgaaacca gaatacactc aagaaaagat cgattacatt
120gtcaagaagg cttctgttgc tgctctagac caacactgtg ctttggctgc tgctgctgtc
180gaagaaactg gtcgtggtat ctttgaagac aaagctacca agaacatttt cgcttgtgaa
240cacgtcactc acgaaatgag acacgccaag accgttggta tcatcaacgt tgatccatta
300tacggtatca ctgaaattgc tgaaccagtc ggtgttgtct gtggtgtcac cccagttacc
360aacccaactt ctactgccat tttcaaatct ttgatttcca tcaagaccag aaacccaatt
420gttttctcct tccacccatc tgctttgaaa tgttccatca tggctgccaa gatcgtcaga
480gatgctgcca ttgctgctgg tgctccagaa aactgtatcc aatggatcga atttggtggt
540attgaagctt ccaacaaatt gatgaaccat cctggtgttg ctaccatctt agctactggt
600ggtaacgcta tggtcaaggc tgcttactct tctggtaagc cagctttggg tgtcggtgct
660ggtaacgtcc caacttacat cgaaaagacc tgtaatatca agcaagctgc taacgatgtt
720gtcatgtcca agtctttcga caacggtatg atctgtgcct ccgaacaagc tgccatcatc
780gacaaagaaa tctacgacca agttgttgaa gaaatgaaga ctttgggtgc ttacttcatc
840aacgaagaag aaaaggccaa attggaaaaa ttcatgttcg gtgttaatgc ttactctgct
900gatgtcaaca acgccagatt gaacccaaag tgtccaggta tgtctccaca atggttcgct
960gaacaagtcg gtatcaaggt tccagaagac tgtaacatca tctgtgccgt ttgtaaggaa
1020gttggtccaa acgaaccatt gaccagagaa aagttgtctc cagttttggc cattttgaag
1080gctgaaaaca ctcaagatgg tattgacaag gctgaagcta tggtcgaatt caacggtcgt
1140ggtcactctg ctgccattca ctccaatgac aaggctgttg ttgaaaaata cgctttgacc
1200atgaaggctt gtcgtatctt gcacaacact ccatcttctc aaggtggtat cggttccatt
1260tacaactaca tctggccatc tttcacttta ggttgtggtt cttacggtgg taactccgtt
1320tctgccaatg ttacctacca caacttgttg aacatcaaga gattggctga cagaagaaac
1380aacttacaat ggttcagagt cccaccaaag atcttcttcg aacctcactc cattagatac
1440ttggctgaat tgaaggaatt gtccaagatt ttcattgtct ctgacagaat gatgtacaaa
1500ttgggttacg ttgacagagt tatggatgtc ttgaagagaa gatccaacga agttgaaatt
1560gaaatcttca tcgatgttga accagaccca tccattcaaa ccgtccaaaa gggtttggct
1620gtcatgaaca ctttcggtcc agacaacatc attgccattg gtggtggttc tgccatggat
1680gctgccaaga tcatgtggtt attatacgaa catccagaag ctgatttctt cgctatgaag
1740caaaaattca tcgatttaag aaagagagct ttcaagttcc caaccatggg taagaaggcc
1800agattaatct gtatcccaac cacttctggt accggttctg aagtcacccc attcgctgtc
1860atctctgacc acgaaactgg taagaagtat ccattggctg actactcttt gaccccatcc
1920gttgccattg ttgacccaat gtttaccatg tccttgccta agagagccat tgctgacact
1980ggtttggatg tcttagtcca cgctactgaa gcttacgttt ctgttatggc taacgaatac
2040actgacggtt tggccagaga agctgtcaaa ttggttttcg aaaacttgtt gaaatcttac
2100aacggtgact tggaagctcg tgaaaagatg cacaacgctg ctaccattgc tggtatggcc
2160tttgcttctg ctttcttggg tatggaccat tccatggctc acaaggtcgg tgctgctttc
2220catttgccac acggtagatg tgttgccgtt ttgttgcctc acgttatcag atacaacggt
2280caaaagccaa gaaagttggc catgtggcca aagtacaact tctacaaggc tgatcaaaga
2340tacatggaat tggctcaaat ggtcggtttg aagtgtaaca ccccagctga aggtgtcgaa
2400gcctttgcca aggcttgtga agaattgatg aaggctactg aaaccatcac tggtttcaag
2460aaggccaaca ttgatgaagc tgcttggatg tccaaggttc cagaaatggc tctattggct
2520ttcgaagacc aatgttctcc agctaaccca agagtcccaa tggttaagga catggaaaag
2580attttgaagg ctgcttacta cccaatcgct
2610
50 870 PRT Entamoeba histolytica
50 Met Ser Thr Gln Gln Thr Met Thr Val
Asp Glu His Ile Asn Gln Leu1 5 10
15Val Arg Lys Ala Gln Val Ala Leu Lys Glu Tyr Leu Lys Pro Glu
Tyr 20 25 30Thr Gln Glu Lys
Ile Asp Tyr Ile Val Lys Lys Ala Ser Val Ala Ala 35
40 45Leu Asp Gln His Cys Ala Leu Ala Ala Ala Ala Val
Glu Glu Thr Gly 50 55 60Arg Gly Ile
Phe Glu Asp Lys Ala Thr Lys Asn Ile Phe Ala Cys Glu65 70
75 80His Val Thr His Glu Met Arg His
Ala Lys Thr Val Gly Ile Ile Asn 85 90
95Val Asp Pro Leu Tyr Gly Ile Thr Glu Ile Ala Glu Pro Val
Gly Val 100 105 110Val Cys Gly
Val Thr Pro Val Thr Asn Pro Thr Ser Thr Ala Ile Phe 115
120 125Lys Ser Leu Ile Ser Ile Lys Thr Arg Asn Pro
Ile Val Phe Ser Phe 130 135 140His Pro
Ser Ala Leu Lys Cys Ser Ile Met Ala Ala Lys Ile Val Arg145
150 155 160Asp Ala Ala Ile Ala Ala Gly
Ala Pro Glu Asn Cys Ile Gln Trp Ile 165
170 175Glu Phe Gly Gly Ile Glu Ala Ser Asn Lys Leu Met
Asn His Pro Gly 180 185 190Val
Ala Thr Ile Leu Ala Thr Gly Gly Asn Ala Met Val Lys Ala Ala 195
200 205Tyr Ser Ser Gly Lys Pro Ala Leu Gly
Val Gly Ala Gly Asn Val Pro 210 215
220Thr Tyr Ile Glu Lys Thr Cys Asn Ile Lys Gln Ala Ala Asn Asp Val225
230 235 240Val Met Ser Lys
Ser Phe Asp Asn Gly Met Ile Cys Ala Ser Glu Gln 245
250 255Ala Ala Ile Ile Asp Lys Glu Ile Tyr Asp
Gln Val Val Glu Glu Met 260 265
270Lys Thr Leu Gly Ala Tyr Phe Ile Asn Glu Glu Glu Lys Ala Lys Leu
275 280 285Glu Lys Phe Met Phe Gly Val
Asn Ala Tyr Ser Ala Asp Val Asn Asn 290 295
300Ala Arg Leu Asn Pro Lys Cys Pro Gly Met Ser Pro Gln Trp Phe
Ala305 310 315 320Glu Gln
Val Gly Ile Lys Val Pro Glu Asp Cys Asn Ile Ile Cys Ala
325 330 335Val Cys Lys Glu Val Gly Pro
Asn Glu Pro Leu Thr Arg Glu Lys Leu 340 345
350Ser Pro Val Leu Ala Ile Leu Lys Ala Glu Asn Thr Gln Asp
Gly Ile 355 360 365Asp Lys Ala Glu
Ala Met Val Glu Phe Asn Gly Arg Gly His Ser Ala 370
375 380Ala Ile His Ser Asn Asp Lys Ala Val Val Glu Lys
Tyr Ala Leu Thr385 390 395
400Met Lys Ala Cys Arg Ile Leu His Asn Thr Pro Ser Ser Gln Gly Gly
405 410 415Ile Gly Ser Ile Tyr
Asn Tyr Ile Trp Pro Ser Phe Thr Leu Gly Cys 420
425 430Gly Ser Tyr Gly Gly Asn Ser Val Ser Ala Asn Val
Thr Tyr His Asn 435 440 445Leu Leu
Asn Ile Lys Arg Leu Ala Asp Arg Arg Asn Asn Leu Gln Trp 450
455 460Phe Arg Val Pro Pro Lys Ile Phe Phe Glu Pro
His Ser Ile Arg Tyr465 470 475
480Leu Ala Glu Leu Lys Glu Leu Ser Lys Ile Phe Ile Val Ser Asp Arg
485 490 495Met Met Tyr Lys
Leu Gly Tyr Val Asp Arg Val Met Asp Val Leu Lys 500
505 510Arg Arg Ser Asn Glu Val Glu Ile Glu Ile Phe
Ile Asp Val Glu Pro 515 520 525Asp
Pro Ser Ile Gln Thr Val Gln Lys Gly Leu Ala Val Met Asn Thr 530
535 540Phe Gly Pro Asp Asn Ile Ile Ala Ile Gly
Gly Gly Ser Ala Met Asp545 550 555
560Ala Ala Lys Ile Met Trp Leu Leu Tyr Glu His Pro Glu Ala Asp
Phe 565 570 575Phe Ala Met
Lys Gln Lys Phe Ile Asp Leu Arg Lys Arg Ala Phe Lys 580
585 590Phe Pro Thr Met Gly Lys Lys Ala Arg Leu
Ile Cys Ile Pro Thr Thr 595 600
605Ser Gly Thr Gly Ser Glu Val Thr Pro Phe Ala Val Ile Ser Asp His 610
615 620Glu Thr Gly Lys Lys Tyr Pro Leu
Ala Asp Tyr Ser Leu Thr Pro Ser625 630
635 640Val Ala Ile Val Asp Pro Met Phe Thr Met Ser Leu
Pro Lys Arg Ala 645 650
655Ile Ala Asp Thr Gly Leu Asp Val Leu Val His Ala Thr Glu Ala Tyr
660 665 670Val Ser Val Met Ala Asn
Glu Tyr Thr Asp Gly Leu Ala Arg Glu Ala 675 680
685Val Lys Leu Val Phe Glu Asn Leu Leu Lys Ser Tyr Asn Gly
Asp Leu 690 695 700Glu Ala Arg Glu Lys
Met His Asn Ala Ala Thr Ile Ala Gly Met Ala705 710
715 720Phe Ala Ser Ala Phe Leu Gly Met Asp His
Ser Met Ala His Lys Val 725 730
735Gly Ala Ala Phe His Leu Pro His Gly Arg Cys Val Ala Val Leu Leu
740 745 750Pro His Val Ile Arg
Tyr Asn Gly Gln Lys Pro Arg Lys Leu Ala Met 755
760 765Trp Pro Lys Tyr Asn Phe Tyr Lys Ala Asp Gln Arg
Tyr Met Glu Leu 770 775 780Ala Gln Met
Val Gly Leu Lys Cys Asn Thr Pro Ala Glu Gly Val Glu785
790 795 800Ala Phe Ala Lys Ala Cys Glu
Glu Leu Met Lys Ala Thr Glu Thr Ile 805
810 815Thr Gly Phe Lys Lys Ala Asn Ile Asp Glu Ala Ala
Trp Met Ser Lys 820 825 830Val
Pro Glu Met Ala Leu Leu Ala Phe Glu Asp Gln Cys Ser Pro Ala 835
840 845Asn Pro Arg Val Pro Met Val Lys Asp
Met Glu Lys Ile Leu Lys Ala 850 855
860Ala Tyr Tyr Pro Ile Ala865 870
51 2658 DNA Piromyces sp. E2
51 atgtccggat tacaaatgtt ccaaaacctt tctctttacg gtagtctcgc cgaaatcgat
60actagcgaaa agcttaacga agctatggac aaattaactg ctgcccaaga acaattcaga
120gaatacaacc aagaacaagt tgacaaaatc ttcaaggctg ttgctttagc tgcttctcaa
180aaccgtgttg ctttcgctaa gtacgcacac gaagaaaccc aaaagggtgt tttcgaagat
240aaggttatca agaacgaatt cgctgctgat tacatttacc acaagtactg caatgacaag
300accgccggta tcattgaata tgatgaagcc aatggtctta tggaaattgc tgaaccagtt
360ggtccagttg ttggtattgc tccagttact aacccaactt ctactatcat ctacaagtct
420ttaattgcct taaagacccg taactgtatt atcttctcac cacatccagg agctcacaag
480gcctctgttt tcgttgttaa ggtcttacac caagctgctg ttaaggctgg tgccccagaa
540aactgtattc aaatcatctt cccaaagatg gatttaacta ctgaattatt acaccaccaa
600aagactcgtt tcatttgggc tactggtggt ccaggtttag ttcacgcctc ttacacttct
660ggtaagccag ctcttggtgg tggtccaggt aatgctccag ctcttattga tgaaacttgt
720gatatgaacg aagctgttgg ttctatcgtt gtttctaaga ctttcgattg tggtatgatc
780tgtgccactg aaaacgctgt tgtcgttgtc gaatctgtct acgaaaactt cgttgctacc
840atgaagaagc gtggtgccta cttcatgact ccagaagaaa ccaagaaggc ttctaacctt
900cttttcggag aaggtatgag attaaatgct aaggctgttg gtcaaactgc caagacttta
960gctgaaatgg ccggtttcga agtcccagaa aacaccgttg ttctctgtgg tgaagcttct
1020gaagttaaat tcgaagaacc aatggctcac gaaaagttaa ctactatcct cggtatctac
1080aaggctaagg actttgacga tggtgtcaga ttatgtaagg aattagttac tttcggtggt
1140aagggtcaca ctgctgttct ctacaccaac caaaacaaca aggaccgtat tgaaaagtac
1200caaaacgaag ttccagcctt ccacatctta gttgacatgc catcttccct cggttgtatt
1260ggtgatatgt acaacttccg tcttgctcca gctcttacca ttacttgtgg tactatgggt
1320ggtggttcct cctctgataa cattggtcca aagcacttac ttaacatcaa gcgtgttggt
1380atgagacgcg aaaacatgct ttggttcaag attccaaagt ctgtctactt caagcgtgct
1440atcctttctg aagctttatc tgacttacgt gacacccaca agcgtgctat cattattacc
1500gatagaacta tgactatgtt aggtcaaact gacaagatca ttaaggcttg tgaaggtcat
1560ggtatggtct gcactgtcta cgataaggtt gtcccagatc caactatcaa gtgtattatg
1620gaaggtgtta atgaaatgaa cgtcttcaag ccagatttag ctattgctct tggtggtggt
1680tctgctatgg atgccgctaa gatgatgcgt ttattctacg aatacccaga ccaagactta
1740caagatattg ctactcgttt cgtcgatatc cgtaagcgtg ttgttggttg tccaaagctt
1800ggtagactta ttaagactct tgtctgtatc ccaactacct ctggtactgg tgccgaagtt
1860actccattcg ctgtcgttac ctctgaagaa ggtcgtaagt acccattagt cgactacgaa
1920cttactccag atatggctat tgttgatcca gaattcgctg ttggtatgcc aaagcgttta
1980acttcttgga ctggtattga tgctcttacc cacgccattg aatcttacgt ttctattatg
2040gctactgact tcactagacc atactctctc cgtgctgttg gtcttatctt cgaatccctt
2100tcccttgctt acaacaacgg taaggatatt gaagctcgtg aaaagatgca caatgcttct
2160gctattgctg gtatggcctt tgccaacgct ttccttggtt gttgtcactc tgttgctcac
2220caacttggtt ccgtctacca cattccacac ggtcttgcca acgctttaat gctttctcac
2280atcattaagt acaacgctac tgactctcca gttaagatgg gtaccttccc acaatacaag
2340tacccacaag ctatgcgtca ctacgctgaa attgctgaac tcttattacc accaactcaa
2400gttgttaaga tgactgatgt tgataaggtt caatacttaa ttgaccgtgt tgaacaatta
2460aaggctgacg ttggtattcc aaagtctatt aaggaaactg gaatggttac tgaagaagac
2520ttcttcaaca aggttgacca agttgctatc atggccttcg atgaccaatg tactggtgct
2580aacccacgtt acccattagt ttctgaatta aaacaattaa tgattgatgc ctggaacggt
2640gttgtcccaa agctctaa
2658
SEQ ID NO: 52, length 885, type PRT, organism Piromyces sp. E2
Met Ser Gly Leu Gln Met Phe Gln Asn Leu
Ser Leu Tyr Gly Ser Leu1 5 10
15Ala Glu Ile Asp Thr Ser Glu Lys Leu Asn Glu Ala Met Asp Lys Leu
20 25 30Thr Ala Ala Gln Glu Gln
Phe Arg Glu Tyr Asn Gln Glu Gln Val Asp 35 40
45Lys Ile Phe Lys Ala Val Ala Leu Ala Ala Ser Gln Asn Arg
Val Ala 50 55 60Phe Ala Lys Tyr Ala
His Glu Glu Thr Gln Lys Gly Val Phe Glu Asp65 70
75 80Lys Val Ile Lys Asn Glu Phe Ala Ala Asp
Tyr Ile Tyr His Lys Tyr 85 90
95Cys Asn Asp Lys Thr Ala Gly Ile Ile Glu Tyr Asp Glu Ala Asn Gly
100 105 110Leu Met Glu Ile Ala
Glu Pro Val Gly Pro Val Val Gly Ile Ala Pro 115
120 125Val Thr Asn Pro Thr Ser Thr Ile Ile Tyr Lys Ser
Leu Ile Ala Leu 130 135 140Lys Thr Arg
Asn Cys Ile Ile Phe Ser Pro His Pro Gly Ala His Lys145
150 155 160Ala Ser Val Phe Val Val Lys
Val Leu His Gln Ala Ala Val Lys Ala 165
170 175Gly Ala Pro Glu Asn Cys Ile Gln Ile Ile Phe Pro
Lys Met Asp Leu 180 185 190Thr
Thr Glu Leu Leu His His Gln Lys Thr Arg Phe Ile Trp Ala Thr 195
200 205Gly Gly Pro Gly Leu Val His Ala Ser
Tyr Thr Ser Gly Lys Pro Ala 210 215
220Leu Gly Gly Gly Pro Gly Asn Ala Pro Ala Leu Ile Asp Glu Thr Cys225
230 235 240Asp Met Asn Glu
Ala Val Gly Ser Ile Val Val Ser Lys Thr Phe Asp 245
250 255Cys Gly Met Ile Cys Ala Thr Glu Asn Ala
Val Val Val Val Glu Ser 260 265
270Val Tyr Glu Asn Phe Val Ala Thr Met Lys Lys Arg Gly Ala Tyr Phe
275 280 285Met Thr Pro Glu Glu Thr Lys
Lys Ala Ser Asn Leu Leu Phe Gly Glu 290 295
300Gly Met Arg Leu Asn Ala Lys Ala Val Gly Gln Thr Ala Lys Thr
Leu305 310 315 320Ala Glu
Met Ala Gly Phe Glu Val Pro Glu Asn Thr Val Val Leu Cys
325 330 335Gly Glu Ala Ser Glu Val Lys
Phe Glu Glu Pro Met Ala His Glu Lys 340 345
350Leu Thr Thr Ile Leu Gly Ile Tyr Lys Ala Lys Asp Phe Asp
Asp Gly 355 360 365Val Arg Leu Cys
Lys Glu Leu Val Thr Phe Gly Gly Lys Gly His Thr 370
375 380Ala Val Leu Tyr Thr Asn Gln Asn Asn Lys Asp Arg
Ile Glu Lys Tyr385 390 395
400Gln Asn Glu Val Pro Ala Phe His Ile Leu Val Asp Met Pro Ser Ser
405 410 415Leu Gly Cys Ile Gly
Asp Met Tyr Asn Phe Arg Leu Ala Pro Ala Leu 420
425 430Thr Ile Thr Cys Gly Thr Met Gly Gly Gly Ser Ser
Ser Asp Asn Ile 435 440 445Gly Pro
Lys His Leu Leu Asn Ile Lys Arg Val Gly Met Arg Arg Glu 450
455 460Asn Met Leu Trp Phe Lys Ile Pro Lys Ser Val
Tyr Phe Lys Arg Ala465 470 475
480Ile Leu Ser Glu Ala Leu Ser Asp Leu Arg Asp Thr His Lys Arg Ala
485 490 495Ile Ile Ile Thr
Asp Arg Thr Met Thr Met Leu Gly Gln Thr Asp Lys 500
505 510Ile Ile Lys Ala Cys Glu Gly His Gly Met Val
Cys Thr Val Tyr Asp 515 520 525Lys
Val Val Pro Asp Pro Thr Ile Lys Cys Ile Met Glu Gly Val Asn 530
535 540Glu Met Asn Val Phe Lys Pro Asp Leu Ala
Ile Ala Leu Gly Gly Gly545 550 555
560Ser Ala Met Asp Ala Ala Lys Met Met Arg Leu Phe Tyr Glu Tyr
Pro 565 570 575Asp Gln Asp
Leu Gln Asp Ile Ala Thr Arg Phe Val Asp Ile Arg Lys 580
585 590Arg Val Val Gly Cys Pro Lys Leu Gly Arg
Leu Ile Lys Thr Leu Val 595 600
605Cys Ile Pro Thr Thr Ser Gly Thr Gly Ala Glu Val Thr Pro Phe Ala 610
615 620Val Val Thr Ser Glu Glu Gly Arg
Lys Tyr Pro Leu Val Asp Tyr Glu625 630
635 640Leu Thr Pro Asp Met Ala Ile Val Asp Pro Glu Phe
Ala Val Gly Met 645 650
655Pro Lys Arg Leu Thr Ser Trp Thr Gly Ile Asp Ala Leu Thr His Ala
660 665 670Ile Glu Ser Tyr Val Ser
Ile Met Ala Thr Asp Phe Thr Arg Pro Tyr 675 680
685Ser Leu Arg Ala Val Gly Leu Ile Phe Glu Ser Leu Ser Leu
Ala Tyr 690 695 700Asn Asn Gly Lys Asp
Ile Glu Ala Arg Glu Lys Met His Asn Ala Ser705 710
715 720Ala Ile Ala Gly Met Ala Phe Ala Asn Ala
Phe Leu Gly Cys Cys His 725 730
735Ser Val Ala His Gln Leu Gly Ser Val Tyr His Ile Pro His Gly Leu
740 745 750Ala Asn Ala Leu Met
Leu Ser His Ile Ile Lys Tyr Asn Ala Thr Asp 755
760 765Ser Pro Val Lys Met Gly Thr Phe Pro Gln Tyr Lys
Tyr Pro Gln Ala 770 775 780Met Arg His
Tyr Ala Glu Ile Ala Glu Leu Leu Leu Pro Pro Thr Gln785
790 795 800Val Val Lys Met Thr Asp Val
Asp Lys Val Gln Tyr Leu Ile Asp Arg 805
810 815Val Glu Gln Leu Lys Ala Asp Val Gly Ile Pro Lys
Ser Ile Lys Glu 820 825 830Thr
Gly Met Val Thr Glu Glu Asp Phe Phe Asn Lys Val Asp Gln Val 835
840 845Ala Ile Met Ala Phe Asp Asp Gln Cys
Thr Gly Ala Asn Pro Arg Tyr 850 855
860Pro Leu Val Ser Glu Leu Lys Gln Leu Met Ile Asp Ala Trp Asn Gly865
870 875 880Val Val Pro Lys
Leu 885