Patent application title: YEAST CELLS HAVING REDUCTIVE TCA PATHWAY FROM PYRUVATE TO SUCCINATE AND OVEREXPRESSING AN EXOGENOUS NAD(P)+ TRANSHYDROGENASE ENZYME
Inventors:
Arlene M. Fosmer (Eden Prairie, MN, US)
Vernon L. Mcintosh, Jr. (Minneapolis, MN, US)
Thomas W. Mcmullin (Minneapolis, MN, US)
Gregory M. Poynter (St. Paul, MN, US)
Brian J. Rush (Minneapolis, MN, US)
Kevin T. Watts (Minneapolis, MN, US)
Assignees:
Cargill, Incorporated
IPC8 Class: AC12P752FI
Publication date: 2021-12-09
Patent application number: 20210381011
Abstract:
Yeast cells having a reductive TCA pathway from pyruvate or
phosphoenolpyruvate to succinate, and which include at least one
exogenous gene overexpressing an enzyme in that pathway, further contain
an exogenous transhydrogenase gene.
Claims:
1. (canceled)
2. The process of claim 60, wherein the recombinant yeast cell has integrated into its genome an exogenous NAD(P)+ transhydrogenase gene that encodes for the overexpressed NAD(P)+ transhydrogenase enzyme.
3.-4. (canceled)
5. The process of claim 2, wherein the exogenous NAD(P)+ transhydrogenase enzyme has an amino acid sequence at least 80% identical to any of SEQ ID NOs: 117, 118, 119 or 146.
6. (canceled)
7. The process of claim 60 wherein the active reductive TCA pathway includes a step of converting pyruvate or phosphoenolpyruvate to oxaloacetate, a step of converting oxaloacetate to malate, a step of converting malate to fumarate, and a step of converting fumarate to succinate.
8. The process of claim 60, wherein the recombinant yeast cell has integrated into its genome at least one exogenous malate dehydrogenase gene which encodes for an enzyme that catalyzes the conversion of oxaloacetate to malate.
9. The process of claim 8, wherein the exogenous malate dehydrogenase gene is NADH-dependent.
10. The process of claim 8, wherein the exogenous malate dehydrogenase gene encodes for an enzyme having an amino acid sequence at least 80% identical to any one of SEQ ID NOs: 98, 99, 100, 101, 102, 103, 104, 105, 106 or 128.
11.-13. (canceled)
14. The process of claim 60, wherein the recombinant yeast cell has integrated into its genome at least one exogenous fumarate reductase gene which encodes an enzyme which catalyzes the conversion of fumarate to succinate.
15. The process of claim 14 wherein the exogenous fumarate reductase gene is NADH-dependent.
16. The process of claim 14, wherein the exogenous fumarate reductase gene encodes for an enzyme having an amino acid sequence at least 80% identical to any one of SEQ ID NOs: 108, 109, 110, 111, 112, 113, 114 or 82.
17.-19. (canceled)
20. The process of claim 60, wherein the recombinant yeast cell overexpresses at least one enzyme which catalyzes a reaction that includes the reduction of NADP+ to NADPH.
21.-38. (canceled)
39. The process of claim 60, wherein the recombinant yeast cell has a deletion or disruption of a native pyruvate decarboxylase gene.
40. (canceled)
41. The process of claim 60, wherein the recombinant yeast cell host cell is selected from C. sonorensis, K. marxianus, K. thermotolerans, C. methanesorbosa, S. bulderi, I. orientalis, C. lambica, C. sorboxylosa, C. zemplinina, C. geochares, P. membranifaciens, Z. kombuchaensis, C. sorbosivorans, C. vanderwaltii, C. sorbophila, Z. bisporus, Z. lentus, S. bayanus, D. castellii, C. boidinii, C. etchellsii, K. lactis, P. jadinii, P. anomala, Saccharomyces cerevisiae and Saccharomycopsis crataegensis.
42. The process of claim 60, wherein the recombinant yeast cell host cell is selected from Issatchenkia orientalis, Pichia galeiformis, Pichia sp. YB-4149 (NRRL designation), Candida ethanolica, P. deserticola, P. membranifaciens and P. fermentans.
43.-53. (canceled)
54. The process of claim 60, wherein the recombinant yeast cell host cell is I. orientalis.
55.-59. (canceled)
60. A process for producing succinate or a succinate metabolization product, comprising culturing a recombinant yeast cell under fermentation conditions in a fermentation broth that includes a sugar that is fermentable by the cell, wherein the recombinant yeast cell has an active reductive TCA pathway from pyruvate or phosphoenolpyruvate to succinate and overexpresses a NAD(P)+ transhydrogenase enzyme and which further has integrated into its genome one or more of (i) an exogenous pyruvate carboxylase gene that encodes for an enzyme which catalyzes the conversion of pyruvate to oxaloacetate, (ii) an exogenous malate dehydrogenase gene which encodes for an enzyme that catalyzes the conversion of oxaloacetate to malate, (iii) an exogenous fumarase gene that encodes for an enzyme which catalyzes the conversion of malate to fumarate and (iv) an exogenous fumarate reductase gene which encodes for an enzyme which catalyzes the conversion of fumarate to succinate.
61. The process of claim 60, wherein the recombinant yeast cell produces succinate and transports succinate out of the cell.
62. The process of claim 60, wherein the recombinant yeast cell further metabolizes succinate to one or more succinate metabolization products, and the recombinant cell transports at least one of said succinate metabolization products out of the cell.
63. The process of claim 62, wherein the succinate metabolization product is one or more of 1,4-butanediol, 1,3-butadiene, propionic acid, and 3-hydroxyisobutyric acid.
64. The process of claim 60, wherein the NAD(P)+ transhydrogenase enzyme is soluble in the cytosol of the recombinant yeast cell.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a Continuation of U.S. patent application Ser. No. 15/816,779, filed Nov. 17, 2017, which is a Divisional of U.S. patent application Ser. No. 14/416,633, filed Jan. 22, 2015, which is a national phase application of International Application No. PCT/US2013/052069, filed Jul. 25, 2013, which claims the benefit of U.S. Provisional Patent Application No. 61/675,788, filed Jul. 25, 2012, each of which is hereby incorporated by reference in its entirety.
INCORPORATION BY REFERENCE OF SEQUENCE LISTING
[0002] A Sequence Listing is provided herewith as a text file, "N00188_ST25.text" created on Aug. 5, 2021 and having a size of 518 KB. The contents of the text file are incorporated by reference herein in their entirety.
[0003] This invention relates to recombinant yeast having an active reductive TCA pathway from pyruvate to succinate. The inventions disclosed and claimed herein were made pursuant to a joint research agreement between Cargill Incorporated, Wayzata, Minn., US, and BioAmber S.A.S, Bazancourt, France.
[0004] Succinic acid is a chemical intermediate useful as a precursor for making compounds such as 1,4-butanediol, tetrahydrofuran and gamma-butyrolactone. It is also a useful diacid that can be polymerized with a polyol to make polyester resins. Succinic acid can be produced industrially from butane. However, butane is a petrochemical, and there is a strong desire to develop processes for making many chemical compounds from annually renewable resources such as plant or animal feedstocks.
[0005] Some microorganisms have evolved the ability to produce succinate from carbohydrate feedstocks. In some cases, these strains have been engineered to improve yield and/or productivity. WO 2007/061590 describes recombinant yeast cells that produce succinate. Some yeast species are of interest as candidates for succinic acid-producing fermentations because they are resistant to low pH conditions, and so can produce acidic fermentation products at a low pH at which the product acid exists mainly in the acid form rather than in the salt form. Producing the acid directly in the acid form simplifies recovery and purification, as salt splitting, with its attendant requirements for raw materials, capital, operating and disposal costs, can be reduced if not eliminated.
[0006] There are three primary fermentation pathways by which a microorganism can produce succinate: oxidative tricarboxylic acid (TCA), glyoxylate shunt, and reductive TCA. The oxidative TCA pathway begins with the conversion of oxaloacetate (OAA) and acetyl-CoA to citrate. OAA can be generated from carboxylation of phosphoenolpyruvate (PEP) or pyruvate, while acetyl-CoA is generated from the decarboxylation of pyruvate by pyruvate dehydrogenase (PDH) or pyruvate formate lyase (PFL). Citrate is converted to isocitrate, isocitrate is converted to α-ketoglutarate, α-ketoglutarate is converted to succinyl-CoA, and succinyl-CoA is converted to succinate.
[0007] Like the oxidative TCA pathway, the glyoxylate shunt pathway begins with the generation of citrate from OAA and acetyl-CoA and the conversion of citrate to isocitrate. Isocitrate is converted to glyoxylate and succinate. Glyoxylate is condensed with acetyl-CoA to form malate, and the resultant malate is converted to succinate via a fumarate intermediate.
[0008] The reductive TCA pathway begins with carboxylation of phosphoenolpyruvate (PEP) or pyruvate to oxaloacetate (OAA) (by PEP carboxylase (PPC) and pyruvate carboxylase (PYC), respectively). OAA is converted to malate by malate dehydrogenase (MDH), malate is converted to fumarate by fumarase (FUM, also known as fumarate hydratase), and fumarate is converted to succinate by fumarate reductase (FRD). The reductive TCA pathway provides the highest succinate yield of the three succinate fermentation pathways, per mole of glucose consumed, and for that reason offers the best economic potential.
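The sequence of conversions described in paragraph [0008] can be written down as a small lookup table and traced step by step. The following is only an illustrative sketch: the table layout and the function name `trace_pathway` are assumptions made for this example, while the metabolites and enzyme abbreviations are those given in the text.

```python
# Illustrative map of the reductive TCA pathway: substrate -> (enzyme, product).
# Enzyme abbreviations follow the text; the data structure itself is hypothetical.
REDUCTIVE_TCA = {
    "phosphoenolpyruvate": ("PPC", "oxaloacetate"),
    "pyruvate": ("PYC", "oxaloacetate"),
    "oxaloacetate": ("MDH", "malate"),
    "malate": ("FUM", "fumarate"),
    "fumarate": ("FRD", "succinate"),
}


def trace_pathway(start, end="succinate"):
    """Return the (substrate, enzyme, product) steps leading from start to end."""
    steps = []
    current = start
    while current != end:
        if current not in REDUCTIVE_TCA:
            raise KeyError(f"no step defined from {current!r}")
        enzyme, product = REDUCTIVE_TCA[current]
        steps.append((current, enzyme, product))
        current = product
    return steps


for substrate, enzyme, product in trace_pathway("pyruvate"):
    print(f"{substrate} --{enzyme}--> {product}")
```

Starting the trace from "phosphoenolpyruvate" instead of "pyruvate" changes only the first step (PPC in place of PYC).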
[0009] A problem with the reductive TCA pathway is that the MDH enzyme consumes NADH as a cofactor. In addition, certain efficient FRD enzymes also consume NADH. Examples of such NADH-dependent FRD enzymes are described, for example, in WO 2009/065778 and PCT/US2011/022612. Thus, certain efficient metabolic pathways from pyruvate to succinate consume two molecules of NADH. One molecule of NADH is produced when sugars such as glucose are metabolized to pyruvate via the glycolytic pathway, but this still leaves a net deficit of one NADH, which results in a redox imbalance. A living cell must correct this redox imbalance if it is to remain healthy and continue to metabolize through the reductive TCA pathway. This typically means that the cell must balance the net NADH consumption by replacing the consumed NADH from other metabolic processes that produce NADH. For example, the reductive TCA pathway can be combined with one or both of the oxidative TCA or glyoxylate shunt pathways to help with the redox balance, but the oxidative TCA and glyoxylate shunt pathways produce less succinic acid per mole of starting sugar, and taking this approach therefore results in a loss of yield. It is possible for the cell to use one or more unrelated pathways to produce the needed NADH, but this can have adverse consequences for cell health and productivity, and may create other imbalances within the cell.
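The cofactor bookkeeping in paragraph [0009] can be sketched as a short calculation. This is an illustration only: the dictionary and function names are assumptions made for this example, and the per-step values simply restate the counts given in the text (one NADH produced per pyruvate formed in glycolysis, one consumed by MDH, and one consumed by an NADH-dependent FRD).

```python
# Illustrative NADH bookkeeping per molecule of pyruvate routed to succinate.
# Values restate the counts in the text; the names are hypothetical.
NADH_CHANGE = {
    "glycolysis (sugar -> pyruvate)": +1,
    "MDH (OAA -> malate)": -1,
    "FRD (fumarate -> succinate)": -1,
}


def net_nadh(steps=NADH_CHANGE):
    """Return net NADH produced (positive) or consumed (negative) over the steps."""
    return sum(steps.values())


print(net_nadh())  # -1, i.e. a deficit of one NADH per succinate
```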
[0010] Therefore, there remains a desire to provide a yeast strain that efficiently produces succinic acid (or its salts).
[0011] In one aspect, this invention is a recombinant yeast cell having an active reductive TCA metabolic pathway from pyruvate to succinate and which further overexpresses a NAD(P)+ transhydrogenase enzyme.
[0012] In particular embodiments, the yeast cell of the invention has integrated into its genome at least one exogenous NAD(P)+ transhydrogenase gene that encodes for the NAD(P)+ transhydrogenase enzyme.
[0013] In other particular embodiments, the recombinant yeast cell of the invention (a) expresses an NADPH-dependent malate dehydrogenase enzyme, (b) has at least one exogenous NADPH-dependent malate dehydrogenase gene integrated into its genome, (c) expresses an NADPH-dependent fumarate reductase enzyme, (d) has at least one exogenous NADPH-dependent fumarate reductase gene integrated into its genome or (e) has a combination of any two or more of (a), (b), (c) and (d).
[0014] The recombinant yeast cell of the invention in some embodiments has integrated into its genome one or more of (i) an exogenous pyruvate carboxylase gene that encodes for an enzyme which catalyzes the conversion of pyruvate to oxaloacetate, (ii) an exogenous malate dehydrogenase gene which encodes for an enzyme that catalyzes the conversion of oxaloacetate to malate, (iii) an exogenous fumarase gene that encodes for an enzyme which catalyzes the conversion of malate to fumarate and (iv) an exogenous fumarate reductase gene that encodes an enzyme which catalyzes the conversion of fumarate to succinate. In some embodiments, the recombinant cell of the invention has integrated into its genome one or more of (i) a non-native pyruvate carboxylase gene that encodes for an enzyme which catalyzes the conversion of pyruvate to oxaloacetate, (ii) a non-native malate dehydrogenase gene which encodes for an enzyme that catalyzes the conversion of oxaloacetate to malate, (iii) a non-native exogenous fumarase gene that encodes for an enzyme which catalyzes the conversion of malate to fumarate and (iv) a non-native exogenous fumarate reductase gene which encodes an enzyme which catalyzes the conversion of fumarate to succinate.
[0015] In preferred embodiments, the recombinant cell of the invention has integrated into its genome at least one exogenous malate dehydrogenase gene which encodes for an NADH-dependent enzyme that catalyzes the conversion of oxaloacetate to malate. In other preferred embodiments, the recombinant cell of the invention has integrated into its genome at least one exogenous fumarate reductase gene which encodes for an NADH-dependent enzyme that catalyzes the conversion of fumarate to succinate. In especially preferred embodiments, the recombinant cell of the invention has both of these features.
[0016] In other specific embodiments, the recombinant cell of the invention overexpresses at least one enzyme which catalyzes a reaction that includes the reduction of NADP+ to NADPH. This reaction may be a reaction in the pentose phosphate pathway. The enzyme catalyzing that reaction may be, for example, a 6-phosphogluconate dehydrogenase (6PGDH) enzyme and/or a glucose 6-phosphate dehydrogenase (G6PDH) enzyme.
[0017] In still other specific embodiments, the recombinant cell of the invention overexpresses at least one Stb5p protein, and/or has at least one exogenous Stb5p gene (i.e., a gene that encodes for the Stb5p protein) integrated into its genome.
[0018] In still other specific embodiments, the recombinant cell of the invention has a deletion or disruption of a native phosphoglucose isomerase gene.
[0019] In the cells of any of the foregoing aspects of the invention, the NADH/NAD+ redox imbalance that is produced in the reductive TCA pathway to succinate is compensated for, at least in part, by converting NADPH formed in other cellular metabolic processes to NADH, which can be consumed in the succinate-producing pathway. This is a beneficial approach to solving the NADH/NAD+ redox imbalance, because yeast cells typically have, or can be easily engineered to have, active metabolic pathways that produce NADPH. A yeast cell's native pentose phosphate pathway is an example of a metabolic pathway that produces NADPH. Thus, NADPH can be produced in the cell by directing carbon flux through a pentose phosphate pathway, and all or a portion of the NADPH so produced can be converted to NADH by action of the overexpressed NAD(P)+ transhydrogenase enzyme. Some or all of the NADH so produced can alleviate or even eliminate the NADH/NAD+ redox imbalance that results from succinate production through the reductive TCA pathway.
[0020] NADPH production can be increased (relative to the wild-type host cell), for example, by increasing carbon flux through the pentose phosphate pathway and/or by overexpressing at least one enzyme (including an enzyme in the pentose phosphate pathway) which catalyzes a reaction that includes the reduction of NADP+ to NADPH. Again, the increased NADPH so produced can be converted to NADH by action of the NAD(P)+ transhydrogenase enzyme. As before, some or all of the NADH so produced can alleviate or even eliminate the NADH/NAD+ redox imbalance that results from succinate production through the reductive TCA pathway.
[0021] Thus, in some embodiments, the recombinant cell of the invention includes one or more genetic modifications that (1) increase flux through the pentose phosphate pathway and/or (2) overexpress one or more enzymes in the pentose phosphate pathway that catalyze a reaction that includes the reduction of NADP+ to NADPH. In certain embodiments, therefore, the recombinant cell of the invention also (a) overexpresses at least one Stb5p protein, (b) has at least one exogenous Stb5p gene integrated into its genome, (c) produces a severely reduced quantity of an active phosphoglucose isomerase (PGI) enzyme, (d) produces a PGI enzyme that has a severely reduced activity, (e) has a deletion or disruption of a native PGI gene, (f) overexpresses at least one 6-phosphogluconate dehydrogenase (6PGDH) enzyme, (g) has at least one exogenous 6PGDH gene integrated into its genome, (h) overexpresses at least one glucose-6-phosphate dehydrogenase (G6PDH) enzyme, (i) has at least one exogenous G6PDH gene integrated into its genome, or (j) has a combination of any two or more of (a)-(i).
[0022] The cell of the invention may produce succinate and transport it from the cell. In some embodiments, the cell may further metabolize some or all of the succinate into one or more other succinate metabolization products, and transport one or more of such succinate metabolization products from the cell. In such embodiments, the cell contains native or non-native metabolic pathways which perform the further metabolization of succinate into such succinate metabolization product(s).
[0023] In yet other aspects, the invention is a method of producing succinate or a metabolization product of succinate, comprising culturing a cell of any of the foregoing aspects in a fermentation medium that includes at least one carbon source. The cells of the invention are capable of producing succinate or metabolization products of succinate in high yields at commercially reasonable production rates.
[0024] The term "NADH-dependent" as used herein refers to the property of an enzyme to preferentially use NADH as the redox cofactor. An NADH-dependent enzyme has a higher specificity constant (k_cat/K_M) with the cofactor NADH than with other cofactors, including the cofactor NADPH, as determined by in vitro enzyme activity assays.
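The comparison of specificity constants in paragraph [0024] can be illustrated with a short sketch. The function names and the kinetic values used below are hypothetical and serve only to show the comparison; real values would come from in vitro enzyme activity assays.

```python
def specificity_constant(k_cat, k_m):
    """Return k_cat / K_M for one enzyme-cofactor pair."""
    if k_m <= 0:
        raise ValueError("K_M must be positive")
    return k_cat / k_m


def is_nadh_dependent(kinetics):
    """kinetics maps cofactor name -> (k_cat, K_M); must include 'NADH'.

    Returns True when the specificity constant with NADH is higher than
    with every other cofactor listed.
    """
    nadh = specificity_constant(*kinetics["NADH"])
    return all(
        nadh > specificity_constant(*pair)
        for cofactor, pair in kinetics.items()
        if cofactor != "NADH"
    )


# Hypothetical assay values, chosen only for illustration.
example = {"NADH": (100.0, 0.05), "NADPH": (10.0, 0.5)}
print(is_nadh_dependent(example))  # True
```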
[0025] For purposes of this application, "native" as used herein with regard to a metabolic pathway refers to a metabolic pathway that exists and is active in the wild-type host strain. Genetic material such as genes, promoters and terminators is "native" for purposes of this application if the genetic material has a sequence identical to (apart from individual-to-individual mutations which do not affect function) a genetic component that is present in the genome of the wild-type host cell (i.e., the exogenous genetic component is identical to an endogenous genetic component).
[0026] For purposes of this application, genetic material such as a gene, a promoter and a terminator is "endogenous" to a cell if it is (i) native to the cell, (ii) present at the same location as that genetic material is present in the wild-type cell and (iii) under the regulatory control of its native promoter and its native terminator.
[0027] For purposes of this application, genetic material such as genes, promoters and terminators is "exogenous" to a cell if it is (i) non-native to the cell and/or (ii) is native to the cell, but is present at a location different than where that genetic material is present in the wild-type cell and/or (iii) is under the regulatory control of a non-native promoter and/or non-native terminator. Extra copies of native genetic material are considered as "exogenous" for purposes of this invention, even if such extra copies are present at the same locus as that genetic material is present in the wild-type host strain.
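The definitions of "endogenous" and "exogenous" in paragraphs [0026] and [0027] amount to a small set of logical tests, sketched below. The class and field names are hypothetical. Treating an extra copy as never endogenous is an assumption made here so that the two categories stay mutually exclusive, consistent with the last sentence of paragraph [0027].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneticElement:
    """Hypothetical description of a gene, promoter or terminator in a cell."""
    native_sequence: bool       # sequence matches the wild-type host genome
    at_wild_type_locus: bool    # located where the wild-type cell has it
    native_promoter: bool       # controlled by its native promoter
    native_terminator: bool     # controlled by its native terminator
    extra_copy: bool = False    # additional copy beyond the wild-type copy


def is_endogenous(e):
    """All three conditions of [0026] hold and the element is not an extra copy."""
    return (
        e.native_sequence
        and e.at_wild_type_locus
        and e.native_promoter
        and e.native_terminator
        and not e.extra_copy
    )


def is_exogenous(e):
    """Any one condition of [0027] holds, or the element is an extra copy."""
    return (
        not e.native_sequence
        or not e.at_wild_type_locus
        or not e.native_promoter
        or not e.native_terminator
        or e.extra_copy
    )


wild_type_gene = GeneticElement(True, True, True, True)
print(is_endogenous(wild_type_gene), is_exogenous(wild_type_gene))
```

Under these definitions a native gene placed under a non-native promoter, or an extra copy of a native gene inserted at its own locus, is classified as exogenous.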
[0028] As used herein, the term "promoter" refers to an untranslated sequence located upstream (i.e., 5') to the translation start codon of a gene (generally a sequence of about 1 to 1500 base pairs (bp), preferably about 100 to 1000 bp and especially of about 200 to 1000 bp) which controls the start of transcription of the gene. The term "terminator" as used herein refers to an untranslated sequence located downstream (i.e., 3') to the translation finish codon of a gene (generally a sequence of about 1 to 1500 bp, preferably of about 100 to 1000 bp, and especially of about 200 to 500 bp) which controls the end of transcription of the gene. A promoter or terminator is "operatively linked" to a gene if its position in the genome relative to that of the gene is such that the promoter or terminator, as the case may be, performs its transcriptional control function.
[0029] "Identity" for nucleotide or amino acid sequences is for purposes of this invention calculated using BLAST (National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool) version 2.2.13 software with default parameters. A sequence having an identity score of XX % with regard to a reference sequence using the BLAST version 2.2.13 algorithm with default parameters is considered to be at least XX % identical or, equivalently, have XX % sequence identity to the reference sequence.
[0030] "Deletion or disruption" with regard to a gene means that either the entire coding region of the gene is eliminated (deletion) or the coding region of the gene, its promoter, and/or its terminator region is modified (such as by deletion, insertion, or mutation) such that the gene no longer produces an active enzyme, produces a severely reduced quantity (at least 75% reduction, preferably at least 85% reduction, more preferably at least 95% reduction) of the enzyme, or produces an enzyme with severely reduced (at least 75% reduced, preferably at least 85% reduced, more preferably at least 95% reduced) activity. A deletion or disruption of a gene can be accomplished by, for example, forced evolution, mutagenesis or genetic engineering methods, followed by appropriate selection or screening to identify the desired mutants.
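The percentage thresholds in paragraph [0030] can be applied as in the following sketch. The function names and the tier labels are hypothetical. The inputs may be either enzyme quantity or enzyme activity, measured in the same units for the wild-type and modified cells.

```python
def percent_reduction(wild_type, modified):
    """Percent reduction of a quantity or activity relative to wild type."""
    if wild_type <= 0:
        raise ValueError("wild-type value must be positive")
    return 100.0 * (wild_type - modified) / wild_type


def disruption_tier(wild_type, modified):
    """Classify a reduction against the 75 / 85 / 95 percent thresholds.

    Returns None when the reduction is below 75 percent, i.e. when it does
    not count as 'severely reduced' under the definition in the text.
    """
    reduction = percent_reduction(wild_type, modified)
    if reduction >= 95.0:
        return "more preferred (at least 95%)"
    if reduction >= 85.0:
        return "preferred (at least 85%)"
    if reduction >= 75.0:
        return "severely reduced (at least 75%)"
    return None


print(disruption_tier(100.0, 10.0))  # preferred (at least 85%)
```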
[0031] "Overexpress" means the artificial expression of an enzyme in increased quantity by a gene. Overexpression of an enzyme may result from the presence of one or more exogenous gene(s), or from other conditions. For purposes of this invention, a yeast cell containing at least one exogenous gene is considered to overexpress the enzyme(s) encoded by such exogenous gene(s).
[0032] The recombinant yeast of the invention is made by performing certain genetic modifications to a host yeast cell. The host yeast cell is one which as a wild-type strain is natively capable of metabolizing at least one sugar to pyruvate. Suitable host yeast cells include (but are not limited to) yeast cells classified under the genera Candida, Pichia, Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Kluyveromyces, Debaryomyces, Issatchenkia, Yarrowia and Hansenula. Examples of specific host yeast cells include C. sonorensis, K. marxianus, K. thermotolerans, C. methanesorbosa, Saccharomyces bulderi (S. bulderi), I. orientalis, C. lambica, C. sorboxylosa, C. zemplinina, C. geochares, P. membranifaciens, Z. kombuchaensis, C. sorbosivorans, C. vanderwaltii, C. sorbophila, Z. bisporus, Z. lentus, Saccharomyces bayanus (S. bayanus), D. castellii, C. boidinii, C. etchellsii, K. lactis, P. jadinii, P. anomala, Saccharomyces cerevisiae (S. cerevisiae), Pichia galeiformis, Pichia sp. YB-4149 (NRRL designation), Candida ethanolica, P. deserticola, P. fermentans and Saccharomycopsis crataegensis (S. crataegensis). Suitable strains of K. marxianus and C. sonorensis include those described in WO 00/71738 A1, WO 02/42471 A2, WO 03/049525 A2, WO 03/102152 A2 and WO 03/102201 A2. Suitable strains of I. orientalis are ATCC strain 32196 and ATCC strain PTA-6648.
[0033] In some embodiments of the invention the host cell is Crabtree negative as a wild-type strain. The Crabtree effect is defined as the occurrence of fermentative metabolism under aerobic conditions due to the inhibition of oxygen consumption by a microorganism when cultured at high specific growth rates (long-term effect) or in the presence of high concentrations of glucose (short-term effect). Crabtree negative phenotypes do not exhibit this effect, and are thus able to consume oxygen even in the presence of high concentrations of glucose or at high growth rates.
[0034] In some embodiments, the host cell is succinate-resistant as a wild-type strain. A cell is considered to be "succinate-resistant" if the cell exhibits a growth rate in media containing 75 g/L or greater succinate at pH 2.8 that is at least 50% as high as its growth rate in the same media containing 0 g/L succinate, according to the test method described in Example 1A of WO 2012/103261.
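The succinate-resistance criterion of paragraph [0034] reduces to a single ratio test, sketched below. The function name is hypothetical, and the growth rates are assumed to have been measured by the referenced test method (75 g/L or greater succinate at pH 2.8 versus the same media with 0 g/L succinate).

```python
def is_succinate_resistant(rate_with_succinate, rate_without_succinate):
    """True when growth with succinate is at least 50% of growth without it.

    Both rates must be in the same units (for example, per hour).
    """
    if rate_without_succinate <= 0:
        raise ValueError("reference growth rate must be positive")
    return rate_with_succinate >= 0.5 * rate_without_succinate


# Hypothetical growth rates, for illustration only.
print(is_succinate_resistant(0.30, 0.50))  # True
```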
[0035] In some embodiments, the host cell exhibits a volumetric glucose consumption rate of at least 3, at least 5 or at least 8 grams of glucose per liter of broth per hour, as a wild-type strain.
[0036] In some embodiments, the host cell exhibits a specific glucose consumption rate of at least 0.5, at least 1.0 or at least 1.5 gram of glucose per gram dry weight of cells per hour, as a wild-type strain.
[0037] Volumetric and specific glucose consumption can be measured by cultivating the cells in shake flasks in yeast extract peptone dextrose (YPD) media containing 0 g/L or 75 g/L succinate at pH 3.0 as described in Example 1 of WO 2012/103261. The flasks are inoculated with biomass harvested from seed flasks grown overnight to an OD600 of 6 to 10. 250 mL baffled glycolytic assay flasks (50 mL working volume) are inoculated to an OD600 of 0.1 and grown at 250 RPM and 30° C. Samples are taken throughout the time course for the assay and analyzed for glucose consumption by electrophoretic methods (such as by using a 2700 Biochemistry Analyzer from Yellow Springs Instruments or equivalent device). The data are plotted and the volumetric glucose consumption rate is calculated. The specific glucose consumption rate is calculated by dividing the glucose consumption by the cell dry weight at the end of fermentation.
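The rate calculations of paragraph [0037] can be sketched as follows. The text does not state how the rate is derived from the plotted data, so this sketch assumes the simplest choice, the overall glucose drop divided by elapsed time; a fitted slope could be substituted. It also assumes that the specific rate is the volumetric rate divided by the final cell dry weight expressed in g/L. The function names and sample values are hypothetical.

```python
def volumetric_glucose_rate(times_h, glucose_g_per_l):
    """Grams of glucose consumed per liter of broth per hour (endpoint method)."""
    if len(times_h) != len(glucose_g_per_l) or len(times_h) < 2:
        raise ValueError("need matching time and glucose series of length >= 2")
    elapsed = times_h[-1] - times_h[0]
    if elapsed <= 0:
        raise ValueError("time points must span a positive interval")
    return (glucose_g_per_l[0] - glucose_g_per_l[-1]) / elapsed


def specific_glucose_rate(volumetric_rate, cell_dry_weight_g_per_l):
    """Grams of glucose per gram dry cell weight per hour."""
    if cell_dry_weight_g_per_l <= 0:
        raise ValueError("cell dry weight must be positive")
    return volumetric_rate / cell_dry_weight_g_per_l


# Hypothetical time course: sampling times in hours, glucose in g/L.
times = [0.0, 2.0, 4.0, 8.0]
glucose = [100.0, 90.0, 78.0, 60.0]
rate = volumetric_glucose_rate(times, glucose)
print(rate)                              # 5.0
print(specific_glucose_rate(rate, 4.0))  # 1.25
```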
[0038] The genetically modified yeast cells provided herein have an active reductive TCA pathway from pyruvate to succinate. Such an active reductive TCA pathway includes a step of converting pyruvate or phosphoenolpyruvate (PEP) (or each) to oxaloacetate (OAA), a step of converting oxaloacetate to malate, a step of converting malate to fumarate, and a step of converting fumarate to succinate.
[0039] The step of converting pyruvate to OAA is catalyzed by a PYC (pyruvate carboxylase) enzyme, i.e., an enzyme having the ability to catalyze the conversion of pyruvate to OAA. A PYC enzyme is encoded by a PYC (pyruvate carboxylase) gene integrated into the genome of the recombinant yeast cell. The PYC gene may be native or non-native to the host cell, and may be endogenous (if native) or exogenous (if non-native or if additional copies of a native gene are present). In certain embodiments, a PYC gene may be a yeast gene. For example, the PYC gene may be an I. orientalis PYC gene encoding for an enzyme having amino acid sequence SEQ ID NO: 94, an S. cerevisiae PYC1 gene encoding for an enzyme having amino acid sequence SEQ ID NO: 95, or a K. marxianus PYC1 gene encoding for an enzyme having amino acid sequence SEQ ID NO: 96. In other embodiments, the gene may encode for an enzyme having an amino acid sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to any of SEQ ID NOs: 94, 95 or 96. In certain embodiments, the gene may have the nucleotide sequence set forth in SEQ ID NOs: 4, 45 or 46, or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to any of SEQ ID NOs: 4, 45 or 46. In other embodiments, the PYC gene may be fungal.
[0040] The step of converting PEP to OAA is catalyzed by a PPC (phosphoenolpyruvate carboxylase) enzyme, i.e., an enzyme having the ability to catalyze the conversion of PEP to OAA. A PPC enzyme is encoded by a PPC (phosphoenolpyruvate carboxylase) gene integrated into the genome of the recombinant yeast cell. The PPC gene may be native or non-native to the host cell, and may be endogenous (if native) or exogenous (if non-native or if additional copies of a native gene are present). The PPC gene may encode for an enzyme having either of amino acid sequences SEQ ID NO: 97 or 115, or for an enzyme having an amino acid sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to either of SEQ ID NOs: 97 or 115. In certain embodiments, the PPC gene may have the nucleotide sequence set forth in either of SEQ ID NOs: 49 or 50, or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to either of SEQ ID NOs: 49 or 50.
[0041] The step of converting OAA to malate is catalyzed by a MDH (malate dehydrogenase) enzyme, i.e., an enzyme having the ability to catalyze the conversion of OAA to malate. A MDH enzyme is encoded by a MDH (malate dehydrogenase) gene present in the genome of the recombinant yeast cell. The MDH gene may be native or non-native to the host cell, and may be endogenous (if native) or exogenous (if non-native or if additional copies of a native gene are present). The MDH enzyme preferably is NADH-dependent, i.e., one which uses NADH preferentially as a cofactor, and in converting OAA to malate also oxidizes NADH to NAD+. In the cells of this invention, the MDH enzyme preferably is overexpressed, by integrating one or more copies of an exogenous MDH gene (preferably at least two copies) into the genome of the cell. Preferred MDH genes encode for NADH-dependent MDH enzymes.
[0042] In certain embodiments, the MDH gene is a yeast MDH gene that encodes for an NADH-dependent MDH enzyme. For example, the MDH gene may be an I. orientalis MDH1, MDH2, or MDH3 gene encoding for an enzyme having any of the amino acid sequences SEQ ID NOs: 98, 99 or 100, respectively, a Z. rouxii MDH gene encoding for an enzyme having amino acid sequence SEQ ID NO: 101, a K. marxianus MDH1, MDH2, or MDH3 gene encoding for an enzyme having any of amino acid sequences SEQ ID NOs: 102, 103 or 104, respectively, or a gene encoding for an enzyme having an amino acid sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to any thereof. In certain embodiments, the yeast MDH gene has the nucleotide sequence set forth in any of SEQ ID NOs: 58, 59, 60, 61, 62, 63 or 64 or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to any thereof.
[0043] In certain embodiments, the MDH gene is a bacterial MDH gene that encodes for an NADH-dependent MDH enzyme. For example, the MDH gene is in some embodiments an Escherichia coli (E. coli) MDH gene encoding for an enzyme having amino acid sequence SEQ ID NO: 105 or a gene that encodes for an enzyme having an amino acid sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity thereto. In certain embodiments, the bacterial MDH gene has the nucleotide sequence SEQ ID NO: 66 or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity thereto.
[0044] In certain embodiments, an MDH gene is a fungal MDH gene that encodes for an NADH-dependent MDH enzyme. For example, the MDH gene in some embodiments is a Rhizopus oryzae (R. oryzae) MDH gene or a Rhizopus delemar (R. delemar) MDH gene encoding for an enzyme having amino acid sequence SEQ ID NO: 106 or 128 or a gene which encodes for an enzyme having an amino acid sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to either thereof. In certain embodiments, the fungal MDH gene has nucleotide sequence SEQ ID NO: 68 or 13 or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity thereto.
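The sequence-identity thresholds recited for the MDH sequences above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identity over all aligned columns; this passage does not specify how identity is to be computed, so the counting convention, the function names and the toy sequences are illustrative assumptions only.

```python
# Minimal sketch: percent identity of two pre-aligned sequences.
# Assumption: '-' marks an alignment gap; identity is counted over all
# columns that are not gap-in-both. Other conventions exist.

THRESHOLDS = (50, 60, 70, 80, 85, 90, 95, 97, 99)  # tiers recited in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over the aligned columns of two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list:
    """List every recited tier that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Toy peptides (hypothetical, not taken from any SEQ ID NO).
    identity = percent_identity("MKVAVLGAAG", "MKVAILGAAG")
    print(identity, thresholds_met(identity))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch covers only the final counting step.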
[0045] The step of converting malate to fumarate is catalyzed by a FUM (fumarase) enzyme, i.e., an enzyme having the ability to catalyze the conversion of malate to fumarate. A FUM (fumarase) enzyme is encoded by a FUM (fumarase) gene integrated into the genome of the recombinant yeast cell. The FUM gene may be native or non-native to the host cell, and may be endogenous (if native) or exogenous (if non-native or if additional copies of a native gene are present). In certain embodiments, a FUM gene is a yeast gene. The FUM gene is in some embodiments an I. orientalis FUM gene encoding an enzyme having amino acid sequence SEQ ID NO: 107, or for an enzyme having an amino acid sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to SEQ ID NO: 107. In certain embodiments, the FUM gene may have nucleotide sequence SEQ ID NO: 70 or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to SEQ ID NO: 70. In other embodiments, a FUM gene may be a bacterial gene.
[0046] The step of converting fumarate to succinate is catalyzed by a FRD (fumarate reductase) enzyme, i.e., an enzyme having the ability to catalyze the conversion of fumarate to succinate. A FRD (fumarate reductase) enzyme is encoded by a FRD (fumarate reductase) gene present in the genome of the recombinant yeast cell. The FRD gene may be native or non-native to the host cell, and may be endogenous (if native) or exogenous (if non-native or if additional copies of a native gene are present). The FRD enzyme preferably is NADH-dependent, i.e., one which uses NADH preferentially as a cofactor, and in converting fumarate to succinate also oxidizes NADH to NAD+. In the cells of this invention, the FRD enzyme preferably is overexpressed, by integrating one or more copies of an exogenous FRD gene (preferably at least two copies) into the genome of the cell. The FRD gene preferably encodes for an NADH-dependent FRD enzyme.
[0047] In certain embodiments, the FRD gene is a yeast FRD gene that encodes for an NADH-dependent FRD enzyme. For example, the FRD gene may be an S. cerevisiae FRD1 gene encoding for an enzyme having amino acid sequence SEQ ID NO: 108, a Saccharomyces mikatae (S. mikatae) FRD1 gene encoding for an enzyme having amino acid sequence SEQ ID NO: 109, a K. polyspora FRD1 gene encoding for an enzyme having amino acid sequence SEQ ID NO: 110, a K. marxianus FRD1 gene encoding for an enzyme having amino acid sequence SEQ ID NO: 111, or a gene encoding for an enzyme having an amino acid sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to any thereof. In certain embodiments, the yeast FRD gene may have any of nucleotide sequences SEQ ID NOs: 75, 76, 77 or 78, or have a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to any thereof.
[0048] In certain embodiments, the FRD gene may be a protozoan gene that encodes for an NADH-dependent FRD enzyme. For example, the FRD gene may be a Trypanosoma brucei (T. brucei) FRD gene encoding for an enzyme having amino acid sequence SEQ ID NO: 112, a Trypanosoma cruzi (T. cruzi) FRD gene encoding for an enzyme having amino acid sequence SEQ ID NO: 113, a Leishmania braziliensis (L. braziliensis) FRD gene encoding for an enzyme having amino acid sequence SEQ ID NO: 114, a Leishmania mexicana (L. mexicana) FRD gene encoding for an enzyme having amino acid sequence SEQ ID NO: 82, or a gene encoding for an enzyme having an amino acid sequence having at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to any thereof. In certain embodiments, the FRD gene may have a nucleotide sequence as set forth in any of SEQ ID NOs: 42, 43, 44 or 10, or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to any thereof.
[0049] In this invention, it is preferred that the reaction of OAA to malate and the reaction of fumarate to succinate each oxidizes NADH to NAD+. The oxidation of NADH to NAD+ typically occurs in cases in which the reaction in any one or more of these steps is catalyzed by an NADH-dependent enzyme as described before.
[0050] The recombinant cell of the invention overexpresses an active NAD(P)+ transhydrogenase enzyme and/or includes one or more exogenous NAD(P)+ transhydrogenase genes, which may be native or non-native to the host cell. A "NAD(P)+ transhydrogenase" (SthA) gene refers to any gene that encodes a polypeptide that catalyzes the reaction of NADP(H) to form NAD(H). The NAD(P)+ transhydrogenase (SthA) enzyme preferably is soluble in the cytosol of the recombinant cell. The exogenous SthA gene may be of bacterial, fungal, yeast or other origin. The exogenous SthA gene in some embodiments is an E. coli, Azotobacter vinelandii (A. vinelandii) or Pseudomonas fluorescens SthA gene. The exogenous SthA gene in some embodiments encodes for an enzyme having any of amino acid sequences SEQ ID NOs: 117, 118, 119, or 146, or which is at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% identical to any thereof. In certain embodiments, the exogenous SthA gene has any of nucleotide sequences SEQ ID NOs: 21, 24, 27, or 139, or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to any thereof.
[0051] In some embodiments, the recombinant cell exhibits increased flux (relative to the wild-type host strain) through the pentose phosphate pathway and/or overexpresses at least one enzyme which catalyzes a reaction that includes the reduction of NADP+ to NADPH.
[0052] The overexpressed enzyme may be an enzyme that catalyzes a reaction in the pentose phosphate pathway. The pentose phosphate pathway metabolizes glucose-6-phosphate to glyceraldehyde-3-phosphate through 6-phosphogluconolactone, 6-phosphogluconate and ribulose 5-phosphate intermediates. The conversion of glucose-6-phosphate to 6-phosphogluconolactone is catalyzed by a glucose-6-phosphate dehydrogenase (G6PDH) enzyme that uses NADP+ as a cofactor, thereby reducing NADP+ to NADPH. Similarly, the conversion of 6-phosphogluconate to ribulose-5-phosphate is catalyzed by a 6-phosphogluconate dehydrogenase (6PGDH) enzyme that uses NADP+ as a cofactor, thereby reducing NADP+ to NADPH. Overexpressing one or both of these enzymes, or increasing flux through the pentose phosphate pathway, produces NADPH, which can be converted to NADH by action of the NAD(P)+ transhydrogenase enzyme, helping to maintain cofactor balance in the cell.
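The cofactor balance described above can be sketched as simple bookkeeping. The two NADH-consuming steps and the two NADPH-producing steps are taken from the text; the figure of one NADH formed per pyruvate in glycolysis is a textbook assumption added here, as is the 1:1 conversion of NADPH to NADH by the transhydrogenase. The sketch ignores ATP, biomass formation, CO2 loss in the pentose phosphate pathway and every other redox source or sink, so it is an illustration rather than a metabolic model.

```python
# Minimal redox ledger for the reductive branch from pyruvate to succinate.

# From the text: each of these steps oxidizes one NADH to NAD+.
NADH_CONSUMED = {"OAA->malate": 1, "fumarate->succinate": 1}

# Assumption (textbook glycolysis): one NADH is formed per pyruvate.
GLYCOLYTIC_NADH_PER_PYRUVATE = 1

# From the text: G6PDH and 6PGDH each reduce one NADP+ to NADPH.
NADPH_PER_G6P_OXIDIZED = 2


def nadh_deficit_per_succinate() -> int:
    """NADH still needed per succinate after crediting glycolytic NADH."""
    return sum(NADH_CONSUMED.values()) - GLYCOLYTIC_NADH_PER_PYRUVATE


def g6p_through_ppp(succinate_mol: float) -> float:
    """Mol of glucose-6-phosphate that would have to pass G6PDH and 6PGDH
    to cover the deficit, assuming 1:1 NADPH -> NADH transhydrogenation."""
    return succinate_mol * nadh_deficit_per_succinate() / NADPH_PER_G6P_OXIDIZED


if __name__ == "__main__":
    print(nadh_deficit_per_succinate())  # NADH shortfall per succinate
    print(g6p_through_ppp(2.0))          # mol G6P for 2 mol succinate
```

Under these assumptions each succinate leaves a shortfall of one NADH, which is the gap the transhydrogenase is meant to help close.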
[0053] One way of increasing flux through the pentose phosphate pathway is to disrupt the glycolytic pathway from glucose to pyruvate. This can be done, for example, by disrupting or removing the step of isomerizing glucose-6-phosphate to fructose-6-phosphate, which is catalyzed by a phosphoglucose isomerase (PGI) enzyme. Therefore, in certain embodiments, the recombinant cell of the invention produces a severely reduced quantity (at least 75% reduction, preferably at least 85% reduction, more preferably at least 95% reduction) of an active phosphoglucose isomerase (PGI) enzyme, or produces a PGI enzyme with severely reduced (at least 75% reduced, preferably at least 85% reduced, more preferably at least 95% reduced) activity. In some embodiments, the recombinant cell includes a deletion or disruption of at least one native phosphoglucose isomerase (PGI) gene. If the host cell contains multiple alleles of the PGI gene, all such alleles may be deleted or disrupted.
[0054] The overexpressed enzyme which catalyzes a reaction that includes the reduction of NADP+ to NADPH may be an enzyme that catalyzes a reaction in the pentose phosphate pathway. The pentose phosphate pathway metabolizes glucose-6-phosphate to glyceraldehyde-3-phosphate through 6-phosphogluconolactone, 6-phosphogluconate and ribulose 5-phosphate intermediates. The conversion of glucose-6-phosphate to 6-phosphogluconolactone is catalyzed by a glucose-6-phosphate dehydrogenase (G6PDH) enzyme that uses NADP+ as a cofactor, thereby reducing NADP+ to NADPH. Similarly, the conversion of 6-phosphogluconate to ribulose-5-phosphate is catalyzed by a 6-phosphogluconate dehydrogenase (6PGDH) enzyme that uses NADP+ as a cofactor, thereby reducing NADP+ to NADPH.
[0055] Therefore, in certain embodiments, the yeast cell of the invention overexpresses a G6PDH enzyme. Such a yeast cell in some embodiments includes one or more exogenous G6PDH genes, which may be native or non-native to the strain, integrated into its genome. In certain of these embodiments, the exogenous G6PDH gene may be an I. orientalis G6PDH gene (ZWF1) that encodes for an enzyme having amino acid sequence SEQ ID NO: 121 or which encodes for an enzyme having an amino acid sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to SEQ ID NO: 121. In certain embodiments, the G6PDH gene may have nucleotide sequence SEQ ID NO: 87 or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to nucleotide sequence SEQ ID NO: 87.
[0056] Similarly, in other embodiments, the recombinant yeast cell provided herein contains one or more exogenous 6PGDH genes, which may be native or non-native to the host strain, integrated into its genome. In certain embodiments, a 6PGDH gene may be a yeast 6PGDH gene such as an I. orientalis 6PGDH gene. In certain embodiments, the exogenous 6PGDH gene encodes for an enzyme having amino acid sequence SEQ ID NO: 88, or an amino acid sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to SEQ ID NO: 88. In certain embodiments, the exogenous 6PGDH gene has the nucleotide sequence of SEQ ID NO: 89, or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to the nucleotide sequence of SEQ ID NO: 89.
[0057] In certain embodiments, the recombinant cell of the invention overexpresses an oxidative stress-activated zinc cluster protein Stb5p. This zinc cluster protein regulates genes involved in certain NADPH-producing reactions, including the G6PDH and 6PGDH genes. In certain embodiments, the recombinant cell includes one or more exogenous Stb5p genes, which may be native or non-native to the host cell, integrated into its genome. In certain embodiments, the exogenous Stb5p gene encodes for an enzyme having amino acid sequence SEQ ID NO: 83, or an amino acid sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to SEQ ID NO: 83. In certain embodiments, the exogenous Stb5p gene has the nucleotide sequence of SEQ ID NO: 30, or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to the nucleotide sequence of SEQ ID NO: 30.
[0058] The recombinant cell of the invention may further include one or more exogenous succinate exporter genes, which may be native or non-native to the host cell. A "succinate exporter gene" as used herein refers to any gene that encodes a polypeptide with succinate export activity, meaning the ability to transport succinate out of a cell and into the extracellular environment. The exogenous succinate exporter gene may be a fungal succinate exporter gene such as a Schizosaccharomyces pombe (S. pombe) succinate exporter gene or an Aspergillus oryzae (A. oryzae) succinate exporter gene. The exogenous succinate exporter gene in some embodiments encodes for an enzyme having amino acid sequence SEQ ID NO: 90 or 91, or an amino acid sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to either of SEQ ID NOs: 90 or 91. In certain embodiments, the exogenous succinate exporter gene has either of nucleotide sequences SEQ ID NOs: 92 or 93, or a nucleotide sequence with at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to either of SEQ ID NOs: 92 or 93.
[0059] In certain embodiments, the recombinant yeast cells provided herein may have a deletion or disruption of one or more other endogenous genes. The other deleted or disrupted genes may include genes which produce enzymes that catalyze the reaction of pyruvate or phosphoenolpyruvate (or their metabolites) to downstream products other than succinate. Among such genes are, for example, native pyruvate decarboxylase, alcohol dehydrogenase 1 (ADH1, catalyzes the conversion of acetaldehyde to ethanol), alcohol dehydrogenase 2 (ADH2, catalyzes the conversion of ethanol to acetaldehyde), glycerol-3-phosphate dehydrogenase (GPD, systematic name sn-glycerol-3-phosphate:NAD+ 2-oxidoreductase, EC 1.1.1.8), glycerol-3-phosphatase (GPP, systematic name glycerol-1-phosphate phosphohydrolase, EC 3.1.3.21) and NAD+-dependent glycerol dehydrogenase (systematic name glycerol:NAD+ 2-oxidoreductase, EC 1.1.1.6) genes.
[0060] Other endogenous genes that may be deleted in certain embodiments of the invention include genes which encode for enzymes that catalyze a reaction that consumes PEP, pyruvate, succinate or any intermediates produced in the reductive TCA pathway (other than the TCA pathway reactions leading to succinate). Examples of such genes include a native pyruvate carboxylase gene (which encodes an enzyme that converts OAA to pyruvate), a native PEP carboxykinase (PCK) gene (which encodes an enzyme that converts OAA to PEP), a native malic enzyme (MAE) gene (which encodes an enzyme that converts malate to pyruvate) and a native succinate dehydrogenase (SDH) gene (which encodes an enzyme that catalyzes the back-reaction of succinate to fumarate).
[0061] In some embodiments, the modified yeast cells provided herein have a deletion or disruption of a native succinate importer gene, which as used herein refers to any gene that encodes a polypeptide that allows for growth on and consumption of succinate.
[0062] In certain embodiments, the cells may contain all or part of an active oxidative TCA or glyoxylate shunt succinate fermentation pathway. In these embodiments, the cells comprise one or more genes encoding enzymes selected from the group consisting of citrate synthase, PDH (pyruvate dehydrogenase), PFL (pyruvate formate lyase), aconitase, IDH (isocitrate dehydrogenase), α-KGDH (α-ketoglutarate dehydrogenase), succinate thiokinase, isocitrate lyase, and malate synthase.
[0063] The recombinant cell of the invention may further include one or more modifications which individually or collectively confer to the cell the ability to ferment pentose sugars to xylulose 5-phosphate. Among such modifications are (1) insertion of a functional xylose isomerase gene, (2) a deletion or disruption of a native gene that produces an enzyme that catalyzes the conversion of xylose to xylitol, (3) a deletion or disruption of a functional xylitol dehydrogenase gene and/or (4) modifications that cause the cell to overexpress a functional xylulokinase. Suitable methods for introducing each of those modifications into yeast cells are described, for example, in WO 04/099381, incorporated herein by reference.
[0064] In this invention, any exogenous gene, including without limitation any of the exogenous genes in the reductive TCA pathway from pyruvate to succinate, any succinate exporter gene, any G6PDH gene, any 6PGDH gene, any SthA gene, or any other exogenous gene introduced into the host cell, is operatively linked to one or more regulatory elements, and in particular to a promoter sequence and a terminator sequence that each are functional in the host cell. Such regulatory elements may be native or non-native to the host cell.
[0065] Examples of promoters that may be linked to one or more exogenous genes in the yeast cells provided herein include, but are not limited to, promoters for pyruvate decarboxylase (PDC1), phosphoglycerate kinase (PGK), xylose reductase (XR), xylitol dehydrogenase (XDH), L-(+)-lactate-cytochrome c oxidoreductase (CYB2), translation elongation factor-1 or -2 (TEF1, TEF2), enolase (ENO1), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), orotidine 5'-phosphate decarboxylase (URA3) genes, as well as any of those described in the various Examples that follow. Where the promoters are non-native, they may be identical to or share a high degree of sequence identity (i.e., at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99%) with one or more native promoters. Other suitable promoters and terminators include those described, for example, in WO99/14335, WO00/71738, WO02/42471, WO03/102201, WO03/102152 and WO03/049525.
[0066] Examples of terminators that may be linked to one or more exogenous genes in the yeast cells provided herein include, but are not limited to, terminators for PDC1, XR, XDH, transaldolase (TAL), transketolase (TKL), ribose 5-phosphate ketol-isomerase (RKI), CYB2, or iso-2-cytochrome c (CYC) genes or the galactose family of genes (especially the GAL10 terminator), as well as any of those described in the various Examples that follow. Where the terminators are non-native, they may be identical to or share a high degree of sequence identity (i.e., at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99%) with one or more native terminators.
[0067] Modifications (insertions, deletions and/or disruptions) to the genome of the host cell described herein can be performed using methods known in the art. Exogenous genes may be integrated into the genome in a targeted or a random manner using, for example, well known electroporation and chemical methods (including calcium chloride and/or lithium acetate methods). In those embodiments where an exogenous gene is integrated in a targeted manner, it may be integrated into the locus for a particular native gene, such that integration of the exogenous gene is coupled with deletion or disruption of a native gene. Alternatively, the exogenous gene may be integrated into a portion of the native genome that does not correspond to a gene. Methods for transforming a yeast cell with an exogenous construct are described in, for example, WO99/14335, WO00/71738, WO02/42471, WO03/102201, WO03/102152, WO03/049525, WO2007/061590, WO 2009/065778 and PCT/US2011/022612.
[0068] Insertion of exogenous genes is generally performed by transforming the cell with one or more integration constructs or fragments. The terms "construct" and "fragment" are used interchangeably herein to refer to a DNA sequence that is used to transform a cell. The construct or fragment may be, for example, a circular plasmid or vector, a portion of a circular plasmid or vector (such as a restriction enzyme digestion product), a linearized plasmid or vector, or a PCR product prepared using a plasmid or genomic DNA as a template. An integration construct can be assembled using two cloned target DNA sequences from an insertion site target. The two target DNA sequences may be contiguous or non-contiguous in the native host genome. In this context, "non-contiguous" means that the DNA sequences are not immediately adjacent to one another in the native genome, but are instead separated by a region that is to be deleted. "Contiguous" sequences as used herein are directly adjacent to one another in the native genome. Where targeted integration is to be coupled with deletion or disruption of a target gene, the integration construct also functions as a deletion construct. In such an integration/deletion construct, one of the target sequences may include a region 5' to the promoter of the target gene, all or a portion of the promoter region, all or a portion of the target gene coding sequence, or some combination thereof. The other target sequence may include a region 3' to the terminator of the target gene, all or a portion of the terminator region, and/or all or a portion of the target gene coding sequence. Where targeted integration is not to be coupled to deletion or disruption of a native gene, the target sequences are selected such that insertion of an intervening sequence will not disrupt native gene expression.
An integration or deletion construct is prepared such that the two target sequences are oriented in the same direction in relation to one another as they natively appear in the genome of the host cell. The gene expression cassette is cloned into the construct between the two target gene sequences to allow for expression of the exogenous gene. The gene expression cassette contains the exogenous gene, and may further include one or more regulatory sequences such as promoters or terminators operatively linked to the exogenous gene.
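The layout of an integration or integration/deletion construct described above can be sketched with toy strings. The class name, field names and sequences below are hypothetical, and the integrate function models only the idealized outcome of a double crossover at the two target sequences; it is not a description of any particular transformation protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationConstruct:
    """Toy model: two target sequences flanking a gene expression cassette."""
    upstream_target: str    # 5' target sequence from the insertion site
    cassette: str           # promoter + exogenous gene + terminator
    downstream_target: str  # 3' target sequence from the insertion site

    def sequence(self) -> str:
        # Targets keep the same relative orientation as in the genome.
        return self.upstream_target + self.cassette + self.downstream_target


def integrate(genome: str, construct: IntegrationConstruct) -> str:
    """Place the cassette between the two target sequences in the genome.

    Anything lying between non-contiguous targets is deleted, which is the
    integration/deletion case; contiguous targets give a pure insertion.
    """
    start = genome.find(construct.upstream_target)
    if start < 0:
        raise ValueError("upstream target not found in genome")
    up_end = start + len(construct.upstream_target)
    down_start = genome.find(construct.downstream_target, up_end)
    if down_start < 0:
        raise ValueError("downstream target not found after upstream target")
    return genome[:up_end] + construct.cassette + genome[down_start:]


if __name__ == "__main__":
    construct = IntegrationConstruct("CCC", "xyz", "GGG")
    print(integrate("AAACCCGGGTTT", construct))      # contiguous targets
    print(integrate("AAACCCTTTTGGGAAA", construct))  # "TTTT" is deleted
```

With contiguous targets nothing is lost from the toy genome; with non-contiguous targets the intervening region is replaced by the cassette.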
[0069] It is usually desirable that the deletion construct also include a functional selection marker cassette. When a single deletion construct is used, the marker cassette resides on the vector downstream (i.e., in the 3' direction) of the 5' sequence from the target locus and upstream (i.e., in the 5' direction) of the 3' sequence from the target locus. Successful transformants will contain the selection marker cassette, which imparts to the successfully transformed cell some characteristic that provides a basis for selection.
[0070] A "selection marker gene" may encode for a protein needed for the survival and/or growth of the transformed cell in a selective culture medium. Typical selection marker genes encode proteins that (a) confer resistance to antibiotics or other toxins (such as, for example, zeocin (Streptoalloteichus hindustanus ble bleomycin resistance gene), G418 (kanamycin-resistance gene of Tn903) or hygromycin (aminoglycoside antibiotic resistance gene from E. coli)); (b) complement auxotrophic deficiencies of the cell (such as, for example, amino acid leucine deficiency (K. marxianus LEU2 gene) or uracil deficiency (e.g., K. marxianus or S. cerevisiae URA3 gene)); (c) enable the cell to synthesize critical nutrients not available from simple media; or (d) confer the ability for the cell to grow on a particular carbon source (such as a MEL5 gene from S. cerevisiae, which encodes the alpha-galactosidase (melibiase) enzyme and confers the ability to grow on melibiose as the sole carbon source). Preferred selection markers include the zeocin resistance gene, G418 resistance gene, a MEL5 gene, a URA3 gene and hygromycin resistance gene. Another preferred selection marker is an L-lactate:ferricytochrome c oxidoreductase (CYB2) gene cassette, provided that the host cell either natively lacks such a gene or that its native CYB2 gene(s) are first deleted or disrupted.
[0071] The construct may be designed so that the selection marker cassette can become spontaneously deleted as a result of a subsequent homologous recombination event. A convenient way of accomplishing this is to design the vector such that the selection marker gene cassette is flanked by direct repeat sequences. Direct repeat sequences are identical DNA sequences, native or non-native to the host cell, and oriented on the construct in the same direction with respect to each other. The direct repeat sequences are advantageously about 50-1500 bp in length. It is not necessary that the direct repeat sequences encode for anything. This construct permits a homologous recombination event to occur. This event occurs with some low frequency, resulting in cells containing a deletion of the selection marker gene and one of the direct repeat sequences. It may be necessary to grow transformants for several rounds on nonselective or selective media to allow for the spontaneous homologous recombination to occur in some of the cells. Cells in which the selection marker gene has become spontaneously deleted can be selected or screened on the basis of their loss of the selection characteristic imparted by the selection marker gene, or by using PCR or Southern analysis methods to confirm the loss of the selection marker.
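The marker loop-out described above can be illustrated as a string operation: recombination between two identical direct repeats removes the marker together with one copy of the repeat. The function name and the toy sequences are hypothetical, and the sketch says nothing about how often the event occurs in a real culture.

```python
def loop_out(sequence: str, repeat: str) -> str:
    """Model recombination between the first two copies of a direct repeat.

    The result keeps one copy of the repeat and drops everything that lay
    between the two copies (here, the selection marker cassette).
    """
    first = sequence.find(repeat)
    if first < 0:
        raise ValueError("direct repeat not found")
    # Search for the second copy only after the end of the first one.
    second = sequence.find(repeat, first + len(repeat))
    if second < 0:
        raise ValueError("second copy of the direct repeat not found")
    return sequence[:first + len(repeat)] + sequence[second + len(repeat):]


if __name__ == "__main__":
    flank5, repeat, marker, payload = "AAAA", "CTGA", "GGGGGG", "TTTT"
    integrated = flank5 + repeat + marker + repeat + payload
    print(loop_out(integrated, repeat))  # marker and one repeat are gone
```

In the toy example the integrated locus loses the marker while the payload after the second repeat is retained, matching the outcome described in the text.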
[0072] In some embodiments, an exogenous gene may be inserted using DNA from two or more integration fragments, rather than a single fragment. In these embodiments, the 3' end of one integration fragment contains a region of homology with the 5' end of another integration fragment. One of the fragments will contain a first region of homology to the target locus and the other fragment will contain a second region of homology to the target locus. The gene cassette to be inserted can reside on either fragment, or be divided among the fragments, with a region of homology at the 3' and 5' ends of the respective fragments, so the entire, functional gene cassette is produced upon a crossover event. The cell is transformed with these fragments simultaneously. A selection marker may reside on any one of the fragments or may be divided between the fragments with a region of homology as described. In other embodiments, transformation from three or more constructs can be used in an analogous way to integrate exogenous genetic material.
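The two-fragment approach above relies on a region of homology shared by the 3' end of one fragment and the 5' end of the other. A minimal sketch of that joining step, using exact string overlap as a stand-in for homologous recombination, is given below; the function name, the minimum overlap length and the toy sequences are illustrative assumptions.

```python
def join_on_overlap(left: str, right: str, min_overlap: int = 4) -> str:
    """Join two fragments where the 3' end of `left` matches the 5' end of
    `right`, keeping a single copy of the shared region."""
    longest = min(len(left), len(right))
    # Prefer the longest shared region; stop at the minimum allowed length.
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("fragments share no overlap of the required length")


if __name__ == "__main__":
    # Toy fragments: the first carries the 5' part of a cassette, the second
    # the 3' part; "GTTAC" is the shared region of homology.
    first_fragment = "AAAACCGGTTAC"
    second_fragment = "GTTACTTTTGGGG"
    print(join_on_overlap(first_fragment, second_fragment))
```

Only when both fragments are present does the full cassette appear in the joined product, which mirrors why the cell is transformed with the fragments simultaneously.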
[0073] Deletions and/or disruptions of native genes can be performed by transformation methods, by mutagenesis and/or by forced evolution methods. In mutagenesis methods cells are exposed to ultraviolet radiation or a mutagenic substance, under conditions sufficient to achieve a high kill rate (60-99.9%, preferably 90-99.9%) of the cells. Surviving cells are then plated and selected or screened for cells having the deleted or disrupted metabolic activity. Disruption or deletion of the desired native gene(s) can be confirmed through PCR or Southern analysis methods.
[0074] Cells of the invention can be cultivated to produce succinic acid, either in the free acid form or in salt form (or both), or a metabolization product of succinate. The recombinant cell is cultured in a medium that includes at least one carbon source that can be fermented by the cell. Examples include, but are not limited to, twelve-carbon sugars such as sucrose, hexose sugars such as glucose or fructose, glycan, starch, or other polymer of glucose, glucose oligomers such as maltose, maltotriose and isomaltotriose, panose, and fructose oligomers, and pentose sugars such as xylose, xylan, other oligomers of xylose, or arabinose.
[0075] The medium will typically contain, in addition to the carbon source, nutrients as required by the particular cell, including a source of nitrogen (such as amino acids, proteins, inorganic nitrogen sources such as ammonia or ammonium salts, and the like), and various vitamins, minerals and the like. In some embodiments, the cells of the invention can be cultured in a chemically defined medium.
[0076] Other cultivation conditions, such as temperature, cell density, selection of substrate(s), selection of nutrients, and the like are not considered to be critical to the invention and are generally selected to provide an economical process. Temperatures during each of the growth phase and the production phase may range from above the freezing temperature of the medium to about 50° C., although this depends to some extent on the ability of the strain to tolerate elevated temperatures. A preferred temperature, particularly during the production phase, is about 30 to 45° C.
[0077] During cultivation, aeration and agitation conditions may be selected to produce a desired oxygen uptake rate. The cultivation may be conducted aerobically, microaerobically, or anaerobically, depending on pathway requirements. In some embodiments, the cultivation conditions are selected to produce an oxygen uptake rate of around 2-25 mmol/L/hr, preferably from around 5-20 mmol/L/hr, and more preferably from around 8-15 mmol/L/hr. "Oxygen uptake rate" or "OUR" as used herein refers to the volumetric rate at which oxygen is consumed during the fermentation. Inlet and outlet oxygen concentrations can be measured with exhaust gas analysis, for example by mass spectrometers. OUR can be calculated using the Direct Method described in Bioreaction Engineering Principles 2nd Edition, 2003, Kluwer Academic/Plenum Publishers, p. 449, equation 1.
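A volumetric oxygen uptake rate can be estimated from inlet and outlet gas measurements with a generic gas-phase oxygen balance, sketched below. This is not a reproduction of the cited textbook equation; the function names, the use of molar gas flows and the range labels are assumptions made for illustration. The three ranges are those recited above, and because the text qualifies them with "around", the hard boundaries in the code are only approximate.

```python
# Ranges recited in the text, in mmol O2 per liter per hour, narrowest first.
PREFERENCE_RANGES = (
    ("more preferred", 8.0, 15.0),
    ("preferred", 5.0, 20.0),
    ("disclosed", 2.0, 25.0),
)


def oxygen_uptake_rate(gas_in_mmol_per_h: float, o2_fraction_in: float,
                       gas_out_mmol_per_h: float, o2_fraction_out: float,
                       broth_volume_l: float) -> float:
    """Generic balance: (O2 entering - O2 leaving) per liter of broth."""
    if broth_volume_l <= 0:
        raise ValueError("broth volume must be positive")
    o2_in = gas_in_mmol_per_h * o2_fraction_in
    o2_out = gas_out_mmol_per_h * o2_fraction_out
    return (o2_in - o2_out) / broth_volume_l


def narrowest_range(our: float) -> str:
    """Return the label of the narrowest recited range containing `our`."""
    for label, low, high in PREFERENCE_RANGES:
        if low <= our <= high:
            return label
    return "outside disclosed ranges"


if __name__ == "__main__":
    # Hypothetical measurements for a 1 L broth.
    our = oxygen_uptake_rate(1000.0, 0.21, 1000.0, 0.20, 1.0)
    print(round(our, 3), narrowest_range(our))
```

The outlet molar flow is passed separately from the inlet flow because the two generally differ once oxygen is consumed and carbon dioxide is evolved.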
[0078] The culturing process may be divided up into phases. For example, the cell culture process may be divided into a cultivation phase, a production phase, and a recovery phase.
[0079] The pH may be allowed to range freely during cultivation, or may be buffered if necessary to prevent the pH from falling below or rising above predetermined levels. For example, the medium may be buffered to prevent the pH of the solution from falling below around 2.0 or above about 8.0 during cultivation. In certain of these embodiments, the medium may be buffered to prevent the pH of the solution from falling below around 3.0 or rising above around 7.0, and in certain of these embodiments the medium may be buffered to prevent the pH of the solution from falling below around 3.0 or rising above around 4.5. Suitable buffering agents include basic materials that neutralize the acid as it is formed, and include, for example, calcium hydroxide, calcium carbonate, sodium hydroxide, potassium hydroxide, potassium carbonate, sodium carbonate, ammonium carbonate, ammonia, ammonium hydroxide and the like.
[0080] In a buffered fermentation, acidic fermentation products are neutralized to the corresponding salt as they are formed. Recovery of the acid therefore involves regenerating the free acid. This is typically done by removing the cells and acidulating the fermentation broth with a strong acid such as sulfuric acid. A salt by-product is formed (gypsum in the case where a calcium salt is the neutralizing agent and sulfuric acid is the acidulating agent), which is separated from the broth.
[0081] In other embodiments, the pH of the fermentation medium may be permitted to drop during cultivation from a starting pH that is at or above the lower pKa (4.207) of succinate, typically 8 or higher, to at or below the lower pKa of the acid fermentation product, such as in the range of about 2.0 to about 4.2, in the range of from about 3.0 to about 4.2, or in the range from about 3.8 to about 4.2.
[0082] In still other embodiments, fermentation may be carried out to produce a product acid by adjusting the pH of the fermentation broth to at or below the lower pKa of the product acid prior to or at the start of the fermentation process. The pH may thereafter be maintained at or below the lower pKa of the product acid throughout the cultivation. In certain embodiments, the pH may be maintained at a range of about 2.0 to about 4.2, in the range of from about 3.0 to about 4.2, or in the range from about 3.8 to about 4.2.
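As a non-limiting illustration of why a broth pH at or below the lower pKa favors the acid form, the following Python sketch estimates the fraction of succinate present as fully protonated succinic acid at a given pH, treating it as an ideal diprotic acid. The lower pKa (4.207) is the value stated above. The second pKa (about 5.64) is a commonly cited literature value and is an assumption here, as are the function name and the neglect of ionic strength and temperature effects.

```python
# Sketch only: ideal diprotic-acid speciation for succinic acid.
PKA1 = 4.207  # lower pKa, as stated in the text
PKA2 = 5.64   # ASSUMPTION: approximate literature value for the second pKa


def undissociated_fraction(ph, pka1=PKA1, pka2=PKA2):
    """Return the fraction of total succinate present as H2A (free acid).

    For a diprotic acid:
        f(H2A) = 1 / (1 + 10**(pH - pKa1) + 10**(2*pH - pKa1 - pKa2))
    """
    mono_anion = 10 ** (ph - pka1)            # ratio [HA-] / [H2A]
    di_anion = 10 ** (2 * ph - pka1 - pka2)   # ratio [A2-] / [H2A]
    return 1.0 / (1.0 + mono_anion + di_anion)


if __name__ == "__main__":
    # pH values taken from the ranges recited above
    for ph in (2.0, 3.0, 3.8, 4.2):
        print(f"pH {ph:.1f}: free-acid fraction = {undissociated_fraction(ph):.3f}")
```

Under these assumptions roughly half of the succinate is in the fully protonated form at a pH of about 4.2, and the share rises as the pH falls toward 2.0.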
[0083] When the pH of the fermentation broth is low enough that the succinate is present in acid form, the acid can be recovered from the broth through techniques such as liquid-liquid extraction, distillation, absorption, etc., such as are described in T. B. Vickroy, Vol. 3, Chapter 38 of Comprehensive Biotechnology, (ed. M. Moo-Young), Pergamon, Oxford, 1985; R. Datta, et al., FEMS Microbiol. Rev., 1995, 16:221-231; U.S. Pat. Nos. 4,275,234, 4,771,001, 5,132,456, 5,420,304, 5,510,526, 5,641,406, and 5,831,122, and WO 93/00440.
[0084] The cultivation may be continued until a yield of succinate on the carbon source is, for example, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, or greater than 50% of the theoretical yield. The yield to succinate may be at least 80% or at least 90% of the theoretical yield. The concentration, or titer, of succinate produced in the cultivation will be a function of the yield as well as the starting concentration of the carbon source. In certain embodiments, the titer may reach at least 1, at least 3, at least 5, at least 10, at least 20, at least 30, at least 40, at least 50, or greater than 50 g/L at some point during the fermentation, and preferably at the end of the fermentation.
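The relationship between yield, theoretical yield, and titer described above may be sketched as follows. This is a minimal illustration, assuming complete consumption of the carbon source and a constant broth volume. The function names are hypothetical, and the theoretical yield is left as a caller-supplied parameter because no numerical value for it is given here.

```python
# Sketch only: yield as a percentage of theoretical, and the titer it implies.
# ASSUMPTIONS: all carbon source is consumed; broth volume is constant.


def percent_of_theoretical(succinate_g, carbon_source_g, theoretical_g_per_g):
    """Yield of succinate on the carbon source, as % of the theoretical yield."""
    if carbon_source_g <= 0 or theoretical_g_per_g <= 0:
        raise ValueError("carbon source mass and theoretical yield must be positive")
    actual_g_per_g = succinate_g / carbon_source_g
    return 100.0 * actual_g_per_g / theoretical_g_per_g


def expected_titer(start_g_per_l, percent_theoretical, theoretical_g_per_g):
    """Titer (g/L) implied by a starting carbon-source concentration and a yield."""
    return start_g_per_l * theoretical_g_per_g * percent_theoretical / 100.0


if __name__ == "__main__":
    # Placeholder numbers for illustration; 1.0 g/g is NOT a value from the text.
    theoretical = 1.0
    pct = percent_of_theoretical(40.0, 100.0, theoretical)
    print(f"{pct:.1f}% of theoretical")
    print(f"{expected_titer(100.0, pct, theoretical):.1f} g/L")
```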
[0085] In certain embodiments, the genetically modified yeast cells produce ethanol in a yield of 10% or less, preferably in a yield of 2% or less of the theoretical yield. In certain of these embodiments, ethanol is not detectably produced. In other embodiments, however, succinate and ethanol may be co-produced. In these embodiments, ethanol may be produced at a yield of greater than 10%, greater than 25%, or greater than 50% of the theoretical yield.
[0086] The recombinant cell of the invention may exhibit a volumetric glucose consumption rate of at least 0.5 gram, at least 0.75 gram, or at least 0.9 gram of glucose per liter of broth per hour, when cultivated under the conditions described in Examples 253-255.
[0087] The cell of the invention may produce succinate as an end-product of the fermentation process. In such a case, the cell preferably transports succinate out of the cell and into the surrounding culture medium.
[0088] In some embodiments, the cell may further metabolize some or all of the succinate into one or more succinate metabolization products, i.e., a compound formed in the further metabolization of succinate by the cell. Examples of such downstream succinate metabolization products include, for example, 1,4-butanediol, 1,3-butadiene, propionic acid, and 3-hydroxyisobutyric acid. In such embodiments, the cell contains native or non-native metabolic pathways which perform such further metabolization of succinate into such downstream succinate metabolization product(s). The cell may then transport such downstream succinate metabolization products out of the cell and into the surrounding medium. In some embodiments, the cell may transport one or more succinate metabolization products, but not succinate, out of the cell. In other embodiments, the cell may transport both succinate itself and one or more succinate metabolization products out of the cell. For example, the cell may transport less than 10% by weight of succinate from the cell, based on the combined weight of succinate and succinate metabolization products exported from the cell.
[0089] The following examples are provided to illustrate the invention, but are not intended to limit the scope thereof. All parts and percentages are by weight unless otherwise indicated.
EXAMPLES
Construction of Preparatory Examples
[0090] P-1. An I. orientalis host strain is generated by evolving I. orientalis strain ATCC PTA-6658 for 91 days in a glucose-limited chemostat. The system is fed 15 g/L glucose in a defined medium and operated at a dilution rate of 0.06 h⁻¹ at pH=3 with added lactic acid in the feed medium. The conditions are maintained with an oxygen transfer rate of approximately 2 mmol L⁻¹ h⁻¹, and dissolved oxygen concentration remains constant at 0% of air saturation. Single colony isolates from the final time point are characterized in two shake flask assays. In the first assay, the isolates are characterized for their ability to ferment glucose to ethanol in the presence of 25 g/L total lactic acid with no pH adjustment in the defined medium. In the second assay, the growth rate of the isolates is measured in the presence of 45 g/L of total lactic acid, with no pH adjustment in the defined medium. Strain P-1 is a single isolate exhibiting the highest glucose consumption rate in the first assay and the highest growth rate in the second assay.
[0091] P-2. Strain P-1 is transformed with linearized integration fragment P2 (having nucleotide sequence SEQ ID NO: 1) designed to disrupt the URA3 gene, using a LiOAc transformation method as described by Gietz et al., in Met. Enzymol. 350:87 (2002). Integration fragment P2 includes a MEL5 selection marker gene. Transformants are selected on YNB-melibiose plates and screened by PCR to confirm the integration of the integration fragment and deletion of a copy of the URA3 gene. A URA3-deletant strain is grown for several rounds until PCR screening identifies an isolate in which the MEL5 selection marker gene has looped out. The PCR screening is performed using primers having nucleotide sequences SEQ ID NOs: 47 and 48 to confirm the 5'-crossover and primers having nucleotide sequences SEQ ID NOs: 51 and 52 to confirm the 3'-crossover. That isolate is again grown for several rounds on 5-fluoroorotic acid (FOA) plates to identify a strain in which the URA3 marker has looped out. PCR screening performed on this strain, using primers having nucleotide sequences SEQ ID NOs: 56 and 124, identifies an isolate in which both URA3 alleles have been deleted. This isolate is named strain P-2.
[0092] P-3. Strain P-2 is transformed with integration fragment P3 (having the nucleotide sequence SEQ ID NO: 2), which is designed to disrupt the PDC gene. Integration fragment P3 contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis PDC open reading frame, a PDC transcriptional terminator, the URA3 promoter, the I. orientalis URA3 gene, an additional URA3 promoter direct repeat for marker recycling and a DNA fragment with homology for integration corresponding to the region directly downstream of the I. orientalis PDC open reading frame. A successful integrant (and single-copy PDC deletant) is identified on selection plates lacking uracil and confirmed by PCR using primers having nucleotide sequences SEQ ID NOs: 53 and 54 to confirm the 5'-crossover and primers having nucleotide sequences SEQ ID NOs: 55 and 122 to confirm the 3'-crossover. That integrant is grown for several rounds and plated on 5-fluoroorotic acid (FOA) plates to identify a strain in which the URA3 marker has looped out. Loopout of the URA3 marker is confirmed by PCR. That strain is again transformed with integration fragment P3 (SEQ ID NO: 2) to delete the second copy of the native PDC gene. A successful transformant is again identified by selection on selection plates lacking uracil, and further confirmed by culturing the strain over two days and measuring ethanol production. Lack of ethanol production further demonstrates a successful deletion of both copies of the PDC gene in a transformant. That transformant is grown for several rounds and plated on FOA plates until PCR identifies a strain in which the URA3 marker has looped out. The PCR screening is performed using primers having nucleotide sequences SEQ ID NOs: 53 and 54 to confirm the 5'-crossover and SEQ ID NOs: 55 and 122 to confirm the 3'-crossover. That strain is plated on selection plates lacking uracil to confirm the loss of the URA3 marker, and is designated strain P-3.
[0093] P-4. Integration fragment P4-1, having nucleotide sequence SEQ ID NO: 3, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis ADH9091 open reading frame, an I. orientalis PDC1 promoter, the I. orientalis PYC gene (having the nucleotide sequence SEQ ID NO: 4), the I. orientalis TAL terminator, the I. orientalis URA3 promoter, and the first 530 bp of the I. orientalis URA3 open reading frame.
[0094] Integration fragment P4-2, having nucleotide sequence SEQ ID NO: 5, contains the following elements, 5' to 3': a DNA fragment corresponding to the last 568 bp of the I. orientalis URA3 open reading frame, the I. orientalis URA3 terminator, the I. orientalis URA3 promoter, the I. orientalis TDH3 promoter, the S. pombe MAE gene (having the nucleotide sequence SEQ ID NO: 6), the I. orientalis TKL terminator, and a DNA fragment with homology for integration corresponding to the region immediately downstream of the I. orientalis ADH9091 open reading frame.
[0095] Strain P-3 is transformed simultaneously with integration fragments P4-1 and P4-2, using lithium acetate methods, to insert the I. orientalis PYC gene and the S. pombe MAE gene at the ADH9091 locus. Integration occurs via three cross-over events: in the regions of the ADH9091 upstream homology, in the regions of the ADH9091 downstream homology and in the region of URA3 homology between SEQ ID NO: 3 and SEQ ID NO: 5. Transformants are streaked to isolates and the correct integration of the cassette at the ADH9091 locus is confirmed in a strain by PCR. The PCR screening is performed using primers having nucleotide sequences SEQ ID NOs: 65 and 69 to confirm the 5'-crossover and SEQ ID NOs: 67 and 71 to confirm the 3'-crossover. That strain is grown and plated on FOA as before until the loopout of the URA3 marker from an isolate is confirmed by PCR.
[0096] That isolate is then transformed simultaneously with integration fragments P4-3 and P4-4 using LiOAc transformation methods, to insert a second copy of each of the I. orientalis PYC gene and the S. pombe MAE gene at the ADH9091 locus.
[0097] Integration fragment P4-3, having the nucleotide sequence SEQ ID NO: 7, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately downstream of the I. orientalis ADH9091 open reading frame, an I. orientalis PDC1 promoter, the I. orientalis PYC gene as found in SEQ ID NO: 4, the I. orientalis TAL terminator, the I. orientalis URA3 promoter, and the first 530 bp of the I. orientalis URA3 open reading frame.
[0098] Integration fragment P4-4, having the nucleotide sequence SEQ ID NO: 8, contains the following elements, 5' to 3': a DNA fragment corresponding to the last 568 bp of the I. orientalis URA3 open reading frame, the I. orientalis URA3 terminator, the I. orientalis URA3 promoter, the I. orientalis TDH3 promoter, the S. pombe MAE gene (having the nucleotide sequence SEQ ID NO: 6), the I. orientalis TKL terminator, and a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis ADH9091 open reading frame.
[0099] Integration again occurs via three crossover events. Transformants are streaked to isolates and screened by PCR to identify a strain containing both copies of the I. orientalis PYC and S. pombe MAE genes at the ADH9091 locus. The PCR screening to confirm the first copy is performed using primers having nucleotide sequences SEQ ID NOs: 65 and 69 to confirm the 5'-crossover and SEQ ID NOs: 67 and 71 to confirm the 3'-crossover. The PCR screening to confirm the second copy is performed using primers having nucleotide sequences SEQ ID NOs: 65 and 67 to confirm the 5'-crossover and SEQ ID NOs: 69 and 71 to confirm the 3'-crossover. That strain is grown and replated on FOA until a strain in which the URA3 marker has looped out is identified. That strain is designated strain P-4.
[0100] P-5. Strain P-4 is transformed with integration fragment P5-1 (having the nucleotide sequence SEQ ID NO: 9) using LiOAc transformation methods as described in previous examples, to integrate the L. mexicana FRD gene at the locus of the native CYB2b open reading frame. The integration fragment P5-1 contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately downstream of the I. orientalis CYB2b open reading frame, an I. orientalis PDC1 promoter, the L. mexicana FRD gene (having nucleotide sequence SEQ ID NO: 10, and encoding for an enzyme having amino acid sequence SEQ ID NO: 82), the I. orientalis PDC1 terminator, the I. orientalis URA3 promoter, gene, and terminator in succession, followed by an additional URA3 promoter which serves as a direct repeat for marker recycling, and a region immediately upstream of the I. orientalis CYB2b open reading frame.
[0101] Successful integration of a single copy of the L. mexicana FRD gene in one isolate is identified by selection on selection plates lacking uracil and confirmed by PCR. The PCR screening is performed using primers having nucleotide sequences SEQ ID NOs: 72 and 73 to confirm the 5'-crossover and SEQ ID NOs: 69 and 79 to confirm the 3'-crossover. That isolate is grown and plated on FOA as before until a strain in which the URA3 marker has looped out is identified by PCR. That isolate is transformed with integration fragment P5-2 in the same manner as before, to integrate a second copy of the L. mexicana FRD gene at the native locus of the CYB2b open reading frame.
[0102] Integration fragment P5-2 (having nucleotide sequence SEQ ID NO: 11), contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis CYB2b open reading frame, an I. orientalis PDC1 promoter, the L. mexicana FRD gene (having the nucleotide sequence SEQ ID NO: 10), the I. orientalis PDC1 terminator, the I. orientalis URA3 promoter, gene, and terminator in succession, followed by an additional URA3 promoter which serves as a direct repeat for marker recycling, and a region immediately downstream of the I. orientalis CYB2b open reading frame.
[0103] Correct integration of the second copy of the L. mexicana FRD gene in one isolate is confirmed by PCR using primers having nucleotide sequences SEQ ID NOs: 69 and 73 to confirm the 5'-crossover and SEQ ID NOs: 72 and 79 to confirm the 3'-crossover. Retention of the first integration is reconfirmed by repeating the PCR reactions used to verify proper integration of fragment P5-1 above. The confirmed isolate is grown and plated on FOA as before until the loop out of the URA3 marker is confirmed by PCR in one isolate. That isolate is designated strain P-5.
[0104] P-6. Integration fragment P6-1 (having nucleotide sequence SEQ ID NO: 12) contains the Rhizopus delemar MDH (RdMDH) gene (having the nucleotide sequence SEQ ID NO: 13), an ADHb upstream integration arm, ENO promoter, RKI terminator, URA3 promoter and the first 583 base pairs of the URA3 marker.
[0105] Integration fragment P6-2 (having nucleotide sequence SEQ ID NO: 14) contains the Actinobacillus succinogenes FUM (AsFUM) gene (nucleotide sequence SEQ ID NO: 15), the last 568 base pairs of the URA3 marker, URA3 promoter, PGK promoter, PDC terminator and ADHb downstream integration arm.
[0106] Strain P-5 is simultaneously transformed with each of integration fragments P6-1 and P6-2 using the standard lithium acetate process described before. Successful transformants are identified by PCR, and grown and plated until a strain in which the URA3 marker has looped out is identified as before. This strain is designated as strain P-6.
[0107] Second Rhizopus delemar MDH integration fragment P6-3 (having the nucleotide sequence SEQ ID NO: 16) contains the Rhizopus delemar MDH gene (having nucleotide sequence SEQ ID NO: 13), ADHb downstream integration arm, ENO promoter, RKI terminator, URA3 promoter and the first 583 base pairs of the URA3 marker.
[0108] Second A. succinogenes FUM (AsFUM) integration fragment P6-4 (having nucleotide sequence SEQ ID NO: 17) contains the truncated AsFUM gene (nucleotide sequence SEQ ID NO: 15), the last 568 base pairs of the URA3 marker, URA3 promoter, PGK promoter, PDC terminator and ADHb upstream integration arm.
[0109] Strain P-6 is simultaneously transformed with integration fragments P6-3 and P6-4, using the standard lithium acetate process described before. Successful transformants are identified by PCR, and grown and plated on FOA as before until a strain in which the URA3 marker has looped out is identified. This strain is designated as strain P-7.
TABLE-US-00001 TABLE 1 Preparatory Strains P-1 through P-7
Strain name | Description | Parent strain
P-1 | Organic acid tolerant I. orientalis isolate | Wild-type
P-2 | URA3 deletion (2) | P-1
P-3 | URA3 deletion (2); PDC deletion (2) | P-2
P-4 | URA3 deletion (2); PDC deletion (2); I. orientalis PYC1 insertion at ADHa (2); S. pombe MAE insertion at ADHa (2) | P-3
P-5 | URA3 deletion (2); PDC deletion (2); I. orientalis PYC1 insertion at ADHa (2); S. pombe MAE insertion at ADHa (2); L. mexicana FRD insertion at CYB2b (2) | P-4
P-6 | URA3 deletion (2); PDC deletion (2); I. orientalis PYC1 insertion at ADHa (2); S. pombe MAE insertion at ADHa (2); L. mexicana FRD insertion at CYB2b (2); R. delemar MDH insertion at ADHb (1); A. succinogenes FUM insertion at ADHb (1) | P-5
P-7 | URA3 deletion (2); PDC deletion (2); I. orientalis PYC1 insertion at ADHa (2); S. pombe MAE insertion at ADHa (2); L. mexicana FRD insertion at CYB2b (2); R. delemar MDH insertion at ADHb (2); A. succinogenes FUM insertion at ADHb (2) | P-6
Examples 1-9: Integration of Soluble Transhydrogenase
[0110] General procedure for producing Examples 1-9: The host strain (as indicated in Table 2 below) is simultaneously transformed with each of two integration fragments, as indicated in Table 2 below, using the standard lithium acetate process described before. The integration fragments are designed for targeted insertion at the native MAE1 gene of the host strain. Integration occurs via three cross-over events: the MAE1 upstream homology, the MAE1 downstream homology and homology between portions of the URA3 gene that are present in each of the integration fragments. Transformants are streaked to isolates and the correct integration of the cassette at the MAE1 locus is confirmed by PCR using primers having nucleotide sequences SEQ ID NOs: 80 and 81 to confirm the 5'-crossover and SEQ ID NOs: 85 and 126 to confirm the 3'-crossover. That strain is grown and plated on FOA as before until the loopout of the URA3 marker from an isolate is confirmed by PCR.
[0111] The integration fragments used to produce strain Examples 1-9 are as follows:
[0112] Integration Fragment 1A: Left Hand Integration Fragment--Marker Only
[0113] Integration fragment 1A, having the nucleotide sequence SEQ ID NO: 18, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis MAE1 open reading frame, an I. orientalis PDC1 promoter, the I. orientalis TAL terminator, the I. orientalis ENO promoter, I. orientalis RKI terminator, URA3 promoter, and the first 582 bp of the I. orientalis URA3 open reading frame.
[0114] Integration Fragment 1B: Right Hand Integration Fragment--Marker Only
[0115] Integration fragment 1B, having the nucleotide sequence SEQ ID NO: 19, contains the following elements, 5' to 3': a DNA fragment corresponding to the last 567 bp of the I. orientalis URA3 open reading frame, the I. orientalis URA3 terminator, the I. orientalis URA3 promoter, the I. orientalis TDH3 promoter, the I. orientalis TKL terminator, the I. orientalis PGK promoter, the I. orientalis PDC terminator and a DNA fragment with homology for integration corresponding to the region immediately downstream of the I. orientalis MAE1 open reading frame.
[0116] Integration Fragment 1C: Left Hand Integration Fragment with the E. coli SthA Gene
[0117] Integration fragment 1C, having the nucleotide sequence SEQ ID NO: 20, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis MAE1 open reading frame, an I. orientalis PDC1 promoter, the I. orientalis TAL terminator, the I. orientalis ENO promoter, the E. coli SthA gene (having nucleotide sequence SEQ ID NO: 21), I. orientalis RKI terminator, URA3 promoter, and the first 582 bp of the I. orientalis URA3 open reading frame.
[0118] Integration Fragment 1D: Right Hand Integration Fragment with the E. coli SthA Gene
[0119] Integration fragment 1D, having nucleotide sequence SEQ ID NO: 22, contains the following elements, 5' to 3': a DNA fragment corresponding to the last 567 bp of the I. orientalis URA3 open reading frame, the I. orientalis URA3 terminator, the I. orientalis URA3 promoter, the I. orientalis TDH3 promoter, the I. orientalis TKL terminator, the I. orientalis PGK promoter, the E. coli SthA gene (having nucleotide sequence SEQ ID NO: 21), the I. orientalis PDC terminator and a DNA fragment with homology for integration corresponding to the region immediately downstream of the I. orientalis MAE1 open reading frame.
[0120] Integration Fragment 1E: Left Hand Integration Fragment with a Codon Optimized E. coli SthA Gene
[0121] Integration fragment 1E, having the nucleotide sequence SEQ ID NO: 23, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis MAE1 open reading frame, an I. orientalis PDC1 promoter, the I. orientalis TAL terminator, the I. orientalis ENO promoter, the codon optimized E. coli SthA gene (having nucleotide sequence SEQ ID NO: 24), I. orientalis RKI terminator, URA3 promoter, and the first 582 bp of the I. orientalis URA3 open reading frame.
[0122] Integration Fragment 1F: Right Hand Integration Fragment with the Codon Optimized E. coli SthA Gene
[0123] Integration fragment 1F, having the nucleotide sequence SEQ ID NO: 25, contains the following elements, 5' to 3': a DNA fragment corresponding to the last 567 bp of the I. orientalis URA3 open reading frame, the I. orientalis URA3 terminator, the I. orientalis URA3 promoter, the I. orientalis TDH3 promoter, the I. orientalis TKL terminator, the I. orientalis PGK promoter, the codon-optimized E. coli SthA gene (having nucleotide sequence SEQ ID NO: 24), the I. orientalis PDC terminator and a DNA fragment with homology for integration corresponding to the region immediately downstream of the I. orientalis MAE1 open reading frame.
[0124] Integration Fragment 1G: Left Hand Integration Fragment with the A. vinelandii SthA Gene
[0125] Integration fragment 1G, having the nucleotide sequence SEQ ID NO: 26, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis MAE1 open reading frame, an I. orientalis PDC1 promoter, the I. orientalis TAL terminator, the I. orientalis ENO promoter, the A. vinelandii SthA gene (having nucleotide sequence SEQ ID NO: 27), I. orientalis RKI terminator, URA3 promoter, and the first 582 bp of the I. orientalis URA3 open reading frame.
[0126] Integration Fragment 1H: Right Hand Integration Fragment with the A. vinelandii SthA Gene
[0127] Integration fragment 1H, having the nucleotide sequence SEQ ID NO: 28, contains the following elements, 5' to 3': a DNA fragment corresponding to the last 567 bp of the I. orientalis URA3 open reading frame, the I. orientalis URA3 terminator, the I. orientalis URA3 promoter, the I. orientalis TDH3 promoter, the I. orientalis TKL terminator, the I. orientalis PGK promoter, the A. vinelandii SthA gene (having nucleotide sequence SEQ ID NO: 27), the I. orientalis PDC terminator and a DNA fragment with homology for integration corresponding to the region immediately downstream of the I. orientalis MAE1 open reading frame.
[0128] Integration Fragment 1I: Left Hand Integration Fragment with the S. cerevisiae Stb5p Gene
[0129] Integration fragment 1I, having nucleotide sequence SEQ ID NO: 29, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis MAE1 open reading frame, an I. orientalis PDC1 promoter, the S. cerevisiae Stb5p gene (having nucleotide sequence SEQ ID NO: 30), the I. orientalis TAL terminator, the I. orientalis ENO promoter, the I. orientalis RKI terminator, URA3 promoter, and the first 582 bp of the I. orientalis URA3 open reading frame.
TABLE-US-00002 TABLE 2 I. orientalis Insertion Strains
Designation | Description (in addition to those in strain P-7) | Integration Fragments | Parent strain
Ex. 1 | E. coli SthA insertion at MAE1 (1) | 1C/1B | P-7
Ex. 2 | E. coli SthA insertion at MAE1 (2) | 1D/1A | Ex. 1
Ex. 3 | A. vinelandii SthA insertion at MAE1 (1) | 1G/1B | P-7
Ex. 4 | A. vinelandii SthA insertion at MAE1 (2) | 1H/1A | Ex. 3
Ex. 5 | Codon optimized E. coli SthA insertion at MAE1 (1) | 1E/1B | P-7
Ex. 6 | Codon optimized E. coli SthA insertion at MAE1 (2) | 1F/1A | Ex. 5
P-8 | S. cerevisiae Stb5p insertion at MAE1 (1) | 1I/1B | P-7
Ex. 7 | S. cerevisiae Stb5p insertion at MAE1 (1); E. coli SthA insertion at MAE1 (1) | 1I/1D | P-8
Ex. 8 | S. cerevisiae Stb5p insertion at MAE1 (1); Codon optimized E. coli SthA insertion at MAE1 (1) | 1I/1F | P-8
Ex. 9 | S. cerevisiae Stb5p insertion at MAE1 (1); A. vinelandii SthA insertion at MAE1 (1) | 1I/1H | P-8
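For bookkeeping purposes only, the parent-strain relationships recited in Tables 1 and 2 can be captured in a small lookup structure and walked back to strain P-1. The following Python sketch is one hypothetical way of doing so. The dictionary and function names are assumptions, and the entries simply transcribe the parent-strain columns of the two tables.

```python
# Sketch only: parent-strain relationships transcribed from Tables 1 and 2.
# The names PARENT and lineage are hypothetical, chosen for illustration.
PARENT = {
    # Table 1 (preparatory strains)
    "P-2": "P-1", "P-3": "P-2", "P-4": "P-3",
    "P-5": "P-4", "P-6": "P-5", "P-7": "P-6",
    # Table 2 (insertion strains)
    "Ex. 1": "P-7", "Ex. 2": "Ex. 1",
    "Ex. 3": "P-7", "Ex. 4": "Ex. 3",
    "Ex. 5": "P-7", "Ex. 6": "Ex. 5",
    "P-8": "P-7",
    "Ex. 7": "P-8", "Ex. 8": "P-8", "Ex. 9": "P-8",
}


def lineage(strain):
    """Return the chain of strains from `strain` back to P-1, inclusive."""
    if strain != "P-1" and strain not in PARENT:
        raise KeyError(f"unknown strain: {strain}")
    chain = [strain]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


if __name__ == "__main__":
    print(" <- ".join(lineage("Ex. 7")))
```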
Examples 10-63
[0130] Strains P-9 through P-13 are prepared in the same manner as strain P-7, except the L. mexicana FRD gene in each case has been mutated to render it NADPH-dependent. In each case, the L. mexicana FRD gene having the nucleotide sequence SEQ ID NO: 10 is used as a template to modify the coding sequence to introduce substitutions of amino acid residues of the putative NADH binding domain of the enzyme.
[0131] The FRD gene used to prepare strain P-9 is prepared by performing site-directed substitutions at amino acids 219 (glutamic acid) and 220 (tryptophan) to produce a mutated gene having the nucleotide sequence SEQ ID NO: 32.
[0132] The FRD gene used to prepare strain P-10 is prepared by performing a site-directed substitution at amino acid 417 (glutamic acid) to produce a mutated gene having nucleotide sequence SEQ ID NO: 33.
[0133] The FRD gene used to prepare strain P-11 is prepared by performing a site-directed substitution at amino acid 641 (aspartic acid) to produce a mutated gene having nucleotide sequence SEQ ID NO: 34.
[0134] The FRD gene used to prepare strain P-12 is prepared by performing site-directed substitutions at amino acids 861 (glutamic acid) and 862 (cysteine) to produce a mutated gene having nucleotide sequence SEQ ID NO: 35.
[0135] The FRD gene used to prepare strain P-13 is prepared by performing site-directed substitutions at amino acids 1035 (aspartic acid) and 1036 (serine) to produce a mutated gene having nucleotide sequence SEQ ID NO: 36.
[0136] The FRD gene used to prepare strain P-14 is prepared by performing a site-directed substitution at amino acid 411 of a T. brucei FRD gene having SEQ ID NO: 42 to produce a mutated gene having nucleotide sequence SEQ ID NO: 37.
[0137] Examples 10-18 are made in the same manner as Examples 1-9, respectively, except Examples 10-18 are made starting from strain P-9 instead of strain P-7. Examples 10-18 correspond to Examples 1-9, respectively, except the FRD gene in Examples 10-18 is the mutated L. mexicana FRD gene having the nucleotide sequence SEQ ID NO: 32.
[0138] Examples 19-27 are made in the same manner as Examples 1-9, respectively, except Examples 19-27 are made starting from strain P-10 instead of strain P-7. Examples 19-27 correspond to Examples 1-9, respectively, except the FRD gene in Examples 19-27 is the mutated L. mexicana FRD gene having the nucleotide sequence SEQ ID NO: 33.
[0139] Examples 28-36 are made in the same manner as Examples 1-9, respectively, except Examples 28-36 are made starting from strain P-11 instead of strain P-7. Examples 28-36 correspond to Examples 1-9, respectively, except the FRD gene in Examples 28-36 is the mutated L. mexicana FRD gene having the nucleotide sequence SEQ ID NO: 34.
[0140] Examples 37-45 are made in the same manner as Examples 1-9, respectively, except Examples 37-45 are made starting from strain P-12 instead of strain P-7. Examples 37-45 correspond to Examples 1-9, respectively, except the FRD gene in Examples 37-45 is the mutated L. mexicana FRD gene having the nucleotide sequence SEQ ID NO: 35.
[0141] Examples 46-54 are made in the same manner as Examples 1-9, respectively, except Examples 46-54 are made starting from strain P-13 instead of strain P-7. Examples 46-54 correspond to Examples 1-9, respectively, except the FRD gene in Examples 46-54 is the mutated L. mexicana FRD gene having the nucleotide sequence SEQ ID NO: 36.
[0142] Examples 55-63 are made in the same manner as Examples 1-9, respectively, except Examples 55-63 are made starting from strain P-14 instead of strain P-7. Examples 55-63 correspond to Examples 1-9, respectively, except the FRD gene in Examples 55-63 is the mutated T. brucei FRD gene having the nucleotide sequence SEQ ID NO: 37.
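The numbering of Examples 1-63 follows a regular pattern: each successive block of nine examples repeats the constructions of Examples 1-9 on a different starting strain carrying a different FRD gene. The following Python sketch decodes an example number under that reading. The function name and the returned field names are hypothetical. The starting strains and FRD SEQ ID NOs are those recited above.

```python
# Sketch only: decode an example number in the range 1-63.
# Each block of nine examples uses one starting strain and one FRD gene.
STARTING_STRAINS = ["P-7", "P-9", "P-10", "P-11", "P-12", "P-13", "P-14"]
FRD_SEQ_ID_NOS = [10, 32, 33, 34, 35, 36, 37]  # SEQ ID NO of the FRD gene per block


def decode_example(n):
    """Map example n (1-63) to its starting strain, FRD gene and base example."""
    if not 1 <= n <= 63:
        raise ValueError("example number must be between 1 and 63")
    block = (n - 1) // 9      # 0 for Examples 1-9, 1 for Examples 10-18, ...
    base = (n - 1) % 9 + 1    # corresponding example among Examples 1-9
    return {
        "starting_strain": STARTING_STRAINS[block],
        "frd_seq_id_no": FRD_SEQ_ID_NOS[block],
        "corresponds_to_example": base,
    }


if __name__ == "__main__":
    for n in (1, 10, 27, 63):
        print(n, decode_example(n))
```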
Examples 64-126--Deletion of Native GPD Gene
[0143] Examples 1-63 each are transformed with an integration fragment having nucleotide sequence SEQ ID NO: 38 using lithium acetate methods as described before. This integration fragment contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis GPD1 open reading frame, a PDC transcriptional terminator, the URA3 promoter, the I. orientalis URA3 gene, a URA3 terminator, an additional URA3 promoter direct repeat for marker recycling and a DNA fragment with homology for integration corresponding to the region directly downstream of the I. orientalis GPD1 open reading frame. Successful transformants are selected on selection plates lacking uracil, confirmed by PCR using primers having nucleotide sequences SEQ ID NOs: 129 and 130 to confirm the 5'-crossover and SEQ ID NOs: 131 and 132 to confirm the 3'-crossover, and are then grown and plated on FOA as before until a strain in which the URA3 marker has looped out is identified. This strain is then transformed with an integration fragment having nucleotide sequence SEQ ID NO: 39. This integration fragment contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis GPD1 open reading frame, the URA3 promoter, the I. orientalis URA3 gene, a URA3 terminator, an additional URA3 promoter direct repeat for marker recycling, a PDC transcriptional terminator, and a DNA fragment with homology for integration corresponding to the region directly downstream of the I. orientalis GPD1 open reading frame. Successful transformants are again selected on selection plates lacking uracil and confirmed by PCR, using primers having nucleotide sequences SEQ ID NOs: 130 and 132 to confirm the 5'-crossover and SEQ ID NOs: 129 and 131 to confirm the 3'-crossover. Retention of the first GPD1 deletion construct is also reconfirmed by repeating the PCR reactions used to verify proper integration of SEQ ID NO: 38 above. Confirmed isolates are grown and plated on FOA as before until a strain in which the URA3 marker has looped out is identified. The strains resulting from the transformations of Examples 1-63 are designated Examples 64-126, respectively.
Examples 127-252--Deletion of Native PGI Gene
[0144] Integration fragment 5-1 (having SEQ ID NO: 40), for the deletion of the first copy of the I. orientalis PGI gene, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis PGI open reading frame, a PDC1 transcriptional terminator, the I. orientalis URA3 promoter, gene, and terminator in succession, followed by an additional URA3 promoter which serves as a direct repeat for marker recycling, and a region immediately downstream of the I. orientalis PGI open reading frame.
[0145] Integration fragment 5-2 (having SEQ ID NO: 41), for the deletion of the second copy of the I. orientalis PGI gene, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately downstream of the I. orientalis PGI open reading frame, a PDC1 transcriptional terminator, the I. orientalis URA3 promoter, gene, and terminator in succession, followed by an additional URA3 promoter which serves as a direct repeat for marker recycling, and a region immediately upstream of the I. orientalis PGI open reading frame.
[0146] Examples 1-126 are each transformed with integration fragment 5-1 using the lithium acetate process described before. Successful transformants are selected on PGI deletion selection plates lacking uracil (SC -ura, +20 g/L fructose, +0.5 g/L glucose) incubated for 3-5 days, and are confirmed by PCR using primers having nucleotide sequences SEQ ID NOs: 84 and 85 to confirm the 5'-crossover and SEQ ID NOs: 72 and 86 to confirm the 3'-crossover. Successful transformants are grown and plated on FOA as before until a strain in which the URA3 marker has looped out is identified. In each case, the resulting strain is then transformed with integration fragment 5-2 in the same manner, and successful transformants are selected on PGI deletion selection plates lacking uracil (SC -ura, +20 g/L fructose, +0.5 g/L glucose) incubated for 3-5 days and confirmed by PCR using primers having nucleotide sequences SEQ ID NOs: 72 and 84 to confirm the 5'-crossover and SEQ ID NOs: 85 and 86 to confirm the 3'-crossover. A successful deletant in which the URA3 marker has looped out is again identified as before. The strains resulting from the transformations of Examples 1-126 are designated Examples 127-252, respectively.
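The example numbering used for the two deletion series can be sketched in Python as follows. This is only an illustrative bookkeeping aid: the function names and offset constants are assumptions introduced here, derived from the designations stated in the text (Examples 1-63 become Examples 64-126 after the GPD1 deletion, and Examples 1-126 become Examples 127-252 after the PGI deletion).

```python
# Hypothetical helpers (names are not from the specification) that map a
# parent example number to the derived example number described in the text.

GPD1_OFFSET = 63   # Examples 1-63  -> Examples 64-126 (GPD1 deletion)
PGI_OFFSET = 126   # Examples 1-126 -> Examples 127-252 (PGI deletion)


def gpd1_deletant(parent: int) -> int:
    """Return the example number of the GPD1 deletant of a parent example."""
    if not 1 <= parent <= 63:
        raise ValueError("GPD1 deletion is described only for Examples 1-63")
    return parent + GPD1_OFFSET


def pgi_deletant(parent: int) -> int:
    """Return the example number of the PGI deletant of a parent example."""
    if not 1 <= parent <= 126:
        raise ValueError("PGI deletion is described only for Examples 1-126")
    return parent + PGI_OFFSET


if __name__ == "__main__":
    # Example 5 -> GPD1 deletant -> PGI deletant of that strain
    print(gpd1_deletant(5), pgi_deletant(gpd1_deletant(5)))
```

Under this reading, a strain in the range 191-252 carries both the GPD1 and the PGI deletions, because its parent is itself a GPD1 deletant.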
Shake Flask Evaluation for Succinate Production
[0147] Example 1-1 is streaked out for single colonies on URA selection plates and incubated at 30.degree. C. until single colonies are visible (1-2 days). Cells from plates are scraped into sterile growth medium and the optical density (OD.sub.600) is measured. Optical density is measured at a wavelength of 600 nm with a 1 cm pathlength using a model Genesys20 spectrophotometer (Thermo Scientific). Dry cell mass is calculated from the measured OD.sub.600 value using an experimentally derived conversion factor of 1.7 OD.sub.600 units per 1 g dry cell mass.
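The conversion from optical density to dry cell mass can be illustrated with a minimal Python sketch. The factor of 1.7 OD.sub.600 units per 1 g dry cell mass is taken from the text; the function name, the optional dilution factor, and the reading of the result as grams of dry cell mass per liter of suspension are assumptions made here for illustration only.

```python
# Conversion factor stated in the text: 1.7 OD600 units per 1 g dry cell mass.
OD600_PER_GRAM_DCW = 1.7


def dry_cell_mass(od600: float, dilution_factor: float = 1.0) -> float:
    """Estimate dry cell mass from a measured OD600 reading.

    dilution_factor is a hypothetical convenience parameter for samples that
    were diluted before reading; it is not mentioned in the specification.
    The result is assumed here to be in g dry cell mass per liter.
    """
    if od600 < 0 or dilution_factor <= 0:
        raise ValueError("od600 must be >= 0 and dilution_factor must be > 0")
    return od600 * dilution_factor / OD600_PER_GRAM_DCW


if __name__ == "__main__":
    # A reading of 0.34 on a 10-fold diluted sample
    print(dry_cell_mass(0.34, dilution_factor=10.0))
```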
[0148] A shake flask is inoculated with the cell slurry to reach an initial OD.sub.600 of 0.1-0.3. Prior to inoculation, the 250 mL baffled shake flasks containing 1.75 g/L dry CaCO.sub.3 are sterilized. Immediately prior to inoculating, 50 mL of shake flask medium is added to the dry calcium carbonate. The shake flask medium is a sterilized, pH 5.5 aqueous solution of urea (2.3 g/L), magnesium sulfate heptahydrate (0.5 g/L), potassium phosphate monobasic (3.0 g/L), trace element solution (1 mL/L), vitamin solution (1 mL/L), glucose (120.0 g/L), glycerol (0.1 g/L), and 2-(N-morpholino)ethanesulfonic acid (MES) (4.0 g/L). For strains lacking the URA3 gene (URA-), 20 mg/L uracil is added to the medium. The trace element solution is an aqueous solution of EDTA (15.0 g/L), zinc sulfate heptahydrate (4.5 g/L), manganese chloride dihydrate (1.0 g/L), cobalt(II) chloride hexahydrate (0.3 g/L), copper(II) sulfate pentahydrate (0.3 g/L), disodium molybdate dihydrate (0.4 g/L), calcium chloride dihydrate (4.5 g/L), iron sulfate heptahydrate (3 g/L), boric acid (1.0 g/L), and potassium iodide (0.1 g/L). The vitamin solution is an aqueous solution of biotin (D-; 0.05 g/L), calcium pantothenate (D+; 1 g/L), nicotinic acid (5 g/L), myo-inositol (25 g/L), pyridoxine hydrochloride (1 g/L), and p-aminobenzoic acid (0.2 g/L).
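The per-flask quantities implied by this paragraph can be worked out with a short Python sketch. It assumes that the stated CaCO.sub.3 loading (1.75 g/L) is expressed per liter of the 50 mL of medium added, and that OD.sub.600 scales linearly on dilution; both are assumptions made for illustration, and the function names are hypothetical.

```python
FLASK_MEDIUM_ML = 50.0   # medium volume per flask, from the text
CACO3_G_PER_L = 1.75     # CaCO3 loading, from the text


def caco3_mass_g(medium_ml: float = FLASK_MEDIUM_ML,
                 caco3_g_per_l: float = CACO3_G_PER_L) -> float:
    """Mass of dry CaCO3 per flask, assuming the loading is per liter of medium."""
    return caco3_g_per_l * medium_ml / 1000.0


def slurry_volume_ml(slurry_od600: float, target_od600: float,
                     medium_ml: float = FLASK_MEDIUM_ML) -> float:
    """Volume of cell slurry to add so the flask starts at target_od600.

    Solves slurry_od600 * v / (medium_ml + v) = target_od600 for v,
    assuming OD600 is linear in cell concentration.
    """
    if not 0 < target_od600 < slurry_od600:
        raise ValueError("target OD600 must be positive and below the slurry OD600")
    return target_od600 * medium_ml / (slurry_od600 - target_od600)


if __name__ == "__main__":
    print(caco3_mass_g())                  # grams of CaCO3 per flask
    print(slurry_volume_ml(20.0, 0.2))     # mL of a slurry at OD600 = 20
```

The slurry OD.sub.600 of 20 in the usage lines is an arbitrary illustrative value, not a figure from the specification.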
[0149] The inoculated flask is incubated at 30.degree. C. with shaking at 150 rpm for 72 hours and then sampled for analysis. Succinate concentration in the broth at the end of the 72-hour fermentation is determined by gas chromatography with a flame ionization detector, and glucose concentration by high performance liquid chromatography with a refractive index detector.
[0150] Examples 2 through 252 are cultured in shake flasks in a similar manner and found to produce succinate. The succinate concentration in the broth is measured and yield and titer are calculated.
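The text does not spell out how yield and titer are calculated. One common convention, shown here only as an assumption, takes titer as the final succinate concentration in the broth (g/L) and yield as grams of succinate produced per gram of glucose consumed. A minimal Python sketch under that assumption (function name hypothetical):

```python
def titer_and_yield(succinate_g_per_l: float,
                    glucose_initial_g_per_l: float,
                    glucose_final_g_per_l: float) -> tuple[float, float]:
    """Return (titer in g/L, yield in g succinate per g glucose consumed).

    Assumed definitions, not taken from the specification:
      titer = final succinate concentration
      yield = succinate produced / glucose consumed
    """
    consumed = glucose_initial_g_per_l - glucose_final_g_per_l
    if consumed <= 0:
        raise ValueError("no glucose consumed; yield is undefined")
    return succinate_g_per_l, succinate_g_per_l / consumed


if __name__ == "__main__":
    # Illustrative numbers only: 60 g/L succinate from 120 g/L glucose, fully consumed
    print(titer_and_yield(60.0, 120.0, 0.0))
```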
Examples 253-255
[0151] Integration fragment P6-2a (having nucleotide sequence SEQ ID NO: 116) contains the I. orientalis FUM (IoFUM) gene (nucleotide sequence SEQ ID NO: 70), the last 568 base pairs of the URA3 marker, URA3 promoter, PGK promoter, PDC terminator and ADHb downstream integration arm.
[0152] Integration fragment P6-4a (having nucleotide sequence SEQ ID NO: 125) contains the I. orientalis FUM (IoFUM) gene (nucleotide sequence SEQ ID NO: 70), the last 568 base pairs of the URA3 marker, URA3 promoter, PGK promoter, PDC terminator and ADHb upstream integration arm.
[0153] Strain P-5 is simultaneously transformed with each of integration fragments P6-1 and P6-2a using the standard lithium acetate process described before. Successful transformants are identified by PCR, and the transformants are grown and plated on 5FOA plates until a strain in which the URA3 marker has looped out is identified as before. This strain is designated strain P-6a.
[0154] Strain P-6a is simultaneously transformed with each of integration fragments P6-3 and P6-4a using the standard lithium acetate process described before. Successful transformants are identified by PCR, and the transformants are grown and plated on 5FOA plates until a strain in which the URA3 marker has looped out is identified as before. This strain is designated strain P-7a.
[0155] Strain P-7a is transformed with integration fragments having nucleotide sequences SEQ ID NO: 38 and SEQ ID NO: 39, deleting the GPD gene as described with respect to Examples 64-126 above. The resulting strain is named P-8a. Strain P-8a is grown and plated on 5FOA plates until a strain in which the URA3 marker has looped out is identified as before. The resulting strain is named P-8b.
Construction of Strains 253, 254, and 255
[0156] Integration fragment 6-1, having nucleotide sequence SEQ ID NO: 133, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately downstream of the I. orientalis GPD1 open reading frame, an I. orientalis ENO1 promoter, the E. coli SthA gene (having the nucleotide sequence SEQ ID NO: 24), the I. orientalis PDC terminator, a LoxP site, the I. orientalis PGK promoter, the S. cerevisiae MEL5 gene and terminator (having the nucleotide sequence SEQ ID NO: 134), another LoxP site, and a DNA fragment with homology for integration corresponding to the region directly upstream of the I. orientalis GPD1 open reading frame.
[0157] Integration fragment 6-2, having nucleotide sequence SEQ ID NO: 135, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis GPD1 open reading frame, an I. orientalis ENO1 promoter, the E. coli SthA gene (having the nucleotide sequence SEQ ID NO: 24), the I. orientalis PDC terminator, the URA3 promoter, the I. orientalis URA3 gene, an additional URA3 promoter direct repeat for marker recycling and a DNA fragment with homology for integration corresponding to the region directly downstream of the I. orientalis GPD1 open reading frame.
[0158] Integration fragment 6-3, having nucleotide sequence SEQ ID NO: 136, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately downstream of the I. orientalis GPD1 open reading frame, an I. orientalis ENO1 promoter, the A. vinelandii SthA gene (having the nucleotide sequence SEQ ID NO: 27), the I. orientalis PDC terminator, a LoxP site, the I. orientalis PGK promoter, the S. cerevisiae MEL5 gene and terminator (having the nucleotide sequence SEQ ID NO: 134), another LoxP site, and a DNA fragment with homology for integration corresponding to the region directly upstream of the I. orientalis GPD1 open reading frame.
[0159] Integration fragment 6-4, having nucleotide sequence SEQ ID NO: 137, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis GPD1 open reading frame, an I. orientalis ENO1 promoter, the A. vinelandii SthA gene (having the nucleotide sequence SEQ ID NO: 27), the I. orientalis PDC terminator, the URA3 promoter, the I. orientalis URA3 gene, an additional URA3 promoter direct repeat for marker recycling and a DNA fragment with homology for integration corresponding to the region directly downstream of the I. orientalis GPD1 open reading frame.
[0160] Integration fragment 6-5, having nucleotide sequence SEQ ID NO: 138, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately downstream of the I. orientalis GPD1 open reading frame, an I. orientalis ENO1 promoter, the P. fluorescens SthA gene (having the nucleotide sequence SEQ ID NO: 139), the I. orientalis PDC terminator, a LoxP site, the I. orientalis PGK promoter, the S. cerevisiae MEL5 gene and terminator (having the nucleotide sequence SEQ ID NO: 134), another LoxP site, and a DNA fragment with homology for integration corresponding to the region directly upstream of the I. orientalis GPD1 open reading frame.
[0161] Integration fragment 6-6, having nucleotide sequence SEQ ID NO: 140, contains the following elements, 5' to 3': a DNA fragment with homology for integration corresponding to the region immediately upstream of the I. orientalis GPD1 open reading frame, an I. orientalis ENO1 promoter, the P. fluorescens SthA gene (having the nucleotide sequence SEQ ID NO: 139), the I. orientalis PDC terminator, the URA3 promoter, the I. orientalis URA3 gene, an additional URA3 promoter direct repeat for marker recycling and a DNA fragment with homology for integration corresponding to the region directly downstream of the I. orientalis GPD1 open reading frame.
[0162] Examples 253, 254 and 255 are constructed in the following manner. Strain P-8b is co-transformed with the integration fragments listed in the second column of Table 3. Successful integrants in each case are identified as blue colonies on selection plates with 5-bromo-4-chloro-3-indolyl-alpha-D-galactopyranoside and lacking uracil, and confirmed by PCR. PCR oligos used to test the 3' and 5' crossovers of each integration fragment are listed in the third through sixth columns of Table 3. In each case, successful transformants are grown for several rounds and plated on 5-fluoroorotic acid (FOA) plates to identify a strain in which the URA3 marker has looped out. Loopout of the URA3 marker is confirmed by PCR.
TABLE 3
Strain name | Integration fragments | 1st integration 3' crossover oligos (SEQ ID NOs) | 1st integration 5' crossover oligos (SEQ ID NOs) | 2nd integration 3' crossover oligos (SEQ ID NOs) | 2nd integration 5' crossover oligos (SEQ ID NOs)
Example 253 | 6-1 and 6-2 | 130 and 145 | 131 and 143 | 130 and 143 | 131 and 144
Example 254 | 6-3 and 6-4 | 130 and 145 | 131 and 141 | 130 and 141 | 131 and 144
Example 255 | 6-5 and 6-6 | 130 and 145 | 131 and 142 | 130 and 142 | 131 and 144
[0163] Table 4 summarizes the genetic modifications to Strains 253, 254 and 255 (relative to the wild-type strain):
TABLE 4 Strains 253, 254 and 255
Strain name | Description
253 | URA deletion (2); PDC deletion (2); I. orientalis PYC1 insertion at ADHa (2); S. pombe MAE insertion at ADHa (2); L. mexicana FRD insertion at CYB2b (2); R. delemar MDH insertion at ADHb (2); I. orientalis FUM insertion at ADHb (2); GPD deletion; E. coli SthA insertion at GPD (2)
254 | URA deletion (2); PDC deletion (2); I. orientalis PYC1 insertion at ADHa (2); S. pombe MAE insertion at ADHa (2); L. mexicana FRD insertion at CYB2b (2); R. delemar MDH insertion at ADHb (2); I. orientalis FUM insertion at ADHb (2); GPD deletion; A. vinelandii SthA insertion at GPD (2)
255 | URA deletion (2); PDC deletion (2); I. orientalis PYC1 insertion at ADHa (2); S. pombe MAE insertion at ADHa (2); L. mexicana FRD insertion at CYB2b (2); R. delemar MDH insertion at ADHb (2); I. orientalis FUM insertion at ADHb (2); GPD deletion; P. fluorescens SthA insertion at GPD (2)
Shake Flask Evaluation for Succinate Production for Strains 253-255
[0164] Strains P-8a, 253, 254 and 255 are separately evaluated for succinate production. In each case, the strain is streaked out for single colonies on plates lacking uracil and incubated at 30.degree. C. until single colonies are visible (1-2 days). Cells from plates are scraped into sterile growth medium and the optical density (OD.sub.600) is measured. Optical density is measured at a wavelength of 600 nm with a 1 cm pathlength using a model Genesys20 spectrophotometer (Thermo Scientific). Dry cell mass is calculated from the measured OD.sub.600 value using an experimentally derived conversion factor of 1.7 OD.sub.600 units per 1 g dry cell mass.
[0165] A shake flask is inoculated with the cell slurry to reach an initial OD.sub.600 of 0.1-0.3. Prior to inoculation, the 250 mL baffled shake flasks containing 1.28 g/L dry CaCO.sub.3 are sterilized. Immediately prior to inoculating, 50 mL of shake flask medium is added to the dry calcium carbonate. The shake flask medium is a sterilized, pH 4.5 aqueous solution of urea (2.3 g/L), magnesium sulfate heptahydrate (0.5 g/L), potassium phosphate monobasic (3.0 g/L), trace element solution (1 mL/L), vitamin solution (1 mL/L), glucose (120.0 g/L), glycerol (0.1 g/L), and 2-(N-morpholino)ethanesulfonic acid (MES) (4.0 g/L). The trace element solution is an aqueous solution of EDTA (15.0 g/L), zinc sulfate heptahydrate (4.5 g/L), manganese chloride dihydrate (1.0 g/L), cobalt(II) chloride hexahydrate (0.3 g/L), copper(II) sulfate pentahydrate (0.3 g/L), disodium molybdate dihydrate (0.4 g/L), calcium chloride dihydrate (4.5 g/L), iron sulfate heptahydrate (3 g/L), boric acid (1.0 g/L), and potassium iodide (0.1 g/L). The vitamin solution is an aqueous solution of biotin (D-; 0.05 g/L), calcium pantothenate (D+; 1 g/L), nicotinic acid (5 g/L), myo-inositol (25 g/L), pyridoxine hydrochloride (1 g/L), and p-aminobenzoic acid (0.2 g/L).
[0166] The inoculated flask is incubated at 30.degree. C. with shaking at 150 rpm for 96 hours and then sampled for analysis. Succinate and glucose concentrations in the broth at the end of the 96-hour fermentation are determined by high performance liquid chromatography with a refractive index detector. Results are as indicated in Table 5:
TABLE 5
Strain | Glucose after 96 hr, g/L | Average glucose consumption rate, g/L/hr | Succinate after 96 hr, g/L | Average succinate production rate, g/L/hr
P-8a | 5.46 | 1.190 | 57.6 | 0.60
253 | 6.20 | 1.185 | 88.4 | 0.92
255 | 6.10 | 1.186 | 84.1 | 0.88
257 | 8.58 | 1.161 | 89.0 | 0.93
[0167] As can be seen from the data in Table 5, all strains produce succinate. However, Examples 253-255 produce more succinate than Strain P-8a, and do so at an average rate roughly 50% greater (0.88 to 0.93 g/L/hr versus 0.60 g/L/hr).
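The average succinate production rates in Table 5 are consistent with dividing the 96-hour succinate concentration by the 96-hour fermentation time. The following Python sketch reproduces that arithmetic and the relative increase over Strain P-8a. The dictionary keys are the strain labels as printed in Table 5; the function names are hypothetical, and the calculation is an assumed reconstruction rather than a method stated in the text.

```python
FERMENTATION_HOURS = 96.0

# Succinate after 96 hr (g/L), keyed by the strain labels as printed in Table 5.
SUCCINATE_G_PER_L = {"P-8a": 57.6, "253": 88.4, "255": 84.1, "257": 89.0}


def average_rate(conc_g_per_l: float, hours: float = FERMENTATION_HOURS) -> float:
    """Average volumetric production rate in g/L/hr."""
    return conc_g_per_l / hours


def relative_increase(strain: str, control: str = "P-8a") -> float:
    """Fractional increase in average rate of a strain over the control strain."""
    return (average_rate(SUCCINATE_G_PER_L[strain])
            / average_rate(SUCCINATE_G_PER_L[control]) - 1.0)


if __name__ == "__main__":
    for name, conc in SUCCINATE_G_PER_L.items():
        print(name, round(average_rate(conc), 2))
    for name in ("253", "255", "257"):
        print(name, f"{relative_increase(name):.0%} over P-8a")
```

Because all flasks run for the same 96 hours, the relative increase in rate equals the relative increase in final succinate concentration.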
Example 256
[0168] The URA3 gene is deleted from a wild type strain of S. cerevisiae (CEN-PK 113-7D) to create a strain with a uracil auxotrophy. This strain is called S-1.
[0169] Ethanol production is eliminated in S-1 by deletion of the three PDC genes (PDC1, PDC5, and PDC6), using conventional methods, to produce a strain (S-2) that does not produce ethanol. A pathway from pyruvate to succinate is introduced into strain S-2, producing strain S-3, by the integration of the following exogenous genes driven by strong promoters: the I. orientalis PYC gene, the R. delemar MDH gene, the I. orientalis FUM (fumarase) gene, the L. mexicana FRD gene, and the S. pombe MAE gene. The various promoters include the S. cerevisiae CYC1 promoter, the S. cerevisiae ADH1 promoter and the S. cerevisiae GPD1 promoter.
[0170] Strain S-3 is transformed with the E. coli soluble transhydrogenase (SthA) gene (SEQ ID NO: 21) under the control of the S. cerevisiae CYC1 promoter. The resulting strain (which still is prototrophic for uracil) is called S-4. Strain S-4 cannot produce ethanol, has an active metabolic pathway to succinate, overexpresses the soluble transhydrogenase enzyme and is prototrophic for uracil.
[0171] After deletion of the PDC genes from S. cerevisiae, it becomes necessary to supplement the growth medium with a C2 carbon source to support growth. Additionally, glucose is known to suppress growth of S. cerevisiae strains lacking adequate PDC activity. Therefore, Strains S-3 and S-4 are grown in a shake flask on a medium containing ethanol as the sole carbon source to a suitable cell density. The cells are collected by centrifugation and the ethanol medium is discarded. The cells are resuspended in a glucose-containing medium in a shake flask and cultivated with stirring under aeration at 30.degree. C., and succinate formation is monitored until glucose depletion. Strain S-4, which exhibits transhydrogenase activity, shows improved succinate production compared with strain S-3, which lacks transhydrogenase activity.
Sequence Listing
Number of sequences: 146
SEQ ID NO: 1; length 4910; DNA; Artificial Sequence; Synthetic - URA 3 gene disruption fragment
misc_feature (1656)..(1656): n is a, c, g, or t
misc_feature (4312)..(4312): n is a, c, g, or t
ctcaaaacta tttaattagt
taattgtata aactgtatgt cattataaac agggaaggtt 60gacattgtct agcggcaatc
attgtctcat ttggttcatt aactttggtt ctgttcttgg 120aaacgggtac caactctctc
agagtgcttc aaaaattttt cagcacattt ggttagacat 180gaactttctc tgctggttaa
ggattcagag gtgaagtctt gaacacaatc gttgaaacat 240ctgtccacaa gagatgtgta
tagcctcatg aaatcagcca tttgcttttg ttcaacgatc 300ttttgaaatt gttgttgttc
ttggtagtta agttgatcca tcttggctta tgttgtgtgt 360atgttgtagt tattcttagt
atattcctgt cctgagttta gtgaaacata atatcgcctt 420gaaatgaaaa tgctgaaatt
cgtcgacata caatttttca aacttttttt ttttcttggt 480gcacggacat gtttttaaag
gaagtactct ataccagtta ttcttcaccc tgcagggtac 540gtagcatgca ctcgcaagct
gtgccatcgc ccaacggtta attataagaa atcaacatca 600gccaacaact attttcgtcc
ccctcttttc agtggtaacg agcaattaca ttagtaagag 660actattttct tcagtgattt
gtaatttttt ttcagtgatt tgtaattctt tctcgaaata 720tgcgggctta acttatccgg
acattcacta catgcaagga aaaacgagaa ccgcggagat 780ttcctcagta agtaacaatg
atgatctttt tacgcttcat catcactttc caaagttcta 840agctataagt tcaagcctag
atacgctgaa aaactcctga ccaacaatgt aaagaaaaca 900attacaattg taaggttgaa
aacatctaaa aatgaaatat tttattgtac atgcacaccc 960tgatagtcat tctcttactt
catccctgaa agacgtggct gtacaagagt tggaatcgca 1020aggtcatgag gttaaagtta
gtgatcttta tgctcaaaag tggaaggcct caatagaccg 1080tgacgacttc gagcagcttt
tcgcaagaag agaggttaaa aataccccaa gcttcttatg 1140aagcgtatgc cagaggagca
ttaacaaaag acgtaaatca ggaacaggaa aaacttattt 1200gggcggactt tgtcattttg
tcgtttccta tatggtggtc ttctatgccg gctagtcgac 1260cccctcgacc ccctcgagcg
atctcgagat ttgctgcaac ggcaacatca atgtccacgt 1320ttacacacct acatttatat
ctatatttat atttatattt atttatttat gctacttagc 1380ttctatagtt agttaatgca
ctcacgatat tcaaaattga cacccttcaa ctactcccta 1440ctattgtcta ctactgtcta
ctactcctct ttactatagc tgctcccaat aggctccacc 1500aataggctct gtcaatacat
tttgcgccgc cacctttcag gttgtgtcac tcctgaagga 1560ccatattggg taatcgtgca
atttctggaa gagagtgccg cgagaagtga ggcccccact 1620gtaaatcctc gagggggcat
ggagtatggg gcatgnagga tggaggatgg gggggggggg 1680ggaaaatagg tagcgaaagg
acccgctatc accccacccg gagaactcgt tgccgggaag 1740tcatatttcg acactccggg
gagtctataa aaggcgggtt ttgtcttttg ccagttgatg 1800ttgctgagag gacttgtttg
ccgtttcttc cgatttaaca gtatagaatc aaccactgtt 1860aattatacac gttatactaa
cacaacaaaa acaaaaacaa cgacaacaac aacaacaatg 1920tttgctttct actttctcac
cgcatgcacc actttgaagg gtgttttcgg agtttctccg 1980agttacaatg gtcttggtct
caccccacag atgggttggg acagctggaa tacgtttgcc 2040tgcgatgtca gtgaacagct
acttctagac actgctgata gaatttctga cttggggcta 2100aaggatatgg gttacaagta
tgtcatccta gatgactgtt ggtctagcgg cagggattcc 2160gacggtttcc tcgttgcaga
caagcacaaa tttcccaacg gtatgggcca tgttgcagac 2220cacctgcata ataacagctt
tcttttcggt atgtattcgt ctgctggtga gtacacctgt 2280gctgggtacc ctgggtctct
ggggcgtgag gaagaagatg ctcaattctt tgcaaataac 2340cgcgttgact acttgaagta
tgataattgt tacaataaag gtcaatttgg tacaccagac 2400gtttcttacc accgttacaa
ggccatgtca gatgctttga ataaaactgg taggcctatt 2460ttctattctc tatgtaactg
gggtcaggat ttgacatttt actggggctc tggtatcgcc 2520aattcttgga gaatgagcgg
agatattact gctgagttca cccgtccaga tagcagatgt 2580ccctgtgacg gtgacgaata
tgattgcaag tacgccggtt tccattgttc tattatgaat 2640attcttaaca aggcagctcc
aatggggcaa aatgcaggtg ttggtggttg gaacgatctg 2700gacaatctag aggtcggagt
cggtaatttg actgacgatg aggaaaaggc ccatttctct 2760atgtgggcaa tggtaaagtc
cccacttatc attggtgccg acgtgaatca cttaaaggca 2820tcttcgtact cgatctacag
tcaagcctct gtcatcgcaa ttaatcaaga tccaaagggt 2880attccagcca caagagtctg
gagatattat gtttcagaca ccgatgaata tggacaaggt 2940gaaattcaaa tgtggagtgg
tccgcttgac aatggtgacc aagtggttgc tttattgaat 3000ggaggaagcg tagcaagacc
aatgaacacg accttggaag agattttctt tgacagcaat 3060ttgggttcaa aggaactgac
atcgacttgg gatatttacg acttatgggc caacagagtt 3120gacaactcta cggcgtctgc
tatccttgaa cagaataagg cagccaccgg tattctctac 3180aatgctacag agcagtctta
taaagacggt ttgtctaaga atgatacaag actgtttggc 3240cagaaaattg gtagtctttc
tccaaatgct atacttaaca caactgttcc agctcatggt 3300atcgccttct ataggttgag
accctcggct taagctcaat gttgagcaaa gcaggacgag 3360aaaaaaaaaa ataatgattg
ttaagaagtt catgaaaaaa aaaaggaaaa atactcaaat 3420acttataaca gagtgattaa
ataataaacg gcagtatacc ctatcaggta ttgagatagt 3480tttatttttg taggtatata
atctgaagcc tttgaactat tttctcgtat atatcatgga 3540gtatacattg cattagcaac
attgcatact agtcactcgc aagctgtgcc atcgcccaac 3600ggttaattat aagaaatcaa
catcagccaa caactatttt cgtccccctc ttttcagtgg 3660taacgagcaa ttacattagt
aagagactat tttcttcagt gatttgtaat tttttttcag 3720tgatttgtaa ttctttctcg
aaatatgcgg gctwaamtaa tccggacatt cactacatgc 3780aaggaaaaac gagaaccgcg
gagatttcct cagtaagtaa caatgatgat ctttttacgc 3840ttcatcatca ctttccaaag
ttctaagcta taagttcaag cctagatacg ctgaaaaact 3900cctgaccaac aatgtaaaga
aaacaattac aattgtaagg ttgaaaacat ctaaaaatga 3960aatattttat tgtacatgca
caccctgata gtcattctct tacttcatcc ctgaaagacg 4020tggctgtaca agagttggaa
tcgcaaggtc atgaggttaa agttagtgat ctttatgctc 4080aaaagtggaa ggcctcaata
gaccgtgacg acwwmaaaaa amaaamrmaa gaagagaggt 4140taaaaatacc ccaagcttct
tatgaagcgt atgccagagg agcattaaca aaagacgtaa 4200atcaggaaca ggaaaaactt
atttgggcgg actttgtcat tttgtcgttt cctatatggt 4260ggtcttctat gccggctagc
ggccgggcaa caaagcctcc cagatttgat anattttcaa 4320tttgtgcttt gaatcatgac
ttccacctgt ttggtccgca agaacacgta aatgcgcaat 4380ttgtttctcc cttctgctta
aaaaccatgc acctttaata ttatctggaa agataaagaa 4440cagaattgtt gcgtagaaac
aagtagcaga gccgtaaatg agaaaaatat acttccaagc 4500tggtaatttc ccctttatta
gtccaataca gtgtccgaag accccaccaa gaataccagc 4560aagggtgttg aaatataatg
tagatcttag tggttgttct gatttcttcc accacattcc 4620gctaataatc ataaaagacg
gtaatattcc ggcttcaaat acgccaagaa aaaacctcac 4680ggtaaccaaa ccaccaaagc
tatgacatgc agccatgcac ataagtaagc cgccccaaat 4740gaacaaacaa atagacacaa
atttgccaat tctaactcgt ggcaacaaaa aaaaggatat 4800gaactcacct aataaataac
cgaaataaaa agtagaagca actgtggaaa attgagaacc 4860atgtaaattt gtgtcttctt
tcaatgtata aacagccgca atacctaggg 4910
SEQ ID NO: 2; length 3704; DNA; Artificial Sequence; Synthetic - PDC gene disruption fragment
cccccagttg ttgttgcaat
taacaaattt gctaccgaca ccgagaagga aattgagacc 60attagagaag aagccatcaa
ggctggtgca tttgatgctg ttgagtcaga ccattggtca 120caaggtggta agggtgcaat
caagttagct gaggcaattg tacgtgctac cgaggaaaga 180ccgttggaag aaagtcaacc
tcctaactat ctttattcat tagatggttc gttagaagat 240agactaagaa caattgccac
caagatgtat ggagcaaaag atattgaact atctgagttg 300gccaagaaac agattgaaga
gtatgagagt caaggttttg gcaagctgcc tgtttgtatt 360gcaaagacgc aatattctct
ctcccatgat ccaacattga aaggtgttcc aaaggatttc 420atcttcccaa tcagagaagt
tagaataagt gcaggagcag gatatttata tgcactagct 480gcaaagatca tgacaattcc
aggtctatca acttatgccg gatttatgaa tgttgaagtc 540aacgaagatg gtgagattga
tggattgttt tagtttttat tataaaatta tatattattc 600ttaattacat atcacccttc
tatcagggaa gggagaaacg aaaatagaga gtgacctatc 660caagctcggg ggtctaagtt
ttaatggccc agggaatcat tacttttttt tctcaatcct 720tgatggataa aagtattaca
tacgtacagg attgtgtatt agtgtatttc gttatatgat 780taaacaaagt ttatagattg
taaagtagac gtaaagttta gtaattcatt ttaatgttca 840ttttacattc agatggcggc
cgcggatcca gatcccccgg ggcgttgaag atctattctc 900cagcaattaa atttgtgaag
aataactggt atagagtact tcctttaaaa acatgtccgt 960gcaccaagaa aaaaaaaaag
tttgaaaaat tgtatgtcga cgaatttcag cattttcatt 1020tcaaggcgat attatgtttc
actaaactca ggacaggaat atactaagaa taactacaac 1080atacacacaa cataagccaa
gatggatcaa cttaactacc aagaacaaca acaatttcaa 1140aagatcgttg aacaaaagca
aatggctgat ttcatgaggc tatctgcaga tacgcggaac 1200aatcaatcga taatgatttg
actgataaag aaaaccatac ttttgtttat gtttattagt 1260tatcgctttg ctacattaaa
aattcacata ctaaagcctt tgttaaacaa ctttttctaa 1320atcttaagat tttactctat
ctagtttttt tggttgtagg tgaacgtaaa gtacctcatt 1380tatttttttt tttttgcttg
tgtaattctt ttcatgctta tttaaactag tgtacatgta 1440tcaaatcttt gtgtaagaat
catttaaatc tgtttaaata agcattccaa ccagcttgtt 1500ggtatctttt agcttgctct
ataggatctc ttccttgacc gtacaaacct ctaccaacaa 1560ttatgatatc cgttccagtc
tttacaactt catcaacagt tctatattgt tgaccaagtg 1620catcaccttt gtcatctaaa
ccaacccctg gagtcataat gatccagtca aaaccttctt 1680ctctaccgcc catatcgtgt
tgcgcaataa aaccaatgac aaactcttta tcagatttag 1740caatttctac tgttttttct
gtatattcac catatgctaa agaacccttt gatgataact 1800cagcaagcat tagcaaacct
ctaggttcac tggttgtttc ttgggctgcc tccttcaagc 1860cagaaacaat acctgcaccc
gttacaccat gtgcattagt gatgtcagcc cattcggcaa 1920tacggaagac accagattta
tattgatttt taacagtgtt accaatatca gcaaattttc 1980tatcttcaaa aatcataaaa
ttatgtttct tggcaagctc cttcaaaggc aacacagttc 2040cttcatacgt aaaatcagaa
acaatatcga tgtgtgtttt aactagacag atgtaaggac 2100caatagtgtc caaaatagag
agaagctttt cagtttcagt aatatccaat gatgcacaaa 2160ggttagactt cttttcctcc
atgatggaga aaagtctcct agcaacaggg gaagtgtgtg 2220attctgatct ttctttgtat
gacgccatcc ttgacaaaca aactacttta ttaaagcgtt 2280gaagatctat tctccagcaa
ttaaatttgt gaagaataac tggtatagag tacttccttt 2340aaaaacatgt ccgtgcacca
agaaaaaaaa aaagtttgaa aaattgtatg tcgacgaatt 2400tcagcatttt catttcaagg
cgatattatg tttcactaaa ctcaggacag gaatatacta 2460agaataacta caacatacac
acaacataag ccaagatgga tcaacttaac taccaagaac 2520aacaacaatt tcaaaagatc
gttgaacaaa agcaaatggc tgatttcatg aggctatgaa 2580ttcttttatt ataaaattat
atattattct taattacata tcacccttct atcagggaag 2640ggagaaacga aaatagagag
tgacctatcc aagcttgggg gtctaagttt taatggccca 2700gggaatcatt actttttttt
ctcaatcctt gatggataaa agtattacat acgtacagga 2760ttgtgtatta gtgtatttcg
ttatatgatt aaacaaagtt tatagattgt aaagtagacg 2820taaagtttag taattcattt
taatgttcat tttacattca gatgttaatt aaggcctcga 2880gggatccgcg gccgctattt
ttgtgttttg ctgtgttttg ttttattttg ttttattggg 2940aagaaaatat ataataatag
aatattatat taacaaataa ttaaagaagc tcaactgtta 3000ttagaataaa tgggttctcc
gtgtcctttt tatacgcctt ctccgaaaag aaaaaaacca 3060tcgtatcatt tgtagcccac
gccacccgga aaaaccacca ttgtcctcag cagtccgcaa 3120aaatatggat gcgctcaatc
aatttccctc ccccgtcaat gccaaaagga taacgacaca 3180ctattaagag cgcatcattt
gtaaaagccg aggaaggggg atacgctgac cgagacgtct 3240cgcctcactc tcggagctga
gccgccctcc ttaagaaatt catgggaaga acacccttcg 3300cggcttctga acggctcgcc
ctcgtccatt ggtcacctca cagtggcaac taataaggac 3360attatagcaa tagaaattaa
aatggtgcac agaaatacaa taggatcgaa taggatagga 3420tacaataaga tacggaatat
tagactatac tgtgatacgg tacgctacga tacgctacga 3480tacgatacga tagaggatac
cacggatata acgtagtgtt atttttcatt attggggttt 3540tttttctgtt tgaattttcc
acgtcaagag tatcccatct gacaggaacc gatggactcg 3600tcacagtacc tatcgcccga
gttcaatcca tggacgctgc gggtgaagga tcttcgcccg 3660ctgttggcaa gccatgggat
cagggcgtcg ccaagggacg ggcc 3704
SEQ ID NO: 3; length 6392; DNA; Artificial Sequence; Synthetic - PYC gene integration fragment
ctaagtagtg gtgttggtga
actcaagatg gactctttag gtaattatat tcttgaatag 60ttgtgtaaag cgaatatgca
aatagatttg ttttataatt atgcatctct ttgaaagagg 120tttagaggca aagttcttgc
atacaatatt gtgattgttt taatgtcatt cttgattttc 180ataaagagat taaaaaaaaa
aaaaaaaaac ttataaaatt gagtagaacc atttatatat 240aagacaaaga ttgtctgtat
tagtcctcaa cacactaaac cttacatact tagggtaaat 300ttgctaatag agtgatatgt
tcatgagaac tccaacgaca acacaaccac ctatttgcac 360aacaaacacc attgtcgcac
gctgcgcgcc ctagaagtag aaagaaaggg aaatgacatt 420aagagaatca taccccgtgc
ccgtaacgcc gaaaaaatca caccccgtcc cccacacctt 480aaaacctcaa ccgcttaaca
ccgccacacc ctttctcttt ataaacgccg tttgcattac 540tcattcttct tataaaccgc
accccccaaa acgcggaata gcttcaaccc cccaatcaga 600tatgagtttc ccgggaaacc
cgcttttccc gacagcccca caaggggttg gtctataaaa 660gaggacgttt tccccgtcat
cgagattgaa gattcttaca ggcccattta ttcaaattgg 720agttgattct tcttgtcttt
actttctttc tctctttttc ttcctttttt aatattatct 780tttgtcaagc ctggttccct
aagttgaact ctcttttctt gtgatcctcc tatatagata 840cgccttgcca aatgcggccg
cgagtccatc ggttcctgtc agatgggata ctcttgacgt 900ggaaaattca aacagaaaaa
aaaccccaat aatgaaaaat aacactacgt tatatccgtg 960gtatcctcta tcgtatcgta
tcgtagcgta tcgtagcgta ccgtatcaca gtatagtcta 1020atattccgta tcttattgta
tcctatccta ttcgatccta ttgtatttca gtgcaccatt 1080ttaatttcta ttgctataat
gtccttatta gttgccactg tgaggtgacc aatggacgag 1140ggcgagccgt tcagaagccg
cgaagggtgt tcttcccatg aatttcttaa ggagggcggc 1200tcagctccga gagtgaggcg
agacgtctcg gtcagcgtat cccccttcct cggcttttac 1260aaatgatgcg ctcttaatag
tgtgtcgtta tccttttggc attgacgggg gagggaaatt 1320gattgagcgc atccatattt
ttgcggactg ctgaggacaa tggtggtttt tccgggtggc 1380gtgggctaca aatgatacga
tggttttttt cttttcggag aaggcgtata aaaaggacac 1440ggagaaccca tttattctaa
aaacagttga gcttctttaa ttattttttg atataatatt 1500ctattattat atattttctt
cccaataaaa caaaataaaa caaaacacag caaaacacaa 1560aaattctaga taaaatgtca
actgtggaag atcactcctc cctacataaa ttgagaaagg 1620aatctgagat tctttccaat
gcaaacaaaa tcttagtggc taatagaggt gaaattccaa 1680ttagaatttt caggtcagcc
catgaattgt caatgcatac tgtggcgatc tattcccatg 1740aagatcggtt gtccatgcat
aggttgaagg ccgacgaggc ttatgcaatc ggtaagactg 1800gtcaatattc gccagttcaa
gcttatctac aaattgacga aattatcaaa atagcaaagg 1860aacatgatgt ttccatgatc
catccaggtt atggtttctt atctgaaaac tccgaattcg 1920caaagaaggt tgaagaatcc
ggtatgattt gggttgggcc tcctgctgaa gttattgatt 1980ctgttggtga caaggtttct
gcaagaaatt tggcaattaa atgtgacgtt cctgttgttc 2040ctggtaccga tggtccaatt
gaagacattg aacaggctaa acagtttgtg gaacaatatg 2100gttatcctgt cattataaag
gctgcatttg gtggtggtgg tagaggtatg agagttgtta 2160gagaaggtga tgatatagtt
gatgctttcc aaagagcgtc atctgaagca aagtctgcct 2220ttggtaatgg tacttgtttt
attgaaagat ttttggataa gccaaaacat attgaggttc 2280aattattggc tgataattat
ggtaacacaa tccatctctt tgaaagagat tgttctgttc 2340aaagaagaca tcaaaaggtt
gttgaaattg cacctgccaa aactttacct gttgaagtta 2400gaaatgctat attaaaggat
gctgtaacgt tagctaaaac cgctaactat agaaatgctg 2460gtactgcaga atttttagtt
gattcccaaa acagacatta ttttattgaa attaatccaa 2520gaattcaagt tgaacataca
attactgaag aaatcacggg tgttgatatt gttgccgctc 2580aaattcaaat tgctgcaggt
gcatcattgg aacaattggg tctattacaa aacaaaatta 2640caactagagg ttttgcaatt
caatgtagaa ttacaaccga ggatcctgct aagaattttg 2700ccccagatac aggtaaaatt
gaggtttata gatctgcagg tggtaacggt gtcagattag 2760atggtggtaa tgggtttgcc
ggtgctgtta tatctcctca ttatgactcg atgttggtta 2820aatgttcaac atctggttct
aactatgaaa ttgccagaag aaagatgatt agagctttag 2880ttgaatttag aatcagaggt
gtcaagacca atattccttt cttattggca ttgctaactc 2940atccagtttt catttcgggt
gattgttgga caacttttat tgatgatacc ccttcgttat 3000tcgaaatggt ttcttcaaag
aatagagccc aaaaattatt ggcatatatt ggtgacttgt 3060gtgtcaatgg ttcttcaatt
aaaggtcaaa ttggtttccc taaattgaac aaggaagcag 3120aaatcccaga tttgttggat
ccaaatgatg aggttattga tgtttctaaa ccttctacca 3180atggtctaag accgtatcta
ttaaagtatg gaccagatgc gttttccaaa aaagttcgtg 3240aattcgatgg ttgtatgatt
atggatacca cctggagaga tgcacatcaa tcattattgg 3300ctacaagagt tagaactatt
gatttactga gaattgctcc aacgactagt catgccttac 3360aaaatgcatt tgcattagaa
tgttggggtg gcgcaacatt tgatgttgcg atgaggttcc 3420tctatgaaga tccttgggag
agattaagac aacttagaaa ggcagttcca aatattcctt 3480tccaaatgtt attgagaggt
gctaatggtg ttgcttattc gtcattacct gataatgcaa 3540ttgatcattt tgttaagcaa
gcaaaggata atggtgttga tattttcaga gtctttgatg 3600ctttgaacga tttggaacaa
ttgaaggttg gtgttgatgc tgtcaagaaa gccggaggtg 3660ttgttgaagc tacagtttgt
tactcaggtg atatgttaat tccaggtaaa aagtataact 3720tggattatta tttagagact
gttggaaaga ttgtggaaat gggtacccat attttaggta 3780ttaaggatat ggctggcacg
ttaaagccaa aggctgctaa gttgttgatt ggctcgatca 3840gatcaaaata ccctgacttg
gttatccatg tccataccca tgactctgct ggtaccggta 3900tttcaactta tgttgcatgc
gcattggcag gtgccgacat tgtcgattgt gcaatcaatt 3960cgatgtctgg tttaacctct
caaccttcaa tgagtgcttt tattgctgct ttagatggtg 4020atatcgaaac tggtgttcca
gaacattttg caagacaatt agatgcatac tgggcagaaa 4080tgagattgtt atactcatgt
ttcgaagccg acttgaaggg accagaccca gaagtttata 4140aacatgaaat tccaggtgga
cagttgacta acctaatctt ccaagcccaa caagttggtt 4200tgggtgaaca atgggaagaa
actaagaaga agtatgaaga tgctaacatg ttgttgggtg 4260atattgtcaa ggttacccca
acctccaagg ttgttggtga tttagcccaa tttatggttt 4320ctaataaatt agaaaaagaa
gatgttgaaa aacttgctaa tgaattagat ttcccagatt 4380cagttcttga tttctttgaa
ggattaatgg gtacaccata tggtggattc ccagagcctt 4440tgagaacaaa tgtcatttcc
ggcaagagaa gaaaattaaa gggtagacca ggtttagaat 4500tagaaccttt caacctcgag
gaaatcagag aaaatttggt ttccagattt ggtccaggta 4560ttactgaatg tgatgttgca
tcttataaca tgtatccaaa ggtttacgag caatatcgta 4620aggtggttga aaaatatggt
gatttatctg ttttaccaac aaaagcattt ttggctcctc 4680caactattgg tgaagaagtt
catgtggaaa ttgagcaagg taagactttg attattaagt 4740tattagccat ttctgacttg
tctaaatctc atggtacaag agaagtatac tttgaattga 4800atggtgaaat gagaaaggtt
acaattgaag ataaaacagc tgcaattgag actgttacaa 4860gagcaaaggc tgacggacac
aatccaaatg aagttggtgc gccaatggct ggtgtcgttg 4920ttgaagttag agtgaagcat
ggaacagaag ttaagaaggg tgatccatta gccgttttga 4980gtgcaatgaa aatggaaatg
gttatttctg ctcctgttag tggtagggtc ggtgaagttt 5040ttgtcaacga aggcgattcc
gttgatatgg gtgatttgct tgtgaaaatt gccaaagatg 5100aagcgccagc agcttaatta
attctgtctt tgattttctt atgttattca aaacatctgc 5160cccaaaatct aacgattata
tatattccta cgtataactg tatagctaat tattgattta 5220tttgtacata aaaaccacat
aaatgtaaaa gcaagaaaaa aaataactaa ggagaaggat 5280caatatctca tttataatgc
tcgccaaagc agcgtacgtg aattttaatc aagacatcaa 5340caaatcttgc aacttggtta
tatcgcttct tcacccactc acccgctttt ctacattgtt 5400gaacacaaat atatacaggg
gtatgtctca aggtcaagtg cagtttcaac agagactacc 5460tcaaggtacc tcttcagaaa
tgcagaactt cactcttgat cagattttct ccgaattaaa 5520ggtttaaaca tagcctcatg
aaatcagcca tttgcttttg ttcaacgatc ttttgaaatt 5580gttgttgttc ttggtagtta
agttgatcca tcttggctta tgttgtgtgt atgttgtagt 5640tattcttagt atattcctgt
cctgagttta gtgaaacata atatcgcctt gaaatgaaaa 5700tgctgaaatt cgtcgacata
caatttttca aacttttttt ttttcttggt gcacggacat 5760gtttttaaag gaagtactct
ataccagtta ttcttcacaa atttaattgc tggagaatag 5820atcttcaacg ctttaataaa
gtagtttgtt tgtcaaggat ggcgtcatac aaagaaagat 5880cagaatcaca cacttcccct
gttgctagga gacttttctc catcatggag gaaaagaagt 5940ctaacctttg tgcatcattg
gatattactg aaactgaaaa gcttctctct attttggaca 6000ctattggtcc ttacatctgt
ctagttaaaa cacacatcga tattgtttct gattttacgt 6060atgaaggaac tgtgttgcct
ttgaaggagc ttgccaagaa acataatttt atgatttttg 6120aagatagaaa atttgctgat
attggtaaca ctgttaaaaa tcaatataaa tctggtgtct 6180tccgtattgc cgaatgggct
gacatcacta atgcacatgg tgtaacgggt gcaggtattg 6240tttctggctt gaaggaggca
gcccaagaaa caaccagtga acctagaggt ttgctaatgc 6300ttgctgagtt atcatcaaag
ggttctttag catatggtga atatacagaa aaaacagtag 6360aaattgctaa atctgataaa
gagtttgttg ag 6392
SEQ ID NO: 4; length: 3543; type: DNA; organism: Issatchenkia orientalis
atgtcaactg tggaagatca ctcctcccta cataaattga gaaaggaatc
tgagattctt 60tccaatgcaa acaaaatctt agtggctaat agaggtgaaa ttccaattag
aattttcagg 120tcagcccatg aattgtcaat gcatactgtg gcgatctatt cccatgaaga
tcggttgtcc 180atgcataggt tgaaggccga cgaggcttat gcaatcggta agactggtca
atattcgcca 240gttcaagctt atctacaaat tgacgaaatt atcaaaatag caaaggaaca
tgatgtttcc 300atgatccatc caggttatgg tttcttatct gaaaactccg aattcgcaaa
gaaggttgaa 360gaatccggta tgatttgggt tgggcctcct gctgaagtta ttgattctgt
tggtgacaag 420gtttctgcaa gaaatttggc aattaaatgt gacgttcctg ttgttcctgg
taccgatggt 480ccaattgaag acattgaaca ggctaaacag tttgtggaac aatatggtta
tcctgtcatt 540ataaaggctg catttggtgg tggtggtaga ggtatgagag ttgttagaga
aggtgatgat 600atagttgatg ctttccaaag agcgtcatct gaagcaaagt ctgcctttgg
taatggtact 660tgttttattg aaagattttt ggataagcca aaacatattg aggttcaatt
attggctgat 720aattatggta acacaatcca tctctttgaa agagattgtt ctgttcaaag
aagacatcaa 780aaggttgttg aaattgcacc tgccaaaact ttacctgttg aagttagaaa
tgctatatta 840aaggatgctg taacgttagc taaaaccgct aactatagaa atgctggtac
tgcagaattt 900ttagttgatt cccaaaacag acattatttt attgaaatta atccaagaat
tcaagttgaa 960catacaatta ctgaagaaat cacgggtgtt gatattgttg ccgctcaaat
tcaaattgct 1020gcaggtgcat cattggaaca attgggtcta ttacaaaaca aaattacaac
tagaggtttt 1080gcaattcaat gtagaattac aaccgaggat cctgctaaga attttgcccc
agatacaggt 1140aaaattgagg tttatagatc tgcaggtggt aacggtgtca gattagatgg
tggtaatggg 1200tttgccggtg ctgttatatc tcctcattat gactcgatgt tggttaaatg
ttcaacatct 1260ggttctaact atgaaattgc cagaagaaag atgattagag ctttagttga
atttagaatc 1320agaggtgtca agaccaatat tcctttctta ttggcattgc taactcatcc
agttttcatt 1380tcgggtgatt gttggacaac ttttattgat gatacccctt cgttattcga
aatggtttct 1440tcaaagaata gagcccaaaa attattggca tatattggtg acttgtgtgt
caatggttct 1500tcaattaaag gtcaaattgg tttccctaaa ttgaacaagg aagcagaaat
cccagatttg 1560ttggatccaa atgatgaggt tattgatgtt tctaaacctt ctaccaatgg
tctaagaccg 1620tatctattaa agtatggacc agatgcgttt tccaaaaaag ttcgtgaatt
cgatggttgt 1680atgattatgg ataccacctg gagagatgca catcaatcat tattggctac
aagagttaga 1740actattgatt tactgagaat tgctccaacg actagtcatg ccttacaaaa
tgcatttgca 1800ttagaatgtt ggggtggcgc aacatttgat gttgcgatga ggttcctcta
tgaagatcct 1860tgggagagat taagacaact tagaaaggca gttccaaata ttcctttcca
aatgttattg 1920agaggtgcta atggtgttgc ttattcgtca ttacctgata atgcaattga
tcattttgtt 1980aagcaagcaa aggataatgg tgttgatatt ttcagagtct ttgatgcttt
gaacgatttg 2040gaacaattga aggttggtgt tgatgctgtc aagaaagccg gaggtgttgt
tgaagctaca 2100gtttgttact caggtgatat gttaattcca ggtaaaaagt ataacttgga
ttattattta 2160gagactgttg gaaagattgt ggaaatgggt acccatattt taggtattaa
ggatatggct 2220ggcacgttaa agccaaaggc tgctaagttg ttgattggct cgatcagatc
aaaataccct 2280gacttggtta tccatgtcca tacccatgac tctgctggta ccggtatttc
aacttatgtt 2340gcatgcgcat tggcaggtgc cgacattgtc gattgtgcaa tcaattcgat
gtctggttta 2400acctctcaac cttcaatgag tgcttttatt gctgctttag atggtgatat
cgaaactggt 2460gttccagaac attttgcaag acaattagat gcatactggg cagaaatgag
attgttatac 2520tcatgtttcg aagccgactt gaagggacca gacccagaag tttataaaca
tgaaattcca 2580ggtggacagt tgactaacct aatcttccaa gcccaacaag ttggtttggg
tgaacaatgg 2640gaagaaacta agaagaagta tgaagatgct aacatgttgt tgggtgatat
tgtcaaggtt 2700accccaacct ccaaggttgt tggtgattta gcccaattta tggtttctaa
taaattagaa 2760aaagaagatg ttgaaaaact tgctaatgaa ttagatttcc cagattcagt
tcttgatttc 2820tttgaaggat taatgggtac accatatggt ggattcccag agcctttgag
aacaaatgtc 2880atttccggca agagaagaaa attaaagggt agaccaggtt tagaattaga
acctttcaac 2940ctcgaggaaa tcagagaaaa tttggtttcc agatttggtc caggtattac
tgaatgtgat 3000gttgcatctt ataacatgta tccaaaggtt tacgagcaat atcgtaaggt
ggttgaaaaa 3060tatggtgatt tatctgtttt accaacaaaa gcatttttgg ctcctccaac
tattggtgaa 3120gaagttcatg tggaaattga gcaaggtaag actttgatta ttaagttatt
agccatttct 3180gacttgtcta aatctcatgg tacaagagaa gtatactttg aattgaatgg
tgaaatgaga 3240aaggttacaa ttgaagataa aacagctgca attgagactg ttacaagagc
aaaggctgac 3300ggacacaatc caaatgaagt tggtgcgcca atggctggtg tcgttgttga
agttagagtg 3360aagcatggaa cagaagttaa gaagggtgat ccattagccg ttttgagtgc
aatgaaaatg 3420gaaatggtta tttctgctcc tgttagtggt agggtcggtg aagtttttgt
caacgaaggc 3480gattccgttg atatgggtga tttgcttgtg aaaattgcca aagatgaagc
gccagcagct 3540taa
3543
SEQ ID NO: 5; length: 4886; type: DNA; organism: Artificial Sequence; other information: Synthetic - MAE gene integration fragment
ctttgaagga gcttgccaag aaacataatt ttatgatttt
tgaagataga aaatttgctg 60atattggtaa cactgttaaa aatcaatata aatctggtgt
cttccgtatt gccgaatggg 120ctgacatcac taatgcacat ggtgtaacgg gtgcaggtat
tgtttctggc ttgaaggagg 180cagcccaaga aacaaccagt gaacctagag gtttgctaat
gcttgctgag ttatcatcaa 240agggttcttt agcatatggt gaatatacag aaaaaacagt
agaaattgct aaatctgata 300aagagtttgt cattggtttt attgcgcaac acgatatggg
cggtagagaa gaaggttttg 360actggatcat tatgactcca ggggttggtt tagatgacaa
aggtgatgca cttggtcaac 420aatatagaac tgttgatgaa gttgtaaaga ctggaacgga
tatcataatt gttggtagag 480gtttgtacgg tcaaggaaga gatcctatag agcaagctaa
aagataccaa caagctggtt 540ggaatgctta tttaaacaga tttaaatgat tcttacacaa
agatttgata catgtacact 600agtttaaata agcatgaaaa gaattacaca agcaaaaaaa
aaaaaataaa tgaggtactt 660tacgttcacc tacaaccaaa aaaactagat agagtaaaat
cttaagattt agaaaaagtt 720gtttaacaaa ggctttagta tgtgaatttt taatgtagca
aagcgataac taataaacat 780aaacaaaagt atggttttct ttatcagtca aatcattatc
gattgattgt tccgcgtatc 840tgcagatagc ctcatgaaat cagccatttg cttttgttca
acgatctttt gaaattgttg 900ttgttcttgg tagttaagtt gatccatctt ggcttatgtt
gtgtgtatgt tgtagttatt 960cttagtatat tcctgtcctg agtttagtga aacataatat
cgccttgaaa tgaaaatgct 1020gaaattcgtc gacatacaat ttttcaaact tttttttttt
cttggtgcac ggacatgttt 1080ttaaaggaag tactctatac cagttattct tcacaaattt
aattgctgga gaatagatct 1140tcaacgcgtt taaacagcaa tttgaggaag gaataggaga
aggagaagca atttctagga 1200aagagcaagg tgtgcaacag catgctctga atgatatttt
cagcaatagt tcagttgaag 1260aacctgttgg cgtatctaca tcacttccta caaacaacac
cacgaattgc gtccgtggtg 1320acgcaactac gaatggcatt gtcaatgcca atgccagtgc
acatacacgt gcaagtccca 1380ccggttccct gcccggctat ggtagagaca agaaggacga
taccggcatc gacatcaaca 1440gtttcaacag caatgcgttt ggcgtcgacg cgtcgatggg
gctgccgtat ttggatttgg 1500acgggctaga tttcgatatg gatatggata tggatatgga
tatggagatg aatttgaatt 1560tagatttggg tcttgatttg gggttggaat taaaagggga
taacaatgag ggttttcctg 1620ttgatttaaa caatggacgt gggaggtgat tgatttaacc
tgatccaaaa ggggtatgtc 1680tattttttag agtgtgtctt tgtgtcaaat tatggtagaa
tgtgtaaagt agtataaact 1740ttcctctcaa atgacgaggt ttaaaacacc ccccgggtga
gccgagccga gaatggggca 1800attgttcaat gtgaaataga agtatcgagt gagaaacttg
ggtgttggcc agccaagggg 1860gaaggaaaat ggcgcgaatg ctcaggtgag attgttttgg
aattgggtga agcgaggaaa 1920tgagcgaccc ggaggttgtg actttagtgg cggaggagga
acgggaggaa aaggccaaga 1980gggaaagtgt atataagggg gagcaatttg ccaaccagga
tagaattgga tgagttataa 2040ttctactgta tttattgtat aatttatttc tccttttata
tcaaacacat tacaaaacac 2100acaaaacata caaacataca cagctagcat gggtgaattg
aaagagattt tgaaacaaag 2160atatcatgaa ttacttgatt ggaatgttaa ggcaccacat
gtccctttat cccagagatt 2220gaagcacttt acttggtcat ggtttgcttg tactatggca
accggtggtg ttggtttgat 2280cattggttcc ttcccattca gattctacgg tttgaacacc
attggcaaga ttgtttacat 2340cttacaaatc tttttgtttt ctctttttgg ctcttgtatg
ttgtttcgtt tcatcaagta 2400tccatctacc attaaggact cttggaatca tcacttggaa
aagttgttta tcgcaacttg 2460tttgttatct atttccacat tcatcgacat gttagctatc
tatgcttatc cagataccgg 2520tgaatggatg gtctgggtca ttagaatctt atactacatc
tatgtcgctg tctctttcat 2580ctactgtgtt atggcctttt tcaccatttt caacaatcat
gtttacacta ttgaaactgc 2640ttctccagct tggattttgc caatcttccc tccaatgatc
tgtggtgtca ttgctggtgc 2700tgttaactcc acccaacctg ctcaccaatt gaaaaacatg
gtcattttcg gtatcttgtt 2760tcaaggttta ggtttttggg tttacctttt acttttcgcc
gttaatgttt tgagattctt 2820cacagtcggt ttagcaaagc cacaagatag accaggtatg
tttatgttcg ttggtccacc 2880agctttctct ggtttagcat tgattaacat tgcaagaggt
gcaatgggct caagacctta 2940cattttcgtt ggtgcaaact cttccgaata cttaggtttt
gtctcaacct tcatggccat 3000tttcatctgg ggtttagccg catggtgtta ttgcttagct
atggtttcct tccttgccgg 3060ctttttcact agagcaccat tgaaattcgc ttgtggttgg
ttcgctttca tctttccaaa 3120tgttggtttt gttaactgta ctatcgaaat cggcaagatg
attgattcta aggcttttca 3180aatgtttggt cacatcattg gtgttatctt gtgtattcaa
tggattttgt taatgtactt 3240aatggttaga gcattccttg ttaatgactt gtgctatcct
ggtaaagacg aagatgcaca 3300cccaccacca aagccaaaca ctggtgtctt aaacccaact
ttcccaccag agaaggctcc 3360agcatcatta gagaaggttg atactcatgt tacatcaaca
ggtggtgaat ccgatcctcc 3420atcttccgaa catgaatccg tttaaggcgc gccatctaat
agtttaatca cagcttatag 3480tctactatag ttttcttttt taaacattgt tgtattttgt
cccccccctc taattgatga 3540tgattatcct ataagaatcc aataaaacga tggaaactaa
taccctctcc tttgtcatgt 3600ggtctttagt atttcttgaa cattggctct gatttctcga
ctttatagtc ctattaaaat 3660cgctgttagt tctcgatcgt tgtatctcgt ttcttgtctc
tttggtggat gattttgcgt 3720gcgaacatgt ttttttccct ttctctcacc atcatcgtgt
agttcttgtc accatccccc 3780ccaccccttc cttctctcat tgattctata agagcttatc
cacagaggtg cagtaacgag 3840gtagtttaac cttcgagtgg atcaaaatgt cacacaggcc
tgcggccgct accataatgt 3900atgcgttgag cctcttgcac cttctttatt aggaaatcag
ttgaaaaatt tccggattgt 3960ctttattatt ggcccatttt tttttggtca cacctttatt
tttgtacact tctcgggcaa 4020agcaaaaact atagtaccgg ataggccttt ataaaactcc
agtgtgtatg attttagttg 4080gtgtgccatc tacacgttct cttagtttct ttatcatgtc
acagaaagca agcatgcaaa 4140cccttacaaa aaataacaac atacaaatgc ctaaacaact
ggactataat gatggtgagt 4200cagttacgaa aagagcaagt gggttaatac gatttcgtaa
gggacagtct gaggaagact 4260acaattttca aaaggagcag ttctggtcca cgggtccttt
agtacagaat cacacatttg 4320tgactgaatt tgttgaaaag tttattgaaa acacaattag
tgaagattat tcaatcacag 4380atagatcgaa aatagaacgt gaaacaatca tacacggatt
ggagaagctg tattttcaaa 4440gggaatatga gcgatgtcta aaagatgttc aactattgaa
ggacaatatc gataagttca 4500atcctaattt ggatcttaat gaaaagaatt tataatgagc
tgaattatat ttcttggatg 4560tgcatcaaaa agatccatga gagtaacgaa aagaaactgg
gggaaatcta ataatttaca 4620atttcaatat acacttctat atcctttaat gtaatggctt
tataaataaa cacgaacttc 4680tacagcaccg acgtttcttt ttcttaccag ctcctcttct
tcttcttctt cttcttcttc 4740ttcttcttct tcttcttctt cttcttcttc ttcttcttct
ttcttaccat cattgccatt 4800ttcctttttt cttatttgct cttgatcctc tgttttttca
atttggacaa actcatctaa 4860tacaccaaca cttttagggc ccccgc
4886
SEQ ID NO: 6; length: 1317; type: DNA; organism: Schizosaccharomyces pombe
atgggtgaat
tgaaagagat tttgaaacaa agatatcatg aattacttga ttggaatgtt 60aaggcaccac
atgtcccttt atcccagaga ttgaagcact ttacttggtc atggtttgct 120tgtactatgg
caaccggtgg tgttggtttg atcattggtt ccttcccatt cagattctac 180ggtttgaaca
ccattggcaa gattgtttac atcttacaaa tctttttgtt ttctcttttt 240ggctcttgta
tgttgtttcg tttcatcaag tatccatcta ccattaagga ctcttggaat 300catcacttgg
aaaagttgtt tatcgcaact tgtttgttat ctatttccac attcatcgac 360atgttagcta
tctatgctta tccagatacc ggtgaatgga tggtctgggt cattagaatc 420ttatactaca
tctatgtcgc tgtctctttc atctactgtg ttatggcctt tttcaccatt 480ttcaacaatc
atgtttacac tattgaaact gcttctccag cttggatttt gccaatcttc 540cctccaatga
tctgtggtgt cattgctggt gctgttaact ccacccaacc tgctcaccaa 600ttgaaaaaca
tggtcatttt cggtatcttg tttcaaggtt taggtttttg ggtttacctt 660ttacttttcg
ccgttaatgt tttgagattc ttcacagtcg gtttagcaaa gccacaagat 720agaccaggta
tgtttatgtt cgttggtcca ccagctttct ctggtttagc attgattaac 780attgcaagag
gtgcaatggg ctcaagacct tacattttcg ttggtgcaaa ctcttccgaa 840tacttaggtt
ttgtctcaac cttcatggcc attttcatct ggggtttagc cgcatggtgt 900tattgcttag
ctatggtttc cttccttgcc ggctttttca ctagagcacc attgaaattc 960gcttgtggtt
ggttcgcttt catctttcca aatgttggtt ttgttaactg tactatcgaa 1020atcggcaaga
tgattgattc taaggctttt caaatgtttg gtcacatcat tggtgttatc 1080ttgtgtattc
aatggatttt gttaatgtac ttaatggtta gagcattcct tgttaatgac 1140ttgtgctatc
ctggtaaaga cgaagatgca cacccaccac caaagccaaa cactggtgtc 1200ttaaacccaa
ctttcccacc agagaaggct ccagcatcat tagagaaggt tgatactcat 1260gttacatcaa
caggtggtga atccgatcct ccatcttccg aacatgaatc cgtttaa
1317
SEQ ID NO: 7; length: 6527; type: DNA; organism: Artificial Sequence; other information: Synthetic - PYC gene inserter fragment
ctaaaagtgt tggtgtatta gatgagtttg tccaaattga aaaaacagag gatcaagagc
60aaataagaaa aaaggaaaat ggcaatgatg gtaagaaaga agaagaagaa gaagaagaag
120aagaagaaga agaagaagaa gaagaagaag aagaagaaga agaggagctg gtaagaaaaa
180gaaacgtcgg tgctgtagaa gttcgtgttt atttataaag ccattacatt aaaggatata
240gaagtgtata ttgaaattgt aaattattag atttccccca gtttcttttc gttactctca
300tggatctttt tgatgcacat ccaagaaata taattcagct cattataaat tcttttcatt
360aagatccaaa ttaggattga acttatcgat attgtccttc aatagttgaa catcttttag
420acatcgctca tattcccttt gaaaatacag cttctccaat ccgtgtatga ttgtttcacg
480ttctattttc gatctatctg tgattgaata atcttcacta attgtgtttt caataaactt
540ttcaacaaat tcagtcacaa atgtgtgatt ctgtactaaa ggacccgtgg accagaactg
600ctccttttga aaattgtagt cttcctcaga ctgtccctta cgaaatcgta ttaacccact
660tgctcttttc gtaactgact caccatcatt atagtccagt tgtttaggca tttgtatgtt
720gttatttttt gtaagggttt gcatgcttgc tttctgtgac atgataaaga aactaagaga
780acgtgtagat ggcacaccaa ctaaaatcat acacactgga gttttataaa ggcctatccg
840gtactatagt ttttgctttg cccgagaagt gtacaaaaat aaaggtgtga ccaaaaaaaa
900atgggccaat aataaagaca atccggaaat ttttcaactg atttcctaat aaagaaggtg
960caagaggctc aacgcataca ttatggtagc ggccgcgagt ccatcggttc ctgtcagatg
1020ggatactctt gacgtggaaa attcaaacag aaaaaaaacc ccaataatga aaaataacac
1080tacgttatat ccgtggtatc ctctatcgta tcgtatcgta gcgtatcgta gcgtaccgta
1140tcacagtata gtctaatatt ccgtatctta ttgtatccta tcctattcga tcctattgta
1200tttcagtgca ccattttaat ttctattgct ataatgtcct tattagttgc cactgtgagg
1260tgaccaatgg acgagggcga gccgttcaga agccgcgaag ggtgttcttc ccatgaattt
1320cttaaggagg gcggctcagc tccgagagtg aggcgagacg tctcggtcag cgtatccccc
1380ttcctcggct tttacaaatg atgcgctctt aatagtgtgt cgttatcctt ttggcattga
1440cgggggaggg aaattgattg agcgcatcca tatttttgcg gactgctgag gacaatggtg
1500gtttttccgg gtggcgtggg ctacaaatga tacgatggtt tttttctttt cggagaaggc
1560gtataaaaag gacacggaga acccatttat tctaaaaaca gttgagcttc tttaattatt
1620ttttgatata atattctatt attatatatt ttcttcccaa taaaacaaaa taaaacaaaa
1680cacagcaaaa cacaaaaatt ctagataaaa tgtcaactgt ggaagatcac tcctccctac
1740ataaattgag aaaggaatct gagattcttt ccaatgcaaa caaaatctta gtggctaata
1800gaggtgaaat tccaattaga attttcaggt cagcccatga attgtcaatg catactgtgg
1860cgatctattc ccatgaagat cggttgtcca tgcataggtt gaaggccgac gaggcttatg
1920caatcggtaa gactggtcaa tattcgccag ttcaagctta tctacaaatt gacgaaatta
1980tcaaaatagc aaaggaacat gatgtttcca tgatccatcc aggttatggt ttcttatctg
2040aaaactccga attcgcaaag aaggttgaag aatccggtat gatttgggtt gggcctcctg
2100ctgaagttat tgattctgtt ggtgacaagg tttctgcaag aaatttggca attaaatgtg
2160acgttcctgt tgttcctggt accgatggtc caattgaaga cattgaacag gctaaacagt
2220ttgtggaaca atatggttat cctgtcatta taaaggctgc atttggtggt ggtggtagag
2280gtatgagagt tgttagagaa ggtgatgata tagttgatgc tttccaaaga gcgtcatctg
2340aagcaaagtc tgcctttggt aatggtactt gttttattga aagatttttg gataagccaa
2400aacatattga ggttcaatta ttggctgata attatggtaa cacaatccat ctctttgaaa
2460gagattgttc tgttcaaaga agacatcaaa aggttgttga aattgcacct gccaaaactt
2520tacctgttga agttagaaat gctatattaa aggatgctgt aacgttagct aaaaccgcta
2580actatagaaa tgctggtact gcagaatttt tagttgattc ccaaaacaga cattatttta
2640ttgaaattaa tccaagaatt caagttgaac atacaattac tgaagaaatc acgggtgttg
2700atattgttgc cgctcaaatt caaattgctg caggtgcatc attggaacaa ttgggtctat
2760tacaaaacaa aattacaact agaggttttg caattcaatg tagaattaca accgaggatc
2820ctgctaagaa ttttgcccca gatacaggta aaattgaggt ttatagatct gcaggtggta
2880acggtgtcag attagatggt ggtaatgggt ttgccggtgc tgttatatct cctcattatg
2940actcgatgtt ggttaaatgt tcaacatctg gttctaacta tgaaattgcc agaagaaaga
3000tgattagagc tttagttgaa tttagaatca gaggtgtcaa gaccaatatt cctttcttat
3060tggcattgct aactcatcca gttttcattt cgggtgattg ttggacaact tttattgatg
3120ataccccttc gttattcgaa atggtttctt caaagaatag agcccaaaaa ttattggcat
3180atattggtga cttgtgtgtc aatggttctt caattaaagg tcaaattggt ttccctaaat
3240tgaacaagga agcagaaatc ccagatttgt tggatccaaa tgatgaggtt attgatgttt
3300ctaaaccttc taccaatggt ctaagaccgt atctattaaa gtatggacca gatgcgtttt
3360ccaaaaaagt tcgtgaattc gatggttgta tgattatgga taccacctgg agagatgcac
3420atcaatcatt attggctaca agagttagaa ctattgattt actgagaatt gctccaacga
3480ctagtcatgc cttacaaaat gcatttgcat tagaatgttg gggtggcgca acatttgatg
3540ttgcgatgag gttcctctat gaagatcctt gggagagatt aagacaactt agaaaggcag
3600ttccaaatat tcctttccaa atgttattga gaggtgctaa tggtgttgct tattcgtcat
3660tacctgataa tgcaattgat cattttgtta agcaagcaaa ggataatggt gttgatattt
3720tcagagtctt tgatgctttg aacgatttgg aacaattgaa ggttggtgtt gatgctgtca
3780agaaagccgg aggtgttgtt gaagctacag tttgttactc aggtgatatg ttaattccag
3840gtaaaaagta taacttggat tattatttag agactgttgg aaagattgtg gaaatgggta
3900cccatatttt aggtattaag gatatggctg gcacgttaaa gccaaaggct gctaagttgt
3960tgattggctc gatcagatca aaataccctg acttggttat ccatgtccat acccatgact
4020ctgctggtac cggtatttca acttatgttg catgcgcatt ggcaggtgcc gacattgtcg
4080attgtgcaat caattcgatg tctggtttaa cctctcaacc ttcaatgagt gcttttattg
4140ctgctttaga tggtgatatc gaaactggtg ttccagaaca ttttgcaaga caattagatg
4200catactgggc agaaatgaga ttgttatact catgtttcga agccgacttg aagggaccag
4260acccagaagt ttataaacat gaaattccag gtggacagtt gactaaccta atcttccaag
4320cccaacaagt tggtttgggt gaacaatggg aagaaactaa gaagaagtat gaagatgcta
4380acatgttgtt gggtgatatt gtcaaggtta ccccaacctc caaggttgtt ggtgatttag
4440cccaatttat ggtttctaat aaattagaaa aagaagatgt tgaaaaactt gctaatgaat
4500tagatttccc agattcagtt cttgatttct ttgaaggatt aatgggtaca ccatatggtg
4560gattcccaga gcctttgaga acaaatgtca tttccggcaa gagaagaaaa ttaaagggta
4620gaccaggttt agaattagaa cctttcaacc tcgaggaaat cagagaaaat ttggtttcca
4680gatttggtcc aggtattact gaatgtgatg ttgcatctta taacatgtat ccaaaggttt
4740acgagcaata tcgtaaggtg gttgaaaaat atggtgattt atctgtttta ccaacaaaag
4800catttttggc tcctccaact attggtgaag aagttcatgt ggaaattgag caaggtaaga
4860ctttgattat taagttatta gccatttctg acttgtctaa atctcatggt acaagagaag
4920tatactttga attgaatggt gaaatgagaa aggttacaat tgaagataaa acagctgcaa
4980ttgagactgt tacaagagca aaggctgacg gacacaatcc aaatgaagtt ggtgcgccaa
5040tggctggtgt cgttgttgaa gttagagtga agcatggaac agaagttaag aagggtgatc
5100cattagccgt tttgagtgca atgaaaatgg aaatggttat ttctgctcct gttagtggta
5160gggtcggtga agtttttgtc aacgaaggcg attccgttga tatgggtgat ttgcttgtga
5220aaattgccaa agatgaagcg ccagcagctt aattaattct gtctttgatt ttcttatgtt
5280attcaaaaca tctgccccaa aatctaacga ttatatatat tcctacgtat aactgtatag
5340ctaattattg atttatttgt acataaaaac cacataaatg taaaagcaag aaaaaaaata
5400actaaggaga aggatcaata tctcatttat aatgctcgcc aaagcagcgt acgtgaattt
5460taatcaagac atcaacaaat cttgcaactt ggttatatcg cttcttcacc cactcacccg
5520cttttctaca ttgttgaaca caaatatata caggggtatg tctcaaggtc aagtgcagtt
5580tcaacagaga ctacctcaag gtacctcttc agaaatgcag aacttcactc ttgatcagat
5640tttctccgaa ttaaaggttt aaacatagcc tcatgaaatc agccatttgc ttttgttcaa
5700cgatcttttg aaattgttgt tgttcttggt agttaagttg atccatcttg gcttatgttg
5760tgtgtatgtt gtagttattc ttagtatatt cctgtcctga gtttagtgaa acataatatc
5820gccttgaaat gaaaatgctg aaattcgtcg acatacaatt tttcaaactt tttttttttc
5880ttggtgcacg gacatgtttt taaaggaagt actctatacc agttattctt cacaaattta
5940attgctggag aatagatctt caacgcttta ataaagtagt ttgtttgtca aggatggcgt
6000catacaaaga aagatcagaa tcacacactt cccctgttgc taggagactt ttctccatca
6060tggaggaaaa gaagtctaac ctttgtgcat cattggatat tactgaaact gaaaagcttc
6120tctctatttt ggacactatt ggtccttaca tctgtctagt taaaacacac atcgatattg
6180tttctgattt tacgtatgaa ggaactgtgt tgcctttgaa ggagcttgcc aagaaacata
6240attttatgat ttttgaagat agaaaatttg ctgatattgg taacactgtt aaaaatcaat
6300ataaatctgg tgtcttccgt attgccgaat gggctgacat cactaatgca catggtgtaa
6360cgggtgcagg tattgtttct ggcttgaagg aggcagccca agaaacaacc agtgaaccta
6420gaggtttgct aatgcttgct gagttatcat caaagggttc tttagcatat ggtgaatata
6480cagaaaaaac agtagaaatt gctaaatctg ataaagagtt tgttgag
6527
SEQ ID NO: 8; length: 4760; type: DNA; organism: Artificial Sequence; other information: Synthetic - MAE gene integration fragment
aattctttga aggagcttgc caagaaacat aattttatga tttttgaaga tagaaaattt
60gctgatattg gtaacactgt taaaaatcaa tataaatctg gtgtcttccg tattgccgaa
120tgggctgaca tcactaatgc acatggtgta acgggtgcag gtattgtttc tggcttgaag
180gaggcagccc aagaaacaac cagtgaacct agaggtttgc taatgcttgc tgagttatca
240tcaaagggtt ctttagcata tggtgaatat acagaaaaaa cagtagaaat tgctaaatct
300gataaagagt ttgtcattgg ttttattgcg caacacgata tgggcggtag agaagaaggt
360tttgactgga tcattatgac tccaggggtt ggtttagatg acaaaggtga tgcacttggt
420caacaatata gaactgttga tgaagttgta aagactggaa cggatatcat aattgttggt
480agaggtttgt acggtcaagg aagagatcct atagagcaag ctaaaagata ccaacaagct
540ggttggaatg cttatttaaa cagatttaaa tgattcttac acaaagattt gatacatgta
600cactagttta aataagcatg aaaagaatta cacaagcaaa aaaaaaaaaa taaatgaggt
660actttacgtt cacctacaac caaaaaaact agatagagta aaatcttaag atttagaaaa
720agttgtttaa caaaggcttt agtatgtgaa tttttaatgt agcaaagcga taactaataa
780acataaacaa aagtatggtt ttctttatca gtcaaatcat tatcgattga ttgttccgcg
840tatctgcaga tagcctcatg aaatcagcca tttgcttttg ttcaacgatc ttttgaaatt
900gttgttgttc ttggtagtta agttgatcca tcttggctta tgttgtgtgt atgttgtagt
960tattcttagt atattcctgt cctgagttta gtgaaacata atatcgcctt gaaatgaaaa
1020tgctgaaatt cgtcgacata caatttttca aacttttttt ttttcttggt gcacggacat
1080gtttttaaag gaagtactct ataccagtta ttcttcacaa atttaattgc tggagaatag
1140atcttcaacg cgtttaaaca gcaatttgag gaaggaatag gagaaggaga agcaatttct
1200aggaaagagc aaggtgtgca acagcatgct ctgaatgata ttttcagcaa tagttcagtt
1260gaagaacctg ttggcgtatc tacatcactt cctacaaaca acaccacgaa ttgcgtccgt
1320ggtgacgcaa ctacgaatgg cattgtcaat gccaatgcca gtgcacatac acgtgcaagt
1380cccaccggtt ccctgcccgg ctatggtaga gacaagaagg acgataccgg catcgacatc
1440aacagtttca acagcaatgc gtttggcgtc gacgcgtcga tggggctgcc gtatttggat
1500ttggacgggc tagatttcga tatggatatg gatatggata tggatatgga gatgaatttg
1560aatttagatt tgggtcttga tttggggttg gaattaaaag gggataacaa tgagggtttt
1620cctgttgatt taaacaatgg acgtgggagg tgattgattt aacctgatcc aaaaggggta
1680tgtctatttt ttagagtgtg tctttgtgtc aaattatggt agaatgtgta aagtagtata
1740aactttcctc tcaaatgacg aggtttaaaa caccccccgg gtgagccgag ccgagaatgg
1800ggcaattgtt caatgtgaaa tagaagtatc gagtgagaaa cttgggtgtt ggccagccaa
1860gggggaagga aaatggcgcg aatgctcagg tgagattgtt ttggaattgg gtgaagcgag
1920gaaatgagcg acccggaggt tgtgacttta gtggcggagg aggaacggga ggaaaaggcc
1980aagagggaaa gtgtatataa gggggagcaa tttgccaacc aggatagaat tggatgagtt
2040ataattctac tgtatttatt gtataattta tttctccttt tatatcaaac acattacaaa
2100acacacaaaa catacaaaca tacacagcta gcatgggtga attgaaagag attttgaaac
2160aaagatatca tgaattactt gattggaatg ttaaggcacc acatgtccct ttatcccaga
2220gattgaagca ctttacttgg tcatggtttg cttgtactat ggcaaccggt ggtgttggtt
2280tgatcattgg ttccttccca ttcagattct acggtttgaa caccattggc aagattgttt
2340acatcttaca aatctttttg ttttctcttt ttggctcttg tatgttgttt cgtttcatca
2400agtatccatc taccattaag gactcttgga atcatcactt ggaaaagttg tttatcgcaa
2460cttgtttgtt atctatttcc acattcatcg acatgttagc tatctatgct tatccagata
2520ccggtgaatg gatggtctgg gtcattagaa tcttatacta catctatgtc gctgtctctt
2580tcatctactg tgttatggcc tttttcacca ttttcaacaa tcatgtttac actattgaaa
2640ctgcttctcc agcttggatt ttgccaatct tccctccaat gatctgtggt gtcattgctg
2700gtgctgttaa ctccacccaa cctgctcacc aattgaaaaa catggtcatt ttcggtatct
2760tgtttcaagg tttaggtttt tgggtttacc ttttactttt cgccgttaat gttttgagat
2820tcttcacagt cggtttagca aagccacaag atagaccagg tatgtttatg ttcgttggtc
2880caccagcttt ctctggttta gcattgatta acattgcaag aggtgcaatg ggctcaagac
2940cttacatttt cgttggtgca aactcttccg aatacttagg ttttgtctca accttcatgg
3000ccattttcat ctggggttta gccgcatggt gttattgctt agctatggtt tccttccttg
3060ccggcttttt cactagagca ccattgaaat tcgcttgtgg ttggttcgct ttcatctttc
3120caaatgttgg ttttgttaac tgtactatcg aaatcggcaa gatgattgat tctaaggctt
3180ttcaaatgtt tggtcacatc attggtgtta tcttgtgtat tcaatggatt ttgttaatgt
3240acttaatggt tagagcattc cttgttaatg acttgtgcta tcctggtaaa gacgaagatg
3300cacacccacc accaaagcca aacactggtg tcttaaaccc aactttccca ccagagaagg
3360ctccagcatc attagagaag gttgatactc atgttacatc aacaggtggt gaatccgatc
3420ctccatcttc cgaacatgaa tccgtttaag gcgcgccatc taatagttta atcacagctt
3480atagtctact atagttttct tttttaaaca ttgttgtatt ttgtcccccc cctctaattg
3540atgatgatta tcctataaga atccaataaa acgatggaaa ctaataccct ctcctttgtc
3600atgtggtctt tagtatttct tgaacattgg ctctgatttc tcgactttat agtcctatta
3660aaatcgctgt tagttctcga tcgttgtatc tcgtttcttg tctctttggt ggatgatttt
3720gcgtgcgaac atgttttttt ccctttctct caccatcatc gtgtagttct tgtcaccatc
3780ccccccaccc cttccttctc tcattgattc tataagagct tatccacaga ggtgcagtaa
3840cgaggtagtt taaccttcga gtggatcaaa atgtcacaca ggcctgcggc cgcatttggc
3900aaggcgtatc tatataggag gatcacaaga aaagagagtt caacttaggg aaccaggctt
3960gacaaaagat aatattaaaa aaggaagaaa aagagagaaa gaaagtaaag acaagaagaa
4020tcaactccaa tttgaataaa tgggcctgta agaatcttca atctcgatga cggggaaaac
4080gtcctctttt atagaccaac cccttgtggg gctgtcggga aaagcgggtt tcccgggaaa
4140ctcatatctg attggggggt tgaagctatt ccgcgttttg gggggtgcgg tttataagaa
4200gaatgagtaa tgcaaacggc gtttataaag agaaagggtg tggcggtgtt aagcggttga
4260ggttttaagg tgtgggggac ggggtgtgat tttttcggcg ttacgggcac ggggtatgat
4320tctcttaatg tcatttccct ttctttctac ttctagggcg cgcagcgtgc gacaatggtg
4380tttgttgtgc aaataggtgg ttgtgttgtc gttggagttc tcatgaacat atcactctat
4440tagcaaattt accctaagta tgtaaggttt agtgtgttga ggactaatac agacaatctt
4500tgtcttatat ataaatggtt ctactcaatt ttataagttt tttttttttt tttttaatct
4560ctttatgaaa atcaagaatg acattaaaac aatcacaata ttgtatgcaa gaactttgcc
4620tctaaacctc tttcaaagag atgcataatt ataaaacaaa tctatttgca tattcgcttt
4680acacaactat tcaagaatat aattacctaa agagtccatc ttgagttcac caacaccact
4740acttagagct cggtacccgc
4760
SEQ ID NO: 9; length: 6874; type: DNA; organism: Artificial Sequence; other information: Synthetic - FRD gene integration fragment
aaacctccgt tatgtatgtt tgtacccaaa aagaatgcgc tatattagtt taatctttta
60taaacccgga attataaaaa tacagttagg aataaagtaa tagaaagatg aacaacgggc
120ctaaaaagac taatgtgttg tggatcggaa tgtttcgaat agagtattaa agttatgctt
180tcttttcttt ttgaacatgc ttggtattac tttgatatgc aaaagatatc gacaaattga
240aaatggtttt gatgtctata gatgtggcat ggtaaggttc atttcaattt agcaaatatc
300agacgagctc agcggccgcg gatccctcga ggagtccatc ggttcctgtc agatgggata
360ctcttgacgt ggaaaattca aacagaaaaa aaaccccaat aatgaaaaat aacactacgt
420tatatccgtg gtatcctcta tcgtatcgta tcgtagcgta tcgtagcgta ccgtatcaca
480gtatagtcta atattccgta tcttattgta tcctatccta ttcgatccta ttgtatttca
540gtgcaccatt ttaatttcta ttgctataat gtccttatta gttgccactg tgaggtgacc
600aatggacgag ggcgagccgt tcagaagccg cgaagggtgt tcttcccatg aatttcttaa
660ggagggcggc tcagctccga gagtgaggcg agacgtctcg gtcagcgtat cccccttcct
720cggcttttac aaatgatgcg ctcttaatag tgtgtcgtta tccttttggc attgacgggg
780gagggaaatt gattgagcgc atccatattt ttgcggactg ctgaggacaa tggtggtttt
840tccgggtggc gtgggctaca aatgatacga tggttttttt cttttcggag aaggcgtata
900aaaaggacac ggagaaccca tttattctaa aaacagttga gcttctttaa ttattttttg
960atataatatt ctattattat atattttctt cccaataaaa caaaataaaa caaaacacag
1020caaaacacaa aaattctaga atggctgatg gcaaaacctc tgcatcagtt gttgctgttg
1080atgctgaacg tgccgctaag gaaagagatg cagcagctag agctatgttg caaggtggtg
1140gtgtctctcc tgctggcaag gcacaattgt tgaaaaaggg tttggttcac actgttccat
1200ataccttaaa ggttgtcgtc gcagatccaa aggaaatgga gaaggcaact gctgacgcag
1260aagaggtttt acaagctgca tttcaagtcg tcgacaccct tttgaacaac tttaacgaaa
1320actcagaagt ttcaagagtc aataggttgg cagttggtga ggaacatcaa atgtctgaaa
1380cattgaaaca cgtcatggcc tgttgtcaaa aggtttatca ttcctccaga ggtgtttttg
1440acccagcagt tggtccatta gtccgtgaac ttagagaagc tgctcacaag ggtaaaactg
1500ttccagccga aagagttaat gatttgttat ccaaatgtac ccttaatgca tctttttcaa
1560ttgatatgtc cagaggtatg attgcaagga agcatccaga cgccatgttg gatttgggtg
1620gtgtcaacaa gggttatggt atcgactaca ttgttgaaca cttaaactct ttgggttatg
1680atgatgtctt tttcgaatgg ggtggtgatg ttagagcatc cggcaaaaac cagttatctc
1740aaccttgggc tgttggtatt gttagaccac ctgccttggc cgacattaga actgttgtcc
1800cagaggacaa aagatccttt atccgtgtcg tcagattgaa caacgaagct attgctacct
1860ctggtgatta tgagaatttg gttgaaggtc ctggttctaa ggtttactct tccaccttca
1920atccaacttc caaaaacttg ttggaaccta ccgaagcagg tatggctcaa gtttctgtca
1980agtgttgctc atgtatctac gctgatgctt tagcaacagc agctttgttg aaaaacgatc
2040ctgctgccgt tagaaggatc ttagataact ggagatatgt cagagatact gttactgact
2100acaccactta cacaagggaa ggtgaaagag ttgctaagat gttggaaatt gctaccgaag
2160atgctgaaat gagagcaaag agaatcaagg gctctttacc agcaagagtt atcattgttg
2220gtggtggttt ggccggttgt tccgcagcta tcgaagcagc taactgtggc gcccacgtca
2280tcttgttaga aaaggaacca aagttaggtg gtaactctgc aaaggctacc tccggtatca
2340acgcctgggg tactagagca caagcaaaac aaggtgtcat ggacggcggc aagtttttcg
2400aaagagatac ccatagatcc ggcaagggtg gtaattgcga tccatgcctt gttaagactt
2460tgtccgttaa gtcctctgat gcagttaagt ggttatctga attaggtgtt ccattgactg
2520ttttgtctca attaggtggt gcttcaagga aacgttgtca ccgtgcacca gataagtctg
2580atggtacacc agtcccagtt ggtttcacca ttatgaaaac ccttgaaaac cacattgtca
2640acgatttgtc cagacatgtt acagttatga caggtattac cgtcacagct ttagaatcta
2700catcaagagt cagacctgat ggtgttttag tcaagcatgt tactggtgtt cacttgattc
2760aggcatctgg tcaatctatg gttttgaatg cagacgctgt tatcttagct actggtggtt
2820tctccaatga tcatacccca aactcccttt tacaacaata cgccccacag ttgtcatctt
2880ttccaacaac caatggtgtc tgggcaactg gcgatggtgt taagatggct tccaagttgg
2940gtgtcgcctt agttgatatg gataaggtcc aattacatcc taccggcttg ttagacccaa
3000aagatccatc taatagaacc aagtatcttg gtccagaggc cttaagaggt tccggcggtg
3060tcttgttaaa caaaaacggt gaaagatttg ttaatgaatt agacttaaga tctgttgtct
3120ctcaagctat catcgcacaa gataatgagt acccaggctc tggtggttcc aagttcgcat
3180actgtgtttt gaacgaaact gcagcaaagt tattcggcaa aaacttcctt ggtttctact
3240ggaatagatt aggtcttttc caaaaggttg attccgttgc tggtttagct aagttgattg
3300gttgtccaga agctaatgtt gttgctacat tgaagcaata tgaggagtta tcttccaaaa
3360agcttaatcc ttgtccattg actggcaagt ctgtctttcc ttgtgtttta ggcactcaag
3420gtccatacta tgttgccttg gttaccccat ccattcacta cactatgggt ggttgtttga
3480tttccccatc tgctgagatg caaaccattg acaactctgg tgttactcct gtcagacgtc
3540caatcttagg cttattcggt gctggtgaag ttactggcgg tgtccatggt ggtaacagat
3600taggcggtaa ctctttgtta gaatgtgttg ttttcggcaa gatcgctggt gacagagctg
3660caaccatctt gcaaaagaaa aacaccggct tatcaatgac agaatggtct actgtcgtct
3720taagagaagt tagagaaggt ggtgtctatg gtgctggttc cagagttttg aggtttaaca
3780tgcctggtgc attacagaga actggtttag ctttaggtca attcatcggt atcagaggtg
3840attgggacgg tcacagattg atcggttact attctccaat cactttacct gatgatgttg
3900gtgttattgg tatcttagct agagcagaca agggtagatt ggcagaatgg atttctgcat
3960tgcagccagg tgacgctgtt gagatgaagg cctgcggtgg tcttatcatt gacagaagat
4020tcgctgaaag acatttcttt ttccgtggtc ataagatcag aaagttggcc cttatcggtg
4080gtggtactgg tgttgcacca atgttacaaa tcgtcagagc tgctgtcaaa aagccatttg
4140tcgattcaat tgagtccatt cagttcatct atgctgcaga ggatgtttcc gagcttacat
4200acagaacctt acttgaatct tacgaagagg aatatggttc agaaaagttt aagtgtcact
4260tcgttttgaa taacccacca gctcaatgga ctgacggtgt tggtttcgtt gatactgcat
4320tgttgagatc cgcagttcaa gcaccatcaa atgatttgct tgttgcaatt tgtggtccac
4380caatcatgca aagagcagtt aagggtgcat tgaaaggttt aggttacaat atgaatcttg
4440ttagaaccgt tgacgaaact gaaccaccat cataattaat taacatctga atgtaaaatg
4500aacattaaaa tgaattacta aactttacgt ctactttaca atctataaac tttgtttaat
4560catataacga aatacactaa tacacaatcc tgtacgtatg taatactttt atccatcaag
4620gattgagaaa aaaaagtaat gattccctgg gccattaaaa cttagacccc caagcttgga
4680taggtcactc tctattttcg tttctccctt ccctgataga agggtgatat gtaattaaga
4740ataatatata attttataat aaaagaattc atagcctcat gaaatcagcc atttgctttt
4800gttcaacgat cttttgaaat tgttgttgtt cttggtagtt aagttgatcc atcttggctt
4860atgttgtgtg tatgttgtag ttattcttag tatattcctg tcctgagttt agtgaaacat
4920aatatcgcct tgaaatgaaa atgctgaaat tcgtcgacat acaatttttc aaactttttt
4980tttttcttgg tgcacggaca tgtttttaaa ggaagtactc tataccagtt attcttcaca
5040aatttaattg ctggagaata gatcttcaac gctttaataa agtagtttgt ttgtcaagga
5100tggcgtcata caaagaaaga tcagaatcac acacttcccc tgttgctagg agacttttct
5160ccatcatgga ggaaaagaag tctaaccttt gtgcatcatt ggatattact gaaactgaaa
5220agcttctctc tattttggac actattggtc cttacatctg tctagttaaa acacacatcg
5280atattgtttc tgattttacg tatgaaggaa ctgtgttgcc tttgaaggag cttgccaaga
5340aacataattt tatgattttt gaagatagaa aatttgctga tattggtaac actgttaaaa
5400atcaatataa atctggtgtc ttccgtattg ccgaatgggc tgacatcact aatgcacatg
5460gtgtaacggg tgcaggtatt gtttctggct tgaaggaggc agcccaagaa acaaccagtg
5520aacctagagg tttgctaatg cttgctgagt tatcatcaaa gggttcttta gcatatggtg
5580aatatacaga aaaaacagta gaaattgcta aatctgataa agagtttgtc attggtttta
5640ttgcgcaaca cgatatgggc ggtagagaag aaggttttga ctggatcatt atgactccag
5700gggttggttt agatgacaaa ggtgatgcac ttggtcaaca atatagaact gttgatgaag
5760ttgtaaagac tggaacggat atcataattg ttggtagagg tttgtacggt caaggaagag
5820atcctataga gcaagctaaa agataccaac aagctggttg gaatgcttat ttaaacagat
5880ttaaatgatt cttacacaaa gatttgatac atgtacacta gtttaaataa gcatgaaaag
5940aattacacaa gcaaaaaaaa aaaaataaat gaggtacttt acgttcacct acaaccaaaa
6000aaactagata gagtaaaatc ttaagattta gaaaaagttg tttaacaaag gctttagtat
6060gtgaattttt aatgtagcaa agcgataact aataaacata aacaaaagta tggttttctt
6120tatcagtcaa atcattatcg attgattgtt ccgcgtatct gcagatagcc tcatgaaatc
6180agccatttgc ttttgttcaa cgatcttttg aaattgttgt tgttcttggt agttaagttg
6240atccatcttg gcttatgttg tgtgtatgtt gtagttattc ttagtatatt cctgtcctga
6300gtttagtgaa acataatatc gccttgaaat gaaaatgctg aaattcgtcg acatacaatt
6360tttcaaactt tttttttttc ttggtgcacg gacatgtttt taaaggaagt actctatacc
6420agttattctt cacaaattta attgctggag aatagatctt caacgccccg ggggatctgg
6480atccgcggcc gctcatatgt ttgaaggtat tatcactgct gttgatttac gttcttgaaa
6540actgcacgga taatattcac aatactaaca ataaagaaga ctcattgtgg aaggtgactc
6600aatcatgcta gaaaagctgg ggaataaagg cacttttata gtagccacat tttggttcaa
6660aagaatataa aggaaaaaaa aatattttcc agtgaaaaag aaaagactct ttctccgaga
6720agccgagttt ctacgaggcc ttgttgagtc ataggggacc tctgtggttg actccggctt
6780attacgtgaa tcatcggggg agccgcaccg tttgtccgcg acaggagaaa acgcaaggag
6840tcaaacatta aattggtagg cactaccgag gttt
6874
SEQ ID NO: 10 (length: 3435; type: DNA; organism: Leishmania mexicana)
atggctgatg gcaaaacctc tgcatcagtt
gttgctgttg atgctgaacg tgccgctaag 60gaaagagatg cagcagctag agctatgttg
caaggtggtg gtgtctctcc tgctggcaag 120gcacaattgt tgaaaaaggg tttggttcac
actgttccat ataccttaaa ggttgtcgtc 180gcagatccaa aggaaatgga gaaggcaact
gctgacgcag aagaggtttt acaagctgca 240tttcaagtcg tcgacaccct tttgaacaac
tttaacgaaa actcagaagt ttcaagagtc 300aataggttgg cagttggtga ggaacatcaa
atgtctgaaa cattgaaaca cgtcatggcc 360tgttgtcaaa aggtttatca ttcctccaga
ggtgtttttg acccagcagt tggtccatta 420gtccgtgaac ttagagaagc tgctcacaag
ggtaaaactg ttccagccga aagagttaat 480gatttgttat ccaaatgtac ccttaatgca
tctttttcaa ttgatatgtc cagaggtatg 540attgcaagga agcatccaga cgccatgttg
gatttgggtg gtgtcaacaa gggttatggt 600atcgactaca ttgttgaaca cttaaactct
ttgggttatg atgatgtctt tttcgaatgg 660ggtggtgatg ttagagcatc cggcaaaaac
cagttatctc aaccttgggc tgttggtatt 720gttagaccac ctgccttggc cgacattaga
actgttgtcc cagaggacaa aagatccttt 780atccgtgtcg tcagattgaa caacgaagct
attgctacct ctggtgatta tgagaatttg 840gttgaaggtc ctggttctaa ggtttactct
tccaccttca atccaacttc caaaaacttg 900ttggaaccta ccgaagcagg tatggctcaa
gtttctgtca agtgttgctc atgtatctac 960gctgatgctt tagcaacagc agctttgttg
aaaaacgatc ctgctgccgt tagaaggatc 1020ttagataact ggagatatgt cagagatact
gttactgact acaccactta cacaagggaa 1080ggtgaaagag ttgctaagat gttggaaatt
gctaccgaag atgctgaaat gagagcaaag 1140agaatcaagg gctctttacc agcaagagtt
atcattgttg gtggtggttt ggccggttgt 1200tccgcagcta tcgaagcagc taactgtggc
gcccacgtca tcttgttaga aaaggaacca 1260aagttaggtg gtaactctgc aaaggctacc
tccggtatca acgcctgggg tactagagca 1320caagcaaaac aaggtgtcat ggacggcggc
aagtttttcg aaagagatac ccatagatcc 1380ggcaagggtg gtaattgcga tccatgcctt
gttaagactt tgtccgttaa gtcctctgat 1440gcagttaagt ggttatctga attaggtgtt
ccattgactg ttttgtctca attaggtggt 1500gcttcaagga aacgttgtca ccgtgcacca
gataagtctg atggtacacc agtcccagtt 1560ggtttcacca ttatgaaaac ccttgaaaac
cacattgtca acgatttgtc cagacatgtt 1620acagttatga caggtattac cgtcacagct
ttagaatcta catcaagagt cagacctgat 1680ggtgttttag tcaagcatgt tactggtgtt
cacttgattc aggcatctgg tcaatctatg 1740gttttgaatg cagacgctgt tatcttagct
actggtggtt tctccaatga tcatacccca 1800aactcccttt tacaacaata cgccccacag
ttgtcatctt ttccaacaac caatggtgtc 1860tgggcaactg gcgatggtgt taagatggct
tccaagttgg gtgtcgcctt agttgatatg 1920gataaggtcc aattacatcc taccggcttg
ttagacccaa aagatccatc taatagaacc 1980aagtatcttg gtccagaggc cttaagaggt
tccggcggtg tcttgttaaa caaaaacggt 2040gaaagatttg ttaatgaatt agacttaaga
tctgttgtct ctcaagctat catcgcacaa 2100gataatgagt acccaggctc tggtggttcc
aagttcgcat actgtgtttt gaacgaaact 2160gcagcaaagt tattcggcaa aaacttcctt
ggtttctact ggaatagatt aggtcttttc 2220caaaaggttg attccgttgc tggtttagct
aagttgattg gttgtccaga agctaatgtt 2280gttgctacat tgaagcaata tgaggagtta
tcttccaaaa agcttaatcc ttgtccattg 2340actggcaagt ctgtctttcc ttgtgtttta
ggcactcaag gtccatacta tgttgccttg 2400gttaccccat ccattcacta cactatgggt
ggttgtttga tttccccatc tgctgagatg 2460caaaccattg acaactctgg tgttactcct
gtcagacgtc caatcttagg cttattcggt 2520gctggtgaag ttactggcgg tgtccatggt
ggtaacagat taggcggtaa ctctttgtta 2580gaatgtgttg ttttcggcaa gatcgctggt
gacagagctg caaccatctt gcaaaagaaa 2640aacaccggct tatcaatgac agaatggtct
actgtcgtct taagagaagt tagagaaggt 2700ggtgtctatg gtgctggttc cagagttttg
aggtttaaca tgcctggtgc attacagaga 2760actggtttag ctttaggtca attcatcggt
atcagaggtg attgggacgg tcacagattg 2820atcggttact attctccaat cactttacct
gatgatgttg gtgttattgg tatcttagct 2880agagcagaca agggtagatt ggcagaatgg
atttctgcat tgcagccagg tgacgctgtt 2940gagatgaagg cctgcggtgg tcttatcatt
gacagaagat tcgctgaaag acatttcttt 3000ttccgtggtc ataagatcag aaagttggcc
cttatcggtg gtggtactgg tgttgcacca 3060atgttacaaa tcgtcagagc tgctgtcaaa
aagccatttg tcgattcaat tgagtccatt 3120cagttcatct atgctgcaga ggatgtttcc
gagcttacat acagaacctt acttgaatct 3180tacgaagagg aatatggttc agaaaagttt
aagtgtcact tcgttttgaa taacccacca 3240gctcaatgga ctgacggtgt tggtttcgtt
gatactgcat tgttgagatc cgcagttcaa 3300gcaccatcaa atgatttgct tgttgcaatt
tgtggtccac caatcatgca aagagcagtt 3360aagggtgcat tgaaaggttt aggttacaat
atgaatcttg ttagaaccgt tgacgaaact 3420gaaccaccat cataa
3435
SEQ ID NO: 11 (length: 6874; type: DNA; organism: Artificial Sequence; description: Synthetic - FRD gene integration fragment)
aaacctcggt agtgcctacc
aatttaatgt ttgactcctt gcgttttctc ctgtcgcgga 60caaacggtgc ggctcccccg
atgattcacg taataagccg gagtcaacca cagaggtccc 120ctatgactca acaaggcctc
gtagaaactc ggcttctcgg agaaagagtc ttttcttttt 180cactggaaaa tatttttttt
tcctttatat tcttttgaac caaaatgtgg ctactataaa 240agtgccttta ttccccagct
tttctagcat gattgagtca ccttccacaa tgagtcttct 300ttattgttag tattgtgaat
attatccgtg cagttttcaa gaacgtaaat caacagcagt 360gataatacct tcaaacatat
gagcggccgc ggatccctcg aggagtccat cggttcctgt 420cagatgggat actcttgacg
tggaaaattc aaacagaaaa aaaaccccaa taatgaaaaa 480taacactacg ttatatccgt
ggtatcctct atcgtatcgt atcgtagcgt atcgtagcgt 540accgtatcac agtatagtct
aatattccgt atcttattgt atcctatcct attcgatcct 600attgtatttc agtgcaccat
tttaatttct attgctataa tgtccttatt agttgccact 660gtgaggtgac caatggacga
gggcgagccg ttcagaagcc gcgaagggtg ttcttcccat 720gaatttctta aggagggcgg
ctcagctccg agagtgaggc gagacgtctc ggtcagcgta 780tcccccttcc tcggctttta
caaatgatgc gctcttaata gtgtgtcgtt atccttttgg 840cattgacggg ggagggaaat
tgattgagcg catccatatt tttgcggact gctgaggaca 900atggtggttt ttccgggtgg
cgtgggctac aaatgatacg atggtttttt tcttttcgga 960gaaggcgtat aaaaaggaca
cggagaaccc atttattcta aaaacagttg agcttcttta 1020attatttttt gatataatat
tctattatta tatattttct tcccaataaa acaaaataaa 1080acaaaacaca gcaaaacaca
aaaattctag aatggctgat ggcaaaacct ctgcatcagt 1140tgttgctgtt gatgctgaac
gtgccgctaa ggaaagagat gcagcagcta gagctatgtt 1200gcaaggtggt ggtgtctctc
ctgctggcaa ggcacaattg ttgaaaaagg gtttggttca 1260cactgttcca tataccttaa
aggttgtcgt cgcagatcca aaggaaatgg agaaggcaac 1320tgctgacgca gaagaggttt
tacaagctgc atttcaagtc gtcgacaccc ttttgaacaa 1380ctttaacgaa aactcagaag
tttcaagagt caataggttg gcagttggtg aggaacatca 1440aatgtctgaa acattgaaac
acgtcatggc ctgttgtcaa aaggtttatc attcctccag 1500aggtgttttt gacccagcag
ttggtccatt agtccgtgaa cttagagaag ctgctcacaa 1560gggtaaaact gttccagccg
aaagagttaa tgatttgtta tccaaatgta cccttaatgc 1620atctttttca attgatatgt
ccagaggtat gattgcaagg aagcatccag acgccatgtt 1680ggatttgggt ggtgtcaaca
agggttatgg tatcgactac attgttgaac acttaaactc 1740tttgggttat gatgatgtct
ttttcgaatg gggtggtgat gttagagcat ccggcaaaaa 1800ccagttatct caaccttggg
ctgttggtat tgttagacca cctgccttgg ccgacattag 1860aactgttgtc ccagaggaca
aaagatcctt tatccgtgtc gtcagattga acaacgaagc 1920tattgctacc tctggtgatt
atgagaattt ggttgaaggt cctggttcta aggtttactc 1980ttccaccttc aatccaactt
ccaaaaactt gttggaacct accgaagcag gtatggctca 2040agtttctgtc aagtgttgct
catgtatcta cgctgatgct ttagcaacag cagctttgtt 2100gaaaaacgat cctgctgccg
ttagaaggat cttagataac tggagatatg tcagagatac 2160tgttactgac tacaccactt
acacaaggga aggtgaaaga gttgctaaga tgttggaaat 2220tgctaccgaa gatgctgaaa
tgagagcaaa gagaatcaag ggctctttac cagcaagagt 2280tatcattgtt ggtggtggtt
tggccggttg ttccgcagct atcgaagcag ctaactgtgg 2340cgcccacgtc atcttgttag
aaaaggaacc aaagttaggt ggtaactctg caaaggctac 2400ctccggtatc aacgcctggg
gtactagagc acaagcaaaa caaggtgtca tggacggcgg 2460caagtttttc gaaagagata
cccatagatc cggcaagggt ggtaattgcg atccatgcct 2520tgttaagact ttgtccgtta
agtcctctga tgcagttaag tggttatctg aattaggtgt 2580tccattgact gttttgtctc
aattaggtgg tgcttcaagg aaacgttgtc accgtgcacc 2640agataagtct gatggtacac
cagtcccagt tggtttcacc attatgaaaa cccttgaaaa 2700ccacattgtc aacgatttgt
ccagacatgt tacagttatg acaggtatta ccgtcacagc 2760tttagaatct acatcaagag
tcagacctga tggtgtttta gtcaagcatg ttactggtgt 2820tcacttgatt caggcatctg
gtcaatctat ggttttgaat gcagacgctg ttatcttagc 2880tactggtggt ttctccaatg
atcatacccc aaactccctt ttacaacaat acgccccaca 2940gttgtcatct tttccaacaa
ccaatggtgt ctgggcaact ggcgatggtg ttaagatggc 3000ttccaagttg ggtgtcgcct
tagttgatat ggataaggtc caattacatc ctaccggctt 3060gttagaccca aaagatccat
ctaatagaac caagtatctt ggtccagagg ccttaagagg 3120ttccggcggt gtcttgttaa
acaaaaacgg tgaaagattt gttaatgaat tagacttaag 3180atctgttgtc tctcaagcta
tcatcgcaca agataatgag tacccaggct ctggtggttc 3240caagttcgca tactgtgttt
tgaacgaaac tgcagcaaag ttattcggca aaaacttcct 3300tggtttctac tggaatagat
taggtctttt ccaaaaggtt gattccgttg ctggtttagc 3360taagttgatt ggttgtccag
aagctaatgt tgttgctaca ttgaagcaat atgaggagtt 3420atcttccaaa aagcttaatc
cttgtccatt gactggcaag tctgtctttc cttgtgtttt 3480aggcactcaa ggtccatact
atgttgcctt ggttacccca tccattcact acactatggg 3540tggttgtttg atttccccat
ctgctgagat gcaaaccatt gacaactctg gtgttactcc 3600tgtcagacgt ccaatcttag
gcttattcgg tgctggtgaa gttactggcg gtgtccatgg 3660tggtaacaga ttaggcggta
actctttgtt agaatgtgtt gttttcggca agatcgctgg 3720tgacagagct gcaaccatct
tgcaaaagaa aaacaccggc ttatcaatga cagaatggtc 3780tactgtcgtc ttaagagaag
ttagagaagg tggtgtctat ggtgctggtt ccagagtttt 3840gaggtttaac atgcctggtg
cattacagag aactggttta gctttaggtc aattcatcgg 3900tatcagaggt gattgggacg
gtcacagatt gatcggttac tattctccaa tcactttacc 3960tgatgatgtt ggtgttattg
gtatcttagc tagagcagac aagggtagat tggcagaatg 4020gatttctgca ttgcagccag
gtgacgctgt tgagatgaag gcctgcggtg gtcttatcat 4080tgacagaaga ttcgctgaaa
gacatttctt tttccgtggt cataagatca gaaagttggc 4140ccttatcggt ggtggtactg
gtgttgcacc aatgttacaa atcgtcagag ctgctgtcaa 4200aaagccattt gtcgattcaa
ttgagtccat tcagttcatc tatgctgcag aggatgtttc 4260cgagcttaca tacagaacct
tacttgaatc ttacgaagag gaatatggtt cagaaaagtt 4320taagtgtcac ttcgttttga
ataacccacc agctcaatgg actgacggtg ttggtttcgt 4380tgatactgca ttgttgagat
ccgcagttca agcaccatca aatgatttgc ttgttgcaat 4440ttgtggtcca ccaatcatgc
aaagagcagt taagggtgca ttgaaaggtt taggttacaa 4500tatgaatctt gttagaaccg
ttgacgaaac tgaaccacca tcataattaa ttaacatctg 4560aatgtaaaat gaacattaaa
atgaattact aaactttacg tctactttac aatctataaa 4620ctttgtttaa tcatataacg
aaatacacta atacacaatc ctgtacgtat gtaatacttt 4680tatccatcaa ggattgagaa
aaaaaagtaa tgattccctg ggccattaaa acttagaccc 4740ccaagcttgg ataggtcact
ctctattttc gtttctccct tccctgatag aagggtgata 4800tgtaattaag aataatatat
aattttataa taaaagaatt catagcctca tgaaatcagc 4860catttgcttt tgttcaacga
tcttttgaaa ttgttgttgt tcttggtagt taagttgatc 4920catcttggct tatgttgtgt
gtatgttgta gttattctta gtatattcct gtcctgagtt 4980tagtgaaaca taatatcgcc
ttgaaatgaa aatgctgaaa ttcgtcgaca tacaattttt 5040caaacttttt ttttttcttg
gtgcacggac atgtttttaa aggaagtact ctataccagt 5100tattcttcac aaatttaatt
gctggagaat agatcttcaa cgctttaata aagtagtttg 5160tttgtcaagg atggcgtcat
acaaagaaag atcagaatca cacacttccc ctgttgctag 5220gagacttttc tccatcatgg
aggaaaagaa gtctaacctt tgtgcatcat tggatattac 5280tgaaactgaa aagcttctct
ctattttgga cactattggt ccttacatct gtctagttaa 5340aacacacatc gatattgttt
ctgattttac gtatgaagga actgtgttgc ctttgaagga 5400gcttgccaag aaacataatt
ttatgatttt tgaagataga aaatttgctg atattggtaa 5460cactgttaaa aatcaatata
aatctggtgt cttccgtatt gccgaatggg ctgacatcac 5520taatgcacat ggtgtaacgg
gtgcaggtat tgtttctggc ttgaaggagg cagcccaaga 5580aacaaccagt gaacctagag
gtttgctaat gcttgctgag ttatcatcaa agggttcttt 5640agcatatggt gaatatacag
aaaaaacagt agaaattgct aaatctgata aagagtttgt 5700cattggtttt attgcgcaac
acgatatggg cggtagagaa gaaggttttg actggatcat 5760tatgactcca ggggttggtt
tagatgacaa aggtgatgca cttggtcaac aatatagaac 5820tgttgatgaa gttgtaaaga
ctggaacgga tatcataatt gttggtagag gtttgtacgg 5880tcaaggaaga gatcctatag
agcaagctaa aagataccaa caagctggtt ggaatgctta 5940tttaaacaga tttaaatgat
tcttacacaa agatttgata catgtacact agtttaaata 6000agcatgaaaa gaattacaca
agcaaaaaaa aaaaaataaa tgaggtactt tacgttcacc 6060tacaaccaaa aaaactagat
agagtaaaat cttaagattt agaaaaagtt gtttaacaaa 6120ggctttagta tgtgaatttt
taatgtagca aagcgataac taataaacat aaacaaaagt 6180atggttttct ttatcagtca
aatcattatc gattgattgt tccgcgtatc tgcagatagc 6240ctcatgaaat cagccatttg
cttttgttca acgatctttt gaaattgttg ttgttcttgg 6300tagttaagtt gatccatctt
ggcttatgtt gtgtgtatgt tgtagttatt cttagtatat 6360tcctgtcctg agtttagtga
aacataatat cgccttgaaa tgaaaatgct gaaattcgtc 6420gacatacaat ttttcaaact
tttttttttt cttggtgcac ggacatgttt ttaaaggaag 6480tactctatac cagttattct
tcacaaattt aattgctgga gaatagatct tcaacgcccc 6540gggggatctg gatccgcggc
cgctgagctc gtctgatatt tgctaaattg aaatgaacct 6600taccatgcca catctataga
catcaaaacc attttcaatt tgtcgatatc ttttgcatat 6660caaagtaata ccaagcatgt
tcaaaaagaa aagaaagcat aactttaata ctctattcga 6720aacattccga tccacaacac
attagtcttt ttaggcccgt tgttcatctt tctattactt 6780tattcctaac tgtattttta
taattccggg tttataaaag attaaactaa tatagcgcat 6840tctttttggg tacaaacata
cataacggag gttt 6874
SEQ ID NO: 12 (length: 4127; type: DNA; organism: Artificial Sequence; description: Synthetic - MDH gene integration fragment)
gttaacccgt ttcgatggga
ttcccagaag tggatactat actgtctgca atgcactaca 60ctctaaaaaa gtattataca
ttaccataca ttagcaaatc accaatactc tgcactgttt 120cagtgtgtgc acattgctac
ccaattggga aatcgcaggg aaaatgagac accccctcca 180ttcgtattac gtaagacaat
atcagggctg ccgaattcgg cagaaaagcc gagccggccg 240agtcctcttg cacggagtgt
gtccgaaaag ggcagctctg cagtggggga gaggaggtcg 300cacgtctatg cggtgttggc
atggcctgtg cgtgtacctg tcccctccct gggcatcccc 360cactgcgcgc cttctccatt
gggcgctgcg ggcactccgc gccgttaata caggaggggg 420gggggaaagc ttaagattag
agcgggtaca gtcagtgggt gtattgaccc catttctgtc 480agtataaacc ccccgttgag
ccgccggttt ggttgtttat ggataaaatt tttttttccc 540cgcatggaga agattgaggg
ggggaaggaa tgggaaaaag gccagagcca tctccacagc 600ggaatccgac cgttaatggg
gtgaaacacc cccaccaggt agagcaggaa gaatggggaa 660acaaggtgga gagatggtca
ttgttgggaa tagtgggaaa atgaggggga agagaatgac 720tataaaatgg gaagggggtc
caagttatcc aagcagtcca tttagagaag ggagcggccc 780ctattggtag ttctttcccc
ctctcaagct ggcgtgaaat gcaaccttac ggcgtctacg 840ttactacaag gtccagaaag
tgtaggtatt gctactattt ttatttttta ttggttctgg 900agaaatgcag acagtcaatg
aacacaactg tctcaatatg catctatgca catgcacaca 960cacacacatc acaggtaccc
ctacaaagag aggtctcttg ataatgtttc attaccacgt 1020ggcatccccc cccccccccc
caataaacaa gtggccgagt tcccctgttg cagaggagga 1080caaaaaaacc gctggtgttg
gtaccattat gcagcaacta gcacaacaaa caaccgaccc 1140agacatacaa atcaacaaca
cttcgccaaa gacacccttt ccagggagga tccactccca 1200acgtctctcc ataatgtctc
tgttggccca tgtctctgtc gttgacaccg taaccacacc 1260aaccaacccg tccattgtac
tgggatggtc gtccatagac acctctccaa cggggaacac 1320ctcattcgta aaccgccaag
gttaccgttc ctcctgactc gccccgttgt tgatgctgcg 1380cacctgtggt tgcccaacat
ggttgtatat cgtgtaacca caccaacaca tgtgcagcac 1440atgtgtttaa aagagtgtca
tggaggtgga tcatgatgga agtggacttt accacttggg 1500aactgtctcc actcccggga
agaaaagacc cggcgtatca cgcggttgcc tcaatggggc 1560aatttggaag gagaaatata
gggaaaatca cgtcgctctc ggacggggaa gagttccaga 1620ctatgagggg ggggggtggt
atataaagac aggagatgtc cacccccaga gagaggaaga 1680agttggaact ttagaagaga
gagataactt tccccagtgt ccatcaatac acaaccaaac 1740acaaactcta tatttacaca
tataaccccc tctctagaat ggttaaagtt acagtttgtg 1800gtgctgctgg tggtattggt
caaccccttt ctttactctt gaagcaatcc tctcacatta 1860ctcacttatc tctttatgat
atcgttaata ctcctggtgt tgctgctgat cttagtcata 1920tcgataccaa atccaaggtc
actggtcatg taggtgctgc tcaacttgaa gaagctatca 1980aggattctga tgttgtcgtt
attcccgctg gtgtcccaag aaagccaggt atgacgcgtg 2040atgatctttt caagattaat
gctggtattg tacgtgattt ggctacagct gctgcaaagt 2100acgctccaaa ggccttcatg
tgtatcattt ctaacccagt caactcgact gtcccaatcg 2160ttactgaagt attcaaacag
cacaatgttt atgaccccaa aagaatcttt ggtgtaacaa 2220cacttgatat tgttcgtgca
tccacctttg tatccgaatt gattggaggt gaacctaatt 2280cacttcgtgt tcccgtcatt
ggtggtcaca gcggcgtaac catcttacct ttactctcac 2340aggtccccgg cattgaaaag
ttaaaccaag aacaaattga gaaggtaact catcgtattc 2400aatttggtgg cgatgaagtt
gtcaaggcca aggatggtgc tggttctgcc actctttcca 2460tggcttatgc tggtgctcgt
tttgctacaa acatcattga ggctgctttt gctggaaaga 2520agggcattgt tgaatgtacc
tatgttcaat tggatgctga taaatctggt gcccaatctg 2580tcaaggattt ggttggtagt
gaacttgaat atttctctgt tcccgttgaa ttgggtccta 2640gtggtgttga aaagatttta
cccattggaa acgttaatga atatgaaaag aagttgttga 2700acgaggcttc tcctgaatta
aaaaccaaca ttgataaagg ttgtactttt gttactgaag 2760gctcaaagtt gtaattaatt
aatttatttt actagtttat ttttgctcct gagaatagga 2820ttacaaacac ttaaagtctt
taattacaac tatatataat attctgttgg ttttcttgaa 2880ttggttcgct gcgattcatg
cctcccattc accaaaggtg gagtgggaaa taacggtttt 2940actgcggtaa ttagcagagg
caagaacagg atacactttt tgatgataaa tctgtattat 3000agtcgagcct atttaggaaa
tcaaattttc ttgtgtttac ttttcaaata aataatgttc 3060gaaaattttt actttactcc
ttcatttaac tataccagac gttatatcat caacaccttc 3120tgaccatata cagctcaaga
tgtttaagag tctgttaaat tttttcaatc catttcatgg 3180agtaccagga ggtgctacaa
aaggaattca tagcctcatg aaatcagcca tttgcttttg 3240ttcaacgatc ttttgaaatt
gttgttgttc ttggtagtta agttgatcca tcttggctta 3300tgttgtgtgt atgttgtagt
tattcttagt atattcctgt cctgagttta gtgaaacata 3360atatcgcctt gaaatgaaaa
tgctgaaatt cgtcgacata caatttttca aacttttttt 3420ttttcttggt gcacggacat
gtttttaaag gaagtactct ataccagtta ttcttcacaa 3480atttaattgc tggagaatag
atcttcaacg ctttaataaa gtagtttgtt tgtcaaggat 3540ggcgtcatac aaagaaagat
cagaatcaca cacttcccct gttgctagga gacttttctc 3600catcatggag gaaaagaagt
ctaacctttg tgcatcattg gatattactg aaactgaaaa 3660gcttctctct attttggaca
ctattggtcc ttacatctgt ctagttaaaa cacacatcga 3720tattgtttct gattttacgt
atgaaggaac tgtgttgcct ttgaaggagc ttgccaagaa 3780acataatttt atgatttttg
aagatagaaa atttgctgat attggtaaca ctgttaaaaa 3840tcaatataaa tctggtgtct
tccgtattgc cgaatgggct gacatcacta atgcacatgg 3900tgtaacgggt gcaggtattg
tttctggctt gaaggaggca gcccaagaaa caaccagtga 3960acctagaggt ttgctaatgc
ttgctgagtt atcatcaaag ggttctttag catatggtga 4020atatacagaa aaaacagtag
aaattgctaa atctgataaa gagtttgtca ttggttttat 4080tgcgcaacac gatatgggcg
gtagagaaga aggttttgac tccgcgg 4127
SEQ ID NO: 13 (length: 987; type: DNA; organism: Rhizopus delemar)
atggttaaag ttacagtttg tggtgctgct ggtggtattg gtcaacccct
ttctttactc 60ttgaagcaat cctctcacat tactcactta tctctttatg atatcgttaa
tactcctggt 120gttgctgctg atcttagtca tatcgatacc aaatccaagg tcactggtca
tgtaggtgct 180gctcaacttg aagaagctat caaggattct gatgttgtcg ttattcccgc
tggtgtccca 240agaaagccag gtatgacgcg tgatgatctt ttcaagatta atgctggtat
tgtacgtgat 300ttggctacag ctgctgcaaa gtacgctcca aaggccttca tgtgtatcat
ttctaaccca 360gtcaactcga ctgtcccaat cgttactgaa gtattcaaac agcacaatgt
ttatgacccc 420aaaagaatct ttggtgttac aacacttgat attgttcgtg catccacctt
tgtatccgaa 480ttgattggag gtgaacctaa ttcacttcgt gttcccgtca ttggtggtca
cagcggcgta 540accatcttac ctttactctc acaggtcccc ggcattgaaa agttaaacca
agaacaaatt 600gagaaggtaa ctcatcgtat tcaatttggt ggcgatgaag ttgtcaaggc
caaggatggt 660gctggttctg ccactctttc catggcttat gctggtgctc gttttgctac
aaacatcatt 720gaggctgctt ttgctggaaa gaagggcatt gttgaatgta cctatgttca
attggatgct 780gataaatctg gtgcccaatc tgtcaaggat ttggttggta gtgaacttga
atatttctct 840gttcccgttg aattgggtcc tagtggtgtt gaaaagattt tacccattgg
aaacgttaat 900gaatatgaaa agaagttgtt gaacgaggct tctcctgaat taaaaaccaa
cattgataaa 960ggttgtactt ttgttactga aggctaa
987
SEQ ID NO: 14 (length: 4112; type: DNA; organism: Artificial Sequence; description: Synthetic - A. succinogenes FUM gene integration fragment)
aattctttga aggagcttgc caagaaacat
aattttatga tttttgaaga tagaaaattt 60gctgatattg gtaacactgt taaaaatcaa
tataaatctg gtgtcttccg tattgccgaa 120tgggctgaca tcactaatgc acatggtgta
acgggtgcag gtattgtttc tggcttgaag 180gaggcagccc aagaaacaac cagtgaacct
agaggtttgc taatgcttgc tgagttatca 240tcaaagggtt ctttagcata tggtgaatat
acagaaaaaa cagtagaaat tgctaaatct 300gataaagagt ttgtcattgg ttttattgcg
caacacgata tgggcggtag agaagaaggt 360tttgactgga tcattatgac tccaggggtt
ggtttagatg acaaaggtga tgcacttggt 420caacaatata gaactgttga tgaagttgta
aagactggaa cggatatcat aattgttggt 480agaggtttgt acggtcaagg aagagatcct
atagagcaag ctaaaagata ccaacaagct 540ggttggaatg cttatttaaa cagatttaaa
tgattcttac acaaagattt gatacatgta 600cactagttta aataagcatg aaaagaatta
cacaagcaaa aaaaaaaaaa taaatgaggt 660actttacgtt cacctacaac caaaaaaact
agatagagta aaatcttaag atttagaaaa 720agttgtttaa caaaggcttt agtatgtgaa
tttttaatgt agcaaagcga taactaataa 780acataaacaa aagtatggtt ttctttatca
gtcaaatcat tatcgattga ttgttccgcg 840tatctgcaga tagcctcatg aaatcagcca
tttgcttttg ttcaacgatc ttttgaaatt 900gttgttgttc ttggtagtta agttgatcca
tcttggctta tgttgtgtgt atgttgtagt 960tattcttagt atattcctgt cctgagttta
gtgaaacata atatcgcctt gaaatgaaaa 1020tgctgaaatt cgtcgacata caatttttca
aacttttttt ttttcttggt gcacggacat 1080gtttttaaag gaagtactct ataccagtta
ttcttcacaa atttaattgc tggagaatag 1140atcttcaacg cgtttcctcg acatttgctg
caacggcaac atcaatgtcc acgtttacac 1200acctacattt atatctatat ttatatttat
atttatttat ttatgctact tagcttctat 1260agttagttaa tgcactcacg atattcaaaa
ttgacaccct tcaactactc cctactattg 1320tctactactg tctactactc ctctttacta
tagctgctcc caataggctc caccaatagg 1380ctctgtcaat acattttgcg ccgccacctt
tcaggttgtg tcactcctga aggaccatat 1440tgggtaatcg tgcaatttct ggaagagagt
ccgcgagaag tgaggccccc actgtaaatc 1500ctcgaggggg catggagtat ggggcatgga
ggatggagga tggggggggg gggggggaaa 1560ataggtagcg aaaggacccg ctatcacccc
acccggagaa ctcgttgccg ggaagtcata 1620tttcgacact ccggggagtc tataaaaggc
gggttttgtc ttttgccagt tgatgttgct 1680gagaggactt gtttgccgtt tcttccgatt
taacagtata gaatcaacca ctgttaatta 1740tacacgttat actaacacaa caaaaacaaa
aacaacgaca acaacaacaa catctagaat 1800gatcattatg actttccgta ttgagaagga
tactatgggt gaagttcaag tcccagctga 1860taagtattgg gctgcccaga ccgaaagatc
tagaaacaac ttcaagattg gtccagctgc 1920ttctatgcca catgaaatca ttgaagcttt
tggttacttg aaaaaggcag ctgcatacgc 1980taacgctgac ttgggtgttt tgccagctga
aaagagagat ttgattgccc aagcttgtga 2040cgaaatctta gccagaaagc ttgacgatca
gttcccattg gttatctggc aaacaggttc 2100tggtacccaa tccaatatga acttgaatga
ggttatcgct aatagagcac atgttatcaa 2160tggtggcaag ttgggtgaaa agtctatcat
tcaccctaat gacgatgtca acaaatccca 2220atcttctaat gacacttatc caacagcaat
gcatattgcc acttacaaaa aggttgttga 2280agctaccatc cctgcaattg aaagattaca
aaagacctta gcagctaagt cagaagagtt 2340taaggatgtt gtcaaaatcg gtaggactca
tcttatggat gccaccccat taaccttggg 2400tcaagagttc tctggttatg ctgcacaatt
gtccttcggt ttagcagcaa tcaaaaacac 2460cttgcctcat ttgagacaat tagcattagg
tggtactgca gtcggtactg gtcttaacac 2520tccaaaaggt tatgatgtta aagttgcaga
atacattgcc aagtttactg gtttaccatt 2580catcactgct gaaaacaagt tcgaggcctt
agcaactcac gatgctattg tcgaaaccca 2640cggtgcctta aagcaggttg caatgtcact
tttcaagatc gcaaacgaca ttagattgtt 2700ggcatcaggt ccaagatctg gcattggcga
gatccttatc cctgaaaacg aaccaggttc 2760atccattatg ccaggcaagg ttaaccctac
tcaatgtgaa gcaatgacaa tggttgcagc 2820acaagtctta ggtaatgata caacaatctc
cttcgctggc tctcaaggtc acttcgaatt 2880gaatgtcttt aagccagtta tggctgctaa
ctttttgcaa tctgctcaac ttattgctga 2940tgtttgcatt tcctttgacg aacactgtgc
ttccggtatc aagcctaata ccccacgtat 3000tcaacatttg ttagaatcct ccttaatgtt
agtcaccgca ttgaacaccc acattggtta 3060cgaaaatgca gctaagattg ctaagaccgc
tcacaaaaac ggtactacat taagagaaga 3120ggccattaac ttaggtttag tttctgctga
agattttgat aagtgggtta gaccagaaga 3180tatggttggt tccttgaagt aattaattaa
catctgaatg taaaatgaac attaaaatga 3240attactaaac tttacgtcta ctttacaatc
tataaacttt gtttaatcat ataacgaaat 3300acactaatac acaatcctgt acgtatgtaa
tacttttatc catcaaggat tgagaaaaaa 3360aagtaatgat tccctgggcc attaaaactt
agacccccaa gcttggatag gtcactctct 3420attttcgttt ctcccttccc tgatagaagg
gtgatatgta attaagaata atatataatt 3480ttataataaa agcggccgca ccaggggttt
agtgaagtca ccaattaaga ttgttggttt 3540gagtgagttg ccaaagatct atgaattgat
ggagcaaggt aagattttag gcagatatgt 3600tgttgacact tcgaaatgat gggctgactt
gggtgtactg gtgtgacgtt tttatgtgta 3660tattgatatg catgggggat gtatagtgat
gaggagtaga gtatataacg aaatgaaatg 3720aaataatatg atatgataag ataagatgag
atcaatacga taatataaga tgcgacatga 3780ggagttcaat gtagcatact acacgatgct
gcagtacaac tctgatacgc tagactatac 3840tatacaaaac tgtagtacac tatacgttag
tgtagtatcc agaaacaaca ctgctttata 3900gtacaataca actctataat actatagtat
actatgccaa accacgtaat accataatat 3960gctccacgac atggtacaat gtgctatact
tcatactatt ataccatata tactccgata 4020tattattgat atactatttt atactataat
accataccac acaacactac attacaacga 4080gcaaccttac cataaatgtc agttatggcc
gc 4112
SEQ ID NO: 15 (length: 1404; type: DNA; organism: Actinobacillus succinogenes)
atgatcatta tgactttccg tattgagaag gatactatgg gtgaagttca
agtcccagct 60gataagtatt gggctgccca gaccgaaaga tctagaaaca acttcaagat
tggtccagct 120gcttctatgc cacatgaaat cattgaagct tttggttact tgaaaaaggc
agctgcatac 180gctaacgctg acttgggtgt tttgccagct gaaaagagag atttgattgc
ccaagcttgt 240gacgaaatct tagccagaaa gcttgacgat cagttcccat tggttatctg
gcaaacaggt 300tctggtaccc aatccaatat gaacttgaat gaggttatcg ctaatagagc
acatgttatc 360aatggtggca agttgggtga aaagtctatc attcacccta atgacgatgt
caacaaatcc 420caatcttcta atgacactta tccaacagca atgcatattg ccacttacaa
aaaggttgtt 480gaagctacca tccctgcaat tgaaagatta caaaagacct tagcagctaa
gtcagaagag 540tttaaggatg ttgtcaaaat cggtaggact catcttatgg atgccacccc
attaaccttg 600ggtcaagagt tctctggtta tgctgcacaa ttgtccttcg gtttagcagc
aatcaaaaac 660accttgcctc atttgagaca attagcatta ggtggtactg cagtcggtac
tggtcttaac 720actccaaaag gttatgatgt taaagttgca gaatacattg ccaagtttac
tggtttacca 780ttcatcactg ctgaaaacaa gttcgaggcc ttagcaactc acgatgctat
tgtcgaaacc 840cacggtgcct taaagcaggt tgcaatgtca cttttcaaga tcgcaaacga
cattagattg 900ttggcatcag gtccaagatc tggcattggc gagatcctta tccctgaaaa
cgaaccaggt 960tcatccatta tgccaggcaa ggttaaccct actcaatgtg aagcaatgac
aatggttgca 1020gcacaagtct taggtaatga tacaacaatc tccttcgctg gctctcaagg
tcacttcgaa 1080ttgaatgtct ttaagccagt tatggctgct aactttttgc aatctgctca
acttattgct 1140gatgtttgca tttcctttga cgaacactgt gcttccggta tcaagcctaa
taccccacgt 1200attcaacatt tgttagaatc ctccttaatg ttagtcaccg cattgaacac
ccacattggt 1260tacgaaaatg cagctaagat tgctaagacc gctcacaaaa acggtactac
attaagagaa 1320gaggccatta acttaggttt agtttctgct gaagattttg ataagtgggt
tagaccagaa 1380gatatggttg gttccttgaa gtaa
1404
SEQ ID NO: 16 (length: 3983; type: DNA; organism: Artificial Sequence; description: Synthetic - R. delemar MDH gene integration fragment)
gggccccata actgacattt atggtaaggt
tgctcgttgt aatgtagtgt tgtgtggtat 60ggtattatag tataaaatag tatatcaata
atatatcgga gtatatatgg tataatagta 120tgaagtatag cacattgtac catgtcgtgg
agcatattat ggtattacgt ggtttggcat 180agtatactat agtattatag agttgtattg
tactataaag cagtgttgtt tctggatact 240acactaacgt atagtgtact acagttttgt
atagtatagt ctagcgtatc agagttgtac 300tgcagcatcg tgtagtatgc tacattgaac
tcctcatgtc gcatcttata ttatcgtatt 360gatctcatct tatcttatca tatcatatta
tttcatttca tttcgttata tactctactc 420ctcatcacta tacatccccc atgcatatca
atatacacat aaaaacgtca caccagtaca 480cccaagtcag cccatcattt cgaagtgtca
acaacatatc tgcctaaaat cttaccttgc 540tccatcaatt catagatctt tggcaactca
ctcaaaccaa caatcttaat tggtgacttc 600actaaacccc tggtgcggcc gcggatccct
cgagattggt agttctttcc ccctctcaag 660ctggcgtgaa atgcaacctt acggcgtcta
cgttactaca aggtccagaa agtgtaggta 720ttgctactat ttttattttt tattggttct
ggagaaatgc agacagtcaa tgaacacaac 780tgtctcaata tgcatctatg cacatgcaca
cacacacaca tcacaggtac ccctacaaag 840agaggtctct tgataatgtt tcattaccac
gtggcatccc cccccccccc cccaataaac 900aagtggccga gttcccctgt tgcagaggag
gacaaaaaaa ccgctggtgt tggtaccatt 960atgcagcaac tagcacaaca aacaaccgac
ccagacatac aaatcaacaa cacttcgcca 1020aagacaccct ttccagggag gatccactcc
caacgtctct ccataatgtc tctgttggcc 1080catgtctctg tcgttgacac cgtaaccaca
ccaaccaacc cgtccattgt actgggatgg 1140tcgtccatag acacctctcc aacggggaac
acctcattcg taaaccgcca aggttaccgt 1200tcctcctgac tcgccccgtt gttgatgctg
cgcacctgtg gttgcccaac atggttgtat 1260atcgtgtaac cacaccaaca catgtgcagc
acatgtgttt aaaagagtgt catggaggtg 1320gatcatgatg gaagtggact ttaccacttg
ggaactgtct ccactcccgg gaagaaaaga 1380cccggcgtat cacgcggttg cctcaatggg
gcaatttgga aggagaaata tagggaaaat 1440cacgtcgctc tcggacgggg aagagttcca
gactatgagg ggggggggtg gtatataaag 1500acaggagatg tccaccccca gagagaggaa
gaagttggaa ctttagaaga gagagataac 1560tttccccagt gtccatcaat acacaaccaa
acacaaactc tatatttaca catataaccc 1620cctctctaga taaaatggtt aaagttacag
tttgtggtgc tgctggtggt attggtcaac 1680ccctttcttt actcttgaag caatcctctc
acattactca cttatctctt tatgatatcg 1740ttaatactcc tggtgttgct gctgatctta
gtcatatcga taccaaatcc aaggtcactg 1800gtcatgtagg tgctgctcaa cttgaagaag
ctatcaagga ttctgatgtt gtcgttattc 1860ccgctggtgt cccaagaaag ccaggtatga
cgcgtgatga tcttttcaag attaatgctg 1920gtattgtacg tgatttggct acagctgctg
caaagtacgc tccaaaggcc ttcatgtgta 1980tcatttctaa cccagtcaac tcgactgtcc
caatcgttac tgaagtattc aaacagcaca 2040atgtttatga ccccaaaaga atctttggtg
taacaacact tgatattgtt cgtgcatcca 2100cctttgtatc cgaattgatt ggaggtgaac
ctaattcact tcgtgttccc gtcattggtg 2160gtcacagcgg cgtaaccatc ttacctttac
tctcacaggt ccccggcatt gaaaagttaa 2220accaagaaca aattgagaag gtaactcatc
gtattcaatt tggtggcgat gaagttgtca 2280aggccaagga tggtgctggt tctgccactc
tttccatggc ttatgctggt gctcgttttg 2340ctacaaacat cattgaggct gcttttgctg
gaaagaaggg cattgttgaa tgtacctatg 2400ttcaattgga tgctgataaa tctggtgccc
aatctgtcaa ggatttggtt ggtagtgaac 2460ttgaatattt ctctgttccc gttgaattgg
gtcctagtgg tgttgaaaag attttaccca 2520ttggaaacgt taatgaatat gaaaagaagt
tgttgaacga ggcttctcct gaattaaaaa 2580ccaacattga taaaggttgt acttttgtta
ctgaaggctc aaagttgtaa ttaattaatt 2640tattttacta gtttattttt gctcctgaga
ataggattac aaacacttaa agtctttaat 2700tacaactata tataatattc tgttggtttt
cttgaattgg ttcgctgcga ttcatgcctc 2760ccattcacca aaggtggagt gggaaataac
ggttttactg cggtaattag cagaggcaag 2820aacaggatac actttttgat gataaatctg
tattatagtc gagcctattt aggaaatcaa 2880attttcttgt gtttactttt caaataaata
atgttcgaaa atttttactt tactccttca 2940tttaactata ccagacgtta tatcatcaac
accttctgac catatacagc tcaagatgtt 3000taagagtctg ttaaattttt tcaatccatt
tcatggagta ccaggaggtg ctacaaaagg 3060aattcatagc ctcatgaaat cagccatttg
cttttgttca acgatctttt gaaattgttg 3120ttgttcttgg tagttaagtt gatccatctt
ggcttatgtt gtgtgtatgt tgtagttatt 3180cttagtatat tcctgtcctg agtttagtga
aacataatat cgccttgaaa tgaaaatgct 3240gaaattcgtc gacatacaat ttttcaaact
tttttttttt cttggtgcac ggacatgttt 3300ttaaaggaag tactctatac cagttattct
tcacaaattt aattgctgga gaatagatct 3360tcaacgcttt aataaagtag tttgtttgtc
aaggatggcg tcatacaaag aaagatcaga 3420atcacacact tcccctgttg ctaggagact
tttctccatc atggaggaaa agaagtctaa 3480cctttgtgca tcattggata ttactgaaac
tgaaaagctt ctctctattt tggacactat 3540tggtccttac atctgtctag ttaaaacaca
catcgatatt gtttctgatt ttacgtatga 3600aggaactgtg ttgcctttga aggagcttgc
caagaaacat aattttatga tttttgaaga 3660tagaaaattt gctgatattg gtaacactgt
taaaaatcaa tataaatctg gtgtcttccg 3720tattgccgaa tgggctgaca tcactaatgc
acatggtgta acgggtgcag gtattgtttc 3780tggcttgaag gaggcagccc aagaaacaac
cagtgaacct agaggtttgc taatgcttgc 3840tgagttatca tcaaagggtt ctttagcata
tggtgaatat acagaaaaaa cagtagaaat 3900tgctaaatct gataaagagt ttgtcattgg
ttttattgcg caacacgata tgggcggtag 3960agaagaaggt tttgactccg cgg
3983
SEQ ID NO: 17, length 4060, DNA, Artificial Sequence, Synthetic - A. succinogenes FUM gene integration fragment
aattctttga aggagcttgc caagaaacat aattttatga tttttgaaga tagaaaattt
60gctgatattg gtaacactgt taaaaatcaa tataaatctg gtgtcttccg tattgccgaa
120tgggctgaca tcactaatgc acatggtgta acgggtgcag gtattgtttc tggcttgaag
180gaggcagccc aagaaacaac cagtgaacct agaggtttgc taatgcttgc tgagttatca
240tcaaagggtt ctttagcata tggtgaatat acagaaaaaa cagtagaaat tgctaaatct
300gataaagagt ttgtcattgg ttttattgcg caacacgata tgggcggtag agaagaaggt
360tttgactgga tcattatgac tccaggggtt ggtttagatg acaaaggtga tgcacttggt
420caacaatata gaactgttga tgaagttgta aagactggaa cggatatcat aattgttggt
480agaggtttgt acggtcaagg aagagatcct atagagcaag ctaaaagata ccaacaagct
540ggttggaatg cttatttaaa cagatttaaa tgattcttac acaaagattt gatacatgta
600cactagttta aataagcatg aaaagaatta cacaagcaaa aaaaaaaaaa taaatgaggt
660actttacgtt cacctacaac caaaaaaact agatagagta aaatcttaag atttagaaaa
720agttgtttaa caaaggcttt agtatgtgaa tttttaatgt agcaaagcga taactaataa
780acataaacaa aagtatggtt ttctttatca gtcaaatcat tatcgattga ttgttccgcg
840tatctgcaga tagcctcatg aaatcagcca tttgcttttg ttcaacgatc ttttgaaatt
900gttgttgttc ttggtagtta agttgatcca tcttggctta tgttgtgtgt atgttgtagt
960tattcttagt atattcctgt cctgagttta gtgaaacata atatcgcctt gaaatgaaaa
1020tgctgaaatt cgtcgacata caatttttca aacttttttt ttttcttggt gcacggacat
1080gtttttaaag gaagtactct ataccagtta ttcttcacaa atttaattgc tggagaatag
1140atcttcaacg cgtttcctcg acatttgctg caacggcaac atcaatgtcc acgtttacac
1200acctacattt atatctatat ttatatttat atttatttat ttatgctact tagcttctat
1260agttagttaa tgcactcacg atattcaaaa ttgacaccct tcaactactc cctactattg
1320tctactactg tctactactc ctctttacta tagctgctcc caataggctc caccaatagg
1380ctctgtcaat acattttgcg ccgccacctt tcaggttgtg tcactcctga aggaccatat
1440tgggtaatcg tgcaatttct ggaagagagt ccgcgagaag tgaggccccc actgtaaatc
1500ctcgaggggg catggagtat ggggcatgga ggatggagga tggggggggg gggggggaaa
1560ataggtagcg aaaggacccg ctatcacccc acccggagaa ctcgttgccg ggaagtcata
1620tttcgacact ccggggagtc tataaaaggc gggttttgtc ttttgccagt tgatgttgct
1680gagaggactt gtttgccgtt tcttccgatt taacagtata gaatcaacca ctgttaatta
1740tacacgttat actaacacaa caaaaacaaa aacaacgaca acaacaacaa catctagata
1800aaatgatcat tatgactttc cgtattgaga aggatactat gggtgaagtt caagtcccag
1860ctgataagta ttgggctgcc cagaccgaaa gatctagaaa caacttcaag attggtccag
1920ctgcttctat gccacatgaa atcattgaag cttttggtta cttgaaaaag gcagctgcat
1980acgctaacgc tgacttgggt gttttgccag ctgaaaagag agatttgatt gcccaagctt
2040gtgacgaaat cttagccaga aagcttgacg atcagttccc attggttatc tggcaaacag
2100gttctggtac ccaatccaat atgaacttga atgaggttat cgctaataga gcacatgtta
2160tcaatggtgg caagttgggt gaaaagtcta tcattcaccc taatgacgat gtcaacaaat
2220cccaatcttc taatgacact tatccaacag caatgcatat tgccacttac aaaaaggttg
2280ttgaagctac catccctgca attgaaagat tacaaaagac cttagcagct aagtcagaag
2340agtttaagga tgttgtcaaa atcggtagga ctcatcttat ggatgccacc ccattaacct
2400tgggtcaaga gttctctggt tatgctgcac aattgtcctt cggtttagca gcaatcaaaa
2460acaccttgcc tcatttgaga caattagcat taggtggtac tgcagtcggt actggtctta
2520acactccaaa aggttatgat gttaaagttg cagaatacat tgccaagttt actggtttac
2580cattcatcac tgctgaaaac aagttcgagg ccttagcaac tcacgatgct attgtcgaaa
2640cccacggtgc cttaaagcag gttgcaatgt cacttttcaa gatcgcaaac gacattagat
2700tgttggcatc aggtccaaga tctggcattg gcgagatcct tatccctgaa aacgaaccag
2760gttcatccat tatgccaggc aaggttaacc ctactcaatg tgaagcaatg acaatggttg
2820cagcacaagt cttaggtaat gatacaacaa tctccttcgc tggctctcaa ggtcacttcg
2880aattgaatgt ctttaagcca gttatggctg ctaacttttt gcaatctgct caacttattg
2940ctgatgtttg catttccttt gacgaacact gtgcttccgg tatcaagcct aataccccac
3000gtattcaaca tttgttagaa tcctccttaa tgttagtcac cgcattgaac acccacattg
3060gttacgaaaa tgcagctaag attgctaaga ccgctcacaa aaacggtact acattaagag
3120aagaggccat taacttaggt ttagtttctg ctgaagattt tgataagtgg gttagaccag
3180aagatatggt tggttccttg aagtaattaa ttaacatctg aatgtaaaat gaacattaaa
3240atgaattact aaactttacg tctactttac aatctataaa ctttgtttaa tcatataacg
3300aaatacacta atacacaatc ctgtacgtat gtaatacttt tatccatcaa ggattgagaa
3360aaaaaagtaa tgattccctg ggccattaaa acttagaccc ccaagcttgg ataggtcact
3420ctctattttc gtttctccct tccctgatag aagggtgata tgtaattaag aataatatat
3480aattttataa taaaagcggc cgcctccctt ctctaaatgg actgcttgga taacttggac
3540ccccttccca ttttatagtc attctcttcc ccctcatttt cccactattc ccaacaatga
3600ccatctctcc accttgtttc cccattcttc ctgctctacc tggtgggggt gtttcacccc
3660attaacggtc ggattccgct gtggagatgg ctctggcctt tttcccattc cttccccccc
3720tcaatcttct ccatgcgggg aaaaaaaaat tttatccata aacaaccaaa ccggcggctc
3780aacggggggt ttatactgac agaaatgggg tcaatacacc cactgactgt acccgctcta
3840atcttaagct ttcccccccc cctcctgtat taacggcgcg gagtgcccgc agcgcccaat
3900ggagaaggcg cgcagtgggg gatgcccagg gaggggacag gtacacgcac aggccatgcc
3960aacaccgcat agacgtgcga cctcctctcc cccactgcag agctgccctt ttcggacaca
4020ctccgtgcaa gaggactcgg ccggctcggc ttttctgccg
4060
SEQ ID NO: 18, length 3835, DNA, Artificial Sequence, Synthetic - 5' integration fragment
aactactatg tacactgtat aagtaaaaag acgatacccc cctcccactc tgggtgctac
60ggtgtagatc tctccgtaaa cacaaaaagg cggctcagat gataattggg gtccgggcgc
120aaccggaagg ggggagagag gggagcgatg gcttctcctc cggggggcta cgggagtttc
180ctctttggga aggataaaga ggggatggat tgatacaaga ttctgagaac ctattacgat
240gatgttcagt ggtattttgt cttttgttat ttaaagggag gggactttcc tcaatacctt
300agttgtaaaa ttacgctatt atctttaacc ctttcttttg agcaataatt aaaaagagcg
360gccgcgagtc catcggttcc tgtcagatgg gatactcttg acgtggaaaa ttcaaacaga
420aaaaaaaccc caataatgaa aaataacact acgttatatc cgtggtatcc tctatcgtat
480cgtatcgtag cgtatcgtag cgtaccgtat cacagtatag tctaatattc cgtatcttat
540tgtatcctat cctattcgat cctattgtat ttcagtgcac cattttaatt tctattgcta
600taatgtcctt attagttgcc actgtgaggt gaccaatgga cgagggcgag ccgttcagaa
660gccgcgaagg gtgttcttcc catgaatttc ttaaggaggg cggctcagct ccgagagtga
720ggcgagacgt ctcggtcagc gtatccccct tcctcggctt ttacaaatga tgcgctctta
780atagtgtgtc gttatccttt tggcattgac gggggaggga aattgattga gcgcatccat
840atttttgcgg actgctgagg acaatggtgg tttttccggg tggcgtgggc tacaaatgat
900acgatggttt ttttcttttc ggagaaggcg tataaaaagg acacggagaa cccatttatt
960ctaaaaacag ttgagcttct ttaattattt tttgatataa tattctatta ttatatattt
1020tcttcccaat aaaacaaaat aaaacaaaac acagcaaaac acaaaaagct agcggcgcgc
1080cttctgtctt tgattttctt atgttattca aaacatctgc cccaaaatct aacgattata
1140tatattccta cgtataactg tatagctaat tattgattta tttgtacata aaaaccacat
1200aaatgtaaaa gcaagaaaaa aaataactaa ggagaaggat caatatctca tttataatgc
1260tcgccaaagc agcgtacgtg aattttaatc aagacatcaa caaatcttgc aacttggtta
1320tatcgcttct tcacccactc acccgctttt ctacattgtt gaacacaaat atatacaggg
1380gtatgtctca aggtcaagtg cagtttcaac agagactacc tcaaggtacc tcttcagaaa
1440tgcagaactt cactcttgat cagattttct ccgaattaaa ggaggcctat tggtagttct
1500ttccccctct caagctggcg tgaaatgcaa ccttacggcg tctacgttac tacaaggtcc
1560agaaagtgta ggtattgcta ctatttttat tttttattgg ttctggagaa atgcagacag
1620tcaatgaaca caactgtctc aatatgcatc tatgcacatg cacacacaca cacatcacag
1680gtacccctac aaagagaggt ctcttgataa tgtttcatta ccacgtggca tccccccccc
1740cccccccaat aaacaagtgg ccgagttccc ctgttgcaga ggaggacaaa aaaaccgctg
1800gtgttggtac cattatgcag caactagcac aacaaacaac cgacccagac atacaaatca
1860acaacacttc gccaaagaca ccctttccag ggaggatcca ctcccaacgt ctctccataa
1920tgtctctgtt ggcccatgtc tctgtcgttg acaccgtaac cacaccaacc aacccgtcca
1980ttgtactggg atggtcgtcc atagacacct ctccaacggg gaacacctca ttcgtaaacc
2040gccaaggtta ccgttcctcc tgactcgccc cgttgttgat gctgcgcacc tgtggttgcc
2100caacatggtt gtatatcgtg taaccacacc aacacatgtg cagcacatgt gtttaaaaga
2160gtgtcatgga ggtggatcat gatggaagtg gactttacca cttgggaact gtctccactc
2220ccgggaagaa aagacccggc gtatcacgcg gttgcctcaa tggggcaatt tggaaggaga
2280aatataggga aaatcacgtc gctctcggac ggggaagagt tccagactat gagggggggg
2340ggtggtatat aaagacagga gatgtccacc cccagagaga ggaagaagtt ggaactttag
2400aagagagaga taactttccc cagtgtccat caatacacaa ccaaacacaa actctatatt
2460tacacatata accccctctc tagattaatt aatttatttt actagtttat ttttgctcct
2520gagaatagga ttacaaacac ttaaagtctt taattacaac tatatataat attctgttgg
2580ttttcttgaa ttggttcgct gcgattcatg cctcccattc accaaaggtg gagtgggaaa
2640taacggtttt actgcggtaa ttagcagagg caagaacagg atacactttt tgatgataaa
2700tctgtattat agtcgagcct atttaggaaa tcaaattttc ttgtgtttac ttttcaaata
2760aataatgttc gaaaattttt actttactcc ttcatttaac tataccagac gttatatcat
2820caacaccttc tgaccatata cagctcaaga tgtttaagag tctgttaaat tttttcaatc
2880catttcatgg agtaccagga ggtgctacaa aaggaattca tagcctcatg aaatcagcca
2940tttgcttttg ttcaacgatc ttttgaaatt gttgttgttc ttggtagtta agttgatcca
3000tcttggctta tgttgtgtgt atgttgtagt tattcttagt atattcctgt cctgagttta
3060gtgaaacata atatcgcctt gaaatgaaaa tgctgaaatt cgtcgacata caatttttca
3120aacttttttt ttttcttggt gcacggacat gtttttaaag gaagtactct ataccagtta
3180ttcttcacaa atttaattgc tggagaatag atcttcaacg ctttaataaa gtagtttgtt
3240tgtcaaggat ggcgtcatac aaagaaagat cagaatcaca cacttcccct gttgctagga
3300gacttttctc catcatggag gaaaagaagt ctaacctttg tgcatcattg gatattactg
3360aaactgaaaa gcttctctct attttggaca ctattggtcc ttacatctgt ctagttaaaa
3420cacacatcga tattgtttct gattttacgt atgaaggaac tgtgttgcct ttgaaggagc
3480ttgccaagaa acataatttt atgatttttg aagatagaaa atttgctgat attggtaaca
3540ctgttaaaaa tcaatataaa tctggtgtct tccgtattgc cgaatgggct gacatcacta
3600atgcacatgg tgtaacgggt gcaggtattg tttctggctt gaaggaggca gcccaagaaa
3660caaccagtga acctagaggt ttgctaatgc ttgctgagtt atcatcaaag ggttctttag
3720catatggtga atatacagaa aaaacagtag aaattgctaa atctgataaa gagtttgtca
3780ttggttttat tgcgcaacac gatatgggcg gtagagaaga aggttttgac tccgc
3835
SEQ ID NO: 19, length 3883, DNA, Artificial Sequence, Synthetic - Integration fragment targeted to MAE gene
aattctttga aggagcttgc caagaaacat aattttatga tttttgaaga
tagaaaattt 60gctgatattg gtaacactgt taaaaatcaa tataaatctg gtgtcttccg
tattgccgaa 120tgggctgaca tcactaatgc acatggtgta acgggtgcag gtattgtttc
tggcttgaag 180gaggcagccc aagaaacaac cagtgaacct agaggtttgc taatgcttgc
tgagttatca 240tcaaagggtt ctttagcata tggtgaatat acagaaaaaa cagtagaaat
tgctaaatct 300gataaagagt ttgtcattgg ttttattgcg caacacgata tgggcggtag
agaagaaggt 360tttgactgga tcattatgac tccaggggtt ggtttagatg acaaaggtga
tgcacttggt 420caacaatata gaactgttga tgaagttgta aagactggaa cggatatcat
aattgttggt 480agaggtttgt acggtcaagg aagagatcct atagagcaag ctaaaagata
ccaacaagct 540ggttggaatg cttatttaaa cagatttaaa tgattcttac acaaagattt
gatacatgta 600cactagttta aataagcatg aaaagaatta cacaagcaaa aaaaaaaaaa
taaatgaggt 660actttacgtt cacctacaac caaaaaaact agatagagta aaatcttaag
atttagaaaa 720agttgtttaa caaaggcttt agtatgtgaa tttttaatgt agcaaagcga
taactaataa 780acataaacaa aagtatggtt ttctttatca gtcaaatcat tatcgattga
ttgttccgcg 840tatctgcaga tagcctcatg aaatcagcca tttgcttttg ttcaacgatc
ttttgaaatt 900gttgttgttc ttggtagtta agttgatcca tcttggctta tgttgtgtgt
atgttgtagt 960tattcttagt atattcctgt cctgagttta gtgaaacata atatcgcctt
gaaatgaaaa 1020tgctgaaatt cgtcgacata caatttttca aacttttttt ttttcttggt
gcacggacat 1080gtttttaaag gaagtactct ataccagtta ttcttcacaa atttaattgc
tggagaatag 1140atcttcaacg cgtttaaaca gcaatttgag gaaggaatag gagaaggaga
agcaatttct 1200aggaaagagc aaggtgtgca acagcatgct ctgaatgata ttttcagcaa
tagttcagtt 1260gaagaacctg ttggcgtatc tacatcactt cctacaaaca acaccacgaa
ttgcgtccgt 1320ggtgacgcaa ctacgaatgg cattgtcaat gccaatgcca gtgcacatac
acgtgcaagt 1380cccaccggtt ccctgcccgg ctatggtaga gacaagaagg acgataccgg
catcgacatc 1440aacagtttca acagcaatgc gtttggcgtc gacgcgtcga tggggctgcc
gtatttggat 1500ttggacgggc tagatttcga tatggatatg gatatggata tggatatgga
gatgaatttg 1560aatttagatt tgggtcttga tttggggttg gaattaaaag gggataacaa
tgagggtttt 1620cctgttgatt taaacaatgg acgtgggagg tgattgattt aacctgatcc
aaaaggggta 1680tgtctatttt ttagagtgtg tctttgtgtc aaattatggt agaatgtgta
aagtagtata 1740aactttcctc tcaaatgacg aggtttaaaa caccccccgg gtgagccgag
ccgagaatgg 1800ggcaattgtt caatgtgaaa tagaagtatc gagtgagaaa cttgggtgtt
ggccagccaa 1860gggggaagga aaatggcgcg aatgctcagg tgagattgtt ttggaattgg
gtgaagcgag 1920gaaatgagcg acccggaggt tgtgacttta gtggcggagg aggaacggga
ggaaaaggcc 1980aagagggaaa gtgtatataa gggggagcaa tttgccaacc aggatagaat
tggatgagtt 2040ataattctac tgtatttatt gtataattta tttctccttt tatatcaaac
acattacaaa 2100acacacaaaa catacaaaca tacacagcta gcaaaggcgc gccatctaat
agtttaatca 2160cagcttatag tctactatag ttttcttttt taaacattgt tgtattttgt
cccccccctc 2220taattgatga tgattatcct ataagaatcc aataaaacga tggaaactaa
taccctctcc 2280tttgtcatgt ggtctttagt atttcttgaa cattggctct gatttctcga
ctttatagtc 2340ctattaaaat cgctgttagt tctcgatcgt tgtatctcgt ttcttgtctc
tttggtggat 2400gattttgcgt gcgaacatgt ttttttccct ttctctcacc atcatcgtgt
agttcttgtc 2460accatccccc ccaccccttc cttctctcat tgattctata agagcttatc
cacagaggtg 2520cagtaacgag gtagtttaac cttcgagtgg atcaaaatgt cacacaggcc
tcgacatttg 2580ctgcaacggc aacatcaatg tccacgttta cacacctaca tttatatcta
tatttatatt 2640tatatttatt tatttatgct acttagcttc tatagttagt taatgcactc
acgatattca 2700aaattgacac ccttcaacta ctccctacta ttgtctacta ctgtctacta
ctcctcttta 2760ctatagctgc tcccaatagg ctccaccaat aggctctgtc aatacatttt
gcgccgccac 2820ctttcaggtt gtgtcactcc tgaaggacca tattgggtaa tcgtgcaatt
tctggaagag 2880agtccgcgag aagtgaggcc cccactgtaa atcctcgagg gggcatggag
tatggggcat 2940ggaggatgga ggatgggggg gggggggggg aaaataggta gcgaaaggac
ccgctatcac 3000cccacccgga gaactcgttg ccgggaagtc atatttcgac actccgggga
gtctataaaa 3060ggcgggtttt gtcttttgcc agttgatgtt gctgagagga cttgtttgcc
gtttcttccg 3120atttaacagt atagaatcaa ccactgttaa ttatacacgt tatactaaca
caacaaaaac 3180aaaaacaacg acaacaacaa caacatctag ataattaatt aacatctgaa
tgtaaaatga 3240acattaaaat gaattactaa actttacgtc tactttacaa tctataaact
ttgtttaatc 3300atataacgaa atacactaat acacaatcct gtacgtatgt aatactttta
tccatcaagg 3360attgagaaaa aaaagtaatg attccctggg ccattaaaac ttagaccccc
aagcttggat 3420aggtcactct ctattttcgt ttctcccttc cctgatagaa gggtgatatg
taattaagaa 3480taatatataa ttttataata aaagcggccg cacacataca cattatcaaa
tgcatttatt 3540cctaatatca cactaaaacg tattatataa ttttaatctt tatagacttc
atagcaccaa 3600ttggatttgc tttctttcag aataccgcac ttaatctcaa tgtacgtaac
gtaggcaaaa 3660tctgtcgata aggatctgta tgccgtaaac ggaaactcca agcgcccaga
aaacttacat 3720tatattcttg ccagtttcat ctcaccagcc agtcacagtt taaaaggttt
gattgcgttt 3780cttgtttcgt cggattcagt gctaattggt aacgcactgt accgccacac
caaagcaaaa 3840atgcagaaac aaacaacaat gagtgtatgt ttaccaactt tgg
3883
SEQ ID NO: 20, length 5722, DNA, Artificial Sequence, Synthetic - E. coli SthA gene integration fragment
aactactatg tacactgtat aagtaaaaag acgatacccc
cctcccactc tgggtgctac 60ggtgtagatc tctccgtaaa cacaaaaagg cggctcagat
gataattggg gtccgggcgc 120aaccggaagg ggggagagag gggagcgatg gcttctcctc
cggggggcta cgggagtttc 180ctctttggga aggataaaga ggggatggat tgatacaaga
ttctgagaac ctattacgat 240gatgttcagt ggtattttgt cttttgttat ttaaagggag
gggactttcc tcaatacctt 300agttgtaaaa ttacgctatt atctttaacc ctttcttttg
agcaataatt aaaaagagcg 360gccgcgagtc catcggttcc tgtcagatgg gatactcttg
acgtggaaaa ttcaaacaga 420aaaaaaaccc caataatgaa aaataacact acgttatatc
cgtggtatcc tctatcgtat 480cgtatcgtag cgtatcgtag cgtaccgtat cacagtatag
tctaatattc cgtatcttat 540tgtatcctat cctattcgat cctattgtat ttcagtgcac
cattttaatt tctattgcta 600taatgtcctt attagttgcc actgtgaggt gaccaatgga
cgagggcgag ccgttcagaa 660gccgcgaagg gtgttcttcc catgaatttc ttaaggaggg
cggctcagct ccgagagtga 720ggcgagacgt ctcggtcagc gtatccccct tcctcggctt
ttacaaatga tgcgctctta 780atagtgtgtc gttatccttt tggcattgac gggggaggga
aattgattga gcgcatccat 840atttttgcgg actgctgagg acaatggtgg tttttccggg
tggcgtgggc tacaaatgat 900acgatggttt ttttcttttc ggagaaggcg tataaaaagg
acacggagaa cccatttatt 960ctaaaaacag ttgagcttct ttaattattt tttgatataa
tattctatta ttatatattt 1020tcttcccaat aaaacaaaat aaaacaaaac acagcaaaac
acaaaaagct agcctgaaag 1080ggaaccataa tgggtaagat cgcaccacat tagcgggctc
gaagatggat cttgcgaatg 1140ggtgacacca gtcataaggc ctcgttgtcc cagcatacct
cccgcgctat ctaattgctt 1200cgctctccat tgttcttggt aaacatcact ctggcttgat
ggtgtcatct atgcccgcca 1260agcctatcgg tctatggccc ggagtttgct ccgtcttcca
attgcaatcg cacggaatcc 1320gggatagaaa gaacgatacg cattcatacg attctcacgt
tattggttgg tgaatcaaat 1380gcacaacgaa cccaatcgcc ctggactcag cgtctaggcc
ccccgtatgg ccgacgggga 1440ctcagagcgt caatccacgt tgaagtcgag gttttggcag
ttacagccct tgcaataagg 1500tttttcggac agtctacttt gtcggcgcgc cttctgtctt
tgattttctt atgttattca 1560aaacatctgc cccaaaatct aacgattata tatattccta
cgtataactg tatagctaat 1620tattgattta tttgtacata aaaaccacat aaatgtaaaa
gcaagaaaaa aaataactaa 1680ggagaaggat caatatctca tttataatgc tcgccaaagc
agcgtacgtg aattttaatc 1740aagacatcaa caaatcttgc aacttggtta tatcgcttct
tcacccactc acccgctttt 1800ctacattgtt gaacacaaat atatacaggg gtatgtctca
aggtcaagtg cagtttcaac 1860agagactacc tcaaggtacc tcttcagaaa tgcagaactt
cactcttgat cagattttct 1920ccgaattaaa ggaggcctat tggtagttct ttccccctct
caagctggcg tgaaatgcaa 1980ccttacggcg tctacgttac tacaaggtcc agaaagtgta
ggtattgcta ctatttttat 2040tttttattgg ttctggagaa atgcagacag tcaatgaaca
caactgtctc aatatgcatc 2100tatgcacatg cacacacaca cacatcacag gtacccctac
aaagagaggt ctcttgataa 2160tgtttcatta ccacgtggca tccccccccc cccccccaat
aaacaagtgg ccgagttccc 2220ctgttgcaga ggaggacaaa aaaaccgctg gtgttggtac
cattatgcag caactagcac 2280aacaaacaac cgacccagac atacaaatca acaacacttc
gccaaagaca ccctttccag 2340ggaggatcca ctcccaacgt ctctccataa tgtctctgtt
ggcccatgtc tctgtcgttg 2400acaccgtaac cacaccaacc aacccgtcca ttgtactggg
atggtcgtcc atagacacct 2460ctccaacggg gaacacctca ttcgtaaacc gccaaggtta
ccgttcctcc tgactcgccc 2520cgttgttgat gctgcgcacc tgtggttgcc caacatggtt
gtatatcgtg taaccacacc 2580aacacatgtg cagcacatgt gtttaaaaga gtgtcatgga
ggtggatcat gatggaagtg 2640gactttacca cttgggaact gtctccactc ccgggaagaa
aagacccggc gtatcacgcg 2700gttgcctcaa tggggcaatt tggaaggaga aatataggga
aaatcacgtc gctctcggac 2760ggggaagagt tccagactat gagggggggg ggtggtatat
aaagacagga gatgtccacc 2820cccagagaga ggaagaagtt ggaactttag aagagagaga
taactttccc cagtgtccat 2880caatacacaa ccaaacacaa actctatatt tacacatata
accccctctc tagaatgcca 2940cattcctacg attacgatgc catagtaata ggttccggcc
ccggcggcga aggcgctgca 3000atgggcctgg ttaagcaagg tgcgcgcgtc gcagttatcg
agcgttatca aaatgttggc 3060ggcggttgca cccactgggg caccatcccg tcgaaagctc
tccgtcacgc cgtcagccgc 3120attatagaat tcaatcaaaa cccactttac agcgaccatt
cccgactgct ccgctcttct 3180tttgccgata tccttaacca tgccgataac gtgattaatc
aacaaacgcg catgcgtcag 3240ggattttacg aacgtaatca ctgtgaaata ttgcagggaa
acgctcgctt tgttgacgag 3300catacgttgg cgctggattg cccggacggc agcgttgaaa
cactaaccgc tgaaaaattt 3360gttattgcct gcggctctcg tccatatcat ccaacagatg
ttgatttcac ccatccacgc 3420atttacgaca gcgactcaat tctcagcatg caccacgaac
cgcgccatgt acttatctat 3480ggtgctggag tgatcggctg tgaatatgcg tcgatcttcc
gcggtatgga tgtaaaagtg 3540gatctgatca acacccgcga tcgcctgctg gcatttctcg
atcaagagat gtcagattct 3600ctctcctatc acttctggaa cagtggcgta gtgattcgtc
acaacgaaga gtacgagaag 3660atcgaaggct gtgacgatgg tgtgatcatg catctgaagt
cgggtaaaaa actgaaagct 3720gactgcctgc tctatgccaa cggtcgcacc ggtaataccg
attcgctggc gttacagaac 3780attgggctag aaactgacag ccgcggacag ctgaaggtca
acagcatgta tcagaccgca 3840cagccacacg tttacgcggt gggcgacgtg attggttatc
cgagcctggc gtcggcggcc 3900tatgaccagg ggcgcattgc cgcgcaggcg ctggtaaaag
gcgaagccac cgcacatctg 3960attgaagata tccctaccgg tatttacacc atcccggaaa
tcagctctgt gggcaaaacc 4020gaacagcagc tgaccgcaat gaaagtgcca tatgaagtgg
gccgcgccca gtttaaacat 4080ctggcacgcg cacaaatcgt cggcatgaac gtgggcacgc
tgaaaatttt gttccatcgg 4140gaaacaaaag agattctggg tattcactgc tttggcgagc
gcgctgccga aattattcat 4200atcggtcagg cgattatgga acagaaaggt ggcggcaaca
ctattgagta cttcgtcaac 4260accaccttta actacccgac gatggcggaa gcctatcggg
tagctgcgtt aaacggttta 4320aaccgcctgt tttaaaactt tatcgaaatg gccatccatt
cttgcgcgga tttaattaat 4380ttattttact agtttatttt tgctcctgag aataggatta
caaacactta aagtctttaa 4440ttacaactat atataatatt ctgttggttt tcttgaattg
gttcgctgcg attcatgcct 4500cccattcacc aaaggtggag tgggaaataa cggttttact
gcggtaatta gcagaggcaa 4560gaacaggata cactttttga tgataaatct gtattatagt
cgagcctatt taggaaatca 4620aattttcttg tgtttacttt tcaaataaat aatgttcgaa
aatttttact ttactccttc 4680atttaactat accagacgtt atatcatcaa caccttctga
ccatatacag ctcaagatgt 4740ttaagagtct gttaaatttt ttcaatccat ttcatggagt
accaggaggt gctacaaaag 4800gaattcatag cctcatgaaa tcagccattt gcttttgttc
aacgatcttt tgaaattgtt 4860gttgttcttg gtagttaagt tgatccatct tggcttatgt
tgtgtgtatg ttgtagttat 4920tcttagtata ttcctgtcct gagtttagtg aaacataata
tcgccttgaa atgaaaatgc 4980tgaaattcgt cgacatacaa tttttcaaac tttttttttt
tcttggtgca cggacatgtt 5040tttaaaggaa gtactctata ccagttattc ttcacaaatt
taattgctgg agaatagatc 5100ttcaacgctt taataaagta gtttgtttgt caaggatggc
gtcatacaaa gaaagatcag 5160aatcacacac ttcccctgtt gctaggagac ttttctccat
catggaggaa aagaagtcta 5220acctttgtgc atcattggat attactgaaa ctgaaaagct
tctctctatt ttggacacta 5280ttggtcctta catctgtcta gttaaaacac acatcgatat
tgtttctgat tttacgtatg 5340aaggaactgt gttgcctttg aaggagcttg ccaagaaaca
taattttatg atttttgaag 5400atagaaaatt tgctgatatt ggtaacactg ttaaaaatca
atataaatct ggtgtcttcc 5460gtattgccga atgggctgac atcactaatg cacatggtgt
aacgggtgca ggtattgttt 5520ctggcttgaa ggaggcagcc caagaaacaa ccagtgaacc
tagaggtttg ctaatgcttg 5580ctgagttatc atcaaagggt tctttagcat atggtgaata
tacagaaaaa acagtagaaa 5640ttgctaaatc tgataaagag tttgtcattg gttttattgc
gcaacacgat atgggcggta 5700gagaagaagg ttttgactcc gc
5722
SEQ ID NO: 21, length 1401, DNA, Escherichia coli
atgccacatt
cctacgatta cgatgccata gtaataggtt ccggccccgg cggcgaaggc 60gctgcaatgg
gcctggttaa gcaaggtgcg cgcgtcgcag ttatcgagcg ttatcaaaat 120gttggcggcg
gttgcaccca ctggggcacc atcccgtcga aagctctccg tcacgccgtc 180agccgcatta
tagaattcaa tcaaaaccca ctttacagcg accattcccg actgctccgc 240tcttcttttg
ccgatatcct taaccatgcc gataacgtga ttaatcaaca aacgcgcatg 300cgtcagggat
tttacgaacg taatcactgt gaaatattgc agggaaacgc tcgctttgtt 360gacgagcata
cgttggcgct ggattgcccg gacggcagcg ttgaaacact aaccgctgaa 420aaatttgtta
ttgcctgcgg ctctcgtcca tatcatccaa cagatgttga tttcacccat 480ccacgcattt
acgacagcga ctcaattctc agcatgcacc acgaaccgcg ccatgtactt 540atctatggtg
ctggagtgat cggctgtgaa tatgcgtcga tcttccgcgg tatggatgta 600aaagtggatc
tgatcaacac ccgcgatcgc ctgctggcat ttctcgatca agagatgtca 660gattctctct
cctatcactt ctggaacagt ggcgtagtga ttcgtcacaa cgaagagtac 720gagaagatcg
aaggctgtga cgatggtgtg atcatgcatc tgaagtcggg taaaaaactg 780aaagctgact
gcctgctcta tgccaacggt cgcaccggta ataccgattc gctggcgtta 840cagaacattg
ggctagaaac tgacagccgc ggacagctga aggtcaacag catgtatcag 900accgcacagc
cacacgttta cgcggtgggc gacgtgattg gttatccgag cctggcgtcg 960gcggcctatg
accaggggcg cattgccgcg caggcgctgg taaaaggcga agccaccgca 1020catctgattg
aagatatccc taccggtatt tacaccatcc cggaaatcag ctctgtgggc 1080aaaaccgaac
agcagctgac cgcaatgaaa gtgccatatg aagtgggccg cgcccagttt 1140aaacatctgg
cacgcgcaca aatcgtcggc atgaacgtgg gcacgctgaa aattttgttc 1200catcgggaaa
caaaagagat tctgggtatt cactgctttg gcgagcgcgc tgccgaaatt 1260attcatatcg
gtcaggcgat tatggaacag aaaggtggcg gcaacactat tgagtacttc 1320gtcaacacca
cctttaacta cccgacgatg gcggaagcct atcgggtagc tgcgttaaac 1380ggtttaaacc
gcctgtttta a
1401
SEQ ID NO: 22, length 5335, DNA, Artificial Sequence, Synthetic - RH E. coli integration fragment
aattctttga aggagcttgc caagaaacat aattttatga tttttgaaga
tagaaaattt 60gctgatattg gtaacactgt taaaaatcaa tataaatctg gtgtcttccg
tattgccgaa 120tgggctgaca tcactaatgc acatggtgta acgggtgcag gtattgtttc
tggcttgaag 180gaggcagccc aagaaacaac cagtgaacct agaggtttgc taatgcttgc
tgagttatca 240tcaaagggtt ctttagcata tggtgaatat acagaaaaaa cagtagaaat
tgctaaatct 300gataaagagt ttgtcattgg ttttattgcg caacacgata tgggcggtag
agaagaaggt 360tttgactgga tcattatgac tccaggggtt ggtttagatg acaaaggtga
tgcacttggt 420caacaatata gaactgttga tgaagttgta aagactggaa cggatatcat
aattgttggt 480agaggtttgt acggtcaagg aagagatcct atagagcaag ctaaaagata
ccaacaagct 540ggttggaatg cttatttaaa cagatttaaa tgattcttac acaaagattt
gatacatgta 600cactagttta aataagcatg aaaagaatta cacaagcaaa aaaaaaaaaa
taaatgaggt 660actttacgtt cacctacaac caaaaaaact agatagagta aaatcttaag
atttagaaaa 720agttgtttaa caaaggcttt agtatgtgaa tttttaatgt agcaaagcga
taactaataa 780acataaacaa aagtatggtt ttctttatca gtcaaatcat tatcgattga
ttgttccgcg 840tatctgcaga tagcctcatg aaatcagcca tttgcttttg ttcaacgatc
ttttgaaatt 900gttgttgttc ttggtagtta agttgatcca tcttggctta tgttgtgtgt
atgttgtagt 960tattcttagt atattcctgt cctgagttta gtgaaacata atatcgcctt
gaaatgaaaa 1020tgctgaaatt cgtcgacata caatttttca aacttttttt ttttcttggt
gcacggacat 1080gtttttaaag gaagtactct ataccagtta ttcttcacaa atttaattgc
tggagaatag 1140atcttcaacg cgtttaaaca gcaatttgag gaaggaatag gagaaggaga
agcaatttct 1200aggaaagagc aaggtgtgca acagcatgct ctgaatgata ttttcagcaa
tagttcagtt 1260gaagaacctg ttggcgtatc tacatcactt cctacaaaca acaccacgaa
ttgcgtccgt 1320ggtgacgcaa ctacgaatgg cattgtcaat gccaatgcca gtgcacatac
acgtgcaagt 1380cccaccggtt ccctgcccgg ctatggtaga gacaagaagg acgataccgg
catcgacatc 1440aacagtttca acagcaatgc gtttggcgtc gacgcgtcga tggggctgcc
gtatttggat 1500ttggacgggc tagatttcga tatggatatg gatatggata tggatatgga
gatgaatttg 1560aatttagatt tgggtcttga tttggggttg gaattaaaag gggataacaa
tgagggtttt 1620cctgttgatt taaacaatgg acgtgggagg tgattgattt aacctgatcc
aaaaggggta 1680tgtctatttt ttagagtgtg tctttgtgtc aaattatggt agaatgtgta
aagtagtata 1740aactttcctc tcaaatgacg aggtttaaaa caccccccgg gtgagccgag
ccgagaatgg 1800ggcaattgtt caatgtgaaa tagaagtatc gagtgagaaa cttgggtgtt
ggccagccaa 1860gggggaagga aaatggcgcg aatgctcagg tgagattgtt ttggaattgg
gtgaagcgag 1920gaaatgagcg acccggaggt tgtgacttta gtggcggagg aggaacggga
ggaaaaggcc 1980aagagggaaa gtgtatataa gggggagcaa tttgccaacc aggatagaat
tggatgagtt 2040ataattctac tgtatttatt gtataattta tttctccttt tatatcaaac
acattacaaa 2100acacacaaaa catacaaaca tacacagcta gcaaaggcgc gccatctaat
agtttaatca 2160cagcttatag tctactatag ttttcttttt taaacattgt tgtattttgt
cccccccctc 2220taattgatga tgattatcct ataagaatcc aataaaacga tggaaactaa
taccctctcc 2280tttgtcatgt ggtctttagt atttcttgaa cattggctct gatttctcga
ctttatagtc 2340ctattaaaat cgctgttagt tctcgatcgt tgtatctcgt ttcttgtctc
tttggtggat 2400gattttgcgt gcgaacatgt ttttttccct ttctctcacc atcatcgtgt
agttcttgtc 2460accatccccc ccaccccttc cttctctcat tgattctata agagcttatc
cacagaggtg 2520cagtaacgag gtagtttaac cttcgagtgg atcaaaatgt cacacaggcc
tcgacatttg 2580ctgcaacggc aacatcaatg tccacgttta cacacctaca tttatatcta
tatttatatt 2640tatatttatt tatttatgct acttagcttc tatagttagt taatgcactc
acgatattca 2700aaattgacac ccttcaacta ctccctacta ttgtctacta ctgtctacta
ctcctcttta 2760ctatagctgc tcccaatagg ctccaccaat aggctctgtc aatacatttt
gcgccgccac 2820ctttcaggtt gtgtcactcc tgaaggacca tattgggtaa tcgtgcaatt
tctggaagag 2880agtccgcgag aagtgaggcc cccactgtaa atcctcgagg gggcatggag
tatggggcat 2940ggaggatgga ggatgggggg gggggggggg aaaataggta gcgaaaggac
ccgctatcac 3000cccacccgga gaactcgttg ccgggaagtc atatttcgac actccgggga
gtctataaaa 3060ggcgggtttt gtcttttgcc agttgatgtt gctgagagga cttgtttgcc
gtttcttccg 3120atttaacagt atagaatcaa ccactgttaa ttatacacgt tatactaaca
caacaaaaac 3180aaaaacaacg acaacaacaa caacatctag aatgccacat tcctacgatt
acgatgccat 3240agtaataggt tccggccccg gcggcgaagg cgctgcaatg ggcctggtta
agcaaggtgc 3300gcgcgtcgca gttatcgagc gttatcaaaa tgttggcggc ggttgcaccc
actggggcac 3360catcccgtcg aaagctctcc gtcacgccgt cagccgcatt atagaattca
atcaaaaccc 3420actttacagc gaccattccc gactgctccg ctcttctttt gccgatatcc
ttaaccatgc 3480cgataacgtg attaatcaac aaacgcgcat gcgtcaggga ttttacgaac
gtaatcactg 3540tgaaatattg cagggaaacg ctcgctttgt tgacgagcat acgttggcgc
tggattgccc 3600ggacggcagc gttgaaacac taaccgctga aaaatttgtt attgcctgcg
gctctcgtcc 3660atatcatcca acagatgttg atttcaccca tccacgcatt tacgacagcg
actcaattct 3720cagcatgcac cacgaaccgc gccatgtact tatctatggt gctggagtga
tcggctgtga 3780atatgcgtcg atcttccgcg gtatggatgt aaaagtggat ctgatcaaca
cccgcgatcg 3840cctgctggca tttctcgatc aagagatgtc agattctctc tcctatcact
tctggaacag 3900tggcgtagtg attcgtcaca acgaagagta cgagaagatc gaaggctgtg
acgatggtgt 3960gatcatgcat ctgaagtcgg gtaaaaaact gaaagctgac tgcctgctct
atgccaacgg 4020tcgcaccggt aataccgatt cgctggcgtt acagaacatt gggctagaaa
ctgacagccg 4080cggacagctg aaggtcaaca gcatgtatca gaccgcacag ccacacgttt
acgcggtggg 4140cgacgtgatt ggttatccga gcctggcgtc ggcggcctat gaccaggggc
gcattgccgc 4200gcaggcgctg gtaaaaggcg aagccaccgc acatctgatt gaagatatcc
ctaccggtat 4260ttacaccatc ccggaaatca gctctgtggg caaaaccgaa cagcagctga
ccgcaatgaa 4320agtgccatat gaagtgggcc gcgcccagtt taaacatctg gcacgcgcac
aaatcgtcgg 4380catgaacgtg ggcacgctga aaattttgtt ccatcgggaa acaaaagaga
ttctgggtat 4440tcactgcttt ggcgagcgcg ctgccgaaat tattcatatc ggtcaggcga
ttatggaaca 4500gaaaggtggc ggcaacacta ttgagtactt cgtcaacacc acctttaact
acccgacgat 4560ggcggaagcc tatcgggtag ctgcgttaaa cggtttaaac cgcctgtttt
aaaactttat 4620cgaaatggcc atccattctt gcgcggattt aattaacatc tgaatgtaaa
atgaacatta 4680aaatgaatta ctaaacttta cgtctacttt acaatctata aactttgttt
aatcatataa 4740cgaaatacac taatacacaa tcctgtacgt atgtaatact tttatccatc
aaggattgag 4800aaaaaaaagt aatgattccc tgggccatta aaacttagac ccccaagctt
ggataggtca 4860ctctctattt tcgtttctcc cttccctgat agaagggtga tatgtaatta
agaataatat 4920ataattttat aataaaagcg gccgcacaca tacacattat caaatgcatt
tattcctaat 4980atcacactaa aacgtattat ataattttaa tctttataga cttcatagca
ccaattggat 5040ttgctttctt tcagaatacc gcacttaatc tcaatgtacg taacgtaggc
aaaatctgtc 5100gataaggatc tgtatgccgt aaacggaaac tccaagcgcc cagaaaactt
acattatatt 5160cttgccagtt tcatctcacc agccagtcac agtttaaaag gtttgattgc
gtttcttgtt 5220tcgtcggatt cagtgctaat tggtaacgca ctgtaccgcc acaccaaagc
aaaaatgcag 5280aaacaaacaa caatgagtgt atgtttacca actttggttt tgaaagttaa
cccgc 5335
SEQ ID NO: 23, length 5642, DNA, Artificial Sequence, Synthetic - Codon-optimized E. coli SthA gene integration fragment
aaccccactc tgggtgctac
ggtgtagatc tctccgtaaa cacaaaaagg cggctcagat 60gataattggg gtccgggcgc
aaccggaagg ggggagagag gggagcgatg gcttctcctc 120cggggggcta cgggagtttc
ctctttggga aggataaaga ggggatggat tgatacaaga 180ttctgagaac ctattacgat
gatgttcagt ggtattttgt cttttgttat ttaaagggag 240gggactttcc tcaatacctt
agttgtaaaa ttacgctatt atctttaacc ctttcttttg 300agcaataatt aaaaagagcg
gccgcgagtc catcggttcc tgtcagatgg gatactcttg 360acgtggaaaa ttcaaacaga
aaaaaaaccc caataatgaa aaataacact acgttatatc 420cgtggtatcc tctatcgtat
cgtatcgtag cgtatcgtag cgtaccgtat cacagtatag 480tctaatattc cgtatcttat
tgtatcctat cctattcgat cctattgtat ttcagtgcac 540cattttaatt tctattgcta
taatgtcctt attagttgcc actgtgaggt gaccaatgga 600cgagggcgag ccgttcagaa
gccgcgaagg gtgttcttcc catgaatttc ttaaggaggg 660cggctcagct ccgagagtga
ggcgagacgt ctcggtcagc gtatccccct tcctcggctt 720ttacaaatga tgcgctctta
atagtgtgtc gttatccttt tggcattgac gggggaggga 780aattgattga gcgcatccat
atttttgcgg actgctgagg acaatggtgg tttttccggg 840tggcgtgggc tacaaatgat
acgatggttt ttttcttttc ggagaaggcg tataaaaagg 900acacggagaa cccatttatt
ctaaaaacag ttgagcttct ttaattattt tttgatataa 960tattctatta ttatatattt
tcttcccaat aaaacaaaat aaaacaaaac acagcaaaac 1020acaaaaagct agcctgaaag
ggaaccataa tgggtaagat cgcaccacat tagcgggctc 1080gaagatggat cttgcgaatg
ggtgacacca gtcataaggc ctcgttgtcc cagcatacct 1140cccgcgctat ctaattgctt
cgctctccat tgttcttggt aaacatcact ctggcttgat 1200ggtgtcatct atgcccgcca
agcctatcgg tctatggccc ggagtttgct ccgtcttcca 1260attgcaatcg cacggaatcc
gggatagaaa gaacgatacg cattcatacg attctcacgt 1320tattggttgg tgaatcaaat
gcacaacgaa cccaatcgcc ctggactcag cgtctaggcc 1380ccccgtatgg ccgacgggga
ctcagagcgt caatccacgt tgaagtcgag gttttggcag 1440ttacagccct tgcaataagg
tttttcggac agtctacttt gtcggcgcgc cttctgtctt 1500tgattttctt atgttattca
aaacatctgc cccaaaatct aacgattata tatattccta 1560cgtataactg tatagctaat
tattgattta tttgtacata aaaaccacat aaatgtaaaa 1620gcaagaaaaa aaataactaa
ggagaaggat caatatctca tttataatgc tcgccaaagc 1680agcgtacgtg aattttaatc
aagacatcaa caaatcttgc aacttggtta tatcgcttct 1740tcacccactc acccgctttt
ctacattgtt gaacacaaat atatacaggg gtatgtctca 1800aggtcaagtg cagtttcaac
agagactacc tcaaggtacc tcttcagaaa tgcagaactt 1860cactcttgat cagattttct
ccgaattaaa ggaggcctat tggtagttct ttccccctct 1920caagctggcg tgaaatgcaa
ccttacggcg tctacgttac tacaaggtcc agaaagtgta 1980ggtattgcta ctatttttat
tttttattgg ttctggagaa atgcagacag tcaatgaaca 2040caactgtctc aatatgcatc
tatgcacatg cacacacaca cacatcacag gtacccctac 2100aaagagaggt ctcttgataa
tgtttcatta ccacgtggca tccccccccc cccccccaat 2160aaacaagtgg ccgagttccc
ctgttgcaga ggaggacaaa aaaaccgctg gtgttggtac 2220cattatgcag caactagcac
aacaaacaac cgacccagac atacaaatca acaacacttc 2280gccaaagaca ccctttccag
ggaggatcca ctcccaacgt ctctccataa tgtctctgtt 2340ggcccatgtc tctgtcgttg
acaccgtaac cacaccaacc aacccgtcca ttgtactggg 2400atggtcgtcc atagacacct
ctccaacggg gaacacctca ttcgtaaacc gccaaggtta 2460ccgttcctcc tgactcgccc
cgttgttgat gctgcgcacc tgtggttgcc caacatggtt 2520gtatatcgtg taaccacacc
aacacatgtg cagcacatgt gtttaaaaga gtgtcatgga 2580ggtggatcat gatggaagtg
gactttacca cttgggaact gtctccactc ccgggaagaa 2640aagacccggc gtatcacgcg
gttgcctcaa tggggcaatt tggaaggaga aatataggga 2700aaatcacgtc gctctcggac
ggggaagagt tccagactat gagggggggg ggtggtatat 2760aaagacagga gatgtccacc
cccagagaga ggaagaagtt ggaactttag aagagagaga 2820taactttccc cagtgtccat
caatacacaa ccaaacacaa actctatatt tacacatata 2880accccctctc tagaatgcca
cattcctatg actacgatgc cattgtcatt ggttccggtc 2940caggtggtga aggtgctgca
atgggcttag ttaagcaggg tgctagagtt gctgtcatcg 3000aaagatatca aaatgttggt
ggtggttgta ctcactgggg tacaattcca tctaaggcat 3060tgagacatgc agtttccaga
attattgagt ttaaccaaaa ccctttatac tctgatcatt 3120caagattgtt gagatcatct
tttgctgata ttttgaacca tgctgacaac gtcatcaacc 3180aacaaactcg tatgcgtcaa
ggcttctatg agagaaatca ttgtgagatt ttacaaggta 3240acgctagatt tgtcgatgag
catactcttg cattagactg tccagacggt tccgttgaga 3300ctcttaccgc tgaaaaattc
gttattgctt gtggttccag accataccac ccaaccgatg 3360tcgatttcac tcaccctcgt
atctacgatt ccgattctat tttgtctatg catcatgaac 3420caagacatgt tttgatttat
ggtgctggtg ttatcggttg tgaatatgct tctattttca 3480gaggtatgga tgttaaggtt
gacttgatta atacaagaga cagattatta gctttccttg 3540atcaggaaat gtctgattcc
ctttcctacc atttttggaa ctccggtgtc gtcatcagac 3600acaacgagga atatgaaaag
attgaaggtt gtgatgacgg cgttattatg caccttaagt 3660ctggtaaaaa gttaaaagca
gattgcttgt tatatgcaaa tggtagaacc ggtaacacag 3720actccttggc tttacaaaac
attggtttag aaaccgattc aagaggtcaa ttaaaggtca 3780attcaatgta tcaaactgca
caaccacacg tttacgcagt tggtgacgtt attggttacc 3840cttcattggc atctgccgct
tacgatcaag gtagaatcgc cgctcaagca cttgttaagg 3900gtgaagcaac tgcacactta
atcgaagata tccctaccgg tatctacact atcccagaaa 3960tctcttctgt tggcaagact
gaacaacaat taaccgcaat gaaggttcca tacgaagtcg 4020gtcgtgccca gttcaagcat
ttggctagag cacaaattgt tggtatgaat gttggtactt 4080tgaaaatctt gtttcacaga
gaaacaaagg aaatcttggg cattcactgt ttcggcgaaa 4140gagctgcaga gattattcac
atcggtcaag ccattatgga acaaaaaggc ggtggtaata 4200ccattgaata tttcgttaat
accaccttca actacccaac aatggccgaa gcatatagag 4260tcgctgcttt aaacggttta
aacagattgt tttaattaat ttattttact agtttatttt 4320tgctcctgag aataggatta
caaacactta aagtctttaa ttacaactat atataatatt 4380ctgttggttt tcttgaattg
gttcgctgcg attcatgcct cccattcacc aaaggtggag 4440tgggaaataa cggttttact
gcggtaatta gcagaggcaa gaacaggata cactttttga 4500tgataaatct gtattatagt
cgagcctatt taggaaatca aattttcttg tgtttacttt 4560tcaaataaat aatgttcgaa
aatttttact ttactccttc atttaactat accagacgtt 4620atatcatcaa caccttctga
ccatatacag ctcaagatgt ttaagagtct gttaaatttt 4680ttcaatccat ttcatggagt
accaggaggt gctacaaaag gaattcatag cctcatgaaa 4740tcagccattt gcttttgttc
aacgatcttt tgaaattgtt gttgttcttg gtagttaagt 4800tgatccatct tggcttatgt
tgtgtgtatg ttgtagttat tcttagtata ttcctgtcct 4860gagtttagtg aaacataata
tcgccttgaa atgaaaatgc tgaaattcgt cgacatacaa 4920tttttcaaac tttttttttt
tcttggtgca cggacatgtt tttaaaggaa gtactctata 4980ccagttattc ttcacaaatt
taattgctgg agaatagatc ttcaacgctt taataaagta 5040gtttgtttgt caaggatggc
gtcatacaaa gaaagatcag aatcacacac ttcccctgtt 5100gctaggagac ttttctccat
catggaggaa aagaagtcta acctttgtgc atcattggat 5160attactgaaa ctgaaaagct
tctctctatt ttggacacta ttggtcctta catctgtcta 5220gttaaaacac acatcgatat
tgtttctgat tttacgtatg aaggaactgt gttgcctttg 5280aaggagcttg ccaagaaaca
taattttatg atttttgaag atagaaaatt tgctgatatt 5340ggtaacactg ttaaaaatca
atataaatct ggtgtcttcc gtattgccga atgggctgac 5400atcactaatg cacatggtgt
aacgggtgca ggtattgttt ctggcttgaa ggaggcagcc 5460caagaaacaa ccagtgaacc
tagaggtttg ctaatgcttg ctgagttatc atcaaagggt 5520tctttagcat atggtgaata
tacagaaaaa acagtagaaa ttgctaaatc tgataaagag 5580tttgtcattg gttttattgc
gcaacacgat atgggcggta gagaagaagg ttttgactcc 5640gc
5642
24 1401 DNA Artificial Sequence Synthetic - Codon-optimized E. coli SthA gene
24 atgccacatt
cctatgacta cgatgccatt gtcattggtt ccggtccagg tggtgaaggt 60gctgcaatgg
gcttagttaa gcagggtgct agagttgctg tcatcgaaag atatcaaaat 120gttggtggtg
gttgtactca ctggggtaca attccatcta aggcattgag acatgcagtt 180tccagaatta
ttgagtttaa ccaaaaccct ttatactctg atcattcaag attgttgaga 240tcatcttttg
ctgatatttt gaaccatgct gacaacgtca tcaaccaaca aactcgtatg 300cgtcaaggct
tctatgagag aaatcattgt gagattttac aaggtaacgc tagatttgtc 360gatgagcata
ctcttgcatt agactgtcca gacggttccg ttgagactct taccgctgaa 420aaattcgtta
ttgcttgtgg ttccagacca taccacccaa ccgatgtcga tttcactcac 480cctcgtatct
acgattccga ttctattttg tctatgcatc atgaaccaag acatgttttg 540atttatggtg
ctggtgttat cggttgtgaa tatgcttcta ttttcagagg tatggatgtt 600aaggttgact
tgattaatac aagagacaga ttattagctt tccttgatca ggaaatgtct 660gattcccttt
cctaccattt ttggaactcc ggtgtcgtca tcagacacaa cgaggaatat 720gaaaagattg
aaggttgtga tgacggcgtt attatgcacc ttaagtctgg taaaaagtta 780aaagcagatt
gcttgttata tgcaaatggt agaaccggta acacagactc cttggcttta 840caaaacattg
gtttagaaac cgattcaaga ggtcaattaa aggtcaattc aatgtatcaa 900actgcacaac
cacacgttta cgcagttggt gacgttattg gttacccttc attggcatct 960gccgcttacg
atcaaggtag aatcgccgct caagcacttg ttaagggtga agcaactgca 1020cacttaatcg
aagatatccc taccggtatc tacactatcc cagaaatctc ttctgttggc 1080aagactgaac
aacaattaac cgcaatgaag gttccatacg aagtcggtcg tgcccagttc 1140aagcatttgg
ctagagcaca aattgttggt atgaatgttg gtactttgaa aatcttgttt 1200cacagagaaa
caaaggaaat cttgggcatt cactgtttcg gcgaaagagc tgcagagatt 1260attcacatcg
gtcaagccat tatggaacaa aaaggcggtg gtaataccat tgaatatttc 1320gttaatacca
ccttcaacta cccaacaatg gccgaagcat atagagtcgc tgctttaaac 1380ggtttaaaca
gattgtttta a
1401
25 5304 DNA Artificial Sequence Synthetic - Codon-optimized E. coli SthA gene integration fragment
25 aattctttga aggagcttgc caagaaacat
aattttatga tttttgaaga tagaaaattt 60gctgatattg gtaacactgt taaaaatcaa
tataaatctg gtgtcttccg tattgccgaa 120tgggctgaca tcactaatgc acatggtgta
acgggtgcag gtattgtttc tggcttgaag 180gaggcagccc aagaaacaac cagtgaacct
agaggtttgc taatgcttgc tgagttatca 240tcaaagggtt ctttagcata tggtgaatat
acagaaaaaa cagtagaaat tgctaaatct 300gataaagagt ttgtcattgg ttttattgcg
caacacgata tgggcggtag agaagaaggt 360tttgactgga tcattatgac tccaggggtt
ggtttagatg acaaaggtga tgcacttggt 420caacaatata gaactgttga tgaagttgta
aagactggaa cggatatcat aattgttggt 480agaggtttgt acggtcaagg aagagatcct
atagagcaag ctaaaagata ccaacaagct 540ggttggaatg cttatttaaa cagatttaaa
tgattcttac acaaagattt gatacatgta 600cactagttta aataagcatg aaaagaatta
cacaagcaaa aaaaaaaaaa taaatgaggt 660actttacgtt cacctacaac caaaaaaact
agatagagta aaatcttaag atttagaaaa 720agttgtttaa caaaggcttt agtatgtgaa
tttttaatgt agcaaagcga taactaataa 780acataaacaa aagtatggtt ttctttatca
gtcaaatcat tatcgattga ttgttccgcg 840tatctgcaga tagcctcatg aaatcagcca
tttgcttttg ttcaacgatc ttttgaaatt 900gttgttgttc ttggtagtta agttgatcca
tcttggctta tgttgtgtgt atgttgtagt 960tattcttagt atattcctgt cctgagttta
gtgaaacata atatcgcctt gaaatgaaaa 1020tgctgaaatt cgtcgacata caatttttca
aacttttttt ttttcttggt gcacggacat 1080gtttttaaag gaagtactct ataccagtta
ttcttcacaa atttaattgc tggagaatag 1140atcttcaacg cgtttaaaca gcaatttgag
gaaggaatag gagaaggaga agcaatttct 1200aggaaagagc aaggtgtgca acagcatgct
ctgaatgata ttttcagcaa tagttcagtt 1260gaagaacctg ttggcgtatc tacatcactt
cctacaaaca acaccacgaa ttgcgtccgt 1320ggtgacgcaa ctacgaatgg cattgtcaat
gccaatgcca gtgcacatac acgtgcaagt 1380cccaccggtt ccctgcccgg ctatggtaga
gacaagaagg acgataccgg catcgacatc 1440aacagtttca acagcaatgc gtttggcgtc
gacgcgtcga tggggctgcc gtatttggat 1500ttggacgggc tagatttcga tatggatatg
gatatggata tggatatgga gatgaatttg 1560aatttagatt tgggtcttga tttggggttg
gaattaaaag gggataacaa tgagggtttt 1620cctgttgatt taaacaatgg acgtgggagg
tgattgattt aacctgatcc aaaaggggta 1680tgtctatttt ttagagtgtg tctttgtgtc
aaattatggt agaatgtgta aagtagtata 1740aactttcctc tcaaatgacg aggtttaaaa
caccccccgg gtgagccgag ccgagaatgg 1800ggcaattgtt caatgtgaaa tagaagtatc
gagtgagaaa cttgggtgtt ggccagccaa 1860gggggaagga aaatggcgcg aatgctcagg
tgagattgtt ttggaattgg gtgaagcgag 1920gaaatgagcg acccggaggt tgtgacttta
gtggcggagg aggaacggga ggaaaaggcc 1980aagagggaaa gtgtatataa gggggagcaa
tttgccaacc aggatagaat tggatgagtt 2040ataattctac tgtatttatt gtataattta
tttctccttt tatatcaaac acattacaaa 2100acacacaaaa catacaaaca tacacagcta
gcaaaggcgc gccatctaat agtttaatca 2160cagcttatag tctactatag ttttcttttt
taaacattgt tgtattttgt cccccccctc 2220taattgatga tgattatcct ataagaatcc
aataaaacga tggaaactaa taccctctcc 2280tttgtcatgt ggtctttagt atttcttgaa
cattggctct gatttctcga ctttatagtc 2340ctattaaaat cgctgttagt tctcgatcgt
tgtatctcgt ttcttgtctc tttggtggat 2400gattttgcgt gcgaacatgt ttttttccct
ttctctcacc atcatcgtgt agttcttgtc 2460accatccccc ccaccccttc cttctctcat
tgattctata agagcttatc cacagaggtg 2520cagtaacgag gtagtttaac cttcgagtgg
atcaaaatgt cacacaggcc tcgacatttg 2580ctgcaacggc aacatcaatg tccacgttta
cacacctaca tttatatcta tatttatatt 2640tatatttatt tatttatgct acttagcttc
tatagttagt taatgcactc acgatattca 2700aaattgacac ccttcaacta ctccctacta
ttgtctacta ctgtctacta ctcctcttta 2760ctatagctgc tcccaatagg ctccaccaat
aggctctgtc aatacatttt gcgccgccac 2820ctttcaggtt gtgtcactcc tgaaggacca
tattgggtaa tcgtgcaatt tctggaagag 2880agtccgcgag aagtgaggcc cccactgtaa
atcctcgagg gggcatggag tatggggcat 2940ggaggatgga ggatgggggg gggggggggg
aaaataggta gcgaaaggac ccgctatcac 3000cccacccgga gaactcgttg ccgggaagtc
atatttcgac actccgggga gtctataaaa 3060ggcgggtttt gtcttttgcc agttgatgtt
gctgagagga cttgtttgcc gtttcttccg 3120atttaacagt atagaatcaa ccactgttaa
ttatacacgt tatactaaca caacaaaaac 3180aaaaacaacg acaacaacaa caacatctag
aatgccacat tcctatgact acgatgccat 3240tgtcattggt tccggtccag gtggtgaagg
tgctgcaatg ggcttagtta agcagggtgc 3300tagagttgct gtcatcgaaa gatatcaaaa
tgttggtggt ggttgtactc actggggtac 3360aattccatct aaggcattga gacatgcagt
ttccagaatt attgagttta accaaaaccc 3420tttatactct gatcattcaa gattgttgag
atcatctttt gctgatattt tgaaccatgc 3480tgacaacgtc atcaaccaac aaactcgtat
gcgtcaaggc ttctatgaga gaaatcattg 3540tgagatttta caaggtaacg ctagatttgt
cgatgagcat actcttgcat tagactgtcc 3600agacggttcc gttgagactc ttaccgctga
aaaattcgtt attgcttgtg gttccagacc 3660ataccaccca accgatgtcg atttcactca
ccctcgtatc tacgattccg attctatttt 3720gtctatgcat catgaaccaa gacatgtttt
gatttatggt gctggtgtta tcggttgtga 3780atatgcttct attttcagag gtatggatgt
taaggttgac ttgattaata caagagacag 3840attattagct ttccttgatc aggaaatgtc
tgattccctt tcctaccatt tttggaactc 3900cggtgtcgtc atcagacaca acgaggaata
tgaaaagatt gaaggttgtg atgacggcgt 3960tattatgcac cttaagtctg gtaaaaagtt
aaaagcagat tgcttgttat atgcaaatgg 4020tagaaccggt aacacagact ccttggcttt
acaaaacatt ggtttagaaa ccgattcaag 4080aggtcaatta aaggtcaatt caatgtatca
aactgcacaa ccacacgttt acgcagttgg 4140tgacgttatt ggttaccctt cattggcatc
tgccgcttac gatcaaggta gaatcgccgc 4200tcaagcactt gttaagggtg aagcaactgc
acacttaatc gaagatatcc ctaccggtat 4260ctacactatc ccagaaatct cttctgttgg
caagactgaa caacaattaa ccgcaatgaa 4320ggttccatac gaagtcggtc gtgcccagtt
caagcatttg gctagagcac aaattgttgg 4380tatgaatgtt ggtactttga aaatcttgtt
tcacagagaa acaaaggaaa tcttgggcat 4440tcactgtttc ggcgaaagag ctgcagagat
tattcacatc ggtcaagcca ttatggaaca 4500aaaaggcggt ggtaatacca ttgaatattt
cgttaatacc accttcaact acccaacaat 4560ggccgaagca tatagagtcg ctgctttaaa
cggtttaaac agattgtttt aattaacatc 4620tgaatgtaaa atgaacatta aaatgaatta
ctaaacttta cgtctacttt acaatctata 4680aactttgttt aatcatataa cgaaatacac
taatacacaa tcctgtacgt atgtaatact 4740tttatccatc aaggattgag aaaaaaaagt
aatgattccc tgggccatta aaacttagac 4800ccccaagctt ggataggtca ctctctattt
tcgtttctcc cttccctgat agaagggtga 4860tatgtaatta agaataatat ataattttat
aataaaagcg gccgccaagt tagttagagc 4920tagagttaac acatacacat tatcaaatgc
atttattcct aatatcacac taaaacgtat 4980tatataattt taatctttat agacttcata
gcaccaattg gatttgcttt ctttcagaat 5040accgcactta atctcaatgt acgtaacgta
ggcaaaatct gtcgataagg atctgtatgc 5100cgtaaacgga aactccaagc gcccagaaaa
cttacattat attcttgcca gtttcatctc 5160accagccagt cacagtttaa aaggtttgat
tgcgtttctt gtttcgtcgg attcagtgct 5220aattggtaac gcactgtacc gccacaccaa
agcaaaaatg cagaaacaaa caacaatgag 5280tgtatgttta ccaactttgg ccgc
5304
26 5676 DNA Artificial Sequence Synthetic - A. vinelandii SthA gene integration fragment
26 aactactatg tacactgtat aagtaaaaag acgatacccc cctcccactc tgggtgctac
60ggtgtagatc tctccgtaaa cacaaaaagg cggctcagat gataattggg gtccgggcgc
120aaccggaagg ggggagagag gggagcgatg gcttctcctc cggggggcta cgggagtttc
180ctctttggga aggataaaga ggggatggat tgatacaaga ttctgagaac ctattacgat
240gatgttcagt ggtattttgt cttttgttat ttaaagggag gggactttcc tcaatacctt
300agttgtaaaa ttacgctatt atctttaacc ctttcttttg agcaataatt aaaaagagcg
360gccgcgagtc catcggttcc tgtcagatgg gatactcttg acgtggaaaa ttcaaacaga
420aaaaaaaccc caataatgaa aaataacact acgttatatc cgtggtatcc tctatcgtat
480cgtatcgtag cgtatcgtag cgtaccgtat cacagtatag tctaatattc cgtatcttat
540tgtatcctat cctattcgat cctattgtat ttcagtgcac cattttaatt tctattgcta
600taatgtcctt attagttgcc actgtgaggt gaccaatgga cgagggcgag ccgttcagaa
660gccgcgaagg gtgttcttcc catgaatttc ttaaggaggg cggctcagct ccgagagtga
720ggcgagacgt ctcggtcagc gtatccccct tcctcggctt ttacaaatga tgcgctctta
780atagtgtgtc gttatccttt tggcattgac gggggaggga aattgattga gcgcatccat
840atttttgcgg actgctgagg acaatggtgg tttttccggg tggcgtgggc tacaaatgat
900acgatggttt ttttcttttc ggagaaggcg tataaaaagg acacggagaa cccatttatt
960ctaaaaacag ttgagcttct ttaattattt tttgatataa tattctatta ttatatattt
1020tcttcccaat aaaacaaaat aaaacaaaac acagcaaaac acaaaaagct agcctgaaag
1080ggaaccataa tgggtaagat cgcaccacat tagcgggctc gaagatggat cttgcgaatg
1140ggtgacacca gtcataaggc ctcgttgtcc cagcatacct cccgcgctat ctaattgctt
1200cgctctccat tgttcttggt aaacatcact ctggcttgat ggtgtcatct atgcccgcca
1260agcctatcgg tctatggccc ggagtttgct ccgtcttcca attgcaatcg cacggaatcc
1320gggatagaaa gaacgatacg cattcatacg attctcacgt tattggttgg tgaatcaaat
1380gcacaacgaa cccaatcgcc ctggactcag cgtctaggcc ccccgtatgg ccgacgggga
1440ctcagagcgt caatccacgt tgaagtcgag gttttggcag ttacagccct tgcaataagg
1500tttttcggac agtctacttt gtcggcgcgc cttctgtctt tgattttctt atgttattca
1560aaacatctgc cccaaaatct aacgattata tatattccta cgtataactg tatagctaat
1620tattgattta tttgtacata aaaaccacat aaatgtaaaa gcaagaaaaa aaataactaa
1680ggagaaggat caatatctca tttataatgc tcgccaaagc agcgtacgtg aattttaatc
1740aagacatcaa caaatcttgc aacttggtta tatcgcttct tcacccactc acccgctttt
1800ctacattgtt gaacacaaat atatacaggg gtatgtctca aggtcaagtg cagtttcaac
1860agagactacc tcaaggtacc tcttcagaaa tgcagaactt cactcttgat cagattttct
1920ccgaattaaa ggaggcctat tggtagttct ttccccctct caagctggcg tgaaatgcaa
1980ccttacggcg tctacgttac tacaaggtcc agaaagtgta ggtattgcta ctatttttat
2040tttttattgg ttctggagaa atgcagacag tcaatgaaca caactgtctc aatatgcatc
2100tatgcacatg cacacacaca cacatcacag gtacccctac aaagagaggt ctcttgataa
2160tgtttcatta ccacgtggca tccccccccc cccccccaat aaacaagtgg ccgagttccc
2220ctgttgcaga ggaggacaaa aaaaccgctg gtgttggtac cattatgcag caactagcac
2280aacaaacaac cgacccagac atacaaatca acaacacttc gccaaagaca ccctttccag
2340ggaggatcca ctcccaacgt ctctccataa tgtctctgtt ggcccatgtc tctgtcgttg
2400acaccgtaac cacaccaacc aacccgtcca ttgtactggg atggtcgtcc atagacacct
2460ctccaacggg gaacacctca ttcgtaaacc gccaaggtta ccgttcctcc tgactcgccc
2520cgttgttgat gctgcgcacc tgtggttgcc caacatggtt gtatatcgtg taaccacacc
2580aacacatgtg cagcacatgt gtttaaaaga gtgtcatgga ggtggatcat gatggaagtg
2640gactttacca cttgggaact gtctccactc ccgggaagaa aagacccggc gtatcacgcg
2700gttgcctcaa tggggcaatt tggaaggaga aatataggga aaatcacgtc gctctcggac
2760ggggaagagt tccagactat gagggggggg ggtggtatat aaagacagga gatgtccacc
2820cccagagaga ggaagaagtt ggaactttag aagagagaga taactttccc cagtgtccat
2880caatacacaa ccaaacacaa actctatatt tacacatata accccctctc tagaatggca
2940gtctataact atgatgttgt tgtcattggt actggtccag ctggtgaagg tgctgctatg
3000aatgctgtca aagctggcag aaaggttgct gtcgttgacg acagacctca agtcggtggt
3060aactgtactc atcttggtac tatcccatcc aaggcattaa gacattcagt tagacagatc
3120atgcagtata acaacaaccc attattcaga caaattggtg aacctagatg gttttctttc
3180gcagacgttc ttaagtccgc tgaacaagtt atcgcaaagc aagtctcttc aagaaccggc
3240tattacgcaa gaaatcgtat tgatactttc tttggcaccg cctcattctg tgatgaacat
3300actatcgaag ttgtccactt gaatggtatg gttgaaacct tagttgctaa gcaattcgtt
3360attgcaacag gttcaagacc atacagacca gctgacgtcg actttaccca cccaagaatc
3420tacgattccg ataccattct ttccttgggt catacaccaa gacgtttgat tatctacggt
3480gccggtgtca ttggctgtga gtacgcttca attttctccg gtttaggtgt tttagttgat
3540ttgattgaca acagagatca gttgttgtcc tttttggatg atgaaatttc tgattctttg
3600tcctatcact taagaaataa caacgttttg attagacaca acgaagaata cgaaagagtt
3660gaaggtcttg ataatggtgt tatcttacac ttaaagtctg gtaaaaagat taaggcagat
3720gcatttttgt ggtctaacgg tagaactggt aacactgata agttaggttt ggaaaacatt
3780ggtttgaagg ctaatggcag aggtcaaatt caagttgatg agcattatcg tacagaagtc
3840tccaatatct acgcagccgg tgacgtcatc ggttggccat ccttagcttc agcagcttat
3900gatcaaggta gatctgctgc tggttctatt accgagaatg actcttggcg tttcgttgat
3960gatgttccta ccggtatcta caccatccct gaaatttcct ctgttggtaa aaccgaaaga
4020gagttgacac aagcaaaagt cccatacgag gttggtaaag cctttttcaa gggcatggct
4080cgtgcacaaa ttgcagttga aaaagccggt atgttaaaga ttctttttca tagagagact
4140ttagaaatct tgggtgtcca ctgcttcggt taccaagcat ctgaaattgt tcatattggt
4200caagcaatta tgaaccaaaa gggcgaagca aatacattaa agtatttcat caacactaca
4260ttcaattatc caactatggc tgaagcttat agagttgcag cctacgacgg tttaaacaga
4320ttgttttaat taatttattt tactagttta tttttgctcc tgagaatagg attacaaaca
4380cttaaagtct ttaattacaa ctatatataa tattctgttg gttttcttga attggttcgc
4440tgcgattcat gcctcccatt caccaaaggt ggagtgggaa ataacggttt tactgcggta
4500attagcagag gcaagaacag gatacacttt ttgatgataa atctgtatta tagtcgagcc
4560tatttaggaa atcaaatttt cttgtgttta cttttcaaat aaataatgtt cgaaaatttt
4620tactttactc cttcatttaa ctataccaga cgttatatca tcaacacctt ctgaccatat
4680acagctcaag atgtttaaga gtctgttaaa ttttttcaat ccatttcatg gagtaccagg
4740aggtgctaca aaaggaattc atagcctcat gaaatcagcc atttgctttt gttcaacgat
4800cttttgaaat tgttgttgtt cttggtagtt aagttgatcc atcttggctt atgttgtgtg
4860tatgttgtag ttattcttag tatattcctg tcctgagttt agtgaaacat aatatcgcct
4920tgaaatgaaa atgctgaaat tcgtcgacat acaatttttc aaactttttt tttttcttgg
4980tgcacggaca tgtttttaaa ggaagtactc tataccagtt attcttcaca aatttaattg
5040ctggagaata gatcttcaac gctttaataa agtagtttgt ttgtcaagga tggcgtcata
5100caaagaaaga tcagaatcac acacttcccc tgttgctagg agacttttct ccatcatgga
5160ggaaaagaag tctaaccttt gtgcatcatt ggatattact gaaactgaaa agcttctctc
5220tattttggac actattggtc cttacatctg tctagttaaa acacacatcg atattgtttc
5280tgattttacg tatgaaggaa ctgtgttgcc tttgaaggag cttgccaaga aacataattt
5340tatgattttt gaagatagaa aatttgctga tattggtaac actgttaaaa atcaatataa
5400atctggtgtc ttccgtattg ccgaatgggc tgacatcact aatgcacatg gtgtaacggg
5460tgcaggtatt gtttctggct tgaaggaggc agcccaagaa acaaccagtg aacctagagg
5520tttgctaatg cttgctgagt tatcatcaaa gggttcttta gcatatggtg aatatacaga
5580aaaaacagta gaaattgcta aatctgataa agagtttgtc attggtttta ttgcgcaaca
5640cgatatgggc ggtagagaag aaggttttga ctccgc
5676
27 1395 DNA Azotobacter vinelandii
27 atggcagtct ataactatga tgttgttgtc
attggtactg gtccagctgg tgaaggtgct 60gctatgaatg ctgtcaaagc tggcagaaag
gttgctgtcg ttgacgacag acctcaagtc 120ggtggtaact gtactcatct tggtactatc
ccatccaagg cattaagaca ttcagttaga 180cagatcatgc agtataacaa caacccatta
ttcagacaaa ttggtgaacc tagatggttt 240tctttcgcag acgttcttaa gtccgctgaa
caagttatcg caaagcaagt ctcttcaaga 300accggctatt acgcaagaaa tcgtattgat
actttctttg gcaccgcctc attctgtgat 360gaacatacta tcgaagttgt ccacttgaat
ggtatggttg aaaccttagt tgctaagcaa 420ttcgttattg caacaggttc aagaccatac
agaccagctg acgtcgactt tacccaccca 480agaatctacg attccgatac cattctttcc
ttgggtcata caccaagacg tttgattatc 540tacggtgccg gtgtcattgg ctgtgagtac
gcttcaattt tctccggttt aggtgtttta 600gttgatttga ttgacaacag agatcagttg
ttgtcctttt tggatgatga aatttctgat 660tctttgtcct atcacttaag aaataacaac
gttttgatta gacacaacga agaatacgaa 720agagttgaag gtcttgataa tggtgttatc
ttacacttaa agtctggtaa aaagattaag 780gcagatgcat ttttgtggtc taacggtaga
actggtaaca ctgataagtt aggtttggaa 840aacattggtt tgaaggctaa tggcagaggt
caaattcaag ttgatgagca ttatcgtaca 900gaagtctcca atatctacgc agccggtgac
gtcatcggtt ggccatcctt agcttcagca 960gcttatgatc aaggtagatc tgctgctggt
tctattaccg agaatgactc ttggcgtttc 1020gttgatgatg ttcctaccgg tatctacacc
atccctgaaa tttcctctgt tggtaaaacc 1080gaaagagagt tgacacaagc aaaagtccca
tacgaggttg gtaaagcctt tttcaagggc 1140atggctcgtg cacaaattgc agttgaaaaa
gccggtatgt taaagattct ttttcataga 1200gagactttag aaatcttggg tgtccactgc
ttcggttacc aagcatctga aattgttcat 1260attggtcaag caattatgaa ccaaaagggc
gaagcaaata cattaaagta tttcatcaac 1320actacattca attatccaac tatggctgaa
gcttatagag ttgcagccta cgacggttta 1380aacagattgt tttaa
1395
28 5298 DNA Artificial Sequence Synthetic - A. vinelandii SthA gene integration fragment
28 aattctttga aggagcttgc caagaaacat aattttatga tttttgaaga tagaaaattt
60gctgatattg gtaacactgt taaaaatcaa tataaatctg gtgtcttccg tattgccgaa
120tgggctgaca tcactaatgc acatggtgta acgggtgcag gtattgtttc tggcttgaag
180gaggcagccc aagaaacaac cagtgaacct agaggtttgc taatgcttgc tgagttatca
240tcaaagggtt ctttagcata tggtgaatat acagaaaaaa cagtagaaat tgctaaatct
300gataaagagt ttgtcattgg ttttattgcg caacacgata tgggcggtag agaagaaggt
360tttgactgga tcattatgac tccaggggtt ggtttagatg acaaaggtga tgcacttggt
420caacaatata gaactgttga tgaagttgta aagactggaa cggatatcat aattgttggt
480agaggtttgt acggtcaagg aagagatcct atagagcaag ctaaaagata ccaacaagct
540ggttggaatg cttatttaaa cagatttaaa tgattcttac acaaagattt gatacatgta
600cactagttta aataagcatg aaaagaatta cacaagcaaa aaaaaaaaaa taaatgaggt
660actttacgtt cacctacaac caaaaaaact agatagagta aaatcttaag atttagaaaa
720agttgtttaa caaaggcttt agtatgtgaa tttttaatgt agcaaagcga taactaataa
780acataaacaa aagtatggtt ttctttatca gtcaaatcat tatcgattga ttgttccgcg
840tatctgcaga tagcctcatg aaatcagcca tttgcttttg ttcaacgatc ttttgaaatt
900gttgttgttc ttggtagtta agttgatcca tcttggctta tgttgtgtgt atgttgtagt
960tattcttagt atattcctgt cctgagttta gtgaaacata atatcgcctt gaaatgaaaa
1020tgctgaaatt cgtcgacata caatttttca aacttttttt ttttcttggt gcacggacat
1080gtttttaaag gaagtactct ataccagtta ttcttcacaa atttaattgc tggagaatag
1140atcttcaacg cgtttaaaca gcaatttgag gaaggaatag gagaaggaga agcaatttct
1200aggaaagagc aaggtgtgca acagcatgct ctgaatgata ttttcagcaa tagttcagtt
1260gaagaacctg ttggcgtatc tacatcactt cctacaaaca acaccacgaa ttgcgtccgt
1320ggtgacgcaa ctacgaatgg cattgtcaat gccaatgcca gtgcacatac acgtgcaagt
1380cccaccggtt ccctgcccgg ctatggtaga gacaagaagg acgataccgg catcgacatc
1440aacagtttca acagcaatgc gtttggcgtc gacgcgtcga tggggctgcc gtatttggat
1500ttggacgggc tagatttcga tatggatatg gatatggata tggatatgga gatgaatttg
1560aatttagatt tgggtcttga tttggggttg gaattaaaag gggataacaa tgagggtttt
1620cctgttgatt taaacaatgg acgtgggagg tgattgattt aacctgatcc aaaaggggta
1680tgtctatttt ttagagtgtg tctttgtgtc aaattatggt agaatgtgta aagtagtata
1740aactttcctc tcaaatgacg aggtttaaaa caccccccgg gtgagccgag ccgagaatgg
1800ggcaattgtt caatgtgaaa tagaagtatc gagtgagaaa cttgggtgtt ggccagccaa
1860gggggaagga aaatggcgcg aatgctcagg tgagattgtt ttggaattgg gtgaagcgag
1920gaaatgagcg acccggaggt tgtgacttta gtggcggagg aggaacggga ggaaaaggcc
1980aagagggaaa gtgtatataa gggggagcaa tttgccaacc aggatagaat tggatgagtt
2040ataattctac tgtatttatt gtataattta tttctccttt tatatcaaac acattacaaa
2100acacacaaaa catacaaaca tacacagcta gcaaaggcgc gccatctaat agtttaatca
2160cagcttatag tctactatag ttttcttttt taaacattgt tgtattttgt cccccccctc
2220taattgatga tgattatcct ataagaatcc aataaaacga tggaaactaa taccctctcc
2280tttgtcatgt ggtctttagt atttcttgaa cattggctct gatttctcga ctttatagtc
2340ctattaaaat cgctgttagt tctcgatcgt tgtatctcgt ttcttgtctc tttggtggat
2400gattttgcgt gcgaacatgt ttttttccct ttctctcacc atcatcgtgt agttcttgtc
2460accatccccc ccaccccttc cttctctcat tgattctata agagcttatc cacagaggtg
2520cagtaacgag gtagtttaac cttcgagtgg atcaaaatgt cacacaggcc tcgacatttg
2580ctgcaacggc aacatcaatg tccacgttta cacacctaca tttatatcta tatttatatt
2640tatatttatt tatttatgct acttagcttc tatagttagt taatgcactc acgatattca
2700aaattgacac ccttcaacta ctccctacta ttgtctacta ctgtctacta ctcctcttta
2760ctatagctgc tcccaatagg ctccaccaat aggctctgtc aatacatttt gcgccgccac
2820ctttcaggtt gtgtcactcc tgaaggacca tattgggtaa tcgtgcaatt tctggaagag
2880agtccgcgag aagtgaggcc cccactgtaa atcctcgagg gggcatggag tatggggcat
2940ggaggatgga ggatgggggg gggggggggg aaaataggta gcgaaaggac ccgctatcac
3000cccacccgga gaactcgttg ccgggaagtc atatttcgac actccgggga gtctataaaa
3060ggcgggtttt gtcttttgcc agttgatgtt gctgagagga cttgtttgcc gtttcttccg
3120atttaacagt atagaatcaa ccactgttaa ttatacacgt tatactaaca caacaaaaac
3180aaaaacaacg acaacaacaa caacatctag aatggcagtc tataactatg atgttgttgt
3240cattggtact ggtccagctg gtgaaggtgc tgctatgaat gctgtcaaag ctggcagaaa
3300ggttgctgtc gttgacgaca gacctcaagt cggtggtaac tgtactcatc ttggtactat
3360cccatccaag gcattaagac attcagttag acagatcatg cagtataaca acaacccatt
3420attcagacaa attggtgaac ctagatggtt ttctttcgca gacgttctta agtccgctga
3480acaagttatc gcaaagcaag tctcttcaag aaccggctat tacgcaagaa atcgtattga
3540tactttcttt ggcaccgcct cattctgtga tgaacatact atcgaagttg tccacttgaa
3600tggtatggtt gaaaccttag ttgctaagca attcgttatt gcaacaggtt caagaccata
3660cagaccagct gacgtcgact ttacccaccc aagaatctac gattccgata ccattctttc
3720cttgggtcat acaccaagac gtttgattat ctacggtgcc ggtgtcattg gctgtgagta
3780cgcttcaatt ttctccggtt taggtgtttt agttgatttg attgacaaca gagatcagtt
3840gttgtccttt ttggatgatg aaatttctga ttctttgtcc tatcacttaa gaaataacaa
3900cgttttgatt agacacaacg aagaatacga aagagttgaa ggtcttgata atggtgttat
3960cttacactta aagtctggta aaaagattaa ggcagatgca tttttgtggt ctaacggtag
4020aactggtaac actgataagt taggtttgga aaacattggt ttgaaggcta atggcagagg
4080tcaaattcaa gttgatgagc attatcgtac agaagtctcc aatatctacg cagccggtga
4140cgtcatcggt tggccatcct tagcttcagc agcttatgat caaggtagat ctgctgctgg
4200ttctattacc gagaatgact cttggcgttt cgttgatgat gttcctaccg gtatctacac
4260catccctgaa atttcctctg ttggtaaaac cgaaagagag ttgacacaag caaaagtccc
4320atacgaggtt ggtaaagcct ttttcaaggg catggctcgt gcacaaattg cagttgaaaa
4380agccggtatg ttaaagattc tttttcatag agagacttta gaaatcttgg gtgtccactg
4440cttcggttac caagcatctg aaattgttca tattggtcaa gcaattatga accaaaaggg
4500cgaagcaaat acattaaagt atttcatcaa cactacattc aattatccaa ctatggctga
4560agcttataga gttgcagcct acgacggttt aaacagattg ttttaattaa catctgaatg
4620taaaatgaac attaaaatga attactaaac tttacgtcta ctttacaatc tataaacttt
4680gtttaatcat ataacgaaat acactaatac acaatcctgt acgtatgtaa tacttttatc
4740catcaaggat tgagaaaaaa aagtaatgat tccctgggcc attaaaactt agacccccaa
4800gcttggatag gtcactctct attttcgttt ctcccttccc tgatagaagg gtgatatgta
4860attaagaata atatataatt ttataataaa agcggccgcc aagttagtta gagctagagt
4920taacacatac acattatcaa atgcatttat tcctaatatc acactaaaac gtattatata
4980attttaatct ttatagactt catagcacca attggatttg ctttctttca gaataccgca
5040cttaatctca atgtacgtaa cgtaggcaaa atctgtcgat aaggatctgt atgccgtaaa
5100cggaaactcc aagcgcccag aaaacttaca ttatattctt gccagtttca tctcaccagc
5160cagtcacagt ttaaaaggtt tgattgcgtt tcttgtttcg tcggattcag tgctaattgg
5220taacgcactg taccgccaca ccaaagcaaa aatgcagaaa caaacaacaa tgagtgtatg
5280tttaccaact ttggccgc
5298
29 6139 DNA Artificial Sequence Synthetic - S. cerevisiae SthA gene integration fragment
29 aattctttga aggagcttgc caagaaacat aattttatga
tttttgaaga tagaaaattt 60gctgatattg gtaacactgt taaaaatcaa tataaatctg
gtgtcttccg tattgccgaa 120tgggctgaca tcactaatgc acatggtgta acgggtgcag
gtattgtttc tggcttgaag 180gaggcagccc aagaaacaac cagtgaacct agaggtttgc
taatgcttgc tgagttatca 240tcaaagggtt ctttagcata tggtgaatat acagaaaaaa
cagtagaaat tgctaaatct 300gataaagagt ttgtcattgg ttttattgcg caacacgata
tgggcggtag agaagaaggt 360tttgactgga tcattatgac tccaggggtt ggtttagatg
acaaaggtga tgcacttggt 420caacaatata gaactgttga tgaagttgta aagactggaa
cggatatcat aattgttggt 480agaggtttgt acggtcaagg aagagatcct atagagcaag
ctaaaagata ccaacaagct 540ggttggaatg cttatttaaa cagatttaaa tgattcttac
acaaagattt gatacatgta 600cactagttta aataagcatg aaaagaatta cacaagcaaa
aaaaaaaaaa taaatgaggt 660actttacgtt cacctacaac caaaaaaact agatagagta
aaatcttaag atttagaaaa 720agttgtttaa caaaggcttt agtatgtgaa tttttaatgt
agcaaagcga taactaataa 780acataaacaa aagtatggtt ttctttatca gtcaaatcat
tatcgattga ttgttccgcg 840tatctgcaga tagcctcatg aaatcagcca tttgcttttg
ttcaacgatc ttttgaaatt 900gttgttgttc ttggtagtta agttgatcca tcttggctta
tgttgtgtgt atgttgtagt 960tattcttagt atattcctgt cctgagttta gtgaaacata
atatcgcctt gaaatgaaaa 1020tgctgaaatt cgtcgacata caatttttca aacttttttt
ttttcttggt gcacggacat 1080gtttttaaag gaagtactct ataccagtta ttcttcacaa
atttaattgc tggagaatag 1140atcttcaacg cgtttaaaca gcaatttgag gaaggaatag
gagaaggaga agcaatttct 1200aggaaagagc aaggtgtgca acagcatgct ctgaatgata
ttttcagcaa tagttcagtt 1260gaagaacctg ttggcgtatc tacatcactt cctacaaaca
acaccacgaa ttgcgtccgt 1320ggtgacgcaa ctacgaatgg cattgtcaat gccaatgcca
gtgcacatac acgtgcaagt 1380cccaccggtt ccctgcccgg ctatggtaga gacaagaagg
acgataccgg catcgacatc 1440aacagtttca acagcaatgc gtttggcgtc gacgcgtcga
tggggctgcc gtatttggat 1500ttggacgggc tagatttcga tatggatatg gatatggata
tggatatgga gatgaatttg 1560aatttagatt tgggtcttga tttggggttg gaattaaaag
gggataacaa tgagggtttt 1620cctgttgatt taaacaatgg acgtgggagg tgattgattt
aacctgatcc aaaaggggta 1680tgtctatttt ttagagtgtg tctttgtgtc aaattatggt
agaatgtgta aagtagtata 1740aactttcctc tcaaatgacg aggtttaaaa caccccccgg
gtgagccgag ccgagaatgg 1800ggcaattgtt caatgtgaaa tagaagtatc gagtgagaaa
cttgggtgtt ggccagccaa 1860gggggaagga aaatggcgcg aatgctcagg tgagattgtt
ttggaattgg gtgaagcgag 1920gaaatgagcg acccggaggt tgtgacttta gtggcggagg
aggaacggga ggaaaaggcc 1980aagagggaaa gtgtatataa gggggagcaa tttgccaacc
aggatagaat tggatgagtt 2040ataattctac tgtatttatt gtataattta tttctccttt
tatatcaaac acattacaaa 2100acacacaaaa catacaaaca tacacagcta gcatggatgg
tcctaacttc gcccatcaag 2160gcggtagatc ccaaagaact accgagttgt actcatgtgc
acgttgccgt aagcttaaga 2220agaaatgtgg taaacaaatc ccaacttgtg caaactgtga
taagaacggt gcacattgtt 2280cttatcctgg tagagctcct agacgtacca agaaggagtt
ggctgatgca atgcttagag 2340gtgaatatgt tccagttaaa cgtaacaaga aagtcggcaa
atctccttta tctactaagt 2400ctatgccaaa ctcctcttct ccattatccg ctaacggtgc
aatcactcct ggtttttctc 2460catatgaaaa cgatgacgcc cacaagatga agcagttaaa
gccatctgat ccaattaact 2520tagtcatggg tgcatctcca aattcctccg agggcgtttc
ctctttgatt tccgttttaa 2580cctcattgaa cgacaattcc aatccttctt ctcacttgtc
ctctaacgaa aattccatga 2640tcccttctcg ttcccttcca gcatccgttc aacaatcatc
tacaacttct tcctttggtg 2700gttataacac cccatcacca ttgatttcct ctcacgttcc
agccaatgca caagcagtcc 2760cattacaaaa caacaacaga aacacctcta acggtgacaa
tggttctaac gttaatcatg 2820acaacaataa tggttccacc aacacaccac aattatcctt
gactccatac gctaataact 2880ctgctcctaa cggcaaattc gattccgtcc ctgtcgatgc
ttcttccatc gagtttgaga 2940caatgtcttg ttgttttaag ggtggcagaa ctacttcttg
ggttagagaa gatggttctt 3000ttaagtctat tgacagatca ttattggaca gattcatcgc
agcttacttc aagcacaacc 3060acagattgtt cccaatgatt gataagattg cattcttgaa
tgatgctgct actattaccg 3120atttcgaaag attgtacgat aacaagaact atccagattc
ttttgttttc aaggtttaca 3180tgattatggc aattggctgt actacattac aaagagctgg
catggtttcc caagacgaag 3240aatgtttgtc tgaacatttg gctttccttg ctatgaagaa
gtttagatct gtcattatct 3300tacaagatat cgaaactgtt agatgcttgt tgttattagg
tatctactca ttctttgaac 3360caaagggttc ttcctcttgg actatttcag gcattatcat
gcgtcttact attggtttgg 3420gtttgaatag agaattgact gctaagaaat tgaagtctat
gtcagcctta gaagcagaag 3480caagatatag agttttctgg tctgcttatt gttttgaaag
attggtctgt acatccttgg 3540gtagaatttc cggtattgac gacgaagata ttactgttcc
attaccaaga gcattgtatg 3600tcgatgagag agatgatttg gaaatgacca agttaatgat
ctccttaaga aagatgggtg 3660gtagaatcta caagcaagtc cattctgttt ccgctggtag
acaaaagttg accatcgaac 3720agaagcagga gattatctct ggtttgagaa aagaacttga
cgaaatctac tccagagaat 3780ccgaaagaag aaagttaaag aagtctcaaa tggaccaagt
cgaaagagaa aacaattcaa 3840caactaatgt tatttccttt cactcatctg agatttggtt
agctatgaga tactctcaat 3900tgcaaatctt gttgtataga ccatccgccc ttatgccaaa
accacctatt gattctttgt 3960caacccttgg tgaattttgt ttacaagcat ggaaacacac
ttacacattg tacaagaaaa 4020gattgttacc attgaattgg attaccttat tcagaacttt
aactatttgt aacacaatct 4080tatactgttt atgccaatgg tccattgact taattgaatc
taagattgaa atccaacagt 4140gtgttgaaat cttgcgtcat tttggtgaaa gatggatttt
cgccatgaga tgtgctgatg 4200ttttccaaaa catttcaaat accattttag acatctccct
ttcccatggt aaagttccaa 4260acatggatca attaaccaga gagttattcg gtgcatctga
ctcctaccaa gacatcttag 4320acgaaaacaa tgttgatgtt tcttgggtcg ataagttggt
ctaaggcgcg ccatctaata 4380gtttaatcac agcttatagt ctactatagt tttctttttt
aaacattgtt gtattttgtc 4440ccccccctct aattgatgat gattatccta taagaatcca
ataaaacgat ggaaactaat 4500accctctcct ttgtcatgtg gtctttagta tttcttgaac
attggctctg atttctcgac 4560tttatagtcc tattaaaatc gctgttagtt ctcgatcgtt
gtatctcgtt tcttgtctct 4620ttggtggatg attttgcgtg cgaacatgtt tttttccctt
tctctcacca tcatcgtgta 4680gttcttgtca ccatcccccc caccccttcc ttctctcatt
gattctataa gagcttatcc 4740acagaggtgc agtaacgagg tagtttaacc ttcgagtgga
tcaaaatgtc acacaggcct 4800cgacatttgc tgcaacggca acatcaatgt ccacgtttac
acacctacat ttatatctat 4860atttatattt atatttattt atttatgcta cttagcttct
atagttagtt aatgcactca 4920cgatattcaa aattgacacc cttcaactac tccctactat
tgtctactac tgtctactac 4980tcctctttac tatagctgct cccaataggc tccaccaata
ggctctgtca atacattttg 5040cgccgccacc tttcaggttg tgtcactcct gaaggaccat
attgggtaat cgtgcaattt 5100ctggaagaga gtccgcgaga agtgaggccc ccactgtaaa
tcctcgaggg ggcatggagt 5160atggggcatg gaggatggag gatggggggg ggggggggga
aaataggtag cgaaaggacc 5220cgctatcacc ccacccggag aactcgttgc cgggaagtca
tatttcgaca ctccggggag 5280tctataaaag gcgggttttg tcttttgcca gttgatgttg
ctgagaggac ttgtttgccg 5340tttcttccga tttaacagta tagaatcaac cactgttaat
tatacacgtt atactaacac 5400aacaaaaaca aaaacaacga caacaacaac aacatctaga
taattaatta acatctgaat 5460gtaaaatgaa cattaaaatg aattactaaa ctttacgtct
actttacaat ctataaactt 5520tgtttaatca tataacgaaa tacactaata cacaatcctg
tacgtatgta atacttttat 5580ccatcaagga ttgagaaaaa aaagtaatga ttccctgggc
cattaaaact tagaccccca 5640agcttggata ggtcactctc tattttcgtt tctcccttcc
ctgatagaag ggtgatatgt 5700aattaagaat aatatataat tttataataa aagcggccgc
caagttagtt agagctagag 5760ttaacacata cacattatca aatgcattta ttcctaatat
cacactaaaa cgtattatat 5820aattttaatc tttatagact tcatagcacc aattggattt
gctttctttc agaataccgc 5880acttaatctc aatgtacgta acgtaggcaa aatctgtcga
taaggatctg tatgccgtaa 5940acggaaactc caagcgccca gaaaacttac attatattct
tgccagtttc atctcaccag 6000ccagtcacag tttaaaaggt ttgattgcgt ttcttgtttc
gtcggattca gtgctaattg 6060gtaacgcact gtaccgccac accaaagcaa aaatgcagaa
acaaacaaca atgagtgtat 6120gtttaccaac tttggccgc
6139
SEQ ID NO: 30; Length: 2232; Type: DNA; Organism: Saccharomyces cerevisiae
30 atggatggtc
ctaacttcgc ccatcaaggc ggtagatccc aaagaactac cgagttgtac 60tcatgtgcac
gttgccgtaa gcttaagaag aaatgtggta aacaaatccc aacttgtgca 120aactgtgata
agaacggtgc acattgttct tatcctggta gagctcctag acgtaccaag 180aaggagttgg
ctgatgcaat gcttagaggt gaatatgttc cagttaaacg taacaagaaa 240gtcggcaaat
ctcctttatc tactaagtct atgccaaact cctcttctcc attatccgct 300aacggtgcaa
tcactcctgg tttttctcca tatgaaaacg atgacgccca caagatgaag 360cagttaaagc
catctgatcc aattaactta gtcatgggtg catctccaaa ttcctccgag 420ggcgtttcct
ctttgatttc cgttttaacc tcattgaacg acaattccaa tccttcttct 480cacttgtcct
ctaacgaaaa ttccatgatc ccttctcgtt cccttccagc atccgttcaa 540caatcatcta
caacttcttc ctttggtggt tataacaccc catcaccatt gatttcctct 600cacgttccag
ccaatgcaca agcagtccca ttacaaaaca acaacagaaa cacctctaac 660ggtgacaatg
gttctaacgt taatcatgac aacaataatg gttccaccaa cacaccacaa 720ttatccttga
ctccatacgc taataactct gctcctaacg gcaaattcga ttccgtccct 780gtcgatgctt
cttccatcga gtttgagaca atgtcttgtt gttttaaggg tggcagaact 840acttcttggg
ttagagaaga tggttctttt aagtctattg acagatcatt attggacaga 900ttcatcgcag
cttacttcaa gcacaaccac agattgttcc caatgattga taagattgca 960ttcttgaatg
atgctgctac tattaccgat ttcgaaagat tgtacgataa caagaactat 1020ccagattctt
ttgttttcaa ggtttacatg attatggcaa ttggctgtac tacattacaa 1080agagctggca
tggtttccca agacgaagaa tgtttgtctg aacatttggc tttccttgct 1140atgaagaagt
ttagatctgt cattatctta caagatatcg aaactgttag atgcttgttg 1200ttattaggta
tctactcatt ctttgaacca aagggttctt cctcttggac tatttcaggc 1260attatcatgc
gtcttactat tggtttgggt ttgaatagag aattgactgc taagaaattg 1320aagtctatgt
cagccttaga agcagaagca agatatagag ttttctggtc tgcttattgt 1380tttgaaagat
tggtctgtac atccttgggt agaatttccg gtattgacga cgaagatatt 1440actgttccat
taccaagagc attgtatgtc gatgagagag atgatttgga aatgaccaag 1500ttaatgatct
ccttaagaaa gatgggtggt agaatctaca agcaagtcca ttctgtttcc 1560gctggtagac
aaaagttgac catcgaacag aagcaggaga ttatctctgg tttgagaaaa 1620gaacttgacg
aaatctactc cagagaatcc gaaagaagaa agttaaagaa gtctcaaatg 1680gaccaagtcg
aaagagaaaa caattcaaca actaatgtta tttcctttca ctcatctgag 1740atttggttag
ctatgagata ctctcaattg caaatcttgt tgtatagacc atccgccctt 1800atgccaaaac
cacctattga ttctttgtca acccttggtg aattttgttt acaagcatgg 1860aaacacactt
acacattgta caagaaaaga ttgttaccat tgaattggat taccttattc 1920agaactttaa
ctatttgtaa cacaatctta tactgtttat gccaatggtc cattgactta 1980attgaatcta
agattgaaat ccaacagtgt gttgaaatct tgcgtcattt tggtgaaaga 2040tggattttcg
ccatgagatg tgctgatgtt ttccaaaaca tttcaaatac cattttagac 2100atctcccttt
cccatggtaa agttccaaac atggatcaat taaccagaga gttattcggt 2160gcatctgact
cctaccaaga catcttagac gaaaacaatg ttgatgtttc ttgggtcgat 2220aagttggtct
aa
2232
SEQ ID NO: 31; Length: 506; Type: PRT; Organism: Issatchenkia orientalis
31 Met Gly Val Gln Phe Ile Glu Asn Thr
Ile Ile Val Val Phe Gly Ala1 5 10
15Ser Gly Asp Leu Ala Lys Lys Lys Thr Phe Pro Ala Leu Phe Gly
Leu 20 25 30Phe Arg Glu Gly
Gln Leu Ser Glu Thr Thr Lys Ile Ile Gly Phe Ala 35
40 45Arg Ser Lys Leu Ser Asn Asp Asp Leu Arg Asn Arg
Ile Lys Pro Tyr 50 55 60Leu Lys Leu
Asn Lys Arg Thr Asp Ala Glu Arg Gln Ser Leu Glu Lys65 70
75 80Phe Leu Gln Ile Leu Glu Tyr His
Gln Ser Asn Tyr Asp Asp Ser Glu 85 90
95Gly Phe Glu Lys Leu Glu Lys Leu Ile Asn Lys Tyr Asp Asp
Glu Ala 100 105 110Asn Val Lys
Glu Ser His Arg Leu Tyr Tyr Leu Ala Leu Pro Pro Ser 115
120 125Val Phe Thr Thr Val Ala Thr Met Leu Lys Lys
His Cys His Pro Gly 130 135 140Asp Ser
Gly Ile Ala Arg Leu Ile Val Glu Lys Pro Phe Gly His Asp145
150 155 160Leu Ser Ser Ser Arg Glu Leu
Gln Lys Ser Leu Ala Pro Leu Trp Asn 165
170 175Glu Asp Glu Leu Phe Arg Ile Asp His Tyr Leu Gly
Lys Glu Met Val 180 185 190Lys
Asn Leu Ile Pro Leu Arg Phe Ser Asn Thr Phe Leu Ser Ser Ser 195
200 205Trp Asn Asn Gln Phe Ile Asp Thr Ile
Gln Ile Thr Phe Lys Glu Asn 210 215
220Phe Gly Thr Glu Gly Arg Gly Gly Tyr Phe Asp Ser Ile Gly Ile Ile225
230 235 240Arg Asp Val Ile
Gln Asn His Leu Leu Gln Val Leu Thr Ile Val Leu 245
250 255Met Glu Lys Pro Ala Asp Phe Asn Gly Glu
Ser Ile Arg Asp Glu Lys 260 265
270Val Lys Val Leu Lys Ala Ile Glu Gln Ile Asp Phe Asn Asn Val Leu
275 280 285Val Gly Gln Tyr Asp Lys Ser
Glu Asp Gly Ser Lys Pro Gly Tyr Leu 290 295
300Asp Asp Asp Thr Val Asn Pro Asp Ser Lys Ala Val Thr Tyr Ala
Ala305 310 315 320Leu Val
Leu Asn Val Ala Asn Glu Arg Trp Asn Asn Val Pro Ile Ile
325 330 335Leu Lys Ala Gly Lys Ala Leu
Asn Gln Ser Lys Val Glu Ile Arg Ile 340 345
350Gln Phe Lys Pro Val Glu Asn Gly Ile Phe Lys Asn Ser Ala
Arg Asn 355 360 365Glu Leu Val Ile
Arg Ile Gln Pro Asn Glu Ala Met Tyr Leu Lys Met 370
375 380Asn Ile Lys Val Pro Gly Val Ser Asn Gln Val Ser
Ile Ser Glu Met385 390 395
400Asp Leu Thr Tyr Lys Asn Arg Tyr Ser Ser Glu Phe Tyr Ile Pro Glu
405 410 415Ala Tyr Glu Ser Leu
Ile Lys Asp Ala Leu Met Asp Asp His Ser Asn 420
425 430Phe Val Arg Asp Asp Glu Leu Asp Ile Ser Trp Ala
Leu Phe Thr Pro 435 440 445Leu Leu
Glu His Ile Glu Gly Pro Asp Gly Pro Thr Pro Thr Lys Tyr 450
455 460Pro Tyr Gly Ser Arg Gly Pro Lys Glu Ile Asp
Glu Phe Leu Arg Asn465 470 475
480His Gly Tyr Val Lys Glu Pro Arg Glu Asn Tyr Gln Trp Pro Leu Thr
485 490 495Thr Pro Lys Glu
Leu Asn Ser Ser Lys Phe 500
505
SEQ ID NO: 32; Length: 3435; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Mutated L. mexicana FRD gene
32 atggctgatg gcaaaacctc tgcatcagtt gttgctgttg atgctgaacg tgccgctaag
60gaaagagatg cagcagctag agctatgttg caaggtggtg gtgtctctcc tgctggcaag
120gcacaattgt tgaaaaaggg tttggttcac actgttccat ataccttaaa ggttgtcgtc
180gcagatccaa aggaaatgga gaaggcaact gctgacgcag aagaggtttt acaagctgca
240tttcaagtcg tcgacaccct tttgaacaac tttaacgaaa actcagaagt ttcaagagtc
300aataggttgg cagttggtga ggaacatcaa atgtctgaaa cattgaaaca cgtcatggcc
360tgttgtcaaa aggtttatca ttcctccaga ggtgtttttg acccagcagt tggtccatta
420gtccgtgaac ttagagaagc tgctcacaag ggtaaaactg ttccagccga aagagttaat
480gatttgttat ccaaatgtac ccttaatgca tctttttcaa ttgatatgtc cagaggtatg
540attgcaagga agcatccaga cgccatgttg gatttgggtg gtgtcaacaa gggttatggt
600atcgactaca ttgttgaaca cttaaactct ttgggttatg atgatgtctt tttcggacgg
660ggtggtgatg ttagagcatc cggcaaaaac cagttatctc aaccttgggc tgttggtatt
720gttagaccac ctgccttggc cgacattaga actgttgtcc cagaggacaa aagatccttt
780atccgtgtcg tcagattgaa caacgaagct attgctacct ctggtgatta tgagaatttg
840gttgaaggtc ctggttctaa ggtttactct tccaccttca atccaacttc caaaaacttg
900ttggaaccta ccgaagcagg tatggctcaa gtttctgtca agtgttgctc atgtatctac
960gctgatgctt tagcaacagc agctttgttg aaaaacgatc ctgctgccgt tagaaggatc
1020ttagataact ggagatatgt cagagatact gttactgact acaccactta cacaagggaa
1080ggtgaaagag ttgctaagat gttggaaatt gctaccgaag atgctgaaat gagagcaaag
1140agaatcaagg gctctttacc agcaagagtt atcattgttg gtggtggttt ggccggttgt
1200tccgcagcta tcgaagcagc taactgtggc gcccacgtca tcttgttaga aaaggaacca
1260aagttaggtg gtaactctgc aaaggctacc tccggtatca acgcctgggg tactagagca
1320caagcaaaac aaggtgtcat ggacggcggc aagtttttcg aaagagatac ccatagatcc
1380ggcaagggtg gtaattgcga tccatgcctt gttaagactt tgtccgttaa gtcctctgat
1440gcagttaagt ggttatctga attaggtgtt ccattgactg ttttgtctca attaggtggt
1500gcttcaagga aacgttgtca ccgtgcacca gataagtctg atggtacacc agtcccagtt
1560ggtttcacca ttatgaaaac ccttgaaaac cacattgtca acgatttgtc cagacatgtt
1620acagttatga caggtattac cgtcacagct ttagaatcta catcaagagt cagacctgat
1680ggtgttttag tcaagcatgt tactggtgtt cacttgattc aggcatctgg tcaatctatg
1740gttttgaatg cagacgctgt tatcttagct actggtggtt tctccaatga tcatacccca
1800aactcccttt tacaacaata cgccccacag ttgtcatctt ttccaacaac caatggtgtc
1860tgggcaactg gcgatggtgt taagatggct tccaagttgg gtgtcgcctt agttgatatg
1920gataaggtcc aattacatcc taccggcttg ttagacccaa aagatccatc taatagaacc
1980aagtatcttg gtccagaggc cttaagaggt tccggcggtg tcttgttaaa caaaaacggt
2040gaaagatttg ttaatgaatt agacttaaga tctgttgtct ctcaagctat catcgcacaa
2100gataatgagt acccaggctc tggtggttcc aagttcgcat actgtgtttt gaacgaaact
2160gcagcaaagt tattcggcaa aaacttcctt ggtttctact ggaatagatt aggtcttttc
2220caaaaggttg attccgttgc tggtttagct aagttgattg gttgtccaga agctaatgtt
2280gttgctacat tgaagcaata tgaggagtta tcttccaaaa agcttaatcc ttgtccattg
2340actggcaagt ctgtctttcc ttgtgtttta ggcactcaag gtccatacta tgttgccttg
2400gttaccccat ccattcacta cactatgggt ggttgtttga tttccccatc tgctgagatg
2460caaaccattg acaactctgg tgttactcct gtcagacgtc caatcttagg cttattcggt
2520gctggtgaag ttactggcgg tgtccatggt ggtaacagat taggcggtaa ctctttgtta
2580gaatgtgttg ttttcggcaa gatcgctggt gacagagctg caaccatctt gcaaaagaaa
2640aacaccggct tatcaatgac agaatggtct actgtcgtct taagagaagt tagagaaggt
2700ggtgtctatg gtgctggttc cagagttttg aggtttaaca tgcctggtgc attacagaga
2760actggtttag ctttaggtca attcatcggt atcagaggtg attgggacgg tcacagattg
2820atcggttact attctccaat cactttacct gatgatgttg gtgttattgg tatcttagct
2880agagcagaca agggtagatt ggcagaatgg atttctgcat tgcagccagg tgacgctgtt
2940gagatgaagg cctgcggtgg tcttatcatt gacagaagat tcgctgaaag acatttcttt
3000ttccgtggtc ataagatcag aaagttggcc cttatcggtg gtggtactgg tgttgcacca
3060atgttacaaa tcgtcagagc tgctgtcaaa aagccatttg tcgattcaat tgagtccatt
3120cagttcatct atgctgcaga ggatgtttcc gagcttacat acagaacctt acttgaatct
3180tacgaagagg aatatggttc agaaaagttt aagtgtcact tcgttttgaa taacccacca
3240gctcaatgga ctgacggtgt tggtttcgtt gatactgcat tgttgagatc cgcagttcaa
3300gcaccatcaa atgatttgct tgttgcaatt tgtggtccac caatcatgca aagagcagtt
3360aagggtgcat tgaaaggttt aggttacaat atgaatcttg ttagaaccgt tgacgaaact
3420gaaccaccat cataa
3435
SEQ ID NO: 33; Length: 3444; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Mutated L. mexicana FRD gene
33 atggctgatg gcaaaacctc tgcatcagtt gttgctgttg atgctgaacg tgccgctaag
60gaaagagatg cagcagctag agctatgttg caaggtggtg gtgtctctcc tgctggcaag
120gcacaattgt tgaaaaaggg tttggttcac actgttccat ataccttaaa ggttgtcgtc
180gcagatccaa aggaaatgga gaaggcaact gctgacgcag aagaggtttt acaagctgca
240tttcaagtcg tcgacaccct tttgaacaac tttaacgaaa actcagaagt ttcaagagtc
300aataggttgg cagttggtga ggaacatcaa atgtctgaaa cattgaaaca cgtcatggcc
360tgttgtcaaa aggtttatca ttcctccaga ggtgtttttg acccagcagt tggtccatta
420gtccgtgaac ttagagaagc tgctcacaag ggtaaaactg ttccagccga aagagttaat
480gatttgttat ccaaatgtac ccttaatgca tctttttcaa ttgatatgtc cagaggtatg
540attgcaagga agcatccaga cgccatgttg gatttgggtg gtgtcaacaa gggttatggt
600atcgactaca ttgttgaaca cttaaactct ttgggttatg atgatgtctt tttcgaatgg
660ggtggtgatg ttagagcatc cggcaaaaac cagttatctc aaccttgggc tgttggtatt
720gttagaccac ctgccttggc cgacattaga actgttgtcc cagaggacaa aagatccttt
780atccgtgtcg tcagattgaa caacgaagct attgctacct ctggtgatta tgagaatttg
840gttgaaggtc ctggttctaa ggtttactct tccaccttca atccaacttc caaaaacttg
900ttggaaccta ccgaagcagg tatggctcaa gtttctgtca agtgttgctc atgtatctac
960gctgatgctt tagcaacagc agctttgttg aaaaacgatc ctgctgccgt tagaaggatc
1020ttagataact ggagatatgt cagagatact gttactgact acaccactta cacaagggaa
1080ggtgaaagag ttgctaagat gttggaaatt gctaccgaag atgctgaaat gagagcaaag
1140agaatcaagg gctctttacc agcaagagtt atcattgttg gtggtggttt ggccggttgt
1200tccgcagcta tcgaagcagc taactgtggc gcccacgtca tcttgttagg taaggaacca
1260aagttaggtg gtaactctgc aaaggctacc tccggtatca acgcctgggg tactagagca
1320caagcaaaac aaggtgtcat ggacggcggc aagtttttcg aaagagatac ccatagatcc
1380ggcaagggtg gtaattgcga tccatgcctt gttaagactt tgtccgttaa gtcctctgat
1440gcagttaagt ggttatctga attaggtgtt ccattgactg ttttgtctca attaggtggt
1500gcttcaagga aacgttgtca ccgtgcacca gataagtctg atggtacacc agtcccagtt
1560ggtttcacca ttatgaaaac ccttgaaaac cacattgtca acgatttgtc cagacatgtt
1620acagttatga caggtattac cgtcacagct ttagaatcta catcaagagt cagacctgat
1680ggtgttttag tcaagcatgt tactggtgtt cacttgattc aggcatctgg tcaatctatg
1740gttttgaatg cagacgctgt tatcttagct actggtggtt tctccaatga tcatacccca
1800aactcccttt tacaacaata cgccccacag ttgtcatctt ttccaacaac caatggtgtc
1860tgggcaactg gcgatggtgt taagatggct tccaagttgg gtgtcgcctt agttgatatg
1920gataaggtcc aattacatcc taccggcttg ttagacccaa aagatccatc taatagaacc
1980aagtatcttg gtccagaggc cttaagaggt tccggcggtg tcttgttaaa caaaaacggt
2040gaaagatttg ttaatgaatt agacttaaga tctgttgtct ctcaagctat catcgcacaa
2100gataatgagt acccaggctc tggtggttcc aagttcgcat actgtgtttt gaacgaaact
2160gcagcaaagt tattcggcaa aaacttcctt ggtttctact ggaatagatt aggtcttttc
2220caaaaggttg attccgttgc tggtttagct aagttgattg gttgtccaga agctaatgtt
2280gttgctacat tgaagcaata tgaggagtta tcttccaaaa agcttaatcc ttgtccattg
2340actggcaagt ctgtctttcc ttgtgtttta ggcactcaag gtccatacta tgttgccttg
2400gttaccccat ccattcacta cactatgggt ggttgtttga tttccccatc tgctgagatg
2460caaaccattg acaactctgg tgttactcct gtcagacgtc caatcttagg cttattcggt
2520gctggtgaag ttactggcgg tgtccatggt ggtaacagat taggcggtaa ctctttgtta
2580gaatgtgttg ttttcggcaa gatcgctggt gacagagctg caaccatctt gcaaaagaaa
2640aacaccggct tatcaatgac agaatggtct actgtcgtct taagagaagt tagagaaggt
2700ggtgtctatg gtgctggttc cagagttttg aggtttaaca tgcctggtgc attacagaga
2760actggtttag ctttaggtca attcatcggt atcagaggtg attgggacgg tcacagattg
2820atcggttact attctccaat cactttacct gatgatgttg gtgttattgg tatcttagct
2880agagcagaca agggtagatt ggcagaatgg atttctgcat tgcagccagg tgacgctgtt
2940gagatgaagg cctgcggtgg tcttatcatt gacagaagat tcgctgaaag acatttcttt
3000ttccgtggtc ataagatcag aaagttggcc cttatcggtg gtggtactgg tgttgcacca
3060atgttacaaa tcgtcagagc tgctgtcaaa aagccatttg tcgattcaat tgagtccatt
3120cagttcatct atgctgcaga ggatgtttcc gagcttacat acagaacctt acttgaatct
3180tacgaagagg aatatggttc agaaaagttt aagtgtcact tcgttttgaa taacccacca
3240gctcaatgga ctgacggtgt tggtttcgtt gatactgcat tgttgagatc cgcagttcaa
3300gcaccatcaa atgatttgct tgttgcaatt tgtggtccac caatcatgca aagagcagtt
3360aagggtgcat tgaaaggttt aggttacaat atgaatcttg ttagaaccgt tgacgaaact
3420gaaccaccat cataattaat taac
3444
SEQ ID NO: 34; Length: 3444; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Mutated L. mexicana FRD gene
34 atggctgatg gcaaaacctc tgcatcagtt gttgctgttg atgctgaacg tgccgctaag
60gaaagagatg cagcagctag agctatgttg caaggtggtg gtgtctctcc tgctggcaag
120gcacaattgt tgaaaaaggg tttggttcac actgttccat ataccttaaa ggttgtcgtc
180gcagatccaa aggaaatgga gaaggcaact gctgacgcag aagaggtttt acaagctgca
240tttcaagtcg tcgacaccct tttgaacaac tttaacgaaa actcagaagt ttcaagagtc
300aataggttgg cagttggtga ggaacatcaa atgtctgaaa cattgaaaca cgtcatggcc
360tgttgtcaaa aggtttatca ttcctccaga ggtgtttttg acccagcagt tggtccatta
420gtccgtgaac ttagagaagc tgctcacaag ggtaaaactg ttccagccga aagagttaat
480gatttgttat ccaaatgtac ccttaatgca tctttttcaa ttgatatgtc cagaggtatg
540attgcaagga agcatccaga cgccatgttg gatttgggtg gtgtcaacaa gggttatggt
600atcgactaca ttgttgaaca cttaaactct ttgggttatg atgatgtctt tttcgaatgg
660ggtggtgatg ttagagcatc cggcaaaaac cagttatctc aaccttgggc tgttggtatt
720gttagaccac ctgccttggc cgacattaga actgttgtcc cagaggacaa aagatccttt
780atccgtgtcg tcagattgaa caacgaagct attgctacct ctggtgatta tgagaatttg
840gttgaaggtc ctggttctaa ggtttactct tccaccttca atccaacttc caaaaacttg
900ttggaaccta ccgaagcagg tatggctcaa gtttctgtca agtgttgctc atgtatctac
960gctgatgctt tagcaacagc agctttgttg aaaaacgatc ctgctgccgt tagaaggatc
1020ttagataact ggagatatgt cagagatact gttactgact acaccactta cacaagggaa
1080ggtgaaagag ttgctaagat gttggaaatt gctaccgaag atgctgaaat gagagcaaag
1140agaatcaagg gctctttacc agcaagagtt atcattgttg gtggtggttt ggccggttgt
1200tccgcagcta tcgaagcagc taactgtggc gcccacgtca tcttgttaga aaaggaacca
1260aagttaggtg gtaactctgc aaaggctacc tccggtatca acgcctgggg tactagagca
1320caagcaaaac aaggtgtcat ggacggcggc aagtttttcg aaagagatac ccatagatcc
1380ggcaagggtg gtaattgcga tccatgcctt gttaagactt tgtccgttaa gtcctctgat
1440gcagttaagt ggttatctga attaggtgtt ccattgactg ttttgtctca attaggtggt
1500gcttcaagga aacgttgtca ccgtgcacca gataagtctg atggtacacc agtcccagtt
1560ggtttcacca ttatgaaaac ccttgaaaac cacattgtca acgatttgtc cagacatgtt
1620acagttatga caggtattac cgtcacagct ttagaatcta catcaagagt cagacctgat
1680ggtgttttag tcaagcatgt tactggtgtt cacttgattc aggcatctgg tcaatctatg
1740gttttgaatg cagacgctgt tatcttagct actggtggtt tctccaatga tcatacccca
1800aactcccttt tacaacaata cgccccacag ttgtcatctt ttccaacaac caatggtgtc
1860tgggcaactg gcgatggtgt taagatggct tccaagttgg gtgtcgcctt agttgatatg
1920ggtaaggtcc aattacatcc taccggcttg ttagacccaa aagatccatc taatagaacc
1980aagtatcttg gtccagaggc cttaagaggt tccggcggtg tcttgttaaa caaaaacggt
2040gaaagatttg ttaatgaatt agacttaaga tctgttgtct ctcaagctat catcgcacaa
2100gataatgagt acccaggctc tggtggttcc aagttcgcat actgtgtttt gaacgaaact
2160gcagcaaagt tattcggcaa aaacttcctt ggtttctact ggaatagatt aggtcttttc
2220caaaaggttg attccgttgc tggtttagct aagttgattg gttgtccaga agctaatgtt
2280gttgctacat tgaagcaata tgaggagtta tcttccaaaa agcttaatcc ttgtccattg
2340actggcaagt ctgtctttcc ttgtgtttta ggcactcaag gtccatacta tgttgccttg
2400gttaccccat ccattcacta cactatgggt ggttgtttga tttccccatc tgctgagatg
2460caaaccattg acaactctgg tgttactcct gtcagacgtc caatcttagg cttattcggt
2520gctggtgaag ttactggcgg tgtccatggt ggtaacagat taggcggtaa ctctttgtta
2580gaatgtgttg ttttcggcaa gatcgctggt gacagagctg caaccatctt gcaaaagaaa
2640aacaccggct tatcaatgac agaatggtct actgtcgtct taagagaagt tagagaaggt
2700ggtgtctatg gtgctggttc cagagttttg aggtttaaca tgcctggtgc attacagaga
2760actggtttag ctttaggtca attcatcggt atcagaggtg attgggacgg tcacagattg
2820atcggttact attctccaat cactttacct gatgatgttg gtgttattgg tatcttagct
2880agagcagaca agggtagatt ggcagaatgg atttctgcat tgcagccagg tgacgctgtt
2940gagatgaagg cctgcggtgg tcttatcatt gacagaagat tcgctgaaag acatttcttt
3000ttccgtggtc ataagatcag aaagttggcc cttatcggtg gtggtactgg tgttgcacca
3060atgttacaaa tcgtcagagc tgctgtcaaa aagccatttg tcgattcaat tgagtccatt
3120cagttcatct atgctgcaga ggatgtttcc gagcttacat acagaacctt acttgaatct
3180tacgaagagg aatatggttc agaaaagttt aagtgtcact tcgttttgaa taacccacca
3240gctcaatgga ctgacggtgt tggtttcgtt gatactgcat tgttgagatc cgcagttcaa
3300gcaccatcaa atgatttgct tgttgcaatt tgtggtccac caatcatgca aagagcagtt
3360aagggtgcat tgaaaggttt aggttacaat atgaatcttg ttagaaccgt tgacgaaact
3420gaaccaccat cataattaat taac
3444
SEQ ID NO: 35; Length: 3435; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Mutated L. mexicana FRD gene
35 atggctgatg gcaaaacctc tgcatcagtt gttgctgttg atgctgaacg tgccgctaag
60gaaagagatg cagcagctag agctatgttg caaggtggtg gtgtctctcc tgctggcaag
120gcacaattgt tgaaaaaggg tttggttcac actgttccat ataccttaaa ggttgtcgtc
180gcagatccaa aggaaatgga gaaggcaact gctgacgcag aagaggtttt acaagctgca
240tttcaagtcg tcgacaccct tttgaacaac tttaacgaaa actcagaagt ttcaagagtc
300aataggttgg cagttggtga ggaacatcaa atgtctgaaa cattgaaaca cgtcatggcc
360tgttgtcaaa aggtttatca ttcctccaga ggtgtttttg acccagcagt tggtccatta
420gtccgtgaac ttagagaagc tgctcacaag ggtaaaactg ttccagccga aagagttaat
480gatttgttat ccaaatgtac ccttaatgca tctttttcaa ttgatatgtc cagaggtatg
540attgcaagga agcatccaga cgccatgttg gatttgggtg gtgtcaacaa gggttatggt
600atcgactaca ttgttgaaca cttaaactct ttgggttatg atgatgtctt tttcgaatgg
660ggtggtgatg ttagagcatc cggcaaaaac cagttatctc aaccttgggc tgttggtatt
720gttagaccac ctgccttggc cgacattaga actgttgtcc cagaggacaa aagatccttt
780atccgtgtcg tcagattgaa caacgaagct attgctacct ctggtgatta tgagaatttg
840gttgaaggtc ctggttctaa ggtttactct tccaccttca atccaacttc caaaaacttg
900ttggaaccta ccgaagcagg tatggctcaa gtttctgtca agtgttgctc atgtatctac
960gctgatgctt tagcaacagc agctttgttg aaaaacgatc ctgctgccgt tagaaggatc
1020ttagataact ggagatatgt cagagatact gttactgact acaccactta cacaagggaa
1080ggtgaaagag ttgctaagat gttggaaatt gctaccgaag atgctgaaat gagagcaaag
1140agaatcaagg gctctttacc agcaagagtt atcattgttg gtggtggttt ggccggttgt
1200tccgcagcta tcgaagcagc taactgtggc gcccacgtca tcttgttaga aaaggaacca
1260aagttaggtg gtaactctgc aaaggctacc tccggtatca acgcctgggg tactagagca
1320caagcaaaac aaggtgtcat ggacggcggc aagtttttcg aaagagatac ccatagatcc
1380ggcaagggtg gtaattgcga tccatgcctt gttaagactt tgtccgttaa gtcctctgat
1440gcagttaagt ggttatctga attaggtgtt ccattgactg ttttgtctca attaggtggt
1500gcttcaagga aacgttgtca ccgtgcacca gataagtctg atggtacacc agtcccagtt
1560ggtttcacca ttatgaaaac ccttgaaaac cacattgtca acgatttgtc cagacatgtt
1620acagttatga caggtattac cgtcacagct ttagaatcta catcaagagt cagacctgat
1680ggtgttttag tcaagcatgt tactggtgtt cacttgattc aggcatctgg tcaatctatg
1740gttttgaatg cagacgctgt tatcttagct actggtggtt tctccaatga tcatacccca
1800aactcccttt tacaacaata cgccccacag ttgtcatctt ttccaacaac caatggtgtc
1860tgggcaactg gcgatggtgt taagatggct tccaagttgg gtgtcgcctt agttgatatg
1920gataaggtcc aattacatcc taccggcttg ttagacccaa aagatccatc taatagaacc
1980aagtatcttg gtccagaggc cttaagaggt tccggcggtg tcttgttaaa caaaaacggt
2040gaaagatttg ttaatgaatt agacttaaga tctgttgtct ctcaagctat catcgcacaa
2100gataatgagt acccaggctc tggtggttcc aagttcgcat actgtgtttt gaacgaaact
2160gcagcaaagt tattcggcaa aaacttcctt ggtttctact ggaatagatt aggtcttttc
2220caaaaggttg attccgttgc tggtttagct aagttgattg gttgtccaga agctaatgtt
2280gttgctacat tgaagcaata tgaggagtta tcttccaaaa agcttaatcc ttgtccattg
2340actggcaagt ctgtctttcc ttgtgtttta ggcactcaag gtccatacta tgttgccttg
2400gttaccccat ccattcacta cactatgggt ggttgtttga tttccccatc tgctgagatg
2460caaaccattg acaactctgg tgttactcct gtcagacgtc caatcttagg cttattcggt
2520gctggtgaag ttactggcgg tgtccatggt ggtaacagat taggcggtaa ctctttgtta
2580ggacgtgttg ttttcggcaa gatcgctggt gacagagctg caaccatctt gcaaaagaaa
2640aacaccggct tatcaatgac agaatggtct actgtcgtct taagagaagt tagagaaggt
2700ggtgtctatg gtgctggttc cagagttttg aggtttaaca tgcctggtgc attacagaga
2760actggtttag ctttaggtca attcatcggt atcagaggtg attgggacgg tcacagattg
2820atcggttact attctccaat cactttacct gatgatgttg gtgttattgg tatcttagct
2880agagcagaca agggtagatt ggcagaatgg atttctgcat tgcagccagg tgacgctgtt
2940gagatgaagg cctgcggtgg tcttatcatt gacagaagat tcgctgaaag acatttcttt
3000ttccgtggtc ataagatcag aaagttggcc cttatcggtg gtggtactgg tgttgcacca
3060atgttacaaa tcgtcagagc tgctgtcaaa aagccatttg tcgattcaat tgagtccatt
3120cagttcatct atgctgcaga ggatgtttcc gagcttacat acagaacctt acttgaatct
3180tacgaagagg aatatggttc agaaaagttt aagtgtcact tcgttttgaa taacccacca
3240gctcaatgga ctgacggtgt tggtttcgtt gatactgcat tgttgagatc cgcagttcaa
3300gcaccatcaa atgatttgct tgttgcaatt tgtggtccac caatcatgca aagagcagtt
3360aagggtgcat tgaaaggttt aggttacaat atgaatcttg ttagaaccgt tgacgaaact
3420gaaccaccat cataa
3435
SEQ ID NO: 36; Length: 3435; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Mutated L. mexicana FRD gene
36 atggctgatg gcaaaacctc tgcatcagtt gttgctgttg atgctgaacg tgccgctaag
60gaaagagatg cagcagctag agctatgttg caaggtggtg gtgtctctcc tgctggcaag
120gcacaattgt tgaaaaaggg tttggttcac actgttccat ataccttaaa ggttgtcgtc
180gcagatccaa aggaaatgga gaaggcaact gctgacgcag aagaggtttt acaagctgca
240tttcaagtcg tcgacaccct tttgaacaac tttaacgaaa actcagaagt ttcaagagtc
300aataggttgg cagttggtga ggaacatcaa atgtctgaaa cattgaaaca cgtcatggcc
360tgttgtcaaa aggtttatca ttcctccaga ggtgtttttg acccagcagt tggtccatta
420gtccgtgaac ttagagaagc tgctcacaag ggtaaaactg ttccagccga aagagttaat
480gatttgttat ccaaatgtac ccttaatgca tctttttcaa ttgatatgtc cagaggtatg
540attgcaagga agcatccaga cgccatgttg gatttgggtg gtgtcaacaa gggttatggt
600atcgactaca ttgttgaaca cttaaactct ttgggttatg atgatgtctt tttcgaatgg
660ggtggtgatg ttagagcatc cggcaaaaac cagttatctc aaccttgggc tgttggtatt
720gttagaccac ctgccttggc cgacattaga actgttgtcc cagaggacaa aagatccttt
780atccgtgtcg tcagattgaa caacgaagct attgctacct ctggtgatta tgagaatttg
840gttgaaggtc ctggttctaa ggtttactct tccaccttca atccaacttc caaaaacttg
900ttggaaccta ccgaagcagg tatggctcaa gtttctgtca agtgttgctc atgtatctac
960gctgatgctt tagcaacagc agctttgttg aaaaacgatc ctgctgccgt tagaaggatc
1020ttagataact ggagatatgt cagagatact gttactgact acaccactta cacaagggaa
1080ggtgaaagag ttgctaagat gttggaaatt gctaccgaag atgctgaaat gagagcaaag
1140agaatcaagg gctctttacc agcaagagtt atcattgttg gtggtggttt ggccggttgt
1200tccgcagcta tcgaagcagc taactgtggc gcccacgtca tcttgttaga aaaggaacca
1260aagttaggtg gtaactctgc aaaggctacc tccggtatca acgcctgggg tactagagca
1320caagcaaaac aaggtgtcat ggacggcggc aagtttttcg aaagagatac ccatagatcc
1380ggcaagggtg gtaattgcga tccatgcctt gttaagactt tgtccgttaa gtcctctgat
1440gcagttaagt ggttatctga attaggtgtt ccattgactg ttttgtctca attaggtggt
1500gcttcaagga aacgttgtca ccgtgcacca gataagtctg atggtacacc agtcccagtt
1560ggtttcacca ttatgaaaac ccttgaaaac cacattgtca acgatttgtc cagacatgtt
1620acagttatga caggtattac cgtcacagct ttagaatcta catcaagagt cagacctgat
1680ggtgttttag tcaagcatgt tactggtgtt cacttgattc aggcatctgg tcaatctatg
1740gttttgaatg cagacgctgt tatcttagct actggtggtt tctccaatga tcatacccca
1800aactcccttt tacaacaata cgccccacag ttgtcatctt ttccaacaac caatggtgtc
1860tgggcaactg gcgatggtgt taagatggct tccaagttgg gtgtcgcctt agttgatatg
1920gataaggtcc aattacatcc taccggcttg ttagacccaa aagatccatc taatagaacc
1980aagtatcttg gtccagaggc cttaagaggt tccggcggtg tcttgttaaa caaaaacggt
2040gaaagatttg ttaatgaatt agacttaaga tctgttgtct ctcaagctat catcgcacaa
2100gataatgagt acccaggctc tggtggttcc aagttcgcat actgtgtttt gaacgaaact
2160gcagcaaagt tattcggcaa aaacttcctt ggtttctact ggaatagatt aggtcttttc
2220caaaaggttg attccgttgc tggtttagct aagttgattg gttgtccaga agctaatgtt
2280gttgctacat tgaagcaata tgaggagtta tcttccaaaa agcttaatcc ttgtccattg
2340actggcaagt ctgtctttcc ttgtgtttta ggcactcaag gtccatacta tgttgccttg
2400gttaccccat ccattcacta cactatgggt ggttgtttga tttccccatc tgctgagatg
2460caaaccattg acaactctgg tgttactcct gtcagacgtc caatcttagg cttattcggt
2520gctggtgaag ttactggcgg tgtccatggt ggtaacagat taggcggtaa ctctttgtta
2580gaatgtgttg ttttcggcaa gatcgctggt gacagagctg caaccatctt gcaaaagaaa
2640aacaccggct tatcaatgac agaatggtct actgtcgtct taagagaagt tagagaaggt
2700ggtgtctatg gtgctggttc cagagttttg aggtttaaca tgcctggtgc attacagaga
2760actggtttag ctttaggtca attcatcggt atcagaggtg attgggacgg tcacagattg
2820atcggttact attctccaat cactttacct gatgatgttg gtgttattgg tatcttagct
2880agagcagaca agggtagatt ggcagaatgg atttctgcat tgcagccagg tgacgctgtt
2940gagatgaagg cctgcggtgg tcttatcatt gacagaagat tcgctgaaag acatttcttt
3000ttccgtggtc ataagatcag aaagttggcc cttatcggtg gtggtactgg tgttgcacca
3060atgttacaaa tcgtcagagc tgctgtcaaa aagccatttg tcggtcgaat tgagtccatt
3120cagttcatct atgctgcaga ggatgtttcc gagcttacat acagaacctt acttgaatct
3180tacgaagagg aatatggttc agaaaagttt aagtgtcact tcgttttgaa taacccacca
3240gctcaatgga ctgacggtgt tggtttcgtt gatactgcat tgttgagatc cgcagttcaa
3300gcaccatcaa atgatttgct tgttgcaatt tgtggtccac caatcatgca aagagcagtt
3360aagggtgcat tgaaaggttt aggttacaat atgaatcttg ttagaaccgt tgacgaaact
3420gaaccaccat cataa
3435
SEQ ID NO: 37; Length: 3420; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Mutated T. brucei FRD gene
37 atggttgatg gtagatcttc agcttctatt gttgcagttg atccagaaag agcagcaaga
60gaaagagatg ctgcagctag agctttgtta caagattctc cattgcacac taccatgcaa
120tatgctacct ccggtttaga attgaccgtc ccttatgcat tgaaagttgt tgcatctgcc
180gacaccttcg atagagctaa ggaagttgca gatgaagtcc ttagatgtgc ctggcaattg
240gctgatacag tccttaactc ctttaaccca aactctgaag tctctcttgt tggtagactt
300ccagtcggtc agaagcatca aatgtccgcc ccacttaaga gagttatggc ttgttgtcaa
360agagtttaca attcctctgc tggttgtttc gacccatcca ccgccccagt tgcaaaggct
420ttgcgtgaaa tcgctttagg caaggagaga aacaatgcct gtttggaggc tttaacacaa
480gcatgcactt tgccaaactc tttcgtcatt gactttgaag caggtactat ctcacgtaaa
540catgaacatg cttcacttga cttaggtggt gtttcaaagg gttacatcgt tgactatgtt
600attgataaca ttaacgcagc tggtttccaa aatgtctttt tcgattgggg tggtgattgt
660agagcctccg gtatgaatgc tagaaatacc ccttgggttg ttggtattac tagaccacca
720tcattagata tgttaccaaa cccaccaaag gaagcatcct atatctctgt tatctcattg
780gacaacgaag ctttggcaac ctccggtgat tacgagaatt tgatctacac agctgatgac
840aagcctttaa cttgtactta cgattggaag ggcaaggaac ttatgaagcc atctcaatca
900aacattgccc aagtttcagt taagtgctat tcagcaatgt acgctgacgc tttagccacc
960gcttgtttca tcaaaagaga tccagccaag gttagacaat tgttagatgg ttggagatac
1020gttagagata ctgtcagaga ttacagagtt tatgttagag aaaatgagag agtcgctaag
1080atgtttgaaa ttgcaaccga agatgctgaa atgagaaaaa gacgtatctc taatactttg
1140cctgcaagag tcatcgttgt cggtggcggt ttagcaggtt tatctgcagc aattgaagct
1200gcaggctgcg gtgcacaagt cgttttgatg ggaaaggaag ctaagttagg tggtaactct
1260gcaaaggcaa cctctggtat caatggttgg ggtactagag cccaagcaaa ggcttccatt
1320gttgacggtg gcaagtattt cgaaagagat acttacaaat ctggtattgg tggtaatacc
1380gacccagctt tagttaagac tctttccatg aagtctgctg acgctattgg ttggttaaca
1440tcattaggtg ttcctttaac agtcttatca caattgggtg gtcattccag aaagagaact
1500cacagagcac cagacaaaaa ggatggcacc ccattaccta ttggttttac cattatgaaa
1560accttagaag atcacgtcag aggtaatctt tctggtagaa ttactatcat ggaaaactgt
1620tccgttacct ctttactttc tgaaactaag gaaagaccag atggtactaa acaaatcaga
1680gttaccggtg ttgagttcac tcaagcaggc tctggcaaaa ctaccatttt ggccgacgca
1740gtcatcttgg ccactggtgg tttctctaac gacaagaccg cagactcttt gttgagagaa
1800catgcccctc acttagttaa ctttcctaca actaacggtc cttgggcaac tggtgacggt
1860gttaagcttg ctcaaagatt aggtgcacaa ttggtcgaca tggataaggt tcaattgcat
1920ccaactggtt tgattaaccc aaaagatcca gctaatccaa caaagttttt gggtccagaa
1980gctttaagag gttccggtgg tgtcttgtta aacaaacagg gtaaaagatt tgttaacgaa
2040ttagatttgc gttctgttgt ttccaaggcc attatggaac aaggtgctga atacccaggc
2100tctggtggtt ctatgttcgc atattgtgtc cttaatgcag ctgcacaaaa gttgtttggt
2160gtctcttccc acgagttcta ctggaaaaag atgggtttgt tcgttaaggc tgatactatg
2220agagatttgg cagcattgat tggttgtcca gtcgagtctg ttcaacaaac tttagaggaa
2280tatgaaagat tatctatttc tcagagatcc tgtccaatca ctagaaaatc tgtttaccca
2340tgtgttttgg gcactaaggg tccatactac gttgctttcg tcaccccatc tattcactat
2400acaatgggtg gttgtttgat ttccccatca gcagaaattc agatgaaaaa cacctcctcc
2460cgtgctccat tgtcccattc caaccctatc ttgggtttgt tcggtgctgg tgaagttact
2520ggtggtgtcc acggtggcaa tagattaggt ggtaactcat tgttagaatg tgttgtcttt
2580ggtagaattg ctggtgatag agcttctacc attttgcaga gaaagtcctc cgcattatct
2640ttcaaggtct ggactaccgt tgttttgaga gaagttagag aaggtggcgt ctatggtgcc
2700ggttcaagag ttttgagatt caacttgcct ggtgctttac aaagatccgg tttgtccttg
2760ggtcaattca tcgcaatcag aggtgactgg gatggtcaac aattgattgg ttactattcc
2820ccaattacat tgccagatga cttgggtatg attgacattt tggctagatc cgataaaggt
2880actttaagag aatggatttc tgctttagaa ccaggcgacg ctgttgagat gaaagcatgc
2940ggtggtttag tcatcgagag aagattgtca gataagcact ttgtctttat gggtcacatc
3000attaacaagt tatgtttgat cgctggtggt acaggcgttg cacctatgtt acaaatcatt
3060aaggcagcat tcatgaaacc ttttatcgat accttagaat ctgtccatct tatctatgct
3120gcagaagatg ttaccgagtt aacttataga gaagttttag aggagcgtag aagagagtct
3180cgtggcaagt tcaaaaagac ctttgttttg aacagacctc caccactttg gactgatggt
3240gttggtttca tcgatagagg tatcttaact aatcatgtcc aaccaccatc cgataacctt
3300ttggttgcaa tctgtggtcc acctgtcatg cagcgtattg ttaaggccac cttaaagact
3360ttgggttaca atatgaatct tgttagaaca gttgacgaaa cagaaccatc cggttcctaa
3420383889DNAArtificial SequenceSynthetic - GPD gene deletion fragment
38ccttcattta cgaaataaag tgccgcggtt acgcagcaca caccagcaat cacgtgcagt
60gtctttttct tttttttttc ttttttttcc tctttttctt ttgttttgtt tcgtttcttt
120tccgccagtt cccgttttcc atttccggaa caacaatggg actccactgt tttctttccc
180cccttccgtt ttcggctcgc agtctgtaca tgcacgttta tccgacacct gtcttgtttg
240gcgcgtaatt aatacagttt ctccggagtc caggtctcgg acgggtaatt tacacgtcat
300cattcatttc tgtgtcaaga gaggtagcgc aaaaagtaga aatggtgaac cacgggaatg
360acttgctgga aatcgacgcc agagtccatt tgaaaaccta cctctacaag agaggaaaca
420cactacaggg tgtccctggt ccgtaaaatg gcgtaatatg atgacttccc tctatagacg
480ttgtatttcc agctccaaca tggttaaact attgctatgg tgatggtatt acagatagta
540aaagaaggaa gggggggtgg caatctcacc ctaacagtta ctaagaacgt ctacttcatc
600tactgtcaat atacattggc cacatgccga gaaattacgt cgacgccaaa gaagggctca
660gccgaaaaaa gaaatggaaa acttggccga aaagggaaac aaacaaaaag gtgatgtaaa
720attagcggaa aggggaattg gcaaattgag ggagaaaaaa aaaaggcaga aaaggaggcg
780gaaagtcagt acgttttgaa ggcgtcattg gttttccctt ttgcagagtg tttcatttct
840tttgtttcat gacgtagtgg cgtttctttt cctgcacttt agaaatctat cttttcctta
900tcaagtaaca agcggttggc aaaggtgtat ataaatcaag gaattcccac tttgaaccct
960ttgaattttg atatcgttta ttttaaattt atttgcggcc gcggatccct cgaggcctta
1020attaacatct gaatgtaaaa tgaacattaa aatgaattac taaactttac gtctacttta
1080caatctataa actttgttta atcatataac gaaatacact aatacacaat cctgtacgta
1140tgtaatactt ttatccatca aggattgaga aaaaaaagta atgattccct gggccattaa
1200aacttagacc cccaagcttg gataggtcac tctctatttt cgtttctccc ttccctgata
1260gaagggtgat atgtaattaa gaataatata taattttata ataaaagaat tcatagcctc
1320atgaaatcag ccatttgctt ttgttcaacg atcttttgaa attgttgttg ttcttggtag
1380ttaagttgat ccatcttggc ttatgttgtg tgtatgttgt agttattctt agtatattcc
1440tgtcctgagt ttagtgaaac ataatatcgc cttgaaatga aaatgctgaa attcgtcgac
1500atacaatttt tcaaactttt tttttttctt ggtgcacgga catgttttta aaggaagtac
1560tctataccag ttattcttca caaatttaat tgctggagaa tagatcttca acgctttaat
1620aaagtagttt gtttgtcaag gatggcgtca tacaaagaaa gatcagaatc acacacttcc
1680cctgttgcta ggagactttt ctccatcatg gaggaaaaga agtctaacct ttgtgcatca
1740ttggatatta ctgaaactga aaagcttctc tctattttgg acactattgg tccttacatc
1800tgtctagtta aaacacacat cgatattgtt tctgatttta cgtatgaagg aactgtgttg
1860cctttgaagg agcttgccaa gaaacataat tttatgattt ttgaagatag aaaatttgct
1920gatattggta acactgttaa aaatcaatat aaatctggtg tcttccgtat tgccgaatgg
1980gctgacatca ctaatgcaca tggtgtaacg ggtgcaggta ttgtttctgg cttgaaggag
2040gcagcccaag aaacaaccag tgaacctaga ggtttgctaa tgcttgctga gttatcatca
2100aagggttctt tagcatatgg tgaatataca gaaaaaacag tagaaattgc taaatctgat
2160aaagagtttg tcattggttt tattgcgcaa cacgatatgg gcggtagaga agaaggtttt
2220gactggatca ttatgactcc aggggttggt ttagatgaca aaggtgatgc acttggtcaa
2280caatatagaa ctgttgatga agttgtaaag actggaacgg atatcataat tgttggtaga
2340ggtttgtacg gtcaaggaag agatcctata gagcaagcta aaagatacca acaagctggt
2400tggaatgctt atttaaacag atttaaatga ttcttacaca aagatttgat acatgtacac
2460tagtttaaat aagcatgaaa agaattacac aagcaaaaaa aaaaaaataa atgaggtact
2520ttacgttcac ctacaaccaa aaaaactaga tagagtaaaa tcttaagatt tagaaaaagt
2580tgtttaacaa aggctttagt atgtgaattt ttaatgtagc aaagcgataa ctaataaaca
2640taaacaaaag tatggttttc tttatcagtc aaatcattat cgattgattg ttccgcgtat
2700ctgcagatag cctcatgaaa tcagccattt gcttttgttc aacgatcttt tgaaattgtt
2760gttgttcttg gtagttaagt tgatccatct tggcttatgt tgtgtgtatg ttgtagttat
2820tcttagtata ttcctgtcct gagtttagtg aaacataata tcgccttgaa atgaaaatgc
2880tgaaattcgt cgacatacaa tttttcaaac tttttttttt tcttggtgca cggacatgtt
2940tttaaaggaa gtactctata ccagttattc ttcacaaatt taattgctgg agaatagatc
3000ttcaacgccc cgggggatct ggatccgcgg ccgcaataac ctcagggaga actttggcat
3060tgtactctcc attgacgagt ccgccaaccc attcttgtta aacctaacct tgcattatca
3120cattcccttt gacccccttt agctgcattt ccacttgtct acattaagat tcattacaca
3180ttctttttcg tatttctctt acctccctcc cccctccatg gatcttatat ataaatcttt
3240tctataacaa taatatctac tagagttaaa caacaattcc acttggcatg gctgtctcag
3300caaatctgct tctacctact gcacgggttt gcatgtcatt gtttctagca gggaatcgtc
3360catgtacgtt gtcctccatg atggtcttcc cgctgccact ttctttagta tcttaaatag
3420agcagatctt acgtccactg tgcatccgtg caccccgaaa atcgtatggt tttccttgcc
3480acctctcaca attttgaata tgctcaacgc gaaagagagg ggaagaggaa tcgcattcgt
3540agagtggcta cattcaaccc tgacaaagga actagcgttt gtgcaggaga gagtggtttg
3600catagatttc ctttcctttg caagcatatt atatagagta gccaatacag taacagctac
3660agcacaaaaa agagaacgag aacgagaacg agaacaagaa caagaactag cactactgtc
3720actgccagca tcaacattac taccattatt ccaacatgtt tgcaactaga aatataacca
3780ttggtgtcag aacactcaga ccaaccagtt tcttgaaaac aaggtctttt ctgcaacaga
3840ggctacaatc aacgctaaag aagagctatg aaccaaccaa atccgagct
3889393889DNAArtificial SequenceSynthetic - GPD gene deletion fragment
39ccttcattta cgaaataaag tgccgcggtt acgcagcaca caccagcaat cacgtgcagt
60gtctttttct tttttttttc ttttttttcc tctttttctt ttgttttgtt tcgtttcttt
120tccgccagtt cccgttttcc atttccggaa caacaatggg actccactgt tttctttccc
180cccttccgtt ttcggctcgc agtctgtaca tgcacgttta tccgacacct gtcttgtttg
240gcgcgtaatt aatacagttt ctccggagtc caggtctcgg acgggtaatt tacacgtcat
300cattcatttc tgtgtcaaga gaggtagcgc aaaaagtaga aatggtgaac cacgggaatg
360acttgctgga aatcgacgcc agagtccatt tgaaaaccta cctctacaag agaggaaaca
420cactacaggg tgtccctggt ccgtaaaatg gcgtaatatg atgacttccc tctatagacg
480ttgtatttcc agctccaaca tggttaaact attgctatgg tgatggtatt acagatagta
540aaagaaggaa gggggggtgg caatctcacc ctaacagtta ctaagaacgt ctacttcatc
600tactgtcaat atacattggc cacatgccga gaaattacgt cgacgccaaa gaagggctca
660gccgaaaaaa gaaatggaaa acttggccga aaagggaaac aaacaaaaag gtgatgtaaa
720attagcggaa aggggaattg gcaaattgag ggagaaaaaa aaaaggcaga aaaggaggcg
780gaaagtcagt acgttttgaa ggcgtcattg gttttccctt ttgcagagtg tttcatttct
840tttgtttcat gacgtagtgg cgtttctttt cctgcacttt agaaatctat cttttcctta
900tcaagtaaca agcggttggc aaaggtgtat ataaatcaag gaattcccac tttgaaccct
960ttgaattttg atatcgttta ttttaaattt atttgcggcc gcggatccag atcccccggg
1020gcgttgaaga tctattctcc agcaattaaa tttgtgaaga ataactggta tagagtactt
1080cctttaaaaa catgtccgtg caccaagaaa aaaaaaaagt ttgaaaaatt gtatgtcgac
1140gaatttcagc attttcattt caaggcgata ttatgtttca ctaaactcag gacaggaata
1200tactaagaat aactacaaca tacacacaac ataagccaag atggatcaac ttaactacca
1260agaacaacaa caatttcaaa agatcgttga acaaaagcaa atggctgatt tcatgaggct
1320atctgcagat acgcggaaca atcaatcgat aatgatttga ctgataaaga aaaccatact
1380tttgtttatg tttattagtt atcgctttgc tacattaaaa attcacatac taaagccttt
1440gttaaacaac tttttctaaa tcttaagatt ttactctatc tagttttttt ggttgtaggt
1500gaacgtaaag tacctcattt attttttttt ttttgcttgt gtaattcttt tcatgcttat
1560ttaaactagt gtacatgtat caaatctttg tgtaagaatc atttaaatct gtttaaataa
1620gcattccaac cagcttgttg gtatctttta gcttgctcta taggatctct tccttgaccg
1680tacaaacctc taccaacaat tatgatatcc gttccagtct ttacaacttc atcaacagtt
1740ctatattgtt gaccaagtgc atcacctttg tcatctaaac caacccctgg agtcataatg
1800atccagtcaa aaccttcttc tctaccgccc atatcgtgtt gcgcaataaa accaatgaca
1860aactctttat cagatttagc aatttctact gttttttctg tatattcacc atatgctaaa
1920gaaccctttg atgataactc agcaagcatt agcaaacctc taggttcact ggttgtttct
1980tgggctgcct ccttcaagcc agaaacaata cctgcacccg ttacaccatg tgcattagtg
2040atgtcagccc attcggcaat acggaagaca ccagatttat attgattttt aacagtgtta
2100ccaatatcag caaattttct atcttcaaaa atcataaaat tatgtttctt ggcaagctcc
2160ttcaaaggca acacagttcc ttcatacgta aaatcagaaa caatatcgat gtgtgtttta
2220actagacaga tgtaaggacc aatagtgtcc aaaatagaga gaagcttttc agtttcagta
2280atatccaatg atgcacaaag gttagacttc ttttcctcca tgatggagaa aagtctccta
2340gcaacagggg aagtgtgtga ttctgatctt tctttgtatg acgccatcct tgacaaacaa
2400actactttat taaagcgttg aagatctatt ctccagcaat taaatttgtg aagaataact
2460ggtatagagt acttccttta aaaacatgtc cgtgcaccaa gaaaaaaaaa aagtttgaaa
2520aattgtatgt cgacgaattt cagcattttc atttcaaggc gatattatgt ttcactaaac
2580tcaggacagg aatatactaa gaataactac aacatacaca caacataagc caagatggat
2640caacttaact accaagaaca acaacaattt caaaagatcg ttgaacaaaa gcaaatggct
2700gatttcatga ggctatgaat tcttttatta taaaattata tattattctt aattacatat
2760cacccttcta tcagggaagg gagaaacgaa aatagagagt gacctatcca agcttggggg
2820tctaagtttt aatggcccag ggaatcatta cttttttttc tcaatccttg atggataaaa
2880gtattacata cgtacaggat tgtgtattag tgtatttcgt tatatgatta aacaaagttt
2940atagattgta aagtagacgt aaagtttagt aattcatttt aatgttcatt ttacattcag
3000atgttaatta aggcctcgag ggatccgcgg ccgcaataac ctcagggaga actttggcat
3060tgtactctcc attgacgagt ccgccaaccc attcttgtta aacctaacct tgcattatca
3120cattcccttt gacccccttt agctgcattt ccacttgtct acattaagat tcattacaca
3180ttctttttcg tatttctctt acctccctcc cccctccatg gatcttatat ataaatcttt
3240tctataacaa taatatctac tagagttaaa caacaattcc acttggcatg gctgtctcag
3300caaatctgct tctacctact gcacgggttt gcatgtcatt gtttctagca gggaatcgtc
3360catgtacgtt gtcctccatg atggtcttcc cgctgccact ttctttagta tcttaaatag
3420agcagatctt acgtccactg tgcatccgtg caccccgaaa atcgtatggt tttccttgcc
3480acctctcaca attttgaata tgctcaacgc gaaagagagg ggaagaggaa tcgcattcgt
3540agagtggcta cattcaaccc tgacaaagga actagcgttt gtgcaggaga gagtggtttg
3600catagatttc ctttcctttg caagcatatt atatagagta gccaatacag taacagctac
3660agcacaaaaa agagaacgag aacgagaacg agaacaagaa caagaactag cactactgtc
3720actgccagca tcaacattac taccattatt ccaacatgtt tgcaactaga aatataacca
3780ttggtgtcag aacactcaga ccaaccagtt tcttgaaaac aaggtctttt ctgcaacaga
3840ggctacaatc aacgctaaag aagagctatg aaccaaccaa atccgagct
3889403175DNAArtificial SequenceSynthetic - PGI gene deletion construct
40cttcgctcgc catctatatc ttcaacgaac aacggaatta caaacatggg cagtagttca
60aacaatctcc agactctcaa ctctctctcg ctatcgttga aacatccaca gttccaaggc
120ctattctccc cactggatgt ccacagtccg tacgaacaga acgtttcttc cccactggcc
180cccaccgttc cggctgttcc gggaaccgca ccttcattcg agtcggacga tctctacaat
240gcaacggctg cccgcaaaag agactctctc aagatgaaga gaagatagac gctacatcat
300tgtctgtgca gtacctaata tatagtactt ggtataaggt ataataaagc tataaaatta
360taataatctt aataataata accatattaa tggaaggatg aggcccgatg tccttttttt
420tgcctttcta ctatagtgct tacattgtgt ataaattctc gcggccgcgg atccctcgag
480gccttaatta acatctgaat gtaaaatgaa cattaaaatg aattactaaa ctttacgtct
540actttacaat ctataaactt tgtttaatca tataacgaaa tacactaata cacaatcctg
600tacgtatgta atacttttat ccatcaagga ttgagaaaaa aaagtaatga ttccctgggc
660cattaaaact tagaccccca agcttggata ggtcactctc tattttcgtt tctcccttcc
720ctgatagaag ggtgatatgt aattaagaat aatatataat tttataataa aagaattcat
780agcctcatga aatcagccat ttgcttttgt tcaacgatct tttgaaattg ttgttgttct
840tggtagttaa gttgatccat cttggcttat gttgtgtgta tgttgtagtt attcttagta
900tattcctgtc ctgagtttag tgaaacataa tatcgccttg aaatgaaaat gctgaaattc
960gtcgacatac aatttttcaa actttttttt tttcttggtg cacggacatg tttttaaagg
1020aagtactcta taccagttat tcttcacaaa tttaattgct ggagaataga tcttcaacgc
1080tttaataaag tagtttgttt gtcaaggatg gcgtcataca aagaaagatc agaatcacac
1140acttcccctg ttgctaggag acttttctcc atcatggagg aaaagaagtc taacctttgt
1200gcatcattgg atattactga aactgaaaag cttctctcta ttttggacac tattggtcct
1260tacatctgtc tagttaaaac acacatcgat attgtttctg attttacgta tgaaggaact
1320gtgttgcctt tgaaggagct tgccaagaaa cataatttta tgatttttga agatagaaaa
1380tttgctgata ttggtaacac tgttaaaaat caatataaat ctggtgtctt ccgtattgcc
1440gaatgggctg acatcactaa tgcacatggt gtaacgggtg caggtattgt ttctggcttg
1500aaggaggcag cccaagaaac aaccagtgaa cctagaggtt tgctaatgct tgctgagtta
1560tcatcaaagg gttctttagc atatggtgaa tatacagaaa aaacagtaga aattgctaaa
1620tctgataaag agtttgtcat tggttttatt gcgcaacacg atatgggcgg tagagaagaa
1680ggttttgact ggatcattat gactccaggg gttggtttag atgacaaagg tgatgcactt
1740ggtcaacaat atagaactgt tgatgaagtt gtaaagactg gaacggatat cataattgtt
1800ggtagaggtt tgtacggtca aggaagagat cctatagagc aagctaaaag ataccaacaa
1860gctggttgga atgcttattt aaacagattt aaatgattct tacacaaaga tttgatacat
1920gtacactagt ttaaataagc atgaaaagaa ttacacaagc aaaaaaaaaa aaataaatga
1980ggtactttac gttcacctac aaccaaaaaa actagataga gtaaaatctt aagatttaga
2040aaaagttgtt taacaaaggc tttagtatgt gaatttttaa tgtagcaaag cgataactaa
2100taaacataaa caaaagtatg gttttcttta tcagtcaaat cattatcgat tgattgttcc
2160gcgtatctgc agatagcctc atgaaatcag ccatttgctt ttgttcaacg atcttttgaa
2220attgttgttg ttcttggtag ttaagttgat ccatcttggc ttatgttgtg tgtatgttgt
2280agttattctt agtatattcc tgtcctgagt ttagtgaaac ataatatcgc cttgaaatga
2340aaatgctgaa attcgtcgac atacaatttt tcaaactttt tttttttctt ggtgcacgga
2400catgttttta aaggaagtac tctataccag ttattcttca caaatttaat tgctggagaa
2460tagatcttca acgccccggg ggatctggat ccgcggccgc gttaacgaaa gttccaaact
2520ttatttataa tgtgtttatg tttgtatttt aatcactctt tatgacctat atatgaagct
2580tttagcatta tcgcagcaag tataaatgga tgcatgtaaa ttccatagtt catatagtgc
2640gatttggtga atttttgaaa tttttgctaa tggataatat actctatatt tttacactgt
2700gtttactgat gcctcttccg aatttctttc tttcaccact caacccatga aaggcaagga
2760acacatacat catgattaca ataatataga tatcggggta acaataacag ttcccagaag
2820aaggaaacaa aaacgtacag gatctacaaa tagtcaaagc actgggtgga agaaattgtt
2880atggctcaaa caaccttatg acgataacta cacagattcg agcttcttat cacaactgaa
2940acgaaattca acggttgtaa agtactcgta tgtaaagcta gtcaatgatt tttccatcat
3000tgtattgcat ctgtcgtcca ttatgtttgt tgttgttgta ttttatggga tctatcagtt
3060aaattggaac ccgattaaac caacagtgat aagtacgatt tgtacactca ttggattcat
3120tttttatgtt gtaacattga agataataag aaataaagaa ttgattgaac gagct
3175413175DNAArtificial SequenceSynthetic - PGI gene deletion construct
41cgttcaatca attctttatt tcttattatc ttcaatgtta caacataaaa aatgaatcca
60atgagtgtac aaatcgtact tatcactgtt ggtttaatcg ggttccaatt taactgatag
120atcccataaa atacaacaac aacaaacata atggacgaca gatgcaatac aatgatggaa
180aaatcattga ctagctttac atacgagtac tttacaaccg ttgaatttcg tttcagttgt
240gataagaagc tcgaatctgt gtagttatcg tcataaggtt gtttgagcca taacaatttc
300ttccacccag tgctttgact atttgtagat cctgtacgtt tttgtttcct tcttctggga
360actgttattg ttaccccgat atctatatta ttgtaatcat gatgtatgtg ttccttgcct
420ttcatgggtt gagtggtgaa agaaagaaat tcggaagagg catcagtaaa cacagtgtaa
480aaatatagag tatattatcc attagcaaaa atttcaaaaa ttcaccaaat cgcactatat
540gaactatgga atttacatgc atccatttat acttgctgcg ataatgctaa aagcttcata
600tataggtcat aaagagtgat taaaatacaa acataaacac attataaata aagtttggaa
660ctttcgttaa cgcggccgcg gatccctcga ggccttaatt aacatctgaa tgtaaaatga
720acattaaaat gaattactaa actttacgtc tactttacaa tctataaact ttgtttaatc
780atataacgaa atacactaat acacaatcct gtacgtatgt aatactttta tccatcaagg
840attgagaaaa aaaagtaatg attccctggg ccattaaaac ttagaccccc aagcttggat
900aggtcactct ctattttcgt ttctcccttc cctgatagaa gggtgatatg taattaagaa
960taatatataa ttttataata aaagaattca tagcctcatg aaatcagcca tttgcttttg
1020ttcaacgatc ttttgaaatt gttgttgttc ttggtagtta agttgatcca tcttggctta
1080tgttgtgtgt atgttgtagt tattcttagt atattcctgt cctgagttta gtgaaacata
1140atatcgcctt gaaatgaaaa tgctgaaatt cgtcgacata caatttttca aacttttttt
1200ttttcttggt gcacggacat gtttttaaag gaagtactct ataccagtta ttcttcacaa
1260atttaattgc tggagaatag atcttcaacg ctttaataaa gtagtttgtt tgtcaaggat
1320ggcgtcatac aaagaaagat cagaatcaca cacttcccct gttgctagga gacttttctc
1380catcatggag gaaaagaagt ctaacctttg tgcatcattg gatattactg aaactgaaaa
1440gcttctctct attttggaca ctattggtcc ttacatctgt ctagttaaaa cacacatcga
1500tattgtttct gattttacgt atgaaggaac tgtgttgcct ttgaaggagc ttgccaagaa
1560acataatttt atgatttttg aagatagaaa atttgctgat attggtaaca ctgttaaaaa
1620tcaatataaa tctggtgtct tccgtattgc cgaatgggct gacatcacta atgcacatgg
1680tgtaacgggt gcaggtattg tttctggctt gaaggaggca gcccaagaaa caaccagtga
1740acctagaggt ttgctaatgc ttgctgagtt atcatcaaag ggttctttag catatggtga
1800atatacagaa aaaacagtag aaattgctaa atctgataaa gagtttgtca ttggttttat
1860tgcgcaacac gatatgggcg gtagagaaga aggttttgac tggatcatta tgactccagg
1920ggttggttta gatgacaaag gtgatgcact tggtcaacaa tatagaactg ttgatgaagt
1980tgtaaagact ggaacggata tcataattgt tggtagaggt ttgtacggtc aaggaagaga
2040tcctatagag caagctaaaa gataccaaca agctggttgg aatgcttatt taaacagatt
2100taaatgattc ttacacaaag atttgataca tgtacactag tttaaataag catgaaaaga
2160attacacaag caaaaaaaaa aaaataaatg aggtacttta cgttcaccta caaccaaaaa
2220aactagatag agtaaaatct taagatttag aaaaagttgt ttaacaaagg ctttagtatg
2280tgaattttta atgtagcaaa gcgataacta ataaacataa acaaaagtat ggttttcttt
2340atcagtcaaa tcattatcga ttgattgttc cgcgtatctg cagatagcct catgaaatca
2400gccatttgct tttgttcaac gatcttttga aattgttgtt gttcttggta gttaagttga
2460tccatcttgg cttatgttgt gtgtatgttg tagttattct tagtatattc ctgtcctgag
2520tttagtgaaa cataatatcg ccttgaaatg aaaatgctga aattcgtcga catacaattt
2580ttcaaacttt ttttttttct tggtgcacgg acatgttttt aaaggaagta ctctatacca
2640gttattcttc acaaatttaa ttgctggaga atagatcttc aacgccccgg gggatctgga
2700tccgcggccg cgagaattta tacacaatgt aagcactata gtagaaaggc aaaaaaaagg
2760acatcgggcc tcatccttcc attaatatgg ttattattat taagattatt ataattttat
2820agctttatta taccttatac caagtactat atattaggta ctgcacagac aatgatgtag
2880cgtctatctt ctcttcatct tgagagagtc tcttttgcgg gcagccgttg cattgtagag
2940atcgtccgac tcgaatgaag gtgcggttcc cggaacagcc ggaacggtgg gggccagtgg
3000ggaagaaacg ttctgttcgt acggactgtg gacatccagt ggggagaata ggccttggaa
3060ctgtggatgt ttcaacgata gcgagagaga gttgagagtc tggagattgt ttgaactact
3120gcccatgttt gtaattccgt tgttcgttga agatatagat ggcgagcgaa gggcc
3175423420DNATrypanosoma brucei 42atggttgatg gtagatcttc agcttctatt
gttgcagttg atccagaaag agcagcaaga 60gaaagagatg ctgcagctag agctttgtta
caagattctc cattgcacac taccatgcaa 120tatgctacct ccggtttaga attgaccgtc
ccttatgcat tgaaagttgt tgcatctgcc 180gacaccttcg atagagctaa ggaagttgca
gatgaagtcc ttagatgtgc ctggcaattg 240gctgatacag tccttaactc ctttaaccca
aactctgaag tctctcttgt tggtagactt 300ccagtcggtc agaagcatca aatgtccgcc
ccacttaaga gagttatggc ttgttgtcaa 360agagtttaca attcctctgc tggttgtttc
gacccatcca ccgccccagt tgcaaaggct 420ttgcgtgaaa tcgctttagg caaggagaga
aacaatgcct gtttggaggc tttaacacaa 480gcatgcactt tgccaaactc tttcgtcatt
gactttgaag caggtactat ctcacgtaaa 540catgaacatg cttcacttga cttaggtggt
gtttcaaagg gttacatcgt tgactatgtt 600attgataaca ttaacgcagc tggtttccaa
aatgtctttt tcgattgggg tggtgattgt 660agagcctccg gtatgaatgc tagaaatacc
ccttgggttg ttggtattac tagaccacca 720tcattagata tgttaccaaa cccaccaaag
gaagcatcct atatctctgt tatctcattg 780gacaacgaag ctttggcaac ctccggtgat
tacgagaatt tgatctacac agctgatgac 840aagcctttaa cttgtactta cgattggaag
ggcaaggaac ttatgaagcc atctcaatca 900aacattgccc aagtttcagt taagtgctat
tcagcaatgt acgctgacgc tttagccacc 960gcttgtttca tcaaaagaga tccagccaag
gttagacaat tgttagatgg ttggagatac 1020gttagagata ctgtcagaga ttacagagtt
tatgttagag aaaatgagag agtcgctaag 1080atgtttgaaa ttgcaaccga agatgctgaa
atgagaaaaa gacgtatctc taatactttg 1140cctgcaagag tcatcgttgt cggtggcggt
ttagcaggtt tatctgcagc aattgaagct 1200gcaggctgcg gtgcacaagt cgttttgatg
gaaaaggaag ctaagttagg tggtaactct 1260gcaaaggcaa cctctggtat caatggttgg
ggtactagag cccaagcaaa ggcttccatt 1320gttgacggtg gcaagtattt cgaaagagat
acttacaaat ctggtattgg tggtaatacc 1380gacccagctt tagttaagac tctttccatg
aagtctgctg acgctattgg ttggttaaca 1440tcattaggtg ttcctttaac agtcttatca
caattgggtg gtcattccag aaagagaact 1500cacagagcac cagacaaaaa ggatggcacc
ccattaccta ttggttttac cattatgaaa 1560accttagaag atcacgtcag aggtaatctt
tctggtagaa ttactatcat ggaaaactgt 1620tccgttacct ctttactttc tgaaactaag
gaaagaccag atggtactaa acaaatcaga 1680gttaccggtg ttgagttcac tcaagcaggc
tctggcaaaa ctaccatttt ggccgacgca 1740gtcatcttgg ccactggtgg tttctctaac
gacaagaccg cagactcttt gttgagagaa 1800catgcccctc acttagttaa ctttcctaca
actaacggtc cttgggcaac tggtgacggt 1860gttaagcttg ctcaaagatt aggtgcacaa
ttggtcgaca tggataaggt tcaattgcat 1920ccaactggtt tgattaaccc aaaagatcca
gctaatccaa caaagttttt gggtccagaa 1980gctttaagag gttccggtgg tgtcttgtta
aacaaacagg gtaaaagatt tgttaacgaa 2040ttagatttgc gttctgttgt ttccaaggcc
attatggaac aaggtgctga atacccaggc 2100tctggtggtt ctatgttcgc atattgtgtc
cttaatgcag ctgcacaaaa gttgtttggt 2160gtctcttccc acgagttcta ctggaaaaag
atgggtttgt tcgttaaggc tgatactatg 2220agagatttgg cagcattgat tggttgtcca
gtcgagtctg ttcaacaaac tttagaggaa 2280tatgaaagat tatctatttc tcagagatcc
tgtccaatca ctagaaaatc tgtttaccca 2340tgtgttttgg gcactaaggg tccatactac
gttgctttcg tcaccccatc tattcactat 2400acaatgggtg gttgtttgat ttccccatca
gcagaaattc agatgaaaaa cacctcctcc 2460cgtgctccat tgtcccattc caaccctatc
ttgggtttgt tcggtgctgg tgaagttact 2520ggtggtgtcc acggtggcaa tagattaggt
ggtaactcat tgttagaatg tgttgtcttt 2580ggtagaattg ctggtgatag agcttctacc
attttgcaga gaaagtcctc cgcattatct 2640ttcaaggtct ggactaccgt tgttttgaga
gaagttagag aaggtggcgt ctatggtgcc 2700ggttcaagag ttttgagatt caacttgcct
ggtgctttac aaagatccgg tttgtccttg 2760ggtcaattca tcgcaatcag aggtgactgg
gatggtcaac aattgattgg ttactattcc 2820ccaattacat tgccagatga cttgggtatg
attgacattt tggctagatc cgataaaggt 2880actttaagag aatggatttc tgctttagaa
ccaggcgacg ctgttgagat gaaagcatgc 2940ggtggtttag tcatcgagag aagattgtca
gataagcact ttgtctttat gggtcacatc 3000attaacaagt tatgtttgat cgctggtggt
acaggcgttg cacctatgtt acaaatcatt 3060aaggcagcat tcatgaaacc ttttatcgat
accttagaat ctgtccatct tatctatgct 3120gcagaagatg ttaccgagtt aacttataga
gaagttttag aggagcgtag aagagagtct 3180cgtggcaagt tcaaaaagac ctttgttttg
aacagacctc caccactttg gactgatggt 3240gttggtttca tcgatagagg tatcttaact
aatcatgtcc aaccaccatc cgataacctt 3300ttggttgcaa tctgtggtcc acctgtcatg
cagcgtattg ttaaggccac cttaaagact 3360ttgggttaca atatgaatct tgttagaaca
gttgacgaaa cagaaccatc cggttcctaa 3420433420DNATrypanosoma cruzi
43atggctgacg gtagatcctc tgcatctgtt gttgcagttg atccagaaaa ggctgcaaga
60gaaagagatg aagcagctcg tgctttgtta agagactctc cattacaaac tcatcttcag
120tacatgacta atggtttaga gttgactgtc ccattcacct taaaggttgt cgctgaagca
180gttgcatttt ccagagcaaa ggaagttgct gacgaagttt tgaggtcagc ctggcatctt
240gcagacaccg tcttgaacaa ctttaaccct aactccgaga tttctatgat tggtagatta
300ccagttggtc aaaaacatac aatgtccgct acattgaagt ctgttatcac atgctgtcag
360catgttttca attcatccag aggtgttttt gatccagcta ctggtcctat cattgaagct
420ttaagagcta aggttgctga gaaagcctct gtttctgatg aacagatgga gaagttgttt
480cgtgtttgta acttctcttc ctcattcatc gttgatttgg aaatgggtac tattgccaga
540aaacacgaag atgcaagatt tgacttaggt ggtgtttcca agggttacat cgttgactac
600gttgttgaaa gattgaacgc tgctggtatt gtcgatgtct acttcgaatg gggtggtgac
660tgtagagctt ccggtactaa cgcaagacgt accccatgga tggttggtat cattagacct
720ccatctttag aacaattgag aaacccacca aaagatccat cctacattag ggttttacca
780cttaacgatg aagcactttg tacctctggt gactatgaga atttgaccga aggctctaac
840aaaaagttgt atacatccat tttcgattgg aaaaagagat ccttgttgga accagttgaa
900tcagaattgg cccaagtttc cattagatgt tattctgcca tgtatgcaga cgcattagca
960acagcttctc ttatcaagag agatatcaaa aaggttagac aaatgttgga agattggaga
1020cacgtccgta atagggttac taactatgtt acctatacca gacaaggtga aagagtcgca
1080cgtatgtttg aaattgctac tgataacgct gagattagga aaaagagaat tgcaggctct
1140ttacctgcta gggttattgt tgtcggtggt ggtttagctg gtttgtctgc agcaattgaa
1200gcaactgcat gtggtgccca agttatcctt ttagaaaagg aacctaaagt tggtggtaat
1260tccgcaaagg ctacatctgg tatcaacggt tggggtacta gagcacaagc tgaacaagat
1320gtctacgact ctggcaagta cttcgaaaga gatacacaca aatctggttt aggtggttct
1380accgatccag gcttagttcg tactttatca gtcaagtctg gtgacgctat ttcatggtta
1440tcttctcttg gtgttccatt aactgtcttg tcacaattag gcggtcattc cagaaaaagg
1500actcacagag cccctgataa ggcagatggt actccagttc caattggttt caccattatg
1560caaaccttag aacagcatgt tagaaccaag ttagcagaca gagttactat catggagaat
1620accaccgtta cctccttgct ttctaagtcc agagttagac atgatggtgc aaagcaagtt
1680agagtctacg gtgttgaagt cttacaagac gaaggtgtcg tttctcgtat cttggccgat
1740gctgtcattt tggcaacagg tggtttctcc aatgacaaaa ccccaaactc cttattgcaa
1800gagttcgctc cacaattgtc aggttttcca acaaccaacg gtccatgggc tactggcgat
1860ggtgttaagt tagcaagaga acttggtgtc aagttggttg atatggataa ggtccaactt
1920catccaactg gtttgattga ccctaaggac ccagcaaatc caaccaaata cttaggtcca
1980gaagcattga gaggttctgg tggtgtcttg ttaaacaaaa agggtgaaag atttgtcaat
2040gagttggact tgcgttccgt cgtttcaaat gctatcattg aacaaggtga tgaatatcca
2100gatgccggtg gttccaagtt cgccttctgt gttttgaatg atgcagcagt taagttattc
2160ggtgtcaact cccacggttt ctactggaag agacttggtt tgtttgttaa ggctgatacc
2220gttgaaaagt tagccgcatt gatcggttgc ccagtcgaaa atgttagaaa cacattaggt
2280gattatgagc aattgtccaa ggaaaacaga caatgtccaa agactagaaa agttgtctat
2340ccatgtgttg ttggtccaca aggtccattc tatgttgctt ttgttacccc atctattcac
2400tataccatgg gtggttgttt gatctcacca tctgctgaga tgcaattgga agagaacact
2460acctccccat ttggtcacag aaggcctatc ttcggtcttt tcggtgccgg tgaagttact
2520ggtggtgtcc atggtggtaa cagattaggt ggcaactctt tgttggagtg tgttgttttt
2580ggtagaatcg ctggtgatag agctgcaacc attttgcaaa agaaaccagt tccactttcc
2640tttaagactt ggaccaccgt cattttgaga gaggtccgtg aaggtggcat gtacggtact
2700ggttcaagag tcttaagatt caatttgcca ggtgctttac aaagatctgg tttgcaattg
2760ggtcaattca tcgctattag aggcgaatgg gatggtcaac aattgattgg ctactattcc
2820ccaatcactt tgccagacga tttgggtgtc atcggcattt tggctagatc cgataagggt
2880actttgaagg aatggatttc tgctttggaa cctggtgatg cagttgagat gaagggttgt
2940ggcggtttag ttattgaaag gagattctct gaaagatact tgtacttttc tggtcacgct
3000ttgaaaaagt tatgccttat tgctggtggt actggtgtcg caccaatgtt acaaatcatt
3060agagcagcat tgaaaaagcc attccttgag aatatcgaat caattagact tatctatgct
3120gctgaggacg tttctgagtt gacatacagg gaattgttag aacatcacca aagagattct
3180aagggcaagt ttagatccat cttcgttttg aatagaccac ctccaatttg gactgatggt
3240gttggcttta tcgacaaaaa gttgttatct tcatccgttc agccacctgc taaggatttg
3300ttagtcgcca tttgtggtcc tcctatcatg caacgtgttg tcaagacttg tcttaagtca
3360ttaggttatg atatgcagtt agtcagaaca gttgatgaag tcgaaactca aaactcctaa
3420443435DNALeishmania braziliensis 44atggctgatg gtaaaacctc tgcttccgtt
gttgctgtcg acccagagcg tgcagcaaag 60gagagagatg cagcagcaag agcaatgtta
caagacggtg gtgtttctcc agttggtaaa 120gctcagttgt tgaaaaaggg tttggcatat
gctgtccctt acacccttaa gattgttgtt 180gcagatccta aagctatgga aaagaccacc
gcagacgttg agaaggtcct tcaaaccgca 240ttccaagtcg ttgacacttt gttaaacaat
ttcaacgaaa actccgaggt ttctcgtatc 300aacagaatgc cagtcggtga ggaacaccaa
atgtctgctg cattgaagag agttatgggt 360tgctgtcagc gtgtttacaa ttcatctcgt
ggtgcttttg acccagctgt tggtccattg 420gtcagagaat tgagggaagc tgcaagagaa
ggcagaactt taccagcaga aaggattaac 480gctttgttat ccaagtgtac cttgaatatc
tccttttcca ttgatttgaa cagaggtact 540attgccagaa aacacgcaga tgcaatgttg
gatttgggtg gtgtcaataa gggttatggt 600gttgattatg ttgtcgaaca tttgaacaat
ttgggttatg atgatgtctt tttcgaatgg 660ggtggtgatg ttagagcatc tggcaaaaac
ccatcaaacc aacattgggt tgttggtatt 720gctagaccac cagcacttgc tgatatcaga
accgttgttc cacaagacaa gcaatccttc 780atcagagttg tttgtcttaa tgatgaagca
attgccacct ctggtgatta cgaaaatctt 840gtcgaaggtc ctggttctaa ggtttactcc
tctaccttca acgcaacctc taagtcctta 900ttggaaccaa ccgaaaccaa tatcgcacaa
gtctctgtta agtgttactc atgcatgtat 960gcagacgcat tggctaccgc tgccttattg
aaaaacaatc caactgctgt tcgtagaatg 1020ttagataact ggagatatgt tcgtgatact
gttaccgact atacaaccta ttccagagaa 1080ggtgaaagag ttgcaaagat gtttgagatt
gcaaccgaag ataaggaaat gagagctaag 1140agaatttccg gttccttgcc agcaagagtc
attatcgtcg gtggtggttt agctggttgt 1200tctgcagcta ttgaagcagt caactgtggt
gctcaagtca ttttgttaga aaaggaagcc 1260aagattggtg gcaactccgc aaaggctacc
tctggtatca acgcctgggg tactagagcc 1320caggctaaac aaggtgttat ggatggtggc
aagtttttcg agagagacac ccatagatcc 1380ggtaaaggtg gtcactgtga tccttgtttg
gttaagacac tttccgttaa gtcatcagac 1440gcagttaagt ggttgtctga attgggtgtt
ccattaaccg tcttatccca attaggtggt 1500gcatccagaa agaggtgtca tagagcccca
gataagtctg atggtactcc tgttccaatt 1560ggttttacaa tcatgaaaac attagaaaat
cacatcatta acgatctttc tcaccaagtt 1620actgttatga ctggtatcaa ggttactggt
ttggagtcca cttctcacgc tcgtccagat 1680ggtgttttag ttaagcacgt tactggtgtt
agattgattc aaggtgatgg ccaatccaga 1740gttttgaatg ctgatgccgt tatcttagca
actggtggtt tctccaatga ccatactgct 1800aactctttac ttcaacaata cgctccacaa
ctttcatcct ttccaaccac taatggtgtt 1860tgggccactg gtgacggtgt caaggcagct
agagaattag gtgttgagtt ggttgacatg 1920gataaggtcc aattgcatcc aacaggtttg
ttagatccaa aggacccatc caacaggact 1980aagtacttgg gtccagaagc tttaaggggt
tcaggcggtg tcttgttaaa caaaaacggt 2040gaacgtttcg tcaacgaact tgatttgaga
tctgtcgttt ctcaagccat tatcgaacaa 2100aacaacgttt accctggttc tggtggttcc
aagtttgctt actgcgtttt gaacgaagca 2160gcagctaagt tgttcggcaa aaactttttg
ggtttctatt ggcatagatt aggtcttttt 2220gaaaaggttg aagatgttgc tggtttagcc
aaattgatcg gttgtccaga ggaaaatgtt 2280accgctacat tgaaggaata caaggaattg
tcctccaaaa agcttcatgc ctgtccttta 2340accaacaaaa acgtctttcc ttgcacttta
ggtactgaag gcccttacta tgttgctttc 2400gtcacacctt caattcacta cacaatgggt
ggttgtttga tctccccttc agcagaaatg 2460cagaccattg ataacactgg tgtcacacca
gttcgtagac caatcttggg cttattcggt 2520gctggtgaag ttactggtgg tgtccatggt
ggtaacagat tgggtggtaa ttccttattg 2580gaatgtgttg tctttggtag aattgctggt
gatagagccg ctaccatttt gcaaaagaag 2640aatgctggtt tatcaatgac tgagtggtct
acagttgtct taagagaagt cagagaaggc 2700ggtgtttacg gtactggttc tcgtgtcctt
agattcaata tgccaggtgc cttacaaaag 2760actggcttag cattgggtca attcatcgca
atgagaggtg attgggatgg tcaacagtta 2820ttgggttact attctccaat tacattacca
gacgacattg gtgttattgg tatcttagct 2880agagctgaca aaggtagatt agctgaatgg
atttctgcat tacaaccagg tgatgctgtt 2940gagatgaagg catgtggcgg tttgattatc
catagaagat tcgctgctag acacttgttt 3000ttccgttctc acaagattag aaagcttgct
cttattggtg gtggtactgg tgttgcacca 3060atgttgcaaa ttgtcagggc tgcagtcaaa
aagccatttg ttgactctat tgagtctatt 3120cagttcatct atgcagctga agatgtctcc
gaacttactt atagaacttt gttggaatca 3180tatgaaaagg aatacggttc tggcaaattc
aagtgtcatt tcgtcttgaa taacccacca 3240tcacaatgga ccgagggcgt tggtttcgtt
gatactgctt tgttgcgttc tgccgttcaa 3300gcaccttcta acgacttgtt agtcgctatt
tgtggcccac caatcatgca aagagcagtc 3360aaatcagcct taaagggttt aggttacaat
atgaatttgg ttagaacagt tgatgaacca 3420gaaccattgt cttaa
3435453537DNASaccharomyces cerevisiae
45atgtcgcaaa gaaaattcgc cggcttgaga gataacttca atctcttggg tgaaaagaac
60aaaatattgg tggctaatag aggagaaatt ccaatcagaa tttttcgtac cgctcatgaa
120ctgtctatgc agacggtagc tatatattct catgaagatc gtctttcaac gcacaaacaa
180aaggctgacg aagcatacgt cataggtgaa gtaggccaat atacccccgt cggcgcttat
240ttggccattg acgaaatcat ttccattgcc caaaaacacc aggtagattt catccatcca
300ggttatgggt tcttgtctga aaattcggaa tttgccgaca aagtagtgaa ggccggtatc
360acttggattg gccctccagc tgaagttatt gactccgtgg gtgataaggt ctcagctaga
420aacctggcag caaaagctaa tgtgcccacc gttcctggta caccaggtcc tatagaaact
480gtagaggaag cacttgactt cgtcaatgaa tacggctacc cggtgatcat taaggccgcc
540tttggtggtg gtggtagagg tatgagagtc gttagagaag gtgacgacgt ggcagatgcc
600tttcaacgtg ctacctccga agcccgtact gccttcggta atggtacctg ctttgtggaa
660agattcttgg acaagccaaa gcatattgaa gttcaattgt tggccgataa ccacggaaac
720gtggttcatc ttttcgaaag agactgttcc gtgcagagaa gacaccaaaa ggttgtcgaa
780gtggccccag caaagacttt accccgtgaa gtccgtgacg ccattttgac agatgcagtt
840aaattggcca aagagtgtgg ctacagaaat gcgggtactg ctgaattctt ggttgataac
900caaaatagac actatttcat tgaaattaat ccaagaatcc aagtggaaca taccatcaca
960gaagaaatta ccggtataga tattgtggcg gctcagatcc aaattgcggc aggtgcctct
1020ctaccccagc tgggcctatt ccaggacaaa attacgactc gtggctttgc cattcagtgc
1080cgtattacca cggaagaccc tgctaagaac ttccaaccag ataccggtag aatagaagtg
1140taccgttctg caggtggtaa tggtgttaga ctggatggtg gtaacgccta tgcaggaaca
1200ataatctcac ctcattacga ctcaatgctg gtcaaatgct catgctccgg ttccacctac
1260gaaatcgttc gtagaaaaat gattcgtgca ttaatcgagt tcagaattag aggtgtcaag
1320accaacattc ccttcctatt gactcttttg accaatccag tatttattga gggtacatac
1380tggacgactt ttattgacga caccccacaa ctgttccaaa tggtttcatc acaaaacaga
1440gcccaaaaac ttttacatta cctcgccgac gtggcagtca atggttcatc tatcaagggt
1500caaattggct tgccaaaatt aaaatcaaat ccaagtgtcc cccatttgca cgatgctcag
1560ggcaatgtca tcaacgttac aaagtctgca ccaccatccg gatggaggca agtgctacta
1620gaaaaggggc cagctgaatt tgccagacaa gttagacagt tcaatggtac tttattgatg
1680gacaccacct ggagagacgc tcatcaatct ctacttgcaa caagagtcag aacccacgat
1740ttggctacaa tcgctccaac aaccgcacat gcccttgcag gtcgtttcgc cttagaatgt
1800tggggtggtg ccacattcga tgttgcaatg agatttttgc atgaggatcc atgggaacgt
1860ttgagaaaat taagatctct ggtgcctaat attccattcc aaatgttatt gcgtggtgcc
1920aatggtgtgg cttattcttc attgcctgac aatgctattg accatttcgt caagcaagcc
1980aaggataatg gtgttgatat atttagagtc tttgatgcct taaatgactt ggaacaattg
2040aaggtcggtg tagatgctgt gaagaaggca ggtggtgttg tagaagccac tgtttgtttc
2100tctggggata tgcttcagcc aggcaagaaa tacaatttgg attactactt ggaaattgct
2160gaaaaaattg tccaaatggg cactcatatc ctgggtatca aagatatggc aggtaccatg
2220aagccagcag ctgccaaact actgattgga tctttgaggg ctaagtaccc tgatctccca
2280atacatgttc acactcacga ttctgcaggt actgctgttg catcaatgac tgcgtgtgct
2340ctggcgggcg ccgatgtcgt tgatgttgcc atcaactcaa tgtctggttt aacttcacaa
2400ccatcaatca atgctctgtt ggcttcatta gaaggtaata ttgacactgg tattaacgtt
2460gagcatgtcc gtgaactaga tgcatattgg gcagagatga gattgttata ctcttgtttc
2520gaggctgact tgaagggccc agatccagaa gtttatcaac atgaaatccc aggtggtcaa
2580ttgacaaact tgttgtttca agcccaacaa ttgggtcttg gagaacaatg ggccgaaaca
2640aaaagagctt acagagaagc caattattta ttgggtgata ttgtcaaagt taccccaact
2700tcgaaggtcg ttggtgatct ggcacaattt atggtctcca ataaattaac ttccgatgat
2760gtgagacgcc tggctaattc tttggatttc cctgactctg ttatggattt cttcgaaggc
2820ttaatcggcc aaccatacgg tgggttccca gaaccattta gatcagacgt tttaaggaac
2880aagagaagaa agttgacttg tcgtccaggc ctggaactag agccatttga tctcgaaaaa
2940attagagaag acttgcagaa tagatttggt gatgttgatg agtgcgacgt tgcttcttat
3000aacatgtacc caagagttta tgaagacttc caaaagatga gagaaacgta tggtgattta
3060tctgtattgc caacaagaag ctttttgtct ccactagaga ctgacgaaga aattgaagtt
3120gtaatcgaac aaggtaaaac gctaattatc aagctacagg ctgtgggtga tttgaacaaa
3180aagaccggtg aaagagaagt ttactttgat ttgaatggtg aaatgagaaa aattcgtgtt
3240gctgacagat cacaaaaagt ggaaactgtt actaaatcca aagcagacat gcatgatcca
3300ttacacattg gtgcaccaat ggcaggtgtc attgttgaag ttaaagttca taaaggatca
3360ctaataaaga agggccaacc tgtagccgta ttaagcgcca tgaaaatgga aatgattata
3420tcttctccat ccgatggaca agttaaagaa gtgtttgtct ctgatggtga aaatgtggac
3480tcttctgatt tattagttct attagaagac caagttcctg ttgaaactaa ggcataa
3537463528DNAKluyveromyces marxianus 46atgtctaccc aaaacgatct ggccgggttg
cgtgataact cgaacctatt aggtgaaaag 60aacaagattc ttgttgccaa ccgtggtgaa
attccaatta gaatctttag aacggctcat 120gaactttcga tgaagactgt tgcgatctat
tcgcacgagg atagactatc tatgcacaga 180ttgaaggcag acgaagctta cgttattggt
gagccaggaa aatacactcc agttggtgcg 240tatttggcga tcgatgagat tatcaagatt
gctcaattgc acggagtgag cttcatccac 300cctggttatg ggttcttatc ggaaaactct
gagtttgcca agaaggtggc cgactctggt 360atcacgtggg ttggtcctcc agccgatgtg
atcgatgctg ttggtgacaa ggtttctgcc 420agaaacttgg ccgagagagc ggatgttcca
gtggttccag gtacgcctgg tccaatagag 480acagttgaag aagcagttga atttgtggag
aagtacggat acccagtcat catcaaggct 540gccttcggtg gtggtggtcg tggtatgaga
gttgttcgtg aaggtgatga tatcgccgat 600gctttccaaa gagccaagtc cgaagctgtt
actgctttcg gtaacggtac ttgtttcgtt 660gaaagattct tggacaagcc aaagcacatc
gaagttcagt tgttggctga tcactacggt 720aatgtcatcc atctattcga aagagactgt
tctgtgcaaa gaagacatca aaaggtcgtt 780gaagtagcgc cagccaagac tttgccagag
agcgtgcgta atgcaatctt gactgacgct 840gtcaagttgg ctaaggaggc aggatacaga
aatgctggta ccgctgaatt tttggtcgac 900aaccaaaaca gacactactt tattgaaatc
aacccaagaa ttcaagtcga acataccatc 960accgaagaaa ttaccggtat cgacattgtc
gccgcacaaa ttcaaatcgc agcaggtgct 1020tccttggaac aattgggact attgcaagat
agaatcacca cccgtggttt cgctattcaa 1080tgtcgtatca ctactgaaga tccttccaag
aacttccagc cagatactgg tcgtatcgat 1140gtttaccgtt ccgctggtgg taacggtgtc
agattggatg gtggtaacgc attcgctggt 1200tcggtcattt cacctcatta tgattccatg
ttggtcaaat gttcttgttc cggttccact 1260tacgaaatcg ttcgtcgtaa gatgttgcgt
gccttgatcg aattcagaat cagaggtgtg 1320aagacaaaca ttccattctt gctaacgttg
ttgactcatc ctgtgttcaa gtccggtgac 1380tactggacta ccttcatcga tgacactcca
caattgttcg aaatggtttc ttctcaaaac 1440agagcacaaa aactattgca ctacttggcc
gatcttgccg ttaacggttc atcgatcaag 1500ggtcaaattg gtctaccaaa gttaaagact
catcctacta tcccacattt gcataaggcc 1560gatggctcca ttctagatgt gtctgccaag
cctcctgccg ggtggagaga tgttctattg 1620caacacggcc cagaagaatt tgcaaagcaa
gttagaaagt tcaagggtac tttgctaatg 1680gacaccacct ggagagatgc tcatcaatct
ctattggcca ctagagtcag aacttacgat 1740ttggctgcca tcgctccaac tactgctcat
gctttgagcg gtgctttcgc tttggaatgt 1800tggggtggtg ccactttcga tgtctccatg
agattcttgc acgaagatcc atgggaacgt 1860ttgagaactt tgagaaagtt ggttcctaac
attccattcc aaatgttgct acgtggtgcc 1920aacggtgttg catactcttc tctaccagat
aacgctatcg accactttgt caagcaagca 1980aaggataacg gtgttgacat tttcagagtc
ttcgatgctc taaacgattt ggagcaattg 2040actgtcggtg ttgacgctgt caagaaggct
ggtggtgttg tcgaagctac catttgttac 2100tccggtgaca tgctagcacc aggtaagaag
tacaaccttg actactactt ggacattgtt 2160gaacaagtgg ttaagagagg tacccatatt
cttggtatca aggatatggc aggtactttg 2220aagccatctg ctgctaagct cttgatcggt
tctatcagaa caaagtaccc tgacttgcca 2280attcacgtcc atacccatga ctccgccggt
accggtgttg cttccatggc tgcatgtgct 2340ttcgctggtg ctgatgttgt tgatgttgca
accaactcta tgtctggtat gacttctcaa 2400ccatctgtca atgcactatt ggctgctctt
gatggtgaaa tcgactgtaa tgtcaacgtc 2460agctacatca gtcagctaga tgcttactgg
gctgaaatga gactattgta ctcatgtttc 2520gaagccgact tgaagggtcc tgatccagaa
gtttacgtcc atgaaattcc aggtggtcaa 2580ttgaccaact tgctcttcca agcccaacaa
ttgggtcttg gtgagcaatg ggctgaaacc 2640aagagagctt accgtgaagc aaacctgttg
ttgggtgatg ttgttaaggt cactccaaca 2700tccaaggttg tcggtgattt ggctcaattc
atggtcacta acaagttgac ctcggatgat 2760gttaagagat tagcttcatc tttggatttc
ccagactccg tcatggactt ctttgaaggt 2820ttaatcggtc aaccatacgg tggtttccca
gaacctctaa gatctgatgt tttgaagaac 2880aagagaagaa agttgaccaa gagaccaggt
ttggaattgg ctccattcga tttggaaggc 2940attaaggaag atttgactaa cagatttggt
gacattgacg actgtgatgt tgcttcttac 3000aacatgtatc caaaggtcta cgaagatttc
cgtaagatca gagaaaagta cggtgatcta 3060tctgttttgc caaccaagaa cttcttgtct
ccaccttcaa tcggtgaaga aatcgtcgtt 3120acaattgaac aaggtaagac tttgatcatt
aagccacaag ctattggtga tttgaacaag 3180gagactggta tcagagaagt ttacttcgaa
ttgaacggtg aattgagaaa ggtctctgtt 3240gctgacagat ctcaaaaggt tgaaacgatc
tccaagccaa aggctgacgc ccacgatcca 3300ttccaagttg gttctccaat ggcaggtgtt
gttgtcgaag tcaaggtaca caagggttct 3360ttgatctcca agggccaacc agtcgctgtc
ctaagtgcca tgaagatgga aatggttatc 3420tcctccccat ctgatggtca agtcaaggaa
gtgcttgtca aggatggtga aaacgttgac 3480gcttctgact tgctcgttgt tttggaagaa
gctccagcta aagaataa 35284722DNAArtificial
SequenceSynthetic - Primer 47gcaactgatg ttcacgaatg cg
224820DNAArtificial SequenceSynthetic - Primer
48ttgccgttgc agcaaatctc
20492652DNAEscherichia coli 49atgaacgaac aatattccgc attgcgtagt aatgtcagta
tgctcggcaa agtgctggga 60gaaaccatca aggatgcgtt gggagaacac attcttgaac
gcgtagaaac tatccgtaag 120ttgtcgaaat cttcacgcgc tggcaatgat gctaaccgcc
aggagttgct caccacctta 180caaaatttgt cgaacgacga gctgctgccc gttgcgcgtg
cgtttagtca gttcctgaac 240ctggccaaca ccgccgagca ataccacagc atttcgccga
aaggcgaagc tgccagcaac 300ccggaagtga tcgcccgcac cctgcgtaaa ctgaaaaacc
agccggaact gagcgaagac 360accatcaaaa aagcagtgga atcgctgtcg ctggaactgg
tcctcacggc tcacccaacc 420gaaattaccc gtcgtacact gatccacaaa atggtggaag
tgaacgcctg tttaaaacag 480ctcgataaca aagatatcgc tgactacgaa cacaaccagc
tgatgcgtcg cctgcgccag 540ttgatcgccc agtcatggca taccgatgaa atccgtaagc
tgcgtccaag cccggtagat 600gaagccaaat ggggctttgc cgtagtggaa aacagcctgt
ggcaaggcgt accaaattac 660ctgcgcgaac tgaacgaaca actggaagag aacctcggct
acaaactgcc cgtcgaattt 720gttccggtcc gttttacttc gtggatgggc ggcgaccgcg
acggcaaccc gaacgtcact 780gccgatatca cccgccacgt cctgctactc agccgctgga
aagccaccga tttgttcctg 840aaagatattc aggtgctggt ttctgaactg tcgatggttg
aagcgacccc tgaactgctg 900gcgctggttg gcgaagaagg tgccgcagaa ccgtatcgct
atctgatgaa aaacctgcgt 960tctcgcctga tggcgacaca ggcatggctg gaagcgcgcc
tgaaaggcga agaactgcca 1020aaaccagaag gcctgctgac acaaaacgaa gaactgtggg
aaccgctcta cgcttgctac 1080cagtcacttc aggcgtgtgg catgggtatt atcgccaacg
gcgatctgct cgacaccctg 1140cgccgcgtga aatgtttcgg cgtaccgctg gtccgtattg
atatccgtca ggagagcacg 1200cgtcataccg aagcgctggg cgagctgacc cgctacctcg
gtatcggcga ctacgaaagc 1260tggtcagagg ccgacaaaca ggcgttcctg atccgcgaac
tgaactccaa acgtccgctt 1320ctgccgcgca actggcaacc aagcgccgaa acgcgcgaag
tgctcgatac ctgccaggtg 1380attgccgaag caccgcaagg ctccattgcc gcctacgtga
tctcgatggc gaaaacgccg 1440tccgacgtac tggctgtcca cctgctgctg aaagaagcgg
gtatcgggtt tgcgatgccg 1500gttgctccgc tgtttgaaac cctcgatgat ctgaacaacg
ccaacgatgt catgacccag 1560ctgctcaata ttgactggta tcgtggcctg attcagggca
aacagatggt gatgattggc 1620tattccgact cagcaaaaga tgcgggagtg atggcagctt
cctgggcgca atatcaggca 1680caggatgcat taatcaaaac ctgcgaaaaa gcgggtattg
agctgacgtt gttccacggt 1740cgcggcggtt ccattggtcg cggcggcgca cctgctcatg
cggcgctgct gtcacaaccg 1800ccaggaagcc tgaaaggcgg cctgcgcgta accgaacagg
gcgagatgat ccgctttaaa 1860tatggtctgc cagaaatcac cgtcagcagc ctgtcgcttt
ataccggggc gattctggaa 1920gccaacctgc tgccaccgcc ggagccgaaa gagagctggc
gtcgcattat ggatgaactg 1980tcagtcatct cctgcgatgt ctaccgcggc tacgtacgtg
aaaacaaaga ttttgtgcct 2040tacttccgct ccgctacgcc ggaacaagaa ctgggcaaac
tgccgttggg ttcacgtccg 2100gcgaaacgtc gcccaaccgg cggcgtcgag tcactacgcg
ccattccgtg gatcttcgcc 2160tggacgcaaa accgtctgat gctccccgcc tggctgggtg
caggtacggc gctgcaaaaa 2220gtggtcgaag acggcaaaca gagcgagctg gaggctatgt
gccgcgattg gccattcttc 2280tcgacgcgtc tcggcatgct ggagatggtc ttcgccaaag
cagacctgtg gctggcggaa 2340tactatgacc aacgcctggt agacaaagca ctgtggccgt
taggtaaaga gttacgcaac 2400ctgcaagaag aagacatcaa agtggtgctg gcgattgcca
acgattccca tctgatggcc 2460gatctgccgt ggattgcaga gtctattcag ctacggaata
tttacaccga cccgctgaac 2520gtattgcagg ccgagttgct gcaccgctcc cgccaggcag
aaaaagaagg ccaggaaccg 2580gatcctcgcg tcgaacaagc gttaatggtc actattgccg
ggattgcggc aggtatgcgt 2640aataccggct aa
2652502643DNAMannheimia succiniproducens
50atgacagaag aatatttaat gatgcgtaat aacatcaata tgctggggcg ctttttgggc
60gaaactattc aggaggcgca aggtgacgat attctcgaac tgattgaaaa tatccgcgta
120ctgtcccgca attcccgtag cggcgatgac aaagcccggg cggcattatt agacaccctt
180tccactattt cggcggataa tattattccg gttgcccgcg ctttcagcca gtttctgaac
240ctgacaaatg tggcggaaca atatcaaacc atgtctcgct cccatgaaga taaggtttct
300gcggaacgtt ccactgctgc gctgttcgcc cgcctgaaag aacaacatgt ttctcaggaa
360gaaatcatta aaaccgtaca gaaactgttg attgaaatcg tccttaccgc tcacccgacg
420gaagttaccc gccgttcatt aatgcacaaa caggttgaaa tcaacaaatg tctggctcag
480ctggatcata cggatttaac cgccgaagaa caaaaaaata ttgagtataa attacttcgt
540cttatcgccg aagcctggca taccaatgaa atccgtacca atcggccgac acctctggaa
600gaagccaaat ggggttttgc cgttatcgaa aacagtttat gggaaggttt gcccgccttt
660atccgcaaac ttaacgatgc cgccgtcgaa catttaaatt atgctttgcc ggtagacctc
720acaccggtac gcttctcttc ctggatgggc ggtgaccgtg acggcaaccc cttcgttacc
780gcaaaaatta cccgggaagc gctgcaactt gcgcgctgga aagcggcgga tttattttta
840accgatattc aggaactctg cgacgagttg tcaatgacac aatgcactgc ggaattccga
900gaaaaatacg gtgatcattt agaaccctat cgtgtagttg tgaaggattt acgcagcaaa
960ttaaaaaata cgctggatta ttacaacgat atacttgcgg gtcgcattcc gccgtttaaa
1020caagatgaaa tcatcagtga agaccaacaa ctctggcaac cgctttatga ctgttatcaa
1080tccctaaccg cctgcggtat gcgtattatt gccaatggat tattgctgga taccttacgc
1140cgcgttcgtt gtttcggcgt cacattactg cgtttagata tccgtcagga aagcacccgc
1200catagcgacg ccatcggcga aattacccgc tacatcggtt taggcgatta cagccaatgg
1260acagaagatg acaaacaagc cttcctgatc cgggaattaa gttcccgtcg tccgctaatt
1320ccccataact ggacgccttc ggaacacact cgggaaattt tagacacctg taaagtcatt
1380gcaaaacagc cggaaggcgt tatttcctgc tatatcattt ccatggcgcg caccgcttcc
1440gatgttttgg cggtgcattt attattgaaa gaagcgggca tttcatacca tctgccggta
1500gttcctctat ttgaaacatt ggacgacctg gacgcttcta aagaagtgat gacgcaactg
1560tttaacgtag gctggtatcg cggcgtaatc aaaaaccgcc aaatgatcat gatcggctat
1620tccgatagcg ccaaagatgc gggcatgatg gcggcctcat gggcgcaata ccgggcgcag
1680gacgctttag tcaaactttg cgaacaaacc ggcatcgaac ttaccctctt ccacggccgc
1740ggcggcaccg taggacgtgg cggtgcaccg gctcacgccg cattattatc ccaaccgcca
1800cgttctctga aaaacggctt acgggtaacc gaacaagggg aaatgatccg cttcaaactg
1860ggattaccgg ctatcgccgc agaaagtctg gatctctacg ccagcgccat tcttgaggcc
1920aacctcctgc cgccgccgga accgaaagcc agctggtgcc gggtaatgga cgaacttgcc
1980gtcgcttctt gcgaaatcta tcgcaatgtg gtgcgcggcg ataaagattt tgtgccttac
2040ttccgcagcg ccacaccgga acaggaactg gcaaaactgc ctttaggttc ccgaccggca
2100aaacgcaatc cgaacggcgg cgttgaaagc ctgcgtgcca ttccctggat cttcgcctgg
2160atgcaaaacc gcctgatgct gcccgcctgg ctcggtgccg gcgcctcaat ccgtcaggcg
2220atggaaagcg gcaaagcggc ggtgattgaa gaaatgtgca accattggcc gtttttcaat
2280acccgaatcg gcatgcttga aatggtattc agtaaaaccg atagctggct gtccgaatat
2340tacgaccagc gtttagtgaa aaaagagctt tggtatttag gcgaatcgct gcgcaaacag
2400ttaagcgaag atatcgctac cgtgttacgg ctttccggca aaggcgatca attaatgtcg
2460gatttgcctt gggtggcgga atctattgca ctgcgtaacg tttacaccga cccgttaaac
2520ttattgcaag tggaattatt gcgtcgtttg cgagcggatc ccgaacatcc gaatccggat
2580atcgagcaag cgctgatgat caccattacc ggtatcgccg cgggtatgcg taatacgggt
2640tag
26435121DNAArtificial SequenceSynthetic - Primer 51acggcagtat accctatcag
g 215220DNAArtificial
SequenceSynthetic - Primer 52aatgatccat ggtccgagag
205322DNAArtificial SequenceSynthetic - Primer
53gaagagacgt acaagatccg cc
225447DNAArtificial SequenceSynthetic - Primer 54ggataaaagt attacatacg
tacaggattg tgtattagtg tatttcg 475522DNAArtificial
SequenceSynthetic - Primer 55taggaatggt gcatcatcca ac
225644DNAArtificial SequenceSynthetic - Primer
56ccaaccaaac acgcgtacaa tgaacgaaca atattccgca ttgc
445723DNAArtificial SequenceSynthetic - Primer 57ggacacggag aacccattta
ttc 23581029DNAIssatchenkia
orientalis 58atgtccaatg ttaaagtagc tctactaggt gccgctggtg gtatcggcca
accacttgct 60ctattactta agcttaatcc aaacataacc catttggcac tctatgacgt
tgtgcatgtt 120cctggagtgg ctgccgacct acaccatata gacacagatg tagtgattac
ccaccatttg 180aaagatgaag acggtacggc cttggcaaac gccctcaagg acgctacgtt
tgttattgtc 240cccgccggtg ttccgagaaa gcccggcatg actagaggtg atttgttcac
aattaatgcc 300ggtatatgtg ccgaattggc taatgctatt agtttgaacg ctcctaatgc
attcaccctt 360gtcattacca atccggtcaa ctcgaccgtt cctatattta aggaaatatt
tgctaaaaat 420gaagccttca atccaaggag actgtttggt gtaactgctc tagatcatgt
tagatcaaat 480acttttctct cggaattaat tgacggtaaa aatccccaac attttgatgt
cactgttgtt 540ggcggacact ctggtaactc aattgtcccc ctattctccc ttgttaaggc
tgccgaaaat 600ttagacgatg aaattataga tgccttgatt catagagttc aatacggtgg
agatgaagtt 660gtggaagcaa agagcggtgc gggctcggca actctttcaa tggcttatgc
cgctaacaag 720ttcttcaata tattgcttaa tggatacttg ggtttgaaga agacaatgat
ttcaagttat 780gtctttttag acgattcaat caacggcgtc cctcaattaa aggaaaattt
gtctaaactt 840ttgaaaggtt ccgaggttga gttaccaagt tatttggctg ttccaatgac
ctatggtaaa 900gaaggtattg aacaagtctt ttacgattgg gtgtttgaaa tgtcaccaaa
ggaaaaggaa 960aacttcatta cagcgattga atacattgat caaaatattg aaaaaggtct
gaattttatg 1020gtacgttaa
1029591011DNAIssatchenkia orientalis 59atggtcaagg tgactatttt
aggcgctgcc ggtggaattg gacaaccact ctcattgtta 60ttgagactta atccatggat
tgacgaattg gccttgtttg atattgtcaa tacccccggc 120gtgagttgtg atttgtcgca
tattcctgca tcacaggttg ttaatggcta tgctccgaaa 180tcgaaatcag atacagagac
aatcaagact gccttgaaag gtgctgatat tgttgttatt 240cctgcaggaa ttccacgtaa
acctggtatg acaagaaacg atctctttaa aatcaatgcc 300ggaatcgtta agagtttgat
tcatagtgca ggaaccactt gccctgatgc atttatttgt 360gtcatttcga accctgtcaa
ctcgacagtt ccaattgccg ttgaagaact aaagcgtttg 420aatgttttta atccacataa
agttttcggt attaccacat tggacaattt cagattagaa 480gaatttctga gtggagaact
tggtggaatt gtcaaaccaa atgatttata tggtgatgta 540gttgctatag gtggccattc
gggcgactct atagtaccga tcttgaattc gtggaatttg 600aatttcatca atgatggaga
ttcttataac aatttggtca agagggtcca gtttggaggc 660gatgaggttg tcaaggcaaa
ggacgggaaa ggttcggcta cattgtcaat ggctacagct 720gcatacaggt ttgtcaacaa
cctcttggac gccattgtca ataacaagaa agtcaaggaa 780gtggcctttg tgaaaatcga
ccaattgcca actacaaggg ttccttattt tgttgttgat 840gaaactcagt attttagtct
acccattatt ctcggtagac aggggattga gagggtcacg 900ttcccagaat ctctgacaga
gcaagaggtg agaatgacaa agcacgctgt tgctaaagtt 960aaagttgacg ttaataaagg
cttcaatttt gtccatggcc caaaactgta a 1011601005DNAIssatchenkia
orientalis 60atgttctcca gaatctctgc tagacaattc tcctcctctg ctgcttccgc
ttacaaggtc 60accgttttag gtgctgcagg tggtattggc caaccactat ctcttttgat
gaagttgaac 120cacaaggtca ccaacttatc cttgtacgac ttgagattgg gtgctggtgt
tgccactgac 180ttgtcccaca ttccaaccaa ctccgttgtc aagggctatg gtccagaaaa
caatggtttg 240aaggacgcct tgaccggctc cgatgttgtt cttattccag ctggtgttcc
aagaaaacca 300ggtatgacta gagacgatct cttcaacacc aatgcatcga ttgtcagaga
cttggcaaag 360gctgctgcag accactgtcc aaacgccgtc ttgttgatca tttcaaaccc
tgtcaactca 420actgtcccaa ttgttgctga ggttttgaaa tcaaagggcg tctacaaccc
aaagaagttg 480tttggtgtca ccactttgga cgttttgaga tcctcgagat tcttgagtga
agtcgtcaac 540accgacccaa ccaccgaaac cgtcactgtt gttggtggcc actctggtgt
caccattgtt 600cctttaatct cccaaaccaa acacaaggac ttgccaaagg aaacctacga
agcattggtc 660cacagaatcc aattcggtgg tgatgaggtt gtcaaggcca aggacggtgc
aggttccgct 720accttgtcca tggcccaagc cggtgcaaga atggcctcct ccgtcttgaa
gggtttggct 780ggtgaagttg acattgtcga accaaccttt attgactctc cattgttcaa
gtccgaaggt 840gtcgaattct tctcctccag agtcaccctt ggtccagaag gtgtccaaga
agtccaccca 900ttgggcgtct tatctactgc tgaagaagaa atggttgcta ctgctaagga
aaccttgaag 960aagaacatcc aaaagggtgt cgactttgtc aaggctaacc cataa
1005611071DNAZygosaccharomyces rouxii 61atgcctcatt ctatcaacgg
tgatgttaaa atcgcagtat tgggagctgc aggtggtatt 60ggacaatcac tttcgctact
tttgaagacc cagttaacta gagaattgcc aaatcatcgt 120catgctcagt tagccctata
cgacgtcaat gctgacgcag ttcggggtgt cgcagccgac 180ttatctcata ttgatacagg
tgttactgta acaggatatg aaggtgatag gatcggcgaa 240gcgttagaag gtacggatat
cgtcctgatc cctgcaggtg ttcctagaaa acctggtatg 300acaagagaag atctattggt
tgttaatgca aagattgtca agagtatagg gtcatcgatt 360gcgcagcatt gcgatttaaa
caaagtgttc attctactaa tctcaaaccc aataaattcc 420cttgttccag tactcgttaa
ggaactggaa tctaaatctc aaggcactca agttgagaga 480cgtgtgcttg gtctcactaa
gttggattcc gttagagcaa gtgcattttt gcacgaggtt 540acgattaaac atggtctaaa
acctaaatct aatactcttg atgatgttcc agtagttggt 600ggtcattctg gtgaaactat
tgtaccttta ttctcccaag cccctaatgg taaccgttta 660tcacaggacg ccttggaagc
tcttgttcag cgtgtacaat tcggaggcga tgaagtcgtt 720agagctaaaa atggtgctgg
tagtgccact ctgtgtatgg cccatgccgc ttatactgtt 780gctgcatctt ttattccact
tatcactggt caaaagcgtt ccatctctgg tacattctat 840gttgccttaa aggatgctca
aggtcagcct atcaacagta gcgctaagcg tcttttgggc 900tcaatcaacg atttaccata
ttttgcagtg ccattggaga ttacttctca gggtgtggat 960gaattagata ccagcgtttt
ggaaagaatg accaagtatg agagagaaag actcttagct 1020ccttgtctgg gtaaattgga
aggtggtatc agaaacggtt tgagtttgta a 1071621017DNAKluyveromyces
marxianus 62atgcttagag ccctaactcg ccgtcaattt tcctccactg ccttcaaccc
atacaaggtc 60accgttctag gtgctggtgg tggtattggt caaccattgt ccttgttgtt
gaagctaaac 120cacaaggtca ctgacttgag actatacgac ttgaagggtg ccaagggtgt
cgctgctgac 180ttgtctcaca tcccaaccaa ctctaccgtt actggttaca ctccagaatc
caaggactct 240caagaagaat tggctgctgc tttgaaggac actgaggttg ttttgatccc
agctggtgtg 300ccaagaaagc caggtatgac ccgtgacgat ttgttcgcca tcaatgccgg
tattgtcaga 360gatttggcca cttccatcgc caagaacgct ccaaacgccg ccatcttggt
catctccaac 420ccagtcaact ctactgtccc aatcgtcgcc gaggtcttga agcaaaacgg
cgtctacaac 480ccaaagaagt tgttcggtgt caccactttg gacgttatcc gtgcctccag
attcatctcc 540gaggttagag gtaccgaccc aaccactgag cacgtgaccg tcgtcggtgg
tcactccggt 600atcaccatct tgccgctagt gtcccagacc aagcacaagt ccgtcatcaa
gggcgaggaa 660ttggacaact tgatccacag aatccaattc ggtggtgacg aagtcgtcca
ggcaaagaac 720ggtgctggtt ctgccacttt gtccatggcc caagccggtg cccgtttcgc
taacagcgtt 780ctaagcggtt tcgaaggtga aagagacgtc attgagccaa ctttcgtcga
ctccccattg 840ttcaaggacg aaggtatcga attcttcgct tccccagtca ctttgggccc
agaaggtgtc 900gaaaagatcc acggtttggg tgtcttgtcc gacaaggaag aacaaatgtt
ggccacttgt 960aaggaaacct tgaagaagaa catcgaaaag ggtcaaaact ttgtcaagca
aaactaa 1017631017DNAKluyveromyces marxianus 63atggttagcg ttgcagtatt
aggatcatcc ggaggcattg gccaaccact ctcactcttg 60ttgaagctgg accctcgcgt
gtccagcttg agattgtacg acttgaagat gtcccacggg 120atcgccaccg atttgtcgca
catggactcc aactccatct gcgagggctt caacaccgac 180gagatcgcgc tcgcgctcaa
gggcgcccag atcgtcgtca tccccgcggg tgtcccaaga 240aagcccggga tgtcacgtga
cgaccttttc aagatcaacg ccaagatcat caagtcgttg 300gcgttgcaaa tagccgagca
cgcgcccgag gcgcgcgtcc tcgtgatctc gaacccggtc 360aactccttgg tgcccattgt
gtacgagact ttgaagagcg tcggcaagtt cgagccgggt 420aaagtgatgg gaattaccac
attggacatt atccgctcac acacgttcct ggtggacgtc 480ttgggccgca aggcgtacag
cgtcgagaag ttgcgcagcg cggttactgt ggtgggcggc 540cactcgggcg agaccattgt
tccgattttc accgaccaga agttctacag gcgtctcaga 600gacagagagc tctatgacgc
gtacgtgcat agggtccaat tcggcggaga cgaggtcgtg 660aaggccaagg acggcagcgg
tagtgctact ttgtctatgg cctgggcggg ttacagtttt 720gtgaagcagt tgctcaacag
cttgcaccta gaaacaggcg aagacgtgca tccgatccca 780acgtttgtgt acttgccggg
tttaccgggc gggaaggagc tccagcagaa gttgggcacc 840tctgttgagt tttttgccgc
gcccgtgaag ctttccaagg gtattgtggt tgaagttgag 900cacgactggg tcgacaagtt
gaacgatgcc gagaagaagt tgattgcaaa gtgtcttcca 960atccttgaca agaacatcaa
gaagggtctc gccttttcgc agcagacaaa gttgtga 1017641125DNAKluyveromyces
marxianus 64atgccagcag tatcatatga tgtccagcaa cgggatatcc tcaagatcgc
agttctaggg 60gcggcaggcg gtattggcca atccttgtcg ctcttgttga agtcgaacgc
ttcttttttg 120ttaccacgtg actcgtcaag acacataagc ctagcgctat acgacgtgaa
caaagatgcc 180atcgtgggca cagcagcaga cttgtcacac atagacaccc ctatcaccac
cactccacac 240tacccaaatg atgggaatgg cggtatcgca cggtgcttgc aagatgcaga
catggtcatc 300atcccagcag gtgtgcccag aaaacccggt atgtcacgtg atgacctaat
cggtgtcaac 360gccaagatca tcaagtcgct aggaaacgac atcgcagagt actgtgactt
gtctaaagtg 420catgtattgg ttatttcgaa cccagtgaac tcgttggtcc cactgatggt
gtcgactttg 480gcaaacagcc cacacagtgc gaacacaaac atcgaggcta gagtgtacgg
gatcacccat 540ttggacctag tgagagcttc cacctttgtg caacagctaa actctttcaa
atcaaataac 600gcacctgaca ttccggtcat tggtggtcat tccggagata ccatcatccc
cgttttttcc 660gtcttgaatc accgcgcttc taactccgga tacgctaatt tgctagataa
tggcgttagg 720caaaagttgg tccacagagt tcaatatggt ggggacgaaa tcgtccaagc
aaagaacggt 780aacgggagcg cgacattatc catggcatac gcgggcttca aaatcgcagc
acaattcatc 840gaccttttgg tcggaaatat ccgcactatc gaaaatattt gcatgtatgt
tccgctcact 900aacaggtata ataccgagat cgccccaggc tctgacgaat taagatcaaa
gtacatcaac 960ggaacccttt atttctcgat tccactttcc atcggaataa acggtatcga
aagagtccac 1020tacgagatca tggaacatct agacagctac gagcgtgaga cgctactacc
gatctgcttg 1080gaaactctaa agggtaatat tgacaagggt ctaagcttgg tataa
11256522DNAArtificial SequenceSynthetic - Primer 65ggaggaatgg
aacagtgatg ac
2266939DNAEscherichia coli 66atgaaagtcg cagtcctcgg cgctgctggc ggtattggcc
aggcgcttgc actactgtta 60aaaacccaac tgccttcagg ttcagaactc tctctgtatg
atatcgctcc agtgactccc 120ggtgtggctg tcgatctgag ccatatccct actgctgtga
aaatcaaagg tttttctggt 180gaagatgcga ctccggcgct ggaaggcgca gatgtcgttc
ttatctctgc aggcgtagcg 240cgtaaaccgg gtatggatcg ttccgacctg tttaacgtta
acgccggcat cgtgaaaaac 300ctggtacagc aagttgcgaa aacctgcccg aaagcgtgca
ttggtattat cactaacccg 360gttaacacca cagttgcaat tgctgctgaa gtgctgaaaa
aagccggtgt ttatgacaaa 420aacaaactgt tcggcgttac cacgctggat atcattcgtt
ccaacacctt tgttgcggaa 480ctgaaaggca aacagccagg cgaagttgaa gtgccggtta
ttggcggtca ctctggtgtt 540accattctgc cgctgctgtc acaggttcct ggcgttagtt
ttaccgagca ggaagtggct 600gatctgacca aacgcatcca gaacgcgggt actgaagtgg
ttgaagcgaa ggccggtggc 660gggtctgcaa ccctgtctat gggccaggca gctgcacgtt
ttggtctgtc tctggttcgt 720gcactgcagg gcgaacaagg cgttgtcgaa tgtgcctacg
ttgaaggcga cggtcagtac 780gcccgtttct tctctcaacc gctgctgctg ggtaaaaacg
gcgtggaaga gcgtaaatct 840atcggtaccc tgagcgcatt tgaacagaac gcgctggaag
gtatgctgga tacgctgaag 900aaagatatcg ccctgggcga agagttcgtt aataagtaa
9396720DNAArtificial SequenceSynthetic - Primer
67cacagaggtg cagtaacgag
20681014DNARhizopus delemar 68atgtttgccg cctctcgtgt tttctctatt gctgccaagc
gttctttctc tacttctgct 60gctaatcttt ccaaggttgc cgttcttggc gctgctggtg
gtattggtca acccttgtct 120ttgttgttga aggaaaaccc tcacgtcacc cacctttctc
tttatgatat tgtcaacact 180cctggtgtcg ctgccgatct tagccacatc aacaccaact
ccaaggtcac tggccacacc 240cctgaaaacg atggtttgaa gactgctctt gaaggtgctc
acgttgttgt tattcctgct 300ggcgttcctc gtaagcctgg tatgacccgt gatgatttat
tcaacaccaa tgcttccatt 360gttcgtgacc ttgctgaagc tgctgccaag cactgtcccg
acgctcattt ccttatcatc 420tccaaccctg tcaactccac tgttcccatc tttgccgaaa
ccttaaagaa ggctggtgtc 480ttcaacccta agcgtttgta tggtgtcacc actcttgatg
tcgtccgtgc ctctcgcttc 540gttgccgaag tcaagaactt ggaccccaac gatgtcaagg
ttaccgttgt cggtggtcac 600tctggtgtga ctattgtccc tctcctctct caaaccggtc
tcgaattcag caaggaagaa 660ctcgatgcct tgacccaccg tatccaattc ggtggtgatg
aagtcgttca agccaagaat 720ggtactggtt ctgtcactct ctccatggcc tttgccggtg
ctcgtttcgc caactctgtc 780ttggaagcca ctgttggtgg taagaagggt gttgttgaac
cctcctttgt caagtctgat 840gtctttgcca aggatggtgt tgaatatttc tctaccaaca
ttgaacttgg tcctgaaggt 900gttgaaaaga tcaacgaact cggtcaaatc tctgactatg
aaaaggaact tattgctaag 960gccgttcctg aattaaagaa gaacattgcc aagggtaaca
gctttgttca ataa 10146931DNAArtificial SequenceSynthetic - Primer
69caagagtatc ccatctgaca ggaaccgatg g
31701458DNAIssatchenkia orientalis 70atgttagctg ctagatcatt aaaggcaaga
atgtcaacaa gagctttctc aactacctca 60attgcaaaaa gaatcgaaaa agatgcattt
ggtgacattg aagtcccaaa tgagaaatat 120tggggtgctc aaactcaaag atctttacaa
aatttcaaaa ttggtggtaa gagagaagtt 180atgccagaac caatcatcaa atcttttggt
attttaaaga aggctactgc taagatcaat 240gctgagtctg gtgctttaga cccaaagtta
tctgaagcca tccaacaagc tgcaaccgaa 300gtttatgaag gtaaactaat ggaccatttc
ccattagttg tctttcaaac cggttctggt 360actcaatcta acatgaatgc caatgaagtc
atctctaata gagcaattga aatcttgggt 420ggtgaattag gctctaaaac tccagtccat
cctaatgatc atgttaatat gtcccaatct 480tctaatgata ctttccctac tgtcatgcat
attgcagcag ttacagaagt ttcatcccat 540ttattaccag aattaactgc actaagagat
gcattgcaaa agaaatccga tgaatttaag 600aatattatca aaatcggtag aacccattta
caagatgcaa ctcctttaac tttaggtcaa 660gaattttctg gttatgttca acaatgtact
aatggtatca aaagaatcga aattgctctt 720gaacatttga gatacttagc tcaaggtggt
actgccgttg gtactggtct taacaccaag 780aaaggttttg ctgaaaaggt tgcaaatgaa
gtcactaaat tgactggttt acaattctat 840accgctccaa ataaattcga agcccttgca
gctcacgatg ctgttgttga aatgtctggt 900gctttgaata ccgttgcagt ctcattattc
aaaatcgctc aagatatcag atatttgggt 960tccggcccaa gatgtggtta tggtgaattg
gctttaccag aaaatgaacc aggttcttcc 1020atcatgccgg gtaaagttaa cccaactcaa
aacgaagctt tgactatgct ttgtacccaa 1080gtctttggta accactcttg tattaccttt
gcaggtgctt caggtcaatt cgaattgaat 1140gtctttaagc cagttatgat ctccaacttg
ttatcttcta ttaggttatt aggtgatggt 1200tgtaattctt ttagaatcca ctgtgttgaa
ggtatcattg caaataccga caagattgat 1260aaattactac atgaatctct catgttagtt
actgctttga acccacacat tggttacgat 1320aaggcttcca agattgcaaa gaatgcacac
aagaagggct tgacattgaa acaatctgca 1380ttggaattag gttacttgac cgaagaacaa
ttcaatgaat gggttagacc agaaaacatg 1440attggtccaa aggattaa
14587127DNAArtificial SequenceSynthetic
- Primer 71catcactgtt aaaggaatgg gtaaatc
277227DNAArtificial SequenceSynthetic - Primer 72gctggagaat
agatcttcaa cgccccg
277330DNAArtificial SequenceSynthetic - Primer 73gagaacttat acgcaccaga
acgccttttg 307431DNAArtificial
SequenceSynthetic - Primer 74caagagtatc ccatctgaca ggaaccgatg g
31751413DNASaccharomyces cerevisiae 75atgtctctct
ctcccgttgt tgttattgga accggtttgg ccgggctggc tgctgccaac 60gaattggtta
acaagtataa catccctgta accatcctcg aaaaggcttc ctcgatcggt 120gggaactcta
tcaaggcctc cagtggtatt aacggtgctt gcaccgagac tcaacgtcac 180ttccacatcg
aggactcccc acgcttattt gaagatgaca ccatcaagtc tgctaaaggt 240aaaggtgtcc
aagagttaat ggctaagttg gccaatgatt ctcccctggc tattgaatgg 300ttgaaaaacg
aatttgattt gaaattggac ctattggctc aattgggtgg ccactctgtg 360gcaagaactc
acagatcgtc tgggaagttg cctccaggtt tcgaaattgt ttctgcctta 420tctaacaatt
tgaagaaatt agctgagact aaaccagagt tagttaagat taacttagac 480agtaaagtcg
tagacatcca tgaaaaggat ggctccattt ctgctgtagt gtacgaggat 540aagaatggcg
aaaagcacat ggtgagtgct aacgatgtcg ttttttgttc tggagggttt 600ggcttttcta
aggaaatgct taaagaatat gcacccgaac tggtgaactt gccaacgaca 660aacgggcaac
aaacaactgg tgatggtcaa aggcttctgc agaagttagg cgctgatctg 720attgacatgg
accaaattca agttcatcca actgggttca ttgatccaaa tgaccgtagc 780tcaagctgga
aattcttggc tgccgaatcc ttaagaggtc ttggtggtat cctattaaac 840cctattaccg
gtagaagatt tgtcaacgaa ttgaccacaa gagatgtagt cactgcagct 900attcaaaagg
tttgtcctca agaggataac agagcactat tggttatggg cgaaaaaatg 960tacacagatt
tgaagaataa tttagatttt tacatgttca agaaacttgt acagaaattg 1020acattatctc
aagttgtgtc tgaatataat ttaccaatca ctgtcaccca attatgcgag 1080gaattgcaaa
catactcttc gttcactacc aaggctgatc cgttgggacg taccgttatt 1140ctcaacgaat
ttggctctga cgttactcca gaaaccgtgg tttttattgg tgaagtaaca 1200ccggttgtcc
atttcaccat gggtggtgct agaatcaatg tcaaggctca agtcattggc 1260aagaacgacg
aaaggctact aaaaggcctg tacgcggccg gtgaagtttc tggcggtgtt 1320catggcgcca
ataggttggg tggttcaagt ttgttagaat gcgttgtctt tgggagaact 1380gcagctgaat
ctattgccaa tgaccgcaag taa
1413761413DNASaccharomyces mikatae 76atgtcatctt ctccagttgt cgttattggt
acaggcttgg caggtttggc aactgctaat 60gagttagtca ataagtacaa cattcctgtt
accattttgg aaaaggcatc ctctatcggt 120ggcaattcca ttaaggcatc ttctggtatc
aatggtgcat gtacagaaac ccaacgtcat 180tttcacattg aagatactcc tagacttttt
gaagatgata ctgttcaatc cgccaagggc 240aaaggtgttc aagagttaat gggtaaactt
gctaatgatt ctccacttgc tattgaatgg 300ttaaagactg aattcgactt aaagttagac
cttttggctc agttaggtgg tcactctgtt 360gctagaactc atagatcttc cggtaaactt
ccaccaggtt tcgaaatcgt ttccgcctta 420tccaataact tgaaaaagtt ggcagaaacc
aagccagagt tagttaagat taacttagac 480tcaaaggtcg ttgacatcca caaaaaggac
ggctctattt ccgcaattgt ctatgatgac 540aaaaacggtg aaagacatac cttatccact
tcaaatgttg ttttctgctc tggtggtttc 600ggtttttcta aggaaatgtt aaacgagtat
gctccacaat tggtcaactt gccaaccact 660aacggtcagc aaacaacagg tgacggccaa
agattgttac aaaagcttgg tgcagatttg 720attgatatgg atcaaattca agtccatcct
actggtttca tcgacccaaa cgatagaaac 780tcctcttgga agtttttggc tgctgaatct
ttaagaggtt tgggtggtat cttattgaat 840ccaattactg gtcgtagatt tgtcaacgaa
ttgaccacta gagatgtcgt tactgaagca 900atccagaagc actgtccaca agatgataac
agagctttgt tagttatgtc cgaaaagatg 960tatacagatt tgaaaaacaa tttggacttc
tacatgttca aaaagttagt tcaaaagtta 1020tctttgtccc aagttgtttc cgagtataag
ttaccaatta ctgtttccca attgtgtcag 1080gaattacaaa cctactcatc ttttacttca
aaagccgatc ctcttggtag aaccgttgtc 1140ttaaacgaat tcggtgctga catcacccca
gaaacaatgg ttttcatcgg cgaagttacc 1200ccagtcgttc actttaccat gggtggtgct
agaatcaatg ttaaggctca agttatcggc 1260aaaaacgatg agcctttgtt aaacggtttg
tacgcagcag gtgaagtttc tggtggtgtc 1320catggtgcca atagattagg tggttcatct
ttgcttgaat gtgtcgtttt tggtagaact 1380gcagcagaat caattgccaa taaccacaag
taa 1413771413DNAKluyveromyces polysporus
77atgtcaacca aaaagccagt cgtcatcatt ggtactggtt tagccggttt gtctgctggt
60aatcaattgg tcaatatgca taaagttcct atcattatgt tggacaaggc atcctccatt
120ggtggtaatt ctacaaaggc ttcctctggt atcaacggtg cttctactat tactcaacag
180caacttaatg ttaaagactc tcctgactta ttccttcaag atactgttaa gtctgctaag
240ggtagaggta ttgagtccct tatgaaaaag ttatcacaag actccaactc tgctatccat
300tggttgcaac aggattttga tttgaagttg gatttgttag ctcaattggg tggtcattcc
360gttcctagaa cacaccgttc ctcaggcaag ttacctccag gcttcgaaat tgtccaagct
420ttatctaaca agttaaaggc tatttctgag tccgatccag aattcgttag aatcttactt
480aactccaagg ttgttgatgt ttccgttaac aatgagggca aggtcgaatc tattgactat
540gttgatgcag aaggtaaaca tcacaaaatc gctactgata acgttgtctt ttgttccggt
600ggtttcggtc actcagcaga aatgttgaac aagtatgcac cagaattagc taacttgcca
660actactaacg gtcaacaaac cactggcgat ggtcagagaa tcttggagaa attgggtgca
720gacttgattg atatgtccca aattcaagtt cacccaacag gtttcatcga tccagcaaac
780agagattcta agtggaagtt tttggctgcc gaagcattaa gaggtttagg tggtatctta
840cttaatccat ctaccggcaa gagattcgtt aatgagttaa ccaccagaga tttggtcaca
900gaagctatcc aatcacaatg tccaagagat gacaataagg cattccttgt tatgtctgaa
960aaggtctatg agaattacaa aaacaacatg gacttttact tattcaaaaa gttagtttcc
1020aagatgacca ttaaggaatt tgtcgaaact tacaagttgc caatttctgc cgacgccgtt
1080acccaagact taatcgacta ttcagttgat aagaccgata agtttggtag accattggtt
1140atcaacgttt ttgatgaaaa gttgaccgaa gattccgaaa tctatgttgg tgaagttaca
1200ccagttgtcc atttcactat gggtggtgca aagatcaata ctgaatctca agttatcaac
1260aaaaacggtc aagttttggc aaagggtatc tacgcagcag gtgaagtctc cggtggtgtt
1320cacggttcta atagattagg tggttcatct ttgttagaat gcgtcgttta cggtagatct
1380gctgcagata acattgccaa aaacattgaa taa
1413781464DNAKluyveromyces marxianus 78atgttgcaca gatacatccg tttgttctcc
ttctgcgtca tcttgtactt agtctatttg 60ttacttacta aggagtcaaa cgtcatgtct
aagcctgttg ttgttattgg ttctggttta 120gcaggcttaa caacatcttc acaattagca
aagtttaaca ttccaatcgt ccttttagaa 180aagacatctt ccattggtgg taattccatt
aaggcatctt ctggtatcaa tggcgcaggc 240accgaaactc aatctcgttt acacgttgaa
gatcacccag aattgtttgc tgatgatacc 300attaagtctg caaaaggtaa aggtgttgtc
gctttgatgg aaaagttatc taaagactcc 360tctgatgcta tttcctggtt acaaaacgac
ttcaagattc ctttggataa gttagctcaa 420ttaggcggtc attccgttcc tagaacccat
agatcatccg gcaagcttcc accaggtttc 480caaattgtcg ataccttgaa aaaggccttg
gagtcttatg actctaaagc agttaagatc 540caattgaatt ctaaggtcgt tgatgttaag
cttgattcca ataacagagt ttcatctgtt 600gttttcgaag atcaagatgg tactcacacc
attgaaacca acaacgtcgt tttctgtact 660ggtggtttcg gtttcaacaa aaagttattg
gagaagtatg caccacactt ggtcgacttg 720ccaactacca acggtgagca aaccttaggt
gaaggtcagg tcttattgga aaaacttggt 780gctaagttga ttgatatgga ccaaattcaa
gttcatccaa ctggctttat cgatccagcc 840aatccagatt ctaattggaa gtttttggct
gccgaggcct taagaggttt aggtggtgtc 900ttgatcaatc cacacactgg tcagagattt
gttaacgaat tgacaactag agacatggtc 960accgaagcta tccagtctaa gtccgaatcc
aagactgctt acttggttat gtccgagtcc 1020ttatacgaga actacaagcc aaacatggac
ttctatatgt tcaaaaagct tgtttccaaa 1080aagaccgttg ctgaatttgc tgaagatttg
ccagtttctg ttgaccaact tattgcagaa 1140ctttcaactt attccgactt gtctaaggat
gatcatttgg gtagaaagtt tagagaaaac 1200acttttggtt cctcattatc atcagactca
accattttcg ttggcaagat tactcctgtt 1260gttcacttca caatgggtgg tgcaaagatt
gatgaacaag ctagagtctt gaatgcagaa 1320ggtaaaccat tagctactgg tatctacgcc
gctggtgaag tttctggtgg tgtccatggt 1380gctaatagat taggtggttc ctctttgtta
gaatgtgttg tctttggtag acaagcagca 1440aaatccatta gagcaaactt gtaa
1464
SEQ ID NO: 79, length 29, DNA, Artificial Sequence, Synthetic - Primer
gaggaagttc aaagtatgaa agacgtcag 29
SEQ ID NO: 80, length 33, DNA, Artificial Sequence, Synthetic - Primer
gatcgggccc gtcttggaag acgcactagt ctc 33
SEQ ID NO: 81, length 20, DNA, Artificial Sequence, Synthetic - Primer
tgacaggaac cgatggactc 20
SEQ ID NO: 82, length 1144, PRT, Leishmania mexicana
Met Ala Asp Gly Lys Thr Ser Ala Ser Val Val Ala Val Asp Ala
Glu1 5 10 15Arg Ala Ala
Lys Glu Arg Asp Ala Ala Ala Arg Ala Met Leu Gln Gly 20
25 30Gly Gly Val Ser Pro Ala Gly Lys Ala Gln
Leu Leu Lys Lys Gly Leu 35 40
45Val His Thr Val Pro Tyr Thr Leu Lys Val Val Val Ala Asp Pro Lys 50
55 60Glu Met Glu Lys Ala Thr Ala Asp Ala
Glu Glu Val Leu Gln Ala Ala65 70 75
80Phe Gln Val Val Asp Thr Leu Leu Asn Asn Phe Asn Glu Asn
Ser Glu 85 90 95Val Ser
Arg Val Asn Arg Leu Ala Val Gly Glu Glu His Gln Met Ser 100
105 110Glu Thr Leu Lys His Val Met Ala Cys
Cys Gln Lys Val Tyr His Ser 115 120
125Ser Arg Gly Val Phe Asp Pro Ala Val Gly Pro Leu Val Arg Glu Leu
130 135 140Arg Glu Ala Ala His Lys Gly
Lys Thr Val Pro Ala Glu Arg Val Asn145 150
155 160Asp Leu Leu Ser Lys Cys Thr Leu Asn Ala Ser Phe
Ser Ile Asp Met 165 170
175Ser Arg Gly Met Ile Ala Arg Lys His Pro Asp Ala Met Leu Asp Leu
180 185 190Gly Gly Val Asn Lys Gly
Tyr Gly Ile Asp Tyr Ile Val Glu His Leu 195 200
205Asn Ser Leu Gly Tyr Asp Asp Val Phe Phe Glu Trp Gly Gly
Asp Val 210 215 220Arg Ala Ser Gly Lys
Asn Gln Leu Ser Gln Pro Trp Ala Val Gly Ile225 230
235 240Val Arg Pro Pro Ala Leu Ala Asp Ile Arg
Thr Val Val Pro Glu Asp 245 250
255Lys Arg Ser Phe Ile Arg Val Val Arg Leu Asn Asn Glu Ala Ile Ala
260 265 270Thr Ser Gly Asp Tyr
Glu Asn Leu Val Glu Gly Pro Gly Ser Lys Val 275
280 285Tyr Ser Ser Thr Phe Asn Pro Thr Ser Lys Asn Leu
Leu Glu Pro Thr 290 295 300Glu Ala Gly
Met Ala Gln Val Ser Val Lys Cys Cys Ser Cys Ile Tyr305
310 315 320Ala Asp Ala Leu Ala Thr Ala
Ala Leu Leu Lys Asn Asp Pro Ala Ala 325
330 335Val Arg Arg Ile Leu Asp Asn Trp Arg Tyr Val Arg
Asp Thr Val Thr 340 345 350Asp
Tyr Thr Thr Tyr Thr Arg Glu Gly Glu Arg Val Ala Lys Met Leu 355
360 365Glu Ile Ala Thr Glu Asp Ala Glu Met
Arg Ala Lys Arg Ile Lys Gly 370 375
380Ser Leu Pro Ala Arg Val Ile Ile Val Gly Gly Gly Leu Ala Gly Cys385
390 395 400Ser Ala Ala Ile
Glu Ala Ala Asn Cys Gly Ala His Val Ile Leu Leu 405
410 415Glu Lys Glu Pro Lys Leu Gly Gly Asn Ser
Ala Lys Ala Thr Ser Gly 420 425
430Ile Asn Ala Trp Gly Thr Arg Ala Gln Ala Lys Gln Gly Val Met Asp
435 440 445Gly Gly Lys Phe Phe Glu Arg
Asp Thr His Arg Ser Gly Lys Gly Gly 450 455
460Asn Cys Asp Pro Cys Leu Val Lys Thr Leu Ser Val Lys Ser Ser
Asp465 470 475 480Ala Val
Lys Trp Leu Ser Glu Leu Gly Val Pro Leu Thr Val Leu Ser
485 490 495Gln Leu Gly Gly Ala Ser Arg
Lys Arg Cys His Arg Ala Pro Asp Lys 500 505
510Ser Asp Gly Thr Pro Val Pro Val Gly Phe Thr Ile Met Lys
Thr Leu 515 520 525Glu Asn His Ile
Val Asn Asp Leu Ser Arg His Val Thr Val Met Thr 530
535 540Gly Ile Thr Val Thr Ala Leu Glu Ser Thr Ser Arg
Val Arg Pro Asp545 550 555
560Gly Val Leu Val Lys His Val Thr Gly Val His Leu Ile Gln Ala Ser
565 570 575Gly Gln Ser Met Val
Leu Asn Ala Asp Ala Val Ile Leu Ala Thr Gly 580
585 590Gly Phe Ser Asn Asp His Thr Pro Asn Ser Leu Leu
Gln Gln Tyr Ala 595 600 605Pro Gln
Leu Ser Ser Phe Pro Thr Thr Asn Gly Val Trp Ala Thr Gly 610
615 620Asp Gly Val Lys Met Ala Ser Lys Leu Gly Val
Ala Leu Val Asp Met625 630 635
640Asp Lys Val Gln Leu His Pro Thr Gly Leu Leu Asp Pro Lys Asp Pro
645 650 655Ser Asn Arg Thr
Lys Tyr Leu Gly Pro Glu Ala Leu Arg Gly Ser Gly 660
665 670Gly Val Leu Leu Asn Lys Asn Gly Glu Arg Phe
Val Asn Glu Leu Asp 675 680 685Leu
Arg Ser Val Val Ser Gln Ala Ile Ile Ala Gln Asp Asn Glu Tyr 690
695 700Pro Gly Ser Gly Gly Ser Lys Phe Ala Tyr
Cys Val Leu Asn Glu Thr705 710 715
720Ala Ala Lys Leu Phe Gly Lys Asn Phe Leu Gly Phe Tyr Trp Asn
Arg 725 730 735Leu Gly Leu
Phe Gln Lys Val Asp Ser Val Ala Gly Leu Ala Lys Leu 740
745 750Ile Gly Cys Pro Glu Ala Asn Val Val Ala
Thr Leu Lys Gln Tyr Glu 755 760
765Glu Leu Ser Ser Lys Lys Leu Asn Pro Cys Pro Leu Thr Gly Lys Ser 770
775 780Val Phe Pro Cys Val Leu Gly Thr
Gln Gly Pro Tyr Tyr Val Ala Leu785 790
795 800Val Thr Pro Ser Ile His Tyr Thr Met Gly Gly Cys
Leu Ile Ser Pro 805 810
815Ser Ala Glu Met Gln Thr Ile Asp Asn Ser Gly Val Thr Pro Val Arg
820 825 830Arg Pro Ile Leu Gly Leu
Phe Gly Ala Gly Glu Val Thr Gly Gly Val 835 840
845His Gly Gly Asn Arg Leu Gly Gly Asn Ser Leu Leu Glu Cys
Val Val 850 855 860Phe Gly Lys Ile Ala
Gly Asp Arg Ala Ala Thr Ile Leu Gln Lys Lys865 870
875 880Asn Thr Gly Leu Ser Met Thr Glu Trp Ser
Thr Val Val Leu Arg Glu 885 890
895Val Arg Glu Gly Gly Val Tyr Gly Ala Gly Ser Arg Val Leu Arg Phe
900 905 910Asn Met Pro Gly Ala
Leu Gln Arg Thr Gly Leu Ala Leu Gly Gln Phe 915
920 925Ile Gly Ile Arg Gly Asp Trp Asp Gly His Arg Leu
Ile Gly Tyr Tyr 930 935 940Ser Pro Ile
Thr Leu Pro Asp Asp Val Gly Val Ile Gly Ile Leu Ala945
950 955 960Arg Ala Asp Lys Gly Arg Leu
Ala Glu Trp Ile Ser Ala Leu Gln Pro 965
970 975Gly Asp Ala Val Glu Met Lys Ala Cys Gly Gly Leu
Ile Ile Asp Arg 980 985 990Arg
Phe Ala Glu Arg His Phe Phe Phe Arg Gly His Lys Ile Arg Lys 995
1000 1005Leu Ala Leu Ile Gly Gly Gly Thr
Gly Val Ala Pro Met Leu Gln 1010 1015
1020Ile Val Arg Ala Ala Val Lys Lys Pro Phe Val Asp Ser Ile Glu
1025 1030 1035Ser Ile Gln Phe Ile Tyr
Ala Ala Glu Asp Val Ser Glu Leu Thr 1040 1045
1050Tyr Arg Thr Leu Leu Glu Ser Tyr Glu Glu Glu Tyr Gly Ser
Glu 1055 1060 1065Lys Phe Lys Cys His
Phe Val Leu Asn Asn Pro Pro Ala Gln Trp 1070 1075
1080Thr Asp Gly Val Gly Phe Val Asp Thr Ala Leu Leu Arg
Ser Ala 1085 1090 1095Val Gln Ala Pro
Ser Asn Asp Leu Leu Val Ala Ile Cys Gly Pro 1100
1105 1110Pro Ile Met Gln Arg Ala Val Lys Gly Ala Leu
Lys Gly Leu Gly 1115 1120 1125Tyr Asn
Met Asn Leu Val Arg Thr Val Asp Glu Thr Glu Pro Pro 1130
1135 1140Ser
SEQ ID NO: 83, length 743, PRT, Saccharomyces cerevisiae
Met
Asp Gly Pro Asn Phe Ala His Gln Gly Gly Arg Ser Gln Arg Thr1
5 10 15Thr Glu Leu Tyr Ser Cys Ala
Arg Cys Arg Lys Leu Lys Lys Lys Cys 20 25
30Gly Lys Gln Ile Pro Thr Cys Ala Asn Cys Asp Lys Asn Gly
Ala His 35 40 45Cys Ser Tyr Pro
Gly Arg Ala Pro Arg Arg Thr Lys Lys Glu Leu Ala 50 55
60Asp Ala Met Leu Arg Gly Glu Tyr Val Pro Val Lys Arg
Asn Lys Lys65 70 75
80Val Gly Lys Ser Pro Leu Ser Thr Lys Ser Met Pro Asn Ser Ser Ser
85 90 95Pro Leu Ser Ala Asn Gly
Ala Ile Thr Pro Gly Phe Ser Pro Tyr Glu 100
105 110Asn Asp Asp Ala His Lys Met Lys Gln Leu Lys Pro
Ser Asp Pro Ile 115 120 125Asn Leu
Val Met Gly Ala Ser Pro Asn Ser Ser Glu Gly Val Ser Ser 130
135 140Leu Ile Ser Val Leu Thr Ser Leu Asn Asp Asn
Ser Asn Pro Ser Ser145 150 155
160His Leu Ser Ser Asn Glu Asn Ser Met Ile Pro Ser Arg Ser Leu Pro
165 170 175Ala Ser Val Gln
Gln Ser Ser Thr Thr Ser Ser Phe Gly Gly Tyr Asn 180
185 190Thr Pro Ser Pro Leu Ile Ser Ser His Val Pro
Ala Asn Ala Gln Ala 195 200 205Val
Pro Leu Gln Asn Asn Asn Arg Asn Thr Ser Asn Gly Asp Asn Gly 210
215 220Ser Asn Val Asn His Asp Asn Asn Asn Gly
Ser Thr Asn Thr Pro Gln225 230 235
240Leu Ser Leu Thr Pro Tyr Ala Asn Asn Ser Ala Pro Asn Gly Lys
Phe 245 250 255Asp Ser Val
Pro Val Asp Ala Ser Ser Ile Glu Phe Glu Thr Met Ser 260
265 270Cys Cys Phe Lys Gly Gly Arg Thr Thr Ser
Trp Val Arg Glu Asp Gly 275 280
285Ser Phe Lys Ser Ile Asp Arg Ser Leu Leu Asp Arg Phe Ile Ala Ala 290
295 300Tyr Phe Lys His Asn His Arg Leu
Phe Pro Met Ile Asp Lys Ile Ala305 310
315 320Phe Leu Asn Asp Ala Ala Thr Ile Thr Asp Phe Glu
Arg Leu Tyr Asp 325 330
335Asn Lys Asn Tyr Pro Asp Ser Phe Val Phe Lys Val Tyr Met Ile Met
340 345 350Ala Ile Gly Cys Thr Thr
Leu Gln Arg Ala Gly Met Val Ser Gln Asp 355 360
365Glu Glu Cys Leu Ser Glu His Leu Ala Phe Leu Ala Met Lys
Lys Phe 370 375 380Arg Ser Val Ile Ile
Leu Gln Asp Ile Glu Thr Val Arg Cys Leu Leu385 390
395 400Leu Leu Gly Ile Tyr Ser Phe Phe Glu Pro
Lys Gly Ser Ser Ser Trp 405 410
415Thr Ile Ser Gly Ile Ile Met Arg Leu Thr Ile Gly Leu Gly Leu Asn
420 425 430Arg Glu Leu Thr Ala
Lys Lys Leu Lys Ser Met Ser Ala Leu Glu Ala 435
440 445Glu Ala Arg Tyr Arg Val Phe Trp Ser Ala Tyr Cys
Phe Glu Arg Leu 450 455 460Val Cys Thr
Ser Leu Gly Arg Ile Ser Gly Ile Asp Asp Glu Asp Ile465
470 475 480Thr Val Pro Leu Pro Arg Ala
Leu Tyr Val Asp Glu Arg Asp Asp Leu 485
490 495Glu Met Thr Lys Leu Met Ile Ser Leu Arg Lys Met
Gly Gly Arg Ile 500 505 510Tyr
Lys Gln Val His Ser Val Ser Ala Gly Arg Gln Lys Leu Thr Ile 515
520 525Glu Gln Lys Gln Glu Ile Ile Ser Gly
Leu Arg Lys Glu Leu Asp Glu 530 535
540Ile Tyr Ser Arg Glu Ser Glu Arg Arg Lys Leu Lys Lys Ser Gln Met545
550 555 560Asp Gln Val Glu
Arg Glu Asn Asn Ser Thr Thr Asn Val Ile Ser Phe 565
570 575His Ser Ser Glu Ile Trp Leu Ala Met Arg
Tyr Ser Gln Leu Gln Ile 580 585
590Leu Leu Tyr Arg Pro Ser Ala Leu Met Pro Lys Pro Pro Ile Asp Ser
595 600 605Leu Ser Thr Leu Gly Glu Phe
Cys Leu Gln Ala Trp Lys His Thr Tyr 610 615
620Thr Leu Tyr Lys Lys Arg Leu Leu Pro Leu Asn Trp Ile Thr Leu
Phe625 630 635 640Arg Thr
Leu Thr Ile Cys Asn Thr Ile Leu Tyr Cys Leu Cys Gln Trp
645 650 655Ser Ile Asp Leu Ile Glu Ser
Lys Ile Glu Ile Gln Gln Cys Val Glu 660 665
670Ile Leu Arg His Phe Gly Glu Arg Trp Ile Phe Ala Met Arg
Cys Ala 675 680 685Asp Val Phe Gln
Asn Ile Ser Asn Thr Ile Leu Asp Ile Ser Leu Ser 690
695 700His Gly Lys Val Pro Asn Met Asp Gln Leu Thr Arg
Glu Leu Phe Gly705 710 715
720Ala Ser Asp Ser Tyr Gln Asp Ile Leu Asp Glu Asn Asn Val Asp Val
725 730 735Ser Trp Val Asp Lys
Leu Val 740
SEQ ID NO: 84, length 17, DNA, Artificial Sequence, Synthetic - Primer
caatccaacc gccaccg 17
SEQ ID NO: 85, length 47, DNA, Artificial Sequence, Synthetic - Primer
ggataaaagt attacatacg tacaggattg tgtattagtg tatttcg 47
SEQ ID NO: 86, length 21, DNA, Artificial Sequence, Synthetic - Primer
ctttcattag gttggttgaa g 21
SEQ ID NO: 87, length 1521, DNA, Issatchenkia orientalis
atgggtgtcc
agtttatcga aaataccatt atcgttgtct ttggtgcgtc tggagattta 60gccaagaaga
agactttccc cgccctgttt ggactattca gggagggcca gctctcagaa 120acaaccaaaa
tcattgggtt tgctcgatca aaactatcaa atgatgactt gaggaacaga 180ataaagccgt
acttgaaatt gaacaagaga acagatgctg aaaggcagtc tctggagaag 240tttctgcaga
ttctcgagta tcaccagtca aactacgacg acagtgaagg ttttgaaaaa 300ttggagaagc
taatcaataa gtacgacgat gaggcaaacg tgaaagagtc tcacaggttg 360tactatttgg
ctttaccacc gtctgtcttt acaaccgttg caacaatgtt gaaaaaacat 420tgtcatccag
gtgattctgg tattgctagg ctaattgtcg agaaaccctt tggccatgac 480ttgagctcgt
cccgtgagct acaaaagtct ttagctccac tttggaatga agatgaattg 540tttagaattg
atcattattt gggcaaagaa atggttaaga atttaattcc tttgaggttt 600tcaaatacgt
ttttgagcag ttcttggaac aatcaattta ttgacaccat ccaaatcact 660tttaaggaga
actttggaac tgaaggacgt ggtggttact ttgattccat tggtataata 720agagatgtta
tccaaaatca tttgttacaa gtcttgacta ttgttttgat ggaaaaacca 780gcggatttta
atggagaatc tatcagagat gaaaaggtta aagtgttaaa ggcaattgaa 840caaattgatt
tcaataatgt gttggtaggt caatatgata aatctgaaga tggtagtaaa 900cctggttact
tggatgatga taccgtcaat ccagattcta aagctgtcac ttatgctgcc 960ttagttttaa
atgtggcaaa cgaaagatgg aataatgttc cgatcattct aaaggcaggc 1020aaggccttga
atcaatccaa ggtggaaatt agaatccagt tcaaaccagt agaaaatgga 1080atcttcaaaa
actctgctag gaatgagttg gttattagga tccaaccaaa cgaggcaatg 1140tatttgaaaa
tgaacatcaa agtacctggt gtttccaatc aagtgtcgat ttcagaaatg 1200gatttgactt
acaagaatag gtattcctcc gaattttaca ttccagaagc ttatgaatct 1260ttgattaaag
atgccttaat ggatgatcat tcaaattttg ttagagacga tgaattggac 1320atttcatggg
ctttgttcac tccattacta gaacatatcg aaggccccga tggtccaact 1380ccaaccaagt
atccttacgg ttccagaggt ccaaaggaga ttgacgaatt tttgagaaac 1440catggttatg
taaaggaacc aagagaaaat taccaatggc cattaactac tcctaaagaa 1500ttgaacagtt
caaagtttta a
1521
SEQ ID NO: 88, length 477, PRT, Issatchenkia orientalis
Met Gly Gln Asn Leu Ile Leu Asn Ala
Ala Asp His Gly Phe Thr Val1 5 10
15Val Ala Tyr Asn Arg Thr Val Ser Lys Val Asp His Phe Leu Gln
Asn 20 25 30Glu Ala Lys Gly
Lys Ser Ile Ile Gly Ala His Ser Ile Glu Glu Leu 35
40 45Cys Ala Lys Leu Lys Lys Pro Arg Arg Ile Met Leu
Leu Val Lys Ala 50 55 60Gly Asn Pro
Val Asp Gln Phe Ile Glu Gln Leu Leu Pro His Leu Asp65 70
75 80Glu Gly Asp Ile Ile Ile Asp Gly
Gly Asn Ser His Phe Pro Asp Ser 85 90
95Asn Arg Arg Tyr Glu Glu Leu Lys Lys Lys Gly Ile Leu Phe
Val Gly 100 105 110Ser Gly Val
Ser Gly Gly Glu Glu Gly Ala Arg Tyr Gly Pro Ser Leu 115
120 125Met Pro Gly Gly Ala Lys Glu Ala Trp Pro His
Ile Lys Asp Ile Phe 130 135 140Gln Ser
Ile Ser Ala Lys Ala Asp Gly Glu Pro Cys Cys Asp Trp Val145
150 155 160Gly Asp Ala Gly Ala Gly His
Tyr Val Lys Met Val His Asn Gly Ile 165
170 175Glu Tyr Gly Asp Met Gln Leu Ile Cys Glu Ala Tyr
Asp Leu Met Lys 180 185 190Arg
Val Gly Gly Leu Thr Asp Lys Glu Ile Ser Asp Val Phe Gly Glu 195
200 205Trp Asn Glu Gly Val Leu Asp Ser Phe
Leu Val Glu Ile Thr Arg Asp 210 215
220Ile Leu Ala Phe Asn Asp Lys Asp Gly Thr Pro Leu Val Glu Lys Ile225
230 235 240Leu Asp Thr Ala
Gly Gln Lys Gly Thr Gly Lys Trp Thr Ala Ile Asn 245
250 255Ala Leu Asp Leu Gly Met Pro Val Thr Leu
Ile Gly Glu Ala Val Phe 260 265
270Ala Arg Cys Leu Ser Ala Leu Lys Pro Glu Arg Glu Arg Ala Ser Glu
275 280 285Ile Leu Asn Gly Pro Glu Val
Glu Gln Val Ser Ala Glu Gly Arg Ala 290 295
300Gln Phe Ile Ala Asp Leu Met Gln Ala Leu Tyr Ala Ser Lys Ile
Ile305 310 315 320Ser Tyr
Ala Gln Gly Phe Met Leu Ile Arg Glu Ala Ala Lys Glu Tyr
325 330 335Asn Trp Lys Leu Asn Phe Pro
Ser Ile Ala Leu Met Trp Arg Gly Gly 340 345
350Cys Ile Ile Arg Ser Val Phe Leu Ala Glu Ile Thr Ala Ala
Tyr Arg 355 360 365Glu Asn Pro Asp
Leu Glu Asn Leu Leu Phe Asn Lys Phe Phe Gln Asp 370
375 380Ala Ile His Lys Ala Gln Ser Gly Trp Arg Lys Thr
Val Ala Leu Ala385 390 395
400Val Thr Gln Gly Ile Pro Thr Pro Ala Phe Ser Thr Ala Leu Ser Phe
405 410 415Tyr Asp Gly Tyr Arg
Ser Lys Lys Leu Pro Ala Asn Leu Leu Gln Ala 420
425 430Gln Arg Asp Tyr Phe Gly Ala His Thr Phe Gln Ile
Leu Pro Glu Cys 435 440 445Ala Asp
Asp Glu Lys Lys Val Gly Asp Tyr Ile His Val Asn Trp Thr 450
455 460Gly Lys Gly Gly Asn Val Ser Ala Ser Thr Tyr
Asp Ala465 470 475
SEQ ID NO: 89, length 1434, DNA, Issatchenkia orientalis
atgggtcaaa acttgattct taatgcagca gatcatggtt ttactgttgt
tgcatacaac 60agaactgtct ctaaagttga ccatttcctt caaaatgaag caaagggtaa
atccattatt 120ggtgcacact ccattgaaga attatgtgct aagttgaaga aaccaagaag
aattatgttg 180ttagtcaagg caggtaatcc agttgatcaa ttcattgaac aattgttacc
tcatttagat 240gaaggcgata tcattattga cggtggtaac tctcacttcc ctgactccaa
cagaagatac 300gaggaattaa agaagaaggg tattctcttt gtcggttctg gtgtttctgg
tggtgaagaa 360ggtgcaagat atggtccttc tttgatgcct ggtggtgcaa aggaagcatg
gcctcatatt 420aaggacatct tccaatctat ctctgcaaag gccgatggtg agccatgttg
tgattgggtt 480ggtgatgcag gtgcaggtca ttacgttaag atggtccaca atggtatcga
gtatggtgat 540atgcagttga tctgtgaagc ttacgatttg atgaagagag ttggtggttt
aactgacaag 600gaaatatctg atgttttcgg tgaatggaac gagggtgttc tcgattcttt
cttagttgaa 660attaccagag atatcttagc tttcaacgat aaggatggta ccccattagt
tgaaaagatc 720ttagatactg ccggacagaa gggtactggt aaatggactg caataaatgc
tttagacttg 780ggtatgccag tcactttaat tggtgaagct gtttttgcga gatgtttatc
cgctttgaag 840ccagaaagag agagagcttc tgaaatctta aacggtccgg aagttgaaca
agtttctgct 900gaaggtagag cacaatttat tgcagatttg atgcaagctt tatatgcatc
aaagattatt 960tcttacgcac aaggtttcat gttaatcaga gaagcagcaa aggaatacaa
ctggaaatta 1020aacttccctt ctattgcact tatgtggaga ggtggttgta ttatcaggtc
tgttttcttg 1080gctgaaatta ctgcagctta tagggaaaac cctgacttag agaacttact
attcaacaag 1140ttcttccaag atgctattca taaggcacag tctggttgga gaaagactgt
tgcattagct 1200gttacccaag gtattccaac tccagcattc tctactgcat tgtctttcta
cgatggttac 1260agatccaaga agttaccagc taacttgttg caagcacaaa gagattactt
cggtgctcac 1320actttccaaa ttttacctga atgtgcagat gacgaaaaga aggttggtga
ttacatccat 1380gtcaactgga ctggtaaggg tggtaatgtt tctgctagta cttacgatgc
ttaa 1434
SEQ ID NO: 90, length 438, PRT, Schizosaccharomyces pombe
Met Gly Glu Leu Lys
Glu Ile Leu Lys Gln Arg Tyr His Glu Leu Leu1 5
10 15Asp Trp Asn Val Lys Ala Pro His Val Pro Leu
Ser Gln Arg Leu Lys 20 25
30His Phe Thr Trp Ser Trp Phe Ala Cys Thr Met Ala Thr Gly Gly Val
35 40 45Gly Leu Ile Ile Gly Ser Phe Pro
Phe Arg Phe Tyr Gly Leu Asn Thr 50 55
60Ile Gly Lys Ile Val Tyr Ile Leu Gln Ile Phe Leu Phe Ser Leu Phe65
70 75 80Gly Ser Cys Met Leu
Phe Arg Phe Ile Lys Tyr Pro Ser Thr Ile Lys 85
90 95Asp Ser Trp Asn His His Leu Glu Lys Leu Phe
Ile Ala Thr Cys Leu 100 105
110Leu Ser Ile Ser Thr Phe Ile Asp Met Leu Ala Ile Tyr Ala Tyr Pro
115 120 125Asp Thr Gly Glu Trp Met Val
Trp Val Ile Arg Ile Leu Tyr Tyr Ile 130 135
140Tyr Val Ala Val Ser Phe Ile Tyr Cys Val Met Ala Phe Phe Thr
Ile145 150 155 160Phe Asn
Asn His Val Tyr Thr Ile Glu Thr Ala Ser Pro Ala Trp Ile
165 170 175Leu Pro Ile Phe Pro Pro Met
Ile Cys Gly Val Ile Ala Gly Ala Val 180 185
190Asn Ser Thr Gln Pro Ala His Gln Leu Lys Asn Met Val Ile
Phe Gly 195 200 205Ile Leu Phe Gln
Gly Leu Gly Phe Trp Val Tyr Leu Leu Leu Phe Ala 210
215 220Val Asn Val Leu Arg Phe Phe Thr Val Gly Leu Ala
Lys Pro Gln Asp225 230 235
240Arg Pro Gly Met Phe Met Phe Val Gly Pro Pro Ala Phe Ser Gly Leu
245 250 255Ala Leu Ile Asn Ile
Ala Arg Gly Ala Met Gly Ser Arg Pro Tyr Ile 260
265 270Phe Val Gly Ala Asn Ser Ser Glu Tyr Leu Gly Phe
Val Ser Thr Phe 275 280 285Met Ala
Ile Phe Ile Trp Gly Leu Ala Ala Trp Cys Tyr Cys Leu Ala 290
295 300Met Val Ser Phe Leu Ala Gly Phe Phe Thr Arg
Ala Pro Leu Lys Phe305 310 315
320Ala Cys Gly Trp Phe Ala Phe Ile Phe Pro Asn Val Gly Phe Val Asn
325 330 335Cys Thr Ile Glu
Ile Gly Lys Met Ile Asp Ser Lys Ala Phe Gln Met 340
345 350Phe Gly His Ile Ile Gly Val Ile Leu Cys Ile
Gln Trp Ile Leu Leu 355 360 365Met
Tyr Leu Met Val Arg Ala Phe Leu Val Asn Asp Leu Cys Tyr Pro 370
375 380Gly Lys Asp Glu Asp Ala His Pro Pro Pro
Lys Pro Asn Thr Gly Val385 390 395
400Leu Asn Pro Thr Phe Pro Pro Glu Lys Ala Pro Ala Ser Leu Glu
Lys 405 410 415Val Asp Thr
His Val Thr Ser Thr Gly Gly Glu Ser Asp Pro Pro Ser 420
425 430Ser Glu His Glu Ser Val
435
SEQ ID NO: 91, length 398, PRT, Aspergillus oryzae
Met Phe Asn Asn Glu His His Ile Pro Pro
Gly Ser Ser His Ser Asp1 5 10
15Ile Glu Met Leu Thr Pro Pro Lys Phe Glu Asp Glu Lys Gln Leu Gly
20 25 30Pro Val Gly Ile Arg Glu
Arg Leu Arg His Phe Thr Trp Ala Trp Tyr 35 40
45Thr Leu Thr Met Ser Gly Gly Gly Leu Ala Val Leu Ile Ile
Ser Gln 50 55 60Pro Phe Gly Phe Arg
Gly Leu Arg Glu Ile Gly Ile Ala Val Tyr Ile65 70
75 80Leu Asn Leu Ile Leu Phe Ala Leu Val Cys
Ser Thr Met Ala Ile Arg 85 90
95Phe Ile Leu His Gly Asn Leu Leu Glu Ser Leu Arg His Asp Arg Glu
100 105 110Gly Leu Phe Phe Pro
Thr Phe Trp Leu Ser Val Ala Thr Ile Ile Cys 115
120 125Gly Leu Ser Arg Tyr Phe Gly Glu Glu Ser Asn Glu
Ser Phe Gln Leu 130 135 140Ala Leu Glu
Ala Leu Phe Trp Ile Tyr Cys Val Cys Thr Leu Leu Val145
150 155 160Ala Ile Ile Gln Tyr Ser Phe
Val Phe Ser Ser His Lys Tyr Gly Leu 165
170 175Gln Thr Met Met Pro Ser Trp Ile Leu Pro Ala Phe
Pro Ile Met Leu 180 185 190Ser
Gly Thr Ile Ala Ser Val Ile Gly Glu Gln Gln Pro Ala Arg Ala 195
200 205Ala Leu Pro Ile Ile Gly Ala Gly Val
Thr Phe Gln Gly Leu Gly Phe 210 215
220Ser Ile Ser Phe Met Met Tyr Ala His Tyr Ile Gly Arg Leu Met Glu225
230 235 240Ser Gly Leu Pro
His Ser Asp His Arg Pro Gly Met Phe Ile Cys Val 245
250 255Gly Pro Pro Ala Phe Thr Ala Leu Ala Leu
Val Gly Met Ser Lys Gly 260 265
270Leu Pro Glu Asp Phe Lys Leu Leu His Asp Ala His Ala Leu Glu Asp
275 280 285Gly Arg Ile Ile Glu Leu Leu
Ala Ile Ser Ala Gly Val Phe Leu Trp 290 295
300Ala Leu Ser Leu Trp Phe Phe Cys Ile Ala Ile Val Ala Val Ile
Arg305 310 315 320Ser Pro
Pro Glu Ala Phe His Leu Asn Trp Trp Ala Met Val Phe Pro
325 330 335Asn Thr Gly Phe Thr Leu Ala
Thr Ile Thr Leu Gly Lys Ala Leu Asn 340 345
350Ser Asn Gly Val Lys Gly Val Gly Ser Ala Met Ser Ile Cys
Ile Val 355 360 365Cys Met Tyr Ile
Phe Val Phe Val Asn Asn Val Arg Ala Val Ile Arg 370
375 380Lys Asp Ile Met Tyr Pro Gly Lys Asp Glu Asp Val
Ser Asp385 390
395
SEQ ID NO: 92, length 1317, DNA, Schizosaccharomyces pombe
atgggtgaat tgaaagagat tttgaaacaa
agatatcatg aattacttga ttggaatgtt 60aaggcaccac atgtcccttt atcccagaga
ttgaagcact ttacttggtc atggtttgct 120tgtactatgg caaccggtgg tgttggtttg
atcattggtt ccttcccatt cagattctac 180ggtttgaaca ccattggcaa gattgtttac
atcttacaaa tctttttgtt ttctcttttt 240ggctcttgta tgttgtttcg tttcatcaag
tatccatcta ccattaagga ctcttggaat 300catcacttgg aaaagttgtt tatcgcaact
tgtttgttat ctatttccac attcatcgac 360atgttagcta tctatgctta tccagatacc
ggtgaatgga tggtctgggt cattagaatc 420ttatactaca tctatgtcgc tgtctctttc
atctactgtg ttatggcctt tttcaccatt 480ttcaacaatc atgtttacac tattgaaact
gcttctccag cttggatttt gccaatcttc 540cctccaatga tctgtggtgt cattgctggt
gctgttaact ccacccaacc tgctcaccaa 600ttgaaaaaca tggtcatttt cggtatcttg
tttcaaggtt taggtttttg ggtttacctt 660ttacttttcg ccgttaatgt tttgagattc
ttcacagtcg gtttagcaaa gccacaagat 720agaccaggta tgtttatgtt cgttggtcca
ccagctttct ctggtttagc attgattaac 780attgcaagag gtgcaatggg ctcaagacct
tacattttcg ttggtgcaaa ctcttccgaa 840tacttaggtt ttgtctcaac cttcatggcc
attttcatct ggggtttagc cgcatggtgt 900tattgcttag ctatggtttc cttccttgcc
ggctttttca ctagagcacc attgaaattc 960gcttgtggtt ggttcgcttt catctttcca
aatgttggtt ttgttaactg tactatcgaa 1020atcggcaaga tgattgattc taaggctttt
caaatgtttg gtcacatcat tggtgttatc 1080ttgtgtattc aatggatttt gttaatgtac
ttaatggtta gagcattcct tgttaatgac 1140ttgtgctatc ctggtaaaga cgaagatgca
cacccaccac caaagccaaa cactggtgtc 1200ttaaacccaa ctttcccacc agagaaggct
ccagcatcat tagagaaggt tgatactcat 1260gttacatcaa caggtggtga atccgatcct
ccatcttccg aacatgaatc cgtttaa 1317
SEQ ID NO: 93, length 1205, DNA, Aspergillus oryzae
atgtttaaca atgagcacca tattcctcct ggttcctctc actctgatat cgaaatgtta
60acaccaccaa agtttgagga tgaaaaacag ttaggtccag tcggtattag agaaagattg
120agacatttca cttgggcttg gtatacctta accatgtccg gtggtggttt ggcagttttg
180attatctctc agccattcgg ttttagaggt ttaagagaaa ttggtattgc agtttacatt
240ttgaacttaa tcttattcgc tttggtttgt tctaccatgg ctattcgttt catcttgcac
300ggtaaccttt tggaatccct tagacatgac agagaaggtt tgtttttccc tactttctgg
360ttgtctgttg ctaccatcat ttgtggtttg tcaagatact ttggtgagga atccaacgaa
420tccttccaat tggcattaga agccttgttc tggatctatt gcgtttgtac cttgttggtt
480gcaatcattc aatactcttt tgttttctca tcccacaagt acggtttaca aacaatgatg
540ccatcttgga ttttgccagc ctttcctatc atgttgtcag gcacaattgc atctgttatc
600ggtgaacaac aaccagccag agctgcatta ccaatcattg gtgccggtgt caccttccaa
660ggtttaggtt tttctatttc cttcatgatg tatgctcatt acattggcag acttatggaa
720tccggtttac ctcactccga ccatagacca ggcatgttca tctgtgttgg cccaccagcc
780tttactgctt tggctttagt cggtatgtcc aagggtttac cagaagattt caagctttta
840catgacgctc atgcattaga ggatggtaga atcattgaat tgttagcaat ttcagcaggt
900gttttccttt gggcattatc cctttggttt ttctgtattg ctattgtcgc tgtcattaga
960tctccaccag aagctttcca cttgaactgg tgggctatgg ttttcccaaa tactggtttc
1020accttagcta ctatcacttt gggtaaagct ttgaactcaa atggtgtcaa gggtgtcggt
1080tctgcaatgt ccatttgtat tgtctgcatg tacatctttg ttttcgttaa caatgttaga
1140gctgttattc gtaaggatat catgtatcca ggcaaagatg aggatgtttc tgattaacct
1200gcagg
1205
SEQ ID NO: 94, length 1180, PRT, Issatchenkia orientalis
Met Ser Thr Val Glu Asp His Ser
Ser Leu His Lys Leu Arg Lys Glu1 5 10
15Ser Glu Ile Leu Ser Asn Ala Asn Lys Ile Leu Val Ala Asn
Arg Gly 20 25 30Glu Ile Pro
Ile Arg Ile Phe Arg Ser Ala His Glu Leu Ser Met His 35
40 45Thr Val Ala Ile Tyr Ser His Glu Asp Arg Leu
Ser Met His Arg Leu 50 55 60Lys Ala
Asp Glu Ala Tyr Ala Ile Gly Lys Thr Gly Gln Tyr Ser Pro65
70 75 80Val Gln Ala Tyr Leu Gln Ile
Asp Glu Ile Ile Lys Ile Ala Lys Glu 85 90
95His Asp Val Ser Met Ile His Pro Gly Tyr Gly Phe Leu
Ser Glu Asn 100 105 110Ser Glu
Phe Ala Lys Lys Val Glu Glu Ser Gly Met Ile Trp Val Gly 115
120 125Pro Pro Ala Glu Val Ile Asp Ser Val Gly
Asp Lys Val Ser Ala Arg 130 135 140Asn
Leu Ala Ile Lys Cys Asp Val Pro Val Val Pro Gly Thr Asp Gly145
150 155 160Pro Ile Glu Asp Ile Glu
Gln Ala Lys Gln Phe Val Glu Gln Tyr Gly 165
170 175Tyr Pro Val Ile Ile Lys Ala Ala Phe Gly Gly Gly
Gly Arg Gly Met 180 185 190Arg
Val Val Arg Glu Gly Asp Asp Ile Val Asp Ala Phe Gln Arg Ala 195
200 205Ser Ser Glu Ala Lys Ser Ala Phe Gly
Asn Gly Thr Cys Phe Ile Glu 210 215
220Arg Phe Leu Asp Lys Pro Lys His Ile Glu Val Gln Leu Leu Ala Asp225
230 235 240Asn Tyr Gly Asn
Thr Ile His Leu Phe Glu Arg Asp Cys Ser Val Gln 245
250 255Arg Arg His Gln Lys Val Val Glu Ile Ala
Pro Ala Lys Thr Leu Pro 260 265
270Val Glu Val Arg Asn Ala Ile Leu Lys Asp Ala Val Thr Leu Ala Lys
275 280 285Thr Ala Asn Tyr Arg Asn Ala
Gly Thr Ala Glu Phe Leu Val Asp Ser 290 295
300Gln Asn Arg His Tyr Phe Ile Glu Ile Asn Pro Arg Ile Gln Val
Glu305 310 315 320His Thr
Ile Thr Glu Glu Ile Thr Gly Val Asp Ile Val Ala Ala Gln
325 330 335Ile Gln Ile Ala Ala Gly Ala
Ser Leu Glu Gln Leu Gly Leu Leu Gln 340 345
350Asn Lys Ile Thr Thr Arg Gly Phe Ala Ile Gln Cys Arg Ile
Thr Thr 355 360 365Glu Asp Pro Ala
Lys Asn Phe Ala Pro Asp Thr Gly Lys Ile Glu Val 370
375 380Tyr Arg Ser Ala Gly Gly Asn Gly Val Arg Leu Asp
Gly Gly Asn Gly385 390 395
400Phe Ala Gly Ala Val Ile Ser Pro His Tyr Asp Ser Met Leu Val Lys
405 410 415Cys Ser Thr Ser Gly
Ser Asn Tyr Glu Ile Ala Arg Arg Lys Met Ile 420
425 430Arg Ala Leu Val Glu Phe Arg Ile Arg Gly Val Lys
Thr Asn Ile Pro 435 440 445Phe Leu
Leu Ala Leu Leu Thr His Pro Val Phe Ile Ser Gly Asp Cys 450
455 460Trp Thr Thr Phe Ile Asp Asp Thr Pro Ser Leu
Phe Glu Met Val Ser465 470 475
480Ser Lys Asn Arg Ala Gln Lys Leu Leu Ala Tyr Ile Gly Asp Leu Cys
485 490 495Val Asn Gly Ser
Ser Ile Lys Gly Gln Ile Gly Phe Pro Lys Leu Asn 500
505 510Lys Glu Ala Glu Ile Pro Asp Leu Leu Asp Pro
Asn Asp Glu Val Ile 515 520 525Asp
Val Ser Lys Pro Ser Thr Asn Gly Leu Arg Pro Tyr Leu Leu Lys 530
535 540Tyr Gly Pro Asp Ala Phe Ser Lys Lys Val
Arg Glu Phe Asp Gly Cys545 550 555
560Met Ile Met Asp Thr Thr Trp Arg Asp Ala His Gln Ser Leu Leu
Ala 565 570 575Thr Arg Val
Arg Thr Ile Asp Leu Leu Arg Ile Ala Pro Thr Thr Ser 580
585 590His Ala Leu Gln Asn Ala Phe Ala Leu Glu
Cys Trp Gly Gly Ala Thr 595 600
605Phe Asp Val Ala Met Arg Phe Leu Tyr Glu Asp Pro Trp Glu Arg Leu 610
615 620Arg Gln Leu Arg Lys Ala Val Pro
Asn Ile Pro Phe Gln Met Leu Leu625 630
635 640Arg Gly Ala Asn Gly Val Ala Tyr Ser Ser Leu Pro
Asp Asn Ala Ile 645 650
655Asp His Phe Val Lys Gln Ala Lys Asp Asn Gly Val Asp Ile Phe Arg
660 665 670Val Phe Asp Ala Leu Asn
Asp Leu Glu Gln Leu Lys Val Gly Val Asp 675 680
685Ala Val Lys Lys Ala Gly Gly Val Val Glu Ala Thr Val Cys
Tyr Ser 690 695 700Gly Asp Met Leu Ile
Pro Gly Lys Lys Tyr Asn Leu Asp Tyr Tyr Leu705 710
715 720Glu Thr Val Gly Lys Ile Val Glu Met Gly
Thr His Ile Leu Gly Ile 725 730
735Lys Asp Met Ala Gly Thr Leu Lys Pro Lys Ala Ala Lys Leu Leu Ile
740 745 750Gly Ser Ile Arg Ser
Lys Tyr Pro Asp Leu Val Ile His Val His Thr 755
760 765His Asp Ser Ala Gly Thr Gly Ile Ser Thr Tyr Val
Ala Cys Ala Leu 770 775 780Ala Gly Ala
Asp Ile Val Asp Cys Ala Ile Asn Ser Met Ser Gly Leu785
790 795 800Thr Ser Gln Pro Ser Met Ser
Ala Phe Ile Ala Ala Leu Asp Gly Asp 805
810 815Ile Glu Thr Gly Val Pro Glu His Phe Ala Arg Gln
Leu Asp Ala Tyr 820 825 830Trp
Ala Glu Met Arg Leu Leu Tyr Ser Cys Phe Glu Ala Asp Leu Lys 835
840 845Gly Pro Asp Pro Glu Val Tyr Lys His
Glu Ile Pro Gly Gly Gln Leu 850 855
860Thr Asn Leu Ile Phe Gln Ala Gln Gln Val Gly Leu Gly Glu Gln Trp865
870 875 880Glu Glu Thr Lys
Lys Lys Tyr Glu Asp Ala Asn Met Leu Leu Gly Asp 885
890 895Ile Val Lys Val Thr Pro Thr Ser Lys Val
Val Gly Asp Leu Ala Gln 900 905
910Phe Met Val Ser Asn Lys Leu Glu Lys Glu Asp Val Glu Lys Leu Ala
915 920 925Asn Glu Leu Asp Phe Pro Asp
Ser Val Leu Asp Phe Phe Glu Gly Leu 930 935
940Met Gly Thr Pro Tyr Gly Gly Phe Pro Glu Pro Leu Arg Thr Asn
Val945 950 955 960Ile Ser
Gly Lys Arg Arg Lys Leu Lys Gly Arg Pro Gly Leu Glu Leu
965 970 975Glu Pro Phe Asn Leu Glu Glu
Ile Arg Glu Asn Leu Val Ser Arg Phe 980 985
990Gly Pro Gly Ile Thr Glu Cys Asp Val Ala Ser Tyr Asn Met
Tyr Pro 995 1000 1005Lys Val Tyr
Glu Gln Tyr Arg Lys Val Val Glu Lys Tyr Gly Asp 1010
1015 1020Leu Ser Val Leu Pro Thr Lys Ala Phe Leu Ala
Pro Pro Thr Ile 1025 1030 1035Gly Glu
Glu Val His Val Glu Ile Glu Gln Gly Lys Thr Leu Ile 1040
1045 1050Ile Lys Leu Leu Ala Ile Ser Asp Leu Ser
Lys Ser His Gly Thr 1055 1060 1065Arg
Glu Val Tyr Phe Glu Leu Asn Gly Glu Met Arg Lys Val Thr 1070
1075 1080Ile Glu Asp Lys Thr Ala Ala Ile Glu
Thr Val Thr Arg Ala Lys 1085 1090
1095Ala Asp Gly His Asn Pro Asn Glu Val Gly Ala Pro Met Ala Gly
1100 1105 1110Val Val Val Glu Val Arg
Val Lys His Gly Thr Glu Val Lys Lys 1115 1120
1125Gly Asp Pro Leu Ala Val Leu Ser Ala Met Lys Met Glu Met
Val 1130 1135 1140Ile Ser Ala Pro Val
Ser Gly Arg Val Gly Glu Val Phe Val Asn 1145 1150
1155Glu Gly Asp Ser Val Asp Met Gly Asp Leu Leu Val Lys
Ile Ala 1160 1165 1170Lys Asp Glu Ala
Pro Ala Ala 1175 1180
SEQ ID NO: 95, length 1178, PRT, Saccharomyces cerevisiae
Met Ser Gln Arg Lys Phe Ala Gly Leu Arg Asp Asn Phe Asn Leu Leu1
5 10 15Gly Glu Lys Asn Lys Ile
Leu Val Ala Asn Arg Gly Glu Ile Pro Ile 20 25
30Arg Ile Phe Arg Thr Ala His Glu Leu Ser Met Gln Thr
Val Ala Ile 35 40 45Tyr Ser His
Glu Asp Arg Leu Ser Thr His Lys Gln Lys Ala Asp Glu 50
55 60Ala Tyr Val Ile Gly Glu Val Gly Gln Tyr Thr Pro
Val Gly Ala Tyr65 70 75
80Leu Ala Ile Asp Glu Ile Ile Ser Ile Ala Gln Lys His Gln Val Asp
85 90 95Phe Ile His Pro Gly Tyr
Gly Phe Leu Ser Glu Asn Ser Glu Phe Ala 100
105 110Asp Lys Val Val Lys Ala Gly Ile Thr Trp Ile Gly
Pro Pro Ala Glu 115 120 125Val Ile
Asp Ser Val Gly Asp Lys Val Ser Ala Arg Asn Leu Ala Ala 130
135 140Lys Ala Asn Val Pro Thr Val Pro Gly Thr Pro
Gly Pro Ile Glu Thr145 150 155
160Val Glu Glu Ala Leu Asp Phe Val Asn Glu Tyr Gly Tyr Pro Val Ile
165 170 175Ile Lys Ala Ala
Phe Gly Gly Gly Gly Arg Gly Met Arg Val Val Arg 180
185 190Glu Gly Asp Asp Val Ala Asp Ala Phe Gln Arg
Ala Thr Ser Glu Ala 195 200 205Arg
Thr Ala Phe Gly Asn Gly Thr Cys Phe Val Glu Arg Phe Leu Asp 210
215 220Lys Pro Lys His Ile Glu Val Gln Leu Leu
Ala Asp Asn His Gly Asn225 230 235
240Val Val His Leu Phe Glu Arg Asp Cys Ser Val Gln Arg Arg His
Gln 245 250 255Lys Val Val
Glu Val Ala Pro Ala Lys Thr Leu Pro Arg Glu Val Arg 260
265 270Asp Ala Ile Leu Thr Asp Ala Val Lys Leu
Ala Lys Glu Cys Gly Tyr 275 280
285Arg Asn Ala Gly Thr Ala Glu Phe Leu Val Asp Asn Gln Asn Arg His 290
295 300Tyr Phe Ile Glu Ile Asn Pro Arg
Ile Gln Val Glu His Thr Ile Thr305 310
315 320Glu Glu Ile Thr Gly Ile Asp Ile Val Ala Ala Gln
Ile Gln Ile Ala 325 330
335Ala Gly Ala Ser Leu Pro Gln Leu Gly Leu Phe Gln Asp Lys Ile Thr
340 345 350Thr Arg Gly Phe Ala Ile
Gln Cys Arg Ile Thr Thr Glu Asp Pro Ala 355 360
365Lys Asn Phe Gln Pro Asp Thr Gly Arg Ile Glu Val Tyr Arg
Ser Ala 370 375 380Gly Gly Asn Gly Val
Arg Leu Asp Gly Gly Asn Ala Tyr Ala Gly Thr385 390
395 400Ile Ile Ser Pro His Tyr Asp Ser Met Leu
Val Lys Cys Ser Cys Ser 405 410
415Gly Ser Thr Tyr Glu Ile Val Arg Arg Lys Met Ile Arg Ala Leu Ile
420 425 430Glu Phe Arg Ile Arg
Gly Val Lys Thr Asn Ile Pro Phe Leu Leu Thr 435
440 445Leu Leu Thr Asn Pro Val Phe Ile Glu Gly Thr Tyr
Trp Thr Thr Phe 450 455 460Ile Asp Asp
Thr Pro Gln Leu Phe Gln Met Val Ser Ser Gln Asn Arg465
470 475 480Ala Gln Lys Leu Leu His Tyr
Leu Ala Asp Val Ala Val Asn Gly Ser 485
490 495Ser Ile Lys Gly Gln Ile Gly Leu Pro Lys Leu Lys
Ser Asn Pro Ser 500 505 510Val
Pro His Leu His Asp Ala Gln Gly Asn Val Ile Asn Val Thr Lys 515
520 525Ser Ala Pro Pro Ser Gly Trp Arg Gln
Val Leu Leu Glu Lys Gly Pro 530 535
540Ala Glu Phe Ala Arg Gln Val Arg Gln Phe Asn Gly Thr Leu Leu Met545
550 555 560Asp Thr Thr Trp
Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Val 565
570 575Arg Thr His Asp Leu Ala Thr Ile Ala Pro
Thr Thr Ala His Ala Leu 580 585
590Ala Gly Arg Phe Ala Leu Glu Cys Trp Gly Gly Ala Thr Phe Asp Val
595 600 605Ala Met Arg Phe Leu His Glu
Asp Pro Trp Glu Arg Leu Arg Lys Leu 610 615
620Arg Ser Leu Val Pro Asn Ile Pro Phe Gln Met Leu Leu Arg Gly
Ala625 630 635 640Asn Gly
Val Ala Tyr Ser Ser Leu Pro Asp Asn Ala Ile Asp His Phe
645 650 655Val Lys Gln Ala Lys Asp Asn
Gly Val Asp Ile Phe Arg Val Phe Asp 660 665
670Ala Leu Asn Asp Leu Glu Gln Leu Lys Val Gly Val Asp Ala
Val Lys 675 680 685Lys Ala Gly Gly
Val Val Glu Ala Thr Val Cys Phe Ser Gly Asp Met 690
695 700Leu Gln Pro Gly Lys Lys Tyr Asn Leu Asp Tyr Tyr
Leu Glu Ile Ala705 710 715
720Glu Lys Ile Val Gln Met Gly Thr His Ile Leu Gly Ile Lys Asp Met
725 730 735Ala Gly Thr Met Lys
Pro Ala Ala Ala Lys Leu Leu Ile Gly Ser Leu 740
745 750Arg Ala Lys Tyr Pro Asp Leu Pro Ile His Val His
Thr His Asp Ser 755 760 765Ala Gly
Thr Ala Val Ala Ser Met Thr Ala Cys Ala Leu Ala Gly Ala 770
775 780Asp Val Val Asp Val Ala Ile Asn Ser Met Ser
Gly Leu Thr Ser Gln785 790 795
800Pro Ser Ile Asn Ala Leu Leu Ala Ser Leu Glu Gly Asn Ile Asp Thr
805 810 815Gly Ile Asn Val
Glu His Val Arg Glu Leu Asp Ala Tyr Trp Ala Glu 820
825 830Met Arg Leu Leu Tyr Ser Cys Phe Glu Ala Asp
Leu Lys Gly Pro Asp 835 840 845Pro
Glu Val Tyr Gln His Glu Ile Pro Gly Gly Gln Leu Thr Asn Leu 850
855 860Leu Phe Gln Ala Gln Gln Leu Gly Leu Gly
Glu Gln Trp Ala Glu Thr865 870 875
880Lys Arg Ala Tyr Arg Glu Ala Asn Tyr Leu Leu Gly Asp Ile Val
Lys 885 890 895Val Thr Pro
Thr Ser Lys Val Val Gly Asp Leu Ala Gln Phe Met Val 900
905 910Ser Asn Lys Leu Thr Ser Asp Asp Val Arg
Arg Leu Ala Asn Ser Leu 915 920
925Asp Phe Pro Asp Ser Val Met Asp Phe Phe Glu Gly Leu Ile Gly Gln 930
935 940Pro Tyr Gly Gly Phe Pro Glu Pro
Phe Arg Ser Asp Val Leu Arg Asn945 950
955 960Lys Arg Arg Lys Leu Thr Cys Arg Pro Gly Leu Glu
Leu Glu Pro Phe 965 970
975Asp Leu Glu Lys Ile Arg Glu Asp Leu Gln Asn Arg Phe Gly Asp Val
980 985 990Asp Glu Cys Asp Val Ala
Ser Tyr Asn Met Tyr Pro Arg Val Tyr Glu 995 1000
1005Asp Phe Gln Lys Met Arg Glu Thr Tyr Gly Asp Leu
Ser Val Leu 1010 1015 1020Pro Thr Arg
Ser Phe Leu Ser Pro Leu Glu Thr Asp Glu Glu Ile 1025
1030 1035Glu Val Val Ile Glu Gln Gly Lys Thr Leu Ile
Ile Lys Leu Gln 1040 1045 1050Ala Val
Gly Asp Leu Asn Lys Lys Thr Gly Glu Arg Glu Val Tyr 1055
1060 1065Phe Asp Leu Asn Gly Glu Met Arg Lys Ile
Arg Val Ala Asp Arg 1070 1075 1080Ser
Gln Lys Val Glu Thr Val Thr Lys Ser Lys Ala Asp Met His 1085
1090 1095Asp Pro Leu His Ile Gly Ala Pro Met
Ala Gly Val Ile Val Glu 1100 1105
1110Val Lys Val His Lys Gly Ser Leu Ile Lys Lys Gly Gln Pro Val
1115 1120 1125Ala Val Leu Ser Ala Met
Lys Met Glu Met Ile Ile Ser Ser Pro 1130 1135
1140Ser Asp Gly Gln Val Lys Glu Val Phe Val Ser Asp Gly Glu
Asn 1145 1150 1155Val Asp Ser Ser Asp
Leu Leu Val Leu Leu Glu Asp Gln Val Pro 1160 1165
1170Val Glu Thr Lys Ala 1175961175PRTKluyveromyces
marxianus 96Met Ser Thr Gln Asn Asp Leu Ala Gly Leu Arg Asp Asn Ser Asn
Leu1 5 10 15Leu Gly Glu
Lys Asn Lys Ile Leu Val Ala Asn Arg Gly Glu Ile Pro 20
25 30Ile Arg Ile Phe Arg Thr Ala His Glu Leu
Ser Met Lys Thr Val Ala 35 40
45Ile Tyr Ser His Glu Asp Arg Leu Ser Met His Arg Leu Lys Ala Asp 50
55 60Glu Ala Tyr Val Ile Gly Glu Pro Gly
Lys Tyr Thr Pro Val Gly Ala65 70 75
80Tyr Leu Ala Ile Asp Glu Ile Ile Lys Ile Ala Gln Leu His
Gly Val 85 90 95Ser Phe
Ile His Pro Gly Tyr Gly Phe Leu Ser Glu Asn Ser Glu Phe 100
105 110Ala Lys Lys Val Ala Asp Ser Gly Ile
Thr Trp Val Gly Pro Pro Ala 115 120
125Asp Val Ile Asp Ala Val Gly Asp Lys Val Ser Ala Arg Asn Leu Ala
130 135 140Glu Arg Ala Asp Val Pro Val
Val Pro Gly Thr Pro Gly Pro Ile Glu145 150
155 160Thr Val Glu Glu Ala Val Glu Phe Val Glu Lys Tyr
Gly Tyr Pro Val 165 170
175Ile Ile Lys Ala Ala Phe Gly Gly Gly Gly Arg Gly Met Arg Val Val
180 185 190Arg Glu Gly Asp Asp Ile
Ala Asp Ala Phe Gln Arg Ala Lys Ser Glu 195 200
205Ala Val Thr Ala Phe Gly Asn Gly Thr Cys Phe Val Glu Arg
Phe Leu 210 215 220Asp Lys Pro Lys His
Ile Glu Val Gln Leu Leu Ala Asp His Tyr Gly225 230
235 240Asn Val Ile His Leu Phe Glu Arg Asp Cys
Ser Val Gln Arg Arg His 245 250
255Gln Lys Val Val Glu Val Ala Pro Ala Lys Thr Leu Pro Glu Ser Val
260 265 270Arg Asn Ala Ile Leu
Thr Asp Ala Val Lys Leu Ala Lys Glu Ala Gly 275
280 285Tyr Arg Asn Ala Gly Thr Ala Glu Phe Leu Val Asp
Asn Gln Asn Arg 290 295 300His Tyr Phe
Ile Glu Ile Asn Pro Arg Ile Gln Val Glu His Thr Ile305
310 315 320Thr Glu Glu Ile Thr Gly Ile
Asp Ile Val Ala Ala Gln Ile Gln Ile 325
330 335Ala Ala Gly Ala Ser Leu Glu Gln Leu Gly Leu Leu
Gln Asp Arg Ile 340 345 350Thr
Thr Arg Gly Phe Ala Ile Gln Cys Arg Ile Thr Thr Glu Asp Pro 355
360 365Ser Lys Asn Phe Gln Pro Asp Thr Gly
Arg Ile Asp Val Tyr Arg Ser 370 375
380Ala Gly Gly Asn Gly Val Arg Leu Asp Gly Gly Asn Ala Phe Ala Gly385
390 395 400Ser Val Ile Ser
Pro His Tyr Asp Ser Met Leu Val Lys Cys Ser Cys 405
410 415Ser Gly Ser Thr Tyr Glu Ile Val Arg Arg
Lys Met Leu Arg Ala Leu 420 425
430Ile Glu Phe Arg Ile Arg Gly Val Lys Thr Asn Ile Pro Phe Leu Leu
435 440 445Thr Leu Leu Thr His Pro Val
Phe Lys Ser Gly Asp Tyr Trp Thr Thr 450 455
460Phe Ile Asp Asp Thr Pro Gln Leu Phe Glu Met Val Ser Ser Gln
Asn465 470 475 480Arg Ala
Gln Lys Leu Leu His Tyr Leu Ala Asp Leu Ala Val Asn Gly
485 490 495Ser Ser Ile Lys Gly Gln Ile
Gly Leu Pro Lys Leu Lys Thr His Pro 500 505
510Thr Ile Pro His Leu His Lys Ala Asp Gly Ser Ile Leu Asp
Val Ser 515 520 525Ala Lys Pro Pro
Ala Gly Trp Arg Asp Val Leu Leu Gln His Gly Pro 530
535 540Glu Glu Phe Ala Lys Gln Val Arg Lys Phe Lys Gly
Thr Leu Leu Met545 550 555
560Asp Thr Thr Trp Arg Asp Ala His Gln Ser Leu Leu Ala Thr Arg Val
565 570 575Arg Thr Tyr Asp Leu
Ala Ala Ile Ala Pro Thr Thr Ala His Ala Leu 580
585 590Ser Gly Ala Phe Ala Leu Glu Cys Trp Gly Gly Ala
Thr Phe Asp Val 595 600 605Ser Met
Arg Phe Leu His Glu Asp Pro Trp Glu Arg Leu Arg Thr Leu 610
615 620Arg Lys Leu Val Pro Asn Ile Pro Phe Gln Met
Leu Leu Arg Gly Ala625 630 635
640Asn Gly Val Ala Tyr Ser Ser Leu Pro Asp Asn Ala Ile Asp His Phe
645 650 655Val Lys Gln Ala
Lys Asp Asn Gly Val Asp Ile Phe Arg Val Phe Asp 660
665 670Ala Leu Asn Asp Leu Glu Gln Leu Thr Val Gly
Val Asp Ala Val Lys 675 680 685Lys
Ala Gly Gly Val Val Glu Ala Thr Ile Cys Tyr Ser Gly Asp Met 690
695 700Leu Ala Pro Gly Lys Lys Tyr Asn Leu Asp
Tyr Tyr Leu Asp Ile Val705 710 715
720Glu Gln Val Val Lys Arg Gly Thr His Ile Leu Gly Ile Lys Asp
Met 725 730 735Ala Gly Thr
Leu Lys Pro Ser Ala Ala Lys Leu Leu Ile Gly Ser Ile 740
745 750Arg Thr Lys Tyr Pro Asp Leu Pro Ile His
Val His Thr His Asp Ser 755 760
765Ala Gly Thr Gly Val Ala Ser Met Ala Ala Cys Ala Phe Ala Gly Ala 770
775 780Asp Val Val Asp Val Ala Thr Asn
Ser Met Ser Gly Met Thr Ser Gln785 790
795 800Pro Ser Val Asn Ala Leu Leu Ala Ala Leu Asp Gly
Glu Ile Asp Cys 805 810
815Asn Val Asn Val Ser Tyr Ile Ser Gln Leu Asp Ala Tyr Trp Ala Glu
820 825 830Met Arg Leu Leu Tyr Ser
Cys Phe Glu Ala Asp Leu Lys Gly Pro Asp 835 840
845Pro Glu Val Tyr Val His Glu Ile Pro Gly Gly Gln Leu Thr
Asn Leu 850 855 860Leu Phe Gln Ala Gln
Gln Leu Gly Leu Gly Glu Gln Trp Ala Glu Thr865 870
875 880Lys Arg Ala Tyr Arg Glu Ala Asn Leu Leu
Leu Gly Asp Val Val Lys 885 890
895Val Thr Pro Thr Ser Lys Val Val Gly Asp Leu Ala Gln Phe Met Val
900 905 910Thr Asn Lys Leu Thr
Ser Asp Asp Val Lys Arg Leu Ala Ser Ser Leu 915
920 925Asp Phe Pro Asp Ser Val Met Asp Phe Phe Glu Gly
Leu Ile Gly Gln 930 935 940Pro Tyr Gly
Gly Phe Pro Glu Pro Leu Arg Ser Asp Val Leu Lys Asn945
950 955 960Lys Arg Arg Lys Leu Thr Lys
Arg Pro Gly Leu Glu Leu Ala Pro Phe 965
970 975Asp Leu Glu Gly Ile Lys Glu Asp Leu Thr Asn Arg
Phe Gly Asp Ile 980 985 990Asp
Asp Cys Asp Val Ala Ser Tyr Asn Met Tyr Pro Lys Val Tyr Glu 995
1000 1005Asp Phe Arg Lys Ile Arg Glu Lys
Tyr Gly Asp Leu Ser Val Leu 1010 1015
1020Pro Thr Lys Asn Phe Leu Ser Pro Pro Ser Ile Gly Glu Glu Ile
1025 1030 1035Val Val Thr Ile Glu Gln
Gly Lys Thr Leu Ile Ile Lys Pro Gln 1040 1045
1050Ala Ile Gly Asp Leu Asn Lys Glu Thr Gly Ile Arg Glu Val
Tyr 1055 1060 1065Phe Glu Leu Asn Gly
Glu Leu Arg Lys Val Ser Val Ala Asp Arg 1070 1075
1080Ser Gln Lys Val Glu Thr Ile Ser Lys Pro Lys Ala Asp
Ala His 1085 1090 1095Asp Pro Phe Gln
Val Gly Ser Pro Met Ala Gly Val Val Val Glu 1100
1105 1110Val Lys Val His Lys Gly Ser Leu Ile Ser Lys
Gly Gln Pro Val 1115 1120 1125Ala Val
Leu Ser Ala Met Lys Met Glu Met Val Ile Ser Ser Pro 1130
1135 1140Ser Asp Gly Gln Val Lys Glu Val Leu Val
Lys Asp Gly Glu Asn 1145 1150 1155Val
Asp Ala Ser Asp Leu Leu Val Val Leu Glu Glu Ala Pro Ala 1160
1165 1170Lys Glu 117597883PRTEscherichia coli
97Met Asn Glu Gln Tyr Ser Ala Leu Arg Ser Asn Val Ser Met Leu Gly1
5 10 15Lys Val Leu Gly Glu Thr
Ile Lys Asp Ala Leu Gly Glu His Ile Leu 20 25
30Glu Arg Val Glu Thr Ile Arg Lys Leu Ser Lys Ser Ser
Arg Ala Gly 35 40 45Asn Asp Ala
Asn Arg Gln Glu Leu Leu Thr Thr Leu Gln Asn Leu Ser 50
55 60Asn Asp Glu Leu Leu Pro Val Ala Arg Ala Phe Ser
Gln Phe Leu Asn65 70 75
80Leu Ala Asn Thr Ala Glu Gln Tyr His Ser Ile Ser Pro Lys Gly Glu
85 90 95Ala Ala Ser Asn Pro Glu
Val Ile Ala Arg Thr Leu Arg Lys Leu Lys 100
105 110Asn Gln Pro Glu Leu Ser Glu Asp Thr Ile Lys Lys
Ala Val Glu Ser 115 120 125Leu Ser
Leu Glu Leu Val Leu Thr Ala His Pro Thr Glu Ile Thr Arg 130
135 140Arg Thr Leu Ile His Lys Met Val Glu Val Asn
Ala Cys Leu Lys Gln145 150 155
160Leu Asp Asn Lys Asp Ile Ala Asp Tyr Glu His Asn Gln Leu Met Arg
165 170 175Arg Leu Arg Gln
Leu Ile Ala Gln Ser Trp His Thr Asp Glu Ile Arg 180
185 190Lys Leu Arg Pro Ser Pro Val Asp Glu Ala Lys
Trp Gly Phe Ala Val 195 200 205Val
Glu Asn Ser Leu Trp Gln Gly Val Pro Asn Tyr Leu Arg Glu Leu 210
215 220Asn Glu Gln Leu Glu Glu Asn Leu Gly Tyr
Lys Leu Pro Val Glu Phe225 230 235
240Val Pro Val Arg Phe Thr Ser Trp Met Gly Gly Asp Arg Asp Gly
Asn 245 250 255Pro Asn Val
Thr Ala Asp Ile Thr Arg His Val Leu Leu Leu Ser Arg 260
265 270Trp Lys Ala Thr Asp Leu Phe Leu Lys Asp
Ile Gln Val Leu Val Ser 275 280
285Glu Leu Ser Met Val Glu Ala Thr Pro Glu Leu Leu Ala Leu Val Gly 290
295 300Glu Glu Gly Ala Ala Glu Pro Tyr
Arg Tyr Leu Met Lys Asn Leu Arg305 310
315 320Ser Arg Leu Met Ala Thr Gln Ala Trp Leu Glu Ala
Arg Leu Lys Gly 325 330
335Glu Glu Leu Pro Lys Pro Glu Gly Leu Leu Thr Gln Asn Glu Glu Leu
340 345 350Trp Glu Pro Leu Tyr Ala
Cys Tyr Gln Ser Leu Gln Ala Cys Gly Met 355 360
365Gly Ile Ile Ala Asn Gly Asp Leu Leu Asp Thr Leu Arg Arg
Val Lys 370 375 380Cys Phe Gly Val Pro
Leu Val Arg Ile Asp Ile Arg Gln Glu Ser Thr385 390
395 400Arg His Thr Glu Ala Leu Gly Glu Leu Thr
Arg Tyr Leu Gly Ile Gly 405 410
415Asp Tyr Glu Ser Trp Ser Glu Ala Asp Lys Gln Ala Phe Leu Ile Arg
420 425 430Glu Leu Asn Ser Lys
Arg Pro Leu Leu Pro Arg Asn Trp Gln Pro Ser 435
440 445Ala Glu Thr Arg Glu Val Leu Asp Thr Cys Gln Val
Ile Ala Glu Ala 450 455 460Pro Gln Gly
Ser Ile Ala Ala Tyr Val Ile Ser Met Ala Lys Thr Pro465
470 475 480Ser Asp Val Leu Ala Val His
Leu Leu Leu Lys Glu Ala Gly Ile Gly 485
490 495Phe Ala Met Pro Val Ala Pro Leu Phe Glu Thr Leu
Asp Asp Leu Asn 500 505 510Asn
Ala Asn Asp Val Met Thr Gln Leu Leu Asn Ile Asp Trp Tyr Arg 515
520 525Gly Leu Ile Gln Gly Lys Gln Met Val
Met Ile Gly Tyr Ser Asp Ser 530 535
540Ala Lys Asp Ala Gly Val Met Ala Ala Ser Trp Ala Gln Tyr Gln Ala545
550 555 560Gln Asp Ala Leu
Ile Lys Thr Cys Glu Lys Ala Gly Ile Glu Leu Thr 565
570 575Leu Phe His Gly Arg Gly Gly Ser Ile Gly
Arg Gly Gly Ala Pro Ala 580 585
590His Ala Ala Leu Leu Ser Gln Pro Pro Gly Ser Leu Lys Gly Gly Leu
595 600 605Arg Val Thr Glu Gln Gly Glu
Met Ile Arg Phe Lys Tyr Gly Leu Pro 610 615
620Glu Ile Thr Val Ser Ser Leu Ser Leu Tyr Thr Gly Ala Ile Leu
Glu625 630 635 640Ala Asn
Leu Leu Pro Pro Pro Glu Pro Lys Glu Ser Trp Arg Arg Ile
645 650 655Met Asp Glu Leu Ser Val Ile
Ser Cys Asp Val Tyr Arg Gly Tyr Val 660 665
670Arg Glu Asn Lys Asp Phe Val Pro Tyr Phe Arg Ser Ala Thr
Pro Glu 675 680 685Gln Glu Leu Gly
Lys Leu Pro Leu Gly Ser Arg Pro Ala Lys Arg Arg 690
695 700Pro Thr Gly Gly Val Glu Ser Leu Arg Ala Ile Pro
Trp Ile Phe Ala705 710 715
720Trp Thr Gln Asn Arg Leu Met Leu Pro Ala Trp Leu Gly Ala Gly Thr
725 730 735Ala Leu Gln Lys Val
Val Glu Asp Gly Lys Gln Ser Glu Leu Glu Ala 740
745 750Met Cys Arg Asp Trp Pro Phe Phe Ser Thr Arg Leu
Gly Met Leu Glu 755 760 765Met Val
Phe Ala Lys Ala Asp Leu Trp Leu Ala Glu Tyr Tyr Asp Gln 770
775 780Arg Leu Val Asp Lys Ala Leu Trp Pro Leu Gly
Lys Glu Leu Arg Asn785 790 795
800Leu Gln Glu Glu Asp Ile Lys Val Val Leu Ala Ile Ala Asn Asp Ser
805 810 815His Leu Met Ala
Asp Leu Pro Trp Ile Ala Glu Ser Ile Gln Leu Arg 820
825 830Asn Ile Tyr Thr Asp Pro Leu Asn Val Leu Gln
Ala Glu Leu Leu His 835 840 845Arg
Ser Arg Gln Ala Glu Lys Glu Gly Gln Glu Pro Asp Pro Arg Val 850
855 860Glu Gln Ala Leu Met Val Thr Ile Ala Gly
Ile Ala Ala Gly Met Arg865 870 875
880Asn Thr Gly98342PRTIssatchenkia orientalis 98Met Ser Asn Val
Lys Val Ala Leu Leu Gly Ala Ala Gly Gly Ile Gly1 5
10 15Gln Pro Leu Ala Leu Leu Leu Lys Leu Asn
Pro Asn Ile Thr His Leu 20 25
30Ala Leu Tyr Asp Val Val His Val Pro Gly Val Ala Ala Asp Leu His
35 40 45His Ile Asp Thr Asp Val Val Ile
Thr His His Leu Lys Asp Glu Asp 50 55
60Gly Thr Ala Leu Ala Asn Ala Leu Lys Asp Ala Thr Phe Val Ile Val65
70 75 80Pro Ala Gly Val Pro
Arg Lys Pro Gly Met Thr Arg Gly Asp Leu Phe 85
90 95Thr Ile Asn Ala Gly Ile Cys Ala Glu Leu Ala
Asn Ala Ile Ser Leu 100 105
110Asn Ala Pro Asn Ala Phe Thr Leu Val Ile Thr Asn Pro Val Asn Ser
115 120 125Thr Val Pro Ile Phe Lys Glu
Ile Phe Ala Lys Asn Glu Ala Phe Asn 130 135
140Pro Arg Arg Leu Phe Gly Val Thr Ala Leu Asp His Val Arg Ser
Asn145 150 155 160Thr Phe
Leu Ser Glu Leu Ile Asp Gly Lys Asn Pro Gln His Phe Asp
165 170 175Val Thr Val Val Gly Gly His
Ser Gly Asn Ser Ile Val Pro Leu Phe 180 185
190Ser Leu Val Lys Ala Ala Glu Asn Leu Asp Asp Glu Ile Ile
Asp Ala 195 200 205Leu Ile His Arg
Val Gln Tyr Gly Gly Asp Glu Val Val Glu Ala Lys 210
215 220Ser Gly Ala Gly Ser Ala Thr Leu Ser Met Ala Tyr
Ala Ala Asn Lys225 230 235
240Phe Phe Asn Ile Leu Leu Asn Gly Tyr Trp Gly Leu Lys Lys Thr Met
245 250 255Ile Ser Ser Tyr Val
Phe Leu Asp Asp Ser Ile Asn Gly Val Pro Gln 260
265 270Leu Lys Glu Asn Leu Ser Lys Leu Leu Lys Gly Ser
Glu Val Glu Leu 275 280 285Pro Ser
Tyr Leu Ala Val Pro Met Thr Tyr Gly Lys Glu Gly Ile Glu 290
295 300Gln Val Phe Tyr Asp Trp Val Phe Glu Met Ser
Pro Lys Glu Lys Glu305 310 315
320Asn Phe Ile Thr Ala Ile Glu Tyr Ile Asp Gln Asn Ile Glu Lys Gly
325 330 335Leu Asn Phe Met
Val Arg 34099333PRTIssatchenkia orientalis 99Met Val Lys Val
Thr Ile Leu Gly Ala Ala Gly Gly Ile Gly Gln Pro1 5
10 15Leu Ser Leu Leu Leu Arg Leu Asn Pro Trp
Ile Asp Glu Leu Ala Leu 20 25
30Phe Asp Ile Val Asn Thr Pro Gly Val Ser Cys Asp Leu Ser His Ile
35 40 45Pro Ala Ser Gln Val Val Asn Gly
Tyr Ala Pro Lys Ser Lys Ser Asp 50 55
60Thr Glu Thr Ile Lys Thr Ala Leu Lys Gly Ala Asp Ile Val Val Ile65
70 75 80Pro Ala Gly Ile Pro
Arg Lys Pro Gly Met Thr Arg Asn Asp Leu Phe 85
90 95Lys Ile Asn Ala Gly Ile Val Lys Ser Leu Ile
His Ser Ala Gly Thr 100 105
110Thr Cys Pro Asp Ala Phe Ile Cys Val Ile Ser Asn Pro Val Asn Ser
115 120 125Thr Val Pro Ile Ala Val Glu
Glu Leu Lys Arg Leu Asn Val Phe Asn 130 135
140Pro His Lys Val Phe Gly Ile Thr Thr Leu Asp Asn Phe Arg Leu
Glu145 150 155 160Glu Phe
Leu Ser Gly Glu Leu Gly Gly Ile Val Lys Pro Asn Asp Leu
165 170 175Tyr Gly Asp Val Val Ala Ile
Gly Gly His Ser Gly Asp Ser Ile Val 180 185
190Pro Ile Leu Asn Ser Trp Asn Leu Asn Phe Ile Asn Asp Gly
Asp Ser 195 200 205Tyr Asn Asn Leu
Val Lys Arg Val Gln Phe Gly Gly Asp Glu Val Val 210
215 220Lys Ala Lys Asp Gly Lys Gly Ser Ala Thr Leu Ser
Met Ala Thr Ala225 230 235
240Ala Tyr Arg Phe Val Asn Asn Leu Leu Asp Ala Ile Val Asn Asn Lys
245 250 255Lys Val Lys Glu Val
Ala Phe Val Lys Ile Asp Gln Leu Pro Thr Thr 260
265 270Arg Val Pro Tyr Phe Val Val Asp Glu Thr Gln Tyr
Phe Ser Leu Pro 275 280 285Ile Ile
Leu Gly Arg Gln Gly Ile Glu Arg Val Thr Phe Pro Glu Ser 290
295 300Leu Thr Glu Gln Glu Val Arg Met Thr Lys His
Ala Val Ala Lys Val305 310 315
320Lys Val Asp Val Asn Lys Gly Phe Asn Phe Val His Gly
325 330100334PRTIssatchenkia orientalis 100Met Phe Ser
Arg Ile Ser Ala Arg Gln Phe Ser Ser Ser Ala Ala Ser1 5
10 15Ala Tyr Lys Val Thr Val Leu Gly Ala
Ala Gly Gly Ile Gly Gln Pro 20 25
30Leu Ser Leu Leu Met Lys Leu Asn His Lys Val Thr Asn Leu Ser Leu
35 40 45Tyr Asp Leu Arg Leu Gly Ala
Gly Val Ala Thr Asp Leu Ser His Ile 50 55
60Pro Thr Asn Ser Val Val Lys Gly Tyr Gly Pro Glu Asn Asn Gly Leu65
70 75 80Lys Asp Ala Leu
Thr Gly Ser Asp Val Val Leu Ile Pro Ala Gly Val 85
90 95Pro Arg Lys Pro Gly Met Thr Arg Asp Asp
Leu Phe Asn Thr Asn Ala 100 105
110Ser Ile Val Arg Asp Leu Ala Lys Ala Ala Ala Asp His Cys Pro Asn
115 120 125Ala Val Leu Leu Ile Ile Ser
Asn Pro Val Asn Ser Thr Val Pro Ile 130 135
140Val Ala Glu Val Leu Lys Ser Lys Gly Val Tyr Asn Pro Lys Lys
Leu145 150 155 160Phe Gly
Val Thr Thr Leu Asp Val Leu Arg Ser Ser Arg Phe Leu Ser
165 170 175Glu Val Val Asn Thr Asp Pro
Thr Thr Glu Thr Val Thr Val Val Gly 180 185
190Gly His Ser Gly Val Thr Ile Val Pro Leu Ile Ser Gln Thr
Lys His 195 200 205Lys Asp Leu Pro
Lys Glu Thr Tyr Glu Ala Leu Val His Arg Ile Gln 210
215 220Phe Gly Gly Asp Glu Val Val Lys Ala Lys Asp Gly
Ala Gly Ser Ala225 230 235
240Thr Leu Ser Met Ala Gln Ala Gly Ala Arg Met Ala Ser Ser Val Leu
245 250 255Lys Gly Leu Ala Gly
Glu Val Asp Ile Val Glu Pro Thr Phe Ile Asp 260
265 270Ser Pro Leu Phe Lys Ser Glu Gly Val Glu Phe Phe
Ser Ser Arg Val 275 280 285Thr Leu
Gly Pro Glu Gly Val Gln Glu Val His Pro Leu Gly Val Leu 290
295 300Ser Thr Ala Glu Glu Glu Met Val Ala Thr Ala
Lys Glu Thr Leu Lys305 310 315
320Lys Asn Ile Gln Lys Gly Val Asp Phe Val Lys Ala Asn Pro
325 330101356PRTZygosaccharomyces rouxii 101Met Pro
His Ser Ile Asn Gly Asp Val Lys Ile Ala Val Leu Gly Ala1 5
10 15Ala Gly Gly Ile Gly Gln Ser Leu
Ser Leu Leu Leu Lys Thr Gln Leu 20 25
30Thr Arg Glu Leu Pro Asn His Arg His Ala Gln Leu Ala Leu Tyr
Asp 35 40 45Val Asn Ala Asp Ala
Val Arg Gly Val Ala Ala Asp Leu Ser His Ile 50 55
60Asp Thr Gly Val Thr Val Thr Gly Tyr Glu Gly Asp Arg Ile
Gly Glu65 70 75 80Ala
Leu Glu Gly Thr Asp Ile Val Leu Ile Pro Ala Gly Val Pro Arg
85 90 95Lys Pro Gly Met Thr Arg Glu
Asp Leu Leu Val Val Asn Ala Lys Ile 100 105
110Val Lys Ser Ile Gly Ser Ser Ile Ala Gln His Cys Asp Leu
Asn Lys 115 120 125Val Phe Ile Leu
Leu Ile Ser Asn Pro Ile Asn Ser Leu Val Pro Val 130
135 140Leu Val Lys Glu Leu Glu Ser Lys Ser Gln Gly Thr
Gln Val Glu Arg145 150 155
160Arg Val Leu Gly Leu Thr Lys Leu Asp Ser Val Arg Ala Ser Ala Phe
165 170 175Leu His Glu Val Thr
Ile Lys His Gly Leu Lys Pro Lys Ser Asn Thr 180
185 190Leu Asp Asp Val Pro Val Val Gly Gly His Ser Gly
Glu Thr Ile Val 195 200 205Pro Leu
Phe Ser Gln Ala Pro Asn Gly Asn Arg Leu Ser Gln Asp Ala 210
215 220Leu Glu Ala Leu Val Gln Arg Val Gln Phe Gly
Gly Asp Glu Val Val225 230 235
240Arg Ala Lys Asn Gly Ala Gly Ser Ala Thr Leu Cys Met Ala His Ala
245 250 255Ala Tyr Thr Val
Ala Ala Ser Phe Ile Pro Leu Ile Thr Gly Gln Lys 260
265 270Arg Ser Ile Ser Gly Thr Phe Tyr Val Ala Leu
Lys Asp Ala Gln Gly 275 280 285Gln
Pro Ile Asn Ser Ser Ala Lys Arg Leu Leu Gly Ser Ile Asn Asp 290
295 300Leu Pro Tyr Phe Ala Val Pro Leu Glu Ile
Thr Ser Gln Gly Val Asp305 310 315
320Glu Leu Asp Thr Ser Val Leu Glu Arg Met Thr Lys Tyr Glu Arg
Glu 325 330 335Arg Leu Leu
Ala Pro Cys Leu Gly Lys Leu Glu Gly Gly Ile Arg Asn 340
345 350Gly Leu Ser Leu
355102338PRTKluyveromyces marxianus 102Met Leu Arg Ala Leu Thr Arg Arg
Gln Phe Ser Ser Thr Ala Phe Asn1 5 10
15Pro Tyr Lys Val Thr Val Leu Gly Ala Gly Gly Gly Ile Gly
Gln Pro 20 25 30Leu Ser Leu
Leu Leu Lys Leu Asn His Lys Val Thr Asp Leu Arg Leu 35
40 45Tyr Asp Leu Lys Gly Ala Lys Gly Val Ala Ala
Asp Leu Ser His Ile 50 55 60Pro Thr
Asn Ser Thr Val Thr Gly Tyr Thr Pro Glu Ser Lys Asp Ser65
70 75 80Gln Glu Glu Leu Ala Ala Ala
Leu Lys Asp Thr Glu Val Val Leu Ile 85 90
95Pro Ala Gly Val Pro Arg Lys Pro Gly Met Thr Arg Asp
Asp Leu Phe 100 105 110Ala Ile
Asn Ala Gly Ile Val Arg Asp Leu Ala Thr Ser Ile Ala Lys 115
120 125Asn Ala Pro Asn Ala Ala Ile Leu Val Ile
Ser Asn Pro Val Asn Ser 130 135 140Thr
Val Pro Ile Val Ala Glu Val Leu Lys Gln Asn Gly Val Tyr Asn145
150 155 160Pro Lys Lys Leu Phe Gly
Val Thr Thr Leu Asp Val Ile Arg Ala Ser 165
170 175Arg Phe Ile Ser Glu Val Arg Gly Thr Asp Pro Thr
Thr Glu His Val 180 185 190Thr
Val Val Gly Gly His Ser Gly Ile Thr Ile Leu Pro Leu Val Ser 195
200 205Gln Thr Lys His Lys Ser Val Ile Lys
Gly Glu Glu Leu Asp Asn Leu 210 215
220Ile His Arg Ile Gln Phe Gly Gly Asp Glu Val Val Gln Ala Lys Asn225
230 235 240Gly Ala Gly Ser
Ala Thr Leu Ser Met Ala Gln Ala Gly Ala Arg Phe 245
250 255Ala Asn Ser Val Leu Ser Gly Phe Glu Gly
Glu Arg Asp Val Ile Glu 260 265
270Pro Thr Phe Val Asp Ser Pro Leu Phe Lys Asp Glu Gly Ile Glu Phe
275 280 285Phe Ala Ser Pro Val Thr Leu
Gly Pro Glu Gly Val Glu Lys Ile His 290 295
300Gly Leu Gly Val Leu Ser Asp Lys Glu Glu Gln Met Leu Ala Thr
Cys305 310 315 320Lys Glu
Thr Leu Lys Lys Asn Ile Glu Lys Gly Gln Asn Phe Val Lys
325 330 335Gln Asn103335PRTKluyveromyces
marxianus 103Met Val Ser Val Ala Val Leu Gly Ser Ser Gly Gly Ile Gly Gln
Pro1 5 10 15Leu Ser Leu
Leu Leu Lys Leu Asp Pro Arg Val Ser Ser Leu Arg Leu 20
25 30Tyr Asp Leu Lys Met Ser His Gly Ile Ala
Thr Asp Leu Ser His Met 35 40
45Asp Ser Asn Ser Ile Cys Glu Gly Phe Asn Thr Asp Glu Ile Ala Leu 50
55 60Ala Leu Lys Gly Ala Gln Ile Val Val
Ile Pro Ala Gly Val Pro Arg65 70 75
80Lys Pro Gly Met Ser Arg Asp Asp Leu Phe Lys Ile Asn Ala
Lys Ile 85 90 95Ile Lys
Ser Leu Ala Leu Gln Ile Ala Glu His Ala Pro Glu Ala Arg 100
105 110Val Leu Val Ile Ser Asn Pro Val Asn
Ser Leu Val Pro Ile Val Tyr 115 120
125Glu Thr Leu Lys Ser Val Gly Lys Phe Glu Pro Gly Lys Val Met Gly
130 135 140Ile Thr Thr Leu Asp Ile Ile
Arg Ser His Thr Phe Leu Val Asp Val145 150
155 160Leu Gly Arg Lys Ala Tyr Ser Val Glu Lys Leu Arg
Ser Ala Val Thr 165 170
175Val Val Gly Gly His Ser Gly Glu Thr Ile Val Pro Ile Phe Thr Asp
180 185 190Gln Lys Phe Tyr Arg Arg
Leu Arg Asp Arg Glu Leu Tyr Asp Ala Tyr 195 200
205Val His Arg Val Gln Phe Gly Gly Asp Glu Val Val Lys Ala
Lys Asp 210 215 220Gly Ser Gly Ser Ala
Thr Leu Ser Met Ala Trp Ala Gly Tyr Ser Phe225 230
235 240Val Lys Gln Leu Leu Asn Ser Leu His Leu
Glu Thr Gly Glu Asp Val 245 250
255His Pro Ile Pro Thr Phe Val Tyr Leu Pro Gly Leu Pro Gly Gly Lys
260 265 270Glu Leu Gln Gln Lys
Leu Gly Thr Ser Val Glu Phe Phe Ala Ala Pro 275
280 285Val Lys Leu Ser Lys Gly Ile Val Val Glu Val Glu
His Asp Trp Val 290 295 300Asp Lys Leu
Asn Asp Ala Glu Lys Lys Leu Ile Ala Lys Cys Leu Pro305
310 315 320Ile Leu Asp Lys Asn Ile Lys
Lys Gly Leu Ala Phe Ser Gln Gln 325 330
335104374PRTKluyveromyces marxianus 104Met Pro Ala Val Ser
Tyr Asp Val Gln Gln Arg Asp Ile Leu Lys Ile1 5
10 15Ala Val Leu Gly Ala Ala Gly Gly Ile Gly Gln
Ser Leu Ser Leu Leu 20 25
30Leu Lys Ser Asn Ala Ser Phe Leu Leu Pro Arg Asp Ser Ser Arg His
35 40 45Ile Ser Leu Ala Leu Tyr Asp Val
Asn Lys Asp Ala Ile Val Gly Thr 50 55
60Ala Ala Asp Leu Ser His Ile Asp Thr Pro Ile Thr Thr Thr Pro His65
70 75 80Tyr Pro Asn Asp Gly
Asn Gly Gly Ile Ala Arg Cys Leu Gln Asp Ala 85
90 95Asp Met Val Ile Ile Pro Ala Gly Val Pro Arg
Lys Pro Gly Met Ser 100 105
110Arg Asp Asp Leu Ile Gly Val Asn Ala Lys Ile Ile Lys Ser Leu Gly
115 120 125Asn Asp Ile Ala Glu Tyr Cys
Asp Leu Ser Lys Val His Val Leu Val 130 135
140Ile Ser Asn Pro Val Asn Ser Leu Val Pro Leu Met Val Ser Thr
Leu145 150 155 160Ala Asn
Ser Pro His Ser Ala Asn Thr Asn Ile Glu Ala Arg Val Tyr
165 170 175Gly Ile Thr His Leu Asp Leu
Val Arg Ala Ser Thr Phe Val Gln Gln 180 185
190Leu Asn Ser Phe Lys Ser Asn Asn Ala Pro Asp Ile Pro Val
Ile Gly 195 200 205Gly His Ser Gly
Asp Thr Ile Ile Pro Val Phe Ser Val Leu Asn His 210
215 220Arg Ala Ser Asn Ser Gly Tyr Ala Asn Leu Leu Asp
Asn Gly Val Arg225 230 235
240Gln Lys Leu Val His Arg Val Gln Tyr Gly Gly Asp Glu Ile Val Gln
245 250 255Ala Lys Asn Gly Asn
Gly Ser Ala Thr Leu Ser Met Ala Tyr Ala Gly 260
265 270Phe Lys Ile Ala Ala Gln Phe Ile Asp Leu Leu Val
Gly Asn Ile Arg 275 280 285Thr Ile
Glu Asn Ile Cys Met Tyr Val Pro Leu Thr Asn Arg Tyr Asn 290
295 300Thr Glu Ile Ala Pro Gly Ser Asp Glu Leu Arg
Ser Lys Tyr Ile Asn305 310 315
320Gly Thr Leu Tyr Phe Ser Ile Pro Leu Ser Ile Gly Ile Asn Gly Ile
325 330 335Glu Arg Val His
Tyr Glu Ile Met Glu His Leu Asp Ser Tyr Glu Arg 340
345 350Glu Thr Leu Leu Pro Ile Cys Leu Glu Thr Leu
Lys Gly Asn Ile Asp 355 360 365Lys
Gly Leu Ser Leu Val 370105312PRTEscherichia coli 105Met Lys Val Ala
Val Leu Gly Ala Ala Gly Gly Ile Gly Gln Ala Leu1 5
10 15Ala Leu Leu Leu Lys Thr Gln Leu Pro Ser
Gly Ser Glu Leu Ser Leu 20 25
30Tyr Asp Ile Ala Pro Val Thr Pro Gly Val Ala Val Asp Leu Ser His
35 40 45Ile Pro Thr Ala Val Lys Ile Lys
Gly Phe Ser Gly Glu Asp Ala Thr 50 55
60Pro Ala Leu Glu Gly Ala Asp Val Val Leu Ile Ser Ala Gly Val Ala65
70 75 80Arg Lys Pro Gly Met
Asp Arg Ser Asp Leu Phe Asn Val Asn Ala Gly 85
90 95Ile Val Lys Asn Leu Val Gln Gln Val Ala Lys
Thr Cys Pro Lys Ala 100 105
110Cys Ile Gly Ile Ile Thr Asn Pro Val Asn Thr Thr Val Ala Ile Ala
115 120 125Ala Glu Val Leu Lys Lys Ala
Gly Val Tyr Asp Lys Asn Lys Leu Phe 130 135
140Gly Val Thr Thr Leu Asp Ile Ile Arg Ser Asn Thr Phe Val Ala
Glu145 150 155 160Leu Lys
Gly Lys Gln Pro Gly Glu Val Glu Val Pro Val Ile Gly Gly
165 170 175His Ser Gly Val Thr Ile Leu
Pro Leu Leu Ser Gln Val Pro Gly Val 180 185
190Ser Phe Thr Glu Gln Glu Val Ala Asp Leu Thr Lys Arg Ile
Gln Asn 195 200 205Ala Gly Thr Glu
Val Val Glu Ala Lys Ala Gly Gly Gly Ser Ala Thr 210
215 220Leu Ser Met Gly Gln Ala Ala Ala Arg Phe Gly Leu
Ser Leu Val Arg225 230 235
240Ala Leu Gln Gly Glu Gln Gly Val Val Glu Cys Ala Tyr Val Glu Gly
245 250 255Asp Gly Gln Tyr Ala
Arg Phe Phe Ser Gln Pro Leu Leu Leu Gly Lys 260
265 270Asn Gly Val Glu Glu Arg Lys Ser Ile Gly Thr Leu
Ser Ala Phe Glu 275 280 285Gln Asn
Ala Leu Glu Gly Met Leu Asp Thr Leu Lys Lys Asp Ile Ala 290
295 300Leu Gly Glu Glu Phe Val Asn Lys305
310106337PRTRhizopus oryzae 106Met Phe Ala Ala Ser Arg Val Phe Ser
Ile Ala Ala Lys Arg Ser Phe1 5 10
15Ser Thr Ser Ala Ala Asn Leu Ser Lys Val Ala Val Leu Gly Ala
Ala 20 25 30Gly Gly Ile Gly
Gln Pro Leu Ser Leu Leu Leu Lys Glu Asn Pro His 35
40 45Val Thr His Leu Ser Leu Tyr Asp Ile Val Asn Thr
Pro Gly Val Ala 50 55 60Ala Asp Leu
Ser His Ile Asn Thr Asn Ser Lys Val Thr Gly His Thr65 70
75 80Pro Glu Asn Asp Gly Leu Lys Thr
Ala Leu Glu Gly Ala His Val Val 85 90
95Val Ile Pro Ala Gly Val Pro Arg Lys Pro Gly Met Thr Arg
Asp Asp 100 105 110Leu Phe Asn
Thr Asn Ala Ser Ile Val Arg Asp Leu Ala Glu Ala Ala 115
120 125Ala Lys His Cys Pro Asp Ala His Phe Leu Ile
Ile Ser Asn Pro Val 130 135 140Asn Ser
Thr Val Pro Ile Phe Ala Glu Thr Leu Lys Lys Ala Gly Val145
150 155 160Phe Asn Pro Lys Arg Leu Tyr
Gly Val Thr Thr Leu Asp Val Val Arg 165
170 175Ala Ser Arg Phe Val Ala Glu Val Lys Asn Leu Asp
Pro Asn Asp Val 180 185 190Lys
Val Thr Val Val Gly Gly His Ser Gly Val Thr Ile Val Pro Leu 195
200 205Leu Ser Gln Thr Gly Leu Glu Phe Ser
Lys Glu Glu Leu Asp Ala Leu 210 215
220Thr His Arg Ile Gln Phe Gly Gly Asp Glu Val Val Gln Ala Lys Asn225
230 235 240Gly Thr Gly Ser
Val Thr Leu Ser Met Ala Phe Ala Gly Ala Arg Phe 245
250 255Ala Asn Ser Val Leu Glu Ala Thr Val Gly
Gly Lys Lys Gly Val Val 260 265
270Glu Pro Ser Phe Val Lys Ser Asp Val Phe Ala Lys Asp Gly Val Glu
275 280 285Tyr Phe Ser Thr Asn Ile Glu
Leu Gly Pro Glu Gly Val Glu Lys Ile 290 295
300Asn Glu Leu Gly Gln Ile Ser Asp Tyr Glu Lys Glu Leu Ile Ala
Lys305 310 315 320Ala Val
Pro Glu Leu Lys Lys Asn Ile Ala Lys Gly Asn Ser Phe Val
325 330 335Gln107485PRTIssatchenkia
orientalis 107Met Leu Ala Ala Arg Ser Leu Lys Ala Arg Met Ser Thr Arg Ala
Phe1 5 10 15Ser Thr Thr
Ser Ile Ala Lys Arg Ile Glu Lys Asp Ala Phe Gly Asp 20
25 30Ile Glu Val Pro Asn Glu Lys Tyr Trp Gly
Ala Gln Thr Gln Arg Ser 35 40
45Leu Gln Asn Phe Lys Ile Gly Gly Lys Arg Glu Val Met Pro Glu Pro 50
55 60Ile Ile Lys Ser Phe Gly Ile Leu Lys
Lys Ala Thr Ala Lys Ile Asn65 70 75
80Ala Glu Ser Gly Ala Leu Asp Pro Lys Leu Ser Glu Ala Ile
Gln Gln 85 90 95Ala Ala
Thr Glu Val Tyr Glu Gly Lys Leu Met Asp His Phe Pro Leu 100
105 110Val Val Phe Gln Thr Gly Ser Gly Thr
Gln Ser Asn Met Asn Ala Asn 115 120
125Glu Val Ile Ser Asn Arg Ala Ile Glu Ile Leu Gly Gly Glu Leu Gly
130 135 140Ser Lys Thr Pro Val His Pro
Asn Asp His Val Asn Met Ser Gln Ser145 150
155 160Ser Asn Asp Thr Phe Pro Thr Val Met His Ile Ala
Ala Val Thr Glu 165 170
175Val Ser Ser His Leu Leu Pro Glu Leu Thr Ala Leu Arg Asp Ala Leu
180 185 190Gln Lys Lys Ser Asp Glu
Phe Lys Asn Ile Ile Lys Ile Gly Arg Thr 195 200
205His Leu Gln Asp Ala Thr Pro Leu Thr Leu Gly Gln Glu Phe
Ser Gly 210 215 220Tyr Val Gln Gln Cys
Thr Asn Gly Ile Lys Arg Ile Glu Ile Ala Leu225 230
235 240Glu His Leu Arg Tyr Leu Ala Gln Gly Gly
Thr Ala Val Gly Thr Gly 245 250
255Leu Asn Thr Lys Lys Gly Phe Ala Glu Lys Val Ala Asn Glu Val Thr
260 265 270Lys Leu Thr Gly Leu
Gln Phe Tyr Thr Ala Pro Asn Lys Phe Glu Ala 275
280 285Leu Ala Ala His Asp Ala Val Val Glu Met Ser Gly
Ala Leu Asn Thr 290 295 300Val Ala Val
Ser Leu Phe Lys Ile Ala Gln Asp Ile Arg Tyr Leu Gly305
310 315 320Ser Gly Pro Arg Cys Gly Tyr
Gly Glu Leu Ala Leu Pro Glu Asn Glu 325
330 335Pro Gly Ser Ser Ile Met Pro Gly Lys Val Asn Pro
Thr Gln Asn Glu 340 345 350Ala
Leu Thr Met Leu Cys Thr Gln Val Phe Gly Asn His Ser Cys Ile 355
360 365Thr Phe Ala Gly Ala Ser Gly Gln Phe
Glu Leu Asn Val Phe Lys Pro 370 375
380Val Met Ile Ser Asn Leu Leu Ser Ser Ile Arg Leu Leu Gly Asp Gly385
390 395 400Cys Asn Ser Phe
Arg Ile His Cys Val Glu Gly Ile Ile Ala Asn Thr 405
410 415Asp Lys Ile Asp Lys Leu Leu His Glu Ser
Leu Met Leu Val Thr Ala 420 425
430Leu Asn Pro His Ile Gly Tyr Asp Lys Ala Ser Lys Ile Ala Lys Asn
435 440 445Ala His Lys Lys Gly Leu Thr
Leu Lys Gln Ser Ala Leu Glu Leu Gly 450 455
460Tyr Leu Thr Glu Glu Gln Phe Asn Glu Trp Val Arg Pro Glu Asn
Met465 470 475 480Ile Gly
Pro Lys Asp 485108470PRTSaccharomyces cerevisiae 108Met
Ser Leu Ser Pro Val Val Val Ile Gly Thr Gly Leu Ala Gly Leu1
5 10 15Ala Ala Ala Asn Glu Leu Val
Asn Lys Tyr Asn Ile Pro Val Thr Ile 20 25
30Leu Glu Lys Ala Ser Ser Ile Gly Gly Asn Ser Ile Lys Ala
Ser Ser 35 40 45Gly Ile Asn Gly
Ala Cys Thr Glu Thr Gln Arg His Phe His Ile Glu 50 55
60Asp Ser Pro Arg Leu Phe Glu Asp Asp Thr Ile Lys Ser
Ala Lys Gly65 70 75
80Lys Gly Val Gln Glu Leu Met Ala Lys Leu Ala Asn Asp Ser Pro Leu
85 90 95Ala Ile Glu Trp Leu Lys
Asn Glu Phe Asp Leu Lys Leu Asp Leu Leu 100
105 110Ala Gln Leu Gly Gly His Ser Val Ala Arg Thr His
Arg Ser Ser Gly 115 120 125Lys Leu
Pro Pro Gly Phe Glu Ile Val Ser Ala Leu Ser Asn Asn Leu 130
135 140Lys Lys Leu Ala Glu Thr Lys Pro Glu Leu Val
Lys Ile Asn Leu Asp145 150 155
160Ser Lys Val Val Asp Ile His Glu Lys Asp Gly Ser Ile Ser Ala Val
165 170 175Val Tyr Glu Asp
Lys Asn Gly Glu Lys His Met Val Ser Ala Asn Asp 180
185 190Val Val Phe Cys Ser Gly Gly Phe Gly Phe Ser
Lys Glu Met Leu Lys 195 200 205Glu
Tyr Ala Pro Glu Leu Val Asn Leu Pro Thr Thr Asn Gly Gln Gln 210
215 220Thr Thr Gly Asp Gly Gln Arg Leu Leu Gln
Lys Leu Gly Ala Asp Leu225 230 235
240Ile Asp Met Asp Gln Ile Gln Val His Pro Thr Gly Phe Ile Asp
Pro 245 250 255Asn Asp Arg
Ser Ser Ser Trp Lys Phe Leu Ala Ala Glu Ser Leu Arg 260
265 270Gly Leu Gly Gly Ile Leu Leu Asn Pro Ile
Thr Gly Arg Arg Phe Val 275 280
285Asn Glu Leu Thr Thr Arg Asp Val Val Thr Ala Ala Ile Gln Lys Val 290
295 300Cys Pro Gln Glu Asp Asn Arg Ala
Leu Leu Val Met Gly Glu Lys Met305 310
315 320Tyr Thr Asp Leu Lys Asn Asn Leu Asp Phe Tyr Met
Phe Lys Lys Leu 325 330
335Val Gln Lys Leu Thr Leu Ser Gln Val Val Ser Glu Tyr Asn Leu Pro
340 345 350Ile Thr Val Ala Gln Leu
Cys Glu Glu Leu Gln Thr Tyr Ser Ser Phe 355 360
365Thr Thr Lys Ala Asp Pro Leu Gly Arg Thr Val Ile Leu Asn
Glu Phe 370 375 380Gly Ser Asp Val Thr
Pro Glu Thr Val Val Phe Ile Gly Glu Val Thr385 390
395 400Pro Val Val His Phe Thr Met Gly Gly Ala
Arg Ile Asn Val Lys Ala 405 410
415Gln Val Ile Gly Lys Asn Asp Glu Arg Leu Leu Lys Gly Leu Tyr Ala
420 425 430Ala Gly Glu Val Ser
Gly Gly Val His Gly Ala Asn Arg Leu Gly Gly 435
440 445Ser Ser Leu Leu Glu Cys Val Val Phe Gly Arg Thr
Ala Ala Glu Ser 450 455 460Ile Ala Asn
Asp Arg Lys465 470109470PRTSaccharomyces mikatae 109Met
Ser Ser Ser Pro Val Val Val Ile Gly Thr Gly Leu Ala Gly Leu1
5 10 15Ala Thr Ala Asn Glu Leu Val
Asn Lys Tyr Asn Ile Pro Val Thr Ile 20 25
30Leu Glu Lys Ala Ser Ser Ile Gly Gly Asn Ser Ile Lys Ala
Ser Ser 35 40 45Gly Ile Asn Gly
Ala Cys Thr Glu Thr Gln Arg His Phe His Ile Glu 50 55
60Asp Thr Pro Arg Leu Phe Glu Asp Asp Thr Val Gln Ser
Ala Lys Gly65 70 75
80Lys Gly Val Gln Glu Leu Met Gly Lys Leu Ala Asn Asp Ser Pro Leu
85 90 95Ala Ile Glu Trp Leu Lys
Thr Glu Phe Asp Leu Lys Leu Asp Leu Leu 100
105 110Ala Gln Leu Gly Gly His Ser Val Ala Arg Thr His
Arg Ser Ser Gly 115 120 125Lys Leu
Pro Pro Gly Phe Glu Ile Val Ser Ala Leu Ser Asn Asn Leu 130
135 140Lys Lys Leu Ala Glu Thr Lys Pro Glu Leu Val
Lys Ile Asn Leu Asp145 150 155
160Ser Lys Val Val Asp Ile His Lys Lys Asp Gly Ser Ile Ser Ala Ile
165 170 175Val Tyr Asp Asp
Lys Asn Gly Glu Arg His Thr Leu Ser Thr Ser Asn 180
185 190Val Val Phe Cys Ser Gly Gly Phe Gly Phe Ser
Lys Glu Met Leu Asn 195 200 205Glu
Tyr Ala Pro Gln Leu Val Asn Leu Pro Thr Thr Asn Gly Gln Gln 210
215 220Thr Thr Gly Asp Gly Gln Arg Leu Leu Gln
Lys Leu Gly Ala Asp Leu225 230 235
240Ile Asp Met Asp Gln Ile Gln Val His Pro Thr Gly Phe Ile Asp
Pro 245 250 255Asn Asp Arg
Asn Ser Ser Trp Lys Phe Leu Ala Ala Glu Ser Leu Arg 260
265 270Gly Leu Gly Gly Ile Leu Leu Asn Pro Ile
Thr Gly Arg Arg Phe Val 275 280
285Asn Glu Leu Thr Thr Arg Asp Val Val Thr Glu Ala Ile Gln Lys His 290
295 300Cys Pro Gln Asp Asp Asn Arg Ala
Leu Leu Val Met Ser Glu Lys Met305 310
315 320Tyr Thr Asp Leu Lys Asn Asn Leu Asp Phe Tyr Met
Phe Lys Lys Leu 325 330
335Val Gln Lys Leu Ser Leu Ser Gln Val Val Ser Glu Tyr Lys Leu Pro
340 345 350Ile Thr Val Ser Gln Leu
Cys Gln Glu Leu Gln Thr Tyr Ser Ser Phe 355 360
365Thr Ser Lys Ala Asp Pro Leu Gly Arg Thr Val Val Leu Asn
Glu Phe 370 375 380Gly Ala Asp Ile Thr
Pro Glu Thr Met Val Phe Ile Gly Glu Val Thr385 390
395 400Pro Val Val His Phe Thr Met Gly Gly Ala
Arg Ile Asn Val Lys Ala 405 410
415Gln Val Ile Gly Lys Asn Asp Glu Pro Leu Leu Asn Gly Leu Tyr Ala
420 425 430Ala Gly Glu Val Ser
Gly Gly Val His Gly Ala Asn Arg Leu Gly Gly 435
440 445Ser Ser Leu Leu Glu Cys Val Val Phe Gly Arg Thr
Ala Ala Glu Ser 450 455 460Ile Ala Asn
Asn His Lys465 470110470PRTKluyveromyces polysporus 110Met
Ser Thr Lys Lys Pro Val Val Ile Ile Gly Thr Gly Leu Ala Gly1
5 10 15Leu Ser Ala Gly Asn Gln Leu
Val Asn Met His Lys Val Pro Ile Ile 20 25
30Met Leu Asp Lys Ala Ser Ser Ile Gly Gly Asn Ser Thr Lys
Ala Ser 35 40 45Ser Gly Ile Asn
Gly Ala Ser Thr Ile Thr Gln Gln Gln Leu Asn Val 50 55
60Lys Asp Ser Pro Asp Leu Phe Leu Gln Asp Thr Val Lys
Ser Ala Lys65 70 75
80Gly Arg Gly Ile Glu Ser Leu Met Lys Lys Leu Ser Gln Asp Ser Asn
85 90 95Ser Ala Ile His Trp Leu
Gln Gln Asp Phe Asp Leu Lys Leu Asp Leu 100
105 110Leu Ala Gln Leu Gly Gly His Ser Val Pro Arg Thr
His Arg Ser Ser 115 120 125Gly Lys
Leu Pro Pro Gly Phe Glu Ile Val Gln Ala Leu Ser Asn Lys 130
135 140Leu Lys Ala Ile Ser Glu Ser Asp Pro Glu Phe
Val Arg Ile Leu Leu145 150 155
160Asn Ser Lys Val Val Asp Val Ser Val Asn Asn Glu Gly Lys Val Glu
165 170 175Ser Ile Asp Tyr
Val Asp Ala Glu Gly Lys His His Lys Ile Ala Thr 180
185 190Asp Asn Val Val Phe Cys Ser Gly Gly Phe Gly
His Ser Ala Glu Met 195 200 205Leu
Asn Lys Tyr Ala Pro Glu Leu Ala Asn Leu Pro Thr Thr Asn Gly 210
215 220Gln Gln Thr Thr Gly Asp Gly Gln Arg Ile
Leu Glu Lys Leu Gly Ala225 230 235
240Asp Leu Ile Asp Met Ser Gln Ile Gln Val His Pro Thr Gly Phe
Ile 245 250 255Asp Pro Ala
Asn Arg Asp Ser Lys Trp Lys Phe Leu Ala Ala Glu Ala 260
265 270Leu Arg Gly Leu Gly Gly Ile Leu Leu Asn
Pro Ser Thr Gly Lys Arg 275 280
285Phe Val Asn Glu Leu Thr Thr Arg Asp Leu Val Thr Glu Ala Ile Gln 290
295 300Ser Gln Cys Pro Arg Asp Asp Asn
Lys Ala Phe Leu Val Met Ser Glu305 310
315 320Lys Val Tyr Glu Asn Tyr Lys Asn Asn Met Asp Phe
Tyr Leu Phe Lys 325 330
335Lys Leu Val Ser Lys Met Thr Ile Lys Glu Phe Val Glu Thr Tyr Lys
340 345 350Leu Pro Ile Ser Ala Asp
Ala Val Thr Gln Asp Leu Ile Asp Tyr Ser 355 360
365Val Asp Lys Thr Asp Lys Phe Gly Arg Pro Leu Val Ile Asn
Val Phe 370 375 380Asp Glu Lys Leu Thr
Glu Asp Ser Glu Ile Tyr Val Gly Glu Val Thr385 390
395 400Pro Val Val His Phe Thr Met Gly Gly Ala
Lys Ile Asn Thr Glu Ser 405 410
415Gln Val Ile Asn Lys Asn Gly Gln Val Leu Ala Lys Gly Ile Tyr Ala
420 425 430Ala Gly Glu Val Ser
Gly Gly Val His Gly Ser Asn Arg Leu Gly Gly 435
440 445Ser Ser Leu Leu Glu Cys Val Val Tyr Gly Arg Ser
Ala Ala Asp Asn 450 455 460Ile Ala Lys
Asn Ile Glu465 470111487PRTKluyveromyces marxianus 111Met
Leu His Arg Tyr Ile Arg Leu Phe Ser Phe Cys Val Ile Leu Tyr1
5 10 15Leu Val Tyr Leu Leu Leu Thr
Lys Glu Ser Asn Val Met Ser Lys Pro 20 25
30Val Val Val Ile Gly Ser Gly Leu Ala Gly Leu Thr Thr Ser
Ser Gln 35 40 45Leu Ala Lys Phe
Asn Ile Pro Ile Val Leu Leu Glu Lys Thr Ser Ser 50 55
60Ile Gly Gly Asn Ser Ile Lys Ala Ser Ser Gly Ile Asn
Gly Ala Gly65 70 75
80Thr Glu Thr Gln Ser Arg Leu His Val Glu Asp His Pro Glu Leu Phe
85 90 95Ala Asp Asp Thr Ile Lys
Ser Ala Lys Gly Lys Gly Val Val Ala Leu 100
105 110Met Glu Lys Leu Ser Lys Asp Ser Ser Asp Ala Ile
Ser Trp Leu Gln 115 120 125Asn Asp
Phe Lys Ile Pro Leu Asp Lys Leu Ala Gln Leu Gly Gly His 130
135 140Ser Val Pro Arg Thr His Arg Ser Ser Gly Lys
Leu Pro Pro Gly Phe145 150 155
160Gln Ile Val Asp Thr Leu Lys Lys Ala Leu Glu Ser Tyr Asp Ser Lys
165 170 175Ala Val Lys Ile
Gln Leu Asn Ser Lys Val Val Asp Val Lys Leu Asp 180
185 190Ser Asn Asn Arg Val Ser Ser Val Val Phe Glu
Asp Gln Asp Gly Thr 195 200 205His
Thr Ile Glu Thr Asn Asn Val Val Phe Cys Thr Gly Gly Phe Gly 210
215 220Phe Asn Lys Lys Leu Leu Glu Lys Tyr Ala
Pro His Leu Val Asp Leu225 230 235
240Pro Thr Thr Asn Gly Glu Gln Thr Leu Gly Glu Gly Gln Val Leu
Leu 245 250 255Glu Lys Leu
Gly Ala Lys Leu Ile Asp Met Asp Gln Ile Gln Val His 260
265 270Pro Thr Gly Phe Ile Asp Pro Ala Asn Pro
Asp Ser Asn Trp Lys Phe 275 280
285Leu Ala Ala Glu Ala Leu Arg Gly Leu Gly Gly Val Leu Ile Asn Pro 290
295 300His Thr Gly Gln Arg Phe Val Asn
Glu Leu Thr Thr Arg Asp Met Val305 310
315 320Thr Glu Ala Ile Gln Ser Lys Ser Glu Ser Lys Thr
Ala Tyr Leu Val 325 330
335Met Ser Glu Ser Leu Tyr Glu Asn Tyr Lys Pro Asn Met Asp Phe Tyr
340 345 350Met Phe Lys Lys Leu Val
Ser Lys Lys Thr Val Ala Glu Phe Ala Glu 355 360
365Asp Leu Pro Val Ser Val Asp Gln Leu Ile Ala Glu Leu Ser
Thr Tyr 370 375 380Ser Asp Leu Ser Lys
Asp Asp His Leu Gly Arg Lys Phe Arg Glu Asn385 390
395 400Thr Phe Gly Ser Ser Leu Ser Ser Asp Ser
Thr Ile Phe Val Gly Lys 405 410
415Ile Thr Pro Val Val His Phe Thr Met Gly Gly Ala Lys Ile Asp Glu
420 425 430Gln Ala Arg Val Leu
Asn Ala Glu Gly Lys Pro Leu Ala Thr Gly Ile 435
440 445Tyr Ala Ala Gly Glu Val Ser Gly Gly Val His Gly
Ala Asn Arg Leu 450 455 460Gly Gly Ser
Ser Leu Leu Glu Cys Val Val Phe Gly Arg Gln Ala Ala465
470 475 480Lys Ser Ile Arg Ala Asn Leu
4851121139PRTTrypanosoma brucei 112Met Val Asp Gly Arg Ser
Ser Ala Ser Ile Val Ala Val Asp Pro Glu1 5
10 15Arg Ala Ala Arg Glu Arg Asp Ala Ala Ala Arg Ala
Leu Leu Gln Asp 20 25 30Ser
Pro Leu His Thr Thr Met Gln Tyr Ala Thr Ser Gly Leu Glu Leu 35
40 45Thr Val Pro Tyr Ala Leu Lys Val Val
Ala Ser Ala Asp Thr Phe Asp 50 55
60Arg Ala Lys Glu Val Ala Asp Glu Val Leu Arg Cys Ala Trp Gln Leu65
70 75 80Ala Asp Thr Val Leu
Asn Ser Phe Asn Pro Asn Ser Glu Val Ser Leu 85
90 95Val Gly Arg Leu Pro Val Gly Gln Lys His Gln
Met Ser Ala Pro Leu 100 105
110Lys Arg Val Met Ala Cys Cys Gln Arg Val Tyr Asn Ser Ser Ala Gly
115 120 125Cys Phe Asp Pro Ser Thr Ala
Pro Val Ala Lys Ala Leu Arg Glu Ile 130 135
140Ala Leu Gly Lys Glu Arg Asn Asn Ala Cys Leu Glu Ala Leu Thr
Gln145 150 155 160Ala Cys
Thr Leu Pro Asn Ser Phe Val Ile Asp Phe Glu Ala Gly Thr
165 170 175Ile Ser Arg Lys His Glu His
Ala Ser Leu Asp Leu Gly Gly Val Ser 180 185
190Lys Gly Tyr Ile Val Asp Tyr Val Ile Asp Asn Ile Asn Ala
Ala Gly 195 200 205Phe Gln Asn Val
Phe Phe Asp Trp Gly Gly Asp Cys Arg Ala Ser Gly 210
215 220Met Asn Ala Arg Asn Thr Pro Trp Val Val Gly Ile
Thr Arg Pro Pro225 230 235
240Ser Leu Asp Met Leu Pro Asn Pro Pro Lys Glu Ala Ser Tyr Ile Ser
245 250 255Val Ile Ser Leu Asp
Asn Glu Ala Leu Ala Thr Ser Gly Asp Tyr Glu 260
265 270Asn Leu Ile Tyr Thr Ala Asp Asp Lys Pro Leu Thr
Cys Thr Tyr Asp 275 280 285Trp Lys
Gly Lys Glu Leu Met Lys Pro Ser Gln Ser Asn Ile Ala Gln 290
295 300Val Ser Val Lys Cys Tyr Ser Ala Met Tyr Ala
Asp Ala Leu Ala Thr305 310 315
320Ala Cys Phe Ile Lys Arg Asp Pro Ala Lys Val Arg Gln Leu Leu Asp
325 330 335Gly Trp Arg Tyr
Val Arg Asp Thr Val Arg Asp Tyr Arg Val Tyr Val 340
345 350Arg Glu Asn Glu Arg Val Ala Lys Met Phe Glu
Ile Ala Thr Glu Asp 355 360 365Ala
Glu Met Arg Lys Arg Arg Ile Ser Asn Thr Leu Pro Ala Arg Val 370
375 380Ile Val Val Gly Gly Gly Leu Ala Gly Leu
Ser Ala Ala Ile Glu Ala385 390 395
400Ala Gly Cys Gly Ala Gln Val Val Leu Met Glu Lys Glu Ala Lys
Leu 405 410 415Gly Gly Asn
Ser Ala Lys Ala Thr Ser Gly Ile Asn Gly Trp Gly Thr 420
425 430Arg Ala Gln Ala Lys Ala Ser Ile Val Asp
Gly Gly Lys Tyr Phe Glu 435 440
445Arg Asp Thr Tyr Lys Ser Gly Ile Gly Gly Asn Thr Asp Pro Ala Leu 450
455 460Val Lys Thr Leu Ser Met Lys Ser
Ala Asp Ala Ile Gly Trp Leu Thr465 470
475 480Ser Leu Gly Val Pro Leu Thr Val Leu Ser Gln Leu
Gly Gly His Ser 485 490
495Arg Lys Arg Thr His Arg Ala Pro Asp Lys Lys Asp Gly Thr Pro Leu
500 505 510Pro Ile Gly Phe Thr Ile
Met Lys Thr Leu Glu Asp His Val Arg Gly 515 520
525Asn Leu Ser Gly Arg Ile Thr Ile Met Glu Asn Cys Ser Val
Thr Ser 530 535 540Leu Leu Ser Glu Thr
Lys Glu Arg Pro Asp Gly Thr Lys Gln Ile Arg545 550
555 560Val Thr Gly Val Glu Phe Thr Gln Ala Gly
Ser Gly Lys Thr Thr Ile 565 570
575Leu Ala Asp Ala Val Ile Leu Ala Thr Gly Gly Phe Ser Asn Asp Lys
580 585 590Thr Ala Asp Ser Leu
Leu Arg Glu His Ala Pro His Leu Val Asn Phe 595
600 605Pro Thr Thr Asn Gly Pro Trp Ala Thr Gly Asp Gly
Val Lys Leu Ala 610 615 620Gln Arg Leu
Gly Ala Gln Leu Val Asp Met Asp Lys Val Gln Leu His625
630 635 640Pro Thr Gly Leu Ile Asn Pro
Lys Asp Pro Ala Asn Pro Thr Lys Phe 645
650 655Leu Gly Pro Glu Ala Leu Arg Gly Ser Gly Gly Val
Leu Leu Asn Lys 660 665 670Gln
Gly Lys Arg Phe Val Asn Glu Leu Asp Leu Arg Ser Val Val Ser 675
680 685Lys Ala Ile Met Glu Gln Gly Ala Glu
Tyr Pro Gly Ser Gly Gly Ser 690 695
700Met Phe Ala Tyr Cys Val Leu Asn Ala Ala Ala Gln Lys Leu Phe Gly705
710 715 720Val Ser Ser His
Glu Phe Tyr Trp Lys Lys Met Gly Leu Phe Val Lys 725
730 735Ala Asp Thr Met Arg Asp Leu Ala Ala Leu
Ile Gly Cys Pro Val Glu 740 745
750Ser Val Gln Gln Thr Leu Glu Glu Tyr Glu Arg Leu Ser Ile Ser Gln
755 760 765Arg Ser Cys Pro Ile Thr Arg
Lys Ser Val Tyr Pro Cys Val Leu Gly 770 775
780Thr Lys Gly Pro Tyr Tyr Val Ala Phe Val Thr Pro Ser Ile His
Tyr785 790 795 800Thr Met
Gly Gly Cys Leu Ile Ser Pro Ser Ala Glu Ile Gln Met Lys
805 810 815Asn Thr Ser Ser Arg Ala Pro
Leu Ser His Ser Asn Pro Ile Leu Gly 820 825
830Leu Phe Gly Ala Gly Glu Val Thr Gly Gly Val His Gly Gly
Asn Arg 835 840 845Leu Gly Gly Asn
Ser Leu Leu Glu Cys Val Val Phe Gly Arg Ile Ala 850
855 860Gly Asp Arg Ala Ser Thr Ile Leu Gln Arg Lys Ser
Ser Ala Leu Ser865 870 875
880Phe Lys Val Trp Thr Thr Val Val Leu Arg Glu Val Arg Glu Gly Gly
885 890 895Val Tyr Gly Ala Gly
Ser Arg Val Leu Arg Phe Asn Leu Pro Gly Ala 900
905 910Leu Gln Arg Ser Gly Leu Ser Leu Gly Gln Phe Ile
Ala Ile Arg Gly 915 920 925Asp Trp
Asp Gly Gln Gln Leu Ile Gly Tyr Tyr Ser Pro Ile Thr Leu 930
935 940Pro Asp Asp Leu Gly Met Ile Asp Ile Leu Ala
Arg Ser Asp Lys Gly945 950 955
960Thr Leu Arg Glu Trp Ile Ser Ala Leu Glu Pro Gly Asp Ala Val Glu
965 970 975Met Lys Ala Cys
Gly Gly Leu Val Ile Glu Arg Arg Leu Ser Asp Lys 980
985 990His Phe Val Phe Met Gly His Ile Ile Asn Lys
Leu Cys Leu Ile Ala 995 1000
1005Gly Gly Thr Gly Val Ala Pro Met Leu Gln Ile Ile Lys Ala Ala
1010 1015 1020Phe Met Lys Pro Phe Ile
Asp Thr Leu Glu Ser Val His Leu Ile 1025 1030
1035Tyr Ala Ala Glu Asp Val Thr Glu Leu Thr Tyr Arg Glu Val
Leu 1040 1045 1050Glu Glu Arg Arg Arg
Glu Ser Arg Gly Lys Phe Lys Lys Thr Phe 1055 1060
1065Val Leu Asn Arg Pro Pro Pro Leu Trp Thr Asp Gly Val
Gly Phe 1070 1075 1080Ile Asp Arg Gly
Ile Leu Thr Asn His Val Gln Pro Pro Ser Asp 1085
1090 1095Asn Leu Leu Val Ala Ile Cys Gly Pro Pro Val
Met Gln Arg Ile 1100 1105 1110Val Lys
Ala Thr Leu Lys Thr Leu Gly Tyr Asn Met Asn Leu Val 1115
1120 1125Arg Thr Val Asp Glu Thr Glu Pro Ser Gly
Ser 1130 11351131139PRTTrypanosoma cruzi 113Met Ala
Asp Gly Arg Ser Ser Ala Ser Val Val Ala Val Asp Pro Glu1 5
10 15Lys Ala Ala Arg Glu Arg Asp Glu
Ala Ala Arg Ala Leu Leu Arg Asp 20 25
30Ser Pro Leu Gln Thr His Leu Gln Tyr Met Thr Asn Gly Leu Glu
Leu 35 40 45Thr Val Pro Phe Thr
Leu Lys Val Val Ala Glu Ala Val Ala Phe Ser 50 55
60Arg Ala Lys Glu Val Ala Asp Glu Val Leu Arg Ser Ala Trp
His Leu65 70 75 80Ala
Asp Thr Val Leu Asn Asn Phe Asn Pro Asn Ser Glu Ile Ser Met
85 90 95Ile Gly Arg Leu Pro Val Gly
Gln Lys His Thr Met Ser Ala Thr Leu 100 105
110Lys Ser Val Ile Thr Cys Cys Gln His Val Phe Asn Ser Ser
Arg Gly 115 120 125Val Phe Asp Pro
Ala Thr Gly Pro Ile Ile Glu Ala Leu Arg Ala Lys 130
135 140Val Ala Glu Lys Ala Ser Val Ser Asp Glu Gln Met
Glu Lys Leu Phe145 150 155
160Arg Val Cys Asn Phe Ser Ser Ser Phe Ile Val Asp Leu Glu Met Gly
165 170 175Thr Ile Ala Arg Lys
His Glu Asp Ala Arg Phe Asp Leu Gly Gly Val 180
185 190Ser Lys Gly Tyr Ile Val Asp Tyr Val Val Glu Arg
Leu Asn Ala Ala 195 200 205Gly Ile
Val Asp Val Tyr Phe Glu Trp Gly Gly Asp Cys Arg Ala Ser 210
215 220Gly Thr Asn Ala Arg Arg Thr Pro Trp Met Val
Gly Ile Ile Arg Pro225 230 235
240Pro Ser Leu Glu Gln Leu Arg Asn Pro Pro Lys Asp Pro Ser Tyr Ile
245 250 255Arg Val Leu Pro
Leu Asn Asp Glu Ala Leu Cys Thr Ser Gly Asp Tyr 260
265 270Glu Asn Leu Thr Glu Gly Ser Asn Lys Lys Leu
Tyr Thr Ser Ile Phe 275 280 285Asp
Trp Lys Lys Arg Ser Leu Leu Glu Pro Val Glu Ser Glu Leu Ala 290
295 300Gln Val Ser Ile Arg Cys Tyr Ser Ala Met
Tyr Ala Asp Ala Leu Ala305 310 315
320Thr Ala Ser Leu Ile Lys Arg Asp Ile Lys Lys Val Arg Gln Met
Leu 325 330 335Glu Asp Trp
Arg His Val Arg Asn Arg Val Thr Asn Tyr Val Thr Tyr 340
345 350Thr Arg Gln Gly Glu Arg Val Ala Arg Met
Phe Glu Ile Ala Thr Asp 355 360
365Asn Ala Glu Ile Arg Lys Lys Arg Ile Ala Gly Ser Leu Pro Ala Arg 370
375 380Val Ile Val Val Gly Gly Gly Leu
Ala Gly Leu Ser Ala Ala Ile Glu385 390
395 400Ala Thr Ala Cys Gly Ala Gln Val Ile Leu Leu Glu
Lys Glu Pro Lys 405 410
415Val Gly Gly Asn Ser Ala Lys Ala Thr Ser Gly Ile Asn Gly Trp Gly
420 425 430Thr Arg Ala Gln Ala Glu
Gln Asp Val Tyr Asp Ser Gly Lys Tyr Phe 435 440
445Glu Arg Asp Thr His Lys Ser Gly Leu Gly Gly Ser Thr Asp
Pro Gly 450 455 460Leu Val Arg Thr Leu
Ser Val Lys Ser Gly Asp Ala Ile Ser Trp Leu465 470
475 480Ser Ser Leu Gly Val Pro Leu Thr Val Leu
Ser Gln Leu Gly Gly His 485 490
495Ser Arg Lys Arg Thr His Arg Ala Pro Asp Lys Ala Asp Gly Thr Pro
500 505 510Val Pro Ile Gly Phe
Thr Ile Met Gln Thr Leu Glu Gln His Val Arg 515
520 525Thr Lys Leu Ala Asp Arg Val Thr Ile Met Glu Asn
Thr Thr Val Thr 530 535 540Ser Leu Leu
Ser Lys Ser Arg Val Arg His Asp Gly Ala Lys Gln Val545
550 555 560Arg Val Tyr Gly Val Glu Val
Leu Gln Asp Glu Gly Val Val Ser Arg 565
570 575Ile Leu Ala Asp Ala Val Ile Leu Ala Thr Gly Gly
Phe Ser Asn Asp 580 585 590Lys
Thr Pro Asn Ser Leu Leu Gln Glu Phe Ala Pro Gln Leu Ser Gly 595
600 605Phe Pro Thr Thr Asn Gly Pro Trp Ala
Thr Gly Asp Gly Val Lys Leu 610 615
620Ala Arg Glu Leu Gly Val Lys Leu Val Asp Met Asp Lys Val Gln Leu625
630 635 640His Pro Thr Gly
Leu Ile Asp Pro Lys Asp Pro Ala Asn Pro Thr Lys 645
650 655Tyr Leu Gly Pro Glu Ala Leu Arg Gly Ser
Gly Gly Val Leu Leu Asn 660 665
670Lys Lys Gly Glu Arg Phe Val Asn Glu Leu Asp Leu Arg Ser Val Val
675 680 685Ser Asn Ala Ile Ile Glu Gln
Gly Asp Glu Tyr Pro Asp Ala Gly Gly 690 695
700Ser Lys Phe Ala Phe Cys Val Leu Asn Asp Ala Ala Val Lys Leu
Phe705 710 715 720Gly Val
Asn Ser His Gly Phe Tyr Trp Lys Arg Leu Gly Leu Phe Val
725 730 735Lys Ala Asp Thr Val Glu Lys
Leu Ala Ala Leu Ile Gly Cys Pro Val 740 745
750Glu Asn Val Arg Asn Thr Leu Gly Asp Tyr Glu Gln Leu Ser
Lys Glu 755 760 765Asn Arg Gln Cys
Pro Lys Thr Arg Lys Val Val Tyr Pro Cys Val Val 770
775 780Gly Pro Gln Gly Pro Phe Tyr Val Ala Phe Val Thr
Pro Ser Ile His785 790 795
800Tyr Thr Met Gly Gly Cys Leu Ile Ser Pro Ser Ala Glu Met Gln Leu
805 810 815Glu Glu Asn Thr Thr
Ser Pro Phe Gly His Arg Arg Pro Ile Phe Gly 820
825 830Leu Phe Gly Ala Gly Glu Val Thr Gly Gly Val His
Gly Gly Asn Arg 835 840 845Leu Gly
Gly Asn Ser Leu Leu Glu Cys Val Val Phe Gly Arg Ile Ala 850
855 860Gly Asp Arg Ala Ala Thr Ile Leu Gln Lys Lys
Pro Val Pro Leu Ser865 870 875
880Phe Lys Thr Trp Thr Thr Val Ile Leu Arg Glu Val Arg Glu Gly Gly
885 890 895Met Tyr Gly Thr
Gly Ser Arg Val Leu Arg Phe Asn Leu Pro Gly Ala 900
905 910Leu Gln Arg Ser Gly Leu Gln Leu Gly Gln Phe
Ile Ala Ile Arg Gly 915 920 925Glu
Trp Asp Gly Gln Gln Leu Ile Gly Tyr Tyr Ser Pro Ile Thr Leu 930
935 940Pro Asp Asp Leu Gly Val Ile Gly Ile Leu
Ala Arg Ser Asp Lys Gly945 950 955
960Thr Leu Lys Glu Trp Ile Ser Ala Leu Glu Pro Gly Asp Ala Val
Glu 965 970 975Met Lys Gly
Cys Gly Gly Leu Val Ile Glu Arg Arg Phe Ser Glu Arg 980
985 990Tyr Leu Tyr Phe Ser Gly His Ala Leu Lys
Lys Leu Cys Leu Ile Ala 995 1000
1005Gly Gly Thr Gly Val Ala Pro Met Leu Gln Ile Ile Arg Ala Ala
1010 1015 1020Leu Lys Lys Pro Phe Leu
Glu Asn Ile Glu Ser Ile Arg Leu Ile 1025 1030
1035Tyr Ala Ala Glu Asp Val Ser Glu Leu Thr Tyr Arg Glu Leu
Leu 1040 1045 1050Glu His His Gln Arg
Asp Ser Lys Gly Lys Phe Arg Ser Ile Phe 1055 1060
1065Val Leu Asn Arg Pro Pro Pro Ile Trp Thr Asp Gly Val
Gly Phe 1070 1075 1080Ile Asp Lys Lys
Leu Leu Ser Ser Ser Val Gln Pro Pro Ala Lys 1085
1090 1095Asp Leu Leu Val Ala Ile Cys Gly Pro Pro Ile
Met Gln Arg Val 1100 1105 1110Val Lys
Thr Cys Leu Lys Ser Leu Gly Tyr Asp Met Gln Leu Val 1115
1120 1125Arg Thr Val Asp Glu Val Glu Thr Gln Asn
Ser 1130 11351141144PRTLeishmania braziliensis 114Met
Ala Asp Gly Lys Thr Ser Ala Ser Val Val Ala Val Asp Pro Glu1
5 10 15Arg Ala Ala Lys Glu Arg Asp
Ala Ala Ala Arg Ala Met Leu Gln Asp 20 25
30Gly Gly Val Ser Pro Val Gly Lys Ala Gln Leu Leu Lys Lys
Gly Leu 35 40 45Ala Tyr Ala Val
Pro Tyr Thr Leu Lys Ile Val Val Ala Asp Pro Lys 50 55
60Ala Met Glu Lys Thr Thr Ala Asp Val Glu Lys Val Leu
Gln Thr Ala65 70 75
80Phe Gln Val Val Asp Thr Leu Leu Asn Asn Phe Asn Glu Asn Ser Glu
85 90 95Val Ser Arg Ile Asn Arg
Met Pro Val Gly Glu Glu His Gln Met Ser 100
105 110Ala Ala Leu Lys Arg Val Met Gly Cys Cys Gln Arg
Val Tyr Asn Ser 115 120 125Ser Arg
Gly Ala Phe Asp Pro Ala Val Gly Pro Leu Val Arg Glu Leu 130
135 140Arg Glu Ala Ala Arg Glu Gly Arg Thr Leu Pro
Ala Glu Arg Ile Asn145 150 155
160Ala Leu Leu Ser Lys Cys Thr Leu Asn Ile Ser Phe Ser Ile Asp Leu
165 170 175Asn Arg Gly Thr
Ile Ala Arg Lys His Ala Asp Ala Met Leu Asp Leu 180
185 190Gly Gly Val Asn Lys Gly Tyr Gly Val Asp Tyr
Val Val Glu His Leu 195 200 205Asn
Asn Leu Gly Tyr Asp Asp Val Phe Phe Glu Trp Gly Gly Asp Val 210
215 220Arg Ala Ser Gly Lys Asn Pro Ser Asn Gln
His Trp Val Val Gly Ile225 230 235
240Ala Arg Pro Pro Ala Leu Ala Asp Ile Arg Thr Val Val Pro Gln
Asp 245 250 255Lys Gln Ser
Phe Ile Arg Val Val Cys Leu Asn Asp Glu Ala Ile Ala 260
265 270Thr Ser Gly Asp Tyr Glu Asn Leu Val Glu
Gly Pro Gly Ser Lys Val 275 280
285Tyr Ser Ser Thr Phe Asn Ala Thr Ser Lys Ser Leu Leu Glu Pro Thr 290
295 300Glu Thr Asn Ile Ala Gln Val Ser
Val Lys Cys Tyr Ser Cys Met Tyr305 310
315 320Ala Asp Ala Leu Ala Thr Ala Ala Leu Leu Lys Asn
Asn Pro Thr Ala 325 330
335Val Arg Arg Met Leu Asp Asn Trp Arg Tyr Val Arg Asp Thr Val Thr
340 345 350Asp Tyr Thr Thr Tyr Ser
Arg Glu Gly Glu Arg Val Ala Lys Met Phe 355 360
365Glu Ile Ala Thr Glu Asp Lys Glu Met Arg Ala Lys Arg Ile
Ser Gly 370 375 380Ser Leu Pro Ala Arg
Val Ile Ile Val Gly Gly Gly Leu Ala Gly Cys385 390
395 400Ser Ala Ala Ile Glu Ala Val Asn Cys Gly
Ala Gln Val Ile Leu Leu 405 410
415Glu Lys Glu Ala Lys Ile Gly Gly Asn Ser Ala Lys Ala Thr Ser Gly
420 425 430Ile Asn Ala Trp Gly
Thr Arg Ala Gln Ala Lys Gln Gly Val Met Asp 435
440 445Gly Gly Lys Phe Phe Glu Arg Asp Thr His Arg Ser
Gly Lys Gly Gly 450 455 460His Cys Asp
Pro Cys Leu Val Lys Thr Leu Ser Val Lys Ser Ser Asp465
470 475 480Ala Val Lys Trp Leu Ser Glu
Leu Gly Val Pro Leu Thr Val Leu Ser 485
490 495Gln Leu Gly Gly Ala Ser Arg Lys Arg Cys His Arg
Ala Pro Asp Lys 500 505 510Ser
Asp Gly Thr Pro Val Pro Ile Gly Phe Thr Ile Met Lys Thr Leu 515
520 525Glu Asn His Ile Ile Asn Asp Leu Ser
His Gln Val Thr Val Met Thr 530 535
540Gly Ile Lys Val Thr Gly Leu Glu Ser Thr Ser His Ala Arg Pro Asp545
550 555 560Gly Val Leu Val
Lys His Val Thr Gly Val Arg Leu Ile Gln Gly Asp 565
570 575Gly Gln Ser Arg Val Leu Asn Ala Asp Ala
Val Ile Leu Ala Thr Gly 580 585
590Gly Phe Ser Asn Asp His Thr Ala Asn Ser Leu Leu Gln Gln Tyr Ala
595 600 605Pro Gln Leu Ser Ser Phe Pro
Thr Thr Asn Gly Val Trp Ala Thr Gly 610 615
620Asp Gly Val Lys Ala Ala Arg Glu Leu Gly Val Glu Leu Val Asp
Met625 630 635 640Asp Lys
Val Gln Leu His Pro Thr Gly Leu Leu Asp Pro Lys Asp Pro
645 650 655Ser Asn Arg Thr Lys Tyr Leu
Gly Pro Glu Ala Leu Arg Gly Ser Gly 660 665
670Gly Val Leu Leu Asn Lys Asn Gly Glu Arg Phe Val Asn Glu
Leu Asp 675 680 685Leu Arg Ser Val
Val Ser Gln Ala Ile Ile Glu Gln Asn Asn Val Tyr 690
695 700Pro Gly Ser Gly Gly Ser Lys Phe Ala Tyr Cys Val
Leu Asn Glu Ala705 710 715
720Ala Ala Lys Leu Phe Gly Lys Asn Phe Leu Gly Phe Tyr Trp His Arg
725 730 735Leu Gly Leu Phe Glu
Lys Val Glu Asp Val Ala Gly Leu Ala Lys Leu 740
745 750Ile Gly Cys Pro Glu Glu Asn Val Thr Ala Thr Leu
Lys Glu Tyr Lys 755 760 765Glu Leu
Ser Ser Lys Lys Leu His Ala Cys Pro Leu Thr Asn Lys Asn 770
775 780Val Phe Pro Cys Thr Leu Gly Thr Glu Gly Pro
Tyr Tyr Val Ala Phe785 790 795
800Val Thr Pro Ser Ile His Tyr Thr Met Gly Gly Cys Leu Ile Ser Pro
805 810 815Ser Ala Glu Met
Gln Thr Ile Asp Asn Thr Gly Val Thr Pro Val Arg 820
825 830Arg Pro Ile Leu Gly Leu Phe Gly Ala Gly Glu
Val Thr Gly Gly Val 835 840 845His
Gly Gly Asn Arg Leu Gly Gly Asn Ser Leu Leu Glu Cys Val Val 850
855 860Phe Gly Arg Ile Ala Gly Asp Arg Ala Ala
Thr Ile Leu Gln Lys Lys865 870 875
880Asn Ala Gly Leu Ser Met Thr Glu Trp Ser Thr Val Val Leu Arg
Glu 885 890 895Val Arg Glu
Gly Gly Val Tyr Gly Thr Gly Ser Arg Val Leu Arg Phe 900
905 910Asn Met Pro Gly Ala Leu Gln Lys Thr Gly
Leu Ala Leu Gly Gln Phe 915 920
925Ile Ala Met Arg Gly Asp Trp Asp Gly Gln Gln Leu Leu Gly Tyr Tyr 930
935 940Ser Pro Ile Thr Leu Pro Asp Asp
Ile Gly Val Ile Gly Ile Leu Ala945 950
955 960Arg Ala Asp Lys Gly Arg Leu Ala Glu Trp Ile Ser
Ala Leu Gln Pro 965 970
975Gly Asp Ala Val Glu Met Lys Ala Cys Gly Gly Leu Ile Ile His Arg
980 985 990Arg Phe Ala Ala Arg His
Leu Phe Phe Arg Ser His Lys Ile Arg Lys 995 1000
1005Leu Ala Leu Ile Gly Gly Gly Thr Gly Val Ala Pro
Met Leu Gln 1010 1015 1020Ile Val Arg
Ala Ala Val Lys Lys Pro Phe Val Asp Ser Ile Glu 1025
1030 1035Ser Ile Gln Phe Ile Tyr Ala Ala Glu Asp Val
Ser Glu Leu Thr 1040 1045 1050Tyr Arg
Thr Leu Leu Glu Ser Tyr Glu Lys Glu Tyr Gly Ser Gly 1055
1060 1065Lys Phe Lys Cys His Phe Val Leu Asn Asn
Pro Pro Ser Gln Trp 1070 1075 1080Thr
Glu Gly Val Gly Phe Val Asp Thr Ala Leu Leu Arg Ser Ala 1085
1090 1095Val Gln Ala Pro Ser Asn Asp Leu Leu
Val Ala Ile Cys Gly Pro 1100 1105
1110Pro Ile Met Gln Arg Ala Val Lys Ser Ala Leu Lys Gly Leu Gly
1115 1120 1125Tyr Asn Met Asn Leu Val
Arg Thr Val Asp Glu Pro Glu Pro Leu 1130 1135
Ser115880PRTMannheimia succiniciproducens 115Met Thr Glu Glu
Tyr Leu Met Met Arg Asn Asn Ile Asn Met Leu Gly1 5
10 15Arg Phe Leu Gly Glu Thr Ile Gln Glu Ala
Gln Gly Asp Asp Ile Leu 20 25
30Glu Leu Ile Glu Asn Ile Arg Val Leu Ser Arg Asn Ser Arg Ser Gly
35 40 45Asp Asp Lys Ala Arg Ala Ala Leu
Leu Asp Thr Leu Ser Thr Ile Ser 50 55
60Ala Asp Asn Ile Ile Pro Val Ala Arg Ala Phe Ser Gln Phe Leu Asn65
70 75 80Leu Thr Asn Val Ala
Glu Gln Tyr Gln Thr Met Ser Arg Ser His Glu 85
90 95Asp Lys Val Ser Ala Glu Arg Ser Thr Ala Ala
Leu Phe Ala Arg Leu 100 105
110Lys Glu Gln His Val Ser Gln Glu Glu Ile Ile Lys Thr Val Gln Lys
115 120 125Leu Leu Ile Glu Ile Val Leu
Thr Ala His Pro Thr Glu Val Thr Arg 130 135
140Arg Ser Leu Met His Lys Gln Val Glu Ile Asn Lys Cys Leu Ala
Gln145 150 155 160Leu Asp
His Thr Asp Leu Thr Ala Glu Glu Gln Lys Asn Ile Glu Tyr
165 170 175Lys Leu Leu Arg Leu Ile Ala
Glu Ala Trp His Thr Asn Glu Ile Arg 180 185
190Thr Asn Arg Pro Thr Pro Leu Glu Glu Ala Lys Trp Gly Phe
Ala Val 195 200 205Ile Glu Asn Ser
Leu Trp Glu Gly Leu Pro Ala Phe Ile Arg Lys Leu 210
215 220Asn Asp Ala Ala Val Glu His Leu Asn Tyr Ala Leu
Pro Val Asp Leu225 230 235
240Thr Pro Val Arg Phe Ser Ser Trp Met Gly Gly Asp Arg Asp Gly Asn
245 250 255Pro Phe Val Thr Ala
Lys Ile Thr Arg Glu Ala Leu Gln Leu Ala Arg 260
265 270Trp Lys Ala Ala Asp Leu Phe Leu Thr Asp Ile Gln
Glu Leu Cys Asp 275 280 285Glu Leu
Ser Met Thr Gln Cys Thr Ala Glu Phe Arg Glu Lys Tyr Gly 290
295 300Asp His Leu Glu Pro Tyr Arg Val Val Val Lys
Asp Leu Arg Ser Lys305 310 315
320Leu Lys Asn Thr Leu Asp Tyr Tyr Asn Asp Ile Leu Ala Gly Arg Ile
325 330 335Pro Pro Phe Lys
Gln Asp Glu Ile Ile Ser Glu Asp Gln Gln Leu Trp 340
345 350Gln Pro Leu Tyr Asp Cys Tyr Gln Ser Leu Thr
Ala Cys Gly Met Arg 355 360 365Ile
Ile Ala Asn Gly Leu Leu Leu Asp Thr Leu Arg Arg Val Arg Cys 370
375 380Phe Gly Val Thr Leu Leu Arg Leu Asp Ile
Arg Gln Glu Ser Thr Arg385 390 395
400His Ser Asp Ala Ile Gly Glu Ile Thr Arg Tyr Ile Gly Leu Gly
Asp 405 410 415Tyr Ser Gln
Trp Thr Glu Asp Asp Lys Gln Ala Phe Leu Ile Arg Glu 420
425 430Leu Ser Ser Arg Arg Pro Leu Ile Pro His
Asn Trp Thr Pro Ser Glu 435 440
445His Thr Arg Glu Ile Leu Asp Thr Cys Lys Val Ile Ala Lys Gln Pro 450
455 460Glu Gly Val Ile Ser Cys Tyr Ile
Ile Ser Met Ala Arg Thr Ala Ser465 470
475 480Asp Val Leu Ala Val His Leu Leu Leu Lys Glu Ala
Gly Ile Ser Tyr 485 490
495His Leu Pro Val Val Pro Leu Phe Glu Thr Leu Asp Asp Leu Asp Ala
500 505 510Ser Lys Glu Val Met Thr
Gln Leu Phe Asn Val Gly Trp Tyr Arg Gly 515 520
525Val Ile Lys Asn Arg Gln Met Ile Met Ile Gly Tyr Ser Asp
Ser Ala 530 535 540Lys Asp Ala Gly Met
Met Ala Ala Ser Trp Ala Gln Tyr Arg Ala Gln545 550
555 560Asp Ala Leu Val Lys Leu Cys Glu Gln Thr
Gly Ile Glu Leu Thr Leu 565 570
575Phe His Gly Arg Gly Gly Thr Val Gly Arg Gly Gly Ala Pro Ala His
580 585 590Ala Ala Leu Leu Ser
Gln Pro Pro Arg Ser Leu Lys Asn Gly Leu Arg 595
600 605Val Thr Glu Gln Gly Glu Met Ile Arg Phe Lys Leu
Gly Leu Pro Ala 610 615 620Ile Ala Ala
Glu Ser Leu Asp Leu Tyr Ala Ser Ala Ile Leu Glu Ala625
630 635 640Asn Leu Leu Pro Pro Pro Glu
Pro Lys Ala Ser Trp Cys Arg Val Met 645
650 655Asp Glu Leu Ala Val Ala Ser Cys Glu Ile Tyr Arg
Asn Val Val Arg 660 665 670Gly
Asp Lys Asp Phe Val Pro Tyr Phe Arg Ser Ala Thr Pro Glu Gln 675
680 685Glu Leu Ala Lys Leu Pro Leu Gly Ser
Arg Pro Ala Lys Arg Asn Pro 690 695
700Asn Gly Gly Val Glu Ser Leu Arg Ala Ile Pro Trp Ile Phe Ala Trp705
710 715 720Met Gln Asn Arg
Leu Met Leu Pro Ala Trp Leu Gly Ala Gly Ala Ser 725
730 735Ile Arg Gln Ala Met Glu Ser Gly Lys Ala
Ala Val Ile Glu Glu Met 740 745
750Cys Asn His Trp Pro Phe Phe Asn Thr Arg Ile Gly Met Leu Glu Met
755 760 765Val Phe Ser Lys Thr Asp Ser
Trp Leu Ser Glu Tyr Tyr Asp Gln Arg 770 775
780Leu Val Lys Lys Glu Leu Trp Tyr Leu Gly Glu Ser Leu Arg Lys
Gln785 790 795 800Leu Ser
Glu Asp Ile Ala Thr Val Leu Arg Leu Ser Gly Lys Gly Asp
805 810 815Gln Leu Met Ser Asp Leu Pro
Trp Val Ala Glu Ser Ile Ala Leu Arg 820 825
830Asn Val Tyr Thr Asp Pro Leu Asn Leu Leu Gln Val Glu Leu
Leu Arg 835 840 845Arg Leu Arg Ala
Asp Pro Glu His Pro Asn Pro Asp Ile Glu Gln Ala 850
855 860Leu Met Ile Thr Ile Thr Gly Ile Ala Ala Gly Met
Arg Asn Thr Gly865 870 875
8801164171DNAArtificial SequenceSynthetic - I. orientalis FUM gene
integration fragment 116aattctttga aggagcttgc caagaaacat aattttatga
tttttgaaga tagaaaattt 60gctgatattg gtaacactgt taaaaatcaa tataaatctg
gtgtcttccg tattgccgaa 120tgggctgaca tcactaatgc acatggtgta acgggtgcag
gtattgtttc tggcttgaag 180gaggcagccc aagaaacaac cagtgaacct agaggtttgc
taatgcttgc tgagttatca 240tcaaagggtt ctttagcata tggtgaatat acagaaaaaa
cagtagaaat tgctaaatct 300gataaagagt ttgtcattgg ttttattgcg caacacgata
tgggcggtag agaagaaggt 360tttgactgga tcattatgac tccaggggtt ggtttagatg
acaaaggtga tgcacttggt 420caacaatata gaactgttga tgaagttgta aagactggaa
cggatatcat aattgttggt 480agaggtttgt acggtcaagg aagagatcct atagagcaag
ctaaaagata ccaacaagct 540ggttggaatg cttatttaaa cagatttaaa tgattcttac
acaaagattt gatacatgta 600cactagttta aataagcatg aaaagaatta cacaagcaaa
aaaaaaaaaa taaatgaggt 660actttacgtt cacctacaac caaaaaaact agatagagta
aaatcttaag atttagaaaa 720agttgtttaa caaaggcttt agtatgtgaa tttttaatgt
agcaaagcga taactaataa 780acataaacaa aagtatggtt ttctttatca gtcaaatcat
tatcgattga ttgttccgcg 840tatctgcaga tagcctcatg aaatcagcca tttgcttttg
ttcaacgatc ttttgaaatt 900gttgttgttc ttggtagtta agttgatcca tcttggctta
tgttgtgtgt atgttgtagt 960tattcttagt atattcctgt cctgagttta gtgaaacata
atatcgcctt gaaatgaaaa 1020tgctgaaatt cgtcgacata caatttttca aacttttttt
ttttcttggt gcacggacat 1080gtttttaaag gaagtactct ataccagtta ttcttcacaa
atttaattgc tggagaatag 1140atcttcaacg cgtttcctcg acatttgctg caacggcaac
atcaatgtcc acgtttacac 1200acctacattt atatctatat ttatatttat atttatttat
ttatgctact tagcttctat 1260agttagttaa tgcactcacg atattcaaaa ttgacaccct
tcaactactc cctactattg 1320tctactactg tctactactc ctctttacta tagctgctcc
caataggctc caccaatagg 1380ctctgtcaat acattttgcg ccgccacctt tcaggttgtg
tcactcctga aggaccatat 1440tgggtaatcg tgcaatttct ggaagagagt ccgcgagaag
tgaggccccc actgtaaatc 1500ctcgaggggg catggagtat ggggcatgga ggatggagga
tggggggggg gggggggaaa 1560ataggtagcg aaaggacccg ctatcacccc acccggagaa
ctcgttgccg ggaagtcata 1620tttcgacact ccggggagtc tataaaaggc gggttttgtc
ttttgccagt tgatgttgct 1680gagaggactt gtttgccgtt tcttccgatt taacagtata
gaatcaacca ctgttaatta 1740tacacgttat actaacacaa caaaaacaaa aacaacgaca
acaacaacaa catctagata 1800aaatgttagc tgctagatca ttaaaggcaa gaatgtcaac
aagagctttc tcaactacct 1860caattgcaaa aagaatcgaa aaagatgcat ttggtgacat
tgaagtccca aatgagaaat 1920attggggtgc tcaaactcaa agatctttac aaaatttcaa
aattggtggt aagagagaag 1980ttatgccaga accaatcatc aaatcttttg gtattttaaa
gaaggctact gctaagatca 2040atgctgagtc tggtgcttta gacccaaagt tatctgaagc
catccaacaa gctgcaaccg 2100aagtttatga aggtaaacta atggaccatt tcccattagt
tgtctttcaa accggttctg 2160gtactcaatc taacatgaat gccaatgaag tcatctctaa
tagagcaatt gaaatcttgg 2220gtggtgaatt aggctctaaa actccagtcc atcctaatga
tcatgttaat atgtcccaat 2280cttctaatga tactttccct actgtcatgc atattgcagc
agttacagaa gtttcatccc 2340atttattacc agaattaact gcactaagag atgcattgca
aaagaaatcc gatgaattta 2400agaatattat caaaatcggt agaacccatt tacaagatgc
aactccttta actttaggtc 2460aagaattttc tggttatgtt caacaatgta ctaatggtat
caaaagaatc gaaattgctc 2520ttgaacattt gagatactta gctcaaggtg gtactgccgt
tggtactggt cttaacacca 2580agaaaggttt tgctgaaaag gttgcaaatg aagtcactaa
attgactggt ttacaattct 2640ataccgctcc aaataaattc gaagcccttg cagctcacga
tgctgttgtt gaaatgtctg 2700gtgctttgaa taccgttgca gtctcattat tcaaaatcgc
tcaagatatc agatatttgg 2760gttccggccc aagatgtggt tatggtgaat tggctttacc
agaaaatgaa ccaggttctt 2820ccatcatgcc gggtaaagtt aacccaactc aaaacgaagc
tttgactatg ctttgtaccc 2880aagtctttgg taaccactct tgtattacct ttgcaggtgc
ttcaggtcaa ttcgaattga 2940atgtctttaa gccagttatg atctccaact tgttatcttc
tattaggtta ttaggtgatg 3000gttgtaattc ttttagaatc cactgtgttg aaggtatcat
tgcaaatacc gacaagattg 3060ataaattact acatgaatct ctcatgttag ttactgcttt
gaacccacac attggttacg 3120ataaggcttc caagattgca aagaatgcac acaagaaggg
cttgacattg aaacaatctg 3180cattggaatt aggttacttg accgaagaac aattcaatga
atgggttaga ccagaaaaca 3240tgattggtcc aaaggattaa gttaattaac atctgaatgt
aaaatgaaca ttaaaatgaa 3300ttactaaact ttacgtctac tttacaatct ataaactttg
tttaatcata taacgaaata 3360cactaataca caatcctgta cgtatgtaat acttttatcc
atcaaggatt gagaaaaaaa 3420agtaatgatt ccctgggcca ttaaaactta gacccccaag
cttggatagg tcactctcta 3480ttttcgtttc tcccttccct gatagaaggg tgatatgtaa
ttaagaataa tatataattt 3540tataataaaa gcggccgcac caggggttta gtgaagtcac
caattaagat tgttggtttg 3600agtgagttgc caaagatcta tgaattgatg gagcaaggta
agattttagg cagatatgtt 3660gttgacactt cgaaatgatg ggctgacttg ggtgtactgg
tgtgacgttt ttatgtgtat 3720attgatatgc atgggggatg tatagtgatg aggagtagag
tatataacga aatgaaatga 3780aataatatga tatgataaga taagatgaga tcaatacgat
aatataagat gcgacatgag 3840gagttcaatg tagcatacta cacgatgctg cagtacaact
ctgatacgct agactatact 3900atacaaaact gtagtacact atacgttagt gtagtatcca
gaaacaacac tgctttatag 3960tacaatacaa ctctataata ctatagtata ctatgccaaa
ccacgtaata ccataatatg 4020ctccacgaca tggtacaatg tgctatactt catactatta
taccatatat actccgatat 4080attattgata tactatttta tactataata ccataccaca
caacactaca ttacaacgag 4140caaccttacc ataaatgtca gttatggccg c
4171117466PRTEscherichia coli 117Met Pro His Ser
Tyr Asp Tyr Asp Ala Ile Val Ile Gly Ser Gly Pro1 5
10 15Gly Gly Glu Gly Ala Ala Met Gly Leu Val
Lys Gln Gly Ala Arg Val 20 25
30Ala Val Ile Glu Arg Tyr Gln Asn Val Gly Gly Gly Cys Thr His Trp
35 40 45Gly Thr Ile Pro Ser Lys Ala Leu
Arg His Ala Val Ser Arg Ile Ile 50 55
60Glu Phe Asn Gln Asn Pro Leu Tyr Ser Asp His Ser Arg Leu Leu Arg65
70 75 80Ser Ser Phe Ala Asp
Ile Leu Asn His Ala Asp Asn Val Ile Asn Gln 85
90 95Gln Thr Arg Met Arg Gln Gly Phe Tyr Glu Arg
Asn His Cys Glu Ile 100 105
110Leu Gln Gly Asn Ala Arg Phe Val Asp Glu His Thr Leu Ala Leu Asp
115 120 125Cys Pro Asp Gly Ser Val Glu
Thr Leu Thr Ala Glu Lys Phe Val Ile 130 135
140Ala Cys Gly Ser Arg Pro Tyr His Pro Thr Asp Val Asp Phe Thr
His145 150 155 160Pro Arg
Ile Tyr Asp Ser Asp Ser Ile Leu Ser Met His His Glu Pro
165 170 175Arg His Val Leu Ile Tyr Gly
Ala Gly Val Ile Gly Cys Glu Tyr Ala 180 185
190Ser Ile Phe Arg Gly Met Asp Val Lys Val Asp Leu Ile Asn
Thr Arg 195 200 205Asp Arg Leu Leu
Ala Phe Leu Asp Gln Glu Met Ser Asp Ser Leu Ser 210
215 220Tyr His Phe Trp Asn Ser Gly Val Val Ile Arg His
Asn Glu Glu Tyr225 230 235
240Glu Lys Ile Glu Gly Cys Asp Asp Gly Val Ile Met His Leu Lys Ser
245 250 255Gly Lys Lys Leu Lys
Ala Asp Cys Leu Leu Tyr Ala Asn Gly Arg Thr 260
265 270Gly Asn Thr Asp Ser Leu Ala Leu Gln Asn Ile Gly
Leu Glu Thr Asp 275 280 285Ser Arg
Gly Gln Leu Lys Val Asn Ser Met Tyr Gln Thr Ala Gln Pro 290
295 300His Val Tyr Ala Val Gly Asp Val Ile Gly Tyr
Pro Ser Leu Ala Ser305 310 315
320Ala Ala Tyr Asp Gln Gly Arg Ile Ala Ala Gln Ala Leu Val Lys Gly
325 330 335Glu Ala Thr Ala
His Leu Ile Glu Asp Ile Pro Thr Gly Ile Tyr Thr 340
345 350Ile Pro Glu Ile Ser Ser Val Gly Lys Thr Glu
Gln Gln Leu Thr Ala 355 360 365Met
Lys Val Pro Tyr Glu Val Gly Arg Ala Gln Phe Lys His Leu Ala 370
375 380Arg Ala Gln Ile Val Gly Met Asn Val Gly
Thr Leu Lys Ile Leu Phe385 390 395
400His Arg Glu Thr Lys Glu Ile Leu Gly Ile His Cys Phe Gly Glu
Arg 405 410 415Ala Ala Glu
Ile Ile His Ile Gly Gln Ala Ile Met Glu Gln Lys Gly 420
425 430Gly Gly Asn Thr Ile Glu Tyr Phe Val Asn
Thr Thr Phe Asn Tyr Pro 435 440
445Thr Met Ala Glu Ala Tyr Arg Val Ala Ala Leu Asn Gly Leu Asn Arg 450
455 460Leu Phe465
SEQ ID NO: 118; Length: 466; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic - Codon optimized E. coli SthA enzyme
Met Pro His
Ser Tyr Asp Tyr Asp Ala Ile Val Ile Gly Ser Gly Pro1 5
10 15Gly Gly Glu Gly Ala Ala Met Gly Leu
Val Lys Gln Gly Ala Arg Val 20 25
30Ala Val Ile Glu Arg Tyr Gln Asn Val Gly Gly Gly Cys Thr His Trp
35 40 45Gly Thr Ile Pro Ser Lys Ala
Leu Arg His Ala Val Ser Arg Ile Ile 50 55
60Glu Phe Asn Gln Asn Pro Leu Tyr Ser Asp His Ser Arg Leu Leu Arg65
70 75 80Ser Ser Phe Ala
Asp Ile Leu Asn His Ala Asp Asn Val Ile Asn Gln 85
90 95Gln Thr Arg Met Arg Gln Gly Phe Tyr Glu
Arg Asn His Cys Glu Ile 100 105
110Leu Gln Gly Asn Ala Arg Phe Val Asp Glu His Thr Leu Ala Leu Asp
115 120 125Cys Pro Asp Gly Ser Val Glu
Thr Leu Thr Ala Glu Lys Phe Val Ile 130 135
140Ala Cys Gly Ser Arg Pro Tyr His Pro Thr Asp Val Asp Phe Thr
His145 150 155 160Pro Arg
Ile Tyr Asp Ser Asp Ser Ile Leu Ser Met His His Glu Pro
165 170 175Arg His Val Leu Ile Tyr Gly
Ala Gly Val Ile Gly Cys Glu Tyr Ala 180 185
190Ser Ile Phe Arg Gly Met Asp Val Lys Val Asp Leu Ile Asn
Thr Arg 195 200 205Asp Arg Leu Leu
Ala Phe Leu Asp Gln Glu Met Ser Asp Ser Leu Ser 210
215 220Tyr His Phe Trp Asn Ser Gly Val Val Ile Arg His
Asn Glu Glu Tyr225 230 235
240Glu Lys Ile Glu Gly Cys Asp Asp Gly Val Ile Met His Leu Lys Ser
245 250 255Gly Lys Lys Leu Lys
Ala Asp Cys Leu Leu Tyr Ala Asn Gly Arg Thr 260
265 270Gly Asn Thr Asp Ser Leu Ala Leu Gln Asn Ile Gly
Leu Glu Thr Asp 275 280 285Ser Arg
Gly Gln Leu Lys Val Asn Ser Met Tyr Gln Thr Ala Gln Pro 290
295 300His Val Tyr Ala Val Gly Asp Val Ile Gly Tyr
Pro Ser Leu Ala Ser305 310 315
320Ala Ala Tyr Asp Gln Gly Arg Ile Ala Ala Gln Ala Leu Val Lys Gly
325 330 335Glu Ala Thr Ala
His Leu Ile Glu Asp Ile Pro Thr Gly Ile Tyr Thr 340
345 350Ile Pro Glu Ile Ser Ser Val Gly Lys Thr Glu
Gln Gln Leu Thr Ala 355 360 365Met
Lys Val Pro Tyr Glu Val Gly Arg Ala Gln Phe Lys His Leu Ala 370
375 380Arg Ala Gln Ile Val Gly Met Asn Val Gly
Thr Leu Lys Ile Leu Phe385 390 395
400His Arg Glu Thr Lys Glu Ile Leu Gly Ile His Cys Phe Gly Glu
Arg 405 410 415Ala Ala Glu
Ile Ile His Ile Gly Gln Ala Ile Met Glu Gln Lys Gly 420
425 430Gly Gly Asn Thr Ile Glu Tyr Phe Val Asn
Thr Thr Phe Asn Tyr Pro 435 440
445Thr Met Ala Glu Ala Tyr Arg Val Ala Ala Leu Asn Gly Leu Asn Arg 450
455 460Leu Phe465
SEQ ID NO: 119; Length: 464; Type: PRT; Organism: Azotobacter vinelandii
Met Ala Val Tyr Asn Tyr Asp Val Val Val Ile Gly Thr Gly Pro
Ala1 5 10 15Gly Glu Gly
Ala Ala Met Asn Ala Val Lys Ala Gly Arg Lys Val Ala 20
25 30Val Val Asp Asp Arg Pro Gln Val Gly Gly
Asn Cys Thr His Leu Gly 35 40
45Thr Ile Pro Ser Lys Ala Leu Arg His Ser Val Arg Gln Ile Met Gln 50
55 60Tyr Asn Asn Asn Pro Leu Phe Arg Gln
Ile Gly Glu Pro Arg Trp Phe65 70 75
80Ser Phe Ala Asp Val Leu Lys Ser Ala Glu Gln Val Ile Ala
Lys Gln 85 90 95Val Ser
Ser Arg Thr Gly Tyr Tyr Ala Arg Asn Arg Ile Asp Thr Phe 100
105 110Phe Gly Thr Ala Ser Phe Cys Asp Glu
His Thr Ile Glu Val Val His 115 120
125Leu Asn Gly Met Val Glu Thr Leu Val Ala Lys Gln Phe Val Ile Ala
130 135 140Thr Gly Ser Arg Pro Tyr Arg
Pro Ala Asp Val Asp Phe Thr His Pro145 150
155 160Arg Ile Tyr Asp Ser Asp Thr Ile Leu Ser Leu Gly
His Thr Pro Arg 165 170
175Arg Leu Ile Ile Tyr Gly Ala Gly Val Ile Gly Cys Glu Tyr Ala Ser
180 185 190Ile Phe Ser Gly Leu Gly
Val Leu Val Asp Leu Ile Asp Asn Arg Asp 195 200
205Gln Leu Leu Ser Phe Leu Asp Asp Glu Ile Ser Asp Ser Leu
Ser Tyr 210 215 220His Leu Arg Asn Asn
Asn Val Leu Ile Arg His Asn Glu Glu Tyr Glu225 230
235 240Arg Val Glu Gly Leu Asp Asn Gly Val Ile
Leu His Leu Lys Ser Gly 245 250
255Lys Lys Ile Lys Ala Asp Ala Phe Leu Trp Ser Asn Gly Arg Thr Gly
260 265 270Asn Thr Asp Lys Leu
Gly Leu Glu Asn Ile Gly Leu Lys Ala Asn Gly 275
280 285Arg Gly Gln Ile Gln Val Asp Glu His Tyr Arg Thr
Glu Val Ser Asn 290 295 300Ile Tyr Ala
Ala Gly Asp Val Ile Gly Trp Pro Ser Leu Ala Ser Ala305
310 315 320Ala Tyr Asp Gln Gly Arg Ser
Ala Ala Gly Ser Ile Thr Glu Asn Asp 325
330 335Ser Trp Arg Phe Val Asp Asp Val Pro Thr Gly Ile
Tyr Thr Ile Pro 340 345 350Glu
Ile Ser Ser Val Gly Lys Thr Glu Arg Glu Leu Thr Gln Ala Lys 355
360 365Val Pro Tyr Glu Val Gly Lys Ala Phe
Phe Lys Gly Met Ala Arg Ala 370 375
380Gln Ile Ala Val Glu Lys Ala Gly Met Leu Lys Ile Leu Phe His Arg385
390 395 400Glu Thr Leu Glu
Ile Leu Gly Val His Cys Phe Gly Tyr Gln Ala Ser 405
410 415Glu Ile Val His Ile Gly Gln Ala Ile Met
Asn Gln Lys Gly Glu Ala 420 425
430Asn Thr Leu Lys Tyr Phe Ile Asn Thr Thr Phe Asn Tyr Pro Thr Met
435 440 445Ala Glu Ala Tyr Arg Val Ala
Ala Tyr Asp Gly Leu Asn Arg Leu Phe 450 455
460
SEQ ID NO: 120; Length: 20; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Primer
cacagaggtg cagtaacgag 20
SEQ ID NO: 121; Length: 506; Type: PRT; Organism: Issatchenkia orientalis
Met Gly Val Gln Phe Ile Glu Asn Thr
Ile Ile Val Val Phe Gly Ala1 5 10
15Ser Gly Asp Leu Ala Lys Lys Lys Thr Phe Pro Ala Leu Phe Gly
Leu 20 25 30Phe Arg Glu Gly
Gln Leu Ser Glu Thr Thr Lys Ile Ile Gly Phe Ala 35
40 45Arg Ser Lys Leu Ser Asn Asp Asp Leu Arg Asn Arg
Ile Lys Pro Tyr 50 55 60Leu Lys Leu
Asn Lys Arg Thr Asp Ala Glu Arg Gln Ser Leu Glu Lys65 70
75 80Phe Leu Gln Ile Leu Glu Tyr His
Gln Ser Asn Tyr Asp Asp Ser Glu 85 90
95Gly Phe Glu Lys Leu Glu Lys Leu Ile Asn Lys Tyr Asp Asp
Glu Ala 100 105 110Asn Val Lys
Glu Ser His Arg Leu Tyr Tyr Leu Ala Leu Pro Pro Ser 115
120 125Val Phe Thr Thr Val Ala Thr Met Leu Lys Lys
His Cys His Pro Gly 130 135 140Asp Ser
Gly Ile Ala Arg Leu Ile Val Glu Lys Pro Phe Gly His Asp145
150 155 160Leu Ser Ser Ser Arg Glu Leu
Gln Lys Ser Leu Ala Pro Leu Trp Asn 165
170 175Glu Asp Glu Leu Phe Arg Ile Asp His Tyr Leu Gly
Lys Glu Met Val 180 185 190Lys
Asn Leu Ile Pro Leu Arg Phe Ser Asn Thr Phe Leu Ser Ser Ser 195
200 205Trp Asn Asn Gln Phe Ile Asp Thr Ile
Gln Ile Thr Phe Lys Glu Asn 210 215
220Phe Gly Thr Glu Gly Arg Gly Gly Tyr Phe Asp Ser Ile Gly Ile Ile225
230 235 240Arg Asp Val Ile
Gln Asn His Leu Leu Gln Val Leu Thr Ile Val Leu 245
250 255Met Glu Lys Pro Ala Asp Phe Asn Gly Glu
Ser Ile Arg Asp Glu Lys 260 265
270Val Lys Val Leu Lys Ala Ile Glu Gln Ile Asp Phe Asn Asn Val Leu
275 280 285Val Gly Gln Tyr Asp Lys Ser
Glu Asp Gly Ser Lys Pro Gly Tyr Leu 290 295
300Asp Asp Asp Thr Val Asn Pro Asp Ser Lys Ala Val Thr Tyr Ala
Ala305 310 315 320Leu Val
Leu Asn Val Ala Asn Glu Arg Trp Asn Asn Val Pro Ile Ile
325 330 335Leu Lys Ala Gly Lys Ala Leu
Asn Gln Ser Lys Val Glu Ile Arg Ile 340 345
350Gln Phe Lys Pro Val Glu Asn Gly Ile Phe Lys Asn Ser Ala
Arg Asn 355 360 365Glu Leu Val Ile
Arg Ile Gln Pro Asn Glu Ala Met Tyr Leu Lys Met 370
375 380Asn Ile Lys Val Pro Gly Val Ser Asn Gln Val Ser
Ile Ser Glu Met385 390 395
400Asp Leu Thr Tyr Lys Asn Arg Tyr Ser Ser Glu Phe Tyr Ile Pro Glu
405 410 415Ala Tyr Glu Ser Leu
Ile Lys Asp Ala Leu Met Asp Asp His Ser Asn 420
425 430Phe Val Arg Asp Asp Glu Leu Asp Ile Ser Trp Ala
Leu Phe Thr Pro 435 440 445Leu Leu
Glu His Ile Glu Gly Pro Asp Gly Pro Thr Pro Thr Lys Tyr 450
455 460Pro Tyr Gly Ser Arg Gly Pro Lys Glu Ile Asp
Glu Phe Leu Arg Asn465 470 475
480His Gly Tyr Val Lys Glu Pro Arg Glu Asn Tyr Gln Trp Pro Leu Thr
485 490 495Thr Pro Lys Glu
Leu Asn Ser Ser Lys Phe 500
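505
SEQ ID NO: 122; Length: 27; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Primer
gctggagaat agatcttcaa cgccccg 27
SEQ ID NO: 123; Length: 27; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Primer
catcactgtt aaaggaatgg gtaaatc 27
SEQ ID NO: 124; Length: 41; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Primer
gtaagctggc aaacctgcag gttagccggt attacgcata c 41
SEQ ID NO: 125; Length: 4321; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - I. orientalis FUM integration fragment
ttgaaggagc ttgccaagaa acataatttt atgatttttg aagatagaaa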
atttgctgat 60attggtaaca ctgttaaaaa tcaatataaa tctggtgtct tccgtattgc
cgaatgggct 120gacatcacta atgcacatgg tgtaacgggt gcaggtattg tttctggctt
gaaggaggca 180gcccaagaaa caaccagtga acctagaggt ttgctaatgc ttgctgagtt
atcatcaaag 240ggttctttag catatggtga atatacagaa aaaacagtag aaattgctaa
atctgataaa 300gagtttgtca ttggttttat tgcgcaacac gatatgggcg gtagagaaga
aggttttgac 360tggatcatta tgactccagg ggttggttta gatgacaaag gtgatgcact
tggtcaacaa 420tatagaactg ttgatgaagt tgtaaagact ggaacggata tcataattgt
tggtagaggt 480ttgtacggtc aaggaagaga tcctatagag caagctaaaa gataccaaca
agctggttgg 540aatgcttatt taaacagatt taaatgattc ttacacaaag atttgataca
tgtacactag 600tttaaataag catgaaaaga attacacaag caaaaaaaaa aaaataaatg
aggtacttta 660cgttcaccta caaccaaaaa aactagatag agtaaaatct taagatttag
aaaaagttgt 720ttaacaaagg ctttagtatg tgaattttta atgtagcaaa gcgataacta
ataaacataa 780acaaaagtat ggttttcttt atcagtcaaa tcattatcga ttgattgttc
cgcgtatctg 840cagatagcct catgaaatca gccatttgct tttgttcaac gatcttttga
aattgttgtt 900gttcttggta gttaagttga tccatcttgg cttatgttgt gtgtatgttg
tagttattct 960tagtatattc ctgtcctgag tttagtgaaa cataatatcg ccttgaaatg
aaaatgctga 1020aattcgtcga catacaattt ttcaaacttt ttttttttct tggtgcacgg
acatgttttt 1080aaaggaagta ctctatacca gttattcttc acaaatttaa ttgctggaga
atagatcttc 1140aacgcgtttc ctcgacattt gctgcaacgg caacatcaat gtccacgttt
acacacctac 1200atttatatct atatttatat ttatatttat ttatttatgc tacttagctt
ctatagttag 1260ttaatgcact cacgatattc aaaattgaca cccttcaact actccctact
attgtctact 1320actgtctact actcctcttt actatagctg ctcccaatag gctccaccaa
taggctctgt 1380caatacattt tgcgccgcca cctttcaggt tgtgtcactc ctgaaggacc
atattgggta 1440atcgtgcaat ttctggaaga gagtccgcga gaagtgaggc ccccactgta
aatcctcgag 1500ggggcatgga gtatggggca tggaggatgg aggatggggg gggggggggg
gaaaataggt 1560agcgaaagga cccgctatca ccccacccgg agaactcgtt gccgggaagt
catatttcga 1620cactccgggg agtctataaa aggcgggttt tgtcttttgc cagttgatgt
tgctgagagg 1680acttgtttgc cgtttcttcc gatttaacag tatagaatca accactgtta
attatacacg 1740ttatactaac acaacaaaaa caaaaacaac gacaacaaca acaacatcta
gataaaatgt 1800tagctgctag atcattaaag gcaagaatgt caacaagagc tttctcaact
acctcaattg 1860caaaaagaat cgaaaaagat gcatttggtg acattgaagt cccaaatgag
aaatattggg 1920gtgctcaaac tcaaagatct ttacaaaatt tcaaaattgg tggtaagaga
gaagttatgc 1980cagaaccaat catcaaatct tttggtattt taaagaaggc tactgctaag
atcaatgctg 2040agtctggtgc tttagaccca aagttatctg aagccatcca acaagctgca
accgaagttt 2100atgaaggtaa actaatggac catttcccat tagttgtctt tcaaaccggt
tctggtactc 2160aatctaacat gaatgccaat gaagtcatct ctaatagagc aattgaaatc
ttgggtggtg 2220aattaggctc taaaactcca gtccatccta atgatcatgt taatatgtcc
caatcttcta 2280atgatacttt ccctactgtc atgcatattg cagcagttac agaagtttca
tcccatttat 2340taccagaatt aactgcacta agagatgcat tgcaaaagaa atccgatgaa
tttaagaata 2400ttatcaaaat cggtagaacc catttacaag atgcaactcc tttaacttta
ggtcaagaat 2460tttctggtta tgttcaacaa tgtactaatg gtatcaaaag aatcgaaatt
gctcttgaac 2520atttgagata cttagctcaa ggtggtactg ccgttggtac tggtcttaac
accaagaaag 2580gttttgctga aaaggttgca aatgaagtca ctaaattgac tggtttacaa
ttctataccg 2640ctccaaataa attcgaagcc cttgcagctc acgatgctgt tgttgaaatg
tctggtgctt 2700tgaataccgt tgcagtctca ttattcaaaa tcgctcaaga tatcagatat
ttgggttccg 2760gcccaagatg tggttatggt gaattggctt taccagaaaa tgaaccaggt
tcttccatca 2820tgccgggtaa agttaaccca actcaaaacg aagctttgac tatgctttgt
acccaagtct 2880ttggtaacca ctcttgtatt acctttgcag gtgcttcagg tcaattcgaa
ttgaatgtct 2940ttaagccagt tatgatctcc aacttgttat cttctattag gttattaggt
gatggttgta 3000attcttttag aatccactgt gttgaaggta tcattgcaaa taccgacaag
attgataaat 3060tactacatga atctctcatg ttagttactg ctttgaaccc acacattggt
tacgataagg 3120cttccaagat tgcaaagaat gcacacaaga agggcttgac attgaaacaa
tctgcattgg 3180aattaggtta cttgaccgaa gaacaattca atgaatgggt tagaccagaa
aacatgattg 3240gtccaaagga ttaagttaat taacatctga atgtaaaatg aacattaaaa
tgaattacta 3300aactttacgt ctactttaca atctataaac tttgtttaat catataacga
aatacactaa 3360tacacaatcc tgtacgtatg taatactttt atccatcaag gattgagaaa
aaaaagtaat 3420gattccctgg gccattaaaa cttagacccc caagcttgga taggtcactc
tctattttcg 3480tttctccctt ccctgataga agggtgatat gtaattaaga ataatatata
attttataat 3540aaaagcggcc gcctcccttc tctaaatgga ctgcttggat aacttggacc
cccttcccat 3600tttatagtca ttctcttccc cctcattttc ccactattcc caacaatgac
catctctcca 3660ccttgtttcc ccattcttcc tgctctacct ggtgggggtg tttcacccca
ttaacggtcg 3720gattccgctg tggagatggc tctggccttt ttcccattcc ttccccccct
caatcttctc 3780catgcgggga aaaaaaaatt ttatccataa acaaccaaac cggcggctca
acggggggtt 3840tatactgaca gaaatggggt caatacaccc actgactgta cccgctctaa
tcttaagctt 3900tccccccccc ctcctgtatt aacggcgcgg agtgcccgca gcgcccaatg
gagaaggcgc 3960gcagtggggg atgcccaggg aggggacagg tacacgcaca ggccatgcca
acaccgcata 4020gacgtgcgac ctcctctccc ccactgcaga gctgcccttt tcggacacac
tccgtgcaag 4080aggactcggc cggctcggct tttctgccga attcggcagc cctgatattg
tcttacgtaa 4140tacgaatgga gggggtgtct cattttccct gcgatttccc aattgggtag
caatgtgcac 4200acactgaaac agtgcagagt attggtgatt tgctaatgta tggtaatgta
taatactttt 4260ttagagtgta gtgcattgca gacagtatag tatccacttc tgggaatccc
atcgaaacgg 4320c
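4321
SEQ ID NO: 126; Length: 34; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Primer
gatcgagctc caccttattt atgggagtta tttc 34
SEQ ID NO: 127; Length: 47; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Primer
ggataaaagt attacatacg tacaggattg tgtattagtg tatttcg 47
SEQ ID NO: 128; Length: 328; Type: PRT; Organism: Rhizopus delemar
Met Val Lys Val Thr Val Cys Gly Ala Ala Gly Gly Ile Gly Gln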
Pro1 5 10 15Leu Ser Leu
Leu Leu Lys Gln Ser Ser His Ile Thr His Leu Ser Leu 20
25 30Tyr Asp Ile Val Asn Thr Pro Gly Val Ala
Ala Asp Leu Ser His Ile 35 40
45Asp Thr Lys Ser Lys Val Thr Gly His Val Gly Ala Ala Gln Leu Glu 50
55 60Glu Ala Ile Lys Asp Ser Asp Val Val
Val Ile Pro Ala Gly Val Pro65 70 75
80Arg Lys Pro Gly Met Thr Arg Asp Asp Leu Phe Lys Ile Asn
Ala Gly 85 90 95Ile Val
Arg Asp Leu Ala Thr Ala Ala Ala Lys Tyr Ala Pro Lys Ala 100
105 110Phe Met Cys Ile Ile Ser Asn Pro Val
Asn Ser Thr Val Pro Ile Val 115 120
125Thr Glu Val Phe Lys Gln His Asn Val Tyr Asp Pro Lys Arg Ile Phe
130 135 140Gly Val Thr Thr Leu Asp Ile
Val Arg Ala Ser Thr Phe Val Ser Glu145 150
155 160Leu Ile Gly Gly Glu Pro Asn Ser Leu Arg Val Pro
Val Ile Gly Gly 165 170
175His Ser Gly Val Thr Ile Leu Pro Leu Leu Ser Gln Val Pro Gly Ile
180 185 190Glu Lys Leu Asn Gln Glu
Gln Ile Glu Lys Val Thr His Arg Ile Gln 195 200
205Phe Gly Gly Asp Glu Val Val Lys Ala Lys Asp Gly Ala Gly
Ser Ala 210 215 220Thr Leu Ser Met Ala
Tyr Ala Gly Ala Arg Phe Ala Thr Asn Ile Ile225 230
235 240Glu Ala Ala Phe Ala Gly Lys Lys Gly Ile
Val Glu Cys Thr Tyr Val 245 250
255Gln Leu Asp Ala Asp Lys Ser Gly Ala Gln Ser Val Lys Asp Leu Val
260 265 270Gly Ser Glu Leu Glu
Tyr Phe Ser Val Pro Val Glu Leu Gly Pro Ser 275
280 285Gly Val Glu Lys Ile Leu Pro Ile Gly Asn Val Asn
Glu Tyr Glu Lys 290 295 300Lys Leu Leu
Asn Glu Ala Ser Pro Glu Leu Lys Thr Asn Ile Asp Lys305
310 315 320Gly Cys Thr Phe Val Thr Glu
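Gly 325
SEQ ID NO: 129; Length: 19; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Primer
tggcccaggg aatcattac 19
SEQ ID NO: 130; Length: 28; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Primer
tcaccacctg tcagtgacga gccacttc 28
SEQ ID NO: 131; Length: 20; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Primer
ggacccaatg cctcccaatc 20
SEQ ID NO: 132; Length: 21; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - Primer
cgtttctccc ttccctgata g 21
SEQ ID NO: 133; Length: 6890; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic - E. coli SthA MEL integration fragment; Feature: misc_feature; Location: (3973)..(3973); Other information: n is a, c, g, or t
cggatttggt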
tggttcatag ctcttcttta gcgttgattg tagcctctgt tgcagaaaag 60accttgtttt
caagaaactg gttggtctga gtgttctgac accaatggtt atatttctag 120ttgcaaacat
gttggaataa tggtagtaat gttgatgctg gcagtgacag tagtgctagt 180tcttgttctt
gttctcgttc tcgttctcgt tctctttttt gtgctgtagc tgttactgta 240ttggctactc
tatataatat gcttgcaaag gaaaggaaat ctatgcaaac cactctctcc 300tgcacaaacg
ctagttcctt tgtcagggtt gaatgtagcc actctacgaa tgcgattcct 360cttcccctct
ctttcgcgtt gagcatattc aaaattgtga gaggtggcaa ggaaaaccat 420acgattttcg
gggtgcacgg atgcacagtg gacgtaagat ctgctctatt taagatacta 480aagaaagtgg
cagcgggaag accatcatgg aggacaacgt acatggacga ttccctgcta 540gaaacaatga
catgcaaacc cgtgcagtag gtagaagcag atttgctgag acagccatgc 600caagtggaat
tgttgtttaa ctctagtaga tattattgtt atagaaaaga tttatatata 660agatccatgg
aggggggagg gaggtaagag aaatacgaaa aagaatgtgt aatgaatctt 720aatgtagaca
agtggaaatg cagctaaagg gggtcaaagg gaatgtgata atgcaaggtt 780aggtttaaca
agaatgggtt ggcggactcg tcaatggaga gtacaatgcc aaagttctcc 840ctgaggttat
tgcggccgcg gatccctcga gattggtagt tctttccccc tctcaagctg 900gcgtgaaatg
caaccttacg gcgtctacgt tactacaagg tccagaaagt gtaggtattg 960ctactatttt
tattttttat tggttctgga gaaatgcaga cagtcaatga acacaactgt 1020ctcaatatgc
atctatgcac atgcacacac acacacatca caggtacccc tacaaagaga 1080ggtctcttga
taatgtttca ttaccacgtg gcatcccccc cccccccccc aataaacaag 1140tggccgagtt
cccctgttgc agaggaggac aaaaaaaccg ctggtgttgg taccattatg 1200cagcaactag
cacaacaaac aaccgaccca gacatacaaa tcaacaacac ttcgccaaag 1260acaccctttc
cagggaggat ccactcccaa cgtctctcca taatgtctct gttggcccat 1320gtctctgtcg
ttgacaccgt aaccacacca accaacccgt ccattgtact gggatggtcg 1380tccatagaca
cctctccaac ggggaacacc tcattcgtaa accgccaagg ttaccgttcc 1440tcctgactcg
ccccgttgtt gatgctgcgc acctgtggtt gcccaacatg gttgtatatc 1500gtgtaaccac
accaacacat gtgcagcaca tgtgtttaaa agagtgtcat ggaggtggat 1560catgatggaa
gtggacttta ccacttggga actgtctcca ctcccgggaa gaaaagaccc 1620ggcgtatcac
gcggttgcct caatggggca atttggaagg agaaatatag ggaaaatcac 1680gtcgctctcg
gacggggaag agttccagac tatgaggggg gggggtggta tataaagaca 1740ggagatgtcc
acccccagag agaggaagaa gttggaactt tagaagagag agataacttt 1800ccccagtgtc
catcaataca caaccaaaca caaactctat atttacacat ataaccccct 1860ctctagaatg
ccacattcct atgactacga tgccattgtc attggttccg gtccaggtgg 1920tgaaggtgct
gcaatgggct tagttaagca gggtgctaga gttgctgtca tcgaaagata 1980tcaaaatgtt
ggtggtggtt gtactcactg gggtacaatt ccatctaagg cattgagaca 2040tgcagtttcc
agaattattg agtttaacca aaacccttta tactctgatc attcaagatt 2100gttgagatca
tcttttgctg atattttgaa ccatgctgac aacgtcatca accaacaaac 2160tcgtatgcgt
caaggcttct atgagagaaa tcattgtgag attttacaag gtaacgctag 2220atttgtcgat
gagcatactc ttgcattaga ctgtccagac ggttccgttg agactcttac 2280cgctgaaaaa
ttcgttattg cttgtggttc cagaccatac cacccaaccg atgtcgattt 2340cactcaccct
cgtatctacg attccgattc tattttgtct atgcatcatg aaccaagaca 2400tgttttgatt
tatggtgctg gtgttatcgg ttgtgaatat gcttctattt tcagaggtat 2460ggatgttaag
gttgacttga ttaatacaag agacagatta ttagctttcc ttgatcagga 2520aatgtctgat
tccctttcct accatttttg gaactccggt gtcgtcatca gacacaacga 2580ggaatatgaa
aagattgaag gttgtgatga cggcgttatt atgcacctta agtctggtaa 2640aaagttaaaa
gcagattgct tgttatatgc aaatggtaga accggtaaca cagactcctt 2700ggctttacaa
aacattggtt tagaaaccga ttcaagaggt caattaaagg tcaattcaat 2760gtatcaaact
gcacaaccac acgtttacgc agttggtgac gttattggtt acccttcatt 2820ggcatctgcc
gcttacgatc aaggtagaat cgccgctcaa gcacttgtta agggtgaagc 2880aactgcacac
ttaatcgaag atatccctac cggtatctac actatcccag aaatctcttc 2940tgttggcaag
actgaacaac aattaaccgc aatgaaggtt ccatacgaag tcggtcgtgc 3000ccagttcaag
catttggcta gagcacaaat tgttggtatg aatgttggta ctttgaaaat 3060cttgtttcac
agagaaacaa aggaaatctt gggcattcac tgtttcggcg aaagagctgc 3120agagattatt
cacatcggtc aagccattat ggaacaaaaa ggcggtggta ataccattga 3180atatttcgtt
aataccacct tcaactaccc aacaatggcc gaagcatata gagtcgctgc 3240tttaaacggt
ttaaacagat tgttttaatt aacatctgaa tgtaaaatga acattaaaat 3300gaattactaa
actttacgtc tactttacaa tctataaact ttgtttaatc atataacgaa 3360atacactaat
acacaatcct gtacgtatgt aatactttta tccatcaagg attgagaaaa 3420aaaagtaatg
attccctggg ccattaaaac ttagaccccc aagcttggat aggtcactct 3480ctattttcgt
ttctcccttc cctgatagaa gggtgatatg taattaagaa taatatataa 3540ttttataata
aaagaattcg cccttacctg cagggataac ttcgtataat gtatgctata 3600cgaagttatg
ctgcaacggc aacatcaatg tccacgttta cacacctaca tttatatcta 3660tatttatatt
tatatttatt tatttatgct acttagcttc tatagttagt taatgcactc 3720acgatattca
aaattgacac ccttcaacta ctccctacta ttgtctacta ctgtctacta 3780ctcctcttta
ctatagctgc tcccaatagg ctccaccaat aggctctgtc aatacatttt 3840gcgccgccac
ctttcaggtt gtgtcactcc tgaaggacca tattgggtaa tcgtgcaatt 3900tctggaagag
agtgccgcga gaagtgaggc ccccactgta aatcctcgag ggggcatgga 3960gtatggggca
tgnaggatgg aggatggggg ggggggggga aaataggtag cgaaaggacc 4020cgctatcacc
ccacccggag aactcgttgc cgggaagtca tatttcgaca ctccggggag 4080tctataaaag
gcgggttttg tcttttgcca gttgatgttg ctgagaggac ttgtttgccg 4140tttcttccga
tttaacagta tagaatcaac cactgttaat tatacacgtt atactaacac 4200aacaaaaaca
aaaacaacga caacaacaac aacaatgttt gctttctact ttctcaccgc 4260atgcaccact
ttgaagggtg ttttcggagt ttctccgagt tacaatggtc ttggtctcac 4320cccacagatg
ggttgggaca gctggaatac gtttgcctgc gatgtcagtg aacagctact 4380tctagacact
gctgatagaa tttctgactt ggggctaaag gatatgggtt acaagtatgt 4440catcctagat
gactgttggt ctagcggcag ggattccgac ggtttcctcg ttgcagacaa 4500gcacaaattt
cccaacggta tgggccatgt tgcagaccac ctgcataata acagctttct 4560tttcggtatg
tattcgtctg ctggtgagta cacctgtgct gggtaccctg ggtctctggg 4620gcgtgaggaa
gaagatgctc aattctttgc aaataaccgc gttgactact tgaagtatga 4680taattgttac
aataaaggtc aatttggtac accagacgtt tcttaccacc gttacaaggc 4740catgtcagat
gctttgaata aaactggtag gcctattttc tattctctat gtaactgggg 4800tcaggatttg
acattttact ggggctctgg tatcgccaat tcttggagaa tgagcggaga 4860tattactgct
gagttcaccc gtccagatag cagatgtccc tgtgacggtg acgaatatga 4920ttgcaagtac
gccggtttcc attgttctat tatgaatatt cttaacaagg cagctccaat 4980ggggcaaaat
gcaggtgttg gtggttggaa cgatctggac aatctagagg tcggagtcgg 5040taatttgact
gacgatgagg aaaaggccca tttctctatg tgggcaatgg taaagtcccc 5100acttatcatt
ggtgccgacg tgaatcactt aaaggcatct tcgtactcga tctacagtca 5160agcctctgtc
atcgcaatta atcaagatcc aaagggtatt ccagccacaa gagtctggag 5220atattatgtt
tcagacaccg atgaatatgg acaaggtgaa attcaaatgt ggagtggtcc 5280gcttgacaat
ggtgaccaag tggttgcttt attgaatgga ggaagcgtag caagaccaat 5340gaacacgacc
ttggaagaga ttttctttga cagcaatttg ggttcaaagg aactgacatc 5400gacttgggat
atttacgact tatgggccaa cagagttgac aactctacgg cgtctgctat 5460ccttgaacag
aataaggcag ccaccggtat tctctacaat gctacagagc agtcttataa 5520agacggtttg
tctaagaatg atacaagact gtttggccag aaaattggta gtctttctcc 5580aaatgctata
cttaacacaa ctgttccagc tcatggtatc gccttctata ggttgagacc 5640ctcggcttaa
gctcaatgtt gagcaaagca ggacgagaaa aaaaaaaata atgattgtta 5700agaagttcat
gaaaaaaaaa aggaaaaata ctcaaatact tataacagag tgattaaata 5760ataaacggca
gtatacccta tcaggtattg agatagtttt atttttgtag gtatataatc 5820tgaagccttt
gaactatttt ctcgtatata tcatggagta tacattgcat tagcaacatt 5880gcatactagt
tcataacttc gtataatgta tgctatacga agttattaat taacaagggc 5940gaattccttg
atttatatac acctttgcca accgcttgtt acttgataag gaaaagatag 6000atttctaaag
tgcaggaaaa gaaacgccac tacgtcatga aacaaaagaa atgaaacact 6060ctgcaaaagg
gaaaaccaat gacgccttca aaacgtactg actttccgcc tccttttctg 6120cctttttttt
ttctccctca atttgccaat tcccctttcc gctaatttta catcaccttt 6180ttgtttgttt
cccttttcgg ccaagttttc catttctttt ttcggctgag cccttctttg 6240gcgtcgacgt
aatttctcgg catgtggcca atgtatattg acagtagatg aagtagacgt 6300tcttagtaac
tgttagggtg agattgccac ccccccttcc ttcttttact atctgtaata 6360ccatcaccat
agcaatagtt taaccatgtt ggagctggaa atacaacgtc tatagaggga 6420agtcatcata
ttacgccatt ttacggacca gggacaccct gtagtgtgtt tcctctcttg 6480tagaggtagg
ttttcaaatg gactctggcg tcgatttcca gcaagtcatt cccgtggttc 6540accatttcta
ctttttgcgc tacctctctt gacacagaaa tgaatgatga cgtgtaaatt 6600acccgtccga
gacctggact ccggagaaac tgtattaatt acgcgccaaa caagacaggt 6660gtcggataaa
cgtgcatgta cagactgcga gccgaaaacg gaagggggga aagaaaacag 6720tggagtccca
ttgttgttcc ggaaatggaa aacgggaact ggcggaaaag aaacgaaaca 6780aaacaaaaga
aaaagaggaa aaaaaagaaa aaaaaaagaa aaagacactg cacgtgattg 6840ctggtgtgtg
ctgcgtaacc gcggcacttt atttcgtaaa tgaaggggcc
6890
SEQ ID NO: 134; Length: 1656; Type: DNA; Organism: Saccharomyces cerevisiae
atgtttgctt tctactttct
caccgcatgc accactttga agggtgtttt cggagtttct 60ccgagttaca atggtcttgg
tctcacccca cagatgggtt gggacagctg gaatacgttt 120gcctgcgatg tcagtgaaca
gctacttcta gacactgctg atagaatttc tgacttgggg 180ctaaaggata tgggttacaa
gtatgtcatc ctagatgact gttggtctag cggcagggat 240tccgacggtt tcctcgttgc
agacaagcac aaatttccca acggtatggg ccatgttgca 300gaccacctgc ataataacag
ctttcttttc ggtatgtatt cgtctgctgg tgagtacacc 360tgtgctgggt accctgggtc
tctggggcgt gaggaagaag atgctcaatt ctttgcaaat 420aaccgcgttg actacttgaa
gtatgataat tgttacaata aaggtcaatt tggtacacca 480gacgtttctt accaccgtta
caaggccatg tcagatgctt tgaataaaac tggtaggcct 540attttctatt ctctatgtaa
ctggggtcag gatttgacat tttactgggg ctctggtatc 600gccaattctt ggagaatgag
cggagatatt actgctgagt tcacccgtcc agatagcaga 660tgtccctgtg acggtgacga
atatgattgc aagtacgccg gtttccattg ttctattatg 720aatattctta acaaggcagc
tccaatgggg caaaatgcag gtgttggtgg ttggaacgat 780ctggacaatc tagaggtcgg
agtcggtaat ttgactgacg atgaggaaaa ggcccatttc 840tctatgtggg caatggtaaa
gtccccactt atcattggtg ccgacgtgaa tcacttaaag 900gcatcttcgt actcgatcta
cagtcaagcc tctgtcatcg caattaatca agatccaaag 960ggtattccag ccacaagagt
ctggagatat tatgtttcag acaccgatga atatggacaa 1020ggtgaaattc aaatgtggag
tggtccgctt gacaatggtg accaagtggt tgctttattg 1080aatggaggaa gcgtagcaag
accaatgaac acgaccttgg aagagatttt ctttgacagc 1140aatttgggtt caaaggaact
gacatcgact tgggatattt acgacttatg ggccaacaga 1200gttgacaact ctacggcgtc
tgctatcctt gaacagaata aggcagccac cggtattctc 1260tacaatgcta cagagcagtc
ttataaagac ggtttgtcta agaatgatac aagactgttt 1320ggccagaaaa ttggtagtct
ttctccaaat gctatactta acacaactgt tccagctcat 1380ggtatcgcct tctataggtt
gagaccctcg gcttaagctc aatgttgagc aaagcaggac 1440gagaaaaaaa aaaataatga
ttgttaagaa gttcatgaaa aaaaaaagga aaaatactca 1500aatacttata acagagtgat
taaataataa acggcagtat accctatcag gtattgagat 1560agttttattt ttgtaggtat
ataatctgaa gcctttgaac tattttctcg tatatatcat 1620ggagtataca ttgcattagc
aacattgcat actagt 1656
SEQ ID NO: 135; Length: 6279; Type: DNA; Organism: Artificial Sequence; Other information: E. coli SthA URA integration fragment
ggccccttca tttacgaaat
aaagtgccgc ggttacgcag cacacaccag caatcacgtg 60cagtgtcttt ttcttttttt
tttctttttt ttcctctttt tcttttgttt tgtttcgttt 120cttttccgcc agttcccgtt
ttccatttcc ggaacaacaa tgggactcca ctgttttctt 180tccccccttc cgttttcggc
tcgcagtctg tacatgcacg tttatccgac acctgtcttg 240tttggcgcgt aattaataca
gtttctccgg agtccaggtc tcggacgggt aatttacacg 300tcatcattca tttctgtgtc
aagagaggta gcgcaaaaag tagaaatggt gaaccacggg 360aatgacttgc tggaaatcga
cgccagagtc catttgaaaa cctacctcta caagagagga 420aacacactac agggtgtccc
tggtccgtaa aatggcgtaa tatgatgact tccctctata 480gacgttgtat ttccagctcc
aacatggtta aactattgct atggtgatgg tattacagat 540agtaaaagaa ggaagggggg
gtggcaatct caccctaaca gttactaaga acgtctactt 600catctactgt caatatacat
tggccacatg ccgagaaatt acgtcgacgc caaagaaggg 660ctcagccgaa aaaagaaatg
gaaaacttgg ccgaaaaggg aaacaaacaa aaaggtgatg 720taaaattagc ggaaagggga
attggcaaat tgagggagaa aaaaaaaagg cagaaaagga 780ggcggaaagt cagtacgttt
tgaaggcgtc attggttttc ccttttgcag agtgtttcat 840ttcttttgtt tcatgacgta
gtggcgtttc ttttcctgca ctttagaaat ctatcttttc 900cttatcaagt aacaagcggt
tggcaaaggt gtatataaat caaggaattc ccactttgaa 960ccctttgaat tttgatatcg
tttattttaa atttatttgc ggccgcggat ccctcgagat 1020tggtagttct ttccccctct
caagctggcg tgaaatgcaa ccttacggcg tctacgttac 1080tacaaggtcc agaaagtgta
ggtattgcta ctatttttat tttttattgg ttctggagaa 1140atgcagacag tcaatgaaca
caactgtctc aatatgcatc tatgcacatg cacacacaca 1200cacatcacag gtacccctac
aaagagaggt ctcttgataa tgtttcatta ccacgtggca 1260tccccccccc cccccccaat
aaacaagtgg ccgagttccc ctgttgcaga ggaggacaaa 1320aaaaccgctg gtgttggtac
cattatgcag caactagcac aacaaacaac cgacccagac 1380atacaaatca acaacacttc
gccaaagaca ccctttccag ggaggatcca ctcccaacgt 1440ctctccataa tgtctctgtt
ggcccatgtc tctgtcgttg acaccgtaac cacaccaacc 1500aacccgtcca ttgtactggg
atggtcgtcc atagacacct ctccaacggg gaacacctca 1560ttcgtaaacc gccaaggtta
ccgttcctcc tgactcgccc cgttgttgat gctgcgcacc 1620tgtggttgcc caacatggtt
gtatatcgtg taaccacacc aacacatgtg cagcacatgt 1680gtttaaaaga gtgtcatgga
ggtggatcat gatggaagtg gactttacca cttgggaact 1740gtctccactc ccgggaagaa
aagacccggc gtatcacgcg gttgcctcaa tggggcaatt 1800tggaaggaga aatataggga
aaatcacgtc gctctcggac ggggaagagt tccagactat 1860gagggggggg ggtggtatat
aaagacagga gatgtccacc cccagagaga ggaagaagtt 1920ggaactttag aagagagaga
taactttccc cagtgtccat caatacacaa ccaaacacaa 1980actctatatt tacacatata
accccctctc tagaatgcca cattcctatg actacgatgc 2040cattgtcatt ggttccggtc
caggtggtga aggtgctgca atgggcttag ttaagcaggg 2100tgctagagtt gctgtcatcg
aaagatatca aaatgttggt ggtggttgta ctcactgggg 2160tacaattcca tctaaggcat
tgagacatgc agtttccaga attattgagt ttaaccaaaa 2220ccctttatac tctgatcatt
caagattgtt gagatcatct tttgctgata ttttgaacca 2280tgctgacaac gtcatcaacc
aacaaactcg tatgcgtcaa ggcttctatg agagaaatca 2340ttgtgagatt ttacaaggta
acgctagatt tgtcgatgag catactcttg cattagactg 2400tccagacggt tccgttgaga
ctcttaccgc tgaaaaattc gttattgctt gtggttccag 2460accataccac ccaaccgatg
tcgatttcac tcaccctcgt atctacgatt ccgattctat 2520tttgtctatg catcatgaac
caagacatgt tttgatttat ggtgctggtg ttatcggttg 2580tgaatatgct tctattttca
gaggtatgga tgttaaggtt gacttgatta atacaagaga 2640cagattatta gctttccttg
atcaggaaat gtctgattcc ctttcctacc atttttggaa 2700ctccggtgtc gtcatcagac
acaacgagga atatgaaaag attgaaggtt gtgatgacgg 2760cgttattatg caccttaagt
ctggtaaaaa gttaaaagca gattgcttgt tatatgcaaa 2820tggtagaacc ggtaacacag
actccttggc tttacaaaac attggtttag aaaccgattc 2880aagaggtcaa ttaaaggtca
attcaatgta tcaaactgca caaccacacg tttacgcagt 2940tggtgacgtt attggttacc
cttcattggc atctgccgct tacgatcaag gtagaatcgc 3000cgctcaagca cttgttaagg
gtgaagcaac tgcacactta atcgaagata tccctaccgg 3060tatctacact atcccagaaa
tctcttctgt tggcaagact gaacaacaat taaccgcaat 3120gaaggttcca tacgaagtcg
gtcgtgccca gttcaagcat ttggctagag cacaaattgt 3180tggtatgaat gttggtactt
tgaaaatctt gtttcacaga gaaacaaagg aaatcttggg 3240cattcactgt ttcggcgaaa
gagctgcaga gattattcac atcggtcaag ccattatgga 3300acaaaaaggc ggtggtaata
ccattgaata tttcgttaat accaccttca actacccaac 3360aatggccgaa gcatatagag
tcgctgcttt aaacggttta aacagattgt tttaattaac 3420atctgaatgt aaaatgaaca
ttaaaatgaa ttactaaact ttacgtctac tttacaatct 3480ataaactttg tttaatcata
taacgaaata cactaataca caatcctgta cgtatgtaat 3540acttttatcc atcaaggatt
gagaaaaaaa agtaatgatt ccctgggcca ttaaaactta 3600gacccccaag cttggatagg
tcactctcta ttttcgtttc tcccttccct gatagaaggg 3660tgatatgtaa ttaagaataa
tatataattt tataataaaa gaattcatag cctcatgaaa 3720tcagccattt gcttttgttc
aacgatcttt tgaaattgtt gttgttcttg gtagttaagt 3780tgatccatct tggcttatgt
tgtgtgtatg ttgtagttat tcttagtata ttcctgtcct 3840gagtttagtg aaacataata
tcgccttgaa atgaaaatgc tgaaattcgt cgacatacaa 3900tttttcaaac tttttttttt
tcttggtgca cggacatgtt tttaaaggaa gtactctata 3960ccagttattc ttcacaaatt
taattgctgg agaatagatc ttcaacgctt taataaagta 4020gtttgtttgt caaggatggc
gtcatacaaa gaaagatcag aatcacacac ttcccctgtt 4080gctaggagac ttttctccat
catggaggaa aagaagtcta acctttgtgc atcattggat 4140attactgaaa ctgaaaagct
tctctctatt ttggacacta ttggtcctta catctgtcta 4200gttaaaacac acatcgatat
tgtttctgat tttacgtatg aaggaactgt gttgcctttg 4260aaggagcttg ccaagaaaca
taattttatg atttttgaag atagaaaatt tgctgatatt 4320ggtaacactg ttaaaaatca
atataaatct ggtgtcttcc gtattgccga atgggctgac 4380atcactaatg cacatggtgt
aacgggtgca ggtattgttt ctggcttgaa ggaggcagcc 4440caagaaacaa ccagtgaacc
tagaggtttg ctaatgcttg ctgagttatc atcaaagggt 4500tctttagcat atggtgaata
tacagaaaaa acagtagaaa ttgctaaatc tgataaagag 4560tttgtcattg gttttattgc
gcaacacgat atgggcggta gagaagaagg ttttgactgg 4620atcattatga ctccaggggt
tggtttagat gacaaaggtg atgcacttgg tcaacaatat 4680agaactgttg atgaagttgt
aaagactgga acggatatca taattgttgg tagaggtttg 4740tacggtcaag gaagagatcc
tatagagcaa gctaaaagat accaacaagc tggttggaat 4800gcttatttaa acagatttaa
atgattctta cacaaagatt tgatacatgt acactagttt 4860aaataagcat gaaaagaatt
acacaagcaa aaaaaaaaaa ataaatgagg tactttacgt 4920tcacctacaa ccaaaaaaac
tagatagagt aaaatcttaa gatttagaaa aagttgttta 4980acaaaggctt tagtatgtga
atttttaatg tagcaaagcg ataactaata aacataaaca 5040aaagtatggt tttctttatc
agtcaaatca ttatcgattg attgttccgc gtatctgcag 5100atagcctcat gaaatcagcc
atttgctttt gttcaacgat cttttgaaat tgttgttgtt 5160cttggtagtt aagttgatcc
atcttggctt atgttgtgtg tatgttgtag ttattcttag 5220tatattcctg tcctgagttt
agtgaaacat aatatcgcct tgaaatgaaa atgctgaaat 5280tcgtcgacat acaatttttc
aaactttttt tttttcttgg tgcacggaca tgtttttaaa 5340ggaagtactc tataccagtt
attcttcaca aatttaattg ctggagaata gatcttcaac 5400gccccggggg atctggatcc
gcggccgcaa taacctcagg gagaactttg gcattgtact 5460ctccattgac gagtccgcca
acccattctt gttaaaccta accttgcatt atcacattcc 5520ctttgacccc ctttagctgc
atttccactt gtctacatta agattcatta cacattcttt 5580ttcgtatttc tcttacctcc
ctcccccctc catggatctt atatataaat cttttctata 5640acaataatat ctactagagt
taaacaacaa ttccacttgg catggctgtc tcagcaaatc 5700tgcttctacc tactgcacgg
gtttgcatgt cattgtttct agcagggaat cgtccatgta 5760cgttgtcctc catgatggtc
ttcccgctgc cactttcttt agtatcttaa atagagcaga 5820tcttacgtcc actgtgcatc
cgtgcacccc gaaaatcgta tggttttcct tgccacctct 5880cacaattttg aatatgctca
acgcgaaaga gaggggaaga ggaatcgcat tcgtagagtg 5940gctacattca accctgacaa
aggaactagc gtttgtgcag gagagagtgg tttgcataga 6000tttcctttcc tttgcaagca
tattatatag agtagccaat acagtaacag ctacagcaca 6060aaaaagagaa cgagaacgag
aacgagaaca agaacaagaa ctagcactac tgtcactgcc 6120agcatcaaca ttactaccat
tattccaaca tgtttgcaac tagaaatata accattggtg 6180tcagaacact cagaccaacc
agtttcttga aaacaaggtc ttttctgcaa cagaggctac 6240aatcaacgct aaagaagagc
tatgaaccaa ccaaatccg 6279
136 6888 DNA Artificial Sequence
Synthetic - A. vinelandii MEL integration fragment
misc_feature (2914)..(2914) n is a, c, g, or t
136 ccttcattta
cgaaataaag tgccgcggtt acgcagcaca caccagcaat cacgtgcagt 60gtctttttct
tttttttttc ttttttttcc tctttttctt ttgttttgtt tcgtttcttt 120tccgccagtt
cccgttttcc atttccggaa caacaatggg actccactgt tttctttccc 180cccttccgtt
ttcggctcgc agtctgtaca tgcacgttta tccgacacct gtcttgtttg 240gcgcgtaatt
aatacagttt ctccggagtc caggtctcgg acgggtaatt tacacgtcat 300cattcatttc
tgtgtcaaga gaggtagcgc aaaaagtaga aatggtgaac cacgggaatg 360acttgctgga
aatcgacgcc agagtccatt tgaaaaccta cctctacaag agaggaaaca 420cactacaggg
tgtccctggt ccgtaaaatg gcgtaatatg atgacttccc tctatagacg 480ttgtatttcc
agctccaaca tggttaaact attgctatgg tgatggtatt acagatagta 540aaagaaggaa
gggggggtgg caatctcacc ctaacagtta ctaagaacgt ctacttcatc 600tactgtcaat
atacattggc cacatgccga gaaattacgt cgacgccaaa gaagggctca 660gccgaaaaaa
gaaatggaaa acttggccga aaagggaaac aaacaaaaag gtgatgtaaa 720attagcggaa
aggggaattg gcaaattgag ggagaaaaaa aaaaggcaga aaaggaggcg 780gaaagtcagt
acgttttgaa ggcgtcattg gttttccctt ttgcagagtg tttcatttct 840tttgtttcat
gacgtagtgg cgtttctttt cctgcacttt agaaatctat cttttcctta 900tcaagtaaca
agcggttggc aaaggtgtat ataaatcaag gaattcgccc ttgttaatta 960ataacttcgt
atagcataca ttatacgaag ttatgaacta gtatgcaatg ttgctaatgc 1020aatgtatact
ccatgatata tacgagaaaa tagttcaaag gcttcagatt atatacctac 1080aaaaataaaa
ctatctcaat acctgatagg gtatactgcc gtttattatt taatcactct 1140gttataagta
tttgagtatt tttccttttt tttttcatga acttcttaac aatcattatt 1200tttttttttc
tcgtcctgct ttgctcaaca ttgagcttaa gccgagggtc tcaacctata 1260gaaggcgata
ccatgagctg gaacagttgt gttaagtata gcatttggag aaagactacc 1320aattttctgg
ccaaacagtc ttgtatcatt cttagacaaa ccgtctttat aagactgctc 1380tgtagcattg
tagagaatac cggtggctgc cttattctgt tcaaggatag cagacgccgt 1440agagttgtca
actctgttgg cccataagtc gtaaatatcc caagtcgatg tcagttcctt 1500tgaacccaaa
ttgctgtcaa agaaaatctc ttccaaggtc gtgttcattg gtcttgctac 1560gcttcctcca
ttcaataaag caaccacttg gtcaccattg tcaagcggac cactccacat 1620ttgaatttca
ccttgtccat attcatcggt gtctgaaaca taatatctcc agactcttgt 1680ggctggaata
ccctttggat cttgattaat tgcgatgaca gaggcttgac tgtagatcga 1740gtacgaagat
gcctttaagt gattcacgtc ggcaccaatg ataagtgggg actttaccat 1800tgcccacata
gagaaatggg ccttttcctc atcgtcagtc aaattaccga ctccgacctc 1860tagattgtcc
agatcgttcc aaccaccaac acctgcattt tgccccattg gagctgcctt 1920gttaagaata
ttcataatag aacaatggaa accggcgtac ttgcaatcat attcgtcacc 1980gtcacaggga
catctgctat ctggacgggt gaactcagca gtaatatctc cgctcattct 2040ccaagaattg
gcgataccag agccccagta aaatgtcaaa tcctgacccc agttacatag 2100agaatagaaa
ataggcctac cagttttatt caaagcatct gacatggcct tgtaacggtg 2160gtaagaaacg
tctggtgtac caaattgacc tttattgtaa caattatcat acttcaagta 2220gtcaacgcgg
ttatttgcaa agaattgagc atcttcttcc tcacgcccca gagacccagg 2280gtacccagca
caggtgtact caccagcaga cgaatacata ccgaaaagaa agctgttatt 2340atgcaggtgg
tctgcaacat ggcccatacc gttgggaaat ttgtgcttgt ctgcaacgag 2400gaaaccgtcg
gaatccctgc cgctagacca acagtcatct aggatgacat acttgtaacc 2460catatccttt
agccccaagt cagaaattct atcagcagtg tctagaagta gctgttcact 2520gacatcgcag
gcaaacgtat tccagctgtc ccaacccatc tgtggggtga gaccaagacc 2580attgtaactc
ggagaaactc cgaaaacacc cttcaaagtg gtgcatgcgg tgagaaagta 2640gaaagcaaac
attgttgttg ttgttgtcgt tgtttttgtt tttgttgtgt tagtataacg 2700tgtataatta
acagtggttg attctatact gttaaatcgg aagaaacggc aaacaagtcc 2760tctcagcaac
atcaactggc aaaagacaaa acccgccttt tatagactcc ccggagtgtc 2820gaaatatgac
ttcccggcaa cgagttctcc gggtggggtg atagcgggtc ctttcgctac 2880ctattttccc
cccccccccc catcctccat cctncatgcc ccatactcca tgccccctcg 2940aggatttaca
gtgggggcct cacttctcgc ggcactctct tccagaaatt gcacgattac 3000ccaatatggt
ccttcaggag tgacacaacc tgaaaggtgg cggcgcaaaa tgtattgaca 3060gagcctattg
gtggagccta ttgggagcag ctatagtaaa gaggagtagt agacagtagt 3120agacaatagt
agggagtagt tgaagggtgt caattttgaa tatcgtgagt gcattaacta 3180actatagaag
ctaagtagca taaataaata aatataaata taaatataga tataaatgta 3240ggtgtgtaaa
cgtggacatt gatgttgccg ttgcagcata acttcgtata gcatacatta 3300tacgaagtta
tccctgcagg taagggcgaa ttcttttatt ataaaattat atattattct 3360taattacata
tcacccttct atcagggaag ggagaaacga aaatagagag tgacctatcc 3420aagcttgggg
gtctaagttt taatggccca gggaatcatt actttttttt ctcaatcctt 3480gatggataaa
agtattacat acgtacagga ttgtgtatta gtgtatttcg ttatatgatt 3540aaacaaagtt
tatagattgt aaagtagacg taaagtttag taattcattt taatgttcat 3600tttacattca
gatgttaatt aattagaaca atctgttcaa accatcgtac gcagccactc 3660tatatgcttc
tgccattgta ggataattga atgtagtatt aatgaaatac ttcaaagtgt 3720tggcttcacc
cttttggttc ataattgctt gaccgatgtg cacaatttca gatgcttgat 3780aaccaaaaca
atggacgcct aagatttcga gtgtttctct gtgaaacaaa atctttaaca 3840tgcctgcctt
ttcaacagca atctgggctc ttgccatacc tttgaaaaag gccttaccaa 3900cctcgtatgg
aacttttgcc tgagtaagtt ctctttctgt cttaccaaca ctggaaattt 3960ctggaatggt
gtagataccg gtagggacat cgtcaacaaa tctccaggag tcgttttcag 4020tgatggagcc
agcagcggat ctaccttgat cgtaagcagc agatgccaaa gatggccaac 4080caataacatc
accagcagcg tagatgttgg aaacttcggt tctataatgc tcatcgactt 4140gaatttgacc
tctaccatta gcctttaaac caatgttttc aagacccaac ttatcagtgt 4200taccggttct
accgttggac caaaggaatg catctgcctt gatcttttta ccagacttca 4260aatgtaaaat
gacaccgttg tcgaggcctt caactctttc gtactcctcg ttatgtctga 4320ttaagacatt
gttattcctt aagtgataag acaatgagtc ggaaatttcg tcatcgagaa 4380aagatagtaa
ttgatcacga ttgtcaatca aatcgactaa aacacctagt cctgagaaaa 4440tacttgcgta
ttcgcaaccg ataacaccag ctccgtagat aatcaaacgt cttggagtat 4500gacccaaaga
taagatagta tcagaatcgt agatcctagg gtgagtaaag tccacatcgg 4560ctggtctata
aggtcttgaa ccggtagcaa taacaaattg ctttgcgacc aaggtttcga 4620ccataccatt
caaatgaacg acttcaatag tatgctcgtc acagaatgat gcggtaccga 4680aaaaggtatc
aattctgttt ctagcataat atccggttct ggaagagact tgcttcgcaa 4740taacttgctc
tgctgatttt agaacgtcag cgaaagagaa ccatcttggt tcaccaattt 4800gacggaataa
tgggttgttg ttgtactgca taatttgtct aacagagtgc cttaaagcct 4860tacttgggat
agtgcctaag tgcgtacaat taccgccaac ttgtggtcta tcatcaacaa 4920ctgcaacttt
tctaccagct ttgactgcat tcatagcggc accttcaccc gctggacctg 4980taccgataac
aacaacatca taattgtaaa cagccattct agagaggggg ttatatgtgt 5040aaatatagag
tttgtgtttg gttgtgtatt gatggacact ggggaaagtt atctctctct 5100tctaaagttc
caacttcttc ctctctctgg gggtggacat ctcctgtctt tatataccac 5160ccccccccct
catagtctgg aactcttccc cgtccgagag cgacgtgatt ttccctatat 5220ttctccttcc
aaattgcccc attgaggcaa ccgcgtgata cgccgggtct tttcttcccg 5280ggagtggaga
cagttcccaa gtggtaaagt ccacttccat catgatccac ctccatgaca 5340ctcttttaaa
cacatgtgct gcacatgtgt tggtgtggtt acacgatata caaccatgtt 5400gggcaaccac
aggtgcgcag catcaacaac ggggcgagtc aggaggaacg gtaaccttgg 5460cggtttacga
atgaggtgtt ccccgttgga gaggtgtcta tggacgacca tcccagtaca 5520atggacgggt
tggttggtgt ggttacggtg tcaacgacag agacatgggc caacagagac 5580attatggaga
gacgttggga gtggatcctc cctggaaagg gtgtctttgg cgaagtgttg 5640ttgatttgta
tgtctgggtc ggttgtttgt tgtgctagtt gctgcataat ggtaccaaca 5700ccagcggttt
ttttgtcctc ctctgcaaca ggggaactcg gccacttgtt tattgggggg 5760gggggggggg
atgccacgtg gtaatgaaac attatcaaga gacctctctt tgtaggggta 5820cctgtgatgt
gtgtgtgtgt gcatgtgcat agatgcatat tgagacagtt gtgttcattg 5880actgtctgca
tttctccaga accaataaaa aataaaaata gtagcaatac ctacactttc 5940tggaccttgt
agtaacgtag acgccgtaag gttgcatttc acgccagctt gagaggggga 6000aagaactacc
aatctcgagg gatccgcggc cgcaataacc tcagggagaa ctttggcatt 6060gtactctcca
ttgacgagtc cgccaaccca ttcttgttaa acctaacctt gcattatcac 6120attccctttg
acccccttta gctgcatttc cacttgtcta cattaagatt cattacacat 6180tctttttcgt
atttctctta cctccctccc ccctccatgg atcttatata taaatctttt 6240ctataacaat
aatatctact agagttaaac aacaattcca cttggcatgg ctgtctcagc 6300aaatctgctt
ctacctactg cacgggtttg catgtcattg tttctagcag ggaatcgtcc 6360atgtacgttg
tcctccatga tggtcttccc gctgccactt tctttagtat cttaaataga 6420gcagatctta
cgtccactgt gcatccgtgc accccgaaaa tcgtatggtt ttccttgcca 6480cctctcacaa
ttttgaatat gctcaacgcg aaagagaggg gaagaggaat cgcattcgta 6540gagtggctac
attcaaccct gacaaaggaa ctagcgtttg tgcaggagag agtggtttgc 6600atagatttcc
tttcctttgc aagcatatta tatagagtag ccaatacagt aacagctaca 6660gcacaaaaaa
gagaacgaga acgagaacga gaacaagaac aagaactagc actactgtca 6720ctgccagcat
caacattact accattattc caacatgttt gcaactagaa atataaccat 6780tggtgtcaga
acactcagac caaccagttt cttgaaaaca aggtcttttc tgcaacagag 6840gctacaatca
acgctaaaga agagctatga accaaccaaa tccgagct
6888
137 6276 DNA Artificial Sequence
Synthetic - A. vinelandii URA integration fragment
137 ggccccttca tttacgaaat aaagtgccgc ggttacgcag
cacacaccag caatcacgtg 60cagtgtcttt ttcttttttt tttctttttt ttcctctttt
tcttttgttt tgtttcgttt 120cttttccgcc agttcccgtt ttccatttcc ggaacaacaa
tgggactcca ctgttttctt 180tccccccttc cgttttcggc tcgcagtctg tacatgcacg
tttatccgac acctgtcttg 240tttggcgcgt aattaataca gtttctccgg agtccaggtc
tcggacgggt aatttacacg 300tcatcattca tttctgtgtc aagagaggta gcgcaaaaag
tagaaatggt gaaccacggg 360aatgacttgc tggaaatcga cgccagagtc catttgaaaa
cctacctcta caagagagga 420aacacactac agggtgtccc tggtccgtaa aatggcgtaa
tatgatgact tccctctata 480gacgttgtat ttccagctcc aacatggtta aactattgct
atggtgatgg tattacagat 540agtaaaagaa ggaagggggg gtggcaatct caccctaaca
gttactaaga acgtctactt 600catctactgt caatatacat tggccacatg ccgagaaatt
acgtcgacgc caaagaaggg 660ctcagccgaa aaaagaaatg gaaaacttgg ccgaaaaggg
aaacaaacaa aaaggtgatg 720taaaattagc ggaaagggga attggcaaat tgagggagaa
aaaaaaaagg cagaaaagga 780ggcggaaagt cagtacgttt tgaaggcgtc attggttttc
ccttttgcag agtgtttcat 840ttcttttgtt tcatgacgta gtggcgtttc ttttcctgca
ctttagaaat ctatcttttc 900cttatcaagt aacaagcggt tggcaaaggt gtatataaat
caaggaattc ccactttgaa 960ccctttgaat tttgatatcg tttattttaa atttatttgc
ggccgcggat ccctcgagat 1020tggtagttct ttccccctct caagctggcg tgaaatgcaa
ccttacggcg tctacgttac 1080tacaaggtcc agaaagtgta ggtattgcta ctatttttat
tttttattgg ttctggagaa 1140atgcagacag tcaatgaaca caactgtctc aatatgcatc
tatgcacatg cacacacaca 1200cacatcacag gtacccctac aaagagaggt ctcttgataa
tgtttcatta ccacgtggca 1260tccccccccc cccccccaat aaacaagtgg ccgagttccc
ctgttgcaga ggaggacaaa 1320aaaaccgctg gtgttggtac cattatgcag caactagcac
aacaaacaac cgacccagac 1380atacaaatca acaacacttc gccaaagaca ccctttccag
ggaggatcca ctcccaacgt 1440ctctccataa tgtctctgtt ggcccatgtc tctgtcgttg
acaccgtaac cacaccaacc 1500aacccgtcca ttgtactggg atggtcgtcc atagacacct
ctccaacggg gaacacctca 1560ttcgtaaacc gccaaggtta ccgttcctcc tgactcgccc
cgttgttgat gctgcgcacc 1620tgtggttgcc caacatggtt gtatatcgtg taaccacacc
aacacatgtg cagcacatgt 1680gtttaaaaga gtgtcatgga ggtggatcat gatggaagtg
gactttacca cttgggaact 1740gtctccactc ccgggaagaa aagacccggc gtatcacgcg
gttgcctcaa tggggcaatt 1800tggaaggaga aatataggga aaatcacgtc gctctcggac
ggggaagagt tccagactat 1860gagggggggg ggtggtatat aaagacagga gatgtccacc
cccagagaga ggaagaagtt 1920ggaactttag aagagagaga taactttccc cagtgtccat
caatacacaa ccaaacacaa 1980actctatatt tacacatata accccctctc tagaatggct
gtttacaatt atgatgttgt 2040tgttatcggt acaggtccag cgggtgaagg tgccgctatg
aatgcagtca aagctggtag 2100aaaagttgca gttgttgatg atagaccaca agttggcggt
aattgtacgc acttaggcac 2160tatcccaagt aaggctttaa ggcactctgt tagacaaatt
atgcagtaca acaacaaccc 2220attattccgt caaattggtg aaccaagatg gttctctttc
gctgacgttc taaaatcagc 2280agagcaagtt attgcgaagc aagtctcttc cagaaccgga
tattatgcta gaaacagaat 2340tgataccttt ttcggtaccg catcattctg tgacgagcat
actattgaag tcgttcattt 2400gaatggtatg gtcgaaacct tggtcgcaaa gcaatttgtt
attgctaccg gttcaagacc 2460ttatagacca gccgatgtgg actttactca ccctaggatc
tacgattctg atactatctt 2520atctttgggt catactccaa gacgtttgat tatctacgga
gctggtgtta tcggttgcga 2580atacgcaagt attttctcag gactaggtgt tttagtcgat
ttgattgaca atcgtgatca 2640attactatct tttctcgatg acgaaatttc cgactcattg
tcttatcact taaggaataa 2700caatgtctta atcagacata acgaggagta cgaaagagtt
gaaggcctcg acaacggtgt 2760cattttacat ttgaagtctg gtaaaaagat caaggcagat
gcattccttt ggtccaacgg 2820tagaaccggt aacactgata agttgggtct tgaaaacatt
ggtttaaagg ctaatggtag 2880aggtcaaatt caagtcgatg agcattatag aaccgaagtt
tccaacatct acgctgctgg 2940tgatgttatt ggttggccat ctttggcatc tgctgcttac
gatcaaggta gatccgctgc 3000tggctccatc actgaaaacg actcctggag atttgttgac
gatgtcccta ccggtatcta 3060caccattcca gaaatttcca gtgttggtaa gacagaaaga
gaacttactc aggcaaaagt 3120tccatacgag gttggtaagg cctttttcaa aggtatggca
agagcccaga ttgctgttga 3180aaaggcaggc atgttaaaga ttttgtttca cagagaaaca
ctcgaaatct taggcgtcca 3240ttgttttggt tatcaagcat ctgaaattgt gcacatcggt
caagcaatta tgaaccaaaa 3300gggtgaagcc aacactttga agtatttcat taatactaca
ttcaattatc ctacaatggc 3360agaagcatat agagtggctg cgtacgatgg tttgaacaga
ttgttctaat taattaacat 3420ctgaatgtaa aatgaacatt aaaatgaatt actaaacttt
acgtctactt tacaatctat 3480aaactttgtt taatcatata acgaaataca ctaatacaca
atcctgtacg tatgtaatac 3540ttttatccat caaggattga gaaaaaaaag taatgattcc
ctgggccatt aaaacttaga 3600cccccaagct tggataggtc actctctatt ttcgtttctc
ccttccctga tagaagggtg 3660atatgtaatt aagaataata tataatttta taataaaaga
attcatagcc tcatgaaatc 3720agccatttgc ttttgttcaa cgatcttttg aaattgttgt
tgttcttggt agttaagttg 3780atccatcttg gcttatgttg tgtgtatgtt gtagttattc
ttagtatatt cctgtcctga 3840gtttagtgaa acataatatc gccttgaaat gaaaatgctg
aaattcgtcg acatacaatt 3900tttcaaactt tttttttttc ttggtgcacg gacatgtttt
taaaggaagt actctatacc 3960agttattctt cacaaattta attgctggag aatagatctt
caacgcttta ataaagtagt 4020ttgtttgtca aggatggcgt catacaaaga aagatcagaa
tcacacactt cccctgttgc 4080taggagactt ttctccatca tggaggaaaa gaagtctaac
ctttgtgcat cattggatat 4140tactgaaact gaaaagcttc tctctatttt ggacactatt
ggtccttaca tctgtctagt 4200taaaacacac atcgatattg tttctgattt tacgtatgaa
ggaactgtgt tgcctttgaa 4260ggagcttgcc aagaaacata attttatgat ttttgaagat
agaaaatttg ctgatattgg 4320taacactgtt aaaaatcaat ataaatctgg tgtcttccgt
attgccgaat gggctgacat 4380cactaatgca catggtgtaa cgggtgcagg tattgtttct
ggcttgaagg aggcagccca 4440agaaacaacc agtgaaccta gaggtttgct aatgcttgct
gagttatcat caaagggttc 4500tttagcatat ggtgaatata cagaaaaaac agtagaaatt
gctaaatctg ataaagagtt 4560tgtcattggt tttattgcgc aacacgatat gggcggtaga
gaagaaggtt ttgactggat 4620cattatgact ccaggggttg gtttagatga caaaggtgat
gcacttggtc aacaatatag 4680aactgttgat gaagttgtaa agactggaac ggatatcata
attgttggta gaggtttgta 4740cggtcaagga agagatccta tagagcaagc taaaagatac
caacaagctg gttggaatgc 4800ttatttaaac agatttaaat gattcttaca caaagatttg
atacatgtac actagtttaa 4860ataagcatga aaagaattac acaagcaaaa aaaaaaaaat
aaatgaggta ctttacgttc 4920acctacaacc aaaaaaacta gatagagtaa aatcttaaga
tttagaaaaa gttgtttaac 4980aaaggcttta gtatgtgaat ttttaatgta gcaaagcgat
aactaataaa cataaacaaa 5040agtatggttt tctttatcag tcaaatcatt atcgattgat
tgttccgcgt atctgcagat 5100agcctcatga aatcagccat ttgcttttgt tcaacgatct
tttgaaattg ttgttgttct 5160tggtagttaa gttgatccat cttggcttat gttgtgtgta
tgttgtagtt attcttagta 5220tattcctgtc ctgagtttag tgaaacataa tatcgccttg
aaatgaaaat gctgaaattc 5280gtcgacatac aatttttcaa actttttttt tttcttggtg
cacggacatg tttttaaagg 5340aagtactcta taccagttat tcttcacaaa tttaattgct
ggagaataga tcttcaacgc 5400cccgggggat ctggatccgc ggccgcaata acctcaggga
gaactttggc attgtactct 5460ccattgacga gtccgccaac ccattcttgt taaacctaac
cttgcattat cacattccct 5520ttgaccccct ttagctgcat ttccacttgt ctacattaag
attcattaca cattcttttt 5580cgtatttctc ttacctccct cccccctcca tggatcttat
atataaatct tttctataac 5640aataatatct actagagtta aacaacaatt ccacttggca
tggctgtctc agcaaatctg 5700cttctaccta ctgcacgggt ttgcatgtca ttgtttctag
cagggaatcg tccatgtacg 5760ttgtcctcca tgatggtctt cccgctgcca ctttctttag
tatcttaaat agagcagatc 5820ttacgtccac tgtgcatccg tgcaccccga aaatcgtatg
gttttccttg ccacctctca 5880caattttgaa tatgctcaac gcgaaagaga ggggaagagg
aatcgcattc gtagagtggc 5940tacattcaac cctgacaaag gaactagcgt ttgtgcagga
gagagtggtt tgcatagatt 6000tcctttcctt tgcaagcata ttatatagag tagccaatac
agtaacagct acagcacaaa 6060aaagagaacg agaacgagaa cgagaacaag aacaagaact
agcactactg tcactgccag 6120catcaacatt actaccatta ttccaacatg tttgcaacta
gaaatataac cattggtgtc 6180agaacactca gaccaaccag tttcttgaaa acaaggtctt
ttctgcaaca gaggctacaa 6240tcaacgctaa agaagagcta tgaaccaacc aaatcc
6276
138 6892 DNA Artificial Sequence
Synthetic - Pseudomonas fluorescens SthA MEL integration fragment
misc_feature (3975)..(3975) n is a, c, g, or t
138 cggatttggt
tggttcatag ctcttcttta gcgttgattg tagcctctgt tgcagaaaag 60accttgtttt
caagaaactg gttggtctga gtgttctgac accaatggtt atatttctag 120ttgcaaacat
gttggaataa tggtagtaat gttgatgctg gcagtgacag tagtgctagt 180tcttgttctt
gttctcgttc tcgttctcgt tctctttttt gtgctgtagc tgttactgta 240ttggctactc
tatataatat gcttgcaaag gaaaggaaat ctatgcaaac cactctctcc 300tgcacaaacg
ctagttcctt tgtcagggtt gaatgtagcc actctacgaa tgcgattcct 360cttcccctct
ctttcgcgtt gagcatattc aaaattgtga gaggtggcaa ggaaaaccat 420acgattttcg
gggtgcacgg atgcacagtg gacgtaagat ctgctctatt taagatacta 480aagaaagtgg
cagcgggaag accatcatgg aggacaacgt acatggacga ttccctgcta 540gaaacaatga
catgcaaacc cgtgcagtag gtagaagcag atttgctgag acagccatgc 600caagtggaat
tgttgtttaa ctctagtaga tattattgtt atagaaaaga tttatatata 660agatccatgg
aggggggagg gaggtaagag aaatacgaaa aagaatgtgt aatgaatctt 720aatgtagaca
agtggaaatg cagctaaagg gggtcaaagg gaatgtgata atgcaaggtt 780aggtttaaca
agaatgggtt ggcggactcg tcaatggaga gtacaatgcc aaagttctcc 840ctgaggttat
tgcggccgcg gatccctcga gattggtagt tctttccccc tctcaagctg 900gcgtgaaatg
caaccttacg gcgtctacgt tactacaagg tccagaaagt gtaggtattg 960ctactatttt
tattttttat tggttctgga gaaatgcaga cagtcaatga acacaactgt 1020ctcaatatgc
atctatgcac atgcacacac acacacatca caggtacccc tacaaagaga 1080ggtctcttga
taatgtttca ttaccacgtg gcatcccccc cccccccccc aataaacaag 1140tggccgagtt
cccctgttgc agaggaggac aaaaaaaccg ctggtgttgg taccattatg 1200cagcaactag
cacaacaaac aaccgaccca gacatacaaa tcaacaacac ttcgccaaag 1260acaccctttc
cagggaggat ccactcccaa cgtctctcca taatgtctct gttggcccat 1320gtctctgtcg
ttgacaccgt aaccacacca accaacccgt ccattgtact gggatggtcg 1380tccatagaca
cctctccaac ggggaacacc tcattcgtaa accgccaagg ttaccgttcc 1440tcctgactcg
ccccgttgtt gatgctgcgc acctgtggtt gcccaacatg gttgtatatc 1500gtgtaaccac
accaacacat gtgcagcaca tgtgtttaaa agagtgtcat ggaggtggat 1560catgatggaa
gtggacttta ccacttggga actgtctcca ctcccgggaa gaaaagaccc 1620ggcgtatcac
gcggttgcct caatggggca atttggaagg agaaatatag ggaaaatcac 1680gtcgctctcg
gacggggaag agttccagac tatgaggggg gggggtggta tataaagaca 1740ggagatgtcc
acccccagag agaggaagaa gttggaactt tagaagagag agataacttt 1800ccccagtgtc
catcaataca caaccaaaca caaactctat atttacacat ataaccccct 1860ctctagataa
aatggctgtt tataactacg acgttgttgt tttgggttct ggtccagcag 1920gcgaaggtgc
tgctatgaat gcagctaaag caggcagaaa agttgctatg gttgattcac 1980gtagacaagt
cggtggtaac tgtacccact taggtactat tccttctaag gctttgagac 2040actctgttcg
tcaaatcatg caattcaaca ctaatccaat gttcagagcc attggcgaac 2100caagatggtt
ctcctttcca gatgttttaa agtctgcaga aaaggttatt tccaagcaag 2160tcgcttctcg
taccggctat tacgctagaa acagagttga tttgtttttc ggtactggtt 2220ccttcgcaga
tgaacagact gttgaagtcg tttgtgcaaa tggtgttgtc gagaagttag 2280ttgctaagca
tattatcatc gccacaggtt ccagacctta cagaccagca gacatcgatt 2340tccatcatcc
acgtatctac gactctgata ccatcttatc tttaggccac acccctagaa 2400agttgattat
ctacggtgcc ggtgttatcg gttgcgagta tgcttctatc ttttcaggtt 2460tgggtgtctt
agtcgagttg gtcgataaca gagatcaact tttgtccttt ttagactctg 2520aaatttctca
agctctttcc tatcactttt ctaataacaa cattacagtt agacataatg 2580aggaatacga
cagagtcgaa ggtttagata acggtgttat tttgcatttg aagtccggta 2640aaaagattaa
ggccgatgca ttgttatggt gtaacggtag aactggtaat actgacaagt 2700taggtatgga
aaacattggt gttaaggtca actccagagg tcaaattgaa gttgacgaga 2760attacagaac
ctgtgtcaca aacatttatg gtgctggtga tgttattggt tggccatcac 2820ttgcctcagc
agctcacgac caaggtagat cagcagctgg ctctatcgtt gataatggtt 2880cctggagata
tgtcaacgat gttccaaccg gtatctacac tattccagaa atttcctcaa 2940ttggtaaaaa
tgaacacgaa ttgactaaag ctaaggttcc ttatgaggtc ggtaaagcct 3000ttttcaagtc
tatggcaaga gcacaaattg ctggtgaacc acagggtatg cttaaaatct 3060tattccatag
agaaacttta gaagtcttag gtgttcactg ttttggttat caagcatccg 3120aaattgttca
tattggccag gcaattatga accaaccagg tgaacaaaat actcttaagt 3180acttcgtcaa
taccaccttc aactacccaa caatggctga agcatataga gttgcagctt 3240acgatggttt
gaacagattg ttctaattaa ttaacatctg aatgtaaaat gaacattaaa 3300atgaattact
aaactttacg tctactttac aatctataaa ctttgtttaa tcatataacg 3360aaatacacta
atacacaatc ctgtacgtat gtaatacttt tatccatcaa ggattgagaa 3420aaaaaagtaa
tgattccctg ggccattaaa acttagaccc ccaagcttgg ataggtcact 3480ctctattttc
gtttctccct tccctgatag aagggtgata tgtaattaag aataatatat 3540aattttataa
taaaagaatt cgcccttacc tgcagggata acttcgtata atgtatgcta 3600tacgaagtta
tgctgcaacg gcaacatcaa tgtccacgtt tacacaccta catttatatc 3660tatatttata
tttatattta tttatttatg ctacttagct tctatagtta gttaatgcac 3720tcacgatatt
caaaattgac acccttcaac tactccctac tattgtctac tactgtctac 3780tactcctctt
tactatagct gctcccaata ggctccacca ataggctctg tcaatacatt 3840ttgcgccgcc
acctttcagg ttgtgtcact cctgaaggac catattgggt aatcgtgcaa 3900tttctggaag
agagtgccgc gagaagtgag gcccccactg taaatcctcg agggggcatg 3960gagtatgggg
catgnaggat ggaggatggg gggggggggg gaaaataggt agcgaaagga 4020cccgctatca
ccccacccgg agaactcgtt gccgggaagt catatttcga cactccgggg 4080agtctataaa
aggcgggttt tgtcttttgc cagttgatgt tgctgagagg acttgtttgc 4140cgtttcttcc
gatttaacag tatagaatca accactgtta attatacacg ttatactaac 4200acaacaaaaa
caaaaacaac gacaacaaca acaacaatgt ttgctttcta ctttctcacc 4260gcatgcacca
ctttgaaggg tgttttcgga gtttctccga gttacaatgg tcttggtctc 4320accccacaga
tgggttggga cagctggaat acgtttgcct gcgatgtcag tgaacagcta 4380cttctagaca
ctgctgatag aatttctgac ttggggctaa aggatatggg ttacaagtat 4440gtcatcctag
atgactgttg gtctagcggc agggattccg acggtttcct cgttgcagac 4500aagcacaaat
ttcccaacgg tatgggccat gttgcagacc acctgcataa taacagcttt 4560cttttcggta
tgtattcgtc tgctggtgag tacacctgtg ctgggtaccc tgggtctctg 4620gggcgtgagg
aagaagatgc tcaattcttt gcaaataacc gcgttgacta cttgaagtat 4680gataattgtt
acaataaagg tcaatttggt acaccagacg tttcttacca ccgttacaag 4740gccatgtcag
atgctttgaa taaaactggt aggcctattt tctattctct atgtaactgg 4800ggtcaggatt
tgacatttta ctggggctct ggtatcgcca attcttggag aatgagcgga 4860gatattactg
ctgagttcac ccgtccagat agcagatgtc cctgtgacgg tgacgaatat 4920gattgcaagt
acgccggttt ccattgttct attatgaata ttcttaacaa ggcagctcca 4980atggggcaaa
atgcaggtgt tggtggttgg aacgatctgg acaatctaga ggtcggagtc 5040ggtaatttga
ctgacgatga ggaaaaggcc catttctcta tgtgggcaat ggtaaagtcc 5100ccacttatca
ttggtgccga cgtgaatcac ttaaaggcat cttcgtactc gatctacagt 5160caagcctctg
tcatcgcaat taatcaagat ccaaagggta ttccagccac aagagtctgg 5220agatattatg
tttcagacac cgatgaatat ggacaaggtg aaattcaaat gtggagtggt 5280ccgcttgaca
atggtgacca agtggttgct ttattgaatg gaggaagcgt agcaagacca 5340atgaacacga
ccttggaaga gattttcttt gacagcaatt tgggttcaaa ggaactgaca 5400tcgacttggg
atatttacga cttatgggcc aacagagttg acaactctac ggcgtctgct 5460atccttgaac
agaataaggc agccaccggt attctctaca atgctacaga gcagtcttat 5520aaagacggtt
tgtctaagaa tgatacaaga ctgtttggcc agaaaattgg tagtctttct 5580ccaaatgcta
tacttaacac aactgttcca gctcatggta tcgccttcta taggttgaga 5640ccctcggctt
aagctcaatg ttgagcaaag caggacgaga aaaaaaaaaa taatgattgt 5700taagaagttc
atgaaaaaaa aaaggaaaaa tactcaaata cttataacag agtgattaaa 5760taataaacgg
cagtataccc tatcaggtat tgagatagtt ttatttttgt aggtatataa 5820tctgaagcct
ttgaactatt ttctcgtata tatcatggag tatacattgc attagcaaca 5880ttgcatacta
gttcataact tcgtataatg tatgctatac gaagttatta attaacaagg 5940gcgaattcct
tgatttatat acacctttgc caaccgcttg ttacttgata aggaaaagat 6000agatttctaa
agtgcaggaa aagaaacgcc actacgtcat gaaacaaaag aaatgaaaca 6060ctctgcaaaa
gggaaaacca atgacgcctt caaaacgtac tgactttccg cctccttttc 6120tgcctttttt
ttttctccct caatttgcca attccccttt ccgctaattt tacatcacct 6180ttttgtttgt
ttcccttttc ggccaagttt tccatttctt ttttcggctg agcccttctt 6240tggcgtcgac
gtaatttctc ggcatgtggc caatgtatat tgacagtaga tgaagtagac 6300gttcttagta
actgttaggg tgagattgcc accccccctt ccttctttta ctatctgtaa 6360taccatcacc
atagcaatag tttaaccatg ttggagctgg aaatacaacg tctatagagg 6420gaagtcatca
tattacgcca ttttacggac cagggacacc ctgtagtgtg tttcctctct 6480tgtagaggta
ggttttcaaa tggactctgg cgtcgatttc cagcaagtca ttcccgtggt 6540tcaccatttc
tactttttgc gctacctctc ttgacacaga aatgaatgat gacgtgtaaa 6600ttacccgtcc
gagacctgga ctccggagaa actgtattaa ttacgcgcca aacaagacag 6660gtgtcggata
aacgtgcatg tacagactgc gagccgaaaa cggaaggggg gaaagaaaac 6720agtggagtcc
cattgttgtt ccggaaatgg aaaacgggaa ctggcggaaa agaaacgaaa 6780caaaacaaaa
gaaaaagagg aaaaaaaaga aaaaaaaaag aaaaagacac tgcacgtgat 6840tgctggtgtg
tgctgcgtaa ccgcggcact ttatttcgta aatgaagggg cc
6892
139 1395 DNA Pseudomonas fluorescens
139 ttagaacaat ctgttcaaac catcgtaagc
tgcaactcta tatgcttcag ccattgttgg 60gtagttgaag gtggtattga cgaagtactt
aagagtattt tgttcacctg gttggttcat 120aattgcctgg ccaatatgaa caatttcgga
tgcttgataa ccaaaacagt gaacacctaa 180gacttctaaa gtttctctat ggaataagat
tttaagcata ccctgtggtt caccagcaat 240ttgtgctctt gccatagact tgaaaaaggc
tttaccgacc tcataaggaa ccttagcttt 300agtcaattcg tgttcatttt taccaattga
ggaaatttct ggaatagtgt agataccggt 360tggaacatcg ttgacatatc tccaggaacc
attatcaacg atagagccag ctgctgatct 420accttggtcg tgagctgctg aggcaagtga
tggccaacca ataacatcac cagcaccata 480aatgtttgtg acacaggttc tgtaattctc
gtcaacttca atttgacctc tggagttgac 540cttaacacca atgttttcca tacctaactt
gtcagtatta ccagttctac cgttacacca 600taacaatgca tcggccttaa tctttttacc
ggacttcaaa tgcaaaataa caccgttatc 660taaaccttcg actctgtcgt attcctcatt
atgtctaact gtaatgttgt tattagaaaa 720gtgataggaa agagcttgag aaatttcaga
gtctaaaaag gacaaaagtt gatctctgtt 780atcgaccaac tcgactaaga cacccaaacc
tgaaaagata gaagcatact cgcaaccgat 840aacaccggca ccgtagataa tcaactttct
aggggtgtgg cctaaagata agatggtatc 900agagtcgtag atacgtggat gatggaaatc
gatgtctgct ggtctgtaag gtctggaacc 960tgtggcgatg ataatatgct tagcaactaa
cttctcgaca acaccatttg cacaaacgac 1020ttcaacagtc tgttcatctg cgaaggaacc
agtaccgaaa aacaaatcaa ctctgtttct 1080agcgtaatag ccggtacgag aagcgacttg
cttggaaata accttttctg cagactttaa 1140aacatctgga aaggagaacc atcttggttc
gccaatggct ctgaacattg gattagtgtt 1200gaattgcatg atttgacgaa cagagtgtct
caaagcctta gaaggaatag tacctaagtg 1260ggtacagtta ccaccgactt gtctacgtga
atcaaccata gcaacttttc tgcctgcttt 1320agctgcattc atagcagcac cttcgcctgc
tggaccagaa cccaaaacaa caacgtcgta 1380gttataaaca gccat
1395
140 6281 DNA Artificial Sequence
Synthetic - Pseudomonas fluorescens URA integration fragment
140 cggatttggt tggttcatag ctcttcttta gcgttgattg tagcctctgt tgcagaaaag
60accttgtttt caagaaactg gttggtctga gtgttctgac accaatggtt atatttctag
120ttgcaaacat gttggaataa tggtagtaat gttgatgctg gcagtgacag tagtgctagt
180tcttgttctt gttctcgttc tcgttctcgt tctctttttt gtgctgtagc tgttactgta
240ttggctactc tatataatat gcttgcaaag gaaaggaaat ctatgcaaac cactctctcc
300tgcacaaacg ctagttcctt tgtcagggtt gaatgtagcc actctacgaa tgcgattcct
360cttcccctct ctttcgcgtt gagcatattc aaaattgtga gaggtggcaa ggaaaaccat
420acgattttcg gggtgcacgg atgcacagtg gacgtaagat ctgctctatt taagatacta
480aagaaagtgg cagcgggaag accatcatgg aggacaacgt acatggacga ttccctgcta
540gaaacaatga catgcaaacc cgtgcagtag gtagaagcag atttgctgag acagccatgc
600caagtggaat tgttgtttaa ctctagtaga tattattgtt atagaaaaga tttatatata
660agatccatgg aggggggagg gaggtaagag aaatacgaaa aagaatgtgt aatgaatctt
720aatgtagaca agtggaaatg cagctaaagg gggtcaaagg gaatgtgata atgcaaggtt
780aggtttaaca agaatgggtt ggcggactcg tcaatggaga gtacaatgcc aaagttctcc
840ctgaggttat tgcggccgcg gatccagatc ccccggggcg ttgaagatct attctccagc
900aattaaattt gtgaagaata actggtatag agtacttcct ttaaaaacat gtccgtgcac
960caagaaaaaa aaaaagtttg aaaaattgta tgtcgacgaa tttcagcatt ttcatttcaa
1020ggcgatatta tgtttcacta aactcaggac aggaatatac taagaataac tacaacatac
1080acacaacata agccaagatg gatcaactta actaccaaga acaacaacaa tttcaaaaga
1140tcgttgaaca aaagcaaatg gctgatttca tgaggctatc tgcagatacg cggaacaatc
1200aatcgataat gatttgactg ataaagaaaa ccatactttt gtttatgttt attagttatc
1260gctttgctac attaaaaatt cacatactaa agcctttgtt aaacaacttt ttctaaatct
1320taagatttta ctctatctag tttttttggt tgtaggtgaa cgtaaagtac ctcatttatt
1380tttttttttt tgcttgtgta attcttttca tgcttattta aactagtgta catgtatcaa
1440atctttgtgt aagaatcatt taaatctgtt taaataagca ttccaaccag cttgttggta
1500tcttttagct tgctctatag gatctcttcc ttgaccgtac aaacctctac caacaattat
1560gatatccgtt ccagtcttta caacttcatc aacagttcta tattgttgac caagtgcatc
1620acctttgtca tctaaaccaa cccctggagt cataatgatc cagtcaaaac cttcttctct
1680accgcccata tcgtgttgcg caataaaacc aatgacaaac tctttatcag atttagcaat
1740ttctactgtt ttttctgtat attcaccata tgctaaagaa ccctttgatg ataactcagc
1800aagcattagc aaacctctag gttcactggt tgtttcttgg gctgcctcct tcaagccaga
1860aacaatacct gcacccgtta caccatgtgc attagtgatg tcagcccatt cggcaatacg
1920gaagacacca gatttatatt gatttttaac agtgttacca atatcagcaa attttctatc
1980ttcaaaaatc ataaaattat gtttcttggc aagctccttc aaaggcaaca cagttccttc
2040atacgtaaaa tcagaaacaa tatcgatgtg tgttttaact agacagatgt aaggaccaat
2100agtgtccaaa atagagagaa gcttttcagt ttcagtaata tccaatgatg cacaaaggtt
2160agacttcttt tcctccatga tggagaaaag tctcctagca acaggggaag tgtgtgattc
2220tgatctttct ttgtatgacg ccatccttga caaacaaact actttattaa agcgttgaag
2280atctattctc cagcaattaa atttgtgaag aataactggt atagagtact tcctttaaaa
2340acatgtccgt gcaccaagaa aaaaaaaaag tttgaaaaat tgtatgtcga cgaatttcag
2400cattttcatt tcaaggcgat attatgtttc actaaactca ggacaggaat atactaagaa
2460taactacaac atacacacaa cataagccaa gatggatcaa cttaactacc aagaacaaca
2520acaatttcaa aagatcgttg aacaaaagca aatggctgat ttcatgaggc tatgaattct
2580tttattataa aattatatat tattcttaat tacatatcac ccttctatca gggaagggag
2640aaacgaaaat agagagtgac ctatccaagc ttgggggtct aagttttaat ggcccaggga
2700atcattactt ttttttctca atccttgatg gataaaagta ttacatacgt acaggattgt
2760gtattagtgt atttcgttat atgattaaac aaagtttata gattgtaaag tagacgtaaa
2820gtttagtaat tcattttaat gttcatttta cattcagatg ttaattaatt agaacaatct
2880gttcaaacca tcgtaagctg caactctata tgcttcagcc attgttgggt agttgaaggt
2940ggtattgacg aagtacttaa gagtattttg ttcacctggt tggttcataa ttgcctggcc
3000aatatgaaca atttcggatg cttgataacc aaaacagtga acacctaaga cttctaaagt
3060ttctctatgg aataagattt taagcatacc ctgtggttca ccagcaattt gtgctcttgc
3120catagacttg aaaaaggctt taccgacctc ataaggaacc ttagctttag tcaattcgtg
3180ttcattttta ccaattgagg aaatttctgg aatagtgtag ataccggttg gaacatcgtt
3240gacatatctc caggaaccat tatcaacgat agagccagct gctgatctac cttggtcgtg
3300agctgctgag gcaagtgatg gccaaccaat aacatcacca gcaccataaa tgtttgtgac
3360acaggttctg taattctcgt caacttcaat ttgacctctg gagttgacct taacaccaat
3420gttttccata cctaacttgt cagtattacc agttctaccg ttacaccata acaatgcatc
3480ggccttaatc tttttaccgg acttcaaatg caaaataaca ccgttatcta aaccttcgac
3540tctgtcgtat tcctcattat gtctaactgt aatgttgtta ttagaaaagt gataggaaag
3600agcttgagaa atttcagagt ctaaaaagga caaaagttga tctctgttat cgaccaactc
3660gactaagaca cccaaacctg aaaagataga agcatactcg caaccgataa caccggcacc
3720gtagataatc aactttctag gggtgtggcc taaagataag atggtatcag agtcgtagat
3780acgtggatga tggaaatcga tgtctgctgg tctgtaaggt ctggaacctg tggcgatgat
3840aatatgctta gcaactaact tctcgacaac accatttgca caaacgactt caacagtctg
3900ttcatctgcg aaggaaccag taccgaaaaa caaatcaact ctgtttctag cgtaatagcc
3960ggtacgagaa gcgacttgct tggaaataac cttttctgca gactttaaaa catctggaaa
4020ggagaaccat cttggttcgc caatggctct gaacattgga ttagtgttga attgcatgat
4080ttgacgaaca gagtgtctca aagccttaga aggaatagta cctaagtggg tacagttacc
4140accgacttgt ctacgtgaat caaccatagc aacttttctg cctgctttag ctgcattcat
4200agcagcacct tcgcctgctg gaccagaacc caaaacaaca acgtcgtagt tataaacagc
4260cattttatct agagaggggg ttatatgtgt aaatatagag tttgtgtttg gttgtgtatt
4320gatggacact ggggaaagtt atctctctct tctaaagttc caacttcttc ctctctctgg
4380gggtggacat ctcctgtctt tatataccac ccccccccct catagtctgg aactcttccc
4440cgtccgagag cgacgtgatt ttccctatat ttctccttcc aaattgcccc attgaggcaa
4500ccgcgtgata cgccgggtct tttcttcccg ggagtggaga cagttcccaa gtggtaaagt
4560ccacttccat catgatccac ctccatgaca ctcttttaaa cacatgtgct gcacatgtgt
4620tggtgtggtt acacgatata caaccatgtt gggcaaccac aggtgcgcag catcaacaac
4680ggggcgagtc aggaggaacg gtaaccttgg cggtttacga atgaggtgtt ccccgttgga
4740gaggtgtcta tggacgacca tcccagtaca atggacgggt tggttggtgt ggttacggtg
4800tcaacgacag agacatgggc caacagagac attatggaga gacgttggga gtggatcctc
4860cctggaaagg gtgtctttgg cgaagtgttg ttgatttgta tgtctgggtc ggttgtttgt
4920tgtgctagtt gctgcataat ggtaccaaca ccagcggttt ttttgtcctc ctctgcaaca
4980ggggaactcg gccacttgtt tattgggggg gggggggggg atgccacgtg gtaatgaaac
5040attatcaaga gacctctctt tgtaggggta cctgtgatgt gtgtgtgtgt gcatgtgcat
5100agatgcatat tgagacagtt gtgttcattg actgtctgca tttctccaga accaataaaa
5160aataaaaata gtagcaatac ctacactttc tggaccttgt agtaacgtag acgccgtaag
5220gttgcatttc acgccagctt gagaggggga aagaactacc aatctcgagg gatccgcggc
5280cgcaaataaa tttaaaataa acgatatcaa aattcaaagg gttcaaagtg ggaattcctt
5340gatttatata cacctttgcc aaccgcttgt tacttgataa ggaaaagata gatttctaaa
5400gtgcaggaaa agaaacgcca ctacgtcatg aaacaaaaga aatgaaacac tctgcaaaag
5460ggaaaaccaa tgacgccttc aaaacgtact gactttccgc ctccttttct gccttttttt
5520tttctccctc aatttgccaa ttcccctttc cgctaatttt acatcacctt tttgtttgtt
5580tcccttttcg gccaagtttt ccatttcttt tttcggctga gcccttcttt ggcgtcgacg
5640taatttctcg gcatgtggcc aatgtatatt gacagtagat gaagtagacg ttcttagtaa
5700ctgttagggt gagattgcca cccccccttc cttcttttac tatctgtaat accatcacca
5760tagcaatagt ttaaccatgt tggagctgga aatacaacgt ctatagaggg aagtcatcat
5820attacgccat tttacggacc agggacaccc tgtagtgtgt ttcctctctt gtagaggtag
5880gttttcaaat ggactctggc gtcgatttcc agcaagtcat tcccgtggtt caccatttct
5940actttttgcg ctacctctct tgacacagaa atgaatgatg acgtgtaaat tacccgtccg
6000agacctggac tccggagaaa ctgtattaat tacgcgccaa acaagacagg tgtcggataa
6060acgtgcatgt acagactgcg agccgaaaac ggaagggggg aaagaaaaca gtggagtccc
6120attgttgttc cggaaatgga aaacgggaac tggcggaaaa gaaacgaaac aaaacaaaag
6180aaaaagagga aaaaaaagaa aaaaaaaaga aaaagacact gcacgtgatt gctggtgtgt
6240gctgcgtaac cgcggcactt tatttcgtaa atgaaggggc c
6281

<210> 141
<211> 34
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic - Primer 141
<400> 141
aaagggttaa ttaattagaa caatctgttc aaac 34

<210> 142
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic - Primer 142
<400> 142
ctgttcaaac catcgtaagc 20

<210> 143
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic - Primer 143
<400> 143
gtagttgaag gtggtattaa cgaaatattc 30

<210> 144
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic - Primer 144
<400> 144
ggaactgtgt tgcctttg 18

<210> 145
<211> 32
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic - Primer 145
<400> 145
gaataaaact ggtaggccta ttttctattc tc 32

<210> 146
<211> 464
<212> PRT
<213> Pseudomonas fluorescens
<400> 146
Met Ala Val Tyr Asn Tyr Asp Val Val Val Leu Gly Ser Gly Pro Ala
1               5                   10                  15
Gly Glu Gly Ala Ala Met Asn Ala Ala Lys Ala Gly Arg Lys Val Ala
            20                  25                  30
Met Val Asp Ser Arg Arg Gln Val Gly Gly Asn Cys Thr His Leu Gly
        35                  40                  45
Thr Ile Pro Ser Lys Ala Leu Arg His Ser Val Arg Gln Ile Met Gln
    50                  55                  60
Phe Asn Thr Asn Pro Met Phe Arg Ala Ile Gly Glu Pro Arg Trp Phe
65                  70                  75                  80
Ser Phe Pro Asp Val Leu Lys Ser Ala Glu Lys Val Ile Ser Lys Gln
                85                  90                  95
Val Ala Ser Arg Thr Gly Tyr Tyr Ala Arg Asn Arg Val Asp Leu Phe
            100                 105                 110
Phe Gly Thr Gly Ser Phe Ala Asp Glu Gln Thr Val Glu Val Val Cys
        115                 120                 125
Ala Asn Gly Val Val Glu Lys Leu Val Ala Lys His Ile Ile Ile Ala
    130                 135                 140
Thr Gly Ser Arg Pro Tyr Arg Pro Ala Asp Ile Asp Phe His His Pro
145                 150                 155                 160
Arg Ile Tyr Asp Ser Asp Thr Ile Leu Ser Leu Gly His Thr Pro Arg
                165                 170                 175
Lys Leu Ile Ile Tyr Gly Ala Gly Val Ile Gly Cys Glu Tyr Ala Ser
            180                 185                 190
Ile Phe Ser Gly Leu Gly Val Leu Val Glu Leu Val Asp Asn Arg Asp
        195                 200                 205
Gln Leu Leu Ser Phe Leu Asp Ser Glu Ile Ser Gln Ala Leu Ser Tyr
    210                 215                 220
His Phe Ser Asn Asn Asn Ile Thr Val Arg His Asn Glu Glu Tyr Asp
225                 230                 235                 240
Arg Val Glu Gly Leu Asp Asn Gly Val Ile Leu His Leu Lys Ser Gly
                245                 250                 255
Lys Lys Ile Lys Ala Asp Ala Leu Leu Trp Cys Asn Gly Arg Thr Gly
            260                 265                 270
Asn Thr Asp Lys Leu Gly Met Glu Asn Ile Gly Val Lys Val Asn Ser
        275                 280                 285
Arg Gly Gln Ile Glu Val Asp Glu Asn Tyr Arg Thr Cys Val Thr Asn
    290                 295                 300
Ile Tyr Gly Ala Gly Asp Val Ile Gly Trp Pro Ser Leu Ala Ser Ala
305                 310                 315                 320
Ala His Asp Gln Gly Arg Ser Ala Ala Gly Ser Ile Val Asp Asn Gly
                325                 330                 335
Ser Trp Arg Tyr Val Asn Asp Val Pro Thr Gly Ile Tyr Thr Ile Pro
            340                 345                 350
Glu Ile Ser Ser Ile Gly Lys Asn Glu His Glu Leu Thr Lys Ala Lys
        355                 360                 365
Val Pro Tyr Glu Val Gly Lys Ala Phe Phe Lys Ser Met Ala Arg Ala
    370                 375                 380
Gln Ile Ala Gly Glu Pro Gln Gly Met Leu Lys Ile Leu Phe His Arg
385                 390                 395                 400
Glu Thr Leu Glu Val Leu Gly Val His Cys Phe Gly Tyr Gln Ala Ser
                405                 410                 415
Glu Ile Val His Ile Gly Gln Ala Ile Met Asn Gln Pro Gly Glu Gln
            420                 425                 430
Asn Thr Leu Lys Tyr Phe Val Asn Thr Thr Phe Asn Tyr Pro Thr Met
        435                 440                 445
Ala Glu Ala Tyr Arg Val Ala Ala Tyr Asp Gly Leu Asn Arg Leu Phe
    450                 455                 460