Patent application title: Expression of Hexose Kinase in Recombinant Host Cells
Inventors:
Larry Cameron Anthony (Aston, PA, US)
Arthur Leo Kruckeberg (Wilmington, DE, US)
Brian James Paul (Wilmington, DE, US)
Assignees:
BUTAMAX(TM) ADVANCED BIOFUELS LLC
IPC8 Class: AC12P716FI
USPC Class:
435160
Class name: Containing hydroxy group acyclic butanol
Publication date: 2014-05-22
Patent application number: 20140141479
Abstract:
The invention relates to a recombinant host cell having (a) a
modification in an endogenous polynucleotide encoding a polypeptide
having dual-role hexokinase activity; (b) a heterologous polynucleotide
encoding a polypeptide having hexose kinase activity; and optionally (c)
a modification in an endogenous polynucleotide encoding a polypeptide
having pyruvate decarboxylase activity. Additionally, the invention
relates to methods of making and using such recombinant host cells
including, for example, methods of increasing glucose consumption,
methods of improving redox balance, and/or methods of increasing the
production of a product of a pyruvate-utilizing pathway.
Claims:
1. A recombinant yeast host cell comprising: (a) a modification in an
endogenous polynucleotide encoding a polypeptide having dual-role
hexokinase activity in the host cell wherein the activity of the
polypeptide of (a) is reduced or substantially eliminated; and (b) a
heterologous polynucleotide encoding a polypeptide having hexose kinase
activity.
2. The recombinant host cell of claim 1 wherein the glucose consumption rate is increased as compared to that of the host cell comprising (a) but not (b).
3. The recombinant host cell of claim 1 wherein the modification of (a) is a deletion.
4. The recombinant host cell of claim 1 wherein the recombinant host cell further comprises (c) a modification in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity.
5. The recombinant host cell of claim 4 wherein pyruvate decarboxylase activity is reduced or substantially eliminated.
6. The recombinant host of claim 1 wherein the polypeptide of (a) is HXK2, and wherein the recombinant host cell is S. cerevisiae.
7. The recombinant host cell of claim 1, wherein the polypeptide of (a) is RAG5, and wherein the recombinant host cell is K. lactis, or wherein the polypeptide of (a) is HPGLK1, and wherein the recombinant host cell is H. polymorpha, or wherein the polypeptide of (a) is HXK2, and wherein the recombinant host cell is S. pombe.
8-9. (canceled)
10. The recombinant host cell of claim 1 wherein the heterologous polynucleotide of (b) comprises the polypeptide of (a) having a deletion of a protein interaction domain that prevents function as a transcriptional regulator.
11. The recombinant host cell of claim 1 wherein the heterologous polynucleotide of (b) encodes a polypeptide that has at least about 85% identity to SEQ ID NO: 2, 115, 117, 119, 4, 6, 8, 121, or 123.
12. The recombinant host cell of claim 6 wherein the heterologous polynucleotide of (b) encodes a polypeptide of SEQ ID NO: 4, 6, 8, 121, or 123.
13. The recombinant host cell of claim 5 wherein the heterologous polynucleotide of (b) encodes a polypeptide that has at least about 85% identity to SEQ ID NO: 4, 6, 8, 121, or 123.
14. The recombinant host cell of claim 5 wherein the heterologous polynucleotide of (b) encodes a polypeptide that has at least about 85% identity to SEQ ID NO: 130.
15. The recombinant host cell of claim 1 wherein the polypeptide encoded by the heterologous polynucleotide of (b) is constitutively expressed.
16. The recombinant host cell of claim 1 wherein the heterologous polynucleotide of (b) comprises i) a promoter region derived from the S. cerevisiae ADH1 promoter region or ii) a promoter region having at least about 85% identity to SEQ ID NO: 131.
17. The recombinant host cell of claim 6 wherein the heterologous polynucleotide of (b) comprises a conditional promoter and encodes a polypeptide having at least 85% identity to SEQ ID NO: 4 or SEQ ID NO: 2.
18. The recombinant host cell of claim 17 wherein the conditional promoter comprises a sequence derived from the OLE1 promoter region.
19. The recombinant host cell of claim 1 further comprising a pyruvate utilizing biosynthetic pathway which forms a product selected from the group consisting of: 2,3-butanediol, isobutanol, 2-butanol, 1-butanol, 2-butanone, valine, leucine, lactic acid, malate, isoamyl alcohol, and isoprenoids.
20. The recombinant host cell of claim 19 wherein the product is isobutanol.
21-26. (canceled)
27. A method of increasing glucose consumption of a recombinant host cell comprising: (i) providing the recombinant host cell of claim 1; and (ii) growing the recombinant host cell under conditions wherein the heterologous polynucleotide of (b) is expressed in functional form; wherein the glucose consumption of the recombinant host cell is greater than the glucose consumption of a host cell comprising (a) but not (b).
28. A method of increasing the formation of a product of a pyruvate-utilizing biosynthetic pathway comprising: (i) providing the recombinant host cell of claim 19, or combinations thereof; and (ii) growing the recombinant host cell under conditions wherein the product of the pyruvate-utilizing pathway is formed; wherein the amount of product formed by the recombinant host cell is greater than the amount of product formed by a host cell comprising (a) but not (b).
29. The method of claim 28 wherein the product is butanol.
30. The butanol of claim 29, wherein said butanol is isobutanol.
31. (canceled)
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. application Ser. No. 12/980,607, filed Dec. 29, 2010, which claims the benefit of priority of U.S. Provisional Application No. 61/290,639, filed Dec. 29, 2009. The entirety of each is incorporated herein by reference.
REFERENCE TO SEQUENCE LISTING SUBMITTED ELECTRONICALLY
[0002] The content of the electronically submitted sequence listing (Name: 20140123_CL4894USCNT_SQL_ascii.txt; Size: 422,596 bytes; and Date of Creation: Jan. 23, 2014), filed herewith, is incorporated herein by reference in its entirety.
FIELD OF THE INVENTION
[0003] The invention relates generally to the field of industrial microbiology and alcohol production. More specifically, the invention relates to a recombinant host cell comprising (a) a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity in said recombinant host cell; (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and optionally (c) a modification in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity. Additionally, the invention relates to methods of making and using such a recombinant host cell including, for example, methods of increasing glucose consumption, methods of enhancing redox balance, and methods of increasing the production of a product of a pyruvate-utilizing pathway.
BACKGROUND OF THE INVENTION
[0004] Global demand for liquid transportation fuel is projected to strain the ability to meet certain environmentally driven goals, for example, the conservation of oil reserves and limitation of greenhouse gas emissions. Such demand has driven the development of technology which allows utilization of renewable resources to mitigate the depletion of oil reserves and to minimize greenhouse gas emissions.
[0005] Butanol is an important industrial chemical, useful as a fuel additive, as a feedstock chemical in the plastics industry, and as a food grade extractant in the food and flavor industry. Each year 10 to 12 billion pounds of butanol are produced by petrochemical means and the need for this commodity chemical will likely increase in the future.
[0006] Methods for the chemical synthesis of isobutanol, an isomer of butanol, are known, such as oxo synthesis, catalytic hydrogenation of carbon monoxide (Ullmann's Encyclopedia of Industrial Chemistry, 6th edition, 2003, Wiley-VCH Verlag GmbH and Co., Weinheim, Germany, Vol. 5, pp. 716-719) and Guerbet condensation of methanol with n-propanol (Carlini et al., J. Molec. Catal. A: Chem. 220:215-220, 2004). These processes use starting materials derived from petrochemicals, are generally expensive, and are not environmentally friendly. The production of isobutanol from plant-derived raw materials would minimize greenhouse gas emissions and would represent an advance in the art.
[0007] 2-Butanone, also referred to as methyl ethyl ketone (MEK), is a widely used solvent and is the most important commercially produced ketone, after acetone. It is used as a solvent for paints, resins, and adhesives, as well as a selective extractant and activator of oxidative reactions, and it can be chemically converted to 2-butanol by reacting with hydrogen in the presence of a catalyst (Nystrom, R. F. and Brown, W. G., J. Am. Chem. Soc. (1947) 69:1198). 2,3-Butanediol can be used in the chemical synthesis of butene and butadiene, important industrial chemicals currently obtained from cracked petroleum, and esters of 2,3-butanediol can be used as plasticizers (Voloch et al., "Fermentation Derived 2,3-Butanediol," in Comprehensive Biotechnology, Pergamon Press Ltd., England, Vol. 2, Section 3:933-947 (1986)).
[0008] Microorganisms can be engineered for the expression of biosynthetic pathways that utilize pyruvate to produce, for example, 2,3-butanediol, 2-butanone, 2-butanol and isobutanol. U.S. Patent Application Publication No. US 2007/0092957 A1 discloses the engineering of recombinant microorganisms for production of isobutanol. U.S. Patent Application Publication Nos. US 2007/0259410 A1 and US 2007/0292927 A1 disclose the engineering of recombinant microorganisms for production of 2-butanone or 2-butanol. Multiple pathways are disclosed for biosynthesis of isobutanol and 2-butanol, all of which initiate with cellular pyruvate. Butanediol is an intermediate in the 2-butanol pathway disclosed in U.S. Patent Application Publication No. US 2007/0292927 A1.
[0009] Engineering recombinant host cells for increased availability of pyruvate and/or for reduced glucose repression allows for increased formation of the products of pyruvate-utilizing biosynthetic pathways. For example, reducing glucose repression has been used to improve the respiratory capacity of yeast and to increase biomass production. Also, International Publication No. WO 1998/26079 A1 discloses that overexpression of the Hap1 transcription factor to reduce glucose repression results in increased respiratory capacity and increased biomass production. European Patent No. 1728854 discloses a process for biomass production using yeast overexpressing the Hap1 transcription factor grown in aerobic conditions.
[0010] Functional deletion of the hexokinase 2 gene has been used to reduce glucose repression and to increase the availability of pyruvate for utilization in biosynthetic pathways. For example, International Publication No. WO 2000/061722 A1 discloses the production of yeast biomass by aerobically growing yeast having one or more functionally deleted hexokinase 2 genes or analogs. In addition, Rossell et al. (Yeast Research 8:155-164 (2008)) found that Saccharomyces cerevisiae with a deletion of the hexokinase 2 gene showed 75% reduction in fermentative capacity, defined as the specific rate of carbon dioxide production under sugar-excess and anaerobic conditions. After starvation, the fermentation capacity was similar to that of a strain without the hexokinase 2 gene deletion. Diderich et al. (Applied and Environmental Microbiology 67:1587-1593 (2001)) found that S. cerevisiae with a deletion of the hexokinase 2 gene had lower pyruvate decarboxylase activity.
[0011] Functional deletion of the pyruvate decarboxylase gene has also been used to increase the availability of pyruvate for utilization in biosynthetic pathways. For example, U.S. Application Publication No. US 2007/0031950 A1 discloses a yeast strain with a disruption of one or more pyruvate decarboxylase genes and expression of a D-lactate dehydrogenase gene, which is used for production of D-lactic acid. U.S. Application Publication No. US 2005/0059136 A1 discloses glucose tolerant two carbon source independent (GCSI) yeast strains with no pyruvate decarboxylase activity, which may have an exogenous lactate dehydrogenase gene. Nevoigt and Stahl (Yeast 12:1331-1337 (1996)) describe the impact of reduced pyruvate decarboxylase and increased NAD-dependent glycerol-3-phosphate dehydrogenase in Saccharomyces cerevisiae on glycerol yield. U.S. patent application Ser. No. 12/477,942 discloses increased conversion of pyruvate to acetolactate by engineering yeast for expression of a cytosol-localized acetolactate synthase and substantial elimination of pyruvate decarboxylase activity.
[0012] There remains a need to improve redox balance, glucose consumption and/or product formation of a pyruvate-utilizing biosynthetic pathway in recombinant host cells comprising a functional deletion of genes encoding dual-role hexokinases such as the hexokinase 2 gene.
BRIEF SUMMARY OF THE INVENTION
[0013] Provided herein are recombinant yeast cells comprising: (a) a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity in the host cell wherein the activity of the polypeptide of (a) is reduced or substantially eliminated; and (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity. In embodiments, the recombinant yeast cells have increased glucose consumption rates as compared to yeast cells with (a) but not (b). In embodiments, the modification of (a) is a deletion. In embodiments, the recombinant yeast cells have altered glucose repression as compared to yeast cells with (a) but not (b). In embodiments, the recombinant yeast cell further comprises (c) a modification in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity. In embodiments, pyruvate decarboxylase activity is reduced or substantially eliminated. In embodiments, the polypeptide of (a) is HXK2, and the recombinant yeast cell is S. cerevisiae. In embodiments, the polypeptide of (a) is RAG5, and the recombinant host cell is K. lactis; or the polypeptide of (a) is HPGLK1, and the recombinant host cell is H. polymorpha; or the polypeptide of (a) is HXK2, and the recombinant host cell is S. pombe. In another aspect of the invention, a polynucleotide or polypeptide of (b) corresponds to Enzyme Commission Number EC 2.7.1.1 and/or corresponds to Enzyme Commission EC 2.7.1.2. In embodiments, the polynucleotide of (b) contains a promoter such that the polypeptide of (b) is conditionally expressed. In embodiments, the conditional promoter comprises a sequence derived from the OLE1 promoter region. In embodiments, the polynucleotide of (b) contains a promoter such that the polypeptide of (b) is constitutively expressed. 
In embodiments, the heterologous polynucleotide of (b) comprises the polypeptide of (a) with a deletion of a protein interaction domain that prevents function as a transcriptional regulator. In embodiments, the heterologous polynucleotide of (b) encodes a polypeptide that has at least about 85% identity to SEQ ID NO: 2, 115, 117, 119, 4, 6, 8, 121, or 123. In embodiments, the heterologous polynucleotide of (b) comprises i) a promoter region derived from the S. cerevisiae ADH1 promoter region or ii) a promoter region having at least about 85% identity to SEQ ID NO: 131. In embodiments, the yeast cell is S. cerevisiae and the heterologous polynucleotide of (b) encodes a polypeptide of SEQ ID NO: 4, 6, 8, 121, or 123 or the heterologous polynucleotide of (b) encodes a polypeptide that has at least about 85% identity to SEQ ID NO: 4, 6, 8, 121, or 123. In embodiments, the heterologous polynucleotide of (b) encodes a polypeptide that has at least about 85% identity to SEQ ID NO: 130. In embodiments, the heterologous polynucleotide of (b) comprises a conditional promoter and encodes a polypeptide having at least 85% identity to SEQ ID NO: 4 or SEQ ID NO: 2.
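By way of illustration only, the identity thresholds recited above can be thought of as a comparison over an alignment of two sequences. The following is a minimal sketch, not the method used by the applicants: it assumes the two sequences have already been aligned (gaps written as "-"), and it assumes one common convention in which identity is the number of matching columns divided by the number of aligned columns. The function names and the toy peptide strings are hypothetical and are not taken from the sequence listing, and the qualifier "about" is not modeled.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences (gaps written as '-').

    Assumed convention: matching columns / aligned columns, where columns
    that are gaps in both sequences are ignored. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep every column except those that are gaps in both sequences.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column matches only when both residues are identical and not a gap.
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """True when the percent identity is at least the stated threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 20-residue toy peptides (not from the sequence listing).
reference = "MVHLGPKKPQARKGSMADVP"
variant = "MVHLGPKKPQARKGSMADAA"  # differs from the reference at the last two positions

print(percent_identity(reference, variant))
print(meets_threshold(reference, variant))
```

In this sketch the denominator counts gapped columns as mismatches; a different denominator (for example, the length of the shorter sequence) would give a different value for gapped alignments, so the convention should be stated whenever a threshold is applied.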
[0014] One aspect of the invention relates to a recombinant host cell disclosed herein that expresses a pyruvate-utilizing biosynthetic pathway. In another aspect of the invention, such a pyruvate-utilizing biosynthetic pathway comprises a heterologous polynucleotide. In another aspect of the invention, such a pyruvate-utilizing biosynthetic pathway forms a product selected from 2,3-butanediol, isobutanol, 2-butanol, 1-butanol, 2-butanone, valine, leucine, lactic acid, malate, isoamyl alcohol, and isoprenoids. In another aspect of the invention, such a pyruvate-utilizing biosynthetic pathway is an isobutanol biosynthetic pathway comprising a polynucleotide encoding a polypeptide that catalyzes a substrate to product conversion of (i) pyruvate to acetolactate; (ii) acetolactate to 2,3-dihydroxyisovalerate; (iii) 2,3-dihydroxyisovalerate to 2-ketoisovalerate; (iv) 2-ketoisovalerate to isobutyraldehyde; and/or (v) isobutyraldehyde to isobutanol. In another aspect of the invention, such a pyruvate-utilizing biosynthetic pathway is a 2-butanone biosynthetic pathway comprising a polynucleotide encoding a polypeptide that catalyzes a substrate to product conversion of (i) pyruvate to acetolactate; (ii) acetolactate to acetoin; (iii) acetoin to 2,3-butanediol; and/or (iv) 2,3-butanediol to 2-butanone. In another aspect of the invention, such a pyruvate-utilizing biosynthetic pathway is a 2-butanol biosynthetic pathway comprising a polynucleotide encoding a polypeptide that catalyzes a substrate to product conversion of (i) pyruvate to acetolactate; (ii) acetolactate to acetoin; (iii) acetoin to 2,3-butanediol; (iv) 2,3-butanediol to 2-butanone; and/or (v) 2-butanone to 2-butanol. 
In another aspect of the invention, such a pyruvate-utilizing biosynthetic pathway is a 1-butanol biosynthetic pathway comprising a polynucleotide encoding a polypeptide that catalyzes a substrate to product conversion of (i) acetyl-CoA to acetoacetyl-CoA; (ii) acetoacetyl-CoA to 3-hydroxybutyryl-CoA; (iii) 3-hydroxybutyryl-CoA to crotonyl-CoA; (iv) crotonyl-CoA to butyryl-CoA; (v) butyryl-CoA to butyraldehyde; and/or (vi) butyraldehyde to 1-butanol.
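For reference, the substrate to product conversions recited above can be written out as ordered lists. The following sketch is illustrative only; the dictionary name and helper functions are hypothetical. It encodes each recited conversion as a (substrate, product) pair and checks that each step's product is the substrate of the next step. Because the conversions are recited with "and/or", a given host cell need not carry a heterologous polynucleotide for every step; the lists describe the conversions, not a required set of genes.

```python
# Substrate-to-product conversions as recited in the text, in order.
PATHWAYS = {
    "isobutanol": [
        ("pyruvate", "acetolactate"),
        ("acetolactate", "2,3-dihydroxyisovalerate"),
        ("2,3-dihydroxyisovalerate", "2-ketoisovalerate"),
        ("2-ketoisovalerate", "isobutyraldehyde"),
        ("isobutyraldehyde", "isobutanol"),
    ],
    "2-butanone": [
        ("pyruvate", "acetolactate"),
        ("acetolactate", "acetoin"),
        ("acetoin", "2,3-butanediol"),
        ("2,3-butanediol", "2-butanone"),
    ],
    "2-butanol": [
        ("pyruvate", "acetolactate"),
        ("acetolactate", "acetoin"),
        ("acetoin", "2,3-butanediol"),
        ("2,3-butanediol", "2-butanone"),
        ("2-butanone", "2-butanol"),
    ],
    "1-butanol": [
        ("acetyl-CoA", "acetoacetyl-CoA"),
        ("acetoacetyl-CoA", "3-hydroxybutyryl-CoA"),
        ("3-hydroxybutyryl-CoA", "crotonyl-CoA"),
        ("crotonyl-CoA", "butyryl-CoA"),
        ("butyryl-CoA", "butyraldehyde"),
        ("butyraldehyde", "1-butanol"),
    ],
}


def is_contiguous(steps):
    """True when each step's product is the next step's substrate."""
    return all(first[1] == second[0] for first, second in zip(steps, steps[1:]))


def metabolites(name):
    """Ordered metabolites of a pathway, from first substrate to final product."""
    steps = PATHWAYS[name]
    return [steps[0][0]] + [product for _, product in steps]


for name in PATHWAYS:
    print(name, "->".join(metabolites(name)), is_contiguous(PATHWAYS[name]))
```

Written this way, it is apparent that the 2-butanol conversions are the 2-butanone conversions followed by one further step, and that the 1-butanol conversions as recited begin at acetyl-CoA rather than at pyruvate.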
[0015] One aspect of the invention relates to methods for the production of a product selected from 2,3-butanediol, isobutanol, 2-butanol, 1-butanol, 2-butanone, valine, leucine, lactic acid, malic acid, isoamyl alcohol, and isoprenoids comprising (a) growing a recombinant host cell disclosed herein under conditions wherein a product is produced; and (b) optionally recovering the product. In another aspect of the invention, such methods comprise an isobutanol biosynthetic pathway comprising a polynucleotide encoding a polypeptide that catalyzes a substrate to product conversion of (i) pyruvate to acetolactate; (ii) acetolactate to 2,3-dihydroxyisovalerate; (iii) 2,3-dihydroxyisovalerate to 2-ketoisovalerate; (iv) 2-ketoisovalerate to isobutyraldehyde; and/or (v) isobutyraldehyde to isobutanol. In another aspect of the invention, such methods comprise a 2-butanone biosynthetic pathway comprising a polynucleotide encoding a polypeptide that catalyzes a substrate to product conversion of (i) pyruvate to acetolactate; (ii) acetolactate to acetoin; (iii) acetoin to 2,3-butanediol; and/or (iv) 2,3-butanediol to 2-butanone. In another aspect of the invention, such methods comprise a 2-butanol biosynthetic pathway comprising a polynucleotide encoding a polypeptide that catalyzes a substrate to product conversion of (i) pyruvate to acetolactate; (ii) acetolactate to acetoin; (iii) acetoin to 2,3-butanediol; (iv) 2,3-butanediol to 2-butanone; and/or (v) 2-butanone to 2-butanol. In another aspect of the invention, such methods comprise a 1-butanol biosynthetic pathway comprising a polynucleotide encoding a polypeptide that catalyzes a substrate to product conversion of (i) acetyl-CoA to acetoacetyl-CoA; (ii) acetoacetyl-CoA to 3-hydroxybutyryl-CoA; (iii) 3-hydroxybutyryl-CoA to crotonyl-CoA; (iv) crotonyl-CoA to butyryl-CoA; (v) butyryl-CoA to butyraldehyde; and/or (vi) butyraldehyde to 1-butanol.
[0016] One aspect of the invention relates to methods of producing a recombinant host cell comprising (i) providing a recombinant host cell comprising a deletion, mutation, and/or substitution in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; and (ii) transforming a recombinant host cell of (i) with a heterologous polynucleotide encoding a polypeptide having hexose kinase activity. In another aspect of the invention, such methods further comprise (iii) introducing a deletion, mutation, and/or substitution in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity.
[0017] One aspect of the invention relates to methods of increasing glucose consumption of a recombinant host cell comprising (i) providing a recombinant host cell comprising (a) a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; and (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and (ii) growing the recombinant host cell of (i) under conditions wherein the heterologous polynucleotide of (b) is expressed in functional form. In another aspect of the invention, the glucose consumption of such a recombinant host cell is greater than the glucose consumption of a recombinant host cell comprising (a) but not (b).
[0018] One aspect of the invention relates to methods of increasing glucose consumption of a recombinant host cell comprising (i) providing a recombinant host cell comprising (a) a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and (c) a modification in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity; and (ii) growing the recombinant host cell of (i) under conditions wherein the heterologous polynucleotide of (b) is expressed in functional form. In another aspect of the invention, the glucose consumption of such a recombinant host cell is greater than the glucose consumption of a recombinant host cell comprising (a) and (c) but not (b).
[0019] One aspect of the invention relates to methods of improving the redox balance of a recombinant host cell comprising (i) providing a recombinant host cell comprising (a) a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; and (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and (ii) growing the recombinant host cell of (i) under conditions wherein the heterologous polynucleotide of (b) is expressed in functional form. In another aspect of the invention, the redox balance of such a recombinant host cell is improved compared to the redox balance of a recombinant host cell comprising (a) but not (b).
[0020] One aspect of the invention relates to methods of improving the redox balance of a recombinant host cell comprising (i) providing a recombinant host cell comprising (a) a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and (c) a modification in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity; and (ii) growing the recombinant host cell of (i) under conditions wherein the heterologous polynucleotide of (b) is expressed in functional form. In another aspect of the invention, the redox balance of such a recombinant host cell is improved compared to the redox balance of a recombinant host cell comprising (a) and (c) but not (b).
[0021] One aspect of the invention relates to methods of increasing the formation of a product of a pyruvate-utilizing biosynthetic pathway comprising (i) providing a recombinant host cell comprising (a) a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; and (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and (ii) growing the recombinant host cell of (i) under conditions wherein the product of the pyruvate-utilizing pathway is formed. In another aspect of the invention, the amount of product formed by such a recombinant host cell is greater than the amount of product formed by a recombinant host cell comprising (a) but not (b). In another aspect of the invention, the product is isobutanol, 2-butanol, or 1-butanol.
[0022] One aspect of the invention relates to methods of increasing the formation of a product of a pyruvate-utilizing biosynthetic pathway comprising (i) providing a recombinant host cell comprising (a) a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and (c) a modification in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity; and (ii) growing the recombinant host cell of (i) under conditions wherein the product of the pyruvate-utilizing pathway is formed. In another aspect of the invention, the amount of product formed by such a recombinant host cell is greater than the amount of product formed by a recombinant host cell comprising (a) and (c) but not (b). In another aspect of the invention, the product is isobutanol, 2-butanol, or 1-butanol.
REFERENCE TO A SEQUENCE LISTING SUBMITTED AS A TEXT FILE VIA EFS-WEB
[0023] The sequences in the accompanying sequence listing, filed electronically herewith and incorporated herein by reference, conform with 37 C.F.R. 1.821-1.825 ("Requirements for patent applications Containing Nucleotide Sequences and/or Amino Acid Sequence Disclosures--the Sequence Rules") and are consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (2009) and the sequence listing requirements of the EPO and PCT (Rules 5.2 and 49.5(a-bis), and Section 208 and Annex C of the Administrative Instructions). The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.
[0024] SEQ ID NOs: 1-2 and 114-119 are example dual-role hexokinases in Saccharomyces cerevisiae, described in Table 3.
[0025] SEQ ID NOs: 3-8 and 120-123 are example hexose kinase coding regions and proteins, described in Table 4.
[0026] SEQ ID NOs: 9-28 are pyruvate decarboxylase sequences described in Table 5.
[0027] SEQ ID NO: 30 is a sequence derived from the CUP1 promoter region.
[0028] SEQ ID NOs: 31 and 32 are B. subtilis acetolactate synthase coding region and protein sequences.
[0029] SEQ ID NOs: 33-36 are sequences derived from the CYC1 terminator region, ILV5 promoter region, ILV5 terminator region, and FBA1 promoter region, respectively.
[0030] SEQ ID NOs: 37 and 38 are the Pf5.IlvC-Z4B8 coding region and protein sequences.
[0031] SEQ ID NOs: 39 and 40 are the ILV5 coding region and protein sequences.
[0032] SEQ ID NOs: 41 and 42 are the Pf5.IlvC-JEA1 coding region and protein sequences.
[0033] SEQ ID NOs: 44 and 47 are the L. lactis kivD coding region sequence codon optimized for S. cerevisiae and the encoded protein, respectively.
[0034] SEQ ID NOs: 45 and 46 are the horse liver ADH coding region sequence codon optimized for S. cerevisiae and the encoded protein, respectively.
[0035] SEQ ID NOs: 49, 53, and 54 are sequences derived from the TDH3 promoter region, GPM1 promoter region, and ADH1 terminator region, respectively.
[0036] SEQ ID NOs: 55 and 56 are the sadB coding region and protein sequences, respectively.
[0037] SEQ ID NOs: 60 and 61 are FBA terminator region derived and CYC1 terminator region derived sequences.
[0038] SEQ ID NOs: 62 and 63 are the ilvD coding region and protein sequences, respectively.
[0039] SEQ ID NOs: 124 and 125 are the nucleic acid and amino acid sequences of KIGlk1 from K. lactis.
[0040] SEQ ID NOs: 126 and 127 are the nucleic acid and amino acid sequences of HPHXK1 from Hansenula polymorpha.
[0041] SEQ ID NO: 131 is an ADH1 promoter region derived sequence.
[0042] SEQ ID NOs: 140 and 141 are SNO1 and SNZ1 promoter region derived sequences, respectively.
[0043] SEQ ID NOs: 50-51, 57-58, 66-75, 77-80, 82-100, 104-105, 107-109, 112-113, 129, and 133-138 are primers used in the Examples.
[0044] The following correspond to synthetic constructs:
[0045] SEQ ID NO: 29 is the sequence of pLH475-Z4B8 plasmid.
[0046] SEQ ID NO: 43 is the sequence of the pLH468 plasmid.
[0047] SEQ ID NO: 48 is the sequence of vector pNY8.
[0048] SEQ ID NO: 52 is the sequence of vector pRS425::GPM-sadB.
[0049] SEQ ID NO: 59 is the sequence of pRS423 FBA ilvD(Strep).
[0050] SEQ ID NO: 64 is the GPM-sadB-ADHt segment sequence.
[0051] SEQ ID NO: 65 is the pUC19-URA3r sequence.
[0052] SEQ ID NO: 76 is the pdc1::PPDC1-ilvD-FBA1t-URA3r integration cassette sequence.
[0053] SEQ ID NO: 81 is the sequence of his3::URA3r2 cassette.
[0054] SEQ ID NO: 102 is the sequence of pUC19::loxP-URA3-loxP.
[0055] SEQ ID NO: 103 is the sequence of pLA25.
[0056] SEQ ID NO: 106 is the sequence of pLA31.
[0057] SEQ ID NO: 110 is the sequence of pRS423::PGAL1-cre.
[0058] SEQ ID NO: 111 is the sequence of pLA32.
[0059] SEQ ID NO: 128 is the pLH475-JEA1 plasmid.
[0060] SEQ ID NO: 130 is the HXK2(ΔLys6-Met15) sequence.
[0061] SEQ ID NO: 132 is a codon-optimized sequence encoding HXK2 with an internal deletion of the Lys6-Met15 region with ADH1 terminator region derived sequence.
[0062] SEQ ID NO: 139 is the sequence of pUC19::loxP-URA3-loxP-HXK2(Lys6-Met15)-ADH1t.
[0063] SEQ ID NO: 142 is the sequence of pLH467.
[0064] SEQ ID NO: 143 is the sequence of pLH435.
[0065] SEQ ID NO: 144 is the sequence of pLH441.
BRIEF DESCRIPTION OF THE DRAWINGS
[0066] The various embodiments of the invention can be more fully understood from the following detailed description, the figures, and the accompanying sequence descriptions, which form a part of this application.
[0067] FIG. 1 depicts the growth (FIG. 1A) and isobutanol production (FIG. 1B) of a hexokinase 2 deletion yeast strain (NYLA84 [pLH468/pLH475-Z4B8]) as compared to a yeast strain without hexokinase 2 deletion (NYLA74 [pLH468/pLH475-Z4B8]), as described in Example 3.
[0068] FIG. 2 depicts a comparison of growth and isobutanol production for a strain with a hexokinase 2 deletion (NYLA84[pLH468/pLH475-Z4B8]; FIG. 2B) and a strain without hexokinase 2 deletion (NYLA74 [pLH468/pLH475-Z4B8]; FIG. 2A).
[0069] FIG. 3 depicts the specific productivity of a strain with a hexokinase 2 deletion (NYLA84 [pLH468/pLH475-Z4B8]) and a strain without hexokinase 2 deletion (NYLA74 [pLH468/pLH475-Z4B8]) measured in grams of isobutanol produced per gram of cells over time.
[0070] The invention can be more fully understood from the following detailed description and the accompanying sequence descriptions which form a part of this application.
DETAILED DESCRIPTION OF THE INVENTION
[0071] This invention addresses the need for improved processes for the conversion of plant-derived raw materials to a product stream useful as a liquid transportation fuel. Such processes would satisfy both fuel demands and environmental concerns. Applicants have provided a means to improve redox balance, glucose consumption, and/or product formation of a pyruvate-utilizing biosynthetic pathway in a recombinant host cell comprising a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity by introducing a heterologous polynucleotide encoding a polypeptide having hexose kinase activity. Such cells exhibit improved redox balance, increased glucose consumption, and/or increased product formation of a pyruvate-utilizing biosynthetic pathway compared to a recombinant host cell comprising a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity without the introduction of a heterologous polynucleotide encoding a polypeptide having hexose kinase activity. Applicants have also provided methods of making and using such a recombinant host cell including, for example, methods of improving redox balance, methods of increasing glucose consumption, and methods of increasing the production of a product of a pyruvate-utilizing biosynthetic pathway.
[0072] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In case of conflict, the present application including the definitions will control. Unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular. All publications, patents and other references mentioned herein are incorporated by reference in their entireties for all purposes as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference, unless only specific sections of patents or patent publications are indicated to be incorporated by reference.
[0073] Although methods and materials similar or equivalent to those disclosed herein can be used in practice or testing of the present invention, suitable methods and materials are disclosed below. The materials, methods and examples are illustrative only and are not intended to be limiting. Other features and advantages of the invention will be apparent from the detailed description and from the claims.
[0074] In order to further define this invention, the following terms, abbreviations and definitions are provided.
[0075] As used herein, the terms "comprises," "comprising," "includes," "including," "has," "having," "contains," or "containing," or any other variation thereof, are intended to be non-exclusive or open-ended. For example, a composition, a mixture, a process, a method, an article, or an apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such composition, mixture, process, method, article, or apparatus. Further, unless expressly stated to the contrary, "or" refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
[0076] Also, the indefinite articles "a" and "an" preceding an element or component of the invention are intended to be nonrestrictive regarding the number of instances, i.e., occurrences of the element or component. Therefore "a" or "an" should be read to include one or at least one, and the singular word form of the element or component also includes the plural unless the number is obviously meant to be singular.
[0077] The term "invention" or "present invention" as used herein is a non-limiting term and is not intended to refer to any single embodiment of the particular invention but encompasses all possible embodiments as disclosed in the application.
[0078] As used herein, the term "about" modifying the quantity of an ingredient or reactant of the invention employed refers to variation in the numerical quantity that can occur, for example, through typical measuring and liquid handling procedures used for making concentrates or use solutions in the real world; through inadvertent error in these procedures; through differences in the manufacture, source, or purity of the ingredients employed to make the compositions or to carry out the methods; and the like. The term "about" also encompasses amounts that differ due to different equilibrium conditions for a composition resulting from a particular initial mixture. Whether or not modified by the term "about", the claims include equivalents to the quantities. In one embodiment, the term "about" means within 10% of the reported numerical value, preferably within 5% of the reported numerical value.
[0079] The term "butanol" as used herein, refers to 2-butanol, 1-butanol, isobutanol, or mixtures thereof.
[0080] The term "pyruvate-utilizing biosynthetic pathway" refers to an enzyme pathway to produce a biosynthetic product from pyruvate.
[0081] The term "isobutanol biosynthetic pathway" refers to an enzyme pathway to produce isobutanol from pyruvate.
[0082] The term "2-butanone biosynthetic pathway" refers to an enzyme pathway to produce 2-butanone from pyruvate.
[0083] The term "2-butanol biosynthetic pathway" refers to an enzyme pathway to produce 2-butanol from pyruvate.
[0084] The term "1-butanol biosynthetic pathway" refers to an enzyme pathway to produce 1-butanol from pyruvate.
[0085] The terms "hxk2 mutant," "HXK2 knockout," or "HXK2-KO" as used herein refer to a S. cerevisiae host cell that has a genetic modification to inactivate or reduce expression of a gene encoding hexokinase 2 so that the cell substantially or completely lacks hexokinase 2 enzyme activity.
[0086] The terms "pdc mutant," "PDC knockout," or "PDC-KO" as used herein refer to a cell that has a genetic modification to inactivate or reduce expression of a gene encoding pyruvate decarboxylase (Pdc) so that the cell substantially or completely lacks pyruvate decarboxylase enzyme activity. If the cell has more than one expressed (active) PDC gene, then each of the active PDC genes may be inactivated or have minimal expression.
[0087] The term "carbon substrate" refers to a carbon source capable of being metabolized by the recombinant host cells disclosed herein. Non-limiting examples of carbon substrates are provided herein and include, but are not limited to, monosaccharides including, but not limited to, glucose, fructose, xylose, and arabinose; oligosaccharides including, but not limited to, sucrose and maltose; polysaccharides; and non-carbohydrate carbon sources including, but not limited to, ethanol, lactate, succinate, glycerol, carbon dioxide, methanol, or mixtures thereof.
[0088] The term "polynucleotide" is intended to encompass a singular nucleic acid as well as plural nucleic acids, and refers to a nucleic acid molecule or construct, e.g., messenger RNA (mRNA) or plasmid DNA (pDNA). A polynucleotide can contain the nucleotide sequence of the full-length cDNA sequence, or a fragment thereof, including the untranslated 5' and 3' sequences and the coding sequences. The polynucleotide can be composed of any polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA or modified RNA or DNA. For example, polynucleotides can be composed of single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and double-stranded RNA, RNA that is a mixture of single- and double-stranded regions, and hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or a mixture of single- and double-stranded regions. "Polynucleotide" embraces chemically, enzymatically, or metabolically modified forms.
[0089] A polynucleotide sequence may be referred to as "isolated," in which it has been removed from its native environment. For example, a heterologous polynucleotide encoding a polypeptide or polypeptide fragment contained in a vector is considered isolated for the purposes of the present invention. Further examples of an isolated polynucleotide include recombinant polynucleotides maintained in heterologous host cells or purified (partially or substantially) polynucleotides in solution. Isolated polynucleotides or nucleic acids according to the present invention further include such molecules produced synthetically. An isolated polynucleotide fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.
[0090] The term "gene" refers to a nucleic acid fragment that is capable of being expressed as a specific protein, optionally including regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence.
[0091] As used herein the term "coding region" refers to a DNA sequence that codes for a specific amino acid sequence. "Suitable regulatory sequences" refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, polyadenylation recognition sequences, RNA processing sites, effector binding sites and stem-loop structures.
[0092] As used herein, the term "polypeptide" is intended to encompass a singular "polypeptide" as well as plural "polypeptides," and refers to a molecule composed of monomers (amino acids) linearly linked by amide bonds (also known as peptide bonds). The term "polypeptide" refers to any chain or chains of two or more amino acids, and does not refer to a specific length of the product. Thus, peptides, dipeptides, tripeptides, oligopeptides, "protein," "amino acid chain," or any other term used to refer to a chain or chains of two or more amino acids, are included within the definition of "polypeptide," and the term "polypeptide" may be used instead of, or interchangeably with any of these terms. A polypeptide may be derived from a natural biological source or produced by recombinant technology, but is not necessarily translated from a designated nucleic acid sequence. It may be generated in any manner, including by chemical synthesis.
[0093] As used herein, "hexose kinase activity" refers to the activity of any polypeptide having a biological function of a hexose kinase, including the examples provided herein. Such polypeptides include glucokinases and hexokinases. Such polypeptides also include a polypeptide that catalyzes the conversion of hexose to hexose-6-phosphate, the conversion of D-glucose to D-glucose 6-phosphate, D-fructose to D-fructose 6-phosphate, and D-mannose to D-mannose 6-phosphate. Such polypeptides also include a polypeptide that corresponds to Enzyme Commission Number EC 2.7.1.1 or to Enzyme Commission Number EC 2.7.1.2. Such polypeptides can be determined by methods well known in the art and disclosed herein.
[0094] As used herein, "hexokinase 2 activity" refers to the activity of any polypeptide having a biological function of a Saccharomyces cerevisiae hexokinase 2 enzyme, including the examples provided herein. Such polypeptides include a polypeptide that catalyzes the conversion of hexose to hexose-6-phosphate, the conversion of D-glucose to D-glucose 6-phosphate, D-fructose to D-fructose 6-phosphate, and D-mannose to D-mannose 6-phosphate. Such polypeptides also include a polypeptide that corresponds to Enzyme Commission Number EC 2.7.1.1. Such polypeptides can be determined by methods well known in the art.
[0095] As used herein, "dual-role hexokinase activity" refers to the activity of any polypeptide having a biological function of a hexose kinase enzyme and exerting a glucose repression phenotype in the cell in which it is expressed. Such polypeptides include a polypeptide that catalyzes the conversion of hexose to hexose-6-phosphate, the conversion of D-glucose to D-glucose 6-phosphate, D-fructose to D-fructose 6-phosphate, and D-mannose to D-mannose 6-phosphate. The second role that a hexose kinase may have is regulatory: A hexokinase is dual-role in a yeast host if it functions to exert glucose repression on glucose-repressible genes. This may be demonstrated by relief from glucose repression in a strain with a mutation in the gene encoding that hexokinase. The dual-role is specific to a particular host cell, thus, a hexose kinase having both hexose kinase activity and glucose repression activity in one species may not express the glucose repression function in another. Hexose kinases including dual-function hexokinases are known in the art.
[0096] As used herein, "pyruvate decarboxylase activity" refers to the activity of any polypeptide having a biological function of a pyruvate decarboxylase enzyme, including the examples provided herein. Such polypeptides include a polypeptide that catalyzes the conversion of pyruvate to acetaldehyde. Such polypeptides also include a polypeptide that corresponds to Enzyme Commission Number EC 4.1.1.1. Such polypeptides can be determined by methods well known in the art and disclosed herein.
[0097] As used herein, "reduced activity" refers to any measurable decrease in a known biological activity of a polypeptide when compared to the same biological activity of the polypeptide prior to the change resulting in the reduced activity. Such a change can include a modification of a polypeptide or a polynucleotide encoding a polypeptide as described herein. A reduced activity of a polypeptide disclosed herein can be determined by methods well known in the art and disclosed herein.
[0098] As used herein, "substantially eliminated activity" refers to measurable decrease in a known biological activity of a polypeptide that results in nearly complete abolishment of the activity when compared to the same biological activity of the polypeptide prior to the change resulting in the substantially eliminated activity. Such a change can include a modification of a polypeptide or a polynucleotide encoding a polypeptide as described herein. A substantially eliminated activity of a polypeptide disclosed herein can be determined by methods well known in the art and disclosed herein.
[0099] As used herein, "eliminated activity" refers to the complete abolishment of a known biological activity of a polypeptide when compared to the same biological activity of the polypeptide prior to the change resulting in the eliminated activity. Such a change can include a modification of a polypeptide or a polynucleotide encoding a polypeptide as described herein. An eliminated activity includes a biological activity of a polypeptide that is not measurable when compared to the same biological activity of the polypeptide prior to the change resulting in the eliminated activity. An eliminated activity of a polypeptide disclosed herein can be determined by methods well known in the art and disclosed herein.
[0100] By an "isolated" polypeptide or a fragment, variant, or derivative thereof is intended a polypeptide that is not in its natural milieu. No particular level of purification is required. For example, an isolated polypeptide can be removed from its native or natural environment. Recombinantly produced polypeptides and proteins expressed in host cells are considered isolated for purposes of the invention, as are native or recombinant polypeptides which have been separated, fractionated, or partially or substantially purified by any suitable technique.
[0101] As used herein, "native" refers to the form of a polynucleotide, gene or polypeptide as found in nature with its own regulatory sequences, if present.
[0102] As used herein, "endogenous" refers to the native form of a polynucleotide, gene or polypeptide in its natural location in the organism or in the genome of an organism. "Endogenous polynucleotide" includes a native polynucleotide in its natural location in the genome of an organism. "Endogenous gene" includes a native gene in its natural location in the genome of an organism. "Endogenous polypeptide" includes a native polypeptide in its natural location in the organism.
[0103] As used herein, "heterologous" refers to a polynucleotide, gene or polypeptide not normally found in the host organism but that is introduced into the host organism or is otherwise modified from its native state. "Heterologous polynucleotide" includes a native coding region from the host organism, or portion thereof, that is reintroduced into or is otherwise modified from the host organism in a form that is different from the corresponding native polynucleotide as well as a coding region from a different organism, or portion thereof. "Heterologous gene" includes a native coding region, or portion thereof, that is reintroduced or otherwise modified in the source organism in a form that is different from the corresponding native gene as well as a coding region from a different organism. For example, a heterologous gene may include a native coding region that is a portion of a chimeric gene including non-native regulatory regions that is reintroduced into the native host. "Heterologous polypeptide" includes a native polypeptide that is in a form that is different from the corresponding native polypeptide as well as a polypeptide from another organism. A polypeptide that is altered such that the expression pattern (such as transcriptional or translational profile or cellular localization) is different from that of the native polypeptide is considered heterologous.
[0104] As used herein, the term "modification" refers to a change in a polynucleotide or polypeptide that results in reduced, substantially eliminated or eliminated activity of a polypeptide encoded by the polynucleotide, as well as a change in a polypeptide disclosed herein that results in reduced, substantially eliminated or eliminated activity of the polypeptide. Such changes can be made by methods well known in the art, including, but not limited to, deleting, mutating (e.g., spontaneous mutagenesis, random mutagenesis, mutagenesis caused by mutator genes, or transposon mutagenesis), substituting, inserting, down-regulating, altering the cellular location, altering the state of the polynucleotide or polypeptide (e.g., methylation, phosphorylation or ubiquitination), removing a cofactor, introduction of an antisense RNA/DNA, introduction of an interfering RNA/DNA, chemical modification, covalent modification, irradiation with UV or X-rays, homologous recombination, mitotic recombination, promoter replacement methods, and/or combinations thereof. Guidance in determining which nucleotides or amino acid residues can be modified can be found by comparing the sequence of the particular polynucleotide or polypeptide with that of homologous polynucleotides or polypeptides, e.g., yeast or bacterial, and maximizing the number of modifications made in regions of high homology (conserved regions) or consensus sequences. Other modifications to polynucleotides may result in increased expression, such as in the case of biosynthetic pathways for the production of a product.
[0105] As used herein, the term "variant" refers to a polypeptide differing from a specifically recited polypeptide of the invention by amino acid insertions, deletions, mutations, and substitutions, created using, e.g., recombinant DNA techniques, such as mutagenesis. Guidance in determining which amino acid residues may be replaced, added, or deleted without abolishing activities of interest, may be found by comparing the sequence of the particular polypeptide with that of homologous polypeptides, e.g., yeast or bacterial, and minimizing the number of amino acid sequence changes made in regions of high homology (conserved regions) or by replacing amino acids with consensus sequences.
[0106] Alternatively, recombinant polynucleotide variants encoding these same or similar polypeptides may be synthesized or selected by making use of the "redundancy" in the genetic code. Various codon substitutions, such as silent changes which produce various restriction sites, may be introduced to optimize cloning into a plasmid or viral vector for expression. Mutations in the polynucleotide sequence may be reflected in the polypeptide or domains of other peptides added to the polypeptide to modify the properties of any part of the polypeptide.
[0107] Amino acid "substitutions" may be the result of replacing one amino acid with another amino acid having similar structural and/or chemical properties, i.e., conservative amino acid replacements, or they may be the result of replacing one amino acid with an amino acid having different structural and/or chemical properties, i.e., non-conservative amino acid replacements. "Conservative" amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, or the amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic acid. Alternatively, "non-conservative" amino acid substitutions may be made by selecting the differences in polarity, charge, solubility, hydrophobicity, hydrophilicity, or the amphipathic nature of any of these amino acids. "Insertions" or "deletions" may be within the range of variation as structurally or functionally tolerated by the recombinant proteins. The variation allowed may be experimentally determined by systematically making insertions, deletions, or substitutions of amino acids in a polypeptide molecule using recombinant DNA techniques and assaying the resulting recombinant variants for activity.
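By way of illustration only, the grouping of amino acids given above can be sketched in Python. The sketch treats a substitution as "conservative" when both residues fall in the same one of the four listed groups; this is one simple reading of the paragraph, and the names GROUPS, group_of and is_conservative are hypothetical and are not taken from the application.

```python
# Amino acid groups as listed in the paragraph above, in one-letter codes.
GROUPS = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def group_of(residue):
    """Return the name of the group containing a one-letter residue code."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, replacement):
    """Assumption: a substitution is conservative if both residues share a group."""
    return group_of(original) == group_of(replacement)
```

For example, is_conservative("L", "I") is true because leucine and isoleucine are both nonpolar, whereas is_conservative("D", "K") is false because aspartic acid is acidic and lysine is basic. Whether a given substitution is actually tolerated remains a matter for the activity assays described above.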
[0108] The term "promoter" refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3' to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters." It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.
[0109] The term "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of effecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.
[0110] The term "expression," as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide.
[0111] The term "overexpression," as used herein, refers to expression that is higher than endogenous expression of the same or related gene. A heterologous gene is overexpressed if its expression is higher than that of a comparable endogenous gene.
[0112] As used herein the term "transformation" refers to the transfer of a nucleic acid fragment into a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" or "recombinant" or "transformed" organisms.
[0113] The terms "plasmid" and "vector" as used herein, refer to an extrachromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3' untranslated sequence into a cell.
[0114] As used herein the term "codon degeneracy" refers to the nature in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. The skilled artisan is well aware of the "codon-bias" exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.
[0115] The term "codon-optimized" as it refers to genes or coding regions of nucleic acid molecules for transformation of various hosts, refers to the alteration of codons in the gene or coding regions of the nucleic acid molecules to reflect the typical codon usage of the host organism without altering the polypeptide encoded by the DNA. Such optimization includes replacing at least one, or more than one, or a significant number, of codons with one or more codons that are more frequently used in the genes of that organism.
[0116] Deviations in the nucleotide sequence that comprise the codons encoding the amino acids of any polypeptide chain allow for variations in the sequence coding for the gene. Since each codon consists of three nucleotides, and the nucleotides comprising DNA are restricted to four specific bases, there are 64 possible combinations of nucleotides, 61 of which encode amino acids (the remaining three codons encode signals ending translation). The "genetic code" which shows which codons encode which amino acids is reproduced herein as Table 1. As a result, many amino acids are designated by more than one codon. For example, the amino acids alanine and proline are coded for by four triplets, serine and arginine by six, whereas tryptophan and methionine are coded by just one triplet. This degeneracy allows for DNA base composition to vary over a wide range without altering the amino acid sequence of the proteins encoded by the DNA.
TABLE 1
The Standard Genetic Code

     T              C              A              G
T    TTT Phe (F)    TCT Ser (S)    TAT Tyr (Y)    TGT Cys (C)
     TTC Phe (F)    TCC Ser (S)    TAC Tyr (Y)    TGC Cys (C)
     TTA Leu (L)    TCA Ser (S)    TAA Stop       TGA Stop
     TTG Leu (L)    TCG Ser (S)    TAG Stop       TGG Trp (W)
C    CTT Leu (L)    CCT Pro (P)    CAT His (H)    CGT Arg (R)
     CTC Leu (L)    CCC Pro (P)    CAC His (H)    CGC Arg (R)
     CTA Leu (L)    CCA Pro (P)    CAA Gln (Q)    CGA Arg (R)
     CTG Leu (L)    CCG Pro (P)    CAG Gln (Q)    CGG Arg (R)
A    ATT Ile (I)    ACT Thr (T)    AAT Asn (N)    AGT Ser (S)
     ATC Ile (I)    ACC Thr (T)    AAC Asn (N)    AGC Ser (S)
     ATA Ile (I)    ACA Thr (T)    AAA Lys (K)    AGA Arg (R)
     ATG Met (M)    ACG Thr (T)    AAG Lys (K)    AGG Arg (R)
G    GTT Val (V)    GCT Ala (A)    GAT Asp (D)    GGT Gly (G)
     GTC Val (V)    GCC Ala (A)    GAC Asp (D)    GGC Gly (G)
     GTA Val (V)    GCA Ala (A)    GAA Glu (E)    GGA Gly (G)
     GTG Val (V)    GCG Ala (A)    GAG Glu (E)    GGG Gly (G)
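The codon counts stated in paragraph [0116] can be checked with a short Python sketch that builds the standard genetic code and tallies the number of codons per amino acid. The variable names are illustrative; the 64-character string follows the conventional T, C, A, G ordering of the first, second and third base.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# One-letter amino acid codes for the 64 codons in T, C, A, G order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each three-base codon (e.g. "TTT") to its amino acid.
CODON_TABLE = {
    "".join(bases): aa
    for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid, excluding stop codons.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
stop_codons = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")
```

Here degeneracy["A"] and degeneracy["P"] are each 4, degeneracy["S"] and degeneracy["R"] are each 6, and degeneracy["W"] and degeneracy["M"] are each 1, consistent with the examples given in the text.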
[0117] Many organisms display a bias for use of particular codons to code for insertion of a particular amino acid in a growing peptide chain. Codon preference, or codon bias, differences in codon usage between organisms, is afforded by degeneracy of the genetic code, and is well documented among many organisms. Codon bias often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, inter alia, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization.
[0118] Given the large number of gene sequences available for a wide variety of animal, plant and microbial species, it is possible to calculate the relative frequencies of codon usage. Codon usage tables are readily available, for example, at the "Codon Usage Database" available at http://www.kazusa.or.jp/codon/ (visited Mar. 20, 2008), and these tables can be adapted in a number of ways. See Nakamura, Y., et al. Nucl. Acids Res. 28:292 (2000). Codon usage tables for yeast, calculated from GenBank Release 128.0 [15 Feb. 2002], are reproduced below as Table 2. This table uses mRNA nomenclature, and so instead of thymine (T) which is found in DNA, the tables use uracil (U) which is found in RNA. Table 2 has been adapted so that frequencies are calculated for each amino acid, rather than for all 64 codons.
TABLE 2
Codon Usage Table for Saccharomyces cerevisiae Genes

Amino Acid   Codon   Number   Frequency per thousand
Phe          UUU     170666   26.1
Phe          UUC     120510   18.4
Leu          UUA     170884   26.2
Leu          UUG     177573   27.2
Leu          CUU      80076   12.3
Leu          CUC      35545    5.4
Leu          CUA      87619   13.4
Leu          CUG      68494   10.5
Ile          AUU     196893   30.1
Ile          AUC     112176   17.2
Ile          AUA     116254   17.8
Met          AUG     136805   20.9
Val          GUU     144243   22.1
Val          GUC      76947   11.8
Val          GUA      76927   11.8
Val          GUG      70337   10.8
Ser          UCU     153557   23.5
Ser          UCC      92923   14.2
Ser          UCA     122028   18.7
Ser          UCG      55951    8.6
Ser          AGU      92466   14.2
Ser          AGC      63726    9.8
Pro          CCU      88263   13.5
Pro          CCC      44309    6.8
Pro          CCA     119641   18.3
Pro          CCG      34597    5.3
Thr          ACU     132522   20.3
Thr          ACC      83207   12.7
Thr          ACA     116084   17.8
Thr          ACG      52045    8.0
Ala          GCU     138358   21.2
Ala          GCC      82357   12.6
Ala          GCA     105910   16.2
Ala          GCG      40358    6.2
Tyr          UAU     122728   18.8
Tyr          UAC      96596   14.8
His          CAU      89007   13.6
His          CAC      50785    7.8
Gln          CAA     178251   27.3
Gln          CAG      79121   12.1
Asn          AAU     233124   35.7
Asn          AAC     162199   24.8
Lys          AAA     273618   41.9
Lys          AAG     201361   30.8
Asp          GAU     245641   37.6
Asp          GAC     132048   20.2
Glu          GAA     297944   45.6
Glu          GAG     125717   19.2
Cys          UGU      52903    8.1
Cys          UGC      31095    4.8
Trp          UGG      67789   10.4
Arg          CGU      41791    6.4
Arg          CGC      16993    2.6
Arg          CGA      19562    3.0
Arg          CGG      11351    1.7
Arg          AGA     139081   21.3
Arg          AGG      60289    9.2
Gly          GGU     156109   23.9
Gly          GGC      63903    9.8
Gly          GGA      71216   10.9
Gly          GGG      39359    6.0
Stop         UAA       6913    1.1
Stop         UAG       3312    0.5
Stop         UGA       4447    0.7
[0119] By utilizing this or similar tables, one of ordinary skill in the art can apply the frequencies to any given polypeptide sequence, and produce a nucleic acid fragment of a codon-optimized coding region which encodes the polypeptide, but which uses codons optimal for a given species.
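As a minimal sketch of how codon counts such as those in Table 2 might be applied, the Python below converts the counts for two amino acids into the share of each synonymous codon and picks the most used codon. Only phenylalanine and leucine are included to keep the example short; the helper names relative_usage and preferred_codon are hypothetical.

```python
# Codon counts for two amino acids, copied from Table 2 (S. cerevisiae).
COUNTS = {
    "Phe": {"UUU": 170666, "UUC": 120510},
    "Leu": {"UUA": 170884, "UUG": 177573, "CUU": 80076,
            "CUC": 35545, "CUA": 87619, "CUG": 68494},
}


def relative_usage(codon_counts):
    """Return each synonymous codon's share of the amino acid's total count."""
    total = sum(codon_counts.values())
    return {codon: n / total for codon, n in codon_counts.items()}


def preferred_codon(codon_counts):
    """Return the synonymous codon with the highest count."""
    return max(codon_counts, key=codon_counts.get)


phe_usage = relative_usage(COUNTS["Phe"])
leu_best = preferred_codon(COUNTS["Leu"])
```

With these counts, UUU accounts for roughly 59% of phenylalanine codons, and UUG is the most used leucine codon. Always choosing the most used codon is only one possible design choice; a frequency-weighted assignment is another.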
[0120] Randomly assigning codons at an optimized frequency to encode a given polypeptide sequence can be done manually by calculating codon frequencies for each amino acid, and then assigning the codons to the polypeptide sequence randomly. Additionally, various algorithms and computer software programs are readily available to those of ordinary skill in the art. Examples include the "EditSeq" function in the Lasergene Package, available from DNAstar, Inc., Madison, Wis., the backtranslation function in the Vector NTI Suite, available from InforMax, Inc., Bethesda, Md., and the "backtranslate" function in the GCG-Wisconsin Package, available from Accelrys, Inc., San Diego, Calif. In addition, various resources are publicly available to codon-optimize coding region sequences, e.g., the "backtranslation" function at http://www.entelechon.com/bioinformatics/backtranslation.php?lang=eng (visited Apr. 15, 2008) and the "backtranseq" function available at http://bioinfo.pbi.nrc.ca:8090/EMBOSS/index.html (visited Jul. 9, 2002). Constructing a rudimentary algorithm to assign codons based on a given frequency can also easily be accomplished with basic mathematical functions by one of ordinary skill in the art.
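A rudimentary algorithm of the kind mentioned above might look like the following Python sketch, which assigns each amino acid a codon at random, weighted by the per-thousand frequencies of Table 2. It is an illustration under stated assumptions and not the method of any of the named software packages: the names USAGE, back_translate and translate are hypothetical, stop codons are omitted, and no account is taken of restriction sites or mRNA structure.

```python
import random

# Frequency per thousand for each sense codon, from Table 2 (S. cerevisiae),
# keyed by the one-letter amino acid codes of Table 1.
USAGE = {
    "F": {"UUU": 26.1, "UUC": 18.4},
    "L": {"UUA": 26.2, "UUG": 27.2, "CUU": 12.3, "CUC": 5.4, "CUA": 13.4, "CUG": 10.5},
    "I": {"AUU": 30.1, "AUC": 17.2, "AUA": 17.8},
    "M": {"AUG": 20.9},
    "V": {"GUU": 22.1, "GUC": 11.8, "GUA": 11.8, "GUG": 10.8},
    "S": {"UCU": 23.5, "UCC": 14.2, "UCA": 18.7, "UCG": 8.6, "AGU": 14.2, "AGC": 9.8},
    "P": {"CCU": 13.5, "CCC": 6.8, "CCA": 18.3, "CCG": 5.3},
    "T": {"ACU": 20.3, "ACC": 12.7, "ACA": 17.8, "ACG": 8.0},
    "A": {"GCU": 21.2, "GCC": 12.6, "GCA": 16.2, "GCG": 6.2},
    "Y": {"UAU": 18.8, "UAC": 14.8},
    "H": {"CAU": 13.6, "CAC": 7.8},
    "Q": {"CAA": 27.3, "CAG": 12.1},
    "N": {"AAU": 35.7, "AAC": 24.8},
    "K": {"AAA": 41.9, "AAG": 30.8},
    "D": {"GAU": 37.6, "GAC": 20.2},
    "E": {"GAA": 45.6, "GAG": 19.2},
    "C": {"UGU": 8.1, "UGC": 4.8},
    "W": {"UGG": 10.4},
    "R": {"CGU": 6.4, "CGC": 2.6, "CGA": 3.0, "CGG": 1.7, "AGA": 21.3, "AGG": 9.2},
    "G": {"GGU": 23.9, "GGC": 9.8, "GGA": 10.9, "GGG": 6.0},
}


def back_translate(protein, rng=None):
    """Pick one codon per residue at random, weighted by codon usage frequency."""
    rng = rng or random.Random()
    codons = []
    for residue in protein:
        table = USAGE[residue]
        codons.append(rng.choices(list(table), weights=list(table.values()), k=1)[0])
    return "".join(codons)


def translate(mrna):
    """Translate an mRNA string (sense codons only) back to one-letter residues."""
    lookup = {codon: aa for aa, table in USAGE.items() for codon in table}
    return "".join(lookup[mrna[i:i + 3]] for i in range(0, len(mrna), 3))
```

Calling back_translate("MKWVAL", random.Random(1)) yields an 18-nucleotide mRNA-style sequence, and passing the result to translate recovers "MKWVAL", confirming that the encoded polypeptide is unchanged. Supplying a seeded random.Random makes the assignment reproducible.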
[0121] Codon-optimized coding regions can be designed by various methods known to those skilled in the art including software packages such as "synthetic gene designer" (http://phenotype.biosci.umbc.edu/codon/sgd/index.php).
[0122] A polynucleotide or nucleic acid fragment is "hybridizable" to another nucleic acid fragment, such as a cDNA, genomic DNA, or RNA molecule, when a single-stranded form of the nucleic acid fragment can anneal to the other nucleic acid fragment under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989), particularly Chapter 11 and Table 11.1 therein (entirely incorporated herein by reference). The conditions of temperature and ionic strength determine the "stringency" of the hybridization. Stringency conditions can be adjusted to screen for moderately similar fragments (such as homologous sequences from distantly related organisms), to highly similar fragments (such as genes that duplicate functional enzymes from closely related organisms). Post-hybridization washes determine stringency conditions. One set of preferred conditions uses a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2×SSC, 0.5% SDS at 45° C. for 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at 50° C. for 30 min. A more preferred set of stringent conditions uses higher temperatures in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS is increased to 60° C. Another preferred set of highly stringent conditions uses two final washes in 0.1×SSC, 0.1% SDS at 65° C. An additional set of stringent conditions includes hybridization at 0.1×SSC, 0.1% SDS, 65° C. and washes with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS, for example.
[0123] Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementarity, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridizations with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). In one embodiment the length for a hybridizable nucleic acid is at least about 10 nucleotides. Preferably a minimum length for a hybridizable nucleic acid is at least about 15 nucleotides; more preferably at least about 20 nucleotides; and most preferably the length is at least about 30 nucleotides. Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the probe.
[0124] A "substantial portion" of an amino acid or nucleotide sequence is that portion comprising enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to putatively identify that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Altschul, S. F., et al., J. Mol. Biol., 215:403-410 (1990)). In general, a sequence of ten or more contiguous amino acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene specific oligonucleotide probes comprising 20-30 contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers. Accordingly, a "substantial portion" of a nucleotide sequence comprises enough of the sequence to specifically identify and/or isolate a nucleic acid fragment comprising the sequence. The instant specification teaches the complete amino acid and nucleotide sequence encoding particular proteins. The skilled artisan, having the benefit of the sequences as reported herein, may now use all or a substantial portion of the disclosed sequences for purposes known to those skilled in this art. Accordingly, the instant invention comprises the complete sequences as provided herein, as well as substantial portions of those sequences as defined above.
[0125] The term "complementary" is used to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine.
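As a non-limiting illustration of the base-pairing relationship described above, the following Python sketch derives the complementary strand of a DNA sequence and checks whether two strands are fully complementary. The function names are hypothetical and are not part of the disclosure; the sketch assumes unambiguous A, C, G and T bases only and perfect (mismatch-free) pairing.

```python
# Watson-Crick pairing for DNA bases: A pairs with T, C pairs with G.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would pair with `seq`, written 5' to 3'."""
    # Strands pair in antiparallel orientation, so the input is read backwards.
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def is_fully_complementary(strand_a: str, strand_b: str) -> bool:
    """True when every base of strand_a pairs with strand_b (both given 5' to 3')."""
    return strand_a.upper() == reverse_complement(strand_b)
```

Under these assumptions, a sequence containing any other character (for example an ambiguity code such as N) raises a KeyError rather than being silently paired.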
[0126] The term "percent identity", as known in the art, is a relationship between two or more polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the sequences. In the art, "identity" also means the degree of sequence relatedness between polypeptide or polynucleotide sequences, as the case may be, as determined by the match between strings of such sequences. "Identity" and "similarity" can be readily calculated by known methods, including but not limited to those disclosed in: 1.) Computational Molecular Biology (Lesk, A. M., Ed.) Oxford University: NY (1988); 2.) Biocomputing: Informatics and Genome Projects (Smith, D. W., Ed.) Academic: NY (1993); 3.) Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H. G., Eds.) Humana: NJ (1994); 4.) Sequence Analysis in Molecular Biology (von Heijne, G., Ed.) Academic (1987); and 5.) Sequence Analysis Primer (Gribskov, M. and Devereux, J., Eds.) Stockton: NY (1991).
[0127] Preferred methods to determine identity are designed to give the best match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. Sequence alignments and percent identity calculations may be performed using the MegAlign® program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences is performed using the "Clustal method of alignment" which encompasses several varieties of the algorithm including the "Clustal V method of alignment" corresponding to the alignment method labeled Clustal V (disclosed by Higgins and Sharp, CABIOS. 5:151-153 (1989); Higgins, D. G. et al., Comput. Appl. Biosci., 8:189-191 (1992)) and found in the MegAlign® program of the LASERGENE bioinformatics computing suite (DNASTAR Inc.). For multiple alignments, the default values correspond to GAP PENALTY=10 and GAP LENGTH PENALTY=10. Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4. After alignment of the sequences using the Clustal V program, it is possible to obtain a "percent identity" by viewing the "sequence distances" table in the same program. Additionally the "Clustal W method of alignment" is available and corresponds to the alignment method labeled Clustal W (described by Higgins and Sharp, CABIOS. 5:151-153 (1989); Higgins, D. G. et al., Comput. Appl. Biosci. 8:189-191 (1992)) and found in the MegAlign® v6.1 program of the LASERGENE bioinformatics computing suite (DNASTAR Inc.). Default parameters for multiple alignment are GAP PENALTY=10, GAP LENGTH PENALTY=0.2, Delay Divergent Seqs(%)=30, DNA Transition Weight=0.5, Protein Weight Matrix=Gonnet Series, and DNA Weight Matrix=IUB. After alignment of the sequences using the Clustal W program, it is possible to obtain a "percent identity" by viewing the "sequence distances" table in the same program.
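By way of illustration only, the following Python sketch shows one simple way a percent identity could be computed from two rows of an alignment that has already been produced by an alignment program. It is not the calculation performed by the MegAlign® "sequence distances" table, whose exact convention is not described here; the function name and the choice to exclude gap-containing columns from the denominator are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns in which both residues match.

    Both inputs are rows of an existing pairwise alignment, with '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumed convention: columns containing a gap are skipped
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Other conventions (for example, dividing by the full alignment length or by the length of the shorter sequence) give different values for the same alignment, so the convention in use should be stated whenever a percent identity is reported.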
[0128] It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying polypeptides, such as from other species, wherein such polypeptides have the same or similar function or activity. Useful examples of percent identities include, but are not limited to: 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100% may be useful in describing the present invention, such as 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%. Suitable nucleic acid fragments not only have the above homologies but typically encode a polypeptide having at least 50 amino acids, preferably at least 100 amino acids, more preferably at least 150 amino acids, still more preferably at least 200 amino acids, and most preferably at least 250 amino acids.
[0129] The term "sequence analysis software" refers to any computer algorithm or software program that is useful for the analysis of nucleotide or amino acid sequences. "Sequence analysis software" may be commercially available or independently developed. Typical sequence analysis software will include, but is not limited to: 1.) the GCG suite of programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.); 2.) BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol. Biol., 215:403-410 (1990)); 3.) DNASTAR (DNASTAR, Inc. Madison, Wis.); 4.) Sequencher (Gene Codes Corporation, Ann Arbor, Mich.); and 5.) the FASTA program incorporating the Smith-Waterman algorithm (W. R. Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Plenum: New York, N.Y.). Within the context of this application it will be understood that where sequence analysis software is used for analysis, the results of the analysis will be based on the "default values" of the program referenced, unless otherwise specified. As used herein "default values" will mean any set of values or parameters that originally load with the software when first initialized.
[0130] Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989) (hereinafter "Maniatis"); and by Silhavy, T. J., Berman, M. L. and Enquist, L. W., Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1984); and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience (1987). Additional methods used here are in Methods in Enzymology, Volume 194, Guide to Yeast Genetics and Molecular and Cell Biology (Part A, 2004, Christine Guthrie and Gerald R. Fink (Eds.), Elsevier Academic Press, San Diego, Calif.).
[0131] The genetic manipulations of a recombinant host cell disclosed herein can be performed using standard genetic techniques and screening and can be made in any host cell that is suitable for genetic manipulation (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202). In embodiments, a recombinant host cell disclosed herein can be any yeast or fungi host useful for genetic modification and recombinant gene expression. In other embodiments, a recombinant host cell can be a member of the genera Saccharomyces, Zygosaccharomyces, Schizosaccharomyces, Dekkera, Torulopsis, Issatchenkia, Brettanomyces, Torulaspora, Hanseniaspora, Kluyveromyces, and some species of Candida. In another embodiment, a recombinant host cell can be S. cerevisiae.
Modification of Dual-Role Hexokinase
[0132] Recombinant yeast cells disclosed herein can comprise a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity in said host cell and/or a modification in a polypeptide having dual-role hexokinase activity in said host cell. In embodiments, a recombinant host cell disclosed herein can have a modification or disruption of one or more polynucleotides, genes or polypeptides encoding dual-role hexokinases. In embodiments, a recombinant host cell comprises a deletion, mutation, and/or substitution in one or more endogenous polynucleotides or genes encoding a polypeptide having dual-role hexokinase activity, or in one or more endogenous polypeptides having dual-role hexokinase activity. Such modifications, disruptions, deletions, mutations, and/or substitutions can result in dual-role hexokinase activity that is reduced or substantially eliminated, resulting, for example, in a dual-role hexokinase knockout phenotype.
[0133] In embodiments, a polypeptide having dual-role hexokinase activity can catalyze the conversion of hexose to hexose-6-phosphate, and/or can catalyze the conversion of D-glucose to D-glucose 6-phosphate, D-fructose to D-fructose 6-phosphate, and/or D-mannose to D-mannose 6-phosphate. In other embodiments, a polynucleotide, gene or polypeptide having dual-role hexokinase activity can correspond to Enzyme Commission Number EC 2.7.1.1.
[0134] In embodiments, a recombinant host cell can be S. cerevisiae and a polynucleotide, gene or polypeptide having dual-role hexokinase activity can be hexokinase 2 (HXK2). In embodiments, a recombinant host cell can be K. lactis and a polynucleotide, gene or polypeptide having dual-role hexokinase activity can be RAG5. In other embodiments, a recombinant host cell can be H. polymorpha and a polynucleotide, gene or polypeptide having dual-role hexokinase activity can be HPGLK1. In other embodiments, a recombinant host cell can be S. pombe and a polynucleotide, gene or polypeptide having dual-role hexokinase activity can be HXK2. Hexokinase 2 knockout strains are known in the art (Vojtek and Fraenkel, Eur. J. Biochem. 190: 371-375, 1990; Lobo and Maitra, Genetics 86: 727-744, 1977; Winzeler, et al. Science 285: 901-906, 1999; and American Type Culture Collection #4004620, #4014820, #4024620, and #4034620).
[0135] Other examples of dual-role hexokinase polynucleotides, genes and polypeptides that can be targeted for modification or inactivation in a recombinant host cell disclosed herein include, but are not limited to, dual-role hexokinase polynucleotides, genes and/or polypeptides having at least about 70% to about 75%, about 75% to about 80%, about 80% to about 85%, about 85% to about 90%, about 90% to about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity to any one of the sequences disclosed herein, wherein such a polynucleotide or gene encodes, or such a polypeptide has, dual-role hexokinase activity. Still other examples of dual-role hexokinase polynucleotides, genes and polypeptides that can be targeted for modification or inactivation in a recombinant host cell disclosed herein include, but are not limited to, an active variant, fragment or derivative of any one of the sequences disclosed herein, wherein such a polynucleotide or gene encodes, or such a polypeptide has, dual-role hexokinase activity.
[0136] In embodiments, the sequences of other dual-role hexokinase polynucleotides, genes and/or polypeptides can be identified in the literature and candidates can be identified in bioinformatics databases well known to the skilled person using sequences disclosed herein and available in the art. For example, such sequences can be identified through BLAST searching of publicly available databases with known hexose kinase encoding polynucleotide or polypeptide sequences. In such a method, identities can be based on the Clustal W method of alignment using the default parameters of GAP PENALTY=10, GAP LENGTH PENALTY=0.1, and Gonnet 250 series of protein weight matrix.
[0137] Additionally, the dual-role hexokinase polynucleotide or polypeptide sequences disclosed herein or known in the art can be used to identify other candidate hexose kinase homologs in nature. For example, each of the hexose kinase encoding nucleic acid fragments disclosed herein can be used to isolate genes encoding homologous proteins. Isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to (1) methods of nucleic acid hybridization; (2) methods of DNA and RNA amplification, as exemplified by various uses of nucleic acid amplification technologies [e.g., polymerase chain reaction (PCR), Mullis et al., U.S. Pat. No. 4,683,202; ligase chain reaction (LCR), Tabor, S. et al., Proc. Natl. Acad. Sci. USA 82:1074 (1985); or strand displacement amplification (SDA), Walker et al., Proc. Natl. Acad. Sci. U.S.A., 89:392 (1992)]; and (3) methods of library construction and screening by complementation.
[0138] Whether or not a particular hexose kinase is a dual-role hexokinase is specific to the host cell in which the hexose kinase is expressed. For example, while Hansenula polymorpha HPGLK1 is a dual-role hexokinase in the native organism, it is not associated with glucose repression in S. cerevisiae. Additional examples of hexose kinases that are dual-role in S. cerevisiae are given in Table 3. The dual-role nature of certain hexose kinases is known in the art, and whether or not a hexose kinase is a dual-role hexokinase in a particular host cell can be readily determined from the art and/or using methods known to those of skill in the art. For example, one of the roles of any hexose kinase is enzymatic activity to phosphorylate hexoses, as per E.C. definition 2.7.1.1 or 2.7.1.2, and such activity can be confirmed by assays known in the art. The second role that a dual-role hexokinase will have is regulatory: that is, it exerts glucose repression on glucose-repressible genes. This is demonstrated by relief from glucose repression in a strain with a mutation in the gene encoding that hexose kinase. Glucose repression relief in the mutant strain can be demonstrated by methods known in the art, including, but not limited to:
[0139] 1. measuring expression of the enzymatic activity of an enzyme(s) known to be glucose-repressed in that host (e.g. in S. cerevisiae, invertase; maltase; galactokinase) when the cells are grown in glucose-containing medium (if the genetic system involves induction as well as repression, the cognate non-glucose carbon source must be added too, e.g. galactose, maltose);
[0140] 2. measuring transcription of a gene(s) known to be glucose-repressed in that host when the cells are grown in glucose-containing medium (if the genetic system involves induction as well as repression, the cognate non-glucose carbon source must be added too, e.g. galactose, maltose). Transcription can be measured by Northern blot, RT-PCR, run-on transcription, etc. Transcription can also be measured by expression of a reporter gene (e.g. GFP, lacZ, gusB) placed under control of a promoter from a glucose-repressible gene;
[0141] 3. measuring the ability of the mutant strain to co-consume glucose and a carbon source whose consumption is normally repressed by glucose (e.g. in S. cerevisiae: sucrose, maltose, galactose);
[0142] 4. testing the ability of the mutant strain to grow on a carbon source whose consumption is normally repressed by glucose, when the growth medium also contains a gratuitous glucose repressor (e.g. 2-deoxyglucose, 5-thioglucose).
[0143] All of the tests mentioned above could be done with the non-mutant strain as well, for reference.
[0144] In embodiments, dual-role hexokinase polynucleotides, genes and/or polypeptides related to a recombinant host cell disclosed herein can be modified or disrupted. Many methods for genetic modification and disruption of target genes to reduce or eliminate expression are known to one of ordinary skill in the art and can be used to create a recombinant host cell disclosed herein. Modifications that can be used include, but are not limited to, deletion of the entire gene or a portion of the gene encoding a dual-role hexokinase protein, inserting a DNA fragment into the encoding gene (in either the promoter or coding region) so that the protein is not expressed or expressed at lower levels, introducing a mutation into the coding region which adds a stop codon or frame shift such that a functional protein is not expressed, and introducing one or more mutations into the coding region to alter amino acids so that a non-functional or a less active protein is expressed. In other embodiments, expression of a target gene can be blocked by expression of an antisense RNA or an interfering RNA, and constructs can be introduced that result in cosuppression. In other embodiments, the synthesis or stability of the transcript can be lessened by mutation. In embodiments, the efficiency by which a protein is translated from mRNA can be modulated by mutation. All of these methods can be readily practiced by one skilled in the art making use of the known or identified sequences encoding target proteins.
[0145] In other embodiments, DNA sequences surrounding a target dual-role hexokinase coding sequence are also useful in some modification procedures and are available, for example, for yeast such as Saccharomyces cerevisiae in the complete genome sequence coordinated by Genome Project ID9518 of Genome Projects coordinated by NCBI (National Center for Biotechnology Information) with identifying GOPID #13838. An additional non-limiting example of yeast genomic sequences is that of Candida albicans, which is included in GPID #10771, #10701 and #16373. Other yeast genomic sequences can be readily found by one of skill in the art in publicly available databases.
[0146] In other embodiments, DNA sequences surrounding a target dual-role hexokinase coding sequence can be useful for modification methods using homologous recombination. In a non-limiting example of this method, dual-role hexokinase gene flanking sequences can be placed bounding a selectable marker gene to mediate homologous recombination whereby the marker gene replaces the dual-role hexokinase gene. In another non-limiting example, partial dual-role hexokinase gene sequences and dual-role hexokinase gene flanking sequences bounding a selectable marker gene can be used to mediate homologous recombination whereby the marker gene replaces at least a portion of the target dual-role hexokinase gene. In embodiments, the selectable marker can be bounded by site-specific recombination sites, so that following expression of the corresponding site-specific recombinase, the resistance gene is excised from the dual-role hexokinase gene without reactivating the latter. In embodiments, the site-specific recombination leaves behind a recombination site which disrupts expression of the dual-role hexokinase protein. In other embodiments, the homologous recombination vector can be constructed to also leave a deletion in the dual-role hexokinase gene following excision of the selectable marker, as is well known to one skilled in the art.
[0147] In other embodiments, deletions can be made to a dual-role hexokinase target gene using mitotic recombination as described by Wach et al. (Yeast, 10:1793-1808; 1994). Such a method can involve preparing a DNA fragment that contains a selectable marker between genomic regions that can be as short as 20 bp, and which bound a target DNA sequence. In other embodiments, this DNA fragment can be prepared by PCR amplification of the selectable marker gene using as primers oligonucleotides that hybridize to the ends of the marker gene and that include the genomic regions that can recombine with the yeast genome. In embodiments, the linear DNA fragment can be efficiently transformed into yeast and recombined into the genome, resulting in gene replacement including deletion of the target DNA sequence (as disclosed, for example, in Methods in Enzymology, Volume 194, Guide to Yeast Genetics and Molecular and Cell Biology (Part A, 2004, Christine Guthrie and Gerald R. Fink (Eds.), Elsevier Academic Press, San Diego, Calif.)).
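As a non-limiting sketch of the primer design just described, the following Python example assembles a forward and a reverse primer, each consisting of a genomic flanking region followed by a portion that anneals to an end of the selectable marker gene. The function names, the argument names, and the use of 20 nucleotides as the minimum flank length are illustrative assumptions; all input sequences are taken to be written 5' to 3' on the same (top) strand, and no actual marker or genomic sequence is implied.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def deletion_primers(upstream_flank: str, downstream_flank: str,
                     marker_start: str, marker_end: str,
                     min_homology: int = 20) -> tuple[str, str]:
    """Build primers that add genomic homology arms to a marker cassette.

    upstream_flank / downstream_flank: genomic sequence immediately 5' and 3'
        of the region to be replaced (top strand).
    marker_start / marker_end: first and last bases of the marker cassette
        (top strand) to which the primers anneal.
    """
    if len(upstream_flank) < min_homology or len(downstream_flank) < min_homology:
        raise ValueError("each genomic flank must be at least min_homology long")
    # Forward primer: homology arm at the 5' end, marker-annealing part at the 3' end.
    forward = upstream_flank.upper() + marker_start.upper()
    # Reverse primer: reverse complement of (marker_end + downstream_flank).
    reverse = reverse_complement(downstream_flank) + reverse_complement(marker_end)
    return forward, reverse
```

Amplifying the marker with such a primer pair would be expected to give a linear fragment of the form upstream flank, marker, downstream flank. Practical primer design would also consider annealing temperature and secondary structure, which this sketch does not address.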
[0148] Moreover, promoter replacement methods can be used to exchange the endogenous transcriptional control elements allowing another means to modulate expression such as described by Mnaimneh et al. ((2004) Cell 118(1):31-44).
[0149] In other embodiments, the dual-role hexokinase target gene encoded activity can be disrupted using random mutagenesis, which can then be followed by screening to identify strains with reduced or substantially eliminated activity. In this type of method, the DNA sequence of the target gene encoding region, or any other region of the genome affecting carbon substrate dependency for growth, need not be known. In embodiments, cells identified in a screen for reduced dual-role hexokinase activity, or other mutants having reduced dual-role hexokinase activity, can be useful as recombinant host cells of the invention.
[0150] Methods for creating genetic mutations are common and well known in the art and can be applied to the exercise of creating mutants. Commonly used random genetic modification methods (reviewed in Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) include spontaneous mutagenesis, mutagenesis caused by mutator genes, chemical mutagenesis, irradiation with UV or X-rays, or transposon mutagenesis.
[0151] Chemical mutagenesis of host cells can involve, but is not limited to, treatment with one of the following DNA mutagens: ethyl methanesulfonate (EMS), nitrous acid, diethyl sulfate, or N-methyl-N'-nitro-N-nitroso-guanidine (MNNG). Such methods of mutagenesis have been reviewed in Spencer et al. (Mutagenesis in Yeast, 1996, Yeast Protocols: Methods in Cell and Molecular Biology. Humana Press, Totowa, N.J.). In embodiments, chemical mutagenesis with EMS can be performed as disclosed in Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. Irradiation with ultraviolet (UV) light or X-rays can also be used to produce random mutagenesis in yeast cells. The primary effect of mutagenesis by UV irradiation is the formation of pyrimidine dimers which disrupt the fidelity of DNA replication. Protocols for UV-mutagenesis of yeast can be found in Spencer et al. (Mutagenesis in Yeast, 1996, Yeast Protocols: Methods in Cell and Molecular Biology. Humana Press, Totowa, N.J.). In embodiments, the introduction of a mutator phenotype can also be used to generate random chromosomal mutations in host cells. In embodiments, common mutator phenotypes can be obtained through disruption of one or more of the following genes: PMS1, MAG1, RAD18 or RAD1. In other embodiments, restoration of the non-mutator phenotype can be obtained by insertion of the wildtype allele. In other embodiments, collections of modified cells produced from any of these or other known random mutagenesis processes may be screened for reduced or eliminated dual-role hexokinase activity.
[0152] Genomes have been completely sequenced and annotated and are publicly available for the following yeast strains: Ashbya gossypii ATCC 10895, Candida glabrata CBS 138, Kluyveromyces lactis NRRL Y-1140, Pichia stipitis CBS 6054, Saccharomyces cerevisiae S288c, Schizosaccharomyces pombe 972h-, and Yarrowia lipolytica CLIB122. Typically BLAST (described above) searching of publicly available databases with known dual-role hexokinase polynucleotide or polypeptide sequences, such as those provided herein, is used to identify candidate dual-role hexokinase-encoding sequences of other host cells, such as yeast cells.
[0153] Accordingly, it is within the scope of the invention to provide dual-role hexokinase polynucleotides, genes and polypeptides having at least about 70% to about 75%, about 75% to about 80%, about 80% to about 85%, about 85% to about 90%, about 90% to about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity to any of the hexokinase polynucleotides or polypeptides disclosed herein. Identities are based on the Clustal W method of alignment using the default parameters of GAP PENALTY=10, GAP LENGTH PENALTY=0.1, and Gonnet 250 series of protein weight matrix.
[0154] The modification of a dual-role hexokinase in a recombinant host cell disclosed herein to reduce or eliminate dual-role hexokinase activity can be confirmed using methods known in the art. For example, one can screen for disruption of hexokinase 2 in S. cerevisiae by PCR (for example, looking for lack of a PCR product with primers such as those listed in Example 2) or by Southern blotting using a probe designed to the hexokinase 2 sequence. Alternatively, one can screen for decreased glucose consumption and higher yield of biomass which is phenotypically indicative of a hexokinase 2 disruption.
Introduction of Hexose Kinase Activity
[0155] Applicants have found that the inclusion of a heterologous polynucleotide encoding a polypeptide having hexose kinase activity in a recombinant host cell comprising a modification in an endogenous polynucleotide, gene or polypeptide having dual-role hexokinase activity wherein the activity of the dual-role hexokinase is reduced or eliminated can result in altered glucose repression in the recombinant host cell. The introduction of a heterologous polynucleotide encoding a polypeptide having hexose kinase activity may result in an improved redox balance, increased glucose consumption and/or increased product formation by a pyruvate-utilizing biosynthetic pathway.
[0156] Hexose kinase polynucleotides, genes or polypeptides known in the art or that are identified as disclosed herein can be expressed in a recombinant host cell disclosed herein.
[0157] Suitable hexose kinase polypeptides include, but are not limited to those that are typically dual-role hexokinases in the host cell but have been modified to reduce or eliminate the glucose repression function. Such hexose kinase polypeptides may be encoded by a polynucleotide comprising a conditional promoter such that the expression of the polypeptide is conditional. As an example, dual-role hexokinase polynucleotides, genes and polypeptides in Saccharomyces cerevisiae include, but are not limited to, those in Table 3.
TABLE 3
Example hexose kinases that are dual-role hexokinases in S. cerevisiae

Description | Nucleic Acid GenBank Accession No. | Nucleic Acid SEQ ID NO | Protein GenBank Accession No. | Protein SEQ ID NO
HXK2 (hexokinase 2) from S. cerevisiae | Z72775.1 | 1 | CAA96973.1 | 2
Yarrowia lipolytica YIHXK1 | AJ011524.1 | 114 | CAA09674.1 | 115
Schwanniomyces occidentalis SoXHK | S78714.1 | 116 | AAB34892.1 | 117
Human pancreatic glucokinase (hexokinase 4; GCK) | NM_000162.3 | 118 | NP_000153.1 | 119
[0158] In embodiments, suitable heterologous polynucleotides encode hexose kinases which are dual-function hexokinases in a particular host cell but are expressed in said host cell under the control of a conditional promoter such that glucose repression is altered under conditions where the promoter is not activated or not activated to a significant extent. In embodiments, HXK2 is expressed in S. cerevisiae under the control of a conditional promoter. In embodiments, a dual-function hexokinase having at least 85%, at least 90%, or at least 95% identity to SEQ ID NO: 2, 115, 117, or 119 (see Table 3) is encoded in S. cerevisiae by a polynucleotide comprising a conditional promoter sequence. In embodiments, a dual-function hexokinase of SEQ ID NO: 2, 115, 117, or 119 (see Table 3) is encoded in S. cerevisiae by a polynucleotide comprising a conditional promoter sequence. In embodiments, the conditional promoter sequence is derived from the OLE1 promoter region. In embodiments, the promoter sequence is at least 95% identical to SEQ ID NO: 98. In embodiments, the promoter sequence comprises SEQ ID NO: 98. In embodiments, the promoter is SNO1 (SEQ ID NO: 140) or SNZ1 (SEQ ID NO: 141). In embodiments, the promoter sequence is at least about 95% identical to SEQ ID NO: 140 or 141.
[0159] In embodiments, a polynucleotide encoding a dual-role hexokinase disclosed herein or known in the art can be modified using methods disclosed herein such that the glucose repression activity is reduced or eliminated by altering the cellular localization of the encoded polypeptide. For example, a decapeptide at the N-terminus of hexokinase 2 (Lys6-Met15) has been implicated as a domain involved with MIG1 binding, and it is believed that the hexokinase 2-MIG1 complex is imported into the nucleus where both proteins can function as transcriptional regulators. Ahuatzi et al. describe a Lys6-Met15 deletion mutant of HXK2 that could no longer bind MIG1, was localized to the cytosol, and could not enter the nucleus (Ahuatzi et al. (2004) The Glucose-regulated Nuclear Localization of Hexokinase 2 in Saccharomyces cerevisiae Is Mig1-dependent. JBC 279(14):14440-6).
[0160] Thus, deletion or mutation of the MIG1-interaction domain from hexokinase 2 (or related hexokinases) using molecular biology methods known in the art would allow the enzyme to function as a glycolytic enzyme but prevent the enzyme from being translocated to the nucleus and functioning as a transcriptional regulator. In a recombinant host cell comprising reduced or substantially eliminated hexokinase 2 activity, with this modification, one could obtain the growth benefit of the hexokinase 2 reduction, but also high glucose uptake rates akin to the wildtype strain. Therefore, provided herein is a heterologous polynucleotide encoding a polypeptide having hexose kinase activity comprising a mutation or deletion in a protein binding domain necessary for nuclear translocation. In embodiments, the domain is the MIG1-interaction domain. In embodiments, the polynucleotide has at least about 85%, at least about 90%, or at least about 95% identity to SEQ ID NO: 132. In embodiments, the polynucleotide is SEQ ID NO: 132. In embodiments, the polypeptide has at least about 85%, at least about 90%, or at least about 95% identity to SEQ ID NO: 130. In embodiments, the polypeptide is SEQ ID NO: 130.
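Purely as an illustration of a deletion of a defined residue range such as Lys6-Met15, the following Python sketch removes a contiguous block of residues from a protein sequence given 1-based, inclusive positions. The function name is hypothetical, and whether the residue numbering used in the literature counts the initiator methionine is an assumption that would have to be checked against the actual sequence before use.

```python
def delete_residues(protein: str, first: int, last: int) -> str:
    """Return `protein` with residues first..last (1-based, inclusive) removed."""
    if not 1 <= first <= last <= len(protein):
        raise ValueError("residue range falls outside the sequence")
    # Python slices are 0-based and end-exclusive, hence first - 1 and last.
    return protein[: first - 1] + protein[last:]
```

For example, calling delete_residues(sequence, 6, 15) on a polypeptide sequence removes ten residues and joins residue 5 directly to residue 16. At the polynucleotide level, the corresponding deletion would remove the thirty nucleotides encoding those residues so that the reading frame is preserved.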
[0161] In embodiments, a heterologous polynucleotide encoding a polypeptide having hexose kinase activity is overexpressed, or expressed at a level that is higher than endogenous expression of the same or related endogenous gene, if any. In other embodiments, a polypeptide having hexose kinase activity is native to a recombinant host cell. In other embodiments, a polypeptide having hexose kinase activity is not native to a recombinant host cell.
[0162] In embodiments, the heterologous polynucleotide encoding a polypeptide having hexose kinase activity comprises a constitutive promoter sequence. In embodiments, the constitutive promoter sequence is derived from the ADH1 promoter region. In embodiments, the constitutive promoter sequence has at least 95% identity to SEQ ID NO: 131. In embodiments, the constitutive promoter sequence is SEQ ID NO: 131.
[0163] In embodiments, a polypeptide having hexose kinase activity catalyzes the conversion of hexose to hexose-6-phosphate. In other embodiments, a polypeptide having hexose kinase activity catalyzes the conversion of D-glucose to D-glucose 6-phosphate, D-fructose to D-fructose 6-phosphate, and/or D-mannose to D-mannose 6-phosphate.
[0164] In embodiments, such a polynucleotide, gene and/or polypeptide can be K. lactis RAG5, H. polymorpha HPGLK1, S. pombe HXK2, or combinations thereof.
[0165] In embodiments, a polynucleotide, gene and/or polypeptide encoding hexose kinase activity corresponds to the Enzyme Commission Number EC 2.7.1.1. In other embodiments, a polynucleotide, gene and/or polypeptide encoding hexose kinase can include, but is not limited to, a sequence selected from the following Table 4 or from Table 3. Hexose kinases suitable for expression in S. cerevisiae include those disclosed in Table 4. The hexose kinases disclosed in Table 4 are not dual-function hexokinases when expressed in S. cerevisiae, but one of skill in the art will recognize that certain of the hexose kinases suitable for expression in S. cerevisiae, will be dual-function hexokinases in other types of host cells.
TABLE 4
Example hexose kinase coding regions and proteins and source organism

Target gene and source organism | Nucleic acid GenBank Accession No. or Gene ID No. | Nucleic acid SEQ ID NO: | Amino acid GenBank Accession No. | Protein SEQ ID NO:
RAG5 from K. lactis | NC_006040 REGION: 973371..974828 | 3 | XP_453567 | 4
HPGLK1 from H. polymorpha | AY034434 | 5 | AAK60444 | 6
HXK2 from S. pombe | X92895 | 7 | NP_593865 | 8
S. cerevisiae HXK1 | Entrez GeneID: 850614 | 120 | -- | 121
S. cerevisiae GLK1 | Entrez GeneID: 850317 | 122 | -- | 123
[0166] In other embodiments, a polynucleotide, gene and/or polypeptide encoding a hexose kinase can have at least about 70% to about 75%, about 75% to about 80%, about 80% to about 85%, about 85% to about 90%, about 90% to about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity to that of any one of the sequences of Table 3 or Table 4, wherein such a polynucleotide or gene encodes, or such a polypeptide has, hexose kinase activity. Still other examples of hexose kinase polynucleotides, genes and polypeptides that can be expressed in a recombinant host cell disclosed herein include, but are not limited to, an active variant, fragment or derivative of any one of the sequences of Table 3 or Table 4, wherein such a polynucleotide or gene encodes, or such a polypeptide has, hexose kinase activity. Still other examples of hexose kinase polynucleotides, genes and polypeptides that can be expressed in a recombinant host cell disclosed herein include, but are not limited to, an active variant, fragment or derivative of any one of the sequences of K. lactis KIGLK1 (Nucleic acid SEQ ID NO: 124; Amino acid SEQ ID NO: 125) or Hansenula polymorpha HPHXK1 (Nucleic acid SEQ ID NO: 126; Amino acid SEQ ID NO: 127), wherein such a polynucleotide or gene encodes, or such a polypeptide has, hexose kinase activity.
[0167] In other embodiments, a polynucleotide, gene and/or polypeptide encoding hexose kinase can be used to identify another hexose kinase polynucleotide, gene and/or polypeptide sequences and/or can be used to identify a hexose kinase homolog in other cells, as disclosed above for dual-role hexokinases. Such hexose kinase encoding sequences can be identified, for example, in the literature and/or in bioinformatics databases well known to the skilled person. For example, the identification of a hexose kinase encoding sequence in another cell type using bioinformatics can be accomplished through BLAST (as disclosed above) searching of publicly available databases with a known hexose kinase encoding DNA and polypeptide sequence, such as any of those provided herein. Identities are based on the Clustal W method of alignment using the default parameters of GAP PENALTY=10, GAP LENGTH PENALTY=0.1, and Gonnet 250 series of protein weight matrix.
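In a non-limiting illustration, the percent identity thresholds recited above (for example, at least about 85%) can be checked once two sequences have been aligned. The following minimal sketch assumes the alignment has already been produced elsewhere (for example, by Clustal W) and is supplied as two equal-length strings with "-" marking gaps. The function names, the choice of denominator (all columns except those where both sequences have a gap), and the toy sequences are assumptions made for illustration and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns in which both sequences carry a gap are
    skipped; every other column counts toward the denominator. This is one
    assumed convention; alignment tools may normalise differently.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy alignment, not a sequence from Table 3 or Table 4.
    query = "MKLV-AGH"
    subject = "MKIVQAGH"
    print(round(percent_identity(query, subject), 1))
    print(meets_threshold(query, subject, 85.0))
```

A candidate homolog would then be retained or discarded by comparing the returned value with whichever identity threshold is being applied.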
Modification of Pyruvate Decarboxylase
[0168] In embodiments, a recombinant host cell disclosed herein can comprise a modification in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase (PDC) activity or a modification in an endogenous polypeptide having PDC activity. In embodiments, a recombinant host cell disclosed herein can have a modification or disruption of one or more polynucleotides, genes and/or polypeptides encoding PDC. In embodiments, a recombinant host cell comprises a deletion, mutation, and/or substitution in one or more endogenous polynucleotides or genes encoding a polypeptide having PDC activity, or in one or more endogenous polypeptides having PDC activity. Such modifications, disruptions, deletions, mutations, and/or substitutions can result in PDC activity that is reduced or substantially eliminated, resulting, for example, in a PDC knock-out (PDC-KO) phenotype.
[0169] In embodiments, the endogenous pyruvate decarboxylase activity of a recombinant host cell disclosed herein converts pyruvate to acetaldehyde, which can then be converted to ethanol or to acetyl-CoA via acetate. In other embodiments, a recombinant host cell is Kluyveromyces lactis containing one gene encoding pyruvate decarboxylase, Candida glabrata containing one gene encoding pyruvate decarboxylase, or Schizosaccharomyces pombe containing one gene encoding pyruvate decarboxylase.
[0170] In other embodiments, the recombinant host cell is Saccharomyces cerevisiae containing three isozymes of pyruvate decarboxylase encoded by the PDC1, PDC5, and PDC6 genes, as well as a pyruvate decarboxylase regulatory gene, PDC2. In a non-limiting example in S. cerevisiae, the PDC1 and PDC5 genes, or the PDC1, PDC5, and PDC6 genes, are disrupted. In another non-limiting example in S. cerevisiae, pyruvate decarboxylase activity can be reduced by disrupting the PDC2 regulatory gene. In another non-limiting example, expression of the PDC1 and PDC5 genes, or the PDC1, PDC5, and PDC6 genes, is reduced. In another non-limiting example in S. cerevisiae, polynucleotides or genes encoding pyruvate decarboxylase proteins such as those having about 70% to about 75%, about 75% to about 80%, about 80% to about 85%, about 85% to about 90%, about 90% to about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity to PDC1, PDC2, PDC5 and/or PDC6 can be disrupted.
[0171] In embodiments, a polypeptide having PDC activity or a polynucleotide or gene encoding a polypeptide having PDC activity corresponds to Enzyme Commission Number EC 4.1.1.1. In other embodiments, a PDC gene of a recombinant host cell disclosed herein is not active under the fermentation conditions used, and therefore such a gene would not need to be modified or inactivated.
[0172] Examples of a recombinant host cell with reduced pyruvate decarboxylase activity due to disruption of pyruvate decarboxylase encoding genes have been reported, such as for Saccharomyces in Flikweert et al. (Yeast (1996) 12:247-257), for Kluyveromyces in Bianchi et al. (Mol. Microbiol. (1996) 19(1):27-36), and disruption of the regulatory gene in Hohmann (Mol. Gen. Genet. (1993) 241:657-666). Saccharomyces strains having no pyruvate decarboxylase activity are available from the ATCC with Accession #200027 and #200028.
[0173] Examples of PDC polynucleotides, genes and/or polypeptides that can be targeted for modification or inactivation in the recombinant host cells disclosed herein include, but are not limited to, those of the following Table 5.
TABLE 5
Pyruvate decarboxylase target gene coding regions and proteins.

Description | SEQ ID NO: Nucleic acid | SEQ ID NO: Amino acid
PDC1 pyruvate decarboxylase from Saccharomyces cerevisiae | 9 | 10
PDC5 pyruvate decarboxylase from Saccharomyces cerevisiae | 11 | 12
PDC6 pyruvate decarboxylase from Saccharomyces cerevisiae | 13 | 14
pyruvate decarboxylase from Candida glabrata | 15 | 16
PDC1 pyruvate decarboxylase from Pichia stipitis | 17 | 18
PDC2 pyruvate decarboxylase from Pichia stipitis | 19 | 20
pyruvate decarboxylase from Kluyveromyces lactis | 21 | 22
pyruvate decarboxylase from Yarrowia lipolytica | 23 | 24
pyruvate decarboxylase from Schizosaccharomyces pombe | 25 | 26
pyruvate decarboxylase from Zygosaccharomyces rouxii | 27 | 28
[0174] Other examples of PDC polynucleotides, genes and polypeptides that can be targeted for modification or inactivation in a recombinant host cell disclosed herein include, but are not limited to, PDC polynucleotides, genes and/or polypeptides having at least about 70% to about 75%, about 75% to about 80%, about 80% to about 85%, about 85% to about 90%, about 90% to about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity to any one of the sequences of Table 5, wherein such a polynucleotide or gene encodes, or such a polypeptide has, Pdc activity. Still other examples of PDC polynucleotides, genes and polypeptides that can be targeted for modification or inactivation in a recombinant host cell disclosed herein include, but are not limited to, an active variant, fragment or derivative of any one of the sequences of Table 5, wherein such a polynucleotide or gene encodes, or such a polypeptide has, Pdc activity.
[0175] In embodiments, a polynucleotide, gene and/or polypeptide encoding a PDC sequence disclosed herein or known in the art can be modified, as disclosed above for hexokinases. In other embodiments, a polynucleotide, gene and/or polypeptide encoding PDC can be used to identify another PDC polynucleotide, gene and/or polypeptide sequence or to identify a PDC homolog in other cells, as disclosed above for hexokinases. Such a PDC encoding sequence can be identified, for example, in the literature and/or in bioinformatics databases well known to the skilled person. For example, the identification of a PDC encoding sequence in other cell types using bioinformatics can be accomplished through BLAST (as described above) searching of publicly available databases with a known PDC encoding DNA and polypeptide sequence, such as those provided herein. Identities are based on the Clustal W method of alignment using the default parameters of GAP PENALTY=10, GAP LENGTH PENALTY=0.1, and Gonnet 250 series of protein weight matrix.
[0176] The modification of PDC in a recombinant host cell disclosed herein to reduce or eliminate PDC activity can be confirmed using methods known in the art. For example, one can screen for disruption of pyruvate decarboxylase by lack of a PCR product with primers listed in Example 2 or by Southern blotting using a probe designed to a PDC sequence.
Gene Expression in Recombinant Host Cells
[0177] Methods for gene expression in recombinant host cells, including, but not limited to, yeast cells, are known in the art (see, for example, Methods in Enzymology, Volume 194, Guide to Yeast Genetics and Molecular and Cell Biology (Part A), 2004, Christine Guthrie and Gerald R. Fink (Eds.), Elsevier Academic Press, San Diego, Calif.). In embodiments, the coding region for the hexose kinase genes to be expressed can be codon optimized for the target host cell, as is well known to one skilled in the art. Expression of genes in recombinant host cells, including but not limited to yeast cells, can require a promoter operably linked to a coding region of interest, and a transcriptional terminator. A number of promoters can be used in constructing expression cassettes for genes, including, but not limited to, the following constitutive promoters suitable for use in yeast: FBA1, TDH3 (GPD), ADH1, GPM1, and TEF1; and the following inducible promoters suitable for use in yeast: GAL1, GAL10 and CUP1. Suitable for conditional expression is the OLE1 promoter, for which transcription of the gene is induced under anaerobic conditions. While not wishing to be bound by theory, it is believed that anaerobic conditions often prevail during stationary phase, especially in industrial fermentations. Other promoters with stationary-phase expression are known in the art and would also be suitable, such as SNO1 and SNZ1. Suitable transcriptional terminators that can be used in a chimeric gene construct for expression include, but are not limited to, FBA1t, TDH3t, GPM1t, ERG10t, GAL1t, CYC1t, and ADH1t.
[0178] Recombinant polynucleotides are typically cloned for expression using the coding sequence as part of a chimeric gene used for transformation, which includes a promoter operably linked to the coding sequence as well as a ribosome binding site and a termination control region. The coding region may be from the host cell for transformation and combined with regulatory sequences that are not native to the natural gene encoding hexose kinase. Alternatively, the coding region may be from another host cell.
[0179] Vectors useful for the transformation of a variety of host cells are common and disclosed in the literature. Typically the vector contains a selectable marker and sequences allowing autonomous replication or chromosomal integration in the desired host. In addition, suitable vectors can comprise a promoter region which harbors transcriptional initiation controls and a transcriptional termination control region, between which a coding region DNA fragment may be inserted, to provide expression of the inserted coding region. Both control regions can be derived from genes homologous to the transformed host cell, although it is to be understood that such control regions can also be derived from genes that are not native to the specific species chosen as a production host.
[0180] In embodiments, suitable promoters, transcriptional terminators, and hexose kinase coding regions can be cloned into E. coli-yeast shuttle vectors, and transformed into yeast cells. Such vectors allow plasmid propagation in both E. coli and yeast strains, and can contain a selectable marker and sequences allowing autonomous replication or chromosomal integration in the desired host. Typically used plasmids in yeast include, but are not limited to, shuttle vectors pRS423, pRS424, pRS425, and pRS426 (American Type Culture Collection, Rockville, Md.), which contain an E. coli replication origin (e.g., pMB1), a yeast 2-micron origin of replication, and a marker for nutritional selection. The selection markers for these four vectors are HIS3 (vector pRS423), TRP1 (vector pRS424), LEU2 (vector pRS425) and URA3 (vector pRS426).
[0181] In embodiments, construction of expression vectors with a chimeric gene encoding the disclosed hexose kinases can be performed by the gap repair recombination method in yeast. The gap repair cloning approach takes advantage of the highly efficient homologous recombination in yeast. In embodiments, a yeast vector DNA is digested (e.g., in its multiple cloning site) to create a "gap" in its sequence. A number of insert DNAs of interest are generated that contain an approximately 21 bp sequence at both the 5' and the 3' ends that sequentially overlap with each other, and with the 5' and 3' terminus of the vector DNA. For example, to construct a yeast expression vector for "Gene X," a yeast promoter and a yeast terminator are selected for the expression cassette. The promoter and terminator are amplified from the yeast genomic DNA, and Gene X is either PCR amplified from its source organism or obtained from a cloning vector comprising Gene X sequence. There is at least a 21 bp overlapping sequence between the 5' end of the linearized vector and the promoter sequence, between the promoter and Gene X, between Gene X and the terminator sequence, and between the terminator and the 3' end of the linearized vector. The "gapped" vector and the insert DNAs are then co-transformed into a yeast strain and plated on the medium containing the appropriate compound mixtures that allow complementation of the nutritional selection markers on the plasmids. The presence of correct insert combinations can be confirmed by PCR mapping using plasmid DNA prepared from the selected cells. The plasmid DNA isolated from yeast (usually low in concentration) can then be transformed into an E. coli strain, e.g., TOP10, followed by mini preps and restriction mapping to further verify the plasmid construct. Finally the construct can be verified by sequence analysis.
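In a non-limiting illustration, the overlap requirement of the gap repair design described above can be checked in silico before fragments are ordered or amplified. The sketch below assumes each fragment is supplied as a plain top-strand sequence in assembly order (vector end upstream of the gap, promoter, Gene X, terminator, vector end downstream of the gap) and reports whether each junction shares at least 21 bp. The function names and the placeholder sequences are hypothetical and are used only to show the check.

```python
MIN_OVERLAP = 21  # bp; the overlap length named in the text


def shared_overlap(upstream: str, downstream: str) -> int:
    """Length of the longest suffix of `upstream` that is also a prefix of `downstream`."""
    upstream, downstream = upstream.upper(), downstream.upper()
    for k in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream[-k:] == downstream[:k]:
            return k
    return 0


def check_gap_repair_design(vector_left_arm, fragments, vector_right_arm,
                            min_overlap=MIN_OVERLAP):
    """Check every junction of a gap repair assembly.

    `fragments` is an ordered list of (name, sequence) pairs. Returns a list of
    (junction_name, overlap_bp, ok) tuples, one per junction.
    """
    names = ["vector left arm"] + [name for name, _ in fragments] + ["vector right arm"]
    seqs = [vector_left_arm] + [seq for _, seq in fragments] + [vector_right_arm]
    report = []
    for i in range(len(seqs) - 1):
        overlap = shared_overlap(seqs[i], seqs[i + 1])
        report.append((f"{names[i]} / {names[i + 1]}", overlap, overlap >= min_overlap))
    return report


if __name__ == "__main__":
    # Placeholder sequences; a real design would use the actual vector,
    # promoter, coding region and terminator sequences.
    junction = "ACGTTGCAAGCTTGGCCATGC"  # 21 nt shared at the first junction
    left_arm = "GGGG" + junction
    promoter = junction + "TTTTCCCC"
    right_arm = "CCCCAAAA"  # shares only 4 nt with the promoter end
    for name, overlap, ok in check_gap_repair_design(
            left_arm, [("promoter", promoter)], right_arm):
        print(name, overlap, "ok" if ok else "too short")
```

A junction flagged as too short would indicate that the corresponding primer tail needs to be extended before co-transformation.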
[0182] Like the gap repair technique, integration into the yeast genome also takes advantage of the homologous recombination system in yeast.
[0183] In embodiments, a cassette containing a coding region plus control elements (promoter and terminator) and auxotrophic marker is PCR-amplified with a high-fidelity DNA polymerase using primers that hybridize to the cassette and contain 40-70 base pairs of sequence homology to the regions 5' and 3' of the genomic area where insertion is desired. The PCR product is then transformed into yeast and plated on medium containing the appropriate compound mixtures that allow selection for the integrated auxotrophic marker. For example, to integrate "Gene X" into chromosomal location "Y", the promoter-coding region X-terminator construct is PCR amplified from a plasmid DNA construct and joined to an auxotrophic marker (such as URA3) by either SOE PCR or by common restriction digests and cloning. The full cassette, containing the promoter-coding region X-terminator-URA3 region, is PCR amplified with primer sequences that contain 40-70 bp of homology to the regions 5' and 3' of location "Y" on the yeast chromosome. The PCR product is transformed into yeast and selected on growth media lacking uracil. Transformants can be verified either by colony PCR or by direct sequencing of chromosomal DNA.
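In a non-limiting illustration, primers of the kind described above can be composed by joining a homology tail from the chromosomal target to a segment that anneals to the cassette. The sketch below assumes top-strand sequences are available for the region upstream of location "Y", the region downstream of it, and the full cassette. The 50 nt homology length (within the 40-70 bp range named in the text), the 20 nt annealing length, and all function names are assumptions for illustration; melting temperature and secondary structure are not evaluated.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def integration_primers(upstream_flank: str, downstream_flank: str, cassette: str,
                        homology: int = 50, anneal: int = 20):
    """Return (forward, reverse) primers, each written 5' to 3'.

    forward = last `homology` nt upstream of the target + first `anneal` nt of the cassette
    reverse = reverse complement of (last `anneal` nt of the cassette
              + first `homology` nt downstream of the target)
    """
    if not 40 <= homology <= 70:
        raise ValueError("homology length should fall in the 40-70 bp range")
    if len(upstream_flank) < homology or len(downstream_flank) < homology:
        raise ValueError("flanking sequence is shorter than the homology length")
    if len(cassette) < anneal:
        raise ValueError("cassette is shorter than the annealing length")
    forward = upstream_flank[-homology:] + cassette[:anneal]
    reverse = reverse_complement(cassette[-anneal:] + downstream_flank[:homology])
    return forward, reverse


if __name__ == "__main__":
    # Placeholder sequences standing in for the real locus and cassette.
    upstream = "A" * 60
    downstream = "G" * 60
    cassette = "ATG" + "C" * 30 + "TAA"
    fwd, rev = integration_primers(upstream, downstream, cassette)
    print(len(fwd), fwd)
    print(len(rev), rev)
```

The reverse primer is built as the reverse complement of the cassette end joined to the downstream flank, so that its 5' tail carries the chromosomal homology and its 3' end anneals to the cassette.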
[0184] A recombinant host cell disclosed herein can be cultured using standard laboratory techniques known in the art (see, e.g., Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202). The growth of the recombinant host cells disclosed herein can be measured by methods known in the art (see, e.g., Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202).
[0185] Applicants have provided a recombinant host cell comprising (a) a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and optionally (c) a modification in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity. In embodiments, such a recombinant host cell can have an improved redox balance, increased glucose consumption and/or increased formation of a product of a pyruvate-utilizing biosynthetic pathway. As such, Applicants have also provided methods of improving redox balance, increasing glucose consumption and/or increasing formation of a product of a pyruvate-utilizing biosynthetic pathway of a recombinant host cell comprising (a) a modification in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and optionally (c) a modification in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity.
[0186] Redox balance and glucose consumption of a recombinant host cell disclosed herein can be measured by methods known in the art (see, e.g., Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202). In a non-limiting example, glucose consumption can be measured by quantitating the amount of glucose in culture media by HPLC. Redox balance can be assessed indirectly, for example, by measuring glycerol formation, wherein more glycerol formation implies greater imbalance. Alternatively, redox balance can be assessed by direct analysis of NAD/NADH and NADP/NADPH pools by methods known in the art.
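In a non-limiting illustration, glucose readings obtained by HPLC at successive time points can be reduced to an average consumption rate, and glycerol formed per unit of glucose consumed can serve as the indirect redox indicator mentioned above. The sketch below assumes concentrations in g/L and times in hours. The function names and all numerical values are hypothetical placeholders and are not measurements from the disclosure.

```python
def average_consumption_rate(samples):
    """Average volumetric glucose consumption rate in g/L/h.

    `samples` is a list of (time_h, glucose_g_per_l) pairs, for example HPLC
    readings of culture medium; the first and last time points are used.
    """
    ordered = sorted(samples)
    if len(ordered) < 2:
        raise ValueError("need at least two time points")
    (t0, g0), (t1, g1) = ordered[0], ordered[-1]
    if t1 == t0:
        raise ValueError("time points must differ")
    return (g0 - g1) / (t1 - t0)


def glycerol_yield(glycerol_formed_g_per_l, glucose_consumed_g_per_l):
    """Glycerol formed per glucose consumed (g/g); a higher value suggests greater imbalance."""
    if glucose_consumed_g_per_l <= 0:
        raise ValueError("glucose consumed must be positive")
    return glycerol_formed_g_per_l / glucose_consumed_g_per_l


def fold_change(test_rate, control_rate):
    """Ratio of the test strain rate to the control strain rate."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return test_rate / control_rate


if __name__ == "__main__":
    # Hypothetical readings: (hours, g/L glucose remaining in the medium).
    control = [(0, 20.0), (24, 14.0)]   # e.g. host cell comprising (a) but not (b)
    test = [(0, 20.0), (24, 8.0)]       # e.g. host cell comprising (a) and (b)
    r_control = average_consumption_rate(control)
    r_test = average_consumption_rate(test)
    print(r_control, r_test, fold_change(r_test, r_control))
```

Comparing the two rates in this way corresponds to the comparison between a host cell comprising (a) and (b) and a host cell comprising (a) but not (b).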
[0187] In other embodiments, methods of producing a recombinant host cell are provided comprising (i) providing a recombinant host cell comprising a deletion, mutation, and/or substitution in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; (ii) transforming said recombinant host cell with a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and optionally (iii) introducing a deletion, mutation, and/or substitution in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity.
[0188] In other embodiments, methods for the conversion of hexose into hexose-6-phosphate are provided comprising (i) providing a recombinant host cell comprising a deletion, mutation, and/or substitution in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; (ii) transforming said recombinant host cell with a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and optionally (iii) introducing a deletion, mutation, and/or substitution in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity. In other embodiments, methods for the conversion of D-glucose into D-glucose 6-phosphate, D-fructose into D-fructose 6-phosphate, and/or D-mannose into D-mannose 6-phosphate are provided comprising (i) providing a recombinant host cell comprising a deletion, mutation, and/or substitution in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; (ii) transforming said recombinant host cell with a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and optionally (iii) introducing a deletion, mutation, and/or substitution in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity.
Engineered Biosynthetic Pathways Using Pyruvate
[0189] In embodiments, a recombinant host cell comprising (a) a modification (e.g., a deletion, mutation, and/or substitution) in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and optionally (c) a modification (e.g., a deletion, mutation, and/or substitution) in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity can be engineered to have a biosynthetic pathway for the production of a product of a biosynthetic pathway utilizing pyruvate. Such a recombinant host cell can exhibit an increased production of a product of a biosynthetic pathway utilizing pyruvate. As such, in embodiments, methods for the increased production of a product of a biosynthetic pathway utilizing pyruvate are also provided comprising (i) providing a recombinant host cell comprising (a) a modification (e.g., a deletion, mutation, and/or substitution) in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; and (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and (ii) growing the recombinant host cell under conditions wherein the product of the pyruvate-utilizing pathway is formed; wherein the amount of product formed by the recombinant host cell is greater than the amount of product formed by a recombinant host cell comprising (a) but not (b).
[0190] In other embodiments, methods for the increased production of a product of a biosynthetic pathway utilizing pyruvate are provided comprising (i) providing a recombinant host cell comprising (a) a modification (e.g., a deletion, mutation, and/or substitution) in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity; (b) a heterologous polynucleotide encoding a polypeptide having hexose kinase activity; and (c) a modification (e.g., a deletion, mutation, and/or substitution) in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase activity; and (ii) growing the recombinant host cell under conditions wherein the product of the pyruvate-utilizing pathway is formed; wherein the amount of product formed by the recombinant host cell is greater than the amount of product formed by a recombinant host cell comprising (a) and (c) but not (b).
[0191] A product from a pyruvate-utilizing biosynthetic pathway used in relation to a recombinant host cell disclosed herein includes, but is not limited to, 2,3-butanediol, isobutanol, 2-butanol, 1-butanol, 2-butanone, valine, leucine, lactic acid, malic acid, isoamyl alcohol, and/or isoprenoids. The features of any pyruvate-utilizing biosynthetic pathway can be engineered in a recombinant host cell disclosed herein in any order. Any product made using a biosynthetic pathway that has pyruvate as the initial substrate can be produced with greater effectiveness in a recombinant host cell disclosed herein. The biosynthetic pathway of a recombinant host cell disclosed herein can be any pathway that utilizes pyruvate and produces a desired product. In some embodiments at least one polynucleotide encoding a polypeptide that catalyzes a substrate to product conversion in biosynthetic pathway is heterologous. In some embodiments, one, two, three, four, or five substrate to product conversions of a biosynthetic pathway are catalyzed by polypeptides encoded by polynucleotides heterologous to the host cell. In some embodiments, the biosynthetic pathway comprises more than one polynucleotide that is heterologous to the yeast cell. In some embodiments, each substrate to product conversion of a biosynthetic pathway is catalyzed by polypeptides encoded by polynucleotides that are heterologous to the host cell. In some embodiments, the polypeptides are heterologous.
[0192] An example of a biosynthetic pathway for producing 2,3-butanediol can be engineered in a recombinant host cell disclosed herein, as disclosed in U.S. patent application Ser. No. 12/477,942. The 2,3-butanediol pathway is a portion of the 2-butanol biosynthetic pathway that is disclosed in U.S. Patent Application Publication No. US 2007/0292927 A1. Such pathway steps include, but are not limited to, conversion of pyruvate to acetolactate, for example by acetolactate synthase, conversion of acetolactate to acetoin, for example by acetolactate decarboxylase, and conversion of acetoin to 2,3-butanediol, for example by butanediol dehydrogenase. Butanediol dehydrogenase requires NADH and thereby contributes to redox balance. The skilled person will appreciate that polypeptides having the activity of such pathway steps can be isolated from a variety of sources and can be used in the recombinant host cells disclosed herein.
[0193] In addition, examples of biosynthetic pathways for production of 2-butanone or 2-butanol that can be engineered in a recombinant host cell disclosed herein are disclosed in U.S. Patent Application Publication Nos. US 2007/0292927 A1 and US 2007/0259410 A1. The pathway in U.S. Patent Application Publication No. US 2007/0292927 A1 is the same as disclosed for butanediol production with the addition of the following steps:
[0194] 2,3-butanediol to 2-butanone as catalyzed for example by diol dehydratase or glycerol dehydratase; and
[0195] 2-butanone to 2-butanol as catalyzed for example by butanol dehydrogenase.
[0196] U.S. Patent Application Publication No. US 2009/0155870 A1 discloses the construction of chimeric genes and genetic engineering of yeast for 2-butanol production using the biosynthetic pathway disclosed in U.S. Patent Application Publication No. US 2007/0292927 A1. Further description for gene construction and expression related to these pathways can be found, for example, in International Publication No. WO 2009/046370 (e.g., butanediol dehydratases); U.S. Patent Application Publication No. US 2009/0269823 A1 (e.g., butanol dehydrogenase); and U.S. Patent Application Publication No. US 2007/0259410 A1. The skilled person will appreciate that polypeptides having the activity of such pathway steps can be isolated from a variety of sources and can be used in the recombinant host cells disclosed herein.
[0197] Examples of biosynthetic pathways for production of isobutanol that can be engineered in a recombinant host cell disclosed herein are also provided in U.S. Patent Application Publication No. US 2007/0092957 A1. As disclosed in U.S. Patent Application Publication No. US 2007/0092957 A1, steps in an example isobutanol biosynthetic pathway include conversion of:
[0198] pyruvate to acetolactate as catalyzed by acetolactate synthase;
[0199] acetolactate to 2,3-dihydroxyisovalerate as catalyzed for example by acetohydroxy acid isomeroreductase, also called ketol-acid reductoisomerase;
[0200] 2,3-dihydroxyisovalerate to 2-ketoisovalerate as catalyzed for example by acetohydroxy acid dehydratase, also called dihydroxy-acid dehydratase;
[0201] 2-ketoisovalerate to isobutyraldehyde as catalyzed for example by branched-chain α-keto acid decarboxylase; and
[0202] isobutyraldehyde to isobutanol as catalyzed for example by branched-chain alcohol dehydrogenase.
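In a non-limiting illustration, the five conversions listed above can be held as an ordered list of substrate, product and enzyme records, which makes it straightforward to confirm that each step's product is the next step's substrate. The sketch below transcribes only the steps recited above; the `Step` type and the helper function names are hypothetical and are introduced for illustration.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    enzyme: str


# Steps transcribed from the isobutanol pathway listed above.
ISOBUTANOL_PATHWAY: List[Step] = [
    Step("pyruvate", "acetolactate", "acetolactate synthase"),
    Step("acetolactate", "2,3-dihydroxyisovalerate",
         "acetohydroxy acid isomeroreductase (ketol-acid reductoisomerase)"),
    Step("2,3-dihydroxyisovalerate", "2-ketoisovalerate",
         "acetohydroxy acid dehydratase (dihydroxy-acid dehydratase)"),
    Step("2-ketoisovalerate", "isobutyraldehyde",
         "branched-chain alpha-keto acid decarboxylase"),
    Step("isobutyraldehyde", "isobutanol", "branched-chain alcohol dehydrogenase"),
]


def is_contiguous(pathway: List[Step]) -> bool:
    """True if every step's product is the substrate of the following step."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def route(pathway: List[Step]) -> List[str]:
    """Ordered list of metabolites from the initial substrate to the final product."""
    if not pathway:
        return []
    return [pathway[0].substrate] + [step.product for step in pathway]


if __name__ == "__main__":
    print(is_contiguous(ISOBUTANOL_PATHWAY))
    print(" -> ".join(route(ISOBUTANOL_PATHWAY)))
```

The same record structure could hold any of the other pyruvate-utilizing pathways described herein, with each heterologous step flagged separately if desired.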
[0203] Further description for gene construction and expression related to this pathway can be found, for example, in U.S. Patent Application Publication Nos. US 2008/0261230 A1 and US 2009/0269823 A1. The skilled person will appreciate that polypeptides having the activity of such pathway steps can be isolated from a variety of sources and can be used in a recombinant host cell disclosed herein. Suitable proteins having the ability to catalyze the indicated substrate to product conversions are described in the art. For example, US Published Patent Application Nos. US20080261230 and US20090163376, US20100197519, and U.S. application Ser. No. 12/893,077 describe acetohydroxy acid isomeroreductases; US20070092957 and US20100081154, describe suitable dihydroxyacid dehydratases; suitable alcohol dehydrogenases are described in US Published Patent Application US20090269823 and U.S. Provisional Patent Application No. 61/290,636.
[0204] An example of a biosynthetic pathway for production of 1-butanol that can be engineered in a recombinant host cell disclosed herein is disclosed in U.S. Patent Application Publication No. US 2008/0182308 A1. As disclosed in this publication, steps in the disclosed 1-butanol biosynthetic pathway include conversion of:
[0205] acetyl-CoA to acetoacetyl-CoA, as catalyzed for example by acetyl-CoA acetyltransferase;
[0206] acetoacetyl-CoA to 3-hydroxybutyryl-CoA, as catalyzed for example by 3-hydroxybutyryl-CoA dehydrogenase;
[0207] 3-hydroxybutyryl-CoA to crotonyl-CoA, as catalyzed for example by crotonase;
[0208] crotonyl-CoA to butyryl-CoA, as catalyzed for example by butyryl-CoA dehydrogenase;
[0209] butyryl-CoA to butyraldehyde, as catalyzed for example by butyraldehyde dehydrogenase; and
[0210] butyraldehyde to 1-butanol, as catalyzed for example by butanol dehydrogenase.
[0211] Genes that may be used for expression of these enzymes are disclosed, for example, in U.S. Patent Application Publication No. US 2008/0182308 A1, and additional genes that can be used can be identified by one skilled in the art.
[0212] An example of a biosynthetic pathway for production of valine that can be engineered in a recombinant host cell disclosed herein includes the steps of acetolactate conversion to 2,3-dihydroxy-isovalerate by acetohydroxyacid reductoisomerase (ILV5), conversion of 2,3-dihydroxy-isovalerate to 2-keto-isovalerate by dihydroxy-acid dehydratase (ILV3), and conversion of 2-keto-isovalerate to valine by branched-chain amino acid transaminase (BAT2) and branched-chain amino acid aminotransferase (BAT1). Biosynthesis of leucine includes the same steps to 2-keto-isovalerate, followed by conversion of 2-keto-isovalerate to alpha-isopropylmalate by alpha-isopropylmalate synthase (LEU9, LEU4), conversion of alpha-isopropylmalate to beta-isopropylmalate by isopropylmalate isomerase (LEU1), conversion of beta-isopropylmalate to alpha-ketoisocaproate by beta-IPM dehydrogenase (LEU2), and finally conversion of alpha-ketoisocaproate to leucine by branched-chain amino acid transaminase (BAT2) and branched-chain amino acid aminotransferase (BAT1). It is desired for production of valine or leucine to overexpress at least one of the enzymes in these disclosed pathways.
[0213] An example of a biosynthetic pathway for production of isoamyl alcohol that can be engineered in a recombinant host cell disclosed herein includes the steps of leucine conversion to alpha-ketoisocaproate by branched-chain amino acid transaminase (BAT2) and branched-chain amino acid aminotransferase (BAT1), conversion of alpha-ketoisocaproate to 3-methylbutanal by ketoisocaproate decarboxylase (THI3) or decarboxylase ARO10, and finally conversion of 3-methylbutanal to isoamyl alcohol by an alcohol dehydrogenase such as ADH1 or SFA1. Production of isoamyl alcohol benefits from increased production of leucine or the alpha-ketoisocaproate intermediate by overexpression of one or more enzymes in biosynthetic pathways for these chemicals. In addition, one or both enzymes for the final two steps can be overexpressed.
[0214] An example of a biosynthetic pathway for production of lactic acid that can be engineered in a recombinant host cell disclosed herein includes pyruvate conversion to lactic acid by lactate dehydrogenase. Engineering yeast for lactic acid production using lactate dehydrogenase, known as EC 1.1.1.27, is well known in the art, such as in Ishida et al. (Appl. Environ. Microbiol. 71:1964-70 (2005)).
[0215] An example of a biosynthetic pathway for production of malate that can be engineered in a recombinant host cell disclosed herein includes pyruvate conversion to oxaloacetate by pyruvate carboxylase, and conversion of oxaloacetate to malate by malate dehydrogenase as disclosed in Zelle et al. (Appl. Environ. Microbiol. 74:2766-77 (2008)). In addition, a malate transporter can be expressed.
[0216] Examples of biosynthetic pathways for production of isoprenoids can also be engineered in a recombinant host cell disclosed herein. In a non-limiting example, a mevalonate pathway can be used (Martin et al. (2003) Nature Biotech. 21:796-802) which includes the conversion of pyruvate to acetyl-CoA, which is converted to acetoacetyl-CoA, which is converted to 3-hydroxy-3-methylglutaryl-CoA, which is converted to mevalonate and then to isoprenoids. In another non-limiting example, a non-mevalonate pathway is described by Kim and Keasling (Biotechnol. Bioeng. 72:408-15 (2001)).
[0217] The skilled person will appreciate that polypeptides having activities of the above-mentioned biosynthetic pathways can be isolated from a variety of sources and can be used in a recombinant host cell disclosed herein.
Additional Modifications
[0218] Additional modifications that may be useful in cells provided herein include modifications to reduce glycerol-3-phosphate dehydrogenase activity as described in US Patent Application Publication No. 20090305363 (incorporated herein by reference), modifications to a host cell that provide for increased carbon flux through an Entner-Doudoroff Pathway or reducing equivalents balance as described in US Patent Application Publication No. 20100120105 (incorporated herein by reference). Yeast strains with increased activity of heterologous proteins that require binding of an Fe--S cluster for their activity are described in US Application Publication No. 20100081179 (incorporated herein by reference). Other modifications include integration of at least one polynucleotide encoding a polypeptide that catalyzes a step in a pyruvate-utilizing biosynthetic pathway described in U.S. Provisional Application No. 61/380,563 (both referenced provisional applications are incorporated herein by reference in their entirety). Additional modifications that may be suitable for embodiments herein are described in U.S. application Ser. No. 12/893,089.
[0219] Additionally, host cells comprising at least one deletion, mutation, and/or substitution in an endogenous gene encoding a polypeptide affecting Fe--S cluster biosynthesis are described in U.S. Provisional Patent Application No. 61/305,333 (incorporated herein by reference), and host cells comprising a heterologous polynucleotide encoding a polypeptide with phosphoketolase activity and host cells comprising a heterologous polynucleotide encoding a polypeptide with phosphotransacetylase activity are described in US Provisional Patent Application No. 61/356,379.
Growth for Production
[0220] A recombinant host cell disclosed herein is grown in fermentation media which contains a suitable carbon substrate. Carbon substrates can include, but are not limited to, monosaccharides such as fructose or galactose, oligosaccharides such as lactose, maltose, or sucrose, polysaccharides such as starch or cellulose or mixtures thereof and unpurified mixtures from renewable feedstocks such as cheese whey permeate, cornsteep liquor, sugar beet molasses, and barley malt. Other carbon substrates can include ethanol, lactate, succinate, or glycerol.
[0221] Additionally, the carbon substrate can also be a one-carbon substrate such as carbon dioxide or methanol, for which metabolic conversion into key biochemical intermediates has been demonstrated. In addition to one and two carbon substrates, methylotrophic organisms are also known to utilize a number of other carbon containing compounds such as methylamine, glucosamine and a variety of amino acids for metabolic activity. For example, methylotrophic yeasts are known to utilize the carbon from methylamine to form trehalose or glycerol (Bellion et al., Microb. Growth C1 Compd., [Int. Symp.], 7th (1993), 415-32, Editor(s): Murrell, J. Collin; Kelly, Don P. Publisher: Intercept, Andover, UK). Similarly, various species of Candida will metabolize alanine or oleic acid (Sulter et al., Arch. Microbiol. 153:485-489 (1990)). Hence it is contemplated that the source of carbon utilized in the present invention can encompass a wide variety of carbon containing substrates and will only be limited by the choice of organism.
[0222] Although it is contemplated that all of the above mentioned carbon substrates and mixtures thereof are suitable in the present invention, in some embodiments, the carbon substrates can be glucose, fructose, and sucrose, or mixtures of these with C5 sugars such as xylose and/or arabinose for yeast cells modified to use C5 sugars. Sucrose can be derived from renewable sugar sources such as sugar cane, sugar beets, cassava, sweet sorghum, and mixtures thereof. Glucose can be derived from renewable grain sources through saccharification of starch based feedstocks including grains such as corn, wheat, rye, barley, oats, and mixtures thereof. In addition, fermentable sugars can be derived from renewable cellulosic or lignocellulosic biomass through processes of pretreatment and saccharification, as described, for example, in U.S. Patent Application Publication No. US 20070031918 A1.
[0223] Biomass refers to any cellulosic or lignocellulosic material and includes materials comprising cellulose, and optionally further comprising hemicellulose, lignin, starch, oligosaccharides and/or monosaccharides. Biomass can also comprise additional components, such as protein and/or lipid. Biomass can be derived from a single source, or biomass can comprise a mixture derived from more than one source; for example, biomass can comprise a mixture of corn cobs and corn stover, or a mixture of grass and leaves. Biomass includes, but is not limited to, bioenergy crops, agricultural residues, municipal solid waste, industrial solid waste, sludge from paper manufacture, yard waste, wood and forestry waste. Examples of biomass include, but are not limited to, corn grain, corn cobs, crop residues such as corn husks, corn stover, grasses, wheat, wheat straw, barley, barley straw, hay, rice straw, switchgrass, waste paper, sugar cane bagasse, sorghum, soy, components obtained from milling of grains, trees, branches, roots, leaves, wood chips, sawdust, shrubs and bushes, vegetables, fruits, flowers, animal manure, and mixtures thereof.
[0224] In addition to an appropriate carbon source, fermentation media can contain suitable minerals, salts, cofactors, buffers and other components, known to those skilled in the art, suitable for the growth of the cultures.
Culture Conditions
[0225] Typically cells are grown at a temperature in the range of about 20° C. to about 40° C. in an appropriate medium. Suitable growth media in the present invention are common commercially prepared media such as Luria Bertani (LB) broth, Sabouraud Dextrose (SD) broth, Yeast Medium (YM) broth, or broth that includes yeast nitrogen base, ammonium sulfate, and dextrose (as the carbon/energy source) or YPD Medium, a blend of peptone, yeast extract, and dextrose in optimal proportions for growing most Saccharomyces cerevisiae strains. Other defined or synthetic growth media can also be used, and the appropriate medium for growth of the particular microorganism will be known by one skilled in the art of microbiology or fermentation science. The use of agents known to modulate catabolite repression directly or indirectly, e.g., cyclic adenosine 2':3'-monophosphate, can also be incorporated into the fermentation medium.
[0226] Suitable pH ranges for the fermentation are between about pH 5.0 to about pH 9.0. In one embodiment, about pH 6.0 to about pH 8.0 can be used for the initial condition. Suitable pH ranges for the fermentation of yeast are typically between about pH 3.0 to about pH 9.0. In one embodiment, about pH 5.0 to about pH 8.0 can be used for the initial condition. Suitable pH ranges for the fermentation of other microorganisms are between about pH 3.0 to about pH 7.5. In one embodiment, about pH 4.5 to about pH 6.5 can be used for the initial condition.
[0227] Fermentations can be performed under aerobic or anaerobic conditions. In one embodiment, anaerobic or microaerobic conditions can be used for fermentations.
Industrial Batch and Continuous Fermentations
[0228] The recombinant host cells disclosed herein can be produced using a batch method of fermentation. A classical batch fermentation is a closed system where the composition of the medium is set at the beginning of the fermentation and not subject to artificial alterations during the fermentation. A variation on the standard batch system is the fed-batch system. Fed-batch fermentation processes are also suitable in the present invention and comprise a typical batch system with the exception that the substrate is added in increments as the fermentation progresses. Fed-batch systems are useful when catabolite repression is apt to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the media. Batch and fed-batch fermentations are common and well known in the art and examples may be found in Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition (1989) Sinauer Associates, Inc., Sunderland, Mass., or Deshpande, Mukund V., Appl. Biochem. Biotechnol., 36:227, (1992).
[0229] A product of a pyruvate-utilizing biosynthetic pathway related to a recombinant host cell disclosed herein can also be produced using continuous fermentation methods. Continuous fermentation is an open system where a defined fermentation medium is added continuously to a bioreactor and an equal amount of conditioned medium is removed simultaneously for processing. Continuous fermentation generally maintains the cultures at a constant high density where cells are primarily in log phase growth. Continuous fermentation allows for the modulation of one factor or any number of factors that affect cell growth or end product concentration. Methods of modulating nutrients and growth factors for continuous fermentation processes as well as techniques for maximizing the rate of product formation are well known in the art of industrial microbiology and a variety of methods are detailed by Brock, supra.
[0230] It is contemplated that production of a product of a pyruvate-utilizing biosynthetic pathway can be practiced using batch, fed-batch or continuous processes and that any known mode of fermentation would be suitable. Additionally, it is contemplated that a recombinant host cell disclosed herein can be immobilized on a substrate as whole cell catalysts and subjected to fermentation conditions for isobutanol production.
Methods for Product Isolation from the Fermentation Medium
[0231] A product of a pyruvate-utilizing biosynthetic pathway can be isolated from the fermentation medium using methods known in the art for acetone-butanol-ethanol (ABE) fermentations (see, e.g., Durre, Appl. Microbiol. Biotechnol. 49:639-648 (1998), Groot et al., Process. Biochem. 27:61-75 (1992), and references therein). For example, solids can be removed from the fermentation medium by centrifugation, filtration, decantation, or the like. Then, the product can be isolated from the fermentation medium using methods such as distillation, azeotropic distillation, liquid-liquid extraction, adsorption, gas stripping, membrane evaporation, or pervaporation.
[0232] Where a product (e.g., isobutanol) forms a low boiling point, azeotropic mixture with water, distillation can be used to separate the mixture up to its azeotropic composition. Distillation can be used in combination with another separation method to obtain separation around the azeotrope. Methods that can be used in combination with distillation to isolate and purify butanol include, but are not limited to, decantation, liquid-liquid extraction, adsorption, and membrane-based techniques. Additionally, butanol may be isolated using azeotropic distillation using an entrainer (see, e.g., Doherty and Malone, Conceptual Design of Distillation Systems, McGraw Hill, New York, 2001).
[0233] The butanol-water mixture forms a heterogeneous azeotrope so that distillation can be used in combination with decantation to isolate and purify the isobutanol. In this method, the isobutanol containing fermentation broth is distilled to near the azeotropic composition. Then, the azeotropic mixture is condensed, and the isobutanol is separated from the fermentation medium by decantation. The decanted aqueous phase can be returned to the first distillation column as reflux. The isobutanol-rich decanted organic phase can be further purified by distillation in a second distillation column.
[0234] A product of a pyruvate-utilizing biosynthetic pathway can also be isolated from the fermentation medium using liquid-liquid extraction in combination with distillation. In this method, the product (e.g., isobutanol) can be extracted from the fermentation broth using liquid-liquid extraction with a suitable solvent. The product-containing organic phase can then be distilled to separate the product from the solvent.
[0235] Distillation in combination with adsorption can also be used to isolate a product (e.g., isobutanol) from the fermentation medium. In this method, the fermentation broth containing the product is distilled to near the azeotropic composition and then the remaining water is removed by use of an adsorbent, such as molecular sieves (Aden et al. Lignocellulosic Biomass to Ethanol Process Design and Economics Utilizing Co-Current Dilute Acid Prehydrolysis and Enzymatic Hydrolysis for Corn Stover, Report NREL/TP-510-32438, National Renewable Energy Laboratory, June 2002).
[0236] Additionally, distillation in combination with pervaporation can be used to isolate and purify a product (e.g., isobutanol) from the fermentation medium. In this method, the fermentation broth containing the product is distilled to near the azeotropic composition, and then the remaining water is removed by pervaporation through a hydrophilic membrane (Guo et al., J. Membr. Sci. 245, 199-210 (2004)).
EXAMPLES
[0237] The present invention is further defined in the following Examples. It should be understood that these Examples, while indicating embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various uses and conditions.
General Methods
[0238] Standard recombinant DNA and molecular cloning techniques used in the Examples are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, N.Y. (1989) (Maniatis) and by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1984) and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, pub. by Greene Publishing Assoc. and Wiley-Interscience (1987), and by Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
[0239] Materials and methods suitable for the maintenance and growth of microbial cultures are well known in the art. Techniques suitable for use in the following Examples may be found as set out in Manual of Methods for General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, eds.), American Society for Microbiology, Washington, D.C. (1994) or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland, Mass. (1989). All reagents, restriction enzymes and materials used for the growth and maintenance of microbial cells were obtained from Aldrich Chemicals (Milwaukee, Wis.), BD Diagnostic Systems (Sparks, Md.), Life Technologies (Rockville, Md.), or Sigma Chemical Company (St. Louis, Mo.) unless otherwise specified. Microbial strains were obtained from The American Type Culture Collection (ATCC), Manassas, Va., unless otherwise noted. The oligonucleotide primers used in the following Examples are given in the following Tables. All the oligonucleotide primers were synthesized by Sigma-Genosys (Woodlands, Tex.) or Integrated DNA Technologies (Coralville, Iowa).
[0240] Synthetic complete medium is described by Amberg, Burke and Strathern, 2005, Methods in Yeast Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
GC Method
[0241] The GC method utilized a ZB-WAXplus column (30 m × 0.25 mm ID, 0.25 μm film) from Phenomenex (Torrance, Calif.). The carrier gas was helium at a constant flow rate of 2.3 mL/min; injector split was 1:20 at 250° C.; oven temperature was 70° C. for 1 min, 70° C. to 160° C. at 10° C./min, and 160° C. to 240° C. at 30° C./min. FID detection was used at 260° C. with 40 mL/min helium makeup gas. Culture broth samples were filtered through 0.2 μm spin filters before injection. Depending on analytical sensitivity desired, either 0.1 μL or 0.5 μL injection volumes were used. Calibrated standard curves were generated for the following compounds: ethanol, isobutanol, acetoin, meso-2,3-butanediol, and (2S,3S)-2,3-butanediol. (2S,3S)-2,3-butanediol retention time is 6.8 minutes. meso-2,3-butanediol retention time is 7.2 minutes. Analytical standards were also utilized to identify retention times for isobutyraldehyde, isobutyric acid, and isoamyl alcohol.
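The oven program stated above (a 1 min hold at 70 degrees C., a ramp to 160 degrees C. at 10 degrees C./min, then a ramp to 240 degrees C. at 30 degrees C./min) fixes the length of the programmed portion of a run. The following Python sketch works through that arithmetic and reports the nominal oven setpoint at a given time. It assumes no final hold at 240 degrees C., since none is stated, and the function names are illustrative only.

```python
# Oven program as stated in the GC method.
INITIAL_TEMP_C = 70.0
INITIAL_HOLD_MIN = 1.0
RAMPS = [(160.0, 10.0), (240.0, 30.0)]  # (target temperature in C, rate in C/min)


def program_minutes():
    """Total programmed time, assuming no final hold (an assumption)."""
    total, temp = INITIAL_HOLD_MIN, INITIAL_TEMP_C
    for target, rate in RAMPS:
        total += (target - temp) / rate
        temp = target
    return total


def oven_temp_at(t_min):
    """Nominal oven setpoint (C) at t_min minutes after injection."""
    if t_min <= INITIAL_HOLD_MIN:
        return INITIAL_TEMP_C
    elapsed, temp = INITIAL_HOLD_MIN, INITIAL_TEMP_C
    for target, rate in RAMPS:
        duration = (target - temp) / rate
        if t_min <= elapsed + duration:
            return temp + rate * (t_min - elapsed)
        elapsed += duration
        temp = target
    return temp  # past the end of the last ramp


if __name__ == "__main__":
    print(f"programmed time: {program_minutes():.2f} min")
    for rt in (6.8, 7.2):  # stated 2,3-butanediol retention times
        print(f"setpoint at {rt} min: {oven_temp_at(rt):.0f} C")
```

Under these assumptions the programmed portion lasts about 12.67 min (1 + 9 + 2.67), and both stated 2,3-butanediol retention times fall within the first ramp.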
HPLC Method
[0242] Analysis for glucose and fermentation by-product composition is well known to those skilled in the art. For example, one high performance liquid chromatography (HPLC) method utilizes a Shodex SH-1011 column with a Shodex SH-G guard column (both available from Waters Corporation, Milford, Mass.), with refractive index (RI) detection. Chromatographic separation is achieved using 0.01 M H2SO4 as the mobile phase with a flow rate of 0.5 mL/min and a column temperature of 50° C. Isobutanol retention time is 47.6 minutes.
[0243] The meaning of abbreviations is as follows: "s" means second(s), "min" means minute(s), "h" means hour(s), "psi" means pounds per square inch, "nm" means nanometers, "d" means day(s), "μL" means microliter(s), "mL" means milliliter(s), "L" means liter(s), "mm" means millimeter(s), "mM" means millimolar, "M" means molar, "mmol" means millimole(s), "μmol" means micromole(s), "g" means gram(s), "μg" means microgram(s) and "ng" means nanogram(s), "PCR" means polymerase chain reaction, "OD" means optical density, "OD600" means the optical density measured at a wavelength of 600 nm, "kDa" means kilodaltons, "g" means the gravitation constant, "bp" means base pair(s), "kbp" means kilobase pair(s), "% w/v" means weight/volume percent, "% v/v" means volume/volume percent, "wt %" means percent by weight, "HPLC" means high performance liquid chromatography, and "GC" means gas chromatography. The term "molar selectivity" is the number of moles of product produced per mole of sugar substrate consumed and is reported as a percent. "SLPM" stands for Standard Liters per Minute (of air), "dO" is dissolved oxygen, and "Qp" is "specific productivity" measured in grams isobutanol per gram of cells over time. The term "nt" means nucleotides.
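The definition of molar selectivity given above (moles of product per mole of sugar substrate consumed, reported as a percent) can be illustrated with a short calculation. The Python sketch below is a reading aid only: the function name is an assumption, and the input masses are hypothetical values chosen for illustration, not data from the Examples. The molar masses are the standard values for isobutanol (C4H10O) and glucose (C6H12O6).

```python
ISOBUTANOL_G_PER_MOL = 74.12   # C4H10O
GLUCOSE_G_PER_MOL = 180.16     # C6H12O6


def molar_selectivity_percent(product_g, product_g_per_mol,
                              sugar_consumed_g, sugar_g_per_mol):
    """Moles of product per mole of sugar consumed, as a percent."""
    if sugar_consumed_g <= 0:
        raise ValueError("sugar consumed must be positive")
    product_mol = product_g / product_g_per_mol
    sugar_mol = sugar_consumed_g / sugar_g_per_mol
    return 100.0 * product_mol / sugar_mol


if __name__ == "__main__":
    # Hypothetical illustration: 2.0 g isobutanol from 10.0 g glucose consumed.
    value = molar_selectivity_percent(2.0, ISOBUTANOL_G_PER_MOL,
                                      10.0, GLUCOSE_G_PER_MOL)
    print(f"{value:.1f}%")  # 48.6%
```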
Example 1
Construction of Expression Vectors for Isobutanol Pathway Gene Expression in S. cerevisiae
[0244] pLH475-Z4B8 Construction
[0245] The pLH475-Z4B8 plasmid (SEQ ID NO: 29) was constructed for expression of ALS and KARI in yeast. pLH475-Z4B8 is a pHR81 vector (ATCC #87541) containing the following chimeric genes: A) CUP1 promoter region derived sequence (SEQ ID NO: 30), acetolactate synthase coding region from Bacillus subtilis (AlsS; SEQ ID NOs: 31 and 32) and a CYC1 terminator region derived sequence ("CYC1 terminator 2"; SEQ ID NO: 33); B) ILV5 promoter region derived sequence (SEQ ID NO: 34), Pf5.IlvC-Z4B8 coding region (SEQ ID NOs: 37 and 38) and ILV5 terminator region derived sequence (SEQ ID NO: 35); and C) FBA1 promoter region derived sequence (SEQ ID NO: 36), S. cerevisiae KARI coding region (ILV5; SEQ ID NOs: 39 and 40) and CYC1 terminator region derived sequence.
[0246] The Pf5.IlvC-Z4B8 coding region is a sequence encoding KARI derived from Pseudomonas fluorescens with certain mutations, as disclosed in U.S. Patent Application Publication No. US 2009-0163376 A1. More specifically, the Pf5.IlvC-Z4B8 encoded KARI (SEQ ID NO: 38) has the following amino acid changes as compared to the natural Pseudomonas fluorescens KARI:
C33L: cysteine at position 33 changed to leucine, R47Y: arginine at position 47 changed to tyrosine, S50A: serine at position 50 changed to alanine, T52D: threonine at position 52 changed to aspartic acid, V53A: valine at position 53 changed to alanine, L61F: leucine at position 61 changed to phenylalanine, T80I: threonine at position 80 changed to isoleucine, A156V: alanine at position 156 changed to valine, and G170A: glycine at position 170 changed to alanine.
[0247] The Pf5.IlvC-Z4B8 coding region (SEQ ID NO: 37) was synthesized by DNA 2.0 (Palo Alto, Calif.) based on codons that were optimized for expression in Saccharomyces cerevisiae.
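The substitution codes above follow the usual convention of original residue, position, and replacement residue, written with the standard one-letter amino acid codes (in which D denotes aspartic acid and V denotes valine). As a reading aid, the following minimal Python sketch expands each code into words. The names `describe` and `Z4B8_CHANGES` are illustrative and are not part of the disclosure.

```python
import re

# Standard one-letter amino acid codes.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "E": "glutamic acid", "Q": "glutamine", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

# Original residue, position, replacement residue, e.g. "C33L".
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def describe(code):
    """Expand a substitution code such as 'C33L' into words."""
    match = MUTATION.match(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    original, position, replacement = match.groups()
    if original not in AMINO_ACIDS or replacement not in AMINO_ACIDS:
        raise ValueError(f"unknown residue letter in {code!r}")
    return (f"{code}: {AMINO_ACIDS[original]} at position {position} "
            f"changed to {AMINO_ACIDS[replacement]}")


# Substitution codes listed for the Pf5.IlvC-Z4B8 encoded KARI.
Z4B8_CHANGES = ["C33L", "R47Y", "S50A", "T52D", "V53A",
                "L61F", "T80I", "A156V", "G170A"]

if __name__ == "__main__":
    for change in Z4B8_CHANGES:
        print(describe(change))
```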
pLH475-JEA1 Construction
[0248] The pLH475-JEA1 plasmid (SEQ ID NO: 128) was constructed for expression of ALS and KARI in yeast. pLH475-JEA1 is a pHR81 vector (ATCC #87541) containing the following chimeric genes: 1) the CUP1 promoter (SEQ ID NO: 30), acetolactate synthase coding region from Bacillus subtilis (AlsS; SEQ ID NOs: 31 and 32) and CYC1 terminator 2 (SEQ ID NO: 33); 2) an ILV5 promoter (SEQ ID NO: 34), Pf5.IlvC-JEA1 coding region and ILV5 terminator (SEQ ID NO: 35); and 3) the FBA1 promoter (SEQ ID NO: 36), S. cerevisiae KARI coding region (ILV5; SEQ ID NOs: 39 and 40) and CYC1 terminator.
[0249] The Pf5.IlvC-JEA1 coding region is a sequence encoding KARI derived from Pseudomonas fluorescens with certain mutations, as disclosed in U.S. Patent Application Publication 20090163376A1. More specifically, the Pf5.IlvC-JEA1 encoded KARI (nucleic acid and amino acid sequences of SEQ ID NOs: 41 and 42, respectively) has the following amino acid changes as compared to the natural Pseudomonas fluorescens KARI:
Y24F: tyrosine at position 24 changed to phenylalanine, C33L: cysteine at position 33 changed to leucine, R47P: arginine at position 47 changed to proline, S50F: serine at position 50 changed to phenylalanine, T52D: threonine at position 52 changed to aspartic acid, L61F: leucine at position 61 changed to phenylalanine, T80I: threonine at position 80 changed to isoleucine, and A156V: alanine at position 156 changed to valine.
Expression Vector pLH468
[0250] The pLH468 plasmid (SEQ ID NO: 43) was constructed for expression of DHAD, KivD and HADH in yeast. Coding regions for Lactococcus lactis ketoisovalerate decarboxylase (KivD) and horse liver alcohol dehydrogenase (HADH) were synthesized by DNA2.0 based on codons that were optimized for expression in Saccharomyces cerevisiae (SEQ ID NO: 44 and 45) and provided in plasmids pKivDy-DNA2.0 and pHadhy-DNA2.0. The encoded proteins are SEQ ID NOs: 47 and 46, respectively. Individual expression vectors for KivD and HADH were constructed. To assemble pLH467 (pRS426::P.sub.TDH3-kivDy-TDH3t), vector pNY8 (SEQ ID NO: 48; also named pRS426.GPD-ald-GPDt, disclosed in U.S. Patent Application Publication No. US 2008/0182308 A1, Example 17) was digested with AscI and SfiI enzymes, thus excising the GPD promoter region derived sequence and the ald coding region. A TDH3 promoter region derived sequence fragment (SEQ ID NO: 49) from pNY8 was PCR amplified to add an AscI site at the 5' end, and an SpeI site at the 3' end, using 5' primer OT1068 and 3' primer OT1067 (SEQ ID NO: 50 and 51). The AscI/SfiI digested pNY8 vector fragment was ligated with the TDH3 promoter PCR product digested with AscI and SpeI, and the SpeI-SfiI fragment containing the codon optimized kivD coding region isolated from the vector pKivDy-DNA2.0. The triple ligation generated vector pLH467 (pRS426::P.sub.TDH3-kivDy-TDH3t). pLH467 (SEQ ID NO: 142) was verified by restriction mapping and sequencing.
[0251] pLH435 (pRS425::P.sub.GPM1-Hadhy-ADH1t) was derived from vector pRS425::GPM-sadB (SEQ ID NO: 52) which is disclosed in U.S. Provisional Patent Application No. 61/058,970, Example 3. pRS425::GPM-sadB is the pRS425 vector (ATCC #77106) with a chimeric gene containing a GPM1 promoter region derived sequence (SEQ ID NO: 53), a coding region from a butanol dehydrogenase of Achromobacter xylosoxidans (sadB; SEQ ID NO: 55, disclosed in U.S. Patent Application No. 61/048,291; amino acid SEQ ID NO: 56), and an ADH1 terminator region derived sequence (SEQ ID NO: 54). pRS425::GPMp-sadB contains BbvI and PacI sites at the 5' and 3' ends of the sadB coding region, respectively. A NheI site was added at the 5' end of the sadB coding region by site-directed mutagenesis using primers OT1074 and OT1075 (SEQ ID NO: 57 and 58) to generate vector pRS425-GPMp-sadB-NheI, which was verified by sequencing. pRS425::P.sub.GPMP-sadB-NheI was digested with NheI and PacI to drop out the sadB coding region, and ligated with the NheI-PacI fragment containing the codon optimized HADH coding region from vector pHadhy-DNA2.0 to create pLH435 (SEQ ID NO: 143).
[0252] To combine KivD and HADH expression cassettes in a single vector, yeast vector pRS411 (ATCC #87474) was digested with SacI and NotI, and ligated with the SacI-SalI fragment from pLH467 that contains the P.sub.TDH3-kivDy-TDH3t cassette together with the SalI-NotI fragment from pLH435 that contains the P.sub.GPM1-Hadhy-ADH1t cassette in a triple ligation reaction. This yielded the vector pRS411::P.sub.TDH3-kivDy-P.sub.GPM1-Hadhy (pLH441 SEQ ID NO: 144), which was verified by restriction mapping.
[0253] In order to generate a co-expression vector for all three genes in the lower isobutanol pathway: ilvD, kivDy and Hadhy, we used pRS423 FBA ilvD(Strep) (SEQ ID NO: 59), which is disclosed in U.S. Patent Application No. 61/100,792, as the source of the ilvD gene. This shuttle vector contains an F1 origin of replication (nt 1423 to 1879) for maintenance in E. coli and a 2 micron origin (nt 8082 to 9426) for replication in yeast. The vector has an FBA1 promoter region derived sequence (nt 2111 to 3108; SEQ ID NO: 36) and FBA1 terminator region derived sequence (nt 4861 to 5860; SEQ ID NO: 60). In addition, it carries the HIS3 marker (nt 504 to 1163) for selection in yeast and ampicillin resistance marker (nt 7092 to 7949) for selection in E. coli. The ilvD coding region (nt 3116 to 4828; the ilvD coding region of the vector is SEQ ID NO: 62 and the wild-type protein sequence of ilvD is SEQ ID NO: 63) from Streptococcus mutans UA159 (ATCC #700610) is between the FBA1 promoter region derived sequence and FBA1 terminator region derived sequence, forming a chimeric gene for expression. In addition there is a Lumio tag fused to the ilvD coding region (nt 4829-4849).
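The nucleotide coordinates given above for pRS423 FBA ilvD(Strep) can be tabulated to check feature lengths and confirm that no two features overlap. The Python sketch below is illustrative only and assumes the stated coordinates are 1-based and inclusive, which the text does not say explicitly; the names `FEATURES`, `length`, and `overlapping_pairs` are hypothetical.

```python
# Feature coordinates (nt) as stated for pRS423 FBA ilvD(Strep).
# Assumption: coordinates are 1-based and inclusive.
FEATURES = {
    "HIS3 marker": (504, 1163),
    "F1 origin": (1423, 1879),
    "FBA1 promoter": (2111, 3108),
    "ilvD coding region": (3116, 4828),
    "Lumio tag": (4829, 4849),
    "FBA1 terminator": (4861, 5860),
    "ampicillin resistance marker": (7092, 7949),
    "2 micron origin": (8082, 9426),
}


def length(span):
    """Length in nucleotides of an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def overlapping_pairs(features):
    """Return adjacent feature pairs whose spans overlap, if any."""
    ordered = sorted(features.items(), key=lambda item: item[1][0])
    return [(a, b) for (a, span_a), (b, span_b) in zip(ordered, ordered[1:])
            if span_b[0] <= span_a[1]]


if __name__ == "__main__":
    for name, span in FEATURES.items():
        print(f"{name}: {span[0]}-{span[1]} ({length(span)} nt)")
    print("overlaps:", overlapping_pairs(FEATURES))
```

Under the inclusive-coordinate assumption, the ilvD coding region is 1,713 nt and the Lumio tag is 21 nt, both multiples of three, and the tag begins at the nucleotide immediately after the coding region, which is consistent with an in-frame fusion.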
[0254] The first step was to linearize pRS423::FBA ilvD(Strep) (also called pRS423-FBA(SpeI)-IlvD(Streptococcus mutans)-Lumio) with SacI and SacII (with SacII site blunt ended using T4 DNA polymerase), to give a vector with total length of 9,482 bp. The second step was to isolate the kivDy-hADHy cassette from pLH441 with SacI and KpnI (with KpnI site blunt ended using T4 DNA polymerase), which gives a 6,063 bp fragment. This fragment was ligated with the 9,482 bp vector fragment from pRS423-FBA(SpeI)-ilvD(Streptococcus mutans)-Lumio. This generated vector pLH468 (pRS423::P.sub.FBA1-ilvD(Strep)Lumio-FBA1t-P.sub.TDH3-kivDy-TDH3t-P.sub.GPM1-hadhy-ADH1t), which was confirmed by restriction mapping and sequencing.
Example 2
Pyruvate Decarboxylase and Hexokinase 2 Gene Inactivation
[0255] This example describes insertion-inactivation of endogenous PDC1, PDC5, and PDC6 genes of S. cerevisiae. PDC1, PDC5, and PDC6 genes encode the three major isozymes of pyruvate decarboxylase. The resulting PDC inactivation strain was used as a host for expression vectors pLH475-Z4B8 and pLH468 that were described in Example 1.
Construction of pdc6::P.sub.GPM1-sadB Integration Cassette and PDC6 Deletion:
[0256] A pdc6::P.sub.GPM1-sadB-ADH1t-URA3r integration cassette was made by joining the GPM-sadB-ADHt segment (SEQ ID NO: 64) from pRS425::GPM-sadB (described above) to the URA3r gene from pUC19-URA3r. pUC19-URA3r (SEQ ID NO: 65) contains the URA3 marker from pRS426 (ATCC #77107) flanked by 75 bp homologous repeat sequences to allow homologous recombination in vivo and removal of the URA3 marker. The two DNA segments were joined by SOE PCR (as described by Horton et al. (1989) Gene 77:61-68) using as template pRS425::GPM-sadB and pUC19-URA3r plasmid DNAs, with Phusion DNA polymerase (New England Biolabs Inc., Beverly, Mass.; catalog no. F-540S) and primers 114117-11A through 114117-11D (SEQ ID NOs: 66-69), and 114117-13A and 114117-13B (SEQ ID NOs: 70 and 71). The outer primers for the SOE PCR (114117-13A and 114117-13B) contained 5' and 3'-50 bp regions homologous to regions upstream and downstream of the PDC6 promoter and terminator, respectively. The completed cassette PCR fragment was transformed into BY4700 (ATCC #200866) and transformants were maintained on synthetic complete media lacking uracil and supplemented with 2% glucose at 30ยฐ C. using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202). Transformants were screened by PCR using primers 112590-34G and 112590-34H (SEQ ID NOs: 72 and 73), and 112590-34F and 112590-49E (SEQ ID NOs: 74 and 75) to verify integration at the PDC6 locus with deletion of the PDC6 coding region. The URA3r marker was recycled by plating on synthetic complete media supplemented with 2% glucose and 5-FOA at 30ยฐ C. following standard protocols. Marker removal was confirmed by patching colonies from the 5-FOA plates onto SD URA-media to verify the absence of growth. The resulting identified strain has the genotype: BY4700 pdc6::P.sub.GPM1-sadB-ADH1t.
Construction of Pdc1:: P.sub.PDC1-ilvD Integration Cassette and PDC1 Deletion:
[0257] A pdc1:: P.sub.PDC1-ilvD-FBA1t-URA3r integration cassette was made by joining the ilvD-FBA1t segment (SEQ ID NO: 76) from pLH468 (described above) to the URA3r gene from pUC19-URA3r by SOE PCR (as described by Horton et al. (1989) Gene 77:61-68) using as template pLH468 and pUC19-URA3r plasmid DNAs, with Phusion DNA polymerase (New England Biolabs Inc., Beverly, Mass.; catalog no. F-540S) and primers 114117-27A through 114117-27D (SEQ ID NOs: 77-80).
[0258] The outer primers for the SOE PCR (114117-27A and 114117-27D) contained 5' and 3'-50 bp regions homologous to regions downstream of the PDC1 promoter and downstream of the PDC1 coding sequence. The completed cassette PCR fragment was transformed into BY4700 pdc6::P.sub.GPM1-sadB-ADH1t and transformants were maintained on synthetic complete media lacking uracil and supplemented with 2% glucose at 30° C. using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202). Transformants were screened by PCR using primers 114117-36D and 135 (SEQ ID NOs: 82 and 83), and primers 112590-49E and 112590-30F (SEQ ID NOs: 75 and 129) to verify integration at the PDC1 locus with deletion of the PDC1 coding sequence. The URA3r marker was recycled by plating on synthetic complete media supplemented with 2% glucose and 5-FOA at 30° C. following standard protocols. Marker removal was confirmed by patching colonies from the 5-FOA plates onto SD-URA media to verify the absence of growth. The resulting identified strain "NYLA67" has the genotype: BY4700 pdc6::P.sub.GPM1-sadB-ADH1t pdc1::P.sub.PDC1-ilvD-FBA1t.
HIS3 Deletion
[0259] To delete the endogenous HIS3 coding region, a his3::URA3r2 cassette was PCR-amplified from URA3r2 template DNA (SEQ ID NO: 81). URA3r2 contains the URA3 marker from pRS426 (ATCC #77107) flanked by 500 bp homologous repeat sequences to allow homologous recombination in vivo and removal of the URA3 marker. PCR was done using Phusion DNA polymerase and primers 114117-45A and 114117-45B (SEQ ID NOs: 84 and 85) which generated a ~2.3 kb PCR product. The HIS3 portion of each primer was derived from the 5' region upstream of the HIS3 promoter and 3' region downstream of the coding region such that integration of the URA3r2 marker results in replacement of the HIS3 coding region. The PCR product was transformed into NYLA67 using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202) and transformants were selected on synthetic complete media lacking uracil and supplemented with 2% glucose at 30° C. Transformants were screened to verify correct integration by replica plating of transformants onto synthetic complete media lacking histidine and supplemented with 2% glucose at 30° C. The URA3r marker was recycled by plating on synthetic complete media supplemented with 2% glucose and 5-FOA at 30° C. following standard protocols. Marker removal was confirmed by patching colonies from the 5-FOA plates onto SD-URA media to verify the absence of growth. The resulting identified strain, called NYLA73, has the genotype: BY4700 pdc6::P.sub.GPM1-sadB-ADH1t pdc1::P.sub.PDC1-ilvD-FBA1t Δhis3.
Construction of pdc5::kanMX Integration Cassette and PDC5 Deletion:
[0260] A pdc5::kanMX4 cassette was PCR-amplified from strain YLR134W chromosomal DNA (ATCC No. 4034091) using Phusion DNA polymerase and primers PDC5::KanMXF and PDC5::KanMXR (SEQ ID NOs: 86 and 87) which generated a ~2.2 kb PCR product. The PDC5 portion of each primer was derived from the 5' region upstream of the PDC5 promoter and 3' region downstream of the coding region such that integration of the kanMX4 marker results in replacement of the PDC5 coding region. The PCR product was transformed into NYLA73 using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202) and transformants were selected on YP media supplemented with 1% ethanol and geneticin (200 μg/ml) at 30° C. Transformants were screened by PCR to verify correct integration at the PDC5 locus with replacement of the PDC5 coding region using primers PDC5kofor and N175 (SEQ ID NOs: 88 and 89). The identified correct transformants have the genotype: BY4700 pdc6::P.sub.GPM1-sadB-ADH1t pdc1::P.sub.PDC1-ilvD-FBA1t Δhis3 pdc5::kanMX4. The strain was named NYLA74.
Deletion of Hexokinase 2:
[0261] A hxk2::URA3r cassette was PCR-amplified from URA3r2 template (described above) using Phusion DNA polymerase and primers 384 and 385 (SEQ ID NOs: 90 and 91) which generated a ~2.3 kb PCR product. The HXK2 portion of each primer was derived from the 5' region upstream of the HXK2 promoter and 3' region downstream of the coding region such that integration of the URA3r2 marker results in replacement of the HXK2 coding region. The PCR product was transformed into NYLA73 using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202) and transformants were selected on synthetic complete media lacking uracil and supplemented with 2% glucose at 30° C. Transformants were screened by PCR to verify correct integration at the HXK2 locus with replacement of the HXK2 coding region using primers N869 and N871 (SEQ ID NOs: 92 and 93). The URA3r2 marker was recycled by plating on synthetic complete media supplemented with 2% glucose and 5-FOA at 30° C. following standard protocols. Marker removal was confirmed by patching colonies from the 5-FOA plates onto SD-URA media to verify the absence of growth, and by PCR to verify correct marker removal using primers N946 and N947 (SEQ ID NOs: 94 and 95). The resulting identified strain named NYLA83 has the genotype: BY4700 pdc6::P.sub.GPM1-sadB-ADH1t pdc1::P.sub.PDC1-ilvD-FBA1t Δhis3 Δhxk2.
Construction of pdc5::kanMX Integration Cassette and PDC5 Deletion
[0262] A pdc5::kanMX4 cassette was PCR-amplified as described above. The PCR fragment was transformed into NYLA83, and transformants were selected and screened as described above. The identified correct transformants named NYLA84 have the genotype: BY4700 pdc6::P.sub.GPM1-sadB-ADH1t pdc1::P.sub.PDC1-ilvD-FBA1t Δhis3 Δhxk2 pdc5::kanMX4.
[0263] Plasmid vectors pLH468 and pLH475-Z4B8 were simultaneously transformed into strain NYLA84 (BY4700 pdc6::P.sub.GPM1-sadB-ADH1t pdc1::P.sub.PDC1-ilvD-FBA1t Δhis3 Δhxk2 pdc5::kanMX4) using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) and the resulting strain was maintained on synthetic complete media lacking histidine and uracil, and supplemented with 1% ethanol at 30° C.
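The strain lineage built up in this Example can be summarized as a small lookup table. The following is only a bookkeeping sketch: the dictionary layout and the function name `genotype` are illustrative assumptions, and "BY4700 pdc6" is a hypothetical label for the unnamed first intermediate strain. The strain names and modifications themselves are those stated above.

```python
# Parent strain and the modification added at each step (from Example 2).
# "BY4700 pdc6" is a hypothetical label for the unnamed first intermediate.
LINEAGE = {
    "BY4700 pdc6": ("BY4700", "pdc6::P.sub.GPM1-sadB-ADH1t"),
    "NYLA67": ("BY4700 pdc6", "pdc1::P.sub.PDC1-ilvD-FBA1t"),
    "NYLA73": ("NYLA67", "Δhis3"),
    "NYLA74": ("NYLA73", "pdc5::kanMX4"),
    "NYLA83": ("NYLA73", "Δhxk2"),
    "NYLA84": ("NYLA83", "pdc5::kanMX4"),
}


def genotype(strain):
    """Return the background strain followed by all accumulated modifications."""
    mods = []
    while strain in LINEAGE:
        strain, mod = LINEAGE[strain]
        mods.append(mod)
    # Modifications were collected newest-first; report them oldest-first.
    return " ".join([strain] + mods[::-1])


print(genotype("NYLA74"))
print(genotype("NYLA84"))
```

Walking the table this way makes it easy to see that NYLA74 and NYLA84 share the same background and differ only by the hexokinase 2 deletion.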
Example 3
Production of Isobutanol
[0264] The purpose of this example is to describe the production of isobutanol in the yeast strain NYLA84. The yeast strain comprises deletions of PDC1, PDC5, and PDC6, genes encoding three isozymes of pyruvate decarboxylase, and constructs for heterologous expression of AlsS (acetolactate synthase), KARI (keto acid reductoisomerase), DHAD (dihydroxy acid dehydratase), KivD (ketoisovalerate decarboxylase), and SadB (secondary alcohol dehydrogenase).
Strain Construction
[0265] Plasmids pLH468 and pLH475-Z4B8 were introduced into NYLA74 or NYLA84, described in Example 2, by standard PEG/lithium acetate-mediated transformation methods. Transformants were selected on synthetic complete medium lacking glucose, histidine and uracil. Ethanol (1% v/v) was used as the carbon source. After three days, transformants were patched to synthetic complete medium lacking histidine and uracil supplemented with both 2% glucose and 1% ethanol as carbon sources. Fermentation seed vials were made by inoculation of cultures into synthetic complete medium lacking histidine and uracil supplemented with both 0.2% glucose and 0.5% ethanol. Glycerol was added to final concentration of 15% (v/v) and vials were stored at -80° C.
Production of Isobutanol
[0266] Fermentation inoculum was grown in synthetic complete medium lacking histidine and uracil supplemented with 1% ethanol as a carbon source at 30° C. and shaking at 250 rpm. Inoculation volume for the fermenters was 80 ml. The 80 ml of inoculum in the 800 ml fermentation medium described below resulted in the presence of 0.1% ethanol.
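The 0.1% ethanol figure follows from simple dilution of the inoculum into the final volume. A minimal sketch of that arithmetic is given here; the function name is hypothetical, and the calculation assumes that none of the 1% ethanol was consumed during inoculum growth, so the result is an upper bound.

```python
def carryover_percent(inoculum_ml, inoculum_percent, final_ml):
    """Concentration (%) of a solute after diluting the inoculum into the final volume."""
    return inoculum_percent * inoculum_ml / final_ml


# 80 ml of inoculum grown with 1% ethanol, 800 ml final fermentation volume.
# Assumption: ethanol in the inoculum is not depleted before transfer.
print(carryover_percent(80, 1.0, 800))
```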
[0267] The NYLA84/pLH468+pLH475-Z4B8 strain fermenter was prepared and sterilized with 0.4 L water. After cooling, filter sterilized media was added to give the following final concentrations in 800 mL post-inoculation:
Medium (Final Concentration):
[0268] 6.7 g/L Yeast Nitrogen Base w/o amino acids (Difco)
[0269] 2.8 g/L Yeast Synthetic Drop-out Medium Supplement Without Histidine, Leucine, Tryptophan and Uracil (Sigma Y2001)
[0270] 20 mL/L of 1% (w/v) L-Leucine
[0271] 4 mL/L of 1% (w/v) L-Tryptophan
[0272] 10 g/L glucose
[0273] 1 mL/L 1% ergosterol in 50% (v/v) Tween-80/ethanol solution
[0274] 0.2 mL/L Sigma DF204 antifoam
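For convenience, the final concentrations listed above can be scaled to the 800 mL working volume. The sketch below is illustrative only; the dictionary keys are shortened labels and the helper name `batch_amounts` is an assumption, not part of the described method.

```python
FINAL_VOLUME_L = 0.8  # 800 mL post-inoculation

# Final concentrations per liter, as listed above: (amount, unit)
MEDIUM = {
    "Yeast Nitrogen Base w/o amino acids": (6.7, "g"),
    "Drop-out supplement (Sigma Y2001)": (2.8, "g"),
    "1% (w/v) L-Leucine": (20, "mL"),
    "1% (w/v) L-Tryptophan": (4, "mL"),
    "glucose": (10, "g"),
    "1% ergosterol in Tween-80/ethanol": (1, "mL"),
    "Sigma DF204 antifoam": (0.2, "mL"),
}


def batch_amounts(medium, volume_l):
    """Scale per-liter amounts to the given working volume."""
    return {
        name: (round(per_l * volume_l, 3), unit)
        for name, (per_l, unit) in medium.items()
    }


for name, (amount, unit) in batch_amounts(MEDIUM, FINAL_VOLUME_L).items():
    print(f"{name}: {amount} {unit}")
```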
[0275] The fermenter was set to control at pH 5.5 with KOH, initial dO (dissolved oxygen) 30% by stirring, temperature 30° C., and airflow of 0.2 SLPM (or, 0.25 vvm). At inoculation, the airflow was set to 0.01 SLPM initially, then increased to 0.2 SLPM. Glucose was maintained at 5-15 g/L throughout.
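The equivalence of 0.2 SLPM and 0.25 vvm follows from dividing the gas flow by the 800 mL working volume. A minimal sketch, assuming the 0.8 L working volume stated above and treating standard liters as liters of gas for the purpose of the ratio (the function name is hypothetical):

```python
def vvm(airflow_slpm, working_volume_l):
    """Gas volume per liquid volume per minute."""
    return airflow_slpm / working_volume_l


print(vvm(0.2, 0.8))   # full airflow
print(vvm(0.01, 0.8))  # reduced airflow used at inoculation
```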
[0276] The NYLA74/pLH468 pLH475-Z4B8 strain fermenter was prepared as for the NYLA84/pLH468 pLH475-Z4B8 strain fermenter except that 1 mL/L ergosterol/tween/ethanol solution and 0.2 mL/L Sigma DF204 antifoam were omitted, and glucose was 2 g/L. Initial ethanol concentration in the fermenter was 0.1%.
[0277] The fermenter was set to control at pH 5.5 with KOH, initial dO 30% by stirring, temperature 30° C., and airflow of 0.2 SLPM (or, 0.25 vvm). At inoculation, the airflow was set to 0.01 SLPM initially, then increased to 0.2 SLPM. Glucose was maintained at 0.1-2 g/L throughout.
[0278] Samples were taken periodically and measured for growth by OD600, and for isobutanol content by HPLC as described in General Methods. FIG. 1 shows the results comparing strains with and without hexokinase 2 deletion for growth (1A) and isobutanol production (1B).
[0279] FIG. 2 shows a comparison of growth and isobutanol production for the strain without hexokinase 2 deletion (2A) and the strain with hexokinase 2 deletion (2B). FIG. 3 plots the results as "specific productivity" (Qp) measured in grams isobutanol per gram of cells over time. For the strain without deletion of hexokinase 2, the cell specific productivity dropped from 60-90 hours, when there was no longer growth, while for the hexokinase 2 deletion strain, the specific productivity was relatively well maintained from 60-140 hours, showing that the strain is capable of better non-growth associated production.
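One way such a specific productivity could be computed from periodic samples is sketched below. This is an assumption about the calculation, not the method used to generate FIG. 3: it takes the change in isobutanol titer between consecutive samples and divides by the mean biomass concentration and the elapsed time. The function name and all sample values are hypothetical and are not data from the figures.

```python
def specific_productivity(t_h, isobutanol_g_per_l, biomass_g_per_l):
    """Interval-averaged Qp (g isobutanol per g cells per hour) between consecutive samples."""
    qp = []
    for i in range(1, len(t_h)):
        dt = t_h[i] - t_h[i - 1]
        dp = isobutanol_g_per_l[i] - isobutanol_g_per_l[i - 1]
        mean_x = (biomass_g_per_l[i] + biomass_g_per_l[i - 1]) / 2
        qp.append(dp / (mean_x * dt))
    return qp


# Hypothetical sample values, NOT data from FIG. 3
times = [60.0, 80.0, 100.0]
isobutanol = [2.0, 3.0, 4.0]
biomass = [1.0, 1.0, 1.0]
print(specific_productivity(times, isobutanol, biomass))
```

With constant biomass, a constant Qp across intervals corresponds to production that continues without growth, which is the behavior described for the hexokinase 2 deletion strain.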
Example 4
Prophetic
Regulated Expression of Hexokinase in a S. cerevisiae Strain Devoid of Pyruvate Decarboxylase and Hexokinase 2 Activity
[0280] This example describes insertion of hexokinase enzyme under a controlled expression in a S. cerevisiae strain where pyruvate decarboxylase (Δpdc1/5/6) and hexokinase 2 (Δhxk2) activity have been removed. Creation of the NYLA84 (Δpdc1/5/6 Δhxk2) strain was described in Example 2.
[0281] The HXK2 gene and native terminator from S. cerevisiae (SEQ ID NO: 101) was PCR amplified from genomic DNA from strain BY4700 (ATCC #200866) using Phusion DNA polymerase and primers LA588 (SEQ ID NO: 96) and LA589 (SEQ ID NO: 97), and digested with XbaI and BamHI restriction enzymes. The OLE1 promoter region derived sequence (SEQ ID NO: 98) was PCR amplified from BY4700 genomic DNA using Phusion DNA polymerase and primers LA586 (SEQ ID NO: 99) and LA587 (SEQ ID NO: 100), and digested with HindIII and XbaI restriction enzymes. The HXK2 and POLE1 products were ligated and subcloned into pUC19::loxP-URA3-loxP which was previously digested with HindIII and BamHI. pUC19::loxP-URA3-loxP (SEQ ID NO: 102) contains the URA3 marker from pRS426 (ATCC #77107) flanked by loxP recombinase sites. The resulting vector was named pLA25 (SEQ ID NO: 103).
[0282] The RAG5 gene from K. lactis (SEQ ID NO: 3) was PCR amplified from genomic DNA from strain GG799 (#C1001S, New England Biolabs, Ipswich, Mass.) using Phusion DNA polymerase and primers LA593 and LA594 (SEQ ID NO: 104 and 105), and was digested with HindIII and XbaI restriction enzymes. The gel-purified RAG5 product was ligated with the OLE1 promoter region derived sequence from above, and subcloned into pUC19::loxP-URA3-loxP which was previously digested with HindIII and BamHI. The resulting vector was named pLA31 (SEQ ID NO: 106).
[0283] In order to integrate into the TRP1 locus, the POLE1-HXK2-loxP-URA3-loxP and POLE1-RAG5-loxP-URA3-loxP cassettes are PCR amplified from plasmids pLA25 and pLA31 using Phusion DNA polymerase and primers BK600 and BK601 (SEQ ID NOs: 107 and 108). The TRP1 portion of each primer is derived from the 5' region upstream of the TRP1 promoter and 3' region downstream of the coding region such that integration of the POLE1-HXK2-loxP-URA3-loxP or POLE1-RAG5-loxP-URA3-loxP cassette results in replacement of the TRP1 coding region. The PCR product is transformed into NYLA84 using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202) and transformants are selected on synthetic complete media lacking uracil and supplemented with 2% glucose at 30° C. Transformants are screened by PCR to verify correct integration at the TRP1 locus with replacement of the TRP1 coding region using primers 112590-49E (SEQ ID NO: 75) and LA606 (SEQ ID NO: 109). The URA3 marker is recycled by transformation with pRS423::P.sub.GAL1-cre (SEQ ID NO: 110) and plating on synthetic complete media lacking histidine supplemented with 1% ethanol at 30° C. Colonies are patched onto YP (1% galactose) plates at 30° C. to induce URA3 marker excision and are transferred onto YP (1% ethanol) plates at 30° C. for recovery. Removal of the URA3 marker is confirmed by patching colonies from the YP (1% ethanol) plates onto synthetic complete media lacking uracil supplemented with 1% ethanol to verify the absence of growth.
Example 5
Prophetic
Constitutive Expression of Hexokinase in a S. cerevisiae Strain Devoid of Pyruvate Decarboxylase and Hexokinase 2 Activity
[0284] This example describes insertion of hexokinase enzyme under control of the constitutive ADH1-derived promoter sequence in a S. cerevisiae strain where pyruvate decarboxylase (Δpdc1/5/6) and hexokinase 2 (Δhxk2) activity have been removed. Creation of the NYLA84 (Δpdc1/5/6 Δhxk2) strain was described in Example 2.
[0285] The RAG5 gene from K. lactis (SEQ ID NO: 3) was PCR amplified from genomic DNA from strain GG799 (#C1001S; New England Biolabs, Ipswich, Mass.) using Phusion DNA polymerase and primers LA593 and LA594 (SEQ ID NOs: 104 and 105), and was digested with HindIII and XbaI restriction enzymes. The ADH1 promoter region derived sequence (SEQ ID NO: 131) was PCR amplified from BY4700 genomic DNA using Phusion DNA polymerase and primers LA595 and LA597 (SEQ ID NOs: 112 and 113), and digested with HindIII and XbaI restriction enzymes. The gel-purified RAG5 product was ligated with the ADH1 promoter fragment, and subcloned into pUC19::loxP-URA3-loxP which was previously digested with HindIII and BamHI. The resulting vector was named pLA32 (SEQ ID NO: 111).
[0286] In order to integrate into the TRP1 locus, the P.sub.ADH1-RAG5-loxP-URA3-loxP cassette is PCR amplified from plasmid pLA32 using Phusion DNA polymerase and primers BK600 and BK601 (SEQ ID NOs: 107 and 108). The TRP1 portion of each primer is derived from the 5' region upstream of the TRP1 promoter and 3' region downstream of the coding region such that integration of the P.sub.ADH1-RAG5-loxP-URA3-loxP cassette results in replacement of the TRP1 coding region. The PCR product is transformed into NYLA84 using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202) and transformants are selected on synthetic complete media lacking uracil and supplemented with 2% glucose at 30° C. Transformants are screened by PCR to verify correct integration at the TRP1 locus with replacement of the TRP1 coding region using primers 112590-49E (SEQ ID NO: 75) and LA606 (SEQ ID NO: 109). The URA3 marker is recycled by transformation with pRS423::P.sub.GAL1-cre (SEQ ID NO: 110) and plating on synthetic complete media lacking histidine supplemented with 1% ethanol at 30° C. Colonies are patched onto YP (1% galactose) plates at 30° C. to induce URA3 marker excision and are transferred onto YP (1% ethanol) plates at 30° C. for recovery. Removal of the URA3 marker is confirmed by patching colonies from the YP (1% ethanol) plates onto synthetic complete media lacking uracil supplemented with 1% ethanol to verify the absence of growth.
Example 6
Prophetic
Isobutanol Production Using NYLA84 Strains with Regulated Expression of Hexose Kinase
Isobutanol Strain and Production
[0287] The expression constructs pLH475-JEA1 and pLH468 (described in Example 1) are transformed into strains NYLA84, NYLA84 trp1::POLE1-HXK2 and NYLA84 trp1::POLE1-RAG5 (described in Example 4) by standard PEG/lithium acetate-mediated transformation methods. Transformants are selected on synthetic complete medium lacking glucose, histidine and uracil. Ethanol (1% v/v) is used as the carbon source. After three days, transformants are patched to synthetic complete medium lacking histidine and uracil supplemented with both 2% glucose and 1% ethanol as carbon sources. Seed vials are made by inoculation of cultures into synthetic complete medium lacking histidine and uracil supplemented with both 0.2% glucose and 0.5% ethanol. Glycerol is added to final concentration of 15% (v/v) and vials are stored at -80° C.
Isobutanol Production
[0288] Seed vials of NYLA84 pLH475-JEA1, NYLA84 trp1::POLE1-HXK2, and NYLA84 trp1::POLE1-RAG5 are inoculated into 80 mL of synthetic complete medium lacking histidine and uracil supplemented with both 0.25% glucose and 0.5% ethanol as carbon sources at 30° C. A 1 liter fermenter is prepared and sterilized with 0.4 L water. After cooling, filter sterilized medium is added to give the following final concentrations in 800 mL post-inoculation:
Medium (Final Concentration):
[0289] 6.7 g/L Yeast Nitrogen Base w/o amino acids (Difco)
2.8 g/L Yeast Synthetic Drop-out Medium Supplement Without Histidine, Leucine, Tryptophan and Uracil (Sigma Y2001)
20 mL/L of 1% (w/v) L-Leucine
4 mL/L of 1% (w/v) L-Tryptophan
1 mL/L ergosterol/tween/ethanol solution
0.2 mL/L Sigma DF204
10 g/L glucose
[0290] The fermenter is set to control at pH 5.5 with KOH, 30% dO, temperature 30° C., and airflow of 0.2 SLPM (or, 0.25 vvm). At inoculation, the airflow is set to 0.01 SLPM initially, then increased to 0.2 SLPM once growth is established. Glucose is maintained at 5-15 g/L throughout by manual addition. Alternatively, the fermenter is set to control at pH 5.5 with KOH, 3-5% dO, temperature 30° C., and airflow of 0.2 SLPM (or 0.25 vvm). At inoculation, the airflow is set to 0.01 SLPM initially, then increased to 0.2 SLPM once growth is established.
[0291] To quantify the loss of isobutanol due to stripping, the off-gas from the fermenter is directly sent to a mass spectrometer (Prima dB mass spectrometer, Thermo Electron Corp., Madison, Wis.) to quantify the amount of isobutanol in the gas stream. The isobutanol peaks at mass to charge ratios of 74 or 42 are monitored continuously to quantify the amount of isobutanol in the gas stream. Glucose and organic acids in the aqueous phase are monitored during the fermentation using HPLC. Glucose is also monitored quickly using a glucose analyzer (YSI, Inc., Yellow Springs, Ohio). Isobutanol and isobutyric acid in the aqueous phase are quantified by HPLC as described in the General Methods Section herein above after the aqueous phase is removed periodically from the fermenter.
Example 7
Prophetic
Modification of Hexose Kinase Function
[0292] The purpose of this example is to describe how the function of hexose kinase can be altered by deleting a protein interaction domain, which prevents the enzyme from functioning as a transcriptional regulator. The MIG1-interaction domain (Lys6-Met15) is removed from S. cerevisiae HXK2, which allows function as a glycolytic enzyme but prevents translocation to the nucleus.
[0293] In order to remove the N-terminal MIG1-interaction domain from S. cerevisiae HXK2, an integration cassette is constructed using the pUC19::loxP-URA3-loxP plasmid. The gene encoding HXK2 with an internal deletion of the Lys6-Met15 region (bp 19-48) and ADH1 terminator region derived sequence is synthesized by DNA 2.0 with codon-optimization for S. cerevisiae (SEQ ID NO: 132). The HXK2(ΔLys6-Met15)-ADH1t cassette is PCR-amplified using Phusion DNA polymerase and primers E001 and E002 (SEQ ID NOS: 133 and 134) and subcloned into pUC19::loxP-URA3-loxP via HindIII BamHI sites, creating plasmid pUC19::loxP-URA3-loxP-HXK2(ΔLys6-Met15)-ADH1t (SEQ ID NO: 139).
[0294] The HXK2(ΔLys6-Met15)ADH1t-loxP-URA3-loxP cassette is PCR amplified using Phusion DNA polymerase and primers E003 and E004 (SEQ ID NOS: 135 and 136). Primer E003 contains sequence from the HXK2 promoter region and primer E004 contains sequence from the HXK2 terminator, such that integration of the HXK2(ΔLys6-Met15)ADH1t-loxP-URA3-loxP cassette results in replacement of the native HXK2 coding sequence. The PCR product is transformed into NYLA74 using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202) and transformants are selected on synthetic complete media lacking uracil and supplemented with 2% glucose at 30° C. Transformants are screened by PCR to verify correct integration at the HXK2 locus using primers E005 and E006 (SEQ ID NOS: 137 and 138). The URA3 marker is recycled by transformation with pRS423::P.sub.GAL1-cre (SEQ ID NO: 110) and plating on synthetic complete media lacking histidine supplemented with 1% ethanol at 30° C. Colonies are patched onto YP (1% galactose) plates at 30° C. to induce URA3 marker excision and are transferred onto YP (1% ethanol) plates at 30° C. for recovery. Removal of the URA3 marker is confirmed by patching colonies from the YP (1% ethanol) plates onto synthetic complete media lacking uracil supplemented with 1% ethanol to verify the absence of growth.
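The stated coordinates (bp 19-48) can be checked against the native HXK2 coding sequence of SEQ ID NO: 1. The sketch below is illustrative and rests on two assumptions: that the residue numbering Lys6-Met15 does not count the initiator methionine, and that the bp numbering starts at the A of the ATG. The function name is hypothetical, and the check uses the native sequence rather than the codon-optimized SEQ ID NO: 132.

```python
# First 60 bp of the native S. cerevisiae HXK2 coding sequence (SEQ ID NO: 1)
HXK2_5PRIME = "atggttcatttaggtccaaaaaaaccacaagccagaaagggttccatggccgatgtgcca"


def residue_range_to_bp(first_res, last_res, initiator_met_counted=False):
    """1-based, inclusive bp coordinates of a residue range within a coding sequence.

    If the residue numbering omits the initiator Met, shift by one codon.
    """
    offset = 0 if initiator_met_counted else 1
    start = (first_res - 1 + offset) * 3 + 1
    end = (last_res + offset) * 3
    return start, end


start, end = residue_range_to_bp(6, 15)
deleted = HXK2_5PRIME[start - 1:end]
print(start, end, deleted)
```

Under these assumptions the ten deleted codons begin with two lysine codons (aaa aaa) and end with a methionine codon (atg), which agrees with the Lys ... Met stretch at the N-terminus of SEQ ID NO: 2.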
Example 8
Prophetic
Isobutanol Production Using NYLA74 Strains with Modified Function of Hexose Kinase
Isobutanol Strain and Production
[0295] The expression constructs pLH475-JEA1 and pLH468 (described in Example 1) are transformed into strain NYLA74 hxk2Δ::HXK2(ΔLys6-Met15) (described in Example 7) by standard PEG/lithium acetate-mediated transformation methods. Transformants are selected on synthetic complete medium lacking glucose, histidine and uracil. Ethanol (1% v/v) is used as the carbon source. After three days, transformants are patched to synthetic complete medium lacking histidine and uracil supplemented with both 2% glucose and 1% ethanol as carbon sources. Seed vials are made by inoculation of cultures into synthetic complete medium lacking histidine and uracil supplemented with both 0.2% glucose and 0.5% ethanol. Glycerol is added to final concentration of 15% (v/v) and vials are stored at -80° C.
Isobutanol Production
[0296] Seed vials of NYLA74 hxk2Δ::HXK2(ΔLys6-Met15) pLH468 pLH475-JEA1 are inoculated into 80 mL of synthetic complete medium lacking histidine and uracil supplemented with both 0.25% glucose and 0.5% ethanol as carbon sources at 30° C. A 1 liter fermenter is prepared and sterilized with 0.4 L water. After cooling, filter sterilized medium is added to give the following final concentrations in 800 mL post-inoculation:
Medium (Final Concentration):
[0297] 6.7 g/L Yeast Nitrogen Base w/o amino acids (Difco)
2.8 g/L Yeast Synthetic Drop-out Medium Supplement Without Histidine, Leucine, Tryptophan and Uracil (Sigma Y2001)
20 mL/L of 1% (w/v) L-Leucine
4 mL/L of 1% (w/v) L-Tryptophan
1 mL/L ergosterol/tween/ethanol solution
0.2 mL/L Sigma DF204
10 g/L glucose
[0298] The fermenter is set to control at pH 5.5 with KOH, 30% dO, temperature 30° C., and airflow of 0.2 SLPM (or, 0.25 vvm). At inoculation, the airflow is set to 0.01 SLPM initially, then increased to 0.2 SLPM once growth is established. Glucose is maintained at 5-15 g/L throughout by manual addition. Alternatively, the fermenter is set to control at pH 5.5 with KOH, 3-5% dO, temperature 30° C., and airflow of 0.2 SLPM (or 0.25 vvm). At inoculation, the airflow is set to 0.01 SLPM initially, then increased to 0.2 SLPM once growth is established.
[0299] To quantify the loss of isobutanol due to stripping, the off-gas from the fermentor is directly sent to a mass spectrometer (Prima dB mass spectrometer, Thermo Electron Corp., Madison, Wis.) to quantify the amount of isobutanol in the gas stream. The isobutanol peaks at mass to charge ratios of 74 or 42 are monitored continuously to quantify the amount of isobutanol in the gas stream. Glucose and organic acids in the aqueous phase are monitored during the fermentation using HPLC. Glucose is also monitored quickly using a glucose analyzer (YSI, Inc., Yellow Springs, Ohio). Isobutanol and isobutyric acid in the aqueous phase are quantified by HPLC as described in the General Methods Section herein above after the aqueous phase is removed periodically from the fermenter.
Sequence Listing
Number of SEQ ID NOs: 144
SEQ ID NO: 1 (1461 bp, DNA, Saccharomyces cerevisiae)
atggttcatt taggtccaaa aaaaccacaa
gccagaaagg gttccatggc cgatgtgcca 60aaggaattga tgcaacaaat tgagaatttt
gaaaaaattt tcactgttcc aactgaaact 120ttacaagccg ttaccaagca cttcatttcc
gaattggaaa agggtttgtc caagaagggt 180ggtaacattc caatgattcc aggttgggtt
atggatttcc caactggtaa ggaatccggt 240gatttcttgg ccattgattt gggtggtacc
aacttgagag ttgtcttagt caagttgggc 300ggtgaccgta cctttgacac cactcaatct
aagtacagat taccagatgc tatgagaact 360actcaaaatc cagacgaatt gtgggaattt
attgccgact ctttgaaagc ttttattgat 420gagcaattcc cacaaggtat ctctgagcca
attccattgg gtttcacctt ttctttccca 480gcttctcaaa acaaaatcaa tgaaggtatc
ttgcaaagat ggactaaagg ttttgatatt 540ccaaacattg aaaaccacga tgttgttcca
atgttgcaaa agcaaatcac taagaggaat 600atcccaattg aagttgttgc tttgataaac
gacactaccg gtactttggt tgcttcttac 660tacactgacc cagaaactaa gatgggtgtt
atcttcggta ctggtgtcaa tggtgcttac 720tacgatgttt gttccgatat cgaaaagcta
caaggaaaac tatctgatga cattccacca 780tctgctccaa tggccatcaa ctgtgaatac
ggttccttcg ataatgaaca tgtcgttttg 840ccaagaacta aatacgatat caccattgat
gaagaatctc caagaccagg ccaacaaacc 900tttgaaaaaa tgtcttctgg ttactactta
ggtgaaattt tgcgtttggc cttgatggac 960atgtacaaac aaggtttcat cttcaagaac
caagacttgt ctaagttcga caagcctttc 1020gtcatggaca cttcttaccc agccagaatc
gaggaagatc cattcgagaa cctagaagat 1080accgatgact tgttccaaaa tgagttcggt
atcaacacta ctgttcaaga acgtaaattg 1140atcagacgtt tatctgaatt gattggtgct
agagctgcta gattgtccgt ttgtggtatt 1200gctgctatct gtcaaaagag aggttacaag
accggtcaca tcgctgcaga cggttccgtt 1260tacaacagat acccaggttt caaagaaaag
gctgccaatg ctttgaagga catttacggc 1320tggactcaaa cctcactaga cgactaccca
atcaagattg ttcctgctga agatggttcc 1380ggtgctggtg ccgctgttat tgctgctttg
gcccaaaaaa gaattgctga aggtaagtcc 1440gttggtatca tcggtgctta a
1461
SEQ ID NO: 2 (486 aa, PRT, Saccharomyces cerevisiae)
Met
Val His Leu Gly Pro Lys Lys Pro Gln Ala Arg Lys Gly Ser Met 1
5 10 15 Ala Asp Val Pro Lys Glu
Leu Met Gln Gln Ile Glu Asn Phe Glu Lys 20
25 30 Ile Phe Thr Val Pro Thr Glu Thr Leu Gln
Ala Val Thr Lys His Phe 35 40
45 Ile Ser Glu Leu Glu Lys Gly Leu Ser Lys Lys Gly Gly Asn
Ile Pro 50 55 60
Met Ile Pro Gly Trp Val Met Asp Phe Pro Thr Gly Lys Glu Ser Gly 65
70 75 80 Asp Phe Leu Ala Ile
Asp Leu Gly Gly Thr Asn Leu Arg Val Val Leu 85
90 95 Val Lys Leu Gly Gly Asp Arg Thr Phe Asp
Thr Thr Gln Ser Lys Tyr 100 105
110 Arg Leu Pro Asp Ala Met Arg Thr Thr Gln Asn Pro Asp Glu Leu
Trp 115 120 125 Glu
Phe Ile Ala Asp Ser Leu Lys Ala Phe Ile Asp Glu Gln Phe Pro 130
135 140 Gln Gly Ile Ser Glu Pro
Ile Pro Leu Gly Phe Thr Phe Ser Phe Pro 145 150
155 160 Ala Ser Gln Asn Lys Ile Asn Glu Gly Ile Leu
Gln Arg Trp Thr Lys 165 170
175 Gly Phe Asp Ile Pro Asn Ile Glu Asn His Asp Val Val Pro Met Leu
180 185 190 Gln Lys
Gln Ile Thr Lys Arg Asn Ile Pro Ile Glu Val Val Ala Leu 195
200 205 Ile Asn Asp Thr Thr Gly Thr
Leu Val Ala Ser Tyr Tyr Thr Asp Pro 210 215
220 Glu Thr Lys Met Gly Val Ile Phe Gly Thr Gly Val
Asn Gly Ala Tyr 225 230 235
240 Tyr Asp Val Cys Ser Asp Ile Glu Lys Leu Gln Gly Lys Leu Ser Asp
245 250 255 Asp Ile Pro
Pro Ser Ala Pro Met Ala Ile Asn Cys Glu Tyr Gly Ser 260
265 270 Phe Asp Asn Glu His Val Val Leu
Pro Arg Thr Lys Tyr Asp Ile Thr 275 280
285 Ile Asp Glu Glu Ser Pro Arg Pro Gly Gln Gln Thr Phe
Glu Lys Met 290 295 300
Ser Ser Gly Tyr Tyr Leu Gly Glu Ile Leu Arg Leu Ala Leu Met Asp 305
310 315 320 Met Tyr Lys Gln
Gly Phe Ile Phe Lys Asn Gln Asp Leu Ser Lys Phe 325
330 335 Asp Lys Pro Phe Val Met Asp Thr Ser
Tyr Pro Ala Arg Ile Glu Glu 340 345
350 Asp Pro Phe Glu Asn Leu Glu Asp Thr Asp Asp Leu Phe Gln
Asn Glu 355 360 365
Phe Gly Ile Asn Thr Thr Val Gln Glu Arg Lys Leu Ile Arg Arg Leu 370
375 380 Ser Glu Leu Ile Gly
Ala Arg Ala Ala Arg Leu Ser Val Cys Gly Ile 385 390
395 400 Ala Ala Ile Cys Gln Lys Arg Gly Tyr Lys
Thr Gly His Ile Ala Ala 405 410
415 Asp Gly Ser Val Tyr Asn Arg Tyr Pro Gly Phe Lys Glu Lys Ala
Ala 420 425 430 Asn
Ala Leu Lys Asp Ile Tyr Gly Trp Thr Gln Thr Ser Leu Asp Asp 435
440 445 Tyr Pro Ile Lys Ile Val
Pro Ala Glu Asp Gly Ser Gly Ala Gly Ala 450 455
460 Ala Val Ile Ala Ala Leu Ala Gln Lys Arg Ile
Ala Glu Gly Lys Ser 465 470 475
480 Val Gly Ile Ile Gly Ala 485
SEQ ID NO: 3 (1458 bp, DNA, Kluyveromyces lactis)
atggttcgtt taggtccaaa gaagcctcca gccagaaagg
ggtccatggc agatgtgcca 60gctaatttga tggaacaaat ccacggtttg gaaactttgt
tcaccgtctc ttcagaaaaa 120atgagaagca ttgtcaagca tttcatcagt gaattggaca
aaggtttgtc caaaaagggt 180ggtaacattc ctatgattcc aggttgggtt gttgagtatc
caactggtaa ggaaactggt 240gatttcttag ctcttgattt gggtggtacc aacttgagag
ttgtgttggt taaattgggt 300ggtaatcatg atttcgacac cactcaaaac aagtacagat
taccagacca tttgagaact 360ggtacttctg aacaattgtg gtcatttatt gcaaagtgtt
tgaaggaatt cgtcgatgaa 420tggtacccag atggtgtttc tgaaccattg ccattgggtt
tcactttctc ataccctgca 480tctcaaaaga agatcaattc cggtgtgttg caacgttgga
ccaagggttt cgatattgaa 540ggtgttgaag gtcacgatgt tgttccaatg ctacaagaac
agattgaaaa gctgaatatc 600ccaatcaatg tcgttcgatt gatcaacgat accactggta
ccttggttgc ctctttgtac 660actgatcctc aaactaagat gggtatcatt atcggtactg
gtgtcaacgg tgcttactac 720gatgttgttt ctggtattga gaaattggaa ggtttgttgc
cagaagatat cggtccagat 780tctccaatgg caatcaactg tgaatatggt tccttcgata
acgaacattt ggtgttgcca 840agaaccaaat acgatgttat aatcgatgaa gaatctccaa
gaccaggtca acaagctttc 900gaaaagatga cttctggtta ctatctaggt gaaatcatgc
gtctagtact attggacttg 960tacgacagtg gtttcatctt taaggaccaa gatatctcca
agttgaaaga ggcttacgtc 1020atggacacca gttatccatc taagatcgaa gatgatccat
tcgaaaactt ggaagacact 1080gacgatctgt tcaagactaa cttgaacatc gaaactaccg
ttgttgagag aaagttgatt 1140agaaaattag ccgaattggt cggaacaaga gctgcaagat
tgactgtttg tggtgtttct 1200gctatctgtg acaagagagg ctacaagact gctcacattg
cagctgatgg ttctgtcttc 1260aacagatacc caggttacaa ggaaaaggcc gctcaagcct
tgaaggatat ctacaactgg 1320gatgtcgaaa agatggaaga ccacccaatc caattggtgg
ctgctgaaga tggttccggt 1380gttggtgctg ctatcattgc ttgtttgact caaaagagat
tggctgccgg taagtctgtt 1440ggtattaaag gcgaatag
1458
SEQ ID NO: 4 (485 aa, PRT, Kluyveromyces lactis)
Met Val Arg Leu
Gly Pro Lys Lys Pro Pro Ala Arg Lys Gly Ser Met 1 5
10 15 Ala Asp Val Pro Ala Asn Leu Met Glu
Gln Ile His Gly Leu Glu Thr 20 25
30 Leu Phe Thr Val Ser Ser Glu Lys Met Arg Ser Ile Val Lys
His Phe 35 40 45
Ile Ser Glu Leu Asp Lys Gly Leu Ser Lys Lys Gly Gly Asn Ile Pro 50
55 60 Met Ile Pro Gly Trp
Val Val Glu Tyr Pro Thr Gly Lys Glu Thr Gly 65 70
75 80 Asp Phe Leu Ala Leu Asp Leu Gly Gly Thr
Asn Leu Arg Val Val Leu 85 90
95 Val Lys Leu Gly Gly Asn His Asp Phe Asp Thr Thr Gln Asn Lys
Tyr 100 105 110 Arg
Leu Pro Asp His Leu Arg Thr Gly Thr Ser Glu Gln Leu Trp Ser 115
120 125 Phe Ile Ala Lys Cys Leu
Lys Glu Phe Val Asp Glu Trp Tyr Pro Asp 130 135
140 Gly Val Ser Glu Pro Leu Pro Leu Gly Phe Thr
Phe Ser Tyr Pro Ala 145 150 155
160 Ser Gln Lys Lys Ile Asn Ser Gly Val Leu Gln Arg Trp Thr Lys Gly
165 170 175 Phe Asp
Ile Glu Gly Val Glu Gly His Asp Val Val Pro Met Leu Gln 180
185 190 Glu Gln Ile Glu Lys Leu Asn
Ile Pro Ile Asn Val Val Arg Leu Ile 195 200
205 Asn Asp Thr Thr Gly Thr Leu Val Ala Ser Leu Tyr
Thr Asp Pro Gln 210 215 220
Thr Lys Met Gly Ile Ile Ile Gly Thr Gly Val Asn Gly Ala Tyr Tyr 225
230 235 240 Asp Val Val
Ser Gly Ile Glu Lys Leu Glu Gly Leu Leu Pro Glu Asp 245
250 255 Ile Gly Pro Asp Ser Pro Met Ala
Ile Asn Cys Glu Tyr Gly Ser Phe 260 265
270 Asp Asn Glu His Leu Val Leu Pro Arg Thr Lys Tyr Asp
Val Ile Ile 275 280 285
Asp Glu Glu Ser Pro Arg Pro Gly Gln Gln Ala Phe Glu Lys Met Thr 290
295 300 Ser Gly Tyr Tyr
Leu Gly Glu Ile Met Arg Leu Val Leu Leu Asp Leu 305 310
315 320 Tyr Asp Ser Gly Phe Ile Phe Lys Asp
Gln Asp Ile Ser Lys Leu Lys 325 330
335 Glu Ala Tyr Val Met Asp Thr Ser Tyr Pro Ser Lys Ile Glu
Asp Asp 340 345 350
Pro Phe Glu Asn Leu Glu Asp Thr Asp Asp Leu Phe Lys Thr Asn Leu
355 360 365 Asn Ile Glu Thr
Thr Val Val Glu Arg Lys Leu Ile Arg Lys Leu Ala 370
375 380 Glu Leu Val Gly Thr Arg Ala Ala
Arg Leu Thr Val Cys Gly Val Ser 385 390
395 400 Ala Ile Cys Asp Lys Arg Gly Tyr Lys Thr Ala His
Ile Ala Ala Asp 405 410
415 Gly Ser Val Phe Asn Arg Tyr Pro Gly Tyr Lys Glu Lys Ala Ala Gln
420 425 430 Ala Leu Lys
Asp Ile Tyr Asn Trp Asp Val Glu Lys Met Glu Asp His 435
440 445 Pro Ile Gln Leu Val Ala Ala Glu
Asp Gly Ser Gly Val Gly Ala Ala 450 455
460 Ile Ile Ala Cys Leu Thr Gln Lys Arg Leu Ala Ala Gly
Lys Ser Val 465 470 475
480 Gly Ile Lys Gly Glu 485
5 1416 DNA Hansenula polymorpha
5 atgagtttgg atactgaagt cgataagatt gtgtcggagt ttgccgtcac ccaggagaca
60ctccaaaagg gtgtggagcg tttcattgag cttgcaactg ccggactgaa tagtgatgag
120gacaagtatg gtctgccaat gatcccaact tttgttacct ccatcccaac cggtaaagag
180aagggcattc tttttgccgc agacttggga ggaaccaatt tcagagtttg ctctgttgcc
240ttgaacggag atcacacttt caaactgatc cagcagaagt cacatattcc tgccgaactg
300atgacctcca cctcggacga attgttttcg tatcttgcaa gcaaggtcaa gaatttctta
360gagactcatc atgaaggggc tgttacttct acaggaagcc agaaattcaa gatgggtttc
420actttcagtt tccctgtctc gcagaccgcc ttaaacgccg gtactttgct aagatggacc
480aagggattca atattccgga tactgttggt caagaggttg tttctctatt ccaaatgcat
540ttagacgccc aggaaattcc tgttactgtg tctgccctgt ccaacgatac tgtgggaacc
600cttcttgcaa gatcctacac gggttccaat aaggagggca ctactgttct aggatgcatc
660ttcggaacgg gaacaaacgg tgcttacaac gagaagctcg agaatatcaa gaagcttccg
720gccgaggtga gagagaagct gaaggctcaa ggtgtcaccc acatggtcat taatactgaa
780tggggttcct tcgataacca gctcaaggtt ttgccaaata cgaagtatga cgctcaagtt
840gacgaactta ccggcaataa gggcttccac atgtttgaaa agcgtgtttc cggaatgttc
900ttgggtgaga ttctgagaca tattttggtc gaccttcact ctaagggagt gctatttact
960cagtacgcca gctacgaatc cctgccccac agattgagga cgccgtggga tctggactct
1020gaggttctct cactgattga gatcgacgaa tccaccaatt tgcaggccac tgagctgtct
1080ttgaaacagg cattgagact gccaactact actgaggaga gacttgctat tcaaaaactt
1140actcgtgctg tggccaagag atctgcctat cttgctgcta ttcctattgc tgctattcta
1200cacatgaccg agtcttttaa gggccacaac gttgaggtgg acgttggagc agacgggtct
1260gtggttgagt tctaccctgg attcagaact atgatgagag acgccattgc gcagacgcag
1320ataggtgcca aaggagagag aagactgcac attaacattg ccaaagacgg ctcatctgtg
1380ggcgctgcat tgtgcgcatt aagcgagaaa gactaa
1416
6 471 PRT Hansenula polymorpha
6 Met Ser Leu Asp Thr Glu Val Asp Lys Ile
Val Ser Glu Phe Ala Val 1 5 10
15 Thr Gln Glu Thr Leu Gln Lys Gly Val Glu Arg Phe Ile Glu Leu
Ala 20 25 30 Thr
Ala Gly Leu Asn Ser Asp Glu Asp Lys Tyr Gly Leu Pro Met Ile 35
40 45 Pro Thr Phe Val Thr Ser
Ile Pro Thr Gly Lys Glu Lys Gly Ile Leu 50 55
60 Phe Ala Ala Asp Leu Gly Gly Thr Asn Phe Arg
Val Cys Ser Val Ala 65 70 75
80 Leu Asn Gly Asp His Thr Phe Lys Leu Ile Gln Gln Lys Ser His Ile
85 90 95 Pro Ala
Glu Leu Met Thr Ser Thr Ser Asp Glu Leu Phe Ser Tyr Leu 100
105 110 Ala Ser Lys Val Lys Asn Phe
Leu Glu Thr His His Glu Gly Ala Val 115 120
125 Thr Ser Thr Gly Ser Gln Lys Phe Lys Met Gly Phe
Thr Phe Ser Phe 130 135 140
Pro Val Ser Gln Thr Ala Leu Asn Ala Gly Thr Leu Leu Arg Trp Thr 145
150 155 160 Lys Gly Phe
Asn Ile Pro Asp Thr Val Gly Gln Glu Val Val Ser Leu 165
170 175 Phe Gln Met His Leu Asp Ala Gln
Glu Ile Pro Val Thr Val Ser Ala 180 185
190 Leu Ser Asn Asp Thr Val Gly Thr Leu Leu Ala Arg Ser
Tyr Thr Gly 195 200 205
Ser Asn Lys Glu Gly Thr Thr Val Leu Gly Cys Ile Phe Gly Thr Gly 210
215 220 Thr Asn Gly Ala
Tyr Asn Glu Lys Leu Glu Asn Ile Lys Lys Leu Pro 225 230
235 240 Ala Glu Val Arg Glu Lys Leu Lys Ala
Gln Gly Val Thr His Met Val 245 250
255 Ile Asn Thr Glu Trp Gly Ser Phe Asp Asn Gln Leu Lys Val
Leu Pro 260 265 270
Asn Thr Lys Tyr Asp Ala Gln Val Asp Glu Leu Thr Gly Asn Lys Gly
275 280 285 Phe His Met Phe
Glu Lys Arg Val Ser Gly Met Phe Leu Gly Glu Ile 290
295 300 Leu Arg His Ile Leu Val Asp Leu
His Ser Lys Gly Val Leu Phe Thr 305 310
315 320 Gln Tyr Ala Ser Tyr Glu Ser Leu Pro His Arg Leu
Arg Thr Pro Trp 325 330
335 Asp Leu Asp Ser Glu Val Leu Ser Leu Ile Glu Ile Asp Glu Ser Thr
340 345 350 Asn Leu Gln
Ala Thr Glu Leu Ser Leu Lys Gln Ala Leu Arg Leu Pro 355
360 365 Thr Thr Thr Glu Glu Arg Leu Ala
Ile Gln Lys Leu Thr Arg Ala Val 370 375
380 Ala Lys Arg Ser Ala Tyr Leu Ala Ala Ile Pro Ile Ala
Ala Ile Leu 385 390 395
400 His Met Thr Glu Ser Phe Lys Gly His Asn Val Glu Val Asp Val Gly
405 410 415 Ala Asp Gly Ser
Val Val Glu Phe Tyr Pro Gly Phe Arg Thr Met Met 420
425 430 Arg Asp Ala Ile Ala Gln Thr Gln Ile
Gly Ala Lys Gly Glu Arg Arg 435 440
445 Leu His Ile Asn Ile Ala Lys Asp Gly Ser Ser Val Gly Ala
Ala Leu 450 455 460
Cys Ala Leu Ser Glu Lys Asp 465 470
7 1634 DNA Schizosaccharomyces pombe
7 aacacttttc gcctcacttg cgaatctacg
aaaggaatat ataggtggtt cacccctttt 60cttttcattt cgtgttttta atagttattt
acatcaacag agataactat ttctgttaac 120gatttttttt cccacttgtt ttcttccttt
tttggtgaat tttaattaat ttataataag 180caatggaggc taattttcaa caagctgtta
aaaagttagt caatgacttt gaatacccta 240ccgagtcctt gagagaggcc gttaaggagt
ttgacgaatt acgtcaaaag ggtttacaaa 300agaatggtga ggtgcttgct atggctcctg
cctttatctc tacccttccc accggcgctg 360aaactggtga cttcttggcc cttgactttg
gtggtaccaa cttgcgtgtt tgttggatcc 420aacttctcgg tgacggcaag tatgagatga
agcacagcaa gtccgtcttg ccccgtgaat 480gcgttcgtaa cgagtctgtt aagcccatca
ttgactttat gagtgaccat gttgagcttt 540tcatcaagga gcacttccct tccaagtttg
gctgccctga ggaggaatac cttcctatgg 600gtttcacctt ttcttatccc gccaaccaag
tttccatcac cgagagctac ttgcttcgtt 660ggaccaaggg tcttaacatt cctgaggcca
tcaacaagga ctttgcccaa tttttgactg 720aaggtttcaa ggctcgtaac cttcctatta
gaatcgaggc tgtcatcaac gataccgtcg 780gtactctcgt tacccgtgct tatacttcaa
aggagagcga cacctttatg ggtatcattt 840tcggaaccgg taccaacggt gcttacgtcg
agcaaatgaa ccaaattccc aagcttgctg 900gcaagtgtac tggtgatcat atgcttatca
acatggaatg gggagcaact gatttctctt 960gccttcactc cactcgttat gatttacttc
ttgatcatga tactcccaat gctggtcgtc 1020aaatctttga gaagcgcgtt ggtggtatgt
atctcggtga gcttttccgc cgtgccttat 1080tccacttgat caaggtttac aacttcaacg
aaggtatttt ccctccttcc attactgatg 1140cttggtcttt ggaaacttct gttctttcca
gaatgatggt tgaacgttct gctgagaatg 1200ttcgtaacgt tcttagtaca ttcaagttcc
gtttccgcag cgacgaagag gctttgtacc 1260tttgggatgc tgctcatgca attggccgtc
gtgctgctcg tatgtctgcc gttcccattg 1320cttctttgta tctttctacc ggccgcgctg
gtaagaagag tgatgttggt gttgatggtt 1380ctttagtcga acactatcct cactttgttg
acatgctccg tgaagccttg cgtgagctta 1440tcggtgataa cgaaaaattg atttccattg
gtattgccaa ggatggcagt ggtattggtg 1500ccgctctttg cgccctccaa gctgttaagg
aaaagaaagg cttggcctaa atcatgttag 1560atgtctgtta gctttttttg aattgtacgt
agaaatgagc atgtaaatat gaaattgctt 1620tttaacagct ttta
1634
8 455 PRT Schizosaccharomyces pombe
8 Met Glu Ala Asn Phe Gln Gln Ala Val Lys Lys Leu Val Asn Asp Phe 1
5 10 15 Glu Tyr Pro Thr
Glu Ser Leu Arg Glu Ala Val Lys Glu Phe Asp Glu 20
25 30 Leu Arg Gln Lys Gly Leu Gln Lys Asn
Gly Glu Val Leu Ala Met Ala 35 40
45 Pro Ala Phe Ile Ser Thr Leu Pro Thr Gly Ala Glu Thr Gly
Asp Phe 50 55 60
Leu Ala Leu Asp Phe Gly Gly Thr Asn Leu Arg Val Cys Trp Ile Gln 65
70 75 80 Leu Leu Gly Asp Gly
Lys Tyr Glu Met Lys His Ser Lys Ser Val Leu 85
90 95 Pro Arg Glu Cys Val Arg Asn Glu Ser Val
Lys Pro Ile Ile Asp Phe 100 105
110 Met Ser Asp His Val Glu Leu Phe Ile Lys Glu His Phe Pro Ser
Lys 115 120 125 Phe
Gly Cys Pro Glu Glu Glu Tyr Leu Pro Met Gly Phe Thr Phe Ser 130
135 140 Tyr Pro Ala Asn Gln Val
Ser Ile Thr Glu Ser Tyr Leu Leu Arg Trp 145 150
155 160 Thr Lys Gly Leu Asn Ile Pro Glu Ala Ile Asn
Lys Asp Phe Ala Gln 165 170
175 Phe Leu Thr Glu Gly Phe Lys Ala Arg Asn Leu Pro Ile Arg Ile Glu
180 185 190 Ala Val
Ile Asn Asp Thr Val Gly Thr Leu Val Thr Arg Ala Tyr Thr 195
200 205 Ser Lys Glu Ser Asp Thr Phe
Met Gly Ile Ile Phe Gly Thr Gly Thr 210 215
220 Asn Gly Ala Tyr Val Glu Gln Met Asn Gln Ile Pro
Lys Leu Ala Gly 225 230 235
240 Lys Cys Thr Gly Asp His Met Leu Ile Asn Met Glu Trp Gly Ala Thr
245 250 255 Asp Phe Ser
Cys Leu His Ser Thr Arg Tyr Asp Leu Leu Leu Asp His 260
265 270 Asp Thr Pro Asn Ala Gly Arg Gln
Ile Phe Glu Lys Arg Val Gly Gly 275 280
285 Met Tyr Leu Gly Glu Leu Phe Arg Arg Ala Leu Phe His
Leu Ile Lys 290 295 300
Val Tyr Asn Phe Asn Glu Gly Ile Phe Pro Pro Ser Ile Thr Asp Ala 305
310 315 320 Trp Ser Leu Glu
Thr Ser Val Leu Ser Arg Met Met Val Glu Arg Ser 325
330 335 Ala Glu Asn Val Arg Asn Val Leu Ser
Thr Phe Lys Phe Arg Phe Arg 340 345
350 Ser Asp Glu Glu Ala Leu Tyr Leu Trp Asp Ala Ala His Ala
Ile Gly 355 360 365
Arg Arg Ala Ala Arg Met Ser Ala Val Pro Ile Ala Ser Leu Tyr Leu 370
375 380 Ser Thr Gly Arg Ala
Gly Lys Lys Ser Asp Val Gly Val Asp Gly Ser 385 390
395 400 Leu Val Glu His Tyr Pro His Phe Val Asp
Met Leu Arg Glu Ala Leu 405 410
415 Arg Glu Leu Ile Gly Asp Asn Glu Lys Leu Ile Ser Ile Gly Ile
Ala 420 425 430 Lys
Asp Gly Ser Gly Ile Gly Ala Ala Leu Cys Ala Leu Gln Ala Val 435
440 445 Lys Glu Lys Lys Gly Leu
Ala 450 455
9 1689 DNA Saccharomyces cerevisiae
9 atgtctgaaa ttactttggg taaatatttg ttcgaaagat taaagcaagt caacgttaac
60accgttttcg gtttgccagg tgacttcaac ttgtccttgt tggacaagat ctacgaagtt
120gaaggtatga gatgggctgg taacgccaac gaattgaacg ctgcttacgc cgctgatggt
180tacgctcgta tcaagggtat gtcttgtatc atcaccacct tcggtgtcgg tgaattgtct
240gctttgaacg gtattgccgg ttcttacgct gaacacgtcg gtgttttgca cgttgttggt
300gtcccatcca tctctgctca agctaagcaa ttgttgttgc accacacctt gggtaacggt
360gacttcactg ttttccacag aatgtctgcc aacatttctg aaaccactgc tatgatcact
420gacattgcta ccgccccagc tgaaattgac agatgtatca gaaccactta cgtcacccaa
480agaccagtct acttaggttt gccagctaac ttggtcgact tgaacgtccc agctaagttg
540ttgcaaactc caattgacat gtctttgaag ccaaacgatg ctgaatccga aaaggaagtc
600attgacacca tcttggcttt ggtcaaggat gctaagaacc cagttatctt ggctgatgct
660tgttgttcca gacacgacgt caaggctgaa actaagaagt tgattgactt gactcaattc
720ccagctttcg tcaccccaat gggtaagggt tccattgacg aacaacaccc aagatacggt
780ggtgtttacg tcggtacctt gtccaagcca gaagttaagg aagccgttga atctgctgac
840ttgattttgt ctgtcggtgc tttgttgtct gatttcaaca ccggttcttt ctcttactct
900tacaagacca agaacattgt cgaattccac tccgaccaca tgaagatcag aaacgccact
960ttcccaggtg tccaaatgaa attcgttttg caaaagttgt tgaccactat tgctgacgcc
1020gctaagggtt acaagccagt tgctgtccca gctagaactc cagctaacgc tgctgtccca
1080gcttctaccc cattgaagca agaatggatg tggaaccaat tgggtaactt cttgcaagaa
1140ggtgatgttg tcattgctga aaccggtacc tccgctttcg gtatcaacca aaccactttc
1200ccaaacaaca cctacggtat ctctcaagtc ttatggggtt ccattggttt caccactggt
1260gctaccttgg gtgctgcttt cgctgctgaa gaaattgatc caaagaagag agttatctta
1320ttcattggtg acggttcttt gcaattgact gttcaagaaa tctccaccat gatcagatgg
1380ggcttgaagc catacttgtt cgtcttgaac aacgatggtt acaccattga aaagttgatt
1440cacggtccaa aggctcaata caacgaaatt caaggttggg accacctatc cttgttgcca
1500actttcggtg ctaaggacta tgaaacccac agagtcgcta ccaccggtga atgggacaag
1560ttgacccaag acaagtcttt caacgacaac tctaagatca gaatgattga aatcatgttg
1620ccagtcttcg atgctccaca aaacttggtt gaacaagcta agttgactgc tgctaccaac
1680gctaagcaa
1689
10 563 PRT Saccharomyces cerevisiae
10 Met Ser Glu Ile Thr Leu Gly Lys
Tyr Leu Phe Glu Arg Leu Lys Gln 1 5 10
15 Val Asn Val Asn Thr Val Phe Gly Leu Pro Gly Asp Phe
Asn Leu Ser 20 25 30
Leu Leu Asp Lys Ile Tyr Glu Val Glu Gly Met Arg Trp Ala Gly Asn
35 40 45 Ala Asn Glu Leu
Asn Ala Ala Tyr Ala Ala Asp Gly Tyr Ala Arg Ile 50
55 60 Lys Gly Met Ser Cys Ile Ile Thr
Thr Phe Gly Val Gly Glu Leu Ser 65 70
75 80 Ala Leu Asn Gly Ile Ala Gly Ser Tyr Ala Glu His
Val Gly Val Leu 85 90
95 His Val Val Gly Val Pro Ser Ile Ser Ala Gln Ala Lys Gln Leu Leu
100 105 110 Leu His His
Thr Leu Gly Asn Gly Asp Phe Thr Val Phe His Arg Met 115
120 125 Ser Ala Asn Ile Ser Glu Thr Thr
Ala Met Ile Thr Asp Ile Ala Thr 130 135
140 Ala Pro Ala Glu Ile Asp Arg Cys Ile Arg Thr Thr Tyr
Val Thr Gln 145 150 155
160 Arg Pro Val Tyr Leu Gly Leu Pro Ala Asn Leu Val Asp Leu Asn Val
165 170 175 Pro Ala Lys Leu
Leu Gln Thr Pro Ile Asp Met Ser Leu Lys Pro Asn 180
185 190 Asp Ala Glu Ser Glu Lys Glu Val Ile
Asp Thr Ile Leu Ala Leu Val 195 200
205 Lys Asp Ala Lys Asn Pro Val Ile Leu Ala Asp Ala Cys Cys
Ser Arg 210 215 220
His Asp Val Lys Ala Glu Thr Lys Lys Leu Ile Asp Leu Thr Gln Phe 225
230 235 240 Pro Ala Phe Val Thr
Pro Met Gly Lys Gly Ser Ile Asp Glu Gln His 245
250 255 Pro Arg Tyr Gly Gly Val Tyr Val Gly Thr
Leu Ser Lys Pro Glu Val 260 265
270 Lys Glu Ala Val Glu Ser Ala Asp Leu Ile Leu Ser Val Gly Ala
Leu 275 280 285 Leu
Ser Asp Phe Asn Thr Gly Ser Phe Ser Tyr Ser Tyr Lys Thr Lys 290
295 300 Asn Ile Val Glu Phe His
Ser Asp His Met Lys Ile Arg Asn Ala Thr 305 310
315 320 Phe Pro Gly Val Gln Met Lys Phe Val Leu Gln
Lys Leu Leu Thr Thr 325 330
335 Ile Ala Asp Ala Ala Lys Gly Tyr Lys Pro Val Ala Val Pro Ala Arg
340 345 350 Thr Pro
Ala Asn Ala Ala Val Pro Ala Ser Thr Pro Leu Lys Gln Glu 355
360 365 Trp Met Trp Asn Gln Leu Gly
Asn Phe Leu Gln Glu Gly Asp Val Val 370 375
380 Ile Ala Glu Thr Gly Thr Ser Ala Phe Gly Ile Asn
Gln Thr Thr Phe 385 390 395
400 Pro Asn Asn Thr Tyr Gly Ile Ser Gln Val Leu Trp Gly Ser Ile Gly
405 410 415 Phe Thr Thr
Gly Ala Thr Leu Gly Ala Ala Phe Ala Ala Glu Glu Ile 420
425 430 Asp Pro Lys Lys Arg Val Ile Leu
Phe Ile Gly Asp Gly Ser Leu Gln 435 440
445 Leu Thr Val Gln Glu Ile Ser Thr Met Ile Arg Trp Gly
Leu Lys Pro 450 455 460
Tyr Leu Phe Val Leu Asn Asn Asp Gly Tyr Thr Ile Glu Lys Leu Ile 465
470 475 480 His Gly Pro Lys
Ala Gln Tyr Asn Glu Ile Gln Gly Trp Asp His Leu 485
490 495 Ser Leu Leu Pro Thr Phe Gly Ala Lys
Asp Tyr Glu Thr His Arg Val 500 505
510 Ala Thr Thr Gly Glu Trp Asp Lys Leu Thr Gln Asp Lys Ser
Phe Asn 515 520 525
Asp Asn Ser Lys Ile Arg Met Ile Glu Ile Met Leu Pro Val Phe Asp 530
535 540 Ala Pro Gln Asn Leu
Val Glu Gln Ala Lys Leu Thr Ala Ala Thr Asn 545 550
555 560 Ala Lys Gln
11 1689 DNA Saccharomyces cerevisiae
11 atgtctgaaa taaccttagg taaatattta tttgaaagat tgagccaagt
caactgtaac 60accgtcttcg gtttgccagg tgactttaac ttgtctcttt tggataagct
ttatgaagtc 120aaaggtatga gatgggctgg taacgctaac gaattgaacg ctgcctatgc
tgctgatggt 180tacgctcgta tcaagggtat gtcctgtatt attaccacct tcggtgttgg
tgaattgtct 240gctttgaatg gtattgccgg ttcttacgct gaacatgtcg gtgttttgca
cgttgttggt 300gttccatcca tctcttctca agctaagcaa ttgttgttgc atcatacctt
gggtaacggt 360gacttcactg ttttccacag aatgtctgcc aacatttctg aaaccactgc
catgatcact 420gatattgcta acgctccagc tgaaattgac agatgtatca gaaccaccta
cactacccaa 480agaccagtct acttgggttt gccagctaac ttggttgact tgaacgtccc
agccaagtta 540ttggaaactc caattgactt gtctttgaag ccaaacgacg ctgaagctga
agctgaagtt 600gttagaactg ttgttgaatt gatcaaggat gctaagaacc cagttatctt
ggctgatgct 660tgtgcttcta gacatgatgt caaggctgaa actaagaagt tgatggactt
gactcaattc 720ccagtttacg tcaccccaat gggtaagggt gctattgacg aacaacaccc
aagatacggt 780ggtgtttacg ttggtacctt gtctagacca gaagttaaga aggctgtaga
atctgctgat 840ttgatattgt ctatcggtgc tttgttgtct gatttcaata ccggttcttt
ctcttactcc 900tacaagacca aaaatatcgt tgaattccac tctgaccaca tcaagatcag
aaacgccacc 960ttcccaggtg ttcaaatgaa atttgccttg caaaaattgt tggatgctat
tccagaagtc 1020gtcaaggact acaaacctgt tgctgtccca gctagagttc caattaccaa
gtctactcca 1080gctaacactc caatgaagca agaatggatg tggaaccatt tgggtaactt
cttgagagaa 1140ggtgatattg ttattgctga aaccggtact tccgccttcg gtattaacca
aactactttc 1200ccaacagatg tatacgctat cgtccaagtc ttgtggggtt ccattggttt
cacagtcggc 1260gctctattgg gtgctactat ggccgctgaa gaacttgatc caaagaagag
agttatttta 1320ttcattggtg acggttctct acaattgact gttcaagaaa tctctaccat
gattagatgg 1380ggtttgaagc catacatttt tgtcttgaat aacaacggtt acaccattga
aaaattgatt 1440cacggtcctc atgccgaata taatgaaatt caaggttggg accacttggc
cttattgcca 1500acttttggtg ctagaaacta cgaaacccac agagttgcta ccactggtga
atgggaaaag 1560ttgactcaag acaaggactt ccaagacaac tctaagatta gaatgattga
agttatgttg 1620ccagtctttg atgctccaca aaacttggtt aaacaagctc aattgactgc
cgctactaac 1680gctaaacaa
1689
12 563 PRT Saccharomyces cerevisiae
12 Met Ser Glu Ile Thr Leu
Gly Lys Tyr Leu Phe Glu Arg Leu Ser Gln 1 5
10 15 Val Asn Cys Asn Thr Val Phe Gly Leu Pro Gly
Asp Phe Asn Leu Ser 20 25
30 Leu Leu Asp Lys Leu Tyr Glu Val Lys Gly Met Arg Trp Ala Gly
Asn 35 40 45 Ala
Asn Glu Leu Asn Ala Ala Tyr Ala Ala Asp Gly Tyr Ala Arg Ile 50
55 60 Lys Gly Met Ser Cys Ile
Ile Thr Thr Phe Gly Val Gly Glu Leu Ser 65 70
75 80 Ala Leu Asn Gly Ile Ala Gly Ser Tyr Ala Glu
His Val Gly Val Leu 85 90
95 His Val Val Gly Val Pro Ser Ile Ser Ser Gln Ala Lys Gln Leu Leu
100 105 110 Leu His
His Thr Leu Gly Asn Gly Asp Phe Thr Val Phe His Arg Met 115
120 125 Ser Ala Asn Ile Ser Glu Thr
Thr Ala Met Ile Thr Asp Ile Ala Asn 130 135
140 Ala Pro Ala Glu Ile Asp Arg Cys Ile Arg Thr Thr
Tyr Thr Thr Gln 145 150 155
160 Arg Pro Val Tyr Leu Gly Leu Pro Ala Asn Leu Val Asp Leu Asn Val
165 170 175 Pro Ala Lys
Leu Leu Glu Thr Pro Ile Asp Leu Ser Leu Lys Pro Asn 180
185 190 Asp Ala Glu Ala Glu Ala Glu Val
Val Arg Thr Val Val Glu Leu Ile 195 200
205 Lys Asp Ala Lys Asn Pro Val Ile Leu Ala Asp Ala Cys
Ala Ser Arg 210 215 220
His Asp Val Lys Ala Glu Thr Lys Lys Leu Met Asp Leu Thr Gln Phe 225
230 235 240 Pro Val Tyr Val
Thr Pro Met Gly Lys Gly Ala Ile Asp Glu Gln His 245
250 255 Pro Arg Tyr Gly Gly Val Tyr Val Gly
Thr Leu Ser Arg Pro Glu Val 260 265
270 Lys Lys Ala Val Glu Ser Ala Asp Leu Ile Leu Ser Ile Gly
Ala Leu 275 280 285
Leu Ser Asp Phe Asn Thr Gly Ser Phe Ser Tyr Ser Tyr Lys Thr Lys 290
295 300 Asn Ile Val Glu Phe
His Ser Asp His Ile Lys Ile Arg Asn Ala Thr 305 310
315 320 Phe Pro Gly Val Gln Met Lys Phe Ala Leu
Gln Lys Leu Leu Asp Ala 325 330
335 Ile Pro Glu Val Val Lys Asp Tyr Lys Pro Val Ala Val Pro Ala
Arg 340 345 350 Val
Pro Ile Thr Lys Ser Thr Pro Ala Asn Thr Pro Met Lys Gln Glu 355
360 365 Trp Met Trp Asn His Leu
Gly Asn Phe Leu Arg Glu Gly Asp Ile Val 370 375
380 Ile Ala Glu Thr Gly Thr Ser Ala Phe Gly Ile
Asn Gln Thr Thr Phe 385 390 395
400 Pro Thr Asp Val Tyr Ala Ile Val Gln Val Leu Trp Gly Ser Ile Gly
405 410 415 Phe Thr
Val Gly Ala Leu Leu Gly Ala Thr Met Ala Ala Glu Glu Leu 420
425 430 Asp Pro Lys Lys Arg Val Ile
Leu Phe Ile Gly Asp Gly Ser Leu Gln 435 440
445 Leu Thr Val Gln Glu Ile Ser Thr Met Ile Arg Trp
Gly Leu Lys Pro 450 455 460
Tyr Ile Phe Val Leu Asn Asn Asn Gly Tyr Thr Ile Glu Lys Leu Ile 465
470 475 480 His Gly Pro
His Ala Glu Tyr Asn Glu Ile Gln Gly Trp Asp His Leu 485
490 495 Ala Leu Leu Pro Thr Phe Gly Ala
Arg Asn Tyr Glu Thr His Arg Val 500 505
510 Ala Thr Thr Gly Glu Trp Glu Lys Leu Thr Gln Asp Lys
Asp Phe Gln 515 520 525
Asp Asn Ser Lys Ile Arg Met Ile Glu Val Met Leu Pro Val Phe Asp 530
535 540 Ala Pro Gln Asn
Leu Val Lys Gln Ala Gln Leu Thr Ala Ala Thr Asn 545 550
555 560 Ala Lys Gln
13 1599 DNA Saccharomyces cerevisiae
13 atgtctgaaa ttactcttgg aaaatactta tttgaaagat tgaagcaagt
taatgttaac 60accatttttg ggctaccagg cgacttcaac ttgtccctat tggacaagat
ttacgaggta 120gatggattga gatgggctgg taatgcaaat gagctgaacg ccgcctatgc
cgccgatggt 180tacgcacgca tcaagggttt atctgtgctg gtaactactt ttggcgtagg
tgaattatcc 240gccttgaatg gtattgcagg atcgtatgca gaacacgtcg gtgtactgca
tgttgttggt 300gtcccctcta tctccgctca ggctaagcaa ttgttgttgc atcatacctt
gggtaacggt 360gattttaccg tttttcacag aatgtccgcc aatatctcag aaactacatc
aatgattaca 420gacattgcta cagccccttc agaaatcgat aggttgatca ggacaacatt
tataacacaa 480aggcctagct acttggggtt gccagcgaat ttggtagatc taaaggttcc
tggttctctt 540ttggaaaaac cgattgatct atcattaaaa cctaacgatc ccgaagctga
aaaggaagtt 600attgataccg tactagaatt gatccagaat tcgaaaaacc ctgttatact
atcggatgcc 660tgtgcttcta ggcacaacgt taaaaaagaa acccagaagt taattgattt
gacgcaattc 720ccagcttttg tgacacctct aggtaaaggg tcaatagatg aacagcatcc
cagatatggc 780ggtgtttatg tgggaacgct gtccaaacaa gacgtgaaac aggccgttga
gtcggctgat 840ttgatccttt cggtcggtgc tttgctctct gattttaaca caggttcgtt
ttcctactcc 900tacaagacta aaaatgtagt ggagtttcat tccgattacg taaaggtgaa
gaacgctacg 960ttcctcggtg tacaaatgaa atttgcacta caaaacttac tgaaggttat
tcccgatgtt 1020gttaagggct acaagagcgt tcccgtacca accaaaactc ccgcaaacaa
aggtgtacct 1080gctagcacgc ccttgaaaca agagtggttg tggaacgaat tgtccaaatt
cttgcaagaa 1140ggtgatgtta tcatttccga gaccggcacg tctgccttcg gtatcaatca
aactatcttt 1200cctaaggacg cctacggtat ctcgcaggtg ttgtgggggt ccatcggttt
tacaacagga 1260gcaactttag gtgctgcctt tgccgctgag gagattgacc ccaacaagag
agtcatctta 1320ttcataggtg acgggtcttt gcagttaacc gtccaagaaa tctccaccat
gatcagatgg 1380gggttaaagc cgtatctttt tgtccttaac aacgacggct acactatcga
aaagctgatt 1440catgggcctc acgcagagta caacgaaatc cagacctggg atcacctcgc
cctgttgccc 1500gcatttggtg cgaaaaagta cgaaaatcac aagatcgcca ctacgggtga
gtgggatgcc 1560ttaaccactg attcagagtt ccagaaaaac tcggtgatc
1599
14 533 PRT Saccharomyces cerevisiae
14 Met Ser Glu Ile Thr Leu
Gly Lys Tyr Leu Phe Glu Arg Leu Lys Gln 1 5
10 15 Val Asn Val Asn Thr Ile Phe Gly Leu Pro Gly
Asp Phe Asn Leu Ser 20 25
30 Leu Leu Asp Lys Ile Tyr Glu Val Asp Gly Leu Arg Trp Ala Gly
Asn 35 40 45 Ala
Asn Glu Leu Asn Ala Ala Tyr Ala Ala Asp Gly Tyr Ala Arg Ile 50
55 60 Lys Gly Leu Ser Val Leu
Val Thr Thr Phe Gly Val Gly Glu Leu Ser 65 70
75 80 Ala Leu Asn Gly Ile Ala Gly Ser Tyr Ala Glu
His Val Gly Val Leu 85 90
95 His Val Val Gly Val Pro Ser Ile Ser Ala Gln Ala Lys Gln Leu Leu
100 105 110 Leu His
His Thr Leu Gly Asn Gly Asp Phe Thr Val Phe His Arg Met 115
120 125 Ser Ala Asn Ile Ser Glu Thr
Thr Ser Met Ile Thr Asp Ile Ala Thr 130 135
140 Ala Pro Ser Glu Ile Asp Arg Leu Ile Arg Thr Thr
Phe Ile Thr Gln 145 150 155
160 Arg Pro Ser Tyr Leu Gly Leu Pro Ala Asn Leu Val Asp Leu Lys Val
165 170 175 Pro Gly Ser
Leu Leu Glu Lys Pro Ile Asp Leu Ser Leu Lys Pro Asn 180
185 190 Asp Pro Glu Ala Glu Lys Glu Val
Ile Asp Thr Val Leu Glu Leu Ile 195 200
205 Gln Asn Ser Lys Asn Pro Val Ile Leu Ser Asp Ala Cys
Ala Ser Arg 210 215 220
His Asn Val Lys Lys Glu Thr Gln Lys Leu Ile Asp Leu Thr Gln Phe 225
230 235 240 Pro Ala Phe Val
Thr Pro Leu Gly Lys Gly Ser Ile Asp Glu Gln His 245
250 255 Pro Arg Tyr Gly Gly Val Tyr Val Gly
Thr Leu Ser Lys Gln Asp Val 260 265
270 Lys Gln Ala Val Glu Ser Ala Asp Leu Ile Leu Ser Val Gly
Ala Leu 275 280 285
Leu Ser Asp Phe Asn Thr Gly Ser Phe Ser Tyr Ser Tyr Lys Thr Lys 290
295 300 Asn Val Val Glu Phe
His Ser Asp Tyr Val Lys Val Lys Asn Ala Thr 305 310
315 320 Phe Leu Gly Val Gln Met Lys Phe Ala Leu
Gln Asn Leu Leu Lys Val 325 330
335 Ile Pro Asp Val Val Lys Gly Tyr Lys Ser Val Pro Val Pro Thr
Lys 340 345 350 Thr
Pro Ala Asn Lys Gly Val Pro Ala Ser Thr Pro Leu Lys Gln Glu 355
360 365 Trp Leu Trp Asn Glu Leu
Ser Lys Phe Leu Gln Glu Gly Asp Val Ile 370 375
380 Ile Ser Glu Thr Gly Thr Ser Ala Phe Gly Ile
Asn Gln Thr Ile Phe 385 390 395
400 Pro Lys Asp Ala Tyr Gly Ile Ser Gln Val Leu Trp Gly Ser Ile Gly
405 410 415 Phe Thr
Thr Gly Ala Thr Leu Gly Ala Ala Phe Ala Ala Glu Glu Ile 420
425 430 Asp Pro Asn Lys Arg Val Ile
Leu Phe Ile Gly Asp Gly Ser Leu Gln 435 440
445 Leu Thr Val Gln Glu Ile Ser Thr Met Ile Arg Trp
Gly Leu Lys Pro 450 455 460
Tyr Leu Phe Val Leu Asn Asn Asp Gly Tyr Thr Ile Glu Lys Leu Ile 465
470 475 480 His Gly Pro
His Ala Glu Tyr Asn Glu Ile Gln Thr Trp Asp His Leu 485
490 495 Ala Leu Leu Pro Ala Phe Gly Ala
Lys Lys Tyr Glu Asn His Lys Ile 500 505
510 Ala Thr Thr Gly Glu Trp Asp Ala Leu Thr Thr Asp Ser
Glu Phe Gln 515 520 525
Lys Asn Ser Val Ile 530
15 1692 DNA Candida glabrata
15 atgtctgaga ttactttggg tagatacttg ttcgagagat tgaaccaagt cgacgttaag
60accatcttcg gtttgccagg tgacttcaac ttgtccctat tggacaagat ctacgaagtt
120gaaggtatga gatgggctgg taacgctaac gaattgaacg ctgcttacgc tgctgacggt
180tacgctagaa tcaagggtat gtcctgtatc atcaccacct tcggtgtcgg tgaattgtct
240gccttgaacg gtattgccgg ttcttacgct gaacacgtcg gtgtcttgca cgtcgtcggt
300gtcccatcca tctcctctca agctaagcaa ttgttgttgc accacacctt gggtaacggt
360gacttcactg tcttccacag aatgtccgct aacatctctg agaccaccgc tatggtcact
420gacatcgcta ccgctccagc tgagatcgac agatgtatca gaaccaccta catcacccaa
480agaccagtct acttgggtct accagctaac ttggtcgacc taaaggtccc agccaagctt
540ttggaaaccc caattgactt gtccttgaag ccaaacgacc cagaagccga aactgaagtc
600gttgacaccg tcttggaatt gatcaaggct gctaagaacc cagttatctt ggctgatgct
660tgtgcttcca gacacgacgt caaggctgaa accaagaagt tgattgacgc cactcaattc
720ccatccttcg ttaccccaat gggtaagggt tccatcgacg aacaacaccc aagattcggt
780ggtgtctacg tcggtacctt gtccagacca gaagttaagg aagctgttga atccgctgac
840ttgatcttgt ctgtcggtgc tttgttgtcc gatttcaaca ctggttcttt ctcttactct
900tacaagacca agaacatcgt cgaattccac tctgactaca tcaagatcag aaacgctacc
960ttcccaggtg tccaaatgaa gttcgctttg caaaagttgt tgaacgccgt cccagaagct
1020atcaagggtt acaagccagt ccctgtccca gctagagtcc cagaaaacaa gtcctgtgac
1080ccagctaccc cattgaagca agaatggatg tggaaccaag tttccaagtt cttgcaagaa
1140ggtgatgttg ttatcactga aaccggtacc tccgcttttg gtatcaacca aaccccattc
1200ccaaacaacg cttacggtat ctcccaagtt ctatggggtt ccatcggttt caccaccggt
1260gcttgtttgg gtgccgcttt cgctgctgaa gaaatcgacc caaagaagag agttatcttg
1320ttcattggtg acggttcttt gcaattgact gtccaagaaa tctccaccat gatcagatgg
1380ggcttgaagc catacttgtt cgtcttgaac aacgacggtt acaccatcga aagattgatt
1440cacggtgaaa aggctggtta caacgacatc caaaactggg accacttggc tctattgcca
1500accttcggtg ctaaggacta cgaaaaccac agagtcgcca ccaccggtga atgggacaag
1560ttgacccaag acaaggaatt caacaagaac tccaagatca gaatgatcga agttatgttg
1620ccagttatgg acgctccaac ttccttgatt gaacaagcta agttgaccgc ttccatcaac
1680gctaagcaag aa
1692
16 564 PRT Candida glabrata
16 Met Ser Glu Ile Thr Leu Gly Arg Tyr Leu
Phe Glu Arg Leu Asn Gln 1 5 10
15 Val Asp Val Lys Thr Ile Phe Gly Leu Pro Gly Asp Phe Asn Leu
Ser 20 25 30 Leu
Leu Asp Lys Ile Tyr Glu Val Glu Gly Met Arg Trp Ala Gly Asn 35
40 45 Ala Asn Glu Leu Asn Ala
Ala Tyr Ala Ala Asp Gly Tyr Ala Arg Ile 50 55
60 Lys Gly Met Ser Cys Ile Ile Thr Thr Phe Gly
Val Gly Glu Leu Ser 65 70 75
80 Ala Leu Asn Gly Ile Ala Gly Ser Tyr Ala Glu His Val Gly Val Leu
85 90 95 His Val
Val Gly Val Pro Ser Ile Ser Ser Gln Ala Lys Gln Leu Leu 100
105 110 Leu His His Thr Leu Gly Asn
Gly Asp Phe Thr Val Phe His Arg Met 115 120
125 Ser Ala Asn Ile Ser Glu Thr Thr Ala Met Val Thr
Asp Ile Ala Thr 130 135 140
Ala Pro Ala Glu Ile Asp Arg Cys Ile Arg Thr Thr Tyr Ile Thr Gln 145
150 155 160 Arg Pro Val
Tyr Leu Gly Leu Pro Ala Asn Leu Val Asp Leu Lys Val 165
170 175 Pro Ala Lys Leu Leu Glu Thr Pro
Ile Asp Leu Ser Leu Lys Pro Asn 180 185
190 Asp Pro Glu Ala Glu Thr Glu Val Val Asp Thr Val Leu
Glu Leu Ile 195 200 205
Lys Ala Ala Lys Asn Pro Val Ile Leu Ala Asp Ala Cys Ala Ser Arg 210
215 220 His Asp Val Lys
Ala Glu Thr Lys Lys Leu Ile Asp Ala Thr Gln Phe 225 230
235 240 Pro Ser Phe Val Thr Pro Met Gly Lys
Gly Ser Ile Asp Glu Gln His 245 250
255 Pro Arg Phe Gly Gly Val Tyr Val Gly Thr Leu Ser Arg Pro
Glu Val 260 265 270
Lys Glu Ala Val Glu Ser Ala Asp Leu Ile Leu Ser Val Gly Ala Leu
275 280 285 Leu Ser Asp Phe
Asn Thr Gly Ser Phe Ser Tyr Ser Tyr Lys Thr Lys 290
295 300 Asn Ile Val Glu Phe His Ser Asp
Tyr Ile Lys Ile Arg Asn Ala Thr 305 310
315 320 Phe Pro Gly Val Gln Met Lys Phe Ala Leu Gln Lys
Leu Leu Asn Ala 325 330
335 Val Pro Glu Ala Ile Lys Gly Tyr Lys Pro Val Pro Val Pro Ala Arg
340 345 350 Val Pro Glu
Asn Lys Ser Cys Asp Pro Ala Thr Pro Leu Lys Gln Glu 355
360 365 Trp Met Trp Asn Gln Val Ser Lys
Phe Leu Gln Glu Gly Asp Val Val 370 375
380 Ile Thr Glu Thr Gly Thr Ser Ala Phe Gly Ile Asn Gln
Thr Pro Phe 385 390 395
400 Pro Asn Asn Ala Tyr Gly Ile Ser Gln Val Leu Trp Gly Ser Ile Gly
405 410 415 Phe Thr Thr Gly
Ala Cys Leu Gly Ala Ala Phe Ala Ala Glu Glu Ile 420
425 430 Asp Pro Lys Lys Arg Val Ile Leu Phe
Ile Gly Asp Gly Ser Leu Gln 435 440
445 Leu Thr Val Gln Glu Ile Ser Thr Met Ile Arg Trp Gly Leu
Lys Pro 450 455 460
Tyr Leu Phe Val Leu Asn Asn Asp Gly Tyr Thr Ile Glu Arg Leu Ile 465
470 475 480 His Gly Glu Lys Ala
Gly Tyr Asn Asp Ile Gln Asn Trp Asp His Leu 485
490 495 Ala Leu Leu Pro Thr Phe Gly Ala Lys Asp
Tyr Glu Asn His Arg Val 500 505
510 Ala Thr Thr Gly Glu Trp Asp Lys Leu Thr Gln Asp Lys Glu Phe
Asn 515 520 525 Lys
Asn Ser Lys Ile Arg Met Ile Glu Val Met Leu Pro Val Met Asp 530
535 540 Ala Pro Thr Ser Leu Ile
Glu Gln Ala Lys Leu Thr Ala Ser Ile Asn 545 550
555 560 Ala Lys Gln Glu
17 1788 DNA Pichia stipitis
17 atggctgaag tctcattagg aagatatctc ttcgagagat tgtaccaatt gcaagtgcag
60accatcttcg gtgtccctgg tgatttcaac ttgtcgcttt tggacaagat ctacgaagtg
120gaagatgccc atggcaagaa ttcgtttaga tgggctggta atgccaacga attgaatgca
180tcgtacgctg ctgacggtta ctcgagagtc aagcgtttag ggtgtttggt cactaccttt
240ggtgtcggtg aattgtctgc tttgaatggt attgccggtt cttatgccga acatgttggt
300ttgcttcatg tcgtaggtgt tccatcgatt tcctcgcaag ctaagcaatt gttacttcac
360cacactttgg gtaatggtga tttcactgtt ttccatagaa tgtccaacaa catttctcag
420accacagcct ttatctccga tatcaactcg gctccagctg aaattgatag atgtatcaga
480gaggcctacg tcaaacaaag accagtttat atcgggttac cagctaactt agttgatttg
540aatgttccgg cctctttgct tgagtctcca atcaacttgt cgttggaaaa gaacgaccca
600gaggctcaag atgaagtcat tgactctgtc ttagacttga tcaaaaagtc gctgaaccca
660atcatcttgg tcgatgcctg tgcctcgaga catgactgta aggctgaagt tactcagttg
720attgaacaaa cccaattccc agtatttgtc actccaatgg gtaaaggtac cgttgatgag
780ggtggtgtag acggagaatt gttagaagat gatcctcatt tgattgccaa ggtcgctgct
840aggttgtctg ctggcaagaa cgctgcctct agattcggag gtgtttatgt cggaaccttg
900tcgaagcccg aagtcaagga cgctgtagag agtgcagatt tgattttgtc tgtcggtgcc
960cttttgtctg atttcaacac tggttcattt tcctactcct acagaaccaa gaacatcgtc
1020gaattccatt ctgattacac taagattaga caagccactt tcccaggtgt gcagatgaag
1080gaagccttgc aagaattgaa caagaaagtt tcatctgctg ctagtcacta tgaagtcaag
1140cctgtgccca agatcaagtt ggccaataca ccagccacca gagaagtcaa gttaactcag
1200gaatggttgt ggaccagagt gtcttcgtgg ttcagagaag gtgatattat tatcaccgaa
1260accggtacat cctccttcgg tatagttcaa tccagattcc caaacaacac catcggtatc
1320tcccaagtat tgtggggttc tattggtttc tctgttggtg ccactttggg tgctgccatg
1380gctgcccaag aactcgaccc taacaagaga accatcttgt ttgttggaga tggttctttg
1440caattgaccg ttcaggaaat ctccaccata atcagatggg gtaccacacc ttaccttttc
1500gtgttgaaca atgacggtta caccatcgag cgtttgatcc acggtgtaaa tgcctcatat
1560aatgacatcc aaccatggca aaacttggaa atcttgccta ctttctcggc caagaactac
1620gacgctgtga gaatctccaa catcggagaa gcagaagata tcttgaaaga caaggaattc
1680ggaaagaact ccaagattag attgatagaa gtcatgttac caagattgga tgcaccatct
1740aaccttgcca aacaagctgc cattacagct gccaccaacg ccgaagct
1788
18 596 PRT Pichia stipitis
18 Met Ala Glu Val Ser Leu Gly Arg Tyr Leu Phe
Glu Arg Leu Tyr Gln 1 5 10
15 Leu Gln Val Gln Thr Ile Phe Gly Val Pro Gly Asp Phe Asn Leu Ser
20 25 30 Leu Leu
Asp Lys Ile Tyr Glu Val Glu Asp Ala His Gly Lys Asn Ser 35
40 45 Phe Arg Trp Ala Gly Asn Ala
Asn Glu Leu Asn Ala Ser Tyr Ala Ala 50 55
60 Asp Gly Tyr Ser Arg Val Lys Arg Leu Gly Cys Leu
Val Thr Thr Phe 65 70 75
80 Gly Val Gly Glu Leu Ser Ala Leu Asn Gly Ile Ala Gly Ser Tyr Ala
85 90 95 Glu His Val
Gly Leu Leu His Val Val Gly Val Pro Ser Ile Ser Ser 100
105 110 Gln Ala Lys Gln Leu Leu Leu His
His Thr Leu Gly Asn Gly Asp Phe 115 120
125 Thr Val Phe His Arg Met Ser Asn Asn Ile Ser Gln Thr
Thr Ala Phe 130 135 140
Ile Ser Asp Ile Asn Ser Ala Pro Ala Glu Ile Asp Arg Cys Ile Arg 145
150 155 160 Glu Ala Tyr Val
Lys Gln Arg Pro Val Tyr Ile Gly Leu Pro Ala Asn 165
170 175 Leu Val Asp Leu Asn Val Pro Ala Ser
Leu Leu Glu Ser Pro Ile Asn 180 185
190 Leu Ser Leu Glu Lys Asn Asp Pro Glu Ala Gln Asp Glu Val
Ile Asp 195 200 205
Ser Val Leu Asp Leu Ile Lys Lys Ser Ser Asn Pro Ile Ile Leu Val 210
215 220 Asp Ala Cys Ala Ser
Arg His Asp Cys Lys Ala Glu Val Thr Gln Leu 225 230
235 240 Ile Glu Gln Thr Gln Phe Pro Val Phe Val
Thr Pro Met Gly Lys Gly 245 250
255 Thr Val Asp Glu Gly Gly Val Asp Gly Glu Leu Leu Glu Asp Asp
Pro 260 265 270 His
Leu Ile Ala Lys Val Ala Ala Arg Leu Ser Ala Gly Lys Asn Ala 275
280 285 Ala Ser Arg Phe Gly Gly
Val Tyr Val Gly Thr Leu Ser Lys Pro Glu 290 295
300 Val Lys Asp Ala Val Glu Ser Ala Asp Leu Ile
Leu Ser Val Gly Ala 305 310 315
320 Leu Leu Ser Asp Phe Asn Thr Gly Ser Phe Ser Tyr Ser Tyr Arg Thr
325 330 335 Lys Asn
Ile Val Glu Phe His Ser Asp Tyr Thr Lys Ile Arg Gln Ala 340
345 350 Thr Phe Pro Gly Val Gln Met
Lys Glu Ala Leu Gln Glu Leu Asn Lys 355 360
365 Lys Val Ser Ser Ala Ala Ser His Tyr Glu Val Lys
Pro Val Pro Lys 370 375 380
Ile Lys Leu Ala Asn Thr Pro Ala Thr Arg Glu Val Lys Leu Thr Gln 385
390 395 400 Glu Trp Leu
Trp Thr Arg Val Ser Ser Trp Phe Arg Glu Gly Asp Ile 405
410 415 Ile Ile Thr Glu Thr Gly Thr Ser
Ser Phe Gly Ile Val Gln Ser Arg 420 425
430 Phe Pro Asn Asn Thr Ile Gly Ile Ser Gln Val Leu Trp
Gly Ser Ile 435 440 445
Gly Phe Ser Val Gly Ala Thr Leu Gly Ala Ala Met Ala Ala Gln Glu 450
455 460 Leu Asp Pro Asn
Lys Arg Thr Ile Leu Phe Val Gly Asp Gly Ser Leu 465 470
475 480 Gln Leu Thr Val Gln Glu Ile Ser Thr
Ile Ile Arg Trp Gly Thr Thr 485 490
495 Pro Tyr Leu Phe Val Leu Asn Asn Asp Gly Tyr Thr Ile Glu
Arg Leu 500 505 510
Ile His Gly Val Asn Ala Ser Tyr Asn Asp Ile Gln Pro Trp Gln Asn
515 520 525 Leu Glu Ile Leu
Pro Thr Phe Ser Ala Lys Asn Tyr Asp Ala Val Arg 530
535 540 Ile Ser Asn Ile Gly Glu Ala Glu
Asp Ile Leu Lys Asp Lys Glu Phe 545 550
555 560 Gly Lys Asn Ser Lys Ile Arg Leu Ile Glu Val Met
Leu Pro Arg Leu 565 570
575 Asp Ala Pro Ser Asn Leu Ala Lys Gln Ala Ala Ile Thr Ala Ala Thr
580 585 590 Asn Ala Glu
Ala 595 19 1707 DNA Pichia stipitis 19 atggtatcaa cctacccaga
atcagaggtt actctaggaa ggtacctctt tgagcgactc 60caccaattga aagtggacac
cattttcggc ttgccgggtg acttcaacct ttccttattg 120gacaaagtgt atgaagttcc
ggatatgagg tgggctggaa atgccaacga attgaatgct 180gcctatgctg ccgatggtta
ctccagaata aagggattgt cttgcttggt cacaactttt 240ggtgttggtg aattgtctgc
tttaaacgga gttggtggtg cctatgctga acacgtagga 300cttctacatg tcgttggagt
tccatccata tcgtcacagg ctaaacagtt gttgctccac 360cataccttgg gtaatggtga
cttcactgtt tttcacagaa tgtccaatag catttctcaa 420actacagcat ttctctcaga
tatctctatt gcaccaggtc aaatagatag atgcatcaga 480gaagcatatg ttcatcagag
accagtttat gttggtttac cggcaaatat ggttgatctc 540aaggttcctt ctagtctctt
agaaactcca attgatttga aattgaaaca aaatgatcct 600gaagctcaag aagttgttga
aacagtcctg aagttggtgt cccaagctac aaaccccatt 660atcttggtag acgcttgtgc
cctcagacac aattgcaaag aggaagtcaa acaattggtt 720gatgccacta attttcaagt
ctttacaact ccaatgggta aatctggtat ctccgaatct 780catccaagat tgggcggtgt
ctatgtcggg acaatgtcga gtcctcaagt caaaaaagcc 840gttgaaaatg ccgatcttat
actatctgtt ggttcgttgt tatcggactt caatacaggt 900tcattttcat actcctacaa
gacgaagaat gttgttgaat tccactctga ctatatgaaa 960atcagacagg ccaccttccc
aggagttcaa atgaaagaag ccttgcaaca gttgataaaa 1020agggtctctt cttacatcaa
tccaagctac attcctactc gagttcctaa aaggaaacag 1080ccattgaaag ctccatcaga
agctcctttg acccaagaat atttgtggtc taaagtatcc 1140ggctggttta gagagggtga
tattatcgta accgaaactg gtacatctgc tttcggaatt 1200attcaatccc attttcccag
caacactatc ggtatatccc aagtcttgtg gggctcaatt 1260ggtttcacag taggtgcaac
agttggtgct gccatggcag cccaggaaat cgaccctagc 1320aggagagtaa ttttgttcgt
cggtgatggt tcattgcagt tgacggttca ggaaatctct 1380acgttgtgta aatgggattg
taacaatact tatctttacg tgttgaacaa tgatggttac 1440actatagaaa ggttgatcca
cggcaaaagt gccagctaca acgatataca gccttggaac 1500catttatcct tgcttcgctt
attcaatgct aagaaatacc aaaatgtcag agtatcgact 1560gctggagaat tggactcttt
gttctctgat aagaaatttg cttctccaga taggataaga 1620atgattgagg tgatgttatc
gagattggat gcaccagcaa atcttgttgc tcaagcaaag 1680ttgtctgaac gggtaaacct
tgaaaat 1707 20 569 PRT Pichia stipitis
20 Met Val Ser Thr Tyr Pro Glu Ser Glu Val Thr Leu Gly Arg Tyr Leu 1
5 10 15 Phe Glu Arg Leu
His Gln Leu Lys Val Asp Thr Ile Phe Gly Leu Pro 20
25 30 Gly Asp Phe Asn Leu Ser Leu Leu Asp
Lys Val Tyr Glu Val Pro Asp 35 40
45 Met Arg Trp Ala Gly Asn Ala Asn Glu Leu Asn Ala Ala Tyr
Ala Ala 50 55 60
Asp Gly Tyr Ser Arg Ile Lys Gly Leu Ser Cys Leu Val Thr Thr Phe 65
70 75 80 Gly Val Gly Glu Leu
Ser Ala Leu Asn Gly Val Gly Gly Ala Tyr Ala 85
90 95 Glu His Val Gly Leu Leu His Val Val Gly
Val Pro Ser Ile Ser Ser 100 105
110 Gln Ala Lys Gln Leu Leu Leu His His Thr Leu Gly Asn Gly Asp
Phe 115 120 125 Thr
Val Phe His Arg Met Ser Asn Ser Ile Ser Gln Thr Thr Ala Phe 130
135 140 Leu Ser Asp Ile Ser Ile
Ala Pro Gly Gln Ile Asp Arg Cys Ile Arg 145 150
155 160 Glu Ala Tyr Val His Gln Arg Pro Val Tyr Val
Gly Leu Pro Ala Asn 165 170
175 Met Val Asp Leu Lys Val Pro Ser Ser Leu Leu Glu Thr Pro Ile Asp
180 185 190 Leu Lys
Leu Lys Gln Asn Asp Pro Glu Ala Gln Glu Val Val Glu Thr 195
200 205 Val Leu Lys Leu Val Ser Gln
Ala Thr Asn Pro Ile Ile Leu Val Asp 210 215
220 Ala Cys Ala Leu Arg His Asn Cys Lys Glu Glu Val
Lys Gln Leu Val 225 230 235
240 Asp Ala Thr Asn Phe Gln Val Phe Thr Thr Pro Met Gly Lys Ser Gly
245 250 255 Ile Ser Glu
Ser His Pro Arg Leu Gly Gly Val Tyr Val Gly Thr Met 260
265 270 Ser Ser Pro Gln Val Lys Lys Ala
Val Glu Asn Ala Asp Leu Ile Leu 275 280
285 Ser Val Gly Ser Leu Leu Ser Asp Phe Asn Thr Gly Ser
Phe Ser Tyr 290 295 300
Ser Tyr Lys Thr Lys Asn Val Val Glu Phe His Ser Asp Tyr Met Lys 305
310 315 320 Ile Arg Gln Ala
Thr Phe Pro Gly Val Gln Met Lys Glu Ala Leu Gln 325
330 335 Gln Leu Ile Lys Arg Val Ser Ser Tyr
Ile Asn Pro Ser Tyr Ile Pro 340 345
350 Thr Arg Val Pro Lys Arg Lys Gln Pro Leu Lys Ala Pro Ser
Glu Ala 355 360 365
Pro Leu Thr Gln Glu Tyr Leu Trp Ser Lys Val Ser Gly Trp Phe Arg 370
375 380 Glu Gly Asp Ile Ile
Val Thr Glu Thr Gly Thr Ser Ala Phe Gly Ile 385 390
395 400 Ile Gln Ser His Phe Pro Ser Asn Thr Ile
Gly Ile Ser Gln Val Leu 405 410
415 Trp Gly Ser Ile Gly Phe Thr Val Gly Ala Thr Val Gly Ala Ala
Met 420 425 430 Ala
Ala Gln Glu Ile Asp Pro Ser Arg Arg Val Ile Leu Phe Val Gly 435
440 445 Asp Gly Ser Leu Gln Leu
Thr Val Gln Glu Ile Ser Thr Leu Cys Lys 450 455
460 Trp Asp Cys Asn Asn Thr Tyr Leu Tyr Val Leu
Asn Asn Asp Gly Tyr 465 470 475
480 Thr Ile Glu Arg Leu Ile His Gly Lys Ser Ala Ser Tyr Asn Asp Ile
485 490 495 Gln Pro
Trp Asn His Leu Ser Leu Leu Arg Leu Phe Asn Ala Lys Lys 500
505 510 Tyr Gln Asn Val Arg Val Ser
Thr Ala Gly Glu Leu Asp Ser Leu Phe 515 520
525 Ser Asp Lys Lys Phe Ala Ser Pro Asp Arg Ile Arg
Met Ile Glu Val 530 535 540
Met Leu Ser Arg Leu Asp Ala Pro Ala Asn Leu Val Ala Gln Ala Lys 545
550 555 560 Leu Ser Glu
Arg Val Asn Leu Glu Asn 565
21 1692 DNA Kluyveromyces lactis 21 atgtctgaaa ttacattagg tcgttacttg
ttcgaaagat taaagcaagt cgaagttcaa 60accatctttg gtctaccagg tgatttcaac
ttgtccctat tggacaatat ctacgaagtc 120ccaggtatga gatgggctgg taatgccaac
gaattgaacg ctgcttacgc tgctgatggt 180tacgccagat taaagggtat gtcctgtatc
atcaccacct tcggtgtcgg tgaattgtct 240gctttgaacg gtattgccgg ttcttacgct
gaacacgttg gtgtcttgca cgttgtcggt 300gttccatccg tctcttctca agctaagcaa
ttgttgttgc accacacctt gggtaacggt 360gacttcactg ttttccacag aatgtgctcc
aacatttctg aaaccactgc tatgatcacc 420gatatcaaca ctgccccagc tgaaatcgac
agatgtatca gaaccactta cgtttcccaa 480agaccagtct acttgggttt gccagctaac
ttggtcgact tgactgtccc agcttctttg 540ttggacactc caattgattt gagcttgaag
ccaaatgacc cagaagccga agaagaagtc 600atcgaaaacg tcttgcaact gatcaaggaa
gctaagaacc cagttatctt ggctgatgct 660tgttgttcca gacacgatgc caaggctgag
accaagaagt tgatcgactt gactcaattc 720ccagccttcg ttaccccaat gggtaagggt
tccattgacg aaaagcaccc aagattcggt 780ggtgtctacg tcggtaccct atcttctcca
gctgtcaagg aagccgttga atctgctcac 840ttggttctat cggtcggtgc tctattgtcc
gatttcaaca ctggttcttt ctcttactct 900tacaagacca agaacattgt cgaattccac
tctgactaca ccaagatcag aaggcctacc 960ttcccaggtg tccaaatgaa gttcgcttta
caaaaattgt tgactaaggt tgccgatgct 1020gctaagggtt acaagccagt tccagttcca
tctgaaccag aacacaacga agatgtcgct 1080gactccactc cattgaagca agaatgggtc
tggactcaag tcggtgaatt cttgagagaa 1140ggtgatgttg ttatcactga aaccggtacc
tctgccttcg gtatcaacca aactcatttc 1200ccaaacaaca catacggtat ctctcaagtt
ttatggggtt ccattggttt caccactggt 1260gctaccttgg gtgctgcctt cgctgccgaa
gaaattgatc caaagaagag agttatctta 1320ttcattggtg acggttcttt gcaattgact
gttcaagaaa tctccaccat gatcagatgg 1380ggcttgaagc catacttgtt cgtattgaac
aacgacggtt acaccattga aagattgatt 1440cacggtgaaa ccgctcaata caactgtatc
caaaactggc aacacttgga attattgcca 1500actttcggtg ccaaggacta cgaagctgtc
agagtttcca ccactggtga atggaacaag 1560ttgaccactg acgaaaagtt ccaagacaac
accagaatca gattgatcga agttatgttg 1620ccaactatgg atgctccatc taacttggtt
aagcaagctc aattgactgc tgcatccaac 1680gctaagaact aa
1692 22 563 PRT Kluyveromyces lactis 22 Met
Ser Glu Ile Thr Leu Gly Arg Tyr Leu Phe Glu Arg Leu Lys Gln 1
5 10 15 Val Glu Val Gln Thr Ile
Phe Gly Leu Pro Gly Asp Phe Asn Leu Ser 20
25 30 Leu Leu Asp Asn Ile Tyr Glu Val Pro Gly
Met Arg Trp Ala Gly Asn 35 40
45 Ala Asn Glu Leu Asn Ala Ala Tyr Ala Ala Asp Gly Tyr Ala
Arg Leu 50 55 60
Lys Gly Met Ser Cys Ile Ile Thr Thr Phe Gly Val Gly Glu Leu Ser 65
70 75 80 Ala Leu Asn Gly Ile
Ala Gly Ser Tyr Ala Glu His Val Gly Val Leu 85
90 95 His Val Val Gly Val Pro Ser Val Ser Ser
Gln Ala Lys Gln Leu Leu 100 105
110 Leu His His Thr Leu Gly Asn Gly Asp Phe Thr Val Phe His Arg
Met 115 120 125 Ser
Ser Asn Ile Ser Glu Thr Thr Ala Met Ile Thr Asp Ile Asn Thr 130
135 140 Ala Pro Ala Glu Ile Asp
Arg Cys Ile Arg Thr Thr Tyr Val Ser Gln 145 150
155 160 Arg Pro Val Tyr Leu Gly Leu Pro Ala Asn Leu
Val Asp Leu Thr Val 165 170
175 Pro Ala Ser Leu Leu Asp Thr Pro Ile Asp Leu Ser Leu Lys Pro Asn
180 185 190 Asp Pro
Glu Ala Glu Glu Glu Val Ile Glu Asn Val Leu Gln Leu Ile 195
200 205 Lys Glu Ala Lys Asn Pro Val
Ile Leu Ala Asp Ala Cys Cys Ser Arg 210 215
220 His Asp Ala Lys Ala Glu Thr Lys Lys Leu Ile Asp
Leu Thr Gln Phe 225 230 235
240 Pro Ala Phe Val Thr Pro Met Gly Lys Gly Ser Ile Asp Glu Lys His
245 250 255 Pro Arg Phe
Gly Gly Val Tyr Val Gly Thr Leu Ser Ser Pro Ala Val 260
265 270 Lys Glu Ala Val Glu Ser Ala Asp
Leu Val Leu Ser Val Gly Ala Leu 275 280
285 Leu Ser Asp Phe Asn Thr Gly Ser Phe Ser Tyr Ser Tyr
Lys Thr Lys 290 295 300
Asn Ile Val Glu Phe His Ser Asp Tyr Thr Lys Ile Arg Ser Ala Thr 305
310 315 320 Phe Pro Gly Val
Gln Met Lys Phe Ala Leu Gln Lys Leu Leu Thr Lys 325
330 335 Val Ala Asp Ala Ala Lys Gly Tyr Lys
Pro Val Pro Val Pro Ser Glu 340 345
350 Pro Glu His Asn Glu Ala Val Ala Asp Ser Thr Pro Leu Lys
Gln Glu 355 360 365
Trp Val Trp Thr Gln Val Gly Glu Phe Leu Arg Glu Gly Asp Val Val 370
375 380 Ile Thr Glu Thr Gly
Thr Ser Ala Phe Gly Ile Asn Gln Thr His Phe 385 390
395 400 Pro Asn Asn Thr Tyr Gly Ile Ser Gln Val
Leu Trp Gly Ser Ile Gly 405 410
415 Phe Thr Thr Gly Ala Thr Leu Gly Ala Ala Phe Ala Ala Glu Glu
Ile 420 425 430 Asp
Pro Lys Lys Arg Val Ile Leu Phe Ile Gly Asp Gly Ser Leu Gln 435
440 445 Leu Thr Val Gln Glu Ile
Ser Thr Met Ile Arg Trp Gly Leu Lys Pro 450 455
460 Tyr Leu Phe Val Leu Asn Asn Asp Gly Tyr Thr
Ile Glu Arg Leu Ile 465 470 475
480 His Gly Glu Thr Ala Gln Tyr Asn Cys Ile Gln Asn Trp Gln His Leu
485 490 495 Glu Leu
Leu Pro Thr Phe Gly Ala Lys Asp Tyr Glu Ala Val Arg Val 500
505 510 Ser Thr Thr Gly Glu Trp Asn
Lys Leu Thr Thr Asp Glu Lys Phe Gln 515 520
525 Asp Asn Thr Arg Ile Arg Leu Ile Glu Val Met Leu
Pro Thr Met Asp 530 535 540
Ala Pro Ser Asn Leu Val Lys Gln Ala Gln Leu Thr Ala Ala Thr Asn 545
550 555 560 Ala Lys Asn
23 1716 DNA Yarrowia lipolytica 23 atgagcgact ccgaacccca aatggtcgac
ctgggcgact atctctttgc ccgattcaag 60cagctaggcg tggactccgt ctttggagtg
cccggcgact tcaacctcac cctgttggac 120cacgtgtaca atgtcgacat gcggtgggtt
gggaacacaa acgagctgaa tgccggctac 180tcggccgacg gctactcccg ggtcaagcgg
ctggcatgtc ttgtcaccac ctttggcgtg 240ggagagctgt ctgccgtggc tgctgtggca
ggctcgtacg ccgagcatgt gggcgtggtg 300catgttgtgg gcgttcccag cacctctgct
gagaacaagc atctgctgct gcaccacaca 360ctcggtaacg gcgacttccg ggtctttgcc
cagatgtcca aactcatctc cgagtacacc 420caccatattg aggaccccag cgaggctgcc
gacgtaatcg acaccgccat ccgaatcgcc 480tacacccacc agcggcccgt ttacattgct
gtgccctcca acttctccga ggtcgatatt 540gccgaccagg ctagactgga tacccccctg
gacctttcgc tgcagcccaa cgaccccgag 600agccagtacg aggtgattga ggagatttgc
tcgcgtatca aggccgccaa gaagcccgtg 660attctcgtcg acgcctgcgc ttcgcgatac
agatgtgtgg acgagaccaa ggagctggcc 720aagatcacca actttgccta ctttgtcact
cccatgggta agggttctgt ggacgaggat 780actgaccggt acggaggaac atacgtcgga
tcgctgactg ctcctgctac tgccgaggtg 840gttgagacag ctgatctcat catctccgta
ggagctcttc tgtcggactt caacaccggt 900tccttctcgt actcctactc caccaaaaac
gtggtggaat tgcattcgga ccacgtcaaa 960atcaagtccg ccacctacaa caacgtcggc
atgaaaatgc tgttcccgcc cctgctcgaa 1020gccgtcaaga aactggttgc cgagacccct
gactttgcat ccaaggctct ggctgttccc 1080gacaccactc ccaagatccc cgaggtaccc
gatgatcaca ttacgaccca ggcatggctg 1140tggcagcgtc tcagttactt tctgaggccc
accgacatcg tggtcaccga gaccggaacc 1200tcgtcctttg gaatcatcca gaccaagttc
ccccacaacg tccgaggtat ctcgcaggtg 1260ctgtggggct ctattggata ctcggtggga
gcagcctgtg gagcctccat tgctgcacag 1320gagattgacc cccagcagcg agtgattctg
tttgtgggcg acggctctct tcagctgacg 1380gtgaccgaga tctcgtgcat gatccgcaac
aacgtcaagc cgtacatttt tgtgctcaac 1440aacgacggct acaccatcga gaggctcatt
cacggcgaaa acgcctcgta caacgatgtg 1500cacatgtgga agtactccaa gattctcgac
acgttcaacg ccaaggccca cgagtcgatt 1560gtggtcaaca ccaagggcga gatggacgct
ctgttcgaca acgaagagtt tgccaagccc 1620gacaagatcc ggctcattga ggtcatgtgc
gacaagatgg acgcgcctgc ctcgttgatc 1680aagcaggctg agctctctgc caagaccaac
gtttag 1716 24 571 PRT Yarrowia lipolytica 24 Met
Ser Asp Ser Glu Pro Gln Met Val Asp Leu Gly Asp Tyr Leu Phe 1
5 10 15 Ala Arg Phe Lys Gln Leu
Gly Val Asp Ser Val Phe Gly Val Pro Gly 20
25 30 Asp Phe Asn Leu Thr Leu Leu Asp His Val
Tyr Asn Val Asp Met Arg 35 40
45 Trp Val Gly Asn Thr Asn Glu Leu Asn Ala Gly Tyr Ser Ala
Asp Gly 50 55 60
Tyr Ser Arg Val Lys Arg Leu Ala Cys Leu Val Thr Thr Phe Gly Val 65
70 75 80 Gly Glu Leu Ser Ala
Val Ala Ala Val Ala Gly Ser Tyr Ala Glu His 85
90 95 Val Gly Val Val His Val Val Gly Val Pro
Ser Thr Ser Ala Glu Asn 100 105
110 Lys His Leu Leu Leu His His Thr Leu Gly Asn Gly Asp Phe Arg
Val 115 120 125 Phe
Ala Gln Met Ser Lys Leu Ile Ser Glu Tyr Thr His His Ile Glu 130
135 140 Asp Pro Ser Glu Ala Ala
Asp Val Ile Asp Thr Ala Ile Arg Ile Ala 145 150
155 160 Tyr Thr His Gln Arg Pro Val Tyr Ile Ala Val
Pro Ser Asn Phe Ser 165 170
175 Glu Val Asp Ile Ala Asp Gln Ala Arg Leu Asp Thr Pro Leu Asp Leu
180 185 190 Ser Leu
Gln Pro Asn Asp Pro Glu Ser Gln Tyr Glu Val Ile Glu Glu 195
200 205 Ile Cys Ser Arg Ile Lys Ala
Ala Lys Lys Pro Val Ile Leu Val Asp 210 215
220 Ala Cys Ala Ser Arg Tyr Arg Cys Val Asp Glu Thr
Lys Glu Leu Ala 225 230 235
240 Lys Ile Thr Asn Phe Ala Tyr Phe Val Thr Pro Met Gly Lys Gly Ser
245 250 255 Val Asp Glu
Asp Thr Asp Arg Tyr Gly Gly Thr Tyr Val Gly Ser Leu 260
265 270 Thr Ala Pro Ala Thr Ala Glu Val
Val Glu Thr Ala Asp Leu Ile Ile 275 280
285 Ser Val Gly Ala Leu Leu Ser Asp Phe Asn Thr Gly Ser
Phe Ser Tyr 290 295 300
Ser Tyr Ser Thr Lys Asn Val Val Glu Leu His Ser Asp His Val Lys 305
310 315 320 Ile Lys Ser Ala
Thr Tyr Asn Asn Val Gly Met Lys Met Leu Phe Pro 325
330 335 Pro Leu Leu Glu Ala Val Lys Lys Leu
Val Ala Glu Thr Pro Asp Phe 340 345
350 Ala Ser Lys Ala Leu Ala Val Pro Asp Thr Thr Pro Lys Ile
Pro Glu 355 360 365
Val Pro Asp Asp His Ile Thr Thr Gln Ala Trp Leu Trp Gln Arg Leu 370
375 380 Ser Tyr Phe Leu Arg
Pro Thr Asp Ile Val Val Thr Glu Thr Gly Thr 385 390
395 400 Ser Ser Phe Gly Ile Ile Gln Thr Lys Phe
Pro His Asn Val Arg Gly 405 410
415 Ile Ser Gln Val Leu Trp Gly Ser Ile Gly Tyr Ser Val Gly Ala
Ala 420 425 430 Cys
Gly Ala Ser Ile Ala Ala Gln Glu Ile Asp Pro Gln Gln Arg Val 435
440 445 Ile Leu Phe Val Gly Asp
Gly Ser Leu Gln Leu Thr Val Thr Glu Ile 450 455
460 Ser Cys Met Ile Arg Asn Asn Val Lys Pro Tyr
Ile Phe Val Leu Asn 465 470 475
480 Asn Asp Gly Tyr Thr Ile Glu Arg Leu Ile His Gly Glu Asn Ala Ser
485 490 495 Tyr Asn
Asp Val His Met Trp Lys Tyr Ser Lys Ile Leu Asp Thr Phe 500
505 510 Asn Ala Lys Ala His Glu Ser
Ile Val Val Asn Thr Lys Gly Glu Met 515 520
525 Asp Ala Leu Phe Asp Asn Glu Glu Phe Ala Lys Pro
Asp Lys Ile Arg 530 535 540
Leu Ile Glu Val Met Cys Asp Lys Met Asp Ala Pro Ala Ser Leu Ile 545
550 555 560 Lys Gln Ala
Glu Leu Ser Ala Lys Thr Asn Val 565 570
25 1716 DNA Schizosaccharomyces pombe 25 atgagtgggg atattttagt cggtgaatat
ctattcaaaa ggcttgaaca attaggggtc 60aagtccattc ttggtgttcc aggagatttc
aatttagctc tacttgactt aattgagaaa 120gttggagatg agaaatttcg ttgggttggc
aataccaatg agttgaatgg tgcttatgcc 180gctgatggtt atgctcgtgt taatggtctt
tcagccattg ttacaacgtt cggcgtggga 240gagctttccg ctattaatgg agtggcaggt
tcttatgcgg agcatgtccc agtagttcat 300attgttggaa tgccttccac aaaggtgcaa
gatactggag ctttgcttca tcatacttta 360ggagatggag actttcgcac tttcatggat
atgtttaaga aagtttctgc ctacagtata 420atgatcgata acggaaacga tgcagctgaa
aagatcgatg aagccttgtc gatttgttat 480aaaaaggcta ggcctgttta cattggtatt
ccttctgatg ctggctactt caaagcatct 540tcatcaaatc ttgggaaaag actaaagctc
gaggaggata ctaacgatcc agcagttgag 600caagaagtca tcaatcatat ctcggaaatg
gttgtcaatg caaagaaacc agtgatttta 660attgacgctt gtgctgtaag acatcgtgtc
gttccagaag tacatgagct gattaaattg 720acccatttcc ctacatatgt aactcccatg
ggtaaatctg caattgacga aacttcgcaa 780ttttttgacg gcgtttatgt tggttcaatt
tcagatcctg aagttaaaga cagaattgaa 840tccactgatc tgttgctatc catcggtgct
ctcaaatcag actttaacac gggttccttc 900tcttaccacc tcagccaaaa gaatgccgtt
gagtttcatt cagaccacat gcgcattcga 960tatgctcttt atccaaatgt agccatgaag
tatattcttc gcaaactgtt gaaagtactt 1020gatgcttcta tgtgtcattc caaggctgct
cctaccattg gctacaacat caagcctaag 1080catgcggaag gatattcttc caacgagatt
actcattgct ggttttggcc taaatttagt 1140gaatttttga agccccgaga tgttttgatc
accgagactg gaactgcaaa ctttggtgtc 1200cttgattgca ggtttccaaa ggatgtaaca
gccatttccc aggtattatg gggatctatt 1260ggatactccg ttggtgcaat gtttggtgct
gttttggccg tccacgattc taaagagccc 1320gatcgtcgta ccattcttgt agtaggtgat
ggatccttac aactgacgat tacagagatt 1380tcaacctgca ttcgccataa cctcaaacca
attattttca taattaacaa cgacggttac 1440accattgagc gtttaattca tggtttgcat
gctagctata acgaaattaa cactaaatgg 1500ggctaccaac agattcccaa gtttttcgga
gctgctgaaa accacttccg cacttactgt 1560gttaaaactc ctactgacgt tgaaaagttg
tttagcgaca aggagtttgc aaatgcagat 1620gtcattcaag tagttgagct tgtaatgcct
atgttggatg cacctcgtgt cctagttgag 1680caagccaagt tgacgtctaa gatcaataag
caatga 1716 26 571 PRT Schizosaccharomyces pombe
26 Met Ser Gly Asp Ile Leu Val Gly Glu Tyr Leu Phe Lys Arg Leu Glu 1
5 10 15 Gln Leu Gly Val
Lys Ser Ile Leu Gly Val Pro Gly Asp Phe Asn Leu 20
25 30 Ala Leu Leu Asp Leu Ile Glu Lys Val
Gly Asp Glu Lys Phe Arg Trp 35 40
45 Val Gly Asn Thr Asn Glu Leu Asn Gly Ala Tyr Ala Ala Asp
Gly Tyr 50 55 60
Ala Arg Val Asn Gly Leu Ser Ala Ile Val Thr Thr Phe Gly Val Gly 65
70 75 80 Glu Leu Ser Ala Ile
Asn Gly Val Ala Gly Ser Tyr Ala Glu His Val 85
90 95 Pro Val Val His Ile Val Gly Met Pro Ser
Thr Lys Val Gln Asp Thr 100 105
110 Gly Ala Leu Leu His His Thr Leu Gly Asp Gly Asp Phe Arg Thr
Phe 115 120 125 Met
Asp Met Phe Lys Lys Val Ser Ala Tyr Ser Ile Met Ile Asp Asn 130
135 140 Gly Asn Asp Ala Ala Glu
Lys Ile Asp Glu Ala Leu Ser Ile Cys Tyr 145 150
155 160 Lys Lys Ala Arg Pro Val Tyr Ile Gly Ile Pro
Ser Asp Ala Gly Tyr 165 170
175 Phe Lys Ala Ser Ser Ser Asn Leu Gly Lys Arg Leu Lys Leu Glu Glu
180 185 190 Asp Thr
Asn Asp Pro Ala Val Glu Gln Glu Val Ile Asn His Ile Ser 195
200 205 Glu Met Val Val Asn Ala Lys
Lys Pro Val Ile Leu Ile Asp Ala Cys 210 215
220 Ala Val Arg His Arg Val Val Pro Glu Val His Glu
Leu Ile Lys Leu 225 230 235
240 Thr His Phe Pro Thr Tyr Val Thr Pro Met Gly Lys Ser Ala Ile Asp
245 250 255 Glu Thr Ser
Gln Phe Phe Asp Gly Val Tyr Val Gly Ser Ile Ser Asp 260
265 270 Pro Glu Val Lys Asp Arg Ile Glu
Ser Thr Asp Leu Leu Leu Ser Ile 275 280
285 Gly Ala Leu Lys Ser Asp Phe Asn Thr Gly Ser Phe Ser
Tyr His Leu 290 295 300
Ser Gln Lys Asn Ala Val Glu Phe His Ser Asp His Met Arg Ile Arg 305
310 315 320 Tyr Ala Leu Tyr
Pro Asn Val Ala Met Lys Tyr Ile Leu Arg Lys Leu 325
330 335 Leu Lys Val Leu Asp Ala Ser Met Cys
His Ser Lys Ala Ala Pro Thr 340 345
350 Ile Gly Tyr Asn Ile Lys Pro Lys His Ala Glu Gly Tyr Ser
Ser Asn 355 360 365
Glu Ile Thr His Cys Trp Phe Trp Pro Lys Phe Ser Glu Phe Leu Lys 370
375 380 Pro Arg Asp Val Leu
Ile Thr Glu Thr Gly Thr Ala Asn Phe Gly Val 385 390
395 400 Leu Asp Cys Arg Phe Pro Lys Asp Val Thr
Ala Ile Ser Gln Val Leu 405 410
415 Trp Gly Ser Ile Gly Tyr Ser Val Gly Ala Met Phe Gly Ala Val
Leu 420 425 430 Ala
Val His Asp Ser Lys Glu Pro Asp Arg Arg Thr Ile Leu Val Val 435
440 445 Gly Asp Gly Ser Leu Gln
Leu Thr Ile Thr Glu Ile Ser Thr Cys Ile 450 455
460 Arg His Asn Leu Lys Pro Ile Ile Phe Ile Ile
Asn Asn Asp Gly Tyr 465 470 475
480 Thr Ile Glu Arg Leu Ile His Gly Leu His Ala Ser Tyr Asn Glu Ile
485 490 495 Asn Thr
Lys Trp Gly Tyr Gln Gln Ile Pro Lys Phe Phe Gly Ala Ala 500
505 510 Glu Asn His Phe Arg Thr Tyr
Cys Val Lys Thr Pro Thr Asp Val Glu 515 520
525 Lys Leu Phe Ser Asp Lys Glu Phe Ala Asn Ala Asp
Val Ile Gln Val 530 535 540
Val Glu Leu Val Met Pro Met Leu Asp Ala Pro Arg Val Leu Val Glu 545
550 555 560 Gln Ala Lys
Leu Thr Ser Lys Ile Asn Lys Gln 565 570
27 1689 DNA Zygosaccharomyces rouxii 27 atgtctgaaa ttactctagg tcgttacttg
ttcgaaagat taaagcaagt tgacactaac 60accatcttcg gtgttccagg tgacttcaac
ttgtccttgt tggacaaggt ctacgaagtg 120caaggtctaa gatgggctgg taacgctaac
gaattgaacg ctgcctacgc tgctgacggt 180tacgccagag ttaagggttt ggctgctttg
atcaccacct tcggtgtcgg tgaattgtct 240gctttgaacg gtattgcagg ttcttacgct
gaacacgttg gtgttttgca cattgttggt 300gttccatctg tctcttctca agctaagcaa
ttgttgttgc accacacctt gggtaacggt 360gacttcactg ttttccacag aatgtccgcc
aacatctctg aaaccaccgc tatgttgacc 420gacatcactg ctgctccagc tgaaattgac
cgttgcatca gagttgctta cgtcaaccaa 480agaccagtct acttgggtct accagctaac
ttggttgacc aaaaggtccc agcttctttg 540ttgaacactc caattgatct atctctaaag
gagaacgacc cagaagctga aaccgaagtt 600gttgacaccg ttttggaatt gatcaaggaa
gctaagaacc cagttatctt ggctgatgct 660tgctgctcca gacacgacgt caaggctgaa
accaagaagt tgatcgactt gactcaattc 720ccatctttcg ttactcctat gggtaagggt
tccatcgacg aacaaaaccc aagattcggt 780ggtgtctacg tcggtactct atccagccca
gaagttaagg aagctgttga atctgctgac 840ttggttctat ctgtcggtgc tctattgtcc
gatttcaaca ctggttcttt ctcttactct 900tacaagacca agaacgttgt tgaattccac
tctgaccaca tcaagatcag aaacgctacc 960ttcccaggtg ttcaaatgaa attcgttttg
aagaaactat tgcaagctgt cccagaagct 1020gtcaagaact acaagccagg tccagtccca
gctccgccat ctccaaacgc tgaagttgct 1080gactctacca ccttgaagca agaatggtta
tggagacaag tcggtagctt cttgagagaa 1140ggtgatgttg ttattaccga aactggtacc
tctgctttcg gtatcaacca aactcacttc 1200cctaaccaaa cttacggtat ctctcaagtc
ttgtggggtt ctattggtta caccactggt 1260tccactttgg gtgctgcctt cgctgctgaa
gaaattgacc ctaagaagag agttatcttg 1320ttcattggtg acggttctct acaattgacc
gttcaagaaa tctccaccat gatcagatgg 1380ggtctaaagc catacttgtt cgttttgaac
aacgatggtt acaccattga aagattgatt 1440cacggtgaaa ccgctgaata caactgtatc
caaccatgga agcacttgga attgttgaac 1500accttcggtg ccaaggacta cgaaaaccac
agagtctcca ctgtcggtga atggaacaag 1560ttgactcaag atccaaaatt caacgaaaac
tctagaatta gaatgatcga agttatgctt 1620gaagtcatgg acgctccatc ttctttggtc
gctcaagctc aattgaccgc tgctactaac 1680gctaagcaa
1689 28 563 PRT Zygosaccharomyces rouxii
28 Met Ser Glu Ile Thr Leu Gly Arg Tyr Leu Phe Glu Arg Leu Lys Gln 1
5 10 15 Val Asp Thr Asn
Thr Ile Phe Gly Val Pro Gly Asp Phe Asn Leu Ser 20
25 30 Leu Leu Asp Lys Val Tyr Glu Val Gln
Gly Leu Arg Trp Ala Gly Asn 35 40
45 Ala Asn Glu Leu Asn Ala Ala Tyr Ala Ala Asp Gly Tyr Ala
Arg Val 50 55 60
Lys Gly Leu Ala Ala Leu Ile Thr Thr Phe Gly Val Gly Glu Leu Ser 65
70 75 80 Ala Leu Asn Gly Ile
Ala Gly Ser Tyr Ala Glu His Val Gly Val Leu 85
90 95 His Ile Val Gly Val Pro Ser Val Ser Ser
Gln Ala Lys Gln Leu Leu 100 105
110 Leu His His Thr Leu Gly Asn Gly Asp Phe Thr Val Phe His Arg
Met 115 120 125 Ser
Ala Asn Ile Ser Glu Thr Thr Ala Met Leu Thr Asp Ile Thr Ala 130
135 140 Ala Pro Ala Glu Ile Asp
Arg Cys Ile Arg Val Ala Tyr Val Asn Gln 145 150
155 160 Arg Pro Val Tyr Leu Gly Leu Pro Ala Asn Leu
Val Asp Gln Lys Val 165 170
175 Pro Ala Ser Leu Leu Asn Thr Pro Ile Asp Leu Ser Leu Lys Glu Asn
180 185 190 Asp Pro
Glu Ala Glu Thr Glu Val Val Asp Thr Val Leu Glu Leu Ile 195
200 205 Lys Glu Ala Lys Asn Pro Val
Ile Leu Ala Asp Ala Cys Cys Ser Arg 210 215
220 His Asp Val Lys Ala Glu Thr Lys Lys Leu Ile Asp
Leu Thr Gln Phe 225 230 235
240 Pro Ser Phe Val Thr Pro Met Gly Lys Gly Ser Ile Asp Glu Gln Asn
245 250 255 Pro Arg Phe
Gly Gly Val Tyr Val Gly Thr Leu Ser Ser Pro Glu Val 260
265 270 Lys Glu Ala Val Glu Ser Ala Asp
Leu Val Leu Ser Val Gly Ala Leu 275 280
285 Leu Ser Asp Phe Asn Thr Gly Ser Phe Ser Tyr Ser Tyr
Lys Thr Lys 290 295 300
Asn Val Val Glu Phe His Ser Asp His Ile Lys Ile Arg Asn Ala Thr 305
310 315 320 Phe Pro Gly Val
Gln Met Lys Phe Val Leu Lys Lys Leu Leu Gln Ala 325
330 335 Val Pro Glu Ala Val Lys Asn Tyr Lys
Pro Gly Pro Val Pro Ala Pro 340 345
350 Pro Ser Pro Asn Ala Glu Val Ala Asp Ser Thr Thr Leu Lys
Gln Glu 355 360 365
Trp Leu Trp Arg Gln Val Gly Ser Phe Leu Arg Glu Gly Asp Val Val 370
375 380 Ile Thr Glu Thr Gly
Thr Ser Ala Phe Gly Ile Asn Gln Thr His Phe 385 390
395 400 Pro Asn Gln Thr Tyr Gly Ile Ser Gln Val
Leu Trp Gly Ser Ile Gly 405 410
415 Tyr Thr Thr Gly Ser Thr Leu Gly Ala Ala Phe Ala Ala Glu Glu
Ile 420 425 430 Asp
Pro Lys Lys Arg Val Ile Leu Phe Ile Gly Asp Gly Ser Leu Gln 435
440 445 Leu Thr Val Gln Glu Ile
Ser Thr Met Ile Arg Trp Gly Leu Lys Pro 450 455
460 Tyr Leu Phe Val Leu Asn Asn Asp Gly Tyr Thr
Ile Glu Arg Leu Ile 465 470 475
480 His Gly Glu Thr Ala Glu Tyr Asn Cys Ile Gln Pro Trp Lys His Leu
485 490 495 Glu Leu
Leu Asn Thr Phe Gly Ala Lys Asp Tyr Glu Asn His Arg Val 500
505 510 Ser Thr Val Gly Glu Trp Asn
Lys Leu Thr Gln Asp Pro Lys Phe Asn 515 520
525 Glu Asn Ser Arg Ile Arg Met Ile Glu Val Met Leu
Glu Val Met Asp 530 535 540
Ala Pro Ser Ser Leu Val Ala Gln Ala Gln Leu Thr Ala Ala Thr Asn 545
550 555 560 Ala Lys Gln
29 16387 DNA artificial sequence synthetic construct 29 tcccattacc gacatttggg
cgctatacgt gcatatgttc atgtatgtat ctgtatttaa 60aacacttttg tattattttt
cctcatatat gtgtataggt ttatacggat gatttaatta 120ttacttcacc accctttatt
tcaggctgat atcttagcct tgttactagt tagaaaaaga 180catttttgct gtcagtcact
gtcaagagat tcttttgctg gcatttcttc tagaagcaaa 240aagagcgatg cgtcttttcc
gctgaaccgt tccagcaaaa aagactacca acgcaatatg 300gattgtcaga atcatataaa
agagaagcaa ataactcctt gtcttgtatc aattgcatta 360taatatcttc ttgttagtgc
aatatcatat agaagtcatc gaaatagata ttaagaaaaa 420caaactgtac aatcaatcaa
tcaatcatcg ctgaggatgt tgacaaaagc aacaaaagaa 480caaaaatccc ttgtgaaaaa
cagaggggcg gagcttgttg ttgattgctt agtggagcaa 540ggtgtcacac atgtatttgg
cattccaggt gcaaaaattg atgcggtatt tgacgcttta 600caagataaag gacctgaaat
tatcgttgcc cggcacgaac aaaacgcagc attcatggcc 660caagcagtcg gccgtttaac
tggaaaaccg ggagtcgtgt tagtcacatc aggaccgggt 720gcctctaact tggcaacagg
cctgctgaca gcgaacactg aaggagaccc tgtcgttgcg 780cttgctggaa acgtgatccg
tgcagatcgt ttaaaacgga cacatcaatc tttggataat 840gcggcgctat tccagccgat
tacaaaatac agtgtagaag ttcaagatgt aaaaaatata 900ccggaagctg ttacaaatgc
atttaggata gcgtcagcag ggcaggctgg ggccgctttt 960gtgagctttc cgcaagatgt
tgtgaatgaa gtcacaaata cgaaaaacgt gcgtgctgtt 1020gcagcgccaa aactcggtcc
tgcagcagat gatgcaatca gtgcggccat agcaaaaatc 1080caaacagcaa aacttcctgt
cgttttggtc ggcatgaaag gcggaagacc ggaagcaatt 1140aaagcggttc gcaagctttt
gaaaaaggtt cagcttccat ttgttgaaac atatcaagct 1200gccggtaccc tttctagaga
tttagaggat caatattttg gccgtatcgg tttgttccgc 1260aaccagcctg gcgatttact
gctagagcag gcagatgttg ttctgacgat cggctatgac 1320ccgattgaat atgatccgaa
attctggaat atcaatggag accggacaat tatccattta 1380gacgagatta tcgctgacat
tgatcatgct taccagcctg atcttgaatt gatcggtgac 1440attccgtcca cgatcaatca
tatcgaacac gatgctgtga aagtggaatt tgcagagcgt 1500gagcagaaaa tcctttctga
tttaaaacaa tatatgcatg aaggtgagca ggtgcctgca 1560gattggaaat cagacagagc
gcaccctctt gaaatcgtta aagagttgcg taatgcagtc 1620gatgatcatg ttacagtaac
ttgcgatatc ggttcgcacg ccatttggat gtcacgttat 1680ttccgcagct acgagccgtt
aacattaatg atcagtaacg gtatgcaaac actcggcgtt 1740gcgcttcctt gggcaatcgg
cgcttcattg gtgaaaccgg gagaaaaagt ggtttctgtc 1800tctggtgacg gcggtttctt
attctcagca atggaattag agacagcagt tcgactaaaa 1860gcaccaattg tacacattgt
atggaacgac agcacatatg acatggttgc attccagcaa 1920ttgaaaaaat ataaccgtac
atctgcggtc gatttcggaa atatcgatat cgtgaaatat 1980gcggaaagct tcggagcaac
tggcttgcgc gtagaatcac cagaccagct ggcagatgtt 2040ctgcgtcaag gcatgaacgc
tgaaggtcct gtcatcatcg atgtcccggt tgactacagt 2100gataacatta atttagcaag
tgacaagctt ccgaaagaat tcggggaact catgaaaacg 2160aaagctctct agttaattaa
tcatgtaatt agttatgtca cgcttacatt cacgccctcc 2220ccccacatcc gctctaaccg
aaaaggaagg agttagacaa cctgaagtct aggtccctat 2280ttattttttt atagttatgt
tagtattaag aacgttattt atatttcaaa tttttctttt 2340ttttctgtac agacgcgtgt
acgcatgtaa cattatactg aaaaccttgc ttgagaaggt 2400tttgggacgc tcgaaggctt
taatttgcgg gcggccgctc tagaactagt accacaggtg 2460ttgtcctctg aggacataaa
atacacaccg agattcatca actcattgct ggagttagca 2520tatctacaat tgggtgaaat
ggggagcgat ttgcaggcat ttgctcggca tgccggtaga 2580ggtgtggtca ataagagcga
cctcatgcta tacctgagaa agcaacctga cctacaggaa 2640agagttactc aagaataaga
attttcgttt taaaacctaa gagtcacttt aaaatttgta 2700tacacttatt ttttttataa
cttatttaat aataaaaatc ataaatcata agaaattcgc 2760ttactcttaa ttaatcaagc
atctaaaaca caaccgttgg aagcgttgga aaccaactta 2820gcatacttgg atagagtacc
tcttgtgtaa cgaggtggag gtgcaaccca actttgttta 2880cgttgagcca tttccttatc
agagactaat aggtcaatct tgttattatc agcatcaatg 2940ataatctcat cgccgtctct
gaccaacccg ataggaccac cttcagcggc ttcgggaaca 3000atgtggccga ttaagaaccc
gtgagaacca ccagagaatc taccatcagt caacaatgca 3060acatctttac ccaaaccgta
acccatcaga gcagaggaag gctttagcat ttcaggcata 3120cctggtgcac ctcttggacc
ttcatatctg ataacaacaa cggttttttc acccttcttg 3180atttcacctc tttccaaggc
ttcaataaag gcaccttcct cttcgaacac acgtgctcta 3240cccttgaagt aagtaccttc
cttaccggta attttaccca cagctccacc tggtgccaat 3300gaaccgtaca gaatttgcaa
gtgaccgttg gccttgattg ggtgggagag tggcttaata 3360atctcttgtc cttcaggtag
gcttggtgct ttctttgcac gttctgccaa agtgtcaccg 3420gtaacagtca ttgtgttacc
gtgcaacatg ttgttttcat atagatactt aatcacagat 3480tgggtaccac caacgttaat
caaatcggcc atgacgtatt taccagaagg tttgaagtca 3540ccgatcaatg gtgtagtatc
actgattctt tggaaatcat ctggtgacaa cttgacaccc 3600gcagagtgag caacagccac
caaatgcaaa acagcattag tggacccacc ggttgcaacg 3660acataagtaa tggcgttttc
aaaagcctct tttgtgagga tatcacgagg taaaataccc 3720aattccattg tcttcttgat
gtattcacca atgttgtcac actcagctaa cttctccttg 3780gaaacggctg ggaaggaaga
ggagtttgga atggtcaaac ctagcacttc agcggcagaa 3840gccattgtgt tggcagtata
cataccacca caagaaccag gacctgggca tgcatgttcc 3900acaacatctt ctctttcttc
ttcagtgaat tgcttggaaa tatattcacc gtaggattgg 3960aacgcagaga cgatatcgat
gtttttagag atcctgttaa aacctctagt ggagtagtag 4020atgtaatcaa tgaagcggaa
gccaaaagac cagagtagag gcctatagaa gaaactgcga 4080taccttttgt gatggctaaa
caaacagaca tctttttata tgtttttact tctgtatatc 4140gtgaagtagt aagtgataag
cgaatttggc taagaacgtt gtaagtgaac aagggacctc 4200ttttgccttt caaaaaagga
ttaaatggag ttaatcattg agatttagtt ttcgttagat 4260tctgtatccc taaataactc
ccttacccga cgggaaggca caaaagactt gaataatagc 4320aaacggccag tagccaagac
caaataatac tagagttaac tgatggtctt aaacaggcat 4380tacgtggtga actccaagac
caatatacaa aatatcgata agttattctt gcccaccaat 4440ttaaggagcc tacatcagga
cagtagtacc attcctcaga gaagaggtat acataacaag 4500aaaatcgcgt gaacacctta
tataacttag cccgttattg agctaaaaaa ccttgcaaaa 4560tttcctatga ataagaatac
ttcagacgtg ataaaaattt actttctaac tcttctcacg 4620ctgcccctat ctgttcttcc
gctctaccgt gagaaataaa gcatcgagta cggcagttcg 4680ctgtcactga actaaaacaa
taaggctagt tcgaatgatg aacttgcttg ctgtcaaact 4740tctgagttgc cgctgatgtg
acactgtgac aataaattca aaccggttat agcggtctcc 4800tccggtaccg gttctgccac
ctccaataga gctcagtagg agtcagaacc tctgcggtgg 4860ctgtcagtga ctcatccgcg
tttcgtaagt tgtgcgcgtg cacatttcgc ccgttcccgc 4920tcatcttgca gcaggcggaa
attttcatca cgctgtagga cgcaaaaaaa aaataattaa 4980tcgtacaaga atcttggaaa
aaaaattgaa aaattttgta taaaagggat gacctaactt 5040gactcaatgg cttttacacc
cagtattttc cctttccttg tttgttacaa ttatagaagc 5100aagacaaaaa catatagaca
acctattcct aggagttata tttttttacc ctaccagcaa 5160tataagtaaa aaactagtat
gaaggtgttt tacgataaag actgcgatct gagcatcatc 5220cagggaaaga aggttgctat
tataggatat ggttcccaag gacacgcaca agccttgaac 5280ttgaaagatt ctggggtcga
cgtgacagta ggtctgtata aaggtgctgc tgatgcagca 5340aaggctgaag cacatggctt
taaagtcaca gatgttgcag cggctgttgc tggcgctgat 5400ttagtcatga ttttaattcc
agatgaattt caatcgcaat tgtacaaaaa tgaaatagaa 5460ccaaacatta agaagggcgc
taccttggcc ttcagtcatg gatttgccat tcattacaat 5520caagtagtcc ccagggcaga
tttggacgtt attatgattg cacctaaggc tccggggcat 5580actgttagga gcgaatttgt
taagggtggt ggtattccag atttgatcgc tatataccaa 5640gacgttagcg gaaacgctaa
gaatgtagct ttaagctacg cagcaggagt tggtggcggg 5700agaacgggta taatagaaac
cacttttaaa gacgagactg agacagattt atttggagaa 5760caagcggttc tgtgcggagg
aactgttgaa ttggttaaag caggctttga gacgcttgtc 5820gaagcagggt acgctcccga
aatggcatac ttcgaatgtc tacatgaatt gaagttgata 5880gtagacttaa tgtatgaagg
tggtatagct aatatgaact attccatttc aaataatgca 5940gaatatggtg agtatgtcac
cggacctgaa gtcattaacg cagaatcaag acaagccatg 6000agaaatgcct tgaaacgtat
ccaggacggt gaatacgcta agatgttcat aagtgaaggc 6060gctacgggtt acccgagtat
gactgctaaa agaagaaaca atgcagcaca tggtatcgaa 6120attattggtg aacagttaag
gtctatgatg ccctggatcg gtgctaataa gatcgtagac 6180aaggcgaaaa attaaggccc
tgcaggccta tcaagtgctg gaaacttttt ctcttggaat 6240ttttgcaaca tcaagtcata
gtcaattgaa ttgacccaat ttcacattta agattttttt 6300tttttcatcc gacatacatc
tgtacactag gaagccctgt ttttctgaag cagcttcaaa 6360tatatatatt ttttacatat
ttattatgat tcaatgaaca atctaattaa atcgaaaaca 6420agaaccgaaa cgcgaataaa
taatttattt agatggtgac aagtgtataa gtcctcatcg 6480ggacagctac gatttctctt
tcggttttgg ctgagctact ggttgctgtg acgcagcggc 6540attagcgcgg cgttatgagc
taccctcgtg gcctgaaaga tggcgggaat aaagcggaac 6600taaaaattac tgactgagcc
atattgaggt caatttgtca actcgtcaag tcacgtttgg 6660tggacggccc ctttccaacg
aatcgtatat actaacatgc gcgcgcttcc tatatacaca 6720tatacatata tatatatata
tatatgtgtg cgtgtatgtg tacacctgta tttaatttcc 6780ttactcgcgg gtttttcttt
tttctcaatt cttggcttcc tctttctcga gtatataatt 6840tttcaggtaa aatttagtac
gatagtaaaa tacttctcga actcgtcaca tatacgtgta 6900cataatgtct gaaccagctc
aaaagaaaca aaaggttgct aacaactctc tagagcggcc 6960gcccgcaaat taaagccttc
gagcgtccca aaaccttctc aagcaaggtt ttcagtataa 7020tgttacatgc gtacacgcgt
ctgtacagaa aaaaaagaaa aatttgaaat ataaataacg 7080ttcttaatac taacataact
ataaaaaaat aaatagggac ctagacttca ggttgtctaa 7140ctccttcctt ttcggttaga
gcggatgtgg ggggagggcg tgaatgtaag cgtgacataa 7200ctaattacat gattaattaa
ttattggttt tctggtctca actttctgac ttccttacca 7260accttccaga tttccatgtt
tctgatggtg tctaattcct tttctagctt ttctctgtag 7320tcaggttgag agttgaattc
caaagatctc ttggtttcgg taccgttctt ggtagattcg 7380tacaagtctt ggaaaacagg
cttcaaagca ttcttgaaga ttgggtacca gtccaaagca 7440cctcttctgg cggtggtgga
acaagcatcg tacatgtaat ccataccgta cttaccgatc 7500aatgggtata gagattgggt
agcttcttcg acggtttcgt tgaaagcttc agatggggag 7560tgaccgtttt ctctcaagac
gtcgtattga gccaagaaca taccgtggat accacccatt 7620aaacaacctc tttcaccgta
caagtcagag ttgacttctc tttcgaaagt ggtttggtaa 7680acgtaaccgg aaccaatggc
aacggccaaa gcttgggcct tttcgtgagc cttaccggtg 7740acatcgttcc agacggcgta
agaagagtta ataccacgac cttccttgaa caaagatctg 7800acagttctac cggaaccctt
tggagcaacc aagataacat ctaagtcctt tggtggttca 7860acgtgagtca agtccttgaa
gactggggag aaaccgtggg agaagtacaa agtcttaccc 7920ttggtcaaca atggcttgat
agcaggccag gtttctgatt gagcggcatc ggacaacaag 7980ttcataacgt aactacctct
cttgatagca tcttcaacag tgaacaagtt cttgcctgga 8040acccaaccgt cttcgatggc
agccttccaa gaagcaccat ctttacggac accaatgata 8100acgttcaaac cgttgtctct
caagttcaaa ccttgaccgt aaccttggga accgtaaccg 8160atcaaagcaa aagtgtcgtt
cttgaagtag tccaacaact tttctcttgg ccagtcagct 8220ctttcgtaga cggtttcaac
agtaccaccg aagttgattt gcttcaacat cctcagctct 8280agatttgaat atgtattact
tggttatggt tatatatgac aaaagaaaaa gaagaacaga 8340agaataacgc aaggaagaac
aataactgaa attgatagag aagtattatg tctttgtctt 8400tttataataa atcaagtgca
gaaatccgtt agacaacatg agggataaaa tttaacgtgg 8460gcgaagaaga aggaaaaaag
tttttgtgag ggcgtaattg aagcgatctg ttgattgtag 8520attttttttt tttgaggagt
caaagtcaga agagaacaga caaatggtat taaccatcca 8580atactttttt ggagcaacgc
taagctcatg cttttccatt ggttacgtgc tcagttgtta 8640gatatggaaa gagaggatgc
tcacggcagc gtgactccaa ttgagcccga aagagaggat 8700gccacgtttt cccgacggct
gctagaatgg aaaaaggaaa aatagaagaa tcccattcct 8760atcattattt acgtaatgac
ccacacattt ttgagatttt caactattac gtattacgat 8820aatcctgctg tcattatcat
tattatctat atcgacgtat gcaacgtatg tgaagccaag 8880taggcaatta tttagtactg
tcagtattgt tattcatttc agatctatcc gcggtggagc 8940tcgaattcac tggccgtcgt
tttacaacgt cgtgactggg aaaaccctgg cgttacccaa 9000cttaatcgcc ttgcagcaca
tccccctttc gccagctggc gtaatagcga agaggcccgc 9060accgatcgcc cttcccaaca
gttgcgcagc ctgaatggcg aatggcgcct gatgcggtat 9120tttctcctta cgcatctgtg
cggtatttca caccgcatac gtcaaagcaa ccatagtacg 9180cgccctgtag cggcgcatta
agcgcggcgg gtgtggtggt tacgcgcagc gtgaccgcta 9240cacttgccag cgccttagcg
cccgctcctt tcgctttctt cccttccttt ctcgccacgt 9300tcgccggctt tccccgtcaa
gctctaaatc gggggctccc tttagggttc cgatttagtg 9360ctttacggca cctcgacccc
aaaaaacttg atttgggtga tggttcacgt agtgggccat 9420cgccctgata gacggttttt
cgccctttga cgttggagtc cacgttcttt aatagtggac 9480tcttgttcca aactggaaca
acactcaact ctatctcggg ctattctttt gatttataag 9540ggattttgcc gatttcggtc
tattggttaa aaaatgagct gatttaacaa aaatttaacg 9600cgaattttaa caaaatatta
acgtttacaa ttttatggtg cactctcagt acaatctgct 9660ctgatgccgc atagttaagc
cagccccgac acccgccaac acccgctgac gcgccctgac 9720gggcttgtct gctcccggca
tccgcttaca gacaagctgt gaccgtctcc gggagctgca 9780tgtgtcagag gttttcaccg
tcatcaccga aacgcgcgag acgaaagggc ctcgtgatac 9840gcctattttt ataggttaat
gtcatgataa taatggtttc ttagacgtca ggtggcactt 9900ttcggggaaa tgtgcgcgga
acccctattt gtttattttt ctaaatacat tcaaatatgt 9960atccgctcat gagacaataa
ccctgataaa tgcttcaata atattgaaaa aggaagagta 10020tgagtattca acatttccgt
gtcgccctta ttcccttttt tgcggcattt tgccttcctg 10080tttttgctca cccagaaacg
ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac 10140gagtgggtta catcgaactg
gatctcaaca gcggtaagat ccttgagagt tttcgccccg 10200aagaacgttt tccaatgatg
agcactttta aagttctgct atgtggcgcg gtattatccc 10260gtattgacgc cgggcaagag
caactcggtc gccgcataca ctattctcag aatgacttgg 10320ttgagtactc accagtcaca
gaaaagcatc ttacggatgg catgacagta agagaattat 10380gcagtgctgc cataaccatg
agtgataaca ctgcggccaa cttacttctg acaacgatcg 10440gaggaccgaa ggagctaacc
gcttttttgc acaacatggg ggatcatgta actcgccttg 10500atcgttggga accggagctg
aatgaagcca taccaaacga cgagcgtgac accacgatgc 10560ctgtagcaat ggcaacaacg
ttgcgcaaac tattaactgg cgaactactt actctagctt 10620cccggcaaca attaatagac
tggatggagg cggataaagt tgcaggacca cttctgcgct 10680cggcccttcc ggctggctgg
tttattgctg ataaatctgg agccggtgag cgtgggtctc 10740gcggtatcat tgcagcactg
gggccagatg gtaagccctc ccgtatcgta gttatctaca 10800cgacggggag tcaggcaact
atggatgaac gaaatagaca gatcgctgag ataggtgcct 10860cactgattaa gcattggtaa
ctgtcagacc aagtttactc atatatactt tagattgatt 10920taaaacttca tttttaattt
aaaaggatct aggtgaagat cctttttgat aatctcatga 10980ccaaaatccc ttaacgtgag
ttttcgttcc actgagcgtc agaccccgta gaaaagatca 11040aaggatcttc ttgagatcct
ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac 11100caccgctacc agcggtggtt
tgtttgccgg atcaagagct accaactctt tttccgaagg 11160taactggctt cagcagagcg
cagataccaa atactgttct tctagtgtag ccgtagttag 11220gccaccactt caagaactct
gtagcaccgc ctacatacct cgctctgcta atcctgttac 11280cagtggctgc tgccagtggc
gataagtcgt gtcttaccgg gttggactca agacgatagt 11340taccggataa ggcgcagcgg
tcgggctgaa cggggggttc gtgcacacag cccagcttgg 11400agcgaacgac ctacaccgaa
ctgagatacc tacagcgtga gctatgagaa agcgccacgc 11460ttcccgaagg gagaaaggcg
gacaggtatc cggtaagcgg cagggtcgga acaggagagc 11520gcacgaggga gcttccaggg
ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc 11580acctctgact tgagcgtcga
tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa 11640acgccagcaa cgcggccttt
ttacggttcc tggccttttg ctggcctttt gctcacatgt 11700tctttcctgc gttatcccct
gattctgtgg ataaccgtat taccgccttt gagtgagctg 11760ataccgctcg ccgcagccga
acgaccgagc gcagcgagtc agtgagcgag gaagcggaag 11820agcgcccaat acgcaaaccg
cctctccccg cgcgttggcc gattcattaa tgcagctggc 11880acgacaggtt tcccgactgg
aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc 11940tcactcatta ggcaccccag
gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa 12000ttgtgagcgg ataacaattt
cacacaggaa acagctatga ccatgattac gccaagcttt 12060ttctttccaa tttttttttt
ttcgtcatta taaaaatcat tacgaccgag attcccgggt 12120aataactgat ataattaaat
tgaagctcta atttgtgagt ttagtataca tgcatttact 12180tataatacag ttttttagtt
ttgctggccg catcttctca aatatgcttc ccagcctgct 12240tttctgtaac gttcaccctc
taccttagca tcccttccct ttgcaaatag tcctcttcca 12300acaataataa tgtcagatcc
tgtagagacc acatcatcca cggttctata ctgttgaccc 12360aatgcgtctc ccttgtcatc
taaacccaca ccgggtgtca taatcaacca atcgtaacct 12420tcatctcttc cacccatgtc
tctttgagca ataaagccga taacaaaatc tttgtcgctc 12480ttcgcaatgt caacagtacc
cttagtatat tctccagtag atagggagcc cttgcatgac 12540aattctgcta acatcaaaag
gcctctaggt tcctttgtta cttcttctgc cgcctgcttc 12600aaaccgctaa caatacctgg
gcccaccaca ccgtgtgcat tcgtaatgtc tgcccattct 12660gctattctgt atacacccgc
agagtactgc aatttgactg tattaccaat gtcagcaaat 12720tttctgtctt cgaagagtaa
aaaattgtac ttggcggata atgcctttag cggcttaact 12780gtgccctcca tggaaaaatc
agtcaagata tccacatgtg tttttagtaa acaaattttg 12840ggacctaatg cttcaactaa
ctccagtaat tccttggtgg tacgaacatc caatgaagca 12900cacaagtttg tttgcttttc
gtgcatgata ttaaatagct tggcagcaac aggactagga 12960tgagtagcag cacgttcctt
atatgtagct ttcgacatga tttatcttcg tttcctgcag 13020gtttttgttc tgtgcagttg
ggttaagaat actgggcaat ttcatgtttc ttcaacacta 13080catatgcgta tatataccaa
tctaagtctg tgctccttcc ttcgttcttc cttctgttcg 13140gagattaccg aatcaaaaaa
atttcaagga aaccgaaatc aaaaaaaaga ataaaaaaaa 13200aatgatgaat tgaaaagctt
gcatgcctgc aggtcgactc tagtatactc cgtctactgt 13260acgatacact tccgctcagg
tccttgtcct ttaacgaggc cttaccactc ttttgttact 13320ctattgatcc agctcagcaa
aggcagtgtg atctaagatt ctatcttcgc gatgtagtaa 13380aactagctag accgagaaag
agactagaaa tgcaaaaggc acttctacaa tggctgccat 13440cattattatc cgatgtgacg
ctgcattttt tttttttttt tttttttttt tttttttttt 13500tttttttttt tttttttgta
caaatatcat aaaaaaagag aatcttttta agcaaggatt 13560ttcttaactt cttcggcgac
agcatcaccg acttcggtgg tactgttgga accacctaaa 13620tcaccagttc tgatacctgc
atccaaaacc tttttaactg catcttcaat ggctttacct 13680tcttcaggca agttcaatga
caatttcaac atcattgcag cagacaagat agtggcgata 13740gggttgacct tattctttgg
caaatctgga gcggaaccat ggcatggttc gtacaaacca 13800aatgcggtgt tcttgtctgg
caaagaggcc aaggacgcag atggcaacaa acccaaggag 13860cctgggataa cggaggcttc
atcggagatg atatcaccaa acatgttgct ggtgattata 13920ataccattta ggtgggttgg
gttcttaact aggatcatgg cggcagaatc aatcaattga 13980tgttgaactt tcaatgtagg
gaattcgttc ttgatggttt cctccacagt ttttctccat 14040aatcttgaag aggccaaaac
attagcttta tccaaggacc aaataggcaa tggtggctca 14100tgttgtaggg ccatgaaagc
ggccattctt gtgattcttt gcacttctgg aacggtgtat 14160tgttcactat cccaagcgac
accatcacca tcgtcttcct ttctcttacc aaagtaaata 14220cctcccacta attctctaac
aacaacgaag tcagtacctt tagcaaattg tggcttgatt 14280ggagataagt ctaaaagaga
gtcggatgca aagttacatg gtcttaagtt ggcgtacaat 14340tgaagttctt tacggatttt
tagtaaacct tgttcaggtc taacactacc ggtaccccat 14400ttaggaccac ccacagcacc
taacaaaacg gcatcagcct tcttggaggc ttccagcgcc 14460tcatctggaa gtggaacacc
tgtagcatcg atagcagcac caccaattaa atgattttcg 14520aaatcgaact tgacattgga
acgaacatca gaaatagctt taagaacctt aatggcttcg 14580gctgtgattt cttgaccaac
gtggtcacct ggcaaaacga cgatcttctt aggggcagac 14640attacaatgg tatatccttg
aaatatatat aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 14700tgcagcttct caatgatatt
cgaatacgct ttgaggagat acagcctaat atccgacaaa 14760ctgttttaca gatttacgat
cgtacttgtt acccatcatt gaattttgaa catccgaacc 14820tgggagtttt ccctgaaaca
gatagtatat ttgaacctgt ataataatat atagtctagc 14880gctttacgga agacaatgta
tgtatttcgg ttcctggaga aactattgca tctattgcat 14940aggtaatctt gcacgtcgca
tccccggttc attttctgcg tttccatctt gcacttcaat 15000agcatatctt tgttaacgaa
gcatctgtgc ttcattttgt agaacaaaaa tgcaacgcga 15060gagcgctaat ttttcaaaca
aagaatctga gctgcatttt tacagaacag aaatgcaacg 15120cgaaagcgct attttaccaa
cgaagaatct gtgcttcatt tttgtaaaac aaaaatgcaa 15180cgcgagagcg ctaatttttc
aaacaaagaa tctgagctgc atttttacag aacagaaatg 15240caacgcgaga gcgctatttt
accaacaaag aatctatact tcttttttgt tctacaaaaa 15300tgcatcccga gagcgctatt
tttctaacaa agcatcttag attacttttt ttctcctttg 15360tgcgctctat aatgcagtct
cttgataact ttttgcactg taggtccgtt aaggttagaa 15420gaaggctact ttggtgtcta
ttttctcttc cataaaaaaa gcctgactcc acttcccgcg 15480tttactgatt actagcgaag
ctgcgggtgc attttttcaa gataaaggca tccccgatta 15540tattctatac cgatgtggat
tgcgcatact ttgtgaacag aaagtgatag cgttgatgat 15600tcttcattgg tcagaaaatt
atgaacggtt tcttctattt tgtctctata tactacgtat 15660aggaaatgtt tacattttcg
tattgttttc gattcactct atgaatagtt cttactacaa 15720tttttttgtc taaagagtaa
tactagagat aaacataaaa aatgtagagg tcgagtttag 15780atgcaagttc aaggagcgaa
aggtggatgg gtaggttata tagggatata gcacagagat 15840atatagcaaa gagatacttt
tgagcaatgt ttgtggaagc ggtattcgca atattttagt 15900agctcgttac agtccggtgc
gtttttggtt ttttgaaagt gcgtcttcag agcgcttttg 15960gttttcaaaa gcgctctgaa
gttcctatac tttctagaga ataggaactt cggaatagga 16020acttcaaagc gtttccgaaa
acgagcgctt ccgaaaatgc aacgcgagct gcgcacatac 16080agctcactgt tcacgtcgca
cctatatctg cgtgttgcct gtatatatat atacatgaga 16140agaacggcat agtgcgtgtt
tatgcttaaa tgcgtactta tatgcgtcta tttatgtagg 16200atgaaaggta gtctagtacc
tcctgtgata ttatcccatt ccatgcgggg tatcgtatgc 16260ttccttcagc actacccttt
agctgttcta tatgctgcca ctcctcaatt ggattagtct 16320catccttcaa tgctatcatt
tcctttgata ttggatcata tgcatagtac cgagaaacta 16380gaggatc
16387
30 448 DNA Saccharomyces cerevisiae
30 cccattaccg acatttgggc gctatacgtg catatgttca tgtatgtatc
tgtatttaaa 60acacttttgt attatttttc ctcatatatg tgtataggtt tatacggatg
atttaattat 120tacttcacca ccctttattt caggctgata tcttagcctt gttactagtt
agaaaaagac 180atttttgctg tcagtcactg tcaagagatt cttttgctgg catttcttct
agaagcaaaa 240agagcgatgc gtcttttccg ctgaaccgtt ccagcaaaaa agactaccaa
cgcaatatgg 300attgtcagaa tcatataaaa gagaagcaaa taactccttg tcttgtatca
attgcattat 360aatatcttct tgttagtgca atatcatata gaagtcatcg aaatagatat
taagaaaaac 420aaactgtaca atcaatcaat caatcatc
448
31 1713 DNA Bacillus subtilis
31 ttgacaaaag caacaaaaga
acaaaaatcc cttgtgaaaa acagaggggc ggagcttgtt 60gttgattgct tagtggagca
aggtgtcaca catgtatttg gcattccagg tgcaaaaatt 120gatgcggtat ttgacgcttt
acaagataaa ggacctgaaa ttatcgttgc ccggcacgaa 180caaaacgcag cattcatggc
ccaagcagtc ggccgtttaa ctggaaaacc gggagtcgtg 240ttagtcacat caggaccggg
tgcctctaac ttggcaacag gcctgctgac agcgaacact 300gaaggagacc ctgtcgttgc
gcttgctgga aacgtgatcc gtgcagatcg tttaaaacgg 360acacatcaat ctttggataa
tgcggcgcta ttccagccga ttacaaaata cagtgtagaa 420gttcaagatg taaaaaatat
accggaagct gttacaaatg catttaggat agcgtcagca 480gggcaggctg gggccgcttt
tgtgagcttt ccgcaagatg ttgtgaatga agtcacaaat 540acgaaaaacg tgcgtgctgt
tgcagcgcca aaactcggtc ctgcagcaga tgatgcaatc 600agtgcggcca tagcaaaaat
ccaaacagca aaacttcctg tcgttttggt cggcatgaaa 660ggcggaagac cggaagcaat
taaagcggtt cgcaagcttt tgaaaaaggt tcagcttcca 720tttgttgaaa catatcaagc
tgccggtacc ctttctagag atttagagga tcaatatttt 780ggccgtatcg gtttgttccg
caaccagcct ggcgatttac tgctagagca ggcagatgtt 840gttctgacga tcggctatga
cccgattgaa tatgatccga aattctggaa tatcaatgga 900gaccggacaa ttatccattt
agacgagatt atcgctgaca ttgatcatgc ttaccagcct 960gatcttgaat tgatcggtga
cattccgtcc acgatcaatc atatcgaaca cgatgctgtg 1020aaagtggaat ttgcagagcg
tgagcagaaa atcctttctg atttaaaaca atatatgcat 1080gaaggtgagc aggtgcctgc
agattggaaa tcagacagag cgcaccctct tgaaatcgtt 1140aaagagttgc gtaatgcagt
cgatgatcat gttacagtaa cttgcgatat cggttcgcac 1200gccatttgga tgtcacgtta
tttccgcagc tacgagccgt taacattaat gatcagtaac 1260ggtatgcaaa cactcggcgt
tgcgcttcct tgggcaatcg gcgcttcatt ggtgaaaccg 1320ggagaaaaag tggtttctgt
ctctggtgac ggcggtttct tattctcagc aatggaatta 1380gagacagcag ttcgactaaa
agcaccaatt gtacacattg tatggaacga cagcacatat 1440gacatggttg cattccagca
attgaaaaaa tataaccgta catctgcggt cgatttcgga 1500aatatcgata tcgtgaaata
tgcggaaagc ttcggagcaa ctggcttgcg cgtagaatca 1560ccagaccagc tggcagatgt
tctgcgtcaa ggcatgaacg ctgaaggtcc tgtcatcatc 1620gatgtcccgg ttgactacag
tgataacatt aatttagcaa gtgacaagct tccgaaagaa 1680ttcggggaac tcatgaaaac
gaaagctctc tag 1713
32 571 PRT Bacillus subtilis
32 Met Leu Thr Lys Ala Thr Lys Glu Gln Lys Ser Leu Val Lys Asn
Arg 1 5 10 15 Gly
Ala Glu Leu Val Val Asp Cys Leu Val Glu Gln Gly Val Thr His
20 25 30 Val Phe Gly Ile Pro
Gly Ala Lys Ile Asp Ala Val Phe Asp Ala Leu 35
40 45 Gln Asp Lys Gly Pro Glu Ile Ile Val
Ala Arg His Glu Gln Asn Ala 50 55
60 Ala Phe Met Ala Gln Ala Val Gly Arg Leu Thr Gly Lys
Pro Gly Val 65 70 75
80 Val Leu Val Thr Ser Gly Pro Gly Ala Ser Asn Leu Ala Thr Gly Leu
85 90 95 Leu Thr Ala Asn
Thr Glu Gly Asp Pro Val Val Ala Leu Ala Gly Asn 100
105 110 Val Ile Arg Ala Asp Arg Leu Lys Arg
Thr His Gln Ser Leu Asp Asn 115 120
125 Ala Ala Leu Phe Gln Pro Ile Thr Lys Tyr Ser Val Glu Val
Gln Asp 130 135 140
Val Lys Asn Ile Pro Glu Ala Val Thr Asn Ala Phe Arg Ile Ala Ser 145
150 155 160 Ala Gly Gln Ala Gly
Ala Ala Phe Val Ser Phe Pro Gln Asp Val Val 165
170 175 Asn Glu Val Thr Asn Thr Lys Asn Val Arg
Ala Val Ala Ala Pro Lys 180 185
190 Leu Gly Pro Ala Ala Asp Asp Ala Ile Ser Ala Ala Ile Ala Lys
Ile 195 200 205 Gln
Thr Ala Lys Leu Pro Val Val Leu Val Gly Met Lys Gly Gly Arg 210
215 220 Pro Glu Ala Ile Lys Ala
Val Arg Lys Leu Leu Lys Lys Val Gln Leu 225 230
235 240 Pro Phe Val Glu Thr Tyr Gln Ala Ala Gly Thr
Leu Ser Arg Asp Leu 245 250
255 Glu Asp Gln Tyr Phe Gly Arg Ile Gly Leu Phe Arg Asn Gln Pro Gly
260 265 270 Asp Leu
Leu Leu Glu Gln Ala Asp Val Val Leu Thr Ile Gly Tyr Asp 275
280 285 Pro Ile Glu Tyr Asp Pro Lys
Phe Trp Asn Ile Asn Gly Asp Arg Thr 290 295
300 Ile Ile His Leu Asp Glu Ile Ile Ala Asp Ile Asp
His Ala Tyr Gln 305 310 315
320 Pro Asp Leu Glu Leu Ile Gly Asp Ile Pro Ser Thr Ile Asn His Ile
325 330 335 Glu His Asp
Ala Val Lys Val Glu Phe Ala Glu Arg Glu Gln Lys Ile 340
345 350 Leu Ser Asp Leu Lys Gln Tyr Met
His Glu Gly Glu Gln Val Pro Ala 355 360
365 Asp Trp Lys Ser Asp Arg Ala His Pro Leu Glu Ile Val
Lys Glu Leu 370 375 380
Arg Asn Ala Val Asp Asp His Val Thr Val Thr Cys Asp Ile Gly Ser 385
390 395 400 His Ala Ile Trp
Met Ser Arg Tyr Phe Arg Ser Tyr Glu Pro Leu Thr 405
410 415 Leu Met Ile Ser Asn Gly Met Gln Thr
Leu Gly Val Ala Leu Pro Trp 420 425
430 Ala Ile Gly Ala Ser Leu Val Lys Pro Gly Glu Lys Val Val
Ser Val 435 440 445
Ser Gly Asp Gly Gly Phe Leu Phe Ser Ala Met Glu Leu Glu Thr Ala 450
455 460 Val Arg Leu Lys Ala
Pro Ile Val His Ile Val Trp Asn Asp Ser Thr 465 470
475 480 Tyr Asp Met Val Ala Phe Gln Gln Leu Lys
Lys Tyr Asn Arg Thr Ser 485 490
495 Ala Val Asp Phe Gly Asn Ile Asp Ile Val Lys Tyr Ala Glu Ser
Phe 500 505 510 Gly
Ala Thr Gly Leu Arg Val Glu Ser Pro Asp Gln Leu Ala Asp Val 515
520 525 Leu Arg Gln Gly Met Asn
Ala Glu Gly Pro Val Ile Ile Asp Val Pro 530 535
540 Val Asp Tyr Ser Asp Asn Ile Asn Leu Ala Ser
Asp Lys Leu Pro Lys 545 550 555
560 Glu Phe Gly Glu Leu Met Lys Thr Lys Ala Leu 565
570
33 250 DNA Saccharomyces cerevisiae
33 ccgcaaatta
aagccttcga gcgtcccaaa accttctcaa gcaaggtttt cagtataatg 60ttacatgcgt
acacgcgtct gtacagaaaa aaaagaaaaa tttgaaatat aaataacgtt 120cttaatacta
acataactat aaaaaaataa atagggacct agacttcagg ttgtctaact 180ccttcctttt
cggttagagc ggatgtgggg ggagggcgtg aatgtaagcg tgacataact 240aattacatga
250
34 1181 DNA Saccharomyces cerevisiae
34 taaaacctct agtggagtag tagatgtaat
caatgaagcg gaagccaaaa gaccagagta 60gaggcctata gaagaaactg cgataccttt
tgtgatggct aaacaaacag acatcttttt 120atatgttttt acttctgtat atcgtgaagt
agtaagtgat aagcgaattt ggctaagaac 180gttgtaagtg aacaagggac ctcttttgcc
tttcaaaaaa ggattaaatg gagttaatca 240ttgagattta gttttcgtta gattctgtat
ccctaaataa ctcccttacc cgacgggaag 300gcacaaaaga cttgaataat agcaaacggc
cagtagccaa gaccaaataa tactagagtt 360aactgatggt cttaaacagg cattacgtgg
tgaactccaa gaccaatata caaaatatcg 420ataagttatt cttgcccacc aatttaagga
gcctacatca ggacagtagt accattcctc 480agagaagagg tatacataac aagaaaatcg
cgtgaacacc ttatataact tagcccgtta 540ttgagctaaa aaaccttgca aaatttccta
tgaataagaa tacttcagac gtgataaaaa 600tttactttct aactcttctc acgctgcccc
tatctgttct tccgctctac cgtgagaaat 660aaagcatcga gtacggcagt tcgctgtcac
tgaactaaaa caataaggct agttcgaatg 720atgaacttgc ttgctgtcaa acttctgagt
tgccgctgat gtgacactgt gacaataaat 780tcaaaccggt tatagcggtc tcctccggta
ccggttctgc cacctccaat agagctcagt 840aggagtcaga acctctgcgg tggctgtcag
tgactcatcc gcgtttcgta agttgtgcgc 900gtgcacattt cgcccgttcc cgctcatctt
gcagcaggcg gaaattttca tcacgctgta 960ggacgcaaaa aaaaaataat taatcgtaca
agaatcttgg aaaaaaaatt gaaaaatttt 1020gtataaaagg gatgacctaa cttgactcaa
tggcttttac acccagtatt ttccctttcc 1080ttgtttgtta caattataga agcaagacaa
aaacatatag acaacctatt cctaggagtt 1140atattttttt accctaccag caatataagt
aaaaaactag t 1181
35 759 DNA Saccharomyces cerevisiae
35 ggccctgcag gcctatcaag tgctggaaac tttttctctt ggaatttttg caacatcaag
60tcatagtcaa ttgaattgac ccaatttcac atttaagatt tttttttttt catccgacat
120acatctgtac actaggaagc cctgtttttc tgaagcagct tcaaatatat atatttttta
180catatttatt atgattcaat gaacaatcta attaaatcga aaacaagaac cgaaacgcga
240ataaataatt tatttagatg gtgacaagtg tataagtcct catcgggaca gctacgattt
300ctctttcggt tttggctgag ctactggttg ctgtgacgca gcggcattag cgcggcgtta
360tgagctaccc tcgtggcctg aaagatggcg ggaataaagc ggaactaaaa attactgact
420gagccatatt gaggtcaatt tgtcaactcg tcaagtcacg tttggtggac ggcccctttc
480caacgaatcg tatatactaa catgcgcgcg cttcctatat acacatatac atatatatat
540atatatatat gtgtgcgtgt atgtgtacac ctgtatttaa tttccttact cgcgggtttt
600tcttttttct caattcttgg cttcctcttt ctcgagtata taatttttca ggtaaaattt
660agtacgatag taaaatactt ctcgaactcg tcacatatac gtgtacataa tgtctgaacc
720agctcaaaag aaacaaaagg ttgctaacaa ctctctaga
759
36 643 DNA Saccharomyces cerevisiae
36 gaaatgaata acaatactga cagtactaaa
taattgccta cttggcttca catacgttgc 60atacgtcgat atagataata atgataatga
cagcaggatt atcgtaatac gtaatagttg 120aaaatctcaa aaatgtgtgg gtcattacgt
aaataatgat aggaatggga ttcttctatt 180tttccttttt ccattctagc agccgtcggg
aaaacgtggc atcctctctt tcgggctcaa 240ttggagtcac gctgccgtga gcatcctctc
tttccatatc taacaactga gcacgtaacc 300aatggaaaag catgagctta gcgttgctcc
aaaaaagtat tggatggtta ataccatttg 360tctgttctct tctgactttg actcctcaaa
aaaaaaaaat ctacaatcaa cagatcgctt 420caattacgcc ctcacaaaaa cttttttcct
tcttcttcgc ccacgttaaa ttttatccct 480catgttgtct aacggatttc tgcacttgat
ttattataaa aagacaaaga cataatactt 540ctctatcaat ttcagttatt gttcttcctt
gcgttattct tctgttcttc tttttctttt 600gtcatatata accataacca agtaatacat
attcaaatct aga 643
37 1014 DNA artificial sequence Pf5.IlvC-Z4B8 variant
37 atgaaggtgt tttacgataa agactgcgat
ctgagcatca tccagggaaa gaaggttgct 60attataggat atggttccca aggacacgca
caagccttga acttgaaaga ttctggggtc 120gacgtgacag taggtctgta taaaggtgct
gctgatgcag caaaggctga agcacatggc 180tttaaagtca cagatgttgc agcggctgtt
gctggcgctg atttagtcat gattttaatt 240ccagatgaat ttcaatcgca attgtacaaa
aatgaaatag aaccaaacat taagaagggc 300gctaccttgg ccttcagtca tggatttgcc
attcattaca atcaagtagt ccccagggca 360gatttggacg ttattatgat tgcacctaag
gctccggggc atactgttag gagcgaattt 420gttaagggtg gtggtattcc agatttgatc
gctatatacc aagacgttag cggaaacgct 480aagaatgtag ctttaagcta cgcagcagga
gttggtggcg ggagaacggg tataatagaa 540accactttta aagacgagac tgagacagat
ttatttggag aacaagcggt tctgtgcgga 600ggaactgttg aattggttaa agcaggcttt
gagacgcttg tcgaagcagg gtacgctccc 660gaaatggcat acttcgaatg tctacatgaa
ttgaagttga tagtagactt aatgtatgaa 720ggtggtatag ctaatatgaa ctattccatt
tcaaataatg cagaatatgg tgagtatgtc 780accggacctg aagtcattaa cgcagaatca
agacaagcca tgagaaatgc cttgaaacgt 840atccaggacg gtgaatacgc taagatgttc
ataagtgaag gcgctacggg ttacccgagt 900atgactgcta aaagaagaaa caatgcagca
catggtatcg aaattattgg tgaacagtta 960aggtctatga tgccctggat cggtgctaat
aagatcgtag acaaggcgaa aaat 1014
38 338 PRT artificial sequence Pf5.IlvC-Z4B8 variant
38 Met Lys Val Phe Tyr Asp Lys Asp Cys Asp
Leu Ser Ile Ile Gln Gly 1 5 10
15 Lys Lys Val Ala Ile Ile Gly Tyr Gly Ser Gln Gly His Ala Gln
Ala 20 25 30 Leu
Asn Leu Lys Asp Ser Gly Val Asp Val Thr Val Gly Leu Tyr Lys 35
40 45 Gly Ala Ala Asp Ala Ala
Lys Ala Glu Ala His Gly Phe Lys Val Thr 50 55
60 Asp Val Ala Ala Ala Val Ala Gly Ala Asp Leu
Val Met Ile Leu Ile 65 70 75
80 Pro Asp Glu Phe Gln Ser Gln Leu Tyr Lys Asn Glu Ile Glu Pro Asn
85 90 95 Ile Lys
Lys Gly Ala Thr Leu Ala Phe Ser His Gly Phe Ala Ile His 100
105 110 Tyr Asn Gln Val Val Pro Arg
Ala Asp Leu Asp Val Ile Met Ile Ala 115 120
125 Pro Lys Ala Pro Gly His Thr Val Arg Ser Glu Phe
Val Lys Gly Gly 130 135 140
Gly Ile Pro Asp Leu Ile Ala Ile Tyr Gln Asp Val Ser Gly Asn Ala 145
150 155 160 Lys Asn Val
Ala Leu Ser Tyr Ala Ala Ala Val Gly Gly Gly Arg Thr 165
170 175 Gly Ile Ile Glu Thr Thr Phe Lys
Asp Glu Thr Glu Thr Asp Leu Phe 180 185
190 Gly Glu Gln Ala Val Leu Cys Gly Gly Thr Val Glu Leu
Val Lys Ala 195 200 205
Gly Phe Glu Thr Leu Val Glu Ala Gly Tyr Ala Pro Glu Met Ala Tyr 210
215 220 Phe Glu Cys Leu
His Glu Leu Lys Leu Ile Val Asp Leu Met Tyr Glu 225 230
235 240 Gly Gly Ile Ala Asn Met Asn Tyr Ser
Ile Ser Asn Asn Ala Glu Tyr 245 250
255 Gly Glu Tyr Val Thr Gly Pro Glu Val Ile Asn Ala Glu Ser
Arg Gln 260 265 270
Ala Met Arg Asn Ala Leu Lys Arg Ile Gln Asp Gly Glu Tyr Ala Lys
275 280 285 Met Phe Ile Ser
Glu Gly Ala Thr Gly Tyr Pro Ser Met Thr Ala Lys 290
295 300 Arg Arg Asn Asn Ala Ala His Gly
Ile Glu Ile Ile Gly Glu Gln Leu 305 310
315 320 Arg Ser Met Met Pro Trp Ile Gly Ala Asn Lys Ile
Val Asp Lys Ala 325 330
335 Lys Asn
39 1188 DNA Saccharomyces cerevisiae
39 atgttgagaa ctcaagccgc
cagattgatc tgcaactccc gtgtcatcac tgctaagaga 60acctttgctt tggccacccg
tgctgctgct tacagcagac cagctgcccg tttcgttaag 120ccaatgatca ctacccgtgg
tttgaagcaa atcaacttcg gtggtactgt tgaaaccgtc 180tacgaaagag ctgactggcc
aagagaaaag ttgttggact acttcaagaa cgacactttt 240gctttgatcg gttacggttc
ccaaggttac ggtcaaggtt tgaacttgag agacaacggt 300ttgaacgtta tcattggtgt
ccgtaaagat ggtgcttctt ggaaggctgc catcgaagac 360ggttgggttc caggcaagaa
cttgttcact gttgaagatg ctatcaagag aggtagttac 420gttatgaact tgttgtccga
tgccgctcaa tcagaaacct ggcctgctat caagccattg 480ttgaccaagg gtaagacttt
gtacttctcc cacggtttct ccccagtctt caaggacttg 540actcacgttg aaccaccaaa
ggacttagat gttatcttgg ttgctccaaa gggttccggt 600agaactgtca gatctttgtt
caaggaaggt cgtggtatta actcttctta cgccgtctgg 660aacgatgtca ccggtaaggc
tcacgaaaag gcccaagctt tggccgttgc cattggttcc 720ggttacgttt accaaaccac
tttcgaaaga gaagtcaact ctgacttgta cggtgaaaga 780ggttgtttaa tgggtggtat
ccacggtatg ttcttggctc aatacgacgt cttgagagaa 840aacggtcact ccccatctga
agctttcaac gaaaccgtcg aagaagctac ccaatctcta 900tacccattga tcggtaagta
cggtatggat tacatgtacg atgcttgttc caccaccgcc 960agaagaggtg ctttggactg
gtacccaatc ttcaagaatg ctttgaagcc tgttttccaa 1020gacttgtacg aatctaccaa
gaacggtacc gaaaccaaga gatctttgga attcaactct 1080caacctgact acagagaaaa
gctagaaaag gaattagaca ccatcagaaa catggaaatc 1140tggaaggttg gtaaggaagt
cagaaagttg agaccagaaa accaataa 1188
40 395 PRT Saccharomyces cerevisiae
40 Met Leu Arg Thr Gln Ala Ala Arg Leu Ile Cys Asn Ser Arg Val
Ile 1 5 10 15 Thr
Ala Lys Arg Thr Phe Ala Leu Ala Thr Arg Ala Ala Ala Tyr Ser
20 25 30 Arg Pro Ala Ala Arg
Phe Val Lys Pro Met Ile Thr Thr Arg Gly Leu 35
40 45 Lys Gln Ile Asn Phe Gly Gly Thr Val
Glu Thr Val Tyr Glu Arg Ala 50 55
60 Asp Trp Pro Arg Glu Lys Leu Leu Asp Tyr Phe Lys Asn
Asp Thr Phe 65 70 75
80 Ala Leu Ile Gly Tyr Gly Ser Gln Gly Tyr Gly Gln Gly Leu Asn Leu
85 90 95 Arg Asp Asn Gly
Leu Asn Val Ile Ile Gly Val Arg Lys Asp Gly Ala 100
105 110 Ser Trp Lys Ala Ala Ile Glu Asp Gly
Trp Val Pro Gly Lys Asn Leu 115 120
125 Phe Thr Val Glu Asp Ala Ile Lys Arg Gly Ser Tyr Val Met
Asn Leu 130 135 140
Leu Ser Asp Ala Ala Gln Ser Glu Thr Trp Pro Ala Ile Lys Pro Leu 145
150 155 160 Leu Thr Lys Gly Lys
Thr Leu Tyr Phe Ser His Gly Phe Ser Pro Val 165
170 175 Phe Lys Asp Leu Thr His Val Glu Pro Pro
Lys Asp Leu Asp Val Ile 180 185
190 Leu Val Ala Pro Lys Gly Ser Gly Arg Thr Val Arg Ser Leu Phe
Lys 195 200 205 Glu
Gly Arg Gly Ile Asn Ser Ser Tyr Ala Val Trp Asn Asp Val Thr 210
215 220 Gly Lys Ala His Glu Lys
Ala Gln Ala Leu Ala Val Ala Ile Gly Ser 225 230
235 240 Gly Tyr Val Tyr Gln Thr Thr Phe Glu Arg Glu
Val Asn Ser Asp Leu 245 250
255 Tyr Gly Glu Arg Gly Cys Leu Met Gly Gly Ile His Gly Met Phe Leu
260 265 270 Ala Gln
Tyr Asp Val Leu Arg Glu Asn Gly His Ser Pro Ser Glu Ala 275
280 285 Phe Asn Glu Thr Val Glu Glu
Ala Thr Gln Ser Leu Tyr Pro Leu Ile 290 295
300 Gly Lys Tyr Gly Met Asp Tyr Met Tyr Asp Ala Cys
Ser Thr Thr Ala 305 310 315
320 Arg Arg Gly Ala Leu Asp Trp Tyr Pro Ile Phe Lys Asn Ala Leu Lys
325 330 335 Pro Val Phe
Gln Asp Leu Tyr Glu Ser Thr Lys Asn Gly Thr Glu Thr 340
345 350 Lys Arg Ser Leu Glu Phe Asn Ser
Gln Pro Asp Tyr Arg Glu Lys Leu 355 360
365 Glu Lys Glu Leu Asp Thr Ile Arg Asn Met Glu Ile Trp
Lys Val Gly 370 375 380
Lys Glu Val Arg Lys Leu Arg Pro Glu Asn Gln 385 390
395
41 1017 DNA artificial sequence Pf5.IlvC-JEA1 variant
41 atgaaagttt tctacgataa agactgcgac ctgtcgatca tccaaggtaa gaaagttgcc
60atcatcggct tcggttccca gggccacgct caagcactca acctgaagga ttccggcgta
120gacgtgactg ttggcctgcc taaaggcttt gctgatgtag ccaaggctga agcccacggc
180tttaaagtga ccgacgttgc tgcagccgtt gccggtgccg acttggtcat gatcctgatt
240ccggacgagt tccagtccca gctgtacaag aacgaaatcg agccgaacat caagaagggc
300gccactctgg ccttctccca cggcttcgcg atccactaca accaggttgt gcctcgtgcc
360gacctcgacg tgatcatgat cgcgccgaag gctccaggcc acaccgtacg ttccgagttc
420gtcaagggcg gaggtattcc tgacctgatc gcgatctacc aggacgtttc cggcaacgcc
480aagaacgtcg ccctgtccta cgccgcaggc gtgggcggcg gccgtaccgg catcatcgaa
540accaccttca aggacgagac tgaaaccgac ctgttcggtg agcaggctgt tctgtgtggc
600ggtaccgtcg agctggtcaa agccggtttc gaaaccctgg ttgaagctgg ctacgctcca
660gaaatggcct acttcgagtg cctgcacgaa ctgaagctga tcgttgacct catgtacgaa
720ggcggtatcg ccaacatgaa ctactcgatc tccaacaacg ctgaatacgg cgagtacgtg
780actggtccag aagtcatcaa cgccgaatcc cgtcaggcca tgcgcaatgc tctgaagcgc
840atccaggacg gcgaatacgc gaagatgttc atcagcgaag gcgctaccgg ctacccatcg
900atgaccgcca agcgtcgtaa caacgctgct cacggtatcg aaatcatcgg cgagcaactg
960cgctcgatga tgccttggat cggtgccaac aaaatcgtcg acaaagccaa gaactaa
1017
42 338 PRT artificial sequence Pf5.IlvC-JEA1 variant
42 Met Lys Val Phe
Tyr Asp Lys Asp Cys Asp Leu Ser Ile Ile Gln Gly 1 5
10 15 Lys Lys Val Ala Ile Ile Gly Phe Gly
Ser Gln Gly His Ala Gln Ala 20 25
30 Leu Asn Leu Lys Asp Ser Gly Val Asp Val Thr Val Gly Leu
Pro Lys 35 40 45
Gly Phe Ala Asp Val Ala Lys Ala Glu Ala His Gly Phe Lys Val Thr 50
55 60 Asp Val Ala Ala Ala
Val Ala Gly Ala Asp Leu Val Met Ile Leu Ile 65 70
75 80 Pro Asp Glu Phe Gln Ser Gln Leu Tyr Lys
Asn Glu Ile Glu Pro Asn 85 90
95 Ile Lys Lys Gly Ala Thr Leu Ala Phe Ser His Gly Phe Ala Ile
His 100 105 110 Tyr
Asn Gln Val Val Pro Arg Ala Asp Leu Asp Val Ile Met Ile Ala 115
120 125 Pro Lys Ala Pro Gly His
Thr Val Arg Ser Glu Phe Val Lys Gly Gly 130 135
140 Gly Ile Pro Asp Leu Ile Ala Ile Tyr Gln Asp
Val Ser Gly Asn Ala 145 150 155
160 Lys Asn Val Ala Leu Ser Tyr Ala Ala Gly Val Gly Gly Gly Arg Thr
165 170 175 Gly Ile
Ile Glu Thr Thr Phe Lys Asp Glu Thr Glu Thr Asp Leu Phe 180
185 190 Gly Glu Gln Ala Val Leu Cys
Gly Gly Thr Val Glu Leu Val Lys Ala 195 200
205 Gly Phe Glu Thr Leu Val Glu Ala Gly Tyr Ala Pro
Glu Met Ala Tyr 210 215 220
Phe Glu Cys Leu His Glu Leu Lys Leu Ile Val Asp Leu Met Tyr Glu 225
230 235 240 Gly Gly Ile
Ala Asn Met Asn Tyr Ser Ile Ser Asn Asn Ala Glu Tyr 245
250 255 Gly Glu Tyr Val Thr Gly Pro Glu
Val Ile Asn Ala Glu Ser Arg Gln 260 265
270 Ala Met Arg Asn Ala Leu Lys Arg Ile Gln Asp Gly Glu
Tyr Ala Lys 275 280 285
Met Phe Ile Ser Glu Gly Ala Thr Gly Tyr Pro Ser Met Thr Ala Lys 290
295 300 Arg Arg Asn Asn
Ala Ala His Gly Ile Glu Ile Ile Gly Glu Gln Leu 305 310
315 320 Arg Ser Met Met Pro Trp Ile Gly Ala
Asn Lys Ile Val Asp Lys Ala 325 330
335 Lys Asn
43 15539 DNA artificial sequence Synthetic construct
43 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg
gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg
tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta
ctgagagtgc 180accataaatt cccgttttaa gagcttggtg agcgctagga gtcactgcca
ggtatcgttt 240gaacacggca ttagtcaggg aagtcataac acagtccttt cccgcaattt
tctttttcta 300ttactcttgg cctcctctag tacactctat atttttttat gcctcggtaa
tgattttcat 360tttttttttt ccacctagcg gatgactctt tttttttctt agcgattggc
attatcacat 420aatgaattat acattatata aagtaatgtg atttcttcga agaatatact
aaaaaatgag 480caggcaagat aaacgaaggc aaagatgaca gagcagaaag ccctagtaaa
gcgtattaca 540aatgaaacca agattcagat tgcgatctct ttaaagggtg gtcccctagc
gatagagcac 600tcgatcttcc cagaaaaaga ggcagaagca gtagcagaac aggccacaca
atcgcaagtg 660attaacgtcc acacaggtat agggtttctg gaccatatga tacatgctct
ggccaagcat 720tccggctggt cgctaatcgt tgagtgcatt ggtgacttac acatagacga
ccatcacacc 780actgaagact gcgggattgc tctcggtcaa gcttttaaag aggccctagg
ggccgtgcgt 840ggagtaaaaa ggtttggatc aggatttgcg cctttggatg aggcactttc
cagagcggtg 900gtagatcttt cgaacaggcc gtacgcagtt gtcgaacttg gtttgcaaag
ggagaaagta 960ggagatctct cttgcgagat gatcccgcat tttcttgaaa gctttgcaga
ggctagcaga 1020attaccctcc acgttgattg tctgcgaggc aagaatgatc atcaccgtag
tgagagtgcg 1080ttcaaggctc ttgcggttgc cataagagaa gccacctcgc ccaatggtac
caacgatgtt 1140ccctccacca aaggtgttct tatgtagtga caccgattat ttaaagctgc
agcatacgat 1200atatatacat gtgtatatat gtatacctat gaatgtcagt aagtatgtat
acgaacagta 1260tgatactgaa gatgacaagg taatgcatca ttctatacgt gtcattctga
acgaggcgcg 1320ctttcctttt ttctttttgc tttttctttt tttttctctt gaactcgacg
gatctatgcg 1380gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggaaat
tgtaagcgtt 1440aatattttgt taaaattcgc gttaaatttt tgttaaatca gctcattttt
taaccaatag 1500gccgaaatcg gcaaaatccc ttataaatca aaagaataga ccgagatagg
gttgagtgtt 1560gttccagttt ggaacaagag tccactatta aagaacgtgg actccaacgt
caaagggcga 1620aaaaccgtct atcagggcga tggcccacta cgtgaaccat caccctaatc
aagttttttg 1680gggtcgaggt gccgtaaagc actaaatcgg aaccctaaag ggagcccccg
atttagagct 1740tgacggggaa agccggcgaa cgtggcgaga aaggaaggga agaaagcgaa
aggagcgggc 1800gctagggcgc tggcaagtgt agcggtcacg ctgcgcgtaa ccaccacacc
cgccgcgctt 1860aatgcgccgc tacagggcgc gtccattcgc cattcaggct gcgcaactgt
tgggaagggc 1920gcggtgcggg cctcttcgct attacgccag ctggcgaaag ggggatgtgc
tgcaaggcga 1980ttaagttggg taacgccagg gttttcccag tcacgacgtt gtaaaacgac
ggccagtgag 2040cgcgcgtaat acgactcact atagggcgaa ttgggtaccg ggccccccct
cgaggtcgac 2100ggcgcgccac tggtagagag cgactttgta tgccccaatt gcgaaacccg
cgatatcctt 2160ctcgattctt tagtacccga ccaggacaag gaaaaggagg tcgaaacgtt
tttgaagaaa 2220caagaggaac tacacggaag ctctaaagat ggcaaccagc cagaaactaa
gaaaatgaag 2280ttgatggatc caactggcac cgctggcttg aacaacaata ccagccttcc
aacttctgta 2340aataacggcg gtacgccagt gccaccagta ccgttacctt tcggtatacc
tcctttcccc 2400atgtttccaa tgcccttcat gcctccaacg gctactatca caaatcctca
tcaagctgac 2460gcaagcccta agaaatgaat aacaatactg acagtactaa ataattgcct
acttggcttc 2520acatacgttg catacgtcga tatagataat aatgataatg acagcaggat
tatcgtaata 2580cgtaatagct gaaaatctca aaaatgtgtg ggtcattacg taaataatga
taggaatggg 2640attcttctat ttttcctttt tccattctag cagccgtcgg gaaaacgtgg
catcctctct 2700ttcgggctca attggagtca cgctgccgtg agcatcctct ctttccatat
ctaacaactg 2760agcacgtaac caatggaaaa gcatgagctt agcgttgctc caaaaaagta
ttggatggtt 2820aataccattt gtctgttctc ttctgacttt gactcctcaa aaaaaaaaat
ctacaatcaa 2880cagatcgctt caattacgcc ctcacaaaaa cttttttcct tcttcttcgc
ccacgttaaa 2940ttttatccct catgttgtct aacggatttc tgcacttgat ttattataaa
aagacaaaga 3000cataatactt ctctatcaat ttcagttatt gttcttcctt gcgttattct
tctgttcttc 3060tttttctttt gtcatatata accataacca agtaatacat attcaaacta
gtatgactga 3120caaaaaaact cttaaagact taagaaatcg tagttctgtt tacgattcaa
tggttaaatc 3180acctaatcgt gctatgttgc gtgcaactgg tatgcaagat gaagactttg
aaaaacctat 3240cgtcggtgtc atttcaactt gggctgaaaa cacaccttgt aatatccact
tacatgactt 3300tggtaaacta gccaaagtcg gtgttaagga agctggtgct tggccagttc
agttcggaac 3360aatcacggtt tctgatggaa tcgccatggg aacccaagga atgcgtttct
ccttgacatc 3420tcgtgatatt attgcagatt ctattgaagc agccatggga ggtcataatg
cggatgcttt 3480tgtagccatt ggcggttgtg ataaaaacat gcccggttct gttatcgcta
tggctaacat 3540ggatatccca gccatttttg cttacggcgg aacaattgca cctggtaatt
tagacggcaa 3600agatatcgat ttagtctctg tctttgaagg tgtcggccat tggaaccacg
gcgatatgac 3660caaagaagaa gttaaagctt tggaatgtaa tgcttgtccc ggtcctggag
gctgcggtgg 3720tatgtatact gctaacacaa tggcgacagc tattgaagtt ttgggactta
gccttccggg 3780ttcatcttct cacccggctg aatccgcaga aaagaaagca gatattgaag
aagctggtcg 3840cgctgttgtc aaaatgctcg aaatgggctt aaaaccttct gacattttaa
cgcgtgaagc 3900ttttgaagat gctattactg taactatggc tctgggaggt tcaaccaact
caacccttca 3960cctcttagct attgcccatg ctgctaatgt ggaattgaca cttgatgatt
tcaatacttt 4020ccaagaaaaa gttcctcatt tggctgattt gaaaccttct ggtcaatatg
tattccaaga 4080cctttacaag gtcggagggg taccagcagt tatgaaatat ctccttaaaa
atggcttcct 4140tcatggtgac cgtatcactt gtactggcaa aacagtcgct gaaaatttga
aggcttttga 4200tgatttaaca cctggtcaaa aggttattat gccgcttgaa aatcctaaac
gtgaagatgg 4260tccgctcatt attctccatg gtaacttggc tccagacggt gccgttgcca
aagtttctgg 4320tgtaaaagtg cgtcgtcatg tcggtcctgc taaggtcttt aattctgaag
aagaagccat 4380tgaagctgtc ttgaatgatg atattgttga tggtgatgtt gttgtcgtac
gttttgtagg 4440accaaagggc ggtcctggta tgcctgaaat gctttccctt tcatcaatga
ttgttggtaa 4500agggcaaggt gaaaaagttg cccttctgac agatggccgc ttctcaggtg
gtacttatgg 4560tcttgtcgtg ggtcatatcg ctcctgaagc acaagatggc ggtccaatcg
cctacctgca 4620aacaggagac atagtcacta ttgaccaaga cactaaggaa ttacactttg
atatctccga 4680tgaagagtta aaacatcgtc aagagaccat tgaattgcca ccgctctatt
cacgcggtat 4740ccttggtaaa tatgctcaca tcgtttcgtc tgcttctagg ggagccgtaa
cagacttttg 4800gaagcctgaa gaaactggca aaaaatgttg tcctggttgc tgtggttaag
cggccgcgtt 4860aattcaaatt aattgatata gttttttaat gagtattgaa tctgtttaga
aataatggaa 4920tattattttt atttatttat ttatattatt ggtcggctct tttcttctga
aggtcaatga 4980caaaatgata tgaaggaaat aatgatttct aaaattttac aacgtaagat
atttttacaa 5040aagcctagct catcttttgt catgcactat tttactcacg cttgaaatta
acggccagtc 5100cactgcggag tcatttcaaa gtcatcctaa tcgatctatc gtttttgata
gctcattttg 5160gagttcgcga ttgtcttctg ttattcacaa ctgttttaat ttttatttca
ttctggaact 5220cttcgagttc tttgtaaagt ctttcatagt agcttacttt atcctccaac
atatttaact 5280tcatgtcaat ttcggctctt aaattttcca catcatcaag ttcaacatca
tcttttaact 5340tgaatttatt ctctagctct tccaaccaag cctcattgct ccttgattta
ctggtgaaaa 5400gtgatacact ttgcgcgcaa tccaggtcaa aactttcctg caaagaattc
accaatttct 5460cgacatcata gtacaatttg ttttgttctc ccatcacaat ttaatatacc
tgatggattc 5520ttatgaagcg ctgggtaatg gacgtgtcac tctacttcgc ctttttccct
actcctttta 5580gtacggaaga caatgctaat aaataagagg gtaataataa tattattaat
cggcaaaaaa 5640gattaaacgc caagcgttta attatcagaa agcaaacgtc gtaccaatcc
ttgaatgctt 5700cccaattgta tattaagagt catcacagca acatattctt gttattaaat
taattattat 5760tgatttttga tattgtataa aaaaaccaaa tatgtataaa aaaagtgaat
aaaaaatacc 5820aagtatggag aaatatatta gaagtctata cgttaaacca cccgggcccc
ccctcgaggt 5880cgacggtatc gataagcttg atatcgaatt cctgcagccc gggggatcca
ctagttctag 5940agcggccgct ctagaactag taccacaggt gttgtcctct gaggacataa
aatacacacc 6000gagattcatc aactcattgc tggagttagc atatctacaa ttgggtgaaa
tggggagcga 6060tttgcaggca tttgctcggc atgccggtag aggtgtggtc aataagagcg
acctcatgct 6120atacctgaga aagcaacctg acctacagga aagagttact caagaataag
aattttcgtt 6180ttaaaaccta agagtcactt taaaatttgt atacacttat tttttttata
acttatttaa 6240taataaaaat cataaatcat aagaaattcg cttactctta attaatcaaa
aagttaaaat 6300tgtacgaata gattcaccac ttcttaacaa atcaaaccct tcattgattt
tctcgaatgg 6360caatacatgt gtaattaaag gatcaagagc aaacttcttc gccataaagt
cggcaacaag 6420ttttggaaca ctatccttgc tcttaaaacc gccaaatata gctcccttcc
atgtacgacc 6480gcttagcaac agcataggat tcatcgacaa attttgtgaa tcaggaggaa
cacctacgat 6540cacactgact ccatatgcct cttgacagca ggacaacgca gttaccatag
tatcaagacg 6600gcctataact tcaaaagaga aatcaactcc accgtttgac atttcagtaa
ggacttcttg 6660tattggtttc ttataatctt gagggttaac acattcagta gccccgacct
ccttagcttt 6720tgcaaatttg tccttattga tgtctacacc tataatcctc gctgcgcctg
cagctttaca 6780ccccataata acgcttagtc ctactcctcc taaaccgaat actgcacaag
tcgaaccctg 6840tgtaaccttt gcaactttaa ctgcggaacc gtaaccggtg gaaaatccgc
accctatcaa 6900gcaaactttt tccagtggtg aagctgcatc gattttagcg acagatatct
cgtccaccac 6960tgtgtattgg gaaaatgtag aagtaccaag gaaatggtgt ataggtttcc
ctctgcatgt 7020aaatctgctt gtaccatcct gcatagtacc tctaggcata gacaaatcat
ttttaaggca 7080gaaattaccc tcaggatgtt tgcagactct acacttacca cattgaggag
tgaacagtgg 7140gatcacttta tcaccaggac gaacagtggt aacaccttca cctatggatt
caacgattcc 7200ggcagcctcg tgtcccgcga ttactggcaa aggagtaact agagtgccac
tcaccacatg 7260gtcgtcggat ctacagattc cggtggcaac catcttgatt ctaacctcgt
gtgcttttgg 7320tggcgctact tctacttctt ctatgctaaa cggctttttc tcttcccaca
aaactgccgc 7380tttacactta ataactttac cggctgttga catcctcagc tagctattgt
aatatgtgtg 7440tttgtttgga ttattaagaa gaataattac aaaaaaaatt acaaaggaag
gtaattacaa 7500cagaattaag aaaggacaag aaggaggaag agaatcagtt cattatttct
tctttgttat 7560ataacaaacc caagtagcga tttggccata cattaaaagt tgagaaccac
cctccctggc 7620aacagccaca actcgttacc attgttcatc acgatcatga aactcgctgt
cagctgaaat 7680ttcacctcag tggatctctc tttttattct tcatcgttcc actaaccttt
ttccatcagc 7740tggcagggaa cggaaagtgg aatcccattt agcgagcttc ctcttttctt
caagaaaaga 7800cgaagcttgt gtgtgggtgc gcgcgctagt atctttccac attaagaaat
ataccataaa 7860ggttacttag acatcactat ggctatatat atatatatat atatatgtaa
cttagcacca 7920tcgcgcgtgc atcactgcat gtgttaaccg aaaagtttgg cgaacacttc
accgacacgg 7980tcatttagat ctgtcgtctg cattgcacgt cccttagcct taaatcctag
gcgggagcat 8040tctcgtgtaa ttgtgcagcc tgcgtagcaa ctcaacatag cgtagtctac
ccagtttttc 8100aagggtttat cgttagaaga ttctcccttt tcttcctgct cacaaatctt
aaagtcatac 8160attgcacgac taaatgcaag catgcggatc ccccgggctg caggaattcg
atatcaagct 8220tatcgatacc gtcgactggc cattaatctt tcccatatta gatttcgcca
agccatgaaa 8280gttcaagaaa ggtctttaga cgaattaccc ttcatttctc aaactggcgt
caagggatcc 8340tggtatggtt ttatcgtttt atttctggtt cttatagcat cgttttggac
ttctctgttc 8400ccattaggcg gttcaggagc cagcgcagaa tcattctttg aaggatactt
atcctttcca 8460attttgattg tctgttacgt tggacataaa ctgtatacta gaaattggac
tttgatggtg 8520aaactagaag atatggatct tgataccggc agaaaacaag tagatttgac
tcttcgtagg 8580gaagaaatga ggattgagcg agaaacatta gcaaaaagat ccttcgtaac
aagattttta 8640catttctggt gttgaaggga aagatatgag ctatacagcg gaatttccat
atcactcaga 8700ttttgttatc taattttttc cttcccacgt ccgcgggaat ctgtgtatat
tactgcatct 8760agatatatgt tatcttatct tggcgcgtac atttaatttt caacgtattc
tataagaaat 8820tgcgggagtt tttttcatgt agatgatact gactgcacgc aaatataggc
atgatttata 8880ggcatgattt gatggctgta ccgataggaa cgctaagagt aacttcagaa
tcgttatcct 8940ggcggaaaaa attcatttgt aaactttaaa aaaaaaagcc aatatcccca
aaattattaa 9000gagcgcctcc attattaact aaaatttcac tcagcatcca caatgtatca
ggtatctact 9060acagatatta catgtggcga aaaagacaag aacaatgcaa tagcgcatca
agaaaaaaca 9120caaagctttc aatcaatgaa tcgaaaatgt cattaaaata gtatataaat
tgaaactaag 9180tcataaagct ataaaaagaa aatttattta aatgcaagat ttaaagtaaa
ttcacggccc 9240tgcaggcctc agctcttgtt ttgttctgca aataacttac ccatcttttt
caaaacttta 9300ggtgcaccct cctttgctag aataagttct atccaataca tcctatttgg
atctgcttga 9360gcttctttca tcacggatac gaattcattt tctgttctca caattttgga
cacaactctg 9420tcttccgttg ccccgaaact ttctggcagt tttgagtaat tccacatagg
aatgtcatta 9480taactctggt tcggaccatg aatttccctc tcaaccgtgt aaccatcgtt
attaatgata 9540aagcagattg ggtttatctt ctctctaatg gctagtccta attcttggac
agtcagttgc 9600aatgatccat ctccgataaa caataaatgt ctagattctt tatctgcaat
ttggctgcct 9660agagctgcgg ggaaagtgta tcctatagat ccccacaagg gttgaccaat
aaaatgtgat 9720ttcgatttca gaaatataga tgaggcaccg aagaaagaag tgccttgttc
agccacgatc 9780gtctcattac tttgggtcaa attttcgaca gcttgccaca gtctatcttg
tgacaacagc 9840gcgttagaag gtacaaaatc ttcttgcttt ttatctatgt acttgccttt
atattcaatt 9900tcggacaagt caagaagaga tgatatcagg gattcgaagt cgaaattttg
gattctttcg 9960ttgaaaattt taccttcatc gatattcaag gaaatcattt tattttcatt
aagatggtga 10020gtaaatgcac ccgtactaga atcggtaagc tttacaccca acataagaat
aaaatcagca 10080gattccacaa attccttcaa gtttggctct gacagagtac cgttgtaaat
ccccaaaaat 10140gagggcaatg cttcatcaac agatgattta ccaaagttca aagtagtaat
aggtaactta 10200gtctttgaaa taaactgagt aacagtcttc tctaggccga acgatataat
ttcatggcct 10260gtgattacaa ttggtttctt ggcattcttc agactttcct gtattttgtt
cagaatctct 10320tgatcagatg tattcgacgt ggaattttcc ttcttaagag gcaaggatgg
tttttcagcc 10380ttagcggcag ctacatctac aggtaaattg atgtaaaccg gctttctttc
ctttagtaag 10440gcagacaaca ctctatcaat ttcaacagtt gcattctcgg ctgtcaataa
agtcctggca 10500gcagtaaccg gttcgtgcat cttcataaag tgcttgaaat caccatcagc
caacgtatgg 10560tgaacaaact taccttcgtt ctgcactttc gaggtaggag atcccacgat
ctcaacaaca 10620ggcaggttct cagcatagga gcccgctaag ccattaactg cggataattc
gccaacacca 10680aatgtagtca agaatgccgc agcctttttc gttcttgcgt acccgtcggc
catataggag 10740gcatttaact cattagcatt tcccacccat ttcatatctt tgtgtgaaat
aatttgatct 10800agaaattgca aattgtagtc acctggtact ccgaatattt cttctatacc
taattcgtgt 10860aatctgtcca acagatagtc acctactgta tacattttgt ttactagttt
atgtgtgttt 10920attcgaaact aagttcttgg tgttttaaaa ctaaaaaaaa gactaactat
aaaagtagaa 10980tttaagaagt ttaagaaata gatttacaga attacaatca atacctaccg
tctttatata 11040cttattagtc aagtagggga ataatttcag ggaactggtt tcaacctttt
ttttcagctt 11100tttccaaatc agagagagca gaaggtaata gaaggtgtaa gaaaatgaga
tagatacatg 11160cgtgggtcaa ttgccttgtg tcatcattta ctccaggcag gttgcatcac
tccattgagg 11220ttgtgcccgt tttttgcctg tttgtgcccc tgttctctgt agttgcgcta
agagaatgga 11280cctatgaact gatggttggt gaagaaaaca atattttggt gctgggattc
tttttttttc 11340tggatgccag cttaaaaagc gggctccatt atatttagtg gatgccagga
ataaactgtt 11400cacccagaca cctacgatgt tatatattct gtgtaacccg ccccctattt
tgggcatgta 11460cgggttacag cagaattaaa aggctaattt tttgactaaa taaagttagg
aaaatcacta 11520ctattaatta tttacgtatt ctttgaaatg gcagtattga taatgataaa
ctcgaactga 11580aaaagcgtgt tttttattca aaatgattct aactccctta cgtaatcaag
gaatcttttt 11640gccttggcct ccgcgtcatt aaacttcttg ttgttgacgc taacattcaa
cgctagtata 11700tattcgtttt tttcaggtaa gttcttttca acgggtctta ctgatgaggc
agtcgcgtct 11760gaacctgtta agaggtcaaa tatgtcttct tgaccgtacg tgtcttgcat
gttattagct 11820ttgggaattt gcatcaagtc ataggaaaat ttaaatcttg gctctcttgg
gctcaaggtg 11880acaaggtcct cgaaaatagg gcgcgcccca ccgcggtgga gctccagctt
ttgttccctt 11940tagtgagggt taattgcgcg cttggcgtaa tcatggtcat agctgtttcc
tgtgtgaaat 12000tgttatccgc tcacaattcc acacaacata cgagccggaa gcataaagtg
taaagcctgg 12060ggtgcctaat gagtgagcta actcacatta attgcgttgc gctcactgcc
cgctttccag 12120tcgggaaacc tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg
gagaggcggt 12180ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc
ggtcgttcgg 12240ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac
agaatcaggg 12300gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa
ccgtaaaaag 12360gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca
caaaaatcga 12420cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc
gtttccccct 12480ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata
cctgtccgcc 12540tttctccctt cgggaagcgt ggcgctttct catagctcac gctgtaggta
tctcagttcg 12600gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca
gcccgaccgc 12660tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga
cttatcgcca 12720ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg
tgctacagag 12780ttcttgaagt ggtggcctaa ctacggctac actagaagaa cagtatttgg
tatctgcgct 12840ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg
caaacaaacc 12900accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag
aaaaaaagga 12960tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa
cgaaaactca 13020cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat
ccttttaaat 13080taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc
tgacagttac 13140caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc
atccatagtt 13200gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc
tggccccagt 13260gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc
aataaaccag 13320ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc
catccagtct 13380attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt
gcgcaacgtt 13440gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc
ttcattcagc 13500tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa
aaaagcggtt 13560agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt
atcactcatg 13620gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg
cttttctgtg 13680actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc
gagttgctct 13740tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa
agtgctcatc 13800attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt
gagatccagt 13860tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt
caccagcgtt 13920tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag
ggcgacacgg 13980aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta
tcagggttat 14040tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat
aggggttccg 14100cgcacatttc cccgaaaagt gccacctgaa cgaagcatct gtgcttcatt
ttgtagaaca 14160aaaatgcaac gcgagagcgc taatttttca aacaaagaat ctgagctgca
tttttacaga 14220acagaaatgc aacgcgaaag cgctatttta ccaacgaaga atctgtgctt
catttttgta 14280aaacaaaaat gcaacgcgag agcgctaatt tttcaaacaa agaatctgag
ctgcattttt 14340acagaacaga aatgcaacgc gagagcgcta ttttaccaac aaagaatcta
tacttctttt 14400ttgttctaca aaaatgcatc ccgagagcgc tatttttcta acaaagcatc
ttagattact 14460ttttttctcc tttgtgcgct ctataatgca gtctcttgat aactttttgc
actgtaggtc 14520cgttaaggtt agaagaaggc tactttggtg tctattttct cttccataaa
aaaagcctga 14580ctccacttcc cgcgtttact gattactagc gaagctgcgg gtgcattttt
tcaagataaa 14640ggcatccccg attatattct ataccgatgt ggattgcgca tactttgtga
acagaaagtg 14700atagcgttga tgattcttca ttggtcagaa aattatgaac ggtttcttct
attttgtctc 14760tatatactac gtataggaaa tgtttacatt ttcgtattgt tttcgattca
ctctatgaat 14820agttcttact acaatttttt tgtctaaaga gtaatactag agataaacat
aaaaaatgta 14880gaggtcgagt ttagatgcaa gttcaaggag cgaaaggtgg atgggtaggt
tatataggga 14940tatagcacag agatatatag caaagagata cttttgagca atgtttgtgg
aagcggtatt 15000cgcaatattt tagtagctcg ttacagtccg gtgcgttttt ggttttttga
aagtgcgtct 15060tcagagcgct tttggttttc aaaagcgctc tgaagttcct atactttcta
gagaatagga 15120acttcggaat aggaacttca aagcgtttcc gaaaacgagc gcttccgaaa
atgcaacgcg 15180agctgcgcac atacagctca ctgttcacgt cgcacctata tctgcgtgtt
gcctgtatat 15240atatatacat gagaagaacg gcatagtgcg tgtttatgct taaatgcgta
cttatatgcg 15300tctatttatg taggatgaaa ggtagtctag tacctcctgt gatattatcc
cattccatgc 15360ggggtatcgt atgcttcctt cagcactacc ctttagctgt tctatatgct
gccactcctc 15420aattggatta gtctcatcct tcaatgctat catttccttt gatattggat
catactaaga 15480aaccattatt atcatgacat taacctataa aaataggcgt atcacgaggc
cctttcgtc 15539
44 1644 DNA artificial sequence codon optimized sequence
44 atgtatacag taggtgacta tctgttggac agattacacg aattaggtat agaagaaata
60ttcggagtac caggtgacta caatttgcaa tttctagatc aaattatttc acacaaagat
120atgaaatggg tgggaaatgc taatgagtta aatgcctcct atatggccga cgggtacgca
180agaacgaaaa aggctgcggc attcttgact acatttggtg ttggcgaatt atccgcagtt
240aatggcttag cgggctccta tgctgagaac ctgcctgttg ttgagatcgt gggatctcct
300acctcgaaag tgcagaacga aggtaagttt gttcaccata cgttggctga tggtgatttc
360aagcacttta tgaagatgca cgaaccggtt actgctgcca ggactttatt gacagccgag
420aatgcaactg ttgaaattga tagagtgttg tctgccttac taaaggaaag aaagccggtt
480tacatcaatt tacctgtaga tgtagctgcc gctaaggctg aaaaaccatc cttgcctctt
540aagaaggaaa attccacgtc gaatacatct gatcaagaga ttctgaacaa aatacaggaa
600agtctgaaga atgccaagaa accaattgta atcacaggcc atgaaattat atcgttcggc
660ctagagaaga ctgttactca gtttatttca aagactaagt tacctattac tactttgaac
720tttggtaaat catctgttga tgaagcattg ccctcatttt tggggattta caacggtact
780ctgtcagagc caaacttgaa ggaatttgtg gaatctgctg attttattct tatgttgggt
840gtaaagctta ccgattctag tacgggtgca tttactcacc atcttaatga aaataaaatg
900atttccttga atatcgatga aggtaaaatt ttcaacgaaa gaatccaaaa tttcgacttc
960gaatccctga tatcatctct tcttgacttg tccgaaattg aatataaagg caagtacata
1020gataaaaagc aagaagattt tgtaccttct aacgcgctgt tgtcacaaga tagactgtgg
1080caagctgtcg aaaatttgac ccaaagtaat gagacgatcg tggctgaaca aggcacttct
1140ttcttcggtg cctcatctat atttctgaaa tcgaaatcac attttattgg tcaacccttg
1200tggggatcta taggatacac tttccccgca gctctaggca gccaaattgc agataaagaa
1260tctagacatt tattgtttat cggagatgga tcattgcaac tgactgtcca agaattagga
1320ctagccatta gagagaagat aaacccaatc tgctttatca ttaataacga tggttacacg
1380gttgagaggg aaattcatgg tccgaaccag agttataatg acattcctat gtggaattac
1440tcaaaactgc cagaaagttt cggggcaacg gaagacagag ttgtgtccaa aattgtgaga
1500acagaaaatg aattcgtatc cgtgatgaaa gaagctcaag cagatccaaa taggatgtat
1560tggatagaac ttattctagc aaaggagggt gcacctaaag ttttgaaaaa gatgggtaag
1620ttatttgcag aacaaaacaa gagc
1644
45 1125 DNA artificial sequence codon optimized sequence
45 atgtcaacag
ccggtaaagt tattaagtgt aaagcggcag ttttgtggga agagaaaaag 60ccgtttagca
tagaagaagt agaagtagcg ccaccaaaag cacacgaggt tagaatcaag 120atggttgcca
ccggaatctg tagatccgac gaccatgtgg tgagtggcac tctagttact 180cctttgccag
taatcgcggg acacgaggct gccggaatcg ttgaatccat aggtgaaggt 240gttaccactg
ttcgtcctgg tgataaagtg atcccactgt tcactcctca atgtggtaag 300tgtagagtct
gcaaacatcc tgagggtaat ttctgcctta aaaatgattt gtctatgcct 360agaggtacta
tgcaggatgg tacaagcaga tttacatgca gagggaaacc tatacaccat 420ttccttggta
cttctacatt ttcccaatac acagtggtgg acgagatatc tgtcgctaaa 480atcgatgcag
cttcaccact ggaaaaagtt tgcttgatag ggtgcggatt ttccaccggt 540tacggttccg
cagttaaagt tgcaaaggtt acacagggtt cgacttgtgc agtattcggt 600ttaggaggag
taggactaag cgttattatg gggtgtaaag ctgcaggcgc agcgaggatt 660ataggtgtag
acatcaataa ggacaaattt gcaaaagcta aggaggtcgg ggctactgaa 720tgtgttaacc
ctcaagatta taagaaacca atacaagaag tccttactga aatgtcaaac 780ggtggagttg
atttctcttt tgaagttata ggccgtcttg atactatggt aactgcgttg 840tcctgctgtc
aagaggcata tggagtcagt gtgatcgtag gtgttcctcc tgattcacaa 900aatttgtcga
tgaatcctat gctgttgcta agcggtcgta catggaaggg agctatattt 960ggcggtttta
agagcaagga tagtgttcca aaacttgttg ccgactttat ggcgaagaag 1020tttgctcttg
atcctttaat tacacatgta ttgccattcg agaaaatcaa tgaagggttt 1080gatttgttaa
gaagtggtga atctattcgt acaattttaa ctttt
1125
46 375 PRT Equus caballus
46 Met Ser Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Leu Trp 16
Glu Glu Lys Lys Pro Phe Ser Ile Glu Glu Val Glu Val Ala Pro Pro 32
Lys Ala His Glu Val Arg Ile Lys Met Val Ala Thr Gly Ile Cys Arg 48
Ser Asp Asp His Val Val Ser Gly Thr Leu Val Thr Pro Leu Pro Val 64
Ile Ala Gly His Glu Ala Ala Gly Ile Val Glu Ser Ile Gly Glu Gly 80
Val Thr Thr Val Arg Pro Gly Asp Lys Val Ile Pro Leu Phe Thr Pro 96
Gln Cys Gly Lys Cys Arg Val Cys Lys His Pro Glu Gly Asn Phe Cys 112
Leu Lys Asn Asp Leu Ser Met Pro Arg Gly Thr Met Gln Asp Gly Thr 128
Ser Arg Phe Thr Cys Arg Gly Lys Pro Ile His His Phe Leu Gly Thr 144
Ser Thr Phe Ser Gln Tyr Thr Val Val Asp Glu Ile Ser Val Ala Lys 160
Ile Asp Ala Ala Ser Pro Leu Glu Lys Val Cys Leu Ile Gly Cys Gly 176
Phe Ser Thr Gly Tyr Gly Ser Ala Val Lys Val Ala Lys Val Thr Gln 192
Gly Ser Thr Cys Ala Val Phe Gly Leu Gly Gly Val Gly Leu Ser Val 208
Ile Met Gly Cys Lys Ala Ala Gly Ala Ala Arg Ile Ile Gly Val Asp 224
Ile Asn Lys Asp Lys Phe Ala Lys Ala Lys Glu Val Gly Ala Thr Glu 240
Cys Val Asn Pro Gln Asp Tyr Lys Lys Pro Ile Gln Glu Val Leu Thr 256
Glu Met Ser Asn Gly Gly Val Asp Phe Ser Phe Glu Val Ile Gly Arg 272
Leu Asp Thr Met Val Thr Ala Leu Ser Cys Cys Gln Glu Ala Tyr Gly 288
Val Ser Val Ile Val Gly Val Pro Pro Asp Ser Gln Asn Leu Ser Met 304
Asn Pro Met Leu Leu Leu Ser Gly Arg Thr Trp Lys Gly Ala Ile Phe 320
Gly Gly Phe Lys Ser Lys Asp Ser Val Pro Lys Leu Val Ala Asp Phe 336
Met Ala Lys Lys Phe Ala Leu Asp Pro Leu Ile Thr His Val Leu Pro 352
Phe Glu Lys Ile Asn Glu Gly Phe Asp Leu Leu Arg Ser Gly Glu Ser 368
Ile Arg Thr Ile Leu Thr Phe
375
47 548 PRT Lactococcus lactis
47 Met Tyr Thr Val Gly Asp Tyr Leu Leu Asp Arg Leu His Glu Leu Gly 16
Ile Glu Glu Ile Phe Gly Val Pro Gly Asp Tyr Asn Leu Gln Phe Leu 32
Asp Gln Ile Ile Ser His Lys Asp Met Lys Trp Val Gly Asn Ala Asn 48
Glu Leu Asn Ala Ser Tyr Met Ala Asp Gly Tyr Ala Arg Thr Lys Lys 64
Ala Ala Ala Phe Leu Thr Thr Phe Gly Val Gly Glu Leu Ser Ala Val 80
Asn Gly Leu Ala Gly Ser Tyr Ala Glu Asn Leu Pro Val Val Glu Ile 96
Val Gly Ser Pro Thr Ser Lys Val Gln Asn Glu Gly Lys Phe Val His 112
His Thr Leu Ala Asp Gly Asp Phe Lys His Phe Met Lys Met His Glu 128
Pro Val Thr Ala Ala Arg Thr Leu Leu Thr Ala Glu Asn Ala Thr Val 144
Glu Ile Asp Arg Val Leu Ser Ala Leu Leu Lys Glu Arg Lys Pro Val 160
Tyr Ile Asn Leu Pro Val Asp Val Ala Ala Ala Lys Ala Glu Lys Pro 176
Ser Leu Pro Leu Lys Lys Glu Asn Ser Thr Ser Asn Thr Ser Asp Gln 192
Glu Ile Leu Asn Lys Ile Gln Glu Ser Leu Lys Asn Ala Lys Lys Pro 208
Ile Val Ile Thr Gly His Glu Ile Ile Ser Phe Gly Leu Glu Lys Thr 224
Val Thr Gln Phe Ile Ser Lys Thr Lys Leu Pro Ile Thr Thr Leu Asn 240
Phe Gly Lys Ser Ser Val Asp Glu Ala Leu Pro Ser Phe Leu Gly Ile 256
Tyr Asn Gly Thr Leu Ser Glu Pro Asn Leu Lys Glu Phe Val Glu Ser 272
Ala Asp Phe Ile Leu Met Leu Gly Val Lys Leu Thr Asp Ser Ser Thr 288
Gly Ala Phe Thr His His Leu Asn Glu Asn Lys Met Ile Ser Leu Asn 304
Ile Asp Glu Gly Lys Ile Phe Asn Glu Arg Ile Gln Asn Phe Asp Phe 320
Glu Ser Leu Ile Ser Ser Leu Leu Asp Leu Ser Glu Ile Glu Tyr Lys 336
Gly Lys Tyr Ile Asp Lys Lys Gln Glu Asp Phe Val Pro Ser Asn Ala 352
Leu Leu Ser Gln Asp Arg Leu Trp Gln Ala Val Glu Asn Leu Thr Gln 368
Ser Asn Glu Thr Ile Val Ala Glu Gln Gly Thr Ser Phe Phe Gly Ala 384
Ser Ser Ile Phe Leu Lys Ser Lys Ser His Phe Ile Gly Gln Pro Leu 400
Trp Gly Ser Ile Gly Tyr Thr Phe Pro Ala Ala Leu Gly Ser Gln Ile 416
Ala Asp Lys Glu Ser Arg His Leu Leu Phe Ile Gly Asp Gly Ser Leu 432
Gln Leu Thr Val Gln Glu Leu Gly Leu Ala Ile Arg Glu Lys Ile Asn 448
Pro Ile Cys Phe Ile Ile Asn Asn Asp Gly Tyr Thr Val Glu Arg Glu 464
Ile His Gly Pro Asn Gln Ser Tyr Asn Asp Ile Pro Met Trp Asn Tyr 480
Ser Lys Leu Pro Glu Ser Phe Gly Ala Thr Glu Asp Arg Val Val Ser 496
Lys Ile Val Arg Thr Glu Asn Glu Phe Val Ser Val Met Lys Glu Ala 512
Gln Ala Asp Pro Asn Arg Met Tyr Trp Ile Glu Leu Ile Leu Ala Lys 528
Glu Gly Ala Pro Lys Val Leu Lys Lys Met Gly Lys Leu Phe Ala Glu 544
Gln Asn Lys Ser 548
48 9089 DNA artificial sequence Synthetic construct
48 tcgcgcgttt
cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct
gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg
tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accataccac
agcttttcaa ttcaattcat catttttttt ttattctttt ttttgatttc 240ggtttctttg
aaattttttt gattcggtaa tctccgaaca gaaggaagaa cgaaggaagg 300agcacagact
tagattggta tatatacgca tatgtagtgt tgaagaaaca tgaaattgcc 360cagtattctt
aacccaactg cacagaacaa aaacctgcag gaaacgaaga taaatcatgt 420cgaaagctac
atataaggaa cgtgctgcta ctcatcctag tcctgttgct gccaagctat 480ttaatatcat
gcacgaaaag caaacaaact tgtgtgcttc attggatgtt cgtaccacca 540aggaattact
ggagttagtt gaagcattag gtcccaaaat ttgtttacta aaaacacatg 600tggatatctt
gactgatttt tccatggagg gcacagttaa gccgctaaag gcattatccg 660ccaagtacaa
ttttttactc ttcgaagaca gaaaatttgc tgacattggt aatacagtca 720aattgcagta
ctctgcgggt gtatacagaa tagcagaatg ggcagacatt acgaatgcac 780acggtgtggt
gggcccaggt attgttagcg gtttgaagca ggcggcagaa gaagtaacaa 840aggaacctag
aggccttttg atgttagcag aattgtcatg caagggctcc ctatctactg 900gagaatatac
taagggtact gttgacattg cgaagagcga caaagatttt gttatcggct 960ttattgctca
aagagacatg ggtggaagag atgaaggtta cgattggttg attatgacac 1020ccggtgtggg
tttagatgac aagggagacg cattgggtca acagtataga accgtggatg 1080atgtggtctc
tacaggatct gacattatta ttgttggaag aggactattt gcaaagggaa 1140gggatgctaa
ggtagagggt gaacgttaca gaaaagcagg ctgggaagca tatttgagaa 1200gatgcggcca
gcaaaactaa aaaactgtat tataagtaaa tgcatgtata ctaaactcac 1260aaattagagc
ttcaatttaa ttatatcagt tattacccta tgcggtgtga aataccgcac 1320agatgcgtaa
ggagaaaata ccgcatcagg aaattgtaaa cgttaatatt ttgttaaaat 1380tcgcgttaaa
tttttgttaa atcagctcat tttttaacca ataggccgaa atcggcaaaa 1440tcccttataa
atcaaaagaa tagaccgaga tagggttgag tgttgttcca gtttggaaca 1500agagtccact
attaaagaac gtggactcca acgtcaaagg gcgaaaaacc gtctatcagg 1560gcgatggccc
actacgtgaa ccatcaccct aatcaagttt tttggggtcg aggtgccgta 1620aagcactaaa
tcggaaccct aaagggagcc cccgatttag agcttgacgg ggaaagccgg 1680cgaacgtggc
gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg gcgctggcaa 1740gtgtagcggt
cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg ccgctacagg 1800gcgcgtcgcg
ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg 1860cctcttcgct
attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg 1920taacgccagg
gttttcccag tcacgacgtt gtaaaacgac ggccagtgag cgcgcgtaat 1980acgactcact
atagggcgaa ttgggtaccg ggccccccct cgaggtcgac tggccattaa 2040tctttcccat
attagatttc gccaagccat gaaagttcaa gaaaggtctt tagacgaatt 2100acccttcatt
tctcaaactg gcgtcaaggg atcctggtat ggttttatcg ttttatttct 2160ggttcttata
gcatcgtttt ggacttctct gttcccatta ggcggttcag gagccagcgc 2220agaatcattc
tttgaaggat acttatcctt tccaattttg attgtctgtt acgttggaca 2280taaactgtat
actagaaatt ggactttgat ggtgaaacta gaagatatgg atcttgatac 2340cggcagaaaa
caagtagatt tgactcttcg tagggaagaa atgaggattg agcgagaaac 2400attagcaaaa
agatccttcg taacaagatt tttacatttc tggtgttgaa gggaaagata 2460tgagctatac
agcggaattt ccatatcact cagattttgt tatctaattt tttccttccc 2520acgtccgcgg
gaatctgtgt atattactgc atctagatat atgttatctt atcttggcgc 2580gtacatttaa
ttttcaacgt attctataag aaattgcggg agtttttttc atgtagatga 2640tactgactgc
acgcaaatat aggcatgatt tataggcatg atttgatggc tgtaccgata 2700ggaacgctaa
gagtaacttc agaatcgtta tcctggcgga aaaaattcat ttgtaaactt 2760taaaaaaaaa
agccaatatc cccaaaatta ttaagagcgc ctccattatt aactaaaatt 2820tcactcagca
tccacaatgt atcaggtatc tactacagat attacatgtg gcgaaaaaga 2880caagaacaat
gcaatagcgc atcaagaaaa aacacaaagc tttcaatcaa tgaatcgaaa 2940atgtcattaa
aatagtatat aaattgaaac taagtcataa agctataaaa agaaaattta 3000tttaaatgca
agatttaaag taaattcacg gccctgcagg ccctaacctg ctaggacaca 3060acgtctttgc
ctggtaaagt ttctagctga cgtgattcct tcacctgtgg atccggcaat 3120tgtaaaggtt
gtgaaaccct cagcttcata accgacacct gcaaatgact ttgcattctt 3180aacaaagata
gttgtatcaa tttcacgttc gaatctatta aggttatcga tgttcttaga 3240ataaatgtag
gcggaatgtt ttctattctg ctcagctatc ttggcgtatt taatggcttc 3300atcaatgtcc
ttcactctaa ctataggcaa aattggcatc atcaactccg tcataacgaa 3360cggatggttt
gcgttgactt cacaaataat acactttaca ttacttggtg actctacatc 3420tatttcatcc
aaaaacagtt tagcgtcctt accaacccac ttcttattaa tgaaatattc 3480ttgagtttca
ttgttctttt gaagaacaag gtctatcagc ttggatactt ggtcttcatt 3540gataatgacg
gcgttgtttt tcaacatgtt agagatcaga tcatctgcaa cgttttcaaa 3600cacgaacact
tctttttccg cgatacaagg aagattgttg tcaaacgaac aaccttcaat 3660aatgcttctg
ccggccttct cgatatctgc tgtatcgtct acaataaccg gaggattacc 3720cgcgccagct
ccgatggcct ttttaccaga attaagaagg gtttttacca tacccgggcc 3780acccgtaccg
cacaacaatt ttatggatgg atgtttgata atagcgtcta aactttccat 3840agttgggttc
tttatagtag tgacaaggtt ttcaggtcca ccacagctaa ttatggcttt 3900gtttatcatt
tctactgcga aagcgacaca ctttttggcg catgggtgac cattaaatac 3960aactgcattc
cccgcagcta tcatacctat agaattgcag ataacggttt ctgttggatt 4020cgtgcttgga
gttatagcgc cgataactcc gtatggactc atttcaacca ctgttagtcc 4080attatcgccg
gaccatgctg ttgttgtcag atcttcagtg cctggggtat acttggccac 4140taattcatgt
ttcaagattt tatcctcata ccttcccatg tgggtttcct ccaggatcat 4200tgtggctaag
acctctttat tctgtaatgc ggcttttctt atttcggtga ttattttctc 4260tctttgttcc
tttgtgtagt gtagggaaag aatcttttgt gcatgtactg cagaagaaat 4320ggcattctca
acattttcaa atactccaaa acatgaagag ttatctttgt aattctttaa 4380gttgatgttt
tcaccattag tcttcacttt caagtctttg gtggttggga ttaaggtatc 4440tttatccatg
gtgtttgttt atgtgtgttt attcgaaact aagttcttgg tgttttaaaa 4500ctaaaaaaaa
gactaactat aaaagtagaa tttaagaagt ttaagaaata gatttacaga 4560attacaatca
atacctaccg tctttatata cttattagtc aagtagggga ataatttcag 4620ggaactggtt
tcaacctttt ttttcagctt tttccaaatc agagagagca gaaggtaata 4680gaaggtgtaa
gaaaatgaga tagatacatg cgtgggtcaa ttgccttgtg tcatcattta 4740ctccaggcag
gttgcatcac tccattgagg ttgtgcccgt tttttgcctg tttgtgcccc 4800tgttctctgt
agttgcgcta agagaatgga cctatgaact gatggttggt gaagaaaaca 4860atattttggt
gctgggattc tttttttttc tggatgccag cttaaaaagc gggctccatt 4920atatttagtg
gatgccagga ataaactgtt cacccagaca cctacgatgt tatatattct 4980gtgtaacccg
ccccctattt tgggcatgta cgggttacag cagaattaaa aggctaattt 5040tttgactaaa
taaagttagg aaaatcacta ctattaatta tttacgtatt ctttgaaatg 5100gcagtattga
taatgataaa ctcgaactga aaaagcgtgt tttttattca aaatgattct 5160aactccctta
cgtaatcaag gaatcttttt gccttggcct ccgcgtcatt aaacttcttg 5220ttgttgacgc
taacattcaa cgctagtata tattcgtttt tttcaggtaa gttcttttca 5280acgggtctta
ctgatgaggc agtcgcgtct gaacctgtta agaggtcaaa tatgtcttct 5340tgaccgtacg
tgtcttgcat gttattagct ttgggaattt gcatcaagtc ataggaaaat 5400ttaaatcttg
gctctcttgg gctcaaggtg acaaggtcct cgaaaatagg gcgcgcccca 5460ccgcggtgga
gctccagctt ttgttccctt tagtgagggt taattgcgcg cttggcgtaa 5520tcatggtcat
agctgtttcc tgtgtgaaat tgttatccgc tcacaattcc acacaacata 5580ggagccggaa
gcataaagtg taaagcctgg ggtgcctaat gagtgaggta actcacatta 5640attgcgttgc
gctcactgcc cgctttccag tcgggaaacc tgtcgtgcca gctgcattaa 5700tgaatcggcc
aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg 5760ctcactgact
cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag 5820gcggtaatac
ggttatccac agaatcaggg gataacgcag gaaagaacat gtgagcaaaa 5880ggccagcaaa
aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc 5940cgcccccctg
acgagcatca caaaaatcga cgctcaagtc agaggtggcg aaacccgaca 6000ggactataaa
gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg 6060accctgccgc
ttaccggata cctgtccgcc tttctccctt cgggaagcgt ggcgctttct 6120catagctcac
gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt 6180gtgcacgaac
cccccgttca gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag 6240tccaacccgg
taagacacga cttatcgcca ctggcagcag ccactggtaa caggattagc 6300agagcgaggt
atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa ctacggctac 6360actagaagga
cagtatttgg tatctgcgct ctgctgaagc cagttacctt cggaaaaaga 6420gttggtagct
cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc 6480aagcagcaga
ttacgcgcag aaaaaaagga tctcaagaag atcctttgat cttttctacg 6540gggtctgacg
ctcagtggaa cgaaaactca cgttaaggga ttttggtcat gagattatca 6600aaaaggatct
tcacctagat ccttttaaat taaaaatgaa gttttaaatc aatctaaagt 6660atatatgagt
aaacttggtc tgacagttac caatgcttaa tcagtgaggc acctatctca 6720gcgatctgtc
tatttcgttc atccatagtt gcctgactcc ccgtcgtgta gataactacg 6780atacgggagg
gcttaccatc tggccccagt gctgcaatga taccgcgaga cccacgctca 6840ccggctccag
atttatcagc aataaaccag ccagccggaa gggccgagcg cagaagtggt 6900cctgcaactt
tatccgcctc catccagtct attaattgtt gccgggaagc tagagtaagt 6960agttcgccag
ttaatagttt gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca 7020cgctcgtcgt
ttggtatggc ttcattcagc tccggttccc aacgatcaag gcgagttaca 7080tgatccccca
tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat cgttgtcaga 7140agtaagttgg
ccgcagtgtt atcactcatg gttatggcag cactgcataa ttctcttact 7200gtcatgccat
ccgtaagatg cttttctgtg actggtgagt actcaaccaa gtcattctga 7260gaatagtgta
tgcggcgacc gagttgctct tgcccggcgt caatacggga taataccgcg 7320ccacatagca
gaactttaaa agtgctcatc attggaaaac gttcttcggg gcgaaaactc 7380tcaaggatct
taccgctgtt gagatccagt tcgatgtaac ccactcgtgc acccaactga 7440tcttcagcat
cttttacttt caccagcgtt tctgggtgag caaaaacagg aaggcaaaat 7500gccgcaaaaa
agggaataag ggcgacacgg aaatgttgaa tactcatact cttccttttt 7560caatattatt
gaagcattta tcagggttat tgtctcatga gcggatacat atttgaatgt 7620atttagaaaa
ataaacaaat aggggttccg cgcacatttc cccgaaaagt gccacctgaa 7680cgaagcatct
gtgcttcatt ttgtagaaca aaaatgcaac gcgagagcgc taatttttca 7740aacaaagaat
ctgagctgca tttttacaga acagaaatgc aacgcgaaag cgctatttta 7800ccaacgaaga
atctgtgctt catttttgta aaacaaaaat gcaacgcgag agcgctaatt 7860tttcaaacaa
agaatctgag ctgcattttt acagaacaga aatgcaacgc gagagcgcta 7920ttttaccaac
aaagaatcta tacttctttt ttgttctaca aaaatgcatc ccgagagcgc 7980tatttttcta
acaaagcatc ttagattact ttttttctcc tttgtgcgct ctataatgca 8040gtctcttgat
aactttttgc actgtaggtc cgttaaggtt agaagaaggc tactttggtg 8100tctattttct
cttccataaa aaaagcctga ctccacttcc cgcgtttact gattactagc 8160gaagctgcgg
gtgcattttt tcaagataaa ggcatccccg attatattct ataccgatgt 8220ggattgcgca
tactttgtga acagaaagtg atagcgttga tgattcttca ttggtcagaa 8280aattatgaac
ggtttcttct attttgtctc tatatactac gtataggaaa tgtttacatt 8340ttcgtattgt
tttcgattca ctctatgaat agttcttact acaatttttt tgtctaaaga 8400gtaatactag
agataaacat aaaaaatgta gaggtcgagt ttagatgcaa gttcaaggag 8460cgaaaggtgg
atgggtaggt tatataggga tatagcacag agatatatag caaagagata 8520cttttgagca
atgtttgtgg aagcggtatt cgcaatattt tagtagctcg ttacagtccg 8580gtgcgttttt
ggttttttga aagtgcgtct tcagagcgct tttggttttc aaaagcgctc 8640tgaagttcct
atactttcta gagaatagga acttcggaat aggaacttca aagcgtttcc 8700gaaaacgagc
gcttccgaaa atgcaacgcg agctgcgcac atacagctca ctgttcacgt 8760cgcacctata
tctgcgtgtt gcctgtatat atatatacat gagaagaacg gcatagtgcg 8820tgtttatgct
taaatgcgta cttatatgcg tctatttatg taggatgaaa ggtagtctag 8880tacctcctgt
gatattatcc cattccatgc ggggtatcgt atgcttcctt cagcactacc 8940ctttagctgt
tctatatgct gccactcctc aattggatta gtctcatcct tcaatgctat 9000catttccttt
gatattggat catactaaga aaccattatt atcatgacat taacctataa 9060aaataggcgt
atcacgaggc cctttcgtc 9089

SEQ ID NO: 49; Length: 1023; Type: DNA; Organism: Saccharomyces cerevisiae
caccgcggtg gggcgcgccc tattttcgag
gaccttgtca ccttgagccc aagagagcca 60agatttaaat tttcctatga cttgatgcaa
attcccaaag ctaataacat gcaagacacg 120tacggtcaag aagacatatt tgacctctta
acaggttcag acgcgactgc ctcatcagta 180agacccgttg aaaagaactt acctgaaaaa
aacgaatata tactagcgtt gaatgttagc 240gtcaacaaca agaagtttaa tgacgcggag
gccaaggcaa aaagattcct tgattacgta 300agggagttag aatcattttg aataaaaaac
acgctttttc agttcgagtt tatcattatc 360aatactgcca tttcaaagaa tacgtaaata
attaatagta gtgattttcc taactttatt 420tagtcaaaaa attagccttt taattctgct
gtaacccgta catgcccaaa atagggggcg 480ggttacacag aatatataac atcgtaggtg
tctgggtgaa cagtttattc ctggcatcca 540ctaaatataa tggagcccgc tttttaagct
ggcatccaga aaaaaaaaga atcccagcac 600caaaatattg ttttcttcac caaccatcag
ttcataggtc cattctctta gcgcaactac 660agagaacagg ggcacaaaca ggcaaaaaac
gggcacaacc tcaatggagt gatgcaacct 720gcctggagta aatgatgaca caaggcaatt
gacccacgca tgtatctatc tcattttctt 780acaccttcta ttaccttctg ctctctctga
tttggaaaaa gctgaaaaaa aaggttgaaa 840ccagttccct gaaattattc ccctacttga
ctaataagta tataaagacg gtaggtattg 900attgtaattc tgtaaatcta tttcttaaac
ttcttaaatt ctacttttat agttagtctt 960ttttttagtt ttaaaacacc aagaacttag
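tttcgaataa acacacataa actagtaaac 1020aaa 1023

SEQ ID NO: 50; Length: 21; Type: DNA; Organism: artificial sequence; Other information: Primer
caaaagctga gctccaccgc g 21

SEQ ID NO: 51; Length: 44; Type: DNA; Organism: artificial sequence; Other information: Primer
gtttactagt ttatgtgtgt ttattcgaaa ctaagttctt ggtg 44

SEQ ID NO: 52; Length: 8994; Type: DNA; Organism: artificial sequence; Other information: Synthetic construct
ctagttctag agcggccgcc accgcggtgg agctccagct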
tttgttccct ttagtgaggg 60ttaattgcgc gcttggcgta atcatggtca tagctgtttc
ctgtgtgaaa ttgttatccg 120ctcacaattc cacacaacat aggagccgga agcataaagt
gtaaagcctg gggtgcctaa 180tgagtgaggt aactcacatt aattgcgttg cgctcactgc
ccgctttcca gtcgggaaac 240ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg
ggagaggcgg tttgcgtatt 300gggcgctctt ccgcttcctc gctcactgac tcgctgcgct
cggtcgttcg gctgcggcga 360gcggtatcag ctcactcaaa ggcggtaata cggttatcca
cagaatcagg ggataacgca 420ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga
accgtaaaaa ggccgcgttg 480ctggcgtttt tccataggct ccgcccccct gacgagcatc
acaaaaatcg acgctcaagt 540cagaggtggc gaaacccgac aggactataa agataccagg
cgtttccccc tggaagctcc 600ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat
acctgtccgc ctttctccct 660tcgggaagcg tggcgctttc tcatagctca cgctgtaggt
atctcagttc ggtgtaggtc 720gttcgctcca agctgggctg tgtgcacgaa ccccccgttc
agcccgaccg ctgcgcctta 780tccggtaact atcgtcttga gtccaacccg gtaagacacg
acttatcgcc actggcagca 840gccactggta acaggattag cagagcgagg tatgtaggcg
gtgctacaga gttcttgaag 900tggtggccta actacggcta cactagaagg acagtatttg
gtatctgcgc tctgctgaag 960ccagttacct tcggaaaaag agttggtagc tcttgatccg
gcaaacaaac caccgctggt 1020agcggtggtt tttttgtttg caagcagcag attacgcgca
gaaaaaaagg atctcaagaa 1080gatcctttga tcttttctac ggggtctgac gctcagtgga
acgaaaactc acgttaaggg 1140attttggtca tgagattatc aaaaaggatc ttcacctaga
tccttttaaa ttaaaaatga 1200agttttaaat caatctaaag tatatatgag taaacttggt
ctgacagtta ccaatgctta 1260atcagtgagg cacctatctc agcgatctgt ctatttcgtt
catccatagt tgcctgactc 1320cccgtcgtgt agataactac gatacgggag ggcttaccat
ctggccccag tgctgcaatg 1380ataccgcgag acccacgctc accggctcca gatttatcag
caataaacca gccagccgga 1440agggccgagc gcagaagtgg tcctgcaact ttatccgcct
ccatccagtc tattaattgt 1500tgccgggaag ctagagtaag tagttcgcca gttaatagtt
tgcgcaacgt tgttgccatt 1560gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg
cttcattcag ctccggttcc 1620caacgatcaa ggcgagttac atgatccccc atgttgtgca
aaaaagcggt tagctccttc 1680ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt
tatcactcat ggttatggca 1740gcactgcata attctcttac tgtcatgcca tccgtaagat
gcttttctgt gactggtgag 1800tactcaacca agtcattctg agaatagtgt atgcggcgac
cgagttgctc ttgcccggcg 1860tcaatacggg ataataccgc gccacatagc agaactttaa
aagtgctcat cattggaaaa 1920cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt
tgagatccag ttcgatgtaa 1980cccactcgtg cacccaactg atcttcagca tcttttactt
tcaccagcgt ttctgggtga 2040gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa
gggcgacacg gaaatgttga 2100atactcatac tcttcctttt tcaatattat tgaagcattt
atcagggtta ttgtctcatg 2160agcggataca tatttgaatg tatttagaaa aataaacaaa
taggggttcc gcgcacattt 2220ccccgaaaag tgccacctga acgaagcatc tgtgcttcat
tttgtagaac aaaaatgcaa 2280cgcgagagcg ctaatttttc aaacaaagaa tctgagctgc
atttttacag aacagaaatg 2340caacgcgaaa gcgctatttt accaacgaag aatctgtgct
tcatttttgt aaaacaaaaa 2400tgcaacgcga gagcgctaat ttttcaaaca aagaatctga
gctgcatttt tacagaacag 2460aaatgcaacg cgagagcgct attttaccaa caaagaatct
atacttcttt tttgttctac 2520aaaaatgcat cccgagagcg ctatttttct aacaaagcat
cttagattac tttttttctc 2580ctttgtgcgc tctataatgc agtctcttga taactttttg
cactgtaggt ccgttaaggt 2640tagaagaagg ctactttggt gtctattttc tcttccataa
aaaaagcctg actccacttc 2700ccgcgtttac tgattactag cgaagctgcg ggtgcatttt
ttcaagataa aggcatcccc 2760gattatattc tataccgatg tggattgcgc atactttgtg
aacagaaagt gatagcgttg 2820atgattcttc attggtcaga aaattatgaa cggtttcttc
tattttgtct ctatatacta 2880cgtataggaa atgtttacat tttcgtattg ttttcgattc
actctatgaa tagttcttac 2940tacaattttt ttgtctaaag agtaatacta gagataaaca
taaaaaatgt agaggtcgag 3000tttagatgca agttcaagga gcgaaaggtg gatgggtagg
ttatataggg atatagcaca 3060gagatatata gcaaagagat acttttgagc aatgtttgtg
gaagcggtat tcgcaatatt 3120ttagtagctc gttacagtcc ggtgcgtttt tggttttttg
aaagtgcgtc ttcagagcgc 3180ttttggtttt caaaagcgct ctgaagttcc tatactttct
agagaatagg aacttcggaa 3240taggaacttc aaagcgtttc cgaaaacgag cgcttccgaa
aatgcaacgc gagctgcgca 3300catacagctc actgttcacg tcgcacctat atctgcgtgt
tgcctgtata tatatataca 3360tgagaagaac ggcatagtgc gtgtttatgc ttaaatgcgt
acttatatgc gtctatttat 3420gtaggatgaa aggtagtcta gtacctcctg tgatattatc
ccattccatg cggggtatcg 3480tatgcttcct tcagcactac cctttagctg ttctatatgc
tgccactcct caattggatt 3540agtctcatcc ttcaatgcta tcatttcctt tgatattgga
tcatactaag aaaccattat 3600tatcatgaca ttaacctata aaaataggcg tatcacgagg
ccctttcgtc tcgcgcgttt 3660cggtgatgac ggtgaaaacc tctgacacat gcagctcccg
gagacggtca cagcttgtct 3720gtaagcggat gccgggagca gacaagcccg tcagggcgcg
tcagcgggtg ttggcgggtg 3780tcggggctgg cttaactatg cggcatcaga gcagattgta
ctgagagtgc accatatcga 3840ctacgtcgta aggccgtttc tgacagagta aaattcttga
gggaactttc accattatgg 3900gaaatgcttc aagaaggtat tgacttaaac tccatcaaat
ggtcaggtca ttgagtgttt 3960tttatttgtt gtattttttt ttttttagag aaaatcctcc
aatatcaaat taggaatcgt 4020agtttcatga ttttctgtta cacctaactt tttgtgtggt
gccctcctcc ttgtcaatat 4080taatgttaaa gtgcaattct ttttccttat cacgttgagc
cattagtatc aatttgctta 4140cctgtattcc tttactatcc tcctttttct ccttcttgat
aaatgtatgt agattgcgta 4200tatagtttcg tctaccctat gaacatattc cattttgtaa
tttcgtgtcg tttctattat 4260gaatttcatt tataaagttt atgtacaaat atcataaaaa
aagagaatct ttttaagcaa 4320ggattttctt aacttcttcg gcgacagcat caccgacttc
ggtggtactg ttggaaccac 4380ctaaatcacc agttctgata cctgcatcca aaaccttttt
aactgcatct tcaatggcct 4440taccttcttc aggcaagttc aatgacaatt tcaacatcat
tgcagcagac aagatagtgg 4500cgatagggtc aaccttattc tttggcaaat ctggagcaga
accgtggcat ggttcgtaca 4560aaccaaatgc ggtgttcttg tctggcaaag aggccaagga
cgcagatggc aacaaaccca 4620aggaacctgg gataacggag gcttcatcgg agatgatatc
accaaacatg ttgctggtga 4680ttataatacc atttaggtgg gttgggttct taactaggat
catggcggca gaatcaatca 4740attgatgttg aaccttcaat gtagggaatt cgttcttgat
ggtttcctcc acagtttttc 4800tccataatct tgaagaggcc aaaagattag ctttatccaa
ggaccaaata ggcaatggtg 4860gctcatgttg tagggccatg aaagcggcca ttcttgtgat
tctttgcact tctggaacgg 4920tgtattgttc actatcccaa gcgacaccat caccatcgtc
ttcctttctc ttaccaaagt 4980aaatacctcc cactaattct ctgacaacaa cgaagtcagt
acctttagca aattgtggct 5040tgattggaga taagtctaaa agagagtcgg atgcaaagtt
acatggtctt aagttggcgt 5100acaattgaag ttctttacgg atttttagta aaccttgttc
aggtctaaca ctaccggtac 5160cccatttagg accagccaca gcacctaaca aaacggcatc
aaccttcttg gaggcttcca 5220gcgcctcatc tggaagtggg acacctgtag catcgatagc
agcaccacca attaaatgat 5280tttcgaaatc gaacttgaca ttggaacgaa catcagaaat
agctttaaga accttaatgg 5340cttcggctgt gatttcttga ccaacgtggt cacctggcaa
aacgacgatc ttcttagggg 5400cagacatagg ggcagacatt agaatggtat atccttgaaa
tatatatata tattgctgaa 5460atgtaaaagg taagaaaagt tagaaagtaa gacgattgct
aaccacctat tggaaaaaac 5520aataggtcct taaataatat tgtcaacttc aagtattgtg
atgcaagcat ttagtcatga 5580acgcttctct attctatatg aaaagccggt tccggcctct
cacctttcct ttttctccca 5640atttttcagt tgaaaaaggt atatgcgtca ggcgacctct
gaaattaaca aaaaatttcc 5700agtcatcgaa tttgattctg tgcgatagcg cccctgtgtg
ttctcgttat gttgaggaaa 5760aaaataatgg ttgctaagag attcgaactc ttgcatctta
cgatacctga gtattcccac 5820agttaactgc ggtcaagata tttcttgaat caggcgcctt
agaccgctcg gccaaacaac 5880caattacttg ttgagaaata gagtataatt atcctataaa
tataacgttt ttgaacacac 5940atgaacaagg aagtacagga caattgattt tgaagagaat
gtggattttg atgtaattgt 6000tgggattcca tttttaataa ggcaataata ttaggtatgt
ggatatacta gaagttctcc 6060tcgaccgtcg atatgcggtg tgaaataccg cacagatgcg
taaggagaaa ataccgcatc 6120aggaaattgt aaacgttaat attttgttaa aattcgcgtt
aaatttttgt taaatcagct 6180cattttttaa ccaataggcc gaaatcggca aaatccctta
taaatcaaaa gaatagaccg 6240agatagggtt gagtgttgtt ccagtttgga acaagagtcc
actattaaag aacgtggact 6300ccaacgtcaa agggcgaaaa accgtctatc agggcgatgg
cccactacgt gaaccatcac 6360cctaatcaag ttttttgggg tcgaggtgcc gtaaagcact
aaatcggaac cctaaaggga 6420gcccccgatt tagagcttga cggggaaagc cggcgaacgt
ggcgagaaag gaagggaaga 6480aagcgaaagg agcgggcgct agggcgctgg caagtgtagc
ggtcacgctg cgcgtaacca 6540ccacacccgc cgcgcttaat gcgccgctac agggcgcgtc
gcgccattcg ccattcaggc 6600tgcgcaactg ttgggaaggg cgatcggtgc gggcctcttc
gctattacgc cagctggcga 6660aagggggatg tgctgcaagg cgattaagtt gggtaacgcc
agggttttcc cagtcacgac 6720gttgtaaaac gacggccagt gagcgcgcgt aatacgactc
actatagggc gaattgggta 6780ccgggccccc cctcgaggtc gacggtatcg ataagcttga
tatcgaattc ctgcagcccg 6840ggggatccgc atgcttgcat ttagtcgtgc aatgtatgac
tttaagattt gtgagcagga 6900agaaaaggga gaatcttcta acgataaacc cttgaaaaac
tgggtagact acgctatgtt 6960gagttgctac gcaggctgca caattacacg agaatgctcc
cgcctaggat ttaaggctaa 7020gggacgtgca atgcagacga cagatctaaa tgaccgtgtc
ggtgaagtgt tcgccaaact 7080tttcggttaa cacatgcagt gatgcacgcg cgatggtgct
aagttacata tatatatata 7140tatatatata tagccatagt gatgtctaag taacctttat
ggtatatttc ttaatgtgga 7200aagatactag cgcgcgcacc cacacacaag cttcgtcttt
tcttgaagaa aagaggaagc 7260tcgctaaatg ggattccact ttccgttccc tgccagctga
tggaaaaagg ttagtggaac 7320gatgaagaat aaaaagagag atccactgag gtgaaatttc
agctgacagc gagtttcatg 7380atcgtgatga acaatggtaa cgagttgtgg ctgttgccag
ggagggtggt tctcaacttt 7440taatgtatgg ccaaatcgct acttgggttt gttatataac
aaagaagaaa taatgaactg 7500attctcttcc tccttcttgt cctttcttaa ttctgttgta
attaccttcc tttgtaattt 7560tttttgtaat tattcttctt aataatccaa acaaacacac
atattacaat agctagctga 7620ggatgaaggc attagtttat catggggatc acaaaatttc
gttagaagac aaaccaaaac 7680ccactctgca gaaaccaaca gacgttgtgg ttagggtgtt
gaaaacaaca atttgcggta 7740ctgacttggg aatatacaaa ggtaagaatc ctgaagtggc
agatggcaga atcctgggtc 7800atgagggcgt tggcgtcatt gaagaagtgg gcgaatccgt
gacacaattc aaaaaggggg 7860ataaagtttt aatctcctgc gttactagct gtggatcgtg
tgattattgc aagaagcaac 7920tgtattcaca ctgtagagac ggtggctgga ttttaggtta
catgatcgac ggtgtccaag 7980ccgaatacgt cagaatacca catgctgaca attcattgta
taagatcccg caaactatcg 8040atgatgaaat tgcagtacta ctgtccgata ttttacctac
tggacatgaa attggtgttc 8100aatatggtaa cgttcaacca ggcgatgctg tagcaattgt
aggagcaggt cctgttggaa 8160tgtcagtttt gttaactgct caattttact cgcctagtac
cattattgtt atcgacatgg 8220acgaaaaccg tttacaatta gcgaaggagc ttggggccac
acacactatt aactccggta 8280ctgaaaatgt tgtcgaagct gtgcatcgta tagcagccga
aggagtggat gtagcaatag 8340aagctgttgg tatacccgca acctgggaca tctgtcagga
aattgtaaaa cccggcgctc 8400atattgccaa cgtgggagtt catggtgtta aggtggactt
tgaaattcaa aagttgtgga 8460ttaagaatct aaccatcacc actggtttgg ttaacactaa
tactacccca atgttgatga 8520aggtagcctc tactgataaa ttgcctttaa agaaaatgat
tactcacagg tttgagttag 8580ctgaaatcga acacgcatat caggttttct tgaatggcgc
taaagaaaaa gctatgaaga 8640ttattctatc taatgcaggt gccgcctaat taattaagag
taagcgaatt tcttatgatt 8700tatgattttt attattaaat aagttataaa aaaaataagt
gtatacaaat tttaaagtga 8760ctcttaggtt ttaaaacgaa aattcttatt cttgagtaac
tctttcctgt aggtcaggtt 8820gctttctcag gtatagcatg aggtcgctct tattgaccac
acctctaccg gcatgccgag 8880caaatgcctg caaatcgctc cccatttcac ccaattgtag
atatgctaac tccagcaatg 8940agttgatgaa tctcggtgtg tattttatgt cctcagagga
caacacctgt ggta 8994

SEQ ID NO: 53; Length: 753; Type: DNA; Organism: Saccharomyces cerevisiae
gcatgcttgc
atttagtcgt gcaatgtatg actttaagat ttgtgagcag gaagaaaagg 60gagaatcttc
taacgataaa cccttgaaaa actgggtaga ctacgctatg ttgagttgct 120acgcaggctg
cacaattaca cgagaatgct cccgcctagg atttaaggct aagggacgtg 180caatgcagac
gacagatcta aatgaccgtg tcggtgaagt gttcgccaaa cttttcggtt 240aacacatgca
gtgatgcacg cgcgatggtg ctaagttaca tatatatata tatagccata 300gtgatgtcta
agtaaccttt atggtatatt tcttaatgtg gaaagatact agcgcgcgca 360cccacacaca
agcttcgtct tttcttgaag aaaagaggaa gctcgctaaa tgggattcca 420ctttccgttc
cctgccagct gatggaaaaa ggttagtgga acgatgaaga ataaaaagag 480agatccactg
aggtgaaatt tcagctgaca gcgagtttca tgatcgtgat gaacaatggt 540aacgagttgt
ggctgttgcc agggagggtg gttctcaact tttaatgtat ggccaaatcg 600ctacttgggt
ttgttatata acaaagaaga aataatgaac tgattctctt cctccttctt 660gtcctttctt
aattctgttg taattacctt cctttgtaat tttttttgta attattcttc 720ttaataatcc
aaacaaacac acatattaca ata 753

SEQ ID NO: 54; Length: 316; Type: DNA; Organism: Saccharomyces cerevisiae
gagtaagcga atttcttatg atttatgatt
tttattatta aataagttat aaaaaaaata 60agtgtataca aattttaaag tgactcttag
gttttaaaac gaaaattctt attcttgagt 120aactctttcc tgtaggtcag gttgctttct
caggtatagc atgaggtcgc tcttattgac 180cacacctcta ccggcatgcc gagcaaatgc
ctgcaaatcg ctccccattt cacccaattg 240tagatatgct aactccagca atgagttgat
gaatctcggt gtgtatttta tgtcctcaga 300ggacaacacc tgtggt 316

SEQ ID NO: 55; Length: 1047; Type: DNA; Organism: Achromobacter xylosoxidans
atgaaagctc tggtttatca cggtgaccac aagatctcgc ttgaagacaa gcccaagccc
60acccttcaaa agcccacgga tgtagtagta cgggttttga agaccacgat ctgcggcacg
120gatctcggca tctacaaagg caagaatcca gaggtcgccg acgggcgcat cctgggccat
180gaaggggtag gcgtcatcga ggaagtgggc gagagtgtca cgcagttcaa gaaaggcgac
240aaggtcctga tttcctgcgt cacttcttgc ggctcgtgcg actactgcaa gaagcagctt
300tactcccatt gccgcgacgg cgggtggatc ctgggttaca tgatcgatgg cgtgcaggcc
360gaatacgtcc gcatcccgca tgccgacaac agcctctaca agatccccca gacaattgac
420gacgaaatcg ccgtcctgct gagcgacatc ctgcccaccg gccacgaaat cggcgtccag
480tatgggaatg tccagccggg cgatgcggtg gctattgtcg gcgcgggccc cgtcggcatg
540tccgtactgt tgaccgccca gttctactcc ccctcgacca tcatcgtgat cgacatggac
600gagaatcgcc tccagctcgc caaggagctc ggggcaacgc acaccatcaa ctccggcacg
660gagaacgttg tcgaagccgt gcataggatt gcggcagagg gagtcgatgt tgcgatcgag
720gcggtgggca taccggcgac ttgggacatc tgccaggaga tcgtcaagcc cggcgcgcac
780atcgccaacg tcggcgtgca tggcgtcaag gttgacttcg agattcagaa gctctggatc
840aagaacctga cgatcaccac gggactggtg aacacgaaca cgacgcccat gctgatgaag
900gtcgcctcga ccgacaagct tccgttgaag aagatgatta cccatcgctt cgagctggcc
960gagatcgagc acgcctatca ggtattcctc aatggcgcca aggagaaggc gatgaagatc
1020atcctctcga acgcaggcgc tgcctga 1047

SEQ ID NO: 56; Length: 348; Type: PRT; Organism: Achromobacter xylosoxidans
Met Lys Ala Leu Val Tyr His Gly Asp His Lys Ile Ser Leu Glu Asp 16
Lys Pro Lys Pro Thr Leu Gln Lys Pro Thr Asp Val Val Val Arg Val 32
Leu Lys Thr Thr Ile Cys Gly Thr Asp Leu Gly Ile Tyr Lys Gly Lys 48
Asn Pro Glu Val Ala Asp Gly Arg Ile Leu Gly His Glu Gly Val Gly 64
Val Ile Glu Glu Val Gly Glu Ser Val Thr Gln Phe Lys Lys Gly Asp 80
Lys Val Leu Ile Ser Cys Val Thr Ser Cys Gly Ser Cys Asp Tyr Cys 96
Lys Lys Gln Leu Tyr Ser His Cys Arg Asp Gly Gly Trp Ile Leu Gly 112
Tyr Met Ile Asp Gly Val Gln Ala Glu Tyr Val Arg Ile Pro His Ala 128
Asp Asn Ser Leu Tyr Lys Ile Pro Gln Thr Ile Asp Asp Glu Ile Ala 144
Val Leu Leu Ser Asp Ile Leu Pro Thr Gly His Glu Ile Gly Val Gln 160
Tyr Gly Asn Val Gln Pro Gly Asp Ala Val Ala Ile Val Gly Ala Gly 176
Pro Val Gly Met Ser Val Leu Leu Thr Ala Gln Phe Tyr Ser Pro Ser 192
Thr Ile Ile Val Ile Asp Met Asp Glu Asn Arg Leu Gln Leu Ala Lys 208
Glu Leu Gly Ala Thr His Thr Ile Asn Ser Gly Thr Glu Asn Val Val 224
Glu Ala Val His Arg Ile Ala Ala Glu Gly Val Asp Val Ala Ile Glu 240
Ala Val Gly Ile Pro Ala Thr Trp Asp Ile Cys Gln Glu Ile Val Lys 256
Pro Gly Ala His Ile Ala Asn Val Gly Val His Gly Val Lys Val Asp 272
Phe Glu Ile Gln Lys Leu Trp Ile Lys Asn Leu Thr Ile Thr Thr Gly 288
Leu Val Asn Thr Asn Thr Thr Pro Met Leu Met Lys Val Ala Ser Thr 304
Asp Lys Leu Pro Leu Lys Lys Met Ile Thr His Arg Phe Glu Leu Ala 320
Glu Ile Glu His Ala Tyr Gln Val Phe Leu Asn Gly Ala Lys Glu Lys 336
Ala Met Lys Ile Ile Leu Ser Asn Ala Gly Ala Ala 348

SEQ ID NO: 57; Length: 39; Type: DNA; Organism: artificial sequence; Other information: Primer
cacacatatt acaatagcta gctgaggatg aaagctctg 39

SEQ ID NO: 58; Length: 39; Type: DNA; Organism: artificial sequence; Other information: Primer
cagagctttc atcctcagct agctattgta atatgtgtg 39

SEQ ID NO: 59; Length: 9491; Type: DNA; Organism: artificial sequence; Other information: Synthetic construct
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat
gcagctcccg gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg
tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga
gcagattgta ctgagagtgc 180accataaatt cccgttttaa gagcttggtg agcgctagga
gtcactgcca ggtatcgttt 240gaacacggca ttagtcaggg aagtcataac acagtccttt
cccgcaattt tctttttcta 300ttactcttgg cctcctctag tacactctat atttttttat
gcctcggtaa tgattttcat 360tttttttttt cccctagcgg atgactcttt ttttttctta
gcgattggca ttatcacata 420atgaattata cattatataa agtaatgtga tttcttcgaa
gaatatacta aaaaatgagc 480aggcaagata aacgaaggca aagatgacag agcagaaagc
cctagtaaag cgtattacaa 540atgaaaccaa gattcagatt gcgatctctt taaagggtgg
tcccctagcg atagagcact 600cgatcttccc agaaaaagag gcagaagcag tagcagaaca
ggccacacaa tcgcaagtga 660ttaacgtcca cacaggtata gggtttctgg accatatgat
acatgctctg gccaagcatt 720ccggctggtc gctaatcgtt gagtgcattg gtgacttaca
catagacgac catcacacca 780ctgaagactg cgggattgct ctcggtcaag cttttaaaga
ggccctactg gcgcgtggag 840taaaaaggtt tggatcagga tttgcgcctt tggatgaggc
actttccaga gcggtggtag 900atctttcgaa caggccgtac gcagttgtcg aacttggttt
gcaaagggag aaagtaggag 960atctctcttg cgagatgatc ccgcattttc ttgaaagctt
tgcagaggct agcagaatta 1020ccctccacgt tgattgtctg cgaggcaaga atgatcatca
ccgtagtgag agtgcgttca 1080aggctcttgc ggttgccata agagaagcca cctcgcccaa
tggtaccaac gatgttccct 1140ccaccaaagg tgttcttatg tagtgacacc gattatttaa
agctgcagca tacgatatat 1200atacatgtgt atatatgtat acctatgaat gtcagtaagt
atgtatacga acagtatgat 1260actgaagatg acaaggtaat gcatcattct atacgtgtca
ttctgaacga ggcgcgcttt 1320ccttttttct ttttgctttt tctttttttt tctcttgaac
tcgacggatc tatgcggtgt 1380gaaataccgc acagatgcgt aaggagaaaa taccgcatca
ggaaattgta aacgttaata 1440ttttgttaaa attcgcgtta aatttttgtt aaatcagctc
attttttaac caataggccg 1500aaatcggcaa aatcccttat aaatcaaaag aatagaccga
gatagggttg agtgttgttc 1560cagtttggaa caagagtcca ctattaaaga acgtggactc
caacgtcaaa gggcgaaaaa 1620ccgtctatca gggcgatggc ccactacgtg aaccatcacc
ctaatcaagt tttttggggt 1680cgaggtgccg taaagcacta aatcggaacc ctaaagggag
cccccgattt agagcttgac 1740ggggaaagcc ggcgaacgtg gcgagaaagg aagggaagaa
agcgaaagga gcgggcgcta 1800gggcgctggc aagtgtagcg gtcacgctgc gcgtaaccac
cacacccgcc gcgcttaatg 1860cgccgctaca gggcgcgtcg cgccattcgc cattcaggct
gcgcaactgt tgggaagggc 1920gatcggtgcg ggcctcttcg ctattacgcc agctggcgaa
agggggatgt gctgcaaggc 1980gattaagttg ggtaacgcca gggttttccc agtcacgacg
ttgtaaaacg acggccagtg 2040agcgcgcgta atacgactca ctatagggcg aattgggtac
cgggcccccc ctcgaggtcg 2100acggcgcgcc actggtagag agcgactttg tatgccccaa
ttgcgaaacc cgcgatatcc 2160ttctcgattc tttagtaccc gaccaggaca aggaaaagga
ggtcgaaacg tttttgaaga 2220aacaagagga actacacgga agctctaaag atggcaacca
gccagaaact aagaaaatga 2280agttgatgga tccaactggc accgctggct tgaacaacaa
taccagcctt ccaacttctg 2340taaataacgg cggtacgcca gtgccaccag taccgttacc
tttcggtata cctcctttcc 2400ccatgtttcc aatgcccttc atgcctccaa cggctactat
cacaaatcct catcaagctg 2460acgcaagccc taagaaatga ataacaatac tgacagtact
aaataattgc ctacttggct 2520tcacatacgt tgcatacgtc gatatagata ataatgataa
tgacagcagg attatcgtaa 2580tacgtaatag ttgaaaatct caaaaatgtg tgggtcatta
cgtaaataat gataggaatg 2640ggattcttct atttttcctt tttccattct agcagccgtc
gggaaaacgt ggcatcctct 2700ctttcgggct caattggagt cacgctgccg tgagcatcct
ctctttccat atctaacaac 2760tgagcacgta accaatggaa aagcatgagc ttagcgttgc
tccaaaaaag tattggatgg 2820ttaataccat ttgtctgttc tcttctgact ttgactcctc
aaaaaaaaaa aatctacaat 2880caacagatcg cttcaattac gccctcacaa aaactttttt
ccttcttctt cgcccacgtt 2940aaattttatc cctcatgttg tctaacggat ttctgcactt
gatttattat aaaaagacaa 3000agacataata cttctctatc aatttcagtt attgttcttc
cttgcgttat tcttctgttc 3060ttctttttct tttgtcatat ataaccataa ccaagtaata
catattcaaa ctagtatgac 3120tgacaaaaaa actcttaaag acttaagaaa tcgtagttct
gtttacgatt caatggttaa 3180atcacctaat cgtgctatgt tgcgtgcaac tggtatgcaa
gatgaagact ttgaaaaacc 3240tatcgtcggt gtcatttcaa cttgggctga aaacacacct
tgtaatatcc acttacatga 3300ctttggtaaa ctagccaaag tcggtgttaa ggaagctggt
gcttggccag ttcagttcgg 3360aacaatcacg gtttctgatg gaatcgccat gggaacccaa
ggaatgcgtt tctccttgac 3420atctcgtgat attattgcag attctattga agcagccatg
ggaggtcata atgcggatgc 3480ttttgtagcc attggcggtt gtgataaaaa catgcccggt
tctgttatcg ctatggctaa 3540catggatatc ccagccattt ttgcttacgg cggaacaatt
gcacctggta atttagacgg 3600caaagatatc gatttagtct ctgtctttga aggtgtcggc
cattggaacc acggcgatat 3660gaccaaagaa gaagttaaag ctttggaatg taatgcttgt
cccggtcctg gaggctgcgg 3720tggtatgtat actgctaaca caatggcgac agctattgaa
gttttgggac ttagccttcc 3780gggttcatct tctcacccgg ctgaatccgc agaaaagaaa
gcagatattg aagaagctgg 3840tcgcgctgtt gtcaaaatgc tcgaaatggg cttaaaacct
tctgacattt taacgcgtga 3900agcttttgaa gatgctatta ctgtaactat ggctctggga
ggttcaacca actcaaccct 3960tcacctctta gctattgccc atgctgctaa tgtggaattg
acacttgatg atttcaatac 4020tttccaagaa aaagttcctc atttggctga tttgaaacct
tctggtcaat atgtattcca 4080agacctttac aaggtcggag gggtaccagc agttatgaaa
tatctcctta aaaatggctt 4140ccttcatggt gaccgtatca cttgtactgg caaaacagtc
gctgaaaatt tgaaggcttt 4200tgatgattta acacctggtc aaaaggttat tatgccgctt
gaaaatccta aacgtgaaga 4260tggtccgctc attattctcc atggtaactt ggctccagac
ggtgccgttg ccaaagtttc 4320tggtgtaaaa gtgcgtcgtc atgtcggtcc tgctaaggtc
tttaattctg aagaagaagc 4380cattgaagct gtcttgaatg atgatattgt tgatggtgat
gttgttgtcg tacgttttgt 4440aggaccaaag ggcggtcctg gtatgcctga aatgctttcc
ctttcatcaa tgattgttgg 4500taaagggcaa ggtgaaaaag ttgcccttct gacagatggc
cgcttctcag gtggtactta 4560tggtcttgtc gtgggtcata tcgctcctga agcacaagat
ggcggtccaa tcgcctacct 4620gcaaacagga gacatagtca ctattgacca agacactaag
gaattacact ttgatatctc 4680cgatgaagag ttaaaacatc gtcaagagac cattgaattg
ccaccgctct attcacgcgg 4740tatccttggt aaatatgctc acatcgtttc gtctgcttct
aggggagccg taacagactt 4800ttggaagcct gaagaaactg gcaaaaaatg ttgtcctggt
tgctgtggtt aagcggccgc 4860gttaattcaa attaattgat atagtttttt aatgagtatt
gaatctgttt agaaataatg 4920gaatattatt tttatttatt tatttatatt attggtcggc
tcttttcttc tgaaggtcaa 4980tgacaaaatg atatgaagga aataatgatt tctaaaattt
tacaacgtaa gatattttta 5040caaaagccta gctcatcttt tgtcatgcac tattttactc
acgcttgaaa ttaacggcca 5100gtccactgcg gagtcatttc aaagtcatcc taatcgatct
atcgtttttg atagctcatt 5160ttggagttcg cgattgtctt ctgttattca caactgtttt
aatttttatt tcattctgga 5220actcttcgag ttctttgtaa agtctttcat agtagcttac
tttatcctcc aacatattta 5280acttcatgtc aatttcggct cttaaatttt ccacatcatc
aagttcaaca tcatctttta 5340acttgaattt attctctagc tcttccaacc aagcctcatt
gctccttgat ttactggtga 5400aaagtgatac actttgcgcg caatccaggt caaaactttc
ctgcaaagaa ttcaccaatt 5460tctcgacatc atagtacaat ttgttttgtt ctcccatcac
aatttaatat acctgatgga 5520ttcttatgaa gcgctgggta atggacgtgt cactctactt
cgcctttttc cctactcctt 5580ttagtacgga agacaatgct aataaataag agggtaataa
taatattatt aatcggcaaa 5640aaagattaaa cgccaagcgt ttaattatca gaaagcaaac
gtcgtaccaa tccttgaatg 5700cttcccaatt gtatattaag agtcatcaca gcaacatatt
cttgttatta aattaattat 5760tattgatttt tgatattgta taaaaaaacc aaatatgtat
aaaaaaagtg aataaaaaat 5820accaagtatg gagaaatata ttagaagtct atacgttaaa
ccaccgcggt ggagctccag 5880cttttgttcc ctttagtgag ggttaattgc gcgcttggcg
taatcatggt catagctgtt 5940tcctgtgtga aattgttatc cgctcacaat tccacacaac
ataggagccg gaagcataaa 6000gtgtaaagcc tggggtgcct aatgagtgag gtaactcaca
ttaattgcgt tgcgctcact 6060gcccgctttc cagtcgggaa acctgtcgtg ccagctgcat
taatgaatcg gccaacgcgc 6120ggggagaggc ggtttgcgta ttgggcgctc ttccgcttcc
tcgctcactg actcgctgcg 6180ctcggtcgtt cggctgcggc gagcggtatc agctcactca
aaggcggtaa tacggttatc 6240cacagaatca ggggataacg caggaaagaa catgtgagca
aaaggccagc aaaaggccag 6300gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg
ctccgccccc ctgacgagca 6360tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg
acaggactat aaagatacca 6420ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt
ccgaccctgc cgcttaccgg 6480atacctgtcc gcctttctcc cttcgggaag cgtggcgctt
tctcatagct cacgctgtag 6540gtatctcagt tcggtgtagg tcgttcgctc caagctgggc
tgtgtgcacg aaccccccgt 6600tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt
gagtccaacc cggtaagaca 6660cgacttatcg ccactggcag cagccactgg taacaggatt
agcagagcga ggtatgtagg 6720cggtgctaca gagttcttga agtggtggcc taactacggc
tacactagaa ggacagtatt 6780tggtatctgc gctctgctga agccagttac cttcggaaaa
agagttggta gctcttgatc 6840cggcaaacaa accaccgctg gtagcggtgg tttttttgtt
tgcaagcagc agattacgcg 6900cagaaaaaaa ggatctcaag aagatccttt gatcttttct
acggggtctg acgctcagtg 6960gaacgaaaac tcacgttaag ggattttggt catgagatta
tcaaaaagga tcttcaccta 7020gatcctttta aattaaaaat gaagttttaa atcaatctaa
agtatatatg agtaaacttg 7080gtctgacagt taccaatgct taatcagtga ggcacctatc
tcagcgatct gtctatttcg 7140ttcatccata gttgcctgac tccccgtcgt gtagataact
acgatacggg agggcttacc 7200atctggcccc agtgctgcaa tgataccgcg agacccacgc
tcaccggctc cagatttatc 7260agcaataaac cagccagccg gaagggccga gcgcagaagt
ggtcctgcaa ctttatccgc 7320ctccatccag tctattaatt gttgccggga agctagagta
agtagttcgc cagttaatag 7380tttgcgcaac gttgttgcca ttgctacagg catcgtggtg
tcacgctcgt cgtttggtat 7440ggcttcattc agctccggtt cccaacgatc aaggcgagtt
acatgatccc ccatgttgtg 7500caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc
agaagtaagt tggccgcagt 7560gttatcactc atggttatgg cagcactgca taattctctt
actgtcatgc catccgtaag 7620atgcttttct gtgactggtg agtactcaac caagtcattc
tgagaatagt gtatgcggcg 7680accgagttgc tcttgcccgg cgtcaatacg ggataatacc
gcgccacata gcagaacttt 7740aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa
ctctcaagga tcttaccgct 7800gttgagatcc agttcgatgt aacccactcg tgcacccaac
tgatcttcag catcttttac 7860tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa
aatgccgcaa aaaagggaat 7920aagggcgaca cggaaatgtt gaatactcat actcttcctt
tttcaatatt attgaagcat 7980ttatcagggt tattgtctca tgagcggata catatttgaa
tgtatttaga aaaataaaca 8040aataggggtt ccgcgcacat ttccccgaaa agtgccacct
gaacgaagca tctgtgcttc 8100attttgtaga acaaaaatgc aacgcgagag cgctaatttt
tcaaacaaag aatctgagct 8160gcatttttac agaacagaaa tgcaacgcga aagcgctatt
ttaccaacga agaatctgtg 8220cttcattttt gtaaaacaaa aatgcaacgc gagagcgcta
atttttcaaa caaagaatct 8280gagctgcatt tttacagaac agaaatgcaa cgcgagagcg
ctattttacc aacaaagaat 8340ctatacttct tttttgttct acaaaaatgc atcccgagag
cgctattttt ctaacaaagc 8400atcttagatt actttttttc tcctttgtgc gctctataat
gcagtctctt gataactttt 8460tgcactgtag gtccgttaag gttagaagaa ggctactttg
gtgtctattt tctcttccat 8520aaaaaaagcc tgactccact tcccgcgttt actgattact
agcgaagctg cgggtgcatt 8580ttttcaagat aaaggcatcc ccgattatat tctataccga
tgtggattgc gcatactttg 8640tgaacagaaa gtgatagcgt tgatgattct tcattggtca
gaaaattatg aacggtttct 8700tctattttgt ctctatatac tacgtatagg aaatgtttac
attttcgtat tgttttcgat 8760tcactctatg aatagttctt actacaattt ttttgtctaa
agagtaatac tagagataaa 8820cataaaaaat gtagaggtcg agtttagatg caagttcaag
gagcgaaagg tggatgggta 8880ggttatatag ggatatagca cagagatata tagcaaagag
atacttttga gcaatgtttg 8940tggaagcggt attcgcaata ttttagtagc tcgttacagt
ccggtgcgtt tttggttttt 9000tgaaagtgcg tcttcagagc gcttttggtt ttcaaaagcg
ctctgaagtt cctatacttt 9060ctagagaata ggaacttcgg aataggaact tcaaagcgtt
tccgaaaacg agcgcttccg 9120aaaatgcaac gcgagctgcg cacatacagc tcactgttca
cgtcgcacct atatctgcgt 9180gttgcctgta tatatatata catgagaaga acggcatagt
gcgtgtttat gcttaaatgc 9240gtacttatat gcgtctattt atgtaggatg aaaggtagtc
tagtacctcc tgtgatatta 9300tcccattcca tgcggggtat cgtatgcttc cttcagcact
accctttagc tgttctatat 9360gctgccactc ctcaattgga ttagtctcat ccttcaatgc
tatcatttcc tttgatattg 9420gatcatctaa gaaaccatta ttatcatgac attaacctat
aaaaataggc gtatcacgag 9480gccctttcgt c 9491

SEQ ID NO: 60; Length: 1000; Type: DNA; Organism: Saccharomyces cerevisiae
gttaattcaa
attaattgat atagtttttt aatgagtatt gaatctgttt agaaataatg 60gaatattatt
tttatttatt tatttatatt attggtcggc tcttttcttc tgaaggtcaa 120tgacaaaatg
atatgaagga aataatgatt tctaaaattt tacaacgtaa gatattttta 180caaaagccta
gctcatcttt tgtcatgcac tattttactc acgcttgaaa ttaacggcca 240gtccactgcg
gagtcatttc aaagtcatcc taatcgatct atcgtttttg atagctcatt 300ttggagttcg
cgattgtctt ctgttattca caactgtttt aatttttatt tcattctgga 360actcttcgag
ttctttgtaa agtctttcat agtagcttac tttatcctcc aacatattta 420acttcatgtc
aatttcggct cttaaatttt ccacatcatc aagttcaaca tcatctttta 480acttgaattt
attctctagc tcttccaacc aagcctcatt gctccttgat ttactggtga 540aaagtgatac
actttgcgcg caatccaggt caaaactttc ctgcaaagaa ttcaccaatt 600tctcgacatc
atagtacaat ttgttttgtt ctcccatcac aatttaatat acctgatgga 660ttcttatgaa
gcgctgggta atggacgtgt cactctactt cgcctttttc cctactcctt 720ttagtacgga
agacaatgct aataaataag agggtaataa taatattatt aatcggcaaa 780aaagattaaa
cgccaagcgt ttaattatca gaaagcaaac gtcgtaccaa tccttgaatg 840cttcccaatt
gtatattaag agtcatcaca gcaacatatt cttgttatta aattaattat 900tattgatttt
tgatattgta taaaaaaacc aaatatgtat aaaaaaagtg aataaaaaat 960accaagtatg
gagaaatata ttagaagtct atacgttaaa 1000

SEQ ID NO: 61; Length: 244; Type: DNA; Organism: Saccharomyces cerevisiae
attaaagcct tcgagcgtcc caaaaccttc
tcaagcaagg ttttcagtat aatgttacat 60gcgtacacgc gtctgtacag aaaaaaaaga
aaaatttgaa atataaataa cgttcttaat 120actaacataa ctataaaaaa ataaataggg
acctagactt caggttgtct aactccttcc 180ttttcggtta gagcggatgt ggggggaggg
cgtgaatgta agcgtgacat aactaattac 240atga 244

SEQ ID NO: 62; Length: 1713; Type: DNA; Organism: Streptococcus mutans
atgactgaca aaaaaactct taaagactta agaaatcgta gttctgttta cgattcaatg
60gttaaatcac ctaatcgtgc tatgttgcgt gcaactggta tgcaagatga agactttgaa
120aaacctatcg tcggtgtcat ttcaacttgg gctgaaaaca caccttgtaa tatccactta
180catgactttg gtaaactagc caaagtcggt gttaaggaag ctggtgcttg gccagttcag
240ttcggaacaa tcacggtttc tgatggaatc gccatgggaa cccaaggaat gcgtttctcc
300ttgacatctc gtgatattat tgcagattct attgaagcag ccatgggagg tcataatgcg
360gatgcttttg tagccattgg cggttgtgat aaaaacatgc ccggttctgt tatcgctatg
420gctaacatgg atatcccagc catttttgct tacggcggaa caattgcacc tggtaattta
480gacggcaaag atatcgattt agtctctgtc tttgaaggtg tcggccattg gaaccacggc
540gatatgacca aagaagaagt taaagctttg gaatgtaatg cttgtcccgg tcctggaggc
600tgcggtggta tgtatactgc taacacaatg gcgacagcta ttgaagtttt gggacttagc
660cttccgggtt catcttctca cccggctgaa tccgcagaaa agaaagcaga tattgaagaa
720gctggtcgcg ctgttgtcaa aatgctcgaa atgggcttaa aaccttctga cattttaacg
780cgtgaagctt ttgaagatgc tattactgta actatggctc tgggaggttc aaccaactca
840acccttcacc tcttagctat tgcccatgct gctaatgtgg aattgacact tgatgatttc
900aatactttcc aagaaaaagt tcctcatttg gctgatttga aaccttctgg tcaatatgta
960ttccaagacc tttacaaggt cggaggggta ccagcagtta tgaaatatct ccttaaaaat
1020ggcttccttc atggtgaccg tatcacttgt actggcaaaa cagtcgctga aaatttgaag
1080gcttttgatg atttaacacc tggtcaaaag gttattatgc cgcttgaaaa tcctaaacgt
1140gaagatggtc cgctcattat tctccatggt aacttggctc cagacggtgc cgttgccaaa
1200gtttctggtg taaaagtgcg tcgtcatgtc ggtcctgcta aggtctttaa ttctgaagaa
1260gaagccattg aagctgtctt gaatgatgat attgttgatg gtgatgttgt tgtcgtacgt
1320tttgtaggac caaagggcgg tcctggtatg cctgaaatgc tttccctttc atcaatgatt
1380gttggtaaag ggcaaggtga aaaagttgcc cttctgacag atggccgctt ctcaggtggt
1440acttatggtc ttgtcgtggg tcatatcgct cctgaagcac aagatggcgg tccaatcgcc
1500tacctgcaaa caggagacat agtcactatt gaccaagaca ctaaggaatt acactttgat
1560atctccgatg aagagttaaa acatcgtcaa gagaccattg aattgccacc gctctattca
1620cgcggtatcc ttggtaaata tgctcacatc gtttcgtctg cttctagggg agccgtaaca
1680gacttttgga agcctgaaga aactggcaaa aaa
1713
SEQ ID NO: 63 (571 aa, PRT, Streptococcus mutans)
Met Thr Asp Lys Lys Thr Leu Lys Asp Leu Arg Asn Arg Ser Ser Val 16
Tyr Asp Ser Met Val Lys Ser Pro Asn Arg Ala Met Leu Arg Ala Thr 32
Gly Met Gln Asp Glu Asp Phe Glu Lys Pro Ile Val Gly Val Ile Ser 48
Thr Trp Ala Glu Asn Thr Pro Cys Asn Ile His Leu His Asp Phe Gly 64
Lys Leu Ala Lys Val Gly Val Lys Glu Ala Gly Ala Trp Pro Val Gln 80
Phe Gly Thr Ile Thr Val Ser Asp Gly Ile Ala Met Gly Thr Gln Gly 96
Met Arg Phe Ser Leu Thr Ser Arg Asp Ile Ile Ala Asp Ser Ile Glu 112
Ala Ala Met Gly Gly His Asn Ala Asp Ala Phe Val Ala Ile Gly Gly 128
Cys Asp Lys Asn Met Pro Gly Ser Val Ile Ala Met Ala Asn Met Asp 144
Ile Pro Ala Ile Phe Ala Tyr Gly Gly Thr Ile Ala Pro Gly Asn Leu 160
Asp Gly Lys Asp Ile Asp Leu Val Ser Val Phe Glu Gly Val Gly His 176
Trp Asn His Gly Asp Met Thr Lys Glu Glu Val Lys Ala Leu Glu Cys 192
Asn Ala Cys Pro Gly Pro Gly Gly Cys Gly Gly Met Tyr Thr Ala Asn 208
Thr Met Ala Thr Ala Ile Glu Val Leu Gly Leu Ser Leu Pro Gly Ser 224
Ser Ser His Pro Ala Glu Ser Ala Glu Lys Lys Ala Asp Ile Glu Glu 240
Ala Gly Arg Ala Val Val Lys Met Leu Glu Met Gly Leu Lys Pro Ser 256
Asp Ile Leu Thr Arg Glu Ala Phe Glu Asp Ala Ile Thr Val Thr Met 272
Ala Leu Gly Gly Ser Thr Asn Ser Thr Leu His Leu Leu Ala Ile Ala 288
His Ala Ala Asn Val Glu Leu Thr Leu Asp Asp Phe Asn Thr Phe Gln 304
Glu Lys Val Pro His Leu Ala Asp Leu Lys Pro Ser Gly Gln Tyr Val 320
Phe Gln Asp Leu Tyr Lys Val Gly Gly Val Pro Ala Val Met Lys Tyr 336
Leu Leu Lys Asn Gly Phe Leu His Gly Asp Arg Ile Thr Cys Thr Gly 352
Lys Thr Val Ala Glu Asn Leu Lys Ala Phe Asp Asp Leu Thr Pro Gly 368
Gln Lys Val Ile Met Pro Leu Glu Asn Pro Lys Arg Glu Asp Gly Pro 384
Leu Ile Ile Leu His Gly Asn Leu Ala Pro Asp Gly Ala Val Ala Lys 400
Val Ser Gly Val Lys Val Arg Arg His Val Gly Pro Ala Lys Val Phe 416
Asn Ser Glu Glu Glu Ala Ile Glu Ala Val Leu Asn Asp Asp Ile Val 432
Asp Gly Asp Val Val Val Val Arg Phe Val Gly Pro Lys Gly Gly Pro 448
Gly Met Pro Glu Met Leu Ser Leu Ser Ser Met Ile Val Gly Lys Gly 464
Gln Gly Glu Lys Val Ala Leu Leu Thr Asp Gly Arg Phe Ser Gly Gly 480
Thr Tyr Gly Leu Val Val Gly His Ile Ala Pro Glu Ala Gln Asp Gly 496
Gly Pro Ile Ala Tyr Leu Gln Thr Gly Asp Ile Val Thr Ile Asp Gln 512
Asp Thr Lys Glu Leu His Phe Asp Ile Ser Asp Glu Glu Leu Lys His 528
Arg Gln Glu Thr Ile Glu Leu Pro Pro Leu Tyr Ser Arg Gly Ile Leu 544
Gly Lys Tyr Ala His Ile Val Ser Ser Ala Ser Arg Gly Ala Val Thr 560
Asp Phe Trp Lys Pro Glu Glu Thr Gly Lys Lys 571
SEQ ID NO: 64 (2145 nt, DNA, artificial sequence, Synthetic construct)
gcatgcttgc atttagtcgt gcaatgtatg actttaagat
ttgtgagcag gaagaaaagg 60gagaatcttc taacgataaa cccttgaaaa actgggtaga
ctacgctatg ttgagttgct 120acgcaggctg cacaattaca cgagaatgct cccgcctagg
atttaaggct aagggacgtg 180caatgcagac gacagatcta aatgaccgtg tcggtgaagt
gttcgccaaa cttttcggtt 240aacacatgca gtgatgcacg cgcgatggtg ctaagttaca
tatatatata tatatatata 300tatagccata gtgatgtcta agtaaccttt atggtatatt
tcttaatgtg gaaagatact 360agcgcgcgca cccacacaca agcttcgtct tttcttgaag
aaaagaggaa gctcgctaaa 420tgggattcca ctttccgttc cctgccagct gatggaaaaa
ggttagtgga acgatgaaga 480ataaaaagag agatccactg aggtgaaatt tcagctgaca
gcgagtttca tgatcgtgat 540gaacaatggt aacgagttgt ggctgttgcc agggagggtg
gttctcaact tttaatgtat 600ggccaaatcg ctacttgggt ttgttatata acaaagaaga
aataatgaac tgattctctt 660cctccttctt gtcctttctt aattctgttg taattacctt
cctttgtaat tttttttgta 720attattcttc ttaataatcc aaacaaacac acatattaca
atagctagct gaggatgaag 780gcattagttt atcatgggga tcacaaaatt tcgttagaag
acaaaccaaa acccactctg 840cagaaaccaa cagacgttgt ggttagggtg ttgaaaacaa
caatttgcgg tactgacttg 900ggaatataca aaggtaagaa tcctgaagtg gcagatggca
gaatcctggg tcatgagggc 960gttggcgtca ttgaagaagt gggcgaatcc gtgacacaat
tcaaaaaggg ggataaagtt 1020ttaatctcct gcgttactag ctgtggatcg tgtgattatt
gcaagaagca actgtattca 1080cactgtagag acggtggctg gattttaggt tacatgatcg
acggtgtcca agccgaatac 1140gtcagaatac cacatgctga caattcattg tataagatcc
cgcaaactat cgatgatgaa 1200attgcagtac tactgtccga tattttacct actggacatg
aaattggtgt tcaatatggt 1260aacgttcaac caggcgatgc tgtagcaatt gtaggagcag
gtcctgttgg aatgtcagtt 1320ttgttaactg ctcaatttta ctcgcctagt accattattg
ttatcgacat ggacgaaaac 1380cgtttacaat tagcgaagga gcttggggcc acacacacta
ttaactccgg tactgaaaat 1440gttgtcgaag ctgtgcatcg tatagcagcc gaaggagtgg
atgtagcaat agaagctgtt 1500ggtatacccg caacctggga catctgtcag gaaattgtaa
aacccggcgc tcatattgcc 1560aacgtgggag ttcatggtgt taaggtggac tttgaaattc
aaaagttgtg gattaagaat 1620ctaaccatca ccactggttt ggttaacact aatactaccc
caatgttgat gaaggtagcc 1680tctactgata aattgccttt aaagaaaatg attactcaca
ggtttgagtt agctgaaatc 1740gaacacgcat atcaggtttt cttgaatggc gctaaagaaa
aagctatgaa gattattcta 1800tctaatgcag gtgccgccta attaattaag agtaagcgaa
tttcttatga tttatgattt 1860ttattattaa ataagttata aaaaaaataa gtgtatacaa
attttaaagt gactcttagg 1920ttttaaaacg aaaattctta ttcttgagta actctttcct
gtaggtcagg ttgctttctc 1980aggtatagca tgaggtcgct cttattgacc acacctctac
cggcatgccg agcaaatgcc 2040tgcaaatcgc tccccatttc acccaattgt agatatgcta
actccagcaa tgagttgatg 2100aatctcggtg tgtattttat gtcctcagag gacaacacct
gtggt 2145
SEQ ID NO: 65 (4280 nt, DNA, artificial sequence, Synthetic construct)
ggggatcctc tagagtcgac ctgcaggcat gcaagcttgg cgtaatcatg
gtcatagctg 60tttcctgtgt gaaattgtta tccgctcaca attccacaca acatacgagc
cggaagcata 120aagtgtaaag cctggggtgc ctaatgagtg agctaactca cattaattgc
gttgcgctca 180ctgcccgctt tccagtcggg aaacctgtcg tgccagctgc attaatgaat
cggccaacgc 240gcggggagag gcggtttgcg tattgggcgc tcttccgctt cctcgctcac
tgactcgctg 300cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt
aatacggtta 360tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca
gcaaaaggcc 420aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc
ccctgacgag 480catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact
ataaagatac 540caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct
gccgcttacc 600ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag
ctcacgctgt 660aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca
cgaacccccc 720gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa
cccggtaaga 780cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc
gaggtatgta 840ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag
aaggacagta 900tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg
tagctcttga 960tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca
gcagattacg 1020cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc
tgacgctcag 1080tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag
gatcttcacc 1140tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata
tgagtaaact 1200tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat
ctgtctattt 1260cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg
ggagggctta 1320ccatctggcc ccagtgctgc aatgataccg cgagacccac gctcaccggc
tccagattta 1380tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc
aactttatcc 1440gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc
gccagttaat 1500agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc
gtcgtttggt 1560atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc
ccccatgttg 1620tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa
gttggccgca 1680gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat
gccatccgta 1740agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata
gtgtatgcgg 1800cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca
tagcagaact 1860ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag
gatcttaccg 1920ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc
agcatctttt 1980actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc
aaaaaaggga 2040ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata
ttattgaagc 2100atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta
gaaaaataaa 2160caaatagggg ttccgcgcac atttccccga aaagtgccac ctgacgtcta
agaaaccatt 2220attatcatga cattaaccta taaaaatagg cgtatcacga ggccctttcg
tctcgcgcgt 2280ttcggtgatg acggtgaaaa cctctgacac atgcagctcc cggagacggt
cacagcttgt 2340ctgtaagcgg atgccgggag cagacaagcc cgtcagggcg cgtcagcggg
tgttggcggg 2400tgtcggggct ggcttaacta tgcggcatca gagcagattg tactgagagt
gcaccatatg 2460cggtgtgaaa taccgcacag atgcgtaagg agaaaatacc gcatcaggcg
ccattcgcca 2520ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg cctcttcgct
attacgccag 2580ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg taacgccagg
gttttcccag 2640tcacgacgtt gtaaaacgac ggccagtgaa ttcgagctcg gtacccccgg
ctctgagaca 2700gtagtaggtt agtcatcgct ctaccgacgc gcaggaaaag aaagaagcat
tgcggattac 2760gtattctaat gttcagcccg cggaacgcca gcaaatcacc acccatgcgc
atgatactga 2820gtcttgtaca cgctgggctt ccagtgtact gagagtgcac cataccacag
cttttcaatt 2880caattcatca tttttttttt attctttttt ttgatttcgg tttctttgaa
atttttttga 2940ttcggtaatc tccgaacaga aggaagaacg aaggaaggag cacagactta
gattggtata 3000tatacgcata tgtagtgttg aagaaacatg aaattgccca gtattcttaa
cccaactgca 3060cagaacaaaa acctgcagga aacgaagata aatcatgtcg aaagctacat
ataaggaacg 3120tgctgctact catcctagtc ctgttgctgc caagctattt aatatcatgc
acgaaaagca 3180aacaaacttg tgtgcttcat tggatgttcg taccaccaag gaattactgg
agttagttga 3240agcattaggt cccaaaattt gtttactaaa aacacatgtg gatatcttga
ctgatttttc 3300catggagggc acagttaagc cgctaaaggc attatccgcc aagtacaatt
ttttactctt 3360cgaagacaga aaatttgctg acattggtaa tacagtcaaa ttgcagtact
ctgcgggtgt 3420atacagaata gcagaatggg cagacattac gaatgcacac ggtgtggtgg
gcccaggtat 3480tgttagcggt ttgaagcagg cggcagaaga agtaacaaag gaacctagag
gccttttgat 3540gttagcagaa ttgtcatgca agggctccct atctactgga gaatatacta
agggtactgt 3600tgacattgcg aagagcgaca aagattttgt tatcggcttt attgctcaaa
gagacatggg 3660tggaagagat gaaggttacg attggttgat tatgacaccc ggtgtgggtt
tagatgacaa 3720gggagacgca ttgggtcaac agtatagaac cgtggatgat gtggtctcta
caggatctga 3780cattattatt gttggaagag gactatttgc aaagggaagg gatgctaagg
tagagggtga 3840acgttacaga aaagcaggct gggaagcata tttgagaaga tgcggccagc
aaaactaaaa 3900aactgtatta taagtaaatg catgtatact aaactcacaa attagagctt
caatttaatt 3960atatcagtta ttaccctatg cggtgtgaaa taccgcacag atgcgtaagg
agaaaatacc 4020gcatcaggaa attgtaaacg ttaatatttt gttaaaattc gcgttaaatt
tttgttaaat 4080cagctcattt tttaaccaat aggccgaaat cggcaaaatc ttcagcccgc
ggaacgccag 4140caaatcacca cccatgcgca tgatactgag tcttgtacac gctgggcttc
cagtgatgat 4200acaacgagtt agccaaggtg agcacggatg tctaaattag aattacgttt
taatatcttt 4260ttttccatat ctagggctag
4280
SEQ ID NO: 66 (30 nt, DNA, artificial sequence, Primer)
gcatgcttgc atttagtcgt gcaatgtatg 30
SEQ ID NO: 67 (54 nt, DNA, artificial sequence, Primer)
gaacattaga atacgtaatc cgcaatgcac tagtaccaca ggtgttgtcc tctg 54
SEQ ID NO: 68 (54 nt, DNA, artificial sequence, Primer)
cagaggacaa cacctgtggt actagtgcat tgcggattac gtattctaat gttc 54
SEQ ID NO: 69 (28 nt, DNA, artificial sequence, Primer)
caccttggct aactcgttgt atcatcac 28
SEQ ID NO: 70 (100 nt, DNA, artificial sequence, Primer)
ttttaagccg aatgagtgac agaaaaagcc cacaacttat caagtgatat tgaacaaagg 60
gcgaaacttc gcatgcttgc atttagtcgt gcaatgtatg 100
SEQ ID NO: 71 (98 nt, DNA, artificial sequence, Primer)
cccaattggt aaatattcaa caagagacgc gcagtacgta acatgcgaat tgcgtaattc 60
acggcgataa caccttggct aactcgttgt atcatcac 98
SEQ ID NO: 72 (29 nt, DNA, artificial sequence, Primer)
caaaagccca tgtcccacac caaaggatg 29
SEQ ID NO: 73 (26 nt, DNA, artificial sequence, Primer)
caccatcgcg cgtgcatcac tgcatg 26
SEQ ID NO: 74 (28 nt, DNA, artificial sequence, Primer)
tcggtttttg caatatgacc tgtgggcc 28
SEQ ID NO: 75 (22 nt, DNA, artificial sequence, Primer)
gagaagatgc ggccagcaaa ac 22
SEQ ID NO: 76 (2745 nt, DNA, artificial sequence, Synthetic construct)
atgactgaca aaaaaactct taaagactta agaaatcgta gttctgttta cgattcaatg
60gttaaatcac ctaatcgtgc tatgttgcgt gcaactggta tgcaagatga agactttgaa
120aaacctatcg tcggtgtcat ttcaacttgg gctgaaaaca caccttgtaa tatccactta
180catgactttg gtaaactagc caaagtcggt gttaaggaag ctggtgcttg gccagttcag
240ttcggaacaa tcacggtttc tgatggaatc gccatgggaa cccaaggaat gcgtttctcc
300ttgacatctc gtgatattat tgcagattct attgaagcag ccatgggagg tcataatgcg
360gatgcttttg tagccattgg cggttgtgat aaaaacatgc ccggttctgt tatcgctatg
420gctaacatgg atatcccagc catttttgct tacggcggaa caattgcacc tggtaattta
480gacggcaaag atatcgattt agtctctgtc tttgaaggtg tcggccattg gaaccacggc
540gatatgacca aagaagaagt taaagctttg gaatgtaatg cttgtcccgg tcctggaggc
600tgcggtggta tgtatactgc taacacaatg gcgacagcta ttgaagtttt gggacttagc
660cttccgggtt catcttctca cccggctgaa tccgcagaaa agaaagcaga tattgaagaa
720gctggtcgcg ctgttgtcaa aatgctcgaa atgggcttaa aaccttctga cattttaacg
780cgtgaagctt ttgaagatgc tattactgta actatggctc tgggaggttc aaccaactca
840acccttcacc tcttagctat tgcccatgct gctaatgtgg aattgacact tgatgatttc
900aatactttcc aagaaaaagt tcctcatttg gctgatttga aaccttctgg tcaatatgta
960ttccaagacc tttacaaggt cggaggggta ccagcagtta tgaaatatct ccttaaaaat
1020ggcttccttc atggtgaccg tatcacttgt actggcaaaa cagtcgctga aaatttgaag
1080gcttttgatg atttaacacc tggtcaaaag gttattatgc cgcttgaaaa tcctaaacgt
1140gaagatggtc cgctcattat tctccatggt aacttggctc cagacggtgc cgttgccaaa
1200gtttctggtg taaaagtgcg tcgtcatgtc ggtcctgcta aggtctttaa ttctgaagaa
1260gaagccattg aagctgtctt gaatgatgat attgttgatg gtgatgttgt tgtcgtacgt
1320tttgtaggac caaagggcgg tcctggtatg cctgaaatgc tttccctttc atcaatgatt
1380gttggtaaag ggcaaggtga aaaagttgcc cttctgacag atggccgctt ctcaggtggt
1440acttatggtc ttgtcgtggg tcatatcgct cctgaagcac aagatggcgg tccaatcgcc
1500tacctgcaaa caggagacat agtcactatt gaccaagaca ctaaggaatt acactttgat
1560atctccgatg aagagttaaa acatcgtcaa gagaccattg aattgccacc gctctattca
1620cgcggtatcc ttggtaaata tgctcacatc gtttcgtctg cttctagggg agccgtaaca
1680gacttttgga agcctgaaga aactggcaaa aaatgttgtc ctggttgctg tggttaagcg
1740gccgcgttaa ttcaaattaa ttgatatagt tttttaatga gtattgaatc tgtttagaaa
1800taatggaata ttatttttat ttatttattt atattattgg tcggctcttt tcttctgaag
1860gtcaatgaca aaatgatatg aaggaaataa tgatttctaa aattttacaa cgtaagatat
1920ttttacaaaa gcctagctca tcttttgtca tgcactattt tactcacgct tgaaattaac
1980ggccagtcca ctgcggagtc atttcaaagt catcctaatc gatctatcgt ttttgatagc
2040tcattttgga gttcgcgatt gtcttctgtt attcacaact gttttaattt ttatttcatt
2100ctggaactct tcgagttctt tgtaaagtct ttcatagtag cttactttat cctccaacat
2160atttaacttc atgtcaattt cggctcttaa attttccaca tcatcaagtt caacatcatc
2220ttttaacttg aatttattct ctagctcttc caaccaagcc tcattgctcc ttgatttact
2280ggtgaaaagt gatacacttt gcgcgcaatc caggtcaaaa ctttcctgca aagaattcac
2340caatttctcg acatcatagt acaatttgtt ttgttctccc atcacaattt aatatacctg
2400atggattctt atgaagcgct gggtaatgga cgtgtcactc tacttcgcct ttttccctac
2460tccttttagt acggaagaca atgctaataa ataagagggt aataataata ttattaatcg
2520gcaaaaaaga ttaaacgcca agcgtttaat tatcagaaag caaacgtcgt accaatcctt
2580gaatgcttcc caattgtata ttaagagtca tcacagcaac atattcttgt tattaaatta
2640attattattg atttttgata ttgtataaaa aaaccaaata tgtataaaaa aagtgaataa
2700aaaataccaa gtatggagaa atatattaga agtctatacg ttaaa
2745
SEQ ID NO: 77 (99 nt, DNA, artificial sequence, Primer)
tcctttctca attattattt tctactcata acctcacgca aaataacaca gtcaaatcaa 60
tcaaagtatg actgacaaaa aaactcttaa agacttaag 99
SEQ ID NO: 78 (77 nt, DNA, artificial sequence, Primer)
gaacattaga atacgtaatc cgcaatgctt ctttcttttc cgtttaacgt atagacttct 60
aatatatttc tccatac 77
SEQ ID NO: 79 (45 nt, DNA, artificial sequence, Primer)
aaacggaaaa gaaagaagca ttgcggatta cgtattctaa tgttc 45
SEQ ID NO: 80 (88 nt, DNA, artificial sequence, Primer)
tatttttcgt tacataaaaa tgcttataaa actttaacta ataattagag attaaatcgc 60
caccttggct aactcgttgt atcatcac 88
SEQ ID NO: 81 (2347 nt, DNA, artificial sequence, Synthetic construct)
gcattgcgga ttacgtattc
taatgttcag gtgctggaag aagagctgct taaccgccgc 60gcccagggtg aagatccacg
ctactttacc ctgcgtcgtc tggatttcgg cggctgtcgt 120ctttcgctgg caacgccggt
tgatgaagcc tgggacggtc cgctctcctt aaacggtaaa 180cgtatcgcca cctcttatcc
tcacctgctc aagcgttatc tcgaccagaa aggcatctct 240tttaaatcct gcttactgaa
cggttctgtt gaagtcgccc cgcgtgccgg actggcggat 300gcgatttgcg atctggtttc
caccggtgcc acgctggaag ctaacggcct gcgcgaagtc 360gaagttatct atcgctcgaa
agcctgcctg attcaacgcg atggcgaaat ggaagaatcc 420aaacagcaac tgatcgacaa
actgctgacc cgtattcagg gtgtgatcca ggcgcgcgaa 480tcaaaataca tcatgatgca
cgcaccgacc gaacgtctgg atgaagtcat ggtacctact 540gagagtgcac cataccacag
cttttcaatt caattcatca tttttttttt attctttttt 600ttgatttcgg tttctttgaa
atttttttga ttcggtaatc tccgaacaga aggaagaacg 660aaggaaggag cacagactta
gattggtata tatacgcata tgtagtgttg aagaaacatg 720aaattgccca gtattcttaa
cccaactgca cagaacaaaa acctgcagga aacgaagata 780aatcatgtcg aaagctacat
ataaggaacg tgctgctact catcctagtc ctgttgctgc 840caagctattt aatatcatgc
acgaaaagca aacaaacttg tgtgcttcat tggatgttcg 900taccaccaag gaattactgg
agttagttga agcattaggt cccaaaattt gtttactaaa 960aacacatgtg gatatcttga
ctgatttttc catggagggc acagttaagc cgctaaaggc 1020attatccgcc aagtacaatt
ttttactctt cgaagacaga aaatttgctg acattggtaa 1080tacagtcaaa ttgcagtact
ctgcgggtgt atacagaata gcagaatggg cagacattac 1140gaatgcacac ggtgtggtgg
gcccaggtat tgttagcggt ttgaagcagg cggcagaaga 1200agtaacaaag gaacctagag
gccttttgat gttagcagaa ttgtcatgca agggctccct 1260atctactgga gaatatacta
agggtactgt tgacattgcg aagagcgaca aagattttgt 1320tatcggcttt attgctcaaa
gagacatggg tggaagagat gaaggttacg attggttgat 1380tatgacaccc ggtgtgggtt
tagatgacaa gggagacgca ttgggtcaac agtatagaac 1440cgtggatgat gtggtctcta
caggatctga cattattatt gttggaagag gactatttgc 1500aaagggaagg gatgctaagg
tagagggtga acgttacaga aaagcaggct gggaagcata 1560tttgagaaga tgcggccagc
aaaactaaaa aactgtatta taagtaaatg catgtatact 1620aaactcacaa attagagctt
caatttaatt atatcagtta ttaccctatg cggtgtgaaa 1680taccgcacag atgcgtaagg
agaaaatacc gcatcaggaa attgtaaacg ttaatatttt 1740gttaaaattc gcgttaaatt
tttgttaaat cagctcattt tttaaccaat aggccgaaat 1800cggcaaaatc tctagagtgc
tggaagaaga gctgcttaac cgccgcgccc agggtgaaga 1860tccacgctac tttaccctgc
gtcgtctgga tttcggcggc tgtcgtcttt cgctggcaac 1920gccggttgat gaagcctggg
acggtccgct ctccttaaac ggtaaacgta tcgccacctc 1980ttatcctcac ctgctcaagc
gttatctcga ccagaaaggc atctctttta aatcctgctt 2040actgaacggt tctgttgaag
tcgccccgcg tgccggactg gcggatgcga tttgcgatct 2100ggtttccacc ggtgccacgc
tggaagctaa cggcctgcgc gaagtcgaag ttatctatcg 2160ctcgaaagcc tgcctgattc
aacgcgatgg cgaaatggaa gaatccaaac agcaactgat 2220cgacaaactg ctgacccgta
ttcagggtgt gatccaggcg cgcgaatcaa aatacatcat 2280gatgcacgca ccgaccgaac
gtctggatga agtcatccag tgatgataca acgagttagc 2340caaggtg
2347
SEQ ID NO: 82 (27 nt, DNA, artificial sequence, Primer)
gacttttgga agcctgaaga aactggc 27
SEQ ID NO: 83 (20 nt, DNA, artificial sequence, Primer)
cttggcagca acaggactag 20
SEQ ID NO: 84 (27 nt, DNA, artificial sequence, Primer)
gacttttgga agcctgaaga aactggc 27
SEQ ID NO: 85 (20 nt, DNA, artificial sequence, Primer)
cttggcagca acaggactag 20
SEQ ID NO: 86 (26 nt, DNA, artificial sequence, Primer)
gacttgaata atgcagcggc gcttgc 26
SEQ ID NO: 87 (30 nt, DNA, artificial sequence, Primer)
ccaccctctt caattagcta agatcatagc 30
SEQ ID NO: 88 (25 nt, DNA, artificial sequence, Primer)
aaaaattgat tctcatcgta aatgc 25
SEQ ID NO: 89 (20 nt, DNA, artificial sequence, Primer)
ctgcagcgag gagccgtaat 20
SEQ ID NO: 90 (90 nt, DNA, artificial sequence, Primer)
atggttcatt taggtccaaa aaaaccacaa gccagaaagg gttccatggc cgatgtgcca 60
gcattgcgga ttacgtattc taatgttcag 90
SEQ ID NO: 91 (91 nt, DNA, artificial sequence, Primer)
ttaagcaccg atgataccaa cggacttacc ttcagcaatt cttttttggg ccaaagcagc 60
caccttggct aactcgttgt atcatcactg g 91
SEQ ID NO: 92 (24 nt, DNA, artificial sequence, Primer)
ctaggatgag tagcagcacg ttcc 24
SEQ ID NO: 93 (26 nt, DNA, artificial sequence, Primer)
ccaattccgt gatgtctctt tgttgc 26
SEQ ID NO: 94 (20 nt, DNA, artificial sequence, Primer)
gtgaacgagt tcacaaccgc 20
SEQ ID NO: 95 (22 nt, DNA, artificial sequence, Primer)
gttcgttcca gaattatcac gc 22
SEQ ID NO: 96 (41 nt, DNA, artificial sequence, Primer)
cactaaatct agaatggttc atttaggtcc aaaaaaacca c 41
SEQ ID NO: 97 (40 nt, DNA, Artificial sequence, Primer)
tttgattgga tccggaagtg tagagagggt taaaattggc 40
SEQ ID NO: 98 (600 nt, DNA, Saccharomyces cerevisiae)
ccgtgcaaaa actaactccg agcccgggca tgtcccgggt tagcgggccc
aacaaaggcg 60cttatctggt gggcttccgt agaagaaaaa aagctgttga gcgagctatt
tcgggtatcc 120cagccttctc tgcagaccgc cccagttggc ttggctctgg tgctgttcgt
tagcatcaca 180tcgcctgtga caggcagagg taataacggc ttaaggttct cttcgcatag
tcggcagctt 240tctttcggac gttgaacact caacaaacct tatctagtgc ccaaccaggt
gtgcttctac 300gagtcttgct cactcagaca cacctatccc tattgttacg gctatgggga
tggcacacaa 360aggtggaaat aatagtagtt aacaatatat gcagcaaatc atcggctcct
ggctcatcga 420gtcttgcaaa tcagcatata catatatata tgggggcaga tcttgattca
tttattgttc 480tatttccatc tttcctactt ctgtttccgt ttatattttg tattacgtag
aatagaacat 540catagtaata gatagttgtg gtgatcatat tataaacagc actaaaacat
tacaacaaag 600
SEQ ID NO: 99 (34 nt, DNA, Artificial sequence, Primer)
caacaaaagc ttccgtgcaa aaactaactc cgag 34
SEQ ID NO: 100 (36 nt, DNA, Artificial sequence, Primer)
tttgattcta gactttgttg taatgtttta gtgctg 36
SEQ ID NO: 101 (1765 nt, DNA, Saccharomyces cerevisiae)
atggttcatt
taggtccaaa aaaaccacaa gccagaaagg gttccatggc cgatgtgcca 60aaggaattga
tgcaacaaat tgagaatttt gaaaaaattt tcactgttcc aactgaaact 120ttacaagccg
ttaccaagca cttcatttcc gaattggaaa agggtttgtc caagaagggt 180ggtaacattc
caatgattcc aggttgggtt atggatttcc caactggtaa ggaatccggt 240gatttcttgg
ccattgattt gggtggtacc aacttgagag ttgtcttagt caagttgggc 300ggtgaccgta
cctttgacac cactcaatct aagtacagat taccagatgc tatgagaact 360actcaaaatc
cagacgaatt gtgggaattt attgccgact ctttgaaagc ttttattgat 420gagcaattcc
cacaaggtat ctctgagcca attccattgg gtttcacctt ttctttccca 480gcttctcaaa
acaaaatcaa tgaaggtatc ttgcaaagat ggactaaagg ttttgatatt 540ccaaacattg
aaaaccacga tgttgttcca atgttgcaaa agcaaatcac taagaggaat 600atcccaattg
aagttgttgc tttgataaac gacactaccg gtactttggt tgcttcttac 660tacactgacc
cagaaactaa gatgggtgtt atcttcggta ctggtgtcaa tggtgcttac 720tacgatgttt
gttccgatat cgaaaagcta caaggaaaac tatctgatga cattccacca 780tctgctccaa
tggccatcaa ctgtgaatac ggttccttcg ataatgaaca tgtcgttttg 840ccaagaacta
aatacgatat caccattgat gaagaatctc caagaccagg ccaacaaacc 900tttgaaaaaa
tgtcttctgg ttactactta ggtgaaattt tgcgtttggc cttgatggac 960atgtacaaac
aaggtttcat cttcaagaac caagacttgt ctaagttcga caagcctttc 1020gtcatggaca
cttcttaccc agccagaatc gaggaagatc cattcgagaa cctagaagat 1080accgatgact
tgttccaaaa tgagttcggt atcaacacta ctgttcaaga acgtaaattg 1140atcagacgtt
tatctgaatt gattggtgct agagctgcta gattgtccgt ttgtggtatt 1200gctgctatct
gtcaaaagag aggttacaag accggtcaca tcgctgcaga cggttccgtt 1260tacaacagat
acccaggttt caaagaaaag gctgccaatg ctttgaagga catttacggc 1320tggactcaaa
cctcactaga cgactaccca atcaagattg ttcctgctga agatggttcc 1380ggtgctggtg
ccgctgttat tgctgctttg gcccaaaaaa gaattgctga aggtaagtcc 1440gttggtatca
tcggtgctta aacttaattt gtaaattaag tttgaacaac aagaaggtgc 1500cctttttcta
cttatgtgaa catgttttct atgatctttt tttttcttac ttttacaact 1560gtgatattgt
ataaactttg ttagaaattc acgggattta ttcgtgacga taaatattta 1620tatagacaaa
gaatatgacg atttatgaaa tctacatgat tttagtttct tttaacaatt 1680gctcgttttt
ttctcttgct taattttaaa tttttttggt agtaaaagat gcttatataa 1740ggatttcgta
tttattgttc aagta
1765
SEQ ID NO: 102 (4236 nt, DNA, artificial sequence, Synthetic construct)
gatccgcatt
gcggattacg tattctaatg ttcagataac ttcgtatagc atacattata 60cgaagttatg
cagattgtac tgagagtgca ccataccaca gcttttcaat tcaattcatc 120attttttttt
tattcttttt tttgatttcg gtttctttga aatttttttg attcggtaat 180ctccgaacag
aaggaagaac gaaggaagga gcacagactt agattggtat atatacgcat 240atgtagtgtt
gaagaaacat gaaattgccc agtattctta acccaactgc acagaacaaa 300aacctgcagg
aaacgaagat aaatcatgtc gaaagctaca tataaggaac gtgctgctac 360tcatcctagt
cctgttgctg ccaagctatt taatatcatg cacgaaaagc aaacaaactt 420gtgtgcttca
ttggatgttc gtaccaccaa ggaattactg gagttagttg aagcattagg 480tcccaaaatt
tgtttactaa aaacacatgt ggatatcttg actgattttt ccatggaggg 540cacagttaag
ccgctaaagg cattatccgc caagtacaat tttttactct tcgaagacag 600aaaatttgct
gacattggta atacagtcaa attgcagtac tctgcgggtg tatacagaat 660agcagaatgg
gcagacatta cgaatgcaca cggtgtggtg ggcccaggta ttgttagcgg 720tttgaagcag
gcggcagaag aagtaacaaa ggaacctaga ggccttttga tgttagcaga 780attgtcatgc
aagggctccc tatctactgg agaatatact aagggtactg ttgacattgc 840gaagagcgac
aaagattttg ttatcggctt tattgctcaa agagacatgg gtggaagaga 900tgaaggttac
gattggttga ttatgacacc cggtgtgggt ttagatgaca agggagacgc 960attgggtcaa
cagtatagaa ccgtggatga tgtggtctct acaggatctg acattattat 1020tgttggaaga
ggactatttg caaagggaag ggatgctaag gtagagggtg aacgttacag 1080aaaagcaggc
tgggaagcat atttgagaag atgcggccag caaaactaaa aaactgtatt 1140ataagtaaat
gcatgtatac taaactcaca aattagagct tcaatttaat tatatcagtt 1200attaccctat
gcggtgtgaa ataccgcaca gatgcgtaag gagaaaatac cgcatcagga 1260aattgtaaac
gttaatattt tgttaaaatt cgcgttaaat ttttgttaaa tcagctcatt 1320ttttaaccaa
taggccgaaa tcggcaaaat cccttataaa tcaaaagaat agaccgagat 1380agggttgagt
gttgttccag tttggaacaa gagtccacta ttaaagaacg tggactccaa 1440cgtcaaaggg
cgaaaaaccg tctatcaggg cgatggccca ctacgtgaac catcacccta 1500atcaagataa
cttcgtatag catacattat acgaagttat ccagtgatga tacaacgagt 1560tagccaaggt
gaattcactg gccgtcgttt tacaacgtcg tgactgggaa aaccctggcg 1620ttacccaact
taatcgcctt gcagcacatc cccctttcgc cagctggcgt aatagcgaag 1680aggcccgcac
cgatcgccct tcccaacagt tgcgcagcct gaatggcgaa tggcgcctga 1740tgcggtattt
tctccttacg catctgtgcg gtatttcaca ccgcatatgg tgcactctca 1800gtacaatctg
ctctgatgcc gcatagttaa gccagccccg acacccgcca acacccgctg 1860acgcgccctg
acgggcttgt ctgctcccgg catccgctta cagacaagct gtgaccgtct 1920ccgggagctg
catgtgtcag aggttttcac cgtcatcacc gaaacgcgcg agacgaaagg 1980gcctcgtgat
acgcctattt ttataggtta atgtcatgat aataatggtt tcttagacgt 2040caggtggcac
ttttcgggga aatgtgcgcg gaacccctat ttgtttattt ttctaaatac 2100attcaaatat
gtatccgctc atgagacaat aaccctgata aatgcttcaa taatattgaa 2160aaaggaagag
tatgagtatt caacatttcc gtgtcgccct tattcccttt tttgcggcat 2220tttgccttcc
tgtttttgct cacccagaaa cgctggtgaa agtaaaagat gctgaagatc 2280agttgggtgc
acgagtgggt tacatcgaac tggatctcaa cagcggtaag atccttgaga 2340gttttcgccc
cgaagaacgt tttccaatga tgagcacttt taaagttctg ctatgtggcg 2400cggtattatc
ccgtattgac gccgggcaag agcaactcgg tcgccgcata cactattctc 2460agaatgactt
ggttgagtac tcaccagtca cagaaaagca tcttacggat ggcatgacag 2520taagagaatt
atgcagtgct gccataacca tgagtgataa cactgcggcc aacttacttc 2580tgacaacgat
cggaggaccg aaggagctaa ccgctttttt gcacaacatg ggggatcatg 2640taactcgcct
tgatcgttgg gaaccggagc tgaatgaagc cataccaaac gacgagcgtg 2700acaccacgat
gcctgtagca atggcaacaa cgttgcgcaa actattaact ggcgaactac 2760ttactctagc
ttcccggcaa caattaatag actggatgga ggcggataaa gttgcaggac 2820cacttctgcg
ctcggccctt ccggctggct ggtttattgc tgataaatct ggagccggtg 2880agcgtgggtc
tcgcggtatc attgcagcac tggggccaga tggtaagccc tcccgtatcg 2940tagttatcta
cacgacgggg agtcaggcaa ctatggatga acgaaataga cagatcgctg 3000agataggtgc
ctcactgatt aagcattggt aactgtcaga ccaagtttac tcatatatac 3060tttagattga
tttaaaactt catttttaat ttaaaaggat ctaggtgaag atcctttttg 3120ataatctcat
gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg tcagaccccg 3180tagaaaagat
caaaggatct tcttgagatc ctttttttct gcgcgtaatc tgctgcttgc 3240aaacaaaaaa
accaccgcta ccagcggtgg tttgtttgcc ggatcaagag ctaccaactc 3300tttttccgaa
ggtaactggc ttcagcagag cgcagatacc aaatactgtc cttctagtgt 3360agccgtagtt
aggccaccac ttcaagaact ctgtagcacc gcctacatac ctcgctctgc 3420taatcctgtt
accagtggct gctgccagtg gcgataagtc gtgtcttacc gggttggact 3480caagacgata
gttaccggat aaggcgcagc ggtcgggctg aacggggggt tcgtgcacac 3540agcccagctt
ggagcgaacg acctacaccg aactgagata cctacagcgt gagctatgag 3600aaagcgccac
gcttcccgaa gggagaaagg cggacaggta tccggtaagc ggcagggtcg 3660gaacaggaga
gcgcacgagg gagcttccag ggggaaacgc ctggtatctt tatagtcctg 3720tcgggtttcg
ccacctctga cttgagcgtc gatttttgtg atgctcgtca ggggggcgga 3780gcctatggaa
aaacgccagc aacgcggcct ttttacggtt cctggccttt tgctggcctt 3840ttgctcacat
gttctttcct gcgttatccc ctgattctgt ggataaccgt attaccgcct 3900ttgagtgagc
tgataccgct cgccgcagcc gaacgaccga gcgcagcgag tcagtgagcg 3960aggaagcgga
agagcgccca atacgcaaac cgcctctccc cgcgcgttgg ccgattcatt 4020aatgcagctg
gcacgacagg tttcccgact ggaaagcggg cagtgagcgc aacgcaatta 4080atgtgagtta
gctcactcat taggcacccc aggctttaca ctttatgctt ccggctcgta 4140tgttgtgtgg
aattgtgagc ggataacaat ttcacacagg aaacagctat gaccatgatt 4200acgccaagct
tgcatgcctg caggtcgact ctagag
4236
SEQ ID NO: 103 (6649 nt, DNA, artificial sequence, Synthetic construct)
ctagactttg
ttgtaatgtt ttagtgctgt ttataatatg atcaccacaa ctatctatta 60ctatgatgtt
ctattctacg taatacaaaa tataaacgga aacagaagta ggaaagatgg 120aaatagaaca
ataaatgaat caagatctgc ccccatatat atatgtatat gctgatttgc 180aagactcgat
gagccaggag ccgatgattt gctgcatata ttgttaacta ctattatttc 240cacctttgtg
tgccatcccc atagccgtaa caatagggat aggtgtgtct gagtgagcaa 300gactcgtaga
agcacacctg gttgggcact agataaggtt tgttgagtgt tcaacgtccg 360aaagaaagct
gccgactatg cgaagagaac cttaagccgt tattacctct gcctgtcaca 420ggcgatgtga
tgctaacgaa cagcaccaga gccaagccaa ctggggcggt ctgcagagaa 480ggctgggata
cccgaaatag ctcgctcaac agcttttttt cttctacgga agcccaccag 540ataagcgcct
ttgttgggcc cgctaacccg ggacatgccc gggctcggag ttagtttttg 600cacggaagct
tggcgtaatc atggtcatag ctgtttcctg tgtgaaattg ttatccgctc 660acaattccac
acaacatacg agccggaagc ataaagtgta aagcctgggg tgcctaatga 720gtgagctaac
tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg 780tcgtgccagc
tgcattaatg aatcggccaa cgcgcgggga gaggcggttt gcgtattggg 840cgctcttccg
cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg 900gtatcagctc
actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga 960aagaacatgt
gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg 1020gcgtttttcc
ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag 1080aggtggcgaa
acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc 1140gtgcgctctc
ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg 1200ggaagcgtgg
cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt 1260cgctccaagc
tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc 1320ggtaactatc
gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc 1380actggtaaca
ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg 1440tggcctaact
acggctacac tagaaggaca gtatttggta tctgcgctct gctgaagcca 1500gttaccttcg
gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc 1560ggtggttttt
ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat 1620cctttgatct
tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt 1680ttggtcatga
gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt 1740tttaaatcaa
tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc 1800agtgaggcac
ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc 1860gtcgtgtaga
taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata 1920ccgcgagacc
cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg 1980gccgagcgca
gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc 2040cgggaagcta
gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct 2100acaggcatcg
tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa 2160cgatcaaggc
gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt 2220cctccgatcg
ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca 2280ctgcataatt
ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac 2340tcaaccaagt
cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca 2400atacgggata
ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt 2460tcttcggggc
gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc 2520actcgtgcac
ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca 2580aaaacaggaa
ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata 2640ctcatactct
tcctttttca atattattga agcatttatc agggttattg tctcatgagc 2700ggatacatat
ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc 2760cgaaaagtgc
cacctgacgt ctaagaaacc attattatca tgacattaac ctataaaaat 2820aggcgtatca
cgaggccctt tcgtctcgcg cgtttcggtg atgacggtga aaacctctga 2880cacatgcagc
tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa 2940gcccgtcagg
gcgcgtcagc gggtgttggc gggtgtcggg gctggcttaa ctatgcggca 3000tcagagcaga
ttgtactgag agtgcaccat atgcggtgtg aaataccgca cagatgcgta 3060aggagaaaat
accgcatcag gcgccattcg ccattcaggc tgcgcaactg ttgggaaggg 3120cgatcggtgc
gggcctcttc gctattacgc cagctggcga aagggggatg tgctgcaagg 3180cgattaagtt
gggtaacgcc agggttttcc cagtcacgac gttgtaaaac gacggccagt 3240gaattcacct
tggctaactc gttgtatcat cactggataa cttcgtataa tgtatgctat 3300acgaagttat
cttgattagg gtgatggttc acgtagtggg ccatcgccct gatagacggt 3360ttttcgccct
ttgacgttgg agtccacgtt ctttaatagt ggactcttgt tccaaactgg 3420aacaacactc
aaccctatct cggtctattc ttttgattta taagggattt tgccgatttc 3480ggcctattgg
ttaaaaaatg agctgattta acaaaaattt aacgcgaatt ttaacaaaat 3540attaacgttt
acaatttcct gatgcggtat tttctcctta cgcatctgtg cggtatttca 3600caccgcatag
ggtaataact gatataatta aattgaagct ctaatttgtg agtttagtat 3660acatgcattt
acttataata cagtttttta gttttgctgg ccgcatcttc tcaaatatgc 3720ttcccagcct
gcttttctgt aacgttcacc ctctacctta gcatcccttc cctttgcaaa 3780tagtcctctt
ccaacaataa taatgtcaga tcctgtagag accacatcat ccacggttct 3840atactgttga
cccaatgcgt ctcccttgtc atctaaaccc acaccgggtg tcataatcaa 3900ccaatcgtaa
ccttcatctc ttccacccat gtctctttga gcaataaagc cgataacaaa 3960atctttgtcg
ctcttcgcaa tgtcaacagt acccttagta tattctccag tagataggga 4020gcccttgcat
gacaattctg ctaacatcaa aaggcctcta ggttcctttg ttacttcttc 4080tgccgcctgc
ttcaaaccgc taacaatacc tgggcccacc acaccgtgtg cattcgtaat 4140gtctgcccat
tctgctattc tgtatacacc cgcagagtac tgcaatttga ctgtattacc 4200aatgtcagca
aattttctgt cttcgaagag taaaaaattg tacttggcgg ataatgcctt 4260tagcggctta
actgtgccct ccatggaaaa atcagtcaag atatccacat gtgtttttag 4320taaacaaatt
ttgggaccta atgcttcaac taactccagt aattccttgg tggtacgaac 4380atccaatgaa
gcacacaagt ttgtttgctt ttcgtgcatg atattaaata gcttggcagc 4440aacaggacta
ggatgagtag cagcacgttc cttatatgta gctttcgaca tgatttatct 4500tcgtttcctg
caggtttttg ttctgtgcag ttgggttaag aatactgggc aatttcatgt 4560ttcttcaaca
ctacatatgc gtatatatac caatctaagt ctgtgctcct tccttcgttc 4620ttccttctgt
tcggagatta ccgaatcaaa aaaatttcaa agaaaccgaa atcaaaaaaa 4680agaataaaaa
aaaaatgatg aattgaattg aaaagctgtg gtatggtgca ctctcagtac 4740aatctgcata
acttcgtata atgtatgcta tacgaagtta tctgaacatt agaatacgta 4800atccgcaatg
cggatccgga agtgtagaga gggttaaaat tggcgtgcaa ttttatgaag 4860aataaagaca
tctagtcttt aaatacttga acaataaata cgaaatcctt atataagcat 4920cttttactac
caaaaaaatt taaaattaag caagagaaaa aaacgagcaa ttgttaaaag 4980aaactaaaat
catgtagatt tcataaatcg tcatattctt tgtctatata aatatttatc 5040gtcacgaata
aatcccgtga atttctaaca aagtttatac aatatcacag ttgtaaaagt 5100aagaaaaaaa
aagatcatag aaaacatgtt cacataagta gaaaaagggc accttcttgt 5160tgttcaaact
taatttacaa attaagttta agcaccgatg ataccaacgg acttaccttc 5220agcaattctt
ttttgggcca aagcagcaat aacagcggca ccagcaccgg aaccatcttc 5280agcaggaaca
atcttgattg ggtagtcgtc tagtgaggtt tgagtccagc cgtaaatgtc 5340cttcaaagca
ttggcagcct tttctttgaa acctgggtat ctgttgtaaa cggaaccgtc 5400tgcagcgatg
tgaccggtct tgtaacctct cttttgacag atagcagcaa taccacaaac 5460ggacaatcta
gcagctctag caccaatcaa ttcagataaa cgtctgatca atttacgttc 5520ttgaacagta
gtgttgatac cgaactcatt ttggaacaag tcatcggtat cttctaggtt 5580ctcgaatgga
tcttcctcga ttctggctgg gtaagaagtg tccatgacga aaggcttgtc 5640gaacttagac
aagtcttggt tcttgaagat gaaaccttgt ttgtacatgt ccatcaaggc 5700caaacgcaaa
atttcaccta agtagtaacc agaagacatt ttttcaaagg tttgttggcc 5760tggtcttgga
gattcttcat caatggtgat atcgtattta gttcttggca aaacgacatg 5820ttcattatcg
aaggaaccgt attcacagtt gatggccatt ggagcagatg gtggaatgtc 5880atcagatagt
tttccttgta gcttttcgat atcggaacaa acatcgtagt aagcaccatt 5940gacaccagta
ccgaagataa cacccatctt agtttctggg tcagtgtagt aagaagcaac 6000caaagtaccg
gtagtgtcgt ttatcaaagc aacaacttca attgggatat tcctcttagt 6060gatttgcttt
tgcaacattg gaacaacatc gtggttttca atgtttggaa tatcaaaacc 6120tttagtccat
ctttgcaaga taccttcatt gattttgttt tgagaagctg ggaaagaaaa 6180ggtgaaaccc
aatggaattg gctcagagat accttgtggg aattgctcat caataaaagc 6240tttcaaagag
tcggcaataa attcccacaa ttcgtctgga ttttgagtag ttctcatagc 6300atctggtaat
ctgtacttag attgagtggt gtcaaaggta cggtcaccgc ccaacttgac 6360taagacaact
ctcaagttgg taccacccaa atcaatggcc aagaaatcac cggattcctt 6420accagttggg
aaatccataa cccaacctgg aatcattgga atgttaccac ccttcttgga 6480caaacccttt
tccaattcgg aaatgaagtg cttggtaacg gcttgtaaag tttcagttgg 6540aacagtgaaa
attttttcaa aattctcaat ttgttgcatc aattcctttg gcacatcggc 6600catggaaccc
tttctggctt gtggtttttt tggacctaaa tgaaccatt
6649
104 38 DNA Artificial sequence Primer
104 cactaaatct agaatggttc gtttaggtcc aaagaagc 38
105 42 DNA Artificial sequence Primer
105 tttggatgga tccctattcg cctttaatac caacagactt ac 42
106 6276 DNA Artificial sequence Synthetic construct
106 ctagactttg
ttgtaatgtt ttagtgctgt ttataatatg atcaccacaa ctatctatta 60ctatgatgtt
ctattctacg taatacaaaa tataaacgga aacagaagta ggaaagatgg 120aaatagaaca
ataaatgaat caagatctgc ccccatatat atatgtatat gctgatttgc 180aagactcgat
gagccaggag ccgatgattt gctgcatata ttgttaacta ctattatttc 240cacctttgtg
tgccatcccc atagccgtaa caatagggat aggtgtgtct gagtgagcaa 300gactcgtaga
agcacacctg gttgggcact agataaggtt tgttgagtgt tcaacgtccg 360aaagaaagct
gccgactatg cgaagagaac cttaagccgt tattacctct gcctgtcaca 420ggcgatgtga
tgctaacgaa cagcaccaga gccaagccaa ctggggcggt ctgcagagaa 480ggctgggata
cccgaaatag ctcgctcaac agcttttttt cttctacgga agcccaccag 540ataagcgcct
ttgttgggcc cgctaacccg ggacatgccc gggctcggag ttagtttttg 600cacggaagct
tggcgtaatc atggtcatag ctgtttcctg tgtgaaattg ttatccgctc 660acaattccac
acaacatacg agccggaagc ataaagtgta aagcctgggg tgcctaatga 720gtgagctaac
tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg 780tcgtgccagc
tgcattaatg aatcggccaa cgcgcgggga gaggcggttt gcgtattggg 840cgctcttccg
cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg 900gtatcagctc
actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga 960aagaacatgt
gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg 1020gcgtttttcc
ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag 1080aggtggcgaa
acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc 1140gtgcgctctc
ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg 1200ggaagcgtgg
cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt 1260cgctccaagc
tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc 1320ggtaactatc
gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc 1380actggtaaca
ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg 1440tggcctaact
acggctacac tagaaggaca gtatttggta tctgcgctct gctgaagcca 1500gttaccttcg
gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc 1560ggtggttttt
ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat 1620cctttgatct
tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt 1680ttggtcatga
gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt 1740tttaaatcaa
tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc 1800agtgaggcac
ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc 1860gtcgtgtaga
taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata 1920ccgcgagacc
cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg 1980gccgagcgca
gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc 2040cgggaagcta
gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct 2100acaggcatcg
tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa 2160cgatcaaggc
gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt 2220cctccgatcg
ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca 2280ctgcataatt
ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac 2340tcaaccaagt
cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca 2400atacgggata
ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt 2460tcttcggggc
gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc 2520actcgtgcac
ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca 2580aaaacaggaa
ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata 2640ctcatactct
tcctttttca atattattga agcatttatc agggttattg tctcatgagc 2700ggatacatat
ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc 2760cgaaaagtgc
cacctgacgt ctaagaaacc attattatca tgacattaac ctataaaaat 2820aggcgtatca
cgaggccctt tcgtctcgcg cgtttcggtg atgacggtga aaacctctga 2880cacatgcagc
tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa 2940gcccgtcagg
gcgcgtcagc gggtgttggc gggtgtcggg gctggcttaa ctatgcggca 3000tcagagcaga
ttgtactgag agtgcaccat atgcggtgtg aaataccgca cagatgcgta 3060aggagaaaat
accgcatcag gcgccattcg ccattcaggc tgcgcaactg ttgggaaggg 3120cgatcggtgc
gggcctcttc gctattacgc cagctggcga aagggggatg tgctgcaagg 3180cgattaagtt
gggtaacgcc agggttttcc cagtcacgac gttgtaaaac gacggccagt 3240gaattcacct
tggctaactc gttgtatcat cactggataa cttcgtataa tgtatgctat 3300acgaagttat
cttgattagg gtgatggttc acgtagtggg ccatcgccct gatagacggt 3360ttttcgccct
ttgacgttgg agtccacgtt ctttaatagt ggactcttgt tccaaactgg 3420aacaacactc
aaccctatct cggtctattc ttttgattta taagggattt tgccgatttc 3480ggcctattgg
ttaaaaaatg agctgattta acaaaaattt aacgcgaatt ttaacaaaat 3540attaacgttt
acaatttcct gatgcggtat tttctcctta cgcatctgtg cggtatttca 3600caccgcatag
ggtaataact gatataatta aattgaagct ctaatttgtg agtttagtat 3660acatgcattt
acttataata cagtttttta gttttgctgg ccgcatcttc tcaaatatgc 3720ttcccagcct
gcttttctgt aacgttcacc ctctacctta gcatcccttc cctttgcaaa 3780tagtcctctt
ccaacaataa taatgtcaga tcctgtagag accacatcat ccacggttct 3840atactgttga
cccaatgcgt ctcccttgtc atctaaaccc acaccgggtg tcataatcaa 3900ccaatcgtaa
ccttcatctc ttccacccat gtctctttga gcaataaagc cgataacaaa 3960atctttgtcg
ctcttcgcaa tgtcaacagt acccttagta tattctccag tagataggga 4020gcccttgcat
gacaattctg ctaacatcaa aaggcctcta ggttcctttg ttacttcttc 4080tgccgcctgc
ttcaaaccgc taacaatacc tgggcccacc acaccgtgtg cattcgtaat 4140gtctgcccat
tctgctattc tgtatacacc cgcagagtac tgcaatttga ctgtattacc 4200aatgtcagca
aattttctgt cttcgaagag taaaaaattg tacttggcgg ataatgcctt 4260tagcggctta
actgtgccct ccatggaaaa atcagtcaag atatccacat gtgtttttag 4320taaacaaatt
ttgggaccta atgcttcaac taactccagt aattccttgg tggtacgaac 4380atccaatgaa
gcacacaagt ttgtttgctt ttcgtgcatg atattaaata gcttggcagc 4440aacaggacta
ggatgagtag cagcacgttc cttatatgta gctttcgaca tgatttatct 4500tcgtttcctg
caggtttttg ttctgtgcag ttgggttaag aatactgggc aatttcatgt 4560ttcttcaaca
ctacatatgc gtatatatac caatctaagt ctgtgctcct tccttcgttc 4620ttccttctgt
tcggagatta ccgaatcaaa aaaatttcaa agaaaccgaa atcaaaaaaa 4680agaataaaaa
aaaaatgatg aattgaattg aaaagctgtg gtatggtgca ctctcagtac 4740aatctgcata
acttcgtata atgtatgcta tacgaagtta tctgaacatt agaatacgta 4800atccgcaatg
cggatcccta ttcgccttta ataccaacag acttaccggc agccaatctc 4860ttttgagtca
aacaagcaat gatagcagca ccaacaccgg aaccatcttc agcagccacc 4920aattggattg
ggtggtcttc catcttttcg acatcccagt tgtagatatc cttcaaggct 4980tgagcggcct
tttccttgta acctgggtat ctgttgaaga cagaaccatc agctgcaatg 5040tgagcagtct
tgtagcctct cttgtcacag atagcagaaa caccacaaac agtcaatctt 5100gcagctcttg
ttccgaccaa ttcggctaat tttctaatca actttctctc aacaacggta 5160gtttcgatgt
tcaagttagt cttgaacaga tcgtcagtgt cttccaagtt ttcgaatgga 5220tcatcttcga
tcttagatgg ataactggtg tccatgacgt aagcctcttt caacttggag 5280atatcttggt
ccttaaagat gaaaccactg tcgtacaagt ccaatagtac tagacgcatg 5340atttcaccta
gatagtaacc agaagtcatc ttttcgaaag cttgttgacc tggtcttgga 5400gattcttcat
cgattataac atcgtatttg gttcttggca acaccaaatg ttcgttatcg 5460aaggaaccat
attcacagtt gattgccatt ggagaatctg gaccgatatc ttctggcaac 5520aaaccttcca
atttctcaat accagaaaca acatcgtagt aagcaccgtt gacaccagta 5580ccgataatga
tacccatctt agtttgagga tcagtgtaca aagaggcaac caaggtacca 5640gtggtatcgt
tgatcaatcg aacgacattg attgggatat tcagcttttc aatctgttct 5700tgtagcattg
gaacaacatc gtgaccttca acaccttcaa tatcgaaacc cttggtccaa 5760cgttgcaaca
caccggaatt gatcttcttt tgagatgcag ggtatgagaa agtgaaaccc 5820aatggcaatg
gttcagaaac accatctggg taccattcat cgacgaattc cttcaaacac 5880tttgcaataa
atgaccacaa ttgttcagaa gtaccagttc tcaaatggtc tggtaatctg 5940tacttgtttt
gagtggtgtc gaaatcatga ttaccaccca atttaaccaa cacaactctc 6000aagttggtac
cacccaaatc aagagctaag aaatcaccag tttccttacc agttggatac 6060tcaacaaccc
aacctggaat cataggaatg ttaccaccct ttttggacaa acctttgtcc 6120aattcactga
tgaaatgctt gacaatgctt ctcatttttt ctgaagagac ggtgaacaaa 6180gtttccaaac
cgtggatttg ttccatcaaa ttagctggca catctgccat ggaccccttt 6240ctggctggag
gcttctttgg acctaaacga accatt
6276
107 65 DNA Artificial sequence Primer
107 gtgagtatac gtgattaagc acacaaaggc agcttggagt cacacaggaa acagctatga 60
ccatg 65
108 66 DNA Artificial sequence Primer
108 gtgcacaaac aatacttaaa taaatactac tcagtaataa cgtcacgacg ttgtaaaacg 60
acggcc 66
109 30 DNA Artificial sequence Primer
109 ctcttcaaca agtttgattc cattgcggtg 30
110 7523 DNA Artificial sequence Synthetic construct
110 ccagcttttg ttccctttag tgagggttaa
ttgcgcgctt ggcgtaatca tggtcatagc 60tgtttcctgt gtgaaattgt tatccgctca
caattccaca caacatagga gccggaagca 120taaagtgtaa agcctggggt gcctaatgag
tgaggtaact cacattaatt gcgttgcgct 180cactgcccgc tttccagtcg ggaaacctgt
cgtgccagct gcattaatga atcggccaac 240gcgcggggag aggcggtttg cgtattgggc
gctcttccgc ttcctcgctc actgactcgc 300tgcgctcggt cgttcggctg cggcgagcgg
tatcagctca ctcaaaggcg gtaatacggt 360tatccacaga atcaggggat aacgcaggaa
agaacatgtg agcaaaaggc cagcaaaagg 420ccaggaaccg taaaaaggcc gcgttgctgg
cgtttttcca taggctccgc ccccctgacg 480agcatcacaa aaatcgacgc tcaagtcaga
ggtggcgaaa cccgacagga ctataaagat 540accaggcgtt tccccctgga agctccctcg
tgcgctctcc tgttccgacc ctgccgctta 600ccggatacct gtccgccttt ctcccttcgg
gaagcgtggc gctttctcat agctcacgct 660gtaggtatct cagttcggtg taggtcgttc
gctccaagct gggctgtgtg cacgaacccc 720ccgttcagcc cgaccgctgc gccttatccg
gtaactatcg tcttgagtcc aacccggtaa 780gacacgactt atcgccactg gcagcagcca
ctggtaacag gattagcaga gcgaggtatg 840taggcggtgc tacagagttc ttgaagtggt
ggcctaacta cggctacact agaaggacag 900tatttggtat ctgcgctctg ctgaagccag
ttaccttcgg aaaaagagtt ggtagctctt 960gatccggcaa acaaaccacc gctggtagcg
gtggtttttt tgtttgcaag cagcagatta 1020cgcgcagaaa aaaaggatct caagaagatc
ctttgatctt ttctacgggg tctgacgctc 1080agtggaacga aaactcacgt taagggattt
tggtcatgag attatcaaaa aggatcttca 1140cctagatcct tttaaattaa aaatgaagtt
ttaaatcaat ctaaagtata tatgagtaaa 1200cttggtctga cagttaccaa tgcttaatca
gtgaggcacc tatctcagcg atctgtctat 1260ttcgttcatc catagttgcc tgactccccg
tcgtgtagat aactacgata cgggagggct 1320taccatctgg ccccagtgct gcaatgatac
cgcgagaccc acgctcaccg gctccagatt 1380tatcagcaat aaaccagcca gccggaaggg
ccgagcgcag aagtggtcct gcaactttat 1440ccgcctccat ccagtctatt aattgttgcc
gggaagctag agtaagtagt tcgccagtta 1500atagtttgcg caacgttgtt gccattgcta
caggcatcgt ggtgtcacgc tcgtcgtttg 1560gtatggcttc attcagctcc ggttcccaac
gatcaaggcg agttacatga tcccccatgt 1620tgtgcaaaaa agcggttagc tccttcggtc
ctccgatcgt tgtcagaagt aagttggccg 1680cagtgttatc actcatggtt atggcagcac
tgcataattc tcttactgtc atgccatccg 1740taagatgctt ttctgtgact ggtgagtact
caaccaagtc attctgagaa tagtgtatgc 1800ggcgaccgag ttgctcttgc ccggcgtcaa
tacgggataa taccgcgcca catagcagaa 1860ctttaaaagt gctcatcatt ggaaaacgtt
cttcggggcg aaaactctca aggatcttac 1920cgctgttgag atccagttcg atgtaaccca
ctcgtgcacc caactgatct tcagcatctt 1980ttactttcac cagcgtttct gggtgagcaa
aaacaggaag gcaaaatgcc gcaaaaaagg 2040gaataagggc gacacggaaa tgttgaatac
tcatactctt cctttttcaa tattattgaa 2100gcatttatca gggttattgt ctcatgagcg
gatacatatt tgaatgtatt tagaaaaata 2160aacaaatagg ggttccgcgc acatttcccc
gaaaagtgcc acctgaacga agcatctgtg 2220cttcattttg tagaacaaaa atgcaacgcg
agagcgctaa tttttcaaac aaagaatctg 2280agctgcattt ttacagaaca gaaatgcaac
gcgaaagcgc tattttacca acgaagaatc 2340tgtgcttcat ttttgtaaaa caaaaatgca
acgcgagagc gctaattttt caaacaaaga 2400atctgagctg catttttaca gaacagaaat
gcaacgcgag agcgctattt taccaacaaa 2460gaatctatac ttcttttttg ttctacaaaa
atgcatcccg agagcgctat ttttctaaca 2520aagcatctta gattactttt tttctccttt
gtgcgctcta taatgcagtc tcttgataac 2580tttttgcact gtaggtccgt taaggttaga
agaaggctac tttggtgtct attttctctt 2640ccataaaaaa agcctgactc cacttcccgc
gtttactgat tactagcgaa gctgcgggtg 2700cattttttca agataaaggc atccccgatt
atattctata ccgatgtgga ttgcgcatac 2760tttgtgaaca gaaagtgata gcgttgatga
ttcttcattg gtcagaaaat tatgaacggt 2820ttcttctatt ttgtctctat atactacgta
taggaaatgt ttacattttc gtattgtttt 2880cgattcactc tatgaatagt tcttactaca
atttttttgt ctaaagagta atactagaga 2940taaacataaa aaatgtagag gtcgagttta
gatgcaagtt caaggagcga aaggtggatg 3000ggtaggttat atagggatat agcacagaga
tatatagcaa agagatactt ttgagcaatg 3060tttgtggaag cggtattcgc aatattttag
tagctcgtta cagtccggtg cgtttttggt 3120tttttgaaag tgcgtcttca gagcgctttt
ggttttcaaa agcgctctga agttcctata 3180ctttctagag aataggaact tcggaatagg
aacttcaaag cgtttccgaa aacgagcgct 3240tccgaaaatg caacgcgagc tgcgcacata
cagctcactg ttcacgtcgc acctatatct 3300gcgtgttgcc tgtatatata tatacatgag
aagaacggca tagtgcgtgt ttatgcttaa 3360atgcgtactt atatgcgtct atttatgtag
gatgaaaggt agtctagtac ctcctgtgat 3420attatcccat tccatgcggg gtatcgtatg
cttccttcag cactaccctt tagctgttct 3480atatgctgcc actcctcaat tggattagtc
tcatccttca atgctatcat ttcctttgat 3540attggatcat ctaagaaacc attattatca
tgacattaac ctataaaaat aggcgtatca 3600cgaggccctt tcgtctcgcg cgtttcggtg
atgacggtga aaacctctga cacatgcagc 3660tcccggagac ggtcacagct tgtctgtaag
cggatgccgg gagcagacaa gcccgtcagg 3720gcgcgtcagc gggtgttggc gggtgtcggg
gctggcttaa ctatgcggca tcagagcaga 3780ttgtactgag agtgcaccat aaattcccgt
tttaagagct tggtgagcgc taggagtcac 3840tgccaggtat cgtttgaaca cggcattagt
cagggaagtc ataacacagt cctttcccgc 3900aattttcttt ttctattact cttggcctcc
tctagtacac tctatatttt tttatgcctc 3960ggtaatgatt ttcatttttt tttttcccct
agcggatgac tctttttttt tcttagcgat 4020tggcattatc acataatgaa ttatacatta
tataaagtaa tgtgatttct tcgaagaata 4080tactaaaaaa tgagcaggca agataaacga
aggcaaagat gacagagcag aaagccctag 4140taaagcgtat tacaaatgaa accaagattc
agattgcgat ctctttaaag ggtggtcccc 4200tagcgataga gcactcgatc ttcccagaaa
aagaggcaga agcagtagca gaacaggcca 4260cacaatcgca agtgattaac gtccacacag
gtatagggtt tctggaccat atgatacatg 4320ctctggccaa gcattccggc tggtcgctaa
tcgttgagtg cattggtgac ttacacatag 4380acgaccatca caccactgaa gactgcggga
ttgctctcgg tcaagctttt aaagaggccc 4440tactggcgcg tggagtaaaa aggtttggat
caggatttgc gcctttggat gaggcacttt 4500ccagagcggt ggtagatctt tcgaacaggc
cgtacgcagt tgtcgaactt ggtttgcaaa 4560gggagaaagt aggagatctc tcttgcgaga
tgatcccgca ttttcttgaa agctttgcag 4620aggctagcag aattaccctc cacgttgatt
gtctgcgagg caagaatgat catcaccgta 4680gtgagagtgc gttcaaggct cttgcggttg
ccataagaga agccacctcg cccaatggta 4740ccaacgatgt tccctccacc aaaggtgttc
ttatgtagtg acaccgatta tttaaagctg 4800cagcatacga tatatataca tgtgtatata
tgtataccta tgaatgtcag taagtatgta 4860tacgaacagt atgatactga agatgacaag
gtaatgcatc attctatacg tgtcattctg 4920aacgaggcgc gctttccttt tttctttttg
ctttttcttt ttttttctct tgaactcgac 4980ggatctatgc ggtgtgaaat accgcacaga
tgcgtaagga gaaaataccg catcaggaaa 5040ttgtaaacgt taatattttg ttaaaattcg
cgttaaattt ttgttaaatc agctcatttt 5100ttaaccaata ggccgaaatc ggcaaaatcc
cttataaatc aaaagaatag accgagatag 5160ggttgagtgt tgttccagtt tggaacaaga
gtccactatt aaagaacgtg gactccaacg 5220tcaaagggcg aaaaaccgtc tatcagggcg
atggcccact acgtgaacca tcaccctaat 5280caagtttttt ggggtcgagg tgccgtaaag
cactaaatcg gaaccctaaa gggagccccc 5340gatttagagc ttgacgggga aagccggcga
acgtggcgag aaaggaaggg aagaaagcga 5400aaggagcggg cgctagggcg ctggcaagtg
tagcggtcac gctgcgcgta accaccacac 5460ccgccgcgct taatgcgccg ctacagggcg
cgtcgcgcca ttcgccattc aggctgcgca 5520actgttggga agggcgatcg gtgcgggcct
cttcgctatt acgccagctg gcgaaagggg 5580gatgtgctgc aaggcgatta agttgggtaa
cgccagggtt ttcccagtca cgacgttgta 5640aaacgacggc cagtgagcgc gcgtaatacg
actcactata gggcgaattg ggtaccgggc 5700cccccctcga ggtattagaa gccgccgagc
gggcgacagc cctccgacgg aagactctcc 5760tccgtgcgtc ctcgtcttca ccggtcgcgt
tcctgaaacg cagatgtgcc tcgcgccgca 5820ctgctccgaa caataaagat tctacaatac
tagcttttat ggttatgaag aggaaaaatt 5880ggcagtaacc tggccccaca aaccttcaaa
ttaacgaatc aaattaacaa ccataggatg 5940ataatgcgat tagtttttta gccttatttc
tggggtaatt aatcagcgaa gcgatgattt 6000ttgatctatt aacagatata taaatggaaa
agctgcataa ccactttaac taatactttc 6060aacattttca gtttgtatta cttcttattc
aaatgtcata aaagtatcaa caaaaaattg 6120ttaatatacc tctatacttt aacgtcaagg
agaaaaatgt ccaatttact gcccgtacac 6180caaaatttgc ctgcattacc ggtcgatgca
acgagtgatg aggttcgcaa gaacctgatg 6240gacatgttca gggatcgcca ggcgttttct
gagcatacct ggaaaatgct tctgtccgtt 6300tgccggtcgt gggcggcatg gtgcaagttg
aataaccgga aatggtttcc cgcagaacct 6360gaagatgttc gcgattatct tctatatctt
caggcgcgcg gtctggcagt aaaaactatc 6420cagcaacatt tgggccagct aaacatgctt
catcgtcggt ccgggctgcc acgaccaagt 6480gacagcaatg ctgtttcact ggttatgcgg
cggatccgaa aagaaaacgt tgatgccggt 6540gaacgtgcaa aacaggctct agcgttcgaa
cgcactgatt tcgaccaggt tcgttcactc 6600atggaaaata gcgatcgctg ccaggatata
cgtaatctgg catttctggg gattgcttat 6660aacaccctgt tacgtatagc cgaaattgcc
aggatcaggg ttaaagatat ctcacgtact 6720gacggtggga gaatgttaat ccatattggc
agaacgaaaa cgctggttag caccgcaggt 6780gtagagaagg cacttagcct gggggtaact
aaactggtcg agcgatggat ttccgtctct 6840ggtgtagctg atgatccgaa taactacctg
ttttgccggg tcagaaaaaa tggtgttgcc 6900gcgccatctg ccaccagcca gctatcaact
cgcgccctgg aagggatttt tgaagcaact 6960catcgattga tttacggcgc taaggatgac
tctggtcaga gatacctggc ctggtctgga 7020cacagtgccc gtgtcggagc cgcgcgagat
atggcccgcg ctggagtttc aataccggag 7080atcatgcaag ctggtggctg gaccaatgta
aatattgtca tgaactatat ccgtaacctg 7140gatagtgaaa caggggcaat ggtgcgcctg
ctggaagatg gcgattagga gtaagcgaat 7200ttcttatgat ttatgatttt tattattaaa
taagttataa aaaaaataag tgtatacaaa 7260ttttaaagtg actcttaggt tttaaaacga
aaattcttat tcttgagtaa ctctttcctg 7320taggtcaggt tgctttctca ggtatagcat
gaggtcgctc ttattgacca cacctctacc 7380ggcatgccga gcaaatgcct gcaaatcgct
ccccatttca cccaattgta gatatgctaa 7440ctccagcaat gagttgatga atctcggtgt
gtattttatg tcctcagagg acaacacctg 7500tggtccgcca ccgcggtgga gct
7523
111 6747 DNA Artificial sequence Synthetic construct
111 ctagatgtat atgagatagt tgattgtatg
cttggtatag cttgaaatat tgtgcagaaa 60aagaaacaag gaagaaaggg aacgagaaca
atgacgagga aacaaaagat taataattgc 120aggtctattt atacttgata gcaagacagc
aaactttttt ttatttcaaa ttcaagtaac 180tggaaggaag gccgtatacc gttgctcatt
agagagtagt gtgcgtgaat gaaggaagga 240aaaagtttcg tgtgcttcga gatacccctc
atcagctctg gaacaacgac atctgttggt 300gctgtctttg tcgttaattt tttcctttag
tgtcttccat catttttttg tcattgcgga 360tatggtgaga caacaacggg ggagagagaa
aagaaaaaaa aagaaaagaa gttgcatgcg 420cctattatta cttcaataga tggcaaatgg
aaaaagggta gtgaaacttc gatatgatga 480tggctatcaa gtctagggct acagtattag
ttcgttatgt accaccatca atgaggcagt 540gtaattggtg tagtcttgtt tagcccatta
tgtcttgtct ggtatctgtt ctattgtata 600tctcccctcc gccacctaca tgttagggag
accaacgaag gtattatagg aatcccgatg 660tatgggtttg gttgccagaa aagaggaagt
ccatattgta cacaagcttg gcgtaatcat 720ggtcatagct gtttcctgtg tgaaattgtt
atccgctcac aattccacac aacatacgag 780ccggaagcat aaagtgtaaa gcctggggtg
cctaatgagt gagctaactc acattaattg 840cgttgcgctc actgcccgct ttccagtcgg
gaaacctgtc gtgccagctg cattaatgaa 900tcggccaacg cgcggggaga ggcggtttgc
gtattgggcg ctcttccgct tcctcgctca 960ctgactcgct gcgctcggtc gttcggctgc
ggcgagcggt atcagctcac tcaaaggcgg 1020taatacggtt atccacagaa tcaggggata
acgcaggaaa gaacatgtga gcaaaaggcc 1080agcaaaaggc caggaaccgt aaaaaggccg
cgttgctggc gtttttccat aggctccgcc 1140cccctgacga gcatcacaaa aatcgacgct
caagtcagag gtggcgaaac ccgacaggac 1200tataaagata ccaggcgttt ccccctggaa
gctccctcgt gcgctctcct gttccgaccc 1260tgccgcttac cggatacctg tccgcctttc
tcccttcggg aagcgtggcg ctttctcata 1320gctcacgctg taggtatctc agttcggtgt
aggtcgttcg ctccaagctg ggctgtgtgc 1380acgaaccccc cgttcagccc gaccgctgcg
ccttatccgg taactatcgt cttgagtcca 1440acccggtaag acacgactta tcgccactgg
cagcagccac tggtaacagg attagcagag 1500cgaggtatgt aggcggtgct acagagttct
tgaagtggtg gcctaactac ggctacacta 1560gaaggacagt atttggtatc tgcgctctgc
tgaagccagt taccttcgga aaaagagttg 1620gtagctcttg atccggcaaa caaaccaccg
ctggtagcgg tggttttttt gtttgcaagc 1680agcagattac gcgcagaaaa aaaggatctc
aagaagatcc tttgatcttt tctacggggt 1740ctgacgctca gtggaacgaa aactcacgtt
aagggatttt ggtcatgaga ttatcaaaaa 1800ggatcttcac ctagatcctt ttaaattaaa
aatgaagttt taaatcaatc taaagtatat 1860atgagtaaac ttggtctgac agttaccaat
gcttaatcag tgaggcacct atctcagcga 1920tctgtctatt tcgttcatcc atagttgcct
gactccccgt cgtgtagata actacgatac 1980gggagggctt accatctggc cccagtgctg
caatgatacc gcgagaccca cgctcaccgg 2040ctccagattt atcagcaata aaccagccag
ccggaagggc cgagcgcaga agtggtcctg 2100caactttatc cgcctccatc cagtctatta
attgttgccg ggaagctaga gtaagtagtt 2160cgccagttaa tagtttgcgc aacgttgttg
ccattgctac aggcatcgtg gtgtcacgct 2220cgtcgtttgg tatggcttca ttcagctccg
gttcccaacg atcaaggcga gttacatgat 2280cccccatgtt gtgcaaaaaa gcggttagct
ccttcggtcc tccgatcgtt gtcagaagta 2340agttggccgc agtgttatca ctcatggtta
tggcagcact gcataattct cttactgtca 2400tgccatccgt aagatgcttt tctgtgactg
gtgagtactc aaccaagtca ttctgagaat 2460agtgtatgcg gcgaccgagt tgctcttgcc
cggcgtcaat acgggataat accgcgccac 2520atagcagaac tttaaaagtg ctcatcattg
gaaaacgttc ttcggggcga aaactctcaa 2580ggatcttacc gctgttgaga tccagttcga
tgtaacccac tcgtgcaccc aactgatctt 2640cagcatcttt tactttcacc agcgtttctg
ggtgagcaaa aacaggaagg caaaatgccg 2700caaaaaaggg aataagggcg acacggaaat
gttgaatact catactcttc ctttttcaat 2760attattgaag catttatcag ggttattgtc
tcatgagcgg atacatattt gaatgtattt 2820agaaaaataa acaaataggg gttccgcgca
catttccccg aaaagtgcca cctgacgtct 2880aagaaaccat tattatcatg acattaacct
ataaaaatag gcgtatcacg aggccctttc 2940gtctcgcgcg tttcggtgat gacggtgaaa
acctctgaca catgcagctc ccggagacgg 3000tcacagcttg tctgtaagcg gatgccggga
gcagacaagc ccgtcagggc gcgtcagcgg 3060gtgttggcgg gtgtcggggc tggcttaact
atgcggcatc agagcagatt gtactgagag 3120tgcaccatat gcggtgtgaa ataccgcaca
gatgcgtaag gagaaaatac cgcatcaggc 3180gccattcgcc attcaggctg cgcaactgtt
gggaagggcg atcggtgcgg gcctcttcgc 3240tattacgcca gctggcgaaa gggggatgtg
ctgcaaggcg attaagttgg gtaacgccag 3300ggttttccca gtcacgacgt tgtaaaacga
cggccagtga attcaccttg gctaactcgt 3360tgtatcatca ctggataact tcgtataatg
tatgctatac gaagttatct tgattagggt 3420gatggttcac gtagtgggcc atcgccctga
tagacggttt ttcgcccttt gacgttggag 3480tccacgttct ttaatagtgg actcttgttc
caaactggaa caacactcaa ccctatctcg 3540gtctattctt ttgatttata agggattttg
ccgatttcgg cctattggtt aaaaaatgag 3600ctgatttaac aaaaatttaa cgcgaatttt
aacaaaatat taacgtttac aatttcctga 3660tgcggtattt tctccttacg catctgtgcg
gtatttcaca ccgcataggg taataactga 3720tataattaaa ttgaagctct aatttgtgag
tttagtatac atgcatttac ttataataca 3780gttttttagt tttgctggcc gcatcttctc
aaatatgctt cccagcctgc ttttctgtaa 3840cgttcaccct ctaccttagc atcccttccc
tttgcaaata gtcctcttcc aacaataata 3900atgtcagatc ctgtagagac cacatcatcc
acggttctat actgttgacc caatgcgtct 3960cccttgtcat ctaaacccac accgggtgtc
ataatcaacc aatcgtaacc ttcatctctt 4020ccacccatgt ctctttgagc aataaagccg
ataacaaaat ctttgtcgct cttcgcaatg 4080tcaacagtac ccttagtata ttctccagta
gatagggagc ccttgcatga caattctgct 4140aacatcaaaa ggcctctagg ttcctttgtt
acttcttctg ccgcctgctt caaaccgcta 4200acaatacctg ggcccaccac accgtgtgca
ttcgtaatgt ctgcccattc tgctattctg 4260tatacacccg cagagtactg caatttgact
gtattaccaa tgtcagcaaa ttttctgtct 4320tcgaagagta aaaaattgta cttggcggat
aatgccttta gcggcttaac tgtgccctcc 4380atggaaaaat cagtcaagat atccacatgt
gtttttagta aacaaatttt gggacctaat 4440gcttcaacta actccagtaa ttccttggtg
gtacgaacat ccaatgaagc acacaagttt 4500gtttgctttt cgtgcatgat attaaatagc
ttggcagcaa caggactagg atgagtagca 4560gcacgttcct tatatgtagc tttcgacatg
atttatcttc gtttcctgca ggtttttgtt 4620ctgtgcagtt gggttaagaa tactgggcaa
tttcatgttt cttcaacact acatatgcgt 4680atatatacca atctaagtct gtgctccttc
cttcgttctt ccttctgttc ggagattacc 4740gaatcaaaaa aatttcaaag aaaccgaaat
caaaaaaaag aataaaaaaa aaatgatgaa 4800ttgaattgaa aagctgtggt atggtgcact
ctcagtacaa tctgcataac ttcgtataat 4860gtatgctata cgaagttatc tgaacattag
aatacgtaat ccgcaatgcg gatccggaag 4920tgtagagagg gttaaaattg gcgtgcaatt
ttatgaagaa taaagacatc tagtctttaa 4980atacttgaac aataaatacg aaatccttat
ataagcatct tttactacca aaaaaattta 5040aaattaagca agagaaaaaa acgagcaatt
gttaaaagaa actaaaatca tgtagatttc 5100ataaatcgtc atattctttg tctatataaa
tatttatcgt cacgaataaa tcccgtgaat 5160ttctaacaaa gtttatacaa tatcacagtt
gtaaaagtaa gaaaaaaaaa gatcatagaa 5220aacatgttca cataagtaga aaaagggcac
cttcttgttg ttcaaactta atttacaaat 5280taagtttaag caccgatgat accaacggac
ttaccttcag caattctttt ttgggccaaa 5340gcagcaataa cagcggcacc agcaccggaa
ccatcttcag caggaacaat cttgattggg 5400tagtcgtcta gtgaggtttg agtccagccg
taaatgtcct tcaaagcatt ggcagccttt 5460tctttgaaac ctgggtatct gttgtaaacg
gaaccgtctg cagcgatgtg accggtcttg 5520taacctctct tttgacagat agcagcaata
ccacaaacgg acaatctagc agctctagca 5580ccaatcaatt cagataaacg tctgatcaat
ttacgttctt gaacagtagt gttgataccg 5640aactcatttt ggaacaagtc atcggtatct
tctaggttct cgaatggatc ttcctcgatt 5700ctggctgggt aagaagtgtc catgacgaaa
ggcttgtcga acttagacaa gtcttggttc 5760ttgaagatga aaccttgttt gtacatgtcc
atcaaggcca aacgcaaaat ttcacctaag 5820tagtaaccag aagacatttt ttcaaaggtt
tgttggcctg gtcttggaga ttcttcatca 5880atggtgatat cgtatttagt tcttggcaaa
acgacatgtt cattatcgaa ggaaccgtat 5940tcacagttga tggccattgg agcagatggt
ggaatgtcat cagatagttt tccttgtagc 6000ttttcgatat cggaacaaac atcgtagtaa
gcaccattga caccagtacc gaagataaca 6060cccatcttag tttctgggtc agtgtagtaa
gaagcaacca aagtaccggt agtgtcgttt 6120atcaaagcaa caacttcaat tgggatattc
ctcttagtga tttgcttttg caacattgga 6180acaacatcgt ggttttcaat gtttggaata
tcaaaacctt tagtccatct ttgcaagata 6240ccttcattga ttttgttttg agaagctggg
aaagaaaagg tgaaacccaa tggaattggc 6300tcagagatac cttgtgggaa ttgctcatca
ataaaagctt tcaaagagtc ggcaataaat 6360tcccacaatt cgtctggatt ttgagtagtt
ctcatagcat ctggtaatct gtacttagat 6420tgagtggtgt caaaggtacg gtcaccgccc
aacttgacta agacaactct caagttggta 6480ccacccaaat caatggccaa gaaatcaccg
gattccttac cagttgggaa atccataacc 6540caacctggaa tcattggaat gttaccaccc
ttcttggaca aacccttttc caattcggaa 6600atgaagtgct tggtaacggc ttgtaaagtt
tcagttggaa cagtgaaaat tttttcaaaa 6660ttctcaattt gttgcatcaa ttcctttggc
acatcggcca tggaaccctt tctggcttgt 6720ggtttttttg gacctaaatg aaccatt
6747
112 40 DNA Artificial sequence Primer
112 caacaaaagc ttgtgtacaa tatggacttc ctcttttctg 40
113 43 DNA Artificial sequence Primer
113 aagtttgtct agatgtatat gagatagttg attgtatgct tgg 43
114 1605 DNA Yarrowia lipolytica
114 atggttcatc ttggtccccg aaaacccccg tcccgaaagg gctcaatggc agacgtcccg
60cgggacctgc tggagcaaat ctcccagctt gaaaccatct tcaccgtttc gcccgaaaag
120ctgcgtcaaa tcaccgacca ctttgtgtcc gagctcgcta aaggcctcac aaaggagggt
180ggagatatcc ccatgaaccc cacctggatt ctgggatggc ccaccggaaa ggagagcggc
240tgctatctgg ctctcgacat gggtggcacc aacctgcgag ttgtcaaggt gactctggac
300ggcgaccgag gcttcgacgt catgcagtcc aagtaccaca tgccccccaa catcaaggtc
360ggcaagcaag aggagctgtg ggagtacatt gccgaatgtc tgggcaagtt cttggccgac
420aattatcctg aggctcttga tgcccatgag cgaggacgag atgtcgacag aaccgctgcg
480cagagcttca ctcgagacaa gtctcctcct ccccacaacc agcacatttc gtgttctcct
540ggcttcgaca tccacaagat tcctctcggt ttcacctttt catatccctg ctctcagccc
600gccgtcaacc gaggtgtact gcagcgatgg accaagggtt tcgacattga gggagtcgag
660ggcgaggacg tggtccccat gctggaagct gccctcgaaa gaaagaacat tcctatttcc
720atcaccgccc tgatcaacga caccaccgga actatggtgg cctccaacta ccacgacccc
780cagatcaagc tgggtaacat ctttggtact ggtgtcaacg ccgcctacta cgagaaggtc
840aaggacattc ccaagctcaa gggtctcatc cccgacagca ttgatcccga gacccccatg
900gccgtcaatt gcgagtatgg agccttcgac aatgagcaca aggttctccc tagaaccaag
960tgggacatca tcatcgatga ggagtctccc cgacccggtc agcagacctt cgagaagatg
1020agtgctggct actacctggg agaattgctt cgtctggttc ttctggacct gtacaaggac
1080gggtttgtgt tcgagaacca gggcaagaac ggtcaggagc ttggaaacgg caacatcaac
1140aagtcgtatt tcttcgacac ctctttcctg tctctgattg aggaggatcc ctgggagaac
1200ttgactgatg tcgagattct cttcaaggag aagcttggta ttaacaccac tgagcccgag
1260cgaaagctca ttcgtcgact ggccgagctc attggtactc gatccgctcg aatctctgcc
1320tgtggtgtcg ctgccatctg taagaaggct ggctacaagg aggctcacgc tggagctgac
1380ggatccgtgt tcaacaagta ccccggattc aaggagcgag gcgcccaggc tctcaacgag
1440atttttgagt ggaacctgcc caaccctaag gaccacccca tcaaaatcgt tcccgctgag
1500gatggtagcg gtgttggagc tgctctgtgc gctgctctca ccatcaagcg agtcaagcag
1560ggtcttcccg ttggtgtcaa gcccggtgtc aagtacgata tttag
1605115534PRTYarrowia lipolytica 115Met Val His Leu Gly Pro Arg Lys Pro
Pro Ser Arg Lys Gly Ser Met 1 5 10
15 Ala Asp Val Pro Arg Asp Leu Leu Glu Gln Ile Ser Gln Leu
Glu Thr 20 25 30
Ile Phe Thr Val Ser Pro Glu Lys Leu Arg Gln Ile Thr Asp His Phe
35 40 45 Val Ser Glu Leu
Ala Lys Gly Leu Thr Lys Glu Gly Gly Asp Ile Pro 50
55 60 Met Asn Pro Thr Trp Ile Leu Gly
Trp Pro Thr Gly Lys Glu Ser Gly 65 70
75 80 Cys Tyr Leu Ala Leu Asp Met Gly Gly Thr Asn Leu
Arg Val Val Lys 85 90
95 Val Thr Leu Asp Gly Asp Arg Gly Phe Asp Val Met Gln Ser Lys Tyr
100 105 110 His Met Pro
Pro Asn Ile Lys Val Gly Lys Gln Glu Glu Leu Trp Glu 115
120 125 Tyr Ile Ala Glu Cys Leu Gly Lys
Phe Leu Ala Asp Asn Tyr Pro Glu 130 135
140 Ala Leu Asp Ala His Glu Arg Gly Arg Asp Val Asp Arg
Thr Ala Ala 145 150 155
160 Gln Ser Phe Thr Arg Asp Lys Ser Pro Pro Pro His Asn Gln His Ile
165 170 175 Ser Cys Ser Pro
Gly Phe Asp Ile His Lys Ile Pro Leu Gly Phe Thr 180
185 190 Phe Ser Tyr Pro Cys Ser Gln Pro Ala
Val Asn Arg Gly Val Leu Gln 195 200
205 Arg Trp Thr Lys Gly Phe Asp Ile Glu Gly Val Glu Gly Glu
Asp Val 210 215 220
Val Pro Met Leu Glu Ala Ala Leu Glu Arg Lys Asn Ile Pro Ile Ser 225
230 235 240 Ile Thr Ala Leu Ile
Asn Asp Thr Thr Gly Thr Met Val Ala Ser Asn 245
250 255 Tyr His Asp Pro Gln Ile Lys Leu Gly Asn
Ile Phe Gly Thr Gly Val 260 265
270 Asn Ala Ala Tyr Tyr Glu Lys Val Lys Asp Ile Pro Lys Leu Lys
Gly 275 280 285 Leu
Ile Pro Asp Ser Ile Asp Pro Glu Thr Pro Met Ala Val Asn Cys 290
295 300 Glu Tyr Gly Ala Phe Asp
Asn Glu His Lys Val Leu Pro Arg Thr Lys 305 310
315 320 Trp Asp Ile Ile Ile Asp Glu Glu Ser Pro Arg
Pro Gly Gln Gln Thr 325 330
335 Phe Glu Lys Met Ser Ala Gly Tyr Tyr Leu Gly Glu Leu Leu Arg Leu
340 345 350 Val Leu
Leu Asp Leu Tyr Lys Asp Gly Phe Val Phe Glu Asn Gln Gly 355
360 365 Lys Asn Gly Gln Glu Leu Gly
Asn Gly Asn Ile Asn Lys Ser Tyr Phe 370 375
380 Phe Asp Thr Ser Phe Leu Ser Leu Ile Glu Glu Asp
Pro Trp Glu Asn 385 390 395
400 Leu Thr Asp Val Glu Ile Leu Phe Lys Glu Lys Leu Gly Ile Asn Thr
405 410 415 Thr Glu Pro
Glu Arg Lys Leu Ile Arg Arg Leu Ala Glu Leu Ile Gly 420
425 430 Thr Arg Ser Ala Arg Ile Ser Ala
Cys Gly Val Ala Ala Ile Cys Lys 435 440
445 Lys Ala Gly Tyr Lys Glu Ala His Ala Gly Ala Asp Gly
Ser Val Phe 450 455 460
Asn Lys Tyr Pro Gly Phe Lys Glu Arg Gly Ala Gln Ala Leu Asn Glu 465
470 475 480 Ile Phe Glu Trp
Asn Leu Pro Asn Pro Lys Asp His Pro Ile Lys Ile 485
490 495 Val Pro Ala Glu Asp Gly Ser Gly Val
Gly Ala Ala Leu Cys Ala Ala 500 505
510 Leu Thr Ile Lys Arg Val Lys Gln Gly Leu Pro Val Gly Val
Lys Pro 515 520 525
Gly Val Lys Tyr Asp Ile 530 116 1437 DNA Schwanniomyces occidentalis
116 atggttcact taggtccaaa acctccacaa catagaaaag gatccttctt
ggatgttcct 60gaatatttgt tgaaggaatt gacagaactc gaaggattat taacagtttc
aggtgaaaca 120ttaaggaaga ttactgatca ctttatttca gaattggaaa aaggtttatc
taaacaaggg 180ggaaatattc ctatgattcc aggatgggtt atggacttcc caacaggaaa
agaaatgggt 240gattacttgg ctattgattt aggtggtact aatttgagag ttgttttagt
taagttaggt 300ggtaacaggg actttgacac tactcaatcc aagttcgcat tgccagaaaa
catgagaact 360gccaagtctg aagagttatg ggaatttatt gctgagtgtt tacaaaagtt
cgtggaagaa 420gaatttcgaa atggtgttct gtcaaattta ccattaggtt tcaccttttc
atacccagca 480tctcaaggtt ctatcaatga agggtatttg caaagatgga ccaaaggttt
cgacattgaa 540ggtgttgagg gacacgatgt tgttccaatg ttacaagctg caattgaaaa
acgtaaggtt 600ccaattgaag ttgttgcgtt aatcaatgac accacaggta ctttagttgc
ttctatgtac 660accgatccag aagctaaaat gggtttattt tccggtactg gttgtaatgg
tgcttactac 720gatgttgtcg ataacattcc aaaattagaa ggaaaggttc cagatgacat
taaaagctct 780tccccaatgg ccatcaactg tgaatacggt gctttcgata atgagcatat
cattttgcct 840agaactaaat acgatatcca aatcgatgaa gaatcaccaa gaccaggaca
acaggctttc 900gaaaagatga tctctggtta ctacttaggt gaagttttaa gattgatttt
acttgattta 960acctctaaac aattaatttt caaagaccaa gatttgtcta aattacaagt
tccattcatt 1020ttagatacct caatcccagc tagaattgaa gaagatccgt ttgaaaactt
atctgatgtc 1080caagaattat ttcaagaaat tttaggtatt caaactactt ctccagaaag
aaaaatcatc 1140cgtcgtctag cggaattgat cggtgaaaga tcagccagat tatcaatttg
tggtattgct 1200gctatttgca agaagagagg ctacaaaacc gctcattgtg ccgctgatgg
ttcagtctac 1260aacaaatacc caggtttcaa agaaagagct gctaaaggtt tgagagatat
ctttcaatgg 1320gaatctgaag aagatccaat tgtcattgtg cctgcagaag atggtttagg
tgcaggtgcc 1380gctatcattg ctgcattgac tgaaaaaaga ttaaaggatg gattaccgtt
ggtatga 1437 117 478 PRT Schwanniomyces occidentalis 117 Met Val His Leu
Gly Pro Lys Pro Pro Gln His Arg Lys Gly Ser Phe 1 5
10 15 Leu Asp Val Pro Glu Tyr Leu Leu Lys
Glu Leu Thr Glu Leu Glu Gly 20 25
30 Leu Leu Thr Val Ser Gly Glu Thr Leu Arg Lys Ile Thr Asp
His Phe 35 40 45
Ile Ser Glu Leu Glu Lys Gly Leu Ser Lys Gln Gly Gly Asn Ile Pro 50
55 60 Met Ile Pro Gly Trp
Val Met Asp Phe Pro Thr Gly Lys Glu Met Gly 65 70
75 80 Asp Tyr Leu Ala Ile Asp Leu Gly Gly Thr
Asn Leu Arg Val Val Leu 85 90
95 Val Lys Leu Gly Gly Asn Arg Asp Phe Asp Thr Thr Gln Ser Lys
Phe 100 105 110 Ala
Leu Pro Glu Asn Met Arg Thr Ala Lys Ser Glu Glu Leu Trp Glu 115
120 125 Phe Ile Ala Glu Cys Leu
Gln Lys Phe Val Glu Glu Glu Phe Arg Asn 130 135
140 Gly Val Leu Ser Asn Leu Pro Leu Gly Phe Thr
Phe Ser Tyr Pro Ala 145 150 155
160 Ser Gln Gly Ser Ile Asn Glu Gly Tyr Leu Gln Arg Trp Thr Lys Gly
165 170 175 Phe Asp
Ile Glu Gly Val Glu Gly His Asp Val Val Pro Met Leu Gln 180
185 190 Ala Ala Ile Glu Lys Arg Lys
Val Pro Ile Glu Val Val Ala Leu Ile 195 200
205 Asn Asp Thr Thr Gly Thr Leu Val Ala Ser Met Tyr
Thr Asp Pro Glu 210 215 220
Ala Lys Met Gly Leu Phe Ser Gly Thr Gly Cys Asn Gly Ala Tyr Tyr 225
230 235 240 Asp Val Val
Asp Asn Ile Pro Lys Leu Glu Gly Lys Val Pro Asp Asp 245
250 255 Ile Lys Ser Ser Ser Pro Met Ala
Ile Asn Cys Glu Tyr Gly Ala Phe 260 265
270 Asp Asn Glu His Ile Ile Leu Pro Arg Thr Lys Tyr Asp
Ile Gln Ile 275 280 285
Asp Glu Glu Ser Pro Arg Pro Gly Gln Gln Ala Phe Glu Lys Met Ile 290
295 300 Ser Gly Tyr Tyr
Leu Gly Glu Val Leu Arg Leu Ile Leu Leu Asp Leu 305 310
315 320 Thr Ser Lys Gln Leu Ile Phe Lys Asp
Gln Asp Leu Ser Lys Leu Gln 325 330
335 Val Pro Phe Ile Leu Asp Thr Ser Ile Pro Ala Arg Ile Glu
Glu Asp 340 345 350
Pro Phe Glu Asn Leu Ser Asp Val Gln Glu Leu Phe Gln Glu Ile Leu
355 360 365 Gly Ile Gln Thr
Thr Ser Pro Glu Arg Lys Ile Ile Arg Arg Leu Ala 370
375 380 Glu Leu Ile Gly Glu Arg Ser Ala
Arg Leu Ser Ile Cys Gly Ile Ala 385 390
395 400 Ala Ile Cys Lys Lys Arg Gly Tyr Lys Thr Ala His
Cys Ala Ala Asp 405 410
415 Gly Ser Val Tyr Asn Lys Tyr Pro Gly Phe Lys Glu Arg Ala Ala Lys
420 425 430 Gly Leu Arg
Asp Ile Phe Gln Trp Glu Ser Glu Glu Asp Pro Ile Val 435
440 445 Ile Val Pro Ala Glu Asp Gly Leu
Gly Ala Gly Ala Ala Ile Ile Ala 450 455
460 Ala Leu Thr Glu Lys Arg Leu Lys Asp Gly Leu Pro Leu
Val 465 470 475
118 1398 DNA Homo sapiens 118 atgctggacg acagagccag gatggaggcc gccaagaagg
agaaggtaga gcagatcctg 60gcagagttcc agctgcagga ggaggacctg aagaaggtga
tgagacggat gcagaaggag 120atggaccgcg gcctgaggct ggagacccat gaagaggcca
gtgtgaagat gctgcccacc 180tacgtgcgct ccaccccaga aggctcagaa gtcggggact
tcctctccct ggacctgggt 240ggcactaact tcagggtgat gctggtgaag gtgggagaag
gtgaggaggg gcagtggagc 300gtgaagacca aacaccagat gtactccatc cccgaggacg
ccatgaccgg cactgctgag 360atgctcttcg actacatctc tgagtgcatc tccgacttcc
tggacaagca tcagatgaaa 420cacaagaagc tgcccctggg cttcaccttc tcctttcctg
tgaggcacga agacatcgat 480aagggcatcc ttctcaactg gaccaagggc ttcaaggcct
caggagcaga agggaacaat 540gtcgtggggc ttctgcgaga cgctatcaaa cggagagggg
actttgaaat ggatgtggtg 600gcaatggtga atgacacggt ggccacgatg atctcctgct
actacgaaga ccatcagtgc 660gaggtcggca tgatcgtggg cacgggctgc aatgcctgct
acatggagga gatgcagaat 720gtggagctgg tggaggggga cgagggccgc atgtgcgtca
ataccgagtg gggcgccttc 780ggggactccg gcgagctgga cgagttcctg ctggagtatg
accgcctggt ggacgagagc 840tctgcaaacc ccggtcagca gctgtatgag aagctcatag
gtggcaagta catgggcgag 900ctggtgcggc ttgtgctgct caggctcgtg gacgaaaacc
tgctcttcca cggggaggcc 960tccgagcagc tgcgcacacg cggagccttc gagacgcgct
tcgtgtcgca ggtggagagc 1020gacacgggcg accgcaagca gatctacaac atcctgagca
cgctggggct gcgaccctcg 1080accaccgact gcgacatcgt gcgccgcgcc tgcgagagcg
tgtctacgcg cgctgcgcac 1140atgtgctcgg cggggctggc gggcgtcatc aaccgcatgc
gcgagagccg cagcgaggac 1200gtaatgcgca tcactgtggg cgtggatggc tccgtgtaca
agctgcaccc cagcttcaag 1260gagcggttcc atgccagcgt gcgcaggctg acgcccagct
gcgagatcac cttcatcgag 1320tcggaggagg gcagtggccg gggcgcggcc ctggtctcgg
cggtggcctg taagaaggcc 1380tgtatgctgg gccagtga
1398 119 465 PRT Homo sapiens 119 Met Leu Asp Asp Arg
Ala Arg Met Glu Ala Ala Lys Lys Glu Lys Val 1 5
10 15 Glu Gln Ile Leu Ala Glu Phe Gln Leu Gln
Glu Glu Asp Leu Lys Lys 20 25
30 Val Met Arg Arg Met Gln Lys Glu Met Asp Arg Gly Leu Arg Leu
Glu 35 40 45 Thr
His Glu Glu Ala Ser Val Lys Met Leu Pro Thr Tyr Val Arg Ser 50
55 60 Thr Pro Glu Gly Ser Glu
Val Gly Asp Phe Leu Ser Leu Asp Leu Gly 65 70
75 80 Gly Thr Asn Phe Arg Val Met Leu Val Lys Val
Gly Glu Gly Glu Glu 85 90
95 Gly Gln Trp Ser Val Lys Thr Lys His Gln Met Tyr Ser Ile Pro Glu
100 105 110 Asp Ala
Met Thr Gly Thr Ala Glu Met Leu Phe Asp Tyr Ile Ser Glu 115
120 125 Cys Ile Ser Asp Phe Leu Asp
Lys His Gln Met Lys His Lys Lys Leu 130 135
140 Pro Leu Gly Phe Thr Phe Ser Phe Pro Val Arg His
Glu Asp Ile Asp 145 150 155
160 Lys Gly Ile Leu Leu Asn Trp Thr Lys Gly Phe Lys Ala Ser Gly Ala
165 170 175 Glu Gly Asn
Asn Val Val Gly Leu Leu Arg Asp Ala Ile Lys Arg Arg 180
185 190 Gly Asp Phe Glu Met Asp Val Val
Ala Met Val Asn Asp Thr Val Ala 195 200
205 Thr Met Ile Ser Cys Tyr Tyr Glu Asp His Gln Cys Glu
Val Gly Met 210 215 220
Ile Val Gly Thr Gly Cys Asn Ala Cys Tyr Met Glu Glu Met Gln Asn 225
230 235 240 Val Glu Leu Val
Glu Gly Asp Glu Gly Arg Met Cys Val Asn Thr Glu 245
250 255 Trp Gly Ala Phe Gly Asp Ser Gly Glu
Leu Asp Glu Phe Leu Leu Glu 260 265
270 Tyr Asp Arg Leu Val Asp Glu Ser Ser Ala Asn Pro Gly Gln
Gln Leu 275 280 285
Tyr Glu Lys Leu Ile Gly Gly Lys Tyr Met Gly Glu Leu Val Arg Leu 290
295 300 Val Leu Leu Arg Leu
Val Asp Glu Asn Leu Leu Phe His Gly Glu Ala 305 310
315 320 Ser Glu Gln Leu Arg Thr Arg Gly Ala Phe
Glu Thr Arg Phe Val Ser 325 330
335 Gln Val Glu Ser Asp Thr Gly Asp Arg Lys Gln Ile Tyr Asn Ile
Leu 340 345 350 Ser
Thr Leu Gly Leu Arg Pro Ser Thr Thr Asp Cys Asp Ile Val Arg 355
360 365 Arg Ala Cys Glu Ser Val
Ser Thr Arg Ala Ala His Met Cys Ser Ala 370 375
380 Gly Leu Ala Gly Val Ile Asn Arg Met Arg Glu
Ser Arg Ser Glu Asp 385 390 395
400 Val Met Arg Ile Thr Val Gly Val Asp Gly Ser Val Tyr Lys Leu His
405 410 415 Pro Ser
Phe Lys Glu Arg Phe His Ala Ser Val Arg Arg Leu Thr Pro 420
425 430 Ser Cys Glu Ile Thr Phe Ile
Glu Ser Glu Glu Gly Ser Gly Arg Gly 435 440
445 Ala Ala Leu Val Ser Ala Val Ala Cys Lys Lys Ala
Cys Met Leu Gly 450 455 460
Gln 465 120 1458 DNA Saccharomyces cerevisiae 120 atggttcatt taggtccaaa
gaaaccacag gctagaaagg gttccatggc tgatgtgccc 60aaggaattga tggatgaaat
tcatcagttg gaagatatgt ttacagttga cagcgagacc 120ttgagaaagg ttgttaagca
ctttatcgac gaattgaata aaggtttgac aaagaaggga 180ggtaacattc caatgattcc
cggttgggtc atggaattcc caacaggtaa agaatctggt 240aactatttgg ccattgattt
gggtggtact aacttaagag tcgtgttggt caagttgagc 300ggtaaccata cctttgacac
cactcaatcc aagtataaac taccacatga catgagaacc 360actaagcacc aagaggagtt
atggtccttt attgccgact ctttgaagga ctttatggtc 420gagcaagaat tgctaaacac
caaggacacc ttaccattag gtttcacctt ctcgtaccca 480gcttcccaaa acaagattaa
cgaaggtatt ttgcaaagat ggaccaaggg tttcgatatt 540ccaaatgtcg aaggccacga
tgtcgtccca ttgctacaaa acgaaatttc caagagagag 600ttgcctattg aaattgtagc
attgattaat gatactgttg gtactttaat tgcctcatac 660tacactgacc cagagactaa
gatgggtgtg attttcggta ctggtgtcaa cggtgctttc 720tatgatgttg tttccgatat
cgaaaagttg gagggcaaat tagcagacga tattccaagt 780aactctccaa tggctatcaa
ttgtgaatat ggttccttcg ataatgaaca tttggtcttg 840ccaagaacca agtacgatgt
tgctgtcgac gaacaatctc caagacctgg tcaacaagct 900tttgaaaaga tgacctccgg
ttactacttg ggtgaattgt tgcgtctagt gttacttgaa 960ttaaacgaga agggcttgat
gttgaaggat caagatctaa gcaagttgaa acaaccatac 1020atcatggata cctcctaccc
agcaagaatc gaggatgatc catttgaaaa cttggaagat 1080actgatgaca tcttccaaaa
ggactttggt gtcaagacca ctctgccaga acgtaagttg 1140attagaagac tttgtgaatt
gatcggtacc agagctgcta gattagctgt ttgtggtatt 1200gccgctattt gccaaaagag
aggttacaag actggtcaca ttgccgctga cggttctgtc 1260tataacaaat acccaggttt
caaggaagcc gccgctaagg gtttgagaga tatctatgga 1320tggactggtg acgcaagcaa
agatccaatt acgattgttc cagctgagga tggttcaggt 1380gcaggtgctg ctgttattgc
tgcattgtcc gaaaaaagaa ttgccgaagg taagtctctt 1440ggtatcattg gcgcttaa
1458 121 485 PRT Saccharomyces cerevisiae
121 Met Val His Leu Gly Pro Lys Lys Pro Gln Ala Arg Lys Gly Ser
Met 1 5 10 15 Ala
Asp Val Pro Lys Glu Leu Met Asp Glu Ile His Gln Leu Glu Asp
20 25 30 Met Phe Thr Val Asp
Ser Glu Thr Leu Arg Lys Val Val Lys His Phe 35
40 45 Ile Asp Glu Leu Asn Lys Gly Leu Thr
Lys Lys Gly Gly Asn Ile Pro 50 55
60 Met Ile Pro Gly Trp Val Met Glu Phe Pro Thr Gly Lys
Glu Ser Gly 65 70 75
80 Asn Tyr Leu Ala Ile Asp Leu Gly Gly Thr Asn Leu Arg Val Val Leu
85 90 95 Val Lys Leu Ser
Gly Asn His Thr Phe Asp Thr Thr Gln Ser Lys Tyr 100
105 110 Lys Leu Pro His Asp Met Arg Thr Thr
Lys His Gln Glu Glu Leu Trp 115 120
125 Ser Phe Ile Ala Asp Ser Leu Lys Asp Phe Met Val Glu Gln
Glu Leu 130 135 140
Leu Asn Thr Lys Asp Thr Leu Pro Leu Gly Phe Thr Phe Ser Tyr Pro 145
150 155 160 Ala Ser Gln Asn Lys
Ile Asn Glu Gly Ile Leu Gln Arg Trp Thr Lys 165
170 175 Gly Phe Asp Ile Pro Asn Val Glu Gly His
Asp Val Val Pro Leu Leu 180 185
190 Gln Asn Glu Ile Ser Lys Arg Glu Leu Pro Ile Glu Ile Val Ala
Leu 195 200 205 Ile
Asn Asp Thr Val Gly Thr Leu Ile Ala Ser Tyr Tyr Thr Asp Pro 210
215 220 Glu Thr Lys Met Gly Val
Ile Phe Gly Thr Gly Val Asn Gly Ala Phe 225 230
235 240 Tyr Asp Val Val Ser Asp Ile Glu Lys Leu Glu
Gly Lys Leu Ala Asp 245 250
255 Asp Ile Pro Ser Asn Ser Pro Met Ala Ile Asn Cys Glu Tyr Gly Ser
260 265 270 Phe Asp
Asn Glu His Leu Val Leu Pro Arg Thr Lys Tyr Asp Val Ala 275
280 285 Val Asp Glu Gln Ser Pro Arg
Pro Gly Gln Gln Ala Phe Glu Lys Met 290 295
300 Thr Ser Gly Tyr Tyr Leu Gly Glu Leu Leu Arg Leu
Val Leu Leu Glu 305 310 315
320 Leu Asn Glu Lys Gly Leu Met Leu Lys Asp Gln Asp Leu Ser Lys Leu
325 330 335 Lys Gln Pro
Tyr Ile Met Asp Thr Ser Tyr Pro Ala Arg Ile Glu Asp 340
345 350 Asp Pro Phe Glu Asn Leu Glu Asp
Thr Asp Asp Ile Phe Gln Lys Asp 355 360
365 Phe Gly Val Lys Thr Thr Leu Pro Glu Arg Lys Leu Ile
Arg Arg Leu 370 375 380
Cys Glu Leu Ile Gly Thr Arg Ala Ala Arg Leu Ala Val Cys Gly Ile 385
390 395 400 Ala Ala Ile Cys
Gln Lys Arg Gly Tyr Lys Thr Gly His Ile Ala Ala 405
410 415 Asp Gly Ser Val Tyr Asn Lys Tyr Pro
Gly Phe Lys Glu Ala Ala Ala 420 425
430 Lys Gly Leu Arg Asp Ile Tyr Gly Trp Thr Gly Asp Ala Ser
Lys Asp 435 440 445
Pro Ile Thr Ile Val Pro Ala Glu Asp Gly Ser Gly Ala Gly Ala Ala 450
455 460 Val Ile Ala Ala Leu
Ser Glu Lys Arg Ile Ala Glu Gly Lys Ser Leu 465 470
475 480 Gly Ile Ile Gly Ala 485
122 1503 DNA Saccharomyces cerevisiae 122 atgtcattcg acgacttaca caaagccact
gagagagcgg tcatccaggc cgtggaccag 60atctgcgacg atttcgaggt tacccccgag
aagctggacg aattaactgc ttacttcatc 120gaacaaatgg aaaaaggtct agctccacca
aaggaaggcc acacattggc ctcggacaaa 180ggtcttccta tgattccggc gttcgtcacc
gggtcaccca acgggacgga gcgcggtgtt 240ttactagccg ccgacctggg tggtaccaat
ttccgtatat gttctgttaa cttgcatgga 300gatcatactt tctccatgga gcaaatgaag
tccaagattc ccgatgattt gctagacgat 360gagaacgtca catctgacga cctgtttggg
tttctagcac gtcgtacact ggcctttatg 420aagaagtatc acccggacga gttggccaag
ggtaaagacg ccaagcccat gaaactgggg 480ttcactttct cataccctgt agaccagacc
tctctaaact ccgggacatt gatccgttgg 540accaagggtt tccgcatcgc ggacaccgtc
ggaaaggatg tcgtgcaatt gtaccaggag 600caattaagcg ctcagggtat gcctatgatc
aaggttgttg cattaaccaa cgacaccgtc 660ggaacgtacc tatcgcattg ctacacgtcc
gataacacgg actcaatgac gtccggagaa 720atctcggagc cggtcatcgg atgtattttc
ggtaccggta ccaatgggtg ctatatggag 780gagatcaaca agatcacgaa gttgccacag
gagttgcgtg acaagttgat aaaggagggt 840aagacacaca tgatcatcaa tgtcgaatgg
gggtccttcg ataatgagct caagcacttg 900cctactacta agtatgacgt cgtaattgac
cagaaactgt caacgaaccc gggatttcac 960ttgtttgaaa aacgtgtctc agggatgttc
ttgggtgagg tgttgcgtaa cattttagtg 1020gacttgcact cgcaaggctt gcttttgcaa
cagtacaggt ccaaggaaca acttcctcgc 1080cacttgacta cacctttcca gttgtcatcc
gaagtgctgt cgcatattga aattgacgac 1140tcgacaggtc tacgtgaaac agagttgtca
ttattacaga gtctcagact gcccaccact 1200ccaacagagc gtgttcaaat tcaaaaattg
gtgcgcgcga tttctaggag atctgcgtat 1260ttagccgccg tgccgcttgc cgcgatattg
atcaagacaa atgctttgaa caagagatat 1320catggtgaag tcgagatcgg ttgtgatggt
tccgttgtgg aatactaccc cggtttcaga 1380tctatgctga gacacgcctt agccttgtca
cccttgggtg ccgagggtga gaggaaggtg 1440cacttgaaga ttgccaagga tggttccgga
gtgggtgccg ccttgtgtgc gcttgtagca 1500tga
1503 123 500 PRT Saccharomyces cerevisiae
123 Met Ser Phe Asp Asp Leu His Lys Ala Thr Glu Arg Ala Val Ile Gln 1
5 10 15 Ala Val Asp Gln
Ile Cys Asp Asp Phe Glu Val Thr Pro Glu Lys Leu 20
25 30 Asp Glu Leu Thr Ala Tyr Phe Ile Glu
Gln Met Glu Lys Gly Leu Ala 35 40
45 Pro Pro Lys Glu Gly His Thr Leu Ala Ser Asp Lys Gly Leu
Pro Met 50 55 60
Ile Pro Ala Phe Val Thr Gly Ser Pro Asn Gly Thr Glu Arg Gly Val 65
70 75 80 Leu Leu Ala Ala Asp
Leu Gly Gly Thr Asn Phe Arg Ile Cys Ser Val 85
90 95 Asn Leu His Gly Asp His Thr Phe Ser Met
Glu Gln Met Lys Ser Lys 100 105
110 Ile Pro Asp Asp Leu Leu Asp Asp Glu Asn Val Thr Ser Asp Asp
Leu 115 120 125 Phe
Gly Phe Leu Ala Arg Arg Thr Leu Ala Phe Met Lys Lys Tyr His 130
135 140 Pro Asp Glu Leu Ala Lys
Gly Lys Asp Ala Lys Pro Met Lys Leu Gly 145 150
155 160 Phe Thr Phe Ser Tyr Pro Val Asp Gln Thr Ser
Leu Asn Ser Gly Thr 165 170
175 Leu Ile Arg Trp Thr Lys Gly Phe Arg Ile Ala Asp Thr Val Gly Lys
180 185 190 Asp Val
Val Gln Leu Tyr Gln Glu Gln Leu Ser Ala Gln Gly Met Pro 195
200 205 Met Ile Lys Val Val Ala Leu
Thr Asn Asp Thr Val Gly Thr Tyr Leu 210 215
220 Ser His Cys Tyr Thr Ser Asp Asn Thr Asp Ser Met
Thr Ser Gly Glu 225 230 235
240 Ile Ser Glu Pro Val Ile Gly Cys Ile Phe Gly Thr Gly Thr Asn Gly
245 250 255 Cys Tyr Met
Glu Glu Ile Asn Lys Ile Thr Lys Leu Pro Gln Glu Leu 260
265 270 Arg Asp Lys Leu Ile Lys Glu Gly
Lys Thr His Met Ile Ile Asn Val 275 280
285 Glu Trp Gly Ser Phe Asp Asn Glu Leu Lys His Leu Pro
Thr Thr Lys 290 295 300
Tyr Asp Val Val Ile Asp Gln Lys Leu Ser Thr Asn Pro Gly Phe His 305
310 315 320 Leu Phe Glu Lys
Arg Val Ser Gly Met Phe Leu Gly Glu Val Leu Arg 325
330 335 Asn Ile Leu Val Asp Leu His Ser Gln
Gly Leu Leu Leu Gln Gln Tyr 340 345
350 Arg Ser Lys Glu Gln Leu Pro Arg His Leu Thr Thr Pro Phe
Gln Leu 355 360 365
Ser Ser Glu Val Leu Ser His Ile Glu Ile Asp Asp Ser Thr Gly Leu 370
375 380 Arg Glu Thr Glu Leu
Ser Leu Leu Gln Ser Leu Arg Leu Pro Thr Thr 385 390
395 400 Pro Thr Glu Arg Val Gln Ile Gln Lys Leu
Val Arg Ala Ile Ser Arg 405 410
415 Arg Ser Ala Tyr Leu Ala Ala Val Pro Leu Ala Ala Ile Leu Ile
Lys 420 425 430 Thr
Asn Ala Leu Asn Lys Arg Tyr His Gly Glu Val Glu Ile Gly Cys 435
440 445 Asp Gly Ser Val Val Glu
Tyr Tyr Pro Gly Phe Arg Ser Met Leu Arg 450 455
460 His Ala Leu Ala Leu Ser Pro Leu Gly Ala Glu
Gly Glu Arg Lys Val 465 470 475
480 His Leu Lys Ile Ala Lys Asp Gly Ser Gly Val Gly Ala Ala Leu Cys
485 490 495 Ala Leu
Val Ala 500 124 1446 DNA Kluyveromyces lactis 124 atgtcagatc
ctaagttaac caaggcggtt gattctatat gcgatcagtt cattgttact 60aaatcgaaga
tatctcagtt gactgagtat ttcatcgatt gtatggaaaa gggattagaa 120ccctgtgaat
cagatatcag tcaaaacaaa gggttgccta tgattccgac gtttgtgact 180gacaagccat
ccggtcagga acatggagta accatgttgg cagctgattt aggtggtact 240aatttcagag
tttgctctgt ggaactatta ggtaatcatg aattcaagat tgaacaagag 300aagtcaaaga
ttccaacttt cttcttccag gacgatcatc atgttaccag taaggatttg 360ttccaacata
tggccctgat cacgcatcag tttttgacta aacatcataa ggatgtaatt 420caagattaca
aatggaaaat gggtttcact ttttcatatc cagtcgatca aacctccttg 480agcagtggta
agttgattag atggaccaag ggtttcaaga tcggtgatac tgttgggcaa 540gacgttgttc
aactgttcca acaagaattg aacgatattg ggttatcaaa tgttcatgtg 600gttgcattga
ctaatgacac tactggaacc ctattggctc gttgttacgc ttccagtgat 660gcggcaagag
ccatcaacga accagtaatt ggctgtatct ttggtactgg tacgaacggc 720tgctacatgg
aaaagcttga aaatattcac aaattggatc cagctagcag agaagaactt 780ctgtctcagg
ggaagaccca tatgtgcatc aataccgaat ggggctcttt tgataatgaa 840ctaaatcatt
tgcctactac aagttatgat attaagattg atcagcagtt ctccaccaat 900cccgggttcc
acttgtttga aaaaagggtc agtggtcttt atttgggtga aatacttcgt 960aacatactac
tagaccttga aaaacaagag ttattcgact tgaaggaatc tgttttaaag 1020aacaatccct
ttattttaac cacagaaact ttatcacata tcgaaattga taccgttgag 1080aacgacttac
aggacacaag ggatgctctt ttaaaggctg ctgacttgga gaccaccttc 1140gaagaacgtg
tcttgatcca aaaattggta agagctattt ccaggagagc tgcattctta 1200gccgcagtgc
caattgctgc aattttgatc aaaaccaacg ctttgaacca gagttatcac 1260tgccaagtag
aggttggttg tgacggtagt gtcgttgagc actatccagg attcagatct 1320atgatgagac
atgcattagc actttctcca attggccccg agggtgaacg tgatgtccat 1380ctacgtatct
ccaaggatgg ttccggtgtt ggcgctgctt tgtgtgcttt gcatgcaaat 1440tattaa
1446 125 481 PRT Kluyveromyces lactis 125 Met Ser Asp Pro Lys Leu Thr Lys Ala
Val Asp Ser Ile Cys Asp Gln 1 5 10
15 Phe Ile Val Thr Lys Ser Lys Ile Ser Gln Leu Thr Glu Tyr
Phe Ile 20 25 30
Asp Cys Met Glu Lys Gly Leu Glu Pro Cys Glu Ser Asp Ile Ser Gln
35 40 45 Asn Lys Gly Leu
Pro Met Ile Pro Thr Phe Val Thr Asp Lys Pro Ser 50
55 60 Gly Gln Glu His Gly Val Thr Met
Leu Ala Ala Asp Leu Gly Gly Thr 65 70
75 80 Asn Phe Arg Val Cys Ser Val Glu Leu Leu Gly Asn
His Glu Phe Lys 85 90
95 Ile Glu Gln Glu Lys Ser Lys Ile Pro Thr Phe Phe Phe Gln Asp Asp
100 105 110 His His Val
Thr Ser Lys Asp Leu Phe Gln His Met Ala Leu Ile Thr 115
120 125 His Gln Phe Leu Thr Lys His His
Lys Asp Val Ile Gln Asp Tyr Lys 130 135
140 Trp Lys Met Gly Phe Thr Phe Ser Tyr Pro Val Asp Gln
Thr Ser Leu 145 150 155
160 Ser Ser Gly Lys Leu Ile Arg Trp Thr Lys Gly Phe Lys Ile Gly Asp
165 170 175 Thr Val Gly Gln
Asp Val Val Gln Leu Phe Gln Gln Glu Leu Asn Asp 180
185 190 Ile Gly Leu Ser Asn Val His Val Val
Ala Leu Thr Asn Asp Thr Thr 195 200
205 Gly Thr Leu Leu Ala Arg Cys Tyr Ala Ser Ser Asp Ala Ala
Arg Ala 210 215 220
Ile Asn Glu Pro Val Ile Gly Cys Ile Phe Gly Thr Gly Thr Asn Gly 225
230 235 240 Cys Tyr Met Glu Lys
Leu Glu Asn Ile His Lys Leu Asp Pro Ala Ser 245
250 255 Arg Glu Glu Leu Leu Ser Gln Gly Lys Thr
His Met Cys Ile Asn Thr 260 265
270 Glu Trp Gly Ser Phe Asp Asn Glu Leu Asn His Leu Pro Thr Thr
Ser 275 280 285 Tyr
Asp Ile Lys Ile Asp Gln Gln Phe Ser Thr Asn Pro Gly Phe His 290
295 300 Leu Phe Glu Lys Arg Val
Ser Gly Leu Tyr Leu Gly Glu Ile Leu Arg 305 310
315 320 Asn Ile Leu Leu Asp Leu Glu Lys Gln Glu Leu
Phe Asp Leu Lys Glu 325 330
335 Ser Val Leu Lys Asn Asn Pro Phe Ile Leu Thr Thr Glu Thr Leu Ser
340 345 350 His Ile
Glu Ile Asp Thr Val Glu Asn Asp Leu Gln Asp Thr Arg Asp 355
360 365 Ala Leu Leu Lys Ala Ala Asp
Leu Glu Thr Thr Phe Glu Glu Arg Val 370 375
380 Leu Ile Gln Lys Leu Val Arg Ala Ile Ser Arg Arg
Ala Ala Phe Leu 385 390 395
400 Ala Ala Val Pro Ile Ala Ala Ile Leu Ile Lys Thr Asn Ala Leu Asn
405 410 415 Gln Ser Tyr
His Cys Gln Val Glu Val Gly Cys Asp Gly Ser Val Val 420
425 430 Glu His Tyr Pro Gly Phe Arg Ser
Met Met Arg His Ala Leu Ala Leu 435 440
445 Ser Pro Ile Gly Pro Glu Gly Glu Arg Asp Val His Leu
Arg Ile Ser 450 455 460
Lys Asp Gly Ser Gly Val Gly Ala Ala Leu Cys Ala Leu His Ala Asn 465
470 475 480 Tyr
126 1452 DNA Hansenula polymorpha 126 atgagtatcg acgacaaacc gctgccagca
gacctggcta aagagatcga gacctacaag 60gagctgttct gggtgccaac cgagactctc
cacaagatca tcgattactt catcgaggaa 120ctcgagagag gtaacgcgga cggaacagat
cctaccggta tccccatgaa ccctgcctgg 180gtgttggaat acccgaacgg ttctgagacc
ggcgattacc ttgccatcga cttgggagga 240acaaaccttc gtgttgtcct tgctcacttg
cttggagacc acaagttttc taccgaacaa 300actaagtacc acatcccaag ccacatgaga
acaaccaaga acagagacga gctgtttgag 360ttcattgctc aatgtctgga agactttctt
aagtcgaaac accctgacgg aattccatcg 420gacgctgttt tccccttggg attcactttt
tcgtacccag ccacgcaaaa cagcattttt 480gagggtgttc tacagagatg gaccaaaggt
tttgatattc ctaatgtcga gggccacgac 540gtggtgcctc ttctgatgga acaggtcgag
aagaaaggcc tgcctatcaa gattggtgcc 600ctgatcaacg acaccagcgg aacccttgtt
gcatcgagat acacagacga gctcacggag 660atgggctgta tttttggtac tggtgtcaac
ggagcatact acgaccgcat caagaacatc 720cctaagctga agggaaagct ttacgacgat
atcgacccag agtctccaat gctgatcaac 780tgcgaatacg gttctttcga taatgcacac
aaggttcttc caagaacgaa gttcgacatc 840agaatcgacg acgagtctcc aagaccggga
caacagtctt tcgagaaaat gacttccggc 900tactacctag gagaacttct cagaatgatt
atgctggaca cctacaaaaa gggactcatt 960ttcaagagct acactgagtc ttcggagcag
atcaagaacc tcgaaacccc atacttcctg 1020gacacatctt tcctgtctat cgctgaggct
gacgacaccc cttcattgag cgtcgtgtcg 1080aatgagttct ccaacaaact cttcatcgac
accactttcg aggagagact gtacgtgaga 1140aagctgtcgc aatttatcgg aaccagagca
gccagactct cgatttgtgg tatctctgcc 1200gtgtgcaaaa agatgaacta caaaaagtgc
cacgttgccg ctgacggatc cgtcttcctc 1260aagtacccat acttcccaga aagggcagca
cagggcctga gcgacgtgtt cggctgggat 1320ggtatcgaca tgaaggacca ccctatccag
atcaaacagg ccgaggacgg atctggtgtt 1380ggtgccgcca tcattgctgc actttcgcat
gccagaagag agaaaggtct gtctttgggt 1440ctgaaaaaat aa
1452 127 483 PRT Hansenula polymorpha 127 Met
Ser Ile Asp Asp Lys Pro Leu Pro Ala Asp Leu Ala Lys Glu Ile 1
5 10 15 Glu Thr Tyr Lys Glu Leu
Phe Trp Val Pro Thr Glu Thr Leu His Lys 20
25 30 Ile Ile Asp Tyr Phe Ile Glu Glu Leu Glu
Arg Gly Asn Ala Asp Gly 35 40
45 Thr Asp Pro Thr Gly Ile Pro Met Asn Pro Ala Trp Val Leu
Glu Tyr 50 55 60
Pro Asn Gly Ser Glu Thr Gly Asp Tyr Leu Ala Ile Asp Leu Gly Gly 65
70 75 80 Thr Asn Leu Arg Val
Val Leu Ala His Leu Leu Gly Asp His Lys Phe 85
90 95 Ser Thr Glu Gln Thr Lys Tyr His Ile Pro
Ser His Met Arg Thr Thr 100 105
110 Lys Asn Arg Asp Glu Leu Phe Glu Phe Ile Ala Gln Cys Leu Glu
Asp 115 120 125 Phe
Leu Lys Ser Lys His Pro Asp Gly Ile Pro Ser Asp Ala Val Phe 130
135 140 Pro Leu Gly Phe Thr Phe
Ser Tyr Pro Ala Thr Gln Asn Ser Ile Phe 145 150
155 160 Glu Gly Val Leu Gln Arg Trp Thr Lys Gly Phe
Asp Ile Pro Asn Val 165 170
175 Glu Gly His Asp Val Val Pro Leu Leu Met Glu Gln Val Glu Lys Lys
180 185 190 Gly Leu
Pro Ile Lys Ile Gly Ala Leu Ile Asn Asp Thr Ser Gly Thr 195
200 205 Leu Val Ala Ser Arg Tyr Thr
Asp Glu Leu Thr Glu Met Gly Cys Ile 210 215
220 Phe Gly Thr Gly Val Asn Gly Ala Tyr Tyr Asp Arg
Ile Lys Asn Ile 225 230 235
240 Pro Lys Leu Lys Gly Lys Leu Tyr Asp Asp Ile Asp Pro Glu Ser Pro
245 250 255 Met Leu Ile
Asn Cys Glu Tyr Gly Ser Phe Asp Asn Ala His Lys Val 260
265 270 Leu Pro Arg Thr Lys Phe Asp Ile
Arg Ile Asp Asp Glu Ser Pro Arg 275 280
285 Pro Gly Gln Gln Ser Phe Glu Lys Met Thr Ser Gly Tyr
Tyr Leu Gly 290 295 300
Glu Leu Leu Arg Met Ile Met Leu Asp Thr Tyr Lys Lys Gly Leu Ile 305
310 315 320 Phe Lys Ser Tyr
Thr Glu Ser Ser Glu Gln Ile Lys Asn Leu Glu Thr 325
330 335 Pro Tyr Phe Leu Asp Thr Ser Phe Leu
Ser Ile Ala Glu Ala Asp Asp 340 345
350 Thr Pro Ser Leu Ser Val Val Ser Asn Glu Phe Ser Asn Lys
Leu Phe 355 360 365
Ile Asp Thr Thr Phe Glu Glu Arg Leu Tyr Val Arg Lys Leu Ser Gln 370
375 380 Phe Ile Gly Thr Arg
Ala Ala Arg Leu Ser Ile Cys Gly Ile Ser Ala 385 390
395 400 Val Cys Lys Lys Met Asn Tyr Lys Lys Cys
His Val Ala Ala Asp Gly 405 410
415 Ser Val Phe Leu Lys Tyr Pro Tyr Phe Pro Glu Arg Ala Ala Gln
Gly 420 425 430 Leu
Ser Asp Val Phe Gly Trp Asp Gly Ile Asp Met Lys Asp His Pro 435
440 445 Ile Gln Ile Lys Gln Ala
Glu Asp Gly Ser Gly Val Gly Ala Ala Ile 450 455
460 Ile Ala Ala Leu Ser His Ala Arg Arg Glu Lys
Gly Leu Ser Leu Gly 465 470 475
480 Leu Lys Lys 128 16387 DNA artificial sequence Synthetic construct
128 tcccattacc gacatttggg cgctatacgt gcatatgttc atgtatgtat ctgtatttaa
60aacacttttg tattattttt cctcatatat gtgtataggt ttatacggat gatttaatta
120ttacttcacc accctttatt tcaggctgat atcttagcct tgttactagt tagaaaaaga
180catttttgct gtcagtcact gtcaagagat tcttttgctg gcatttcttc tagaagcaaa
240aagagcgatg cgtcttttcc gctgaaccgt tccagcaaaa aagactacca acgcaatatg
300gattgtcaga atcatataaa agagaagcaa ataactcctt gtcttgtatc aattgcatta
360taatatcttc ttgttagtgc aatatcatat agaagtcatc gaaatagata ttaagaaaaa
420caaactgtac aatcaatcaa tcaatcatcg ctgaggatgt tgacaaaagc aacaaaagaa
480caaaaatccc ttgtgaaaaa cagaggggcg gagcttgttg ttgattgctt agtggagcaa
540ggtgtcacac atgtatttgg cattccaggt gcaaaaattg atgcggtatt tgacgcttta
600caagataaag gacctgaaat tatcgttgcc cggcacgaac aaaacgcagc attcatggcc
660caagcagtcg gccgtttaac tggaaaaccg ggagtcgtgt tagtcacatc aggaccgggt
720gcctctaact tggcaacagg cctgctgaca gcgaacactg aaggagaccc tgtcgttgcg
780cttgctggaa acgtgatccg tgcagatcgt ttaaaacgga cacatcaatc tttggataat
840gcggcgctat tccagccgat tacaaaatac agtgtagaag ttcaagatgt aaaaaatata
900ccggaagctg ttacaaatgc atttaggata gcgtcagcag ggcaggctgg ggccgctttt
960gtgagctttc cgcaagatgt tgtgaatgaa gtcacaaata cgaaaaacgt gcgtgctgtt
1020gcagcgccaa aactcggtcc tgcagcagat gatgcaatca gtgcggccat agcaaaaatc
1080caaacagcaa aacttcctgt cgttttggtc ggcatgaaag gcggaagacc ggaagcaatt
1140aaagcggttc gcaagctttt gaaaaaggtt cagcttccat ttgttgaaac atatcaagct
1200gccggtaccc tttctagaga tttagaggat caatattttg gccgtatcgg tttgttccgc
1260aaccagcctg gcgatttact gctagagcag gcagatgttg ttctgacgat cggctatgac
1320ccgattgaat atgatccgaa attctggaat atcaatggag accggacaat tatccattta
1380gacgagatta tcgctgacat tgatcatgct taccagcctg atcttgaatt gatcggtgac
1440attccgtcca cgatcaatca tatcgaacac gatgctgtga aagtggaatt tgcagagcgt
1500gagcagaaaa tcctttctga tttaaaacaa tatatgcatg aaggtgagca ggtgcctgca
1560gattggaaat cagacagagc gcaccctctt gaaatcgtta aagagttgcg taatgcagtc
1620gatgatcatg ttacagtaac ttgcgatatc ggttcgcacg ccatttggat gtcacgttat
1680ttccgcagct acgagccgtt aacattaatg atcagtaacg gtatgcaaac actcggcgtt
1740gcgcttcctt gggcaatcgg cgcttcattg gtgaaaccgg gagaaaaagt ggtttctgtc
1800tctggtgacg gcggtttctt attctcagca atggaattag agacagcagt tcgactaaaa
1860gcaccaattg tacacattgt atggaacgac agcacatatg acatggttgc attccagcaa
1920ttgaaaaaat ataaccgtac atctgcggtc gatttcggaa atatcgatat cgtgaaatat
1980gcggaaagct tcggagcaac tggcttgcgc gtagaatcac cagaccagct ggcagatgtt
2040ctgcgtcaag gcatgaacgc tgaaggtcct gtcatcatcg atgtcccggt tgactacagt
2100gataacatta atttagcaag tgacaagctt ccgaaagaat tcggggaact catgaaaacg
2160aaagctctct agttaattaa tcatgtaatt agttatgtca cgcttacatt cacgccctcc
2220ccccacatcc gctctaaccg aaaaggaagg agttagacaa cctgaagtct aggtccctat
2280ttattttttt atagttatgt tagtattaag aacgttattt atatttcaaa tttttctttt
2340ttttctgtac agacgcgtgt acgcatgtaa cattatactg aaaaccttgc ttgagaaggt
2400tttgggacgc tcgaaggctt taatttgcgg gcggccgctc tagaactagt accacaggtg
2460ttgtcctctg aggacataaa atacacaccg agattcatca actcattgct ggagttagca
2520tatctacaat tgggtgaaat ggggagcgat ttgcaggcat ttgctcggca tgccggtaga
2580ggtgtggtca ataagagcga cctcatgcta tacctgagaa agcaacctga cctacaggaa
2640agagttactc aagaataaga attttcgttt taaaacctaa gagtcacttt aaaatttgta
2700tacacttatt ttttttataa cttatttaat aataaaaatc ataaatcata agaaattcgc
2760ttactcttaa ttaatcaagc atctaaaaca caaccgttgg aagcgttgga aaccaactta
2820gcatacttgg atagagtacc tcttgtgtaa cgaggtggag gtgcaaccca actttgttta
2880cgttgagcca tttccttatc agagactaat aggtcaatct tgttattatc agcatcaatg
2940ataatctcat cgccgtctct gaccaacccg ataggaccac cttcagcggc ttcgggaaca
3000atgtggccga ttaagaaccc gtgagaacca ccagagaatc taccatcagt caacaatgca
3060acatctttac ccaaaccgta acccatcaga gcagaggaag gctttagcat ttcaggcata
3120cctggtgcac ctcttggacc ttcatatctg ataacaacaa cggttttttc acccttcttg
3180atttcacctc tttccaaggc ttcaataaag gcaccttcct cttcgaacac acgtgctcta
3240cccttgaagt aagtaccttc cttaccggta attttaccca cagctccacc tggtgccaat
3300gaaccgtaca gaatttgcaa gtgaccgttg gccttgattg ggtgggagag tggcttaata
3360atctcttgtc cttcaggtag gcttggtgct ttctttgcac gttctgccaa agtgtcaccg
3420gtaacagtca ttgtgttacc gtgcaacatg ttgttttcat atagatactt aatcacagat
3480tgggtaccac caacgttaat caaatcggcc atgacgtatt taccagaagg tttgaagtca
3540ccgatcaatg gtgtagtatc actgattctt tggaaatcat ctggtgacaa cttgacaccc
3600gcagagtgag caacagccac caaatgcaaa acagcattag tggacccacc ggttgcaacg
3660acataagtaa tggcgttttc aaaagcctct tttgtgagga tatcacgagg taaaataccc
3720aattccattg tcttcttgat gtattcacca atgttgtcac actcagctaa cttctccttg
3780gaaacggctg ggaaggaaga ggagtttgga atggtcaaac ctagcacttc agcggcagaa
3840gccattgtgt tggcagtata cataccacca caagaaccag gacctgggca tgcatgttcc
3900acaacatctt ctctttcttc ttcagtgaat tgcttggaaa tatattcacc gtaggattgg
3960aacgcagaga cgatatcgat gtttttagag atcctgttaa aacctctagt ggagtagtag
4020atgtaatcaa tgaagcggaa gccaaaagac cagagtagag gcctatagaa gaaactgcga
4080taccttttgt gatggctaaa caaacagaca tctttttata tgtttttact tctgtatatc
4140gtgaagtagt aagtgataag cgaatttggc taagaacgtt gtaagtgaac aagggacctc
4200ttttgccttt caaaaaagga ttaaatggag ttaatcattg agatttagtt ttcgttagat
4260tctgtatccc taaataactc ccttacccga cgggaaggca caaaagactt gaataatagc
4320aaacggccag tagccaagac caaataatac tagagttaac tgatggtctt aaacaggcat
4380tacgtggtga actccaagac caatatacaa aatatcgata agttattctt gcccaccaat
4440ttaaggagcc tacatcagga cagtagtacc attcctcaga gaagaggtat acataacaag
4500aaaatcgcgt gaacacctta tataacttag cccgttattg agctaaaaaa ccttgcaaaa
4560tttcctatga ataagaatac ttcagacgtg ataaaaattt actttctaac tcttctcacg
4620ctgcccctat ctgttcttcc gctctaccgt gagaaataaa gcatcgagta cggcagttcg
4680ctgtcactga actaaaacaa taaggctagt tcgaatgatg aacttgcttg ctgtcaaact
4740tctgagttgc cgctgatgtg acactgtgac aataaattca aaccggttat agcggtctcc
4800tccggtaccg gttctgccac ctccaataga gctcagtagg agtcagaacc tctgcggtgg
4860ctgtcagtga ctcatccgcg tttcgtaagt tgtgcgcgtg cacatttcgc ccgttcccgc
4920tcatcttgca gcaggcggaa attttcatca cgctgtagga cgcaaaaaaa aaataattaa
4980tcgtacaaga atcttggaaa aaaaattgaa aaattttgta taaaagggat gacctaactt
5040gactcaatgg cttttacacc cagtattttc cctttccttg tttgttacaa ttatagaagc
5100aagacaaaaa catatagaca acctattcct aggagttata tttttttacc ctaccagcaa
5160tataagtaaa aaactagtat gaaagttttc tacgataaag actgcgacct gtcgatcatc
5220caaggtaaga aagttgccat catcggcttc ggttcccagg gccacgctca agcactcaac
5280ctgaaggatt ccggcgtaga cgtgactgtt ggcctgccta aaggctttgc tgatgtagcc
5340aaggctgaag cccacggctt taaagtgacc gacgttgctg cagccgttgc cggtgccgac
5400ttggtcatga tcctgattcc ggacgagttc cagtcccagc tgtacaagaa cgaaatcgag
5460ccgaacatca agaagggcgc cactctggcc ttctcccacg gcttcgcgat ccactacaac
5520caggttgtgc ctcgtgccga cctcgacgtg atcatgatcg cgccgaaggc tccaggccac
5580accgtacgtt ccgagttcgt caagggcgga ggtattcctg acctgatcgc gatctaccag
5640gacgtttccg gcaacgccaa gaacgtcgcc ctgtcctacg ccgcaggcgt gggcggcggc
5700cgtaccggca tcatcgaaac caccttcaag gacgagactg aaaccgacct gttcggtgag
5760caggctgttc tgtgtggcgg taccgtcgag ctggtcaaag ccggtttcga aaccctggtt
5820gaagctggct acgctccaga aatggcctac ttcgagtgcc tgcacgaact gaagctgatc
5880gttgacctca tgtacgaagg cggtatcgcc aacatgaact actcgatctc caacaacgct
5940gaatacggcg agtacgtgac tggtccagaa gtcatcaacg ccgaatcccg tcaggccatg
6000cgcaatgctc tgaagcgcat ccaggacggc gaatacgcga agatgttcat cagcgaaggc
6060gctaccggct acccatcgat gaccgccaag cgtcgtaaca acgctgctca cggtatcgaa
6120atcatcggcg agcaactgcg ctcgatgatg ccttggatcg gtgccaacaa aatcgtcgac
6180aaagccaaga actaaggccc tgcaggccta tcaagtgctg gaaacttttt ctcttggaat
6240ttttgcaaca tcaagtcata gtcaattgaa ttgacccaat ttcacattta agattttttt
6300tttttcatcc gacatacatc tgtacactag gaagccctgt ttttctgaag cagcttcaaa
6360tatatatatt ttttacatat ttattatgat tcaatgaaca atctaattaa atcgaaaaca
6420agaaccgaaa cgcgaataaa taatttattt agatggtgac aagtgtataa gtcctcatcg
6480ggacagctac gatttctctt tcggttttgg ctgagctact ggttgctgtg acgcagcggc
6540attagcgcgg cgttatgagc taccctcgtg gcctgaaaga tggcgggaat aaagcggaac
6600taaaaattac tgactgagcc atattgaggt caatttgtca actcgtcaag tcacgtttgg
6660tggacggccc ctttccaacg aatcgtatat actaacatgc gcgcgcttcc tatatacaca
6720tatacatata tatatatata tatatgtgtg cgtgtatgtg tacacctgta tttaatttcc
6780ttactcgcgg gtttttcttt tttctcaatt cttggcttcc tctttctcga gtatataatt
6840tttcaggtaa aatttagtac gatagtaaaa tacttctcga actcgtcaca tatacgtgta
6900cataatgtct gaaccagctc aaaagaaaca aaaggttgct aacaactctc tagagcggcc
6960gcccgcaaat taaagccttc gagcgtccca aaaccttctc aagcaaggtt ttcagtataa
7020tgttacatgc gtacacgcgt ctgtacagaa aaaaaagaaa aatttgaaat ataaataacg
7080ttcttaatac taacataact ataaaaaaat aaatagggac ctagacttca ggttgtctaa
7140ctccttcctt ttcggttaga gcggatgtgg ggggagggcg tgaatgtaag cgtgacataa
7200ctaattacat gattaattaa ttattggttt tctggtctca actttctgac ttccttacca
7260accttccaga tttccatgtt tctgatggtg tctaattcct tttctagctt ttctctgtag
7320tcaggttgag agttgaattc caaagatctc ttggtttcgg taccgttctt ggtagattcg
7380tacaagtctt ggaaaacagg cttcaaagca ttcttgaaga ttgggtacca gtccaaagca
7440cctcttctgg cggtggtgga acaagcatcg tacatgtaat ccataccgta cttaccgatc
7500aatgggtata gagattgggt agcttcttcg acggtttcgt tgaaagcttc agatggggag
7560tgaccgtttt ctctcaagac gtcgtattga gccaagaaca taccgtggat accacccatt
7620aaacaacctc tttcaccgta caagtcagag ttgacttctc tttcgaaagt ggtttggtaa
7680acgtaaccgg aaccaatggc aacggccaaa gcttgggcct tttcgtgagc cttaccggtg
7740acatcgttcc agacggcgta agaagagtta ataccacgac cttccttgaa caaagatctg
7800acagttctac cggaaccctt tggagcaacc aagataacat ctaagtcctt tggtggttca
7860acgtgagtca agtccttgaa gactggggag aaaccgtggg agaagtacaa agtcttaccc
7920ttggtcaaca atggcttgat agcaggccag gtttctgatt gagcggcatc ggacaacaag
7980ttcataacgt aactacctct cttgatagca tcttcaacag tgaacaagtt cttgcctgga
8040acccaaccgt cttcgatggc agccttccaa gaagcaccat ctttacggac accaatgata
8100acgttcaaac cgttgtctct caagttcaaa ccttgaccgt aaccttggga accgtaaccg
8160atcaaagcaa aagtgtcgtt cttgaagtag tccaacaact tttctcttgg ccagtcagct
8220ctttcgtaga cggtttcaac agtaccaccg aagttgattt gcttcaacat cctcagctct
8280agatttgaat atgtattact tggttatggt tatatatgac aaaagaaaaa gaagaacaga
8340agaataacgc aaggaagaac aataactgaa attgatagag aagtattatg tctttgtctt
8400tttataataa atcaagtgca gaaatccgtt agacaacatg agggataaaa tttaacgtgg
8460gcgaagaaga aggaaaaaag tttttgtgag ggcgtaattg aagcgatctg ttgattgtag
8520attttttttt tttgaggagt caaagtcaga agagaacaga caaatggtat taaccatcca
8580atactttttt ggagcaacgc taagctcatg cttttccatt ggttacgtgc tcagttgtta
8640gatatggaaa gagaggatgc tcacggcagc gtgactccaa ttgagcccga aagagaggat
8700gccacgtttt cccgacggct gctagaatgg aaaaaggaaa aatagaagaa tcccattcct
8760atcattattt acgtaatgac ccacacattt ttgagatttt caactattac gtattacgat
8820aatcctgctg tcattatcat tattatctat atcgacgtat gcaacgtatg tgaagccaag
8880taggcaatta tttagtactg tcagtattgt tattcatttc agatctatcc gcggtggagc
8940tcgaattcac tggccgtcgt tttacaacgt cgtgactggg aaaaccctgg cgttacccaa
9000cttaatcgcc ttgcagcaca tccccctttc gccagctggc gtaatagcga agaggcccgc
9060accgatcgcc cttcccaaca gttgcgcagc ctgaatggcg aatggcgcct gatgcggtat
9120tttctcctta cgcatctgtg cggtatttca caccgcatac gtcaaagcaa ccatagtacg
9180cgccctgtag cggcgcatta agcgcggcgg gtgtggtggt tacgcgcagc gtgaccgcta
9240cacttgccag cgccttagcg cccgctcctt tcgctttctt cccttccttt ctcgccacgt
9300tcgccggctt tccccgtcaa gctctaaatc gggggctccc tttagggttc cgatttagtg
9360ctttacggca cctcgacccc aaaaaacttg atttgggtga tggttcacgt agtgggccat
9420cgccctgata gacggttttt cgccctttga cgttggagtc cacgttcttt aatagtggac
9480tcttgttcca aactggaaca acactcaact ctatctcggg ctattctttt gatttataag
9540ggattttgcc gatttcggtc tattggttaa aaaatgagct gatttaacaa aaatttaacg
9600cgaattttaa caaaatatta acgtttacaa ttttatggtg cactctcagt acaatctgct
9660ctgatgccgc atagttaagc cagccccgac acccgccaac acccgctgac gcgccctgac
9720gggcttgtct gctcccggca tccgcttaca gacaagctgt gaccgtctcc gggagctgca
9780tgtgtcagag gttttcaccg tcatcaccga aacgcgcgag acgaaagggc ctcgtgatac
9840gcctattttt ataggttaat gtcatgataa taatggtttc ttagacgtca ggtggcactt
9900ttcggggaaa tgtgcgcgga acccctattt gtttattttt ctaaatacat tcaaatatgt
9960atccgctcat gagacaataa ccctgataaa tgcttcaata atattgaaaa aggaagagta
10020tgagtattca acatttccgt gtcgccctta ttcccttttt tgcggcattt tgccttcctg
10080tttttgctca cccagaaacg ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac
10140gagtgggtta catcgaactg gatctcaaca gcggtaagat ccttgagagt tttcgccccg
10200aagaacgttt tccaatgatg agcactttta aagttctgct atgtggcgcg gtattatccc
10260gtattgacgc cgggcaagag caactcggtc gccgcataca ctattctcag aatgacttgg
10320ttgagtactc accagtcaca gaaaagcatc ttacggatgg catgacagta agagaattat
10380gcagtgctgc cataaccatg agtgataaca ctgcggccaa cttacttctg acaacgatcg
10440gaggaccgaa ggagctaacc gcttttttgc acaacatggg ggatcatgta actcgccttg
10500atcgttggga accggagctg aatgaagcca taccaaacga cgagcgtgac accacgatgc
10560ctgtagcaat ggcaacaacg ttgcgcaaac tattaactgg cgaactactt actctagctt
10620cccggcaaca attaatagac tggatggagg cggataaagt tgcaggacca cttctgcgct
10680cggcccttcc ggctggctgg tttattgctg ataaatctgg agccggtgag cgtgggtctc
10740gcggtatcat tgcagcactg gggccagatg gtaagccctc ccgtatcgta gttatctaca
10800cgacggggag tcaggcaact atggatgaac gaaatagaca gatcgctgag ataggtgcct
10860cactgattaa gcattggtaa ctgtcagacc aagtttactc atatatactt tagattgatt
10920taaaacttca tttttaattt aaaaggatct aggtgaagat cctttttgat aatctcatga
10980ccaaaatccc ttaacgtgag ttttcgttcc actgagcgtc agaccccgta gaaaagatca
11040aaggatcttc ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac
11100caccgctacc agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg
11160taactggctt cagcagagcg cagataccaa atactgttct tctagtgtag ccgtagttag
11220gccaccactt caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac
11280cagtggctgc tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt
11340taccggataa ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg
11400agcgaacgac ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc
11460ttcccgaagg gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc
11520gcacgaggga gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc
11580acctctgact tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa
11640acgccagcaa cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt
11700tctttcctgc gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg
11760ataccgctcg ccgcagccga acgaccgagc gcagcgagtc agtgagcgag gaagcggaag
11820agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa tgcagctggc
11880acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc
11940tcactcatta ggcaccccag gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa
12000ttgtgagcgg ataacaattt cacacaggaa acagctatga ccatgattac gccaagcttt
12060ttctttccaa tttttttttt ttcgtcatta taaaaatcat tacgaccgag attcccgggt
12120aataactgat ataattaaat tgaagctcta atttgtgagt ttagtataca tgcatttact
12180tataatacag ttttttagtt ttgctggccg catcttctca aatatgcttc ccagcctgct
12240tttctgtaac gttcaccctc taccttagca tcccttccct ttgcaaatag tcctcttcca
12300acaataataa tgtcagatcc tgtagagacc acatcatcca cggttctata ctgttgaccc
12360aatgcgtctc ccttgtcatc taaacccaca ccgggtgtca taatcaacca atcgtaacct
12420tcatctcttc cacccatgtc tctttgagca ataaagccga taacaaaatc tttgtcgctc
12480ttcgcaatgt caacagtacc cttagtatat tctccagtag atagggagcc cttgcatgac
12540aattctgcta acatcaaaag gcctctaggt tcctttgtta cttcttctgc cgcctgcttc
12600aaaccgctaa caatacctgg gcccaccaca ccgtgtgcat tcgtaatgtc tgcccattct
12660gctattctgt atacacccgc agagtactgc aatttgactg tattaccaat gtcagcaaat
12720tttctgtctt cgaagagtaa aaaattgtac ttggcggata atgcctttag cggcttaact
12780gtgccctcca tggaaaaatc agtcaagata tccacatgtg tttttagtaa acaaattttg
12840ggacctaatg cttcaactaa ctccagtaat tccttggtgg tacgaacatc caatgaagca
12900cacaagtttg tttgcttttc gtgcatgata ttaaatagct tggcagcaac aggactagga
12960tgagtagcag cacgttcctt atatgtagct ttcgacatga tttatcttcg tttcctgcag
13020gtttttgttc tgtgcagttg ggttaagaat actgggcaat ttcatgtttc ttcaacacta
13080catatgcgta tatataccaa tctaagtctg tgctccttcc ttcgttcttc cttctgttcg
13140gagattaccg aatcaaaaaa atttcaagga aaccgaaatc aaaaaaaaga ataaaaaaaa
13200aatgatgaat tgaaaagctt gcatgcctgc aggtcgactc tagtatactc cgtctactgt
13260acgatacact tccgctcagg tccttgtcct ttaacgaggc cttaccactc ttttgttact
13320ctattgatcc agctcagcaa aggcagtgtg atctaagatt ctatcttcgc gatgtagtaa
13380aactagctag accgagaaag agactagaaa tgcaaaaggc acttctacaa tggctgccat
13440cattattatc cgatgtgacg ctgcattttt tttttttttt tttttttttt tttttttttt
13500tttttttttt tttttttgta caaatatcat aaaaaaagag aatcttttta agcaaggatt
13560ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga accacctaaa
13620tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat ggctttacct
13680tcttcaggca agttcaatga caatttcaac atcattgcag cagacaagat agtggcgata
13740gggttgacct tattctttgg caaatctgga gcggaaccat ggcatggttc gtacaaacca
13800aatgcggtgt tcttgtctgg caaagaggcc aaggacgcag atggcaacaa acccaaggag
13860cctgggataa cggaggcttc atcggagatg atatcaccaa acatgttgct ggtgattata
13920ataccattta ggtgggttgg gttcttaact aggatcatgg cggcagaatc aatcaattga
13980tgttgaactt tcaatgtagg gaattcgttc ttgatggttt cctccacagt ttttctccat
14040aatcttgaag aggccaaaac attagcttta tccaaggacc aaataggcaa tggtggctca
14100tgttgtaggg ccatgaaagc ggccattctt gtgattcttt gcacttctgg aacggtgtat
14160tgttcactat cccaagcgac accatcacca tcgtcttcct ttctcttacc aaagtaaata
14220cctcccacta attctctaac aacaacgaag tcagtacctt tagcaaattg tggcttgatt
14280ggagataagt ctaaaagaga gtcggatgca aagttacatg gtcttaagtt ggcgtacaat
14340tgaagttctt tacggatttt tagtaaacct tgttcaggtc taacactacc ggtaccccat
14400ttaggaccac ccacagcacc taacaaaacg gcatcagcct tcttggaggc ttccagcgcc
14460tcatctggaa gtggaacacc tgtagcatcg atagcagcac caccaattaa atgattttcg
14520aaatcgaact tgacattgga acgaacatca gaaatagctt taagaacctt aatggcttcg
14580gctgtgattt cttgaccaac gtggtcacct ggcaaaacga cgatcttctt aggggcagac
14640attacaatgg tatatccttg aaatatatat aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
14700tgcagcttct caatgatatt cgaatacgct ttgaggagat acagcctaat atccgacaaa
14760ctgttttaca gatttacgat cgtacttgtt acccatcatt gaattttgaa catccgaacc
14820tgggagtttt ccctgaaaca gatagtatat ttgaacctgt ataataatat atagtctagc
14880gctttacgga agacaatgta tgtatttcgg ttcctggaga aactattgca tctattgcat
14940aggtaatctt gcacgtcgca tccccggttc attttctgcg tttccatctt gcacttcaat
15000agcatatctt tgttaacgaa gcatctgtgc ttcattttgt agaacaaaaa tgcaacgcga
15060gagcgctaat ttttcaaaca aagaatctga gctgcatttt tacagaacag aaatgcaacg
15120cgaaagcgct attttaccaa cgaagaatct gtgcttcatt tttgtaaaac aaaaatgcaa
15180cgcgagagcg ctaatttttc aaacaaagaa tctgagctgc atttttacag aacagaaatg
15240caacgcgaga gcgctatttt accaacaaag aatctatact tcttttttgt tctacaaaaa
15300tgcatcccga gagcgctatt tttctaacaa agcatcttag attacttttt ttctcctttg
15360tgcgctctat aatgcagtct cttgataact ttttgcactg taggtccgtt aaggttagaa
15420gaaggctact ttggtgtcta ttttctcttc cataaaaaaa gcctgactcc acttcccgcg
15480tttactgatt actagcgaag ctgcgggtgc attttttcaa gataaaggca tccccgatta
15540tattctatac cgatgtggat tgcgcatact ttgtgaacag aaagtgatag cgttgatgat
15600tcttcattgg tcagaaaatt atgaacggtt tcttctattt tgtctctata tactacgtat
15660aggaaatgtt tacattttcg tattgttttc gattcactct atgaatagtt cttactacaa
15720tttttttgtc taaagagtaa tactagagat aaacataaaa aatgtagagg tcgagtttag
15780atgcaagttc aaggagcgaa aggtggatgg gtaggttata tagggatata gcacagagat
15840atatagcaaa gagatacttt tgagcaatgt ttgtggaagc ggtattcgca atattttagt
15900agctcgttac agtccggtgc gtttttggtt ttttgaaagt gcgtcttcag agcgcttttg
15960gttttcaaaa gcgctctgaa gttcctatac tttctagaga ataggaactt cggaatagga
16020acttcaaagc gtttccgaaa acgagcgctt ccgaaaatgc aacgcgagct gcgcacatac
16080agctcactgt tcacgtcgca cctatatctg cgtgttgcct gtatatatat atacatgaga
16140agaacggcat agtgcgtgtt tatgcttaaa tgcgtactta tatgcgtcta tttatgtagg
16200atgaaaggta gtctagtacc tcctgtgata ttatcccatt ccatgcgggg tatcgtatgc
16260ttccttcagc actacccttt agctgttcta tatgctgcca ctcctcaatt ggattagtct
16320catccttcaa tgctatcatt tcctttgata ttggatcata tgcatagtac cgagaaacta
16380gaggatc
16387
129 26 DNA artificial sequence Primer
129 ccaggccaat tcaacagact gtcggc 26
130 476 PRT Artificial sequence Protein variant
130
Met Val His Leu Gly Pro Ala Asp Val Pro Lys Glu Leu Met Gln Gln
1               5                   10                  15
Ile Glu Asn Phe Glu Lys Ile Phe Thr Val Pro Thr Glu Thr Leu Gln
            20                  25                  30
Ala Val Thr Lys His Phe Ile Ser Glu Leu Glu Lys Gly Leu Ser Lys
        35                  40                  45
Lys Gly Gly Asn Ile Pro Met Ile Pro Gly Trp Val Met Asp Phe Pro
    50                  55                  60
Thr Gly Lys Glu Ser Gly Asp Phe Leu Ala Ile Asp Leu Gly Gly Thr
65                  70                  75                  80
Asn Leu Arg Val Val Leu Val Lys Leu Gly Gly Asp Arg Thr Phe Asp
                85                  90                  95
Thr Thr Gln Ser Lys Tyr Arg Leu Pro Asp Ala Met Arg Thr Thr Gln
            100                 105                 110
Asn Pro Asp Glu Leu Trp Glu Phe Ile Ala Asp Ser Leu Lys Ala Phe
        115                 120                 125
Ile Asp Glu Gln Phe Pro Gln Gly Ile Ser Glu Pro Ile Pro Leu Gly
    130                 135                 140
Phe Thr Phe Ser Phe Pro Ala Ser Gln Asn Lys Ile Asn Glu Gly Ile
145                 150                 155                 160
Leu Gln Arg Trp Thr Lys Gly Phe Asp Ile Pro Asn Ile Glu Asn His
                165                 170                 175
Asp Val Val Pro Met Leu Gln Lys Gln Ile Thr Lys Arg Asn Ile Pro
            180                 185                 190
Ile Glu Val Val Ala Leu Ile Asn Asp Thr Thr Gly Thr Leu Val Ala
        195                 200                 205
Ser Tyr Tyr Thr Asp Pro Glu Thr Lys Met Gly Val Ile Phe Gly Thr
    210                 215                 220
Gly Val Asn Gly Ala Tyr Tyr Asp Val Cys Ser Asp Ile Glu Lys Leu
225                 230                 235                 240
Gln Gly Lys Leu Ser Asp Asp Ile Pro Pro Ser Ala Pro Met Ala Ile
                245                 250                 255
Asn Cys Glu Tyr Gly Ser Phe Asp Asn Glu His Val Val Leu Pro Arg
            260                 265                 270
Thr Lys Tyr Asp Ile Thr Ile Asp Glu Glu Ser Pro Arg Pro Gly Gln
        275                 280                 285
Gln Thr Phe Glu Lys Met Ser Ser Gly Tyr Tyr Leu Gly Glu Ile Leu
    290                 295                 300
Arg Leu Ala Leu Met Asp Met Tyr Lys Gln Gly Phe Ile Phe Lys Asn
305                 310                 315                 320
Gln Asp Leu Ser Lys Phe Asp Lys Pro Phe Val Met Asp Thr Ser Tyr
                325                 330                 335
Pro Ala Arg Ile Glu Glu Asp Pro Phe Glu Asn Leu Glu Asp Thr Asp
            340                 345                 350
Asp Leu Phe Gln Asn Glu Phe Gly Ile Asn Thr Thr Val Gln Glu Arg
        355                 360                 365
Lys Leu Ile Arg Arg Leu Ser Glu Leu Ile Gly Ala Arg Ala Ala Arg
    370                 375                 380
Leu Ser Val Cys Gly Ile Ala Ala Ile Cys Gln Lys Arg Gly Tyr Lys
385                 390                 395                 400
Thr Gly His Ile Ala Ala Asp Gly Ser Val Tyr Asn Arg Tyr Pro Gly
                405                 410                 415
Phe Lys Glu Lys Ala Ala Asn Ala Leu Lys Asp Ile Tyr Gly Trp Thr
            420                 425                 430
Gln Thr Ser Leu Asp Asp Tyr Pro Ile Lys Ile Val Pro Ala Glu Asp
        435                 440                 445
Gly Ser Gly Ala Gly Ala Ala Val Ile Ala Ala Leu Ala Gln Lys Arg
    450                 455                 460
Ile Ala Glu Gly Lys Ser Val Gly Ile Ile Gly Ala
465                 470                 475
131 723 DNA Saccharomyces cerevisiae
131 caacaaaagc
ttgtgtacaa tatggacttc ctcttttctg gcaaccaaac ccatacatcg 60ggattcctat
aataccttcg ttggtctccc taacatgtag gtggcggagg ggagatatac 120aatagaacag
ataccagaca agacataatg ggctaaacaa gactacacca attacactgc 180ctcattgatg
gtggtacata acgaactaat actgtagccc tagacttgat agccatcatc 240atatcgaagt
ttcactaccc tttttccatt tgccatctat tgaagtaata ataggcgcat 300gcaacttctt
ttcttttttt ttcttttctc tctcccccgt tgttgtctca ccatatccgc 360aatgacaaaa
aaatgatgga agacactaaa ggaaaaaatt aacgacaaag acagcaccaa 420cagatgtcgt
tgttccagag ctgatgaggg gtatctcgaa gcacacgaaa ctttttcctt 480ccttcattca
cgcacactac tctctaatga gcaacggtat acggccttcc ttccagttac 540ttgaatttga
aataaaaaaa agtttgctgt cttgctatca agtataaata gacctgcaat 600tattaatctt
ttgtttcctc gtcattgttc tcgttccctt tcttccttgt ttctttttct 660gcacaatatt
tcaagctata ccaagcatac aatcaactat ctcatataca tctagacaaa 720ctt
723
132 1747 DNA artificial sequence Synthetic construct
132 atggtacatt
taggtccagc agatgtgccc aaggaattga tgcagcaaat tgaaaatttt 60gagaagatct
ttacagtgcc tactgaaacc ctccaggctg tcaccaagca tttcatttca 120gaactggaaa
agggtttgtc taaaaagggg ggtaatatcc caatgattcc aggttgggta 180atggattttc
ctacaggaaa ggaatccggt gattttttgg caatagacct aggaggcaca 240aacttaaggg
ttgtacttgt taagttaggc ggtgatcgta cgtttgatac gacacaatcg 300aaatataggt
taccagatgc gatgagaact actcagaatc ctgacgaact atgggagttc 360atcgcagact
cattaaaagc attcatcgac gaacagttcc cccagggtat cagcgaacct 420attccactag
gtttcacttt ctcttttcct gcctctcaaa acaagatcaa cgaaggcatt 480ctacaaagat
ggacaaaggg cttcgatata cctaacatcg aaaatcacga cgttgtgcct 540atgctacaga
agcagattac taaaagaaat attcctattg aagttgttgc tctaattaac 600gatactacag
gcacgctcgt tgcctcgtac tacactgacc ctgaaacgaa aatgggcgtt 660attttcggta
ctggtgttaa tggagcctac tacgatgtct gttcggatat cgaaaaactg 720caaggaaaac
tatccgatga cattccacct tccgcgccta tggcaataaa ttgtgaatac 780ggatcttttg
ataatgaaca cgttgttcta cctagaacta aatatgatat aactatcgat 840gaagaaagtc
caagacctgg acaacaaaca ttcgaaaaga tgtcgtcagg ttactactta 900ggtgagatat
tgagactggc tttgatggat atgtacaaac agggttttat cttcaagaat 960caagacttga
gtaaattcga caagccattt gttatggata cttcatatcc tgctagaata 1020gaagaagatc
ccttcgaaaa cttggaagac acagatgatc ttttccaaaa cgaatttgga 1080attaatacca
ccgtacaaga aagaaagttg ataagacgtt tgtctgaact tatcggagct 1140agggccgcaa
gactgagtgt gtgtggtata gctgctattt gccaaaagag gggatataaa 1200actggtcaca
ttgctgctga tggtagcgtt tacaatagat acccaggatt taaggaaaaa 1260gcagccaatg
ctctaaaaga tatatacggt tggactcaaa cctcactcga tgattaccct 1320attaaaattg
ttccggctga ggatggctcc ggtgctggag ccgcagttat tgcagctttg 1380gcacagaaaa
ggatagcgga aggtaaaagt gtaggtatta ttggtgcgtg agagtaagcg 1440aatttcttat
gatttatgat ttttattatt aaataagtta taaaaaaaat aagtgtatac 1500aaattttaaa
gtgactctta ggttttaaaa cgaaaattct tattcttgag taactctttc 1560ctgtaggtca
ggttgctttc tcaggtatag catgaggtcg ctcttattga ccacacctct 1620accggcatgc
cgagcaaatg cctgcaaatc gctccccatt tcacccaatt gtagatatgc 1680taactccagc
aatgagttga tgaatctcgg tgtgtatttt atgtcctcag aggacaacac 1740ctgtggt
1747
133 35 DNA artificial sequence primer
133 tcatctaaag cttatggtac atttaggtcc agcag 35
134 39 DNA artificial sequence primer
134 tatttagtgg atccaccaca ggtgttgtcc tctgaggac 39
135 92 DNA artificial sequence primer
135 tttttctttg aaaaggttgt aggaatataa ttctccacac ataataagta cgctaattaa 60
ataaaatggt acatttaggt ccagcagatg tg 92
136 96 DNA artificial sequence primer
136 aacatgttca cataagtaga aaaagggcac cttcttgttg ttcaaactta atttacaaat 60
taagtcacct tggctaactc gttgtatcat cactgg 96
137 30 DNA artificial sequence primer
137 gcatatttga gaagatgcgg ccagcaaaac 30
138 30 DNA artificial sequence primer
138 gaagtgtaga gagggttaaa attggcgtgc 30
139 5959 DNA artificial sequence Synthetic construct
139 agcttatggt
acatttaggt ccagcagatg tgcccaagga attgatgcag caaattgaaa 60attttgagaa
gatctttaca gtgcctactg aaaccctcca ggctgtcacc aagcatttca 120tttcagaact
ggaaaagggt ttgtctaaaa aggggggtaa tatcccaatg attccaggtt 180gggtaatgga
ttttcctaca ggaaaggaat ccggtgattt tttggcaata gacctaggag 240gcacaaactt
aagggttgta cttgttaagt taggcggtga tcgtacgttt gatacgacac 300aatcgaaata
taggttacca gatgcgatga gaactactca gaatcctgac gaactatggg 360agttcatcgc
agactcatta aaagcattca tcgacgaaca gttcccccag ggtatcagcg 420aacctattcc
actaggtttc actttctctt ttcctgcctc tcaaaacaag atcaacgaag 480gcattctaca
aagatggaca aagggcttcg atatacctaa catcgaaaat cacgacgttg 540tgcctatgct
acagaagcag attactaaaa gaaatattcc tattgaagtt gttgctctaa 600ttaacgatac
tacaggcacg ctcgttgcct cgtactacac tgaccctgaa acgaaaatgg 660gcgttatttt
cggtactggt gttaatggag cctactacga tgtctgttcg gatatcgaaa 720aactgcaagg
aaaactatcc gatgacattc caccttccgc gcctatggca ataaattgtg 780aatacggatc
ttttgataat gaacacgttg ttctacctag aactaaatat gatataacta 840tcgatgaaga
aagtccaaga cctggacaac aaacattcga aaagatgtcg tcaggttact 900acttaggtga
gatattgaga ctggctttga tggatatgta caaacagggt tttatcttca 960agaatcaaga
cttgagtaaa ttcgacaagc catttgttat ggatacttca tatcctgcta 1020gaatagaaga
agatcccttc gaaaacttgg aagacacaga tgatcttttc caaaacgaat 1080ttggaattaa
taccaccgta caagaaagaa agttgataag acgtttgtct gaacttatcg 1140gagctagggc
cgcaagactg agtgtgtgtg gtatagctgc tatttgccaa aagaggggat 1200ataaaactgg
tcacattgct gctgatggta gcgtttacaa tagataccca ggatttaagg 1260aaaaagcagc
caatgctcta aaagatatat acggttggac tcaaacctca ctcgatgatt 1320accctattaa
aattgttccg gctgaggatg gctccggtgc tggagccgca gttattgcag 1380ctttggcaca
gaaaaggata gcggaaggta aaagtgtagg tattattggt gcgtgagagt 1440aagcgaattt
cttatgattt atgattttta ttattaaata agttataaaa aaaataagtg 1500tatacaaatt
ttaaagtgac tcttaggttt taaaacgaaa attcttattc ttgagtaact 1560ctttcctgta
ggtcaggttg ctttctcagg tatagcatga ggtcgctctt attgaccaca 1620cctctaccgg
catgccgagc aaatgcctgc aaatcgctcc ccatttcacc caattgtaga 1680tatgctaact
ccagcaatga gttgatgaat ctcggtgtgt attttatgtc ctcagaggac 1740aacacctgtg
gtggatccgc attgcggatt acgtattcta atgttcagat aacttcgtat 1800agcatacatt
atacgaagtt atgcagattg tactgagagt gcaccatacc acagcttttc 1860aattcaattc
atcatttttt ttttattctt ttttttgatt tcggtttctt tgaaattttt 1920ttgattcggt
aatctccgaa cagaaggaag aacgaaggaa ggagcacaga cttagattgg 1980tatatatacg
catatgtagt gttgaagaaa catgaaattg cccagtattc ttaacccaac 2040tgcacagaac
aaaaacctgc aggaaacgaa gataaatcat gtcgaaagct acatataagg 2100aacgtgctgc
tactcatcct agtcctgttg ctgccaagct atttaatatc atgcacgaaa 2160agcaaacaaa
cttgtgtgct tcattggatg ttcgtaccac caaggaatta ctggagttag 2220ttgaagcatt
aggtcccaaa atttgtttac taaaaacaca tgtggatatc ttgactgatt 2280tttccatgga
gggcacagtt aagccgctaa aggcattatc cgccaagtac aattttttac 2340tcttcgaaga
cagaaaattt gctgacattg gtaatacagt caaattgcag tactctgcgg 2400gtgtatacag
aatagcagaa tgggcagaca ttacgaatgc acacggtgtg gtgggcccag 2460gtattgttag
cggtttgaag caggcggcag aagaagtaac aaaggaacct agaggccttt 2520tgatgttagc
agaattgtca tgcaagggct ccctatctac tggagaatat actaagggta 2580ctgttgacat
tgcgaagagc gacaaagatt ttgttatcgg ctttattgct caaagagaca 2640tgggtggaag
agatgaaggt tacgattggt tgattatgac acccggtgtg ggtttagatg 2700acaagggaga
cgcattgggt caacagtata gaaccgtgga tgatgtggtc tctacaggat 2760ctgacattat
tattgttgga agaggactat ttgcaaaggg aagggatgct aaggtagagg 2820gtgaacgtta
cagaaaagca ggctgggaag catatttgag aagatgcggc cagcaaaact 2880aaaaaactgt
attataagta aatgcatgta tactaaactc acaaattaga gcttcaattt 2940aattatatca
gttattaccc tatgcggtgt gaaataccgc acagatgcgt aaggagaaaa 3000taccgcatca
ggaaattgta aacgttaata ttttgttaaa attcgcgtta aatttttgtt 3060aaatcagctc
attttttaac caataggccg aaatcggcaa aatcccttat aaatcaaaag 3120aatagaccga
gatagggttg agtgttgttc cagtttggaa caagagtcca ctattaaaga 3180acgtggactc
caacgtcaaa gggcgaaaaa ccgtctatca gggcgatggc ccactacgtg 3240aaccatcacc
ctaatcaaga taacttcgta tagcatacat tatacgaagt tatccagtga 3300tgatacaacg
agttagccaa ggtgaattca ctggccgtcg ttttacaacg tcgtgactgg 3360gaaaaccctg
gcgttaccca acttaatcgc cttgcagcac atcccccttt cgccagctgg 3420cgtaatagcg
aagaggcccg caccgatcgc ccttcccaac agttgcgcag cctgaatggc 3480gaatggcgcc
tgatgcggta ttttctcctt acgcatctgt gcggtatttc acaccgcata 3540tggtgcactc
tcagtacaat ctgctctgat gccgcatagt taagccagcc ccgacacccg 3600ccaacacccg
ctgacgcgcc ctgacgggct tgtctgctcc cggcatccgc ttacagacaa 3660gctgtgaccg
tctccgggag ctgcatgtgt cagaggtttt caccgtcatc accgaaacgc 3720gcgagacgaa
agggcctcgt gatacgccta tttttatagg ttaatgtcat gataataatg 3780gtttcttaga
cgtcaggtgg cacttttcgg ggaaatgtgc gcggaacccc tatttgttta 3840tttttctaaa
tacattcaaa tatgtatccg ctcatgagac aataaccctg ataaatgctt 3900caataatatt
gaaaaaggaa gagtatgagt attcaacatt tccgtgtcgc ccttattccc 3960ttttttgcgg
cattttgcct tcctgttttt gctcacccag aaacgctggt gaaagtaaaa 4020gatgctgaag
atcagttggg tgcacgagtg ggttacatcg aactggatct caacagcggt 4080aagatccttg
agagttttcg ccccgaagaa cgttttccaa tgatgagcac ttttaaagtt 4140ctgctatgtg
gcgcggtatt atcccgtatt gacgccgggc aagagcaact cggtcgccgc 4200atacactatt
ctcagaatga cttggttgag tactcaccag tcacagaaaa gcatcttacg 4260gatggcatga
cagtaagaga attatgcagt gctgccataa ccatgagtga taacactgcg 4320gccaacttac
ttctgacaac gatcggagga ccgaaggagc taaccgcttt tttgcacaac 4380atgggggatc
atgtaactcg ccttgatcgt tgggaaccgg agctgaatga agccatacca 4440aacgacgagc
gtgacaccac gatgcctgta gcaatggcaa caacgttgcg caaactatta 4500actggcgaac
tacttactct agcttcccgg caacaattaa tagactggat ggaggcggat 4560aaagttgcag
gaccacttct gcgctcggcc cttccggctg gctggtttat tgctgataaa 4620tctggagccg
gtgagcgtgg gtctcgcggt atcattgcag cactggggcc agatggtaag 4680ccctcccgta
tcgtagttat ctacacgacg gggagtcagg caactatgga tgaacgaaat 4740agacagatcg
ctgagatagg tgcctcactg attaagcatt ggtaactgtc agaccaagtt 4800tactcatata
tactttagat tgatttaaaa cttcattttt aatttaaaag gatctaggtg 4860aagatccttt
ttgataatct catgaccaaa atcccttaac gtgagttttc gttccactga 4920gcgtcagacc
ccgtagaaaa gatcaaagga tcttcttgag atcctttttt tctgcgcgta 4980atctgctgct
tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa 5040gagctaccaa
ctctttttcc gaaggtaact ggcttcagca gagcgcagat accaaatact 5100gtccttctag
tgtagccgta gttaggccac cacttcaaga actctgtagc accgcctaca 5160tacctcgctc
tgctaatcct gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt 5220accgggttgg
actcaagacg atagttaccg gataaggcgc agcggtcggg ctgaacgggg 5280ggttcgtgca
cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag 5340cgtgagctat
gagaaagcgc cacgcttccc gaagggagaa aggcggacag gtatccggta 5400agcggcaggg
tcggaacagg agagcgcacg agggagcttc cagggggaaa cgcctggtat 5460ctttatagtc
ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt gtgatgctcg 5520tcaggggggc
ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc 5580ttttgctggc
cttttgctca catgttcttt cctgcgttat cccctgattc tgtggataac 5640cgtattaccg
cctttgagtg agctgatacc gctcgccgca gccgaacgac cgagcgcagc 5700gagtcagtga
gcgaggaagc ggaagagcgc ccaatacgca aaccgcctct ccccgcgcgt 5760tggccgattc
attaatgcag ctggcacgac aggtttcccg actggaaagc gggcagtgag 5820cgcaacgcaa
ttaatgtgag ttagctcact cattaggcac cccaggcttt acactttatg 5880cttccggctc
gtatgttgtg tggaattgtg agcggataac aatttcacac aggaaacagc 5940tatgaccatg
attacgcca
5959
140 500 DNA Saccharomyces cerevisiae
140 cttttaacat ttgggcgaga ccactcttga
tcttaaagtc ttctccagtc atcgtgataa 60ttaactgaat attagtactg tgtatatttg
ctagtcgttc ctaaaggttt ctccaacaat 120accatagact tcgtccatag ctctcagcgt
cctcctattt atatcgaaaa tggtacttcg 180cagccagaat tacagacgta actaacggtg
cggcagagtg tgtcagagtc atgaagaaat 240ggcggcgcta cctgaaaagt agtgaaaaag
cccggctttc aacccttacc cttgtcggct 300gagtcattat gtcatgatga gctattccaa
ctagtgccat aaattccaac tgagtcagta 360aacggcattt atcagcaata actggtcacg
aactctttga atgttttatt ctttcttcca 420aaaatcacgt tgatgccacc aggttttttt
ttcttattat ttcatttcgt taaatagaaa 480gaaaaaccat atcttaaagt
500
141 500 DNA Saccharomyces cerevisiae
141 ccccaattac tttcatcgac tttccggaca ttgtactgtg ggttttgtgc atactttaag
60atatggtttt tctttctatt taacgaaatg aaataataag aaaaaaaaac ctggtggcat
120caacgtgatt tttggaagaa agaataaaac attcaaagag ttcgtgacca gttattgctg
180ataaatgccg tttactgact cagttggaat ttatggcact agttggaata gctcatcatg
240acataatgac tcagccgaca agggtaaggg ttgaaagccg ggctttttca ctacttttca
300ggtagcgccg ccatttcttc atgactctga cacactctgc cgcaccgtta gttacgtctg
360taattctggc tgcgaagtac cattttcgat ataaatagga ggacgctgag agctatggac
420gaagtctatg gtattgttgg agaaaccttt aggaacgact agcaaatata cacagtacta
480atattcagtt aattatcacg
500
142 9333 DNA Artificial sequence Synthetic construct
142 tcgcgcgttt
cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct
gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg
tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accataccac
agcttttcaa ttcaattcat catttttttt ttattctttt ttttgatttc 240ggtttctttg
aaattttttt gattcggtaa tctccgaaca gaaggaagaa cgaaggaagg 300agcacagact
tagattggta tatatacgca tatgtagtgt tgaagaaaca tgaaattgcc 360cagtattctt
aacccaactg cacagaacaa aaacctgcag gaaacgaaga taaatcatgt 420cgaaagctac
atataaggaa cgtgctgcta ctcatcctag tcctgttgct gccaagctat 480ttaatatcat
gcacgaaaag caaacaaact tgtgtgcttc attggatgtt cgtaccacca 540aggaattact
ggagttagtt gaagcattag gtcccaaaat ttgtttacta aaaacacatg 600tggatatctt
gactgatttt tccatggagg gcacagttaa gccgctaaag gcattatccg 660ccaagtacaa
ttttttactc ttcgaagaca gaaaatttgc tgacattggt aatacagtca 720aattgcagta
ctctgcgggt gtatacagaa tagcagaatg ggcagacatt acgaatgcac 780acggtgtggt
gggcccaggt attgttagcg gtttgaagca ggcggcagaa gaagtaacaa 840aggaacctag
aggccttttg atgttagcag aattgtcatg caagggctcc ctatctactg 900gagaatatac
taagggtact gttgacattg cgaagagcga caaagatttt gttatcggct 960ttattgctca
aagagacatg ggtggaagag atgaaggtta cgattggttg attatgacac 1020ccggtgtggg
tttagatgac aagggagacg cattgggtca acagtataga accgtggatg 1080atgtggtctc
tacaggatct gacattatta ttgttggaag aggactattt gcaaagggaa 1140gggatgctaa
ggtagagggt gaacgttaca gaaaagcagg ctgggaagca tatttgagaa 1200gatgcggcca
gcaaaactaa aaaactgtat tataagtaaa tgcatgtata ctaaactcac 1260aaattagagc
ttcaatttaa ttatatcagt tattacccta tgcggtgtga aataccgcac 1320agatgcgtaa
ggagaaaata ccgcatcagg aaattgtaaa cgttaatatt ttgttaaaat 1380tcgcgttaaa
tttttgttaa atcagctcat tttttaacca ataggccgaa atcggcaaaa 1440tcccttataa
atcaaaagaa tagaccgaga tagggttgag tgttgttcca gtttggaaca 1500agagtccact
attaaagaac gtggactcca acgtcaaagg gcgaaaaacc gtctatcagg 1560gcgatggccc
actacgtgaa ccatcaccct aatcaagttt tttggggtcg aggtgccgta 1620aagcactaaa
tcggaaccct aaagggagcc cccgatttag agcttgacgg ggaaagccgg 1680cgaacgtggc
gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg gcgctggcaa 1740gtgtagcggt
cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg ccgctacagg 1800gcgcgtcgcg
ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg 1860cctcttcgct
attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg 1920taacgccagg
gttttcccag tcacgacgtt gtaaaacgac ggccagtgag cgcgcgtaat 1980acgactcact
atagggcgaa ttgggtaccg ggccccccct cgaggtcgac tggccattaa 2040tctttcccat
attagatttc gccaagccat gaaagttcaa gaaaggtctt tagacgaatt 2100acccttcatt
tctcaaactg gcgtcaaggg atcctggtat ggttttatcg ttttatttct 2160ggttcttata
gcatcgtttt ggacttctct gttcccatta ggcggttcag gagccagcgc 2220agaatcattc
tttgaaggat acttatcctt tccaattttg attgtctgtt acgttggaca 2280taaactgtat
actagaaatt ggactttgat ggtgaaacta gaagatatgg atcttgatac 2340cggcagaaaa
caagtagatt tgactcttcg tagggaagaa atgaggattg agcgagaaac 2400attagcaaaa
agatccttcg taacaagatt tttacatttc tggtgttgaa gggaaagata 2460tgagctatac
agcggaattt ccatatcact cagattttgt tatctaattt tttccttccc 2520acgtccgcgg
gaatctgtgt atattactgc atctagatat atgttatctt atcttggcgc 2580gtacatttaa
ttttcaacgt attctataag aaattgcggg agtttttttc atgtagatga 2640tactgactgc
acgcaaatat aggcatgatt tataggcatg atttgatggc tgtaccgata 2700ggaacgctaa
gagtaacttc agaatcgtta tcctggcgga aaaaattcat ttgtaaactt 2760taaaaaaaaa
agccaatatc cccaaaatta ttaagagcgc ctccattatt aactaaaatt 2820tcactcagca
tccacaatgt atcaggtatc tactacagat attacatgtg gcgaaaaaga 2880caagaacaat
gcaatagcgc atcaagaaaa aacacaaagc tttcaatcaa tgaatcgaaa 2940atgtcattaa
aatagtatat aaattgaaac taagtcataa agctataaaa agaaaattta 3000tttaaatgca
agatttaaag taaattcacg gccctgcagg cctcagctct tgttttgttc 3060tgcaaataac
ttacccatct ttttcaaaac tttaggtgca ccctcctttg ctagaataag 3120ttctatccaa
tacatcctat ttggatctgc ttgagcttct ttcatcacgg atacgaattc 3180attttctgtt
ctcacaattt tggacacaac tctgtcttcc gttgccccga aactttctgg 3240cagttttgag
taattccaca taggaatgtc attataactc tggttcggac catgaatttc 3300cctctcaacc
gtgtaaccat cgttattaat gataaagcag attgggttta tcttctctct 3360aatggctagt
cctaattctt ggacagtcag ttgcaatgat ccatctccga taaacaataa 3420atgtctagat
tctttatctg caatttggct gcctagagct gcggggaaag tgtatcctat 3480agatccccac
aagggttgac caataaaatg tgatttcgat ttcagaaata tagatgaggc 3540accgaagaaa
gaagtgcctt gttcagccac gatcgtctca ttactttggg tcaaattttc 3600gacagcttgc
cacagtctat cttgtgacaa cagcgcgtta gaaggtacaa aatcttcttg 3660ctttttatct
atgtacttgc ctttatattc aatttcggac aagtcaagaa gagatgatat 3720cagggattcg
aagtcgaaat tttggattct ttcgttgaaa attttacctt catcgatatt 3780caaggaaatc
attttatttt cattaagatg gtgagtaaat gcacccgtac tagaatcggt 3840aagctttaca
cccaacataa gaataaaatc agcagattcc acaaattcct tcaagtttgg 3900ctctgacaga
gtaccgttgt aaatccccaa aaatgagggc aatgcttcat caacagatga 3960tttaccaaag
ttcaaagtag taataggtaa cttagtcttt gaaataaact gagtaacagt 4020cttctctagg
ccgaacgata taatttcatg gcctgtgatt acaattggtt tcttggcatt 4080cttcagactt
tcctgtattt tgttcagaat ctcttgatca gatgtattcg acgtggaatt 4140ttccttctta
agaggcaagg atggtttttc agccttagcg gcagctacat ctacaggtaa 4200attgatgtaa
accggctttc tttcctttag taaggcagac aacactctat caatttcaac 4260agttgcattc
tcggctgtca ataaagtcct ggcagcagta accggttcgt gcatcttcat 4320aaagtgcttg
aaatcaccat cagccaacgt atggtgaaca aacttacctt cgttctgcac 4380tttcgaggta
ggagatccca cgatctcaac aacaggcagg ttctcagcat aggagcccgc 4440taagccatta
actgcggata attcgccaac accaaatgta gtcaagaatg ccgcagcctt 4500tttcgttctt
gcgtacccgt cggccatata ggaggcattt aactcattag catttcccac 4560ccatttcata
tctttgtgtg aaataatttg atctagaaat tgcaaattgt agtcacctgg 4620tactccgaat
atttcttcta tacctaattc gtgtaatctg tccaacagat agtcacctac 4680tgtatacatt
ttgtttacta gtttatgtgt gtttattcga aactaagttc ttggtgtttt 4740aaaactaaaa
aaaagactaa ctataaaagt agaatttaag aagtttaaga aatagattta 4800cagaattaca
atcaatacct accgtcttta tatacttatt agtcaagtag gggaataatt 4860tcagggaact
ggtttcaacc ttttttttca gctttttcca aatcagagag agcagaaggt 4920aatagaaggt
gtaagaaaat gagatagata catgcgtggg tcaattgcct tgtgtcatca 4980tttactccag
gcaggttgca tcactccatt gaggttgtgc ccgttttttg cctgtttgtg 5040cccctgttct
ctgtagttgc gctaagagaa tggacctatg aactgatggt tggtgaagaa 5100aacaatattt
tggtgctggg attctttttt tttctggatg ccagcttaaa aagcgggctc 5160cattatattt
agtggatgcc aggaataaac tgttcaccca gacacctacg atgttatata 5220ttctgtgtaa
cccgccccct attttgggca tgtacgggtt acagcagaat taaaaggcta 5280attttttgac
taaataaagt taggaaaatc actactatta attatttacg tattctttga 5340aatggcagta
ttgataatga taaactcgaa ctgaaaaagc gtgtttttta ttcaaaatga 5400ttctaactcc
cttacgtaat caaggaatct ttttgccttg gcctccgcgt cattaaactt 5460cttgttgttg
acgctaacat tcaacgctag tatatattcg tttttttcag gtaagttctt 5520ttcaacgggt
cttactgatg aggcagtcgc gtctgaacct gttaagaggt caaatatgtc 5580ttcttgaccg
tacgtgtctt gcatgttatt agctttggga atttgcatca agtcatagga 5640aaatttaaat
cttggctctc ttgggctcaa ggtgacaagg tcctcgaaaa tagggcgcgc 5700cccaccgcgg
tggagctcca gcttttgttc cctttagtga gggttaattg cgcgcttggc 5760gtaatcatgg
tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa 5820cataggagcc
ggaagcataa agtgtaaagc ctggggtgcc taatgagtga ggtaactcac 5880attaattgcg
ttgcgctcac tgcccgcttt ccagtcggga aacctgtcgt gccagctgca 5940ttaatgaatc
ggccaacgcg cggggagagg cggtttgcgt attgggcgct cttccgcttc 6000ctcgctcact
gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat cagctcactc 6060aaaggcggta
atacggttat ccacagaatc aggggataac gcaggaaaga acatgtgagc 6120aaaaggccag
caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag 6180gctccgcccc
cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc 6240gacaggacta
taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt 6300tccgaccctg
ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct 6360ttctcatagc
tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg 6420ctgtgtgcac
gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct 6480tgagtccaac
ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat 6540tagcagagcg
aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg 6600ctacactaga
aggacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa 6660aagagttggt
agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt 6720ttgcaagcag
cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc 6780tacggggtct
gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt 6840atcaaaaagg
atcttcacct agatcctttt aaattaaaaa tgaagtttta aatcaatcta 6900aagtatatat
gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg aggcacctat 6960ctcagcgatc
tgtctatttc gttcatccat agttgcctga ctccccgtcg tgtagataac 7020tacgatacgg
gagggcttac catctggccc cagtgctgca atgataccgc gagacccacg 7080ctcaccggct
ccagatttat cagcaataaa ccagccagcc ggaagggccg agcgcagaag 7140tggtcctgca
actttatccg cctccatcca gtctattaat tgttgccggg aagctagagt 7200aagtagttcg
ccagttaata gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt 7260gtcacgctcg
tcgtttggta tggcttcatt cagctccggt tcccaacgat caaggcgagt 7320tacatgatcc
cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt 7380cagaagtaag
ttggccgcag tgttatcact catggttatg gcagcactgc ataattctct 7440tactgtcatg
ccatccgtaa gatgcttttc tgtgactggt gagtactcaa ccaagtcatt 7500ctgagaatag
tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac gggataatac 7560cgcgccacat
agcagaactt taaaagtgct catcattgga aaacgttctt cggggcgaaa 7620actctcaagg
atcttaccgc tgttgagatc cagttcgatg taacccactc gtgcacccaa 7680ctgatcttca
gcatctttta ctttcaccag cgtttctggg tgagcaaaaa caggaaggca 7740aaatgccgca
aaaaagggaa taagggcgac acggaaatgt tgaatactca tactcttcct 7800ttttcaatat
tattgaagca tttatcaggg ttattgtctc atgagcggat acatatttga 7860atgtatttag
aaaaataaac aaataggggt tccgcgcaca tttccccgaa aagtgccacc 7920tgaacgaagc
atctgtgctt cattttgtag aacaaaaatg caacgcgaga gcgctaattt 7980ttcaaacaaa
gaatctgagc tgcattttta cagaacagaa atgcaacgcg aaagcgctat 8040tttaccaacg
aagaatctgt gcttcatttt tgtaaaacaa aaatgcaacg cgagagcgct 8100aatttttcaa
acaaagaatc tgagctgcat ttttacagaa cagaaatgca acgcgagagc 8160gctattttac
caacaaagaa tctatacttc ttttttgttc tacaaaaatg catcccgaga 8220gcgctatttt
tctaacaaag catcttagat tacttttttt ctcctttgtg cgctctataa 8280tgcagtctct
tgataacttt ttgcactgta ggtccgttaa ggttagaaga aggctacttt 8340ggtgtctatt
ttctcttcca taaaaaaagc ctgactccac ttcccgcgtt tactgattac 8400tagcgaagct
gcgggtgcat tttttcaaga taaaggcatc cccgattata ttctataccg 8460atgtggattg
cgcatacttt gtgaacagaa agtgatagcg ttgatgattc ttcattggtc 8520agaaaattat
gaacggtttc ttctattttg tctctatata ctacgtatag gaaatgttta 8580cattttcgta
ttgttttcga ttcactctat gaatagttct tactacaatt tttttgtcta 8640aagagtaata
ctagagataa acataaaaaa tgtagaggtc gagtttagat gcaagttcaa 8700ggagcgaaag
gtggatgggt aggttatata gggatatagc acagagatat atagcaaaga 8760gatacttttg
agcaatgttt gtggaagcgg tattcgcaat attttagtag ctcgttacag 8820tccggtgcgt
ttttggtttt ttgaaagtgc gtcttcagag cgcttttggt tttcaaaagc 8880gctctgaagt
tcctatactt tctagagaat aggaacttcg gaataggaac ttcaaagcgt 8940ttccgaaaac
gagcgcttcc gaaaatgcaa cgcgagctgc gcacatacag ctcactgttc 9000acgtcgcacc
tatatctgcg tgttgcctgt atatatatat acatgagaag aacggcatag 9060tgcgtgttta
tgcttaaatg cgtacttata tgcgtctatt tatgtaggat gaaaggtagt 9120ctagtacctc
ctgtgatatt atcccattcc atgcggggta tcgtatgctt ccttcagcac 9180taccctttag
ctgttctata tgctgccact cctcaattgg attagtctca tccttcaatg 9240ctatcatttc
ctttgatatt ggatcatact aagaaaccat tattatcatg acattaacct 9300ataaaaatag
gcgtatcacg aggccctttc gtc
9333
SEQ ID NO: 143; Length: 9075; Type: DNA; Organism: Artificial sequence; Other information: Synthetic construct
ctagttctag
agcggccgcc accgcggtgg agctccagct tttgttccct ttagtgaggg 60ttaattgcgc
gcttggcgta atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg 120ctcacaattc
cacacaacat aggagccgga agcataaagt gtaaagcctg gggtgcctaa 180tgagtgaggt
aactcacatt aattgcgttg cgctcactgc ccgctttcca gtcgggaaac 240ctgtcgtgcc
agctgcatta atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt 300gggcgctctt
ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 360gcggtatcag
ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca 420ggaaagaaca
tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg 480ctggcgtttt
tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt 540cagaggtggc
gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc 600ctcgtgcgct
ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct 660tcgggaagcg
tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc 720gttcgctcca
agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 780tccggtaact
atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca 840gccactggta
acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 900tggtggccta
actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag 960ccagttacct
tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt 1020agcggtggtt
tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa 1080gatcctttga
tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg 1140attttggtca
tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga 1200agttttaaat
caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta 1260atcagtgagg
cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc 1320cccgtcgtgt
agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg 1380ataccgcgag
acccacgctc accggctcca gatttatcag caataaacca gccagccgga 1440agggccgagc
gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt 1500tgccgggaag
ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt 1560gctacaggca
tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc 1620caacgatcaa
ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc 1680ggtcctccga
tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca 1740gcactgcata
attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag 1800tactcaacca
agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg 1860tcaatacggg
ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa 1920cgttcttcgg
ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa 1980cccactcgtg
cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga 2040gcaaaaacag
gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga 2100atactcatac
tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg 2160agcggataca
tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt 2220ccccgaaaag
tgccacctga acgaagcatc tgtgcttcat tttgtagaac aaaaatgcaa 2280cgcgagagcg
ctaatttttc aaacaaagaa tctgagctgc atttttacag aacagaaatg 2340caacgcgaaa
gcgctatttt accaacgaag aatctgtgct tcatttttgt aaaacaaaaa 2400tgcaacgcga
gagcgctaat ttttcaaaca aagaatctga gctgcatttt tacagaacag 2460aaatgcaacg
cgagagcgct attttaccaa caaagaatct atacttcttt tttgttctac 2520aaaaatgcat
cccgagagcg ctatttttct aacaaagcat cttagattac tttttttctc 2580ctttgtgcgc
tctataatgc agtctcttga taactttttg cactgtaggt ccgttaaggt 2640tagaagaagg
ctactttggt gtctattttc tcttccataa aaaaagcctg actccacttc 2700ccgcgtttac
tgattactag cgaagctgcg ggtgcatttt ttcaagataa aggcatcccc 2760gattatattc
tataccgatg tggattgcgc atactttgtg aacagaaagt gatagcgttg 2820atgattcttc
attggtcaga aaattatgaa cggtttcttc tattttgtct ctatatacta 2880cgtataggaa
atgtttacat tttcgtattg ttttcgattc actctatgaa tagttcttac 2940tacaattttt
ttgtctaaag agtaatacta gagataaaca taaaaaatgt agaggtcgag 3000tttagatgca
agttcaagga gcgaaaggtg gatgggtagg ttatataggg atatagcaca 3060gagatatata
gcaaagagat acttttgagc aatgtttgtg gaagcggtat tcgcaatatt 3120ttagtagctc
gttacagtcc ggtgcgtttt tggttttttg aaagtgcgtc ttcagagcgc 3180ttttggtttt
caaaagcgct ctgaagttcc tatactttct agagaatagg aacttcggaa 3240taggaacttc
aaagcgtttc cgaaaacgag cgcttccgaa aatgcaacgc gagctgcgca 3300catacagctc
actgttcacg tcgcacctat atctgcgtgt tgcctgtata tatatataca 3360tgagaagaac
ggcatagtgc gtgtttatgc ttaaatgcgt acttatatgc gtctatttat 3420gtaggatgaa
aggtagtcta gtacctcctg tgatattatc ccattccatg cggggtatcg 3480tatgcttcct
tcagcactac cctttagctg ttctatatgc tgccactcct caattggatt 3540agtctcatcc
ttcaatgcta tcatttcctt tgatattgga tcatactaag aaaccattat 3600tatcatgaca
ttaacctata aaaataggcg tatcacgagg ccctttcgtc tcgcgcgttt 3660cggtgatgac
ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 3720gtaagcggat
gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 3780tcggggctgg
cttaactatg cggcatcaga gcagattgta ctgagagtgc accatatcga 3840ctacgtcgta
aggccgtttc tgacagagta aaattcttga gggaactttc accattatgg 3900gaaatgcttc
aagaaggtat tgacttaaac tccatcaaat ggtcaggtca ttgagtgttt 3960tttatttgtt
gtattttttt ttttttagag aaaatcctcc aatatcaaat taggaatcgt 4020agtttcatga
ttttctgtta cacctaactt tttgtgtggt gccctcctcc ttgtcaatat 4080taatgttaaa
gtgcaattct ttttccttat cacgttgagc cattagtatc aatttgctta 4140cctgtattcc
tttactatcc tcctttttct ccttcttgat aaatgtatgt agattgcgta 4200tatagtttcg
tctaccctat gaacatattc cattttgtaa tttcgtgtcg tttctattat 4260gaatttcatt
tataaagttt atgtacaaat atcataaaaa aagagaatct ttttaagcaa 4320ggattttctt
aacttcttcg gcgacagcat caccgacttc ggtggtactg ttggaaccac 4380ctaaatcacc
agttctgata cctgcatcca aaaccttttt aactgcatct tcaatggcct 4440taccttcttc
aggcaagttc aatgacaatt tcaacatcat tgcagcagac aagatagtgg 4500cgatagggtc
aaccttattc tttggcaaat ctggagcaga accgtggcat ggttcgtaca 4560aaccaaatgc
ggtgttcttg tctggcaaag aggccaagga cgcagatggc aacaaaccca 4620aggaacctgg
gataacggag gcttcatcgg agatgatatc accaaacatg ttgctggtga 4680ttataatacc
atttaggtgg gttgggttct taactaggat catggcggca gaatcaatca 4740attgatgttg
aaccttcaat gtagggaatt cgttcttgat ggtttcctcc acagtttttc 4800tccataatct
tgaagaggcc aaaagattag ctttatccaa ggaccaaata ggcaatggtg 4860gctcatgttg
tagggccatg aaagcggcca ttcttgtgat tctttgcact tctggaacgg 4920tgtattgttc
actatcccaa gcgacaccat caccatcgtc ttcctttctc ttaccaaagt 4980aaatacctcc
cactaattct ctgacaacaa cgaagtcagt acctttagca aattgtggct 5040tgattggaga
taagtctaaa agagagtcgg atgcaaagtt acatggtctt aagttggcgt 5100acaattgaag
ttctttacgg atttttagta aaccttgttc aggtctaaca ctaccggtac 5160cccatttagg
accagccaca gcacctaaca aaacggcatc aaccttcttg gaggcttcca 5220gcgcctcatc
tggaagtggg acacctgtag catcgatagc agcaccacca attaaatgat 5280tttcgaaatc
gaacttgaca ttggaacgaa catcagaaat agctttaaga accttaatgg 5340cttcggctgt
gatttcttga ccaacgtggt cacctggcaa aacgacgatc ttcttagggg 5400cagacatagg
ggcagacatt agaatggtat atccttgaaa tatatatata tattgctgaa 5460atgtaaaagg
taagaaaagt tagaaagtaa gacgattgct aaccacctat tggaaaaaac 5520aataggtcct
taaataatat tgtcaacttc aagtattgtg atgcaagcat ttagtcatga 5580acgcttctct
attctatatg aaaagccggt tccggcctct cacctttcct ttttctccca 5640atttttcagt
tgaaaaaggt atatgcgtca ggcgacctct gaaattaaca aaaaatttcc 5700agtcatcgaa
tttgattctg tgcgatagcg cccctgtgtg ttctcgttat gttgaggaaa 5760aaaataatgg
ttgctaagag attcgaactc ttgcatctta cgatacctga gtattcccac 5820agttaactgc
ggtcaagata tttcttgaat caggcgcctt agaccgctcg gccaaacaac 5880caattacttg
ttgagaaata gagtataatt atcctataaa tataacgttt ttgaacacac 5940atgaacaagg
aagtacagga caattgattt tgaagagaat gtggattttg atgtaattgt 6000tgggattcca
tttttaataa ggcaataata ttaggtatgt ggatatacta gaagttctcc 6060tcgaccgtcg
atatgcggtg tgaaataccg cacagatgcg taaggagaaa ataccgcatc 6120aggaaattgt
aaacgttaat attttgttaa aattcgcgtt aaatttttgt taaatcagct 6180cattttttaa
ccaataggcc gaaatcggca aaatccctta taaatcaaaa gaatagaccg 6240agatagggtt
gagtgttgtt ccagtttgga acaagagtcc actattaaag aacgtggact 6300ccaacgtcaa
agggcgaaaa accgtctatc agggcgatgg cccactacgt gaaccatcac 6360cctaatcaag
ttttttgggg tcgaggtgcc gtaaagcact aaatcggaac cctaaaggga 6420gcccccgatt
tagagcttga cggggaaagc cggcgaacgt ggcgagaaag gaagggaaga 6480aagcgaaagg
agcgggcgct agggcgctgg caagtgtagc ggtcacgctg cgcgtaacca 6540ccacacccgc
cgcgcttaat gcgccgctac agggcgcgtc gcgccattcg ccattcaggc 6600tgcgcaactg
ttgggaaggg cgatcggtgc gggcctcttc gctattacgc cagctggcga 6660aagggggatg
tgctgcaagg cgattaagtt gggtaacgcc agggttttcc cagtcacgac 6720gttgtaaaac
gacggccagt gagcgcgcgt aatacgactc actatagggc gaattgggta 6780ccgggccccc
cctcgaggtc gacggtatcg ataagcttga tatcgaattc ctgcagcccg 6840ggggatccgc
atgcttgcat ttagtcgtgc aatgtatgac tttaagattt gtgagcagga 6900agaaaaggga
gaatcttcta acgataaacc cttgaaaaac tgggtagact acgctatgtt 6960gagttgctac
gcaggctgca caattacacg agaatgctcc cgcctaggat ttaaggctaa 7020gggacgtgca
atgcagacga cagatctaaa tgaccgtgtc ggtgaagtgt tcgccaaact 7080tttcggttaa
cacatgcagt gatgcacgcg cgatggtgct aagttacata tatatatata 7140tatatatata
tagccatagt gatgtctaag taacctttat ggtatatttc ttaatgtgga 7200aagatactag
cgcgcgcacc cacacacaag cttcgtcttt tcttgaagaa aagaggaagc 7260tcgctaaatg
ggattccact ttccgttccc tgccagctga tggaaaaagg ttagtggaac 7320gatgaagaat
aaaaagagag atccactgag gtgaaatttc agctgacagc gagtttcatg 7380atcgtgatga
acaatggtaa cgagttgtgg ctgttgccag ggagggtggt tctcaacttt 7440taatgtatgg
ccaaatcgct acttgggttt gttatataac aaagaagaaa taatgaactg 7500attctcttcc
tccttcttgt cctttcttaa ttctgttgta attaccttcc tttgtaattt 7560tttttgtaat
tattcttctt aataatccaa acaaacacac atattacaat agctagctga 7620ggatgtcaac
agccggtaaa gttattaagt gtaaagcggc agttttgtgg gaagagaaaa 7680agccgtttag
catagaagaa gtagaagtag cgccaccaaa agcacacgag gttagaatca 7740agatggttgc
caccggaatc tgtagatccg acgaccatgt ggtgagtggc actctagtta 7800ctcctttgcc
agtaatcgcg ggacacgagg ctgccggaat cgttgaatcc ataggtgaag 7860gtgttaccac
tgttcgtcct ggtgataaag tgatcccact gttcactcct caatgtggta 7920agtgtagagt
ctgcaaacat cctgagggta atttctgcct taaaaatgat ttgtctatgc 7980ctagaggtac
tatgcaggat ggtacaagca gatttacatg cagagggaaa cctatacacc 8040atttccttgg
tacttctaca ttttcccaat acacagtggt ggacgagata tctgtcgcta 8100aaatcgatgc
agcttcacca ctggaaaaag tttgcttgat agggtgcgga ttttccaccg 8160gttacggttc
cgcagttaaa gttgcaaagg ttacacaggg ttcgacttgt gcagtattcg 8220gtttaggagg
agtaggacta agcgttatta tggggtgtaa agctgcaggc gcagcgagga 8280ttataggtgt
agacatcaat aaggacaaat ttgcaaaagc taaggaggtc ggggctactg 8340aatgtgttaa
ccctcaagat tataagaaac caatacaaga agtccttact gaaatgtcaa 8400acggtggagt
tgatttctct tttgaagtta taggccgtct tgatactatg gtaactgcgt 8460tgtcctgctg
tcaagaggca tatggagtca gtgtgatcgt aggtgttcct cctgattcac 8520aaaatttgtc
gatgaatcct atgctgttgc taagcggtcg tacatggaag ggagctatat 8580ttggcggttt
taagagcaag gatagtgttc caaaacttgt tgccgacttt atggcgaaga 8640agtttgctct
tgatccttta attacacatg tattgccatt cgagaaaatc aatgaagggt 8700ttgatttgtt
aagaagtggt gaatctattc gtacaatttt aactttttga ttaattaaga 8760gtaagcgaat
ttcttatgat ttatgatttt tattattaaa taagttataa aaaaaataag 8820tgtatacaaa
ttttaaagtg actcttaggt tttaaaacga aaattcttat tcttgagtaa 8880ctctttcctg
taggtcaggt tgctttctca ggtatagcat gaggtcgctc ttattgacca 8940cacctctacc
ggcatgccga gcaaatgcct gcaaatcgct ccccatttca cccaattgta 9000gatatgctaa
ctccagcaat gagttgatga atctcggtgt gtattttatg tcctcagagg 9060acaacacctg
tggta
9075
SEQ ID NO: 144; Length: 11367; Type: DNA; Organism: Artificial sequence; Other information: Synthetic construct
tcgcgcgttt
cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct
gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg
tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accatagcca
tcctcatgaa aactgtgtaa cataataacc gaagtgtcga aaaggtggca 240ccttgtccaa
ttgaacacgc tcgatgaaaa aaataagata tatataaggt taagtaaagc 300gtctgttaga
aaggaagttt ttcctttttc ttgctctctt gtcttttcat ctactatttc 360cttcgtgtaa
tacagggtcg tcagatacat agatacaatt ctattacccc catccataca 420atgccatctc
atttcgatac tgttcaacta cacgccggcc aagagaaccc tggtgacaat 480gctcacagat
ccagagctgt accaatttac gccaccactt cttatgtttt cgaaaactct 540aagcatggtt
cgcaattgtt tggtctagaa gttccaggtt acgtctattc ccgtttccaa 600aacccaacca
gtaatgtttt ggaagaaaga attgctgctt tagaaggtgg tgctgctgct 660ttggctgttt
cctccggtca agccgctcaa acccttgcca tccaaggttt ggcacacact 720ggtgacaaca
tcgtttccac ttcttactta tacggtggta cttataacca gttcaaaatc 780tcgttcaaaa
gatttggtat cgaggctaga tttgttgaag gtgacaatcc agaagaattc 840gaaaaggtct
ttgatgaaag aaccaaggct gtttatttgg aaaccattgg taatccaaag 900tacaatgttc
cggattttga aaaaattgtt gcaattgctc acaaacacgg tattccagtt 960gtcgttgaca
acacatttgg tgccggtggt tacttctgtc agccaattaa atacggtgct 1020gatattgtaa
cacattctgc taccaaatgg attggtggtc atggtactac tatcggtggt 1080attattgttg
actctggtaa gttcccatgg aaggactacc cagaaaagtt ccctcaattc 1140tctcaacctg
ccgaaggata tcacggtact atctacaatg aagcctacgg taacttggca 1200tacatcgttc
atgttagaac tgaactatta agagatttgg gtccattgat gaacccattt 1260gcctctttct
tgctactaca aggtgttgaa acattatctt tgagagctga aagacacggt 1320gaaaatgcat
tgaagttagc caaatggtta gaacaatccc catacgtatc ttgggtttca 1380taccctggtt
tagcatctca ttctcatcat gaaaatgcta agaagtatct atctaacggt 1440ttcggtggtg
tcttatcttt cggtgtaaaa gacttaccaa atgccgacaa ggaaactgac 1500ccattcaaac
tttctggtgc tcaagttgtt gacaatttaa agcttgcctc taacttggcc 1560aatgttggtg
atgccaagac cttagtcatt gctccatact tcactaccca caaacaatta 1620aatgacaaag
aaaagttggc atctggtgtt accaaggact taattcgtgt ctctgttggt 1680atcgaattta
ttgatgacat tattgcagac ttccagcaat cttttgaaac tgttttcgct 1740ggccaaaaac
catgagtgtg cgtaatgagt tgtaaaatta tgtataaacc tactttctct 1800cacaagttat
gcggtgtgaa ataccgcaca gatgcgtaag gagaaaatac cgcatcagga 1860aattgtaaac
gttaatattt tgttaaaatt cgcgttaaat ttttgttaaa tcagctcatt 1920ttttaaccaa
taggccgaaa tcggcaaaat cccttataaa tcaaaagaat agaccgagat 1980agggttgagt
gttgttccag tttggaacaa gagtccacta ttaaagaacg tggactccaa 2040cgtcaaaggg
cgaaaaaccg tctatcaggg cgatggccca ctacgtgaac catcacccta 2100atcaagtttt
ttggggtcga ggtgccgtaa agcactaaat cggaacccta aagggagccc 2160ccgatttaga
gcttgacggg gaaagccggc gaacgtggcg agaaaggaag ggaagaaagc 2220gaaaggagcg
ggcgctaggg cgctggcaag tgtagcggtc acgctgcgcg taaccaccac 2280acccgccgcg
cttaatgcgc cgctacaggg cgcgtcgcgc cattcgccat tcaggctgcg 2340caactgttgg
gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 2400gggatgtgct
gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 2460taaaacgacg
gccagtgagc gcgcgtaata cgactcacta tagggcgaat tgggtaccgg 2520gccccccctc
gaggtcgacg gtatcgataa gcttgatatc gaattcctgc agcccggggg 2580atccactagt
tctagagcgg ccgctctaga actagtacca caggtgttgt cctctgagga 2640cataaaatac
acaccgagat tcatcaactc attgctggag ttagcatatc tacaattggg 2700tgaaatgggg
agcgatttgc aggcatttgc tcggcatgcc ggtagaggtg tggtcaataa 2760gagcgacctc
atgctatacc tgagaaagca acctgaccta caggaaagag ttactcaaga 2820ataagaattt
tcgttttaaa acctaagagt cactttaaaa tttgtataca cttatttttt 2880ttataactta
tttaataata aaaatcataa atcataagaa attcgcttac tcttaattaa 2940tcaaaaagtt
aaaattgtac gaatagattc accacttctt aacaaatcaa acccttcatt 3000gattttctcg
aatggcaata catgtgtaat taaaggatca agagcaaact tcttcgccat 3060aaagtcggca
acaagttttg gaacactatc cttgctctta aaaccgccaa atatagctcc 3120cttccatgta
cgaccgctta gcaacagcat aggattcatc gacaaatttt gtgaatcagg 3180aggaacacct
acgatcacac tgactccata tgcctcttga cagcaggaca acgcagttac 3240catagtatca
agacggccta taacttcaaa agagaaatca actccaccgt ttgacatttc 3300agtaaggact
tcttgtattg gtttcttata atcttgaggg ttaacacatt cagtagcccc 3360gacctcctta
gcttttgcaa atttgtcctt attgatgtct acacctataa tcctcgctgc 3420gcctgcagct
ttacacccca taataacgct tagtcctact cctcctaaac cgaatactgc 3480acaagtcgaa
ccctgtgtaa cctttgcaac tttaactgcg gaaccgtaac cggtggaaaa 3540tccgcaccct
atcaagcaaa ctttttccag tggtgaagct gcatcgattt tagcgacaga 3600tatctcgtcc
accactgtgt attgggaaaa tgtagaagta ccaaggaaat ggtgtatagg 3660tttccctctg
catgtaaatc tgcttgtacc atcctgcata gtacctctag gcatagacaa 3720atcattttta
aggcagaaat taccctcagg atgtttgcag actctacact taccacattg 3780aggagtgaac
agtgggatca ctttatcacc aggacgaaca gtggtaacac cttcacctat 3840ggattcaacg
attccggcag cctcgtgtcc cgcgattact ggcaaaggag taactagagt 3900gccactcacc
acatggtcgt cggatctaca gattccggtg gcaaccatct tgattctaac 3960ctcgtgtgct
tttggtggcg ctacttctac ttcttctatg ctaaacggct ttttctcttc 4020ccacaaaact
gccgctttac acttaataac tttaccggct gttgacatcc tcagctagct 4080attgtaatat
gtgtgtttgt ttggattatt aagaagaata attacaaaaa aaattacaaa 4140ggaaggtaat
tacaacagaa ttaagaaagg acaagaagga ggaagagaat cagttcatta 4200tttcttcttt
gttatataac aaacccaagt agcgatttgg ccatacatta aaagttgaga 4260accaccctcc
ctggcaacag ccacaactcg ttaccattgt tcatcacgat catgaaactc 4320gctgtcagct
gaaatttcac ctcagtggat ctctcttttt attcttcatc gttccactaa 4380cctttttcca
tcagctggca gggaacggaa agtggaatcc catttagcga gcttcctctt 4440ttcttcaaga
aaagacgaag cttgtgtgtg ggtgcgcgcg ctagtatctt tccacattaa 4500gaaatatacc
ataaaggtta cttagacatc actatggcta tatatatata tatatatata 4560tatgtaactt
agcaccatcg cgcgtgcatc actgcatgtg ttaaccgaaa agtttggcga 4620acacttcacc
gacacggtca tttagatctg tcgtctgcat tgcacgtccc ttagccttaa 4680atcctaggcg
ggagcattct cgtgtaattg tgcagcctgc gtagcaactc aacatagcgt 4740agtctaccca
gtttttcaag ggtttatcgt tagaagattc tcccttttct tcctgctcac 4800aaatcttaaa
gtcatacatt gcacgactaa atgcaagcat gcggatcccc cgggctgcag 4860gaattcgata
tcaagcttat cgataccgtc gactggccat taatctttcc catattagat 4920ttcgccaagc
catgaaagtt caagaaaggt ctttagacga attacccttc atttctcaaa 4980ctggcgtcaa
gggatcctgg tatggtttta tcgttttatt tctggttctt atagcatcgt 5040tttggacttc
tctgttccca ttaggcggtt caggagccag cgcagaatca ttctttgaag 5100gatacttatc
ctttccaatt ttgattgtct gttacgttgg acataaactg tatactagaa 5160attggacttt
gatggtgaaa ctagaagata tggatcttga taccggcaga aaacaagtag 5220atttgactct
tcgtagggaa gaaatgagga ttgagcgaga aacattagca aaaagatcct 5280tcgtaacaag
atttttacat ttctggtgtt gaagggaaag atatgagcta tacagcggaa 5340tttccatatc
actcagattt tgttatctaa ttttttcctt cccacgtccg cgggaatctg 5400tgtatattac
tgcatctaga tatatgttat cttatcttgg cgcgtacatt taattttcaa 5460cgtattctat
aagaaattgc gggagttttt ttcatgtaga tgatactgac tgcacgcaaa 5520tataggcatg
atttataggc atgatttgat ggctgtaccg ataggaacgc taagagtaac 5580ttcagaatcg
ttatcctggc ggaaaaaatt catttgtaaa ctttaaaaaa aaaagccaat 5640atccccaaaa
ttattaagag cgcctccatt attaactaaa atttcactca gcatccacaa 5700tgtatcaggt
atctactaca gatattacat gtggcgaaaa agacaagaac aatgcaatag 5760cgcatcaaga
aaaaacacaa agctttcaat caatgaatcg aaaatgtcat taaaatagta 5820tataaattga
aactaagtca taaagctata aaaagaaaat ttatttaaat gcaagattta 5880aagtaaattc
acggccctgc aggcctcagc tcttgttttg ttctgcaaat aacttaccca 5940tctttttcaa
aactttaggt gcaccctcct ttgctagaat aagttctatc caatacatcc 6000tatttggatc
tgcttgagct tctttcatca cggatacgaa ttcattttct gttctcacaa 6060ttttggacac
aactctgtct tccgttgccc cgaaactttc tggcagtttt gagtaattcc 6120acataggaat
gtcattataa ctctggttcg gaccatgaat ttccctctca accgtgtaac 6180catcgttatt
aatgataaag cagattgggt ttatcttctc tctaatggct agtcctaatt 6240cttggacagt
cagttgcaat gatccatctc cgataaacaa taaatgtcta gattctttat 6300ctgcaatttg
gctgcctaga gctgcgggga aagtgtatcc tatagatccc cacaagggtt 6360gaccaataaa
atgtgatttc gatttcagaa atatagatga ggcaccgaag aaagaagtgc 6420cttgttcagc
cacgatcgtc tcattacttt gggtcaaatt ttcgacagct tgccacagtc 6480tatcttgtga
caacagcgcg ttagaaggta caaaatcttc ttgcttttta tctatgtact 6540tgcctttata
ttcaatttcg gacaagtcaa gaagagatga tatcagggat tcgaagtcga 6600aattttggat
tctttcgttg aaaattttac cttcatcgat attcaaggaa atcattttat 6660tttcattaag
atggtgagta aatgcacccg tactagaatc ggtaagcttt acacccaaca 6720taagaataaa
atcagcagat tccacaaatt ccttcaagtt tggctctgac agagtaccgt 6780tgtaaatccc
caaaaatgag ggcaatgctt catcaacaga tgatttacca aagttcaaag 6840tagtaatagg
taacttagtc tttgaaataa actgagtaac agtcttctct aggccgaacg 6900atataatttc
atggcctgtg attacaattg gtttcttggc attcttcaga ctttcctgta 6960ttttgttcag
aatctcttga tcagatgtat tcgacgtgga attttccttc ttaagaggca 7020aggatggttt
ttcagcctta gcggcagcta catctacagg taaattgatg taaaccggct 7080ttctttcctt
tagtaaggca gacaacactc tatcaatttc aacagttgca ttctcggctg 7140tcaataaagt
cctggcagca gtaaccggtt cgtgcatctt cataaagtgc ttgaaatcac 7200catcagccaa
cgtatggtga acaaacttac cttcgttctg cactttcgag gtaggagatc 7260ccacgatctc
aacaacaggc aggttctcag cataggagcc cgctaagcca ttaactgcgg 7320ataattcgcc
aacaccaaat gtagtcaaga atgccgcagc ctttttcgtt cttgcgtacc 7380cgtcggccat
ataggaggca tttaactcat tagcatttcc cacccatttc atatctttgt 7440gtgaaataat
ttgatctaga aattgcaaat tgtagtcacc tggtactccg aatatttctt 7500ctatacctaa
ttcgtgtaat ctgtccaaca gatagtcacc tactgtatac attttgttta 7560ctagtttatg
tgtgtttatt cgaaactaag ttcttggtgt tttaaaacta aaaaaaagac 7620taactataaa
agtagaattt aagaagttta agaaatagat ttacagaatt acaatcaata 7680cctaccgtct
ttatatactt attagtcaag taggggaata atttcaggga actggtttca 7740accttttttt
tcagcttttt ccaaatcaga gagagcagaa ggtaatagaa ggtgtaagaa 7800aatgagatag
atacatgcgt gggtcaattg ccttgtgtca tcatttactc caggcaggtt 7860gcatcactcc
attgaggttg tgcccgtttt ttgcctgttt gtgcccctgt tctctgtagt 7920tgcgctaaga
gaatggacct atgaactgat ggttggtgaa gaaaacaata ttttggtgct 7980gggattcttt
ttttttctgg atgccagctt aaaaagcggg ctccattata tttagtggat 8040gccaggaata
aactgttcac ccagacacct acgatgttat atattctgtg taacccgccc 8100cctattttgg
gcatgtacgg gttacagcag aattaaaagg ctaatttttt gactaaataa 8160agttaggaaa
atcactacta ttaattattt acgtattctt tgaaatggca gtattgataa 8220tgataaactc
gaactgaaaa agcgtgtttt ttattcaaaa tgattctaac tcccttacgt 8280aatcaaggaa
tctttttgcc ttggcctccg cgtcattaaa cttcttgttg ttgacgctaa 8340cattcaacgc
tagtatatat tcgttttttt caggtaagtt cttttcaacg ggtcttactg 8400atgaggcagt
cgcgtctgaa cctgttaaga ggtcaaatat gtcttcttga ccgtacgtgt 8460cttgcatgtt
attagctttg ggaatttgca tcaagtcata ggaaaattta aatcttggct 8520ctcttgggct
caaggtgaca aggtcctcga aaatagggcg cgccccaccg cggtggagct 8580cagcttttgt
tccctttagt gagggttaat tgcgcgcttg gcgtaatcat ggtcatagct 8640gtttcctgtg
tgaaattgtt atccgctcac aattccacac aacatacgag ccggaagcat 8700aaagtgtaaa
gcctggggtg cctaatgagt gagctaactc acattaattg cgttgcgctc 8760actgcccgct
ttccagtcgg gaaacctgtc gtgccagctg cattaatgaa tcggccaacg 8820cgcggggaga
ggcggtttgc gtattgggcg ctcttccgct tcctcgctca ctgactcgct 8880gcgctcggtc
gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt 8940atccacagaa
tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc 9000caggaaccgt
aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga 9060gcatcacaaa
aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata 9120ccaggcgttt
ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac 9180cggatacctg
tccgcctttc tcccttcggg aagcgtggcg ctttctcata gctcacgctg 9240taggtatctc
agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc 9300cgttcagccc
gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag 9360acacgactta
tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt 9420aggcggtgct
acagagttct tgaagtggtg gcctaactac ggctacacta gaaggacagt 9480atttggtatc
tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg 9540atccggcaaa
caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac 9600gcgcagaaaa
aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca 9660gtggaacgaa
aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac 9720ctagatcctt
ttaaattaaa aatgaagttt taaatcaatc taaagtatat atgagtaaac 9780ttggtctgac
agttaccaat gcttaatcag tgaggcacct atctcagcga tctgtctatt 9840tcgttcatcc
atagttgcct gactccccgt cgtgtagata actacgatac gggagggctt 9900accatctggc
cccagtgctg caatgatacc gcgagaccca cgctcaccgg ctccagattt 9960atcagcaata
aaccagccag ccggaagggc cgagcgcaga agtggtcctg caactttatc 10020cgcctccatc
cagtctatta attgttgccg ggaagctaga gtaagtagtt cgccagttaa 10080tagtttgcgc
aacgttgttg ccattgctac aggcatcgtg gtgtcacgct cgtcgtttgg 10140tatggcttca
ttcagctccg gttcccaacg atcaaggcga gttacatgat cccccatgtt 10200gtgcaaaaaa
gcggttagct ccttcggtcc tccgatcgtt gtcagaagta agttggccgc 10260agtgttatca
ctcatggtta tggcagcact gcataattct cttactgtca tgccatccgt 10320aagatgcttt
tctgtgactg gtgagtactc aaccaagtca ttctgagaat agtgtatgcg 10380gcgaccgagt
tgctcttgcc cggcgtcaat acgggataat accgcgccac atagcagaac 10440tttaaaagtg
ctcatcattg gaaaacgttc ttcggggcga aaactctcaa ggatcttacc 10500gctgttgaga
tccagttcga tgtaacccac tcgtgcaccc aactgatctt cagcatcttt 10560tactttcacc
agcgtttctg ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg 10620aataagggcg
acacggaaat gttgaatact catactcttc ctttttcaat attattgaag 10680catttatcag
ggttattgtc tcatgagcgg atacatattt gaatgtattt agaaaaataa 10740acaaataggg
gttccgcgca catttccccg aaaagtgcca cctgggtcct tttcatcacg 10800tgctataaaa
ataattataa tttaaatttt ttaatataaa tatataaatt aaaaatagaa 10860agtaaaaaaa
gaaattaaag aaaaaatagt ttttgttttc cgaagatgta aaagactcta 10920gggggatcgc
caacaaatac taccttttat cttgctcttc ctgctctcag gtattaatgc 10980cgaattgttt
catcttgtct gtgtagaaga ccacacacga aaatcctgtg attttacatt 11040ttacttatcg
ttaatcgaat gtatatctat ttaatctgct tttcttgtct aataaatata 11100tatgtaaagt
acgctttttg ttgaaatttt ttaaaccttt gtttattttt ttttcttcat 11160tccgtaactc
ttctaccttc tttatttact ttctaaaatc caaatacaaa acataaaaat 11220aaataaacac
agagtaaatt cccaaattat tccatcatta aaagatacga ggcgcgtgta 11280agttacaggc
aagcgatccg tcctaagaaa ccattattat catgacatta acctataaaa 11340ataggcgtat
cacgaggccc tttcgtc 11367