Patent application title: VARIANT SUCROSE TRANSPORTER POLYPEPTIDES
Inventors:
Dana Michelle Walters Pollak (Media, PA, US)
Tina K. Van Dyk (Wilmington, DE, US)
Assignees:
E. I. DU PONT DE NEMOURS AND COMPANY
IPC8 Class: AC12P720FI
USPC Class:
435146
Class name: Preparing oxygen-containing organic compound containing a carboxyl group hydroxy carboxylic acid
Publication date: 2014-05-29
Patent application number: 20140147899
Abstract:
Variant sucrose transporter polypeptides that enable bacterial growth
over a wide range of gene expression levels and sucrose concentrations
are described. Additionally, recombinant bacteria comprising these
variant sucrose transporter polypeptides, and methods of utilizing the
bacteria to produce products such as glycerol and glycerol-derived
products are provided.
Claims:
1-12. (canceled)
13. A recombinant bacterium comprising in its genome or on at least one recombinant construct: (a) a nucleotide sequence encoding a variant sucrose transporter polypeptide having an amino acid sequence that has at least 95% identity based on a Clustal W method of alignment to an amino acid sequence selected from the group consisting of SEQ ID NOs:26, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, and 98, and an amino acid at an equivalent position when compared with a reference amino acid sequence of SEQ ID NO:26 selected from the group consisting of: (i) alanine at a position equivalent to position 300; and (ii) leucine at a position equivalent to position 300; and (b) a nucleotide sequence encoding a polypeptide having sucrose hydrolase activity; wherein (a) and (b) are each operably linked to the same or a different promoter, further wherein said recombinant bacterium is capable of metabolizing sucrose.
14. The recombinant bacterium of claim 13, wherein the variant sucrose transporter polypeptide further comprises: at least one of the following amino acids at an equivalent position when compared with the reference amino acid sequence of SEQ ID NO:26: (i) proline at a position equivalent to position 61; (ii) tryptophan at a position equivalent to position 61; (iii) histidine at a position equivalent to position 61; (iv) phenylalanine at a position equivalent to position 61; (v) tyrosine at a position equivalent to position 61.
15. The recombinant bacterium of claim 14, wherein the variant sucrose transporter polypeptide comprises an amino acid change from glutamine to histidine at position 353 and an amino acid change from leucine to proline at position 61.
16. The recombinant bacterium of claim 13, wherein the polypeptide having sucrose hydrolase activity is classified as EC 3.2.1.26 or EC 2.4.1.7.
17. The recombinant bacterium of claim 13 further comprising in its genome or on at least one recombinant construct, a nucleotide sequence encoding a polypeptide having fructokinase activity.
18. The recombinant bacterium of claim 17, wherein the polypeptide having fructokinase activity is classified as EC 2.7.1.4, EC 2.7.1.3, or EC 2.7.1.1.
19. The recombinant bacterium of claim 13, wherein said bacterium is selected from the group consisting of the genera: Escherichia, Klebsiella, Citrobacter, and Aerobacter.
20. The recombinant bacterium of claim 19, wherein said bacterium is Escherichia coli.
21. The recombinant bacterium of claim 13, wherein the recombinant bacterium produces 1,3-propanediol, glycerol, and/or 3-hydroxypropionic acid.
22. A process for making glycerol, 1,3-propanediol and/or 3-hydroxypropionic acid from sucrose comprising: (a) culturing the recombinant bacterium of claim 21 in the presence of sucrose; and (b) optionally, recovering the glycerol, 1,3-propanediol and/or 3-hydroxypropionic acid produced.
Description:
FIELD OF THE INVENTION
[0001] The invention relates to the fields of microbiology and molecular biology. More specifically, variant sucrose transporter polypeptides that enable bacterial growth over a wide range of gene expression levels and sucrose concentrations, recombinant bacteria comprising these variant sucrose transporter polypeptides, and methods of utilizing such bacteria to produce products such as glycerol and glycerol-derived products are provided.
BACKGROUND OF THE INVENTION
[0002] Many commercially useful microorganisms use glucose as their main carbohydrate source. However, a disadvantage of the use of glucose by microorganisms developed for production of commercially desirable products is the high cost of glucose. The use of sucrose and mixed feedstocks containing sucrose and other sugars as carbohydrate sources for microbial production systems would be more commercially desirable because these materials are usually readily available at a lower cost.
[0003] A production microorganism can function more efficiently when it can utilize any sucrose present in a mixed feedstock. Conversely, a production microorganism that cannot utilize sucrose efficiently as a major carbon source operates less efficiently on such feedstocks. For example, bacterial cells typically show preferential sugar use, with glucose being the most preferred. In artificial media containing mixtures of sugars, glucose is typically metabolized in its entirety ahead of other sugars. Moreover, many bacteria lack the ability to utilize sucrose. For example, fewer than 50% of Escherichia coli (E. coli) strains have the ability to utilize sucrose. Thus, when a production microorganism cannot utilize sucrose as a carbohydrate source, it is desirable to engineer the microorganism so that it can utilize sucrose.
[0004] Recombinant bacteria that have been engineered to utilize sucrose by incorporation of sucrose utilization genes have been reported. For example, Livshits et al. (U.S. Pat. No. 6,960,455) describe the production of amino acids using Escherichia coli strains containing genes encoding a metabolic pathway for sucrose utilization. Additionally, Olson et al. (Appl. Microbiol. Biotechnol. 74:1031-1040, 2007) describe Escherichia coli strains carrying genes responsible for sucrose degradation, which produce L-tyrosine or L-phenylalanine using sucrose as a carbon source. Eliot et al. (U.S. Patent Application No. 2011/0136190 A1) describe recombinant bacteria that produce glycerol and glycerol-derived products from sucrose.
[0005] However, problems remain in engineering production microorganisms so that they can utilize sucrose effectively. Specifically, high levels of expression of sucrose transport genes result in poor growth on sucrose because excess sucrose transport is inhibitory. On the other hand, low levels of sucrose transport also result in sub-optimal growth on sucrose. Therefore, it is difficult to obtain the proper sucrose transporter gene expression level. Additionally, expression of sucrose transport genes under conditions at which sucrose transport is in excess, such as at high sucrose concentrations, may inhibit growth even at gene expression levels at which growth is not inhibited at lower sucrose concentrations. Therefore, a need also exists for a sucrose transporter that can enable growth on sucrose over a broad range of sucrose concentrations.
SUMMARY OF THE INVENTION
[0006] One embodiment provides a variant sucrose transporter polypeptide having an amino acid sequence that has at least 95% identity to an amino acid sequence as set forth in SEQ ID NO:26 based on a Clustal W method of alignment and having an amino acid change from arginine to alanine or arginine to leucine at position 300, and comprising:
[0007] (a) at least one additional amino acid change selected from the group consisting of:
[0008] (i) glutamine to histidine at position 353;
[0009] (ii) leucine to proline at position 61;
[0010] (iii) phenylalanine to leucine at position 159;
[0011] (iv) glycine to cysteine at position 162;
[0012] (v) proline to histidine at position 169;
[0013] (vi) leucine to tryptophan at position 61;
[0014] (vii) leucine to histidine at position 61;
[0015] (viii) leucine to phenylalanine at position 61; and
[0016] (ix) leucine to tyrosine at position 61; or
[0017] (b) a length of 402 to 407 amino acids from the N-terminus; or
[0018] (c) a length of 402 to 407 amino acids from the N-terminus, and having at least one of the amino acid changes of (a).
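The amino acid changes listed above are commonly abbreviated in the form R300A (arginine to alanine at position 300). The following is a minimal Python sketch of how such a substitution could be applied to a sequence held as a string of one-letter codes. The function name `apply_substitution` and the five-residue toy sequence are hypothetical and chosen only for illustration; the toy sequence is not SEQ ID NO:26.

```python
import re


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a substitution such as 'R300A' (1-based position) to a protein sequence.

    Raises ValueError if the notation is malformed, the position is out of
    range, or the residue found at the position is not the stated original.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation notation: {mutation!r}")
    original = match.group(1)
    position = int(match.group(2))
    replacement = match.group(3)

    index = position - 1  # convert 1-based position to 0-based index
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != original:
        raise ValueError(
            f"expected {original} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Hypothetical toy sequence: M1 A2 L3 R4 K5
toy_sequence = "MALRK"
variant = apply_substitution(toy_sequence, "R4A")
print(variant)  # MALAK
```

Checking the original residue before substituting guards against applying a change at a mis-numbered position, which matters when several changes (for example, at positions 61, 300, and 353) are combined.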
[0019] Another embodiment provides a variant sucrose transporter polypeptide having an amino acid sequence that has at least 95% identity based on a Clustal W method of alignment to an amino acid sequence selected from the group consisting of SEQ ID NOs: 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, and 98, and comprising an amino acid at an equivalent position when compared with a reference amino acid sequence of SEQ ID NO:26 selected from the group consisting of:
[0020] (a) alanine at a position equivalent to position 300; and
[0021] (b) leucine at a position equivalent to position 300.
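The notion of "a position equivalent to position 300" of a reference sequence can be illustrated with a short Python sketch. It assumes that a pairwise alignment has already been produced by some alignment program and is supplied as two gapped strings of equal length, with "-" marking gaps. The function name and the toy alignment are hypothetical; the sketch does not perform the Clustal W alignment itself.

```python
def residue_at_equivalent_position(
    aligned_reference: str, aligned_query: str, reference_position: int
) -> str:
    """Return the query residue aligned to a 1-based reference position.

    Both inputs are rows of one pairwise alignment, with '-' for gaps.
    Returns '-' if the query has a gap at the equivalent column.
    """
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    if reference_position < 1:
        raise ValueError("reference_position is 1-based")

    residues_seen = 0
    for ref_char, query_char in zip(aligned_reference, aligned_query):
        if ref_char != "-":
            residues_seen += 1  # count only real reference residues
            if residues_seen == reference_position:
                return query_char
    raise ValueError("reference_position is beyond the end of the reference")


# Hypothetical toy alignment: the query carries one extra residue (T)
# relative to the reference, so reference position 3 falls in column 4.
reference_row = "MK-RLA"
query_row = "MKTALA"
print(residue_at_equivalent_position(reference_row, query_row, 3))  # A
```

Because insertions and deletions shift the numbering, the equivalent position in a homologous transporter generally carries a different residue number than the reference position.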
[0022] Another embodiment provides a recombinant bacterium comprising in its genome or on at least one recombinant construct:
[0023] (a) a nucleotide sequence encoding a variant sucrose transporter polypeptide having an amino acid sequence that has at least 95% identity based on a Clustal W method of alignment to an amino acid sequence selected from the group consisting of SEQ ID NOs:26, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, and 98, and an amino acid at an equivalent position when compared with a reference amino acid sequence of SEQ ID NO:26 selected from the group consisting of:
[0024] (i) alanine at a position equivalent to position 300; and
[0025] (ii) leucine at a position equivalent to position 300; and
[0026] (b) a nucleotide sequence encoding a polypeptide having sucrose hydrolase activity; wherein (a) and (b) are each operably linked to the same or a different promoter, further wherein said recombinant bacterium is capable of metabolizing sucrose.
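A percent identity threshold such as "at least 95% identity" can be sketched in Python as follows. This is a simplified illustration under one stated assumption: identity is computed as the number of identical residues divided by the number of alignment columns in which neither sequence has a gap. The calculation performed by a Clustal W implementation may differ in detail, and the function name and toy alignments are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither row has a gap.

    Assumption: gap-containing columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment: 4 identical residues out of 5 compared columns
print(percent_identity("MKRLA", "MKALA"))  # 80.0
```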
[0027] In one embodiment, the recombinant bacterium produces 1,3-propanediol, glycerol, and/or 3-hydroxypropionic acid.
[0028] Another embodiment provides a process for making glycerol, 1,3-propanediol and/or 3-hydroxypropionic acid from sucrose comprising:
[0029] a) culturing the recombinant bacterium that produces 1,3-propanediol, glycerol, and/or 3-hydroxypropionic acid, disclosed herein, in the presence of sucrose; and
[0030] b) optionally, recovering the glycerol, 1,3-propanediol and/or 3-hydroxypropionic acid produced.
BRIEF SEQUENCE DESCRIPTIONS
[0031] The following sequences conform with 37 C.F.R. 1.821-1.825 ("Requirements for Patent Applications Containing Nucleotide Sequences and/or Amino Acid Sequence Disclosures--the Sequence Rules") and are consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (2009) and the sequence listing requirements of the EPO and PCT (Rules 5.2 and 49.5(a bis), and Section 208 and Annex C of the Administrative Instructions). The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.
TABLE 1
Summary of Gene and Protein SEQ ID Numbers
Gene | Coding Sequence SEQ ID NO: | Encoded Protein SEQ ID NO:
GPD1 from Saccharomyces cerevisiae | 1 | 2
GPD2 from Saccharomyces cerevisiae | 3 | 4
GPP1 from Saccharomyces cerevisiae | 5 | 6
GPP2 from Saccharomyces cerevisiae | 7 | 8
dhaB1 from Klebsiella pneumoniae | 9 | 10
dhaB2 from Klebsiella pneumoniae | 11 | 12
dhaB3 from Klebsiella pneumoniae | 13 | 14
aldB from Escherichia coli | 15 | 16
aldA from Escherichia coli | 17 | 18
aldH from Escherichia coli | 19 | 20
galP from Escherichia coli | 21 | 22
cscB from Escherichia coli EC3132 | 23 | 24
cscB from Escherichia coli ATCC® 13281 | 25 | 26
cscA from Escherichia coli EC3132 | 27 | 28
cscA from Escherichia coli ATCC® 13281 | 29 | 30
bfrA from Bifidobacterium lactis strain DSM 10140T | 31 | 32
SUC2 from Saccharomyces cerevisiae | 33 | 34
scrB from Corynebacterium glutamicum | 35 | 36
sucrose phosphorylase gene from Leuconostoc mesenteroides DSM 20193 | 37 | 38
sucP from Bifidobacterium adolescentis DSM 20083 | 39 | 40
scrK from Agrobacterium tumefaciens | 41 | 42
scrK from Streptococcus mutans | 43 | 44
scrK from Escherichia coli | 45 | 46
scrK from Klebsiella pneumoniae | 47 | 48
cscK from Escherichia coli | 49 | 50
cscK from Enterococcus faecalis | 51 | 52
HXK1 from Saccharomyces cerevisiae | 53 | 54
HXK2 from Saccharomyces cerevisiae | 55 | 56
dhaT from Klebsiella pneumoniae | 57 | 58
dhaX from Klebsiella pneumoniae | 59 | 60
scrT1 from Citrobacter sp. | 67 | 68
scrT3 from Enterococcus faecium | 69 | 70
scrT4 from Corynebacterium glucuronolyticum | 71 | 72
scrT5 from Bifidobacterium animalis subsp. lactis | 73 | 74
scrT6 from Bifidobacterium gallicum | 75 | 76
scrT7 from Bifidobacterium longum | 77 | 78
scrT8 from Bifidobacterium adolescentis | 79 | 80
scrT9 from Bifidobacterium longum | 81 | 82
scrT12 from Mitsuokella multacida | 83 | 84
scrT13 from Lactobacillus antri | 85 | 86
scrT14 from Lactobacillus ruminis | 87 | 88
scrT21 from Yersinia frederiksenii | 89 | 90
scrT25 from Serratia proteamaculans | 91 | 92
scrT26 from Escherichia coli | 93 | 94
fruP from Bacillus licheniformis 14580 | 95 | 96
lacY from Pseudomonas fluorescens Pf5 | 97 | 98
cscB from Escherichia coli ATCC® 13281 with R300A mutation | 99 | 100
cscB from Escherichia coli ATCC® 13281 with R300L mutation | 101 | 102
cscB from Escherichia coli ATCC® 13281 with R300A and Q353H mutations | 103 | 104
cscB from Escherichia coli ATCC® 13281 with R300A, Q353H, L61P mutations | 105 | 106
scrT1 from Citrobacter sp. with R305A mutation | 107 | 108
scrT1 from Citrobacter sp. with R305L mutation | 109 | 110
scrT7 from Bifidobacterium longum with R312A mutation | 111 | 112
scrB from Pseudomonas fluorescens Pf5 | 133 | 134
fruA from Bacillus licheniformis 14580 | 135 | 136
[0032] SEQ ID NO:61 is the nucleotide sequence of the cscAKB gene cluster from Escherichia coli ATCC® 13281.
[0033] SEQ ID NO:62 is the nucleotide sequence of plasmid pSYCO101.
[0034] SEQ ID NO:63 is the nucleotide sequence of plasmid pSYCO103.
[0035] SEQ ID NO:64 is the nucleotide sequence of plasmid pSYCO106.
[0036] SEQ ID NO:65 is the nucleotide sequence of plasmid pSYCO109.
[0037] SEQ ID NO:66 is the nucleotide sequence of plasmid pSYCO400/AGRO.
[0038] SEQ ID NO:113 is the nucleotide sequence of plasmid pDMWP1.
[0039] SEQ ID NO:114 is the nucleotide sequence of plasmid pDMWP3.
[0040] SEQ ID NO:119 is the nucleotide sequence of plasmid pBHRcscBKA.
[0041] SEQ ID NO:124 is the nucleotide sequence of the promoter/MCS/double terminator insert described in Examples 22-24.
[0042] SEQ ID NO:125 is the codon optimized nucleotide sequence of the coding region of scrT1 for expression in E. coli.
[0043] SEQ ID NO:130 is the codon optimized nucleotide sequence of the coding region of scrT7 for expression in E. coli.
[0044] SEQ ID NOs:115-118, 120-123, 126-129 and 131-132 are the nucleotide sequences of primers used in the Examples herein.
DETAILED DESCRIPTION
[0045] The disclosure of each reference set forth herein is hereby incorporated by reference in its entirety.
[0046] As used herein and in the appended claims, the singular forms "a", "an", and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, reference to "a cell" includes one or more cells and equivalents thereof known to those skilled in the art, and so forth.
[0047] In the context of this disclosure, a number of terms and abbreviations are used. The following definitions are provided.
[0048] "Open reading frame" is abbreviated as "ORF".
[0049] "Polymerase chain reaction" is abbreviated as "PCR".
[0050] "American Type Culture Collection" is abbreviated as "ATCC".
[0051] The term "recombinant glycerol-producing bacterium" refers to a bacterium that has been genetically engineered to be capable of producing glycerol and/or glycerol-derived products.
[0052] The term "polypeptide having sucrose transporter activity" refers to a polypeptide that is capable of mediating the transport of sucrose into microbial cells.
[0053] The term "polypeptide having fructokinase activity" refers to a polypeptide that has the ability to catalyze the conversion of D-fructose+ATP to fructose-phosphate+ADP. Typical of fructokinase is EC 2.7.1.4. Enzymes that have some ability to phosphorylate fructose, whether or not this activity is their predominant activity, may be referred to as a fructokinase. Abbreviations used for genes encoding fructokinases and proteins having fructokinase activity include, for example, "Frk", "scrK", "cscK", "FK", and "KHK". Fructokinase is encoded by the scrK gene in Agrobacterium tumefaciens and Streptococcus mutans; and by the cscK gene in certain Escherichia coli strains.
[0054] The term "polypeptide having sucrose hydrolase activity" refers to a polypeptide that has the ability to catalyze the hydrolysis of sucrose to produce glucose and fructose. Such polypeptides are often referred to as "invertases" or "β-fructofuranosidases".
[0055] The terms "glycerol derivative" and "glycerol-derived products" are used interchangeably herein and refer to a compound that is synthesized from glycerol or in a pathway that includes glycerol. Examples of such products include 3-hydroxypropionic acid, methylglyoxal, 1,2-propanediol, and 1,3-propanediol.
[0056] The term "microbial product" refers to a product that is microbially produced, i.e., the result of a microorganism metabolizing a substance. The product may be naturally produced by the microorganism, or the microorganism may be genetically engineered to produce the product.
[0057] The terms "phosphoenolpyruvate-sugar phosphotransferase system", "PTS system", and "PTS" are used interchangeably herein and refer to the phosphoenolpyruvate-dependent sugar uptake system.
[0058] The terms "phosphocarrier protein HPr" and "PtsH" refer to the phosphocarrier protein encoded by ptsH in E. coli. The terms "phosphoenolpyruvate-protein phosphotransferase" and "PtsI" refer to the phosphotransferase, EC 2.7.3.9, encoded by ptsI in E. coli. The terms "glucose-specific IIA component", and "Crr" refer to enzymes designated as EC 2.7.1.69, encoded by crr in E. coli. PtsH, PtsI, and Crr comprise the PTS system.
[0059] The term "PTS minus" refers to a microorganism that does not contain a PTS system in its native state or a microorganism in which the PTS system has been inactivated through the inactivation of a PTS gene.
[0060] The terms "glycerol-3-phosphate dehydrogenase" and "G3PDH" refer to a polypeptide responsible for an enzyme activity that catalyzes the conversion of dihydroxyacetone phosphate (DHAP) to glycerol 3-phosphate (G3P). In vivo G3PDH may be NAD- or NADP-dependent. When specifically referring to a cofactor specific glycerol-3-phosphate dehydrogenase, the terms "NAD-dependent glycerol-3-phosphate dehydrogenase" and "NADP-dependent glycerol-3-phosphate dehydrogenase" will be used. As it is generally the case that NAD-dependent and NADP-dependent glycerol-3-phosphate dehydrogenases are able to use NAD and NADP interchangeably (for example by the enzyme encoded by gpsA), the terms NAD-dependent and NADP-dependent glycerol-3-phosphate dehydrogenase will be used interchangeably. The NAD-dependent enzyme (EC 1.1.1.8) is encoded, for example, by several genes including GPD1, also referred to herein as DAR1 (coding sequence set forth in SEQ ID NO:1; encoded protein sequence set forth in SEQ ID NO:2), or GPD2 (coding sequence set forth in SEQ ID NO:3; encoded protein sequence set forth in SEQ ID NO:4), or GPD3. The NADP-dependent enzyme (EC 1.1.1.94) is encoded, for example, by gpsA.
[0061] The terms "glycerol 3-phosphatase", "sn-glycerol 3-phosphatase", "D,L-glycerol phosphatase", and "G3P phosphatase" refer to a polypeptide having an enzymatic activity that is capable of catalyzing the conversion of glycerol 3-phosphate and water to glycerol and inorganic phosphate. G3P phosphatase is encoded, for example, by GPP1 (coding sequence set forth in SEQ ID NO:5; encoded protein sequence set forth in SEQ ID NO:6), or GPP2 (coding sequence set forth in SEQ ID NO:7; encoded protein sequence set forth in SEQ ID NO:8).
[0062] The term "glycerol dehydratase" or "dehydratase enzyme" refers to a polypeptide having enzyme activity that is capable of catalyzing the conversion of a glycerol molecule to the product, 3-hydroxypropionaldehyde (3-HPA).
[0063] For the purposes of the present invention the dehydratase enzymes include a glycerol dehydratase (E.C. 4.2.1.30) and a diol dehydratase (E.C. 4.2.1.28) having preferred substrates of glycerol and 1,2-propanediol, respectively. Genes for dehydratase enzymes have been identified in Klebsiella pneumoniae, Citrobacter freundii, Clostridium pasteurianum, Salmonella typhimurium, Klebsiella oxytoca, and Lactobacillus reuteri, among others. In each case, the dehydratase is composed of three subunits: the large or "α" subunit, the medium or "β" subunit, and the small or "γ" subunit. The genes are also described in, for example, Daniel et al. (FEMS Microbiol. Rev. 22, 553 (1999)) and Toraya and Mori (J. Biol. Chem. 274, 3372 (1999)). Genes encoding the large or "α" (alpha) subunit of glycerol dehydratase include dhaB1 (coding sequence set forth in SEQ ID NO:9, encoded protein sequence set forth in SEQ ID NO:10), gldA and dhaB; genes encoding the medium or "β" (beta) subunit include dhaB2 (coding sequence set forth in SEQ ID NO:11, encoded protein sequence set forth in SEQ ID NO:12), gldB, and dhaC; genes encoding the small or "γ" (gamma) subunit include dhaB3 (coding sequence set forth in SEQ ID NO:13, encoded protein sequence set forth in SEQ ID NO:14), gldC, and dhaE. Other genes encoding the large or "α" subunit of diol dehydratase include pduC and pddA; other genes encoding the medium or "β" subunit include pduD and pddB; and other genes encoding the small or "γ" subunit include pduE and pddC.
[0064] Glycerol and diol dehydratases are subject to mechanism-based suicide inactivation by glycerol and some other substrates (Daniel et al., FEMS Microbiol. Rev. 22, 553 (1999)). The term "dehydratase reactivation factor" refers to those proteins responsible for reactivating the dehydratase activity. The terms "dehydratase reactivating activity", "reactivating the dehydratase activity" and "regenerating the dehydratase activity" are used interchangeably and refer to the phenomenon of converting a dehydratase not capable of catalysis of a reaction to one capable of catalysis of a reaction or to the phenomenon of inhibiting the inactivation of a dehydratase or the phenomenon of extending the useful half-life of the dehydratase enzyme in vivo. Two proteins have been identified as being involved as the dehydratase reactivation factor (see, e.g., U.S. Pat. No. 6,013,494 and references therein; Daniel et al., supra; Toraya and Mori, J. Biol. Chem. 274, 3372 (1999); and Tobimatsu et al., J. Bacteriol. 181, 4110 (1999)). Genes encoding one of the proteins include, for example, orfZ, dhaB4, gdrA, pduG and ddrA. Genes encoding the second of the two proteins include, for example, orfX, orf2b, gdrB, pduH and ddrB.
[0065] The terms "1,3-propanediol oxidoreductase", "1,3-propanediol dehydrogenase" and "DhaT" are used interchangeably herein and refer to the polypeptide(s) having an enzymatic activity that is capable of catalyzing the interconversion of 3-HPA and 1,3-propanediol provided the gene(s) encoding such activity is found to be physically or transcriptionally linked to a dehydratase enzyme in its natural (i.e., wild type) setting; for example, the gene is found within a dha regulon as is the case with dhaT from Klebsiella pneumoniae. Genes encoding a 1,3-propanediol oxidoreductase include, but are not limited to, dhaT from Klebsiella pneumoniae, Citrobacter freundii, and Clostridium pasteurianum. Each of these genes encodes a polypeptide belonging to the family of type III alcohol dehydrogenases, which exhibits a conserved iron-binding motif, and has a preference for the NAD+/NADH linked interconversion of 3-HPA and 1,3-propanediol (Johnson and Lin, J. Bacteriol. 169, 2050 (1987); Daniel et al., J. Bacteriol. 177, 2151 (1995); and Luers et al., FEMS Microbiol. Lett. 154, 337 (1997)). Enzymes with similar physical properties have been isolated from Lactobacillus brevis and Lactobacillus buchneri (Veiga da Cunha and Foster, Appl. Environ. Microbiol. 58, 2005 (1992)).
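The enzyme definitions in paragraphs [0060] to [0065] together describe a linear route from dihydroxyacetone phosphate to 1,3-propanediol. The following Python sketch records that route as a small data structure and looks up the enzymes needed between two compounds. It is illustrative only: the names `PATHWAY_STEPS` and `enzymes_between` are hypothetical, each step is modeled in the forward direction only, and cofactors and reactivation factors are omitted.

```python
# (substrate, product, enzyme) for each step, taken from the definitions above
PATHWAY_STEPS = [
    ("dihydroxyacetone phosphate", "glycerol 3-phosphate",
     "glycerol-3-phosphate dehydrogenase"),
    ("glycerol 3-phosphate", "glycerol", "glycerol 3-phosphatase"),
    ("glycerol", "3-hydroxypropionaldehyde", "glycerol dehydratase"),
    ("3-hydroxypropionaldehyde", "1,3-propanediol",
     "1,3-propanediol oxidoreductase"),
]


def enzymes_between(start: str, end: str) -> list:
    """List the enzymes, in order, that convert `start` into `end`."""
    next_step = {substrate: (product, enzyme)
                 for substrate, product, enzyme in PATHWAY_STEPS}
    enzymes = []
    current = start
    while current != end:
        if current not in next_step:
            raise ValueError(f"no route from {start!r} to {end!r}")
        current, enzyme = next_step[current]
        enzymes.append(enzyme)
    return enzymes


for name in enzymes_between("dihydroxyacetone phosphate", "1,3-propanediol"):
    print(name)
```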
[0066] The term "dha regulon" refers to a set of associated polynucleotides or open reading frames encoding polypeptides having various biological activities, including but not limited to a dehydratase activity, a reactivation activity, and a 1,3-propanediol oxidoreductase. Typically a dha regulon comprises the open reading frames dhaR, orfY, dhaT, orfX, orfW, dhaB1, dhaB2, dhaB3, and orfZ as described in U.S. Pat. No. 7,371,558.
[0067] The terms "aldehyde dehydrogenase" and "Ald" refer to a polypeptide that catalyzes the conversion of an aldehyde to a carboxylic acid. Aldehyde dehydrogenases may use a redox cofactor such as NAD, NADP, FAD, or PQQ. Typical of aldehyde dehydrogenases is EC 1.2.1.3 (NAD-dependent); EC 1.2.1.4 (NADP-dependent); EC 1.2.99.3 (PQQ-dependent); or EC 1.2.99.7 (FAD-dependent). An example of an NADP-dependent aldehyde dehydrogenase is AldB (SEQ ID NO:16), encoded by the E. coli gene aldB (coding sequence set forth in SEQ ID NO:15). Examples of NAD-dependent aldehyde dehydrogenases include AldA (SEQ ID NO:18), encoded by the E. coli gene aldA (coding sequence set forth in SEQ ID NO:17); and AldH (SEQ ID NO:20), encoded by the E. coli gene aldH (coding sequence set forth in SEQ ID NO:19).
[0068] The terms "glucokinase" and "Glk" are used interchangeably herein and refer to a protein that catalyzes the conversion of D-glucose+ATP to glucose 6-phosphate+ADP. Typical of glucokinase is EC 2.7.1.2. Glucokinase is encoded by glk in E. coli.
[0069] The terms "phosphoenolpyruvate carboxylase" and "Ppc" are used interchangeably herein and refer to a protein that catalyzes the conversion of phosphoenolpyruvate+H2O+CO2 to phosphate+oxaloacetic acid. Typical of phosphoenolpyruvate carboxylase is EC 4.1.1.31. Phosphoenolpyruvate carboxylase is encoded by ppc in E. coli.
[0070] The terms "glyceraldehyde-3-phosphate dehydrogenase" and "GapA" are used interchangeably herein and refer to a protein having an enzymatic activity capable of catalyzing the conversion of glyceraldehyde 3-phosphate+phosphate+NAD+ to 3-phospho-D-glyceroyl-phosphate+NADH+H+. Typical of glyceraldehyde-3-phosphate dehydrogenase is EC 1.2.1.12. Glyceraldehyde-3-phosphate dehydrogenase is encoded by gapA in E. coli.
[0071] The terms "aerobic respiration control protein" and "ArcA" are used interchangeably herein and refer to a global regulatory protein. The aerobic respiration control protein is encoded by arcA in E. coli.
[0072] The terms "methylglyoxal synthase" and "MgsA" are used interchangeably herein and refer to a protein having an enzymatic activity capable of catalyzing the conversion of dihydroxyacetone phosphate to methylglyoxal+phosphate. Typical of methylglyoxal synthase is EC 4.2.3.3. Methylglyoxal synthase is encoded by mgsA in E. coli.
[0073] The terms "phosphogluconate dehydratase" and "Edd" are used interchangeably herein and refer to a protein having an enzymatic activity capable of catalyzing the conversion of 6-phospho-gluconate to 2-keto-3-deoxy-6-phospho-gluconate+H2O. Typical of phosphogluconate dehydratase is EC 4.2.1.12. Phosphogluconate dehydratase is encoded by edd in E. coli.
[0074] The term "YciK" refers to a putative enzyme encoded by yciK, which is translationally coupled to btuR, the gene encoding cob(I)alamin adenosyltransferase in E. coli.
[0075] The term "cob(I)alamin adenosyltransferase" refers to an enzyme capable of transferring a deoxyadenosyl moiety from ATP to the reduced corrinoid. Typical of cob(I)alamin adenosyltransferase is EC 2.5.1.17. Cob(I)alamin adenosyltransferase is encoded by the gene "btuR" in E. coli, "cobA" in Salmonella typhimurium, and "cobO" in Pseudomonas denitrificans.
[0076] The terms "galactose-proton symporter" and "GalP" are used interchangeably herein and refer to a protein having an enzymatic activity capable of transporting a sugar and a proton from the periplasm to the cytoplasm. D-glucose is a preferred substrate for GalP. Galactose-proton symporter is encoded by galP in Escherichia coli (coding sequence set forth in SEQ ID NO:21, encoded protein sequence set forth in SEQ ID NO:22).
[0077] The term "non-specific catalytic activity" refers to the polypeptide(s) having an enzymatic activity capable of catalyzing the interconversion of 3-HPA and 1,3-propanediol and specifically excludes 1,3-propanediol oxidoreductase(s). Typically these enzymes are alcohol dehydrogenases. Such enzymes may utilize cofactors other than NAD+/NADH, including but not limited to flavins such as FAD or FMN. A gene for a non-specific alcohol dehydrogenase (yqhD) is found, for example, to be endogenously encoded and functionally expressed within E. coli K-12 strains.
[0078] The terms "1.6 long GI promoter", "1.20 short/long GI Promoter", "1.5 long GI promoter", "P1.6", "P1.5" and "P1.20" refer to polynucleotides or fragments containing a promoter from the Streptomyces lividans glucose isomerase gene as described in U.S. Pat. No. 7,132,527. These promoter fragments include a mutation which decreases their activities as compared to the wild type Streptomyces lividans glucose isomerase gene promoter.
[0079] The terms "function" and "enzyme function" are used interchangeably herein and refer to the catalytic activity of an enzyme in altering the rate at which a specific chemical reaction occurs without itself being consumed by the reaction. It is understood that such an activity may apply to a reaction in equilibrium where the production of either product or substrate may be accomplished under suitable conditions.
[0080] The terms "polypeptide" and "protein" are used interchangeably herein.
[0081] The terms "carbon substrate" and "carbon source" are used interchangeably herein and refer to a carbon source capable of being metabolized by the recombinant bacteria disclosed herein and, particularly, carbon sources comprising sucrose. The carbon source may further comprise other monosaccharides, disaccharides, oligosaccharides, or polysaccharides.
[0082] The terms "host cell" and "host bacterium" are used interchangeably herein and refer to a bacterium capable of receiving foreign or heterologous genes and capable of expressing those genes to produce an active gene product.
[0083] The term "production microorganism" as used herein refers to a microorganism, including, but not limited to, those that are recombinant, used to make a specific product such as 1,3-propanediol, glycerol, 3-hydroxypropionic acid, polyunsaturated fatty acids, and the like.
[0084] As used herein, "nucleic acid" means a polynucleotide and includes a single or double-stranded polymer of deoxyribonucleotide or ribonucleotide bases. Nucleic acids may also include fragments and modified nucleotides. Thus, the terms "polynucleotide", "nucleic acid sequence", "nucleotide sequence" or "nucleic acid fragment" are used interchangeably herein and refer to a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. Nucleotides (usually found in their 5'-monophosphate form) are referred to by their single letter designation as follows: "A" for adenylate or deoxyadenylate (for RNA or DNA, respectively), "C" for cytidylate or deoxycytidylate, "G" for guanylate or deoxyguanylate, "U" for uridylate, "T" for deoxythymidylate, "R" for purines (A or G), "Y" for pyrimidines (C or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and "N" for any nucleotide.
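The ambiguity designations listed above ("R", "Y", "K", "H", and "N") can be illustrated with a short Python sketch that expands a DNA sequence containing such symbols into every concrete sequence it represents. The sketch covers only the symbols named in this paragraph and assumes a DNA alphabet (T rather than U); inosine ("I") is not an ambiguity symbol and is not handled. The names `AMBIGUITY_CODES` and `expand_ambiguous` are hypothetical.

```python
from itertools import product
from typing import List

# Only the ambiguity symbols named in the text; DNA alphabet assumed
AMBIGUITY_CODES = {
    "R": "AG",    # purines
    "Y": "CT",    # pyrimidines
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def expand_ambiguous(sequence: str) -> List[str]:
    """Return every concrete DNA sequence matched by an ambiguous sequence."""
    # Unambiguous bases map to themselves
    choices = [AMBIGUITY_CODES.get(symbol, symbol) for symbol in sequence.upper()]
    return ["".join(combination) for combination in product(*choices)]


print(expand_ambiguous("ARY"))  # ['AAC', 'AAT', 'AGC', 'AGT']
```

The number of expanded sequences grows multiplicatively with each ambiguous symbol, so this approach is suited to short sequences such as primers.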
[0085] A polynucleotide may be a polymer of RNA or DNA that is single- or double-stranded, that optionally contains synthetic, non-natural or altered nucleotide bases. A polynucleotide in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA, synthetic DNA, or mixtures thereof.
[0086] "Gene" refers to a nucleic acid fragment that expresses a specific protein, and which may refer to the coding region alone or may include regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its own regulatory sequences. "Chimeric gene" refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. "Endogenous gene" refers to a native gene in its natural location in the genome of an organism. A "foreign" gene refers to a gene that is introduced into the host organism by gene transfer. Foreign genes can comprise genes inserted into a non-native organism, genes introduced into a new location within the native host, or chimeric genes.
[0087] The term "native nucleotide sequence" refers to a nucleotide sequence that is normally found in the host microorganism.
[0088] The term "non-native nucleotide sequence" refers to a nucleotide sequence that is not normally found in the host microorganism.
[0089] The term "native polypeptide" refers to a polypeptide that is normally found in the host microorganism.
[0090] The term "non-native polypeptide" refers to a polypeptide that is not normally found in the host microorganism.
[0091] The terms "encoding" and "coding" are used interchangeably herein and refer to the process by which a gene, through the mechanisms of transcription and translation, produces an amino acid sequence.
[0092] The term "coding sequence" refers to a nucleotide sequence that codes for a specific amino acid sequence.
[0093] "Suitable regulatory sequences" refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, enhancers, silencers, 5' untranslated leader sequence (e.g., between the transcription start site and the translation initiation codon), introns, polyadenylation recognition sequences, RNA processing sites, effector binding sites and stem-loop structures.
[0094] The term "expression cassette" refers to a fragment of DNA comprising the coding sequence of a selected gene and regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence that are required for expression of the selected gene product. Thus, an expression cassette is typically composed of: 1) a promoter sequence; 2) a coding sequence (i.e., ORF) and, 3) a 3' untranslated region (e.g., a terminator) that, in eukaryotes, usually contains a polyadenylation site. The expression cassette(s) is usually included within a vector, to facilitate cloning and transformation. Different organisms, including bacteria, yeast, and fungi, can be transformed with different expression cassettes as long as the correct regulatory sequences are used for each host.
[0095] "Transformation" refers to the transfer of a nucleic acid molecule into a host organism, resulting in genetically stable inheritance. The nucleic acid molecule may be a plasmid that replicates autonomously, for example, or it may integrate into the genome of the host organism. Host organisms transformed with the nucleic acid fragments are referred to as "recombinant" or "transformed" organisms or "transformants". "Stable transformation" refers to the transfer of a nucleic acid fragment into a genome of a host organism, including both nuclear and organellar genomes, resulting in genetically stable inheritance. In contrast, "transient transformation" refers to the transfer of a nucleic acid fragment into the nucleus, or DNA-containing organelle, of a host organism resulting in gene expression without integration or stable inheritance.
[0096] "Codon degeneracy" refers to the nature of the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. The skilled artisan is well aware of the "codon-bias" exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.
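By way of illustration only, codon degeneracy can be shown with a small part of the standard genetic code. The following Python sketch is not part of the disclosure; the function names are hypothetical, the codon table is deliberately partial (Met, Ala, Leu and Arg only), and the `preferred` mapping passed to `recode` is a placeholder supplied by the user, not a measured codon-usage table for any host.

```python
# Partial standard genetic code (DNA codons); enough for the illustration only.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "CGT": "R", "CGC": "R", "CGA": "R", "CGG": "R", "AGA": "R", "AGG": "R",
}


def _codons(dna):
    """Split a coding sequence into complete codons, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3].upper() for i in range(0, usable, 3)]


def translate(dna):
    """Translate a coding sequence using the partial table above."""
    return "".join(CODON_TABLE[codon] for codon in _codons(dna))


def recode(dna, preferred):
    """Swap each codon for a synonymous codon named in `preferred`.

    `preferred` maps an amino acid letter to one codon; it is a hypothetical,
    user-supplied preference and is only applied when it encodes the same residue.
    """
    out = []
    for codon in _codons(dna):
        residue = CODON_TABLE[codon]
        candidate = preferred.get(residue, codon)
        # Keep the original codon if the candidate would change the residue.
        out.append(candidate if CODON_TABLE.get(candidate) == residue else codon)
    return "".join(out)
```

Recoding changes the nucleotide sequence while `translate` returns the same amino acid sequence before and after, which is the property the definition describes.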
[0097] The terms "subfragment that is functionally equivalent" and "functionally equivalent subfragment" are used interchangeably herein. These terms refer to a portion or subsequence of an isolated nucleic acid fragment in which the ability to alter gene expression or produce a certain phenotype is retained whether or not the fragment or subfragment encodes an active enzyme. Chimeric genes can be designed for use in suppression by linking a nucleic acid fragment or subfragment thereof, whether or not it encodes an active enzyme, in the sense or antisense orientation relative to a promoter sequence.
[0098] The term "conserved domain" or "motif" means a set of amino acids conserved at specific positions along an aligned sequence of evolutionarily related proteins. While amino acids at other positions can vary between homologous proteins, amino acids that are highly conserved at specific positions indicate amino acids that are essential in the structure, the stability, or the activity of a protein.
[0099] The terms "substantially similar" and "corresponds substantially" are used interchangeably herein. They refer to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate gene expression or produce a certain phenotype. These terms also refer to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotides that do not substantially alter the functional properties of the resulting nucleic acid fragment relative to the initial, unmodified fragment. It is therefore understood, as those skilled in the art will appreciate, that the invention encompasses more than the specific exemplary sequences. Moreover, the skilled artisan recognizes that substantially similar nucleic acid sequences encompassed by this invention are also defined by their ability to hybridize (under moderately stringent conditions, e.g., 0.5×SSC (standard sodium citrate), 0.1% SDS (sodium dodecyl sulfate), 60° C.) with the sequences exemplified herein, or to any portion of the nucleotide sequences disclosed herein and which are functionally equivalent to any of the nucleic acid sequences disclosed herein. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions.
[0100] The term "selectively hybridizes" includes reference to hybridization, under stringent hybridization conditions, of a nucleic acid sequence to a specified nucleic acid target sequence to a detectably greater degree (e.g., at least 2-fold over background) than its hybridization to non-target nucleic acid sequences and to the substantial exclusion of non-target nucleic acids. Selectively hybridizing sequences are two nucleotide sequences wherein the complement of one of the nucleotide sequences typically has at least about 80% sequence identity, or 90% sequence identity, up to and including 100% sequence identity (i.e., fully complementary) to the other nucleotide sequence.
[0101] The term "stringent conditions" or "stringent hybridization conditions" includes reference to conditions under which a probe will selectively hybridize to its target sequence. Probes are typically single stranded nucleic acid sequences which are complementary to the nucleic acid sequences to be detected. Probes are "hybridizable" to the nucleic acid sequence to be detected. Generally, a probe is less than about 1000 nucleotides in length, optionally less than 500 nucleotides in length.
[0102] Hybridization methods are well defined. Typically the probe and sample are mixed under conditions which will permit nucleic acid hybridization. This involves contacting the probe and sample in the presence of an inorganic or organic salt under the proper concentration and temperature conditions. Optionally a chaotropic agent may be added. Nucleic acid hybridization is adaptable to a variety of assay formats. One of the most suitable is the sandwich assay format. A primary component of a sandwich-type assay is a solid support. The solid support has adsorbed to it or covalently coupled to it an immobilized nucleic acid probe that is unlabeled and complementary to one portion of the sequence.
[0103] Stringent conditions are sequence-dependent and will be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences can be identified which are 100% complementary to the probe (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing).
[0104] Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulfate) at 37° C., and a wash in 1× to 2×SSC (20×SSC=3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55° C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.5× to 1×SSC at 55 to 60° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.1×SSC at 60 to 65° C.
[0105] Specificity is typically the function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA hybrids, the thermal melting point (Tm) can be approximated from the equation of Meinkoth et al., Anal. Biochem. 138:267-284 (1984): Tm=81.5° C.+16.6 (log M)+0.41 (% GC)-0.61 (% form)-500/L; where M is the molarity of monovalent cations, % GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. The Tm is the temperature (under defined ionic strength and pH) at which 50% of a complementary target sequence hybridizes to a perfectly matched probe. Tm is reduced by about 1° C. for each 1% of mismatching; thus, Tm, hybridization and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with >90% identity are sought, the Tm can be decreased 10° C. Generally, stringent conditions are selected to be about 5° C. lower than Tm for the specific sequence and its complement at a defined ionic strength and pH. However, severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4° C. lower than the Tm; moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10° C. lower than the Tm; low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or 20° C. lower than the Tm. Using the equation, hybridization and wash compositions, and desired Tm, those of ordinary skill will understand that variations in the stringency of hybridization and/or wash solutions are inherently described. If the desired degree of mismatching results in a Tm of less than 45° C. (aqueous solution) or 32° C. (formamide solution) it is preferred to increase the SSC concentration so that a higher temperature can be used. 
An extensive guide to the hybridization of nucleic acids is found in Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology--Hybridization with Nucleic Acid Probes, Part I, Chapter 2 "Overview of principles of hybridization and the strategy of nucleic acid probe assays", Elsevier, New York (1993); and Current Protocols in Molecular Biology, Chapter 2, Ausubel et al., Eds., Greene Publishing and Wiley-Interscience, New York (1995). Hybridization and/or wash conditions can be applied for at least 10, 30, 60, 90, 120, or 240 minutes.
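By way of illustration only, the Tm approximation recited above can be evaluated numerically. The following Python sketch is not part of the disclosure; the function and parameter names are hypothetical, the logarithm is taken as base 10, and the input values in the usage note are arbitrary examples rather than recommended conditions.

```python
import math


def melting_temperature(molarity, percent_gc, percent_formamide, length_bp):
    """Approximate Tm (degrees C) for a DNA-DNA hybrid per the equation in the text.

    Tm = 81.5 + 16.6 (log M) + 0.41 (% GC) - 0.61 (% form) - 500/L
    """
    if molarity <= 0 or length_bp <= 0:
        raise ValueError("molarity and hybrid length must be positive")
    return (
        81.5
        + 16.6 * math.log10(molarity)
        + 0.41 * percent_gc
        - 0.61 * percent_formamide
        - 500.0 / length_bp
    )


def mismatch_adjusted_tm(tm, percent_mismatch):
    """Lower Tm by about 1 degree C for each 1% of mismatching, as stated in the text."""
    return tm - percent_mismatch
```

For instance, `melting_temperature(1.0, 50.0, 50.0, 500)` evaluates the equation for a 500 base pair hybrid with 50% GC content in 1 M monovalent cation and 50% formamide, and `mismatch_adjusted_tm` then lowers that value to allow for a chosen percentage of mismatching.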
[0106] "Sequence identity" or "identity" in the context of nucleic acid or polypeptide sequences refers to the nucleic acid bases or amino acid residues in two sequences that are the same when aligned for maximum correspondence over a specified comparison window.
[0107] Thus, "percentage of sequence identity" refers to the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the results by 100 to yield the percentage of sequence identity. Useful examples of percent sequence identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 50% to 100%. These identities can be determined using any of the programs described herein.
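By way of illustration only, the percentage calculation described above can be sketched as follows. This Python example is not part of the disclosure and is not the calculation performed by any particular alignment program; the function name is hypothetical, the two inputs are assumed to be already aligned and of equal length, and "-" is assumed to mark a gap.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over a comparison window of two pre-aligned sequences.

    Matched positions are columns where both sequences carry the same base or
    residue; the divisor is the total number of positions in the window.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)
```

A column containing a gap in either sequence counts toward the window length but never toward the matched positions.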
[0108] Sequence alignments and percent identity or similarity calculations may be determined using a variety of comparison methods designed to detect homologous sequences including, but not limited to, the MegAlign® program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Within the context of this application it will be understood that, where sequence analysis software is used for analysis, the results of the analysis will be based on the "default values" of the program referenced, unless otherwise specified. As used herein, "default values" will mean any set of values or parameters that originally load with the software when first initialized.
[0109] The "Clustal V method of alignment" corresponds to the alignment method labeled Clustal V (described by Higgins and Sharp, CABIOS. 5:151-153 (1989); Higgins, D. G. et al., Comput. Appl. Biosci. 8:189-191 (1992)) and found in the MegAlign® program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). For multiple alignments, the default values correspond to GAP PENALTY=10 and GAP LENGTH PENALTY=10. Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal V method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4. After alignment of the sequences using the Clustal V program, it is possible to obtain a "percent identity" by viewing the "sequence distances" table in the same program.
[0110] The "Clustal W method of alignment" corresponds to the alignment method labeled Clustal W (described by Higgins and Sharp, supra; Higgins, D. G. et al., supra) and found in the MegAlign® v6.1 program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Default parameters for multiple alignment correspond to GAP PENALTY=10, GAP LENGTH PENALTY=0.2, Delay Divergent Seqs(%)=30, DNA Transition Weight=0.5, Protein Weight Matrix=Gonnet Series, DNA Weight Matrix=IUB. After alignment of the sequences using the Clustal W program, it is possible to obtain a "percent identity" by viewing the "sequence distances" table in the same program.
[0111] "BLASTN method of alignment" is an algorithm provided by the National Center for Biotechnology Information (NCBI) to compare nucleotide sequences using default parameters. The "BLASTP method of alignment" is an algorithm provided by the NCBI to compare protein sequences using default parameters.
[0112] It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying polypeptides, from other species, wherein such polypeptides have the same or similar function or activity. Useful examples of percent identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 50% to 100%. Indeed, any integer amino acid identity from 50% to 100% may be useful in describing the present invention, such as 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%. Also, of interest is any full-length or partial complement of this isolated nucleotide fragment.
[0113] Thus, the invention encompasses more than the specific exemplary nucleotide sequences disclosed herein. For example, alterations in the gene sequence which reflect the degeneracy of the genetic code are contemplated. Also, it is well known in the art that alterations in a gene which result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded protein are common. Substitutions are defined for the discussion herein as exchanges within one of the following five groups:
[0114] 1. Small aliphatic, nonpolar or slightly polar residues: Ala, Ser, Thr (Pro, Gly);
[0115] 2. Polar, negatively charged residues and their amides: Asp, Asn, Glu, Gln;
[0116] 3. Polar, positively charged residues: His, Arg, Lys;
[0117] 4. Large aliphatic, nonpolar residues: Met, Leu, Ile, Val (Cys); and
[0118] 5. Large aromatic residues: Phe, Tyr, Trp.
Thus, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue (such as glycine) or a more hydrophobic residue (such as valine, leucine, or isoleucine). Similarly, changes which result in substitution of one negatively charged residue for another (such as aspartic acid for glutamic acid) or one positively charged residue for another (such as lysine for arginine) can also be expected to produce a functionally equivalent product. In many cases, nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the protein molecule would also not be expected to alter the activity of the protein.
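By way of illustration only, the five substitution groups listed above can be encoded as sets so that an exchange can be tested for membership in a single group. The following Python sketch is not part of the disclosure; the names are hypothetical, and the residues shown in parentheses in the list (Pro, Gly and Cys) are treated as members of their groups, which is an assumption about how the parentheses are to be read.

```python
# The five groups from the list above, using three-letter residue codes.
# Parenthesized residues in the text (Pro, Gly, Cys) are included by assumption.
SUBSTITUTION_GROUPS = (
    frozenset({"Ala", "Ser", "Thr", "Pro", "Gly"}),   # 1. small aliphatic
    frozenset({"Asp", "Asn", "Glu", "Gln"}),          # 2. negatively charged and amides
    frozenset({"His", "Arg", "Lys"}),                 # 3. positively charged
    frozenset({"Met", "Leu", "Ile", "Val", "Cys"}),   # 4. large aliphatic
    frozenset({"Phe", "Tyr", "Trp"}),                 # 5. large aromatic
)


def is_within_group_exchange(residue_a, residue_b):
    """Return True if both residues fall within the same one of the five groups."""
    return any(
        residue_a in group and residue_b in group for group in SUBSTITUTION_GROUPS
    )
```

The function reports group membership only; it makes no prediction about whether a given exchange preserves activity in a particular protein.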
[0119] Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. Moreover, the skilled artisan recognizes that substantially similar sequences encompassed by this invention are also defined by their ability to hybridize under stringent conditions, as defined above.
[0120] Preferred substantially similar nucleic acid fragments of the instant invention are those nucleic acid fragments whose nucleotide sequences are at least 70% identical to the nucleotide sequence of the nucleic acid fragments reported herein. More preferred nucleic acid fragments are at least 90% identical to the nucleotide sequence of the nucleic acid fragments reported herein. Most preferred are nucleic acid fragments that are at least 95% identical to the nucleotide sequence of the nucleic acid fragments reported herein.
[0121] A "substantial portion" of an amino acid or nucleotide sequence is that portion comprising enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to putatively identify that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al., J. Mol. Biol., 215:403-410 (1990)). In general, a sequence of ten or more contiguous amino acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene-specific oligonucleotide probes comprising 20-30 contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers. Accordingly, a "substantial portion" of a nucleotide sequence comprises enough of the sequence to specifically identify and/or isolate a nucleic acid fragment comprising the sequence. The instant specification teaches the complete amino acid and nucleotide sequence encoding particular proteins. The skilled artisan, having the benefit of the sequences as reported herein, may now use all or a substantial portion of the disclosed sequences for purposes known to those skilled in this art.
[0122] The term "complementary" describes the relationship between two sequences of nucleotide bases that are capable of Watson-Crick base-pairing when aligned in an anti-parallel orientation. For example, with respect to DNA, adenine is capable of base-pairing with thymine and cytosine is capable of base-pairing with guanine. Accordingly, the instant invention may make use of isolated nucleic acid molecules that are complementary to the complete sequences as reported in the accompanying Sequence Listing and the specification as well as those substantially similar nucleic acid sequences.
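By way of illustration only, Watson-Crick complementarity in an anti-parallel orientation can be checked by comparing one DNA strand with the reverse complement of the other. The following Python sketch is not part of the disclosure; the function names are hypothetical and only the unambiguous bases A, C, G and T are handled.

```python
# Translation table pairing A with T and C with G (DNA, unambiguous bases only).
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand read in the anti-parallel (5' to 3') direction."""
    return seq.upper().translate(_DNA_COMPLEMENT)[::-1]


def are_complementary(strand_a, strand_b):
    """True if the two strands are fully complementary when aligned anti-parallel."""
    return reverse_complement(strand_a) == strand_b.upper()
```

Both strands are written 5' to 3', so a fully complementary partner of a strand is exactly its reverse complement.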
[0123] The term "isolated" refers to a polypeptide or nucleotide sequence that is removed from at least one component with which it is naturally associated.
[0124] "Promoter" refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence that can stimulate promoter activity, and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of some variation may have identical promoter activity. Promoters that cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters".
[0125] "3' non-coding sequences", "transcription terminator" and "termination sequences" are used interchangeably herein and refer to DNA sequences located downstream of a coding sequence, including polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor.
[0126] The term "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in a sense or antisense orientation. In another example, the complementary RNA regions of the invention can be operably linked, either directly or indirectly, 5' to the target mRNA, or 3' to the target mRNA, or within the target mRNA, or a first complementary region is 5' and its complement is 3' to the target mRNA.
[0127] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989). Transformation methods are well known to those skilled in the art and are described infra.
[0128] "PCR" or "polymerase chain reaction" is a technique for the synthesis of large quantities of specific DNA segments and consists of a series of repetitive cycles (Perkin Elmer Cetus Instruments, Norwalk, Conn.). Typically, the double-stranded DNA is heat denatured, the two primers complementary to the 3' boundaries of the target segment are annealed at low temperature and then extended at an intermediate temperature. One set of these three consecutive steps is referred to as a "cycle".
[0129] A "plasmid" or "vector" is an extrachromosomal element often carrying genes that are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA fragments. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing an expression cassette(s) into a cell.
[0130] The term "genetically altered" refers to the process of changing hereditary material by genetic engineering, transformation and/or mutation.
[0131] The term "recombinant" refers to an artificial combination of two otherwise separated segments of sequence, e.g., by chemical synthesis or by the manipulation of isolated segments of nucleic acids by genetic engineering techniques. "Recombinant" also includes reference to a cell or vector, that has been modified by the introduction of a heterologous nucleic acid or a cell derived from a cell so modified, but does not encompass the alteration of the cell or vector by naturally occurring events (e.g., spontaneous mutation, natural transformation, natural transduction, natural transposition) such as those occurring without deliberate human intervention.
[0132] The terms "recombinant construct", "expression construct", "chimeric construct", "construct", and "recombinant DNA construct" are used interchangeably herein. A recombinant construct comprises an artificial combination of nucleic acid fragments, e.g., regulatory and coding sequences that are not found together in nature. For example, a recombinant construct may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. Such a construct may be used by itself or may be used in conjunction with a vector. If a vector is used, then the choice of vector is dependent upon the method that will be used to transform host cells as is well known to those skilled in the art. For example, a plasmid vector can be used. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells comprising any of the isolated nucleic acid fragments of the invention. The skilled artisan will also recognize that different independent transformation events may result in different levels and patterns of expression (Jones et al., EMBO J. 4:2411-2418 (1985); De Almeida et al., Mol. Gen. Genetics 218:78-86 (1989)), and thus that multiple events may need to be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by Southern analysis of DNA, Northern analysis of mRNA expression, immunoblotting analysis of protein expression, or phenotypic analysis, among others.
[0133] The term "expression", as used herein, refers to the production of a functional end-product (e.g., an mRNA or a protein [either precursor or mature]).
[0134] The term "introduced" means providing a nucleic acid (e.g., expression construct) or protein into a cell. Introduced includes reference to the incorporation of a nucleic acid into a eukaryotic or prokaryotic cell where the nucleic acid may be incorporated into the genome of the cell, and includes reference to the transient provision of a nucleic acid or protein to the cell. Introduced includes reference to stable or transient transformation methods, as well as sexual crossing. Thus, "introduced" in the context of inserting a nucleic acid fragment (e.g., a recombinant construct/expression construct) into a cell, means "transfection" or "transformation" or "transduction" and includes reference to the incorporation of a nucleic acid fragment into a eukaryotic or prokaryotic cell where the nucleic acid fragment may be incorporated into the genome of the cell (e.g., chromosome, plasmid, plastid or mitochondrial DNA), converted into an autonomous replicon, or transiently expressed (e.g., transfected mRNA).
[0135] The term "homologous" refers to proteins or polypeptides of common evolutionary origin with similar catalytic function. The invention may include bacteria producing homologous proteins via recombinant technology.
[0136] Disclosed herein are variant sucrose transporter polypeptides that enable bacterial growth over a wide range of gene expression levels and sucrose concentrations. Sucrose transporter polypeptides are polypeptides that are capable of mediating the transport of sucrose into microbial cells. Sucrose transporters known in the art, such as CscB from E. coli, function as H+/sucrose symporters, which transport one proton for every sucrose molecule transported, thereby coupling the energy of the proton motive force to sucrose transport. Such active transport allows accumulation of sucrose against a concentration gradient. Mutations that change certain amino acids in CscB, yielding polypeptides that are unable to catalyze active uptake of sucrose but are able to catalyze equilibrium exchange across a membrane, have been described by Vadyvaloo et al. (J. Mol. Biol. 358:1051-1059, 2006). The sucrose transporter polypeptides disclosed herein are novel variants that have lost the ability to actively transport sucrose into microbial cells against a concentration gradient, but have the ability to transport sucrose by facilitated diffusion. These variant sucrose transporter polypeptides also enable faster sucrose utilization in bacteria than the native CscB transporter polypeptide. Sucrose transport by facilitated diffusion mitigates the toxicity associated with excess sucrose uptake because sucrose will not accumulate within the cells to concentrations that are higher than extracellular levels. Therefore, microbial cells having sucrose transport by facilitated diffusion are able to grow over a wider range of sucrose concentrations than cells having active sucrose transport.
[0137] In some embodiments, the sucrose transporter polypeptides disclosed herein are variants of the wild-type sucrose transporter polypeptide CscB from E. coli ATCC® 13281 (set forth in SEQ ID NO:26, nucleotide coding sequence set forth in SEQ ID NO:25). These sucrose transporter polypeptides have an amino acid change from arginine to alanine at amino acid position 300, i.e., R300A mutation, (SEQ ID NO:100, nucleotide coding sequence set forth in SEQ ID NO:99) or an amino acid change from arginine to leucine at amino acid position 300, i.e., R300L mutation, (SEQ ID NO:102, nucleotide coding sequence set forth in SEQ ID NO:101) and at least one other mutation which results in faster sucrose utilization, as described by Chen et al. (U.S. patent application Ser. No. 13/210,488, filed Aug. 16, 2011), i.e., either an amino acid change or a truncation of the amino acid sequence. Accordingly, in these embodiments, the variant sucrose transporter polypeptides have: an amino acid sequence that has at least 95% identity to an amino acid sequence as set forth in SEQ ID NO:26 based on a Clustal W method of alignment and have an amino acid change from arginine to alanine or arginine to leucine at position 300, and comprise:
[0138] (a) at least one amino acid change selected from the group consisting of:
[0139] (i) glutamine to histidine at position 353;
[0140] (ii) leucine to proline at position 61;
[0141] (iii) phenylalanine to leucine at position 159;
[0142] (iv) glycine to cysteine at position 162;
[0143] (v) proline to histidine at position 169;
[0144] (vi) leucine to tryptophan at position 61;
[0145] (vii) leucine to histidine at position 61;
[0146] (viii) leucine to phenylalanine at position 61; and
[0147] (ix) leucine to tyrosine at position 61; or
[0148] (b) a length of 402 to 407 amino acids from the N-terminus; or
[0149] (c) a length of 402 to 407 amino acids from the N-terminus, and
[0150] at least one of the amino acid changes of (a).
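The combination logic of the criteria listed above (a change at position 300 plus either a listed amino acid change, a truncated length, or both) can be sketched as a simple check. This is a hypothetical illustration: the names are invented for the example, positions are numbered against SEQ ID NO:26, and the 95% identity requirement is deliberately not checked here.

```python
from __future__ import annotations

# Changes listed in (a), keyed by position -> residues that satisfy the criterion.
FASTER_UTILIZATION_CHANGES = {
    353: {"H"},                      # glutamine to histidine
    61: {"P", "W", "H", "F", "Y"},   # leucine to P, W, H, F, or Y
    159: {"L"},                      # phenylalanine to leucine
    162: {"C"},                      # glycine to cysteine
    169: {"H"},                      # proline to histidine
}
POSITION_300_CHANGES = {"A", "L"}     # arginine to alanine or leucine
TRUNCATED_LENGTHS = range(402, 408)   # 402 to 407 residues, inclusive


def meets_variant_criteria(substitutions: dict[int, str], length: int) -> bool:
    """Return True if the position-300 change plus (a), (b), or (c) is satisfied.

    substitutions maps a position to the residue now present at that position;
    length is the number of amino acids counted from the N-terminus.
    """
    if substitutions.get(300) not in POSITION_300_CHANGES:
        return False
    has_listed_change = any(
        substitutions.get(position) in allowed
        for position, allowed in FASTER_UTILIZATION_CHANGES.items()
    )
    is_truncated = length in TRUNCATED_LENGTHS
    # (a) alone, (b) alone, or (c) both together all qualify.
    return has_listed_change or is_truncated


if __name__ == "__main__":
    # 415 is used here only as an example of a non-truncated length.
    print(meets_variant_criteria({300: "A", 61: "P"}, 415))
    print(meets_variant_criteria({300: "L"}, 403))
    print(meets_variant_criteria({300: "A"}, 415))
```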
[0151] In some embodiments, the sucrose transporter polypeptides are variants of sucrose transporter polypeptides from various sources (see Table 1), having an amino acid change to alanine or leucine at a position equivalent to amino acid position 300 when compared with a reference amino acid sequence of CscB (SEQ ID NO:26). The corresponding amino acid positions in the various sucrose transporter polypeptides, relative to the reference amino acid sequence, can be readily determined by one skilled in the art using sequence alignment algorithms, such as Clustal W, Clustal V, and BLASTP, which are described above. Accordingly, in these embodiments, the variant sucrose transporter polypeptides have an amino acid sequence that has at least 95% identity based on a Clustal W method of alignment to an amino acid sequence selected from the group consisting of SEQ ID NOs:68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, and 98, and an amino acid at an equivalent position when compared with a reference amino acid sequence of CscB (SEQ ID NO:26) selected from the group consisting of:
[0152] (a) alanine at a position equivalent to position 300; and
[0153] (b) leucine at a position equivalent to position 300.
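Finding the position "equivalent to position 300" in another transporter amounts to walking a pairwise alignment and counting residues in each sequence. The sketch below assumes the gapped alignment strings have already been produced by a tool such as Clustal W; the function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def equivalent_position(aligned_ref: str, aligned_query: str, ref_pos: int):
    """Map a 1-based reference position onto a query sequence.

    aligned_ref and aligned_query are equal-length rows of a pairwise
    alignment, with '-' marking gaps. Returns (query_position, query_residue),
    or None if the query has a gap opposite the reference position.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_index = query_index = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            ref_index += 1
        if query_char != "-":
            query_index += 1
        if ref_char != "-" and ref_index == ref_pos:
            return None if query_char == "-" else (query_index, query_char)
    raise ValueError("ref_pos lies beyond the end of the reference sequence")


if __name__ == "__main__":
    # Toy alignment for illustration only.
    reference = "MKR-LA"
    query = "M-RQLA"
    print(equivalent_position(reference, query, 3))  # reference R -> query R
    print(equivalent_position(reference, query, 2))  # reference K faces a gap
```

Because gaps shift the numbering, the equivalent residue in a homolog generally carries a different position number than in the reference sequence.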
[0154] In some embodiments, the sucrose transporter polypeptides are variants of sucrose transporter polypeptides from various sources (see Table 1) having an amino acid change to alanine or leucine at a position equivalent to amino acid position 300 when compared with a reference amino acid sequence of CscB (SEQ ID NO:26), as described above, and further comprise:
[0155] (a) at least one of the following amino acids at an equivalent position when compared with the reference amino acid sequence of SEQ ID NO:26:
[0156] (i) histidine at a position equivalent to position 353;
[0157] (ii) proline at a position equivalent to position 61;
[0158] (iii) leucine at a position equivalent to position 159;
[0159] (iv) cysteine at a position equivalent to position 162;
[0160] (v) histidine at a position equivalent to position 169;
[0161] (vi) tryptophan at a position equivalent to position 61;
[0162] (vii) histidine at a position equivalent to position 61;
[0163] (viii) phenylalanine at a position equivalent to position 61;
[0164] (ix) tyrosine at a position equivalent to position 61; and/or
[0165] (b) truncation at a position equivalent to position 407, 406, 405, 404, 403, or 402 when compared with the reference amino acid sequence of SEQ ID NO:26.
[0166] In some embodiments, the variant sucrose transporter polypeptides have an amino acid sequence selected from the group consisting of: SEQ ID NOs:100, 102, 104, 106, 108, 110, and 112.
[0167] Also disclosed herein are bacteria comprising in their genome or on at least one recombinant construct a nucleotide sequence encoding a variant sucrose transporter polypeptide and a nucleotide sequence encoding a polypeptide having sucrose hydrolase activity. The nucleotide sequences are each operably linked to the same or a different promoter. These bacteria are able to grow over a wider range of gene expression levels and sucrose concentrations than bacteria having native sucrose transporter polypeptides which actively transport sucrose. Accordingly, in these embodiments, the recombinant bacteria comprise in their genome or on at least one recombinant construct:
[0168] (a) a nucleotide sequence encoding a variant sucrose transporter polypeptide having an amino acid sequence that has at least 95% identity based on a Clustal W method of alignment to an amino acid sequence selected from the group consisting of SEQ ID NOs:26, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, and 98, and an amino acid at an equivalent position when compared with a reference amino acid sequence of SEQ ID NO:26 selected from the group consisting of:
[0169] (i) alanine at a position equivalent to position 300; and
[0170] (ii) leucine at a position equivalent to position 300; and
[0171] (b) a nucleotide sequence encoding a polypeptide having sucrose hydrolase activity; wherein (a) and (b) are each operably linked to the same or a different promoter, further wherein the recombinant bacteria are capable of metabolizing sucrose.
[0172] In some embodiments, the recombinant bacteria comprise a variant sucrose transporter polypeptide which further comprises:
[0173] (a) at least one of the following amino acids at an equivalent position when compared with the reference amino acid sequence of SEQ ID NO:26:
[0174] (i) histidine at a position equivalent to position 353;
[0175] (ii) proline at a position equivalent to position 61;
[0176] (iii) leucine at a position equivalent to position 159;
[0177] (iv) cysteine at a position equivalent to position 162;
[0178] (v) histidine at a position equivalent to position 169;
[0179] (vi) tryptophan at a position equivalent to position 61;
[0180] (vii) histidine at a position equivalent to position 61;
[0181] (viii) phenylalanine at a position equivalent to position 61;
[0182] (ix) tyrosine at a position equivalent to position 61; and/or
[0183] (b) truncation at a position equivalent to position 407, 406, 405, 404, 403, or 402 when compared with the reference amino acid sequence of SEQ ID NO:26.
[0184] Recombinant bacteria comprising a nucleotide sequence encoding a variant sucrose transporter polypeptide, as described above, and a nucleotide sequence encoding a polypeptide having sucrose hydrolase activity may be constructed by introducing the nucleotide sequences into a suitable host bacterium, either into the genome or on at least one recombinant construct, using methods known in the art, as described below. In some embodiments, the recombinant bacteria are capable of metabolizing sucrose to produce glycerol and/or glycerol-derived products.
[0185] Suitable host bacteria for use in the construction of the recombinant bacteria disclosed herein include, but are not limited to, organisms of the genera: Escherichia, Streptococcus, Agrobacterium, Bacillus, Corynebacterium, Lactobacillus, Clostridium, Gluconobacter, Citrobacter, Enterobacter, Klebsiella, Aerobacter, Methylobacter, Salmonella, Streptomyces, and Pseudomonas.
[0186] In some embodiments, the host bacterium is selected from the genera: Escherichia, Klebsiella, Citrobacter, and Aerobacter.
[0187] In some embodiments, the host bacterium is Escherichia coli.
[0188] In some embodiments, the host bacterium is PTS minus. In these embodiments, the host bacterium is PTS minus in its native state, or may be rendered PTS minus through inactivation of a PTS gene as described below.
[0189] In production microorganisms, it is sometimes desirable to unlink the transport of sugars and the use of phosphoenolpyruvate (PEP) for phosphorylation of the sugars being transported.
[0190] The term "down-regulated" refers to reduction in, or abolishment of, the activity of active protein(s), as compared to the activity of the wild-type protein(s). The PTS may be inactivated (resulting in a "PTS minus" organism) by down-regulating expression of one or more of the endogenous genes encoding the proteins required in this type of transport. Down-regulation typically occurs when one or more of these genes has a "disruption", referring to an insertion, deletion, or targeted mutation within a portion of that gene, that results in either a complete gene knockout, such that the gene is deleted from the genome and no protein is translated, or a translated protein that has an insertion, deletion, amino acid substitution, or other targeted mutation. The location of the disruption in the protein may be, for example, within the N-terminal portion of the protein or within the C-terminal portion of the protein. The disrupted protein will have impaired activity with respect to the protein that was not disrupted, and can be non-functional. Down-regulation that results in low or no expression of the protein could also be achieved by manipulating the regulatory sequences, transcription and translation factors, and/or signal transduction pathways, or by use of sense, antisense, or RNAi technology, or similar mechanisms known to skilled artisans.
[0191] The recombinant bacteria disclosed herein comprise in their genome or on at least one recombinant construct, a nucleotide sequence encoding a polypeptide having sucrose hydrolase activity. Polypeptides having sucrose hydrolase activity have the ability to catalyze the hydrolysis of sucrose to produce fructose and glucose. Polypeptides having sucrose hydrolase activity are known, and include, but are not limited to: CscA from E. coli wild-type strain EC3132 (set forth in SEQ ID NO:28), encoded by gene cscA (coding sequence set forth in SEQ ID NO:27); CscA from E. coli ATCC®13281 (set forth in SEQ ID NO:30), encoded by gene cscA (coding sequence set forth in SEQ ID NO:29); BfrA from Bifidobacterium lactis strain DSM 10140T (set forth in SEQ ID NO:32), encoded by gene bfrA (coding sequence set forth in SEQ ID NO:31); Suc2p from Saccharomyces cerevisiae (set forth in SEQ ID NO:34), encoded by gene SUC2 (coding sequence set forth in SEQ ID NO:33); ScrB from Corynebacterium glutamicum (set forth in SEQ ID NO:36), encoded by gene scrB (coding sequence set forth in SEQ ID NO:35); ScrB from Pseudomonas fluorescens Pf5 (set forth in SEQ ID NO:134), encoded by gene scrB (coding sequence set forth in SEQ ID NO:133); FruP from Bacillus licheniformis 14580 (set forth in SEQ ID NO:136), encoded by gene fruA (coding sequence set forth in SEQ ID NO:135); sucrose phosphorylase from Leuconostoc mesenteroides DSM 20193 (set forth in SEQ ID NO:38), coding sequence of encoding gene set forth in SEQ ID NO:37; and sucrose phosphorylase from Bifidobacterium adolescentis DSM 20083 (set forth in SEQ ID NO:40), encoded by gene sucP (coding sequence set forth in SEQ ID NO:39).
[0192] In some embodiments, the polypeptide having sucrose hydrolase activity is classified as EC 3.2.1.26 or EC 2.4.1.7.
[0193] In some embodiments, the polypeptide having sucrose hydrolase activity has at least 95% sequence identity, based on the Clustal W method of alignment, to an amino acid sequence as set forth in SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:134, or SEQ ID NO:136.
[0194] In some embodiments, the polypeptide having sucrose hydrolase activity corresponds substantially to the amino acid sequence set forth in SEQ ID NO:30.
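The "at least 95% identity" threshold recited in these embodiments is evaluated on aligned sequences. As a minimal sketch, the function below computes percent identity from two rows of an existing pairwise alignment under one common convention (identical residues divided by aligned residue pairs, with gap columns ignored). This convention is an assumption made for illustration; the value reported by a Clustal W implementation depends on its own parameters and may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned residue pairs; columns with a gap are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap column: not counted in this convention
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True if the two aligned sequences reach the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(percent_identity("MKRLA", "MKQLA"))
    print(meets_identity_threshold("MKRLA", "MKQLA"))
```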
[0195] The recombinant bacteria disclosed herein may further comprise in their genome or on at least one recombinant construct, a nucleotide sequence encoding a polypeptide having fructokinase activity to enable the bacteria to utilize the fructose produced by the hydrolysis of sucrose. Polypeptides having fructokinase activity include fructokinases (designated EC 2.7.1.4) and various hexose kinases having fructose phosphorylating activity (EC 2.7.1.3 and EC 2.7.1.1). Fructose phosphorylating activity may be exhibited by hexokinases and ketohexokinases. Representative genes encoding polypeptides from a variety of microorganisms, which may be used to construct the recombinant bacteria disclosed herein, are listed in Table 2. One skilled in the art will know that proteins that are substantially similar to a protein which is able to phosphorylate fructose (such as encoded by the genes listed in Table 2) may also be used.
TABLE 2
Sequences Encoding Enzymes with Fructokinase Activity

Source                      Gene Name             EC Number   Nucleotide SEQ ID NO:   Protein SEQ ID NO:
Agrobacterium tumefaciens   scrK (fructokinase)   2.7.1.4     41                      42
Streptococcus mutans        scrK (fructokinase)   2.7.1.4     43                      44
Escherichia coli            scrK (fructokinase)   2.7.1.4     45                      46
Klebsiella pneumoniae       scrK (fructokinase)   2.7.1.4     47                      48
Escherichia coli            cscK (fructokinase)   2.7.1.4     49                      50
Enterococcus faecalis       cscK (fructokinase)   2.7.1.4     51                      52
Saccharomyces cerevisiae    HXK1 (hexokinase)     2.7.1.1     53                      54
Saccharomyces cerevisiae    HXK2 (hexokinase)     2.7.1.1     55                      56
[0196] In some embodiments, the polypeptide having fructokinase activity is classified as EC 2.7.1.4, EC 2.7.1.3, or EC 2.7.1.1.
[0197] In some embodiments, the polypeptide having fructokinase activity has at least 95% sequence identity, based on the Clustal W method of alignment, to an amino acid sequence as set forth in SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, or SEQ ID NO:56.
[0198] In some embodiments, the polypeptide having fructokinase activity has the amino acid sequence set forth in SEQ ID NO:50.
[0199] The coding sequences of the genes encoding polypeptides having sucrose transporter activity and polypeptides having sucrose hydrolase activity may be used to isolate nucleotide sequences encoding homologous polypeptides from the same or other microbial species. For example, homologs of the genes may be identified using sequence analysis software, such as BLASTN, to search publicly available nucleic acid sequence databases. Additionally, the isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to, methods of nucleic acid hybridization, and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies (e.g., polymerase chain reaction (PCR), Mullis et al., U.S. Pat. No. 4,683,202; ligase chain reaction (LCR), Tabor, S. et al., Proc. Natl. Acad. Sci. USA 82, 1074, 1985; or strand displacement amplification (SDA), Walker, et al., Proc. Natl. Acad. Sci. U.S.A., 89: 392, (1992)). For example, the nucleotide sequence encoding the polypeptides described above may be employed as a hybridization probe for the identification of homologs.
[0200] One of ordinary skill in the art will appreciate that genes encoding these polypeptides isolated from other sources may also be used in the recombinant bacteria disclosed herein. Additionally, one of ordinary skill in the art will appreciate that variations in the nucleotide sequences encoding the polypeptides may be made without affecting the amino acid sequence of the encoded polypeptide due to codon degeneracy, and that amino acid substitutions, deletions, or additions that produce a substantially similar protein may be included in the encoded protein.
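Codon degeneracy can be illustrated with a short translation example. The sketch below builds the standard genetic code table and shows two different nucleotide sequences that encode the same peptide; the sequences are toy examples invented for illustration and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (length divisible by 3) to one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    original = "ATGCGTCTGGCA"    # toy sequence
    synonymous = "ATGCGCTTAGCG"  # different codons, same encoded peptide
    print(translate(original), translate(synonymous))
    # A single-codon change such as CGT -> GCT alters the encoded residue
    # from arginine to alanine, as in an arginine-to-alanine substitution.
    print(translate("CGT"), translate("GCT"))
```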
[0201] The nucleotide sequences encoding the polypeptides having sucrose transporter activity and polypeptides having sucrose hydrolase activity may be isolated using PCR (see, e.g., U.S. Pat. No. 4,683,202) with primers designed to bound the desired sequence. Other methods of gene isolation are well known to one skilled in the art such as by using degenerate primers or heterologous probe hybridization. The nucleotide sequences can also be chemically synthesized or purchased from vendors such as DNA2.0 Inc. (Menlo Park, Calif.), Integrated DNA Technologies (Coralville, Iowa), and GenScript USA Inc. (Piscataway, N.J.). The nucleotide sequences may be codon optimized for expression in the desired host cell.
[0202] Expression of the polypeptides may be effected using one of many methods known to one skilled in the art. For example, the nucleotide sequences encoding the polypeptides described above may be introduced into the bacterium on at least one multicopy plasmid, or by integrating one or more copies of the coding sequences into the host genome. The nucleotide sequences encoding the polypeptides may be introduced into the host bacterium separately (e.g., on separate plasmids) or in any combination (e.g., on a single plasmid).
[0203] The introduced coding regions that are either on a plasmid(s) or in the genome may be expressed from at least one highly active promoter. An integrated coding region may either be introduced as a part of a chimeric gene having its own promoter, or it may be integrated adjacent to a highly active promoter that is endogenous to the genome or in a highly expressed operon. Suitable promoters include, but are not limited to, CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, and lac, ara, tet, trp, λPL, λPR, T7, tac, and trc (useful for expression in Escherichia coli) as well as the amy, apr, npr promoters and various phage promoters useful for expression in Bacillus. The promoter may also be the Streptomyces lividans glucose isomerase promoter or a variant thereof, described by Payne et al. (U.S. Pat. No. 7,132,527).
[0204] In some embodiments, the recombinant bacteria disclosed herein are capable of producing glycerol. Biological processes for the preparation of glycerol using carbohydrates or sugars are known in yeasts and in some bacteria, other fungi, and algae. Both bacteria and yeasts produce glycerol by converting glucose or other carbohydrates through the fructose-1,6-bisphosphate pathway in glycolysis. In the method of producing glycerol disclosed herein, host bacteria may be used that naturally produce glycerol. In addition, bacteria may be engineered for production of glycerol and glycerol derivatives. The capacity for glycerol production from a variety of substrates may be provided through the expression of the enzyme activities glycerol-3-phosphate dehydrogenase (G3PDH) and/or glycerol-3-phosphatase as described in U.S. Pat. No. 7,005,291. Genes encoding these proteins that may be used for expressing the enzyme activities in a host bacterium are described in U.S. Pat. No. 7,005,291. Suitable examples of genes encoding polypeptides having glycerol-3-phosphate dehydrogenase activity include, but are not limited to, GPD1 from Saccharomyces cerevisiae (coding sequence set forth in SEQ ID NO:1, encoded protein sequence set forth in SEQ ID NO:2) and GPD2 from Saccharomyces cerevisiae (coding sequence set forth in SEQ ID NO:3, encoded protein sequence set forth in SEQ ID NO:4). Suitable examples of genes encoding polypeptides having glycerol-3-phosphatase activity include, but are not limited to, GPP1 from Saccharomyces cerevisiae (coding sequence set forth in SEQ ID NO:5, encoded protein sequence set forth in SEQ ID NO:6) and GPP2 from Saccharomyces cerevisiae (coding sequence set forth in SEQ ID NO:7, encoded protein sequence set forth in SEQ ID NO:8).
[0205] Increased production of glycerol may be attained through reducing expression of target endogenous genes. Down-regulation of endogenous genes encoding glycerol kinase and glycerol dehydrogenase activities further enhances glycerol production as described in U.S. Pat. No. 7,005,291. Increased channeling of carbon to glycerol may be accomplished by reducing the expression of the endogenous gene encoding glyceraldehyde 3-phosphate dehydrogenase, as described in U.S. Pat. No. 7,371,558. Down-regulation may be accomplished by using any method known in the art, for example, the methods described above for down-regulation of genes of the PTS system.
[0206] Glycerol provides a substrate for microbial production of useful products. Examples of such products, i.e., glycerol derivatives include, but are not limited to, 3-hydroxypropionic acid, methylglyoxal, 1,2-propanediol, and 1,3-propanediol.
[0207] In some embodiments, the recombinant bacteria disclosed herein are capable of producing 1,3-propanediol. The glycerol derivative 1,3-propanediol is a monomer having potential utility in the production of polyester fibers and the manufacture of polyurethanes and cyclic compounds. 1,3-Propanediol can be produced by a single microorganism by bioconversion of a carbon substrate other than glycerol or dihydroxyacetone, as described in U.S. Pat. No. 5,686,276. In this bioconversion, glycerol is produced from the carbon substrate, as described above. Glycerol is converted to the intermediate 3-hydroxypropionaldehyde by a dehydratase enzyme, which can be encoded by the host bacterium or can be introduced into the host by recombination. The dehydratase can be glycerol dehydratase (E.C. 4.2.1.30), diol dehydratase (E.C. 4.2.1.28) or any other enzyme able to catalyze this conversion. Suitable examples of genes encoding the "α" (alpha), "β" (beta), and "γ" (gamma) subunits of a glycerol dehydratase include, but are not limited to, dhaB1 (coding sequence set forth in SEQ ID NO:9), dhaB2 (coding sequence set forth in SEQ ID NO:11), and dhaB3 (coding sequence set forth in SEQ ID NO:13), respectively, from Klebsiella pneumoniae. The further conversion of 3-hydroxypropionaldehyde to 1,3-propanediol can be catalyzed by 1,3-propanediol dehydrogenase (E.C. 1.1.1.202) or other alcohol dehydrogenases. A suitable example of a gene encoding a 1,3-propanediol dehydrogenase is dhaT from Klebsiella pneumoniae (coding sequence set forth in SEQ ID NO:57, encoded protein sequence set forth in SEQ ID NO:58).
[0208] Bacteria can be recombinantly engineered to provide more efficient production of glycerol and the glycerol derivative 1,3-propanediol. For example, U.S. Pat. No. 7,005,291 discloses transformed microorganisms and a method for production of glycerol and 1,3-propanediol with advantages derived from expressing exogenous activities of one or both of glycerol-3-phosphate dehydrogenase and glycerol-3-phosphate phosphatase while disrupting one or both of endogenous activities glycerol kinase and glycerol dehydrogenase.
[0209] U.S. Pat. No. 6,013,494 describes a process for the production of 1,3-propanediol using a single microorganism comprising exogenous glycerol-3-phosphate dehydrogenase, glycerol-3-phosphate phosphatase, dehydratase, and 1,3-propanediol oxidoreductase (e.g., dhaT). U.S. Pat. No. 6,136,576 discloses a method for the production of 1,3-propanediol comprising a recombinant microorganism further comprising a dehydratase and protein X (later identified as being a dehydratase reactivation factor peptide).
[0210] U.S. Pat. No. 6,514,733 describes an improvement to the process where a significant increase in titer (grams product per liter) is obtained by virtue of a non-specific catalytic activity (distinguished from 1,3-propanediol oxidoreductase encoded by dhaT) to convert 3-hydroxypropionaldehyde to 1,3-propanediol. Additionally, U.S. Pat. No. 7,132,527 discloses vectors and plasmids useful for the production of 1,3-propanediol.
[0211] Increased production of 1,3-propanediol may be achieved by further modifications to a host bacterium, including down-regulating expression of some target genes and up-regulating expression of other target genes, as described in U.S. Pat. No. 7,371,558. For utilization of glucose as a carbon source in a PTS minus host, expression of glucokinase activity may be increased.
[0212] Additional genes whose increased or up-regulated expression increases 1,3-propanediol production include genes encoding:
[0213] phosphoenolpyruvate carboxylase, typically characterized as EC 4.1.1.31
[0214] cob(I)alamin adenosyltransferase, typically characterized as EC 2.5.1.17
[0215] non-specific catalytic activity that is sufficient to catalyze the interconversion of 3-HPA and 1,3-propanediol, and specifically excludes 1,3-propanediol oxidoreductase(s); typically, these enzymes are alcohol dehydrogenases
[0216] Genes whose reduced or down-regulated expression increases 1,3-propanediol production include genes encoding:
[0217] aerobic respiration control protein
[0218] methylglyoxal synthase
[0219] acetate kinase
[0220] phosphotransacetylase
[0221] aldehyde dehydrogenase A
[0222] aldehyde dehydrogenase B
[0223] triosephosphate isomerase
[0224] phosphogluconate dehydratase
[0225] In some embodiments, the recombinant bacteria disclosed herein are capable of producing 3-hydroxypropionic acid. 3-Hydroxypropionic acid has utility for specialty synthesis and can be converted to commercially important intermediates by known art in the chemical industry, e.g., acrylic acid by dehydration, malonic acid by oxidation, esters by esterification reactions with alcohols, and 1,3-propanediol by reduction. 3-Hydroxypropionic acid may be produced biologically from a fermentable carbon source by a single microorganism, as described in copending and commonly owned U.S. Patent Application Publication No. 2011/0144377 A1. In one representative biosynthetic pathway, a carbon substrate is converted to 3-hydroxypropionaldehyde, as described above for the production of 1,3-propanediol. The 3-hydroxypropionaldehyde is converted to 3-hydroxypropionic acid by an aldehyde dehydrogenase. Suitable examples of aldehyde dehydrogenases include, but are not limited to, AldB (SEQ ID NO:16), encoded by the E. coli gene aldB (coding sequence set forth in SEQ ID NO:15); AldA (SEQ ID NO:18), encoded by the E. coli gene aldA (coding sequence set forth in SEQ ID NO:17); and AldH (SEQ ID NO:20), encoded by the E. coli gene aldH (coding sequence set forth in SEQ ID NO:19).
[0226] Many of the modifications described above to improve 1,3-propanediol production by a recombinant bacterium can also be made to improve 3-hydroxypropionic acid production. For example, the elimination of glycerol kinase prevents glycerol, formed from G3P by the action of G3P phosphatase, from being re-converted to G3P at the expense of ATP. Also, the elimination of glycerol dehydrogenase (for example, gldA) prevents glycerol, formed from DHAP by the action of NAD-dependent glycerol-3-phosphate dehydrogenase, from being converted to dihydroxyacetone. Mutations can be directed toward a structural gene so as to impair or improve an enzymatic activity, or can be directed toward a regulatory gene, including promoter regions and ribosome binding sites, so as to modulate the expression level of an enzymatic activity.
[0227] Up-regulation or down-regulation may be achieved by a variety of methods which are known to those skilled in the art. It is well understood that up-regulation or down-regulation of a gene refers to an alteration in the level of activity present in a cell that is derived from the protein encoded by that gene relative to a control level of activity, for example, the activity of the protein encoded by the corresponding (or non-altered) wild-type gene.
[0228] Specific genes involved in an enzyme pathway may be up-regulated to increase the activity of their encoded function(s). For example, additional copies of selected genes may be introduced into the host cell on multicopy plasmids such as pBR322. Such genes may also be integrated into the chromosome with appropriate regulatory sequences that result in increased activity of their encoded functions. The target genes may be modified so as to be under the control of non-native promoters or altered native promoters. Endogenous promoters can be altered in vivo by mutation, deletion, and/or substitution.
[0229] Alternatively, it may be useful to reduce or eliminate the expression of certain genes relative to a given activity level. Methods of down-regulating (disrupting) genes are known to those of skill in the art.
[0230] Down-regulation can occur by deletion, insertion, or alteration of coding regions and/or regulatory (promoter) regions. Specific down regulations may be obtained by random mutation followed by screening or selection, or, where the gene sequence is known, by direct intervention by molecular biology methods known to those skilled in the art. A particularly useful, but not exclusive, method to effect down-regulation is to alter promoter strength.
[0231] Furthermore, down-regulation of gene expression may be used to either prevent expression of the protein of interest or result in the expression of a protein that is non-functional. This may be accomplished, for example, by 1) deleting coding regions and/or regulatory (promoter) regions, 2) inserting exogenous nucleic acid sequences into coding regions and/or regulatory (promoter) regions, and 3) altering coding regions and/or regulatory (promoter) regions (for example, by making DNA base pair changes). Specific disruptions may be obtained by random mutation followed by screening or selection, or, in cases where the gene sequences are known, specific disruptions may be obtained by direct intervention using molecular biology methods known to those skilled in the art. A particularly useful method is the deletion of significant amounts of coding regions and/or regulatory (promoter) regions.
[0232] Methods of altering recombinant protein expression are known to those skilled in the art, and are discussed in part in Baneyx, Curr. Opin. Biotechnol. (1999) 10:411; Ross, et al., J. Bacteriol. (1998) 180:5375; deHaseth, et al., J. Bacteriol. (1998) 180:3019; Smolke and Keasling, Biotechnol. Bioeng. (2002) 80:762; Swartz, Curr. Opin. Biotech. (2001) 12:195; and Ma, et al., J. Bacteriol. (2002) 184:5733.
[0233] Recombinant bacteria containing the necessary changes in gene expression for metabolizing sucrose in the production of microbial products including glycerol and glycerol derivatives, as described above, may be constructed using techniques well known in the art.
[0234] The construction of the recombinant bacteria disclosed herein may be accomplished using a variety of vectors and transformation and expression cassettes suitable for the cloning, transformation and expression of coding regions that confer the ability to utilize sucrose in the production of glycerol and its derivatives in a suitable host microorganism. Suitable vectors are those which are compatible with the bacterium employed. Suitable vectors can be derived, for example, from a bacterium, a virus (such as bacteriophage T7 or an M13-derived phage), a cosmid, a yeast or a plant. Protocols for obtaining and using such vectors are known to those skilled in the art (Sambrook et al., supra).
[0235] Initiation control regions, or promoters, which are useful to drive expression of coding regions for the instant invention in the desired host bacterium are numerous and familiar to those skilled in the art. Virtually any promoter capable of driving expression is suitable for use herein. For example, any of the promoters listed above may be used.
[0236] Termination control regions may also be derived from various genes native to the preferred hosts. A termination site may be unnecessary; however, its inclusion is most preferred.
[0237] For effective expression of the instant polypeptides, nucleotide sequences encoding the polypeptides are linked operably through initiation codons to selected expression control regions such that expression results in the formation of the appropriate messenger RNA.
[0238] Particularly useful are the vectors pSYCO101, pSYCO103, pSYCO106, and pSYCO109, described in U.S. Pat. No. 7,371,558, and pSYCO400/AGRO, described in U.S. Pat. No. 7,524,660. The essential elements of these vectors are derived from the dha regulon isolated from Klebsiella pneumoniae and from Saccharomyces cerevisiae. Each vector contains the open reading frames dhaB1, dhaB2, dhaB3, dhaX (coding sequence set forth in SEQ ID NO:59; encoded polypeptide sequence set forth in SEQ ID NO:60), orfX, DAR1, and GPP2 arranged in three separate operons. The nucleotide sequences of pSYCO101, pSYCO103, pSYCO106, pSYCO109, and pSYCO400/AGRO are set forth in SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, and SEQ ID NO:66, respectively. The differences between the vectors are illustrated in the chart below [the prefix "p-" indicates a promoter; the open reading frames contained within each "( )" represent the composition of an operon]:
pSYCO101 (SEQ ID NO:62):
[0239] p-trc (Dar1_GPP2) in opposite orientation compared to the other 2 pathway operons,
[0240] p-1.6 long GI (dhaB1_dhaB2_dhaB3_dhaX), and
[0241] p-1.6 long GI (orfY_orfX_orfW).
pSYCO103 (SEQ ID NO:63):
[0242] p-trc (Dar1_GPP2) in the same orientation compared to the other 2 pathway operons,
[0243] p-1.5 long GI (dhaB1_dhaB2_dhaB3_dhaX), and
[0244] p-1.5 long GI (orfY_orfX_orfW).
pSYCO106 (SEQ ID NO:64):
[0245] p-trc (Dar1_GPP2) in the same orientation compared to the other 2 pathway operons,
[0246] p-1.6 long GI (dhaB1_dhaB2_dhaB3_dhaX), and
[0247] p-1.6 long GI (orfY_orfX_orfW).
pSYCO109 (SEQ ID NO:65):
[0248] p-trc (Dar1_GPP2) in the same orientation compared to the other 2 pathway operons,
[0249] p-1.6 long GI (dhaB1_dhaB2_dhaB3_dhaX), and
[0250] p-1.6 long GI (orfY_orfX).
pSYCO400/AGRO (SEQ ID NO:66):
[0251] p-trc (Dar1_GPP2) in the same orientation compared to the other 2 pathway operons,
[0252] p-1.6 long GI (dhaB1_dhaB2_dhaB3_dhaX),
[0253] p-1.6 long GI (orfY_orfX), and
[0254] p-1.20 short/long GI (scrK) in opposite orientation compared to the pathway operons.
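By way of illustration only, the chart above may be encoded as a small data structure so that any two vectors can be compared operon by operon. The following is a minimal sketch and not part of the cited patents: vectors are keyed by their SEQ ID NO, and the class name, helper functions, and the orientation labels "reference", "same" and "opposite" are hypothetical encoding choices made here.

```python
from typing import NamedTuple


class Operon(NamedTuple):
    promoter: str
    orfs: tuple
    # "reference" marks operons whose orientation the chart uses as the baseline;
    # "same"/"opposite" are relative to that baseline (labels assumed for this sketch).
    orientation: str


def dar(orientation: str) -> Operon:
    return Operon("p-trc", ("Dar1", "GPP2"), orientation)


def dhab(promoter: str) -> Operon:
    return Operon(promoter, ("dhaB1", "dhaB2", "dhaB3", "dhaX"), "reference")


def orf(promoter: str, with_orfw: bool) -> Operon:
    orfs = ("orfY", "orfX", "orfW") if with_orfw else ("orfY", "orfX")
    return Operon(promoter, orfs, "reference")


# Keys are the SEQ ID NOs given in the chart (62 to 66).
VECTORS = {
    62: (dar("opposite"), dhab("p-1.6 long GI"), orf("p-1.6 long GI", True)),
    63: (dar("same"), dhab("p-1.5 long GI"), orf("p-1.5 long GI", True)),
    64: (dar("same"), dhab("p-1.6 long GI"), orf("p-1.6 long GI", True)),
    65: (dar("same"), dhab("p-1.6 long GI"), orf("p-1.6 long GI", False)),
    66: (
        dar("same"),
        dhab("p-1.6 long GI"),
        orf("p-1.6 long GI", False),
        Operon("p-1.20 short/long GI", ("scrK",), "opposite"),
    ),
}


def differing_operons(seq_id_a: int, seq_id_b: int) -> set:
    """Operons present in exactly one of the two vectors."""
    return set(VECTORS[seq_id_a]) ^ set(VECTORS[seq_id_b])


# The vectors of SEQ ID NO:65 and SEQ ID NO:66 differ only by the scrK operon.
for operon in differing_operons(65, 66):
    print(operon)
```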
[0255] Once suitable expression cassettes are constructed, they are used to transform appropriate host bacteria. Introduction of the cassette containing the coding regions into the host bacterium may be accomplished by known procedures such as by transformation (e.g., using calcium-permeabilized cells, or electroporation) or by transfection using a recombinant phage virus (Sambrook et al., supra). Expression cassettes may be maintained on a stable plasmid in a host cell. In addition, expression cassettes may be integrated into the genome of the host bacterium through homologous or random recombination using vectors and methods well known to those skilled in the art. Site-specific recombination systems may also be used for genomic integration of expression cassettes.
[0256] In addition to the cells exemplified, cells having single or multiple mutations specifically designed to enhance the production of microbial products including glycerol and/or its derivatives may also be used. Cells that normally divert a carbon feed stock into non-productive pathways, or that exhibit significant catabolite repression may be mutated to avoid these phenotypic deficiencies.
[0257] Methods of creating mutants are common and well known in the art. A summary of some methods is presented in U.S. Pat. No. 7,371,558. Specific methods for creating mutants using radiation or chemical agents are well documented in the art. See, for example, Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition (1989) Sinauer Associates, Inc., Sunderland, Mass., or Deshpande, Mukund V., Appl. Biochem. Biotechnol. 36, 227 (1992).
[0258] After mutagenesis has occurred, mutants having the desired phenotype may be selected by a variety of methods. Random screening is most common where the mutagenized cells are selected for the ability to produce the desired product or intermediate. Alternatively, selective isolation of mutants can be performed by growing a mutagenized population on selective media where only resistant colonies can develop. Methods of mutant selection are highly developed and well known in the art of industrial microbiology. See, for example, Brock, supra; DeMancilha et al., Food Chem. 14, 313 (1984).
[0259] Fermentation media in the present invention comprise sucrose as a carbon substrate. Other carbon substrates such as glucose and fructose may also be present.
[0260] In addition to the carbon substrate, a suitable fermentation medium contains, for example, suitable minerals, salts, cofactors, buffers and other components, known to those skilled in the art, suitable for the growth of the cultures and promotion of the enzymatic pathway necessary for production of glycerol and its derivatives, for example 1,3-propanediol. Particular attention is given to Co(II) salts and/or vitamin B12 or precursors thereof in production of 1,3-propanediol.
[0261] Adenosyl-cobalamin (coenzyme B12) is an important cofactor for dehydratase activity. Synthesis of coenzyme B12 is found in prokaryotes, some of which are able to synthesize the compound de novo, for example, Escherichia blattae, Klebsiella species, Citrobacter species, and Clostridium species, while others can perform partial reactions. E. coli, for example, cannot fabricate the corrin ring structure, but is able to catalyze the conversion of cobinamide to corrinoid and can introduce the 5'-deoxyadenosyl group. Thus, it is known in the art that a coenzyme B12 precursor, such as vitamin B12, needs to be provided in E. coli fermentations. Vitamin B12 may be added continuously to E. coli fermentations at a constant rate or staged so as to coincide with the generation of cell mass, or may be added in single or multiple bolus additions.
[0262] Although vitamin B12 is added to the transformed E. coli described herein, it is contemplated that other bacteria capable of de novo vitamin B12 biosynthesis will also be suitable production cells and that the addition of vitamin B12 to these bacteria will be unnecessary.
[0263] Typically, bacterial cells are grown at 25 to 40° C. in an appropriate medium containing sucrose. Examples of suitable growth media for use herein are common commercially prepared media such as Luria Bertani (LB) broth, Sabouraud Dextrose (SD) broth or Yeast medium (YM) broth. Other defined or synthetic growth media may also be used, and the appropriate medium for growth of the particular bacterium will be known by someone skilled in the art of microbiology or fermentation science. Agents known to modulate catabolite repression directly or indirectly, e.g., cyclic adenosine 3':5'-monophosphate, may also be incorporated into the reaction media. Similarly, agents known to modulate enzymatic activities (e.g., methyl viologen) that lead to enhancement of 1,3-propanediol production may be used in conjunction with or as an alternative to genetic manipulations with 1,3-propanediol production strains.
[0264] Suitable pH ranges for the fermentation are between pH 5.0 and pH 9.0, where pH 6.0 to pH 8.0 is typical as the initial condition.
[0265] Reactions may be performed under aerobic, anoxic, or anaerobic conditions depending on the requirements of the recombinant bacterium. Fed-batch fermentations may be performed with the carbon feed (for example, the carbon substrate) either limited or in excess.
[0266] Batch fermentation is a commonly used method. Classical batch fermentation is a closed system where the composition of the medium is set at the beginning of the fermentation and is not subject to artificial alterations during the fermentation. Thus, at the beginning of the fermentation, the medium is inoculated with the desired bacterium and fermentation is permitted to occur without adding anything to the system. Typically, however, "batch" fermentation is batch with respect to the addition of carbon source, and attempts are often made at controlling factors such as pH and oxygen concentration. In batch systems, the metabolite and biomass compositions of the system change constantly up to the time the fermentation is stopped. Within batch cultures, cells progress through a static lag phase to a high growth log phase and finally to a stationary phase where growth rate is diminished or halted. If untreated, cells in the stationary phase will eventually die. Cells in log phase generally are responsible for the bulk of production of end product or intermediate.
[0267] A variation on the standard batch system is the Fed-Batch system. Fed-Batch fermentation processes are also suitable for use herein and comprise a typical batch system with the exception that the substrate is added in increments as the fermentation progresses. Fed-Batch systems are useful when catabolite repression is apt to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the media. Measurement of the actual substrate concentration in Fed-Batch systems is difficult and is therefore estimated on the basis of the changes of measurable factors such as pH, dissolved oxygen and the partial pressure of waste gases such as CO2. Batch and Fed-Batch fermentations are common and well known in the art and examples may be found in Brock, supra.
[0268] Continuous fermentation is an open system where a defined fermentation medium is added continuously to a bioreactor and an equal amount of conditioned medium is removed simultaneously for processing. Continuous fermentation generally maintains the cultures at a constant high density where cells are primarily in log phase growth.
[0269] Continuous fermentation allows for the modulation of one factor or any number of factors that affect cell growth or end product concentration. For example, one method will maintain a limiting nutrient such as the carbon source or nitrogen level at a fixed rate and allow all other parameters to vary. In other systems, a number of factors affecting growth can be altered continuously while the cell concentration, measured by the turbidity of the medium, is kept constant. Continuous systems strive to maintain steady state growth conditions, and thus the cell loss due to medium being drawn off must be balanced against the cell growth rate in the fermentation. Methods of modulating nutrients and growth factors for continuous fermentation processes as well as techniques for maximizing the rate of product formation are well known in the art of industrial microbiology and a variety of methods are detailed by Brock, supra.
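By way of illustration only, the balance described above between cell loss through medium removal and cell growth can be sketched as follows. The sketch assumes an ideally mixed vessel of constant volume, in which the dilution rate D is the flow rate divided by the culture volume and the biomass concentration X changes as dX/dt = (mu - D)X. The function names, parameter names, and numerical values are hypothetical and are not taken from the cited references.

```python
def dilution_rate(flow_l_per_h: float, volume_l: float) -> float:
    """Dilution rate D (1/h) for a constant-volume, ideally mixed vessel (assumed)."""
    if volume_l <= 0:
        raise ValueError("culture volume must be positive")
    return flow_l_per_h / volume_l


def net_biomass_rate(mu_per_h: float, flow_l_per_h: float,
                     volume_l: float, biomass_g_per_l: float) -> float:
    """dX/dt = (mu - D) * X: zero at steady state, negative when cells wash out."""
    d = dilution_rate(flow_l_per_h, volume_l)
    return (mu_per_h - d) * biomass_g_per_l


# Hypothetical numbers: 0.5 L/h through a 2 L culture gives D = 0.25 per hour.
print(net_biomass_rate(0.25, 0.5, 2.0, 10.0))  # growth exactly balances removal
print(net_biomass_rate(0.10, 0.5, 2.0, 10.0))  # growth slower than removal
```

Under these assumptions, steady state corresponds to a specific growth rate equal to the dilution rate; a growth rate below the dilution rate gives a negative value, i.e., net loss of cells from the vessel.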
[0270] It is contemplated that the present invention may be practiced using batch, fed-batch or continuous processes and that any known mode of fermentation would be suitable. Additionally, it is contemplated that cells may be immobilized on a substrate as whole cell catalysts and subjected to fermentation conditions for production of glycerol and glycerol derivatives, such as 1,3-propanediol.
[0271] In some embodiments, a process for making glycerol, 1,3-propanediol, and/or 3-hydroxypropionic acid from sucrose is provided. The process comprises the steps of culturing a recombinant bacterium, as described above, in the presence of sucrose, and optionally recovering the glycerol, 1,3-propanediol, and/or 3-hydroxypropionic acid produced. The product may be recovered using methods known in the art. For example, solids may be removed from the fermentation medium by centrifugation, filtration, decantation, or the like. Then, the product may be isolated from the fermentation medium, which has been treated to remove solids as described above, using methods such as distillation, liquid-liquid extraction, or membrane-based separation.
EXAMPLES
[0272] The present invention is further defined in the following Examples. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various uses and conditions.
General Methods
[0273] Standard recombinant DNA and molecular cloning techniques described in the Examples are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, N.Y. (1989) (Maniatis) and by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984) and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, pub. by Greene Publishing Assoc. and Wiley-Interscience (1987).
[0274] The meaning of abbreviations is as follows: "sec" means second(s), "min" means minute(s), "h" means hour(s), "nm" means nanometers, "μL" means microliter(s), "mL" means milliliter(s), "L" means liter(s), "mM" means millimolar, "M" means molar, "g" means gram(s), "μg" means microgram(s), "bp" means base pair(s), "kbp" means kilobase pair(s), "rpm" means revolutions per minute, "ATCC" means American Type Culture Collection, Manassas, Va., "dH2O" means distilled water.
Media and Culture Conditions:
[0275] Materials and methods suitable for the maintenance and growth of bacterial cultures are well known in the art. Techniques suitable for use in the following Examples may be found as set out in Manual of Methods for General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, eds.), American Society for Microbiology, Washington, D.C. (1994), or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland, Mass. (1989). All reagents, restriction enzymes and materials described for the growth and maintenance of bacterial cells may be obtained from Aldrich Chemicals (Milwaukee, Wis.), BD Diagnostic Systems (Sparks, Md.), Life Technologies (Rockville, Md.), New England Biolabs (Beverly, Mass.), or Sigma Chemical Company (St. Louis, Mo.), unless otherwise specified.
[0276] LB (Luria Bertani) medium contains the following per liter of medium: Bacto-tryptone (10 g), Bacto-yeast extract (5 g), and NaCl (10 g). Supplements were added as described in the Examples below. All additions were pre-sterilized before they were added to the medium.
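By way of illustration only, the per-liter LB composition stated above can be scaled to an arbitrary batch volume as sketched below. The helper name and the example volume are hypothetical; only the per-liter amounts are taken from the text.

```python
# Grams per liter of LB medium, as stated in the text.
LB_G_PER_L = {
    "Bacto-tryptone": 10.0,
    "Bacto-yeast extract": 5.0,
    "NaCl": 10.0,
}


def lb_batch(volume_l: float) -> dict:
    """Grams of each component needed for a batch of the given volume (liters)."""
    if volume_l <= 0:
        raise ValueError("batch volume must be positive")
    return {name: grams * volume_l for name, grams in LB_G_PER_L.items()}


# Hypothetical 0.5 L batch.
for component, grams in lb_batch(0.5).items():
    print(f"{component}: {grams} g")
```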
Molecular Biology Techniques:
[0277] Restriction enzyme digestions, ligations, transformations, and methods for agarose gel electrophoresis were performed as described in Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press (1989). Polymerase chain reaction (PCR) techniques were as described in White, B., PCR Protocols: Current Methods and Applications, Volume 15 (1993), Humana Press Inc., New York, N.Y.
Examples 1 and 2
Recombinant E. coli Strain Comprising a Variant CscB Sucrose Transporter Having a R300A Mutation
[0278] The purpose of these Examples was to construct a recombinant E. coli strain containing a variant CscB sucrose transport gene (coding sequence set forth in SEQ ID NO:99), encoding an R300A variant of CscB (SEQ ID NO:100), and to demonstrate sucrose transport by facilitated diffusion. The protein encoded by the mutant sucrose transport gene was altered in a residue required for H+ translocation, thus eliminating H+/sucrose symport (i.e., active transport of sucrose).
Construction of Expression Vectors:
[0280] Two expression vectors were constructed, one using promoter element P1.20 and the second using promoter element P1.5. P1.20 and P1.5 refer to promoter elements derived from the Streptomyces lividans glucose isomerase promoter as described in U.S. Pat. No. 7,132,527. These two promoters differ from each other by one base in the -35 region such that P1.20 confers lower expression than does P1.5.
[0281] The promoter/multiple cloning site/double terminator regions were synthesized by Integrated DNA Technologies (Coralville, Iowa) and cloned into their pIDTsmart vector, resulting in the construction of plasmids named pDMWP1 and pDMWP3. The sequences of the synthesized regions for vectors pDMWP1 and pDMWP3 are set forth in SEQ ID NO:113 and SEQ ID NO:114, respectively.
[0282] A plasmid referred to herein as pDMWP4 was used as the backbone for subsequent constructs. Plasmid pDMWP4 was constructed from plasmid pBR322 by modifying restriction sites as follows. A ScaI site and a KpnI site on the 5' end of the TetR gene and an additional KpnI site at the 3' end of the TetR gene were introduced into plasmid pBR322. Additionally, a KpnI site was removed from the middle of the AmpR gene. All sites were either added or removed using Stratagene's QuikChange® kits (Stratagene, La Jolla, Calif.) following the manufacturer's protocols.
[0283] Plasmids pDMWP1 and pDMWP3 were digested with EcoRI and KpnI. The resulting 438 bp fragment from each construct was individually cloned into pDMWP4, also digested with EcoRI and KpnI, to complete plasmids pDMWP10 and pDMWP12, which are also referred to herein as pBR*P1.5 and pBR*P1.20, respectively.
[0284] The R300A variant of CscB was given the allele name, cscB3. This mutation was introduced into plasmid pBHRcscBKA (described in U.S. Patent Application Publication No. 2011/0136190, Example 1) by site-directed mutagenesis using Stratagene's QuikChange® Site-Directed Mutagenesis kit following the manufacturer's protocol. Primers ODMWP23 (SEQ ID NO:115) and ODMWP24 (SEQ ID NO:116) were used with plasmid pBHRcscBKA as template in the reaction, creating plasmid pDMWP5. The cscB3 gene was subsequently amplified from pDMWP5 using primers ODMWP31 (SEQ ID NO:117) and ODMWP32 (SEQ ID NO:118) to add HindIII/ClaI sites. The resulting product was cloned into pBADtopo (Invitrogen, Carlsbad, Calif.) creating plasmid pDMWP26.
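By way of illustration only, the substitution notation used above (wild-type residue, position, replacement residue, as in R300A) can be applied to a protein sequence in silico as sketched below. This is a minimal sketch and does not represent the mutagenesis kit or its protocol. The function name is hypothetical, and the six-residue sequence is a toy placeholder, not the CscB sequence.

```python
import re


def apply_substitution(protein: str, mutation: str) -> str:
    """Apply a point substitution written as <wild type><position><new>, e.g. 'R300A'.

    Positions are 1-based. A ValueError is raised if the residue found at the
    position does not match the stated wild-type residue.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, replacement = match.group(1), match.group(3)
    position = int(match.group(2))
    if not 1 <= position <= len(protein):
        raise ValueError("position lies outside the sequence")
    found = protein[position - 1]
    if found != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}, found {found}")
    return protein[: position - 1] + replacement + protein[position:]


# Toy placeholder sequence (NOT CscB): arginine at position 3 replaced by alanine.
toy = "MKRLQG"
print(apply_substitution(toy, "R3A"))  # MKALQG
```

Checking the wild-type residue before substituting guards against off-by-one position errors when the same notation is applied to an aligned or truncated sequence.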
[0285] The HindIII/ClaI fragment from pDMWP26 was cloned into HindIII/ClaI digested pDMWP12, creating plasmid pDMWP32, which contained promoter P1.20.
[0286] The HindIII/PacI fragment from pDMWP32 was cloned into HindIII/PacI digested pDMWP10, creating pDMWP73, which contained promoter P1.5.
Construction of E. coli Strains with or without Expression of cscB3:
[0287] E. coli strain PDO3513, an E. coli K12 strain [FM5 yihP:cscA+K+B-(Δ61-353, kanR)] that does not have sucrose transporter function but possesses genes encoding sucrose invertase and fructokinase for downstream metabolism, was used as the host strain. E. coli strain PDO3513 was constructed from an E. coli strain (referred to herein as PDO3085) containing the wild type cscAKB gene cluster from E. coli ATCC® 13281, integrated at the yihP gene in E. coli strain FM5 (ATCC® No. 53911). The cscAKB gene cluster (SEQ ID NO:61) was integrated at the yihP location in E. coli strain FM5 (ATCC® No. 53911) by the Lambda Red method. The cscAKB gene cluster was amplified from plasmid pBHRcscBKA (SEQ ID NO:119), which was constructed as described in Example 1 of U.S. Patent Application Publication No. 2011/0136190 A1, using yihP cscA primer (SEQ ID NO:120) and yihP cscB primer (SEQ ID NO:121) containing flanking sequences for the yihP gene. Plasmid pBHRcscBKA, linearized by PstI digest, was used as the PCR template. High fidelity PfuUltra® II Fusion HS DNA polymerase (Stratagene; La Jolla, Calif.) was used in the PCR reaction. PCR was performed using the following cycling conditions: 95° C. for 2 min; 35 cycles of 95° C. for 30 sec, 60° C. for 30 sec, and 72° C. for 4 min; and then 72° C. for 7 min. The resulting PCR product was stored at 4° C. The PCR product was purified using a QIAquick PCR Purification kit (Qiagen, Valencia, Calif.). The purified PCR product was electroporated into E. coli strain FM5 containing the pKD46 plasmid (Red recombinase plasmid, GenBank Acc. No. AY048746), encoding lambda recombinases, following the lambda red recombination procedure (Datsenko, K. A. and Wanner, B. L., 2000, Proc. Natl. Acad. Sci. USA 97, 6640-6645). The transformation mixture was plated on MOPS minimal plates containing 10 g/L sucrose. The MOPS minimal plates contained 1X MOPS buffer (Teknova, Hollister, Calif.), 1.32 mM KH2PO4 (Teknova), 50 μg/L uracil and 1.5 g/L Bacto agar. Plates were incubated at 37° C. for 2-3 days. Colonies grown on minimal sucrose plates were picked to give E. coli strain PDO3085.
[0288] The cscB gene in the cluster in PDO3085 was then partially deleted by replacing it with a kanamycin resistance cassette. The kanamycin resistance cassette was amplified from the pKD4 template plasmid (Datsenko and Wanner, Proc. Natl. Acad. Sci. USA 97:6640-6645, 2000) using cscB61 up kan primer (SEQ ID NO:91) and cscB353 down kan primer (SEQ ID NO:92). High fidelity PfuUltra® II Fusion HS DNA polymerase (Stratagene; La Jolla, Calif.) was used in the PCR reaction. PCR was performed using the following cycling conditions: 95° C. for 2 min; 30 cycles of 95° C. for 20 sec, 60° C. for 20 sec, and 72° C. for 1.5 min; and then 72° C. for 3 min. The resulting PCR product was stored at 4° C. The PCR product was purified using the QIAquick PCR Purification kit (Qiagen). The purified PCR product was electroporated into the PDO3085 strain containing the pKD46 plasmid encoding lambda recombinases following the lambda red recombination procedure. The transformation mixture was plated on LB plates containing 25 μg/mL kanamycin. The kanamycin-resistant colonies were checked on MOPS+10 g/L sucrose plates to make sure that they were unable to grow on sucrose. Insertion of the kanamycin resistance cassette between residues 61 and 353 of CscB was confirmed by PCR using cscB 5' primer (SEQ ID NO:93) and cscB 3' primer (SEQ ID NO:94). The resulting FM5 yihP:cscA+K+B-(Δ61-353, kanR) strain was designated as PDO3513.
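By way of illustration only, the two PCR cycling programs described above can be written down as data and their total programmed hold times computed, as sketched below. The class and variable names are hypothetical. The calculation counts only the stated hold times and ignores temperature ramping and the 4° C. storage step, so actual run times on a given instrument would be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    initial_denaturation_s: int
    cycles: int
    cycle_steps_s: tuple  # (denaturation, annealing, extension) hold times in seconds
    final_extension_s: int

    def hold_time_s(self) -> int:
        """Sum of programmed hold times; ramp times are not included."""
        return (self.initial_denaturation_s
                + self.cycles * sum(self.cycle_steps_s)
                + self.final_extension_s)


# cscAKB cluster: 95 C 2 min; 35 x (95 C 30 s, 60 C 30 s, 72 C 4 min); 72 C 7 min.
CSCAKB_PROGRAM = PcrProgram(120, 35, (30, 30, 240), 420)
# Kanamycin cassette: 95 C 2 min; 30 x (95 C 20 s, 60 C 20 s, 72 C 1.5 min); 72 C 3 min.
KAN_CASSETTE_PROGRAM = PcrProgram(120, 30, (20, 20, 90), 180)

for name, program in (("cscAKB", CSCAKB_PROGRAM), ("kan cassette", KAN_CASSETTE_PROGRAM)):
    print(f"{name}: {program.hold_time_s() / 60:.0f} min of programmed holds")
```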
[0289] Plasmids pDMWP10 (the vector alone) and pDMWP73, carrying the mutant cscB3 gene, were introduced independently into E. coli strain PDO3513. The resultant strains were named PDO2768 and PDO2770, respectively.
Growth Characterization of E. coli Strains with or without Expression of cscB3:
[0290] E. coli strains PDO2768 (Example 2, Comparative) and PDO2770 (Example 1) were grown overnight in LB (Luria Bertani) medium containing 100 μg/mL ampicillin at 37° C. The next day, these cultures were diluted 1:50 in MOPS minimal medium (Teknova, Half Moon Bay, Calif.) containing 2 g/L sucrose and 25 μg/mL ampicillin. These cultures were grown at 37° C. with shaking at 250 rpm for 4 hours. The log-phase cultures were diluted 1:100 in the wells of a Bioscreen-C plate (instrument and plates purchased from Growth Curves USA, Piscataway, N.J.) with 150 μL of MOPS minimal medium (Teknova, Half Moon Bay, Calif.) containing 2 g/L glucose or 16 g/L sucrose. The cultures were grown at 37° C. in triplicate with continuous shaking and the optical density was monitored. The optical densities of the two cultures at 40 hours after inoculation are given in Table 3.
TABLE 3
Optical Density of Cultures Growing on Glucose or Sucrose at 40 Hours

Carbon Source      Example 1          Example 2, Comparative
                   PDO2770            PDO2768
2 g/L glucose      0.583 ± 0.045      0.572 ± 0.022
16 g/L sucrose     0.892 ± 0.023      0.012 ± 0.003
[0291] As can be seen from the data in Table 3, both strains grew well with glucose as a sole carbon source. In contrast, the control strain (i.e., vector only strain) PDO2768 (Example 2, Comparative) was unable to grow with sucrose as sole carbon source, while PDO2770 (Example 1), the strain expressing the mutant cscB3 gene encoding a sucrose transporter unable to translocate H+ ion, was able to grow with sucrose as sole carbon source. Thus, net translocation of sucrose across the membrane must have occurred without translocation of a H+ ion.
Example 3
Recombinant E. coli Strain Comprising a Variant CscB Sucrose Transporter Having R300A and Q353H Mutations
[0293] The purpose of this Example was to construct a recombinant E. coli strain containing a variant CscB sucrose transport gene (coding sequence set forth in SEQ ID NO:103), encoding CscB having R300A and Q353H mutations (SEQ ID NO:104), and to demonstrate improved sucrose transport by facilitated diffusion with the additional mutation conferring a Q353H alteration in cscB3.
Construction of Expression Vectors:
[0294] For ease of cloning into a smaller vector, the KanR gene from pBHRcscBKAmutB (described in U.S. Patent Application Publication No. 2011/0136190, Example 1) was removed by digesting the plasmid with PstI and religating, creating plasmid pDMWP6. The new vector was 1240 bp smaller than the parent. The mutant cscB gene in this vector confers the Q353H variation with improved sucrose transport (Jahreis et al., J. Bacteriol. 184:5307-5316, 2002) as compared to the wild type sucrose symporter. It was not known if this mutation would improve sucrose transport by facilitated diffusion.
[0295] A mutation conferring the R300A variation was introduced into plasmid pDMWP6 by site-directed mutagenesis using Stratagene's QuikChange® Site-Directed Mutagenesis kit following the manufacturer's protocol. Primers ODMWP23 (SEQ ID NO:115) and ODMWP24 (SEQ ID NO:116) were used with plasmid pDMWP6 as template in the reaction, creating plasmid pDMWP15. The cscB5 gene (containing R300A and Q353H mutations) was subsequently amplified from pDMWP15 using primers ODMWP31 (SEQ ID NO:117) and ODMWP32 (SEQ ID NO:118) to add HindIII/ClaI sites. The resulting product was cloned into pBADtopo (Invitrogen, Carlsbad, Calif.), creating plasmid pDMWP27.
[0297] The HindIII/ClaI fragment from pDMWP27 was cloned into HindIII/ClaI digested pDMWP12, creating pDMWP33, which contained the P1.20 promoter.
[0298] The HindIII/PacI fragment from pDMWP33 was cloned into HindIII/PacI digested pDMWP10, creating pDMWP66, which contained the P1.5 promoter.
Construction of E. coli Strain with Expression of cscB5:
[0299] Plasmid pDMWP66 (pBR*p1.5csc5) was transformed into strain PDO3513, to give strain PDO2771.
Growth Characterization of E. coli Strains with Expression of cscB3 or cscB5:
[0300] E. coli strains PDO2770 (with pBR*p1.5csc3, described in Examples 1 and 2) and PDO2771 (with pBR*p1.5csc5) were grown overnight in LB (Luria Bertani) medium containing 100 μg/mL ampicillin at 37° C. The next day, these cultures were diluted 1:50 in MOPS minimal medium (Teknova, Half Moon Bay, Calif.) containing 2 g/L sucrose and 25 μg/mL ampicillin. These cultures were grown at 37° C. with shaking at 250 rpm for 4 hours. The log-phase cultures were diluted 1:100 in the wells of a Bioscreen-C plate (instrument and plates purchased from Growth Curves USA, Piscataway, N.J.) with 150 μL of MOPS minimal medium (Teknova, Half Moon Bay, Calif.) containing 2 g/L glucose or 16 g/L sucrose.
[0301] The cultures were grown at 37° C. in triplicate with continuous shaking and the optical density was monitored. The growth on sucrose was much faster in cultures of strain PDO2771 than in cultures of strain PDO2770. At 14 hours after inoculation, the optical density of the PDO2770 culture growing on 16 g/L sucrose was 0.060±0.024 while that of the PDO2771 culture growing on 16 g/L sucrose was 0.647±0.009. As a measure of the health of the inoculum cultures, the growth on glucose was measured. Both strains grew well with glucose as a sole carbon source. At 14 hours after inoculation, the optical density of the PDO2770 culture growing on glucose was 0.639±0.037 and the optical density of the PDO2771 culture growing on glucose was 0.693±0.070. These results demonstrate that the strain expressing CscB5, the sucrose transporter with both Q353H and R300A mutations, was able to grow much better with sucrose as sole carbon source than did the strain expressing CscB3 (R300A alone). Because the CscB5 protein still carries a mutation in a residue essential for H+ translocation, it must be transporting sucrose without translocation of a H+ ion. Thus, the transporter encoded by the gene with the double mutation is an improved facilitated diffusion sucrose transporter.
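By way of illustration only, the comparison drawn above can be expressed as ratios of the mean optical densities reported at 14 hours. The sketch below uses only the mean values stated in the text; the dictionary and function names are hypothetical, and the reported standard deviations are not propagated, so the ratios are indicative rather than statistical estimates.

```python
# Mean optical densities at 14 hours, as reported in the text.
OD_14H = {
    "PDO2770": {"sucrose": 0.060, "glucose": 0.639},  # CscB3 (R300A)
    "PDO2771": {"sucrose": 0.647, "glucose": 0.693},  # CscB5 (R300A, Q353H)
}


def od_ratio(carbon_source: str,
             numerator: str = "PDO2771",
             denominator: str = "PDO2770") -> float:
    """Ratio of mean optical densities of two strains on one carbon source."""
    return OD_14H[numerator][carbon_source] / OD_14H[denominator][carbon_source]


print(f"sucrose: {od_ratio('sucrose'):.1f}-fold")
print(f"glucose: {od_ratio('glucose'):.2f}-fold")
```

On these mean values the two strains are nearly indistinguishable on glucose, while on 16 g/L sucrose the ratio is roughly an order of magnitude, consistent with the statement that the difference lies in sucrose transport rather than in the health of the inoculum.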
Examples 4-6
Growth on Sucrose of Recombinant E. coli Strains Comprising Mutant or Wild Type Sucrose Transporters
[0302] The purpose of these Examples was to show that a recombinant E. coli strain comprising a variant of CscB having R300A and Q353H mutations was able to grow at a wider range of sucrose concentrations than E. coli strains comprising the wild type sucrose transporter.
Construction of Expression Vectors:
[0303] The wild type E. coli cscB gene was originally amplified from pBHRcscBKA (SEQ ID NO:119; described in U.S. Patent Application Publication No. 2011/0136190, Example 1) with primers ODMWP31 (SEQ ID NO:117) and ODMWP32 (SEQ ID NO:118), allowing the addition of HindIII and ClaI sites at the 5' and 3' ends of the gene, respectively. The PCR fragment was cloned into pBADtopo (Invitrogen, Carlsbad, Calif.), creating plasmid pDMWP25.
[0304] The HindIII/ClaI fragment from pDMWP25 was cloned into HindIII/ClaI digested pDMWP12, creating pDMWP31, which contained promoter P1.20.
[0305] The HindIII/PacI fragment from pDMWP31 was cloned into HindIII/PacI digested pDMWP10, creating pDMWP71, which contained promoter P1.5.
Construction of E. coli Strains Comprising the Wild Type Sucrose Transporter CscB:
[0306] Plasmids pDMWP31 (pBR*p1.20cscB) and pDMWP71 (pBR*p1.5cscB) were transformed independently into strain PDO3513, to give strains PDO2625 and PDO2769, respectively.
Growth Characterization of Strains with Expression of cscB5 or Wild Type cscB:
[0307] The two E. coli strains with plasmids encoding the wild type sucrose symporter, PDO2625 (Example 5, Comparative) and PDO2769 (Example 6, Comparative), and a strain with a plasmid carrying the improved sucrose uniporter, PDO2771 (Example 4, with pBR*p1.5csc5, described in Example 3), were grown overnight in LB (Luria Bertani) medium containing 100 μg/mL ampicillin at 37° C. The next day, these cultures were diluted 1:50 in MOPS minimal medium (Teknova, Half Moon Bay, Calif.) containing 2 g/L sucrose and 25 μg/mL ampicillin. These cultures were grown at 37° C. with shaking at 250 rpm for 4 hours. The log-phase cultures were diluted 1:100 in the wells of a Bioscreen-C plate (instrument and plates purchased from Growth Curves USA, Piscataway, N.J.) with 150 μL of MOPS minimal medium (Teknova, Half Moon Bay, Calif.) containing 2 g/L glucose or 2, 4, 8, 16, or 32 g/L sucrose. The cultures were grown at 37° C. in triplicate with continuous shaking and the optical density was monitored. The growth on various concentrations of sucrose was followed. Table 4 shows the optical density at 14 hours after inoculation for the cultures of PDO2771, PDO2625, and PDO2769.
TABLE 4
Optical Density of Cultures Growing on Glucose or Various Concentrations of Sucrose at 14 Hours

Carbon Source      Example 4          Example 5, Comparative    Example 6, Comparative
                   PDO2771            PDO2625                   PDO2769
                   (pBR1.5cscB5)      (pBR*p1.20cscB)           (pBR*p1.5cscB)
2 g/L glucose      0.693 ± 0.070      0.593 ± 0.005             0.668 ± 0.013
2 g/L sucrose      0.071 ± 0.036      0.452 ± 0.028             0.654 ± 0.007
4 g/L sucrose      0.343 ± 0.184      0.719 ± 0.007             0.716 ± 0.011
8 g/L sucrose      0.745 ± 0.004      0.655 ± 0.010             0.094 ± 0.012
16 g/L sucrose     0.647 ± 0.009      0.107 ± 0.007             0.060 ± 0.005
32 g/L sucrose     0.576 ± 0.006      0.049 ± 0.002             0.058 ± 0.006
[0308] As shown by the results in Table 4, all three strains grew well on 2 g/L glucose, indicating that the inoculum cultures were viable. The growth of the PDO2625 strain (Example 5, Comparative) and PDO2769 strain (Example 6, Comparative) was better than that of the PDO2771 strain (Example 4) at low sucrose concentrations of 2 or 4 g/L. However, at the higher sucrose concentrations of 16 or 32 g/L, PDO2771 maintained good growth while the growth of PDO2625 and PDO2769 was severely inhibited. These results demonstrate that the strain expressing CscB5, the sucrose facilitated diffusion transporter, was able to grow at a wider range of sucrose concentrations than the strains expressing the wild type sucrose symporter. Thus, facilitated diffusion, or uniport, has an advantage of allowing growth under conditions at which the symporter does not allow growth.
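By way of illustration only, the range of sucrose concentrations supporting growth of each strain can be read out of the Table 4 mean values as sketched below. The optical density cutoff of 0.3 used to call a culture "grown" is an assumption made solely for this sketch and is not stated in the text; the names used are likewise hypothetical.

```python
SUCROSE_G_PER_L = (2, 4, 8, 16, 32)

# Mean optical densities at 14 hours from Table 4, in the order of SUCROSE_G_PER_L.
TABLE_4_OD = {
    "PDO2771": (0.071, 0.343, 0.745, 0.647, 0.576),  # CscB5, promoter P1.5
    "PDO2625": (0.452, 0.719, 0.655, 0.107, 0.049),  # wild type CscB, promoter P1.20
    "PDO2769": (0.654, 0.716, 0.094, 0.060, 0.058),  # wild type CscB, promoter P1.5
}


def growth_range(strain: str, min_od: float = 0.3) -> list:
    """Sucrose concentrations (g/L) at which the mean OD reaches the assumed cutoff."""
    return [conc for conc, od in zip(SUCROSE_G_PER_L, TABLE_4_OD[strain]) if od >= min_od]


for strain in TABLE_4_OD:
    print(strain, growth_range(strain))
```

With this assumed cutoff, PDO2771 reaches it at 4 through 32 g/L sucrose, PDO2625 at 2 through 8 g/L, and PDO2769 at 2 and 4 g/L. A different cutoff would shift the boundaries but not the overall pattern in the table.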
Examples 7-10
PDO Production from Sucrose with a Strain Comprising a Variant of CscB Sucrose Transporter Having R300A and Q353H Mutations and a Strain Comprising the Wild Type Sucrose Transporter CscB
[0309] The purpose of these Examples was to show that a recombinant E. coli strain comprising a variant of CscB having R300A and Q353H mutations gave better PDO production when grown on sucrose than a recombinant E. coli strain comprising the wild type sucrose transporter CscB.
[0310] A strain for testing the function of sucrose transporters for PDO production was constructed using PDO producing strain TTab pSYCO400/AGRO. E. coli strain TTab pSYCO400/AGRO, a PTS minus strain, was constructed as follows. Strain TTab was generated by deletion of the aldB gene from strain TT aldA, described in U.S. Pat. No. 7,371,558 (Example 17). Briefly, an aldB deletion was made by first replacing 1.5 kbp of the coding region of aldB in E. coli strain MG1655 with the FRT-CmR-FRT cassette of the pKD3 plasmid (Datsenko and Wanner, Proc. Natl. Acad. Sci. USA 97:6640-6645, 2000). A replacement cassette was amplified with the primer pair SEQ ID NO:99 and SEQ ID NO:100 using pKD3 as the template. The primer SEQ ID NO:99 contains 80 bp of homology to the 5' end of aldB and 20 bp of homology to pKD3. Primer SEQ ID NO:100 contains 80 bp of homology to the 3' end of aldB and 20 bp of homology to pKD3. The PCR products were gel-purified and electroporated into MG1655/pKD46 competent cells (U.S. Pat. No. 7,371,558). Recombinant strains were selected on LB (Luria Bertani) plates with 12.5 mg/L of chloramphenicol. The deletion of the aldB gene was confirmed by PCR, using the primer pair SEQ ID NO:101 and SEQ ID NO:102. The wild-type strain gave a 1.5 kbp PCR product while the recombinant strain gave a characteristic 1.1 kbp PCR product. A P1 lysate was prepared and used to move the mutation to the TT aldA strain to form the TT aldAΔaldB::Cm strain. A chloramphenicol-resistant clone was checked by genomic PCR with the primer pair SEQ ID NO:101 and SEQ ID NO:102 to ensure that the mutation was present. The chloramphenicol resistance marker was removed using the FLP recombinase (Datsenko and Wanner, supra) to create TTab. Strain TTab was then transformed with pSYCO400/AGRO (set forth in SEQ ID NO:66), described in U.S. Pat. No. 7,524,660 (Example 4), to generate strain TTab pSYCO400/AGRO.
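By way of illustration only, the layout of the replacement-cassette primers described above (80 bp of homology to the target followed by 20 bp of homology to the template plasmid) can be sketched as follows. The function names and all sequences are hypothetical placeholders; they are not the primers of the Sequence Listing, and the sketch assumes that all inputs are given as top-strand sequences written 5' to 3'.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def recombineering_primers(left_homology: str, right_homology: str, cassette: str,
                           arm_len: int = 80, priming_len: int = 20) -> tuple:
    """Build a forward/reverse primer pair of the form homology arm + priming site.

    left_homology: top-strand sequence ending where the replacement begins.
    right_homology: top-strand sequence starting where the replacement ends.
    cassette: top-strand sequence of the cassette to be amplified.
    """
    if len(left_homology) < arm_len or len(right_homology) < arm_len:
        raise ValueError("homology regions are shorter than the requested arm length")
    if len(cassette) < 2 * priming_len:
        raise ValueError("cassette is too short for two priming sites")
    forward = (left_homology[-arm_len:] + cassette[:priming_len]).upper()
    # The reverse primer anneals to the top strand, so it is the reverse complement
    # of (cassette 3' end + start of the right homology region).
    reverse = reverse_complement(cassette[-priming_len:] + right_homology[:arm_len])
    return forward, reverse


# Placeholder sequences for demonstration only.
left = "ACGT" * 25
right = "TTGCA" * 20
cassette = "GATTACA" * 10
fwd, rev = recombineering_primers(left, right, cassette)
print(len(fwd), len(rev))  # 100 100
```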
[0311] As described in the cited references, strain TTab is a derivative of E. coli strain FM5 (ATCC® No. 53911) containing the following modifications:
[0312] deletion of glpK, gldA, ptsHI, crr, edd, arcA, mgsA, qor, ackA, pta, aldA and aldB genes;
[0313] upregulation of galP, glk, btuR, ppc, and yqhD genes; and
[0314] downregulation of gapA gene.
Plasmid pSYCO400/AGRO contains genes encoding a glycerol production pathway (DAR1 and GPP2) and genes encoding a glycerol dehydratase and associated reactivating factor (dhaB123, dhaX, orfX, orfY), as well as a gene encoding a fructokinase (scrK).
[0315] Strain TTab pSYCO400/AGRO was used as a recipient for P1 transduction. The donor strain was PDO3513, constructed as described in Examples 1 and 2, and selection for growth was on LB plates with 25 μg/mL kanamycin. A colony resistant to kanamycin and spectinomycin was purified and named PDO2737 [TTab/pSYCO400AGRO yihP::cscKBΔ(61-353)KanR&A].
[0316] Strain PDO2737 was transformed with plasmids encoding the wild type sucrose transporter, pDMWP31 (pBR*p1.20cscB) and pDMWP71 (pBR*p1.5cscB), described in Examples 4-6, to yield strains PDO2815 and PDO2818, respectively. In addition, strain PDO2737 was transformed with plasmids encoding a facilitated diffusion sucrose transporter, pDMWP33 (pBR*p1.20cscB5) and pDMWP66 (pBR*p1.5cscB5), described in Example 3, to yield strains PDO2965 and PDO2966, respectively.
[0317] To test for production of PDO and glycerol, these four E. coli strains were grown overnight in L-Broth, Miller's Modification (Teknova, Half Moon Bay, Calif.) supplemented with 100 mg/L spectinomycin and 100 mg/L ampicillin at 33° C. These cultures were used to inoculate shake flasks at an optical density of 0.01 units measured at 550 nm in MOPS minimal medium (Teknova, Half Moon Bay, Calif.) supplemented with 10 g/L sucrose. Vitamin B12 was added to the medium to a concentration of 0.1 mg/L. The cultures were incubated at 34° C. with shaking (225 rpm) for 44 hours. Samples of the cultures were then filtered and used for the determination of the concentrations of sucrose, glycerol and 1,3-propanediol (PDO) in the broth by high performance liquid chromatography.
[0318] Chromatographic separation was achieved using an Aminex HPX-87P column (Bio-Rad, Hercules, Calif.) with an isocratic mobile phase of distilled-deionized water at a flow rate of 0.5 mL/min and a column temperature of 85° C. Eluted compounds were quantified by refractive index detection with reference to a standard curve prepared from commercially purchased pure compounds dissolved to known concentrations in MOPS minimal medium. Retention times were sucrose at 12.2 min, 1,3-propanediol at 17.9 min, and glycerol at 23.6 min. Table 5 shows the residual sucrose and the molar yield of PDO and glycerol, expressed as (mol PDO + mol glycerol)/mol glucose equivalent, in the cultures of these four strains.
TABLE 5
Sucrose Utilization and PDO and Glycerol Production

Example                   Strain                   44 hour sucrose (g/L)   Molar Yield (mol PDO + glycerol/mol glucose equivalent)
Example 7, Comparative    PDO2815 (P1.20cscB)      6.65                    0.780
Example 8, Comparative    PDO2818 (P1.5cscB)       9.23                    0.660
Example 9                 PDO2965 (P1.20cscB5)     4.80                    1.014
Example 10                PDO2966 (P1.5cscB5)      1.78                    1.066
[0319] As can be seen from the results in Table 5, there was more sucrose remaining in the cultures expressing the wild type sucrose transporter CscB (Comparative Examples 7 and 8) than was left in the cultures expressing the facilitated diffusion transporter, CscB5 (Examples 9 and 10), indicating faster sucrose utilization with the facilitated diffusion transporter under these conditions.
[0320] The molar yield of PDO and glycerol from sucrose was substantially higher for the strains expressing the facilitated diffusion transporter. Thus sucrose transport by facilitated diffusion was shown to be better than with the wild type transporter for PDO and glycerol production.
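The molar yield reported in Table 5 is expressed per mole of glucose equivalent. A minimal sketch of that arithmetic follows, assuming that each mole of sucrose consumed counts as two moles of glucose equivalent (one glucose plus one fructose) and using standard molar masses. The function name and the concentrations in the example call are invented for illustration and are not measurements from these Examples.

```python
# Standard molar masses in g/mol.
MOLAR_MASS = {
    "sucrose": 342.30,   # C12H22O11
    "glycerol": 92.09,   # C3H8O3
    "pdo": 76.09,        # 1,3-propanediol, C3H8O2
}


def molar_yield(sucrose_initial_g_l, sucrose_final_g_l, pdo_g_l, glycerol_g_l):
    """Return (mol PDO + mol glycerol) / mol glucose equivalent.

    Assumption: one mole of sucrose consumed equals two moles of
    glucose equivalent. All inputs are broth concentrations in g/L.
    """
    consumed_g_l = sucrose_initial_g_l - sucrose_final_g_l
    if consumed_g_l <= 0:
        raise ValueError("no sucrose was consumed")
    glucose_equivalent = 2.0 * consumed_g_l / MOLAR_MASS["sucrose"]
    products = (pdo_g_l / MOLAR_MASS["pdo"]
                + glycerol_g_l / MOLAR_MASS["glycerol"])
    return products / glucose_equivalent


# Hypothetical concentrations, for illustration only.
example_yield = molar_yield(10.0, 0.0, 3.0, 1.0)
print(round(example_yield, 3))
```

Dividing by consumed rather than supplied sucrose is a design choice in this sketch; the source does not state which basis was used.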
Examples 11-14
Recombinant E. coli Strain Comprising a Variant CscB Sucrose Transporter Having R300A, Q353H and L61P Mutations
[0321] The purpose of these Examples was to demonstrate that recombinant E. coli strains comprising variants of CscB having an L61P mutation in addition to R300A and/or Q353H mutations (SEQ ID NO:106, encoded by SEQ ID NO:105) have improved sucrose transport by facilitated diffusion. The L61P variation confers improved sucrose transport to the CscB sucrose symporter, as described in copending and commonly owned U.S. patent application Ser. No. 13/210,488, but it was not known if this mutation would improve transport by facilitated diffusion, or if the combination of L61P and Q353H would have still further improved transport.
Construction of Expression Vectors:
[0322] The cscB16 allele contains two mutations, L61P and R300A. Plasmid pDMWP32 (described in Examples 1 and 2), which contains the R300A mutation, was further mutated to introduce an L61P mutation. The mutation was introduced into pDMWP32 by site directed mutagenesis using Stratagene's QuikChange® Site-Directed Mutagenesis kit, and oligonucleotides ODMWP33 (SEQ ID NO:122) and ODMWP34 (SEQ ID NO:123) following the manufacturer's protocol, creating plasmid pDMWP54.
[0323] The cscB17 allele contains three mutations, L61P, R300A and Q353H. Plasmid pDMWP33 (described in Example 3), which contains the R300A and Q353H mutations was further mutated to introduce an L61P mutation. The mutation was introduced into pDMWP33 by site directed mutagenesis using Stratagene's QuikChange® Site-Directed Mutagenesis kit, and oligonucleotides ODMWP33 (SEQ ID NO:122) and ODMWP34 (SEQ ID NO:123) following the manufacturer's protocol, creating plasmid pDMWP55. The HindIII/Pac fragment from pDMWP55 was cloned into HindIII/Pac digested pDMWP10, to create the P1.5-containing version of the construct, plasmid pDMWP79.
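The substitution notation used for these alleles (for example L61P, R300A, Q353H) encodes the wild type residue, the 1-based position, and the replacement residue. The sketch below applies such labels to a protein sequence and checks that the wild type residue is actually present at each position. It is an illustrative helper only: the function name is hypothetical and the five-residue peptide is a toy input, not a CscB sequence.

```python
import re

# One-letter wild type residue, 1-based position, one-letter new residue.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(protein, mutations):
    """Return the protein sequence with the listed substitutions applied.

    Raises ValueError if a label is malformed, the position is out of
    range, or the stated wild type residue does not match the sequence.
    """
    residues = list(protein)
    for label in mutations:
        match = MUTATION_PATTERN.match(label)
        if match is None:
            raise ValueError(f"malformed mutation label: {label}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3))
        if position < 1 or position > len(residues):
            raise ValueError(f"position out of range: {label}")
        if residues[position - 1] != wild_type:
            raise ValueError(f"wild type residue mismatch: {label}")
        residues[position - 1] = replacement
    return "".join(residues)


# Toy peptide (hypothetical): M1 A2 L3 R4 Q5.
toy_peptide = "MALRQ"
toy_variant = apply_mutations(toy_peptide, ["L3P", "R4A", "Q5H"])
print(toy_variant)  # prints "MAPAH"
```

Checking the wild type residue before substituting guards against off-by-one numbering errors when positions are taken from an alignment rather than from the sequence itself.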
Construction of E. coli Strains
[0324] Two of the plasmids described above, pDMWP54 (pBR*p1.20cscB16) and pDMWP55 (pBR*p1.20cscB17), were transformed independently into strain PDO3513, to give strains PDO2636 and PDO2637, respectively. In addition, plasmids pDMWP32 (pBR*p1.20cscB3, described in Examples 1 and 2) and pDMWP33 (pBR*p1.20cscB5, described in Example 3) were transformed into strain PDO3513, to give strains PDO2626 and PDO2627, respectively.
Growth Characterization of Strains with Expression of cscB3, cscB5, cscB16, or cscB17:
[0325] The four E. coli strains described above were grown overnight in LB (Luria Bertani) medium containing 100 μg/mL of ampicillin at 37° C. The next day, these cultures were diluted 1:100 in LB medium containing 100 μg/mL of ampicillin. These cultures were grown at 37° C. with shaking at 250 rpm for 4 hours. The log-phase cultures were diluted 1:100 in the wells of a Bioscreen-C plate (instrument and plates purchased from Growth Curves USA, Piscataway N.J.) with 150 μL of MOPS minimal medium (Teknova, Half Moon Bay, Calif.) containing 2 g/L glucose or 2 g/L sucrose. The cultures were grown at 37° C. in triplicate with continuous shaking and the optical density was monitored. Table 6 shows the mean and standard deviation of the optical density readings at 10 hours after inoculation.
TABLE 6
Growth in Glucose or Sucrose of Strains Expressing Various Sucrose Uniporters Measured by Optical Density at 10 hours.

Example      Strain    cscB allele and variant amino acids   2 g/L glucose    2 g/L sucrose
Example 11   PDO2626   cscB3 (R300A)                         0.710 ± 0.007    0.064 ± 0.006
Example 12   PDO2627   cscB5 (R300A Q353H)                   0.702 ± 0.005    0.211 ± 0.006
Example 13   PDO2636   cscB16 (L61P R300A)                   0.710 ± 0.005    0.609 ± 0.017
Example 14   PDO2637   cscB17 (L61P R300A Q353H)             0.703 ± 0.001    0.732 ± 0.009
[0326] As can be seen from the data in Table 6, all four strains grew well on glucose, indicating that the inoculum cultures were healthy. Under these growth conditions, there was very little growth of strain PDO2626 expressing the facilitated diffusion transporter CscB3 with the R300A mutation. Comparatively, the growth was dramatically improved in strain PDO2636 expressing CscB16 (L61P and R300A). Likewise, L61P added to R300A Q353H improved growth, as seen by comparing the growth of PDO2637 with PDO2627. These results demonstrate that adding the L61P mutation to a variant sucrose facilitated diffusion transporter improved growth with sucrose as the sole carbon source. Each of the variant CscB proteins carries the R300A mutation in a residue essential for H+ translocation; thus each must be transporting sucrose without translocation of an H+ ion. Accordingly, the transporters encoded by the genes carrying the L61P variation are improved facilitated diffusion sucrose transporters.
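One way to read Table 6 is to normalize each strain's optical density on sucrose to its optical density on glucose. The sketch below does this with the mean values from Table 6. The normalization is an illustrative choice and is not part of the original analysis, and the variable names are hypothetical.

```python
# Mean optical density at 10 hours from Table 6: (glucose, sucrose).
table6_means = {
    "PDO2626 cscB3 (R300A)": (0.710, 0.064),
    "PDO2627 cscB5 (R300A Q353H)": (0.702, 0.211),
    "PDO2636 cscB16 (L61P R300A)": (0.710, 0.609),
    "PDO2637 cscB17 (L61P R300A Q353H)": (0.703, 0.732),
}


def sucrose_to_glucose_ratio(glucose_od, sucrose_od):
    """Return growth on sucrose as a fraction of growth on glucose."""
    return sucrose_od / glucose_od


# Ratio per strain, in the same order as Table 6.
growth_ratios = {
    strain: sucrose_to_glucose_ratio(glucose_od, sucrose_od)
    for strain, (glucose_od, sucrose_od) in table6_means.items()
}

for strain, ratio in growth_ratios.items():
    print(f"{strain}: {ratio:.2f}")
```

On this measure the two alleles carrying L61P reach most or all of the glucose optical density, whereas the alleles without L61P stay below about one third of it.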
Examples 15-17
Growth on Sucrose of Recombinant E. coli Strains Comprising Mutant or Wild Type Sucrose Transporters
[0327] The purpose of these Examples was to show that a recombinant E. coli strain comprising a variant of CscB having R300A, Q353H, and L61P mutations was able to grow at a wider range of sucrose concentrations than E. coli strains comprising the wild type sucrose transporter.
Construction of E. coli Strain Comprising the Variant of CscB Having R300A, Q353H, and L61P Mutations:
[0328] Plasmid pDMWP79 (pBR*p1.5cscB17, described in Examples 11-14) was transformed into strain PDO3513, to give strain PDO2773.
Growth Characterization of E. coli Strains:
[0329] Strain PDO2773 (Example 15) and two E. coli strains with plasmids encoding the wild type sucrose symporter, PDO2625 (Example 16, Comparative) and PDO2769 (Example 17, Comparative), both described in Examples 4-6, were grown overnight in LB (Luria Bertani) medium containing 100 μg/mL ampicillin at 37° C. The next day, these cultures were diluted 1:50 in MOPS minimal medium (Teknova, Half Moon Bay, Calif.) containing 2 g/L sucrose and 25 μg/mL ampicillin. These cultures were grown at 37° C. with shaking at 250 rpm for 4 hours. The log-phase cultures were diluted 1:100 in the wells of a Bioscreen-C plate (instrument and plates purchased from Growth Curves USA, Piscataway N.J.) with 150 μL MOPS minimal medium (Teknova, Half Moon Bay, Calif.) containing 2 g/L glucose or 2, 4, 8, 16, or 32 g/L sucrose. The cultures were grown at 37° C. in triplicate with continuous shaking and the optical density was monitored. Table 7 shows the optical density at 14 hours after inoculation for the cultures of PDO2773, PDO2625, and PDO2769.
TABLE 7
Optical Density of Strains Growing on Glucose or Various Concentrations of Sucrose at 14 Hours

                                     Example 16,        Example 17,
                 Example 15          Comparative        Comparative
                 PDO2773             PDO2625            PDO2769
Carbon source    (pBR1.5cscB17)      pBR*p1.20cscB      pBR*p1.5cscB
2 g/L glucose    0.669 ± 0.042       0.593 ± 0.005      0.668 ± 0.013
2 g/L sucrose    0.474 ± 0.115       0.452 ± 0.028      0.654 ± 0.007
4 g/L sucrose    0.744 ± 0.043       0.719 ± 0.007      0.716 ± 0.011
8 g/L sucrose    0.657 ± 0.052       0.655 ± 0.010      0.094 ± 0.012
16 g/L sucrose   0.666 ± 0.008       0.107 ± 0.007      0.060 ± 0.005
32 g/L sucrose   0.538 ± 0.015       0.049 ± 0.002      0.058 ± 0.006
[0330] As can be seen from the data in Table 7, all three strains grew well on 2 g/L glucose, indicating that the inoculum cultures were viable. The growth of the PDO2625 strain (Example 16, Comparative) and PDO2769 strain (Example 17, Comparative) was similar to the growth of PDO2773 (Example 15) at low sucrose concentrations of 2 or 4 g/L. However, at the higher sucrose concentrations of 16 or 32 g/L, PDO2773 maintained good growth while the growth of PDO2625 and PDO2769 was severely inhibited. These results demonstrate that the strain expressing CscB17, the improved sucrose facilitated diffusion transporter with three altered residues, L61P, R300A, Q353H, was able to grow at a much wider range of sucrose concentrations than the strains expressing the wild type sucrose symporter. Thus, this improved facilitated diffusion transporter has an advantage over sucrose transport by a symport mechanism.
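The range of sucrose concentrations that supports growth can be summarized from Table 7 with a simple threshold. The sketch below counts a condition as growth when the mean optical density at 14 hours is at least half of the same strain's value on 2 g/L glucose. That 50% cutoff is an arbitrary assumption chosen for illustration, and the names used are hypothetical.

```python
# Mean optical density at 14 hours from Table 7.
# Each entry: glucose reference OD and {sucrose g/L: OD}.
table7_means = {
    "PDO2773": (0.669, {2: 0.474, 4: 0.744, 8: 0.657, 16: 0.666, 32: 0.538}),
    "PDO2625": (0.593, {2: 0.452, 4: 0.719, 8: 0.655, 16: 0.107, 32: 0.049}),
    "PDO2769": (0.668, {2: 0.654, 4: 0.716, 8: 0.094, 16: 0.060, 32: 0.058}),
}


def permissive_concentrations(glucose_od, sucrose_ods, fraction=0.5):
    """Return sucrose concentrations (g/L) with OD >= fraction * glucose OD.

    The default fraction of 0.5 is an illustrative cutoff, not a value
    taken from the source.
    """
    cutoff = fraction * glucose_od
    return sorted(conc for conc, od in sucrose_ods.items() if od >= cutoff)


growth_ranges = {
    strain: permissive_concentrations(glucose_od, sucrose_ods)
    for strain, (glucose_od, sucrose_ods) in table7_means.items()
}

for strain, concentrations in growth_ranges.items():
    print(strain, concentrations)
```

With this cutoff, only PDO2773 passes at every sucrose concentration tested, which matches the comparison drawn in the text.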
Examples 18-21
PDO Production from Sucrose with a Strain Comprising a Variant CscB Sucrose Transporter Having R300A, Q353H, and L61P Mutations and a Strain Comprising the Wild Type Sucrose Transporter CscB
[0331] The purpose of these Examples was to show that a recombinant E. coli strain comprising a variant of CscB having R300A, Q353H, and L61P mutations gave better PDO production when grown on sucrose than a recombinant E. coli strain comprising the wild type sucrose transporter CscB.
[0332] E. coli strain PDO2737 [TTab/pSYCO400AGRO yihP::cscKBΔ(61-353)KanR&A], described in Examples 7-10, was transformed independently with plasmids pDMWP55 (pBR*P1.20cscB17, described in Examples 11-14) and pDMWP79 (pBR*P1.5cscB17, described in Examples 11-14) to make strains PDO2816 and PDO2819, respectively. These two strains and two strains expressing the wild type cscB symporter, PDO2815 and PDO2818 (described in Examples 7-10), were grown overnight in L-Broth, Miller's Modification (Teknova, Half Moon Bay, Calif.) supplemented with 100 mg/L spectinomycin and 100 mg/L ampicillin at 33° C. These cultures were used to inoculate shake flasks at an optical density of 0.01 units measured at 550 nm in MOPS minimal medium (Teknova, Half Moon Bay, Calif.) supplemented with 10 g/L sucrose. Vitamin B12 was added to the medium to a concentration of 0.1 mg/L. The cultures were incubated at 34° C. with shaking (225 rpm) for 48 hours. Samples of the cultures were then filtered and used for the determination of the concentrations of sucrose, glycerol and 1,3-propanediol (PDO) in the broth by high performance liquid chromatography.
[0333] Chromatographic separation was achieved using an Aminex HPX-87P column (Bio-Rad, Hercules, Calif.) with an isocratic mobile phase of distilled-deionized water at a flow rate of 0.5 mL/min and a column temperature of 85° C. Eluted compounds were quantified by refractive index detection with reference to a standard curve prepared from commercially purchased pure compounds dissolved to known concentrations in MOPS minimal medium. Retention times were sucrose at 12.2 min, 1,3-propanediol at 17.9 min, and glycerol at 23.6 min. Table 8 shows the residual sucrose and the molar yield of PDO and glycerol, expressed as (mol PDO + mol glycerol)/mol glucose equivalent, in the cultures of these four strains.
TABLE 8
Sucrose utilization and PDO and Glycerol Production

Example                    Strain                    44 hour sucrose (g/L)   Molar Yield (mol PDO + glycerol/mol glucose equivalent)
Example 18, Comparative    PDO2815 (P1.20cscB)       0.85                    1.09
Example 19, Comparative    PDO2818 (P1.5cscB)        8.19                    1.12
Example 20                 PDO2816 (P1.20cscB17)     0.00                    1.14
Example 21                 PDO2819 (P1.5cscB17)      0.00                    1.21
[0334] As can be seen by the results in Table 8, sucrose was completely utilized in 48 hours only in the two cultures expressing the improved facilitated diffusion transporter CscB17 (Examples 20 and 21). Furthermore, the molar yield of PDO and glycerol was greater in the cultures expressing CscB17 than in those with the wild-type sucrose symporter CscB (Comparative Examples 18 and 19). Thus, sucrose transport by facilitated diffusion was shown to be advantageous for PDO and glycerol production.
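Quantification against a standard curve, as used for the HPLC analysis in these Examples, amounts to fitting detector response against known concentrations and inverting the fit for unknown samples. The following sketch assumes a linear refractive index response and uses made-up peak areas; neither the numbers nor the function names come from the source.

```python
def fit_line(x_values, y_values):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x_values) != len(y_values) or len(x_values) < 2:
        raise ValueError("need at least two paired points")
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    if sxx == 0:
        raise ValueError("all standard concentrations are identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(x_values, y_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(peak_area, slope, intercept):
    """Invert the standard curve to estimate concentration in g/L."""
    return (peak_area - intercept) / slope


# Hypothetical standards: peak area = 2000 * concentration + 50.
standard_conc_g_l = [0.5, 1.0, 2.0, 5.0, 10.0]
standard_areas = [1050.0, 2050.0, 4050.0, 10050.0, 20050.0]

slope, intercept = fit_line(standard_conc_g_l, standard_areas)
unknown_g_l = concentration_from_area(8050.0, slope, intercept)
print(round(unknown_g_l, 3))
```

In practice a separate curve would be fitted for each analyte (sucrose, glycerol, and 1,3-propanediol), since each has its own response factor.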
Examples 22-24
Recombinant E. coli Strains Comprising Variants of Sucrose Transporter Gene scrT1 from Citrobacter Sp. 30--2
[0335] The purpose of these Examples was to construct recombinant E. coli strains containing mutant transporter genes from Citrobacter sp. 30--2 and to demonstrate sucrose transport by facilitated diffusion. The protein encoded by the mutant sucrose transport gene was altered in a residue required for H+ translocation, thus eliminating H+/sucrose symport.
Construction of Expression Vectors:
[0336] Plasmid pDMWP12-scrT1, carrying a gene encoding a transporter protein from Citrobacter sp. 30--2, was constructed as follows. Vector pDMWP3 was obtained from Integrated DNA Technologies, Inc. (Coralville, Iowa). The pDMWP3 vector was constructed by cloning a promoter/MCS/double terminator region (set forth in SEQ ID NO:124), synthesized by Integrated DNA Technologies, Inc., into the pIDT-SMART vector (Integrated DNA Technologies, Inc.). Vector pDMWP4 was constructed from plasmid pBR322. A sca1 site and a kpn1 site on the 5' end of the TetR gene and an additional kpn1 site at the 3' end of the TetR gene were introduced into plasmid pBR322. Additionally, a kpn1 site was removed from the middle of the AmpR gene. All restriction sites were either added or removed using Stratagene's QuikChange® Site-Directed Mutagenesis kit (Stratagene, La Jolla, Calif.) following the manufacturer's protocols. Vector pDMWP3 was digested with EcoR1 and Kpn1 and the resulting 438 bp fragment was cloned into vector pDMWP4, which was also digested with EcoR1 and Kpn1, to give vector pDMWP12, which is also referred to herein as pBR*P1.20. The scrT1 transporter gene from Citrobacter sp. 30--2 was codon optimized for expression in E. coli. The codon optimized sequence, set forth in SEQ ID NO:125, was synthesized by GenScript USA Inc. (Piscataway, N.J.). The synthetic gene was subcloned into vector pDMWP12 at restriction sites of HindIII and XmaI to yield pDMWP12-scrT1. This subcloning was done at GenScript. The presence of the transporter gene in pDMWP12-scrT1 was confirmed by sequence analysis.
[0337] The residue equivalent to R300 of E. coli CscB was found by multiple sequence alignment to be an arginine residue at position 305. R305 was mutated independently with two sets of primers to introduce an R305A mutation (SEQ ID NO:108, encoded by SEQ ID NO:107) and an R305L mutation (SEQ ID NO:110, encoded by SEQ ID NO:109). Site directed mutagenesis, using Stratagene's QuikChange® Site-Directed Mutagenesis kit, was employed. Oligonucleotides ODMWP97 (SEQ ID NO:126) and ODMWP98 (SEQ ID NO:127) were used to introduce the R305A mutation, creating plasmid pDMWP112. Oligonucleotides ODMWP99 (SEQ ID NO:128) and ODMWP100 (SEQ ID NO:129) were used to introduce the R305L mutation, creating pDMWP113.
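Locating the residue equivalent to R300 of CscB in another transporter comes down to walking the aligned sequences column by column while counting ungapped residues in each. The sketch below shows that bookkeeping on two short toy sequences. The sequences, the function name, and the pre-computed gapped alignment are all hypothetical, and producing the alignment itself (for example with Clustal W) is outside the scope of the sketch.

```python
def equivalent_position(aligned_ref, aligned_query, ref_position):
    """Map a 1-based reference residue position onto the query sequence.

    aligned_ref, aligned_query: equal-length strings from an alignment,
        with '-' marking gaps.
    Returns (query_position, query_residue), or None when the reference
    residue is aligned to a gap in the query.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    query_count = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            ref_count += 1
        if query_char != "-":
            query_count += 1
        if ref_char != "-" and ref_count == ref_position:
            if query_char == "-":
                return None
            return query_count, query_char
    raise ValueError("ref_position exceeds reference length")


# Toy alignment (hypothetical): the query has a two-residue insertion.
toy_ref = "MK--LRAV"
toy_query = "MKTTLRGV"
mapped = equivalent_position(toy_ref, toy_query, 4)
print(mapped)  # prints (6, 'R')
```

Here the arginine at reference position 4 maps to query position 6 because the insertion shifts the query numbering by two, which is the same kind of offset that separates R300 in CscB from R305 in the Citrobacter transporter.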
Construction of E. coli Strains Comprising the Variant Citrobacter Sp. Sucrose Transporter:
[0338] Plasmids pDMWP112 and pDMWP113 were introduced into E. coli strain PDO3513 (described in Examples 1 and 2). The resultant strains were named PDO2896 and PDO2897, respectively. Additionally, the vector pDMWP12 (described in Examples 1 and 2) was introduced into strain PDO3513 to yield strain PDO2576.
Growth Characterization of E. coli Strains:
[0339] E. coli strains PDO2576 (Example 22, Comparative), PDO2896 (Example 23), and PDO2897 (Example 24) were grown overnight in LB (Luria Bertani) medium containing 100 μg/mL ampicillin at 37° C. The next day, these cultures were diluted 1:50 in LB (Luria Bertani) medium containing 100 μg/mL ampicillin. These cultures were grown at 37° C. with shaking at 250 rpm for 4 hours. The log-phase cultures were diluted 1:100 in the wells of a Bioscreen-C plate (instrument and plates purchased from Growth Curves USA, Piscataway N.J.) with 150 μL MOPS minimal medium (Teknova, Half Moon Bay, Calif.) containing 2 g/L glucose or 8 g/L sucrose. The cultures were grown at 37° C. in triplicate with continuous shaking and the optical density was monitored. The optical density of the cultures measured at 6 hours after inoculation is shown in Table 9.
TABLE 9
Optical Density of Cultures Growing on Glucose or Sucrose at 6 Hours

                 Example 22,
                 Comparative       Example 23        Example 24
Carbon Source    PDO2576           PDO2896           PDO2897
2 g/L glucose    0.506 ± 0.001     0.561 ± 0.021     0.569 ± 0.014
8 g/L sucrose    0.030 ± 0.002     0.439 ± 0.017     0.451 ± 0.003
[0340] As can be seen from the data in Table 9, all of the strains grew well with glucose as a sole carbon source, indicating that the inoculum cultures were viable. In contrast, the control strain PDO2576 (Example 22, Comparative) was unable to grow with sucrose as sole carbon source, while the strains expressing the mutant scrT1 genes encoding a sucrose transporter unable to translocate H+ ion (Examples 23 and 24) were able to grow with sucrose as sole carbon source. Thus, net translocation of sucrose across the membrane must have occurred without translocation of an H+ ion.
Examples 25 and 26
Recombinant E. coli Strains Comprising Variants of Sucrose Transporter Gene scrT7 from Bifidobacterium longum
[0341] The purpose of these Examples was to construct a recombinant E. coli strain containing a mutant transporter gene from Bifidobacterium longum NCC2705 and to demonstrate sucrose transport by facilitated diffusion. The protein encoded by the mutant sucrose transport gene was altered in a residue required for H+ translocation, thus eliminating H+/sucrose symport.
Construction of Expression Vectors:
[0342] Plasmid pDMWP12-scrT7, carrying a gene encoding a transporter protein from Bifidobacterium longum NCC2705, was constructed using plasmid pDMWP12 (described in Examples 1 and 2 and Examples 22-24). The scrT7 sucrose transporter gene from Bifidobacterium longum was codon optimized for expression in E. coli. The codon optimized sequence, set forth in SEQ ID NO:130, was synthesized by GenScript USA Inc. (Piscataway, N.J.). The synthetic gene was subcloned into vector pDMWP12 at restriction sites of HindIII and XmaI. This subcloning was done at GenScript. The presence of the transporter gene in the vector was confirmed by sequence analysis.
[0343] The residue equivalent to R300 of E. coli CscB was found by multiple sequence alignment to be an arginine residue at position 312. Plasmid pDMWP12-scrT7 was mutated to introduce an R312A mutation (SEQ ID NO:112, encoded by SEQ ID NO:111). Site directed mutagenesis, using Stratagene's QuikChange® Site-Directed Mutagenesis kit, was employed. Oligonucleotides ODMWP101 (SEQ ID NO:131) and ODMWP102 (SEQ ID NO:132) were used to introduce the R312A mutation, creating plasmid pDMWP114.
Construction of E. coli Strains Comprising the Variant Bifidobacterium longum Sucrose Transporter:
[0344] Plasmid pDMWP114 was introduced into E. coli strain PDO3513 (described in Examples 1 and 2). The resultant strain was named PDO2898. Additionally, the vector pDMWP12 (described in Examples 1 and 2) was introduced into PDO3513 to yield strain PDO2576.
Growth Characterization of E. coli Strains:
[0345] E. coli strains PDO2576 (Example 25, Comparative) and PDO2898 (Example 26) were grown overnight in LB (Luria Bertani) medium containing 100 μg/mL ampicillin at 37° C. The next day, these cultures were diluted 1:50 in LB (Luria Bertani) medium containing 100 μg/mL ampicillin. These cultures were grown at 37° C. with shaking at 250 rpm for 4 hours. The log-phase cultures were diluted 1:100 in the wells of a Bioscreen-C plate (instrument and plates purchased from Growth Curves USA, Piscataway N.J.) with 150 μL MOPS minimal medium (Teknova, Half Moon Bay, Calif.) containing 2 g/L glucose or 8 g/L sucrose. The cultures were grown at 37° C. in triplicate with continuous shaking and the optical density was monitored. The optical density of the cultures measured at 6 hours after inoculation is shown in Table 10.
TABLE 10
Optical Density of Cultures Growing on Glucose or Sucrose at 6 Hours

                 Example 25,
                 Comparative       Example 26
Carbon Source    PDO2576           PDO2898
2 g/L glucose    0.506 ± 0.001     0.531 ± 0.011
8 g/L sucrose    0.030 ± 0.002     0.268 ± 0.005
[0346] As can be seen from the data in Table 10, both of the strains grew well with glucose as a sole carbon source, indicating that the inoculum cultures were viable. In contrast, the control strain PDO2576 (Example 25, Comparative) was unable to grow with sucrose as sole carbon source, while the strain expressing the mutant scrT7 gene encoding a sucrose transporter unable to translocate H+ ion (Example 26) was able to grow with sucrose as sole carbon source. Thus, net translocation of sucrose across the membrane must have occurred without translocation of an H+ ion.
Sequence CWU
1
1
13611176DNASaccharomyces cerevisiae 1atgtctgctg ctgctgatag attaaactta
acttccggcc acttgaatgc tggtagaaag 60agaagttcct cttctgtttc tttgaaggct
gccgaaaagc ctttcaaggt tactgtgatt 120ggatctggta actggggtac tactattgcc
aaggtggttg ccgaaaattg taagggatac 180ccagaagttt tcgctccaat agtacaaatg
tgggtgttcg aagaagagat caatggtgaa 240aaattgactg aaatcataaa tactagacat
caaaacgtga aatacttgcc tggcatcact 300ctacccgaca atttggttgc taatccagac
ttgattgatt cagtcaagga tgtcgacatc 360atcgttttca acattccaca tcaatttttg
ccccgtatct gtagccaatt gaaaggtcat 420gttgattcac acgtcagagc tatctcctgt
ctaaagggtt ttgaagttgg tgctaaaggt 480gtccaattgc tatcctctta catcactgag
gaactaggta ttcaatgtgg tgctctatct 540ggtgctaaca ttgccaccga agtcgctcaa
gaacactggt ctgaaacaac agttgcttac 600cacattccaa aggatttcag aggcgagggc
aaggacgtcg accataaggt tctaaaggcc 660ttgttccaca gaccttactt ccacgttagt
gtcatcgaag atgttgctgg tatctccatc 720tgtggtgctt tgaagaacgt tgttgcctta
ggttgtggtt tcgtcgaagg tctaggctgg 780ggtaacaacg cttctgctgc catccaaaga
gtcggtttgg gtgagatcat cagattcggt 840caaatgtttt tcccagaatc tagagaagaa
acatactacc aagagtctgc tggtgttgct 900gatttgatca ccacctgcgc tggtggtaga
aacgtcaagg ttgctaggct aatggctact 960tctggtaagg acgcctggga atgtgaaaag
gagttgttga atggccaatc cgctcaaggt 1020ttaattacct gcaaagaagt tcacgaatgg
ttggaaacat gtggctctgt cgaagacttc 1080ccattatttg aagccgtata ccaaatcgtt
tacaacaact acccaatgaa gaacctgccg 1140gacatgattg aagaattaga tctacatgaa
gattag 11762391PRTSaccharomyces cerevisiae
2Met Ser Ala Ala Ala Asp Arg Leu Asn Leu Thr Ser Gly His Leu Asn 1
5 10 15 Ala Gly Arg Lys
Arg Ser Ser Ser Ser Val Ser Leu Lys Ala Ala Glu 20
25 30 Lys Pro Phe Lys Val Thr Val Ile Gly
Ser Gly Asn Trp Gly Thr Thr 35 40
45 Ile Ala Lys Val Val Ala Glu Asn Cys Lys Gly Tyr Pro Glu
Val Phe 50 55 60
Ala Pro Ile Val Gln Met Trp Val Phe Glu Glu Glu Ile Asn Gly Glu 65
70 75 80 Lys Leu Thr Glu Ile
Ile Asn Thr Arg His Gln Asn Val Lys Tyr Leu 85
90 95 Pro Gly Ile Thr Leu Pro Asp Asn Leu Val
Ala Asn Pro Asp Leu Ile 100 105
110 Asp Ser Val Lys Asp Val Asp Ile Ile Val Phe Asn Ile Pro His
Gln 115 120 125 Phe
Leu Pro Arg Ile Cys Ser Gln Leu Lys Gly His Val Asp Ser His 130
135 140 Val Arg Ala Ile Ser Cys
Leu Lys Gly Phe Glu Val Gly Ala Lys Gly 145 150
155 160 Val Gln Leu Leu Ser Ser Tyr Ile Thr Glu Glu
Leu Gly Ile Gln Cys 165 170
175 Gly Ala Leu Ser Gly Ala Asn Ile Ala Thr Glu Val Ala Gln Glu His
180 185 190 Trp Ser
Glu Thr Thr Val Ala Tyr His Ile Pro Lys Asp Phe Arg Gly 195
200 205 Glu Gly Lys Asp Val Asp His
Lys Val Leu Lys Ala Leu Phe His Arg 210 215
220 Pro Tyr Phe His Val Ser Val Ile Glu Asp Val Ala
Gly Ile Ser Ile 225 230 235
240 Cys Gly Ala Leu Lys Asn Val Val Ala Leu Gly Cys Gly Phe Val Glu
245 250 255 Gly Leu Gly
Trp Gly Asn Asn Ala Ser Ala Ala Ile Gln Arg Val Gly 260
265 270 Leu Gly Glu Ile Ile Arg Phe Gly
Gln Met Phe Phe Pro Glu Ser Arg 275 280
285 Glu Glu Thr Tyr Tyr Gln Glu Ser Ala Gly Val Ala Asp
Leu Ile Thr 290 295 300
Thr Cys Ala Gly Gly Arg Asn Val Lys Val Ala Arg Leu Met Ala Thr 305
310 315 320 Ser Gly Lys Asp
Ala Trp Glu Cys Glu Lys Glu Leu Leu Asn Gly Gln 325
330 335 Ser Ala Gln Gly Leu Ile Thr Cys Lys
Glu Val His Glu Trp Leu Glu 340 345
350 Thr Cys Gly Ser Val Glu Asp Phe Pro Leu Phe Glu Ala Val
Tyr Gln 355 360 365
Ile Val Tyr Asn Asn Tyr Pro Met Lys Asn Leu Pro Asp Met Ile Glu 370
375 380 Glu Leu Asp Leu His
Glu Asp 385 390 31323DNASaccharomyces cerevisiae
3atgcttgctg tcagaagatt aacaagatac acattcctta agcgaacgca tccggtgtta
60tatactcgtc gtgcatataa aattttgcct tcaagatcta ctttcctaag aagatcatta
120ttacaaacac aactgcactc aaagatgact gctcatacta atatcaaaca gcacaaacac
180tgtcatgagg accatcctat cagaagatcg gactctgccg tgtcaattgt acatttgaaa
240cgtgcgccct tcaaggttac agtgattggt tctggtaact gggggaccac catcgccaaa
300gtcattgcgg aaaacacaga attgcattcc catatcttcg agccagaggt gagaatgtgg
360gtttttgatg aaaagatcgg cgacgaaaat ctgacggata tcataaatac aagacaccag
420aacgttaaat atctacccaa tattgacctg ccccataatc tagtggccga tcctgatctt
480ttacactcca tcaagggtgc tgacatcctt gttttcaaca tccctcatca atttttacca
540aacatagtca aacaattgca aggccacgtg gcccctcatg taagggccat ctcgtgtcta
600aaagggttcg agttgggctc caagggtgtg caattgctat cctcctatgt tactgatgag
660ttaggaatcc aatgtggcgc actatctggt gcaaacttgg caccggaagt ggccaaggag
720cattggtccg aaaccaccgt ggcttaccaa ctaccaaagg attatcaagg tgatggcaag
780gatgtagatc ataagatttt gaaattgctg ttccacagac cttacttcca cgtcaatgtc
840atcgatgatg ttgctggtat atccattgcc ggtgccttga agaacgtcgt ggcacttgca
900tgtggtttcg tagaaggtat gggatggggt aacaatgcct ccgcagccat tcaaaggctg
960ggtttaggtg aaattatcaa gttcggtaga atgtttttcc cagaatccaa agtcgagacc
1020tactatcaag aatccgctgg tgttgcagat ctgatcacca cctgctcagg cggtagaaac
1080gtcaaggttg ccacatacat ggccaagacc ggtaagtcag ccttggaagc agaaaaggaa
1140ttgcttaacg gtcaatccgc ccaagggata atcacatgca gagaagttca cgagtggcta
1200caaacatgtg agttgaccca agaattccca ttattcgagg cagtctacca gatagtctac
1260aacaacgtcc gcatggaaga cctaccggag atgattgaag agctagacat cgatgacgaa
1320tag
13234440PRTSaccharomyces cerevisiae 4Met Leu Ala Val Arg Arg Leu Thr Arg
Tyr Thr Phe Leu Lys Arg Thr 1 5 10
15 His Pro Val Leu Tyr Thr Arg Arg Ala Tyr Lys Ile Leu Pro
Ser Arg 20 25 30
Ser Thr Phe Leu Arg Arg Ser Leu Leu Gln Thr Gln Leu His Ser Lys
35 40 45 Met Thr Ala His
Thr Asn Ile Lys Gln His Lys His Cys His Glu Asp 50
55 60 His Pro Ile Arg Arg Ser Asp Ser
Ala Val Ser Ile Val His Leu Lys 65 70
75 80 Arg Ala Pro Phe Lys Val Thr Val Ile Gly Ser Gly
Asn Trp Gly Thr 85 90
95 Thr Ile Ala Lys Val Ile Ala Glu Asn Thr Glu Leu His Ser His Ile
100 105 110 Phe Glu Pro
Glu Val Arg Met Trp Val Phe Asp Glu Lys Ile Gly Asp 115
120 125 Glu Asn Leu Thr Asp Ile Ile Asn
Thr Arg His Gln Asn Val Lys Tyr 130 135
140 Leu Pro Asn Ile Asp Leu Pro His Asn Leu Val Ala Asp
Pro Asp Leu 145 150 155
160 Leu His Ser Ile Lys Gly Ala Asp Ile Leu Val Phe Asn Ile Pro His
165 170 175 Gln Phe Leu Pro
Asn Ile Val Lys Gln Leu Gln Gly His Val Ala Pro 180
185 190 His Val Arg Ala Ile Ser Cys Leu Lys
Gly Phe Glu Leu Gly Ser Lys 195 200
205 Gly Val Gln Leu Leu Ser Ser Tyr Val Thr Asp Glu Leu Gly
Ile Gln 210 215 220
Cys Gly Ala Leu Ser Gly Ala Asn Leu Ala Pro Glu Val Ala Lys Glu 225
230 235 240 His Trp Ser Glu Thr
Thr Val Ala Tyr Gln Leu Pro Lys Asp Tyr Gln 245
250 255 Gly Asp Gly Lys Asp Val Asp His Lys Ile
Leu Lys Leu Leu Phe His 260 265
270 Arg Pro Tyr Phe His Val Asn Val Ile Asp Asp Val Ala Gly Ile
Ser 275 280 285 Ile
Ala Gly Ala Leu Lys Asn Val Val Ala Leu Ala Cys Gly Phe Val 290
295 300 Glu Gly Met Gly Trp Gly
Asn Asn Ala Ser Ala Ala Ile Gln Arg Leu 305 310
315 320 Gly Leu Gly Glu Ile Ile Lys Phe Gly Arg Met
Phe Phe Pro Glu Ser 325 330
335 Lys Val Glu Thr Tyr Tyr Gln Glu Ser Ala Gly Val Ala Asp Leu Ile
340 345 350 Thr Thr
Cys Ser Gly Gly Arg Asn Val Lys Val Ala Thr Tyr Met Ala 355
360 365 Lys Thr Gly Lys Ser Ala Leu
Glu Ala Glu Lys Glu Leu Leu Asn Gly 370 375
380 Gln Ser Ala Gln Gly Ile Ile Thr Cys Arg Glu Val
His Glu Trp Leu 385 390 395
400 Gln Thr Cys Glu Leu Thr Gln Glu Phe Pro Leu Phe Glu Ala Val Tyr
405 410 415 Gln Ile Val
Tyr Asn Asn Val Arg Met Glu Asp Leu Pro Glu Met Ile 420
425 430 Glu Glu Leu Asp Ile Asp Asp Glu
435 440 5816DNASaccharomyces cerevisiae
5atgaaacgtt tcaatgtttt aaaatatatc agaacaacaa aagcaaatat acaaaccatc
60gcaatgcctt tgaccacaaa acctttatct ttgaaaatca acgccgctct attcgatgtt
120gacggtacca tcatcatctc tcaaccagcc attgctgctt tctggagaga tttcggtaaa
180gacaagcctt acttcgatgc cgaacacgtt attcacatct ctcacggttg gagaacttac
240gatgccattg ccaagttcgc tccagacttt gctgatgaag aatacgttaa caagctagaa
300ggtgaaatcc cagaaaagta cggtgaacac tccatcgaag ttccaggtgc tgtcaagttg
360tgtaatgctt tgaacgcctt gccaaaggaa aaatgggctg tcgccacctc tggtacccgt
420gacatggcca agaaatggtt cgacattttg aagatcaaga gaccagaata cttcatcacc
480gccaatgatg tcaagcaagg taagcctcac ccagaaccat acttaaaggg tagaaacggt
540ttgggtttcc caattaatga acaagaccca tccaaatcta aggttgttgt ctttgaagac
600gcaccagctg gtattgctgc tggtaaggct gctggctgta aaatcgttgg tattgctacc
660actttcgatt tggacttctt gaaggaaaag ggttgtgaca tcattgtcaa gaaccacgaa
720tctatcagag tcggtgaata caacgctgaa accgatgaag tcgaattgat ctttgatgac
780tacttatacg ctaaggatga cttgttgaaa tggtaa
8166271PRTSaccharomyces cerevisiae 6Met Lys Arg Phe Asn Val Leu Lys Tyr
Ile Arg Thr Thr Lys Ala Asn 1 5 10
15 Ile Gln Thr Ile Ala Met Pro Leu Thr Thr Lys Pro Leu Ser
Leu Lys 20 25 30
Ile Asn Ala Ala Leu Phe Asp Val Asp Gly Thr Ile Ile Ile Ser Gln
35 40 45 Pro Ala Ile Ala
Ala Phe Trp Arg Asp Phe Gly Lys Asp Lys Pro Tyr 50
55 60 Phe Asp Ala Glu His Val Ile His
Ile Ser His Gly Trp Arg Thr Tyr 65 70
75 80 Asp Ala Ile Ala Lys Phe Ala Pro Asp Phe Ala Asp
Glu Glu Tyr Val 85 90
95 Asn Lys Leu Glu Gly Glu Ile Pro Glu Lys Tyr Gly Glu His Ser Ile
100 105 110 Glu Val Pro
Gly Ala Val Lys Leu Cys Asn Ala Leu Asn Ala Leu Pro 115
120 125 Lys Glu Lys Trp Ala Val Ala Thr
Ser Gly Thr Arg Asp Met Ala Lys 130 135
140 Lys Trp Phe Asp Ile Leu Lys Ile Lys Arg Pro Glu Tyr
Phe Ile Thr 145 150 155
160 Ala Asn Asp Val Lys Gln Gly Lys Pro His Pro Glu Pro Tyr Leu Lys
165 170 175 Gly Arg Asn Gly
Leu Gly Phe Pro Ile Asn Glu Gln Asp Pro Ser Lys 180
185 190 Ser Lys Val Val Val Phe Glu Asp Ala
Pro Ala Gly Ile Ala Ala Gly 195 200
205 Lys Ala Ala Gly Cys Lys Ile Val Gly Ile Ala Thr Thr Phe
Asp Leu 210 215 220
Asp Phe Leu Lys Glu Lys Gly Cys Asp Ile Ile Val Lys Asn His Glu 225
230 235 240 Ser Ile Arg Val Gly
Glu Tyr Asn Ala Glu Thr Asp Glu Val Glu Leu 245
250 255 Ile Phe Asp Asp Tyr Leu Tyr Ala Lys Asp
Asp Leu Leu Lys Trp 260 265
270
7 753 DNA Saccharomyces cerevisiae
7 atgggattga ctactaaacc tctatctttg aaagttaacg ccgctttgtt cgacgtcgac 60
ggtaccatta tcatctctca accagccatt gctgcattct ggagggattt cggtaaggac 120
aaaccttatt tcgatgctga acacgttatc caagtctcgc atggttggag aacgtttgat 180
gccattgcta agttcgctcc agactttgcc aatgaagagt atgttaacaa attagaagct 240
gaaattccgg tcaagtacgg tgaaaaatcc attgaagtcc caggtgcagt taagctgtgc 300
aacgctttga acgctctacc aaaagagaaa tgggctgtgg caacttccgg tacccgtgat 360
atggcacaaa aatggttcga gcatctggga atcaggagac caaagtactt cattaccgct 420
aatgatgtca aacagggtaa gcctcatcca gaaccatatc tgaagggcag gaatggctta 480
ggatatccga tcaatgagca agacccttcc aaatctaagg tagtagtatt tgaagacgct 540
ccagcaggta ttgccgccgg aaaagccgcc ggttgtaaga tcattggtat tgccactact 600
ttcgacttgg acttcctaaa ggaaaaaggc tgtgacatca ttgtcaaaaa ccacgaatcc 660
atcagagttg gcggctacaa tgccgaaaca gacgaagttg aattcatttt tgacgactac 720
ttatatgcta aggacgatct gttgaaatgg taa 753
8 250 PRT Saccharomyces cerevisiae
8 Met Gly Leu Thr Thr Lys Pro Leu Ser Leu Lys Val Asn Ala Ala Leu 1
5 10 15 Phe Asp Val Asp
Gly Thr Ile Ile Ile Ser Gln Pro Ala Ile Ala Ala 20
25 30 Phe Trp Arg Asp Phe Gly Lys Asp Lys
Pro Tyr Phe Asp Ala Glu His 35 40
45 Val Ile Gln Val Ser His Gly Trp Arg Thr Phe Asp Ala Ile
Ala Lys 50 55 60
Phe Ala Pro Asp Phe Ala Asn Glu Glu Tyr Val Asn Lys Leu Glu Ala 65
70 75 80 Glu Ile Pro Val Lys
Tyr Gly Glu Lys Ser Ile Glu Val Pro Gly Ala 85
90 95 Val Lys Leu Cys Asn Ala Leu Asn Ala Leu
Pro Lys Glu Lys Trp Ala 100 105
110 Val Ala Thr Ser Gly Thr Arg Asp Met Ala Gln Lys Trp Phe Glu
His 115 120 125 Leu
Gly Ile Arg Arg Pro Lys Tyr Phe Ile Thr Ala Asn Asp Val Lys 130
135 140 Gln Gly Lys Pro His Pro
Glu Pro Tyr Leu Lys Gly Arg Asn Gly Leu 145 150
155 160 Gly Tyr Pro Ile Asn Glu Gln Asp Pro Ser Lys
Ser Lys Val Val Val 165 170
175 Phe Glu Asp Ala Pro Ala Gly Ile Ala Ala Gly Lys Ala Ala Gly Cys
180 185 190 Lys Ile
Ile Gly Ile Ala Thr Thr Phe Asp Leu Asp Phe Leu Lys Glu 195
200 205 Lys Gly Cys Asp Ile Ile Val
Lys Asn His Glu Ser Ile Arg Val Gly 210 215
220 Gly Tyr Asn Ala Glu Thr Asp Glu Val Glu Phe Ile
Phe Asp Asp Tyr 225 230 235
240 Leu Tyr Ala Lys Asp Asp Leu Leu Lys Trp 245
250
9 1668 DNA Klebsiella pneumoniae CDS (1)..(1668)
9 atg aaa aga tca
aaa cga ttt gca gta ctg gcc cag cgc ccc gtc aat 48 Met Lys Arg Ser
Lys Arg Phe Ala Val Leu Ala Gln Arg Pro Val Asn 1 5
10 15 cag gac ggg ctg att ggc gag tgg cct
gaa gag ggg ctg atc gcc atg 96Gln Asp Gly Leu Ile Gly Glu Trp Pro
Glu Glu Gly Leu Ile Ala Met 20 25
30 gac agc ccc ttt gac ccg gtc tct tca gta aaa gtg gac aac
ggt ctg 144Asp Ser Pro Phe Asp Pro Val Ser Ser Val Lys Val Asp Asn
Gly Leu 35 40 45 atc
gtc gaa ctg gac ggc aaa cgc cgg gac cag ttt gac atg atc gac 192Ile
Val Glu Leu Asp Gly Lys Arg Arg Asp Gln Phe Asp Met Ile Asp 50
55 60 cga ttt atc gcc gat tac
gcg atc aac gtt gag cgc aca gag cag gca 240Arg Phe Ile Ala Asp Tyr
Ala Ile Asn Val Glu Arg Thr Glu Gln Ala65 70
75 80 atg cgc ctg gag gcg gtg gaa ata gcc cgt atg
ctg gtg gat att cac 288Met Arg Leu Glu Ala Val Glu Ile Ala Arg Met
Leu Val Asp Ile His 85 90
95 gtc agc cgg gag gag atc att gcc atc act acc gcc atc acg ccg gcc
336Val Ser Arg Glu Glu Ile Ile Ala Ile Thr Thr Ala Ile Thr Pro Ala
100 105 110 aaa gcg gtc gag
gtg atg gcg cag atg aac gtg gtg gag atg atg atg 384Lys Ala Val Glu
Val Met Ala Gln Met Asn Val Val Glu Met Met Met 115
120 125 gcg ctg cag aag atg cgt gcc cgc cgg
acc ccc tcc aac cag tgc cac 432Ala Leu Gln Lys Met Arg Ala Arg Arg
Thr Pro Ser Asn Gln Cys His 130 135
140 gtc acc aat ctc aaa gat aat ccg gtg cag att gcc gct
gac gcc gcc 480Val Thr Asn Leu Lys Asp Asn Pro Val Gln Ile Ala Ala
Asp Ala Ala145 150 155
160gag gcc ggg atc cgc ggc ttc tca gaa cag gag acc acg gtc ggt atc
528Glu Ala Gly Ile Arg Gly Phe Ser Glu Gln Glu Thr Thr Val Gly Ile
165 170 175 gcg cgc tac gcg
ccg ttt aac gcc ctg gcg ctg ttg gtc ggt tcg cag 576Ala Arg Tyr Ala
Pro Phe Asn Ala Leu Ala Leu Leu Val Gly Ser Gln 180
185 190 tgc ggc cgc ccc ggc gtg ttg acg cag
tgc tcg gtg gaa gag gcc acc 624Cys Gly Arg Pro Gly Val Leu Thr Gln
Cys Ser Val Glu Glu Ala Thr 195 200
205 gag ctg gag ctg ggc atg cgt ggc tta acc agc tac gcc gag
acg gtg 672Glu Leu Glu Leu Gly Met Arg Gly Leu Thr Ser Tyr Ala Glu
Thr Val 210 215 220 tcg
gtc tac ggc acc gaa gcg gta ttt acc gac ggc gat gat acg ccg 720Ser
Val Tyr Gly Thr Glu Ala Val Phe Thr Asp Gly Asp Asp Thr Pro225
230 235 240tgg tca aag gcg ttc ctc
gcc tcg gcc tac gcc tcc cgc ggg ttg aaa 768Trp Ser Lys Ala Phe Leu
Ala Ser Ala Tyr Ala Ser Arg Gly Leu Lys 245
250 255 atg cgc tac acc tcc ggc acc gga tcc gaa gcg
ctg atg ggc tat tcg 816Met Arg Tyr Thr Ser Gly Thr Gly Ser Glu Ala
Leu Met Gly Tyr Ser 260 265
270 gag agc aag tcg atg ctc tac ctc gaa tcg cgc tgc atc ttc att
act 864Glu Ser Lys Ser Met Leu Tyr Leu Glu Ser Arg Cys Ile Phe Ile
Thr 275 280 285 aaa ggc
gcc ggg gtt cag gga ctg caa aac ggc gcg gtg agc tgt atc 912Lys Gly
Ala Gly Val Gln Gly Leu Gln Asn Gly Ala Val Ser Cys Ile 290
295 300 ggc atg acc ggc gct gtg ccg
tcg ggc att cgg gcg gtg ctg gcg gaa 960Gly Met Thr Gly Ala Val Pro
Ser Gly Ile Arg Ala Val Leu Ala Glu305 310
315 320aac ctg atc gcc tct atg ctc gac ctc gaa gtg gcg
tcc gcc aac gac 1008Asn Leu Ile Ala Ser Met Leu Asp Leu Glu Val Ala
Ser Ala Asn Asp 325 330
335 cag act ttc tcc cac tcg gat att cgc cgc acc gcg cgc acc ctg atg
1056Gln Thr Phe Ser His Ser Asp Ile Arg Arg Thr Ala Arg Thr Leu Met
340 345 350 cag atg ctg ccg
ggc acc gac ttt att ttc tcc ggc tac agc gcg gtg 1104Gln Met Leu Pro
Gly Thr Asp Phe Ile Phe Ser Gly Tyr Ser Ala Val 355
360 365 ccg aac tac gac aac atg ttc gcc ggc
tcg aac ttc gat gcg gaa gat 1152Pro Asn Tyr Asp Asn Met Phe Ala Gly
Ser Asn Phe Asp Ala Glu Asp 370 375
380 ttt gat gat tac aac atc ctg cag cgt gac ctg atg gtt
gac ggc ggc 1200Phe Asp Asp Tyr Asn Ile Leu Gln Arg Asp Leu Met Val
Asp Gly Gly385 390 395
400ctg cgt ccg gtg acc gag gcg gaa acc att gcc att cgc cag aaa gcg
1248Leu Arg Pro Val Thr Glu Ala Glu Thr Ile Ala Ile Arg Gln Lys Ala
405 410 415 gcg cgg gcg atc
cag gcg gtt ttc cgc gag ctg ggg ctg ccg cca atc 1296Ala Arg Ala Ile
Gln Ala Val Phe Arg Glu Leu Gly Leu Pro Pro Ile 420
425 430 gcc gac gag gag gtg gag gcc gcc acc
tac gcg cac ggc agc aac gag 1344Ala Asp Glu Glu Val Glu Ala Ala Thr
Tyr Ala His Gly Ser Asn Glu 435 440
445 atg ccg ccg cgt aac gtg gtg gag gat ctg agt gcg gtg gaa
gag atg 1392Met Pro Pro Arg Asn Val Val Glu Asp Leu Ser Ala Val Glu
Glu Met 450 455 460 atg
aag cgc aac atc acc ggc ctc gat att gtc ggc gcg ctg agc cgc 1440Met
Lys Arg Asn Ile Thr Gly Leu Asp Ile Val Gly Ala Leu Ser Arg465
470 475 480agc ggc ttt gag gat atc
gcc agc aat att ctc aat atg ctg cgc cag 1488Ser Gly Phe Glu Asp Ile
Ala Ser Asn Ile Leu Asn Met Leu Arg Gln 485
490 495 cgg gtc acc ggc gat tac ctg cag acc tcg gcc
att ctc gat cgg cag 1536Arg Val Thr Gly Asp Tyr Leu Gln Thr Ser Ala
Ile Leu Asp Arg Gln 500 505
510 ttc gag gtg gtg agt gcg gtc aac gac atc aat gac tat cag ggg
ccg 1584Phe Glu Val Val Ser Ala Val Asn Asp Ile Asn Asp Tyr Gln Gly
Pro 515 520 525 ggc acc
ggc tat cgc atc tct gcc gaa cgc tgg gcg gag atc aaa aat 1632Gly Thr
Gly Tyr Arg Ile Ser Ala Glu Arg Trp Ala Glu Ile Lys Asn 530
535 540 att ccg ggc gtg gtt cag ccc
gac acc att gaa taa 1668Ile Pro Gly Val Val Gln Pro
Asp Thr Ile Glu 545 550
555
10 555 PRT Klebsiella pneumoniae
10 Met Lys Arg Ser
Lys Arg Phe Ala Val Leu Ala Gln Arg Pro Val Asn 1 5
10 15 Gln Asp Gly Leu Ile Gly Glu Trp Pro
Glu Glu Gly Leu Ile Ala Met 20 25
30 Asp Ser Pro Phe Asp Pro Val Ser Ser Val Lys Val Asp Asn
Gly Leu 35 40 45
Ile Val Glu Leu Asp Gly Lys Arg Arg Asp Gln Phe Asp Met Ile Asp 50
55 60 Arg Phe Ile Ala Asp
Tyr Ala Ile Asn Val Glu Arg Thr Glu Gln Ala 65 70
75 80 Met Arg Leu Glu Ala Val Glu Ile Ala Arg
Met Leu Val Asp Ile His 85 90
95 Val Ser Arg Glu Glu Ile Ile Ala Ile Thr Thr Ala Ile Thr Pro
Ala 100 105 110 Lys
Ala Val Glu Val Met Ala Gln Met Asn Val Val Glu Met Met Met 115
120 125 Ala Leu Gln Lys Met Arg
Ala Arg Arg Thr Pro Ser Asn Gln Cys His 130 135
140 Val Thr Asn Leu Lys Asp Asn Pro Val Gln Ile
Ala Ala Asp Ala Ala 145 150 155
160 Glu Ala Gly Ile Arg Gly Phe Ser Glu Gln Glu Thr Thr Val Gly Ile
165 170 175 Ala Arg
Tyr Ala Pro Phe Asn Ala Leu Ala Leu Leu Val Gly Ser Gln 180
185 190 Cys Gly Arg Pro Gly Val Leu
Thr Gln Cys Ser Val Glu Glu Ala Thr 195 200
205 Glu Leu Glu Leu Gly Met Arg Gly Leu Thr Ser Tyr
Ala Glu Thr Val 210 215 220
Ser Val Tyr Gly Thr Glu Ala Val Phe Thr Asp Gly Asp Asp Thr Pro 225
230 235 240 Trp Ser Lys
Ala Phe Leu Ala Ser Ala Tyr Ala Ser Arg Gly Leu Lys 245
250 255 Met Arg Tyr Thr Ser Gly Thr Gly
Ser Glu Ala Leu Met Gly Tyr Ser 260 265
270 Glu Ser Lys Ser Met Leu Tyr Leu Glu Ser Arg Cys Ile
Phe Ile Thr 275 280 285
Lys Gly Ala Gly Val Gln Gly Leu Gln Asn Gly Ala Val Ser Cys Ile 290
295 300 Gly Met Thr Gly
Ala Val Pro Ser Gly Ile Arg Ala Val Leu Ala Glu 305 310
315 320 Asn Leu Ile Ala Ser Met Leu Asp Leu
Glu Val Ala Ser Ala Asn Asp 325 330
335 Gln Thr Phe Ser His Ser Asp Ile Arg Arg Thr Ala Arg Thr
Leu Met 340 345 350
Gln Met Leu Pro Gly Thr Asp Phe Ile Phe Ser Gly Tyr Ser Ala Val
355 360 365 Pro Asn Tyr Asp
Asn Met Phe Ala Gly Ser Asn Phe Asp Ala Glu Asp 370
375 380 Phe Asp Asp Tyr Asn Ile Leu Gln
Arg Asp Leu Met Val Asp Gly Gly 385 390
395 400 Leu Arg Pro Val Thr Glu Ala Glu Thr Ile Ala Ile
Arg Gln Lys Ala 405 410
415 Ala Arg Ala Ile Gln Ala Val Phe Arg Glu Leu Gly Leu Pro Pro Ile
420 425 430 Ala Asp Glu
Glu Val Glu Ala Ala Thr Tyr Ala His Gly Ser Asn Glu 435
440 445 Met Pro Pro Arg Asn Val Val Glu
Asp Leu Ser Ala Val Glu Glu Met 450 455
460 Met Lys Arg Asn Ile Thr Gly Leu Asp Ile Val Gly Ala
Leu Ser Arg 465 470 475
480 Ser Gly Phe Glu Asp Ile Ala Ser Asn Ile Leu Asn Met Leu Arg Gln
485 490 495 Arg Val Thr Gly
Asp Tyr Leu Gln Thr Ser Ala Ile Leu Asp Arg Gln 500
505 510 Phe Glu Val Val Ser Ala Val Asn Asp
Ile Asn Asp Tyr Gln Gly Pro 515 520
525 Gly Thr Gly Tyr Arg Ile Ser Ala Glu Arg Trp Ala Glu Ile
Lys Asn 530 535 540
Ile Pro Gly Val Val Gln Pro Asp Thr Ile Glu 545 550
555
11 585 DNA Klebsiella pneumoniae CDS (1)..(585)
11 gtg caa cag
aca acc caa att cag ccc tct ttt acc ctg aaa acc cgc 48 Val Gln Gln
Thr Thr Gln Ile Gln Pro Ser Phe Thr Leu Lys Thr Arg 1 5
10 15 gag ggc ggg gta gct tct gcc gat
gaa cgc gcc gat gaa gtg gtg atc 96Glu Gly Gly Val Ala Ser Ala Asp
Glu Arg Ala Asp Glu Val Val Ile 20 25
30 ggc gtc ggc cct gcc ttc gat aaa cac cag cat cac act
ctg atc gat 144Gly Val Gly Pro Ala Phe Asp Lys His Gln His His Thr
Leu Ile Asp 35 40 45
atg ccc cat ggc gcg atc ctc aaa gag ctg att gcc ggg gtg gaa gaa
192Met Pro His Gly Ala Ile Leu Lys Glu Leu Ile Ala Gly Val Glu Glu 50
55 60 gag ggg ctt cac
gcc cgg gtg gtg cgc att ctg cgc acg tcc gac gtc 240Glu Gly Leu His
Ala Arg Val Val Arg Ile Leu Arg Thr Ser Asp Val65 70
75 80 tcc ttt atg gcc tgg gat gcg gcc aac
ctg agc ggc tcg ggg atc ggc 288Ser Phe Met Ala Trp Asp Ala Ala Asn
Leu Ser Gly Ser Gly Ile Gly 85 90
95 atc ggt atc cag tcg aag ggg acc acg gtc atc cat cag cgc
gat ctg 336Ile Gly Ile Gln Ser Lys Gly Thr Thr Val Ile His Gln Arg
Asp Leu 100 105 110 ctg
ccg ctc agc aac ctg gag ctg ttc tcc cag gcg ccg ctg ctg acg 384Leu
Pro Leu Ser Asn Leu Glu Leu Phe Ser Gln Ala Pro Leu Leu Thr 115
120 125 ctg gag acc tac cgg cag
att ggc aaa aac gct gcg cgc tat gcg cgc 432Leu Glu Thr Tyr Arg Gln
Ile Gly Lys Asn Ala Ala Arg Tyr Ala Arg 130 135
140 aaa gag tca cct tcg ccg gtg ccg gtg gtg aac
gat cag atg gtg cgg 480Lys Glu Ser Pro Ser Pro Val Pro Val Val Asn
Asp Gln Met Val Arg145 150 155
160ccg aaa ttt atg gcc aaa gcc gcg cta ttt cat atc aaa gag acc aaa
528Pro Lys Phe Met Ala Lys Ala Ala Leu Phe His Ile Lys Glu Thr Lys
165 170 175 cat gtg gtg cag
gac gcc gag ccc gtc acc ctg cac atc gac tta gta 576His Val Val Gln
Asp Ala Glu Pro Val Thr Leu His Ile Asp Leu Val 180
185 190 agg gag tga
585 Arg Glu
12 194 PRT Klebsiella pneumoniae
12 Val Gln Gln Thr
Thr Gln Ile Gln Pro Ser Phe Thr Leu Lys Thr Arg 1 5
10 15 Glu Gly Gly Val Ala Ser Ala Asp Glu
Arg Ala Asp Glu Val Val Ile 20 25
30 Gly Val Gly Pro Ala Phe Asp Lys His Gln His His Thr Leu
Ile Asp 35 40 45
Met Pro His Gly Ala Ile Leu Lys Glu Leu Ile Ala Gly Val Glu Glu 50
55 60 Glu Gly Leu His Ala
Arg Val Val Arg Ile Leu Arg Thr Ser Asp Val 65 70
75 80 Ser Phe Met Ala Trp Asp Ala Ala Asn Leu
Ser Gly Ser Gly Ile Gly 85 90
95 Ile Gly Ile Gln Ser Lys Gly Thr Thr Val Ile His Gln Arg Asp
Leu 100 105 110 Leu
Pro Leu Ser Asn Leu Glu Leu Phe Ser Gln Ala Pro Leu Leu Thr 115
120 125 Leu Glu Thr Tyr Arg Gln
Ile Gly Lys Asn Ala Ala Arg Tyr Ala Arg 130 135
140 Lys Glu Ser Pro Ser Pro Val Pro Val Val Asn
Asp Gln Met Val Arg 145 150 155
160 Pro Lys Phe Met Ala Lys Ala Ala Leu Phe His Ile Lys Glu Thr Lys
165 170 175 His Val
Val Gln Asp Ala Glu Pro Val Thr Leu His Ile Asp Leu Val 180
185 190 Arg Glu
13 426 DNA Klebsiella pneumoniae CDS (1)..(426)
13 atg agc gag aaa acc atg cgc gtg cag gat tat ccg
tta gcc acc cgc 48 Met Ser Glu Lys Thr Met Arg Val Gln Asp Tyr Pro
Leu Ala Thr Arg 1 5 10 15
tgc ccg gag cat atc ctg acg cct acc ggc aaa cca ttg acc gat att
96Cys Pro Glu His Ile Leu Thr Pro Thr Gly Lys Pro Leu Thr Asp Ile
20 25 30 acc ctc gag aag gtg
ctc tct ggc gag gtg ggc ccg cag gat gtg cgg 144Thr Leu Glu Lys Val
Leu Ser Gly Glu Val Gly Pro Gln Asp Val Arg 35 40
45 atc tcc cgc cag acc ctt gag tac cag gcg
cag att gcc gag cag atg 192Ile Ser Arg Gln Thr Leu Glu Tyr Gln Ala
Gln Ile Ala Glu Gln Met 50 55 60
cag cgc cat gcg gtg gcg cgc aat ttc cgc cgc gcg gcg gag ctt
atc 240Gln Arg His Ala Val Ala Arg Asn Phe Arg Arg Ala Ala Glu Leu
Ile65 70 75 80 gcc att
cct gac gag cgc att ctg gct atc tat aac gcg ctg cgc ccg 288Ala Ile
Pro Asp Glu Arg Ile Leu Ala Ile Tyr Asn Ala Leu Arg Pro 85
90 95 ttc cgc tcc tcg cag gcg gag
ctg ctg gcg atc gcc gac gag ctg gag 336Phe Arg Ser Ser Gln Ala Glu
Leu Leu Ala Ile Ala Asp Glu Leu Glu 100 105
110 cac acc tgg cat gcg aca gtg aat gcc gcc ttt gtc
cgg gag tcg gcg 384His Thr Trp His Ala Thr Val Asn Ala Ala Phe Val
Arg Glu Ser Ala 115 120 125
gaa gtg tat cag cag cgg cat aag ctg cgt aaa gga agc taa
426Glu Val Tyr Gln Gln Arg His Lys Leu Arg Lys Gly Ser 130
135 140
14 141 PRT Klebsiella pneumoniae
14 Met Ser Glu Lys Thr Met Arg Val Gln Asp Tyr Pro Leu Ala Thr
Arg 1 5 10 15 Cys
Pro Glu His Ile Leu Thr Pro Thr Gly Lys Pro Leu Thr Asp Ile
20 25 30 Thr Leu Glu Lys Val
Leu Ser Gly Glu Val Gly Pro Gln Asp Val Arg 35
40 45 Ile Ser Arg Gln Thr Leu Glu Tyr Gln
Ala Gln Ile Ala Glu Gln Met 50 55
60 Gln Arg His Ala Val Ala Arg Asn Phe Arg Arg Ala Ala
Glu Leu Ile 65 70 75
80 Ala Ile Pro Asp Glu Arg Ile Leu Ala Ile Tyr Asn Ala Leu Arg Pro
85 90 95 Phe Arg Ser Ser
Gln Ala Glu Leu Leu Ala Ile Ala Asp Glu Leu Glu 100
105 110 His Thr Trp His Ala Thr Val Asn Ala
Ala Phe Val Arg Glu Ser Ala 115 120
125 Glu Val Tyr Gln Gln Arg His Lys Leu Arg Lys Gly Ser
130 135 140
15 1539 DNA Escherichia coli CDS (1)..(1539)
15 atg acc aat aat ccc cct tca gca cag att aag ccc ggc
gag tat ggt 48 Met Thr Asn Asn Pro Pro Ser Ala Gln Ile Lys Pro Gly
Glu Tyr Gly 1 5 10 15
ttc ccc ctc aag tta aaa gcc cgc tat gac aac ttt att ggc ggc gaa
96Phe Pro Leu Lys Leu Lys Ala Arg Tyr Asp Asn Phe Ile Gly Gly Glu
20 25 30 tgg gta gcc cct gcc
gac ggc gag tat tac cag aat ctg acg ccg gtg 144Trp Val Ala Pro Ala
Asp Gly Glu Tyr Tyr Gln Asn Leu Thr Pro Val 35 40
45 acc ggg cag ctg ctg tgc gaa gtg gcg tct
tcg ggc aaa cga gac atc 192Thr Gly Gln Leu Leu Cys Glu Val Ala Ser
Ser Gly Lys Arg Asp Ile 50 55 60
gat ctg gcg ctg gat gct gcg cac aaa gtg aaa gat aaa tgg gcg
cac 240Asp Leu Ala Leu Asp Ala Ala His Lys Val Lys Asp Lys Trp Ala
His65 70 75 80 acc tcg
gtg cag gat cgt gcg gcg att ctg ttt aag att gcc gat cga 288Thr Ser
Val Gln Asp Arg Ala Ala Ile Leu Phe Lys Ile Ala Asp Arg 85
90 95 atg gaa caa aac ctc gag ctg
tta gcg aca gct gaa acc tgg gat aac 336Met Glu Gln Asn Leu Glu Leu
Leu Ala Thr Ala Glu Thr Trp Asp Asn 100 105
110 ggc aaa ccc att cgc gaa acc agt gct gcg gat gta
ccg ctg gcg att 384Gly Lys Pro Ile Arg Glu Thr Ser Ala Ala Asp Val
Pro Leu Ala Ile 115 120 125
gac cat ttc cgc tat ttc gcc tcg tgt att cgg gcg cag gaa ggt ggg
432Asp His Phe Arg Tyr Phe Ala Ser Cys Ile Arg Ala Gln Glu Gly Gly 130
135 140 atc agt gaa gtt
gat agc gaa acc gtg gcc tat cat ttc cat gaa ccg 480Ile Ser Glu Val
Asp Ser Glu Thr Val Ala Tyr His Phe His Glu Pro145 150
155 160tta ggc gtg gtg ggg cag att atc ccg
tgg aac ttc ccg ctg ctg atg 528Leu Gly Val Val Gly Gln Ile Ile Pro
Trp Asn Phe Pro Leu Leu Met 165 170
175 gcg agc tgg aaa atg gct ccc gcg ctg gcg gcg ggc aac tgt
gtg gtg 576Ala Ser Trp Lys Met Ala Pro Ala Leu Ala Ala Gly Asn Cys
Val Val 180 185 190 ctg
aaa ccc gca cgt ctt acc ccg ctt tct gta ctg ctg cta atg gaa 624Leu
Lys Pro Ala Arg Leu Thr Pro Leu Ser Val Leu Leu Leu Met Glu 195
200 205 att gtc ggt gat tta ctg
ccg ccg ggc gtg gtg aac gtg gtc aat ggc 672Ile Val Gly Asp Leu Leu
Pro Pro Gly Val Val Asn Val Val Asn Gly 210 215
220 gca ggt ggg gta att ggc gaa tat ctg gcg acc
tcg aaa cgc atc gcc 720Ala Gly Gly Val Ile Gly Glu Tyr Leu Ala Thr
Ser Lys Arg Ile Ala225 230 235
240aaa gtg gcg ttt acc ggc tca acg gaa gtg ggc caa caa att atg caa
768Lys Val Ala Phe Thr Gly Ser Thr Glu Val Gly Gln Gln Ile Met Gln
245 250 255 tac gca acg caa
aac att att ccg gtg acg ctg gag ttg ggc ggt aag 816Tyr Ala Thr Gln
Asn Ile Ile Pro Val Thr Leu Glu Leu Gly Gly Lys 260
265 270 tcg cca aat atc ttc ttt gct gat gtg
atg gat gaa gaa gat gcc ttt 864Ser Pro Asn Ile Phe Phe Ala Asp Val
Met Asp Glu Glu Asp Ala Phe 275 280
285 ttc gat aaa gcg ctg gaa ggc ttt gca ctg ttt gcc ttt aac
cag ggc 912Phe Asp Lys Ala Leu Glu Gly Phe Ala Leu Phe Ala Phe Asn
Gln Gly 290 295 300 gaa
gtt tgc acc tgt ccg agt cgt gct tta gtg cag gaa tct atc tac 960Glu
Val Cys Thr Cys Pro Ser Arg Ala Leu Val Gln Glu Ser Ile Tyr305
310 315 320gaa cgc ttt atg gaa cgc
gcc atc cgc cgt gtc gaa agc att cgt agc 1008Glu Arg Phe Met Glu Arg
Ala Ile Arg Arg Val Glu Ser Ile Arg Ser 325
330 335 ggt aac ccg ctc gac agc gtg acg caa atg ggc
gcg cag gtt tct cac 1056Gly Asn Pro Leu Asp Ser Val Thr Gln Met Gly
Ala Gln Val Ser His 340 345
350 ggg caa ctg gaa acc atc ctc aac tac att gat atc ggt aaa aaa
gag 1104Gly Gln Leu Glu Thr Ile Leu Asn Tyr Ile Asp Ile Gly Lys Lys
Glu 355 360 365 ggc gct
gac gtg ctc aca ggc ggg cgg cgc aag ctg ctg gaa ggt gaa 1152Gly Ala
Asp Val Leu Thr Gly Gly Arg Arg Lys Leu Leu Glu Gly Glu 370
375 380 ctg aaa gac ggc tac tac ctc
gaa ccg acg att ctg ttt ggt cag aac 1200Leu Lys Asp Gly Tyr Tyr Leu
Glu Pro Thr Ile Leu Phe Gly Gln Asn385 390
395 400aat atg cgg gtg ttc cag gag gag att ttt ggc ccg
gtg ctg gcg gtg 1248Asn Met Arg Val Phe Gln Glu Glu Ile Phe Gly Pro
Val Leu Ala Val 405 410
415 acc acc ttc aaa acg atg gaa gaa gcg ctg gag ctg gcg aac gat acg
1296Thr Thr Phe Lys Thr Met Glu Glu Ala Leu Glu Leu Ala Asn Asp Thr
420 425 430 caa tat ggc ctg
ggc gcg ggc gtc tgg agc cgc aac ggt aat ctg gcc 1344Gln Tyr Gly Leu
Gly Ala Gly Val Trp Ser Arg Asn Gly Asn Leu Ala 435
440 445 tat aag atg ggg cgc ggc ata cag gct
ggg cgc gtg tgg acc aac tgt 1392Tyr Lys Met Gly Arg Gly Ile Gln Ala
Gly Arg Val Trp Thr Asn Cys 450 455
460 tat cac gct tac ccg gca cat gcg gcg ttt ggt ggc tac
aaa caa tca 1440Tyr His Ala Tyr Pro Ala His Ala Ala Phe Gly Gly Tyr
Lys Gln Ser465 470 475
480ggt atc ggt cgc gaa acc cac aag atg atg ctg gag cat tac cag caa
1488Gly Ile Gly Arg Glu Thr His Lys Met Met Leu Glu His Tyr Gln Gln
485 490 495 acc aag tgc ctg
ctg gtg agc tac tcg gat aaa ccg ttg ggg ctg ttc 1536Thr Lys Cys Leu
Leu Val Ser Tyr Ser Asp Lys Pro Leu Gly Leu Phe 500
505 510 tga
1539
16 512 PRT Escherichia coli
16 Met Thr Asn
Asn Pro Pro Ser Ala Gln Ile Lys Pro Gly Glu Tyr Gly 1 5
10 15 Phe Pro Leu Lys Leu Lys Ala Arg
Tyr Asp Asn Phe Ile Gly Gly Glu 20 25
30 Trp Val Ala Pro Ala Asp Gly Glu Tyr Tyr Gln Asn Leu
Thr Pro Val 35 40 45
Thr Gly Gln Leu Leu Cys Glu Val Ala Ser Ser Gly Lys Arg Asp Ile 50
55 60 Asp Leu Ala Leu
Asp Ala Ala His Lys Val Lys Asp Lys Trp Ala His 65 70
75 80 Thr Ser Val Gln Asp Arg Ala Ala Ile
Leu Phe Lys Ile Ala Asp Arg 85 90
95 Met Glu Gln Asn Leu Glu Leu Leu Ala Thr Ala Glu Thr Trp
Asp Asn 100 105 110
Gly Lys Pro Ile Arg Glu Thr Ser Ala Ala Asp Val Pro Leu Ala Ile
115 120 125 Asp His Phe Arg
Tyr Phe Ala Ser Cys Ile Arg Ala Gln Glu Gly Gly 130
135 140 Ile Ser Glu Val Asp Ser Glu Thr
Val Ala Tyr His Phe His Glu Pro 145 150
155 160 Leu Gly Val Val Gly Gln Ile Ile Pro Trp Asn Phe
Pro Leu Leu Met 165 170
175 Ala Ser Trp Lys Met Ala Pro Ala Leu Ala Ala Gly Asn Cys Val Val
180 185 190 Leu Lys Pro
Ala Arg Leu Thr Pro Leu Ser Val Leu Leu Leu Met Glu 195
200 205 Ile Val Gly Asp Leu Leu Pro Pro
Gly Val Val Asn Val Val Asn Gly 210 215
220 Ala Gly Gly Val Ile Gly Glu Tyr Leu Ala Thr Ser Lys
Arg Ile Ala 225 230 235
240 Lys Val Ala Phe Thr Gly Ser Thr Glu Val Gly Gln Gln Ile Met Gln
245 250 255 Tyr Ala Thr Gln
Asn Ile Ile Pro Val Thr Leu Glu Leu Gly Gly Lys 260
265 270 Ser Pro Asn Ile Phe Phe Ala Asp Val
Met Asp Glu Glu Asp Ala Phe 275 280
285 Phe Asp Lys Ala Leu Glu Gly Phe Ala Leu Phe Ala Phe Asn
Gln Gly 290 295 300
Glu Val Cys Thr Cys Pro Ser Arg Ala Leu Val Gln Glu Ser Ile Tyr 305
310 315 320 Glu Arg Phe Met Glu
Arg Ala Ile Arg Arg Val Glu Ser Ile Arg Ser 325
330 335 Gly Asn Pro Leu Asp Ser Val Thr Gln Met
Gly Ala Gln Val Ser His 340 345
350 Gly Gln Leu Glu Thr Ile Leu Asn Tyr Ile Asp Ile Gly Lys Lys
Glu 355 360 365 Gly
Ala Asp Val Leu Thr Gly Gly Arg Arg Lys Leu Leu Glu Gly Glu 370
375 380 Leu Lys Asp Gly Tyr Tyr
Leu Glu Pro Thr Ile Leu Phe Gly Gln Asn 385 390
395 400 Asn Met Arg Val Phe Gln Glu Glu Ile Phe Gly
Pro Val Leu Ala Val 405 410
415 Thr Thr Phe Lys Thr Met Glu Glu Ala Leu Glu Leu Ala Asn Asp Thr
420 425 430 Gln Tyr
Gly Leu Gly Ala Gly Val Trp Ser Arg Asn Gly Asn Leu Ala 435
440 445 Tyr Lys Met Gly Arg Gly Ile
Gln Ala Gly Arg Val Trp Thr Asn Cys 450 455
460 Tyr His Ala Tyr Pro Ala His Ala Ala Phe Gly Gly
Tyr Lys Gln Ser 465 470 475
480 Gly Ile Gly Arg Glu Thr His Lys Met Met Leu Glu His Tyr Gln Gln
485 490 495 Thr Lys Cys
Leu Leu Val Ser Tyr Ser Asp Lys Pro Leu Gly Leu Phe 500
505 510
17 1440 DNA Escherichia coli CDS (1)..(1440)
17 atg tca gta ccc gtt caa cat cct atg tat atc gat gga
cag ttt gtt 48 Met Ser Val Pro Val Gln His Pro Met Tyr Ile Asp Gly
Gln Phe Val 1 5 10 15
acc tgg cgt gga gac gca tgg att gat gtg gta aac cct gct aca gag
96Thr Trp Arg Gly Asp Ala Trp Ile Asp Val Val Asn Pro Ala Thr Glu
20 25 30 gct gtc att tcc cgc
ata ccc gat ggt cag gcc gag gat gcc cgt aag 144Ala Val Ile Ser Arg
Ile Pro Asp Gly Gln Ala Glu Asp Ala Arg Lys 35 40
45 gca atc gat gca gca gaa cgt gca caa cca
gaa tgg gaa gcg ttg cct 192Ala Ile Asp Ala Ala Glu Arg Ala Gln Pro
Glu Trp Glu Ala Leu Pro 50 55 60
gct att gaa cgc gcc agt tgg ttg cgc aaa atc tcc gcc ggg atc
cgc 240Ala Ile Glu Arg Ala Ser Trp Leu Arg Lys Ile Ser Ala Gly Ile
Arg65 70 75 80 gaa cgc
gcc agt gaa atc agt gcg ctg att gtt gaa gaa ggg ggc aag 288Glu Arg
Ala Ser Glu Ile Ser Ala Leu Ile Val Glu Glu Gly Gly Lys 85
90 95 atc cag cag ctg gct gaa gtc
gaa gtg gct ttt act gcc gac tat atc 336Ile Gln Gln Leu Ala Glu Val
Glu Val Ala Phe Thr Ala Asp Tyr Ile 100 105
110 gat tac atg gcg gag tgg gca cgg cgt tac gag ggc
gag att att caa 384Asp Tyr Met Ala Glu Trp Ala Arg Arg Tyr Glu Gly
Glu Ile Ile Gln 115 120 125
agc gat cgt cca gga gaa aat att ctt ttg ttt aaa cgt gcg ctt ggt
432Ser Asp Arg Pro Gly Glu Asn Ile Leu Leu Phe Lys Arg Ala Leu Gly 130
135 140 gtg act acc ggc
att ctg ccg tgg aac ttc ccg ttc ttc ctc att gcc 480Val Thr Thr Gly
Ile Leu Pro Trp Asn Phe Pro Phe Phe Leu Ile Ala145 150
155 160cgc aaa atg gct ccc gct ctt ttg acc
ggt aat acc atc gtc att aaa 528Arg Lys Met Ala Pro Ala Leu Leu Thr
Gly Asn Thr Ile Val Ile Lys 165 170
175 cct agt gaa ttt acg cca aac aat gcg att gca ttc gcc aaa
atc gtc 576Pro Ser Glu Phe Thr Pro Asn Asn Ala Ile Ala Phe Ala Lys
Ile Val 180 185 190 gat
gaa ata ggc ctt ccg cgc ggc gtg ttt aac ctt gta ctg ggg cgt 624Asp
Glu Ile Gly Leu Pro Arg Gly Val Phe Asn Leu Val Leu Gly Arg 195
200 205 ggt gaa acc gtt ggg caa
gaa ctg gcg ggt aac cca aag gtc gca atg 672Gly Glu Thr Val Gly Gln
Glu Leu Ala Gly Asn Pro Lys Val Ala Met 210 215
220 gtc agt atg aca ggc agc gtc tct gca ggt gag
aag atc atg gcg act 720Val Ser Met Thr Gly Ser Val Ser Ala Gly Glu
Lys Ile Met Ala Thr225 230 235
240gcg gcg aaa aac atc acc aaa gtg tgt ctg gaa ttg ggg ggt aaa gca
768Ala Ala Lys Asn Ile Thr Lys Val Cys Leu Glu Leu Gly Gly Lys Ala
245 250 255 cca gct atc gta
atg gac gat gcc gat ctt gaa ctg gca gtc aaa gcc 816Pro Ala Ile Val
Met Asp Asp Ala Asp Leu Glu Leu Ala Val Lys Ala 260
265 270 atc gtt gat tca cgc gtc att aat agt
ggg caa gtg tgt aac tgt gca 864Ile Val Asp Ser Arg Val Ile Asn Ser
Gly Gln Val Cys Asn Cys Ala 275 280
285 gaa cgt gtt tat gta cag aaa ggc att tat gat cag ttc gtc
aat cgg 912Glu Arg Val Tyr Val Gln Lys Gly Ile Tyr Asp Gln Phe Val
Asn Arg 290 295 300 ctg
ggt gaa gcg atg cag gcg gtt caa ttt ggt aac ccc gct gaa cgc 960Leu
Gly Glu Ala Met Gln Ala Val Gln Phe Gly Asn Pro Ala Glu Arg305
310 315 320aac gac att gcg atg ggg
ccg ttg att aac gcc gcg gcg ctg gaa agg 1008Asn Asp Ile Ala Met Gly
Pro Leu Ile Asn Ala Ala Ala Leu Glu Arg 325
330 335 gtc gag caa aaa gtg gcg cgc gca gta gaa gaa
ggg gcg aga gtg gcg 1056Val Glu Gln Lys Val Ala Arg Ala Val Glu Glu
Gly Ala Arg Val Ala 340 345
350 ttc ggt ggc aaa gcg gta gag ggg aaa gga tat tat tat ccg ccg
aca 1104Phe Gly Gly Lys Ala Val Glu Gly Lys Gly Tyr Tyr Tyr Pro Pro
Thr 355 360 365 ttg ctg
ctg gat gtt cgc cag gaa atg tcg att atg cat gag gaa acc 1152Leu Leu
Leu Asp Val Arg Gln Glu Met Ser Ile Met His Glu Glu Thr 370
375 380 ttt ggc ccg gtg ctg cca gtt
gtc gca ttt gac acg ctg gaa gat gct 1200Phe Gly Pro Val Leu Pro Val
Val Ala Phe Asp Thr Leu Glu Asp Ala385 390
395 400atc tca atg gct aat gac agt gat tac ggc ctg acc
tca tca atc tat 1248Ile Ser Met Ala Asn Asp Ser Asp Tyr Gly Leu Thr
Ser Ser Ile Tyr 405 410
415 acc caa aat ctg aac gtc gcg atg aaa gcc att aaa ggg ctg aag ttt
1296Thr Gln Asn Leu Asn Val Ala Met Lys Ala Ile Lys Gly Leu Lys Phe
420 425 430 ggt gaa act tac
atc aac cgt gaa aac ttc gaa gct atg caa ggc ttc 1344Gly Glu Thr Tyr
Ile Asn Arg Glu Asn Phe Glu Ala Met Gln Gly Phe 435
440 445 cac gcc gga tgg cgt aaa tcc ggt att
ggc ggc gca gat ggt aaa cat 1392His Ala Gly Trp Arg Lys Ser Gly Ile
Gly Gly Ala Asp Gly Lys His 450 455
460 ggc ttg cat gaa tat ctg cag acc cag gtg gtt tat tta
cag tct taa 1440Gly Leu His Glu Tyr Leu Gln Thr Gln Val Val Tyr Leu
Gln Ser 465 470 475
18 479 PRT Escherichia coli
18 Met Ser Val Pro Val Gln His Pro Met Tyr Ile
Asp Gly Gln Phe Val 1 5 10
15 Thr Trp Arg Gly Asp Ala Trp Ile Asp Val Val Asn Pro Ala Thr Glu
20 25 30 Ala Val
Ile Ser Arg Ile Pro Asp Gly Gln Ala Glu Asp Ala Arg Lys 35
40 45 Ala Ile Asp Ala Ala Glu Arg
Ala Gln Pro Glu Trp Glu Ala Leu Pro 50 55
60 Ala Ile Glu Arg Ala Ser Trp Leu Arg Lys Ile Ser
Ala Gly Ile Arg 65 70 75
80 Glu Arg Ala Ser Glu Ile Ser Ala Leu Ile Val Glu Glu Gly Gly Lys
85 90 95 Ile Gln Gln
Leu Ala Glu Val Glu Val Ala Phe Thr Ala Asp Tyr Ile 100
105 110 Asp Tyr Met Ala Glu Trp Ala Arg
Arg Tyr Glu Gly Glu Ile Ile Gln 115 120
125 Ser Asp Arg Pro Gly Glu Asn Ile Leu Leu Phe Lys Arg
Ala Leu Gly 130 135 140
Val Thr Thr Gly Ile Leu Pro Trp Asn Phe Pro Phe Phe Leu Ile Ala 145
150 155 160 Arg Lys Met Ala
Pro Ala Leu Leu Thr Gly Asn Thr Ile Val Ile Lys 165
170 175 Pro Ser Glu Phe Thr Pro Asn Asn Ala
Ile Ala Phe Ala Lys Ile Val 180 185
190 Asp Glu Ile Gly Leu Pro Arg Gly Val Phe Asn Leu Val Leu
Gly Arg 195 200 205
Gly Glu Thr Val Gly Gln Glu Leu Ala Gly Asn Pro Lys Val Ala Met 210
215 220 Val Ser Met Thr Gly
Ser Val Ser Ala Gly Glu Lys Ile Met Ala Thr 225 230
235 240 Ala Ala Lys Asn Ile Thr Lys Val Cys Leu
Glu Leu Gly Gly Lys Ala 245 250
255 Pro Ala Ile Val Met Asp Asp Ala Asp Leu Glu Leu Ala Val Lys
Ala 260 265 270 Ile
Val Asp Ser Arg Val Ile Asn Ser Gly Gln Val Cys Asn Cys Ala 275
280 285 Glu Arg Val Tyr Val Gln
Lys Gly Ile Tyr Asp Gln Phe Val Asn Arg 290 295
300 Leu Gly Glu Ala Met Gln Ala Val Gln Phe Gly
Asn Pro Ala Glu Arg 305 310 315
320 Asn Asp Ile Ala Met Gly Pro Leu Ile Asn Ala Ala Ala Leu Glu Arg
325 330 335 Val Glu
Gln Lys Val Ala Arg Ala Val Glu Glu Gly Ala Arg Val Ala 340
345 350 Phe Gly Gly Lys Ala Val Glu
Gly Lys Gly Tyr Tyr Tyr Pro Pro Thr 355 360
365 Leu Leu Leu Asp Val Arg Gln Glu Met Ser Ile Met
His Glu Glu Thr 370 375 380
Phe Gly Pro Val Leu Pro Val Val Ala Phe Asp Thr Leu Glu Asp Ala 385
390 395 400 Ile Ser Met
Ala Asn Asp Ser Asp Tyr Gly Leu Thr Ser Ser Ile Tyr 405
410 415 Thr Gln Asn Leu Asn Val Ala Met
Lys Ala Ile Lys Gly Leu Lys Phe 420 425
430 Gly Glu Thr Tyr Ile Asn Arg Glu Asn Phe Glu Ala Met
Gln Gly Phe 435 440 445
His Ala Gly Trp Arg Lys Ser Gly Ile Gly Gly Ala Asp Gly Lys His 450
455 460 Gly Leu His Glu
Tyr Leu Gln Thr Gln Val Val Tyr Leu Gln Ser 465 470
475
19 1488 DNA Escherichia coli CDS (1)..(1488)
19 atg aat ttt cat cat ctg gct tac tgg cag gat aaa gcg tta agt ctc
48 Met Asn Phe His His Leu Ala Tyr Trp Gln Asp Lys Ala Leu Ser Leu 1
5 10 15 gcc att gaa aac cgc
tta ttt att aac ggt gaa tat act gct gcg gcg 96Ala Ile Glu Asn Arg
Leu Phe Ile Asn Gly Glu Tyr Thr Ala Ala Ala 20
25 30 gaa aat gaa acc ttt gaa acc gtt gat ccg
gtc acc cag gca ccg ctg 144Glu Asn Glu Thr Phe Glu Thr Val Asp Pro
Val Thr Gln Ala Pro Leu 35 40 45
gcg aaa att gcc cgc ggc aag agc gtc gat atc gac cgt gcg atg
agc 192Ala Lys Ile Ala Arg Gly Lys Ser Val Asp Ile Asp Arg Ala Met
Ser 50 55 60 gca gca
cgc ggc gta ttt gaa cgc ggc gac tgg tca ctc tct tct ccg 240Ala Ala
Arg Gly Val Phe Glu Arg Gly Asp Trp Ser Leu Ser Ser Pro65
70 75 80 gct aaa cgt aaa gcg gta ctg
aat aaa ctc gcc gat tta atg gaa gcc 288Ala Lys Arg Lys Ala Val Leu
Asn Lys Leu Ala Asp Leu Met Glu Ala 85 90
95 cac gcc gaa gag ctg gca ctg ctg gaa act ctc gac
acc ggc aaa ccg 336His Ala Glu Glu Leu Ala Leu Leu Glu Thr Leu Asp
Thr Gly Lys Pro 100 105 110
att cgt cac agt ctg cgt gat gat att ccc ggc gcg gcg cgc gcc att
384Ile Arg His Ser Leu Arg Asp Asp Ile Pro Gly Ala Ala Arg Ala Ile
115 120 125 cgc tgg tac gcc
gaa gcg atc gac aaa gtg tat ggc gaa gtg gcg acc 432Arg Trp Tyr Ala
Glu Ala Ile Asp Lys Val Tyr Gly Glu Val Ala Thr 130
135 140 acc agt agc cat gag ctg gcg atg
atc gtg cgt gaa ccg gtc ggc gtg 480Thr Ser Ser His Glu Leu Ala Met
Ile Val Arg Glu Pro Val Gly Val145 150
155 160att gcc gcc atc gtg ccg tgg aac ttc ccg ctg ttg
ctg act tgc tgg 528Ile Ala Ala Ile Val Pro Trp Asn Phe Pro Leu Leu
Leu Thr Cys Trp 165 170
175 aaa ctc ggc ccg gcg ctg gcg gcg gga aac agc gtg att cta aaa ccg
576Lys Leu Gly Pro Ala Leu Ala Ala Gly Asn Ser Val Ile Leu Lys Pro
180 185 190 tct gaa aaa tca
ccg ctc agt gcg att cgt ctc gcg ggg ctg gcg aaa 624Ser Glu Lys Ser
Pro Leu Ser Ala Ile Arg Leu Ala Gly Leu Ala Lys 195
200 205 gaa gca ggc ttg ccg gat ggt gtg ttg
aac gtg gtg acg ggt ttt ggt 672Glu Ala Gly Leu Pro Asp Gly Val Leu
Asn Val Val Thr Gly Phe Gly 210 215
220 cat gaa gcc ggg cag gcg ctg tcg cgt cat aac gat atc
gac gcc att 720His Glu Ala Gly Gln Ala Leu Ser Arg His Asn Asp Ile
Asp Ala Ile225 230 235
240gcc ttt acc ggt tca acc cgt acc ggg aaa cag ctg ctg aaa gat gcg
768Ala Phe Thr Gly Ser Thr Arg Thr Gly Lys Gln Leu Leu Lys Asp Ala
245 250 255 ggc gac agc aac
atg aaa cgc gtc tgg ctg gaa gcg ggc ggc aaa agc 816Gly Asp Ser Asn
Met Lys Arg Val Trp Leu Glu Ala Gly Gly Lys Ser 260
265 270 gcc aac atc gtt ttc gct gac tgc ccg
gat ttg caa cag gcg gca agc 864Ala Asn Ile Val Phe Ala Asp Cys Pro
Asp Leu Gln Gln Ala Ala Ser 275 280
285 gcc acc gca gca ggc att ttc tac aac cag gga cag gtg tgc
atc gcc 912Ala Thr Ala Ala Gly Ile Phe Tyr Asn Gln Gly Gln Val Cys
Ile Ala 290 295 300 gga
acg cgc ctg ttg ctg gaa gag agc atc gcc gat gaa ttc tta gcc 960Gly
Thr Arg Leu Leu Leu Glu Glu Ser Ile Ala Asp Glu Phe Leu Ala305
310 315 320ctg tta aaa cag cag gcg
caa aac tgg cag ccg ggc cat cca ctt gat 1008Leu Leu Lys Gln Gln Ala
Gln Asn Trp Gln Pro Gly His Pro Leu Asp 325
330 335 ccc gca acc acc atg ggc acc tta atc gac tgc
gcc cac gcc gac tcg 1056Pro Ala Thr Thr Met Gly Thr Leu Ile Asp Cys
Ala His Ala Asp Ser 340 345
350 gtc cat agc ttt att cgg gaa ggc gaa agc aaa ggg caa ctg ttg
ttg 1104Val His Ser Phe Ile Arg Glu Gly Glu Ser Lys Gly Gln Leu Leu
Leu 355 360 365 gat ggc
cgt aac gcc ggg ctg gct gcc gcc atc ggc ccg acc atc ttt 1152Asp Gly
Arg Asn Ala Gly Leu Ala Ala Ala Ile Gly Pro Thr Ile Phe 370
375 380 gtg gat gtg gac ccg aat gcg
tcc tta agt cgc gaa gag att ttc ggt 1200Val Asp Val Asp Pro Asn Ala
Ser Leu Ser Arg Glu Glu Ile Phe Gly385 390
395 400ccg gtg ctg gtg gtc acg cgt ttc aca tca gaa gaa
cag gcg cta cag 1248Pro Val Leu Val Val Thr Arg Phe Thr Ser Glu Glu
Gln Ala Leu Gln 405 410
415 ctt gcc aac gac agc cag tac ggc ctt ggc gcg gcg gta tgg acg cgc
1296Leu Ala Asn Asp Ser Gln Tyr Gly Leu Gly Ala Ala Val Trp Thr Arg
420 425 430 gac ctc tcc cgc
gcg cac cgc atg agc cga cgc ctg aaa gcc ggt tcc 1344Asp Leu Ser Arg
Ala His Arg Met Ser Arg Arg Leu Lys Ala Gly Ser 435
440 445 gtc ttc gtc aat aac tac aac gac ggc
gat atg acc gtg ccg ttt ggc 1392Val Phe Val Asn Asn Tyr Asn Asp Gly
Asp Met Thr Val Pro Phe Gly 450 455
460 ggc tat aag cag agc ggc aac ggt cgc gac aaa tcc ctg
cat gcc ctt 1440Gly Tyr Lys Gln Ser Gly Asn Gly Arg Asp Lys Ser Leu
His Ala Leu465 470 475
480gaa aaa ttc act gaa ctg aaa acc atc tgg ata agc ctg gag gcc tga
1488Glu Lys Phe Thr Glu Leu Lys Thr Ile Trp Ile Ser Leu Glu Ala
485 490 495
20 495 PRT Escherichia coli 20
Met Asn Phe His His Leu Ala Tyr Trp Gln Asp
Lys Ala Leu Ser Leu 1 5 10
15 Ala Ile Glu Asn Arg Leu Phe Ile Asn Gly Glu Tyr Thr Ala Ala Ala
20 25 30 Glu Asn
Glu Thr Phe Glu Thr Val Asp Pro Val Thr Gln Ala Pro Leu 35
40 45 Ala Lys Ile Ala Arg Gly Lys
Ser Val Asp Ile Asp Arg Ala Met Ser 50 55
60 Ala Ala Arg Gly Val Phe Glu Arg Gly Asp Trp Ser
Leu Ser Ser Pro 65 70 75
80 Ala Lys Arg Lys Ala Val Leu Asn Lys Leu Ala Asp Leu Met Glu Ala
85 90 95 His Ala Glu
Glu Leu Ala Leu Leu Glu Thr Leu Asp Thr Gly Lys Pro 100
105 110 Ile Arg His Ser Leu Arg Asp Asp
Ile Pro Gly Ala Ala Arg Ala Ile 115 120
125 Arg Trp Tyr Ala Glu Ala Ile Asp Lys Val Tyr Gly Glu
Val Ala Thr 130 135 140
Thr Ser Ser His Glu Leu Ala Met Ile Val Arg Glu Pro Val Gly Val 145
150 155 160 Ile Ala Ala Ile
Val Pro Trp Asn Phe Pro Leu Leu Leu Thr Cys Trp 165
170 175 Lys Leu Gly Pro Ala Leu Ala Ala Gly
Asn Ser Val Ile Leu Lys Pro 180 185
190 Ser Glu Lys Ser Pro Leu Ser Ala Ile Arg Leu Ala Gly Leu
Ala Lys 195 200 205
Glu Ala Gly Leu Pro Asp Gly Val Leu Asn Val Val Thr Gly Phe Gly 210
215 220 His Glu Ala Gly Gln
Ala Leu Ser Arg His Asn Asp Ile Asp Ala Ile 225 230
235 240 Ala Phe Thr Gly Ser Thr Arg Thr Gly Lys
Gln Leu Leu Lys Asp Ala 245 250
255 Gly Asp Ser Asn Met Lys Arg Val Trp Leu Glu Ala Gly Gly Lys
Ser 260 265 270 Ala
Asn Ile Val Phe Ala Asp Cys Pro Asp Leu Gln Gln Ala Ala Ser 275
280 285 Ala Thr Ala Ala Gly Ile
Phe Tyr Asn Gln Gly Gln Val Cys Ile Ala 290 295
300 Gly Thr Arg Leu Leu Leu Glu Glu Ser Ile Ala
Asp Glu Phe Leu Ala 305 310 315
320 Leu Leu Lys Gln Gln Ala Gln Asn Trp Gln Pro Gly His Pro Leu Asp
325 330 335 Pro Ala
Thr Thr Met Gly Thr Leu Ile Asp Cys Ala His Ala Asp Ser 340
345 350 Val His Ser Phe Ile Arg Glu
Gly Glu Ser Lys Gly Gln Leu Leu Leu 355 360
365 Asp Gly Arg Asn Ala Gly Leu Ala Ala Ala Ile Gly
Pro Thr Ile Phe 370 375 380
Val Asp Val Asp Pro Asn Ala Ser Leu Ser Arg Glu Glu Ile Phe Gly 385
390 395 400 Pro Val Leu
Val Val Thr Arg Phe Thr Ser Glu Glu Gln Ala Leu Gln 405
410 415 Leu Ala Asn Asp Ser Gln Tyr Gly
Leu Gly Ala Ala Val Trp Thr Arg 420 425
430 Asp Leu Ser Arg Ala His Arg Met Ser Arg Arg Leu Lys
Ala Gly Ser 435 440 445
Val Phe Val Asn Asn Tyr Asn Asp Gly Asp Met Thr Val Pro Phe Gly 450
455 460 Gly Tyr Lys Gln
Ser Gly Asn Gly Arg Asp Lys Ser Leu His Ala Leu 465 470
475 480 Glu Lys Phe Thr Glu Leu Lys Thr Ile
Trp Ile Ser Leu Glu Ala 485 490
495
21 1395 DNA Escherichia coli 21
atgcctgacg ctaaaaaaca ggggcggtca
aacaaggcaa tgacgttttt cgtctgcttc 60cttgccgctc tggcgggatt actctttggc
ctggatatcg gtgtaattgc tggcgcactg 120ccgtttattg cagatgaatt ccagattact
tcgcacacgc aagaatgggt cgtaagctcc 180atgatgttcg gtgcggcagt cggtgcggtg
ggcagcggct ggctctcctt taaactcggg 240cgcaaaaaga gcctgatgat cggcgcaatt
ttgtttgttg ccggttcgct gttctctgcg 300gctgcgccaa acgttgaagt actgattctt
tcccgcgttc tactggggct ggcggtgggt 360gtggcctctt ataccgcacc gctgtacctc
tctgaaattg cgccggaaaa aattcgtggc 420agtatgatct cgatgtatca gttgatgatc
actatcggga tcctcggtgc ttatctttct 480gataccgcct tcagctacac cggtgcatgg
cgctggatgc tgggtgtgat tatcatcccg 540gcaattttgc tgctgattgg tgtcttcttc
ctgccagaca gcccacgttg gtttgccgcc 600aaacgccgtt ttgttgatgc cgaacgcgtg
ctgctacgcc tgcgtgacac cagcgcggaa 660gcgaaacgcg aactggatga aatccgtgaa
agtttgcagg ttaaacagag tggctgggcg 720ctgtttaaag agaacagcaa cttccgccgc
gcggtgttcc ttggcgtact gttgcaggta 780atgcagcaat tcaccgggat gaacgtcatc
atgtattacg cgccgaaaat cttcgaactg 840gcgggttata ccaacactac cgagcaaatg
tgggggaccg tgattgtcgg cctgaccaac 900gtacttgcca cctttatcgc aatcggcctt
gttgaccgct ggggacgtaa accaacgcta 960acgctgggct tcctggtgat ggctgctggc
atgggcgtac tcggtacaat gatgcatatc 1020ggtattcact ctccgtcggc gcagtatttc
gccatcgcca tgctgctgat gtttattgtc 1080ggttttgcca tgagtgccgg tccgctgatt
tgggtactgt gctccgaaat tcagccgctg 1140aaaggccgcg attttggcat cacctgctcc
actgccacca actggattgc caacatgatc 1200gttggcgcaa cgttcctgac catgctcaac
acgctgggta acgccaacac cttctgggtg 1260tatgcggctc tgaacgtact gtttatcctg
ctgacattgt ggctggtacc ggaaaccaaa 1320cacgtttcgc tggaacatat tgaacgtaat
ctgatgaaag gtcgtaaact gcgcgaaata 1380ggcgctcacg attaa
1395
22 464 PRT Escherichia coli 22
Met Pro
Asp Ala Lys Lys Gln Gly Arg Ser Asn Lys Ala Met Thr Phe 1 5
10 15 Phe Val Cys Phe Leu Ala Ala
Leu Ala Gly Leu Leu Phe Gly Leu Asp 20 25
30 Ile Gly Val Ile Ala Gly Ala Leu Pro Phe Ile Ala
Asp Glu Phe Gln 35 40 45
Ile Thr Ser His Thr Gln Glu Trp Val Val Ser Ser Met Met Phe Gly
50 55 60 Ala Ala Val
Gly Ala Val Gly Ser Gly Trp Leu Ser Phe Lys Leu Gly 65
70 75 80 Arg Lys Lys Ser Leu Met Ile
Gly Ala Ile Leu Phe Val Ala Gly Ser 85
90 95 Leu Phe Ser Ala Ala Ala Pro Asn Val Glu Val
Leu Ile Leu Ser Arg 100 105
110 Val Leu Leu Gly Leu Ala Val Gly Val Ala Ser Tyr Thr Ala Pro
Leu 115 120 125 Tyr
Leu Ser Glu Ile Ala Pro Glu Lys Ile Arg Gly Ser Met Ile Ser 130
135 140 Met Tyr Gln Leu Met Ile
Thr Ile Gly Ile Leu Gly Ala Tyr Leu Ser 145 150
155 160 Asp Thr Ala Phe Ser Tyr Thr Gly Ala Trp Arg
Trp Met Leu Gly Val 165 170
175 Ile Ile Ile Pro Ala Ile Leu Leu Leu Ile Gly Val Phe Phe Leu Pro
180 185 190 Asp Ser
Pro Arg Trp Phe Ala Ala Lys Arg Arg Phe Val Asp Ala Glu 195
200 205 Arg Val Leu Leu Arg Leu Arg
Asp Thr Ser Ala Glu Ala Lys Arg Glu 210 215
220 Leu Asp Glu Ile Arg Glu Ser Leu Gln Val Lys Gln
Ser Gly Trp Ala 225 230 235
240 Leu Phe Lys Glu Asn Ser Asn Phe Arg Arg Ala Val Phe Leu Gly Val
245 250 255 Leu Leu Gln
Val Met Gln Gln Phe Thr Gly Met Asn Val Ile Met Tyr 260
265 270 Tyr Ala Pro Lys Ile Phe Glu Leu
Ala Gly Tyr Thr Asn Thr Thr Glu 275 280
285 Gln Met Trp Gly Thr Val Ile Val Gly Leu Thr Asn Val
Leu Ala Thr 290 295 300
Phe Ile Ala Ile Gly Leu Val Asp Arg Trp Gly Arg Lys Pro Thr Leu 305
310 315 320 Thr Leu Gly Phe
Leu Val Met Ala Ala Gly Met Gly Val Leu Gly Thr 325
330 335 Met Met His Ile Gly Ile His Ser Pro
Ser Ala Gln Tyr Phe Ala Ile 340 345
350 Ala Met Leu Leu Met Phe Ile Val Gly Phe Ala Met Ser Ala
Gly Pro 355 360 365
Leu Ile Trp Val Leu Cys Ser Glu Ile Gln Pro Leu Lys Gly Arg Asp 370
375 380 Phe Gly Ile Thr Cys
Ser Thr Ala Thr Asn Trp Ile Ala Asn Met Ile 385 390
395 400 Val Gly Ala Thr Phe Leu Thr Met Leu Asn
Thr Leu Gly Asn Ala Asn 405 410
415 Thr Phe Trp Val Tyr Ala Ala Leu Asn Val Leu Phe Ile Leu Leu
Thr 420 425 430 Leu
Trp Leu Val Pro Glu Thr Lys His Val Ser Leu Glu His Ile Glu 435
440 445 Arg Asn Leu Met Lys Gly
Arg Lys Leu Arg Glu Ile Gly Ala His Asp 450 455
460
23 1248 DNA Escherichia coli 23
atggcactga
atattccatt cagaaatgcg tactatcgtt ttgcatccag ttactcattt 60ctctttttta
tttcctggtc gctgtggtgg tcgttatacg ctatttggct gaaaggacat 120ctaggattaa
cagggacgga attaggtaca ctttattcgg tcaaccagtt taccagcatt 180ctatttatga
tgttctacgg catcgttcag gataaactcg gtctgaagaa accgctcatc 240tggtgtatga
gtttcattct ggtcttgacc ggaccgttta tgatttacgt ttatgaaccg 300ttactgcaaa
gcaatttttc tgtaggtcta attctggggg cgctcttttt tggcctgggg 360tatctggcgg
gatgcggttt gcttgacagc ttcaccgaaa aaatggcgcg aaattttcat 420ttcgaatatg
gaacagcgcg cgcctgggga tcttttggct atgctattgg cgcgttcttt 480gccggtatat
tttttagtat cagtccccat atcaacttct ggttggtctc gctatttggc 540gctgtattta
tgatgatcaa catgcgtttt aaagataagg atcaccagtg catagcggcg 600gatgcgggag
gggtaaaaaa agaggatttt atcgcagttt tcaaggatcg aaacttctgg 660gttttcgtca
tatttattgt ggggacgtgg tctttctata acatttttga tcaacaactc 720tttcctgtct
tttatgcagg tttattcgaa tcacacgatg taggaacgcg cctgtatggt 780tatctcaact
cattccaggt ggtactcgaa gcgctgtgca tggcgattat tcctttcttt 840gtgaatcggg
tagggccaaa aaatgcatta cttatcggtg ttgtgattat ggcgttgcgt 900atcctttcct
gcgcgttgtt cgttaacccc tggattattt cattagtgaa gctgttacat 960gccattgagg
ttccactttg tgtcatatcc gtcttcaaat acagcgtggc aaactttgat 1020aagcgcctgt
cgtcgacgat ctttctgatt ggttttcaaa ttgccagttc gcttgggatt 1080gtgctgcttt
caacgccgac tgggatactc tttgaccacg caggctacca gacagttttc 1140ttcgcaattt
cgggtattgt ctgcctgatg ttgctatttg gcattttctt cctgagtaaa 1200aaacgcgagc
aaatagttat ggaaacgcct gtaccttcag caatatag
1248
24 415 PRT Escherichia coli 24
Met Ala Leu Asn Ile Pro Phe Arg Asn Ala
Tyr Tyr Arg Phe Ala Ser 1 5 10
15 Ser Tyr Ser Phe Leu Phe Phe Ile Ser Trp Ser Leu Trp Trp Ser
Leu 20 25 30 Tyr
Ala Ile Trp Leu Lys Gly His Leu Gly Leu Thr Gly Thr Glu Leu 35
40 45 Gly Thr Leu Tyr Ser Val
Asn Gln Phe Thr Ser Ile Leu Phe Met Met 50 55
60 Phe Tyr Gly Ile Val Gln Asp Lys Leu Gly Leu
Lys Lys Pro Leu Ile 65 70 75
80 Trp Cys Met Ser Phe Ile Leu Val Leu Thr Gly Pro Phe Met Ile Tyr
85 90 95 Val Tyr
Glu Pro Leu Leu Gln Ser Asn Phe Ser Val Gly Leu Ile Leu 100
105 110 Gly Ala Leu Phe Phe Gly Leu
Gly Tyr Leu Ala Gly Cys Gly Leu Leu 115 120
125 Asp Ser Phe Thr Glu Lys Met Ala Arg Asn Phe His
Phe Glu Tyr Gly 130 135 140
Thr Ala Arg Ala Trp Gly Ser Phe Gly Tyr Ala Ile Gly Ala Phe Phe 145
150 155 160 Ala Gly Ile
Phe Phe Ser Ile Ser Pro His Ile Asn Phe Trp Leu Val 165
170 175 Ser Leu Phe Gly Ala Val Phe Met
Met Ile Asn Met Arg Phe Lys Asp 180 185
190 Lys Asp His Gln Cys Ile Ala Ala Asp Ala Gly Gly Val
Lys Lys Glu 195 200 205
Asp Phe Ile Ala Val Phe Lys Asp Arg Asn Phe Trp Val Phe Val Ile 210
215 220 Phe Ile Val Gly
Thr Trp Ser Phe Tyr Asn Ile Phe Asp Gln Gln Leu 225 230
235 240 Phe Pro Val Phe Tyr Ala Gly Leu Phe
Glu Ser His Asp Val Gly Thr 245 250
255 Arg Leu Tyr Gly Tyr Leu Asn Ser Phe Gln Val Val Leu Glu
Ala Leu 260 265 270
Cys Met Ala Ile Ile Pro Phe Phe Val Asn Arg Val Gly Pro Lys Asn
275 280 285 Ala Leu Leu Ile
Gly Val Val Ile Met Ala Leu Arg Ile Leu Ser Cys 290
295 300 Ala Leu Phe Val Asn Pro Trp Ile
Ile Ser Leu Val Lys Leu Leu His 305 310
315 320 Ala Ile Glu Val Pro Leu Cys Val Ile Ser Val Phe
Lys Tyr Ser Val 325 330
335 Ala Asn Phe Asp Lys Arg Leu Ser Ser Thr Ile Phe Leu Ile Gly Phe
340 345 350 Gln Ile Ala
Ser Ser Leu Gly Ile Val Leu Leu Ser Thr Pro Thr Gly 355
360 365 Ile Leu Phe Asp His Ala Gly Tyr
Gln Thr Val Phe Phe Ala Ile Ser 370 375
380 Gly Ile Val Cys Leu Met Leu Leu Phe Gly Ile Phe Phe
Leu Ser Lys 385 390 395
400 Lys Arg Glu Gln Ile Val Met Glu Thr Pro Val Pro Ser Ala Ile
405 410 415
25 1248 DNA Escherichia coli 25
atggcactga atattccatt cagaaatgcg tactatcgtt ttgcatccag ttactcattt
60ctctttttta tttcctggtc gctgtggtgg tcgttatacg ctatttggct gaaaggacat
120ctagggttga cagggacgga attaggtaca ctttattcgg tcaaccagtt taccagcatt
180ctatttatga tgttctacgg catcgttcag gataaactcg gtctgaagaa accgctcatc
240tggtgtatga gtttcatcct ggtcttgacc ggaccgttta tgatttacgt ttatgaaccg
300ttactgcaaa gcaatttttc tgtaggtcta attctggggg cgctattttt tggcttgggg
360tatctggcgg gatgcggttt gcttgatagc ttcaccgaaa aaatggcgcg aaattttcat
420ttcgaatatg gaacagcgcg cgcctgggga tcttttggct atgctattgg cgcgttcttt
480gccggcatat tttttagtat cagtccccat atcaacttct ggttggtctc gctatttggc
540gctgtattta tgatgatcaa catgcgtttt aaagataagg atcaccagtg cgtagcggca
600gatgcgggag gggtaaaaaa agaggatttt atcgcagttt tcaaggatcg aaacttctgg
660gttttcgtca tatttattgt ggggacgtgg tctttctata acatttttga tcaacaactt
720tttcctgtct tttattcagg tttattcgaa tcacacgatg taggaacgcg cctgtatggt
780tatctcaact cattccaggt ggtactcgaa gcgctgtgca tggcgattat tcctttcttt
840gtgaatcggg tagggccaaa aaatgcatta cttatcggag ttgtgattat ggcgttgcgt
900atcctttcct gcgcgctgtt cgttaacccc tggattattt cattagtgaa gttgttacat
960gccattgagg ttccactttg tgtcatatcc gtcttcaaat acagcgtggc aaactttgat
1020aagcgcctgt cgtcgacgat ctttctgatt ggttttcaaa ttgccagttc gcttgggatt
1080gtgctgcttt caacgccgac tgggatactc tttgaccacg caggctacca gacagttttc
1140ttcgcaattt cgggtattgt ctgcctgatg ttgctatttg gcattttctt cttgagtaaa
1200aaacgcgagc aaatagttat ggaaacgcct gtaccttcag caatatag
1248
26 415 PRT Escherichia coli 26
Met Ala Leu Asn Ile Pro Phe Arg Asn Ala
Tyr Tyr Arg Phe Ala Ser 1 5 10
15 Ser Tyr Ser Phe Leu Phe Phe Ile Ser Trp Ser Leu Trp Trp Ser
Leu 20 25 30 Tyr
Ala Ile Trp Leu Lys Gly His Leu Gly Leu Thr Gly Thr Glu Leu 35
40 45 Gly Thr Leu Tyr Ser Val
Asn Gln Phe Thr Ser Ile Leu Phe Met Met 50 55
60 Phe Tyr Gly Ile Val Gln Asp Lys Leu Gly Leu
Lys Lys Pro Leu Ile 65 70 75
80 Trp Cys Met Ser Phe Ile Leu Val Leu Thr Gly Pro Phe Met Ile Tyr
85 90 95 Val Tyr
Glu Pro Leu Leu Gln Ser Asn Phe Ser Val Gly Leu Ile Leu 100
105 110 Gly Ala Leu Phe Phe Gly Leu
Gly Tyr Leu Ala Gly Cys Gly Leu Leu 115 120
125 Asp Ser Phe Thr Glu Lys Met Ala Arg Asn Phe His
Phe Glu Tyr Gly 130 135 140
Thr Ala Arg Ala Trp Gly Ser Phe Gly Tyr Ala Ile Gly Ala Phe Phe 145
150 155 160 Ala Gly Ile
Phe Phe Ser Ile Ser Pro His Ile Asn Phe Trp Leu Val 165
170 175 Ser Leu Phe Gly Ala Val Phe Met
Met Ile Asn Met Arg Phe Lys Asp 180 185
190 Lys Asp His Gln Cys Val Ala Ala Asp Ala Gly Gly Val
Lys Lys Glu 195 200 205
Asp Phe Ile Ala Val Phe Lys Asp Arg Asn Phe Trp Val Phe Val Ile 210
215 220 Phe Ile Val Gly
Thr Trp Ser Phe Tyr Asn Ile Phe Asp Gln Gln Leu 225 230
235 240 Phe Pro Val Phe Tyr Ser Gly Leu Phe
Glu Ser His Asp Val Gly Thr 245 250
255 Arg Leu Tyr Gly Tyr Leu Asn Ser Phe Gln Val Val Leu Glu
Ala Leu 260 265 270
Cys Met Ala Ile Ile Pro Phe Phe Val Asn Arg Val Gly Pro Lys Asn
275 280 285 Ala Leu Leu Ile
Gly Val Val Ile Met Ala Leu Arg Ile Leu Ser Cys 290
295 300 Ala Leu Phe Val Asn Pro Trp Ile
Ile Ser Leu Val Lys Leu Leu His 305 310
315 320 Ala Ile Glu Val Pro Leu Cys Val Ile Ser Val Phe
Lys Tyr Ser Val 325 330
335 Ala Asn Phe Asp Lys Arg Leu Ser Ser Thr Ile Phe Leu Ile Gly Phe
340 345 350 Gln Ile Ala
Ser Ser Leu Gly Ile Val Leu Leu Ser Thr Pro Thr Gly 355
360 365 Ile Leu Phe Asp His Ala Gly Tyr
Gln Thr Val Phe Phe Ala Ile Ser 370 375
380 Gly Ile Val Cys Leu Met Leu Leu Phe Gly Ile Phe Phe
Leu Ser Lys 385 390 395
400 Lys Arg Glu Gln Ile Val Met Glu Thr Pro Val Pro Ser Ala Ile
405 410 415
27 1434 DNA Escherichia coli 27
atgacgcaat ctcgattgca tgcggcgcaa aacgcactag caaaacttca cgagcgccga
60ggtaacactt tctatcccca ttttcacctc gcgcctcctg ccgggtggat gaacgatcca
120aacggcctga tctggtttaa cgatcgttat cacgcgtttt atcaacatca cccgatgagc
180gaacactggg ggccaatgca ctggggacat gccaccagcg acgatatgat ccactggcag
240catgagccta ttgcgctagc gccaggagac gagaatgaca aagacgggtg tttttcaggt
300agtgctgtcg atgacaatgg tgtcctctca cttatctaca ccggacacgt ctggctcgat
360ggtgcaggta atgacgatgc aattcgcgaa gtacaatgtc tggctaccag tcgggatggt
420attcatttcg agaaacaggg tgtgatcctc actccaccag aaggcatcat gcacttccgc
480gatcctaaag tgtggcgtga agccgacaca tggtggatgg tagtcggggc gaaagaccca
540ggcaacacgg ggcagatcct gctttatcgc ggcagttcat tgcgtgaatg gactttcgat
600cgcgtactgg cccacgctga tgcgggtgaa agctatatgt gggaatgtcc ggactttttc
660agccttggcg atcagcatta tctgatgttt tccccgcagg gaatgaatgc cgagggatac
720agttatcgaa atcgctttca aagtggcgta atacccggaa tgtggtcgcc aggacgactt
780tttgcacaat ccgggcattt tactgaactt gataacgggc atgactttta tgcaccacaa
840agctttgtag cgaaggatgg tcggcgtatt gttatcggct ggatggatat gtgggaatcg
900ccaatgccct caaaacgtga aggctgggca ggctgcatga cgctggcgcg cgagctatca
960gagagcaatg gcaaactcct acaacgcccg gtacacgaag ctgagtcgtt acgccagcag
1020catcaatcta tctctccccg cacaatcagc aataaatatg ttttgcagga aaacgcgcaa
1080gcagttgaga ttcagttgca gtgggcgctg aagaacagtg atgccgaaca ttacggatta
1140cagctcggcg ctggaatgcg gctgtatatt gataaccaat ctgagcgact tgttttgtgg
1200cggtattacc cacacgagaa tttagatggc taccgtagta ttcccctccc gcagggtgac
1260atgctcgccc taaggatatt tatcgataca tcatccgtgg aagtatttat taacgacggg
1320gaggcggtga tgagtagccg aatatatccg cagccagaag aacgggaact gtcgctctat
1380gcctcccacg gagtggctgt gctgcaacat ggagcactct ggcaactggg ttaa
1434
28 477 PRT Escherichia coli 28
Met Thr Gln Ser Arg Leu His Ala Ala Gln
Asn Ala Leu Ala Lys Leu 1 5 10
15 His Glu Arg Arg Gly Asn Thr Phe Tyr Pro His Phe His Leu Ala
Pro 20 25 30 Pro
Ala Gly Trp Met Asn Asp Pro Asn Gly Leu Ile Trp Phe Asn Asp 35
40 45 Arg Tyr His Ala Phe Tyr
Gln His His Pro Met Ser Glu His Trp Gly 50 55
60 Pro Met His Trp Gly His Ala Thr Ser Asp Asp
Met Ile His Trp Gln 65 70 75
80 His Glu Pro Ile Ala Leu Ala Pro Gly Asp Glu Asn Asp Lys Asp Gly
85 90 95 Cys Phe
Ser Gly Ser Ala Val Asp Asp Asn Gly Val Leu Ser Leu Ile 100
105 110 Tyr Thr Gly His Val Trp Leu
Asp Gly Ala Gly Asn Asp Asp Ala Ile 115 120
125 Arg Glu Val Gln Cys Leu Ala Thr Ser Arg Asp Gly
Ile His Phe Glu 130 135 140
Lys Gln Gly Val Ile Leu Thr Pro Pro Glu Gly Ile Met His Phe Arg 145
150 155 160 Asp Pro Lys
Val Trp Arg Glu Ala Asp Thr Trp Trp Met Val Val Gly 165
170 175 Ala Lys Asp Pro Gly Asn Thr Gly
Gln Ile Leu Leu Tyr Arg Gly Ser 180 185
190 Ser Leu Arg Glu Trp Thr Phe Asp Arg Val Leu Ala His
Ala Asp Ala 195 200 205
Gly Glu Ser Tyr Met Trp Glu Cys Pro Asp Phe Phe Ser Leu Gly Asp 210
215 220 Gln His Tyr Leu
Met Phe Ser Pro Gln Gly Met Asn Ala Glu Gly Tyr 225 230
235 240 Ser Tyr Arg Asn Arg Phe Gln Ser Gly
Val Ile Pro Gly Met Trp Ser 245 250
255 Pro Gly Arg Leu Phe Ala Gln Ser Gly His Phe Thr Glu Leu
Asp Asn 260 265 270
Gly His Asp Phe Tyr Ala Pro Gln Ser Phe Val Ala Lys Asp Gly Arg
275 280 285 Arg Ile Val Ile
Gly Trp Met Asp Met Trp Glu Ser Pro Met Pro Ser 290
295 300 Lys Arg Glu Gly Trp Ala Gly Cys
Met Thr Leu Ala Arg Glu Leu Ser 305 310
315 320 Glu Ser Asn Gly Lys Leu Leu Gln Arg Pro Val His
Glu Ala Glu Ser 325 330
335 Leu Arg Gln Gln His Gln Ser Ile Ser Pro Arg Thr Ile Ser Asn Lys
340 345 350 Tyr Val Leu
Gln Glu Asn Ala Gln Ala Val Glu Ile Gln Leu Gln Trp 355
360 365 Ala Leu Lys Asn Ser Asp Ala Glu
His Tyr Gly Leu Gln Leu Gly Ala 370 375
380 Gly Met Arg Leu Tyr Ile Asp Asn Gln Ser Glu Arg Leu
Val Leu Trp 385 390 395
400 Arg Tyr Tyr Pro His Glu Asn Leu Asp Gly Tyr Arg Ser Ile Pro Leu
405 410 415 Pro Gln Gly Asp
Met Leu Ala Leu Arg Ile Phe Ile Asp Thr Ser Ser 420
425 430 Val Glu Val Phe Ile Asn Asp Gly Glu
Ala Val Met Ser Ser Arg Ile 435 440
445 Tyr Pro Gln Pro Glu Glu Arg Glu Leu Ser Leu Tyr Ala Ser
His Gly 450 455 460
Val Ala Val Leu Gln His Gly Ala Leu Trp Gln Leu Gly 465
470 475
29 1434 DNA Escherichia coli 29
atgacgcaat
ctcgattgca tgcggcgcaa aacgccctag caaaacttca tgagcaccgg 60ggtaacactt
tctatcccca ttttcacctc gcgcctcctg ccgggtggat gaacgatcca 120aacggcctga
tctggtttaa cgatcgttat cacgcgtttt atcaacatca tccgatgagc 180gaacactggg
ggccaatgca ctggggacat gccaccagcg acgatatgat ccactggcag 240catgagccta
ttgcgctagc gccaggagac gataatgaca aagacgggtg tttttcaggt 300agtgctgtcg
atgacaatgg tgtcctctca cttatctaca ccggacacgt ctggctcgat 360ggtgcaggta
atgacgatgc aattcgcgaa gtacaatgtc tggctaccag tcgggatggt 420attcatttcg
agaaacaggg tgtgatcctc actccaccag aaggaatcat gcacttccgc 480gatcctaaag
tgtggcgtga agccgacaca tggtggatgg tagtcggggc gaaagatcca 540ggcaacacgg
ggcagatcct gctttatcgc ggcagttcgt tgcgtgaatg gaccttcgat 600cgcgtactgg
cccacgctga tgcgggtgaa agctatatgt gggaatgtcc ggactttttc 660agccttggcg
atcagcatta tctgatgttt tccccgcagg gaatgaatgc cgagggatac 720agttaccgaa
atcgctttca aagtggcgta atacccggaa tgtggtcgcc aggacgactt 780tttgcacaat
ccgggcattt tactgaactt gataacgggc atgactttta tgcaccacaa 840agctttttag
cgaaggatgg tcggcgtatt gttatcggct ggatggatat gtgggaatcg 900ccaatgccct
caaaacgtga aggatgggca ggctgcatga cgctggcgcg cgagctatca 960gagagcaatg
gcaaacttct acaacgcccg gtacacgaag ctgagtcgtt acgccagcag 1020catcaatctg
tctctccccg cacaatcagc aataaatatg ttttgcagga aaacgcgcaa 1080gcagttgaga
ttcagttgca gtgggcgctg aagaacagtg atgccgaaca ttacggatta 1140cagctcggca
ctggaatgcg gctgtatatt gataaccaat ctgagcgact tgttttgtgg 1200cggtattacc
cacacgagaa tttagacggc taccgtagta ttcccctccc gcagcgtgac 1260acgctcgccc
taaggatatt tatcgataca tcatccgtgg aagtatttat taacgacggg 1320gaagcggtga
tgagtagtcg aatctatccg cagccagaag aacgggaact gtcgctttat 1380gcctcccacg
gagtggctgt gctgcaacat ggagcactct ggctactggg ttaa
1434
30 477 PRT Escherichia coli 30
Met Thr Gln Ser Arg Leu His Ala Ala Gln
Asn Ala Leu Ala Lys Leu 1 5 10
15 His Glu His Arg Gly Asn Thr Phe Tyr Pro His Phe His Leu Ala
Pro 20 25 30 Pro
Ala Gly Trp Met Asn Asp Pro Asn Gly Leu Ile Trp Phe Asn Asp 35
40 45 Arg Tyr His Ala Phe Tyr
Gln His His Pro Met Ser Glu His Trp Gly 50 55
60 Pro Met His Trp Gly His Ala Thr Ser Asp Asp
Met Ile His Trp Gln 65 70 75
80 His Glu Pro Ile Ala Leu Ala Pro Gly Asp Asp Asn Asp Lys Asp Gly
85 90 95 Cys Phe
Ser Gly Ser Ala Val Asp Asp Asn Gly Val Leu Ser Leu Ile 100
105 110 Tyr Thr Gly His Val Trp Leu
Asp Gly Ala Gly Asn Asp Asp Ala Ile 115 120
125 Arg Glu Val Gln Cys Leu Ala Thr Ser Arg Asp Gly
Ile His Phe Glu 130 135 140
Lys Gln Gly Val Ile Leu Thr Pro Pro Glu Gly Ile Met His Phe Arg 145
150 155 160 Asp Pro Lys
Val Trp Arg Glu Ala Asp Thr Trp Trp Met Val Val Gly 165
170 175 Ala Lys Asp Pro Gly Asn Thr Gly
Gln Ile Leu Leu Tyr Arg Gly Ser 180 185
190 Ser Leu Arg Glu Trp Thr Phe Asp Arg Val Leu Ala His
Ala Asp Ala 195 200 205
Gly Glu Ser Tyr Met Trp Glu Cys Pro Asp Phe Phe Ser Leu Gly Asp 210
215 220 Gln His Tyr Leu
Met Phe Ser Pro Gln Gly Met Asn Ala Glu Gly Tyr 225 230
235 240 Ser Tyr Arg Asn Arg Phe Gln Ser Gly
Val Ile Pro Gly Met Trp Ser 245 250
255 Pro Gly Arg Leu Phe Ala Gln Ser Gly His Phe Thr Glu Leu
Asp Asn 260 265 270
Gly His Asp Phe Tyr Ala Pro Gln Ser Phe Leu Ala Lys Asp Gly Arg
275 280 285 Arg Ile Val Ile
Gly Trp Met Asp Met Trp Glu Ser Pro Met Pro Ser 290
295 300 Lys Arg Glu Gly Trp Ala Gly Cys
Met Thr Leu Ala Arg Glu Leu Ser 305 310
315 320 Glu Ser Asn Gly Lys Leu Leu Gln Arg Pro Val His
Glu Ala Glu Ser 325 330
335 Leu Arg Gln Gln His Gln Ser Val Ser Pro Arg Thr Ile Ser Asn Lys
340 345 350 Tyr Val Leu
Gln Glu Asn Ala Gln Ala Val Glu Ile Gln Leu Gln Trp 355
360 365 Ala Leu Lys Asn Ser Asp Ala Glu
His Tyr Gly Leu Gln Leu Gly Thr 370 375
380 Gly Met Arg Leu Tyr Ile Asp Asn Gln Ser Glu Arg Leu
Val Leu Trp 385 390 395
400 Arg Tyr Tyr Pro His Glu Asn Leu Asp Gly Tyr Arg Ser Ile Pro Leu
405 410 415 Pro Gln Arg Asp
Thr Leu Ala Leu Arg Ile Phe Ile Asp Thr Ser Ser 420
425 430 Val Glu Val Phe Ile Asn Asp Gly Glu
Ala Val Met Ser Ser Arg Ile 435 440
445 Tyr Pro Gln Pro Glu Glu Arg Glu Leu Ser Leu Tyr Ala Ser
His Gly 450 455 460
Val Ala Val Leu Gln His Gly Ala Leu Trp Leu Leu Gly 465
470 475
31 1599 DNA Bifidobacterium lactis 31
atggcaaccc ttcccaccaa tattcccgcc aacggcattc tgacccccga cccggcgctc
60gaccctgtgc tcacgccgat ctcggaccat gccgagcagc tgtcactcgc cgaagcaggc
120gtgtcggcac tggaaaccac ccgcaacgac cgctggtacc cgaagttcca cattgcctcc
180aatggcgggt ggatcaacga cccgaacggc ctgtgccgct acaacggacg ctggcacgtg
240ttctaccagc tgcatcccca cggcacacag tggggcccga tgcattgggg ccacgtctcc
300tccgacaaca tggtcgactg gcaccgcgaa cccatcgcct tcgcgccaag cctcgaacag
360gaacgccacg gtgtgttctc cggttccgcc gtgattggcg acgacggcaa gccgtggatt
420ttctacaccg gccaccgctg ggccaacggc aaggacaaca ccggaggcga ctggcaggtg
480cagatgctcg ccaagccgaa cgacgacgaa ctgaagacct tcacgaagga gggcatgatc
540atcgactgcc ccaccgacga ggtggaccac cacttccgcg acccgaaggt gtggaagacc
600ggtgacacct ggtatatgac cttcggtgtc tcgtcgaagg agcatcgtgg ccagatgtgg
660ctgtacacgt cgagcgacat ggtgcactgg agcttcgatc gggtgctgtt cgagcatccg
720gatccgaacg tgttcatgct tgaatgcccc gatttcttcc cgatccgcga tgcgcggggc
780aacgagaaat gggtcatcgg cttctccgcg atgggtgcca agccaaatgg cttcatgaac
840cgcaacgtga acaatgccgg ctacatggtg ggcacatgga agccaggcga gagcttcaag
900ccggagaccg agttccgcct gtgggacgaa ggccataact tctatgcacc acagtcgttc
960aacaccgaag ggcgccagat catgtacggc tggatgagcc cgttcgtcgc ccccatcccg
1020atggaggagg acggctggtg cggcaacctc accctccccc gcgagatcac gctgggcgat
1080gacggtgacc tggtcaccgc ccccaccatc gaaatggagg ggctgcgcga gaataccata
1140ggcttcgact cgctcgacct tggtacgaac cagacctcca cgatcctcga cgatgacggc
1200ggcgccctgg aaatcgagat gagactcgat ctgaacaaaa ccaccgccga acgcgccgga
1260ctgcatgtgc atgccacaag cgacggccac tacacggcaa tcgtattcga cgcgcagatc
1320ggcggcgtcg tcatcgaccg gcagaacgtg gcgaacggag acaaaggcta ccgggtggcc
1380aagctcagcg acaccgagct cgcagccgat acgcttgact tgcgcgtgtt catcgaccgc
1440ggatgcgtcg aggtctacgt cgacggcggc aagcatgcga tgagctcgta ctcgttccct
1500ggcgatggcg cacgcgccgt cgaactcgtg agcgaatccg gcaccacgca catcgacacc
1560ctcaccatgc actcgctcaa gtccatcgga ctcgagtga
1599
32 532 PRT Bifidobacterium lactis 32
Met Ala Thr Leu Pro Thr Asn Ile Pro
Ala Asn Gly Ile Leu Thr Pro 1 5 10
15 Asp Pro Ala Leu Asp Pro Val Leu Thr Pro Ile Ser Asp His
Ala Glu 20 25 30
Gln Leu Ser Leu Ala Glu Ala Gly Val Ser Ala Leu Glu Thr Thr Arg
35 40 45 Asn Asp Arg Trp
Tyr Pro Lys Phe His Ile Ala Ser Asn Gly Gly Trp 50
55 60 Ile Asn Asp Pro Asn Gly Leu Cys
Arg Tyr Asn Gly Arg Trp His Val 65 70
75 80 Phe Tyr Gln Leu His Pro His Gly Thr Gln Trp Gly
Pro Met His Trp 85 90
95 Gly His Val Ser Ser Asp Asn Met Val Asp Trp His Arg Glu Pro Ile
100 105 110 Ala Phe Ala
Pro Ser Leu Glu Gln Glu Arg His Gly Val Phe Ser Gly 115
120 125 Ser Ala Val Ile Gly Asp Asp Gly
Lys Pro Trp Ile Phe Tyr Thr Gly 130 135
140 His Arg Trp Ala Asn Gly Lys Asp Asn Thr Gly Gly Asp
Trp Gln Val 145 150 155
160 Gln Met Leu Ala Lys Pro Asn Asp Asp Glu Leu Lys Thr Phe Thr Lys
165 170 175 Glu Gly Met Ile
Ile Asp Cys Pro Thr Asp Glu Val Asp His His Phe 180
185 190 Arg Asp Pro Lys Val Trp Lys Thr Gly
Asp Thr Trp Tyr Met Thr Phe 195 200
205 Gly Val Ser Ser Lys Glu His Arg Gly Gln Met Trp Leu Tyr
Thr Ser 210 215 220
Ser Asp Met Val His Trp Ser Phe Asp Arg Val Leu Phe Glu His Pro 225
230 235 240 Asp Pro Asn Val Phe
Met Leu Glu Cys Pro Asp Phe Phe Pro Ile Arg 245
250 255 Asp Ala Arg Gly Asn Glu Lys Trp Val Ile
Gly Phe Ser Ala Met Gly 260 265
270 Ala Lys Pro Asn Gly Phe Met Asn Arg Asn Val Asn Asn Ala Gly
Tyr 275 280 285 Met
Val Gly Thr Trp Lys Pro Gly Glu Ser Phe Lys Pro Glu Thr Glu 290
295 300 Phe Arg Leu Trp Asp Glu
Gly His Asn Phe Tyr Ala Pro Gln Ser Phe 305 310
315 320 Asn Thr Glu Gly Arg Gln Ile Met Tyr Gly Trp
Met Ser Pro Phe Val 325 330
335 Ala Pro Ile Pro Met Glu Glu Asp Gly Trp Cys Gly Asn Leu Thr Leu
340 345 350 Pro Arg
Glu Ile Thr Leu Gly Asp Asp Gly Asp Leu Val Thr Ala Pro 355
360 365 Thr Ile Glu Met Glu Gly Leu
Arg Glu Asn Thr Ile Gly Phe Asp Ser 370 375
380 Leu Asp Leu Gly Thr Asn Gln Thr Ser Thr Ile Leu
Asp Asp Asp Gly 385 390 395
400 Gly Ala Leu Glu Ile Glu Met Arg Leu Asp Leu Asn Lys Thr Thr Ala
405 410 415 Glu Arg Ala
Gly Leu His Val His Ala Thr Ser Asp Gly His Tyr Thr 420
425 430 Ala Ile Val Phe Asp Ala Gln Ile
Gly Gly Val Val Ile Asp Arg Gln 435 440
445 Asn Val Ala Asn Gly Asp Lys Gly Tyr Arg Val Ala Lys
Leu Ser Asp 450 455 460
Thr Glu Leu Ala Ala Asp Thr Leu Asp Leu Arg Val Phe Ile Asp Arg 465
470 475 480 Gly Cys Val Glu
Val Tyr Val Asp Gly Gly Lys His Ala Met Ser Ser 485
490 495 Tyr Ser Phe Pro Gly Asp Gly Ala Arg
Ala Val Glu Leu Val Ser Glu 500 505
510 Ser Gly Thr Thr His Ile Asp Thr Leu Thr Met His Ser Leu
Lys Ser 515 520 525
Ile Gly Leu Glu 530
33 1599 DNA Saccharomyces cerevisiae 33
atgcttttgc aagctttcct tttccttttg gctggttttg cagccaaaat atctgcatca
60atgacaaacg aaactagcga tagacctttg gtccacttca cacccaacaa gggctggatg
120aatgacccaa atgggttgtg gtacgatgaa aaagatgcca aatggcatct gtactttcaa
180tacaacccaa atgacaccgt atggggtacg ccattgtttt ggggccatgc tacttccgat
240gatttgacta attgggaaga tcaacccatt gctatcgctc ccaagcgtaa cgattcaggt
300gctttctctg gctccatggt ggttgattac aacaacacga gtgggttttt caatgatact
360attgatccaa gacaaagatg cgttgcgatt tggacttata acactcctga aagtgaagag
420caatacatta gctattctct tgatggtggt tacactttta ctgaatacca aaagaaccct
480gttttagctg ccaactccac tcaattcaga gatccaaagg tgttctggta tgaaccttct
540caaaaatgga ttatgacggc tgccaaatca caagactaca aaattgaaat ttactcctct
600gatgacttga agtcctggaa gctagaatct gcatttgcca atgaaggttt cttaggctac
660caatacgaat gtccaggttt gattgaagtc ccaactgagc aagatccttc caaatcttat
720tgggtcatgt ttatttctat caacccaggt gcacctgctg gcggttcctt caaccaatat
780tttgttggat ccttcaatgg tactcatttt gaagcgtttg acaatcaatc tagagtggta
840gattttggta aggactacta tgccttgcaa actttcttca acactgaccc aacctacggt
900tcagcattag gtattgcctg ggcttcaaac tgggagtaca gtgcctttgt cccaactaac
960ccatggagat catccatgtc tttggtccgc aagttttctt tgaacactga atatcaagct
1020aatccagaga ctgaattgat caatttgaaa gccgaaccaa tattgaacat tagtaatgct
1080ggtccctggt ctcgttttgc tactaacaca actctaacta aggccaattc ttacaatgtc
1140gatttgagca actcgactgg taccctagag tttgagttgg tttacgctgt taacaccaca
1200caaaccatat ccaaatccgt ctttgccgac ttatcacttt ggttcaaggg tttagaagat
1260cctgaagaat atttgagaat gggttttgaa gtcagtgctt cttccttctt tttggaccgt
1320ggtaactcta aggtcaagtt tgtcaaggag aacccatatt tcacaaacag aatgtctgtc
1380aacaaccaac cattcaagtc tgagaacgac ctaagttact ataaagtgta cggcctactg
1440gatcaaaaca tcttggaatt gtacttcaac gatggagatg tggtttctac aaatacctac
1500ttcatgacca ccggtaacgc tctaggatct gtgaacatga ccactggtgt cgataatttg
1560ttctacattg acaagttcca agtaagggaa gtaaaatag
1599
34 532 PRT Saccharomyces cerevisiae 34
Met Leu Leu Gln Ala Phe Leu Phe
Leu Leu Ala Gly Phe Ala Ala Lys 1 5 10
15 Ile Ser Ala Ser Met Thr Asn Glu Thr Ser Asp Arg Pro
Leu Val His 20 25 30
Phe Thr Pro Asn Lys Gly Trp Met Asn Asp Pro Asn Gly Leu Trp Tyr
35 40 45 Asp Glu Lys Asp
Ala Lys Trp His Leu Tyr Phe Gln Tyr Asn Pro Asn 50
55 60 Asp Thr Val Trp Gly Thr Pro Leu
Phe Trp Gly His Ala Thr Ser Asp 65 70
75 80 Asp Leu Thr Asn Trp Glu Asp Gln Pro Ile Ala Ile
Ala Pro Lys Arg 85 90
95 Asn Asp Ser Gly Ala Phe Ser Gly Ser Met Val Val Asp Tyr Asn Asn
100 105 110 Thr Ser Gly
Phe Phe Asn Asp Thr Ile Asp Pro Arg Gln Arg Cys Val 115
120 125 Ala Ile Trp Thr Tyr Asn Thr Pro
Glu Ser Glu Glu Gln Tyr Ile Ser 130 135
140 Tyr Ser Leu Asp Gly Gly Tyr Thr Phe Thr Glu Tyr Gln
Lys Asn Pro 145 150 155
160 Val Leu Ala Ala Asn Ser Thr Gln Phe Arg Asp Pro Lys Val Phe Trp
165 170 175 Tyr Glu Pro Ser
Gln Lys Trp Ile Met Thr Ala Ala Lys Ser Gln Asp 180
185 190 Tyr Lys Ile Glu Ile Tyr Ser Ser Asp
Asp Leu Lys Ser Trp Lys Leu 195 200
205 Glu Ser Ala Phe Ala Asn Glu Gly Phe Leu Gly Tyr Gln Tyr
Glu Cys 210 215 220
Pro Gly Leu Ile Glu Val Pro Thr Glu Gln Asp Pro Ser Lys Ser Tyr 225
230 235 240 Trp Val Met Phe Ile
Ser Ile Asn Pro Gly Ala Pro Ala Gly Gly Ser 245
250 255 Phe Asn Gln Tyr Phe Val Gly Ser Phe Asn
Gly Thr His Phe Glu Ala 260 265
270 Phe Asp Asn Gln Ser Arg Val Val Asp Phe Gly Lys Asp Tyr Tyr
Ala 275 280 285 Leu
Gln Thr Phe Phe Asn Thr Asp Pro Thr Tyr Gly Ser Ala Leu Gly 290
295 300 Ile Ala Trp Ala Ser Asn
Trp Glu Tyr Ser Ala Phe Val Pro Thr Asn 305 310
315 320 Pro Trp Arg Ser Ser Met Ser Leu Val Arg Lys
Phe Ser Leu Asn Thr 325 330
335 Glu Tyr Gln Ala Asn Pro Glu Thr Glu Leu Ile Asn Leu Lys Ala Glu
340 345 350 Pro Ile
Leu Asn Ile Ser Asn Ala Gly Pro Trp Ser Arg Phe Ala Thr 355
360 365 Asn Thr Thr Leu Thr Lys Ala
Asn Ser Tyr Asn Val Asp Leu Ser Asn 370 375
380 Ser Thr Gly Thr Leu Glu Phe Glu Leu Val Tyr Ala
Val Asn Thr Thr 385 390 395
400 Gln Thr Ile Ser Lys Ser Val Phe Ala Asp Leu Ser Leu Trp Phe Lys
405 410 415 Gly Leu Glu
Asp Pro Glu Glu Tyr Leu Arg Met Gly Phe Glu Val Ser 420
425 430 Ala Ser Ser Phe Phe Leu Asp Arg
Gly Asn Ser Lys Val Lys Phe Val 435 440
445 Lys Glu Asn Pro Tyr Phe Thr Asn Arg Met Ser Val Asn
Asn Gln Pro 450 455 460
Phe Lys Ser Glu Asn Asp Leu Ser Tyr Tyr Lys Val Tyr Gly Leu Leu 465
470 475 480 Asp Gln Asn Ile
Leu Glu Leu Tyr Phe Asn Asp Gly Asp Val Val Ser 485
490 495 Thr Asn Thr Tyr Phe Met Thr Thr Gly
Asn Ala Leu Gly Ser Val Asn 500 505
510 Met Thr Thr Gly Val Asp Asn Leu Phe Tyr Ile Asp Lys Phe
Gln Val 515 520 525
Arg Glu Val Lys 530
35 1302 DNA Corynebacterium glutamicum 35
gtgtgtgggg ctatgcacac agaactttcc agtttgcgcc ctgcgtacca tgtgactcct
60ccgcagggca ggctcaatga tcccaacgga atgtacgtcg atggcgatac cctccacgtc
120tactaccagc acgatccagg tttccccttc gcaccaaagc gcaccggctg ggctcacacc
180accacgccgt tgaccggacc gcagcgattg cagtggacgc acctgcccga cgctctttac
240ccggatgcat cctatgacct ggatggatgc tattccggtg gagccgtatt tactgacggc
300acacttaaac ttttctacac cggcaaccta aaaattgacg gcaagcgccg cgccacccaa
360aacctcgtcg aagtcgagga cccaactggg ctgatgggcg gcattcatcg ccgttcgcct
420aaaaatccgc ttatcgacgg acccgccagc ggtttcacac cccattaccg cgatcccatg
480atcagccctg atggtgatgg ttggaaaatg gttcttgggg cccaacgcga aaacctcacc
540ggtgcagcgg ttctataccg ctcgacagat cttgaaaact gggaattctc cggtgaaatc
600acctttgacc tcagtgatgc acaacctggt tctgctcctg atctcgttcc cggtggctac
660atgtgggaat gccccaacct ttttacgctt cgcgatgaag aaactggcga agatctcgac
720gtgctgattt tctgtccaca aggattggac cgaatccacg atgaggttac tcactacgca
780agctctgacc agtgcggata tgtcgtcggc aagcttgaag gaacgacctt ccgcgtcttg
840cgaggattca gcgagctgga tttcggccat gaattctacg caccgcaggt tgcagtaaac
900ggttctgatg cctggctcgt gggctggatg gggctgcccg cgcaggatga tcacccaaca
960gttgcacggg aaggatgggt gcactgcctg actgtgcccc gcaagcttca tttgcgcaac
1020cacgcgatct atcaagagct tcttctccca gagggggagt caggggtaat cagatctgta
1080ttaggttctg aacctgtccg agtagacatc cgaggcaata tttccctcga gtgggatggt
1140gtccgtttgt ctgtggatcg tggtggtgat cgtcgcgtag ctgaggtaaa acctggcgaa
1200ttagtgatcg cggacgataa tacagccatt gagataactg caggtgatgg acaggtttca
1260ttcgctttcc gggctttcaa aggtgacact attgagagat aa
1302
36 433 PRT Corynebacterium glutamicum
36 Met Cys Gly Ala Met His Thr Glu
Leu Ser Ser Leu Arg Pro Ala Tyr 1 5 10
15 His Val Thr Pro Pro Gln Gly Arg Leu Asn Asp Pro Asn
Gly Met Tyr 20 25 30
Val Asp Gly Asp Thr Leu His Val Tyr Tyr Gln His Asp Pro Gly Phe
35 40 45 Pro Phe Ala Pro
Lys Arg Thr Gly Trp Ala His Thr Thr Thr Pro Leu 50
55 60 Thr Gly Pro Gln Arg Leu Gln Trp
Thr His Leu Pro Asp Ala Leu Tyr 65 70
75 80 Pro Asp Ala Ser Tyr Asp Leu Asp Gly Cys Tyr Ser
Gly Gly Ala Val 85 90
95 Phe Thr Asp Gly Thr Leu Lys Leu Phe Tyr Thr Gly Asn Leu Lys Ile
100 105 110 Asp Gly Lys
Arg Arg Ala Thr Gln Asn Leu Val Glu Val Glu Asp Pro 115
120 125 Thr Gly Leu Met Gly Gly Ile His
Arg Arg Ser Pro Lys Asn Pro Leu 130 135
140 Ile Asp Gly Pro Ala Ser Gly Phe Thr Pro His Tyr Arg
Asp Pro Met 145 150 155
160 Ile Ser Pro Asp Gly Asp Gly Trp Lys Met Val Leu Gly Ala Gln Arg
165 170 175 Glu Asn Leu Thr
Gly Ala Ala Val Leu Tyr Arg Ser Thr Asp Leu Glu 180
185 190 Asn Trp Glu Phe Ser Gly Glu Ile Thr
Phe Asp Leu Ser Asp Ala Gln 195 200
205 Pro Gly Ser Ala Pro Asp Leu Val Pro Gly Gly Tyr Met Trp
Glu Cys 210 215 220
Pro Asn Leu Phe Thr Leu Arg Asp Glu Glu Thr Gly Glu Asp Leu Asp 225
230 235 240 Val Leu Ile Phe Cys
Pro Gln Gly Leu Asp Arg Ile His Asp Glu Val 245
250 255 Thr His Tyr Ala Ser Ser Asp Gln Cys Gly
Tyr Val Val Gly Lys Leu 260 265
270 Glu Gly Thr Thr Phe Arg Val Leu Arg Gly Phe Ser Glu Leu Asp
Phe 275 280 285 Gly
His Glu Phe Tyr Ala Pro Gln Val Ala Val Asn Gly Ser Asp Ala 290
295 300 Trp Leu Val Gly Trp Met
Gly Leu Pro Ala Gln Asp Asp His Pro Thr 305 310
315 320 Val Ala Arg Glu Gly Trp Val His Cys Leu Thr
Val Pro Arg Lys Leu 325 330
335 His Leu Arg Asn His Ala Ile Tyr Gln Glu Leu Leu Leu Pro Glu Gly
340 345 350 Glu Ser
Gly Val Ile Arg Ser Val Leu Gly Ser Glu Pro Val Arg Val 355
360 365 Asp Ile Arg Gly Asn Ile Ser
Leu Glu Trp Asp Gly Val Arg Leu Ser 370 375
380 Val Asp Arg Gly Gly Asp Arg Arg Val Ala Glu Val
Lys Pro Gly Glu 385 390 395
400 Leu Val Ile Ala Asp Asp Asn Thr Ala Ile Glu Ile Thr Ala Gly Asp
405 410 415 Gly Gln Val
Ser Phe Ala Phe Arg Ala Phe Lys Gly Asp Thr Ile Glu 420
425 430 Arg
37 1473 DNA Leuconostoc mesenteroides
37 atggaaattc aaaacaaagc aatgttgatc acttatgctg attcgttggg
caaaaactta 60aaagatgttc atcaagtctt gaaagaagat attggagatg cgattggtgg
ggttcatttg 120ttgcctttct tcccttcaac aggtgatcgc ggttttgcgc cagccgatta
tactcgtgtt 180gatgccgcat ttggtgattg ggcagatgtc gaagcattgg gtgaagaata
ctatttgatg 240tttgacttca tgattaacca tatttctcgt gaatcagtga tgtatcaaga
ttttaagaag 300aatcatgacg attcaaagta taaagatttc tttattcgtt gggaaaagtt
ctgggcaaag 360gccggcgaaa accgtccaac acaagccgat gttgacttaa tttacaagcg
taaagataag 420gcaccaacgc aagaaatcac ttttgatgat ggcacaacag aaaacttgtg
gaatactttt 480ggtgaagaac aaattgacat tgatgttaat tcagccattg ccaaggaatt
tattaagaca 540acccttgaag acatggtaaa acatggtgct aacttgattc gtttggatgc
ctttgcgtat 600gcagttaaaa aagttgacac aaatgacttc ttcgttgagc cagaaatctg
ggacactttg 660aatgaagtac gtgaaatttt gacaccatta aaggctgaaa ttttaccaga
aattcatgaa 720cattactcaa tccctaaaaa gatcaatgat catggttact tcacctatga
ctttgcatta 780ccaatgacaa cgctttacac attgtattca ggtaagacaa atcaattggc
aaagtggttg 840aagatgtcac caatgaagca attcacaaca ttggacacgc atgatggtat
tggtgtcgtt 900gatgcccgtg atattctaac tgatgatgaa attgactacg cttctgaaca
actttacaag 960gttggcgcga atgtcaaaaa gacatattca tctgcttcat acaacaacct
tgatatttac 1020caaattaact caacttatta ttcagcattg ggaaatgatg atgcagcata
cttgttgagt 1080cgtgtcttcc aagtctttgc gcctggaatt ccacaaattt attacgttgg
tttgttggca 1140ggtgaaaacg atatcgcgct tttggagtca actaaagaag gtcgtaatat
taaccgtcat 1200tactatacgc gtgaagaagt taagtcagaa gttaagcgac cagttgttgc
taacttattg 1260aagctattgt catggcgtaa tgaaagccct gcatttgatt tggctggctc
aatcacagtt 1320gacacgccaa ctgatacaac aattgtggtg acacgtcaag atgaaaatgg
tcaaaacaaa 1380gctgtattaa cagccgatgc ggccaacaaa acttttgaaa tcgttgagaa
tggtcaaact 1440gttatgagca gtgataattt gactcagaac taa
1473
38 490 PRT Leuconostoc mesenteroides
38 Met Glu Ile Gln Asn
Lys Ala Met Leu Ile Thr Tyr Ala Asp Ser Leu 1 5
10 15 Gly Lys Asn Leu Lys Asp Val His Gln Val
Leu Lys Glu Asp Ile Gly 20 25
30 Asp Ala Ile Gly Gly Val His Leu Leu Pro Phe Phe Pro Ser Thr
Gly 35 40 45 Asp
Arg Gly Phe Ala Pro Ala Asp Tyr Thr Arg Val Asp Ala Ala Phe 50
55 60 Gly Asp Trp Ala Asp Val
Glu Ala Leu Gly Glu Glu Tyr Tyr Leu Met 65 70
75 80 Phe Asp Phe Met Ile Asn His Ile Ser Arg Glu
Ser Val Met Tyr Gln 85 90
95 Asp Phe Lys Lys Asn His Asp Asp Ser Lys Tyr Lys Asp Phe Phe Ile
100 105 110 Arg Trp
Glu Lys Phe Trp Ala Lys Ala Gly Glu Asn Arg Pro Thr Gln 115
120 125 Ala Asp Val Asp Leu Ile Tyr
Lys Arg Lys Asp Lys Ala Pro Thr Gln 130 135
140 Glu Ile Thr Phe Asp Asp Gly Thr Thr Glu Asn Leu
Trp Asn Thr Phe 145 150 155
160 Gly Glu Glu Gln Ile Asp Ile Asp Val Asn Ser Ala Ile Ala Lys Glu
165 170 175 Phe Ile Lys
Thr Thr Leu Glu Asp Met Val Lys His Gly Ala Asn Leu 180
185 190 Ile Arg Leu Asp Ala Phe Ala Tyr
Ala Val Lys Lys Val Asp Thr Asn 195 200
205 Asp Phe Phe Val Glu Pro Glu Ile Trp Asp Thr Leu Asn
Glu Val Arg 210 215 220
Glu Ile Leu Thr Pro Leu Lys Ala Glu Ile Leu Pro Glu Ile His Glu 225
230 235 240 His Tyr Ser Ile
Pro Lys Lys Ile Asn Asp His Gly Tyr Phe Thr Tyr 245
250 255 Asp Phe Ala Leu Pro Met Thr Thr Leu
Tyr Thr Leu Tyr Ser Gly Lys 260 265
270 Thr Asn Gln Leu Ala Lys Trp Leu Lys Met Ser Pro Met Lys
Gln Phe 275 280 285
Thr Thr Leu Asp Thr His Asp Gly Ile Gly Val Val Asp Ala Arg Asp 290
295 300 Ile Leu Thr Asp Asp
Glu Ile Asp Tyr Ala Ser Glu Gln Leu Tyr Lys 305 310
315 320 Val Gly Ala Asn Val Lys Lys Thr Tyr Ser
Ser Ala Ser Tyr Asn Asn 325 330
335 Leu Asp Ile Tyr Gln Ile Asn Ser Thr Tyr Tyr Ser Ala Leu Gly
Asn 340 345 350 Asp
Asp Ala Ala Tyr Leu Leu Ser Arg Val Phe Gln Val Phe Ala Pro 355
360 365 Gly Ile Pro Gln Ile Tyr
Tyr Val Gly Leu Leu Ala Gly Glu Asn Asp 370 375
380 Ile Ala Leu Leu Glu Ser Thr Lys Glu Gly Arg
Asn Ile Asn Arg His 385 390 395
400 Tyr Tyr Thr Arg Glu Glu Val Lys Ser Glu Val Lys Arg Pro Val Val
405 410 415 Ala Asn
Leu Leu Lys Leu Leu Ser Trp Arg Asn Glu Ser Pro Ala Phe 420
425 430 Asp Leu Ala Gly Ser Ile Thr
Val Asp Thr Pro Thr Asp Thr Thr Ile 435 440
445 Val Val Thr Arg Gln Asp Glu Asn Gly Gln Asn Lys
Ala Val Leu Thr 450 455 460
Ala Asp Ala Ala Asn Lys Thr Phe Glu Ile Val Glu Asn Gly Gln Thr 465
470 475 480 Val Met Ser
Ser Asp Asn Leu Thr Gln Asn 485 490
39 1515 DNA Bifidobacterium adolescentis
39 atgaaaaaca aggtgcagct catcacttac
gccgaccgcc ttggcgacgg caccatcaag 60tcgatgaccg acattctgcg cacccgcttc
gacggcgtgt acgacggcgt tcacatcctg 120ccgttcttca ccccgttcga cggcgccgac
gcaggcttcg acccgatcga ccacaccaag 180gtcgacgaac gtctcggcag ctgggacgac
gtcgccgaac tctccaagac ccacaacatc 240atggtcgacg ccatcgtcaa ccacatgagt
tgggaatcca agcagttcca ggacgtgctg 300gccaagggcg aggagtccga atactatccg
atgttcctca ccatgagctc cgtgttcccg 360aacggcgcca ccgaagagga cctggccggc
atctaccgtc cgcgtccggg cctgccgttc 420acccactaca agttcgccgg caagacccgc
ctcgtgtggg tcagcttcac cccgcagcag 480gtggacatcg acaccgattc cgacaagggt
tgggaatacc tcatgtcgat tttcgaccag 540atggccgcct ctcacgtcag ctacatccgc
ctcgacgccg tcggctatgg cgccaaggaa 600gccggcacca gctgcttcat gaccccgaag
accttcaagc tgatctcccg tctgcgtgag 660gaaggcgtca agcgcggtct ggaaatcctc
atcgaagtgc actcctacta caagaagcag 720gtcgaaatcg catccaaggt ggaccgcgtc
tacgacttcg ccctgcctcc gctgctgctg 780cacgcgctga gcaccggcca cgtcgagccc
gtcgcccact ggaccgacat acgcccgaac 840aacgccgtca ccgtgctcga tacgcacgac
ggcatcggcg tgatcgacat cggctccgac 900cagctcgacc gctcgctcaa gggtctcgtg
ccggatgagg acgtggacaa cctcgtcaac 960accatccacg ccaacaccca cggcgaatcc
caggcagcca ctggcgccgc cgcatccaat 1020ctcgacctct accaggtcaa cagcacctac
tattcggcgc tcgggtgcaa cgaccagcac 1080tacatcgccg cccgcgcggt gcagttcttc
ctgccgggcg tgccgcaagt ctactacgtc 1140ggcgcgctcg ccggcaagaa cgacatggag
ctgctgcgta agacgaataa cggccgcgac 1200atcaatcgcc attactactc caccgcggaa
atcgacgaga acctcaagcg tccggtcgtc 1260aaggccctga acgcgctcgc caagttccgc
aacgagctcg acgcgttcga cggcacgttc 1320tcgtacacca ccgatgacga cacgtccatc
agcttcacct ggcgcggcga aaccagccag 1380gccacgctga cgttcgagcc gaagcgcggt
ctcggtgtgg acaacactac gccggtcgcc 1440atgttggaat gggaggattc cgcgggagac
caccgttcgg atgatctgat cgccaatccg 1500cctgtcgtcg cctga
1515
40 504 PRT Bifidobacterium adolescentis
40 Met Lys Asn Lys Val Gln Leu Ile Thr Tyr Ala Asp Arg Leu Gly Asp 1
5 10 15 Gly Thr Ile Lys
Ser Met Thr Asp Ile Leu Arg Thr Arg Phe Asp Gly 20
25 30 Val Tyr Asp Gly Val His Ile Leu Pro
Phe Phe Thr Pro Phe Asp Gly 35 40
45 Ala Asp Ala Gly Phe Asp Pro Ile Asp His Thr Lys Val Asp
Glu Arg 50 55 60
Leu Gly Ser Trp Asp Asp Val Ala Glu Leu Ser Lys Thr His Asn Ile 65
70 75 80 Met Val Asp Ala Ile
Val Asn His Met Ser Trp Glu Ser Lys Gln Phe 85
90 95 Gln Asp Val Leu Ala Lys Gly Glu Glu Ser
Glu Tyr Tyr Pro Met Phe 100 105
110 Leu Thr Met Ser Ser Val Phe Pro Asn Gly Ala Thr Glu Glu Asp
Leu 115 120 125 Ala
Gly Ile Tyr Arg Pro Arg Pro Gly Leu Pro Phe Thr His Tyr Lys 130
135 140 Phe Ala Gly Lys Thr Arg
Leu Val Trp Val Ser Phe Thr Pro Gln Gln 145 150
155 160 Val Asp Ile Asp Thr Asp Ser Asp Lys Gly Trp
Glu Tyr Leu Met Ser 165 170
175 Ile Phe Asp Gln Met Ala Ala Ser His Val Ser Tyr Ile Arg Leu Asp
180 185 190 Ala Val
Gly Tyr Gly Ala Lys Glu Ala Gly Thr Ser Cys Phe Met Thr 195
200 205 Pro Lys Thr Phe Lys Leu Ile
Ser Arg Leu Arg Glu Glu Gly Val Lys 210 215
220 Arg Gly Leu Glu Ile Leu Ile Glu Val His Ser Tyr
Tyr Lys Lys Gln 225 230 235
240 Val Glu Ile Ala Ser Lys Val Asp Arg Val Tyr Asp Phe Ala Leu Pro
245 250 255 Pro Leu Leu
Leu His Ala Leu Ser Thr Gly His Val Glu Pro Val Ala 260
265 270 His Trp Thr Asp Ile Arg Pro Asn
Asn Ala Val Thr Val Leu Asp Thr 275 280
285 His Asp Gly Ile Gly Val Ile Asp Ile Gly Ser Asp Gln
Leu Asp Arg 290 295 300
Ser Leu Lys Gly Leu Val Pro Asp Glu Asp Val Asp Asn Leu Val Asn 305
310 315 320 Thr Ile His Ala
Asn Thr His Gly Glu Ser Gln Ala Ala Thr Gly Ala 325
330 335 Ala Ala Ser Asn Leu Asp Leu Tyr Gln
Val Asn Ser Thr Tyr Tyr Ser 340 345
350 Ala Leu Gly Cys Asn Asp Gln His Tyr Ile Ala Ala Arg Ala
Val Gln 355 360 365
Phe Phe Leu Pro Gly Val Pro Gln Val Tyr Tyr Val Gly Ala Leu Ala 370
375 380 Gly Lys Asn Asp Met
Glu Leu Leu Arg Lys Thr Asn Asn Gly Arg Asp 385 390
395 400 Ile Asn Arg His Tyr Tyr Ser Thr Ala Glu
Ile Asp Glu Asn Leu Lys 405 410
415 Arg Pro Val Val Lys Ala Leu Asn Ala Leu Ala Lys Phe Arg Asn
Glu 420 425 430 Leu
Asp Ala Phe Asp Gly Thr Phe Ser Tyr Thr Thr Asp Asp Asp Thr 435
440 445 Ser Ile Ser Phe Thr Trp
Arg Gly Glu Thr Ser Gln Ala Thr Leu Thr 450 455
460 Phe Glu Pro Lys Arg Gly Leu Gly Val Asp Asn
Thr Thr Pro Val Ala 465 470 475
480 Met Leu Glu Trp Glu Asp Ser Ala Gly Asp His Arg Ser Asp Asp Leu
485 490 495 Ile Ala
Asn Pro Pro Val Val Ala 500
41 927 DNA Agrobacterium tumefaciens
41 atgatcctgt gttgtggtga agccctgatc
gacatgctgc cccggcagac gacgctgggt 60gaggcgggct ttgcccctta cgcaggcgga
gcggtcttca acacggcaat tgcgctgggg 120cgtcttggcg tcccttcagc cttttttacc
ggtctttccg acgacatgat gggcgatatc 180ctgcgggaga ccctgcgggc cagcaaggtg
gatttcagct attgcgccac cctgtcgcgc 240cccaccacca ttgcgttcgt taagctggtt
gatggccatg cgacctacgc tttttacgac 300gagaacaccg ccggccggat gatcaccgag
gccgaacttc cggccttggg agcggattgc 360gaagcgctgc atttcggcgc catcagcctt
attcccgaac cctgcggcag cacctatgag 420gcgctgatga cgcgcgagca tgagacccgc
gtcatctcgc tcgatccgaa cattcgtccc 480ggcttcatcc agaacaagca gtcgcacatg
gcccgcatcc gccgcatggc ggcgatgtct 540gacatcgtca agttctcgga tgaggacctg
gcgtggttcg gtctggaagg cgacgaggac 600acgcttgccc gccactggct gcaccacggt
gcaaaactcg tcgttgtcac ccgtggcgcc 660aagggtgccg tgggttacag cgccaatctc
aaggtggaag tggcctccga gcgcgtcgaa 720gtggtcgata cggtcggcgc cggcgatacg
ttcgatgccg gcattcttgc ttcgctgaaa 780atgcagggcc tgctgaccaa agcgcaggtg
gcttcgctga gcgaagagca gatcagaaaa 840gctttggcgc ttggcgcgaa agccgctgcg
gtcactgtct cgcgggctgg cgcaaatccg 900cctttcgcgc atgaaatcgg tttgtga
927
42 308 PRT Agrobacterium tumefaciens
42 Met Ile Leu Cys Cys Gly Glu Ala Leu Ile Asp Met Leu Pro Arg Gln 1
5 10 15 Thr Thr Leu Gly
Glu Ala Gly Phe Ala Pro Tyr Ala Gly Gly Ala Val 20
25 30 Phe Asn Thr Ala Ile Ala Leu Gly Arg
Leu Gly Val Pro Ser Ala Phe 35 40
45 Phe Thr Gly Leu Ser Asp Asp Met Met Gly Asp Ile Leu Arg
Glu Thr 50 55 60
Leu Arg Ala Ser Lys Val Asp Phe Ser Tyr Cys Ala Thr Leu Ser Arg 65
70 75 80 Pro Thr Thr Ile Ala
Phe Val Lys Leu Val Asp Gly His Ala Thr Tyr 85
90 95 Ala Phe Tyr Asp Glu Asn Thr Ala Gly Arg
Met Ile Thr Glu Ala Glu 100 105
110 Leu Pro Ala Leu Gly Ala Asp Cys Glu Ala Leu His Phe Gly Ala
Ile 115 120 125 Ser
Leu Ile Pro Glu Pro Cys Gly Ser Thr Tyr Glu Ala Leu Met Thr 130
135 140 Arg Glu His Glu Thr Arg
Val Ile Ser Leu Asp Pro Asn Ile Arg Pro 145 150
155 160 Gly Phe Ile Gln Asn Lys Gln Ser His Met Ala
Arg Ile Arg Arg Met 165 170
175 Ala Ala Met Ser Asp Ile Val Lys Phe Ser Asp Glu Asp Leu Ala Trp
180 185 190 Phe Gly
Leu Glu Gly Asp Glu Asp Thr Leu Ala Arg His Trp Leu His 195
200 205 His Gly Ala Lys Leu Val Val
Val Thr Arg Gly Ala Lys Gly Ala Val 210 215
220 Gly Tyr Ser Ala Asn Leu Lys Val Glu Val Ala Ser
Glu Arg Val Glu 225 230 235
240 Val Val Asp Thr Val Gly Ala Gly Asp Thr Phe Asp Ala Gly Ile Leu
245 250 255 Ala Ser Leu
Lys Met Gln Gly Leu Leu Thr Lys Ala Gln Val Ala Ser 260
265 270 Leu Ser Glu Glu Gln Ile Arg Lys
Ala Leu Ala Leu Gly Ala Lys Ala 275 280
285 Ala Ala Val Thr Val Ser Arg Ala Gly Ala Asn Pro Pro
Phe Ala His 290 295 300
Glu Ile Gly Leu 305
43 1404 DNA Streptococcus mutans
43 cagctgatta tgcgtcagtt gaaaccctcg cttcttcagg aactgttgct gtaggtgata
60gcttacttga agttaaaaaa taagaaatat tatcagaaag accgtaaggt ctttttgact
120gcttaaaaga ttcagtaaca atagtattaa agccttttgg ctaactaata cttgaaattt
180agcaaattat gatataatgt taagtagtcc ttaagggtag attaagggta ttcaaatcca
240aaaattgatt tggtaagtta agtaaaatat aagaggttta ttatgtctaa attatatggc
300agcatcgaag ctggcggaac aaaatttgtc tgtgctgtag gtgatgaaaa ttttcaaatt
360ttagaaaaag ttcagttccc aacaacaaca ccttatgaaa caatagaaaa aacagttgct
420ttctttaaaa aatttgaagc tgatttagcc agtgttgcca ttggttcttt tggccctatt
480gatattgatc aaaattcaga cacttatggt tacattactt caacaccaaa gccaaactgg
540gctaacgttg attttgtcgg cttaatttct aaagatttta aaattccatt ttactttacg
600acagatgtta attcttctgc ttatggggaa acaattgctc gttcaaatgt taaaagtctg
660gtttattata ctattggaac aggcattgga gcaggggcta ttcaaaatgg cgaattcatt
720ggcggtatgg gacatacgga agctggacac gtttacatgg ctccgcatcc caatgatgtt
780catcatggtt ttgtaggcac ctgtcctttc cataaaggct gtttagaagg acttgcagcg
840ggtcctagct tagaggctcg tactggtatt cgtggtgagt taattgagca aaactcagaa
900gtttgggata ttcaggcata ctacattgct caggcggcta ttcaagcgac tgtcctttat
960cgtccgcaag tcattgtatt tggcggaggc gttatggcac aagaacatat gctcaatcgg
1020gttcgtgaaa aatttacttc acttttgaat gactatcttc cagttccaga tgttaaagat
1080tatattgtga caccagctgt tgcagaaaat ggttcagcaa cattgggaaa tctcgcttta
1140gctaaaaaga tagcagcgcg ttaattaaaa atgaattgga agattaaagc accttctaat
1200attcaatatt aaactgttag aatttacgtg aacgaaattt tcattttatg aggataatga
1260agtgaatata attactcttg atttcctctg aaactagata gtggtatatt gaaaaacaga
1320aaggagaaca ctatggaagg acctttgttt ttacaatcac aaatgcataa aaaaatctgg
1380ggcggcaatc ggctcagaaa agaa
1404
44 293 PRT Streptococcus mutans
44 Met Ser Lys Leu Tyr Gly Ser Ile Glu
Ala Gly Gly Thr Lys Phe Val 1 5 10
15 Cys Ala Val Gly Asp Glu Asn Phe Gln Ile Leu Glu Lys Val
Gln Phe 20 25 30
Pro Thr Thr Thr Pro Tyr Glu Thr Ile Glu Lys Thr Val Ala Phe Phe
35 40 45 Lys Lys Phe Glu
Ala Asp Leu Ala Ser Val Ala Ile Gly Ser Phe Gly 50
55 60 Pro Ile Asp Ile Asp Gln Asn Ser
Asp Thr Tyr Gly Tyr Ile Thr Ser 65 70
75 80 Thr Pro Lys Pro Asn Trp Ala Asn Val Asp Phe Val
Gly Leu Ile Ser 85 90
95 Lys Asp Phe Lys Ile Pro Phe Tyr Phe Thr Thr Asp Val Asn Ser Ser
100 105 110 Ala Tyr Gly
Glu Thr Ile Ala Arg Ser Asn Val Lys Ser Leu Val Tyr 115
120 125 Tyr Thr Ile Gly Thr Gly Ile Gly
Ala Gly Ala Ile Gln Asn Gly Glu 130 135
140 Phe Ile Gly Gly Met Gly His Thr Glu Ala Gly His Val
Tyr Met Ala 145 150 155
160 Pro His Pro Asn Asp Val His His Gly Phe Val Gly Thr Cys Pro Phe
165 170 175 His Lys Gly Cys
Leu Glu Gly Leu Ala Ala Gly Pro Ser Leu Glu Ala 180
185 190 Arg Thr Gly Ile Arg Gly Glu Leu Ile
Glu Gln Asn Ser Glu Val Trp 195 200
205 Asp Ile Gln Ala Tyr Tyr Ile Ala Gln Ala Ala Ile Gln Ala
Thr Val 210 215 220
Leu Tyr Arg Pro Gln Val Ile Val Phe Gly Gly Gly Val Met Ala Gln 225
230 235 240 Glu His Met Leu Asn
Arg Val Arg Glu Lys Phe Thr Ser Leu Leu Asn 245
250 255 Asp Tyr Leu Pro Val Pro Asp Val Lys Asp
Tyr Ile Val Thr Pro Ala 260 265
270 Val Ala Glu Asn Gly Ser Ala Thr Leu Gly Asn Leu Ala Leu Ala
Lys 275 280 285 Lys
Ile Ala Ala Arg 290
45 915 DNA Escherichia coli
45 atgtcagcca
aagtatgggt tttaggggat gcggtcgtag atctcttgcc agaatcagac 60gggcgcctac
tgccttgtcc tggcggcgcg ccagctaacg ttgcggtggg aatcgccaga 120ttaggcggaa
caagtgggtt tataggtcgg gtgggggatg atccttttgg tgcgttaatg 180caaagaacgc
tgctaactga gggagtcgat atcacgtatc tgaagcaaga tgaatggcac 240cggacatcca
cggtgcttgt cgatctgaac gatcaagggg aacgttcatt tacgtttatg 300gtccgcccca
gtgccgatct ttttttagag acgacagact tgccctgctg gcgacatggc 360gaatggttac
atctctgttc aattgcgttg tctgccgagc cttcgcgtac cagcgcattt 420actgcgatga
cggcgatccg gcatgccgga ggttttgtca gcttcgatcc taatattcgt 480gaagatctat
ggcaagacga gcatttgctc cgcttgtgtt tgcggcaggc gctacaactg 540gcggatgtcg
tcaagctctc ggaagaagaa tggcgactta tcagtggaaa aacacagaac 600gatcaggata
tatgcgccct ggcaaaagag tatgagatcg ccatgctgtt ggtgactaaa 660ggtgcagaag
gggtggtggt ctgttatcga ggacaagttc accattttgc tggaatgtct 720gtgaattgtg
tcgatagcac gggggcggga gatgcgttcg ttgccgggtt actcacaggt 780ctgtcctcta
cgggattatc tacagatgag agagaaatgc gacgaattat cgatctcgct 840caacgttgcg
gagcgcttgc agtaacggcg aaaggggcaa tgacagcgct gccatgtcga 900caagaactgg
aatag
915
46 304 PRT Escherichia coli
46 Met Ser Ala Lys Val Trp Val Leu Gly Asp Ala
Val Val Asp Leu Leu 1 5 10
15 Pro Glu Ser Asp Gly Arg Leu Leu Pro Cys Pro Gly Gly Ala Pro Ala
20 25 30 Asn Val
Ala Val Gly Ile Ala Arg Leu Gly Gly Thr Ser Gly Phe Ile 35
40 45 Gly Arg Val Gly Asp Asp Pro
Phe Gly Ala Leu Met Gln Arg Thr Leu 50 55
60 Leu Thr Glu Gly Val Asp Ile Thr Tyr Leu Lys Gln
Asp Glu Trp His 65 70 75
80 Arg Thr Ser Thr Val Leu Val Asp Leu Asn Asp Gln Gly Glu Arg Ser
85 90 95 Phe Thr Phe
Met Val Arg Pro Ser Ala Asp Leu Phe Leu Glu Thr Thr 100
105 110 Asp Leu Pro Cys Trp Arg His Gly
Glu Trp Leu His Leu Cys Ser Ile 115 120
125 Ala Leu Ser Ala Glu Pro Ser Arg Thr Ser Ala Phe Thr
Ala Met Thr 130 135 140
Ala Ile Arg His Ala Gly Gly Phe Val Ser Phe Asp Pro Asn Ile Arg 145
150 155 160 Glu Asp Leu Trp
Gln Asp Glu His Leu Leu Arg Leu Cys Leu Arg Gln 165
170 175 Ala Leu Gln Leu Ala Asp Val Val Lys
Leu Ser Glu Glu Glu Trp Arg 180 185
190 Leu Ile Ser Gly Lys Thr Gln Asn Asp Gln Asp Ile Cys Ala
Leu Ala 195 200 205
Lys Glu Tyr Glu Ile Ala Met Leu Leu Val Thr Lys Gly Ala Glu Gly 210
215 220 Val Val Val Cys Tyr
Arg Gly Gln Val His His Phe Ala Gly Met Ser 225 230
235 240 Val Asn Cys Val Asp Ser Thr Gly Ala Gly
Asp Ala Phe Val Ala Gly 245 250
255 Leu Leu Thr Gly Leu Ser Ser Thr Gly Leu Ser Thr Asp Glu Arg
Glu 260 265 270 Met
Arg Arg Ile Ile Asp Leu Ala Gln Arg Cys Gly Ala Leu Ala Val 275
280 285 Thr Ala Lys Gly Ala Met
Thr Ala Leu Pro Cys Arg Gln Glu Leu Glu 290 295
300
47 924 DNA Klebsiella pneumoniae
47 atgaatggaa
aaatctgggt actcggcgat gcggtcgtcg atctcctgcc cgatggagag 60ggccgcctgc
tgcaatgccc cggcggcgcg ccggccaacg tggcggtcgg cgtggcgcgg 120ctcggcggtg
acagcgggtt tatcggccgc gtcggcgacg atcccttcgg ccgttttatg 180cgtcacaccc
tggcgcagga gcaagtggat gtgaactata tgcgcctcga tgcggcgcag 240cgcacctcca
cggtggtggt cgatctcgat agccacgggg agcgcacctt tacctttatg 300gtccgtccga
gcgccgacct gttccttcag cccgaggatc tcccgccgtt tgccgccggt 360cagtggctgc
acgtctgctc catcgctctc agcgcggagc cgagccgcag cacgacattc 420gcggcgatgg
aggcgataaa gcgcgccggg ggctatgtca gcttcgaccc caatatccgc 480agcgacctgt
ggcaggatcc gcaggacctt cgcgactgtc tcgaccgggc gctggccctc 540gccgacgcca
taaaactttc ggaagaggag ctggcgttta tcagcggcag cgacgacatc 600gtcagcggca
ccgcccggct gaacgcccgc ttccagccga cgctactgct ggtgacccag 660ggtaaagcgg
gggtccaggc cgccctgcgc gggcaggtta gccacttccc tgcccgcccg 720gtggtggccg
tcgataccac cggcgccggc gatgcctttg tcgccgggct actcgccggc 780ctcgccgccc
acggtatccc ggacaacctc gcagccctgg ctcccgacct cgcgctggcg 840caaacctgcg
gcgccctggc caccaccgcc aaaggcgcca tgaccgccct gccctacagg 900gacgatcttc
agcgctcgct gtga
924
48 307 PRT Klebsiella pneumoniae
48 Met Asn Gly Lys Ile Trp Val Leu Gly
Asp Ala Val Val Asp Leu Leu 1 5 10
15 Pro Asp Gly Glu Gly Arg Leu Leu Gln Cys Pro Gly Gly Ala
Pro Ala 20 25 30
Asn Val Ala Val Gly Val Ala Arg Leu Gly Gly Asp Ser Gly Phe Ile
35 40 45 Gly Arg Val Gly
Asp Asp Pro Phe Gly Arg Phe Met Arg His Thr Leu 50
55 60 Ala Gln Glu Gln Val Asp Val Asn
Tyr Met Arg Leu Asp Ala Ala Gln 65 70
75 80 Arg Thr Ser Thr Val Val Val Asp Leu Asp Ser His
Gly Glu Arg Thr 85 90
95 Phe Thr Phe Met Val Arg Pro Ser Ala Asp Leu Phe Leu Gln Pro Glu
100 105 110 Asp Leu Pro
Pro Phe Ala Ala Gly Gln Trp Leu His Val Cys Ser Ile 115
120 125 Ala Leu Ser Ala Glu Pro Ser Arg
Ser Thr Thr Phe Ala Ala Met Glu 130 135
140 Ala Ile Lys Arg Ala Gly Gly Tyr Val Ser Phe Asp Pro
Asn Ile Arg 145 150 155
160 Ser Asp Leu Trp Gln Asp Pro Gln Asp Leu Arg Asp Cys Leu Asp Arg
165 170 175 Ala Leu Ala Leu
Ala Asp Ala Ile Lys Leu Ser Glu Glu Glu Leu Ala 180
185 190 Phe Ile Ser Gly Ser Asp Asp Ile Val
Ser Gly Thr Ala Arg Leu Asn 195 200
205 Ala Arg Phe Gln Pro Thr Leu Leu Leu Val Thr Gln Gly Lys
Ala Gly 210 215 220
Val Gln Ala Ala Leu Arg Gly Gln Val Ser His Phe Pro Ala Arg Pro 225
230 235 240 Val Val Ala Val Asp
Thr Thr Gly Ala Gly Asp Ala Phe Val Ala Gly 245
250 255 Leu Leu Ala Gly Leu Ala Ala His Gly Ile
Pro Asp Asn Leu Ala Ala 260 265
270 Leu Ala Pro Asp Leu Ala Leu Ala Gln Thr Cys Gly Ala Leu Ala
Thr 275 280 285 Thr
Ala Lys Gly Ala Met Thr Ala Leu Pro Tyr Arg Asp Asp Leu Gln 290
295 300 Arg Ser Leu 305
49 915 DNA Escherichia coli
49 atgtcagcca aagtatgggt tttaggggat gcggtcgtag
atctcttgcc agaatcagac 60gggcgcctac tgccttgtcc tggcggcgcg ccagctaacg
ttgcggtggg aatcgccaga 120ttaggcggaa caagtgggtt tataggtcgg gtgggggatg
atccttttgg tgcgttaatg 180caaagaacgc tgctaactga gggagtcgat atcacgtatc
tgaagcaaga tgaatggcac 240cggacatcca cggtgcttgt cgatctgaac gatcaagggg
aacgttcatt tacgtttatg 300gtccgcccca gtgccgatct ttttttagag acgacagact
tgccctgctg gcgacatggc 360gaatggttac atctctgttc aattgcgttg tctgccgagc
cttcgcgtac cagcgcattt 420actgcgatga cggcgatccg gcatgccgga ggttttgtca
gcttcgatcc taatattcgt 480gaagatctat ggcaagacga gcatttgctc cgcttgtgtt
tgcggcaggc gctacaactg 540gcggatgtcg tcaagctctc ggaagaagaa tggcgactta
tcagtggaaa aacacagaac 600gatcaggata tatgcgccct ggcaaaagag tatgagatcg
ccatgctgtt ggtgactaaa 660ggtgcagaag gggtggtggt ctgttatcga ggacaagttc
accattttgc tggaatgtct 720gtgaattgtg tcgatagcac gggggcggga gatgcgttcg
ttgccgggtt actcacaggt 780ctgtcctcta cgggattatc tacagatgag agagaaatgc
gacgaattat cgatctcgct 840caacgttgcg gagcgcttgc agtaacggcg aaaggggcaa
tgacagcgct gccatgtcga 900caagaactgg aatag
915
50 304 PRT Escherichia coli
50 Met Ser Ala Lys Val
Trp Val Leu Gly Asp Ala Val Val Asp Leu Leu 1 5
10 15 Pro Glu Ser Asp Gly Arg Leu Leu Pro Cys
Pro Gly Gly Ala Pro Ala 20 25
30 Asn Val Ala Val Gly Ile Ala Arg Leu Gly Gly Thr Ser Gly Phe
Ile 35 40 45 Gly
Arg Val Gly Asp Asp Pro Phe Gly Ala Leu Met Gln Arg Thr Leu 50
55 60 Leu Thr Glu Gly Val Asp
Ile Thr Tyr Leu Lys Gln Asp Glu Trp His 65 70
75 80 Arg Thr Ser Thr Val Leu Val Asp Leu Asn Asp
Gln Gly Glu Arg Ser 85 90
95 Phe Thr Phe Met Val Arg Pro Ser Ala Asp Leu Phe Leu Glu Thr Thr
100 105 110 Asp Leu
Pro Cys Trp Arg His Gly Glu Trp Leu His Leu Cys Ser Ile 115
120 125 Ala Leu Ser Ala Glu Pro Ser
Arg Thr Ser Ala Phe Thr Ala Met Thr 130 135
140 Ala Ile Arg His Ala Gly Gly Phe Val Ser Phe Asp
Pro Asn Ile Arg 145 150 155
160 Glu Asp Leu Trp Gln Asp Glu His Leu Leu Arg Leu Cys Leu Arg Gln
165 170 175 Ala Leu Gln
Leu Ala Asp Val Val Lys Leu Ser Glu Glu Glu Trp Arg 180
185 190 Leu Ile Ser Gly Lys Thr Gln Asn
Asp Gln Asp Ile Cys Ala Leu Ala 195 200
205 Lys Glu Tyr Glu Ile Ala Met Leu Leu Val Thr Lys Gly
Ala Glu Gly 210 215 220
Val Val Val Cys Tyr Arg Gly Gln Val His His Phe Ala Gly Met Ser 225
230 235 240 Val Asn Cys Val
Asp Ser Thr Gly Ala Gly Asp Ala Phe Val Ala Gly 245
250 255 Leu Leu Thr Gly Leu Ser Ser Thr Gly
Leu Ser Thr Asp Glu Arg Glu 260 265
270 Met Arg Arg Ile Ile Asp Leu Ala Gln Arg Cys Gly Ala Leu
Ala Val 275 280 285
Thr Ala Lys Gly Ala Met Thr Ala Leu Pro Cys Arg Gln Glu Leu Glu 290
295 300
51 879 DNA Enterococcus faecalis
51 atgacagaaa aacttttagg aagtatcgaa
gccggtggca caaaatttgt atgtggcgtt 60gggacagatg atttgaccat cgtagaacgt
gtcagttttc ccacaacaac cccagaagaa 120acaatgaaaa aagtaataga atttttccaa
caatatcctt taaaagcgat tgggattggt 180tcatttggtc cgattgatat tcacgttgat
tctcctacgt atggttatat cacttctaca 240ccaaaattag cttggcgtaa ctttgacttg
ttaggaacta tgaaacaaca ttttgatgtg 300ccaatggctt ggacaacgga tgtgaatgct
gcggcatatg gtgagtatgt tgctggaaat 360gggcaacata catctagttg tgtatattat
acaattggaa ctggtgttgg cgctggagcg 420attcaaaacg gtgagtttat tgaaggcttt
agccacccag aaatggggca tgcgttagtt 480cgtcgtcatc ctgaagatac gtatgcagga
aattgtcctt atcatggaga ttgtttagaa 540gggattgcag caggaccagc agttgaaggt
cgttctggta aaaaaggaca tttattggaa 600gaggatcata aaacttggga attagaagct
tattatttag cgcaagcggc gtacaatacg 660actttattat tagcgccaga agtgatcatt
ttaggtggcg gcgtcatgaa acaacgtcat 720ttgatgccga aagttcgtga aaaatttgct
gaattagtca atggatatgt ggaaacaccg 780cctttagaaa aatacttggt gacgcctctt
ttagaagata atccaggaac aatcggttgc 840tttgccttgg caaaaaaagc tttaatggct
caaaaataa 879
52 292 PRT Enterococcus faecalis
52 Met Thr Glu Lys Leu Leu Gly Ser Ile Glu Ala Gly Gly Thr Lys Phe 1
5 10 15 Val Cys Gly Val
Gly Thr Asp Asp Leu Thr Ile Val Glu Arg Val Ser 20
25 30 Phe Pro Thr Thr Thr Pro Glu Glu Thr
Met Lys Lys Val Ile Glu Phe 35 40
45 Phe Gln Gln Tyr Pro Leu Lys Ala Ile Gly Ile Gly Ser Phe
Gly Pro 50 55 60
Ile Asp Ile His Val Asp Ser Pro Thr Tyr Gly Tyr Ile Thr Ser Thr 65
70 75 80 Pro Lys Leu Ala Trp
Arg Asn Phe Asp Leu Leu Gly Thr Met Lys Gln 85
90 95 His Phe Asp Val Pro Met Ala Trp Thr Thr
Asp Val Asn Ala Ala Ala 100 105
110 Tyr Gly Glu Tyr Val Ala Gly Asn Gly Gln His Thr Ser Ser Cys
Val 115 120 125 Tyr
Tyr Thr Ile Gly Thr Gly Val Gly Ala Gly Ala Ile Gln Asn Gly 130
135 140 Glu Phe Ile Glu Gly Phe
Ser His Pro Glu Met Gly His Ala Leu Val 145 150
155 160 Arg Arg His Pro Glu Asp Thr Tyr Ala Gly Asn
Cys Pro Tyr His Gly 165 170
175 Asp Cys Leu Glu Gly Ile Ala Ala Gly Pro Ala Val Glu Gly Arg Ser
180 185 190 Gly Lys
Lys Gly His Leu Leu Glu Glu Asp His Lys Thr Trp Glu Leu 195
200 205 Glu Ala Tyr Tyr Leu Ala Gln
Ala Ala Tyr Asn Thr Thr Leu Leu Leu 210 215
220 Ala Pro Glu Val Ile Ile Leu Gly Gly Gly Val Met
Lys Gln Arg His 225 230 235
240 Leu Met Pro Lys Val Arg Glu Lys Phe Ala Glu Leu Val Asn Gly Tyr
245 250 255 Val Glu Thr
Pro Pro Leu Glu Lys Tyr Leu Val Thr Pro Leu Leu Glu 260
265 270 Asp Asn Pro Gly Thr Ile Gly Cys
Phe Ala Leu Ala Lys Lys Ala Leu 275 280
285 Met Ala Gln Lys 290
53 1458 DNA Saccharomyces cerevisiae
53 atggttcatt taggtccaaa gaaaccacag
gctagaaagg gttccatggc tgatgtgccc 60aaggaattga tggatgaaat tcatcagttg
gaagatatgt ttacagttga cagcgagacc 120ttgagaaagg ttgttaagca ctttatcgac
gaattgaata aaggtttgac aaagaaggga 180ggtaacattc caatgattcc cggttgggtc
atggaattcc caacaggtaa agaatctggt 240aactatttgg ccattgattt gggtggtact
aacttaagag tcgtgttggt caagttgagc 300ggtaaccata cctttgacac cactcaatcc
aagtataaac taccacatga catgagaacc 360actaagcacc aagaggagtt atggtccttt
attgccgact ctttgaagga ctttatggtc 420gagcaagaat tgctaaacac caaggacacc
ttaccattag gtttcacctt ctcgtaccca 480gcttcccaaa acaagattaa cgaaggtatt
ttgcaaagat ggaccaaggg tttcgatatt 540ccaaatgtcg aaggccacga tgtcgtccca
ttgctacaaa acgaaatttc caagagagag 600ttgcctattg aaattgtagc attgattaat
gatactgttg gtactttaat tgcctcatac 660tacactgacc cagagactaa gatgggtgtg
attttcggta ctggtgtcaa cggtgctttc 720tatgatgttg tttccgatat cgaaaagttg
gagggcaaat tagcagacga tattccaagt 780aactctccaa tggctatcaa ttgtgaatat
ggttccttcg ataatgaaca tttggtcttg 840ccaagaacca agtacgatgt tgctgtcgac
gaacaatctc caagacctgg tcaacaagct 900tttgaaaaga tgacctccgg ttactacttg
ggtgaattgt tgcgtctagt gttacttgaa 960ttaaacgaga agggcttgat gttgaaggat
caagatctaa gcaagttgaa acaaccatac 1020atcatggata cctcctaccc agcaagaatc
gaggatgatc catttgaaaa cttggaagat 1080actgatgaca tcttccaaaa ggactttggt
gtcaagacca ctctgccaga acgtaagttg 1140attagaagac tttgtgaatt gatcggtacc
agagctgcta gattagctgt ttgtggtatt 1200gccgctattt gccaaaagag aggttacaag
actggtcaca ttgccgctga cggttctgtc 1260tataacaaat acccaggttt caaggaagcc
gccgctaagg gtttgagaga tatctatgga 1320tggactggtg acgcaagcaa agatccaatt
acgattgttc cagctgagga tggttcaggt 1380gcaggtgctg ctgttattgc tgcattgtcc
gaaaaaagaa ttgccgaagg taagtctctt 1440ggtatcattg gcgcttaa
1458
54 485 PRT Saccharomyces cerevisiae
54 Met Val His Leu Gly Pro Lys Lys Pro Gln Ala Arg Lys Gly Ser Met 1
5 10 15 Ala Asp Val Pro
Lys Glu Leu Met Asp Glu Ile His Gln Leu Glu Asp 20
25 30 Met Phe Thr Val Asp Ser Glu Thr Leu
Arg Lys Val Val Lys His Phe 35 40
45 Ile Asp Glu Leu Asn Lys Gly Leu Thr Lys Lys Gly Gly Asn
Ile Pro 50 55 60
Met Ile Pro Gly Trp Val Met Glu Phe Pro Thr Gly Lys Glu Ser Gly 65
70 75 80 Asn Tyr Leu Ala Ile
Asp Leu Gly Gly Thr Asn Leu Arg Val Val Leu 85
90 95 Val Lys Leu Ser Gly Asn His Thr Phe Asp
Thr Thr Gln Ser Lys Tyr 100 105
110 Lys Leu Pro His Asp Met Arg Thr Thr Lys His Gln Glu Glu Leu
Trp 115 120 125 Ser
Phe Ile Ala Asp Ser Leu Lys Asp Phe Met Val Glu Gln Glu Leu 130
135 140 Leu Asn Thr Lys Asp Thr
Leu Pro Leu Gly Phe Thr Phe Ser Tyr Pro 145 150
155 160 Ala Ser Gln Asn Lys Ile Asn Glu Gly Ile Leu
Gln Arg Trp Thr Lys 165 170
175 Gly Phe Asp Ile Pro Asn Val Glu Gly His Asp Val Val Pro Leu Leu
180 185 190 Gln Asn
Glu Ile Ser Lys Arg Glu Leu Pro Ile Glu Ile Val Ala Leu 195
200 205 Ile Asn Asp Thr Val Gly Thr
Leu Ile Ala Ser Tyr Tyr Thr Asp Pro 210 215
220 Glu Thr Lys Met Gly Val Ile Phe Gly Thr Gly Val
Asn Gly Ala Phe 225 230 235
240 Tyr Asp Val Val Ser Asp Ile Glu Lys Leu Glu Gly Lys Leu Ala Asp
245 250 255 Asp Ile Pro
Ser Asn Ser Pro Met Ala Ile Asn Cys Glu Tyr Gly Ser 260
265 270 Phe Asp Asn Glu His Leu Val Leu
Pro Arg Thr Lys Tyr Asp Val Ala 275 280
285 Val Asp Glu Gln Ser Pro Arg Pro Gly Gln Gln Ala Phe
Glu Lys Met 290 295 300
Thr Ser Gly Tyr Tyr Leu Gly Glu Leu Leu Arg Leu Val Leu Leu Glu 305
310 315 320 Leu Asn Glu Lys
Gly Leu Met Leu Lys Asp Gln Asp Leu Ser Lys Leu 325
330 335 Lys Gln Pro Tyr Ile Met Asp Thr Ser
Tyr Pro Ala Arg Ile Glu Asp 340 345
350 Asp Pro Phe Glu Asn Leu Glu Asp Thr Asp Asp Ile Phe Gln
Lys Asp 355 360 365
Phe Gly Val Lys Thr Thr Leu Pro Glu Arg Lys Leu Ile Arg Arg Leu 370
375 380 Cys Glu Leu Ile Gly
Thr Arg Ala Ala Arg Leu Ala Val Cys Gly Ile 385 390
395 400 Ala Ala Ile Cys Gln Lys Arg Gly Tyr Lys
Thr Gly His Ile Ala Ala 405 410
415 Asp Gly Ser Val Tyr Asn Lys Tyr Pro Gly Phe Lys Glu Ala Ala
Ala 420 425 430 Lys
Gly Leu Arg Asp Ile Tyr Gly Trp Thr Gly Asp Ala Ser Lys Asp 435
440 445 Pro Ile Thr Ile Val Pro
Ala Glu Asp Gly Ser Gly Ala Gly Ala Ala 450 455
460 Val Ile Ala Ala Leu Ser Glu Lys Arg Ile Ala
Glu Gly Lys Ser Leu 465 470 475
480 Gly Ile Ile Gly Ala 485
55 1461 DNA Saccharomyces cerevisiae
55 atggttcatt taggtccaaa aaaaccacaa gccagaaagg gttccatggc
cgatgtgcca 60aaggaattga tgcaacaaat tgagaatttt gaaaaaattt tcactgttcc
aactgaaact 120ttacaagccg ttaccaagca cttcatttcc gaattggaaa agggtttgtc
caagaagggt 180ggtaacattc caatgattcc aggttgggtt atggatttcc caactggtaa
ggaatccggt 240gatttcttgg ccattgattt gggtggtacc aacttgagag ttgtcttagt
caagttgggc 300ggtgaccgta cctttgacac cactcaatct aagtacagat taccagatgc
tatgagaact 360actcaaaatc cagacgaatt gtgggaattt attgccgact ctttgaaagc
ttttattgat 420gagcaattcc cacaaggtat ctctgagcca attccattgg gtttcacctt
ttctttccca 480gcttctcaaa acaaaatcaa tgaaggtatc ttgcaaagat ggactaaagg
ttttgatatt 540ccaaacattg aaaaccacga tgttgttcca atgttgcaaa agcaaatcac
taagaggaat 600atcccaattg aagttgttgc tttgataaac gacactaccg gtactttggt
tgcttcttac 660tacactgacc cagaaactaa gatgggtgtt atcttcggta ctggtgtcaa
tggtgcttac 720tacgatgttt gttccgatat cgaaaagcta caaggaaaac tatctgatga
cattccacca 780tctgctccaa tggccatcaa ctgtgaatac ggttccttcg ataatgaaca
tgtcgttttg 840ccaagaacta aatacgatat caccattgat gaagaatctc caagaccagg
ccaacaaacc 900tttgaaaaaa tgtcttctgg ttactactta ggtgaaattt tgcgtttggc
cttgatggac 960atgtacaaac aaggtttcat cttcaagaac caagacttgt ctaagttcga
caagcctttc 1020gtcatggaca cttcttaccc agccagaatc gaggaagatc cattcgagaa
cctagaagat 1080accgatgact tgttccaaaa tgagttcggt atcaacacta ctgttcaaga
acgtaaattg 1140atcagacgtt tatctgaatt gattggtgct agagctgcta gattgtccgt
ttgtggtatt 1200gctgctatct gtcaaaagag aggttacaag accggtcaca tcgctgcaga
cggttccgtt 1260tacaacagat acccaggttt caaagaaaag gctgccaatg ctttgaagga
catttacggc 1320tggactcaaa cctcactaga cgactaccca atcaagattg ttcctgctga
agatggttcc 1380ggtgctggtg ccgctgttat tgctgctttg gcccaaaaaa gaattgctga
aggtaagtcc 1440gttggtatca tcggtgctta a
1461
56 486 PRT Saccharomyces cerevisiae
56 Met Val His Leu Gly Pro
Lys Lys Pro Gln Ala Arg Lys Gly Ser Met 1 5
10 15 Ala Asp Val Pro Lys Glu Leu Met Gln Gln Ile
Glu Asn Phe Glu Lys 20 25
30 Ile Phe Thr Val Pro Thr Glu Thr Leu Gln Ala Val Thr Lys His
Phe 35 40 45 Ile
Ser Glu Leu Glu Lys Gly Leu Ser Lys Lys Gly Gly Asn Ile Pro 50
55 60 Met Ile Pro Gly Trp Val
Met Asp Phe Pro Thr Gly Lys Glu Ser Gly 65 70
75 80 Asp Phe Leu Ala Ile Asp Leu Gly Gly Thr Asn
Leu Arg Val Val Leu 85 90
95 Val Lys Leu Gly Gly Asp Arg Thr Phe Asp Thr Thr Gln Ser Lys Tyr
100 105 110 Arg Leu
Pro Asp Ala Met Arg Thr Thr Gln Asn Pro Asp Glu Leu Trp 115
120 125 Glu Phe Ile Ala Asp Ser Leu
Lys Ala Phe Ile Asp Glu Gln Phe Pro 130 135
140 Gln Gly Ile Ser Glu Pro Ile Pro Leu Gly Phe Thr
Phe Ser Phe Pro 145 150 155
160 Ala Ser Gln Asn Lys Ile Asn Glu Gly Ile Leu Gln Arg Trp Thr Lys
165 170 175 Gly Phe Asp
Ile Pro Asn Ile Glu Asn His Asp Val Val Pro Met Leu 180
185 190 Gln Lys Gln Ile Thr Lys Arg Asn
Ile Pro Ile Glu Val Val Ala Leu 195 200
205 Ile Asn Asp Thr Thr Gly Thr Leu Val Ala Ser Tyr Tyr
Thr Asp Pro 210 215 220
Glu Thr Lys Met Gly Val Ile Phe Gly Thr Gly Val Asn Gly Ala Tyr 225
230 235 240 Tyr Asp Val Cys
Ser Asp Ile Glu Lys Leu Gln Gly Lys Leu Ser Asp 245
250 255 Asp Ile Pro Pro Ser Ala Pro Met Ala
Ile Asn Cys Glu Tyr Gly Ser 260 265
270 Phe Asp Asn Glu His Val Val Leu Pro Arg Thr Lys Tyr Asp
Ile Thr 275 280 285
Ile Asp Glu Glu Ser Pro Arg Pro Gly Gln Gln Thr Phe Glu Lys Met 290
295 300 Ser Ser Gly Tyr Tyr
Leu Gly Glu Ile Leu Arg Leu Ala Leu Met Asp 305 310
315 320 Met Tyr Lys Gln Gly Phe Ile Phe Lys Asn
Gln Asp Leu Ser Lys Phe 325 330
335 Asp Lys Pro Phe Val Met Asp Thr Ser Tyr Pro Ala Arg Ile Glu
Glu 340 345 350 Asp
Pro Phe Glu Asn Leu Glu Asp Thr Asp Asp Leu Phe Gln Asn Glu 355
360 365 Phe Gly Ile Asn Thr Thr
Val Gln Glu Arg Lys Leu Ile Arg Arg Leu 370 375
380 Ser Glu Leu Ile Gly Ala Arg Ala Ala Arg Leu
Ser Val Cys Gly Ile 385 390 395
400 Ala Ala Ile Cys Gln Lys Arg Gly Tyr Lys Thr Gly His Ile Ala Ala
405 410 415 Asp Gly
Ser Val Tyr Asn Arg Tyr Pro Gly Phe Lys Glu Lys Ala Ala 420
425 430 Asn Ala Leu Lys Asp Ile Tyr
Gly Trp Thr Gln Thr Ser Leu Asp Asp 435 440
445 Tyr Pro Ile Lys Ile Val Pro Ala Glu Asp Gly Ser
Gly Ala Gly Ala 450 455 460
Ala Val Ile Ala Ala Leu Ala Gln Lys Arg Ile Ala Glu Gly Lys Ser 465
470 475 480 Val Gly Ile
Ile Gly Ala 485
57 1164 DNA Klebsiella pneumoniae
57 atgagctatc gtatgtttga ttatctggtg ccaaacgtta acttttttgg ccccaacgcc
60atttccgtag tcggcgaacg ctgccagctg ctggggggga aaaaagccct gctggtcacc
120gacaaaggcc tgcgggcaat taaagatggc gcagtggaca aaaccctgca ttatctgcgg
180gaggccggga tcgaggtggc gatctttgac ggcgtcgagc cgaacccgaa agacaccaac
240gtgcgcgacg gcctcgccgt gtttcgccgc gaacagtgcg acatcatcgt caccgtgggc
300ggcggcagcc cgcacgattg cggcaaaggc atcggcatcg ccgccaccca tgagggcgat
360ctgtaccagt atgccggaat cgagaccctg accaacccgc tgccgcctat cgtcgcggtc
420aataccaccg ccggcaccgc cagcgaggtc acccgccact gcgtcctgac caacaccgaa
480accaaagtga agtttgtgat cgtcagctgg cgcaacctgc cgtcggtctc tatcaacgat
540ccgctgctga tgatcggtaa accggccgcc ctgaccgcgg cgaccgggat ggatgccctg
600acccacgccg tagaggccta tatctccaaa gacgctaacc cggtgacgga cgccgccgcc
660atgcaggcga tccgcctcat cgcccgcaac ctgcgccagg ccgtggccct cggcagcaat
720ctgcaggcgc gggaaaacat ggcctatgcc tctctgctgg ccgggatggc tttcaataac
780gccaacctcg gctacgtgca cgccatggcg caccagctgg gcggcctgta cgacatgccg
840cacggcgtgg ccaacgctgt cctgctgccg catgtggccc gctacaacct gatcgccaac
900ccggagaaat tcgccgatat cgctgaactg atgggcgaaa atatcaccgg actgtccact
960ctcgacgcgg cggaaaaagc catcgccgct atcacgcgtc tgtcgatgga tatcggtatt
1020ccgcagcatc tgcgcgatct gggagtaaaa gaggccgact tcccctacat ggcggagatg
1080gctctgaaag acggcaatgc gttctcgaac ccgcgtaaag gcaacgagca ggagattgcc
1140gcgattttcc gccaggcatt ctga
1164
58 387 PRT Klebsiella pneumoniae
58 Met Ser Tyr Arg Met Phe Asp Tyr Leu
Val Pro Asn Val Asn Phe Phe 1 5 10
15 Gly Pro Asn Ala Ile Ser Val Val Gly Glu Arg Cys Gln Leu
Leu Gly 20 25 30
Gly Lys Lys Ala Leu Leu Val Thr Asp Lys Gly Leu Arg Ala Ile Lys
35 40 45 Asp Gly Ala Val
Asp Lys Thr Leu His Tyr Leu Arg Glu Ala Gly Ile 50
55 60 Glu Val Ala Ile Phe Asp Gly Val
Glu Pro Asn Pro Lys Asp Thr Asn 65 70
75 80 Val Arg Asp Gly Leu Ala Val Phe Arg Arg Glu Gln
Cys Asp Ile Ile 85 90
95 Val Thr Val Gly Gly Gly Ser Pro His Asp Cys Gly Lys Gly Ile Gly
100 105 110 Ile Ala Ala
Thr His Glu Gly Asp Leu Tyr Gln Tyr Ala Gly Ile Glu 115
120 125 Thr Leu Thr Asn Pro Leu Pro Pro
Ile Val Ala Val Asn Thr Thr Ala 130 135
140 Gly Thr Ala Ser Glu Val Thr Arg His Cys Val Leu Thr
Asn Thr Glu 145 150 155
160 Thr Lys Val Lys Phe Val Ile Val Ser Trp Arg Asn Leu Pro Ser Val
165 170 175 Ser Ile Asn Asp
Pro Leu Leu Met Ile Gly Lys Pro Ala Ala Leu Thr 180
185 190 Ala Ala Thr Gly Met Asp Ala Leu Thr
His Ala Val Glu Ala Tyr Ile 195 200
205 Ser Lys Asp Ala Asn Pro Val Thr Asp Ala Ala Ala Met Gln
Ala Ile 210 215 220
Arg Leu Ile Ala Arg Asn Leu Arg Gln Ala Val Ala Leu Gly Ser Asn 225
230 235 240 Leu Gln Ala Arg Glu
Asn Met Ala Tyr Ala Ser Leu Leu Ala Gly Met 245
250 255 Ala Phe Asn Asn Ala Asn Leu Gly Tyr Val
His Ala Met Ala His Gln 260 265
270 Leu Gly Gly Leu Tyr Asp Met Pro His Gly Val Ala Asn Ala Val
Leu 275 280 285 Leu
Pro His Val Ala Arg Tyr Asn Leu Ile Ala Asn Pro Glu Lys Phe 290
295 300 Ala Asp Ile Ala Glu Leu
Met Gly Glu Asn Ile Thr Gly Leu Ser Thr 305 310
315 320 Leu Asp Ala Ala Glu Lys Ala Ile Ala Ala Ile
Thr Arg Leu Ser Met 325 330
335 Asp Ile Gly Ile Pro Gln His Leu Arg Asp Leu Gly Val Lys Glu Ala
340 345 350 Asp Phe
Pro Tyr Met Ala Glu Met Ala Leu Lys Asp Gly Asn Ala Phe 355
360 365 Ser Asn Pro Arg Lys Gly Asn
Glu Gln Glu Ile Ala Ala Ile Phe Arg 370 375
380 Gln Ala Phe 385
59 1824 DNA Klebsiella pneumoniae CDS (1)..(1824)
59 atg ccg tta ata gcc ggg att gat atc ggc aac
gcc acc acc gag gtg 48Met Pro Leu Ile Ala Gly Ile Asp Ile Gly Asn
Ala Thr Thr Glu Val1 5 10
15 gcg ctg gcg tcc gac tac ccg cag gcg agg gcg ttt gtt gcc agc ggg
96Ala Leu Ala Ser Asp Tyr Pro Gln Ala Arg Ala Phe Val Ala Ser Gly
20 25 30 atc gtc gcg acg
acg ggc atg aaa ggg acg cgg gac aat atc gcc ggg 144Ile Val Ala Thr
Thr Gly Met Lys Gly Thr Arg Asp Asn Ile Ala Gly 35
40 45 acc ctc gcc gcg ctg gag cag gcc ctg
gcg aaa aca ccg tgg tcg atg 192Thr Leu Ala Ala Leu Glu Gln Ala Leu
Ala Lys Thr Pro Trp Ser Met 50 55 60
agc gat gtc tct cgc atc tat ctt aac gaa gcc gcg ccg gtg
att ggc 240Ser Asp Val Ser Arg Ile Tyr Leu Asn Glu Ala Ala Pro Val
Ile Gly65 70 75 80 gat
gtg gcg atg gag acc atc acc gag acc att atc acc gaa tcg acc 288Asp
Val Ala Met Glu Thr Ile Thr Glu Thr Ile Ile Thr Glu Ser Thr
85 90 95 atg atc ggt cat aac ccg
cag acg ccg ggc ggg gtg ggc gtt ggc gtg 336Met Ile Gly His Asn Pro
Gln Thr Pro Gly Gly Val Gly Val Gly Val 100
105 110 ggg acg act atc gcc ctc ggg cgg ctg gcg acg
ctg ccg gcg gcg cag 384Gly Thr Thr Ile Ala Leu Gly Arg Leu Ala Thr
Leu Pro Ala Ala Gln 115 120 125
tat gcc gag ggg tgg atc gta ctg att gac gac gcc gtc gat ttc ctt
432Tyr Ala Glu Gly Trp Ile Val Leu Ile Asp Asp Ala Val Asp Phe Leu
130 135 140 gac gcc gtg
tgg tgg ctc aat gag gcg ctc gac cgg ggg atc aac gtg 480Asp Ala Val
Trp Trp Leu Asn Glu Ala Leu Asp Arg Gly Ile Asn Val145
150 155 160gtg gcg gcg atc ctc aaa aag
gac gac ggc gtg ctg gtg aac aac cgc 528Val Ala Ala Ile Leu Lys Lys
Asp Asp Gly Val Leu Val Asn Asn Arg 165
170 175 ctg cgt aaa acc ctg ccg gtg gtg gat gaa gtg
acg ctg ctg gag cag 576Leu Arg Lys Thr Leu Pro Val Val Asp Glu Val
Thr Leu Leu Glu Gln 180 185
190 gtc ccc gag ggg gta atg gcg gcg gtg gaa gtg gcc gcg ccg ggc
cag 624Val Pro Glu Gly Val Met Ala Ala Val Glu Val Ala Ala Pro Gly
Gln 195 200 205 gtg gtg
cgg atc ctg tcg aat ccc tac ggg atc gcc acc ttc ttc ggg 672Val Val
Arg Ile Leu Ser Asn Pro Tyr Gly Ile Ala Thr Phe Phe Gly 210
215 220 cta agc ccg gaa gag acc cag
gcc atc gtc ccc atc gcc cgc gcc ctg 720Leu Ser Pro Glu Glu Thr Gln
Ala Ile Val Pro Ile Ala Arg Ala Leu225 230
235 240att ggc aac cgt tcc gcg gtg gtg ctc aag acc ccg
cag ggg gat gtg 768Ile Gly Asn Arg Ser Ala Val Val Leu Lys Thr Pro
Gln Gly Asp Val 245 250
255 cag tcg cgg gtg atc ccg gcg ggc aac ctc tac att agc ggc gaa aag
816Gln Ser Arg Val Ile Pro Ala Gly Asn Leu Tyr Ile Ser Gly Glu Lys
260 265 270 cgc cgc gga gag
gcc gat gtc gcc gag ggc gcg gaa gcc atc atg cag 864Arg Arg Gly Glu
Ala Asp Val Ala Glu Gly Ala Glu Ala Ile Met Gln 275
280 285 gcg atg agc gcc tgc gct ccg gta cgc
gac atc cgc ggc gaa ccg ggc 912Ala Met Ser Ala Cys Ala Pro Val Arg
Asp Ile Arg Gly Glu Pro Gly 290 295
300 acc cac gcc ggc ggc atg ctt gag cgg gtg cgc aag gta
atg gcg tcc 960Thr His Ala Gly Gly Met Leu Glu Arg Val Arg Lys Val
Met Ala Ser305 310 315
320ctg acc ggc cat gag atg agc gcg ata tac atc cag gat ctg ctg gcg
1008Leu Thr Gly His Glu Met Ser Ala Ile Tyr Ile Gln Asp Leu Leu Ala
325 330 335 gtg gat acg ttt
att ccg cgc aag gtg cag ggc ggg atg gcc ggc gag 1056Val Asp Thr Phe
Ile Pro Arg Lys Val Gln Gly Gly Met Ala Gly Glu 340
345 350 tgc gcc atg gag aat gcc gtc ggg atg
gcg gcg atg gtg aaa gcg gat 1104Cys Ala Met Glu Asn Ala Val Gly Met
Ala Ala Met Val Lys Ala Asp 355 360
365 cgt ctg caa atg cag gtt atc gcc cgc gaa ctg agc gcc cga
ctg cag 1152Arg Leu Gln Met Gln Val Ile Ala Arg Glu Leu Ser Ala Arg
Leu Gln 370 375 380 acc
gag gtg gtg gtg ggc ggc gtg gag gcc aac atg gcc atc gcc ggg 1200Thr
Glu Val Val Val Gly Gly Val Glu Ala Asn Met Ala Ile Ala Gly385
390 395 400gcg tta acc act ccc ggc
tgt gcg gcg ccg ctg gcg atc ctc gac ctc 1248Ala Leu Thr Thr Pro Gly
Cys Ala Ala Pro Leu Ala Ile Leu Asp Leu 405
410 415 ggc gcc ggc tcg acg gat gcg gcg atc gtc aac
gcg gag ggg cag ata 1296Gly Ala Gly Ser Thr Asp Ala Ala Ile Val Asn
Ala Glu Gly Gln Ile 420 425
430 acg gcg gtc cat ctc gcc ggg gcg ggg aat atg gtc agc ctg ttg
att 1344Thr Ala Val His Leu Ala Gly Ala Gly Asn Met Val Ser Leu Leu
Ile 435 440 445 aaa acc
gag ctg ggc ctc gag gat ctt tcg ctg gcg gaa gcg ata aaa 1392Lys Thr
Glu Leu Gly Leu Glu Asp Leu Ser Leu Ala Glu Ala Ile Lys 450
455 460 aaa tac ccg ctg gcc aaa gtg
gaa agc ctg ttc agt att cgt cac gag 1440Lys Tyr Pro Leu Ala Lys Val
Glu Ser Leu Phe Ser Ile Arg His Glu465 470
475 480aat ggc gcg gtg gag ttc ttt cgg gaa gcc ctc agc
ccg gcg gtg ttc 1488Asn Gly Ala Val Glu Phe Phe Arg Glu Ala Leu Ser
Pro Ala Val Phe 485 490
495 gcc aaa gtg gtg tac atc aag gag ggc gaa ctg gtg ccg atc gat aac
1536Ala Lys Val Val Tyr Ile Lys Glu Gly Glu Leu Val Pro Ile Asp Asn
500 505 510 gcc agc ccg ctg
gaa aaa att cgt ctc gtg cgc cgg cag gcg aaa gag 1584Ala Ser Pro Leu
Glu Lys Ile Arg Leu Val Arg Arg Gln Ala Lys Glu 515
520 525 aaa gtg ttt gtc acc aac tgc ctg cgc
gcg ctg cgc cag gtc tca ccc 1632Lys Val Phe Val Thr Asn Cys Leu Arg
Ala Leu Arg Gln Val Ser Pro 530 535
540 ggc ggt tcc att cgc gat atc gcc ttt gtg gtg ctg gtg
ggc ggc tca 1680Gly Gly Ser Ile Arg Asp Ile Ala Phe Val Val Leu Val
Gly Gly Ser545 550 555
560tcg ctg gac ttt gag atc ccg cag ctt atc acg gaa gcc ttg tcg cac
1728Ser Leu Asp Phe Glu Ile Pro Gln Leu Ile Thr Glu Ala Leu Ser His
565 570 575 tat ggc gtg gtc
gcc ggg cag ggc aat att cgg gga aca gaa ggg ccg 1776Tyr Gly Val Val
Ala Gly Gln Gly Asn Ile Arg Gly Thr Glu Gly Pro 580
585 590 cgc aat gcg gtc gcc acc ggg ctg cta
ctg gcc ggt cag gcg aat taa 1824Arg Asn Ala Val Ala Thr Gly Leu Leu
Leu Ala Gly Gln Ala Asn 595 600
605
60 607 PRT Klebsiella pneumoniae
60 Met Pro Leu Ile Ala Gly
Ile Asp Ile Gly Asn Ala Thr Thr Glu Val 1 5
10 15 Ala Leu Ala Ser Asp Tyr Pro Gln Ala Arg Ala
Phe Val Ala Ser Gly 20 25
30 Ile Val Ala Thr Thr Gly Met Lys Gly Thr Arg Asp Asn Ile Ala
Gly 35 40 45 Thr
Leu Ala Ala Leu Glu Gln Ala Leu Ala Lys Thr Pro Trp Ser Met 50
55 60 Ser Asp Val Ser Arg Ile
Tyr Leu Asn Glu Ala Ala Pro Val Ile Gly 65 70
75 80 Asp Val Ala Met Glu Thr Ile Thr Glu Thr Ile
Ile Thr Glu Ser Thr 85 90
95 Met Ile Gly His Asn Pro Gln Thr Pro Gly Gly Val Gly Val Gly Val
100 105 110 Gly Thr
Thr Ile Ala Leu Gly Arg Leu Ala Thr Leu Pro Ala Ala Gln 115
120 125 Tyr Ala Glu Gly Trp Ile Val
Leu Ile Asp Asp Ala Val Asp Phe Leu 130 135
140 Asp Ala Val Trp Trp Leu Asn Glu Ala Leu Asp Arg
Gly Ile Asn Val 145 150 155
160 Val Ala Ala Ile Leu Lys Lys Asp Asp Gly Val Leu Val Asn Asn Arg
165 170 175 Leu Arg Lys
Thr Leu Pro Val Val Asp Glu Val Thr Leu Leu Glu Gln 180
185 190 Val Pro Glu Gly Val Met Ala Ala
Val Glu Val Ala Ala Pro Gly Gln 195 200
205 Val Val Arg Ile Leu Ser Asn Pro Tyr Gly Ile Ala Thr
Phe Phe Gly 210 215 220
Leu Ser Pro Glu Glu Thr Gln Ala Ile Val Pro Ile Ala Arg Ala Leu 225
230 235 240 Ile Gly Asn Arg
Ser Ala Val Val Leu Lys Thr Pro Gln Gly Asp Val 245
250 255 Gln Ser Arg Val Ile Pro Ala Gly Asn
Leu Tyr Ile Ser Gly Glu Lys 260 265
270 Arg Arg Gly Glu Ala Asp Val Ala Glu Gly Ala Glu Ala Ile
Met Gln 275 280 285
Ala Met Ser Ala Cys Ala Pro Val Arg Asp Ile Arg Gly Glu Pro Gly 290
295 300 Thr His Ala Gly Gly
Met Leu Glu Arg Val Arg Lys Val Met Ala Ser 305 310
315 320 Leu Thr Gly His Glu Met Ser Ala Ile Tyr
Ile Gln Asp Leu Leu Ala 325 330
335 Val Asp Thr Phe Ile Pro Arg Lys Val Gln Gly Gly Met Ala Gly
Glu 340 345 350 Cys
Ala Met Glu Asn Ala Val Gly Met Ala Ala Met Val Lys Ala Asp 355
360 365 Arg Leu Gln Met Gln Val
Ile Ala Arg Glu Leu Ser Ala Arg Leu Gln 370 375
380 Thr Glu Val Val Val Gly Gly Val Glu Ala Asn
Met Ala Ile Ala Gly 385 390 395
400 Ala Leu Thr Thr Pro Gly Cys Ala Ala Pro Leu Ala Ile Leu Asp Leu
405 410 415 Gly Ala
Gly Ser Thr Asp Ala Ala Ile Val Asn Ala Glu Gly Gln Ile 420
425 430 Thr Ala Val His Leu Ala Gly
Ala Gly Asn Met Val Ser Leu Leu Ile 435 440
445 Lys Thr Glu Leu Gly Leu Glu Asp Leu Ser Leu Ala
Glu Ala Ile Lys 450 455 460
Lys Tyr Pro Leu Ala Lys Val Glu Ser Leu Phe Ser Ile Arg His Glu 465
470 475 480 Asn Gly Ala
Val Glu Phe Phe Arg Glu Ala Leu Ser Pro Ala Val Phe 485
490 495 Ala Lys Val Val Tyr Ile Lys Glu
Gly Glu Leu Val Pro Ile Asp Asn 500 505
510 Ala Ser Pro Leu Glu Lys Ile Arg Leu Val Arg Arg Gln
Ala Lys Glu 515 520 525
Lys Val Phe Val Thr Asn Cys Leu Arg Ala Leu Arg Gln Val Ser Pro 530
535 540 Gly Gly Ser Ile
Arg Asp Ile Ala Phe Val Val Leu Val Gly Gly Ser 545 550
555 560 Ser Leu Asp Phe Glu Ile Pro Gln Leu
Ile Thr Glu Ala Leu Ser His 565 570
575 Tyr Gly Val Val Ala Gly Gln Gly Asn Ile Arg Gly Thr Glu
Gly Pro 580 585 590
Arg Asn Ala Val Ala Thr Gly Leu Leu Leu Ala Gly Gln Ala Asn 595
600 605
61 4146 DNA Escherichia coli
61 ggatcccttg cccgctgttg atccgttgtt ccacctgata ttatgttaac ccagtagcca
60gagtgctcca tgttgcagca cagccactcc gtgggaggca taaagcgaca gttcccgttc
120ttctggctgc ggatagattc gactactcat caccgcttcc ccgtcgttaa taaatacttc
180cacggatgat gtatcgataa atatccttag ggcgagcgtg tcacgctgcg ggaggggaat
240actacggtag ccgtctaaat tctcgtgtgg gtaataccgc cacaaaacaa gtcgctcaga
300ttggttatca atatacagcc gcattccagt gccgagctgt aatccgtaat gttcggcatc
360actgttcttc agcgcccact gcaactgaat ctcaactgct tgcgcgtttt cctgcaaaac
420atatttattg ctgattgtgc ggggagagac agattgatgc tgctggcgta acgactcagc
480ttcgtgtacc gggcgttgta gaagtttgcc attgctctct gatagctcgc gcgccagcgt
540catgcagcct gcccatcctt cacgttttga gggcattggc gattcccaca tatccatcca
600gccgataaca atacgccgac catccttcgc taaaaagctt tgtggtgcat aaaagtcatg
660cccgttatca agttcagtaa aatgcccgga ttgtgcaaaa agtcgtcctg gcgaccacat
720tccgggtatt acgccacttt gaaagcgatt tcggtaactg tatccctcgg cattcattcc
780ctgcggggaa aacatcagat aatgctgatc gccaaggctg aaaaagtccg gacattccca
840catatagctt tcacccgcat cagcgtgggc cagtacgcga tcgaaggtcc attcacgcaa
900cgaactgccg cgataaagca ggatctgccc cgtgttgcct ggatctttcg ccccgactac
960catccaccat gtgtcggctt cacgccacac tttaggatcg cggaagtgca tgattccttc
1020tggtggagtg aggatcacac cctgtttctc gaaatgaata ccatcccgac tggtagccag
1080acattgtact tcgcgaattg catcgtcatt acctgcacca tcgagccaga cgtgtccggt
1140gtagataagt gagaggacac cattgtcatc gacagcacta cctgaaaaac acccgtcttt
1200gtcattatcg tctcctggcg ctagcgcaat aggctcatgc tgccagtgga tcatatcgtc
1260gctggtggca tgtccccagt gcattggccc ccagtgttcg ctcatcggat gatgttgata
1320aaacgcgtga taacgatcgt taaaccagat caggccgttt ggatcgttca tccacccggc
1380aggaggcgcg aggtgaaaat ggggatagaa agtgttaccc cggtgctcat gaagttttgc
1440tagggcgttt tgcgccgcat gcaatcgaga ttgcgtcatt ttaatcatcc tggttaagca
1500aatttggtga attgttaacg ttaactttta taaaaataaa gtcccttact ttcataaatg
1560cgatgaatat cacaaatgtt aacgttaact atgacgtttt gtgatcgaat atgcatgttt
1620tagtaaatcc atgacgattt tgcgaaaaag aggtttatca ctatgcgtaa ctcagatgaa
1680tttaagggaa aaaaatgtca gccaaagtat gggttttagg ggatgcggtc gtagatctct
1740tgccagaatc agacgggcgc ctactgcctt gtcctggcgg cgcgccagct aacgttgcgg
1800tgggaatcgc cagattaggc ggaacaagtg ggtttatagg tcgggtgggg gatgatcctt
1860ttggtgcgtt aatgcaaaga acgctgctaa ctgagggagt cgatatcacg tatctgaagc
1920aagatgaatg gcaccggaca tccacggtgc ttgtcgatct gaacgatcaa ggggaacgtt
1980catttacgtt tatggtccgc cccagtgccg atcttttttt agagacgaca gacttgccct
2040gctggcgaca tggcgaatgg ttacatctct gttcaattgc gttgtctgcc gagccttcgc
2100gtaccagcgc atttactgcg atgacggcga tccggcatgc cggaggtttt gtcagcttcg
2160atcctaatat tcgtgaagat ctatggcaag acgagcattt gctccgcttg tgtttgcggc
2220aggcgctaca actggcggat gtcgtcaagc tctcggaaga agaatggcga cttatcagtg
2280gaaaaacaca gaacgatcag gatatatgcg ccctggcaaa agagtatgag atcgccatgc
2340tgttggtgac taaaggtgca gaaggggtgg tggtctgtta tcgaggacaa gttcaccatt
2400ttgctggaat gtctgtgaat tgtgtcgata gcacgggggc gggagatgcg ttcgttgccg
2460ggttactcac aggtctgtcc tctacgggat tatctacaga tgagagagaa atgcgacgaa
2520ttatcgatct cgctcaacgt tgcggagcgc ttgcagtaac ggcgaaaggg gcaatgacag
2580cgctgccatg tcgacaagaa ctggaatagt gagaagtaaa cggcgaagtc gctcttatct
2640ctaaatagga cgtgaatttt ttaacgacag gcaggtaatt atggcactga atattccatt
2700cagaaatgcg tactatcgtt ttgcatccag ttactcattt ctctttttta tttcctggtc
2760gctgtggtgg tcgttatacg ctatttggct gaaaggacat ctagggttga cagggacgga
2820attaggtaca ctttattcgg tcaaccagtt taccagcatt ctatttatga tgttctacgg
2880catcgttcag gataaactcg gtctgaagaa accgctcatc tggtgtatga gtttcatcct
2940ggtcttgacc ggaccgttta tgatttacgt ttatgaaccg ttactgcaaa gcaatttttc
3000tgtaggtcta attctggggg cgctattttt tggcttgggg tatctggcgg gatgcggttt
3060gcttgatagc ttcaccgaaa aaatggcgcg aaattttcat ttcgaatatg gaacagcgcg
3120cgcctgggga tcttttggct atgctattgg cgcgttcttt gccggcatat tttttagtat
3180cagtccccat atcaacttct ggttggtctc gctatttggc gctgtattta tgatgatcaa
3240catgcgtttt aaagataagg atcaccagtg cgtagcggca gatgcgggag gggtaaaaaa
3300agaggatttt atcgcagttt tcaaggatcg aaacttctgg gttttcgtca tatttattgt
3360ggggacgtgg tctttctata acatttttga tcaacaactt tttcctgtct tttattcagg
3420tttattcgaa tcacacgatg taggaacgcg cctgtatggt tatctcaact cattccaggt
3480ggtactcgaa gcgctgtgca tggcgattat tcctttcttt gtgaatcggg tagggccaaa
3540aaatgcatta cttatcggag ttgtgattat ggcgttgcgt atcctttcct gcgcgctgtt
3600cgttaacccc tggattattt cattagtgaa gttgttacat gccattgagg ttccactttg
3660tgtcatatcc gtcttcaaat acagcgtggc aaactttgat aagcgcctgt cgtcgacgat
3720ctttctgatt ggttttcaaa ttgccagttc gcttgggatt gtgctgcttt caacgccgac
3780tgggatactc tttgaccacg caggctacca gacagttttc ttcgcaattt cgggtattgt
3840ctgcctgatg ttgctatttg gcattttctt cttgagtaaa aaacgcgagc aaatagttat
3900ggaaacgcct gtaccttcag caatatagac gtaaactttt tccggttgtt gtcgatagct
3960ctatatccct caaccggaaa ataataatag taaaatgctt agccctgcta ataatcgcct
4020aatccaaacg cctcattcat gttctggtac agtcgctcaa atgtacttca gatgcgcggt
4080tcgctgattt ccaggacatt gtcgtcattc agtgacctgt cccgtgtatc acggtcctgc
4140gaattc
4146
62 13669 DNA Artificial Sequence Plasmid
62 tagtaaagcc ctcgctagat
tttaatgcgg atgttgcgat tacttcgcca actattgcga 60taacaagaaa aagccagcct
ttcatgatat atctcccaat ttgtgtaggg cttattatgc 120acgcttaaaa ataataaaag
cagacttgac ctgatagttt ggctgtgagc aattatgtgc 180ttagtgcatc taacgcttga
gttaagccgc gccgcgaagc ggcgtcggct tgaacgaatt 240gttagacatt atttgccgac
taccttggtg atctcgcctt tcacgtagtg gacaaattct 300tccaactgat ctgcgcgcga
ggccaagcga tcttcttctt gtccaagata agcctgtcta 360gcttcaagta tgacgggctg
atactgggcc ggcaggcgct ccattgccca gtcggcagcg 420acatccttcg gcgcgatttt
gccggttact gcgctgtacc aaatgcggga caacgtaagc 480actacatttc gctcatcgcc
agcccagtcg ggcggcgagt tccatagcgt taaggtttca 540tttagcgcct caaatagatc
ctgttcagga accggatcaa agagttcctc cgccgctgga 600cctaccaagg caacgctatg
ttctcttgct tttgtcagca agatagccag atcaatgtcg 660atcgtggctg gctcgaagat
acctgcaaga atgtcattgc gctgccattc tccaaattgc 720agttcgcgct tagctggata
acgccacgga atgatgtcgt cgtgcacaac aatggtgact 780tctacagcgc ggagaatctc
gctctctcca ggggaagccg aagtttccaa aaggtcgttg 840atcaaagctc gccgcgttgt
ttcatcaagc cttacggtca ccgtaaccag caaatcaata 900tcactgtgtg gcttcaggcc
gccatccact gcggagccgt acaaatgtac ggccagcaac 960gtcggttcga gatggcgctc
gatgacgcca actacctctg atagttgagt cgatacttcg 1020gcgatcaccg cttccctcat
gatgtttaac tttgttttag ggcgactgcc ctgctgcgta 1080acatcgttgc tgctccataa
catcaaacat cgacccacgg cgtaacgcgc ttgctgcttg 1140gatgcccgag gcatagactg
taccccaaaa aaacagtcat aacaagccat gaaaaccgcc 1200actgcgccgt taccaccgct
gcgttcggtc aaggttctgg accagttgcg tgagcgcata 1260cgctacttgc attacagctt
acgaaccgaa caggcttatg tccactgggt tcgtgccttc 1320atccgtttcc acggtgtgcg
tcacccggca accttgggca gcagcgaagt cgaggcattt 1380ctgtcctggc tggcgaacga
gcgcaaggtt tcggtctcca cgcatcgtca ggcattggcg 1440gccttgctgt tcttctacgg
caaggtgctg tgcacggatc tgccctggct tcaggagatc 1500ggaagacctc ggccgtcgcg
gcgcttgccg gtggtgctga ccccggatga agtggttcgc 1560atcctcggtt ttctggaagg
cgagcatcgt ttgttcgccc agcttctgta tggaacgggc 1620atgcggatca gtgagggttt
gcaactgcgg gtcaaggatc tggatttcga tcacggcacg 1680atcatcgtgc gggagggcaa
gggctccaag gatcgggcct tgatgttacc cgagagcttg 1740gcacccagcc tgcgcgagca
ggggaattaa ttcccacggg ttttgctgcc cgcaaacggg 1800ctgttctggt gttgctagtt
tgttatcaga atcgcagatc cggcttcagc cggtttgccg 1860gctgaaagcg ctatttcttc
cagaattgcc atgatttttt ccccacggga ggcgtcactg 1920gctcccgtgt tgtcggcagc
tttgattcga taagcagcat cgcctgtttc aggctgtcta 1980tgtgtgactg ttgagctgta
acaagttgtc tcaggtgttc aatttcatgt tctagttgct 2040ttgttttact ggtttcacct
gttctattag gtgttacatg ctgttcatct gttacattgt 2100cgatctgttc atggtgaaca
gctttgaatg caccaaaaac tcgtaaaagc tctgatgtat 2160ctatcttttt tacaccgttt
tcatctgtgc atatggacag ttttcccttt gatatgtaac 2220ggtgaacagt tgttctactt
ttgtttgtta gtcttgatgc ttcactgata gatacaagag 2280ccataagaac ctcagatcct
tccgtattta gccagtatgt tctctagtgt ggttcgttgt 2340ttttgcgtga gccatgagaa
cgaaccattg agatcatact tactttgcat gtcactcaaa 2400aattttgcct caaaactggt
gagctgaatt tttgcagtta aagcatcgtg tagtgttttt 2460cttagtccgt tatgtaggta
ggaatctgat gtaatggttg ttggtatttt gtcaccattc 2520atttttatct ggttgttctc
aagttcggtt acgagatcca tttgtctatc tagttcaact 2580tggaaaatca acgtatcagt
cgggcggcct cgcttatcaa ccaccaattt catattgctg 2640taagtgttta aatctttact
tattggtttc aaaacccatt ggttaagcct tttaaactca 2700tggtagttat tttcaagcat
taacatgaac ttaaattcat caaggctaat ctctatattt 2760gccttgtgag ttttcttttg
tgttagttct tttaataacc actcataaat cctcatagag 2820tatttgtttt caaaagactt
aacatgttcc agattatatt ttatgaattt ttttaactgg 2880aaaagataag gcaatatctc
ttcactaaaa actaattcta atttttcgct tgagaacttg 2940gcatagtttg tccactggaa
aatctcaaag cctttaacca aaggattcct gatttccaca 3000gttctcgtca tcagctctct
ggttgcttta gctaatacac cataagcatt ttccctactg 3060atgttcatca tctgagcgta
ttggttataa gtgaacgata ccgtccgttc tttccttgta 3120gggttttcaa tcgtggggtt
gagtagtgcc acacagcata aaattagctt ggtttcatgc 3180tccgttaagt catagcgact
aatcgctagt tcatttgctt tgaaaacaac taattcagac 3240atacatctca attggtctag
gtgattttaa tcactatacc aattgagatg ggctagtcaa 3300tgataattac tagtcctttt
cctttgagtt gtgggtatct gtaaattctg ctagaccttt 3360gctggaaaac ttgtaaattc
tgctagaccc tctgtaaatt ccgctagacc tttgtgtgtt 3420ttttttgttt atattcaagt
ggttataatt tatagaataa agaaagaata aaaaaagata 3480aaaagaatag atcccagccc
tgtgtataac tcactacttt agtcagttcc gcagtattac 3540aaaaggatgt cgcaaacgct
gtttgctcct ctacaaaaca gaccttaaaa ccctaaaggc 3600ttaagtagca ccctcgcaag
ctcgggcaaa tcgctgaata ttccttttgt ctccgaccat 3660caggcacctg agtcgctgtc
tttttcgtga cattcagttc gctgcgctca cggctctggc 3720agtgaatggg ggtaaatggc
actacaggcg ccttttatgg attcatgcaa ggaaactacc 3780cataatacaa gaaaagcccg
tcacgggctt ctcagggcgt tttatggcgg gtctgctatg 3840tggtgctatc tgactttttg
ctgttcagca gttcctgccc tctgattttc cagtctgacc 3900acttcggatt atcccgtgac
aggtcattca gactggctaa tgcacccagt aaggcagcgg 3960tatcatcaac aggcttaccc
gtcttactgt cgggaattca tttaaatagt caaaagcctc 4020cgaccggagg cttttgactg
ctaggcgatc tgtgctgttt gccacggtat gcagcaccag 4080cgcgagatta tgggctcgca
cgctcgactg tcggacgggg gcactggaac gagaagtcag 4140gcgagccgtc acgcccttga
caatgccaca tcctgagcaa ataattcaac cactaaacaa 4200atcaaccgcg tttcccggag
gtaaccaagc ttgcgggaga gaatgatgaa caagagccaa 4260caagttcaga caatcaccct
ggccgccgcc cagcaaatgg cggcggcggt ggaaaaaaaa 4320gccactgaga tcaacgtggc
ggtggtgttt tccgtagttg accgcggagg caacacgctg 4380cttatccagc ggatggacga
ggccttcgtc tccagctgcg atatttccct gaataaagcc 4440tggagcgcct gcagcctgaa
gcaaggtacc catgaaatta cgtcagcggt ccagccagga 4500caatctctgt acggtctgca
gctaaccaac caacagcgaa ttattatttt tggcggcggc 4560ctgccagtta tttttaatga
gcaggtaatt ggcgccgtcg gcgttagcgg cggtacggtc 4620gagcaggatc aattattagc
ccagtgcgcc ctggattgtt tttccgcatt ataacctgaa 4680gcgagaaggt atattatgag
ctatcgtatg ttccgccagg cattctgagt gttaacgagg 4740ggaccgtcat gtcgctttca
ccgccaggcg tacgcctgtt ttacgatccg cgcgggcacc 4800atgccggcgc catcaatgag
ctgtgctggg ggctggagga gcagggggtc ccctgccaga 4860ccataaccta tgacggaggc
ggtgacgccg ctgcgctggg cgccctggcg gccagaagct 4920cgcccctgcg ggtgggtatc
gggctcagcg cgtccggcga gatagccctc actcatgccc 4980agctgccggc ggacgcgccg
ctggctaccg gacacgtcac cgatagcgac gatcaactgc 5040gtacgctcgg cgccaacgcc
gggcagctgg ttaaagtcct gccgttaagt gagagaaact 5100gaatgtatcg tatctatacc
cgcaccgggg ataaaggcac caccgccctg tacggcggca 5160gccgcatcga gaaagaccat
attcgcgtcg aggcctacgg caccgtcgat gaactgatat 5220cccagctggg cgtctgctac
gccacgaccc gcgacgccgg gctgcgggaa agcctgcacc 5280atattcagca gacgctgttc
gtgctggggg ctgaactggc cagcgatgcg cggggcctga 5340cccgcctgag ccagacgatc
ggcgaagagg agatcaccgc cctggagcgg cttatcgacc 5400gcaatatggc cgagagcggc
ccgttaaaac agttcgtgat cccggggagg aatctcgcct 5460ctgcccagct gcacgtggcg
cgcacccagt cccgtcggct cgaacgcctg ctgacggcca 5520tggaccgcgc gcatccgctg
cgcgacgcgc tcaaacgcta cagcaatcgc ctgtcggatg 5580ccctgttctc catggcgcga
atcgaagaga ctaggcctga tgcttgcgct tgaactggcc 5640tagcaaacac agaaaaaagc
ccgcacctga cagtgcgggc tttttttttc ctaggcgatc 5700tgtgctgttt gccacggtat
gcagcaccag cgcgagatta tgggctcgca cgctcgactg 5760tcggacgggg gcactggaac
gagaagtcag gcgagccgtc acgcccttga caatgccaca 5820tcctgagcaa ataattcaac
cactaaacaa atcaaccgcg tttcccggag gtaaccaagc 5880ttcacctttt gagccgatga
acaatgaaaa gatcaaaacg atttgcagta ctggcccagc 5940gccccgtcaa tcaggacggg
ctgattggcg agtggcctga agaggggctg atcgccatgg 6000acagcccctt tgacccggtc
tcttcagtaa aagtggacaa cggtctgatc gtcgaactgg 6060acggcaaacg ccgggaccag
tttgacatga tcgaccgatt tatcgccgat tacgcgatca 6120acgttgagcg cacagagcag
gcaatgcgcc tggaggcggt ggaaatagcc cgtatgctgg 6180tggatattca cgtcagccgg
gaggagatca ttgccatcac taccgccatc acgccggcca 6240aagcggtcga ggtgatggcg
cagatgaacg tggtggagat gatgatggcg ctgcagaaga 6300tgcgtgcccg ccggaccccc
tccaaccagt gccacgtcac caatctcaaa gataatccgg 6360tgcagattgc cgctgacgcc
gccgaggccg ggatccgcgg cttctcagaa caggagacca 6420cggtcggtat cgcgcgctac
gcgccgttta acgccctggc gctgttggtc ggttcgcagt 6480gcggccgccc cggcgtgttg
acgcagtgct cggtggaaga ggccaccgag ctggagctgg 6540gcatgcgtgg cttaaccagc
tacgccgaga cggtgtcggt ctacggcacc gaagcggtat 6600ttaccgacgg cgatgatacg
ccgtggtcaa aggcgttcct cgcctcggcc tacgcctccc 6660gcgggttgaa aatgcgctac
acctccggca ccggatccga agcgctgatg ggctattcgg 6720agagcaagtc gatgctctac
ctcgaatcgc gctgcatctt cattactaaa ggcgccgggg 6780ttcagggact gcaaaacggc
gcggtgagct gtatcggcat gaccggcgct gtgccgtcgg 6840gcattcgggc ggtgctggcg
gaaaacctga tcgcctctat gctcgacctc gaagtggcgt 6900ccgccaacga ccagactttc
tcccactcgg atattcgccg caccgcgcgc accctgatgc 6960agatgctgcc gggcaccgac
tttattttct ccggctacag cgcggtgccg aactacgaca 7020acatgttcgc cggctcgaac
ttcgatgcgg aagattttga tgattacaac atcctgcagc 7080gtgacctgat ggttgacggc
ggcctgcgtc cggtgaccga ggcggaaacc attgccattc 7140gccagaaagc ggcgcgggcg
atccaggcgg ttttccgcga gctggggctg ccgccaatcg 7200ccgacgagga ggtggaggcc
gccacctacg cgcacggcag caacgagatg ccgccgcgta 7260acgtggtgga ggatctgagt
gcggtggaag agatgatgaa gcgcaacatc accggcctcg 7320atattgtcgg cgcgctgagc
cgcagcggct ttgaggatat cgccagcaat attctcaata 7380tgctgcgcca gcgggtcacc
ggcgattacc tgcagacctc ggccattctc gatcggcagt 7440tcgaggtggt gagtgcggtc
aacgacatca atgactatca ggggccgggc accggctatc 7500gcatctctgc cgaacgctgg
gcggagatca aaaatattcc gggcgtggtt cagcccgaca 7560ccattgaata aggcggtatt
cctgtgcaac agacaaccca aattcagccc tcttttaccc 7620tgaaaacccg cgagggcggg
gtagcttctg ccgatgaacg cgccgatgaa gtggtgatcg 7680gcgtcggccc tgccttcgat
aaacaccagc atcacactct gatcgatatg ccccatggcg 7740cgatcctcaa agagctgatt
gccggggtgg aagaagaggg gcttcacgcc cgggtggtgc 7800gcattctgcg cacgtccgac
gtctccttta tggcctggga tgcggccaac ctgagcggct 7860cggggatcgg catcggtatc
cagtcgaagg ggaccacggt catccatcag cgcgatctgc 7920tgccgctcag caacctggag
ctgttctccc aggcgccgct gctgacgctg gagacctacc 7980ggcagattgg caaaaacgct
gcgcgctatg cgcgcaaaga gtcaccttcg ccggtgccgg 8040tggtgaacga tcagatggtg
cggccgaaat ttatggccaa agccgcgcta tttcatatca 8100aagagaccaa acatgtggtg
caggacgccg agcccgtcac cctgcacatc gacttagtaa 8160gggagtgacc atgagcgaga
aaaccatgcg cgtgcaggat tatccgttag ccacccgctg 8220cccggagcat atcctgacgc
ctaccggcaa accattgacc gatattaccc tcgagaaggt 8280gctctctggc gaggtgggcc
cgcaggatgt gcggatctcc cgccagaccc ttgagtacca 8340ggcgcagatt gccgagcaga
tgcagcgcca tgcggtggcg cgcaatttcc gccgcgcggc 8400ggagcttatc gccattcctg
acgagcgcat tctggctatc tataacgcgc tgcgcccgtt 8460ccgctcctcg caggcggagc
tgctggcgat cgccgacgag ctggagcaca cctggcatgc 8520gacagtgaat gccgcctttg
tccgggagtc ggcggaagtg tatcagcagc ggcataagct 8580gcgtaaagga agctaagcgg
aggtcagcat gccgttaata gccgggattg atatcggcaa 8640cgccaccacc gaggtggcgc
tggcgtccga ctacccgcag gcgagggcgt ttgttgccag 8700cgggatcgtc gcgacgacgg
gcatgaaagg gacgcgggac aatatcgccg ggaccctcgc 8760cgcgctggag caggccctgg
cgaaaacacc gtggtcgatg agcgatgtct ctcgcatcta 8820tcttaacgaa gccgcgccgg
tgattggcga tgtggcgatg gagaccatca ccgagaccat 8880tatcaccgaa tcgaccatga
tcggtcataa cccgcagacg ccgggcgggg tgggcgttgg 8940cgtggggacg actatcgccc
tcgggcggct ggcgacgctg ccggcggcgc agtatgccga 9000ggggtggatc gtactgattg
acgacgccgt cgatttcctt gacgccgtgt ggtggctcaa 9060tgaggcgctc gaccggggga
tcaacgtggt ggcggcgatc ctcaaaaagg acgacggcgt 9120gctggtgaac aaccgcctgc
gtaaaaccct gccggtggtg gatgaagtga cgctgctgga 9180gcaggtcccc gagggggtaa
tggcggcggt ggaagtggcc gcgccgggcc aggtggtgcg 9240gatcctgtcg aatccctacg
ggatcgccac cttcttcggg ctaagcccgg aagagaccca 9300ggccatcgtc cccatcgccc
gcgccctgat tggcaaccgt tccgcggtgg tgctcaagac 9360cccgcagggg gatgtgcagt
cgcgggtgat cccggcgggc aacctctaca ttagcggcga 9420aaagcgccgc ggagaggccg
atgtcgccga gggcgcggaa gccatcatgc aggcgatgag 9480cgcctgcgct ccggtacgcg
acatccgcgg cgaaccgggc acccacgccg gcggcatgct 9540tgagcgggtg cgcaaggtaa
tggcgtccct gaccggccat gagatgagcg cgatatacat 9600ccaggatctg ctggcggtgg
atacgtttat tccgcgcaag gtgcagggcg ggatggccgg 9660cgagtgcgcc atggagaatg
ccgtcgggat ggcggcgatg gtgaaagcgg atcgtctgca 9720aatgcaggtt atcgcccgcg
aactgagcgc ccgactgcag accgaggtgg tggtgggcgg 9780cgtggaggcc aacatggcca
tcgccggggc gttaaccact cccggctgtg cggcgccgct 9840ggcgatcctc gacctcggcg
ccggctcgac ggatgcggcg atcgtcaacg cggaggggca 9900gataacggcg gtccatctcg
ccggggcggg gaatatggtc agcctgttga ttaaaaccga 9960gctgggcctc gaggatcttt
cgctggcgga agcgataaaa aaatacccgc tggccaaagt 10020ggaaagcctg ttcagtattc
gtcacgagaa tggcgcggtg gagttctttc gggaagccct 10080cagcccggcg gtgttcgcca
aagtggtgta catcaaggag ggcgaactgg tgccgatcga 10140taacgccagc ccgctggaaa
aaattcgtct cgtgcgccgg caggcgaaag agaaagtgtt 10200tgtcaccaac tgcctgcgcg
cgctgcgcca ggtctcaccc ggcggttcca ttcgcgatat 10260cgcctttgtg gtgctggtgg
gcggctcatc gctggacttt gagatcccgc agcttatcac 10320ggaagccttg tcgcactatg
gcgtggtcgc cgggcagggc aatattcggg gaacagaagg 10380gccgcgcaat gcggtcgcca
ccgggctgct actggccggt caggcgaatt aaacgggcgc 10440tcgcgccagc ctctaggtac
aaataaaaaa ggcacgtcag atgacgtgcc ttttttcttg 10500tctagagtac tggcgaaagg
gggatgtgct gcaaggcgat taagttgggt aacgccaggg 10560ttttcccagt cacgacgttg
taaaacgacg gccagtgaat tcgagctcgg tacccggggc 10620ggccgcgcta gcgcccgatc
cagctggagt ttgtagaaac gcaaaaaggc catccgtcag 10680gatggccttc tgcttaattt
gatgcctggc agtttatggc gggcgtcctg cccgccaccc 10740tccgggccgt tgcttcgcaa
cgttcaaatc cgctcccggc ggatttgtcc tactcaggag 10800agcgttcacc gacaaacaac
agataaaacg aaaggcccag tctttcgact gagcctttcg 10860ttttatttga tgcctggcag
ttccctactc tcgcatgggg agaccccaca ctaccatcgg 10920cgctacggcg tttcacttct
gagttcggca tggggtcagg tgggaccacc gcgctactgc 10980cgccaggcaa attctgtttt
atcagaccgc ttctgcgttc tgatttaatc tgtatcaggc 11040tgaaaatctt ctctcatccg
ccaaaacagc caagcttgca tgcctgcagc ccgggttacc 11100atttcaacag atcgtcctta
gcatataagt agtcgtcaaa aatgaattca acttcgtctg 11160tttcggcatt gtagccgcca
actctgatgg attcgtggtt tttgacaatg atgtcacagc 11220ctttttcctt taggaagtcc
aagtcgaaag tagtggcaat accaatgatc ttacaaccgg 11280cggcttttcc ggcggcaata
cctgctggag cgtcttcaaa tactactacc ttagatttgg 11340aagggtcttg ctcattgatc
ggatatccta agccattcct gcccttcaga tatggttctg 11400gatgaggctt accctgtttg
acatcattag cggtaatgaa gtactttggt ctcctgattc 11460ccagatgctc gaaccatttt
tgtgccatat cacgggtacc ggaagttgcc acagcccatt 11520tctcttttgg tagagcgttc
aaagcgttgc acagcttaac tgcacctggg acttcaatgg 11580atttttcacc gtacttgacc
ggaatttcag cttctaattt gttaacatac tcttcattgg 11640caaagtctgg agcgaactta
gcaatggcat caaacgttct ccaaccatgc gagacttgga 11700taacgtgttc agcatcgaaa
taaggtttgt ccttaccgaa atccctccag aatgcagcaa 11760tggctggttg agagatgata
atggtaccgt cgacgtcgaa caaagcggcg ttaactttca 11820aagatagagg tttagtagtc
aatcccataa ttctagtctg tttcctggat ccaataaatc 11880taatcttcat gtagatctaa
ttcttcaatc atgtccggca ggttcttcat tgggtagttg 11940ttgtaaacga tttggtatac
ggcttcaaat aatgggaagt cttcgacaga gccacatgtt 12000tccaaccatt cgtgaacttc
tttgcaggta attaaacctt gagcggattg gccattcaac 12060aactcctttt cacattccca
ggcgtcctta ccagaagtag ccattagcct agcaaccttg 12120acgtttctac caccagcgca
ggtggtgatc aaatcagcaa caccagcaga ctcttggtag 12180tatgtttctt ctctagattc
tgggaaaaac atttgaccga atctgatgat ctcacccaaa 12240ccgactcttt ggatggcagc
agaagcgttg ttaccccagc ctagaccttc gacgaaacca 12300caacctaagg caacaacgtt
cttcaaagca ccacagatgg agataccagc aacatcttcg 12360atgacactaa cgtggaagta
aggtctgtgg aacaaggcct ttagaacctt atggtcgacg 12420tccttgccct cgcctctgaa
atcctttgga atgtggtaag caactgttgt ttcagaccag 12480tgttcttgag cgacttcggt
ggcaatgtta gcaccagata gagcaccaca ttgaatacct 12540agttcctcag tgatgtaaga
ggatagcaat tggacacctt tagcaccaac ttcaaaaccc 12600tttagacagg agatagctct
gacgtgtgaa tcaacatgac ctttcaattg gctacagata 12660cggggcaaaa attgatgtgg
aatgttgaaa acgatgatgt cgacatcctt gactgaatca 12720atcaagtctg gattagcaac
caaattgtcg ggtagagtga tgccaggcaa gtatttcacg 12780ttttgatgtc tagtatttat
gatttcagtc aatttttcac cattgatctc ttcttcgaac 12840acccacattt gtactattgg
agcgaaaact tctgggtatc ccttacaatt ttcggcaacc 12900accttggcaa tagtagtacc
ccagttacca gatccaatca cagtaacctt gaaaggcttt 12960tcggcagcct tcaaagaaac
agaagaggaa cttctctttc taccagcatt caagtggccg 13020gaagttaagt ttaatctatc
agcagcagca gccatggaat tgtcctcctt actagtcatg 13080gtctgtttcc tgtgtgaaat
tgttatccgc tcacaattcc acacattata cgagccggat 13140gattaattgt caacagctca
tttcagaata tttgccagaa ccgttatgat gtcggcgcaa 13200aaaacattat ccagaacggg
agtgcgcctt gagcgacacg aattatgcag tgatttacga 13260cctgcacagc cataccacag
cttccgatgg ctgcctgacg ccagaagcat tggtgcacgc 13320tagccagtac atttaaatgg
taccctctag tcaaggcctt aagtgagtcg tattacggac 13380tggccgtcgt tttacaacgt
cgtgactggg aaaaccctgg cgttacccaa cttaatcgcc 13440ttgcagcaca tccccctttc
gccagctggc gtaatagcga agaggcccgc accgatcgcc 13500cttcccaaca gttgcgcagc
ctgaatggcg aatggcgcct gatgcggtat tttctcctta 13560cgcatctgtg cggtatttca
caccgcatat ggtgcactct cagtacaatc tgctctgatg 13620ccgcatagtt aagccagccc
cgacacccgc caacacccgc tgacgagct 13669
SEQ ID NO: 63; length 13543; DNA; Artificial Sequence; Plasmid
tagtaaagcc ctcgctagat tttaatgcgg atgttgcgat tacttcgcca
actattgcga 60taacaagaaa aagccagcct ttcatgatat atctcccaat ttgtgtaggg
cttattatgc 120acgcttaaaa ataataaaag cagacttgac ctgatagttt ggctgtgagc
aattatgtgc 180ttagtgcatc taacgcttga gttaagccgc gccgcgaagc ggcgtcggct
tgaacgaatt 240gttagacatt atttgccgac taccttggtg atctcgcctt tcacgtagtg
gacaaattct 300tccaactgat ctgcgcgcga ggccaagcga tcttcttctt gtccaagata
agcctgtcta 360gcttcaagta tgacgggctg atactgggcc ggcaggcgct ccattgccca
gtcggcagcg 420acatccttcg gcgcgatttt gccggttact gcgctgtacc aaatgcggga
caacgtaagc 480actacatttc gctcatcgcc agcccagtcg ggcggcgagt tccatagcgt
taaggtttca 540tttagcgcct caaatagatc ctgttcagga accggatcaa agagttcctc
cgccgctgga 600cctaccaagg caacgctatg ttctcttgct tttgtcagca agatagccag
atcaatgtcg 660atcgtggctg gctcgaagat acctgcaaga atgtcattgc gctgccattc
tccaaattgc 720agttcgcgct tagctggata acgccacgga atgatgtcgt cgtgcacaac
aatggtgact 780tctacagcgc ggagaatctc gctctctcca ggggaagccg aagtttccaa
aaggtcgttg 840atcaaagctc gccgcgttgt ttcatcaagc cttacggtca ccgtaaccag
caaatcaata 900tcactgtgtg gcttcaggcc gccatccact gcggagccgt acaaatgtac
ggccagcaac 960gtcggttcga gatggcgctc gatgacgcca actacctctg atagttgagt
cgatacttcg 1020gcgatcaccg cttccctcat gatgtttaac tttgttttag ggcgactgcc
ctgctgcgta 1080acatcgttgc tgctccataa catcaaacat cgacccacgg cgtaacgcgc
ttgctgcttg 1140gatgcccgag gcatagactg taccccaaaa aaacagtcat aacaagccat
gaaaaccgcc 1200actgcgccgt taccaccgct gcgttcggtc aaggttctgg accagttgcg
tgagcgcata 1260cgctacttgc attacagctt acgaaccgaa caggcttatg tccactgggt
tcgtgccttc 1320atccgtttcc acggtgtgcg tcacccggca accttgggca gcagcgaagt
cgaggcattt 1380ctgtcctggc tggcgaacga gcgcaaggtt tcggtctcca cgcatcgtca
ggcattggcg 1440gccttgctgt tcttctacgg caaggtgctg tgcacggatc tgccctggct
tcaggagatc 1500ggaagacctc ggccgtcgcg gcgcttgccg gtggtgctga ccccggatga
agtggttcgc 1560atcctcggtt ttctggaagg cgagcatcgt ttgttcgccc agcttctgta
tggaacgggc 1620atgcggatca gtgagggttt gcaactgcgg gtcaaggatc tggatttcga
tcacggcacg 1680atcatcgtgc gggagggcaa gggctccaag gatcgggcct tgatgttacc
cgagagcttg 1740gcacccagcc tgcgcgagca ggggaattaa ttcccacggg ttttgctgcc
cgcaaacggg 1800ctgttctggt gttgctagtt tgttatcaga atcgcagatc cggcttcagc
cggtttgccg 1860gctgaaagcg ctatttcttc cagaattgcc atgatttttt ccccacggga
ggcgtcactg 1920gctcccgtgt tgtcggcagc tttgattcga taagcagcat cgcctgtttc
aggctgtcta 1980tgtgtgactg ttgagctgta acaagttgtc tcaggtgttc aatttcatgt
tctagttgct 2040ttgttttact ggtttcacct gttctattag gtgttacatg ctgttcatct
gttacattgt 2100cgatctgttc atggtgaaca gctttgaatg caccaaaaac tcgtaaaagc
tctgatgtat 2160ctatcttttt tacaccgttt tcatctgtgc atatggacag ttttcccttt
gatatgtaac 2220ggtgaacagt tgttctactt ttgtttgtta gtcttgatgc ttcactgata
gatacaagag 2280ccataagaac ctcagatcct tccgtattta gccagtatgt tctctagtgt
ggttcgttgt 2340ttttgcgtga gccatgagaa cgaaccattg agatcatact tactttgcat
gtcactcaaa 2400aattttgcct caaaactggt gagctgaatt tttgcagtta aagcatcgtg
tagtgttttt 2460cttagtccgt tatgtaggta ggaatctgat gtaatggttg ttggtatttt
gtcaccattc 2520atttttatct ggttgttctc aagttcggtt acgagatcca tttgtctatc
tagttcaact 2580tggaaaatca acgtatcagt cgggcggcct cgcttatcaa ccaccaattt
catattgctg 2640taagtgttta aatctttact tattggtttc aaaacccatt ggttaagcct
tttaaactca 2700tggtagttat tttcaagcat taacatgaac ttaaattcat caaggctaat
ctctatattt 2760gccttgtgag ttttcttttg tgttagttct tttaataacc actcataaat
cctcatagag 2820tatttgtttt caaaagactt aacatgttcc agattatatt ttatgaattt
ttttaactgg 2880aaaagataag gcaatatctc ttcactaaaa actaattcta atttttcgct
tgagaacttg 2940gcatagtttg tccactggaa aatctcaaag cctttaacca aaggattcct
gatttccaca 3000gttctcgtca tcagctctct ggttgcttta gctaatacac cataagcatt
ttccctactg 3060atgttcatca tctgagcgta ttggttataa gtgaacgata ccgtccgttc
tttccttgta 3120gggttttcaa tcgtggggtt gagtagtgcc acacagcata aaattagctt
ggtttcatgc 3180tccgttaagt catagcgact aatcgctagt tcatttgctt tgaaaacaac
taattcagac 3240atacatctca attggtctag gtgattttaa tcactatacc aattgagatg
ggctagtcaa 3300tgataattac tagtcctttt cctttgagtt gtgggtatct gtaaattctg
ctagaccttt 3360gctggaaaac ttgtaaattc tgctagaccc tctgtaaatt ccgctagacc
tttgtgtgtt 3420ttttttgttt atattcaagt ggttataatt tatagaataa agaaagaata
aaaaaagata 3480aaaagaatag atcccagccc tgtgtataac tcactacttt agtcagttcc
gcagtattac 3540aaaaggatgt cgcaaacgct gtttgctcct ctacaaaaca gaccttaaaa
ccctaaaggc 3600ttaagtagca ccctcgcaag ctcgggcaaa tcgctgaata ttccttttgt
ctccgaccat 3660caggcacctg agtcgctgtc tttttcgtga cattcagttc gctgcgctca
cggctctggc 3720agtgaatggg ggtaaatggc actacaggcg ccttttatgg attcatgcaa
ggaaactacc 3780cataatacaa gaaaagcccg tcacgggctt ctcagggcgt tttatggcgg
gtctgctatg 3840tggtgctatc tgactttttg ctgttcagca gttcctgccc tctgattttc
cagtctgacc 3900acttcggatt atcccgtgac aggtcattca gactggctaa tgcacccagt
aaggcagcgg 3960tatcatcaac aggcttaccc gtcttactgt cgggaattca tttaaatagt
caaaagcctc 4020cgaccggagg cttttgactg ctaggcgatc tgtgctgttt gccacggtat
gcagcaccag 4080cgcgagatta tgggctcgca cgctcgactg tcggacgggg gcactggaac
gagaagtcag 4140gcgagccgtc acgcccttga ctatgccaca tcctgagcaa ataattcaac
cactaaacaa 4200atcaaccgcg tttcccggag gtaaccaagc ttgcgggaga gaatgatgaa
caagagccaa 4260caagttcaga caatcaccct ggccgccgcc cagcaaatgg cggcggcggt
ggaaaaaaaa 4320gccactgaga tcaacgtggc ggtggtgttt tccgtagttg accgcggagg
caacacgctg 4380cttatccagc ggatggacga ggccttcgtc tccagctgcg atatttccct
gaataaagcc 4440tggagcgcct gcagcctgaa gcaaggtacc catgaaatta cgtcagcggt
ccagccagga 4500caatctctgt acggtctgca gctaaccaac caacagcgaa ttattatttt
tggcggcggc 4560ctgccagtta tttttaatga gcaggtaatt ggcgccgtcg gcgttagcgg
cggtacggtc 4620gagcaggatc aattattagc ccagtgcgcc ctggattgtt tttccgcatt
ataacctgaa 4680gcgagaaggt atattatgag ctatcgtatg ttccgccagg cattctgagt
gttaacgagg 4740ggaccgtcat gtcgctttca ccgccaggcg tacgcctgtt ttacgatccg
cgcgggcacc 4800atgccggcgc catcaatgag ctgtgctggg ggctggagga gcagggggtc
ccctgccaga 4860ccataaccta tgacggaggc ggtgacgccg ctgcgctggg cgccctggcg
gccagaagct 4920cgcccctgcg ggtgggtatc gggctcagcg cgtccggcga gatagccctc
actcatgccc 4980agctgccggc ggacgcgccg ctggctaccg gacacgtcac cgatagcgac
gatcaactgc 5040gtacgctcgg cgccaacgcc gggcagctgg ttaaagtcct gccgttaagt
gagagaaact 5100gaatgtatcg tatctatacc cgcaccgggg ataaaggcac caccgccctg
tacggcggca 5160gccgcatcga gaaagaccat attcgcgtcg aggcctacgg caccgtcgat
gaactgatat 5220cccagctggg cgtctgctac gccacgaccc gcgacgccgg gctgcgggaa
agcctgcacc 5280atattcagca gacgctgttc gtgctggggg ctgaactggc cagcgatgcg
cggggcctga 5340cccgcctgag ccagacgatc ggcgaagagg agatcaccgc cctggagcgg
cttatcgacc 5400gcaatatggc cgagagcggc ccgttaaaac agttcgtgat cccggggagg
aatctcgcct 5460ctgcccagct gcacgtggcg cgcacccagt cccgtcggct cgaacgcctg
ctgacggcca 5520tggaccgcgc gcatccgctg cgcgacgcgc tcaaacgcta cagcaatcgc
ctgtcggatg 5580ccctgttctc catggcgcga atcgaagaga ctaggcctga tgcttgcgct
tgaactggcc 5640tagcaaacac agaaaaaagc ccgcacctga cagtgcgggc tttttttttc
ctaggcgatc 5700tgtgctgttt gccacggtat gcagcaccag cgcgagatta tgggctcgca
cgctcgactg 5760tcggacgggg gcactggaac gagaagtcag gcgagccgtc acgcccttga
ctatgccaca 5820tcctgagcaa ataattcaac cactaaacaa atcaaccgcg tttcccggag
gtaaccaagc 5880ttcacctttt gagccgatga acaatgaaaa gatcaaaacg atttgcagta
ctggcccagc 5940gccccgtcaa tcaggacggg ctgattggcg agtggcctga agaggggctg
atcgccatgg 6000acagcccctt tgacccggtc tcttcagtaa aagtggacaa cggtctgatc
gtcgaactgg 6060acggcaaacg ccgggaccag tttgacatga tcgaccgatt tatcgccgat
tacgcgatca 6120acgttgagcg cacagagcag gcaatgcgcc tggaggcggt ggaaatagcc
cgtatgctgg 6180tggatattca cgtcagccgg gaggagatca ttgccatcac taccgccatc
acgccggcca 6240aagcggtcga ggtgatggcg cagatgaacg tggtggagat gatgatggcg
ctgcagaaga 6300tgcgtgcccg ccggaccccc tccaaccagt gccacgtcac caatctcaaa
gataatccgg 6360tgcagattgc cgctgacgcc gccgaggccg ggatccgcgg cttctcagaa
caggagacca 6420cggtcggtat cgcgcgctac gcgccgttta acgccctggc gctgttggtc
ggttcgcagt 6480gcggccgccc cggcgtgttg acgcagtgct cggtggaaga ggccaccgag
ctggagctgg 6540gcatgcgtgg cttaaccagc tacgccgaga cggtgtcggt ctacggcacc
gaagcggtat 6600ttaccgacgg cgatgatacg ccgtggtcaa aggcgttcct cgcctcggcc
tacgcctccc 6660gcgggttgaa aatgcgctac acctccggca ccggatccga agcgctgatg
ggctattcgg 6720agagcaagtc gatgctctac ctcgaatcgc gctgcatctt cattactaaa
ggcgccgggg 6780ttcagggact gcaaaacggc gcggtgagct gtatcggcat gaccggcgct
gtgccgtcgg 6840gcattcgggc ggtgctggcg gaaaacctga tcgcctctat gctcgacctc
gaagtggcgt 6900ccgccaacga ccagactttc tcccactcgg atattcgccg caccgcgcgc
accctgatgc 6960agatgctgcc gggcaccgac tttattttct ccggctacag cgcggtgccg
aactacgaca 7020acatgttcgc cggctcgaac ttcgatgcgg aagattttga tgattacaac
atcctgcagc 7080gtgacctgat ggttgacggc ggcctgcgtc cggtgaccga ggcggaaacc
attgccattc 7140gccagaaagc ggcgcgggcg atccaggcgg ttttccgcga gctggggctg
ccgccaatcg 7200ccgacgagga ggtggaggcc gccacctacg cgcacggcag caacgagatg
ccgccgcgta 7260acgtggtgga ggatctgagt gcggtggaag agatgatgaa gcgcaacatc
accggcctcg 7320atattgtcgg cgcgctgagc cgcagcggct ttgaggatat cgccagcaat
attctcaata 7380tgctgcgcca gcgggtcacc ggcgattacc tgcagacctc ggccattctc
gatcggcagt 7440tcgaggtggt gagtgcggtc aacgacatca atgactatca ggggccgggc
accggctatc 7500gcatctctgc cgaacgctgg gcggagatca aaaatattcc gggcgtggtt
cagcccgaca 7560ccattgaata aggcggtatt cctgtgcaac agacaaccca aattcagccc
tcttttaccc 7620tgaaaacccg cgagggcggg gtagcttctg ccgatgaacg cgccgatgaa
gtggtgatcg 7680gcgtcggccc tgccttcgat aaacaccagc atcacactct gatcgatatg
ccccatggcg 7740cgatcctcaa agagctgatt gccggggtgg aagaagaggg gcttcacgcc
cgggtggtgc 7800gcattctgcg cacgtccgac gtctccttta tggcctggga tgcggccaac
ctgagcggct 7860cggggatcgg catcggtatc cagtcgaagg ggaccacggt catccatcag
cgcgatctgc 7920tgccgctcag caacctggag ctgttctccc aggcgccgct gctgacgctg
gagacctacc 7980ggcagattgg caaaaacgct gcgcgctatg cgcgcaaaga gtcaccttcg
ccggtgccgg 8040tggtgaacga tcagatggtg cggccgaaat ttatggccaa agccgcgcta
tttcatatca 8100aagagaccaa acatgtggtg caggacgccg agcccgtcac cctgcacatc
gacttagtaa 8160gggagtgacc atgagcgaga aaaccatgcg cgtgcaggat tatccgttag
ccacccgctg 8220cccggagcat atcctgacgc ctaccggcaa accattgacc gatattaccc
tcgagaaggt 8280gctctctggc gaggtgggcc cgcaggatgt gcggatctcc cgccagaccc
ttgagtacca 8340ggcgcagatt gccgagcaga tgcagcgcca tgcggtggcg cgcaatttcc
gccgcgcggc 8400ggagcttatc gccattcctg acgagcgcat tctggctatc tataacgcgc
tgcgcccgtt 8460ccgctcctcg caggcggagc tgctggcgat cgccgacgag ctggagcaca
cctggcatgc 8520gacagtgaat gccgcctttg tccgggagtc ggcggaagtg tatcagcagc
ggcataagct 8580gcgtaaagga agctaagcgg aggtcagcat gccgttaata gccgggattg
atatcggcaa 8640cgccaccacc gaggtggcgc tggcgtccga ctacccgcag gcgagggcgt
ttgttgccag 8700cgggatcgtc gcgacgacgg gcatgaaagg gacgcgggac aatatcgccg
ggaccctcgc 8760cgcgctggag caggccctgg cgaaaacacc gtggtcgatg agcgatgtct
ctcgcatcta 8820tcttaacgaa gccgcgccgg tgattggcga tgtggcgatg gagaccatca
ccgagaccat 8880tatcaccgaa tcgaccatga tcggtcataa cccgcagacg ccgggcgggg
tgggcgttgg 8940cgtggggacg actatcgccc tcgggcggct ggcgacgctg ccggcggcgc
agtatgccga 9000ggggtggatc gtactgattg acgacgccgt cgatttcctt gacgccgtgt
ggtggctcaa 9060tgaggcgctc gaccggggga tcaacgtggt ggcggcgatc ctcaaaaagg
acgacggcgt 9120gctggtgaac aaccgcctgc gtaaaaccct gccggtggtg gatgaagtga
cgctgctgga 9180gcaggtcccc gagggggtaa tggcggcggt ggaagtggcc gcgccgggcc
aggtggtgcg 9240gatcctgtcg aatccctacg ggatcgccac cttcttcggg ctaagcccgg
aagagaccca 9300ggccatcgtc cccatcgccc gcgccctgat tggcaaccgt tccgcggtgg
tgctcaagac 9360cccgcagggg gatgtgcagt cgcgggtgat cccggcgggc aacctctaca
ttagcggcga 9420aaagcgccgc ggagaggccg atgtcgccga gggcgcggaa gccatcatgc
aggcgatgag 9480cgcctgcgct ccggtacgcg acatccgcgg cgaaccgggc acccacgccg
gcggcatgct 9540tgagcgggtg cgcaaggtaa tggcgtccct gaccggccat gagatgagcg
cgatatacat 9600ccaggatctg ctggcggtgg atacgtttat tccgcgcaag gtgcagggcg
ggatggccgg 9660cgagtgcgcc atggagaatg ccgtcgggat ggcggcgatg gtgaaagcgg
atcgtctgca 9720aatgcaggtt atcgcccgcg aactgagcgc ccgactgcag accgaggtgg
tggtgggcgg 9780cgtggaggcc aacatggcca tcgccggggc gttaaccact cccggctgtg
cggcgccgct 9840ggcgatcctc gacctcggcg ccggctcgac ggatgcggcg atcgtcaacg
cggaggggca 9900gataacggcg gtccatctcg ccggggcggg gaatatggtc agcctgttga
ttaaaaccga 9960gctgggcctc gaggatcttt cgctggcgga agcgataaaa aaatacccgc
tggccaaagt 10020ggaaagcctg ttcagtattc gtcacgagaa tggcgcggtg gagttctttc
gggaagccct 10080cagcccggcg gtgttcgcca aagtggtgta catcaaggag ggcgaactgg
tgccgatcga 10140taacgccagc ccgctggaaa aaattcgtct cgtgcgccgg caggcgaaag
agaaagtgtt 10200tgtcaccaac tgcctgcgcg cgctgcgcca ggtctcaccc ggcggttcca
ttcgcgatat 10260cgcctttgtg gtgctggtgg gcggctcatc gctggacttt gagatcccgc
agcttatcac 10320ggaagccttg tcgcactatg gcgtggtcgc cgggcagggc aatattcggg
gaacagaagg 10380gccgcgcaat gcggtcgcca ccgggctgct actggccggt caggcgaatt
aaacgggcgc 10440tcgcgccagc ctctaggtac aaataaaaaa ggcacgtcag atgacgtgcc
ttttttcttg 10500tctagcgtgc accaatgctt ctggcgtcag gcagccatcg gaagctgtgg
tatggctgtg 10560caggtcgtaa atcactgcat aattcgtgtc gctcaaggcg cactcccgtt
ctggataatg 10620ttttttgcgc cgacatcata acggttctgg caaatattct gaaatgagct
gttgacaatt 10680aatcatccgg ctcgtataat gtgtggaatt gtgagcggat aacaatttca
cacaggaaac 10740agaccatgac tagtaaggag gacaattcca tggctgctgc tgctgataga
ttaaacttaa 10800cttccggcca cttgaatgct ggtagaaaga gaagttcctc ttctgtttct
ttgaaggctg 10860ccgaaaagcc tttcaaggtt actgtgattg gatctggtaa ctggggtact
actattgcca 10920aggtggttgc cgaaaattgt aagggatacc cagaagtttt cgctccaata
gtacaaatgt 10980gggtgttcga agaagagatc aatggtgaaa aattgactga aatcataaat
actagacatc 11040aaaacgtgaa atacttgcct ggcatcactc tacccgacaa tttggttgct
aatccagact 11100tgattgattc agtcaaggat gtcgacatca tcgttttcaa cattccacat
caatttttgc 11160cccgtatctg tagccaattg aaaggtcatg ttgattcaca cgtcagagct
atctcctgtc 11220taaagggttt tgaagttggt gctaaaggtg tccaattgct atcctcttac
atcactgagg 11280aactaggtat tcaatgtggt gctctatctg gtgctaacat tgccaccgaa
gtcgctcaag 11340aacactggtc tgaaacaaca gttgcttacc acattccaaa ggatttcaga
ggcgagggca 11400aggacgtcga ccataaggtt ctaaaggcct tgttccacag accttacttc
cacgttagtg 11460tcatcgaaga tgttgctggt atctccatct gtggtgcttt gaagaacgtt
gttgccttag 11520gttgtggttt cgtcgaaggt ctaggctggg gtaacaacgc ttctgctgcc
atccaaagag 11580tcggtttggg tgagatcatc agattcggtc aaatgttttt cccagaatct
agagaagaaa 11640catactacca agagtctgct ggtgttgctg atttgatcac cacctgcgct
ggtggtagaa 11700acgtcaaggt tgctaggcta atggctactt ctggtaagga cgcctgggaa
tgtgaaaagg 11760agttgttgaa tggccaatcc gctcaaggtt taattacctg caaagaagtt
cacgaatggt 11820tggaaacatg tggctctgtc gaagacttcc cattatttga agccgtatac
caaatcgttt 11880acaacaacta cccaatgaag aacctgccgg acatgattga agaattagat
ctacatgaag 11940attagattta ttggatccag gaaacagact agaattatgg gattgactac
taaacctcta 12000tctttgaaag ttaacgccgc tttgttcgac gtcgacggta ccattatcat
ctctcaacca 12060gccattgctg cattctggag ggatttcggt aaggacaaac cttatttcga
tgctgaacac 12120gttatccaag tctcgcatgg ttggagaacg tttgatgcca ttgctaagtt
cgctccagac 12180tttgccaatg aagagtatgt taacaaatta gaagctgaaa ttccggtcaa
gtacggtgaa 12240aaatccattg aagtcccagg tgcagttaag ctgtgcaacg ctttgaacgc
tctaccaaaa 12300gagaaatggg ctgtggcaac ttccggtacc cgtgatatgg cacaaaaatg
gttcgagcat 12360ctgggaatca ggagaccaaa gtacttcatt accgctaatg atgtcaaaca
gggtaagcct 12420catccagaac catatctgaa gggcaggaat ggcttaggat atccgatcaa
tgagcaagac 12480ccttccaaat ctaaggtagt agtatttgaa gacgctccag caggtattgc
cgccggaaaa 12540gccgccggtt gtaagatcat tggtattgcc actactttcg acttggactt
cctaaaggaa 12600aaaggctgtg acatcattgt caaaaaccac gaatccatca gagttggcgg
ctacaatgcc 12660gaaacagacg aagttgaatt catttttgac gactacttat atgctaagga
cgatctgttg 12720aaatggtaac ccgggctgca ggcatgcaag cttggctgtt ttggcggatg
agagaagatt 12780ttcagcctga tacagattaa atcagaacgc agaagcggtc tgataaaaca
gaatttgcct 12840ggcggcagta gcgcggtggt cccacctgac cccatgccga actcagaagt
gaaacgccgt 12900agcgccgatg gtagtgtggg gtctccccat gcgagagtag ggaactgcca
ggcatcaaat 12960aaaacgaaag gctcagtcga aagactgggc ctttcgtttt atctgttgtt
tgtcggtgaa 13020cgctctcctg agtaggacaa atccgccggg agcggatttg aacgttgcga
agcaacggcc 13080cggagggtgg cgggcaggac gcccgccata aactgccagg catcaaatta
agcagaaggc 13140catcctgacg gatggccttt ttgcgtttct acaaactcca gctggatcgg
gcgctagagt 13200atacatttaa atggtaccct ctagtcaagg ccttaagtga gtcgtattac
ggactggccg 13260tcgttttaca acgtcgtgac tgggaaaacc ctggcgttac ccaacttaat
cgccttgcag 13320cacatccccc tttcgccagc tggcgtaata gcgaagaggc ccgcaccgat
cgcccttccc 13380aacagttgcg cagcctgaat ggcgaatggc gcctgatgcg gtattttctc
cttacgcatc 13440tgtgcggtat ttcacaccgc atatggtgca ctctcagtac aatctgctct
gatgccgcat 13500agttaagcca gccccgacac ccgccaacac ccgctgacga gct
13543
SEQ ID NO: 64; length 13543; DNA; Artificial Sequence; Plasmid
tagtaaagcc
ctcgctagat tttaatgcgg atgttgcgat tacttcgcca actattgcga 60taacaagaaa
aagccagcct ttcatgatat atctcccaat ttgtgtaggg cttattatgc 120acgcttaaaa
ataataaaag cagacttgac ctgatagttt ggctgtgagc aattatgtgc 180ttagtgcatc
taacgcttga gttaagccgc gccgcgaagc ggcgtcggct tgaacgaatt 240gttagacatt
atttgccgac taccttggtg atctcgcctt tcacgtagtg gacaaattct 300tccaactgat
ctgcgcgcga ggccaagcga tcttcttctt gtccaagata agcctgtcta 360gcttcaagta
tgacgggctg atactgggcc ggcaggcgct ccattgccca gtcggcagcg 420acatccttcg
gcgcgatttt gccggttact gcgctgtacc aaatgcggga caacgtaagc 480actacatttc
gctcatcgcc agcccagtcg ggcggcgagt tccatagcgt taaggtttca 540tttagcgcct
caaatagatc ctgttcagga accggatcaa agagttcctc cgccgctgga 600cctaccaagg
caacgctatg ttctcttgct tttgtcagca agatagccag atcaatgtcg 660atcgtggctg
gctcgaagat acctgcaaga atgtcattgc gctgccattc tccaaattgc 720agttcgcgct
tagctggata acgccacgga atgatgtcgt cgtgcacaac aatggtgact 780tctacagcgc
ggagaatctc gctctctcca ggggaagccg aagtttccaa aaggtcgttg 840atcaaagctc
gccgcgttgt ttcatcaagc cttacggtca ccgtaaccag caaatcaata 900tcactgtgtg
gcttcaggcc gccatccact gcggagccgt acaaatgtac ggccagcaac 960gtcggttcga
gatggcgctc gatgacgcca actacctctg atagttgagt cgatacttcg 1020gcgatcaccg
cttccctcat gatgtttaac tttgttttag ggcgactgcc ctgctgcgta 1080acatcgttgc
tgctccataa catcaaacat cgacccacgg cgtaacgcgc ttgctgcttg 1140gatgcccgag
gcatagactg taccccaaaa aaacagtcat aacaagccat gaaaaccgcc 1200actgcgccgt
taccaccgct gcgttcggtc aaggttctgg accagttgcg tgagcgcata 1260cgctacttgc
attacagctt acgaaccgaa caggcttatg tccactgggt tcgtgccttc 1320atccgtttcc
acggtgtgcg tcacccggca accttgggca gcagcgaagt cgaggcattt 1380ctgtcctggc
tggcgaacga gcgcaaggtt tcggtctcca cgcatcgtca ggcattggcg 1440gccttgctgt
tcttctacgg caaggtgctg tgcacggatc tgccctggct tcaggagatc 1500ggaagacctc
ggccgtcgcg gcgcttgccg gtggtgctga ccccggatga agtggttcgc 1560atcctcggtt
ttctggaagg cgagcatcgt ttgttcgccc agcttctgta tggaacgggc 1620atgcggatca
gtgagggttt gcaactgcgg gtcaaggatc tggatttcga tcacggcacg 1680atcatcgtgc
gggagggcaa gggctccaag gatcgggcct tgatgttacc cgagagcttg 1740gcacccagcc
tgcgcgagca ggggaattaa ttcccacggg ttttgctgcc cgcaaacggg 1800ctgttctggt
gttgctagtt tgttatcaga atcgcagatc cggcttcagc cggtttgccg 1860gctgaaagcg
ctatttcttc cagaattgcc atgatttttt ccccacggga ggcgtcactg 1920gctcccgtgt
tgtcggcagc tttgattcga taagcagcat cgcctgtttc aggctgtcta 1980tgtgtgactg
ttgagctgta acaagttgtc tcaggtgttc aatttcatgt tctagttgct 2040ttgttttact
ggtttcacct gttctattag gtgttacatg ctgttcatct gttacattgt 2100cgatctgttc
atggtgaaca gctttgaatg caccaaaaac tcgtaaaagc tctgatgtat 2160ctatcttttt
tacaccgttt tcatctgtgc atatggacag ttttcccttt gatatgtaac 2220ggtgaacagt
tgttctactt ttgtttgtta gtcttgatgc ttcactgata gatacaagag 2280ccataagaac
ctcagatcct tccgtattta gccagtatgt tctctagtgt ggttcgttgt 2340ttttgcgtga
gccatgagaa cgaaccattg agatcatact tactttgcat gtcactcaaa 2400aattttgcct
caaaactggt gagctgaatt tttgcagtta aagcatcgtg tagtgttttt 2460cttagtccgt
tatgtaggta ggaatctgat gtaatggttg ttggtatttt gtcaccattc 2520atttttatct
ggttgttctc aagttcggtt acgagatcca tttgtctatc tagttcaact 2580tggaaaatca
acgtatcagt cgggcggcct cgcttatcaa ccaccaattt catattgctg 2640taagtgttta
aatctttact tattggtttc aaaacccatt ggttaagcct tttaaactca 2700tggtagttat
tttcaagcat taacatgaac ttaaattcat caaggctaat ctctatattt 2760gccttgtgag
ttttcttttg tgttagttct tttaataacc actcataaat cctcatagag 2820tatttgtttt
caaaagactt aacatgttcc agattatatt ttatgaattt ttttaactgg 2880aaaagataag
gcaatatctc ttcactaaaa actaattcta atttttcgct tgagaacttg 2940gcatagtttg
tccactggaa aatctcaaag cctttaacca aaggattcct gatttccaca 3000gttctcgtca
tcagctctct ggttgcttta gctaatacac cataagcatt ttccctactg 3060atgttcatca
tctgagcgta ttggttataa gtgaacgata ccgtccgttc tttccttgta 3120gggttttcaa
tcgtggggtt gagtagtgcc acacagcata aaattagctt ggtttcatgc 3180tccgttaagt
catagcgact aatcgctagt tcatttgctt tgaaaacaac taattcagac 3240atacatctca
attggtctag gtgattttaa tcactatacc aattgagatg ggctagtcaa 3300tgataattac
tagtcctttt cctttgagtt gtgggtatct gtaaattctg ctagaccttt 3360gctggaaaac
ttgtaaattc tgctagaccc tctgtaaatt ccgctagacc tttgtgtgtt 3420ttttttgttt
atattcaagt ggttataatt tatagaataa agaaagaata aaaaaagata 3480aaaagaatag
atcccagccc tgtgtataac tcactacttt agtcagttcc gcagtattac 3540aaaaggatgt
cgcaaacgct gtttgctcct ctacaaaaca gaccttaaaa ccctaaaggc 3600ttaagtagca
ccctcgcaag ctcgggcaaa tcgctgaata ttccttttgt ctccgaccat 3660caggcacctg
agtcgctgtc tttttcgtga cattcagttc gctgcgctca cggctctggc 3720agtgaatggg
ggtaaatggc actacaggcg ccttttatgg attcatgcaa ggaaactacc 3780cataatacaa
gaaaagcccg tcacgggctt ctcagggcgt tttatggcgg gtctgctatg 3840tggtgctatc
tgactttttg ctgttcagca gttcctgccc tctgattttc cagtctgacc 3900acttcggatt
atcccgtgac aggtcattca gactggctaa tgcacccagt aaggcagcgg 3960tatcatcaac
aggcttaccc gtcttactgt cgggaattca tttaaatagt caaaagcctc 4020cgaccggagg
cttttgactg ctaggcgatc tgtgctgttt gccacggtat gcagcaccag 4080cgcgagatta
tgggctcgca cgctcgactg tcggacgggg gcactggaac gagaagtcag 4140gcgagccgtc
acgcccttga caatgccaca tcctgagcaa ataattcaac cactaaacaa 4200atcaaccgcg
tttcccggag gtaaccaagc ttgcgggaga gaatgatgaa caagagccaa 4260caagttcaga
caatcaccct ggccgccgcc cagcaaatgg cggcggcggt ggaaaaaaaa 4320gccactgaga
tcaacgtggc ggtggtgttt tccgtagttg accgcggagg caacacgctg 4380cttatccagc
ggatggacga ggccttcgtc tccagctgcg atatttccct gaataaagcc 4440tggagcgcct
gcagcctgaa gcaaggtacc catgaaatta cgtcagcggt ccagccagga 4500caatctctgt
acggtctgca gctaaccaac caacagcgaa ttattatttt tggcggcggc 4560ctgccagtta
tttttaatga gcaggtaatt ggcgccgtcg gcgttagcgg cggtacggtc 4620gagcaggatc
aattattagc ccagtgcgcc ctggattgtt tttccgcatt ataacctgaa 4680gcgagaaggt
atattatgag ctatcgtatg ttccgccagg cattctgagt gttaacgagg 4740ggaccgtcat
gtcgctttca ccgccaggcg tacgcctgtt ttacgatccg cgcgggcacc 4800atgccggcgc
catcaatgag ctgtgctggg ggctggagga gcagggggtc ccctgccaga 4860ccataaccta
tgacggaggc ggtgacgccg ctgcgctggg cgccctggcg gccagaagct 4920cgcccctgcg
ggtgggtatc gggctcagcg cgtccggcga gatagccctc actcatgccc 4980agctgccggc
ggacgcgccg ctggctaccg gacacgtcac cgatagcgac gatcaactgc 5040gtacgctcgg
cgccaacgcc gggcagctgg ttaaagtcct gccgttaagt gagagaaact 5100gaatgtatcg
tatctatacc cgcaccgggg ataaaggcac caccgccctg tacggcggca 5160gccgcatcga
gaaagaccat attcgcgtcg aggcctacgg caccgtcgat gaactgatat 5220cccagctggg
cgtctgctac gccacgaccc gcgacgccgg gctgcgggaa agcctgcacc 5280atattcagca
gacgctgttc gtgctggggg ctgaactggc cagcgatgcg cggggcctga 5340cccgcctgag
ccagacgatc ggcgaagagg agatcaccgc cctggagcgg cttatcgacc 5400gcaatatggc
cgagagcggc ccgttaaaac agttcgtgat cccggggagg aatctcgcct 5460ctgcccagct
gcacgtggcg cgcacccagt cccgtcggct cgaacgcctg ctgacggcca 5520tggaccgcgc
gcatccgctg cgcgacgcgc tcaaacgcta cagcaatcgc ctgtcggatg 5580ccctgttctc
catggcgcga atcgaagaga ctaggcctga tgcttgcgct tgaactggcc 5640tagcaaacac
agaaaaaagc ccgcacctga cagtgcgggc tttttttttc ctaggcgatc 5700tgtgctgttt
gccacggtat gcagcaccag cgcgagatta tgggctcgca cgctcgactg 5760tcggacgggg
gcactggaac gagaagtcag gcgagccgtc acgcccttga caatgccaca 5820tcctgagcaa
ataattcaac cactaaacaa atcaaccgcg tttcccggag gtaaccaagc 5880ttcacctttt
gagccgatga acaatgaaaa gatcaaaacg atttgcagta ctggcccagc 5940gccccgtcaa
tcaggacggg ctgattggcg agtggcctga agaggggctg atcgccatgg 6000acagcccctt
tgacccggtc tcttcagtaa aagtggacaa cggtctgatc gtcgaactgg 6060acggcaaacg
ccgggaccag tttgacatga tcgaccgatt tatcgccgat tacgcgatca 6120acgttgagcg
cacagagcag gcaatgcgcc tggaggcggt ggaaatagcc cgtatgctgg 6180tggatattca
cgtcagccgg gaggagatca ttgccatcac taccgccatc acgccggcca 6240aagcggtcga
ggtgatggcg cagatgaacg tggtggagat gatgatggcg ctgcagaaga 6300tgcgtgcccg
ccggaccccc tccaaccagt gccacgtcac caatctcaaa gataatccgg 6360tgcagattgc
cgctgacgcc gccgaggccg ggatccgcgg cttctcagaa caggagacca 6420cggtcggtat
cgcgcgctac gcgccgttta acgccctggc gctgttggtc ggttcgcagt 6480gcggccgccc
cggcgtgttg acgcagtgct cggtggaaga ggccaccgag ctggagctgg 6540gcatgcgtgg
cttaaccagc tacgccgaga cggtgtcggt ctacggcacc gaagcggtat 6600ttaccgacgg
cgatgatacg ccgtggtcaa aggcgttcct cgcctcggcc tacgcctccc 6660gcgggttgaa
aatgcgctac acctccggca ccggatccga agcgctgatg ggctattcgg 6720agagcaagtc
gatgctctac ctcgaatcgc gctgcatctt cattactaaa ggcgccgggg 6780ttcagggact
gcaaaacggc gcggtgagct gtatcggcat gaccggcgct gtgccgtcgg 6840gcattcgggc
ggtgctggcg gaaaacctga tcgcctctat gctcgacctc gaagtggcgt 6900ccgccaacga
ccagactttc tcccactcgg atattcgccg caccgcgcgc accctgatgc 6960agatgctgcc
gggcaccgac tttattttct ccggctacag cgcggtgccg aactacgaca 7020acatgttcgc
cggctcgaac ttcgatgcgg aagattttga tgattacaac atcctgcagc 7080gtgacctgat
ggttgacggc ggcctgcgtc cggtgaccga ggcggaaacc attgccattc 7140gccagaaagc
ggcgcgggcg atccaggcgg ttttccgcga gctggggctg ccgccaatcg 7200ccgacgagga
ggtggaggcc gccacctacg cgcacggcag caacgagatg ccgccgcgta 7260acgtggtgga
ggatctgagt gcggtggaag agatgatgaa gcgcaacatc accggcctcg 7320atattgtcgg
cgcgctgagc cgcagcggct ttgaggatat cgccagcaat attctcaata 7380tgctgcgcca
gcgggtcacc ggcgattacc tgcagacctc ggccattctc gatcggcagt 7440tcgaggtggt
gagtgcggtc aacgacatca atgactatca ggggccgggc accggctatc 7500gcatctctgc
cgaacgctgg gcggagatca aaaatattcc gggcgtggtt cagcccgaca 7560ccattgaata
aggcggtatt cctgtgcaac agacaaccca aattcagccc tcttttaccc 7620tgaaaacccg
cgagggcggg gtagcttctg ccgatgaacg cgccgatgaa gtggtgatcg 7680gcgtcggccc
tgccttcgat aaacaccagc atcacactct gatcgatatg ccccatggcg 7740cgatcctcaa
agagctgatt gccggggtgg aagaagaggg gcttcacgcc cgggtggtgc 7800gcattctgcg
cacgtccgac gtctccttta tggcctggga tgcggccaac ctgagcggct 7860cggggatcgg
catcggtatc cagtcgaagg ggaccacggt catccatcag cgcgatctgc 7920tgccgctcag
caacctggag ctgttctccc aggcgccgct gctgacgctg gagacctacc 7980ggcagattgg
caaaaacgct gcgcgctatg cgcgcaaaga gtcaccttcg ccggtgccgg 8040tggtgaacga
tcagatggtg cggccgaaat ttatggccaa agccgcgcta tttcatatca 8100aagagaccaa
acatgtggtg caggacgccg agcccgtcac cctgcacatc gacttagtaa 8160gggagtgacc
atgagcgaga aaaccatgcg cgtgcaggat tatccgttag ccacccgctg 8220cccggagcat
atcctgacgc ctaccggcaa accattgacc gatattaccc tcgagaaggt 8280gctctctggc
gaggtgggcc cgcaggatgt gcggatctcc cgccagaccc ttgagtacca 8340ggcgcagatt
gccgagcaga tgcagcgcca tgcggtggcg cgcaatttcc gccgcgcggc 8400ggagcttatc
gccattcctg acgagcgcat tctggctatc tataacgcgc tgcgcccgtt 8460ccgctcctcg
caggcggagc tgctggcgat cgccgacgag ctggagcaca cctggcatgc 8520gacagtgaat
gccgcctttg tccgggagtc ggcggaagtg tatcagcagc ggcataagct 8580gcgtaaagga
agctaagcgg aggtcagcat gccgttaata gccgggattg atatcggcaa 8640cgccaccacc
gaggtggcgc tggcgtccga ctacccgcag gcgagggcgt ttgttgccag 8700cgggatcgtc
gcgacgacgg gcatgaaagg gacgcgggac aatatcgccg ggaccctcgc 8760cgcgctggag
caggccctgg cgaaaacacc gtggtcgatg agcgatgtct ctcgcatcta 8820tcttaacgaa
gccgcgccgg tgattggcga tgtggcgatg gagaccatca ccgagaccat 8880tatcaccgaa
tcgaccatga tcggtcataa cccgcagacg ccgggcgggg tgggcgttgg 8940cgtggggacg
actatcgccc tcgggcggct ggcgacgctg ccggcggcgc agtatgccga 9000ggggtggatc
gtactgattg acgacgccgt cgatttcctt gacgccgtgt ggtggctcaa 9060tgaggcgctc
gaccggggga tcaacgtggt ggcggcgatc ctcaaaaagg acgacggcgt 9120gctggtgaac
aaccgcctgc gtaaaaccct gccggtggtg gatgaagtga cgctgctgga 9180gcaggtcccc
gagggggtaa tggcggcggt ggaagtggcc gcgccgggcc aggtggtgcg 9240gatcctgtcg
aatccctacg ggatcgccac cttcttcggg ctaagcccgg aagagaccca 9300ggccatcgtc
cccatcgccc gcgccctgat tggcaaccgt tccgcggtgg tgctcaagac 9360cccgcagggg
gatgtgcagt cgcgggtgat cccggcgggc aacctctaca ttagcggcga 9420aaagcgccgc
ggagaggccg atgtcgccga gggcgcggaa gccatcatgc aggcgatgag 9480cgcctgcgct
ccggtacgcg acatccgcgg cgaaccgggc acccacgccg gcggcatgct 9540tgagcgggtg
cgcaaggtaa tggcgtccct gaccggccat gagatgagcg cgatatacat 9600ccaggatctg
ctggcggtgg atacgtttat tccgcgcaag gtgcagggcg ggatggccgg 9660cgagtgcgcc
atggagaatg ccgtcgggat ggcggcgatg gtgaaagcgg atcgtctgca 9720aatgcaggtt
atcgcccgcg aactgagcgc ccgactgcag accgaggtgg tggtgggcgg 9780cgtggaggcc
aacatggcca tcgccggggc gttaaccact cccggctgtg cggcgccgct 9840ggcgatcctc
gacctcggcg ccggctcgac ggatgcggcg atcgtcaacg cggaggggca 9900gataacggcg
gtccatctcg ccggggcggg gaatatggtc agcctgttga ttaaaaccga 9960gctgggcctc
gaggatcttt cgctggcgga agcgataaaa aaatacccgc tggccaaagt 10020ggaaagcctg
ttcagtattc gtcacgagaa tggcgcggtg gagttctttc gggaagccct 10080cagcccggcg
gtgttcgcca aagtggtgta catcaaggag ggcgaactgg tgccgatcga 10140taacgccagc
ccgctggaaa aaattcgtct cgtgcgccgg caggcgaaag agaaagtgtt 10200tgtcaccaac
tgcctgcgcg cgctgcgcca ggtctcaccc ggcggttcca ttcgcgatat 10260cgcctttgtg
gtgctggtgg gcggctcatc gctggacttt gagatcccgc agcttatcac 10320ggaagccttg
tcgcactatg gcgtggtcgc cgggcagggc aatattcggg gaacagaagg 10380gccgcgcaat
gcggtcgcca ccgggctgct actggccggt caggcgaatt aaacgggcgc 10440tcgcgccagc
ctctaggtac aaataaaaaa ggcacgtcag atgacgtgcc ttttttcttg 10500tctagcgtgc
accaatgctt ctggcgtcag gcagccatcg gaagctgtgg tatggctgtg 10560caggtcgtaa
atcactgcat aattcgtgtc gctcaaggcg cactcccgtt ctggataatg 10620ttttttgcgc
cgacatcata acggttctgg caaatattct gaaatgagct gttgacaatt 10680aatcatccgg
ctcgtataat gtgtggaatt gtgagcggat aacaatttca cacaggaaac 10740agaccatgac
tagtaaggag gacaattcca tggctgctgc tgctgataga ttaaacttaa 10800cttccggcca
cttgaatgct ggtagaaaga gaagttcctc ttctgtttct ttgaaggctg 10860ccgaaaagcc
tttcaaggtt actgtgattg gatctggtaa ctggggtact actattgcca 10920aggtggttgc
cgaaaattgt aagggatacc cagaagtttt cgctccaata gtacaaatgt 10980gggtgttcga
agaagagatc aatggtgaaa aattgactga aatcataaat actagacatc 11040aaaacgtgaa
atacttgcct ggcatcactc tacccgacaa tttggttgct aatccagact 11100tgattgattc
agtcaaggat gtcgacatca tcgttttcaa cattccacat caatttttgc 11160cccgtatctg
tagccaattg aaaggtcatg ttgattcaca cgtcagagct atctcctgtc 11220taaagggttt
tgaagttggt gctaaaggtg tccaattgct atcctcttac atcactgagg 11280aactaggtat
tcaatgtggt gctctatctg gtgctaacat tgccaccgaa gtcgctcaag 11340aacactggtc
tgaaacaaca gttgcttacc acattccaaa ggatttcaga ggcgagggca 11400aggacgtcga
ccataaggtt ctaaaggcct tgttccacag accttacttc cacgttagtg 11460tcatcgaaga
tgttgctggt atctccatct gtggtgcttt gaagaacgtt gttgccttag 11520gttgtggttt
cgtcgaaggt ctaggctggg gtaacaacgc ttctgctgcc atccaaagag 11580tcggtttggg
tgagatcatc agattcggtc aaatgttttt cccagaatct agagaagaaa 11640catactacca
agagtctgct ggtgttgctg atttgatcac cacctgcgct ggtggtagaa 11700acgtcaaggt
tgctaggcta atggctactt ctggtaagga cgcctgggaa tgtgaaaagg 11760agttgttgaa
tggccaatcc gctcaaggtt taattacctg caaagaagtt cacgaatggt 11820tggaaacatg
tggctctgtc gaagacttcc cattatttga agccgtatac caaatcgttt 11880acaacaacta
cccaatgaag aacctgccgg acatgattga agaattagat ctacatgaag 11940attagattta
ttggatccag gaaacagact agaattatgg gattgactac taaacctcta 12000tctttgaaag
ttaacgccgc tttgttcgac gtcgacggta ccattatcat ctctcaacca 12060gccattgctg
cattctggag ggatttcggt aaggacaaac cttatttcga tgctgaacac 12120gttatccaag
tctcgcatgg ttggagaacg tttgatgcca ttgctaagtt cgctccagac 12180tttgccaatg
aagagtatgt taacaaatta gaagctgaaa ttccggtcaa gtacggtgaa 12240aaatccattg
aagtcccagg tgcagttaag ctgtgcaacg ctttgaacgc tctaccaaaa 12300gagaaatggg
ctgtggcaac ttccggtacc cgtgatatgg cacaaaaatg gttcgagcat 12360ctgggaatca
ggagaccaaa gtacttcatt accgctaatg atgtcaaaca gggtaagcct 12420catccagaac
catatctgaa gggcaggaat ggcttaggat atccgatcaa tgagcaagac 12480ccttccaaat
ctaaggtagt agtatttgaa gacgctccag caggtattgc cgccggaaaa 12540gccgccggtt
gtaagatcat tggtattgcc actactttcg acttggactt cctaaaggaa 12600aaaggctgtg
acatcattgt caaaaaccac gaatccatca gagttggcgg ctacaatgcc 12660gaaacagacg
aagttgaatt catttttgac gactacttat atgctaagga cgatctgttg 12720aaatggtaac
ccgggctgca ggcatgcaag cttggctgtt ttggcggatg agagaagatt 12780ttcagcctga
tacagattaa atcagaacgc agaagcggtc tgataaaaca gaatttgcct 12840ggcggcagta
gcgcggtggt cccacctgac cccatgccga actcagaagt gaaacgccgt 12900agcgccgatg
gtagtgtggg gtctccccat gcgagagtag ggaactgcca ggcatcaaat 12960aaaacgaaag
gctcagtcga aagactgggc ctttcgtttt atctgttgtt tgtcggtgaa 13020cgctctcctg
agtaggacaa atccgccggg agcggatttg aacgttgcga agcaacggcc 13080cggagggtgg
cgggcaggac gcccgccata aactgccagg catcaaatta agcagaaggc 13140catcctgacg
gatggccttt ttgcgtttct acaaactcca gctggatcgg gcgctagagt 13200atacatttaa
atggtaccct ctagtcaagg ccttaagtga gtcgtattac ggactggccg 13260tcgttttaca
acgtcgtgac tgggaaaacc ctggcgttac ccaacttaat cgccttgcag 13320cacatccccc
tttcgccagc tggcgtaata gcgaagaggc ccgcaccgat cgcccttccc 13380aacagttgcg
cagcctgaat ggcgaatggc gcctgatgcg gtattttctc cttacgcatc 13440tgtgcggtat
ttcacaccgc atatggtgca ctctcagtac aatctgctct gatgccgcat 13500agttaagcca
gccccgacac ccgccaacac ccgctgacga gct
13543
65 13402 DNA Artificial Sequence Plasmid 65
tagtaaagcc ctcgctagat
tttaatgcgg atgttgcgat tacttcgcca actattgcga 60taacaagaaa aagccagcct
ttcatgatat atctcccaat ttgtgtaggg cttattatgc 120acgcttaaaa ataataaaag
cagacttgac ctgatagttt ggctgtgagc aattatgtgc 180ttagtgcatc taacgcttga
gttaagccgc gccgcgaagc ggcgtcggct tgaacgaatt 240gttagacatt atttgccgac
taccttggtg atctcgcctt tcacgtagtg gacaaattct 300tccaactgat ctgcgcgcga
ggccaagcga tcttcttctt gtccaagata agcctgtcta 360gcttcaagta tgacgggctg
atactgggcc ggcaggcgct ccattgccca gtcggcagcg 420acatccttcg gcgcgatttt
gccggttact gcgctgtacc aaatgcggga caacgtaagc 480actacatttc gctcatcgcc
agcccagtcg ggcggcgagt tccatagcgt taaggtttca 540tttagcgcct caaatagatc
ctgttcagga accggatcaa agagttcctc cgccgctgga 600cctaccaagg caacgctatg
ttctcttgct tttgtcagca agatagccag atcaatgtcg 660atcgtggctg gctcgaagat
acctgcaaga atgtcattgc gctgccattc tccaaattgc 720agttcgcgct tagctggata
acgccacgga atgatgtcgt cgtgcacaac aatggtgact 780tctacagcgc ggagaatctc
gctctctcca ggggaagccg aagtttccaa aaggtcgttg 840atcaaagctc gccgcgttgt
ttcatcaagc cttacggtca ccgtaaccag caaatcaata 900tcactgtgtg gcttcaggcc
gccatccact gcggagccgt acaaatgtac ggccagcaac 960gtcggttcga gatggcgctc
gatgacgcca actacctctg atagttgagt cgatacttcg 1020gcgatcaccg cttccctcat
gatgtttaac tttgttttag ggcgactgcc ctgctgcgta 1080acatcgttgc tgctccataa
catcaaacat cgacccacgg cgtaacgcgc ttgctgcttg 1140gatgcccgag gcatagactg
taccccaaaa aaacagtcat aacaagccat gaaaaccgcc 1200actgcgccgt taccaccgct
gcgttcggtc aaggttctgg accagttgcg tgagcgcata 1260cgctacttgc attacagctt
acgaaccgaa caggcttatg tccactgggt tcgtgccttc 1320atccgtttcc acggtgtgcg
tcacccggca accttgggca gcagcgaagt cgaggcattt 1380ctgtcctggc tggcgaacga
gcgcaaggtt tcggtctcca cgcatcgtca ggcattggcg 1440gccttgctgt tcttctacgg
caaggtgctg tgcacggatc tgccctggct tcaggagatc 1500ggaagacctc ggccgtcgcg
gcgcttgccg gtggtgctga ccccggatga agtggttcgc 1560atcctcggtt ttctggaagg
cgagcatcgt ttgttcgccc agcttctgta tggaacgggc 1620atgcggatca gtgagggttt
gcaactgcgg gtcaaggatc tggatttcga tcacggcacg 1680atcatcgtgc gggagggcaa
gggctccaag gatcgggcct tgatgttacc cgagagcttg 1740gcacccagcc tgcgcgagca
ggggaattaa ttcccacggg ttttgctgcc cgcaaacggg 1800ctgttctggt gttgctagtt
tgttatcaga atcgcagatc cggcttcagc cggtttgccg 1860gctgaaagcg ctatttcttc
cagaattgcc atgatttttt ccccacggga ggcgtcactg 1920gctcccgtgt tgtcggcagc
tttgattcga taagcagcat cgcctgtttc aggctgtcta 1980tgtgtgactg ttgagctgta
acaagttgtc tcaggtgttc aatttcatgt tctagttgct 2040ttgttttact ggtttcacct
gttctattag gtgttacatg ctgttcatct gttacattgt 2100cgatctgttc atggtgaaca
gctttgaatg caccaaaaac tcgtaaaagc tctgatgtat 2160ctatcttttt tacaccgttt
tcatctgtgc atatggacag ttttcccttt gatatgtaac 2220ggtgaacagt tgttctactt
ttgtttgtta gtcttgatgc ttcactgata gatacaagag 2280ccataagaac ctcagatcct
tccgtattta gccagtatgt tctctagtgt ggttcgttgt 2340ttttgcgtga gccatgagaa
cgaaccattg agatcatact tactttgcat gtcactcaaa 2400aattttgcct caaaactggt
gagctgaatt tttgcagtta aagcatcgtg tagtgttttt 2460cttagtccgt tatgtaggta
ggaatctgat gtaatggttg ttggtatttt gtcaccattc 2520atttttatct ggttgttctc
aagttcggtt acgagatcca tttgtctatc tagttcaact 2580tggaaaatca acgtatcagt
cgggcggcct cgcttatcaa ccaccaattt catattgctg 2640taagtgttta aatctttact
tattggtttc aaaacccatt ggttaagcct tttaaactca 2700tggtagttat tttcaagcat
taacatgaac ttaaattcat caaggctaat ctctatattt 2760gccttgtgag ttttcttttg
tgttagttct tttaataacc actcataaat cctcatagag 2820tatttgtttt caaaagactt
aacatgttcc agattatatt ttatgaattt ttttaactgg 2880aaaagataag gcaatatctc
ttcactaaaa actaattcta atttttcgct tgagaacttg 2940gcatagtttg tccactggaa
aatctcaaag cctttaacca aaggattcct gatttccaca 3000gttctcgtca tcagctctct
ggttgcttta gctaatacac cataagcatt ttccctactg 3060atgttcatca tctgagcgta
ttggttataa gtgaacgata ccgtccgttc tttccttgta 3120gggttttcaa tcgtggggtt
gagtagtgcc acacagcata aaattagctt ggtttcatgc 3180tccgttaagt catagcgact
aatcgctagt tcatttgctt tgaaaacaac taattcagac 3240atacatctca attggtctag
gtgattttaa tcactatacc aattgagatg ggctagtcaa 3300tgataattac tagtcctttt
cctttgagtt gtgggtatct gtaaattctg ctagaccttt 3360gctggaaaac ttgtaaattc
tgctagaccc tctgtaaatt ccgctagacc tttgtgtgtt 3420ttttttgttt atattcaagt
ggttataatt tatagaataa agaaagaata aaaaaagata 3480aaaagaatag atcccagccc
tgtgtataac tcactacttt agtcagttcc gcagtattac 3540aaaaggatgt cgcaaacgct
gtttgctcct ctacaaaaca gaccttaaaa ccctaaaggc 3600ttaagtagca ccctcgcaag
ctcgggcaaa tcgctgaata ttccttttgt ctccgaccat 3660caggcacctg agtcgctgtc
tttttcgtga cattcagttc gctgcgctca cggctctggc 3720agtgaatggg ggtaaatggc
actacaggcg ccttttatgg attcatgcaa ggaaactacc 3780cataatacaa gaaaagcccg
tcacgggctt ctcagggcgt tttatggcgg gtctgctatg 3840tggtgctatc tgactttttg
ctgttcagca gttcctgccc tctgattttc cagtctgacc 3900acttcggatt atcccgtgac
aggtcattca gactggctaa tgcacccagt aaggcagcgg 3960tatcatcaac aggcttaccc
gtcttactgt cgggaattca tttaaatagt caaaagcctc 4020cgaccggagg cttttgactg
ctaggcgatc tgtgctgttt gccacggtat gcagcaccag 4080cgcgagatta tgggctcgca
cgctcgactg tcggacgggg gcactggaac gagaagtcag 4140gcgagccgtc acgcccttga
caatgccaca tcctgagcaa ataattcaac cactaaacaa 4200atcaaccgcg tttcccggag
gtaaccaagc ttgcgggaga gaatgatgaa caagagccaa 4260caagttcaga caatcaccct
ggccgccgcc cagcaaatgg cggcggcggt ggaaaaaaaa 4320gccactgaga tcaacgtggc
ggtggtgttt tccgtagttg accgcggagg caacacgctg 4380cttatccagc ggatggacga
ggccttcgtc tccagctgcg atatttccct gaataaagcc 4440tggagcgcct gcagcctgaa
gcaaggtacc catgaaatta cgtcagcggt ccagccagga 4500caatctctgt acggtctgca
gctaaccaac caacagcgaa ttattatttt tggcggcggc 4560ctgccagtta tttttaatga
gcaggtaatt ggcgccgtcg gcgttagcgg cggtacggtc 4620gagcaggatc aattattagc
ccagtgcgcc ctggattgtt tttccgcatt ataacctgaa 4680gcgagaaggt atattatgag
ctatcgtatg ttccgccagg cattctgagt gttaacgagg 4740ggaccgtcat gtcgctttca
ccgccaggcg tacgcctgtt ttacgatccg cgcgggcacc 4800atgccggcgc catcaatgag
ctgtgctggg ggctggagga gcagggggtc ccctgccaga 4860ccataaccta tgacggaggc
ggtgacgccg ctgcgctggg cgccctggcg gccagaagct 4920cgcccctgcg ggtgggtatc
gggctcagcg cgtccggcga gatagccctc actcatgccc 4980agctgccggc ggacgcgccg
ctggctaccg gacacgtcac cgatagcgac gatcaactgc 5040gtacgctcgg cgccaacgcc
gggcagctgg ttaaagtcct gccgttaagt gagagaaact 5100gaatgtatcg tatctatacc
cgcaccgggg ataaaggcac caccgccctg tacggcggca 5160gccgcatcga gaaagaccat
attcgcgtcg aggcctacgg caccgtcgat gaactgatat 5220cccagctggg cgtctgctac
gccacgaccc gcgacgccgg gctgcgggaa agcctgcacc 5280atattcagca gacgctgttc
gtgctggggg ctgaactggc cagcgatgcg cggggcctga 5340cccgcctgag ccagacgatc
ggcgaagagg agatcaccgc cctggagcgg cttatcgacc 5400gcaatatggc cgagagcggc
ccgttaaaac agttcgtgat cccggggagg aatctcgcct 5460ctgcccagct gcaccctgat
gcttgcgctt gaactggcct agcaaacaca gaaaaaagcc 5520cgcacctgac agtgcgggct
ttttttttcc taggcgatct gtgctgtttg ccacggtatg 5580cagcaccagc gcgagattat
gggctcgcac gctcgactgt cggacggggg cactggaacg 5640agaagtcagg cgagccgtca
cgcccttgac aatgccacat cctgagcaaa taattcaacc 5700actaaacaaa tcaaccgcgt
ttcccggagg taaccaagct tcaccttttg agccgatgaa 5760caatgaaaag atcaaaacga
tttgcagtac tggcccagcg ccccgtcaat caggacgggc 5820tgattggcga gtggcctgaa
gaggggctga tcgccatgga cagccccttt gacccggtct 5880cttcagtaaa agtggacaac
ggtctgatcg tcgaactgga cggcaaacgc cgggaccagt 5940ttgacatgat cgaccgattt
atcgccgatt acgcgatcaa cgttgagcgc acagagcagg 6000caatgcgcct ggaggcggtg
gaaatagccc gtatgctggt ggatattcac gtcagccggg 6060aggagatcat tgccatcact
accgccatca cgccggccaa agcggtcgag gtgatggcgc 6120agatgaacgt ggtggagatg
atgatggcgc tgcagaagat gcgtgcccgc cggaccccct 6180ccaaccagtg ccacgtcacc
aatctcaaag ataatccggt gcagattgcc gctgacgccg 6240ccgaggccgg gatccgcggc
ttctcagaac aggagaccac ggtcggtatc gcgcgctacg 6300cgccgtttaa cgccctggcg
ctgttggtcg gttcgcagtg cggccgcccc ggcgtgttga 6360cgcagtgctc ggtggaagag
gccaccgagc tggagctggg catgcgtggc ttaaccagct 6420acgccgagac ggtgtcggtc
tacggcaccg aagcggtatt taccgacggc gatgatacgc 6480cgtggtcaaa ggcgttcctc
gcctcggcct acgcctcccg cgggttgaaa atgcgctaca 6540cctccggcac cggatccgaa
gcgctgatgg gctattcgga gagcaagtcg atgctctacc 6600tcgaatcgcg ctgcatcttc
attactaaag gcgccggggt tcagggactg caaaacggcg 6660cggtgagctg tatcggcatg
accggcgctg tgccgtcggg cattcgggcg gtgctggcgg 6720aaaacctgat cgcctctatg
ctcgacctcg aagtggcgtc cgccaacgac cagactttct 6780cccactcgga tattcgccgc
accgcgcgca ccctgatgca gatgctgccg ggcaccgact 6840ttattttctc cggctacagc
gcggtgccga actacgacaa catgttcgcc ggctcgaact 6900tcgatgcgga agattttgat
gattacaaca tcctgcagcg tgacctgatg gttgacggcg 6960gcctgcgtcc ggtgaccgag
gcggaaacca ttgccattcg ccagaaagcg gcgcgggcga 7020tccaggcggt tttccgcgag
ctggggctgc cgccaatcgc cgacgaggag gtggaggccg 7080ccacctacgc gcacggcagc
aacgagatgc cgccgcgtaa cgtggtggag gatctgagtg 7140cggtggaaga gatgatgaag
cgcaacatca ccggcctcga tattgtcggc gcgctgagcc 7200gcagcggctt tgaggatatc
gccagcaata ttctcaatat gctgcgccag cgggtcaccg 7260gcgattacct gcagacctcg
gccattctcg atcggcagtt cgaggtggtg agtgcggtca 7320acgacatcaa tgactatcag
gggccgggca ccggctatcg catctctgcc gaacgctggg 7380cggagatcaa aaatattccg
ggcgtggttc agcccgacac cattgaataa ggcggtattc 7440ctgtgcaaca gacaacccaa
attcagccct cttttaccct gaaaacccgc gagggcgggg 7500tagcttctgc cgatgaacgc
gccgatgaag tggtgatcgg cgtcggccct gccttcgata 7560aacaccagca tcacactctg
atcgatatgc cccatggcgc gatcctcaaa gagctgattg 7620ccggggtgga agaagagggg
cttcacgccc gggtggtgcg cattctgcgc acgtccgacg 7680tctcctttat ggcctgggat
gcggccaacc tgagcggctc ggggatcggc atcggtatcc 7740agtcgaaggg gaccacggtc
atccatcagc gcgatctgct gccgctcagc aacctggagc 7800tgttctccca ggcgccgctg
ctgacgctgg agacctaccg gcagattggc aaaaacgctg 7860cgcgctatgc gcgcaaagag
tcaccttcgc cggtgccggt ggtgaacgat cagatggtgc 7920ggccgaaatt tatggccaaa
gccgcgctat ttcatatcaa agagaccaaa catgtggtgc 7980aggacgccga gcccgtcacc
ctgcacatcg acttagtaag ggagtgacca tgagcgagaa 8040aaccatgcgc gtgcaggatt
atccgttagc cacccgctgc ccggagcata tcctgacgcc 8100taccggcaaa ccattgaccg
atattaccct cgagaaggtg ctctctggcg aggtgggccc 8160gcaggatgtg cggatctccc
gccagaccct tgagtaccag gcgcagattg ccgagcagat 8220gcagcgccat gcggtggcgc
gcaatttccg ccgcgcggcg gagcttatcg ccattcctga 8280cgagcgcatt ctggctatct
ataacgcgct gcgcccgttc cgctcctcgc aggcggagct 8340gctggcgatc gccgacgagc
tggagcacac ctggcatgcg acagtgaatg ccgcctttgt 8400ccgggagtcg gcggaagtgt
atcagcagcg gcataagctg cgtaaaggaa gctaagcgga 8460ggtcagcatg ccgttaatag
ccgggattga tatcggcaac gccaccaccg aggtggcgct 8520ggcgtccgac tacccgcagg
cgagggcgtt tgttgccagc gggatcgtcg cgacgacggg 8580catgaaaggg acgcgggaca
atatcgccgg gaccctcgcc gcgctggagc aggccctggc 8640gaaaacaccg tggtcgatga
gcgatgtctc tcgcatctat cttaacgaag ccgcgccggt 8700gattggcgat gtggcgatgg
agaccatcac cgagaccatt atcaccgaat cgaccatgat 8760cggtcataac ccgcagacgc
cgggcggggt gggcgttggc gtggggacga ctatcgccct 8820cgggcggctg gcgacgctgc
cggcggcgca gtatgccgag gggtggatcg tactgattga 8880cgacgccgtc gatttccttg
acgccgtgtg gtggctcaat gaggcgctcg accgggggat 8940caacgtggtg gcggcgatcc
tcaaaaagga cgacggcgtg ctggtgaaca accgcctgcg 9000taaaaccctg ccggtggtgg
atgaagtgac gctgctggag caggtccccg agggggtaat 9060ggcggcggtg gaagtggccg
cgccgggcca ggtggtgcgg atcctgtcga atccctacgg 9120gatcgccacc ttcttcgggc
taagcccgga agagacccag gccatcgtcc ccatcgcccg 9180cgccctgatt ggcaaccgtt
ccgcggtggt gctcaagacc ccgcaggggg atgtgcagtc 9240gcgggtgatc ccggcgggca
acctctacat tagcggcgaa aagcgccgcg gagaggccga 9300tgtcgccgag ggcgcggaag
ccatcatgca ggcgatgagc gcctgcgctc cggtacgcga 9360catccgcggc gaaccgggca
cccacgccgg cggcatgctt gagcgggtgc gcaaggtaat 9420ggcgtccctg accggccatg
agatgagcgc gatatacatc caggatctgc tggcggtgga 9480tacgtttatt ccgcgcaagg
tgcagggcgg gatggccggc gagtgcgcca tggagaatgc 9540cgtcgggatg gcggcgatgg
tgaaagcgga tcgtctgcaa atgcaggtta tcgcccgcga 9600actgagcgcc cgactgcaga
ccgaggtggt ggtgggcggc gtggaggcca acatggccat 9660cgccggggcg ttaaccactc
ccggctgtgc ggcgccgctg gcgatcctcg acctcggcgc 9720cggctcgacg gatgcggcga
tcgtcaacgc ggaggggcag ataacggcgg tccatctcgc 9780cggggcgggg aatatggtca
gcctgttgat taaaaccgag ctgggcctcg aggatctttc 9840gctggcggaa gcgataaaaa
aatacccgct ggccaaagtg gaaagcctgt tcagtattcg 9900tcacgagaat ggcgcggtgg
agttctttcg ggaagccctc agcccggcgg tgttcgccaa 9960agtggtgtac atcaaggagg
gcgaactggt gccgatcgat aacgccagcc cgctggaaaa 10020aattcgtctc gtgcgccggc
aggcgaaaga gaaagtgttt gtcaccaact gcctgcgcgc 10080gctgcgccag gtctcacccg
gcggttccat tcgcgatatc gcctttgtgg tgctggtggg 10140cggctcatcg ctggactttg
agatcccgca gcttatcacg gaagccttgt cgcactatgg 10200cgtggtcgcc gggcagggca
atattcgggg aacagaaggg ccgcgcaatg cggtcgccac 10260cgggctgcta ctggccggtc
aggcgaatta aacgggcgct cgcgccagcc tctaggtaca 10320aataaaaaag gcacgtcaga
tgacgtgcct tttttcttgt ctagcgtgca ccaatgcttc 10380tggcgtcagg cagccatcgg
aagctgtggt atggctgtgc aggtcgtaaa tcactgcata 10440attcgtgtcg ctcaaggcgc
actcccgttc tggataatgt tttttgcgcc gacatcataa 10500cggttctggc aaatattctg
aaatgagctg ttgacaatta atcatccggc tcgtataatg 10560tgtggaattg tgagcggata
acaatttcac acaggaaaca gaccatgact agtaaggagg 10620acaattccat ggctgctgct
gctgatagat taaacttaac ttccggccac ttgaatgctg 10680gtagaaagag aagttcctct
tctgtttctt tgaaggctgc cgaaaagcct ttcaaggtta 10740ctgtgattgg atctggtaac
tggggtacta ctattgccaa ggtggttgcc gaaaattgta 10800agggataccc agaagttttc
gctccaatag tacaaatgtg ggtgttcgaa gaagagatca 10860atggtgaaaa attgactgaa
atcataaata ctagacatca aaacgtgaaa tacttgcctg 10920gcatcactct acccgacaat
ttggttgcta atccagactt gattgattca gtcaaggatg 10980tcgacatcat cgttttcaac
attccacatc aatttttgcc ccgtatctgt agccaattga 11040aaggtcatgt tgattcacac
gtcagagcta tctcctgtct aaagggtttt gaagttggtg 11100ctaaaggtgt ccaattgcta
tcctcttaca tcactgagga actaggtatt caatgtggtg 11160ctctatctgg tgctaacatt
gccaccgaag tcgctcaaga acactggtct gaaacaacag 11220ttgcttacca cattccaaag
gatttcagag gcgagggcaa ggacgtcgac cataaggttc 11280taaaggcctt gttccacaga
ccttacttcc acgttagtgt catcgaagat gttgctggta 11340tctccatctg tggtgctttg
aagaacgttg ttgccttagg ttgtggtttc gtcgaaggtc 11400taggctgggg taacaacgct
tctgctgcca tccaaagagt cggtttgggt gagatcatca 11460gattcggtca aatgtttttc
ccagaatcta gagaagaaac atactaccaa gagtctgctg 11520gtgttgctga tttgatcacc
acctgcgctg gtggtagaaa cgtcaaggtt gctaggctaa 11580tggctacttc tggtaaggac
gcctgggaat gtgaaaagga gttgttgaat ggccaatccg 11640ctcaaggttt aattacctgc
aaagaagttc acgaatggtt ggaaacatgt ggctctgtcg 11700aagacttccc attatttgaa
gccgtatacc aaatcgttta caacaactac ccaatgaaga 11760acctgccgga catgattgaa
gaattagatc tacatgaaga ttagatttat tggatccagg 11820aaacagacta gaattatggg
attgactact aaacctctat ctttgaaagt taacgccgct 11880ttgttcgacg tcgacggtac
cattatcatc tctcaaccag ccattgctgc attctggagg 11940gatttcggta aggacaaacc
ttatttcgat gctgaacacg ttatccaagt ctcgcatggt 12000tggagaacgt ttgatgccat
tgctaagttc gctccagact ttgccaatga agagtatgtt 12060aacaaattag aagctgaaat
tccggtcaag tacggtgaaa aatccattga agtcccaggt 12120gcagttaagc tgtgcaacgc
tttgaacgct ctaccaaaag agaaatgggc tgtggcaact 12180tccggtaccc gtgatatggc
acaaaaatgg ttcgagcatc tgggaatcag gagaccaaag 12240tacttcatta ccgctaatga
tgtcaaacag ggtaagcctc atccagaacc atatctgaag 12300ggcaggaatg gcttaggata
tccgatcaat gagcaagacc cttccaaatc taaggtagta 12360gtatttgaag acgctccagc
aggtattgcc gccggaaaag ccgccggttg taagatcatt 12420ggtattgcca ctactttcga
cttggacttc ctaaaggaaa aaggctgtga catcattgtc 12480aaaaaccacg aatccatcag
agttggcggc tacaatgccg aaacagacga agttgaattc 12540atttttgacg actacttata
tgctaaggac gatctgttga aatggtaacc cgggctgcag 12600gcatgcaagc ttggctgttt
tggcggatga gagaagattt tcagcctgat acagattaaa 12660tcagaacgca gaagcggtct
gataaaacag aatttgcctg gcggcagtag cgcggtggtc 12720ccacctgacc ccatgccgaa
ctcagaagtg aaacgccgta gcgccgatgg tagtgtgggg 12780tctccccatg cgagagtagg
gaactgccag gcatcaaata aaacgaaagg ctcagtcgaa 12840agactgggcc tttcgtttta
tctgttgttt gtcggtgaac gctctcctga gtaggacaaa 12900tccgccggga gcggatttga
acgttgcgaa gcaacggccc ggagggtggc gggcaggacg 12960cccgccataa actgccaggc
atcaaattaa gcagaaggcc atcctgacgg atggcctttt 13020tgcgtttcta caaactccag
ctggatcggg cgctagagta tacatttaaa tggtaccctc 13080tagtcaaggc cttaagtgag
tcgtattacg gactggccgt cgttttacaa cgtcgtgact 13140gggaaaaccc tggcgttacc
caacttaatc gccttgcagc acatccccct ttcgccagct 13200ggcgtaatag cgaagaggcc
cgcaccgatc gcccttccca acagttgcgc agcctgaatg 13260gcgaatggcg cctgatgcgg
tattttctcc ttacgcatct gtgcggtatt tcacaccgca 13320tatggtgcac tctcagtaca
atctgctctg atgccgcata gttaagccag ccccgacacc 13380cgccaacacc cgctgacgag
ct 13402
66 14443 DNA Artificial Sequence Plasmid 66
ttctgataac aaactagcaa caccagaaca gcccgtttgc gggcagcaaa
acccgtggga 60attaattccc ctgctcgcgc aggctgggtg ccaagctctc gggtaacatc
aaggcccgat 120ccttggagcc cttcttacag agatgaaaaa caaaccgcga cgccaggcgg
catcgcggtc 180tcagagatat gtttacgtag atcgaagagc accggtgttt aaacgccctt
gacgatgcca 240catcctgagc aaataattca accactaaac aaatcaaccg cgtttcccgg
aggtaaccga 300gctcatgatc ctgtgttgtg gtgaagccct gatcgacatg ctgccccggc
agacgacgct 360gggtgaggcg ggctttgccc cttacgcagg cggagcggtc ttcaacacgg
caattgcgct 420ggggcgtctt ggcgtccctt cagccttttt taccggtctt tccgacgaca
tgatgggcga 480tatcctgcgg gagaccctgc gggccagcaa ggtggatttc agctattgcg
ccaccctgtc 540gcgccccacc accattgcgt tcgttaagct ggttgatggc catgcgacct
acgcttttta 600cgacgagaac accgccggcc ggatgatcac cgaggccgaa cttccggcct
tgggagcgga 660ttgcgaagcg ctgcatttcg gcgccatcag ccttattccc gaaccctgcg
gcagcaccta 720tgaggcgctg atgacgcgcg agcatgagac ccgcgtcatc tcgctcgatc
cgaacattcg 780tcccggcttc atccagaaca agcagtcgca catggcccgc atccgccgca
tggcggcgat 840gtctgacatc gtcaagttct cggatgagga cctggcgtgg ttcggtctgg
aaggcgacga 900ggacacgctt gcccgccact ggctgcacca cggtgcaaaa ctcgtcgttg
tcacccgtgg 960cgccaagggt gccgtgggtt acagcgccaa tctcaaggtg gaagtggcct
ccgagcgcgt 1020cgaagtggtc gatacggtcg gcgccggcga tacgttcgat gccggcattc
ttgcttcgct 1080gaaaatgcag ggcctgctga ccaaagcgca ggtggcttcg ctgagcgaag
agcagatcag 1140aaaagctttg gcgcttggcg cgaaagccgc tgcggtcact gtctcgcggg
ctggcgcaaa 1200tccgcctttc gcgcatgaaa tcggtttgtg attaattaaa gcacgcagtc
aaacaaaaaa 1260cccgcgccat tgcgcgggtt tttttatgcc cgaaggcgcg ccagcacgca
gtcaaacaaa 1320aaacccgcgc cattgcgcgg gtttttttat gcccgaacgg ccgaggtctt
ccgatctcct 1380gaagccaggg cagatccgtg cacagcacct tgccgtagaa gaacagcaag
gccgccaatg 1440cctgacgatg cgtggagacc gaaaccttgc gctcgttcgc cagccaggac
agaaatgcct 1500cgacttcgct gctgcccaag gttgccgggt gacgcacacc gtggaaacgg
atgaaggcac 1560gaacccagtg gacataagcc tgttcggttc gtaagctgta atgcaagtag
cgtatgcgct 1620cacgcaactg gtccagaacc ttgaccgaac gcagcggtgg taacggcgca
gtggcggttt 1680tcatggcttg ttatgactgt ttttttgggg tacagtctat gcctcgggca
tccaagcagc 1740aagcgcgtta cgccgtgggt cgatgtttga tgttatggag cagcaacgat
gttacgcagc 1800agggcagtcg ccctaaaaca aagttaaaca tcatgaggga agcggtgatc
gccgaagtat 1860cgactcaact atcagaggta gttggcgtca tcgagcgcca tctcgaaccg
acgttgctgg 1920ccgtacattt gtacggctcc gcagtggatg gcggcctgaa gccacacagt
gatattgatt 1980tgctggttac ggtgaccgta aggcttgatg aaacaacgcg gcgagctttg
atcaacgacc 2040ttttggaaac ttcggcttcc cctggagaga gcgagattct ccgcgctgta
gaagtcacca 2100ttgttgtgca cgacgacatc attccgtggc gttatccagc taagcgcgaa
ctgcaatttg 2160gagaatggca gcgcaatgac attcttgcag gtatcttcga gccagccacg
atcgacattg 2220atctggctat cttgctgaca aaagcaagag aacatagcgt tgccttggta
ggtccagcgg 2280cggaggaact ctttgatccg gttcctgaac aggatctatt tgaggcgcta
aatgaaacct 2340taacgctatg gaactcgccg cccgactggg ctggcgatga gcgaaatgta
gtgcttacgt 2400tgtcccgcat ttggtacagc gcagtaaccg gcaaaatcgc gccgaaggat
gtcgctgccg 2460actgggcaat ggagcgcctg ccggcccagt atcagcccgt catacttgaa
gctagacagg 2520cttatcttgg acaagaagaa gatcgcttgg cctcgcgcgc agatcagttg
gaagaatttg 2580tccactacgt gaaaggcgag atcaccaagg tagtcggcaa ataatgtcta
acaattcgtt 2640caagccgacg ccgcttcgcg gcgcggctta actcaagcgt tagatgcact
aagcacataa 2700ttgctcacag ccaaactatc aggtcaagtc tgcttttatt atttttaagc
gtgcataata 2760agccctacac aaattgggag atatatcatg aaaggctggc tttttcttgt
tatcgcaata 2820gttggcgaag taatcgcaac atccgcatta aaatctagcg agggctttac
taagctcgtc 2880agcgggtgtt ggcgggtgtc ggggctggct taactatgcg gcatcagagc
agattgtact 2940gagagtgcac catatgcggt gtgaaatacc gcacagatgc gtaaggagaa
aataccgcat 3000caggcgccat tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg
tgcgggcctc 3060ttcgctatta cgccagctgg cgaaaggggg atgtgctgca aggcgattaa
gttgggtaac 3120gccagggttt tcccagtcac gacgttgtaa aacgacggcc agtccgtaat
acgactcact 3180taaggccttg actagagggt accatttaaa tgtatactct agcgcccgat
ccagctggag 3240tttgtagaaa cgcaaaaagg ccatccgtca ggatggcctt ctgcttaatt
tgatgcctgg 3300cagtttatgg cgggcgtcct gcccgccacc ctccgggccg ttgcttcgca
acgttcaaat 3360ccgctcccgg cggatttgtc ctactcagga gagcgttcac cgacaaacaa
cagataaaac 3420gaaaggccca gtctttcgac tgagcctttc gttttatttg atgcctggca
gttccctact 3480ctcgcatggg gagaccccac actaccatcg gcgctacggc gtttcacttc
tgagttcggc 3540atggggtcag gtgggaccac cgcgctactg ccgccaggca aattctgttt
tatcagaccg 3600cttctgcgtt ctgatttaat ctgtatcagg ctgaaaatct tctctcatcc
gccaaaacag 3660ccaagcttgc atgcctgcag cccgggttac catttcaaca gatcgtcctt
agcatataag 3720tagtcgtcaa aaatgaattc aacttcgtct gtttcggcat tgtagccgcc
aactctgatg 3780gattcgtggt ttttgacaat gatgtcacag cctttttcct ttaggaagtc
caagtcgaaa 3840gtagtggcaa taccaatgat cttacaaccg gcggcttttc cggcggcaat
acctgctgga 3900gcgtcttcaa atactactac cttagatttg gaagggtctt gctcattgat
cggatatcct 3960aagccattcc tgcccttcag atatggttct ggatgaggct taccctgttt
gacatcatta 4020gcggtaatga agtactttgg tctcctgatt cccagatgct cgaaccattt
ttgtgccata 4080tcacgggtac cggaagttgc cacagcccat ttctcttttg gtagagcgtt
caaagcgttg 4140cacagcttaa ctgcacctgg gacttcaatg gatttttcac cgtacttgac
cggaatttca 4200gcttctaatt tgttaacata ctcttcattg gcaaagtctg gagcgaactt
agcaatggca 4260tcaaacgttc tccaaccatg cgagacttgg ataacgtgtt cagcatcgaa
ataaggtttg 4320tccttaccga aatccctcca gaatgcagca atggctggtt gagagatgat
aatggtaccg 4380tcgacgtcga acaaagcggc gttaactttc aaagatagag gtttagtagt
caatcccata 4440attctagtct gtttcctgga tccaataaat ctaatcttca tgtagatcta
attcttcaat 4500catgtccggc aggttcttca ttgggtagtt gttgtaaacg atttggtata
cggcttcaaa 4560taatgggaag tcttcgacag agccacatgt ttccaaccat tcgtgaactt
ctttgcaggt 4620aattaaacct tgagcggatt ggccattcaa caactccttt tcacattccc
aggcgtcctt 4680accagaagta gccattagcc tagcaacctt gacgtttcta ccaccagcgc
aggtggtgat 4740caaatcagca acaccagcag actcttggta gtatgtttct tctctagatt
ctgggaaaaa 4800catttgaccg aatctgatga tctcacccaa accgactctt tggatggcag
cagaagcgtt 4860gttaccccag cctagacctt cgacgaaacc acaacctaag gcaacaacgt
tcttcaaagc 4920accacagatg gagataccag caacatcttc gatgacacta acgtggaagt
aaggtctgtg 4980gaacaaggcc tttagaacct tatggtcgac gtccttgccc tcgcctctga
aatcctttgg 5040aatgtggtaa gcaactgttg tttcagacca gtgttcttga gcgacttcgg
tggcaatgtt 5100agcaccagat agagcaccac attgaatacc tagttcctca gtgatgtaag
aggatagcaa 5160ttggacacct ttagcaccaa cttcaaaacc ctttagacag gagatagctc
tgacgtgtga 5220atcaacatga cctttcaatt ggctacagat acggggcaaa aattgatgtg
gaatgttgaa 5280aacgatgatg tcgacatcct tgactgaatc aatcaagtct ggattagcaa
ccaaattgtc 5340gggtagagtg atgccaggca agtatttcac gttttgatgt ctagtattta
tgatttcagt 5400caatttttca ccattgatct cttcttcgaa cacccacatt tgtactattg
gagcgaaaac 5460ttctgggtat cccttacaat tttcggcaac caccttggca atagtagtac
cccagttacc 5520agatccaatc acagtaacct tgaaaggctt ttcggcagcc ttcaaagaaa
cagaagagga 5580acttctcttt ctaccagcat tcaagtggcc ggaagttaag tttaatctat
cagcagcagc 5640agccatggaa ttgtcctcct tactagtcat ggtctgtttc ctgtgtgaaa
ttgttatccg 5700ctcacaattc cacacattat acgagccgga tgattaattg tcaacagctc
atttcagaat 5760atttgccaga accgttatga tgtcggcgca aaaaacatta tccagaacgg
gagtgcgcct 5820tgagcgacac gaattatgca gtgatttacg acctgcacag ccataccaca
gcttccgatg 5880gctgcctgac gccagaagca ttggtgcacg ctagacaaga aaaaaggcac
gtcatctgac 5940gtgccttttt tatttgtacc tagaggctgg cgcgagcgcc cgtttaattc
gcctgaccgg 6000ccagtagcag cccggtggcg accgcattgc gcggcccttc tgttccccga
atattgccct 6060gcccggcgac cacgccatag tgcgacaagg cttccgtgat aagctgcggg
atctcaaagt 6120ccagcgatga gccgcccacc agcaccacaa aggcgatatc gcgaatggaa
ccgccgggtg 6180agacctggcg cagcgcgcgc aggcagttgg tgacaaacac tttctctttc
gcctgccggc 6240gcacgagacg aattttttcc agcgggctgg cgttatcgat cggcaccagt
tcgccctcct 6300tgatgtacac cactttggcg aacaccgccg ggctgagggc ttcccgaaag
aactccaccg 6360cgccattctc gtgacgaata ctgaacaggc tttccacttt ggccagcggg
tattttttta 6420tcgcttccgc cagcgaaaga tcctcgaggc ccagctcggt tttaatcaac
aggctgacca 6480tattccccgc cccggcgaga tggaccgccg ttatctgccc ctccgcgttg
acgatcgccg 6540catccgtcga gccggcgccg aggtcgagga tcgccagcgg cgccgcacag
ccgggagtgg 6600ttaacgcccc ggcgatggcc atgttggcct ccacgccgcc caccaccacc
tcggtctgca 6660gtcgggcgct cagttcgcgg gcgataacct gcatttgcag acgatccgct
ttcaccatcg 6720ccgccatccc gacggcattc tccatggcgc actcgccggc catcccgccc
tgcaccttgc 6780gcggaataaa cgtatccacc gccagcagat cctggatgta tatcgcgctc
atctcatggc 6840cggtcaggga cgccattacc ttgcgcaccc gctcaagcat gccgccggcg
tgggtgcccg 6900gttcgccgcg gatgtcgcgt accggagcgc aggcgctcat cgcctgcatg
atggcttccg 6960cgccctcggc gacatcggcc tctccgcggc gcttttcgcc gctaatgtag
aggttgcccg 7020ccgggatcac ccgcgactgc acatccccct gcggggtctt gagcaccacc
gcggaacggt 7080tgccaatcag ggcgcgggcg atggggacga tggcctgggt ctcttccggg
cttagcccga 7140agaaggtggc gatcccgtag ggattcgaca ggatccgcac cacctggccc
ggcgcggcca 7200cttccaccgc cgccattacc ccctcgggga cctgctccag cagcgtcact
tcatccacca 7260ccggcagggt tttacgcagg cggttgttca ccagcacgcc gtcgtccttt
ttgaggatcg 7320ccgccaccac gttgatcccc cggtcgagcg cctcattgag ccaccacacg
gcgtcaagga 7380aatcgacggc gtcgtcaatc agtacgatcc acccctcggc atactgcgcc
gccggcagcg 7440tcgccagccg cccgagggcg atagtcgtcc ccacgccaac gcccaccccg
cccggcgtct 7500gcgggttatg accgatcatg gtcgattcgg tgataatggt ctcggtgatg
gtctccatcg 7560ccacatcgcc aatcaccggc gcggcttcgt taagatagat gcgagagaca
tcgctcatcg 7620accacggtgt tttcgccagg gcctgctcca gcgcggcgag ggtcccggcg
atattgtccc 7680gcgtcccttt catgcccgtc gtcgcgacga tcccgctggc aacaaacgcc
ctcgcctgcg 7740ggtagtcgga cgccagcgcc acctcggtgg tggcgttgcc gatatcaatc
ccggctatta 7800acggcatgct gacctccgct tagcttcctt tacgcagctt atgccgctgc
tgatacactt 7860ccgccgactc ccggacaaag gcggcattca ctgtcgcatg ccaggtgtgc
tccagctcgt 7920cggcgatcgc cagcagctcc gcctgcgagg agcggaacgg gcgcagcgcg
ttatagatag 7980ccagaatgcg ctcgtcagga atggcgataa gctccgccgc gcggcggaaa
ttgcgcgcca 8040ccgcatggcg ctgcatctgc tcggcaatct gcgcctggta ctcaagggtc
tggcgggaga 8100tccgcacatc ctgcgggccc acctcgccag agagcacctt ctcgagggta
atatcggtca 8160atggtttgcc ggtaggcgtc aggatatgct ccgggcagcg ggtggctaac
ggataatcct 8220gcacgcgcat ggttttctcg ctcatggtca ctcccttact aagtcgatgt
gcagggtgac 8280gggctcggcg tcctgcacca catgtttggt ctctttgata tgaaatagcg
cggctttggc 8340cataaatttc ggccgcacca tctgatcgtt caccaccggc accggcgaag
gtgactcttt 8400gcgcgcatag cgcgcagcgt ttttgccaat ctgccggtag gtctccagcg
tcagcagcgg 8460cgcctgggag aacagctcca ggttgctgag cggcagcaga tcgcgctgat
ggatgaccgt 8520ggtccccttc gactggatac cgatgccgat ccccgagccg ctcaggttgg
ccgcatccca 8580ggccataaag gagacgtcgg acgtgcgcag aatgcgcacc acccgggcgt
gaagcccctc 8640ttcttccacc ccggcaatca gctctttgag gatcgcgcca tggggcatat
cgatcagagt 8700gtgatgctgg tgtttatcga aggcagggcc gacgccgatc accacttcat
cggcgcgttc 8760atcggcagaa gctaccccgc cctcgcgggt tttcagggta aaagagggct
gaatttgggt 8820tgtctgttgc acaggaatac cgccttgttc aatggtgtcg ggctgaacca
cgcccggaat 8880atttttgatc tccgcccagc gttcggcaga gatgcgatag ccggtgcccg
gcccctgata 8940gtcattgatg tcgttgaccg cactcaccac ctcgaactgc cgatcgaaaa
tggccgaggt 9000ctgcaggtaa tcgccggtga cccgctggcg cagcatattg agaatattgc
tggcgatatc 9060ctcaaagccg ctgcggctca gcgcgccgac aatatcgagg ccggtgatgt
tgcgcttcat 9120catctcttcc accgcactca gatcctccac cacgttacgc ggcggcatct
cgttgctgcc 9180gtgcgcgtag gtggcggcct ccacctcctc gtcggcgatt ggcggcagcc
ccagctcgcg 9240gaaaaccgcc tggatcgccc gcgccgcttt ctggcgaatg gcaatggttt
ccgcctcggt 9300caccggacgc aggccgccgt caaccatcag gtcacgctgc aggatgttgt
aatcatcaaa 9360atcttccgca tcgaagttcg agccggcgaa catgttgtcg tagttcggca
ccgcgctgta 9420gccggagaaa ataaagtcgg tgcccggcag catctgcatc agggtgcgcg
cggtgcggcg 9480aatatccgag tgggagaaag tctggtcgtt ggcggacgcc acttcgaggt
cgagcataga 9540ggcgatcagg ttttccgcca gcaccgcccg aatgcccgac ggcacagcgc
cggtcatgcc 9600gatacagctc accgcgccgt tttgcagtcc ctgaaccccg gcgcctttag
taatgaagat 9660gcagcgcgat tcgaggtaga gcatcgactt gctctccgaa tagcccatca
gcgcttcgga 9720tccggtgccg gaggtgtagc gcattttcaa cccgcgggag gcgtaggccg
aggcgaggaa 9780cgcctttgac cacggcgtat catcgccgtc ggtaaatacc gcttcggtgc
cgtagaccga 9840caccgtctcg gcgtagctgg ttaagccacg catgcccagc tccagctcgg
tggcctcttc 9900caccgagcac tgcgtcaaca cgccggggcg gccgcactgc gaaccgacca
acagcgccag 9960ggcgttaaac ggcgcgtagc gcgcgatacc gaccgtggtc tcctgttctg
agaagccgcg 10020gatcccggcc tcggcggcgt cagcggcaat ctgcaccgga ttatctttga
gattggtgac 10080gtggcactgg ttggaggggg tccggcgggc acgcatcttc tgcagcgcca
tcatcatctc 10140caccacgttc atctgcgcca tcacctcgac cgctttggcc ggcgtgatgg
cggtagtgat 10200ggcaatgatc tcctcccggc tgacgtgaat atccaccagc atacgggcta
tttccaccgc 10260ctccaggcgc attgcctgct ctgtgcgctc aacgttgatc gcgtaatcgg
cgataaatcg 10320gtcgatcatg tcaaactggt cccggcgttt gccgtccagt tcgacgatca
gaccgttgtc 10380cacttttact gaagagaccg ggtcaaaggg gctgtccatg gcgatcagcc
cctcttcagg 10440ccactcgcca atcagcccgt cctgattgac ggggcgctgg gccagtactg
caaatcgttt 10500tgatcttttc attgttcatc ggctcaaaag gtgaagcttg gttacctccg
ggaaacgcgg 10560ttgatttgtt tagtggttga attatttgct caggatgtgg cattgtcaag
ggcgtgacgg 10620ctcgcctgac ttctcgttcc agtgcccccg tccgacagtc gagcgtgcga
gcccataatc 10680tcgcgctggt gctgcatacc gtggcaaaca gcacagatcg cctaggaaaa
aaaaagcccg 10740cactgtcagg tgcgggcttt tttctgtgtt tgctaggcca gttcaagcgc
aagcatcagg 10800gtgcagctgg gcagaggcga gattcctccc cgggatcacg aactgtttta
acgggccgct 10860ctcggccata ttgcggtcga taagccgctc cagggcggtg atctcctctt
cgccgatcgt 10920ctggctcagg cgggtcaggc cccgcgcatc gctggccagt tcagccccca
gcacgaacag 10980cgtctgctga atatggtgca ggctttcccg cagcccggcg tcgcgggtcg
tggcgtagca 11040gacgcccagc tgggatatca gttcatcgac ggtgccgtag gcctcgacgc
gaatatggtc 11100tttctcgatg cggctgccgc cgtacagggc ggtggtgcct ttatccccgg
tgcgggtata 11160gatacgatac attcagtttc tctcacttaa cggcaggact ttaaccagct
gcccggcgtt 11220ggcgccgagc gtacgcagtt gatcgtcgct atcggtgacg tgtccggtag
ccagcggcgc 11280gtccgccggc agctgggcat gagtgagggc tatctcgccg gacgcgctga
gcccgatacc 11340cacccgcagg ggcgagcttc tggccgccag ggcgcccagc gcagcggcgt
caccgcctcc 11400gtcataggtt atggtctggc aggggacccc ctgctcctcc agcccccagc
acagctcatt 11460gatggcgccg gcatggtgcc cgcgcggatc gtaaaacagg cgtacgcctg
gcggtgaaag 11520cgacatgacg gtcccctcgt taacactcag aatgcctggc ggaacatacg
atagctcata 11580atataccttc tcgcttcagg ttataatgcg gaaaaacaat ccagggcgca
ctgggctaat 11640aattgatcct gctcgaccgt accgccgcta acgccgacgg cgccaattac
ctgctcatta 11700aaaataactg gcaggccgcc gccaaaaata ataattcgct gttggttggt
tagctgcaga 11760ccgtacagag attgtcctgg ctggaccgct gacgtaattt catgggtacc
ttgcttcagg 11820ctgcaggcgc tccaggcttt attcagggaa atatcgcagc tggagacgaa
ggcctcgtcc 11880atccgctgga taagcagcgt gttgcctccg cggtcaacta cggaaaacac
caccgccacg 11940ttgatctcag tggctttttt ttccaccgcc gccgccattt gctgggcggc
ggccagggtg 12000attgtctgaa cttgttggct cttgttcatc attctctccc gcaagcttgg
ttacctccgg 12060gaaacgcggt tgatttgttt agtggttgaa ttatttgctc aggatgtggc
attgtcaagg 12120gcgtgacggc tcgcctgact tctcgttcca gtgcccccgt ccgacagtcg
agcgtgcgag 12180cccataatct cgcgctggtg ctgcataccg tggcaaacag cacagatcgc
ctagcagtca 12240aaagcctccg gtcggaggct tttgactatt taaatgaatt cccgacagta
agacgggtaa 12300gcctgttgat gataccgctg ccttactggg tgcattagcc agtctgaatg
acctgtcacg 12360ggataatccg aagtggtcag actggaaaat cagagggcag gaactgctga
acagcaaaaa 12420gtcagatagc accacatagc agacccgcca taaaacgccc tgagaagccc
gtgacgggct 12480tttcttgtat tatgggtagt ttccttgcat gaatccataa aaggcgcctg
tagtgccatt 12540tacccccatt cactgccaga gccgtgagcg cagcgaactg aatgtcacga
aaaagacagc 12600gactcaggtg cctgatggtc ggagacaaaa ggaatattca gcgatttgcc
cgagcttgcg 12660agggtgctac ttaagccttt agggttttaa ggtctgtttt gtagaggagc
aaacagcgtt 12720tgcgacatcc ttttgtaata ctgcggaact gactaaagta gtgagttata
cacagggctg 12780ggatctattc tttttatctt tttttattct ttctttattc tataaattat
aaccacttga 12840atataaacaa aaaaaacaca caaaggtcta gcggaattta cagagggtct
agcagaattt 12900acaagttttc cagcaaaggt ctagcagaat ttacagatac ccacaactca
aaggaaaagg 12960actagtaatt atcattgact agcccatctc aattggtata gtgattaaaa
tcacctagac 13020caattgagat gtatgtctga attagttgtt ttcaaagcaa atgaactagc
gattagtcgc 13080tatgacttaa cggagcatga aaccaagcta attttatgct gtgtggcact
actcaacccc 13140acgattgaaa accctacaag gaaagaacgg acggtatcgt tcacttataa
ccaatacgct 13200cagatgatga acatcagtag ggaaaatgct tatggtgtat tagctaaagc
aaccagagag 13260ctgatgacga gaactgtgga aatcaggaat cctttggtta aaggctttga
gattttccag 13320tggacaaact atgccaagtt ctcaagcgaa aaattagaat tagtttttag
tgaagagata 13380ttgccttatc ttttccagtt aaaaaaattc ataaaatata atctggaaca
tgttaagtct 13440tttgaaaaca aatactctat gaggatttat gagtggttat taaaagaact
aacacaaaag 13500aaaactcaca aggcaaatat agagattagc cttgatgaat ttaagttcat
gttaatgctt 13560gaaaataact accatgagtt taaaaggctt aaccaatggg ttttgaaacc
aataagtaaa 13620gatttaaaca cttacagcaa tatgaaattg gtggttgata agcgaggccg
cccgactgat 13680acgttgattt tccaagttga actagataga caaatggatc tcgtaaccga
acttgagaac 13740aaccagataa aaatgaatgg tgacaaaata ccaacaacca ttacatcaga
ttcctaccta 13800cataacggac taagaaaaac actacacgat gctttaactg caaaaattca
gctcaccagt 13860tttgaggcaa aatttttgag tgacatgcaa agtaagtatg atctcaatgg
ttcgttctca 13920tggctcacgc aaaaacaacg aaccacacta gagaacatac tggctaaata
cggaaggatc 13980tgaggttctt atggctcttg tatctatcag tgaagcatca agactaacaa
acaaaagtag 14040aacaactgtt caccgttaca tatcaaaggg aaaactgtcc atatgcacag
atgaaaacgg 14100tgtaaaaaag atagatacat cagagctttt acgagttttt ggtgcattca
aagctgttca 14160ccatgaacag atcgacaatg taacagatga acagcatgta acacctaata
gaacaggtga 14220aaccagtaaa acaaagcaac tagaacatga aattgaacac ctgagacaac
ttgttacagc 14280tcaacagtca cacatagaca gcctgaaaca ggcgatgctg cttatcgaat
caaagctgcc 14340gacaacacgg gagccagtga cgcctcccgt ggggaaaaaa tcatggcaat
tctggaagaa 14400atagcgcttt cagccggcaa accggctgaa gccggatctg cga
14443
67 1260 DNA Citrobacter sp
67 atgaagataa atatgccgtt cagtaatgac
aaatatcggt actcgtccgg gtatctgctt 60tttttctttg ccgcctggtc gttgtggtgg
tctttttatg cgatatggct aaaaaataaa 120cttggcctgt ccggaacgga gctgggaatg
ctgtatgccg taaaccagtt ctttagcatg 180ctgtttatgc tggtctacgg ttttctgcag
gataagctcg gcacccgtaa acaccttatc 240tggctgatgg ggatagtcat cacgctcagc
ggcccgttcc tgatttatgt ttacgaaccg 300ctgctgacct ccaacttcaa acttggtatg
gcgctgggag ccattttctt tggccttggc 360tacctcgcgg gttgtggtct ggtagaaagc
ttcgtcgaaa aagtgagccg caaattcaac 420tttgaattcg gcaccgcccg cttgtgggga
tcgcttggct acgccgcagg gacatttgtt 480ggcggtatct tcttcagcat caacccacac
attaacttct ggtgcgtatc ggtaatgggg 540gtgttattcc tgttgattaa cgtgttgttc
aaaaccaact cacccgcccc atcttctgta 600aaaacgcgtt ctcctgaacc tgacgcgctg
acccgaaagg attttctcac tatctttaaa 660gatacgcagt tctggttttt cgttatcttt
gtcgtcggta cctggtcgtt ctatagcatc 720tacgatcagc agatgttccc ggtgttttac
gccagcttat ttgacgatcc cgaactggca 780ccacgcgtat acggctacct caactcggta
caggtcttta tggaagccgt cggtatggcg 840ctggttccat tcctgattaa ccgcatcggg
cctaaatccg cattgctgct gggtggcaca 900atcatggcct gtcgaatcct gggttcagca
ctgttcaccg atatctatat tatctccttg 960attaaaatgc ttcatgcgct ggaagtccca
ctgtttgtta tttcagtgtt taaattcagc 1020gtagcgaatt ttgataaacg cctgtcatca
acgatatatc tcattggctt caatatcgcc 1080agttccattg gcattatcgt gctgtcactg
cctgtcggta agttgtttga taaagtgggc 1140tatcaggaaa tcttcctgat tatggccagc
attgtgataa taacactaat atttggctat 1200ttctcgttga gcaaaaagca tcatcagcag
aagatgggaa atgaactggt gacagagtag 1260
68 419 PRT Citrobacter sp
68 Met Lys
Ile Asn Met Pro Phe Ser Asn Asp Lys Tyr Arg Tyr Ser Ser 1 5
10 15 Gly Tyr Leu Leu Phe Phe Phe
Ala Ala Trp Ser Leu Trp Trp Ser Phe 20 25
30 Tyr Ala Ile Trp Leu Lys Asn Lys Leu Gly Leu Ser
Gly Thr Glu Leu 35 40 45
Gly Met Leu Tyr Ala Val Asn Gln Phe Phe Ser Met Leu Phe Met Leu
50 55 60 Val Tyr Gly
Phe Leu Gln Asp Lys Leu Gly Thr Arg Lys His Leu Ile 65
70 75 80 Trp Leu Met Gly Ile Val Ile
Thr Leu Ser Gly Pro Phe Leu Ile Tyr 85
90 95 Val Tyr Glu Pro Leu Leu Thr Ser Asn Phe Lys
Leu Gly Met Ala Leu 100 105
110 Gly Ala Ile Phe Phe Gly Leu Gly Tyr Leu Ala Gly Cys Gly Leu
Val 115 120 125 Glu
Ser Phe Val Glu Lys Val Ser Arg Lys Phe Asn Phe Glu Phe Gly 130
135 140 Thr Ala Arg Leu Trp Gly
Ser Leu Gly Tyr Ala Ala Gly Thr Phe Val 145 150
155 160 Gly Gly Ile Phe Phe Ser Ile Asn Pro His Ile
Asn Phe Trp Cys Val 165 170
175 Ser Val Met Gly Val Leu Phe Leu Leu Ile Asn Val Leu Phe Lys Thr
180 185 190 Asn Ser
Pro Ala Pro Ser Ser Val Lys Thr Arg Ser Pro Glu Pro Asp 195
200 205 Ala Leu Thr Arg Lys Asp Phe
Leu Thr Ile Phe Lys Asp Thr Gln Phe 210 215
220 Trp Phe Phe Val Ile Phe Val Val Gly Thr Trp Ser
Phe Tyr Ser Ile 225 230 235
240 Tyr Asp Gln Gln Met Phe Pro Val Phe Tyr Ala Ser Leu Phe Asp Asp
245 250 255 Pro Glu Leu
Ala Pro Arg Val Tyr Gly Tyr Leu Asn Ser Val Gln Val 260
265 270 Phe Met Glu Ala Val Gly Met Ala
Leu Val Pro Phe Leu Ile Asn Arg 275 280
285 Ile Gly Pro Lys Ser Ala Leu Leu Leu Gly Gly Thr Ile
Met Ala Cys 290 295 300
Arg Ile Leu Gly Ser Ala Leu Phe Thr Asp Ile Tyr Ile Ile Ser Leu 305
310 315 320 Ile Lys Met Leu
His Ala Leu Glu Val Pro Leu Phe Val Ile Ser Val 325
330 335 Phe Lys Phe Ser Val Ala Asn Phe Asp
Lys Arg Leu Ser Ser Thr Ile 340 345
350 Tyr Leu Ile Gly Phe Asn Ile Ala Ser Ser Ile Gly Ile Ile
Val Leu 355 360 365
Ser Leu Pro Val Gly Lys Leu Phe Asp Lys Val Gly Tyr Gln Glu Ile 370
375 380 Phe Leu Ile Met Ala
Ser Ile Val Ile Ile Thr Leu Ile Phe Gly Tyr 385 390
395 400 Phe Ser Leu Ser Lys Lys His His Gln Gln
Lys Met Gly Asn Glu Leu 405 410
415 Val Thr Glu
69 1314 DNA Enterococcus faecium
69 atgaaagggg
atacaaatat atcgttggag gataaaaata tgtcaaaagt taacgtattt 60aaaaatcaat
cttatttaca aagttcagct acattattac tattttttgc ttcttggggt 120gtttggtggt
cattttttca actttggcta acatctgaat caaatggttt agggttatct 180ggcagtgctg
taggaacagt attctcggca aattcgttag ttaccttaat tttgatgttt 240atttatggaa
cattacaaga taaattgtat attaaacgaa atttattaat ttttgcttct 300gtattagcga
cacttgttgg accatttttt atatggatat atgggccatt gctagataac 360aattttaatt
taggcattat tatgggagcg ctatttttgt cagctggata tttagcttct 420gtaggagttt
ttgaagctgt gtcagaaagg tttagtcgtt tatttggctt tgaatatgga 480caagcaaggg
cgtggggatc atttggttat gccttggtag cgcttttggc aggattttta 540tttgtaaaaa
atcctcattt aaacttttgg gcgggatctt tctttggttc tttactattg 600ttaaatttat
tattttggaa ccctaaagtt gaacgggaag caaatcaaaa ttttaatcaa 660gaacaagctg
aatcaaatag tattccttct ttaaaagaaa tgtttgatct aatgaaactg 720cctcaattat
ggacgataat catctttatt gtttttacat ggacatttta tacggtattc 780gatcaacaaa
tgtttccggg attttatact ggtttgtttt caacatcagc taatggtgaa 840aaaatatatg
ggacattgaa tgctattcaa gtattttgtg aagcgttaat gatgggaatt 900gttccaatca
ttatgagaaa attaggggtt cgaaatactt tgttattagg tgtaaccatt 960atgtgtgtac
gaattggatt gtgcgggttt gcctcgacac cattatctgt ttcatgcata 1020aaaatgttgc
atgctttaga agtaccatta tttacattac caatgtttcg ctattttaca 1080cttcattttg
atacaaagct atcagcaacc ctctatatga taggatttca gatagctgct 1140caaattgggc
aagtgatttt atcaacacca ttgggaatat taagagacaa cgttggctat 1200caaccaacat
ttaaaattat ttctcttatt gtattactag caggcatata tgcattcttt 1260attcttaaac
aagatgatag agatgttcaa ggggatccat ttattcgagg ataa
1314
70 437 PRT Enterococcus faecium
70 Met Lys Gly Asp Thr Asn Ile Ser Leu
Glu Asp Lys Asn Met Ser Lys 1 5 10
15 Val Asn Val Phe Lys Asn Gln Ser Tyr Leu Gln Ser Ser Ala
Thr Leu 20 25 30
Leu Leu Phe Phe Ala Ser Trp Gly Val Trp Trp Ser Phe Phe Gln Leu
35 40 45 Trp Leu Thr Ser
Glu Ser Asn Gly Leu Gly Leu Ser Gly Ser Ala Val 50
55 60 Gly Thr Val Phe Ser Ala Asn Ser
Leu Val Thr Leu Ile Leu Met Phe 65 70
75 80 Ile Tyr Gly Thr Leu Gln Asp Lys Leu Tyr Ile Lys
Arg Asn Leu Leu 85 90
95 Ile Phe Ala Ser Val Leu Ala Thr Leu Val Gly Pro Phe Phe Ile Trp
100 105 110 Ile Tyr Gly
Pro Leu Leu Asp Asn Asn Phe Asn Leu Gly Ile Ile Met 115
120 125 Gly Ala Leu Phe Leu Ser Ala Gly
Tyr Leu Ala Ser Val Gly Val Phe 130 135
140 Glu Ala Val Ser Glu Arg Phe Ser Arg Leu Phe Gly Phe
Glu Tyr Gly 145 150 155
160 Gln Ala Arg Ala Trp Gly Ser Phe Gly Tyr Ala Leu Val Ala Leu Leu
165 170 175 Ala Gly Phe Leu
Phe Val Lys Asn Pro His Leu Asn Phe Trp Ala Gly 180
185 190 Ser Phe Phe Gly Ser Leu Leu Leu Leu
Asn Leu Leu Phe Trp Asn Pro 195 200
205 Lys Val Glu Arg Glu Ala Asn Gln Asn Phe Asn Gln Glu Gln
Ala Glu 210 215 220
Ser Asn Ser Ile Pro Ser Leu Lys Glu Met Phe Asp Leu Met Lys Leu 225
230 235 240 Pro Gln Leu Trp Thr
Ile Ile Ile Phe Ile Val Phe Thr Trp Thr Phe 245
250 255 Tyr Thr Val Phe Asp Gln Gln Met Phe Pro
Gly Phe Tyr Thr Gly Leu 260 265
270 Phe Ser Thr Ser Ala Asn Gly Glu Lys Ile Tyr Gly Thr Leu Asn
Ala 275 280 285 Ile
Gln Val Phe Cys Glu Ala Leu Met Met Gly Ile Val Pro Ile Ile 290
295 300 Met Arg Lys Leu Gly Val
Arg Asn Thr Leu Leu Leu Gly Val Thr Ile 305 310
315 320 Met Cys Val Arg Ile Gly Leu Cys Gly Phe Ala
Ser Thr Pro Leu Ser 325 330
335 Val Ser Cys Ile Lys Met Leu His Ala Leu Glu Val Pro Leu Phe Thr
340 345 350 Leu Pro
Met Phe Arg Tyr Phe Thr Leu His Phe Asp Thr Lys Leu Ser 355
360 365 Ala Thr Leu Tyr Met Ile Gly
Phe Gln Ile Ala Ala Gln Ile Gly Gln 370 375
380 Val Ile Leu Ser Thr Pro Leu Gly Ile Leu Arg Asp
Asn Val Gly Tyr 385 390 395
400 Gln Pro Thr Phe Lys Ile Ile Ser Leu Ile Val Leu Leu Ala Gly Ile
405 410 415 Tyr Ala Phe
Phe Ile Leu Lys Gln Asp Asp Arg Asp Val Gln Gly Asp 420
425 430 Pro Phe Ile Arg Gly 435
71 1299 DNA Corynebacterium glucuronolyticum
71 atgtcaaaat tcatgcagca
gctgaaaaac actgcgtatc agcagtcatc agcgcaactc 60ctgctcttct tcatgtcctg
gggcatctgg tggtccttct tccagctttg gctctccagc 120gaaacccgcg gtctcggctt
caacggcagt gagatcggca ccatctactc ggtgaactcc 180gccgtcacgc tcgtcctcat
gctcgtctac ggaactgccc aagataagct tcgtactcgc 240cgtaatttgg tgatcggtat
tgcagttcta atgagcttga ccggcccgtt cttcatgtgg 300gtctactggc cactgctgca
gagcgagtcg ctctatgtcc tcggtgttgg acttggcgca 360atcttcatcg gtacggcttt
tgtggggtca tgcccgctgt tcgaggcgct tgccgagcgc 420atgtcccgaa aacacaactt
cgaatatggc caggcccgcg cgtggggttc ctttggctac 480gccatcgtcg cactcctcgc
cggcttcaac ttcaccatca acccggcgat taacttctgg 540atggcctcgg ccttcggcgt
tctgttgctt ctcatcctcg ttttctggaa ggaaccggta 600gcgcctcgta acgaaattgc
agaggaggaa gtggaaaaca ccacacctag cgtcaaggaa 660atggtgtctg ttctcaaagt
gcccgccctc tgggtcgtca ttgtcctcgt gttcttcacg 720tggacgttct acacggtctt
cgaccagcag atgttcccgc agttctacac ctcacttttt 780agtgactccg ccaccggcga
gcgaacctat ggcgtgctca actccgtcca agtgttcgtc 840gaggcgttga tgatgggaat
cgtgcccatc tacatgcgga aggtcggcgt gaagaacacc 900ctcatgacgg gcttcgccgt
catggcactg cgcatcctag gttgcgcggt cttcgcggac 960ccagtcacca tctcctttgt
caagatgttc cacgctctcg aggtaccact gtgcatcctc 1020cccatcttcc gctacttcac
cctgcacttc cccacgaaga tctcggccac cttgtacatg 1080gtcggcttcc agattgcctc
gcaggtgggt aacgtcgtca tgtccccgat cctcggttcg 1140ctgcgtgacc gcctcggttt
ccagccgacc ttctatgtca tctcgggaat cgtccttgtc 1200tccgctatct tcgcctggtt
ggctctcaag ggcgataagg aacaagtgga gggcgatccc 1260ttctaccgcg attcggaact
taaggagata caccaatga 1299
72 432 PRT Corynebacterium glucuronolyticum
72 Met Ser Lys Phe Met Gln Gln Leu Lys Asn Thr Ala Tyr
Gln Gln Ser 1 5 10 15
Ser Ala Gln Leu Leu Leu Phe Phe Met Ser Trp Gly Ile Trp Trp Ser
20 25 30 Phe Phe Gln Leu
Trp Leu Ser Ser Glu Thr Arg Gly Leu Gly Phe Asn 35
40 45 Gly Ser Glu Ile Gly Thr Ile Tyr Ser
Val Asn Ser Ala Val Thr Leu 50 55
60 Val Leu Met Leu Val Tyr Gly Thr Ala Gln Asp Lys Leu
Arg Thr Arg 65 70 75
80 Arg Asn Leu Val Ile Gly Ile Ala Val Leu Met Ser Leu Thr Gly Pro
85 90 95 Phe Phe Met Trp
Val Tyr Trp Pro Leu Leu Gln Ser Glu Ser Leu Tyr 100
105 110 Val Leu Gly Val Gly Leu Gly Ala Ile
Phe Ile Gly Thr Ala Phe Val 115 120
125 Gly Ser Cys Pro Leu Phe Glu Ala Leu Ala Glu Arg Met Ser
Arg Lys 130 135 140
His Asn Phe Glu Tyr Gly Gln Ala Arg Ala Trp Gly Ser Phe Gly Tyr 145
150 155 160 Ala Ile Val Ala Leu
Leu Ala Gly Phe Asn Phe Thr Ile Asn Pro Ala 165
170 175 Ile Asn Phe Trp Met Ala Ser Ala Phe Gly
Val Leu Leu Leu Leu Ile 180 185
190 Leu Val Phe Trp Lys Glu Pro Val Ala Pro Arg Asn Glu Ile Ala
Glu 195 200 205 Glu
Glu Val Glu Asn Thr Thr Pro Ser Val Lys Glu Met Val Ser Val 210
215 220 Leu Lys Val Pro Ala Leu
Trp Val Val Ile Val Leu Val Phe Phe Thr 225 230
235 240 Trp Thr Phe Tyr Thr Val Phe Asp Gln Gln Met
Phe Pro Gln Phe Tyr 245 250
255 Thr Ser Leu Phe Ser Asp Ser Ala Thr Gly Glu Arg Thr Tyr Gly Val
260 265 270 Leu Asn
Ser Val Gln Val Phe Val Glu Ala Leu Met Met Gly Ile Val 275
280 285 Pro Ile Tyr Met Arg Lys Val
Gly Val Lys Asn Thr Leu Met Thr Gly 290 295
300 Phe Ala Val Met Ala Leu Arg Ile Leu Gly Cys Ala
Val Phe Ala Asp 305 310 315
320 Pro Val Thr Ile Ser Phe Val Lys Met Phe His Ala Leu Glu Val Pro
325 330 335 Leu Cys Ile
Leu Pro Ile Phe Arg Tyr Phe Thr Leu His Phe Pro Thr 340
345 350 Lys Ile Ser Ala Thr Leu Tyr Met
Val Gly Phe Gln Ile Ala Ser Gln 355 360
365 Val Gly Asn Val Val Met Ser Pro Ile Leu Gly Ser Leu
Arg Asp Arg 370 375 380
Leu Gly Phe Gln Pro Thr Phe Tyr Val Ile Ser Gly Ile Val Leu Val 385
390 395 400 Ser Ala Ile Phe
Ala Trp Leu Ala Leu Lys Gly Asp Lys Glu Gln Val 405
410 415 Glu Gly Asp Pro Phe Tyr Arg Asp Ser
Glu Leu Lys Glu Ile His Gln 420 425
430
73 1326 DNA Bifidobacterium animalis
73 atggcaacaa
ccacgaaggt gtggaggaac ccctcctacc tgcaaagctc aaccggcatc 60ttcctgttct
tctgctcctg gggcatctgg tggtcgttct tccagcgctg gctcaactcg 120atgggactca
acggcgcgga agtgggcacg atctattcga tcaactcgct ggccacgctc 180atcctcatgt
tcgggtacgg cctcatccag gacaatctcg gactcaagcg ccgtcttgtg 240ctcgtcatct
cggcgatcgc cgcactcgtc ggacccttcg tgcagttcgt gtacgcgccg 300ctgatgagga
cgaacatgat ggccgccgca ctcgtgggct ccgtcgttct ctccgcgggc 360ttcatggcag
gctgctcgct catcgaggcc gtgaccgaac ggtacagccg ccgtttcaac 420ttcgagtacg
gccaatcccg cgcatggggt tccttcggct atgccattgt ggcgcttgtc 480gccggcttcg
tgttcaacat caacccgatg atcaacttct ggctcggctc cgcattcggc 540gtgggcatgc
tcatcgtgta cctcacctgg tatccggccg agcagcgcga agcgctcaag 600gaagccgccg
atccgaatgc cgcgccaact aacccgacca tcaaagacat gctcggcgtg 660ctcaagatgc
ccacgctgtg ggtgctcatc gtgttcatgc tgctcaccaa cacgttctac 720accgtattcg
accagcagat gttccccacc tactacgcct cgctcttccc gaatgaggcc 780accggcaacg
ccgtctacgg cacgctcaac tcggtgcagg tgttctgcga atccgcgatg 840atgggcgtcg
tgccgatcat catgcgcaag gtaggtgtgc gcaacgcgtt gctgctcgga 900tccacggtga
tgttccttcg catcgggctg tgcggcatct tccacgatcc ggtgtccatc 960tcgatcgtca
aaatgttcca cgccattgaa gttccgctgt tctgcctgcc ggcgttccgc 1020tacttcacgc
tccacttcaa tccgaagctc tccgcgacgc tctacatggt cggcttccag 1080attgcctcac
agatcggcca ggtcgtcttc tccaccccgc tcggcatgct gcatgaccgc 1140atgggcgacc
gcacgacgtt cctgacgatc tccgccatcg tgcttgctgc caccgtctac 1200ggattcttcg
tgatcaagcg cgacgacgag caggtggatg gcgatccgtt catccgcgat 1260tcgaagaagc
tgccgtcgct cgccaccgac gaggcgatcc tctccgcgga ttccgaggat 1320atgtaa
1326
74 441 PRT Bifidobacterium animalis
74 Met Ala Thr Thr Thr Lys Val Trp
Arg Asn Pro Ser Tyr Leu Gln Ser 1 5 10
15 Ser Thr Gly Ile Phe Leu Phe Phe Cys Ser Trp Gly Ile
Trp Trp Ser 20 25 30
Phe Phe Gln Arg Trp Leu Asn Ser Met Gly Leu Asn Gly Ala Glu Val
35 40 45 Gly Thr Ile Tyr
Ser Ile Asn Ser Leu Ala Thr Leu Ile Leu Met Phe 50
55 60 Gly Tyr Gly Leu Ile Gln Asp Asn
Leu Gly Leu Lys Arg Arg Leu Val 65 70
75 80 Leu Val Ile Ser Ala Ile Ala Ala Leu Val Gly Pro
Phe Val Gln Phe 85 90
95 Val Tyr Ala Pro Leu Met Arg Thr Asn Met Met Ala Ala Ala Leu Val
100 105 110 Gly Ser Val
Val Leu Ser Ala Gly Phe Met Ala Gly Cys Ser Leu Ile 115
120 125 Glu Ala Val Thr Glu Arg Tyr Ser
Arg Arg Phe Asn Phe Glu Tyr Gly 130 135
140 Gln Ser Arg Ala Trp Gly Ser Phe Gly Tyr Ala Ile Val
Ala Leu Val 145 150 155
160 Ala Gly Phe Val Phe Asn Ile Asn Pro Met Ile Asn Phe Trp Leu Gly
165 170 175 Ser Ala Phe Gly
Val Gly Met Leu Ile Val Tyr Leu Thr Trp Tyr Pro 180
185 190 Ala Glu Gln Arg Glu Ala Leu Lys Glu
Ala Ala Asp Pro Asn Ala Ala 195 200
205 Pro Thr Asn Pro Thr Ile Lys Asp Met Leu Gly Val Leu Lys
Met Pro 210 215 220
Thr Leu Trp Val Leu Ile Val Phe Met Leu Leu Thr Asn Thr Phe Tyr 225
230 235 240 Thr Val Phe Asp Gln
Gln Met Phe Pro Thr Tyr Tyr Ala Ser Leu Phe 245
250 255 Pro Asn Glu Ala Thr Gly Asn Ala Val Tyr
Gly Thr Leu Asn Ser Val 260 265
270 Gln Val Phe Cys Glu Ser Ala Met Met Gly Val Val Pro Ile Ile
Met 275 280 285 Arg
Lys Val Gly Val Arg Asn Ala Leu Leu Leu Gly Ser Thr Val Met 290
295 300 Phe Leu Arg Ile Gly Leu
Cys Gly Ile Phe His Asp Pro Val Ser Ile 305 310
315 320 Ser Ile Val Lys Met Phe His Ala Ile Glu Val
Pro Leu Phe Cys Leu 325 330
335 Pro Ala Phe Arg Tyr Phe Thr Leu His Phe Asn Pro Lys Leu Ser Ala
340 345 350 Thr Leu
Tyr Met Val Gly Phe Gln Ile Ala Ser Gln Ile Gly Gln Val 355
360 365 Val Phe Ser Thr Pro Leu Gly
Met Leu His Asp Arg Met Gly Asp Arg 370 375
380 Thr Thr Phe Leu Thr Ile Ser Ala Ile Val Leu Ala
Ala Thr Val Tyr 385 390 395
400 Gly Phe Phe Val Ile Lys Arg Asp Asp Glu Gln Val Asp Gly Asp Pro
405 410 415 Phe Ile Arg
Asp Ser Lys Lys Leu Pro Ser Leu Ala Thr Asp Glu Ala 420
425 430 Ile Leu Ser Ala Asp Ser Glu Asp
Met 435 440
75 1398 DNA Bifidobacterium gallicum
75 atggtgaata aaccgaagac cgcaaaaatc tggtccaacc cgtcctattt gcagagctcg
60tttggcattt tcctgttctt ctgctcatgg ggcatctggt ggtccttctt ccagcgctgg
120ctcaatacca ttggcctgaa cggcgcggaa gtcggcaccg tctattccat caactcgctg
180gccacgctga tcctcatgtt cggctacggc atcatccagg acaacctggg catcaagcgc
240cgtctcgtgg tcgtcatcgc caccatcgcg gcactgatcg gccccttcgt ccagttcgtg
300tacgcgccgc tcatgcagac gaacatcatg gccgccgccc tgatcggctc cgtggtgctc
360tccgccggct tcatgtccgg atgctcgctg attgaagcgc ttaccgaacg ctacagccgc
420aagttcggct tcgaatacgg ccagtcccgc gcatggggct ccttcggcta cgccattgtg
480gccctgatcg ccggcattgt cttcaacatc aacccgatga tcaacttctg gctcggctcc
540gcattcggcg tgggcatgct catcgtgtac ctcacgtggt acccggccga gcagcgccag
600gccctcaagg aagcggccga cccgaacgcc gagaagtcca acccgtcctt caaggacatg
660gtcaacgtgc tcaagatgcc gacgctgtgg gtgctcatca tcttcatgct gctgaccaac
720acgttctaca cggtcttcga ccagcagatg ttcccgacct actacgcctc gctgttcccg
780agcattgaaa cgggcaacac ggtctacggc gtgctcaact ccatccaggt cttctgcgaa
840tccgcgatga tgggcgtcgt cccgatcatc atgcgcaaga tcggcgtgcg caacgcgctg
900ctgctgggcg ccaccgtcat gttcctgcgc atcggcctgt gcggcatctt ccacgacccg
960gtagccatct ccatcgtcaa gatgttccac gccatcgaag ttccactgtt ctgcctgccg
1020gcgttccgct acttcacgct gcacttcaac ccgaagctct cggccacgct gtacatggtg
1080ggcttccaga tcgcctcaca gatcggccag gttatcttct ccaccccgct gggcatgctg
1140cacgaccgct tcggcgaccg caccaccttc ctgtccatca gcggcatcgt gctgctggca
1200acgatctacg gcttcttcgt catcaagcgc gacgacgagc acgtggacgg cgatccgttc
1260ctgcgtgacc gcgaccgcaa ggaaatggaa ctcatcgaag agaacctgca gccagacgcc
1320gagctggaaa cgagccccgt aggcgtcgca gcacaggtgc gcgacaaccg cgcggtccag
1380ccggaatacg caagctga
1398
76 465 PRT Bifidobacterium gallicum
76 Met Val Asn Lys Pro Lys Thr Ala
Lys Ile Trp Ser Asn Pro Ser Tyr 1 5 10
15 Leu Gln Ser Ser Phe Gly Ile Phe Leu Phe Phe Cys Ser
Trp Gly Ile 20 25 30
Trp Trp Ser Phe Phe Gln Arg Trp Leu Asn Thr Ile Gly Leu Asn Gly
35 40 45 Ala Glu Val Gly
Thr Val Tyr Ser Ile Asn Ser Leu Ala Thr Leu Ile 50
55 60 Leu Met Phe Gly Tyr Gly Ile Ile
Gln Asp Asn Leu Gly Ile Lys Arg 65 70
75 80 Arg Leu Val Val Val Ile Ala Thr Ile Ala Ala Leu
Ile Gly Pro Phe 85 90
95 Val Gln Phe Val Tyr Ala Pro Leu Met Gln Thr Asn Ile Met Ala Ala
100 105 110 Ala Leu Ile
Gly Ser Val Val Leu Ser Ala Gly Phe Met Ser Gly Cys 115
120 125 Ser Leu Ile Glu Ala Leu Thr Glu
Arg Tyr Ser Arg Lys Phe Gly Phe 130 135
140 Glu Tyr Gly Gln Ser Arg Ala Trp Gly Ser Phe Gly Tyr
Ala Ile Val 145 150 155
160 Ala Leu Ile Ala Gly Ile Val Phe Asn Ile Asn Pro Met Ile Asn Phe
165 170 175 Trp Leu Gly Ser
Ala Phe Gly Val Gly Met Leu Ile Val Tyr Leu Thr 180
185 190 Trp Tyr Pro Ala Glu Gln Arg Gln Ala
Leu Lys Glu Ala Ala Asp Pro 195 200
205 Asn Ala Glu Lys Ser Asn Pro Ser Phe Lys Asp Met Val Asn
Val Leu 210 215 220
Lys Met Pro Thr Leu Trp Val Leu Ile Ile Phe Met Leu Leu Thr Asn 225
230 235 240 Thr Phe Tyr Thr Val
Phe Asp Gln Gln Met Phe Pro Thr Tyr Tyr Ala 245
250 255 Ser Leu Phe Pro Ser Ile Glu Thr Gly Asn
Thr Val Tyr Gly Val Leu 260 265
270 Asn Ser Ile Gln Val Phe Cys Glu Ser Ala Met Met Gly Val Val
Pro 275 280 285 Ile
Ile Met Arg Lys Ile Gly Val Arg Asn Ala Leu Leu Leu Gly Ala 290
295 300 Thr Val Met Phe Leu Arg
Ile Gly Leu Cys Gly Ile Phe His Asp Pro 305 310
315 320 Val Ala Ile Ser Ile Val Lys Met Phe His Ala
Ile Glu Val Pro Leu 325 330
335 Phe Cys Leu Pro Ala Phe Arg Tyr Phe Thr Leu His Phe Asn Pro Lys
340 345 350 Leu Ser
Ala Thr Leu Tyr Met Val Gly Phe Gln Ile Ala Ser Gln Ile 355
360 365 Gly Gln Val Ile Phe Ser Thr
Pro Leu Gly Met Leu His Asp Arg Phe 370 375
380 Gly Asp Arg Thr Thr Phe Leu Ser Ile Ser Gly Ile
Val Leu Leu Ala 385 390 395
400 Thr Ile Tyr Gly Phe Phe Val Ile Lys Arg Asp Asp Glu His Val Asp
405 410 415 Gly Asp Pro
Phe Leu Arg Asp Arg Asp Arg Lys Glu Met Glu Leu Ile 420
425 430 Glu Glu Asn Leu Gln Pro Asp Ala
Glu Leu Glu Thr Ser Pro Val Gly 435 440
445 Val Ala Ala Gln Val Arg Asp Asn Arg Ala Val Gln Pro
Glu Tyr Ala 450 455 460
Ser 465
77 1338 DNA Bifidobacterium longum
77 atggcaagtg caaccaagtc
tgcatggaag aatccttcct atctgcagag ctctttcggc 60atcttcatgt tcttctgctc
ctggggcatc tggtggtcct tcttccagcg ctggctcatc 120tcaggcgttg gattgaccaa
tgctgaagtc ggcaccatct actccatcaa ctcgctggcc 180accctggtca tcatgtttgt
gtacggcgtg attcaggatc agctcggcat caagcgcaag 240ctcgtcatcg tagtctcggt
aatcgccgcc tgcgttggcc cattcgtcca attcgtttac 300gccccgatga tcctcgccgg
tggcaccacc cgctggatcg gcgcactcat cggctccatc 360gttctgtctg ccggcttcat
gtccggctgc tccctgttcg aggccgtcac cgaacgctac 420tcccgtaaat tcggtttcga
atatggccag tcccgtgctt ggggctcctt cggttacgcc 480atcgtggcgc tgtgcgccgg
cttcctgttc aacatcaacc cgctgatcaa cttctgggtc 540ggctccgcat tcggccctgg
catgctcctc gtgtacgcct tctgggtccc ggccgagcag 600aaggaagagc tcaagaagga
aaccgacccg aacgcagccc ccaccaaccc gtccctcaag 660gaaatggtcg ccgttctcaa
gatgccgacc ctgtgggtgc tcatcgtctt catgctgctg 720accaacacct tctacaccgt
gttcgatcag cagatgttcc cgacctacta cgccaacctc 780ttccccactg aagaaatcgg
caacgccacc tacggcaccc tgaacggttt ccaggtcttc 840cttgagtccg caatgatggg
cgtggtcccg atcatcatga agaagatcgg cgtgcgcaac 900gctctgctgc tcggcgctac
cgtgatgttc ctgcgcatcg gcttgtgcgg cgtgttccac 960gacccggtca ccatctccat
cgtcaagctg ttccactcca tcgaagtgcc gctgttctgc 1020ctgccggcat tccgctactt
cactctgcac ttcgacacca agctctctgc cacgctgtac 1080atggtgggct tccagatcgc
ttcccaagtg ggtcaggtca tcttctcgac ccctctgggt 1140gccttccacg acaagatggc
tcagattctg ccgaacaacg acatgggatc ccgcgtgacc 1200ttctgggtca tctctgccat
cgtgctgtgc gcactgattt acggcttctt cgtcatcaag 1260catgatgatc aggaagtcgg
cggcgacccg ttctacaccg acaagcagct tcgccagatg 1320gaagccgcca aggcctga
1338
78 445 PRT Bifidobacterium longum
78 Met Ala Ser Ala Thr Lys Ser Ala Trp Lys Asn Pro Ser Tyr Leu Gln
1 5 10 15 Ser Ser
Phe Gly Ile Phe Met Phe Phe Cys Ser Trp Gly Ile Trp Trp 20
25 30 Ser Phe Phe Gln Arg Trp Leu
Ile Ser Gly Val Gly Leu Thr Asn Ala 35 40
45 Glu Val Gly Thr Ile Tyr Ser Ile Asn Ser Leu Ala
Thr Leu Val Ile 50 55 60
Met Phe Val Tyr Gly Val Ile Gln Asp Gln Leu Gly Ile Lys Arg Lys 65
70 75 80 Leu Val Ile
Val Val Ser Val Ile Ala Ala Cys Val Gly Pro Phe Val 85
90 95 Gln Phe Val Tyr Ala Pro Met Ile
Leu Ala Gly Gly Thr Thr Arg Trp 100 105
110 Ile Gly Ala Leu Ile Gly Ser Ile Val Leu Ser Ala Gly
Phe Met Ser 115 120 125
Gly Cys Ser Leu Phe Glu Ala Val Thr Glu Arg Tyr Ser Arg Lys Phe 130
135 140 Gly Phe Glu Tyr
Gly Gln Ser Arg Ala Trp Gly Ser Phe Gly Tyr Ala 145 150
155 160 Ile Val Ala Leu Cys Ala Gly Phe Leu
Phe Asn Ile Asn Pro Leu Ile 165 170
175 Asn Phe Trp Val Gly Ser Ala Phe Gly Pro Gly Met Leu Leu
Val Tyr 180 185 190
Ala Phe Trp Val Pro Ala Glu Gln Lys Glu Glu Leu Lys Lys Glu Thr
195 200 205 Asp Pro Asn Ala
Ala Pro Thr Asn Pro Ser Leu Lys Glu Met Val Ala 210
215 220 Val Leu Lys Met Pro Thr Leu Trp
Val Leu Ile Val Phe Met Leu Leu 225 230
235 240 Thr Asn Thr Phe Tyr Thr Val Phe Asp Gln Gln Met
Phe Pro Thr Tyr 245 250
255 Tyr Ala Asn Leu Phe Pro Thr Glu Glu Ile Gly Asn Ala Thr Tyr Gly
260 265 270 Thr Leu Asn
Gly Phe Gln Val Phe Leu Glu Ser Ala Met Met Gly Val 275
280 285 Val Pro Ile Ile Met Lys Lys Ile
Gly Val Arg Asn Ala Leu Leu Leu 290 295
300 Gly Ala Thr Val Met Phe Leu Arg Ile Gly Leu Cys Gly
Val Phe His 305 310 315
320 Asp Pro Val Thr Ile Ser Ile Val Lys Leu Phe His Ser Ile Glu Val
325 330 335 Pro Leu Phe Cys
Leu Pro Ala Phe Arg Tyr Phe Thr Leu His Phe Asp 340
345 350 Thr Lys Leu Ser Ala Thr Leu Tyr Met
Val Gly Phe Gln Ile Ala Ser 355 360
365 Gln Val Gly Gln Val Ile Phe Ser Thr Pro Leu Gly Ala Phe
His Asp 370 375 380
Lys Met Ala Gln Ile Leu Pro Asn Asn Asp Met Gly Ser Arg Val Thr 385
390 395 400 Phe Trp Val Ile Ser
Ala Ile Val Leu Cys Ala Leu Ile Tyr Gly Phe 405
410 415 Phe Val Ile Lys His Asp Asp Gln Glu Val
Gly Gly Asp Pro Phe Tyr 420 425
430 Thr Asp Lys Gln Leu Arg Gln Met Glu Ala Ala Lys Ala
435 440 445
79 1422 DNA Bifidobacterium adolescentis
79 ctgaaatcag agcaggcgca agccaaaaca acatcggaag cgatcgctgc
cgcgcggcag 60cggcagcgcg aagagaaaaa gagaatcaaa atggcaagca aaacacgttc
tgtatggaag 120aatccttcct atctgcagag ctccttcggc attttcatgt tcttctgttc
ctggggcatc 180tggtggtcct tcttctcccg ctggctcact gacccgaccc acggtctggg
catgagctcc 240gcggaacagg gccagatcta ctccatcaac tccttggcca ccctggtcat
catgttcgtt 300tacggcacca ttcaggacca gctgggcatt aagcgtaagc tcgtgatctt
catctctgcg 360gtcgctgcat gcgttggccc gttcgtgcag ttcgtgtacc agccgatgct
gaccgccggc 420ggcaccaccc gattcatcgg cgtgcttctc ggctccatcg tgctgtccgc
aggcttcatg 480gccggctgct ccctgttcga agccatcacc gaacgttact cccgtaagtt
cggcttcgaa 540tacggccagt cccgcgcttg gggctccttc ggctacgctg tcgtggcact
gtgcgcaggc 600ttcctgttca acatcaaccc gctgctgaac ttctgggttg gttccatctg
cggcctcagc 660atgctgtgcg tctatgcttt ctgggttccg gccgagcaga aggaagaact
caagaaggaa 720gctgatccga acgcaactcc gaccaacccg tccttcaagg aaatggtctc
cgtcctgaag 780atgccgaccc tgtgggtgct catcgtcttc atgctgttca ccaacacctt
ctacaccgtg 840ttcgatcagc agatgttccc gaactactac gcctccctct tcccgaccac
cgaaatcggc 900aacgccacct acggcaccct gaactccttc caggtgttcc ttgagtccgc
catgatgggc 960gtcgtcccga tcatcatgaa gaagatcggc gtgcgtaact ccctgctgct
cggcgccacc 1020gtgatgttcg cccgtatcgg tctgtgcggc gtgttccatg acccggtctc
cgtctccatc 1080gtcaagctgt tccactccat cgaggtaccg ctgttctgcc tgccggcgtt
ccgctacttc 1140accctgcact tcgacacgaa gctgtctgcc accctgtaca tggttggttt
ccagatcgct 1200tcccaggtcg gccaggtgat tttctccacc ccgatgggtg ctctgcatga
tgccatgggc 1260gaccgtccga ccttcttcac catctctgcc atcgtgtttg cggctctggt
ctacggcttc 1320ttcgtcatca agaaggatga tcaggaagtc ggcggcgatc cgttctacac
tgacaagcag 1380ctcaaggcca tgaaggccgc tgatgcggaa gtgaaggcct ga
1422
80 473 PRT Bifidobacterium adolescentis
80 Met Lys Ser Glu Gln
Ala Gln Ala Lys Thr Thr Ser Glu Ala Ile Ala 1 5
10 15 Ala Ala Arg Gln Arg Gln Arg Glu Glu Lys
Lys Arg Ile Lys Met Ala 20 25
30 Ser Lys Thr Arg Ser Val Trp Lys Asn Pro Ser Tyr Leu Gln Ser
Ser 35 40 45 Phe
Gly Ile Phe Met Phe Phe Cys Ser Trp Gly Ile Trp Trp Ser Phe 50
55 60 Phe Ser Arg Trp Leu Thr
Asp Pro Thr His Gly Leu Gly Met Ser Ser 65 70
75 80 Ala Glu Gln Gly Gln Ile Tyr Ser Ile Asn Ser
Leu Ala Thr Leu Val 85 90
95 Ile Met Phe Val Tyr Gly Thr Ile Gln Asp Gln Leu Gly Ile Lys Arg
100 105 110 Lys Leu
Val Ile Phe Ile Ser Ala Val Ala Ala Cys Val Gly Pro Phe 115
120 125 Val Gln Phe Val Tyr Gln Pro
Met Leu Thr Ala Gly Gly Thr Thr Arg 130 135
140 Phe Ile Gly Val Leu Leu Gly Ser Ile Val Leu Ser
Ala Gly Phe Met 145 150 155
160 Ala Gly Cys Ser Leu Phe Glu Ala Ile Thr Glu Arg Tyr Ser Arg Lys
165 170 175 Phe Gly Phe
Glu Tyr Gly Gln Ser Arg Ala Trp Gly Ser Phe Gly Tyr 180
185 190 Ala Val Val Ala Leu Cys Ala Gly
Phe Leu Phe Asn Ile Asn Pro Leu 195 200
205 Leu Asn Phe Trp Val Gly Ser Ile Cys Gly Leu Ser Met
Leu Cys Val 210 215 220
Tyr Ala Phe Trp Val Pro Ala Glu Gln Lys Glu Glu Leu Lys Lys Glu 225
230 235 240 Ala Asp Pro Asn
Ala Thr Pro Thr Asn Pro Ser Phe Lys Glu Met Val 245
250 255 Ser Val Leu Lys Met Pro Thr Leu Trp
Val Leu Ile Val Phe Met Leu 260 265
270 Phe Thr Asn Thr Phe Tyr Thr Val Phe Asp Gln Gln Met Phe
Pro Asn 275 280 285
Tyr Tyr Ala Ser Leu Phe Pro Thr Thr Glu Ile Gly Asn Ala Thr Tyr 290
295 300 Gly Thr Leu Asn Ser
Phe Gln Val Phe Leu Glu Ser Ala Met Met Gly 305 310
315 320 Val Val Pro Ile Ile Met Lys Lys Ile Gly
Val Arg Asn Ser Leu Leu 325 330
335 Leu Gly Ala Thr Val Met Phe Ala Arg Ile Gly Leu Cys Gly Val
Phe 340 345 350 His
Asp Pro Val Ser Val Ser Ile Val Lys Leu Phe His Ser Ile Glu 355
360 365 Val Pro Leu Phe Cys Leu
Pro Ala Phe Arg Tyr Phe Thr Leu His Phe 370 375
380 Asp Thr Lys Leu Ser Ala Thr Leu Tyr Met Val
Gly Phe Gln Ile Ala 385 390 395
400 Ser Gln Val Gly Gln Val Ile Phe Ser Thr Pro Met Gly Ala Leu His
405 410 415 Asp Ala
Met Gly Asp Arg Pro Thr Phe Phe Thr Ile Ser Ala Ile Val 420
425 430 Phe Ala Ala Leu Val Tyr Gly
Phe Phe Val Ile Lys Lys Asp Asp Gln 435 440
445 Glu Val Gly Gly Asp Pro Phe Tyr Thr Asp Lys Gln
Leu Lys Ala Met 450 455 460
Lys Ala Ala Asp Ala Glu Val Lys Ala 465 470
81 1503 DNA Bifidobacterium longum
81 ttgcgtgtcg aacaggtacg acacgccgat
ttacgtcaat cgattgacgt aaatcgattg 60acgtcgcata ataattacta cataacaact
tctacaaagg cacggccgcc cgagcagaag 120cgttacataa ccaataacca accaagtagt
aatcaaagga tgattatggc aagtgcaacc 180aagtctgcat ggaagaatcc ttcctatctg
cagagctctt tcggcatctt catgttcttc 240tgctcctggg gcatctggtg gtccttcttc
cagcgctggc tcatctcagg cgttggattg 300accaatgctg aagtcggcac catctactcc
atcaactcgc tggccaccct ggtcatcatg 360tttgtgtacg gcgtgattca ggatcagctc
ggcatcaagc gcaagctcgt catcgtagtc 420tcggtaatcg ccgcctgcgt tggcccattc
gtccaattcg tttacgcccc gatgatcctc 480gccggtggca ccacccgctg gatcggcgca
ctcatcggct ccatcgttct gtctgccggc 540ttcatgtccg gctgctccct gttcgaggcc
gtcaccgaac gctactcccg taaattcggt 600ttcgaatatg gccagtcccg tgcttggggc
tccttcggtt acgccatcgt ggcgctgtgc 660gccggcttcc tgttcaacat caacccgctg
atcaacttct gggtcggctc cgcattcggc 720cctggcatgc tcctcgtgta cgccttctgg
gtcccggccg agcagaagga agagctcaag 780aaggaaaccg acccgaacgc agcccccacc
aacccgtccc tcaaggaaat ggtcgccgtt 840ctcaagatgc cgaccctgtg ggtgctcatc
gtcttcatgc tgctgaccaa caccttctac 900accgtgttcg atcagcagat gttcccgacc
tactacgcca acctcttccc cactgaagaa 960atcggcaacg ccacctacgg caccctgaac
ggtttccagg tcttccttga gtccgcaatg 1020atgggcgtgg tcccgatcat catgaagaag
atcggcgtgc gcaacgctct gctgctcggc 1080gctaccgtga tgttcctgcg catcggcttg
tgcggcgtgt tccacgaccc ggtcaccatc 1140tccatcgtca agctgttcca ctccatcgaa
gtgccgctgt tctgcctgcc ggcattccgc 1200tacttcactc tgcacttcga caccaagctc
tctgccacgc tgtacatggt gggcttccag 1260atcgcttccc aagtgggtca ggtcatcttc
tcgacccctc tgggtgcctt ccacgacaag 1320atggctcaga ttctgccgaa caacgacatg
ggatcccgcg tgaccttctg ggtcatctct 1380gccatcgtgc tgtgcgcact gatttacggc
ttcttcgtca tcaagcatga tgatcaggaa 1440gtcggcggcg acccgttcta caccgacaag
cagcttcgcc agatggaagc cgccaaggcc 1500tga
1503
SEQ ID NO:82. Length: 500. Type: PRT. Organism: Bifidobacterium longum.
Met
Arg Val Glu Gln Val Arg His Ala Asp Leu Arg Gln Ser Ile Asp 1
5 10 15 Val Asn Arg Leu Thr Ser
His Asn Asn Tyr Tyr Ile Thr Thr Ser Thr 20
25 30 Lys Ala Arg Pro Pro Glu Gln Lys Arg Tyr
Ile Thr Asn Asn Gln Pro 35 40
45 Ser Ser Asn Gln Arg Met Ile Met Ala Ser Ala Thr Lys Ser
Ala Trp 50 55 60
Lys Asn Pro Ser Tyr Leu Gln Ser Ser Phe Gly Ile Phe Met Phe Phe 65
70 75 80 Cys Ser Trp Gly Ile
Trp Trp Ser Phe Phe Gln Arg Trp Leu Ile Ser 85
90 95 Gly Val Gly Leu Thr Asn Ala Glu Val Gly
Thr Ile Tyr Ser Ile Asn 100 105
110 Ser Leu Ala Thr Leu Val Ile Met Phe Val Tyr Gly Val Ile Gln
Asp 115 120 125 Gln
Leu Gly Ile Lys Arg Lys Leu Val Ile Val Val Ser Val Ile Ala 130
135 140 Ala Cys Val Gly Pro Phe
Val Gln Phe Val Tyr Ala Pro Met Ile Leu 145 150
155 160 Ala Gly Gly Thr Thr Arg Trp Ile Gly Ala Leu
Ile Gly Ser Ile Val 165 170
175 Leu Ser Ala Gly Phe Met Ser Gly Cys Ser Leu Phe Glu Ala Val Thr
180 185 190 Glu Arg
Tyr Ser Arg Lys Phe Gly Phe Glu Tyr Gly Gln Ser Arg Ala 195
200 205 Trp Gly Ser Phe Gly Tyr Ala
Ile Val Ala Leu Cys Ala Gly Phe Leu 210 215
220 Phe Asn Ile Asn Pro Leu Ile Asn Phe Trp Val Gly
Ser Ala Phe Gly 225 230 235
240 Pro Gly Met Leu Leu Val Tyr Ala Phe Trp Val Pro Ala Glu Gln Lys
245 250 255 Glu Glu Leu
Lys Lys Glu Thr Asp Pro Asn Ala Ala Pro Thr Asn Pro 260
265 270 Ser Leu Lys Glu Met Val Ala Val
Leu Lys Met Pro Thr Leu Trp Val 275 280
285 Leu Ile Val Phe Met Leu Leu Thr Asn Thr Phe Tyr Thr
Val Phe Asp 290 295 300
Gln Gln Met Phe Pro Thr Tyr Tyr Ala Asn Leu Phe Pro Thr Glu Glu 305
310 315 320 Ile Gly Asn Ala
Thr Tyr Gly Thr Leu Asn Gly Phe Gln Val Phe Leu 325
330 335 Glu Ser Ala Met Met Gly Val Val Pro
Ile Ile Met Lys Lys Ile Gly 340 345
350 Val Arg Asn Ala Leu Leu Leu Gly Ala Thr Val Met Phe Leu
Arg Ile 355 360 365
Gly Leu Cys Gly Val Phe His Asp Pro Val Thr Ile Ser Ile Val Lys 370
375 380 Leu Phe His Ser Ile
Glu Val Pro Leu Phe Cys Leu Pro Ala Phe Arg 385 390
395 400 Tyr Phe Thr Leu His Phe Asp Thr Lys Leu
Ser Ala Thr Leu Tyr Met 405 410
415 Val Gly Phe Gln Ile Ala Ser Gln Val Gly Gln Val Ile Phe Ser
Thr 420 425 430 Pro
Leu Gly Ala Phe His Asp Lys Met Ala Gln Ile Leu Pro Asn Asn 435
440 445 Asp Met Gly Ser Arg Val
Thr Phe Trp Val Ile Ser Ala Ile Val Leu 450 455
460 Cys Ala Leu Ile Tyr Gly Phe Phe Val Ile Lys
His Asp Asp Gln Glu 465 470 475
480 Val Gly Gly Asp Pro Phe Tyr Thr Asp Lys Gln Leu Arg Gln Met Glu
485 490 495 Ala Ala
Lys Ala 500
SEQ ID NO:83. Length: 1266. Type: DNA. Organism: Mitsuokella multacida.
atgggaaatc
tcttgaaggc attttcgaat ccgttctaca ggacgagctc gcttgagatc 60ctgctgttct
tcgcgggctg gggcatctgg tggtcgttct ttcagatctg gctgacgacg 120aagcagggct
tcacgggcgc gcaggtcggc acgatttact ccttcggcag cgcggtcgcg 180ctcgtcctga
tgttcgtcta cggctccctg caggacaagc tcggcatgaa gaagacgatg 240ctgaagttct
tcgccgtctg ccagatcctc gtcggcccgt tcttcacctg ggtctacgtg 300ccgatgctcg
ccgcgaactt ctacgtcggc gctgtcgtcg gtgccgtcta cctcgcggtg 360gcgttcctcg
cggcctgccc tgtctttgag gcggtcacag agcgcctgag ccgccgctac 420tcctttgagt
acggccaggc cagagcctgg ggctcgttcg gctatgccgt ggcagcgctc 480tgcgcaggct
tcctcttcac gatgaacccg aacctgatct tctggacggg ctccgctgtg 540gcggcggtgc
agcttatcgt cttggtctcg atgacgccgg agaacgacgc ttcgcttacg 600gcgcagtacg
aggtcaaggc agagagcatc aaggagagca agacgccgtc gttcggcgag 660atcgtcggcg
tgttcaagct catcgaggtc tggaagatga tcgtcttcgt catcatgagc 720tggacgttct
acaccgtctt tgaccagcag atgttcccgg agttcttcac gcgcttcttc 780gcgacgccag
aagcaggcca gcaggcttac ggcgtgctca actccatcga agtcttcctc 840gaattcctca
tgatgggcct cgtgccgatc ctcatgcgcc gtatcggcgt tcgcaaggcc 900atcctgctcg
gctgcgccat catgatcgtc cgcatcggcg gctgcggcct cgtcacgaat 960cctcttggcg
tcgccgtcat caagctcttg cacgcaccgg aaacggcgct cttcatcctc 1020gctgtcttcc
gctacttcac gctgcacttt gacacgcgca tctcggcgac gctctacatg 1080gtcggtttcc
agatcgctgc acaggtcggc cagattatct tctcgacgcc gctcggcgcc 1140ctgcatgaca
gcatcggcta ccagagcact ttcctcgtca tctccggcat cgtctgtgtg 1200gccagcctct
acgctttcgt catcctcaag aaagacgacc agcaggtcga cggccagccg 1260ctttga
1266
SEQ ID NO:84. Length: 421. Type: PRT. Organism: Mitsuokella multacida.
Met Gly Asn Leu Leu Lys Ala Phe Ser
Asn Pro Phe Tyr Arg Thr Ser 1 5 10
15 Ser Leu Glu Ile Leu Leu Phe Phe Ala Gly Trp Gly Ile Trp
Trp Ser 20 25 30
Phe Phe Gln Ile Trp Leu Thr Thr Lys Gln Gly Phe Thr Gly Ala Gln
35 40 45 Val Gly Thr Ile
Tyr Ser Phe Gly Ser Ala Val Ala Leu Val Leu Met 50
55 60 Phe Val Tyr Gly Ser Leu Gln Asp
Lys Leu Gly Met Lys Lys Thr Met 65 70
75 80 Leu Lys Phe Phe Ala Val Cys Gln Ile Leu Val Gly
Pro Phe Phe Thr 85 90
95 Trp Val Tyr Val Pro Met Leu Ala Ala Asn Phe Tyr Val Gly Ala Val
100 105 110 Val Gly Ala
Val Tyr Leu Ala Val Ala Phe Leu Ala Ala Cys Pro Val 115
120 125 Phe Glu Ala Val Thr Glu Arg Leu
Ser Arg Arg Tyr Ser Phe Glu Tyr 130 135
140 Gly Gln Ala Arg Ala Trp Gly Ser Phe Gly Tyr Ala Val
Ala Ala Leu 145 150 155
160 Cys Ala Gly Phe Leu Phe Thr Met Asn Pro Asn Leu Ile Phe Trp Thr
165 170 175 Gly Ser Ala Val
Ala Ala Val Gln Leu Ile Val Leu Val Ser Met Thr 180
185 190 Pro Glu Asn Asp Ala Ser Leu Thr Ala
Gln Tyr Glu Val Lys Ala Glu 195 200
205 Ser Ile Lys Glu Ser Lys Thr Pro Ser Phe Gly Glu Ile Val
Gly Val 210 215 220
Phe Lys Leu Ile Glu Val Trp Lys Met Ile Val Phe Val Ile Met Ser 225
230 235 240 Trp Thr Phe Tyr Thr
Val Phe Asp Gln Gln Met Phe Pro Glu Phe Phe 245
250 255 Thr Arg Phe Phe Ala Thr Pro Glu Ala Gly
Gln Gln Ala Tyr Gly Val 260 265
270 Leu Asn Ser Ile Glu Val Phe Leu Glu Phe Leu Met Met Gly Leu
Val 275 280 285 Pro
Ile Leu Met Arg Arg Ile Gly Val Arg Lys Ala Ile Leu Leu Gly 290
295 300 Cys Ala Ile Met Ile Val
Arg Ile Gly Gly Cys Gly Leu Val Thr Asn 305 310
315 320 Pro Leu Gly Val Ala Val Ile Lys Leu Leu His
Ala Pro Glu Thr Ala 325 330
335 Leu Phe Ile Leu Ala Val Phe Arg Tyr Phe Thr Leu His Phe Asp Thr
340 345 350 Arg Ile
Ser Ala Thr Leu Tyr Met Val Gly Phe Gln Ile Ala Ala Gln 355
360 365 Val Gly Gln Ile Ile Phe Ser
Thr Pro Leu Gly Ala Leu His Asp Ser 370 375
380 Ile Gly Tyr Gln Ser Thr Phe Leu Val Ile Ser Gly
Ile Val Cys Val 385 390 395
400 Ala Ser Leu Tyr Ala Phe Val Ile Leu Lys Lys Asp Asp Gln Gln Val
405 410 415 Asp Gly Gln
Pro Leu 420
SEQ ID NO:85. Length: 1275. Type: DNA. Organism: Lactobacillus antri.
atgaaaaata
gcaagttatc agcgtttaaa aacagctttt acctggagag ttcgcttagt 60ctgctgctgt
tcttcgccgc gtggggaatc tggtggtcgt tcttccaaat ctggctcacc 120aatgacctcg
gcttctctgg ggccaaggtc gggatgatct atactttcga ttcggcaatt 180acgctggtct
taatgttcat ctacgggtca gtgcaagaca agctcggcat taaacgccgg 240ctgctgattg
gggttaccat cctggaaatg ctccttgggc ccttctttac ctggatttac 300gcgccactgc
tgcactctaa ctttatcctc ggcgccttct taggttccct ctacctctcc 360tttgcctttc
tggcggcgtc cccgaccttc gaggccctcg cagaacggat gagccggcgg 420tacagctttg
aatacggtcg ggcccgggcc tgggggtcat ttggttacgc cgtttcggca 480ttgtgtgccg
gctacctctt caccatcagt ccctacatcg tcttttggct cagcagcggg 540attagcttgc
taaccttcct cctgctctgc tttggccgga ctaagagccc cacacaggtt 600gcccgttacg
agaataaggc cgaggaagaa cacgacgcgg ataagccgag tttcaaagag 660atcatcagtg
ttttcaagct caagcagttg tgggaattgg ttttcttcat tattttcagc 720gggtcctttt
acacggtctt tgaccagcag atgtttcccc agttctttac ccaatttttc 780aagacggcgg
cccagggaaa cacggcctac ggaatcctca attcgattga agtcttcctc 840gaagcaatta
tgatggcgat tgttccctgg attatgaaga agatcggggt ccgcaagacc 900ctcttgattg
gggtcaccat tatgttcttg cggatcggcc tctgcggcct ggtcgtcagc 960ccggtcggga
tctcgattgt gaagctcttt cacgccccgg aaacggccat ctttgccctg 1020gcgatgttcc
gctatttgac cctccacttt gacacccggc tatcggcgac gatgtacatg 1080gtggttgggc
agattgccgg tcaaatcggc cagatcatcc tgtcgacgcc cctgggaatg 1140ctccacgacc
ggatcggcta ccgggcgacc ttcctggtta tttcgctgat tgtgatttgc 1200gctgcggtat
acgcattcgt cattttgcgc aaggataacc aggaggttga cggtcaacca 1260ctagaaaaca
actaa
1275
SEQ ID NO:86. Length: 424. Type: PRT. Organism: Lactobacillus antri.
Met Lys Asn Ser Lys Leu Ser Ala Phe Lys
Asn Ser Phe Tyr Leu Glu 1 5 10
15 Ser Ser Leu Ser Leu Leu Leu Phe Phe Ala Ala Trp Gly Ile Trp
Trp 20 25 30 Ser
Phe Phe Gln Ile Trp Leu Thr Asn Asp Leu Gly Phe Ser Gly Ala 35
40 45 Lys Val Gly Met Ile Tyr
Thr Phe Asp Ser Ala Ile Thr Leu Val Leu 50 55
60 Met Phe Ile Tyr Gly Ser Val Gln Asp Lys Leu
Gly Ile Lys Arg Arg 65 70 75
80 Leu Leu Ile Gly Val Thr Ile Leu Glu Met Leu Leu Gly Pro Phe Phe
85 90 95 Thr Trp
Ile Tyr Ala Pro Leu Leu His Ser Asn Phe Ile Leu Gly Ala 100
105 110 Phe Leu Gly Ser Leu Tyr Leu
Ser Phe Ala Phe Leu Ala Ala Ser Pro 115 120
125 Thr Phe Glu Ala Leu Ala Glu Arg Met Ser Arg Arg
Tyr Ser Phe Glu 130 135 140
Tyr Gly Arg Ala Arg Ala Trp Gly Ser Phe Gly Tyr Ala Val Ser Ala 145
150 155 160 Leu Cys Ala
Gly Tyr Leu Phe Thr Ile Ser Pro Tyr Ile Val Phe Trp 165
170 175 Leu Ser Ser Gly Ile Ser Leu Leu
Thr Phe Leu Leu Leu Cys Phe Gly 180 185
190 Arg Thr Lys Ser Pro Thr Gln Val Ala Arg Tyr Glu Asn
Lys Ala Glu 195 200 205
Glu Glu His Asp Ala Asp Lys Pro Ser Phe Lys Glu Ile Ile Ser Val 210
215 220 Phe Lys Leu Lys
Gln Leu Trp Glu Leu Val Phe Phe Ile Ile Phe Ser 225 230
235 240 Gly Ser Phe Tyr Thr Val Phe Asp Gln
Gln Met Phe Pro Gln Phe Phe 245 250
255 Thr Gln Phe Phe Lys Thr Ala Ala Gln Gly Asn Thr Ala Tyr
Gly Ile 260 265 270
Leu Asn Ser Ile Glu Val Phe Leu Glu Ala Ile Met Met Ala Ile Val
275 280 285 Pro Trp Ile Met
Lys Lys Ile Gly Val Arg Lys Thr Leu Leu Ile Gly 290
295 300 Val Thr Ile Met Phe Leu Arg Ile
Gly Leu Cys Gly Leu Val Val Ser 305 310
315 320 Pro Val Gly Ile Ser Ile Val Lys Leu Phe His Ala
Pro Glu Thr Ala 325 330
335 Ile Phe Ala Leu Ala Met Phe Arg Tyr Leu Thr Leu His Phe Asp Thr
340 345 350 Arg Leu Ser
Ala Thr Met Tyr Met Val Val Gly Gln Ile Ala Gly Gln 355
360 365 Ile Gly Gln Ile Ile Leu Ser Thr
Pro Leu Gly Met Leu His Asp Arg 370 375
380 Ile Gly Tyr Arg Ala Thr Phe Leu Val Ile Ser Leu Ile
Val Ile Cys 385 390 395
400 Ala Ala Val Tyr Ala Phe Val Ile Leu Arg Lys Asp Asn Gln Glu Val
405 410 415 Asp Gly Gln Pro
Leu Glu Asn Asn 420
SEQ ID NO:87. Length: 1323. Type: DNA. Organism: Lactobacillus ruminis.
atgatgccga tttctgacaa ttggaaagga attttattta tgaacgatat
gaataaaagc 60ggacggatgt cacaactgaa gaatccgttc tttacaagca atgcgacaaa
tattctcatg 120ttctttgctg gctggggcat ctggtggtca ttcttccaga tctggctgac
aaccaagcag 180gggttcaccg gagcccaggt tggcgagata tactccttca actcggcgtt
ctcactgatt 240gccaaccttg tttacagcaa cattcaggac aggctcggcc tcaaacgcaa
ccttttgatc 300ttctgcgcct gcctgcaggt gttcctcggg cccttcttca cgttcctctt
cgtgccgatg 360cttcatgcca accttgaact cggcgctctg atcggttcat gctacctgac
gcttgcctat 420ctttccgcat ccccgatgtt cgaggcactg acggaacgtg caagccgccg
cttcaactat 480cagtatgggt cagcgcgtgc ctggggctcg ttcggatatg ccgtatccgc
cttgcttgca 540ggattcgtct tcacaatcaa tccgtcgctg ctgttctgga tcggctctgc
catcgctgtt 600gtccttcttc tcctgctttt gttctggaac cctgtccgca acaaggagac
ggttgccaga 660tttgaaaatg aaatggtcag ggaacgtgag aactccaagc ctgggtcaag
ggacttcctc 720aatgtcttca aggttcgcag cctttgggaa atcgccattt tccttgtctt
cagcggtaca 780ttctacacga ttttcgatca gcagatgttt cctcagttct tcactcagtt
cttcaagacc 840caggcaatgg gcgatcacat gtatgggatc ctgaactcgg ttgaggtgtt
cctcgaagca 900ctcatgatgg gcctggttcc gcttctcatg aagaagatcg gcgtccgccg
cacgattctt 960gtcggcgtga cgttcatgtt catcagaatc ggtggctgcg gtctgattac
gaaccctctt 1020ggcgtttcaa tgatcaagct tctccatgcg cctgaaacgg ccattttctg
cgtcgtaatg 1080ttccgttact acactctgca ctacgatccg cgagtatcag ccacgatcaa
tatcgtaacg 1140ggcattgcgg gttcgttcgg ccagatactt ctctcaacgc cgcttggact
tctgcgtgac 1200cacatcggct atcagccgac cttcctggta atcgccggca tcgtattctg
cgccggcatc 1260tacggcttat tcatcattcg aagggatgat caggaagtaa acggagagag
gctgtctgaa 1320taa
1323
SEQ ID NO:88. Length: 440. Type: PRT. Organism: Lactobacillus ruminis.
Met Met Pro Ile Ser Asp
Asn Trp Lys Gly Ile Leu Phe Met Asn Asp 1 5
10 15 Met Asn Lys Ser Gly Arg Met Ser Gln Leu Lys
Asn Pro Phe Phe Thr 20 25
30 Ser Asn Ala Thr Asn Ile Leu Met Phe Phe Ala Gly Trp Gly Ile
Trp 35 40 45 Trp
Ser Phe Phe Gln Ile Trp Leu Thr Thr Lys Gln Gly Phe Thr Gly 50
55 60 Ala Gln Val Gly Glu Ile
Tyr Ser Phe Asn Ser Ala Phe Ser Leu Ile 65 70
75 80 Ala Asn Leu Val Tyr Ser Asn Ile Gln Asp Arg
Leu Gly Leu Lys Arg 85 90
95 Asn Leu Leu Ile Phe Cys Ala Cys Leu Gln Val Phe Leu Gly Pro Phe
100 105 110 Phe Thr
Phe Leu Phe Val Pro Met Leu His Ala Asn Leu Glu Leu Gly 115
120 125 Ala Leu Ile Gly Ser Cys Tyr
Leu Thr Leu Ala Tyr Leu Ser Ala Ser 130 135
140 Pro Met Phe Glu Ala Leu Thr Glu Arg Ala Ser Arg
Arg Phe Asn Tyr 145 150 155
160 Gln Tyr Gly Ser Ala Arg Ala Trp Gly Ser Phe Gly Tyr Ala Val Ser
165 170 175 Ala Leu Leu
Ala Gly Phe Val Phe Thr Ile Asn Pro Ser Leu Leu Phe 180
185 190 Trp Ile Gly Ser Ala Ile Ala Val
Val Leu Leu Leu Leu Leu Leu Phe 195 200
205 Trp Asn Pro Val Arg Asn Lys Glu Thr Val Ala Arg Phe
Glu Asn Glu 210 215 220
Met Val Arg Glu Arg Glu Asn Ser Lys Pro Gly Ser Arg Asp Phe Leu 225
230 235 240 Asn Val Phe Lys
Val Arg Ser Leu Trp Glu Ile Ala Ile Phe Leu Val 245
250 255 Phe Ser Gly Thr Phe Tyr Thr Ile Phe
Asp Gln Gln Met Phe Pro Gln 260 265
270 Phe Phe Thr Gln Phe Phe Lys Thr Gln Ala Met Gly Asp His
Met Tyr 275 280 285
Gly Ile Leu Asn Ser Val Glu Val Phe Leu Glu Ala Leu Met Met Gly 290
295 300 Leu Val Pro Leu Leu
Met Lys Lys Ile Gly Val Arg Arg Thr Ile Leu 305 310
315 320 Val Gly Val Thr Phe Met Phe Ile Arg Ile
Gly Gly Cys Gly Leu Ile 325 330
335 Thr Asn Pro Leu Gly Val Ser Met Ile Lys Leu Leu His Ala Pro
Glu 340 345 350 Thr
Ala Ile Phe Cys Val Val Met Phe Arg Tyr Tyr Thr Leu His Tyr 355
360 365 Asp Pro Arg Val Ser Ala
Thr Ile Asn Ile Val Thr Gly Ile Ala Gly 370 375
380 Ser Phe Gly Gln Ile Leu Leu Ser Thr Pro Leu
Gly Leu Leu Arg Asp 385 390 395
400 His Ile Gly Tyr Gln Pro Thr Phe Leu Val Ile Ala Gly Ile Val Phe
405 410 415 Cys Ala
Gly Ile Tyr Gly Leu Phe Ile Ile Arg Arg Asp Asp Gln Glu 420
425 430 Val Asn Gly Glu Arg Leu Ser
Glu 435 440
SEQ ID NO:89. Length: 1242. Type: DNA. Organism: Yersinia frederiksenii.
atgaaacatt ctgtccgtaa tcaatatctg atcttaagtg gcttattgtt tacgtttttc
60tttacttggt catcggcatt ctctttattc tccatatggc tcaatcaata tgtaggatta
120aaaggtaccg aaacaggggc gactttttcc gccattgcct taacggcact ttgcgctcaa
180ccgctttatg gcgtgataca agataagttg gggctaaaaa aacatctttt atgggccatt
240ggtattttgc tgctgatcag tggccctttt tttatttatg tttatgcccc tttattgcgt
300gtcaacatgc tggttggtgc cgttaccggt ggcttatata tggggatgac gttctttgcc
360ggtattggtg cgcttgagtc ttataccgaa cgagtgagcc gtattagtgg gtttgagttt
420ggtaaagccc gtatgtgggg atcgctgggg tgggcgggtg caaccttttt tgctggcatg
480ttgtttaata ttaatcccaa cattaatttc tggatggcat cggcatcggc cgcgatattt
540ttactgttgt tgtggcactt acatgaagtt aaaacagcgg ctatggggca gttggaatac
600ggtaagaata gtgccctgac actgagtgat acattgtcac tgtttcgtat gccgcgtttc
660tgggcgctgg tggtatttgt caccggtgtg agcgtttata acgtctatga ccagcaattt
720ccggtctatt tctcctctct atttactgac cgacgccacg gcaatgaaat gtacggcttt
780cttaattcac tacaggtatt cctagaggct ggtggtatgt tcctcgcgcc ttttctggtt
840aaccgtattg gcgcgaaaaa gggcttactg ctgagcggat taatcatggc aatgcgcata
900ttgggttcag ggttggcaca agatgcagtc accatctcat tgatgaagtt attacatgca
960gtggagttgc ctattttgct cattgcgatg tttaagtata tcgccgccaa tttcgacccg
1020cgtttgtcag ccacgcttta tctggtggga tttcagttta ttacccaagt ctatgccagc
1080gtattttcgc cgttggcagg taaaggctat gacctgatcg ggttcgctga tacctatctg
1140atcatgggag gcattgtcct cggattaaca gcaatttctt gttttatgct gcgcggcgag
1200tcgcgtacgg atgatccttc cctacaatta accactaagt ga
1242
SEQ ID NO:90. Length: 413. Type: PRT. Organism: Yersinia frederiksenii.
Met Lys His Ser Val Arg Asn Gln Tyr
Leu Ile Leu Ser Gly Leu Leu 1 5 10
15 Phe Thr Phe Phe Phe Thr Trp Ser Ser Ala Phe Ser Leu Phe
Ser Ile 20 25 30
Trp Leu Asn Gln Tyr Val Gly Leu Lys Gly Thr Glu Thr Gly Ala Thr
35 40 45 Phe Ser Ala Ile
Ala Leu Thr Ala Leu Cys Ala Gln Pro Leu Tyr Gly 50
55 60 Val Ile Gln Asp Lys Leu Gly Leu
Lys Lys His Leu Leu Trp Ala Ile 65 70
75 80 Gly Ile Leu Leu Leu Ile Ser Gly Pro Phe Phe Ile
Tyr Val Tyr Ala 85 90
95 Pro Leu Leu Arg Val Asn Met Leu Val Gly Ala Val Thr Gly Gly Leu
100 105 110 Tyr Met Gly
Met Thr Phe Phe Ala Gly Ile Gly Ala Leu Glu Ser Tyr 115
120 125 Thr Glu Arg Val Ser Arg Ile Ser
Gly Phe Glu Phe Gly Lys Ala Arg 130 135
140 Met Trp Gly Ser Leu Gly Trp Ala Gly Ala Thr Phe Phe
Ala Gly Met 145 150 155
160 Leu Phe Asn Ile Asn Pro Asn Ile Asn Phe Trp Met Ala Ser Ala Ser
165 170 175 Ala Ala Ile Phe
Leu Leu Leu Leu Trp His Leu His Glu Val Lys Thr 180
185 190 Ala Ala Met Gly Gln Leu Glu Tyr Gly
Lys Asn Ser Ala Leu Thr Leu 195 200
205 Ser Asp Thr Leu Ser Leu Phe Arg Met Pro Arg Phe Trp Ala
Leu Val 210 215 220
Val Phe Val Thr Gly Val Ser Val Tyr Asn Val Tyr Asp Gln Gln Phe 225
230 235 240 Pro Val Tyr Phe Ser
Ser Leu Phe Thr Asp Arg Arg His Gly Asn Glu 245
250 255 Met Tyr Gly Phe Leu Asn Ser Leu Gln Val
Phe Leu Glu Ala Gly Gly 260 265
270 Met Phe Leu Ala Pro Phe Leu Val Asn Arg Ile Gly Ala Lys Lys
Gly 275 280 285 Leu
Leu Leu Ser Gly Leu Ile Met Ala Met Arg Ile Leu Gly Ser Gly 290
295 300 Leu Ala Gln Asp Ala Val
Thr Ile Ser Leu Met Lys Leu Leu His Ala 305 310
315 320 Val Glu Leu Pro Ile Leu Leu Ile Ala Met Phe
Lys Tyr Ile Ala Ala 325 330
335 Asn Phe Asp Pro Arg Leu Ser Ala Thr Leu Tyr Leu Val Gly Phe Gln
340 345 350 Phe Ile
Thr Gln Val Tyr Ala Ser Val Phe Ser Pro Leu Ala Gly Lys 355
360 365 Gly Tyr Asp Leu Ile Gly Phe
Ala Asp Thr Tyr Leu Ile Met Gly Gly 370 375
380 Ile Val Leu Gly Leu Thr Ala Ile Ser Cys Phe Met
Leu Arg Gly Glu 385 390 395
400 Ser Arg Thr Asp Asp Pro Ser Leu Gln Leu Thr Thr Lys
405 410
SEQ ID NO:91. Length: 1260. Type: DNA. Organism: Serratia proteamaculans.
atgaaccgcg aaacaaaaaa atattatgtg cttctcagcg gcctgttgtt tttcttcttc
60tttacctggt catccagctt ttcactgatc tccatctggc tgaaccagaa aatcggcctg
120aaagggactg aaaccgggct gatcttcgcg gcaatgtcga tcatggcgtt gtgcgcccaa
180ccgctgtacg gctttattca ggacaaactt gggctgcgta agcacctgct gctgtttgtc
240ggcgtgctgc tgttgctcac cggcccgttc tttatctatg tctacgcccc gctgctgcag
300agcaaccttg tggtcggcgc actggtgggc ggcgtgtttg tcagcctggc gttcaatgcc
360ggtattggcg cgctggaatc ctataccgaa cgagtcagcc gcatcgtcgg tttcgaattc
420ggccgggcgc gtatgtgggg gtcattgggc tgggccagcg ccaccttctt tgccggcttt
480aactacaata tcgaccccaa tatcaacttc tggatcgctt cggcctcggc ggcagtgttt
540ctgctgttgc tgtggcaagt gcgtgagctg aaacccaacg ccatggccgg tctggaatac
600ggcaagccgg aaaacctgaa gctgcaggac gcattggccc tgctgcgcct gccggggttc
660tgggcgctgg tggtgtttgt gctgggcacc agcatctacg gcgtgtttga ccagcagttc
720ccggtgtatt tcgcctcgca gttccccacc cacgaagaag gcaaccgcat gtacggtttc
780cttaattcgc tgcaggtgtt tctggaggcc ggtggcatgt tcctggcccc gctgctggtt
840aaccgcattg gcataaagca aagcctgttg ctggccagca gcgtgatggc gctgcgcatg
900gtcggttccg gctttgccag cggcgccctg atgatttccg ccatgaaact gctgcacgcc
960gtagaattgc caatcctgct ggtggcgatg ttcaagtaca tcaccacccg tttcgacagc
1020cgcctgtcct ccacgctgta cctggtgggc ttccagttta tcagccaaat tgtcgccggt
1080tttctggcac cgctggccgg ttatggttac gaccgcatcg gctttgccga cacctatttg
1140ctgatgggtt gcgcggtggc cgggaccacg ctgatttcct gcttcctgct gcgcggcgag
1200accgtcgcca gtgcgcctca atttcaatcc acgttaaaat caagtgagcc aacccaatga
1260
SEQ ID NO:92. Length: 419. Type: PRT. Organism: Serratia proteamaculans.
Met Asn Arg Glu Thr Lys Lys Tyr Tyr
Val Leu Leu Ser Gly Leu Leu 1 5 10
15 Phe Phe Phe Phe Phe Thr Trp Ser Ser Ser Phe Ser Leu Ile
Ser Ile 20 25 30
Trp Leu Asn Gln Lys Ile Gly Leu Lys Gly Thr Glu Thr Gly Leu Ile
35 40 45 Phe Ala Ala Met
Ser Ile Met Ala Leu Cys Ala Gln Pro Leu Tyr Gly 50
55 60 Phe Ile Gln Asp Lys Leu Gly Leu
Arg Lys His Leu Leu Leu Phe Val 65 70
75 80 Gly Val Leu Leu Leu Leu Thr Gly Pro Phe Phe Ile
Tyr Val Tyr Ala 85 90
95 Pro Leu Leu Gln Ser Asn Leu Val Val Gly Ala Leu Val Gly Gly Val
100 105 110 Phe Val Ser
Leu Ala Phe Asn Ala Gly Ile Gly Ala Leu Glu Ser Tyr 115
120 125 Thr Glu Arg Val Ser Arg Ile Val
Gly Phe Glu Phe Gly Arg Ala Arg 130 135
140 Met Trp Gly Ser Leu Gly Trp Ala Ser Ala Thr Phe Phe
Ala Gly Phe 145 150 155
160 Asn Tyr Asn Ile Asp Pro Asn Ile Asn Phe Trp Ile Ala Ser Ala Ser
165 170 175 Ala Ala Val Phe
Leu Leu Leu Leu Trp Gln Val Arg Glu Leu Lys Pro 180
185 190 Asn Ala Met Ala Gly Leu Glu Tyr Gly
Lys Pro Glu Asn Leu Lys Leu 195 200
205 Gln Asp Ala Leu Ala Leu Leu Arg Leu Pro Gly Phe Trp Ala
Leu Val 210 215 220
Val Phe Val Leu Gly Thr Ser Ile Tyr Gly Val Phe Asp Gln Gln Phe 225
230 235 240 Pro Val Tyr Phe Ala
Ser Gln Phe Pro Thr His Glu Glu Gly Asn Arg 245
250 255 Met Tyr Gly Phe Leu Asn Ser Leu Gln Val
Phe Leu Glu Ala Gly Gly 260 265
270 Met Phe Leu Ala Pro Leu Leu Val Asn Arg Ile Gly Ile Lys Gln
Ser 275 280 285 Leu
Leu Leu Ala Ser Ser Val Met Ala Leu Arg Met Val Gly Ser Gly 290
295 300 Phe Ala Ser Gly Ala Leu
Met Ile Ser Ala Met Lys Leu Leu His Ala 305 310
315 320 Val Glu Leu Pro Ile Leu Leu Val Ala Met Phe
Lys Tyr Ile Thr Thr 325 330
335 Arg Phe Asp Ser Arg Leu Ser Ser Thr Leu Tyr Leu Val Gly Phe Gln
340 345 350 Phe Ile
Ser Gln Ile Val Ala Gly Phe Leu Ala Pro Leu Ala Gly Tyr 355
360 365 Gly Tyr Asp Arg Ile Gly Phe
Ala Asp Thr Tyr Leu Leu Met Gly Cys 370 375
380 Ala Val Ala Gly Thr Thr Leu Ile Ser Cys Phe Leu
Leu Arg Gly Glu 385 390 395
400 Thr Val Ala Ser Ala Pro Gln Phe Gln Ser Thr Leu Lys Ser Ser Glu
405 410 415 Pro Thr Gln
SEQ ID NO:93. Length: 1239. Type: DNA. Organism: Escherichia coli.
atgaaaaaac ggcctactcg aagttacatg ctgctcagcg
ctctgctgtt ctttttcttt 60gtgacctggt cctcatcaag ttcactgctc tcaatctggc
ttcaccagga agtggggcta 120aaagcatcgg aaaccggcat tattttttca gtattatccg
tctccgcgct cttcgcgcag 180gtctgttatg gctttattca ggaccgactt ggtctgcgca
aacatttgtt atggtttatc 240accgcgttgt tgatcctctc cggcccggct tatctgcttt
ttagttattt gctgagcgtt 300aatattctgc tgggcagcgt attcgggggc ttatttatcg
ggctgacgtt taatgggggt 360atcggcgttc tggagtccta taccgagcgc gtcgcgcgtc
aaagtacctt tgagtttggg 420cgggcacgca tgtgggggtc tctgggctgg gcagttgcca
cgttttttgc cgggttactg 480tttaatatca accctgacct taacttcctg gtggcttcat
gctcagggtt aatcttcttc 540tgcctcctgg cccgattaaa ggtggccgcg ccggcaagca
tggagaaact cgaaattggc 600gctaaaaaag tttctctgga agacgccctg cgtctgctta
ctctgccgcg cttctgggca 660ctgatattct tcgtggtcgg aacctgcatt tacggcgtat
acgatcagca attcccggtc 720tatttctcat cacagttccc gacattacgc gaagggaacg
agatgtttgg ctatttaaac 780tctttccagg tctttctcga ggccgcaggt atgttttgtg
cgccgtggct ggttaatcgc 840attggtgcta aaaatggtct gatattcgca ggaatggtga
tggcgctgcg catgattact 900tcagggctgg tggaaggccc cctgcttatc tccattacca
aactgcttca cgcggtcgaa 960ctgccaatat tgttagtcgc catatttaaa tacaacagtc
tgaatttcga caaacgtctc 1020tcctccacca tttatctggt gggatttgcc tgcaccagct
ccgtcattgg taccgtattg 1080tccccgctgg caggctttag ctatgagaga tttggcttcg
cccaatccta tctgatcatg 1140ggcatcatgg tgttcagcac cacgtttatt tccattttcc
ttttgcgctc aactaaatcc 1200tcatctgagc catcttttct gcagcaaaaa gctgtgtaa
1239
SEQ ID NO:94. Length: 412. Type: PRT. Organism: Escherichia coli.
Met Lys Lys Arg Pro
Thr Arg Ser Tyr Met Leu Leu Ser Ala Leu Leu 1 5
10 15 Phe Phe Phe Phe Val Thr Trp Ser Ser Ser
Ser Ser Leu Leu Ser Ile 20 25
30 Trp Leu His Gln Glu Val Gly Leu Lys Ala Ser Glu Thr Gly Ile
Ile 35 40 45 Phe
Ser Val Leu Ser Val Ser Ala Leu Phe Ala Gln Val Cys Tyr Gly 50
55 60 Phe Ile Gln Asp Arg Leu
Gly Leu Arg Lys His Leu Leu Trp Phe Ile 65 70
75 80 Thr Ala Leu Leu Ile Leu Ser Gly Pro Ala Tyr
Leu Leu Phe Ser Tyr 85 90
95 Leu Leu Ser Val Asn Ile Leu Leu Gly Ser Val Phe Gly Gly Leu Phe
100 105 110 Ile Gly
Leu Thr Phe Asn Gly Gly Ile Gly Val Leu Glu Ser Tyr Thr 115
120 125 Glu Arg Val Ala Arg Gln Ser
Thr Phe Glu Phe Gly Arg Ala Arg Met 130 135
140 Trp Gly Ser Leu Gly Trp Ala Val Ala Thr Phe Phe
Ala Gly Leu Leu 145 150 155
160 Phe Asn Ile Asn Pro Asp Leu Asn Phe Leu Val Ala Ser Cys Ser Gly
165 170 175 Leu Ile Phe
Phe Cys Leu Leu Ala Arg Leu Lys Val Ala Ala Pro Ala 180
185 190 Ser Met Glu Lys Leu Glu Ile Gly
Ala Lys Lys Val Ser Leu Glu Asp 195 200
205 Ala Leu Arg Leu Leu Thr Leu Pro Arg Phe Trp Ala Leu
Ile Phe Phe 210 215 220
Val Val Gly Thr Cys Ile Tyr Gly Val Tyr Asp Gln Gln Phe Pro Val 225
230 235 240 Tyr Phe Ser Ser
Gln Phe Pro Thr Leu Arg Glu Gly Asn Glu Met Phe 245
250 255 Gly Tyr Leu Asn Ser Phe Gln Val Phe
Leu Glu Ala Ala Gly Met Phe 260 265
270 Cys Ala Pro Trp Leu Val Asn Arg Ile Gly Ala Lys Asn Gly
Leu Ile 275 280 285
Phe Ala Gly Met Val Met Ala Leu Arg Met Ile Thr Ser Gly Leu Val 290
295 300 Glu Gly Pro Leu Leu
Ile Ser Ile Thr Lys Leu Leu His Ala Val Glu 305 310
315 320 Leu Pro Ile Leu Leu Val Ala Ile Phe Lys
Tyr Asn Ser Leu Asn Phe 325 330
335 Asp Lys Arg Leu Ser Ser Thr Ile Tyr Leu Val Gly Phe Ala Cys
Thr 340 345 350 Ser
Ser Val Ile Gly Thr Val Leu Ser Pro Leu Ala Gly Phe Ser Tyr 355
360 365 Glu Arg Phe Gly Phe Ala
Gln Ser Tyr Leu Ile Met Gly Ile Met Val 370 375
380 Phe Ser Thr Thr Phe Ile Ser Ile Phe Leu Leu
Arg Ser Thr Lys Ser 385 390 395
400 Ser Ser Glu Pro Ser Phe Leu Gln Gln Lys Ala Val
405 410
SEQ ID NO:95. Length: 1236. Type: DNA. Organism: Bacillus licheniformis.
atgaaaagct caaacagtct gtattggaaa ctaagcgcct attttttctt tttctttttt
60acttggtctt ccagctattc tttatttgcg atttggttag ggcaagaaat caatttgaac
120gggtccgcga cgggcattat cttttctgta aacgctatct ttactttgtg catgcagcct
180ttgtacggtt ttatctccga taagctcggg ctgaagaaaa acatattatt tatgatcagt
240ttgctgctcg tatttacggg tcccttttat attttcgtct acggaccgct tttgcaatac
300aacgtctttc ttggggctat cgtcggggga atttatttgg gaactgcttt tctcgccgga
360atcggtgcga ttgaaacctt tattgaaaag gtcagccgca aatatcaatt tgaatatgga
420agaacaagga tgtgggggtc cctcggctgg gctgcggcga cattttttgc aggtcagctg
480ttcaatatcg atccgaatat caacttctgg gttgcgtccg cctcagcaat catattggtg
540gccattattg tttccgtaaa aattgagatg acagatgatg aaaaggaaag agcagactcg
600gtcggattaa aagacgtagg agggcttttt ctcttaaaag atttctggtt tttgatgctg
660tacgtgatcg gcgtaacatg cgtgtacggc gtctatgacc agcagttccc gctttactac
720gcttccttat ttccgactgc ggccttgggg aaccaaatat ttggatacct taattcattc
780caagtattta ttgaagcggg catgatgttt cttgcgcctt tcatcgtcaa taagctcggt
840cctaaaaaaa gcttgatttt agcggggctg ttaatggctt tccggattat cggttccgga
900cttgtcagcg gaccggtcgg aatttcatcg atgaaactca ttcatgcttt agaattgccg
960attatgctga ttgcgatgtt taaatatttg gcgactaatt ttgataatcg tctttcatcc
1020gtactttatc ttgtcggctt tcaattcgca tcccaggtag gcacgtcgat tttttcgccg
1080cttgcgggag gtttatacga cagcatcgga tttcgccaca cttatctcat catgggagca
1140atggtccttt gttttaccat tatttcgatt tttaccttgc tagactcaaa gaaagatgtt
1200gaatttgctc aaaatctaca aagcaatcat atatag
1236
SEQ ID NO:96. Length: 411. Type: PRT. Organism: Bacillus licheniformis.
Met Lys Ser Ser Asn Ser Leu Tyr Trp
Lys Leu Ser Ala Tyr Phe Phe 1 5 10
15 Phe Phe Phe Phe Thr Trp Ser Ser Ser Tyr Ser Leu Phe Ala
Ile Trp 20 25 30
Leu Gly Gln Glu Ile Asn Leu Asn Gly Ser Ala Thr Gly Ile Ile Phe
35 40 45 Ser Val Asn Ala
Ile Phe Thr Leu Cys Met Gln Pro Leu Tyr Gly Phe 50
55 60 Ile Ser Asp Lys Leu Gly Leu Lys
Lys Asn Ile Leu Phe Met Ile Ser 65 70
75 80 Leu Leu Leu Val Phe Thr Gly Pro Phe Tyr Ile Phe
Val Tyr Gly Pro 85 90
95 Leu Leu Gln Tyr Asn Val Phe Leu Gly Ala Ile Val Gly Gly Ile Tyr
100 105 110 Leu Gly Thr
Ala Phe Leu Ala Gly Ile Gly Ala Ile Glu Thr Phe Ile 115
120 125 Glu Lys Val Ser Arg Lys Tyr Gln
Phe Glu Tyr Gly Arg Thr Arg Met 130 135
140 Trp Gly Ser Leu Gly Trp Ala Ala Ala Thr Phe Phe Ala
Gly Gln Leu 145 150 155
160 Phe Asn Ile Asp Pro Asn Ile Asn Phe Trp Val Ala Ser Ala Ser Ala
165 170 175 Ile Ile Leu Val
Ala Ile Ile Val Ser Val Lys Ile Glu Met Thr Asp 180
185 190 Asp Glu Lys Glu Arg Ala Asp Ser Val
Gly Leu Lys Asp Val Gly Gly 195 200
205 Leu Phe Leu Leu Lys Asp Phe Trp Phe Leu Met Leu Tyr Val
Ile Gly 210 215 220
Val Thr Cys Val Tyr Gly Val Tyr Asp Gln Gln Phe Pro Leu Tyr Tyr 225
230 235 240 Ala Ser Leu Phe Pro
Thr Ala Ala Leu Gly Asn Gln Ile Phe Gly Tyr 245
250 255 Leu Asn Ser Phe Gln Val Phe Ile Glu Ala
Gly Met Met Phe Leu Ala 260 265
270 Pro Phe Ile Val Asn Lys Leu Gly Pro Lys Lys Ser Leu Ile Leu
Ala 275 280 285 Gly
Leu Leu Met Ala Phe Arg Ile Ile Gly Ser Gly Leu Val Ser Gly 290
295 300 Pro Val Gly Ile Ser Ser
Met Lys Leu Ile His Ala Leu Glu Leu Pro 305 310
315 320 Ile Met Leu Ile Ala Met Phe Lys Tyr Leu Ala
Thr Asn Phe Asp Asn 325 330
335 Arg Leu Ser Ser Val Leu Tyr Leu Val Gly Phe Gln Phe Ala Ser Gln
340 345 350 Val Gly
Thr Ser Ile Phe Ser Pro Leu Ala Gly Gly Leu Tyr Asp Ser 355
360 365 Ile Gly Phe Arg His Thr Tyr
Leu Ile Met Gly Ala Met Val Leu Cys 370 375
380 Phe Thr Ile Ile Ser Ile Phe Thr Leu Leu Asp Ser
Lys Lys Asp Val 385 390 395
400 Glu Phe Ala Gln Asn Leu Gln Ser Asn His Ile 405
410
SEQ ID NO:97. Length: 1290. Type: DNA. Organism: Pseudomonas fluorescens.
atgcagtttg
ccgccaaacg cgagtactgg cttatcagtg gtttgttgtt tttcttcttc 60ttttcgtggt
catccagcta ttcattgttt tctatctggc tgcatcgagt cattggcttg 120aatggcacgg
aaaccggctt cattttcgcc gccaacgcta ttgcggcgct gctggttcaa 180cccttctacg
gcgcccttca agaccgcctc gggctgtcca aaaagcttct ggtgtggatt 240ggcatcctgc
tgtgtgccgc ggccccgttt gcaatttatg tctacgccgg cctgttggcg 300cagaacgtga
tgctcggcgc gttggtcggt gcggcgttcc tggcgctggc gatgctggca 360ggcgttgggg
tgatcgagtc gtacaccgag cgcttgtcgc ggcatgcagg attcgagttt 420ggaaccaccc
gaatgtgggg gtcgttgggc tgggccagcg cgacgggcgt ggtcggcgtg 480gtgttcaaca
tcgatcctga cattgcgttt tacatgagca gcctcgccgg catcgtgttt 540ttgctgatcc
tgttccgtct ggacctcgac cggttggccc agccggcagt gcaggcgggc 600gcggttgtcc
accccgtgcg cctgaacgat ctctggaagt tgctggcact cccgcggttc 660tgggctttca
gcctttacct gacgggggta tgcgggatct acatgatcta cgagcaacag 720tttccggtgt
atttctcctc gtttttcccg accccggagg aggggacccg tgcctatggc 780tacctgaact
cgtctcaggt actggtcgag gcggtcctga tgctgcttgc accctgggtg 840gtcagccgca
caggcgccaa atacgggctg attctggccg gcagcatcat gttcgtgcgc 900atccttgggt
cggggctggt aacgcaggct tgggccatcg ccgcctgcaa gatgttgcac 960gccttggaag
tgcccatctt gctggtctcg atattcaaat acatttcgct caactttgac 1020tctcggctgt
ccgcctcgat ctacttggtg gggttccagt tcgcccagca actgaccgcc 1080atgttgctgt
caccgctggt gggctacggc tacgaccatt tcggtttctc cagcgtctac 1140gtactgatgg
caggcctggt cggcgcttgc ctgctgcttt catggacctt gttgcgcaag 1200gaccccgtgc
gtgacgcctc tcaagtcggg gctggcgatt cacggcagct tcccgccatc 1260gcgccatccg
cccctcgtta tgaaccctag
1290
SEQ ID NO:98. Length: 429. Type: PRT. Organism: Pseudomonas fluorescens.
Met Gln Phe Ala Ala Lys Arg Glu Tyr
Trp Leu Ile Ser Gly Leu Leu 1 5 10
15 Phe Phe Phe Phe Phe Ser Trp Ser Ser Ser Tyr Ser Leu Phe
Ser Ile 20 25 30
Trp Leu His Arg Val Ile Gly Leu Asn Gly Thr Glu Thr Gly Phe Ile
35 40 45 Phe Ala Ala Asn
Ala Ile Ala Ala Leu Leu Val Gln Pro Phe Tyr Gly 50
55 60 Ala Leu Gln Asp Arg Leu Gly Leu
Ser Lys Lys Leu Leu Val Trp Ile 65 70
75 80 Gly Ile Leu Leu Cys Ala Ala Ala Pro Phe Ala Ile
Tyr Val Tyr Ala 85 90
95 Gly Leu Leu Ala Gln Asn Val Met Leu Gly Ala Leu Val Gly Ala Ala
100 105 110 Phe Leu Ala
Leu Ala Met Leu Ala Gly Val Gly Val Ile Glu Ser Tyr 115
120 125 Thr Glu Arg Leu Ser Arg His Ala
Gly Phe Glu Phe Gly Thr Thr Arg 130 135
140 Met Trp Gly Ser Leu Gly Trp Ala Ser Ala Thr Gly Val
Val Gly Val 145 150 155
160 Val Phe Asn Ile Asp Pro Asp Ile Ala Phe Tyr Met Ser Ser Leu Ala
165 170 175 Gly Ile Val Phe
Leu Leu Ile Leu Phe Arg Leu Asp Leu Asp Arg Leu 180
185 190 Ala Gln Pro Ala Val Gln Ala Gly Ala
Val Val His Pro Val Arg Leu 195 200
205 Asn Asp Leu Trp Lys Leu Leu Ala Leu Pro Arg Phe Trp Ala
Phe Ser 210 215 220
Leu Tyr Leu Thr Gly Val Cys Gly Ile Tyr Met Ile Tyr Glu Gln Gln 225
230 235 240 Phe Pro Val Tyr Phe
Ser Ser Phe Phe Pro Thr Pro Glu Glu Gly Thr 245
250 255 Arg Ala Tyr Gly Tyr Leu Asn Ser Ser Gln
Val Leu Val Glu Ala Val 260 265
270 Leu Met Leu Leu Ala Pro Trp Val Val Ser Arg Thr Gly Ala Lys
Tyr 275 280 285 Gly
Leu Ile Leu Ala Gly Ser Ile Met Phe Val Arg Ile Leu Gly Ser 290
295 300 Gly Leu Val Thr Gln Ala
Trp Ala Ile Ala Ala Cys Lys Met Leu His 305 310
315 320 Ala Leu Glu Val Pro Ile Leu Leu Val Ser Ile
Phe Lys Tyr Ile Ser 325 330
335 Leu Asn Phe Asp Ser Arg Leu Ser Ala Ser Ile Tyr Leu Val Gly Phe
340 345 350 Gln Phe
Ala Gln Gln Leu Thr Ala Met Leu Leu Ser Pro Leu Val Gly 355
360 365 Tyr Gly Tyr Asp His Phe Gly
Phe Ser Ser Val Tyr Val Leu Met Ala 370 375
380 Gly Leu Val Gly Ala Cys Leu Leu Leu Ser Trp Thr
Leu Leu Arg Lys 385 390 395
400 Asp Pro Val Arg Asp Ala Ser Gln Val Gly Ala Gly Asp Ser Arg Gln
405 410 415 Leu Pro Ala
Ile Ala Pro Ser Ala Pro Arg Tyr Glu Pro 420
425
SEQ ID NO:99. Length: 1248. Type: DNA. Organism: Artificial Sequence. Description: Coding sequence for variant sucrose transporter.
atggcactga atattccatt cagaaatgcg tactatcgtt
ttgcatccag ttactcattt 60ctctttttta tttcctggtc gctgtggtgg tcgttatacg
ctatttggct gaaaggacat 120ctagggttga cagggacgga attaggtaca ctttattcgg
tcaaccagtt taccagcatt 180ctatttatga tgttctacgg catcgttcag gataaactcg
gtctgaagaa accgctcatc 240tggtgtatga gtttcatcct ggtcttgacc ggaccgttta
tgatttacgt ttatgaaccg 300ttactgcaaa gcaatttttc tgtaggtcta attctggggg
cgctattttt tggcttgggg 360tatctggcgg gatgcggttt gcttgatagc ttcaccgaaa
aaatggcgcg aaattttcat 420ttcgaatatg gaacagcgcg cgcctgggga tcttttggct
atgctattgg cgcgttcttt 480gccggcatat tttttagtat cagtccccat atcaacttct
ggttggtctc gctatttggc 540gctgtattta tgatgatcaa catgcgtttt aaagataagg
atcaccagtg cgtagcggca 600gatgcgggag gggtaaaaaa agaggatttt atcgcagttt
tcaaggatcg aaacttctgg 660gttttcgtca tatttattgt ggggacgtgg tctttctata
acatttttga tcaacaactt 720tttcctgtct tttattcagg tttattcgaa tcacacgatg
taggaacgcg cctgtatggt 780tatctcaact cattccaggt ggtactcgaa gcgctgtgca
tggcgattat tcctttcttt 840gtgaatcggg tagggccaaa aaatgcatta cttatcggag
ttgtgattat ggcgttggcg 900atcctttcct gcgcgctgtt cgttaacccc tggattattt
cattagtgaa gttgttacat 960gccattgagg ttccactttg tgtcatatcc gtcttcaaat
acagcgtggc aaactttgat 1020aagcgcctgt cgtcgacgat ctttctgatt ggttttcaaa
ttgccagttc gcttgggatt 1080gtgctgcttt caacgccgac tgggatactc tttgaccacg
caggctacca gacagttttc 1140ttcgcaattt cgggtattgt ctgcctgatg ttgctatttg
gcattttctt cttgagtaaa 1200aaacgcgagc aaatagttat ggaaacgcct gtaccttcag
caatatag 1248100415PRTArtificial SequenceVariant sucrose
transporter 100Met Ala Leu Asn Ile Pro Phe Arg Asn Ala Tyr Tyr Arg Phe
Ala Ser 1 5 10 15
Ser Tyr Ser Phe Leu Phe Phe Ile Ser Trp Ser Leu Trp Trp Ser Leu
20 25 30 Tyr Ala Ile Trp
Leu Lys Gly His Leu Gly Leu Thr Gly Thr Glu Leu 35
40 45 Gly Thr Leu Tyr Ser Val Asn Gln
Phe Thr Ser Ile Leu Phe Met Met 50 55
60 Phe Tyr Gly Ile Val Gln Asp Lys Leu Gly Leu Lys Lys
Pro Leu Ile 65 70 75
80 Trp Cys Met Ser Phe Ile Leu Val Leu Thr Gly Pro Phe Met Ile Tyr
85 90 95 Val Tyr Glu
Pro Leu Leu Gln Ser Asn Phe Ser Val Gly Leu Ile Leu 100
105 110 Gly Ala Leu Phe Phe Gly Leu
Gly Tyr Leu Ala Gly Cys Gly Leu Leu 115 120
125 Asp Ser Phe Thr Glu Lys Met Ala Arg Asn Phe
His Phe Glu Tyr Gly 130 135 140
Thr Ala Arg Ala Trp Gly Ser Phe Gly Tyr Ala Ile Gly Ala Phe Phe
145 150 155 160 Ala Gly
Ile Phe Phe Ser Ile Ser Pro His Ile Asn Phe Trp Leu Val
165 170 175 Ser Leu Phe Gly Ala Val
Phe Met Met Ile Asn Met Arg Phe Lys Asp 180
185 190 Lys Asp His Gln Cys Val Ala Ala Asp Ala
Gly Gly Val Lys Lys Glu 195 200
205 Asp Phe Ile Ala Val Phe Lys Asp Arg Asn Phe Trp Val Phe
Val Ile 210 215 220
Phe Ile Val Gly Thr Trp Ser Phe Tyr Asn Ile Phe Asp Gln Gln Leu 225
230 235 240 Phe Pro Val Phe
Tyr Ser Gly Leu Phe Glu Ser His Asp Val Gly Thr 245
250 255 Arg Leu Tyr Gly Tyr Leu Asn Ser Phe
Gln Val Val Leu Glu Ala Leu 260 265
270 Cys Met Ala Ile Ile Pro Phe Phe Val Asn Arg Val Gly Pro
Lys Asn 275 280 285
Ala Leu Leu Ile Gly Val Val Ile Met Ala Leu Ala Ile Leu Ser Cys 290
295 300 Ala Leu Phe Val Asn
Pro Trp Ile Ile Ser Leu Val Lys Leu Leu His 305 310
315 320 Ala Ile Glu Val Pro Leu Cys Val Ile Ser
Val Phe Lys Tyr Ser Val 325 330
335 Ala Asn Phe Asp Lys Arg Leu Ser Ser Thr Ile Phe Leu Ile Gly
Phe 340 345 350 Gln
Ile Ala Ser Ser Leu Gly Ile Val Leu Leu Ser Thr Pro Thr Gly 355
360 365 Ile Leu Phe Asp His Ala
Gly Tyr Gln Thr Val Phe Phe Ala Ile Ser 370 375
380 Gly Ile Val Cys Leu Met Leu Leu Phe Gly Ile
Phe Phe Leu Ser Lys 385 390 395
400 Lys Arg Glu Gln Ile Val Met Glu Thr Pro Val Pro Ser Ala Ile
405 410 415
1011248DNAArtificial SequenceCoding sequence for variant sucrose
transporter 101atggcactga atattccatt cagaaatgcg tactatcgtt ttgcatccag
ttactcattt 60ctctttttta tttcctggtc gctgtggtgg tcgttatacg ctatttggct
gaaaggacat 120ctagggttga cagggacgga attaggtaca ctttattcgg tcaaccagtt
taccagcatt 180ctatttatga tgttctacgg catcgttcag gataaactcg gtctgaagaa
accgctcatc 240tggtgtatga gtttcatcct ggtcttgacc ggaccgttta tgatttacgt
ttatgaaccg 300ttactgcaaa gcaatttttc tgtaggtcta attctggggg cgctattttt
tggcttgggg 360tatctggcgg gatgcggttt gcttgatagc ttcaccgaaa aaatggcgcg
aaattttcat 420ttcgaatatg gaacagcgcg cgcctgggga tcttttggct atgctattgg
cgcgttcttt 480gccggcatat tttttagtat cagtccccat atcaacttct ggttggtctc
gctatttggc 540gctgtattta tgatgatcaa catgcgtttt aaagataagg atcaccagtg
cgtagcggca 600gatgcgggag gggtaaaaaa agaggatttt atcgcagttt tcaaggatcg
aaacttctgg 660gttttcgtca tatttattgt ggggacgtgg tctttctata acatttttga
tcaacaactt 720tttcctgtct tttattcagg tttattcgaa tcacacgatg taggaacgcg
cctgtatggt 780tatctcaact cattccaggt ggtactcgaa gcgctgtgca tggcgattat
tcctttcttt 840gtgaatcggg tagggccaaa aaatgcatta cttatcggag ttgtgattat
ggcgttgctg 900atcctttcct gcgcgctgtt cgttaacccc tggattattt cattagtgaa
gttgttacat 960gccattgagg ttccactttg tgtcatatcc gtcttcaaat acagcgtggc
aaactttgat 1020aagcgcctgt cgtcgacgat ctttctgatt ggttttcaaa ttgccagttc
gcttgggatt 1080gtgctgcttt caacgccgac tgggatactc tttgaccacg caggctacca
gacagttttc 1140ttcgcaattt cgggtattgt ctgcctgatg ttgctatttg gcattttctt
cttgagtaaa 1200aaacgcgagc aaatagttat ggaaacgcct gtaccttcag caatatag
1248102415PRTArtificial SequenceVariant sucrose transporter
102Met Ala Leu Asn Ile Pro Phe Arg Asn Ala Tyr Tyr Arg Phe Ala Ser 1
5 10 15 Ser Tyr Ser Phe
Leu Phe Phe Ile Ser Trp Ser Leu Trp Trp Ser Leu 20
25 30 Tyr Ala Ile Trp Leu Lys Gly His
Leu Gly Leu Thr Gly Thr Glu Leu 35 40
45 Gly Thr Leu Tyr Ser Val Asn Gln Phe Thr Ser Ile
Leu Phe Met Met 50 55 60
Phe Tyr Gly Ile Val Gln Asp Lys Leu Gly Leu Lys Lys Pro Leu Ile 65
70 75 80 Trp Cys Met
Ser Phe Ile Leu Val Leu Thr Gly Pro Phe Met Ile Tyr 85
90 95 Val Tyr Glu Pro Leu Leu Gln
Ser Asn Phe Ser Val Gly Leu Ile Leu 100 105
110 Gly Ala Leu Phe Phe Gly Leu Gly Tyr Leu Ala
Gly Cys Gly Leu Leu 115 120 125
Asp Ser Phe Thr Glu Lys Met Ala Arg Asn Phe His Phe Glu Tyr
Gly 130 135 140 Thr
Ala Arg Ala Trp Gly Ser Phe Gly Tyr Ala Ile Gly Ala Phe Phe 145
150 155 160 Ala Gly Ile Phe Phe
Ser Ile Ser Pro His Ile Asn Phe Trp Leu Val 165
170 175 Ser Leu Phe Gly Ala Val Phe Met Met
Ile Asn Met Arg Phe Lys Asp 180 185
190 Lys Asp His Gln Cys Val Ala Ala Asp Ala Gly Gly Val
Lys Lys Glu 195 200 205
Asp Phe Ile Ala Val Phe Lys Asp Arg Asn Phe Trp Val Phe Val Ile
210 215 220 Phe Ile Val
Gly Thr Trp Ser Phe Tyr Asn Ile Phe Asp Gln Gln Leu 225
230 235 240 Phe Pro Val Phe Tyr Ser Gly
Leu Phe Glu Ser His Asp Val Gly Thr 245
250 255 Arg Leu Tyr Gly Tyr Leu Asn Ser Phe Gln Val
Val Leu Glu Ala Leu 260 265
270 Cys Met Ala Ile Ile Pro Phe Phe Val Asn Arg Val Gly Pro Lys
Asn 275 280 285 Ala
Leu Leu Ile Gly Val Val Ile Met Ala Leu Leu Ile Leu Ser Cys 290
295 300 Ala Leu Phe Val Asn Pro
Trp Ile Ile Ser Leu Val Lys Leu Leu His 305 310
315 320 Ala Ile Glu Val Pro Leu Cys Val Ile Ser Val
Phe Lys Tyr Ser Val 325 330
335 Ala Asn Phe Asp Lys Arg Leu Ser Ser Thr Ile Phe Leu Ile Gly Phe
340 345 350 Gln Ile
Ala Ser Ser Leu Gly Ile Val Leu Leu Ser Thr Pro Thr Gly 355
360 365 Ile Leu Phe Asp His Ala Gly
Tyr Gln Thr Val Phe Phe Ala Ile Ser 370 375
380 Gly Ile Val Cys Leu Met Leu Leu Phe Gly Ile Phe
Phe Leu Ser Lys 385 390 395
400 Lys Arg Glu Gln Ile Val Met Glu Thr Pro Val Pro Ser Ala Ile
405 410 415 1031248DNAArtificial
SequenceCoding sequence for variant sucrose transporter 103atggcactga
atattccatt cagaaatgcg tactatcgtt ttgcatccag ttactcattt 60ctctttttta
tttcctggtc gctgtggtgg tcgttatacg ctatttggct gaaaggacat 120ctagggttga
cagggacgga attaggtaca ctttattcgg tcaaccagtt taccagcatt 180ctatttatga
tgttctacgg catcgttcag gataaactcg gtctgaagaa accgctcatc 240tggtgtatga
gtttcatcct ggtcttgacc ggaccgttta tgatttacgt ttatgaaccg 300ttactgcaaa
gcaatttttc tgtaggtcta attctggggg cgctattttt tggcttgggg 360tatctggcgg
gatgcggttt gcttgatagc ttcaccgaaa aaatggcgcg aaattttcat 420ttcgaatatg
gaacagcgcg cgcctgggga tcttttggct atgctattgg cgcgttcttt 480gccggcatat
tttttagtat cagtccccat atcaacttct ggttggtctc gctatttggc 540gctgtattta
tgatgatcaa catgcgtttt aaagataagg atcaccagtg cgtagcggca 600gatgcgggag
gggtaaaaaa agaggatttt atcgcagttt tcaaggatcg aaacttctgg 660gttttcgtca
tatttattgt ggggacgtgg tctttctata acatttttga tcaacaactt 720tttcctgtct
tttattcagg tttattcgaa tcacacgatg taggaacgcg cctgtatggt 780tatctcaact
cattccaggt ggtactcgaa gcgctgtgca tggcgattat tcctttcttt 840gtgaatcggg
tagggccaaa aaatgcatta cttatcggag ttgtgattat ggcgttggcg 900atcctttcct
gcgcgctgtt cgttaacccc tggattattt cattagtgaa gttgttacat 960gccattgagg
ttccactttg tgtcatatcc gtcttcaaat acagcgtggc aaactttgat 1020aagcgcctgt
cgtcgacgat ctttctgatt ggttttcaca ttgccagttc gcttgggatt 1080gtgctgcttt
caacgccgac tgggatactc tttgaccacg caggctacca gacagttttc 1140ttcgcaattt
cgggtattgt ctgcctgatg ttgctatttg gcattttctt cttgagtaaa 1200aaacgcgagc
aaatagttat ggaaacgcct gtaccttcag caatatag
1248104415PRTArtificial SequenceVariant sucrose transporter 104Met Ala
Leu Asn Ile Pro Phe Arg Asn Ala Tyr Tyr Arg Phe Ala Ser 1 5
10 15 Ser Tyr Ser Phe Leu Phe Phe
Ile Ser Trp Ser Leu Trp Trp Ser Leu 20 25
30 Tyr Ala Ile Trp Leu Lys Gly His Leu Gly Leu Thr
Gly Thr Glu Leu 35 40 45
Gly Thr Leu Tyr Ser Val Asn Gln Phe Thr Ser Ile Leu Phe Met Met
50 55 60 Phe Tyr Gly
Ile Val Gln Asp Lys Leu Gly Leu Lys Lys Pro Leu Ile 65
70 75 80 Trp Cys Met Ser Phe Ile Leu
Val Leu Thr Gly Pro Phe Met Ile Tyr 85
90 95 Val Tyr Glu Pro Leu Leu Gln Ser Asn Phe Ser
Val Gly Leu Ile Leu 100 105
110 Gly Ala Leu Phe Phe Gly Leu Gly Tyr Leu Ala Gly Cys Gly Leu
Leu 115 120 125 Asp
Ser Phe Thr Glu Lys Met Ala Arg Asn Phe His Phe Glu Tyr Gly 130
135 140 Thr Ala Arg Ala Trp Gly
Ser Phe Gly Tyr Ala Ile Gly Ala Phe Phe 145 150
155 160 Ala Gly Ile Phe Phe Ser Ile Ser Pro His Ile
Asn Phe Trp Leu Val 165 170
175 Ser Leu Phe Gly Ala Val Phe Met Met Ile Asn Met Arg Phe Lys Asp
180 185 190 Lys Asp
His Gln Cys Val Ala Ala Asp Ala Gly Gly Val Lys Lys Glu 195
200 205 Asp Phe Ile Ala Val Phe Lys
Asp Arg Asn Phe Trp Val Phe Val Ile 210 215
220 Phe Ile Val Gly Thr Trp Ser Phe Tyr Asn Ile Phe
Asp Gln Gln Leu 225 230 235
240 Phe Pro Val Phe Tyr Ser Gly Leu Phe Glu Ser His Asp Val Gly Thr
245 250 255 Arg Leu Tyr
Gly Tyr Leu Asn Ser Phe Gln Val Val Leu Glu Ala Leu 260
265 270 Cys Met Ala Ile Ile Pro Phe Phe
Val Asn Arg Val Gly Pro Lys Asn 275 280
285 Ala Leu Leu Ile Gly Val Val Ile Met Ala Leu Ala Ile
Leu Ser Cys 290 295 300
Ala Leu Phe Val Asn Pro Trp Ile Ile Ser Leu Val Lys Leu Leu His 305
310 315 320 Ala Ile Glu Val
Pro Leu Cys Val Ile Ser Val Phe Lys Tyr Ser Val 325
330 335 Ala Asn Phe Asp Lys Arg Leu Ser Ser
Thr Ile Phe Leu Ile Gly Phe 340 345
350 His Ile Ala Ser Ser Leu Gly Ile Val Leu Leu Ser Thr Pro
Thr Gly 355 360 365
Ile Leu Phe Asp His Ala Gly Tyr Gln Thr Val Phe Phe Ala Ile Ser 370
375 380 Gly Ile Val Cys Leu
Met Leu Leu Phe Gly Ile Phe Phe Leu Ser Lys 385 390
395 400 Lys Arg Glu Gln Ile Val Met Glu Thr Pro
Val Pro Ser Ala Ile 405 410
415 1051248DNAArtificial SequenceCoding sequence for variant sucrose
transporter 105atggcactga atattccatt cagaaatgcg tactatcgtt ttgcatccag
ttactcattt 60ctctttttta tttcctggtc gctgtggtgg tcgttatacg ctatttggct
gaaaggacat 120ctagggttga cagggacgga attaggtaca ctttattcgg tcaaccagtt
taccagcatt 180ccatttatga tgttctacgg catcgttcag gataaactcg gtctgaagaa
accgctcatc 240tggtgtatga gtttcatcct ggtcttgacc ggaccgttta tgatttacgt
ttatgaaccg 300ttactgcaaa gcaatttttc tgtaggtcta attctggggg cgctattttt
tggcttgggg 360tatctggcgg gatgcggttt gcttgatagc ttcaccgaaa aaatggcgcg
aaattttcat 420ttcgaatatg gaacagcgcg cgcctgggga tcttttggct atgctattgg
cgcgttcttt 480gccggcatat tttttagtat cagtccccat atcaacttct ggttggtctc
gctatttggc 540gctgtattta tgatgatcaa catgcgtttt aaagataagg atcaccagtg
cgtagcggca 600gatgcgggag gggtaaaaaa agaggatttt atcgcagttt tcaaggatcg
aaacttctgg 660gttttcgtca tatttattgt ggggacgtgg tctttctata acatttttga
tcaacaactt 720tttcctgtct tttattcagg tttattcgaa tcacacgatg taggaacgcg
cctgtatggt 780tatctcaact cattccaggt ggtactcgaa gcgctgtgca tggcgattat
tcctttcttt 840gtgaatcggg tagggccaaa aaatgcatta cttatcggag ttgtgattat
ggcgttggcg 900atcctttcct gcgcgctgtt cgttaacccc tggattattt cattagtgaa
gttgttacat 960gccattgcgg ttccactttg tgtcatatcc gtcttcaaat acagcgtggc
aaactttgat 1020aagcgcctgt cgtcgacgat ctttctgatt ggttttcaca ttgccagttc
gcttgggatt 1080gtgctgcttt caacgccgac tgggatactc tttgaccacg caggctacca
gacagttttc 1140ttcgcaattt cgggtattgt ctgcctgatg ttgctatttg gcattttctt
cttgagtaaa 1200aaacgcgagc aaatagttat ggaaacgcct gtaccttcag caatatag
1248106415PRTArtificial SequenceVariant sucrose transporter
106Met Ala Leu Asn Ile Pro Phe Arg Asn Ala Tyr Tyr Arg Phe Ala Ser 1
5 10 15 Ser Tyr Ser Phe
Leu Phe Phe Ile Ser Trp Ser Leu Trp Trp Ser Leu 20
25 30 Tyr Ala Ile Trp Leu Lys Gly His Leu
Gly Leu Thr Gly Thr Glu Leu 35 40
45 Gly Thr Leu Tyr Ser Val Asn Gln Phe Thr Ser Ile Pro Phe
Met Met 50 55 60
Phe Tyr Gly Ile Val Gln Asp Lys Leu Gly Leu Lys Lys Pro Leu Ile 65
70 75 80 Trp Cys Met Ser Phe
Ile Leu Val Leu Thr Gly Pro Phe Met Ile Tyr 85
90 95 Val Tyr Glu Pro Leu Leu Gln Ser Asn Phe
Ser Val Gly Leu Ile Leu 100 105
110 Gly Ala Leu Phe Phe Gly Leu Gly Tyr Leu Ala Gly Cys Gly Leu
Leu 115 120 125 Asp
Ser Phe Thr Glu Lys Met Ala Arg Asn Phe His Phe Glu Tyr Gly 130
135 140 Thr Ala Arg Ala Trp Gly
Ser Phe Gly Tyr Ala Ile Gly Ala Phe Phe 145 150
155 160 Ala Gly Ile Phe Phe Ser Ile Ser Pro His Ile
Asn Phe Trp Leu Val 165 170
175 Ser Leu Phe Gly Ala Val Phe Met Met Ile Asn Met Arg Phe Lys Asp
180 185 190 Lys Asp
His Gln Cys Val Ala Ala Asp Ala Gly Gly Val Lys Lys Glu 195
200 205 Asp Phe Ile Ala Val Phe Lys
Asp Arg Asn Phe Trp Val Phe Val Ile 210 215
220 Phe Ile Val Gly Thr Trp Ser Phe Tyr Asn Ile Phe
Asp Gln Gln Leu 225 230 235
240 Phe Pro Val Phe Tyr Ser Gly Leu Phe Glu Ser His Asp Val Gly Thr
245 250 255 Arg Leu Tyr
Gly Tyr Leu Asn Ser Phe Gln Val Val Leu Glu Ala Leu 260
265 270 Cys Met Ala Ile Ile Pro Phe Phe
Val Asn Arg Val Gly Pro Lys Asn 275 280
285 Ala Leu Leu Ile Gly Val Val Ile Met Ala Leu Ala Ile
Leu Ser Cys 290 295 300
Ala Leu Phe Val Asn Pro Trp Ile Ile Ser Leu Val Lys Leu Leu His 305
310 315 320 Ala Ile Ala Val
Pro Leu Cys Val Ile Ser Val Phe Lys Tyr Ser Val 325
330 335 Ala Asn Phe Asp Lys Arg Leu Ser Ser
Thr Ile Phe Leu Ile Gly Phe 340 345
350 His Ile Ala Ser Ser Leu Gly Ile Val Leu Leu Ser Thr Pro
Thr Gly 355 360 365
Ile Leu Phe Asp His Ala Gly Tyr Gln Thr Val Phe Phe Ala Ile Ser 370
375 380 Gly Ile Val Cys Leu
Met Leu Leu Phe Gly Ile Phe Phe Leu Ser Lys 385 390
395 400 Lys Arg Glu Gln Ile Val Met Glu Thr Pro
Val Pro Ser Ala Ile 405 410
415 1071261DNAArtificial SequenceCoding sequence for variant sucrose
transporter 107atgaaaatca atatgccgtt ctccaatgac aaataccgtt atagttcggg
ctacctgctg 60ttcttcttcg ctgcgtggtc cctgtggtgg agtttctacg caatctggct
gaaaaacaaa 120ctgggcctgt ccggcaccga actgggcatg ctgtatgctg ttaatcagtt
tttctccatg 180ctgttcatgc tggtctacgg ctttctgcaa gataaactgg gcacccgtaa
acatctgatt 240tggctgatgg gcattgtgat cacgctgtca ggtccgttcc tgatctatgt
ttacgaaccg 300ctgctgacct cgaactttaa actgggcatg gcactgggtg ctattttctt
tggtctgggt 360tatctggcag gttgcggcct ggtggaatct tttgtggaaa aagtttctcg
taaattcaac 420ttcgaatttg gcaccgcacg tctgtggggc tctctgggtt acgcggccgg
tacgttcgtt 480ggcggtattt tctttagcat caacccgcac attaattttt ggtgtgtctc
tgtgatgggc 540gtcctgttcc tgctgatcaa cgtgctgttt aaaaccaata gtccggcacc
gagctctgtg 600aaaacccgtt ccccggaacc ggatgctctg acgcgcaaag acttcctgac
catctttaaa 660gatacgcagt tctggttttt cgttattttt gtggttggca cgtggagttt
ctattccatc 720tacgaccagc aaatgttccc ggtgttttat gcgagcctgt ttgatgaccc
ggaactggcc 780ccgcgtgttt atggttacct gaactctgtt caagtcttca tggaagcggt
tggcatggcc 840ctggtcccgt ttctgattaa tcgtatcggt ccgaaaagcg cactgctgct
gggcggcacc 900atcatggcat gcgcgattct gggttcagct ctgtttacgg atatctacat
catctcgctg 960atcaaaatgc tgcatgcgct ggaagtcccg ctgttcgtca tttcagtgtt
caaattttcg 1020gtggccaact ttgacaaacg cctgagttcc accatttacc tgatcggctt
taatatcgcg 1080tcatcgattg gtattatcgt gctgagtctg ccggttggca aactgttcga
taaagttggt 1140tatcaggaaa tttttctgat catggccagc atcgtcatta tcaccctgat
tttcggctac 1200tttagcctgt ctaaaaaaca tcaccagcaa aaaatgggta acgaactggt
gacggaataa 1260c
1261108419PRTArtificial SequenceVariant sucrose transporter
108Met Lys Ile Asn Met Pro Phe Ser Asn Asp Lys Tyr Arg Tyr Ser Ser 1
5 10 15 Gly Tyr Leu Leu
Phe Phe Phe Ala Ala Trp Ser Leu Trp Trp Ser Phe 20
25 30 Tyr Ala Ile Trp Leu Lys Asn Lys Leu
Gly Leu Ser Gly Thr Glu Leu 35 40
45 Gly Met Leu Tyr Ala Val Asn Gln Phe Phe Ser Met Leu Phe
Met Leu 50 55 60
Val Tyr Gly Phe Leu Gln Asp Lys Leu Gly Thr Arg Lys His Leu Ile 65
70 75 80 Trp Leu Met Gly Ile
Val Ile Thr Leu Ser Gly Pro Phe Leu Ile Tyr 85
90 95 Val Tyr Glu Pro Leu Leu Thr Ser Asn Phe
Lys Leu Gly Met Ala Leu 100 105
110 Gly Ala Ile Phe Phe Gly Leu Gly Tyr Leu Ala Gly Cys Gly Leu
Val 115 120 125 Glu
Ser Phe Val Glu Lys Val Ser Arg Lys Phe Asn Phe Glu Phe Gly 130
135 140 Thr Ala Arg Leu Trp Gly
Ser Leu Gly Tyr Ala Ala Gly Thr Phe Val 145 150
155 160 Gly Gly Ile Phe Phe Ser Ile Asn Pro His Ile
Asn Phe Trp Cys Val 165 170
175 Ser Val Met Gly Val Leu Phe Leu Leu Ile Asn Val Leu Phe Lys Thr
180 185 190 Asn Ser
Pro Ala Pro Ser Ser Val Lys Thr Arg Ser Pro Glu Pro Asp 195
200 205 Ala Leu Thr Arg Lys Asp Phe
Leu Thr Ile Phe Lys Asp Thr Gln Phe 210 215
220 Trp Phe Phe Val Ile Phe Val Val Gly Thr Trp Ser
Phe Tyr Ser Ile 225 230 235
240 Tyr Asp Gln Gln Met Phe Pro Val Phe Tyr Ala Ser Leu Phe Asp Asp
245 250 255 Pro Glu Leu
Ala Pro Arg Val Tyr Gly Tyr Leu Asn Ser Val Gln Val 260
265 270 Phe Met Glu Ala Val Gly Met Ala
Leu Val Pro Phe Leu Ile Asn Arg 275 280
285 Ile Gly Pro Lys Ser Ala Leu Leu Leu Gly Gly Thr Ile
Met Ala Cys 290 295 300
Ala Ile Leu Gly Ser Ala Leu Phe Thr Asp Ile Tyr Ile Ile Ser Leu 305
310 315 320 Ile Lys Met Leu
His Ala Leu Glu Val Pro Leu Phe Val Ile Ser Val 325
330 335 Phe Lys Phe Ser Val Ala Asn Phe Asp
Lys Arg Leu Ser Ser Thr Ile 340 345
350 Tyr Leu Ile Gly Phe Asn Ile Ala Ser Ser Ile Gly Ile Ile
Val Leu 355 360 365
Ser Leu Pro Val Gly Lys Leu Phe Asp Lys Val Gly Tyr Gln Glu Ile 370
375 380 Phe Leu Ile Met Ala
Ser Ile Val Ile Ile Thr Leu Ile Phe Gly Tyr 385 390
395 400 Phe Ser Leu Ser Lys Lys His His Gln Gln
Lys Met Gly Asn Glu Leu 405 410
415 Val Thr Glu 1091261DNAArtificial SequenceCoding sequence
for variant sucrose transporter 109atgaaaatca atatgccgtt ctccaatgac
aaataccgtt atagttcggg ctacctgctg 60ttcttcttcg ctgcgtggtc cctgtggtgg
agtttctacg caatctggct gaaaaacaaa 120ctgggcctgt ccggcaccga actgggcatg
ctgtatgctg ttaatcagtt tttctccatg 180ctgttcatgc tggtctacgg ctttctgcaa
gataaactgg gcacccgtaa acatctgatt 240tggctgatgg gcattgtgat cacgctgtca
ggtccgttcc tgatctatgt ttacgaaccg 300ctgctgacct cgaactttaa actgggcatg
gcactgggtg ctattttctt tggtctgggt 360tatctggcag gttgcggcct ggtggaatct
tttgtggaaa aagtttctcg taaattcaac 420ttcgaatttg gcaccgcacg tctgtggggc
tctctgggtt acgcggccgg tacgttcgtt 480ggcggtattt tctttagcat caacccgcac
attaattttt ggtgtgtctc tgtgatgggc 540gtcctgttcc tgctgatcaa cgtgctgttt
aaaaccaata gtccggcacc gagctctgtg 600aaaacccgtt ccccggaacc ggatgctctg
acgcgcaaag acttcctgac catctttaaa 660gatacgcagt tctggttttt cgttattttt
gtggttggca cgtggagttt ctattccatc 720tacgaccagc aaatgttccc ggtgttttat
gcgagcctgt ttgatgaccc ggaactggcc 780ccgcgtgttt atggttacct gaactctgtt
caagtcttca tggaagcggt tggcatggcc 840ctggtcccgt ttctgattaa tcgtatcggt
ccgaaaagcg cactgctgct gggcggcacc 900atcatggcat gcctgattct gggttcagct
ctgtttacgg atatctacat catctcgctg 960atcaaaatgc tgcatgcgct ggaagtcccg
ctgttcgtca tttcagtgtt caaattttcg 1020gtggccaact ttgacaaacg cctgagttcc
accatttacc tgatcggctt taatatcgcg 1080tcatcgattg gtattatcgt gctgagtctg
ccggttggca aactgttcga taaagttggt 1140tatcaggaaa tttttctgat catggccagc
atcgtcatta tcaccctgat tttcggctac 1200tttagcctgt ctaaaaaaca tcaccagcaa
aaaatgggta acgaactggt gacggaataa 1260c
1261110419PRTArtificial SequenceVariant
sucrose transporter 110Met Lys Ile Asn Met Pro Phe Ser Asn Asp Lys Tyr
Arg Tyr Ser Ser 1 5 10
15 Gly Tyr Leu Leu Phe Phe Phe Ala Ala Trp Ser Leu Trp Trp Ser Phe
20 25 30 Tyr Ala Ile
Trp Leu Lys Asn Lys Leu Gly Leu Ser Gly Thr Glu Leu 35
40 45 Gly Met Leu Tyr Ala Val Asn Gln
Phe Phe Ser Met Leu Phe Met Leu 50 55
60 Val Tyr Gly Phe Leu Gln Asp Lys Leu Gly Thr Arg Lys
His Leu Ile 65 70 75
80 Trp Leu Met Gly Ile Val Ile Thr Leu Ser Gly Pro Phe Leu Ile Tyr
85 90 95 Val Tyr Glu Pro
Leu Leu Thr Ser Asn Phe Lys Leu Gly Met Ala Leu 100
105 110 Gly Ala Ile Phe Phe Gly Leu Gly Tyr
Leu Ala Gly Cys Gly Leu Val 115 120
125 Glu Ser Phe Val Glu Lys Val Ser Arg Lys Phe Asn Phe Glu
Phe Gly 130 135 140
Thr Ala Arg Leu Trp Gly Ser Leu Gly Tyr Ala Ala Gly Thr Phe Val 145
150 155 160 Gly Gly Ile Phe Phe
Ser Ile Asn Pro His Ile Asn Phe Trp Cys Val 165
170 175 Ser Val Met Gly Val Leu Phe Leu Leu Ile
Asn Val Leu Phe Lys Thr 180 185
190 Asn Ser Pro Ala Pro Ser Ser Val Lys Thr Arg Ser Pro Glu Pro
Asp 195 200 205 Ala
Leu Thr Arg Lys Asp Phe Leu Thr Ile Phe Lys Asp Thr Gln Phe 210
215 220 Trp Phe Phe Val Ile Phe
Val Val Gly Thr Trp Ser Phe Tyr Ser Ile 225 230
235 240 Tyr Asp Gln Gln Met Phe Pro Val Phe Tyr Ala
Ser Leu Phe Asp Asp 245 250
255 Pro Glu Leu Ala Pro Arg Val Tyr Gly Tyr Leu Asn Ser Val Gln Val
260 265 270 Phe Met
Glu Ala Val Gly Met Ala Leu Val Pro Phe Leu Ile Asn Arg 275
280 285 Ile Gly Pro Lys Ser Ala Leu
Leu Leu Gly Gly Thr Ile Met Ala Cys 290 295
300 Leu Ile Leu Gly Ser Ala Leu Phe Thr Asp Ile Tyr
Ile Ile Ser Leu 305 310 315
320 Ile Lys Met Leu His Ala Leu Glu Val Pro Leu Phe Val Ile Ser Val
325 330 335 Phe Lys Phe
Ser Val Ala Asn Phe Asp Lys Arg Leu Ser Ser Thr Ile 340
345 350 Tyr Leu Ile Gly Phe Asn Ile Ala
Ser Ser Ile Gly Ile Ile Val Leu 355 360
365 Ser Leu Pro Val Gly Lys Leu Phe Asp Lys Val Gly Tyr
Gln Glu Ile 370 375 380
Phe Leu Ile Met Ala Ser Ile Val Ile Ile Thr Leu Ile Phe Gly Tyr 385
390 395 400 Phe Ser Leu Ser
Lys Lys His His Gln Gln Lys Met Gly Asn Glu Leu 405
410 415 Val Thr Glu 1111339DNAArtificial
SequenceCoding sequence for variant sucrose transporter 111atggcgtcag
cgaccaaatc ggcgtggaaa aacccgtcct atctgcaatc ctcattcggc 60atcttcatgt
tcttctgttc gtggggcatt tggtggtcat ttttccagcg ttggctgatc 120tcgggcgtgg
gtctgacgaa cgccgaagtt ggcaccattt atagcatcaa ttctctggca 180accctggtga
ttatgtttgt gtacggcgtt attcaggatc aactgggtat caaacgtaaa 240ctggttattg
tggttagcgt catcgcggcc tgcgtgggtc cgtttgtcca gttcgtgtat 300gcaccgatga
ttctggcggg cggcaccacg cgttggatcg gtgctctgat tggttcaatc 360gtgctgtcgg
cgggctttat gagtggttgc tccctgttcg aagctgttac cgaacgttat 420tctcgcaaat
ttggcttcga atacggtcag agccgcgcct ggggctcttt tggttatgca 480attgtggctc
tgtgtgcggg ctttctgttc aacattaatc cgctgatcaa cttttgggtt 540ggttcagcat
tcggtccggg catgctgctg gtttacgctt tttgggtccc ggcggaacaa 600aaagaagaac
tgaaaaaaga aacggatccg aacgcagctc cgaccaatcc gtcgctgaaa 660gaaatggttg
cggtcctgaa aatgccgacg ctgtgggttc tgattgtctt tatgctgctg 720accaacacgt
tttataccgt gttcgaccag caaatgtttc cgacgtatta cgctaacctg 780tttccgaccg
aagaaatcgg caacgcgacc tacggcacgc tgaatggttt tcaggttttc 840ctggaaagcg
ccatgatggg tgtcgtgccg attatcatga agaaaattgg cgttcgtaat 900gccctgctgc
tgggtgcaac ggtcatgttt ctggcgatcg gcctgtgcgg tgtgttccat 960gatccggtta
ccattagtat cgtcaaactg tttcactcca ttgaagtgcc gctgttctgt 1020ctgccggcgt
ttcgttattt caccctgcat tttgacacga aactgagcgc caccctgtac 1080atggttggct
tccagattgc aagccaagtg ggtcaagtta tcttttctac gccgctgggc 1140gccttccacg
ataaaatggc acaaatcctg ccgaacaatg acatgggtag tcgtgtcacc 1200ttttgggtga
tttccgctat cgtgctgtgt gcgctgattt atggcttttt cgtcatcaaa 1260catgatgacc
aggaagtggg cggtgatccg ttctacaccg acaaacaact gcgccaaatg 1320gaagcggcca
aagcgtaac
1339112445PRTArtificial SequenceVariant sucrose transporter 112Met Ala
Ser Ala Thr Lys Ser Ala Trp Lys Asn Pro Ser Tyr Leu Gln 1 5
10 15 Ser Ser Phe Gly Ile Phe Met
Phe Phe Cys Ser Trp Gly Ile Trp Trp 20 25
30 Ser Phe Phe Gln Arg Trp Leu Ile Ser Gly Val Gly
Leu Thr Asn Ala 35 40 45
Glu Val Gly Thr Ile Tyr Ser Ile Asn Ser Leu Ala Thr Leu Val Ile
50 55 60 Met Phe Val
Tyr Gly Val Ile Gln Asp Gln Leu Gly Ile Lys Arg Lys 65
70 75 80 Leu Val Ile Val Val Ser Val
Ile Ala Ala Cys Val Gly Pro Phe Val 85
90 95 Gln Phe Val Tyr Ala Pro Met Ile Leu Ala Gly
Gly Thr Thr Arg Trp 100 105
110 Ile Gly Ala Leu Ile Gly Ser Ile Val Leu Ser Ala Gly Phe Met
Ser 115 120 125 Gly
Cys Ser Leu Phe Glu Ala Val Thr Glu Arg Tyr Ser Arg Lys Phe 130
135 140 Gly Phe Glu Tyr Gly Gln
Ser Arg Ala Trp Gly Ser Phe Gly Tyr Ala 145 150
155 160 Ile Val Ala Leu Cys Ala Gly Phe Leu Phe Asn
Ile Asn Pro Leu Ile 165 170
175 Asn Phe Trp Val Gly Ser Ala Phe Gly Pro Gly Met Leu Leu Val Tyr
180 185 190 Ala Phe
Trp Val Pro Ala Glu Gln Lys Glu Glu Leu Lys Lys Glu Thr 195
200 205 Asp Pro Asn Ala Ala Pro Thr
Asn Pro Ser Leu Lys Glu Met Val Ala 210 215
220 Val Leu Lys Met Pro Thr Leu Trp Val Leu Ile Val
Phe Met Leu Leu 225 230 235
240 Thr Asn Thr Phe Tyr Thr Val Phe Asp Gln Gln Met Phe Pro Thr Tyr
245 250 255 Tyr Ala Asn
Leu Phe Pro Thr Glu Glu Ile Gly Asn Ala Thr Tyr Gly 260
265 270 Thr Leu Asn Gly Phe Gln Val Phe
Leu Glu Ser Ala Met Met Gly Val 275 280
285 Val Pro Ile Ile Met Lys Lys Ile Gly Val Arg Asn Ala
Leu Leu Leu 290 295 300
Gly Ala Thr Val Met Phe Leu Ala Ile Gly Leu Cys Gly Val Phe His 305
310 315 320 Asp Pro Val Thr
Ile Ser Ile Val Lys Leu Phe His Ser Ile Glu Val 325
330 335 Pro Leu Phe Cys Leu Pro Ala Phe Arg
Tyr Phe Thr Leu His Phe Asp 340 345
350 Thr Lys Leu Ser Ala Thr Leu Tyr Met Val Gly Phe Gln Ile
Ala Ser 355 360 365
Gln Val Gly Gln Val Ile Phe Ser Thr Pro Leu Gly Ala Phe His Asp 370
375 380 Lys Met Ala Gln Ile
Leu Pro Asn Asn Asp Met Gly Ser Arg Val Thr 385 390
395 400 Phe Trp Val Ile Ser Ala Ile Val Leu Cys
Ala Leu Ile Tyr Gly Phe 405 410
415 Phe Val Ile Lys His Asp Asp Gln Glu Val Gly Gly Asp Pro Phe
Tyr 420 425 430 Thr
Asp Lys Gln Leu Arg Gln Met Glu Ala Ala Lys Ala 435
440 445 113447DNAArtificial SequencePlasmid
113aggaattccc taggcgatct gtgctgtttg ccacggtatg cagcaccagc gcgagattat
60gggctcgcac gctcgactgt cggacggggg cactggaacg agaagtcagg cgagccgtca
120cgcccttgac tatgccacat cctgagcaaa taattcaacc actaaacaaa tcaaccgcgt
180ttcccggagg taaccaagct tgcccggatc cgcatgcgcg gccgcgtcga ctctagttta
240aacccccggg tgatcgatag ctcttaatta agttgtttgc caatgtaatg ccgctgcacc
300caggcatcaa ataaaacgaa aggctcagtc gaaagactgg gcctttcgtt ttatctgttg
360tttgtcggtg aacgctctct actagagtca cactggctca ccttcgggtg ggcctttctg
420cgtttataca gctgtcggta ccgccag
447114447DNAArtificial SequencePlasmid 114aggaattccc taggcgatct
gtgctgtttg ccacggtatg cagcaccagc gcgagattat 60gggctcgcac gctcgactgt
cggacggggg cactggaacg agaagtcagg cgagccgtca 120cgcccttgac gatgccacat
cctgagcaaa taattcaacc actaaacaaa tcaaccgcgt 180ttcccggagg taaccaagct
tgcccggatc cgcatgcgcg gccgcgtcga ctctagttta 240aacccccggg tgatcgatag
ctcttaatta agttgtttgc caatgtaatg ccgctgcacc 300caggcatcaa ataaaacgaa
aggctcagtc gaaagactgg gcctttcgtt ttatctgttg 360tttgtcggtg aacgctctct
actagagtca cactggctca ccttcgggtg ggcctttctg 420cgtttataca gctgtcggta
ccgccag 44711539DNAArtificial
SequencePrimer 115gttgtgatta tggcgttggc gatcctttcc tgcgcgctg
3911639DNAArtificial SequencePrimer 116cagcgcgcag
gaaaggatcg ccaacgccat aatcacaac
3911727DNAArtificial SequencePrimer 117aagcttatgg cactgaatat tccattc
2711826DNAArtificial SequencePrimer
118atcgatctat attgctgaag gtacag
261199317DNAArtificial SequencePlasmid 119tcgaggaatt cgcaggaccg
tgatacacgg gacaggtcac tgaatgacga caatgtcctg 60gaaatcagcg aaccgcgcat
ctgaagtaca tttgagcgac tgtaccagaa catgaatgag 120gcgtttggat taggcgatta
ttagcagggc taagcatttt actattatta ttttccggtt 180gagggatata gagctatcga
caacaaccgg aaaaagttta cgtctatatt gctgaaggta 240caggcgtttc cataactatt
tgctcgcgtt ttttactcaa gaagaaaatg ccaaatagca 300acatcaggca gacaataccc
gaaattgcga agaaaactgt ctggtagcct gcgtggtcaa 360agagtatccc agtcggcgtt
gaaagcagca caatcccaag cgaactggca atttgaaaac 420caatcagaaa gatcgtcgac
gacaggcgct tatcaaagtt tgccacgctg tatttgaaga 480cggatatgac acaaagtgga
acctcaatgg catgtaacaa cttcactaat gaaataatcc 540aggggttaac gaacagcgcg
caggaaagga tacgcaacgc cataatcaca actccgataa 600gtaatgcatt ttttggccct
acccgattca caaagaaagg aataatcgcc atgcacagcg 660cttcgagtac cacctggaat
gagttgagat aaccatacag gcgcgttcct acatcgtgtg 720attcgaataa acctgaataa
aagacaggaa aaagttgttg atcaaaaatg ttatagaaag 780accacgtccc cacaataaat
atgacgaaaa cccagaagtt tcgatccttg aaaactgcga 840taaaatcctc tttttttacc
cctcccgcat ctgccgctac gcactggtga tccttatctt 900taaaacgcat gttgatcatc
ataaatacag cgccaaatag cgagaccaac cagaagttga 960tatggggact gatactaaaa
aatatgccgg caaagaacgc gccaatagca tagccaaaag 1020atccccaggc gcgcgctgtt
ccatattcga aatgaaaatt tcgcgccatt ttttcggtga 1080agctatcaag caaaccgcat
cccgccagat accccaagcc aaaaaatagc gcccccagaa 1140ttagacctac agaaaaattg
ctttgcagta acggttcata aacgtaaatc ataaacggtc 1200cggtcaagac caggatgaaa
ctcatacacc agatgagcgg tttcttcaga ccgagtttat 1260cctgaacgat gccgtagaac
atcataaata gaatgctggt aaactggttg accgaataaa 1320gtgtacctaa ttccgtccct
gtcaacccta gatgtccttt cagccaaata gcgtataacg 1380accaccacag cgaccaggaa
ataaaaaaga gaaatgagta actggatgca aaacgatagt 1440acgcatttct gaatggaata
ttcagtgcca taattacctg cctgtcgtta aaaaattcac 1500gtcctattta gagataagag
cgacttcgcc gtttacttct cactattcca gttcttgtcg 1560acatggcagc gctgtcattg
cccctttcgc cgttactgca agcgctccgc aacgttgagc 1620gagatcgata attcgtcgca
tttctctctc atctgtagat aatcccgtag aggacagacc 1680tgtgagtaac ccggcaacga
acgcatctcc cgcccccgtg ctatcgacac aattcacaga 1740cattccagca aaatggtgaa
cttgtcctcg ataacagacc accacccctt ctgcaccttt 1800agtcaccaac agcatggcga
tctcatactc ttttgccagg gcgcatatat cctgatcgtt 1860ctgtgttttt ccactgataa
gtcgccattc ttcttccgag agcttgacga catccgccag 1920ttgtagcgcc tgccgcaaac
acaagcggag caaatgctcg tcttgccata gatcttcacg 1980aatattagga tcgaagctga
caaaacctcc ggcatgccgg atcgccgtca tcgcagtaaa 2040tgcgctggta cgcgaaggct
cggcagacaa cgcaattgaa cagagatgta accattcgcc 2100atgtcgccag cagggcaagt
ctgtcgtctc taaaaaaaga tcggcactgg ggcggaccat 2160aaacgtaaat gaacgttccc
cttgatcgtt cagatcgaca agcaccgtgg atgtccggtg 2220ccattcatct tgcttcagat
acgtgatatc gactccctca gttagcagcg ttctttgcat 2280taacgcacca aaaggatcat
cccccacccg acctataaac ccacttgttc cgcctaatct 2340ggcgattccc accgcaacgt
tagctggcgc gccgccagga caaggcagta ggcgcccgtc 2400tgattctggc aagagatcta
cgaccgcatc ccctaaaacc catactttgg ctgacatttt 2460tttcccttaa attcatctga
gttacgcata gtgataaacc tctttttcgc aaaatcgtca 2520tggatttact aaaacatgca
tattcgatca caaaacgtca tagttaacgt taacatttgt 2580gatattcatc gcatttatga
aagtaaggga ctttattttt ataaaagtta acgttaacaa 2640ttcaccaaat ttgcttaacc
aggatgatta aaatgacgca atctcgattg catgcggcgc 2700aaaacgccct agcaaaactt
catgagcacc ggggtaacac tttctatccc cattttcacc 2760tcgcgcctcc tgccgggtgg
atgaacgatc caaacggcct gatctggttt aacgatcgtt 2820atcacgcgtt ttatcaacat
catccgatga gcgaacactg ggggccaatg cactggggac 2880atgccaccag cgacgatatg
atccactggc agcatgagcc tattgcgcta gcgccaggag 2940acgataatga caaagacggg
tgtttttcag gtagtgctgt cgatgacaat ggtgtcctct 3000cacttatcta caccggacac
gtctggctcg atggtgcagg taatgacgat gcaattcgcg 3060aagtacaatg tctggctacc
agtcgggatg gtattcattt cgagaaacag ggtgtgatcc 3120tcactccacc agaaggaatc
atgcacttcc gcgatcctaa agtgtggcgt gaagccgaca 3180catggtggat ggtagtcggg
gcgaaagatc caggcaacac ggggcagatc ctgctttatc 3240gcggcagttc gttgcgtgaa
tggaccttcg atcgcgtact ggcccacgct gatgcgggtg 3300aaagctatat gtgggaatgt
ccggactttt tcagccttgg cgatcagcat tatctgatgt 3360tttccccgca gggaatgaat
gccgagggat acagttaccg aaatcgcttt caaagtggcg 3420taatacccgg aatgtggtcg
ccaggacgac tttttgcaca atccgggcat tttactgaac 3480ttgataacgg gcatgacttt
tatgcaccac aaagcttttt agcgaaggat ggtcggcgta 3540ttgttatcgg ctggatggat
atgtgggaat cgccaatgcc ctcaaaacgt gaaggatggg 3600caggctgcat gacgctggcg
cgcgagctat cagagagcaa tggcaaactt ctacaacgcc 3660cggtacacga agctgagtcg
ttacgccagc agcatcaatc tgtctctccc cgcacaatca 3720gcaataaata tgttttgcag
gaaaacgcgc aagcagttga gattcagttg cagtgggcgc 3780tgaagaacag tgatgccgaa
cattacggat tacagctcgg cactggaatg cggctgtata 3840ttgataacca atctgagcga
cttgttttgt ggcggtatta cccacacgag aatttagacg 3900gctaccgtag tattcccctc
ccgcagcgtg acacgctcgc cctaaggata tttatcgata 3960catcatccgt ggaagtattt
attaacgacg gggaagcggt gatgagtagt cgaatctatc 4020cgcagccaga agaacgggaa
ctgtcgcttt atgcctccca cggagtggct gtgctgcaac 4080atggagcact ctggctactg
ggttaacata atatcaggtg gaacaacgga tcaacagcgg 4140gcaagggatc cacgaagctt
cccatggtga cgtcaccggt aaaccagcaa tagacataag 4200cggctattta acgaccctgc
cctgaaccga cgaccgggtc gaatttgctt tcgaatttct 4260gccattcatc cgcttattat
acttattcag gcgtagcacc aggcgtttaa gggcaccaat 4320aactgcctta aaaaaattac
gccccgccct gccactcatc gcagtactgt tgtaattcat 4380taagcattct gccgacatgg
aagccatcac agacggcatg atgaacctga atcgccagcg 4440gcatcagcac cttgtcgcct
tgcgtataat atttgcccat ggtgaaaacg ggggcgaaga 4500agttgtccat attggccacg
tttaaatcaa aactggtgaa actcacccag ggattggctg 4560agacgaaaaa catattctca
ataaaccctt tagggaaata ggccaggttt tcaccgtaac 4620acgccacatc ttgcgaatat
atgtgtagaa actgccggaa atcgtcgtgg tattcactcc 4680agagcgatga aaacgtttca
gtttgctcat ggaaaacggt gtaacaaggg tgaacactat 4740cccatatcac cagctcaccg
tctttcattg ccatacggaa ttccggatga gcattcatca 4800ggcgggcaag aatgtgaata
aaggccggat aaaacttgtg cttatttttc tttacggtct 4860ttaaaaaggc cgtaatatcc
agctgaacgg tctggttata ggtacattga gcaactgact 4920gaaatgcctc aaaatgttct
ttacgatgcc attgggatat atcaacggtg gtatatccag 4980tgattttttt ctccatttta
gcttccttag ctcctgaaaa tctcgataac tcaaaaaata 5040cgcccggtag tgatcttatt
tcattatggt gaaagttgga acctcttacg tgccgatcaa 5100cgtctcattt tcgccaaaag
ttggcccagg gcttcccggt atcaacaggg acaccaggat 5160ttatttattc tgcgaagtga
tcttccgtca caggtattta ttcggcgcaa agggcctcgt 5220gatacgccta tttttatagg
ttaatgtcat gataataatg gtttcttaga cgtcaggtgg 5280cacttttcgg ggaaatgtgc
gcgcccgcgt tcctgctggc gctgggcctg tttctggcgc 5340tggacttccc gctgttccgt
cagcagcttt tcgcccacgg ccttgatgat cgcggcggcc 5400ttggcctgca tatcccgatt
caacggcccc agggcgtcca gaacgggctt caggcgctcc 5460cgaaggtctc gggccgtctc
ttgggcttga tcggccttct tgcgcatctc acgcgctcct 5520gcggcggcct gtagggcagg
ctcatacccc tgccgaaccg cttttgtcag ccggtcggcc 5580acggcttccg gcgtctcaac
gcgctttgag attcccagct tttcggccaa tccctgcggt 5640gcataggcgc gtggctcgac
cgcttgcggg ctgatggtga cgtggcccac tggtggccgc 5700tccagggcct cgtagaacgc
ctgaatgcgc gtgtgacgtg ccttgctgcc ctcgatgccc 5760cgttgcagcc ctagatcggc
cacagcggcc gcaaacgtgg tctggtcgcg ggtcatctgc 5820gctttgttgc cgatgaactc
cttggccgac agcctgccgt cctgcgtcag cggcaccacg 5880aacgcggtca tgtgcgggct
ggtttcgtca cggtggatgc tggccgtcac gatgcgatcc 5940gccccgtact tgtccgccag
ccacttgtgc gccttctcga agaacgccgc ctgctgttct 6000tggctggccg acttccacca
ttccgggctg gccgtcatga cgtactcgac cgccaacaca 6060gcgtccttgc gccgcttctc
tggcagcaac tcgcgcagtc ggcccatcgc ttcatcggtg 6120ctgctggccg cccagtgctc
gttctctggc gtcctgctgg cgtcagcgtt gggcgtctcg 6180cgctcgcggt aggcgtgctt
gagactggcc gccacgttgc ccattttcgc cagcttcttg 6240catcgcatga tcgcgtatgc
cgccatgcct gcccctccct tttggtgtcc aaccggctcg 6300acgggggcag cgcaaggcgg
tgcctccggc gggccactca atgcttgagt atactcacta 6360gactttgctt cgcaaagtcg
tgaccgccta cggcggctgc ggcgccctac gggcttgctc 6420tccgggcttc gccctgcgcg
gtcgctgcgc tcccttgcca gcccgtggat atgtggacga 6480tggccgcgag cggccaccgg
ctggctcgct tcgctcggcc cgtggacaac cctgctggac 6540aagctgatgg acaggctgcg
cctgcccacg agcttgacca cagggattgc ccaccggcta 6600cccagccttc gaccacatac
ccaccggctc caactgcgcg gcctgcggcc ttgccccatc 6660aattttttta attttctctg
gggaaaagcc tccggcctgc ggcctgcgcg cttcgcttgc 6720cggttggaca ccaagtggaa
ggcgggtcaa ggctcgcgca gcgaccgcgc agcggcttgg 6780ccttgacgcg cctggaacga
cccaagccta tgcgagtggg ggcagtcgaa ggcgaagccc 6840gcccgcctgc cccccgagac
ctgcaggggg gggggggcgc tgaggtctgc ctcgtgaaga 6900aggtgttgct gactcatacc
aggcctgaat cgccccatca tccagccaga aagtgaggga 6960gccacggttg atgagagctt
tgttgtaggt ggaccagttg gtgattttga acttttgctt 7020tgccacggaa cggtctgcgt
tgtcgggaag atgcgtgatc tgatccttca actcagcaaa 7080agttcgattt attcaacaaa
gccgccgtcc cgtcaagtca gcgtaatgct ctgccagtgt 7140tacaaccaat taaccaattc
tgattagaaa aactcatcga gcatcaaatg aaactgcaat 7200ttattcatat caggattatc
aataccatat ttttgaaaaa gccgtttctg taatgaagga 7260gaaaactcac cgaggcagtt
ccataggatg gcaagatcct ggtatcggtc tgcgattccg 7320actcgtccaa catcaataca
acctattaat ttcccctcgt caaaaataag gttatcaagt 7380gagaaatcac catgagtgac
gactgaatcc ggtgagaatg gcaaaagctt atgcatttct 7440ttccagactt gttcaacagg
ccagccatta cgctcgtcat caaaatcact cgcatcaacc 7500aaaccgttat tcattcgtga
ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa 7560ggacaattac aaacaggaat
cgaatgcaac cggcgcagga acactgccag cgcatcaaca 7620atattttcac ctgaatcagg
atattcttct aatacctgga atgctgtttt cccggggatc 7680gcagtggtga gtaaccatgc
atcatcagga gtacggataa aatgcttgat ggtcggaaga 7740ggcataaatt ccgtcagcca
gtttagtctg accatctcat ctgtaacatc attggcaacg 7800ctacctttgc catgtttcag
aaacaactct ggcgcatcgg gcttcccata caatcgatag 7860attgtcgcac ctgattgccc
gacattatcg cgagcccatt tatacccata taaatcagca 7920tccatgttgg aatttaatcg
cggcctcgag caagacgttt cccgttgaat atggctcata 7980acaccccttg tattactgtt
tatgtaagca gacagtttta ttgttcatga tgatatattt 8040ttatcttgtg caatgtaaca
tcagagattt tgagacacaa cgtggctttc cccccccccc 8100ctgcaggtcc cgagcctcac
ggcggcgagt gcgggggttc caagggggca gcgccacctt 8160gggcaaggcc gaaggccgcg
cagtcgatca acaagccccg gaggggccac tttttgccgg 8220agggggagcc gcgccgaagg
cgtgggggaa ccccgcaggg gtgcccttct ttgggcacca 8280aagaactaga tatagggcga
aatgcgaaag acttaaaaat caacaactta aaaaaggggg 8340gtacgcaaca gctcattgcg
gcaccccccg caatagctca ttgcgtaggt taaagaaaat 8400ctgtaattga ctgccacttt
tacgcaacgc ataattgttg tcgcgctgcc gaaaagttgc 8460agctgattgc gcatggtgcc
gcaaccgtgc ggcaccctac cgcatggaga taagcatggc 8520cacgcagtcc agagaaatcg
gcattcaagc caagaacaag cccggtcact gggtgcaaac 8580ggaacgcaaa gcgcatgagg
cgtgggccgg gcttattgcg aggaaaccca cggcggcaat 8640gctgctgcat cacctcgtgg
cgcagatggg ccaccagaac gccgtggtgg tcagccagaa 8700gacactttcc aagctcatcg
gacgttcttt gcggacggtc caatacgcag tcaaggactt 8760ggtggccgag cgctggatct
ccgtcgtgaa gctcaacggc cccggcaccg tgtcggccta 8820cgtggtcaat gaccgcgtgg
cgtggggcca gccccgcgac cagttgcgcc tgtcggtgtt 8880cagtgccgcc gtggtggttg
atcacgacga ccaggacgaa tcgctgttgg ggcatggcga 8940cctgcgccgc atcccgaccc
tgtatccggg cgagcagcaa ctaccgaccg gccccggcga 9000ggagccgccc agccagcccg
gcattccggg catggaacca gacctgccag ccttgaccga 9060aacggaggaa tgggaacggc
gcgggcagca gcgcctgccg atgcccgatg agccgtgttt 9120tctggacgat ggcgagccgt
tggagccgcc gacacgggtc acgctgccgc gccggtagca 9180cttgggttgc gcagcaaccc
gtaagtgcgc tgttccagac tatcggctgt agccgcctcg 9240ccgccctata ccttgtctgc
ctccccgcgt tgcgtcgcgg tgcatggagc cgggccacct 9300cgacctgaat ggaagcc
9317

SEQ ID NO:120 (length 60, DNA, Artificial Sequence, Primer)
accattgtgg cgatgggttg cttctacagc ctgaacgaga ggatcccttg cccgctgttg 60

SEQ ID NO:121 (length 60, DNA, Artificial Sequence, Primer)
ttacgggctt ctatctcttc cacaatgcgg acatacatct gaattcgcag gaccgtgata 60

SEQ ID NO:122 (length 41, DNA, Artificial Sequence, Primer)
caaccagttt accagcattc catttatgat gttctacggc a 41

SEQ ID NO:123 (length 41, DNA, Artificial Sequence, Primer)
tgccgtagaa catcataaat ggaatgctgg taaactggtt g 41

SEQ ID NO:124 (length 447, DNA, Artificial Sequence, Promoter/MCS/terminator insert)
aggaattccc taggcgatct gtgctgtttg ccacggtatg cagcaccagc gcgagattat
60gggctcgcac gctcgactgt cggacggggg cactggaacg agaagtcagg cgagccgtca
120cgcccttgac gatgccacat cctgagcaaa taattcaacc actaaacaaa tcaaccgcgt
180ttcccggagg taaccaagct tgcccggatc cgcatgcgcg gccgcgtcga ctctagttta
240aacccccggg tgatcgatag ctcttaatta agttgtttgc caatgtaatg ccgctgcacc
300caggcatcaa ataaaacgaa aggctcagtc gaaagactgg gcctttcgtt ttatctgttg
360tttgtcggtg aacgctctct actagagtca cactggctca ccttcgggtg ggcctttctg
420cgtttataca gctgtcggta ccgccag
447

SEQ ID NO:125 (length 1260, DNA, Artificial Sequence, codon optimized CDS from Citrobacter sp)
atgaaaatca atatgccgtt ctccaatgac aaataccgtt atagttcggg ctacctgctg
60ttcttcttcg ctgcgtggtc cctgtggtgg agtttctacg caatctggct gaaaaacaaa
120ctgggcctgt ccggcaccga actgggcatg ctgtatgctg ttaatcagtt tttctccatg
180ctgttcatgc tggtctacgg ctttctgcaa gataaactgg gcacccgtaa acatctgatt
240tggctgatgg gcattgtgat cacgctgtca ggtccgttcc tgatctatgt ttacgaaccg
300ctgctgacct cgaactttaa actgggcatg gcactgggtg ctattttctt tggtctgggt
360tatctggcag gttgcggcct ggtggaatct tttgtggaaa aagtttctcg taaattcaac
420ttcgaatttg gcaccgcacg tctgtggggc tctctgggtt acgcggccgg tacgttcgtt
480ggcggtattt tctttagcat caacccgcac attaattttt ggtgtgtctc tgtgatgggc
540gtcctgttcc tgctgatcaa cgtgctgttt aaaaccaata gtccggcacc gagctctgtg
600aaaacccgtt ccccggaacc ggatgctctg acgcgcaaag acttcctgac catctttaaa
660gatacgcagt tctggttttt cgttattttt gtggttggca cgtggagttt ctattccatc
720tacgaccagc aaatgttccc ggtgttttat gcgagcctgt ttgatgaccc ggaactggcc
780ccgcgtgttt atggttacct gaactctgtt caagtcttca tggaagcggt tggcatggcc
840ctggtcccgt ttctgattaa tcgtatcggt ccgaaaagcg cactgctgct gggcggcacc
900atcatggcat gccgcattct gggttcagct ctgtttacgg atatctacat catctcgctg
960atcaaaatgc tgcatgcgct ggaagtcccg ctgttcgtca tttcagtgtt caaattttcg
1020gtggccaact ttgacaaacg cctgagttcc accatttacc tgatcggctt taatatcgcg
1080tcatcgattg gtattatcgt gctgagtctg ccggttggca aactgttcga taaagttggt
1140tatcaggaaa tttttctgat catggccagc atcgtcatta tcaccctgat tttcggctac
1200tttagcctgt ctaaaaaaca tcaccagcaa aaaatgggta acgaactggt gacggaataa
1260

SEQ ID NO:126 (length 1338, DNA, Artificial Sequence, codon optimized CDS from Bifidobacterium longum)
atggcgtcag cgaccaaatc ggcgtggaaa aacccgtcct atctgcaatc
ctcattcggc 60atcttcatgt tcttctgttc gtggggcatt tggtggtcat ttttccagcg
ttggctgatc 120tcgggcgtgg gtctgacgaa cgccgaagtt ggcaccattt atagcatcaa
ttctctggca 180accctggtga ttatgtttgt gtacggcgtt attcaggatc aactgggtat
caaacgtaaa 240ctggttattg tggttagcgt catcgcggcc tgcgtgggtc cgtttgtcca
gttcgtgtat 300gcaccgatga ttctggcggg cggcaccacg cgttggatcg gtgctctgat
tggttcaatc 360gtgctgtcgg cgggctttat gagtggttgc tccctgttcg aagctgttac
cgaacgttat 420tctcgcaaat ttggcttcga atacggtcag agccgcgcct ggggctcttt
tggttatgca 480attgtggctc tgtgtgcggg ctttctgttc aacattaatc cgctgatcaa
cttttgggtt 540ggttcagcat tcggtccggg catgctgctg gtttacgctt tttgggtccc
ggcggaacaa 600aaagaagaac tgaaaaaaga aacggatccg aacgcagctc cgaccaatcc
gtcgctgaaa 660gaaatggttg cggtcctgaa aatgccgacg ctgtgggttc tgattgtctt
tatgctgctg 720accaacacgt tttataccgt gttcgaccag caaatgtttc cgacgtatta
cgctaacctg 780tttccgaccg aagaaatcgg caacgcgacc tacggcacgc tgaatggttt
tcaggttttc 840ctggaaagcg ccatgatggg tgtcgtgccg attatcatga agaaaattgg
cgttcgtaat 900gccctgctgc tgggtgcaac ggtcatgttt ctgcgcatcg gcctgtgcgg
tgtgttccat 960gatccggtta ccattagtat cgtcaaactg tttcactcca ttgaagtgcc
gctgttctgt 1020ctgccggcgt ttcgttattt caccctgcat tttgacacga aactgagcgc
caccctgtac 1080atggttggct tccagattgc aagccaagtg ggtcaagtta tcttttctac
gccgctgggc 1140gccttccacg ataaaatggc acaaatcctg ccgaacaatg acatgggtag
tcgtgtcacc 1200ttttgggtga tttccgctat cgtgctgtgt gcgctgattt atggcttttt
cgtcatcaaa 1260catgatgacc aggaagtggg cggtgatccg ttctacaccg acaaacaact
gcgccaaatg 1320gaagcggcca aagcgtaa
1338

SEQ ID NO:127 (length 37, DNA, Artificial Sequence, Primer)
gcaccatcat ggcatgcgcg attctgggtt cagctct 37

SEQ ID NO:128 (length 37, DNA, Artificial Sequence, Primer)
agagctgaac ccagaatcgc gcatgccatg atggtgc 37

SEQ ID NO:129 (length 36, DNA, Artificial Sequence, Primer)
caccatcatg gcatgcctga ttctgggttc agctct 36

SEQ ID NO:130 (length 36, DNA, Artificial Sequence, Primer)
agagctgaac ccagaatcag gcatgccatg atggtg 36

SEQ ID NO:131 (length 35, DNA, Artificial Sequence, Primer)
aacggtcatg tttctggcga tcggcctgtg cggtg 35

SEQ ID NO:132 (length 35, DNA, Artificial Sequence, Primer)
caccgcacag gccgatcgcc agaaacatga ccgtt 35

SEQ ID NO:133 (length 1476, DNA, Pseudomonas fluorescens)
atgcacgctg cactgttaga gcaagcgcat cgcgctattg aaaaaaaact gcctgggcgg
60ggtgatgtct atcgcctggc ctatcatctt gcgccgccgg tggggtggat gaatgacccg
120aacggtctgg tttattttcg cggcgagtac catgtgttct accaacatca tccctattcg
180gctcagtggg ggccgatgca ctggggccat gccaagagcc gtgacctggt gcactgggag
240cacctgccca tcgcgctggc gccgggcgag gcctatgacc gcgacggttg cttttcaggg
300tctgcggtgg tcatggacga cgtgttgtac ctgatttaca ccgggcatac ctggctgggt
360gcgcccggtg acgagcggag cattcgccag gttcagtgcc tggccagcag caccgacggg
420gttgcgttca gcaagcacgg gccggtgatc gatagggcgc ctgaaccggg catcatgcat
480tttcgcgacc ccaaggtatg gcggcgagga gagcaatggt ggatggccct gggggcgcgc
540caaggcgacg cccctcagct cctgctctat cgctcaggcg acctgcatca ctggacgtac
600ctcaggtgcg cactgcaagg gcaacgagag tcggacggct atatgtggga gtgtcctgac
660ctgttcgaac tcgatggctg tgatgtgttt ctctattcgc ctcaaggctt gaaccccagc
720ggttatgaca actggaacaa gttccagaac agctatcgga tgggcctgct ggacgatcgc
780ggatacttca gcgagggcgg tgagctgcgt gaactggatc atggtcacga tttctatgcg
840gcgcagacct tgctggcgcc agacgggcga cgcctgttgt gggcttggat ggacatgtgg
900gacagcccga tgccgagtca ggcgcaacac tggtgcggtg cgctgtcgct acctcgtgaa
960ctgagccgca atggcgaacg gctacgcatg cggccggccc gcgagttggc agcgctacgc
1020cagtcgcaac ggacactggc gatcggcgtg gtcgaatccg gcaattgcat actcgctgag
1080cgaggggcgc tgctggaatt cgaactgacc ctggacctgg ctggtagcac ggctgagcgt
1140ttcgggttgg cgctgcgttg tagtgaggat cggcaagagc ggaccctggt gtacttcgat
1200gcgatggcgc ggcgtctggt gctggacagg caacactcgg gagcgggggt aagcggtgcg
1260cgcagcgtgc cgatagccaa gggccaaatg cagatagcct tgcggatttt ccttgatcga
1320tcctccattg aggtgtttgt cgatgacgga gcctatagct tgagcagtcg gatctaccct
1380agccccgaca gcgtggcggt catggcgttt gcggtcaatg gtagcggtgg ttttggccaa
1440gcgtcggtct ggcacctggc cgatctgcac ctgtga
1476

SEQ ID NO:134 (length 491, PRT, Pseudomonas fluorescens)
Met His Ala Ala Leu Leu Glu Gln
Ala His Arg Ala Ile Glu Lys Lys 1 5 10
15 Leu Pro Gly Arg Gly Asp Val Tyr Arg Leu Ala Tyr His
Leu Ala Pro 20 25 30
Pro Val Gly Trp Met Asn Asp Pro Asn Gly Leu Val Tyr Phe Arg Gly
35 40 45 Glu Tyr His Val
Phe Tyr Gln His His Pro Tyr Ser Ala Gln Trp Gly 50
55 60 Pro Met His Trp Gly His Ala Lys
Ser Arg Asp Leu Val His Trp Glu 65 70
75 80 His Leu Pro Ile Ala Leu Ala Pro Gly Glu Ala Tyr
Asp Arg Asp Gly 85 90
95 Cys Phe Ser Gly Ser Ala Val Val Met Asp Asp Val Leu Tyr Leu Ile
100 105 110 Tyr Thr Gly
His Thr Trp Leu Gly Ala Pro Gly Asp Glu Arg Ser Ile 115
120 125 Arg Gln Val Gln Cys Leu Ala Ser
Ser Thr Asp Gly Val Ala Phe Ser 130 135
140 Lys His Gly Pro Val Ile Asp Arg Ala Pro Glu Pro Gly
Ile Met His 145 150 155
160 Phe Arg Asp Pro Lys Val Trp Arg Arg Gly Glu Gln Trp Trp Met Ala
165 170 175 Leu Gly Ala Arg
Gln Gly Asp Ala Pro Gln Leu Leu Leu Tyr Arg Ser 180
185 190 Gly Asp Leu His His Trp Thr Tyr Leu
Arg Cys Ala Leu Gln Gly Gln 195 200
205 Arg Glu Ser Asp Gly Tyr Met Trp Glu Cys Pro Asp Leu Phe
Glu Leu 210 215 220
Asp Gly Cys Asp Val Phe Leu Tyr Ser Pro Gln Gly Leu Asn Pro Ser 225
230 235 240 Gly Tyr Asp Asn Trp
Asn Lys Phe Gln Asn Ser Tyr Arg Met Gly Leu 245
250 255 Leu Asp Asp Arg Gly Tyr Phe Ser Glu Gly
Gly Glu Leu Arg Glu Leu 260 265
270 Asp His Gly His Asp Phe Tyr Ala Ala Gln Thr Leu Leu Ala Pro
Asp 275 280 285 Gly
Arg Arg Leu Leu Trp Ala Trp Met Asp Met Trp Asp Ser Pro Met 290
295 300 Pro Ser Gln Ala Gln His
Trp Cys Gly Ala Leu Ser Leu Pro Arg Glu 305 310
315 320 Leu Ser Arg Asn Gly Glu Arg Leu Arg Met Arg
Pro Ala Arg Glu Leu 325 330
335 Ala Ala Leu Arg Gln Ser Gln Arg Thr Leu Ala Ile Gly Val Val Glu
340 345 350 Ser Gly
Asn Cys Ile Leu Ala Glu Arg Gly Ala Leu Leu Glu Phe Glu 355
360 365 Leu Thr Leu Asp Leu Ala Gly
Ser Thr Ala Glu Arg Phe Gly Leu Ala 370 375
380 Leu Arg Cys Ser Glu Asp Arg Gln Glu Arg Thr Leu
Val Tyr Phe Asp 385 390 395
400 Ala Met Ala Arg Arg Leu Val Leu Asp Arg Gln His Ser Gly Ala Gly
405 410 415 Val Ser Gly
Ala Arg Ser Val Pro Ile Ala Lys Gly Gln Met Gln Ile 420
425 430 Ala Leu Arg Ile Phe Leu Asp Arg
Ser Ser Ile Glu Val Phe Val Asp 435 440
445 Asp Gly Ala Tyr Ser Leu Ser Ser Arg Ile Tyr Pro Ser
Pro Asp Ser 450 455 460
Val Ala Val Met Ala Phe Ala Val Asn Gly Ser Gly Gly Phe Gly Gln 465
470 475 480 Ala Ser Val Trp
His Leu Ala Asp Leu His Leu 485 490

SEQ ID NO:135 (length 1479, DNA, Bacillus licheniformis)
atgaacagaa ttcagcaggc agaagaagca
ttaaagaaag ccgggaaaaa agtgaatcgc 60cgttaccgaa tgggctatca catgatgccc
cgggcaaact ggataaatga tccaaacgga 120cttattcaat ataaagggga gtatcatgtc
ttttatcaac atcatccgta tgatgagaat 180tgggggccga tgcattgggg ccatttgaag
agcaaggatc ttattcactg ggagcacttg 240ccggttgctt tagcgccggg agacgaattt
gatgagagcg gctgtttctc aggaagcgca 300gtcgaatata acggcgacct cgctttaatc
tatactgggc ataatatgat agatgaagag 360aaagacgatt tctaccaaac tcagaatata
gcagtcagca aagacggtat cgtctttgaa 420aaactgaaag aaaaccctgt tattgcagag
ccgccggaag acagcgcacg tcactttcgc 480gatccaaaag tatggaagca tcgtgagaac
tggtatatgg tggtcggaaa ctcctcaaaa 540gagaacgtcg ggcgggtcat cttataccgc
tcgcctaact ttgtagattg ggagtacgta 600ggcgttctcg cccaaagcga cggaaatctc
ggctttatgt gggaatgtcc ggatttcttt 660gaactagacg gcaaacacat tttgctgatt
tcccctcagg gtatagaggc tgatggtgaa 720tcatatcaaa atctgtatca aacaggctat
ttgattggag actatgatga agaaacgaat 780gagtttgtac atggctcctt taaagagttg
gatcacggcc acgactttta tgccgtgcaa 840actttattgg atgacaaagg ccgcagaatt
gcgattggct ggatggatat gtgggagtca 900gagatgccga cgaaagcaga cggatggtgc
ggggcattaa ctttgccgcg tgaattgacg 960ttgaaggatg gtcacaaaat tttaatgaat
cccgtcgagg agactaaatt acttcgtgga 1020tcggaacatc atgagtgtga caatcaatcg
atttccggca gctattttat aaagacagcc 1080gaaaagcttc ttgaagtggt ggccgttttt
gatttgacaa tttgcagtgc cgaaacggtt 1140ggcttaaaga tccggggaat tgaacaggaa
gaaacaacca tcaagtacag cttgattgat 1200caaaagctga cgctcgactg ttcaaagtcc
ggcaaagcga gggacggtgt gagaaacgta 1260cggcttgaag cggatgagaa gctcactttg
catctgtttc tcgacagatc gtctattgaa 1320gtatttgcaa atcatggtga agcgacaatg
acaagccgca tatatccgaa ggaaggaaga 1380gcggggattg agctgttttc tgagaaaggc
aacgtacggg ttgaagaatt cacttactgg 1440acgttgaaag atatttggaa aggtgatgaa
gccaaatga 1479

SEQ ID NO:136 (length 492, PRT, Bacillus licheniformis)
Met Asn Arg Ile Gln Gln Ala Glu Glu Ala Leu Lys Lys Ala Gly Lys 1
5 10 15 Lys Val Asn Arg
Arg Tyr Arg Met Gly Tyr His Met Met Pro Arg Ala 20
25 30 Asn Trp Ile Asn Asp Pro Asn Gly Leu
Ile Gln Tyr Lys Gly Glu Tyr 35 40
45 His Val Phe Tyr Gln His His Pro Tyr Asp Glu Asn Trp Gly
Pro Met 50 55 60
His Trp Gly His Leu Lys Ser Lys Asp Leu Ile His Trp Glu His Leu 65
70 75 80 Pro Val Ala Leu Ala
Pro Gly Asp Glu Phe Asp Glu Ser Gly Cys Phe 85
90 95 Ser Gly Ser Ala Val Glu Tyr Asn Gly Asp
Leu Ala Leu Ile Tyr Thr 100 105
110 Gly His Asn Met Ile Asp Glu Glu Lys Asp Asp Phe Tyr Gln Thr
Gln 115 120 125 Asn
Ile Ala Val Ser Lys Asp Gly Ile Val Phe Glu Lys Leu Lys Glu 130
135 140 Asn Pro Val Ile Ala Glu
Pro Pro Glu Asp Ser Ala Arg His Phe Arg 145 150
155 160 Asp Pro Lys Val Trp Lys His Arg Glu Asn Trp
Tyr Met Val Val Gly 165 170
175 Asn Ser Ser Lys Glu Asn Val Gly Arg Val Ile Leu Tyr Arg Ser Pro
180 185 190 Asn Phe
Val Asp Trp Glu Tyr Val Gly Val Leu Ala Gln Ser Asp Gly 195
200 205 Asn Leu Gly Phe Met Trp Glu
Cys Pro Asp Phe Phe Glu Leu Asp Gly 210 215
220 Lys His Ile Leu Leu Ile Ser Pro Gln Gly Ile Glu
Ala Asp Gly Glu 225 230 235
240 Ser Tyr Gln Asn Leu Tyr Gln Thr Gly Tyr Leu Ile Gly Asp Tyr Asp
245 250 255 Glu Glu Thr
Asn Glu Phe Val His Gly Ser Phe Lys Glu Leu Asp His 260
265 270 Gly His Asp Phe Tyr Ala Val Gln
Thr Leu Leu Asp Asp Lys Gly Arg 275 280
285 Arg Ile Ala Ile Gly Trp Met Asp Met Trp Glu Ser Glu
Met Pro Thr 290 295 300
Lys Ala Asp Gly Trp Cys Gly Ala Leu Thr Leu Pro Arg Glu Leu Thr 305
310 315 320 Leu Lys Asp Gly
His Lys Ile Leu Met Asn Pro Val Glu Glu Thr Lys 325
330 335 Leu Leu Arg Gly Ser Glu His His Glu
Cys Asp Asn Gln Ser Ile Ser 340 345
350 Gly Ser Tyr Phe Ile Lys Thr Ala Glu Lys Leu Leu Glu Val
Val Ala 355 360 365
Val Phe Asp Leu Thr Ile Cys Ser Ala Glu Thr Val Gly Leu Lys Ile 370
375 380 Arg Gly Ile Glu Gln
Glu Glu Thr Thr Ile Lys Tyr Ser Leu Ile Asp 385 390
395 400 Gln Lys Leu Thr Leu Asp Cys Ser Lys Ser
Gly Lys Ala Arg Asp Gly 405 410
415 Val Arg Asn Val Arg Leu Glu Ala Asp Glu Lys Leu Thr Leu His
Leu 420 425 430 Phe
Leu Asp Arg Ser Ser Ile Glu Val Phe Ala Asn His Gly Glu Ala 435
440 445 Thr Met Thr Ser Arg Ile
Tyr Pro Lys Glu Gly Arg Ala Gly Ile Glu 450 455
460 Leu Phe Ser Glu Lys Gly Asn Val Arg Val Glu
Glu Phe Thr Tyr Trp 465 470 475
480 Thr Leu Lys Asp Ile Trp Lys Gly Asp Glu Ala Lys
485 490