Patent application title: BIOFUEL PRODUCTION
Inventors:
Yuki Kashiyama (Seattle, WA, US)
Yasuo Yoshikuni (Seattle, WA, US)
David Baker (Seattle, WA, US)
Justin B. Siegel (Seattle, WA, US)
Assignees:
BIO ARCHITECTURE LAB, INC.
University of Washington
IPC8 Class: AC12P704FI
USPC Class:
435165
Class name: Ethanol produced as by-product, or from waste, or from cellulosic material substrate; substrate contains cellulosic material
Publication date: 2009-06-18
Patent application number: 20090155873
Claims:
1. A method for converting a suitable monosaccharide or oligosaccharide to a commodity chemical, comprising: (a) contacting the suitable monosaccharide or oligosaccharide with a commodity chemical biosynthesis pathway, wherein the commodity chemical biosynthesis pathway comprises an aldehyde or ketone biosynthesis pathway, a C--C ligation pathway, and/or a dehydration and reduction pathway, thereby converting the suitable monosaccharide or oligosaccharide to the commodity chemical.
2. The method of claim 1, wherein the biomass is selected from marine biomass and vegetable/fruit/plant biomass.
3. The method of claim 2, wherein the marine biomass is selected from kelp, giant kelp, sargasso, seaweed, algae, marine microflora, microalgae, and sea grass.
4. The method of claim 2, wherein the vegetable/fruit/plant biomass comprises plant peel or pomace.
5. The method of claim 2, wherein the vegetable/fruit/plant biomass is selected from citrus, potato, tomato, grape, gooseberry, carrot, mango, sugar-beet, apple, switchgrass, wood, and stover.
6. The method of claim 1, wherein the suitable monosaccharide or oligosaccharide is obtained from a biomass-derived polysaccharide, wherein the polysaccharide is selected from alginate, agar, carrageenan, fucoidan, pectin, polygalacturonate, cellulose, hemicellulose, xylan, arabinan, and mannan.
7. The method of claim 1, wherein the suitable monosaccharide or oligosaccharide is selected from 2-keto-3-deoxy D-gluconate (KDG), guluronate, mannuronate, mannitol, lyxose, glycerol, xylitol, glucose, mannose, galactose, xylose, arabinose, glucuronate, galacturonates, rhamnose, and D-mannitol.
8. The method of claim 1, wherein the commodity chemical is selected from methane, methanol, ethane, ethene, ethanol, n-propane, 1-propene, 1-propanol, propanal, acetone, propionate, n-butane, 1-butene, 1-butanol, butanal, butanoate, isobutanal, isobutanol, 2-methylbutanal, 2-methylbutanol, 3-methylbutanal, 3-methylbutanol, 2-butene, 2-butanol, 2-butanone, 2,3-butanediol, 3-hydroxy-2-butanone, 2,3-butanedione, ethylbenzene, ethenylbenzene, 2-phenylethanol, phenylacetaldehyde, 1-phenylbutane, 4-phenyl-1-butene, 4-phenyl-2-butene, 1-phenyl-2-butene, 1-phenyl-2-butanol, 4-phenyl-2-butanol, 1-phenyl-2-butanone, 4-phenyl-2-butanone, 1-phenyl-2,3-butanediol, 1-phenyl-3-hydroxy-2-butanone, 4-phenyl-3-hydroxy-2-butanone, 1-phenyl-2,3-butanedione, n-pentane, ethylphenol, ethenylphenol, 2-(4-hydroxyphenyl)ethanol, 4-hydroxyphenylacetaldehyde, 1-(4-hydroxyphenyl)butane, 4-(4-hydroxyphenyl)-1-butene, 4-(4-hydroxyphenyl)-2-butene, 1-(4-hydroxyphenyl)-1-butene, 1-(4-hydroxyphenyl)-2-butanol, 4-(4-hydroxyphenyl)-2-butanol, 1-(4-hydroxyphenyl)-2-butanone, 4-(4-hydroxyphenyl)-2-butanone, 1-(4-hydroxyphenyl)-2,3-butanediol, 1-(4-hydroxyphenyl)-3-hydroxy-2-butanone, 4-(4-hydroxyphenyl)-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-2,3-butanedione, indolylethane, indolylethene, 2-(indole-3-)ethanol, n-pentane, 1-pentene, 1-pentanol, pentanal, pentanoate, 2-pentene, 2-pentanol, 3-pentanol, 2-pentanone, 3-pentanone, 4-methylpentanal, 4-methylpentanol, 2,3-pentanediol, 2-hydroxy-3-pentanone, 3-hydroxy-2-pentanone, 2,3-pentanedione, 2-methylpentane, 4-methyl-1-pentene, 4-methyl-2-pentene, 4-methyl-3-pentene, 4-methyl-2-pentanol, 2-methyl-3-pentanol, 4-methyl-2-pentanone, 2-methyl-3-pentanone, 4-methyl-2,3-pentanediol, 4-methyl-2-hydroxy-3-pentanone, 4-methyl-3-hydroxy-2-pentanone, 4-methyl-2,3-pentanedione, 1-phenylpentane, 1-phenyl-1-pentene, 1-phenyl-2-pentene, 1-phenyl-3-pentene, 1-phenyl-2-pentanol, 1-phenyl-3-pentanol, 1-phenyl-2-pentanone, 1-phenyl-3-pentanone,
1-phenyl-2,3-pentanediol, 1-phenyl-2-hydroxy-3-pentanone, 1-phenyl-3-hydroxy-2-pentanone, 1-phenyl-2,3-pentanedione, 4-methyl-1-phenylpentane, 4-methyl-1-phenyl-1-pentene, 4-methyl-1-phenyl-2-pentene, 4-methyl-1-phenyl-3-pentene, 4-methyl-1-phenyl-3-pentanol, 4-methyl-1-phenyl-2-pentanol, 4-methyl-1-phenyl-3-pentanone, 4-methyl-1-phenyl-2-pentanone, 4-methyl-1-phenyl-2,3-pentanediol, 4-methyl-1-phenyl-2,3-pentanedione, 4-methyl-1-phenyl-3-hydroxy-2-pentanone, 4-methyl-1-phenyl-2-hydroxy-3-pentanone, 1-(4-hydroxyphenyl) pentane, 1-(4-hydroxyphenyl)-1-pentene, 1-(4-hydroxyphenyl)-2-pentene, 1-(4-hydroxyphenyl)-3-pentene, 1-(4-hydroxyphenyl)-2-pentanol, 1-(4-hydroxyphenyl)-3-pentanol, 1-(4-hydroxyphenyl)-2-pentanone, 1-(4-hydroxyphenyl)-3-pentanone, 1-(4-hydroxyphenyl)-2,3-pentanediol, 1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone, 1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone, 1-(4-hydroxyphenyl)-2,3-pentanedione, 4-methyl-1-(4-hydroxyphenyl) pentane, 4-methyl-1-(4-hydroxyphenyl)-2-pentene, 4-methyl-1-(4-hydroxyphenyl)-3-pentene, 4-methyl-1-(4-hydroxyphenyl)-1-pentene, 4-methyl-1-(4-hydroxyphenyl)-3-pentanol, 4-methyl-1-(4-hydroxyphenyl)-2-pentanol, 4-methyl-1-(4-hydroxyphenyl)-3-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2,3-pentanediol, 4-methyl-1-(4-hydroxyphenyl)-2,3-pentanedione, 4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone, 1-indole-3-pentane, 1-(indole-3)-1-pentene, 1-(indole-3)-2-pentene, 1-(indole-3)-3-pentene, 1-(indole-3)-2-pentanol, 1-(indole-3)-3-pentanol, 1-(indole-3)-2-pentanone, 1-(indole-3)-3-pentanone, 1-(indole-3)-2,3-pentanediol, 1-(indole-3)-2-hydroxy-3-pentanone, 1-(indole-3)-3-hydroxy-2-pentanone, 1-(indole-3)-2,3-pentanedione, 4-methyl-1-(indole-3-)pentane, 4-methyl-1-(indole-3)-2-pentene, 4-methyl-1-(indole-3)-3-pentene, 4-methyl-1-(indole-3)-1-pentene, 4-methyl-2-(indole-3)-3-pentanol, 4-methyl-1-(indole-3)-2-pentanol, 4-methyl-1-(indole-3)-3-pentanone, 
4-methyl-1-(indole-3)-2-pentanone, 4-methyl-1-(indole-3)-2,3-pentanediol, 4-methyl-1-(indole-3)-2,3-pentanedione, 4-methyl-1-(indole-3)-3-hydroxy-2-pentanone, 4-methyl-1-(indole-3)-2-hydroxy-3-pentanone, n-hexane, 1-hexene, 1-hexanol, hexanal, hexanoate, 2-hexene, 3-hexene, 2-hexanol, 3-hexanol, 2-hexanone, 3-hexanone, 2,3-hexanediol, 2,3-hexanedione, 3,4-hexanediol, 3,4-hexanedione, 2-hydroxy-3-hexanone, 3-hydroxy-2-hexanone, 3-hydroxy-4-hexanone, 4-hydroxy-3-hexanone, 2-methylhexane, 3-methylhexane, 2-methyl-2-hexene, 2-methyl-3-hexene, 5-methyl-1-hexene, 5-methyl-2-hexene, 4-methyl-1-hexene, 4-methyl-2-hexene, 3-methyl-3-hexene, 3-methyl-2-hexene, 3-methyl-1-hexene, 2-methyl-3-hexanol, 5-methyl-2-hexanol, 5-methyl-3-hexanol, 2-methyl-3-hexanone, 5-methyl-2-hexanone, 5-methyl-3-hexanone, 2-methyl-3,4-hexanediol, 2-methyl-3,4-hexanedione, 5-methyl-2,3-hexanediol, 5-methyl-2,3-hexanedione, 4-methyl-2,3-hexanediol, 4-methyl-2,3-hexanedione, 2-methyl-3-hydroxy-4-hexanone, 2-methyl-4-hydroxy-3-hexanone, 5-methyl-2-hydroxy-3-hexanone, 5-methyl-3-hydroxy-2-hexanone, 4-methyl-2-hydroxy-3-hexanone, 4-methyl-3-hydroxy-2-hexanone, 2,5-dimethylhexane, 2,5-dimethyl-2-hexene, 2,5-dimethyl-3-hexene, 2,5-dimethyl-3-hexanol, 2,5-dimethyl-3-hexanone, 2,5-dimethyl-3,4-hexanediol, 2,5-dimethyl-3,4-hexanedione, 2,5-dimethyl-3-hydroxy-4-hexanone, 5-methyl-1-phenylhexane, 4-methyl-1-phenylhexane, 5-methyl-1-phenyl-1-hexene, 5-methyl-1-phenyl-2-hexene, 5-methyl-1-phenyl-3-hexene, 4-methyl-1-phenyl-1-hexene, 4-methyl-1-phenyl-2-hexene, 4-methyl-1-phenyl-3-hexene, 5-methyl-1-phenyl-2-hexanol, 5-methyl-1-phenyl-3-hexanol, 4-methyl-1-phenyl-2-hexanol, 4-methyl-1-phenyl-3-hexanol, 5-methyl-1-phenyl-2-hexanone, 5-methyl-1-phenyl-3-hexanone, 4-methyl-1-phenyl-2-hexanone, 4-methyl-1-phenyl-3-hexanone, 5-methyl-1-phenyl-2,3-hexanediol, 4-methyl-1-phenyl-2,3-hexanediol, 5-methyl-1-phenyl-3-hydroxy-2-hexanone, 5-methyl-1-phenyl-2-hydroxy-3-hexanone, 4-methyl-1-phenyl-3-hydroxy-2-hexanone, 
4-methyl-1-phenyl-2-hydroxy-3-hexanone, 5-methyl-1-phenyl-2,3-hexanedione, 4-methyl-1-phenyl-2,3-hexanedione, 4-methyl-1-(4-hydroxyphenyl)hexane, 5-methyl-1-(4-hydroxyphenyl)-1-hexene, 5-methyl-1-(4-hydroxyphenyl)-2-hexene, 5-methyl-1-(4-hydroxyphenyl)-3-hexene, 4-methyl-1-(4-hydroxyphenyl)-1-hexene, 4-methyl-1-(4-hydroxyphenyl)-2-hexene, 4-methyl-1-(4-hydroxyphenyl)-3-hexene, 5-methyl-1-(4-hydroxyphenyl)-2-hexanol, 5-methyl-1-(4-hydroxyphenyl)-3-hexanol, 4-methyl-1-(4-hydroxyphenyl)-2-hexanol, 4-methyl-1-(4-hydroxyphenyl)-3-hexanol, 5-methyl-1-(4-hydroxyphenyl)-2-hexanone, 5-methyl-1-(4-hydroxyphenyl)-3-hexanone, 4-methyl-1-(4-hydroxyphenyl)-2-hexanone, 4-methyl-1-(4-hydroxyphenyl)-3-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol, 4-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol, 5-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone, 4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone, 4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione, 4-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione, 4-methyl-1-(indole-3-)hexane, 5-methyl-1-(indole-3)-1-hexene, 5-methyl-1-(indole-3)-2-hexene, 5-methyl-1-(indole-3)-3-hexene, 4-methyl-1-(indole-3)-1-hexene, 4-methyl-1-(indole-3)-2-hexene, 4-methyl-1-(indole-3)-3-hexene, 5-methyl-1-(indole-3)-2-hexanol, 5-methyl-1-(indole-3)-3-hexanol, 4-methyl-1-(indole-3)-2-hexanol, 4-methyl-1-(indole-3)-3-hexanol, 5-methyl-1-(indole-3)-2-hexanone, 5-methyl-1-(indole-3)-3-hexanone, 4-methyl-1-(indole-3)-2-hexanone, 4-methyl-1-(indole-3)-3-hexanone, 5-methyl-1-(indole-3)-2,3-hexanediol, 4-methyl-1-(indole-3)-2,3-hexanediol, 5-methyl-1-(indole-3)-3-hydroxy-2-hexanone, 5-methyl-1-(indole-3)-2-hydroxy-3-hexanone, 4-methyl-1-(indole-3)-3-hydroxy-2-hexanone, 4-methyl-1-(indole-3)-2-hydroxy-3-hexanone, 5-methyl-1-(indole-3)-2,3-hexanedione, 4-methyl-1-(indole-3)-2,3-hexanedione, n-heptane, 1-heptene, 1-heptanol, heptanal, heptanoate, 2-heptene, 3-heptene, 2-heptanol, 
3-heptanol, 4-heptanol, 2-heptanone, 3-heptanone, 4-heptanone, 2,3-heptanediol, 2,3-heptanedione, 3,4-heptanediol, 3,4-heptanedione, 2-hydroxy-3-heptanone, 3-hydroxy-2-heptanone, 3-hydroxy-4-heptanone, 4-hydroxy-3-heptanone, 2-methylheptane, 3-methylheptane, 6-methyl-2-heptene, 6-methyl-3-heptene, 2-methyl-3-heptene, 2-methyl-2-heptene, 5-methyl-2-heptene, 5-methyl-3-heptene, 3-methyl-3-heptene, 2-methyl-3-heptanol, 2-methyl-4-heptanol, 6-methyl-3-heptanol, 5-methyl-3-heptanol, 3-methyl-4-heptanol, 2-methyl-3-heptanone, 2-methyl-4-heptanone, 6-methyl-3-heptanone, 5-methyl-3-heptanone, 3-methyl-4-heptanone, 2-methyl-3,4-heptanediol, 2-methyl-3,4-heptanedione, 6-methyl-3,4-heptanediol, 6-methyl-3,4-heptanedione, 5-methyl-3,4-heptanediol, 5-methyl-3,4-heptanedione, 2-methyl-3-hydroxy-4-heptanone, 2-methyl-4-hydroxy-3-heptanone, 6-methyl-3-hydroxy-4-heptanone, 6-methyl-4-hydroxy-3-heptanone, 5-methyl-3-hydroxy-4-heptanone, 5-methyl-4-hydroxy-3-heptanone, 2,6-dimethylheptane, 2,5-dimethylheptane, 2,6-dimethyl-2-heptene, 2,6-dimethyl-3-heptene, 2,5-dimethyl-2-heptene, 2,5-dimethyl-3-heptene, 3,6-dimethyl-3-heptene, 2,6-dimethyl-3-heptanol, 2,6-dimethyl-4-heptanol, 2,5-dimethyl-3-heptanol, 2,5-dimethyl-4-heptanol, 2,6-dimethyl-3,4-heptanediol, 2,6-dimethyl-3,4-heptanedione, 2,5-dimethyl-3,4-heptanediol, 2,5-dimethyl-3,4-heptanedione, 2,6-dimethyl-3-hydroxy-4-heptanone, 2,6-dimethyl-4-hydroxy-3-heptanone, 2,5-dimethyl-3-hydroxy-4-heptanone, 2,5-dimethyl-4-hydroxy-3-heptanone, n-octane, 1-octene, 2-octene, 1-octanol, octanal, octanoate, 3-octene, 4-octene, 4-octanol, 4-octanone, 4,5-octanediol, 4,5-octanedione, 4-hydroxy-5-octanone, 2-methyloctane, 2-methyl-3-octene, 2-methyl-4-octene, 7-methyl-3-octene, 3-methyl-3-octene, 3-methyl-4-octene, 6-methyl-3-octene, 2-methyl-4-octanol, 7-methyl-4-octanol, 3-methyl-4-octanol, 6-methyl-4-octanol, 2-methyl-4-octanone, 7-methyl-4-octanone, 3-methyl-4-octanone, 6-methyl-4-octanone, 2-methyl-4,5-octanediol, 2-methyl-4,5-octanedione, 
3-methyl-4,5-octanediol, 3-methyl-4,5-octanedione, 2-methyl-4-hydroxy-5-octanone, 2-methyl-5-hydroxy-4-octanone, 3-methyl-4-hydroxy-5-octanone, 3-methyl-5-hydroxy-4-octanone, 2,7-dimethyloctane, 2,7-dimethyl-3-octene, 2,7-dimethyl-4-octene, 2,7-dimethyl-4-octanol, 2,7-dimethyl-4-octanone, 2,7-dimethyl-4,5-octanediol, 2,7-dimethyl-4,5-octanedione, 2,7-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyloctane, 2,6-dimethyl-3-octene, 2,6-dimethyl-4-octene, 3,7-dimethyl-3-octene, 2,6-dimethyl-4-octanol, 3,7-dimethyl-4-octanol, 2,6-dimethyl-4-octanone, 3,7-dimethyl-4-octanone, 2,6-dimethyl-4,5-octanediol, 2,6-dimethyl-4,5-octanedione, 2,6-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyl-5-hydroxy-4-octanone, 3,6-dimethyloctane, 3,6-dimethyl-3-octene, 3,6-dimethyl-4-octene, 3,6-dimethyl-4-octanol, 3,6-dimethyl-4-octanone, 3,6-dimethyl-4,5-octanediol, 3,6-dimethyl-4,5-octanedione, 3,6-dimethyl-4-hydroxy-5-octanone, n-nonane, 1-nonene, 1-nonanol, nonanal, nonanoate, 2-methylnonane, 2-methyl-4-nonene, 2-methyl-5-nonene, 8-methyl-4-nonene, 2-methyl-5-nonanol, 8-methyl-4-nonanol, 2-methyl-5-nonanone, 8-methyl-4-nonanone, 8-methyl-4,5-nonanediol, 8-methyl-4,5-nonanedione, 8-methyl-4-hydroxy-5-nonanone, 8-methyl-5-hydroxy-4-nonanone, 2,8-dimethylnonane, 2,8-dimethyl-3-nonene, 2,8-dimethyl-4-nonene, 2,8-dimethyl-5-nonene, 2,8-dimethyl-4-nonanol, 2,8-dimethyl-5-nonanol, 2,8-dimethyl-4-nonanone, 2,8-dimethyl-5-nonanone, 2,8-dimethyl-4,5-nonanediol, 2,8-dimethyl-4,5-nonanedione, 2,8-dimethyl-4-hydroxy-5-nonanone, 2,8-dimethyl-5-hydroxy-4-nonanone, 2,7-dimethylnonane, 3,8-dimethyl-3-nonene, 3,8-dimethyl-4-nonene, 3,8-dimethyl-5-nonene, 3,8-dimethyl-4-nonanol, 3,8-dimethyl-5-nonanol, 3,8-dimethyl-4-nonanone, 3,8-dimethyl-5-nonanone, 3,8-dimethyl-4,5-nonanediol, 3,8-dimethyl-4,5-nonanedione, 3,8-dimethyl-4-hydroxy-5-nonanone, 3,8-dimethyl-5-hydroxy-4-nonanone, n-decane, 1-decene, 1-decanol, decanoate, 2,9-dimethyldecane, 2,9-dimethyl-3-decene, 2,9-dimethyl-4-decene, 2,9-dimethyl-5-decanol, 
2,9-dimethyl-5-decanone, 2,9-dimethyl-5,6-decanediol, 2,9-dimethyl-6-hydroxy-5-decanone, 2,9-dimethyl-5,6-decanedione, n-undecane, 1-undecene, 1-undecanol, undecanal, undecanoate, n-dodecane, 1-dodecene, 1-dodecanol, dodecanal, dodecanoate, n-tridecane, 1-tridecene, 1-tridecanol, tridecanal, tridecanoate, n-tetradecane, 1-tetradecene, 1-tetradecanol, tetradecanal, tetradecanoate, n-pentadecane, 1-pentadecene, 1-pentadecanol, pentadecanal, pentadecanoate, n-hexadecane, 1-hexadecene, 1-hexadecanol, hexadecanal, hexadecanoate, n-heptadecane, 1-heptadecene, 1-heptadecanol, heptadecanal, heptadecanoate, n-octadecane, 1-octadecene, 1-octadecanol, octadecanal, octadecanoate, n-nonadecane, 1-nonadecene, 1-nonadecanol, nonadecanal, nonadecanoate, eicosane, 1-eicosene, 1-eicosanol, eicosanal, eicosanoate, 3-hydroxy propanal, 1,3-propanediol, 4-hydroxybutanal, 1,4-butanediol, 3-hydroxy-2-butanone, 2,3-butanediol, 1,5-pentanediol, homocitrate, homoisocitrate, b-hydroxy adipate, glutarate, glutarsemialdehyde, glutaraldehyde, 2-hydroxy-1-cyclopentanone, 1,2-cyclopentanediol, cyclopentanone, cyclopentanol, (S)-2-acetolactate, (R)-2,3-Dihydroxy-isovalerate, 2-oxoisovalerate, isobutyryl-CoA, isobutyrate, isobutyraldehyde, 5-amino pentaldehyde, 1,10-diaminodecane, 1,10-diamino-5-decene, 1,10-diamino-5-hydroxydecane, 1,10-diamino-5-decanone, 1,10-diamino-5,6-decanediol, 1,10-diamino-6-hydroxy-5-decanone, phenylacetoaldehyde, 1,4-diphenylbutane, 1,4-diphenyl-1-butene, 1,4-diphenyl-2-butene, 1,4-diphenyl-2-butanol, 1,4-diphenyl-2-butanone, 1,4-diphenyl-2,3-butanediol, 1,4-diphenyl-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-4-phenylbutane, 1-(4-hydroxyphenyl)-4-phenyl-1-butene, 1-(4-hydroxyphenyl)-4-phenyl-2-butene, 1-(4-hydroxyphenyl)-4-phenyl-2-butanol, 1-(4-hydroxyphenyl)-4-phenyl-2-butanone, 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol, 1-(4-hydroxyphenyl)-4-phenyl-3-hydroxy-2-butanone, 1-(indole-3)-4-phenylbutane,
1-(indole-3)-4-phenyl-1-butene, 1-(indole-3)-4-phenyl-2-butene, 1-(indole-3)-4-phenyl-2-butanol, 1-(indole-3)-4-phenyl-2-butanone, 1-(indole-3)-4-phenyl-2,3-butanediol, 1-(indole-3)-4-phenyl-3-hydroxy-2-butanone, 4-hydroxyphenylacetoaldehyde, 1,4-di(4-hydroxyphenyl)butane, 1,4-di(4-hydroxyphenyl)-1-butene, 1,4-di(4-hydroxyphenyl)-2-butene, 1,4-di(4-hydroxyphenyl)-2-butanol, 1,4-di(4-hydroxyphenyl)-2-butanone, 1,4-di(4-hydroxyphenyl)-2,3-butanediol, 1,4-di(4-hydroxyphenyl)-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-4-(indole-3-)butane, 1-(4-hydroxyphenyl)-4-(indole-3)-1-butene, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butene,
1-(4-hydroxyphenyl)-4-(indole-3)-2-butanol, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butanone, 1-(4-hydroxyphenyl)-4-(indole-3)-2,3-butanediol, 1-(4-hydroxyphenyl)-4-(indole-3)-3-hydroxy-2-butanone, indole-3-acetoaldehyde, 1,4-di(indole-3-)butane, 1,4-di(indole-3)-1-butene, 1,4-di(indole-3)-2-butene, 1,4-di(indole-3)-2-butanol, 1,4-di(indole-3)-2-butanone, 1,4-di(indole-3)-2,3-butanediol, 1,4-di(indole-3)-3-hydroxy-2-butanone, succinate semialdehyde, hexane-1,8-dicarboxylic acid, 3-hexene-1,8-dicarboxylic acid, 3-hydroxy-hexane-1,8-dicarboxylic acid, 3-hexanone-1,8-dicarboxylic acid, 3,4-hexanediol-1,8-dicarboxylic acid, 4-hydroxy-3-hexanone-1,8-dicarboxylic acid, fucoidan, iodine, chlorophyll, carotenoid, calcium, magnesium, iron, sodium, potassium, and phosphate.
9. A method for converting a suitable monosaccharide or oligosaccharide to a commodity chemical, comprising: (b) contacting the suitable monosaccharide or oligosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide or oligosaccharide to the commodity chemical, wherein the microbial system comprises: (i) one or more genes encoding and expressing a biosynthesis pathway; (ii) one or more genes encoding and expressing a C--C ligation pathway; and (iii) a reduction and dehydration pathway, comprising one or more genes encoding and expressing an enzyme selected from a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase, thereby converting the suitable monosaccharide or oligosaccharide to the commodity chemical.
10. The method of claim 9, wherein the biosynthesis pathway is selected from (a) an aldehyde biosynthesis pathway, (b) a ketone synthesis pathway, and (c) both (a) and (b).
11. The method of claim 9, wherein the biosynthesis pathway comprises an acetoaldehyde biosynthesis pathway, and wherein the acetoaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to an acetoaldehyde.
12. The method of claim 9, wherein the biosynthesis pathway comprises a propionaldehyde biosynthesis pathway, and wherein the propionaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a propionaldehyde.
13. The method of claim 9, wherein the biosynthesis pathway comprises a butyraldehyde biosynthesis pathway, and wherein the butyraldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a butyraldehyde.
14. The method of claim 9, wherein the biosynthesis pathway comprises an isobutyraldehyde biosynthesis pathway, and wherein the isobutyraldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to an isobutyraldehyde.
15. The method of claim 9, wherein the biosynthesis pathway comprises a 2-methylbutyraldehyde biosynthesis pathway, and wherein the 2-methylbutyraldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 2-methylbutyraldehyde.
16. The method of claim 9, wherein the biosynthesis pathway comprises a 3-methylbutyraldehyde biosynthesis pathway, and wherein the 3-methylbutyraldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 3-methylbutyraldehyde.
17. The method of claim 9, wherein the biosynthesis pathway comprises a 4-methylpentaldehyde biosynthesis pathway, and wherein the 4-methylpentaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 4-methylpentaldehyde.
18. The method of claim 9, wherein the biosynthesis pathway comprises a phenylacetaldehyde biosynthesis pathway, and wherein the phenylacetaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a phenylacetaldehyde.
19. The method of claim 9, wherein the biosynthesis pathway comprises a 5-amino pentaldehyde biosynthesis pathway, and wherein the 5-amino pentaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 5-amino pentaldehyde.
20. The method of claim 9, wherein the biosynthesis pathway comprises a 2-(4-hydroxyphenyl)acetaldehyde biosynthesis pathway, and wherein the 2-(4-hydroxyphenyl)acetaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 2-(4-hydroxyphenyl)acetaldehyde.
21. The method of claim 9, wherein the biosynthesis pathway comprises a 2-(Indole-3-)acetaldehyde biosynthesis pathway, and wherein the 2-(Indole-3-)acetaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 2-(Indole-3-)acetaldehyde.
22. The method of claim 9, wherein the C--C ligation pathway comprises at least one enzyme selected from an acetoaldehyde lyase, a propionaldehyde lyase, a butyraldehyde lyase, an isobutyraldehyde lyase, a 2-methyl-butyraldehyde lyase, a 3-methyl-butyraldehyde lyase, a phenylacetaldehyde lyase, an oxaloacetate decarboxylase, an α-keto glutarate decarboxylase, an α-keto adipate decarboxylase, a pentaldehyde lyase, a 4-methyl-pentaldehyde lyase, a hexaldehyde lyase, a heptaldehyde lyase, an octaldehyde lyase, a 4-hydroxyphenylacetaldehyde lyase, an indoleacetaldehyde lyase, an indolephenylacetaldehyde lyase, a benzaldehyde lyase, a pyruvate decarboxylase, a benzformate lyase, and a 2-keto isovalerate decarboxylase.
23. The method of claim 9, wherein the C--C ligation pathway comprises a C--C ligase or an optimized C--C ligase.
24. The method of claim 23, wherein the C--C ligase or optimized C--C ligase comprises at least one enzymatic activity selected from an acetoaldehyde lyase activity, a propionaldehyde lyase activity, a butyraldehyde lyase activity, an isobutyraldehyde lyase activity, a 2-methyl-butyraldehyde lyase activity, a 3-methyl-butyraldehyde lyase activity, a phenylacetaldehyde lyase activity, an oxaloacetate decarboxylase activity, an α-keto glutarate decarboxylase activity, an α-keto adipate decarboxylase activity, a pentaldehyde lyase activity, a 4-methyl-pentaldehyde lyase activity, a hexaldehyde lyase activity, a heptaldehyde lyase activity, an octaldehyde lyase activity, a 4-hydroxyphenylacetaldehyde lyase activity, an indoleacetaldehyde lyase activity, an indolephenylacetaldehyde lyase activity, a benzaldehyde lyase activity, a pyruvate decarboxylase activity, a benzformate lyase activity, and a 2-keto isovalerate decarboxylase activity.
25. The method of claim 9, wherein the C--C ligation pathway comprises a benzaldehyde lyase, or a biologically active variant or fragment thereof.
26. The method of claim 25, wherein the benzaldehyde lyase is derived from Pseudomonas fluorescens.
27. The method of claim 26, wherein the benzaldehyde lyase comprises a polypeptide having an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NO: 144.
28. The method of claim 27, wherein the amino acid sequence of the benzaldehyde lyase comprises one or more conserved residues selected from G27, E50, A57, G155, P162, P234, D271, G277, G422, G447, D448, and G512.
29. The method of claim 9, wherein the dehydration and reduction pathway comprises a diol dehydrogenase selected from 2,3-butanediol dehydrogenase, 3,4-hexanediol dehydrogenase, 4,5-octanediol dehydrogenase, 5,6-decanediol dehydrogenase, 6,7-dodecanediol dehydrogenase, 7,8-tetradecanediol dehydrogenase, 8,9-hexadecanediol dehydrogenase, 2,5-dimethyl-3,4-hexanediol dehydrogenase, 3,6-dimethyl-4,5-octanediol dehydrogenase, 2,7-dimethyl-4,5-octanediol dehydrogenase, 2,9-dimethyl-5,6-decanediol dehydrogenase, 1,4-diphenyl-2,3-butanediol dehydrogenase, bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1,4-diindole-2,3-butanediol dehydrogenase, 1,2-cyclopentanediol dehydrogenase, 2,3-pentanediol dehydrogenase, 2,3-hexanediol dehydrogenase, 2,3-heptanediol dehydrogenase, 2,3-octanediol dehydrogenase, 2,3-nonanediol dehydrogenase, 4-methyl-2,3-pentanediol dehydrogenase, 4-methyl-2,3-hexanediol dehydrogenase, 5-methyl-2,3-hexanediol dehydrogenase, 6-methyl-2,3-heptanediol dehydrogenase, 1-phenyl-2,3-butanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1-indole-2,3-butanediol dehydrogenase, 3,4-heptanediol dehydrogenase, 3,4-octanediol dehydrogenase, 3,4-nonanediol dehydrogenase, 3,4-decanediol dehydrogenase, 3,4-undecanediol dehydrogenase, 2-methyl-3,4-hexanediol dehydrogenase, 5-methyl-3,4-heptanediol dehydrogenase, 6-methyl-3,4-heptanediol dehydrogenase, 7-methyl-3,4-octanediol dehydrogenase, 1-phenyl-2,3-pentanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-pentanediol dehydrogenase, 1-indole-2,3-pentanediol dehydrogenase, 4,5-nonanediol dehydrogenase, 4,5-decanediol dehydrogenase, 4,5-undecanediol dehydrogenase, 4,5-dodecanediol dehydrogenase, 2-methyl-3,4-heptanediol dehydrogenase, 3-methyl-4,5-octanediol dehydrogenase, 2-methyl-4,5-octanediol dehydrogenase, 8-methyl-4,5-nonanediol dehydrogenase, 1-phenyl-2,3-hexanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-hexanediol dehydrogenase, 1-indole-2,3-hexanediol dehydrogenase, 5,6-undecanediol 
dehydrogenase, 5,6-undecanediol dehydrogenase, 5,6-tridecanediol dehydrogenase, 2-methyl-3,4-octanediol dehydrogenase, 3-methyl-4,5-nonanediol dehydrogenase, 2-methyl-4,5-nonanediol dehydrogenase, 2-methyl-5,6-decanediol dehydrogenase, 1-phenyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-heptanediol dehydrogenase, 1-indole-2,3-heptanediol dehydrogenase, 6,7-tridecanediol dehydrogenase, 6,7-tetradecanediol dehydrogenase, 2-methyl-3,4-nonanediol dehydrogenase, 3-methyl-4,5-decanediol dehydrogenase, 2-methyl-4,5-decanediol dehydrogenase, 2-methyl-5,6-undecanediol dehydrogenase, 1-phenyl-2,3-octanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-octanediol dehydrogenase, 1-indole-2,3-octanediol dehydrogenase, 7,8-pentadecanediol dehydrogenase, 2-methyl-3,4-decanediol dehydrogenase, 3-methyl-4,5-undecanediol dehydrogenase, 2-methyl-4,5-undecanediol dehydrogenase, 2-methyl-5,6-dodecanediol dehydrogenase, 1-phenyl-2,3-nonanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-nonanediol dehydrogenase, 1-indole-2,3-nonanediol dehydrogenase, 2-methyl-3,4-undecanediol dehydrogenase, 3-methyl-4,5-dodecanediol dehydrogenase, 2-methyl-4,5-dodecanediol dehydrogenase, 2-methyl-5,6-tridecanediol dehydrogenase, 1-phenyl-2,3-decanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-decanediol dehydrogenase, 1-indole-2,3-decanediol dehydrogenase, 2,5-dimethyl-3,4-heptanediol dehydrogenase, 2,6-dimethyl-3,4-heptanediol dehydrogenase, 2,7-dimethyl-3,4-octanediol dehydrogenase, 1-phenyl-4-methyl-2,3-pentanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydrogenase, 1-indole-4-methyl-2,3-pentanediol dehydrogenase, 2,6-dimethyl-4,5-octanediol dehydrogenase, 3,8-dimethyl-4,5-nonanediol dehydrogenase, 1-phenyl-4-methyl-2,3-hexanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydrogenase, 1-indole-4-methyl-2,3-hexanediol dehydrogenase, 2,8-dimethyl-4,5-nonanediol dehydrogenase, 1-phenyl-5-methyl-2,3-hexanediol dehydrogenase, 
1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydrogenase, 1-indole-5-methyl-2,3-hexanediol dehydrogenase, 1-phenyl-6-methyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydrogenase, 1-indole-6-methyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydrogenase, 1-indole-4-phenyl-2,3-butanediol dehydrogenase, 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1,10-diamino-5,6-decanediol dehydrogenase, 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, and 2,3-hexanediol-1,6-dicarboxylic acid dehydrogenase.
30. The method of claim 9, wherein the diol dehydrogenase comprises a polypeptide having an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NOS:98, 100, or 102.
31. The method of claim 9, wherein the dehydration and reduction pathway comprises a diol dehydratase selected from 2,3-butanediol dehydratase, 3,4-hexanediol dehydratase, 4,5-octanediol dehydratase, 5,6-decanediol dehydratase, 6,7-dodecanediol dehydratase, 7,8-tetradecanediol dehydratase, 8,9-hexadecanediol dehydratase, 2,5-dimethyl-3,4-hexanediol dehydratase, 3,6-dimethyl-4,5-octanediol dehydratase, 2,7-dimethyl-4,5-octanediol dehydratase, 2,9-dimethyl-5,6-decanediol dehydratase, 1,4-diphenyl-2,3-butanediol dehydratase, bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1,4-diindole-2,3-butanediol dehydratase, 1,2-cyclopentanediol dehydratase, 2,3-pentanediol dehydratase, 2,3-hexanediol dehydratase, 2,3-heptanediol dehydratase, 2,3-octanediol dehydratase, 2,3-nonanediol dehydratase, 4-methyl-2,3-pentanediol dehydratase, 4-methyl-2,3-hexanediol dehydratase, 5-methyl-2,3-hexanediol dehydratase, 6-methyl-2,3-heptanediol dehydratase, 1-phenyl-2,3-butanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1-indole-2,3-butanediol dehydratase, 3,4-heptanediol dehydratase, 3,4-octanediol dehydratase, 3,4-nonanediol dehydratase, 3,4-decanediol dehydratase, 3,4-undecanediol dehydratase, 2-methyl-3,4-hexanediol dehydratase, 5-methyl-3,4-heptanediol dehydratase, 6-methyl-3,4-heptanediol dehydratase, 7-methyl-3,4-octanediol dehydratase, 1-phenyl-2,3-pentanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-pentanediol dehydratase, 1-indole-2,3-pentanediol dehydratase, 4,5-nonanediol dehydratase, 4,5-decanediol dehydratase, 4,5-undecanediol dehydratase, 4,5-dodecanediol dehydratase, 2-methyl-3,4-heptanediol dehydratase, 3-methyl-4,5-octanediol dehydratase, 2-methyl-4,5-octanediol dehydratase, 8-methyl-4,5-nonanediol dehydratase, 1-phenyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-hexanediol dehydratase, 1-indole-2,3-hexanediol dehydratase, 5,6-undecanediol dehydratase, 5,6-undecanediol dehydratase, 5,6-tridecanediol dehydratase, 2-methyl-3,4-octanediol 
dehydratase, 3-methyl-4,5-nonanediol dehydratase, 2-methyl-4,5-nonanediol dehydratase, 2-methyl-5,6-decanediol dehydratase, 1-phenyl-2,3-heptanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-heptanediol dehydratase, 1-indole-2,3-heptanediol dehydratase, 6,7-tridecanediol dehydratase, 6,7-tetradecanediol dehydratase, 2-methyl-3,4-nonanediol dehydratase, 3-methyl-4,5-decanediol dehydratase, 2-methyl-4,5-decanediol dehydratase, 2-methyl-5,6-undecanediol dehydratase, 1-phenyl-2,3-octanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-octanediol dehydratase, 1-indole-2,3-octanediol dehydratase, 7,8-pentadecanediol dehydratase, 2-methyl-3,4-decanediol dehydratase, 3-methyl-4,5-undecanediol dehydratase, 2-methyl-4,5-undecanediol dehydratase, 2-methyl-5,6-dodecanediol dehydratase, 1-phenyl-2,3-nonanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-nonanediol dehydratase, 1-indole-2,3-nonanediol dehydratase, 2-methyl-3,4-undecanediol dehydratase, 3-methyl-4,5-dodecanediol dehydratase, 2-methyl-4,5-dodecanediol dehydratase, 2-methyl-5,6-tridecanediol dehydratase, 1-phenyl-2,3-decanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-decanediol dehydratase, 1-indole-2,3-decanediol dehydratase, 2,5-dimethyl-3,4-heptanediol dehydratase, 2,6-dimethyl-3,4-heptanediol dehydratase, 2,7-dimethyl-3,4-octanediol dehydratase, 1-phenyl-4-methyl-2,3-pentanediol dehydratase, 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydratase, 1-indole-4-methyl-2,3-pentanediol dehydratase, 2,6-dimethyl-4,5-octanediol dehydratase, 3,8-dimethyl-4,5-nonanediol dehydratase, 1-phenyl-4-methyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydratase, 1-indole-4-methyl-2,3-hexanediol dehydratase, 2,8-dimethyl-4,5-nonanediol dehydratase, 1-phenyl-5-methyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydratase, 1-indole-5-methyl-2,3-hexanediol dehydratase, 1-phenyl-6-methyl-2,3-heptanediol dehydratase, 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydratase, 
1-indole-6-methyl-2,3-heptanediol dehydratase, 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydratase, 1-indole-4-phenyl-2,3-butanediol dehydratase, 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1,10-diamino-5,6-decanediol dehydratase, 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydratase, and 2,3-hexanediol-1,6-dicarboxylic acid dehydratase.
32. The method of claim 9, wherein the diol dehydratase comprises a polypeptide having an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NOS:104, 106, 108, 308, 309, 310, or 311.
33. The method of claim 32, wherein the polypeptide of SEQ ID NO:104 comprises one or more conserved residues selected from D149, P151, A155, A159, G165, E168, E170, A183, G189, G196, Q200, E208, G215, Y219, E221, T222, S224, Y226, G227, T228, F232, G235, D236, D237, T238, P239, S241, L245, Y249, S251, R252, G253, K255, R257, S260, E265, M268, G269, S275, Y278, L279, E280, C283, G291, Q293, G294, Q296, N297, G298, G312, E329, S341, R344, G356, D371, N372, F374, S377, R392, D393, R412, L477, A486, G499, D500, S516, N522, D523, Y524, G526, and G530.
34. The method of claim 32, wherein the polypeptide of SEQ ID NO:310 comprises one or more conserved residues selected from T36, G74, P87, E88, E97, W126, R221, A263, Q265, R287, D289, E309, R317, G335, G345, G346, N356, P374, R379, G399, G401, P403, D408, G432, C433, N452, C529, G533, G539, G540, S559, G603, N604, A654, G658, R659, D676, N702, Q735, N737, A747, P751, R760, V761, A762, G763, Q776, I780, and R782.
35. The method of claim 32, wherein the polypeptide of SEQ ID NO:311 comprises one or more conserved residues selected from D19, G20, G22, R24, F28, G31, C32, C36, W38, C39, N41, P42, C58, C64, C96, G129, T132, G135, G136, D185, R187, N208, R222, and R264.
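By way of illustration only, the conserved-residue language of claims 33 to 35 can be read as a positional check: a label such as D19 names an amino acid (D) and a 1-based position (19) in the reference polypeptide. The following is a minimal sketch of such a check, not a method described in this application. The function names and the toy sequence are hypothetical, and the toy sequence is not any of the SEQ ID NOS recited above.

```python
import re

def parse_residue(label):
    """Split a label such as 'D149' into (amino acid, 1-based position)."""
    m = re.fullmatch(r"([A-Z])(\d+)", label)
    if m is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    return m.group(1), int(m.group(2))

def conserved_residues_present(sequence, labels):
    """Return the labels whose amino acid occurs at the stated position."""
    found = []
    for label in labels:
        aa, pos = parse_residue(label)
        # Positions are 1-based; anything outside the sequence is a miss.
        if 1 <= pos <= len(sequence) and sequence[pos - 1] == aa:
            found.append(label)
    return found

# Hypothetical toy sequence (positions 1-5 are M, D, G, K, G).
toy = "MDGKG"
print(conserved_residues_present(toy, ["D2", "G3", "R4", "G5", "C9"]))
# ['D2', 'G3', 'G5']
```

A polypeptide would satisfy the "one or more conserved residues" wording under this reading whenever the returned list is non-empty, assuming the positions are numbered against the recited reference sequence.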
36. The method of claim 9, wherein the dehydration and reduction pathway comprises a secondary alcohol dehydrogenase selected from 2-butanol dehydrogenase, 3-hexanol dehydrogenase, 4-octanol dehydrogenase, 5-decanol dehydrogenase, 6-dodecanol dehydrogenase, 7-tetradecanol dehydrogenase, 8-hexadecanol dehydrogenase, 2,5-dimethyl-3-hexanol dehydrogenase, 3,6-dimethyl-4-octanol dehydrogenase, 2,7-dimethyl-4-octanol dehydrogenase, 2,9-dimethyl-4-decanol dehydrogenase, 1,4-diphenyl-2-butanol dehydrogenase, bis-1,4-(4-hydroxyphenyl)-2-butanol dehydrogenase, 1,4-diindole-2-butanol dehydrogenase, cyclopentanol dehydrogenase, 2(or 3)-pentanol dehydrogenase, 2(or 3)-hexanol dehydrogenase, 2(or 3)-heptanol dehydrogenase, 2(or 3)-octanol dehydrogenase, 2(or 3)-nonanol dehydrogenase, 4-methyl-2(or 3)-pentanol dehydrogenase, 4-methyl-2(or 3)-hexanol dehydrogenase, 5-methyl-2(or 3)-hexanol dehydrogenase, 6-methyl-2(or 3)-heptanol dehydrogenase, 1-phenyl-2(or 3)-butanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase, 1-indole-2(or 3)-butanol dehydrogenase, 3(or 4)-heptanol dehydrogenase, 3(or 4)-octanol dehydrogenase, 3(or 4)-nonanol dehydrogenase, 3(or 4)-decanol dehydrogenase, 3(or 4)-undecanol dehydrogenase, 2-methyl-3(or 4)-hexanol dehydrogenase, 5-methyl-3(or 4)-heptanol dehydrogenase, 6-methyl-3(or 4)-heptanol dehydrogenase, 7-methyl-3(or 4)-octanol dehydrogenase, 1-phenyl-2(or 3)-pentanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-pentanol dehydrogenase, 1-indole-2(or 3)-pentanol dehydrogenase, 4(or 5)-nonanol dehydrogenase, 4(or 5)-decanol dehydrogenase, 4(or 5)-undecanol dehydrogenase, 4(or 5)-dodecanol dehydrogenase, 2-methyl-3(or 4)-heptanol dehydrogenase, 3-methyl-4(or 5)-octanol dehydrogenase, 2-methyl-4(or 5)-octanol dehydrogenase, 8-methyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-2(or 3)-hexanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-hexanol dehydrogenase, 1-indole-2(or 3)-hexanol dehydrogenase, 4(or 5)-undecanol dehydrogenase, 5(or 6)-undecanol 
dehydrogenase, 5(or 6)-tridecanol dehydrogenase, 2-methyl-3(or 4)-octanol dehydrogenase, 3-methyl-4(or 5)-nonanol dehydrogenase, 2-methyl-4(or 5)-nonanol dehydrogenase, 2-methyl-5(or 6)-decanol dehydrogenase, 1-phenyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-heptanol dehydrogenase, 1-indole-2(or 3)-heptanol dehydrogenase, 6(or 7)-tridecanol dehydrogenase, 6(or 7)-tetradecanol dehydrogenase, 2-methyl-3(or 4)-nonanol dehydrogenase, 3-methyl-4(or 5)-decanol dehydrogenase, 2-methyl-4(or 5)-decanol dehydrogenase, 2-methyl-5(or 6)-undecanol dehydrogenase, 1-phenyl-2(or 3)-octanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-octanol dehydrogenase, 1-indole-2(or 3)-octanol dehydrogenase, 7(or 8)-pentadecanol dehydrogenase, 2-methyl-3(or 4)-decanol dehydrogenase, 3-methyl-4(or 5)-undecanol dehydrogenase, 2-methyl-4(or 5)-undecanol dehydrogenase, 2-methyl-5(or 6)-dodecanol dehydrogenase, 1-phenyl-2(or 3)-nonanol dehydrogenase, 1-(4-hydroxyphenyl)-2 (or 3)-nonanol dehydrogenase, 1-indole-2(or 3)-nonanol dehydrogenase, 2-methyl-3(or 4)-undecanol dehydrogenase, 3-methyl-4(or 5)-dodecanol dehydrogenase, 2-methyl-4(or 5)-dodecanol dehydrogenase, 2-methyl-5(or 6)-tridecanol dehydrogenase, 1-phenyl-2(or 3)-decanol dehydrogenase, 1-(4-hydroxyphenyl)-2 (or 3)-decanol dehydrogenase, 1-indole-2(or 3)-decanol dehydrogenase, 2,5-dimethyl-3 (or 4)-heptanol dehydrogenase, 2,6-dimethyl-3 (or 4)-heptanol dehydrogenase, 2,7-dimethyl-3(or 4)-octanol dehydrogenase, 1-phenyl-4-methyl-2(or 3)-pentanol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2(or 3)-pentanol dehydrogenase, 1-indole-4-methyl-2(or 3)-pentanol dehydrogenase, 2,6-dimethyl-4(or 5)-octanol dehydrogenase, 3,8-dimethyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-4-methyl-2(or 3)-hexanol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2 (or 3)-hexanol dehydrogenase, 1-indole-4-methyl-2(or 3)-hexanol dehydrogenase, 2,8-dimethyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-5-methyl-2(or 3)-hexanol dehydrogenase, 
1-(4-hydroxyphenyl)-5-methyl-2(or 3)-hexanol dehydrogenase, 1-indole-5-methyl-2(or 3)-hexanol dehydrogenase, 1-phenyl-6-methyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-6-methyl-2(or 3)-heptanol dehydrogenase, 1-indole-6-methyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-4-phenyl-2(or 3)-butanol dehydrogenase, 1-indole-4-phenyl-2(or 3)-butanol dehydrogenase, 1-indole-4-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase, 1,10-diamino-5-decanol dehydrogenase, 1,4-di(4-hydroxyphenyl)-2-butanol dehydrogenase, 2-hexanol-1,6-dicarboxylic acid dehydrogenase, phenylethanol dehydrogenase, 4-hydroxyphenylethanol dehydrogenase, and Indole-3-ethanol dehydrogenase.
37. The method of claim 9, wherein the secondary alcohol dehydrogenase comprises a polypeptide having an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NOS:110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, or 142.
38. The method of claim 37, wherein the secondary alcohol dehydrogenase comprises at least one of a nicotinamide adenine dinucleotide (NAD+), a NADH, nicotinamide adenine dinucleotide phosphate (NADP+), or a NADPH binding motif.
39. The method of claim 38, wherein the NAD+, NADH, NADP+, or NADPH binding motif is selected from the group consisting of Y-X-G-G-X-Y, Y-X-X-G-G-X-Y, Y-X-X-X-G-G-X-Y, Y-X-G-X-X-Y, Y-X-X-G-G-X-X-Y, Y-X-X-X-G-X-X-Y, Y-X-G-X-Y, Y-X-X-G-X-Y, Y-X-X-X-G-X-Y, and Y-X-X-X-X-G-X-Y; wherein Y is independently selected from alanine, glycine, and serine, wherein G is glycine, and wherein X is independently selected from a genetically encoded amino acid.
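For illustration, the motif notation of claim 39 can be translated into regular expressions, with Y standing for alanine, glycine, or serine, G for glycine, and X for any of the twenty genetically encoded amino acids. The sketch below is one possible reading and is not part of the claimed subject matter. The names `MOTIFS`, `SYMBOLS`, `motif_to_regex`, and `find_motifs` are hypothetical, as is the example sequence.

```python
import re

# The ten motifs recited in claim 39, in the claim's own notation.
MOTIFS = [
    "Y-X-G-G-X-Y", "Y-X-X-G-G-X-Y", "Y-X-X-X-G-G-X-Y", "Y-X-G-X-X-Y",
    "Y-X-X-G-G-X-X-Y", "Y-X-X-X-G-X-X-Y", "Y-X-G-X-Y", "Y-X-X-G-X-Y",
    "Y-X-X-X-G-X-Y", "Y-X-X-X-X-G-X-Y",
]

# Y = Ala, Gly, or Ser; G = Gly; X = any genetically encoded amino acid.
SYMBOLS = {"Y": "[AGS]", "G": "G", "X": "[ACDEFGHIKLMNPQRSTVWY]"}

def motif_to_regex(motif):
    """Compile a motif such as 'Y-X-G-G-X-Y' into a regular expression."""
    return re.compile("".join(SYMBOLS[s] for s in motif.split("-")))

def find_motifs(sequence):
    """Return (motif, 0-based start) for the first hit of each motif found."""
    hits = []
    for motif in MOTIFS:
        m = motif_to_regex(motif).search(sequence)
        if m:
            hits.append((motif, m.start()))
    return hits

# Hypothetical one-letter sequence fragment used only to exercise the search.
print(find_motifs("MKALGGKSW"))
```

Because X also admits glycine, the motifs overlap, so a single stretch of sequence may match more than one of them.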
40. A recombinant microorganism, comprising (i) one or more genes encoding and expressing an aldehyde and/or ketone biosynthesis pathway; (ii) one or more genes encoding and expressing a C--C ligation pathway; and (iii) a reduction and dehydration pathway, comprising one or more genes encoding and expressing an enzyme selected from a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase.
41. The recombinant microorganism of claim 40, wherein the microorganism is capable of converting a suitable monosaccharide or suitable oligosaccharide to a commodity chemical, or an intermediate thereof.
42. The recombinant microorganism of claim 40, wherein the one or more genes encoding the biosynthesis pathway encode a pathway selected from an acetaldehyde, propionaldehyde, butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, 4-methylpentaldehyde, phenyl acetaldehyde, glutaraldehyde, 5-amino-pentaldehyde, succinate semialdehyde, succinate 4-hydroxyphenyl acetaldehyde, and an indole-3-acetaldehyde biosynthesis pathway, and combinations thereof.
43. The recombinant microorganism of claim 40, wherein the one or more genes encoding and expressing the C--C ligation pathway comprise a nucleotide sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the nucleotide sequence set forth in SEQ ID NO: 143.
44. The recombinant microorganism of claim 40, wherein the one or more genes encoding the diol dehydrogenase comprise a nucleotide sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the nucleotide sequence set forth in SEQ ID NOS:97, 99, or 101.
45. The recombinant microorganism of claim 40, wherein the one or more genes encoding the diol dehydratase comprise a nucleotide sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the nucleotide sequence set forth in SEQ ID NOS: 103, 105, or 107.
46. The recombinant microorganism of claim 40, wherein the one or more genes encoding a secondary alcohol dehydrogenase comprise a nucleotide sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the nucleotide sequence set forth in SEQ ID NOS: 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, or 141.
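As an illustrative aid only, the "at least 80%, 90%, 95%, 98%, or 99% identical" language of claims 43 to 46 can be pictured with a simple percent-identity calculation over two sequences that have already been aligned. The sketch assumes gaps are written as "-" and that identity is counted over all alignment columns. That denominator is an assumption, since alignment tools differ on this point and the claims above do not fix it. The function names and example sequences are hypothetical.

```python
THRESHOLDS = (80, 90, 95, 98, 99)  # percentages recited in the claims

def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same non-gap symbol."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    # Assumption: the denominator is the full alignment length, gaps included.
    return 100.0 * matches / len(aligned_a)

def thresholds_met(identity):
    """List the recited thresholds that a given identity value satisfies."""
    return [t for t in THRESHOLDS if identity >= t]

# Hypothetical pre-aligned nucleotide fragments differing at one of ten columns.
identity = percent_identity("ATGCATGCAT", "ATGCATGCAA")
print(identity, thresholds_met(identity))
# 90.0 [80, 90]
```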
47. A recombinant microorganism, comprising one or more genes encoding and expressing an aldehyde or ketone biosynthesis pathway, wherein the pathway comprises at least one exogenous gene.
48. A recombinant microorganism, comprising one or more exogenous genes encoding and expressing one or more enzymes selected from a C--C ligase, a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase.
49. The recombinant microorganism of claim 47, wherein the one or more enzymes comprise a C--C ligase and a diol dehydrogenase.
50. The recombinant microorganism of claim 47, wherein the one or more enzymes comprise a diol dehydrogenase and a diol dehydratase.
51. The recombinant microorganism of claim 40, wherein the microorganism comprises reduced ethanol production capability compared to a wild-type microorganism.
52. The recombinant microorganism of claim 40, wherein the microorganism comprises a reduction or inhibition in the conversion of acetyl-coA to ethanol.
53. The recombinant microorganism of claim 40, wherein the recombinant microorganism comprises a reduction of an ethanol dehydrogenase, thereby providing a reduced ethanol production capability.
54. The recombinant microorganism of claim 53, wherein the ethanol dehydrogenase is an adhE, homolog or variant thereof.
55. The recombinant microorganism of claim 54, wherein the microorganism comprises a deletion or knockout of an adhE, homolog or variant thereof.
56. The recombinant microorganism of claim 40, wherein the recombinant microorganism comprises one or more deletions or knockouts in a gene encoding an enzyme selected from an enzyme that catalyzes the conversion of acetyl-coA to ethanol, an enzyme that catalyzes the conversion of pyruvate to lactate, an enzyme that catalyzes the conversion of fumarate to succinate, an enzyme that catalyzes the conversion of acetyl-coA and phosphate to coA and acetyl phosphate, an enzyme that catalyzes the conversion of acetyl-coA and formate to coA and pyruvate, and an enzyme that catalyzes the conversion of alpha-keto acid to branched chain amino acids.
57. The recombinant microorganism of claim 40, wherein the microorganism is a bacterium.
58. The recombinant microorganism of claim 40, wherein the microorganism is a Gram-negative bacterium.
59. The recombinant microorganism of claim 40, wherein the microorganism is a eukaryote.
60. The recombinant microorganism of claim 59, wherein the eukaryote is a fungus.
61. The recombinant microorganism of claim 60, wherein the fungus is a yeast.
62. A method for converting a suitable monosaccharide to a commodity chemical, comprising: (a) contacting the suitable monosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide to the commodity chemical, wherein the microbial system comprises (i) one or more genes encoding and expressing a pathway selected from a fatty acid biosynthesis pathway, an amino acid biosynthetic pathway, and a short chain alcohol biosynthetic pathway; (ii) one or more genes encoding and expressing a keto-acid decarboxylase, aldehyde dehydrogenase, and/or alcohol dehydrogenase; and (iii) an enzymatic reduction pathway selected from (1) an enzymatic long chain alcohol reduction pathway, (2) an enzymatic decarbonylation pathway, (3) an enzymatic decarboxylation pathway, and (4) an enzymatic reduction pathway comprising (1), (2), and/or (3), thereby converting the suitable monosaccharide or oligosaccharide to the commodity chemical.
63. A recombinant microorganism, comprising (i) one or more genes encoding and expressing a pathway selected from a fatty acid biosynthesis pathway, an amino acid biosynthetic pathway, and a short chain alcohol biosynthetic pathway; (ii) one or more genes encoding and expressing a keto-acid decarboxylase, aldehyde dehydrogenase, and/or alcohol dehydrogenase; and (iii) an enzymatic reduction pathway selected from (1) an enzymatic long chain alcohol reduction pathway, (2) an enzymatic decarbonylation pathway, (3) an enzymatic decarboxylation pathway, and (4) an enzymatic reduction pathway comprising (1), (2), and/or (3).
64. The recombinant microorganism of claim 1, wherein the microorganism is selected from Acetobacter aceti, Achromobacter, Acidiphilium, Acinetobacter, Actinomadura, Actinoplanes, Aeropyrum pernix, Agrobacterium, Alcaligenes, Ananas comosus (M), Arthrobacter, Aspergillus niger, Aspergillus oryzae, Aspergillus melleus, Aspergillus pulverulentus, Aspergillus saitoi, Aspergillus sojae, Aspergillus usamii, Bacillus alcalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus lentus, Bacillus licheniformis, Bacillus macerans, Bacillus stearothermophilus, Bacillus subtilis, Bifidobacterium, Brevibacillus brevis, Burkholderia cepacia, Candida cylindracea, Candida rugosa, Carica papaya (L), Cellulosimicrobium, Cephalosporium, Chaetomium erraticum, Chaetomium gracile, Clostridium, Clostridium butyricum, Clostridium acetobutylicum, Clostridium thermocellum, Corynebacterium (glutamicum), Corynebacterium efficiens, Escherichia coli, Enterococcus, Erwinia chrysanthemi, Gluconobacter, Gluconacetobacter, Haloarcula, Humicola insolens, Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kluyveromyces, Kluyveromyces fragilis, Kluyveromyces lactis, Kocuria, Lactlactis, Lactobacillus, Lactobacillus fermentum, Lactobacillus sake, Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis, Methanolobus siciliae, Methanogenium organophilum, Methanobacterium bryantii, Microbacterium imperiale, Micrococcus lysodeikticus, Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium, Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus, Pediococcus halophilus, Penicillium, Penicillium camemberti, Penicillium citrinum, Penicillium emersonii, Penicillium roqueforti, Penicillium lilacinum, Penicillium multicolor, Paracoccus pantotrophus, Propionibacterium, Pseudomonas, Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus, Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar, Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus oligosporus, Rhodococcus, Saccharomyces cerevisiae, Sclerotinia libertiana, Sphingobacterium multivorum, Sphingobium, Sphingomonas, Streptococcus, Streptococcus thermophilus Y-1, Streptomyces, Streptomyces griseus, Streptomyces lividans, Streptomyces murinus, Streptomyces rubiginosus, Streptomyces violaceoruber, Streptoverticillium mobaraense, Tetragenococcus, Thermus, Thiosphaera pantotropha, Trametes, Trichoderma, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, Trichosporon penicillatum, Vibrio alginolyticus, Xanthomonas, yeast, Zygosaccharomyces rouxii, Zymomonas, and Zymomonas mobilis.
65. A commodity chemical produced by the method of claim 1.
66. A blended commodity chemical comprising the commodity chemical of claim 65 and a refinery-produced petroleum product.
67. The blended commodity chemical of claim 66, wherein the commodity chemical is selected from a C10-C12 hydrocarbon, 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol.
68. The blended commodity chemical of claim 67, wherein the C10-C12 hydrocarbon is selected from 2,7-dimethyloctane and 2,9-dimethyldecane.
69. The blended commodity chemical of claim 66, wherein the refinery-produced petroleum product is selected from jet fuel and diesel fuel.
70. A method of producing a commodity chemical enriched refinery-produced petroleum product, comprising (a) blending the refinery-produced petroleum product with the commodity chemical produced by the method of claim 1, thereby producing the commodity chemical enriched refinery-produced petroleum product.
Description:
CROSS-REFERENCE TO RELATED APPLICATION
[0001]This application claims the benefit under 35 U.S.C. §119(e) of U.S. Provisional Patent Application No. 60/977,628 filed Oct. 4, 2007, which application is incorporated herein by reference in its entirety.
STATEMENT REGARDING SEQUENCE LISTING
[0002]The Sequence Listing associated with this application is provided in text format in lieu of a paper copy, and is hereby incorporated by reference into the specification. The name of the text file containing the Sequence Listing is 150097--40102_SEQUENCE_LISTING.txt. The text file is 519 KB, was created on Oct. 3, 2008, and is being submitted electronically via EFS-Web.
TECHNICAL FIELD
[0003]The present application relates generally to the use of microbial and chemical systems to convert biomass to commodity chemicals, such as biofuels/biopetrols.
BACKGROUND
[0004]Petroleum is facing declining global reserves and contributes to more than 30% of greenhouse gas emissions driving global warming. Annually 800 billion barrels of transportation fuel are consumed globally. Diesel and jet fuels account for greater than 50% of global transportation fuels.
[0005]Significant legislation has been passed, requiring fuel producers to cap or reduce the carbon emissions from the production and use of transportation fuels. Fuel producers are seeking substantially similar, low carbon fuels that can be blended and distributed through existing infrastructure (e.g., refineries, pipelines, tankers).
[0006]Due to increasing petroleum costs and reliance on petrochemical feedstocks, the chemicals industry is also looking for ways to improve margin and price stability, while reducing its environmental footprint. The chemicals industry is striving to develop greener products that are more energy, water, and CO2 efficient than current products. Fuels produced from biological sources, such as biomass, represent one aspect of this process.
[0007]Present methods for converting biomass into biofuels focus on the use of lignocellulosic biomass, and there are many problems associated with this approach. Large-scale cultivation of lignocellulosic biomass requires a substantial amount of cultivated land, which can be achieved only by replacing food crop production with energy crop production, by deforestation, or by recultivating currently uncultivated land. Other problems include a decrease in water availability and quality and an increase in the use of pesticides and fertilizers.
[0008]The degradation of lignocellulosic biomass using biological systems is very difficult because of its substantial mechanical strength and complex chemical composition. Approximately thirty different enzymes are required to fully convert lignocellulose to monosaccharides. The only available alternative to this complex approach requires a substantial amount of heat, pressure, and strong acids. The art therefore needs an economic and technically simple process for converting biomass into hydrocarbons for use as biofuels or biopetrols.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009]FIG. 1 shows the Vibrio splendidus genomic region of the fosmid clone described in Example 1. Genes are indicated with orange arrows. Labels show the numerical gene indices and the predicted function of the proteins.
[0010]FIG. 2 illustrates the pathways involved in certain embodiments in which E. coli may be engineered to grow on alginate as a sole source of carbon.
[0011]FIG. 3 illustrates the pathways involved in certain embodiments in which E. coli may be engineered to grow on pectin as a sole source of carbon.
[0012]FIG. 4 shows the results of engineered or recombinant E. coli growing on alginate as a sole source of carbon (see solid circles). Agrobacterium tumefaciens cells provide a positive control (see hatched circles). The well to the immediate left of the A. tumefaciens positive control contains DH10B E. coli cells, which provide a negative control.
[0013]FIG. 5 shows the growth of recombinant strains of E. coli on galacturonates and pectin. FIG. 5A shows the growth of E. coli on various lengths of galacturonate after 24 hr. The recombinant strain in FIG. 5A is the E. coli BL21(DE3) strain harboring pTrlogl-kdgR+pBBRGal3P, and the control strain is the BL21(DE3) strain harboring pTrc99A+pBBR1MCS-2, as described in Example 2. FIG. 5B shows the growth of recombinant E. coli on pectin after 3-4 days. The strains in FIG. 5B are the E. coli DH5a strain containing pPEL74 alone (Ctrl) and the DH5a strain containing both pPEL74 and pROU2, as described in Example 2.
[0014]FIG. 6 shows the degradation of alginate to form pyruvate. FIG. 6A illustrates a simplified metabolic pathway for alginate degradation and metabolism. FIG. 6B shows the results of in vitro degradation of alginate to form pyruvate by an enzymatic degradation route. FIG. 6C shows the results of in vitro degradation of alginate to form pyruvate by a chemical degradation route.
[0015]FIG. 7 shows the biological activity of various alcohol dehydrogenases isolated from Agrobacterium tumefaciens C58. FIG. 7A shows DEHU hydrogenase activity as monitored by NADPH consumption, and FIG. 7B shows mannuronate hydrogenase activity as monitored by NADPH consumption.
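By way of a hedged illustration, activity "monitored by NADPH consumption" is commonly quantified by following the decrease in absorbance at 340 nm and applying the Beer-Lambert law with the molar absorptivity of NAD(P)H (6220 M^-1 cm^-1). This application does not state the wavelength or data treatment used for FIG. 7, so the wavelength, the 1 cm path length, the function name, and the absorbance readings below are all assumptions made for the sketch.

```python
# Molar absorptivity of NAD(P)H at 340 nm, in M^-1 cm^-1 (standard value).
NADPH_EXTINCTION_340 = 6220.0

def consumption_rate_uM_per_min(times_min, absorbances, path_length_cm=1.0):
    """Convert the least-squares slope of A340 vs. time to uM NADPH per minute."""
    n = len(times_min)
    if n != len(absorbances) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_min, absorbances))
    slope = sxy / sxx  # absorbance units per minute; negative when consumed
    # Beer-Lambert: concentration change = absorbance change / (epsilon * path).
    return -slope / (NADPH_EXTINCTION_340 * path_length_cm) * 1e6

# Hypothetical readings, not data from FIG. 7: A340 falls by 0.0622 per minute.
rate = consumption_rate_uM_per_min([0, 1, 2, 3],
                                   [0.6220, 0.5598, 0.4976, 0.4354])
print(round(rate, 3))
# 10.0
```

The same conversion applies with the sign reversed when NADH or NADPH production, rather than consumption, is followed.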
[0016]FIG. 8 shows the GC-MS chromatogram results for the control sample (FIG. 8A) and for isobutyraldehyde, 3-methylpentanol, and 2-methylpentanal production from pBADalsS-ilvCD-leuABCD2 and pTrcBALK (FIG. 8B).
[0017]FIG. 9 shows the GC-MS chromatogram results for the control sample (FIG. 9A) and for 4-hydroxyphenylethanol and indole-3-ethanol production from pBADtyrA-aroLAC-aroG-tktA-aroBDE and pTrcBALK (FIG. 9B).
[0018]FIG. 10 shows the mass spectrometry results for isobutanal (FIG. 10A), 3-methylpentanol (FIG. 10B), and 2-methylpentanol (FIG. 10C).
[0019]FIG. 11 shows the mass spectrometry results for phenylethanol (FIG. 11A), 4-hydroxyphenylethanol (FIG. 11B), and indole-3-ethanol (FIG. 11C).
[0020]FIG. 12 shows the biological activity of diol dehydrogenases. FIG. 12A shows the reduction of butyroin by ddh1, ddh2, and ddh3 as monitored by NADH consumption. FIG. 12B shows the oxidation activity of ddh3 towards 1,2-cyclopentanediol and 1,2-cyclohexanediol as measured by NADH production.
[0021]FIG. 13 summarizes the results of kinetic studies for various substrates in the oxidation reactions catalyzed by the DDH polypeptides. These reactions were NAD+ dependent.
[0022]FIG. 14 shows the nucleotide sequence (FIG. 14A) (SEQ ID NO:97) and polypeptide sequence (FIG. 14B) (SEQ ID NO:98) of diol dehydrogenase DDH1 isolated from Lactobacillus brevis ATCC 367.
[0023]FIG. 15 shows the nucleotide sequence (FIG. 15A) (SEQ ID NO:99) and polypeptide sequence (FIG. 15B) (SEQ ID NO:100) of diol dehydrogenase DDH2 isolated from Pseudomonas putida KT2440.
[0024]FIG. 16 shows the nucleotide sequence (FIG. 16A) (SEQ ID NO:101) and polypeptide sequence (FIG. 16B) (SEQ ID NO:102) of diol dehydrogenase DDH3 isolated from Klebsiella pneumoniae MGH78578.
[0025]FIG. 17 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of butanal into 5-hydroxy-4-octanone and then 4,5-octanediol. FIG. 17A shows the detection of butyroin (5-hydroxy-4-octanone) at 5.36 minutes, and FIG. 17B shows the detection of 4,5-octanediol at 6.49 and 6.65 minutes.
[0026]FIG. 18 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of n-pentanal into 6-hydroxy-5-decanone and then 5,6-decanediol. FIG. 18A shows the detection of valeroin (6-hydroxy-5-decanone) at 8.22 minutes, and FIG. 18B shows the detection of 5,6-decanediol at 9.22 and 9.35 minutes.
[0027]FIG. 19 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of 3-methylbutanal into 2,7-dimethyl-5-hydroxy-4-octanone and then 2,7-dimethyl-4,5-octanediol. FIG. 19A shows the detection of isovaleroin (2,7-dimethyl-5-hydroxy-4-octanone) at 6.79 minutes, and FIG. 19B shows the detection of 2,7-dimethyl-4,5-octanediol at 7.95 and 8.15 minutes.
[0028]FIG. 20 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of n-hexanal into 7-hydroxy-6-dodecanone and then 6,7-dodecanediol. FIG. 20A shows the detection of hexanoin (7-hydroxy-6-dodecanone) at 10.42 minutes, and FIG. 20B shows the detection of 6,7-dodecanediol at 10.89 and 10.95 minutes.
[0029]FIG. 21 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of 4-methylpentanal into 2,9-dimethyl-6-hydroxy-5-decanone and then 2,9-dimethyl-5,6-decanediol. FIG. 21A shows the detection of isohexanoin (2,9-dimethyl-6-hydroxy-5-decanone) at 9.45 minutes, and FIG. 21B shows the detection of 2,9-dimethyl-5,6-decanediol at 10.38 and 10.44 minutes.
[0030]FIG. 22 shows the in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the conversion of n-octanal into 9-hydroxy-8-hexadecanone by showing the detection of octanoin (9-hydroxy-8-hexadecanone) at 12.35 minutes.
[0031]FIG. 23 shows the in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the conversion of acetaldehyde into 3-hydroxy-2-butanone by showing the detection of acetoin (3-hydroxy-2-butanone) at rt=0.91 minutes.
[0032]FIG. 24 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of n-propanal into 4-hydroxy-3-hexanone and then 3,4-hexanediol. FIG. 24A shows the detection of propioin (4-hydroxy-3-hexanone) at rt=2.62 minutes, and FIG. 24B shows the detection of 3,4-hexanediol at rt=3.79 minutes.
[0033]FIG. 25 shows the in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the conversion of phenylacetaldehyde into 1,4-diphenyl-3-hydroxy-2-butanone by showing the detection of 1,4-diphenyl-3-hydroxy-2-butanone at rt=13.66 minutes.
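The straight-chain examples of FIGS. 17, 18, 20, 22, 23, and 24 follow a regular pattern: self-ligation of an aldehyde having n carbons gives an acyloin having 2n carbons, with the ketone at position n and the hydroxyl at position n+1, and reduction of that acyloin gives the n,(n+1)-diol. The following sketch merely restates that naming pattern for unbranched aldehydes and is offered for orientation only. The function name and the stem table are hypothetical conveniences, and branched or aryl-substituted substrates such as those of FIGS. 19, 21, and 25 are outside its scope.

```python
# Name stems for the even-numbered product chains covered by this sketch.
STEMS = {4: "but", 6: "hex", 8: "oct", 10: "dec",
         12: "dodec", 14: "tetradec", 16: "hexadec"}

def ligation_products(n_carbons):
    """Return (acyloin, diol) names for self-ligation of a straight-chain
    aldehyde with n_carbons carbons, followed by reduction of the acyloin."""
    chain = 2 * n_carbons
    if chain not in STEMS:
        raise ValueError(f"no stem tabulated for a C{chain} product")
    stem = STEMS[chain]
    # Ketone carbon sits at position n, hydroxyl carbon at position n + 1.
    acyloin = f"{n_carbons + 1}-hydroxy-{n_carbons}-{stem}anone"
    diol = f"{n_carbons},{n_carbons + 1}-{stem}anediol"
    return acyloin, diol

print(ligation_products(4))  # butanal, as in FIG. 17
# ('5-hydroxy-4-octanone', '4,5-octanediol')
```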
[0034]FIG. 26 shows the sequential biological activity of a diol dehydrogenase ddh from Klebsiella pneumoniae MGH 78578 (DDH3) and a diol dehydratase pduCDE from Klebsiella pneumoniae MGH 78578. FIG. 26A shows GC-MS data confirming the presence of 4,5-octanediol in the sample extraction, which is the expected product resulting from the reduction of butyroin by ddh3. FIG. 26B shows GC-MS data confirming the presence of 4-octanone in the sample extraction, which is the expected product resulting from the sequential reduction of butyroin and dehydration of 4,5-octanediol by ddh3 and pduCDE, respectively.
[0035]FIG. 27 shows the sequential biological activity of a diol dehydrogenase ddh from Klebsiella pneumoniae MGH 78578 (DDH3) and a diol dehydratase pduCDE from Klebsiella pneumoniae MGH 78578. FIGS. 27A and 27B show comparisons between the sample extraction gas chromatograph/mass spectrum and the 4-octanone standard gas chromatograph/mass spectrum, confirming that 4-octanone was produced from butyroin using a diol dehydrogenase (ddh3) and a diol dehydratase (pduCDE).
[0036]FIG. 28 shows the nucleotide sequence (FIG. 28A) (SEQ ID NO:103) and polypeptide sequence (FIG. 28B) (SEQ ID NO:104) of a diol dehydratase large subunit (pduC) isolated from Klebsiella pneumoniae MGH78578.
[0037]FIG. 29 shows the nucleotide sequence (FIG. 29A) (SEQ ID NO:105) and polypeptide sequence (FIG. 29B) (SEQ ID NO:106) of a diol dehydratase medium subunit isolated from Klebsiella pneumoniae MGH78578 (pduD), in addition to the nucleotide sequence (FIG. 29C) (SEQ ID NO:107) and polypeptide sequence (FIG. 29D) (SEQ ID NO:108) of a diol dehydratase small subunit isolated from Klebsiella pneumoniae MGH78578 (pduE).
[0038]FIG. 30 shows the oxidation of 4-octanol by secondary alcohol dehydrogenases as monitored by NADH production (FIG. 30A) and NADPH production (FIG. 30B).
[0039]FIG. 31 shows the oxidation of 4-octanol by secondary alcohol dehydrogenases as monitored by NADH production (FIG. 31A) and NADPH production (FIG. 31B).
[0040]FIG. 32 shows the oxidation of 2,7-dimethyl octanol by secondary alcohol dehydrogenases as monitored by NADH production (FIG. 32A) and NADPH production (FIG. 32B).
[0041]FIG. 33 shows the oxidation and reduction activity of 2ADH11 and 2ADH16. FIG. 33A shows the reduction of 2,7-dimethyl-4-octanone as measured by NADPH consumption. FIG. 33B shows the reduction of 2,7-dimethyl-4-octanone, 4-octanone, and cyclopentanone.
[0042]FIG. 34 shows the oxidation of cyclopentanol and the reduction of cyclopentanone by secondary alcohol dehydrogenases. FIG. 34A shows the oxidation of cyclopentanol as monitored by NADH or NADPH formation. FIG. 34B shows the reduction of cyclopentanone as monitored by NADPH consumption.
[0043]FIG. 35 shows the calculated rate constants for the illustrated reduction reactions for each substrate catalyzed by secondary alcohol dehydrogenase ADH-16 (SEQ ID NO:138).
[0044]FIG. 36 shows the calculated rate constants for the illustrated oxidation reactions for each substrate catalyzed by secondary alcohol dehydrogenase ADH-16 (SEQ ID NO:138).
[0045]FIG. 37 shows a list of alginate lyase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein.
[0046]FIG. 38 shows a list of pectate lyase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein.
[0047]FIG. 39A shows a list of rhamnogalacturonan lyase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein. FIG. 39B shows a list of rhamnogalacturonate hydrolase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein.
[0048]FIG. 40 shows a list of pectin methyl esterase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein.
[0049]FIG. 41 shows a list of pectin acetyl esterase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein.
[0050]FIG. 42 shows the production of 2-phenyl ethanol (FIG. 42A), 2-(4-hydroxyphenyl)ethanol (FIG. 42B), and 2-(indole-3-)ethanol (FIG. 42C) at 24 hours from the recombinant microorganisms described in Example 4, which comprise functional 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and 2-(indole-3-)ethanol biosynthesis pathways.
[0051]FIG. 43 shows the GC-MS chromatogram results that confirm the production of 2-phenyl ethanol (FIG. 43B) at one week from the recombinant microorganisms described in Example 4 (pBADpheA-aroLAC-aroG-tktA-aroBDE and pTrcBALK). FIG. 43A shows the negative control cells (pBAD33 and pTrc99A).
[0052]FIG. 44 shows the GC-MS chromatogram results that confirm the production of 2-(4-hydroxyphenyl)ethanol (9.36 min) and 2-(indole-3) ethanol (10.32 min) at one week from the recombinant microorganisms described in Example 4 (pBADtyrA-aroLAC-aroG-tktA-aroBDE and pTrcBALK).
[0053]FIG. 45 confirms both the formation of 1-propanal from 1,2-propanediol (FIG. 45A), and the formation of 2-butanone from meso-2,3-butanediol (FIG. 45B), both of which were catalyzed in vitro by an isolated B12 independent diol dehydratase, as described in Example 9.
[0054]FIG. 46A shows the in vivo production of 1-propanol from 1,2-propanediol. FIG. 46B shows the in vivo production of 2-butanol from meso-2,3-butanediol. FIG. 46C shows the in vivo production of cyclopentanone from trans-1,2-cyclopentanediol. These experiments were performed as described in Example 9.
[0055]FIG. 47 shows the results of the TBA assay, as performed in Example 10. The left tube in FIG. 47 represents media taken from an overnight culture of cells expressing Vs24254, showing secretion of an alginate lyase, while the right-hand tube shows the TBA reaction using media from cells expressing Vs24259 (negative control). The lack of pink coloration in the negative control indicates that little or no cleavage of the alginate polymer has occurred.
[0056]FIG. 48 shows the in vivo biological activity of a C--C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized benzaldehyde lyase (BAL) catalyzed the in vivo production of 3-hydroxy-2-pentanone and 2-hydroxy-3-pentanone from a ligation reaction between acetaldehyde and propionaldehyde (FIG. 48A), and catalyzed the in vivo production of 4-hydroxy-3-heptanone and 3-hydroxy-4-heptanone from a ligation reaction between propionaldehyde and butyraldehyde (FIG. 48B).
[0057]FIG. 49 shows the in vivo biological activity of a C--C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 3-hydroxy-2-heptanone from a ligation reaction between acetaldehyde and pentanal (FIG. 49A), and catalyzed the in vivo production of 4-hydroxy-3-octanone and 3-hydroxy-4-octanone from a ligation reaction between pentanal and propionaldehyde (FIG. 49B).
[0058]FIG. 50 shows the in vivo biological activity of a C--C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 5-hydroxy-4-nonanone from a ligation reaction between butyraldehyde and pentanal (FIG. 50A), and catalyzed the in vivo production of 2-methyl-5-hydroxy-4-decanone and 2-methyl-4-hydroxy-5-decanone from a ligation reaction between hexanal and 3-methylbutyraldehyde (FIG. 50B).
[0059]FIG. 51 shows the in vivo biological activity of a C--C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 6-methyl-3-hydroxy-2-heptanone from a ligation reaction between acetaldehyde and 4-methylhexanal (FIG. 51A), and catalyzed the in vivo production of 7-methyl-4-hydroxy-3-octanone from a ligation reaction between 4-methylhexanal and propionaldehyde (FIG. 51B).
[0060]FIG. 52 shows the in vivo biological activity of a C--C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 8-methyl-5-hydroxy-4-nonanone from a ligation reaction between 4-methylhexanal and butyraldehyde (FIG. 52A), and catalyzed the in vivo production of 3-hydroxy-2-decanone from a ligation reaction between acetaldehyde and octanal (FIG. 52B).
[0061]FIG. 53 shows the in vivo biological activity of a C--C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 4-hydroxy-3-undecanone from a ligation reaction between octanal and propionaldehyde (FIG. 53A), and catalyzed the in vivo production of 5-hydroxy-4-dodecanone from a ligation reaction between octanal and butyraldehyde (FIG. 53B).
[0062]FIG. 54 shows the in vivo biological activity of a C--C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 6-hydroxy-5-tridecanone from a ligation reaction between octanal and pentanal (FIG. 54A), and catalyzed the in vivo production of 2-methyl-5-hydroxy-4-dodecanone and 2-methyl-4-hydroxy-5-dodecanone from a ligation reaction between octanal and 3-methylbutyraldehyde (FIG. 54B).
[0063]FIG. 55 shows the in vivo biological activity of a C--C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 2-methyl-6-hydroxy-5-tridecanone from a ligation reaction between octanal and 4-methylpentanal.
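The product names in FIGS. 48 to 55 follow a regular pattern: joining two aldehydes gives an alpha-hydroxy ketone whose chain length is the sum of the two aldehyde chain lengths. The following Python sketch illustrates that bookkeeping for unbranched aldehydes only. The function name `cross_ligation_products` and the `STEMS` table are hypothetical helpers introduced for illustration and are not part of the described methods; the sketch lists both possible regioisomers without implying that a given enzyme forms both.

```python
# Chain-name stems for unbranched carbon chains (illustrative subset only).
STEMS = {4: "but", 5: "pent", 6: "hex", 7: "hept", 8: "oct", 9: "non",
         10: "dec", 11: "undec", 12: "dodec", 13: "tridec", 14: "tetradec",
         15: "pentadec", 16: "hexadec"}


def cross_ligation_products(n_carbons_a: int, n_carbons_b: int) -> list[str]:
    """Name the alpha-hydroxy ketones possible from two straight-chain aldehydes.

    Assumption: both aldehydes are unbranched and have at least two carbons,
    so the product chain has n_carbons_a + n_carbons_b carbons.
    """
    short, long_ = sorted((n_carbons_a, n_carbons_b))
    if short < 2:
        raise ValueError("this sketch only handles aldehydes with >= 2 carbons")
    stem = STEMS[short + long_]  # KeyError for chains outside the table

    # Numbering from the shorter aldehyde's end, the two functionalized
    # carbons are C(short) and C(short + 1).
    first = f"{short + 1}-hydroxy-{short}-{stem}anone"
    if short == long_:
        # Symmetric case: both regioisomers are the same compound.
        return [first]
    second = f"{short}-hydroxy-{short + 1}-{stem}anone"
    return [first, second]


if __name__ == "__main__":
    # Acetaldehyde (2 carbons) with propionaldehyde (3 carbons), as in FIG. 48A.
    print(cross_ligation_products(2, 3))
    # Propionaldehyde (3 carbons) with butyraldehyde (4 carbons), as in FIG. 48B.
    print(cross_ligation_products(3, 4))
```

Branched aldehydes such as 3-methylbutyraldehyde need substituent locants as well, which this sketch deliberately leaves out.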
[0064]FIG. 56 shows the growth of recombinant E. coli on alginate as a sole source of carbon (FIG. 56A), as described in Example 10. Growth on glucose (FIG. 56B) provides a positive control. The cells were transformed with either no plasmid (BL21--negative control), one plasmid (e.g., Da or 3a), or two plasmids (e.g., Dk3a and Da3k). The plasmids are indicated by the lower case letter: "a" refers to the pET-DEST42 plasmid backbone and "k" refers to the pENTR/D/TOPO backbone. "D" indicates that the plasmid contains the genomic region Vs24214-24249, while "3" indicates that the plasmid contains the genomic region Vs24189-24209. Thus, Da would be pET-DEST42-Vs24214-24249, Da3k would be pET-DEST42-Vs24214-24249 and pENTR/D/TOPO-Vs24189-24209 and so on. These results show that the combined genomic regions Vs24214-24249 and Vs24189-24209 are sufficient to confer on E. coli the ability to grow on alginate as a sole source of carbon.
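The plasmid shorthand used for FIG. 56 can be expanded mechanically. The following Python sketch decodes labels such as "Da" or "Da3k" into full plasmid names, using only the letter and number codes defined above. The function name `decode_label` and the two lookup tables are hypothetical and serve only to illustrate the naming convention.

```python
import re

# Codes taken from the FIG. 56 description.
REGIONS = {"D": "Vs24214-24249", "3": "Vs24189-24209"}
BACKBONES = {"a": "pET-DEST42", "k": "pENTR/D/TOPO"}


def decode_label(label: str) -> list[str]:
    """Expand a FIG. 56 strain label into the plasmids it denotes."""
    if label == "BL21":
        # Negative control: host strain with no plasmid.
        return []
    pairs = re.findall(r"([D3])([ak])", label)
    # Reject labels containing anything other than region/backbone pairs.
    if not pairs or "".join(region + backbone for region, backbone in pairs) != label:
        raise ValueError(f"unrecognized label: {label!r}")
    return [f"{BACKBONES[backbone]}-{REGIONS[region]}" for region, backbone in pairs]


if __name__ == "__main__":
    for label in ("BL21", "Da", "3a", "Dk3a", "Da3k"):
        print(label, "->", decode_label(label))
```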
[0065]FIG. 57 shows the production of ethanol by E. coli growing on alginate, as performed in Example 11. E. coli was transformed with either pBBRPdc-AdhA/B or pBBRPdc-AdhA/B+1.5 FOS and allowed to grow in M9 media containing alginate.
BRIEF SUMMARY
[0066]Certain embodiments of the present invention relate to methods for converting a suitable monosaccharide or oligosaccharide to a commodity chemical, comprising: (a) contacting the suitable monosaccharide or oligosaccharide with a commodity chemical biosynthesis pathway, wherein the commodity chemical biosynthesis pathway comprises an aldehyde or ketone biosynthesis pathway, a C--C ligation pathway, and/or a dehydration and reduction pathway, thereby converting the suitable monosaccharide or oligosaccharide to the commodity chemical.
[0067]In certain aspects, the biomass is selected from marine biomass and vegetable/fruit/plant biomass. In certain aspects, the marine biomass is selected from kelp, giant kelp, sargasso, seaweed, algae, marine microflora, microalgae, and sea grass. In certain aspects, the vegetable/fruit/plant biomass comprises plant peel or pomace. In certain aspects, the vegetable/fruit/plant biomass is selected from citrus, potato, tomato, grape, gooseberry, carrot, mango, sugar-beet, apple, switchgrass, wood, and stover.
[0068]In certain aspects, the suitable monosaccharide or oligosaccharide is obtained from a biomass-derived polysaccharide, wherein the polysaccharide is selected from alginate, agar, carrageenan, fucoidan, pectin, polygalacturonate, cellulose, hemicellulose, xylan, arabinan, and mannan. In certain aspects, the suitable monosaccharide or oligosaccharide is selected from 2-keto-3-deoxy D-gluconate (KDG), guluronate, mannuronate, mannitol, lyxose, glycerol, xylitol, glucose, mannose, galactose, xylose, arabinose, glucuronate, galacturonates, rhamnose, and D-mannitol.
[0069]In certain aspects, the commodity chemical is selected from methane, methanol, ethane, ethene, ethanol, n-propane, 1-propene, 1-propanol, propanal, acetone, propionate, n-butane, 1-butene, 1-butanol, butanal, butanoate, isobutanal, isobutanol, 2-methylbutanal, 2-methylbutanol, 3-methylbutanal, 3-methylbutanol, 2-butene, 2-butanol, 2-butanone, 2,3-butanediol, 3-hydroxy-2-butanone, 2,3-butanedione, ethylbenzene, ethenylbenzene, 2-phenylethanol, phenylacetaldehyde, 1-phenylbutane, 4-phenyl-1-butene, 4-phenyl-2-butene, 1-phenyl-2-butene, 1-phenyl-2-butanol, 4-phenyl-2-butanol, 1-phenyl-2-butanone, 4-phenyl-2-butanone, 1-phenyl-2,3-butanediol, 1-phenyl-3-hydroxy-2-butanone, 4-phenyl-3-hydroxy-2-butanone, 1-phenyl-2,3-butanedione, n-pentane, ethylphenol, ethenylphenol, 2-(4-hydroxyphenyl)ethanol, 4-hydroxyphenylacetaldehyde, 1-(4-hydroxyphenyl) butane, 4-(4-hydroxyphenyl)-1-butene, 4-(4-hydroxyphenyl)-2-butene, 1-(4-hydroxyphenyl)-1-butene, 1-(4-hydroxyphenyl)-2-butanol, 4-(4-hydroxyphenyl)-2-butanol, 1-(4-hydroxyphenyl)-2-butanone, 4-(4-hydroxyphenyl)-2-butanone, 1-(4-hydroxyphenyl)-2,3-butanediol, 1-(4-hydroxyphenyl)-3-hydroxy-2-butanone, 4-(4-hydroxyphenyl)-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-2,3-butanedione, indolylethane, indolylethene, 2-(indole-3-)ethanol, n-pentane, 1-pentene, 1-pentanol, pentanal, pentanoate, 2-pentene, 2-pentanol, 3-pentanol, 2-pentanone, 3-pentanone, 4-methylpentanal, 4-methylpentanol, 2,3-pentanediol, 2-hydroxy-3-pentanone, 3-hydroxy-2-pentanone, 2,3-pentanedione, 2-methylpentane, 4-methyl-1-pentene, 4-methyl-2-pentene, 4-methyl-3-pentene, 4-methyl-2-pentanol, 2-methyl-3-pentanol, 4-methyl-2-pentanone, 2-methyl-3-pentanone, 4-methyl-2,3-pentanediol, 4-methyl-2-hydroxy-3-pentanone, 4-methyl-3-hydroxy-2-pentanone, 4-methyl-2,3-pentanedione, 1-phenylpentane, 1-phenyl-1-pentene, 1-phenyl-2-pentene, 1-phenyl-3-pentene, 1-phenyl-2-pentanol, 1-phenyl-3-pentanol, 1-phenyl-2-pentanone, 1-phenyl-3-pentanone, 1-phenyl-2,3-pentanediol,
1-phenyl-2-hydroxy-3-pentanone, 1-phenyl-3-hydroxy-2-pentanone, 1-phenyl-2,3-pentanedione, 4-methyl-1-phenylpentane, 4-methyl-1-phenyl-1-pentene, 4-methyl-1-phenyl-2-pentene, 4-methyl-1-phenyl-3-pentene, 4-methyl-1-phenyl-3-pentanol, 4-methyl-1-phenyl-2-pentanol, 4-methyl-1-phenyl-3-pentanone, 4-methyl-1-phenyl-2-pentanone, 4-methyl-1-phenyl-2,3-pentanediol, 4-methyl-1-phenyl-2,3-pentanedione, 4-methyl-1-phenyl-3-hydroxy-2-pentanone, 4-methyl-1-phenyl-2-hydroxy-3-pentanone, 1-(4-hydroxyphenyl) pentane, 1-(4-hydroxyphenyl)-1-pentene, 1-(4-hydroxyphenyl)-2-pentene, 1-(4-hydroxyphenyl)-3-pentene, 1-(4-hydroxyphenyl)-2-pentanol, 1-(4-hydroxyphenyl)-3-pentanol, 1-(4-hydroxyphenyl)-2-pentanone, 1-(4-hydroxyphenyl)-3-pentanone, 1-(4-hydroxyphenyl)-2,3-pentanediol, 1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone, 1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone, 1-(4-hydroxyphenyl)-2,3-pentanedione, 4-methyl-1-(4-hydroxyphenyl) pentane, 4-methyl-1-(4-hydroxyphenyl)-2-pentene, 4-methyl-1-(4-hydroxyphenyl)-3-pentene, 4-methyl-1-(4-hydroxyphenyl)-1-pentene, 4-methyl-1-(4-hydroxyphenyl)-3-pentanol, 4-methyl-1-(4-hydroxyphenyl)-2-pentanol, 4-methyl-1-(4-hydroxyphenyl)-3-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2,3-pentanediol, 4-methyl-1-(4-hydroxyphenyl)-2,3-pentanedione, 4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone, 1-indole-3-pentane, 1-(indole-3)-1-pentene, 1-(indole-3)-2-pentene, 1-(indole-3)-3-pentene, 1-(indole-3)-2-pentanol, 1-(indole-3)-3-pentanol, 1-(indole-3)-2-pentanone, 1-(indole-3)-3-pentanone, 1-(indole-3)-2,3-pentanediol, 1-(indole-3)-2-hydroxy-3-pentanone, 1-(indole-3)-3-hydroxy-2-pentanone, 1-(indole-3)-2,3-pentanedione, 4-methyl-1-(indole-3-)pentane, 4-methyl-1-(indole-3)-2-pentene, 4-methyl-1-(indole-3)-3-pentene, 4-methyl-1-(indole-3)-1-pentene, 4-methyl-2-(indole-3)-3-pentanol, 4-methyl-1-(indole-3)-2-pentanol, 4-methyl-1-(indole-3)-3-pentanone, 
4-methyl-1-(indole-3)-2-pentanone, 4-methyl-1-(indole-3)-2,3-pentanediol, 4-methyl-1-(indole-3)-2,3-pentanedione, 4-methyl-1-(indole-3)-3-hydroxy-2-pentanone, 4-methyl-1-(indole-3)-2-hydroxy-3-pentanone, n-hexane, 1-hexene, 1-hexanol, hexanal, hexanoate, 2-hexene, 3-hexene, 2-hexanol, 3-hexanol, 2-hexanone, 3-hexanone, 2,3-hexanediol, 2,3-hexanedione, 3,4-hexanediol, 3,4-hexanedione, 2-hydroxy-3-hexanone, 3-hydroxy-2-hexanone, 3-hydroxy-4-hexanone, 4-hydroxy-3-hexanone, 2-methylhexane, 3-methylhexane, 2-methyl-2-hexene, 2-methyl-3-hexene, 5-methyl-1-hexene, 5-methyl-2-hexene, 4-methyl-1-hexene, 4-methyl-2-hexene, 3-methyl-3-hexene, 3-methyl-2-hexene, 3-methyl-1-hexene, 2-methyl-3-hexanol, 5-methyl-2-hexanol, 5-methyl-3-hexanol, 2-methyl-3-hexanone, 5-methyl-2-hexanone, 5-methyl-3-hexanone, 2-methyl-3,4-hexanediol, 2-methyl-3,4-hexanedione, 5-methyl-2,3-hexanediol, 5-methyl-2,3-hexanedione, 4-methyl-2,3-hexanediol, 4-methyl-2,3-hexanedione, 2-methyl-3-hydroxy-4-hexanone, 2-methyl-4-hydroxy-3-hexanone, 5-methyl-2-hydroxy-3-hexanone, 5-methyl-3-hydroxy-2-hexanone, 4-methyl-2-hydroxy-3-hexanone, 4-methyl-3-hydroxy-2-hexanone, 2,5-dimethylhexane, 2,5-dimethyl-2-hexene, 2,5-dimethyl-3-hexene, 2,5-dimethyl-3-hexanol, 2,5-dimethyl-3-hexanone, 2,5-dimethyl-3,4-hexanediol, 2,5-dimethyl-3,4-hexanedione, 2,5-dimethyl-3-hydroxy-4-hexanone, 5-methyl-1-phenylhexane, 4-methyl-1-phenylhexane, 5-methyl-1-phenyl-1-hexene, 5-methyl-1-phenyl-2-hexene, 5-methyl-1-phenyl-3-hexene, 4-methyl-1-phenyl-1-hexene, 4-methyl-1-phenyl-2-hexene, 4-methyl-1-phenyl-3-hexene, 5-methyl-1-phenyl-2-hexanol, 5-methyl-1-phenyl-3-hexanol, 4-methyl-1-phenyl-2-hexanol, 4-methyl-1-phenyl-3-hexanol, 5-methyl-1-phenyl-2-hexanone, 5-methyl-1-phenyl-3-hexanone, 4-methyl-1-phenyl-2-hexanone, 4-methyl-1-phenyl-3-hexanone, 5-methyl-1-phenyl-2,3-hexanediol, 4-methyl-1-phenyl-2,3-hexanediol, 5-methyl-1-phenyl-3-hydroxy-2-hexanone, 5-methyl-1-phenyl-2-hydroxy-3-hexanone, 4-methyl-1-phenyl-3-hydroxy-2-hexanone, 
4-methyl-1-phenyl-2-hydroxy-3-hexanone, 5-methyl-1-phenyl-2,3-hexanedione, 4-methyl-1-phenyl-2,3-hexanedione, 4-methyl-1-(4-hydroxyphenyl)hexane, 5-methyl-1-(4-hydroxyphenyl)-1-hexene, 5-methyl-1-(4-hydroxyphenyl)-2-hexene, 5-methyl-1-(4-hydroxyphenyl)-3-hexene, 4-methyl-1-(4-hydroxyphenyl)-1-hexene, 4-methyl-1-(4-hydroxyphenyl)-2-hexene, 4-methyl-1-(4-hydroxyphenyl)-3-hexene, 5-methyl-1-(4-hydroxyphenyl)-2-hexanol, 5-methyl-1-(4-hydroxyphenyl)-3-hexanol, 4-methyl-1-(4-hydroxyphenyl)-2-hexanol, 4-methyl-1-(4-hydroxyphenyl)-3-hexanol, 5-methyl-1-(4-hydroxyphenyl)-2-hexanone, 5-methyl-1-(4-hydroxyphenyl)-3-hexanone, 4-methyl-1-(4-hydroxyphenyl)-2-hexanone, 4-methyl-1-(4-hydroxyphenyl)-3-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol, 4-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol, 5-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone, 4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone, 4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione, 4-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione, 4-methyl-1-(indole-3-)hexane, 5-methyl-1-(indole-3)-1-hexene, 5-methyl-1-(indole-3)-2-hexene, 5-methyl-1-(indole-3)-3-hexene, 4-methyl-1-(indole-3)-1-hexene, 4-methyl-1-(indole-3)-2-hexene, 4-methyl-1-(indole-3)-3-hexene, 5-methyl-1-(indole-3)-2-hexanol, 5-methyl-1-(indole-3)-3-hexanol, 4-methyl-1-(indole-3)-2-hexanol, 4-methyl-1-(indole-3)-3-hexanol, 5-methyl-1-(indole-3)-2-hexanone, 5-methyl-1-(indole-3)-3-hexanone, 4-methyl-1-(indole-3)-2-hexanone, 4-methyl-1-(indole-3)-3-hexanone, 5-methyl-1-(indole-3)-2,3-hexanediol, 4-methyl-1-(indole-3)-2,3-hexanediol, 5-methyl-1-(indole-3)-3-hydroxy-2-hexanone, 5-methyl-1-(indole-3)-2-hydroxy-3-hexanone, 4-methyl-1-(indole-3)-3-hydroxy-2-hexanone, 4-methyl-1-(indole-3)-2-hydroxy-3-hexanone, 5-methyl-1-(indole-3)-2,3-hexanedione, 4-methyl-1-(indole-3)-2,3-hexanedione, n-heptane, 1-heptene, 1-heptanol, heptanal, heptanoate, 2-heptene, 3-heptene, 2-heptanol, 
3-heptanol, 4-heptanol, 2-heptanone, 3-heptanone, 4-heptanone, 2,3-heptanediol, 2,3-heptanedione, 3,4-heptanediol, 3,4-heptanedione, 2-hydroxy-3-heptanone, 3-hydroxy-2-heptanone, 3-hydroxy-4-heptanone, 4-hydroxy-3-heptanone, 2-methylheptane, 3-methylheptane, 6-methyl-2-heptene, 6-methyl-3-heptene, 2-methyl-3-heptene, 2-methyl-2-heptene, 5-methyl-2-heptene, 5-methyl-3-heptene, 3-methyl-3-heptene, 2-methyl-3-heptanol, 2-methyl-4-heptanol, 6-methyl-3-heptanol, 5-methyl-3-heptanol, 3-methyl-4-heptanol, 2-methyl-3-heptanone, 2-methyl-4-heptanone, 6-methyl-3-heptanone, 5-methyl-3-heptanone, 3-methyl-4-heptanone, 2-methyl-3,4-heptanediol, 2-methyl-3,4-heptanedione, 6-methyl-3,4-heptanediol, 6-methyl-3,4-heptanedione, 5-methyl-3,4-heptanediol, 5-methyl-3,4-heptanedione, 2-methyl-3-hydroxy-4-heptanone, 2-methyl-4-hydroxy-3-heptanone, 6-methyl-3-hydroxy-4-heptanone, 6-methyl-4-hydroxy-3-heptanone, 5-methyl-3-hydroxy-4-heptanone, 5-methyl-4-hydroxy-3-heptanone, 2,6-dimethylheptane, 2,5-dimethylheptane, 2,6-dimethyl-2-heptene, 2,6-dimethyl-3-heptene, 2,5-dimethyl-2-heptene, 2,5-dimethyl-3-heptene, 3,6-dimethyl-3-heptene, 2,6-dimethyl-3-heptanol, 2,6-dimethyl-4-heptanol, 2,5-dimethyl-3-heptanol, 2,5-dimethyl-4-heptanol, 2,6-dimethyl-3,4-heptanediol, 2,6-dimethyl-3,4-heptanedione, 2,5-dimethyl-3,4-heptanediol, 2,5-dimethyl-3,4-heptanedione, 2,6-dimethyl-3-hydroxy-4-heptanone, 2,6-dimethyl-4-hydroxy-3-heptanone, 2,5-dimethyl-3-hydroxy-4-heptanone, 2,5-dimethyl-4-hydroxy-3-heptanone, n-octane, 1-octene, 2-octene, 1-octanol, octanal, octanoate, 3-octene, 4-octene, 4-octanol, 4-octanone, 4,5-octanediol, 4,5-octanedione, 4-hydroxy-5-octanone, 2-methyloctane, 2-methyl-3-octene, 2-methyl-4-octene, 7-methyl-3-octene, 3-methyl-3-octene, 3-methyl-4-octene, 6-methyl-3-octene, 2-methyl-4-octanol, 7-methyl-4-octanol, 3-methyl-4-octanol, 6-methyl-4-octanol, 2-methyl-4-octanone, 7-methyl-4-octanone, 3-methyl-4-octanone, 6-methyl-4-octanone, 2-methyl-4,5-octanediol, 2-methyl-4,5-octanedione, 
3-methyl-4,5-octanediol, 3-methyl-4,5-octanedione, 2-methyl-4-hydroxy-5-octanone, 2-methyl-5-hydroxy-4-octanone, 3-methyl-4-hydroxy-5-octanone, 3-methyl-5-hydroxy-4-octanone, 2,7-dimethyloctane, 2,7-dimethyl-3-octene, 2,7-dimethyl-4-octene, 2,7-dimethyl-4-octanol, 2,7-dimethyl-4-octanone, 2,7-dimethyl-4,5-octanediol, 2,7-dimethyl-4,5-octanedione, 2,7-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyloctane, 2,6-dimethyl-3-octene, 2,6-dimethyl-4-octene, 3,7-dimethyl-3-octene, 2,6-dimethyl-4-octanol, 3,7-dimethyl-4-octanol, 2,6-dimethyl-4-octanone, 3,7-dimethyl-4-octanone, 2,6-dimethyl-4,5-octanediol, 2,6-dimethyl-4,5-octanedione, 2,6-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyl-5-hydroxy-4-octanone, 3,6-dimethyloctane, 3,6-dimethyl-3-octene, 3,6-dimethyl-4-octene, 3,6-dimethyl-4-octanol, 3,6-dimethyl-4-octanone, 3,6-dimethyl-4,5-octanediol, 3,6-dimethyl-4,5-octanedione, 3,6-dimethyl-4-hydroxy-5-octanone, n-nonane, 1-nonene, 1-nonanol, nonanal, nonanoate, 2-methylnonane, 2-methyl-4-nonene, 2-methyl-5-nonene, 8-methyl-4-nonene, 2-methyl-5-nonanol, 8-methyl-4-nonanol, 2-methyl-5-nonanone, 8-methyl-4-nonanone, 8-methyl-4,5-nonanediol, 8-methyl-4,5-nonanedione, 8-methyl-4-hydroxy-5-nonanone, 8-methyl-5-hydroxy-4-nonanone, 2,8-dimethylnonane, 2,8-dimethyl-3-nonene, 2,8-dimethyl-4-nonene, 2,8-dimethyl-5-nonene, 2,8-dimethyl-4-nonanol, 2,8-dimethyl-5-nonanol, 2,8-dimethyl-4-nonanone, 2,8-dimethyl-5-nonanone, 2,8-dimethyl-4,5-nonanediol, 2,8-dimethyl-4,5-nonanedione, 2,8-dimethyl-4-hydroxy-5-nonanone, 2,8-dimethyl-5-hydroxy-4-nonanone, 2,7-dimethylnonane, 3,8-dimethyl-3-nonene, 3,8-dimethyl-4-nonene, 3,8-dimethyl-5-nonene, 3,8-dimethyl-4-nonanol, 3,8-dimethyl-5-nonanol, 3,8-dimethyl-4-nonanone, 3,8-dimethyl-5-nonanone, 3,8-dimethyl-4,5-nonanediol, 3,8-dimethyl-4,5-nonanedione, 3,8-dimethyl-4-hydroxy-5-nonanone, 3,8-dimethyl-5-hydroxy-4-nonanone, n-decane, 1-decene, 1-decanol, decanoate, 2,9-dimethyldecane, 2,9-dimethyl-3-decene, 2,9-dimethyl-4-decene, 2,9-dimethyl-5-decanol, 
2,9-dimethyl-5-decanone, 2,9-dimethyl-5,6-decanediol, 2,9-dimethyl-6-hydroxy-5-decanone, 2,9-dimethyl-5,6-decanedione, n-undecane, 1-undecene, 1-undecanol, undecanal, undecanoate, n-dodecane, 1-dodecene, 1-dodecanol, dodecanal, dodecanoate, n-tridecane, 1-tridecene, 1-tridecanol, tridecanal, tridecanoate, n-tetradecane, 1-tetradecene, 1-tetradecanol, tetradecanal, tetradecanoate, n-pentadecane, 1-pentadecene, 1-pentadecanol, pentadecanal, pentadecanoate, n-hexadecane, 1-hexadecene, 1-hexadecanol, hexadecanal, hexadecanoate, n-heptadecane, 1-heptadecene, 1-heptadecanol, heptadecanal, heptadecanoate, n-octadecane, 1-octadecene, 1-octadecanol, octadecanal, octadecanoate, n-nonadecane, 1-nonadecene, 1-nonadecanol, nonadecanal, nonadecanoate, eicosane, 1-eicosene, 1-eicosanol, eicosanal, eicosanoate, 3-hydroxy propanal, 1,3-propanediol, 4-hydroxybutanal, 1,4-butanediol, 3-hydroxy-2-butanone, 2,3-butanediol, 1,5-pentanediol, homocitrate, homoisocitrate, b-hydroxy adipate, glutarate, glutarsemialdehyde, glutaraldehyde, 2-hydroxy-1-cyclopentanone, 1,2-cyclopentanediol, cyclopentanone, cyclopentanol, (S)-2-acetolactate, (R)-2,3-Dihydroxy-isovalerate, 2-oxoisovalerate, isobutyryl-CoA, isobutyrate, isobutyraldehyde, 5-amino pentaldehyde, 1,10-diaminodecane, 1,10-diamino-5-decene, 1,10-diamino-5-hydroxydecane, 1,10-diamino-5-decanone, 1,10-diamino-5,6-decanediol, 1,10-diamino-6-hydroxy-5-decanone, phenylacetaldehyde, 1,4-diphenylbutane, 1,4-diphenyl-1-butene, 1,4-diphenyl-2-butene, 1,4-diphenyl-2-butanol, 1,4-diphenyl-2-butanone, 1,4-diphenyl-2,3-butanediol, 1,4-diphenyl-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-4-phenylbutane, 1-(4-hydroxyphenyl)-4-phenyl-1-butene, 1-(4-hydroxyphenyl)-4-phenyl-2-butene, 1-(4-hydroxyphenyl)-4-phenyl-2-butanol, 1-(4-hydroxyphenyl)-4-phenyl-2-butanone, 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol, 1-(4-hydroxyphenyl)-4-phenyl-3-hydroxy-2-butanone, 1-(indole-3)-4-phenylbutane,
1-(indole-3)-4-phenyl-1-butene, 1-(indole-3)-4-phenyl-2-butene, 1-(indole-3)-4-phenyl-2-butanol, 1-(indole-3)-4-phenyl-2-butanone, 1-(indole-3)-4-phenyl-2,3-butanediol, 1-(indole-3)-4-phenyl-3-hydroxy-2-butanone, 4-hydroxyphenylacetaldehyde, 1,4-di(4-hydroxyphenyl)butane, 1,4-di(4-hydroxyphenyl)-1-butene, 1,4-di(4-hydroxyphenyl)-2-butene, 1,4-di(4-hydroxyphenyl)-2-butanol, 1,4-di(4-hydroxyphenyl)-2-butanone, 1,4-di(4-hydroxyphenyl)-2,3-butanediol, 1,4-di(4-hydroxyphenyl)-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-4-(indole-3-)butane, 1-(4-hydroxyphenyl)-4-(indole-3)-1-butene, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butene,
1-(4-hydroxyphenyl)-4-(indole-3)-2-butanol, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butanone, 1-(4-hydroxyphenyl)-4-(indole-3)-2,3-butanediol, 1-(4-hydroxyphenyl)-4-(indole-3)-3-hydroxy-2-butanone, indole-3-acetaldehyde, 1,4-di(indole-3-)butane, 1,4-di(indole-3)-1-butene, 1,4-di(indole-3)-2-butene, 1,4-di(indole-3)-2-butanol, 1,4-di(indole-3)-2-butanone, 1,4-di(indole-3)-2,3-butanediol, 1,4-di(indole-3)-3-hydroxy-2-butanone, succinate semialdehyde, hexane-1,8-dicarboxylic acid, 3-hexene-1,8-dicarboxylic acid, 3-hydroxy-hexane-1,8-dicarboxylic acid, 3-hexanone-1,8-dicarboxylic acid, 3,4-hexanediol-1,8-dicarboxylic acid, 4-hydroxy-3-hexanone-1,8-dicarboxylic acid, fucoidan, iodine, chlorophyll, carotenoid, calcium, magnesium, iron, sodium, potassium, and phosphate.
[0070]Certain embodiments of the present invention include methods for converting a suitable monosaccharide or oligosaccharide to a commodity chemical, comprising: (b) contacting the suitable monosaccharide or oligosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide or oligosaccharide to the commodity chemical, wherein the microbial system comprises: (i) one or more genes encoding and expressing a biosynthesis pathway; (ii) one or more genes encoding and expressing a C--C ligation pathway; and (iii) a reduction and dehydration pathway, comprising one or more genes encoding and expressing an enzyme selected from a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase, thereby converting the suitable monosaccharide or oligosaccharide to the commodity chemical. In certain aspects, the biosynthesis pathway is selected from (a) an aldehyde biosynthesis pathway, (b) a ketone synthesis pathway, and (c) both (a) and (b).
[0071]In certain aspects, the biosynthesis pathway comprises an acetaldehyde biosynthesis pathway, and wherein the acetaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to an acetaldehyde. In certain aspects, the biosynthesis pathway comprises a propionaldehyde biosynthesis pathway, and wherein the propionaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a propionaldehyde. In certain aspects, the biosynthesis pathway comprises a butyraldehyde biosynthesis pathway, and wherein the butyraldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a butyraldehyde.
[0072]In certain aspects, the biosynthesis pathway comprises an isobutyraldehyde biosynthesis pathway, and wherein the isobutyraldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to an isobutyraldehyde. In certain aspects, the biosynthesis pathway comprises a 2-methylbutyraldehyde biosynthesis pathway, and wherein the 2-methylbutyraldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 2-methylbutyraldehyde. In certain aspects, the biosynthesis pathway comprises a 3-methylbutyraldehyde biosynthesis pathway, and wherein the 3-methylbutyraldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 3-methylbutyraldehyde.
[0073]In certain aspects, the biosynthesis pathway comprises a 4-methylpentaldehyde biosynthesis pathway, and wherein the 4-methylpentaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 4-methylpentaldehyde. In certain aspects, the biosynthesis pathway comprises a phenylacetaldehyde biosynthesis pathway, and wherein the phenylacetaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a phenylacetaldehyde. In certain aspects, the biosynthesis pathway comprises a 5-amino pentaldehyde biosynthesis pathway, and wherein the 5-amino pentaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 5-amino pentaldehyde.
[0074]In certain aspects, the biosynthesis pathway comprises a 2-(4-hydroxyphenyl)acetaldehyde biosynthesis pathway, and wherein the 2-(4-hydroxyphenyl)acetaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 2-(4-hydroxyphenyl)acetaldehyde. In certain aspects, the biosynthesis pathway comprises a 2-(indole-3-)acetaldehyde biosynthesis pathway, and wherein the 2-(indole-3-)acetaldehyde biosynthesis pathway converts the suitable monosaccharide or oligosaccharide to a 2-(indole-3-)acetaldehyde.
[0075]In certain aspects, the C--C ligation pathway comprises at least one enzyme selected from an acetaldehyde lyase, a propionaldehyde lyase, a butyraldehyde lyase, an isobutyraldehyde lyase, a 2-methyl-butyraldehyde lyase, a 3-methyl-butyraldehyde lyase, a phenylacetaldehyde lyase, an oxaloacetate decarboxylase, an α-keto glutarate decarboxylyase, an α-keto adipate decarboxylyase, a pentaldehyde lyase, a 4-methyl-pentaldehyde lyase, a hexaldehyde lyase, a heptaldehyde lyase, an octaldehyde lyase, a 4-hydroxyphenylacetaldehyde lyase, an indoleacetaldehyde lyase, an indolephenylacetaldehyde lyase, a benzaldehyde lyase, a pyruvate decarboxylase, a benzformate lyase, and a 2-keto isovalerate decarboxylase. In certain aspects, the C--C ligation pathway comprises a C--C ligase or an optimized C--C ligase.
[0076]In certain aspects, the C--C ligase or optimized C--C ligase comprises at least one enzymatic activity selected from an acetaldehyde lyase activity, a propionaldehyde lyase activity, a butyraldehyde lyase activity, an isobutyraldehyde lyase activity, a 2-methyl-butyraldehyde lyase activity, a 3-methyl-butyraldehyde lyase activity, a phenylacetaldehyde lyase activity, an oxaloacetate decarboxylase activity, an α-keto glutarate decarboxylyase activity, an α-keto adipate decarboxylyase activity, a pentaldehyde lyase activity, a 4-methyl-pentaldehyde lyase activity, a hexaldehyde lyase activity, a heptaldehyde lyase activity, an octaldehyde lyase activity, a 4-hydroxyphenylacetaldehyde lyase activity, an indoleacetaldehyde lyase activity, an indolephenylacetaldehyde lyase activity, a benzaldehyde lyase activity, a pyruvate decarboxylase activity, a benzformate lyase activity, and a 2-keto isovalerate decarboxylase activity.
[0077]In certain aspects, the C--C ligation pathway comprises a benzaldehyde lyase, or a biologically active variant or fragment thereof. In certain aspects, the benzaldehyde lyase is derived from Pseudomonas fluorescens. In certain aspects, the benzaldehyde lyase comprises a polypeptide having an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NO: 144. In certain aspects, the amino acid sequence of the benzaldehyde lyase comprises one or more conserved residues selected from G27, E50, A57, G155, P162, P234, D271, G277, G422, G447, D448, and G512.
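As an illustration of how the identity thresholds and conserved residues recited above might be checked computationally, the following Python sketch scores a pre-aligned pair of sequences and tests residue labels of the form "G27". It is a minimal sketch under stated assumptions: the two sequences are already aligned with "-" as the gap character, percent identity is taken as identical columns divided by aligned columns (other definitions exist), and residue positions use the numbering of the sequence passed in. The function names and the toy sequences are hypothetical and do not reproduce SEQ ID NO: 144.

```python
import re

# Identity thresholds recited in the text, in percent.
THRESHOLDS = (80, 90, 95, 98, 99)


def percent_identity(aligned_query: str, aligned_ref: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(q, r) for q, r in zip(aligned_query, aligned_ref)
               if not (q == "-" and r == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for q, r in columns if q == r and q != "-")
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list[int]:
    """Return every recited threshold that the given identity reaches."""
    return [t for t in THRESHOLDS if identity >= t]


def missing_conserved(sequence: str, residues: list[str]) -> list[str]:
    """Return the residue labels (e.g. 'G27') not found at their stated position."""
    missing = []
    for label in residues:
        match = re.fullmatch(r"([A-Z])(\d+)", label)
        if match is None:
            raise ValueError(f"bad residue label: {label!r}")
        amino_acid, position = match.group(1), int(match.group(2))
        # Positions are 1-based, as in the text.
        if position > len(sequence) or sequence[position - 1] != amino_acid:
            missing.append(label)
    return missing


if __name__ == "__main__":
    # Toy 10-residue sequences differing at one position.
    identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
    print(identity, thresholds_met(identity))
    print(missing_conserved("MGA", ["G2", "E3"]))
```

For a variant whose numbering differs from the reference, the positions would first have to be mapped through the alignment, which this sketch does not attempt.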
[0078]In certain aspects, the dehydration and reduction pathway comprises a diol dehydrogenase selected from 2,3-butanediol dehydrogenase, 3,4-hexanediol dehydrogenase, 4,5-octanediol dehydrogenase, 5,6-decanediol dehydrogenase, 6,7-dodecanediol dehydrogenase, 7,8-tetradecanediol dehydrogenase, 8,9-hexadecanediol dehydrogenase, 2,5-dimethyl-3,4-hexanediol dehydrogenase, 3,6-dimethyl-4,5-octanediol dehydrogenase, 2,7-dimethyl-4,5-octanediol dehydrogenase, 2,9-dimethyl-5,6-decanediol dehydrogenase, 1,4-diphenyl-2,3-butanediol dehydrogenase, bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1,4-diindole-2,3-butanediol dehydrogenase, 1,2-cyclopentanediol dehydrogenase, 2,3-pentanediol dehydrogenase, 2,3-hexanediol dehydrogenase, 2,3-heptanediol dehydrogenase, 2,3-octanediol dehydrogenase, 2,3-nonanediol dehydrogenase, 4-methyl-2,3-pentanediol dehydrogenase, 4-methyl-2,3-hexanediol dehydrogenase, 5-methyl-2,3-hexanediol dehydrogenase, 6-methyl-2,3-heptanediol dehydrogenase, 1-phenyl-2,3-butanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1-indole-2,3-butanediol dehydrogenase, 3,4-heptanediol dehydrogenase, 3,4-octanediol dehydrogenase, 3,4-nonanediol dehydrogenase, 3,4-decanediol dehydrogenase, 3,4-undecanediol dehydrogenase, 2-methyl-3,4-hexanediol dehydrogenase, 5-methyl-3,4-heptanediol dehydrogenase, 6-methyl-3,4-heptanediol dehydrogenase, 7-methyl-3,4-octanediol dehydrogenase, 1-phenyl-2,3-pentanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-pentanediol dehydrogenase, 1-indole-2,3-pentanediol dehydrogenase, 4,5-nonanediol dehydrogenase, 4,5-decanediol dehydrogenase, 4,5-undecanediol dehydrogenase, 4,5-dodecanediol dehydrogenase, 2-methyl-3,4-heptanediol dehydrogenase, 3-methyl-4,5-octanediol dehydrogenase, 2-methyl-4,5-octanediol dehydrogenase, 8-methyl-4,5-nonanediol dehydrogenase, 1-phenyl-2,3-hexanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-hexanediol dehydrogenase, 1-indole-2,3-hexanediol dehydrogenase, 5,6-undecanediol 
dehydrogenase, 5,6-undecanediol dehydrogenase, 5,6-tridecanediol dehydrogenase, 2-methyl-3,4-octanediol dehydrogenase, 3-methyl-4,5-nonanediol dehydrogenase, 2-methyl-4,5-nonanediol dehydrogenase, 2-methyl-5,6-decanediol dehydrogenase, 1-phenyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-heptanediol dehydrogenase, 1-indole-2,3-heptanediol dehydrogenase, 6,7-tridecanediol dehydrogenase, 6,7-tetradecanediol dehydrogenase, 2-methyl-3,4-nonanediol dehydrogenase, 3-methyl-4,5-decanediol dehydrogenase, 2-methyl-4,5-decanediol dehydrogenase, 2-methyl-5,6-undecanediol dehydrogenase, 1-phenyl-2,3-octanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-octanediol dehydrogenase, 1-indole-2,3-octanediol dehydrogenase, 7,8-pentadecanediol dehydrogenase, 2-methyl-3,4-decanediol dehydrogenase, 3-methyl-4,5-undecanediol dehydrogenase, 2-methyl-4,5-undecanediol dehydrogenase, 2-methyl-5,6-dodecanediol dehydrogenase, 1-phenyl-2,3-nonanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-nonanediol dehydrogenase, 1-indole-2,3-nonanediol dehydrogenase, 2-methyl-3,4-undecanediol dehydrogenase, 3-methyl-4,5-dodecanediol dehydrogenase, 2-methyl-4,5-dodecanediol dehydrogenase, 2-methyl-5,6-tridecanediol dehydrogenase, 1-phenyl-2,3-decanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-decanediol dehydrogenase, 1-indole-2,3-decanediol dehydrogenase, 2,5-dimethyl-3,4-heptanediol dehydrogenase, 2,6-dimethyl-3,4-heptanediol dehydrogenase, 2,7-dimethyl-3,4-octanediol dehydrogenase, 1-phenyl-4-methyl-2,3-pentanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydrogenase, 1-indole-4-methyl-2,3-pentanediol dehydrogenase, 2,6-dimethyl-4,5-octanediol dehydrogenase, 3,8-dimethyl-4,5-nonanediol dehydrogenase, 1-phenyl-4-methyl-2,3-hexanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydrogenase, 1-indole-4-methyl-2,3-hexanediol dehydrogenase, 2,8-dimethyl-4,5-nonanediol dehydrogenase, 1-phenyl-5-methyl-2,3-hexanediol dehydrogenase, 
1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydrogenase, 1-indole-5-methyl-2,3-hexanediol dehydrogenase, 1-phenyl-6-methyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydrogenase, 1-indole-6-methyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydrogenase, 1-indole-4-phenyl-2,3-butanediol dehydrogenase, 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1,10-diamino-5,6-decanediol dehydrogenase, 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, and 2,3-hexanediol-1,6-dicarboxylic acid dehydrogenase. In certain aspects, the diol dehydrogenase comprises a polypeptide having an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NOS:98, 100, or 102.
[0079]In certain aspects, the dehydration and reduction pathway comprises a diol dehydratase selected from 2,3-butanediol dehydratase, 3,4-hexanediol dehydratase, 4,5-octanediol dehydratase, 5,6-decanediol dehydratase, 6,7-dodecanediol dehydratase, 7,8-tetradecanediol dehydratase, 8,9-hexadecanediol dehydratase, 2,5-dimethyl-3,4-hexanediol dehydratase, 3,6-dimethyl-4,5-octanediol dehydratase, 2,7-dimethyl-4,5-octanediol dehydratase, 2,9-dimethyl-5,6-decanediol dehydratase, 1,4-diphenyl-2,3-butanediol dehydratase, bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1,4-diindole-2,3-butanediol dehydratase, 1,2-cyclopentanediol dehydratase, 2,3-pentanediol dehydratase, 2,3-hexanediol dehydratase, 2,3-heptanediol dehydratase, 2,3-octanediol dehydratase, 2,3-nonanediol dehydratase, 4-methyl-2,3-pentanediol dehydratase, 4-methyl-2,3-hexanediol dehydratase, 5-methyl-2,3-hexanediol dehydratase, 6-methyl-2,3-heptanediol dehydratase, 1-phenyl-2,3-butanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1-indole-2,3-butanediol dehydratase, 3,4-heptanediol dehydratase, 3,4-octanediol dehydratase, 3,4-nonanediol dehydratase, 3,4-decanediol dehydratase, 3,4-undecanediol dehydratase, 2-methyl-3,4-hexanediol dehydratase, 5-methyl-3,4-heptanediol dehydratase, 6-methyl-3,4-heptanediol dehydratase, 7-methyl-3,4-octanediol dehydratase, 1-phenyl-2,3-pentanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-pentanediol dehydratase, 1-indole-2,3-pentanediol dehydratase, 4,5-nonanediol dehydratase, 4,5-decanediol dehydratase, 4,5-undecanediol dehydratase, 4,5-dodecanediol dehydratase, 2-methyl-3,4-heptanediol dehydratase, 3-methyl-4,5-octanediol dehydratase, 2-methyl-4,5-octanediol dehydratase, 8-methyl-4,5-nonanediol dehydratase, 1-phenyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-hexanediol dehydratase, 1-indole-2,3-hexanediol dehydratase, 5,6-undecanediol dehydratase, 5,6-undecanediol dehydratase, 5,6-tridecanediol dehydratase, 2-methyl-3,4-octanediol dehydratase, 
3-methyl-4,5-nonanediol dehydratase, 2-methyl-4,5-nonanediol dehydratase, 2-methyl-5,6-decanediol dehydratase, 1-phenyl-2,3-heptanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-heptanediol dehydratase, 1-indole-2,3-heptanediol dehydratase, 6,7-tridecanediol dehydratase, 6,7-tetradecanediol dehydratase, 2-methyl-3,4-nonanediol dehydratase, 3-methyl-4,5-decanediol dehydratase, 2-methyl-4,5-decanediol dehydratase, 2-methyl-5,6-undecanediol dehydratase, 1-phenyl-2,3-octanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-octanediol dehydratase, 1-indole-2,3-octanediol dehydratase, 7,8-pentadecanediol dehydratase, 2-methyl-3,4-decanediol dehydratase, 3-methyl-4,5-undecanediol dehydratase, 2-methyl-4,5-undecanediol dehydratase, 2-methyl-5,6-dodecanediol dehydratase, 1-phenyl-2,3-nonanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-nonanediol dehydratase, 1-indole-2,3-nonanediol dehydratase, 2-methyl-3,4-undecanediol dehydratase, 3-methyl-4,5-dodecanediol dehydratase, 2-methyl-4,5-dodecanediol dehydratase, 2-methyl-5,6-tridecanediol dehydratase, 1-phenyl-2,3-decanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-decanediol dehydratase, 1-indole-2,3-decanediol dehydratase, 2,5-dimethyl-3,4-heptanediol dehydratase, 2,6-dimethyl-3,4-heptanediol dehydratase, 2,7-dimethyl-3,4-octanediol dehydratase, 1-phenyl-4-methyl-2,3-pentanediol dehydratase, 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydratase, 1-indole-4-methyl-2,3-pentanediol dehydratase, 2,6-dimethyl-4,5-octanediol dehydratase, 3,8-dimethyl-4,5-nonanediol dehydratase, 1-phenyl-4-methyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydratase, 1-indole-4-methyl-2,3-hexanediol dehydratase, 2,8-dimethyl-4,5-nonanediol dehydratase, 1-phenyl-5-methyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydratase, 1-indole-5-methyl-2,3-hexanediol dehydratase, 1-phenyl-6-methyl-2,3-heptanediol dehydratase, 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydratase, 1-indole-6-methyl-2,3-heptanediol 
dehydratase, 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydratase, 1-indole-4-phenyl-2,3-butanediol dehydratase, 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1,10-diamino-5,6-decanediol dehydratase, 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydratase, and 2,3-hexanediol-1,6-dicarboxylic acid dehydratase.
[0080]In certain aspects, the diol dehydratase comprises a polypeptide having an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NOS:104, 106, 108, 308, 309, 310, or 311. In certain aspects, the polypeptide of SEQ ID NO: 104 comprises one or more conserved residues selected from D149, P151, A155, A159, G165, E168, E170, A183, G189, G196, Q200, E208, G215, Y219, E221, T222, S224, Y226, G227, T228, F232, G235, D236, D237, T238, P239, S241, L245, Y249, S251, R252, G253, K255, R257, S260, E265, M268, G269, S275, Y278, L279, E280, C283, G291, Q293, G294, Q296, N297, G298, G312, E329, S341, R344, G356, D371, N372, F374, S377, R392, D393, R412, L477, A486, G499, D500, S516, N522, D523, Y524, G526, and G530.
[0081]In certain aspects, the polypeptide of SEQ ID NO:310 comprises one or more conserved residues selected from T36, G74, P87, E88, E97, W126, R221, A263, Q265, R287, D289, E309, R317, G335, G345, G346, N356, P374, R379, G399, G401, P403, D408, G432, C433, N452, C529, G533, G539, G540, S559, G603, N604, A654, G658, R659, D676, N702, Q735, N737, A747, P751, R760, V761, A762, G763, Q776, I780, and R782. In certain aspects, the polypeptide of SEQ ID NO:311 comprises one or more conserved residues selected from D19, G20, G22, R24, F28, G31, C32, C36, W38, C39, N41, P42, C58, C64, C96, G129, T132, G135, G136, D185, R187, N208, R222, and R264.
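By way of illustration only, the conserved-residue language above can be read as a position check against a candidate sequence. The following Python sketch parses labels such as D19 or G20 into a one-letter residue code and a 1-based position, and reports which labels are satisfied. The function names and the toy sequence are assumptions made for this example and are not taken from the sequence listing. The sketch also assumes the candidate shares the numbering of the reference sequence, which in practice would first require an alignment.

```python
import re


def parse_label(label):
    """Split a residue label such as 'D19' into ('D', 19)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"not a residue label: {label!r}")
    return match.group(1), int(match.group(2))


def conserved_present(sequence, labels):
    """Return the labels whose residue occurs at the stated 1-based position."""
    found = []
    for label in labels:
        residue, position = parse_label(label)
        in_range = 1 <= position <= len(sequence)
        if in_range and sequence[position - 1] == residue:
            found.append(label)
    return found


# Hypothetical 24-residue toy sequence (NOT SEQ ID NO:311).
toy = "M" * 18 + "DGAGAR"
labels = ["D19", "G20", "G22", "R24"]
print(conserved_present(toy, labels))  # ['D19', 'G20', 'G22', 'R24']
```

A sequence "comprising one or more conserved residues" would, under this reading, be one for which the returned list is non-empty.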
[0082]In certain aspects, the dehydration and reduction pathway comprises a secondary alcohol dehydrogenase selected from 2-butanol dehydrogenase, 3-hexanol dehydrogenase, 4-octanol dehydrogenase, 5-decanol dehydrogenase, 6-dodecanol dehydrogenase, 7-tetradecanol dehydrogenase, 8-hexadecanol dehydrogenase, 2,5-dimethyl-3-hexanol dehydrogenase, 3,6-dimethyl-4-octanol dehydrogenase, 2,7-dimethyl-4-octanol dehydrogenase, 2,9-dimethyl-4-decanol dehydrogenase, 1,4-diphenyl-2-butanol dehydrogenase, bis-1,4-(4-hydroxyphenyl)-2-butanol dehydrogenase, 1,4-diindole-2-butanol dehydrogenase, cyclopentanol dehydrogenase, 2(or 3)-pentanol dehydrogenase, 2(or 3)-hexanol dehydrogenase, 2(or 3)-heptanol dehydrogenase, 2(or 3)-octanol dehydrogenase, 2(or 3)-nonanol dehydrogenase, 4-methyl-2(or 3)-pentanol dehydrogenase, 4-methyl-2(or 3)-hexanol dehydrogenase, 5-methyl-2(or 3)-hexanol dehydrogenase, 6-methyl-2(or 3)-heptanol dehydrogenase, 1-phenyl-2(or 3)-butanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase, 1-indole-2(or 3)-butanol dehydrogenase, 3(or 4)-heptanol dehydrogenase, 3(or 4)-octanol dehydrogenase, 3(or 4)-nonanol dehydrogenase, 3(or 4)-decanol dehydrogenase, 3(or 4)-undecanol dehydrogenase, 2-methyl-3(or 4)-hexanol dehydrogenase, 5-methyl-3(or 4)-heptanol dehydrogenase, 6-methyl-3(or 4)-heptanol dehydrogenase, 7-methyl-3(or 4)-octanol dehydrogenase, 1-phenyl-2(or 3)-pentanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-pentanol dehydrogenase, 1-indole-2(or 3)-pentanol dehydrogenase, 4(or 5)-nonanol dehydrogenase, 4(or 5)-decanol dehydrogenase, 4(or 5)-undecanol dehydrogenase, 4(or 5)-dodecanol dehydrogenase, 2-methyl-3(or 4)-heptanol dehydrogenase, 3-methyl-4(or 5)-octanol dehydrogenase, 2-methyl-4(or 5)-octanol dehydrogenase, 8-methyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-2(or 3)-hexanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-hexanol dehydrogenase, 1-indole-2(or 3)-hexanol dehydrogenase, 4(or 5)-undecanol dehydrogenase, 5(or 6)-undecanol 
dehydrogenase, 5(or 6)-tridecanol dehydrogenase, 2-methyl-3(or 4)-octanol dehydrogenase, 3-methyl-4(or 5)-nonanol dehydrogenase, 2-methyl-4(or 5)-nonanol dehydrogenase, 2-methyl-5(or 6)-decanol dehydrogenase, 1-phenyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-heptanol dehydrogenase, 1-indole-2(or 3)-heptanol dehydrogenase, 6(or 7)-tridecanol dehydrogenase, 6(or 7)-tetradecanol dehydrogenase, 2-methyl-3(or 4)-nonanol dehydrogenase, 3-methyl-4(or 5)-decanol dehydrogenase, 2-methyl-4(or 5)-decanol dehydrogenase, 2-methyl-5(or 6)-undecanol dehydrogenase, 1-phenyl-2(or 3)-octanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-octanol dehydrogenase, 1-indole-2(or 3)-octanol dehydrogenase, 7(or 8)-pentadecanol dehydrogenase, 2-methyl-3(or 4)-decanol dehydrogenase, 3-methyl-4(or 5)-undecanol dehydrogenase, 2-methyl-4(or 5)-undecanol dehydrogenase, 2-methyl-5 (or 6)-dodecanol dehydrogenase, 1-phenyl-2(or 3)-nonanol dehydrogenase, 1-(4-hydroxyphenyl)-2 (or 3)-nonanol dehydrogenase, 1-indole-2(or 3)-nonanol dehydrogenase, 2-methyl-3(or 4)-undecanol dehydrogenase, 3-methyl-4(or 5)-dodecanol dehydrogenase, 2-methyl-4(or 5)-dodecanol dehydrogenase, 2-methyl-5(or 6)-tridecanol dehydrogenase, 1-phenyl-2(or 3)-decanol dehydrogenase, 1-(4-hydroxyphenyl)-2 (or 3)-decanol dehydrogenase, 1-indole-2(or 3)-decanol dehydrogenase, 2,5-dimethyl-3(or 4)-heptanol dehydrogenase, 2,6-dimethyl-3(or 4)-heptanol dehydrogenase, 2,7-dimethyl-3(or 4)-octanol dehydrogenase, 1-phenyl-4-methyl-2(or 3)-pentanol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2(or 3)-pentanol dehydrogenase, 1-indole-4-methyl-2(or 3)-pentanol dehydrogenase, 2,6-dimethyl-4(or 5)-octanol dehydrogenase, 3,8-dimethyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-4-methyl-2(or 3)-hexanol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2 (or 3)-hexanol dehydrogenase, 1-indole-4-methyl-2(or 3)-hexanol dehydrogenase, 2,8-dimethyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-5-methyl-2(or 3)-hexanol dehydrogenase, 
1-(4-hydroxyphenyl)-5-methyl-2(or 3)-hexanol dehydrogenase, 1-indole-5-methyl-2(or 3)-hexanol dehydrogenase, 1-phenyl-6-methyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-6-methyl-2(or 3)-heptanol dehydrogenase, 1-indole-6-methyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-4-phenyl-2(or 3)-butanol dehydrogenase, 1-indole-4-phenyl-2(or 3)-butanol dehydrogenase, 1-indole-4-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase, 1,10-diamino-5-decanol dehydrogenase, 1,4-di(4-hydroxyphenyl)-2-butanol dehydrogenase, 2-hexanol-1,6-dicarboxylic acid dehydrogenase, phenylethanol dehydrogenase, 4-hydroxyphenylethanol dehydrogenase, and indole-3-ethanol dehydrogenase.
[0083]In certain aspects, the secondary alcohol dehydrogenase comprises a polypeptide having an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NOS:110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, or 142.
[0084]In certain aspects, the secondary alcohol dehydrogenase comprises at least one of a nicotinamide adenine dinucleotide (NAD+), an NADH, a nicotinamide adenine dinucleotide phosphate (NADP+), or an NADPH binding motif. In certain aspects, the NAD+, NADH, NADP+, or NADPH binding motif is selected from the group consisting of Y-X-G-G-X-Y, Y-X-X-G-G-X-Y, Y-X-X-X-G-G-X-Y, Y-X-G-X-X-Y, Y-X-X-G-G-X-X-Y, Y-X-X-X-G-X-X-Y, Y-X-G-X-Y, Y-X-X-G-X-Y, Y-X-X-X-G-X-Y, and Y-X-X-X-X-G-X-Y; wherein Y is independently selected from alanine, glycine, and serine, wherein G is glycine, and wherein X is independently selected from a genetically encoded amino acid.
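By way of illustration only, the binding motifs recited above can be translated into regular expressions and searched against a one-letter amino acid sequence. The following Python sketch does so under stated assumptions: "genetically encoded amino acid" is taken to mean the 20 standard amino acids, the names `MOTIFS`, `motif_to_regex`, and `find_motifs` are introduced here for the example, and the test sequence is hypothetical. Note that "Y" in the motif notation is a placeholder for alanine, glycine, or serine, not the one-letter code for tyrosine.

```python
import re

# Assumption: the 20 standard one-letter amino acid codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Motifs as recited in the text above.
MOTIFS = [
    "Y-X-G-G-X-Y", "Y-X-X-G-G-X-Y", "Y-X-X-X-G-G-X-Y", "Y-X-G-X-X-Y",
    "Y-X-X-G-G-X-X-Y", "Y-X-X-X-G-X-X-Y", "Y-X-G-X-Y", "Y-X-X-G-X-Y",
    "Y-X-X-X-G-X-Y", "Y-X-X-X-X-G-X-Y",
]

# Motif token -> regex character class.
TOKEN_TO_REGEX = {
    "Y": "[AGS]",              # placeholder: alanine, glycine, or serine
    "G": "G",                  # glycine
    "X": f"[{AMINO_ACIDS}]",   # any standard amino acid
}


def motif_to_regex(motif):
    """Translate a dash-separated motif into a regex pattern string."""
    return "".join(TOKEN_TO_REGEX[token] for token in motif.split("-"))


def find_motifs(sequence):
    """Return (motif, 1-based start, matched text) for every occurrence."""
    hits = []
    for motif in MOTIFS:
        # A lookahead is used so that overlapping occurrences are reported.
        pattern = re.compile(f"(?=({motif_to_regex(motif)}))")
        for match in pattern.finditer(sequence):
            hits.append((motif, match.start() + 1, match.group(1)))
    return hits


# Hypothetical sequence fragment used only to exercise the search.
for hit in find_motifs("MTGAGGIGK"):
    print(hit)
```

Because several motifs differ only in spacer length, a single glycine-rich stretch may satisfy more than one of them; the sketch reports each match separately rather than choosing among them.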
[0085]Certain embodiments include a recombinant microorganism, comprising (i) one or more genes encoding and expressing an aldehyde and/or ketone biosynthesis pathway; (ii) one or more genes encoding and expressing a C--C ligation pathway; and (iii) a reduction and dehydration pathway, comprising one or more genes encoding and expressing an enzyme selected from a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase. In certain aspects, the microorganism is capable of converting a suitable monosaccharide or suitable oligosaccharide to a commodity chemical, or an intermediate thereof.
[0086]In certain aspects, the one or more genes encoding the biosynthesis pathway encode a pathway selected from an acetaldehyde, propionaldehyde, butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, 4-methylpentaldehyde, phenyl acetaldehyde, glutaraldehyde, 5-amino-pentaldehyde, succinate semialdehyde, succinate 4-hydroxyphenyl acetaldehyde, and an indole-3-acetaldehyde biosynthesis pathway, and combinations thereof.
[0087]In certain aspects, the one or more genes encoding and expressing the C--C ligation pathway comprise a nucleotide sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the nucleotide sequence set forth in SEQ ID NO:143. In certain aspects, the one or more genes encoding the diol dehydrogenase comprise a nucleotide sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the nucleotide sequence set forth in SEQ ID NOS:97, 99, or 101. In certain aspects, the one or more genes encoding the diol dehydratase comprise a nucleotide sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the nucleotide sequence set forth in SEQ ID NOS: 103, 105, or 107. In certain aspects, the one or more genes encoding a secondary alcohol dehydrogenase comprise a nucleotide sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the nucleotide sequence set forth in SEQ ID NOS: 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, or 141.
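By way of illustration only, the "at least 80%, 90%, 95%, 98%, or 99% identical" language above can be made concrete with a simple calculation over two sequences that have already been aligned. The following Python sketch counts identical columns and divides by the number of aligned columns. This is only one of several conventions for percent identity, and the choice of denominator, the function names, and the toy sequences are assumptions made for this example; the alignment itself is not computed here.

```python
THRESHOLDS = (80, 90, 95, 98, 99)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes the inputs are already aligned (equal length, gaps as `gap`).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def thresholds_met(identity):
    """Return the recited identity thresholds that `identity` satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Hypothetical 10-nucleotide toy sequences differing at one position.
identity = percent_identity("ATGCATGCAT", "ATGCATGCAA")
print(identity)                  # 90.0
print(thresholds_met(identity))  # [80, 90]
```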
[0088]Certain embodiments include a recombinant microorganism, comprising one or more genes encoding and expressing an aldehyde or ketone biosynthesis pathway, wherein the pathway comprises at least one exogenous gene. Certain embodiments include a recombinant microorganism, comprising one or more exogenous genes encoding and expressing one or more enzymes selected from a C--C ligase, a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase. In certain aspects, the one or more enzymes comprise a C--C ligase and a diol dehydrogenase. In certain aspects, the one or more enzymes comprise a diol dehydrogenase and a diol dehydratase.
[0089]In certain aspects, the recombinant microorganism comprises reduced ethanol production capability compared to a wild-type microorganism. In certain aspects, the microorganism comprises a reduction or inhibition in the conversion of acetyl-coA to ethanol. In certain aspects, the recombinant microorganism comprises a reduction of an ethanol dehydrogenase, thereby providing a reduced ethanol production capability. In certain aspects, the ethanol dehydrogenase is an adhE, or a homolog or variant thereof. In certain aspects, the microorganism comprises a deletion or knockout of an adhE, or a homolog or variant thereof. In certain aspects, the recombinant microorganism comprises one or more deletions or knockouts in a gene encoding an enzyme selected from an enzyme that catalyzes the conversion of acetyl-coA to ethanol, an enzyme that catalyzes the conversion of pyruvate to lactate, an enzyme that catalyzes the conversion of fumarate to succinate, an enzyme that catalyzes the conversion of acetyl-coA and phosphate to coA and acetyl phosphate, an enzyme that catalyzes the conversion of acetyl-coA and formate to coA and pyruvate, and an enzyme that catalyzes the conversion of alpha-keto acid to branched chain amino acids.
[0090]In certain aspects, the microorganism is a bacterium. In certain aspects, the microorganism is a Gram-negative bacterium. In certain aspects, the microorganism is a eukaryote. In certain aspects, the eukaryote is a fungus. In certain aspects, the fungus is a yeast.
[0091]Certain embodiments include methods for converting a suitable monosaccharide to a commodity chemical, comprising (a) contacting the suitable monosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide to the commodity chemical, wherein the microbial system comprises (i) one or more genes encoding and expressing a pathway selected from a fatty acid biosynthesis pathway, an amino acid biosynthetic pathway, and a short chain alcohol biosynthetic pathway; (ii) one or more genes encoding and expressing a keto-acid decarboxylase, aldehyde dehydrogenase, and/or alcohol dehydrogenase; and (iii) an enzymatic reduction pathway selected from (1) an enzymatic long chain alcohol reduction pathway, (2) an enzymatic decarbonylation pathway, (3) an enzymatic decarboxylation pathway, and (4) an enzymatic reduction pathway comprising (1), (2), and/or (3), thereby converting the suitable monosaccharide or oligosaccharide to the commodity chemical.
[0092]Certain embodiments include a recombinant microorganism, comprising (i) one or more genes encoding and expressing a pathway selected from a fatty acid biosynthesis pathway, an amino acid biosynthetic pathway, and a short chain alcohol biosynthetic pathway; (ii) one or more genes encoding and expressing a keto-acid decarboxylase, aldehyde dehydrogenase, and/or alcohol dehydrogenase; and (iii) an enzymatic reduction pathway selected from (1) an enzymatic long chain alcohol reduction pathway, (2) an enzymatic decarbonylation pathway, (3) an enzymatic decarboxylation pathway, and (4) an enzymatic reduction pathway comprising (1), (2), and/or (3).
[0093]In certain aspects, the recombinant microorganism or microbial systems described herein comprise a microorganism selected from Acetobacter aceti, Achromobacter, Acidiphilium, Acinetobacter, Actinomadura, Actinoplanes, Aeropyrum pernix, Agrobacterium, Alcaligenes, Ananas comosus (M), Arthrobacter, Aspergillus niger, Aspergillus oryzae, Aspergillus melleus, Aspergillus pulverulentus, Aspergillus saitoi, Aspergillus sojae, Aspergillus usamii, Bacillus alcalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus lentus, Bacillus licheniformis, Bacillus macerans, Bacillus stearothermophilus, Bacillus subtilis, Bifidobacterium, Brevibacillus brevis, Burkholderia cepacia, Candida cylindracea, Candida rugosa, Carica papaya (L), Cellulosimicrobium, Cephalosporium, Chaetomium erraticum, Chaetomium gracile, Clostridium, Clostridium butyricum, Clostridium acetobutylicum, Clostridium thermocellum, Corynebacterium (glutamicum), Corynebacterium efficiens, Escherichia coli, Enterococcus, Erwinia chrysanthemi, Gluconobacter, Gluconacetobacter, Haloarcula, Humicola insolens, Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kluyveromyces, Kluyveromyces fragilis, Kluyveromyces lactis, Kocuria, Lactlactis, Lactobacillus, Lactobacillus fermentum, Lactobacillus sake, Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis, Methanolobus siciliae, Methanogenium organophilum, Methanobacterium bryantii, Microbacterium imperiale, Micrococcus lysodeikticus, Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium, Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus, Pediococcus halophilus, Penicillium, Penicillium camemberti, Penicillium citrinum, Penicillium emersonii, Penicillium roqueforti, Penicillium lilacinum, Penicillium multicolor, Paracoccus pantotrophus, Propionibacterium, Pseudomonas, Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus, Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar, Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus oligosporus, Rhodococcus, Saccharomyces cerevisiae, Sclerotinia libertiana, Sphingobacterium multivorum, Sphingobium, Sphingomonas, Streptococcus, Streptococcus thermophilus Y-1, Streptomyces, Streptomyces griseus, Streptomyces lividans, Streptomyces murinus, Streptomyces rubiginosus, Streptomyces violaceoruber, Streptoverticillium mobaraense, Tetragenococcus, Thermus, Thiosphaera pantotropha, Trametes, Trichoderma, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, Trichosporon penicillatum, Vibrio alginolyticus, Xanthomonas, yeast, Zygosaccharomyces rouxii, Zymomonas, and Zymomonas mobilis.
[0094]Certain embodiments include a commodity chemical produced by the methods described herein. Certain aspects include a blended commodity chemical comprising a commodity chemical produced by the methods provided herein and a refinery-produced petroleum product. In certain aspects, the commodity chemical is selected from a C10-C12 hydrocarbon, 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol. In certain aspects, the C10-C12 hydrocarbon is selected from 2,7-dimethyloctane and 2,9-dimethyldecane. In certain aspects, the refinery-produced petroleum product is selected from jet fuel and diesel fuel.
[0095]Certain embodiments include methods of producing a commodity chemical enriched refinery-produced petroleum product, comprising (a) blending the refinery-produced petroleum product with the commodity chemical produced by the methods described herein, thereby producing the commodity chemical enriched refinery-produced petroleum product.
DETAILED DESCRIPTION
[0096]Embodiments of the present invention relate to the unexpected discovery that microorganisms that are otherwise incapable of growing on certain biomass-derived polysaccharides as a sole source of carbon can be engineered to grow on these polysaccharides as a sole source of carbon. Such microorganisms can include both prokaryotic and eukaryotic microorganisms, such as bacteria and yeast. In some aspects, certain laboratory and/or wild-type strains of E. coli can be engineered to grow on biomass-derived alginate or pectin as a sole source of carbon to produce suitable monosaccharides or other molecules. Among other uses apparent to a person skilled in the art, the monosaccharides and other molecules produced by the growth of these engineered or recombinant microorganisms on alginate or pectin may be utilized as feedstock in the production of various commodity chemicals, such as biofuels.
[0097]Alginate and pectin provide advantages over other biomass sources in the production of biofuel feedstocks. For example, large-scale aquatic farming can generate a significant amount of biomass without replacing food crop production with energy crop production, without deforestation, and without recultivating currently uncultivated land, as most of the hydrosphere, including oceans, rivers, and lakes, remains untapped. As one particular example, the Pacific coast of North America is abundant in minerals necessary for large-scale aqua-farming. Giant kelp, which lives in the area, grows as fast as 1 m/day, the fastest among plants on earth, and grows up to 50 m in length. Additionally, aqua-farming has other benefits, including the prevention of red tide outbreaks and the creation of a fish-friendly environment.
[0098]As an additional advantage, and in contrast to lignocellulosic biomass, biomass derived from aquatic, fruit, plant and/or vegetable sources is easy to degrade. Such biomass typically lacks lignin, is significantly more fragile than lignocellulosic biomass, and can thus be easily degraded using either enzymes or chemical catalysts (e.g., formate). As one example, aquatic biomass such as seaweed may be easily converted to monosaccharides using either enzymes or chemical catalysis, as seaweed has significantly simpler major sugar components (alginate: 30%, mannitol: 15%) as compared to lignocellulose (glucose: 24.1-39%, mannose: 0.2-4.6%, galactose: 0.5-2.4%, xylose: 0.4-22.1%, arabinose: 1.5-2.8%, and uronic acids: 1.2-20.7%, with total sugar content corresponding to 36.5-70% of dry weight).
[0099]As an additional example, biomass from fruit, vegetables, and certain other plants contains pectin, a heteropolysaccharide derived from the plant cell wall. The characteristic structure of pectin is a linear chain of α-(1-4)-linked D-galacturonic acid that forms the pectin backbone, a homogalacturonan. Pectin can be easily converted to oligosaccharides or suitable monosaccharides using enzymes, chemical catalysis, and/or microbial systems designed to utilize pectin as a source of carbon, as described herein. Saccharification and fermentation using aquatic, fruit, and/or vegetable biomass is much easier than using lignocellulose.
[0100]In this regard, embodiments of the present invention also relate to the surprising discovery that certain microorganisms can be engineered to produce various commodity chemicals, such as biofuels. In certain aspects, these biofuels may include alkanes, such as medium to long chain alkanes, which provide advantages over ethanol-based biofuels. In certain aspects, the monosaccharides (e.g., 2-keto-3-deoxy D-gluconate; KDG) and other molecules produced by the growth of various engineered or recombinant microorganisms (e.g., recombinant microorganisms growing on pectin or alginate as a source of carbon) may be useful in the production of commodity chemicals, such as biofuels. As one example, suitable monosaccharides such as KDG may be utilized by recombinant microorganisms to produce alkanes, such as medium to long chain alkanes, among other chemicals. In certain aspects, such recombinant microorganisms may be utilized to produce such commodity chemicals as 2,7-dimethyloctane and 2,9-dimethyldecane, among others provided herein and known in the art.
[0101]Such processes produce biofuels with significant advantages over other biofuels. In particular, medium to long chain alkanes provide a number of important advantages over existing common biofuels such as ethanol and butanol, and are attractive long-term replacements for petroleum-based fuels such as gasoline, diesel, kerosene, and heavy oils. As one example, medium to long chain alkanes and alcohols are major components in all petroleum products, and jet fuel in particular, and hence the alkanes produced as described herein can be utilized directly by existing engines. By way of further example, medium to long chain alcohols are far better fuels than ethanol, and have an energy density nearly comparable to that of gasoline.
[0102]As another example, n-alkanes are major components of all oil products including gasoline, diesel, kerosene, and heavy oils. Microbial systems or recombinant microorganisms may be used to produce n-alkanes with different carbon lengths ranging, for example, from C7 to over C20: C7 for gasoline (e.g., motor vehicles), C10-C15 for diesel (e.g., motor vehicles, trains, and ships), and C8-C16 for kerosene (e.g., aviation and ships), and for all heavy oils.
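By way of illustration only, the carbon-length ranges recited above can be expressed as a small lookup that reports which fuel categories an n-alkane of a given carbon number falls into. The following Python sketch uses only the ranges stated in the text; the names `FUEL_RANGES` and `fuel_classes` are assumptions made for this example, and heavy oils are omitted because no separate range is clearly stated for them.

```python
# Inclusive carbon-number ranges as recited in the text above.
FUEL_RANGES = {
    "gasoline": (7, 7),
    "diesel": (10, 15),
    "kerosene": (8, 16),
}


def fuel_classes(carbon_count):
    """Return the fuel categories whose recited range includes carbon_count."""
    return sorted(
        name
        for name, (low, high) in FUEL_RANGES.items()
        if low <= carbon_count <= high
    )


for n in (7, 9, 12, 20):
    print(f"C{n}: {fuel_classes(n)}")
```

Because the diesel and kerosene ranges overlap, a single alkane such as a C12 n-alkane is reported under both categories.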
[0103]As one aspect of the invention, the commodity chemicals produced by the methods and recombinant microorganisms described herein may be utilized by existing petroleum refineries for the purposes of blending with petroleum products produced by traditional refinery methods. To this end, as noted above, fuel producers are seeking substantially similar, low carbon fuels that can be blended and distributed through existing infrastructure (refineries, pipelines, tankers). As hydrocarbons, the commodity chemicals produced according to the methods herein are substantially similar to petroleum derived fuels, reduce greenhouse gas emissions by more than 80% relative to petroleum derived fuels, and are compatible with existing infrastructure in the oil and gas industry. For instance, certain of the commodity chemicals produced herein, including, for example, various C10-C12 hydrocarbons such as 2,7-dimethyloctane and 2,7-dimethyldecanone, among others, are blendable directly into refinery-produced petroleum products, such as jet and diesel fuels. By using such biologically produced commodity chemicals as a blendstock for jet and diesel fuels, refineries may reduce greenhouse gas emissions by more than 80%.
[0104]Accordingly, certain embodiments of the present invention relate generally to methods for converting biomass to a commodity chemical, comprising obtaining a polysaccharide from biomass; and contacting the polysaccharide with a polysaccharide degrading or depolymerizing pathway, thereby converting the polysaccharide to a suitable monosaccharide. The suitable monosaccharide obtained from such a process may be used for any desired purpose. For instance, in certain aspects, the suitable monosaccharide may then be converted to a commodity chemical (e.g., biofuel) by contacting the suitable monosaccharide with a biofuel biosynthesis pathway, whether as part of a recombinant microorganism, an in vitro enzymatic or chemical pathway, or a combination thereof, thereby converting the monosaccharide to a commodity chemical.
[0105]In other aspects, in producing a commodity chemical such as a biofuel, a suitable monosaccharide may be obtained directly from any available source and converted to a commodity chemical by contacting the suitable monosaccharide with a biofuel biosynthesis pathway, as described herein. Among other uses apparent to a person skilled in the art, such biofuels may then be blended directly with refinery-produced petroleum products, such as jet and diesel fuels, to produce commodity chemical enriched, refinery-produced petroleum products.
DEFINITIONS
[0106]Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which the invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, preferred methods and materials are described. For the purposes of the present invention, the following terms are defined below. All references referred to herein are incorporated by reference in their entirety.
[0107]The articles "a" and "an" are used herein to refer to one or to more than one (i.e. to at least one) of the grammatical object of the article. By way of example, "an element" means one element or more than one element.
[0108]By "about" is meant a quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or length that varies by as much as 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1% to a reference quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or length.
[0109]The term "biologically active fragment", as applied to fragments of a reference polynucleotide or polypeptide sequence, refers to a fragment that has at least about 0.1, 0.5, 1, 2, 5, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 100, 110, 120, 150, 200, 300, 400, 500, 600, 700, 800, 900, 1000% or more of the activity of a reference sequence.
[0110]The term "reference sequence" refers generally to a nucleic acid coding sequence, or amino acid sequence, of any enzyme having a biological activity described herein (e.g., saccharide dehydrogenase, alcohol dehydrogenase, dehydratase, lyase, transporter, decarboxylase, hydrolase, etc.), such as a "wild-type" sequence, including those reference sequences exemplified by SEQ ID NOS:1-144, and 308-313. A reference sequence may also include naturally-occurring, functional variants (i.e., orthologs or homologs) of the sequences described herein.
[0111]Included within the scope of the present invention are biologically active fragments of at least about 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380, 400, 500, 600 or more contiguous nucleotides or amino acid residues in length, including all integers in between, which comprise or encode a polypeptide having an enzymatic activity of a reference polynucleotide or polypeptide. Representative biologically active fragments generally participate in an interaction, e.g., an intra-molecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction. Examples of enzymatic interactions or activities include saccharide dehydrogenase activities, alcohol dehydrogenase activities, dehydratase activities, lyase activities, transporter activities, isomerase activities, and kinase activities, among others described herein. Biologically active fragments typically comprise one or more active sites or enzymatic/binding motifs, as described herein and known in the art.
[0112]By "coding sequence" is meant any nucleic acid sequence that contributes to the code for the polypeptide product of a gene. By contrast, the term "non-coding sequence" refers to any nucleic acid sequence that does not contribute to the code for the polypeptide product of a gene.
[0113]Throughout this specification, unless the context requires otherwise, the words "comprise", "comprises" and "comprising" will be understood to imply the inclusion of a stated step or element or group of steps or elements but not the exclusion of any other step or element or group of steps or elements.
[0114]By "consisting of" is meant including, and limited to, whatever follows the phrase "consisting of." Thus, the phrase "consisting of" indicates that the listed elements are required or mandatory, and that no other elements may be present.
[0115]By "consisting essentially of" is meant including any elements listed after the phrase, and limited to other elements that do not interfere with or contribute to the activity or action specified in the disclosure for the listed elements. Thus, the phrase "consisting essentially of" indicates that the listed elements are required or mandatory, but that other elements are optional and may or may not be present depending upon whether or not they affect the activity or action of the listed elements.
[0116]The terms "complementary" and "complementarity" refer to polynucleotides (i.e., a sequence of nucleotides) related by the base-pairing rules. For example, the sequence "A-G-T," is complementary to the sequence "T-C-A." Complementarity may be "partial," in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be "complete" or "total" complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands.
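As a minimal illustration of the base-pairing rules described above, the following Python sketch computes the complement of a DNA sequence and the fraction of positions at which two strands pair, which distinguishes "partial" from "complete" complementarity. The function names are hypothetical and are not part of the disclosure. Strands are compared position by position as written in the example above (i.e., "A-G-T" against "T-C-A"), without reversing either strand.

```python
# Watson-Crick pairing rules for DNA: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(PAIRS[base] for base in seq.upper())


def fraction_paired(strand_a, strand_b):
    """Fraction of positions at which two equal-length strands base-pair.

    A value of 1.0 corresponds to complete complementarity; any value
    between 0 and 1 corresponds to partial complementarity.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have the same length")
    if not strand_a:
        raise ValueError("strands must not be empty")
    paired = sum(
        PAIRS.get(a) == b for a, b in zip(strand_a.upper(), strand_b.upper())
    )
    return paired / len(strand_a)
```

Under these assumptions, complement("AGT") returns "TCA", and fraction_paired("AGT", "TCA") returns 1.0.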
[0117]By "corresponds to" or "corresponding to" is meant (a) a polynucleotide having a nucleotide sequence that is substantially identical or complementary to all or a portion of a reference polynucleotide sequence or encoding an amino acid sequence identical to an amino acid sequence in a peptide or protein; or (b) a peptide or polypeptide having an amino acid sequence that is substantially identical to a sequence of amino acids in a reference peptide or protein.
[0118]By "derivative" is meant a polypeptide that has been derived from the basic sequence by modification, for example by conjugation or complexing with other chemical moieties (e.g., pegylation) or by post-translational modification techniques as would be understood in the art. The term "derivative" also includes within its scope alterations that have been made to a parent sequence including additions or deletions that provide for functionally equivalent molecules.
[0119]By "enzyme reactive conditions" it is meant that any necessary conditions are available in an environment (i.e., such factors as temperature, pH, lack of inhibiting substances) which will permit the enzyme to function. Enzyme reactive conditions can be either in vitro, such as in a test tube, or in vivo, such as within a cell.
[0120]As used herein, the terms "function" and "functional" and the like refer to a biological or enzymatic function.
[0121]By "gene" is meant a unit of inheritance that occupies a specific locus on a chromosome and consists of transcriptional and/or translational regulatory sequences and/or a coding region and/or non-translated sequences (i.e., introns, 5' and 3' untranslated sequences).
[0122]"Homology" refers to the percentage number of amino acids that are identical or constitute conservative substitutions. Homology may be determined using sequence comparison programs such as GAP (Devereux et al., 1984, Nucleic Acids Research 12, 387-395) which is incorporated herein by reference. In this way sequences of a similar or substantially different length to those cited herein could be compared by insertion of gaps into the alignment, such gaps being determined, for example, by the comparison algorithm used by GAP.
[0123]The term "host cell" includes an individual cell or cell culture which can be or has been a recipient of any recombinant vector(s) or isolated polynucleotide of the invention. Host cells include progeny of a single host cell, and the progeny may not necessarily be completely identical (in morphology or in total DNA complement) to the original parent cell due to natural, accidental, or deliberate mutation and/or change. A host cell includes cells transfected, transformed, or infected in vivo or in vitro with a recombinant vector or a polynucleotide of the invention. A host cell which comprises a recombinant vector of the invention is a recombinant host cell, recombinant cell, or recombinant microorganism.
[0124]By "isolated" is meant material that is substantially or essentially free from components that normally accompany it in its native state. For example, an "isolated polynucleotide", as used herein, refers to a polynucleotide, which has been purified from the sequences which flank it in a naturally-occurring state, e.g., a DNA fragment which has been removed from the sequences that are normally adjacent to the fragment. Alternatively, an "isolated peptide" or an "isolated polypeptide" and the like, as used herein, refer to in vitro isolation and/or purification of a peptide or polypeptide molecule from its natural cellular environment, and from association with other components of the cell, i.e., it is not associated with in vivo substances.
[0125]By "increased" or "increasing" is meant the ability of one or more recombinant microorganisms to produce a greater amount of a given product or molecule (e.g., commodity chemical, biofuel, or intermediate product thereof) as compared to a control microorganism, such as an unmodified microorganism or a differently modified microorganism. An "increased" amount is typically a "statistically significant" amount, and may include an increase that is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30 or more times (including all integers and decimal points in between, e.g., 1.5, 1.6, 1.7, 1.8, etc.) the amount produced by an unmodified microorganism or a differently modified microorganism.
[0126]By "obtained from" is meant that a sample such as, for example, a polynucleotide extract or polypeptide extract is isolated from, or derived from, a particular source, such as a desired organism, typically a microorganism. "Obtained from" can also refer to the situation in which a polynucleotide or polypeptide sequence is isolated from, or derived from, a particular organism or microorganism. For example, a polynucleotide sequence encoding a benzaldehyde lyase enzyme may be isolated from a variety of prokaryotic or eukaryotic microorganisms, such as Pseudomonas.
[0127]The term "operably linked" as used herein means placing a gene under the regulatory control of a promoter, which then controls the transcription and optionally the translation of the gene. In the construction of heterologous promoter/structural gene combinations, it is generally preferred to position the genetic sequence or promoter at a distance from the gene transcription start site that is approximately the same as the distance between that genetic sequence or promoter and the gene it controls in its natural setting; i.e. the gene from which the genetic sequence or promoter is derived. As is known in the art, some variation in this distance can be accommodated without loss of function. Similarly, the preferred positioning of a regulatory sequence element with respect to a heterologous gene to be placed under its control is defined by the positioning of the element in its natural setting; i.e., the genes from which it is derived. "Constitutive promoters" are typically active, i.e., promote transcription, under most conditions. "Inducible promoters" are typically active only under certain conditions, such as in the presence of a given molecule factor (e.g., IPTG) or a given environmental condition (e.g., CO2 concentration, nutrient levels, light, heat). In the absence of that condition, inducible promoters typically do not allow significant or measurable levels of transcriptional activity.
[0128]The recitation "polynucleotide" or "nucleic acid" as used herein designates mRNA, RNA, cRNA, rRNA, cDNA or DNA. The term typically refers to a polymeric form of nucleotides of at least 10 bases in length, either ribonucleotides or deoxynucleotides or a modified form of either type of nucleotide. The term includes single and double stranded forms of DNA.
[0129]As will be understood by those skilled in the art, the polynucleotide sequences of this invention can include genomic sequences, extra-genomic and plasmid-encoded sequences and smaller engineered gene segments that express, or may be adapted to express, proteins, polypeptides, peptides and the like. Such segments may be naturally isolated, or modified synthetically by the hand of man.
[0130]Polynucleotides may be single-stranded (coding or antisense) or double-stranded, and may be DNA (genomic, cDNA or synthetic) or RNA molecules. Additional coding or non-coding sequences may, but need not, be present within a polynucleotide of the present invention, and a polynucleotide may, but need not, be linked to other molecules and/or support materials.
[0131]Polynucleotides may comprise a native sequence (i.e., an endogenous sequence) or may comprise a variant, or a biological functional equivalent of such a sequence. Polynucleotide variants may contain one or more substitutions, additions, deletions and/or insertions, as further described below, preferably such that the enzymatic activity of the encoded polypeptide is not substantially diminished relative to the unmodified polypeptide, and preferably such that the enzymatic activity of the encoded polypeptide is improved (e.g., optimized) relative to the unmodified polypeptide. The effect on the enzymatic activity of the encoded polypeptide may generally be assessed as described herein.
[0132]The polynucleotides of the present invention, regardless of the length of the coding sequence itself, may be combined with other DNA sequences, such as promoters, polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other coding segments, and the like, such that their overall length may vary considerably. It is therefore contemplated that a polynucleotide fragment of almost any length may be employed, with the total length preferably being limited by the ease of preparation and use in the intended recombinant DNA protocol.
[0133]The terms "polynucleotide variant" and "variant" and the like refer to polynucleotides that display substantial sequence identity with any of the reference polynucleotide sequences or genes described herein, and to polynucleotides that hybridize with any polynucleotide reference sequence described herein, or any polynucleotide coding sequence of any gene or protein referred to herein, under low stringency, medium stringency, high stringency, or very high stringency conditions that are defined hereinafter and known in the art. These terms also encompass polynucleotides that are distinguished from a reference polynucleotide by the addition, deletion or substitution of at least one nucleotide. Accordingly, the terms "polynucleotide variant" and "variant" include polynucleotides in which one or more nucleotides have been added or deleted, or replaced with different nucleotides. In this regard, it is well understood in the art that certain alterations inclusive of mutations, additions, deletions and substitutions can be made to a reference polynucleotide whereby the altered polynucleotide retains the biological function or activity of the reference polynucleotide, or has increased activity in relation to the reference polynucleotide (i.e., optimized). Polynucleotide variants include, for example, polynucleotides having at least 50% (and at least 51% to at least 99% and all integer percentages in between) sequence identity with a reference polynucleotide described herein.
[0134]The terms "polynucleotide variant" and "variant" also include naturally-occurring allelic variants that encode these enzymes. Examples of naturally-occurring variants include allelic variants (same locus), homologs (different locus), and orthologs (different organism). Naturally occurring variants such as these can be identified and isolated using well-known molecular biology techniques including, for example, various polymerase chain reaction (PCR) and hybridization-based techniques as known in the art. Naturally occurring variants can be isolated from any organism that encodes one or more genes having a suitable enzymatic activity described herein (e.g., C--C ligase, diol dehydrogenase, pectate lyase, alginate lyase, diol dehydratase, transporter, etc.).
[0135]Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. In certain aspects, non-naturally occurring variants may have been optimized for use in a given microorganism (e.g., E. coli), such as by engineering and screening the enzymes for increased activity, stability, or any other desirable feature. The variations can produce both conservative and non-conservative amino acid substitutions (as compared to the originally encoded product). For nucleotide sequences, conservative variants include those sequences that, because of the degeneracy of the genetic code, encode the amino acid sequence of a reference polypeptide. Variant nucleotide sequences also include synthetically derived nucleotide sequences, such as those generated, for example, by using site-directed mutagenesis but which still encode a biologically active polypeptide. Generally, variants of a particular reference nucleotide sequence will have at least about 30%, 40%, 50%, 55%, 60%, 65%, 70%, generally at least about 75%, 80%, 85%, 90% to 95% or more, and even about 97% or 98% or more sequence identity to that particular nucleotide sequence as determined by sequence alignment programs described elsewhere herein using default parameters.
[0136]As used herein, the term "hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions" describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Ausubel et al., "Current Protocols in Molecular Biology", John Wiley & Sons Inc, 1994-1998, Sections 6.3.1-6.3.6. Aqueous and non-aqueous methods are described in that reference and either can be used.
[0137]Reference herein to "low stringency" conditions includes and encompasses from at least about 1% v/v to at least about 15% v/v formamide and from at least about 1 M to at least about 2 M salt for hybridization at 42° C., and at least about 1 M to at least about 2 M salt for washing at 42° C. Low stringency conditions also may include 1% Bovine Serum Albumin (BSA), 1 mM EDTA, 0.5 M NaHPO4 (pH 7.2), 7% SDS for hybridization at 65° C., and (i) 2×SSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA, 40 mM NaHPO4 (pH 7.2), 5% SDS for washing at room temperature. One embodiment of low stringency conditions includes hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2×SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions).
[0138]"Medium stringency" conditions include and encompass from at least about 16% v/v to at least about 30% v/v formamide and from at least about 0.5 M to at least about 0.9 M salt for hybridization at 42° C., and at least about 0.1 M to at least about 0.2 M salt for washing at 55° C. Medium stringency conditions also may include 1% Bovine Serum Albumin (BSA), 1 mM EDTA, 0.5 M NaHPO4 (pH 7.2), 7% SDS for hybridization at 65° C., and (i) 2×SSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA, 40 mM NaHPO4 (pH 7.2), 5% SDS for washing at 60-65° C. One embodiment of medium stringency conditions includes hybridizing in 6×SSC at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 60° C.
[0139]"High stringency" conditions include and encompass from at least about 31% v/v to at least about 50% v/v formamide and from about 0.01 M to about 0.15 M salt for hybridization at 42° C., and about 0.01 M to about 0.02 M salt for washing at 55° C. High stringency conditions also may include 1% BSA, 1 mM EDTA, 0.5 M NaHPO4 (pH 7.2), 7% SDS for hybridization at 65° C., and (i) 0.2×SSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA, 40 mM NaHPO4 (pH 7.2), 1% SDS for washing at a temperature in excess of 65° C. One embodiment of high stringency conditions includes hybridizing in 6×SSC at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 65° C.
[0140]One embodiment of "very high stringency" conditions includes hybridizing in 0.5 M sodium phosphate, 7% SDS at 65° C., followed by one or more washes in 0.2×SSC, 1% SDS at 65° C.
[0141]Other stringency conditions are well known in the art and a skilled addressee will recognize that various factors can be manipulated to optimize the specificity of the hybridization. Optimization of the stringency of the final washes can serve to ensure a high degree of hybridization. For detailed examples, see Ausubel et al., supra at pages 2.10.1 to 2.10.16 and Sambrook et al., Molecular Cloning, A Laboratory Manual (1989), at sections 1.101 to 1.104.
[0142]While stringent washes are typically carried out at temperatures from about 42° C. to 68° C., one skilled in the art will appreciate that other temperatures may be suitable for stringent conditions. Maximum hybridization rate typically occurs at about 20° C. to 25° C. below the Tm for formation of a DNA-DNA hybrid. It is well known in the art that the Tm is the melting temperature, or temperature at which two complementary polynucleotide sequences dissociate. Methods for estimating Tm are well known in the art (see Ausubel et al., supra at page 2.10.8).
[0143]In general, the Tm of a perfectly matched duplex of DNA may be predicted as an approximation by the formula: Tm = 81.5 + 16.6 (log10 M) + 0.41 (% G+C) - 0.63 (% formamide) - (600/length), wherein: M is the concentration of Na+, preferably in the range of 0.01 molar to 0.4 molar; % G+C is the sum of guanosine and cytosine bases as a percentage of the total number of bases, within the range between 30% and 75% G+C; % formamide is the percent formamide concentration by volume; and length is the number of base pairs in the DNA duplex. The Tm of a duplex DNA decreases by approximately 1° C. with every increase of 1% in the number of randomly mismatched base pairs. Washing is generally carried out at Tm-15° C. for high stringency, or Tm-30° C. for moderate stringency.
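By way of a worked illustration only, the approximation above can be evaluated with a short Python sketch. The function and parameter names are hypothetical. The mismatch correction of 1° C. per 1% mismatched base pairs and the wash offsets of 15° C. and 30° C. are taken directly from the paragraph above. The sketch does not enforce the preferred ranges for the Na+ concentration or % G+C.

```python
import math


def estimate_tm(na_molar, percent_gc, percent_formamide, length_bp,
                percent_mismatch=0.0):
    """Approximate Tm (deg C) of a DNA duplex.

    Tm = 81.5 + 16.6*log10(M) + 0.41*(%G+C) - 0.63*(%formamide) - 600/length,
    reduced by about 1 deg C for every 1% of randomly mismatched base pairs.
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("na_molar and length_bp must be positive")
    tm = (81.5
          + 16.6 * math.log10(na_molar)
          + 0.41 * percent_gc
          - 0.63 * percent_formamide
          - 600.0 / length_bp)
    return tm - 1.0 * percent_mismatch


def wash_temperature(tm, stringency="high"):
    """Wash temperature (deg C): Tm-15 for high, Tm-30 for moderate stringency."""
    offsets = {"high": 15.0, "moderate": 30.0}
    return tm - offsets[stringency]
```

For example, for a perfectly matched 100 base pair duplex with 50% G+C in 0.1 M Na+ and no formamide, estimate_tm(0.1, 50, 0, 100) evaluates to approximately 79.4° C., giving a high stringency wash temperature of approximately 64.4° C.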
[0144]In one example of a hybridization procedure, a membrane (e.g., a nitrocellulose membrane or a nylon membrane) containing immobilized DNA is hybridized overnight at 42° C. in a hybridization buffer (50% deionized formamide, 5×SSC, 5× Denhardt's solution (0.1% Ficoll, 0.1% polyvinylpyrrolidone and 0.1% bovine serum albumin), 0.1% SDS and 200 mg/mL denatured salmon sperm DNA) containing a labeled probe. The membrane is then subjected to two sequential medium stringency washes (i.e., 2×SSC, 0.1% SDS for 15 min at 45° C., followed by 2×SSC, 0.1% SDS for 15 min at 50° C.), followed by two sequential higher stringency washes (i.e., 0.2×SSC, 0.1% SDS for 12 min at 55° C. followed by 0.2×SSC and 0.1% SDS solution for 12 min at 65-68° C.).
[0145]Polynucleotides and fusions thereof may be prepared, manipulated and/or expressed using any of a variety of well established techniques known and available in the art. For example, polynucleotide sequences which encode polypeptides of the invention, or fusion proteins or functional equivalents thereof, may be used in recombinant DNA molecules to direct expression of a selected enzyme in appropriate host cells. Due to the inherent degeneracy of the genetic code, other DNA sequences that encode substantially the same or a functionally equivalent amino acid sequence may be produced and these sequences may be used to clone and express a given polypeptide.
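The degeneracy of the genetic code noted above can be illustrated with a brief Python sketch in which two different DNA sequences encode the same amino acid sequence. The codon table below is a small excerpt of the standard genetic code, limited to the codons needed for the example. The function name and the two sequences are hypothetical and do not correspond to any sequence of the disclosure.

```python
# Excerpt of the standard genetic code; "*" marks a stop codon.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L",
    "TAA": "*", "TGA": "*",
}


def translate(dna):
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Two hypothetical coding sequences that differ at the nucleotide level.
seq_a = "ATGGCTAAACTGTAA"
seq_b = "ATGGCGAAGTTATGA"
```

Both translate(seq_a) and translate(seq_b) return "MAKL", although the two nucleotide sequences differ at several positions.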
[0146]As will be understood by those of skill in the art, it may be advantageous in some instances to produce polypeptide-encoding nucleotide sequences possessing non-naturally occurring codons. For example, codons preferred by a particular prokaryotic or eukaryotic host can be selected to increase the rate of protein expression or to produce a recombinant RNA transcript having desirable properties, such as a half-life which is longer than that of a transcript generated from the naturally occurring sequence. Such nucleotides are typically referred to as "codon-optimized." Any of the nucleotide sequences described herein may be utilized in such a "codon-optimized" form. For example, the nucleotide coding sequence of the benzaldehyde lyase from Pseudomonas fluorescens may be codon-optimized for expression in E. coli.
[0147]Moreover, the polynucleotide sequences of the present invention can be engineered using methods generally known in the art in order to alter polypeptide encoding sequences for a variety of reasons, including but not limited to, alterations which modify the cloning, processing, expression and/or activity of the gene product.
[0148]In order to express a desired polypeptide, a nucleotide sequence encoding the polypeptide, or a functional equivalent, may be inserted into an appropriate expression vector, i.e., a vector that contains the necessary elements for the transcription and translation of the inserted coding sequence. Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding a polypeptide of interest and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. Such techniques are described in Sambrook et al., Molecular Cloning, A Laboratory Manual (1989), and Ausubel et al., Current Protocols in Molecular Biology (1989).
[0149]"Polypeptide," "polypeptide fragment," "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues and to variants and synthetic analogues of the same. Thus, these terms apply to amino acid polymers in which one or more amino acid residues are synthetic non-naturally occurring amino acids, such as a chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally-occurring amino acid polymers. In certain aspects, polypeptides may include enzymatic polypeptides, or "enzymes," which typically catalyze (i.e., increase the rate of) various chemical reactions.
[0150]The recitation polypeptide "variant" refers to polypeptides that are distinguished from a reference polypeptide sequence by the addition, deletion or substitution of at least one amino acid residue. In certain embodiments, a polypeptide variant is distinguished from a reference polypeptide by one or more substitutions, which may be conservative or non-conservative. In certain embodiments, the polypeptide variant comprises conservative substitutions and, in this regard, it is well understood in the art that some amino acids may be changed to others with broadly similar properties without changing the nature of the activity of the polypeptide. Polypeptide variants also encompass polypeptides in which one or more amino acids have been added or deleted, or replaced with different amino acid residues.
[0151]The present invention contemplates the use in the methods described herein of variants of full-length polypeptides having any of the enzymatic activities described herein, truncated fragments of these full-length polypeptides, variants of truncated fragments, as well as their related biologically active fragments. Typically, biologically active fragments of a polypeptide may participate in an interaction, for example, an intra-molecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). Biologically active fragments of a polypeptide/enzyme having an enzymatic activity described herein include peptides comprising amino acid sequences sufficiently similar to, or derived from, the amino acid sequences of a (putative) full-length reference polypeptide sequence. Typically, biologically active fragments comprise a domain or motif with at least one enzymatic activity, and may include one or more (and in some cases all) of the various active domains. A biologically active fragment of an enzyme can be a polypeptide fragment which is, for example, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380, 400, 450, 500, 600 or more contiguous amino acids, including all integers in between, of a reference polypeptide sequence. In certain embodiments, a biologically active fragment comprises a conserved enzymatic sequence, domain, or motif, as described elsewhere herein and known in the art. Suitably, the biologically-active fragment has no less than about 1%, 10%, 25%, 50% of an activity of the wild-type polypeptide from which it is derived.
[0152]The term "exogenous" refers generally to a polynucleotide sequence or polypeptide that does not naturally occur in a wild-type cell or organism, but is typically introduced into the cell by molecular biological techniques, i.e., engineering to produce a recombinant microorganism. Examples of "exogenous" polynucleotides include vectors, plasmids, and/or man-made nucleic acid constructs encoding a desired protein or enzyme. The term "endogenous" refers generally to naturally occurring polynucleotide sequences or polypeptides that may be found in a given wild-type cell or organism. For example, certain naturally-occurring bacterial or yeast species do not typically contain a benzaldehyde lyase gene, and, therefore, do not comprise an "endogenous" polynucleotide sequence that encodes a benzaldehyde lyase. In this regard, it is also noted that even though an organism may comprise an endogenous copy of a given polynucleotide sequence or gene, the introduction of a plasmid or vector encoding that sequence, such as to over-express or otherwise regulate the expression of the encoded protein, represents an "exogenous" copy of that gene or polynucleotide sequence. Any of the pathways, genes, or enzymes described herein may utilize or rely on an "endogenous" sequence, or may be provided as one or more "exogenous" polynucleotide sequences, and/or may be utilized according to the endogenous sequences already contained within a given microorganism.
[0153]A "recombinant" microorganism typically comprises one or more exogenous nucleotide sequences, such as in a plasmid or vector.
[0154]The recitations "sequence identity" or, for example, comprising a "sequence 50% identical to," as used herein, refer to the extent that sequences are identical on a nucleotide-by-nucleotide basis or an amino acid-by-amino acid basis over a window of comparison. Thus, a "percentage of sequence identity" may be calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, I) or the identical amino acid residue (e.g., Ala, Pro, Ser, Thr, Gly, Val, Leu, Ile, Phe, Tyr, Trp, Lys, Arg, His, Asp, Glu, Asn, Gln, Cys and Met) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity.
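The calculation described above can be sketched in Python as follows. This is a simplified illustration that assumes the two sequences have already been optimally aligned (for example, by one of the alignment programs referenced herein) and are supplied as equal-length strings spanning the window of comparison, with "-" denoting a gap. The function name and the gap convention are assumptions made for the example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of sequence identity over a window of comparison.

    Counts positions at which the identical base or residue occurs in both
    aligned sequences, divides by the window size, and multiplies by 100.
    A gap ('-') never counts as a matched position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    window_size = len(aligned_a)
    if window_size == 0:
        raise ValueError("window of comparison must not be empty")
    matched = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matched / window_size
```

For example, percent_identity("ACGTACGT", "ACGTTCGT") returns 87.5, because seven of the eight positions in the window are matched.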
[0155]Terms used to describe sequence relationships between two or more polynucleotides or polypeptides include "reference sequence", "comparison window", "sequence identity", "percentage of sequence identity" and "substantial identity". A "reference sequence" is at least 12 but frequently 15 to 18 and often at least 25 monomer units, inclusive of nucleotides and amino acid residues, in length. Because two polynucleotides may each comprise (1) a sequence (i.e., only a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a "comparison window" to identify and compare local regions of sequence similarity. A "comparison window" refers to a conceptual segment of at least 6 contiguous positions, usually about 50 to about 100, more usually about 100 to about 150 in which a sequence is compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. The comparison window may comprise additions or deletions (i.e., gaps) of about 20% or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window may be conducted by computerized implementations of algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Drive Madison, Wis., USA) or by inspection and the best alignment (i.e., resulting in the highest percentage homology over the comparison window) generated by any of the various methods selected. Reference also may be made to the BLAST family of programs as for example disclosed by Altschul et al., 1997, Nucl. Acids Res. 25:3389. A detailed discussion of sequence analysis can be found in Unit 19.3 of Ausubel et al., "Current Protocols in Molecular Biology", John Wiley & Sons Inc, 1994-1998, Chapter 15.
[0156]"Transformation" refers generally to the permanent, heritable alteration in a cell resulting from the uptake and incorporation of foreign DNA into the host-cell genome; also, the transfer of an exogenous gene from one organism into the genome of another organism.
[0157]By "vector" is meant a polynucleotide molecule, preferably a DNA molecule derived, for example, from a plasmid, bacteriophage, yeast or virus, into which a polynucleotide can be inserted or cloned. A vector preferably contains one or more unique restriction sites and can be capable of autonomous replication in a defined host cell including a target cell or tissue or a progenitor cell or tissue thereof, or be integrable with the genome of the defined host such that the cloned sequence is reproducible. Accordingly, the vector can be an autonomously replicating vector, i.e., a vector that exists as an extra-chromosomal entity, the replication of which is independent of chromosomal replication, e.g., a linear or closed circular plasmid, an extra-chromosomal element, a mini-chromosome, or an artificial chromosome. The vector can contain any means for assuring self-replication. Alternatively, the vector can be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. Such a vector may comprise specific sequences that allow recombination into a particular, desired site of the host chromosome. A vector system can comprise a single vector or plasmid, two or more vectors or plasmids, which together contain the total DNA to be introduced into the genome of the host cell, or a transposon. The choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced. In the present case, the vector is preferably one which is operably functional in a bacterial cell, such as a cyanobacterial cell. The vector can include a reporter gene, such as a green fluorescent protein (GFP), which can be either fused in frame to one or more of the encoded polypeptides, or expressed separately. 
The vector can also include a selection marker such as an antibiotic resistance gene that can be used for selection of suitable transformants.
[0158]The terms "wild-type" and "naturally occurring" are used interchangeably to refer to a gene or gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene or gene product (e.g., a polypeptide) is that which is most frequently observed in a population and is thus arbitrarily designated the "normal" or "wild-type" form of the gene.
[0159]Examples of "biomass" include aquatic or marine biomass, fruit-based biomass such as fruit waste, and vegetable-based biomass such as vegetable waste, among others. Examples of aquatic or marine biomass include, but are not limited to, kelp, giant kelp, seaweed, algae, marine microflora, microalgae, sea grass, and the like. In certain aspects, biomass does not include fossilized sources of carbon, such as hydrocarbons that are typically found within the top layer of the Earth's crust (e.g., natural gas, nonvolatile materials composed of almost pure carbon, like anthracite coal, etc.).
[0160]Examples of fruit and/or vegetable biomass include, but are not limited to, any source of pectin such as plant peel and pomace including citrus, orange, grapefruit, potato, tomato, grape, mango, gooseberry, carrot, sugar-beet, and apple, among others.
[0161]Examples of polysaccharides, oligosaccharides, monosaccharides or other sugar components of biomass include, but are not limited to, alginate, agar, carrageenan, fucoidan, pectin, gluronate, mannuronate, mannitol, lyxose, cellulose, hemicellulose, glycerol, xylitol, glucose, mannose, galactose, xylose, xylan, mannan, arabinan, arabinose, glucuronate, galacturonate (including di- and tri-galacturonates), rhamnose, and the like.
[0162]Certain examples of alginate-derived polysaccharides include saturated polysaccharides, such as β-D-mannuronate, α-L-gluronate, dialginate, trialginate, pentalginate, hexalginate, heptalginate, octalginate, nonalginate, decalginate, undecalginate, dodecalginate and polyalginate, as well as unsaturated polysaccharides such as 4-deoxy-L-erythro-5-hexoseulose uronic acid, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-D-mannuronate or L-guluronate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-dialginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-trialginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-tetralginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-pentalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-hexalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-heptalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-octalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-nonalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-undecalginate, and 4-(4-deoxy-beta-D-mann-4-enuronosyl)-dodecalginate.
[0163]Certain examples of pectin-derived polysaccharides include saturated polysaccharides, such as galacturonate, digalacturonate, trigalacturonate, tetragalacturonate, pentagalacturonate, hexagalacturonate, heptagalacturonate, octagalacturonate, nonagalacturonate, decagalacturonate, dodecagalacturonate, polygalacturonate, and rhamnopolygalacturonate, as well as unsaturated polysaccharides such as 4-deoxy-L-threo-5-hexosulose uronate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-digalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-trigalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-tetragalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-pentagalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-hexagalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-heptagalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-octagalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-nonagalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-decagalacturonate, and 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-dodecagalacturonate.
[0164]These polysaccharide or oligosaccharide components may be converted into "suitable monosaccharides" or other "suitable saccharides," such as "suitable oligosaccharides," by the microorganisms described herein which are capable of growing on such polysaccharides or other sugar components as a source of carbon (e.g., a sole source of carbon).
[0165]A "suitable monosaccharide" or "suitable saccharide" refers generally to any saccharide that may be produced by a recombinant microorganism growing on pectin, alginate, or other saccharide (e.g., galacturonate, cellulose, hemi-cellulose, etc.) as a source or sole source of carbon, and also refers generally to any saccharide that may be utilized in a biofuel biosynthesis pathway of the present invention to produce hydrocarbons such as biofuels or biopetrols. Examples of suitable monosaccharides or oligosaccharides include, but are not limited to, 2-keto-3-deoxy D-gluconate (KDG), D-mannitol, gluronate, mannuronate, mannitol, lyxose, glycerol, xylitol, glucose, mannose, galactose, xylose, arabinose, glucuronate, galacturonates, rhamnose, and the like. As noted herein, a "suitable monosaccharide" or "suitable saccharide" may be produced by an engineered or recombinant microorganism of the present invention, or may be obtained from commercially available sources.
[0166]The recitation "commodity chemical" as used herein includes any saleable or marketable chemical that can be produced either directly or as a by-product of the methods provided herein, including biofuels and/or biopetrols. General examples of "commodity chemicals" include, but are not limited to, biofuels, minerals, polymer precursors, fatty alcohols, surfactants, plasticizers, and solvents. The recitation "biofuels" as used herein includes solid, liquid, or gas fuels derived, at least in part, from a biological source, such as a recombinant microorganism.
Examples of commodity chemicals include, but are not limited to, methane, methanol, ethane, ethene, ethanol, n-propane, 1-propene, 1-propanol, propanal, acetone, propionate, n-butane, 1-butene, 1-butanol, butanal, butanoate, isobutanal, isobutanol, 2-methylbutanal, 2-methylbutanol, 3-methylbutanal, 3-methylbutanol, 2-butene, 2-butanol, 2-butanone, 2,3-butanediol, 3-hydroxy-2-butanone, 2,3-butanedione, ethylbenzene, ethenylbenzene, 2-phenylethanol, phenylacetaldehyde, 1-phenylbutane, 4-phenyl-1-butene, 4-phenyl-2-butene, 1-phenyl-2-butene, 1-phenyl-2-butanol, 4-phenyl-2-butanol, 1-phenyl-2-butanone, 4-phenyl-2-butanone, 1-phenyl-2,3-butanediol, 1-phenyl-3-hydroxy-2-butanone, 4-phenyl-3-hydroxy-2-butanone, 1-phenyl-2,3-butanedione, n-pentane, ethylphenol, ethenylphenol, 2-(4-hydroxyphenyl)ethanol, 4-hydroxyphenylacetaldehyde, 1-(4-hydroxyphenyl)butane, 4-(4-hydroxyphenyl)-1-butene, 4-(4-hydroxyphenyl)-2-butene, 1-(4-hydroxyphenyl)-1-butene, 1-(4-hydroxyphenyl)-2-butanol, 4-(4-hydroxyphenyl)-2-butanol, 1-(4-hydroxyphenyl)-2-butanone, 4-(4-hydroxyphenyl)-2-butanone, 1-(4-hydroxyphenyl)-2,3-butanediol, 1-(4-hydroxyphenyl)-3-hydroxy-2-butanone, 4-(4-hydroxyphenyl)-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-2,3-butanedione, indolylethane, indolylethene, 2-(indole-3-)ethanol, n-pentane, 1-pentene, 1-pentanol, pentanal, pentanoate, 2-pentene, 2-pentanol, 3-pentanol, 2-pentanone, 3-pentanone, 4-methylpentanal, 4-methylpentanol, 2,3-pentanediol, 2-hydroxy-3-pentanone, 3-hydroxy-2-pentanone, 2,3-pentanedione, 2-methylpentane, 4-methyl-1-pentene, 4-methyl-2-pentene, 4-methyl-3-pentene, 4-methyl-2-pentanol, 2-methyl-3-pentanol, 4-methyl-2-pentanone, 2-methyl-3-pentanone, 4-methyl-2,3-pentanediol, 4-methyl-2-hydroxy-3-pentanone, 4-methyl-3-hydroxy-2-pentanone, 4-methyl-2,3-pentanedione, 1-phenylpentane, 1-phenyl-1-pentene, 1-phenyl-2-pentene, 1-phenyl-3-pentene, 1-phenyl-2-pentanol, 1-phenyl-3-pentanol, 1-phenyl-2-pentanone, 1-phenyl-3-pentanone, 1-phenyl-2,3-pentanediol, 
1-phenyl-2-hydroxy-3-pentanone, 1-phenyl-3-hydroxy-2-pentanone, 1-phenyl-2,3-pentanedione, 4-methyl-1-phenylpentane, 4-methyl-1-phenyl-1-pentene, 4-methyl-1-phenyl-2-pentene, 4-methyl-1-phenyl-3-pentene, 4-methyl-1-phenyl-3-pentanol, 4-methyl-1-phenyl-2-pentanol, 4-methyl-1-phenyl-3-pentanone, 4-methyl-1-phenyl-2-pentanone, 4-methyl-1-phenyl-2,3-pentanediol, 4-methyl-1-phenyl-2,3-pentanedione, 4-methyl-1-phenyl-3-hydroxy-2-pentanone, 4-methyl-1-phenyl-2-hydroxy-3-pentanone, 1-(4-hydroxyphenyl) pentane, 1-(4-hydroxyphenyl)-1-pentene, 1-(4-hydroxyphenyl)-2-pentene, 1-(4-hydroxyphenyl)-3-pentene, 1-(4-hydroxyphenyl)-2-pentanol, 1-(4-hydroxyphenyl)-3-pentanol, 1-(4-hydroxyphenyl)-2-pentanone, 1-(4-hydroxyphenyl)-3-pentanone, 1-(4-hydroxyphenyl)-2,3-pentanediol, 1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone, 1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone, 1-(4-hydroxyphenyl)-2,3-pentanedione, 4-methyl-1-(4-hydroxyphenyl) pentane, 4-methyl-1-(4-hydroxyphenyl)-2-pentene, 4-methyl-1-(4-hydroxyphenyl)-3-pentene, 4-methyl-1-(4-hydroxyphenyl)-1-pentene, 4-methyl-1-(4-hydroxyphenyl)-3-pentanol, 4-methyl-1-(4-hydroxyphenyl)-2-pentanol, 4-methyl-1-(4-hydroxyphenyl)-3-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2,3-pentanediol, 4-methyl-1-(4-hydroxyphenyl)-2,3-pentanedione, 4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone, 1-indole-3-pentane, 1-(indole-3)-1-pentene, 1-(indole-3)-2-pentene, 1-(indole-3)-3-pentene, 1-(indole-3)-2-pentanol, 1-(indole-3)-3-pentanol, 1-(indole-3)-2-pentanone, 1-(indole-3)-3-pentanone, 1-(indole-3)-2,3-pentanediol, 1-(indole-3)-2-hydroxy-3-pentanone, 1-(indole-3)-3-hydroxy-2-pentanone, 1-(indole-3)-2,3-pentanedione, 4-methyl-1-(indole-3-)pentane, 4-methyl-1-(indole-3)-2-pentene, 4-methyl-1-(indole-3)-3-pentene, 4-methyl-1-(indole-3)-1-pentene, 4-methyl-2-(indole-3)-3-pentanol, 4-methyl-1-(indole-3)-2-pentanol, 4-methyl-1-(indole-3)-3-pentanone, 
4-methyl-1-(indole-3)-2-pentanone, 4-methyl-1-(indole-3)-2,3-pentanediol, 4-methyl-1-(indole-3)-2,3-pentanedione, 4-methyl-1-(indole-3)-3-hydroxy-2-pentanone, 4-methyl-1-(indole-3)-2-hydroxy-3-pentanone, n-hexane, 1-hexene, 1-hexanol, hexanal, hexanoate, 2-hexene, 3-hexene, 2-hexanol, 3-hexanol, 2-hexanone, 3-hexanone, 2,3-hexanediol, 2,3-hexanedione, 3,4-hexanediol, 3,4-hexanedione, 2-hydroxy-3-hexanone, 3-hydroxy-2-hexanone, 3-hydroxy-4-hexanone, 4-hydroxy-3-hexanone, 2-methylhexane, 3-methylhexane, 2-methyl-2-hexene, 2-methyl-3-hexene, 5-methyl-1-hexene, 5-methyl-2-hexene, 4-methyl-1-hexene, 4-methyl-2-hexene, 3-methyl-3-hexene, 3-methyl-2-hexene, 3-methyl-1-hexene, 2-methyl-3-hexanol, 5-methyl-2-hexanol, 5-methyl-3-hexanol, 2-methyl-3-hexanone, 5-methyl-2-hexanone, 5-methyl-3-hexanone, 2-methyl-3,4-hexanediol, 2-methyl-3,4-hexanedione, 5-methyl-2,3-hexanediol, 5-methyl-2,3-hexanedione, 4-methyl-2,3-hexanediol, 4-methyl-2,3-hexanedione, 2-methyl-3-hydroxy-4-hexanone, 2-methyl-4-hydroxy-3-hexanone, 5-methyl-2-hydroxy-3-hexanone, 5-methyl-3-hydroxy-2-hexanone, 4-methyl-2-hydroxy-3-hexanone, 4-methyl-3-hydroxy-2-hexanone, 2,5-dimethylhexane, 2,5-dimethyl-2-hexene, 2,5-dimethyl-3-hexene, 2,5-dimethyl-3-hexanol, 2,5-dimethyl-3-hexanone, 2,5-dimethyl-3,4-hexanediol, 2,5-dimethyl-3,4-hexanedione, 2,5-dimethyl-3-hydroxy-4-hexanone, 5-methyl-1-phenylhexane, 4-methyl-1-phenylhexane, 5-methyl-1-phenyl-1-hexene, 5-methyl-1-phenyl-2-hexene, 5-methyl-1-phenyl-3-hexene, 4-methyl-1-phenyl-1-hexene, 4-methyl-1-phenyl-2-hexene, 4-methyl-1-phenyl-3-hexene, 5-methyl-1-phenyl-2-hexanol, 5-methyl-1-phenyl-3-hexanol, 4-methyl-1-phenyl-2-hexanol, 4-methyl-1-phenyl-3-hexanol, 5-methyl-1-phenyl-2-hexanone, 5-methyl-1-phenyl-3-hexanone, 4-methyl-1-phenyl-2-hexanone, 4-methyl-1-phenyl-3-hexanone, 5-methyl-1-phenyl-2,3-hexanediol, 4-methyl-1-phenyl-2,3-hexanediol, 5-methyl-1-phenyl-3-hydroxy-2-hexanone, 5-methyl-1-phenyl-2-hydroxy-3-hexanone, 4-methyl-1-phenyl-3-hydroxy-2-hexanone, 
4-methyl-1-phenyl-2-hydroxy-3-hexanone, 5-methyl-1-phenyl-2,3-hexanedione, 4-methyl-1-phenyl-2,3-hexanedione, 4-methyl-1-(4-hydroxyphenyl)hexane, 5-methyl-1-(4-hydroxyphenyl)-1-hexene, 5-methyl-1-(4-hydroxyphenyl)-2-hexene, 5-methyl-1-(4-hydroxyphenyl)-3-hexene, 4-methyl-1-(4-hydroxyphenyl)-1-hexene, 4-methyl-1-(4-hydroxyphenyl)-2-hexene, 4-methyl-1-(4-hydroxyphenyl)-3-hexene, 5-methyl-1-(4-hydroxyphenyl)-2-hexanol, 5-methyl-1-(4-hydroxyphenyl)-3-hexanol, 4-methyl-1-(4-hydroxyphenyl)-2-hexanol, 4-methyl-1-(4-hydroxyphenyl)-3-hexanol, 5-methyl-1-(4-hydroxyphenyl)-2-hexanone, 5-methyl-1-(4-hydroxyphenyl)-3-hexanone, 4-methyl-1-(4-hydroxyphenyl)-2-hexanone, 4-methyl-1-(4-hydroxyphenyl)-3-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol, 4-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol, 5-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone, 4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone, 4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione, 4-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione, 4-methyl-1-(indole-3-)hexane, 5-methyl-1-(indole-3)-1-hexene, 5-methyl-1-(indole-3)-2-hexene, 5-methyl-1-(indole-3)-3-hexene, 4-methyl-1-(indole-3)-1-hexene, 4-methyl-1-(indole-3)-2-hexene, 4-methyl-1-(indole-3)-3-hexene, 5-methyl-1-(indole-3)-2-hexanol, 5-methyl-1-(indole-3)-3-hexanol, 4-methyl-1-(indole-3)-2-hexanol, 4-methyl-1-(indole-3)-3-hexanol, 5-methyl-1-(indole-3)-2-hexanone, 5-methyl-1-(indole-3)-3-hexanone, 4-methyl-1-(indole-3)-2-hexanone, 4-methyl-1-(indole-3)-3-hexanone, 5-methyl-1-(indole-3)-2,3-hexanediol, 4-methyl-1-(indole-3)-2,3-hexanediol, 5-methyl-1-(indole-3)-3-hydroxy-2-hexanone, 5-methyl-1-(indole-3)-2-hydroxy-3-hexanone, 4-methyl-1-(indole-3)-3-hydroxy-2-hexanone, 4-methyl-1-(indole-3)-2-hydroxy-3-hexanone, 5-methyl-1-(indole-3)-2,3-hexanedione, 4-methyl-1-(indole-3)-2,3-hexanedione, n-heptane, 1-heptene, 1-heptanol, heptanal, heptanoate, 2-heptene, 3-heptene, 2-heptanol, 
3-heptanol, 4-heptanol, 2-heptanone, 3-heptanone, 4-heptanone, 2,3-heptanediol, 2,3-heptanedione, 3,4-heptanediol, 3,4-heptanedione, 2-hydroxy-3-heptanone, 3-hydroxy-2-heptanone, 3-hydroxy-4-heptanone, 4-hydroxy-3-heptanone, 2-methylheptane, 3-methylheptane, 6-methyl-2-heptene, 6-methyl-3-heptene, 2-methyl-3-heptene, 2-methyl-2-heptene, 5-methyl-2-heptene, 5-methyl-3-heptene, 3-methyl-3-heptene, 2-methyl-3-heptanol, 2-methyl-4-heptanol, 6-methyl-3-heptanol, 5-methyl-3-heptanol, 3-methyl-4-heptanol, 2-methyl-3-heptanone, 2-methyl-4-heptanone, 6-methyl-3-heptanone, 5-methyl-3-heptanone, 3-methyl-4-heptanone, 2-methyl-3,4-heptanediol, 2-methyl-3,4-heptanedione, 6-methyl-3,4-heptanediol, 6-methyl-3,4-heptanedione, 5-methyl-3,4-heptanediol, 5-methyl-3,4-heptanedione, 2-methyl-3-hydroxy-4-heptanone, 2-methyl-4-hydroxy-3-heptanone, 6-methyl-3-hydroxy-4-heptanone, 6-methyl-4-hydroxy-3-heptanone, 5-methyl-3-hydroxy-4-heptanone, 5-methyl-4-hydroxy-3-heptanone, 2,6-dimethylheptane, 2,5-dimethylheptane, 2,6-dimethyl-2-heptene, 2,6-dimethyl-3-heptene, 2,5-dimethyl-2-heptene, 2,5-dimethyl-3-heptene, 3,6-dimethyl-3-heptene, 2,6-dimethyl-3-heptanol, 2,6-dimethyl-4-heptanol, 2,5-dimethyl-3-heptanol, 2,5-dimethyl-4-heptanol, 2,6-dimethyl-3,4-heptanediol, 2,6-dimethyl-3,4-heptanedione, 2,5-dimethyl-3,4-heptanediol, 2,5-dimethyl-3,4-heptanedione, 2,6-dimethyl-3-hydroxy-4-heptanone, 2,6-dimethyl-4-hydroxy-3-heptanone, 2,5-dimethyl-3-hydroxy-4-heptanone, 2,5-dimethyl-4-hydroxy-3-heptanone, n-octane, 1-octene, 2-octene, 1-octanol, octanal, octanoate, 3-octene, 4-octene, 4-octanol, 4-octanone, 4,5-octanediol, 4,5-octanedione, 4-hydroxy-5-octanone, 2-methyloctane, 2-methyl-3-octene, 2-methyl-4-octene, 7-methyl-3-octene, 3-methyl-3-octene, 3-methyl-4-octene, 6-methyl-3-octene, 2-methyl-4-octanol, 7-methyl-4-octanol, 3-methyl-4-octanol, 6-methyl-4-octanol, 2-methyl-4-octanone, 7-methyl-4-octanone, 3-methyl-4-octanone, 6-methyl-4-octanone, 2-methyl-4,5-octanediol, 2-methyl-4,5-octanedione, 
3-methyl-4,5-octanediol, 3-methyl-4,5-octanedione, 2-methyl-4-hydroxy-5-octanone, 2-methyl-5-hydroxy-4-octanone, 3-methyl-4-hydroxy-5-octanone, 3-methyl-5-hydroxy-4-octanone, 2,7-dimethyloctane, 2,7-dimethyl-3-octene, 2,7-dimethyl-4-octene, 2,7-dimethyl-4-octanol, 2,7-dimethyl-4-octanone, 2,7-dimethyl-4,5-octanediol, 2,7-dimethyl-4,5-octanedione, 2,7-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyloctane, 2,6-dimethyl-3-octene, 2,6-dimethyl-4-octene, 3,7-dimethyl-3-octene, 2,6-dimethyl-4-octanol, 3,7-dimethyl-4-octanol, 2,6-dimethyl-4-octanone, 3,7-dimethyl-4-octanone, 2,6-dimethyl-4,5-octanediol, 2,6-dimethyl-4,5-octanedione, 2,6-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyl-5-hydroxy-4-octanone, 3,6-dimethyloctane, 3,6-dimethyl-3-octene, 3,6-dimethyl-4-octene, 3,6-dimethyl-4-octanol, 3,6-dimethyl-4-octanone, 3,6-dimethyl-4,5-octanediol, 3,6-dimethyl-4,5-octanedione, 3,6-dimethyl-4-hydroxy-5-octanone, n-nonane, 1-nonene, 1-nonanol, nonanal, nonanoate, 2-methylnonane, 2-methyl-4-nonene, 2-methyl-5-nonene, 8-methyl-4-nonene, 2-methyl-5-nonanol, 8-methyl-4-nonanol, 2-methyl-5-nonanone, 8-methyl-4-nonanone, 8-methyl-4,5-nonanediol, 8-methyl-4,5-nonanedione, 8-methyl-4-hydroxy-5-nonanone, 8-methyl-5-hydroxy-4-nonanone, 2,8-dimethylnonane, 2,8-dimethyl-3-nonene, 2,8-dimethyl-4-nonene, 2,8-dimethyl-5-nonene, 2,8-dimethyl-4-nonanol, 2,8-dimethyl-5-nonanol, 2,8-dimethyl-4-nonanone, 2,8-dimethyl-5-nonanone, 2,8-dimethyl-4,5-nonanediol, 2,8-dimethyl-4,5-nonanedione, 2,8-dimethyl-4-hydroxy-5-nonanone, 2,8-dimethyl-5-hydroxy-4-nonanone, 2,7-dimethylnonane, 3,8-dimethyl-3-nonene, 3,8-dimethyl-4-nonene, 3,8-dimethyl-5-nonene, 3,8-dimethyl-4-nonanol, 3,8-dimethyl-5-nonanol, 3,8-dimethyl-4-nonanone, 3,8-dimethyl-5-nonanone, 3,8-dimethyl-4,5-nonanediol, 3,8-dimethyl-4,5-nonanedione, 3,8-dimethyl-4-hydroxy-5-nonanone, 3,8-dimethyl-5-hydroxy-4-nonanone, n-decane, 1-decene, 1-decanol, decanoate, 2,9-dimethyldecane, 2,9-dimethyl-3-decene, 2,9-dimethyl-4-decene, 2,9-dimethyl-5-decanol, 
2,9-dimethyl-5-decanone, 2,9-dimethyl-5,6-decanediol, 2,9-dimethyl-6-hydroxy-5-decanone, 2,9-dimethyl-5,6-decanedione, n-undecane, 1-undecene, 1-undecanol, undecanal, undecanoate, n-dodecane, 1-dodecene, 1-dodecanol, dodecanal, dodecanoate, n-tridecane, 1-tridecene, 1-tridecanol, tridecanal, tridecanoate, n-tetradecane, 1-tetradecene, 1-tetradecanol, tetradecanal, tetradecanoate, n-pentadecane, 1-pentadecene, 1-pentadecanol, pentadecanal, pentadecanoate, n-hexadecane, 1-hexadecene, 1-hexadecanol, hexadecanal, hexadecanoate, n-heptadecane, 1-heptadecene, 1-heptadecanol, heptadecanal, heptadecanoate, n-octadecane, 1-octadecene, 1-octadecanol, octadecanal, octadecanoate, n-nonadecane, 1-nonadecene, 1-nonadecanol, nonadecanal, nonadecanoate, eicosane, 1-eicosene, 1-eicosanol, eicosanal, eicosanoate, 3-hydroxy propanal, 1,3-propanediol, 4-hydroxybutanal, 1,4-butanediol, 3-hydroxy-2-butanone, 2,3-butanediol, 1,5-pentanediol, homocitrate, homoisocitrate, b-hydroxy adipate, glutarate, glutarsemialdehyde, glutaraldehyde, 2-hydroxy-1-cyclopentanone, 1,2-cyclopentanediol, cyclopentanone, cyclopentanol, (S)-2-acetolactate, (R)-2,3-Dihydroxy-isovalerate, 2-oxoisovalerate, isobutyryl-CoA, isobutyrate, isobutyraldehyde, 5-amino pentaldehyde, 1,10-diaminodecane, 1,10-diamino-5-decene, 1,10-diamino-5-hydroxydecane, 1,10-diamino-5-decanone, 1,10-diamino-5,6-decanediol, 1,10-diamino-6-hydroxy-5-decanone, phenylacetaldehyde, 1,4-diphenylbutane, 1,4-diphenyl-1-butene, 1,4-diphenyl-2-butene, 1,4-diphenyl-2-butanol, 1,4-diphenyl-2-butanone, 1,4-diphenyl-2,3-butanediol, 1,4-diphenyl-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-4-phenylbutane, 1-(4-hydroxyphenyl)-4-phenyl-1-butene, 1-(4-hydroxyphenyl)-4-phenyl-2-butene, 1-(4-hydroxyphenyl)-4-phenyl-2-butanol, 1-(4-hydroxyphenyl)-4-phenyl-2-butanone, 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol, 1-(4-hydroxyphenyl)-4-phenyl-3-hydroxy-2-butanone, 1-(indole-3)-4-phenylbutane, 
1-(indole-3)-4-phenyl-1-butene, 1-(indole-3)-4-phenyl-2-butene, 1-(indole-3)-4-phenyl-2-butanol, 1-(indole-3)-4-phenyl-2-butanone, 1-(indole-3)-4-phenyl-2,3-butanediol, 1-(indole-3)-4-phenyl-3-hydroxy-2-butanone, 4-hydroxyphenylacetaldehyde, 1,4-di(4-hydroxyphenyl)butane, 1,4-di(4-hydroxyphenyl)-1-butene, 1,4-di(4-hydroxyphenyl)-2-butene, 1,4-di(4-hydroxyphenyl)-2-butanol, 1,4-di(4-hydroxyphenyl)-2-butanone, 1,4-di(4-hydroxyphenyl)-2,3-butanediol, 1,4-di(4-hydroxyphenyl)-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-4-(indole-3-)butane, 1-(4-hydroxyphenyl)-4-(indole-3)-1-butene, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butene,
1-(4-hydroxyphenyl)-4-(indole-3)-2-butanol, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butanone, 1-(4-hydroxyphenyl)-4-(indole-3)-2,3-butanediol, 1-(4-hydroxyphenyl)-4-(indole-3)-3-hydroxy-2-butanone, indole-3-acetaldehyde, 1,4-di(indole-3-)butane, 1,4-di(indole-3)-1-butene, 1,4-di(indole-3)-2-butene, 1,4-di(indole-3)-2-butanol, 1,4-di(indole-3)-2-butanone, 1,4-di(indole-3)-2,3-butanediol, 1,4-di(indole-3)-3-hydroxy-2-butanone, succinate semialdehyde, hexane-1,8-dicarboxylic acid, 3-hexene-1,8-dicarboxylic acid, 3-hydroxy-hexane-1,8-dicarboxylic acid, 3-hexanone-1,8-dicarboxylic acid, 3,4-hexanediol-1,8-dicarboxylic acid, 4-hydroxy-3-hexanone-1,8-dicarboxylic acid, fucoidan, iodine, chlorophyll, carotenoid, calcium, magnesium, iron, sodium, potassium, phosphate, and the like.
[0168]The recitation "optimized" as used herein refers to a pathway, gene, polypeptide, enzyme, or other molecule having an altered biological activity, such as by the genetic alteration of a polypeptide's amino acid sequence or by the alteration/modification of the polypeptide's surrounding cellular environment, to improve its functional characteristics in relation to the original molecule or original cellular environment (e.g., a wild-type sequence of a given polypeptide or a wild-type microorganism). Any of the polypeptides or enzymes described herein may be optionally "optimized," and any of the genes or nucleotide sequences described herein may optionally encode an optimized polypeptide or enzyme. Any of the pathways described herein may optionally contain one or more "optimized" enzymes, or one or more nucleotide sequences encoding for an optimized enzyme or polypeptide.
[0169]Typically, the improved functional characteristics of the polypeptide, enzyme, or other molecule relate to the suitability of the polypeptide or other molecule for use in a biological pathway (e.g., a biosynthesis pathway, a C--C ligation pathway) to convert a monosaccharide or oligosaccharide into a biofuel. Certain embodiments, therefore, contemplate the use of "optimized" biological pathways. An exemplary "optimized" polypeptide may contain one or more alterations or mutations in its amino acid coding sequence (e.g., point mutations, deletions, addition of heterologous sequences) that facilitate improved expression and/or stability in a given microbial system or microorganism, allow regulation of polypeptide activity in relation to a desired substrate (e.g., inducible or repressible activity), modulate the localization of the polypeptide within a cell (e.g., intracellular localization, extracellular secretion), and/or affect the polypeptide's overall level of activity in relation to a desired substrate (e.g., reduce or increase enzymatic activity). A polypeptide or other molecule may also be "optimized" for use with a given microbial system or microorganism by altering one or more pathways within that system or organism, such as by altering a pathway that regulates the expression (e.g., up-regulation), localization, and/or activity of the "optimized" polypeptide or other molecule, or by altering a pathway that minimizes the production of undesirable by-products, among other alterations. In this manner, a polypeptide or other molecule may be "optimized" with or without altering its wild-type amino acid sequence or original chemical structure. Optimized polypeptides or biological pathways may be obtained, for example, by direct mutagenesis or by natural selection for a desired phenotype, according to techniques known in the art.
[0170]In certain aspects, "optimized" genes or polypeptides may comprise a nucleotide coding sequence or amino acid sequence that is 50% to 99% identical (including all integers in between) to the nucleotide or amino acid sequence of a reference (e.g., wild-type) gene or polypeptide. In certain aspects, an "optimized" polypeptide or enzyme may have about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 100 (including all integers and decimal points in between, e.g., 1.2, 1.3, 1.4, 1.5, 5.5, 5.6, 5.7, 60, 70, etc.), or more times the biological activity of a reference polypeptide.
[0171]Certain aspects of the invention also include a commodity chemical, such as a biofuel, that is produced according to the methods and recombinant microorganisms described herein. Such a biofuel (e.g., medium to long chain alkane) may be distinguished from other fuels, such as those fuels produced by traditional refinery from crude carbon sources, by radio-carbon dating techniques. For instance, carbon has two stable, nonradioactive isotopes: carbon-12 (12C), and carbon-13 (13C). In addition, there are trace amounts of the unstable isotope carbon-14 (14C) on Earth. Carbon-14 has a half-life of 5730 years, and would have long ago vanished from Earth were it not for the unremitting impact of cosmic rays on nitrogen in the Earth's atmosphere, which create more of this isotope. The neutrons resulting from the cosmic ray interactions participate in the following nuclear reaction on the atoms of nitrogen molecules (N2) in the atmospheric air:
$$ n + {}^{14}_{7}\mathrm{N} \rightarrow {}^{14}_{6}\mathrm{C} + p $$
[0172]Plants and other photosynthetic organisms take up atmospheric carbon dioxide by photosynthesis. Since many plants are ingested by animals, every living organism on Earth is constantly exchanging carbon-14 with its environment for the duration of its existence. Once an organism dies, however, this exchange stops, and the amount of carbon-14 gradually decreases over time through radioactive beta decay.
[0173]Most hydrocarbon-based fuels, such as crude oil and natural gas derived from mining operations, are the result of compression and heating of ancient organic materials (i.e., kerogen) over geological time. Formation of petroleum typically occurs from hydrocarbon pyrolysis, in a variety of mostly endothermic reactions at high temperature and/or pressure. Today's oil formed from the preserved remains of prehistoric zooplankton and algae, which had settled to a sea or lake bottom in large quantities under anoxic conditions (the remains of prehistoric terrestrial plants, on the other hand, tended to form coal). Over geological time the organic matter mixed with mud, and was buried under heavy layers of sediment resulting in high levels of heat and pressure (known as diagenesis). This process caused the organic matter to chemically change, first into a waxy material known as kerogen which is found in various oil shales around the world, and then with more heat into liquid and gaseous hydrocarbons in a process known as catagenesis. Most hydrocarbon based fuels derived from crude oil have been undergoing a process of carbon-14 decay over geological time, and, thus, will have little to no detectable carbon-14. In contrast, certain biofuels produced by the living microorganisms of the present invention will comprise carbon-14 at a level comparable to all other presently living things (i.e., an equilibrium level). In this manner, by measuring the carbon-12 to carbon-14 ratio of a hydrocarbon-based biofuel of the present invention, and comparing that ratio to a hydrocarbon based fuel derived from crude oil, the biofuels produced by the methods provided herein can be structurally distinguished from typical sources of hydrocarbon based fuels.
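As a non-limiting illustration of the reasoning above, the fraction of carbon-14 remaining in a carbon source after it stops exchanging carbon with its environment may be sketched from the 5730-year half-life alone. The function name `c14_fraction_remaining` and the example ages are assumptions chosen for illustration; instrument detection limits and variations in atmospheric carbon-14 over time are not modeled.

```python
HALF_LIFE_YEARS = 5730.0  # carbon-14 half-life stated in the text


def c14_fraction_remaining(age_years: float) -> float:
    """Fraction of the original carbon-14 left after `age_years`.

    Standard exponential decay: N(t) / N(0) = (1/2) ** (t / half-life).
    A freshly produced biofuel corresponds to an age near zero (fraction
    near 1, the equilibrium level); a fossil source corresponds to an
    age of millions of years.
    """
    if age_years < 0:
        raise ValueError("age must be non-negative")
    return 0.5 ** (age_years / HALF_LIFE_YEARS)


# Illustrative ages only (assumed values, not measurements).
recent_biofuel = c14_fraction_remaining(1.0)
fossil_source = c14_fraction_remaining(1_000_000.0)
```

After one half-life exactly half of the original carbon-14 remains, while after a million years the remaining fraction is many orders of magnitude below any practical measurement, which is consistent with crude-oil-derived fuels having little to no detectable carbon-14.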
[0174]Embodiments of the present invention include methods for converting a polysaccharide to a suitable monosaccharide comprising, (a) obtaining the polysaccharide; and (b) contacting the polysaccharide with a recombinant microorganism or microbial system comprising such a microorganism for a time sufficient to convert the polysaccharide to a suitable monosaccharide, wherein the microbial system comprises, (i) at least one gene encoding and expressing an enzyme selected from a lyase and a hydrolase, wherein the lyase and/or hydrolase optionally comprises at least one signal peptide or at least one autotransporter domain; (ii) at least one gene encoding and expressing an enzyme selected from a monosaccharide transporter, a disaccharide transporter, a trisaccharide transporter, an oligosaccharide transporter, and a polysaccharide transporter; and (iii) at least one gene encoding and expressing an enzyme selected from a monosaccharide dehydrogenase, an isomerase, a dehydratase, a kinase, and an aldolase, thereby converting the polysaccharide to a suitable monosaccharide.
[0175]Alternatively, certain aspects may include methods for converting a polysaccharide to a suitable monosaccharide comprising, (a) obtaining the polysaccharide; and (b) contacting the polysaccharide with a microbial system for a time sufficient to convert the polysaccharide to a suitable monosaccharide, wherein the microbial system comprises, (i) at least one gene encoding and expressing an enzyme selected from a lyase and a hydrolase; (ii) at least one gene encoding and expressing a superchannel; and (iii) at least one gene encoding and expressing an enzyme selected from a monosaccharide dehydrogenase, an isomerase, a dehydratase, a kinase, and an aldolase, thereby converting the polysaccharide to a suitable monosaccharide.
[0176]In certain embodiments, a microbial system or isolated microorganism is capable of growing using a polysaccharide (e.g., alginate, pectin, etc.) as a sole source of carbon and/or energy. A "sole source of carbon" refers generally to the ability to grow on a given carbon source as the only carbon source in a given growth medium.
[0177]With regard to alginate, approximately 50 percent of seaweed dry-weight comprises various sugar components, among which alginate and mannitol are major components corresponding to 30 and 15 percent of seaweed dry-weight, respectively. Although microorganisms such as E. coli are generally considered host organisms in synthetic biology, and although such microorganisms are able to metabolize mannitol, they completely lack the ability to degrade and metabolize alginate. In this regard, many laboratory or wild-type microorganisms, such as E. coli, are unable to grow on alginate as a sole source of carbon. Similarly, many organisms such as E. coli are unable to degrade and metabolize pectin, a polysaccharide found in many food waste products, and thus are unable to grow on pectin as a sole source of carbon. Accordingly, embodiments of the present application include engineered microorganisms, such as E. coli, or microbial systems containing such engineered microorganisms, that are capable of using polysaccharides, such as alginate and pectin, as a sole source of carbon and/or energy.
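As a rough illustration of the proportions stated above, the following Python sketch computes the approximate alginate and mannitol content of a batch of dried seaweed. The function names and the 1000 kg batch size are hypothetical, and the stated percentages are treated as exact fractions purely for the sake of the arithmetic.

```python
# Approximate dry-weight fractions stated in the text (treated here as exact).
ALGINATE_FRACTION = 0.30      # alginate, fraction of seaweed dry weight
MANNITOL_FRACTION = 0.15      # mannitol, fraction of seaweed dry weight
TOTAL_SUGAR_FRACTION = 0.50   # all sugar components combined


def sugar_masses(dry_weight_kg: float) -> dict:
    """Return approximate component masses (kg) for a given dry weight."""
    return {
        "alginate": dry_weight_kg * ALGINATE_FRACTION,
        "mannitol": dry_weight_kg * MANNITOL_FRACTION,
        "total_sugars": dry_weight_kg * TOTAL_SUGAR_FRACTION,
    }


def share_of_sugars(component_fraction: float) -> float:
    """Fraction of the total sugar component contributed by one component."""
    return component_fraction / TOTAL_SUGAR_FRACTION


if __name__ == "__main__":
    print(sugar_masses(1000.0))                # hypothetical 1000 kg batch
    print(share_of_sugars(ALGINATE_FRACTION))  # alginate share of the sugars
```

Under these assumptions, alginate accounts for roughly three fifths of the sugar component, so a host that metabolizes mannitol but not alginate would leave most of the available sugar unused.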
[0178]Alginate is a block co-polymer of β-D-mannuronate (M) and α-L-guluronate (G) (M and G are epimeric about the C5-carboxyl group). Each alginate polymer comprises regions of all M (polyM), all G (polyG), and/or a mixture of M and G (polyMG). To utilize alginate to produce one or more suitable monosaccharides, certain aspects of the present invention provide an engineered or recombinant microorganism or microbial system that is able to degrade or de-polymerize alginate and to use it as a source of carbon and/or energy. As one means of accomplishing this purpose, such recombinant microorganisms may incorporate a set of polysaccharide degrading or depolymerizing enzymes, such as alginate lyases (ALs), into the microbial system.
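The block structure described above can be illustrated with a toy model in which an alginate chain is written as a string of M and G residues. The following Python sketch splits such a string into polyM, polyG, and polyMG regions. The function name and the minimum run length used to call a homopolymeric block are assumptions made for illustration; the text does not define a threshold.

```python
from itertools import groupby


def classify_blocks(chain: str, min_homopolymer: int = 3) -> list:
    """Split an M/G string into (block_type, segment) tuples.

    Runs of at least `min_homopolymer` identical residues are labelled
    polyM or polyG; residues lying between such runs are labelled polyMG.
    The threshold is an illustrative assumption, not taken from the text.
    """
    if set(chain) - {"M", "G"}:
        raise ValueError("chain may contain only 'M' and 'G'")

    blocks = []
    mixed = ""  # accumulates short runs that together form a mixed region
    for residue, run in groupby(chain):
        segment = "".join(run)
        if len(segment) >= min_homopolymer:
            if mixed:
                blocks.append(("polyMG", mixed))
                mixed = ""
            blocks.append(("poly" + residue, segment))
        else:
            mixed += segment
    if mixed:
        blocks.append(("polyMG", mixed))
    return blocks


if __name__ == "__main__":
    for block in classify_blocks("MMMMGMGMGGGG"):
        print(block)
```

In this simplified picture, M-specific and G-specific endo-acting lyases would act on the polyM and polyG regions respectively, which is one reason a combination of lyases may be useful.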
[0179]ALs are mainly classified into two distinctive subfamilies depending on their mode of catalysis: endo-acting (EC 4.2.2.3) and exo-acting (EC 4.2.2.-) ALs. Endo-acting ALs are further classified based on their catalytic specificity: M-specific and G-specific ALs. The endo-acting ALs randomly cleave alginate via a β-elimination mechanism and mainly depolymerize alginate to di-, tri- and tetrasaccharides. The uronate at the non-reducing terminus of each oligosaccharide is converted to an unsaturated sugar uronate, 4-deoxy-α-L-erythro-hex-4-ene pyranosyl uronate. The exo-acting ALs catalyze further depolymerization of these oligosaccharides and release unsaturated monosaccharides, which may be non-enzymatically converted to monosaccharides, including the α-keto acid 4-deoxy-α-L-erythro-hexoseulose uronate (DEHU). Certain embodiments of an engineered microbial system or isolated, engineered microorganism may include endoM-, endoG- and exo-acting ALs to degrade or depolymerize aquatic or marine-biomass polysaccharides such as alginate to a monosaccharide such as DEHU.
[0180]Embodiments of the present invention may also include lyases such as alginate lyases isolated from various sources, including, but not limited to, marine algae, mollusks, and a wide variety of microbes, such as those of the genera Pseudomonas, Vibrio, and Sphingomonas. Many alginate lyases are endo-acting and M specific, several are G specific, and few are exo-acting. For example, ALs isolated from Sphingomonas sp. strain Al include five endo-acting ALs, Al-I, Al-II, Al-II', Al-III, and Al-IV', and an exo-acting AL, Al-IV.
[0181]Typically, Al-I, Al-II, and Al-III have molecular weights of 66 kDa, 25 kDa, and 40 kDa, respectively. Al-II and Al-III are self-splicing products of Al-I. Al-II may be more specific to G and Al-III may be specific to M. Al-I may have high activity for both M and G. Al-IV has a molecular weight of about 85 kDa and catalyzes exo-lytic depolymerization of oligoalginate. Although Al-II' and Al-IV' are functional homologues of Al-II and Al-IV, respectively, Al-II' has endo-lytic activity and may have no preference for M or G, and Al-IV' has primarily endo-lytic activity. In addition to these ALs, the exo-lytic AL Atu3025 derived from Agrobacterium tumefaciens has high activity for depolymerization of oligoalginate, and may be used in certain embodiments of the present invention. Certain embodiments may incorporate into the microbial system or isolated microorganism the genes encoding Al-I, Al-II', Al-IV, and Atu3025, and may include optimal codon usage for suitable host organisms, such as E. coli.
[0182]Certain examples of alginate lyases or oligoalginate lyases that may be utilized herein include enzymes or polypeptides sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to SEQ ID NOS:67-68, which show the nucleotide (SEQ ID NO:67) and polypeptide (SEQ ID NO:68) sequences of oligoalginate lyase Atu3025 isolated from Agrobacterium tumefaciens. Certain examples of alginate lyases that may be utilized herein include enzymes or polypeptides sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the alginate lyase enzymes described in FIG. 37, as well as the secreted alginate lyase encoded by Vs24254 from Vibrio splendidus.
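The sequence identity thresholds recited above can be illustrated with a short Python sketch that computes percent identity between two pre-aligned sequences and reports which of the recited thresholds are met. The text does not specify an alignment method or how gaps are counted, so the convention used here (identity over aligned columns in which neither sequence has a gap) and the function names are assumptions made for illustration only.

```python
IDENTITY_THRESHOLDS = (60, 70, 80, 90, 95, 98)  # percentages recited in the text


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    Other conventions, such as counting gap columns in the denominator,
    would give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def thresholds_met(identity: float) -> list:
    """Return the recited thresholds that a given percent identity satisfies."""
    return [t for t in IDENTITY_THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from any SEQ ID NO.
    identity = percent_identity("MKTAYIAK", "MKTAYLAK")
    print(identity, thresholds_met(identity))
```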
[0183]In certain embodiments, a microbial system or recombinant microorganism may be engineered to secrete the lyases or alginate lyases (ALs) to the culture media, or to display them on the cell surface, such as by incorporating a signal peptide or autotransporter domain into the lyase. In this regard, it is typically understood that bacteria have at least four different types of protein secretion machinery (type I, II, III and IV). For example, in E. coli, the type II secretion machinery is used for the secretion of recombinant proteins. The type II secretion machinery may comprise a two-step process: the translocation of precursor proteins tagged with signal peptides to the periplasmic fraction and processing to the mature proteins, followed by secretion to media.
[0184]The first process may proceed by any of three different pathways: the secB-dependent pathway, the signal recognition particle (SRP) pathway, or the twin-arginine translocation (TAT) pathway. Recombinant proteins may be secreted into the periplasmic fraction. The fates of the mature proteins vary depending on the type of protein. For example, some proteins are secreted spontaneously by diffusion or passively by a secretion apparatus named the secreton, which consists of 12-16 proteins, and others stay in the periplasmic fraction and are eventually degraded.
[0185]Some proteins may also be secreted by an autotransporter apparatus, such as by utilizing an autotransporter domain. The proteins secreted by autotransporter domains typically comprise an N-terminal signal peptide that plays a role in translocation to the periplasm (which may be mediated by the secB or SRP pathways), a passenger domain, and/or a C-terminal translocation unit (UT) having a characteristic β-barrel structure. The β-barrel portion of the UT builds an aqueous pore channel across the outer membrane and helps the transportation of the passenger domain to media. Autodisplayed passenger proteins are often cleaved by the autotransporter and set free to media.
[0186]The type I secretion machinery may also be used for the secretion of recombinant proteins in E. coli. The type I secretion machinery may be used for the secretion of high-molecular-weight toxins and exoenzymes. The type I secretion machinery consists of two inner membrane proteins (HlyB and HlyD) that are members of the ATP binding cassette (ABC) transporter family, and an endogenous outer membrane protein (TolC). The secretion of recombinant proteins based on type I secretion machinery may utilize the C-terminal region of α-haemolysin (HlyA) as a signal sequence. The recombinant proteins may readily pass through the inner membrane, periplasm, and outer membrane through the type I secretion machinery.
[0187]Depending on the types of linker and signal peptides utilized by various embodiments of the present application, both autotransporter and type I secretion machinery can be adapted to serve as cell surface display machinery. Alternatively, a system specific to cell surface display may be used. For example, in this system, target proteins may be fused to PgsA protein (a poly-γ-glutamate synthetase complex) that is natively displayed on the surface of Bacillus subtilis.
[0188]Certain embodiments may include lyases such as alginate lyases fused with various signal peptides and/or autotransporter domains found in proteins secreted by both type I and type II secretion machinery. Other embodiments may include lyases such as alginate lyases fused with any combination of signal peptides and/or autotransporter domains found in proteins secreted by the transport machinery described herein or known to a person skilled in the art. Embodiments may also include signal peptides or autotransporter domains that are experimentally redesigned to maximize the secretion of lyases such as alginate lyases to the culture media, and may also include the use of many different linker sequences that fuse signal peptides, lyases, and autotransporters and that improve the efficiency of secretion or the cell surface presentation of lyases.
[0189]Certain embodiments may include a microbial system or isolated microorganism that comprises saccharide transporters, which are able to transport monosaccharides (e.g., DEHU) and oligosaccharides from the media to the cytosol to efficiently utilize these monosaccharides as a source of carbon and/or energy. For instance, genes encoding monosaccharide permeases (i.e., monosaccharide transporters) such as DEHU permeases may be isolated from bacteria that grow on polysaccharides such as alginate as a source of carbon and/or energy, and may be incorporated into embodiments of the present microbial system or isolated microorganism. As an additional example, embodiments may also include redesigned native permeases or transporters with altered specificity for monosaccharide (e.g., DEHU) transportation.
[0190]In this regard, E. coli contains several permeases able to transport monosaccharides, which include, but are not limited to, KdgT, a 2-keto-3-deoxy-D-gluconate (KDG) transporter; ExuT, a transporter for aldohexuronates such as D-galacturonate and D-glucuronate; GntT, GntU, and GntP, which are gluconate transporters; and KgtP, a proton-driven α-ketoglutarate transporter. Microbial systems or recombinant microorganisms described herein may comprise any of these permeases, in addition to those permeases known to a person of skill in the art and not mentioned herein, and may also include permease enzymes redesigned to transport other monosaccharides, such as DEHU.
[0191]A microbial system or recombinant microorganism according to the present invention may also comprise permeases/transporters/superchannels/porins that catalyze the transport of polysaccharides and monosaccharides (e.g., D-mannuronate and D-lyxose) from the media to the periplasm or cytosol of a microorganism. For example, genes encoding the permeases of D-mannuronate in soil Aeromonas may be incorporated into a microbial system as described herein.
[0192]As one alternative example, a microbial system or microorganism may comprise native permeases/transporters that are redesigned to alter their specificity for efficient monosaccharide transportation, such as for D-mannuronate and D-lyxose transportation. For instance, E. coli contains several permeases that are able to transport monosaccharides or sugars such as D-mannonate and D-lyxose, including KdgT for 2-keto-3-deoxy-D-gluconate (KDG) transporter, ExuT for aldohexuronates such as D-galacturonate and D-glucuronate transporter, GntPTU for gluconate/fructuronate transporter, uidB for glucuronide transporter, fucP for L-fucose transporter, galP for galactose transporter, yghK for glycolate transporter, dgoT for D-galactonate transporter, uhpT for hexose phosphate transporter, dctA for orotate/citrate transporter, gntUT for gluconate transporter, malEGF for maltose transporter, alsABC for D-allose transporter, idnT for L-idonate/D-gluconate transporter, KgtP for proton-driven α-ketoglutarate transporter, lacY for lactose/galactose transporter, xylEFGH for D-xylose transporter, araEFGH for L-arabinose transporter, and rbsABC for D-ribose transporter. In certain embodiments, a microbial system or recombinant microorganism may comprise permeases or transporters as described above, including those that are re-designed or optimized for improved transport of certain monosaccharides, such as D-mannuronate, DEHU, and D-lyxose.
[0193]Certain aspects may employ a recombinant microorganism that comprises a "superchannel," by which aquatic or marine-biomass polysaccharides such as alginate polymers, or fruit or vegetable biomass polysaccharides such as pectin polymers, may be directly incorporated into the cytosol and degraded inside the microbial system. For instance, a group of bacteria characterized as Sphingomonads have a wide-ranging capability to degrade environmentally hazardous compounds such as polychlorinated polycyclic aromatics (dioxin). These bacteria contain characteristic large pleat-like molecules on their cell surfaces. In this regard, certain Sphingomonads have structures characterized as "superchannels" that enable the bacteria to directly take up macromolecules.
[0194]As one particular example of a microorganism comprising a superchannel, Sphingomonas sp. strain Al directly incorporates polysaccharides such as alginate through a superchannel. Such superchannels may consist of a pit on the outer membrane (e.g., AlgR), alginate-binding proteins in the periplasm (e.g., AlgQ1 and AlgQ2), and an ATP-binding cassette (ABC) transporter (e.g., AlgM1, AlgM2, and AlgS). Incorporated polysaccharides such as alginate may be readily depolymerized by lyases such as alginate lyases produced in the cytosol. Thus, certain embodiments may incorporate genes encoding a superchannel (e.g., ccpA, algS, algM1, algM2, algQ1, algQ2) to introduce this ability to the microbial system or recombinant microorganism. Other embodiments may include microorganisms such as Sphingomonas subarctica IFO 16058T, which harbor the plasmid containing genes that encode a superchannel, and which have significantly improved ability to utilize marine or aquatic biomass polysaccharides such as alginate as a source of carbon and/or energy. Certain recombinant microorganisms may employ these superchannel encoding plasmid sequences contained within Sphingomonas subarctica IFO 16058T.
[0195]Certain examples of alginate ABC transporters that may be utilized herein include ABC transporters Atu3021, Atu3022, Atu3023, Atu3024, algM1, algM2, AlgQ1, AlgQ2, AlgS, OG2516--05558, OG2516--05563, OG2516--05568, and OG2516--05573, including functional variants thereof. Certain examples of alginate symporters that may be utilized herein include symporters V12B01--24239 and V12B01--24194, among others, including functional variants thereof. One additional example of an alginate porin includes V12B01--24269, and variants thereof.
[0196]As noted above, certain embodiments may include recombinant microorganisms that comprise one or more monosaccharide dehydrogenases, isomerases, dehydratases, kinases, and aldolases. With regard to monosaccharide dehydrogenases, certain microbial systems or recombinant microorganisms may incorporate enzymes that reduce various monosaccharides (e.g., DEHU, mannuronate) to a monosaccharide that is suitable for biofuel biosynthesis, such as 2-keto-3-deoxy-D-gluconate (KDG) or D-mannitol. Such exemplary enzymes include, for example, DEHU hydrogenases and mannuronate hydrogenases, in addition to various alcohol dehydrogenases having DEHU hydrogenase and/or mannuronate dehydrogenase activity, such as the novel ADH1 through ADH12 enzymes isolated from Agrobacterium tumefaciens C58 (see, e.g., SEQ ID NOS:69-92).
[0197]For more detail on the ADH1 through ADH12 enzymes, SEQ ID NO:69 shows the nucleotide and SEQ ID NO:70 shows the polypeptide sequence of ADH1 Atu1557 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:71 shows the nucleotide and SEQ ID NO:72 shows the polypeptide sequence of ADH2 Atu2022 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:73 shows the nucleotide and SEQ ID NO:74 shows the polypeptide sequence of ADH3 Atu0626 isolated from Agrobacterium tumefaciens C58.
[0198]SEQ ID NO:75 shows the nucleotide and SEQ ID NO:76 shows the polypeptide sequence of ADH4 Atu5240 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:77 shows the nucleotide and SEQ ID NO:78 shows the polypeptide sequence of ADH5 Atu3163 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:79 shows the nucleotide and SEQ ID NO:80 shows the polypeptide sequence of ADH6 Atu2151 isolated from Agrobacterium tumefaciens C58.
[0199]SEQ ID NO:81 shows the nucleotide and SEQ ID NO:82 shows the polypeptide sequence of ADH7 Atu2814 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:83 shows the nucleotide and SEQ ID NO:84 shows the polypeptide sequence of ADH8 Atu5447 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:85 shows the nucleotide and SEQ ID NO:86 shows the polypeptide sequence of ADH9 Atu4087 isolated from Agrobacterium tumefaciens C58.
[0200]SEQ ID NO:87 shows the nucleotide and SEQ ID NO:88 shows the polypeptide sequence of ADH10 Atu4289 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:89 shows the nucleotide and SEQ ID NO:90 shows the polypeptide sequence of ADH11 Atu3027 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:91 shows the nucleotide and SEQ ID NO:92 shows the polypeptide sequence of ADH12 Atu3026 isolated from Agrobacterium tumefaciens C58.
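The ADH1 through ADH12 listing above follows a regular pattern, with each enzyme assigned a consecutive nucleotide/polypeptide pair of SEQ ID NOS. The following Python sketch captures that pattern as a lookup. The locus tags and numbering are those recited above; the function name and dictionary layout are illustrative assumptions.

```python
# Locus tags of ADH1..ADH12 from Agrobacterium tumefaciens C58, in the
# order recited in the text.
ADH_LOCUS_TAGS = [
    "Atu1557", "Atu2022", "Atu0626", "Atu5240", "Atu3163", "Atu2151",
    "Atu2814", "Atu5447", "Atu4087", "Atu4289", "Atu3027", "Atu3026",
]


def adh_seq_ids(n: int) -> dict:
    """Return the locus tag and SEQ ID NO pair for ADHn (1 <= n <= 12).

    ADH1 corresponds to SEQ ID NOS:69-70 and each later enzyme advances
    by two, so nucleotide = 67 + 2n and polypeptide = 68 + 2n.
    """
    if not 1 <= n <= len(ADH_LOCUS_TAGS):
        raise ValueError("n must be between 1 and 12")
    return {
        "name": f"ADH{n}",
        "locus_tag": ADH_LOCUS_TAGS[n - 1],
        "nucleotide_seq_id": 67 + 2 * n,
        "polypeptide_seq_id": 68 + 2 * n,
    }


if __name__ == "__main__":
    for n in range(1, 13):
        print(adh_seq_ids(n))
```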
[0201]Further examples of enzymes having dehydrogenase activity include Atu3026, Atu3027, OG2516--05543, OG2516--05538 and V12B01--24244. The microorganisms and methods of the present invention may also utilize biologically active fragments and variants of these hydrogenase enzymes, including optimized variants thereof.
[0202]As a further example, Pseudomonas grown using alginate as a sole source of carbon and energy comprises a DEHU hydrogenase enzyme that uses NADPH as a co-factor, is more stable when NADP+ is present in the solution, and is active at ambient pH. Thus, certain embodiments of a microbial system or a recombinant microorganism as described herein may incorporate genes encoding hydrogenases such as DEHU or mannuronate hydrogenase derived or obtained from various microbes, in which these microbes may be capable of growing on polysaccharides such as alginate or pectin as a source of carbon and/or energy.
[0203]Certain embodiments may incorporate components of a microbial system or isolated microorganism that is capable of efficiently growing on monosaccharides such as D-mannuronate or D-lyxose as a source of carbon and energy. For instance, both Aeromonas and Aerobacter aerogenes PRL-R3 comprise genes encoding monosaccharide dehydrogenases such as D-mannuronate hydrogenase and D-lyxose isomerase. Thus, certain microbial systems or recombinant microorganisms may comprise monosaccharide dehydrogenases such as D-mannuronate hydrogenase and D-lyxose isomerase from Aeromonas, Aerobacter aerogenes PRL-R3, or various other suitable microorganisms, including those microorganisms capable of growing on D-mannuronate or D-lyxose as a source of carbon and energy.
[0204]Certain embodiments may include a microbial system or isolated microorganism with enhanced efficiency for converting monosaccharides such as D-mannonate and D-xylulose into monosaccharides suitable for a biofuel biosynthesis pathway such as KDG. Merely by way of explanation, D-mannonate and D-xylulose are metabolites in microbes such as E. coli. D-mannonate is converted by a D-mannonate dehydratase to KDG. D-xylulose enters the pentose phosphate pathway. Thus, to increase conversion of D-mannonate to KDG, an exogenous or endogenous D-mannonate dehydratase (e.g., uxuA) gene may be over-expressed in a recombinant microorganism of the invention. Similarly, in other embodiments, suitable endogenous or exogenous genes such as kinases (e.g., kdgK), nad, as well as KDG aldolases (e.g., kdgA and eda) may be either incorporated or overexpressed in a given recombinant microorganism (see SEQ ID NOS:93-96), including biologically active variants or fragments thereof, such as optimized variants of these genes. SEQ ID NO:93 shows the nucleotide sequence and SEQ ID NO:94 shows the polypeptide sequence of a 2-keto-deoxy gluconate kinase (KdgK) from Escherichia coli DH10B. SEQ ID NO:95 shows the nucleotide sequence and SEQ ID NO:96 shows the polypeptide sequence of a 2-keto-deoxy gluconate-6-phosphate aldolase (KdgA) from Escherichia coli DH10B.
[0205]In certain aspects, as noted above, a recombinant microorganism that is capable of growing on alginate or pectin as a sole source of carbon may utilize a naturally-occurring or endogenous copy of a dehydratase, kinase, and/or aldolase. For instance, E. coli contains endogenous dehydratases, kinases, and aldolases that are capable of catalyzing the appropriate steps in the conversion of polysaccharides to a suitable monosaccharide. In these and other related aspects, the naturally-occurring dehydratase, kinase, or aldolase may also be over-expressed, such as by providing an exogenous copy of the naturally-occurring dehydratase, kinase or aldolase operably linked to a highly constitutive or inducible promoter.
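The conversions described in this and the preceding paragraphs can be summarized as a small directed graph of substrate, enzyme, and product steps. The following Python sketch is a simplified illustration under stated assumptions: the step list collapses the non-enzymatic conversion of unsaturated monosaccharides to DEHU into the exo-acting lyase step, it assumes that the KDG kinase product is the KDG-6-phosphate on which the recited aldolase acts, it omits the aldolase products because they are not named here, and the function and variable names are hypothetical.

```python
from collections import deque

# (substrate, product, enzyme) steps named in the surrounding text.
# The KDG aldolase (kdgA/eda) acting on KDG-6-phosphate is omitted because
# its products are not named in this passage.
STEPS = [
    ("alginate", "oligoalginate", "endo-acting alginate lyase"),
    ("oligoalginate", "DEHU", "exo-acting alginate lyase"),
    ("DEHU", "KDG", "DEHU hydrogenase"),
    ("D-mannonate", "KDG", "D-mannonate dehydratase (uxuA)"),
    ("KDG", "KDG-6-phosphate", "KDG kinase (kdgK)"),
]


def find_route(start: str, goal: str, steps=STEPS):
    """Breadth-first search for a route from `start` to `goal`.

    Returns a list of (substrate, enzyme, product) tuples, an empty list
    if start equals goal, or None if no route exists.
    """
    graph = {}
    for substrate, product, enzyme in steps:
        graph.setdefault(substrate, []).append((product, enzyme))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for product, enzyme in graph.get(node, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, path + [(node, enzyme, product)]))
    return None


if __name__ == "__main__":
    for step in find_route("alginate", "KDG"):
        print(" -> ".join(step))
```

A representation of this kind makes it straightforward to check which enzymes a given host would need in order to reach KDG from a chosen starting substrate.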
[0206]As one exemplary source of enzymes for engineering a recombinant microorganism to grow on alginate as a sole source of carbon, Vibrio splendidus is known to be able to metabolize alginate to support growth. For example, SEQ ID NO:1 shows a secretome region carrying certain Vibrio splendidus genes (V12B01--02425 to V12B01--02480), which encodes a type II secretion apparatus. SEQ ID NO:2 shows the nucleotide sequence of an entire genomic region from V12B01--24189 to V12B01--24249, which was derived from Vibrio splendidus, and which, when transformed into E. coli as a fosmid clone, was sufficient to confer the ability to grow on alginate as a sole source of carbon. SEQ ID NOS:3-64 show the individual putative genes contained within SEQ ID NO:2. Thus, in certain aspects, a recombinant microorganism (e.g., E. coli) that is able to grow on alginate as a sole source of carbon and/or energy may comprise one or more nucleotide or polypeptide reference sequences described in SEQ ID NOS:1-64, including biologically active fragments or variants thereof, such as optimized variants.
[0207]In certain aspects, a recombinant microorganism that is able to grow on alginate as a sole source of carbon may contain certain coding nucleotide or polypeptide sequences contained within SEQ ID NO:2, such as the sequences in SEQ ID NOS:3-64, or biologically active fragments or variants thereof, including optimized variants. These sequences are described in further detail below.
[0208]SEQ ID NO:3 shows the nucleotide coding sequence of the putative protein V12B01--24184. This putative coding sequence is contained within the polynucleotide sequence of SEQ ID NO:2, and encodes a polypeptide that is similar to an autotransporter adhesin or type I secretion target ggxgxdxxx (SEQ ID NO:145) repeat. SEQ ID NO:4 shows the polypeptide sequence of putative protein V12B01--24184, encoded by the polynucleotide of SEQ ID NO:3. This putative polypeptide is similar to an autotransporter adhesin or type I secretion target ggxgxdxxx (SEQ ID NO:145) repeat.
[0209]SEQ ID NO:5 shows the nucleotide sequence that encodes the putative protein V12B01--24189. SEQ ID NO:6 shows the polypeptide sequence of the putative protein V12B01--24189, which is similar to cyclohexadienyl dehydratase.
[0210]SEQ ID NO:7 shows the nucleotide sequence that encodes the putative protein V12B01--24194. SEQ ID NO:8 shows the polypeptide sequence of the putative protein V12B01--24194, which is similar to a Na/proline transporter.
[0211]SEQ ID NO:9 shows the nucleotide sequence that encodes the putative protein V12B01--24199. SEQ ID NO:10 shows the polypeptide sequence of the putative protein V12B01--24199, which is similar to a keto-deoxy-phosphogluconate aldolase.
[0212]SEQ ID NO:11 shows the nucleotide sequence that encodes the putative protein V12B01--24204. SEQ ID NO:12 shows the polypeptide sequence of the putative protein V12B01--24204, which is similar to 2-dehydro-3-deoxygluconokinase.
[0213]SEQ ID NO:13 shows the nucleotide sequence that encodes the putative protein V12B01--24209. SEQ ID NO:14 shows the polypeptide sequence of the putative protein V12B01--24209.
[0214]SEQ ID NO:15 shows the nucleotide sequence that encodes the putative protein V12B01--24214. SEQ ID NO:16 shows the polypeptide sequence of the putative protein V12B01--24214, which is similar to a chondroitin AC/alginate lyase.
[0215]SEQ ID NO:17 shows the nucleotide sequence that encodes the putative protein V12B01--24219. SEQ ID NO:18 shows the polypeptide sequence of the putative protein V12B01--24219, which is similar to a chondroitin AC/alginate lyase.
[0216]SEQ ID NO:19 shows the nucleotide sequence that encodes the putative protein V12B01--24224. SEQ ID NO:20 shows the polypeptide sequence of the putative protein V12B01--24224, which is similar to a 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase.
[0217]SEQ ID NO:21 shows the nucleotide sequence that encodes the putative protein V12B01--24229. SEQ ID NO:22 shows the polypeptide sequence of the putative protein V12B01--24229, which is similar to a GntR-family transcriptional regulator.
[0218]SEQ ID NO:23 shows the nucleotide sequence that encodes the putative protein V12B01--24234. SEQ ID NO:24 shows the polypeptide sequence of the putative protein V12B01--24234, which is similar to a Na+/proline symporter.
[0219]SEQ ID NO:25 shows the nucleotide sequence that encodes the putative protein V12B01--24239. SEQ ID NO:26 shows the polypeptide sequence of the putative protein V12B01--24239, which is similar to an oligoalginate lyase.
[0220]SEQ ID NO:27 shows the nucleotide sequence that encodes the putative protein V12B01--24244. SEQ ID NO:28 shows the polypeptide sequence of putative protein V12B01--24244, which is similar to a 3-hydroxyisobutyrate dehydrogenase.
[0221]SEQ ID NO:29 shows the nucleotide sequence that encodes the putative protein V12B01--24249. SEQ ID NO:30 shows the polypeptide sequence of the putative protein V12B01--24249, which is similar to a methyl-accepting chemotaxis protein.
[0222]SEQ ID NO:31 shows the nucleotide sequence that encodes the putative protein V12B01--24254. SEQ ID NO:32 shows the polypeptide sequence of putative protein V12B01--24254, which is similar to an alginate lyase.
[0223]SEQ ID NO:33 shows the nucleotide sequence that encodes the putative protein V12B01--24259. SEQ ID NO:34 shows the polypeptide sequence of putative protein V12B01--24259, which is similar to an alginate lyase.
[0224]SEQ ID NO:35 shows the nucleotide sequence that encodes the putative protein V12B01--24264. SEQ ID NO:36 shows the polypeptide sequence of putative protein V12B01--24264.
[0225]SEQ ID NO:37 shows the nucleotide sequence that encodes the putative protein V12B01--24269. SEQ ID NO:38 shows the polypeptide sequence of putative protein V12B01--24269, which is similar to a putative oligogalacturonate specific porin.
[0226]SEQ ID NO:39 shows the nucleotide sequence that encodes the putative protein V12B01--24274. SEQ ID NO:40 shows the polypeptide sequence of putative protein V12B01--24274, which is similar to an alginate lyase.
[0227]FIG. 32 shows the nucleotide coding sequence and polypeptide sequence of putative protein V12B01--02425. FIG. 32A shows the nucleotide sequence that encodes the putative protein V12B01--02425 (SEQ ID NO:41). FIG. 32B shows the polypeptide sequence of putative protein V12B01--02425 (SEQ ID NO:42), which is similar to a type II secretory pathway component EpsC.
[0228]SEQ ID NO:43 shows the nucleotide sequence that encodes the putative protein V12B01--02430. SEQ ID NO:44 shows the polypeptide sequence of putative protein V12B01--02430, which is similar to a type II secretory pathway component EpsD.
[0229]SEQ ID NO:45 shows the nucleotide sequence that encodes the putative protein V12B01--02435. SEQ ID NO:46 shows the polypeptide sequence of putative protein V12B01--02435, which is similar to a type II secretory pathway component EpsE.
[0230]SEQ ID NO:47 shows the nucleotide sequence that encodes the putative protein V12B01--02440. SEQ ID NO:48 shows the polypeptide sequence of putative protein V12B01--02440, which is similar to a type II secretory pathway component EpsF.
[0231]SEQ ID NO:49 shows the nucleotide sequence that encodes the putative protein V12B01--02445. SEQ ID NO:50 shows the polypeptide sequence of putative protein V12B01--02445, which is similar to a type II secretory pathway component EpsG.
[0232]SEQ ID NO:51 shows the nucleotide sequence that encodes the putative protein V12B01--02450. SEQ ID NO:52 shows the polypeptide sequence of putative protein V12B01--02450, which is similar to a type II secretory pathway component EpsH.
[0233]SEQ ID NO:53 shows the nucleotide sequence that encodes the putative protein V12B01--02455. SEQ ID NO:54 shows the polypeptide sequence of putative protein V12B01--02455, which is similar to a type II secretory pathway component EpsI.
[0234]SEQ ID NO:55 shows the nucleotide sequence that encodes the putative protein V12B01--02460. SEQ ID NO:56 shows the polypeptide sequence of putative protein V12B01--02460, which is similar to a type II secretory pathway component EpsJ.
[0235]SEQ ID NO:57 shows the nucleotide sequence that encodes the putative protein V12B01--02465. SEQ ID NO:58 shows the polypeptide sequence of putative protein V12B01--02465, which is similar to a type II secretory pathway component EpsK.
[0236]SEQ ID NO:59 shows the nucleotide sequence that encodes the putative protein V12B01--02470. SEQ ID NO:60 shows the polypeptide sequence of putative protein V12B01--02470, which is similar to a type II secretory pathway component EpsL.
[0237]SEQ ID NO:61 shows the nucleotide sequence that encodes the putative protein V12B01--02475. SEQ ID NO:62 shows the polypeptide sequence of putative protein V12B01--02475, which is similar to a type II secretory pathway component EpsM.
[0238]SEQ ID NO:63 shows the nucleotide sequence that encodes the putative protein V12B01--02480. SEQ ID NO:64 shows the polypeptide sequence of putative protein V12B01--02480, which is similar to a type II secretory pathway component EpsC.
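The listing of SEQ ID NOS:3-64 above follows a regular scheme: each putative Vibrio splendidus gene is assigned an odd-numbered nucleotide sequence and the following even-numbered polypeptide sequence, and the locus tag numbers advance in steps of five within each region. The following Python sketch rebuilds that lookup. It assumes the step of five holds throughout both regions, writes the locus tag separator as it appears in this text, and uses hypothetical function and constant names.

```python
PREFIX = "V12B01--"  # separator written as it appears in this text


def v12b01_seq_ids() -> dict:
    """Map each locus tag to its (nucleotide, polypeptide) SEQ ID NO pair."""
    table = {}
    seq_id = 3  # the first gene-level sequence is SEQ ID NO:3

    # Alginate utilization region: tags 24184 through 24274 in steps of 5.
    for number in range(24184, 24275, 5):
        table[f"{PREFIX}{number:05d}"] = (seq_id, seq_id + 1)
        seq_id += 2

    # Type II secretion region: tags 02425 through 02480 in steps of 5.
    for number in range(2425, 2481, 5):
        table[f"{PREFIX}{number:05d}"] = (seq_id, seq_id + 1)
        seq_id += 2

    return table


if __name__ == "__main__":
    for tag, (nucleotide, polypeptide) in v12b01_seq_ids().items():
        print(tag, nucleotide, polypeptide)
```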
[0239]As a further exemplary source of enzymes for engineering a microorganism to grow on alginate, Agrobacterium tumefaciens C58 is able to metabolize relatively small alginate molecules (~1000-mers) as a sole source of carbon and energy. Since A. tumefaciens C58 has long been used for plant biotechnology, the genetics of this organism have been relatively well studied, and many genetic tools are available and compatible with other gram-negative bacteria such as E. coli. Thus, certain aspects may employ this microbe, or the genes therein, for the production of suitable monosaccharides. For instance, as noted above, the present disclosure provides a series of novel ADH genes having both DEHU and mannuronate hydrogenase activity that were obtained from Agrobacterium tumefaciens C58 (see SEQ ID NOS: 67-92).
[0240]As noted above, certain aspects may include a recombinant microorganism or microbial system that is capable of growing on pectin as a sole source of carbon and/or energy. Pectin is a linear chain of α-(1-4)-linked D-galacturonic acid that forms the pectin-backbone, a homogalacturonan. Within this backbone, there are regions where galacturonic acid is replaced by (1-2)-linked L-rhamnose. From rhamnose, side chains of various neutral sugars typically branch off. This type of pectin is called rhamnogalacturonan I. Overall, up to about every 25th galacturonic acid in the main chain is exchanged with rhamnose. Some stretches consist of alternating galacturonic acid and rhamnose ("hairy regions"), while others have a lower density of rhamnose ("smooth regions"). The neutral sugars mainly comprise D-galactose, L-arabinose and D-xylose; the types and proportions of neutral sugars vary with the origin of pectin. In nature, around 80% of carboxyl groups of galacturonic acid are esterified with methanol. Some plants, like sugar-beet, potatoes and pears, contain pectins with acetylated galacturonic acid in addition to methyl esters. Acetylation prevents gel-formation but increases the stabilising and emulsifying effects of pectin. Certain pectin degradation and metabolic pathways are exemplified in FIG. 3.
[0241]In addition to the genes, enzymes, and biological pathways described above, certain recombinant microorganisms may incorporate features that are useful for growth on pectin as a sole source of carbon. For instance, to degrade and metabolize pectin as a sole source of carbon, pectin methyl and acetyl esterases first catalyze the hydrolysis of methyl and acetyl esters on pectin. Examples of pectin methyl esterases include, but are not limited to, pemA and pmeB. Examples of pectin acetyl esterases include, but are not limited to, PaeX and PaeY. Further examples of pectin methyl esterases that may be utilized herein include enzymes or polypeptides sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the pectate methyl esterases in FIG. 40. Further examples of pectate acetyl esterases that may be utilized herein include enzymes or polypeptides sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the pectate acetyl esterases described in FIG. 41.
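The sequence identity thresholds recited above (60%, 70%, 80%, 90%, 95%, 98%) can be illustrated with a minimal sketch. It assumes that the two polypeptide sequences have already been aligned by some external tool and that identity is counted only over columns in which neither sequence has a gap; other definitions of percent identity exist and may give different values. The function names and the toy sequences are hypothetical and are not taken from FIG. 40 or FIG. 41.

```python
# Hypothetical helper: percent identity over a precomputed pairwise alignment.
# Assumption: gap columns ("-") are excluded from the comparison.
THRESHOLDS = (60, 70, 80, 90, 95, 98)  # percentages recited in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over ungapped columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def thresholds_met(identity: float) -> list:
    """List the recited thresholds that a given identity value satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Toy aligned sequences (illustrative only).
    identity = percent_identity("MKTA-IAKQR", "MKTAYLAKQR")
    print(round(identity, 1), thresholds_met(identity))
```

In this toy case, nine ungapped columns are compared and eight match, so the candidate would satisfy the 60%, 70%, and 80% thresholds but not the 90% threshold.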
[0242]Further to this end, pectate lyases and hydrolases may catalyze the endolytic cleavage of pectate via β-elimination and hydrolysis, respectively, to produce oligopectates. Examples of pectate lyases include, but are not limited to, PelA, PelB, PelC, PelD, PelE, Pelf, PelI, PelL, and PelZ. Examples of pectate hydrolases include, but are not limited to, PehA, PehN, PehV, PehW, and PehX. Further examples of pectate lyases include polypeptides or enzymes sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the pectate lyases described in FIG. 38.
[0243]Polygalacturonases, rhamnogalacturonan lyases, and rhamnogalacturonan hydrolases may also be utilized herein to degrade and metabolize pectin. Examples of rhamnogalacturonan lyases include polypeptides or enzymes sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the rhamnogalacturonan lyases (i.e., rhamnogalacturonases) described in FIG. 39A. Examples of rhamnogalacturonate hydrolases include polypeptides or enzymes sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the rhamnogalacturonate hydrolases described in FIG. 39B.
[0244]Thus, to degrade and metabolize pectin, certain of the recombinant microorganisms and methods of the present invention may incorporate one or more of the above noted methyl and acetyl esterases, lyases, and/or hydrolases, among others known in the art. These enzymes may be encoded and expressed by endogenous or exogenous genes, and may also include biologically active fragments or variants thereof, such as homologs, orthologs, and/or optimized variants of these enzymes.
[0245]To further metabolize the degradation products of pectin, oligopectates may be transported into the periplasm of gram-negative bacteria by outer membrane porins, where they are further degraded into such components as di- and tri-galacturonates. Examples of outer membrane porins that can transport oligopectates into the periplasm include, but are not limited to, kdgN and kdgM. Certain recombinant microorganisms may incorporate these or similar genes.
[0246]Di- and tri-galacturonates may then be transported into the cytosol for further degradation. Bacteria contain at least two different transporter systems responsible for di- and tri-galacturonate transport, including a symporter and an ABC transporter (e.g., TogT and TogMNAB, respectively). Thus, certain of the recombinant microorganisms provided herein may comprise one or more di- or tri-galacturonate transporter systems, such as TogT and/or TogMNAB.
[0247]Once di- and trigalacturonate are incorporated into the cytosol, short pectate or galacturonate lyases break them down to D-galacturonate and (4S)-4,6-dihydroxy-2,5-dioxohexuronate. Examples of short pectate or galacturonate lyases include, but are not limited to, PelW and Ogl, which genes may be either endogenously or exogenously incorporated into certain recombinant microorganisms provided herein. D-galacturonate and (4S)-4,6-dihydroxy-2,5-dioxohexuronate are then converted to 5-dehydro-4-deoxy-D-glucuronate and further to KDG, which steps may be catalyzed by KduI and KduD, respectively. The KduI enzyme has an isomerase activity, and the KduD enzyme has a dehydrogenase activity, such as a 2-deoxy-D-gluconate 3-dehydrogenase activity. Accordingly, certain recombinant microorganisms provided herein may comprise one or more short pectate or galacturonate lyases, such as PelW and/or Ogl, and may optionally comprise one or more isomerases, such as KduI, as well as one or more dehydrogenases, such as KduD, to convert di- and trigalacturonates into a suitable monosaccharide, such as KDG.
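The degradation route from pectin to KDG described in the preceding paragraphs can be summarized as an ordered list of steps. The following is only an illustrative sketch: the data structure and function names are hypothetical, the intermediate label "pectate" for de-esterified pectin is an assumption inferred from the text, and the periplasmic degradation step carries no enzyme names because none are given above.

```python
from typing import NamedTuple, Tuple


class Step(NamedTuple):
    stage: str
    enzymes: Tuple[str, ...]  # example enzymes or transporters named in the text
    substrate: str
    product: str


# Ordered summary of the steps described in the text (illustrative only).
PECTIN_TO_KDG = [
    Step("de-esterification", ("pectin methyl esterase", "pectin acetyl esterase"),
         "pectin", "pectate"),  # "pectate" label is an assumption
    Step("endolytic cleavage", ("pectate lyase", "pectate hydrolase"),
         "pectate", "oligopectates"),
    Step("porin uptake", ("kdgM", "kdgN"),
         "oligopectates", "oligopectates (periplasm)"),
    Step("periplasmic degradation", (),  # enzymes not named in the text
         "oligopectates (periplasm)", "di-/tri-galacturonates (periplasm)"),
    Step("cytosolic uptake", ("TogT", "TogMNAB"),
         "di-/tri-galacturonates (periplasm)", "di-/tri-galacturonates (cytosol)"),
    Step("short-oligomer cleavage", ("PelW", "Ogl"),
         "di-/tri-galacturonates (cytosol)",
         "D-galacturonate + (4S)-4,6-dihydroxy-2,5-dioxohexuronate"),
    Step("isomerization", ("KduI",),
         "D-galacturonate + (4S)-4,6-dihydroxy-2,5-dioxohexuronate",
         "5-dehydro-4-deoxy-D-glucuronate"),
    Step("dehydrogenase step", ("KduD",),
         "5-dehydro-4-deoxy-D-glucuronate", "KDG"),
]


def is_contiguous(steps) -> bool:
    """True if each step's product is the substrate of the following step."""
    return all(a.product == b.substrate for a, b in zip(steps, steps[1:]))


def named_components(steps) -> list:
    """Flatten the enzyme and transporter names in pathway order."""
    return [name for step in steps for name in step.enzymes]


if __name__ == "__main__":
    for step in PECTIN_TO_KDG:
        print(f"{step.stage}: {step.substrate} -> {step.product}")
```

A listing of this kind may help in checking that a candidate gene set covers every step between the polysaccharide and the suitable monosaccharide.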
[0248]In certain aspects, a recombinant microorganism, such as E. coli, that is able to grow on pectin or tri-galacturonate as a sole source of carbon and/or energy may comprise one or more of the gene sequences contained within SEQ ID NOS:65 and 66, including biologically active fragments or variants thereof, such as optimized variants. SEQ ID NO:65 shows the nucleotide sequence of the kdgF-PaeX region from Erwinia carotovora subsp. atroseptica SCRI1043. SEQ ID NO:66 shows the nucleotide sequence of ogl-kdgR from Erwinia carotovora subsp. atroseptica SCRI1043.
[0249]In certain aspects, a recombinant microorganism, such as E. coli, that is able to grow on pectin or tri-galacturonate as a sole source of carbon and/or energy may comprise one or more genomic regions of Erwinia chrysanthemi, comprising several genes (kdgF, kduI, kduD, pelW, togM, togN, togA, togB, kdgM, paeX, ogl, and kdgR) encoding enzymes (kduI, kduD, ogl, pelW, and paeX), transporters (togM, togN, togA, togB, and kdgM), and regulatory proteins (kdgR) responsible for degradation of di- and trigalacturonate, as well as several genes (pelA, pelE, paeY, and pem) encoding pectate lyases (pelA and pelE), pectin acetylesterases (paeY), and pectin methylesterase (pem) (see Example 2).
[0250]Additional examples of isomerases that may be utilized herein include glucuronate isomerases, such as those in the family uxaC, as well as 4-deoxy-L-threo-5-hexulose uronate isomerases, such as those in the family KduI. Additional examples of reductases that may be utilized herein include tagaturonate reductases, such as those in the family uxaB. Additional examples of dehydratases that may be utilized herein include altronate dehydratases, such as those in the family uxaA. Additional examples of dehydrogenases that may be utilized herein include 2-deoxy-D-gluconate 3-dehydrogenases, such as those in the family kduD.
[0251]Certain aspects may also utilize recombinant microorganisms engineered to enhance the efficiency of the KDG degradation pathway. For instance, in bacteria, KDG is a common metabolic intermediate in the degradation of hexuronates such as D-glucuronate and D-galacturonate and enters the Entner-Doudoroff pathway, where it is converted to pyruvate and glyceraldehyde-3-phosphate (G3P). In this pathway, KDG is first phosphorylated by KDG kinase (KdgK), and the product is then cleaved into pyruvate and G3P by 2-keto-3-deoxy-D-6-phosphate-gluconate (KDPG) aldolase (KdgA). The expression of these enzymes, concurrently with KDG permease (e.g., KdgT), is negatively regulated by KdgR and is nearly absent at the basal level. The expression is dramatically (3-5-fold) induced upon the addition of hexuronates, and a similar result has been reported in Pseudomonas grown on alginate. Hence, to increase the conversion of KDG to pyruvate and G3P, the negative regulator KdgR may be removed. To further improve the pathway efficiency, exogenous copies of KdgK and KdgA may also be incorporated into a given recombinant microorganism.
[0252]In certain aspects, a recombinant microorganism that is able to grow on a polysaccharide (e.g., alginate, pectin, etc.) as a sole source of carbon may be capable of producing an increased amount of a given commodity chemical (e.g., ethanol) while growing on that polysaccharide. For example, E. coli engineered to grow on alginate may be engineered to produce an increased amount of ethanol from alginate as compared to E. coli that is not engineered to grow on alginate (see Example 11). Thus, certain aspects include a recombinant microorganism that is capable of growing on alginate or pectin as a sole source of carbon, and that is capable of producing an increased amount of ethanol, such as by comprising one or more genes encoding and expressing a pyruvate decarboxylase (pdc) and/or an alcohol dehydrogenase gene, including functional variants thereof. In certain aspects, such a recombinant microorganism may comprise a pyruvate decarboxylase (pdc) and two alcohol dehydrogenases (adhA and adhB) obtained from Zymomonas mobilis.
[0253]Embodiments of the present invention also include methods for converting a polysaccharide to a suitable monosaccharide, comprising (a) obtaining a polysaccharide; (b) contacting the polysaccharide with a chemical catalysis or enzymatic pathway, thereby converting the polysaccharide to a first monosaccharide or oligosaccharide; and (c) contacting the first monosaccharide or oligosaccharide with a microbial system for a time sufficient to convert the first monosaccharide or oligosaccharide to the suitable monosaccharide, wherein the microbial system comprises (i) at least one gene encoding and expressing an enzyme selected from a monosaccharide transporter, a disaccharide transporter, a trisaccharide transporter, an oligosaccharide transporter, and a polysaccharide transporter; and (ii) at least one gene encoding and expressing an enzyme selected from a monosaccharide dehydrogenase, an isomerase, a dehydratase, a kinase, and an aldolase, thereby converting the polysaccharide to a suitable monosaccharide.
[0254]In certain aspects of the present invention, aquatic or marine-biomass polysaccharides such as alginate may be chemically degraded using chemical catalysts such as acids. Similarly, biomass-derived pectin may be chemically degraded. For instance, the reaction catalyzed by chemical catalysts is typically through hydrolysis, as opposed to the β-elimination type of reactions catalyzed by enzymatic catalysts. Thus, certain embodiments may include boiling alginate or pectin with strong mineral acids to liberate carbon dioxide from D-mannuronate, thereby forming D-lyxose, a common sugar metabolite utilized by many microorganisms. Such embodiments may use, for example, formate, hydrochloric acid, sulfuric acid, in addition to other suitable acids known in the art as chemical catalysts.
[0255]An enzymatic pathway may utilize one or more enzymes described herein that are capable of catalyzing the degradation of polysaccharides, such as alginate or pectin.
[0256]Other embodiments may use variations of chemical catalysis similar to those described herein or known to a person skilled in the art, including improved or redesigned methods of chemical catalysis suitable for use with biomass related polysaccharides. Certain embodiments include those wherein the resulting monosaccharide uronate is D-mannuronate.
[0257]As noted above, the suitable monosaccharides or suitable oligosaccharides produced by the recombinant microorganisms and microbial systems of the present invention may be utilized as a feedstock in the production of commodity chemicals, such as biofuels, as well as commodity chemical intermediates. Thus, certain embodiments of the present invention relate generally to methods for converting a suitable monosaccharide or oligosaccharide to a commodity chemical, such as a biofuel, comprising (a) obtaining a suitable monosaccharide or oligosaccharide; and (b) contacting the suitable monosaccharide or oligosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide to the biofuel, thereby converting the suitable monosaccharide to the biofuel.
[0258]Certain aspects include methods for converting a suitable monosaccharide to a first commodity chemical such as a biofuel, comprising (a) obtaining a suitable monosaccharide; and (b) contacting the suitable monosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide to the first commodity chemical, wherein the microbial system comprises one or more genes encoding an aldehyde or ketone biosynthesis pathway, thereby converting the suitable monosaccharide to the first commodity chemical.
[0259]In these and other related aspects, depending on the particular ketone or aldehyde biosynthesis pathway employed, the first commodity chemical may be further enzymatically and/or chemically reduced and dehydrated to a second commodity chemical. Examples of such second commodity chemicals include, but are not limited to, butene or butane; 1-phenylbutene or 1-phenylbutane; pentene or pentane; 2-methylpentene or 2-methylpentane; 1-phenylpentene or 1-phenylpentane; 1-phenyl-4-methylpentene or 1-phenyl-4-methylpentane; hexene or hexane; 2-methylhexene or 2-methylhexane; 3-methylhexene or 3-methylhexane; 2,5-dimethylhexene or 2,5-dimethylhexane; 1-phenylhexene or 1-phenylhexane; 1-phenyl-4-methylhexene or 1-phenyl-4-methylhexane; 1-phenyl-5-methylhexene or 1-phenyl-5-methylhexane; heptene or heptane; 2-methylheptene or 2-methylheptane; 3-methylheptene or 3-methylheptane; 2,6-dimethylheptene or 2,6-dimethylheptane; 3,6-dimethylheptene or 3,6-dimethylheptane; 3-methyloctene or 3-methyloctane; 2-methyloctene or 2-methyloctane; 2,6-dimethyloctene or 2,6-dimethyloctane; 2,7-dimethyloctene or 2,7-dimethyloctane; 3,6-dimethyloctene or 3,6-dimethyloctane; and cyclopentane or cyclopentene.
[0260]Certain embodiments of the present invention may also include methods for converting a suitable monosaccharide or oligosaccharide to a commodity chemical comprising (a) obtaining a suitable monosaccharide or oligosaccharide; and (b) contacting the suitable monosaccharide or oligosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide or oligosaccharide to the commodity chemical, wherein the microbial system comprises: (i) one or more genes encoding a biosynthesis pathway; (ii) one or more genes encoding and expressing a C--C ligation pathway; and (iii) one or more genes encoding and expressing a reduction and dehydration pathway, comprising a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase, thereby converting the suitable monosaccharide or oligosaccharide to the commodity chemical.
[0261]Certain aspects also include recombinant microorganisms that comprise (i) one or more genes encoding a biosynthesis pathway; (ii) one or more genes encoding and expressing a C--C ligation pathway; and (iii) one or more genes encoding and expressing a reduction and dehydration pathway, comprising a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase. Certain aspects also include recombinant microorganisms that comprise the above pathways individually or in certain combinations, such as a recombinant microorganism that comprises one or more genes encoding a biosynthesis pathway, as described herein. Certain aspects may also include recombinant microorganisms that comprise one or more genes encoding and expressing a C--C ligation pathway, as described herein. Certain aspects may also include recombinant microorganisms that comprise one or more genes encoding and expressing a reduction and dehydration pathway, comprising a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase, as described herein.
[0262]As for recombinant microorganisms that comprise combinations of the above-noted pathways, certain aspects may include recombinant microorganisms that comprise (i) one or more genes encoding a biosynthesis pathway; and (ii) one or more genes encoding and expressing a C--C ligation pathway. Certain aspects may also include recombinant microorganisms that comprise (i) one or more genes encoding and expressing a C--C ligation pathway; and (ii) one or more genes encoding and expressing a reduction and dehydration pathway, comprising a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase.
[0263]Certain aspects may also include recombinant microorganisms that comprise one or more individual components of a dehydration and reduction pathway, such as a recombinant microorganism that comprises a diol dehydrogenase, a diol dehydratase, or a secondary alcohol dehydrogenase. These and other microorganisms may be utilized, for example, to convert a suitable polysaccharide to a first commodity chemical, or an intermediate thereof, or to convert a first commodity chemical, or an intermediate thereof, to a second commodity chemical.
[0264]Merely by way of illustration, a recombinant microorganism comprising a C--C ligation pathway may be utilized to convert butanal into a first commodity chemical, or an intermediate thereof, such as 5-hydroxy-4-octanone, which can then be converted into a second commodity chemical, or intermediate thereof, by any suitable pathway. As a further example, a recombinant microorganism comprising a C--C ligation pathway and a diol dehydrogenase may be utilized for the sequential conversion of butanal into 5-hydroxy-4-octanone and then 4,5-octanediol. Examples of recombinant microorganisms that comprise these and other various combinations of the individual pathways described herein, as well as various combinations of the individual components of those pathways, will be apparent to those skilled in the art, and may also be found in the Examples.
[0265]Also included are methods of converting a polysaccharide to a first commodity chemical, or an intermediate thereof, such as by utilizing a recombinant microorganism that comprises an aldehyde or ketone biosynthesis pathway. Also included are methods of converting a first commodity chemical, or intermediate thereof, to a second commodity chemical, such as by utilizing a recombinant microorganism that optionally comprises a biosynthesis pathway, optionally comprises a C--C ligation pathway and/or optionally comprises one or more of the individual components of a dehydration and reduction pathway. Merely by way of illustration, a recombinant microorganism comprising an exogenous C--C ligase (e.g., benzaldehyde lyase from Pseudomonas fluorescens) could be utilized in a method to convert a first commodity chemical such as 3-methylbutanal to a second commodity chemical such as 2,7-dimethyl-5-hydroxy-4-octanone. Along this line of illustration, the same or different recombinant microorganism comprising a diol dehydrogenase could be utilized in a method to convert 2,7-dimethyl-5-hydroxy-4-octanone to another commodity chemical such as 2,7-dimethyl-4,5-octanediol (see Table 2 for other examples). As an additional illustrative example, a recombinant microorganism comprising an exogenous secondary alcohol dehydrogenase could be utilized in a method to convert a first commodity chemical such as 2,7-dimethyl-4-octanone to a second commodity chemical such as 2,7-dimethyloctanol.
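The illustrative conversions above can be checked for atom balance with a short sketch. It assumes that the C--C ligation joins two aldehyde molecules without loss of atoms (an acyloin-type coupling) and that the subsequent diol dehydrogenase step adds two hydrogen atoms to the ketone; the function names are hypothetical, and the molecular formulas are standard values entered by hand rather than derived from the compound names.

```python
from collections import Counter

# Molecular formulas of the starting aldehydes (standard values).
BUTANAL = Counter(C=4, H=8, O=1)
METHYLBUTANAL_3 = Counter(C=5, H=10, O=1)  # 3-methylbutanal


def ligate(aldehyde_a: Counter, aldehyde_b: Counter) -> Counter:
    """Assumed acyloin-type ligation: all atoms of both aldehydes are retained."""
    return aldehyde_a + aldehyde_b


def reduce_ketone(hydroxyketone: Counter) -> Counter:
    """Assumed reduction of the ketone to an alcohol: adds two hydrogen atoms."""
    return hydroxyketone + Counter(H=2)


def formula(atoms: Counter) -> str:
    """Render a Counter of atoms as a compact formula string (C, H, O order)."""
    return "".join(f"{element}{atoms[element]}" for element in ("C", "H", "O"))


if __name__ == "__main__":
    for name, aldehyde in (("butanal", BUTANAL), ("3-methylbutanal", METHYLBUTANAL_3)):
        hydroxyketone = ligate(aldehyde, aldehyde)
        diol = reduce_ketone(hydroxyketone)
        print(name, formula(hydroxyketone), formula(diol))
```

Under these assumptions, two molecules of butanal give a C8 hydroxyketone and a C8 diol, consistent with 5-hydroxy-4-octanone and 4,5-octanediol, and two molecules of 3-methylbutanal give C10 products, consistent with the 2,7-dimethyl-substituted octane skeleton.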
[0266]Embodiments of a microbial system or isolated microorganism of the present application may include a naturally-occurring biosynthesis pathway, and/or an engineered, reconstructed, or re-designed biosynthesis pathway that has been optimized for improved functionality.
[0267]Embodiments of a microbial system or recombinant microorganism of the present invention may include a natural or reconstructed biosynthesis pathway, such as a butyraldehyde biosynthesis pathway, as found in such microorganisms as Clostridium acetobutylicum and Streptomyces coelicolor. By way of explanation, butyrate and butanol are common fermentation products of certain bacterial species such as Clostridia, in which the production of butyrate and butanol is mediated by a synthetic thiolase-dependent pathway that is characteristically similar to the fatty acid degradation pathway. Such pathways may be initiated with the condensation of two molecules of acetyl-CoA to acetoacetyl-CoA, which is catalyzed by thiolase. Acetoacetyl-CoA is then reduced to β-hydroxy butyryl-CoA, which is catalyzed by NAD(P)H-dependent β-hydroxy butyryl-CoA dehydrogenase (HBDH). Crotonase catalyzes the dehydration of β-hydroxy butyryl-CoA to form crotonyl-CoA. Further reduction catalyzed by NADH-dependent butyryl-CoA dehydrogenase (BCDH) saturates the double bond at C2 of crotonyl-CoA to form butyryl-CoA.
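The four steps from acetyl-CoA to butyryl-CoA described above can be tabulated to tally the reducing equivalents consumed. This is a minimal sketch: the tuple layout and function names are hypothetical, and only the cofactors stated in the text are recorded.

```python
from collections import Counter

# (enzyme, substrate, product, reduced cofactor consumed or None), per the text.
BUTYRYL_COA_STEPS = [
    ("thiolase", "2 acetyl-CoA", "acetoacetyl-CoA", None),
    ("HBDH", "acetoacetyl-CoA", "beta-hydroxybutyryl-CoA", "NAD(P)H"),
    ("crotonase", "beta-hydroxybutyryl-CoA", "crotonyl-CoA", None),
    ("BCDH", "crotonyl-CoA", "butyryl-CoA", "NADH"),
]


def cofactor_tally(steps) -> Counter:
    """Count reduced cofactors consumed along the pathway."""
    return Counter(cofactor for _, _, _, cofactor in steps if cofactor)


def route(steps) -> str:
    """Render the pathway as a chain of intermediates."""
    return " -> ".join([steps[0][1]] + [product for _, _, product, _ in steps])


if __name__ == "__main__":
    print(route(BUTYRYL_COA_STEPS))
    print(dict(cofactor_tally(BUTYRYL_COA_STEPS)))
```

On this accounting, two reducing equivalents are consumed per butyryl-CoA formed from two acetyl-CoA, one at the HBDH step and one at the BCDH step.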
[0268]In certain embodiments, thiolase, the first enzyme in this pathway, may be overexpressed to maximize production. In certain embodiments, thiolase may be over-expressed in E. coli. In this regard, all three enzymes (i.e., HBDH, crotonase, and BCDH) catalyzing the following reaction steps are found in Clostridium acetobutylicum ATCC824. In certain embodiments, HBDH, crotonase, and BCDH may be expressed or over-expressed in a suitable microorganism such as E. coli. Alternatively, a short-chain aliphatic acyl-CoA dehydrogenase derived from Pseudomonas putida KT2440 may be utilized in other embodiments of a microbial system or isolated microorganism of the present application.
[0269]Further to this end, butyryl-CoA in Clostridia may be readily converted to butanol and/or butyrate by at least a few different pathways. In one pathway, butyryl-CoA is directly reduced to butyraldehyde, a reaction catalyzed by NADH-dependent CoA-acylating aldehyde dehydrogenase (ALDH). Butyraldehyde may be further reduced to butanol by NADH-dependent butanol dehydrogenase. Although CoA-acylating ALDH catalyzes the one-step reduction of butyryl-CoA to butyraldehyde, the incorporation of CoA-acylating ALDH into the microbial system may result in acetaldehyde formation because of its promiscuous acetyl-CoA deacylating activity. In certain embodiments, the formation of acetaldehyde may be minimized by functionally redesigning the relevant enzyme(s).
[0270]In other biosynthesis pathways, butyryl-CoA is deacylated to form butyryl phosphate, a reaction catalyzed by phosphotransbutyrylase. Butyryl phosphate is then hydrolyzed by reversible butyryl phosphate kinase to form butyrate. This reaction is coupled with ATP generation from ADP. The butyrate formation through these enzymes is known to be significantly more specific. Certain embodiments may incorporate phosphotransbutyrylase and butyryl phosphate kinase into the microbial system. In other embodiments, butyrate may be directly formed from butyryl-CoA by short chain acyl-CoA thioesterase.
[0271]Butyrate in Clostridia may also be sequentially reduced to butanol, which is catalyzed by a single alcohol/aldehyde dehydrogenase. Certain embodiments may comprise short chain aldehyde dehydrogenase from other bacteria such as Pseudomonas putida to complement the production of butyraldehyde in the microbial system. One potential concern in using short chain aldehyde dehydrogenase involves the possible formation of acetaldehyde from acetate. Certain embodiments may be directed to minimizing acetate formation in the microbial system, for example, by deleting several genes encoding enzymes involved in acetate production.
[0272]Moreover, there are multiple routes in E. coli to form acetate, one of which is mediated by pyruvate oxygenase (POXB) from pyruvate, whereas another is mediated by phosphotransacetylase (PTA) and acetyl phosphate kinase (ACKA) from acetyl-CoA. Acetate production by E. coli mutant strains that are poxB⁻, pta⁻, and ackA⁻ is significantly diminished. In addition, incorporation of acetyl-CoA synthase (ACS), which catalyzes the formation of acetyl-CoA from acetate, is also known to significantly reduce the accumulation of acetate. Certain embodiments may comprise a microbial system or isolated microorganism with deleted POXB, PTA, and/or ACKA genes, and other embodiments may also comprise, separately or together with the deleted genes, one or more genes encoding and expressing ACS.
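The effect of the gene deletions described above on the acetate-forming routes can be sketched with a simple Boolean model. The sketch assumes that a route is blocked when any one of its genes is deleted and ignores residual flux, alternative enzymes, and the acetate-consuming ACS reaction; in practice, the text describes acetate production as significantly diminished rather than abolished. The dictionary and function names are hypothetical.

```python
# Acetate-forming routes in E. coli as described in the text,
# mapped to the genes each route requires (Boolean simplification).
ACETATE_ROUTES = {
    "pyruvate -> acetate": {"poxB"},
    "acetyl-CoA -> acetyl phosphate -> acetate": {"pta", "ackA"},
}


def open_routes(deleted_genes) -> list:
    """Return the routes that remain intact for a given set of deleted genes."""
    deleted = set(deleted_genes)
    return [
        name
        for name, required in ACETATE_ROUTES.items()
        if not (required & deleted)  # route is blocked if any required gene is deleted
    ]


if __name__ == "__main__":
    for deletions in ([], ["poxB"], ["poxB", "pta"], ["poxB", "pta", "ackA"]):
        print(deletions, "->", open_routes(deletions))
```

Under this simplification, deleting poxB alone leaves the PTA/ACKA route open, whereas deleting poxB together with pta and/or ackA closes both modeled routes.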
[0273]A microbial system or recombinant microorganism provided herein may also comprise a glutaraldehyde biosynthesis pathway. As one example, Saccharomyces cerevisiae has a lysine biosynthetic pathway in which acetyl-CoA is initially condensed to α-ketoglutarate, a common metabolite in the citric acid cycle, to form homocitrate. This reaction is catalyzed by homocitrate synthase derived from yeast, Thermus thermophilus, or Deinococcus radiodurans. Homoaconitase derived from yeast, Thermus thermophilus, or Deinococcus radiodurans catalyzes the conversion between homocitrate and homoisocitrate. Homoisocitrate is then oxidatively decarboxylated to form 2-ketoadipate, which is catalyzed by homoisocitrate dehydrogenase derived from yeast, Thermus thermophilus, or Deinococcus radiodurans. Homoisocitrate is also oxidatively decarboxylated to form glutaryl-CoA, which may be catalyzed by homoisocitrate dehydrogenase. Thus, certain embodiments may comprise a homocitrate synthase, a homoaconitase, and/or a homoisocitrate dehydrogenase.
[0274]Further to this end, in synthesizing 2-keto-adipicsemialdehyde, 2-ketoadipate is reduced to 2-keto-adipicsemialdehyde. This reaction can be catalyzed by dialdehyde dehydrogenase, which, for example, may be isolated from Agrobacterium tumefaciens C58. Thus, certain embodiments may incorporate dialdehyde dehydrogenases into a microbial system or recombinant microorganism.
[0275]In synthesizing glutaraldehyde, acyl-CoA thioesterases (ACOT) may also catalyze the hydrolysis of glutaryl-CoA. The genes encoding ω-carboxylic acyl-CoA specific peroxisomal ACOTs are found in many mammalian species; both ACOT4 and ACOT8 derived from mice have previously been expressed in E. coli, and both enzymes were shown to be highly active in the hydrolysis of glutaryl-CoA to form glutarate. Certain embodiments may comprise one or more acyl-CoA thioesterases.
[0276]Glutarate is sequentially reduced to glutaraldehyde. This reaction can be catalyzed by glutaraldehyde dehydrogenase (CpnE), which, for example, may be isolated from Comamonas sp. strain NCIMB 9872. Certain embodiments may incorporate glutaraldehyde dehydrogenases such as CpnE into a microbial system or isolated microorganism. Other embodiments may comprise both ACOT and CpnE enzymes. Other embodiments may comprise CpnE enzymes redesigned to catalyze the reduction of 1-hydroxy propanoate and succinate to 1-hydroxy propanal and succinicaldehyde.
[0277]In certain aspects, the biosynthesis pathway may include an aldehyde biosynthesis pathway, a ketone biosynthesis pathway, or both. In certain aspects, the biosynthesis pathway may include one or more of an acetaldehyde, propionaldehyde, butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, 4-methylpentaldehyde, phenylacetaldehyde, 2-phenyl acetaldehyde, 2-(4-hydroxyphenyl)acetaldehyde, 2-Indole-3-acetaldehyde, glutaraldehyde, 5-amino-pentaldehyde, succinate semialdehyde, and/or succinate 4-hydroxyphenyl acetaldehyde biosynthesis pathway, including various combinations thereof.
[0278]With regard to combinations of biosynthesis pathways, a biosynthesis pathway may comprise an acetaldehyde biosynthesis pathway in combination with at least one of a propionaldehyde, butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, or phenylacetaldehyde biosynthesis pathway. In certain aspects, a biosynthesis pathway may comprise a propionaldehyde biosynthesis pathway in combination with at least one of a butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, or phenylacetaldehyde biosynthesis pathway. In certain aspects, a biosynthesis pathway may comprise a butyraldehyde biosynthesis pathway in combination with at least one of an isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, or phenylacetaldehyde biosynthesis pathway. In certain aspects, a biosynthesis pathway may comprise an isobutyraldehyde biosynthesis pathway in combination with at least one of a 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, or phenylacetaldehyde biosynthesis pathway. In certain aspects, a biosynthesis pathway may comprise a 2-methyl-butyraldehyde biosynthesis pathway in combination with at least one of a 3-methyl-butyraldehyde or a phenylacetaldehyde biosynthesis pathway. In certain aspects, a biosynthesis pathway may comprise a 3-methyl-butyraldehyde biosynthesis pathway in combination with a phenylacetaldehyde biosynthesis pathway.
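The combinations enumerated above follow a regular pattern: each of seven aldehyde biosynthesis pathways is paired with at least one of the pathways listed after it. A minimal sketch of that enumeration is given below; the list order follows the text, the standard spelling of the aldehyde names is used, and the function names are hypothetical.

```python
from itertools import combinations

# The seven pathways in the order in which the text introduces them.
PATHWAYS = [
    "acetaldehyde",
    "propionaldehyde",
    "butyraldehyde",
    "isobutyraldehyde",
    "2-methyl-butyraldehyde",
    "3-methyl-butyraldehyde",
    "phenylacetaldehyde",
]


def partners(lead: str) -> list:
    """Pathways that the text allows to be combined with a given lead pathway."""
    return PATHWAYS[PATHWAYS.index(lead) + 1:]


def combination_count(lead: str) -> int:
    """Number of non-empty partner subsets ("at least one of") for a lead pathway."""
    return 2 ** len(partners(lead)) - 1


# Every unordered pair of distinct pathways is covered exactly once.
PAIRS = list(combinations(PATHWAYS, 2))

if __name__ == "__main__":
    print(len(PAIRS), "pairwise combinations")
    for lead in PATHWAYS[:-1]:
        print(lead, "+ any of", partners(lead))
```

Counting only pairs gives 21 combinations; counting every non-empty set of partners for a given lead pathway gives correspondingly more (for example, 63 for acetaldehyde).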
[0279]In certain aspects, a propionaldehyde biosynthesis pathway may comprise a threonine deaminase (ilvA) gene from an organism such as Escherichia coli and a keto-isovalerate decarboxylase (kivd) gene from an organism such as Lactococcus lactis, and/or functional variants of these enzymes, including homologs or orthologs thereof, as well as optimized variants. These enzymes may be utilized generally to convert L-threonine to propionaldehyde.
[0280]In certain aspects, a butyraldehyde biosynthesis pathway may comprise at least one of a thiolase (atoB) gene from an organism such as E. coli, a β-hydroxy butyryl-CoA dehydrogenase (hbd) gene, a crotonase (crt) gene, a butyryl-CoA dehydrogenase (bcd) gene, an electron transfer flavoprotein A (etfA) gene, and/or an electron transfer flavoprotein B (etfB) gene from an organism such as Clostridium acetobutylicum (e.g., ATCC 824), as well as a coenzyme A-linked butyraldehyde dehydrogenase (ald) gene from an organism such as Clostridium beijerinckii acetobutylicum ATCC 824. In certain aspects, a coenzyme A-linked alcohol dehydrogenase (adhE2) gene from an organism such as Clostridium acetobutylicum ATCC 824 may be used as an alternative to an ald gene.
[0281]In certain aspects, an isobutyraldehyde biosynthetic pathway may comprise an acetolactate synthase (alsS) from an organism such as Bacillus subtilis or an als gene from an organism such as Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (codon usage may be optimized for E. coli protein expression). Such a pathway may also comprise acetolactate reductoisomerase (ilvC) and/or 2,3-dihydroxyisovalerate dehydratase (ilvD) genes from an organism such as E. coli, as well as a keto-isovalerate decarboxylase (kivd) gene from an organism such as Lactococcus lactis.
[0282]In certain aspects, a 3-methylbutyraldehyde and 2-methylbutyraldehyde biosynthesis pathway may comprise an acetolactate synthase (alsS) gene from an organism such as Bacillus subtilis or an (als) gene from an organism such as Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (codon usage may be optimized for E. coli protein expression). Certain aspects of such a pathway may also comprise acetolactate reductoisomerase (ilvC), 2,3-dihydroxyisovalerate dehydratase (ilvD), isopropylmalate synthase (LeuA), isopropylmalate isomerase (LeuC and LeuD), and 3-isopropylmalate dehydrogenase (LeuB) genes from an organism such as E. coli, as well as a keto-isovalerate decarboxylase (kivd) from an organism such as Lactococcus lactis.
[0283]In certain aspects, a phenylacetaldehyde and 4-hydroxyphenylacetaldehyde biosynthesis pathway may comprise one or more of 3-deoxy-7-phosphoheptulonate synthase (aroF, aroG, and aroH), 3-dehydroquinate synthase (aroB), a 3-dehydroquinate dehydratase (aroD), dehydroshikimate reductase (aroE), shikimate kinase II (aroL), shikimate kinase I (aroK), 5-enolpyruvylshikimate-3-phosphate synthetase (aroA), chorismate synthase (aroC), fused chorismate mutase P/prephenate dehydratase (pheA), and/or fused chorismate mutase T/prephenate dehydrogenase (tyrA) genes from an organism such as E. coli, as well as a keto-isovalerate decarboxylase (kivd) from an organism such as Lactococcus lactis.
[0284]In certain aspects, such as for the ultimate production of 1,10-diamino-5-decanol and 1,10-dicarboxylic-5-decanol, a biosynthesis pathway may comprise one or more homocitrate synthase, homoaconitate hydratase, and/or homoisocitrate dehydrogenase genes from an organism such as Deinococcus radiodurans and/or Thermus thermophilus, as well as a keto-adipate decarboxylase gene, a 2-aminoadipate transaminase gene, and an L-2-aminoadipate-6-semialdehyde: NAD+ 6-oxidoreductase gene. Such a biosynthesis pathway would be able to convert α-ketoglutarate to 5-aminopentaldehyde.
[0285]In certain aspects, such as for one step in cyclopentanol production, an α-ketoadipate semialdehyde biosynthesis pathway may comprise homocitrate synthase (hcs), homoaconitate hydratase, and homoisocitrate dehydrogenase genes from an organism such as Deinococcus radiodurans and/or Thermus thermophilus, and an α-ketoadipate semialdehyde dehydrogenase gene. Such a biosynthesis pathway would be able to convert acetyl-CoA and α-ketoglutarate to α-ketoadipate semialdehyde.
[0286]For the production of certain commodity chemicals, such as 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol, among other similar chemicals, a biosynthesis pathway (e.g., aldehyde biosynthesis pathway) may optionally or further comprise one or more genes encoding a decarboxylase enzyme, such as an indole-3-pyruvate decarboxylase (IPDC). An IPDC may be obtained, for example, from such microorganisms as Azospirillum brasilense and Paenibacillus polymyxa E681. In this regard, an IPDC may be utilized to more efficiently catalyze the decarboxylation of various carboxylic acids to form the corresponding aldehyde, which can be further converted to a commodity chemical by a reductase or dehydrogenase, as detailed herein.
[0287]In certain aspects, a 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and 2-(indole-3-)ethanol biosynthesis pathway may comprise a transketolase (tktA), a 3-deoxy-7-phosphoheptulonate synthase (aroF, aroG, and aroH), a 3-dehydroquinate synthase (aroB), a 3-dehydroquinate dehydratase (aroD), a dehydroshikimate reductase (aroE), a shikimate kinase II (aroL), a shikimate kinase I (aroK), a 5-enolpyruvylshikimate-3-phosphate synthetase (aroA), a chorismate synthase (aroC), a fused chorismate mutase P/prephenate dehydratase (pheA), and a fused chorismate mutase T/prephenate dehydrogenase (tyrA) gene from E. coli, a keto-isovalerate decarboxylase (kivd) from Lactococcus lactis, an alcohol dehydrogenase (adh2) from Saccharomyces cerevisiae, an indole-3-pyruvate decarboxylase (ipdc) from Azospirillum brasilense, a phenylethanol reductase (par) from Rhodococcus sp. ST-10, and a benzaldehyde lyase (bal) from Pseudomonas fluorescens.
[0288]As for all other pathways described herein, the components for each of the biosynthesis pathways described herein may be present in a recombinant microorganism either endogenously or exogenously. To improve the efficiency of a given biosynthesis pathway, endogenous genes, for example, may be up-regulated or over-expressed, such as by introducing an additional (i.e., exogenous) copy of that endogenous gene into the recombinant microorganism. Such pathways may also be optimized by altering via mutagenesis the endogenous version of a gene to improve functionality, followed by introduction of the altered gene into the microorganism. The expression of endogenous genes may be up or down-regulated, or even eliminated, according to known techniques in the art and described herein. Similarly, the expression levels of exogenously provided genes may be regulated as desired, such as by using various constitutive or inducible promoters. Such genes may also be "codon-optimized," as described herein and known in the art. Also included are functional naturally-occurring variants of the genes and enzymes described herein, including homologs or orthologs thereof.
[0289]Certain embodiments of a microbial system or isolated microorganism may comprise a CC-ligation pathway. In certain aspects, a CC-ligation pathway may comprise a ThDP-dependent enzyme, such as a C--C ligase, or an optimized C--C ligase. For example, eight-carbon unit molecules (butyroins) may be made by condensing together two four-carbon unit molecules (butyraldehydes). ThDP-dependent enzymes are a group of enzymes known to catalyze both the breaking and the formation of C--C bonds and have been utilized as catalysts in chemoenzymatic syntheses. The spectrum of chemical reactions that these enzymes catalyze ranges from decarboxylation of α-keto acids, oxidative decarboxylation, and carboligation to the cleavage of C--C bonds.
[0290]To provide a few examples, benzaldehyde lyase (BAL) from Pseudomonas fluorescens, benzoylformate decarboxylase (BFD) from Pseudomonas putida, and pyruvate decarboxylase (PDC) from Zymomonas mobilis may catalyze a carboligation reaction between two aldehydes. Among these three enzymes, BAL accepts the broadest spectrum of aldehydes as substrates, ranging from substituted benzaldehyde to acetoaldehyde, among others, as shown herein. BAL catalyzes a stereospecific carboligation reaction between two aldehydes and forms α-hydroxy ketones with over 99% ee for the R-configuration. The benzoin formation from two benzaldehyde molecules is a favored reaction catalyzed by BAL and proceeds as fast as 320 μmol (benzoin) per mg (protein) per min. The formation of α-hydroxy ketones may be carried out using many different aldehydes, including butyraldehyde.
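The carbon bookkeeping behind these carboligation examples can be sketched as follows. This is an illustrative sketch only, not part of the disclosed method: the function name and the lookup table are hypothetical, and the carbon counts are standard values for the named aldehydes (the table uses the conventional spelling "acetaldehyde"). Because carboligation joins the two aldehydes without loss of carbon, the α-hydroxy ketone product carries the sum of the carbons of its two substrates.

```python
# Carbon counts of a few aldehydes named in the text (standard chemistry).
# The table and function names are hypothetical, for illustration only.
ALDEHYDE_CARBONS = {
    "acetaldehyde": 2,
    "propionaldehyde": 3,
    "butyraldehyde": 4,
    "benzaldehyde": 7,
}


def carboligation_product_carbons(first: str, second: str) -> int:
    """Return the carbon count of the alpha-hydroxy ketone formed from two aldehydes.

    Carboligation forms a new C--C bond between the two substrates,
    so no carbon is lost and the counts simply add.
    """
    return ALDEHYDE_CARBONS[first] + ALDEHYDE_CARBONS[second]


if __name__ == "__main__":
    # Self-ligation of each aldehyde, e.g. butyraldehyde (C4) to butyroin (C8).
    for name in ALDEHYDE_CARBONS:
        total = carboligation_product_carbons(name, name)
        print(f"2 x {name} -> C{total} alpha-hydroxy ketone")
```

Under these assumptions, two butyraldehyde molecules give the eight-carbon butyroin and two benzaldehyde molecules give the fourteen-carbon benzoin, in line with the examples above.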
[0291]BFD and PDC may also catalyze the carboligation reactions between two aldehyde molecules. BFD and PDC accept relatively larger and smaller aldehyde molecules, respectively. In the presence of benzaldehyde and acetoaldehyde, BFD catalyzes the formation of benzoin and (S)-α-hydroxy phenylpropanone (2S-HPP), whereas PDC catalyzes the formation of (R)-α-hydroxy phenylpropanone (2R-HPP) and (R)-α-hydroxy 2-butanone (acetoin). As detailed below, certain microbial systems or isolated microorganisms of the present application may comprise natural or optimized C--C ligases (ThDP-dependent enzymes) selected from benzaldehyde lyase (BAL) from Pseudomonas fluorescens, benzoylformate decarboxylase (BFD) from Pseudomonas putida, and pyruvate decarboxylase (PDC) from Zymomonas mobilis. Other embodiments may comprise a benzaldehyde lyase (BAL) from Pseudomonas fluorescens (see SEQ ID NOS:143-144, showing the nucleotide and polypeptide sequences, respectively) including biologically active variants thereof, such as optimized variants.
[0292]A C--C ligation pathway of the present invention typically comprises one or more C--C ligases, such as a lyase enzyme. Exemplary lyases include, but are not limited to, acetoaldehyde lyases, propionaldehyde lyases, butyraldehyde lyases, isobutyraldehyde lyases, 2-methyl-butyraldehyde lyases, 3-methyl-butyraldehyde (isovaleraldehyde) lyases, phenylacetaldehyde lyases, α-keto adipate carboxylyases, pentaldehyde lyases, 4-methyl-pentaldehyde lyases, hexaldehyde lyases, heptaldehyde lyases, octaldehyde lyases, 4-hydroxyphenylacetaldehyde lyases, indoleacetaldehyde lyases, and indolephenylacetaldehyde lyases. In certain aspects, a selected CC-ligase or lyase enzyme may have one or more of the above exemplified lyase activities, such as an acetoaldehyde lyase activity, a propionaldehyde lyase activity, a butyraldehyde lyase activity, and/or an isobutyraldehyde lyase activity, among others.
[0293]As noted above, a C--C ligase may comprise a benzaldehyde lyase, such as a benzaldehyde lyase isolated from Pseudomonas fluorescens (SEQ ID NOS:143-144), as well as biologically active fragments or variants of this reference sequence, such as optimized variants of a benzaldehyde lyase. In this regard, certain aspects may comprise nucleotide sequences or polypeptide sequences having 80%, 85%, 90%, 95%, 97%, 98%, 99% sequence identity to SEQ ID NOS:143-144, and which are capable of catalyzing a carboligation reaction, or which possess C--C lyase activity, as described herein. In certain aspects, a BAL enzyme will comprise one or more conserved amino acid residues, including G27, E50, A57, G155, P162, P234, D271, G277, G422, G447, D448, and/or G512.
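As a rough illustration of how a candidate variant might be screened against the recited identity levels, the sketch below computes percent identity over two pre-aligned sequences. It is a simplified sketch under stated assumptions: the function names are hypothetical, the sequences must already be aligned (with "-" as the gap character), and identity is taken over all alignment columns, which is only one of several conventions and may differ from the definition of sequence identity given elsewhere herein.

```python
# Identity levels recited in the text, in percent.
THRESHOLDS = (80, 85, 90, 95, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = matching columns / all alignment columns,
    ignoring columns in which both sequences carry a gap ('-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list:
    """Return the recited identity levels that the given value reaches."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Toy peptides, not taken from any SEQ ID NO in this application.
    reference = "MKTAYIAKQR"
    variant = "MKTAYIAKQK"
    identity = percent_identity(reference, variant)
    print(identity, thresholds_met(identity))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch covers only the final counting step.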
[0294]Pseudomonas fluorescens is able to grow on R-benzoin as the sole carbon and energy source because it harbours the enzyme benzaldehyde lyase that cleaves the acyloin linkage using thiamine diphosphate (ThDP) as a cofactor. In the reverse reaction, as utilized herein, benzaldehyde lyase catalyses the carboligation of two aldehydes with high substrate and stereospecificity. Structure-based comparisons with other proteins show that benzaldehyde lyase belongs to a group of closely related ThDP-dependent enzymes. The ThDP cofactors of these enzymes are fixed at their two ends in separate domains, suspending a comparatively mobile thiazolium ring between them. While the residues binding the two ends of ThDP are well conserved, the lining of the active centre pocket around the thiazolium moiety varies greatly within the group. The active sites for BAL have been described, for example, in Kneen et al (Biochimica et Biophysica Acta 1753:263-271, 2005) and Brandt et al. (Biochemistry 47:7734-43, 2008). Benzaldehyde lyase derived from Pseudomonas fluorescens has been demonstrated herein to at least have an acetoaldehyde lyase activity, a propionaldehyde lyase activity, a butyraldehyde lyase activity, a 3-methyl-butyraldehyde lyase activity, a pentaldehyde lyase activity, a 4-methylpentaldehyde lyase activity, a hexaldehyde lyase activity, a phenylacetoaldehyde lyase activity, and an octaldehyde lyase activity (see Table 2), among other in vivo lyase activities (see FIGS. 48-55).
[0295]In certain aspects, a C--C ligase, such as BAL derived from Pseudomonas fluorescens, BFD derived from Pseudomonas putida, or PDC derived from Zymomonas mobilis may comprise a lyase with a combination of lyase activities, such as a lyase having both a propionaldehyde lyase activity and a 3-methyl-butyraldehyde lyase activity, among other combinations and activities, such as those exemplary combinations detailed herein. Merely by way of illustration, a lyase having a combination of lyase activities may be referred to herein as a propionaldehyde/3-methyl-butyraldehyde lyase.
[0296]A dehydration and reduction pathway, comprising a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase, may be utilized to further convert an aldehyde, ketone, or corresponding alcohol, to a commodity chemical, such as a biofuel.
[0297]To this end, a dehydration and reduction pathway may comprise one or more diol dehydrogenases. A "diol dehydrogenase" refers generally to an enzyme that catalyzes the reversible reduction and oxidation of an α-hydroxy ketone and/or its corresponding diol. Certain embodiments of a microbial system or isolated microorganism may comprise genes encoding a diol dehydrogenase that specifically catalyzes the reduction of α-hydroxy-ketones, including, for example, a 4,5-octanediol dehydrogenase. Diol dehydrogenases, such as 4,5-octanediol dehydrogenase, may be isolated from a variety of organisms and incorporated into a microbial system or isolated microorganism. A particular group of alcohol dehydrogenases has a characteristic ability to oxidize various α-hydroxy alcohols and reduce various α-hydroxy ketones and α-keto ketones. As such, the recitation "diol dehydrogenase" may also encompass such alcohol dehydrogenases.
[0298]By way of example regarding diol dehydrogenases from exemplary organisms, glycerol dehydrogenase isolated from Hansenula ofunaensis has broad substrate specificity and is capable of catalyzing the oxidation of various α-hydroxy alcohols, including 1,2-octanediol, as well as the reduction of various α-hydroxy ketones and α-keto ketones, including 3-hydroxy-2-butanone and 3,4-hexanedione, with activity comparable to that toward its native substrates, glycerol and dihydroxyacetone, respectively (40-200%). As one further example, glycerol dehydrogenase discovered in Hansenula polymorpha DI-1 works similarly. In certain embodiments, a microbial system or recombinant microorganism may comprise a glycerol dehydrogenase gene isolated from Hansenula ofunaensis, a glycerol dehydrogenase isolated from Hansenula polymorpha DI-1 and/or a meso-2,3-butanediol dehydrogenase from Klebsiella pneumoniae. In other embodiments, a microbial system or isolated microorganism may comprise a 4,5-octanediol dehydrogenase, among others detailed herein. Diol dehydrogenases may also be obtained from Lactobacillus brevis ATCC 367, Pseudomonas putida KT2440, and Klebsiella pneumoniae MGH78578, as described herein (see Example 5).
[0299]Exemplary diol dehydrogenases include, but are not limited to, 2,3-butanediol dehydrogenase, 3,4-hexanediol dehydrogenase, 4,5-octanediol dehydrogenase, 5,6-decanediol dehydrogenase, 6,7-dodecanediol dehydrogenase, 7,8-tetradecanediol dehydrogenase, 8,9-hexadecanediol dehydrogenase, 2,5-dimethyl-3,4-hexanediol dehydrogenase, 3,6-dimethyl-4,5-octanediol dehydrogenase, 2,7-dimethyl-4,5-octanediol dehydrogenase, 2,9-dimethyl-5,6-decanediol dehydrogenase, 1,4-diphenyl-2,3-butanediol dehydrogenase, bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1,4-diindole-2,3-butanediol dehydrogenase, 1,2-cyclopentanediol dehydrogenase, 2,3-pentanediol dehydrogenase, 2,3-hexanediol dehydrogenase, 2,3-heptanediol dehydrogenase, 2,3-octanediol dehydrogenase, 2,3-nonanediol dehydrogenase, 4-methyl-2,3-pentanediol dehydrogenase, 4-methyl-2,3-hexanediol dehydrogenase, 5-methyl-2,3-hexanediol dehydrogenase, 6-methyl-2,3-heptanediol dehydrogenase, 1-phenyl-2,3-butanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1-indole-2,3-butanediol dehydrogenase, 3,4-heptanediol dehydrogenase, 3,4-octanediol dehydrogenase, 3,4-nonanediol dehydrogenase, 3,4-decanediol dehydrogenase, 3,4-undecanediol dehydrogenase, 2-methyl-3,4-hexanediol dehydrogenase, 5-methyl-3,4-heptanediol dehydrogenase, 6-methyl-3,4-heptanediol dehydrogenase, 7-methyl-3,4-octanediol dehydrogenase, 1-phenyl-2,3-pentanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-pentanediol dehydrogenase, 1-indole-2,3-pentanediol dehydrogenase, 4,5-nonanediol dehydrogenase, 4,5-decanediol dehydrogenase, 4,5-undecanediol dehydrogenase, 4,5-dodecanediol dehydrogenase, 2-methyl-3,4-heptanediol dehydrogenase, 3-methyl-4,5-octanediol dehydrogenase, 2-methyl-4,5-octanediol dehydrogenase, 8-methyl-4,5-nonanediol dehydrogenase, 1-phenyl-2,3-hexanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-hexanediol dehydrogenase, 1-indole-2,3-hexanediol dehydrogenase, 5,6-undecanediol dehydrogenase, 5,6-undecanediol dehydrogenase, 
5,6-tridecanediol dehydrogenase, 2-methyl-3,4-octanediol dehydrogenase, 3-methyl-4,5-nonanediol dehydrogenase, 2-methyl-4,5-nonanediol dehydrogenase, 2-methyl-5,6-decanediol dehydrogenase, 1-phenyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-heptanediol dehydrogenase, 1-indole-2,3-heptanediol dehydrogenase, 6,7-tridecanediol dehydrogenase, 6,7-tetradecanediol dehydrogenase, 2-methyl-3,4-nonanediol dehydrogenase, 3-methyl-4,5-decanediol dehydrogenase, 2-methyl-4,5-decanediol dehydrogenase, 2-methyl-5,6-undecanediol dehydrogenase, 1-phenyl-2,3-octanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-octanediol dehydrogenase, 1-indole-2,3-octanediol dehydrogenase, 7,8-pentadecanediol dehydrogenase, 2-methyl-3,4-decanediol dehydrogenase, 3-methyl-4,5-undecanediol dehydrogenase, 2-methyl-4,5-undecanediol dehydrogenase, 2-methyl-5,6-dodecanediol dehydrogenase, 1-phenyl-2,3-nonanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-nonanediol dehydrogenase, 1-indole-2,3-nonanediol dehydrogenase, 2-methyl-3,4-undecanediol dehydrogenase, 3-methyl-4,5-dodecanediol dehydrogenase, 2-methyl-4,5-dodecanediol dehydrogenase, 2-methyl-5,6-tridecanediol dehydrogenase, 1-phenyl-2,3-decanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-decanediol dehydrogenase, 1-indole-2,3-decanediol dehydrogenase, 2,5-dimethyl-3,4-heptanediol dehydrogenase, 2,6-dimethyl-3,4-heptanediol dehydrogenase, 2,7-dimethyl-3,4-octanediol dehydrogenase, 1-phenyl-4-methyl-2,3-pentanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydrogenase, 1-indole-4-methyl-2,3-pentanediol dehydrogenase, 2,6-dimethyl-4,5-octanediol dehydrogenase, 3,8-dimethyl-4,5-nonanediol dehydrogenase, 1-phenyl-4-methyl-2,3-hexanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydrogenase, 1-indole-4-methyl-2,3-hexanediol dehydrogenase, 2,8-dimethyl-4,5-nonanediol dehydrogenase, 1-phenyl-5-methyl-2,3-hexanediol dehydrogenase, 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydrogenase, 
1-indole-5-methyl-2,3-hexanediol dehydrogenase, 1-phenyl-6-methyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydrogenase, 1-indole-6-methyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydrogenase, 1-indole-4-phenyl-2,3-butanediol dehydrogenase, 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1,10-diamino-5,6-decanediol dehydrogenase, 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 2,3-hexanediol-1,6-dicarboxylic acid dehydrogenase, and the like.
[0300]In certain aspects, a selected diol dehydrogenase enzyme may have one or more of the above exemplified diol dehydrogenase activities, such as a 2,3-butanediol dehydrogenase activity, a 3,4-hexanediol dehydrogenase activity, and/or a 4,5-octanediol dehydrogenase activity, among others.
[0301]In certain aspects, a recombinant microorganism may comprise a diol dehydrogenase encoded by a nucleotide reference sequence selected from SEQ ID NO:97, 99, and 101, or an enzyme having a polypeptide sequence selected from SEQ ID NO:98, 100, and 102, including biologically active fragments or variants thereof, such as optimized variants. Certain aspects may also comprise nucleotide sequences or polypeptide sequences having 80%, 85%, 90%, 95%, 97%, 98%, 99% sequence identity to SEQ ID NOS:97-102.
[0302]Other embodiments may comprise re-designed diol dehydrogenases for reduction of 3-hydroxypropanal, succinaldehyde, and glutaraldehyde to 1,3-propanediol, 1,4-butanediol, and 1,5-pentanediol, respectively, among others.
[0303]A dehydration and reduction pathway, as described herein, may comprise one or more diol dehydratases. A "diol dehydratase" refers generally to an enzyme that catalyzes the irreversible dehydration of diols. For instance, this enzyme may serve to dehydrate 4,5-octanediol to form 4-octanone. It has been recognized that there are at least two different types of diol dehydratases: one group that depends on coenzyme B12 for catalysis and one that is independent of it. Coenzyme B12 dependent diol dehydratases are known to catalyze a radical mediated dehydration reaction that converts α-hydroxy alcohols to aldehydes or ketones. For example, a diol dehydratase from Klebsiella pneumoniae catalyzes the dehydration of glycerol to form β-hydroxypropionaldehyde, and also accepts 2,3-butanediol as a substrate, catalyzing its dehydration to form 2-butanone.
[0304]As a further example, Clostridium butyricum contains coenzyme B12 independent diol dehydratases. FIG. 46 shows the in vivo biological activity of coenzyme B12 independent diol dehydratase (dhaB1) and activator (dhaB2) isolated from Clostridium butyricum (see Example 9). FIG. 46A shows the in vivo production of 1-propanol from 1,2-propanediol, FIG. 46B shows the in vivo production of 2-butanol from meso-2,3-butanediol, and FIG. 46C shows the in vivo production of cyclopentanone from trans-1,2-cyclopentanediol.
[0305]Thus, certain embodiments of the present invention may comprise optimized or redesigned diol dehydratases that accommodate various substrates, such as 4,5-octanediol, and may include diol dehydratases isolated and/or optimized from Klebsiella pneumoniae and Clostridium butyricum, among other organisms described herein and known in the art.
[0306]Exemplary diol dehydratases include, but are not limited to, 2,3-butanediol dehydratase, 3,4-hexanediol dehydratase, 4,5-octanediol dehydratase, 5,6-decanediol dehydratase, 6,7-dodecanediol dehydratase, 7,8-tetradecanediol dehydratase, 8,9-hexadecanediol dehydratase, 2,5-dimethyl-3,4-hexanediol dehydratase, 3,6-dimethyl-4,5-octanediol dehydratase, 2,7-dimethyl-4,5-octanediol dehydratase, 2,9-dimethyl-5,6-decanediol dehydratase, 1,4-diphenyl-2,3-butanediol dehydratase, bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1,4-diindole-2,3-butanediol dehydratase, 1,2-cyclopentanediol dehydratase, 2,3-pentanediol dehydratase, 2,3-hexanediol dehydratase, 2,3-heptanediol dehydratase, 2,3-octanediol dehydratase, 2,3-nonanediol dehydratase, 4-methyl-2,3-pentanediol dehydratase, 4-methyl-2,3-hexanediol dehydratase, 5-methyl-2,3-hexanediol dehydratase, 6-methyl-2,3-heptanediol dehydratase, 1-phenyl-2,3-butanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1-indole-2,3-butanediol dehydratase, 3,4-heptanediol dehydratase, 3,4-octanediol dehydratase, 3,4-nonanediol dehydratase, 3,4-decanediol dehydratase, 3,4-undecanediol dehydratase, 2-methyl-3,4-hexanediol dehydratase, 5-methyl-3,4-heptanediol dehydratase, 6-methyl-3,4-heptanediol dehydratase, 7-methyl-3,4-octanediol dehydratase, 1-phenyl-2,3-pentanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-pentanediol dehydratase, 1-indole-2,3-pentanediol dehydratase, 4,5-nonanediol dehydratase, 4,5-decanediol dehydratase, 4,5-undecanediol dehydratase, 4,5-dodecanediol dehydratase, 2-methyl-3,4-heptanediol dehydratase, 3-methyl-4,5-octanediol dehydratase, 2-methyl-4,5-octanediol dehydratase, 8-methyl-4,5-nonanediol dehydratase, 1-phenyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-hexanediol dehydratase, 1-indole-2,3-hexanediol dehydratase, 5,6-undecanediol dehydratase, 5,6-undecanediol dehydratase, 5,6-tridecanediol dehydratase, 2-methyl-3,4-octanediol dehydratase, 3-methyl-4,5-nonanediol dehydratase, 
2-methyl-4,5-nonanediol dehydratase, 2-methyl-5,6-decanediol dehydratase, 1-phenyl-2,3-heptanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-heptanediol dehydratase, 1-indole-2,3-heptanediol dehydratase, 6,7-tridecanediol dehydratase, 6,7-tetradecanediol dehydratase, 2-methyl-3,4-nonanediol dehydratase, 3-methyl-4,5-decanediol dehydratase, 2-methyl-4,5-decanediol dehydratase, 2-methyl-5,6-undecanediol dehydratase, 1-phenyl-2,3-octanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-octanediol dehydratase, 1-indole-2,3-octanediol dehydratase, 7,8-pentadecanediol dehydratase, 2-methyl-3,4-decanediol dehydratase, 3-methyl-4,5-undecanediol dehydratase, 2-methyl-4,5-undecanediol dehydratase, 2-methyl-5,6-dodecanediol dehydratase, 1-phenyl-2,3-nonanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-nonanediol dehydratase, 1-indole-2,3-nonanediol dehydratase, 2-methyl-3,4-undecanediol dehydratase, 3-methyl-4,5-dodecanediol dehydratase, 2-methyl-4,5-dodecanediol dehydratase, 2-methyl-5,6-tridecanediol dehydratase, 1-phenyl-2,3-decanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-decanediol dehydratase, 1-indole-2,3-decanediol dehydratase, 2,5-dimethyl-3,4-heptanediol dehydratase, 2,6-dimethyl-3,4-heptanediol dehydratase, 2,7-dimethyl-3,4-octanediol dehydratase, 1-phenyl-4-methyl-2,3-pentanediol dehydratase, 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydratase, 1-indole-4-methyl-2,3-pentanediol dehydratase, 2,6-dimethyl-4,5-octanediol dehydratase, 3,8-dimethyl-4,5-nonanediol dehydratase, 1-phenyl-4-methyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydratase, 1-indole-4-methyl-2,3-hexanediol dehydratase, 2,8-dimethyl-4,5-nonanediol dehydratase, 1-phenyl-5-methyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydratase, 1-indole-5-methyl-2,3-hexanediol dehydratase, 1-phenyl-6-methyl-2,3-heptanediol dehydratase, 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydratase, 1-indole-6-methyl-2,3-heptanediol dehydratase, 
1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydratase, 1-indole-4-phenyl-2,3-butanediol dehydratase, 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1,10-diamino-5,6-decanediol dehydratase, 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydratase, 2,3-hexanediol-1,6-dicarboxylic acid dehydratase, and the like.
[0307]In certain aspects, a selected diol dehydratase enzyme may have one or more of the above exemplified diol dehydratase activities, such as a 2,3-butanediol dehydratase activity, a 3,4-hexanediol dehydratase activity, and/or a 4,5-octanediol dehydratase activity, among others.
[0308]In certain aspects, diol dehydratases may be obtained from Klebsiella pneumoniae MGH 78578, including from the pduCDE gene of this and other microorganisms. In certain aspects, a recombinant microorganism may comprise one or more diol dehydratases encoded by a nucleotide reference sequence selected from SEQ ID NO:103, 105, and 107, or an enzyme having a polypeptide sequence selected from SEQ ID NO:104, 106, and 108, including biologically active fragments or variants thereof, such as optimized variants. Certain aspects may also comprise nucleotide sequences or polypeptide sequences having 80%, 85%, 90%, 95%, 97%, 98%, 99% sequence identity to SEQ ID NOS:103-108. In certain aspects, polypeptides of SEQ ID NO:104 may comprise certain conserved amino acid residues, including those chosen from D149, P151, A155, A159, G165, E168, E170, A183, G189, G196, Q200, E208, G215, Y219, E221, T222, S224, Y226, G227, T228, F232, G235, D236, D237, T238, P239, S241, L245, Y249, S251, R252, G253, K255, R257, S260, E265, M268, G269, S275, Y278, L279, E280, C283, G291, Q293, G294, Q296, N297, G298, G312, E329, S341, R344, G356, D371, N372, F374, S377, R392, D393, R412, L477, A486, G499, D500, S516, N522, D523, Y524, G526, and G530.
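A conserved-residue recitation of this kind (one-letter amino acid code followed by a 1-based position, such as D149) can be checked against a polypeptide sequence with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the demonstration uses a toy peptide rather than any SEQ ID NO of this application, and it assumes that positions refer directly to the sequence being checked. A variant with insertions or deletions would first need to be aligned to the reference sequence.

```python
import re
from typing import List, Tuple


def parse_residue(token: str) -> Tuple[str, int]:
    """Split a token such as 'D149' into ('D', 149)."""
    match = re.fullmatch(r"([A-Z])(\d+)", token.strip())
    if match is None:
        raise ValueError(f"not a residue token: {token!r}")
    return match.group(1), int(match.group(2))


def missing_conserved(sequence: str, tokens: List[str]) -> List[str]:
    """Return the tokens whose residue is absent at the stated 1-based position."""
    missing = []
    for token in tokens:
        amino_acid, position = parse_residue(token)
        # Positions past the end of the sequence also count as missing.
        if position > len(sequence) or sequence[position - 1] != amino_acid:
            missing.append(token)
    return missing


if __name__ == "__main__":
    toy_peptide = "MDAPG"  # hypothetical sequence for demonstration
    print(missing_conserved(toy_peptide, ["D2", "P4", "G5"]))
    print(missing_conserved(toy_peptide, ["D2", "A4", "G9"]))
```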
[0309]In certain aspects, a diol dehydratase may include a polypeptide that comprises an amino acid sequence having 80%, 85%, 90%, 95%, 97%, 98%, 99% sequence identity to SEQ ID NOS:308-311. SEQ ID NO:308 shows the polypeptide sequence of PduG, a diol dehydratase reactivation large subunit derived from Klebsiella pneumoniae subsp. pneumoniae MGH 78578. SEQ ID NO:309 shows the polypeptide sequence of PduH, a diol dehydratase reactivation small subunit derived from Klebsiella pneumoniae subsp. pneumoniae MGH 78578. SEQ ID NO:310 shows the polypeptide sequence of a B12-independent glycerol dehydratase from Clostridium butyricum. SEQ ID NO:311 shows the polypeptide sequence of a glycerol dehydratase activator from Clostridium butyricum. In certain aspects, a B12-independent glycerol dehydratase may comprise conserved amino acid residues, such as T36, G74, P87, E88, E97, W126, R221, A263, Q265, R287, D289, E309, R317, G335, G345, G346, N356, P374, R379, G399, G401, P403, D408, G432, C433, N452, C529, G533, G539, G540, S559, G603, N604, A654, G658, R659, D676, N702, Q735, N737, A747, P751, R760, V761, A762, G763, Q776, I780, and/or R782. In certain aspects, a B12-independent glycerol dehydratase activator may comprise certain conserved amino acid residues, including D19, G20, G22, R24, F28, G31, C32, C36, W38, C39, N41, P42, C58, C64, C96, G129, T132, G135, G136, D185, R187, N208, R222, and/or R264.
[0310]A dehydration and reduction pathway, as described herein, may comprise one or more alcohol dehydrogenases or secondary alcohol dehydrogenases. An "alcohol dehydrogenase" or "secondary alcohol dehydrogenase" that is part of a dehydration and reduction pathway refers generally to an enzyme that catalyzes the conversion of aldehyde or ketone substituents to alcohols. For instance, 4-octanone may be reduced to 4-octanol by a secondary alcohol dehydrogenase as one enzymatic step in the conversion of butyroin to a biofuel. Pseudomonads express at least one secondary alcohol dehydrogenase that oxidizes 4-octanol to 4-octanone using NAD+ as a co-factor. As another example, Rhodococcus erythropolis ATCC4277 catalyzes oxidation of medium to long chain secondary fatty alcohols using NADH as a co-factor, using an enzyme that also catalyzes the oxidation of 3-decanol and 4-decanol. In addition, Nocardia fusca AKU2123 contains an (S)-specific secondary alcohol dehydrogenase.
[0311]Genes encoding secondary alcohol dehydrogenases may be isolated from these and other organisms according to known techniques in the art and incorporated into the microbial systems or recombinant microorganisms as described herein. In certain embodiments, a microbial system or isolated microorganism may comprise natural or optimized secondary alcohol dehydrogenases from Pseudomonads, Rhodococcus erythropolis ATCC4277, Nocardia fusca AKU2123, or other suitable organisms.
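The three enzymatic steps of the dehydration and reduction pathway can be laid out as a simple ordered chain, using the butyroin example from the text. This is only a bookkeeping sketch with hypothetical names (Step, run_pathway); it models which enzyme class acts on which intermediate, not kinetics, cofactors, or stereochemistry.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    enzyme: str
    substrate: str
    product: str


# Butyroin to 4-octanol, following the order described in the text:
# reduction of the alpha-hydroxy ketone, dehydration of the diol,
# then reduction of the resulting ketone.
BUTYROIN_TO_OCTANOL = [
    Step("diol dehydrogenase", "butyroin", "4,5-octanediol"),
    Step("diol dehydratase", "4,5-octanediol", "4-octanone"),
    Step("secondary alcohol dehydrogenase", "4-octanone", "4-octanol"),
]


def run_pathway(start: str, steps: List[Step]) -> str:
    """Walk the steps in order, checking each substrate matches the prior product."""
    current = start
    for step in steps:
        if step.substrate != current:
            raise ValueError(
                f"{step.enzyme} expects {step.substrate}, but pathway holds {current}"
            )
        current = step.product
    return current


if __name__ == "__main__":
    print(run_pathway("butyroin", BUTYROIN_TO_OCTANOL))
```

The same structure could hold other substrate chains from the exemplary lists, provided the intermediates are filled in from the relevant chemistry.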
[0312]Examples of secondary alcohol dehydrogenases include, but are not limited to, 2-butanol dehydrogenase, 3-hexanol dehydrogenase, 4-octanol dehydrogenase, 5-decanol dehydrogenase, 6-dodecanol dehydrogenase, 7-tetradecanol dehydrogenase, 8-hexadecanol dehydrogenase, 2,5-dimethyl-3-hexanol dehydrogenase, 3,6-dimethyl-4-octanol dehydrogenase, 2,7-dimethyl-4-octanol dehydrogenase, 2,9-dimethyl-4-decanol dehydrogenase, 1,4-diphenyl-2-butanol dehydrogenase, bis-1,4-(4-hydroxyphenyl)-2-butanol dehydrogenase, 1,4-diindole-2-butanol dehydrogenase, cyclopentanol dehydrogenase, 2(or 3)-pentanol dehydrogenase, 2(or 3)-hexanol dehydrogenase, 2(or 3)-heptanol dehydrogenase, 2(or 3)-octanol dehydrogenase, 2(or 3)-nonanol dehydrogenase, 4-methyl-2(or 3)-pentanol dehydrogenase, 4-methyl-2(or 3)-hexanol dehydrogenase, 5-methyl-2(or 3)-hexanol dehydrogenase, 6-methyl-2(or 3)-heptanol dehydrogenase, 1-phenyl-2(or 3)-butanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase, 1-indole-2(or 3)-butanol dehydrogenase, 3(or 4)-heptanol dehydrogenase, 3(or 4)-octanol dehydrogenase, 3(or 4)-nonanol dehydrogenase, 3(or 4)-decanol dehydrogenase, 3(or 4)-undecanol dehydrogenase, 2-methyl-3(or 4)-hexanol dehydrogenase, 5-methyl-3 (or 4)-heptanol dehydrogenase, 6-methyl-3 (or 4)-heptanol dehydrogenase, 7-methyl-3(or 4)-octanol dehydrogenase, 1-phenyl-2(or 3)-pentanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-pentanol dehydrogenase, 1-indole-2(or 3)-pentanol dehydrogenase, 4(or 5)-nonanol dehydrogenase, 4(or 5)-decanol dehydrogenase, 4(or 5)-undecanol dehydrogenase, 4(or 5)-dodecanol dehydrogenase, 2-methyl-3(or 4)-heptanol dehydrogenase, 3-methyl-4(or 5)-octanol dehydrogenase, 2-methyl-4(or 5)-octanol dehydrogenase, 8-methyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-2(or 3)-hexanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-hexanol dehydrogenase, 1-indole-2(or 3)-hexanol dehydrogenase, 4(or 5)-undecanol dehydrogenase, 5(or 6)-undecanol dehydrogenase, 5(or 6)-tridecanol 
dehydrogenase, 2-methyl-3(or 4)-octanol dehydrogenase, 3-methyl-4(or 5)-nonanol dehydrogenase, 2-methyl-4(or 5)-nonanol dehydrogenase, 2-methyl-5(or 6)-decanol dehydrogenase, 1-phenyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-heptanol dehydrogenase, 1-indole-2(or 3)-heptanol dehydrogenase, 6(or 7)-tridecanol dehydrogenase, 6(or 7)-tetradecanol dehydrogenase, 2-methyl-3(or 4)-nonanol dehydrogenase, 3-methyl-4(or 5)-decanol dehydrogenase, 2-methyl-4(or 5)-decanol dehydrogenase, 2-methyl-5(or 6)-undecanol dehydrogenase, 1-phenyl-2(or 3)-octanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-octanol dehydrogenase, 1-indole-2(or 3)-octanol dehydrogenase, 7(or 8)-pentadecanol dehydrogenase, 2-methyl-3(or 4)-decanol dehydrogenase, 3-methyl-4(or 5)-undecanol dehydrogenase, 2-methyl-4(or 5)-undecanol dehydrogenase, 2-methyl-5 (or 6)-dodecanol dehydrogenase, 1-phenyl-2(or 3)-nonanol dehydrogenase, 1-(4-hydroxyphenyl)-2 (or 3)-nonanol dehydrogenase, 1-indole-2(or 3)-nonanol dehydrogenase, 2-methyl-3(or 4)-undecanol dehydrogenase, 3-methyl-4(or 5)-dodecanol dehydrogenase, 2-methyl-4(or 5)-dodecanol dehydrogenase, 2-methyl-5(or 6)-tridecanol dehydrogenase, 1-phenyl-2(or 3)-decanol dehydrogenase, 1-(4-hydroxyphenyl)-2 (or 3)-decanol dehydrogenase, 1-indole-2(or 3)-decanol dehydrogenase, 2,5-dimethyl-3(or 4)-heptanol dehydrogenase, 2,6-dimethyl-3(or 4)-heptanol dehydrogenase, 2,7-dimethyl-3(or 4)-octanol dehydrogenase, 1-phenyl-4-methyl-2(or 3)-pentanol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2(or 3)-pentanol dehydrogenase, 1-indole-4-methyl-2(or 3)-pentanol dehydrogenase, 2,6-dimethyl-4(or 5)-octanol dehydrogenase, 3,8-dimethyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-4-methyl-2(or 3)-hexanol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2 (or 3)-hexanol dehydrogenase, 1-indole-4-methyl-2(or 3)-hexanol dehydrogenase, 2,8-dimethyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-5-methyl-2(or 3)-hexanol dehydrogenase, 1-(4-hydroxyphenyl)-5-methyl-2(or 3)-hexanol 
dehydrogenase, 1-indole-5-methyl-2(or 3)-hexanol dehydrogenase, 1-phenyl-6-methyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-6-methyl-2(or 3)-heptanol dehydrogenase, 1-indole-6-methyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-4-phenyl-2(or 3)-butanol dehydrogenase, 1-indole-4-phenyl-2(or 3)-butanol dehydrogenase, 1-indole-4-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase, 1,10-diamino-5-decanol dehydrogenase, 1,4-di(4-hydroxyphenyl)-2-butanol dehydrogenase, 2-hexanol-1,6-dicarboxylic acid dehydrogenase, phenylethanol dehydrogenase, 4-hydroxyphenylethanol dehydrogenase, indole-3-ethanol dehydrogenase, and the like.
[0313]In certain aspects, a selected alcohol dehydrogenase or secondary alcohol dehydrogenase may have one or more of the above exemplified alcohol dehydrogenase activities, such as a 2-butanol dehydrogenase activity, 3-hexanol dehydrogenase activity, and/or a 4-octanol dehydrogenase activity, among others.
[0314]In certain aspects, a recombinant microorganism may comprise one or more secondary alcohol dehydrogenases encoded by a nucleotide reference sequence selected from SEQ ID NO:109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, and 141, or an enzyme having a polypeptide sequence selected from SEQ ID NO:110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, and 142, including biologically active fragments or variants thereof, such as optimized variants. Certain aspects may also comprise nucleotide sequences or polypeptide sequences having 80%, 85%, 90%, 95%, 97%, 98%, or 99% sequence identity to SEQ ID NOS:109-142.
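The sequence identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by an external tool (gaps written as "-") and uses one common definition of identity, namely matching positions divided by aligned columns; other definitions exist and may give different values. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NOS:109-142.

```python
# Thresholds recited in the text for sequence identity to a reference sequence.
THRESHOLDS = (80, 85, 90, 95, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns (assumed definition).

    Columns in which both sequences carry a gap are ignored; a column counts
    as a match only when both residues are identical and not a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list:
    """Return every recited threshold that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    reference = "MKTAYIAKQR"   # hypothetical toy reference
    candidate = "MKTAYIAKQK"   # hypothetical variant with one substitution
    identity = percent_identity(reference, candidate)
    print(identity, thresholds_met(identity))
```

In practice the alignment step and the identity definition would be fixed by the chosen alignment program and its parameters; the sketch only shows how a computed value relates to the listed percentages.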
[0315]For the secondary alcohol dehydrogenase sequences referred to above, SEQ ID NO:109 is the nucleotide sequence and SEQ ID NO:110 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-1: PP_1946) isolated from Pseudomonas putida KT2440. SEQ ID NO:111 is the nucleotide sequence and SEQ ID NO:112 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-2: PP_1817) isolated from Pseudomonas putida KT2440.
[0316]SEQ ID NO:113 is the nucleotide sequence and SEQ ID NO:114 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-3: PP_1953) isolated from Pseudomonas putida KT2440. SEQ ID NO:115 is the nucleotide sequence and SEQ ID NO:116 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-4: PP_3037) isolated from Pseudomonas putida KT2440.
[0317]SEQ ID NO:117 is the nucleotide sequence and SEQ ID NO:118 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-5: PP_1852) isolated from Pseudomonas putida KT2440. SEQ ID NO:119 is the nucleotide sequence and SEQ ID NO:120 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-6: PP_2723) isolated from Pseudomonas putida KT2440.
[0318]SEQ ID NO:121 is the nucleotide sequence and SEQ ID NO:122 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-7: PP_2002) isolated from Pseudomonas putida KT2440. SEQ ID NO:123 is the nucleotide sequence and SEQ ID NO:124 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-8: PP_1914) isolated from Pseudomonas putida KT2440.
[0319]SEQ ID NO:125 is the nucleotide sequence and SEQ ID NO:126 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-9: PP_1914) isolated from Pseudomonas putida KT2440. SEQ ID NO:127 is the nucleotide sequence and SEQ ID NO:128 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-10: PP_3926) isolated from Pseudomonas putida KT2440.
[0320]SEQ ID NO:129 is the nucleotide sequence and SEQ ID NO:130 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-11: PFL_1756) isolated from Pseudomonas fluorescens Pf-5. SEQ ID NO:131 is the nucleotide sequence and SEQ ID NO:132 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-12: KPN_01694) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578.
[0321]SEQ ID NO:133 is the nucleotide sequence and SEQ ID NO:134 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-13: KPN_02061) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578. SEQ ID NO:135 is the nucleotide sequence and SEQ ID NO:136 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-14: KPN_00827) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578.
[0322]SEQ ID NO:137 is the nucleotide sequence and SEQ ID NO:138 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-16: KPN_01350) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578. SEQ ID NO:139 is the nucleotide sequence and SEQ ID NO:140 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-17: KPN_03369) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578. SEQ ID NO:141 is the nucleotide sequence and SEQ ID NO:142 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-18: KPN_03363) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578.
[0323]In certain aspects, an alcohol dehydrogenase (e.g., DEHU hydrogenase), a secondary alcohol dehydrogenase (2ADH), a fragment, variant, or derivative thereof, or any other enzyme that utilizes such an active site, may comprise at least one of a nicotinamide adenine dinucleotide (NAD+), NADH, nicotinamide adenine dinucleotide phosphate (NADP+), or NADPH binding motif. In certain embodiments, the NAD+, NADH, NADP+, or NADPH binding motif may be selected from the group consisting of Y-X-G-G-X-Y, Y-X-X-G-G-X-Y, Y-X-X-X-G-G-X-Y, Y-X-G-X-X-Y, Y-X-X-G-G-X-X-Y, Y-X-X-X-G-X-X-Y, Y-X-G-X-Y, Y-X-X-G-X-Y, Y-X-X-X-G-X-Y, and Y-X-X-X-X-G-X-Y; wherein Y is independently selected from alanine, glycine, and serine, wherein G is glycine, and wherein X is independently selected from a genetically encoded amino acid.
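The binding motifs listed above can be read as simple sequence patterns. The following Python sketch, offered only as an illustration, translates each motif into a regular expression under the definitions given in the text (Y is alanine, glycine, or serine; G is glycine; X is any of the twenty genetically encoded amino acids, in one-letter code) and scans a polypeptide string for matches. The function names and the example sequence are hypothetical.

```python
import re

# Motifs exactly as recited in the text.
MOTIFS = [
    "Y-X-G-G-X-Y", "Y-X-X-G-G-X-Y", "Y-X-X-X-G-G-X-Y", "Y-X-G-X-X-Y",
    "Y-X-X-G-G-X-X-Y", "Y-X-X-X-G-X-X-Y", "Y-X-G-X-Y", "Y-X-X-G-X-Y",
    "Y-X-X-X-G-X-Y", "Y-X-X-X-X-G-X-Y",
]

# Motif symbol -> regular-expression character class (one-letter codes).
SYMBOL_TO_CLASS = {
    "Y": "[AGS]",                    # alanine, glycine, or serine
    "G": "G",                        # glycine
    "X": "[ACDEFGHIKLMNPQRSTVWY]",   # any genetically encoded amino acid
}


def motif_to_regex(motif: str) -> str:
    """Translate a dash-separated motif into a regular-expression string."""
    return "".join(SYMBOL_TO_CLASS[symbol] for symbol in motif.split("-"))


def find_motifs(sequence: str) -> list:
    """Return (motif, start index, matched residues) for every occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    hits = []
    for motif in MOTIFS:
        pattern = "(?=(" + motif_to_regex(motif) + "))"
        for match in re.finditer(pattern, sequence):
            hits.append((motif, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    toy_sequence = "MMATGGKSMM"  # hypothetical sequence, not from the sequence listing
    for hit in find_motifs(toy_sequence):
        print(hit)
```

Because X may itself be glycine, a single stretch of residues can satisfy more than one of the listed motifs, so several hits at the same position are expected rather than erroneous.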
[0324]As one example of a step in a reduction and dehydration pathway, α-hydroxy cyclopentanone may be reduced to 1,2-cyclopentanediol. For example, the glycerol dehydrogenase isolated from Hansenula ofunaensis favors the reduction of α-hydroxy ketones and α-keto ketones, and has very broad substrate specificity. The similar alcohol dehydrogenase derived from Hansenula polymorpha and meso-2,3-butanediol dehydrogenase have similar properties. Certain embodiments may incorporate a 1,2-cyclopentanediol dehydrogenase into the microbial system or isolated microorganism. Other embodiments may incorporate a glycerol dehydrogenase from Hansenula ofunaensis, Hansenula polymorpha, Klebsiella pneumoniae, or any other suitable organism.
[0325]By way of example, a chemical or hydrocarbon such as 1,2-cyclopentanediol may be dehydrated to form cyclopentanone as one enzymatic step in a reduction and dehydration pathway. There are at least two different types of diol dehydratases that may catalyze dehydration of chemicals such as 1,2-cyclopentanediol. Certain embodiments of a microbial system comprising a reduction and dehydration pathway will comprise diol dehydratases such as 1,2-cyclopentanediol dehydratase.
[0326]In the last enzymatic step for a reduction and dehydration pathway, the conversion of such exemplary chemicals as α-hydroxy cyclopentanone to cyclopentanol may include the reduction of cyclopentanone to cyclopentanol. This step may be catalyzed by cyclopentanol dehydrogenase, which is found in Comamonas sp. strain NCIMB 9872 and whose gene (cpnA) has been isolated. Certain embodiments of a microbial system or isolated microorganism may comprise a cyclopentanol dehydrogenase, such as that expressed by cpnA in Comamonas sp. strain NCIMB 9872, among others described herein.
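The three enzymatic steps described above (reduction, dehydration, reduction) may be summarized as an ordered list of substrate-to-product conversions. The Python sketch below is only an illustrative bookkeeping structure; the type name `Step`, the variable `CYCLOPENTANOL_ROUTE`, and the helper `is_connected` are hypothetical, and the enzyme named for each step is one of the examples given in the text rather than the only possibility.

```python
from typing import NamedTuple


class Step(NamedTuple):
    substrate: str
    reaction: str
    enzyme: str
    product: str


# Route from alpha-hydroxy cyclopentanone to cyclopentanol as described in the text.
CYCLOPENTANOL_ROUTE = [
    Step("alpha-hydroxy cyclopentanone", "reduction",
         "1,2-cyclopentanediol dehydrogenase", "1,2-cyclopentanediol"),
    Step("1,2-cyclopentanediol", "dehydration",
         "1,2-cyclopentanediol dehydratase", "cyclopentanone"),
    Step("cyclopentanone", "reduction",
         "cyclopentanol dehydrogenase", "cyclopentanol"),
]


def is_connected(route: list) -> bool:
    """True when each step's product is the substrate of the following step."""
    return all(prev.product == nxt.substrate
               for prev, nxt in zip(route, route[1:]))


if __name__ == "__main__":
    assert is_connected(CYCLOPENTANOL_ROUTE)
    for step in CYCLOPENTANOL_ROUTE:
        print(f"{step.substrate} --[{step.reaction}: {step.enzyme}]--> {step.product}")
```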
[0327]As detailed below, in certain embodiments, selected C--C ligation pathways may be utilized in combination with selected components or enzymes of a reduction and dehydration pathway to produce a commodity chemical, or intermediate thereof.
[0328]For example, certain embodiments include a method wherein the C--C ligation pathway may comprise an acetoaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-butanediol dehydrogenase, a 2,3-butanediol dehydratase, and a 2-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a propionaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-hexanediol dehydrogenase, a 3,4-hexanediol dehydratase, and a 3-hexanol dehydrogenase.
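The enzyme names in these combinations follow a regular arithmetic pattern for straight-chain aldehydes, which the Python sketch below makes explicit. It assumes, as an inference from the names used in the text rather than an explicit statement, that ligating aldehydes of m and n carbons (m not greater than n) and reducing the product gives an (m+n)-carbon vicinal diol hydroxylated at positions m and m+1, and that dehydration followed by reduction gives the corresponding m-ol or (m+1)-ol. Branched and aromatic aldehydes are not covered. The dictionary and function names are hypothetical; the aldehyde spellings follow the text.

```python
# Carbon counts of the straight-chain aldehydes named in the text.
STRAIGHT_CHAIN_CARBONS = {
    "acetoaldehyde": 2, "propionaldehyde": 3, "butyraldehyde": 4,
    "pentaldehyde": 5, "hexaldehyde": 6, "heptaldehyde": 7, "octaldehyde": 8,
}

# Alkane stems for the combined chain length (4 to 16 carbons).
ALKANE_STEM = {
    4: "butan", 5: "pentan", 6: "hexan", 7: "heptan", 8: "octan",
    9: "nonan", 10: "decan", 11: "undecan", 12: "dodecan", 13: "tridecan",
    14: "tetradecan", 15: "pentadecan", 16: "hexadecan",
}


def pathway_enzymes(aldehyde_a: str, aldehyde_b: str) -> list:
    """Name the lyase, diol dehydrogenase, diol dehydratase, and alcohol
    dehydrogenase for a pair of straight-chain aldehydes (assumed pattern)."""
    # Order the pair so the shorter aldehyde comes first.
    a, b = sorted((aldehyde_a, aldehyde_b), key=STRAIGHT_CHAIN_CARBONS.__getitem__)
    m, n = STRAIGHT_CHAIN_CARBONS[a], STRAIGHT_CHAIN_CARBONS[b]
    stem = ALKANE_STEM[m + n]
    diol = f"{m},{m + 1}-{stem}ediol"
    if m == n:
        # Symmetric product: both positions are equivalent.
        lyase = f"{a} lyase"
        alcohol = f"{m}-{stem}ol"
    else:
        lyase = f"{a}/{b} lyase"
        alcohol = f"{m}(or {m + 1})-{stem}ol"
    return [lyase, f"{diol} dehydrogenase", f"{diol} dehydratase",
            f"{alcohol} dehydrogenase"]


if __name__ == "__main__":
    print(pathway_enzymes("acetoaldehyde", "acetoaldehyde"))
    print(pathway_enzymes("butyraldehyde", "hexaldehyde"))
```

For example, two propionaldehyde units (3 + 3 carbons) give the 3,4-hexanediol and 3-hexanol enzymes, which agrees with the embodiment recited above.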
[0329]Additional embodiments include a method wherein the C--C ligation pathway may comprise a butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4,5-octanediol dehydrogenase, a 4,5-octanediol dehydratase, and a 4-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5,6-decanediol dehydrogenase, a 5,6-decanediol dehydratase, and a 5-decanol dehydrogenase.
[0330]Additional embodiments include a method wherein the C--C ligation pathway may comprise a hexaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 6,7-dodecanediol dehydrogenase, a 6,7-dodecanediol dehydratase, and a 6-dodecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 7,8-tetradecanediol dehydrogenase, a 7,8-tetradecanediol dehydratase, and a 7-tetradecanol dehydrogenase.
[0331]Additional embodiments include a method wherein the C--C ligation pathway may comprise an octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of an 8,9-hexadecanediol dehydrogenase, an 8,9-hexadecanediol dehydratase, and an 8-hexadecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise an isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,5-dimethyl-3,4-hexanediol dehydrogenase, a 2,5-dimethyl-3,4-hexanediol dehydratase, and a 2,5-dimethyl-3-hexanol dehydrogenase.
[0332]Additional embodiments include a method wherein the C--C ligation pathway may comprise a 2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,6-dimethyl-4,5-octanediol dehydrogenase, a 3,6-dimethyl-4,5-octanediol dehydratase, and a 3,6-dimethyl-4-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a 3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,7-dimethyl-4,5-octanediol dehydrogenase, a 2,7-dimethyl-4,5-octanediol dehydratase, and a 2,7-dimethyl-4-octanol dehydrogenase.
[0333]Additional embodiments include a method wherein the C--C ligation pathway may comprise a 4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,9-dimethyl-5,6-decanediol dehydrogenase, a 2,9-dimethyl-5,6-decanediol dehydratase, and a 2,9-dimethyl-5-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1,4-diphenyl-2,3-butanediol dehydrogenase, a 1,4-diphenyl-2,3-butanediol dehydratase, and a 1,4-diphenyl-2-butanol dehydrogenase.
[0334]Additional embodiments include a method wherein the C--C ligation pathway may comprise a 4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, a bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a bis-1,4-(4-hydroxyphenyl)-2-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise an indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1,4-diindole-2,3-butanediol dehydrogenase, a 1,4-diindole-2,3-butanediol dehydratase, and a 1,4-diindole-2-butanol dehydrogenase.
[0335]Additional embodiments include a method wherein the C--C ligation pathway may comprise an α-keto adipate carboxylyase, and wherein the reduction and dehydration pathway may comprise at least one of a 1,2-cyclopentanediol dehydrogenase, a 1,2-cyclopentanediol dehydratase, and a cyclopentanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/propionaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-pentanediol dehydrogenase, a 2,3-pentanediol dehydratase, and a 2(or 3)-pentanol dehydrogenase.
[0336]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-hexanediol dehydrogenase, a 2,3-hexanediol dehydratase, and a 2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-heptanediol dehydrogenase, a 2,3-heptanediol dehydratase, and a 2(or 3)-heptanol dehydrogenase.
[0337]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/hexaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-octanediol dehydrogenase, a 2,3-octanediol dehydratase, and a 2(or 3)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-nonanediol dehydrogenase, a 2,3-nonanediol dehydratase, and a 2(or 3)-nonanol dehydrogenase.
[0338]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4-methyl-2,3-pentanediol dehydrogenase, a 4-methyl-2,3-pentanediol dehydratase, and a 4-methyl-2(or 3)-pentanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4-methyl-2,3-hexanediol dehydrogenase, a 4-methyl-2,3-hexanediol dehydratase, and a 4-methyl-2(or 3)-hexanol dehydrogenase.
[0339]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5-methyl-2,3-hexanediol dehydrogenase, a 5-methyl-2,3-hexanediol dehydratase, and a 5-methyl-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 6-methyl-2,3-heptanediol dehydrogenase, a 6-methyl-2,3-heptanediol dehydratase, and a 6-methyl-2(or 3)-heptanol dehydrogenase.
[0340]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-butanediol dehydrogenase, a 1-phenyl-2,3-butanediol dehydratase, and a 1-phenyl-2(or 3)-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase.
[0341]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-butanediol dehydrogenase, a 1-indole-2,3-butanediol dehydratase, and a 1-indole-2(or 3)-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-heptanediol dehydrogenase, a 3,4-heptanediol dehydratase, and a 3(or 4)-heptanol dehydrogenase.
[0342]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-octanediol dehydrogenase, a 3,4-octanediol dehydratase, and a 3(or 4)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/hexaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-nonanediol dehydrogenase, a 3,4-nonanediol dehydratase, and a 3(or 4)-nonanol dehydrogenase.
[0343]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-decanediol dehydrogenase, a 3,4-decanediol dehydratase, and a 3(or 4)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-undecanediol dehydrogenase, a 3,4-undecanediol dehydratase, and a 3(or 4)-undecanol dehydrogenase.
[0344]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-hexanediol dehydrogenase, a 2-methyl-3,4-hexanediol dehydratase, and a 2-methyl-3(or 4)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5-methyl-3,4-heptanediol dehydrogenase, a 5-methyl-3,4-heptanediol dehydratase, and a 5-methyl-3(or 4)-heptanol dehydrogenase.
[0345]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 6-methyl-3,4-heptanediol dehydrogenase, a 6-methyl-3,4-heptanediol dehydratase, and a 6-methyl-3(or 4)-heptanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 7-methyl-3,4-octanediol dehydrogenase, a 7-methyl-3,4-octanediol dehydratase, and a 7-methyl-3(or 4)-octanol dehydrogenase.
[0346]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-pentanediol dehydrogenase, a 1-phenyl-2,3-pentanediol dehydratase, and a 1-phenyl-2(or 3)-pentanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-pentanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-pentanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-pentanol dehydrogenase.
[0347]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-pentanediol dehydrogenase, a 1-indole-2,3-pentanediol dehydratase, and a 1-indole-2(or 3)-pentanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4,5-nonanediol dehydrogenase, a 4,5-nonanediol dehydratase, and a 4(or 5)-nonanol dehydrogenase.
[0348]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/hexaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4,5-decanediol dehydrogenase, a 4,5-decanediol dehydratase, and a 4(or 5)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4,5-undecanediol dehydrogenase, a 4,5-undecanediol dehydratase, and a 4(or 5)-undecanol dehydrogenase.
[0349]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4,5-dodecanediol dehydrogenase, a 4,5-dodecanediol dehydratase, and a 4(or 5)-dodecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-heptanediol dehydrogenase, a 2-methyl-3,4-heptanediol dehydratase, and a 2-methyl-3(or 4)-heptanol dehydrogenase.
[0350]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3-methyl-4,5-octanediol dehydrogenase, a 3-methyl-4,5-octanediol dehydratase, and a 3-methyl-4(or 5)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-4,5-octanediol dehydrogenase, a 2-methyl-4,5-octanediol dehydratase, and a 2-methyl-4(or 5)-octanol dehydrogenase.
[0351]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of an 8-methyl-4,5-nonanediol dehydrogenase, an 8-methyl-4,5-nonanediol dehydratase, and an 8-methyl-4(or 5)-nonanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-hexanediol dehydrogenase, a 1-phenyl-2,3-hexanediol dehydratase, and a 1-phenyl-2(or 3)-hexanol dehydrogenase.
[0352]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-hexanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-hexanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-hexanediol dehydrogenase, a 1-indole-2,3-hexanediol dehydratase, and a 1-indole-2(or 3)-hexanol dehydrogenase.
[0353]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/hexaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5,6-undecanediol dehydrogenase, a 5,6-undecanediol dehydratase, and a 5(or 6)-undecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5,6-dodecanediol dehydrogenase, a 5,6-dodecanediol dehydratase, and a 5(or 6)-dodecanol dehydrogenase.
[0354]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5,6-tridecanediol dehydrogenase, a 5,6-tridecanediol dehydratase, and a 5(or 6)-tridecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-octanediol dehydrogenase, a 2-methyl-3,4-octanediol dehydratase, and a 2-methyl-3(or 4)-octanol dehydrogenase.
[0355]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3-methyl-4,5-nonanediol dehydrogenase, a 3-methyl-4,5-nonanediol dehydratase, and a 3-methyl-4(or 5)-nonanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-4,5-nonanediol dehydrogenase, a 2-methyl-4,5-nonanediol dehydratase, and a 2-methyl-4(or 5)-nonanol dehydrogenase.
[0356]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-5,6-decanediol dehydrogenase, a 2-methyl-5,6-decanediol dehydratase, and a 2-methyl-5(or 6)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-heptanediol dehydrogenase, a 1-phenyl-2,3-heptanediol dehydratase, and a 1-phenyl-2(or 3)-heptanol dehydrogenase.
[0357]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-heptanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-heptanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-heptanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-heptanediol dehydrogenase, a 1-indole-2,3-heptanediol dehydratase, and a 1-indole-2(or 3)-heptanol dehydrogenase.
[0358]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexaldehyde/heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 6,7-tridecanediol dehydrogenase, a 6,7-tridecanediol dehydratase, and a 6(or 7)-tridecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexaldehyde/octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 6,7-tetradecanediol dehydrogenase, a 6,7-tetradecanediol dehydratase, and a 6(or 7)-tetradecanol dehydrogenase.
[0359]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexaldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-nonanediol dehydrogenase, a 2-methyl-3,4-nonanediol dehydratase, and a 2-methyl-3(or 4)-nonanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3-methyl-4,5-decanediol dehydrogenase, a 3-methyl-4,5-decanediol dehydratase, and a 3-methyl-4(or 5)-decanol dehydrogenase.
[0360]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-4,5-decanediol dehydrogenase, a 2-methyl-4,5-decanediol dehydratase, and a 2-methyl-4(or 5)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-5,6-undecanediol dehydrogenase, a 2-methyl-5,6-undecanediol dehydratase, and a 2-methyl-5(or 6)-undecanol dehydrogenase.
[0361]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-octanediol dehydrogenase, a 1-phenyl-2,3-octanediol dehydratase, and a 1-phenyl-2(or 3)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-octanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-octanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-octanol dehydrogenase.
[0362]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-octanediol dehydrogenase, a 1-indole-2,3-octanediol dehydratase, and a 1-indole-2(or 3)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 7,8-pentadecanediol dehydrogenase, a 7,8-pentadecanediol dehydratase, and a 7(or 8)-pentadecanol dehydrogenase.
[0363]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-decanediol dehydrogenase, a 2-methyl-3,4-decanediol dehydratase, and a 2-methyl-3(or 4)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3-methyl-4,5-undecanediol dehydrogenase, a 3-methyl-4,5-undecanediol dehydratase, and a 3-methyl-4(or 5)-undecanol dehydrogenase.
[0364]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-4,5-undecanediol dehydrogenase, a 2-methyl-4,5-undecanediol dehydratase, and a 2-methyl-4(or 5)-undecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-5,6-dodecanediol dehydrogenase, a 2-methyl-5,6-dodecanediol dehydratase, and a 2-methyl-5(or 6)-dodecanol dehydrogenase.
[0365]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-nonanediol dehydrogenase, a 1-phenyl-2,3-nonanediol dehydratase, and a 1-phenyl-2(or 3)-nonanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-nonanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-nonanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-nonanol dehydrogenase.
[0366]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-nonanediol dehydrogenase, a 1-indole-2,3-nonanediol dehydratase, and a 1-indole-2(or 3)-nonanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-undecanediol dehydrogenase, a 2-methyl-3,4-undecanediol dehydratase, and a 2-methyl-3(or 4)-undecanol dehydrogenase.
[0367]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3-methyl-4,5-dodecanediol dehydrogenase, a 3-methyl-4,5-dodecanediol dehydratase, and a 3-methyl-4(or 5)-dodecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-4,5-dodecanediol dehydrogenase, a 2-methyl-4,5-dodecanediol dehydratase, and a 2-methyl-4(or 5)-dodecanol dehydrogenase.
[0368]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-5,6-tridecanediol dehydrogenase, a 2-methyl-5,6-tridecanediol dehydratase, and a 2-methyl-5(or 6)-tridecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-decanediol dehydrogenase, a 1-phenyl-2,3-decanediol dehydratase, and a 1-phenyl-2(or 3)-decanol dehydrogenase.
[0369]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-decanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-decanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-decanediol dehydrogenase, a 1-indole-2,3-decanediol dehydratase, and a 1-indole-2(or 3)-decanol dehydrogenase.
[0370]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,5-dimethyl-3,4-heptanediol dehydrogenase, a 2,5-dimethyl-3,4-heptanediol dehydratase, and a 2,5-dimethyl-3(or 4)-heptanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,6-dimethyl-3,4-heptanediol dehydrogenase, a 2,6-dimethyl-3,4-heptanediol dehydratase, and a 2,6-dimethyl-3(or 4)-heptanol dehydrogenase.
[0371]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,7-dimethyl-3,4-octanediol dehydrogenase, a 2,7-dimethyl-3,4-octanediol dehydratase, and a 2,7-dimethyl-3(or 4)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-4-methyl-2,3-pentanediol dehydrogenase, a 1-phenyl-4-methyl-2,3-pentanediol dehydratase, and a 1-phenyl-4-methyl-2(or 3)-pentanol dehydrogenase.
[0372]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydrogenase, a 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydratase, and a 1-(4-hydroxyphenyl)-4-methyl-2(or 3)-pentanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-4-methyl-2,3-pentanediol dehydrogenase, a 1-indole-4-methyl-2,3-pentanediol dehydratase, and a 1-indole-4-methyl-2(or 3)-pentanol dehydrogenase.
[0373]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 2-methyl-butyraldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,6-dimethyl-4,5-octanediol dehydrogenase, a 2,6-dimethyl-4,5-octanediol dehydratase, and a 2,6-dimethyl-4(or 5)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 2-methyl-butyraldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,8-dimethyl-4,5-nonanediol dehydrogenase, a 3,8-dimethyl-4,5-nonanediol dehydratase, and a 3,8-dimethyl-4(or 5)-nonanol dehydrogenase.
[0374]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 2-methyl-butyraldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-4-methyl-2,3-hexanediol dehydrogenase, a 1-phenyl-4-methyl-2,3-hexanediol dehydratase, and a 1-phenyl-4-methyl-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 2-methyl-butyraldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydrogenase, a 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydratase, and a 1-(4-hydroxyphenyl)-4-methyl-2(or 3)-hexanol dehydrogenase.
[0375]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 2-methyl-butyraldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-4-methyl-2,3-hexanediol dehydrogenase, a 1-indole-4-methyl-2,3-hexanediol dehydratase, and a 1-indole-4-methyl-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 3-methyl-butyraldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,8-dimethyl-4,5-nonanediol dehydrogenase, a 2,8-dimethyl-4,5-nonanediol dehydratase, and a 2,8-dimethyl-4(or 5)-nonanol dehydrogenase.
[0376]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 3-methyl-butyraldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-5-methyl-2,3-hexanediol dehydrogenase, a 1-phenyl-5-methyl-2,3-hexanediol dehydratase, and a 1-phenyl-5-methyl-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 3-methyl-butyraldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydrogenase, a 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydratase, and a 1-(4-hydroxyphenyl)-5-methyl-2(or 3)-hexanol dehydrogenase.
[0377]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 3-methyl-butyraldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-5-methyl-2,3-hexanediol dehydrogenase, a 1-indole-5-methyl-2,3-hexanediol dehydratase, and a 1-indole-5-methyl-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 4-methyl-pentaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-6-methyl-2,3-heptanediol dehydrogenase, a 1-phenyl-6-methyl-2,3-heptanediol dehydratase, and a 1-phenyl-6-methyl-2(or 3)-heptanol dehydrogenase.
[0378]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 4-methyl-pentaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydrogenase, a 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydratase, and a 1-(4-hydroxyphenyl)-6-methyl-2(or 3)-heptanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 4-methyl-pentaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-6-methyl-2,3-heptanediol dehydrogenase, a 1-indole-6-methyl-2,3-heptanediol dehydratase, and a 1-indole-6-methyl-2(or 3)-heptanol dehydrogenase.
[0379]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a phenylacetaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydrogenase, a 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydratase, and a 1-(4-hydroxyphenyl)-4-phenyl-2(or 3)-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a phenylacetaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-4-phenyl-2,3-butanediol dehydrogenase, a 1-indole-4-phenyl-2,3-butanediol dehydratase, and a 1-indole-4-phenyl-2(or 3)-butanol dehydrogenase.
[0380]Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 4-hydroxyphenylacetaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, a 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a 1-indole-4-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a 5-amino-pentaldehyde lyase, and wherein the reduction and dehydration pathway may comprise at least one of a 1,10-diamino-5,6-decanediol dehydrogenase, a 1,10-diamino-5,6-decanediol dehydratase, and a 1,10-diamino-5-decanol dehydrogenase.
[0381]Additional embodiments include a method wherein the C--C ligation pathway may comprise a 4-hydroxyphenylacetaldehyde lyase, and wherein the reduction and dehydration pathway may comprise at least one of a 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, a 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a 1,4-di(4-hydroxyphenyl)-2-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a succinate semialdehyde lyase, and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-hexanediol-1,6-dicarboxylic acid dehydrogenase, a 2,3-hexanediol-1,6-dicarboxylic acid dehydratase, and a 2-hexanol-1,6-dicarboxylic acid dehydrogenase.
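By way of illustration only, the naming pattern in the foregoing embodiments can be checked with simple carbon bookkeeping. The following Python sketch assumes that each C--C ligation joins the full carbon skeletons of the two aldehyde substrates, so that the diol named in the text carries the sum of their carbons. The table of carbon counts and the function name `ligation_product_carbons` are hypothetical helpers introduced for this sketch and are not part of the disclosure.

```python
from itertools import combinations

# Carbon counts of aldehyde substrates named in the embodiments above
# (standard chemistry; the dictionary itself is an illustrative assumption).
ALDEHYDE_CARBONS = {
    "heptaldehyde": 7,
    "octaldehyde": 8,
    "isobutyraldehyde": 4,
    "2-methyl-butyraldehyde": 5,
    "3-methyl-butyraldehyde": 5,
    "4-methyl-pentaldehyde": 6,
    "phenylacetaldehyde": 8,
}


def ligation_product_carbons(aldehyde_a, aldehyde_b):
    """Return the total carbon count of the diol formed from two aldehydes.

    Assumes the ligation joins both carbon skeletons without loss of carbon.
    """
    return ALDEHYDE_CARBONS[aldehyde_a] + ALDEHYDE_CARBONS[aldehyde_b]


# Enumerate every unordered cross-pairing of the substrates in the table.
for a, b in combinations(ALDEHYDE_CARBONS, 2):
    print(f"{a} + {b}: C{ligation_product_carbons(a, b)} diol")
```

For example, heptaldehyde and octaldehyde give a C15 product under this assumption, which agrees with the 7,8-pentadecanediol named above.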
[0382]Certain embodiments of a microbial system or recombinant microorganism may comprise genes encoding enzymes that are able to catalyze (e.g., by reduction and dehydration) the conversion of 4-octanol to octene or octane. Other embodiments may comprise redesigned or de novo designed enzymes for this reduction and dehydration pathway. For example, three redesigned enzymes could convert 4-octanone to either 3- or 4-octene. The first step could be catalyzed by a redesigned isocitrate dehydrogenase. This enzyme could catalyze the formation of 4-hydroxy-3(or 5)-carboxylic octane. The 4-hydroxy group could then be phosphorylated by a redesigned kinase. Finally, a redesigned mevalonate diphosphate decarboxylase could catalyze the formation of 3(or 4)-octene.
[0383]In other embodiments, several redesigned enzymes could convert 4-octanone to octane. For example, the 4-hydroxy-3(or 5)-carboxylic octane is sequentially reduced and dehydrated to form 3(or 5)-carboxylic octane. Redesigned enzymes involved in fatty acid metabolism can catalyze these reactions. The 3(or 5)-carboxylic octane can be reduced to the corresponding aldehyde by an aldehyde dehydrogenase, and the product may then be decarbonylated by a redesigned decarbonylase to form octane.
[0384]As noted above, for the production of certain commodity chemicals, such as 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol, among other similar chemicals, a biosynthesis pathway (e.g., aldehyde biosynthesis pathway) may optionally or further comprise one or more genes encoding a decarboxylase enzyme, such as an indole-3-pyruvate decarboxylase (IPDC), to produce an aldehyde. In certain aspects, an IPDC may comprise an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NO:312. An IPDC enzyme may comprise certain conserved amino acid residues, such as G24, D25, E48, A55, R60, G75, E89, H113, G252, G405, G413, G428, G430, and/or N456.
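Merely as a sketch, the conserved residues listed above could be checked in a candidate sequence as follows. The function name `check_conserved` and the toy sequence are hypothetical; SEQ ID NO:312 is not reproduced here. The sketch also assumes that the candidate uses the same residue numbering as SEQ ID NO:312, whereas a real homolog would first need to be aligned to that sequence.

```python
import re

# Conserved IPDC residues as listed in the text (one-letter code plus position).
IPDC_CONSERVED = [
    "G24", "D25", "E48", "A55", "R60", "G75", "E89",
    "H113", "G252", "G405", "G413", "G428", "G430", "N456",
]


def check_conserved(seq, labels):
    """Map each label (e.g. 'G24') to True if seq has that residue there.

    Positions are 1-based. A position beyond the end of seq counts as False.
    """
    result = {}
    for label in labels:
        match = re.fullmatch(r"([A-Z])(\d+)", label)
        if match is None:
            raise ValueError(f"unrecognized residue label: {label}")
        residue, pos = match.group(1), int(match.group(2))
        in_range = pos in range(1, len(seq) + 1)
        result[label] = in_range and seq[pos - 1] == residue
    return result


# Hypothetical toy sequence: G at position 24 and D at position 25 only.
toy_seq = "M" + "A" * 22 + "GD"
print(check_conserved(toy_seq, ["G24", "D25", "E48"]))
```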
[0385]In these and other embodiments, a recombinant microorganism may comprise an aldehyde reductase, such as a phenylacetaldehyde reductase (PAR), to convert an aldehyde to a commodity chemical. In certain aspects, a PAR may comprise an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NO:313, which shows the sequence of a PAR enzyme derived from Rhodococcus sp. ST-10. In certain aspects, a PAR enzyme may comprise at least one of a nicotinamide adenine dinucleotide (NAD+), NADH, nicotinamide adenine dinucleotide phosphate (NADP+), or NADPH binding motif. In certain embodiments, the NAD+, NADH, NADP+, or NADPH binding motif may be selected from the group consisting of Y-X-G-G-X-Y, Y-X-X-G-G-X-Y, Y-X-X-X-G-G-X-Y, Y-X-G-X-X-Y, Y-X-X-G-G-X-X-Y, Y-X-X-X-G-X-X-Y, Y-X-G-X-Y, Y-X-X-G-X-Y, Y-X-X-X-G-X-Y, and Y-X-X-X-X-G-X-Y; wherein Y is independently selected from alanine, glycine, and serine, wherein G is glycine, and wherein X is independently selected from a genetically encoded amino acid.
[0386]In certain embodiments, such a recombinant microorganism may also or alternatively comprise a secondary alcohol dehydrogenase having an activity selected from at least one of a phenylethanol dehydrogenase activity, a 4-hydroxyphenylethanol dehydrogenase activity, and an indole-3-ethanol dehydrogenase activity, to reduce the aldehyde to its corresponding alcohol (e.g., 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol).
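By way of example only, the binding motifs recited above can be translated into regular expressions and used to scan a protein sequence. In the sketch below, Y is read as alanine, glycine, or serine, G as glycine, and X as any of the twenty standard genetically encoded amino acids, which is an assumption about the scope of "genetically encoded". The function names `motif_to_regex` and `find_motifs` and the toy sequences are hypothetical and do not come from SEQ ID NO:313.

```python
import re

# Motifs exactly as recited in the text.
MOTIFS = [
    "Y-X-G-G-X-Y", "Y-X-X-G-G-X-Y", "Y-X-X-X-G-G-X-Y", "Y-X-G-X-X-Y",
    "Y-X-X-G-G-X-X-Y", "Y-X-X-X-G-X-X-Y", "Y-X-G-X-Y", "Y-X-X-G-X-Y",
    "Y-X-X-X-G-X-Y", "Y-X-X-X-X-G-X-Y",
]

# Y = Ala/Gly/Ser, G = Gly, X = any of the 20 standard amino acids (assumed).
TOKEN_TO_CLASS = {
    "Y": "[AGS]",
    "G": "G",
    "X": "[ACDEFGHIKLMNPQRSTVWY]",
}


def motif_to_regex(motif):
    """Compile a motif such as 'Y-X-G-G-X-Y' into a lookahead regex.

    The lookahead lets overlapping occurrences be reported separately.
    """
    body = "".join(TOKEN_TO_CLASS[token] for token in motif.split("-"))
    return re.compile(f"(?=({body}))")


def find_motifs(seq):
    """Return (motif, 1-based start position, matched text) for every hit."""
    hits = []
    for motif in MOTIFS:
        for match in motif_to_regex(motif).finditer(seq):
            hits.append((motif, match.start() + 1, match.group(1)))
    return hits


# Hypothetical toy sequence containing A-V-G-G-L-A starting at position 3.
for hit in find_motifs("MKAVGGLAVT"):
    print(hit)
```

Because the motifs are short and permissive, a scan of this kind is expected to return many hits in a real protein; it would serve only as a first filter before structural or experimental confirmation.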
[0387]Embodiments of the present invention also include methods for converting a suitable monosaccharide to a commodity chemical comprising, (a) obtaining a suitable monosaccharide; (b) contacting the suitable monosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide to the biofuel, wherein the microbial system comprises, (i) one or more genes encoding and expressing a fatty acid biosynthesis pathway, an amino acid biosynthetic pathway, and/or a short chain alcohol biosynthetic pathway; (ii) one or more genes encoding and expressing a keto-acid decarboxylase, aldehyde dehydrogenase, and/or alcohol dehydrogenase; and (iii) an enzymatic reduction pathway selected from (1) an enzymatic long chain alcohol reduction pathway, (2) an enzymatic decarbonylation pathway, (3) an enzymatic decarboxylation pathway, and (4) an enzymatic reduction pathway comprising (1), (2), and/or (3), thereby converting the suitable monosaccharide to the commodity chemical.
[0388]Embodiments of the present invention may comprise one or more genes encoding and expressing enzymes in a fatty acid synthesis pathway, which may be used, as one example, to produce biofuels in the form of alkanes, such as medium to long chain alkanes. In certain embodiments, the specificity of the fatty acid biosynthesis pathway in the microbial system may be recalibrated or redesigned. Merely by way of example, microorganisms generally produce a mixture of long chain fatty acids (e.g., E. coli naturally produce large quantities of long chain fatty acids (C16-C19: <95% in whole cells) and small quantity of medium chain fatty acids (C12: 2% and C14: 5% in whole cells)).
[0389]In certain embodiments, the recalibration or re-engineering may be directed to increasing production of medium chain fatty acids, including, but not limited to, caprylate (C8), caprate (C10), laurate (C12), myristate (C14), and palmitate (C16), as alkanes produced from these fatty acids are major components of gasoline, diesels, and kerosene. In addition to these fatty acids, other embodiments may be directed to increased production of long chain fatty acids, including, but not limited to, stearate (C18), arachidonate (C20), behenate (C22) and longer fatty acids, as n-alkanes produced from these fatty acids are among the major components of heavy oils.
[0390]For example, Cuphea species mainly accumulate medium chain fatty acids as major components in their seed oils, and these compositions vary depending on the species. In particular, Cuphea pulcherrima accumulates caprylate (C8:0) 96%, Cuphea koehneana accumulates caprate (C10:0) 95.3%, and Cuphea polymorpha accumulates laurate (C12:0) 80.1%. Embodiments of the microbial systems or isolated microorganisms according to the present application may incorporate genes from various Cuphea species encoding enzymes involved in a fatty acid biosynthesis pathway, and these microorganisms may be directed in part to the production of medium chain fatty acids.
[0391]In other embodiments, acyl-acyl carrier protein (ACP) thioesterases (TEs) derived from various species including Cuphea hookeriana, Cuphea palustris, Umbellularia californica, and Cinnamomum camphorum may be over-expressed in such microorganisms as E. coli, wherein the specific activity for the formation of each medium chain fatty acid, such as caprylate (C8), caprate (C10), laurate (C12), myristate (C14), and palmitate (C16), is improved over the wild type. Certain embodiments may include other enzyme components involved in fatty acid biosynthesis as known to a person skilled in the art, including, but not limited to, ACP and β-ketoacyl ACP synthase (KAS) IV.
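Merely to illustrate how the seed oil compositions above might guide the choice of a gene source, the following Python sketch stores the three reported Cuphea profiles and looks up species by target chain length. The data are taken from the paragraph above; the dictionary layout and the function name `candidate_gene_sources` are hypothetical, and the sketch assumes that the dominant seed oil fatty acid is a reasonable proxy for the chain-length specificity of the underlying enzymes.

```python
# Species -> (dominant fatty acid, chain length, percent of seed oil),
# as reported in the text.
CUPHEA_SEED_OIL = {
    "Cuphea pulcherrima": ("caprylate", 8, 96.0),
    "Cuphea koehneana": ("caprate", 10, 95.3),
    "Cuphea polymorpha": ("laurate", 12, 80.1),
}


def candidate_gene_sources(target_chain_length):
    """List species whose dominant seed oil fatty acid has the given length."""
    return [
        species
        for species, (_name, length, _percent) in CUPHEA_SEED_OIL.items()
        if length == target_chain_length
    ]


print(candidate_gene_sources(10))
```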
[0392]Microbial systems and isolated microorganisms of the present application may also incorporate fatty aldehyde dehydrogenases to reduce fatty acids to fatty aldehydes. Merely by way of explanation, the conversion of fatty acids to fatty aldehydes may be catalyzed by medium and/or long chain fatty aldehyde dehydrogenases isolated from various suitable organisms. Certain embodiments may incorporate, for example, a fatty aldehyde dehydrogenase derived from Vibrio harveyi.
[0393]Microbial systems and isolated microorganisms of the present application may also incorporate one or more enzymes that catalyze the conversion of fatty aldehydes to biofuels such as n-alkanes, including, for example, enzymes comprising an enzymatic long chain alcohol reduction pathway. Certain embodiments may incorporate genes from various other sources that encode enzymes capable of catalyzing the reduction and dehydration of fatty acids to biofuels, such as alkanes. For example, bacterial strain HD-1 is able to produce biofuels, such as n-alkanes, with various chain lengths, and also produces both odd and even numbered alkanes. Certain embodiments of the microbial systems and recombinant microorganisms provided herein may incorporate the HD-1 genes encoding the enzymes involved in this pathway.
[0394]Other embodiments may incorporate redesigned or de novo designed enzymes for this reduction pathway. For example, embodiments of the present invention may include a redesigned isocitrate dehydrogenase, which may catalyze the formation of 2-carboxy-1-alcohols. In certain embodiments, the 2-carboxy-1-alcohols may be sequentially reduced and dehydrated to form 2-carboxy-alkanes, which may be catalyzed by redesigned enzymes involved in fatty acid metabolism. The 2-carboxy-alkanes can be reduced to the corresponding aldehyde by an aldehyde dehydrogenase and then decarbonylated to form n-alkanes by the redesigned decarbonylase, as discussed below. Certain embodiments of these microbial systems may produce either even numbered n-alkanes, odd numbered n-alkanes, or both.
[0395]Certain embodiments of the present application may incorporate the genes encoding enzymes catalyzing decarbonylation, or an enzymatic decarbonylation pathway. Merely by way of example, the green colonial alga Botryococcus braunii, race A, produces linear odd-numbered C27, C29, and C31 hydrocarbons that total up to 32% of the alga's dry weight. Microsomal preparations of this organism have decarbonylation activity. This decarbonylase from B. braunii culture is a cobalt-protoporphyrin IX containing enzyme. Certain microbial systems or isolated microorganisms may incorporate the gene encoding fatty aldehyde decarbonylase from Botryococcus braunii.
[0396]Other embodiments may include redesigned decarbonylase enzymes, for example, wherein the N-terminal membrane sequence is substituted. By way of explanation, the functional activity of a similar enzyme, cytochrome P450 containing Fe-protoporphyrin IX (heme), is improved by substituting the N-terminal membrane-associated sequence, and the decarbonylases of the present microbial systems may comprise similar substitutions or improvements.
[0397]Other embodiments may incorporate the genes encoding a Co-porphyrin synthase. By way of explanation, decarbonylase enzymes may use Co-protoporphyrin IX as a co-factor, and Clostridium tetanomorphum is able to incorporate cobalt into incubated protoporphyrin IX. Certain embodiments may incorporate the Co-porphyrin synthase from Clostridium tetanomorphum, or from other suitable microorganisms. Other embodiments may incorporate de novo designed decarbonylation enzymes using inorganic metals such as Co2+, Fe2+, and Ni2+ as catalysts.
[0398]Certain embodiments may comprise genes encoding the enzymes responsible for the formation of alkenes, or an enzymatic decarboxylation pathway. These genes may be derived or isolated from various sources, such as higher plants and insects. For example, higher plants such as germinating safflower (Carthamus tinctorius L.) produce a number of odd numbered 1-alkenes, including 1-pentadecene, 1-heptadecene, 1,8-heptadecadiene and 1,8,11-heptadecatriene besides about 80-90% 1,8,11,14-heptadecatetraene by decarboxylation from their corresponding fatty acids. Certain embodiments may incorporate the genes from higher plants such as Carthamus tinctorius.
[0399]Other embodiments may incorporate the genes encoding the enzymes responsible for the formation of alkenes (e.g., an enzymatic decarboxylation pathway) from microorganisms, including, but not limited to, bacterial strain HD-1. By way of explanation, bacterial strain HD-1 produces n-alkenes in addition to n-alkanes.
[0400]Other embodiments may incorporate genes encoding de novo designed enzymes for an enzymatic decarboxylation pathway. For example, these redesigned enzymes convert β-hydroxy fatty acids to n-alkenes. The first step is catalyzed by a redesigned kinase, which catalyzes the phosphorylation of a β-hydroxy group. A redesigned mevalonate diphosphate decarboxylase then catalyzes the formation of n-alkenes, such as n-1-alkene.
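Merely by way of explanation, the three enzymatic routes described above differ in the carbon count of the hydrocarbon obtained from a given fatty acid. The following Python sketch records that bookkeeping under the assumption that the long chain alcohol reduction route retains every carbon, while decarbonylation and decarboxylation each remove the single carboxyl-derived carbon. The route labels and the function name `product_carbons` are hypothetical names introduced for this sketch.

```python
# Carbons lost from the fatty acid skeleton on each route (assumed):
# reduction keeps all carbons; decarbonylation loses one as CO;
# decarboxylation loses one as CO2.
CARBONS_LOST = {
    "alcohol_reduction": 0,
    "decarbonylation": 1,
    "decarboxylation": 1,
}


def product_carbons(fatty_acid_carbons, route):
    """Return the carbon count of the hydrocarbon product for a route."""
    return fatty_acid_carbons - CARBONS_LOST[route]


# An even-numbered C16 fatty acid gives an odd-numbered C15 product by
# decarboxylation, consistent with the 1-pentadecene mentioned above.
for route in CARBONS_LOST:
    print(route, product_carbons(16, route))
```

This is consistent with the observation above that decarbonylation and decarboxylation of the common even-numbered fatty acids yield odd-numbered hydrocarbons, whereas a reduction route can yield even-numbered n-alkanes.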
[0401]Any microorganism may be utilized according to the present invention. In certain aspects, a microorganism is a eukaryotic or prokaryotic microorganism. In certain aspects, a microorganism is a yeast, such as S. cerevisiae. In certain aspects, a microorganism is a bacterium, such as a gram-positive bacterium or a gram-negative bacterium. Given its rapid growth rate, well-understood genetics, the variety of available genetic tools, and its capability in producing heterologous proteins, genetically modified E. coli may be used in certain embodiments of a microbial system as described herein, whether for the degradation and metabolism of a polysaccharide, such as alginate or pectin, or the formation or biosynthesis of commodity chemicals, such as biofuels.
[0402]Other microorganisms may be used according to the present invention, based in part on the compatibility of enzymes and metabolites with host organisms. For example, other organisms such as Acetobacter aceti, Achromobacter, Acidiphilium, Acinetobacter, Actinomadura, Actinoplanes, Aeropyrum pernix, Agrobacterium, Alcaligenes, Ananas comosus (M), Arthrobacter, Aspergillus niger, Aspergillus oryzae, Aspergillus melleus, Aspergillus pulverulentus, Aspergillus saitoi, Aspergillus sojae, Aspergillus usamii, Bacillus alcalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus lentus, Bacillus licheniformis, Bacillus macerans, Bacillus stearothermophilus, Bacillus subtilis, Bifidobacterium, Brevibacillus brevis, Burkholderia cepacia, Candida cylindracea, Candida rugosa, Carica papaya (L), Cellulosimicrobium, Cephalosporium, Chaetomium erraticum, Chaetomium gracile, Clostridium, Clostridium butyricum, Clostridium acetobutylicum, Clostridium thermocellum, Corynebacterium (glutamicum), Corynebacterium efficiens, Escherichia coli, Enterococcus, Erwinia chrysanthemi, Gluconobacter, Gluconacetobacter, Haloarcula, Humicola insolens, Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kluyveromyces, Kluyveromyces fragilis, Kluyveromyces lactis, Kocuria, Lactlactis, Lactobacillus, Lactobacillus fermentum, Lactobacillus sake, Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis, Methanolobus siciliae, Methanogenium organophilum, Methanobacterium bryantii, Microbacterium imperiale, Micrococcus lysodeikticus, Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium, Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus, Pediococcus halophilus, Penicillium, Penicillium camemberti, Penicillium citrinum, Penicillium emersonii, Penicillium roqueforti, Penicillium lilacinum, Penicillium multicolor, Paracoccus pantotrophus, Propionibacterium, Pseudomonas, Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus, Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar, Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus oligosporus, Rhodococcus, Saccharomyces cerevisiae, Sclerotinia libertiana, Sphingobacterium multivorum, Sphingobium, Sphingomonas, Streptococcus, Streptococcus thermophilus Y-1, Streptomyces, Streptomyces griseus, Streptomyces lividans, Streptomyces murinus, Streptomyces rubiginosus, Streptomyces violaceoruber, Streptoverticillium mobaraense, Tetragenococcus, Thermus, Thiosphaera pantotropha, Trametes, Trichoderma, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, Trichosporon penicillatum, Vibrio alginolyticus, Xanthomonas, yeast, Zygosaccharomyces rouxii, Zymomonas, and Zymomonas mobilis, may be utilized as recombinant microorganisms provided herein, and, thus, may be utilized according to the various methods of the present invention.
[0403]The following Examples are offered by way of illustration, not limitation.
EXAMPLES
Example 1
Engineering E. coli to Grow on Alginate as a Sole Source of Carbon
[0404]Wild type E. coli cannot use alginate polymer or degraded alginate as its sole carbon source (see FIG. 4). Vibrio splendidus, however, is known to be able to metabolize alginate to support growth. To generate recombinant E. coli that uses degraded alginate as its sole carbon source, a Vibrio splendidus fosmid library was constructed and cloned into E. coli.
[0405]To prepare the Vibrio splendidus fosmid library, genomic DNA was isolated from Vibrio splendidus B01 (gift from Dr. Martin Polz, MIT) using the DNeasy Blood and Tissue Kit (Qiagen, Valencia, Calif.). A fosmid library was then constructed using the Copy Control Fosmid Library Production Kit (Epicentre, Madison, Wis.). This library consisted of random genomic fragments of approximately 40 kb inserted into the vector pCC1FOS (Epicentre, Madison, Wis.).
[0406]The fosmid library was packaged into phage, and E. coli DH10B cells harboring a pDONR221 plasmid (Invitrogen, Carlsbad, Calif.) carrying certain Vibrio splendidus genes (V12B01--02425 to V12B01--02480; encoding a type II secretion apparatus; see SEQ ID NO:1) were transfected with the phage library. This secretome region encodes a type II secretion apparatus derived from Vibrio splendidus, which was cloned into a pDONR221 plasmid and introduced into E. coli strain DH10B (see Example 1).
[0407]Transformants were selected for chloramphenicol resistance and then screened for their ability to grow on degraded alginate media. Degraded alginate media was prepared by incubating 2% alginate (Sigma-Aldrich, St. Louis, Mo.) in 10 mM Na-phosphate buffer, 50 mM KCl, and 400 mM NaCl with alginate lyase from Flavobacterium sp. (Sigma-Aldrich, St. Louis, Mo.) at room temperature for at least one week. This degraded alginate was diluted to a concentration of 0.8% to make growth media that had a final concentration of 1×M9 salts, 2 mM MgSO4, 100 μM CaCl2, 0.007% leucine, 0.01% casamino acids, and 1.5% NaCl (this includes all sources of sodium: M9, diluted alginate, and added NaCl).
[0408]One fosmid-containing E. coli clone was isolated that grew well on this media. The fosmid DNA from this clone was isolated and prepared using the FosmidMAX DNA Purification Kit (Epicentre, Madison, Wis.). This isolated fosmid was transferred back into DH10B cells, and these cells were tested for the ability to grow on alginate.
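By way of illustration, the dilution of the 2% degraded alginate stock to 0.8% follows the usual relation C1 x V1 = C2 x V2. The Python sketch below computes the stock volume needed for a chosen final volume, and the concentration of a stock component that is carried over into the medium by the same dilution. The function names and the 1000 mL final volume are assumptions made for this sketch and are not stated in the text.

```python
def stock_volume_ml(stock_pct, final_pct, final_volume_ml):
    """Volume of stock needed so that the final mix is at final_pct.

    Uses C1 * V1 = C2 * V2 with concentrations in percent (w/v).
    """
    if not (stock_pct > 0 and stock_pct >= final_pct >= 0):
        raise ValueError("final concentration must not exceed the stock")
    return final_pct * final_volume_ml / stock_pct


def carried_over(stock_component_conc, stock_pct, final_pct):
    """Concentration of a stock component after the same dilution."""
    return stock_component_conc * final_pct / stock_pct


# Assumed batch size of 1000 mL of growth medium.
print(stock_volume_ml(2.0, 0.8, 1000.0), "mL of 2% degraded alginate")
# NaCl contributed by the stock alone (stock contains 400 mM NaCl).
print(carried_over(400.0, 2.0, 0.8), "mM NaCl from the alginate stock")
```

The carried-over NaCl is one of the sodium sources that the text counts toward the 1.5% total, so only the remainder would be added separately.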
[0409]The results are illustrated in FIG. 4, which shows that certain fosmid-containing E. coli clones are capable of growing on alginate as a sole source of carbon. Agrobacterium tumefaciens provides a positive control (see hatched circles). As a negative control, E. coli DH10B cells are not capable of growing on alginate (see immediate left of positive control).
[0410]These results also demonstrate that the sequences contained within this Vibrio splendidus derived fosmid clone are sufficient to confer on E. coli the ability to grow on degraded alginate as a sole source of carbon. Accordingly, the type II secretion machinery sequences contained within the pDONR221 vector (i.e., SEQ ID NO:1), which was harbored by the original DH10B cells, were not necessary for growth on degraded alginate.
[0411]The isolated fosmid sufficient to confer growth on alginate as a sole source of carbon was sequenced by Elim Biopharmaceuticals (Hayward, Calif.) using the following primers: Uni R3--GGGCGGCCGCAAGGGGTTCGCGTTGGCCGA (SEQ ID NO:147) and PCC1FOS_uni_F--GGAGAAAATACCGCATCAGGCG (SEQ ID NO:148). Sequencing showed that the vector contained a genomic DNA section that contained the full length genes V12B01--24189 to V12B01--24249 (see SEQ ID NOS:2-64). SEQ ID NO:2 shows the nucleotide sequence of the entire region between V12B01--24189 and V12B01--24249. SEQ ID NOS:3-64 show the individual putative genes contained within SEQ ID NO:2. In this sequence, there is a large gene before V12B01--24189 that is truncated in the fosmid clone. The large gene V12B01--24184 encodes a putative protein with similarity to autotransporters and belongs to COG3210, which is a cluster of orthologous proteins that include large exoproteins involved in heme utilization or adhesion. In the fosmid clone, V12B01--24184 is N-terminally truncated such that the first 5893 bp are missing from the predicted open reading frame (which is predicted to contain 22889 bp in total).
Example 2
Engineering E. coli to Grow on Pectin as a Sole Source of Carbon
[0412]Wild type E. coli is not capable of growing on pectin, di-, or tri-galacturonates as a sole source of carbon. To identify the minimal components needed to confer on E. coli the capability of growing on pectin, di- and/or tri-galacturonates as a sole source of carbon, an E. coli strain BL21(DE3) harboring both the pBBRGal3P plasmid and the pTrcogl-kdgR plasmid was engineered and tested for the ability to grow on these substrates.
[0413]The pBBRGal3P plasmid was engineered to contain a certain genomic region of Erwinia carotovora subsp. Atroseptica SCRI 1043, comprising several genes (kdgF, kduI, kduD, pelW, togM, togN, togA, togB, kdgM, and paeX) encoding certain enzymes (kduI, kduD, ogl, pelW and paeX), transporters (togM, togN, togA, togB, and kdgM), and regulatory proteins (kdgR) responsible for the degradation of di- and trigalacturonate. SEQ ID NO:65 shows the nucleotide sequence of the kdgF-paeX region from Erwinia carotovora subsp. Atroseptica SCRI 1043.
[0414]To construct this plasmid, the DNA sequence encoding kdgF, kduI, kduD, pelW, togM, togN, togA, togB, kdgM, paeX, ogl, and kdgR of Erwinia carotovora subsp. Atroseptica SCRI 1043 was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 6 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CGGGATCC AAGTTGCAGGATATGACGAAAGCG-3') (SEQ ID NO:149) and reverse (5'-GCTCTAGA AGATTATCCCTGTCTGCGGAAGCGG-3') (SEQ ID NO:150) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Erwinia carotovora subsp. Atroseptica SCRI 1043 genome (ATCC) in 50 μl.
[0415]The vector pBBR1MCS-2 was then amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 2.5 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-GCTCTAGA GGGGTGCCTAATGAGTGAGCTAAC-3') (SEQ ID NO:151) and reverse (5'-CGGGATCC GCGTTAATATTTTGTTAAAATTCGC-3') (SEQ ID NO:152) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pBBR1MCS-2 in 50 μl. Both amplified DNA fragments were digested with BamHI and XbaI and ligated.
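The cloning strategy above relies on each primer carrying the recognition sequence of the enzyme later used for digestion. As a hedged illustration, the following sketch checks the four primers of SEQ ID NOS:149-152 for the BamHI (GGATCC) and XbaI (TCTAGA) recognition sequences. The primer labels and the helper function are hypothetical names introduced for this example; the sequences are copied from the text with internal spaces removed.

```python
# Recognition sequences of the two enzymes named in the text.
# Both are palindromic, so scanning one strand is sufficient.
SITES = {"BamHI": "GGATCC", "XbaI": "TCTAGA"}

# Primers as printed in the text, with internal spaces removed.
# The dictionary keys are illustrative labels only.
PRIMERS = {
    "insert_fwd (SEQ ID NO:149)": "CGGGATCCAAGTTGCAGGATATGACGAAAGCG",
    "insert_rev (SEQ ID NO:150)": "GCTCTAGAAGATTATCCCTGTCTGCGGAAGCGG",
    "vector_fwd (SEQ ID NO:151)": "GCTCTAGAGGGGTGCCTAATGAGTGAGCTAAC",
    "vector_rev (SEQ ID NO:152)": "CGGGATCCGCGTTAATATTTTGTTAAAATTCGC",
}


def sites_in(primer: str) -> list[str]:
    """Return the names of the enzymes whose recognition sequence occurs in the primer."""
    return [name for name, site in SITES.items() if site in primer.upper()]


site_map = {label: sites_in(seq) for label, seq in PRIMERS.items()}

for label, found in site_map.items():
    print(f"{label}: {', '.join(found) if found else 'no site found'}")
```

Under this check the insert carries a BamHI site at its forward end and an XbaI site at its reverse end, while the vector fragment carries the two sites in the opposite arrangement, which is consistent with a directional ligation of the two digested fragments.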
[0416]The pTrcogl-kdgR plasmid was engineered to contain certain genomic regions of Erwinia carotovora subsp. Atroseptica SCRI 1043, comprising two genes (ogl and kdgR) encoding an enzyme (ogl) and a regulatory protein (kdgR) responsible for degradation of di- and trigalacturonate. SEQ ID NO:66 shows the nucleotide sequence of ogl-kdgR from Erwinia carotovora subsp. Atroseptica SCRI 1043.
[0417]To prepare this construct, the DNA sequence encoding ogl and kdgR of Erwinia carotovora subsp. Atroseptica SCRI 1043 was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 4 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-GCTCTAGA GTTTATGTCGCACCCGCCGTTGG-3') (SEQ ID NO:153) and reverse (5'-CCCAAGC TTAGAAAGGGAAATTGTGGTAGCCC-3') (SEQ ID NO:154) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Erwinia carotovora subsp. Atroseptica SCRI 1043 genome (ATCC) in 50 μl. The amplified DNA fragment was digested with XbaI and HindIII and ligated into pTrc99A pre-digested with the same restriction enzymes.
[0418]The plasmids pBBRGal3P and pTrcogl-kdgR were co-transformed into E. coli strain BL21(DE3). A single colony was inoculated into LB media containing 50 μg/ml kanamycin and 100 μg/ml ampicillin, and the culture was grown in an incubation shaker at 200 rpm and 37° C. When the culture reached an OD600nm of 0.6, 500 μl of culture was transferred to an Eppendorf tube and centrifuged to pellet the cells. The cells were resuspended in 50 μl of M9 media containing 2 mM MgSO4, 100 μM CaCl2, and 0.4% di- or trigalacturonate, and 5 μl of this suspension was inoculated into 500 μl of fresh M9 media of the same composition. The culture was grown in an incubation shaker at 200 rpm and 37° C.
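The inoculation step concentrates the cells ten-fold and then dilutes them roughly one hundred-fold. A rough estimate of the starting cell density can be sketched as follows. It assumes complete cell recovery after centrifugation and a linear relationship between OD600 and cell density, neither of which is stated in the text; the variable names are illustrative.

```python
# Values taken from the protocol above.
culture_od600 = 0.6      # OD600 of the LB culture at harvest
harvest_ul = 500.0       # volume pelleted
resuspend_ul = 50.0      # volume of M9 media used for resuspension
inoculum_ul = 5.0        # volume of suspension transferred
fresh_medium_ul = 500.0  # volume of fresh M9 media inoculated

# Assumption: all cells are recovered in the pellet.
concentration_factor = harvest_ul / resuspend_ul

# Fraction of the final culture volume contributed by the inoculum.
dilution_factor = inoculum_ul / (inoculum_ul + fresh_medium_ul)

# Assumption: OD600 scales linearly with cell density over this range.
starting_od600 = culture_od600 * concentration_factor * dilution_factor

print(f"estimated starting OD600: {starting_od600:.3f}")
```

Under these assumptions the growth experiment starts at roughly one tenth of the harvest density, so any increase in turbidity reflects growth on di- or trigalacturonate rather than carried-over cells or LB media.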
[0419]The results in FIG. 5A show that these two plasmids were sufficient to give E. coli the ability to grow on di- and trigalacturonate, but not pectin, as a sole source of carbon. In particular, these results show that the regions kdgF-paeX and ogl-kdgR were sufficient to confer this ability on E. coli.
[0420]Based on the information obtained from the above experiments, it was considered whether the introduction of pectate lyase, pectate acetylesterase, and methylesterase might confer on E. coli the capability of growing on pectin. To test this hypothesis, E. coli strain DH5α bacterial cells were engineered to contain both the pROU2 plasmid and the pPEL74 plasmid.
[0421]The pROU2 plasmid contains certain genomic regions of Erwinia chrysanthemi, comprising several genes (kdgF, kduI, kduD, pelW, togM, togN, togA, togB, kdgM, paeX, ogl, and kdgR) encoding enzymes (kduI, kduD, ogl, pelW, and paeX), transporters (togM, togN, togA, togB, and kdgM), and regulatory proteins (kdgR) responsible for degradation of di- and trigalacturonate.
[0422]The pPEL74 plasmid contains certain genomic regions of Erwinia chrysanthemi, comprising several genes (pelA, pelE, paeY, and pem) encoding pectate lyases (pelA and pelE), pectin acetylesterases (paeY), and pectin methylesterase (pem).
[0423]As shown in FIG. 5B, E. coli DH5α engineered with pROU2 and pPEL74 was able to grow on pectin as a sole source of carbon, showing that the genes contained within these plasmids are sufficient to confer this property on an organism that is otherwise incapable of growing on pectin as a sole source of carbon.
Example 3
In Vitro Conversion of Alginate to Pyruvate and Glyceraldehyde-3-Phosphate
[0424]An enzyme mixture containing all of the enzymes required for alginate degradation and metabolism was investigated for its ability to produce pyruvate from alginate. In addition, various novel alcohol dehydrogenases (ADHs), such as ADH1-12 (see SEQ ID NOS:69-92), isolated from Agrobacterium tumefaciens, were tested for their ability to catalyze either DEHU or mannuronate hydrogenation.
[0425]A simplified metabolic pathway for alginate degradation and metabolism is shown in FIG. 2. Alginate can be degraded by at least two different methodologies: enzymatic and chemical methodologies.
[0426]In enzymatic degradation, the degradation of alginate is catalyzed by a family of enzymes called alginate lyases. For this experiment, Atu3025 was used. Atu3025 is an exolytically acting enzyme and yields DEHU from alginate polymer. DEHU is converted to the common hexuronate metabolite, KDG. This reaction is catalyzed by alcohol dehydrogenases (e.g., DEHU hydrogenases).
[0427]Chemical degradation catalyzed by an acid solution, such as formate, yields the monosaccharide mannuronate. Mannuronate is then converted to mannonate, which is catalyzed by enzymes with mannonate dehydrogenase (mannuronate reductase) activity. In bacteria, mannonate dehydratase (UxuA) catalyzes the dehydration of mannonate to form KDG.
[0428]KDG is readily metabolized to form pyruvate and glyceraldehyde-3-phosphate (G3P). KDG is first phosphorylated to KDG-6-phosphate (KDGP), which is catalyzed by KDG kinase, and then broken down to pyruvate and G3P, which is catalyzed by KDGP aldolase.
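The two routes from alginate to pyruvate and G3P described above can be written down as a small lookup table and traversed step by step. This is an illustrative sketch of the pathway as described in the text, not a kinetic model; the dictionary layout and function name are assumptions made for this example.

```python
# Each entry maps a substrate to (product, catalyst), following the text.
# The two alginate entries distinguish the enzymatic and chemical routes.
PATHWAY = {
    "alginate (enzymatic)": ("DEHU", "alginate lyase (Atu3025)"),
    "alginate (chemical)": ("mannuronate", "acid hydrolysis (formate)"),
    "DEHU": ("KDG", "DEHU hydrogenase"),
    "mannuronate": ("mannonate", "mannuronate reductase"),
    "mannonate": ("KDG", "mannonate dehydratase (UxuA)"),
    "KDG": ("KDGP", "KDG kinase"),
    "KDGP": ("pyruvate + G3P", "KDGP aldolase"),
}


def route(start: str) -> list[tuple[str, str, str]]:
    """Follow the pathway from a starting substrate to its end product.

    Returns a list of (substrate, catalyst, product) steps.
    """
    steps = []
    current = start
    while current in PATHWAY:
        product, catalyst = PATHWAY[current]
        steps.append((current, catalyst, product))
        current = product
    return steps


for start in ("alginate (enzymatic)", "alginate (chemical)"):
    print(start)
    for substrate, catalyst, product in route(start):
        print(f"  {substrate} -> {product}  [{catalyst}]")
```

Both routes converge at KDG, so the KDG kinase and KDGP aldolase steps are shared regardless of how the alginate is first degraded.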
[0429]Preparation of oligoalginate lyase Atu3025 derived from Agrobacterium tumefaciens C58. pETAtu3025 was constructed based on the pET29 plasmid backbone (Novagen). The oligoalginate lyase Atu3025 was amplified by PCR: 98° C. for 10 sec, 55° C. for 15 sec, and 72° C. for 60 sec, repeated 30 times. The reaction mixture contained 1× Phusion buffer, 2 mM dNTP, 0.5 μM forward (5'-GGAATTCCATATGCGTCCCTCTGCCCCGGCC-3') (SEQ ID NO:155) and reverse (5'-CGGGATCCTTAGAACTGCTTGGGAAGGGAG-3') (SEQ ID NO:156) primers, 2.5 U Phusion DNA polymerase (Finnzymes), and an aliquot of Agrobacterium tumefaciens C58 (gift from Professor Eugene Nester, University of Washington) cells as a template in a total volume of 100 μl. The amplified fragment was digested with NdeI and BamHI and ligated into pET29 pre-digested with the same enzymes using T4 DNA ligase to form pETAtu3025. The constructed plasmid was sequenced (Elim Biopharmaceuticals) and the DNA sequence of the insert was confirmed. The nucleotide sequence of the Atu3025 insert is provided in SEQ ID NO:67. The polypeptide sequence encoded by the Atu3025 insert is provided in SEQ ID NO:68.
[0430]The pETAtu3025 was transformed into Escherichia coli strain BL21(DE3). A colony of BL21(DE3) containing pETAtu3025 was inoculated into 50 ml of LB media containing 50 μg/ml kanamycin (Km50). This strain was grown in an orbital shaker with 200 rpm at 37° C. The 0.2 mM IPTG was added to the culture when the OD600nm reached 0.6, and the induced culture was grown in an orbital shaker with 200 rpm at 20° C. 24 hours after the induction, the cells were harvested by centrifugation at 4,000 rpm×g for 10 min and the pellet was resuspended into 2 ml of Bugbuster (Novagen) containing 10 μl of Lysonase® Bioprocessing Reagent (Novagen). The solution was again centrifuged at 4,000 rpm×g for 10 min and the supernatant was obtained.
[0431]Construction of pETADH1 through pETADH12. DNA sequences of ADH1-12 of Agrobacterium tumefaciens C58 were amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (Table 1) and reverse (Table 1) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Agrobacterium tumefaciens C58 genome in 50 μl. Each amplified DNA fragment was digested with NdeI and BamHI and ligated into pET28 pre-digested with the same restriction enzymes. For DNA sequences with an internal NdeI or BamHI site, the front and back halves of each ADH sequence were first amplified using the described method. The resulting two DNA fragments were gel purified and spliced by overlapping PCR.
TABLE-US-00001 TABLE 1 Primers used to amplify ADH1-12 from Agrobacterium tumefaciens C58.
Name   A. tumefaciens C58  Forward Primer                                  Reverse Primer
ADH1   Atu1557             GGAATTCCATATGTTCACAACGTCCGCCTA (SEQ ID NO:276)  GCTTGACGGCCATGTGGCCGAGGCCGC (SEQ ID NO:277)
                           GCGGCCTCGGCCACATGGCCGTCAAGC (SEQ ID NO:278)     CGGGATCCTTAGGCGGCCTTCTGGCGCG (SEQ ID NO:279)
ADH2   Atu2022             GGAATTCCATATGGCTATTGCAAGAGGTTA (SEQ ID NO:280)  CGGGATCCTTAAGCGTCGAGCGAGGCCA (SEQ ID NO:281)
ADH3   Atu0626             GGAATTCCATATGACTAAAACAATGAAGGC (SEQ ID NO:282)  CACCGGGGCCGGGGTCCGGTATTGCCA (SEQ ID NO:283)
                           TGGCAATACCGGACCCCGGCCCCGGTG (SEQ ID NO:284)     CGGGATCCTTAGGCGGCGAGATCCACGA (SEQ ID NO:285)
ADH4   Atu5240             GGAATTCCATATGACCGGGGCGAACCAGCC (SEQ ID NO:286)  ATAGCCGCTCATACGCCTCGGTTGCCT (SEQ ID NO:287)
                           AGGCAACCGAGGCGTATGAGCGGCTAT (SEQ ID NO:288)     CGGGATCCTTAAGCGCCGTGCGGAAGGA (SEQ ID NO:289)
ADH5   Atu3163             GGAATTCCATATGACCATGCATGCCATTCA (SEQ ID NO:290)  CGGGATCCTTATTCGGCTGCAAATTGCA (SEQ ID NO:291)
ADH6   Atu2151             GGAATTCCATATGCGCGCGCTTTATTACGA (SEQ ID NO:292)  CGGGATCCTTATTCGAACCGGTCGATGA (SEQ ID NO:293)
ADH7   Atu2814             GGAATTCCATATGCTGGCGATTTTCTGTGA (SEQ ID NO:294)  CGGGATCCTTATGCGACCTCCACCATGC (SEQ ID NO:295)
ADH8   Atu5447             GGAATTCCATATGAAAGCCTTCGTCGTCGA (SEQ ID NO:296)  CGGGATCCTTAGGATGCGTATGTAACCA (SEQ ID NO:297)
ADH9   Atu4087             GGAATTCCATATGAAAGCGATTGTCGCCCA (SEQ ID NO:298)  CGGGATCCTTAGGAAAAGGCGATCTGCA (SEQ ID NO:299)
ADH10  Atu4289             GGAATTCCATATGCCGATGGCGCTCGGGCA (SEQ ID NO:300)  CGGGATCCTTAGAATTCGATGACTTGCC (SEQ ID NO:301)
ADH11  Atu3027             GGAATTCCATATGAAACATTCTCAGGACAA (SEQ ID NO:302)  GGGCGCCGATCATGTGGTGCGTTTCCG (SEQ ID NO:303)
                           CGGAAACGCACCACATGATCGGCGCCC (SEQ ID NO:304)     CGGGATCCTTATGCCATACGTTCCATAT (SEQ ID NO:305)
ADH12  Atu3026             GGAATTCCATATGCAGCGTTTTACCAACAG (SEQ ID NO:306)  CGGGATCCTTAGGAAAACAGGACGCCGC (SEQ ID NO:307)
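For the ADH genes amplified in two halves (ADH1, ADH3, ADH4, and ADH11), splicing by overlapping PCR requires that the reverse primer of the first half and the forward primer of the second half anneal to one another. A simple way to illustrate this, offered as a sketch rather than as part of the described method, is to check that each such primer pair from Table 1 is an exact reverse complement. The function and variable names are introduced for this example only.

```python
# Translation table for complementing an unambiguous DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# (first-half reverse primer, second-half forward primer), copied from Table 1.
INTERNAL_PAIRS = {
    "ADH1": ("GCTTGACGGCCATGTGGCCGAGGCCGC", "GCGGCCTCGGCCACATGGCCGTCAAGC"),
    "ADH3": ("CACCGGGGCCGGGGTCCGGTATTGCCA", "TGGCAATACCGGACCCCGGCCCCGGTG"),
    "ADH4": ("ATAGCCGCTCATACGCCTCGGTTGCCT", "AGGCAACCGAGGCGTATGAGCGGCTAT"),
    "ADH11": ("GGGCGCCGATCATGTGGTGCGTTTCCG", "CGGAAACGCACCACATGATCGGCGCCC"),
}

# True where the two internal primers are exact reverse complements,
# so that the two half-fragments share a 27-nt overlap for splicing.
overlap_ok = {
    name: reverse_complement(rev) == fwd
    for name, (rev, fwd) in INTERNAL_PAIRS.items()
}

for name, ok in overlap_ok.items():
    print(f"{name}: internal primers are reverse complements = {ok}")
```

Because each internal pair is fully complementary, the two half-fragments end in the same 27-nucleotide sequence, which allows them to prime one another in the second round of PCR.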
Expression and Purification of ADH1-10.
[0432]All plasmids were transformed into Escherichia coli strain BL21(DE3). The single colonies of BL21(DE3) containing respective alcohol dehydrogenase (ADH) genes were inoculated into 50 ml of LB media containing 50 μg/ml kanamycin (Km50). These strains were grown in an orbital shaker with 200 rpm at 37° C. The 0.2 mM IPTG was added to each culture when the OD600nm reached 0.6, and the induced culture was grown in an orbital shaker with 200 rpm at 20° C. 24 hours after the induction, the cells were harvested by centrifugation at 4,000 rpm×g for 10 min and the pellet was resuspended into 2 ml of Bugbuster (Novagen) containing 10 μl of Lysonase® Bioprocessing Reagent (Novagen). The solution was again centrifuged at 4,000 rpm×g for 10 min and the supernatant was obtained.
Preparation of ˜2% DEHU Solution by Enzymatic Degradation.
[0433]DEHU solution was enzymatically prepared. A 2% alginate solution was prepared by adding 10 g of low viscosity alginate to 500 ml of 20 mM Tris-HCl (pH 7.5) solution. Approximately 10 mg of alginate lyase derived from Flavobacterium sp. (purchased from Sigma-Aldrich) was added to the alginate solution. 250 ml of this solution was then transferred to another bottle, and the E. coli cell lysate containing Atu3025, prepared as described in the above section, was added. The alginate degradation was carried out at room temperature overnight. The resulting products were analyzed by thin layer chromatography, and DEHU formation was confirmed.
Preparation of D-Mannuronate Solution by Chemical Degradation.
[0434]D-mannuronate solution was chemically prepared based on the protocol previously described by Spoehr (Archive of Biochemistry, 14: pp 153-155). Fifty milligrams of alginate was dissolved in 800 μL of ninety percent formate. This solution was incubated at 100° C. overnight. Formate was then evaporated and the residual substances were washed with absolute ethanol twice. The residual substance was again dissolved in absolute ethanol and filtered. Ethanol was evaporated, the residual substances were resuspended in 20 mL of 20 mM Tris-HCl (pH 8.0), and the solution was filtered to make a D-mannuronate solution. This D-mannuronate solution was diluted 5-fold and used for the assay.
Assay for DEHU Hydrogenase.
[0435]To identify DEHU hydrogenase, an NADPH dependent DEHU hydrogenation assay was performed. 20 μl of prepared cell lysate containing each ADH was added to 160 μl of the 20-fold diluted DEHU solution prepared in the above section. 20 μl of 2.5 mg/ml NADPH solution (20 mM Tris-HCl, pH 8.0) was added to initiate the hydrogenation reaction, as a preliminary study using cell lysate of A. tumefaciens C58 had shown that DEHU hydrogenation requires NADPH as a co-factor. The consumption of NADPH was monitored by absorbance at 340 nm for 30 min using the kinetic mode of a ThermoMAX 96 well plate reader (Molecular Devices). E. coli cell lysate containing alcohol dehydrogenase (ADH) 10 lacking a portion of the N-terminal domain was used in a control reaction mixture.
Assay for D-Mannuronate Hydrogenase.
[0436]To identify D-mannuronate hydrogenase, an NADPH dependent D-mannuronate hydrogenation assay was performed. 20 μl of prepared cell lysate containing each ADH was added to 160 μl of D-mannuronate solution prepared in the above section. 20 μl of 2.5 mg/ml NADPH solution (20 mM Tris-HCl, pH 8.0) was added to initiate the hydrogenation reaction. The consumption of NADPH was monitored by absorbance at 340 nm for 30 min using the kinetic mode of a ThermoMAX 96 well plate reader (Molecular Devices). E. coli cell lysate containing alcohol dehydrogenase (ADH) 10 lacking a portion of the N-terminal domain was used in a control reaction mixture.
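In both hydrogenase assays, activity is read out as the decrease in absorbance at 340 nm as NADPH is consumed. One conventional way to convert such a slope into a consumption rate is the Beer-Lambert law. The sketch below uses the commonly cited molar absorptivity of NADPH at 340 nm (6220 M^-1 cm^-1); the effective path length for a 200 μl well and the example slope are assumed values that are not given in the text and would need to be measured for the actual plate and reader.

```python
# Commonly cited molar absorptivity of NADPH at 340 nm.
EPSILON_NADPH_340 = 6220.0  # M^-1 cm^-1

# Assumed effective path length for 200 ul in a 96-well plate (not from the text).
PATH_LENGTH_CM = 0.56


def nadph_rate_uM_per_min(delta_a340_per_min: float,
                          path_cm: float = PATH_LENGTH_CM) -> float:
    """Convert an absorbance slope (A340 units/min) to NADPH consumed (uM/min).

    Beer-Lambert: A = epsilon * c * l, so dc/dt = (dA/dt) / (epsilon * l).
    The sign is dropped because consumption appears as a negative slope.
    """
    return abs(delta_a340_per_min) / (EPSILON_NADPH_340 * path_cm) * 1e6


# Final NADPH concentration in the well: 20 ul of 2.5 mg/ml stock in a
# 200 ul reaction (20 ul lysate + 160 ul substrate + 20 ul NADPH).
final_nadph_mg_per_ml = 2.5 * 20.0 / (20.0 + 160.0 + 20.0)

# Hypothetical slope, for illustration only.
example_rate = nadph_rate_uM_per_min(-0.010)

print(f"final NADPH: {final_nadph_mg_per_ml:.2f} mg/ml")
print(f"example rate: {example_rate:.2f} uM NADPH per min")
```

Subtracting the rate obtained with the control lysate (the truncated ADH10) from each sample rate would give the activity attributable to the tested ADH.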
Construction of pETkdgK.
[0437]DNA sequence of kdgK of Escherichia coli encoding 2-keto-deoxy gluconate kinase was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-AGGTACGGTGAAATAA AGGAGG ATATACAT ATGTCCAAAAAGATTGCCGT-3') (SEQ ID NO:157) and reverse (5'-TTTTCCTTTTGCGGCCGCCCCGCTGGCATCGCCTCAC-3') (SEQ ID NO:158) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli DH10B genome in 50 μl. Amplified DNA fragment was digested with NdeI and NotI and ligated into pET29 pre-digested with the same restriction enzymes.
Construction of pETkdgA.
[0438]The DNA sequence of kdgA of Escherichia coli encoding 2-keto-deoxy gluconate-6-phosphate aldolase was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-GGCGATGCCAGCGTAA AGGAGG ATATACAT ATGAAAAACTGGAAAACAAG-3') (SEQ ID NO:159) and reverse (5'-TTTTCCTTTTGCGGCCGCCCCAGCTTAGCGCCTTCTA-3') (SEQ ID NO:160) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli DH10B genome in 50 μl. The amplified DNA fragment was digested with NdeI and NotI and ligated into pET29 pre-digested with the same restriction enzymes.
Protein Expression and Purification.
[0439]All plasmids (pETAtu3025, pETADH11, pETADH12, pETkdgA, pETkdgK, and pETuxuA) were transformed into Escherichia coli strain BL21(DE3). The single colonies of BL21(DE3) containing respective plasmids were inoculated into 50 ml of LB media containing 50 μg/ml kanamycin (Km50). These strains were grown in an orbital shaker with 200 rpm at 37° C. The 0.2 mM IPTG was added to each culture when the OD600nm reached 0.6, and the induced culture was grown in an orbital shaker with 200 rpm at 20° C. 24 hours after the induction, the cells were harvested by centrifugation at 4,000 rpm×g for 10 min and the pellet was resuspended into 2 ml of Bugbuster (Novagen) containing 10 μl of Lysonase® Bioprocessing Reagent (Novagen) and suggested amount of protease inhibitor cocktail (SIGMA). The solution was again centrifuged at 4,000 rpm×g for 10 min and the supernatant was obtained. The supernatant was applied to Nickel-NTA spin column (Qiagen) to purify His-tagged proteins.
[0440]The results of the assays for DEHU hydrogenase activity and D-mannuronate hydrogenase activity of ADH1-10 are shown in FIGS. 7A and 7B. These results demonstrate that the novel enzymes ADH1 and ADH2 showed significant DEHU hydrogenase activity (FIG. 7A), and that the novel enzymes ADH3, ADH4, and ADH9 showed significant mannuronate hydrogenase activity (FIG. 7B).
In Vitro Pyruvate Formation.
[0441]The reaction mixture contained 1% alginate or ˜0.5% mannuronate, ˜5 μg of purified Atu3026 (ADH12) or Atu3027 (ADH11), ˜5 μg each of purified oligoalginate lyase (Atu3025), UxuA, KdgK, and KdgA, 2 mM of ATP, and 0.6 mM of NADPH in 20 mM Tris-HCl pH 7.0. The reaction was carried out overnight and pyruvate formation was monitored with the pyruvate assay kit (BioVision, Inc).
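Given the cofactor concentrations in this reaction mixture, a rough upper bound on pyruvate formation can be estimated. The sketch below assumes, based on the pathway described in this Example, that each pyruvate formed consumes one NADPH (hydrogenase step) and one ATP (KDG kinase step), and that neither cofactor is regenerated in a mixture of purified enzymes. These stoichiometric assumptions are a reading of the pathway and are not stated explicitly in the text.

```python
# Cofactor concentrations in the reaction mixture (from the text).
cofactor_mM = {"NADPH": 0.6, "ATP": 2.0}

# Assumed cofactor demand per pyruvate formed (one hydrogenation, one phosphorylation).
required_per_pyruvate = {"NADPH": 1, "ATP": 1}

# Pyruvate that each cofactor could support if it were the only limit.
supported_mM = {
    name: cofactor_mM[name] / required_per_pyruvate[name]
    for name in cofactor_mM
}

# The smallest supported amount sets the upper bound (no regeneration assumed).
limiting = min(supported_mM, key=supported_mM.get)
max_pyruvate_mM = supported_mM[limiting]

print(f"limiting cofactor: {limiting}")
print(f"upper bound on pyruvate: {max_pyruvate_mM:.1f} mM")
```

Under these assumptions NADPH, rather than ATP or the substrate, would cap the amount of pyruvate that can be detected, which is worth keeping in mind when comparing yields between the alginate and mannuronate reactions.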
[0442]The results of in vitro pyruvate formation from alginate mediated by enzymatic and chemical degradation are shown in FIG. 6B and FIG. 6C, respectively. As can be seen in these figures, alginate was converted to pyruvate via the isolated enzymes. These results also show that each of Atu3026 (ADH12) and Atu3027 (ADH11) is capable of catalyzing both DEHU hydrogenase and mannuronate hydrogenase reactions.
Example 4
Construction and Biological Activity of Biosynthesis Pathways
Construction of Pathways:
[0443]A propionaldehyde biosynthetic pathway comprising a threonine deaminase (ilvA) gene from Escherichia coli and keto-isovalerate decarboxylase (kivd) from Lactococcus lactis is constructed and tested for the ability to convert L-threonine to propionaldehyde.
[0444]A butyraldehyde biosynthetic pathway comprising a thiolase (atoB) gene from E. coli, β-hydroxy butyryl-CoA dehydrogenase (hbd), crotonase (crt), butyryl-CoA dehydrogenase (bcd), electron transfer flavoprotein A (etfA), and electron transfer flavoprotein B (etfB) genes from Clostridium acetobutylicum ATCC 824, and a coenzyme A-linked butyraldehyde dehydrogenase (ald) gene from Clostridium beijerinckii was constructed in E. coli and tested for the ability to produce butyraldehyde. Also, a coenzyme A-linked alcohol dehydrogenase (adhE2) gene from Clostridium acetobutylicum ATCC 824 was used as an alternative to ald and tested for the ability to produce butanol.
[0445]An isobutyraldehyde biosynthetic pathway comprising an acetolactate synthase (alsS) from Bacillus subtilis or (als) from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (codon usage was optimized for E. coli protein expression) and acetolactate reductoisomerase (ilvC) and 2,3-dihydroxyisovalerate dehydratase (ilvD), genes from E. coli and keto-isovalerate decarboxylase (kivd) from Lactococcus lactis was constructed and tested for the ability to produce isobutyraldehyde, as measured by isobutanal production.
[0446]3-methylbutyraldehyde and 2-methylbutyraldehyde biosynthesis pathways comprising an acetolactate synthase (alsS) from Bacillus subtilis or (als) from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (codon usage was optimized for E. coli protein expression), acetolactate reductoisomerase (ilvC), 2,3-dihydroxyisovalerate dehydratase (ilvD), isopropylmalate synthase (LeuA), isopropylmalate isomerase (LeuC and LeuD), and 3-isopropylmalate dehydrogenase (LeuB) genes from E. coli and keto-isovalerate decarboxylase (kivd) from Lactococcus lactis were constructed and tested for the ability to produce 3-isovaleraldehyde and 2-isovaleraldehyde.
[0447]Phenylacetoaldehyde and 4-hydroxyphenylacetoaldehyde biosynthesis pathways comprising a transketolase (tktA), a 3-deoxy-7-phosphoheptulonate synthase (aroF, aroG, and aroH), 3-dehydroquinate synthase (aroB), a 3-dehydroquinate dehydratase (aroD), a dehydroshikimate reductase (aroE), a shikimate kinase II (aroL), a shikimate kinase I (aroK), a 5-enolpyruvylshikimate-3-phosphate synthetase (aroA), a chorismate synthase (aroC), a fused chorismate mutase P/prephenate dehydratase (pheA), and a fused chorismate mutase T/prephenate dehydrogenase (tyrA) genes from E. coli, keto-isovalerate decarboxylase (kivd) from Lactococcus lactis were constructed and tested for the ability to produce phenylacetoaldehyde and/or 4-hydroxyphenylacetoaldehyde.
[0448]A 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and 2-(indole-3)ethanol biosynthesis pathway comprising a transketolase (tktA), a 3-deoxy-7-phosphoheptulonate synthase (aroF, aroG, and aroH), 3-dehydroquinate synthase (aroB), a 3-dehydroquinate dehydratase (aroD), a dehydroshikimate reductase (aroE), a shikimate kinase II (aroL), a shikimate kinase I (aroK), a 5-enolpyruvylshikimate-3-phosphate synthetase (aroA), a chorismate synthase (aroC), a fused chorismate mutase P/prephenate dehydratase (pheA), and a fused chorismate mutase T/prephenate dehydrogenase (tyrA) genes from E. coli, keto-isovalerate decarboxylase (kivd) from Lactococcus lactis, alcohol dehydrogenase (adh2) from Saccharomyces cerevisiae, indole-3-pyruvate decarboxylase (ipdc) from Azospirillum brasilense, phenylethanol reductase (par) from Rhodococcus sp. ST-10, and benzaldehyde lyase (bal) from Pseudomonas fluorescens was constructed and tested for the ability to produce 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol and/or 2-(indole-3)ethanol.
[0449]Construction of pBADButP.
[0450]The DNA sequence encoding hbd, crt, bcd, etfA, and etfB of Clostridium acetobutylicum ATCC 824 was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 3 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCTTAGGAGGATTAGTCATGGAAC-3') (SEQ ID NO:161) and reverse (5'-GCTCTAGA TTATTTTGAATAATCGTAGAAACC-3') (SEQ ID NO:162) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Clostridium acetobutylicum ATCC 824 genome (ATCC) in 50 μl. The amplified DNA fragment was digested with SacI and XbaI, the sites carried by the forward and reverse primers, respectively, and ligated into pBAD33 pre-digested with the same restriction enzymes.
[0451]Construction of pBADButP-atoB.
[0452]The DNA sequence encoding atoB of Escherichia coli DH10B was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-GCTCTAGAGGAGGATATATATATGAAAAATTGTGTCATCGTC-3') (SEQ ID NO:163) and reverse (5'-AA CTGCAGTTAATTCAACCGTTCAATCACC-3') (SEQ ID NO:164) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli DH10B genome in 50 μl. Amplified DNA fragment was digested with XbaI and PstI and ligated into pBADButP pre-digested with the same restriction enzymes.
[0453]Construction of pBADatoB-ald.
[0454]The DNA sequences encoding atoB of Escherichia coli DH10B and ald from Clostridium beijerinckii were amplified separately by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CGAGCTC AGGAGGATATATATATGAAAAATTGTGTCATCGTCAGTG-3' (SEQ ID NO:165) for atoB and 5'-GGTTGAATTAAGGAGGATATATATATGAATAAAGACACACTAATACCTAC-3' (SEQ ID NO:166) for ald) and reverse (5'-GTCTTTATTCATATATATATCCTCCTTAATTCAACCGTTCAATCACCATC-3' (SEQ ID NO:146) for atoB and 5'-CCCAAGCTTAGCCGGCAAGTACACATCTTC-3' (SEQ ID NO:167) for ald) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng of Escherichia coli DH10B or Clostridium beijerinckii genome (ATCC) in 50 μl, respectively. The amplified DNA fragments were gel purified and eluted into 30 μl of EB buffer (Qiagen). 5 μl from each DNA solution was combined and the DNA fragments were spliced by another round of PCR: 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 2 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CGAGCTC AGGAGGATATATATATGAAAAATTGTGTCATCGTCAGTG-3') (SEQ ID NO:168) and reverse (5'-CCCAAGCTTAGCCGGCAAGTACACATCTTC-3') (SEQ ID NO:169) primers, and 1 U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and HindIII and ligated into pBADButP pre-digested with the same restriction enzymes.
[0455]Construction of pBADButP-atoB-ALD.
[0456]The DNA fragment 1 encoding chloramphenicol acetyltransferase (CAT), P15 origin of replication, araBAD promoter, atoB of Escherichia coli DH10B and ald of Clostridium beijerinckii and the DNA fragment 2 encoding araBAD promoter, hbd, crt, bcd, etfA, and etfB of Clostridium acetobutylicum ATCC 824 were amplified separately by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 4 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-AAGGAAAAAAGCGGCCGCCCCTGAACCGACGACCGGGTCG-3' (SEQ ID NO:170) for fragment 1 and 5'-CGGGGTACCACTTTTCATACTCCCGCCATTCAG-3' (SEQ ID NO:274) for fragment 2) and reverse (5'-CGGGGTACCGCGGATACATATTTGAATGTATTTAG-3' (SEQ ID NO:171) for fragment 1 and 5'-AAGGAAAAAAGCGGCCGCGCGGATACATATTTGAATGTATTTAG-3' (SEQ ID NO:172) for fragment 2) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng of pBADatoB-ald or pBADButP in 50 μl, respectively. The amplified DNA fragments were digested with NotI and KpnI and ligated to each other.
[0457]Construction of pBADilvCD.
[0458]The DNA fragments encoding ilvC and ilvD of Escherichia coli DH10B were amplified separately by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-GCTCTAGAGGAGGATATATATATGGCTAACTACTTCAATACAC-3') (SEQ ID NO:173) for ilvC and 5'-TGCTGTTGCGGGTTAAGGAGGATATATATATGCCTAAGTACCGTTCCGCC-3' for ilvD) (SEQ ID NO:174) and reverse (5'-AACGGTACTTAGGCATATATATATCCTCCTTAACCCGCAACAGCAATACG-3') (SEQ ID NO:175) for ilvC and 5'-ACATGCATGCTTAACCCCCCAGTTTCGATT-3') (SEQ ID NO:176) for ilvD) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli DH10B genome (ATCC) in 50 μl. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 2 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-GCTCTAGAGGAGGATATATATATGGCTAACTACTTCAATACAC-3') (SEQ ID NO:177) and reverse (5'-ACATGCATGCTTAACCCCCCAGTTTCGATT-3') (SEQ ID NO:178) primers, 1 U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with XbaI and SphI and ligated into pBAD33 pre-digested with the same restriction enzymes.
[0459]Construction of pBADals-ilvCD.
[0460]The DNA fragment encoding als of Klebsiella pneumoniae subsp. pneumoniae MGH 78578, with its codon usage optimized for over-expression in E. coli, was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCAGGAGGATATATATATGGATAAACAGTATCCGGT-3') (SEQ ID NO:179) and reverse (5'-GCTCTAGATTACAGAATTTGACTCAGGT-3') (SEQ ID NO:180) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETals in 50 μl. The amplified DNA fragment was digested with SacI and XbaI and ligated into pBADilvCD pre-digested with the same restriction enzymes.
[0461]Construction of pBADalsS-ilvCD.
[0462]The DNA fragments encoding the front and back halves of alsS of Bacillus subtilis B26 were amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 0.5 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCAGGAGGATATATATATGTTGACAAAAGCAACAAAAG-3' (SEQ ID NO:181) for the front half and 5'-CGGTACCCTTTCCAGAGATTTAGAG-3' (SEQ ID NO:275) for the back half) and reverse (5'-CTCTAAATCTCTGGAAAGGGTACCG-3' (SEQ ID NO:182) for the front half and 5'-GCTCTAGATTAGAGAGCTTTCGTTTTCATG-3' (SEQ ID NO:183) for the back half) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Bacillus subtilis B26 genome (ATCC) in 50 μl. The amplified DNA fragments were gel purified and eluted into 30 μl of EB buffer (Qiagen). 5 μl from each DNA solution was combined and the DNA fragments were spliced by another round of PCR: 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCAGGAGGATATATATATGTTGACAAAAGCAACAAAAG-3') (SEQ ID NO:184) and reverse (5'-GCTCTAGATTAGAGAGCTTTCGTTTTCATG-3') (SEQ ID NO:185) primers, and 1 U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was free of internal XbaI sites and thus was digested with SacI and XbaI and ligated into pBADilvCD pre-digested with the same restriction enzymes.
[0463]Construction of pBADLeuABCD.
[0464]The DNA fragment encoding leuA, leuB, leuC, and leuD of Escherichia coli BL21(DE3) was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 3 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID NO:186) and reverse (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:187) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 μl. The amplified DNA fragment was digested with SacI and XbaI and ligated into pBAD33 pre-digested with the same restriction enzymes.
[0465]Construction of pBADLeuABCD2.
[0466]The DNA fragment 1 encoding leuA and leuB and the DNA fragment 2 encoding leuC and leuD of Escherichia coli BL21 (DE3) were amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID NO:188) for fragment 1 and (5'-AGGGGTGTAAGGAGGATATATATATGGCTAAGACGTTATACGAAAAATTG-3') (SEQ ID NO:189) for fragment 2 and reverse (5'-CGTCTTAGCCATATATATATCCTCCTTACACCCCTTCTGCTACATAGCGG-3') (SEQ ID NO:190) for fragment 1 and (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:191) for fragment 2 primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 μl, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 3 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID NO:192) and reverse (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:193) primers, 1 U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and XbaI and ligated into pBAD33 pre-digested with the same restriction enzymes.
[0467]Construction of pBADLeuABCD4.
[0468]The DNA fragments encoding leuA, leuB, leuC and leuD of Escherichia coli BL21(DE3) were amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID NO:194) for leuA, (5'-GAAACCGTGTGAGGAGGATATATATATGTCGAAGAATTACCATATTGCCG-3') (SEQ ID NO:195) for leuB, (5'-AGGGGTGTAAGGAGGATATATATATGGCTAAGACGTTATACGAAAAATTG-3') (SEQ ID NO:196) for leuC, and (5'-ACATTAAATAAGGAGGATATATATATGGCAGAGAAATTTATCAAACACAC-3') (SEQ ID NO:197) for leuD and reverse (5'-ATTCTTCGACATATATATATCCTCCTCACACGGTTTCCTTGTTGTTTTCG-3') (SEQ ID NO:198) for leuA, (5'-CGTCTTAGCCATATATATATCCTCCTTACACCCCTTCTGCTACATAGCGG-3') (SEQ ID NO:199) for leuB, (5'-TTTCTCTGCCATATATATATCCTCCTTATTTAATGTTGCGAATGTCGGCG-3') (SEQ ID NO:200) for leuC, and (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:201) for leuD primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 μl, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 3 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID NO:202) and reverse (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:203) primers, 1 U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and XbaI and ligated into pBAD33 pre-digested with the same restriction enzymes.
[0469]Construction of pBADals-ilvCD-leuABCD, pBADals-ilvCD-leuABCD2, pBADals-ilvCD-leuABCD4, pBADalsS-ilvCD-leuABCD, pBADalsS-ilvCD-leuABCD2, pBADalsS-ilvCD-leuABCD4.
[0470]The DNA fragments 1 (for als) and 2 (for alsS) encoding chloramphenicol acetyltransferase (CAT), P15 origin of replication, araBAD promoter, als of Klebsiella pneumoniae subsp. pneumoniae MGH 78578 of its codon usage optimized for over-expression in E. coli or alsS of Bacillus subtilis B26 and ilvC and ilvD of E. coli DH10B were amplified separately by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 4 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-AAGGAAAAAAGCGGCCGCCCCTGAACCGACGACCGGGTCG-3') (SEQ ID NO:204) and reverse (5'-CGGGGTACCGCGGATACATATTTGAATGTATTTAG-3') (SEQ ID NO:205) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pBADals-ilvCD and pBADalsS-ilvCD in 50 μl, respectively.
[0471]To remove an internal SphI restriction enzyme site from leuC, overlap PCR was carried out. The front and bottom halves of DNA fragment 3 (for leuABCD), fragment 4 (for leuABCD2), and fragment 5 (for leuABCD4) encoding araBAD promoter, leuA, leuB, leuC, and leuD of E. coli BL21(DE3) were amplified separately by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 4 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-AAGGAAAAAAGCGGCCGCACTTTTCATACTCCCGCCATTCAG-3') (SEQ ID NO:206) for front and (5'-CAAAGGCCGTCTGCACGCGCCGAAAGGCAAA-3') (SEQ ID NO:207) for bottom halves and reverse (5'-TTTGCCTTTCGGCGCGTGCAGACGGCCTTTG-3') (SEQ ID NO:208) for front and (5'-ACATGCATGCCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:209) for bottom halves, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pBADleuABCD, pBADleuABCD2, and pBADleuABCD4 in 50 μl, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 4 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-AAGGAAAAAAGCGGCCGCACTTTTCATACTCCCGCCATTCAG-3') (SEQ ID NO:210) and reverse (5'-ACATGCATGCCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:211) primers, 1 U Phusion High Fidelity DNA polymerase (NEB). The resulting fragments 3, 4, and 5 were digested with SphI and NotI and ligated into both fragments 1 and 2 pre-digested with the same restriction enzymes.
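Because the spliced fragment is later cut with SphI and NotI, any internal copy of either recognition sequence would cleave the insert. A minimal Python sketch of a site scan follows. It uses the standard recognition sequences for SphI (GCATGC) and NotI (GCGGCCGC) and the primers quoted in paragraph [0471]. The helper `find_sites` is hypothetical, and in practice the scan would be run over the full fragment sequence, which is not reproduced here.

```python
# Sketch: locate restriction sites in a DNA string (0-based start positions).
# Both sites are palindromic, so scanning one strand is sufficient.
SITES = {"SphI": "GCATGC", "NotI": "GCGGCCGC"}


def find_sites(seq: str, site: str) -> list[int]:
    """Return every start index of `site` in `seq`, allowing overlaps."""
    seq = seq.upper()
    hits = []
    start = seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)
    return hits


mutagenic_forward = "CAAAGGCCGTCTGCACGCGCCGAAAGGCAAA"           # SEQ ID NO:207
outer_forward = "AAGGAAAAAAGCGGCCGCACTTTTCATACTCCCGCCATTCAG"    # SEQ ID NO:206
outer_reverse = "ACATGCATGCCGTTTGATGACGTGGACGATAGCGG"           # SEQ ID NO:209

print(find_sites(mutagenic_forward, SITES["SphI"]))  # inner primer: no SphI site
print(find_sites(outer_forward, SITES["NotI"]))      # cloning site in 5' tail
print(find_sites(outer_reverse, SITES["SphI"]))      # cloning site in 5' tail
```

The inner primer carries GCACGC where an SphI site would read GCATGC, which is consistent with a single-base change, although that reading is an inference from the primer sequence rather than a statement in the text.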
[0472]Construction of pBADaroG-tktA-aroBDE.
[0473]The DNA fragments encoding aroG, tktA, aroB, aroD, and aroE of Escherichia coli BL21(DE3) were amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCAGGAGGATATATATATGAATTATCAGAACGACGATTTAC-3') (SEQ ID NO:212) for aroG, (5'-GCGTCGCGGGTAAGGAGGAAAATTTTATGTCCTCACGTAAAGAGCTTGCC-3') (SEQ ID NO:213) for tktA, (5'-GAACTGCTGTAAGGAGGTTAAAATTATGGAGAGGATTGTCGTTACTCTCG-3') (SEQ ID NO:214) for aroB, (5'-CAATCAGCGTAAGGAGGTATATATAATGAAAACCGTAACTGTAAAAGATC-3') (SEQ ID NO:215) for aroD, and (5'-TACACCAGGCATAAGGAGGAATTAATTATGGAAACCTATGCTGTTTTTGG-3') (SEQ ID NO:216) for aroE and reverse (5'-TACGTGAGGACATAAAATTTTCCTCCTTACCCGCGACGCGCTTTTACTGC-3') (SEQ ID NO:217) for aroG, (5'-CAATCCTCTCCATAATTTTAACCTCCTTACAGCAGTTCTTTTGCTTTCGC-3') (SEQ ID NO:218) for tktA, (5'-CAATCAGCGTAAGGAGGTATATATAATGAAAACCGTAACTGTAAAAGATC-3') (SEQ ID NO:219) for aroB, (5'-TACGGTTTTCATTATATATACCTCCTTACGCTGATTGACAATCGGCAATG-3') (SEQ ID NO:220) for aroD, and (5'-ACATGCATGCTTACGCGGACAATTCCTCCTGCAA-3') (SEQ ID NO:221) for aroE, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 μl, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 3 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCAGGAGGATATATATATGAATTATCAGAACGACGATTTAC-3') (SEQ ID NO:222) and reverse (5'-ACATGCATGCTTACGCGGACAATTCCTCCTGCAA-3') (SEQ ID NO:223) primers, 1 U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and SphI and ligated into pBAD33 pre-digested with the same restriction enzymes.
[0474]Construction of pBADpheA-aroLAC.
[0475]The DNA fragments encoding pheA, aroL, aroA, and aroC of Escherichia coli DH10 were amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCAGGAGGATATATATATGACATCGGAAAACCCGTTACTGG-3') (SEQ ID NO:224) for pheA, (5'-GATCCAACCTAAGGAGGAAAATTTTATGACACAACCTCTTTTTCTGATCG-3') (SEQ ID NO:225) for aroL, (5'-GATCAATTGTTAAGGAGGTATATATAATGGAATCCCTGACGTTACAACCC-3') (SEQ ID NO:226) for aroA, and (5'-CAGGCAGCCTAAGGAGGAATTAATTATGGCTGGAAACACAATTGGACAAC-3') (SEQ ID NO:227) for aroC and reverse (5'-AGGTTGTGTCATAAAATTTTCCTCCTTAGGTTGGATCAACAGGCACTACG-3') (SEQ ID NO:228) for pheA, (5'-CAGGGATTCCATTATATATACCTCCTTAACAATTGATCGTCTGTGCCAGG-3') (SEQ ID NO:229) for aroL, (5'-GTTTCCAGCCATAATTAATTCCTCCTTAGGCTGCCTGGCTAATCCGCGCC-3') (SEQ ID NO:230) for aroA, and (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3') (SEQ ID NO:231) for aroC primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 μl, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 4 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCAGGAGGATATATATATGACATCGGAAAACCCGTTACTGG-3') (SEQ ID NO:232) and reverse (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3') (SEQ ID NO:233) primers, 1 U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and SphI and ligated into pBAD33 pre-digested with the same restriction enzymes.
[0476]Construction of pBADtyrA-aroLAC.
[0477]The DNA fragments encoding tyrA, aroL, aroA, and aroC of Escherichia coli DH10 were amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCAGGAGGATATATATATGGTTGCTGAATTGACCGCATTAC-3') (SEQ ID NO:234) for tyrA, (5'-AATCGCCAGTAAGGAGGAAAATTTTATGACACAACCTCTTTTTCTGATCG-3') (SEQ ID NO:235) for aroL, (5'-GATCAATTGTTAAGGAGGTATATATAATGGAATCCCTGACGTTACAACCC-3') (SEQ ID NO:236) for aroA, and (5'-CAGGCAGCCTAAGGAGGAATTAATTATGGCTGGAAACACAATTGGACAAC-3') (SEQ ID NO:237) for aroC, and reverse (5'-GAGGTTGTGTCATAAAATTTTCCTCCTTACTGGCGATTGTCATTCGCCTG-3') (SEQ ID NO:238) for tyrA, (5'-CAGGGATTCCATTATATATACCTCCTTAACAATTGATCGTCTGTGCCAGG-3') (SEQ ID NO:239) for aroL, (5'-GTTTCCAGCCATAATTAATTCCTCCTTAGGCTGCCTGGCTAATCCGCGCC-3') (SEQ ID NO:240) for aroA, and (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3') (SEQ ID NO:241) for aroC primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 μl, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 4 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCAGGAGGATATATATATGGTTGCTGAATTGACCGCATTAC-3') (SEQ ID NO:242) and reverse (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3') (SEQ ID NO:243) primers, 1 U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and SphI and ligated into pBAD33 pre-digested with the same restriction enzymes.
[0478]Construction of pBADpheA-aroLAC-aroG-tktA-aroBDE and pBADtyrA-aroLAC-aroG-tktA-aroBDE.
[0479]A DNA fragment 1 (for pheA) and 2 (for tyrA) encoding chloramphenicol acetyltransferase (CAT), P15 origin of replication, araBAD promoter, pheA or tyrA, aroL, aroA, aroC of Escherichia coli DH10B and a DNA fragment 3 encoding araBAD promoter, aroG, tktA, aroB, aroD, and aroE of Escherichia coli DH10B were amplified separately by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 4 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-AAGGAAAAAAGCGGCCGCCCCTGAACCGACGACCGGGTCG-3') (SEQ ID NO:244) for fragment 1 and 2 and (5'-GCTCTAGAACTTTTCATACTCCCGCCATTCAG-3') (SEQ ID NO:245) for fragment 3, and reverse (5'-GCTCTAGAGCGGATACATATTTGAATGTATTTAG-3') (SEQ ID NO:246) for fragment 1 and 2 and (5'-AAGGAAAAAAGCGGCCGCGCGGATACATATTTGAATGTATTTAG-3') (SEQ ID NO:247) for fragment 3, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pBADpheA-aroLAC, pBADtyrA-aroLAC, and pBADaroG-tktA-aroBDE in 50 μl, respectively. Amplified DNA fragments 1 and 2 were digested with NotI and XbaI and ligated into fragment 3 pre-digested with the same restriction enzymes.
[0480]Construction of pTrcBAL.
[0481]A DNA sequence encoding benzaldehyde lyase (bal) of Pseudomonas fluorescens of its codon usage optimized for over-expression in E. coli was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CATGCCATGGCTATGATTACTGGTGG-3') (SEQ ID NO:248) and reverse (5'-CCCCGAGCTCTTACGCGCCGGATTGGAAATACA-3') (SEQ ID NO:249) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETBAL in 50 μl. Amplified DNA fragment was digested with NcoI and SacI and ligated into pTrc99A pre-digested with the same restriction enzymes.
[0482]Construction of pTrcAdhE2.
[0483]A DNA sequence encoding CoA-linked alcohol/aldehyde dehydrogenase (adhE2) of Clostridium acetobutylicum ATCC824 was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CATGCCATGGCCAAAGTTACAAATCAAAAAG-3') (SEQ ID NO:250) and reverse (5'-CGAGCTCTTAAAATGATTTTATATAGATATCC-3') (SEQ ID NO:251) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Clostridium acetobutylicum ATCC824 genome in 50 μl. The amplified DNA fragment was digested with NcoI and SacI and ligated into pTrc99A pre-digested with the same restriction enzymes.
[0484]Construction of pTrcAdh2.
[0485]A DNA sequence encoding alcohol dehydrogenase (adh2) of Saccharomyces cerevisiae was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CATGCCATGGGTATTCCAGAAACTCAAAAAG-3') (SEQ ID NO:252) and reverse (5'-CCCGAGCTCTTATTTAGAAGTGTCAACAACG-3') (SEQ ID NO:253) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng genome of Saccharomyces cerevisiae in 50 μl. Amplified DNA fragment was digested with NcoI and SacI and ligated into pTrc99A pre-digested with the same restriction enzymes.
[0486]Construction of pTrcBALD.
[0487]A DNA sequence encoding CoA-linked aldehyde dehydrogenase (ald) of Clostridium beijerinckii was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCCGAGCTCAGGAGGATATACATATGAATAAAGACACACTAATACC-3') (SEQ ID NO:254) and reverse (5'-CCCAAGCTTAGCCGGCAAGTACACATCTTC-3') (SEQ ID NO:255) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETBAL in 50 μl. The amplified DNA fragment was digested with SacI and HindIII and ligated into pTrcBAL pre-digested with the same restriction enzymes.
[0488]Construction of pTrcBALK.
[0489]A DNA sequence encoding ketoisovalerate decarboxylase (kivd) of Lactococcus lactis was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCAGGAGGATATATATATGTATACAGTAGGAGATTACC-3') (SEQ ID NO:256) and reverse (5'-GCTCTAGATTATGATTTATTTTGTTCAGCAAAT-3') (SEQ ID NO:257) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETBAL in 50 μl. The amplified DNA fragment was digested with SacI and XbaI and ligated into pTrcBAL pre-digested with the same restriction enzymes.
[0490]Construction of pTrcAdh-Kivd.
[0491]A DNA sequence encoding ketoisovalerate decarboxylase (kivd) of Lactococcus lactis was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCCGAGCTCAGGAGGATATATATATGTATACAGTAGGAGATTACC-3') (SEQ ID NO:258) and reverse (5'-GCTCTAGATTATGATTTATTTTGTTCAGCAAAT-3') (SEQ ID NO:259) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETBAL in 50 μl. The amplified DNA fragment was digested with SacI and XbaI and ligated into pTrcAdh2 pre-digested with the same restriction enzymes.
[0492]Construction of pTrcBAL-DDH-2ADH.
[0493]To remove an internal NcoI site, overlap PCR was carried out. DNA fragments encoding front and bottom halves of meso-2,3-butanediol dehydrogenase (ddh) of Klebsiella pneumoniae subsp. pneumoniae MGH 78578 and secondary alcohol dehydrogenase (2adh) of Pseudomonas fluorescens were amplified separately by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CGAGCTCAGGAGGATATATATATGAAAAAAGTCGCACTTGTTACCG-3') (SEQ ID NO:260) for front half of ddh, (5'-GGCCGGCGGCCGCGCGATGGCGGTGAAAGTG-3') (SEQ ID NO:261) for bottom half of ddh, (5'-AACTAATCTAGAGGAGGATATATATATGAGCATGACGTTTTCCGGCCAGG-3') (SEQ ID NO:262) for front half of 2adh, and (5'-CCTTGCGGAGGGCTCGATGGATGAGTTCGAC-3') (SEQ ID NO:263) for bottom half of 2adh, and reverse (5'-CACTTTCACCGCCATCGCGCGGCCGCCGGCC-3') (SEQ ID NO:264) for front half of ddh, (5'-GCTCATATATATATCCTCCTCTAGATTAGTTAAACACCATCCCGCCGTCG-3') (SEQ ID NO:265) for bottom half of ddh, (5'-GTCGAACTCATCCATCGAGCCCTCCGCAAGG-3') (SEQ ID NO:266) for front half of 2adh, and (5'-CCCAAGCTTAGATCGCGGTGGCCCCGCCGTCG-3') (SEQ ID NO:267) for bottom half of 2adh, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Klebsiella pneumoniae subsp. pneumoniae MGH 78578 for ddh and Pseudomonas fluorescens genome for 2adh in 50 μl, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 2 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CGAGCTCAGGAGGATATATATATGAAAAAAGTCGCACTTGTTACCG-3') (SEQ ID NO:268) and reverse (5'-CCCAAGCTTAGATCGCGGTGGCCCCGCCGTCG-3') (SEQ ID NO:269) primers, 1 U Phusion High Fidelity DNA polymerase (NEB).
The spliced fragment was digested with SacI and HindIII and ligated into pTrcBAL pre-digested with the same restriction enzymes.
[0494]Construction of pBBRPduCDEGH.
[0495]A DNA sequence encoding propanediol dehydratase medium (pduD) and small (pduE) subunits and propanediol dehydratase reactivation large (pduG) and small (pduH) subunits of Klebsiella pneumoniae subsp. pneumoniae MGH 78578 was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 2 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-GCTCTAGAGGAGGATTTAAAAATGGAAATTAACGAAACGCTGC-3') (SEQ ID NO:270) and reverse (5'-TCCCCGCGGTTAAGCATGGCGATCCCGAAATGGAATCCCTTTGAC-3') (SEQ ID NO:271) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Klebsiella pneumoniae subsp. pneumoniae MGH 78578 in 50 μl. Amplified DNA fragment was digested with SacII and XbaI and ligated into pTrc99A pre-digested with the same restriction enzymes to form pBBRPduDEGH.
[0496]A DNA sequence encoding propanediol dehydratase large subunit (pduC) of Klebsiella pneumoniae subsp. pneumoniae MGH 78578 was amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCGCTCGAGGAGGATATATATATGAGATCGAAAAGATTTGAAGC-3') (SEQ ID NO:272) and reverse (5'-GCTCTAGATTAGCCAAGTTCATTGGGATCG-3') (SEQ ID NO:273) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Klebsiella pneumoniae subsp. pneumoniae MGH 78578 in 50 μl. Amplified DNA fragment was digested with XhoI and XbaI and ligated into pBBRPduDEGH pre-digested with the same restriction enzymes.
[0497]Construction of pTrcIpdc-Par.
[0498]DNA sequences encoding indole-3-pyruvate decarboxylase (ipdc) of Azospirillum brasilense and phenylethanol reductase (par) of Rhodococcus sp. ST-10 were amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 1 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward primers (5'-CATGCCATGGGACTGGCTGAGGCACTGCTGC-3' (SEQ ID NO:314) for ipdc and 5'-CGAGCTCAGGAGGATATATATATGAAAGCTATCCAGTACACCCGTAT-3' (SEQ ID NO:315) for par), and reverse primers (5'-CGAGCTCTTATTCGCGCGGTGCCGCGTGCAGG-3' (SEQ ID NO:316) for ipdc and 5'-GCTCTAGATTACAGGCCCGGAACCACAACGGCGC-3' (SEQ ID NO:317) for par), 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pTrcIpdc and pTrcPar, respectively, in 50 μl. The amplified DNA fragments of ipdc and par were digested with NcoI/SacI and SacI/XbaI, respectively, and were ligated into pTrc99A pre-digested with NcoI and XbaI.
Testing and Results:
[0499]To test the butyraldehyde biosynthesis pathway, DH10B cells harboring pBADButP-atoB/pTrcBALD and pBADButP-atoB-ALD/pTrcB2DH/pBBRpduCDEGH were grown overnight in LB media containing 50 ug/ml chloramphenicol (Cm50) and 100 ug/ml ampicillin (Amp100) at 37° C., 200 rpm. An aliquot of each seed culture was inoculated into fresh TB media containing Cm50 and Amp100 and was grown in an incubator shaker at 37° C., 200 rpm. Three hours after inoculation, the cultures were induced with 13.3 mM arabinose and 1 mM IPTG and were grown overnight. 700 ul of each culture was extracted with an equal volume of ethyl acetate and analyzed by GC-MS.
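The inducer concentrations above are given in millimolar, whereas arabinose is often weighed out as a percent (w/v) stock. The short Python sketch below converts between the two. The molecular weights (150.13 g/mol for L-arabinose and 238.30 g/mol for IPTG) are standard values supplied here as assumptions, since the text does not state them, and the helper name is illustrative.

```python
# Sketch: convert a millimolar concentration to percent (w/v) and to g/L.
ARABINOSE_MW = 150.13  # g/mol, L-arabinose (assumed; not stated in the text)
IPTG_MW = 238.30       # g/mol, IPTG (assumed; not stated in the text)


def mm_to_grams_per_liter(conc_mm: float, mw: float) -> float:
    """mmol/L -> g/L."""
    return conc_mm / 1000.0 * mw


def mm_to_percent_wv(conc_mm: float, mw: float) -> float:
    """mmol/L -> g per 100 mL."""
    return mm_to_grams_per_liter(conc_mm, mw) / 10.0


arabinose_pct = mm_to_percent_wv(13.3, ARABINOSE_MW)
iptg_g_per_l = mm_to_grams_per_liter(1.0, IPTG_MW)
print(f"13.3 mM arabinose = {arabinose_pct:.3f}% (w/v)")
print(f"1 mM IPTG = {iptg_g_per_l:.3f} g/L")
```

Under these assumed molecular weights, 13.3 mM arabinose works out to roughly 0.2% (w/v).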
[0500]To test the isobutyraldehyde biosynthesis pathway, DH10B cells harboring pBADals-ilvCD/pTrcBALK or pBADalsS-ilvCD/pTrcBALK were grown overnight in LB media containing 50 ug/ml chloramphenicol (Cm50) and 100 ug/ml ampicillin (Amp100) at 37° C., 200 rpm. An aliquot of each seed culture was inoculated into fresh TB media containing Cm50 and Amp100 and was grown in an incubator shaker at 37° C., 200 rpm. Three hours after inoculation, the cultures were induced with 13.3 mM arabinose and 1 mM IPTG and were grown overnight. 700 ul of each culture was extracted with an equal volume of ethyl acetate and analyzed by GC-MS for the production of isobutyraldehyde. FIG. 8B shows the production of isobutanal from these cultures.
[0501]To test the 3-methylbutyraldehyde and 2-methylbutyraldehyde biosynthesis pathways, DH10B cells harboring pBADals-ilvCD-LeuABCD/pTrcBALK, pBADals-ilvCD-LeuABCD2/pTrcBALK, pBADals-ilvCD-LeuABCD/pTrcBALK4, pBADalsS-LeuABCD/pTrcBALK, pBADalsS-LeuABCD2/pTrcBALK, or pBADalsS-LeuABCD4/pTrcBALK were grown overnight in LB media containing 50 ug/ml chloramphenicol (Cm50) and 100 ug/ml ampicillin (Amp100) at 37° C., 200 rpm. An aliquot of each seed culture was inoculated into fresh TB media containing Cm50 and Amp100 and was grown in an incubator shaker at 37° C., 200 rpm. Three hours after inoculation, the cultures were induced with 13.3 mM arabinose and 1 mM IPTG and were grown overnight. 700 ul of each culture was extracted with an equal volume of ethyl acetate and analyzed by GC-MS. The production of 2-isovaleralcohol (2-methylpental) and 3-isovaleralcohol (3-methylpentanal) was monitored because 3-isovaleraldehyde and 2-isovaleraldehyde are spontaneously converted to their corresponding alcohols. FIG. 8B shows the production of 2-methylpental and 3-methylpentanal from these cultures.
[0502]To test the phenylacetoaldehyde and 4-hydroxyphenylacetoaldehyde biosynthesis pathways, DH10B cells harboring pBADpheA-aroLAC/pTrcBALK, pBADtyrA-aroLAC/pTrcBALK, pBADaroG-tktA-aroBDE/pTrcBALK, pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK, and pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK were grown overnight in LB media containing 50 ug/ml chloramphenicol (Cm50) and 100 ug/ml ampicillin (Amp100) at 37° C., 200 rpm. An aliquot of each seed culture was inoculated into fresh TB media containing Cm50 and Amp100 and was grown in an incubator shaker at 37° C., 200 rpm. Three hours after inoculation, the cultures were induced with 13.3 mM arabinose and 1 mM IPTG and were grown overnight. 700 ul of each culture was extracted with an equal volume of ethyl acetate and analyzed by GC-MS. The production of phenylacetoaldehyde, 4-hydroxyphenylaldehyde, and their corresponding alcohols was monitored using GC-MS. FIG. 9B shows the production of 4-hydroxyphenylethanol from these cultures.
[0503]To test the 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and 2-(indole-3)ethanol biosynthesis pathways, DH10B cells harboring pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK, pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK, pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcAdh2-Kivd, pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcAdh2-Kivd, pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcIpdc-Par, and pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcIpdc-Par were grown overnight in LB media containing 50 ug/ml chloramphenicol (Cm50) and 100 ug/ml ampicillin (Amp100) at 37° C., 200 rpm. An aliquot of each seed culture was inoculated into fresh TB media containing Cm50 and Amp100 and was grown in an incubator shaker at 37° C., 200 rpm. Three hours after inoculation, the cultures were induced with 13.3 mM arabinose and 1 mM IPTG and were grown for periods ranging from overnight to one week. 700 ul of each culture was extracted with an equal volume of ethyl acetate and analyzed by GC-MS. The results are detailed below.
[0504]The production of 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol and/or 2-(indole-3-)ethanol was monitored using GC-MS. FIG. 42A shows the production of 2-phenylethanol from these cultures at 24 hours. FIG. 42B shows the production of 2-(4-hydroxyphenyl)ethanol from these cultures at 24 hours. FIG. 42C shows the production of 2-(indole-3-)ethanol from these cultures at 24 hours.
[0505]FIG. 43A shows the GC-MS chromatogram for control (pBAD33 and pTrc99A) at one week. FIG. 43B shows the GC-MS chromatogram for 2-phenylethanol (5.97 min) production from pBADpheA-aroLAC-aroG-tktA-aroBDE and pTrcBALK at one week. FIG. 44 shows the GC-MS chromatogram for 2-(4-hydroxyphenyl)ethanol (9.36 min) and 2-(indole-3) ethanol (10.32 min) production from pBADtyrA-aroLAC-aroG-tktA-aroBDE and pTrcBALK at one week.
Example 5
Isolation and Biological Activity of Diol Dehydrogenases
[0506]Available substrates such as 3-hydroxy-2-butanone (acetoin), 4-hydroxy-3-hexanone (propioin), 5-hydroxy-4-octanone (butyroin), 6-hydroxy-5-decanone (valeroin), and 1,2-cyclopentanediol were used to measure the ability of diol dehydrogenases (ddh) to catalyze the reduction of large saturated α-hydroxyketones to produce a diol. All reagents were purchased from Sigma-Aldrich Co. and TCI America, unless otherwise stated.
[0507]For cloning and isolation of DDH polypeptides, genomic DNA from several species of bacteria was obtained from ATCC (Lactobacillus brevis ATCC 367, Pseudomonas putida KT2440, and Klebsiella pneumoniae MGH78578) and PCR-amplified (using Phusion polymerase with 1× Phusion buffer, 0.2 mM dNTP, 0.5 μL Phusion enzyme, 1.5 μM primers, and 20 pg template DNA in a 50 μL reaction) utilizing the following protocol: 30 cycles, 98° C./10 secs (denaturing), 60° C./15 secs (annealing), 72° C./30 secs (elongation). Polymerase chain reaction products were then digested using restriction enzymes NdeI and BamHI, then ligated into NdeI/BamHI digested pET28 vectors. Vectors containing ddh clones were transformed into BL21(DE3) competent cells for protein expression. A single colony was inoculated into LB media, and expression of 6×His-tagged proteins of interest was induced at OD600=0.6 with 0.1 mM IPTG. Expression was allowed to proceed for 15 hours at 22° C. The 6×His-tagged enzymes were purified using Ni-NTA spin columns following suggested protocols by QIAGEN, yielding purified protein concentrations in the range of 1.1-6.5 mg/mL (determined by Bradford assay).
[0508]Diol dehydrogenase ddh1 was isolated from Lactobacillus brevis ATCC 367, diol dehydrogenase ddh2 was isolated from Pseudomonas putida KT2440, and diol dehydrogenase ddh3 was isolated from Klebsiella pneumoniae MGH78578. The nucleotide sequence encoding and polypeptide sequence of ddh1 are shown in SEQ ID NOS:97 and 98, respectively; nucleotide sequence encoding and polypeptide sequence of ddh2 are shown in SEQ ID NOS:99 and 100, respectively; and nucleotide sequence encoding and polypeptide sequence of ddh3 are shown in SEQ ID NOS: 101 and 102, respectively.
[0509]Reactions to measure biological activity of DDH polypeptides were performed in a final volume of 200 μL as follows: 25 mM substrate, 0.04 mg/mL DDH polypeptide, 0.25 mg/mL nicotinamide cofactor, 200 mM imidazole, 14 mM Tris-HCl, and 1.5% by volume DMSO. Biological activity was assayed using a Molecular Devices Thermomax 96 well plate reader, monitoring absorbance at 340 nm, which corresponds to NADH or NADPH concentration. The kinetic studies used 0.04 mg/mL DDH polypeptide, 0.25 mg/mL NADH, and 20 mM Tris-HCl buffer at pH 6.5 (reduction) or pH 9.0 (oxidation) in a total volume of 100 μL at 25° C.
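Absorbance readings at 340 nm can be converted to NADH (or NADPH) concentration with the Beer-Lambert law, A = ε·c·l. The Python sketch below assumes the commonly used molar extinction coefficient of 6220 M⁻¹ cm⁻¹ for NADH at 340 nm and a user-supplied optical path length; neither value is stated in the text, and the effective path length in a 96 well plate depends on the fill volume and must be measured or calibrated for the instrument in use.

```python
# Sketch: convert a change in A340 to a change in NADH concentration.
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1 (assumed literature value)


def nadh_micromolar(delta_a340: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert: c = A / (epsilon * l), returned in micromolar."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return delta_a340 / (EPSILON_NADH_340 * path_cm) * 1e6


def rate_um_per_min(delta_a340: float, minutes: float, path_cm: float = 1.0) -> float:
    """Average rate of NADH change over the interval, in micromolar per minute."""
    return nadh_micromolar(delta_a340, path_cm) / minutes


# Hypothetical reading: A340 changes by 0.622 over 5 min in a 1 cm path.
print(nadh_micromolar(0.622))       # change in concentration, micromolar
print(rate_um_per_min(0.622, 5.0))  # average rate, micromolar per minute
```

A decrease in A340 indicates cofactor consumption (reduction of the α-hydroxyketone), while an increase indicates cofactor production (oxidation of the diol).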
[0510]FIG. 12A shows the biological activity of ddh1, ddh2, and ddh3 using butyroin as a substrate (triangles represent ddh3 activity). FIG. 12B shows the oxidation activity of ddh3 towards 1,2-cyclopentanediol and 1,2-cyclohexanediol as measured by NADH production. FIG. 13 summarizes the results of kinetic studies for various substrates in the oxidation reactions catalyzed by the DDH polypeptides. These reactions were NAD+ dependent.
Example 6
Sequential In Vivo Biological Activity of CC-Ligases (Lyases) and Diol Dehydrogenases
[0511]The ability of a C--C lyase and a diol dehydrogenase to perform the following sequential reaction was tested in E. coli:
##STR00001##
[0512]For α-hydroxyketone and diol production, a pathway comprising a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a meso-2,3-butanediol dehydrogenase (ddh) gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 was constructed in E. coli and tested for its ability to condense the substrates detailed below in Table 2 (e.g., acetaldehyde, propionaldehyde, butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, phenylacetaldehyde, and 4-hydroxyphenylacetaldehyde, or their corresponding alcohols) to form an α-hydroxyketone and the corresponding diol in vivo. The production of various α-hydroxyketones and diols was monitored by gas chromatography-mass spectrometry (GC-MS).
TABLE 2
Summary of substrates and products.

Substrate            Produced α-hydroxyketone              Produced diol                  FIGS.
Butanal              5-Hydroxy-4-octanone                  4,5-Octanediol                 17A & B
n-Pentanal           6-Hydroxy-5-decanone                  5,6-Decanediol                 18A & B
3-Methylbutanal      2,7-Dimethyl-5-hydroxy-4-octanone     2,7-Dimethyl-4,5-octanediol    19A & B
n-Hexanal            7-Hydroxy-6-dodecanone                6,7-Dodecanediol               20A & B
4-Methylpentanal     2,9-Dimethyl-6-hydroxy-5-decanone     2,9-Dimethyl-5,6-decanediol    21A & B
n-Octanal            9-Hydroxy-8-hexadecanone              8,9-Hexadecanediol             22
Acetaldehyde         3-Hydroxy-2-butanone                  2,3-Butanediol                 23
n-Propanal           4-Hydroxy-3-hexanone                  3,4-Hexanediol                 24A & B
Phenylacetaldehyde   1,4-Diphenyl-3-hydroxy-2-butanone     1,4-Diphenyl-2,3-butanediol    25
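The aliphatic entries of Table 2 follow a simple pattern: two molecules of a saturated aldehyde CnH2nO are joined without loss of atoms to give an α-hydroxyketone C2nH4nO2, and reduction adds H2 to give the diol. The sketch below computes the corresponding molecular masses and compares them with the two MS full-scan windows (35-200 and 40-260) used in the protocols of this example. It is illustrative only: the function names are hypothetical, standard average atomic masses are used, and any link between product mass and the choice of scan window is an inference rather than something stated in the text. The pattern does not apply to the aromatic substrate phenylacetaldehyde.

```python
# Average atomic masses in g/mol.
MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def formula_mass(c, h, o):
    """Molecular mass of CcHhOo in g/mol."""
    return c * MASS["C"] + h * MASS["H"] + o * MASS["O"]


def homo_coupling_products(n_carbons):
    """Masses of the products formed from a saturated CnH2nO aldehyde.

    Returns (alpha-hydroxyketone C(2n)H(4n)O2, diol C(2n)H(4n+2)O2).
    """
    hydroxyketone = formula_mass(2 * n_carbons, 4 * n_carbons, 2)
    diol = formula_mass(2 * n_carbons, 4 * n_carbons + 2, 2)
    return hydroxyketone, diol


def within_scan(mass, low, high):
    """True if the molecular mass lies inside the MS scan window."""
    return low <= mass <= high


if __name__ == "__main__":
    for name, n in [("butanal", 4), ("n-pentanal", 5),
                    ("n-hexanal", 6), ("n-octanal", 8)]:
        ketone, diol = homo_coupling_products(n)
        print(name, round(ketone, 2), round(diol, 2),
              within_scan(diol, 35, 200), within_scan(diol, 40, 260))
```

Under these assumptions the products up to C10 fall below 200 g/mol, while the C12 and C16 products exceed 200 g/mol but remain below 260 g/mol.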
For Analysis of ≦C10.
[0513]E. coli harboring pTrcBAL-DDH-2ADH was grown overnight in LB media containing 50 μg/mL kanamycin (Km). This seed culture was inoculated into M9 media containing 3% (v/v) glycerol, 0.5% (g/v) and 50 μg/mL Km. 10 mL cultures were grown to O.D.600=0.7, then induced with 0.5 mM IPTG. The cells were allowed to express the enzymes of interest for 3 hours before various aldehydes were added to a concentration of 5-10 mM. After addition of aldehydes, the cultures were capped and incubated at 37° C. with shaking for 72 hours. Cultures were extracted with 2 mL ethyl acetate and analyzed by GC-MS using the following protocol:
[0514]1 μL injection w/ 50:1 split
[0515]Inlet temperature--150° C.
[0516]Initial oven temperature--50° C.
[0517]Temperature Ramp 1--10° C./min to 150° C.
[0518]Temperature Ramp 2--50° C./min to 300° C.
[0519]GC to MS transfer temp--250° C.
[0520]MS detection--full scan MW 35-200
For Analysis of ≧C12.
[0521]E. coli DH10B strains harboring pTrc99A (Ctrl vector) or pTrcBAL were inoculated into 0.75×M9/0.5% LB containing 0.1 mM CaCl2, 2 mM MgSO4, 1 mM KCl, 1% galacturonate, 5 μg/mL thiamine, and Amp. The cultures were grown to an optical density (600 nm) of 0.8 and induced with 0.25 mM IPTG. The cells were allowed to express the proteins for 2.5 hours at 37° C., then aldehyde substrate was added to a concentration of 5 mM, and the culture vial was capped tightly and incubated for 72 hours at 37° C. with shaking at 200 rpm. 1 mL of the final culture was extracted with 0.75 mL of ethyl acetate, centrifuged to facilitate phase separation, then analyzed via GC-MS using the following method.
[0522]1 μL injection w/ 50:1 split
[0523]Inlet temperature--250° C.
[0524]Initial oven temperature--50° C.
[0525]Temperature Ramp 1--10° C./min to 125° C.
[0526]Temperature Ramp 2--30° C./min to 300° C.
[0527]Final Temperature 300° C.--1 minute
[0528]GC to MS transfer temp--250° C.
[0529]MS detection--full scan MW 40-260.
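The two oven programs listed above can be compared by their total run time. The sketch below is a simple calculation under stated assumptions: it treats each ramp as linear, assumes no initial or intermediate hold times because none are listed, and applies the 1 minute final hold only to the second program. The function name and data layout are hypothetical.

```python
def program_minutes(start_c, ramps, final_hold_min=0.0):
    """Total oven program time in minutes.

    ramps is a list of (rate in deg C per min, target temperature in deg C),
    applied in order starting from start_c.
    """
    total = 0.0
    temp = start_c
    for rate, target in ramps:
        total += abs(target - temp) / rate
        temp = target
    return total + final_hold_min


# Program for analysis of products up to C10: 50 -> 150 -> 300 deg C.
UP_TO_C10 = program_minutes(50, [(10, 150), (50, 300)])

# Program for analysis of C12 and larger products: 50 -> 125 -> 300 deg C,
# followed by the listed 1 minute hold at 300 deg C.
C12_AND_ABOVE = program_minutes(50, [(10, 125), (30, 300)], final_hold_min=1.0)

if __name__ == "__main__":
    print(UP_TO_C10, C12_AND_ABOVE)
```

Under these assumptions the first program takes 13 minutes and the second slightly more than 14 minutes.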
[0530]The results are depicted in FIGS. 17 through 25. FIG. 17 shows the sequential conversion of butanal into 5-hydroxy-4-octanone and then 4,5-octanediol. FIG. 18 shows the sequential conversion of n-pentanal into 6-hydroxy-5-decanone and then 5,6-decanediol. FIG. 19 shows the conversion of 3-methylbutanal into 2,7-dimethyl-5-hydroxy-4-octanone and then 2,7-dimethyl-4,5-octanediol. FIG. 20 shows the sequential conversion of n-hexanal into 7-hydroxy-6-dodecanone and then 6,7-dodecanediol. FIG. 21 shows the conversion of 4-methylpentanal into 2,9-dimethyl-6-hydroxy-5-decanone and then 2,9-dimethyl-5,6-decanediol. FIG. 22 shows the conversion of n-octanal into 9-hydroxy-8-hexadecanone. FIG. 23 shows the conversion of acetaldehyde into 3-hydroxy-2-butanone. FIG. 24 shows the sequential conversion of n-propanal into 4-hydroxy-3-hexanone and then 3,4-hexanediol. FIG. 25 shows the conversion of phenylacetaldehyde into 1,4-diphenyl-3-hydroxy-2-butanone.
[0531]Similar to above, a pathway comprising a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) was constructed in E. coli and tested for its ability to catalyze the production of various α-hydroxyketones. The results, which show the broad spectrum of C--C ligase activity for the bal gene tested, are set forth in FIG. 48 through FIG. 55.
Example 7
Sequential Biological Activity of Diol Dehydrogenases and Diol Dehydratases
[0532]To test the sequential biological activity of diol dehydrogenases and diol dehydratases in a dehydration and reduction pathway, butyroin was used as a substrate in a sequential reaction to produce 4-octanone. The enzyme diol dehydrogenase (e.g., ddh) catalyzes the reversible interconversion of an α-hydroxy ketone and its corresponding diol, such as 5-hydroxy-4-octanone and 4,5-octanediol, and the enzyme diol dehydratase (e.g., pduCDE) catalyzes the irreversible dehydration of diols, such as 4,5-octanediol.
[0533]Diol dehydrogenase ddh from Klebsiella pneumoniae MGH 78578 and diol dehydratase pduCDE from Klebsiella pneumoniae MGH 78578 were cloned into a bacterial expression vector, expressed, and purified on a Ni-NTA column, as described in Example X except that 1 mM of 1,2-propanediol was present at all times during the expression and purification of diol dehydratase. The large, medium, and small subunits of the pduCDE polypeptide are encoded by the nucleotide sequences of SEQ ID NOs:103, 105, and 107, respectively, and the polypeptide sequences are set forth in SEQ ID NOs: 104, 106, and 108, respectively.
[0534]The ddh3 and pduCDE polypeptides were incubated with butyroin and their appropriate cofactors, then assayed using gas chromatography-mass spectrometry (GC-MS) for their ability to perform sequential reactions resulting in the product 4-octanone. Reaction conditions are given in Table 3 below. The reaction mixture was incubated at 37° C. for 40 hours in a 0.6 mL Eppendorf tube with minimal head space. The reaction product was extracted with an equivalent volume of ethyl acetate, stored in a glass vial, and sent to Thermo Fisher Scientific Instruments Division for compositional analysis by GC-MS.
TABLE 3
Reaction Conditions

Rxn Component                       Concentration
5-hydroxy-4-octanone (butyroin)     8.4 mM
Adenosylcobalamin (coenzyme B12)    33.5 μM
KCl                                 9.6 mM
NADH                                18 mM
dDH3 enzyme                         0.19 mg/mL
dDOH1 enzyme mix                    0.15 mg/mL
Reaction Buffer                     10 mM Tris HCl pH 7.0
[0535]FIG. 26A shows GC-MS data confirming the presence of 4,5-octanediol in the sample extraction. Based on their mass spectra, the peak at a retention time of 5.36 min was identified as butyroin (substrate), and the peaks at 6.01, 6.09, and 6.12 min were identified as different isomers of 4,5-octanediol. This compound is the expected product resulting from the reduction of butyroin by ddh3.
[0536]FIG. 26B shows GC-MS data confirming the presence of 4-octanone in the sample extraction. Based on its mass spectrum, the peak at a retention time of 4.55 min was identified as 4-octanone. This compound is the expected product resulting from the sequential reduction of butyroin by ddh3 and dehydration of 4,5-octanediol by pduCDE.
[0537]FIGS. 27A and 27B show comparisons between the sample extraction gas chromatograph/mass spectrum and the 4-octanone standard gas chromatograph/mass spectrum. These results demonstrate that 4-octanone was produced from butyroin using a diol dehydrogenase (ddh3) and a diol dehydratase (pduCDE). GC-MS analysis of the incubated reaction mixture confirmed the presence of the starting material, intermediate, and product, demonstrating that these enzymes can be reappropriated for these specific substrates.
Example 8
Isolation and Biological Activity of Secondary Alcohol Dehydrogenases
[0538]Substrates such as 4-octanone, 2,7-dimethyl-4-octanone, cyclopentanone and corresponding alcohols were utilized to measure the ability of secondary alcohol dehydrogenases (2ADHs) to catalyze the reduction of large saturated ketones to secondary alcohols. An example of a reaction catalyzed by secondary alcohol dehydrogenases is illustrated below (reduction of 4-octanone to 4-octanol is shown):
##STR00002##
[0539]All enzymes and reagents were purchased from New England Biolabs and Sigma, respectively, unless otherwise stated.
[0540]Various secondary alcohol dehydrogenases (2ADHs) were isolated from Pseudomonas putida KT2440, Pseudomonas fluorescens Pf-5, and Klebsiella pneumoniae MGH 78578. All vectors were transformed into BL21(DE3) competent cells, and expression of the genes encoding the proteins of interest was induced with IPTG (via the T7 promoter). The cells were lysed, and proteins were extracted and then purified on Ni-NTA columns. The Ni-NTA eluate was diluted to a final protein concentration of 0.15 mg/mL prior to assays.
[0541]NADH/NADPH consumption and production assays were performed using a THERMOmax microplate reader in the kinetic mode, monitoring the NAD(P)H absorbance peak at 340 nm until the reaction reached equilibrium. 2ADH-2, 2ADH-5, 2ADH-8, and 2ADH-10 were tested for their ability to either catalyze the oxidation of 4-octanol or catalyze the reduction of 4-octanone. The reaction conditions for these assays are found in Table 4 below.
TABLE 4
Reaction Conditions for Various Enzyme Assays

Reaction Component                  Final Concentration

NADH Production Assay (30° C.)
2ADH enzyme                         Approx. 0.058 μg/μL
4-octanol                           5.55 mM
NAD+                                Approx. 1.4 μg/μL
Imidazole (from Elution Buffer)     Approx. 280 mM

NADH Consumption Assay (30° C.)
2ADH enzyme                         Approx. 0.075 μg/μL
4-octanone                          5.0 mM
NADH                                Approx. 0.25 μg/μL
Imidazole (from Elution Buffer)     Approx. 250 mM

NADPH Production Assay (30° C.)
2ADH enzyme                         Approx. 0.058 μg/μL
4-octanol                           5.55 mM
NADP+                               Approx. 1.4 μg/μL
Imidazole (from Elution Buffer)     Approx. 280 mM
[0542]Further testing was performed, as described in Table 5 below, in which 2ADH-2, 2ADH-11, 2ADH-12, 2ADH-13, 2ADH-14, 2ADH-15, 2ADH-16, 2ADH-17, and 2ADH-18 were tested for their ability to either catalyze the oxidation of 4-octanol, 2,7-dimethyl-4-octanol, or cyclopentanol, or catalyze the reduction of 4-octanone, 2,7-dimethyl-4-octanone, or cyclopentanone.
TABLE 5

Rxn Component             Final Concentration

Rxn Components for NADPH Consumption Assays (Reduction)
Substrate                 25 mM
Enzyme                    0.04 mg/mL
Nicotinamide cofactor     0.25 mg/mL
Imidazole                 200 mM
Tris HCl                  14 mM
DMSO                      15% by volume
Total Volume              200 μL

Rxn Components for NAD(P)H Production Assays (Oxidation)
Substrate                 5 mM
Enzyme                    0.04 mg/mL
Nicotinamide cofactor     0.25 mg/mL
Imidazole                 200 mM
Tris HCl                  14 mM

Rxn Components for NAD(P)H Production Assay using 2,7-dimethyl-4-octanone as a substrate
Substrate                 50 mM
Enzyme                    0.08 mg/mL
Nicotinamide cofactor     0.25 mg/mL
Imidazole                 200 mM
Tris HCl                  14 mM
DMSO                      3% by volume
[0543]FIG. 30A shows the results from the NADH Production Assay of Table 4, in which 2ADH-2 catalyzes the oxidation of 4-octanol in the presence of NAD+, as measured by NADH production. FIG. 30B shows the results of the NADPH Production Assay of Table 4, in which 2ADH-5, 2ADH-8, and 2ADH-10 catalyze the oxidation of 4-octanol in the presence of NADP+, as measured by NADPH production.
[0544]FIG. 31 shows the oxidation of 4-octanol by 2ADH-11 (FIG. 31A) and 2ADH-16 (FIG. 31B), as measured by NADH and NADPH production, respectively. FIG. 32 shows the oxidation of 2,7-dimethyloctanol by 2ADH-11 and others (FIG. 32A) and 2ADH-16 (FIG. 32B), as measured by NADH and NADPH production, respectively.
[0545]FIG. 33A shows the reduction of 2,7-dimethyl octanol by 2ADH11 and 2ADH16 as monitored by NADPH consumption. FIG. 33B shows the reduction activity of both 2ADH11 and 2ADH16 towards various substrates. FIG. 34 shows the oxidation (FIG. 34A) and reduction (FIG. 34B) of cyclopentanol by 2ADH-16.
[0546]Similar to above, kinetic testing for both oxidation and reduction reactions was performed on various substrates using 2ADH-16. The conditions for these studies were as follows: 0.04 mg/mL enzyme, 0.25 mg/mL cofactor, 20 mM Tris-HCl buffer at pH 6.5 (reduction) or pH 9.0 (oxidation), 25° C., and 100 μL total volume. The calculated rate constants for the reduction reactions, along with the structures of the substrates, are summarized in FIG. 35. The calculated rate constants for the oxidation reactions, along with the structures of the substrates, are summarized in FIG. 36. These results show that 2ADH-16 is capable of catalyzing both the oxidation and reduction of a wide variety of substrates.
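One common way to reduce kinetic-mode A340 traces to a rate is to fit a straight line to the early, linear part of the trace and convert the slope to a specific activity. The sketch below illustrates that approach only; it is not the rate-constant calculation behind FIGS. 35 and 36, which is not described in the text. The extinction coefficient of 6220 M^-1 cm^-1 is the commonly cited value for NAD(P)H at 340 nm, the path length must be supplied by the user because it is not given here, and the function names and sample readings are hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def specific_activity(times_min, a340, path_cm,
                      enzyme_mg_per_ml=0.04, epsilon=6220.0):
    """Specific activity in umol NAD(P)H per min per mg enzyme.

    Uses the magnitude of the A340 slope, so the result is positive for
    both consumption (falling A340) and production (rising A340) assays.
    """
    da_per_min = abs(least_squares_slope(times_min, a340))
    molar_per_min = da_per_min / (epsilon * path_cm)
    umol_per_ml_per_min = molar_per_min * 1000.0  # 1 M = 1000 umol/mL
    return umol_per_ml_per_min / enzyme_mg_per_ml


if __name__ == "__main__":
    # Synthetic, perfectly linear readings for illustration only.
    times = [0.0, 1.0, 2.0, 3.0]
    readings = [1.0000, 0.9378, 0.8756, 0.8134]
    print(specific_activity(times, readings, path_cm=1.0))
```

Only the initial, linear portion of a trace should be passed to the fit, since the reactions described here were followed until equilibrium.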
Example 9
Isolation and In Vitro and In Vivo Activity of Coenzyme B12 Independent Diol Dehydratases
[0547]Substrates such as 1,2-propanediol, meso-2,3-butanediol, and trans-1,2-cyclopentanediol were utilized to test both the in vitro and in vivo biological activity of a B12 independent diol dehydratase in a dehydration and reduction pathway. Diol dehydratases catalyze the irreversible dehydration of diols, such as 1,2-propanediol.
[0548]For in vitro activity, E. coli BL21(DE3) harboring pETPduCDE (diol dehydratase subunits) was inoculated into 100 mL LB media, grown to OD600=0.7, induced with 0.15 mM IPTG, and incubated for 22 hours at 22° C. The cells were lysed and proteins of interest were purified on a Ni-NTA spin column. Purification of all three dehydratase subunits was accomplished by adding 5 mM 1,2-propanediol to the lysis and wash buffers. The Ni-NTA purification yielded approximately 660 μL of protein mixture at a concentration of 2.2 mg/mL. Protein concentration assays were conducted using a Bradford reagent protocol.
[0549]The purified PduCDE was used to set up in vitro diol dehydratase reactions. Three assays were conducted with 1,2-propanediol and meso-2,3-butanediol. Control reactions were also set up with elution buffer added in place of purified PduCDE. In vitro reactions were conducted under semi-anaerobic conditions in 2 mL screw cap glass vials. Reaction components and concentrations are given in Table 6.
TABLE 6
Reaction conditions for B12 dependent DDOH in vitro assay

Rxn Component               Concentration
Diol substrate              10 mM
Adenosylcobalamin (B12)     100 μg/mL
KCl                         10 mM
dOH1 enzyme mix             0.08 mg/mL
Reaction Buffer             10 mM Tris HCl pH 7.5
[0550]After 48 hours, 1 mL of the reaction mixture was extracted with 0.5 mL of either ethyl acetate or hexanol and analyzed by GC-MS.
[0551]The following GCMS protocol was used for all experiments:
[0552]1 μL injection w/ 50:1 split
[0553]Inlet temperature--250° C.
[0554]Initial oven temperature--50° C.
[0555]Temperature Ramp 1--10° C./min to 125° C.
[0556]Temperature Ramp 2--30° C./min to 300° C.
[0557]Final Temperature 300° C.--1 minute
[0558]GC to MS transfer temp--250° C.
[0559]MS detection--full scan MW 40-260
[0560]The results are shown in FIG. 45. FIG. 45A confirms the formation of 1-propanal from 1,2-propanediol, and FIG. 45B confirms the formation of 2-butanone from meso-2,3-butanediol, both of which were catalyzed by B12 independent diol dehydratase.
[0561]For in vivo activity, the pBBRDhaB1/2 plasmid was constructed as follows: the DNA sequences encoding B12-independent glycerol dehydratase (dhaB1) and its activator (dhaB2) of Clostridium butyricum were amplified by polymerase chain reaction (PCR): 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 2 min for dhaB1 and 1 min for dhaB2, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward primers (5'-CCGCTCGAGGAGGATATATATATGATTTCTAAAGGCTTTAGCACCC-3' (SEQ ID NO:318) for dhaB1 and 5'-ACGTGATGTAATCTAGAGGAGGATATATATATGAGCAAAGAAATTAAAGG-3' (SEQ ID NO:319) for dhaB2) and reverse primers (5'-TCTTTGCTCATATATATATCCTCCTCTAGATTACATCACGTGTTCAGTAC-3' (SEQ ID NO:320) for dhaB1 and 5'-CGAGCTCTTATTCGGCGCCAATGGTGCACGGG-3' (SEQ ID NO:321) for dhaB2), 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng of pETdhaB1 or pETdhaB2, respectively, in 50 μL. Amplified fragments were gel purified and spliced by another round of PCR: 98° C. for 10 sec, 60° C. for 15 sec, and 72° C. for 2.5 min, repeated 30 times. The reaction mixture contained 1× Phusion buffer (NEB), 2 mM dNTP, 0.5 μM forward (5'-CCGCTCGAGGAGGATATATATATGATTTCTAAAGGCTTTAGCACCC-3') (SEQ ID NO:322) and reverse (5'-CGAGCTCTTATTCGGCGCCAATGGTGCACGGG-3') (SEQ ID NO:323) primers, 1 U Phusion High Fidelity DNA polymerase (NEB), and 50 ng of each fragment in 50 μL. The amplified DNA fragment was digested with XhoI and SacI and ligated into pBBR1MCS-2 pre-digested with the same restriction enzymes.
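Splicing two PCR fragments in a second round of PCR generally relies on the fragments sharing a common end sequence. The sketch below checks this for the primers given above by taking the reverse complement of the dhaB1 reverse primer (SEQ ID NO:320) and measuring how much of its 3' end matches the 5' end of the dhaB2 forward primer (SEQ ID NO:319). The primer sequences are copied from the text; the helper names are hypothetical, and reading the second PCR as an overlap-extension step is an interpretation of "spliced by another round of PCR" rather than a statement from the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def suffix_prefix_overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


# Primer sequences as given in the text.
DHAB1_REV = "TCTTTGCTCATATATATATCCTCCTCTAGATTACATCACGTGTTCAGTAC"  # SEQ ID NO:320
DHAB2_FWD = "ACGTGATGTAATCTAGAGGAGGATATATATATGAGCAAAGAAATTAAAGG"  # SEQ ID NO:319

# The end of the dhaB1 fragment, read on the top strand, is the reverse
# complement of the dhaB1 reverse primer.
OVERLAP = suffix_prefix_overlap(revcomp(DHAB1_REV), DHAB2_FWD)
SHARED = DHAB2_FWD[:OVERLAP]

if __name__ == "__main__":
    print(OVERLAP, SHARED)
```

The two primers share a 41-nucleotide region, which includes an XbaI recognition sequence (TCTAGA).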
[0562]Two strains of E. coli DH10B, harboring pBBR1MCS-2 or pBBRDhaB1/2, were inoculated into TB media without glycerol. Cultures were grown to OD600=0.5, and the substrates 1,2-propanediol, meso-2,3-butanediol, and trans-1,2-cyclopentanediol were added to separate cultures to a concentration of 10 mM. 5 μg/mL of the co-enzyme S-adenosylmethionine was added before each culture was transferred to an anaerobic environment. The cultures were incubated at 37° C. for 48 hours.
[0563]After 48 hours, 1 mL of culture was extracted with 0.5 mL of ethyl acetate or hexanol and analyzed by GC-MS, as described above. The results are shown in FIG. 46. FIG. 46A shows the in vivo production of 1-propanol from 1,2-propanediol. FIG. 46B shows the in vivo production of 2-butanol from meso-2,3-butanediol. FIG. 46C shows the in vivo production of cyclopentanone from trans-1,2-cyclopentanediol.
Example 10
Identification of Secreted Alginate Lyase and Genomic Regions Sufficient for Growth on Alginate as a Sole Source of Carbon
[0564]To identify secreted or external alginate lyases, and to identify genomic regions from Vibrio splendidus that are sufficient to confer growth on alginate as a sole source of carbon, the following clones were made using the Gateway system from Invitrogen (Carlsbad, Calif.). First, entry vectors were made by TOPO cloning PCR fragments into pENTR/D/TOPO. PCR fragments were generated using Vibrio splendidus 12B01 genomic DNA as a template and amplified with the following primer pairs:
[0565]Vs24214-24249: genomic region corresponding to gene id between V12B01_24214 and V12B01_24249 (see Example 1).
TABLE 7
24214 F      cacc caagcgatagtttatatagcgt    (SEQ ID NO:324)
24249 R      gaaatgaacggatattacgt           (SEQ ID NO:325)
[0566]Vs24189-24209: genomic region corresponding to gene id between V12B01_24189 and V12B01_24209 (see Example 1).
TABLE 8
24189 R      cggaacaggtgattgtggt            (SEQ ID NO:326)
24209 F      cacc gcccacttcaagatgaagctgt    (SEQ ID NO:327)
[0567]Vs24214-24239: genomic region corresponding to gene id between V12B01_24214 and V12B01_24239 (see Example 1).
TABLE 9
24214 F      cacc caagcgatagtttatatagcgt    (SEQ ID NO:328)
24239 R_1    gtggctaagtacatgccggt           (SEQ ID NO:329)
[0568]The entry vectors were recombined with the destination vector pET-DEST42 (Invitrogen) using the LR recombinase enzyme (Invitrogen). These destination vectors were then put into electrocompetent DH10B or BL21 cells.
[0569]The alginate lyase clones were then made by digesting (using enzymes Nde I and Bam HI) the PCR products that were generated using Vibrio splendidus 12B01 genomic DNA as a template and amplified with the following primer pairs:
TABLE 10
24214 ndeF    GGAATTC CAT atgacaaagaatatgacgactaaac    (SEQ ID NO:330)    forward primer for V12B01_24214
24214 bamR    CG GGATCC ttattatttcccctgccctgcagt       (SEQ ID NO:331)    reverse primer for V12B01_24214
24219 ndeF    GGAATTC CAT atgagctatcaaccacttttac       (SEQ ID NO:332)    forward primer for V12B01_24219
24219 bamR    CG GGATCC ttacagttgagcaaatgatcc          (SEQ ID NO:333)    reverse primer for V12B01_24219
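The primers in Table 10 are designed to be cut with Nde I and Bam HI, so each should carry the matching recognition sequence near its 5' end. The sketch below locates those sequences (CATATG for Nde I and GGATCC for Bam HI, the standard recognition sequences of the two enzymes) in the primers as listed. The primer strings are copied from Table 10 with their spacing and case preserved; the dictionary layout and function names are hypothetical.

```python
# Standard recognition sequences of the two enzymes.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}

# Primer name -> (sequence as printed in Table 10, enzyme expected to cut it).
PRIMERS = {
    "24214 ndeF": ("GGAATTC CAT atgacaaagaatatgacgactaaac", "NdeI"),
    "24214 bamR": ("CG GGATCC ttattatttcccctgccctgcagt", "BamHI"),
    "24219 ndeF": ("GGAATTC CAT atgagctatcaaccacttttac", "NdeI"),
    "24219 bamR": ("CG GGATCC ttacagttgagcaaatgatcc", "BamHI"),
}


def normalize(primer):
    """Remove layout spaces and convert to uppercase."""
    return primer.replace(" ", "").upper()


def site_position(primer, enzyme):
    """0-based index of the enzyme's site in the primer, or -1 if absent."""
    return normalize(primer).find(SITES[enzyme])


if __name__ == "__main__":
    for name, (sequence, enzyme) in PRIMERS.items():
        print(name, enzyme, site_position(sequence, enzyme))
```

In the forward primers the Nde I site overlaps the ATG start codon of the gene-specific portion, which places the coding sequence directly at the site.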
[0570]The digested PCR products were then ligated into cut pET28 vector. Certain of the cloned genomic regions of Vibrio splendidus 12B01 were tested for the presence of secreted alginate lyases, and the above-described constructs were tested in various combinations for the ability to confer growth on alginate as a sole source of carbon.
[0571]The Vs24254 (SEQ ID NO: 32) region of Vibrio splendidus encodes a functional external alginate lyase. BL21 cells expressing Vs24254 from the pET28 vector were capable of breaking down alginate in the growth medium. When grown on LB+2% alginate+0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG), only cells expressing the Vs24254 gene gave a positive TBA assay result of pink color. This assay was performed by spinning down an overnight culture grown on the above-mentioned media. The media was then mixed in a 1:1 ratio with 0.8% thiobarbituric acid (TBA), heated for 10 min at 99 degrees Celsius, and assayed for pink coloration. FIG. 47 shows the results of this assay. The left tube in FIG. 47 represents media taken from an overnight culture of cells expressing Vs24254, while the right-hand tube shows the TBA reaction using media from cells expressing Vs24259 (negative control). The lack of pink coloration in the negative control indicates that little or no cleavage of the alginate polymer has occurred. Wildtype E. coli cells not expressing any recombinant proteins show the same coloration as the negative control Vs24259 (data not shown).
[0572]To test the ability of recombinant E. coli to grow on alginate as a sole source of carbon, transformed cells were grown for 19 hours at 30 degrees Celsius with mild shaking in a 96-well plate. Each well held 222 μL of minimal media (see growth conditions for explanation of minimal media) with a 0.66% carbon source in the form of either degraded alginate or glucose (positive control for growth). All cells were BL21 with either no plasmid (BL21, negative control), one plasmid (Da or 3a), or two plasmids (Dk3a and Da3k). The plasmid backbone is indicated by the lower case letter: "a" refers to the plasmid backbone pET-DEST42 and "k" refers to the pENTR/D/TOPO backbone. "D" indicates that the plasmid contains the genomic region Vs24214-24249, while "3" indicates that the plasmid contains the genomic region Vs24189-24209. Thus, Da would be pET-DEST42-Vs24214-24249, Da3k would be pET-DEST42-Vs24214-24249 and pENTR/D/TOPO-Vs24189-24209, and so on.
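The strain shorthand described above can be expanded mechanically. The following sketch decodes a label such as Da3k into full construct names using only the letter assignments given in the preceding paragraph; the function name and the error handling are hypothetical conveniences.

```python
import re

# Letter assignments taken from the paragraph above.
REGIONS = {"D": "Vs24214-24249", "3": "Vs24189-24209"}
BACKBONES = {"a": "pET-DEST42", "k": "pENTR/D/TOPO"}


def decode(shorthand):
    """Expand a label such as 'Da3k' into a list of construct names."""
    pairs = re.findall(r"([D3])([ak])", shorthand)
    # Reject labels containing anything other than region/backbone pairs.
    if not pairs or "".join(r + b for r, b in pairs) != shorthand:
        raise ValueError("unrecognized shorthand: " + shorthand)
    return [BACKBONES[b] + "-" + REGIONS[r] for r, b in pairs]


if __name__ == "__main__":
    for label in ["Da", "3a", "Dk3a", "Da3k"]:
        print(label, decode(label))
```

For example, Dk3a expands to pENTR/D/TOPO-Vs24214-24249 together with pET-DEST42-Vs24189-24209, the swapped-backbone combination.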
[0573]As shown in FIG. 56A, the two vector constructs pET-DEST42-Vs24214-24249 and pENTR/D/TOPO-Vs24189-24209, when combined in E. coli, confer growth on degraded alginate as the sole carbon source. The same result is observed when these genomic inserts are switched into the opposite vectors (pET-DEST42-Vs24189-24209 and pENTR/D/TOPO-Vs24214-24249). FIG. 56B shows growth on glucose as a positive control. Thus, the combined genomic regions of Vs24214-24249 and Vs24189-24209 from Vibrio splendidus were sufficient to confer on E. coli the ability to grow on alginate as a sole source of carbon.
Example 11
Production of Ethanol from Alginate
[0574]The ability of recombinant E. coli to produce ethanol by growing on alginate as a source of carbon was tested. To generate recombinant E. coli, DNA sequences encoding pyruvate decarboxylase (pdc) and two alcohol dehydrogenases (adhA and adhB) of Zymomonas mobilis were amplified by polymerase chain reaction (PCR). These amplified fragments were gel purified and spliced together by another round of PCR. The final amplified DNA fragment was digested with BamHI and XbaI and ligated into pBBR1MCS-2 pre-digested with the same restriction enzymes. The resulting plasmid is referred to as pBBRPdc-AdhA/B.
[0575]E. coli was transformed with either pBBRPdc-AdhA/B or pBBRPdc-AdhA/B+1.5 FOS (a fosmid clone containing the genomic region between V12B01_24189 and V12B01_24249; these sequences confer on E. coli the ability to use alginate as a sole source of carbon, see Examples 1 and 10), grown in M9 media containing alginate, and tested for the production of ethanol. The results are shown in FIG. 57, which demonstrates that the strain harboring pBBRPdc-AdhA/B+1.5 FOS showed significantly higher ethanol production when growing on alginate. These results indicate that the strain harboring pBBRPdc-AdhA/B+1.5 FOS was able to utilize alginate as a source of carbon in the production of ethanol.
[0576]The various embodiments described above can be combined to provide further embodiments. All of the U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet, are incorporated herein by reference, in their entirety. Aspects of the embodiments can be modified, if necessary to employ concepts of the various patents, applications and publications to provide yet further embodiments.
[0577]These and other changes can be made to the embodiments in light of the above-detailed description. In general, in the following claims, the terms used should not be construed to limit the claims to the specific embodiments disclosed in the specification and the claims, but should be construed to include all possible embodiments along with the full scope of equivalents to which such claims are entitled. Accordingly, the claims are not limited by the disclosure.
[0578]The following publications are herein incorporated by reference in their entirety. [0579]1. T. Y. Wong, L. A. Preston, N. L. Schiller, Annu Rev Microbiol 54, 289 (2000). [0580]2. W. Hashimoto, O. Miyake, A. Ochiai, K. Murata, J Biosci Bioeng 99, 48 (January, 2005). [0581]3. M. Yamasaki, K. Ogura, W. Hashimoto, B. Mikami, K. Murata, J Mol Biol 352, 11 (Sep. 9, 2005). [0582]4. M. Yamasaki et al., Acta Crystallogr Sect F Struct Biol Cryst Commun 61, 288 (Mar. 1, 2005). [0583]5. O. Miyake, A. Ochiai, W. Hashimoto, K. Murata, J Bacteriol 186, 2891 (May, 2004). [0584]6. O. Miyake, W. Hashimoto, K. Murata, Protein Expr Purif 29, 33 (May, 2003). [0585]7. H. J. Yoon, B. Mikami, W. Hashimoto, K. Murata, J Mol Biol 290, 505 (Jul. 9, 1999). [0586]8. H. J. Yoon, W. Hashimoto, O. Miyake, K. Murata, B. Mikami, J Mol Biol 307, 9 (Mar. 16, 2001). [0587]9. W. Hashimoto, O. Miyake, K. Momma, S. Kawai, K. Murata, J Bacteriol 182, 4572 (August, 2000). [0588]10. H. J. Yoon et al., Protein Expr Purif 19, 84 (June, 2000). [0589]11. T. Osawa, Y. Matsubara, T. Muramatsu, M. Kimura, Y. Kakuta, J Mol Biol 345, 1111 (Feb. 4, 2005). [0590]12. A. Ochiai, W. Hashimoto, K. Murata, Res Microbiol 157, 642 (September, 2006). [0591]13. F. J. Mergulhao, D. K. Summers, G. A. Monteiro, Biotechnol Adv 23, 177 (May, 2005). [0592]14. J. H. Choi, S. Y. Lee, Appl Microbiol Biotechnol 64, 625 (June, 2004). [0593]15. M. P. DeLisa, D. Tullman, G. Georgiou, Proc Natl Acad Sci USA 100, 6115 (May 13, 2003). [0594]16. N. Blaudeck, G. A. Sprenger, R. Freudl, T. Wiegert, J Bacteriol 183, 604 (January, 2001). [0595]17. N. Pradel et al., Biochem Biophys Res Commun 306, 786 (Jul. 4, 2003). [0596]18. L. Masip et al, Science 303, 1185 (Feb. 20, 2004). [0597]19. C. M. Barrett, N. Ray, J. D. Thomas, C. Robinson, A. Bolhuis, Biochem Biophys Res Commun 304, 279 (May 2, 2003). [0598]20. R. Binet, S. Letoffe, J. M. Ghigo, P. Delepelaire, C. Wandersman, Folia Microbiol (Praha) 42, 179 (1997). [0599]21. I. Gentschev, G. 
Dietrich, W. Goebel, Trends Microbiol 10, 39 (January, 2002). [0600]22. V. Koronakis, FEBS Lett 555, 66 (Nov. 27, 2003). [0601]23. J. Jose, Appl Microbiol Biotechnol 69, 607 (February, 2006). [0602]24. J. Jose, D. Betscheider, D. Zangen, Anal Biochem 346, 258 (Nov. 15, 2005). [0603]25. M. Ashiuchi, H. Misono, Appl Microbiol Biotechnol 59, 9 (June, 2002). [0604]26. J. Narita et al., Appl Microbiol Biotechnol 70, 564 (May, 2006). [0605]27. Y. Aso et al., Nat Biotechnol 24, 188 (February, 2006). [0606]28. W. Hashimoto et al., Biosci Biotechnol Biochem 69, 673 (April, 2005). [0607]29. A. E. Lagarde, F. R. Stoeber, J Bacteriol 129, 606 (February, 1977). [0608]30. M. A. Mandrand-Berthelot, P. Ritzenthaler, M. Mata-Gilsinger, J Bacteriol 160, 600 (November, 1984). [0609]31. J. Pouyssegur, F. Stoeber, J Bacteriol 117, 641 (February, 1974). [0610]32. J. Preiss, G. Ashwell, J Biol Chem 237, 309 (February, 1962). [0611]33. J. Preiss, G. Ashwell, J Biol Chem 237, 317 (February, 1962). [0612]34. G. M. Bird, P. Haas, Biochemical Journal 25, 403 (1931). [0613]35. L. H. Cretcher, W. L. Nelson, Science 67, 537 (May 25, 1928). [0614]36. W. L. Nelson, L. H. Cretcher, Journal of the American Chemical Society 51, 1914 (1929). [0615]37. W. L. Nelson, L. H. Cretcher, Journal of the American Chemical Society 52, 2130 (1930). [0616]38. W. L. Nelson, L. H. Cretcher, Journal of the American Chemical Society 54, 3409 (1932). [0617]39. E. Schoeffel, K. P. Link, Journal of Biological Chemistry 95, 213 (1932). [0618]40. E. Schoeffel, K. P. Link, Journal of Biological Chemistry 100, 397 (1933). [0619]41. H. A. Spoehr, Archive of Biochemistry 14, 153 (1947). [0620]42. J. J. Farmer, 3rd, R. G. Eagon, J Bacteriol 97, 97 (January, 1969). [0621]43. R. L. Anderson, D. P. Allison, J Biol Chem 240, 2367 (June, 1965). [0622]44. W. J. Lennarz, R. J. Light, K. Bloch, Proc Natl Acad Sci USA 48, 840 (May, 1962). [0623]45. S. A. Graham, Crit Rev Food Sci Nutr 28, 139 (1989). [0624]46. E. Wiberg, P. Edwards, J. 
Byrne, S. Stymne, K. Dehesh, Planta 212, 33 (December, 2000). [0625]47. L. Yuan, T. A. Voelker, D. J. Hawkins, Proc Natl Acad Sci USA 92, 10639 (Nov. 7, 1995). [0626]48. K. Dehesh, A. Jones, D. S. Knutzon, T. A. Voelker, Plant J 9, 167 (February, 1996). [0627]49. K. Dehesh, P. Edwards, T. Hayes, A. M. Cranmer, J. Fillatti, Plant Physiol 110, 203 (January, 1996). [0628]50. K. M. Mayer, J. Shanklin, BMC Plant Biol 7, 1 (2007). [0629]51. J. K. Jha et al., Plant Physiol Biochem 44, 645 (November-December, 2006). [0630]52. B. S. Schutt, M. Brummel, R. Schuch, F. Spener, Planta 205, 263 (June, 1998). [0631]53. K. Dehesh, P. Edwards, J. Fillatti, M. Slabaugh, J. Byrne, Plant J 15, 383 (August, 1998). [0632]54. J. M. Leonard, S. J. Knapp, M. B. Slabaugh, Plant J 13, 621 (March, 1998). [0633]55. M. Vedadi, R. Szittner, L. Smillie, E. Meighen, Biochemistry 34, 16725 (Dec. 26, 1995). [0634]56. M. O. Park, J Bacteriol 187, 1426 (February, 2005). [0635]57. M. O. Park, K. Heguri, K. Hirata, K. Miyamoto, J Appl Microbiol 98, 324 (2005). [0636]58. M. O. Park, M. Tanabe, K. Hirata, K. Miyamoto, Appl Microbiol Biotechnol 56, 448 (August, 2001). [0637]59. M. Morikawa, T. Iwasa, S. Yanagida, T. Imanaka, Journal of Fermentation and Bioengineering 85, 243 (1998). [0638]60. M. Dennis, P. E. Kolattukudy, Proc Natl Acad Sci USA 89, 5306 (Jun. 15, 1992). [0639]61. T. M. Cheesbrough, P. E. Kolattukudy, J Biol Chem 263, 2738 (Feb. 25, 1988). [0640]62. M. C. Chang, R. A. Eachus, W. Trieu, D. K. Ro, J. D. Keasling, Nat Chem Biol 3, 274 (May, 2007). [0641]63. R. J. Porra, B. D. Ross, Biochem J 94, 557 (March, 1965). [0642]64. X. Chen, W. Guo, L. Zhao, Q. Fu, Y. Ma, J Phys Chem A 111, 3566 (May 10, 2007). [0643]65. L. Zhao, W. Guo, R. Zhang, S. Wu, X. Lu, Chemphyschem 7, 1345 (Jun. 12, 2006). [0644]66. L. Zhao, R. Zhang, W. Guo, S. Wu, X. Lu, Chemical Physics Letters 414, 28 (2005). [0645]67. G. Gorgen, W. Boland, Eur J Biochem 185, 237 (Nov. 6, 1989). [0646]68. P. Ney, W. 
Boland, Eur J Biochem 162, 203 (Jan. 2, 1987).
[0647]69. Z. L. Boynton, G. N. Bennett, F. B. Rudolph, Appl Environ Microbiol 62, 2758 (August, 1996).
[0648]70. R. T. Yan, J. S. Chen, Appl Environ Microbiol 56, 2591 (September, 1990).
[0649]71. R. V. Nair, G. N. Bennett, E. T. Papoutsakis, J Bacteriol 176, 871 (February, 1994).
[0650]72. D. P. Wiesenborn, F. B. Rudolph, E. T. Papoutsakis, Appl Environ Microbiol 55, 317 (February, 1989).
[0651]73. D. K. Thompson, J. S. Chen, Appl Environ Microbiol 56, 607 (March, 1990).
[0652]74. M. G. Hartmanis, J Biol Chem 262, 617 (Jan. 15, 1987).
[0653]75. K. X. Huang, S. Huang, F. B. Rudolph, G. N. Bennett, J Mol Microbiol Biotechnol 2, 33 (January, 2000).
[0654]76. L. Fontaine et al., J Bacteriol 184, 821 (February, 2002).
[0655]77. B. McMahon, M. E. Gallagher, S. G. Mayhew, FEMS Microbiol Lett 250, 121 (Sep. 1, 2005).
[0656]78. M. Li, S. Yao, S. K., Microbial Biotechnology 23, 573 (2007).
[0657]79. T. B. Causey, S. Zhou, K. T. Shanmugam, L. O. Ingram, Proc Natl Acad Sci USA 100, 825 (Feb. 4, 2003).
[0658]80. D. E. Chang, S. Shin, J. S. Rhee, J. G. Pan, J Bacteriol 181, 6656 (November, 1999).
[0659]81. C. R. Dittrich, R. V. Vadali, G. N. Bennett, K. Y. San, Biotechnol Prog 21, 627 (March-April, 2005).
[0660]82. H. Lin, N. M. Castro, G. N. Bennett, K. Y. San, Appl Microbiol Biotechnol 71, 870 (August, 2006).
[0661]83. U. Schorken, G. A. Sprenger, Biochim Biophys Acta 1385, 229 (Jun. 29, 1998).
[0662]84. G. A. Sprenger, M. Pohl, Journal of Molecular Catalysis B: Enzymatic 6, 145 (1999).
[0663]85. G. A. Sprenger, M. Pohl, Journal of Molecular Catalysis B: Enzymatic 6, 145 (1999).
[0664]86. B. Gonzalez, R. Vicuna, J Bacteriol 171, 2401 (May, 1989).
[0665]87. P. Hinrichsen, I. Gomez, R. Vicuna, Gene 144, 137 (Jun. 24, 1994).
[0666]88. E. Janzen et al., Bioorg Chem 34, 345 (December, 2006).
[0667]89. M. M. Kneen, I. D. Pogozheva, G. L. Kenyon, M. J. McLeish, Biochim Biophys Acta 1753, 263 (Dec. 1, 2005).
[0668]90. K. Yamada-Onodera, A. Nakajima, Y. Tani, J Biosci Bioeng 102, 545 (December, 2006).
[0669]91. K. Yamada-Onodera, M. Fukui, Y. Tani, J Biosci Bioeng 103, 174 (February, 2007).
[0670]92. T. Tobimatsu, M. Azuma, S. Hayashi, K. Nishimoto, T. Toraya, Biosci Biotechnol Biochem 62, 1774 (September, 1998).
[0671]93. T. Tobimatsu et al., J Biol Chem 271, 22352 (Sep. 13, 1996).
[0672]94. T. Toraya, T. Shirakashi, T. Kosuga, S. Fukui, Biochem Biophys Res Commun 69, 475 (Mar. 22, 1976).
[0673]95. M. Yamanishi et al., Eur J Biochem 269, 4484 (September, 2002).
[0674]96. J. R. O'Brien et al., Biochemistry 43, 4635 (Apr. 27, 2004).
[0675]97. C. Raynaud, P. Sarcabal, I. Meynial-Salles, C. Croux, P. Soucaille, Proc Natl Acad Sci USA 100, 5010 (Apr. 29, 2003).
[0676]98. B. Ludwig, A. Akundi, K. Kendall, Appl Environ Microbiol 61, 3729 (October, 1995).
[0677]99. S. X. Xie, J. Ogawa, S. Shimizu, Biosci Biotechnol Biochem 63, 1721 (October, 1999).
[0678]100. T. Zelinski, J. Peters, M. R. Kula, J Biotechnol 33, 283 (Apr. 15, 1994).
[0679]101. M. C. Hunt, A. Rautanen, M. A. Westin, L. T. Svensson, S. E. Alexson, Faseb J 20, 1855 (September, 2006).
[0680]102. M. A. Westin, S. E. Alexson, M. C. Hunt, J Biol Chem 279, 21841 (May 21, 2004).
[0681]103. M. A. Westin, M. C. Hunt, S. E. Alexson, J Biol Chem 280, 38125 (Nov. 18, 2005).
[0682]104. H. Iwaki, Y. Hasegawa, S. Wang, M. M. Kayser, P. C. Lau, Appl Environ Microbiol 68, 5671 (November, 2002).
Sequence CWU
1
333 1 12066 DNA Vibrio splendidus 1
ggggacaagt ttgtacaaaa aagcaggctt gacgcttatc
acatttagta gaagcttatg 60tggaggcgat tggctttttt ttcaaggaag attacaaaat
agctcaggta atgccgattt 120atagatttgc tatgatatag ttcaggatct tatgctttta
ataagcagga acagaattta 180tgaacaaaaa agctgatagt ttagtaggtt acagctttat
tcgttataga aagggttagg 240gaacgtgaac tttttagagc tcaaacttcg catggataac
tctccggtgc tgagccgatt 300tttagagaat ggatttttac tccagcagaa actgagcctt
gttctttgtt gtgtgttgat 360cgcagcttct gcatggattt taggacagct tgcatggttt
attgaacctg ctgagcaaac 420cgtcgtgcca tggacagcaa cggcttcctc gtcttcaacg
cctcaatcga ctcttgatat 480ctcttctttg cagcagagca acatgtttgg tgcttataac
ccaaccacgc ctgctgtggt 540tgagcagcaa gttatccaag atgcgccaaa gacgcgactg
aacctcgttt tagtgggtgc 600agtagccagt tctaatccaa agctgagctt ggctgtgatt
gccaatcgcg gcacacaagc 660aacctacggc attaatgaag agatcgaagg tacgcgagct
aagttaaaag cggtattagt 720cgatcgcgtg attattgata actcaggtcg agacgaaacc
ttgatgcttg aaggcattga 780gtacaagcgt ttgtctgtat cagcacctgc gccacctcgt
acctcttctt ctgtgcgtgg 840caacaaccca gcttctgcag aagagaagct agatgaaatt
aaagcgaaga taatgaaaga 900tccgcaacaa atcttccaat atgttcgact gtctcaggtg
aaacgcgacg ataaagtgat 960tggttatcgt gtgagccctg gcaaagattc agaacttttt
aactctgttg ggctccaaaa 1020cggagatatt gccactcagt taaatggaca agacctgaca
gaccctgctg ctatgggcaa 1080catattccgt tctatctcag agctgacaga gctaaacctc
gtcgtcgaga gagatggtca 1140acaacatgaa gtgtttattg aattttagaa ctttgcgtct
aacgaaggac gaaagtgtag 1200gagaagtacg tgaagcattg gtttaagaaa agtgcatggt
tattggcagg aagcttaatc 1260tgcacacccg cagccatcgc gagtgatttt agtgccagct
ttaaaggcac tgatattcaa 1320gagtttatta atattgttgg tcgtaaccta gagaagacga
tcatcgttga cccttcggtg 1380cgcggaaaaa tcgatgtacg cagctacgac gtactcaatg
aagagcaata ctacagcttc 1440ttcctaaacg tattggaagt gtatggctac gcggttgtcg
aaatggactc gggtgttctt 1500aagatcatca aggccaaaga ttcgaaaaca tcggcaattc
cagtcgttgg agacagtgac 1560acgatcaaag gcgacaatgt ggtgacacgt gttgtgacgg
ttcgtaatgt ctcggtgcgt 1620gaactttctc ctctgcttcg tcaactaaac gacaatgcag
gcgcgggtaa cgttgtgcac 1680tacgacccag ccaacatcat ccttattaca ggccgagcgg
cggtagtaaa ccgtttagct 1740gaaatcatca agcgtgttga ccaagcgggt gataaagaga
ttgaagtcgt tgagctaaag 1800aatgcttctg cggcagaaat ggtacgtatc gttgatgcgt
taagcaaaac cactgatgcg 1860aaaaacacac ctgcatttct acaacctaaa ttagttgccg
atgaacgtac caatgcgatt 1920cttatctcag gcgaccctaa agtacgtagc cgtttaagaa
ggctgattga acagcttgat 1980gttgaaatgg caaccaaggg caataaccaa gttatttacc
ttaaatatgc aaaagccgaa 2040gatctagttg atgtgctgaa aggcgtgtcg gacaacctac
aatcagagaa gcagacatca 2100accaaaggaa gttcatcgca gcgtaaccaa gtgatgatct
cagctcacag tgacaccaac 2160tctttagtga ttaccgcaca gccggacatc atgaatgcgc
ttcaagatgt gatcgcacag 2220ctggatattc gtcgtgctca agtattgatt gaagcactga
ttgtcgaaat ggccgaaggt 2280gacggcgtta accttggtgt gcagtggggt aaccttgaaa
cgggtgccat gattcagtac 2340agcaacactg gcgcttccat tggcggtgtg atggttggtt
tagaagaagc gaaagacagc 2400gaaacgacaa ccgctgttta tgattcagac ggtaaattct
tacgtaatga aaccacgacg 2460gaagaaggtg actattcaac attagcttcc gcactttctg
gtgttaatgg tgcggcaatg 2520agtgtggtaa tgggtgactg gaccgccttg atcagtgcag
tagcgaccga ttcaaattca 2580aatatcctat cttctccaag tatcaccgtg atggataacg
gcgaagcgtc attcattgtg 2640ggtgaagagg tgcctgttct aaccggttct acagcaggct
caagtaacga caacccattc 2700caaacagttg aacgtaaaga agtgggtatc aagcttaaag
tggtgccgca aatcaatgaa 2760ggtgattcgg ttcaactgca aatagaacaa gaagtatcga
acgtattagg cgccaatggt 2820gcggttgatg tgcgttttgc taagcgacag ctaaatacat
cagtgattgt tcaagacggt 2880caaatgctgg tgttgggtgg cttgattgac gagcgagcat
tggaaagtga atctaaggtg 2940ccgttcttgg gagatattcc tgtgcttgga cacttgttca
aatcaaccag tactcaggtt 3000gagaaaaaga acctaatggt cttcatcaaa ccaaccatta
ttcgtgatgg tatgacagcc 3060gatggtatca cgcagcgtaa atacaacttc atccgtgctg
agcagttgta caaggctgag 3120caaggactga agttaatggc agacgataac atcccagtat
tgcctaaatt tggtgccgac 3180atgaatcacc cggctgaaat tcaagccttc atcgatcaaa
tggaacaaga ataatggctg 3240aattggtagg ggcggcacgt acttatcagc gcttgccgtt
tagctttgcg aatcgctaca 3300agatggtgtt ggaataccaa catccagagc gcgcaccgat
actttattat gttgagccac 3360tgaaatcggc ggcgatcatt gaagtgagtc gtgttgtgaa
aaatggtttc acgccacaag 3420cgattactct cgatgagttt gataaaaaac taaccgatgc
ttatcagcgt gactcgtcag 3480aagctcgtca gctcatggaa gacattggtg ctgatagtga
tgatttcttc tcactagcgg 3540aagaactgcc tcaagacgaa gacttacttg aatcagaaga
tgatgcacca atcatcaagt 3600taatcaatgc gatgctgggt gaggcgatca aagagggtgc
ttcggatata cacatcgaaa 3660cctttgaaaa gtcactttgt atccgtttcc gagttgatgg
tgtgctgcgt gatgttctag 3720cgccaagccg taaactggct ccgctattgg tttcacgtgt
caaggttatg gctaaactgg 3780atattgcgga aaaacgcgtg ccacaagatg gtcgtatttc
tctgcgtatt ggtggccgag 3840cggttgatgt tcgtgtttca accatgcctt cttcgcatgg
tgagcgtgtg gtaatgcgtc 3900tgttggacaa aaatgccact cgtctagact tgcacagttt
aggtatgaca gccgaaaacc 3960atgaaaactt ccgtaagctg attcagcgcc cacatggcat
tatcttggtg accggcccga 4020caggttcagg taaatcgacg accttgtacg caggtctgca
agaactcaac agcaatgaac 4080gaaacatttt aaccgttgaa gacccaatcg aattcgatat
cgatggcatt ggtcaaacac 4140aagtgaaccc taaggttgat atgacctttg cgcgtggttt
acgtgccatt cttcgtcaag 4200atcctgatgt tgttatgatt ggtgagatcc gtgacttgga
gaccgcagag attgctgtcc 4260aggcctcttt gacaggtcac ttagttatgt cgactctgca
taccaatact gccgtcggtg 4320cgattacacg tctacgtgat atgggcattg aacctttctt
gatctcttct tcgctgctgg 4380gtgttttggc tcagcgcttg gttcgtactt tatgtaacga
atgtaaagaa ccttatgaag 4440ccgataaaga gcagaagaaa ctgtttgggt tgaagaagaa
agaaagcttg acgctttacc 4500atgccaaagg ttgtgaagag tgtggccata agggttatcg
aggtcgtacg ggtattcatg 4560agctgttgat gattgatgat tcagtacaag agctgattca
cagtgaagcg ggtgagcagg 4620cgattgataa agcaattcgt ggcacaacac caagtattcg
agatgatggc ttgagcaaag 4680ttctgaaagg ggtaacgtcc ctagaagaag tgatgcgcgt
gaccaaggaa gtctagtatg 4740gcggcatttg aatacaaagc actggatgcc aaaggcaaaa
gtaaaaaagg ctcaattgaa 4800gcagataatg ctcgtcaggc tcgccaaaga ataaaagagc
ttggcttgat gccggttgag 4860atgaccgagg ctaaagcaaa aacagcaaaa ggtgctcagc
catcgaccag ctttaaacgc 4920ggcatcagta cgcctgatct tgcgcttatt actcgtcaaa
tatccacgct cgttcaatct 4980ggtatgccgc tagaagagtg tttgaaagcc gttgccgaac
agtctgagaa acctcgtatt 5040cgcaccatgc tactcgcggt gagatctaag gtgactgaag
gttattcgtt agcagacagc 5100ttgtctgatt atccccatat cttcgatgag ctattcagag
ccatggttgc tgctggtgag 5160aagtcagggc atctagatgc ggtattggaa cgattggctg
actacgcaga aaaccgtcag 5220aagatgcgtt ctaagttgct gcaagcgatg atctacccca
tcgtgctggt ggtgtttgcg 5280gtgacgattg tgtcgttcct actggcaacg gtagtgccga
agatcgttga gcctattatc 5340caaatgggac aagagctccc tcagtcgaca caatttttat
tagcatcgag tgaatttatc 5400cagaattggg gcatccaatt actggtgttg accattggtg
tgattgtgtt ggttaagact 5460gcgctgaaaa agccgggcgt tcgcatgagc tgggatcgca
aattattgag catcccgctg 5520ataggcaaga tagcgaaagg gatcaacacc tctcgttttg
cacgaacact ttctatctgt 5580acctctagtg cgattcctat ccttgaaggg atgaaggtcg
cggtagatgt gatgtcgaat 5640catcacgtga aacaacaagt attacaggca tcagatagcg
ttagagaagg ggcaagcctg 5700cgtaaagcgc ttgatcaaac caaactcttt cccccgatga
tgctgcatat gatcgccagt 5760ggtgagcaga gtggccaatt ggaacagatg ctgacaagag
cggcagataa tcaggatcaa 5820agctttgaat cgaccgttaa tatcgcgtta ggcattttta
ccccagcgct tattgcgttg 5880atggctggct tagtgctgtt tatcgtgatg gcgacgctga
tgccaatgct tgaaatgaac 5940aatttaatga gtggttaacc tgccgctcat cagacgttag
tttttggatt atcgagaaga 6000aggacatcat tcccctcaac tcgctatctg taatttggag
aaaataatga aaaataaaat 6060gaaaaaacaa tcaggcttta ccctattaga agtcatggtt
gttgtcgtta tccttggtgt 6120tctagcaagt tttgttgtac ctaacctgtt gggcaacaaa
gagaaggcgg atcaacaaaa 6180agccatcact gatattgtgg cgctagagaa cgcgctcgac
atgtacaaac tggataacag 6240cgtttaccca acaacggatc aaggcctgga cgggttggtg
acaaagccaa gcagtccaga 6300gcctcgtaac taccgagacg gcggttacat caagcgtcta
cctaacgacc catggggcaa 6360tgagtaccaa tacctaagtc ctggtgataa cggcacaatt
gatatcttca ctcttggcgc 6420agatggtcaa gaaggtggtg aaggtattgc tgcagatatc
ggcaactgga acatgcagga 6480cttccaataa gcttcggctt gttgtcggtt gatacgttcc
tgttgtttga ttcgttatcg 6540ttgcttgata cgttattgat ggtagtacgc aaaaaatgga
gtctacaagg tgaaaactaa 6600gcaaacacag ccaggtttca ccttgattga gattcttttg
gtgttggtat tactgtcagt 6660atcggcggtc gcggtgatct cgaccatccc taccaatagc
aaagatgttg ctaaaaaata 6720cgctcaaagc ttttatcagc gaattcagct actcaatgaa
gaggctattt tgagtggctt 6780agattttggt gttcgtgttg atgaaaaaaa atcgacttac
gttctgatga ctttgaagtc 6840tgatggctgg caagaaacgg agttcgaaaa gatcccttct
tcaactgaat taccggaaga 6900actggcactg tcgctgacat taggtggtgg cgcgtgggaa
gacgatgatc ggttgttcaa 6960tccaggaagc ttatttgatg aagatatgtt tgctgatctt
gaagaggaaa agaagccgaa 7020accaccacag atctacatct tgtcgagtgc tgaaatgacg
ccatttgtac tgtcgtttta 7080cccaaatacc ggtgacacaa tacaagatgt ttggcgcatt
cgagtattgg ataatggtgt 7140gattcgatta ctcgagccgg gagaagaaga tgaagaagaa
taaccgttct ccttatcgtt 7200ctcgcggtat gcctcttggt tctcgaggaa tgactctgct
tgaagtattg gttgcgctgg 7260ctatcttcgc tacggcggcg atcagtgtga ttcgtgctgt
cacccagcac atcaatacgc 7320tcagttatct cgaagaaaaa accttcgcgg cgatggtcgt
tgataatcaa atggccctag 7380tcatgctaca tcctgagatg cttaaaaaag cgcagggcac
gcaagagtta gcgggaagag 7440aatggttctg gaaggtgact cccatcgata ccagcgataa
tttattaaag gcgtttgatg 7500tgagtgcggc aaccagtaag aaagcgtctc cagtcgttac
ggtgcgcagt tatgtggtta 7560attaagagaa tgtggtcaat taagagcatg ttattaatta
agaacagctc gctaactaag 7620agcgtgtcgc taactaagag catgtcggaa aataagcgta
cgccgcgtaa acaaggtcta 7680ccttcaaaag ggagaggctt taccttaatt gaagtcttgg
tctcgattgc tatctttgcc 7740acgctaagta tggcggctta tcaggtggtt aatcaggtgc
agcgaagcaa cgagatctct 7800attgagcgca gtgctcgttt gaaccaactg caacgcagtt
tagtcatttt agataatgat 7860tttcgccaga tggcggtgcg aaaatttcgt accaacggtg
aagaagcatc atctaagctg 7920atcttaatga aagagtattt attggactcc gacagtgtag
gcatcatgtt tactcgtcta 7980ggttggcaca acccacaaca gcagtttcct cgcggtgaag
tcacgaaggt tggctaccgt 8040attaaagaag aaacacttga gcgtgtatgg tggcgttatc
ccgatacacc ttcaggccaa 8100gaaggtgtga ttacccctct gcttgatgat gttgaaagct
tggaattcga gttttatgac 8160ggaagccgct gggggaaaga gtggcaaacc gataaatcac
tgccgaaagc ggtgaggctt 8220aagctgacac tgaaagacta tggtgagata gagcgtgttt
atctcactcc cggtggcacc 8280ctagatcagg ccgatgattc ttcaaacagt gactcttcag
gcagtagtga ggggaataat 8340gactcatcga actaataagc gtttagcgac aaggtcagcc
ttgggacgta aacaacgtgg 8400tgtcgcgctg atcattattt tgatgctatt ggcgatcatg
gcaaccattg ctggcagcat 8460gtccgagcgt ttgtttacgc aattcaagcg cgttggtaac
caactgaatt accaacaggc 8520ttactggtac agcattggtg tggaagcgct tgtgcaaaac
ggtattaggc aaagttacaa 8580agacagtgat accgtgaacc taagccaacc atgggcgtta
gaagagcagg tatacccatt 8640ggattatggc caagttaagg gccgcattgt tgatgctcag
gcatgtttta atcttaatgc 8700cttagccgga gtggcgacca cttcaagtaa ccagactcct
tatttaatca cggtttggca 8760aaccttattg gaaaaccaag acgttgagcc ttatcaggct
gaggttatcg caaattcaac 8820gtgggaattt gttgatgcgg atacacgaac cacctcttcg
tctggtgtag aagacagcac 8880gtatgaagcg atgaagccct cttatttggc ggcgaatggc
ttaatggccg atgaatccga 8940gctacgagcg gtttatcaag tcactggtga agtgatgaat
aaggttcgcc cctttgtttg 9000cgctctgcca accgatgatt tccgcttgaa tgtgaatact
ctcacggaaa aacaagcacc 9060gttattggaa gcgatgtttg cgccaggctt aagtgaatcg
gatgccaaac agctgataga 9120taaacgccca tttgatggct gggatacggt agatgctttc
atggctgaac ctgccattgt 9180tggtgtaagt gccgaagtca gcaagaaagc gaaagcatat
ttaactgtag atagcgccta 9240ttttgagcta gatgcagagg tattagttga gcagtcacgt
gtacgtatac ggacgctttt 9300ctatagtagt aatcgagaaa cagtgacggt agtacgccgt
cgttttggag gaatcagtga 9360gcgagtttct gaccgttcga ctgagtagcg aaccacaaag
ccctgtgcag tggttagttt 9420ggtcgacaag ccaacaagaa gtgatagcaa gcggtgaact
gtctagctgg gaacagcttg 9480acgagttaac gccttacgct gaaaagcgca gctgtatcgc
tttattgccg ggaagtgaat 9540gcttaattaa gcgtgttgag atcccgaaag gtgctgctcg
ccagtttgat tctatgctgc 9600cgttcttatt agaagacgaa gtcgcacaag atatcgaaga
cttacacctg actattttag 9660ataaagatgc cactcacgct accgtgtgtg gtgtggatcg
tgaatggcta aaacaagctt 9720tagacctgtt tcgcgaagcc aatataatct tccgtaaggt
gctaccagat acactagccg 9780tgccttttga agaacaaggc atcagtgcgt tgcagataga
tcagcattgg ttattgcgcc 9840aaggtcactc tcaacgtcaa ggtcactatc aagccgtatc
gatcagtgaa gcatggttac 9900cgatgttttt gcaaagtgat tgggttgtcg ctggtgagga
agagcaagcg acgactatct 9960tcagctatac cgcgatgccg agcgacgacg ttcaacagca
aagcggcctc gagtggcaag 10020caaagcctgc ggaattggtg atgtctttat tgagtcagca
agcgatcaca agcggcgtaa 10080atttactgac tggcaccttt aaaaccaaat cttcattcag
taaatattgg cgtgtttggc 10140agaaagtggc gattgctgct tgtttgctgg tggccgtgat
tgtgactcag caagtgttga 10200aggttcagca atacgaagcg caagcacaag cctaccgcat
ggagagtgag cgtatcttta 10260gagctgtgct gcctggcaaa caacgcattc cgaccgtgag
ttacctcaag cgtcagatga 10320atgatgaagc taagaaatac ggtggttcag gcgaaggtga
ttctttactt ggttggttag 10380ctttgctgcc tgaaacctta gggcaagtga agacgatcga
agttgaaagc attcgctacg 10440atggcaaccg ttctgaggtt cgactgcagg ctaaaagttc
tgacttccaa cactttgaga 10500ccgcaagggt gaagctcgaa gagaagtttg tcgttgagca
agggccattg aaccgtaatg 10560gcgatgccgt atttggcagt tttactctta aaccccatca
ataacctgcg taaggagatc 10620agtgatgaga aatatgattg aaccactcca agcgtggtgg
gcttcaataa gtcagcggga 10680acaacgatta gtcattggtt gttctatttt attgatactg
ggcgttgtct attggggatt 10740aatacaacca cttagccaac gagccgagct tgcacaaagc
cgcattcaaa gtgagaagca 10800acttctggct tgggtaacgg acaaagcgaa tcaagtggtt
gaactacgag gcagtggtgg 10860catcagtgcc agtcagcctt tgaaccaatc tgtgcctgct
tctatgcgcc gttttaacat 10920cgagctgata cgcgtgcaac cacgcggtga gatgctgcaa
gtttggatta agcctgtgcc 10980atttaataag ttcgttgact ggctgacata cctgaaagaa
aagcagggtg ttgaggttga 11040gtttatggat attgatcgct ctgatagccc tggggttatt
gagatcaacc gactacagtt 11100taaacgaggt taatgtgaaa cgcggtttat ctttcaaata
cggcctgtta ttcagcgtca 11160tttttatcgt ttttttctcg gtaagcttgt tgctgcattt
gcctgccgct tttgctctca 11220agcatgcacc cgtcgtgcgt ggtttaagca ttgaaggcgt
tgagggcacc gtttggcaag 11280gtcgcgctaa caatatcgcg tggcagcgtg tcaattacgg
ctcagtgcag tgggacttcc 11340agttctctaa actattccaa gccaaagcag aacttgcggt
tcgctttggc cgcaacagcg 11400acatgaactt atcaggtaaa ggacgtgtcg gatatagcat
gagtggtgct tacgcggaaa 11460acttagtggc atcaatgcca gccagcaacg tgatgaaata
tgcgccagct atcccagtgc 11520ctgtgtctat tgcagggcaa gttgaactga cgatcaaaca
tgcggttcat gctcaacctt 11580ggtgtcaatc aggtgaaggt acgcttgctt ggtctggtgc
agcagtcgac tcgccagtgg 11640gttcgttaga ccttggccct gtgattgcgg acataacgtg
tgaagacagc acaattgcag 11700ccaaaggcac tcagaagagc gatcaggtag acagcgagtt
ctcagcgagc gtaacaccta 11760accaacgcta cacctcggca gcatggttta agccaggcgc
tgaattcccg ccagcaatgc 11820agagtcagct taagtggttg ggcaatcctg atagccaagg
taaataccaa tttacttatc 11880aaggccgctt ttagcccggt atttacttca gagctagtat
ctgaagtaaa tttggcgatc 11940aaatcgcgaa ctataaaaaa cgggcacctc actgaggtgc
ccgttttgtt tgttctgaga 12000atctagagga tatctgacgg ttaaagagag caaactcacc
cagctttctt gtacaaagtg 12060gtcccc
12066
2 54080 DNA Vibrio splendidus 2
gtgctttgtg
acaacggggg atgtatggat attgaagttt cgcgccaggt tgcggtagtt 60gaagctacga
gtggagatgt cgtcgtagtt aagccagacg gcagcgcaag aaaagtttca 120gttggcgata
ccatccgtga aaatgagatc gtgattacgg ccaacaagtc agagcttgta 180ttaggcgttc
agaatgattc gattccggtt gcagagaatt gcgtcggttg tgttgatgaa 240aacgctgcat
gggtagatgc cccaatagct ggtgaggtta attttgactt acagcaagca 300gacgcagaaa
ccttcactga agacgacctt gctgcaattc aagaagccat tttaggtggt 360gccgatccga
ctcaaatctt agaagcaacg gctgctggtg gcggactagg ttctgcaaat 420gctggctttg
tgacgattga ctataactac actgaaactc atccatcgac tttctttgag 480accgctggtc
tagcagaaca aactgttgat gaagacagag aagaattcag atctatcact 540cgttcatcag
gtggccaatc aatcagtgaa acactgactg aaggctccat atctggcaat 600acctatcccc
aatctgtaac aacgacagaa acgattattg ctggtagttt agctctcgcc 660cctaactctt
tcattccaga aactttatcc ctcgcttcac tacttagtga attaaacagc 720gacattactt
caagtggtca gtccgttatc ttcacctatg acgcgacgac taattctatc 780gttggtgttc
aagataccga cgaagtatta cgtatcgaca ttgatgccgt cagtgttggc 840aataacattg
agctttctct aaccacaacg atttcccagc cgattgatca tgtaccgtcg 900gttggcggtg
gtcaggtttc ttacactggc gatcaaatag atattgcctt tgatattcaa 960ggtgaagaca
ccgctgggaa cccgctagca acacccgtta acgcacaagt ttcagtgttt 1020gacgggatag
atccgtctgt tgaaagtgtc aatatcacta acgttgaaac tagcagcgcg 1080gcaatcgaag
ggacgttctc aaatattggt agtgataacc ttcaatcagc cgtatttgat 1140gcaagtgcac
tggaccagtt tgatgggttg ctcagtgata atcaaaacac gcttgcgaga 1200ctttctgatg
atggaacaac gattactctg tccatccaag gtcgaggtga ggttgttctc 1260actatctctc
tagataccga tggcacctat aaattcgagc agtctaatcc gatagaacaa 1320gtgggtaccg
attcactgac gttcgctttg ccaatcacga ttaccgattt tgaccaagat 1380gttgtaacca
atacgatcaa cattgccatt actgatggcg atagccctgt tattactaat 1440gttgacagta
ttgatgttga tgaagcgggc attgttggcg gctcacaaga gggcacggcg 1500ccagtgtctg
gcactggcgg tatcaccgcg gacatttttg aaagtgacat cattgaccat 1560tatgagctag
aacccactga atttaatact aatggcacct tggtttcaaa tggcgaggct 1620gtgctacttg
agttgattga tgaaaccaac ggtgtaagaa cttacgaagg ttatgttgag 1680gtcaatggtt
cgagaattac ggtctttgac gttaaaattg atagcccttc attgggcaac 1740tatgagttta
atctttatga agaactttct catcaaggcg ctgaagatgc gctgttaact 1800tttgcattgc
caatttatgc tgttgatgca gatggcgacc gttctgcact gtctggaggt 1860tcgaacacac
cagaagctgc tgagatcctc gttaatgtta aagacgatgt cgttgaatta 1920gttgataagg
ttgaatcagt caccgagccg accttagcgg gcgatactat tgtttcgtat 1980aacctgttca
attttgaagg cgcagatggt tctacaattc aatcgtttaa ctacgacggt 2040gttgattact
cactcgatca aagcctgctc cccgatgcta cccagatttt cagttttact 2100gaaggtgtcg
tcactatctc attaaacggt gacttcagtt ttgaagtcgc tcgtgatatc 2160gaccactcaa
gcagtgaaac tatcgtcaaa cagttctcat ttttagccga agatggtgat 2220ggggatactg
atagttcgac gcttgagtta agtattaccg atggccaaga tccgatcatt 2280gatttgatcc
cgcctgtgac tctctctgaa accaacctta atgacggctc tgctcccagc 2340ggaagtacag
ttagcgcaac cgagacgatt acctttaccg caggcagcga cgatgtagca 2400agtttccgta
ttgaaccaac agagtttaat gtgggcggtg cacttaaatc gaatggattt 2460tcggttgaga
taaaagaaga ttcggctaat ccgggtactt acattggctt tattaccaac 2520ggttcgggcg
ctgaaatccc agtgtttacg attgctttct ctacgagcac attgggtgaa 2580tacaccttta
ctctgcttga agcgttagac catgtagatg gtttagataa gaacgatctg 2640agctttgatc
tgcctattta tgcggttgat acggacggcg acgattcatt ggtgtctcag 2700cttaatgtga
ctatcggtga tgatgttcaa atcatgcaag acggtacgtt agatatcacc 2760gagccaaatc
ttgctgacgg tacaatcaca accaacacca ttgatgtaat gccaaatcaa 2820agtgctgatg
gcgcgacgat cactcggttc acttatgacg gtgtcgtaaa cacactggat 2880caaagtattt
caggagaaca gcagttcagc ttcacagaag gcgaactgtt tatcaccctt 2940gaaggtgaag
tgcgctttga gcctaatcgc gatctagacc actcagtgag tgaagatatc 3000gtgaagtcga
ttgtggtgac ttcaagcgac ttcgataacg atccggtgac ttcaaccatt 3060acgctgacga
tcactgatgg tgataacccg acgattgatg ttattccaag tgttacgctt 3120tctgaaatta
acctgagcga tggctctgct ccaagtggca gcgcggtaag ctcgactcaa 3180actattactt
ttaccaatca aagtgatgat gtggttcgtt tccgtattga gtcaacggag 3240ttcaatacta
acgatgatct taaatcgaac ggtttagctg ttgagttacg tgaagacccg 3300gcagggtcgg
gtgactacat tggttttacg accagtgcga cgaacgtaga aactccagta 3360ttcacattaa
gctttaattc tggatcatta ggtgaataca cgttcacact catcgaagcg 3420ttggaccacc
aagatgcccg tggcaacaac gacctcagtt ttgatttacc tgtttacgcg 3480gtagatagtg
atggcgatga ttcattggtg tctccgttaa acgtcactat cggtgatgat 3540gttcaaatca
tgcaagatag tacgttagat atcgtcgagc caaccgtcgc agatttggcc 3600gctggcacag
tgacaactaa caccattgat gtgatgccaa atcaaagtgc cgatggcgca 3660acggtgacgc
aattcactta tgatggccag cttcgaacac ttgaccaaaa tgacaatggt 3720gagcagcaat
ttagcttcac agaaggtgaa ctgttcatca cgcttcaagg tgatgtgcgc 3780tttgagccta
atcgtaatct agaccacaca ctcagcgaag acatcgtgaa atcaatcgtg 3840gtgacatcta
gcgattccga taacgatgtg ttgacctcaa ccgtcactct gaccattacc 3900gatggtgata
tcccaaccat tgataatgtt ccaactgtga acttgtctga aactaatctg 3960agtgatggct
ctgcacctag cggaagcgcg gtgagttcaa ctcaaactat tacttacacc 4020actcaaagtg
atgatgtgac aagcttccgt attgaaccga ctgaatttaa tgttggtggc 4080gctctcacat
caaacggatt ggcagtcgag ttaaaagctg atccaaccac accgggtggc 4140tacatcggtt
ttgtgactga tggttcgaac gttgaaacta acgtgttcac gattagcttc 4200tcagatacca
atttaggcca gtacaccttc accttacttg aagcgttaga ccatgtggat 4260ggtttagcga
acaatgatct gacctttgat ctgcctgttt atgcagttga tagcgatggc 4320gacgattcac
tggtgtctca gttaaatgta accatcggtg atgatgttca aatcatgcaa 4380ggtggtacgt
tagatatcac tgagccaaat cttgcagacg gcacaattac aaccaatacc 4440atcgatgtga
tgccagagca aagcgccgat ggtgcgacga tcactcagtt cacttatgac 4500ggtcaagttc
gaacactgga tcaaacggac aatggtgagc agcaatttag cttcactgaa 4560ggcgagttgt
tcatcactct tcaaggtgac gtgcgtttcg aacccaatcg caacctagat 4620cacacagcta
gcgaagatat cgtgaagtcg atagtggtga cttcaagcga tttagataac 4680gatgtggtga
cgtcaacggt cactctgacg attactgatg gtgatatccc aaccattgat 4740gcagtgccaa
gcgttactct gtctgaaatc aatcttagtg acggctctgc gccaagtggc 4800actgcagtta
gtcaaactga gacgattacc ttcaccaatc aaagtgatga tgtgaccagt 4860ttccgtattg
agccaataga gttcaatgtg ggcggtgcac tgaaatcgaa tggatttgcg 4920gttgagataa
aagaagattc ggctaatccg ggtacttaca ttggctttat taccaacggt 4980tcgggcgctg
aaatcccagt gtttacgatt gctttctcta cgagctcatt gggtgaatac 5040acctttactc
tgcttgaagc gttagaccat gtagatggtt tagataagaa cgatctgagc 5100ttcgatctgc
ctgtttatgc ggtcgatacg gacggcgatg attcattggt gtctcagcta 5160aacgtgacca
tcggtgatga tgtccaaatc atgcaagacg gtacgttaga tatcatcgag 5220ccaaatctgg
ctgatggaac aatcacaacc agcactattg atgtgatgcc aaaccaaagt 5280gctgatggtg
cgacgatcac tcagtttact tatgacggtc agctaagaac gcttgatcaa 5340aatgacactg
gcgaacagca gttcagcttc acagaaggcg agttgtttat cacccttgaa 5400ggtgaagtgc
gctttgagcc aaaccgagac ctagaccaca ccgcgagtga agatattgtt 5460aagtcgattg
tggtcacttc aagtgatttc gataacgact ctctgacttc taccgtaacg 5520ctgaccatta
ctgatggtga taaccctacg atcgacgtca ttccaagcgt taccctttct 5580gaaactaatc
tgagtgatgg ctctgctcca agtggcagcg cggtaagctc gactcaaact 5640attactttta
ccaatcaaag tgatgatgtg gttcgtttcc gtattgagcc aacggagttc 5700aatactaacg
atgatcttaa atcgaacggt ttagccgttg agttacgtga agacccggct 5760gggtcgggtg
actacattgg ttttactact agtgcgacga atgtcgaaac cacggtattt 5820acgctgagtt
tttctagcac cacattaggt gaatatacct tcactttgct tgaagcgttg 5880gaccaccaag
atgcccgtgg caacaacgac ctcagttttg aactgcctgt ttatgcggta 5940gacagtgatg
gcgatgattc actgatgtct ccgttaaacg tcaccatcgg cgatgatgtt 6000caaatcatgc
aagacggtac gttagatatc gtcgagccaa ccgtcgcaga tttggccgct 6060ggcattgtga
caactaacac cattgatgtg atgccaaatc aaagtgccga tggcgcgacg 6120atcactcaat
tcacttatga tggccaactt cgaacacttg accaaaatga caatggcgaa 6180caacagttta
gcttcacgga aggtgaacta ttcatcactc ttgaaggtga agtgcgcttt 6240gagcctaatc
gtaatctaga ccacacgctg aacgaagaca tcgtgaaatc gatcgtggtg 6300acgtctagtg
actccgataa cgatgtgttg acctcaaccg tcactctgac cattaccgat 6360ggtgatatcc
caaccattga taatgtgcca acagtgagct tgtcagaaac aagtctgagt 6420gacggctctt
caccaagtgg cagcgcagtt agctcaactc aaaccatcac ttacaccact 6480caaagtgatg
atgtaaccag cttccgtatt gaaccgactg agttcaatgt tggcggtgct 6540ctcaaatcaa
atggattggc ggttgagctg aaggccgatc caaccactcc gggcggctac 6600atcggctttg
tgactgatgg ttcgaacgtt gaaactaacg tgttcacgat tagcttctcg 6660gataccaatt
taggtcaata caccttcacc ttgcttgaag cgttggatca tgcggatagc 6720cttgcaaata
acgatctgag ctttgatctg ccagtctacg ccgtcgatag tgatggcgat 6780gattcactgg
tgtctcaact caatgtaacc atcggtgatg atgttcaaat catgcaaggt 6840ggtacgttag
atatcactga gccaaacctt gcagacggca caaccacaac taacaccatc 6900gatgtgatgc
cagaacaaag tgccgatggt gcgacgatca ctcagtttac gtatgacggg 6960caagttcgca
ctctggatca aactgacaat ggtgagcagc aatttagctt cactgaaggc 7020gagttgttca
tcactcttca aggtgacgtg cgtttcgaac ccaatcgcaa cctagatcac 7080acagctagcg
aagacatcgt gaagtcgata gtggtgactt caagcgattc agataacgat 7140gtggtgacgt
caacggtcac tctgactatt actgatggtg atctcccaac cattgatgca 7200gtgccaagcg
ttactctgtc tgaaactaat cttagtgacg gctctgcgcc aagtggcagc 7260gcagtcagtc
aaactgagac catcaccttt accaatcaaa gtgatgatgt ggcgagtttc 7320cgtattgagc
caaccgagtt taatgtgggc ggtgcactga aatcgaatgg gtttgcggtt 7380gagataaaag
aagactctgc taatccgggt acttacattg gctttattgc caatggttcg 7440agcgctgaaa
tcccagtgtt cacgattgct ttctctacga gtacgttggg tgaatacacc 7500tttactctgc
ttgaagcgtt agaccatgcg gatggtttag ataagaacga tctgagcttt 7560gagcttccgg
tttacgcggt tgatacagac ggtgatgatt cattggtatc tcagcttaat 7620gtgaccattg
gtgatgatgt tcaaatcatg caagatggta cgttagacgt tatcgagcca 7680aatcttgcag
acggcacaat cacaaccaac accattgatg tgatgcccga gcaaagtgct 7740gatggtgcga
cgatcactca gtttacttat gacggtcagc taagaacgct tgatcaaaat 7800gacactggtg
aacagcagtt cagcttcaca gaaggcgagt tgtttatcac ccttgaaggt 7860gaagtgcgct
ttgaacctaa tcgcgatcta gaccattccg ttagcgaaga catcgtgaag 7920tcgatagtag
tgacttcaag cgacttcgat aacgatccgg tgacttcagc cattacgctg 7980accattactg
atggtgataa tccgactatc gattcggtac cgagcgttgt acttgaagaa 8040gctgatttaa
ctgatggctc atcgccaagt ggcagcgcgg ttagtcaaac ggaaaccatc 8100actttcacta
atcaaagtga cgatgttgag aaattccgtt tagaaccaag tgaatttaat 8160actaacaacg
cgctcaagtc cgatggcttg atcattgaga ttcgagagga accaacagga 8220tccggcaatt
atattggttt cacgaccgat atttcgaatg tcgaaaccac tgtgtttaca 8280ctcgatttca
gcagtaccac tttgggtgag tacaccttca cgcttctgga agcgattgac 8340cacacgcctg
ttcaaggcaa taacgatcta acattcaact tgccagtcta cgcggttgat 8400agcgacggtg
atgattcgct aatgtcatca ctatcggtga cgattactga tgatgttcaa 8460gtgatggtga
gtggttcgct tagtatcgaa gagcctactg ttgccgactt ggctgcaggc 8520acgccaacaa
catcagtatt tgatgtatta acatccgcga gtgctgatgg ggcgaccatt 8580actcagttca
cttatgatgg tggggcggta ttaacgcttg atcaaaacga tacaggtgag 8640cagaagttcg
tggttgctga tggggcatta tatatcactc tgcaaggcga tattcgtttc 8700gaaccaagtc
gtaaccttga ccatactggt ggcgatatcg tcaagtcgat agtcgtaact 8760tcaagtgatt
ccgatagcga tcttgtgtct tcaacggtaa cgctaaccat tactgatggc 8820gatatcccaa
cgattgacac ggtgccaagc gttactctgt cagaaacgaa tctgagcgac 8880ggatctgctc
cgaatgcaag tgcggtaagt tcaactcaaa ccattacctt tactaaccaa 8940agtgatgacg
tgacgagttt ccgtattgaa ccgactgatt ttaatgttgg tggtgctctg 9000aaatcgaacg
gattggcggt cgaactgaaa gcggacccaa ctacaccggg tggctacatc 9060ggttttgtga
ctgatggttc gaacgttgaa actaacgtgt ttacgattag cttctcggat 9120accaatttag
gtcaatacac cttcaccctg cttgaagcgt tggatcatgt agatggctta 9180gtgaagaatg
atctgacttt tgatcttcct gtttatgcgg ttgatagcga tggtgatgat 9240tcactggtgt
ctcaactgaa tgtgaccatt ggtgatgatg tacaggtcat gcaaaaccaa 9300gcgcttaata
ttattgagcc aacggttgct gatttggctg caggtactcc gacgacagcc 9360actgttgatg
tgatgcctag ccaaagtgcc gatggcgcga caatcactca gtttacttac 9420gatggcgggg
cggcaataac actcgaccaa aacgacaccg gtgaacagaa gtttgtattt 9480actgaaggtt
cactgtttat caccttgcaa ggtgaagtgc gtttcgagcc aaatcgcaat 9540ctaaaccaca
cagcgagcga agacatcgtg aagtcgattg tggtgacttc aagcgattta 9600gataacgatg
tactgacgtc aacggtcact ctgactatta ctgatggtga tatcccaacc 9660attgatgcag
tgccaagcgt tactctgtct gaaactaatc ttagtgacgg ctcagcgcca 9720agcagcagtg
ctgtaagtca aacagagacg attaccttca tcaatcaaag tgatgatgtg 9780gcgagtttcc
gtattgagcc aacagagttc aatgtgggcg gtgcactgaa atcgaatgga 9840tttgcggttg
agataaaaga agattcggct aatccgggta cttatatcgg ttttattacc 9900gatggttcga
atactgaagt tcctgtattc acgattgctt tctctacaag tacgttgggc 9960gaatacacct
tcaccttact tgaagcgcta gaccatgcaa atggcctaga taagaacgat 10020ctgagttttg
atcttcctgt ttatgcggta gacagtgatg gcgatgattc actggtgtct 10080caactgaatg
tgaccattgg tgatgatgtc caaataatgc aagacggtac gttagatatc 10140actgagccaa
atcttgcaga cggaacaatc acaaccaaca ccattgatgt gatgccaaat 10200cagagtgccg
atggtgcgac gatcactgaa ttctcatttg gcggtattgt caaaacactc 10260gatcaaagca
tcgtaggtga gcagcagttt agtttcaccg aaggtgagct attcatcact 10320cttcaaggtc
aagtgcgctt tgaaccaaat cgtgaccttg accactctgc cagcgaagac 10380atcgtgaagt
cgatagtggt tacttcaagt gattttgata acgatcctgt gacttcaacc 10440gttacgctga
ccattaccga tggtgatatt ccaactatcg atgcggtacc aagtgttacg 10500ctttcagaaa
caaacctagc tgatggttct gcgccaagtg gtagtgcggt tagtcaaacg 10560gagacgatta
cttttaccaa tcaaagtgat gatgtggttc gcttccgtct ggaaccaacc 10620gagttcaata
ctaacgatgc acttaaatcg aatggcttag cggtcgaact gcgcgaagaa 10680cctcaaggct
ctggtcagta cattggcttt accaccagtt cgtctaatgt tgagacaaca 10740gtatttacgt
tggactttaa ctccggaacc ttaggtgaat acacatttac tttaatcgaa 10800gctctggatc
atcaagatgc gcgtggcaac aacgatttaa gctttaatct acctgtgtat 10860gcggtggata
gtgatggcga tgactcgtta gtctctcagc ttggcgtgac cattggcgac 10920gatgtgcagt
tgatgcaaga cggcacaatc accagtcgtg agcctgcagc aagtgttgaa 10980acatcaaata
cctttgatgt gatgccaaac caaagtgctg atggagccaa agtcacttca 11040tttgttttcg
atggtaagac tgcagaaagt cttgatttga atgtgaatgg tgaacaagag 11100ttcgtcttca
cggaaggttc ggtatttatt acgacggaag gtgagatacg attcgagccg 11160gtacgtaatc
aaaatcatgc tggtggtgat attaccaagt cgattgaggt gacgtctgtt 11220gacctcgatg
gcgatattgt cacatcgaca gtgacactga agattgttga tggtgacctt 11280cctactatcg
accttgttcc cggaattacg ttatctgaag tggatctggc cgatggctct 11340gtgccaaccg
gtaatccagt gacaatgaca caaaccatta cctacacagc gggtagtgac 11400gacgtaagcc
atttcagaat tgaccctacg cagttcaata cttcaggggt tttgaaatcg 11460aacggcctag
atgtcgaaat aaaagagcag ccagctaatt ctggtaatta cattggcttc 11520gtcaaagacg
gttctaacgt agaaaccaac gtcttcacga tcagcttctc gacgagcaat 11580ttagggcaat
acacgttcac actacttgaa gcgttagatc atgtagatgg attgcaaaac 11640aatatactaa
gcttcgatgt ccctgtttta gcggttgatg cggatggtga tgattctgca 11700atgtcgccta
tgacggttgc gatcaccgat gacgtacaag gtgttcaaga tggcaccttg 11760agtatcactg
agccttcatt agctgatttg gcatcgggta cgccaccaac gacggcaatc 11820attgatgtta
tgccaacgca gagtgctgat ggcgcgaaag taacacagtt tacttacgat 11880ggtggcacag
ctgtaacgtt agacccaagc atcgccacag aacaagtctt taccgtaacc 11940gatggcttac
tgtacatcac cattgaaggg gaggttcgtt ttgagccgag ccgagatcta 12000gaccattcat
ctggcgatat cgtaagaacg attgtcgtca ccaccagtga ttttgataac 12060gatacagata
ccgcggatgt cactttgacg atcaaagacg gtatcaatcc cgttatcaat 12120gtggttccag
atgttaactt atcggaagtt aatctagcgg atggctcgac gccaagtggt 12180tctgcagtca
gttcgactca cacaatcact tacaccgaag gaagtgatga ttttagtcac 12240tttagaattg
cgaccaacga attcaatcct ggcgatctgt tgaaatcaag tggtcttgtt 12300gttcaactaa
aagaagatcc tgcttctgct ggtgattaca ttggttatac cgatgatggt 12360atgggtaacg
ttaccgatgt atttaccatt agctttgata gtgcaaacaa agctcagttt 12420acatttacct
tgattgaggc gcttgatcac cttgatggtg tgctttacaa cgatcttacg 12480ttccgtttgc
ctatctatgc tgttgataca gatgattctg aatcaacaaa gcgcgatgtg 12540gtggttacga
tagaagatga catccagcaa atgcaagatg gcttcttaac cattaccgag 12600ccaaattctg
gtactccaac aacaactacc gttgatgtga tgccaatacc aagtgcagac 12660ggtgcgacta
ttacgcagtt cacgtatgac ggtggttctc caattactct gaatcaaagc 12720atcagcggcg
aacaagagtt tgttttcact gaaggttcac tgtttgtgac actagatggt 12780gatgtaaggt
ttgagccaaa tagaaacctt gatcactctg cgggcgacat tgttaaatcg 12840attgtgttca
cgtcttcaga ctttgataac gacatcttct catcaaaagt cactctcacc 12900attgttgatg
gtgatgggcc aacaatcgac gttgtgccgg gtgtggcatt gtcagaaagc 12960ttacttgcgg
atggttcgac gcctagcgta aatcccgtga gtatgactca aaccattact 13020tcacttgcaa
gtagtgatga tattgctgaa atagtggtgg aagtcgggtt gttcaatacc 13080aacggcgcgt
tgaagtcgga tggtttgtca ctgagtttac gtgaagaccc tgtaaattca 13140ggcgactaca
ttgcatttac tactaatggt tcgggtgttg agaaagttat cttcactctg 13200gattttgatg
atacgaatcc gagtcaatat acgtttactc tgcttgaacg tttagaccat 13260gttgatggct
taggaaataa cgatctgagt tttgatcttt ctgtttatgc agaagatacc 13320gatggtgata
tttcagcgtc taaaccgctt acagtcacca tcaccgatga tgttcagctc 13380atgcaatccg
gtgcgctcaa cattactgag ccaaccacag gaacaccgac tacagcagtc 13440tttgatgtga
tgcctgcgca aagtgcagat ggcgcgacaa tcactaagtt tacctatggc 13500agccaacctg
aagagtctct ggtacaaacc gtcacgggtg agcaagaatt tgtgttcact 13560gaaggttctc
tgtttatcaa tcttgaaggt gatgtacgtt tcgaacctaa ccgtaatctc 13620gatcattcgg
gtggtaacat cgttaagacc attacggtga catcggaaga taaagatggc 13680gatattgtca
cttcaacagt gacgctgact attgtagatg gcgcgccacc agtaatagac 13740acagtaccaa
cggttgcatt ggaagaagcg aatctggtcg acggatcttc accgggttta 13800cctgttagcc
aaactgaaat cattactttc acagcaggaa gtgatgatgt gagccacttc 13860cgtattgatc
cggctcaatt caacacatca ggcgatctga aagcggatgg tttggtggtt 13920cagttaaaag
aagatcctct aaacagcgat aattatattg gttacgttga aagcggcggt 13980gtccaaacgg
atatcttcac catcaccttt agcagcgtgg ttctaggaga gtacacattc 14040accttgttgg
aagagttaga tcacctgcct gtacaaggta acaatgatca aatcttcacc 14100ttgccagtga
tcgcagtcga caaagacaac actgactcag cggtgaaacc tcttacggtg 14160accattaccg
atgatgttcc aaccattact gacaccaccg gcgccagtac gtttgtggtt 14220gatgaagatg
atttgggcac tctggcacaa gcgacgggtt cgtttgtaac cacagaaggt 14280gcagatcaag
tcgaggttta cgaactacgt aatatatcaa cgttggaagc aacgctatcg 14340tcgggcagtg
aaggtattaa gatcactgag atcacaggtg ctgctaacac gaccacctac 14400caaggggcga
ccgacccaag tggaacgcca attttcacat tagtgctgac tgatgatggt 14460gcctacacct
ttaccttgct tggccctctc aatcacgcta cgacaccgag taacctcgat 14520acattaacaa
taccatttga tgttgttgcc gttgacggtg atggcgatga ttctaaccaa 14580tatgtattgc
caatcgaggt gctagatgat gtgcctgtaa tgacggcgcc gacgggtgaa 14640acggttgttg
atgaagacga tcttactggc attggttccg atcaatctga agatacaatt 14700atcaatggac
tgttcaccgt tgatgaaggt gcggatggcg ttgtgctgta tgagctggtt 14760gatgaagatt
tggttctgac gggcttaacc tctgatggag aaagcttaga gtggctagct 14820gtttcacaaa
acggcacaac atttacttac gttgctcaaa ctgcaacgag taatgaagcg 14880gtgttcgaga
ttattttcga cacctcggat aacagctacc aatttgaatt atttaagcca 14940ctgaagcacc
ctgacggtgc aaacgagaac gcgatagatc ttgatttctc aatcgttgct 15000gaagattttg
atcaagacca atcggatgcg atcggtctaa aaattacggt aaccgatgat 15060gttccgttag
tgacaactca atcgattact cgtcttgaag gtcaggggta tggcaactct 15120aaagtcgaca
tgtttgccaa tgcaacagat gtgggggctg atggcgcggt actgagtcga 15180attgagggta
tctcaaataa tggtgcagat attgttttcc gtagcgggaa caatgggcca 15240tatagtagcg
gcttcgattt aaacagcggt agccaacaag ttcgagtcta cgagcaaaca 15300aatggcggtg
ctgatactcg tgaacttggc cgtctacgca tcaactcaaa tggtgaggtt 15360gaattcagag
ctaacggcta tctcgatcat gacggtgatg acaccatcga cttctcgatt 15420aacgtgattg
ccacagatgg agatttagac acctctgaaa caccgttaga tattacgatt 15480actgataggg
attctacaag aattgcgctg aaagtgacga ccttcgagga tgcgggtaga 15540gactcaacca
taccttacgc aacaggtgat gagccgactc ttgagaatgt tcaagataac 15600caaaatggtt
tgccgaatgc gccagcgcaa gttgcgctgc aagttagtct gtatgaccaa 15660gataacgctg
aatctattgg gcagttgacg attaaaagcc cgaacggagg tgatagtcat 15720caaggtactt
tttattactt tgatggtgct gactacatag aattagtgcc tgagtcaaat 15780gggagcatta
tatttggctc tcctgaactc gaacaaagct tcgctccaaa cccgagtgaa 15840ccaagacaaa
ctatcgcgac gatagacaac ctgttctttg ttccagacca acacgctagt 15900tcggatgaaa
ctggtgggcg agttcgttat gagcttgaaa ttgagaaaaa tggcagtacg 15960gatcacaccg
ttaattcaaa cttcagaatt gagattgaag ctgtagctga tattgcgact 16020tgggatgatt
ccaacagcac gtatcagtat caagtcaacg aagatgaaga caatgtcacg 16080ttgcagctga
acgcagagtc tcaagataac agtaatactg agacgattac ctatgaactt 16140gaagccgttc
aaggcgacgg gaagtttgag ttacttgatc aaaatggcaa tgtgttaacg 16200cccgttaatg
gtgtttatat catcgcatct gctgatatca atagcaccgt agttaaccct 16260attgataact
tctcagggca gattgagttc aaagcgacgg caattacgga agagacgctt 16320aacccatacg
atgattcaga caacggtgga gcaaacgata agacgacggc tcgttctgtg 16380gaacaaagta
ttgttattga tgtgaccgca gatgcggacc ctggcacatt cagtgttagt 16440cgaattcaga
tcaacgaaga caatatcgat gatccagatt acgtcgggcc tttggacaat 16500aaagacgcgt
tcacgttaga cgaagtcatc accatgacag ggtcggtcga ttctgacagt 16560tctgaagaac
tgtttgtgcg catcagtaat gttacggaag gagctgtgct ttacttctta 16620ggcaccacga
cagtcgttcc gaccatcacg atcaatggtg tggattatca agaaatcgcg 16680tattccgatt
tggctaacgt tgaggttgtt ccaaccaaac acagtaatgt cgatttcacc 16740ttcgatgtta
cgggagtggt caaagatacg gcaaatctat ccacgggcgc ccaaatcgat 16800gaggagatac
taggaactaa aaccgtcaac gttgaagtca aaggcgttgc cgatactcct 16860tatggtggaa
cgaatggcac ggcttggagt gcaattacag atggcactac atctggtgtt 16920caaaccacga
ttcaagagag ccaaaatggt gatacctttg ctgagcttga tttcaccgtg 16980ttgtcgggag
agagaagacc agatactggc actacaccat tagctgacga tgggtcagaa 17040tcaataaccg
ttattctatc gggtataccc gatggggttg ttctagaaga cggtgacggt 17100acagtgattg
accttaactt tgtcggttat gaaaccggac cgggcggtag tcctgactta 17160tccaaaccta
tctacgaagc gaacattact gaggcgggta aaacttcagg cattcgcatc 17220agacctgtcg
actcttcaac cgagaatatt cacattcaag gtaaagtgat tgtgactgag 17280aacgatggtc
acacgcttac gtttgatcaa gaaattcgag tgcttgttat acctcgaatc 17340gacacatcag
caacttatgt caatacgact aacggtgatg aagatacggc tatcaatatt 17400gattggcacc
ctgaaggcac ggattacatt gatgacgatg agcatttcac taagataact 17460attaatggaa
taccactggg tgttactgca gtagtcaacg gtgatgtgac cgttgatgac 17520tcaaccccag
gaacattgat tataacgcct aaagatgctt cccaaactcc tgaacaattt 17580actcaaattg
cattagctaa taacttcatt caaatgacgc ctccggctga ttctagtgca 17640gattttacgt
tgaccaccga acttaaaatg gaagagcgag atcatgagta tacgtctagc 17700ggcctagagg
atgaagatgg tggttatgtc gaagccgatc cagatataac cggaatcatt 17760aacgttcaag
tacgacctgt ggttgaacct ggagatgccg acaacaagat tgtcgtttca 17820aacgaagatg
gctctggaga tctcactacg attacggctg atgctaatgg tgtcattaaa 17880tttacaacta
acagtgataa ccaaacgact gatactaacg gagacgaaat ctgggacggt 17940gaatacgtcg
tccgatacca agaaacggat ttaagcacag tagaagagca agtcgacgaa 18000gtgattgttc
agctgactaa caccgatgga agcgcgttat ctgatgatat tttagggcaa 18060cttttagtaa
ctggtgcctc ttacgaaggc ggtggccgat gggttgtgac caatgaagat 18120gcctttagcg
tcagtgcgcc caatggatta gatttcaccc ctgccaatga tgcggatgat 18180gtagctactg
atttcaatga tatcaagatg acaattttca ctttggtctc agatcctggt 18240gatgctaaca
atgaaacgtc cgcccaagtg caacgcaccg gagaagtaac gctttcttat 18300cctgaagtgc
tgacggcacc tgacaaagtt gccgcagata ttgcgattgt gccagacagt 18360gttatcgacg
ctgttgagga tactcagctt gatctcggcg cggcactcaa cggcattttg 18420agcttgacgg
gtcgcgatga ttctactgac caagtgacgg tgatcatcga tggcactctg 18480gtcattgatg
ctacaacatc attcccaatt agcctgtcgg gaacaagtga tgttgacttt 18540gtgaatggga
aatatgttta cgagacgact gttgagcagg gcgtagccgt cgattcatcg 18600ggtttgttat
tgaatctgcc accaaactac tctggtgact ttaggttgcc aatgaccatc 18660gtgaccaaag
atttacaatc tggtgatgag aagaccttag tgactgaagt tatcatcaaa 18720gtcgcaccag
atgctgagac ggatccaacg attgaggtga atgtcgtggg ttcgcttgat 18780gatgccttta
atcctgttga taccgacggt caagctgggc aagatccggt gggttacgaa 18840gacacctata
ttcaactcga cttcaattcg accatttcgg atcaggtttc cggcgtcgaa 18900ggcggccaag
aagcgtttac gtccattact ttaacgttgg acgacccttc tataggtgca 18960ttctatgaca
acacgggtac ttcattaggt acatctgtta cgtttaatca ggctgaaata 19020gcagcgggtg
cactcgataa cgtgctcttt agggcaatcg aaaattaccc aacgggtaat 19080gatattaacc
aagtgcaggt taatgtcagc ggtacagtca cagataccgc aacctataat 19140gatcctgctt
ctcctgcggg tacggcaaca gactcagata ctttctctac gagtgtcagc 19200tttgaagtcg
ttcctgtggt cgatgacgtg tctgtcactg gaccgggtag cgatcctgat 19260gttatcgaga
ttactggcaa cgaagaccag ctcatttctt tgtcggggac agggcctgta 19320tcgattgcac
tgactgacct tgatggttca gaacagtttg tatcgattaa gttcacagat 19380gtccctgatg
gcttccaaat gcgtgcagat gctggctcga catataccgt gaaaaataat 19440ggtaatggag
agtggagtgt tcaactgcct caagcttcgg ggttgtcatt cgatttaagt 19500gagatttcga
tcttgccgcc taaaaacttc agtggtaccg ctgagtttgg tgtggaagtc 19560ttcactcaag
aatcgttgct gggtgtgcct actgcggcgg caaacttgcc aagcttcaaa 19620ctgcatgtgg
tacctgttgg tgacgatgtt gataccaatc cgactgattc tgtaacaggc 19680aacgaaggcc
aaaacattga tatcgaaatc aatgcgacta ttttggataa agaattgtct 19740gcaacaggaa
gcgggacgta taccgagaat gcgcccgaaa cgcttcgagt tgaagtggcg 19800ggtgttcctc
aagatgcttc tattttctat ccagatggca cgacattggc tagctacgat 19860ccggcgacgc
agctctggac tctcgatgtt ccagctcagt cgttagataa gatcgtattt 19920aactctggcg
aacataatag tgatacaggc aatgtactgg gtatcaatgg tccactgcag 19980attacggtac
gttcagtaga tactgatgct gataatacag agtacctagg tacgccaacc 20040agcttcgatg
tcgatctggt gattgatcct attaacgatc aaccgatctt tgtgaacgta 20100acgaatattg
aaacatcgga agacatcagt gttgccatcg acaactttag tatctacgac 20160gtcgacgcaa
actttgataa tccagatgct ccgtatgaac tgacgcttaa agtcgaccaa 20220acactgccgg
gagcgcaagg tgtgtttgag tttaccagct ctcctgacgt gacgtttgta 20280ttgcaacctg
acggctcatt ggtgattacc ggtaaagaag ccgacattaa taccgcattg 20340actaatggag
ctgtgacttt caaacccgac ccagaccaga actacctcaa ccagactggt 20400ttagtcacaa
tcaatgcaac gctcgatgat ggtggtaata acggtttgat tgacgcggtt 20460gatccgaata
ccgctcaaac caatcaaact accttcacca ttaaggtgac ggaagtgaat 20520gacgctcctg
tggcgactaa cgttgattta ggctcgattg cggaagacgc tcaaatcgtg 20580attgttgaga
gtgacttgat tgcagccagt tctgatctag aaaaccataa tctcacagta 20640accggtgtga
ctcttactca agggcaaggt cagcttacac gctatgaaaa tgctggtggt 20700gctgatgacg
cagcgattac ggggccattc tggatattca ttgcagataa tgatttcaac 20760ggcgacgtta
aattcaatta ctccattatc gatgatggta ccaccaacgg tgtggatgat 20820tttaaaaccg
atagcgctga aatcagcctt gtagttactg aagtcaatga ccagccagtg 20880gcatcgaaca
ttgatttggg caccatgctt gaagaaggac agctggtcat taaagaggaa 20940gacctgattt
ccgcaaccac tgatccggaa aacgacacga ttactgtgaa cagtttggtg 21000ctcgatcaag
gtcagggcca attacaacgc tttgagaacg tgggcggtgc tgatgatgct 21060acgatcactg
gcccgtactg ggtatttact gcagccaacg aatacaacgg tgatgttaag 21120ttcacttata
ccgttgagga cgatggtaca accaacggcg ctgatgattt cttaacagat 21180accggcgaaa
ttagcgttgt ggtaacggaa gtgaatgatc aaccggtggc aacggatatc 21240gacttaggaa
acatccttga agaagggcag ttgatcatca aagaggaaga cttaattgct 21300gctacgagcg
atccggaaaa cgacacgatt accgtgacca atctggtgct cgacgaaggc 21360caaggccagt
tacagcgctt tgagaacgtg ggcggtgctg atgacgctat gattactggc 21420ccgtactgga
tatttacggc tgctgatgaa tacaacggta acgttaagtt cacctatacc 21480gtcgaggatg
atggtacaac caacggcgct aatgatttcc taacggatac tgcagagatc 21540acagcgattg
tcgacggagt gaacgatacg cctgttgtta atggtgacag tgtcactacg 21600attgttgacg
aggatgctgg tcagctattg agtggtatca atgtcagtga cccagattat 21660gtggatgcat
tttctaatga cttgatgaca gtcacgctga cagtggatta cggtacattg 21720aacgtatcac
ttccggcagt gacgacagtg atggtcaacg gcaacaacac tggttcggtt 21780atcttagttg
gtactttgag tgacctgaat gcgctgattg atacgccaac cagtccaaac 21840ggtgtctacc
tcgatgcgag cttgtctcca accaatagca ttggcttaga agtaatcgcc 21900aaagacagcg
gtaacccttc tggtatcgcg attgaaactg caccagtggt ttataatatc 21960gcagtgacac
cagtcgctaa tgcgccaacc ttgtctattg atccggcatt taactatgtg 22020agaaacatta
cgaccagctc atctgtggtc gctaatagtg gagtcgcttt agttggaatt 22080gtcgctgcat
tgacggacat tactgaagag ttaacgttga agatcagcga tgttccggat 22140ggtgttgatg
taaccagtga tgtgggtacg gtttcgttgg tgggtgatac ttggatagcg 22200accgctgatg
cgatcgatag tctcagactc gtagagcagt catcattagg taaaccgttg 22260accccgggta
attacacctt gaaagttgag gcgctatctg aagagactga caacaacgat 22320attgcgatat
ctcaaaacat cgatctgaat ctcaatattg ttgccaatcc aatagatctc 22380gatctgtctt
ctgaaacaga cgatgtgcaa cttttagcga gtaactttga tactaacctc 22440actggcggaa
ctggaaatga ccgacttgta ggtggagcgg gtgacgatac gctggttggc 22500ggtgacggta
acgacacact cattggtggc ggcggttccg atattctaac cggtggcaat 22560ggtatggatt
cgtttgtatg gctcaatatt gaagatggcg ttgaagacac cattaccgat 22620ttcagcctgt
ctgaaggaga ccaaatcgac ctacgagaag tattacctga gttgaagaat 22680acatctccag
acatgtctgc attgctacaa cagatagacg cgaaagtgga aggggatgat 22740attgagctta
cgatcaagtc tgatggttta ggcactacgg aacaggtgat tgtggttgaa 22800gaccttgctc
ctcagctaac cttaagtggc accatgcctt cggatatttt ggatgcgtta 22860gtgcaacaaa
atgtcatcac tcacggttaa cgcctaattg gaggctagct attagaatct 22920aacgattaaa
ctaaaagcgg accatttaac cataacgaaa gaggccagca ttgctggcct 22980cttttttgtc
actgtataaa tcgtaaagag ttacttaaga gagttgtgga tcaggaactc 23040ttcttcgacg
cctttcaatt tcatctcatc cataatgaag ttcactgtgt tcaacaagcg 23100ttgttcacct
tttggtatca ggtaaccgaa ttgactgttg gtaaacggtg tttcacagcg 23160tgccgcttca
agacgttcgt ccgtcacttg atagaacaga ccttcaggag tttctgtcac 23220cattacatca
actttacctt ccgcaacggc ttgcggaacg tctaggttgt tctcgtaacg 23280cgtaaagctc
gcgtcttgca agttagcatc cgcaaacatc tcattagtcc caccgatatt 23340gacgccaaca
cgcacagaag agaggttcac tttctcaatg ctgttgtatt gttctgcttt 23400gcctttcgca
actaagaaac acttgccaaa ggtcatgtaa ccttgagttt gttctgcgtt 23460taactgacgc
tgcattttac gcgtgatacc gcccatcgcg atgtcgtatt tatcgctgtc 23520tagatcggtc
agtagatctt tccatgtggt acgaacaatc tgtaattcaa cgcccaactg 23580ctctgcaaca
tgtttggcta cgtcaatgtc ataaccagag taggttttgc cgtcgaagta 23640agaaaaaggt
ttgtagtcgc ctgtggtgcc gacgcgaagt gtgcctgatt tttgaatgtc 23700ttctagctgg
tcagcttgta ctacaccaga aagtgccaga gtaatggaag caagtaatag 23760tgatgttttt
ttcattgtaa ttatctgttg tgtttgtgtt gttattcaaa gtaacagaaa 23820caatcagaga
aagagatcaa accattggaa aggttgtaaa agaagataaa acgagggcag 23880gagataggta
acgctattga tttgtgaaca ttgataaaca tgtgtttcat attccatttt 23940gataaaccgt
agacaaacaa aaagcccatg ttatcgaata acatgggctt cattttggtt 24000taacttgtta
gctgcttatt tagctgctta tttagctgtt tagctgttta gctgtttagc 24060tacttagcaa
ctgactcgtt gttcatctta gccggagctt tagatgcgtt aaccagcagg 24120ataccaacgg
tgagtaccat cgaaccacat agtaggaaca acaagcgtcc tgttggttcg 24180tttggaatca
gagccattgc taggataccg aaacctgctg tgctgataag cttaccaagc 24240attgaacgct
gtttagtatc taggttctgc tgctcttcac cttccgctac tagcggcgta 24300ttccagttag
tgaatagttg gtcaacttct ttctcacgtt caggcgatag gcctttgtag 24360aagcgagaag
ttaggatgaa gtaaccacca gtaaacacta cgtgagcagc taagctaaga 24420ccaactttca
agtcgctcca ttcacggcca gtaagcgctg tttccatacc aaataggtgc 24480tcgatgtctt
ctgcttgaag cgagataccg aagatgtaag aaacgaagcc accaacgatt 24540aacgtagacc
aaccagccca gtcaggcgtc ttacgaatcc acataccaag tagtacaggg 24600ataagcattg
ggaagccaat taacgcacct acgttcatta cgatatcgaa caagctcaaa 24660tgacgtagag
agttaatgaa caagccaatc gcgatgatga taatacccat catgatagtg 24720gttagcttac
ttacaataac cagctctttc tgagttgcgt tttgacgtag aatagggctg 24780tagaagttca
ttacaaagat gccagcgtta cggttcaaac ctgaatccat agaagacatt 24840gttgcagcga
acattgctga cataagaaga ccaaccatac ctgctggcat tacgttctgt 24900acgaatgcta
ggtaagcagc atcaccagct ttatcaccca ttgaagcgta ctccaatgcg 24960aaatcaggca
tgaatgcact tacgtaccaa ggtggtagga accagattag tgggccaaca 25020accataagga
tacatgctag gcctgccgct ttacgtgcgt tttcactgtc tttcgcacat 25080aggtaacggt
aagcgttgat gctgttgttc attacaccga actgcttcac gaagatgaat 25140acaacccaaa
gaacgaagat gctcatgtag tttaggttat tacctaacat gaagtcgccg 25200tcgaaatttg
caacgatgtt agttaggcca ccaccgtgga agtaagctgc aaccgcacaa 25260gtaatcgtaa
ccgccatgat aacaagcatt tgcatgaagt cagaagcaac aaccgcccaa 25320gagccgcctg
ttactgccat caatactaga accatacccg ttaccacaat ggttgcttcc 25380attgggatgt
tgaataccgc tgctacgaag atagctagac catttagcca gatacccgca 25440gagataaggc
tgtcaggcat acctgcccat gtgaagaact gttcagacgt tttaccaaag 25500cgctgacgaa
tagcttcgat cgccgttacc acacgaagtt ggcggaactt tggagcgaag 25560tacatatagt
tcatgaagta gccaaaagca ttggctaaga ataggattac aataacgaaa 25620ccgtcattga
acgcgcgtcc tgcggcacct gtaaacgtcc atgctgaaaa ctgtgtcatg 25680aaggcggttg
caccaaccat ccaccacaac attttgccgc cccctctgaa gtaatcacta 25740gtcgacgtgg
tgaacttacg gaacatccaa ccaatagcga ttaaaaagaa gaagtaggcg 25800agaacaacaa
aagtatcgat agtcatcttt tcagcctttt aaatatcata attaactggg 25860cttagattaa
cgcgttcaaa ggtttatttg tactacaata tgtctttagt atgatctagg 25920tcgcattgat
ttttgggtgc acacgataag ttaatttaac ctactgtttt tattgatttt 25980aattgttttt
atgaattgct ctagatccaa gataaattga agttcaaatg tttatatgta 26040ttacaatata
agtaatgagg ctttagttta ccttatttat aagattttaa ttataaccgt 26100aacaaatatg
ctacaactga gcgtggttgt gcgacgacat tcacgttaat ttggaactct 26160attctggaaa
ttcttgtatt aggatttcaa gtgtagctca ttgttttcac ttcgctattt 26220tgtgtttgtc
tgcggttctg tcgcctttcc atgctattga ttaatttttt cgtgctagag 26280agacgcgtat
ttggaatgtt tgtcactgag tgggcgttaa actggacgac gggacactct 26340ttcggctcac
tttgtctatt gtggtcttca gtgcatgcta tgagaaatgt ttgacgacgt 26400attgaaaagg
aatattgtcg gataaaggga tgggtaagga gctggataag cggtagggag 26460ccccagtaac
gcttcgctag atgcatactg aggttgcttg aaagccttac atcactcgtt 26520cttgcctgtc
ttagtcacgg agctgtacga ggccataggg agaacggtga tagggtatgg 26580ggaaacagaa
cgttgattga gcgtgtttta cggttagtca gcgcaataaa cgccagataa 26640taaaaagccc
caccgaggtg aggctttatc acgaaatcta aaacagatta agcgttaacg 26700tgatcaactg
cgtcacgaac aagcttgcct agttcgtccc acttaccttc atcgataagg 26760ttagttggaa
ccatccaagt accgccacac gcaagaacag aagggatcga taggtattca 26820tcaacattct
tcaagcttac gccaccagta ggcatgaatt taacagggta aactgctgtt 26880agtgctttaa
gcatgccagt accgcctgaa ggctcagcag ggaagaactt caacgtgcga 26940agacccattt
ccattgcttg ctcaactagg cttgggttgt taacacccgg tacgattgca 27000atacctttat
cgatacagta ttgaacagta cgtgggttaa aacctgggct tacgatgaaa 27060tcaacaccag
cttcgataga tgcgtcaact tgctcgttag tcagtacagt acctgaaccg 27120attagcatgt
ctgggaattc tttacgcatg atgcgaatcg cttcgattgc acattctgta 27180cgtagtgtaa
tttctgcaca tggcatgcca ttttcaacca acgctttacc tagagggata 27240gcgtcttcag
cacggttgat cgcgattaca ggaattactt ttaggtttgc tagttgttca 27300tttaatgtcg
tcatgaattc tttctcacgt taaatgtggg cctgctttca actaagcaaa 27360cccttgatta
atagttaaag tgcgtaatta tagagacaga tcaggcgtcg cttctagagg 27420aatgatagca
cctggatgct gaatcacggt tcctgccaca atatgacctg caaatgcagc 27480atcacgagca
ctaccgccgc tcaagcgctt ggccaagaag cctgcactga acgagtcgcc 27540agcggcagtc
gtatcaacga tgttgtctac agggttgggt gcaacgtatt gagcgctttg 27600gctttcaacc
actaagcagt ctttcgcgcc acgtttaatg acgatctctt tcacaccaga 27660ctctgacgta
cgtgtaatac attgttcaat gctttcgtcg ccgtatagct cttgctcatc 27720atcaaacgtc
agcagagccg tatctgtgta cttaagcatt ttcaagtacc aagaaatcgc 27780ttcttgttgg
ctttcccaaa gtttaggtcg gtagttattg tcgaagaata cttggccgcc 27840ttgagctttg
aatttgtcta agaagttgaa tagctgcgtg cgaccatttt ctgtcaagat 27900tgccagcgta
ataccactta agtaaatcgc gtcaaaagag aacagcttat caagaagagc 27960aggcgtgtct
tcctgatcaa acatgaactt cgctgcagca tcactacgcc agtagtggaa 28020actgcgttca
ccagtttcat cggtctcgat gtagtaaagc cctggttgtt tgtggtccag 28080ctgagcaatt
aagctcgtgt cgataccttc cgcttgccaa ttttttaaca tgtcggtact 28140gaatgggtca
gtgcctagtg cagttacgta gctcgtgttg atatcttgct cttttgttaa 28200gcgtgacaag
taaagtgcag tattcagcgt atcgccacca aaactttgct taagcccgtc 28260ttgtttcttt
tgtagctcaa ccatgcactc gccaatgacc gcgatgttta atgatttcat 28320atgcttacct
tagcaactga ggttgcgcta gttattattt taggaaatct tcacgcgcag 28380gattgaagat
atcaagaagg atgctgtctt gttctagagc aactgcaccg tgcatcatgt 28440gtttacgagc
gaagtaagca tcgccttctt taagcacttt cttctcgccg tcgatttcag 28500cttcgaagct
accacgaaca acataaccga tttggtcgtg aatttcgtga gtatgagggt 28560ggccaatcgc
gcccttatca aagcataggt gtactgccat tagatcgtca gtgtaagcaa 28620cgattttacg
cttaatgccg ccaccaagtt cttcccatgg attttcatct aggataaaga 28680aagagttcat
tgtgtatctc ctaatctgtt taaatctttt aagtgttact taacttgcat 28740ccatcataag
ggaatgagtt caattgtaat acaatatatc taaatttgtg tgatattgat 28800caagcgatag
tttatatagc gtaaatgaat caacaactta agaattgctt ggtatctggc 28860attagttagc
tgcatcaatg gcttacggtg aattatgtga ctctactcat catttggcga 28920cgaataggta
taattaaagc tcatattgta ttactttata tggagtttga aaatttaatc 28980aaagtttaag
cagataaact ctttattgag ggtgacaaag aatatgacga ctaaaccagt 29040attgttgact
gaagctgaaa tcgaacagct tcatcttgaa gtgggccgtt ctagcttaat 29100gggcaaaacc
attgcagcga acgcgaaaga cctagaagca ttcatgcgtt tacctattga 29160tgttccaggt
cacggtgaag ctgggggtta cgaacataac cgccacaagc aaaattacac 29220gtacatgaac
ctagctggtc gcatgttctt gatcactaaa gagcaaaaat acgctgactt 29280tgttacagaa
ttactagaag agtacgcaga caaatatcta acgtttgatt accacgtaca 29340gaaaaacacc
aacccaacag gtcgtttgtt ccaccaaatc ctaaacgaac actgctggtt 29400aatgttctca
agcttagctt attcttgtgt tgcttcaaca ctgacacaag atcagcgtga 29460caatattgag
tctcgcattt ttgaacccat gctagaaatg ttcacggtta aatacgcaca 29520cgacttcgac
cgtattcaca atcacggtat ttgggcagta gccgctgtgg gtatctgtgg 29580tcttgcttta
ggcaaacgtg aatacctaga aatgtcagtg tacggcatcg accgtaatga 29640tactggcggt
ttcctagcgc aagtttctca gctatttgca ccttctggct actacatgga 29700aggtccttac
taccatcgtt atgcgattcg cccaacgtgt gtgttcgctg aagtgattca 29760ccgtcatatg
cctgaagttg atatctacaa ctacaaaggc ggcgtgattg gtaacacagt 29820acaagctatg
cttgcgacag cgtacccgaa cggcgagttc ccggctctga atgatgcttc 29880tcgtactatg
ggtatcacag acatgggtgt tcaggttgcg gtcagtgttt acagtaagca 29940ttactcttct
gaaaacggtg tagaccaaaa cattctgggt atggcgaaga ttcaagacgc 30000agtatggatg
catccatgtg gtcttgagct atctaaagca tacgaagccg catctgcaga 30060gaaagaaatc
ggcatgcctt tctggccaag tgttgaattg aatgaaggcc ctcaaggtca 30120caacggcgcg
caaggcttta tccgtatgca ggataagaaa ggcgacgttt ctcaacttgt 30180gatgaactac
ggccaacacg gcatgggtca cggcaacttt gatacgctgg gtatttcttt 30240ctttaaccgc
ggtcaagaag tgctacgtga atacggcttc tgtcgttggg ttaacgttga 30300gccaaaattc
ggcggccgtt acctagacga aaacaaatct tacgctcgtc aaacgattgc 30360tcacaatgca
gttacgattg atgaaaaatg tcagaacaac tttgacgttg aacgtgcaga 30420ctcagtacat
ggtttacctc acttctttaa agtagaagac gatcaaatca acggtatgag 30480tgcatttgct
aacgatcatt accaaggctt tgacatgcaa cgcagcgtgt tcatgctaaa 30540tcttgaagaa
ttagaatctc cgttattgtt agacctatac cgcttagatt ctacaaaagg 30600cggcgaaggc
gagcaccaat acgactattc acaccaatat gcgggtcaga ttgttcgcac 30660taacttcgaa
taccaagcga acaaagagct aaacactcta ggtgacgatt tcggttacca 30720acatctatgg
aacgtcgcaa gcggtgaagt gaagggcaca gcaattgtaa gttggctaca 30780aaacaacacc
tactacacat ggctaggtgc aacgtctaac gataatgctg aagtaatatt 30840tactcgcact
ggcgctaacg acccaagttt caatctacgt tcagagcctg cgttcattct 30900acgcagcaaa
ggcgaaacaa cactgtttgc ttctgttgtt gaaacgcacg gttatttcaa 30960cgaagaattc
gagcaatctg tcaatgcacg tggtgttgtg aaagacatca aagtcgtggc 31020tcacaccaat
gtcggttcgg tagttgagat caccacagag aaatcaaacg tgacagtgat 31080gatcagcaac
caacttggcg cgactgacag cactgaacac aaagtagaac tgaacggcaa 31140agtatacagc
tggaaaggct tctactcagt agagacaact ttacaagaaa cgaattcaga 31200agaacttagc
actgcagggc aggggaaata ataatgagct atcaaccact tttacttaac 31260tttgatgaag
cagctgaact tcgtaaagaa cttggcaagg atagcctatt aggtaacgca 31320ctgactcgcg
acattaaaca aactgacgct tacatggctg aagttggcat tgaagtacca 31380ggtcacggtg
aaggcggcgg ttacgagcac aaccgtcata agcaaaacta catccatatg 31440gatctagcag
gccgtttgtt ccttatcact gaggaaacaa aataccgaga ttacatcgtt 31500gatatgctaa
cagcgtacgc gacggtatac ccaacacttg aaagcaacgt aagccgtgac 31560tctaaccctc
cgggtaagct gttccaccaa acgttgaacg agaacatgtg gatgctttac 31620gcttcttgtg
cgtacagctg catctaccac acgatctctg aagagcaaaa gcgtctgatc 31680gaagacgatc
ttcttaagca aatgatcgaa atgttcgttg tgacttacgc acacgacttc 31740gatatcgtac
acaaccacgg cttatgggca gtggcagcag taggtatctg tggttacgca 31800atcaacgatc
aagagtctgt agacaaagca ctatacggcc tgaaactaga caaagtcagc 31860ggcggtttct
tagcgcaact agaccaactg ttttcgccag acggctacta catggaaggt 31920ccttactacc
accgtttctc tctgcgtcca atctacctgt tcgcagaagc gattgaacgt 31980cgtcagcctg
aagttggtat ctatgaattc aacgattcag tgatcaagac aacgtcttac 32040tctgtattca
aaacggcatt cccagacggt acattgcctg ctctgaacga ttcatcgaag 32100acaatctcta
tcaacgatga aggcgttatc atggcaacgt ctgtgtgtta ccaccgttac 32160gagcaaactg
aaactctact tggtatggct aaccaccagc aaaacgtttg ggttcatgct 32220tcaggtaaaa
cactgtctga cgcggttgat gcagcagacg acatcaaagc attcaactgg 32280ggtagcctgt
ttgtaaccga cggccctgaa ggcgaaaaag gcggcgtaag catccttcgt 32340caccgtgacg
aacaagatga cgacacgatg gcgttgatct ggtttggtca acacggttct 32400gatcaccagt
accactctgc tctagaccac ggtcactacg atggcctgca cctaagcgta 32460tttaaccgtg
gccacgaagt gctgcacgat ttcggcttcg gtcgctgggt aaacgttgag 32520cctaagtttg
gcggtcgtta catcccagag aacaagtctt actgtaagca gacggttgct 32580cacaacacag
taacggttga tcagaaaacg cagaacaact tcaacacagc attggctgag 32640tctaagtttg
gtcagaagca cttcttcgta gcagacgacc agtctctaca aggcatgagc 32700ggcacaattt
ctgagtacta cactggcgta gacatgcaac gcagcgtgat tcttgctgaa 32760cttcctgagt
tcgagaagcc acttgtaatc gacgtatacc gcatcgaagc tgacgctgaa 32820caccagtacg
acctacccgt tcaccactct ggtcagatca tccgtactga cttcgattac 32880aacatggaaa
aaacgcttaa gccgctaggt gaagacaacg gttaccagca cttatggaac 32940gtggcttcag
gcaaagtgaa cgaagaaggt tctctagtaa gctggctaca tgacagcagc 33000tactacagcc
tagtaaccag cgcgaatgcg ggcagcgaag tgatttttgc tcgcactggt 33060gctaacgatc
cagacttcaa ccttaagagt gagcctgcgt tcatcttacg tcagtctggt 33120caaaaccacg
tgtttgcttc tgtactagaa acgcatggtt actttaacga gtctatcgaa 33180gcctctgtag
gcgctcgtgg tctagttaaa tcagtatctg ttgtgggcca taacagtgtc 33240gggactgttg
ttcgcattca gactacttct ggcaacactt accactacgg tatctcaaac 33300caagctgaag
acacgcagca agcaactcac actgttgagt tcgcgggtga gacatactcg 33360tgggaaggat
catttgctca actgtaaatg attaacatac atgccgttta acgatggcat 33420gtattgatgt
ggtgctttgc gggaacgaag catcacattg aattcagtcg tgattgcaaa 33480tcgttcgttg
ataccaacaa cgactgaata catcgggaat aagtcaaacc gagtaactca 33540ctgcgagttg
ctcggttttt ttatgcgtgc tgcttttata agaaggggga aagaggatgg 33600ggcaacggag
cttccctttt ccttcgaatc ttacagagtg ggctaaagta taatttagga 33660tttaaaaata
aagggattca aggatgaagt ggttattggc aatagttgcg atgtctggtg 33720tcgcattggc
ggcagaaaat aagaatgttg aggtgagcag tgagcatttc gtccgttatc 33780aataccaaga
caaaatcagc tatggaaagc tagacaatga cgcagtgtta ccggtcagcg 33840gcgatctctt
tggcgaatat tcggtagcaa aaaattcgat cccgttagag tcggttgagg 33900tgttactacc
gacaaaacca gagaaagtct tcgccgtcgg gatgaacttc gctagccact 33960tagcctcacc
tgccgatgca ccaccgccga tgtttcttaa acttccttct tctttgattc 34020tcacgggcga
agtgattcaa gtgccaccaa aagcaagaaa tgttcatttt gaaggcgagc 34080tggtggttgt
gattggtaga gagctcagtc aagccagtga agaagaagcc gaacaagcga 34140tctttggcgt
cacggtgggc aacgatatta ctgaaagaag ttggcaaggc gccgatttac 34200aatggctccg
agcgaaagct tccgatggtt ttggcccggt tggcaacaca attgtgcgcg 34260gcattgatta
caacaatatt gagttaacca ctcgtgttaa cggtaaagtg gttcaacaag 34320aaaatacttc
gttcatgatc cacaagccaa gaaaagtcgt gagctatttg agctattatt 34380ttaccctcaa
accgggcgat ctaattttca tgggcacgcc aggtagaact tatgctctgt 34440ccgacaaaga
tcaagtgagt gtcacgattg aaggggtagg gactgtggta aatgaagtgc 34500ggttctgatg
gaattgaatt agcgttggga gctacagagc ttatgtctga atttgcagta 34560cgtagacgac
ttgaacctat taatttgaac taggttaact tgtgtagtga ataaactaac 34620cgtttttcgg
ttccattatt ttagcccaat tgagtgatgt ttttggaagc gagcagagaa 34680aacgagaatg
acgaacctac atgctcggcg agggttttgt tagtggtgta acacagtgtt 34740tctagctaag
agaaattaga tgctttctaa gtgtttgatt aattgaataa attaacaggt 34800actatccgct
ttgattttac tcaattggct gtaggtttaa atactgttat agtgttcctt 34860aaataataca
taaacataac atataaataa gcgaacttat ggctagcact tttaattcaa 34920tttcgggctc
gaagcgtagc ctgcacgtgc aagtagcacg cgaaatcgct cgaggaattt 34980tgtctggtga
tctgccgcaa ggttctatta ttcctggtga aatggcgttg tgtgaacagt 35040ttggtatcag
ccgaacggca cttcgtgaag cagttaaact actgacctct aaaggtctgt 35100tagagtctcg
ccctaaaatt ggtactcgcg tagtcgaccg cgcatactgg aacttccttg 35160atcctcaact
gattgaatgg atggacggac taaccgacgt agaccaattc tgttctcagt 35220ttttaggcct
tcgccgtgcg atcgagcctg aagcgtgtgc actggcggca aaatttgcga 35280cagctgaaca
acgtatcgag ctttcagaga tcttccaaaa gatggtcgaa gtggatgaag 35340ctgaagtgtt
tgaccaagaa cgttggacag acattgatac tcgtttccat agcttgatct 35400tcaatgcgac
cggtaacgac ttctatctac cgttcggtaa tattctgact actatgttcg 35460ttaacttcat
agtgcattct tctgaagagg gaagcacatg catcaatgaa caccgcagaa 35520tctatgaagc
tatcatggcc ggtgattgtg acaaggctag aattgcttct gctgttcact 35580tgcaagatgc
caaccaccgt ttggcaacag cataatagaa atgatttaaa gcgcacctga 35640gccatctcac
atcgagatga acaccctcac gttcggataa acgactttaa aaggtatgcc 35700tagtgcatgc
cttttttggt ttttagaccg cgtgttgcac tatctgtagc actattttgg 35760gtcagtcttt
tcgctacgtc tgttaagcta ttcttccacg ttacaacccg ccttgttttt 35820aacgtctacg
taacaatccc caagcatcgt tctaaacaca tttttagact gtctgtacct 35880gacaagtagt
tatgcgacag ccgggatttt tcacctctca gtattctaaa tctgggatta 35940aacaaacagg
gttctcggat ttaatattta gatatttaaa tcgaattcta atgatattac 36000ccactcgatt
tcgtaaaaaa cactggttta ttgtgtgatg aatgatgtgg gtttggtcaa 36060ggattctctt
ttattatttt tgagaacttt atgtttatat gtgtttgatt gtatttgtta 36120ataagtgtgc
aaagtctcac ttttatttta agttgttgtt tttaatgttt aatttatttt 36180gagtgtttga
tcttttgggt ttttacctaa aaccctaaca atttccttaa tggattagcc 36240atattccatc
ctatgtcata tatataatta acttaatcaa tcaaaataag atcaccatca 36300cttatttgga
ttattgtact acaaataaag agtcgaattt cctatagtcc tcgtaacaaa 36360ttaaaacgga
caaaggatac acgatggaac tcaacacgat tattgtcggc atttatttcc 36420tattcttgat
tgcgataggt tggatgttta gaacatttac aagtactact agtgactact 36480tccgcggggg
cggtaacatg ttgtggtgga tggttggtgc aaccgccttt atgacccagt 36540ttagtgcatg
gacattcacc ggtgcagcag gtaaagcgta taacgatggt ttcgctgtag 36600cggtcatctt
cgtagccaac gcatttggtt acttcatgaa ctacgcgtac ttcgcgccga 36660aattccgtca
acttcgcgtt gttacggtaa tcgaagcgat tcgtatgcgt tttggtgcga 36720ccaacgaaca
agtattcact tggtcttcaa tgccaaactc agtggtatct gcgggtgtgt 36780ggttaaacgc
attggcaatc atcgcttcgg gtatcttcgg tttcgacatg aacatgacta 36840tctgggtgac
tggcctagtg gtattggcaa tgtcggtaac aggtggttca tgggcggtaa 36900tcgcatctga
cttcatgcag atggttatca tcatggcggt aacggtaact tgtgcggttg 36960tagcggttgt
tcaaggtggc ggtgttggtg agattgttaa caacttccca gtacaagatg 37020gtggttcgtt
cctttggggc aacaacatca actacctaag catctttacg atttgggcat 37080tcttcatctt
cgttaagcag ttctcaatca cgaacaacat gcttaactct taccgttacc 37140tagcggctaa
agactcaaag aacgctaaga aagctgcact gcttgcttgt gtgttgatgt 37200tgtgtggtgt
gtttatttgg ttcatgcctt cttggttcat tgcaggccaa ggtgttgatt 37260tatcagcggc
ttacccgaat gcaggtaaaa aagcgggtga ctttgcttac ctatacttcg 37320tacaagagta
catgccagca ggtatggttg gtctattagt tgccgcgatg tttgcagcga 37380caatgtcttc
aatggactca ggtctaaacc gtaactcagg tatttttgtt aagaacttct 37440acgaaacaat
cgttcgtaaa ggtcaagcat cagagaaaga gctagtaacc gtatctaaaa 37500ttacttcagc
ggtatttggt ttcgctatta tcctaatcgc acagttcatc aactcattaa 37560aaggcttaag
cctgtttgat acgatgatgt acgtaggtgc gttaatcggc ttccctatga 37620cgattcctgc
attccttggt ttcttcatca agaagactcc ggactgggct ggttggggaa 37680cgctagttgt
tggtggtatc gtatcttatg tggttggttt tgttatcaac gcggagatgg 37740tagcagcggc
gtttggtctt gatactctaa caggacgtga atggtctgat gttaaagttg 37800cgattggtct
gattgctcac atcacgctaa ccggtggctt cttcgtacta tctacgatgt 37860tctacaagcc
tctatcaaaa gaacgtcaag cggatgttga taagttcttt ggcaacttag 37920ataccccatt
agtagctgaa tcggcagagc aaaaagtgtt ggataacaaa caacgtcaaa 37980tgcttggtaa
actgattgcg gtagcgggtg ttggtattat gctgatggct cttctgacta 38040acccaatgtg
ggggcgccta gtcttcatct tatgtggtgt gatagtgggt ggtgtcggta 38100ttctacttgt
gaaagcggtc gatgacggcg gcaagcaagc gaaagcagta accgaaagct 38160aatacataga
aaacgtttat aatagaatgc gacgactcga aagggcgtcg cattttttat 38220tctgcggaac
tggaaaaccg tcaggtgaaa gatatctgac ctaaatcacg aaaactgtac 38280aaagtggttc
aatcgaatcg aaatatattc aattgtccta caataagacg tatattgttg 38340ctaattcctt
tcaatcaact tgaaaaataa gtgagttaga atgagcgacc aaaaatctct 38400tgatgcaatc
aggaagatga agctggaaaa cgatacttca gcaggtaatc ttgtagacct 38460actccctatc
gaagttcaaa cacgtgactt cgacctatca ttcctagaca ccttgagcga 38520agcacgtccg
cgtcttcttg ttcaagctga tcagctagaa gaattcaaag caaaagtgaa 38580agctgatcaa
gctcactgta tgtttgatga tttctacaac aactctaccg ttaagttcct 38640tgagactgct
cctttcgaag agcctcaagc gtacccagct gagacggtag gtaaagcttc 38700tctatggcgt
ccttattggc gtcaaatgta cgttgattgc caaatggcac tgaacgcgac 38760acgtaaccta
gcgattgctg gtgttgtaaa agaagacgaa gcgctcattg cgaaagcaaa 38820agcttggact
ctaaaactgt ctacgtacga tccagaaggc gtgacttctc gtggctataa 38880cgatgaagcg
gctttccgtg ttatcgctgc tatggcttgg ggttacgatt ggctacacgg 38940ctacttcacc
gatgaagaac gccagcaagt tcaagatgct ttgattgagc gtctagacga 39000aatcatgcac
cacctgaaag tgacggttga tctattgaac aacccactaa atagccacgg 39060tgttcgttct
atctcttctg ctatcatccc aacgtgtatc gcgctttacc acgatcaccc 39120gaaagcaggc
gagtacattg catacgcgct agaatactac gcagtacatt acccaccatg 39180gggcggtgta
gacggcggtt gggctgaagg tcctgattac tggaacacgc aaactgcatt 39240cctaggcgaa
gcattcgacc tattgaaagc atactgtggt gtagacatgt ttaacaaaac 39300attctacgaa
aacacaggtg atttcccgct ttactgcatg ccagttcact ctaagcgcgc 39360gagcttctgt
gaccagtctt caatcggcga tttcccaggt ttaaaactgg cttacaacat 39420caagcactac
gcaggtgtta accagaagcc tgagtacgtt tggtactata accagcttaa 39480aggccgtgat
actgaagcac acaccaaatt ctacaacttc ggttggtggg acttcggtta 39540tgacgatctt
cgttttaact tcctttggga tgcacctgaa gagaaagccc catcgaacga 39600tccactgttg
aaagtattcc caatcacggg ttgggctgca ttccacaaca agatgactga 39660gcgtgataac
catattcaca tggtattcaa atgttctccg tttggctcaa tcagccactc 39720tcacggtgac
caaaacgcat ttacgcttca cgcatttggt gaaacgctag cgtcagtaac 39780aggttactat
ggtggtttcg gtgtagacat gcacacgaaa tggcgtcgtc aaacgttctc 39840taaaaacctg
ccactatttg gcggtaaagg tcagtacggc gagaacaaga acacaggcta 39900cgaaaaccac
caagatcgct tttgtatcga agcgggcggc actatctctg acttcgacac 39960tgaatctgat
gtgaagatgg ttgaaggtga tgcaacggca tcttacaagt acttcgttcc 40020tgaaatcgaa
tcttacaagc gtaaagtctg gttcgttcaa ggtaaagtct tcgtaatgca 40080agacaaggca
acgctttctg aagagaaaga catgacttgg ctaatgcaca caactttcgc 40140aaacgaagtg
gcagacaagt ctttcactat ccgtggcgaa gttgcgcacc tagacgtaaa 40200cttcatcaac
gagtctgctg ataacatcac gtcagttaag aacgttgaag gctttggcga 40260agttgaccca
tacgagttca aagatcttga gatccaccgt cacgtggaag tggaattcaa 40320gccatcgaaa
gagcacaaca tcctgacgct tcttgttcct aataagaatg aaggcgagca 40380agttgaagtg
tttcacaagc ttgaaggcaa cacgctactg ctaaatgttg acggcgaaac 40440ggtttcaatc
gaactgtaat ccgctgaagt aacagaagtt agatactaaa aactccgagt 40500gaaagctcgg
agtttttttg tttggctagc caattaagtt ggagttggat aagtcagtta 40560agttgtatta
gttgacaacg ttggcaaacc gatcaggttg aaagaaaact taattggcca 40620gagataaata
gcttctcgat gccaagtcag tggctgaggg ctaaatctgg acattgatgc 40680acataaagac
cggcatgtac ttagccacta tgctcaatga aatgtgcagg agtcgtataa 40740gagactcgta
tatatcgctc tgttagaaga acagggcgcc aacgcctgtt tcctagcaat 40800tgttatgact
tacttttccg tgaacagtct tatcactggc tgagtaaggg agtagtgaac 40860tatacatagg
taaaggcgta gcttgttctt actaatcgta tgacatttaa cgtacgttat 40920tcgttattat
aatgaacata taatcataca atactatatt tggagtttga acatgactaa 40980acctgtaatc
ggtttcattg gcctaggtct tatgggcggc aacatggttg aaaacctaca 41040aaagcgcggc
taccacgtaa acgtaatgga tctaagcgct gaagctgttg ctcgcgtaac 41100agatcgcggc
aacgcaactg cattcacttc tgctaaagaa ctagctgctg caagtgacat 41160cgttcagttt
tgtctgacaa cttctgctgt tgttgaaaaa atcgtttacg gcgaagacgg 41220cgttctagcg
ggcatcaaag aaggcgcagt actagtagac ttcggtactt ctatccctgc 41280ttctactaag
aaaatcggcg cagctcttgc tgaaaaaggc gcgggcatga tcgacgcacc 41340tctaggtcgt
actcctgcac acgctaaaga tggtcttctg aacatcatgg ctgctggcga 41400catggaaact
ttcaacaaag ttaaacctgt tcttgaagag caaggcgaaa acgtattcca 41460cctaggggct
ctaggttctg gtcacgtgac taagcttgta aacaacttca tgggtatgac 41520gactgttgcg
actatgtctc aagctttcgc tgttgctcaa cgcgctggtg ttgatggcca 41580acaactgttt
gacatcatgt ctgcaggtcc atctaactct ccgttcatgc aattctgtaa 41640gttctacgcg
gtagacggcg aagagaagct aggtttctct gttgctaacg caaacaaaga 41700ccttggttac
ttccttgcac tttgtgaaga gctaggtact gagtctctaa tcgctcaagg 41760tactgcaaca
agcctacaag ctgctgttga tgcaggcatg ggtaacaacg acgtaccagt 41820aatcttcgac
tacttcgcta aactagagaa gtaatcgacg tacgacctcg ctagggtatt 41880gcttgtcttc
taggcggcga tacctcagcg aggttcgttt ttatctgcca tacccaaccc 41940tttgttccct
tgttaaaatc ttctacttct acttctactt ctacttcaat ttcctcagtt 42000acacctaatc
aaaactctgt ttaactctgt tactgcctca attcctattt ttttctatat 42060ctatttctaa
cggtaaattc aaaaccttct agcaccaact cattcactca tttttcctcg 42120caagctcaaa
ctcaacgcgc ttacatgatt gttggtgatg gcttaacacc gctcgtatat 42180cggtcctgaa
aagaaagtaa aaaaaaagcc cacacagctg gtgactgtat gggcatgttc 42240ggacgagccg
tctggacaaa caaatgagca atagtaagtg aaaaaacgaa taacgagatc 42300ccccgacagt
ttctacgtta aacgcgttca atgaccttaa agcggctgct tcaattatca 42360ctttgaattg
aacaaaagca tccagaaaga acttaagtta tgattcaaat acaccatagt 42420acaagactta
ttgtattaca aataaatttt aagattgaat gcctttagtg aatggttagt 42480tggtagaagt
gtgagttaag actcattttt tcactcagct gggtgaggta aagaagaaga 42540gttttcgaaa
agatgttatc ggaaaaatga tgagctaatt atctaaaaat cgatctattt 42600taatgtgtta
tgcgtcaatg tttaacttcg aacaaaatcc aaactcataa atgataccta 42660tgtcacaggg
cggttttagc cagttttaat atatcaagat cgctcacaga atgtctggtc 42720aattaaacat
acaatattaa ttaagttgat ggttgtgacg atggatcggc atgaacaagt 42780ttcgctttcc
gtatcttcga aaatgtaaaa aatggccatt tcattcggat gaaaataata 42840gacataggtt
gatatggatg atgagtttta tgaattcaaa attgtctcta gggtttaaag 42900gaaaattgat
tttaatggta gcggtcgtca gttctagtgc tttggcattt acgaactggt 42960ttacgcttaa
cttggccact gaacaggtaa accaaacgat ttataacgag attgatcact 43020cgcttacgat
agaaatcaat caaatagaaa gtaccgttca gcgcaccatc gataccgtta 43080actctgttgc
acaagagttc atgaaatccc cttaccaagt gccgaatgaa gcactcatgc 43140attatgccgc
taagcttggt ggcattgaca agattgtggt gggttttgac gacggccgtt 43200cttatacctc
tcgcccttca gagtctttcc ctaacggtgt tggaataaaa gaaaaataca 43260atccaaccac
tcgaccttgg tatcaacaag cgaaattgaa atcaggctta tcttttagtg 43320gtctgttttt
cactaagagt actcaagtgc ctatgatcgg tgtgacctac tcataccaag 43380atcgtgtcat
catggccgat atacgctttg acgatttgga aacgcagctt gaacagctgg 43440acagcatcta
cgaagccaaa ggcattatca tcgacgaaaa ggggatggtg gtcgcttcaa 43500caatcgaaaa
cgtgcttccg caaaccaata tatcttctgc agacactcaa atgaaactca 43560acagtgccat
tgaacagcct gatcaattca ttgagggtgt gattgatggt aaccagagaa 43620tcttgatggc
caagaaagtg gatattggca gccagaaaga gtggttcatg atctccagta 43680ttgaccctga
actcgcgctc aatcagctga atggcgtgat gtcgagtgcg cgcatcctta 43740tcgtcgcttg
tgtacttggc tcggtgatat tgatgatttt acttctgaat cgtttctacc 43800gcccaatcgt
gtcactgcgc aaaatcgtcc acgatctatc acaaggtaac ggagacctca 43860ctcaaaggct
tgctgagaag gggaatgatg acttagggca tatcgccaaa gacatcaact 43920tgttcattat
cggcttacaa gagatggtta aggatgtgaa atacaagaac tcggatctcg 43980ataccaaggt
actgagtatt cgcgaaggtt gtaaagaaac cagcgatgta ctgaaagttc 44040atactgatga
aacggttcaa gtggtctctg cgattaacgg cttgtctgaa gcatcaaacg 44100aagtagagaa
gagttctcag tcggcggcag aagcagcaag agaggccgct gtgttcagtg 44160atgagacgaa
acagattaac acggtgacgg aaacctatat cagtgatctt gagaagcaag 44220tctgcaccac
ttctgatgac attcgctcaa tggccaatga aacgcagagc atccagtcta 44280tcgtgtctgt
gattggcgga attgcggaac aaactaattt gctggcattg aatgcgtcaa 44340ttgaagcggc
gagggcgggt gaacatggtc gaggtttcgc ggtggttgct gatgaagtcc 44400gtgcgctagc
caaccgaacg caaatcagta cctctgaaat tgatgaagcg ttatctggct 44460tgcagtctaa
atcagatggt ttggttaaat ctattgagtt gaccaaaagt aactgtgaac 44520tgactcgcgc
tcaagttgtt caagctgtaa acatgttggc gaagctaacc gagcagatgg 44580aaacagtaag
tcgttttaat aatgacattt cgggttcgtc tgttgagcaa aacgccctta 44640ttcagagcat
tgctaagaac atgcataaga ttgaaagctt tgttgaggag cttaataaac 44700taagccaaga
tcagttaact gaatcagcag aaatcaaaac acttaacggt agcgttagtg 44760aattgatgag
cagctttaag gtttaatgtt tctaatattt atacctaaaa atcaacatgt 44820taagtttagt
tgttgatctg aaggccactc aataactgtc gagtttagag tggcttttct 44880gcgttgttct
tgagtctaac tctacgtaat atccgttcat ttcacttcat ttgccgcatc 44940tcacattctg
ataaatagac aattgacata aaatagtaca aatatacatt gtcactctac 45000tcttatggat
aagtgagata aatgtgaata agccaatctt tgtcgtcgta ctcgcttcgc 45060ttacgtatgg
ctgcggtgga agcagctcca gtgactctag tgacccttct gataccaata 45120actcaggagc
atcttatggt gttgttgctc cctatgatat tgccaagtat caaaacatcc 45180tttccagctc
agatcttcag gtgtctgatc ctaatggaga ggagggcaat aaaacctctg 45240aagtcaaaga
tggtaacttc gatggttatg tcagtgatta tttttatgct gacgaagaga 45300cggaaaatct
gatcttcaaa atggcgaact acaagatgcg ctctgaagtt cgtgaaggag 45360aaaacttcga
tatcaatgaa gcaggcgtaa gacgcagtct acatgcggaa ataagcctac 45420ctgatattga
gcatgtaatg gcgagttctc ccgcagatca cgatgaagtg accgtgctac 45480agatccacaa
taaaggtaca gacgagagtg gcacgggtta tatccctcat ccgctattgc 45540gtgtggtttg
ggagcaagaa cgagatggcc tcacaggtca ctactgggca gtcatgaaaa 45600ataatgccat
tgactgtagc agtgccgctg actcttcgga ttgttatgcc acttcatata 45660atcgctacga
tttgggagag gcggatctcg ataacttcac caagtttgat ctttctgttt 45720atgaaaatac
cctttcgatc aaagtgaacg atgaagttaa agtcgacgaa gacatcacct 45780actggcagca
tctactgagt tactttaaag cgggtatcta caatcaattt gaaaatggtg 45840aagccacggc
tcactttcag gcactgcgat acaccaccac acaggtcaac ggctcaaacg 45900attgggatat
taatgattgg aagttgacga ttcctgcgag taaagacact tggtatggaa 45960gtgggggtga
cagtgcggct gaactagaac ctgagcgctg cgaatcgagc aaagaccttc 46020tcgccaacga
cagtgatgtc tacgacagcg atattggtct ttcttatttc aataccgatg 46080aagggagagt
gcactttaga gcggatatgg gatatggcac ctctaccgaa aattctagct 46140atattcgctc
tgagctcagg gagttgtatc aaagcagtgt tcaaccggat tgtagcacca 46200gcgatgaaga
tacaagttgg tatttggacg acactagaac gaacgctacc agtcacgagt 46260taaccgcaag
cttacgaatt gaagactacc cgaacattaa taaccaagac ccgaaagtgg 46320tgcttgggca
aatacacggt tggaagatca atcaagcatt ggtgaagttg ttatgggaag 46380gcgagagtaa
gccagtaaga gtgatactga actctgattt tgagcgcaac aaccaagact 46440gtaaccattg
tgacccgttc agtgtcgagt taggtactta ttcggcaagt gaagagtggc 46500gatatacgat
tcgagccaat caagacggta tctacttagc gactcatgat ttagatggaa 46560ctaatacggt
ttctcattta atcccttggg gacaagatta cacagataaa gatggggaca 46620cggtctcgtt
gacgtcagat tggacatcga cagacatcgc tttctatttc aaagcgggca 46680tctacccaca
atttaagcct gatagcgact atgcgggtga agtgtttgat gtgagcttta 46740gttctctaag
agcagagcat aactgagttc tctgatgttt ggttagccat gtcggtaatg 46800aagaagacca
tattgatgcc tacaatgtgg tctttttttg tttttggaca cttacagtga 46860tgtgttttga
aggacaaatg ttctgctcga atcatgcaaa tacacacgat tacagctcgc 46920ttgttctgcc
cttgctagct catttcgcat tccaaattct tatatattgt cttttatcaa 46980taggaaatgt
gatccagtta aagtatggaa aaatcggaaa gtgttcctag tctcatttat 47040ccaacgaagt
gttttatttg tattataaga ttacgtaata ttttcgtgtt atcgcaaata 47100ctgataggtg
aatcgcctta tagctcgtgt ttgctgattt agctttcact tacgaacgct 47160gtctttgtat
tataataatg gattaaatat gaaacaaatt actctaaaaa ctttactcgc 47220ttcttctatt
ctacttgcgg ttggttgtgc gagcacgagc acgcctactg ctgattttcc 47280aaataacaaa
gaaactggtg aagcgcttct gacgccagtt gctgtttccg ctagtagcca 47340tgatggtaac
ggacctgatc gtctcgttga ccaagaccta actacacgtt ggtcatctgc 47400gggtgacggc
gagtgggcaa cgctagacta tggttcagta caggagtttg acgcggttca 47460ggcatctttc
agtaaaggta atcagcgcca atctaaattt gatatccaag tgagtgttga 47520tggcgaaagc
tggacaacgg tactagaaaa ccaactaagc tcaggtaaag cgatcggcct 47580agagcgtttc
caatttgagc cagtagtgca agcacgctac gtaagatacg ttggtcacgg 47640taacaccaaa
aacggttgga acagtgtgac tggattagcg gcggttaact gtagcattaa 47700cgcatgtcct
gctagccata tcatcacttc agacgtggtt gcagcagaag ccgtgattat 47760tgctgaaatg
aaagcggcag aaaaagcacg taaagatgcg cgcaaagatc tacgctctgg 47820taacttcggt
gtagcagcgg tttacccttg tgagacgacc gttgaatgtg acactcgcag 47880tgcacttcca
gttccgacag gcctgccagc gacaccagtt gcaggtaact cgccaagcga 47940aaactttgac
atgacgcatt ggtacctatc tcaaccattt gaccatgaca aaaatggcaa 48000acctgatgat
gtgtctgagt ggaaccttgc aaacggttac caacaccctg aaatcttcta 48060cacagctgat
gacggcggcc tagtattcaa agcttacgtg aaaggtgtac gtacctctaa 48120aaacactaag
tacgcgcgta cagagcttcg tgaaatgatg cgtcgtggtg atcagtctat 48180tagcactaaa
ggtgttaata agaataactg ggtattctca agcgctcctg aatctgactt 48240agagtcggca
gcgggtattg acggcgttct agaagcgacg ttgaaaatcg accatgcaac 48300aacgacgggt
aatgcgaatg aagtaggtcg ctttatcatt ggtcagattc acgatcaaaa 48360cgatgaacca
attcgtttgt actaccgtaa actgccaaac caagaaacgg gtgcggttta 48420cttcgcacat
gaaagccaag acgcaactaa agaggacttc taccctctag tgggcgacat 48480gacggctgaa
gtgggtgacg atggtatcgc gcttggcgaa gtgttcagct accgtattga 48540cgttaaaggc
aacacgatga ctgtaacgct aatacgtgaa ggcaaagacg atgttgtaca 48600agtggttgat
atgagcaaca gcggctacga cgcaggcggc aagtacatgt acttcaaagc 48660cggtgtttac
aaccaaaaca tcagcggcga cctagacgat tactcacaag cgactttcta 48720tcagctagat
gtatcgcacg atcaatacaa aaagtaatct aatcgaataa cacttaatat 48780taaaggtatt
gcaatagcct ccagccttag ggtttggagg cttttttgtg cctgctgttg 48840gttgggctta
agcgtatgat ttaattgagt aggagagggg tagttatcag ttgcacagag 48900tttaagacat
tatcattaag ctcattcagt attaacttta gtcattatca gtcactatta 48960ccccccaagc
gccgatcaca attaacctag ctcatgatta atctcagtta ccaataggct 49020agcctgtagc
ggattcaaac ccaaataatg tcgtgatgtt tatcggaatc accatagctc 49080gaaaactttg
accttgttct caaggctttg ccaatgcacg aacgtattat gtgcgtggtt 49140tactaataag
cgttagctcg gctgactact catactgttc ttgaaaccgt tactcttggg 49200ttgtttagct
agactcctag caacagccat aaatagtgct ctaactcttt cataattaga 49260agggtagggt
tagccattct attggttcca atgctttatg aaatactagg cgggctcaag 49320tcgatgatca
aacgactcta acagcttaag gttatgcgct tttgcgttag ttacctgcag 49380gccgtaaatg
ccctgattgt agttgtacgg tgacgctgaa taataatttg taggattagt 49440atagaactga
gagactttgt ctatctatga tcgatacagg ctttgagagg gctggatcag 49500tagaaagaca
gaatgacaat tagcactaga gttattttgg tttttaatta gagttaataa 49560aatagatatt
tggtttgtta aatttaatcg tgtcataagc tctgtgtttt aaaaaataaa 49620aaaagccata
gcagttgcta tggctttgaa taagtcaggt tctaaggtaa gcaaacagca 49680agtcaacttg
tctgttttga tattcttagt cttagttcaa gatattttct ttacctgccg 49740cagtgttcac
tgcagatggt tgtgcgtaga tggctgtatt tcttatatct ttaccgttgt 49800cagctgaaag
tagggttttg taaccattga acgtattgct ttcaatagtg acattacagc 49860caaggttatc
atcactattc tttatgacgg cactgctaga atcaccaact ttaattccgc 49920ccacatcact
accaccagag ttaatagtga atgtattatt tgtgatttgt gaaccaaatc 49980gcccgttgtc
actactacag ttaatacgaa ttgcattatt ttggaaactg ccacttaaac 50040cgacaaattc
gctattgtct aatgtaaagt aacctcgaga gaataaccaa cttgcttttt 50100tagtacctag
atcatcttcg gtaatgccgt ttgcatcgaa ctttaggttt tcaagtgcta 50160caggatctga
atctttacca attttaccaa taacgatcgc accagtttca ttatctgaag 50220taccagctga
ctccctacca aaacacccgg ccaaattgtc gttagcaaaa gtcatgtttt 50280tgatacctgc
accgggtgca gtgacatcaa tacaagcatc tccggtaatg gttgctaaac 50340cagcaccatc
aattgtgaca gctttattta gctcaataac accggtatca aacgtacctt 50400cagatgataa
atcaataatc gcgccatctt ctgctgatgc aatcgcagcg ttcacatcat 50460cgactgattt
aggttttcca tcatcggcat tttctaatgc agttataaca gattcatctg 50520taatctctgt
tgctgagtaa gctgtttcta cgtcatcttt ttcacaggtc atatctagct 50580gttgctcagt
cactgtgtaa gtacagtcct taccttcaaa ggaaacaaca ccttcttctt 50640cagcagtata
tattgaactc tcaaaagtga agccattagc tacgtctcca ctgtaaatgc 50700ttagagcacg
agtaccttcc tcattattat caaagcgata tggtgaagtt ccgctgagtg 50760actgtgcagc
aacagcacca cctgtcagat cccaatagac gttttctata gagtaaactt 50820caacaggttc
aacagggtct gttccgcctg gatctgttgg aataggtaaa ccatcactgt 50880tacaaccaaa
taataaacct gtcgaaagag cgacagctgt agcgacttta gaaatttgca 50940taaaatattc
tctttatgat attaaatcca tatgtaaatc acataagaaa tagataatga 51000atagtcgtta
aatatttatt aggatgaagc taattctgat tagaacatcc tattatttaa 51060aataaagtaa
ttaaaatatg cccaaataaa ttacaagagg agagggctat tttatatttt 51120gactatttta
ttattagaat gagtaagcaa taccaacacg gtatttagct tcacgatctt 51180ttgaagatga
actgatatca gaagaccaga tttcagcaaa aggtttccaa gaaccgaact 51240tgtagtttac
ctttaagcca gcatcccatt cccagtcatc actattataa agaagtacgt 51300tatccaaaga
ttttacatag tttgcttcgt aagaaaggcc aagcttaggt agagattcaa 51360ttttgtaaga
gcccgtcagt gtaactttag acttttgagc tgattctaaa cgctcgccag 51420tttcagaatc
tttgtcgcca aattgtgtgt ggttacggaa gtcagcatat tcatgacggt 51480aacgaatagc
agttgttaaa cccatatctg ctttatagcc aacgcggaac tgaggtttaa 51540acgtaacctt
tttcatcttc cagtcgccat cgttagcatt aggctcatcc caatcccaag 51600caataggcat
acccatttgt agataccaat tattgtctat tttgtatgtc gcagtgttat 51660cgatctccat
accatagatg taccaattgc catcgtaaaa actctggctg tttgctgatt 51720taacagaacc
tgaatcttca tcatagtaag agtcatcacc gtggaactta agttctagac 51780cagtagagtg
cttccacttg tctgacagct taaagctttc acctagctta actcggtgtt 51840gatggcgagc
gtctacgtga gccgtatcac cattagtctt tgtataatcc gtcgcagcac 51900gatactcgta
acgataatca agagatgcac cagcagctgt gcccgctaaa agagtacatg 51960caacagctgc
agcaattttt gtaacagaat tcataccttt gtctcactat tattttttta 52020ttttggatac
atccaatgta cccctgactc acaaaccaat accttacatg gtattaaatt 52080aatgtatgac
aaatatggta tttattccta gggtagattt ctgtgagatc tatcaaaagt 52140tccgactaat
ggcctattta tatagctaaa tgttatgaat atctcaattt aaggcttacc 52200aatcaaatca
atcatgactc agttctcata ttaacaaacc ttgtaagctc agttggttgt 52260atgtgttaaa
ataatacaaa tataagaata ttcccacact ttcatatcga tgttctagtt 52320gttgtggttt
aaacataacg gcgcatgttg agggatatag atataaacca ccgccaaatg 52380tttggtaaaa
gttaaaagat ggcgaaatgt aaattctatt tattggttgg tttatttaag 52440tcgaagagaa
aatatttagt actaattcgt gttcaaaagt agtttctgtg ctgagagtgt 52500actcagtatc
tgttaacaat aaaggatgag tcatgtttaa gaaaaacata ttagcagtgg 52560cgttattagc
gactgtgcca atggttactt tcgcaaataa cggtgtttct taccccgtac 52620ctgccgataa
attcgatatg cataattgga aaataaccat accttcagat attaatgaag 52680atggtcgcgt
tgatgaaata gaaggggtcg ctatgatgag ctactcacat agtgatttct 52740tccatcttga
taaagacggc aaccttgtat ttgaagtgca gaaccaagcg attacgacga 52800aaaactcgaa
gaatgcgcgt tctgagttac gccagatgcc aagaggcgca gatttctcta 52860tcgatacggc
tgataaagga aaccagtggg cactgtcgag tcacccagcg gctagtgaat 52920acagtgctgt
gggcggaaca ttagaagcga cattaaaagt gaatcacgtc tcagttaacg 52980ctaagttccc
agaaaaatac ccagctcatt ctgttgtggt tggtcagatt catgctaaaa 53040aacacaacga
gctaatcaaa gctggaaccg gttatgggca tggtaatgaa ccactaaaga 53100tcttctataa
gaagtttcct gaccaagaaa tgggttcagt attctggaac tatgaacgta 53160acctagagaa
aaaagatcct aaccgtgccg atatcgctta tccagtgtgg ggtaacacgt 53220gggaaaaccc
tgcagagccg ggtgaagccg gtattgctct tggtgaagag tttagctaca 53280aagtggaagt
gaaaggcacc atgatgtacc taacgtttga aaccgagcgt cacgataccg 53340ttaagtatga
aatcgacctg agtaagggca tcgatgaact tgactcacca acgggctatg 53400ctgaagatga
tttttactac aaagcgggcg catacggcca atgtagcgtg agcgattctc 53460accctgtatg
ggggcctggt tgtggcggta ctggcgattt cgctgtcgat aaaaagaatg 53520gcgattacaa
cagtgtgact ttctctgcgc ttaagttaaa cggtaaatag cacatagcat 53580aaccaatagt
ctagctagac gcagtcctta aggaatattt tcgaagacca cttaaccgaa 53640tgttgagtgg
tctttttgtt ttatatgagt tttaagatga acttggtatt aatgtgacct 53700tggtatcaat
gagggtgtac gtgaagccta ccaatgaaag gtacagctaa aacaatacaa 53760ccttgtcaaa
agacaaggtt gcattcagaa agcgtaggaa gattttagga cgacaactcg 53820atacggagtt
tagtcataca tcaactcttt ggctttgtcg gcatcaaact ctttaagaga 53880ctttcgagcc
aagtgacgga atgggaaagc tttcacgact tcttcgaatg gttggatggc 53940aaatgcccaa
aagatagaac cgtctaatcc aaagatgatc aatgcacaca atggaattga 54000aattacccat
tgaccagtaa agttgatttt gaagactgcg gtcgtttttc ctagggctct 54060taatacattc
ccatgaaccg 54080
3 22890 DNA Vibrio splendidus 3
gtgctttgtg acaacggggg atgtatggat
attgaagttt cgcgccaggt tgcggtagtt 60gaagctacga gtggagatgt cgtcgtagtt
aagccagacg gcagcgcaag aaaagtttca 120gttggcgata ccatccgtga aaatgagatc
gtgattacgg ccaacaagtc agagcttgta 180ttaggcgttc agaatgattc gattccggtt
gcagagaatt gcgtcggttg tgttgatgaa 240aacgctgcat gggtagatgc cccaatagct
ggtgaggtta attttgactt acagcaagca 300gacgcagaaa ccttcactga agacgacctt
gctgcaattc aagaagccat tttaggtggt 360gccgatccga ctcaaatctt agaagcaacg
gctgctggtg gcggactagg ttctgcaaat 420gctggctttg tgacgattga ctataactac
actgaaactc atccatcgac tttctttgag 480accgctggtc tagcagaaca aactgttgat
gaagacagag aagaattcag atctatcact 540cgttcatcag gtggccaatc aatcagtgaa
acactgactg aaggctccat atctggcaat 600acctatcccc aatctgtaac aacgacagaa
acgattattg ctggtagttt agctctcgcc 660cctaactctt tcattccaga aactttatcc
ctcgcttcac tacttagtga attaaacagc 720gacattactt caagtggtca gtccgttatc
ttcacctatg acgcgacgac taattctatc 780gttggtgttc aagataccga cgaagtatta
cgtatcgaca ttgatgccgt cagtgttggc 840aataacattg agctttctct aaccacaacg
atttcccagc cgattgatca tgtaccgtcg 900gttggcggtg gtcaggtttc ttacactggc
gatcaaatag atattgcctt tgatattcaa 960ggtgaagaca ccgctgggaa cccgctagca
acacccgtta acgcacaagt ttcagtgttt 1020gacgggatag atccgtctgt tgaaagtgtc
aatatcacta acgttgaaac tagcagcgcg 1080gcaatcgaag ggacgttctc aaatattggt
agtgataacc ttcaatcagc cgtatttgat 1140gcaagtgcac tggaccagtt tgatgggttg
ctcagtgata atcaaaacac gcttgcgaga 1200ctttctgatg atggaacaac gattactctg
tccatccaag gtcgaggtga ggttgttctc 1260actatctctc tagataccga tggcacctat
aaattcgagc agtctaatcc gatagaacaa 1320gtgggtaccg attcactgac gttcgctttg
ccaatcacga ttaccgattt tgaccaagat 1380gttgtaacca atacgatcaa cattgccatt
actgatggcg atagccctgt tattactaat 1440gttgacagta ttgatgttga tgaagcgggc
attgttggcg gctcacaaga gggcacggcg 1500ccagtgtctg gcactggcgg tatcaccgcg
gacatttttg aaagtgacat cattgaccat 1560tatgagctag aacccactga atttaatact
aatggcacct tggtttcaaa tggcgaggct 1620gtgctacttg agttgattga tgaaaccaac
ggtgtaagaa cttacgaagg ttatgttgag 1680gtcaatggtt cgagaattac ggtctttgac
gttaaaattg atagcccttc attgggcaac 1740tatgagttta atctttatga agaactttct
catcaaggcg ctgaagatgc gctgttaact 1800tttgcattgc caatttatgc tgttgatgca
gatggcgacc gttctgcact gtctggaggt 1860tcgaacacac cagaagctgc tgagatcctc
gttaatgtta aagacgatgt cgttgaatta 1920gttgataagg ttgaatcagt caccgagccg
accttagcgg gcgatactat tgtttcgtat 1980aacctgttca attttgaagg cgcagatggt
tctacaattc aatcgtttaa ctacgacggt 2040gttgattact cactcgatca aagcctgctc
cccgatgcta cccagatttt cagttttact 2100gaaggtgtcg tcactatctc attaaacggt
gacttcagtt ttgaagtcgc tcgtgatatc 2160gaccactcaa gcagtgaaac tatcgtcaaa
cagttctcat ttttagccga agatggtgat 2220ggggatactg atagttcgac gcttgagtta
agtattaccg atggccaaga tccgatcatt 2280gatttgatcc cgcctgtgac tctctctgaa
accaacctta atgacggctc tgctcccagc 2340ggaagtacag ttagcgcaac cgagacgatt
acctttaccg caggcagcga cgatgtagca 2400agtttccgta ttgaaccaac agagtttaat
gtgggcggtg cacttaaatc gaatggattt 2460tcggttgaga taaaagaaga ttcggctaat
ccgggtactt acattggctt tattaccaac 2520ggttcgggcg ctgaaatccc agtgtttacg
attgctttct ctacgagcac attgggtgaa 2580tacaccttta ctctgcttga agcgttagac
catgtagatg gtttagataa gaacgatctg 2640agctttgatc tgcctattta tgcggttgat
acggacggcg acgattcatt ggtgtctcag 2700cttaatgtga ctatcggtga tgatgttcaa
atcatgcaag acggtacgtt agatatcacc 2760gagccaaatc ttgctgacgg tacaatcaca
accaacacca ttgatgtaat gccaaatcaa 2820agtgctgatg gcgcgacgat cactcggttc
acttatgacg gtgtcgtaaa cacactggat 2880caaagtattt caggagaaca gcagttcagc
ttcacagaag gcgaactgtt tatcaccctt 2940gaaggtgaag tgcgctttga gcctaatcgc
gatctagacc actcagtgag tgaagatatc 3000gtgaagtcga ttgtggtgac ttcaagcgac
ttcgataacg atccggtgac ttcaaccatt 3060acgctgacga tcactgatgg tgataacccg
acgattgatg ttattccaag tgttacgctt 3120tctgaaatta acctgagcga tggctctgct
ccaagtggca gcgcggtaag ctcgactcaa 3180actattactt ttaccaatca aagtgatgat
gtggttcgtt tccgtattga gtcaacggag 3240ttcaatacta acgatgatct taaatcgaac
ggtttagctg ttgagttacg tgaagacccg 3300gcagggtcgg gtgactacat tggttttacg
accagtgcga cgaacgtaga aactccagta 3360ttcacattaa gctttaattc tggatcatta
ggtgaataca cgttcacact catcgaagcg 3420ttggaccacc aagatgcccg tggcaacaac
gacctcagtt ttgatttacc tgtttacgcg 3480gtagatagtg atggcgatga ttcattggtg
tctccgttaa acgtcactat cggtgatgat 3540gttcaaatca tgcaagatag tacgttagat
atcgtcgagc caaccgtcgc agatttggcc 3600gctggcacag tgacaactaa caccattgat
gtgatgccaa atcaaagtgc cgatggcgca 3660acggtgacgc aattcactta tgatggccag
cttcgaacac ttgaccaaaa tgacaatggt 3720gagcagcaat ttagcttcac agaaggtgaa
ctgttcatca cgcttcaagg tgatgtgcgc 3780tttgagccta atcgtaatct agaccacaca
ctcagcgaag acatcgtgaa atcaatcgtg 3840gtgacatcta gcgattccga taacgatgtg
ttgacctcaa ccgtcactct gaccattacc 3900gatggtgata tcccaaccat tgataatgtt
ccaactgtga acttgtctga aactaatctg 3960agtgatggct ctgcacctag cggaagcgcg
gtgagttcaa ctcaaactat tacttacacc 4020actcaaagtg atgatgtgac aagcttccgt
attgaaccga ctgaatttaa tgttggtggc 4080gctctcacat caaacggatt ggcagtcgag
ttaaaagctg atccaaccac accgggtggc 4140tacatcggtt ttgtgactga tggttcgaac
gttgaaacta acgtgttcac gattagcttc 4200tcagatacca atttaggcca gtacaccttc
accttacttg aagcgttaga ccatgtggat 4260ggtttagcga acaatgatct gacctttgat
ctgcctgttt atgcagttga tagcgatggc 4320gacgattcac tggtgtctca gttaaatgta
accatcggtg atgatgttca aatcatgcaa 4380ggtggtacgt tagatatcac tgagccaaat
cttgcagacg gcacaattac aaccaatacc 4440atcgatgtga tgccagagca aagcgccgat
ggtgcgacga tcactcagtt cacttatgac 4500ggtcaagttc gaacactgga tcaaacggac
aatggtgagc agcaatttag cttcactgaa 4560ggcgagttgt tcatcactct tcaaggtgac
gtgcgtttcg aacccaatcg caacctagat 4620cacacagcta gcgaagatat cgtgaagtcg
atagtggtga cttcaagcga tttagataac 4680gatgtggtga cgtcaacggt cactctgacg
attactgatg gtgatatccc aaccattgat 4740gcagtgccaa gcgttactct gtctgaaatc
aatcttagtg acggctctgc gccaagtggc 4800actgcagtta gtcaaactga gacgattacc
ttcaccaatc aaagtgatga tgtgaccagt 4860ttccgtattg agccaataga gttcaatgtg
ggcggtgcac tgaaatcgaa tggatttgcg 4920gttgagataa aagaagattc ggctaatccg
ggtacttaca ttggctttat taccaacggt 4980tcgggcgctg aaatcccagt gtttacgatt
gctttctcta cgagctcatt gggtgaatac 5040acctttactc tgcttgaagc gttagaccat
gtagatggtt tagataagaa cgatctgagc 5100ttcgatctgc ctgtttatgc ggtcgatacg
gacggcgatg attcattggt gtctcagcta 5160aacgtgacca tcggtgatga tgtccaaatc
atgcaagacg gtacgttaga tatcatcgag 5220ccaaatctgg ctgatggaac aatcacaacc
agcactattg atgtgatgcc aaaccaaagt 5280gctgatggtg cgacgatcac tcagtttact
tatgacggtc agctaagaac gcttgatcaa 5340aatgacactg gcgaacagca gttcagcttc
acagaaggcg agttgtttat cacccttgaa 5400ggtgaagtgc gctttgagcc aaaccgagac
ctagaccaca ccgcgagtga agatattgtt 5460aagtcgattg tggtcacttc aagtgatttc
gataacgact ctctgacttc taccgtaacg 5520ctgaccatta ctgatggtga taaccctacg
atcgacgtca ttccaagcgt taccctttct 5580gaaactaatc tgagtgatgg ctctgctcca
agtggcagcg cggtaagctc gactcaaact 5640attactttta ccaatcaaag tgatgatgtg
gttcgtttcc gtattgagcc aacggagttc 5700aatactaacg atgatcttaa atcgaacggt
ttagccgttg agttacgtga agacccggct 5760gggtcgggtg actacattgg ttttactact
agtgcgacga atgtcgaaac cacggtattt 5820acgctgagtt tttctagcac cacattaggt
gaatatacct tcactttgct tgaagcgttg 5880gaccaccaag atgcccgtgg caacaacgac
ctcagttttg aactgcctgt ttatgcggta 5940gacagtgatg gcgatgattc actgatgtct
ccgttaaacg tcaccatcgg cgatgatgtt 6000caaatcatgc aagacggtac gttagatatc
gtcgagccaa ccgtcgcaga tttggccgct 6060ggcattgtga caactaacac cattgatgtg
atgccaaatc aaagtgccga tggcgcgacg 6120atcactcaat tcacttatga tggccaactt
cgaacacttg accaaaatga caatggcgaa 6180caacagttta gcttcacgga aggtgaacta
ttcatcactc ttgaaggtga agtgcgcttt 6240gagcctaatc gtaatctaga ccacacgctg
aacgaagaca tcgtgaaatc gatcgtggtg 6300acgtctagtg actccgataa cgatgtgttg
acctcaaccg tcactctgac cattaccgat 6360ggtgatatcc caaccattga taatgtgcca
acagtgagct tgtcagaaac aagtctgagt 6420gacggctctt caccaagtgg cagcgcagtt
agctcaactc aaaccatcac ttacaccact 6480caaagtgatg atgtaaccag cttccgtatt
gaaccgactg agttcaatgt tggcggtgct 6540ctcaaatcaa atggattggc ggttgagctg
aaggccgatc caaccactcc gggcggctac 6600atcggctttg tgactgatgg ttcgaacgtt
gaaactaacg tgttcacgat tagcttctcg 6660gataccaatt taggtcaata caccttcacc
ttgcttgaag cgttggatca tgcggatagc 6720cttgcaaata acgatctgag ctttgatctg
ccagtctacg ccgtcgatag tgatggcgat 6780gattcactgg tgtctcaact caatgtaacc
atcggtgatg atgttcaaat catgcaaggt 6840ggtacgttag atatcactga gccaaacctt
gcagacggca caaccacaac taacaccatc 6900gatgtgatgc cagaacaaag tgccgatggt
gcgacgatca ctcagtttac gtatgacggg 6960caagttcgca ctctggatca aactgacaat
ggtgagcagc aatttagctt cactgaaggc 7020gagttgttca tcactcttca aggtgacgtg
cgtttcgaac ccaatcgcaa cctagatcac 7080acagctagcg aagacatcgt gaagtcgata
gtggtgactt caagcgattc agataacgat 7140gtggtgacgt caacggtcac tctgactatt
actgatggtg atctcccaac cattgatgca 7200gtgccaagcg ttactctgtc tgaaactaat
cttagtgacg gctctgcgcc aagtggcagc 7260gcagtcagtc aaactgagac catcaccttt
accaatcaaa gtgatgatgt ggcgagtttc 7320cgtattgagc caaccgagtt taatgtgggc
ggtgcactga aatcgaatgg gtttgcggtt 7380gagataaaag aagactctgc taatccgggt
acttacattg gctttattgc caatggttcg 7440agcgctgaaa tcccagtgtt cacgattgct
ttctctacga gtacgttggg tgaatacacc 7500tttactctgc ttgaagcgtt agaccatgcg
gatggtttag ataagaacga tctgagcttt 7560gagcttccgg tttacgcggt tgatacagac
ggtgatgatt cattggtatc tcagcttaat 7620gtgaccattg gtgatgatgt tcaaatcatg
caagatggta cgttagacgt tatcgagcca 7680aatcttgcag acggcacaat cacaaccaac
accattgatg tgatgcccga gcaaagtgct 7740gatggtgcga cgatcactca gtttacttat
gacggtcagc taagaacgct tgatcaaaat 7800gacactggtg aacagcagtt cagcttcaca
gaaggcgagt tgtttatcac ccttgaaggt 7860gaagtgcgct ttgaacctaa tcgcgatcta
gaccattccg ttagcgaaga catcgtgaag 7920tcgatagtag tgacttcaag cgacttcgat
aacgatccgg tgacttcagc cattacgctg 7980accattactg atggtgataa tccgactatc
gattcggtac cgagcgttgt acttgaagaa 8040gctgatttaa ctgatggctc atcgccaagt
ggcagcgcgg ttagtcaaac ggaaaccatc 8100actttcacta atcaaagtga cgatgttgag
aaattccgtt tagaaccaag tgaatttaat 8160actaacaacg cgctcaagtc cgatggcttg
atcattgaga ttcgagagga accaacagga 8220tccggcaatt atattggttt cacgaccgat
atttcgaatg tcgaaaccac tgtgtttaca 8280ctcgatttca gcagtaccac tttgggtgag
tacaccttca cgcttctgga agcgattgac 8340cacacgcctg ttcaaggcaa taacgatcta
acattcaact tgccagtcta cgcggttgat 8400agcgacggtg atgattcgct aatgtcatca
ctatcggtga cgattactga tgatgttcaa 8460gtgatggtga gtggttcgct tagtatcgaa
gagcctactg ttgccgactt ggctgcaggc 8520acgccaacaa catcagtatt tgatgtatta
acatccgcga gtgctgatgg ggcgaccatt 8580actcagttca cttatgatgg tggggcggta
ttaacgcttg atcaaaacga tacaggtgag 8640cagaagttcg tggttgctga tggggcatta
tatatcactc tgcaaggcga tattcgtttc 8700gaaccaagtc gtaaccttga ccatactggt
ggcgatatcg tcaagtcgat agtcgtaact 8760tcaagtgatt ccgatagcga tcttgtgtct
tcaacggtaa cgctaaccat tactgatggc 8820gatatcccaa cgattgacac ggtgccaagc
gttactctgt cagaaacgaa tctgagcgac 8880ggatctgctc cgaatgcaag tgcggtaagt
tcaactcaaa ccattacctt tactaaccaa 8940agtgatgacg tgacgagttt ccgtattgaa
ccgactgatt ttaatgttgg tggtgctctg 9000aaatcgaacg gattggcggt cgaactgaaa
gcggacccaa ctacaccggg tggctacatc 9060ggttttgtga ctgatggttc gaacgttgaa
actaacgtgt ttacgattag cttctcggat 9120accaatttag gtcaatacac cttcaccctg
cttgaagcgt tggatcatgt agatggctta 9180gtgaagaatg atctgacttt tgatcttcct
gtttatgcgg ttgatagcga tggtgatgat 9240tcactggtgt ctcaactgaa tgtgaccatt
ggtgatgatg tacaggtcat gcaaaaccaa 9300gcgcttaata ttattgagcc aacggttgct
gatttggctg caggtactcc gacgacagcc 9360actgttgatg tgatgcctag ccaaagtgcc
gatggcgcga caatcactca gtttacttac 9420gatggcgggg cggcaataac actcgaccaa
aacgacaccg gtgaacagaa gtttgtattt 9480actgaaggtt cactgtttat caccttgcaa
ggtgaagtgc gtttcgagcc aaatcgcaat 9540ctaaaccaca cagcgagcga agacatcgtg
aagtcgattg tggtgacttc aagcgattta 9600gataacgatg tactgacgtc aacggtcact
ctgactatta ctgatggtga tatcccaacc 9660attgatgcag tgccaagcgt tactctgtct
gaaactaatc ttagtgacgg ctcagcgcca 9720agcagcagtg ctgtaagtca aacagagacg
attaccttca tcaatcaaag tgatgatgtg 9780gcgagtttcc gtattgagcc aacagagttc
aatgtgggcg gtgcactgaa atcgaatgga 9840tttgcggttg agataaaaga agattcggct
aatccgggta cttatatcgg ttttattacc 9900gatggttcga atactgaagt tcctgtattc
acgattgctt tctctacaag tacgttgggc 9960gaatacacct tcaccttact tgaagcgcta
gaccatgcaa atggcctaga taagaacgat 10020ctgagttttg atcttcctgt ttatgcggta
gacagtgatg gcgatgattc actggtgtct 10080caactgaatg tgaccattgg tgatgatgtc
caaataatgc aagacggtac gttagatatc 10140actgagccaa atcttgcaga cggaacaatc
acaaccaaca ccattgatgt gatgccaaat 10200cagagtgccg atggtgcgac gatcactgaa
ttctcatttg gcggtattgt caaaacactc 10260gatcaaagca tcgtaggtga gcagcagttt
agtttcaccg aaggtgagct attcatcact 10320cttcaaggtc aagtgcgctt tgaaccaaat
cgtgaccttg accactctgc cagcgaagac 10380atcgtgaagt cgatagtggt tacttcaagt
gattttgata acgatcctgt gacttcaacc 10440gttacgctga ccattaccga tggtgatatt
ccaactatcg atgcggtacc aagtgttacg 10500ctttcagaaa caaacctagc tgatggttct
gcgccaagtg gtagtgcggt tagtcaaacg 10560gagacgatta cttttaccaa tcaaagtgat
gatgtggttc gcttccgtct ggaaccaacc 10620gagttcaata ctaacgatgc acttaaatcg
aatggcttag cggtcgaact gcgcgaagaa 10680cctcaaggct ctggtcagta cattggcttt
accaccagtt cgtctaatgt tgagacaaca 10740gtatttacgt tggactttaa ctccggaacc
ttaggtgaat acacatttac tttaatcgaa 10800gctctggatc atcaagatgc gcgtggcaac
aacgatttaa gctttaatct acctgtgtat 10860gcggtggata gtgatggcga tgactcgtta
gtctctcagc ttggcgtgac cattggcgac 10920gatgtgcagt tgatgcaaga cggcacaatc
accagtcgtg agcctgcagc aagtgttgaa 10980acatcaaata cctttgatgt gatgccaaac
caaagtgctg atggagccaa agtcacttca 11040tttgttttcg atggtaagac tgcagaaagt
cttgatttga atgtgaatgg tgaacaagag 11100ttcgtcttca cggaaggttc ggtatttatt
acgacggaag gtgagatacg attcgagccg 11160gtacgtaatc aaaatcatgc tggtggtgat
attaccaagt cgattgaggt gacgtctgtt 11220gacctcgatg gcgatattgt cacatcgaca
gtgacactga agattgttga tggtgacctt 11280cctactatcg accttgttcc cggaattacg
ttatctgaag tggatctggc cgatggctct 11340gtgccaaccg gtaatccagt gacaatgaca
caaaccatta cctacacagc gggtagtgac 11400gacgtaagcc atttcagaat tgaccctacg
cagttcaata cttcaggggt tttgaaatcg 11460aacggcctag atgtcgaaat aaaagagcag
ccagctaatt ctggtaatta cattggcttc 11520gtcaaagacg gttctaacgt agaaaccaac
gtcttcacga tcagcttctc gacgagcaat 11580ttagggcaat acacgttcac actacttgaa
gcgttagatc atgtagatgg attgcaaaac 11640aatatactaa gcttcgatgt ccctgtttta
gcggttgatg cggatggtga tgattctgca 11700atgtcgccta tgacggttgc gatcaccgat
gacgtacaag gtgttcaaga tggcaccttg 11760agtatcactg agccttcatt agctgatttg
gcatcgggta cgccaccaac gacggcaatc 11820attgatgtta tgccaacgca gagtgctgat
ggcgcgaaag taacacagtt tacttacgat 11880ggtggcacag ctgtaacgtt agacccaagc
atcgccacag aacaagtctt taccgtaacc 11940gatggcttac tgtacatcac cattgaaggg
gaggttcgtt ttgagccgag ccgagatcta 12000gaccattcat ctggcgatat cgtaagaacg
attgtcgtca ccaccagtga ttttgataac 12060gatacagata ccgcggatgt cactttgacg
atcaaagacg gtatcaatcc cgttatcaat 12120gtggttccag atgttaactt atcggaagtt
aatctagcgg atggctcgac gccaagtggt 12180tctgcagtca gttcgactca cacaatcact
tacaccgaag gaagtgatga ttttagtcac 12240tttagaattg cgaccaacga attcaatcct
ggcgatctgt tgaaatcaag tggtcttgtt 12300gttcaactaa aagaagatcc tgcttctgct
ggtgattaca ttggttatac cgatgatggt 12360atgggtaacg ttaccgatgt atttaccatt
agctttgata gtgcaaacaa agctcagttt 12420acatttacct tgattgaggc gcttgatcac
cttgatggtg tgctttacaa cgatcttacg 12480ttccgtttgc ctatctatgc tgttgataca
gatgattctg aatcaacaaa gcgcgatgtg 12540gtggttacga tagaagatga catccagcaa
atgcaagatg gcttcttaac cattaccgag 12600ccaaattctg gtactccaac aacaactacc
gttgatgtga tgccaatacc aagtgcagac 12660ggtgcgacta ttacgcagtt cacgtatgac
ggtggttctc caattactct gaatcaaagc 12720atcagcggcg aacaagagtt tgttttcact
gaaggttcac tgtttgtgac actagatggt 12780gatgtaaggt ttgagccaaa tagaaacctt
gatcactctg cgggcgacat tgttaaatcg 12840attgtgttca cgtcttcaga ctttgataac
gacatcttct catcaaaagt cactctcacc 12900attgttgatg gtgatgggcc aacaatcgac
gttgtgccgg gtgtggcatt gtcagaaagc 12960ttacttgcgg atggttcgac gcctagcgta
aatcccgtga gtatgactca aaccattact 13020tcacttgcaa gtagtgatga tattgctgaa
atagtggtgg aagtcgggtt gttcaatacc 13080aacggcgcgt tgaagtcgga tggtttgtca
ctgagtttac gtgaagaccc tgtaaattca 13140ggcgactaca ttgcatttac tactaatggt
tcgggtgttg agaaagttat cttcactctg 13200gattttgatg atacgaatcc gagtcaatat
acgtttactc tgcttgaacg tttagaccat 13260gttgatggct taggaaataa cgatctgagt
tttgatcttt ctgtttatgc agaagatacc 13320gatggtgata tttcagcgtc taaaccgctt
acagtcacca tcaccgatga tgttcagctc 13380atgcaatccg gtgcgctcaa cattactgag
ccaaccacag gaacaccgac tacagcagtc 13440tttgatgtga tgcctgcgca aagtgcagat
ggcgcgacaa tcactaagtt tacctatggc 13500agccaacctg aagagtctct ggtacaaacc
gtcacgggtg agcaagaatt tgtgttcact 13560gaaggttctc tgtttatcaa tcttgaaggt
gatgtacgtt tcgaacctaa ccgtaatctc 13620gatcattcgg gtggtaacat cgttaagacc
attacggtga catcggaaga taaagatggc 13680gatattgtca cttcaacagt gacgctgact
attgtagatg gcgcgccacc agtaatagac 13740acagtaccaa cggttgcatt ggaagaagcg
aatctggtcg acggatcttc accgggttta 13800cctgttagcc aaactgaaat cattactttc
acagcaggaa gtgatgatgt gagccacttc 13860cgtattgatc cggctcaatt caacacatca
ggcgatctga aagcggatgg tttggtggtt 13920cagttaaaag aagatcctct aaacagcgat
aattatattg gttacgttga aagcggcggt 13980gtccaaacgg atatcttcac catcaccttt
agcagcgtgg ttctaggaga gtacacattc 14040accttgttgg aagagttaga tcacctgcct
gtacaaggta acaatgatca aatcttcacc 14100ttgccagtga tcgcagtcga caaagacaac
actgactcag cggtgaaacc tcttacggtg 14160accattaccg atgatgttcc aaccattact
gacaccaccg gcgccagtac gtttgtggtt 14220gatgaagatg atttgggcac tctggcacaa
gcgacgggtt cgtttgtaac cacagaaggt 14280gcagatcaag tcgaggttta cgaactacgt
aatatatcaa cgttggaagc aacgctatcg 14340tcgggcagtg aaggtattaa gatcactgag
atcacaggtg ctgctaacac gaccacctac 14400caaggggcga ccgacccaag tggaacgcca
attttcacat tagtgctgac tgatgatggt 14460gcctacacct ttaccttgct tggccctctc
aatcacgcta cgacaccgag taacctcgat 14520acattaacaa taccatttga tgttgttgcc
gttgacggtg atggcgatga ttctaaccaa 14580tatgtattgc caatcgaggt gctagatgat
gtgcctgtaa tgacggcgcc gacgggtgaa 14640acggttgttg atgaagacga tcttactggc
attggttccg atcaatctga agatacaatt 14700atcaatggac tgttcaccgt tgatgaaggt
gcggatggcg ttgtgctgta tgagctggtt 14760gatgaagatt tggttctgac gggcttaacc
tctgatggag aaagcttaga gtggctagct 14820gtttcacaaa acggcacaac atttacttac
gttgctcaaa ctgcaacgag taatgaagcg 14880gtgttcgaga ttattttcga cacctcggat
aacagctacc aatttgaatt atttaagcca 14940ctgaagcacc ctgacggtgc aaacgagaac
gcgatagatc ttgatttctc aatcgttgct 15000gaagattttg atcaagacca atcggatgcg
atcggtctaa aaattacggt aaccgatgat 15060gttccgttag tgacaactca atcgattact
cgtcttgaag gtcaggggta tggcaactct 15120aaagtcgaca tgtttgccaa tgcaacagat
gtgggggctg atggcgcggt actgagtcga 15180attgagggta tctcaaataa tggtgcagat
attgttttcc gtagcgggaa caatgggcca 15240tatagtagcg gcttcgattt aaacagcggt
agccaacaag ttcgagtcta cgagcaaaca 15300aatggcggtg ctgatactcg tgaacttggc
cgtctacgca tcaactcaaa tggtgaggtt 15360gaattcagag ctaacggcta tctcgatcat
gacggtgatg acaccatcga cttctcgatt 15420aacgtgattg ccacagatgg agatttagac
acctctgaaa caccgttaga tattacgatt 15480actgataggg attctacaag aattgcgctg
aaagtgacga ccttcgagga tgcgggtaga 15540gactcaacca taccttacgc aacaggtgat
gagccgactc ttgagaatgt tcaagataac 15600caaaatggtt tgccgaatgc gccagcgcaa
gttgcgctgc aagttagtct gtatgaccaa 15660gataacgctg aatctattgg gcagttgacg
attaaaagcc cgaacggagg tgatagtcat 15720caaggtactt tttattactt tgatggtgct
gactacatag aattagtgcc tgagtcaaat 15780gggagcatta tatttggctc tcctgaactc
gaacaaagct tcgctccaaa cccgagtgaa 15840ccaagacaaa ctatcgcgac gatagacaac
ctgttctttg ttccagacca acacgctagt 15900tcggatgaaa ctggtgggcg agttcgttat
gagcttgaaa ttgagaaaaa tggcagtacg 15960gatcacaccg ttaattcaaa cttcagaatt
gagattgaag ctgtagctga tattgcgact 16020tgggatgatt ccaacagcac gtatcagtat
caagtcaacg aagatgaaga caatgtcacg 16080ttgcagctga acgcagagtc tcaagataac
agtaatactg agacgattac ctatgaactt 16140gaagccgttc aaggcgacgg gaagtttgag
ttacttgatc aaaatggcaa tgtgttaacg 16200cccgttaatg gtgtttatat catcgcatct
gctgatatca atagcaccgt agttaaccct 16260attgataact tctcagggca gattgagttc
aaagcgacgg caattacgga agagacgctt 16320aacccatacg atgattcaga caacggtgga
gcaaacgata agacgacggc tcgttctgtg 16380gaacaaagta ttgttattga tgtgaccgca
gatgcggacc ctggcacatt cagtgttagt 16440cgaattcaga tcaacgaaga caatatcgat
gatccagatt acgtcgggcc tttggacaat 16500aaagacgcgt tcacgttaga cgaagtcatc
accatgacag ggtcggtcga ttctgacagt 16560tctgaagaac tgtttgtgcg catcagtaat
gttacggaag gagctgtgct ttacttctta 16620ggcaccacga cagtcgttcc gaccatcacg
atcaatggtg tggattatca agaaatcgcg 16680tattccgatt tggctaacgt tgaggttgtt
ccaaccaaac acagtaatgt cgatttcacc 16740ttcgatgtta cgggagtggt caaagatacg
gcaaatctat ccacgggcgc ccaaatcgat 16800gaggagatac taggaactaa aaccgtcaac
gttgaagtca aaggcgttgc cgatactcct 16860tatggtggaa cgaatggcac ggcttggagt
gcaattacag atggcactac atctggtgtt 16920caaaccacga ttcaagagag ccaaaatggt
gatacctttg ctgagcttga tttcaccgtg 16980ttgtcgggag agagaagacc agatactggc
actacaccat tagctgacga tgggtcagaa 17040tcaataaccg ttattctatc gggtataccc
gatggggttg ttctagaaga cggtgacggt 17100acagtgattg accttaactt tgtcggttat
gaaaccggac cgggcggtag tcctgactta 17160tccaaaccta tctacgaagc gaacattact
gaggcgggta aaacttcagg cattcgcatc 17220agacctgtcg actcttcaac cgagaatatt
cacattcaag gtaaagtgat tgtgactgag 17280aacgatggtc acacgcttac gtttgatcaa
gaaattcgag tgcttgttat acctcgaatc 17340gacacatcag caacttatgt caatacgact
aacggtgatg aagatacggc tatcaatatt 17400gattggcacc ctgaaggcac ggattacatt
gatgacgatg agcatttcac taagataact 17460attaatggaa taccactggg tgttactgca
gtagtcaacg gtgatgtgac cgttgatgac 17520tcaaccccag gaacattgat tataacgcct
aaagatgctt cccaaactcc tgaacaattt 17580actcaaattg cattagctaa taacttcatt
caaatgacgc ctccggctga ttctagtgca 17640gattttacgt tgaccaccga acttaaaatg
gaagagcgag atcatgagta tacgtctagc 17700ggcctagagg atgaagatgg tggttatgtc
gaagccgatc cagatataac cggaatcatt 17760aacgttcaag tacgacctgt ggttgaacct
ggagatgccg acaacaagat tgtcgtttca 17820aacgaagatg gctctggaga tctcactacg
attacggctg atgctaatgg tgtcattaaa 17880tttacaacta acagtgataa ccaaacgact
gatactaacg gagacgaaat ctgggacggt 17940gaatacgtcg tccgatacca agaaacggat
ttaagcacag tagaagagca agtcgacgaa 18000gtgattgttc agctgactaa caccgatgga
agcgcgttat ctgatgatat tttagggcaa 18060cttttagtaa ctggtgcctc ttacgaaggc
ggtggccgat gggttgtgac caatgaagat 18120gcctttagcg tcagtgcgcc caatggatta
gatttcaccc ctgccaatga tgcggatgat 18180gtagctactg atttcaatga tatcaagatg
acaattttca ctttggtctc agatcctggt 18240gatgctaaca atgaaacgtc cgcccaagtg
caacgcaccg gagaagtaac gctttcttat 18300cctgaagtgc tgacggcacc tgacaaagtt
gccgcagata ttgcgattgt gccagacagt 18360gttatcgacg ctgttgagga tactcagctt
gatctcggcg cggcactcaa cggcattttg 18420agcttgacgg gtcgcgatga ttctactgac
caagtgacgg tgatcatcga tggcactctg 18480gtcattgatg ctacaacatc attcccaatt
agcctgtcgg gaacaagtga tgttgacttt 18540gtgaatggga aatatgttta cgagacgact
gttgagcagg gcgtagccgt cgattcatcg 18600ggtttgttat tgaatctgcc accaaactac
tctggtgact ttaggttgcc aatgaccatc 18660gtgaccaaag atttacaatc tggtgatgag
aagaccttag tgactgaagt tatcatcaaa 18720gtcgcaccag atgctgagac ggatccaacg
attgaggtga atgtcgtggg ttcgcttgat 18780gatgccttta atcctgttga taccgacggt
caagctgggc aagatccggt gggttacgaa 18840gacacctata ttcaactcga cttcaattcg
accatttcgg atcaggtttc cggcgtcgaa 18900ggcggccaag aagcgtttac gtccattact
ttaacgttgg acgacccttc tataggtgca 18960ttctatgaca acacgggtac ttcattaggt
acatctgtta cgtttaatca ggctgaaata 19020gcagcgggtg cactcgataa cgtgctcttt
agggcaatcg aaaattaccc aacgggtaat 19080gatattaacc aagtgcaggt taatgtcagc
ggtacagtca cagataccgc aacctataat 19140gatcctgctt ctcctgcggg tacggcaaca
gactcagata ctttctctac gagtgtcagc 19200tttgaagtcg ttcctgtggt cgatgacgtg
tctgtcactg gaccgggtag cgatcctgat 19260gttatcgaga ttactggcaa cgaagaccag
ctcatttctt tgtcggggac agggcctgta 19320tcgattgcac tgactgacct tgatggttca
gaacagtttg tatcgattaa gttcacagat 19380gtccctgatg gcttccaaat gcgtgcagat
gctggctcga catataccgt gaaaaataat 19440ggtaatggag agtggagtgt tcaactgcct
caagcttcgg ggttgtcatt cgatttaagt 19500gagatttcga tcttgccgcc taaaaacttc
agtggtaccg ctgagtttgg tgtggaagtc 19560ttcactcaag aatcgttgct gggtgtgcct
actgcggcgg caaacttgcc aagcttcaaa 19620ctgcatgtgg tacctgttgg tgacgatgtt
gataccaatc cgactgattc tgtaacaggc 19680aacgaaggcc aaaacattga tatcgaaatc
aatgcgacta ttttggataa agaattgtct 19740gcaacaggaa gcgggacgta taccgagaat
gcgcccgaaa cgcttcgagt tgaagtggcg 19800ggtgttcctc aagatgcttc tattttctat
ccagatggca cgacattggc tagctacgat 19860ccggcgacgc agctctggac tctcgatgtt
ccagctcagt cgttagataa gatcgtattt 19920aactctggcg aacataatag tgatacaggc
aatgtactgg gtatcaatgg tccactgcag 19980attacggtac gttcagtaga tactgatgct
gataatacag agtacctagg tacgccaacc 20040agcttcgatg tcgatctggt gattgatcct
attaacgatc aaccgatctt tgtgaacgta 20100acgaatattg aaacatcgga agacatcagt
gttgccatcg acaactttag tatctacgac 20160gtcgacgcaa actttgataa tccagatgct
ccgtatgaac tgacgcttaa agtcgaccaa 20220acactgccgg gagcgcaagg tgtgtttgag
tttaccagct ctcctgacgt gacgtttgta 20280ttgcaacctg acggctcatt ggtgattacc
ggtaaagaag ccgacattaa taccgcattg 20340actaatggag ctgtgacttt caaacccgac
ccagaccaga actacctcaa ccagactggt 20400ttagtcacaa tcaatgcaac gctcgatgat
ggtggtaata acggtttgat tgacgcggtt 20460gatccgaata ccgctcaaac caatcaaact
accttcacca ttaaggtgac ggaagtgaat 20520gacgctcctg tggcgactaa cgttgattta
ggctcgattg cggaagacgc tcaaatcgtg 20580attgttgaga gtgacttgat tgcagccagt
tctgatctag aaaaccataa tctcacagta 20640accggtgtga ctcttactca agggcaaggt
cagcttacac gctatgaaaa tgctggtggt 20700gctgatgacg cagcgattac ggggccattc
tggatattca ttgcagataa tgatttcaac 20760ggcgacgtta aattcaatta ctccattatc
gatgatggta ccaccaacgg tgtggatgat 20820tttaaaaccg atagcgctga aatcagcctt
gtagttactg aagtcaatga ccagccagtg 20880gcatcgaaca ttgatttggg caccatgctt
gaagaaggac agctggtcat taaagaggaa 20940gacctgattt ccgcaaccac tgatccggaa
aacgacacga ttactgtgaa cagtttggtg 21000ctcgatcaag gtcagggcca attacaacgc
tttgagaacg tgggcggtgc tgatgatgct 21060acgatcactg gcccgtactg ggtatttact
gcagccaacg aatacaacgg tgatgttaag 21120ttcacttata ccgttgagga cgatggtaca
accaacggcg ctgatgattt cttaacagat 21180accggcgaaa ttagcgttgt ggtaacggaa
gtgaatgatc aaccggtggc aacggatatc 21240gacttaggaa acatccttga agaagggcag
ttgatcatca aagaggaaga cttaattgct 21300gctacgagcg atccggaaaa cgacacgatt
accgtgacca atctggtgct cgacgaaggc 21360caaggccagt tacagcgctt tgagaacgtg
ggcggtgctg atgacgctat gattactggc 21420ccgtactgga tatttacggc tgctgatgaa
tacaacggta acgttaagtt cacctatacc 21480gtcgaggatg atggtacaac caacggcgct
aatgatttcc taacggatac tgcagagatc 21540acagcgattg tcgacggagt gaacgatacg
cctgttgtta atggtgacag tgtcactacg 21600attgttgacg aggatgctgg tcagctattg
agtggtatca atgtcagtga cccagattat 21660gtggatgcat tttctaatga cttgatgaca
gtcacgctga cagtggatta cggtacattg 21720aacgtatcac ttccggcagt gacgacagtg
atggtcaacg gcaacaacac tggttcggtt 21780atcttagttg gtactttgag tgacctgaat
gcgctgattg atacgccaac cagtccaaac 21840ggtgtctacc tcgatgcgag cttgtctcca
accaatagca ttggcttaga agtaatcgcc 21900aaagacagcg gtaacccttc tggtatcgcg
attgaaactg caccagtggt ttataatatc 21960gcagtgacac cagtcgctaa tgcgccaacc
ttgtctattg atccggcatt taactatgtg 22020agaaacatta cgaccagctc atctgtggtc
gctaatagtg gagtcgcttt agttggaatt 22080gtcgctgcat tgacggacat tactgaagag
ttaacgttga agatcagcga tgttccggat 22140ggtgttgatg taaccagtga tgtgggtacg
gtttcgttgg tgggtgatac ttggatagcg 22200accgctgatg cgatcgatag tctcagactc
gtagagcagt catcattagg taaaccgttg 22260accccgggta attacacctt gaaagttgag
gcgctatctg aagagactga caacaacgat 22320attgcgatat ctcaaaacat cgatctgaat
ctcaatattg ttgccaatcc aatagatctc 22380gatctgtctt ctgaaacaga cgatgtgcaa
cttttagcga gtaactttga tactaacctc 22440actggcggaa ctggaaatga ccgacttgta
ggtggagcgg gtgacgatac gctggttggc 22500ggtgacggta acgacacact cattggtggc
ggcggttccg atattctaac cggtggcaat 22560ggtatggatt cgtttgtatg gctcaatatt
gaagatggcg ttgaagacac cattaccgat 22620ttcagcctgt ctgaaggaga ccaaatcgac
ctacgagaag tattacctga gttgaagaat 22680acatctccag acatgtctgc attgctacaa
cagatagacg cgaaagtgga aggggatgat 22740attgagctta cgatcaagtc tgatggttta
ggcactacgg aacaggtgat tgtggttgaa 22800gaccttgctc ctcagctaac cttaagtggc
accatgcctt cggatatttt ggatgcgtta 22860gtgcaacaaa atgtcatcac tcacggttaa
22890
4 7629 PRT Vibrio splendidus 4
Met Leu
Cys Asp Asn Gly Gly Cys Met Asp Ile Glu Val Ser Arg Gln1 5
10 15Val Ala Val Val Glu Ala Thr Ser Gly
Asp Val Val Val Val Lys Pro20 25 30Asp
Gly Ser Ala Arg Lys Val Ser Val Gly Asp Thr Ile Arg Glu Asn35
40 45Glu Ile Val Ile Thr Ala Asn Lys Ser Glu Leu
Val Leu Gly Val Gln50 55 60Asn Asp Ser
Ile Pro Val Ala Glu Asn Cys Val Gly Cys Val Asp Glu65 70
75 80Asn Ala Ala Trp Val Asp Ala Pro
Ile Ala Gly Glu Val Asn Phe Asp85 90
95Leu Gln Gln Ala Asp Ala Glu Thr Phe Thr Glu Asp Asp Leu Ala Ala100
105 110Ile Gln Glu Ala Ile Leu Gly Gly Ala Asp
Pro Thr Gln Ile Leu Glu115 120 125Ala Thr
Ala Ala Gly Gly Gly Leu Gly Ser Ala Asn Ala Gly Phe Val130
135 140Thr Ile Asp Tyr Asn Tyr Thr Glu Thr His Pro Ser
Thr Phe Phe Glu145 150 155
160Thr Ala Gly Leu Ala Glu Gln Thr Val Asp Glu Asp Arg Glu Glu Phe165
170 175Arg Ser Ile Thr Arg Ser Ser Gly Gly
Gln Ser Ile Ser Glu Thr Leu180 185 190Thr
Glu Gly Ser Ile Ser Gly Asn Thr Tyr Pro Gln Ser Val Thr Thr195
200 205Thr Glu Thr Ile Ile Ala Gly Ser Leu Ala Leu
Ala Pro Asn Ser Phe210 215 220Ile Pro Glu
Thr Leu Ser Leu Ala Ser Leu Leu Ser Glu Leu Asn Ser225
230 235 240Asp Ile Thr Ser Ser Gly Gln
Ser Val Ile Phe Thr Tyr Asp Ala Thr245 250
255Thr Asn Ser Ile Val Gly Val Gln Asp Thr Asp Glu Val Leu Arg Ile260
265 270Asp Ile Asp Ala Val Ser Val Gly Asn
Asn Ile Glu Leu Ser Leu Thr275 280 285Thr
Thr Ile Ser Gln Pro Ile Asp His Val Pro Ser Val Gly Gly Gly290
295 300Gln Val Ser Tyr Thr Gly Asp Gln Ile Asp Ile
Ala Phe Asp Ile Gln305 310 315
320Gly Glu Asp Thr Ala Gly Asn Pro Leu Ala Thr Pro Val Asn Ala
Gln325 330 335Val Ser Val Phe Asp Gly Ile
Asp Pro Ser Val Glu Ser Val Asn Ile340 345
350Thr Asn Val Glu Thr Ser Ser Ala Ala Ile Glu Gly Thr Phe Ser Asn355
360 365Ile Gly Ser Asp Asn Leu Gln Ser Ala
Val Phe Asp Ala Ser Ala Leu370 375 380Asp
Gln Phe Asp Gly Leu Leu Ser Asp Asn Gln Asn Thr Leu Ala Arg385
390 395 400Leu Ser Asp Asp Gly Thr
Thr Ile Thr Leu Ser Ile Gln Gly Arg Gly405 410
415Glu Val Val Leu Thr Ile Ser Leu Asp Thr Asp Gly Thr Tyr Lys
Phe420 425 430Glu Gln Ser Asn Pro Ile Glu
Gln Val Gly Thr Asp Ser Leu Thr Phe435 440
445Ala Leu Pro Ile Thr Ile Thr Asp Phe Asp Gln Asp Val Val Thr Asn450
455 460Thr Ile Asn Ile Ala Ile Thr Asp Gly
Asp Ser Pro Val Ile Thr Asn465 470 475
480Val Asp Ser Ile Asp Val Asp Glu Ala Gly Ile Val Gly Gly
Ser Gln485 490 495Glu Gly Thr Ala Pro Val
Ser Gly Thr Gly Gly Ile Thr Ala Asp Ile500 505
510Phe Glu Ser Asp Ile Ile Asp His Tyr Glu Leu Glu Pro Thr Glu
Phe515 520 525Asn Thr Asn Gly Thr Leu Val
Ser Asn Gly Glu Ala Val Leu Leu Glu530 535
540Leu Ile Asp Glu Thr Asn Gly Val Arg Thr Tyr Glu Gly Tyr Val Glu545
550 555 560Val Asn Gly Ser
Arg Ile Thr Val Phe Asp Val Lys Ile Asp Ser Pro565 570
575Ser Leu Gly Asn Tyr Glu Phe Asn Leu Tyr Glu Glu Leu Ser
His Gln580 585 590Gly Ala Glu Asp Ala Leu
Leu Thr Phe Ala Leu Pro Ile Tyr Ala Val595 600
605Asp Ala Asp Gly Asp Arg Ser Ala Leu Ser Gly Gly Ser Asn Thr
Pro610 615 620Glu Ala Ala Glu Ile Leu Val
Asn Val Lys Asp Asp Val Val Glu Leu625 630
635 640Val Asp Lys Val Glu Ser Val Thr Glu Pro Thr Leu
Ala Gly Asp Thr645 650 655Ile Val Ser Tyr
Asn Leu Phe Asn Phe Glu Gly Ala Asp Gly Ser Thr660 665
670Ile Gln Ser Phe Asn Tyr Asp Gly Val Asp Tyr Ser Leu Asp
Gln Ser675 680 685Leu Leu Pro Asp Ala Thr
Gln Ile Phe Ser Phe Thr Glu Gly Val Val690 695
700Thr Ile Ser Leu Asn Gly Asp Phe Ser Phe Glu Val Ala Arg Asp
Ile705 710 715 720Asp His
Ser Ser Ser Glu Thr Ile Val Lys Gln Phe Ser Phe Leu Ala725
730 735Glu Asp Gly Asp Gly Asp Thr Asp Ser Ser Thr Leu
Glu Leu Ser Ile740 745 750Thr Asp Gly Gln
Asp Pro Ile Ile Asp Leu Ile Pro Pro Val Thr Leu755 760
765Ser Glu Thr Asn Leu Asn Asp Gly Ser Ala Pro Ser Gly Ser
Thr Val770 775 780Ser Ala Thr Glu Thr Ile
Thr Phe Thr Ala Gly Ser Asp Asp Val Ala785 790
795 800Ser Phe Arg Ile Glu Pro Thr Glu Phe Asn Val
Gly Gly Ala Leu Lys805 810 815Ser Asn Gly
Phe Ser Val Glu Ile Lys Glu Asp Ser Ala Asn Pro Gly820
825 830Thr Tyr Ile Gly Phe Ile Thr Asn Gly Ser Gly Ala
Glu Ile Pro Val835 840 845Phe Thr Ile Ala
Phe Ser Thr Ser Thr Leu Gly Glu Tyr Thr Phe Thr850 855
860Leu Leu Glu Ala Leu Asp His Val Asp Gly Leu Asp Lys Asn
Asp Leu865 870 875 880Ser
Phe Asp Leu Pro Ile Tyr Ala Val Asp Thr Asp Gly Asp Asp Ser885
890 895Leu Val Ser Gln Leu Asn Val Thr Ile Gly Asp
Asp Val Gln Ile Met900 905 910Gln Asp Gly
Thr Leu Asp Ile Thr Glu Pro Asn Leu Ala Asp Gly Thr915
920 925Ile Thr Thr Asn Thr Ile Asp Val Met Pro Asn Gln
Ser Ala Asp Gly930 935 940Ala Thr Ile Thr
Arg Phe Thr Tyr Asp Gly Val Val Asn Thr Leu Asp945 950
955 960Gln Ser Ile Ser Gly Glu Gln Gln Phe
Ser Phe Thr Glu Gly Glu Leu965 970 975Phe
Ile Thr Leu Glu Gly Glu Val Arg Phe Glu Pro Asn Arg Asp Leu980
985 990Asp His Ser Val Ser Glu Asp Ile Val Lys Ser
Ile Val Val Thr Ser995 1000 1005Ser Asp
Phe Asp Asn Asp Pro Val Thr Ser Thr Ile Thr Leu Thr Ile1010
1015 1020Thr Asp Gly Asp Asn Pro Thr Ile Asp Val Ile Pro
Ser Val Thr Leu1025 1030 1035
1040Ser Glu Ile Asn Leu Ser Asp Gly Ser Ala Pro Ser Gly Ser Ala Val1045
1050 1055Ser Ser Thr Gln Thr Ile Thr Phe Thr
Asn Gln Ser Asp Asp Val Val1060 1065
1070Arg Phe Arg Ile Glu Ser Thr Glu Phe Asn Thr Asn Asp Asp Leu Lys1075
1080 1085Ser Asn Gly Leu Ala Val Glu Leu Arg
Glu Asp Pro Ala Gly Ser Gly1090 1095
1100Asp Tyr Ile Gly Phe Thr Thr Ser Ala Thr Asn Val Glu Thr Pro Val1105
1110 1115 1120Phe Thr Leu Ser
Phe Asn Ser Gly Ser Leu Gly Glu Tyr Thr Phe Thr1125 1130
1135Leu Ile Glu Ala Leu Asp His Gln Asp Ala Arg Gly Asn Asn
Asp Leu1140 1145 1150Ser Phe Asp Leu Pro
Val Tyr Ala Val Asp Ser Asp Gly Asp Asp Ser1155 1160
1165Leu Val Ser Pro Leu Asn Val Thr Ile Gly Asp Asp Val Gln Ile
Met1170 1175 1180Gln Asp Ser Thr Leu Asp
Ile Val Glu Pro Thr Val Ala Asp Leu Ala1185 1190
1195 1200Ala Gly Thr Val Thr Thr Asn Thr Ile Asp Val
Met Pro Asn Gln Ser1205 1210 1215Ala Asp
Gly Ala Thr Val Thr Gln Phe Thr Tyr Asp Gly Gln Leu Arg1220
1225 1230Thr Leu Asp Gln Asn Asp Asn Gly Glu Gln Gln Phe
Ser Phe Thr Glu1235 1240 1245Gly Glu Leu
Phe Ile Thr Leu Gln Gly Asp Val Arg Phe Glu Pro Asn1250
1255 1260Arg Asn Leu Asp His Thr Leu Ser Glu Asp Ile Val
Lys Ser Ile Val1265 1270 1275
1280Val Thr Ser Ser Asp Ser Asp Asn Asp Val Leu Thr Ser Thr Val Thr1285
1290 1295Leu Thr Ile Thr Asp Gly Asp Ile Pro
Thr Ile Asp Asn Val Pro Thr1300 1305
1310Val Asn Leu Ser Glu Thr Asn Leu Ser Asp Gly Ser Ala Pro Ser Gly1315
1320 1325Ser Ala Val Ser Ser Thr Gln Thr Ile
Thr Tyr Thr Thr Gln Ser Asp1330 1335
1340Asp Val Thr Ser Phe Arg Ile Glu Pro Thr Glu Phe Asn Val Gly Gly1345
1350 1355 1360Ala Leu Thr Ser
Asn Gly Leu Ala Val Glu Leu Lys Ala Asp Pro Thr1365 1370
1375Thr Pro Gly Gly Tyr Ile Gly Phe Val Thr Asp Gly Ser Asn
Val Glu1380 1385 1390Thr Asn Val Phe Thr
Ile Ser Phe Ser Asp Thr Asn Leu Gly Gln Tyr1395 1400
1405Thr Phe Thr Leu Leu Glu Ala Leu Asp His Val Asp Gly Leu Ala
Asn1410 1415 1420Asn Asp Leu Thr Phe Asp
Leu Pro Val Tyr Ala Val Asp Ser Asp Gly1425 1430
1435 1440Asp Asp Ser Leu Val Ser Gln Leu Asn Val Thr
Ile Gly Asp Asp Val1445 1450 1455Gln Ile
Met Gln Gly Gly Thr Leu Asp Ile Thr Glu Pro Asn Leu Ala1460
1465 1470Asp Gly Thr Ile Thr Thr Asn Thr Ile Asp Val Met
Pro Glu Gln Ser1475 1480 1485Ala Asp Gly
Ala Thr Ile Thr Gln Phe Thr Tyr Asp Gly Gln Val Arg1490
1495 1500Thr Leu Asp Gln Thr Asp Asn Gly Glu Gln Gln Phe
Ser Phe Thr Glu1505 1510 1515
1520Gly Glu Leu Phe Ile Thr Leu Gln Gly Asp Val Arg Phe Glu Pro Asn1525
1530 1535Arg Asn Leu Asp His Thr Ala Ser Glu
Asp Ile Val Lys Ser Ile Val1540 1545
1550Val Thr Ser Ser Asp Leu Asp Asn Asp Val Val Thr Ser Thr Val Thr1555
1560 1565Leu Thr Ile Thr Asp Gly Asp Ile Pro
Thr Ile Asp Ala Val Pro Ser1570 1575
1580Val Thr Leu Ser Glu Ile Asn Leu Ser Asp Gly Ser Ala Pro Ser Gly1585
1590 1595 1600Thr Ala Val Ser
Gln Thr Glu Thr Ile Thr Phe Thr Asn Gln Ser Asp1605 1610
1615Asp Val Thr Ser Phe Arg Ile Glu Pro Ile Glu Phe Asn Val
Gly Gly1620 1625 1630Ala Leu Lys Ser Asn
Gly Phe Ala Val Glu Ile Lys Glu Asp Ser Ala1635 1640
1645Asn Pro Gly Thr Tyr Ile Gly Phe Ile Thr Asn Gly Ser Gly Ala
Glu1650 1655 1660Ile Pro Val Phe Thr Ile
Ala Phe Ser Thr Ser Ser Leu Gly Glu Tyr1665 1670
1675 1680Thr Phe Thr Leu Leu Glu Ala Leu Asp His Val
Asp Gly Leu Asp Lys1685 1690 1695Asn Asp
Leu Ser Phe Asp Leu Pro Val Tyr Ala Val Asp Thr Asp Gly1700
1705 1710Asp Asp Ser Leu Val Ser Gln Leu Asn Val Thr Ile
Gly Asp Asp Val1715 1720 1725Gln Ile Met
Gln Asp Gly Thr Leu Asp Ile Ile Glu Pro Asn Leu Ala1730
1735 1740Asp Gly Thr Ile Thr Thr Ser Thr Ile Asp Val Met
Pro Asn Gln Ser1745 1750 1755
1760Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr Asp Gly Gln Leu Arg1765
1770 1775Thr Leu Asp Gln Asn Asp Thr Gly Glu
Gln Gln Phe Ser Phe Thr Glu1780 1785
1790Gly Glu Leu Phe Ile Thr Leu Glu Gly Glu Val Arg Phe Glu Pro Asn1795
1800 1805Arg Asp Leu Asp His Thr Ala Ser Glu
Asp Ile Val Lys Ser Ile Val1810 1815
1820Val Thr Ser Ser Asp Phe Asp Asn Asp Ser Leu Thr Ser Thr Val Thr1825
1830 1835 1840Leu Thr Ile Thr
Asp Gly Asp Asn Pro Thr Ile Asp Val Ile Pro Ser1845 1850
1855Val Thr Leu Ser Glu Thr Asn Leu Ser Asp Gly Ser Ala Pro
Ser Gly1860 1865 1870Ser Ala Val Ser Ser
Thr Gln Thr Ile Thr Phe Thr Asn Gln Ser Asp1875 1880
1885Asp Val Val Arg Phe Arg Ile Glu Pro Thr Glu Phe Asn Thr Asn
Asp1890 1895 1900Asp Leu Lys Ser Asn Gly
Leu Ala Val Glu Leu Arg Glu Asp Pro Ala1905 1910
1915 1920Gly Ser Gly Asp Tyr Ile Gly Phe Thr Thr Ser
Ala Thr Asn Val Glu1925 1930 1935Thr Thr
Val Phe Thr Leu Ser Phe Ser Ser Thr Thr Leu Gly Glu Tyr1940
1945 1950Thr Phe Thr Leu Leu Glu Ala Leu Asp His Gln Asp
Ala Arg Gly Asn1955 1960 1965Asn Asp Leu
Ser Phe Glu Leu Pro Val Tyr Ala Val Asp Ser Asp Gly1970
1975 1980Asp Asp Ser Leu Met Ser Pro Leu Asn Val Thr Ile
Gly Asp Asp Val1985 1990 1995
2000Gln Ile Met Gln Asp Gly Thr Leu Asp Ile Val Glu Pro Thr Val Ala2005
2010 2015Asp Leu Ala Ala Gly Ile Val Thr Thr
Asn Thr Ile Asp Val Met Pro2020 2025
2030Asn Gln Ser Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr Asp Gly2035
2040 2045Gln Leu Arg Thr Leu Asp Gln Asn Asp
Asn Gly Glu Gln Gln Phe Ser2050 2055
2060Phe Thr Glu Gly Glu Leu Phe Ile Thr Leu Glu Gly Glu Val Arg Phe2065
2070 2075 2080Glu Pro Asn Arg
Asn Leu Asp His Thr Leu Asn Glu Asp Ile Val Lys2085 2090
2095Ser Ile Val Val Thr Ser Ser Asp Ser Asp Asn Asp Val Leu
Thr Ser2100 2105 2110Thr Val Thr Leu Thr
Ile Thr Asp Gly Asp Ile Pro Thr Ile Asp Asn2115 2120
2125Val Pro Thr Val Ser Leu Ser Glu Thr Ser Leu Ser Asp Gly Ser
Ser2130 2135 2140Pro Ser Gly Ser Ala Val
Ser Ser Thr Gln Thr Ile Thr Tyr Thr Thr2145 2150
2155 2160Gln Ser Asp Asp Val Thr Ser Phe Arg Ile Glu
Pro Thr Glu Phe Asn2165 2170 2175Val Gly
Gly Ala Leu Lys Ser Asn Gly Leu Ala Val Glu Leu Lys Ala2180
2185 2190Asp Pro Thr Thr Pro Gly Gly Tyr Ile Gly Phe Val
Thr Asp Gly Ser2195 2200 2205Asn Val Glu
Thr Asn Val Phe Thr Ile Ser Phe Ser Asp Thr Asn Leu2210
2215 2220Gly Gln Tyr Thr Phe Thr Leu Leu Glu Ala Leu Asp
His Ala Asp Ser2225 2230 2235
2240Leu Ala Asn Asn Asp Leu Ser Phe Asp Leu Pro Val Tyr Ala Val Asp2245
2250 2255Ser Asp Gly Asp Asp Ser Leu Val Ser
Gln Leu Asn Val Thr Ile Gly2260 2265
2270Asp Asp Val Gln Ile Met Gln Gly Gly Thr Leu Asp Ile Thr Glu Pro2275
2280 2285Asn Leu Ala Asp Gly Thr Thr Thr Thr
Asn Thr Ile Asp Val Met Pro2290 2295
2300Glu Gln Ser Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr Asp Gly2305
2310 2315 2320Gln Val Arg Thr
Leu Asp Gln Thr Asp Asn Gly Glu Gln Gln Phe Ser2325 2330
2335Phe Thr Glu Gly Glu Leu Phe Ile Thr Leu Gln Gly Asp Val
Arg Phe2340 2345 2350Glu Pro Asn Arg Asn
Leu Asp His Thr Ala Ser Glu Asp Ile Val Lys2355 2360
2365Ser Ile Val Val Thr Ser Ser Asp Ser Asp Asn Asp Val Val Thr
Ser2370 2375 2380Thr Val Thr Leu Thr Ile
Thr Asp Gly Asp Leu Pro Thr Ile Asp Ala2385 2390
2395 2400Val Pro Ser Val Thr Leu Ser Glu Thr Asn Leu
Ser Asp Gly Ser Ala2405 2410 2415Pro Ser
Gly Ser Ala Val Ser Gln Thr Glu Thr Ile Thr Phe Thr Asn2420
2425 2430Gln Ser Asp Asp Val Ala Ser Phe Arg Ile Glu Pro
Thr Glu Phe Asn2435 2440 2445Val Gly Gly
Ala Leu Lys Ser Asn Gly Phe Ala Val Glu Ile Lys Glu2450
2455 2460Asp Ser Ala Asn Pro Gly Thr Tyr Ile Gly Phe Ile
Ala Asn Gly Ser2465 2470 2475
2480Ser Ala Glu Ile Pro Val Phe Thr Ile Ala Phe Ser Thr Ser Thr Leu2485
2490 2495Gly Glu Tyr Thr Phe Thr Leu Leu Glu
Ala Leu Asp His Ala Asp Gly2500 2505
2510Leu Asp Lys Asn Asp Leu Ser Phe Glu Leu Pro Val Tyr Ala Val Asp2515
2520 2525Thr Asp Gly Asp Asp Ser Leu Val Ser
Gln Leu Asn Val Thr Ile Gly2530 2535
2540Asp Asp Val Gln Ile Met Gln Asp Gly Thr Leu Asp Val Ile Glu Pro2545
2550 2555 2560Asn Leu Ala Asp
Gly Thr Ile Thr Thr Asn Thr Ile Asp Val Met Pro2565 2570
2575Glu Gln Ser Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr
Asp Gly2580 2585 2590Gln Leu Arg Thr Leu
Asp Gln Asn Asp Thr Gly Glu Gln Gln Phe Ser2595 2600
2605Phe Thr Glu Gly Glu Leu Phe Ile Thr Leu Glu Gly Glu Val Arg
Phe2610 2615 2620Glu Pro Asn Arg Asp Leu
Asp His Ser Val Ser Glu Asp Ile Val Lys2625 2630
2635 2640Ser Ile Val Val Thr Ser Ser Asp Phe Asp Asn
Asp Pro Val Thr Ser2645 2650 2655Ala Ile
Thr Leu Thr Ile Thr Asp Gly Asp Asn Pro Thr Ile Asp Ser2660
2665 2670Val Pro Ser Val Val Leu Glu Glu Ala Asp Leu Thr
Asp Gly Ser Ser2675 2680 2685Pro Ser Gly
Ser Ala Val Ser Gln Thr Glu Thr Ile Thr Phe Thr Asn2690
2695 2700Gln Ser Asp Asp Val Glu Lys Phe Arg Leu Glu Pro
Ser Glu Phe Asn2705 2710 2715
2720Thr Asn Asn Ala Leu Lys Ser Asp Gly Leu Ile Ile Glu Ile Arg Glu2725
2730 2735Glu Pro Thr Gly Ser Gly Asn Tyr Ile
Gly Phe Thr Thr Asp Ile Ser2740 2745
2750Asn Val Glu Thr Thr Val Phe Thr Leu Asp Phe Ser Ser Thr Thr Leu2755
2760 2765Gly Glu Tyr Thr Phe Thr Leu Leu Glu
Ala Ile Asp His Thr Pro Val2770 2775
2780Gln Gly Asn Asn Asp Leu Thr Phe Asn Leu Pro Val Tyr Ala Val Asp2785
2790 2795 2800Ser Asp Gly Asp
Asp Ser Leu Met Ser Ser Leu Ser Val Thr Ile Thr2805 2810
2815Asp Asp Val Gln Val Met Val Ser Gly Ser Leu Ser Ile Glu
Glu Pro2820 2825 2830Thr Val Ala Asp Leu
Ala Ala Gly Thr Pro Thr Thr Ser Val Phe Asp2835 2840
2845Val Leu Thr Ser Ala Ser Ala Asp Gly Ala Thr Ile Thr Gln Phe
Thr2850 2855 2860Tyr Asp Gly Gly Ala Val
Leu Thr Leu Asp Gln Asn Asp Thr Gly Glu2865 2870
2875 2880Gln Lys Phe Val Val Ala Asp Gly Ala Leu Tyr
Ile Thr Leu Gln Gly2885 2890 2895Asp Ile
Arg Phe Glu Pro Ser Arg Asn Leu Asp His Thr Gly Gly Asp2900
2905 2910Ile Val Lys Ser Ile Val Val Thr Ser Ser Asp Ser
Asp Ser Asp Leu2915 2920 2925Val Ser Ser
Thr Val Thr Leu Thr Ile Thr Asp Gly Asp Ile Pro Thr2930
2935 2940Ile Asp Thr Val Pro Ser Val Thr Leu Ser Glu Thr
Asn Leu Ser Asp2945 2950 2955
2960Gly Ser Ala Pro Asn Ala Ser Ala Val Ser Ser Thr Gln Thr Ile Thr2965
2970 2975Phe Thr Asn Gln Ser Asp Asp Val Thr
Ser Phe Arg Ile Glu Pro Thr2980 2985
2990Asp Phe Asn Val Gly Gly Ala Leu Lys Ser Asn Gly Leu Ala Val Glu2995
3000 3005Leu Lys Ala Asp Pro Thr Thr Pro Gly
Gly Tyr Ile Gly Phe Val Thr3010 3015
3020Asp Gly Ser Asn Val Glu Thr Asn Val Phe Thr Ile Ser Phe Ser Asp3025
3030 3035 3040Thr Asn Leu Gly
Gln Tyr Thr Phe Thr Leu Leu Glu Ala Leu Asp His3045 3050
3055Val Asp Gly Leu Val Lys Asn Asp Leu Thr Phe Asp Leu Pro
Val Tyr3060 3065 3070Ala Val Asp Ser Asp
Gly Asp Asp Ser Leu Val Ser Gln Leu Asn Val3075 3080
3085Thr Ile Gly Asp Asp Val Gln Val Met Gln Asn Gln Ala Leu Asn
Ile3090 3095 3100Ile Glu Pro Thr Val Ala
Asp Leu Ala Ala Gly Thr Pro Thr Thr Ala3105 3110
3115 3120Thr Val Asp Val Met Pro Ser Gln Ser Ala Asp
Gly Ala Thr Ile Thr3125 3130 3135Gln Phe
Thr Tyr Asp Gly Gly Ala Ala Ile Thr Leu Asp Gln Asn Asp3140
3145 3150Thr Gly Glu Gln Lys Phe Val Phe Thr Glu Gly Ser
Leu Phe Ile Thr3155 3160 3165Leu Gln Gly
Glu Val Arg Phe Glu Pro Asn Arg Asn Leu Asn His Thr3170
3175 3180Ala Ser Glu Asp Ile Val Lys Ser Ile Val Val Thr
Ser Ser Asp Leu3185 3190 3195
3200Asp Asn Asp Val Leu Thr Ser Thr Val Thr Leu Thr Ile Thr Asp Gly3205
3210 3215Asp Ile Pro Thr Ile Asp Ala Val Pro
Ser Val Thr Leu Ser Glu Thr3220 3225
3230Asn Leu Ser Asp Gly Ser Ala Pro Ser Ser Ser Ala Val Ser Gln Thr3235
3240 3245Glu Thr Ile Thr Phe Ile Asn Gln Ser
Asp Asp Val Ala Ser Phe Arg3250 3255
3260Ile Glu Pro Thr Glu Phe Asn Val Gly Gly Ala Leu Lys Ser Asn Gly3265
3270 3275 3280Phe Ala Val Glu
Ile Lys Glu Asp Ser Ala Asn Pro Gly Thr Tyr Ile3285 3290
3295Gly Phe Ile Thr Asp Gly Ser Asn Thr Glu Val Pro Val Phe
Thr Ile3300 3305 3310Ala Phe Ser Thr Ser
Thr Leu Gly Glu Tyr Thr Phe Thr Leu Leu Glu3315 3320
3325Ala Leu Asp His Ala Asn Gly Leu Asp Lys Asn Asp Leu Ser Phe
Asp3330 3335 3340Leu Pro Val Tyr Ala Val
Asp Ser Asp Gly Asp Asp Ser Leu Val Ser3345 3350
3355 3360Gln Leu Asn Val Thr Ile Gly Asp Asp Val Gln
Ile Met Gln Asp Gly3365 3370 3375Thr Leu
Asp Ile Thr Glu Pro Asn Leu Ala Asp Gly Thr Ile Thr Thr3380
3385 3390Asn Thr Ile Asp Val Met Pro Asn Gln Ser Ala Asp
Gly Ala Thr Ile3395 3400 3405Thr Glu Phe
Ser Phe Gly Gly Ile Val Lys Thr Leu Asp Gln Ser Ile3410
3415 3420Val Gly Glu Gln Gln Phe Ser Phe Thr Glu Gly Glu
Leu Phe Ile Thr3425 3430 3435
3440Leu Gln Gly Gln Val Arg Phe Glu Pro Asn Arg Asp Leu Asp His Ser3445
3450 3455Ala Ser Glu Asp Ile Val Lys Ser Ile
Val Val Thr Ser Ser Asp Phe3460 3465
3470Asp Asn Asp Pro Val Thr Ser Thr Val Thr Leu Thr Ile Thr Asp Gly3475
3480 3485Asp Ile Pro Thr Ile Asp Ala Val Pro
Ser Val Thr Leu Ser Glu Thr3490 3495
3500Asn Leu Ala Asp Gly Ser Ala Pro Ser Gly Ser Ala Val Ser Gln Thr3505
3510 3515 3520Glu Thr Ile Thr
Phe Thr Asn Gln Ser Asp Asp Val Val Arg Phe Arg3525 3530
3535Leu Glu Pro Thr Glu Phe Asn Thr Asn Asp Ala Leu Lys Ser
Asn Gly3540 3545 3550Leu Ala Val Glu Leu
Arg Glu Glu Pro Gln Gly Ser Gly Gln Tyr Ile3555 3560
3565Gly Phe Thr Thr Ser Ser Ser Asn Val Glu Thr Thr Val Phe Thr
Leu3570 3575 3580Asp Phe Asn Ser Gly Thr
Leu Gly Glu Tyr Thr Phe Thr Leu Ile Glu3585 3590
3595 3600Ala Leu Asp His Gln Asp Ala Arg Gly Asn Asn
Asp Leu Ser Phe Asn3605 3610 3615Leu Pro
Val Tyr Ala Val Asp Ser Asp Gly Asp Asp Ser Leu Val Ser3620
3625 3630Gln Leu Gly Val Thr Ile Gly Asp Asp Val Gln Leu
Met Gln Asp Gly3635 3640 3645Thr Ile Thr
Ser Arg Glu Pro Ala Ala Ser Val Glu Thr Ser Asn Thr3650
3655 3660Phe Asp Val Met Pro Asn Gln Ser Ala Asp Gly Ala
Lys Val Thr Ser3665 3670 3675
3680Phe Val Phe Asp Gly Lys Thr Ala Glu Ser Leu Asp Leu Asn Val Asn3685
3690 3695Gly Glu Gln Glu Phe Val Phe Thr Glu
Gly Ser Val Phe Ile Thr Thr3700 3705
3710Glu Gly Glu Ile Arg Phe Glu Pro Val Arg Asn Gln Asn His Ala Gly3715
3720 3725Gly Asp Ile Thr Lys Ser Ile Glu Val
Thr Ser Val Asp Leu Asp Gly3730 3735
3740Asp Ile Val Thr Ser Thr Val Thr Leu Lys Ile Val Asp Gly Asp Leu3745
3750 3755 3760Pro Thr Ile Asp
Leu Val Pro Gly Ile Thr Leu Ser Glu Val Asp Leu3765 3770
3775Ala Asp Gly Ser Val Pro Thr Gly Asn Pro Val Thr Met Thr
Gln Thr3780 3785 3790Ile Thr Tyr Thr Ala
Gly Ser Asp Asp Val Ser His Phe Arg Ile Asp3795 3800
3805Pro Thr Gln Phe Asn Thr Ser Gly Val Leu Lys Ser Asn Gly Leu
Asp3810 3815 3820Val Glu Ile Lys Glu Gln
Pro Ala Asn Ser Gly Asn Tyr Ile Gly Phe3825 3830
3835 3840Val Lys Asp Gly Ser Asn Val Glu Thr Asn Val
Phe Thr Ile Ser Phe3845 3850 3855Ser Thr
Ser Asn Leu Gly Gln Tyr Thr Phe Thr Leu Leu Glu Ala Leu3860
3865 3870Asp His Val Asp Gly Leu Gln Asn Asn Ile Leu Ser
Phe Asp Val Pro3875 3880 3885Val Leu Ala
Val Asp Ala Asp Gly Asp Asp Ser Ala Met Ser Pro Met3890
3895 3900Thr Val Ala Ile Thr Asp Asp Val Gln Gly Val Gln
Asp Gly Thr Leu3905 3910 3915
3920Ser Ile Thr Glu Pro Ser Leu Ala Asp Leu Ala Ser Gly Thr Pro Pro3925
3930 3935Thr Thr Ala Ile Ile Asp Val Met Pro
Thr Gln Ser Ala Asp Gly Ala3940 3945
3950Lys Val Thr Gln Phe Thr Tyr Asp Gly Gly Thr Ala Val Thr Leu Asp3955
3960 3965Pro Ser Ile Ala Thr Glu Gln Val Phe
Thr Val Thr Asp Gly Leu Leu3970 3975
3980Tyr Ile Thr Ile Glu Gly Glu Val Arg Phe Glu Pro Ser Arg Asp Leu3985
3990 3995 4000Asp His Ser Ser
Gly Asp Ile Val Arg Thr Ile Val Val Thr Thr Ser4005 4010
4015Asp Phe Asp Asn Asp Thr Asp Thr Ala Asp Val Thr Leu Thr
Ile Lys4020 4025 4030Asp Gly Ile Asn Pro
Val Ile Asn Val Val Pro Asp Val Asn Leu Ser4035 4040
4045Glu Val Asn Leu Ala Asp Gly Ser Thr Pro Ser Gly Ser Ala Val
Ser4050 4055 4060Ser Thr His Thr Ile Thr
Tyr Thr Glu Gly Ser Asp Asp Phe Ser His4065 4070
4075 4080Phe Arg Ile Ala Thr Asn Glu Phe Asn Pro Gly
Asp Leu Leu Lys Ser4085 4090 4095Ser Gly
Leu Val Val Gln Leu Lys Glu Asp Pro Ala Ser Ala Gly Asp4100
4105 4110Tyr Ile Gly Tyr Thr Asp Asp Gly Met Gly Asn Val
Thr Asp Val Phe4115 4120 4125Thr Ile Ser
Phe Asp Ser Ala Asn Lys Ala Gln Phe Thr Phe Thr Leu4130
4135 4140Ile Glu Ala Leu Asp His Leu Asp Gly Val Leu Tyr
Asn Asp Leu Thr4145 4150 4155
4160Phe Arg Leu Pro Ile Tyr Ala Val Asp Thr Asp Asp Ser Glu Ser Thr4165
4170 4175Lys Arg Asp Val Val Val Thr Ile Glu
Asp Asp Ile Gln Gln Met Gln4180 4185
4190Asp Gly Phe Leu Thr Ile Thr Glu Pro Asn Ser Gly Thr Pro Thr Thr4195
4200 4205Thr Thr Val Asp Val Met Pro Ile Pro
Ser Ala Asp Gly Ala Thr Ile4210 4215
4220Thr Gln Phe Thr Tyr Asp Gly Gly Ser Pro Ile Thr Leu Asn Gln Ser4225
4230 4235 4240Ile Ser Gly Glu
Gln Glu Phe Val Phe Thr Glu Gly Ser Leu Phe Val4245 4250
4255Thr Leu Asp Gly Asp Val Arg Phe Glu Pro Asn Arg Asn Leu
Asp His4260 4265 4270Ser Ala Gly Asp Ile
Val Lys Ser Ile Val Phe Thr Ser Ser Asp Phe4275 4280
4285Asp Asn Asp Ile Phe Ser Ser Lys Val Thr Leu Thr Ile Val Asp
Gly4290 4295 4300Asp Gly Pro Thr Ile Asp
Val Val Pro Gly Val Ala Leu Ser Glu Ser4305 4310
4315 4320Leu Leu Ala Asp Gly Ser Thr Pro Ser Val Asn
Pro Val Ser Met Thr4325 4330 4335Gln Thr
Ile Thr Ser Leu Ala Ser Ser Asp Asp Ile Ala Glu Ile Val4340
4345 4350Val Glu Val Gly Leu Phe Asn Thr Asn Gly Ala Leu
Lys Ser Asp Gly4355 4360 4365Leu Ser Leu
Ser Leu Arg Glu Asp Pro Val Asn Ser Gly Asp Tyr Ile4370
4375 4380Ala Phe Thr Thr Asn Gly Ser Gly Val Glu Lys Val
Ile Phe Thr Leu4385 4390 4395
4400Asp Phe Asp Asp Thr Asn Pro Ser Gln Tyr Thr Phe Thr Leu Leu Glu4405
4410 4415Arg Leu Asp His Val Asp Gly Leu Gly
Asn Asn Asp Leu Ser Phe Asp4420 4425
4430Leu Ser Val Tyr Ala Glu Asp Thr Asp Gly Asp Ile Ser Ala Ser Lys4435
4440 4445Pro Leu Thr Val Thr Ile Thr Asp Asp
Val Gln Leu Met Gln Ser Gly4450 4455
4460Ala Leu Asn Ile Thr Glu Pro Thr Thr Gly Thr Pro Thr Thr Ala Val4465
4470 4475 4480Phe Asp Val Met
Pro Ala Gln Ser Ala Asp Gly Ala Thr Ile Thr Lys4485 4490
4495Phe Thr Tyr Gly Ser Gln Pro Glu Glu Ser Leu Val Gln Thr
Val Thr4500 4505 4510Gly Glu Gln Glu Phe
Val Phe Thr Glu Gly Ser Leu Phe Ile Asn Leu4515 4520
4525Glu Gly Asp Val Arg Phe Glu Pro Asn Arg Asn Leu Asp His Ser
Gly4530 4535 4540Gly Asn Ile Val Lys Thr
Ile Thr Val Thr Ser Glu Asp Lys Asp Gly4545 4550
4555 4560Asp Ile Val Thr Ser Thr Val Thr Leu Thr Ile
Val Asp Gly Ala Pro4565 4570 4575Pro Val
Ile Asp Thr Val Pro Thr Val Ala Leu Glu Glu Ala Asn Leu4580
4585 4590Val Asp Gly Ser Ser Pro Gly Leu Pro Val Ser Gln
Thr Glu Ile Ile4595 4600 4605Thr Phe Thr
Ala Gly Ser Asp Asp Val Ser His Phe Arg Ile Asp Pro4610
4615 4620Ala Gln Phe Asn Thr Ser Gly Asp Leu Lys Ala Asp
Gly Leu Val Val4625 4630 4635
4640Gln Leu Lys Glu Asp Pro Leu Asn Ser Asp Asn Tyr Ile Gly Tyr Val4645
4650 4655Glu Ser Gly Gly Val Gln Thr Asp Ile
Phe Thr Ile Thr Phe Ser Ser4660 4665
4670Val Val Leu Gly Glu Tyr Thr Phe Thr Leu Leu Glu Glu Leu Asp His4675
4680 4685Leu Pro Val Gln Gly Asn Asn Asp Gln
Ile Phe Thr Leu Pro Val Ile4690 4695
4700Ala Val Asp Lys Asp Asn Thr Asp Ser Ala Val Lys Pro Leu Thr Val4705
4710 4715 4720Thr Ile Thr Asp
Asp Val Pro Thr Ile Thr Asp Thr Thr Gly Ala Ser4725 4730
4735Thr Phe Val Val Asp Glu Asp Asp Leu Gly Thr Leu Ala Gln
Ala Thr4740 4745 4750Gly Ser Phe Val Thr
Thr Glu Gly Ala Asp Gln Val Glu Val Tyr Glu4755 4760
4765Leu Arg Asn Ile Ser Thr Leu Glu Ala Thr Leu Ser Ser Gly Ser
Glu4770 4775 4780Gly Ile Lys Ile Thr Glu
Ile Thr Gly Ala Ala Asn Thr Thr Thr Tyr4785 4790
4795 4800Gln Gly Ala Thr Asp Pro Ser Gly Thr Pro Ile
Phe Thr Leu Val Leu4805 4810 4815Thr Asp
Asp Gly Ala Tyr Thr Phe Thr Leu Leu Gly Pro Leu Asn His4820
4825 4830Ala Thr Thr Pro Ser Asn Leu Asp Thr Leu Thr Ile
Pro Phe Asp Val4835 4840 4845Val Ala Val
Asp Gly Asp Gly Asp Asp Ser Asn Gln Tyr Val Leu Pro4850
4855 4860Ile Glu Val Leu Asp Asp Val Pro Val Met Thr Ala
Pro Thr Gly Glu4865 4870 4875
4880Thr Val Val Asp Glu Asp Asp Leu Thr Gly Ile Gly Ser Asp Gln Ser4885
4890 4895Glu Asp Thr Ile Ile Asn Gly Leu Phe
Thr Val Asp Glu Gly Ala Asp4900 4905
4910Gly Val Val Leu Tyr Glu Leu Val Asp Glu Asp Leu Val Leu Thr Gly4915
4920 4925Leu Thr Ser Asp Gly Glu Ser Leu Glu
Trp Leu Ala Val Ser Gln Asn4930 4935
4940Gly Thr Thr Phe Thr Tyr Val Ala Gln Thr Ala Thr Ser Asn Glu Ala4945
4950 4955 4960Val Phe Glu Ile
Ile Phe Asp Thr Ser Asp Asn Ser Tyr Gln Phe Glu4965 4970
4975Leu Phe Lys Pro Leu Lys His Pro Asp Gly Ala Asn Glu Asn
Ala Ile4980 4985 4990Asp Leu Asp Phe Ser
Ile Val Ala Glu Asp Phe Asp Gln Asp Gln Ser4995 5000
5005Asp Ala Ile Gly Leu Lys Ile Thr Val Thr Asp Asp Val Pro Leu
Val5010 5015 5020Thr Thr Gln Ser Ile Thr
Arg Leu Glu Gly Gln Gly Tyr Gly Asn Ser5025 5030
5035 5040Lys Val Asp Met Phe Ala Asn Ala Thr Asp Val
Gly Ala Asp Gly Ala5045 5050 5055Val Leu
Ser Arg Ile Glu Gly Ile Ser Asn Asn Gly Ala Asp Ile Val5060
5065 5070Phe Arg Ser Gly Asn Asn Gly Pro Tyr Ser Ser Gly
Phe Asp Leu Asn5075 5080 5085Ser Gly Ser
Gln Gln Val Arg Val Tyr Glu Gln Thr Asn Gly Gly Ala5090
5095 5100Asp Thr Arg Glu Leu Gly Arg Leu Arg Ile Asn Ser
Asn Gly Glu Val5105 5110 5115
5120Glu Phe Arg Ala Asn Gly Tyr Leu Asp His Asp Gly Asp Asp Thr Ile5125
5130 5135Asp Phe Ser Ile Asn Val Ile Ala Thr
Asp Gly Asp Leu Asp Thr Ser5140 5145
5150Glu Thr Pro Leu Asp Ile Thr Ile Thr Asp Arg Asp Ser Thr Arg Ile5155
5160 5165Ala Leu Lys Val Thr Thr Phe Glu Asp
Ala Gly Arg Asp Ser Thr Ile5170 5175
5180Pro Tyr Ala Thr Gly Asp Glu Pro Thr Leu Glu Asn Val Gln Asp Asn5185
5190 5195 5200Gln Asn Gly Leu
Pro Asn Ala Pro Ala Gln Val Ala Leu Gln Val Ser5205 5210
5215Leu Tyr Asp Gln Asp Asn Ala Glu Ser Ile Gly Gln Leu Thr
Ile Lys5220 5225 5230Ser Pro Asn Gly Gly
Asp Ser His Gln Gly Thr Phe Tyr Tyr Phe Asp5235 5240
5245Gly Ala Asp Tyr Ile Glu Leu Val Pro Glu Ser Asn Gly Ser Ile
Ile5250 5255 5260Phe Gly Ser Pro Glu Leu
Glu Gln Ser Phe Ala Pro Asn Pro Ser Glu5265 5270
5275 5280Pro Arg Gln Thr Ile Ala Thr Ile Asp Asn Leu
Phe Phe Val Pro Asp5285 5290 5295Gln His
Ala Ser Ser Asp Glu Thr Gly Gly Arg Val Arg Tyr Glu Leu5300
5305 5310Glu Ile Glu Lys Asn Gly Ser Thr Asp His Thr Val
Asn Ser Asn Phe5315 5320 5325Arg Ile Glu
Ile Glu Ala Val Ala Asp Ile Ala Thr Trp Asp Asp Ser5330
5335 5340Asn Ser Thr Tyr Gln Tyr Gln Val Asn Glu Asp Glu
Asp Asn Val Thr5345 5350 5355
5360Leu Gln Leu Asn Ala Glu Ser Gln Asp Asn Ser Asn Thr Glu Thr Ile5365
5370 5375Thr Tyr Glu Leu Glu Ala Val Gln Gly
Asp Gly Lys Phe Glu Leu Leu5380 5385
5390Asp Gln Asn Gly Asn Val Leu Thr Pro Val Asn Gly Val Tyr Ile Ile5395
5400 5405Ala Ser Ala Asp Ile Asn Ser Thr Val
Val Asn Pro Ile Asp Asn Phe5410 5415
5420Ser Gly Gln Ile Glu Phe Lys Ala Thr Ala Ile Thr Glu Glu Thr Leu5425
5430 5435 5440Asn Pro Tyr Asp
Asp Ser Asp Asn Gly Gly Ala Asn Asp Lys Thr Thr5445 5450
5455Ala Arg Ser Val Glu Gln Ser Ile Val Ile Asp Val Thr Ala
Asp Ala5460 5465 5470Asp Pro Gly Thr Phe
Ser Val Ser Arg Ile Gln Ile Asn Glu Asp Asn5475 5480
5485Ile Asp Asp Pro Asp Tyr Val Gly Pro Leu Asp Asn Lys Asp Ala
Phe5490 5495 5500Thr Leu Asp Glu Val Ile
Thr Met Thr Gly Ser Val Asp Ser Asp Ser5505 5510
5515 5520Ser Glu Glu Leu Phe Val Arg Ile Ser Asn Val
Thr Glu Gly Ala Val5525 5530 5535Leu Tyr
Phe Leu Gly Thr Thr Thr Val Val Pro Thr Ile Thr Ile Asn5540
5545 5550Gly Val Asp Tyr Gln Glu Ile Ala Tyr Ser Asp Leu
Ala Asn Val Glu5555 5560 5565Val Val Pro
Thr Lys His Ser Asn Val Asp Phe Thr Phe Asp Val Thr5570
5575 5580Gly Val Val Lys Asp Thr Ala Asn Leu Ser Thr Gly
Ala Gln Ile Asp5585 5590 5595
5600Glu Glu Ile Leu Gly Thr Lys Thr Val Asn Val Glu Val Lys Gly Val5605
5610 5615Ala Asp Thr Pro Tyr Gly Gly Thr Asn
Gly Thr Ala Trp Ser Ala Ile5620 5625
5630Thr Asp Gly Thr Thr Ser Gly Val Gln Thr Thr Ile Gln Glu Ser Gln5635
5640 5645Asn Gly Asp Thr Phe Ala Glu Leu Asp
Phe Thr Val Leu Ser Gly Glu5650 5655
5660Arg Arg Pro Asp Thr Gly Thr Thr Pro Leu Ala Asp Asp Gly Ser Glu5665
5670 5675 5680Ser Ile Thr Val
Ile Leu Ser Gly Ile Pro Asp Gly Val Val Leu Glu5685 5690
5695Asp Gly Asp Gly Thr Val Ile Asp Leu Asn Phe Val Gly Tyr
Glu Thr5700 5705 5710Gly Pro Gly Gly Ser
Pro Asp Leu Ser Lys Pro Ile Tyr Glu Ala Asn5715 5720
5725Ile Thr Glu Ala Gly Lys Thr Ser Gly Ile Arg Ile Arg Pro Val
Asp5730 5735 5740Ser Ser Thr Glu Asn Ile
His Ile Gln Gly Lys Val Ile Val Thr Glu5745 5750
5755 5760Asn Asp Gly His Thr Leu Thr Phe Asp Gln Glu
Ile Arg Val Leu Val5765 5770 5775Ile Pro
Arg Ile Asp Thr Ser Ala Thr Tyr Val Asn Thr Thr Asn Gly5780
5785 5790Asp Glu Asp Thr Ala Ile Asn Ile Asp Trp His Pro
Glu Gly Thr Asp5795 5800 5805Tyr Ile Asp
Asp Asp Glu His Phe Thr Lys Ile Thr Ile Asn Gly Ile5810
5815 5820Pro Leu Gly Val Thr Ala Val Val Asn Gly Asp Val
Thr Val Asp Asp5825 5830 5835
5840Ser Thr Pro Gly Thr Leu Ile Ile Thr Pro Lys Asp Ala Ser Gln Thr5845
5850 5855Pro Glu Gln Phe Thr Gln Ile Ala Leu
Ala Asn Asn Phe Ile Gln Met5860 5865
5870Thr Pro Pro Ala Asp Ser Ser Ala Asp Phe Thr Leu Thr Thr Glu Leu5875
5880 5885Lys Met Glu Glu Arg Asp His Glu Tyr
Thr Ser Ser Gly Leu Glu Asp5890 5895
5900Glu Asp Gly Gly Tyr Val Glu Ala Asp Pro Asp Ile Thr Gly Ile Ile5905
5910 5915 5920Asn Val Gln Val
Arg Pro Val Val Glu Pro Gly Asp Ala Asp Asn Lys5925 5930
5935Ile Val Val Ser Asn Glu Asp Gly Ser Gly Asp Leu Thr Thr
Ile Thr5940 5945 5950Ala Asp Ala Asn Gly
Val Ile Lys Phe Thr Thr Asn Ser Asp Asn Gln5955 5960
5965Thr Thr Asp Thr Asn Gly Asp Glu Ile Trp Asp Gly Glu Tyr Val
Val5970 5975 5980Arg Tyr Gln Glu Thr Asp
Leu Ser Thr Val Glu Glu Gln Val Asp Glu5985 5990
5995 6000Val Ile Val Gln Leu Thr Asn Thr Asp Gly Ser
Ala Leu Ser Asp Asp6005 6010 6015Ile Leu
Gly Gln Leu Leu Val Thr Gly Ala Ser Tyr Glu Gly Gly Gly6020
6025 6030Arg Trp Val Val Thr Asn Glu Asp Ala Phe Ser Val
Ser Ala Pro Asn6035 6040 6045Gly Leu Asp
Phe Thr Pro Ala Asn Asp Ala Asp Asp Val Ala Thr Asp6050
6055 6060Phe Asn Asp Ile Lys Met Thr Ile Phe Thr Leu Val
Ser Asp Pro Gly6065 6070 6075
6080Asp Ala Asn Asn Glu Thr Ser Ala Gln Val Gln Arg Thr Gly Glu Val6085
6090 6095Thr Leu Ser Tyr Pro Glu Val Leu Thr
Ala Pro Asp Lys Val Ala Ala6100 6105
6110Asp Ile Ala Ile Val Pro Asp Ser Val Ile Asp Ala Val Glu Asp Thr6115
6120 6125Gln Leu Asp Leu Gly Ala Ala Leu Asn
Gly Ile Leu Ser Leu Thr Gly6130 6135
6140Arg Asp Asp Ser Thr Asp Gln Val Thr Val Ile Ile Asp Gly Thr Leu6145
6150 6155 6160Val Ile Asp Ala
Thr Thr Ser Phe Pro Ile Ser Leu Ser Gly Thr Ser6165 6170
6175Asp Val Asp Phe Val Asn Gly Lys Tyr Val Tyr Glu Thr Thr
Val Glu6180 6185 6190Gln Gly Val Ala Val
Asp Ser Ser Gly Leu Leu Leu Asn Leu Pro Pro6195 6200
6205Asn Tyr Ser Gly Asp Phe Arg Leu Pro Met Thr Ile Val Thr Lys
Asp6210 6215 6220Leu Gln Ser Gly Asp Glu
Lys Thr Leu Val Thr Glu Val Ile Ile Lys6225 6230
6235 6240Val Ala Pro Asp Ala Glu Thr Asp Pro Thr Ile
Glu Val Asn Val Val6245 6250 6255Gly Ser
Leu Asp Asp Ala Phe Asn Pro Val Asp Thr Asp Gly Gln Ala6260
6265 6270Gly Gln Asp Pro Val Gly Tyr Glu Asp Thr Tyr Ile
Gln Leu Asp Phe6275 6280 6285Asn Ser Thr
Ile Ser Asp Gln Val Ser Gly Val Glu Gly Gly Gln Glu6290
6295 6300Ala Phe Thr Ser Ile Thr Leu Thr Leu Asp Asp Pro
Ser Ile Gly Ala6305 6310 6315
6320Phe Tyr Asp Asn Thr Gly Thr Ser Leu Gly Thr Ser Val Thr Phe Asn6325
6330 6335Gln Ala Glu Ile Ala Ala Gly Ala Leu
Asp Asn Val Leu Phe Arg Ala6340 6345
6350Ile Glu Asn Tyr Pro Thr Gly Asn Asp Ile Asn Gln Val Gln Val Asn6355
6360 6365Val Ser Gly Thr Val Thr Asp Thr Ala
Thr Tyr Asn Asp Pro Ala Ser6370 6375
6380Pro Ala Gly Thr Ala Thr Asp Ser Asp Thr Phe Ser Thr Ser Val Ser6385
6390 6395 6400Phe Glu Val Val
Pro Val Val Asp Asp Val Ser Val Thr Gly Pro Gly6405 6410
6415Ser Asp Pro Asp Val Ile Glu Ile Thr Gly Asn Glu Asp Gln
Leu Ile6420 6425 6430Ser Leu Ser Gly Thr
Gly Pro Val Ser Ile Ala Leu Thr Asp Leu Asp6435 6440
6445Gly Ser Glu Gln Phe Val Ser Ile Lys Phe Thr Asp Val Pro Asp
Gly6450 6455 6460Phe Gln Met Arg Ala Asp
Ala Gly Ser Thr Tyr Thr Val Lys Asn Asn6465 6470
6475 6480Gly Asn Gly Glu Trp Ser Val Gln Leu Pro Gln
Ala Ser Gly Leu Ser6485 6490 6495Phe Asp
Leu Ser Glu Ile Ser Ile Leu Pro Pro Lys Asn Phe Ser Gly6500
6505 6510Thr Ala Glu Phe Gly Val Glu Val Phe Thr Gln Glu
Ser Leu Leu Gly6515 6520 6525Val Pro Thr
Ala Ala Ala Asn Leu Pro Ser Phe Lys Leu His Val Val6530
6535 6540Pro Val Gly Asp Asp Val Asp Thr Asn Pro Thr Asp
Ser Val Thr Gly6545 6550 6555
6560Asn Glu Gly Gln Asn Ile Asp Ile Glu Ile Asn Ala Thr Ile Leu Asp6565
6570 6575Lys Glu Leu Ser Ala Thr Gly Ser Gly
Thr Tyr Thr Glu Asn Ala Pro6580 6585
6590Glu Thr Leu Arg Val Glu Val Ala Gly Val Pro Gln Asp Ala Ser Ile6595
6600 6605Phe Tyr Pro Asp Gly Thr Thr Leu Ala
Ser Tyr Asp Pro Ala Thr Gln6610 6615
6620Leu Trp Thr Leu Asp Val Pro Ala Gln Ser Leu Asp Lys Ile Val Phe6625
6630 6635 6640Asn Ser Gly Glu
His Asn Ser Asp Thr Gly Asn Val Leu Gly Ile Asn6645 6650
6655Gly Pro Leu Gln Ile Thr Val Arg Ser Val Asp Thr Asp Ala
Asp Asn6660 6665 6670Thr Glu Tyr Leu Gly
Thr Pro Thr Ser Phe Asp Val Asp Leu Val Ile6675 6680
6685Asp Pro Ile Asn Asp Gln Pro Ile Phe Val Asn Val Thr Asn Ile
Glu6690 6695 6700Thr Ser Glu Asp Ile Ser
Val Ala Ile Asp Asn Phe Ser Ile Tyr Asp6705 6710
6715 6720Val Asp Ala Asn Phe Asp Asn Pro Asp Ala Pro
Tyr Glu Leu Thr Leu6725 6730 6735Lys Val
Asp Gln Thr Leu Pro Gly Ala Gln Gly Val Phe Glu Phe Thr6740
6745 6750Ser Ser Pro Asp Val Thr Phe Val Leu Gln Pro Asp
Gly Ser Leu Val6755 6760 6765Ile Thr Gly
Lys Glu Ala Asp Ile Asn Thr Ala Leu Thr Asn Gly Ala6770
6775 6780Val Thr Phe Lys Pro Asp Pro Asp Gln Asn Tyr Leu
Asn Gln Thr Gly6785 6790 6795
6800Leu Val Thr Ile Asn Ala Thr Leu Asp Asp Gly Gly Asn Asn Gly Leu6805
6810 6815Ile Asp Ala Val Asp Pro Asn Thr Ala
Gln Thr Asn Gln Thr Thr Phe6820 6825
6830Thr Ile Lys Val Thr Glu Val Asn Asp Ala Pro Val Ala Thr Asn Val6835
6840 6845Asp Leu Gly Ser Ile Ala Glu Asp Ala
Gln Ile Val Ile Val Glu Ser6850 6855
6860Asp Leu Ile Ala Ala Ser Ser Asp Leu Glu Asn His Asn Leu Thr Val6865
6870 6875 6880Thr Gly Val Thr
Leu Thr Gln Gly Gln Gly Gln Leu Thr Arg Tyr Glu6885 6890
6895Asn Ala Gly Gly Ala Asp Asp Ala Ala Ile Thr Gly Pro Phe
Trp Ile6900 6905 6910Phe Ile Ala Asp Asn
Asp Phe Asn Gly Asp Val Lys Phe Asn Tyr Ser6915 6920
6925Ile Ile Asp Asp Gly Thr Thr Asn Gly Val Asp Asp Phe Lys Thr
Asp6930 6935 6940Ser Ala Glu Ile Ser Leu
Val Val Thr Glu Val Asn Asp Gln Pro Val6945 6950
6955 6960Ala Ser Asn Ile Asp Leu Gly Thr Met Leu Glu
Glu Gly Gln Leu Val6965 6970 6975Ile Lys
Glu Glu Asp Leu Ile Ser Ala Thr Thr Asp Pro Glu Asn Asp6980
6985 6990Thr Ile Thr Val Asn Ser Leu Val Leu Asp Gln Gly
Gln Gly Gln Leu6995 7000 7005Gln Arg Phe
Glu Asn Val Gly Gly Ala Asp Asp Ala Thr Ile Thr Gly7010
7015 7020Pro Tyr Trp Val Phe Thr Ala Ala Asn Glu Tyr Asn
Gly Asp Val Lys7025 7030 7035
7040Phe Thr Tyr Thr Val Glu Asp Asp Gly Thr Thr Asn Gly Ala Asp Asp7045
7050 7055Phe Leu Thr Asp Thr Gly Glu Ile Ser
Val Val Val Thr Glu Val Asn7060 7065
7070Asp Gln Pro Val Ala Thr Asp Ile Asp Leu Gly Asn Ile Leu Glu Glu7075
7080 7085Gly Gln Leu Ile Ile Lys Glu Glu Asp
Leu Ile Ala Ala Thr Ser Asp7090 7095
7100Pro Glu Asn Asp Thr Ile Thr Val Thr Asn Leu Val Leu Asp Glu Gly7105
7110 7115 7120Gln Gly Gln Leu
Gln Arg Phe Glu Asn Val Gly Gly Ala Asp Asp Ala7125 7130
7135Met Ile Thr Gly Pro Tyr Trp Ile Phe Thr Ala Ala Asp Glu
Tyr Asn7140 7145 7150Gly Asn Val Lys Phe
Thr Tyr Thr Val Glu Asp Asp Gly Thr Thr Asn7155 7160
7165Gly Ala Asn Asp Phe Leu Thr Asp Thr Ala Glu Ile Thr Ala Ile
Val7170 7175 7180Asp Gly Val Asn Asp Thr
Pro Val Val Asn Gly Asp Ser Val Thr Thr7185 7190
7195 7200Ile Val Asp Glu Asp Ala Gly Gln Leu Leu Ser
Gly Ile Asn Val Ser7205 7210 7215Asp Pro
Asp Tyr Val Asp Ala Phe Ser Asn Asp Leu Met Thr Val Thr7220
7225 7230Leu Thr Val Asp Tyr Gly Thr Leu Asn Val Ser Leu
Pro Ala Val Thr7235 7240 7245Thr Val Met
Val Asn Gly Asn Asn Thr Gly Ser Val Ile Leu Val Gly7250
7255 7260Thr Leu Ser Asp Leu Asn Ala Leu Ile Asp Thr Pro
Thr Ser Pro Asn7265 7270 7275
7280Gly Val Tyr Leu Asp Ala Ser Leu Ser Pro Thr Asn Ser Ile Gly Leu7285
7290 7295Glu Val Ile Ala Lys Asp Ser Gly Asn
Pro Ser Gly Ile Ala Ile Glu7300 7305
7310Thr Ala Pro Val Val Tyr Asn Ile Ala Val Thr Pro Val Ala Asn Ala7315
7320 7325Pro Thr Leu Ser Ile Asp Pro Ala Phe
Asn Tyr Val Arg Asn Ile Thr7330 7335
7340Thr Ser Ser Ser Val Val Ala Asn Ser Gly Val Ala Leu Val Gly Ile7345
7350 7355 7360Val Ala Ala Leu
Thr Asp Ile Thr Glu Glu Leu Thr Leu Lys Ile Ser7365 7370
7375Asp Val Pro Asp Gly Val Asp Val Thr Ser Asp Val Gly Thr
Val Ser7380 7385 7390Leu Val Gly Asp Thr
Trp Ile Ala Thr Ala Asp Ala Ile Asp Ser Leu7395 7400
7405Arg Leu Val Glu Gln Ser Ser Leu Gly Lys Pro Leu Thr Pro Gly
Asn7410 7415 7420Tyr Thr Leu Lys Val Glu
Ala Leu Ser Glu Glu Thr Asp Asn Asn Asp7425 7430
7435 7440Ile Ala Ile Ser Gln Asn Ile Asp Leu Asn Leu
Asn Ile Val Ala Asn7445 7450 7455Pro Ile
Asp Leu Asp Leu Ser Ser Glu Thr Asp Asp Val Gln Leu Leu7460
7465 7470Ala Ser Asn Phe Asp Thr Asn Leu Thr Gly Gly Thr
Gly Asn Asp Arg7475 7480 7485Leu Val Gly
Gly Ala Gly Asp Asp Thr Leu Val Gly Gly Asp Gly Asn7490
7495 7500Asp Thr Leu Ile Gly Gly Gly Gly Ser Asp Ile Leu
Thr Gly Gly Asn7505 7510 7515
7520Gly Met Asp Ser Phe Val Trp Leu Asn Ile Glu Asp Gly Val Glu Asp7525
7530 7535Thr Ile Thr Asp Phe Ser Leu Ser Glu
Gly Asp Gln Ile Asp Leu Arg7540 7545
7550Glu Val Leu Pro Glu Leu Lys Asn Thr Ser Pro Asp Met Ser Ala Leu7555
7560 7565Leu Gln Gln Ile Asp Ala Lys Val Glu
Gly Asp Asp Ile Glu Leu Thr7570 7575
7580Ile Lys Ser Asp Gly Leu Gly Thr Thr Glu Gln Val Ile Val Val Glu7585
7590 7595 7600Asp Leu Ala Pro
Gln Leu Thr Leu Ser Gly Thr Met Pro Ser Asp Ile7605 7610
7615Leu Asp Ala Leu Val Gln Gln Asn Val Ile Thr His Gly7620
7625
SEQ ID NO: 5 | Length: 765 | Type: DNA | Organism: Vibrio splendidus
atgaaaaaaa catcactatt
acttgcttcc attactctgg cactttctgg tgtagtacaa 60gctgaccagc tagaagacat
tcaaaaatca ggcacacttc gcgtcggcac cacaggcgac 120tacaaacctt tttcttactt
cgacggcaaa acctactctg gttatgacat tgacgtagcc 180aaacatgttg cagagcagtt
gggcgttgaa ttacagattg ttcgtaccac atggaaagat 240ctactgaccg atctagacag
cgataaatac gacatcgcga tgggcggtat cacgcgtaaa 300atgcagcgtc agttaaacgc
agaacaaact caaggttaca tgacctttgg caagtgtttc 360ttagttgcga aaggcaaagc
agaacaatac aacagcattg agaaagtgaa cctctcttct 420gtgcgtgttg gcgtcaatat
cggtgggact aatgagatgt ttgcggatgc taacttgcaa 480gacgcgagct ttacgcgtta
cgagaacaac ctagacgttc cgcaagccgt tgcggaaggt 540aaagttgatg taatggtgac
agaaactcct gaaggtctgt tctatcaagt gacggacgaa 600cgtcttgaag cggcacgctg
tgaaacaccg tttaccaaca gtcaattcgg ttacctgata 660ccaaaaggtg aacaacgctt
gttgaacaca gtgaacttca ttatggatga gatgaaattg 720aaaggcgtcg aagaagagtt
cctgatccac aactctctta agtaa 765
SEQ ID NO: 6 | Length: 254 | Type: PRT | Organism: Vibrio splendidus
Met Lys Lys Thr Ser Leu Leu Leu Ala Ser Ile Thr Leu Ala Leu
Ser1 5 10 15Gly Val Val
Gln Ala Asp Gln Leu Glu Asp Ile Gln Lys Ser Gly Thr20 25
30Leu Arg Val Gly Thr Thr Gly Asp Tyr Lys Pro Phe Ser
Tyr Phe Asp35 40 45Gly Lys Thr Tyr Ser
Gly Tyr Asp Ile Asp Val Ala Lys His Val Ala50 55
60Glu Gln Leu Gly Val Glu Leu Gln Ile Val Arg Thr Thr Trp Lys
Asp65 70 75 80Leu Leu
Thr Asp Leu Asp Ser Asp Lys Tyr Asp Ile Ala Met Gly Gly85
90 95Ile Thr Arg Lys Met Gln Arg Gln Leu Asn Ala Glu
Gln Thr Gln Gly100 105 110Tyr Met Thr Phe
Gly Lys Cys Phe Leu Val Ala Lys Gly Lys Ala Glu115 120
125Gln Tyr Asn Ser Ile Glu Lys Val Asn Leu Ser Ser Val Arg
Val Gly130 135 140Val Asn Ile Gly Gly Thr
Asn Glu Met Phe Ala Asp Ala Asn Leu Gln145 150
155 160Asp Ala Ser Phe Thr Arg Tyr Glu Asn Asn Leu
Asp Val Pro Gln Ala165 170 175Val Ala Glu
Gly Lys Val Asp Val Met Val Thr Glu Thr Pro Glu Gly180
185 190Leu Phe Tyr Gln Val Thr Asp Glu Arg Leu Glu Ala
Ala Arg Cys Glu195 200 205Thr Pro Phe Thr
Asn Ser Gln Phe Gly Tyr Leu Ile Pro Lys Gly Glu210 215
220Gln Arg Leu Leu Asn Thr Val Asn Phe Ile Met Asp Glu Met
Lys Leu225 230 235 240Lys
Gly Val Glu Glu Glu Phe Leu Ile His Asn Ser Leu Lys245
250
SEQ ID NO: 7 | Length: 765 | Type: DNA | Organism: Vibrio splendidus
atgaaaaaaa catcactatt acttgcttcc attactctgg
cactttctgg tgtagtacaa 60gctgaccagc tagaagacat tcaaaaatca ggcacacttc
gcgtcggcac cacaggcgac 120tacaaacctt tttcttactt cgacggcaaa acctactctg
gttatgacat tgacgtagcc 180aaacatgttg cagagcagtt gggcgttgaa ttacagattg
ttcgtaccac atggaaagat 240ctactgaccg atctagacag cgataaatac gacatcgcga
tgggcggtat cacgcgtaaa 300atgcagcgtc agttaaacgc agaacaaact caaggttaca
tgacctttgg caagtgtttc 360ttagttgcga aaggcaaagc agaacaatac aacagcattg
agaaagtgaa cctctcttct 420gtgcgtgttg gcgtcaatat cggtgggact aatgagatgt
ttgcggatgc taacttgcaa 480gacgcgagct ttacgcgtta cgagaacaac ctagacgttc
cgcaagccgt tgcggaaggt 540aaagttgatg taatggtgac agaaactcct gaaggtctgt
tctatcaagt gacggacgaa 600cgtcttgaag cggcacgctg tgaaacaccg tttaccaaca
gtcaattcgg ttacctgata 660ccaaaaggtg aacaacgctt gttgaacaca gtgaacttca
ttatggatga gatgaaattg 720aaaggcgtcg aagaagagtt cctgatccac aactctctta
agtaa 765
SEQ ID NO: 8 | Length: 588 | Type: PRT | Organism: Vibrio splendidus
Met Thr Ile Asp Thr
Phe Val Val Leu Ala Tyr Phe Phe Phe Leu Ile1 5
10 15Ala Ile Gly Trp Met Phe Arg Lys Phe Thr Thr Ser
Thr Ser Asp Tyr20 25 30Phe Arg Gly Gly
Gly Lys Met Leu Trp Trp Met Val Gly Ala Thr Ala35 40
45Phe Met Thr Gln Phe Ser Ala Trp Thr Phe Thr Gly Ala Ala
Gly Arg50 55 60Ala Phe Asn Asp Gly Phe
Val Ile Val Ile Leu Phe Leu Ala Asn Ala65 70
75 80Phe Gly Tyr Phe Met Asn Tyr Met Tyr Phe Ala
Pro Lys Phe Arg Gln85 90 95Leu Arg Val
Val Thr Ala Ile Glu Ala Ile Arg Gln Arg Phe Gly Lys100
105 110Thr Ser Glu Gln Phe Phe Thr Trp Ala Gly Met Pro
Asp Ser Leu Ile115 120 125Ser Ala Gly Ile
Trp Leu Asn Gly Leu Ala Ile Phe Val Ala Ala Val130 135
140Phe Asn Ile Pro Met Glu Ala Thr Ile Val Val Thr Gly Met
Val Leu145 150 155 160Val
Leu Met Ala Val Thr Gly Gly Ser Trp Ala Val Val Ala Ser Asp165
170 175Phe Met Gln Met Leu Val Ile Met Ala Val Thr
Ile Thr Cys Ala Val180 185 190Ala Ala Tyr
Phe His Gly Gly Gly Leu Thr Asn Ile Val Ala Asn Phe195
200 205Asp Gly Asp Phe Met Leu Gly Asn Asn Leu Asn Tyr
Met Ser Ile Phe210 215 220Val Leu Trp Val
Val Phe Ile Phe Val Lys Gln Phe Gly Val Met Asn225 230
235 240Asn Ser Ile Asn Ala Tyr Arg Tyr Leu
Cys Ala Lys Asp Ser Glu Asn245 250 255Ala
Arg Lys Ala Ala Gly Leu Ala Cys Ile Leu Met Val Val Gly Pro260
265 270Leu Ile Trp Phe Leu Pro Pro Trp Tyr Val Ser
Ala Phe Met Pro Asp275 280 285Phe Ala Leu
Glu Tyr Ala Ser Met Gly Asp Lys Ala Gly Asp Ala Ala290
295 300Tyr Leu Ala Phe Val Gln Asn Val Met Pro Ala Gly
Met Val Gly Leu305 310 315
320Leu Met Ser Ala Met Phe Ala Ala Thr Met Ser Ser Met Asp Ser Gly325
330 335Leu Asn Arg Asn Ala Gly Ile Phe Val
Met Asn Phe Tyr Ser Pro Ile340 345 350Leu
Arg Gln Asn Ala Thr Gln Lys Glu Leu Val Ile Val Ser Lys Leu355
360 365Thr Thr Ile Met Met Gly Ile Ile Ile Ile Ala
Ile Gly Leu Phe Ile370 375 380Asn Ser Leu
Arg His Leu Ser Leu Phe Asp Ile Val Met Asn Val Gly385
390 395 400Ala Leu Ile Gly Phe Pro Met
Leu Ile Pro Val Leu Leu Gly Met Trp405 410
415Ile Arg Lys Thr Pro Asp Trp Ala Gly Trp Ser Thr Leu Ile Val Gly420
425 430Gly Phe Val Ser Tyr Ile Phe Gly Ile
Ser Leu Gln Ala Glu Asp Ile435 440 445Glu
His Leu Phe Gly Met Glu Thr Ala Leu Thr Gly Arg Glu Trp Ser450
455 460Asp Leu Lys Val Gly Leu Ser Leu Ala Ala His
Val Val Phe Thr Gly465 470 475
480Gly Tyr Phe Ile Leu Thr Ser Arg Phe Tyr Lys Gly Leu Ser Pro
Glu485 490 495Arg Glu Lys Glu Val Asp Gln
Leu Phe Thr Asn Trp Asn Thr Pro Leu500 505
510Val Ala Glu Gly Glu Glu Gln Gln Asn Leu Asp Thr Lys Gln Arg Ser515
520 525Met Leu Gly Lys Leu Ile Ser Thr Ala
Gly Phe Gly Ile Leu Ala Met530 535 540Ala
Leu Ile Pro Asn Glu Pro Thr Gly Arg Leu Leu Phe Leu Leu Cys545
550 555 560Gly Ser Met Val Leu Thr
Val Gly Ile Leu Leu Val Asn Ala Ser Lys565 570
575Ala Pro Ala Lys Met Asn Asn Glu Ser Val Ala Lys580
585
SEQ ID NO: 9 | Length: 627 | Type: DNA | Organism: Vibrio splendidus
atgacgacat taaatgaaca actagcaaac ctaaaagtaa
ttcctgtaat cgcgatcaac 60cgtgctgaag acgctatccc tctaggtaaa gcgttggttg
aaaatggcat gccatgtgca 120gaaattacac tacgtacaga atgtgcaatc gaagcgattc
gcatcatgcg taaagaattc 180ccagacatgc taatcggttc aggtactgta ctgactaacg
agcaagttga cgcatctatc 240gaagctggtg ttgatttcat cgtaagccca ggttttaacc
cacgtactgt tcaatactgt 300atcgataaag gtattgcaat cgtaccgggt gttaacaacc
caagcctagt tgagcaagca 360atggaaatgg gtcttcgcac gttgaagttc ttccctgctg
agccttcagg cggtactggc 420atgcttaaag cactaacagc agtttaccct gttaaattca
tgcctactgg tggcgtaagc 480ttgaagaatg ttgatgaata cctatcgatc ccttctgttc
ttgcgtgtgg cggtacttgg 540atggttccaa ctaaccttat cgatgaaggt aagtgggacg
aactaggcaa gcttgttcgt 600gacgcagttg atcacgttaa cgcttaa
627
SEQ ID NO: 10 | Length: 208 | Type: PRT | Organism: Vibrio splendidus
Met Thr Thr Leu Asn
Glu Gln Leu Ala Asn Leu Lys Val Ile Pro Val1 5
10 15Ile Ala Ile Asn Arg Ala Glu Asp Ala Ile Pro Leu
Gly Lys Ala Leu20 25 30Val Glu Asn Gly
Met Pro Cys Ala Glu Ile Thr Leu Arg Thr Glu Cys35 40
45Ala Ile Glu Ala Ile Arg Ile Met Arg Lys Glu Phe Pro Asp
Met Leu50 55 60Ile Gly Ser Gly Thr Val
Leu Thr Asn Glu Gln Val Asp Ala Ser Ile65 70
75 80Glu Ala Gly Val Asp Phe Ile Val Ser Pro Gly
Phe Asn Pro Arg Thr85 90 95Val Gln Tyr
Cys Ile Asp Lys Gly Ile Ala Ile Val Pro Gly Val Asn100
105 110Asn Pro Ser Leu Val Glu Gln Ala Met Glu Met Gly
Leu Arg Thr Leu115 120 125Lys Phe Phe Pro
Ala Glu Pro Ser Gly Gly Thr Gly Met Leu Lys Ala130 135
140Leu Thr Ala Val Tyr Pro Val Lys Phe Met Pro Thr Gly Gly
Val Ser145 150 155 160Leu
Lys Asn Val Asp Glu Tyr Leu Ser Ile Pro Ser Val Leu Ala Cys165
170 175Gly Gly Thr Trp Met Val Pro Thr Asn Leu Ile
Asp Glu Gly Lys Trp180 185 190Asp Glu Leu
Gly Lys Leu Val Arg Asp Ala Val Asp His Val Asn Ala195
200 205
SEQ ID NO: 11 | Length: 933 | Type: DNA | Organism: Vibrio splendidus
atgaaatcat taaacatcgc
ggtcattggc gagtgcatgg ttgagctaca aaagaaacaa 60gacgggctta agcaaagttt
tggtggcgat acgctgaata ctgcacttta cttgtcacgc 120ttaacaaaag agcaagatat
caacacgagc tacgtaactg cactaggcac tgacccattc 180agtaccgaca tgttaaaaaa
ttggcaagcg gaaggtatcg acacgagctt aattgctcag 240ctggaccaca aacaaccagg
gctttactac atcgagaccg atgaaactgg tgaacgcagt 300ttccactact ggcgtagtga
tgctgcagcg aagttcatgt ttgatcagga agacacgcct 360gctcttcttg ataagctgtt
ctcttttgac gcgatttact taagtggtat tacgctggca 420atcttgacag aaaatggtcg
cacgcagcta ttcaacttct tagacaaatt caaagctcaa 480ggcggccaag tattcttcga
caataactac cgacctaaac tttgggaaag ccaacaagaa 540gcgatttctt ggtacttgaa
aatgcttaag tacacagata cggctctgct gacgtttgat 600gatgagcaag agctatacgg
cgacgaaagc attgaacaat gtattacacg tacgtcagag 660tctggtgtga aagagatcgt
cattaaacgt ggcgcgaaag actgcttagt ggttgaaagc 720caaagcgctc aatacgttgc
acccaaccct gtagacaaca tcgttgatac gactgccgct 780ggcgactcgt tcagtgcagg
cttcttggcc aagcgcttga gcggcggtag tgctcgtgat 840gctgcatttg caggtcatat
tgtggcagga accgtgattc agcatccagg tgctatcatt 900cctctagaag cgacgcctga
tctgtctcta taa 933
SEQ ID NO: 12 | Length: 310 | Type: PRT | Organism: Vibrio splendidus
Met Lys Ser Leu Asn Ile Ala Val Ile Gly Glu Cys Met Val Glu
Leu1 5 10 15Gln Lys Lys
Gln Asp Gly Leu Lys Gln Ser Phe Gly Gly Asp Thr Leu20 25
30Asn Thr Ala Leu Tyr Leu Ser Arg Leu Thr Lys Glu Gln
Asp Ile Asn35 40 45Thr Ser Tyr Val Thr
Ala Leu Gly Thr Asp Pro Phe Ser Thr Asp Met50 55
60Leu Lys Asn Trp Gln Ala Glu Gly Ile Asp Thr Ser Leu Ile Ala
Gln65 70 75 80Leu Asp
His Lys Gln Pro Gly Leu Tyr Tyr Ile Glu Thr Asp Glu Thr85
90 95Gly Glu Arg Ser Phe His Tyr Trp Arg Ser Asp Ala
Ala Ala Lys Phe100 105 110Met Phe Asp Gln
Glu Asp Thr Pro Ala Leu Leu Asp Lys Leu Phe Ser115 120
125Phe Asp Ala Ile Tyr Leu Ser Gly Ile Thr Leu Ala Ile Leu
Thr Glu130 135 140Asn Gly Arg Thr Gln Leu
Phe Asn Phe Leu Asp Lys Phe Lys Ala Gln145 150
155 160Gly Gly Gln Val Phe Phe Asp Asn Asn Tyr Arg
Pro Lys Leu Trp Glu165 170 175Ser Gln Gln
Glu Ala Ile Ser Trp Tyr Leu Lys Met Leu Lys Tyr Thr180
185 190Asp Thr Ala Leu Leu Thr Phe Asp Asp Glu Gln Glu
Leu Tyr Gly Asp195 200 205Glu Ser Ile Glu
Gln Cys Ile Thr Arg Thr Ser Glu Ser Gly Val Lys210 215
220Glu Ile Val Ile Lys Arg Gly Ala Lys Asp Cys Leu Val Val
Glu Ser225 230 235 240Gln
Ser Ala Gln Tyr Val Ala Pro Asn Pro Val Asp Asn Ile Val Asp245
250 255Thr Thr Ala Ala Gly Asp Ser Phe Ser Ala Gly
Phe Leu Ala Lys Arg260 265 270Leu Ser Gly
Gly Ser Ala Arg Asp Ala Ala Phe Ala Gly His Ile Val275
280 285Ala Gly Thr Val Ile Gln His Pro Gly Ala Ile Ile
Pro Leu Glu Ala290 295 300Thr Pro Asp Leu
Ser Leu305 310
SEQ ID NO: 13 | Length: 336 | Type: DNA | Organism: Vibrio splendidus
atgaactctt
tctttatcct agatgaaaat ccatgggaag aacttggtgg cggcattaag 60cgtaaaatcg
ttgcttacac tgacgatcta atggcagtac acctatgctt tgataagggc 120gcgattggcc
accctcatac tcacgaaatt cacgaccaaa tcggttatgt tgttcgtggt 180agcttcgaag
ctgaaatcga cggcgagaag aaagtgctta aagaaggcga tgcttacttc 240gctcgtaaac
acatgatgca cggtgcagtt gctctagaac aagacagcat ccttcttgat 300atcttcaatc
ctgcgcgtga agatttccta aaataa
336
SEQ ID NO: 14 | Length: 111 | Type: PRT | Organism: Vibrio splendidus
Met Asn Ser Phe Phe Ile Leu Asp Glu Asn
Pro Trp Glu Glu Leu Gly1 5 10
15Gly Gly Ile Lys Arg Lys Ile Val Ala Tyr Thr Asp Asp Leu Met Ala20
25 30Val His Leu Cys Phe Asp Lys Gly Ala
Ile Gly His Pro His Thr His35 40 45Glu
Ile His Asp Gln Ile Gly Tyr Val Val Arg Gly Ser Phe Glu Ala50
55 60Glu Ile Asp Gly Glu Lys Lys Val Leu Lys Glu
Gly Asp Ala Tyr Phe65 70 75
80Ala Arg Lys His Met Met His Gly Ala Val Ala Leu Glu Gln Asp Ser85
90 95Ile Leu Leu Asp Ile Phe Asn Pro Ala
Arg Glu Asp Phe Leu Lys100 105
110
SEQ ID NO: 15 | Length: 2208 | Type: DNA | Organism: Vibrio splendidus
atgacgacta aaccagtatt gttgactgaa
gctgaaatcg aacagcttca tcttgaagtg 60ggccgttcta gcttaatggg caaaaccatt
gcagcgaacg cgaaagacct agaagcattc 120atgcgtttac ctattgatgt tccaggtcac
ggtgaagctg ggggttacga acataaccgc 180cacaagcaaa attacacgta catgaaccta
gctggtcgca tgttcttgat cactaaagag 240caaaaatacg ctgactttgt tacagaatta
ctagaagagt acgcagacaa atatctaacg 300tttgattacc acgtacagaa aaacaccaac
ccaacaggtc gtttgttcca ccaaatccta 360aacgaacact gctggttaat gttctcaagc
ttagcttatt cttgtgttgc ttcaacactg 420acacaagatc agcgtgacaa tattgagtct
cgcatttttg aacccatgct agaaatgttc 480acggttaaat acgcacacga cttcgaccgt
attcacaatc acggtatttg ggcagtagcc 540gctgtgggta tctgtggtct tgctttaggc
aaacgtgaat acctagaaat gtcagtgtac 600ggcatcgacc gtaatgatac tggcggtttc
ctagcgcaag tttctcagct atttgcacct 660tctggctact acatggaagg tccttactac
catcgttatg cgattcgccc aacgtgtgtg 720ttcgctgaag tgattcaccg tcatatgcct
gaagttgata tctacaacta caaaggcggc 780gtgattggta acacagtaca agctatgctt
gcgacagcgt acccgaacgg cgagttcccg 840gctctgaatg atgcttctcg tactatgggt
atcacagaca tgggtgttca ggttgcggtc 900agtgtttaca gtaagcatta ctcttctgaa
aacggtgtag accaaaacat tctgggtatg 960gcgaagattc aagacgcagt atggatgcat
ccatgtggtc ttgagctatc taaagcatac 1020gaagccgcat ctgcagagaa agaaatcggc
atgcctttct ggccaagtgt tgaattgaat 1080gaaggccctc aaggtcacaa cggcgcgcaa
ggctttatcc gtatgcagga taagaaaggc 1140gacgtttctc aacttgtgat gaactacggc
caacacggca tgggtcacgg caactttgat 1200acgctgggta tttctttctt taaccgcggt
caagaagtgc tacgtgaata cggcttctgt 1260cgttgggtta acgttgagcc aaaattcggc
ggccgttacc tagacgaaaa caaatcttac 1320gctcgtcaaa cgattgctca caatgcagtt
acgattgatg aaaaatgtca gaacaacttt 1380gacgttgaac gtgcagactc agtacatggt
ttacctcact tctttaaagt agaagacgat 1440caaatcaacg gtatgagtgc atttgctaac
gatcattacc aaggctttga catgcaacgc 1500agcgtgttca tgctaaatct tgaagaatta
gaatctccgt tattgttaga cctataccgc 1560ttagattcta caaaaggcgg cgaaggcgag
caccaatacg actattcaca ccaatatgcg 1620ggtcagattg ttcgcactaa cttcgaatac
caagcgaaca aagagctaaa cactctaggt 1680gacgatttcg gttaccaaca tctatggaac
gtcgcaagcg gtgaagtgaa gggcacagca 1740attgtaagtt ggctacaaaa caacacctac
tacacatggc taggtgcaac gtctaacgat 1800aatgctgaag taatatttac tcgcactggc
gctaacgacc caagtttcaa tctacgttca 1860gagcctgcgt tcattctacg cagcaaaggc
gaaacaacac tgtttgcttc tgttgttgaa 1920acgcacggtt atttcaacga agaattcgag
caatctgtca atgcacgtgg tgttgtgaaa 1980gacatcaaag tcgtggctca caccaatgtc
ggttcggtag ttgagatcac cacagagaaa 2040tcaaacgtga cagtgatgat cagcaaccaa
cttggcgcga ctgacagcac tgaacacaaa 2100gtagaactga acggcaaagt atacagctgg
aaaggcttct actcagtaga gacaacttta 2160caagaaacga attcagaaga acttagcact
gcagggcagg ggaaataa 2208
SEQ ID NO: 16 | Length: 735 | Type: PRT | Organism: Vibrio splendidus
Met
Thr Thr Lys Pro Val Leu Leu Thr Glu Ala Glu Ile Glu Gln Leu1
5 10 15His Leu Glu Val Gly Arg Ser Ser
Leu Met Gly Lys Thr Ile Ala Ala20 25
30Asn Ala Lys Asp Leu Glu Ala Phe Met Arg Leu Pro Ile Asp Val Pro35
40 45Gly His Gly Glu Ala Gly Gly Tyr Glu His
Asn Arg His Lys Gln Asn50 55 60Tyr Thr
Tyr Met Asn Leu Ala Gly Arg Met Phe Leu Ile Thr Lys Glu65
70 75 80Gln Lys Tyr Ala Asp Phe Val
Thr Glu Leu Leu Glu Glu Tyr Ala Asp85 90
95Lys Tyr Leu Thr Phe Asp Tyr His Val Gln Lys Asn Thr Asn Pro Thr100
105 110Gly Arg Leu Phe His Gln Ile Leu Asn
Glu His Cys Trp Leu Met Phe115 120 125Ser
Ser Leu Ala Tyr Ser Cys Val Ala Ser Thr Leu Thr Gln Asp Gln130
135 140Arg Asp Asn Ile Glu Ser Arg Ile Phe Glu Pro
Met Leu Glu Met Phe145 150 155
160Thr Val Lys Tyr Ala His Asp Phe Asp Arg Ile His Asn His Gly
Ile165 170 175Trp Ala Val Ala Ala Val Gly
Ile Cys Gly Leu Ala Leu Gly Lys Arg180 185
190Glu Tyr Leu Glu Met Ser Val Tyr Gly Ile Asp Arg Asn Asp Thr Gly195
200 205Gly Phe Leu Ala Gln Val Ser Gln Leu
Phe Ala Pro Ser Gly Tyr Tyr210 215 220Met
Glu Gly Pro Tyr Tyr His Arg Tyr Ala Ile Arg Pro Thr Cys Val225
230 235 240Phe Ala Glu Val Ile His
Arg His Met Pro Glu Val Asp Ile Tyr Asn245 250
255Tyr Lys Gly Gly Val Ile Gly Asn Thr Val Gln Ala Met Leu Ala
Thr260 265 270Ala Tyr Pro Asn Gly Glu Phe
Pro Ala Leu Asn Asp Ala Ser Arg Thr275 280
285Met Gly Ile Thr Asp Met Gly Val Gln Val Ala Val Ser Val Tyr Ser290
295 300Lys His Tyr Ser Ser Glu Asn Gly Val
Asp Gln Asn Ile Leu Gly Met305 310 315
320Ala Lys Ile Gln Asp Ala Val Trp Met His Pro Cys Gly Leu
Glu Leu325 330 335Ser Lys Ala Tyr Glu Ala
Ala Ser Ala Glu Lys Glu Ile Gly Met Pro340 345
350Phe Trp Pro Ser Val Glu Leu Asn Glu Gly Pro Gln Gly His Asn
Gly355 360 365Ala Gln Gly Phe Ile Arg Met
Gln Asp Lys Lys Gly Asp Val Ser Gln370 375
380Leu Val Met Asn Tyr Gly Gln His Gly Met Gly His Gly Asn Phe Asp385
390 395 400Thr Leu Gly Ile
Ser Phe Phe Asn Arg Gly Gln Glu Val Leu Arg Glu405 410
415Tyr Gly Phe Cys Arg Trp Val Asn Val Glu Pro Lys Phe Gly
Gly Arg420 425 430Tyr Leu Asp Glu Asn Lys
Ser Tyr Ala Arg Gln Thr Ile Ala His Asn435 440
445Ala Val Thr Ile Asp Glu Lys Cys Gln Asn Asn Phe Asp Val Glu
Arg450 455 460Ala Asp Ser Val His Gly Leu
Pro His Phe Phe Lys Val Glu Asp Asp465 470
475 480Gln Ile Asn Gly Met Ser Ala Phe Ala Asn Asp His
Tyr Gln Gly Phe485 490 495Asp Met Gln Arg
Ser Val Phe Met Leu Asn Leu Glu Glu Leu Glu Ser500 505
510Pro Leu Leu Leu Asp Leu Tyr Arg Leu Asp Ser Thr Lys Gly
Gly Glu515 520 525Gly Glu His Gln Tyr Asp
Tyr Ser His Gln Tyr Ala Gly Gln Ile Val530 535
540Arg Thr Asn Phe Glu Tyr Gln Ala Asn Lys Glu Leu Asn Thr Leu
Gly545 550 555 560Asp Asp
Phe Gly Tyr Gln His Leu Trp Asn Val Ala Ser Gly Glu Val565
570 575Lys Gly Thr Ala Ile Val Ser Trp Leu Gln Asn Asn
Thr Tyr Tyr Thr580 585 590Trp Leu Gly Ala
Thr Ser Asn Asp Asn Ala Glu Val Ile Phe Thr Arg595 600
605Thr Gly Ala Asn Asp Pro Ser Phe Asn Leu Arg Ser Glu Pro
Ala Phe610 615 620Ile Leu Arg Ser Lys Gly
Glu Thr Thr Leu Phe Ala Ser Val Val Glu625 630
635 640Thr His Gly Tyr Phe Asn Glu Glu Phe Glu Gln
Ser Val Asn Ala Arg645 650 655Gly Val Val
Lys Asp Ile Lys Val Val Ala His Thr Asn Val Gly Ser660
665 670Val Val Glu Ile Thr Thr Glu Lys Ser Asn Val Thr
Val Met Ile Ser675 680 685Asn Gln Leu Gly
Ala Thr Asp Ser Thr Glu His Lys Val Glu Leu Asn690 695
700Gly Lys Val Tyr Ser Trp Lys Gly Phe Tyr Ser Val Glu Thr
Thr Leu705 710 715 720Gln
Glu Thr Asn Ser Glu Glu Leu Ser Thr Ala Gly Gln Gly Lys725
730 735
17 2154 DNA Vibrio splendidus
17 atgagctatc
aaccactttt acttaacttt gatgaagcag ctgaacttcg taaagaactt 60ggcaaggata
gcctattagg taacgcactg actcgcgaca ttaaacaaac tgacgcttac 120atggctgaag
ttggcattga agtaccaggt cacggtgaag gcggcggtta cgagcacaac 180cgtcataagc
aaaactacat ccatatggat ctagcaggcc gtttgttcct tatcactgag 240gaaacaaaat
accgagatta catcgttgat atgctaacag cgtacgcgac ggtataccca 300acacttgaaa
gcaacgtaag ccgtgactct aaccctccgg gtaagctgtt ccaccaaacg 360ttgaacgaga
acatgtggat gctttacgct tcttgtgcgt acagctgcat ctaccacacg 420atctctgaag
agcaaaagcg tctgatcgaa gacgatcttc ttaagcaaat gatcgaaatg 480ttcgttgtga
cttacgcaca cgacttcgat atcgtacaca accacggctt atgggcagtg 540gcagcagtag
gtatctgtgg ttacgcaatc aacgatcaag agtctgtaga caaagcacta 600tacggcctga
aactagacaa agtcagcggc ggtttcttag cgcaactaga ccaactgttt 660tcgccagacg
gctactacat ggaaggtcct tactaccacc gtttctctct gcgtccaatc 720tacctgttcg
cagaagcgat tgaacgtcgt cagcctgaag ttggtatcta tgaattcaac 780gattcagtga
tcaagacaac gtcttactct gtattcaaaa cggcattccc agacggtaca 840ttgcctgctc
tgaacgattc atcgaagaca atctctatca acgatgaagg cgttatcatg 900gcaacgtctg
tgtgttacca ccgttacgag caaactgaaa ctctacttgg tatggctaac 960caccagcaaa
acgtttgggt tcatgcttca ggtaaaacac tgtctgacgc ggttgatgca 1020gcagacgaca
tcaaagcatt caactggggt agcctgtttg taaccgacgg ccctgaaggc 1080gaaaaaggcg
gcgtaagcat ccttcgtcac cgtgacgaac aagatgacga cacgatggcg 1140ttgatctggt
ttggtcaaca cggttctgat caccagtacc actctgctct agaccacggt 1200cactacgatg
gcctgcacct aagcgtattt aaccgtggcc acgaagtgct gcacgatttc 1260ggcttcggtc
gctgggtaaa cgttgagcct aagtttggcg gtcgttacat cccagagaac 1320aagtcttact
gtaagcagac ggttgctcac aacacagtaa cggttgatca gaaaacgcag 1380aacaacttca
acacagcatt ggctgagtct aagtttggtc agaagcactt cttcgtagca 1440gacgaccagt
ctctacaagg catgagcggc acaatttctg agtactacac tggcgtagac 1500atgcaacgca
gcgtgattct tgctgaactt cctgagttcg agaagccact tgtaatcgac 1560gtataccgca
tcgaagctga cgctgaacac cagtacgacc tacccgttca ccactctggt 1620cagatcatcc
gtactgactt cgattacaac atggaaaaaa cgcttaagcc gctaggtgaa 1680gacaacggtt
accagcactt atggaacgtg gcttcaggca aagtgaacga agaaggttct 1740ctagtaagct
ggctacatga cagcagctac tacagcctag taaccagcgc gaatgcgggc 1800agcgaagtga
tttttgctcg cactggtgct aacgatccag acttcaacct taagagtgag 1860cctgcgttca
tcttacgtca gtctggtcaa aaccacgtgt ttgcttctgt actagaaacg 1920catggttact
ttaacgagtc tatcgaagcc tctgtaggcg ctcgtggtct agttaaatca 1980gtatctgttg
tgggccataa cagtgtcggg actgttgttc gcattcagac tacttctggc 2040aacacttacc
actacggtat ctcaaaccaa gctgaagaca cgcagcaagc aactcacact 2100gttgagttcg
cgggtgagac atactcgtgg gaaggatcat ttgctcaact gtaa 2154
18 717 PRT Vibrio splendidus
18 Met Ser Tyr Gln Pro Leu Leu Leu Asn Phe
Asp Glu Ala Ala Glu Leu1 5 10
15Arg Lys Glu Leu Gly Lys Asp Ser Leu Leu Gly Asn Ala Leu Thr Arg20
25 30Asp Ile Lys Gln Thr Asp Ala Tyr Met
Ala Glu Val Gly Ile Glu Val35 40 45Pro
Gly His Gly Glu Gly Gly Gly Tyr Glu His Asn Arg His Lys Gln50
55 60Asn Tyr Ile His Met Asp Leu Ala Gly Arg Leu
Phe Leu Ile Thr Glu65 70 75
80Glu Thr Lys Tyr Arg Asp Tyr Ile Val Asp Met Leu Thr Ala Tyr Ala85
90 95Thr Val Tyr Pro Thr Leu Glu Ser Asn
Val Ser Arg Asp Ser Asn Pro100 105 110Pro
Gly Lys Leu Phe His Gln Thr Leu Asn Glu Asn Met Trp Met Leu115
120 125Tyr Ala Ser Cys Ala Tyr Ser Cys Ile Tyr His
Thr Ile Ser Glu Glu130 135 140Gln Lys Arg
Leu Ile Glu Asp Asp Leu Leu Lys Gln Met Ile Glu Met145
150 155 160Phe Val Val Thr Tyr Ala His
Asp Phe Asp Ile Val His Asn His Gly165 170
175Leu Trp Ala Val Ala Ala Val Gly Ile Cys Gly Tyr Ala Ile Asn Asp180
185 190Gln Glu Ser Val Asp Lys Ala Leu Tyr
Gly Leu Lys Leu Asp Lys Val195 200 205Ser
Gly Gly Phe Leu Ala Gln Leu Asp Gln Leu Phe Ser Pro Asp Gly210
215 220Tyr Tyr Met Glu Gly Pro Tyr Tyr His Arg Phe
Ser Leu Arg Pro Ile225 230 235
240Tyr Leu Phe Ala Glu Ala Ile Glu Arg Arg Gln Pro Glu Val Gly
Ile245 250 255Tyr Glu Phe Asn Asp Ser Val
Ile Lys Thr Thr Ser Tyr Ser Val Phe260 265
270Lys Thr Ala Phe Pro Asp Gly Thr Leu Pro Ala Leu Asn Asp Ser Ser275
280 285Lys Thr Ile Ser Ile Asn Asp Glu Gly
Val Ile Met Ala Thr Ser Val290 295 300Cys
Tyr His Arg Tyr Glu Gln Thr Glu Thr Leu Leu Gly Met Ala Asn305
310 315 320His Gln Gln Asn Val Trp
Val His Ala Ser Gly Lys Thr Leu Ser Asp325 330
335Ala Val Asp Ala Ala Asp Asp Ile Lys Ala Phe Asn Trp Gly Ser
Leu340 345 350Phe Val Thr Asp Gly Pro Glu
Gly Glu Lys Gly Gly Val Ser Ile Leu355 360
365Arg His Arg Asp Glu Gln Asp Asp Asp Thr Met Ala Leu Ile Trp Phe370
375 380Gly Gln His Gly Ser Asp His Gln Tyr
His Ser Ala Leu Asp His Gly385 390 395
400His Tyr Asp Gly Leu His Leu Ser Val Phe Asn Arg Gly His
Glu Val405 410 415Leu His Asp Phe Gly Phe
Gly Arg Trp Val Asn Val Glu Pro Lys Phe420 425
430Gly Gly Arg Tyr Ile Pro Glu Asn Lys Ser Tyr Cys Lys Gln Thr
Val435 440 445Ala His Asn Thr Val Thr Val
Asp Gln Lys Thr Gln Asn Asn Phe Asn450 455
460Thr Ala Leu Ala Glu Ser Lys Phe Gly Gln Lys His Phe Phe Val Ala465
470 475 480Asp Asp Gln Ser
Leu Gln Gly Met Ser Gly Thr Ile Ser Glu Tyr Tyr485 490
495Thr Gly Val Asp Met Gln Arg Ser Val Ile Leu Ala Glu Leu
Pro Glu500 505 510Phe Glu Lys Pro Leu Val
Ile Asp Val Tyr Arg Ile Glu Ala Asp Ala515 520
525Glu His Gln Tyr Asp Leu Pro Val His His Ser Gly Gln Ile Ile
Arg530 535 540Thr Asp Phe Asp Tyr Asn Met
Glu Lys Thr Leu Lys Pro Leu Gly Glu545 550
555 560Asp Asn Gly Tyr Gln His Leu Trp Asn Val Ala Ser
Gly Lys Val Asn565 570 575Glu Glu Gly Ser
Leu Val Ser Trp Leu His Asp Ser Ser Tyr Tyr Ser580 585
590Leu Val Thr Ser Ala Asn Ala Gly Ser Glu Val Ile Phe Ala
Arg Thr595 600 605Gly Ala Asn Asp Pro Asp
Phe Asn Leu Lys Ser Glu Pro Ala Phe Ile610 615
620Leu Arg Gln Ser Gly Gln Asn His Val Phe Ala Ser Val Leu Glu
Thr625 630 635 640His Gly
Tyr Phe Asn Glu Ser Ile Glu Ala Ser Val Gly Ala Arg Gly645
650 655Leu Val Lys Ser Val Ser Val Val Gly His Asn Ser
Val Gly Thr Val660 665 670Val Arg Ile Gln
Thr Thr Ser Gly Asn Thr Tyr His Tyr Gly Ile Ser675 680
685Asn Gln Ala Glu Asp Thr Gln Gln Ala Thr His Thr Val Glu
Phe Ala690 695 700Gly Glu Thr Tyr Ser Trp
Glu Gly Ser Phe Ala Gln Leu705 710 715
19 825 DNA Vibrio splendidus
19 atgaagtggt tattggcaat agttgcgatg
tctggtgtcg cattggcggc agaaaataag 60aatgttgagg tgagcagtga gcatttcgtc
cgttatcaat accaagacaa aatcagctat 120ggaaagctag acaatgacgc agtgttaccg
gtcagcggcg atctctttgg cgaatattcg 180gtagcaaaaa attcgatccc gttagagtcg
gttgaggtgt tactaccgac aaaaccagag 240aaagtcttcg ccgtcgggat gaacttcgct
agccacttag cctcacctgc cgatgcacca 300ccgccgatgt ttcttaaact tccttcttct
ttgattctca cgggcgaagt gattcaagtg 360ccaccaaaag caagaaatgt tcattttgaa
ggcgagctgg tggttgtgat tggtagagag 420ctcagtcaag ccagtgaaga agaagccgaa
caagcgatct ttggcgtcac ggtgggcaac 480gatattactg aaagaagttg gcaaggcgcc
gatttacaat ggctccgagc gaaagcttcc 540gatggttttg gcccggttgg caacacaatt
gtgcgcggca ttgattacaa caatattgag 600ttaaccactc gtgttaacgg taaagtggtt
caacaagaaa atacttcgtt catgatccac 660aagccaagaa aagtcgtgag ctatttgagc
tattatttta ccctcaaacc gggcgatcta 720attttcatgg gcacgccagg tagaacttat
gctctgtccg acaaagatca agtgagtgtc 780acgattgaag gggtagggac tgtggtaaat
gaagtgcggt tctga 825
20 274 PRT Vibrio splendidus
20 Met
Lys Trp Leu Leu Ala Ile Val Ala Met Ser Gly Val Ala Leu Ala1
5 10 15Ala Glu Asn Lys Asn Val Glu Val
Ser Ser Glu His Phe Val Arg Tyr20 25
30Gln Tyr Gln Asp Lys Ile Ser Tyr Gly Lys Leu Asp Asn Asp Ala Val35
40 45Leu Pro Val Ser Gly Asp Leu Phe Gly Glu
Tyr Ser Val Ala Lys Asn50 55 60Ser Ile
Pro Leu Glu Ser Val Glu Val Leu Leu Pro Thr Lys Pro Glu65
70 75 80Lys Val Phe Ala Val Gly Met
Asn Phe Ala Ser His Leu Ala Ser Pro85 90
95Ala Asp Ala Pro Pro Pro Met Phe Leu Lys Leu Pro Ser Ser Leu Ile100
105 110Leu Thr Gly Glu Val Ile Gln Val Pro
Pro Lys Ala Arg Asn Val His115 120 125Phe
Glu Gly Glu Leu Val Val Val Ile Gly Arg Glu Leu Ser Gln Ala130
135 140Ser Glu Glu Glu Ala Glu Gln Ala Ile Phe Gly
Val Thr Val Gly Asn145 150 155
160Asp Ile Thr Glu Arg Ser Trp Gln Gly Ala Asp Leu Gln Trp Leu
Arg165 170 175Ala Lys Ala Ser Asp Gly Phe
Gly Pro Val Gly Asn Thr Ile Val Arg180 185
190Gly Ile Asp Tyr Asn Asn Ile Glu Leu Thr Thr Arg Val Asn Gly Lys195
200 205Val Val Gln Gln Glu Asn Thr Ser Phe
Met Ile His Lys Pro Arg Lys210 215 220Val
Val Ser Tyr Leu Ser Tyr Tyr Phe Thr Leu Lys Pro Gly Asp Leu225
230 235 240Ile Phe Met Gly Thr Pro
Gly Arg Thr Tyr Ala Leu Ser Asp Lys Asp245 250
255Gln Val Ser Val Thr Ile Glu Gly Val Gly Thr Val Val Asn Glu
Val260 265 270Arg Phe
21 717 DNA Vibrio splendidus
21 atggctagca cttttaattc aatttcgggc tcgaagcgta gcctgcacgt
gcaagtagca 60cgcgaaatcg ctcgaggaat tttgtctggt gatctgccgc aaggttctat
tattcctggt 120gaaatggcgt tgtgtgaaca gtttggtatc agccgaacgg cacttcgtga
agcagttaaa 180ctactgacct ctaaaggtct gttagagtct cgccctaaaa ttggtactcg
cgtagtcgac 240cgcgcatact ggaacttcct tgatcctcaa ctgattgaat ggatggacgg
actaaccgac 300gtagaccaat tctgttctca gtttttaggc cttcgccgtg cgatcgagcc
tgaagcgtgt 360gcactggcgg caaaatttgc gacagctgaa caacgtatcg agctttcaga
gatcttccaa 420aagatggtcg aagtggatga agctgaagtg tttgaccaag aacgttggac
agacattgat 480actcgtttcc atagcttgat cttcaatgcg accggtaacg acttctatct
accgttcggt 540aatattctga ctactatgtt cgttaacttc atagtgcatt cttctgaaga
gggaagcaca 600tgcatcaatg aacaccgcag aatctatgaa gctatcatgg ccggtgattg
tgacaaggct 660agaattgctt ctgctgttca cttgcaagat gccaaccacc gtttggcaac
agcataa 717
22 238 PRT Vibrio splendidus
22 Met Ala Ser Thr Phe Asn Ser
Ile Ser Gly Ser Lys Arg Ser Leu His1 5 10
15Val Gln Val Ala Arg Glu Ile Ala Arg Gly Ile Leu Ser Gly
Asp Leu20 25 30Pro Gln Gly Ser Ile Ile
Pro Gly Glu Met Ala Leu Cys Glu Gln Phe35 40
45Gly Ile Ser Arg Thr Ala Leu Arg Glu Ala Val Lys Leu Leu Thr Ser50
55 60Lys Gly Leu Leu Glu Ser Arg Pro Lys
Ile Gly Thr Arg Val Val Asp65 70 75
80Arg Ala Tyr Trp Asn Phe Leu Asp Pro Gln Leu Ile Glu Trp
Met Asp85 90 95Gly Leu Thr Asp Val Asp
Gln Phe Cys Ser Gln Phe Leu Gly Leu Arg100 105
110Arg Ala Ile Glu Pro Glu Ala Cys Ala Leu Ala Ala Lys Phe Ala
Thr115 120 125Ala Glu Gln Arg Ile Glu Leu
Ser Glu Ile Phe Gln Lys Met Val Glu130 135
140Val Asp Glu Ala Glu Val Phe Asp Gln Glu Arg Trp Thr Asp Ile Asp145
150 155 160Thr Arg Phe His
Ser Leu Ile Phe Asn Ala Thr Gly Asn Asp Phe Tyr165 170
175Leu Pro Phe Gly Asn Ile Leu Thr Thr Met Phe Val Asn Phe
Ile Val180 185 190His Ser Ser Glu Glu Gly
Ser Thr Cys Ile Asn Glu His Arg Arg Ile195 200
205Tyr Glu Ala Ile Met Ala Gly Asp Cys Asp Lys Ala Arg Ile Ala
Ser210 215 220Ala Val His Leu Gln Asp Ala
Asn His Arg Leu Ala Thr Ala225 230 235
23 1779 DNA Vibrio splendidus
23 atggaactca acacgattat tgtcggcatt
tatttcctat tcttgattgc gataggttgg 60atgtttagaa catttacaag tactactagt
gactacttcc gcgggggcgg taacatgttg 120tggtggatgg ttggtgcaac cgcctttatg
acccagttta gtgcatggac attcaccggt 180gcagcaggta aagcgtataa cgatggtttc
gctgtagcgg tcatcttcgt agccaacgca 240tttggttact tcatgaacta cgcgtacttc
gcgccgaaat tccgtcaact tcgcgttgtt 300acggtaatcg aagcgattcg tatgcgtttt
ggtgcgacca acgaacaagt attcacttgg 360tcttcaatgc caaactcagt ggtatctgcg
ggtgtgtggt taaacgcatt ggcaatcatc 420gcttcgggta tcttcggttt cgacatgaac
atgactatct gggtgactgg cctagtggta 480ttggcaatgt cggtaacagg tggttcatgg
gcggtaatcg catctgactt catgcagatg 540gttatcatca tggcggtaac ggtaacttgt
gcggttgtag cggttgttca aggtggcggt 600gttggtgaga ttgttaacaa cttcccagta
caagatggtg gttcgttcct ttggggcaac 660aacatcaact acctaagcat ctttacgatt
tgggcattct tcatcttcgt taagcagttc 720tcaatcacga acaacatgct taactcttac
cgttacctag cggctaaaga ctcaaagaac 780gctaagaaag ctgcactgct tgcttgtgtg
ttgatgttgt gtggtgtgtt tatttggttc 840atgccttctt ggttcattgc aggccaaggt
gttgatttat cagcggctta cccgaatgca 900ggtaaaaaag cgggtgactt tgcttaccta
tacttcgtac aagagtacat gccagcaggt 960atggttggtc tattagttgc cgcgatgttt
gcagcgacaa tgtcttcaat ggactcaggt 1020ctaaaccgta actcaggtat ttttgttaag
aacttctacg aaacaatcgt tcgtaaaggt 1080caagcatcag agaaagagct agtaaccgta
tctaaaatta cttcagcggt atttggtttc 1140gctattatcc taatcgcaca gttcatcaac
tcattaaaag gcttaagcct gtttgatacg 1200atgatgtacg taggtgcgtt aatcggcttc
cctatgacga ttcctgcatt ccttggtttc 1260ttcatcaaga agactccgga ctgggctggt
tggggaacgc tagttgttgg tggtatcgta 1320tcttatgtgg ttggttttgt tatcaacgcg
gagatggtag cagcggcgtt tggtcttgat 1380actctaacag gacgtgaatg gtctgatgtt
aaagttgcga ttggtctgat tgctcacatc 1440acgctaaccg gtggcttctt cgtactatct
acgatgttct acaagcctct atcaaaagaa 1500cgtcaagcgg atgttgataa gttctttggc
aacttagata ccccattagt agctgaatcg 1560gcagagcaaa aagtgttgga taacaaacaa
cgtcaaatgc ttggtaaact gattgcggta 1620gcgggtgttg gtattatgct gatggctctt
ctgactaacc caatgtgggg gcgcctagtc 1680ttcatcttat gtggtgtgat agtgggtggt
gtcggtattc tacttgtgaa agcggtcgat 1740gacggcggca agcaagcgaa agcagtaacc
gaaagctaa 1779
24 592 PRT Vibrio splendidus
24 Met
Glu Leu Asn Thr Ile Ile Val Gly Ile Tyr Phe Leu Phe Leu Ile1
5 10 15Ala Ile Gly Trp Met Phe Arg Thr
Phe Thr Ser Thr Thr Ser Asp Tyr20 25
30Phe Arg Gly Gly Gly Asn Met Leu Trp Trp Met Val Gly Ala Thr Ala35
40 45Phe Met Thr Gln Phe Ser Ala Trp Thr Phe
Thr Gly Ala Ala Gly Lys50 55 60Ala Tyr
Asn Asp Gly Phe Ala Val Ala Val Ile Phe Val Ala Asn Ala65
70 75 80Phe Gly Tyr Phe Met Asn Tyr
Ala Tyr Phe Ala Pro Lys Phe Arg Gln85 90
95Leu Arg Val Val Thr Val Ile Glu Ala Ile Arg Met Arg Phe Gly Ala100
105 110Thr Asn Glu Gln Val Phe Thr Trp Ser
Ser Met Pro Asn Ser Val Val115 120 125Ser
Ala Gly Val Trp Leu Asn Ala Leu Ala Ile Ile Ala Ser Gly Ile130
135 140Phe Gly Phe Asp Met Asn Met Thr Ile Trp Val
Thr Gly Leu Val Val145 150 155
160Leu Ala Met Ser Val Thr Gly Gly Ser Trp Ala Val Ile Ala Ser
Asp165 170 175Phe Met Gln Met Val Ile Ile
Met Ala Val Thr Val Thr Cys Ala Val180 185
190Val Ala Val Val Gln Gly Gly Gly Val Gly Glu Ile Val Asn Asn Phe195
200 205Pro Val Gln Asp Gly Gly Ser Phe Leu
Trp Gly Asn Asn Ile Asn Tyr210 215 220Leu
Ser Ile Phe Thr Ile Trp Ala Phe Phe Ile Phe Val Lys Gln Phe225
230 235 240Ser Ile Thr Asn Asn Met
Leu Asn Ser Tyr Arg Tyr Leu Ala Ala Lys245 250
255Asp Ser Lys Asn Ala Lys Lys Ala Ala Leu Leu Ala Cys Val Leu
Met260 265 270Leu Cys Gly Val Phe Ile Trp
Phe Met Pro Ser Trp Phe Ile Ala Gly275 280
285Gln Gly Val Asp Leu Ser Ala Ala Tyr Pro Asn Ala Gly Lys Lys Ala290
295 300Gly Asp Phe Ala Tyr Leu Tyr Phe Val
Gln Glu Tyr Met Pro Ala Gly305 310 315
320Met Val Gly Leu Leu Val Ala Ala Met Phe Ala Ala Thr Met
Ser Ser325 330 335Met Asp Ser Gly Leu Asn
Arg Asn Ser Gly Ile Phe Val Lys Asn Phe340 345
350Tyr Glu Thr Ile Val Arg Lys Gly Gln Ala Ser Glu Lys Glu Leu
Val355 360 365Thr Val Ser Lys Ile Thr Ser
Ala Val Phe Gly Phe Ala Ile Ile Leu370 375
380Ile Ala Gln Phe Ile Asn Ser Leu Lys Gly Leu Ser Leu Phe Asp Thr385
390 395 400Met Met Tyr Val
Gly Ala Leu Ile Gly Phe Pro Met Thr Ile Pro Ala405 410
415Phe Leu Gly Phe Phe Ile Lys Lys Thr Pro Asp Trp Ala Gly
Trp Gly420 425 430Thr Leu Val Val Gly Gly
Ile Val Ser Tyr Val Val Gly Phe Val Ile435 440
445Asn Ala Glu Met Val Ala Ala Ala Phe Gly Leu Asp Thr Leu Thr
Gly450 455 460Arg Glu Trp Ser Asp Val Lys
Val Ala Ile Gly Leu Ile Ala His Ile465 470
475 480Thr Leu Thr Gly Gly Phe Phe Val Leu Ser Thr Met
Phe Tyr Lys Pro485 490 495Leu Ser Lys Glu
Arg Gln Ala Asp Val Asp Lys Phe Phe Gly Asn Leu500 505
510Asp Thr Pro Leu Val Ala Glu Ser Ala Glu Gln Lys Val Leu
Asp Asn515 520 525Lys Gln Arg Gln Met Leu
Gly Lys Leu Ile Ala Val Ala Gly Val Gly530 535
540Ile Met Leu Met Ala Leu Leu Thr Asn Pro Met Trp Gly Arg Leu
Val545 550 555 560Phe Ile
Leu Cys Gly Val Ile Val Gly Gly Val Gly Ile Leu Leu Val565
570 575Lys Ala Val Asp Asp Gly Gly Lys Gln Ala Lys Ala
Val Thr Glu Ser580 585 590
25 2079 DNA Vibrio splendidus
25 atgagcgacc aaaaatctct tgatgcaatc aggaagatga agctggaaaa
cgatacttca 60gcaggtaatc ttgtagacct actccctatc gaagttcaaa cacgtgactt
cgacctatca 120ttcctagaca ccttgagcga agcacgtccg cgtcttcttg ttcaagctga
tcagctagaa 180gaattcaaag caaaagtgaa agctgatcaa gctcactgta tgtttgatga
tttctacaac 240aactctaccg ttaagttcct tgagactgct cctttcgaag agcctcaagc
gtacccagct 300gagacggtag gtaaagcttc tctatggcgt ccttattggc gtcaaatgta
cgttgattgc 360caaatggcac tgaacgcgac acgtaaccta gcgattgctg gtgttgtaaa
agaagacgaa 420gcgctcattg cgaaagcaaa agcttggact ctaaaactgt ctacgtacga
tccagaaggc 480gtgacttctc gtggctataa cgatgaagcg gctttccgtg ttatcgctgc
tatggcttgg 540ggttacgatt ggctacacgg ctacttcacc gatgaagaac gccagcaagt
tcaagatgct 600ttgattgagc gtctagacga aatcatgcac cacctgaaag tgacggttga
tctattgaac 660aacccactaa atagccacgg tgttcgttct atctcttctg ctatcatccc
aacgtgtatc 720gcgctttacc acgatcaccc gaaagcaggc gagtacattg catacgcgct
agaatactac 780gcagtacatt acccaccatg gggcggtgta gacggcggtt gggctgaagg
tcctgattac 840tggaacacgc aaactgcatt cctaggcgaa gcattcgacc tattgaaagc
atactgtggt 900gtagacatgt ttaacaaaac attctacgaa aacacaggtg atttcccgct
ttactgcatg 960ccagttcact ctaagcgcgc gagcttctgt gaccagtctt caatcggcga
tttcccaggt 1020ttaaaactgg cttacaacat caagcactac gcaggtgtta accagaagcc
tgagtacgtt 1080tggtactata accagcttaa aggccgtgat actgaagcac acaccaaatt
ctacaacttc 1140ggttggtggg acttcggtta tgacgatctt cgttttaact tcctttggga
tgcacctgaa 1200gagaaagccc catcgaacga tccactgttg aaagtattcc caatcacggg
ttgggctgca 1260ttccacaaca agatgactga gcgtgataac catattcaca tggtattcaa
atgttctccg 1320tttggctcaa tcagccactc tcacggtgac caaaacgcat ttacgcttca
cgcatttggt 1380gaaacgctag cgtcagtaac aggttactat ggtggtttcg gtgtagacat
gcacacgaaa 1440tggcgtcgtc aaacgttctc taaaaacctg ccactatttg gcggtaaagg
tcagtacggc 1500gagaacaaga acacaggcta cgaaaaccac caagatcgct tttgtatcga
agcgggcggc 1560actatctctg acttcgacac tgaatctgat gtgaagatgg ttgaaggtga
tgcaacggca 1620tcttacaagt acttcgttcc tgaaatcgaa tcttacaagc gtaaagtctg
gttcgttcaa 1680ggtaaagtct tcgtaatgca agacaaggca acgctttctg aagagaaaga
catgacttgg 1740ctaatgcaca caactttcgc aaacgaagtg gcagacaagt ctttcactat
ccgtggcgaa 1800gttgcgcacc tagacgtaaa cttcatcaac gagtctgctg ataacatcac
gtcagttaag 1860aacgttgaag gctttggcga agttgaccca tacgagttca aagatcttga
gatccaccgt 1920cacgtggaag tggaattcaa gccatcgaaa gagcacaaca tcctgacgct
tcttgttcct 1980aataagaatg aaggcgagca agttgaagtg tttcacaagc ttgaaggcaa
cacgctactg 2040ctaaatgttg acggcgaaac ggtttcaatc gaactgtaa 2079
26 692 PRT Vibrio splendidus
26 Met Ser Asp Gln Lys Ser Leu
Asp Ala Ile Arg Lys Met Lys Leu Glu1 5 10
15Asn Asp Thr Ser Ala Gly Asn Leu Val Asp Leu Leu Pro Ile
Glu Val20 25 30Gln Thr Arg Asp Phe Asp
Leu Ser Phe Leu Asp Thr Leu Ser Glu Ala35 40
45Arg Pro Arg Leu Leu Val Gln Ala Asp Gln Leu Glu Glu Phe Lys Ala50
55 60Lys Val Lys Ala Asp Gln Ala His Cys
Met Phe Asp Asp Phe Tyr Asn65 70 75
80Asn Ser Thr Val Lys Phe Leu Glu Thr Ala Pro Phe Glu Glu
Pro Gln85 90 95Ala Tyr Pro Ala Glu Thr
Val Gly Lys Ala Ser Leu Trp Arg Pro Tyr100 105
110Trp Arg Gln Met Tyr Val Asp Cys Gln Met Ala Leu Asn Ala Thr
Arg115 120 125Asn Leu Ala Ile Ala Gly Val
Val Lys Glu Asp Glu Ala Leu Ile Ala130 135
140Lys Ala Lys Ala Trp Thr Leu Lys Leu Ser Thr Tyr Asp Pro Glu Gly145
150 155 160Val Thr Ser Arg
Gly Tyr Asn Asp Glu Ala Ala Phe Arg Val Ile Ala165 170
175Ala Met Ala Trp Gly Tyr Asp Trp Leu His Gly Tyr Phe Thr
Asp Glu180 185 190Glu Arg Gln Gln Val Gln
Asp Ala Leu Ile Glu Arg Leu Asp Glu Ile195 200
205Met His His Leu Lys Val Thr Val Asp Leu Leu Asn Asn Pro Leu
Asn210 215 220Ser His Gly Val Arg Ser Ile
Ser Ser Ala Ile Ile Pro Thr Cys Ile225 230
235 240Ala Leu Tyr His Asp His Pro Lys Ala Gly Glu Tyr
Ile Ala Tyr Ala245 250 255Leu Glu Tyr Tyr
Ala Val His Tyr Pro Pro Trp Gly Gly Val Asp Gly260 265
270Gly Trp Ala Glu Gly Pro Asp Tyr Trp Asn Thr Gln Thr Ala
Phe Leu275 280 285Gly Glu Ala Phe Asp Leu
Leu Lys Ala Tyr Cys Gly Val Asp Met Phe290 295
300Asn Lys Thr Phe Tyr Glu Asn Thr Gly Asp Phe Pro Leu Tyr Cys
Met305 310 315 320Pro Val
His Ser Lys Arg Ala Ser Phe Cys Asp Gln Ser Ser Ile Gly325
330 335Asp Phe Pro Gly Leu Lys Leu Ala Tyr Asn Ile Lys
His Tyr Ala Gly340 345 350Val Asn Gln Lys
Pro Glu Tyr Val Trp Tyr Tyr Asn Gln Leu Lys Gly355 360
365Arg Asp Thr Glu Ala His Thr Lys Phe Tyr Asn Phe Gly Trp
Trp Asp370 375 380Phe Gly Tyr Asp Asp Leu
Arg Phe Asn Phe Leu Trp Asp Ala Pro Glu385 390
395 400Glu Lys Ala Pro Ser Asn Asp Pro Leu Leu Lys
Val Phe Pro Ile Thr405 410 415Gly Trp Ala
Ala Phe His Asn Lys Met Thr Glu Arg Asp Asn His Ile420
425 430His Met Val Phe Lys Cys Ser Pro Phe Gly Ser Ile
Ser His Ser His435 440 445Gly Asp Gln Asn
Ala Phe Thr Leu His Ala Phe Gly Glu Thr Leu Ala450 455
460Ser Val Thr Gly Tyr Tyr Gly Gly Phe Gly Val Asp Met His
Thr Lys465 470 475 480Trp
Arg Arg Gln Thr Phe Ser Lys Asn Leu Pro Leu Phe Gly Gly Lys485
490 495Gly Gln Tyr Gly Glu Asn Lys Asn Thr Gly Tyr
Glu Asn His Gln Asp500 505 510Arg Phe Cys
Ile Glu Ala Gly Gly Thr Ile Ser Asp Phe Asp Thr Glu515
520 525Ser Asp Val Lys Met Val Glu Gly Asp Ala Thr Ala
Ser Tyr Lys Tyr530 535 540Phe Val Pro Glu
Ile Glu Ser Tyr Lys Arg Lys Val Trp Phe Val Gln545 550
555 560Gly Lys Val Phe Val Met Gln Asp Lys
Ala Thr Leu Ser Glu Glu Lys565 570 575Asp
Met Thr Trp Leu Met His Thr Thr Phe Ala Asn Glu Val Ala Asp580
585 590Lys Ser Phe Thr Ile Arg Gly Glu Val Ala His
Leu Asp Val Asn Phe595 600 605Ile Asn Glu
Ser Ala Asp Asn Ile Thr Ser Val Lys Asn Val Glu Gly610
615 620Phe Gly Glu Val Asp Pro Tyr Glu Phe Lys Asp Leu
Glu Ile His Arg625 630 635
640His Val Glu Val Glu Phe Lys Pro Ser Lys Glu His Asn Ile Leu Thr645
650 655Leu Leu Val Pro Asn Lys Asn Glu Gly
Glu Gln Val Glu Val Phe His660 665 670Lys
Leu Glu Gly Asn Thr Leu Leu Leu Asn Val Asp Gly Glu Thr Val675
Ser Ile Glu Leu690
27 882 DNA Vibrio splendidus
27 atgactaaac ctgtaatcgg tttcattggc ctaggtctta tgggcggcaa catggttgaa
60aacctacaaa agcgcggcta ccacgtaaac gtaatggatc taagcgctga agctgttgct
120cgcgtaacag atcgcggcaa cgcaactgca ttcacttctg ctaaagaact agctgctgca
180agtgacatcg ttcagttttg tctgacaact tctgctgttg ttgaaaaaat cgtttacggc
240gaagacggcg ttctagcggg catcaaagaa ggcgcagtac tagtagactt cggtacttct
300atccctgctt ctactaagaa aatcggcgca gctcttgctg aaaaaggcgc gggcatgatc
360gacgcacctc taggtcgtac tcctgcacac gctaaagatg gtcttctgaa catcatggct
420gctggcgaca tggaaacttt caacaaagtt aaacctgttc ttgaagagca aggcgaaaac
480gtattccacc taggggctct aggttctggt cacgtgacta agcttgtaaa caacttcatg
540ggtatgacga ctgttgcgac tatgtctcaa gctttcgctg ttgctcaacg cgctggtgtt
600gatggccaac aactgtttga catcatgtct gcaggtccat ctaactctcc gttcatgcaa
660ttctgtaagt tctacgcggt agacggcgaa gagaagctag gtttctctgt tgctaacgca
720aacaaagacc ttggttactt ccttgcactt tgtgaagagc taggtactga gtctctaatc
780gctcaaggta ctgcaacaag cctacaagct gctgttgatg caggcatggg taacaacgac
840gtaccagtaa tcttcgacta cttcgctaaa ctagagaagt aa 882
28 293 PRT Vibrio splendidus
28 Met Thr Lys Pro Val Ile Gly Phe Ile Gly
Leu Gly Leu Met Gly Gly1 5 10
15Asn Met Val Glu Asn Leu Gln Lys Arg Gly Tyr His Val Asn Val Met20
25 30Asp Leu Ser Ala Glu Ala Val Ala Arg
Val Thr Asp Arg Gly Asn Ala35 40 45Thr
Ala Phe Thr Ser Ala Lys Glu Leu Ala Ala Ala Ser Asp Ile Val50
55 60Gln Phe Cys Leu Thr Thr Ser Ala Val Val Glu
Lys Ile Val Tyr Gly65 70 75
80Glu Asp Gly Val Leu Ala Gly Ile Lys Glu Gly Ala Val Leu Val Asp85
90 95Phe Gly Thr Ser Ile Pro Ala Ser Thr
Lys Lys Ile Gly Ala Ala Leu100 105 110Ala
Glu Lys Gly Ala Gly Met Ile Asp Ala Pro Leu Gly Arg Thr Pro115
120 125Ala His Ala Lys Asp Gly Leu Leu Asn Ile Met
Ala Ala Gly Asp Met130 135 140Glu Thr Phe
Asn Lys Val Lys Pro Val Leu Glu Glu Gln Gly Glu Asn145
150 155 160Val Phe His Leu Gly Ala Leu
Gly Ser Gly His Val Thr Lys Leu Val165 170
175Asn Asn Phe Met Gly Met Thr Thr Val Ala Thr Met Ser Gln Ala Phe180
185 190Ala Val Ala Gln Arg Ala Gly Val Asp
Gly Gln Gln Leu Phe Asp Ile195 200 205Met
Ser Ala Gly Pro Ser Asn Ser Pro Phe Met Gln Phe Cys Lys Phe210
215 220Tyr Ala Val Asp Gly Glu Glu Lys Leu Gly Phe
Ser Val Ala Asn Ala225 230 235
240Asn Lys Asp Leu Gly Tyr Phe Leu Ala Leu Cys Glu Glu Leu Gly
Thr245 250 255Glu Ser Leu Ile Ala Gln Gly
Thr Ala Thr Ser Leu Gln Ala Ala Val260 265
270Asp Ala Gly Met Gly Asn Asn Asp Val Pro Val Ile Phe Asp Tyr Phe275
Ala Lys Leu Glu Lys290
29 1872 DNA Vibrio splendidus
29 atggtagcgg tcgtcagttc tagtgctttg gcatttacga actggtttac
gcttaacttg 60gccactgaac aggtaaacca aacgatttat aacgagattg atcactcgct
tacgatagaa 120atcaatcaaa tagaaagtac cgttcagcgc accatcgata ccgttaactc
tgttgcacaa 180gagttcatga aatcccctta ccaagtgccg aatgaagcac tcatgcatta
tgccgctaag 240cttggtggca ttgacaagat tgtggtgggt tttgacgacg gccgttctta
tacctctcgc 300ccttcagagt ctttccctaa cggtgttgga ataaaagaaa aatacaatcc
aaccactcga 360ccttggtatc aacaagcgaa attgaaatca ggcttatctt ttagtggtct
gtttttcact 420aagagtactc aagtgcctat gatcggtgtg acctactcat accaagatcg
tgtcatcatg 480gccgatatac gctttgacga tttggaaacg cagcttgaac agctggacag
catctacgaa 540gccaaaggca ttatcatcga cgaaaagggg atggtggtcg cttcaacaat
cgaaaacgtg 600cttccgcaaa ccaatatatc ttctgcagac actcaaatga aactcaacag
tgccattgaa 660cagcctgatc aattcattga gggtgtgatt gatggtaacc agagaatctt
gatggccaag 720aaagtggata ttggcagcca gaaagagtgg ttcatgatct ccagtattga
ccctgaactc 780gcgctcaatc agctgaatgg cgtgatgtcg agtgcgcgca tccttatcgt
cgcttgtgta 840cttggctcgg tgatattgat gattttactt ctgaatcgtt tctaccgccc
aatcgtgtca 900ctgcgcaaaa tcgtccacga tctatcacaa ggtaacggag acctcactca
aaggcttgct 960gagaagggga atgatgactt agggcatatc gccaaagaca tcaacttgtt
cattatcggc 1020ttacaagaga tggttaagga tgtgaaatac aagaactcgg atctcgatac
caaggtactg 1080agtattcgcg aaggttgtaa agaaaccagc gatgtactga aagttcatac
tgatgaaacg 1140gttcaagtgg tctctgcgat taacggcttg tctgaagcat caaacgaagt
agagaagagt 1200tctcagtcgg cggcagaagc agcaagagag gccgctgtgt tcagtgatga
gacgaaacag 1260attaacacgg tgacggaaac ctatatcagt gatcttgaga agcaagtctg
caccacttct 1320gatgacattc gctcaatggc caatgaaacg cagagcatcc agtctatcgt
gtctgtgatt 1380ggcggaattg cggaacaaac taatttgctg gcattgaatg cgtcaattga
agcggcgagg 1440gcgggtgaac atggtcgagg tttcgcggtg gttgctgatg aagtccgtgc
gctagccaac 1500cgaacgcaaa tcagtacctc tgaaattgat gaagcgttat ctggcttgca
gtctaaatca 1560gatggtttgg ttaaatctat tgagttgacc aaaagtaact gtgaactgac
tcgcgctcaa 1620gttgttcaag ctgtaaacat gttggcgaag ctaaccgagc agatggaaac
agtaagtcgt 1680tttaataatg acatttcggg ttcgtctgtt gagcaaaacg cccttattca
gagcattgct 1740aagaacatgc ataagattga aagctttgtt gaggagctta ataaactaag
ccaagatcag 1800ttaactgaat cagcagaaat caaaacactt aacggtagcg ttagtgaatt
gatgagcagc 1860tttaaggttt aa 1872
30 623 PRT Vibrio splendidus
30 Met Val Ala Val Val Ser Ser
Ser Ala Leu Ala Phe Thr Asn Trp Phe1 5 10
15Thr Leu Asn Leu Ala Thr Glu Gln Val Asn Gln Thr Ile Tyr
Asn Glu20 25 30Ile Asp His Ser Leu Thr
Ile Glu Ile Asn Gln Ile Glu Ser Thr Val35 40
45Gln Arg Thr Ile Asp Thr Val Asn Ser Val Ala Gln Glu Phe Met Lys50
55 60Ser Pro Tyr Gln Val Pro Asn Glu Ala
Leu Met His Tyr Ala Ala Lys65 70 75
80Leu Gly Gly Ile Asp Lys Ile Val Val Gly Phe Asp Asp Gly
Arg Ser85 90 95Tyr Thr Ser Arg Pro Ser
Glu Ser Phe Pro Asn Gly Val Gly Ile Lys100 105
110Glu Lys Tyr Asn Pro Thr Thr Arg Pro Trp Tyr Gln Gln Ala Lys
Leu115 120 125Lys Ser Gly Leu Ser Phe Ser
Gly Leu Phe Phe Thr Lys Ser Thr Gln130 135
140Val Pro Met Ile Gly Val Thr Tyr Ser Tyr Gln Asp Arg Val Ile Met145
150 155 160Ala Asp Ile Arg
Phe Asp Asp Leu Glu Thr Gln Leu Glu Gln Leu Asp165 170
175Ser Ile Tyr Glu Ala Lys Gly Ile Ile Ile Asp Glu Lys Gly
Met Val180 185 190Val Ala Ser Thr Ile Glu
Asn Val Leu Pro Gln Thr Asn Ile Ser Ser195 200
205Ala Asp Thr Gln Met Lys Leu Asn Ser Ala Ile Glu Gln Pro Asp
Gln210 215 220Phe Ile Glu Gly Val Ile Asp
Gly Asn Gln Arg Ile Leu Met Ala Lys225 230
235 240Lys Val Asp Ile Gly Ser Gln Lys Glu Trp Phe Met
Ile Ser Ser Ile245 250 255Asp Pro Glu Leu
Ala Leu Asn Gln Leu Asn Gly Val Met Ser Ser Ala260 265
270Arg Ile Leu Ile Val Ala Cys Val Leu Gly Ser Val Ile Leu
Met Ile275 280 285Leu Leu Leu Asn Arg Phe
Tyr Arg Pro Ile Val Ser Leu Arg Lys Ile290 295
300Val His Asp Leu Ser Gln Gly Asn Gly Asp Leu Thr Gln Arg Leu
Ala305 310 315 320Glu Lys
Gly Asn Asp Asp Leu Gly His Ile Ala Lys Asp Ile Asn Leu325
330 335Phe Ile Ile Gly Leu Gln Glu Met Val Lys Asp Val
Lys Tyr Lys Asn340 345 350Ser Asp Leu Asp
Thr Lys Val Leu Ser Ile Arg Glu Gly Cys Lys Glu355 360
365Thr Ser Asp Val Leu Lys Val His Thr Asp Glu Thr Val Gln
Val Val370 375 380Ser Ala Ile Asn Gly Leu
Ser Glu Ala Ser Asn Glu Val Glu Lys Ser385 390
395 400Ser Gln Ser Ala Ala Glu Ala Ala Arg Glu Ala
Ala Val Phe Ser Asp405 410 415Glu Thr Lys
Gln Ile Asn Thr Val Thr Glu Thr Tyr Ile Ser Asp Leu420
425 430Glu Lys Gln Val Cys Thr Thr Ser Asp Asp Ile Arg
Ser Met Ala Asn435 440 445Glu Thr Gln Ser
Ile Gln Ser Ile Val Ser Val Ile Gly Gly Ile Ala450 455
460Glu Gln Thr Asn Leu Leu Ala Leu Asn Ala Ser Ile Glu Ala
Ala Arg465 470 475 480Ala
Gly Glu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg485
490 495Ala Leu Ala Asn Arg Thr Gln Ile Ser Thr Ser
Glu Ile Asp Glu Ala500 505 510Leu Ser Gly
Leu Gln Ser Lys Ser Asp Gly Leu Val Lys Ser Ile Glu515
520 525Leu Thr Lys Ser Asn Cys Glu Leu Thr Arg Ala Gln
Val Val Gln Ala530 535 540Val Asn Met Leu
Ala Lys Leu Thr Glu Gln Met Glu Thr Val Ser Arg545 550
555 560Phe Asn Asn Asp Ile Ser Gly Ser Ser
Val Glu Gln Asn Ala Leu Ile565 570 575Gln
Ser Ile Ala Lys Asn Met His Lys Ile Glu Ser Phe Val Glu Glu580
585 590Leu Asn Lys Leu Ser Gln Asp Gln Leu Thr Glu
Ser Ala Glu Ile Lys595 600 605Thr Leu Asn
Gly Ser Val Ser Glu Leu Met Ser Ser Phe Lys Val610 615 620
31 1743 DNA Vibrio splendidus
31 gtgaataagc caatctttgt
cgtcgtactc gcttcgctta cgtatggctg cggtggaagc 60agctccagtg actctagtga
cccttctgat accaataact caggagcatc ttatggtgtt 120gttgctccct atgatattgc
caagtatcaa aacatccttt ccagctcaga tcttcaggtg 180tctgatccta atggagagga
gggcaataaa acctctgaag tcaaagatgg taacttcgat 240ggttatgtca gtgattattt
ttatgctgac gaagagacgg aaaatctgat cttcaaaatg 300gcgaactaca agatgcgctc
tgaagttcgt gaaggagaaa acttcgatat caatgaagca 360ggcgtaagac gcagtctaca
tgcggaaata agcctacctg atattgagca tgtaatggcg 420agttctcccg cagatcacga
tgaagtgacc gtgctacaga tccacaataa aggtacagac 480gagagtggca cgggttatat
ccctcatccg ctattgcgtg tggtttggga gcaagaacga 540gatggcctca caggtcacta
ctgggcagtc atgaaaaata atgccattga ctgtagcagt 600gccgctgact cttcggattg
ttatgccact tcatataatc gctacgattt gggagaggcg 660gatctcgata acttcaccaa
gtttgatctt tctgtttatg aaaataccct ttcgatcaaa 720gtgaacgatg aagttaaagt
cgacgaagac atcacctact ggcagcatct actgagttac 780tttaaagcgg gtatctacaa
tcaatttgaa aatggtgaag ccacggctca ctttcaggca 840ctgcgataca ccaccacaca
ggtcaacggc tcaaacgatt gggatattaa tgattggaag 900ttgacgattc ctgcgagtaa
agacacttgg tatggaagtg ggggtgacag tgcggctgaa 960ctagaacctg agcgctgcga
atcgagcaaa gaccttctcg ccaacgacag tgatgtctac 1020gacagcgata ttggtctttc
ttatttcaat accgatgaag ggagagtgca ctttagagcg 1080gatatgggat atggcacctc
taccgaaaat tctagctata ttcgctctga gctcagggag 1140ttgtatcaaa gcagtgttca
accggattgt agcaccagcg atgaagatac aagttggtat 1200ttggacgaca ctagaacgaa
cgctaccagt cacgagttaa ccgcaagctt acgaattgaa 1260gactacccga acattaataa
ccaagacccg aaagtggtgc ttgggcaaat acacggttgg 1320aagatcaatc aagcattggt
gaagttgtta tgggaaggcg agagtaagcc agtaagagtg 1380atactgaact ctgattttga
gcgcaacaac caagactgta accattgtga cccgttcagt 1440gtcgagttag gtacttattc
ggcaagtgaa gagtggcgat atacgattcg agccaatcaa 1500gacggtatct acttagcgac
tcatgattta gatggaacta atacggtttc tcatttaatc 1560ccttggggac aagattacac
agataaagat ggggacacgg tctcgttgac gtcagattgg 1620acatcgacag acatcgcttt
ctatttcaaa gcgggcatct acccacaatt taagcctgat 1680agcgactatg cgggtgaagt
gtttgatgtg agctttagtt ctctaagagc agagcataac 1740tga 1743
32 580 PRT Vibrio splendidus
32 Met Asn Lys Pro Ile Phe Val Val Val Leu Ala Ser Leu Thr Tyr Gly1
5 10 15Cys Gly Gly Ser Ser Ser
Ser Asp Ser Ser Asp Pro Ser Asp Thr Asn20 25
30Asn Ser Gly Ala Ser Tyr Gly Val Val Ala Pro Tyr Asp Ile Ala Lys35
40 45Tyr Gln Asn Ile Leu Ser Ser Ser Asp
Leu Gln Val Ser Asp Pro Asn50 55 60Gly
Glu Glu Gly Asn Lys Thr Ser Glu Val Lys Asp Gly Asn Phe Asp65
70 75 80Gly Tyr Val Ser Asp Tyr
Phe Tyr Ala Asp Glu Glu Thr Glu Asn Leu85 90
95Ile Phe Lys Met Ala Asn Tyr Lys Met Arg Ser Glu Val Arg Glu Gly100
105 110Glu Asn Phe Asp Ile Asn Glu Ala
Gly Val Arg Arg Ser Leu His Ala115 120
125Glu Ile Ser Leu Pro Asp Ile Glu His Val Met Ala Ser Ser Pro Ala130
135 140Asp His Asp Glu Val Thr Val Leu Gln
Ile His Asn Lys Gly Thr Asp145 150 155
160Glu Ser Gly Thr Gly Tyr Ile Pro His Pro Leu Leu Arg Val
Val Trp165 170 175Glu Gln Glu Arg Asp Gly
Leu Thr Gly His Tyr Trp Ala Val Met Lys180 185
190Asn Asn Ala Ile Asp Cys Ser Ser Ala Ala Asp Ser Ser Asp Cys
Tyr195 200 205Ala Thr Ser Tyr Asn Arg Tyr
Asp Leu Gly Glu Ala Asp Leu Asp Asn210 215
220Phe Thr Lys Phe Asp Leu Ser Val Tyr Glu Asn Thr Leu Ser Ile Lys225
230 235 240Val Asn Asp Glu
Val Lys Val Asp Glu Asp Ile Thr Tyr Trp Gln His245 250
255Leu Leu Ser Tyr Phe Lys Ala Gly Ile Tyr Asn Gln Phe Glu
Asn Gly260 265 270Glu Ala Thr Ala His Phe
Gln Ala Leu Arg Tyr Thr Thr Thr Gln Val275 280
285Asn Gly Ser Asn Asp Trp Asp Ile Asn Asp Trp Lys Leu Thr Ile
Pro290 295 300Ala Ser Lys Asp Thr Trp Tyr
Gly Ser Gly Gly Asp Ser Ala Ala Glu305 310
315 320Leu Glu Pro Glu Arg Cys Glu Ser Ser Lys Asp Leu
Leu Ala Asn Asp325 330 335Ser Asp Val Tyr
Asp Ser Asp Ile Gly Leu Ser Tyr Phe Asn Thr Asp340 345
350Glu Gly Arg Val His Phe Arg Ala Asp Met Gly Tyr Gly Thr
Ser Thr355 360 365Glu Asn Ser Ser Tyr Ile
Arg Ser Glu Leu Arg Glu Leu Tyr Gln Ser370 375
380Ser Val Gln Pro Asp Cys Ser Thr Ser Asp Glu Asp Thr Ser Trp
Tyr385 390 395 400Leu Asp
Asp Thr Arg Thr Asn Ala Thr Ser His Glu Leu Thr Ala Ser405
410 415Leu Arg Ile Glu Asp Tyr Pro Asn Ile Asn Asn Gln
Asp Pro Lys Val420 425 430Val Leu Gly Gln
Ile His Gly Trp Lys Ile Asn Gln Ala Leu Val Lys435 440
445Leu Leu Trp Glu Gly Glu Ser Lys Pro Val Arg Val Ile Leu
Asn Ser450 455 460Asp Phe Glu Arg Asn Asn
Gln Asp Cys Asn His Cys Asp Pro Phe Ser465 470
475 480Val Glu Leu Gly Thr Tyr Ser Ala Ser Glu Glu
Trp Arg Tyr Thr Ile485 490 495Arg Ala Asn
Gln Asp Gly Ile Tyr Leu Ala Thr His Asp Leu Asp Gly500
505 510Thr Asn Thr Val Ser His Leu Ile Pro Trp Gly Gln
Asp Tyr Thr Asp515 520 525Lys Asp Gly Asp
Thr Val Ser Leu Thr Ser Asp Trp Thr Ser Thr Asp530 535
540Ile Ala Phe Tyr Phe Lys Ala Gly Ile Tyr Pro Gln Phe Lys
Pro Asp545 550 555 560Ser
Asp Tyr Ala Gly Glu Val Phe Asp Val Ser Phe Ser Ser Leu Arg565
570 575Ala Glu His Asn580331569DNAVibrio splendidus
33atgaaacaaa ttactctaaa aactttactc gcttcttcta ttctacttgc ggttggttgt
60gcgagcacga gcacgcctac tgctgatttt ccaaataaca aagaaactgg tgaagcgctt
120ctgacgccag ttgctgtttc cgctagtagc catgatggta acggacctga tcgtctcgtt
180gaccaagacc taactacacg ttggtcatct gcgggtgacg gcgagtgggc aacgctagac
240tatggttcag tacaggagtt tgacgcggtt caggcatctt tcagtaaagg taatcagcgc
300caatctaaat ttgatatcca agtgagtgtt gatggcgaaa gctggacaac ggtactagaa
360aaccaactaa gctcaggtaa agcgatcggc ctagagcgtt tccaatttga gccagtagtg
420caagcacgct acgtaagata cgttggtcac ggtaacacca aaaacggttg gaacagtgtg
480actggattag cggcggttaa ctgtagcatt aacgcatgtc ctgctagcca tatcatcact
540tcagacgtgg ttgcagcaga agccgtgatt attgctgaaa tgaaagcggc agaaaaagca
600cgtaaagatg cgcgcaaaga tctacgctct ggtaacttcg gtgtagcagc ggtttaccct
660tgtgagacga ccgttgaatg tgacactcgc agtgcacttc cagttccgac aggcctgcca
720gcgacaccag ttgcaggtaa ctcgccaagc gaaaactttg acatgacgca ttggtaccta
780tctcaaccat ttgaccatga caaaaatggc aaacctgatg atgtgtctga gtggaacctt
840gcaaacggtt accaacaccc tgaaatcttc tacacagctg atgacggcgg cctagtattc
900aaagcttacg tgaaaggtgt acgtacctct aaaaacacta agtacgcgcg tacagagctt
960cgtgaaatga tgcgtcgtgg tgatcagtct attagcacta aaggtgttaa taagaataac
1020tgggtattct caagcgctcc tgaatctgac ttagagtcgg cagcgggtat tgacggcgtt
1080ctagaagcga cgttgaaaat cgaccatgca acaacgacgg gtaatgcgaa tgaagtaggt
1140cgctttatca ttggtcagat tcacgatcaa aacgatgaac caattcgttt gtactaccgt
1200aaactgccaa accaagaaac gggtgcggtt tacttcgcac atgaaagcca agacgcaact
1260aaagaggact tctaccctct agtgggcgac atgacggctg aagtgggtga cgatggtatc
1320gcgcttggcg aagtgttcag ctaccgtatt gacgttaaag gcaacacgat gactgtaacg
1380ctaatacgtg aaggcaaaga cgatgttgta caagtggttg atatgagcaa cagcggctac
1440gacgcaggcg gcaagtacat gtacttcaaa gccggtgttt acaaccaaaa catcagcggc
1500gacctagacg attactcaca agcgactttc tatcagctag atgtatcgca cgatcaatac
1560aaaaagtaa
156934522PRTVibrio splendidus 34Met Lys Gln Ile Thr Leu Lys Thr Leu Leu
Ala Ser Ser Ile Leu Leu1 5 10
15Ala Val Gly Cys Ala Ser Thr Ser Thr Pro Thr Ala Asp Phe Pro Asn20
25 30Asn Lys Glu Thr Gly Glu Ala Leu Leu
Thr Pro Val Ala Val Ser Ala35 40 45Ser
Ser His Asp Gly Asn Gly Pro Asp Arg Leu Val Asp Gln Asp Leu50
55 60Thr Thr Arg Trp Ser Ser Ala Gly Asp Gly Glu
Trp Ala Thr Leu Asp65 70 75
80Tyr Gly Ser Val Gln Glu Phe Asp Ala Val Gln Ala Ser Phe Ser Lys85
90 95Gly Asn Gln Arg Gln Ser Lys Phe Asp
Ile Gln Val Ser Val Asp Gly100 105 110Glu
Ser Trp Thr Thr Val Leu Glu Asn Gln Leu Ser Ser Gly Lys Ala115
120 125Ile Gly Leu Glu Arg Phe Gln Phe Glu Pro Val
Val Gln Ala Arg Tyr130 135 140Val Arg Tyr
Val Gly His Gly Asn Thr Lys Asn Gly Trp Asn Ser Val145
150 155 160Thr Gly Leu Ala Ala Val Asn
Cys Ser Ile Asn Ala Cys Pro Ala Ser165 170
175His Ile Ile Thr Ser Asp Val Val Ala Ala Glu Ala Val Ile Ile Ala180
185 190Glu Met Lys Ala Ala Glu Lys Ala Arg
Lys Asp Ala Arg Lys Asp Leu195 200 205Arg
Ser Gly Asn Phe Gly Val Ala Ala Val Tyr Pro Cys Glu Thr Thr210
215 220Val Glu Cys Asp Thr Arg Ser Ala Leu Pro Val
Pro Thr Gly Leu Pro225 230 235
240Ala Thr Pro Val Ala Gly Asn Ser Pro Ser Glu Asn Phe Asp Met
Thr245 250 255His Trp Tyr Leu Ser Gln Pro
Phe Asp His Asp Lys Asn Gly Lys Pro260 265
270Asp Asp Val Ser Glu Trp Asn Leu Ala Asn Gly Tyr Gln His Pro Glu275
280 285Ile Phe Tyr Thr Ala Asp Asp Gly Gly
Leu Val Phe Lys Ala Tyr Val290 295 300Lys
Gly Val Arg Thr Ser Lys Asn Thr Lys Tyr Ala Arg Thr Glu Leu305
310 315 320Arg Glu Met Met Arg Arg
Gly Asp Gln Ser Ile Ser Thr Lys Gly Val325 330
335Asn Lys Asn Asn Trp Val Phe Ser Ser Ala Pro Glu Ser Asp Leu
Glu340 345 350Ser Ala Ala Gly Ile Asp Gly
Val Leu Glu Ala Thr Leu Lys Ile Asp355 360
365His Ala Thr Thr Thr Gly Asn Ala Asn Glu Val Gly Arg Phe Ile Ile370
375 380Gly Gln Ile His Asp Gln Asn Asp Glu
Pro Ile Arg Leu Tyr Tyr Arg385 390 395
400Lys Leu Pro Asn Gln Glu Thr Gly Ala Val Tyr Phe Ala His
Glu Ser405 410 415Gln Asp Ala Thr Lys Glu
Asp Phe Tyr Pro Leu Val Gly Asp Met Thr420 425
430Ala Glu Val Gly Asp Asp Gly Ile Ala Leu Gly Glu Val Phe Ser
Tyr435 440 445Arg Ile Asp Val Lys Gly Asn
Thr Met Thr Val Thr Leu Ile Arg Glu450 455
460Gly Lys Asp Asp Val Val Gln Val Val Asp Met Ser Asn Ser Gly Tyr465
470 475 480Asp Ala Gly Gly
Lys Tyr Met Tyr Phe Lys Ala Gly Val Tyr Asn Gln485 490
495Asn Ile Ser Gly Asp Leu Asp Asp Tyr Ser Gln Ala Thr Phe
Tyr Gln500 505 510Leu Asp Val Ser His Asp
Gln Tyr Lys Lys515 520351230DNAVibrio splendidus
35atgcaaattt ctaaagtcgc tacagctgtc gctctttcga caggtttatt atttggttgt
60aacagtgatg gtttacctat tccaacagat ccaggcggaa cagaccctgt tgaacctgtt
120gaagtttact ctatagaaaa cgtctattgg gatctgacag gtggtgctgt tgctgcacag
180tcactcagcg gaacttcacc atatcgcttt gataataatg aggaaggtac tcgtgctcta
240agcatttaca gtggagacgt agctaatggc ttcacttttg agagttcaat atatactgct
300gaagaagaag gtgttgtttc ctttgaaggt aaggactgta cttacacagt gactgagcaa
360cagctagata tgacctgtga aaaagatgac gtagaaacag cttactcagc aacagagatt
420acagatgaat ctgttataac tgcattagaa aatgccgatg atggaaaacc taaatcagtc
480gatgatgtga acgctgcgat tgcatcagca gaagatggcg cgattattga tttatcatct
540gaaggtacgt ttgataccgg tgttattgag ctaaataaag ctgtcacaat tgatggtgct
600ggtttagcaa ccattaccgg agatgcttgt attgatgtca ctgcacccgg tgcaggtatc
660aaaaacatga cttttgctaa cgacaatttg gccgggtgtt ttggtaggga gtcagctggt
720acttcagata atgaaactgg tgcgatcgtt attggtaaaa ttggtaaaga ttcagatcct
780gtagcacttg aaaacctaaa gttcgatgca aacggcatta ccgaagatga tctaggtact
840aaaaaagcaa gttggttatt ctctcgaggt tactttacat tagacaatag cgaatttgtc
900ggtttaagtg gcagtttcca aaataatgca attcgtatta actgtagtag tgacaacggg
960cgatttggtt cacaaatcac aaataataca ttcactatta actctggtgg tagtgatgtg
1020ggcggaatta aagttggtga ttctagcagt gccgtcataa agaatagtga tgataacctt
1080ggctgtaatg tcactattga aagcaatacg ttcaatggtt acaaaaccct actttcagct
1140gacaacggta aagatataag aaatacagcc atctacgcac aaccatctgc agtgaacact
1200gcggcaggta aagaaaatat cttgaactaa
123036409PRTVibrio splendidus 36Met Gln Ile Ser Lys Val Ala Thr Ala Val
Ala Leu Ser Thr Gly Leu1 5 10
15Leu Phe Gly Cys Asn Ser Asp Gly Leu Pro Ile Pro Thr Asp Pro Gly20
25 30Gly Thr Asp Pro Val Glu Pro Val Glu
Val Tyr Ser Ile Glu Asn Val35 40 45Tyr
Trp Asp Leu Thr Gly Gly Ala Val Ala Ala Gln Ser Leu Ser Gly50
55 60Thr Ser Pro Tyr Arg Phe Asp Asn Asn Glu Glu
Gly Thr Arg Ala Leu65 70 75
80Ser Ile Tyr Ser Gly Asp Val Ala Asn Gly Phe Thr Phe Glu Ser Ser85
90 95Ile Tyr Thr Ala Glu Glu Glu Gly Val
Val Ser Phe Glu Gly Lys Asp100 105 110Cys
Thr Tyr Thr Val Thr Glu Gln Gln Leu Asp Met Thr Cys Glu Lys115
120 125Asp Asp Val Glu Thr Ala Tyr Ser Ala Thr Glu
Ile Thr Asp Glu Ser130 135 140Val Ile Thr
Ala Leu Glu Asn Ala Asp Asp Gly Lys Pro Lys Ser Val145
150 155 160Asp Asp Val Asn Ala Ala Ile
Ala Ser Ala Glu Asp Gly Ala Ile Ile165 170
175Asp Leu Ser Ser Glu Gly Thr Phe Asp Thr Gly Val Ile Glu Leu Asn180
185 190Lys Ala Val Thr Ile Asp Gly Ala Gly
Leu Ala Thr Ile Thr Gly Asp195 200 205Ala
Cys Ile Asp Val Thr Ala Pro Gly Ala Gly Ile Lys Asn Met Thr210
215 220Phe Ala Asn Asp Asn Leu Ala Gly Cys Phe Gly
Arg Glu Ser Ala Gly225 230 235
240Thr Ser Asp Asn Glu Thr Gly Ala Ile Val Ile Gly Lys Ile Gly
Lys245 250 255Asp Ser Asp Pro Val Ala Leu
Glu Asn Leu Lys Phe Asp Ala Asn Gly260 265
270Ile Thr Glu Asp Asp Leu Gly Thr Lys Lys Ala Ser Trp Leu Phe Ser275
280 285Arg Gly Tyr Phe Thr Leu Asp Asn Ser
Glu Phe Val Gly Leu Ser Gly290 295 300Ser
Phe Gln Asn Asn Ala Ile Arg Ile Asn Cys Ser Ser Asp Asn Gly305
310 315 320Arg Phe Gly Ser Gln Ile
Thr Asn Asn Thr Phe Thr Ile Asn Ser Gly325 330
335Gly Ser Asp Val Gly Gly Ile Lys Val Gly Asp Ser Ser Ser Ala
Val340 345 350Ile Lys Asn Ser Asp Asp Asn
Leu Gly Cys Asn Val Thr Ile Glu Ser355 360
365Asn Thr Phe Asn Gly Tyr Lys Thr Leu Leu Ser Ala Asp Asn Gly Lys370
375 380Asp Ile Arg Asn Thr Ala Ile Tyr Ala
Gln Pro Ser Ala Val Asn Thr385 390 395
400Ala Ala Gly Lys Glu Asn Ile Leu Asn40537861DNAVibrio
splendidus 37atgaattctg ttacaaaaat tgctgcagct gttgcatgta ctcttttagc
gggcacagct 60gctggtgcat ctcttgatta tcgttacgag tatcgtgctg cgacggatta
tacaaagact 120aatggtgata cggctcacgt agacgctcgc catcaacacc gagttaagct
aggtgaaagc 180tttaagctgt cagacaagtg gaagcactct actggtctag aacttaagtt
ccacggtgat 240gactcttact atgatgaaga ttcaggttct gttaaatcag caaacagcca
gagtttttac 300gatggcaatt ggtacatcta tggtatggag atcgataaca ctgcgacata
caaaatagac 360aataattggt atctacaaat gggtatgcct attgcttggg attgggatga
gcctaatgct 420aacgatggcg actggaagat gaaaaaggtt acgtttaaac ctcagttccg
cgttggctat 480aaagcagata tgggtttaac aactgctatt cgttaccgtc atgaatatgc
tgacttccgt 540aaccacacac aatttggcga caaagattct gaaactggcg agcgtttaga
atcagctcaa 600aagtctaaag ttacactgac gggctcttac aaaattgaat ctctacctaa
gcttggcctt 660tcttacgaag caaactatgt aaaatctttg gataacgtac ttctttataa
tagtgatgac 720tgggaatggg atgctggctt aaaggtaaac tacaagttcg gttcttggaa
accttttgct 780gaaatctggt cttctgatat cagttcatct tcaaaagatc gtgaagctaa
ataccgtgtt 840ggtattgctt actcattcta a
86138286PRTVibrio splendidus 38Met Asn Ser Val Thr Lys Ile
Ala Ala Ala Val Ala Cys Thr Leu Leu1 5 10
15Ala Gly Thr Ala Ala Gly Ala Ser Leu Asp Tyr Arg Tyr Glu
Tyr Arg20 25 30Ala Ala Thr Asp Tyr Thr
Lys Thr Asn Gly Asp Thr Ala His Val Asp35 40
45Ala Arg His Gln His Arg Val Lys Leu Gly Glu Ser Phe Lys Leu Ser50
55 60Asp Lys Trp Lys His Ser Thr Gly Leu
Glu Leu Lys Phe His Gly Asp65 70 75
80Asp Ser Tyr Tyr Asp Glu Asp Ser Gly Ser Val Lys Ser Ala
Asn Ser85 90 95Gln Ser Phe Tyr Asp Gly
Asn Trp Tyr Ile Tyr Gly Met Glu Ile Asp100 105
110Asn Thr Ala Thr Tyr Lys Ile Asp Asn Asn Trp Tyr Leu Gln Met
Gly115 120 125Met Pro Ile Ala Trp Asp Trp
Asp Glu Pro Asn Ala Asn Asp Gly Asp130 135
140Trp Lys Met Lys Lys Val Thr Phe Lys Pro Gln Phe Arg Val Gly Tyr145
150 155 160Lys Ala Asp Met
Gly Leu Thr Thr Ala Ile Arg Tyr Arg His Glu Tyr165 170
175Ala Asp Phe Arg Asn His Thr Gln Phe Gly Asp Lys Asp Ser
Glu Thr180 185 190Gly Glu Arg Leu Glu Ser
Ala Gln Lys Ser Lys Val Thr Leu Thr Gly195 200
205Ser Tyr Lys Ile Glu Ser Leu Pro Lys Leu Gly Leu Ser Tyr Glu
Ala210 215 220Asn Tyr Val Lys Ser Leu Asp
Asn Val Leu Leu Tyr Asn Ser Asp Asp225 230
235 240Trp Glu Trp Asp Ala Gly Leu Lys Val Asn Tyr Lys
Phe Gly Ser Trp245 250 255Lys Pro Phe Ala
Glu Ile Trp Ser Ser Asp Ile Ser Ser Ser Ser Lys260 265
270Asp Arg Glu Ala Lys Tyr Arg Val Gly Ile Ala Tyr Ser
Phe275 280 285391038DNAVibrio splendidus
39atgtttaaga aaaacatatt agcagtggcg ttattagcga ctgtgccaat ggttactttc
60gcaaataacg gtgtttctta ccccgtacct gccgataaat tcgatatgca taattggaaa
120ataaccatac cttcagatat taatgaagat ggtcgcgttg atgaaataga aggggtcgct
180atgatgagct actcacatag tgatttcttc catcttgata aagacggcaa ccttgtattt
240gaagtgcaga accaagcgat tacgacgaaa aactcgaaga atgcgcgttc tgagttacgc
300cagatgccaa gaggcgcaga tttctctatc gatacggctg ataaaggaaa ccagtgggca
360ctgtcgagtc acccagcggc tagtgaatac agtgctgtgg gcggaacatt agaagcgaca
420ttaaaagtga atcacgtctc agttaacgct aagttcccag aaaaataccc agctcattct
480gttgtggttg gtcagattca tgctaaaaaa cacaacgagc taatcaaagc tggaaccggt
540tatgggcatg gtaatgaacc actaaagatc ttctataaga agtttcctga ccaagaaatg
600ggttcagtat tctggaacta tgaacgtaac ctagagaaaa aagatcctaa ccgtgccgat
660atcgcttatc cagtgtgggg taacacgtgg gaaaaccctg cagagccggg tgaagccggt
720attgctcttg gtgaagagtt tagctacaaa gtggaagtga aaggcaccat gatgtaccta
780acgtttgaaa ccgagcgtca cgataccgtt aagtatgaaa tcgacctgag taagggcatc
840gatgaacttg actcaccaac gggctatgct gaagatgatt tttactacaa agcgggcgca
900tacggccaat gtagcgtgag cgattctcac cctgtatggg ggcctggttg tggcggtact
960ggcgatttcg ctgtcgataa aaagaatggc gattacaaca gtgtgacttt ctctgcgctt
1020aagttaaacg gtaaatag
103840345PRTVibrio splendidus 40Met Phe Lys Lys Asn Ile Leu Ala Val Ala
Leu Leu Ala Thr Val Pro1 5 10
15Met Val Thr Phe Ala Asn Asn Gly Val Ser Tyr Pro Val Pro Ala Asp20
25 30Lys Phe Asp Met His Asn Trp Lys Ile
Thr Ile Pro Ser Asp Ile Asn35 40 45Glu
Asp Gly Arg Val Asp Glu Ile Glu Gly Val Ala Met Met Ser Tyr50
55 60Ser His Ser Asp Phe Phe His Leu Asp Lys Asp
Gly Asn Leu Val Phe65 70 75
80Glu Val Gln Asn Gln Ala Ile Thr Thr Lys Asn Ser Lys Asn Ala Arg85
90 95Ser Glu Leu Arg Gln Met Pro Arg Gly
Ala Asp Phe Ser Ile Asp Thr100 105 110Ala
Asp Lys Gly Asn Gln Trp Ala Leu Ser Ser His Pro Ala Ala Ser115
120 125Glu Tyr Ser Ala Val Gly Gly Thr Leu Glu Ala
Thr Leu Lys Val Asn130 135 140His Val Ser
Val Asn Ala Lys Phe Pro Glu Lys Tyr Pro Ala His Ser145
150 155 160Val Val Val Gly Gln Ile His
Ala Lys Lys His Asn Glu Leu Ile Lys165 170
175Ala Gly Thr Gly Tyr Gly His Gly Asn Glu Pro Leu Lys Ile Phe Tyr180
185 190Lys Lys Phe Pro Asp Gln Glu Met Gly
Ser Val Phe Trp Asn Tyr Glu195 200 205Arg
Asn Leu Glu Lys Lys Asp Pro Asn Arg Ala Asp Ile Ala Tyr Pro210
215 220Val Trp Gly Asn Thr Trp Glu Asn Pro Ala Glu
Pro Gly Glu Ala Gly225 230 235
240Ile Ala Leu Gly Glu Glu Phe Ser Tyr Lys Val Glu Val Lys Gly
Thr245 250 255Met Met Tyr Leu Thr Phe Glu
Thr Glu Arg His Asp Thr Val Lys Tyr260 265
270Glu Ile Asp Leu Ser Lys Gly Ile Asp Glu Leu Asp Ser Pro Thr Gly275
280 285Tyr Ala Glu Asp Asp Phe Tyr Tyr Lys
Ala Gly Ala Tyr Gly Gln Cys290 295 300Ser
Val Ser Asp Ser His Pro Val Trp Gly Pro Gly Cys Gly Gly Thr305
310 315 320Gly Asp Phe Ala Val Asp
Lys Lys Asn Gly Asp Tyr Asn Ser Val Thr325 330
335Phe Ser Ala Leu Lys Leu Asn Gly Lys340
34541897DNAVibrio splendidus 41atggataact ctccggtgct gagccgattt
ttagagaatg gatttttact ccagcagaaa 60ctgagccttg ttctttgttg tgtgttgatc
gcagcttctg catggatttt aggacagctt 120gcatggttta ttgaacctgc tgagcaaacc
gtcgtgccat ggacagcaac ggcttcctcg 180tcttcaacgc ctcaatcgac tcttgatatc
tcttctttgc agcagagcaa catgtttggt 240gcttataacc caaccacgcc tgctgtggtt
gagcagcaag ttatccaaga tgcgccaaag 300acgcgactga acctcgtttt agtgggtgca
gtagccagtt ctaatccaaa gctgagcttg 360gctgtgattg ccaatcgcgg cacacaagca
acctacggca ttaatgaaga gatcgaaggt 420acgcgagcta agttaaaagc ggtattagtc
gatcgcgtga ttattgataa ctcaggtcga 480gacgaaacct tgatgcttga aggcattgag
tacaagcgtt tgtctgtatc agcacctgcg 540ccacctcgta cctcttcttc tgtgcgtggc
aacaacccag cttctgcaga agagaagcta 600gatgaaatta aagcgaagat aatgaaagat
ccgcaacaaa tcttccaata tgttcgactg 660tctcaggtga aacgcgacga taaagtgatt
ggttatcgtg tgagccctgg caaagattca 720gaacttttta actctgttgg gctccaaaac
ggagatattg ccactcagtt aaatggacaa 780gacctgacag accctgctgc tatgggcaac
atattccgtt ctatctcaga gctgacagag 840ctaaacctcg tcgtcgagag agatggtcaa
caacatgaag tgtttattga attttag 89742298PRTVibrio splendidus 42Met
Asp Asn Ser Pro Val Leu Ser Arg Phe Leu Glu Asn Gly Phe Leu1
5 10 15Leu Gln Gln Lys Leu Ser Leu Val
Leu Cys Cys Val Leu Ile Ala Ala20 25
30Ser Ala Trp Ile Leu Gly Gln Leu Ala Trp Phe Ile Glu Pro Ala Glu35
40 45Gln Thr Val Val Pro Trp Thr Ala Thr Ala
Ser Ser Ser Ser Thr Pro50 55 60Gln Ser
Thr Leu Asp Ile Ser Ser Leu Gln Gln Ser Asn Met Phe Gly65
70 75 80Ala Tyr Asn Pro Thr Thr Pro
Ala Val Val Glu Gln Gln Val Ile Gln85 90
95Asp Ala Pro Lys Thr Arg Leu Asn Leu Val Leu Val Gly Ala Val Ala100
105 110Ser Ser Asn Pro Lys Leu Ser Leu Ala
Val Ile Ala Asn Arg Gly Thr115 120 125Gln
Ala Thr Tyr Gly Ile Asn Glu Glu Ile Glu Gly Thr Arg Ala Lys130
135 140Leu Lys Ala Val Leu Val Asp Arg Val Ile Ile
Asp Asn Ser Gly Arg145 150 155
160Asp Glu Thr Leu Met Leu Glu Gly Ile Glu Tyr Lys Arg Leu Ser
Val165 170 175Ser Ala Pro Ala Pro Pro Arg
Thr Ser Ser Ser Val Arg Gly Asn Asn180 185
190Pro Ala Ser Ala Glu Glu Lys Leu Asp Glu Ile Lys Ala Lys Ile Met195
200 205Lys Asp Pro Gln Gln Ile Phe Gln Tyr
Val Arg Leu Ser Gln Val Lys210 215 220Arg
Asp Asp Lys Val Ile Gly Tyr Arg Val Ser Pro Gly Lys Asp Ser225
230 235 240Glu Leu Phe Asn Ser Val
Gly Leu Gln Asn Gly Asp Ile Ala Thr Gln245 250
255Leu Asn Gly Gln Asp Leu Thr Asp Pro Ala Ala Met Gly Asn Ile
Phe260 265 270Arg Ser Ile Ser Glu Leu Thr
Glu Leu Asn Leu Val Val Glu Arg Asp275 280
285Gly Gln Gln His Glu Val Phe Ile Glu Phe290
295432025DNAVibrio splendidus 43gtgaagcatt ggtttaagaa aagtgcatgg
ttattggcag gaagcttaat ctgcacaccc 60gcagccatcg cgagtgattt tagtgccagc
tttaaaggca ctgatattca agagtttatt 120aatattgttg gtcgtaacct agagaagacg
atcatcgttg acccttcggt gcgcggaaaa 180atcgatgtac gcagctacga cgtactcaat
gaagagcaat actacagctt cttcctaaac 240gtattggaag tgtatggcta cgcggttgtc
gaaatggact cgggtgttct taagatcatc 300aaggccaaag attcgaaaac atcggcaatt
ccagtcgttg gagacagtga cacgatcaaa 360ggcgacaatg tggtgacacg tgttgtgacg
gttcgtaatg tctcggtgcg tgaactttct 420cctctgcttc gtcaactaaa cgacaatgca
ggcgcgggta acgttgtgca ctacgaccca 480gccaacatca tccttattac aggccgagcg
gcggtagtaa accgtttagc tgaaatcatc 540aagcgtgttg accaagcggg tgataaagag
attgaagtcg ttgagctaaa gaatgcttct 600gcggcagaaa tggtacgtat cgttgatgcg
ttaagcaaaa ccactgatgc gaaaaacaca 660cctgcatttc tacaacctaa attagttgcc
gatgaacgta ccaatgcgat tcttatctca 720ggcgacccta aagtacgtag ccgtttaaga
aggctgattg aacagcttga tgttgaaatg 780gcaaccaagg gcaataacca agttatttac
cttaaatatg caaaagccga agatctagtt 840gatgtgctga aaggcgtgtc ggacaaccta
caatcagaga agcagacatc aaccaaagga 900agttcatcgc agcgtaacca agtgatgatc
tcagctcaca gtgacaccaa ctctttagtg 960attaccgcac agccggacat catgaatgcg
cttcaagatg tgatcgcaca gctggatatt 1020cgtcgtgctc aagtattgat tgaagcactg
attgtcgaaa tggccgaagg tgacggcgtt 1080aaccttggtg tgcagtgggg taaccttgaa
acgggtgcca tgattcagta cagcaacact 1140ggcgcttcca ttggcggtgt gatggttggt
ttagaagaag cgaaagacag cgaaacgaca 1200accgctgttt atgattcaga cggtaaattc
ttacgtaatg aaaccacgac ggaagaaggt 1260gactattcaa cattagcttc cgcactttct
ggtgttaatg gtgcggcaat gagtgtggta 1320atgggtgact ggaccgcctt gatcagtgca
gtagcgaccg attcaaattc aaatatccta 1380tcttctccaa gtatcaccgt gatggataac
ggcgaagcgt cattcattgt gggtgaagag 1440gtgcctgttc taaccggttc tacagcaggc
tcaagtaacg acaacccatt ccaaacagtt 1500gaacgtaaag aagtgggtat caagcttaaa
gtggtgccgc aaatcaatga aggtgattcg 1560gttcaactgc aaatagaaca agaagtatcg
aacgtattag gcgccaatgg tgcggttgat 1620gtgcgttttg ctaagcgaca gctaaataca
tcagtgattg ttcaagacgg tcaaatgctg 1680gtgttgggtg gcttgattga cgagcgagca
ttggaaagtg aatctaaggt gccgttcttg 1740ggagatattc ctgtgcttgg acacttgttc
aaatcaacca gtactcaggt tgagaaaaag 1800aacctaatgg tcttcatcaa accaaccatt
attcgtgatg gtatgacagc cgatggtatc 1860acgcagcgta aatacaactt catccgtgct
gagcagttgt acaaggctga gcaaggactg 1920aagttaatgg cagacgataa catcccagta
ttgcctaaat ttggtgccga catgaatcac 1980ccggctgaaa ttcaagcctt catcgatcaa
atggaacaag aataa 202544674PRTVibrio splendidus 44Met
Lys His Trp Phe Lys Lys Ser Ala Trp Leu Leu Ala Gly Ser Leu1
5 10 15Ile Cys Thr Pro Ala Ala Ile Ala
Ser Asp Phe Ser Ala Ser Phe Lys20 25
30Gly Thr Asp Ile Gln Glu Phe Ile Asn Ile Val Gly Arg Asn Leu Glu35
40 45Lys Thr Ile Ile Val Asp Pro Ser Val Arg
Gly Lys Ile Asp Val Arg50 55 60Ser Tyr
Asp Val Leu Asn Glu Glu Gln Tyr Tyr Ser Phe Phe Leu Asn65
70 75 80Val Leu Glu Val Tyr Gly Tyr
Ala Val Val Glu Met Asp Ser Gly Val85 90
95Leu Lys Ile Ile Lys Ala Lys Asp Ser Lys Thr Ser Ala Ile Pro Val100
105 110Val Gly Asp Ser Asp Thr Ile Lys Gly
Asp Asn Val Val Thr Arg Val115 120 125Val
Thr Val Arg Asn Val Ser Val Arg Glu Leu Ser Pro Leu Leu Arg130
135 140Gln Leu Asn Asp Asn Ala Gly Ala Gly Asn Val
Val His Tyr Asp Pro145 150 155
160Ala Asn Ile Ile Leu Ile Thr Gly Arg Ala Ala Val Val Asn Arg
Leu165 170 175Ala Glu Ile Ile Lys Arg Val
Asp Gln Ala Gly Asp Lys Glu Ile Glu180 185
190Val Val Glu Leu Lys Asn Ala Ser Ala Ala Glu Met Val Arg Ile Val195
200 205Asp Ala Leu Ser Lys Thr Thr Asp Ala
Lys Asn Thr Pro Ala Phe Leu210 215 220Gln
Pro Lys Leu Val Ala Asp Glu Arg Thr Asn Ala Ile Leu Ile Ser225
230 235 240Gly Asp Pro Lys Val Arg
Ser Arg Leu Arg Arg Leu Ile Glu Gln Leu245 250
255Asp Val Glu Met Ala Thr Lys Gly Asn Asn Gln Val Ile Tyr Leu
Lys260 265 270Tyr Ala Lys Ala Glu Asp Leu
Val Asp Val Leu Lys Gly Val Ser Asp275 280
285Asn Leu Gln Ser Glu Lys Gln Thr Ser Thr Lys Gly Ser Ser Ser Gln290
295 300Arg Asn Gln Val Met Ile Ser Ala His
Ser Asp Thr Asn Ser Leu Val305 310 315
320Ile Thr Ala Gln Pro Asp Ile Met Asn Ala Leu Gln Asp Val
Ile Ala325 330 335Gln Leu Asp Ile Arg Arg
Ala Gln Val Leu Ile Glu Ala Leu Ile Val340 345
350Glu Met Ala Glu Gly Asp Gly Val Asn Leu Gly Val Gln Trp Gly
Asn355 360 365Leu Glu Thr Gly Ala Met Ile
Gln Tyr Ser Asn Thr Gly Ala Ser Ile370 375
380Gly Gly Val Met Val Gly Leu Glu Glu Ala Lys Asp Ser Glu Thr Thr385
390 395 400Thr Ala Val Tyr
Asp Ser Asp Gly Lys Phe Leu Arg Asn Glu Thr Thr405 410
415Thr Glu Glu Gly Asp Tyr Ser Thr Leu Ala Ser Ala Leu Ser
Gly Val420 425 430Asn Gly Ala Ala Met Ser
Val Val Met Gly Asp Trp Thr Ala Leu Ile435 440
445Ser Ala Val Ala Thr Asp Ser Asn Ser Asn Ile Leu Ser Ser Pro
Ser450 455 460Ile Thr Val Met Asp Asn Gly
Glu Ala Ser Phe Ile Val Gly Glu Glu465 470
475 480Val Pro Val Leu Thr Gly Ser Thr Ala Gly Ser Ser
Asn Asp Asn Pro485 490 495Phe Gln Thr Val
Glu Arg Lys Glu Val Gly Ile Lys Leu Lys Val Val500 505
510Pro Gln Ile Asn Glu Gly Asp Ser Val Gln Leu Gln Ile Glu
Gln Glu515 520 525Val Ser Asn Val Leu Gly
Ala Asn Gly Ala Val Asp Val Arg Phe Ala530 535
540Lys Arg Gln Leu Asn Thr Ser Val Ile Val Gln Asp Gly Gln Met
Leu545 550 555 560Val Leu
Gly Gly Leu Ile Asp Glu Arg Ala Leu Glu Ser Glu Ser Lys565
570 575Val Pro Phe Leu Gly Asp Ile Pro Val Leu Gly His
Leu Phe Lys Ser580 585 590Thr Ser Thr Gln
Val Glu Lys Lys Asn Leu Met Val Phe Ile Lys Pro595 600
605Thr Ile Ile Arg Asp Gly Met Thr Ala Asp Gly Ile Thr Gln
Arg Lys610 615 620Tyr Asn Phe Ile Arg Ala
Glu Gln Leu Tyr Lys Ala Glu Gln Gly Leu625 630
635 640Lys Leu Met Ala Asp Asp Asn Ile Pro Val Leu
Pro Lys Phe Gly Ala645 650 655Asp Met Asn
His Pro Ala Glu Ile Gln Ala Phe Ile Asp Gln Met Glu660
665 670Gln Glu451503DNAVibrio splendidus 45atggctgaat
tggtaggggc ggcacgtact tatcagcgct tgccgtttag ctttgcgaat 60cgctacaaga
tggtgttgga ataccaacat ccagagcgcg caccgatact ttattatgtt 120gagccactga
aatcggcggc gatcattgaa gtgagtcgtg ttgtgaaaaa tggtttcacg 180ccacaagcga
ttactctcga tgagtttgat aaaaaactaa ccgatgctta tcagcgtgac 240tcgtcagaag
ctcgtcagct catggaagac attggtgctg atagtgatga tttcttctca 300ctagcggaag
aactgcctca agacgaagac ttacttgaat cagaagatga tgcaccaatc 360atcaagttaa
tcaatgcgat gctgggtgag gcgatcaaag agggtgcttc ggatatacac 420atcgaaacct
ttgaaaagtc actttgtatc cgtttccgag ttgatggtgt gctgcgtgat 480gttctagcgc
caagccgtaa actggctccg ctattggttt cacgtgtcaa ggttatggct 540aaactggata
ttgcggaaaa acgcgtgcca caagatggtc gtatttctct gcgtattggt 600ggccgagcgg
ttgatgttcg tgtttcaacc atgccttctt cgcatggtga gcgtgtggta 660atgcgtctgt
tggacaaaaa tgccactcgt ctagacttgc acagtttagg tatgacagcc 720gaaaaccatg
aaaacttccg taagctgatt cagcgcccac atggcattat cttggtgacc 780ggcccgacag
gttcaggtaa atcgacgacc ttgtacgcag gtctgcaaga actcaacagc 840aatgaacgaa
acattttaac cgttgaagac ccaatcgaat tcgatatcga tggcattggt 900caaacacaag
tgaaccctaa ggttgatatg acctttgcgc gtggtttacg tgccattctt 960cgtcaagatc
ctgatgttgt tatgattggt gagatccgtg acttggagac cgcagagatt 1020gctgtccagg
cctctttgac aggtcactta gttatgtcga ctctgcatac caatactgcc 1080gtcggtgcga
ttacacgtct acgtgatatg ggcattgaac ctttcttgat ctcttcttcg 1140ctgctgggtg
ttttggctca gcgcttggtt cgtactttat gtaacgaatg taaagaacct 1200tatgaagccg
ataaagagca gaagaaactg tttgggttga agaagaaaga aagcttgacg 1260ctttaccatg
ccaaaggttg tgaagagtgt ggccataagg gttatcgagg tcgtacgggt 1320attcatgagc
tgttgatgat tgatgattca gtacaagagc tgattcacag tgaagcgggt 1380gagcaggcga
ttgataaagc aattcgtggc acaacaccaa gtattcgaga tgatggcttg 1440agcaaagttc
tgaaaggggt aacgtcccta gaagaagtga tgcgcgtgac caaggaagtc 1500tag
150346500PRTVibrio
splendidus 46Met Ala Glu Leu Val Gly Ala Ala Arg Thr Tyr Gln Arg Leu Pro
Phe1 5 10 15Ser Phe Ala
Asn Arg Tyr Lys Met Val Leu Glu Tyr Gln His Pro Glu20 25
30Arg Ala Pro Ile Leu Tyr Tyr Val Glu Pro Leu Lys Ser
Ala Ala Ile35 40 45Ile Glu Val Ser Arg
Val Val Lys Asn Gly Phe Thr Pro Gln Ala Ile50 55
60Thr Leu Asp Glu Phe Asp Lys Lys Leu Thr Asp Ala Tyr Gln Arg
Asp65 70 75 80Ser Ser
Glu Ala Arg Gln Leu Met Glu Asp Ile Gly Ala Asp Ser Asp85
90 95Asp Phe Phe Ser Leu Ala Glu Glu Leu Pro Gln Asp
Glu Asp Leu Leu100 105 110Glu Ser Glu Asp
Asp Ala Pro Ile Ile Lys Leu Ile Asn Ala Met Leu115 120
125Gly Glu Ala Ile Lys Glu Gly Ala Ser Asp Ile His Ile Glu
Thr Phe130 135 140Glu Lys Ser Leu Cys Ile
Arg Phe Arg Val Asp Gly Val Leu Arg Asp145 150
155 160Val Leu Ala Pro Ser Arg Lys Leu Ala Pro Leu
Leu Val Ser Arg Val165 170 175Lys Val Met
Ala Lys Leu Asp Ile Ala Glu Lys Arg Val Pro Gln Asp180
185 190Gly Arg Ile Ser Leu Arg Ile Gly Gly Arg Ala Val
Asp Val Arg Val195 200 205Ser Thr Met Pro
Ser Ser His Gly Glu Arg Val Val Met Arg Leu Leu210 215
220Asp Lys Asn Ala Thr Arg Leu Asp Leu His Ser Leu Gly Met
Thr Ala225 230 235 240Glu
Asn His Glu Asn Phe Arg Lys Leu Ile Gln Arg Pro His Gly Ile245
250 255Ile Leu Val Thr Gly Pro Thr Gly Ser Gly Lys
Ser Thr Thr Leu Tyr260 265 270Ala Gly Leu
Gln Glu Leu Asn Ser Asn Glu Arg Asn Ile Leu Thr Val275
280 285Glu Asp Pro Ile Glu Phe Asp Ile Asp Gly Ile Gly
Gln Thr Gln Val290 295 300Asn Pro Lys Val
Asp Met Thr Phe Ala Arg Gly Leu Arg Ala Ile Leu305 310
315 320Arg Gln Asp Pro Asp Val Val Met Ile
Gly Glu Ile Arg Asp Leu Glu325 330 335Thr
Ala Glu Ile Ala Val Gln Ala Ser Leu Thr Gly His Leu Val Met340
345 350Ser Thr Leu His Thr Asn Thr Ala Val Gly Ala
Ile Thr Arg Leu Arg355 360 365Asp Met Gly
Ile Glu Pro Phe Leu Ile Ser Ser Ser Leu Leu Gly Val370
375 380Leu Ala Gln Arg Leu Val Arg Thr Leu Cys Asn Glu
Cys Lys Glu Pro385 390 395
400Tyr Glu Ala Asp Lys Glu Gln Lys Lys Leu Phe Gly Leu Lys Lys Lys405
410 415Glu Ser Leu Thr Leu Tyr His Ala Lys
Gly Cys Glu Glu Cys Gly His420 425 430Lys
Gly Tyr Arg Gly Arg Thr Gly Ile His Glu Leu Leu Met Ile Asp435
440 445Asp Ser Val Gln Glu Leu Ile His Ser Glu Ala
Gly Glu Gln Ala Ile450 455 460Asp Lys Ala
Ile Arg Gly Thr Thr Pro Ser Ile Arg Asp Asp Gly Leu465
470 475 480Ser Lys Val Leu Lys Gly Val
Thr Ser Leu Glu Glu Val Met Arg Val485 490
495Thr Lys Glu Val500471221DNAVibrio splendidus 47atggcggcat ttgaatacaa
agcactggat gccaaaggca aaagtaaaaa aggctcaatt 60gaagcagata atgctcgtca
ggctcgccaa agaataaaag agcttggctt gatgccggtt 120gagatgaccg aggctaaagc
aaaaacagca aaaggtgctc agccatcgac cagctttaaa 180cgcggcatca gtacgcctga
tcttgcgctt attactcgtc aaatatccac gctcgttcaa 240tctggtatgc cgctagaaga
gtgtttgaaa gccgttgccg aacagtctga gaaacctcgt 300attcgcacca tgctactcgc
ggtgagatct aaggtgactg aaggttattc gttagcagac 360agcttgtctg attatcccca
tatcttcgat gagctattca gagccatggt tgctgctggt 420gagaagtcag ggcatctaga
tgcggtattg gaacgattgg ctgactacgc agaaaaccgt 480cagaagatgc gttctaagtt
gctgcaagcg atgatctacc ccatcgtgct ggtggtgttt 540gcggtgacga ttgtgtcgtt
cctactggca acggtagtgc cgaagatcgt tgagcctatt 600atccaaatgg gacaagagct
ccctcagtcg acacaatttt tattagcatc gagtgaattt 660atccagaatt ggggcatcca
attactggtg ttgaccattg gtgtgattgt gttggttaag 720actgcgctga aaaagccggg
cgttcgcatg agctgggatc gcaaattatt gagcatcccg 780ctgataggca agatagcgaa
agggatcaac acctctcgtt ttgcacgaac actttctatc 840tgtacctcta gtgcgattcc
tatccttgaa gggatgaagg tcgcggtaga tgtgatgtcg 900aatcatcacg tgaaacaaca
agtattacag gcatcagata gcgttagaga aggggcaagc 960ctgcgtaaag cgcttgatca
aaccaaactc tttcccccga tgatgctgca tatgatcgcc 1020agtggtgagc agagtggcca
attggaacag atgctgacaa gagcggcaga taatcaggat 1080caaagctttg aatcgaccgt
taatatcgcg ttaggcattt ttaccccagc gcttattgcg 1140ttgatggctg gcttagtgct
gtttatcgtg atggcgacgc tgatgccaat gcttgaaatg 1200aacaatttaa tgagtggtta a
122148406PRTVibrio splendidus
48Met Ala Ala Phe Glu Tyr Lys Ala Leu Asp Ala Lys Gly Lys Ser Lys1
5 10 15Lys Gly Ser Ile Glu Ala
Asp Asn Ala Arg Gln Ala Arg Gln Arg Ile20 25
30Lys Glu Leu Gly Leu Met Pro Val Glu Met Thr Glu Ala Lys Ala Lys35
40 45Thr Ala Lys Gly Ala Gln Pro Ser Thr
Ser Phe Lys Arg Gly Ile Ser50 55 60Thr
Pro Asp Leu Ala Leu Ile Thr Arg Gln Ile Ser Thr Leu Val Gln65
70 75 80Ser Gly Met Pro Leu Glu
Glu Cys Leu Lys Ala Val Ala Glu Gln Ser85 90
95Glu Lys Pro Arg Ile Arg Thr Met Leu Leu Ala Val Arg Ser Lys Val100
105 110Thr Glu Gly Tyr Ser Leu Ala Asp
Ser Leu Ser Asp Tyr Pro His Ile115 120
125Phe Asp Glu Leu Phe Arg Ala Met Val Ala Ala Gly Glu Lys Ser Gly130
135 140His Leu Asp Ala Val Leu Glu Arg Leu
Ala Asp Tyr Ala Glu Asn Arg145 150 155
160Gln Lys Met Arg Ser Lys Leu Leu Gln Ala Met Ile Tyr Pro
Ile Val165 170 175Leu Val Val Phe Ala Val
Thr Ile Val Ser Phe Leu Leu Ala Thr Val180 185
190Val Pro Lys Ile Val Glu Pro Ile Ile Gln Met Gly Gln Glu Leu
Pro195 200 205Gln Ser Thr Gln Phe Leu Leu
Ala Ser Ser Glu Phe Ile Gln Asn Trp210 215
220Gly Ile Gln Leu Leu Val Leu Thr Ile Gly Val Ile Val Leu Val Lys225
230 235 240Thr Ala Leu Lys
Lys Pro Gly Val Arg Met Ser Trp Asp Arg Lys Leu245 250
255Leu Ser Ile Pro Leu Ile Gly Lys Ile Ala Lys Gly Ile Asn
Thr Ser260 265 270Arg Phe Ala Arg Thr Leu
Ser Ile Cys Thr Ser Ser Ala Ile Pro Ile275 280
285Leu Glu Gly Met Lys Val Ala Val Asp Val Met Ser Asn His His
Val290 295 300Lys Gln Gln Val Leu Gln Ala
Ser Asp Ser Val Arg Glu Gly Ala Ser305 310
315 320Leu Arg Lys Ala Leu Asp Gln Thr Lys Leu Phe Pro
Pro Met Met Leu325 330 335His Met Ile Ala
Ser Gly Glu Gln Ser Gly Gln Leu Glu Gln Met Leu340 345
350Thr Arg Ala Ala Asp Asn Gln Asp Gln Ser Phe Glu Ser Thr
Val Asn355 360 365Ile Ala Leu Gly Ile Phe
Thr Pro Ala Leu Ile Ala Leu Met Ala Gly370 375
380Leu Val Leu Phe Ile Val Met Ala Thr Leu Met Pro Met Leu Glu
Met385 390 395 400Asn Asn
Leu Met Ser Gly40549444DNAVibrio splendidus 49atgaaaaata aaatgaaaaa
acaatcaggc tttaccctat tagaagtcat ggttgttgtc 60gttatccttg gtgttctagc
aagttttgtt gtacctaacc tgttgggcaa caaagagaag 120gcggatcaac aaaaagccat
cactgatatt gtggcgctag agaacgcgct cgacatgtac 180aaactggata acagcgttta
cccaacaacg gatcaaggcc tggacgggtt ggtgacaaag 240ccaagcagtc cagagcctcg
taactaccga gacggcggtt acatcaagcg tctacctaac 300gacccatggg gcaatgagta
ccaataccta agtcctggtg ataacggcac aattgatatc 360ttcactcttg gcgcagatgg
tcaagaaggt ggtgaaggta ttgctgcaga tatcggcaac 420tggaacatgc aggacttcca
ataa 44450146PRTVibrio
splendidus 50Lys Asn Lys Met Lys Lys Gln Ser Gly Phe Thr Leu Leu Glu Val
Met1 5 10 15Val Val Val
Val Ile Leu Gly Val Leu Ala Ser Phe Val Val Pro Asn20 25
30Leu Leu Gly Asn Lys Glu Lys Ala Asp Gln Gln Lys Ala
Ile Thr Asp35 40 45Ile Val Ala Leu Glu
Asn Ala Leu Asp Met Tyr Lys Leu Asp Asn Ser50 55
60Val Tyr Pro Thr Thr Asp Gln Gly Leu Asp Gly Leu Val Thr Lys
Pro65 70 75 80Ser Ser
Pro Glu Pro Arg Asn Tyr Arg Asp Gly Gly Tyr Ile Lys Arg85
90 95Leu Pro Asn Asp Pro Trp Gly Asn Glu Tyr Gln Tyr
Leu Ser Pro Gly100 105 110Asp Asn Gly Thr
Ile Asp Ile Phe Thr Leu Gly Ala Asp Gly Gln Glu115 120
125Gly Gly Glu Gly Ile Ala Ala Asp Ile Gly Asn Trp Asn Met
Gln Asp130 135 140Phe
Gln14551594DNAVibrio splendidus 51gtgaaaacta agcaaacaca gccaggtttc
accttgattg agattctttt ggtgttggta 60ttactgtcag tatcggcggt cgcggtgatc
tcgaccatcc ctaccaatag caaagatgtt 120gctaaaaaat acgctcaaag cttttatcag
cgaattcagc tactcaatga agaggctatt 180ttgagtggct tagattttgg tgttcgtgtt
gatgaaaaaa aatcgactta cgttctgatg 240actttgaagt ctgatggctg gcaagaaacg
gagttcgaaa agatcccttc ttcaactgaa 300ttaccggaag aactggcact gtcgctgaca
ttaggtggtg gcgcgtggga agacgatgat 360cggttgttca atccaggaag cttatttgat
gaagatatgt ttgctgatct tgaagaggaa 420aagaagccga aaccaccaca gatctacatc
ttgtcgagtg ctgaaatgac gccatttgta 480ctgtcgtttt acccaaatac cggtgacaca
atacaagatg tttggcgcat tcgagtattg 540gataatggtg tgattcgatt actcgagccg
ggagaagaag atgaagaaga ataa 59452197PRTVibrio splendidus 52Met
Lys Thr Lys Gln Thr Gln Pro Gly Phe Thr Leu Ile Glu Ile Leu1
5 10 15Leu Val Leu Val Leu Leu Ser Val
Ser Ala Val Ala Val Ile Ser Thr20 25
30Ile Pro Thr Asn Ser Lys Asp Val Ala Lys Lys Tyr Ala Gln Ser Phe35
40 45Tyr Gln Arg Ile Gln Leu Leu Asn Glu Glu
Ala Ile Leu Ser Gly Leu50 55 60Asp Phe
Gly Val Arg Val Asp Glu Lys Lys Ser Thr Tyr Val Leu Met65
70 75 80Thr Leu Lys Ser Asp Gly Trp
Gln Glu Thr Glu Phe Glu Lys Ile Pro85 90
95Ser Ser Thr Glu Leu Pro Glu Glu Leu Ala Leu Ser Leu Thr Leu Gly100
105 110Gly Gly Ala Trp Glu Asp Asp Asp Arg
Leu Phe Asn Pro Gly Ser Leu115 120 125Phe
Asp Glu Asp Met Phe Ala Asp Leu Glu Glu Glu Lys Lys Pro Lys130
135 140Pro Pro Gln Ile Tyr Ile Leu Ser Ser Ala Glu
Met Thr Pro Phe Val145 150 155
160Leu Ser Phe Tyr Pro Asn Thr Gly Asp Thr Ile Gln Asp Val Trp
Arg165 170 175Ile Arg Val Leu Asp Asn Gly
Val Ile Arg Leu Leu Glu Pro Gly Glu180 185
190Glu Asp Glu Glu Glu19553396DNAVibrio splendidus 53atgaagaaga
ataaccgttc tccttatcgt tctcgcggta tgcctcttgg ttctcgagga 60atgactctgc
ttgaagtatt ggttgcgctg gctatcttcg ctacggcggc gatcagtgtg 120attcgtgctg
tcacccagca catcaatacg ctcagttatc tcgaagaaaa aaccttcgcg 180gcgatggtcg
ttgataatca aatggcccta gtcatgctac atcctgagat gcttaaaaaa 240gcgcagggca
cgcaagagtt agcgggaaga gaatggttct ggaaggtgac tcccatcgat 300accagcgata
atttattaaa ggcgtttgat gtgagtgcgg caaccagtaa gaaagcgtct 360ccagtcgtta
cggtgcgcag ttatgtggtt aattaa
39654131PRTVibrio splendidus 54Met Lys Lys Asn Asn Arg Ser Pro Tyr Arg
Ser Arg Gly Met Pro Leu1 5 10
15Gly Ser Arg Gly Met Thr Leu Leu Glu Val Leu Val Ala Leu Ala Ile20
25 30Phe Ala Thr Ala Ala Ile Ser Val Ile
Arg Ala Val Thr Gln His Ile35 40 45Asn
Thr Leu Ser Tyr Leu Glu Glu Lys Thr Phe Ala Ala Met Val Val50
55 60Asp Asn Gln Met Ala Leu Val Met Leu His Pro
Glu Met Leu Lys Lys65 70 75
80Ala Gln Gly Thr Gln Glu Leu Ala Gly Arg Glu Trp Phe Trp Lys Val85
90 95Thr Pro Ile Asp Thr Ser Asp Asn Leu
Leu Lys Ala Phe Asp Val Ser100 105 110Ala
Ala Thr Ser Lys Lys Ala Ser Pro Val Val Thr Val Arg Ser Tyr115
120 125Val Val Asn13055804DNAVibrio splendidus
55atgtggttaa ttaagagaat gtggtcaatt aagagcatgt tattaattaa gaacagctcg
60ctaactaaga gcgtgtcgct aactaagagc atgtcggaaa ataagcgtac gccgcgtaaa
120caaggtctac cttcaaaagg gagaggcttt accttaattg aagtcttggt ctcgattgct
180atctttgcca cgctaagtat ggcggcttat caggtggtta atcaggtgca gcgaagcaac
240gagatctcta ttgagcgcag tgctcgtttg aaccaactgc aacgcagttt agtcatttta
300gataatgatt ttcgccagat ggcggtgcga aaatttcgta ccaacggtga agaagcatca
360tctaagctga tcttaatgaa agagtattta ttggactccg acagtgtagg catcatgttt
420actcgtctag gttggcacaa cccacaacag cagtttcctc gcggtgaagt cacgaaggtt
480ggctaccgta ttaaagaaga aacacttgag cgtgtatggt ggcgttatcc cgatacacct
540tcaggccaag aaggtgtgat tacccctctg cttgatgatg ttgaaagctt ggaattcgag
600ttttatgacg gaagccgctg ggggaaagag tggcaaaccg ataaatcact gccgaaagcg
660gtgaggctta agctgacact gaaagactat ggtgagatag agcgtgttta tctcactccc
720ggtggcaccc tagatcaggc cgatgattct tcaaacagtg actcttcagg cagtagtgag
780gggaataatg actcatcgaa ctaa
80456267PRTVibrio splendidus 56Met Trp Leu Ile Lys Arg Met Trp Ser Ile
Lys Ser Met Leu Leu Ile1 5 10
15Lys Asn Ser Ser Leu Thr Lys Ser Val Ser Leu Thr Lys Ser Met Ser20
25 30Glu Asn Lys Arg Thr Pro Arg Lys Gln
Gly Leu Pro Ser Lys Gly Arg35 40 45Gly
Phe Thr Leu Ile Glu Val Leu Val Ser Ile Ala Ile Phe Ala Thr50
55 60Leu Ser Met Ala Ala Tyr Gln Val Val Asn Gln
Val Gln Arg Ser Asn65 70 75
80Glu Ile Ser Ile Glu Arg Ser Ala Arg Leu Asn Gln Leu Gln Arg Ser85
90 95Leu Val Ile Leu Asp Asn Asp Phe Arg
Gln Met Ala Val Arg Lys Phe100 105 110Arg
Thr Asn Gly Glu Glu Ala Ser Ser Lys Leu Ile Leu Met Lys Glu115
120 125Tyr Leu Leu Asp Ser Asp Ser Val Gly Ile Met
Phe Thr Arg Leu Gly130 135 140Trp His Asn
Pro Gln Gln Gln Phe Pro Arg Gly Glu Val Thr Lys Val145
150 155 160Gly Tyr Arg Ile Lys Glu Glu
Thr Leu Glu Arg Val Trp Trp Arg Tyr165 170
175Pro Asp Thr Pro Ser Gly Gln Glu Gly Val Ile Thr Pro Leu Leu Asp180
185 190Asp Val Glu Ser Leu Glu Phe Glu Phe
Tyr Asp Gly Ser Arg Trp Gly195 200 205Lys
Glu Trp Gln Thr Asp Lys Ser Leu Pro Lys Ala Val Arg Leu Lys210
215 220Leu Thr Leu Lys Asp Tyr Gly Glu Ile Glu Arg
Val Tyr Leu Thr Pro225 230 235
240Gly Gly Thr Leu Asp Gln Ala Asp Asp Ser Ser Asn Ser Asp Ser
Ser245 250 255Gly Ser Ser Glu Gly Asn Asn
Asp Ser Ser Asn260 265571050DNAVibrio splendidus
57atgactcatc gaactaataa gcgtttagcg acaaggtcag ccttgggacg taaacaacgt
60ggtgtcgcgc tgatcattat tttgatgcta ttggcgatca tggcaaccat tgctggcagc
120atgtccgagc gtttgtttac gcaattcaag cgcgttggta accaactgaa ttaccaacag
180gcttactggt acagcattgg tgtggaagcg cttgtgcaaa acggtattag gcaaagttac
240aaagacagtg ataccgtgaa cctaagccaa ccatgggcgt tagaagagca ggtataccca
300ttggattatg gccaagttaa gggccgcatt gttgatgctc aggcatgttt taatcttaat
360gccttagccg gagtggcgac cacttcaagt aaccagactc cttatttaat cacggtttgg
420caaaccttat tggaaaacca agacgttgag ccttatcagg ctgaggttat cgcaaattca
480acgtgggaat ttgttgatgc ggatacacga accacctctt cgtctggtgt agaagacagc
540acgtatgaag cgatgaagcc ctcttatttg gcggcgaatg gcttaatggc cgatgaatcc
600gagctacgag cggtttatca agtcactggt gaagtgatga ataaggttcg cccctttgtt
660tgcgctctgc caaccgatga tttccgcttg aatgtgaata ctctcacgga aaaacaagca
720ccgttattgg aagcgatgtt tgcgccaggc ttaagtgaat cggatgccaa acagctgata
780gataaacgcc catttgatgg ctgggatacg gtagatgctt tcatggctga acctgccatt
840gttggtgtaa gtgccgaagt cagcaagaaa gcgaaagcat atttaactgt agatagcgcc
900tattttgagc tagatgcaga ggtattagtt gagcagtcac gtgtacgtat acggacgctt
960ttctatagta gtaatcgaga aacagtgacg gtagtacgcc gtcgttttgg aggaatcagt
1020gagcgagttt ctgaccgttc gactgagtag
105058349PRTVibrio splendidus 58Met Thr His Arg Thr Asn Lys Arg Leu Ala
Thr Arg Ser Ala Leu Gly1 5 10
15Arg Lys Gln Arg Gly Val Ala Leu Ile Ile Ile Leu Met Leu Leu Ala20
25 30Ile Met Ala Thr Ile Ala Gly Ser Met
Ser Glu Arg Leu Phe Thr Gln35 40 45Phe
Lys Arg Val Gly Asn Gln Leu Asn Tyr Gln Gln Ala Tyr Trp Tyr50
55 60Ser Ile Gly Val Glu Ala Leu Val Gln Asn Gly
Ile Arg Gln Ser Tyr65 70 75
80Lys Asp Ser Asp Thr Val Asn Leu Ser Gln Pro Trp Ala Leu Glu Glu85
90 95Gln Val Tyr Pro Leu Asp Tyr Gly Gln
Val Lys Gly Arg Ile Val Asp100 105 110Ala
Gln Ala Cys Phe Asn Leu Asn Ala Leu Ala Gly Val Ala Thr Thr115
120 125Ser Ser Asn Gln Thr Pro Tyr Leu Ile Thr Val
Trp Gln Thr Leu Leu130 135 140Glu Asn Gln
Asp Val Glu Pro Tyr Gln Ala Glu Val Ile Ala Asn Ser145
150 155 160Thr Trp Glu Phe Val Asp Ala
Asp Thr Arg Thr Thr Ser Ser Ser Gly165 170
175Val Glu Asp Ser Thr Tyr Glu Ala Met Lys Pro Ser Tyr Leu Ala Ala180
185 190Asn Gly Leu Met Ala Asp Glu Ser Glu
Leu Arg Ala Val Tyr Gln Val195 200 205Thr
Gly Glu Val Met Asn Lys Val Arg Pro Phe Val Cys Ala Leu Pro210
215 220Thr Asp Asp Phe Arg Leu Asn Val Asn Thr Leu
Thr Glu Lys Gln Ala225 230 235
240Pro Leu Leu Glu Ala Met Phe Ala Pro Gly Leu Ser Glu Ser Asp
Ala245 250 255Lys Gln Leu Ile Asp Lys Arg
Pro Phe Asp Gly Trp Asp Thr Val Asp260 265
270Ala Phe Met Ala Glu Pro Ala Ile Val Gly Val Ser Ala Glu Val Ser275
280 285Lys Lys Ala Lys Ala Tyr Leu Thr Val
Asp Ser Ala Tyr Phe Glu Leu290 295 300Asp
Ala Glu Val Leu Val Glu Gln Ser Arg Val Arg Ile Arg Thr Leu305
310 315 320Phe Tyr Ser Ser Asn Arg
Glu Thr Val Thr Val Val Arg Arg Arg Phe325 330
335Gly Gly Ile Ser Glu Arg Val Ser Asp Arg Ser Thr Glu340
345591248DNAVibrio splendidus 59gtgagcgagt ttctgaccgt tcgactgagt
agcgaaccac aaagccctgt gcagtggtta 60gtttggtcga caagccaaca agaagtgata
gcaagcggtg aactgtctag ctgggaacag 120cttgacgagt taacgcctta cgctgaaaag
cgcagctgta tcgctttatt gccgggaagt 180gaatgcttaa ttaagcgtgt tgagatcccg
aaaggtgctg ctcgccagtt tgattctatg 240ctgccgttct tattagaaga cgaagtcgca
caagatatcg aagacttaca cctgactatt 300ttagataaag atgccactca cgctaccgtg
tgtggtgtgg atcgtgaatg gctaaaacaa 360gctttagacc tgtttcgcga agccaatata
atcttccgta aggtgctacc agatacacta 420gccgtgcctt ttgaagaaca aggcatcagt
gcgttgcaga tagatcagca ttggttattg 480cgccaaggtc actctcaacg tcaaggtcac
tatcaagccg tatcgatcag tgaagcatgg 540ttaccgatgt ttttgcaaag tgattgggtt
gtcgctggtg aggaagagca agcgacgact 600atcttcagct ataccgcgat gccgagcgac
gacgttcaac agcaaagcgg cctcgagtgg 660caagcaaagc ctgcggaatt ggtgatgtct
ttattgagtc agcaagcgat cacaagcggc 720gtaaatttac tgactggcac ctttaaaacc
aaatcttcat tcagtaaata ttggcgtgtt 780tggcagaaag tggcgattgc tgcttgtttg
ctggtggccg tgattgtgac tcagcaagtg 840ttgaaggttc agcaatacga agcgcaagca
caagcctacc gcatggagag tgagcgtatc 900tttagagctg tgctgcctgg caaacaacgc
attccgaccg tgagttacct caagcgtcag 960atgaatgatg aagctaagaa atacggtggt
tcaggcgaag gtgattcttt acttggttgg 1020ttagctttgc tgcctgaaac cttagggcaa
gtgaagacga tcgaagttga aagcattcgc 1080tacgatggca accgttctga ggttcgactg
caggctaaaa gttctgactt ccaacacttt 1140gagaccgcaa gggtgaagct cgaagagaag
tttgtcgttg agcaagggcc attgaaccgt 1200aatggcgatg ccgtatttgg cagttttact
cttaaacccc atcaataa 124860415PRTVibrio splendidus 60Met
Ser Glu Phe Leu Thr Val Arg Leu Ser Ser Glu Pro Gln Ser Pro1
5 10 15Val Gln Trp Leu Val Trp Ser Thr
Ser Gln Gln Glu Val Ile Ala Ser20 25
30Gly Glu Leu Ser Ser Trp Glu Gln Leu Asp Glu Leu Thr Pro Tyr Ala35
40 45Glu Lys Arg Ser Cys Ile Ala Leu Leu Pro
Gly Ser Glu Cys Leu Ile50 55 60Lys Arg
Val Glu Ile Pro Lys Gly Ala Ala Arg Gln Phe Asp Ser Met65
70 75 80Leu Pro Phe Leu Leu Glu Asp
Glu Val Ala Gln Asp Ile Glu Asp Leu85 90
95His Leu Thr Ile Leu Asp Lys Asp Ala Thr His Ala Thr Val Cys Gly100
105 110Val Asp Arg Glu Trp Leu Lys Gln Ala
Leu Asp Leu Phe Arg Glu Ala115 120 125Asn
Ile Ile Phe Arg Lys Val Leu Pro Asp Thr Leu Ala Val Pro Phe130
135 140Glu Glu Gln Gly Ile Ser Ala Leu Gln Ile Asp
Gln His Trp Leu Leu145 150 155
160Arg Gln Gly His Ser Gln Arg Gln Gly His Tyr Gln Ala Val Ser
Ile165 170 175Ser Glu Ala Trp Leu Pro Met
Phe Leu Gln Ser Asp Trp Val Val Ala180 185
190Gly Glu Glu Glu Gln Ala Thr Thr Ile Phe Ser Tyr Thr Ala Met Pro195
200 205Ser Asp Asp Val Gln Gln Gln Ser Gly
Leu Glu Trp Gln Ala Lys Pro210 215 220Ala
Glu Leu Val Met Ser Leu Leu Ser Gln Gln Ala Ile Thr Ser Gly225
230 235 240Val Asn Leu Leu Thr Gly
Thr Phe Lys Thr Lys Ser Ser Phe Ser Lys245 250
255Tyr Trp Arg Val Trp Gln Lys Val Ala Ile Ala Ala Cys Leu Leu
Val260 265 270Ala Val Ile Val Thr Gln Gln
Val Leu Lys Val Gln Gln Tyr Glu Ala275 280
285Gln Ala Gln Ala Tyr Arg Met Glu Ser Glu Arg Ile Phe Arg Ala Val290
295 300Leu Pro Gly Lys Gln Arg Ile Pro Thr
Val Ser Tyr Leu Lys Arg Gln305 310 315
320Met Asn Asp Glu Ala Lys Lys Tyr Gly Gly Ser Gly Glu Gly
Asp Ser325 330 335Leu Leu Gly Trp Leu Ala
Leu Leu Pro Glu Thr Leu Gly Gln Val Lys340 345
350Thr Ile Glu Val Glu Ser Ile Arg Tyr Asp Gly Asn Arg Ser Glu
Val355 360 365Arg Leu Gln Ala Lys Ser Ser
Asp Phe Gln His Phe Glu Thr Ala Arg370 375
380Val Lys Leu Glu Glu Lys Phe Val Val Glu Gln Gly Pro Leu Asn Arg385
390 395 400Asn Gly Asp Ala
Val Phe Gly Ser Phe Thr Leu Lys Pro His Gln405 410
41561489DNAVibrio splendidus 61atgagaaata tgattgaacc actccaagcg
tggtgggctt caataagtca gcgggaacaa 60cgattagtca ttggttgttc tattttattg
atactgggcg ttgtctattg gggattaata 120caaccactta gccaacgagc cgagcttgca
caaagccgca ttcaaagtga gaagcaactt 180ctggcttggg taacggacaa agcgaatcaa
gtggttgaac tacgaggcag tggtggcatc 240agtgccagtc agcctttgaa ccaatctgtg
cctgcttcta tgcgccgttt taacatcgag 300ctgatacgcg tgcaaccacg cggtgagatg
ctgcaagttt ggattaagcc tgtgccattt 360aataagttcg ttgactggct gacatacctg
aaagaaaagc agggtgttga ggttgagttt 420atggatattg atcgctctga tagccctggg
gttattgaga tcaaccgact acagtttaaa 480cgaggttaa
48962162PRTVibrio splendidus 62Met Arg
Asn Met Ile Glu Pro Leu Gln Ala Trp Trp Ala Ser Ile Ser1 5
10 15Gln Arg Glu Gln Arg Leu Val Ile Gly
Cys Ser Ile Leu Leu Ile Leu20 25 30Gly
Val Val Tyr Trp Gly Leu Ile Gln Pro Leu Ser Gln Arg Ala Glu35
40 45Leu Ala Gln Ser Arg Ile Gln Ser Glu Lys Gln
Leu Leu Ala Trp Val50 55 60Thr Asp Lys
Ala Asn Gln Val Val Glu Leu Arg Gly Ser Gly Gly Ile65 70
75 80Ser Ala Ser Gln Pro Leu Asn Gln
Ser Val Pro Ala Ser Met Arg Arg85 90
95Phe Asn Ile Glu Leu Ile Arg Val Gln Pro Arg Gly Glu Met Leu Gln100
105 110Val Trp Ile Lys Pro Val Pro Phe Asn Lys
Phe Val Asp Trp Leu Thr115 120 125Tyr Leu
Lys Glu Lys Gln Gly Val Glu Val Glu Phe Met Asp Ile Asp130
135 140Arg Ser Asp Ser Pro Gly Val Ile Glu Ile Asn Arg
Leu Gln Phe Lys145 150 155
160Arg Gly63780DNAVibrio splendidus 63gtgaaacgcg gtttatcttt caaatacggc
ctgttattca gcgtcatttt tatcgttttt 60ttctcggtaa gcttgttgct gcatttgcct
gccgcttttg ctctcaagca tgcacccgtc 120gtgcgtggtt taagcattga aggcgttgag
ggcaccgttt ggcaaggtcg cgctaacaat 180atcgcgtggc agcgtgtcaa ttacggctca
gtgcagtggg acttccagtt ctctaaacta 240ttccaagcca aagcagaact tgcggttcgc
tttggccgca acagcgacat gaacttatca 300ggtaaaggac gtgtcggata tagcatgagt
ggtgcttacg cggaaaactt agtggcatca 360atgccagcca gcaacgtgat gaaatatgcg
ccagctatcc cagtgcctgt gtctattgca 420gggcaagttg aactgacgat caaacatgcg
gttcatgctc aaccttggtg tcaatcaggt 480gaaggtacgc ttgcttggtc tggtgcagca
gtcgactcgc cagtgggttc gttagacctt 540ggccctgtga ttgcggacat aacgtgtgaa
gacagcacaa ttgcagccaa aggcactcag 600aagagcgatc aggtagacag cgagttctca
gcgagcgtaa cacctaacca acgctacacc 660tcggcagcat ggtttaagcc aggcgctgaa
ttcccgccag caatgcagag tcagcttaag 720tggttgggca atcctgatag ccaaggtaaa
taccaattta cttatcaagg ccgcttttag 78064259PRTVibrio splendidus 64Met
Lys Arg Gly Leu Ser Phe Lys Tyr Gly Leu Leu Phe Ser Val Ile1
5 10 15Phe Ile Val Phe Phe Ser Val Ser
Leu Leu Leu His Leu Pro Ala Ala20 25
30Phe Ala Leu Lys His Ala Pro Val Val Arg Gly Leu Ser Ile Glu Gly35
40 45Val Glu Gly Thr Val Trp Gln Gly Arg Ala
Asn Asn Ile Ala Trp Gln50 55 60Arg Val
Asn Tyr Gly Ser Val Gln Trp Asp Phe Gln Phe Ser Lys Leu65
70 75 80Phe Gln Ala Lys Ala Glu Leu
Ala Val Arg Phe Gly Arg Asn Ser Asp85 90
95Met Asn Leu Ser Gly Lys Gly Arg Val Gly Tyr Ser Met Ser Gly Ala100
105 110Tyr Ala Glu Asn Leu Val Ala Ser Met
Pro Ala Ser Asn Val Met Lys115 120 125Tyr
Ala Pro Ala Ile Pro Val Pro Val Ser Ile Ala Gly Gln Val Glu130
135 140Leu Thr Ile Lys His Ala Val His Ala Gln Pro
Trp Cys Gln Ser Gly145 150 155
160Glu Gly Thr Leu Ala Trp Ser Gly Ala Ala Val Asp Ser Pro Val
Gly165 170 175Ser Leu Asp Leu Gly Pro Val
Ile Ala Asp Ile Thr Cys Glu Asp Ser180 185
190Thr Ile Ala Ala Lys Gly Thr Gln Lys Ser Asp Gln Val Asp Ser Glu195
200 205Phe Ser Ala Ser Val Thr Pro Asn Gln
Arg Tyr Thr Ser Ala Ala Trp210 215 220Phe
Lys Pro Gly Ala Glu Phe Pro Pro Ala Met Gln Ser Gln Leu Lys225
230 235 240Trp Leu Gly Asn Pro Asp
Ser Gln Gly Lys Tyr Gln Phe Thr Tyr Gln245 250
255Gly Arg Phe6510967DNAErwinia carotovora subsp. atroseptica
SCRI1043 65aagttgcagg atatgacgaa agcgtggccg acgactatac cggccacgct
ttgaggaatt 60acaggaaatc agctcgctta ggcgagaaag catcgatcag tacgctaccg
tcttccagcg 120aaaccacgcc gtgcatctcg tgtttcaccg ccagataggc gtcgcccgtt
ttcagggtgc 180gtttttcacc ttcgatcacg acttcaaagc tgccagcggc aacataagca
atctggtcgt 240gaatctcatg gaagtgcggc gtaccaatcg cacctttatc aaagtgcacg
taaaccatca 300tcagctcatc gctccatgtc atgattttac gtttaatgcc accgcccagc
tcttcccatg 360gcgtttcatc atcaataaag tatcttctca tcatctctct cctctaacgc
tctttttgcc 420cataccttct attgcgtcaa caaaccgtgt acgacaacga atgcatggct
atggattgcg 480acattttagc cacatcagta ccagaagaaa cataaaataa gcaaaaccat
gacggccctc 540aagaaataaa taaaacatta tttcattttt attgaattcg catctcatcc
aaactatcat 600cccgcataac aagaaagaac cgggcatgtt gaggaacagg tgacgttgtc
actgccacgc 660aacatcatct gtttcgcccg gcgctttcgc caggaacgat tcctcttctt
ggaacggcgc 720ctgatttttg tttttctctg aaagagaggc taagaaatgc aagttcgtca
aagcattcac 780agcgatcacg cgaagcagct agatacagca ggcctgcgtc gtgaattcct
gatcgaacag 840attttttctg ccgatgccta cactatgacc tatagccaca tcgaccgaat
catcgtcggt 900ggcatcatgc ccgtacacag cgccgtaacg attggcggtg aagtgggtaa
acaactcggc 960gttagctatt tccttgagcg tcgcgaactc ggagccatca acattggcgg
cgcgggtacc 1020gttactgtcg atggcgagcg ctatgacgtg ggtaatgaag aagcaattta
tgttggcatg 1080ggcgtgaaag acgtgcagtt taccagcact gatgccacta acccggccaa
gttctactac 1140aacagcgcgc ctgcacatac gacatatcct acccgcaaga ttacccaagc
tgacgcttca 1200ccacaaaccg tgggagaaga tgcaagctgt aatcgtcgca caattaacaa
atacattgtt 1260cccgatgtat tgccaacctg ccagctcacc atgggattaa ccaagttagc
tgaaggcagc 1320ctgtggaaca ccatgccttg tcatacgcat gagcgccgga tggaagtcta
tttctatttt 1380gatatggatg aggaaacggc cgttttccac atgatggggc aaccgcagga
aacccgtcac 1440atagttatta aaaacgagca ggcggtgatt tcaccgagct ggtcgattca
ttccggtgtt 1500ggcaccagac gctacacctt tatctggggc atggttggcg agaatcaagt
tttcggtgac 1560atggatcacg tcaaggttag cgagttacgt taatcgcttt caaccggaat
taccggtgtt 1620ccctacagta acagctaacg actaagtatt gtcgcttata gagagattat
tgatatgatt 1680ttaaattctt ttgatttgca aggtaaagtt gctcttatca cgggttgtga
tacgggttta 1740ggtcagggta tggctatcgg tctggcacaa gctggctgtg atatcgttgg
cgtcaacatc 1800gttgaaccaa aagataccat cgaaaaagtt accgcactgg gacgccgttt
cctcagcctg 1860accgctgaca tgagcaacgt agcgggtcat gccgagctgg tagagaaagc
cgttgctgaa 1920tttggtcacg ttgacattct ggtcaacaac gccggtatca tccgtcgtga
agatgctatc 1980gagttcagcg agaaaaactg ggacgacgtc atgaatctga acattaagag
cgttttcttt 2040atgtctcagg ctgttgcacg ccagtttatc aaacaaggta aaggcggcaa
gatcatcaac 2100atcgcctcta tgctgtcctt ccaaggcggt atccgcgtgc cttcttacac
tgcgtcaaaa 2160agcgccgtta tgggtgtaac ccgtctgctg gctaacgagt gggcaaaaca
cggcatcaac 2220gttaacgcca ttgctccagg gtacatggca accaacaata ctcagcaact
gcgcgccgat 2280gaagaccgca gcaaagagat tctggaccgt atcccggctg gccgttgggg
tttaccacag 2340gatctgatgg gcccatccgt cttcctggca tccagcgcat ctgattacat
caatggctac 2400acgattgccg ttgatggtgg ctggctggct cgctaagtgt aatttttctt
agcggcattt 2460cgctaatcca cgataaaaag cacaatttag gttgtgcttt ttatttattt
ttcaagttgt 2520tatttcgttt tttataattc tcttttctgc ctaaatcctt tcttaaaaaa
aaatcaaaac 2580aacgttccga ctttgatcac actttcgata ttgcgtgcat gacgacaagg
ttaatagcgc 2640aatataatca atcaaaacag tgtttctatt tataaggaac tgttcacgca
gttccataag 2700aaggtactcc atgagtattt ttgaaaactt atacaccagc aggaaatcgc
agctcgacga 2760atgggttgct gcacttgata gccacatatc ctgcgttcag gaaaaaggcc
gcagccaaag 2820ccaaccgacg ctattactgg ccgatggttt tgatgtggaa aattatgcgc
ctgcggtatg 2880gcaatttccg gatgggcaca gcgcgcctat ttctaatttt gccagccagc
agaattggct 2940aagaacgctg tgcgccatga gcgtcgttac gggtaatgat agttaccaac
agcacgctat 3000cgcacaaagc gaatatttcc tggatcattt cgttgatgat aatagcggcc
tgttctactg 3060gggcggccat cgctttatta atctggatac gctggaaggc gaagggccag
aatccaaagc 3120tcaggtgcat gaattaaagc accacctgcc ctattacgcg ctgttacatc
gtgttaacgc 3180ggaaaagacg ctgaacttct ttcaggggtt ctggaacgca cacgttgaag
attggaattc 3240actggatctg ggtcgtcatg gcgattacag caaaaaacgc gatcctgatg
ttttcctgca 3300taaccgtcat gatgtcgtcg atccggcaca gtggcccgtt ctgccattaa
cgaaaggcct 3360gacgtttgtt aatgccggca cggatctgat ttacgccgca ttcaaatatg
cagaatatac 3420gggcgatagc catgccgcgg catggggtaa acacctttat cgccaatacg
ttctggctcg 3480caacccagaa accggtatgc cggtgtatca attcagttca ccacagcagc
gccagccagt 3540gccggaagac gataaccaga cgcagtcctg gtttggcgat cgcgctcaac
gccagtttgg 3600cccagagttc ggtgaaatcg cacgtgaagc caatgtgctg ttccgcgata
tgcgtccact 3660gctgattgat aacccgctgg caatgctgga tatcctccgc acacagcctg
atgcagaaat 3720gctgaattgg gtaatctctg gattaaaaaa ttattaccag tacgcctacg
atgtcaccag 3780caatacgttg cgcccgatgt ggaacaacgg gcaggacatg acaggctacc
gttttaaacg 3840cgatggctat tacggcaaag cgggaacgga attaaaaccg ttcgcattag
aaggtgatta 3900tttattacct ctggttcgtg cttatcgtct gagcggtgat gaagacctgt
acgcactggt 3960taacaccatg ctgacacggc tgaataaaga agatattcag cacatcgcca
gtccgctact 4020tttgttgacc gttatcgaac tggccgatca caagcaatca gaatcctggg
cacattacgc 4080cgcacaactg gcgggcgtta tgtttgaaca acatttccat cgtggtttgt
ttgttcgctc 4140tgcacagcat cgttatgttc gtctggatga tacctatccg ctggctttac
tgactttcgt 4200tgccgcctgt cgcaacaaat taaacgatat cccgccgtat ctgacacaag
gtggatatgt 4260tcacggcgat tttcacgtta acggggaaaa tagaattgtt tatgacgtgg
aattaattta 4320tccagagtta ttaacagctt aattttatgt tttttttaat gattcacaat
taatcaatag 4380gtaagcatta tgaatgaaaa cagaatgctg gggttagcct atatctcccc
ctatattata 4440gggctgatag tttttaccgc tttccccttt atttcgtcat ttatcctcag
ttttactgag 4500tatgatttga tgagtccgcc tgagtttacg ggtcttgaga actatcaccg
tatgttcatg 4560gaggatgatc ttttttggaa atcaatgggc gtcacctttg cctatgtatt
tctgaccatt 4620ccattgaaat taatcttcgc actgttaatt gcgtttgtac ttaatttcaa
attacgtggt 4680atcggtttct tccgtactgc ttactatgtg ccttctattc tgggcagcag
cgtggccatt 4740gccgttctgt ggcgtgccct attcgccatc gatggcttgc tgaacagctt
cctcggcgta 4800tttggctttg atgccatcaa ctggctgggc gaaccttcgc tggcactgat
gtcggtaacc 4860ctgctgcgcg tatggcagtt tggttccgcc atggttatct tccttgctgc
attgcagaac 4920gtcccgcaat cacagtatga agcagccatg atcgacggtg catccaaatg
gcaaatgttc 4980ctgaaagtaa cggttccact gattacgccg gttattttct ttaactttat
catgcagacc 5040actcaggcat tccaggagtt tacggcacct tacgtcatca ctggcggcgg
tccaacgcac 5100tacacctatc tgttctcgct ctatatctat gataccgcgt tcaagtattt
cgatatgggc 5160tatggtgctg cgctggcatg ggttctgttc ctggttgttg cggtatttgc
ggcaatctcc 5220tttaagtcgt cgaaatactg ggtgttctac tccgctgata aaggaggaaa
aaatggctga 5280catgcattca aacctgacta cagcacaaga aattgctgct gcagaagtac
gccgcacgct 5340gcgtaaagag aaactcagtg cctccatccg ttacgtgata ctgctgttcg
ttggcttact 5400gatgctttac ccactagcgt ggatgttctc agcgtcgttc aaaccgaacc
aagagatctt 5460cacgacactg ggcctgtggc cggaacacgc cacatgggac ggtttcgtta
acggttggaa 5520aaccggtacg gaatacaatt tcggtcacta catgatcaat acgctcaagt
tcgtgattcc 5580gaaagtgcta ctgaccatta tctcttccac cattgtcgct tacggctttg
cccgtttcga 5640gattccatgg aagggcttct ggttcgggac gctgatcacc accatgctgt
taccaagcac 5700cgtgttgctg attccgcagt acatcatgtt ccgtgaaatg ggcatgctga
acagctatct 5760gccactgtac ttgccgatgg cgtttgcaac acaagggttc tttgtgttca
tgctgatcca 5820gttcctgcgt ggtgtaccac gtgatatgga agaagccgcc cagatcgatg
gctgtaactc 5880cttccaggtt ctgtggtatg tggtcgtgcc gattttgaaa ccagccatca
tctctgttgc 5940gctgttccag ttcatgtggt caatgaacga cttcatcggt ccgctgattt
atgtctatag 6000cgtggataaa tatccgattg cgctggcgct gaaaatgtct atcgacgtta
ctgaaggcgc 6060tccgtggaat gaaatcctgg caatgtccag catctccatt ctgccatcca
ttattgtttt 6120cttcctggca cagcgttact tcgtacaagg cgtgaccagc agcggaatta
aaggttaata 6180gaggatttat catggctgaa gttattttca ataaactgga aaaagtatac
accaacggct 6240tcaaagcggt tcacggcatc gacctgacca ttaaagacgg tgagttcatg
gttatcgtcg 6300gcccgtcagg ctgtgcgaaa tcaacgacgc tgcgtatgtt agcgggtctg
gaaaccatca 6360gcggcggtga agttcgcatc ggcgagcgcg ttgttaacaa tctggcaccg
aaagagcgtg 6420ggattgcaat ggtgttccag aactatgcgc tctaccctca tatgacggta
aaagagaacc 6480tggcgtttgg tctgaagctg agcaaaatgc ctaaagatca aattgaagcg
caagtaacgg 6540aagcagccaa aattctggag ctggaagacc tgatggatcg tctgccacgc
cagctatctg 6600gtggtcaggc gcagcgtgtg gccgtaggcc gtgccatcgt taaaaagccg
gatgttttcc 6660tgtttgatga accgttatct aacctggatg ccaaactgcg tgcttccatg
cgtatccgta 6720tttctgacct gcataagcag ttgaagaaaa gcggtaaagc ggcaacgacg
gtatatgtta 6780cccacgacca gactgaagcc atgaccatgg gcgaccgtat ctgcgttatg
aagctgggtc 6840acatcatgca ggtcgatacg ccggataacc tgtaccattt ccctgtcaac
atgttcgttg 6900ctggcttcat tggctcacca gaaatgaaca ttaagccgtg caaactggtc
gagaaagacg 6960gtcagattgg cgttgttgtg ggtaataacg cgctggtatt aaatactgaa
aaacaagata 7020aagtgcgcag ctacgtagga caagacgtat tcttcggcgt tcgcccagac
tatgtttcct 7080tgtcagatac gccatttgaa ggcagccact cacagggtga actggttcgc
gtagaaaaca 7140tgggtcacga attctttatg tacattaaag tcgatggctt tgaattaacc
agccgcattc 7200cttatgacga aggtcggctg attatcgaga agggactgca tcgtccggta
tatttccagt 7260tcgacatgga aaaatgccat atttttgatg caaaaacaga aaaaaatatc
tctctttaac 7320aggagtagta accgatgaaa aaagcgatcc tacacacgtt aatagcttca
tctttggcat 7380tagttgcaat gccatctctg gcagccgatc aggttgagtt gagaatgtcc
tggtggggcg 7440gcaacagccg tcaccaacag acgctcaagg cgattgaaga gttccataag
cagcacccag 7500acatcaccgt gaaagcggaa tacaccggat gggatggtca cctgtctcgt
ctgacaacac 7560agattgccgg taacactgag ccagatgtga tgcagactaa ctggaactgg
ctgccgattt 7620tctccaaaaa cggcgatggt ttttatgatc tgaacaaagt gaaagattct
ctggatctga 7680cccagttcga agcaaaagaa ctgcaaaaca ccacggttaa cggcaagctg
aacggtattc 7740ctatttctgt taccgctcgc gtgttctatt tcaacaacga aagctgggca
aaagcgggac 7800tggaataccc gaaaacgtgg gacgaactgc tgaacgccgg taaagtgttc
aaagagaagc 7860tgggcgacca atactaccct atcgtgttgg aacaccagga ttctctggca
ctgctgaact 7920cttacatggt tcaaaaatac aacattcctg ctattgatgt gaaaagtcag
aaattcgcct 7980ataccgatgc acaatgggtt gaattctttg gcatgtataa gaaactgatc
gacagccatg 8040tcatgcctga tgcgaaatac tatgcctctt tcggtaagag caacatgtat
gagatgaagc 8100catggatcaa tggcgagtgg tctggtactt acatgtggaa ctccactatc
actaagtact 8160ctgacaactt gcaaccacca gcaaaactgg cgttaggtaa ctacccaatg
ctgcctggtg 8220caaaagatgc tggcttgttc ttcaaacctg cacaaatgct gtctatcggt
aagtcaacca 8280agcatcctaa agagtctgct cagttgatca acttcctgct gaacagcaaa
gaaggtgctc 8340aggctttggg tctggaacgt ggtgtaccgt tgagtaaagc ggctgtggct
cagctgaccg 8400ctgatggcat catcaaagat gatgctccag cagttgccgg gttgaagctg
gcgctgtctc 8460tgccgcatga agttgctgtt tctccttatt tcgacgaccc acaaatcgtt
tctctgtttg 8520gtgataccat ccaatctatc gattatggtc agaaatctgt ggaagacgca
gcgaaatact 8580tccagcgtca atctgagcgt gttctgaaac gcgcaatgaa ataatgtagc
actcgattta 8640ccctgtaatt catccctgcc gcaccgacgg cagggatttt tcatttaaat
taaaacatcc 8700tctatattca attcgatctc cctcacaatt tgaaacccta ttttactttt
tgttactcaa 8760aacgatctcg atcacagaac gtaatttaat aataaataga atagaacttg
tcccaaaaaa 8820cataatgcgc ctttcgaatt aaagtattaa gcacagtcct aaccaatggg
gaatataaca 8880atgaaattta aattattagc tctggctgtt acatcattaa ttagtgtgaa
tgcaatggct 8940gtaactatcg attaccgtca tgaaatgaaa gatacaccga aaaatgatca
ccgcgatcgt 9000ttgtcaatgt cacaccgttt tgccaatggc tttggtttat ccgttgaagc
aaaatggcgt 9060caatccagtg ctgacagcac accgaataaa ccatttaatg aaaccgtcag
caacggtact 9120gaagttgtcg ccagctatgt ttacaacttc aacaaaactt tttctctgga
gccaggtttc 9180tctttagatt caagctctac ctctaacaac tatcgccctt atctgcgcgg
taaagtgaat 9240atcactgacg atctttctac ctctttacgt tatcgtcctt actacaaacg
taacagcggt 9300gatgttccaa atgcatcaaa aaacaaccaa gagaatggtt ataacctaac
cgccgttctc 9360agctataaat tcctgaaaga tttccaagtt gattacgaac tggactacaa
aaaagcaaat 9420aaagccggtg cgtatcaata cgacaatgaa acatacaatt tcgaccatga
tgtaaaattg 9480tcttataaaa tggataaaaa ctggaagcct tatatggctg taggtaatgt
tgcagattcc 9540ggcaccaacg atcatcgtca aactcgttac cgtgttggtg tgcaatacag
cttctaataa 9600cggccttgtt atttaaataa gcgttattag gtagcagaag ggatgttatt
gttaatcgat 9660ttactcagat ctacttttat cattaacatc cctttattat ggtgtccgtt
gtaggttaag 9720caggttagtt acgtttcttt gttgtacatg atttagttat atgcgtttta
gctgctgtaa 9780ttgctgtgtc tgatttaccc tcttcgtgta tgaatgttat ttctttatta
aaatttgcgg 9840ttcagggtag tcattttttc tccgatgtga tggctaccct attttttacc
accgcccaac 9900gattcccccc tcattccctt tgtcaggtga tctatcatga ttgttcgttc
tctgcttgtc 9960ggggccatta tgatgtctgt aaatggatta agttacgcac aacctgtttt
ctctgtctgg 10020ccacacggtg aagcaccggg tgcctcttct tcaacggcac agccgcaagt
ggtcgaacgg 10080agtaaagatc cttctcttcc cgatcgagcc gcaacgggta ttcgcagccc
tgaaattacc 10140gtttatccgg cagagaaacc caatggcatg gcattactca ttacgccggg
cggttcttat 10200cagcgcgtcg tgctagataa agaaggcagc gatctagccc ctttctttaa
tcaacaaggc 10260tacacccttt tcgtgatgac ctatcgtatg cccggtgaag gccataaaga
aggcgctgac 10320gctccgctag ccgatgccca acgagccatc agaacactga gagccaacgc
cgaaaagtgg 10380cacattaacc cgcagcgcat cggtattatg gggttctccg ccggtggtca
cgttgccgcc 10440agccttggaa cccgattcgc acagtccgtt taccccgcga tggacgccgt
tgataacgta 10500agcgcacgcc ctgacttcat ggtgttgatg taccccgtaa tttctatgca
ggcagatatt 10560gcgcacgccg gttcacgtaa acagttaatc ggcgagcaac cgatggaagt
acaagcggta 10620cgttattctc ctgagaaaca ggttactgat cagactcccc ccacgttttt
ggtgcatgcg 10680gttgacgatc cgtcagtgtc ggttgataac agcctggtga tgtttagcgc
gctgcgggca 10740aagcagattc cggtcgaaat gcatctcttt gagaaaggta aacacggctt
cggtctccgc 10800ggcaccaagg ggcttcctgc cgctgcctgg cctcaactgc tggacaactg
gctacgcgct 10860ttacctgcaa gcaacgaatt gccgaaagcc gcgccataag gtatagcaaa
catcgtaacc 10920gaaataaatc gttacgccgt caccgcttcc gcagacaggg ataatct
10967662582DNAErwinia carotovora subsp. atroseptica SCRI1043
66ccaacggcgg gtgcgacata aacataagcg aatcgaagcg ctgcgctccg gtgagtatct
60gaagtaattt acgatagttt ctttccaaag gcccattcgg gcctttgtta tttcagcgtt
120tattgattca tcaaacctgc gctttctctg ctcgaatgtt ttcactagat ctgaaacagg
180tggtgaaaac atgaagaatg ttttataaaa taaaaccacg atcacggaaa aatgaaacat
240tgtttctata ataccgatat gacaggcgtc tcgcgtgaga tttgtggcct gatttttgaa
300caaccggtgt cggggtgacc gattcgtcgg acgttcagta atgtcaggtt atcgaagcgt
360atgcgtgtgt ggcgtcaaat tcttcatgat aagttctaag gatttacgga tggccaaagg
420taataagatc cccctaacgt ttcataccta ccaggatgca gcaaccggca ccgaagttgt
480gcgtttaacc ccgcccgatg ttatctgcca ccggaattat ttctaccaga agtgtttctt
540caatgacggt agcaagctgc tgtttggcgc tgcatttgat ggcccatgga actactatct
600gctggattta aaagagcaga acgccacaca gttgacggaa ggcaaaggcg acaatacttt
660tggtggtttc ctgtctccga atgacgatgc gctatattac gttaaaaata cccgtaattt
720gatgcgtgtc gatctgacta cgctggaaga gaaaacgatt tatcaggtgc ctgacgattg
780ggtcggctac ggtacttggg ttgccaactc cgattgcacc aaaatggtcg gtattgagat
840caagaaagaa gactggaagc cactgaccga ttggaaaaaa ttccaggagt tctacttcac
900taatccttgc tgtcgtctga ttcgcgtcga tttggtaacg ggcgaagcgg agactatcct
960tcaggaaaac cagtggctgg gtcacccaat ctaccgtcca ggtgatgaca acacggttgc
1020tttctgtcac gaaggcccgc atgacctggt tgatgctcgt atgtggttca tcaacgaaga
1080tggcaccaac atgcgcaaag tgaaagagca tgcagaaggc gaaagctgca cccacgaatt
1140ttgggtgccg gatggctccg cgatgattta tgtctcttat cttaaagacg ataccaaccg
1200ttatattcgc agcatcgatc ccgttacgct ggaagatcgc caactgcgtg taatgccgcc
1260gtgttctcac ctgatgagta actatgatgg cacactgttg gtcggtgatg gttccgatgc
1320accggtcgac gtgcaggatg atggtggcta caaaattgag aacgatccgt tcctgtatgt
1380tttcaacctg aaaactggca aagaacatcg tattgcgcag cacaatacat cctgggaagt
1440gttggaaggg gaccgtcagg tcactcaccc gcacccgtct ttcacgccgg ataataaaca
1500agttctgttt acttctgacg tagatggaaa acctgcgttg tatctggcga aggttcctga
1560ttcagtctgg aactaataat actaataaat ccgcgtcacg tttcatggcg cggattattt
1620taaaatattt acttacatat tattttatta agtctctgac gcggttattt ctcaaactta
1680acttgattat cgttgttgct ccattgccat aatcaaagcg ttccctttat actaaaacca
1740ttgttctatt ttttttaaaa caaaaaaacc tgagtagggt aaccacaaaa atggctagtg
1800cagatttaga taaacaaccc gattccgtgt cgtccgtttt aaaggttttt ggtattttgc
1860aggcattagg tgaagagaga gaaattggta ttaccgagct ttctcagcga gtcatgatgt
1920ctaagagtac cgtttaccgt ttcttgcaga cgatgaaatc cctgggctat gtcgcgcagg
1980aaggtgaatc agagaagtat tcgctaacgc tcaagttgtt tgaacttggt gcaaaagcat
2040tgcagaacgt agacttaatc cgcagtgcgg atatacagat gcgcgagttg tctgtgctga
2100cgcgggaaac gattcacctt ggcgcgttgg atgaagacgg catcgtttat atccacaaga
2160ttgattctat gtataacctg cgtatgtatt cgcgcatcgg tcgccgtaat ccactacaca
2220gtaccgcaat tggtaaagtg ttgctggctt ggcgcgatcg cggtgaagtg gaagaggttc
2280tgtcgactgt cgaattcacg cgtagtacgc cacacacatt gtgtactgct gaagatcttc
2340tcaatcaact ggatgtcgtg cgtgagcaag gctacgggga agataaagaa gagcaggaag
2400aagggctgcg ttgtatcgct gtgccagtat tcgatcgttt tggtgtggtg attgccggcc
2460tcagtatttc cttcccaacg attcgttttt cagaagaaaa caaacacgaa tatgtggcca
2520tgctgcacac cgcagctaga aatatctctg agcaaatggg ctaccacaat ttccctttct
2580ga
2582672331DNAAgrobacterium tumefaciens 67atgcgtccct ctgccccggc catctccaga
cagacacttc tcgatgaacc ccgcccgggc 60tcattgacca ttggctacga gccgagcgaa
gaagcacaac cgacggagaa ccctccgcgc 120ttttcatggc tacccgatat tgacgacggc
gcgcgttacg tgctgcgcat ttcgaccgat 180cccggtttta cagacaaaaa aacgctcgtc
ttcgaggatc tcgcctggaa tttcttcacc 240ccggatgaag cactgccgga cggccattat
cactggtgtt atgcgctatg ggatcagaaa 300tccgcaacag cgcattccaa ctggagcacc
gtacgcagtt tcgagatcag tgaagcactg 360ccgaaaacgc cgctgcccgg caggtctgcc
cgccatgctg ccgcgcaaac cagccaccct 420cggctgtggc tcaactccga gcaattgagt
gccttcgccg atgccgttgc gaaggacccc 480aaccattgtg gctgggccga gttttacgaa
aaatcggtcg agccgtggct cgagcggccg 540gtcatgccgg aaccgcagcc ctatcccaac
aacacgcgtg tcgccacgct ctggcggcag 600atgtatatag actgccagga agtgatctat
gcgatccggc acctggccat tgccggccgc 660gtgctcggac gcgacgacct tctcgatgca
tcccgcaaat ggctgctggc cgtcgccgcc 720tgggacacga aaggtgcgac ctcacgcgcc
tataatgacg aggcggggtt ccgcgtcgtc 780gtcgcactcg cctggggtta tgactggctg
tacgaccatc tgagcgaaga cgaacgcagg 840accgtgcgat ccgttcttct cgaacggacg
cgggaagttg ccgatcatgt catcgcacac 900gcccgcattc acgtctttcc ctatgacagc
catgcggtgc gctcgctttc ggctgtattg 960acgccggcct gcatcgcact tcagggagaa
agcgacgagg ctggcgaatg gctcgactat 1020accgtcgaat tccttgccac gctctattct
ccctgggcgg gaaccgatgg tggttgggcg 1080gaaggtccgc attactggat gaccggcatg
gcctatctca tcgaggccgc caatctgatc 1140cgctcctata ttggttatga cctctatcaa
cggccgtttt tccagaatac cggtcgcttc 1200ccgctttaca ccaaggcgcc gggaacccgc
cgcgccaact tcggcgacga ctccaccctt 1260ggcgaccttc ccggcctgaa gctgggatac
aacgtccggc aattcgccgg cgtcaccggc 1320aatggccatt accagtggta tttcgatcac
atcaaggccg atgcgacagg cacggaaatg 1380gccttttaca attacggctg gtgggacctc
aacttcgacg atctcgtcta tcgccacgat 1440tacccgcagg tggaagccgt gtctcccgcc
gacctgccgg cactcgccgt tttcgatgat 1500attggttggg cgaccatcca aaaagacatg
gaagacccgg accggcacct gcagttcgtc 1560ttcaaatcca gcccttacgg ttcgctcagc
cacagtcacg gcgaccagaa tgcctttgtg 1620ctttatgccc atggcgagga tctggcgatc
cagtccggtt attacgtggc gttcaattcg 1680cagatgcatc tgaattggcg gcgtcagaca
cggtcgaaaa atgccgtgct gatcggcggc 1740aaaggccaat atgcggaaaa ggacaaggcg
cttgcacgcc gcgccgccgg ccgcatcgtc 1800tcggtggagg aacagcccgg ccatgttcgt
atcgtcggcg atgcaaccgc cgcctaccag 1860gttgcgaacc cgctggttca aaaggtgctg
cgcgaaaccc acttcgttaa tgacagctat 1920ttcgtgattg tcgacgaagt cgaatgttcg
gaaccccagg aactgcaatg gctttgccat 1980acactcggag cgccgcagac cggcaggtca
agcttccgct acaatggccg gaaagccggt 2040ttctacggac agttcgttta ctcttcgggc
ggcacgccgc aaatcagcgc cgtggagggt 2100tttcccgata tcgacccgaa agaattcgaa
gggctcgaca tacaccacca tgtctgcgcc 2160acggttccgg ccgccacccg gcatcgcctt
gtcacccttc tggtgcctta cagcctgaag 2220gagccgaagc gcattttcag cttcatcgat
gatcagggtt tttccaccga catctacttc 2280agtgatgtcg atgacgagcg tttcaagctc
tcccttccca agcagttcta a 2331
68 776 PRT Agrobacterium tumefaciens
68 Met Arg Pro Ser Ala Pro Ala Ile Ser Arg Gln Thr Leu Leu Asp Glu1
5 10 15Pro Arg Pro Gly Ser Leu
Thr Ile Gly Tyr Glu Pro Ser Glu Glu Ala20 25
30Gln Pro Thr Glu Asn Pro Pro Arg Phe Ser Trp Leu Pro Asp Ile Asp35
40 45Asp Gly Ala Arg Tyr Val Leu Arg Ile
Ser Thr Asp Pro Gly Phe Thr50 55 60Asp
Lys Lys Thr Leu Val Phe Glu Asp Leu Ala Trp Asn Phe Phe Thr65
70 75 80Pro Asp Glu Ala Leu Pro
Asp Gly His Tyr His Trp Cys Tyr Ala Leu85 90
95Trp Asp Gln Lys Ser Ala Thr Ala His Ser Asn Trp Ser Thr Val Arg100
105 110Ser Phe Glu Ile Ser Glu Ala Leu
Pro Lys Thr Pro Leu Pro Gly Arg115 120
125Ser Ala Arg His Ala Ala Ala Gln Thr Ser His Pro Arg Leu Trp Leu130
135 140Asn Ser Glu Gln Leu Ser Ala Phe Ala
Asp Ala Val Ala Lys Asp Pro145 150 155
160Asn His Cys Gly Trp Ala Glu Phe Tyr Glu Lys Ser Val Glu
Pro Trp165 170 175Leu Glu Arg Pro Val Met
Pro Glu Pro Gln Pro Tyr Pro Asn Asn Thr180 185
190Arg Val Ala Thr Leu Trp Arg Gln Met Tyr Ile Asp Cys Gln Glu
Val195 200 205Ile Tyr Ala Ile Arg His Leu
Ala Ile Ala Gly Arg Val Leu Gly Arg210 215
220Asp Asp Leu Leu Asp Ala Ser Arg Lys Trp Leu Leu Ala Val Ala Ala225
230 235 240Trp Asp Thr Lys
Gly Ala Thr Ser Arg Ala Tyr Asn Asp Glu Ala Gly245 250
255Phe Arg Val Val Val Ala Leu Ala Trp Gly Tyr Asp Trp Leu
Tyr Asp260 265 270His Leu Ser Glu Asp Glu
Arg Arg Thr Val Arg Ser Val Leu Leu Glu275 280
285Arg Thr Arg Glu Val Ala Asp His Val Ile Ala His Ala Arg Ile
His290 295 300Val Phe Pro Tyr Asp Ser His
Ala Val Arg Ser Leu Ser Ala Val Leu305 310
315 320Thr Pro Ala Cys Ile Ala Leu Gln Gly Glu Ser Asp
Glu Ala Gly Glu325 330 335Trp Leu Asp Tyr
Thr Val Glu Phe Leu Ala Thr Leu Tyr Ser Pro Trp340 345
350Ala Gly Thr Asp Gly Gly Trp Ala Glu Gly Pro His Tyr Trp
Met Thr355 360 365Gly Met Ala Tyr Leu Ile
Glu Ala Ala Asn Leu Ile Arg Ser Tyr Ile370 375
380Gly Tyr Asp Leu Tyr Gln Arg Pro Phe Phe Gln Asn Thr Gly Arg
Phe385 390 395 400Pro Leu
Tyr Thr Lys Ala Pro Gly Thr Arg Arg Ala Asn Phe Gly Asp405
410 415Asp Ser Thr Leu Gly Asp Leu Pro Gly Leu Lys Leu
Gly Tyr Asn Val420 425 430Arg Gln Phe Ala
Gly Val Thr Gly Asn Gly His Tyr Gln Trp Tyr Phe435 440
445Asp His Ile Lys Ala Asp Ala Thr Gly Thr Glu Met Ala Phe
Tyr Asn450 455 460Tyr Gly Trp Trp Asp Leu
Asn Phe Asp Asp Leu Val Tyr Arg His Asp465 470
475 480Tyr Pro Gln Val Glu Ala Val Ser Pro Ala Asp
Leu Pro Ala Leu Ala485 490 495Val Phe Asp
Asp Ile Gly Trp Ala Thr Ile Gln Lys Asp Met Glu Asp500
505 510Pro Asp Arg His Leu Gln Phe Val Phe Lys Ser Ser
Pro Tyr Gly Ser515 520 525Leu Ser His Ser
His Gly Asp Gln Asn Ala Phe Val Leu Tyr Ala His530 535
540Gly Glu Asp Leu Ala Ile Gln Ser Gly Tyr Tyr Val Ala Phe
Asn Ser545 550 555 560Gln
Met His Leu Asn Trp Arg Arg Gln Thr Arg Ser Lys Asn Ala Val565
570 575Leu Ile Gly Gly Lys Gly Gln Tyr Ala Glu Lys
Asp Lys Ala Leu Ala580 585 590Arg Arg Ala
Ala Gly Arg Ile Val Ser Val Glu Glu Gln Pro Gly His595
600 605Val Arg Ile Val Gly Asp Ala Thr Ala Ala Tyr Gln
Val Ala Asn Pro610 615 620Leu Val Gln Lys
Val Leu Arg Glu Thr His Phe Val Asn Asp Ser Tyr625 630
635 640Phe Val Ile Val Asp Glu Val Glu Cys
Ser Glu Pro Gln Glu Leu Gln645 650 655Trp
Leu Cys His Thr Leu Gly Ala Pro Gln Thr Gly Arg Ser Ser Phe660
665 670Arg Tyr Asn Gly Arg Lys Ala Gly Phe Tyr Gly
Gln Phe Val Tyr Ser675 680 685Ser Gly Gly
Thr Pro Gln Ile Ser Ala Val Glu Gly Phe Pro Asp Ile690
695 700Asp Pro Lys Glu Phe Glu Gly Leu Asp Ile His His
His Val Cys Ala705 710 715
720Thr Val Pro Ala Ala Thr Arg His Arg Leu Val Thr Leu Leu Val Pro725
730 735Tyr Ser Leu Lys Glu Pro Lys Arg Ile
Phe Ser Phe Ile Asp Asp Gln740 745 750Gly
Phe Ser Thr Asp Ile Tyr Phe Ser Asp Val Asp Asp Glu Arg Phe755
760 765Lys Leu Ser Leu Pro Lys Gln Phe770
775
69 1068 DNA Agrobacterium tumefaciens C58
69 atgttcacaa cgtccgccta
tgcctgcgat gacggctctt cgccgatgaa gctcgcgacc 60atcaggcgcc gcgatcccgg
tccgcgcgat gtcgaaatcg agatagaatt ctgtggcgtc 120tgccactcgg acatccatac
ggcccgcagc gaatggccgg gctccctcta cccttgcgtc 180cccggccacg aaatcgtcgg
ccgtgtcggt cgggtgggcg cgcaagtcac ccggttcaag 240acgggtgacc gcgtcggtgt
cggctgtatc gtcgatagct gccgcgaatg cgcaagctgc 300gccgaagggc tggagcaata
ttgcgaaaac ggcatgaccg gcacctataa ctcccctgac 360aaggcgatgg gcggcggcgc
gcatacgctt ggcggctatt ccgcccatgt ggtggtggat 420gaccgctatg tgctcaatat
tcccgaaggg ctcgatccgg cggcagcagc accgctactc 480tgcgctggta tcaccaccta
ctcgccgctg cgccactgga atgccggccc cggcaaacgc 540gtcggcgtcg tcggtctggg
cggcctcggc catatggccg tcaagctcgc caatgccatg 600ggtgcgactg tcgtgatgat
caccacctcg cccggcaagg cggaggatgc caaaaaactc 660ggcgcacacg aggtgatcat
ctcccgcgat gcggagcaga tgaagaaggc tacctcgagc 720ctcgatctca tcatcgatgc
tgtcgccgcc gaccacgaca tcgacgccta tctggcgctg 780ctgaaacgcg atggcgcgct
ggtgcaggtg ggcgcgccgg aaaagccact ttcggtgatg 840gccttcagcc tcatccccgg
ccgcaagacc tttgccggct cgatgatcgg cggtattccc 900gagactcagg aaatgctgga
tttctgcgcc gaaaaaggca tcgccggcga aatcgagatg 960atcgatatcg atcagatcaa
tgacgcttat gaacgcatga taaaaagcga tgtgcgttat 1020cgtttcgtca ttgatatgaa
gagcctgccg cgccagaagg ccgcctga 1068
70 355 PRT Agrobacterium tumefaciens C58
70 Met Phe Thr Thr Ser Ala Tyr Ala Cys Asp Asp Gly Ser Ser
Pro Met1 5 10 15Lys Leu
Ala Thr Ile Arg Arg Arg Asp Pro Gly Pro Arg Asp Val Glu20
25 30Ile Glu Ile Glu Phe Cys Gly Val Cys His Ser Asp
Ile His Thr Ala35 40 45Arg Ser Glu Trp
Pro Gly Ser Leu Tyr Pro Cys Val Pro Gly His Glu50 55
60Ile Val Gly Arg Val Gly Arg Val Gly Ala Gln Val Thr Arg
Phe Lys65 70 75 80Thr
Gly Asp Arg Val Gly Val Gly Cys Ile Val Asp Ser Cys Arg Glu85
90 95Cys Ala Ser Cys Ala Glu Gly Leu Glu Gln Tyr
Cys Glu Asn Gly Met100 105 110Thr Gly Thr
Tyr Asn Ser Pro Asp Lys Ala Met Gly Gly Gly Ala His115
120 125Thr Leu Gly Gly Tyr Ser Ala His Val Val Val Asp
Asp Arg Tyr Val130 135 140Leu Asn Ile Pro
Glu Gly Leu Asp Pro Ala Ala Ala Ala Pro Leu Leu145 150
155 160Cys Ala Gly Ile Thr Thr Tyr Ser Pro
Leu Arg His Trp Asn Ala Gly165 170 175Pro
Gly Lys Arg Val Gly Val Val Gly Leu Gly Gly Leu Gly His Met180
185 190Ala Val Lys Leu Ala Asn Ala Met Gly Ala Thr
Val Val Met Ile Thr195 200 205Thr Ser Pro
Gly Lys Ala Glu Asp Ala Lys Lys Leu Gly Ala His Glu210
215 220Val Ile Ile Ser Arg Asp Ala Glu Gln Met Lys Lys
Ala Thr Ser Ser225 230 235
240Leu Asp Leu Ile Ile Asp Ala Val Ala Ala Asp His Asp Ile Asp Ala245
250 255Tyr Leu Ala Leu Leu Lys Arg Asp Gly
Ala Leu Val Gln Val Gly Ala260 265 270Pro
Glu Lys Pro Leu Ser Val Met Ala Phe Ser Leu Ile Pro Gly Arg275
280 285Lys Thr Phe Ala Gly Ser Met Ile Gly Gly Ile
Pro Glu Thr Gln Glu290 295 300Met Leu Asp
Phe Cys Ala Glu Lys Gly Ile Ala Gly Glu Ile Glu Met305
310 315 320Ile Asp Ile Asp Gln Ile Asn
Asp Ala Tyr Glu Arg Met Ile Lys Ser325 330
335Asp Val Arg Tyr Arg Phe Val Ile Asp Met Lys Ser Leu Pro Arg Gln340
345 350Lys Ala Ala355
71 1047 DNA Agrobacterium tumefaciens C58
71 atggctattg caagaggtta tgctgcgacc gacgcgtcga agccgcttac
cccgttcacc 60ttcgaacgcc gcgagccgaa tgatgacgac gtcgtcatcg atatcaaata
tgccggcatc 120tgccactcgg acatccacac cgtccgcaac gaatggcaca atgccgttta
cccgatcgtt 180ccgggccacg aaatcgccgg tgtcgtgcgg gccgttggtt ccaaggtcac
gcggttcaag 240gtcggcgacc atgtcggcgt cggctgcttt gtcgattcct gcgttggctg
cgccacccgc 300gatgtcgaca atgagcagta tatgccgggt ctcgtgcaga cctacaattc
cgttgaacgg 360gacggcaaga gcgcgaccca gggcggttat tccgaccata tcgtggtcag
ggaagactac 420gtcctgtcca tcccggacaa cctgccgctc gatgcctccg cgccgcttct
ctgcgccggc 480atcacgctct attcgccgct gcagcactgg aatgcaggcc ccggcaagaa
agtggctatc 540gtcggcatgg gtggccttgg ccacatgggc gtgaagatcg gctcggccat
gggcgctgat 600atcaccgttc tctcgcagac gctgtcgaag aaggaagacg gcctcaagct
cggcgcgaag 660gaatattacg ccaccagcga cgcctcgacc tttgagaaac tcgccggcac
cttcgacctg 720atcctgtgca cagtctcggc cgaaatcgac tggaacgcct acctcaacct
gctcaaggtc 780aacggcacga tggttctgct cggcgtgccg gaacatgcga tcccggtgca
cgcattctcg 840gtcattcccg cccgccgttc gctcgccggt tcgatgatcg gctcgatcaa
ggaaacccag 900gaaatgctgg atttctgcgg caagcacgac atcgtttcgg aaatcgaaac
gatcggcatc 960aaggacgtca acgaagccta tgagcgcgtg ctgaagagcg acgtgcgtta
ccgcttcgtc 1020atcgacatgg cctcgctcga cgcttga
1047
72 348 PRT Agrobacterium tumefaciens C58
72 Met Ala Ile Ala
Arg Gly Tyr Ala Ala Thr Asp Ala Ser Lys Pro Leu1 5
10 15Thr Pro Phe Thr Phe Glu Arg Arg Glu Pro Asn
Asp Asp Asp Val Val20 25 30Ile Asp Ile
Lys Tyr Ala Gly Ile Cys His Ser Asp Ile His Thr Val35 40
45Arg Asn Glu Trp His Asn Ala Val Tyr Pro Ile Val Pro
Gly His Glu50 55 60Ile Ala Gly Val Val
Arg Ala Val Gly Ser Lys Val Thr Arg Phe Lys65 70
75 80Val Gly Asp His Val Gly Val Gly Cys Phe
Val Asp Ser Cys Val Gly85 90 95Cys Ala
Thr Arg Asp Val Asp Asn Glu Gln Tyr Met Pro Gly Leu Val100
105 110Gln Thr Tyr Asn Ser Val Glu Arg Asp Gly Lys Ser
Ala Thr Gln Gly115 120 125Gly Tyr Ser Asp
His Ile Val Val Arg Glu Asp Tyr Val Leu Ser Ile130 135
140Pro Asp Asn Leu Pro Leu Asp Ala Ser Ala Pro Leu Leu Cys
Ala Gly145 150 155 160Ile
Thr Leu Tyr Ser Pro Leu Gln His Trp Asn Ala Gly Pro Gly Lys165
170 175Lys Val Ala Ile Val Gly Met Gly Gly Leu Gly
His Met Gly Val Lys180 185 190Ile Gly Ser
Ala Met Gly Ala Asp Ile Thr Val Leu Ser Gln Thr Leu195
200 205Ser Lys Lys Glu Asp Gly Leu Lys Leu Gly Ala Lys
Glu Tyr Tyr Ala210 215 220Thr Ser Asp Ala
Ser Thr Phe Glu Lys Leu Ala Gly Thr Phe Asp Leu225 230
235 240Ile Leu Cys Thr Val Ser Ala Glu Ile
Asp Trp Asn Ala Tyr Leu Asn245 250 255Leu
Leu Lys Val Asn Gly Thr Met Val Leu Leu Gly Val Pro Glu His260
265 270Ala Ile Pro Val His Ala Phe Ser Val Ile Pro
Ala Arg Arg Ser Leu275 280 285Ala Gly Ser
Met Ile Gly Ser Ile Lys Glu Thr Gln Glu Met Leu Asp290
295 300Phe Cys Gly Lys His Asp Ile Val Ser Glu Ile Glu
Thr Ile Gly Ile305 310 315
320Lys Asp Val Asn Glu Ala Tyr Glu Arg Val Leu Lys Ser Asp Val Arg325
330 335Tyr Arg Phe Val Ile Asp Met Ala Ser
Leu Asp Ala340 345
73 1029 DNA Agrobacterium tumefaciens C58
73 atgactaaaa caatgaaggc ggcggttgtc cgcgcatttg gaaaaccgct gaccatcgag
60gaagtggcaa taccggatcc cggccccggt gaaattctca tcaactacaa ggcgacgggc
120gtttgccaca ccgacctgca cgccgcaacg ggggattggc cggtcaagcc caacccgccc
180ttcattcccg gacatgaagg tgcaggttac gtcgccaaga tcggcgctgg cgtcaccggc
240atcaaggagg gcgaccgcgc cggcacgccc tggctctaca ccgcctgcgg atgctgcatt
300ccctgccgta ccggctggga aaccctgtgc ccgagccaga agaactcagg ttattccgtc
360aacggcagct ttgccgaata tggccttgcc gatccgaaat tcgtcggccg cctgcctgac
420aatctcgatt tcggcccagc cgcacccgtg ctctgcgccg gcgttacagt ctataagggc
480ctgaaggaaa ccgaagtcag gcccggtgaa tgggtggtca tttcaggcat tggcgggctt
540ggccacatgg ccgtgcaata tgcgaaagcc atgggcatgc atgtggttgc cgccgatatt
600ttcgacgaca agctggcgct tgccaaaaag ctcggagccg acgtcgtcgt caacggccgc
660gcgcctgacg cggtggagca agtgcaaaag gcaaccggcg gcgtccatgg cgcgctggtg
720acggcggttt caccgaaggc catggagcag gcttatggct tcctgcgctc caagggcacg
780atggcgcttg tcggtctgcc gccgggcttc atctccattc cggtgttcga cacggtgctg
840aagcgcatca cggtgcgtgg ctccatcgtc ggcacgcggc aggatctgga ggaggcgttg
900accttcgccg gtgaaggcaa ggtggccgcc cacttctcgt gggacaagct cgaaaacatc
960aatgatatct tccatcgcat ggaagagggc aagatcgacg gccgtatcgt cgtggatctc
1020gccgcctga
1029
74 342 PRT Agrobacterium tumefaciens C58
74 Met Thr Lys Thr Met Lys Ala
Ala Val Val Arg Ala Phe Gly Lys Pro1 5 10
15Leu Thr Ile Glu Glu Val Ala Ile Pro Asp Pro Gly Pro Gly
Glu Ile20 25 30Leu Ile Asn Tyr Lys Ala
Thr Gly Val Cys His Thr Asp Leu His Ala35 40
45Ala Thr Gly Asp Trp Pro Val Lys Pro Asn Pro Pro Phe Ile Pro Gly50
55 60His Glu Gly Ala Gly Tyr Val Ala Lys
Ile Gly Ala Gly Val Thr Gly65 70 75
80Ile Lys Glu Gly Asp Arg Ala Gly Thr Pro Trp Leu Tyr Thr
Ala Cys85 90 95Gly Cys Cys Ile Pro Cys
Arg Thr Gly Trp Glu Thr Leu Cys Pro Ser100 105
110Gln Lys Asn Ser Gly Tyr Ser Val Asn Gly Ser Phe Ala Glu Tyr
Gly115 120 125Leu Ala Asp Pro Lys Phe Val
Gly Arg Leu Pro Asp Asn Leu Asp Phe130 135
140Gly Pro Ala Ala Pro Val Leu Cys Ala Gly Val Thr Val Tyr Lys Gly145
150 155 160Leu Lys Glu Thr
Glu Val Arg Pro Gly Glu Trp Val Val Ile Ser Gly165 170
175Ile Gly Gly Leu Gly His Met Ala Val Gln Tyr Ala Lys Ala
Met Gly180 185 190Met His Val Val Ala Ala
Asp Ile Phe Asp Asp Lys Leu Ala Leu Ala195 200
205Lys Lys Leu Gly Ala Asp Val Val Val Asn Gly Arg Ala Pro Asp
Ala210 215 220Val Glu Gln Val Gln Lys Ala
Thr Gly Gly Val His Gly Ala Leu Val225 230
235 240Thr Ala Val Ser Pro Lys Ala Met Glu Gln Ala Tyr
Gly Phe Leu Arg245 250 255Ser Lys Gly Thr
Met Ala Leu Val Gly Leu Pro Pro Gly Phe Ile Ser260 265
270Ile Pro Val Phe Asp Thr Val Leu Lys Arg Ile Thr Val Arg
Gly Ser275 280 285Ile Val Gly Thr Arg Gln
Asp Leu Glu Glu Ala Leu Thr Phe Ala Gly290 295
300Glu Gly Lys Val Ala Ala His Phe Ser Trp Asp Lys Leu Glu Asn
Ile305 310 315 320Asn Asp
Ile Phe His Arg Met Glu Glu Gly Lys Ile Asp Gly Arg Ile325
330 335Val Val Asp Leu Ala Ala340
75 1008 DNA Agrobacterium tumefaciens C58
75 atgaccgggg cgaaccagcc ttgggaggtt caagaggttc ccgttccgaa
ggcagagcca 60ggacttgtcc ttgttaaaat ccacgcctcc ggcatgtgct acacggacgt
gtgggcgacg 120cagggtgccg gtggcgacat ctatccgcag acccccggcc atgaggttgt
cggcgagatc 180atcgaggtcg gcgcgggcgt tcatacgcgc aaggtgggag accgggtcgg
caccacctgg 240gtgcagtcct cttgtggacg atgctcctac tgccgccaga accgtccgtt
gaccggccag 300acagccatga actgcgattc acccaggaca acggggttcg cgacgcaagg
cgggcacgca 360gagtacatcg cgatctctgc tgaaggcaca gtgttattac ccgacgggct
cgactacacg 420gatgccgcac ccatgatgtg cgcaggctac acgacctgga gcggcttgcg
cgacgccgag 480cccaaacctg gtgacagaat tgcggtactt ggcatcggcg ggctggggca
cgtcgccgtg 540cagttctcca aagccttggg gtttgagacc atcgcgatca cgcattcacc
cgacaagcac 600aagttggcca ccgatcttgg tgcagacatc gtcgtcgccg atggcaaaga
gttattggag 660gccggcggtg cggacgttct tctggttacg accaacgact tcgacaccgc
cgaaaaagcg 720atggcgggcg taaggcctga cgggcgcatc gttctttgcg cgctcgactt
cagcaagccg 780ttctcgatcc cgtccgacgg caagccgttc cacatgatgc gccaacgcgt
ggttgggtcc 840acgcatggcg gacagcacta tctcgccgaa atcctcgatc tcgccgccaa
gggcaaggtc 900aagccgattg tcgagacctt cgccctcgag caggcaaccg aggcatatga
gcggctatcc 960accgggaaga tgcgcttccg gggcgtgttc cttccgcacg gcgcttga
1008
76 335 PRT Agrobacterium tumefaciens C58
76 Met Thr Gly Ala
Asn Gln Pro Trp Glu Val Gln Glu Val Pro Val Pro1 5
10 15Lys Ala Glu Pro Gly Leu Val Leu Val Lys Ile
His Ala Ser Gly Met20 25 30Cys Tyr Thr
Asp Val Trp Ala Thr Gln Gly Ala Gly Gly Asp Ile Tyr35 40
45Pro Gln Thr Pro Gly His Glu Val Val Gly Glu Ile Ile
Glu Val Gly50 55 60Ala Gly Val His Thr
Arg Lys Val Gly Asp Arg Val Gly Thr Thr Trp65 70
75 80Val Gln Ser Ser Cys Gly Arg Cys Ser Tyr
Cys Arg Gln Asn Arg Pro85 90 95Leu Thr
Gly Gln Thr Ala Met Asn Cys Asp Ser Pro Arg Thr Thr Gly100
105 110Phe Ala Thr Gln Gly Gly His Ala Glu Tyr Ile Ala
Ile Ser Ala Glu115 120 125Gly Thr Val Leu
Leu Pro Asp Gly Leu Asp Tyr Thr Asp Ala Ala Pro130 135
140Met Met Cys Ala Gly Tyr Thr Thr Trp Ser Gly Leu Arg Asp
Ala Glu145 150 155 160Pro
Lys Pro Gly Asp Arg Ile Ala Val Leu Gly Ile Gly Gly Leu Gly165
170 175His Val Ala Val Gln Phe Ser Lys Ala Leu Gly
Phe Glu Thr Ile Ala180 185 190Ile Thr His
Ser Pro Asp Lys His Lys Leu Ala Thr Asp Leu Gly Ala195
200 205Asp Ile Val Val Ala Asp Gly Lys Glu Leu Leu Glu
Ala Gly Gly Ala210 215 220Asp Val Leu Leu
Val Thr Thr Asn Asp Phe Asp Thr Ala Glu Lys Ala225 230
235 240Met Ala Gly Val Arg Pro Asp Gly Arg
Ile Val Leu Cys Ala Leu Asp245 250 255Phe
Ser Lys Pro Phe Ser Ile Pro Ser Asp Gly Lys Pro Phe His Met260
265 270Met Arg Gln Arg Val Val Gly Ser Thr His Gly
Gly Gln His Tyr Leu275 280 285Ala Glu Ile
Leu Asp Leu Ala Ala Lys Gly Lys Val Lys Pro Ile Val290
295 300Glu Thr Phe Ala Leu Glu Gln Ala Thr Glu Ala Tyr
Glu Arg Leu Ser305 310 315
320Thr Gly Lys Met Arg Phe Arg Gly Val Phe Leu Pro His Gly Ala325
330 335
77 1017 DNA Agrobacterium tumefaciens C58
77 atgaccatgc atgccattca attcgtcgag aagggacgcg ccgtgctggc ggaactcccc
60gtcgccgatc tgccgccggg ccatgcgctc gtgcgggtca aggcttcggg gctttgccat
120accgatatcg acgtgctgca tgcgcgttat ggcgacggtg cgttccccgt cattccgggg
180catgaatatg ctggcgaagt cgcagccgtg gcttccgatg tgacagtctt caaggctggc
240gaccgggttg tcgtcgatcc caatctgccc tgtggcacct gcgccagctg caggaaaggg
300ctgaccaacc tttgcagcac attgaaagct tacggcgttt cccacaatgg cggctttgcg
360gagttcagtg tggtgcgtgc cgatcacctg cacggtatcg gttcgatgcc ctatcacgtc
420gcggcgctgg ctgagccgct tgcctgtgtt gtcaatggca tgcagagtgc gggtattggc
480gagagtggcg tggtgccgga gaatgcgctt gttttcggtg ctgggcccat cggcctgctg
540cttgccctgt cgctgaaatc acgcggcatt gcgacggtga cgatggccga tatcaatgaa
600agcaggctgg cctttgccca ggacctcggg cttcagacgg cggtatccgg ctcggaagcg
660ctctcgcggc agcggaagga gttcgatttc gtggccgatg cgacgggtat tgccccggtc
720gccgaggcga tgatcccgct ggttgcggat ggcggcacgg cgctattctt cggcgtctgc
780gcgccggatg cccgtatttc ggtggcaccg tttgaaatct tccggcgcca gctgaaactt
840gtcggctcgc attcgctgaa ccgcaacata ccgcaggcgc ttgccattct ggagacggat
900ggcgaggtca tggcgcggct cgtttcgcac cgcttgccgc tttcggagat gctgccgttc
960tttacgaaaa aaccgtctga tccggcgacg atgaaagtgc aatttgcagc cgaatga
1017
78 338 PRT Agrobacterium tumefaciens C58
78 Met Thr Met His Ala Ile Gln
Phe Val Glu Lys Gly Arg Ala Val Leu1 5 10
15Ala Glu Leu Pro Val Ala Asp Leu Pro Pro Gly His Ala Leu
Val Arg20 25 30Val Lys Ala Ser Gly Leu
Cys His Thr Asp Ile Asp Val Leu His Ala35 40
45Arg Tyr Gly Asp Gly Ala Phe Pro Val Ile Pro Gly His Glu Tyr Ala50
55 60Gly Glu Val Ala Ala Val Ala Ser Asp
Val Thr Val Phe Lys Ala Gly65 70 75
80Asp Arg Val Val Val Asp Pro Asn Leu Pro Cys Gly Thr Cys
Ala Ser85 90 95Cys Arg Lys Gly Leu Thr
Asn Leu Cys Ser Thr Leu Lys Ala Tyr Gly100 105
110Val Ser His Asn Gly Gly Phe Ala Glu Phe Ser Val Val Arg Ala
Asp115 120 125His Leu His Gly Ile Gly Ser
Met Pro Tyr His Val Ala Ala Leu Ala130 135
140Glu Pro Leu Ala Cys Val Val Asn Gly Met Gln Ser Ala Gly Ile Gly145
150 155 160Glu Ser Gly Val
Val Pro Glu Asn Ala Leu Val Phe Gly Ala Gly Pro165 170
175Ile Gly Leu Leu Leu Ala Leu Ser Leu Lys Ser Arg Gly Ile
Ala Thr180 185 190Val Thr Met Ala Asp Ile
Asn Glu Ser Arg Leu Ala Phe Ala Gln Asp195 200
205Leu Gly Leu Gln Thr Ala Val Ser Gly Ser Glu Ala Leu Ser Arg
Gln210 215 220Arg Lys Glu Phe Asp Phe Val
Ala Asp Ala Thr Gly Ile Ala Pro Val225 230
235 240Ala Glu Ala Met Ile Pro Leu Val Ala Asp Gly Gly
Thr Ala Leu Phe245 250 255Phe Gly Val Cys
Ala Pro Asp Ala Arg Ile Ser Val Ala Pro Phe Glu260 265
270Ile Phe Arg Arg Gln Leu Lys Leu Val Gly Ser His Ser Leu
Asn Arg275 280 285Asn Ile Pro Gln Ala Leu
Ala Ile Leu Glu Thr Asp Gly Glu Val Met290 295
300Ala Arg Leu Val Ser His Arg Leu Pro Leu Ser Glu Met Leu Pro
Phe305 310 315 320Phe Thr
Lys Lys Pro Ser Asp Pro Ala Thr Met Lys Val Gln Phe Ala325
330 335Ala Glu
79 1044 DNA Agrobacterium tumefaciens C58
79 atgcgcgcgc tttattacga acgattcggc gagacccctg tagtcgcgtc cctgcctgat
60ccggcaccga gcgatggcgg cgtggtgatt gcggtgaagg caaccggcct ctgccgcagc
120gactggcatg gctggatggg acatgacacg gatatccgtc tgccgcatgt gcccggccac
180gagttcgccg gcgtcatctc cgcagtcggc agaaacgtca cccgcttcaa gacgggtgat
240cgcgttaccg tgcctttcgt ctccggctgc ggccattgcc atgagtgccg ctccggcaat
300cagcaggtct gcgaaacgca gttccagccc ggcttcaccc attggggttc cttcgccgaa
360tatgtcgcca tcgactatgc cgatcagaac ctcgtgcacc tgccggaatc gatgagttac
420gccaccgccg ccggcctcgg ttgccgtttc gccacctcct tccgggcggt gacggatcag
480ggacgcctga agggcggcga atggctggct gtccatggct gcggcggtgt cggtctctcc
540gccatcatga tcggcgccgg cctcggcgca caggtcgtcg ccatcgatat tgccgaagac
600aagctcgaac tcgcccggca actgggtgca accgcaacca tcaacagccg ctccgttgcc
660gatgtcgccg aagcggtgcg cgacatcacc ggtggcggcg cgcatgtgtc ggtggatgcg
720cttggccatc cgcagacctg ctgcaattcc atcagcaacc tgcgccggcg cggacgccat
780gtgcaggtgg ggctgatgct ggcagaccat gccatgccgg ccattcccat ggcccgggtg
840atcgctcatg agctggagat ctatggcagc cacggcatgc aggcatggcg ttacgaggac
900atgctggcca tgatcgaaag cggcaggctt gcgccggaaa agctgattgg ccgccatatc
960tcgctgaccg aagcggccgt cgccctgccc ggaatggata ggttccagga gagcggcatc
1020agcatcatcg accggttcga atag
1044
80 357 PRT Agrobacterium tumefaciens C58
80 Met Asn Leu Arg Thr Asn Asp
Glu Ala Met Met Arg Ala Leu Tyr Tyr1 5 10
15Glu Arg Phe Gly Glu Thr Pro Val Val Ala Ser Leu Pro Asp
Pro Ala20 25 30Pro Ser Asp Gly Gly Val
Val Ile Ala Val Lys Ala Thr Gly Leu Cys35 40
45Arg Ser Asp Trp His Gly Trp Met Gly His Asp Thr Asp Ile Arg Leu50
55 60Pro His Val Pro Gly His Glu Phe Ala
Gly Val Ile Ser Ala Val Gly65 70 75
80Arg Asn Val Thr Arg Phe Lys Thr Gly Asp Arg Val Thr Val
Pro Phe85 90 95Val Ser Gly Cys Gly His
Cys His Glu Cys Arg Ser Gly Asn Gln Gln100 105
110Val Cys Glu Thr Gln Phe Gln Pro Gly Phe Thr His Trp Gly Ser
Phe115 120 125Ala Glu Tyr Val Ala Ile Asp
Tyr Ala Asp Gln Asn Leu Val His Leu130 135
140Pro Glu Ser Met Ser Tyr Ala Thr Ala Ala Gly Leu Gly Cys Arg Phe145
150 155 160Ala Thr Ser Phe
Arg Ala Val Thr Asp Gln Gly Arg Leu Lys Gly Gly165 170
175Glu Trp Leu Ala Val His Gly Cys Gly Gly Val Gly Leu Ser
Ala Ile180 185 190Met Ile Gly Ala Gly Leu
Gly Ala Gln Val Val Ala Ile Asp Ile Ala195 200
205Glu Asp Lys Leu Glu Leu Ala Arg Gln Leu Gly Ala Thr Ala Thr
Ile210 215 220Asn Ser Arg Ser Val Ala Asp
Val Ala Glu Ala Val Arg Asp Ile Thr225 230
235 240Gly Gly Gly Ala His Val Ser Val Asp Ala Leu Gly
His Pro Gln Thr245 250 255Cys Cys Asn Ser
Ile Ser Asn Leu Arg Arg Arg Gly Arg His Val Gln260 265
270Val Gly Leu Met Leu Ala Asp His Ala Met Pro Ala Ile Pro
Met Ala275 280 285Arg Val Ile Ala His Glu
Leu Glu Ile Tyr Gly Ser His Gly Met Gln290 295
300Ala Trp Arg Tyr Glu Asp Met Leu Ala Met Ile Glu Ser Gly Arg
Leu305 310 315 320Ala Pro
Glu Lys Leu Ile Gly Arg His Ile Ser Leu Thr Glu Ala Ala325
330 335Val Ala Leu Pro Gly Met Asp Arg Phe Gln Glu Ser
Gly Ile Ser Ile340 345 350Ile Asp Arg Phe
Glu355
81 1011 DNA Agrobacterium tumefaciens C58
81 atgctggcga ttttctgtga
cactcccggt caattaaccg ccaaggatct gccgaacccc 60gtgcgcggcg aaggtgaagt
cctggtacgt attcgccgga ttggcgtttg cggcacggat 120ctgcacatct ttaccggcaa
ccagccctat ctttcctatc cgcggatcat gggtcacgaa 180ctttccggca cggttgagga
ggcacccgct ggcagccacc tttccgctgg cgatgtggtg 240accataattc cctatatgtc
ctgcgggaaa tgcaatgcct gcctgaaggg taagagcaat 300tgctgccgca atatcggtgt
gcttggcgtt catcgcgatg gcggcatggt ggaatatctg 360agcgtgccgc agcaattcgt
gctgaaggcg gaggggctga gcctcgacca ggcagccatg 420acggaatttc tggcgatcgg
tgcccatgcg gtgcgtcgcg gtgccgtcga aaaagggcaa 480aaggtcctga tcgtcggtgc
cggcccgatc ggcatggcgg ttgctgtctt tgcggttctc 540gatggcacgg aagtgacgat
gatcgacggt cgcaccgacc ggctggattt ctgcaaggac 600cacctcggtg tcgctcatac
agtcgccctc ggcgacggtg acaaagatcg tctgtccgac 660attaccggtg gcaatttctt
cgatgcggtg tttgatgcga ccggcaatcc gaaagccatg 720gagcgcggtt tctccttcgt
cggtcacggc ggctcctatg ttctggtgtc catcgtcgcc 780agcgatatca gcttcaacga
cccggaattt cacaagcgtg agacgacgct gctcggcagc 840cgcaacgcga cggctgatga
tttcgagcgg gtgcttcgcg ccttgcgcga agggaaagtg 900ccggaggcac taatcaccca
tcgcatgaca cttgccgatg ttccctcgaa gttcgccggc 960ctgaccgatc cgaaagccgg
agtcatcaag ggcatggtgg aggtcgcatg a 1011
82 336 PRT Agrobacterium tumefaciens C58
82 Met Leu Ala Ile Phe Cys Asp Thr Pro Gly Gln Leu Thr Ala
Lys Asp1 5 10 15Leu Pro
Asn Pro Val Arg Gly Glu Gly Glu Val Leu Val Arg Ile Arg20
25 30Arg Ile Gly Val Cys Gly Thr Asp Leu His Ile Phe
Thr Gly Asn Gln35 40 45Pro Tyr Leu Ser
Tyr Pro Arg Ile Met Gly His Glu Leu Ser Gly Thr50 55
60Val Glu Glu Ala Pro Ala Gly Ser His Leu Ser Ala Gly Asp
Val Val65 70 75 80Thr
Ile Ile Pro Tyr Met Ser Cys Gly Lys Cys Asn Ala Cys Leu Lys85
90 95Gly Lys Ser Asn Cys Cys Arg Asn Ile Gly Val
Leu Gly Val His Arg100 105 110Asp Gly Gly
Met Val Glu Tyr Leu Ser Val Pro Gln Gln Phe Val Leu115
120 125Lys Ala Glu Gly Leu Ser Leu Asp Gln Ala Ala Met
Thr Glu Phe Leu130 135 140Ala Ile Gly Ala
His Ala Val Arg Arg Gly Ala Val Glu Lys Gly Gln145 150
155 160Lys Val Leu Ile Val Gly Ala Gly Pro
Ile Gly Met Ala Val Ala Val165 170 175Phe
Ala Val Leu Asp Gly Thr Glu Val Thr Met Ile Asp Gly Arg Thr180
185 190Asp Arg Leu Asp Phe Cys Lys Asp His Leu Gly
Val Ala His Thr Val195 200 205Ala Leu Gly
Asp Gly Asp Lys Asp Arg Leu Ser Asp Ile Thr Gly Gly210
215 220Asn Phe Phe Asp Ala Val Phe Asp Ala Thr Gly Asn
Pro Lys Ala Met225 230 235
240Glu Arg Gly Phe Ser Phe Val Gly His Gly Gly Ser Tyr Val Leu Val245
250 255Ser Ile Val Ala Ser Asp Ile Ser Phe
Asn Asp Pro Glu Phe His Lys260 265 270Arg
Glu Thr Thr Leu Leu Gly Ser Arg Asn Ala Thr Ala Asp Asp Phe275
280 285Glu Arg Val Leu Arg Ala Leu Arg Glu Gly Lys
Val Pro Glu Ala Leu290 295 300Ile Thr His
Arg Met Thr Leu Ala Asp Val Pro Ser Lys Phe Ala Gly305
310 315 320Leu Thr Asp Pro Lys Ala Gly
Val Ile Lys Gly Met Val Glu Val Ala325 330
335
83 1005 DNA Agrobacterium tumefaciens C58
83 gtgaaagcct tcgtcgtcga
caagtacaag aagaagggcc cgctgcgtct ggccgacatg 60cccaatccgg tcatcggcgc
caatgatgtg ctggttcgca tccatgccac tgccatcaat 120cttctcgact ccaaggtgcg
cgacggggaa ttcaagctgt tcctgcccta tcgtcctccc 180ttcattctcg gtcatgatct
ggccggaacg gtcatccgcg tcggcgcgaa tgtacggcag 240ttcaagacag gcgacgaggt
tttcgctcgc ccgcgtgatc accgggtcgg aaccttcgca 300gaaatgattg cggtcgatgc
cgcagacctt gcgctgaagc caacgagcct gtccatggag 360caggcagcgt cgatcccgct
cgtcggactg actgcctggc aggcgcttat cgaggttggc 420aaggtcaagt ccggccagaa
ggttttcatc caggccggtt ccggcggtgt cggcaccttc 480gccatccagc ttgccaagca
tctcggcgct accgtggcca cgaccaccag cgccgcgaat 540gccgaactgg tcaaaagcct
cggcgcagat gtggtgatcg actacaagac gcaggacttc 600gaacaggtgc tgtccggcta
cgatctcgtc ctgaacagcc aggatgccaa gacgctggaa 660aagtcgttga acgtgctgag
accgggcgga aagctcattt cgatctccgg tccgccggat 720gttgcctttg ccagatcgtt
gaaactgaat ccgctcctgc gttttgtcgt cagaatgctg 780agccgtggtg tcctgaaaaa
ggcaagcaga cgcggtgtcg attactcttt cctgttcatg 840cgcgccgaag gtcagcaatt
gcatgagatc gccgaactga tcgatgccgg caccatccgt 900ccggtcgtcg acaaggtgtt
tcaatttgcg cagacgcccg acgccctggc ctatgtcgag 960accggacggg caaggggcaa
ggttgtggtt acatacgcat cctag 1005
84 359 PRT Agrobacterium tumefaciens C58
84 Met Pro Ser Leu Cys Arg Lys Pro Trp Leu Ser Ser Leu Pro
Asp Leu1 5 10 15Ile Asn
Val Ser His Trp Arg Lys Pro Val Lys Ala Phe Val Val Asp20
25 30Lys Tyr Lys Lys Lys Gly Pro Leu Arg Leu Ala Asp
Met Pro Asn Pro35 40 45Val Ile Gly Ala
Asn Asp Val Leu Val Arg Ile His Ala Thr Ala Ile50 55
60Asn Leu Leu Asp Ser Lys Val Arg Asp Gly Glu Phe Lys Leu
Phe Leu65 70 75 80Pro
Tyr Arg Pro Pro Phe Ile Leu Gly His Asp Leu Ala Gly Thr Val85
90 95Ile Arg Val Gly Ala Asn Val Arg Gln Phe Lys
Thr Gly Asp Glu Val100 105 110Phe Ala Arg
Pro Arg Asp His Arg Val Gly Thr Phe Ala Glu Met Ile115
120 125Ala Val Asp Ala Ala Asp Leu Ala Leu Lys Pro Thr
Ser Leu Ser Met130 135 140Glu Gln Ala Ala
Ser Ile Pro Leu Val Gly Leu Thr Ala Trp Gln Ala145 150
155 160Leu Ile Glu Val Gly Lys Val Lys Ser
Gly Gln Lys Val Phe Ile Gln165 170 175Ala
Gly Ser Gly Gly Val Gly Thr Phe Ala Ile Gln Leu Ala Lys His180
185 190Leu Gly Ala Thr Val Ala Thr Thr Thr Ser Ala
Ala Asn Ala Glu Leu195 200 205Val Lys Ser
Leu Gly Ala Asp Val Val Ile Asp Tyr Lys Thr Gln Asp210
215 220Phe Glu Gln Val Leu Ser Gly Tyr Asp Leu Val Leu
Asn Ser Gln Asp225 230 235
240Ala Lys Thr Leu Glu Lys Ser Leu Asn Val Leu Arg Pro Gly Gly Lys245
250 255Leu Ile Ser Ile Ser Gly Pro Pro Asp
Val Ala Phe Ala Arg Ser Leu260 265 270Lys
Leu Asn Pro Leu Leu Arg Phe Val Val Arg Met Leu Ser Arg Gly275
280 285Val Leu Lys Lys Ala Ser Arg Arg Gly Val Asp
Tyr Ser Phe Leu Phe290 295 300Met Arg Ala
Glu Gly Gln Gln Leu His Glu Ile Ala Glu Leu Ile Asp305
310 315 320Ala Gly Thr Ile Arg Pro Val
Val Asp Lys Val Phe Gln Phe Ala Gln325 330
335Thr Pro Asp Ala Leu Ala Tyr Val Glu Thr Gly Arg Ala Arg Gly Lys340
345 350Val Val Val Thr Tyr Ala
Ser355
85 1032 DNA Agrobacterium tumefaciens C58
85 atgaaagcga ttgtcgccca
cggggcaaag gatgtgcgca tcgaagaccg gccggaggaa 60aagccgggtc cgggcgaggt
gcggctccgt ctggcgaggg gcgggatctg cggcagtgat 120ctgcattatt acaatcatgg
cggtttcggc gccgtgcggc ttcgtgaacc catggtgctg 180ggccatgagg tttccgccgt
catcgaggaa ctgggcgaag gcgttgaggg gctgaagatc 240ggcggtctgg tggcggtttc
gccgtcgcgc ccatgccgaa cctgccgctt ctgccaggag 300ggtctgcaca atcagtgcct
caacatgcgg ttttatggca gcgccatgcc tttcccgcat 360attcagggcg cgttccggga
aattctggtg gcggacgccc tgcaatgcgt gccggccgat 420ggtctcagcg ccggggaagc
cgccatggcg gaaccgctgg cggtgacgct gcatgccaca 480cgccgggccg gcgatttgct
gggaaaacgt gtgctcgtca cgggttgcgg ccccatcggc 540attctctcca ttctggctgc
gcgccgggcg ggtgctgctg aaatcgtcgc caccgacctt 600tccgatttca cgctcggcaa
ggcgcgtgaa gcgggggcgg accgtgtcat caacagcaag 660gatgagcccg atgcgctcgc
cgcttatggt gcaaacaagg gaaccttcga cattctctat 720gaatgctcgg gtgcggccgt
ggcgcttgcc ggcggcatta cggcactgcg gccgcgcggc 780atcatcgtcc agctcgggct
cggcggcgat atgagcctgc cgatgatggc gatcacagcc 840aaggaactcg acctgcgtgg
ttcctttcgc ttccacgagg aattcgccac cggcgtcgag 900ctgatgcgca agggcctgat
cgacgtcaaa cccttcatca cccagaccgt cgatcttgcc 960gacgccatct cggccttcga
attcgcctcg gatcgcagcc gcgccatgaa ggtgcagatc 1020gccttttcct aa
1032
86 343 PRT Agrobacterium tumefaciens C58
86 Met Lys Ala Ile Val Ala His Gly Ala Lys Asp Val Arg Ile
Glu Asp1 5 10 15Arg Pro
Glu Glu Lys Pro Gly Pro Gly Glu Val Arg Leu Arg Leu Ala20
25 30Arg Gly Gly Ile Cys Gly Ser Asp Leu His Tyr Tyr
Asn His Gly Gly35 40 45Phe Gly Ala Val
Arg Leu Arg Glu Pro Met Val Leu Gly His Glu Val50 55
60Ser Ala Val Ile Glu Glu Leu Gly Glu Gly Val Glu Gly Leu
Lys Ile65 70 75 80Gly
Gly Leu Val Ala Val Ser Pro Ser Arg Pro Cys Arg Thr Cys Arg85
90 95Phe Cys Gln Glu Gly Leu His Asn Gln Cys Leu
Asn Met Arg Phe Tyr100 105 110Gly Ser Ala
Met Pro Phe Pro His Ile Gln Gly Ala Phe Arg Glu Ile115
120 125Leu Val Ala Asp Ala Leu Gln Cys Val Pro Ala Asp
Gly Leu Ser Ala130 135 140Gly Glu Ala Ala
Met Ala Glu Pro Leu Ala Val Thr Leu His Ala Thr145 150
155 160Arg Arg Ala Gly Asp Leu Leu Gly Lys
Arg Val Leu Val Thr Gly Cys165 170 175Gly
Pro Ile Gly Ile Leu Ser Ile Leu Ala Ala Arg Arg Ala Gly Ala180
185 190Ala Glu Ile Val Ala Thr Asp Leu Ser Asp Phe
Thr Leu Gly Lys Ala195 200 205Arg Glu Ala
Gly Ala Asp Arg Val Ile Asn Ser Lys Asp Glu Pro Asp210
215 220Ala Leu Ala Ala Tyr Gly Ala Asn Lys Gly Thr Phe
Asp Ile Leu Tyr225 230 235
240Glu Cys Ser Gly Ala Ala Val Ala Leu Ala Gly Gly Ile Thr Ala Leu245
250 255Arg Pro Arg Gly Ile Ile Val Gln Leu
Gly Leu Gly Gly Asp Met Ser260 265 270Leu
Pro Met Met Ala Ile Thr Ala Lys Glu Leu Asp Leu Arg Gly Ser275
280 285Phe Arg Phe His Glu Glu Phe Ala Thr Gly Val
Glu Leu Met Arg Lys290 295 300Gly Leu Ile
Asp Val Lys Pro Phe Ile Thr Gln Thr Val Asp Leu Ala305
310 315 320Asp Ala Ile Ser Ala Phe Glu
Phe Ala Ser Asp Arg Ser Arg Ala Met325 330
335Lys Val Gln Ile Ala Phe Ser340
87 939 DNA Agrobacterium tumefaciens C58
87 atgccgatgg cgctcgggca cgaagcggcg ggcgtcgtcg aggcattggg cgaaggcgtg
60cgcgatcttg agcccggcga tcatgtggtc atggtcttca tgcccagttg cggacattgc
120ctgccctgtg cggaaggcag gcccgctctg tgcgagccgg gcgccgccgc caatgcagca
180ggcaggctgt tgggtggcgc cacccgcctg aactatcatg gcgaggtcgt ccatcatcac
240cttggtgtgt cggcctttgc cgaatatgcc gtggtgtcgc gcaattcgct ggtcaagatc
300gaccgcgatc ttccatttgt cgaggcggca ctcttcggct gcgcggttct caccggcgtc
360ggcgccgtcg tgaatacggc aagggtcagg accggctcga ctgcggtcgt catcggactt
420ggcggtgtgg gccttgccgc ggttctcgga gcccgggcgg ccggtgccag caagatcgtc
480gccgtcgacc tttcgcagga aaagcttgca ctcgccagcg aactgggcgc gaccgccatc
540gtgaacggac gcgatgagga tgccgtcgag caggtccgcg agctcacttc cggcggtgcc
600gattatgcct tcgagatggc agggtctatt cgcgccctcg aaaacgcctt caggatgacc
660aaacgtggcg gcaccaccgt taccgccggt ctgccaccgc cgggtgcggc cctgccgctc
720aacgtcgtgc agctcgtcgg cgaggagcgg acactcaagg gcagctatat cggcacctgt
780gtgcctctcc gggatattcc gcgcttcatc gccctttatc gcgacggccg gttgccggtg
840aaccgccttc tgagcggaag gctgaagcta gaagacatca atgaagggtt cgaccgcctg
900cacgacggaa gcgccgttcg gcaagtcatc gaattctga
939
88 312 PRT Agrobacterium tumefaciens C58
88 Met Pro Met Ala Leu Gly His
Glu Ala Ala Gly Val Val Glu Ala Leu1 5 10
15Gly Glu Gly Val Arg Asp Leu Glu Pro Gly Asp His Val Val
Met Val20 25 30Phe Met Pro Ser Cys Gly
His Cys Leu Pro Cys Ala Glu Gly Arg Pro35 40
45Ala Leu Cys Glu Pro Gly Ala Ala Ala Asn Ala Ala Gly Arg Leu Leu50
55 60Gly Gly Ala Thr Arg Leu Asn Tyr His
Gly Glu Val Val His His His65 70 75
80Leu Gly Val Ser Ala Phe Ala Glu Tyr Ala Val Val Ser Arg
Asn Ser85 90 95Leu Val Lys Ile Asp Arg
Asp Leu Pro Phe Val Glu Ala Ala Leu Phe100 105
110Gly Cys Ala Val Leu Thr Gly Val Gly Ala Val Val Asn Thr Ala
Arg115 120 125Val Arg Thr Gly Ser Thr Ala
Val Val Ile Gly Leu Gly Gly Val Gly130 135
140Leu Ala Ala Val Leu Gly Ala Arg Ala Ala Gly Ala Ser Lys Ile Val145
150 155 160Ala Val Asp Leu
Ser Gln Glu Lys Leu Ala Leu Ala Ser Glu Leu Gly165 170
175Ala Thr Ala Ile Val Asn Gly Arg Asp Glu Asp Ala Val Glu
Gln Val180 185 190Arg Glu Leu Thr Ser Gly
Gly Ala Asp Tyr Ala Phe Glu Met Ala Gly195 200
205Ser Ile Arg Ala Leu Glu Asn Ala Phe Arg Met Thr Lys Arg Gly
Gly210 215 220Thr Thr Val Thr Ala Gly Leu
Pro Pro Pro Gly Ala Ala Leu Pro Leu225 230
235 240Asn Val Val Gln Leu Val Gly Glu Glu Arg Thr Leu
Lys Gly Ser Tyr245 250 255Ile Gly Thr Cys
Val Pro Leu Arg Asp Ile Pro Arg Phe Ile Ala Leu260 265
270Tyr Arg Asp Gly Arg Leu Pro Val Asn Arg Leu Leu Ser Gly
Arg Leu275 280 285Lys Leu Glu Asp Ile Asn
Glu Gly Phe Asp Arg Leu His Asp Gly Ser290 295
300Ala Val Arg Gln Val Ile Glu Phe305
310891035DNAAgrobacterium tumefaciens C58 89atgaaacatt ctcaggacaa
accacgcctg ctgattgcga tgcgtagcga gcttccagaa 60ggcttcttcg gtccgcgcga
atgggcaagg ctgaatgccg tagcggacat tattccgggc 120tttccccata cggatttcga
cacggcgaac ggtgccgagg ctctcgccga agcggatatt 180ctgctcgctg cctggggtac
gccatccctg acacgcgaac gactttcacg cgcgccgcgg 240ctgaaaatgc tggcctatgc
ggcatcatcg gtgcggatgg ttgcgcccgc agaattctgg 300gagacgtcgg atattctggt
cacgacagca gcttccgcca tggccgtgcc ggttgccgaa 360ttcacctatg cggcaatcat
catgtgcggc aaggatgtgt ttcgattgcg ggatgaacat 420agaacagagc gcggcaccgg
cgtttttggc agcaggcgcg gcagaagcct gccctatctt 480ggcaatcatg cccgcaaggt
tggcattgtc ggcgcctcgc gcatcgggcg gctggtgatg 540gagatgctgg cgcgcggcac
attcgagatt gccgtttacg atccctttct gtcggcggaa 600gaggccgcat cccttggcgc
gaagaaagcc gaactggacg agcttctcgc atggtccgat 660gtggtctcgc tgcacgcgcc
gatcctgccg gaaacgcacc atatgatcgg cgcccgcgaa 720ctggcgctga tggcggacca
tgccatcttc atcaacacgg cgcggggctg gctggtcgac 780cacgatgcat tgctgactga
agcgatttcc ggacggctgc gcattctgat tgacacgccc 840gaacccgagc ccctgcccac
ggacagcccg ttttacgatc tgcccaatgt cgttctaacc 900ccccatatag ccggggcgct
gggcaatgaa ttgcgcgcac tttccgatct ggccattacc 960gaaattgaac gtttcgtggc
gggacttgcg cccctccacc cggtccacaa gcaggatatg 1020gaacgtatgg catga
103590331PRTAgrobacterium
tumefaciens C58 90Met Arg Ser Glu Leu Pro Glu Gly Phe Phe Gly Pro Arg Glu
Trp Ala1 5 10 15Arg Leu
Asn Ala Val Ala Asp Ile Ile Pro Gly Phe Pro His Thr Asp20
25 30Phe Asp Thr Ala Asn Gly Ala Glu Ala Leu Ala Glu
Ala Asp Ile Leu35 40 45Leu Ala Ala Trp
Gly Thr Pro Ser Leu Thr Arg Glu Arg Leu Ser Arg50 55
60Ala Pro Arg Leu Lys Met Leu Ala Tyr Ala Ala Ser Ser Val
Arg Met65 70 75 80Val
Ala Pro Ala Glu Phe Trp Glu Thr Ser Asp Ile Leu Val Thr Thr85
90 95Ala Ala Ser Ala Met Ala Val Pro Val Ala Glu
Phe Thr Tyr Ala Ala100 105 110Ile Ile Met
Cys Gly Lys Asp Val Phe Arg Leu Arg Asp Glu His Arg115
120 125Thr Glu Arg Gly Thr Gly Val Phe Gly Ser Arg Arg
Gly Arg Ser Leu130 135 140Pro Tyr Leu Gly
Asn His Ala Arg Lys Val Gly Ile Val Gly Ala Ser145 150
155 160Arg Ile Gly Arg Leu Val Met Glu Met
Leu Ala Arg Gly Thr Phe Glu165 170 175Ile
Ala Val Tyr Asp Pro Phe Leu Ser Ala Glu Glu Ala Ala Ser Leu180
185 190Gly Ala Lys Lys Ala Glu Leu Asp Glu Leu Leu
Ala Trp Ser Asp Val195 200 205Val Ser Leu
His Ala Pro Ile Leu Pro Glu Thr His His Met Ile Gly210
215 220Ala Arg Glu Leu Ala Leu Met Ala Asp His Ala Ile
Phe Ile Asn Thr225 230 235
240Ala Arg Gly Trp Leu Val Asp His Asp Ala Leu Leu Thr Glu Ala Ile245
250 255Ser Gly Arg Leu Arg Ile Leu Ile Asp
Thr Pro Glu Pro Glu Pro Leu260 265 270Pro
Thr Asp Ser Pro Phe Tyr Asp Leu Pro Asn Val Val Leu Thr Pro275
280 285His Ile Ala Gly Ala Leu Gly Asn Glu Leu Arg
Ala Leu Ser Asp Leu290 295 300Ala Ile Thr
Glu Ile Glu Arg Phe Val Ala Gly Leu Ala Pro Leu His305
310 315 320Pro Val His Lys Gln Asp Met
Glu Arg Met Ala325 33091750DNAAgrobacterium tumefaciens
C58 91atgcagcgtt ttaccaacag aaccatcgtt gtcgccgggg ccggccggga tatcggccgg
60gcatgcgcca tccgtttcgc acaggaaggc gccaatgtcg ttcttaccta taatggcgcg
120gcagagggcg cggccacagc cgttgccgaa atcgaaaagc ttggtcgttc ggctctggcg
180atcaaggcgg atctcacaaa cgccgccgaa gtcgaggctg ccatatctgc ggctgcggac
240aagtttgggg agatccacgg cctcgtccat gttgccggcg gcctgatcgc ccgcaagaca
300atcgcagaaa tggatgaagc cttctggcat caggtcctcg acgtcaatct gacatcgctg
360ttcctgacgg ccaagaccgc attgccgaag atggccaagg gcggcgcgat cgtcactttc
420tcgtcgcagg ccggccgtga tggcggcggc ccgggcgctc ttgcctatgc cacttccaag
480ggtgccgtga tgaccttcac ccgcggactt gccaaagaag tcggccccaa aatccgcgtc
540aacgccgttt gccccggtat gatctccacc accttccacg ataccttcac caagccggag
600gtgcgcgaac gggtggccgg cgcgacgtcg ctcaagcgcg aagggtcgag cgaagacgtc
660gccggtctgg tggccttcct cgcgtctgac gatgccgctt atgtcaccgg cgcctgctac
720gacatcaatg gcggcgtcct gttttcctga
75092249PRTAgrobacterium tumefaciens C58 92Met Gln Arg Phe Thr Asn Arg
Thr Ile Val Val Ala Gly Ala Gly Arg1 5 10
15Asp Ile Gly Arg Ala Cys Ala Ile Arg Phe Ala Gln Glu Gly
Ala Asn20 25 30Val Val Leu Thr Tyr Asn
Gly Ala Ala Glu Gly Ala Ala Thr Ala Val35 40
45Ala Glu Ile Glu Lys Leu Gly Arg Ser Ala Leu Ala Ile Lys Ala Asp50
55 60Leu Thr Asn Ala Ala Glu Val Glu Ala
Ala Ile Ser Ala Ala Ala Asp65 70 75
80Lys Phe Gly Glu Ile His Gly Leu Val His Val Ala Gly Gly
Leu Ile85 90 95Ala Arg Lys Thr Ile Ala
Glu Met Asp Glu Ala Phe Trp His Gln Val100 105
110Leu Asp Val Asn Leu Thr Ser Leu Phe Leu Thr Ala Lys Thr Ala
Leu115 120 125Pro Lys Met Ala Lys Gly Gly
Ala Ile Val Thr Phe Ser Ser Gln Ala130 135
140Gly Arg Asp Gly Gly Gly Pro Gly Ala Leu Ala Tyr Ala Thr Ser Lys145
150 155 160Gly Ala Val Met
Thr Phe Thr Arg Gly Leu Ala Lys Glu Val Gly Pro165 170
175Lys Ile Arg Val Asn Ala Val Cys Pro Gly Met Ile Ser Thr
Thr Phe180 185 190His Asp Thr Phe Thr Lys
Pro Glu Val Arg Glu Arg Val Ala Gly Ala195 200
205Thr Ser Leu Lys Arg Glu Gly Ser Ser Glu Asp Val Ala Gly Leu
Val210 215 220Ala Phe Leu Ala Ser Asp Asp
Ala Ala Tyr Val Thr Gly Ala Cys Tyr225 230
235 240Asp Ile Asn Gly Gly Val Leu Phe
Ser24593930DNAEscherichia coli DH10B 93atgtccaaaa agattgccgt gattggcgaa
tgcatgattg agctttccga gaaaggcgcg 60gacgttaagc gcggtttcgg cggcgatacc
ctgaacactt ccgtctatat cgcccgtcag 120gtcgatcctg cggcattaac cgttcattac
gtaacggcgc tgggaacgga cagttttagc 180cagcagatgc tggacgcctg gcacggcgag
aacgttgata cttccctgac ccaacggatg 240gaaaaccgtc tgccgggcct ttactacatt
gaaaccgaca gcaccggcga gcgtacgttc 300tactactggc ggaacgaagc cgccgccaaa
ttctggctgg agagtgagca gtctgcggcg 360atttgcgaag agctggcgaa tttcgattat
ctctacctga gcgggattag cctggcgatc 420ttaagcccga ccagccgcga aaagctgctt
tccctgctgc gcgaatgccg cgccaacggc 480ggaaaagtga ttttcgacaa taactatcgt
ccgcgcctgt gggccagcaa agaagagaca 540cagcaggtgt accaacaaat gctggaatgc
acggatatcg ccttcctgac gctggacgac 600gaagacgcgc tgtggggtca acagccggtg
gaagacgtca ttgcgcgcac ccataacgcg 660ggcgtgaaag aagtggtggt gaaacgcggg
gcggattctt gcctggtgtc cattgctggc 720gaagggttag tggatgttcc ggcggtgaaa
ctgccgaaag aaaaagtgat cgataccacc 780gcagctggcg actctttcag tgccggttat
ctggcggtac gtctgacagg cggcagcgcg 840gaagacgcgg cgaaacgtgg gcacctgacc
gcaagtaccg ttattcagta tcgcggcgcg 900attatcccgc gtgaggcgat gccagcgtaa
93094309PRTEscherichia coli DH10B 94Met
Ser Lys Lys Ile Ala Val Ile Gly Glu Cys Met Ile Glu Leu Ser1
5 10 15Glu Lys Gly Ala Asp Val Lys Arg
Gly Phe Gly Gly Asp Thr Leu Asn20 25
30Thr Ser Val Tyr Ile Ala Arg Gln Val Asp Pro Ala Ala Leu Thr Val35
40 45His Tyr Val Thr Ala Leu Gly Thr Asp Ser
Phe Ser Gln Gln Met Leu50 55 60Asp Ala
Trp His Gly Glu Asn Val Asp Thr Ser Leu Thr Gln Arg Met65
70 75 80Glu Asn Arg Leu Pro Gly Leu
Tyr Tyr Ile Glu Thr Asp Ser Thr Gly85 90
95Glu Arg Thr Phe Tyr Tyr Trp Arg Asn Glu Ala Ala Ala Lys Phe Trp100
105 110Leu Glu Ser Glu Gln Ser Ala Ala Ile
Cys Glu Glu Leu Ala Asn Phe115 120 125Asp
Tyr Leu Tyr Leu Ser Gly Ile Ser Leu Ala Ile Leu Ser Pro Thr130
135 140Ser Arg Glu Lys Leu Leu Ser Leu Leu Arg Glu
Cys Arg Ala Asn Gly145 150 155
160Gly Lys Val Ile Phe Asp Asn Asn Tyr Arg Pro Arg Leu Trp Ala
Ser165 170 175Lys Glu Glu Thr Gln Gln Val
Tyr Gln Gln Met Leu Glu Cys Thr Asp180 185
190Ile Ala Phe Leu Thr Leu Asp Asp Glu Asp Ala Leu Trp Gly Gln Gln195
200 205Pro Val Glu Asp Val Ile Ala Arg Thr
His Asn Ala Gly Val Lys Glu210 215 220Val
Val Val Lys Arg Gly Ala Asp Ser Cys Leu Val Ser Ile Ala Gly225
230 235 240Glu Gly Leu Val Asp Val
Pro Ala Val Lys Leu Pro Lys Glu Lys Val245 250
255Ile Asp Thr Thr Ala Ala Gly Asp Ser Phe Ser Ala Gly Tyr Leu
Ala260 265 270Val Arg Leu Thr Gly Gly Ser
Ala Glu Asp Ala Ala Lys Arg Gly His275 280
285Leu Thr Ala Ser Thr Val Ile Gln Tyr Arg Gly Ala Ile Ile Pro Arg290
295 300Glu Ala Met Pro
Ala30595642DNAEscherichia coli DH10B 95atgaaaaact ggaaaacaag tgcagaatca
atcctgacca ccggcccggt tgtaccggtt 60atcgtggtaa aaaaactgga acacgcggtg
ccgatggcaa aagcgttggt tgctggtggg 120gtgcgcgttc tggaagtgac tctgcgtacc
gagtgtgcag ttgacgctat ccgtgctatc 180gccaaagaag tgcctgaagc gattgtgggt
gccggtacgg tgctgaatcc acagcagctg 240gcagaagtca ctgaagcggg tgcacagttc
gcaattagcc cgggtctgac cgagccgctg 300ctgaaagctg ctaccgaagg gactattcct
ctgattccgg ggatcagcac tgtttccgaa 360ctgatgctgg gtatggacta cggtttgaaa
gagttcaaat tcttcccggc tgaagctaac 420ggcggcgtga aagccctgca ggcgatcgcg
ggtccgttct cccaggtccg tttctgcccg 480acgggtggta tttctccggc taactaccgt
gactacctgg cgctgaaaag cgtgctgtgc 540atcggtggtt cctggctggt tccggcagat
gcgctggaag cgggcgatta cgaccgcatt 600actaagctgg cgcgtgaagc tgtagaaggc
gctaagctgt aa 64296213PRTEscherichia coli DH10B
96Met Lys Asn Trp Lys Thr Ser Ala Glu Ser Ile Leu Thr Thr Gly Pro1
5 10 15Val Val Pro Val Ile Val
Val Lys Lys Leu Glu His Ala Val Pro Met20 25
30Ala Lys Ala Leu Val Ala Gly Gly Val Arg Val Leu Glu Val Thr Leu35
40 45Arg Thr Glu Cys Ala Val Asp Ala Ile
Arg Ala Ile Ala Lys Glu Val50 55 60Pro
Glu Ala Ile Val Gly Ala Gly Thr Val Leu Asn Pro Gln Gln Leu65
70 75 80Ala Glu Val Thr Glu Ala
Gly Ala Gln Phe Ala Ile Ser Pro Gly Leu85 90
95Thr Glu Pro Leu Leu Lys Ala Ala Thr Glu Gly Thr Ile Pro Leu Ile100
105 110Pro Gly Ile Ser Thr Val Ser Glu
Leu Met Leu Gly Met Asp Tyr Gly115 120
125Leu Lys Glu Phe Lys Phe Phe Pro Ala Glu Ala Asn Gly Gly Val Lys130
135 140Ala Leu Gln Ala Ile Ala Gly Pro Phe
Ser Gln Val Arg Phe Cys Pro145 150 155
160Thr Gly Gly Ile Ser Pro Ala Asn Tyr Arg Asp Tyr Leu Ala
Leu Lys165 170 175Ser Val Leu Cys Ile Gly
Gly Ser Trp Leu Val Pro Ala Asp Ala Leu180 185
190Glu Ala Gly Asp Tyr Asp Arg Ile Thr Lys Leu Ala Arg Glu Ala
Val195 200 205Glu Gly Ala Lys
Leu21097780DNALactobacillus brevis ATCC 367 97atggcatcaa atggaaaagt
agcaatggtt accggtggcg gacaaggaat tggtgaagcc 60atctcgaaac ggttagctaa
cgacggcttt gctgtggcaa ttgctgattt gaacttggac 120aatgccaaca aggtcgtttc
tgatattgaa gctgctggtg gcaaggccat tgcggtcaag 180accgatgtct ctgatcgtga
tagcgtgttt gctgcggtta atgaagcggc cgacaagctg 240ggcggctttg acgttatcgt
taataacgcc ggccttggcc caaccacgcc aattgacacc 300atcacccaag aacagtttga
tacggtttat cacgttaacg tgggtggggt tctttggggc 360attcaagcag cccatgcgaa
gttcaaggaa ttgggtcatg gtgggaagat catttccgcg 420acgtctcaag ccggggttgt
tggtaacccg aacttagctc tgtacagtgg aactaagttt 480gccattcgtg gtgtgaccca
agttgcggcg cgtgacttag ccgctgaagg tatcacggtc 540aatgcttatg cacccgggat
tgttaagaca ccaatgatgt ttgacatcgc tcacaaggtt 600ggtcaaaatg ctggtaaaga
cgacgaatgg gggatgcaaa ccttctcaaa ggacatcgct 660ttatgtcgat tgtcagaacc
agaagatgtg gctaacgggg tggctttctt agccggtccc 720gattctaact acattacggg
tcaaacactt gaagttgatg gtgggatgca gttccactaa 78098259PRTLactobacillus
brevis ATCC 367 98Met Ala Ser Asn Gly Lys Val Ala Met Val Thr Gly Gly Gly
Gln Gly1 5 10 15Ile Gly
Glu Ala Ile Ser Lys Arg Leu Ala Asn Asp Gly Phe Ala Val20
25 30Ala Ile Ala Asp Leu Asn Leu Asp Asn Ala Asn Lys
Val Val Ser Asp35 40 45Ile Glu Ala Ala
Gly Gly Lys Ala Ile Ala Val Lys Thr Asp Val Ser50 55
60Asp Arg Asp Ser Val Phe Ala Ala Val Asn Glu Ala Ala Asp
Lys Leu65 70 75 80Gly
Gly Phe Asp Val Ile Val Asn Asn Ala Gly Leu Gly Pro Thr Thr85
90 95Pro Ile Asp Thr Ile Thr Gln Glu Gln Phe Asp
Thr Val Tyr His Val100 105 110Asn Val Gly
Gly Val Leu Trp Gly Ile Gln Ala Ala His Ala Lys Phe115
120 125Lys Glu Leu Gly His Gly Gly Lys Ile Ile Ser Ala
Thr Ser Gln Ala130 135 140Gly Val Val Gly
Asn Pro Asn Leu Ala Leu Tyr Ser Gly Thr Lys Phe145 150
155 160Ala Ile Arg Gly Val Thr Gln Val Ala
Ala Arg Asp Leu Ala Ala Glu165 170 175Gly
Ile Thr Val Asn Ala Tyr Ala Pro Gly Ile Val Lys Thr Pro Met180
185 190Met Phe Asp Ile Ala His Lys Val Gly Gln Asn
Ala Gly Lys Asp Asp195 200 205Glu Trp Gly
Met Gln Thr Phe Ser Lys Asp Ile Ala Leu Cys Arg Leu210
215 220Ser Glu Pro Glu Asp Val Ala Asn Gly Val Ala Phe
Leu Ala Gly Pro225 230 235
240Asp Ser Asn Tyr Ile Thr Gly Gln Thr Leu Glu Val Asp Gly Gly Met245
250 255Gln Phe His991089DNAPseudomonas
putida KT2440 99atgaatgacc tgagccacac ccacatgcgc gcggccgtct ggcatggccg
ccacgatatt 60cgtgtcgaac aggtaccttt gccggccgac cctgcgccgg gctgggtgca
gatcaaggtg 120gactggtgcg gcatctgcgg ctccgacctg cacgaatatg ttgccggccc
ggtgttcatc 180ccggtagagg ccccgcaccc gctgaccggc attcagggcc agtgcatcct
cggccacgaa 240ttctgcggcc acatcgccaa gcttggcgaa ggcgtggaag gctatgccgt
aggcgacccg 300gtggcggcag acgcgtgcca gcattgtggt acctgctatt actgcaccca
tggcctgtac 360aacatctgcg aacgcctggc gttcaccggc ctgatgaaca acggtgcctt
cgccgagctg 420gtcaacgtgc ccgccaacct gctctaccgg ctgccgcagg gcttccctgc
cgaagccggg 480gcactgatcg agccgctggc ggtgggtatg cacgcggtga aaaaggccgg
cagcctgctt 540gggcaaaccg ttgtagtggt tggggccggc accatcggcc tgtgcaccat
catgtgcgcc 600aaggctgcag gtgcggcaca ggtcatcgcc cttgagatgt cctctgcgcg
caaagccaag 660gccaaggaag cgggcgccaa cgtggtgctg gaccccagcc agtgcgatgc
cctggcggaa 720atccgcgcac tgactgctgg gctgggcgcc gatgtgagtt ttgagtgcat
cggcaacaaa 780catacggcca agctggccat cgacaccatc cgcaaagcag gcaagtgcgt
gctggtgggt 840attttcgaag agcccagcga gttcaacttc ttcgagctgg tgtccaccga
gaagcaagtg 900ctgggggcgt tggcgtacaa cggcgagttt gctgacgtga ttgccttcat
tgctgatggt 960cggctggata ttcgcccgct ggtaaccggc cggatcggat tggagcagat
tgtcgagctg 1020ggcttcgagg aactggtgaa caacaaagag gagaacgtga agatcatcgt
ttcaccaggt 1080gtgcgctga
1089100362PRTPseudomonas putida KT2440 100Met Asn Asp Leu Ser
His Thr His Met Arg Ala Ala Val Trp His Gly1 5
10 15Arg His Asp Ile Arg Val Glu Gln Val Pro Leu Pro
Ala Asp Pro Ala20 25 30Pro Gly Trp Val
Gln Ile Lys Val Asp Trp Cys Gly Ile Cys Gly Ser35 40
45Asp Leu His Glu Tyr Val Ala Gly Pro Val Phe Ile Pro Val
Glu Ala50 55 60Pro His Pro Leu Thr Gly
Ile Gln Gly Gln Cys Ile Leu Gly His Glu65 70
75 80Phe Cys Gly His Ile Ala Lys Leu Gly Glu Gly
Val Glu Gly Tyr Ala85 90 95Val Gly Asp
Pro Val Ala Ala Asp Ala Cys Gln His Cys Gly Thr Cys100
105 110Tyr Tyr Cys Thr His Gly Leu Tyr Asn Ile Cys Glu
Arg Leu Ala Phe115 120 125Thr Gly Leu Met
Asn Asn Gly Ala Phe Ala Glu Leu Val Asn Val Pro130 135
140Ala Asn Leu Leu Tyr Arg Leu Pro Gln Gly Phe Pro Ala Glu
Ala Gly145 150 155 160Ala
Leu Ile Glu Pro Leu Ala Val Gly Met His Ala Val Lys Lys Ala165
170 175Gly Ser Leu Leu Gly Gln Thr Val Val Val Val
Gly Ala Gly Thr Ile180 185 190Gly Leu Cys
Thr Ile Met Cys Ala Lys Ala Ala Gly Ala Ala Gln Val195
200 205Ile Ala Leu Glu Met Ser Ser Ala Arg Lys Ala Lys
Ala Lys Glu Ala210 215 220Gly Ala Asn Val
Val Leu Asp Pro Ser Gln Cys Asp Ala Leu Ala Glu225 230
235 240Ile Arg Ala Leu Thr Ala Gly Leu Gly
Ala Asp Val Ser Phe Glu Cys245 250 255Ile
Gly Asn Lys His Thr Ala Lys Leu Ala Ile Asp Thr Ile Arg Lys260
265 270Ala Gly Lys Cys Val Leu Val Gly Ile Phe Glu
Glu Pro Ser Glu Phe275 280 285Asn Phe Phe
Glu Leu Val Ser Thr Glu Lys Gln Val Leu Gly Ala Leu290
295 300Ala Tyr Asn Gly Glu Phe Ala Asp Val Ile Ala Phe
Ile Ala Asp Gly305 310 315
320Arg Leu Asp Ile Arg Pro Leu Val Thr Gly Arg Ile Gly Leu Glu Gln325
330 335Ile Val Glu Leu Gly Phe Glu Glu Leu
Val Asn Asn Lys Glu Glu Asn340 345 350Val
Lys Ile Ile Val Ser Pro Gly Val Arg355
360101771DNAKlebsiella pneumoniae MGH78578 101atgaaaaaag tcgcacttgt
taccggcgcc ggccagggga ttggtaaagc tatcgccctt 60cgtctggtga aggatggatt
tgccgtggcc attgccgatt ataacgacgc caccgccaaa 120gcggtcgcct cggaaatcaa
ccaggccggc ggacacgccg tggcggtgaa agtggatgtc 180tccgaccgcg atcaggtatt
tgccgccgtt gaacaggcgc gcaaaacgct gggcggcttc 240gacgtcatcg tcaataacgc
cggtgtggca ccgtctacgc cgatcgagtc cattaccccg 300gagattgtcg acaaagtcta
caacatcaac gtcaaagggg tgatctgggg tattcaggcg 360gcggtcgagg cctttaagaa
agaggggcac ggcgggaaaa tcatcaacgc ctgttcccag 420gccggccacg tcggcaaccc
ggagctggcg gtgtatagct ccagtaaatt cgcggtacgc 480ggcttaaccc agaccgccgc
tcgcgacctc gcgccgctgg gcatcacggt caacggctac 540tgcccgggga ttgtcaaaac
gccaatgtgg gccgaaattg accgccaggt gtccgaagcc 600gccggtaaac cgctgggcta
cggtaccgcc gagttcgcca aacgcatcac tctcggtcgt 660ctgtccgagc cggaagatgt
cgccgcctgc gtctcctatc ttgccagccc ggattctgat 720tacatgaccg gtcagtcgtt
gctgatcgac ggcgggatgg tatttaacta a 771102256PRTKlebsiella
pneumoniae MGH78578 102Met Lys Lys Val Ala Leu Val Thr Gly Ala Gly Gln
Gly Ile Gly Lys1 5 10
15Ala Ile Ala Leu Arg Leu Val Lys Asp Gly Phe Ala Val Ala Ile Ala20
25 30Asp Tyr Asn Asp Ala Thr Ala Lys Ala Val
Ala Ser Glu Ile Asn Gln35 40 45Ala Gly
Gly His Ala Val Ala Val Lys Val Asp Val Ser Asp Arg Asp50
55 60Gln Val Phe Ala Ala Val Glu Gln Ala Arg Lys Thr
Leu Gly Gly Phe65 70 75
80Asp Val Ile Val Asn Asn Ala Gly Val Ala Pro Ser Thr Pro Ile Glu85
90 95Ser Ile Thr Pro Glu Ile Val Asp Lys Val
Tyr Asn Ile Asn Val Lys100 105 110Gly Val
Ile Trp Gly Ile Gln Ala Ala Val Glu Ala Phe Lys Lys Glu115
120 125Gly His Gly Gly Lys Ile Ile Asn Ala Cys Ser Gln
Ala Gly His Val130 135 140Gly Asn Pro Glu
Leu Ala Val Tyr Ser Ser Ser Lys Phe Ala Val Arg145 150
155 160Gly Leu Thr Gln Thr Ala Ala Arg Asp
Leu Ala Pro Leu Gly Ile Thr165 170 175Val
Asn Gly Tyr Cys Pro Gly Ile Val Lys Thr Pro Met Trp Ala Glu180
185 190Ile Asp Arg Gln Val Ser Glu Ala Ala Gly Lys
Pro Leu Gly Tyr Gly195 200 205Thr Ala Glu
Phe Ala Lys Arg Ile Thr Leu Gly Arg Leu Ser Glu Pro210
215 220Glu Asp Val Ala Ala Cys Val Ser Tyr Leu Ala Ser
Pro Asp Ser Asp225 230 235
240Tyr Met Thr Gly Gln Ser Leu Leu Ile Asp Gly Gly Met Val Phe Asn245
250 2551031665DNAKlebsiella pneumoniae
MGH78578 103atgagatcga aaagatttga agcactggcg aaacgccctg tgaatcagga
tggtttcgtt 60aaggagtgga ttgaagaggg ctttatcgcg atggaaagcc ctaacgatcc
caaaccttct 120atccgcatcg tcaacggcgc ggtgaccgaa ctcgacgata aaccggttga
gcagttcgac 180ctgattgacc actttatcgc gcgctacggc attaatctcg cccgggccga
agaagtgatg 240gccatggatt cggttaagct cgccaacatg ctctgcgacc cgaacgttaa
acgcagcgac 300atcgtgccgc tcactaccgc gatgaccccg gcgaaaatcg tggaagtggt
gtcgcatatg 360aacgtggtcg agatgatgat ggcgatgcaa aaaatgcgcg cccgccgcac
gccgtcccag 420caggcgcatg tcactaatat caaagataat ccggtacaga ttgccgccga
cgccgctgaa 480ggcgcatggc gcggctttga cgagcaggag accaccgtcg ccgtggcgcg
ctacgcgccg 540ttcaacgcca tcgccctgct ggtcggttca caggttggcc gccccggcgt
cctcacccag 600tgttcgctgg aagaagccac cgagctgaaa ctgggcatgc tgggccacac
ctgctatgcc 660gaaaccattt cggtatacgg tacggaaccg gtgtttaccg atggcgatga
caccccgtgg 720tcgaaaggct tcctcgcctc ctcctacgcc tcgcgcggcc tgaaaatgcg
ctttacctcc 780ggttccggct cggaggtgca gatgggctat gccgaaggca aatcgatgct
ttatctcgaa 840gcgcgctgca tctacatcac caaagccgcc ggggtgcaag gcctgcagaa
tggctccgtc 900agctgtatcg gcgtgccgtc cgccgtgccg tccgggatcc gcgccgtact
ggcggaaaac 960ctgatctgct cagcgctgga tctggagtgc gcctccagca acgatcaaac
ctttacccac 1020tcggatatgc ggcgtaccgc gcgtctgctg atgcagttcc tgccaggtac
cgactttatc 1080tcctccggtt actcggcggt gccgaactac gacaacatgt tcgccggttc
caacgaagat 1140gccgaagact tcgatgacta caacgtgatc cagcgcgacc tgaaggtcga
tggcggcctg 1200cggccggtgc gtgaagagga cgtgatcgcc attcgcaaca aagccgcccg
cgcgctgcag 1260gcggtatttg ccggcatggg tttgccgcct attacggatg aagaagtaga
agccgccacc 1320tacgcccacg gttcaaaaga tatgcctgag cgcaatatcg tcgaggacat
caagtttgct 1380caggagatca tcaacaagaa ccgcaacggc ctggaggtgg tgaaagccct
ggcgaaaggc 1440ggcttccccg atgtcgccca ggacatgctc aatattcaga aagccaagct
caccggcgac 1500tacctgcata cctccgccat cattgttggc gagggccagg tgctctcggc
cgtgaatgac 1560gtgaacgatt atgccggtcc ggcaacaggc taccgcctgc aaggcgagcg
ctgggaagag 1620attaaaaata tcccgggcgc gctcgatccc aatgaacttg gctaa
1665104554PRTKlebsiella pneumoniae MGH78578 104Met Arg Ser Lys
Arg Phe Glu Ala Leu Ala Lys Arg Pro Val Asn Gln1 5
10 15Asp Gly Phe Val Lys Glu Trp Ile Glu Glu Gly
Phe Ile Ala Met Glu20 25 30Ser Pro Asn
Asp Pro Lys Pro Ser Ile Arg Ile Val Asn Gly Ala Val35 40
45Thr Glu Leu Asp Asp Lys Pro Val Glu Gln Phe Asp Leu
Ile Asp His50 55 60Phe Ile Ala Arg Tyr
Gly Ile Asn Leu Ala Arg Ala Glu Glu Val Met65 70
75 80Ala Met Asp Ser Val Lys Leu Ala Asn Met
Leu Cys Asp Pro Asn Val85 90 95Lys Arg
Ser Asp Ile Val Pro Leu Thr Thr Ala Met Thr Pro Ala Lys100
105 110Ile Val Glu Val Val Ser His Met Asn Val Val Glu
Met Met Met Ala115 120 125Met Gln Lys Met
Arg Ala Arg Arg Thr Pro Ser Gln Gln Ala His Val130 135
140Thr Asn Ile Lys Asp Asn Pro Val Gln Ile Ala Ala Asp Ala
Ala Glu145 150 155 160Gly
Ala Trp Arg Gly Phe Asp Glu Gln Glu Thr Thr Val Ala Val Ala165
170 175Arg Tyr Ala Pro Phe Asn Ala Ile Ala Leu Leu
Val Gly Ser Gln Val180 185 190Gly Arg Pro
Gly Val Leu Thr Gln Cys Ser Leu Glu Glu Ala Thr Glu195
200 205Leu Lys Leu Gly Met Leu Gly His Thr Cys Tyr Ala
Glu Thr Ile Ser210 215 220Val Tyr Gly Thr
Glu Pro Val Phe Thr Asp Gly Asp Asp Thr Pro Trp225 230
235 240Ser Lys Gly Phe Leu Ala Ser Ser Tyr
Ala Ser Arg Gly Leu Lys Met245 250 255Arg
Phe Thr Ser Gly Ser Gly Ser Glu Val Gln Met Gly Tyr Ala Glu260
265 270Gly Lys Ser Met Leu Tyr Leu Glu Ala Arg Cys
Ile Tyr Ile Thr Lys275 280 285Ala Ala Gly
Val Gln Gly Leu Gln Asn Gly Ser Val Ser Cys Ile Gly290
295 300Val Pro Ser Ala Val Pro Ser Gly Ile Arg Ala Val
Leu Ala Glu Asn305 310 315
320Leu Ile Cys Ser Ala Leu Asp Leu Glu Cys Ala Ser Ser Asn Asp Gln325
330 335Thr Phe Thr His Ser Asp Met Arg Arg
Thr Ala Arg Leu Leu Met Gln340 345 350Phe
Leu Pro Gly Thr Asp Phe Ile Ser Ser Gly Tyr Ser Ala Val Pro355
360 365Asn Tyr Asp Asn Met Phe Ala Gly Ser Asn Glu
Asp Ala Glu Asp Phe370 375 380Asp Asp Tyr
Asn Val Ile Gln Arg Asp Leu Lys Val Asp Gly Gly Leu385
390 395 400Arg Pro Val Arg Glu Glu Asp
Val Ile Ala Ile Arg Asn Lys Ala Ala405 410
415Arg Ala Leu Gln Ala Val Phe Ala Gly Met Gly Leu Pro Pro Ile Thr420
425 430Asp Glu Glu Val Glu Ala Ala Thr Tyr
Ala His Gly Ser Lys Asp Met435 440 445Pro
Glu Arg Asn Ile Val Glu Asp Ile Lys Phe Ala Gln Glu Ile Ile450
455 460Asn Lys Asn Arg Asn Gly Leu Glu Val Val Lys
Ala Leu Ala Lys Gly465 470 475
480Gly Phe Pro Asp Val Ala Gln Asp Met Leu Asn Ile Gln Lys Ala
Lys485 490 495Leu Thr Gly Asp Tyr Leu His
Thr Ser Ala Ile Ile Val Gly Glu Gly500 505
510Gln Val Leu Ser Ala Val Asn Asp Val Asn Asp Tyr Ala Gly Pro Ala515
520 525Thr Gly Tyr Arg Leu Gln Gly Glu Arg
Trp Glu Glu Ile Lys Asn Ile530 535 540Pro
Gly Ala Leu Asp Pro Asn Glu Leu Gly545
550105690DNAKlebsiella pneumoniae MGH78578 105atggaaatta acgaaacgct
gctgcgccag attatcgaag aggtgctgtc ggagatgaaa 60tcaggcgcag ataagccggt
ctcctttagc gcgcctgcgg cttctgtcgc ctctgccgcg 120ccggtcgccg ttgcgcctgt
gtccggcgac agcttcctga cggaaatcgg cgaagccaaa 180cccggcacgc agcaggatga
agtcattatt gccgtcgggc cagcgtttgg tctggcgcaa 240accgccaata tcgtcggcat
tccgcataaa aatattctgc gcgaagtgat cgccggcatt 300gaggaagaag gcatcaaagc
ccgggtgatc cgctgcttta agtcttctga cgtcgccttc 360gtggcagtgg aaggcaaccg
cctgagcggc tccggcatct cgatcggtat tcagtcgaaa 420ggcaccaccg tcatccacca
gcgcggcctg ccgccgcttt ccaatctgga actcttcccg 480caggcgccgc tgctgacgct
ggaaacctac cgtcagattg gcaaaaacgc cgcgcgctac 540gccaaacgcg agtcgccgca
gccggtgccg acgcttaacg atcagatggc tcgtcccaaa 600taccaggcga agtcggccat
tttgcacatt aaagagacca aatacgtggt gacgggcaaa 660aacccgcagg aactgcgcgt
ggcgctttaa 690106229PRTKlebsiella
pneumoniae MGH78578 106Met Glu Ile Asn Glu Thr Leu Leu Arg Gln Ile Ile
Glu Glu Val Leu1 5 10
15Ser Glu Met Lys Ser Gly Ala Asp Lys Pro Val Ser Phe Ser Ala Pro20
25 30Ala Ala Ser Val Ala Ser Ala Ala Pro Val
Ala Val Ala Pro Val Ser35 40 45Gly Asp
Ser Phe Leu Thr Glu Ile Gly Glu Ala Lys Pro Gly Thr Gln50
55 60Gln Asp Glu Val Ile Ile Ala Val Gly Pro Ala Phe
Gly Leu Ala Gln65 70 75
80Thr Ala Asn Ile Val Gly Ile Pro His Lys Asn Ile Leu Arg Glu Val85
90 95Ile Ala Gly Ile Glu Glu Glu Gly Ile Lys
Ala Arg Val Ile Arg Cys100 105 110Phe Lys
Ser Ser Asp Val Ala Phe Val Ala Val Glu Gly Asn Arg Leu115
120 125Ser Gly Ser Gly Ile Ser Ile Gly Ile Gln Ser Lys
Gly Thr Thr Val130 135 140Ile His Gln Arg
Gly Leu Pro Pro Leu Ser Asn Leu Glu Leu Phe Pro145 150
155 160Gln Ala Pro Leu Leu Thr Leu Glu Thr
Tyr Arg Gln Ile Gly Lys Asn165 170 175Ala
Ala Arg Tyr Ala Lys Arg Glu Ser Pro Gln Pro Val Pro Thr Leu180
185 190Asn Asp Gln Met Ala Arg Pro Lys Tyr Gln Ala
Lys Ser Ala Ile Leu195 200 205His Ile Lys
Glu Thr Lys Tyr Val Val Thr Gly Lys Asn Pro Gln Glu210
215 220Leu Arg Val Ala Leu225107525DNAKlebsiella
pneumoniae MGH78578 107atgaataccg acgcaattga atccatggta cgcgacgtgc
tgagccggat gaacagccta 60caggacggga taacgcccgc gccagccgcg ccgacaaacg
acaccgttcg ccagccaaaa 120gttagcgact acccgttagc gacccgccat ccggagtggg
tcaaaaccgc taccaataaa 180acgctcgatg acctgacgct ggagaacgta ttaagcgatc
gcgttacggc gcaggacatg 240cgcatcactc cggaaacgct gcgtatgcag gcggcgatcg
cccaggatgc cggacgcgat 300cggctggcga tgaactttga gcgggccgca gagctcaccg
cggttcccga cgaccgaatc 360cttgagatct acaacgccct gcgcccatac cgttccaccc
aggcggagct actggcgatc 420gctgatgacc tcgagcatcg ctaccaggca cgactctgtg
ccgcctttgt tcgggaagcg 480gccgggctgt acatcgagcg taagaagctg aaaggcgacg
attaa 525108174PRTKlebsiella pneumoniae MGH78578
108Met Asn Thr Asp Ala Ile Glu Ser Met Val Arg Asp Val Leu Ser Arg1
5 10 15Met Asn Ser Leu Gln Asp
Gly Ile Thr Pro Ala Pro Ala Ala Pro Thr20 25
30Asn Asp Thr Val Arg Gln Pro Lys Val Ser Asp Tyr Pro Leu Ala Thr35
40 45Arg His Pro Glu Trp Val Lys Thr Ala
Thr Asn Lys Thr Leu Asp Asp50 55 60Leu
Thr Leu Glu Asn Val Leu Ser Asp Arg Val Thr Ala Gln Asp Met65
70 75 80Arg Ile Thr Pro Glu Thr
Leu Arg Met Gln Ala Ala Ile Ala Gln Asp85 90
95Ala Gly Arg Asp Arg Leu Ala Met Asn Phe Glu Arg Ala Ala Glu Leu100
105 110Thr Ala Val Pro Asp Asp Arg Ile
Leu Glu Ile Tyr Asn Ala Leu Arg115 120
125Pro Tyr Arg Ser Thr Gln Ala Glu Leu Leu Ala Ile Ala Asp Asp Leu130
135 140Glu His Arg Tyr Gln Ala Arg Leu Cys
Ala Ala Phe Val Arg Glu Ala145 150 155
160Ala Gly Leu Tyr Ile Glu Arg Lys Lys Leu Lys Gly Asp
Asp165 170109789DNAPseudomonas putida KT2440
109atgacagtca attatgattt ttccggaaaa gtcgtgctgg ttaccggcgc tggctctggt
60attggccgtg ccactgcgct tgccttcgcg cagtcgggcg catccgttgc ggtcgcagac
120atctcgactg accacggttt gaaaaccgta gagttggtca aagccgaagg aggcgaggcg
180accttcttcc atgtcgatgt aggctctgaa cccagcgtcc agtcgatgct ggctggtgtc
240gtggcgcatt acggcggcct ggacattgcg cacaacaacg ccggcattga ggccaatatc
300gtgccgctgg ccgagctgga ctccgacaac tggcgtcgtg tcatcgatgt gaacctttcc
360tcggtgttct attgcctgaa aggtgaaatc cctctgatgc tgaaaagggg cggcggcgcc
420attgtgaata ccgcatcggc ctccgggctg attggcggct atcgcctttc cgggtatacc
480gccacgaagc acggcgtagt ggggctgact aaggctgctg ctatcgatta tgcaaaccag
540aatatccgga ttaatgccgt gtgccctggt ccagttgact ccccattcct ggctgacatg
600ccgcaaccca tgcgcgatcg acttctcttt ggcactccaa ttggacgatt ggccaccgca
660gaggagatcg cgcgttcggt tctgtggctg tgttctgacg atgcaaaata cgtggtgggc
720cattcgatgt cagtcgacgg tggcgtggca gtgactgcgg ttggtactcg aatggatgat
780ctcttttaa
789110262PRTPseudomonas putida KT2440 110Met Thr Val Asn Tyr Asp Phe Ser
Gly Lys Val Val Leu Val Thr Gly1 5 10
15Ala Gly Ser Gly Ile Gly Arg Ala Thr Ala Leu Ala Phe Ala Gln
Ser20 25 30Gly Ala Ser Val Ala Val Ala
Asp Ile Ser Thr Asp His Gly Leu Lys35 40
45Thr Val Glu Leu Val Lys Ala Glu Gly Gly Glu Ala Thr Phe Phe His50
55 60Val Asp Val Gly Ser Glu Pro Ser Val Gln
Ser Met Leu Ala Gly Val65 70 75
80Val Ala His Tyr Gly Gly Leu Asp Ile Ala His Asn Asn Ala Gly
Ile85 90 95Glu Ala Asn Ile Val Pro Leu
Ala Glu Leu Asp Ser Asp Asn Trp Arg100 105
110Arg Val Ile Asp Val Asn Leu Ser Ser Val Phe Tyr Cys Leu Lys Gly115
120 125Glu Ile Pro Leu Met Leu Lys Arg Gly
Gly Gly Ala Ile Val Asn Thr130 135 140Ala
Ser Ala Ser Gly Leu Ile Gly Gly Tyr Arg Leu Ser Gly Tyr Thr145
150 155 160Ala Thr Lys His Gly Val
Val Gly Leu Thr Lys Ala Ala Ala Ile Asp165 170
175Tyr Ala Asn Gln Asn Ile Arg Ile Asn Ala Val Cys Pro Gly Pro
Val180 185 190Asp Ser Pro Phe Leu Ala Asp
Met Pro Gln Pro Met Arg Asp Arg Leu195 200
205Leu Phe Gly Thr Pro Ile Gly Arg Leu Ala Thr Ala Glu Glu Ile Ala210
215 220Arg Ser Val Leu Trp Leu Cys Ser Asp
Asp Ala Lys Tyr Val Val Gly225 230 235
240His Ser Met Ser Val Asp Gly Gly Val Ala Val Thr Ala Val
Gly Thr245 250 255Arg Met Asp Asp Leu
Phe260111762DNAPseudomonas putida KT2440 111atgagcatga ccttttctgg
ccaggtagcc ctggtgaccg gcgcgggtgc cggcatcggc 60cgggcaaccg ccctggcgtt
cgcccacgag ggcatgaaag tggtggtggc ggacctcgac 120ccggtcggcg gcgaggccac
cgtggcgcag atccacgcgg caggcggcga agcgctgttc 180attgcctgcg acgtgacccg
cgacgccgag gtgcgccagt tgcatgagcg cctgatggcc 240gcctacggcc ggctggacta
cgccttcaac aacgccggga tcgagatcga gcaacaccgc 300ctggccgaag gcagcgaagc
ggagttcgat gccatcatgg gcgtgaacgt gaagggcgtg 360tggttgtgca tgaagtatca
gttgcccttg ttgctggccc aaggcggtgg ggccatcgtc 420aataccgcgt cggtggcggg
gctaggggcg gcgccaaaga tgagcatcta cagcgccagc 480aagcatgcgg tcatcggtct
gaccaagtcg gcggccatcg agtacgccaa gaagggcatc 540cgcgtgaacg ccgtgtgccc
ggccgtgatc gacaccgaca tgttccgccg cgcttaccag 600gccgacccgc gcaaggccga
gttcgccgca gccatgcacc cggtagggcg cattggcaag 660gtcgaggaaa tcgccagcgc
cgtgctgtat ctgtgcagtg acggcgcggc gtttaccacc 720gggcattgcc tgacggtgga
tggtggggct acggcgatct ga 762112253PRTPseudomonas
putida KT2440 112Met Ser Met Thr Phe Ser Gly Gln Val Ala Leu Val Thr Gly
Ala Gly1 5 10 15Ala Gly
Ile Gly Arg Ala Thr Ala Leu Ala Phe Ala His Glu Gly Met20
25 30Lys Val Val Val Ala Asp Leu Asp Pro Val Gly Gly
Glu Ala Thr Val35 40 45Ala Gln Ile His
Ala Ala Gly Gly Glu Ala Leu Phe Ile Ala Cys Asp50 55
60Val Thr Arg Asp Ala Glu Val Arg Gln Leu His Glu Arg Leu
Met Ala65 70 75 80Ala
Tyr Gly Arg Leu Asp Tyr Ala Phe Asn Asn Ala Gly Ile Glu Ile85
90 95Glu Gln His Arg Leu Ala Glu Gly Ser Glu Ala
Glu Phe Asp Ala Ile100 105 110Met Gly Val
Asn Val Lys Gly Val Trp Leu Cys Met Lys Tyr Gln Leu115
120 125Pro Leu Leu Leu Ala Gln Gly Gly Gly Ala Ile Val
Asn Thr Ala Ser130 135 140Val Ala Gly Leu
Gly Ala Ala Pro Lys Met Ser Ile Tyr Ser Ala Ser145 150
155 160Lys His Ala Val Ile Gly Leu Thr Lys
Ser Ala Ala Ile Glu Tyr Ala165 170 175Lys
Lys Gly Ile Arg Val Asn Ala Val Cys Pro Ala Val Ile Asp Thr180
185 190Asp Met Phe Arg Arg Ala Tyr Gln Ala Asp Pro
Arg Lys Ala Glu Phe195 200 205Ala Ala Ala
Met His Pro Val Gly Arg Ile Gly Lys Val Glu Glu Ile210
215 220Ala Ser Ala Val Leu Tyr Leu Cys Ser Asp Gly Ala
Ala Phe Thr Thr225 230 235
240Gly His Cys Leu Thr Val Asp Gly Gly Ala Thr Ala Ile245
250113810DNAPseudomonas putida KT2440 113atgtcttttc aaaacaaaat
cgttgtgctc acaggcgcag cttctggcat cggcaaagcg 60acagcacagc tgctagtgga
gcagggcgcc catgtggttg ccatggatct taaaagcgac 120ttgcttcaac aagcattcgg
cagtgaggag cacgttctgt gcatccctac cgacgtcagc 180gatagcgaag ccgtgcgagc
cgccttccag gcagtggacg cgaaatttgg ccgtgtcgac 240gtgattatta acgccgcggg
catcaacgca cctacgcgag aagccaacca gaaaatggtt 300gatgccaacg tcgctgccct
cgatgccatg aagagcgggc gggcgcccac tttcgacttc 360ctggccgata cctcggatca
ggatttccgg cgcgtaatgg aagtcaattt gttcagccag 420ttttactgca ttcgagaggg
tgttccgctg atgcgccgag cgggtggcgg cagcatcgtc 480aacatctcca gcgtggcagc
gctcctgggc gtggcaatgc cactttacta ccccgcctcc 540aaggcggcgg tgctgggcct
cacccgtgca gcggcagctg agttggcacc ttacaacatt 600cgtgtgaatg ccatcgctcc
aggctctgtc gacacaccat tgatgcatga gcaaccaccg 660gaagtcgttc agttcctggt
cagcatgcaa cccatcaagc ggctggccca acccgaggag 720cttgcccaaa gcatcctgtt
ccttgccggt gagcattcgt ccttcatcac cggacagacg 780ctttctccca acggcgggat
gcacatgtaa 810114269PRTPseudomonas
putida KT2440 114Met Ser Phe Gln Asn Lys Ile Val Val Leu Thr Gly Ala Ala
Ser Gly1 5 10 15Ile Gly
Lys Ala Thr Ala Gln Leu Leu Val Glu Gln Gly Ala His Val20
25 30Val Ala Met Asp Leu Lys Ser Asp Leu Leu Gln Gln
Ala Phe Gly Ser35 40 45Glu Glu His Val
Leu Cys Ile Pro Thr Asp Val Ser Asp Ser Glu Ala50 55
60Val Arg Ala Ala Phe Gln Ala Val Asp Ala Lys Phe Gly Arg
Val Asp65 70 75 80Val
Ile Ile Asn Ala Ala Gly Ile Asn Ala Pro Thr Arg Glu Ala Asn85
90 95Gln Lys Met Val Asp Ala Asn Val Ala Ala Leu
Asp Ala Met Lys Ser100 105 110Gly Arg Ala
Pro Thr Phe Asp Phe Leu Ala Asp Thr Ser Asp Gln Asp115
120 125Phe Arg Arg Val Met Glu Val Asn Leu Phe Ser Gln
Phe Tyr Cys Ile130 135 140Arg Glu Gly Val
Pro Leu Met Arg Arg Ala Gly Gly Gly Ser Ile Val145 150
155 160Asn Ile Ser Ser Val Ala Ala Leu Leu
Gly Val Ala Met Pro Leu Tyr165 170 175Tyr
Pro Ala Ser Lys Ala Ala Val Leu Gly Leu Thr Arg Ala Ala Ala180
185 190Ala Glu Leu Ala Pro Tyr Asn Ile Arg Val Asn
Ala Ile Ala Pro Gly195 200 205Ser Val Asp
Thr Pro Leu Met His Glu Gln Pro Pro Glu Val Val Gln210
215 220Phe Leu Val Ser Met Gln Pro Ile Lys Arg Leu Ala
Gln Pro Glu Glu225 230 235
240Leu Ala Gln Ser Ile Leu Phe Leu Ala Gly Glu His Ser Ser Phe Ile245
250 255Thr Gly Gln Thr Leu Ser Pro Asn Gly
Gly Met His Met260 265115771DNAPseudomonas putida KT2440
115atgacccttg aaggcaaaac tgcactcgtc accggttcca ccagcggcat tggcctgggc
60atcgcccagg tattggcccg ggctggcgcc aacatcgtgc tcaacggctt tggtgacccg
120ggccccgcca tggcggaaat tgcccggcac ggggtgaagg ttgtgcacca cccggccgac
180ctgtcggatg tggtccagat cgaggctttg ttcaacctgg ccgaacgcga gttcggcggc
240gtcgacatcc tggtcaacaa cgccggtatc cagcatgtgg caccggttga gcagttcccg
300ccagaaagct gggacaagat catcgccctg aacctgtcgg ccgtattcca tggcacgcgc
360ctggcgctgc cgggcatgcg cacgcgcaac tgggggcgca tcatcaatat cgcttcggtg
420catggcctgg tcggctcgat tggcaaggca gcctacgtgg cagccaagca tggcgtgatc
480ggcctgacca aggtggtcgg cctggaaacc gccaccagtc atgtcacctg caatgccata
540tgcccgggct gggtgctgac accgctggtg caaaagcaga tcgacgatcg tgcggccaag
600ggtggcgatc ggctgcaagc gcagcacgat ctgctggcag aaaagcaacc gtcgctggct
660ttcgtcaccc ccgaacacct cggtgagctg gtactctttc tgtgcagcga ggccggtagc
720caggttcgcg gcgccgcctg gaacgtcgat ggtggctggt tggcccagtg a
771116256PRTPseudomonas putida KT2440 116Met Thr Leu Glu Gly Lys Thr Ala
Leu Val Thr Gly Ser Thr Ser Gly1 5 10
15Ile Gly Leu Gly Ile Ala Gln Val Leu Ala Arg Ala Gly Ala Asn
Ile20 25 30Val Leu Asn Gly Phe Gly Asp
Pro Gly Pro Ala Met Ala Glu Ile Ala35 40
45Arg His Gly Val Lys Val Val His His Pro Ala Asp Leu Ser Asp Val50
55 60Val Gln Ile Glu Ala Leu Phe Asn Leu Ala
Glu Arg Glu Phe Gly Gly65 70 75
80Val Asp Ile Leu Val Asn Asn Ala Gly Ile Gln His Val Ala Pro
Val85 90 95Glu Gln Phe Pro Pro Glu Ser
Trp Asp Lys Ile Ile Ala Leu Asn Leu100 105
110Ser Ala Val Phe His Gly Thr Arg Leu Ala Leu Pro Gly Met Arg Thr115
120 125Arg Asn Trp Gly Arg Ile Ile Asn Ile
Ala Ser Val His Gly Leu Val130 135 140Gly
Ser Ile Gly Lys Ala Ala Tyr Val Ala Ala Lys His Gly Val Ile145
150 155 160Gly Leu Thr Lys Val Val
Gly Leu Glu Thr Ala Thr Ser His Val Thr165 170
175Cys Asn Ala Ile Cys Pro Gly Trp Val Leu Thr Pro Leu Val Gln
Lys180 185 190Gln Ile Asp Asp Arg Ala Ala
Lys Gly Gly Asp Arg Leu Gln Ala Gln195 200
205His Asp Leu Leu Ala Glu Lys Gln Pro Ser Leu Ala Phe Val Thr Pro210
215 220Glu His Leu Gly Glu Leu Val Leu Phe
Leu Cys Ser Glu Ala Gly Ser225 230 235
240Gln Val Arg Gly Ala Ala Trp Asn Val Asp Gly Gly Trp Leu
Ala Gln245 250 255117750DNAPseudomonas
putida KT2440 117atgtccaagc aacttacact cgaaggcaaa gtggccctgg ttcagggcgg
ttcccgaggc 60attggcgcag ctatcgtaag gcgcctggcc cgcgaaggcg cgcaagtggc
cttcacctat 120gtcagctctg ccggcccggc tgaagaactg gctcgggaaa ttaccgagaa
cggcggcaaa 180gccttggccc tgcgggctga cagcgctgat gccgcggccg tgcagctggc
ggttgatgac 240accgagaaag ccttgggccg gctggatatc ctggtcaaca acgccggtgt
gctggcagtg 300gccccagtga cagagttcga cctggccgac ttcgatcata tgctggccgt
gaacgtacgc 360agcgtgttcg tcgccagcca ggccgcggca cgctatatgg gccagggcgg
tcgtatcatc 420aacattggca gcaccaacgc cgagcgcatg ccgtttgccg gtggtgcacc
gtacgccatg 480agcaagtcgg cactggttgg tctgacccgc ggcatggcac gcgacctcgg
gccgcagggc 540attaccgtga acaacgtgca gcccggcccg gtggacaccg acatgaaccc
ggccagtggc 600gagtttgccg agagcctgat tccgctgatg gccattgggc gatatggcga
gccggaggag 660attgccagct tcgtggctta cctggcaggg cctgaagccg ggtatatcac
cggggccagc 720ctgactgtag atggtgggtt tgcagcctga
750118249PRTPseudomonas putida KT2440 118Met Ser Lys Gln Leu
Thr Leu Glu Gly Lys Val Ala Leu Val Gln Gly1 5
10 15Gly Ser Arg Gly Ile Gly Ala Ala Ile Val Arg Arg
Leu Ala Arg Glu20 25 30Gly Ala Gln Val
Ala Phe Thr Tyr Val Ser Ser Ala Gly Pro Ala Glu35 40
45Glu Leu Ala Arg Glu Ile Thr Glu Asn Gly Gly Lys Ala Leu
Ala Leu50 55 60Arg Ala Asp Ser Ala Asp
Ala Ala Ala Val Gln Leu Ala Val Asp Asp65 70
75 80Thr Glu Lys Ala Leu Gly Arg Leu Asp Ile Leu
Val Asn Asn Ala Gly85 90 95Val Leu Ala
Val Ala Pro Val Thr Glu Phe Asp Leu Ala Asp Phe Asp100
105 110His Met Leu Ala Val Asn Val Arg Ser Val Phe Val
Ala Ser Gln Ala115 120 125Ala Ala Arg Tyr
Met Gly Gln Gly Gly Arg Ile Ile Asn Ile Gly Ser130 135
140Thr Asn Ala Glu Arg Met Pro Phe Ala Gly Gly Ala Pro Tyr
Ala Met145 150 155 160Ser
Lys Ser Ala Leu Val Gly Leu Thr Arg Gly Met Ala Arg Asp Leu165
170 175Gly Pro Gln Gly Ile Thr Val Asn Asn Val Gln
Pro Gly Pro Val Asp180 185 190Thr Asp Met
Asn Pro Ala Ser Gly Glu Phe Ala Glu Ser Leu Ile Pro195
200 205Leu Met Ala Ile Gly Arg Tyr Gly Glu Pro Glu Glu
Ile Ala Ser Phe210 215 220Val Ala Tyr Leu
Ala Gly Pro Glu Ala Gly Tyr Ile Thr Gly Ala Ser225 230
235 240Leu Thr Val Asp Gly Gly Phe Ala
Ala245119858DNAPseudomonas putida KT2440 119atgagcgact accctacccc
tccattccca tcccaaccgc aaagcgttcc cggttcccag 60cgcaagatgg atccgtatcc
ggactgcggt gagcagagct acaccggcaa caatcgcctc 120gcaggcaaga tcgccttgat
aaccggtgct gacagcggca tcgggcgtgc ggtggcgatt 180gcctatgccc gagaaggcgc
tgacgttgcc attgcctatc tgaatgaaca cgacgatgcg 240caggaaaccg cgcgctgggt
caaagcggct ggccgccagt gcctgctgct gcccggcgac 300ctggcacaga aacagcactg
ccacgacatc gtcgacaaga ccgtggcgca gtttggtcgc 360atcgatatcc tggtcaacaa
cgccgcgttc cagatggccc atgaaagcct ggacgacatt 420gatgacgatg aatgggtgaa
gaccttcgat accaacatca ccgccatttt ccgcatttgc 480cagcgcgctt tgccctcgat
gccaaagggc ggttcgatca tcaacaccag ttcggtcaac 540tctgacgacc cgtcacccag
cctgttggcc tatgccgcga ccaaaggggc tattgccaat 600ttcactgcag gccttgcgca
actgctgggc aagcagggca ttcgcgtcaa cagcgtcgca 660cccggcccga tctggacccc
gctgatcccg gccaccatgc ctgatgaggc ggtgagaaac 720ttcggttccg gttacccgat
gggacggccg ggtcaacctg tggaggtggc gccaatctat 780gtcttgctgg ggtccgatga
agccagctac atctcgggtt cgcgttacgc cgtgacggga 840ggcaaaccta ttctgtga
858120285PRTPseudomonas
putida KT2440 120Met Ser Asp Tyr Pro Thr Pro Pro Phe Pro Ser Gln Pro Gln
Ser Val1 5 10 15Pro Gly
Ser Gln Arg Lys Met Asp Pro Tyr Pro Asp Cys Gly Glu Gln20
25 30Ser Tyr Thr Gly Asn Asn Arg Leu Ala Gly Lys Ile
Ala Leu Ile Thr35 40 45Gly Ala Asp Ser
Gly Ile Gly Arg Ala Val Ala Ile Ala Tyr Ala Arg50 55
60Glu Gly Ala Asp Val Ala Ile Ala Tyr Leu Asn Glu His Asp
Asp Ala65 70 75 80Gln
Glu Thr Ala Arg Trp Val Lys Ala Ala Gly Arg Gln Cys Leu Leu85
90 95Leu Pro Gly Asp Leu Ala Gln Lys Gln His Cys
His Asp Ile Val Asp100 105 110Lys Thr Val
Ala Gln Phe Gly Arg Ile Asp Ile Leu Val Asn Asn Ala115
120 125Ala Phe Gln Met Ala His Glu Ser Leu Asp Asp Ile
Asp Asp Asp Glu130 135 140Trp Val Lys Thr
Phe Asp Thr Asn Ile Thr Ala Ile Phe Arg Ile Cys145 150
155 160Gln Arg Ala Leu Pro Ser Met Pro Lys
Gly Gly Ser Ile Ile Asn Thr165 170 175Ser
Ser Val Asn Ser Asp Asp Pro Ser Pro Ser Leu Leu Ala Tyr Ala180
185 190Ala Thr Lys Gly Ala Ile Ala Asn Phe Thr Ala
Gly Leu Ala Gln Leu195 200 205Leu Gly Lys
Gln Gly Ile Arg Val Asn Ser Val Ala Pro Gly Pro Ile210
215 220Trp Thr Pro Leu Ile Pro Ala Thr Met Pro Asp Glu
Ala Val Arg Asn225 230 235
240Phe Gly Ser Gly Tyr Pro Met Gly Arg Pro Gly Gln Pro Val Glu Val245
250 255Ala Pro Ile Tyr Val Leu Leu Gly Ser
Asp Glu Ala Ser Tyr Ile Ser260 265 270Gly
Ser Arg Tyr Ala Val Thr Gly Gly Lys Pro Ile Leu275 280
285121774DNAPseudomonas putida KT2440 121atgatcgaaa
tcagcggcag caccccgggc cacaatggcc gggtagcctt ggtcacgggc 60gccgcccgcg
gcatcggtct gggcattgcc gcatggctga tctgcgaagg ctggcaagtg 120gtgctgagtg
atctggaccg ccagcgtggt accaaagtgg ccaaggcgtt gggcgacaac 180gcctggttca
tcaccatgga cgttgccgac gaggcccagg tcagtgccgg cgtgtccgaa 240gtgctcgggc
agttcggccg gctggacgcg ctggtgtgca atgcggccat tgccaacccg 300cacaaccaga
cgctggaaag cctgagcctg gcacaatgga accgggtgct gggggtcaac 360ctcagcggcc
ccatgctgct ggccaagcat tgtgcgccgt acctgcgtgc gcacaatggg 420gcgatcgtca
acctgacctc tacccgtgct cggcagtccg aacccgacac cgaggcttac 480gcggcaagca
agggcggcct ggtggctttg acccatgccc tggccatgag cctgggcccg 540gagattcgcg
tcaatgcggt gagcccgggc tggatcgatg cccgtgatcc gtcgcagcgc 600cgtgccgagc
cgttgagcga agctgaccat gcccagcatc caacgggcag ggtagggacc 660gtggaagatg
tcgcggccat ggttgcctgg ttgctgtcac gccaggcggc atttgtcacc 720ggccaggagt
ttgtggtcga tggcggcatg acccgcaaga tgatctatac ctga
774122257PRTPseudomonas putida KT2440 122Met Ile Glu Ile Ser Gly Ser Thr
Pro Gly His Asn Gly Arg Val Ala1 5 10
15Leu Val Thr Gly Ala Ala Arg Gly Ile Gly Leu Gly Ile Ala Ala
Trp20 25 30Leu Ile Cys Glu Gly Trp Gln
Val Val Leu Ser Asp Leu Asp Arg Gln35 40
45Arg Gly Thr Lys Val Ala Lys Ala Leu Gly Asp Asn Ala Trp Phe Ile50
55 60Thr Met Asp Val Ala Asp Glu Ala Gln Val
Ser Ala Gly Val Ser Glu65 70 75
80Val Leu Gly Gln Phe Gly Arg Leu Asp Ala Leu Val Cys Asn Ala
Ala85 90 95Ile Ala Asn Pro His Asn Gln
Thr Leu Glu Ser Leu Ser Leu Ala Gln100 105
110Trp Asn Arg Val Leu Gly Val Asn Leu Ser Gly Pro Met Leu Leu Ala115
120 125Lys His Cys Ala Pro Tyr Leu Arg Ala
His Asn Gly Ala Ile Val Asn130 135 140Leu
Thr Ser Thr Arg Ala Arg Gln Ser Glu Pro Asp Thr Glu Ala Tyr145
150 155 160Ala Ala Ser Lys Gly Gly
Leu Val Ala Leu Thr His Ala Leu Ala Met165 170
175Ser Leu Gly Pro Glu Ile Arg Val Asn Ala Val Ser Pro Gly Trp
Ile180 185 190Asp Ala Arg Asp Pro Ser Gln
Arg Arg Ala Glu Pro Leu Ser Glu Ala195 200
205Asp His Ala Gln His Pro Thr Gly Arg Val Gly Thr Val Glu Asp Val210
215 220Ala Ala Met Val Ala Trp Leu Leu Ser
Arg Gln Ala Ala Phe Val Thr225 230 235
240Gly Gln Glu Phe Val Val Asp Gly Gly Met Thr Arg Lys Met
Ile Tyr245 250 255Thr123741DNAPseudomonas
putida KT2440 123atgagcctgc aaggtaaagt tgcactggtt accggcgcca gccgtggcat
tggccaggcc 60atcgccctcg agctgggccg ccagggcgcg accgtgatcg gtaccgccac
gtcggcgtcc 120ggtgccgagc gcatcgctgc caccctgaaa gaacacggca ttaccggcac
tggcatggag 180ctgaacgtga ccagcgccga atcggttgaa gccgtactgg ccgccattgg
cgagcagttc 240ggcgcgccgg ccatcttggt caacaatgcc ggtatcaccc gcgacaacct
catgctgcgc 300atgaaagacg acgagtggtt tgatgtcatc gacaccaacc tgaacagcct
ctaccgtctg 360tccaagggcg tgctgcgtgg catgaccaag gcgcgttggg gtcgtatcat
cagcatcggc 420tcggtcgttg gtgccatggg taacgcaggt caggccaact acgcggctgc
caaggccggt 480ctggaaggtt tcagccgcgc cctggcgcgt gaagtgggtt cgcgtggtat
caccgtcaac 540tcggtgaccc caggcttcat cgataccgac atgacccgcg agctgccaga
agctcagcgc 600gaagccctgc agacccagat tccgctgggc cgcctgggcc aggctgacga
aattgccaag 660gtggtttcgt tcctggcatc cgacggcgcc gcctacgtga ccggcgctac
cgtgccggtc 720aacggcggga tgtacatgta a
741124246PRTPseudomonas putida KT2440 124Met Ser Leu Gln Gly
Lys Val Ala Leu Val Thr Gly Ala Ser Arg Gly1 5
10 15Ile Gly Gln Ala Ile Ala Leu Glu Leu Gly Arg Gln
Gly Ala Thr Val20 25 30Ile Gly Thr Ala
Thr Ser Ala Ser Gly Ala Glu Arg Ile Ala Ala Thr35 40
45Leu Lys Glu His Gly Ile Thr Gly Thr Gly Met Glu Leu Asn
Val Thr50 55 60Ser Ala Glu Ser Val Glu
Ala Val Leu Ala Ala Ile Gly Glu Gln Phe65 70
75 80Gly Ala Pro Ala Ile Leu Val Asn Asn Ala Gly
Ile Thr Arg Asp Asn85 90 95Leu Met Leu
Arg Met Lys Asp Asp Glu Trp Phe Asp Val Ile Asp Thr100
105 110Asn Leu Asn Ser Leu Tyr Arg Leu Ser Lys Gly Val
Leu Arg Gly Met115 120 125Thr Lys Ala Arg
Trp Gly Arg Ile Ile Ser Ile Gly Ser Val Val Gly130 135
140Ala Met Gly Asn Ala Gly Gln Ala Asn Tyr Ala Ala Ala Lys
Ala Gly145 150 155 160Leu
Glu Gly Phe Ser Arg Ala Leu Ala Arg Glu Val Gly Ser Arg Gly165
170 175Ile Thr Val Asn Ser Val Thr Pro Gly Phe Ile
Asp Thr Asp Met Thr180 185 190Arg Glu Leu
Pro Glu Ala Gln Arg Glu Ala Leu Gln Thr Gln Ile Pro195
200 205Leu Gly Arg Leu Gly Gln Ala Asp Glu Ile Ala Lys
Val Val Ser Phe210 215 220Leu Ala Ser Asp
Gly Ala Ala Tyr Val Thr Gly Ala Thr Val Pro Val225 230
235 240Asn Gly Gly Met Tyr
Met245125738DNAPseudomonas putida KT2440 125atgactcaga aaatagctgt
cgtgaccggc ggcagtcgcg gcattggcaa gtccatcgtg 60ctggccctgg ccggcgcggg
ttatcaggtt gccttcagtt atgtccgtga cgaggcgtca 120gccgctgcct tgcaggcgca
ggtcgaaggg ctcggccggg actgcctggc cgtgcagtgt 180gatgtcaagg aagcgccgag
cattcaggcg ttttttgaac gggtcgagca acgtttcgag 240cgtatcgact tgttggtcaa
caacgccggt attacccgtg acggtttgct cgccacgcaa 300tcgttgaacg acatcaccga
ggtcatccag accaacctgg tcggcacgtt gttgtgctgt 360cagcaggtgc tgccctgcat
gatgcgccaa cgcagcgggt gcatcgtcaa cctcagttcg 420gtggccgcgc aaaagcccgg
caagggccag agcaactacg ccgccgccaa aggcggtgta 480gaagcattga cacgcgcact
ggcggtggag ttggcgccgc gcaacatccg ggtcaacgcg 540gtggcgcccg gcatcgtcag
caccgacatg agccaagccc tggtcggcgc ccatgagcag 600gaaatccagt cgcggctgtt
gatcaaacgg ttcgcccggc ctgaagaaat tgccgacgcg 660gtgctgtatc tggccgagcg
cggcctgtac atcacgggcg aagtcctgtc cgtcaacggc 720ggattgaaaa tgccatga
738126245PRTPseudomonas
putida KT2440 126Met Thr Gln Lys Ile Ala Val Val Thr Gly Gly Ser Arg Gly
Ile Gly1 5 10 15Lys Ser
Ile Val Leu Ala Leu Ala Gly Ala Gly Tyr Gln Val Ala Phe20
25 30Ser Tyr Val Arg Asp Glu Ala Ser Ala Ala Ala Leu
Gln Ala Gln Val35 40 45Glu Gly Leu Gly
Arg Asp Cys Leu Ala Val Gln Cys Asp Val Lys Glu50 55
60Ala Pro Ser Ile Gln Ala Phe Phe Glu Arg Val Glu Gln Arg
Phe Glu65 70 75 80Arg
Ile Asp Leu Leu Val Asn Asn Ala Gly Ile Thr Arg Asp Gly Leu85
90 95Leu Ala Thr Gln Ser Leu Asn Asp Ile Thr Glu
Val Ile Gln Thr Asn100 105 110Leu Val Gly
Thr Leu Leu Cys Cys Gln Gln Val Leu Pro Cys Met Met115
120 125Arg Gln Arg Ser Gly Cys Ile Val Asn Leu Ser Ser
Val Ala Ala Gln130 135 140Lys Pro Gly Lys
Gly Gln Ser Asn Tyr Ala Ala Ala Lys Gly Gly Val145 150
155 160Glu Ala Leu Thr Arg Ala Leu Ala Val
Glu Leu Ala Pro Arg Asn Ile165 170 175Arg
Val Asn Ala Val Ala Pro Gly Ile Val Ser Thr Asp Met Ser Gln180
185 190Ala Leu Val Gly Ala His Glu Gln Glu Ile Gln
Ser Arg Leu Leu Ile195 200 205Lys Arg Phe
Ala Arg Pro Glu Glu Ile Ala Asp Ala Val Leu Tyr Leu210
215 220Ala Glu Arg Gly Leu Tyr Ile Thr Gly Glu Val Leu
Ser Val Asn Gly225 230 235
240Gly Leu Lys Met Pro245127768DNAPseudomonas putida KT2440
127atgtccaaga cccacctgtt cgacctcgac ggcaagattg cctttgtttc cggcgccagc
60cgtggcatcg gcgaggccat cgcccacttg ctcgcgcagc aaggggccca tgtgatcgtt
120tccagccgca agcttgacgg gtgccagcag gtggccgacg ccatcattgc cgccggcggc
180aaggccacgg ctgtggcctg ccacattggt gagctggaac agattcagca ggtgttcgcc
240ggcattcgcg aacagttcgg gcgactggac gtgctggtca acaatgcagc caccaacccg
300caattctgca atgtgctgga caccgaccca ggggcgttcc agaagaccgt ggacgtgaac
360atccgtggtt acttcttcat gtcggtggag gctggcaagc tgatgcgcga gaacggcggc
420ggcagcatca tcaacgtggc gtcgatcaac ggtgtttcac ccgggctgtt ccaaggcatc
480tactcggtga ccaaggcggc ggtcatcaac atgaccaagg tgttcgccaa agagtgtgca
540cccttcggta ttcgctgcaa cgcgctactg ccggggctga ccgataccaa gttcgcttcg
600gcattggtga agaacgaagc catcctcaac gccgccttgc agcagatccc cctcaaacgc
660gtggccgacc ccaaggaaat ggcgggtgcg gtgctgtacc tggccagcga tgcctccagc
720tacaccaccg gcaccacgct caatgtcgac ggtggcttcc tgtcctga
768128255PRTPseudomonas putida KT2440 128Met Ser Lys Thr His Leu Phe Asp
Leu Asp Gly Lys Ile Ala Phe Val1 5 10
15Ser Gly Ala Ser Arg Gly Ile Gly Glu Ala Ile Ala His Leu Leu
Ala20 25 30Gln Gln Gly Ala His Val Ile
Val Ser Ser Arg Lys Leu Asp Gly Cys35 40
45Gln Gln Val Ala Asp Ala Ile Ile Ala Ala Gly Gly Lys Ala Thr Ala50
55 60Val Ala Cys His Ile Gly Glu Leu Glu Gln
Ile Gln Gln Val Phe Ala65 70 75
80Gly Ile Arg Glu Gln Phe Gly Arg Leu Asp Val Leu Val Asn Asn
Ala85 90 95Ala Thr Asn Pro Gln Phe Cys
Asn Val Leu Asp Thr Asp Pro Gly Ala100 105
110Phe Gln Lys Thr Val Asp Val Asn Ile Arg Gly Tyr Phe Phe Met Ser115
120 125Val Glu Ala Gly Lys Leu Met Arg Glu
Asn Gly Gly Gly Ser Ile Ile130 135 140Asn
Val Ala Ser Ile Asn Gly Val Ser Pro Gly Leu Phe Gln Gly Ile145
150 155 160Tyr Ser Val Thr Lys Ala
Ala Val Ile Asn Met Thr Lys Val Phe Ala165 170
175Lys Glu Cys Ala Pro Phe Gly Ile Arg Cys Asn Ala Leu Leu Pro
Gly180 185 190Leu Thr Asp Thr Lys Phe Ala
Ser Ala Leu Val Lys Asn Glu Ala Ile195 200
205Leu Asn Ala Ala Leu Gln Gln Ile Pro Leu Lys Arg Val Ala Asp Pro210
215 220Lys Glu Met Ala Gly Ala Val Leu Tyr
Leu Ala Ser Asp Ala Ser Ser225 230 235
240Tyr Thr Thr Gly Thr Thr Leu Asn Val Asp Gly Gly Phe Leu
Ser245 250 255129762DNAPseudomonas
fluorescens Pf-5 129atgagcatga cgttttccgg ccaggtggcc ctagtgaccg
gcgcagccaa tggtatcggc 60cgcgccaccg cccaggcatt tgccgcacaa ggcttgaagg
tggtggtggc ggacctggac 120acggcggggg gcgagggcac cgtggcgctg atccgcgagg
ccggtggcga ggcattgttc 180gtgccgtgca acgttaccct ggaggcggat gtgcaaagcc
tcatggcccg caccatcgaa 240gcctatgggc gcctggatta cgccttcaac aatgccggta
tcgagatcga aaagggccgc 300cttgcggagg gctccatgga tgagttcgac gccatcatgg
gggtcaacgt caaaggggtc 360tggctgtgca tgaagtacca gttgccgctg ctgctggccc
agggcggtgg ggcgatcgtc 420aacaccgcct cggtggcggg cctgggcgcg gcgccgaaga
tgagcatcta tgcggcctcc 480aagcatgcgg tgatcggcct gaccaagtcg gcggccatcg
aatatgcgaa gaagaaaatc 540cgcgtgaacg cggtatgccc ggcggtgatc gacaccgaca
tgttccgccg tgcctacgag 600gcggacccga agaaggccga gttcgccgcg gccatgcacc
cggtggggcg catcggcaag 660gtcgaggaga tcgccagtgc ggtgctctac ctgtgcagcg
atggcgcggc ctttaccacc 720ggccatgcac tggcggtcga cggcggggcc accgcgatct
ga 762130253PRTPseudomonas fluorescens Pf-5 130Met
Ser Met Thr Phe Ser Gly Gln Val Ala Leu Val Thr Gly Ala Ala1
5 10 15Asn Gly Ile Gly Arg Ala Thr Ala
Gln Ala Phe Ala Ala Gln Gly Leu20 25
30Lys Val Val Val Ala Asp Leu Asp Thr Ala Gly Gly Glu Gly Thr Val35
40 45Ala Leu Ile Arg Glu Ala Gly Gly Glu Ala
Leu Phe Val Pro Cys Asn50 55 60Val Thr
Leu Glu Ala Asp Val Gln Ser Leu Met Ala Arg Thr Ile Glu65
70 75 80Ala Tyr Gly Arg Leu Asp Tyr
Ala Phe Asn Asn Ala Gly Ile Glu Ile85 90
95Glu Lys Gly Arg Leu Ala Glu Gly Ser Met Asp Glu Phe Asp Ala Ile100
105 110Met Gly Val Asn Val Lys Gly Val Trp
Leu Cys Met Lys Tyr Gln Leu115 120 125Pro
Leu Leu Leu Ala Gln Gly Gly Gly Ala Ile Val Asn Thr Ala Ser130
135 140Val Ala Gly Leu Gly Ala Ala Pro Lys Met Ser
Ile Tyr Ala Ala Ser145 150 155
160Lys His Ala Val Ile Gly Leu Thr Lys Ser Ala Ala Ile Glu Tyr
Ala165 170 175Lys Lys Lys Ile Arg Val Asn
Ala Val Cys Pro Ala Val Ile Asp Thr180 185
190Asp Met Phe Arg Arg Ala Tyr Glu Ala Asp Pro Lys Lys Ala Glu Phe195
200 205Ala Ala Ala Met His Pro Val Gly Arg
Ile Gly Lys Val Glu Glu Ile210 215 220Ala
Ser Ala Val Leu Tyr Leu Cys Ser Asp Gly Ala Ala Phe Thr Thr225
230 235 240Gly His Ala Leu Ala Val
Asp Gly Gly Ala Thr Ala Ile245 250131735DNAKlebsiella
pneumoniae subsp. pneumoniae MGH78578 131atgaaacttg ccagtaaaac cgccattgtc
accggcgccg cacgcggtat cggctttggc 60attgcccagg tgcttgcgcg ggaaggcgcg
cgagtgatta tcgccgatcg tgatgcacac 120ggcgaagccg ccgccgcttc cctgcgcgaa
tcgggcgcac aggcgctgtt tatcagctgc 180aatatcgctg aaaaaacgca ggtcgaagcc
ctgtattccc aggccgaaga ggcgtttggc 240ccggtagaca ttctggtgaa taacgccgga
atcaaccgcg acgccatgct gcacaaatta 300acggaagcgg actgggacac ggttatcgac
gttaacctga aaggcacttt cctctgtatg 360cagcaggccg ctatccgcat gcgcgagcgc
ggtgcgggcc gcattatcaa tatcgcttcc 420gccagttggc ttggcaacgt cgggcaaacc
aactattcgg cgtcaaaagc cggcgtggtg 480ggaatgacca aaaccgcctg ccgcgaactg
gcgaaaaaag gtgtcacggt gaatgccatc 540tgcccgggct ttatcgatac cgacatgacg
cgcggcgtac cggaaaacgt ctggcaaatc 600atggtcagca aaattcccgc gggttacgcc
ggcgaggcga aagacgtcgg cgagtgtgtg 660gcgtttctgg cgtccgatgg cgcgcgctat
atcaatggtg aagtgattaa cgtcggcggc 720ggcatggtgc tgtaa
735132253PRTKlebsiella pneumoniae
subsp. pneumoniae MGH78578 132Met Ser Met Thr Phe Ser Gly Gln Val Ala Leu
Val Thr Gly Ala Ala1 5 10
15Asn Gly Ile Gly Arg Ala Thr Ala Gln Ala Phe Ala Ala Gln Gly Leu20
25 30Lys Val Val Val Ala Asp Leu Asp Thr Ala
Gly Gly Glu Gly Thr Val35 40 45Ala Leu
Ile Arg Glu Ala Gly Gly Glu Ala Leu Phe Val Pro Cys Asn50
55 60Val Thr Leu Glu Ala Asp Val Gln Ser Leu Met Ala
Arg Thr Ile Glu65 70 75
80Ala Tyr Gly Arg Leu Asp Tyr Ala Phe Asn Asn Ala Gly Ile Glu Ile85
90 95Glu Lys Gly Arg Leu Ala Glu Gly Ser Met
Asp Glu Phe Asp Ala Ile100 105 110Met Gly
Val Asn Val Lys Gly Val Trp Leu Cys Met Lys Tyr Gln Leu115
120 125Pro Leu Leu Leu Ala Gln Gly Gly Gly Ala Ile Val
Asn Thr Ala Ser130 135 140Val Ala Gly Leu
Gly Ala Ala Pro Lys Met Ser Ile Tyr Ala Ala Ser145 150
155 160Lys His Ala Val Ile Gly Leu Thr Lys
Ser Ala Ala Ile Glu Tyr Ala165 170 175Lys
Lys Lys Ile Arg Val Asn Ala Val Cys Pro Ala Val Ile Asp Thr180
185 190Asp Met Phe Arg Arg Ala Tyr Glu Ala Asp Pro
Lys Lys Ala Glu Phe195 200 205Ala Ala Ala
Met His Pro Val Gly Arg Ile Gly Lys Val Glu Glu Ile210
215 220Ala Ser Ala Val Leu Tyr Leu Cys Ser Asp Gly Ala
Ala Phe Thr Thr225 230 235
240Gly His Ala Leu Ala Val Asp Gly Gly Ala Thr Ala Ile245
250133750DNAKlebsiella pneumoniae subsp. pneumoniae MGH78578
133atgttattga aagataaagt cgccattatt actggcgcgg cctccgcacg cggtttgggc
60ttcgcgactg cgaaattatt cgccgaaaac ggcgcgaaag tggtcattat cgacctcaat
120ggcgaagcca gtaaaaccgc cgcggcggca ttaggcgaag accatctcgg cctggcggcc
180aacgtcgctg atgaagtgca ggtgcaggcg gccatcgaac agatcctggc gaaatacggt
240cgggttgatg tactggtcaa taacgccggg attacccagc cgctgaagct gatggatatc
300aagcgcgcca actatgacgc ggtgcttgat gttagcctgc gcggcacgct gctgatgtcg
360caggcggtta tccccaccat gcgggcgcaa aaatccggca gcatcgtctg catctcgtcc
420gtctccgccc agcgcggcgg cggtattttc ggcggaccgc actacagcgc ggcaaaagcc
480ggggtgctgg gtctggcgcg ggcgatggcg cgcgagcttg gcccggataa cgtccgcgtt
540aactgcatca ccccggggct gattcagacc gacattaccg ccggcaagct gactgatgac
600atgacggcca acattcttgc cggcattccg atgaaccgcc ttggcgacgc gatagacatc
660gcgcgcgccg cgctgttcct cggcagcgat ctttcctcct actccaccgg catcaccctg
720gacgttaacg gcggcatgtt aattcactaa
750134249PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 134Met Leu
Leu Lys Asp Lys Val Ala Ile Ile Thr Gly Ala Ala Ser Ala1 5
10 15Arg Gly Leu Gly Phe Ala Thr Ala Lys
Leu Phe Ala Glu Asn Gly Ala20 25 30Lys
Val Val Ile Ile Asp Leu Asn Gly Glu Ala Ser Lys Thr Ala Ala35
40 45Ala Ala Leu Gly Glu Asp His Leu Gly Leu Ala
Ala Asn Val Ala Asp50 55 60Glu Val Gln
Val Gln Ala Ala Ile Glu Gln Ile Leu Ala Lys Tyr Gly65 70
75 80Arg Val Asp Val Leu Val Asn Asn
Ala Gly Ile Thr Gln Pro Leu Lys85 90
95Leu Met Asp Ile Lys Arg Ala Asn Tyr Asp Ala Val Leu Asp Val Ser100
105 110Leu Arg Gly Thr Leu Leu Met Ser Gln Ala
Val Ile Pro Thr Met Arg115 120 125Ala Gln
Lys Ser Gly Ser Ile Val Cys Ile Ser Ser Val Ser Ala Gln130
135 140Arg Gly Gly Gly Ile Phe Gly Gly Pro His Tyr Ser
Ala Ala Lys Ala145 150 155
160Gly Val Leu Gly Leu Ala Arg Ala Met Ala Arg Glu Leu Gly Pro Asp165
170 175Asn Val Arg Val Asn Cys Ile Thr Pro
Gly Leu Ile Gln Thr Asp Ile180 185 190Thr
Ala Gly Lys Leu Thr Asp Asp Met Thr Ala Asn Ile Leu Ala Gly195
200 205Ile Pro Met Asn Arg Leu Gly Asp Ala Ile Asp
Ile Ala Arg Ala Ala210 215 220Leu Phe Leu
Gly Ser Asp Leu Ser Ser Tyr Ser Thr Gly Ile Thr Leu225
230 235 240Asp Val Asn Gly Gly Met Leu
Ile His245135750DNAKlebsiella pneumoniae subsp. pneumoniae MGH78578
135atgttattga aagataaagt cgccattatt actggcgcgg cctccgcacg cggtttgggc
60ttcgcgactg cgaaattatt cgccgaaaac ggcgcgaaag tggtcattat cgacctcaat
120ggcgaagcca gtaaaaccgc cgcggcggca ttaggcgaag accatctcgg cctggcggcc
180aacgtcgctg atgaagtgca ggtgcaggcg gccatcgaac agatcctggc gaaatacggt
240cgggttgatg tactggtcaa taacgccggg attacccagc cgctgaagct gatggatatc
300aagcgcgcca actatgacgc ggtgcttgat gttagcctgc gcggcacgct gctgatgtcg
360caggcggtta tccccaccat gcgggcgcaa aaatccggca gcatcgtctg catctcgtcc
420gtctccgccc agcgcggcgg cggtattttc ggcggaccgc actacagcgc ggcaaaagcc
480ggggtgctgg gtctggcgcg ggcgatggcg cgcgagcttg gcccggataa cgtccgcgtt
540aactgcatca ccccggggct gattcagacc gacattaccg ccggcaagct gactgatgac
600atgacggcca acattcttgc cggcattccg atgaaccgcc ttggcgacgc gatagacatc
660gcgcgcgccg cgctgttcct cggcagcgat ctttcctcct actccaccgg catcaccctg
720gacgttaacg gcggcatgtt aattcactaa
750136249PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 136Met Leu
Leu Lys Asp Lys Val Ala Ile Ile Thr Gly Ala Ala Ser Ala1 5
10 15Arg Gly Leu Gly Phe Ala Thr Ala Lys
Leu Phe Ala Glu Asn Gly Ala20 25 30Lys
Val Val Ile Ile Asp Leu Asn Gly Glu Ala Ser Lys Thr Ala Ala35
40 45Ala Ala Leu Gly Glu Asp His Leu Gly Leu Ala
Ala Asn Val Ala Asp50 55 60Glu Val Gln
Val Gln Ala Ala Ile Glu Gln Ile Leu Ala Lys Tyr Gly65 70
75 80Arg Val Asp Val Leu Val Asn Asn
Ala Gly Ile Thr Gln Pro Leu Lys85 90
95Leu Met Asp Ile Lys Arg Ala Asn Tyr Asp Ala Val Leu Asp Val Ser100
105 110Leu Arg Gly Thr Leu Leu Met Ser Gln Ala
Val Ile Pro Thr Met Arg115 120 125Ala Gln
Lys Ser Gly Ser Ile Val Cys Ile Ser Ser Val Ser Ala Gln130
135 140Arg Gly Gly Gly Ile Phe Gly Gly Pro His Tyr Ser
Ala Ala Lys Ala145 150 155
160Gly Val Leu Gly Leu Ala Arg Ala Met Ala Arg Glu Leu Gly Pro Asp165
170 175Asn Val Arg Val Asn Cys Ile Thr Pro
Gly Leu Ile Gln Thr Asp Ile180 185 190Thr
Ala Gly Lys Leu Thr Asp Asp Met Thr Ala Asn Ile Leu Ala Gly195
200 205Ile Pro Met Asn Arg Leu Gly Asp Ala Ile Asp
Ile Ala Arg Ala Ala210 215 220Leu Phe Leu
Gly Ser Asp Leu Ser Ser Tyr Ser Thr Gly Ile Thr Leu225
230 235 240Asp Val Asn Gly Gly Met Leu
Ile His245137714DNAKlebsiella pneumoniae subsp. pneumoniae MGH78578
137atgacagcgt ttcacaacaa atcagtgctg gttttaggcg ggagtcgggg aattggcgcg
60gcgatcgtca ggcgttttgt cgccgatggc gcgtcggtgg tgtttagcta ttccggttcg
120ccggaagcgg ccgagcggct ggcggcagag accggcagca cggcggtgca ggcggacagc
180gccgatcgcg atgcggtgat aagcctggtc cgcgacagcg gcccgctgga cgtgttagtg
240gtcaatgccg ggatcgcgct tttcggtgac gctctcgagc aggacagcga tgcaatcgat
300cgcctgttcc acatcaatat tcacgccccc taccatgcct ccgtcgaagc ggcgcgccgc
360atgccggaag gcgggcgcat tattgtcatc ggctcagtca atggcgatcg catgccgttg
420ccgggaatgg cggcctatgc gctcagcaaa tcggccctgc aggggctggc gcgcggcctg
480gcgcgggatt ttggcccgcg cggcatcacg gtcaacgtcg tccagcccgg cccaattgat
540accgacgcca acccggagaa cggcccgatg aaagagctga tgcacagctt tatggccatt
600aagcgccatg gccgtccgga agaggtggcg ggaatggtgg cgtggctggc cggtccggag
660gcgtcgtttg tcactggcgc catgcacacc atcgacggag cgtttggcgc ctga
714138237PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 138Met Thr
Ala Phe His Asn Lys Ser Val Leu Val Leu Gly Gly Ser Arg1 5
10 15Gly Ile Gly Ala Ala Ile Val Arg Arg
Phe Val Ala Asp Gly Ala Ser20 25 30Val
Val Phe Ser Tyr Ser Gly Ser Pro Glu Ala Ala Glu Arg Leu Ala35
40 45Ala Glu Thr Gly Ser Thr Ala Val Gln Ala Asp
Ser Ala Asp Arg Asp50 55 60Ala Val Ile
Ser Leu Val Arg Asp Ser Gly Pro Leu Asp Val Leu Val65 70
75 80Val Asn Ala Gly Ile Ala Leu Phe
Gly Asp Ala Leu Glu Gln Asp Ser85 90
95Asp Ala Ile Asp Arg Leu Phe His Ile Asn Ile His Ala Pro Tyr His100
105 110Ala Ser Val Glu Ala Ala Arg Arg Met Pro
Glu Gly Gly Arg Ile Ile115 120 125Val Ile
Gly Ser Val Asn Gly Asp Arg Met Pro Leu Pro Gly Met Ala130
135 140Ala Tyr Ala Leu Ser Lys Ser Ala Leu Gln Gly Leu
Ala Arg Gly Leu145 150 155
160Ala Arg Asp Phe Gly Pro Arg Gly Ile Thr Val Asn Val Val Gln Pro165
170 175Gly Pro Ile Asp Thr Asp Ala Asn Pro
Glu Asn Gly Pro Met Lys Glu180 185 190Leu
Met His Ser Phe Met Ala Ile Lys Arg His Gly Arg Pro Glu Glu195
200 205Val Ala Gly Met Val Ala Trp Leu Ala Gly Pro
Glu Ala Ser Phe Val210 215 220Thr Gly Ala
Met His Thr Ile Asp Gly Ala Phe Gly Ala225 230
235139750DNAKlebsiella pneumoniae subsp. pneumoniae MGH78578
139atgaacggcc tgctaaacgg taaacgtatt gtcgtcaccg gtgcggcgcg cggtctcggg
60taccactttg ccgaagcctg cgccgctcag ggcgcgacgg tggtgatgtg cgacatcctg
120cagggagagc tggcggaaag cgctcatcgc ctgcagcaga agggctatca ggtcgaatct
180cacgccatcg atcttgccag tcaagcatcg atcgagcagg tcttcagcgc catcggcgcg
240caggggtcta tcgatggctt agtcaataac gcagcgatgg ccaccggcgt cggcggaaaa
300aatatgatcg attacgatcc ggatctgtgg gatcgggtaa tgacggtcaa cgttaaaggc
360acctggttgg tgacccgcgc ggcggtaccg ctgctgcgcg aaggggcggc gatcgtcaac
420gtcgcttcgg ataccgcgct gtggggcgcg ccgcggctga tggcctatgt cgccagtaag
480ggcgcggtga ttgcgatgac ccgctccatg gcccgcgagc tgggtgaaaa gcggatccgt
540atcaacgcca tcgcgccggg actgacccgc gttgaggcca cggaatacgt tcccgccgag
600cgtcatcagc tgtatgagaa cggccgcgcg ctcagcggcg cgcagcagcc ggaagatgtc
660accggcagcg tggtctggct gctgagcgat ctttcgcgct ttatcaccgg ccaactgatc
720ccggtcaacg gcggttttgt ctttaactaa
750140249PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 140Met Asn
Gly Leu Leu Asn Gly Lys Arg Ile Val Val Thr Gly Ala Ala1 5
10 15Arg Gly Leu Gly Tyr His Phe Ala Glu
Ala Cys Ala Ala Gln Gly Ala20 25 30Thr
Val Val Met Cys Asp Ile Leu Gln Gly Glu Leu Ala Glu Ser Ala35
40 45His Arg Leu Gln Gln Lys Gly Tyr Gln Val Glu
Ser His Ala Ile Asp50 55 60Leu Ala Ser
Gln Ala Ser Ile Glu Gln Val Phe Ser Ala Ile Gly Ala65 70
75 80Gln Gly Ser Ile Asp Gly Leu Val
Asn Asn Ala Ala Met Ala Thr Gly85 90
95Val Gly Gly Lys Asn Met Ile Asp Tyr Asp Pro Asp Leu Trp Asp Arg100
105 110Val Met Thr Val Asn Val Lys Gly Thr Trp
Leu Val Thr Arg Ala Ala115 120 125Val Pro
Leu Leu Arg Glu Gly Ala Ala Ile Val Asn Val Ala Ser Asp130
135 140Thr Ala Leu Trp Gly Ala Pro Arg Leu Met Ala Tyr
Val Ala Ser Lys145 150 155
160Gly Ala Val Ile Ala Met Thr Arg Ser Met Ala Arg Glu Leu Gly Glu165
170 175Lys Arg Ile Arg Ile Asn Ala Ile Ala
Pro Gly Leu Thr Arg Val Glu180 185 190Ala
Thr Glu Tyr Val Pro Ala Glu Arg His Gln Leu Tyr Glu Asn Gly195
200 205Arg Ala Leu Ser Gly Ala Gln Gln Pro Glu Asp
Val Thr Gly Ser Val210 215 220Val Trp Leu
Leu Ser Asp Leu Ser Arg Phe Ile Thr Gly Gln Leu Ile225
230 235 240Pro Val Asn Gly Gly Phe Val
Phe Asn245141795DNAKlebsiella pneumoniae subsp. pneumoniae MGH78578
141atgaatgcac aaattgaagg gcgcgtcgcg gtagtcaccg gcggttcgtc aggaatcggc
60tttgaaacgc tgcgcctgct gctgggcgaa ggggcgaaag tcgccttttg cggccgcaac
120ccggatcggc ttgccagcgc ccatgcggcg ttgcaaaacg aatatccaga aggtgaggtg
180ttctcctggc gctgtgacgt actgaacgaa gctgaagttg aggcgttcgc cgccgcggtc
240gccgcgcgtt tcggcggcgt cgatatgctg attaataacg ccggccaggg ctatgtcgcc
300cacttcgccg atacgccacg tgaggcctgg ctgcacgaag ccgaactgaa actgttcggc
360gtgattaacc cggtaaaggc ctttcagtcc ctgctagagg cgtcggatat cgcctcgatt
420acctgtgtga actcgctgct ggcgttacag ccggaagagc acatgatcgc cacctctgcc
480gcccgcgccg cgctgctcaa tatgacgctg actctgtcga aagagctggt ggataaaggt
540attcgtgtga attccattct gctggggatg gtggagtccg ggcagtggca gcgccgtttt
600gagagccgaa gcgataagag ccagagttgg cagcagtgga ccgccgatat cgcccgtaag
660cgggggatcc cgatggcgcg tctcggtaag ccgcaggagc cagcgcaagc gctgctattc
720ctcgcttcgc cgctggcctc ctttaccacc ggcgcggcgc tggacgtttc cggcggtttc
780tgtcgccatc tgtaa
795142264PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 142Met Asn
Ala Gln Ile Glu Gly Arg Val Ala Val Val Thr Gly Gly Ser1 5
10 15Ser Gly Ile Gly Phe Glu Thr Leu Arg
Leu Leu Leu Gly Glu Gly Ala20 25 30Lys
Val Ala Phe Cys Gly Arg Asn Pro Asp Arg Leu Ala Ser Ala His35
40 45Ala Ala Leu Gln Asn Glu Tyr Pro Glu Gly Glu
Val Phe Ser Trp Arg50 55 60Cys Asp Val
Leu Asn Glu Ala Glu Val Glu Ala Phe Ala Ala Ala Val65 70
75 80Ala Ala Arg Phe Gly Gly Val Asp
Met Leu Ile Asn Asn Ala Gly Gln85 90
95Gly Tyr Val Ala His Phe Ala Asp Thr Pro Arg Glu Ala Trp Leu His100
105 110Glu Ala Glu Leu Lys Leu Phe Gly Val Ile
Asn Pro Val Lys Ala Phe115 120 125Gln Ser
Leu Leu Glu Ala Ser Asp Ile Ala Ser Ile Thr Cys Val Asn130
135 140Ser Leu Leu Ala Leu Gln Pro Glu Glu His Met Ile
Ala Thr Ser Ala145 150 155
160Ala Arg Ala Ala Leu Leu Asn Met Thr Leu Thr Leu Ser Lys Glu Leu165
170 175Val Asp Lys Gly Ile Arg Val Asn Ser
Ile Leu Leu Gly Met Val Glu180 185 190Ser
Gly Gln Trp Gln Arg Arg Phe Glu Ser Arg Ser Asp Lys Ser Gln195
200 205Ser Trp Gln Gln Trp Thr Ala Asp Ile Ala Arg
Lys Arg Gly Ile Pro210 215 220Met Ala Arg
Leu Gly Lys Pro Gln Glu Pro Ala Gln Ala Leu Leu Phe225
230 235 240Leu Ala Ser Pro Leu Ala Ser
Phe Thr Thr Gly Ala Ala Leu Asp Val245 250
255Ser Gly Gly Phe Cys Arg His Leu2601431795DNAPseudomonas fluorescens
143cgccaagcaa tcgggctttg gggcagaatt gggtcgcgaa gggcttgagg agtttgccca
60gtccaagatc atcaacgccg cgctataaat taaaggatcc cccatggcga tgattacagg
120cggcgaactg gttgttcgca ccctaataaa ggctggggtc gaacatctgt tcggcctgca
180cggcgcgcat atcgatacga tttttcaagc ctgtctcgat catgatgtgc cgatcatcga
240cacccgccat gaggccgccg cagggcatgc ggccgagggc tatgcccgcg ctggcgccaa
300gctgggcgtg gctggtcacg gcgggcgggg gatttaccaa tgcggtcacg cccattgcca
360acgcttggct ggatcgcaag gccggtgtat tcctcacccg ggatcgggcg cgctgcgtga
420tgatgaaacc aacacgttgc aggcggggat tgatcaggtc gccatggcgg cgcccattac
480caaatgggcg catcgggtga tggcaaccga gcatatccca cggctggtga tgcaggcgat
540ccgcgccgcg ttgagcgcgc cacgcgggcc ggtgttgctg gatctgccgt gggatattct
600gatgaaccag attgatgagg atagcgtcat tatccccgat ctggtcttgt ccgcgcatgg
660ggccagaccc gaccctgccg atctggatca ggctctcgcg cttttgcgca aggcggagcg
720gccggtcatc gtgctcggct cagaagcctc gcggacagcg cgcaagacgg cgcttagcgc
780cttcgtggcg gcgactggcg tgccggtgtt tgccgattat gaagggctaa gcatgctctc
840ggggctgccc gatgctatgc ggggcgggct ggtgcaaaac ctctattctt ttgccaaagc
900cgatgccgcg ccagatctcg tgctgatgct gggggcgcgc tttggcctta acaccgggca
960tggatctggg cagttgatcc cccatagcgc gcaggtcatt caggtcgacc ctgatgcctg
1020cgagctggga cgcctgcagg gcatcgctct gggcattgtg gccgatgtgg gtgggaccat
1080cgaggctttg gcgcaggcca ccgcgcaaga tgcggcttgg ccggatcgcg gcgactggtg
1140cgccaaagtg acggatctgg cgcaagagcg ctatgccagc atcgctgcga aatcgagcag
1200cgagcatgcg ctccacccct ttcacgcctc gcaggtcatt gccaaacacg tcgatgcagg
1260ggtgacggtg gtagcggatg gtgcgctgac ctatctctgg ctgtccgaag tgatgagccg
1320cgtgaaaccc ggcggttttc tctgccacgg ctatctaggc tcgatgggcg tgggcttcgg
1380cacggcgctg ggcgcgcaag tggccgatct tgaagcaggc cgccgcacga tccttgtgac
1440cggcgatggc tcggtgggct atagcatcgg tgaatttgat acgctggtgc gcaaacaatt
1500gccgctgatc gtcatcatca tgaacaacca aagctggggg gcgacattgc atttccagca
1560attggccgtc ggccccaatc gcgtgacggg cacccgtttg gaaaatggct cctatcacgg
1620ggtggccgcc gcctttggcg cggatggcta tcatgtcgac agtgtggaga gcttttctgc
1680ggctctggcc caagcgctcg cccataatcg ccccgcctgc atcaatgtcg cggtcgcgct
1740cgatccgatc ccgcccgaag aactcattct gatcggcatg gaccccttcg catga
1795144563PRTPseudomonas fluorescens 144Met Ala Met Ile Thr Gly Gly Glu
Leu Val Val Arg Thr Leu Ile Lys1 5 10
15Ala Gly Val Glu His Leu Phe Gly Leu His Gly Ala His Ile Asp
Thr20 25 30Ile Phe Gln Ala Cys Leu Asp
His Asp Val Pro Ile Ile Asp Thr Arg35 40
45His Glu Ala Ala Ala Gly His Ala Ala Glu Gly Tyr Ala Arg Ala Gly50
55 60Ala Lys Leu Gly Val Ala Gly His Gly Gly
Arg Gly Ile Tyr Gln Cys65 70 75
80Gly His Ala His Cys Gln Arg Leu Ala Gly Ser Gln Gly Arg Cys
Ile85 90 95Pro His Pro Gly Ser Gly Ala
Leu Arg Asp Asp Glu Thr Asn Thr Leu100 105
110Gln Ala Gly Ile Asp Gln Val Ala Met Ala Ala Pro Ile Thr Lys Trp115
120 125Ala His Arg Val Met Ala Thr Glu His
Ile Pro Arg Leu Val Met Gln130 135 140Ala
Ile Arg Ala Ala Leu Ser Ala Pro Arg Gly Pro Val Leu Leu Asp145
150 155 160Leu Pro Trp Asp Ile Leu
Met Asn Gln Ile Asp Glu Asp Ser Val Ile165 170
175Ile Pro Asp Leu Val Leu Ser Ala His Gly Ala Arg Pro Asp Pro
Ala180 185 190Asp Leu Asp Gln Ala Leu Ala
Leu Leu Arg Lys Ala Glu Arg Pro Val195 200
205Ile Val Leu Gly Ser Glu Ala Ser Arg Thr Ala Arg Lys Thr Ala Leu210
215 220Ser Ala Phe Val Ala Ala Thr Gly Val
Pro Val Phe Ala Asp Tyr Glu225 230 235
240Gly Leu Ser Met Leu Ser Gly Leu Pro Asp Ala Met Arg Gly
Gly Leu245 250 255Val Gln Asn Leu Tyr Ser
Phe Ala Lys Ala Asp Ala Ala Pro Asp Leu260 265
270Val Leu Met Leu Gly Ala Arg Phe Gly Leu Asn Thr Gly His Gly
Ser275 280 285Gly Gln Leu Ile Pro His Ser
Ala Gln Val Ile Gln Val Asp Pro Asp290 295
300Ala Cys Glu Leu Gly Arg Leu Gln Gly Ile Ala Leu Gly Ile Val Ala305
310 315 320Asp Val Gly Gly
Thr Ile Glu Ala Leu Ala Gln Ala Thr Ala Gln Asp325 330
335Ala Ala Trp Pro Asp Arg Gly Asp Trp Cys Ala Lys Val Thr
Asp Leu340 345 350Ala Gln Glu Arg Tyr Ala
Ser Ile Ala Ala Lys Ser Ser Ser Glu His355 360
365Ala Leu His Pro Phe His Ala Ser Gln Val Ile Ala Lys His Val
Asp370 375 380Ala Gly Val Thr Val Val Ala
Asp Gly Ala Leu Thr Tyr Leu Trp Leu385 390
395 400Ser Glu Val Met Ser Arg Val Lys Pro Gly Gly Phe
Leu Cys His Gly405 410 415Tyr Leu Gly Ser
Met Gly Val Gly Phe Gly Thr Ala Leu Gly Ala Gln420 425
430Val Ala Asp Leu Glu Ala Gly Arg Arg Thr Ile Leu Val Thr
Gly Asp435 440 445Gly Ser Val Gly Tyr Ser
Ile Gly Glu Phe Asp Thr Leu Val Arg Lys450 455
460Gln Leu Pro Leu Ile Val Ile Ile Met Asn Asn Gln Ser Trp Gly
Ala465 470 475 480Thr Leu
His Phe Gln Gln Leu Ala Val Gly Pro Asn Arg Val Thr Gly485
490 495Thr Arg Leu Glu Asn Gly Ser Tyr His Gly Val Ala
Ala Ala Phe Gly500 505 510Ala Asp Gly Tyr
His Val Asp Ser Val Glu Ser Phe Ser Ala Ala Leu515 520
525Ala Gln Ala Leu Ala His Asn Arg Pro Ala Cys Ile Asn Val
Ala Val530 535 540Ala Leu Asp Pro Ile Pro
Pro Glu Glu Leu Ile Leu Ile Gly Met Asp545 550
555 560Pro Phe Ala1459PRTArtificial SequenceA
polypeptide that is similar to an autotransporter adhesion or type I
secretion target repeat. 145Gly Gly Xaa Gly Xaa Asp Xaa Xaa Xaa1
514650DNAArtificial SequencePrimer 146gtctttattc atatatatat
cctccttaat tcaaccgttc aatcaccatc 5014730DNAArtificial
SequencePrimer 147gggcggccgc aaggggttcg cgttggccga
3014822DNAArtificial SequencePrimer 148ggagaaaata
ccgcatcagg cg
2214932DNAArtificial SequencePrimer 149cgggatccaa gttgcaggat atgacgaaag
cg 3215033DNAArtificial SequencePrimer
150gctctagaag attatccctg tctgcggaag cgg
3315132DNAArtificial SequencePrimer 151gctctagagg ggtgcctaat gagtgagcta
ac 3215233DNAArtificial SequencePrimer
152cgggatccgc gttaatattt tgttaaaatt cgc
3315331DNAArtificial SequencePrimer 153gctctagagt ttatgtcgca cccgccgttg g
3115432DNAArtificial SequencePrimer
154cccaagctta gaaagggaaa ttgtggtagc cc
3215531DNAArtificial SequencePrimer 155ggaattccat atgcgtccct ctgccccggc c
3115630DNAArtificial SequencePrimer
156cgggatcctt agaactgctt gggaagggag
3015750DNAArtificial SequencePrimer 157aggtacggtg aaataaagga ggatatacat
atgtccaaaa agattgccgt 5015837DNAArtificial SequencePrimer
158ttttcctttt gcggccgccc cgctggcatc gcctcac
3715950DNAArtificial SequencePrimer 159ggcgatgcca gcgtaaagga ggatatacat
atgaaaaact ggaaaacaag 5016037DNAArtificial SequencePrimer
160ttttcctttt gcggccgccc cagcttagcg ccttcta
3716131DNAArtificial SequencePrimer 161cccgagctct taggaggatt agtcatggaa c
3116232DNAArtificial SequencePrimer
162gctctagatt attttgaata atcgtagaaa cc
3216342DNAArtificial SequencePrimer 163gctctagagg aggatatata tatgaaaaat
tgtgtcatcg tc 4216430DNAArtificial SequencePrimer
164aactgcagtt aattcaaccg ttcaatcacc
3016546DNAArtificial SequencePrimer 165cgagctcagg aggatatata tatgaaaaat
tgtgtcatcg tcagtg 4616650DNAArtificial SequencePrimer
166ggttgaatta aggaggatat atatatgaat aaagacacac taatacctac
5016730DNAArtificial SequencePrimer 167cccaagctta gccggcaagt acacatcttc
3016846DNAArtificial SequencePrimer
168cgagctcagg aggatatata tatgaaaaat tgtgtcatcg tcagtg
4616930DNAArtificial SequencePrimer 169cccaagctta gccggcaagt acacatcttc
3017040DNAArtificial SequencePrimer
170aaggaaaaaa gcggccgccc ctgaaccgac gaccgggtcg
4017135DNAArtificial SequencePrimer 171cggggtaccg cggatacata tttgaatgta
tttag 3517244DNAArtificial SequencePrimer
172aaggaaaaaa gcggccgcgc ggatacatat ttgaatgtat ttag
4417343DNAArtificial SequencePrimer 173gctctagagg aggatatata tatggctaac
tacttcaata cac 4317450DNAArtificial SequencePrimer
174tgctgttgcg ggttaaggag gatatatata tgcctaagta ccgttccgcc
5017550DNAArtificial SequencePrimer 175aacggtactt aggcatatat atatcctcct
taacccgcaa cagcaatacg 5017630DNAArtificial SequencePrimer
176acatgcatgc ttaacccccc agtttcgatt
3017743DNAArtificial SequencePrimer 177gctctagagg aggatatata tatggctaac
tacttcaata cac 4317830DNAArtificial SequencePrimer
178acatgcatgc ttaacccccc agtttcgatt
3017943DNAArtificial SequencePrimer 179cccgagctca ggaggatata tatatggata
aacagtatcc ggt 4318028DNAArtificial SequencePrimer
180gctctagatt acagaatttg actcaggt
2818145DNAArtificial SequencePrimer 181cccgagctca ggaggatata tatatgttga
caaaagcaac aaaag 4518225DNAArtificial SequencePrimer
182ctctaaatct ctggaaaggg taccg
2518330DNAArtificial SequencePrimer 183gctctagatt agagagcttt cgttttcatg
3018445DNAArtificial SequencePrimer
184cccgagctca ggaggatata tatatgttga caaaagcaac aaaag
4518530DNAArtificial SequencePrimer 185gctctagatt agagagcttt cgttttcatg
3018646DNAArtificial SequencePrimer
186cgagctcagg aggatatata tatgagccag caagtcatta ttttcg
4618735DNAArtificial SequencePrimer 187aaaactgcag cgtttgatga cgtggacgat
agcgg 3518846DNAArtificial SequencePrimer
188cgagctcagg aggatatata tatgagccag caagtcatta ttttcg
4618950DNAArtificial SequencePrimer 189aggggtgtaa ggaggatata tatatggcta
agacgttata cgaaaaattg 5019050DNAArtificial SequencePrimer
190cgtcttagcc atatatatat cctccttaca ccccttctgc tacatagcgg
5019135DNAArtificial SequencePrimer 191aaaactgcag cgtttgatga cgtggacgat
agcgg 3519246DNAArtificial SequencePrimer
192cgagctcagg aggatatata tatgagccag caagtcatta ttttcg
4619335DNAArtificial SequencePrimer 193aaaactgcag cgtttgatga cgtggacgat
agcgg 3519446DNAArtificial SequencePrimer
194cgagctcagg aggatatata tatgagccag caagtcatta ttttcg
4619550DNAArtificial SequencePrimer 195gaaaccgtgt gaggaggata tatatatgtc
gaagaattac catattgccg 5019650DNAArtificial SequencePrimer
196aggggtgtaa ggaggatata tatatggcta agacgttata cgaaaaattg
5019750DNAArtificial SequencePrimer 197acattaaata aggaggatat atatatggca
gagaaattta tcaaacacac 5019850DNAArtificial SequencePrimer
198attcttcgac atatatatat cctcctcaca cggtttcctt gttgttttcg
5019950DNAArtificial SequencePrimer 199cgtcttagcc atatatatat cctccttaca
ccccttctgc tacatagcgg 5020050DNAArtificial SequencePrimer
200tttctctgcc atatatatat cctccttatt taatgttgcg aatgtcggcg
5020135DNAArtificial SequencePrimer 201aaaactgcag cgtttgatga cgtggacgat
agcgg 3520246DNAArtificial SequencePrimer
202cgagctcagg aggatatata tatgagccag caagtcatta ttttcg
4620335DNAArtificial SequencePrimer 203aaaactgcag cgtttgatga cgtggacgat
agcgg 3520440DNAArtificial SequencePrimer
204aaggaaaaaa gcggccgccc ctgaaccgac gaccgggtcg
4020535DNAArtificial SequencePrimer 205cggggtaccg cggatacata tttgaatgta
tttag 3520642DNAArtificial SequencePrimer
206aaggaaaaaa gcggccgcac ttttcatact cccgccattc ag
4220731DNAArtificial SequencePrimer 207caaaggccgt ctgcacgcgc cgaaaggcaa a
3120831DNAArtificial SequencePrimer
208tttgcctttc ggcgcgtgca gacggccttt g
3120935DNAArtificial SequencePrimer 209acatgcatgc cgtttgatga cgtggacgat
agcgg 3521042DNAArtificial SequencePrimer
210aaggaaaaaa gcggccgcac ttttcatact cccgccattc ag
4221135DNAArtificial SequencePrimer 211acatgcatgc cgtttgatga cgtggacgat
agcgg 3521248DNAArtificial SequencePrimer
212cccgagctca ggaggatata tatatgaatt atcagaacga cgatttac
4821350DNAArtificial SequencePrimer 213gcgtcgcggg taaggaggaa aattttatgt
cctcacgtaa agagcttgcc 5021450DNAArtificial SequencePrimer
214gaactgctgt aaggaggtta aaattatgga gaggattgtc gttactctcg
5021550DNAArtificial SequencePrimer 215caatcagcgt aaggaggtat atataatgaa
aaccgtaact gtaaaagatc 5021650DNAArtificial SequencePrimer
216tacaccaggc ataaggagga attaattatg gaaacctatg ctgtttttgg
5021750DNAArtificial SequencePrimer 217tacgtgagga cataaaattt tcctccttac
ccgcgacgcg cttttactgc 5021850DNAArtificial SequencePrimer
218caatcctctc cataatttta acctccttac agcagttctt ttgctttcgc
5021950DNAArtificial SequencePrimer 219caatcagcgt aaggaggtat atataatgaa
aaccgtaact gtaaaagatc 5022050DNAArtificial SequencePrimer
220tacggttttc attatatata cctccttacg ctgattgaca atcggcaatg
5022134DNAArtificial SequencePrimer 221acatgcatgc ttacgcggac aattcctcct
gcaa 3422248DNAArtificial SequencePrimer
222cccgagctca ggaggatata tatatgaatt atcagaacga cgatttac
4822334DNAArtificial SequencePrimer 223acatgcatgc ttacgcggac aattcctcct
gcaa 3422448DNAArtificial SequencePrimer
224cccgagctca ggaggatata tatatgacat cggaaaaccc gttactgg
4822550DNAArtificial SequencePrimer 225gatccaacct aaggaggaaa attttatgac
acaacctctt tttctgatcg 5022650DNAArtificial SequencePrimer
226gatcaattgt taaggaggta tatataatgg aatccctgac gttacaaccc
5022750DNAArtificial SequencePrimer 227caggcagcct aaggaggaat taattatggc
tggaaacaca attggacaac 5022850DNAArtificial SequencePrimer
228aggttgtgtc ataaaatttt cctccttagg ttggatcaac aggcactacg
5022950DNAArtificial SequencePrimer 229cagggattcc attatatata cctccttaac
aattgatcgt ctgtgccagg 5023050DNAArtificial SequencePrimer
230gtttccagcc ataattaatt cctccttagg ctgcctggct aatccgcgcc
5023135DNAArtificial SequencePrimer 231acatgcatgc ttaccagcgt ggaatatcag
tcttc 3523248DNAArtificial SequencePrimer
232cccgagctca ggaggatata tatatgacat cggaaaaccc gttactgg
4823335DNAArtificial SequencePrimer 233acatgcatgc ttaccagcgt ggaatatcag
tcttc 3523448DNAArtificial SequencePrimer
234cccgagctca ggaggatata tatatggttg ctgaattgac cgcattac
4823550DNAArtificial SequencePrimer 235aatcgccagt aaggaggaaa attttatgac
acaacctctt tttctgatcg 5023650DNAArtificial SequencePrimer
236gatcaattgt taaggaggta tatataatgg aatccctgac gttacaaccc
5023750DNAArtificial SequencePrimer 237caggcagcct aaggaggaat taattatggc
tggaaacaca attggacaac 5023850DNAArtificial SequencePrimer
238gaggttgtgt cataaaattt tcctccttac tggcgattgt cattcgcctg
5023950DNAArtificial SequencePrimer 239cagggattcc attatatata cctccttaac
aattgatcgt ctgtgccagg 5024050DNAArtificial SequencePrimer
240gtttccagcc ataattaatt cctccttagg ctgcctggct aatccgcgcc
5024135DNAArtificial SequencePrimer 241acatgcatgc ttaccagcgt ggaatatcag
tcttc 3524248DNAArtificial SequencePrimer
242cccgagctca ggaggatata tatatggttg ctgaattgac cgcattac
4824335DNAArtificial SequencePrimer 243acatgcatgc ttaccagcgt ggaatatcag
tcttc 3524440DNAArtificial SequencePrimer
244aaggaaaaaa gcggccgccc ctgaaccgac gaccgggtcg
4024532DNAArtificial SequencePrimer 245gctctagaac ttttcatact cccgccattc
ag 3224634DNAArtificial SequencePrimer
246gctctagagc ggatacatat ttgaatgtat ttag
3424744DNAArtificial SequencePrimer 247aaggaaaaaa gcggccgcgc ggatacatat
ttgaatgtat ttag 4424826DNAArtificial SequencePrimer
248catgccatgg ctatgattac tggtgg
2624933DNAArtificial SequencePrimer 249ccccgagctc ttacgcgccg gattggaaat
aca 3325031DNAArtificial SequencePrimer
250catgccatgg ccaaagttac aaatcaaaaa g
3125132DNAArtificial SequencePrimer 251cgagctctta aaatgatttt atatagatat
cc 3225231DNAArtificial SequencePrimer
252catgccatgg gtattccaga aactcaaaaa g
3125331DNAArtificial SequencePrimer 253cccgagctct tatttagaag tgtcaacaac g
3125447DNAArtificial SequencePrimer
254ccccgagctc aggaggatat acatatgaat aaagacacac taatacc
4725530DNAArtificial SequencePrimer 255cccaagctta gccggcaagt acacatcttc
3025645DNAArtificial SequencePrimer
256cccgagctca ggaggatata tatatgtata cagtaggaga ttacc
4525733DNAArtificial SequencePrimer 257gctctagatt atgatttatt ttgttcagca
aat 3325845DNAArtificial SequencePrimer
258cccgagctca ggaggatata tatatgtata cagtaggaga ttacc
4525933DNAArtificial SequencePrimer 259gctctagatt atgatttatt ttgttcagca
aat 3326046DNAArtificial SequencePrimer
260cgagctcagg aggatatata tatgaaaaaa gtcgcacttg ttaccg
4626131DNAArtificial SequencePrimer 261ggccggcggc cgcgcgatgg cggtgaaagt g
3126250DNAArtificial SequencePrimer
262aactaatcta gaggaggata tatatatgag catgacgttt tccggccagg
5026331DNAArtificial SequencePrimer 263ccttgcggag ggctcgatgg atgagttcga c
3126431DNAArtificial SequencePrimer
264cactttcacc gccatcgcgc ggccgccggc c
3126550DNAArtificial SequencePrimer 265gctcatatat atatcctcct ctagattagt
taaacaccat cccgccgtcg 5026631DNAArtificial SequencePrimer
266gtcgaactca tccatcgagc cctccgcaag g
3126732DNAArtificial SequencePrimer 267cccaagctta gatcgcggtg gccccgccgt
cg 3226846DNAArtificial SequencePrimer
268cgagctcagg aggatatata tatgaaaaaa gtcgcacttg ttaccg
4626932DNAArtificial SequencePrimer 269cccaagctta gatcgcggtg gccccgccgt
cg 3227043DNAArtificial SequencePrimer
270gctctagagg aggatttaaa aatggaaatt aacgaaacgc tgc
4327145DNAArtificial SequencePrimer 271tccccgcggt taagcatggc gatcccgaaa
tggaatccct ttgac 4527244DNAArtificial SequencePrimer
272ccgctcgagg aggatatata tatgagatcg aaaagatttg aagc
4427330DNAArtificial SequencePrimer 273gctctagatt agccaagttc attgggatcg
3027433DNAArtificial SequencePrimer
274cggggtacca cttttcatac tcccgccatt cag
3327525DNAArtificial SequencePrimer 275cggtaccctt tccagagatt tagag
2527630DNAArtificial SequencePrimer
276ggaattccat atgttcacaa cgtccgccta
3027727DNAArtificial SequencePrimer 277gcttgacggc catgtggccg aggccgc
2727827DNAArtificial SequencePrimer
278gcggcctcgg ccacatggcc gtcaagc
2727928DNAArtificial SequencePrimer 279cgggatcctt aggcggcctt ctggcgcg
2828030DNAArtificial SequencePrimer
280ggaattccat atggctattg caagaggtta
3028128DNAArtificial SequencePrimer 281cgggatcctt aagcgtcgag cgaggcca
2828230DNAArtificial SequencePrimer
282ggaattccat atgactaaaa caatgaaggc
3028327DNAArtificial SequencePrimer 283caccggggcc ggggtccggt attgcca
2728427DNAArtificial SequencePrimer
284tggcaatacc ggaccccggc cccggtg
2728528DNAArtificial SequencePrimer 285cgggatcctt aggcggcgag atccacga
2828630DNAArtificial SequencePrimer
286ggaattccat atgaccgggg cgaaccagcc
3028727DNAArtificial SequencePrimer 287atagccgctc atacgcctcg gttgcct
2728827DNAArtificial SequencePrimer
288aggcaaccga ggcgtatgag cggctat
2728928DNAArtificial SequencePrimer 289cgggatcctt aagcgccgtg cggaagga
2829030DNAArtificial SequencePrimer
290ggaattccat atgaccatgc atgccattca
3029128DNAArtificial SequencePrimer 291cgggatcctt attcggctgc aaattgca
2829230DNAArtificial SequencePrimer
292ggaattccat atgcgcgcgc tttattacga
3029328DNAArtificial SequencePrimer 293cgggatcctt attcgaaccg gtcgatga
2829430DNAArtificial SequencePrimer
294ggaattccat atgctggcga ttttctgtga
3029528DNAArtificial SequencePrimer 295cgggatcctt atgcgacctc caccatgc
2829630DNAArtificial SequencePrimer
296ggaattccat atgaaagcct tcgtcgtcga
3029728DNAArtificial SequencePrimer 297cgggatcctt aggatgcgta tgtaacca
2829830DNAArtificial SequencePrimer
298ggaattccat atgaaagcga ttgtcgccca
3029928DNAArtificial SequencePrimer 299cgggatcctt aggaaaaggc gatctgca
2830030DNAArtificial SequencePrimer
300ggaattccat atgccgatgg cgctcgggca
3030128DNAArtificial SequencePrimer 301cgggatcctt agaattcgat gacttgcc
2830230DNAArtificial SequencePrimer
302ggaattccat atgaaacatt ctcaggacaa
3030327DNAArtificial SequencePrimer 303gggcgccgat catgtggtgc gtttccg
2730427DNAArtificial SequencePrimer
304cggaaacgca ccacatgatc ggcgccc
2730528DNAArtificial SequencePrimer 305cgggatcctt atgccatacg ttccatat
2830630DNAArtificial SequencePrimer
306ggaattccat atgcagcgtt ttaccaacag
3030728DNAArtificial SequencePrimer 307cgggatcctt aggaaaacag gacgccgc
28308610PRTKlebsiella pneumoniae
subsp. pneumoniae MGH 78578 308Met Arg Tyr Ile Ala Gly Ile Asp Ile Gly
Asn Ser Ser Thr Glu Val1 5 10
15Ala Leu Ala Thr Val Asp Asp Ala Gly Val Leu Asn Ile Arg His Ser20
25 30Ala Leu Ala Glu Thr Thr Gly Ile Lys
Gly Thr Leu Arg Asn Val Phe35 40 45Gly
Ile Gln Glu Ala Leu Thr Gln Ala Ala Lys Ala Ala Gly Ile Gln50
55 60Leu Ser Asp Ile Ser Leu Ile Arg Ile Asn Glu
Ala Thr Pro Val Ile65 70 75
80Gly Asp Val Ala Met Glu Thr Ile Thr Glu Thr Ile Ile Thr Glu Ser85
90 95Thr Met Ile Gly His Asn Pro Lys Thr
Pro Gly Gly Val Gly Leu Gly100 105 110Val
Gly Ile Thr Ile Thr Pro Glu Ala Leu Leu Ser Cys Ser Ala Asp115
120 125Thr Pro Tyr Ile Leu Val Val Ser Ser Ala Phe
Asp Phe Ala Asp Val130 135 140Ala Ala Met
Val Asn Ala Ala Thr Ala Ala Gly Tyr Gln Ile Thr Gly145
150 155 160Ile Ile Leu Gln Gln Asp Asp
Gly Val Leu Val Asn Asn Arg Leu Gln165 170
175Gln Pro Leu Pro Val Ile Asp Glu Val Gln His Ile Asp Arg Ile Pro180
185 190Leu Gly Met Leu Ala Ala Val Glu Val
Ala Leu Pro Gly Lys Ile Ile195 200 205Glu
Thr Leu Ser Asn Pro Tyr Gly Ile Ala Thr Val Phe Asp Leu Asn210
215 220Ala Glu Glu Thr Lys Asn Ile Val Pro Met Ala
Arg Ala Leu Ile Gly225 230 235
240Asn Arg Ser Ala Val Val Val Lys Thr Pro Ser Gly Asp Val Lys
Ala245 250 255Arg Ala Ile Pro Ala Gly Asn
Leu Leu Leu Ile Ala Gln Gly Arg Ser260 265
270Val Gln Val Asp Val Ala Ala Gly Ala Glu Ala Ile Met Lys Ala Val275
280 285Asp Gly Cys Gly Lys Leu Asp Asn Val
Ala Gly Glu Ala Gly Thr Asn290 295 300Ile
Gly Gly Met Leu Glu His Val Arg Gln Thr Met Ala Glu Leu Thr305
310 315 320Asn Lys Pro Ala Gln Glu
Ile Arg Ile Gln Asp Leu Leu Ala Val Asp325 330
335Thr Ala Val Pro Val Ser Val Thr Gly Gly Leu Ala Gly Glu Phe
Ser340 345 350Leu Glu Gln Ala Val Gly Ile
Ala Ser Met Val Lys Ser Asp Arg Leu355 360
365Gln Met Ala Leu Ile Ala Arg Glu Ile Glu His Lys Leu Gln Ile Ala370
375 380Val Gln Val Gly Gly Ala Glu Ala Glu
Ala Ala Ile Leu Gly Ala Leu385 390 395
400Thr Thr Pro Gly Thr Thr Arg Pro Leu Ala Ile Leu Asp Leu
Gly Ala405 410 415Gly Ser Thr Asp Ala Ser
Ile Ile Asn Ala Gln Gly Glu Ile Ser Ala420 425
430Thr His Leu Ala Gly Ala Gly Asp Met Val Thr Met Ile Ile Ala
Arg435 440 445Glu Leu Gly Leu Glu Asp Arg
Tyr Leu Ala Glu Glu Ile Lys Lys Tyr450 455
460Pro Leu Ala Lys Val Glu Ser Leu Phe His Leu Arg His Glu Asp Gly465
470 475 480Ser Val Gln Phe
Phe Pro Ser Ala Leu Pro Pro Ala Val Phe Ala Arg485 490
495Val Cys Val Val Lys Pro Asp Glu Leu Val Pro Leu Pro Gly
Asp Leu500 505 510Pro Leu Glu Lys Val Arg
Ala Ile Arg Arg Ser Ala Lys Ser Arg Val515 520
525Phe Val Thr Asn Ala Leu Arg Ala Leu Arg Gln Val Ser Pro Thr
Gly530 535 540Asn Ile Arg Asp Ile Pro Phe
Val Val Leu Val Gly Gly Ser Ser Leu545 550
555 560Asp Phe Glu Ile Pro Gln Leu Val Thr Asp Ala Leu
Ala His Tyr Arg565 570 575Leu Val Ala Gly
Arg Gly Asn Ile Arg Gly Cys Glu Gly Pro Arg Asn580 585
590Ala Val Ala Ser Gly Leu Leu Leu Ser Trp Gln Lys Gly Gly
Thr His595 600 605Gly
Glu610309116PRTKlebsiella pneumoniae subsp. pneumoniae MGH 78578 309Met
Glu Ser Ser Val Val Ala Pro Ala Ile Val Ile Ala Val Thr Asp1
5 10 15Glu Cys Ser Glu Gln Trp Arg Asp
Val Leu Leu Gly Ile Glu Glu Glu20 25
30Gly Ile Pro Phe Val Leu Gln Pro Gln Thr Gly Gly Asp Leu Ile His35
40 45His Ala Trp Gln Ala Ala Gln Arg Ser Pro
Leu Gln Val Gly Ile Ala50 55 60Cys Asp
Arg Glu Arg Leu Ile Val His Tyr Lys Asn Leu Pro Ala Ser65
70 75 80Thr Pro Leu Phe Ser Leu Met
Tyr His Gln Asn Arg Leu Ala Arg Arg85 90
95Asn Thr Gly Asn Asn Ala Ala Arg Leu Val Lys Gly Ile Pro Phe Arg100
105 110Asp Arg His Ala115310787PRTClostridium
butyricum 310Met Ile Ser Lys Gly Phe Ser Thr Gln Thr Glu Arg Ile Asn Ile
Leu1 5 10 15Lys Ala Gln
Ile Leu Asn Ala Lys Pro Cys Val Glu Ser Glu Arg Ala20 25
30Ile Leu Ile Thr Glu Ser Phe Lys Gln Thr Glu Gly Gln
Pro Ala Ile35 40 45Leu Arg Arg Ala Leu
Ala Leu Lys His Ile Leu Glu Asn Ile Pro Ile50 55
60Thr Ile Arg Asp Gln Glu Leu Ile Val Gly Ser Leu Thr Lys Glu
Pro65 70 75 80Arg Ser
Ser Gln Val Phe Pro Glu Phe Ser Asn Lys Trp Leu Gln Asp85
90 95Glu Leu Asp Arg Leu Asn Lys Arg Thr Gly Asp Ala
Phe Gln Ile Ser100 105 110Glu Glu Ser Lys
Glu Lys Leu Lys Asp Val Phe Glu Tyr Trp Asn Gly115 120
125Lys Thr Thr Ser Glu Leu Ala Thr Ser Tyr Met Thr Glu Glu
Thr Arg130 135 140Glu Ala Val Asn Cys Asp
Val Phe Thr Val Gly Asn Tyr Tyr Tyr Asn145 150
155 160Gly Val Gly His Val Ser Val Asp Tyr Gly Lys
Val Leu Arg Val Gly165 170 175Phe Asn Gly
Ile Ile Asn Glu Ala Lys Glu Gln Leu Glu Lys Asn Arg180
185 190Ser Ile Asp Pro Asp Phe Ile Lys Lys Glu Lys Phe
Leu Asn Ser Val195 200 205Ile Ile Ser Cys
Glu Ala Ala Ile Thr Tyr Val Asn Arg Tyr Ala Lys210 215
220Lys Ala Lys Glu Ile Ala Asp Asn Thr Ser Asp Ala Lys Arg
Lys Ala225 230 235 240Glu
Leu Asn Glu Ile Ala Lys Ile Cys Ser Lys Val Ser Gly Glu Gly245
250 255Ala Lys Ser Phe Tyr Glu Ala Cys Gln Leu Phe
Trp Phe Ile His Ala260 265 270Ile Ile Asn
Ile Glu Ser Asn Gly His Ser Ile Ser Pro Ala Arg Phe275
280 285Asp Gln Tyr Met Tyr Pro Tyr Tyr Glu Asn Asp Lys
Asn Ile Thr Asp290 295 300Lys Phe Ala Gln
Glu Leu Ile Asp Cys Ile Trp Ile Lys Leu Asn Asp305 310
315 320Ile Asn Lys Val Arg Asp Glu Ile Ser
Thr Lys His Phe Gly Gly Tyr325 330 335Pro
Met Tyr Gln Asn Leu Ile Val Gly Gly Gln Asn Ser Glu Gly Lys340
345 350Asp Ala Thr Asn Lys Val Ser Tyr Met Ala Leu
Glu Ala Ala Val His355 360 365Val Lys Leu
Pro Gln Pro Ser Leu Ser Val Arg Ile Trp Asn Lys Thr370
375 380Pro Asp Glu Phe Leu Leu Arg Ala Ala Glu Leu Thr
Arg Glu Gly Leu385 390 395
400Gly Leu Pro Ala Tyr Tyr Asn Asp Glu Val Ile Ile Pro Ala Leu Val405
410 415Ser Arg Gly Leu Thr Leu Glu Asp Ala
Arg Asp Tyr Gly Ile Ile Gly420 425 430Cys
Val Glu Pro Gln Lys Pro Gly Lys Thr Glu Gly Trp His Asp Ser435
440 445Ala Phe Phe Asn Leu Ala Arg Ile Val Glu Leu
Thr Ile Asn Ser Gly450 455 460Phe Asp Lys
Asn Lys Gln Ile Gly Pro Lys Thr Gln Asn Phe Glu Glu465
470 475 480Met Lys Ser Phe Asp Glu Phe
Met Lys Ala Tyr Lys Ala Gln Met Glu485 490
495Tyr Phe Val Lys His Met Cys Cys Ala Asp Asn Cys Ile Asp Ile Ala500
505 510His Ala Glu Arg Ala Pro Leu Pro Phe
Leu Ser Ser Met Val Asp Asn515 520 525Cys
Ile Gly Lys Gly Lys Ser Leu Gln Asp Gly Gly Ala Glu Tyr Asn530
535 540Phe Ser Gly Pro Gln Gly Val Gly Val Ala Asn
Ile Gly Asp Ser Leu545 550 555
560Val Ala Val Lys Lys Ile Val Phe Asp Glu Asn Lys Ile Thr Pro
Ser565 570 575Glu Leu Lys Lys Thr Leu Asn
Asn Asp Phe Lys Asn Ser Glu Glu Ile580 585
590Gln Ala Leu Leu Lys Asn Ala Pro Lys Phe Gly Asn Asp Ile Asp Glu595
600 605Val Asp Asn Leu Ala Arg Glu Gly Ala
Leu Val Tyr Cys Arg Glu Val610 615 620Asn
Lys Tyr Thr Asn Pro Arg Gly Gly Asn Phe Gln Pro Gly Leu Tyr625
630 635 640Pro Ser Ser Ile Asn Val
Tyr Phe Gly Ser Leu Thr Gly Ala Thr Pro645 650
655Asp Gly Arg Lys Ser Gly Gln Pro Leu Ala Asp Gly Val Ser Pro
Ser660 665 670Arg Gly Cys Asp Val Ser Gly
Pro Thr Ala Ala Cys Asn Ser Val Ser675 680
685Lys Leu Asp His Phe Ile Ala Ser Asn Gly Thr Leu Phe Asn Gln Lys690
695 700Phe His Pro Ser Ala Leu Lys Gly Asp
Asn Gly Leu Met Asn Leu Ser705 710 715
720Ser Leu Ile Arg Ser Tyr Phe Asp Gln Lys Gly Phe His Val
Gln Phe725 730 735Asn Val Ile Asp Lys Lys
Ile Leu Leu Ala Ala Gln Lys Asn Pro Glu740 745
750Lys Tyr Gln Asp Leu Ile Val Arg Val Ala Gly Tyr Ser Ala Gln
Phe755 760 765Ile Ser Leu Asp Lys Ser Ile
Gln Asn Asp Ile Ile Ala Arg Thr Glu770 775
780His Val Met785311304PRTClostridium butyricum 311Met Ser Lys Glu Ile
Lys Gly Val Leu Phe Asn Ile Gln Lys Phe Ser1 5
10 15Leu His Asp Gly Pro Gly Ile Arg Thr Ile Val Phe
Phe Lys Gly Cys20 25 30Ser Met Ser Cys
Leu Trp Cys Ser Asn Pro Glu Ser Gln Asp Ile Lys35 40
45Pro Gln Val Met Phe Asn Lys Asn Leu Cys Thr Lys Cys Gly
Arg Cys50 55 60Lys Ser Gln Cys Lys Ser
Ala Ala Ile Asp Met Asn Ser Glu Tyr Arg65 70
75 80Ile Asp Lys Ser Lys Cys Thr Glu Cys Thr Lys
Cys Val Asp Asn Cys85 90 95Leu Ser Gly
Ala Leu Val Ile Glu Gly Arg Asn Tyr Ser Val Glu Asp100
105 110Val Ile Lys Glu Leu Lys Lys Asp Ser Val Gln Tyr
Arg Arg Ser Asn115 120 125Gly Gly Ile Thr
Leu Ser Gly Gly Glu Val Leu Leu Gln Pro Asp Phe130 135
140Ala Val Glu Leu Leu Lys Glu Cys Lys Ser Tyr Gly Trp His
Thr Ala145 150 155 160Ile
Glu Thr Ala Met Tyr Val Asn Ser Glu Ser Val Lys Lys Val Ile165
170 175Pro Tyr Ile Asp Leu Ala Met Ile Asp Ile Lys
Ser Met Asn Asp Glu180 185 190Ile His Arg
Lys Phe Thr Gly Val Ser Asn Glu Ile Ile Leu Gln Asn195
200 205Ile Lys Leu Ser Asp Glu Leu Ala Lys Glu Ile Ile
Ile Arg Ile Pro210 215 220Val Ile Glu Gly
Phe Asn Ala Asp Leu Gln Ser Ile Gly Ala Ile Ala225 230
235 240Gln Phe Ser Lys Ser Leu Thr Asn Leu
Lys Arg Ile Asp Leu Leu Pro245 250 255Tyr
His Asn Tyr Gly Glu Asn Lys Tyr Gln Ala Ile Gly Arg Glu Tyr260
265 270Ser Leu Lys Glu Leu Lys Ser Pro Ser Lys Asp
Lys Met Glu Arg Leu275 280 285Lys Ala Leu
Val Glu Ile Met Gly Ile Pro Cys Thr Ile Gly Ala Glu290
295 300312545PRTAzospirillum brasilense 312Met Lys Leu
Ala Glu Ala Leu Leu Arg Ala Leu Lys Asp Arg Gly Ala1 5
10 15Gln Ala Met Phe Gly Ile Pro Gly Asp Phe
Ala Leu Pro Phe Phe Lys20 25 30Val Ala
Glu Glu Thr Gln Ile Leu Pro Leu His Thr Leu Ser His Glu35
40 45Pro Ala Val Gly Phe Ala Ala Asp Ala Ala Ala Arg
Tyr Ser Ser Thr50 55 60Leu Gly Val Ala
Ala Val Thr Tyr Gly Ala Gly Ala Phe Asn Met Val65 70
75 80Asn Ala Val Ala Gly Ala Tyr Ala Glu
Lys Ser Pro Val Val Val Ile85 90 95Ser
Gly Ala Pro Gly Thr Thr Glu Gly Asn Ala Gly Leu Leu Leu His100
105 110His Gln Gly Arg Thr Leu Asp Thr Gln Phe Gln
Val Phe Lys Glu Ile115 120 125Thr Val Ala
Gln Ala Arg Leu Asp Asp Pro Ala Lys Ala Pro Ala Glu130
135 140Ile Ala Arg Val Leu Gly Ala Ala Arg Ala Gln Ser
Arg Pro Val Tyr145 150 155
160Leu Glu Ile Pro Arg Asn Met Val Asn Ala Glu Val Glu Pro Val Gly165
170 175Asp Asp Pro Ala Trp Pro Val Asp Arg
Asp Ala Leu Ala Ala Cys Ala180 185 190Asp
Glu Val Leu Ala Ala Met Arg Ser Ala Thr Ser Pro Val Leu Met195
200 205Val Cys Val Glu Val Arg Arg Tyr Gly Leu Glu
Ala Lys Val Ala Glu210 215 220Leu Ala Gln
Arg Leu Gly Val Pro Val Val Thr Thr Phe Met Gly Arg225
230 235 240Gly Leu Leu Ala Asp Ala Pro
Thr Pro Pro Leu Gly Thr Tyr Ile Gly245 250
255Val Ala Gly Asp Ala Glu Ile Thr Arg Leu Val Glu Glu Ser Asp Gly260
265 270Leu Phe Leu Leu Gly Ala Ile Leu Ser
Asp Thr Asn Phe Ala Val Ser275 280 285Gln
Arg Lys Ile Asp Leu Arg Lys Thr Ile His Ala Phe Asp Arg Ala290
295 300Val Thr Leu Gly Tyr His Thr Tyr Ala Asp Ile
Pro Leu Ala Gly Leu305 310 315
320Val Asp Ala Leu Leu Glu Arg Leu Pro Pro Ser Asp Arg Thr Thr
Arg325 330 335Gly Lys Glu Pro His Ala Tyr
Pro Thr Gly Leu Gln Ala Asp Gly Glu340 345
350Pro Ile Ala Pro Met Asp Ile Ala Arg Ala Val Asn Asp Arg Val Arg355
360 365Ala Gly Gln Glu Pro Leu Leu Ile Ala
Ala Asp Met Gly Asp Cys Leu370 375 380Phe
Thr Ala Met Asp Met Ile Asp Ala Gly Leu Met Ala Pro Gly Tyr385
390 395 400Tyr Ala Gly Met Gly Phe
Gly Val Pro Ala Gly Ile Gly Ala Gln Cys405 410
415Val Ser Gly Gly Lys Arg Ile Leu Thr Val Val Gly Asp Gly Ala
Phe420 425 430Gln Met Thr Gly Trp Glu Leu
Gly Asn Cys Arg Arg Leu Gly Ile Asp435 440
445Pro Ile Val Ile Leu Phe Asn Asn Ala Ser Trp Glu Met Leu Arg Thr450
455 460Phe Gln Pro Glu Ser Ala Phe Asn Asp
Leu Asp Asp Trp Arg Phe Ala465 470 475
480Asp Met Ala Ala Gly Met Gly Gly Asp Gly Val Arg Val Arg
Thr Arg485 490 495Ala Glu Leu Lys Ala Ala
Leu Asp Lys Ala Phe Ala Thr Arg Gly Arg500 505
510Phe Gln Leu Ile Glu Ala Met Ile Pro Arg Gly Val Leu Ser Asp
Thr515 520 525Leu Ala Arg Phe Val Gln Gly
Gln Lys Arg Leu His Ala Ala Pro Arg530 535
540Glu545313348PRTRhodococcus sp. ST-10 313Met Lys Ala Ile Gln Tyr Thr
Arg Ile Gly Ala Glu Pro Glu Leu Thr1 5 10
15Glu Ile Pro Lys Pro Glu Pro Gly Pro Gly Glu Val Leu Leu
Glu Val20 25 30Thr Ala Ala Gly Val Cys
His Ser Asp Asp Phe Ile Met Ser Leu Pro35 40
45Glu Glu Gln Tyr Thr Tyr Gly Leu Pro Leu Thr Leu Gly His Glu Gly50
55 60Ala Gly Lys Val Ala Ala Val Gly Glu
Gly Val Glu Gly Leu Asp Ile65 70 75
80Gly Thr Asn Val Val Val Tyr Gly Pro Trp Gly Cys Gly Asn
Cys Trp85 90 95His Cys Ser Gln Gly Leu
Glu Asn Tyr Cys Ser Arg Ala Gln Glu Leu100 105
110Gly Ile Asn Pro Pro Gly Leu Gly Ala Pro Gly Ala Leu Ala Glu
Phe115 120 125Met Ile Val Asp Ser Pro Arg
His Leu Val Pro Ile Gly Asp Leu Asp130 135
140Pro Val Lys Thr Val Pro Leu Thr Asp Ala Gly Leu Thr Pro Tyr His145
150 155 160Ala Ile Lys Arg
Ser Leu Pro Lys Leu Arg Gly Gly Ser Tyr Ala Val165 170
175Val Ile Gly Thr Gly Gly Leu Gly His Val Ala Ile Gln Leu
Leu Arg180 185 190His Leu Ser Ala Ala Thr
Val Ile Ala Leu Asp Val Ser Ala Asp Lys195 200
205Leu Glu Leu Ala Thr Lys Val Gly Ala His Glu Val Val Leu Ser
Asp210 215 220Lys Asp Ala Ala Glu Asn Val
Arg Lys Ile Thr Gly Ser Gln Gly Ala225 230
235 240Ala Leu Val Leu Asp Phe Val Gly Tyr Gln Pro Thr
Ile Asp Thr Ala245 250 255Met Ala Val Ala
Gly Val Gly Ser Asp Val Thr Ile Val Gly Ile Gly260 265
270Asp Gly Gln Ala His Ala Lys Val Gly Phe Phe Gln Ser Pro
Tyr Glu275 280 285Ala Ser Val Thr Val Pro
Tyr Trp Gly Ala Arg Asn Glu Leu Ile Glu290 295
300Leu Ile Asp Leu Ala His Ala Gly Ile Phe Asp Ile Ser Val Glu
Thr305 310 315 320Phe Ser
Leu Asp Asn Gly Ala Glu Ala Tyr Arg Arg Leu Ala Ala Gly325
330 335Thr Leu Ser Gly Arg Ala Val Val Val Pro Gly
Leu340 34531431DNAArtificial SequencePrimer 314catgccatgg
gactggctga ggcactgctg c
3131547DNAArtificial SequencePrimer 315cgagctcagg aggatatata tatgaaagct
atccagtaca cccgtat 4731632DNAArtificial SequencePrimer
316cgagctctta ttcgcgcggt gccgcgtgca gg
3231734DNAArtificial SequencePrimer 317gctctagatt acaggcccgg aaccacaacg
gcgc 3431846DNAArtificial SequencePrimer
318ccgctcgagg aggatatata tatgatttct aaaggcttta gcaccc
4631950DNAArtificial SequencePrimer 319acgtgatgta atctagagga ggatatatat
atgagcaaag aaattaaagg 5032050DNAArtificial SequencePrimer
320tctttgctca tatatatatc ctcctctaga ttacatcacg tgttcagtac
5032132DNAArtificial SequencePrimer 321cgagctctta ttcggcgcca atggtgcacg
gg 3232246DNAArtificial SequencePrimer
322ccgctcgagg aggatatata tatgatttct aaaggcttta gcaccc
4632332DNAArtificial SequencePrimer 323cgagctctta ttcggcgcca atggtgcacg
gg 3232426DNAArtificial SequencePrimer
324cacccaagcg atagtttata tagcgt
2632520DNAArtificial SequencePrimer 325gaaatgaacg gatattacgt
2032619DNAArtificial SequencePrimer
326cggaacaggt gattgtggt
1932726DNAArtificial SequencePrimer 327caccgcccac ttcaagatga agctgt
2632826DNAArtificial SequencePrimer
328cacccaagcg atagtttata tagcgt
2632920DNAArtificial SequencePrimer 329gtggctaagt acatgccggt
2033035DNAArtificial SequencePrimer
330ggaattccat atgacaaaga atatgacgac taaac
3533132DNAArtificial SequencePrimer 331cgggatcctt attatttccc ctgccctgca
gt 3233232DNAArtificial SequencePrimer
332ggaattccat atgagctatc aaccactttt ac
3233329DNAArtificial SequencePrimer 333cgggatcctt acagttgagc aaatgatcc
29