Patent application title: VALENCENE SYNTHASE

Inventors:  Jihane Achkar (Zurich, CH)  Theodorus Sonke (Guttecoven, NL)  Martinus Julius Beekwilder (Renkum, NL)  Hendrik Jan Bouwmeester (Wageningen, NL)  Hendrik Jan Bosch (Wageningen, NL)
Assignees:  Isobionics B.V.
IPC8 Class: AC12N988FI
USPC Class: 800297
Class name: Multicellular living organisms and unmodified parts thereof and related processes plant, seedling, plant seed, or plant part, per se mushroom
Publication date: 2014-03-13
Patent application number: 20140075600



Abstract:

The present invention relates to a valencene synthase, to a nucleic acid encoding such valencene synthase, to a host cell comprising said encoding nucleic acid sequence and to a method for preparing valencene, comprising converting farnesyl diphosphate to valencene in the presence of a valencene synthase according to the invention.

Claims:

1. A valencene synthase comprising an amino acid sequence as shown in SEQ ID NO: 2, SEQ ID NO: 4, or a functional homologue of any of these sequences, said homologue being a valencene synthase comprising an amino acid sequence which has a sequence identity of at least 40% with SEQ ID NO: 2 or SEQ ID NO: 4.

2. The valencene synthase according to claim 1, having at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 98% sequence identity with SEQ ID NO: 2 or SEQ ID NO: 4.

3. A nucleic acid, comprising a nucleic acid sequence encoding a valencene synthase according to claim 1, a complementary sequence thereof, or comprising a nucleic acid sequence hybridising with a nucleic acid sequence encoding a valencene synthase according to claim 1 under stringent conditions.

4. A nucleic acid according to claim 3, wherein the nucleic acid comprises a nucleic acid sequence as shown in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 18, SEQ ID NO: 19 or another nucleic acid sequence encoding a valencene synthase comprising a nucleic acid sequence having a sequence identity of at least 40%, at least 60%, at least 70%, at least 80%, at least 90% or at least 95% with any of the sequences shown in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 18, or SEQ ID NO: 19.

5. An expression vector comprising a nucleic acid according to claim 3.

6. A host cell, which may be an organism per se or part of a multi-cellular organism, said host cell comprising an expression vector according to claim 5, which host cell is selected from the group consisting of bacterial cells, fungal cells and plant cells.

7. The host cell according to claim 6, wherein the host cell is a bacterial cell selected from the group of gram negative bacteria.

8. The host cell according to claim 6, wherein the host cell is a fungal cell selected from the group consisting of Aspergillus, Blakeslea, Penicillium, Phaffia (Xanthophyllomyces), Pichia, Saccharomyces, and Yarrowia.

9. A transgenic plant or culture comprising transgenic plant cells, said plant or culture comprising host cells according to claim 6, wherein the host cell is of a transgenic plant selected from Nicotiana spp, Solanum spp, Cichorium intybus, Lactuca sativa, Mentha spp, Artemisia annua, tuber forming plants, such as Helianthus tuberosus, cassava and Beta vulgaris, oil crops, such as rape seed, canola, palm tree, sunflower, soybean and peanut, liquid culture plants, such as duckweed, tobacco BY2 cells and Physcomitrella patens, trees, such as pine tree and poplar.

10. A transgenic mushroom or culture comprising transgenic mushroom cells, said mushroom or culture comprising host cells according to claim 6, wherein the host cell is selected from Schizophyllum, Agaricus and Pleurotus.

11. A method for preparing valencene, comprising converting a farnesyl diphosphate to valencene in the presence of a valencene synthase according to claim 1.

12. The method according to claim 11, wherein the valencene is prepared in a host cell, a plant or plant culture, or a mushroom or mushroom culture, expressing said valencene synthase, and optionally isolating the valencene from the host cell, plant, plant culture, mushroom or mushroom culture.

13. A method for preparing nootkatone, wherein valencene prepared in a method according to claim 11 is converted into nootkatone, which conversion may comprise a regiospecific hydroxylation of valencene followed by oxidation thereby forming nootkatone, and optionally isolating the nootkatone from the host cell.

14. The method according to claim 13, wherein the nootkatone is prepared in a host cell expressing at least one enzyme catalysing a reaction step for the conversion of valencene to nootkatone.

15. An antibody to a valencene synthase according to claim 1, or a protein having binding affinity to an antigen binding part of said antibody.

16. A method for preparing a terpenoid or a terpene, the method comprising converting a substrate in the presence of an enzyme having terpenoid or terpene synthase activity, the enzyme comprising a first segment comprising a tag-peptide, and a second segment comprising a polypeptide having terpenoid or terpene synthase activity.

17. The method according to claim 16, wherein the substrate is farnesyl diphosphate.

18. The method according to claim 16, wherein the terpene or terpenoid is selected from the group consisting of valencene, amorphadiene, artemisinic acid and nootkatone.

19. The method according to claim 16, wherein the terpenoid or terpene is prepared in a host cell, the host cell expressing the enzyme having terpenoid or terpene synthase activity.

20. The method according to claim 16, wherein the tag-peptide is selected from the group consisting of maltose binding proteins, nitrogen utilization proteins, thioredoxins, SET-peptides, and functional homologues thereof.

21. An enzyme having terpenoid or terpene synthase activity, the enzyme comprising a first segment comprising a tag-peptide and a second segment comprising a polypeptide having terpenoid or terpene synthase activity.

22. The enzyme according to claim 21, wherein the tag-peptide is selected from the group consisting of maltose binding proteins, nitrogen utilization proteins, thioredoxins, tags comprising a sequence according to SEQ ID NO: 34, and functional homologues thereof.

23. A nucleic acid encoding an enzyme according to claim 21.

24. A host cell, which may be an organism per se or part of a multi-cellular organism, said host cell comprising an expression vector comprising a nucleic acid according to claim 23.

25. The host cell according to claim 24, selected from the group consisting of Rhodobacter, Escherichia and Saccharomyces.

26. The host cell according to claim 25, selected from the group consisting of R. capsulatus, R. sphaeroides, E. coli and S. cerevisiae.

27. The method of claim 7 wherein the host cell is a bacterial cell selected from the group consisting of Rhodobacter, Paracoccus and Escherichia.

28. The method of claim 27 wherein the host cell is a bacterial cell selected from the group consisting of Rhodobacter capsulatus, Rhodobacter sphaeroides, Paracoccus carotinifaciens, Paracoccus zeaxanthinifaciens and Escherichia coli.

29. The host cell of claim 8, wherein the host cell is a fungal cell selected from the group consisting of Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Blakeslea trispora, Penicillium chrysogenum, Phaffia rhodozyma (Xanthophyllomyces dendrorhous), Pichia pastoris, Saccharomyces cerevisiae, and Yarrowia lipolytica.

30. The host cell of claim 29, wherein the host is a fungal cell selected from the group consisting of Saccharomyces cerevisiae, Penicillium chrysogenum and Pichia pastoris.

31. The host cell of claim 10, wherein the host cell is selected from the group consisting of Schizophyllum commune, Agaricus bisporus, Pleurotus ostreatus and Pleurotus sapidus.

32. The method of claim 19, wherein the host cell is selected from the group consisting of Rhodobacter, Escherichia and Saccharomyces.

33. The method of claim 32 wherein the host cell is selected from the group consisting of R. capsulatus, R. sphaeroides, E. coli and S. cerevisiae.

Description:

[0001] The invention is directed to a valencene synthase, to a nucleic acid encoding said valencene synthase, to an expression vector comprising said nucleic acid, to a host cell comprising said expression vector, to a method of preparing valencene, to a method of preparing nootkatone and to a method of preparing a valencene synthase.

[0002] Many organisms have the capacity to produce a wide array of terpenes and terpenoids. Terpenes are actually or conceptually built up from 2-methylbutane residues, usually referred to as units of isoprene, which has the molecular formula C5H8. One can consider the isoprene unit as one of nature's common building blocks. The basic molecular formulae of terpenes are multiples of that formula: (C5H8)n, wherein n is the number of linked isoprene units. This is called the isoprene rule, as a result of which terpenes are also denoted as isoprenoids. The isoprene units may be linked together "head to tail" to form linear chains or they may be arranged to form rings. In their biosynthesis, terpenes are formed from the universal 5 carbon precursors isopentenyl diphosphate (IPP) and its isomer, dimethylallyl diphosphate (DMAPP). Accordingly, a terpene carbon skeleton generally comprises a multiple of 5 carbon atoms. Most common are the 5-, 10-, 15-, 20-, 30- and 40-carbon terpenes, which are referred to as hemi-, mono-, sesqui-, di-, tri- and tetraterpenes, respectively. Besides "head-to-tail" connections, tri- and tetraterpenes also contain one "tail-to-tail" connection in their centre. The terpenes may comprise further functional groups, like alcohols and their glycosides, ethers, aldehydes, ketones, carboxylic acids and esters. These functionalised terpenes are herein referred to as terpenoids. Like terpenes, terpenoids generally have a carbon skeleton having a multiple of 5 carbon atoms. It should be noted that the total number of carbons in a terpenoid does not need to be a multiple of 5, e.g. the functional group may be an ester group comprising an alkyl radical having any number of carbon atoms.
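By way of illustration only, the isoprene rule described above can be expressed as a short Python sketch. The function and variable names are hypothetical and serve only this example; the class names and carbon counts are those listed in the paragraph above.

```python
# Illustrative sketch of the isoprene rule: a terpene built from n isoprene
# units (C5H8) has the basic molecular formula (C5H8)n.

# Class names for the most common carbon counts named in the text.
TERPENE_CLASSES = {
    5: "hemiterpene",
    10: "monoterpene",
    15: "sesquiterpene",
    20: "diterpene",
    30: "triterpene",
    40: "tetraterpene",
}


def terpene_formula(n_units: int) -> str:
    """Return the basic molecular formula for n linked isoprene units."""
    if n_units < 1:
        raise ValueError("n_units must be a positive integer")
    return f"C{5 * n_units}H{8 * n_units}"


def terpene_class(n_carbons: int) -> str:
    """Return the class name for a carbon count, or 'unclassified' if the text names none."""
    return TERPENE_CLASSES.get(n_carbons, "unclassified")


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 6, 8):
        print(n, terpene_formula(n), terpene_class(5 * n))
```

For three isoprene units the sketch yields C15H24, a sesquiterpene, which is the formula of valencene.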

[0003] Apart from the definitions given above, it is important to note that the terms "terpene", "terpenoid" and "isoprenoid" are frequently used interchangeably in open as well as patent literature.

[0004] Valencene is a naturally occurring terpene, produced in specific plants, such as various citrus fruits. In these plants farnesyl diphosphate (FPP) is enzymatically converted into valencene in the presence of a valencene synthase.

[0005] Valencene is, e.g., industrially applicable as an aroma or flavour. Valencene can be obtained by distillation from citrus essential oils obtained from citrus fruits, but isolation from these oils is cumbersome because of the low valencene concentration in these fruits (0.2 to 0.6% by weight).

[0006] It has been proposed to prepare valencene microbiologically, making use of micro-organisms genetically modified by incorporation of a gene that is coding for a protein having valencene synthase activity. Thus produced valencene synthase can be used for the preparation of valencene from FPP, a conversion which might be executed as an isolated reaction (in vitro) or as part of a longer metabolic pathway eventually leading to the production of valencene from sugar (in vivo).

[0007] Several valencene synthases from citrus are known. For instance, in U.S. Pat. No. 7,273,735 and U.S. Pat. No. 7,442,785 the expression of valencene synthase from Citrus×paradisi in E. coli is described. Further, valencene synthase from Vitis vinifera has been described by Lücker et al. (Phytochemistry (2004) 65: 2649-2659). Although the expression of these valencene synthases in a host organism has been described, the actual enzymatic activity is only shown under in vitro conditions.

[0008] A number of papers also describe the activity of valencene synthases in vivo. Takahashi et al. (Biotechnol. Bioeng. (2007) 97: 170-181), for instance, report the expression of a Citrus×paradisi valencene synthase gene (accession number AF411120) in Saccharomyces cerevisiae strains that have been optimized for enhanced levels of the key intermediate FPP by amongst other things inactivating the ERG9 gene through a knockout mutation. Cultivation of the best strain in a defined minimal medium containing ergosterol to complement the ERG9 mutation for 216 h led to production of 20 mg/L valencene. Asadollahi et al. (Biotechnol. Bioeng. (2008) 99: 666-677) describe a rather similar valencene production system, which is based on the expression of a Citrus×paradisi valencene synthase gene (accession number CQ813508; 3 out of 548 amino acids difference compared to AF411120) in a S. cerevisiae strain in which the expression of the ERG9 gene was downregulated via replacement of the native ERG9 promoter with the regulatable MET3 promoter. Cultivation of this strain in a minimal medium applying a two-liquid phase fermentation with dodecane as the organic solvent resulted in the formation of 3 mg/L valencene in 60 h.

[0009] The currently known valencene synthases have a number of distinct drawbacks which are in particular undesirable when they are applied in an industrial valencene production process wherein valencene is prepared from FPP, either in an isolated reaction (in vitro), e.g. using an isolated valencene synthase or (permeabilized) whole cells, or otherwise, e.g. in a fermentative process being part of a longer metabolic pathway eventually leading to the production of valencene from sugar (in vivo). Internal research by the present inventors revealed, for instance, that overexpression of the valencene synthase from Citrus×paradisi (CQ813508) or from Citrus sinensis (AF441124) in different microorganisms (E. coli, Rhodobacter sphaeroides, Saccharomyces cerevisiae) in active form is troublesome, resulting in a severely impaired production rate of valencene. Similarly, Asadollahi et al. (Biotechnol. Bioeng. (2008) 99: 666-677) found that the low valencene synthesis in a recombinant S. cerevisiae strain was caused by poor heterologous expression of the Citrus×paradisi valencene synthase gene.

[0010] Moreover, the C.×paradisi valencene synthase, which is nearly identical to the enzyme from C. sinensis, has been found to catalyse the conversion of FPP not only into valencene but also into significant amounts of germacrene A (U.S. Pat. No. 7,442,785 B2), at neutral or mildly alkaline pH.

[0011] An incubation of this enzyme with FPP at pH 7.5, for instance, resulted in the formation of two compounds accounting for over 95% of the total reaction products formed, 30% of which was beta-elemene (a thermal rearrangement product of germacrene A) and 65% of which was valencene. The inventors further found that also under in vivo conditions, significant amounts of the germacrene A side product are formed by this enzyme; cultivation of a Rhodobacter sphaeroides strain optimised for isoprenoids production and carrying the C.×paradisi valencene synthase gene (accession number CQ813508) led to the formation of valencene and beta-elemene in 48% and 25% of the total amount of sesquiterpenes formed, respectively.
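As a worked illustration of the figures in the preceding paragraph, the following Python sketch converts the reported product percentages into a valencene to side-product ratio. It assumes, as stated above, that beta-elemene is the thermal rearrangement product of germacrene A and can therefore stand in for it; the function name is hypothetical.

```python
# Worked arithmetic on the product distributions reported in the text.
# Assumption: the beta-elemene percentage is used as a proxy for germacrene A,
# of which it is the thermal rearrangement product.


def product_ratio(main_percent: float, side_percent: float) -> float:
    """Return the ratio of main product to side product."""
    if side_percent <= 0:
        raise ValueError("side_percent must be positive")
    return main_percent / side_percent


# In vitro incubation at pH 7.5: 65% valencene, 30% beta-elemene.
in_vitro = product_ratio(65.0, 30.0)

# In vivo (Rhodobacter sphaeroides): 48% valencene, 25% beta-elemene.
in_vivo = product_ratio(48.0, 25.0)

if __name__ == "__main__":
    print(f"in vitro valencene:side product = {in_vitro:.2f}:1")
    print(f"in vivo  valencene:side product = {in_vivo:.2f}:1")
```

Both cases come out at roughly two parts valencene to one part side product for the citrus enzyme.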

[0012] The valencene synthase from grapevine Vitis vinifera (accession number AAS66358) displays a similar lack of specificity. Expression in E. coli followed by an in vitro enzyme assay showed that this synthase converts FPP into (+)-valencene (49.5% of total product) and (-)-7-epi-alpha-selinene (35.5% of total product) along with five minor products (Lücker et al. Phytochemistry (2004) 65: 2649-2659).

[0013] Besides the above enzymes with biochemically proven valencene synthase activity, the GenBank nucleic acid sequence database contains yet another entry annotated as a valencene synthase, i.e. the Perilla frutescens var. frutescens valencene synthase gene (accession number AY917195). In literature, however, nothing has been reported on this specific putative valencene synthase, so a biochemical proof for its activity and specificity is lacking.

[0014] Thus, there is a need for an alternative valencene synthase which may be used in the preparation of valencene. In particular there is a need for an alternative valencene synthase that displays an improved expression, at least in selected host cells; an alternative valencene synthase that has a high enzymatic activity at least under specific conditions, such as at a neutral or alkaline pH and/or intracellularly in the cell wherein it has been produced; and/or an alternative valencene synthase that is highly specific, in particular that has improved specificity compared to valencene synthase from Citrus×paradisi, with respect to catalysing the conversion of FPP into valencene, at least under specific conditions, such as at about neutral or at alkaline pH and/or intracellularly in the cell wherein it has been produced.

[0015] It has been found that a specific polypeptide that was hitherto unknown has valencene synthase activity and that this polypeptide can be used as a catalyst that may serve as an alternative to known valencene synthases.

[0016] Accordingly, the present invention relates to a valencene synthase comprising an amino acid sequence as shown in SEQ ID NO: 2, SEQ ID NO: 4, or a functional homologue thereof, said functional homologue being a valencene synthase comprising an amino acid sequence which has a sequence identity of at least 40%, preferably of at least 50% with SEQ ID NO: 2 or SEQ ID NO: 4. Said homologue may in particular be a valencene synthase comprising an amino acid sequence which has a sequence identity of at least 55%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% with SEQ ID NO: 2 or SEQ ID NO: 4.
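The identity thresholds above may be easier to follow with a small example. The Python sketch below is a simplified illustration only: this section does not specify how sequence identity is to be calculated, so the sketch assumes two sequences that have already been aligned (gaps written as "-") and counts identical positions over all aligned columns. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NO: 2 or SEQ ID NO: 4.

```python
# Simplified percent-identity calculation over a pre-computed pairwise alignment.
# Assumptions (not from the source): gaps are written as "-", and identity is
# the number of identical residues divided by the number of aligned columns.

# Identity thresholds (in %) named in the text.
THRESHOLDS = (40, 50, 55, 60, 70, 75, 80, 85, 90, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity between two aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list:
    """Return the thresholds from the text that a given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Toy alignment for illustration only.
    identity = percent_identity("MKTAYIAK", "MKTA-IAR")
    print(identity, thresholds_met(identity))
```

A different alignment method or a different denominator (for example the length of the shorter sequence) would give different values, so the sketch should not be read as the definition applied in this application.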

[0017] Further, the invention relates to an antibody having binding affinity to a valencene synthase according to the invention. An antibody according to the invention thus specifically binds to a valencene synthase according to the invention.

[0018] Further, the invention relates to a protein displaying immunological cross-reactivity with an antibody raised against a fragment of the amino acid sequence according to SEQ ID NO: 2 or SEQ ID NO: 4, in particular such a protein having valencene synthase activity.

[0019] The immunological cross reactivity may be assayed using an antibody raised against, or reactive with, at least one epitope of an isolated polypeptide according to the present invention having valencene synthase activity. The antibody, which may either be monoclonal or polyclonal, may be produced by methods known in the art, e.g. as described by Hudson et al., Practical Immunology, Third Edition (1989), Blackwell Scientific Publications. The immunochemical cross-reactivity may be determined using assays known in the art, an example of which is Western blotting, e.g. as described in Hudson et al., Practical Immunology, Third Edition (1989), Blackwell Scientific Publications.

[0020] The invention further relates to a nucleic acid, comprising a nucleic acid sequence encoding a valencene synthase according to the invention, or comprising a nucleic acid sequence complementary to said encoding sequence. In particular, the nucleic acid may be selected from nucleic acids comprising a nucleic acid sequence as shown in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 18, SEQ ID NO: 19 and other nucleic acid sequences encoding a valencene synthase according to the invention, said other sequences comprising a nucleic acid sequence having a sequence identity of at least 50%, in particular of at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% with the nucleic acid sequence shown in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 18 or SEQ ID NO: 19, respectively nucleic acids complementary thereto. Said other nucleic acid sequence encoding a valencene synthase according to the invention may herein after be referred to as a functional analogue.

[0021] The present invention also relates to a nucleic acid, comprising a nucleic acid sequence encoding a valencene synthase according to the invention, which hybridizes under low stringency conditions, preferably under medium stringency conditions, more preferably under high stringency conditions and most preferably under very high stringency conditions with the nucleic acid sequence shown in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 18 or SEQ ID NO: 19, respectively nucleic acids complementary thereto.

[0022] Hybridization experiments can be performed by a variety of methods, which are well available to the skilled man. General guidelines for choosing among these various methods can be found in e.g. Sambrook, J., and Russell, D. W. Molecular Cloning: A Laboratory Manual. 3d ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (2001).

[0023] Stringency of the hybridization conditions means the conditions under which the hybridization, consisting of the actual hybridization and wash steps, is performed. Wash steps are used to wash off the nucleic acids which do not hybridize with the target nucleic acid immobilized on, for example, a nitrocellulose filter. The stringency of the hybridization conditions can for example be changed by changing the salt concentration of the wash solution and/or by changing the temperature under which the wash step is performed (wash temperature). Stringency of the hybridization increases by lowering the salt concentration in the wash solution or by raising the wash temperature. For purpose of this application, low, medium, high and very high stringency conditions are in particular the following conditions and equivalents thereof: the hybridization is performed in an aqueous solution comprising 6×SSC (20×SSC stock solution is 3.0 M NaCl and 0.3 M trisodium citrate in water), 5×Denhardt's reagent (100×Denhardt's reagent is 2% (w/v) BSA Fraction V, 20% (w/v) Ficoll 400 and 2% (w/v) polyvinylpyrrolidone in water), 0.5% SDS and 100 μg/mL denatured, fragmented salmon sperm DNA, at about 45° C. for about 12 hours. After removal of non-bonded nucleic acid probe by two consecutive 5 minutes wash steps in 2×SSC, 0.1% SDS at room temperature, execution of two consecutive 5 minutes wash steps in 0.2×SSC, 0.1% SDS at room temperature is an example of low stringency, of two consecutive 15 minutes wash steps in 0.2×SSC, 0.1% SDS at 42° C. an example of medium stringency, of two consecutive 15 minutes wash steps in 0.1×SSC, 0.1% SDS at 55° C. an example of high stringency, and two consecutive 30 minutes wash steps in 0.1×SSC, 0.1% SDS at 68° C. an example of very high stringency.
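The salt concentrations implied by the SSC dilutions above can be worked out from the stated 20×SSC stock composition. The following Python sketch is an illustration under that single assumption; the names are hypothetical, and the wash table merely restates the four example conditions given in the paragraph above.

```python
# Molar concentrations of SSC dilutions, derived from the 20x stock stated in
# the text (3.0 M NaCl and 0.3 M trisodium citrate).

SSC_20X = {"NaCl": 3.0, "trisodium citrate": 0.3}  # mol/L in the 20x stock


def ssc_molarity(fold: float) -> dict:
    """Return solute concentrations (mol/L) for an SSC solution of the given fold."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {solute: conc * fold / 20.0 for solute, conc in SSC_20X.items()}


# Example final wash conditions from the text:
# (stringency, SSC fold, wash temperature, minutes per wash); two washes each.
WASHES = [
    ("low", 0.2, "room temperature", 5),
    ("medium", 0.2, "42 C", 15),
    ("high", 0.1, "55 C", 15),
    ("very high", 0.1, "68 C", 30),
]

if __name__ == "__main__":
    print("hybridization, 6x SSC:", ssc_molarity(6))
    for name, fold, temperature, minutes in WASHES:
        nacl = ssc_molarity(fold)["NaCl"]
        print(f"{name}: {fold}x SSC ({nacl:.3f} M NaCl), {temperature}, 2 x {minutes} min")
```

The output makes the trend in the text concrete: stringency rises as the NaCl concentration of the wash falls from 0.030 M to 0.015 M and as the wash temperature rises.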

[0024] A valencene synthase or nucleic acid according to the invention may be a natural compound or fragment of a compound isolated from its natural source (e.g. Chamaecyparis nootkatensis), be a chemically or enzymatically synthesised compound or fragment of a compound or a compound or fragment of a compound produced in a recombinant cell, in which recombinant cell it may be present or from which cell it may have been isolated.

[0025] The invention further relates to an expression vector comprising a nucleic acid according to the invention.

[0026] The invention further relates to a host cell, comprising an expression vector according to the invention.

[0027] The invention further relates to a method for preparing valencene, comprising converting FPP to valencene in the presence of a valencene synthase according to the invention. Four different geometric isomers of FPP can exist, i.e. 2E,6E-FPP, 2Z,6E-FPP, 2E,6Z-FPP, and 2Z,6Z-FPP. Good results have been obtained with 2E,6E-FPP, although in principle any other isomer of FPP may be a suitable substrate for an enzyme according to the invention.

[0028] The invention further relates to a method for preparing nootkatone, wherein valencene prepared in a method according to the invention is converted into nootkatone.

[0029] The invention is further directed to a method for producing a valencene synthase according to the invention, comprising culturing a host cell according to the invention under conditions conducive to the production of the valencene synthase and recovering the valencene synthase from the host cell.

[0030] Of a valencene synthase according to the invention it has been found that it is more specific towards valencene synthesis than a valencene synthase from citrus, in particular at or around neutral pH in an in vitro assay or in a method wherein valencene is synthesised intracellularly in a host cell genetically modified to produce a valencene synthase according to the invention and a citrus valencene synthase, respectively. Initial results show that under identical conditions, the amount of major side product (germacrene A) formed with the novel enzyme of the invention is significantly lower, namely a molar ratio valencene:germacrene A of 4:1 compared to 2:1 with the citrus valencene synthase.

[0031] In accordance with the invention it has been found possible to bring the valencene synthase to expression with good yield in distinct organisms. For instance, the valencene synthase has been found to be expressed well in E. coli and in Saccharomyces cerevisiae (baker's yeast). Also it has been found that in a method according to the invention wherein a valencene synthase according to the invention is expressed in an isoprenoid producing host organism (Rhodobacter sphaeroides) the valencene production is higher than in a comparative method wherein a citrus valencene synthase is expressed.

[0032] Thus, in an advantageous embodiment, the present invention provides a valencene synthase with improved specificity towards the catalysis of valencene synthesis and an improved production rate, when used in a method for preparing valencene, in particular compared to valencene synthase from citrus or another valencene synthase according to the prior art, cited herein.

[0033] Without being bound by theory, it is thought that a high specificity towards the catalysis of valencene synthesis at neutral or mildly alkaline pH is in particular considered desirable for methods wherein the valencene is prepared intracellularly, because various host cells are thought to have a neutral or slightly alkaline intracellular pH, such as a pH of 7.0-8.5 (for intracellular pH values of bacteria, see for instance: Booth, Microbiological Reviews (1985) 49: 359-378). When, for instance, E. coli cells were exposed to pH values ranging from 5.5 to 8.0, the intracellular pH was between 7.1 and 7.9 (Olsen et al., Appl. Environ. Microbiol. (2002) 68: 4145-4147). This may explain an improved specificity towards the synthesis of valencene of a valencene synthase according to the invention, also intracellularly.

[0034] The term "or" as used herein is defined as "and/or" unless specified otherwise.

[0035] The term "a" or "an" as used herein is defined as "at least one" unless specified otherwise.

[0036] When referring to a noun (e.g. a compound, an additive, etc.) in the singular, the plural is meant to be included.

[0037] The terms farnesyl diphosphate and farnesylpyrophosphate (both abbreviated as FPP) as interchangeably used herein refer to the compound 3,7,11-trimethyl-2,6,10-dodecatrien-1-yl pyrophosphate and include all known isomers of this compound.

[0038] The term "recombinant" in relation to a recombinant cell, vector, nucleic acid or the like as used herein, refers to a cell, vector, nucleic acid or the like, containing nucleic acid not naturally occurring in that cell, vector, nucleic acid or the like and/or not naturally occurring at that same location. Generally, said nucleic acid has been introduced into that strain (cell) using recombinant DNA techniques.

[0039] The term "heterologous" when used with respect to a nucleic acid (DNA or RNA) or protein refers to a nucleic acid or protein that does not occur naturally as part of the organism, cell, genome or DNA or RNA sequence in which it is present, or that is found in a cell or location or locations in the genome or DNA or RNA sequence that differ from that in which it is found in nature. Heterologous nucleic acids or proteins are not endogenous to the cell into which they are introduced, but have been obtained from another cell or synthetically or recombinantly produced. Generally, though not necessarily, such nucleic acids encode proteins that are not normally produced by the cell in which the DNA is expressed.

[0040] A gene that is endogenous to a particular host cell but has been modified from its natural form, through, for example, the use of DNA shuffling, is also called heterologous. The term "heterologous" also includes non-naturally occurring multiple copies of a naturally occurring DNA sequence. Thus, the term "heterologous" may refer to a DNA segment that is foreign or heterologous to the cell, or homologous to the cell but in a position and/or a number within the host cell nucleic acid in which the segment is not ordinarily found. Exogenous DNA segments are expressed to yield exogenous polypeptides. A "homologous" DNA sequence is a DNA sequence that is naturally associated with a host cell into which it is introduced.

[0041] Any nucleic acid or protein that one of skill in the art would recognize as heterologous or foreign to the cell in which it is expressed is herein encompassed by the term heterologous nucleic acid or protein.

[0042] The term "mutated" or "mutation" as used herein regarding proteins or polypeptides means that at least one amino acid in the wild-type or naturally occurring protein or polypeptide sequence has been replaced with a different amino acid, or deleted from, or inserted into the sequence via mutagenesis of nucleic acids encoding these amino acids. Mutagenesis is a well-known method in the art, and includes, for example, site-directed mutagenesis by means of PCR or via oligonucleotide-mediated mutagenesis as described in Sambrook, J., and Russell, D. W. Molecular Cloning: A Laboratory Manual. 3d ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (2001). The term "mutated" or "mutation" as used herein regarding genes means that at least one nucleotide in the nucleotide sequence of that gene or a regulatory sequence thereof, has been replaced with a different nucleotide, or has been deleted from or inserted into the sequence via mutagenesis.

[0043] The terms "open reading frame" and "ORF" refer to the amino acid sequence encoded between translation initiation and termination codons of a coding sequence. The terms "initiation codon" and "termination codon" refer to a unit of three adjacent nucleotides (`codon`) in a coding sequence that specifies initiation and chain termination, respectively, of protein synthesis (mRNA translation).
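The relationship between initiation codon, termination codon and the region read between them can be illustrated with a minimal Python sketch. It assumes a DNA coding strand, ATG as the initiation codon and TAA, TAG and TGA as termination codons (the standard genetic code); the function name and the example sequence are hypothetical. The sketch returns the nucleotide stretch from initiation codon through termination codon rather than the encoded amino acid sequence.

```python
# Minimal sketch: locate the first stretch of codons running from an initiation
# codon to an in-frame termination codon on the given strand.
# Assumptions (not from the source): ATG initiates, TAA/TAG/TGA terminate.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(dna: str):
    """Return the first ATG...stop stretch (inclusive), or None if there is none."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Read codon by codon in the frame set by this initiation codon.
        for i in range(start, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        # No in-frame termination codon; try the next ATG.
        start = dna.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    print(find_first_orf("ccATGGCTTGAAAtaa"))
```

Only one strand and the first qualifying initiation codon are considered here; a practical tool would also scan the complementary strand and all three reading frames.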

[0044] The term "gene" is used broadly to refer to any segment of nucleic acid associated with a biological function. Thus, genes include coding sequences and/or the regulatory sequences required for their expression. For example, gene refers to a nucleic acid fragment that expresses mRNA or functional RNA, or encodes a specific protein, and which includes regulatory sequences. Genes also include nonexpressed DNA segments that, for example, form recognition sequences for other proteins. Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences designed to have desired parameters.

[0045] The term "chimeric gene" refers to any gene that contains 1) DNA sequences, including regulatory and coding sequences, that are not found together in nature, or 2) sequences encoding parts of proteins not naturally adjoined, or 3) parts of promoters that are not naturally adjoined. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or comprise regulatory sequences and coding sequences derived from the same source, but arranged in a manner different from that found in nature.

[0046] The term "transgenic" for a transgenic cell or organism as used herein, refers to an organism or cell (which cell may be an organism per se or a cell of a multi-cellular organism from which it has been isolated) containing a nucleic acid not naturally occurring in that organism or cell and which nucleic acid has been introduced into that organism or cell (i.e. has been introduced in the organism or cell itself or in an ancestor of the organism or an ancestral organism of an organism of which the cell has been isolated) using recombinant DNA techniques.

[0047] A "transgene" refers to a gene that has been introduced into the genome by transformation and preferably is stably maintained. Transgenes may include, for example, genes that are either heterologous or homologous to the genes of a particular plant to be transformed. Additionally, transgenes may comprise native genes inserted into a non-native organism, or chimeric genes. The term "endogenous gene" refers to a native gene in its natural location in the genome of an organism. A "foreign" gene refers to a gene not normally found in the host organism but that is introduced by gene transfer.

[0048] "Transformation" and "transforming", as used herein, refers to the introduction of a heterologous nucleotide sequence into a host cell, irrespective of the method used for the insertion, for example, direct uptake, transduction, conjugation, f-mating or electroporation. The exogenous polynucleotide may be maintained as a non-integrated vector, for example, a plasmid, or alternatively, may be integrated into the host cell genome.

[0049] "Coding sequence" refers to a DNA or RNA sequence that codes for a specific amino acid sequence and excludes the non-coding sequences. It may constitute an "uninterrupted coding sequence", i.e. lacking an intron, such as in a cDNA or it may include one or more introns bound by appropriate splice junctions. An "intron" is a sequence of RNA which is contained in the primary transcript but which is removed through cleavage and re-ligation of the RNA within the cell to create the mature mRNA that can be translated into a protein.

[0050] "Regulatory sequences" refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences include enhancers, promoters, translation leader sequences, introns, and polyadenylation signal sequences. They include natural and synthetic sequences as well as sequences which may be a combination of synthetic and natural sequences. As is noted above, the term "suitable regulatory sequences" is not limited to promoters.

[0051] Examples of regulatory sequences include promoters (such as transcriptional promoters, constitutive promoters, inducible promoters), operators, or enhancers, mRNA ribosomal binding sites, and appropriate sequences which control transcription and translation initiation and termination. Nucleic acid sequences are "operably linked" when the regulatory sequence functionally relates to the cDNA sequence of the invention.

[0052] Each of the regulatory sequences may independently be selected from heterologous and homologous regulatory sequences.

[0053] "Promoter" refers to a nucleotide sequence, usually upstream (5') to its coding sequence, which controls the expression of said coding sequence by providing the recognition for RNA polymerase and other factors required for proper transcription. "Promoter" includes a minimal promoter that is a short DNA sequence comprised of a TATA box and other sequences that serve to specify the site of transcription initiation, to which regulatory elements are added for control of expression. "Promoter" also refers to a nucleotide sequence that includes a minimal promoter plus regulatory elements that is capable of controlling the expression of a coding sequence or functional RNA. This type of promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue specificity of a promoter. It is capable of operating in both orientations (normal or flipped), and is capable of functioning even when moved either upstream or downstream from the promoter. Both enhancers and other upstream promoter elements bind sequence-specific DNA-binding proteins that mediate their effects. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even be comprised of synthetic DNA segments. A promoter may also contain DNA sequences that are involved in the binding of protein factors which control the effectiveness of transcription initiation in response to physiological or developmental conditions.

[0054] The term "nucleic acid" as used herein, includes reference to a deoxyribonucleotide or ribonucleotide polymer, i.e. a polynucleotide, in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues having the essential nature of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide nucleic acids). A polynucleotide can be full-length or a subsequence of a native or heterologous structural or regulatory gene. Unless otherwise indicated, the term includes reference to the specified sequence as well as the complementary sequence thereof. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are "polynucleotides" as that term is intended herein. Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, to name just two examples, are "polynucleotides" as the term is used herein. It will be appreciated that a great variety of modifications have been made to DNA and RNA that serve many useful purposes known to those of skill in the art. The term "polynucleotide" as it is employed herein embraces such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including among other things, simple and complex cells.

[0055] Every nucleic acid sequence herein that encodes a polypeptide also, by reference to the genetic code, describes every possible silent variation of the nucleic acid. The term "conservatively modified variants" applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, the term "conservatively modified variants" refers to those nucleic acids which encode identical or conservatively modified variants of the amino acid sequences due to the degeneracy of the genetic code. The term "degeneracy of the genetic code" refers to the fact that a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations" and represent one species of conservatively modified variation. The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The essential nature of such analogues of naturally occurring amino acids is that, when incorporated into a protein, that protein is specifically reactive to antibodies elicited to the same protein but consisting entirely of naturally occurring amino acids. The terms "polypeptide", "peptide" and "protein" are also inclusive of modifications including, but not limited to, glycosylation, lipid attachment, sulphation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
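
The notion of a silent variation can be illustrated with a short Python sketch. The codon table below is deliberately partial (the four alanine codons named above plus three further standard codons), and the function names and example sequences are hypothetical and serve only this illustration.

```python
# Minimal sketch of codon degeneracy using a deliberately partial table.
# A complete genetic code table has 64 entries; only those needed for
# the hypothetical example sequences are listed here.
CODON_TABLE = {
    "GCA": "Ala", "GCC": "Ala", "GCG": "Ala", "GCU": "Ala",
    "AUG": "Met", "UGG": "Trp", "UAA": "Stop",
}


def translate(mrna):
    """Translate an mRNA string codon by codon until a stop codon."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        residues.append(residue)
    return residues


def is_silent_variation(mrna_a, mrna_b):
    """True if two different coding sequences encode the same polypeptide."""
    return mrna_a != mrna_b and translate(mrna_a) == translate(mrna_b)


original = "AUGGCAUGGUAA"  # hypothetical: Met-Ala-Trp-stop
variant = "AUGGCCUGGUAA"   # GCA replaced by GCC, still alanine
print(translate(original), is_silent_variation(original, variant))
```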

[0056] Within the context of the present application, oligomers (such as oligonucleotides, oligopeptides) are considered a species of the group of polymers. Oligomers have a relatively low number of monomeric units, in general 2-100, in particular 6-100.

[0057] "Expression cassette" as used herein means a DNA sequence capable of directing expression of a particular nucleotide sequence in an appropriate host cell, comprising a promoter operably linked to the nucleotide sequence of interest which is operably linked to termination signals. It also typically comprises sequences required for proper translation of the nucleotide sequence. The coding region usually codes for a protein of interest but may also code for a functional RNA of interest, for example antisense RNA or a nontranslated RNA, in the sense or antisense direction. The expression cassette comprising the nucleotide sequence of interest may be chimeric, meaning that at least one of its components is heterologous with respect to at least one of its other components. The expression cassette may also be one which is naturally occurring but has been obtained in a recombinant form useful for heterologous expression. The expression of the nucleotide sequence in the expression cassette may be under the control of a constitutive promoter or of an inducible promoter which initiates transcription only when the host cell is exposed to some particular external stimulus. In the case of a multicellular organism, the promoter can also be specific to a particular tissue or organ or stage of development.

[0058] The term "vector" as used herein refers to a construction comprised of genetic material designed to direct transformation of a targeted cell. A vector contains multiple genetic elements positionally and sequentially oriented, i.e., operatively linked with other necessary elements such that the nucleic acid in a nucleic acid cassette can be transcribed and when necessary, translated in the transformed cells.

[0059] In particular, the vector may be selected from the group of viral vectors, (bacterio)phages, cosmids or plasmids. The vector may also be a yeast artificial chromosome (YAC), a bacterial artificial chromosome (BAC) or Agrobacterium binary vector. The vector may be in double or single stranded linear or circular form which may or may not be self transmissible or mobilizable, and which can transform prokaryotic or eukaryotic host either by integration into the cellular genome or exist extrachromosomally (e.g. autonomous replicating plasmid with an origin of replication). Specifically included are shuttle vectors by which is meant a DNA vehicle capable, naturally or by design, of replication in two different host organisms, which may be selected from actinomycetes and related species, bacteria and eukaryotic (e.g. higher plant, mammalian, yeast or fungal cells). Preferably the nucleic acid in the vector is under the control of, and operably linked to, an appropriate promoter or other regulatory elements for transcription in a host cell such as a microbial, e.g. bacterial, or plant cell. The vector may be a bi-functional expression vector which functions in multiple hosts. In the case of genomic DNA, this may contain its own promoter or other regulatory elements and in the case of cDNA this may be under the control of an appropriate promoter or other regulatory elements for expression in the host cell.

[0060] Vectors containing a polynucleic acid according to the invention can be prepared based on methodology known in the art per se. For instance use can be made of a cDNA sequence encoding the polypeptide according to the invention operably linked to suitable regulatory elements, such as transcriptional or translational regulatory nucleic acid sequences.

[0061] The term "vector", as used herein, includes reference to a vector for standard cloning work ("cloning vector") as well as to more specialized types of vectors, such as an (autosomal) expression vector and a cloning vector used for integration into the chromosome of the host cell ("integration vector").

[0062] "Cloning vectors" typically contain one or a small number of restriction endonuclease recognition sites at which foreign DNA sequences can be inserted in a determinable fashion without loss of essential biological function of the vector, as well as a marker gene that is suitable for use in the identification and selection of cells transformed with the cloning vector.

[0063] The term "expression vector" refers to a DNA molecule, linear or circular, that comprises a segment encoding a polypeptide of interest under the control of (i.e. operably linked to) additional nucleic acid segments that provide for its transcription. Such additional segments may include promoter and terminator sequences, and may optionally include one or more origins of replication, one or more selectable markers, an enhancer, a polyadenylation signal, and the like. Expression vectors are generally derived from plasmid or viral DNA, or may contain elements of both. In particular an expression vector comprises a nucleotide sequence that comprises in the 5' to 3' direction and operably linked: (a) a transcription and translation initiation region that are recognized by the host organism, (b) a coding sequence for a polypeptide of interest, and (c) a transcription and translation termination region that are recognized by the host organism. "Plasmid" refers to autonomously replicating extrachromosomal DNA which is not integrated into a microorganism's genome and is usually circular in nature.

[0064] An "integration vector" refers to a DNA molecule, linear or circular, that can be incorporated into a microorganism's genome and provides for stable inheritance of a gene encoding a polypeptide of interest. The integration vector generally comprises one or more segments comprising a gene sequence encoding a polypeptide of interest under the control of (i.e., operably linked to) additional nucleic acid segments that provide for its transcription. Such additional segments may include promoter and terminator sequences, and one or more segments that drive the incorporation of the gene of interest into the genome of the target cell, usually by the process of homologous recombination. Typically, the integration vector will be one which can be transferred into the target cell, but which has a replicon which is nonfunctional in that organism. Integration of the segment comprising the gene of interest may be selected if an appropriate marker is included within that segment.

[0065] As used herein, the term "operably linked" or "operatively linked" refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. A control sequence "operably linked" to another control sequence and/or to a coding sequence is ligated in such a way that transcription and/or expression of the coding sequence is achieved under conditions compatible with the control sequence. Generally, operably linked means that the nucleic acid sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame.

[0066] The term "valencene synthase" is used herein for polypeptides having catalytic activity in the formation of valencene from farnesyl diphosphate, and for other moieties comprising such a polypeptide. Examples of such other moieties include complexes of said polypeptide with one or more other polypeptides, other complexes of said polypeptides (e.g. metalloprotein complexes), macromolecular compounds comprising said polypeptide and another organic moiety, said polypeptide bound to a support material, etc. The valencene synthase can be provided in its natural environment, i.e. within a cell in which it has been produced, or in the medium into which it has been excreted by the cell producing it. It can also be provided separate from the source that has produced the polypeptide and can be manipulated by attachment to a carrier, labeled with a labeling moiety, and the like.

[0067] The term "functional homologue" of a sequence, or in short "homologue", as used herein, refers to a polypeptide comprising said specific sequence with the proviso that one or more amino acids are substituted, deleted, added, and/or inserted, and which polypeptide has (qualitatively) the same enzymatic functionality for substrate conversion in case the term "functional homologue" is used for an enzyme, i.e. a homologue of the sequence with SEQ ID NO: 2 or SEQ ID NO: 4 having catalytic activity in the formation of valencene from farnesyl diphosphate. In the examples a test is described that is suitable to verify whether a polypeptide or a moiety comprising a polypeptide is a valencene synthase ("Valencene synthase activity test"). Moreover, the skilled artisan recognises that equivalent nucleotide sequences encompassed by this invention can also be defined by their ability to hybridize, under low, moderate and/or stringent conditions, with the nucleotide sequences that are within the literal scope of the instant claims.

[0068] A preferred homologue to SEQ ID NO: 2 or SEQ ID NO: 4 according to the invention has a specificity towards catalysis of valencene formation, expressed as the molar ratio valencene to germacrene A (a known side-product, formed in known valencene synthase catalysed reactions) of at least 3:1, in particular of at least 4:1, when determined at pH 7, using the valencene synthase activity test described herein below in the Examples (using a purified polypeptide). Said ratio may be infinite (1:0; i.e. no detectable amount of germacrene A formed), or up to 100:1, or up to 10:1 or up to 5:1.
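
The specificity criterion above amounts to a simple ratio check, which may be sketched as follows. The function names and the molar amounts used are hypothetical; the sketch assumes that the amounts of both products have already been measured in the activity test.

```python
import math


def valencene_to_germacrene_ratio(valencene_mol, germacrene_a_mol):
    """Molar ratio valencene : germacrene A.

    Returns infinity when no germacrene A is detected (the 1:0 case).
    """
    if valencene_mol < 0 or germacrene_a_mol < 0:
        raise ValueError("molar amounts must be non-negative")
    if germacrene_a_mol == 0:
        if valencene_mol == 0:
            raise ValueError("ratio is undefined when neither product is detected")
        return math.inf
    return valencene_mol / germacrene_a_mol


def meets_preferred_specificity(valencene_mol, germacrene_a_mol, minimum=3.0):
    """Check the 'at least 3:1' criterion (pass minimum=4.0 for 4:1)."""
    ratio = valencene_to_germacrene_ratio(valencene_mol, germacrene_a_mol)
    return ratio >= minimum


# Hypothetical measurements in arbitrary molar units.
print(meets_preferred_specificity(9.0, 2.0))               # ratio 4.5
print(meets_preferred_specificity(7.0, 2.0, minimum=4.0))  # ratio 3.5
```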

[0069] Sequence identity or similarity is defined herein as a relationship between two or more polypeptide sequences or two or more nucleic acid sequences, as determined by comparing those sequences. Usually, sequence identities or similarities are compared over the whole length of the sequences, but may however also be compared only for a part of the sequences aligning with each other. In the art, "identity" or "similarity" also means the degree of sequence relatedness between polypeptide sequences or nucleic acid sequences, as the case may be, as determined by the match between such sequences. Sequence identity as used herein is the value as determined by the EMBOSS Pairwise Alignment Algorithm "Needle", for instance at the server of the European Bioinformatics Institute (http://www.ebi.ac.uk/Tools/emboss/align/). For alignment of amino acid sequences the default parameters are: Matrix=Blosum62; Open Gap Penalty=10.0; Gap Extension Penalty=0.5. For alignment of nucleic acid sequences the default parameters are: Matrix=DNAfull; Open Gap Penalty=10.0; Gap Extension Penalty=0.5.
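
For illustration, the percentage identity of two sequences that have already been globally aligned can be computed as sketched below. The sketch does not perform the alignment itself, which is assumed to have been produced by Needle with the parameters given above. It further assumes that identity is expressed relative to the full alignment length, gap columns included. The function name and the example alignment rows are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the full alignment length, gaps included.

    Both inputs are rows of an existing pairwise global alignment, with
    '-' marking gaps, so they must have the same length.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned rows must be non-empty and equal in length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical alignment rows, not taken from SEQ ID NO: 2 or SEQ ID NO: 4.
row_a = "MAEM-FNGNSS"
row_b = "MAEMKFNG-AS"
identity = percent_identity(row_a, row_b)
print(f"{identity:.1f}% identity; at least 40%: {identity >= 40.0}")
```

The resulting value can then be compared with the thresholds recited herein, such as at least 40%, at least 60% or at least 90%.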

[0070] Discrepancies between a valencene synthase according to SEQ ID NO: 2 or SEQ ID NO: 4 or a nucleic acid according to SEQ ID NO: 1 or SEQ ID NO: 3 on hand and a functional homologue of said valencene synthase may in particular be the result of modifications performed, e.g. to improve a property of the valencene synthase or polynucleic acid (e.g. improved expression) by a biological technique known to the skilled person in the art, such as e.g. molecular evolution or rational design or by using a mutagenesis technique known in the art (random mutagenesis, site-directed mutagenesis, directed evolution, gene recombination, etc.). The valencene synthase's or the nucleic acid's sequence may be altered compared to the sequences of SEQ ID NO: 2 or SEQ ID NO: 4 and SEQ ID NO: 1 or SEQ ID NO: 3, respectively, as a result of one or more naturally occurring variations. Examples of such natural modifications/variations are differences in glycosylation (more broadly defined as "post-translational modifications"), differences due to alternative splicing, and single-nucleotide polymorphisms (SNPs). The nucleic acid may be modified such that it encodes a polypeptide that differs by at least one amino acid from the polypeptide of SEQ ID NO: 2 or SEQ ID NO: 4, so that it encodes a polypeptide comprising one or more amino acid substitutions, deletions and/or insertions compared to SEQ ID NO: 2 or SEQ ID NO: 4, which polypeptide still has valencene synthase activity. Further, use may be made of codon optimisation or codon pair optimisation, e.g. based on a method as described in WO 2008/000632 or as offered by commercial DNA synthesizing companies like DNA2.0, Geneart, and GenScript. Examples of codon optimised sequences include SEQ ID NO: 18 and SEQ ID NO: 19.

[0071] One or more sequences encoding appropriate signal peptides that are not naturally associated with the polypeptides of the invention can be incorporated into (expression) vectors. For example, a DNA sequence for a signal peptide leader can be fused in-frame to a nucleic acid sequence of the invention so that the polypeptide of the invention is initially translated as a fusion protein comprising the signal peptide. Depending on the nature of the signal peptide, the expressed polypeptide will be targeted differently. A secretory signal peptide that is functional in the intended host cells, for instance, enhances extracellular secretion of the expressed polypeptide. Other signal peptides direct the expressed polypeptides to certain organelles, like the chloroplasts, mitochondria and peroxisomes. The signal peptide can be cleaved from the polypeptide upon transportation to the intended organelle or from the cell. It is possible to provide a fusion of an additional peptide sequence at the amino or carboxyl terminal end of a polypeptide according to SEQ ID NO: 2 or SEQ ID NO: 4 or homologue thereof.
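
The requirement that a signal peptide leader be fused in-frame can be illustrated with the following Python sketch. The function name `fuse_in_frame` and the example sequences are hypothetical; the sketch only checks that the upstream sequence consists of whole codons and contains no in-frame termination codon, and makes no statement about whether a given peptide actually functions as a signal peptide.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(signal_cds, mature_cds):
    """Join a signal-peptide coding sequence to a downstream coding
    sequence, refusing fusions that would shift or interrupt the frame."""
    signal = signal_cds.upper()
    mature = mature_cds.upper()
    # A length that is not a multiple of three would shift the frame
    # of everything downstream of the junction.
    if len(signal) % 3 != 0:
        raise ValueError("signal sequence would shift the reading frame")
    codons = [signal[i:i + 3] for i in range(0, len(signal), 3)]
    # A termination codon in the leader would end translation before
    # the polypeptide of interest is reached.
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("signal sequence contains an in-frame stop codon")
    return signal + mature


# Hypothetical sequences: a three-codon leader and a short coding sequence.
fusion = fuse_in_frame("ATGAAAGCT", "GCAGCCTAA")
print(fusion)
```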

[0072] As mentioned above the invention further relates to a host cell comprising a vector according to the invention. By "host cell" is meant a cell which contains a vector and supports the replication and/or expression of the vector.

[0073] The nucleic acid of the invention is heterologous to the host cell. The host cell may be a prokaryotic cell, a eukaryotic cell or a cell from a member of the Archaea. The host cell may be from any organism, in particular any non-human organism. In particular the host cell may be selected from bacterial cells, fungal cells, archaea, protists, plant cells (including algae), cells originating from an animal (in particular isolated from said animal). The host cell may form part of a multicellular organism, other than human or the organism from which the enzyme naturally originates (such as Chamaecyparis nootkatensis in case of the valencene synthase of SEQ ID NO: 4). In a specific embodiment, host cells of the invention are in a culture of cells originating from a multicellular organism, yet isolated therefrom.

[0074] In general, the host cell is an organism comprising genes for expressing the enzymes for catalysing the reaction steps of the mevalonate pathway or another metabolic pathway (such as the deoxyxylulose-5-phosphate (DXP) pathway) enabling the production of the C5 prenyl diphosphates isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), which are the universal isoprenoid building blocks. As far as known, unless specific genes have been knocked-out, all known organisms comprise such a pathway. Eukaryotes generally are naturally capable of preparing IPP via the mevalonate pathway. This IPP is then isomerized into DMAPP by the action of the enzyme isopentenyl diphosphate isomerase (Idi). The DXP pathway, which furnishes IPP and DMAPP in a 5:1 ratio, is common to prokaryotes, although several prokaryotes are naturally capable of preparing IPP via the mevalonate pathway. These pathways are known in the art, and have been described, e.g., by Withers & Keasling in Appl. Microbiol. Biotechnol. (2007) 73: 980-990, of which the contents with respect to the description of these pathways, and in particular Figure 1 and the enzymes mentioned in said publication that play a role in one or both of said pathways, are incorporated by reference. The genes of these pathways may each independently be homologous or heterologous to the cell.

[0075] The host cells further will, either endogenously or from heterologous sources, comprise one or more genes for expressing enzymes with prenyl transferase activity catalysing the head-to-tail condensation of the C5 prenyl diphosphates producing longer prenyl diphosphates. The universal sesquiterpene precursor farnesyl diphosphate (FPP), for instance, is formed by the action of these enzymes through the successive head-to-tail addition of 2 molecules of IPP to 1 molecule of DMAPP.
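
The carbon bookkeeping of this successive condensation can be sketched as follows. The function name is hypothetical; the sketch only counts carbon atoms and does not model the enzymatic reaction.

```python
# Carbon bookkeeping for head-to-tail prenyl condensation.
C5_UNIT_CARBONS = 5  # IPP and DMAPP are both C5 prenyl diphosphates


def prenyl_chain_carbons(ipp_units):
    """Carbon atoms in the prenyl diphosphate formed from one DMAPP
    starter unit and `ipp_units` successive IPP additions."""
    if ipp_units < 0:
        raise ValueError("number of IPP units must be non-negative")
    return C5_UNIT_CARBONS * (1 + ipp_units)


print(prenyl_chain_carbons(1))  # one IPP added to DMAPP: a C10 intermediate
print(prenyl_chain_carbons(2))  # two IPP added to DMAPP: FPP, C15
```

With two IPP additions the count is 15, consistent with FPP being the C15 precursor of sesquiterpenes such as valencene.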

[0076] In an embodiment, the host cell is a bacterium. The bacterium may be gram-positive or gram-negative. Gram-positive bacteria may be selected from the genera of Bacillus and Lactobacillus, in particular from the species of Bacillus subtilis and Lactobacillus casei.

[0077] In a preferred embodiment, the bacterium is selected from the group of gram-negative bacteria, in particular from the group of Rhodobacter, Paracoccus and Escherichia, more in particular from the group of Rhodobacter capsulatus, Rhodobacter sphaeroides, Paracoccus carotinifaciens, Paracoccus zeaxanthinifaciens and Escherichia coli. Rhodobacter sphaeroides is an example of an organism naturally containing all genes needed for expressing enzymes catalysing the various reaction steps in the DXP pathway, enabling the intracellular production of IPP and DMAPP.

[0078] In an embodiment, the host cell is a fungal cell, in particular a fungal cell selected from the group of Aspergillus, Blakeslea, Penicillium, Phaffia (Xanthophyllomyces), Pichia, Saccharomyces and Yarrowia, more in particular from the group of Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Blakeslea trispora, Penicillium chrysogenum, Phaffia rhodozyma (Xanthophyllomyces dendrorhous), Pichia pastoris, Saccharomyces cerevisiae and Yarrowia lipolytica.

[0079] It is also possible to express the nucleic acids of the invention in cells derived from higher eukaryotic organisms, such as plant cells and animal cells, such as insect cells, or cells from mouse, rat or human. Said cells can be maintained in a cell or tissue culture and be used for in vitro production of valencene synthase.

[0080] A multicellular organism comprising host cells according to the invention may in particular be selected from the group of multicellular plants and mushrooms (Basidiomycetes).

[0081] Thus, in a specific embodiment, the invention relates to a transgenic plant or plant cell or tissue culture comprising transgenic plant cells, said plant or culture comprising plant host cells according to the invention. The transgenic plant or culture of transgenic plant cells may in particular be selected from Nicotiana spp., Solanum spp., Cichorium intybus, Lactuca sativa, Mentha spp., Artemisia annua, tuber forming plants, such as Helianthus tuberosus, cassava and Beta vulgaris, oil crops, such as Brassica spp., Elaeis spp. (oil palm tree), Helianthus annuus, Glycine max and Arachis hypogaea, liquid culture plants, such as duckweed Lemna spp., tobacco BY2 cells and Physcomitrella patens, trees, such as pine tree and poplar, respectively a cell culture or a tissue culture of any of said plants. In a specific embodiment, the tissue culture is a hairy root culture.

[0082] In a further specific embodiment the invention relates to a transgenic mushroom or culture comprising transgenic mushroom cells. The transgenic mushroom or culture comprising transgenic host cells, may in particular be selected from the group of Schizophyllum, Agaricus and Pleurotus, more in particular from Schizophyllum commune, the common mushroom (Agaricus bisporus), the oyster mushroom (Pleurotus ostreatus and Pleurotus sapidus), respectively a culture comprising cells of any of said mushrooms. One additional advantage of using mushrooms to express the valencene synthase is that at least some mushrooms are able to convert valencene into nootkatone (Fraatz, M. A. et al., J. Mol. Catal. B: Enzym. (2009) 61: 202-207).

[0083] Next to the production of valencene per se, expression of valencene synthase according to the invention and production of valencene in plants or mushrooms also provides resistance in these organisms. First of all, valencene is known to act as an insect repellent and is active against insects such as mosquitoes, cockroaches, ticks, fleas, termites and Drosophila. Further, valencene has been shown to make plants resistant to pathogens, such as the fungus Phytophthora, especially P. ramorum (Sudden oak death agent) (Manter, D. K. et al., Forest Pathology (2006) 36: 297-308).

[0084] A host cell according to the invention may be produced based on standard genetic and molecular biology techniques that are generally known in the art, e.g. as described in Sambrook, J., and Russell, D. W. "Molecular Cloning: A Laboratory Manual" 3d ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (2001); and F. M. Ausubel et al, eds., "Current protocols in molecular biology", John Wiley and Sons, Inc., New York (1987), and later supplements thereto.

[0085] Methods to transform Basidiomycetes are known from, for example, Alves et al. (Appl. Environ. Microbiol. (2004) 70: 6379-6384), Godio et al. (Curr. Genet. (2004) 46: 287-294), Schuurs et al. (Genetics (1997) 147: 589-596), and WO 06/096050. To achieve expression of a suitable valencene synthase gene in basidiomycetes, its complete open reading frame is typically cloned into an expression vector suitable for transformation of basidiomycetes. The expression vector preferably also comprises nucleic acid sequences that regulate transcription initiation and termination. It is also preferred to incorporate at least one selectable marker gene to allow for selection of transformants. Expression of a valencene synthase can be achieved using a basidiomycete promoter, e.g. a constitutive promoter or an inducible promoter. An example of a strong constitutive promoter is the glyceraldehyde-3-phosphate dehydrogenase (gpdA) promoter. This promoter is preferred for constitutive expression when recombinant DNA material is expressed in a basidiomycete host. Other examples are the phosphoglycerate kinase (pgk) promoter, the pyruvate kinase (pki) promoter, the triose phosphate isomerase (tpi) promoter, the ATP synthetase subunit 9 (oliC) promoter, the sc3 promoter and the acetamidase (amdS) promoter of a basidiomycete (WO 96/41882).

[0086] If needed, the primary nucleotide sequence of the valencene synthase gene can be adapted to the codon usage of the basidiomycete host. Further, expression can be directed especially to the (monokaryotic) mycelium or to the (dikaryotic) fruiting bodies. In the latter case, the Fbh1 promoter of Pleurotus is especially useful (Penas, M. M. et al., Mycologia (2004) 96: 75-82).

[0087] Methodologies for the construction of plant transformation constructs are described in the art. Overexpression can be achieved by insertion of one or more than one extra copy of the selected gene. It is not unknown for plants or their progeny, originally transformed with one or more than one extra copy of a nucleotide sequence, to exhibit overexpression.

[0088] Obtaining sufficient levels of transgenic expression in the appropriate plant tissues is an important aspect in the production of genetically engineered crops. Expression of heterologous DNA sequences in a plant host is dependent upon the presence of an operably linked promoter that is functional within the plant host. Choice of the promoter sequence will determine when and where within the organism the heterologous DNA sequence is expressed. Although many promoters from dicotyledons have been shown to be operational in monocotyledons and vice versa, ideally dicotyledonous promoters are selected for expression in dicotyledons, and monocotyledonous promoters for expression in monocotyledons. However, there is no restriction to the provenance of selected promoters; it is sufficient that they are operational in driving the expression of the nucleotide sequences in the desired cell or tissue. In some cases, expression in multiple tissues is desirable, and constitutive promoters such as the 35S promoter series may be used in this respect. However, in some of the embodiments of the present invention it is preferred that the expression in transgenic plants is leaf-specific, more preferably, the expression of the gene occurs in the leaf plastids. The promoter of the isoprene synthase gene from Populus alba (PaIspS) (Sasaki et al., FEBS Letters (2005) 579: 2514-2518) appears to drive plastid-specific expression. Hence, this promoter is a very suitable promoter for use in an expression vector of the present invention.

[0089] Other suitable leaf-specific promoters are the rbcS (Rubisco) promoter (e.g. from coffee, see WO 02/092822; from Brassica, see U.S. Pat. No. 7,115,733; from soybean, see Dhanker, O., et al., Nature Biotechnol. (2002) 20: 1140-1145), the cy-FBPase promoter (see U.S. Pat. No. 6,229,067), the promoter sequence of the light-harvesting chlorophyll a/b binding protein from oil-palm (see US 2006/0288409), the STP3 promoter from Arabidopsis thaliana (see Buttner, M. et al., Plant cell & Environ. (2001) 23: 175-184), the promoter of the bean PAL2 gene (see Sablowski, R. W. et al., Proc. Natl. Acad. Sci. USA (1995) 92: 6901-6905), enhancer sequences of the potato ST-LS1 promoter (see Stockhaus, J. et al., Proc. Natl. Acad. Sci. USA (1985) 84: 7943-7947), the wheat CAB1 promoter (see Gotor, C. et al., Plant J. (1993) 3: 509-518), the stomata-specific promoter from the potato ADP-glucose-phosphorylase gene (see U.S. Pat. No. 5,538,879), the LPSE1 element from the P(D540) gene of rice (see CN 2007/10051443), and the stomata-specific promoter pGC1 (At1g22690) from Arabidopsis thaliana (see Yang, Y. et al., Plant Methods (2008) 4: 6).

[0090] Plant species may, for instance, be transformed by the DNA-mediated transformation of plant cell protoplasts and subsequent regeneration of the plant from the transformed protoplasts in accordance with procedures well known in the art.

[0091] Further examples of methods of transforming plant cells include microinjection (Crossway et al., Mol. Gen. Genet. (1986) 202: 179-185), electroporation (Riggs, C. D. and Bates, G. W., Proc. Natl. Acad. Sci. USA (1986), 83: 5602-5606), Agrobacterium-mediated transformation (Hinchee et al., Bio/Technol. (1988) 6: 915-922), direct gene transfer (Paszkowski, J. et al., EMBO J. (1984) 3: 2717-2722), and ballistic particle acceleration using devices available from Agracetus, Inc., Madison, Wis. and BioRad, Hercules, Calif. (see, for example, Sanford et al., U.S. Pat. No. 4,945,050 and European Patent Application EP 0 332 581).

[0092] It is also possible to employ the protoplast transformation method for maize (European Patent Application EP 0 292 435, U.S. Pat. No. 5,350,689).

[0093] It is particularly preferred to use the binary type vectors of Ti and Ri plasmids of Agrobacterium spp. Ti-derived vectors transform a wide variety of higher plants, including monocotyledonous and dicotyledonous plants, such as soybean, cotton, rape, tobacco, and rice (Pacciotti et al., Bio/technol. (1985) 3: 241; Byrne M. C. et al., Plant Cell Tissue and Organ Culture (1987) 8: 3-15; Sukhapinda, K. et al., Plant Mol. Biol. (1987) 8: 209-217; Hiei, Y. et al., The Plant J. (1994) 6: 271-282). The use of T-DNA to transform plant cells has received extensive study and is amply described (e.g. EP-A 120 516). For introduction into plants, the chimeric genes of the invention can be inserted into binary vectors as described in the examples.

[0094] Other transformation methods are available to those skilled in the art, such as direct uptake of foreign DNA constructs (see EP-A 295 959), techniques of electroporation (Fromm, M. E. et al., Nature (1986), 319: 791-793) or high velocity ballistic bombardment with metal particles coated with the nucleic acid constructs (e.g. U.S. Pat. No. 4,945,050). Once transformed, the cells can be regenerated by those skilled in the art. Of particular relevance are the methods to transform foreign genes into commercially important crops, such as rapeseed (De Block, M. et al., Plant Physiol. (1989) 91: 694-701), sunflower (Everett, N. P. et al., Bio/Technology (1987) 5: 1201-1204), soybean (EP-A 301 749), rice (Hiei, Y. et al., The Plant J. (1994) 6: 271-282), and corn (Fromm et al., 1990, Bio/Technology 8: 833-839).

[0095] Those skilled in the art will appreciate that the choice of method might depend on the type of plant, i.e., monocotyledonous or dicotyledonous.

[0096] In another embodiment, the vector as described herein may be directly transformed into the plastid genome. Plastid transformation technology is extensively described in, e.g., U.S. Pat. No. 5,451,513, U.S. Pat. No. 5,545,817, U.S. Pat. No. 5,545,818 and WO 95/16783. The basic technique for chloroplast transformation involves introducing regions of cloned plastid DNA flanking a selectable marker together with the gene of interest into a suitable target tissue, e.g., using biolistics or protoplast transformation (e.g. calcium chloride or PEG mediated transformation).

[0097] Agrobacterium tumefaciens cells containing a vector according to the present invention, wherein the vector comprises a Ti plasmid, are useful in methods of making transformed plants. Plant cells are infected with an Agrobacterium tumefaciens as described above to produce a transformed plant cell, and then a plant is regenerated from the transformed plant cell. Numerous Agrobacterium vector systems useful in carrying out the present invention are known. These typically carry at least one T-DNA border sequence and include vectors such as pBIN19 (Bevan, Nucl. Acids Res. (1984) 12: 8711-8720).

[0098] Methods using either a form of direct gene transfer or Agrobacterium-mediated transfer usually, but not necessarily, are undertaken with a selectable marker which may provide resistance to an antibiotic (e.g. kanamycin, hygromycin or methotrexate) or a herbicide (e.g. phosphinothricin). The choice of selectable marker for plant transformation is not, however, critical to the invention.

[0099] General methods of culturing plant tissues are provided for example by Maki, K. Y. et al., Plant Physiol. (1993) 15: 473-497; and by Phillips, R. I. et al. In: Sprague G F, Dudley J W, eds. Corn and corn improvement. 3rd edn. Madison (1988) 345-387.

[0100] After transformation the transgenic plant cells are placed in an appropriate selective medium for selection of transgenic cells which are then grown to callus. Shoots are grown from callus and plantlets generated from the shoot by growing in rooting medium. The particular marker used will allow for selection of transformed cells as compared to cells lacking the DNA which has been introduced.

[0101] To confirm the presence of the transgenes in transgenic cells and plants, a variety of assays may be performed. Such assays include, for example, "molecular biological" assays well known to those of skill in the art, such as Southern and Northern blotting, in situ hybridization and nucleic acid-based amplification methods such as PCR or RT-PCR and "biochemical" assays, such as detecting the presence of a protein product, e.g., by immunological means (ELISAs and Western blots) or by enzymatic function. The presence of enzymatically active valencene synthase may be established by chemical analysis of the volatile products (valencene) of the plant.

[0102] A valencene synthase according to the invention may be used for the industrial production of valencene, which valencene may be used per se as a flavour or aroma, e.g. in a food product, or as a fragrance, e.g. in a household product, or as an intermediate for the production of another isoprenoid, e.g. nootkatone.

[0103] A method for producing valencene according to the invention comprises preparing valencene in the presence of valencene synthase. In principle such a method can be based on any technique for employing an enzyme in the preparation of a compound of interest.

[0104] The method can be a method wherein FPP or any of its precursors (such as farnesol, IPP, isopentenyl phosphate, 3-methylbut-3-en-1-ol and even mevalonate) is fed as a substrate to cells comprising the valencene synthase. Alternatively the method can also be a method wherein use is made of a living organism that comprises an enzyme system capable of forming FPP from a suitable carbon source, thus establishing a full fermentative route to valencene. It should be noted that the term "fermentative" is used herein in a broad sense for processes wherein use is made of a culture of an organism to synthesise a compound from a suitable feedstock (e.g. a carbohydrate, an amino acid source, a fatty acid source). Thus, fermentative processes as meant herein are not limited to anaerobic conditions, and extend to processes under aerobic conditions. Suitable feedstocks are generally known for specific species of (micro-)organisms.

[0105] Also, use may be made of the valencene synthase isolated from the cell wherein it has been produced, e.g. in a reaction system wherein the substrate (FPP) and the valencene synthase are contacted under suitable conditions (pH, solvent, temperature), which conditions may be based on the prior art referred to herein and the present disclosure, optionally in combination with some routine testing. The valencene synthase may e.g. be solubilised in an aqueous medium wherein also the FPP is present or the valencene synthase may be immobilised on a support material in a manner known in the art and then contacted with a liquid comprising the FPP. Since the enzyme has a high activity and/or selectivity towards the catalysis from FPP to valencene, the present invention is also advantageous for such an in vitro method, not only under acidic conditions, but also in case the pH is about neutral or alkaline. Suitable conditions may be based on known methodology for known valencene synthases, e.g. referred to in the literature referred to herein, the information disclosed herein, common general knowledge and optionally some routine experimentation.

[0106] In a particularly advantageous method of the invention, valencene is fermentatively prepared, i.e. by cultivating cells expressing valencene synthase in a culture medium. The actual reaction catalysed by the valencene synthase may take place intracellularly or--if the valencene synthase is excreted into the culture medium--extracellularly in the culture medium.

[0107] The cells used in a method for preparing valencene according to the invention may in particular be host cells according to the invention. If desired, these host cells may be engineered to supply the FPP to the valencene synthase in increased amounts. This can for instance be done by enhancing the flux of carbon towards FPP, which in itself can be realized in different ways. In host cells with an endogenous DXP pathway (like E. coli and R. sphaeroides) deregulation of the expression of this pathway's enzymes can have a clear positive effect on isoprenoid formation. Overexpression of dxs encoding 1-deoxy-D-xylulose-5-phosphate synthase (DXP synthase), the first enzyme of the DXP pathway and thus one of the main targets for metabolic engineering, has resulted in increased biosynthesis of several isoprenoids (e.g., Matthews and Wurtzel, Appl. Microbiol. Biotechnol. (2000) 53: 396-400; Huang et al., Bioorg. Med. Chem. (2001) 9: 2237-2242; Harker and Bramley, FEBS Lett (1999) 448: 115-119; Jones et al. Metab. Eng. (2000) 2: 328-338; and Yuan et al. Metab. Eng. (2006) 8: 79-90). Also overexpression of dxr coding for DXP isomeroreductase (also known as 1-deoxy-D-xylulose-5-phosphate reductoisomerase), the enzyme catalyzing the second and committed step in the DXP pathway, can lead to increased isoprenoid production (Albrecht et al., Biotechnol. Lett. (1999) 21: 791-795), which effect can be further increased by co-overexpressing dxs at the same time (Kim & Keasling, Biotechnol Bioeng (2001) 72: 408-415). A positive effect on isoprenoid biosynthesis was further obtained by overexpression of isopentenyl diphosphate isomerase (IPP isomerase, Idi), the enzyme that catalyzes the interconversion of IPP to dimethylallyl diphosphate, DMAPP (e.g., Kajiwara et al. Biochem. J. (1997) 324: 421-426; Misawa and Shimada, J. Biotech. (1998) 59: 169-181; and Yuan et al. Metab. Eng. (2006) 8: 79-90) and of the enzymes MEP cytidylyltransferase (also known as 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase, IspD) and 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (IspF), that are transcribed as one operon ispDF in E. coli (Yuan et al. Metab. Eng. (2006) 8: 79-90).

[0108] An alternative and more efficient approach to engineer strains with an endogenous DXP pathway for high-level production of isoprenoids is the introduction of a heterologous mevalonate pathway. Coexpression in E. coli of the Saccharomyces cerevisiae mevalonate pathway with a synthetic amorpha-4,11-diene synthase gene resulted in the formation of the sesquiterpene amorphadiene in titres of more than 110 mg/L when the recombinant E. coli strain was cultivated in an LB + glycerol medium (Martin et al. Nat. Biotechnol. (2003) 21: 796-802). This E. coli strain was subsequently improved by the introduction of extra copies of the gene tHMG1 encoding the C-terminal catalytic domain of the yeast enzyme 3-hydroxy-3-methyl-glutaryl-coenzyme A (HMG-CoA) reductase. By increasing the formation and thus the activity of this enzyme, the intracellular level of the toxic mevalonate pathway intermediate HMG-CoA was reduced, thereby overcoming growth inhibition and leading to an increased production of mevalonate (Pitera et al. Metab. Eng. (2007) 9: 193-207). Further improvement of the flux through the heterologous mevalonate pathway was obtained by codon optimization of the first three genes of this pathway in combination with replacement of the wild-type lac promoter with the two-fold stronger lac UV5 promoter (Anthony et al. Met. Eng. (2009) 11: 13-19). The production of amorphadiene could be increased even further by replacing the yeast genes for HMG-CoA synthase and HMG-CoA reductase with the equivalent genes from the gram positive bacterium Staphylococcus aureus. In combination with an optimized fermentation protocol, cultivation of this novel engineered E. coli strain yielded an amorphadiene titre of 27.4 g/L (Tsuruta et al. PLoS ONE (2009) 4(2): e4489. doi:10.1371/journal.pone.0004489). Similarly, an E. coli strain engineered with the mevalonate pathway from Streptococcus pneumoniae in combination with the Agrobacterium tumefaciens decaprenyl diphosphate synthase (ddsA) gene produced coenzyme Q10 (CoQ10) in more than 2400 μg/g cell dry weight (Zahiri et al. Met. Eng. (2006) 8: 406-416). Increased production of CoQ10 was also obtained by engineering a Rhodobacter sphaeroides strain with the mevalonate pathway from Paracoccus zeaxanthinifaciens in its native (WO 2005/005650) and a mutated form (WO 2006/018211).

[0109] Also host cells with an endogenous MEV pathway (like S. cerevisiae) have been the subject of multiple engineering studies to obtain isoprenoid-hyperproducing strains. Introduction into S. cerevisiae of the heterologous E. coli derived DXP pathway in combination with the gene encoding the Citrus valencene synthase resulted in a strain accumulating approximately 10-fold more valencene compared to the strain expressing only the valencene synthase (WO 2007/093962). Most improvements in the industrially-important yeasts Candida utilis and S. cerevisiae, however, have centred on the engineering of the homologous MEV pathway. Especially overexpression of the enzyme HMG-CoA reductase, which is believed to be the main regulatory enzyme in the MEV pathway, in its full-length or truncated version, has appeared to be an efficient method to increase production of isoprenoids. This stimulating effect of overexpression of the N-terminal truncated HMG-CoA reductase has, for instance, been observed in case of lycopene production in C. utilis (Shimada et al. Appl. Env. Microbiol. (1998) 64: 2676-2680) and epi-cedrol production in S. cerevisiae (Jackson et al. Org. Lett. (2003) 5: 1629-1632). In the last case the production of this sesquiterpene could be further enhanced by introduction of upc2-1, an allele that elicits an increase in the metabolic flux to sterol biosynthesis. Another method to increase the flux through the MEV pathway is the employment of a mevalonate kinase variant that is less sensitive to feedback inhibition by FPP and other isoprenoid precursors. WO 2006/063752, for instance, shows that Paracoccus zeaxanthinifaciens R114, a bacterium with an endogenous MEV pathway, after introduction of the S. cerevisiae mevalonate kinase mutant N66K/I152M and the ddsA gene from P. zeaxanthinifaciens ATCC 21588 produces significantly more coenzyme Q10 than the corresponding P. zeaxanthinifaciens strain expressing the wild type S. cerevisiae mevalonate kinase. Similar positive results on CoQ10 production with P. zeaxanthinifaciens R114 have also been obtained with the feedback resistant variant K93E of the P. zeaxanthinifaciens mevalonate kinase (WO 2004/111214).

[0110] A second approach to increased amounts of FPP is based on reducing or eliminating enzymatic side activities on FPP. In yeast the gene ERG9 encodes the enzyme farnesyl diphosphate farnesyl transferase (squalene synthase), which catalyzes the condensation of two farnesyl diphosphate moieties to form squalene. Because this is the first step after FPP in the sterol biosynthesis and thus regulates the flux of isoprene units into the sterol pathway, ERG9 is a frequent target in yeast metabolic engineering for increased sesquiterpene and carotenoid production. Disruption of ERG9 in combination with overexpression of the tHMG-CoA reductase in the yeast C. utilis led to increased production of lycopene (Shimada et al. Appl. Env. Microbiol. (1998) 64: 2676-2680). A similar combination of overexpression of tHMG-CoA reductase and downregulation of ERG9 using a methionine repressible promoter increased the production of the sesquiterpene amorphadiene in yeast approx. 10-fold as compared to the yeast strain only expressing the amorphadiene synthase gene (Ro et al. Nature (2006) 440: 940-943; Lenihan et al. Biotechnol. Prog. (2008) 24: 1026-1032). Since ergosterol is vital for yeast growth and yeast cells cannot assimilate externally fed ergosterol during aerobic growth, downregulation/knockout of ERG9 is frequently combined with mutations that equip the yeast strain with efficient aerobic uptake of ergosterol from the culture medium. Examples are the sue allele (Takahashi et al. Biotechnol. Bioeng. (2007) 97: 170-181) and the upc2-1 allele (Jackson et al. Org. Lett. (2003) 5: 1629-1632). Takahashi et al. (Biotechnol. Bioeng. (2007) 97: 170-181) also investigated the effect of limiting the endogenous phosphatase activity by knocking out the phosphatase gene dpp1 in yeast. Although this knockout clearly limited the dephosphorylation of FPP, as reflected by much less farnesol accumulation, it did not improve sesquiterpene production beyond that of the combined erg9/sue mutations under the growth conditions applied.

[0111] Reaction conditions for fermentatively preparing valencene may be chosen depending upon known conditions for the species of host cell used (e.g. Rhodobacter capsulatus, Rhodobacter sphaeroides, Paracoccus zeaxanthinifaciens, Escherichia coli, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Saccharomyces cerevisiae, Penicillium chrysogenum, Phaffia rhodozyma and Pichia pastoris), the information disclosed herein, common general knowledge and optionally some routine experimentation.

[0112] In principle, the pH of the reaction medium (culture medium) used in a method according to the invention may be chosen within wide limits, as long as the valencene synthase (in the host cell) is active and displays a wanted specificity under the pH conditions. In case the method includes the use of cells for expressing the valencene synthase, the pH is selected such that the cells are capable of performing their intended function or functions. The pH may in particular be chosen within the range of four pH units below neutral pH and two pH units above neutral pH, i.e. between pH 3 and pH 9 in case of an essentially aqueous system at 25° C. Good results have e.g. been achieved in an aqueous reaction medium having a pH in the range of 6.8 to 7.5.

[0113] A system is considered aqueous if water is the only solvent or the predominant solvent (>50 wt. %, in particular >90 wt. %, based on total liquids), wherein e.g. a minor amount of alcohol or another solvent (<50 wt. %, in particular <10 wt. %, based on total liquids) may be dissolved (e.g. as a carbon source, in case of a full fermentative approach) in such a concentration that micro-organisms which are present remain active.
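By way of illustration only, the water-content criterion of the preceding paragraph can be sketched as a short Python function. The function name and the returned labels are hypothetical and form no part of the disclosure; the sketch looks only at the weight percentage of water in the total liquids and does not check whether micro-organisms remain active.

```python
def classify_liquid_system(water_wt_pct: float) -> str:
    """Classify a liquid system by its water content (wt. % based on total liquids).

    Thresholds follow the definition in the text: water is the predominant
    solvent above 50 wt. %, in particular above 90 wt. %.
    """
    if not 0.0 <= water_wt_pct <= 100.0:
        raise ValueError("water content must lie between 0 and 100 wt. %")
    if water_wt_pct > 90.0:
        return "aqueous (>90 wt. % water)"
    if water_wt_pct > 50.0:
        return "aqueous (>50 wt. % water)"
    return "not aqueous"
```

For example, a medium with 95 wt. % water and 5 wt. % alcohol falls in the narrower ">90 wt. %" class, whereas a 50/50 mixture is not considered aqueous under this definition.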

[0114] In particular in case a yeast and/or a fungus is used, acidic conditions may be preferred, in particular the pH may be in the range of pH 3 to pH 8, based on an essentially aqueous system at 25° C. If desired, the pH may be adjusted using an acid and/or a base or buffered with a suitable combination of an acid and a base.

[0115] Anaerobic conditions are herein defined as conditions without any oxygen or in which substantially no oxygen is consumed by the cultured cells, in particular a micro-organism, and usually correspond to an oxygen consumption of less than 5 mmol/lh, preferably to an oxygen consumption of less than 2.5 mmol/lh, or more preferably less than 1 mmol/lh. Aerobic conditions are conditions in which a sufficient level of oxygen for unrestricted growth is dissolved in the medium, able to support a rate of oxygen consumption of at least 10 mmol/lh, more preferably more than 20 mmol/lh, even more preferably more than 50 mmol/lh, and most preferably more than 100 mmol/lh.

[0116] Oxygen-limited conditions are defined as conditions in which the oxygen consumption is limited by the oxygen transfer from the gas to the liquid. The lower limit for oxygen-limited conditions is determined by the upper limit for anaerobic conditions, i.e. usually at least 1 mmol/lh, and in particular at least 2.5 mmol/lh, or at least 5 mmol/lh. The upper limit for oxygen-limited conditions is determined by the lower limit for aerobic conditions, i.e. less than 100 mmol/lh, less than 50 mmol/lh, less than 20 mmol/lh, or less than 10 mmol/lh.
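The rate-based boundaries given above for anaerobic, oxygen-limited and aerobic conditions can be illustrated with a minimal Python sketch. The function name and parameter names are hypothetical. The default thresholds (5 and 10 mmol/lh) are the broadest values mentioned in the text; the stricter values (2.5 or 1 mmol/lh for the anaerobic limit, 20, 50 or 100 mmol/lh for the aerobic limit) can be passed instead. The sketch uses the oxygen consumption rate only and does not model whether consumption is actually limited by gas-to-liquid oxygen transfer.

```python
def classify_oxygen_regime(
    consumption_mmol_per_l_h: float,
    anaerobic_below: float = 5.0,   # upper limit for anaerobic conditions
    aerobic_from: float = 10.0,     # lower limit for aerobic conditions
) -> str:
    """Classify culture conditions from the oxygen consumption rate (mmol/lh)."""
    if consumption_mmol_per_l_h < 0:
        raise ValueError("oxygen consumption rate cannot be negative")
    if anaerobic_below > aerobic_from:
        raise ValueError("anaerobic limit must not exceed aerobic limit")
    if consumption_mmol_per_l_h < anaerobic_below:
        return "anaerobic"
    if consumption_mmol_per_l_h >= aerobic_from:
        return "aerobic"
    # Everything between the two limits is treated as oxygen-limited.
    return "oxygen-limited"
```

With the default limits, a rate of 7 mmol/lh is classified as oxygen-limited; with the strictest anaerobic limit (`anaerobic_below=1.0`), a rate of 2 mmol/lh is likewise oxygen-limited rather than anaerobic.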

[0117] Whether conditions are aerobic, anaerobic or oxygen-limited is dependent on the conditions under which the method is carried out, in particular by the amount and composition of ingoing gas flow, the actual mixing/mass transfer properties of the equipment used, the type of micro-organism used and the micro-organism density.

[0118] In principle, the temperature used is not critical, as long as the valencene synthase (in the cells) shows substantial activity. Generally, the temperature may be at least 0° C., in particular at least 15° C., more in particular at least 20° C. A desired maximum temperature depends upon the valencene synthase and the cells, in case of a method wherein use is made of cells for expressing the valencene synthase. The temperature is 70° C. or less, preferably 50° C. or less, more preferably 40° C. or less, in particular 35° C. or less.
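As a purely illustrative sketch, the tiered temperature limits of the preceding paragraph can be checked as follows. The function name is hypothetical. The lower limits (0, 15, 20° C.) and upper limits (70, 50, 40, 35° C.) are evaluated independently, since the text does not pair them; the sketch says nothing about whether a given enzyme or host is actually active at the temperature in question.

```python
from typing import Optional, Tuple

LOWER_LIMITS_C = (0.0, 15.0, 20.0)        # "at least" tiers, broadest first
UPPER_LIMITS_C = (70.0, 50.0, 40.0, 35.0)  # "or less" tiers, broadest first


def strictest_limits_met(temp_c: float) -> Tuple[Optional[float], Optional[float]]:
    """Return the strictest lower and upper limit that temp_c satisfies.

    A value of None means that not even the broadest limit on that side is met.
    """
    lower = max((t for t in LOWER_LIMITS_C if temp_c >= t), default=None)
    upper = min((t for t in UPPER_LIMITS_C if temp_c <= t), default=None)
    return lower, upper
```

A temperature of 30° C. satisfies the strictest tier on both sides (at least 20° C., 35° C. or less), whereas 45° C. satisfies only the "50° C. or less" upper tier.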

[0119] In case of a fermentative process, the incubation conditions can be chosen within wide limits as long as the cells show sufficient activity and/or growth. This includes aerobic, oxygen-limited and anaerobic conditions.

[0120] In particular if the catalytic reaction whereby valencene is formed is carried out outside a host cell, a reaction medium comprising an organic solvent may be used in a high concentration (e.g. more than 50 wt. %, or more than 90 wt. %, based on total liquids), in case the valencene synthase that is used retains sufficient activity and specificity in such a medium.

[0121] If desired, valencene produced in a method according to the invention, or a further compound into which valencene has been converted after its preparation (such as nootkatone), is recovered from the reaction medium, wherein it has been made. A suitable method is liquid-liquid extraction with an extracting liquid that is non-miscible with the reaction medium.

[0122] In particular suitable (for extraction from an aqueous reaction medium) is extraction with a liquid organic solvent, such as a liquid hydrocarbon. From initial results it is apparent that this method is also suitable to extract the valencene (or further product) from a reaction medium comprising cells according to the invention used for its production, without needing to lyse the cells for recovery of the valencene (or further product). In particular, the organic solvent may be selected from liquid alkanes, liquid long-chain alcohols (alcohols having at least 12 carbon atoms), and liquid esters of long-chain fatty acids (acids having at least 12 carbon atoms). Suitable liquid alkanes in particular include C6-C16 alkanes, such as hexane, octane, decane, dodecane, isododecane and hexadecane. Suitable long-chain aliphatic alcohols in particular include C12-C18 aliphatic alcohols, like oleyl alcohol and palmitoleyl alcohol. Suitable esters of long-chain fatty acids in particular include esters of C1-C4 alcohols of C12-C18 fatty acids, like isopropyl myristate and ethyl oleate.
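The carbon-number ranges named above for the three solvent classes can be illustrated with a minimal Python sketch. The class and function names are hypothetical, and the sketch checks carbon numbers only; it does not verify that a compound is liquid under process conditions, immiscible with the reaction medium, or biocompatible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solvent:
    name: str
    kind: str                       # "alkane", "alcohol" or "ester"
    carbons: int                    # carbons of the alkane, the alcohol, or the fatty acid part of an ester
    ester_alcohol_carbons: int = 0  # carbons of the alcohol part (esters only)


def in_particular_range(s: Solvent) -> bool:
    """Check the carbon numbers against the ranges given in the text."""
    if s.kind == "alkane":
        return 6 <= s.carbons <= 16          # C6-C16 alkanes
    if s.kind == "alcohol":
        return 12 <= s.carbons <= 18         # C12-C18 aliphatic alcohols
    if s.kind == "ester":
        # esters of C1-C4 alcohols with C12-C18 fatty acids
        return 12 <= s.carbons <= 18 and 1 <= s.ester_alcohol_carbons <= 4
    return False


candidates = [
    Solvent("dodecane", "alkane", 12),
    Solvent("oleyl alcohol", "alcohol", 18),
    Solvent("isopropyl myristate", "ester", 14, ester_alcohol_carbons=3),
    Solvent("ethanol", "alcohol", 2),
]
suitable = [s.name for s in candidates if in_particular_range(s)]
```

In this example the three solvents named in the text pass the check, while ethanol, included here only as a counter-example, does not.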

[0123] In an advantageous embodiment, valencene (or a further product) is produced in a reactor comprising a first liquid phase (the reaction phase), said first liquid phase containing cells according to the invention in which cells the valencene (or a further product) is produced, and a second liquid phase (organic phase that remains essentially phase-separated with the first phase when contacted), said second liquid phase being the extracting phase, for which the formed product has a higher affinity. This method is advantageous in that it allows in situ product recovery. Also, it contributes to preventing or at least reducing potential toxic effects of valencene (or a further product) to the cells, because due to the presence of the second phase, the valencene (or a further product) concentration in the reaction phase may be kept relatively low throughout the process. Finally, there are strong indications that the extracting phase contributes to extracting the valencene (or further product) out of the reaction phase.

[0124] In a preferred method of the invention the extracting phase forms a layer on top of the reaction phase or is mixed with the reaction phase to form a dispersion of the reaction phase in the extracting phase or a dispersion of the extracting phase in the reaction phase. Thus, the extracting phase not only extracts product from the reaction phase, but also helps to reduce or completely avoid losses of the formed product from the reactor through the off-gas, that may occur if valencene is produced in the (aqueous) reaction phase or excreted into the (aqueous) reaction phase. Valencene is poorly soluble in water and therefore easily volatilizes from water. It is contemplated that valencene solvated in the organic phase (as a layer or dispersion) is at least substantially prevented from volatilization.

[0125] Suitable liquids for use as extracting phase combine a lower density than the reaction phase with a good biocompatibility (no interference with the viability of living cells), low volatility, and near absolute immiscibility with the aqueous reaction phase. Examples of suitable liquids for this application are liquid alkanes like decane, dodecane, isododecane, tetradecane, and hexadecane or long-chain aliphatic alcohols like oleyl alcohol, and palmitoleyl alcohol, or esters of long-chain fatty acids like isopropyl myristate, and ethyl oleate (see e.g. Asadollahi et al. (Biotechnol. Bioeng. (2008) 99: 666-677), Newman et al. (Biotechnol. Bioeng. (2006) 95: 684-691) and WO 2009/042070).

[0126] The valencene produced in accordance with the invention may be used as such, e.g. for use as a flavour or fragrance, or as an insect repellent, or may be used as a starting material for another compound, in particular another flavour or fragrance. In particular, valencene may be converted into nootkatone. The conversion of valencene into nootkatone may be carried out intracellularly, or extracellularly. If this preparation is carried out inside a cell, the nootkatone is usually isolated from the host cell after its production.

[0127] Suitable manners of converting valencene to nootkatone are known in the art, e.g. as described in Fraatz et al. Appl. Microbiol. Biotechnol (2009) 83: 35-41, of which the contents are incorporated by reference, or the references cited therein.

[0128] In general, suitable methods to prepare nootkatone from valencene may be divided into: i. purely chemical methods, ii. biocatalytic methods (e.g. those using laccases in combination with a mediator), iii. bioconversion (i.e. methods applying whole living cells), and iv. full fermentation. In methods i-iii externally fed valencene is converted, whereas in method iv the valencene is produced in situ.

[0129] In a specific embodiment, the conversion comprises a regiospecific hydroxylation of valencene at the 2-position to alpha- and/or beta-nootkatol, followed by oxidation thereof forming nootkatone.

[0130] In a further embodiment, valencene is converted into the hydroperoxide of valencene, which is thereafter converted into nootkatone. U.S. Pat. No. 5,847,226 describes the chemical conversion of (+)-valencene into nootkatone in an oxygen-containing atmosphere in the presence of a hydroperoxide of an unsaturated fatty acid. This fatty acid hydroperoxide is generated in situ by, e.g., autooxidation, photooxidation or enzymatic oxidation using a lipoxygenase, after which this hydroperoxide catalyzes the autooxidation of valencene.

[0131] (+)-Valencene can be converted in high yields into nootkatone by different species of the green alga Chlorella or the fungus Botryosphaeria (Furusawa et al. Chem. Pharm. Bull. (2005) 53: 1513-1514, and JP 2003-070492).

[0132] EP-A 1 083 233 describes the preparation of nootkatone applying cell-free (biocatalytic) systems based on laccase catalyzed conversion of valencene into valencene hydroperoxide, which is subsequently degraded to form nootkatone. Optionally, a mediator and/or a solvent at a concentration that maintains laccase activity may be included.

[0133] WO 2006/079020 describes amongst other things a novel plant derived cytochrome P450 enzyme, the Premnaspirodiene oxygenase (HPO) from Hyoscyamus muticus which catalyzes the mono-hydroxylation of (+)-valencene to mainly beta-nootkatol. Nootkatone formation was only observed at very high concentrations of nootkatol (>30 μM) but only at a very low reaction rate (Takahashi et al. J. Biol. Chem. (2007) 282: 31744-31754). In the same paper, Takahashi et al. report on an HPO mutant with a 5-fold improvement in its catalytic efficiency for nootkatol biosynthesis without significantly changing the overall reaction product profiles. This nootkatol might be further oxidized to nootkatone by co-expression of an alcohol dehydrogenase enzyme in the same host cell.

[0134] Besides plant derived cytochrome P450 enzymes, also the bacterial cytochrome P450 monooxygenases P450cam and P450BM-3 and mutants thereof have been reported to oxidize (+)-valencene (Sowden et al. Org. Biomol. Chem. (2005) 3: 57-64). Whereas wild type P450cam did not catalyze this oxidation reaction, mutants showed relatively high regioselectivity for the desired C2 position in (+)-valencene, with (+)-trans-nootkatol and (+)-nootkatone constituting >85% of the products formed. The activity of these mutants was still rather low. The P450BM-3 mutants, on the other hand, displayed a higher activity but were unselective because of the multiple binding orientations of (+)-valencene in the active site. Recently, much more selective BM-3 mutants have been reported, the best of which has a C2-regioselectivity of 95% (Seifert et al. ChemBioChem (2009) 10: 853-861).

[0135] It is contemplated that one or more genes encoding an enzyme or plurality of enzymes for catalysing the conversion of valencene into nootkatone may be incorporated in a host cell according to the invention. Such enzymes may for instance be selected from the enzymes of Chlorella or Botryosphaeria, or Premnaspirodiene oxidase from Hyoscyamus muticus, or the P450cam or P450BM-3 mutants referred to herein above.

[0136] As indicated above, the invention relates to an antibody having binding affinity to a valencene synthase according to the invention. The term "antibody" includes reference to antigen binding forms of antibodies (e.g., Fab, F(ab)2). The term "antibody" frequently refers to a polypeptide substantially encoded by an immunoglobulin gene or immunoglobulin genes, or fragments thereof which specifically bind and recognize an analyte (antigen). However, while various antibody fragments can be defined in terms of the digestion of an intact antibody, one of skill will appreciate that such fragments may be synthesized de novo either chemically or by utilizing recombinant DNA methodology. Thus, the term antibody, as used herein, also includes antibody fragments such as single chain Fv, chimeric antibodies (i.e., comprising constant and variable regions from different species), humanized antibodies (i.e., comprising a complementarity determining region (CDR) from a non-human source) and heteroconjugate antibodies (e.g., bispecific antibodies).

[0137] The antibodies or fragments thereof can be produced by any method known in the art for the synthesis of antibodies, in particular, by chemical synthesis or preferably, by recombinant expression techniques.

[0138] Polyclonal antibodies to valencene synthase can be produced by various procedures well known in the art. For example, a heterologous valencene synthase can be administered to various host animals including, but not limited to, rabbits, mice, rats, etc. to induce the production of sera containing polyclonal antibodies specific for valencene synthase. Various adjuvants may be used to increase the immunological response, depending on the host species, and include but are not limited to, Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum. Such adjuvants are also well known in the art.

[0139] Monoclonal antibodies can be prepared using a wide variety of techniques known in the art including the use of hybridoma, recombinant, and phage display technologies, or a combination thereof. For example, monoclonal antibodies can be produced using hybridoma techniques including those known in the art and taught, for example, in Harlow et al., Antibodies: A Laboratory Manual, (Cold Spring Harbor Laboratory Press, 2nd ed. 1988); Hammerling, et al., in: Monoclonal Antibodies and T-Cell Hybridomas 563-681 (Elsevier, N.Y., 1981). The term "monoclonal antibody" as used herein is not limited to antibodies produced through hybridoma technology. The term "monoclonal antibody" refers to an antibody that is derived from a single clone, including any eukaryotic, prokaryotic, or phage clone, and not the method by which it is produced.

[0140] Methods for producing and screening for specific antibodies using hybridoma technology are routine and well known in the art. Briefly, mice can be immunized with valencene synthase and once an immune response is detected, e.g., antibodies specific for the valencene synthase are detected in the mouse serum, the mouse spleen is harvested and splenocytes isolated. The splenocytes are then fused by well known techniques to any suitable myeloma cells, for example cells from cell line SP20 available from the ATCC. Hybridomas are selected and cloned by limited dilution. The hybridoma clones are then assayed by methods known in the art for cells that secrete antibodies capable of binding a polypeptide of the invention. Ascites fluid, which generally contains high levels of antibodies, can be generated by immunizing mice with positive hybridoma clones.

[0141] In certain embodiments, a method of generating monoclonal antibodies comprises culturing a hybridoma cell secreting an antibody of the invention wherein, preferably, the hybridoma is generated by fusing splenocytes isolated from a mouse immunized with valencene synthase with myeloma cells and then screening the hybridomas resulting from the fusion for hybridoma clones that secrete an antibody able to bind valencene synthase. An antibody according to the invention may for instance be used in a method for isolating a valencene synthase produced in accordance with the invention, e.g. by using the antibody immobilised on a chromatographic support material.

[0142] Further, the present disclosure is directed to a method for preparing a terpenoid or a terpene, the method comprising converting a polyprenyl diphosphate substrate into the terpenoid or terpene in the presence of an enzyme, the enzyme comprising a first segment comprising a tag-peptide and a second segment comprising a polypeptide having enzymatic activity for converting a polyprenyl diphosphate into that terpene or terpenoid. An enzyme comprising said first and said second segment may herein be referred to as a "tagged enzyme".

[0143] In particular, the terpene that is prepared may be valencene, in which case the tagged enzyme has valencene synthase activity, or amorphadiene, in which case the tagged enzyme has amorphadiene synthase activity. For valencene preparation in particular use can be made of a method, an amino acid sequence, a nucleic acid sequence or a host cell as described herein.

[0144] Further, the terpene or terpenoid may amongst others be selected from the group of nootkatone and artemisinic acid. Artemisinic acid can be prepared by oxygenation/oxidation of amorphadiene in a manner known per se.

[0145] The tag-peptide is preferably selected from the group of nitrogen utilization proteins (NusA), thioredoxins (Trx), maltose-binding proteins (MBP), a peptide having the sequence: EEASVTSTEETLTPAQEAARTRAANKARKEAELAAATAEQ (the so called SET-tag, SEQ ID NO: 34), and functional homologues thereof. As used herein a functional homologue of a tag peptide is a tag peptide having at least about the same effect on the solubility of the tagged enzyme, compared to the non-tagged enzyme. Typically the homologue differs in that one or more amino acids have been inserted, substituted, deleted from or extended to the peptide of which it is a homologue. The homologue may in particular comprise one or more substitutions of a hydrophilic amino acid for another hydrophilic amino acid or of a hydrophobic amino acid for another. The homologue may in particular have a sequence identity of at least 40%, more in particular of at least 50%, preferably of at least 55%, more preferably of at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% with the sequence of a NusA, Trx, MBP or SET.
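The identity thresholds above can be illustrated with a minimal sketch. It assumes two sequences that are already aligned, gap-free and of equal length, which is a simplification and not necessarily how sequence identity is determined elsewhere in this specification. The function name `percent_identity` and the variant peptide are hypothetical and serve only as an illustration.

```python
# SET-tag sequence as given in the text (SEQ ID NO: 34).
SET_TAG = "EEASVTSTEETLTPAQEAARTRAANKARKEAELAAATAEQ"


def percent_identity(reference: str, candidate: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    Assumption: both sequences have the same length and contain no gaps.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not reference:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(reference, candidate) if a == b)
    return 100.0 * matches / len(reference)


# Hypothetical homologue: the first two glutamates (E) replaced by aspartates (D),
# i.e. a hydrophilic amino acid substituted for another hydrophilic amino acid.
variant = "DD" + SET_TAG[2:]
identity = percent_identity(SET_TAG, variant)
print(f"{identity:.1f}% identity; at least 40%: {identity >= 40.0}")
```

Identity alone would not make such a variant a functional homologue; per the definition above it would also need to have at least about the same effect on the solubility of the tagged enzyme.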

[0146] SEQ ID NO: 25 and SEQ ID NO: 24 show, respectively, a valencene synthase provided with a SET-tag and a nucleic acid sequence encoding said valencene synthase.

[0147] Particularly suitable is maltose binding protein from Escherichia coli, or a functional homologue thereof.

[0148] The use of a tagged enzyme according to the invention is in particular advantageous in that it may contribute to an increased production, especially increased cellular production of a terpenoid or a terpene, such as valencene or amorphadiene.

[0149] For improved solubility of the tagged enzyme (compared to the enzyme without the tag), the first segment of the enzyme is preferably bound at its C-terminus to the N-terminus of the second segment. Alternatively, the first segment of the tagged enzyme is bound at its N-terminus to the C-terminus of the second segment.

[0150] Further, the present disclosure is directed to a nucleic acid comprising a nucleotide sequence encoding a polypeptide, the polypeptide comprising a first segment comprising a tag-peptide, preferably an MBP, a NusA, a Trx, a SET-tag or a functional homologue of any of these, and a second segment comprising a terpenoid synthase or terpene synthase, preferably a valencene synthase or an amorphadiene synthase. The second segment may for instance comprise an amino acid sequence as shown in SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 9, SEQ ID NO: 27 or a functional homologue of any of these sequences with SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 9 or SEQ ID NO: 27.

[0151] Further, the present disclosure is directed to a host cell comprising said nucleic acid encoding said tagged terpenoid synthase or tagged terpene synthase. Specific nucleic acids according to the invention encoding a tagged enzyme are shown in SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24 and SEQ ID NO: 28. The host cell may in particular comprise a gene comprising any of these sequences or a functional analogue thereof.

[0152] SEQ ID NO: 28 shows a nucleotide sequence encoding an amorphadiene synthase with an N-terminal MBP-tag (MBP-AaaS).

[0153] Further, the present disclosure is directed to an enzyme, comprising a first segment comprising a tag-peptide and a second segment comprising a polypeptide having enzymatic activity for converting a polyprenyl diphosphate into a terpene, in particular a valencene synthase or an amorphadiene synthase, the tag-peptide preferably being selected from the group of MBP, NusA, Trx or SET. Specific enzymes comprising a tagged enzyme according to the invention are shown in SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25 and SEQ ID NO: 29 (MBP-AaaS).

[0154] The invention will now be illustrated by the following examples.

EXAMPLES

General Part

Valencene Synthase Activity Test

[0155] For verifying whether a polypeptide has valencene synthase activity the following test can be used.

[0156] In a glass tube, make a mix of 800 μL of MOPSO buffer (15 mM MOPSO (3-[N-morpholino]-2-hydroxypropane sulphonic acid) pH=7.0, 1 mM MgCl2, 0.1% Tween 20, 1 mM ascorbic acid, 1 mM dithiothreitol), 175 μL of purified polypeptide solution (as a rule of thumb providing about 100 ng of the polypeptide) and 5 μL of farnesyl diphosphate (10 mM, Sigma FPP dry-evaporated and dissolved in 0.2 M ammonium carbamate and 50% ethanol). Carefully overlay the mix with 500 μL of pentane, and incubate at 30° C. with mild agitation for 2 hours. Subsequently, collect the pentane. Then, subject the remaining water-phase to extraction with 1 mL ethyl acetate. Combine the ethyl acetate and the pentane phases and centrifuge the combination at 1,200×g. Dry over a sodium sulphate column and analyse a sample of the dried product by GC-MS. Suitably for the GC-MS analysis an Agilent Technologies system, comprising a 7980A GC system, a 597C inert MSD detector (70 eV), a 7683 auto-sampler and injector and a Phenomenex Zebron ZB-5ms column of 30 m length×0.25 mm internal diameter and 0.25 μm stationary phase, with a Guardian precolumn (5 m) may be used. In this system, inject 1 μL of the sample, under the following conditions: injection port at 250° C., splitless injection, the ZB5 column maintained at 45° C. for 2 minutes after which a gradient of 10° C. per minute is started, until 300° C. Sesquiterpene peaks are detected at 204 m/z. Compounds can be identified by their retention index and by their mass spectrum in combination with comparison of the mass spectrum to libraries (NIST or in-house developed). In this system, valencene is detected at (about) 14.125 minutes. If valencene is detected, the polypeptide is a valencene synthase.
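The quantities in the activity test can be checked with a short calculation. The sketch below derives the nominal farnesyl diphosphate concentration in the aqueous phase and the nominal oven temperature at a given time in the GC program. It assumes that the three aqueous volumes are simply additive and that the oven follows the stated program exactly; the function names are hypothetical.

```python
# Volumes and stock concentration as stated in the activity test.
BUFFER_UL = 800.0
POLYPEPTIDE_UL = 175.0
FPP_UL = 5.0
FPP_STOCK_MM = 10.0


def final_fpp_micromolar() -> float:
    """Nominal FPP concentration in the aqueous phase (pentane overlay excluded)."""
    total_ul = BUFFER_UL + POLYPEPTIDE_UL + FPP_UL
    return FPP_STOCK_MM * FPP_UL / total_ul * 1000.0


def oven_temperature_c(t_min: float, start_c: float = 45.0, hold_min: float = 2.0,
                       ramp_c_per_min: float = 10.0, final_c: float = 300.0) -> float:
    """Nominal oven temperature t_min minutes after injection."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= hold_min:
        return start_c
    return min(final_c, start_c + ramp_c_per_min * (t_min - hold_min))


def time_to_final_temperature_min(start_c: float = 45.0, hold_min: float = 2.0,
                                  ramp_c_per_min: float = 10.0,
                                  final_c: float = 300.0) -> float:
    """Minutes from injection until the final temperature is reached."""
    return hold_min + (final_c - start_c) / ramp_c_per_min


print(f"FPP in assay: {final_fpp_micromolar():.1f} uM")
print(f"Oven at 14.125 min: {oven_temperature_c(14.125):.2f} C")
print(f"Final temperature reached after {time_to_final_temperature_min():.1f} min")
```

Under these assumptions the substrate is present at roughly 51 μM, and valencene elutes while the oven is nominally at about 166° C.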

Bacteria and Culture Conditions

[0157] Rhodobacter sphaeroides strain Rs265-9c was obtained from Rhodobacter sphaeroides strain ATCC 35053 [purchased from the American Type Culture Collection (ATCC--Manassas, Va., USA--www.atcc.org); number 35053; Rhodobacter sphaeroides (van Niel) Imhoff et al., isolated from a sewage settling pond in Indiana and deposited as Rhodopseudomonas sphaeroides van Niel] after two rounds of mutagenesis and was used as the base host for construction of recombinant strains having improved production of valencene. All R. sphaeroides strains were grown at 30° C. in medium RS102 unless otherwise stated. The composition and preparation of medium RS102 is summarized in Table 1.

[0158] E. coli strains were grown at 37° C. in LB medium (Becton Dickinson, Sparks, Md., USA). For maintenance of plasmids in recombinant E. coli and R. sphaeroides strains, ampicillin (100 mg/L), chloramphenicol (30 mg/L) and/or kanamycin (25-50 mg/L, depending on the plasmid) were added to the culture medium. Liquid cultures were routinely grown aerobically in a rotary shaker at 220 rpm (see below). When solid media were required, agar (1.5% final concentration) was added.

TABLE 1
Composition and preparation of medium RS102

Component                        Amount per litre distilled water
1. Yeast extract                 20 g
2. NaCl                          0.5 g
3. MgSO4•7H2O                    0.5 g
4. D-glucose monohydrate         33 g
5. Microelements solution        2 mL
6. CaFe solution                 2 mL

Components 1-4 are mixed together, the final volume is adjusted to 1 litre. The pH is adjusted to 7.4 with 0.5M NaOH. The resulting base medium is then sterilized by filtration through a 0.22 micron membrane; 2 mL each of sterile microelements solution and sterile CaFe solution (see below) are added to give the final medium RS102. For solid medium, the 1 litre base medium mentioned above plus 15 g agar are first mixed together and autoclaved. After the medium is cooled to about 60° C., the sterile microelements and CaFe solutions (2 mL of each) are added and the molten medium is dispensed into sterile Petri plates.

Microelements solution
(NH4)2Fe(SO4)2•6H2O              80 g
ZnSO4•7H2O                       6 g
MnSO4•H2O                        2 g
NiSO4•6H2O                       0.2 g
Vitamin C                        2 g
Sterilize by filtration through a 0.22 micron membrane, store at 4° C.

CaFe solution
CaCl2•2H2O                       75 g
FeCl3•6H2O                       5 g
HCl (37%)                        3.75 ml
Sterilize by filtration through a 0.22 micron membrane, store at 4° C.
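For batch sizes other than 1 litre, the amounts in Table 1 can be scaled linearly. The following is a minimal sketch of that scaling; the dictionary keys and the function name `scale_rs102` are hypothetical, and the sketch covers only the amounts, not the order of addition or the sterilization steps described in the table.

```python
# Amounts per litre of medium RS102, taken from Table 1.
RS102_PER_LITRE = {
    "yeast extract (g)": 20.0,
    "NaCl (g)": 0.5,
    "MgSO4.7H2O (g)": 0.5,
    "D-glucose monohydrate (g)": 33.0,
    "microelements solution (mL)": 2.0,
    "CaFe solution (mL)": 2.0,
}
AGAR_G_PER_LITRE = 15.0  # added for solid medium only


def scale_rs102(volume_l: float, solid: bool = False) -> dict:
    """Return the component amounts for volume_l litres of medium RS102."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    recipe = {name: amount * volume_l for name, amount in RS102_PER_LITRE.items()}
    if solid:
        recipe["agar (g)"] = AGAR_G_PER_LITRE * volume_l
    return recipe


for name, amount in scale_rs102(0.5, solid=True).items():
    print(f"{name}: {amount:g}")
```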

Example 1

Construction of E. coli Expression Vectors

[0159] Chamaecyparis nootkatensis pendula was purchased from "Plantentuin Esveld" in Boskoop (NL). RNA was extracted from woody tissue from branches. 15 mL extraction buffer (2% hexadecyltrimethylammonium bromide, 2% polyvinylpyrrolidinone K 30, 100 mM Tris-HCl (pH 8.0), 25 mM EDTA, 2.0 M NaCl, 0.5 g/L spermidine and 2% β-mercaptoethanol (added just before use)) was warmed to 65° C. in a water bath, after which 2 g ground tissue was added and mixed completely by inverting the tube. The mixture was extracted two times with an equal volume of chloroform:isoamyl alcohol (24:1). 1/4 volume of 10 M LiCl was added to the aqueous upper layer and mixed. The RNA was precipitated overnight at 4° C. and harvested by centrifugation at 10,000×g for 20 min. The pellet was dissolved in 500 μL of SSTE (1.0 M NaCl, 0.5% SDS, 10 mM Tris-HCl (pH 8.0), 1 mM EDTA (pH 8.0)), and extracted once with an equal volume of chloroform:isoamyl alcohol. Two volumes of ethanol were added to the aqueous upper layer, incubated for at least 2 hours at -20° C., centrifuged at 13,000×g, after which the supernatant was removed. The pellet was air dried, and resuspended in water. This procedure resulted in the isolation of approx. 60 μg of total RNA per 2 g of ground tissue.

[0160] Starting from 133 μg of total RNA from Chamaecyparis nootkatensis wood, 2.7 μg of PolyA+ RNA was isolated using the mRNA Purification Kit (GE Healthcare Life Sciences, Diegem, Belgium) according to the manufacturer's instructions. This polyA+ RNA was used to generate 3'RACE cDNA, using the SMART RACE cDNA Amplification Kit (Clontech, Mountain View, Calif., USA), according to the Kit's descriptions.
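The yields reported for the RNA isolation can be expressed per gram of tissue and as a recovery percentage. The sketch below is plain arithmetic on the figures given above; the function names are hypothetical.

```python
def rna_yield_per_gram(total_rna_ug: float, tissue_g: float) -> float:
    """Total RNA yield in micrograms per gram of ground tissue."""
    if tissue_g <= 0:
        raise ValueError("tissue mass must be positive")
    return total_rna_ug / tissue_g


def polya_recovery_percent(polya_ug: float, total_rna_ug: float) -> float:
    """PolyA+ RNA recovered, as a percentage of the total RNA input."""
    if total_rna_ug <= 0:
        raise ValueError("total RNA must be positive")
    return 100.0 * polya_ug / total_rna_ug


print(f"Total RNA: {rna_yield_per_gram(60.0, 2.0):.0f} ug per g tissue")
print(f"PolyA+ recovery: {polya_recovery_percent(2.7, 133.0):.2f}% of total RNA")
```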

[0161] The full length open reading frame encoding the valencene synthase from Chamaecyparis nootkatensis according to the invention (herein below also referred to as "valC") was then amplified from the C. nootkatensis cDNA library using Phusion "proofreading polymerase" (Finnzymes, Espoo, Finland) and the following primers:

[SEQ ID NO: 5] 5'-atataggatccGGCTGAAATGTTTAATGGAAATTCCAGC-3' (BamHI recognition site underlined), and

[SEQ ID NO: 6] 5'-atatactgcagCTCTGGATCTATGGAATGATTGGTTCCAC-3' (PstI restriction site underlined).
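The presence of the intended restriction sites in the two primers can be verified with a simple string search. The sketch below uses the standard recognition sequences of BamHI (GGATCC) and PstI (CTGCAG); the dictionary and function names are hypothetical. Because both sites are palindromic, searching one strand is sufficient.

```python
# Recognition sequences of the two restriction enzymes used for cloning.
SITES = {"BamHI": "GGATCC", "PstI": "CTGCAG"}

# Primer sequences as given in the text.
PRIMERS = {
    "SEQ ID NO: 5": "atataggatccGGCTGAAATGTTTAATGGAAATTCCAGC",
    "SEQ ID NO: 6": "atatactgcagCTCTGGATCTATGGAATGATTGGTTCCAC",
}


def find_sites(primer: str) -> dict:
    """Return the 0-based position of each site in the primer, or -1 if absent."""
    sequence = primer.upper()
    return {enzyme: sequence.find(site) for enzyme, site in SITES.items()}


for name, primer in PRIMERS.items():
    print(name, find_sites(primer))
```

Each primer carries exactly the site named for it, located in the lowercase 5' extension, and neither primer contains the site of the other enzyme.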

[0162] The amplified fragment and vector pACYCDuet-1 (Novagen, Merck4Biosciences, Nottingham, UK) were digested with the restriction enzymes BamHI and PstI, followed by purification of the required DNA fragments, their subsequent ligation and finally transformation into E. coli XL1-Blue (Stratagene, La Jolla, Calif., USA) using standard procedures. Recombinant bacteria were selected on LB plates containing 30 μg/mL chloramphenicol. After overnight growth of recombinant colonies in liquid culture (3 mL LB broth with 30 μg/mL chloramphenicol, 250 rpm, 37° C.), plasmid DNA was isolated using the Qiaprep Spin Miniprep kit (Qiagen, Hilden, Germany). Isolated plasmid material was tested by restriction analysis using the enzymes BamHI and PstI. Finally, the insert of a correct vector, which was named pAC-65-3, was checked by DETT sequencing with vector primers. This cloning strategy led to the expression of ValC with an N-terminal His6-tag.

[0163] For expression of the Citrus×paradisi valencene synthase (ValF, accession number CAG29905), the full length open reading frame was prepared by custom DNA synthesis by a third party company. To improve its heterologous expression in Rhodobacter sphaeroides, this synthetic gene sequence was optimized in terms of codon usage (SEQ ID NO: 7). Furthermore, the synthetic gene comprised an NdeI restriction site at its 5'-end, which also provided the ATG start codon, and a BamHI restriction site at its 3'-end downstream of a stop codon. After digestion of this synthetic gene and vector pET-16b (Novagen) with restriction enzymes NdeI and BamHI, the correct fragments were purified and ligated, followed by transformation of E. coli TOP10 (Invitrogen, Breda, The Netherlands) using standard protocols. Recombinant bacteria were selected on LB plates containing 100 μg/mL ampicillin. After overnight growth of recombinant colonies in 5 mL LB broth with 100 μg/mL ampicillin, 250 rpm, 37° C., plasmid DNA was isolated using the Qiaprep Spin Miniprep kit (Qiagen). Finally, a correct recombinant plasmid was selected by testing for the presence of the desired insert fragment by restriction analysis using the enzymes NdeI and BamHI. This plasmid was named pET-16b-ValF.

[0164] Due to this cloning strategy, the expressed ValF enzyme also contains an N-terminal His6-tag.

Example 2

In Vitro Comparison of C. nootkatensis (Invention) Valencene Synthase and Citrus Valencene Synthase (Reference)

[0165] The control plasmid pACYCDuet-1, the pAC-65-3 construct (comprising a nucleic acid sequence encoding a valencene synthase according to the invention) and the pET-16b-ValF construct were transformed to E. coli BL21 AI (Invitrogen). For expression, a 1 mL overnight culture of the recombinant E. coli strains was prepared (LB medium with appropriate antibiotic; 30 μg chloramphenicol/mL in case of pAC-65-3 and pACYCDuet-1; 100 μg ampicillin/mL in case of pET-16b-ValF). 500 μL of that culture was transferred to 50 mL of LB medium with the appropriate antibiotic in a 250 mL Erlenmeyer flask, and incubated at 37° C., 250 rpm until the optical density at 600 nm (OD600 or A600) was 0.4 to 0.6. Subsequently, 0.02% arabinose was added and cultures were incubated overnight at 18° C. and 250 rpm. The next day, cells were harvested by centrifugation (10 min 8,000×g), medium was removed, and cells were resuspended in 1 mL Resuspension buffer (50 mM Tris-HCl pH=8.0, 300 mM NaCl, 1.4 mM 2-mercaptoethanol; 4° C.). Cells were disrupted by sonication (on ice, 5 times 10 seconds with 10 seconds break, MSE Soniprep 150, amplitude 14 μm). Insoluble particles were subsequently removed by centrifugation (10 min 13,000×g, 4° C.) yielding the cell free extract.

[0166] Soluble protein was further purified by IMAC (immobilized metal affinity chromatography) on Ni-NTA spin columns (Qiagen). Cell free extract (600 μL) was loaded on these columns, which had been pre-rinsed with Resuspension buffer, and the columns were centrifuged at 700×g for 2 min, after which the flow-through was discarded. Subsequently the columns were washed two times with 600 μL Resuspension buffer (flow-through discarded) followed by transfer of the columns to a fresh tube. 100 μL of Imidazole Elution buffer (Resuspension buffer with 175 mM imidazole) was loaded onto the column, left for 2 minutes and collected by centrifugation. This elution procedure was repeated once. For every construct, in total 200 μL eluate was transferred to a Slide-A-Lyzer Mini Dialysis Unit (10,000 MWCO; Pierce, Rockford, Ill., USA), and dialyzed for 3 hours to 1 L Storage buffer (50 mM Tris-HCl pH=7.5, 12.5% glycerol, 1.4 mM 2-mercaptoethanol) at 4° C. After dialysis, the purified enzyme preparations were immediately used in enzyme assays, which were essentially executed as the Valencene synthase activity test described above. In this case, however, all peaks in the chromatograms were detected applying the total ion count mode. Compounds were identified by their retention index and by their mass spectrum in combination with comparison of the mass spectrum to libraries (NIST and in-house). To quantify the produced compounds, the peak surface area for each relevant peak was measured from the total ion count chromatograms.

[0167] The results of these in-vitro tests are given in Table 2.

TABLE 2
Terpenoid compounds detected in the in-vitro enzyme assays with valencene synthase purified from E. coli BL21 AI cells containing pAC-65-3 (thus expressing ValC), pET-16b-ValF (thus expressing ValF) or pACYCDuet-1 (negative blank).

                                                      pAC-65-3           pET-16b-ValF       pACYCDuet-1
                                            Rf        (invention)        (reference)        (blank)
Compound                                    (min)     area               area               area
β-elemene/germacrene A                      12.75     495079 (22%)       509223 (42%)       nd
sesquiterpene I (chamigrene)                14.028    168400 (8%)        118789 (10%)       nd
valencene                                   14.126    2228164 (100%)     1207259 (100%)     nd
sesquiterpene III (selinene)                14.103    164722 (7%)        nd                 nd
sesquiterpene IV (panasinsen)               14.479    69696 (3%)         115944 (10%)       nd
sesquiterpene alcohol I (germacrene-D-ol)   15.155    203027 (9%)        nd                 nd
sesquiterpene alcohol II (eudesmadienol)    16.225    63561 (3%)         275093 (23%)       nd
farnesol                                    16.79     530588 (24%)       809363 (67%)       798326

Rf: retention time; area: peak surface area in GC-MS chromatogram; percentage indicates the percentage of the area relative to the area of the valencene; nd: not detected. Compound names between brackets indicate tentative identification.
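The percentages in Table 2 follow from dividing each peak area by the valencene peak area of the same preparation. The sketch below reproduces that calculation from the tabulated areas; the data structure and function name are hypothetical, and "nd" entries are represented as `None`.

```python
# Peak areas from Table 2 as (ValC preparation, ValF preparation); None = not detected.
PEAK_AREAS = {
    "beta-elemene/germacrene A": (495079, 509223),
    "sesquiterpene I": (168400, 118789),
    "valencene": (2228164, 1207259),
    "sesquiterpene III": (164722, None),
    "sesquiterpene IV": (69696, 115944),
    "sesquiterpene alcohol I": (203027, None),
    "sesquiterpene alcohol II": (63561, 275093),
    "farnesol": (530588, 809363),
}


def relative_percentages(column: int) -> dict:
    """Peak areas as whole-number percentages of the valencene area.

    column 0 selects the ValC preparation, column 1 the ValF preparation.
    """
    valencene_area = PEAK_AREAS["valencene"][column]
    result = {}
    for compound, areas in PEAK_AREAS.items():
        area = areas[column]
        result[compound] = None if area is None else round(100 * area / valencene_area)
    return result


for label, column in (("ValC", 0), ("ValF", 1)):
    print(label, relative_percentages(column))
```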

[0168] The valencene area of the preparation expressing ValC corresponds to 2.7 μg/mL (as calculated by comparison to a valencene standard), while the valencene area for the ValF preparation corresponds to 1.5 μg/mL. Thus, the preparation according to the invention produced 1.8 times more valencene than the ValF preparation. To verify whether this was due to the amounts of valencene synthase in both preparations or to a difference in specific activity of both valencene synthases, total protein content of both enzyme preparations was compared based on the absorption at 280 nm (A280) of a 10-fold dilution in Resuspension buffer. For the preparation comprising the ValC, A280 was 0.12; in case of ValF, A280 was 0.14; and in case of the blank, A280 was 0.18. The purified proteins were also analysed by electrophoresis on a 12.5% polyacrylamide gel with SDS, together with a protein marker (Fermentas, PAGE Ruler pre-stained protein ladder). After Coomassie Brilliant Blue staining, in each lane a number of protein bands could be observed. Bands of various mobility were observed in the blank sample as well as in the other two samples. Between 55 kilodalton and 72 kilodalton, bands that were specific for ValC and ValF were observed (not present in the blank sample). These bands probably reflect the produced sesquiterpene synthases. In the ValC sample, the specific band contained about 5% of the total protein, whereas in the ValF sample, the specific band contained about 20% of the total protein, as estimated by visual inspection. This indicated that the concentration of sesquiterpene synthase in the ValF preparation was considerably higher, possibly more than twofold higher, than in the ValC preparation. Despite the lower quantity of enzyme, the preparation comprising ValC produced considerably more valencene (see above). Thus, this example shows that a valencene synthase according to the invention has a considerably higher specific enzymatic activity with respect to valencene synthesis than a known valencene synthase from citrus.
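The reasoning above can be made concrete with a rough, purely illustrative calculation. The sketch below assumes that A280 is proportional to total protein and that the visually estimated band fraction is proportional to the synthase share of that protein; both are coarse assumptions, so the outcome is an order-of-magnitude indication and not a measured specific activity. All names are hypothetical.

```python
# Figures stated in the text for the two preparations.
VALENCENE_UG_PER_ML = {"ValC": 2.7, "ValF": 1.5}
A280 = {"ValC": 0.12, "ValF": 0.14}            # 10-fold dilutions
BAND_FRACTION = {"ValC": 0.05, "ValF": 0.20}   # visual estimates from the gel


def synthase_proxy(name: str) -> float:
    """Relative synthase amount: total-protein proxy (A280) times band fraction."""
    return A280[name] * BAND_FRACTION[name]


def valencene_per_synthase(name: str) -> float:
    """Valencene produced per unit of the synthase proxy (arbitrary units)."""
    return VALENCENE_UG_PER_ML[name] / synthase_proxy(name)


yield_ratio = VALENCENE_UG_PER_ML["ValC"] / VALENCENE_UG_PER_ML["ValF"]
enzyme_ratio = synthase_proxy("ValF") / synthase_proxy("ValC")
activity_ratio = valencene_per_synthase("ValC") / valencene_per_synthase("ValF")

print(f"Valencene, ValC vs ValF: {yield_ratio:.1f}-fold")
print(f"Synthase proxy, ValF vs ValC: {enzyme_ratio:.1f}-fold")
print(f"Valencene per unit synthase, ValC vs ValF: about {activity_ratio:.1f}-fold")
```

Under these assumptions the ValF preparation would contain several times more synthase than the ValC preparation, in line with the qualitative conclusion that ValC has the higher specific activity.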

[0169] Besides valencene also other sesquiterpenes were formed by the two valencene synthases. The relative amount (as compared to the area of valencene) of germacrene-A (observed as beta-elemene due to thermal rearrangement in the injection port of the GC-MS), the major by-product formed with both synthases, appeared to be 22% with the preparation expressing ValC whereas this was 42% with the ValF containing preparation. Also the total relative amount of the sesquiterpene alcohols I and II with the preparation expressing ValC is approximately twofold lower than with the preparation expressing ValF, being 12% and 23%, respectively. Because the total relative amount of the other three sesquiterpenes formed (I, III and IV) is similar with both terpene synthases (ValC: 18%; ValF: 20%), this example also shows that a valencene synthase according to the invention is significantly more specific with respect to formation of valencene compared to other terpenoids.
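The group totals quoted above can be recomputed from the rounded percentages of Table 2. The sketch below sums them per group, treating "not detected" as zero. The grouping names are hypothetical, and the overall by-product total, which leaves out farnesol because it is also present in the blank, is an added illustration that is not stated in the text.

```python
# Rounded percentages relative to valencene, from Table 2 (not detected = 0).
RELATIVE_PERCENT = {
    "ValC": {"germacrene A": 22, "sesq I": 8, "sesq III": 7, "sesq IV": 3,
             "alcohol I": 9, "alcohol II": 3},
    "ValF": {"germacrene A": 42, "sesq I": 10, "sesq III": 0, "sesq IV": 10,
             "alcohol I": 0, "alcohol II": 23},
}

GROUPS = {
    "germacrene A": ["germacrene A"],
    "sesquiterpene alcohols I and II": ["alcohol I", "alcohol II"],
    "sesquiterpenes I, III and IV": ["sesq I", "sesq III", "sesq IV"],
}


def group_totals(preparation: str) -> dict:
    """Sum the relative percentages of each by-product group."""
    values = RELATIVE_PERCENT[preparation]
    return {group: sum(values[name] for name in members)
            for group, members in GROUPS.items()}


for preparation in ("ValC", "ValF"):
    totals = group_totals(preparation)
    # Farnesol is excluded here because it is also detected in the blank.
    print(preparation, totals, "all by-products:", sum(totals.values()))
```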

Example 3

Construction of R. sphaeroides Strains Producing Valencene or Amorphadiene

[0170] Cloning of Citrus×paradisi Valencene Synthase and Corresponding N-Terminal Fusions

Construction of Plasmids pBBR-K-PcrtE-valF-op, pBBR-K-PcrtE-valFpoR, pBBR-K-PcrtE-mbp-valFpoR, pBBR-K-PcrtE-nusA-valFpoR, pBBR-K-PcrtE-set-valFpoR, and pBBR-K-PcrtE-trx-valFpoR

[0171] The following nucleotide fragments were prepared by custom synthesis by DNA 2.0 Inc. (Menlo Park, Calif., USA): valF (SEQ ID NO: 7) coding for valencene synthase ValF from Citrus×paradisi (Accession number: CAG29905), valFpoR (SEQ ID NO: 8) coding for valencene synthase ValF from Citrus×paradisi with a two-amino acid C-terminal extension (referred to as ValFpoR) (SEQ ID NO: 9), mbp-valFpoR (SEQ ID NO: 10) coding for a fusion of maltose-binding protein (MBP) from Escherichia coli at its C-terminus to the N-terminus of valencene synthase ValFpoR (SEQ ID NO: 11), nusA-valFpoR (SEQ ID NO: 12) coding for a fusion of nitrogen utilization protein (NusA) from Escherichia coli at its C-terminus to the N-terminus of valencene synthase ValFpoR (SEQ ID NO: 13), set-valFpoR (SEQ ID NO: 24) coding for a fusion of solubility enhancing tag (SET) at its C-terminus to the N-terminus of valencene synthase ValFpoR (SEQ ID NO: 25), and trx-valFpoR (SEQ ID NO: 14) coding for a fusion of thioredoxin (Trx) from Escherichia coli at its C-terminus to the N-terminus of valencene synthase ValFpoR (SEQ ID NO: 15). All synthetic gene sequences were optimized in terms of codon usage for improved heterologous protein expression in Rhodobacter sphaeroides, and comprised an NdeI restriction site at their 5'-end, which also provided the ATG start codon, and a BamHI restriction site at their 3'-end downstream of stop codons. Also an AseI restriction site, which provides NdeI-compatible cohesive ends upon digestion, was introduced in the linkage region between the 3'-end of the genes encoding the fusion proteins MBP, NusA, SET, and Trx, and the 5'-end of the gene coding for ValFpoR. 
Synthetic nucleotides valF, valFpoR, mbp-valFpoR, nusA-valFpoR, set-valFpoR, and trx-valFpoR were digested with NdeI and BamHI and the resulting DNA fragments were ligated to NdeI/BamHI-digested plasmid vector pBBR-K-PcrtE, yielding plasmids pBBR-K-PcrtE-valF-op, pBBR-K-PcrtE-valFpoR, pBBR-K-PcrtE-mbp-valFpoR, pBBR-K-PcrtE-nusA-valFpoR, pBBR-K-PcrtE-set-valFpoR, and pBBR-K-PcrtE-trx-valFpoR. In all these plasmids the kanamycin resistance gene and the valencene synthase-encoding gene are transcribed in opposite directions. The construction of plasmid vector pBBR-K-PcrtE is described in detail in Example 6 (page 91, lines 12-27) of WO 02/099095.

Construction of Plasmids pBBR-K-PcrtE-valF, pBBR-K-PcrtE-valFpoR-rev, pBBR-K-PcrtE-mbp-valFpoR-rev, pBBR-K-PcrtE-nusA-valFpoR-rev, pBBR-K-PcrtE-set-valFpoR-rev, and pBBR-K-PcrtE-trx-valFpoR-rev

[0172] Gene inserts carrying the translationally fused or native valencene synthase genes were excised from parent plasmids pBBR-K-PcrtE-valF-op, pBBR-K-PcrtE-valFpoR, pBBR-K-PcrtE-mbp-valFpoR, pBBR-K-PcrtE-nusA-valFpoR, pBBR-K-PcrtE-set-valFpoR, and pBBR-K-PcrtE-trx-valFpoR as MlyI/PshAI-blunt ended fragments with respective lengths of 2.4 kilobases, 2.4 kilobases, 3.5 kilobases, 3.9 kilobases, 2.5 kilobases, and 2.7 kilobases. Plasmid vector pBBR-K-PcrtE was digested with EcoRI and BamHI, the resulting 5'-overhangs were blunted using DNA polymerase I, large (Klenow) fragment, the larger 4.2 kilobases DNA fragment was gel-purified and ligated to each of the above nucleotide fragments encoding PcrtE-valF, PcrtE-valFpoR, PcrtE-mbp-valFpoR, PcrtE-nusA-valFpoR, PcrtE-set-valFpoR, and PcrtE-trx-valFpoR. The orientation of the insert was checked and the plasmids which carried the valencene synthase-encoding gene in the same orientation as the kanamycin resistance gene were designated pBBR-K-PcrtE-valF, pBBR-K-PcrtE-valFpoR-rev, pBBR-K-PcrtE-mbp-valFpoR-rev, pBBR-K-PcrtE-nusA-valFpoR-rev, pBBR-K-PcrtE-set-valFpoR-rev, and pBBR-K-PcrtE-trx-valFpoR-rev.
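As a plausibility check for restriction analysis, the expected sizes of the resulting plasmids can be estimated by adding each insert length to the 4.2 kilobase vector fragment. The sketch below performs this addition with the lengths stated above; it assumes blunt-end ligation of exactly one insert per vector and ignores the rounding of the stated lengths, so the totals are approximate. The names are hypothetical.

```python
VECTOR_FRAGMENT_KB = 4.2

# Insert lengths in kilobases, in the order listed in the text.
INSERT_KB = {
    "pBBR-K-PcrtE-valF": 2.4,
    "pBBR-K-PcrtE-valFpoR-rev": 2.4,
    "pBBR-K-PcrtE-mbp-valFpoR-rev": 3.5,
    "pBBR-K-PcrtE-nusA-valFpoR-rev": 3.9,
    "pBBR-K-PcrtE-set-valFpoR-rev": 2.5,
    "pBBR-K-PcrtE-trx-valFpoR-rev": 2.7,
}


def expected_plasmid_kb(insert_kb: float) -> float:
    """Approximate plasmid size: vector fragment plus a single insert."""
    return round(VECTOR_FRAGMENT_KB + insert_kb, 1)


for plasmid, insert_kb in INSERT_KB.items():
    print(f"{plasmid}: about {expected_plasmid_kb(insert_kb)} kb")
```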

Construction of Plasmid pBBR-K-PcrtE-mbp-valF-op

[0173] Plasmid pBBR-K-PcrtE-valF was digested with NdeI and BamHI and the smaller 1.7 kilobase DNA fragment encoding ValF was ligated to the larger of the two fragments generated upon AseI/BamHI-digestion of plasmid vector pBBR-K-PcrtE-mbp-valFpoR, resulting in pBBR-K-PcrtE-mbp-valF-op, in which the Citrus valencene synthase ValF is expressed as a translational fusion to the C-terminus of maltose-binding protein (MBP) from Escherichia coli. In this newly constructed plasmid, the kanamycin resistance gene and the valencene synthase-encoding gene are transcribed in the opposite orientation.

Construction of Plasmid pBBR-K-PcrtE-mbp-valF

[0174] Plasmid pBBR-K-PcrtE-valF was digested with NdeI and BamHI and the smaller 1.7 kilobase DNA fragment encoding ValF was ligated to the larger of the two fragments generated upon AseI/BamHI-digestion of plasmid vector pBBR-K-PcrtE-mbp-valFpoR-rev, resulting in plasmid pBBR-K-PcrtE-mbp-valF containing the mbp-valF gene (SEQ ID NO: 16) encoding the Citrus valencene synthase ValF translationally fused to the C-terminus of maltose-binding protein (MBP) from Escherichia coli (SEQ ID NO: 17). In this newly constructed plasmid, the kanamycin resistance gene and the valencene synthase-encoding gene are transcribed in the same orientation.

Cloning of Mevalonate (mev) Operon from Paracoccus zeaxanthinifaciens

Construction of Plasmids pBBR-K-mev-op-4-89-PcrtE-valF-op, pBBR-K-mev-op-4-89-PcrtE-valFpoR, pBBR-K-mev-op-4-89-PcrtE-mbp-valF-op, pBBR-K-mev-op-4-89-PcrtE-mbp-valFpoR, pBBR-K-mev-op-4-89-PcrtE-nusA-valFpoR, pBBR-K-mev-op-4-89-PcrtE-set-valFpoR, and pBBR-K-mev-op-4-89-PcrtE-trx-valFpoR

[0175] Plasmid pBBR-K-mev-op-4-89-PcrtE-ddsAwt was used as the source of the mutated mevalonate operon from Paracoccus zeaxanthinifaciens. The construction of plasmid pBBR-K-mev-op-4-89-PcrtE-ddsAwt is described in detail in Example 3 (page 15, lines 4-31) of WO 06/018211.

[0176] The mev operon insert was excised from parent plasmid pBBR-K-mev-op-4-89-PcrtE-ddsAwt as an RsrII/XbaI-fragment; the XbaI-generated 5'-overhang was blunted using DNA polymerase I large (Klenow) fragment prior to treatment with RsrII. The resulting 7.0-kilobase nucleotide fragment was ligated to the RsrII/MlyI-digested plasmid vectors pBBR-K-PcrtE-valF-op, pBBR-K-PcrtE-valFpoR, pBBR-K-PcrtE-mbp-valF-op, pBBR-K-PcrtE-mbp-valFpoR, pBBR-K-PcrtE-nusA-valFpoR, pBBR-K-PcrtE-set-valFpoR, and pBBR-K-PcrtE-trx-valFpoR, yielding plasmids pBBR-K-mev-op-4-89-PcrtE-valF-op, pBBR-K-mev-op-4-89-PcrtE-valFpoR, pBBR-K-mev-op-4-89-PcrtE-mbp-valF-op, pBBR-K-mev-op-4-89-PcrtE-mbp-valFpoR, pBBR-K-mev-op-4-89-PcrtE-nusA-valFpoR, pBBR-K-mev-op-4-89-PcrtE-set-valFpoR, and pBBR-K-mev-op-4-89-PcrtE-trx-valFpoR, respectively. In those newly constructed plasmids, the mev operon insert and the valencene synthase-encoding gene are transcribed in opposite orientations.

Construction of Plasmids pBBR-K-mev-op-4-89-PcrtE-valF, pBBR-K-mev-op-4-89-PcrtE-valFpoR-rev, pBBR-K-mev-op-4-89-PcrtE-mbp-valF, pBBR-K-mev-op-4-89-PcrtE-mbp-valFpoR-rev, pBBR-K-mev-op-4-89-PcrtE-nusA-valFpoR-rev, pBBR-K-mev-op-4-89-PcrtE-set-valFpoR-rev, and pBBR-K-mev-op-4-89-PcrtE-trx-valFpoR-rev

[0177] The mev operon insert was excised from parent plasmid pBBR-K-mev-op-4-89-PcrtE-ddsAwt as an RsrII/BlpI-fragment and the resulting 7.3-kilobase nucleotide fragment was ligated to the RsrII/BlpI-digested plasmid vectors pBBR-K-PcrtE-valF, pBBR-K-PcrtE-valFpoR-rev, pBBR-K-PcrtE-mbp-valF, pBBR-K-PcrtE-mbp-valFpoR-rev, pBBR-K-PcrtE-nusA-valFpoR-rev, pBBR-K-PcrtE-set-valFpoR-rev, and pBBR-K-PcrtE-trx-valFpoR-rev, yielding plasmids pBBR-K-mev-op-4-89-PcrtE-valF, pBBR-K-mev-op-4-89-PcrtE-valFpoR-rev, pBBR-K-mev-op-4-89-PcrtE-mbp-valF, pBBR-K-mev-op-4-89-PcrtE-mbp-valFpoR-rev, pBBR-K-mev-op-4-89-PcrtE-nusA-valFpoR-rev, pBBR-K-mev-op-4-89-PcrtE-set-valFpoR-rev, and pBBR-K-mev-op-4-89-PcrtE-trx-valFpoR-rev, respectively. In those newly constructed plasmids, the kanamycin resistance gene, the mev operon insert, and the valencene synthase-encoding gene are transcribed in the same orientation.

Cloning of Chamaecyparis nootkatensis Valencene Synthase and Corresponding N-Terminal Fusions

Construction of Plasmids pBBR-K-PcrtE-valC-opt, pBBR-K-PcrtE-valC-opt-short, pBBR-K-PcrtE-mbp-valC-opt, and pBBR-K-PcrtE-mbp-valC-opt-short

[0178] Two nucleic acid fragments encoding the valencene synthase from Chamaecyparis nootkatensis (ValC) were prepared by custom synthesis by DNA 2.0 Inc. Both synthetic gene sequences were optimized in terms of codon usage for improved heterologous protein expression in Rhodobacter sphaeroides, and comprised an NdeI restriction site at their 5'-end, which also provided the ATG start codon, and a BamHI restriction site at their 3'-end downstream of stop codons. The first nucleic acid fragment contained an ORF corresponding to the full-length valC gene (valC-opt) (SEQ ID NO: 18) coding for the full-length version of valencene synthase ValC from C. nootkatensis (SEQ ID NO: 4). The second nucleic acid fragment contained an ORF corresponding to a truncated version of the valC gene (valC-opt-short) (SEQ ID NO: 19) coding for a shorter variant of the C. nootkatensis valencene synthase that lacked 16 amino acids from its N-terminus, ValC-short (SEQ ID NO: 2).

[0179] The synthetic nucleic acid fragments containing valC-opt and valC-opt-short were digested with NdeI and BamHI. The resulting DNA fragments were ligated to the larger of the two fragments generated upon NdeI/BamHI-digestion of plasmid vector pBBR-K-PcrtE-valFpoR-rev, resulting in pBBR-K-PcrtE-valC-opt and pBBR-K-PcrtE-valC-opt-short, respectively. In these two newly constructed plasmids, the kanamycin resistance gene and the valencene synthase-encoding gene are transcribed in the same orientation.

[0180] The synthetic nucleic acid fragments containing valC-opt and valC-opt-short were again digested with NdeI and BamHI. Subsequently, the resulting DNA fragments were ligated to the larger of the two fragments generated upon AseI/BamHI-digestion of plasmid vector pBBR-K-PcrtE-mbp-valFpoR-rev, resulting in pBBR-K-PcrtE-mbp-valC-opt containing the mbp-valC-opt gene (SEQ ID NO: 20) and pBBR-K-PcrtE-mbp-valC-opt-short containing the mbp-valC-opt-short gene (SEQ ID NO: 22), respectively. In plasmid pBBR-K-PcrtE-mbp-valC-opt the full-length version of ValC is expressed as translational fusion at the C-terminus of the maltose-binding protein (MBP) from Escherichia coli (SEQ ID NO: 21), whereas in plasmid pBBR-K-PcrtE-mbp-valC-opt-short the truncated version of ValC is expressed as translational fusion at the C-terminus of the maltose-binding protein (MBP) from Escherichia coli (SEQ ID NO: 23). In these two newly constructed plasmids, the kanamycin resistance gene and the valencene synthase-encoding gene are transcribed in the same orientation.

Cloning of Mevalonate (mev) Operon from Paracoccus zeaxanthinifaciens into Plasmids Encoding Valencene Synthase from Chamaecyparis nootkatensis

Construction of Plasmids pBBR-K-mev-op-4-89-PcrtE-mbp-valC-opt and pBBR-K-mev-op-4-89-PcrtE-mbp-valC-opt-short

[0181] The mev operon insert was excised from parent plasmid pBBR-K-mev-op-4-89-PcrtE-ddsAwt as an RsrII/BlpI-fragment and the resulting 7.3-kilobase nucleotide fragment was ligated to RsrII/BlpI-digested plasmid vectors pBBR-K-PcrtE-mbp-valC-opt and pBBR-K-PcrtE-mbp-valC-opt-short, resulting in plasmids pBBR-K-mev-op-4-89-PcrtE-mbp-valC-opt and pBBR-K-mev-op-4-89-PcrtE-mbp-valC-opt-short, respectively. In these newly constructed plasmids, the kanamycin resistance gene, the mev operon insert, and the valencene synthase-encoding gene are transcribed in the same orientation.

Cloning of Artemisia annua Amorphadiene Synthase and Corresponding N-Terminal Fusion

Construction of Plasmids pBBR-K-PcrtE-aaas and pBBR-K-PcrtE-mbp-aaas

[0182] A synthetic nucleic acid fragment carrying a gene (aaas) (SEQ ID NO: 26) encoding the amorphadiene synthase Aaas from Artemisia annua (SEQ ID NO: 27) was prepared by custom synthesis by DNA 2.0 Inc. The synthetic gene sequence was optimized in terms of codon usage for improved heterologous protein expression in Rhodobacter sphaeroides and comprised an NdeI restriction site at its 5'-end, which also provided the ATG start codon, and a BamHI restriction site at its 3'-end downstream of stop codons.

[0183] The synthetic nucleic acid fragment containing aaas was digested with NdeI and BamHI. The resulting DNA fragment was ligated to the larger of the two fragments generated upon NdeI/BamHI-digestion of plasmid vector pBBR-K-PcrtE-valFpoR-rev, resulting in pBBR-K-PcrtE-aaas. In this newly constructed plasmid, the kanamycin resistance gene and the amorphadiene synthase-encoding gene are transcribed in the same orientation.

[0184] The synthetic nucleic acid fragment containing aaas was again digested with NdeI and BamHI. Subsequently, the resulting DNA fragment was ligated to the larger of the two fragments generated upon AseI/BamHI-digestion of plasmid vector pBBR-K-PcrtE-mbp-valFpoR-rev, resulting in pBBR-K-PcrtE-mbp-aaas containing the mbp-aaas gene (SEQ ID NO: 28). In plasmid pBBR-K-PcrtE-mbp-aaas, Aaas is expressed as a translational fusion at the C-terminus of the maltose-binding protein (MBP) from Escherichia coli (SEQ ID NO: 29). In this newly constructed plasmid, the kanamycin resistance gene and the amorphadiene synthase-encoding gene are transcribed in the same orientation.

Transformation of Rhodobacter sphaeroides

[0185] Transformation of E. coli S17-1 with plasmids and subsequent transfer of plasmids from S17-1 to R. sphaeroides Rs265-9c by conjugation were performed using standard procedures (Nishimura et al., Nucl. Acids Res. (1990) 18, 6169; Parke, Gene (1990) 93, 135-137). The R. sphaeroides Rs265-9c recipient strain was grown in RA-medium. The composition and preparation of medium RA are summarized in Table 3. In parallel, the E. coli S17-1 donor strain carrying the plasmid to be transferred was grown in LB-broth containing the appropriate antibiotic. For the conjugation, 450 μL culture aliquots of the R. sphaeroides Rs265-9c recipient strain and of the E. coli S17-1 donor strain were mixed together, and then pelleted by centrifugation. The supernatant was discarded. Cells were washed twice with fresh RA-medium to remove the antibiotics, and then resuspended in 0.05 mL fresh RA-medium and spotted onto a PY-plate. The composition and preparation of medium PY are summarized in Table 4. After 4-5 h incubation at 30° C. the cells were harvested with an inoculating loop and resuspended in 0.3 mL of RA-medium. Dilutions of this suspension were spread onto RA-plates containing the appropriate antibiotic and incubated at 30° C. for 2-3 days. Colonies were picked from the plates, streaked onto RS102-plates containing the appropriate antibiotic, and incubated at 30° C. for 2-3 days to obtain single colonies. One single colony from each clone (putatively transformed cells of R. sphaeroides Rs265-9c) was again grown in liquid RS102 medium containing the appropriate antibiotic and the presence of the expected plasmid was confirmed by PCR using appropriate primers. The final transformants were preserved by adding glycerol to the culture (15% v/v) and freezing at -80° C.

TABLE 3
Composition and preparation of medium RA

Medium RA
  Component                        Amount per litre distilled water
  1. Malic acid                    3 g
  2. MgSO4•7H2O                    0.2 g
  3. (NH4)2SO4                     1.2 g
  4. CaCl2•2H2O                    0.07 g
  5. Microelements solution        1.5 mL
  6. Vitamins solution             8 mL
  7. Phosphate buffer solution     20 mL

Components 1-5 are mixed together, the final volume is adjusted to 1 litre, and the pH is adjusted to 6.9 with 0.5 M NaOH. The resulting base medium is then sterilized by filtration through a 0.22 micron membrane; 8 mL of sterile vitamins solution and 20 mL of sterile phosphate buffer solution (see below) are added to give the final medium RA. For solid medium, the 1 litre base medium mentioned above plus 20 g agar are first mixed together and autoclaved. After the medium is cooled down to about 60° C., the sterile vitamins and phosphate buffer solutions are added and the molten medium is dispensed into sterile Petri plates.

Microelements solution
  Fe(II) citrate                   500 mg
  MnCl2•4H2O                       20 mg
  ZnCl2                            5 mg
  LiCl                             5 mg
  KBr                              2.5 mg
  KI                               2.5 mg
  CuSO4•5H2O                       0.23 mg
  Na2MoO4                          0.851 mg
  CoCl2•6H2O                       5 mg
  SnCl2•2H2O                       0.5 mg
  BaCl2•2H2O                       0.59 mg
  AlCl3                            1 mg
  H3BO4                            10 mg
  EDTA                             20 mg
Sterilize by filtration through a 0.22 micron membrane, store at 4° C.

Vitamins solution
  Niacin                           200 mg
  Thiamin-HCl                      400 mg
  Nicotinamide                     200 mg
  Biotin                           8 mg
Sterilize by filtration through a 0.22 micron membrane, store at 4° C.

Phosphate buffer solution
  KH2PO4                           600 mg
  K2HPO4                           900 mg
Sterilize by filtration through a 0.22 micron membrane, store at 4° C.

TABLE 4
Composition and preparation of medium PY plates

Medium PY
  Component                        Amount per litre distilled water
  1. Bacto peptone                 10 g
  2. Yeast extract                 0.5 g
  3. CaCl2 (0.4 M)                 5 mL
  4. MgCl2 (0.4 M)                 5 mL
  5. FeSO4 (0.5%)                  2.4 mL
  6. Agar                          20 g
  7. H2O                           990 mL

Components 1-7 are mixed together, the pH is adjusted to 7.0 with 0.5 M NaOH, and the mixture is autoclaved. After the medium is cooled down to about 60° C., the molten medium is dispensed into sterile Petri plates.

Example 4

Cultivation of Rhodobacter sphaeroides Strains Under Standard Shake-Flask Conditions and Evaluation of Valencene Production

Preparation of Frozen Cell Stocks

[0186] Frozen cell stocks of R. sphaeroides strains were prepared by introducing a loop-full of frozen cells into 2 mL RS102 medium containing 50 mg/L kanamycin (if applicable for plasmid maintenance). The preculture was grown at 30° C. with agitation at 220 rpm for 24 h. A 250 μL aliquot of preculture was transferred to 25 mL of RS102 medium containing 50 mg/L kanamycin to initiate (t=0) growth. The 25 mL main culture was grown in a 250-mL baffled Erlenmeyer flask at 30° C. with agitation at 220 rpm for about 24 h. Bacterial cell cultures were mixed with sterile anhydrous glycerol and sterile water so as to reach a final glycerol content of 25% and a final optical density at 660 nanometers (OD660) of 12. The resulting cell suspension was aseptically distributed in 1.2 mL-aliquots into 2 mL-cryovials, then frozen at -80° C. until used.

Shake-Flask Procedure

[0187] Inoculants of R. sphaeroides strains were started by introducing 250 μL of a thawed and homogenized frozen cell stock into 25 mL of RS102 medium containing 50 mg/L of kanamycin (if applicable for plasmid maintenance). Precultures were grown in 250-mL baffled Erlenmeyer flasks for 24-28 h at 30° C. with agitation at 220 rpm. A suitable aliquot of preculture was transferred to 22.5 mL of RS102 medium containing 50 mg/L of kanamycin (if applicable for plasmid maintenance) to initiate (t=0) shake-flask experiments with an initial optical density at 660 nm (OD660) of 0.16. Main cultures were grown in 250-mL baffled Erlenmeyer flasks at 30° C. with agitation at 220 rpm. After 8 h cultivation, 2.5 mL of n-dodecane were added to the bacterial culture. Shake-flask cultivation continued at 30° C. with agitation at 220 rpm for 72 h from inoculation. Each seed culture served to inoculate two duplicate shake-flasks with a final volume of 25 mL whole broth, composed of culture medium and n-dodecane for in situ product recovery. Samples (0.5 mL) of biphasic culture broth were removed at 24 h intervals and analyzed for growth (OD660), pH, and glucose in supernatant. At the end of the experiments (t=72 h), the biphasic culture broth was analyzed for presence of valencene (see analytical methods below). At the end of the experiments, 10 μL of culture broth were aseptically plated on general cultivation count agar plates (Becton Dickinson GmbH, Heidelberg, Germany) and incubated at 37° C. for 24 h to test for contamination.
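The "suitable aliquot" of preculture follows from a simple dilution balance. The following is a minimal sketch, assuming that OD660 scales linearly with cell density over the dilution and that the aliquot is added on top of the 22.5 mL of fresh medium; the preculture OD660 of 8.0 is a hypothetical value chosen for illustration and is not reported in this document.

```python
def inoculum_volume_ml(od_preculture, od_target=0.16, medium_ml=22.5):
    """Preculture volume (mL) to add to medium_ml of fresh medium so that
    the mixture starts at od_target (assumes OD660 is linear in cell density)."""
    if od_preculture <= od_target:
        raise ValueError("preculture OD660 must exceed the target OD660")
    # Balance: od_preculture * v = od_target * (medium_ml + v)
    return od_target * medium_ml / (od_preculture - od_target)


# Hypothetical preculture OD660 (illustrative only, not from the document)
aliquot_ml = inoculum_volume_ml(8.0)
print(f"Add {aliquot_ml:.2f} mL of preculture to 22.5 mL of medium")
```

If the preculture is much denser than the target, the aliquot is small compared with 22.5 mL and the simpler estimate od_target × medium_ml / od_preculture gives nearly the same volume.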

Analytical Methods

Sample Preparation for Analysis of Isoprenoid Content in Organic Phase

[0188] In a typical procedure, 10 mL whole broth samples were transferred to a disposable sterile 15 mL polypropylene conical tube. The organic and aqueous phases were separated upon ultracentrifugation for 30 min. The organic phase was transferred to amber chromatography vials for analysis by gas chromatography (see below). Product yields were determined based on calibration curves established upon analysis of three standard solutions of authentic valencene dissolved in analytical grade n-dodecane.

Sample Preparation for Analysis of Isoprenoid Content in Whole Broth

[0189] In a typical procedure, 400 μL whole broth samples were transferred to a disposable sterile 15 mL polypropylene conical tube, treated with 4 mL acetone, vigorously shaken on an IKA Vibrax orbital shaker at 1,500 rpm for 20 minutes, then incubated in a bench top ultrasonic bath for 30 min at ambient temperature. Finally, samples were centrifuged at maximum speed and the supernatant was transferred to amber chromatography vials for analysis by gas chromatography (see below). Product yields were determined based on calibration curves established using a standard solution of authentic valencene prepared as follows: 5 mL of authentic valencene were added into a 100 mL volumetric flask and dissolved with analytical grade n-dodecane. Aliquots of valencene standard solution (20, 40 and 80 μL) were transferred to disposable sterile 15 mL polypropylene conical tubes and treated with deionized sterile water (380, 360, and 320 μL, respectively) and 4 mL acetone. Each mixture was homogenized vigorously on a vortex shaker, then transferred to amber chromatography vials for analysis by gas chromatography, from which a calibration curve was derived.
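The three calibration points can be worked out from the volumes given above. This is a minimal sketch that assumes the 100 mL volumetric flask is filled to the mark (so the stock is 5% v/v valencene in n-dodecane) and expresses each standard as valencene volume per litre of a 400 μL sample equivalent; the variable names are illustrative. Converting to mg/L would additionally require the density of valencene, which is not stated in this document.

```python
STOCK_FRACTION = 5.0 / 100.0   # 5 mL valencene made up to 100 mL with n-dodecane
SAMPLE_UL = 400.0              # aliquot + water mimics a 400 uL whole-broth sample

# (stock aliquot in uL, water in uL), as listed in the text
standards = [(20.0, 380.0), (40.0, 360.0), (80.0, 320.0)]

calibration_points = []
for aliquot_ul, water_ul in standards:
    # Each standard is topped up with water to the same 400 uL as a broth sample
    assert aliquot_ul + water_ul == SAMPLE_UL
    valencene_ul = aliquot_ul * STOCK_FRACTION
    # uL valencene per mL of sample equivalent equals mL per L
    ml_per_litre = valencene_ul / SAMPLE_UL * 1000.0
    calibration_points.append((aliquot_ul, valencene_ul, ml_per_litre))

for aliquot_ul, valencene_ul, ml_per_litre in calibration_points:
    print(f"{aliquot_ul:.0f} uL stock: {valencene_ul:.1f} uL valencene, "
          f"{ml_per_litre:.1f} mL/L sample equivalent")
```

Because every standard receives the same 4 mL of acetone as a real sample, the acetone dilution cancels out when sample peak areas are read against this curve.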

Gas Chromatography

[0190] Gas chromatography was performed on a Hewlett-Packard GC 6890 instrument equipped with a Restek Rtx-5 capillary column (30.0 m×0.32 mm×0.25 μm). The injector and FID detector temperatures were set to 300° C. and 250° C., respectively. Gas flow through the column was set at 2.7 mL/min. The oven initial temperature was held at 70° C. for 2 min, increased to 180° C. at a rate of 10° C./min, further increased to 300° C. at a rate of 40° C./min, then cooled down to 60° C. and held at that temperature for 3 min until the next injection. Injected sample volume was 1 μL with a 4:1 split-ratio. Product yields were determined based on calibration curves established for authentic samples.
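The oven programme above implies a fixed analysis time per injection. The following sketch simply adds the hold and ramp segments; the cool-down from 300° C. to 60° C. is left out because its duration is not stated, and the function and variable names are illustrative.

```python
def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Duration of a linear temperature ramp in minutes."""
    return (end_c - start_c) / rate_c_per_min


initial_hold_min = 2.0                     # 70 C held for 2 min
ramp1_min = ramp_minutes(70, 180, 10)      # 70 -> 180 C at 10 C/min
ramp2_min = ramp_minutes(180, 300, 40)     # 180 -> 300 C at 40 C/min
analysis_min = initial_hold_min + ramp1_min + ramp2_min

post_run_hold_min = 3.0                    # 60 C held for 3 min before next injection

print(f"Ramp 1: {ramp1_min:.1f} min, ramp 2: {ramp2_min:.1f} min")
print(f"Analysis time per injection: {analysis_min:.1f} min "
      f"(plus cool-down and a {post_run_hold_min:.0f} min hold at 60 C)")
```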

Example 5

In Vivo Comparison of C. nootkatensis Valencene Synthase (Invention) and Citrus Valencene Synthase (Reference)

[0191] R. sphaeroides strains Rs265-9c (blank strain, no plasmid), Rs265-9c/pBBR-K-PcrtE-mbp-valF (reference strain), Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-mbp-valF (reference strain also expressing the mutated mevalonate operon mev from Paracoccus zeaxanthinifaciens), and Rs265-9c/pBBR-K-PcrtE-mbp-valC-opt, Rs265-9c/pBBR-K-PcrtE-mbp-valC-opt-short, Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-mbp-valC-opt, and Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-mbp-valC-opt-short (four strains expressing a nucleic acid sequence encoding a valencene synthase according to the invention), were grown under the standard shake flask cultivation conditions described above. Several clones of each transformed R. sphaeroides strain were tested for valencene production and each shake-flask experiment was run in duplicate, unless stated otherwise. The valencene titre is reported in mg/L n-dodecane, wherein the organic phase n-dodecane constituted 10% (v/v) of the whole broth.
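Because titres are given per litre of n-dodecane and the organic phase is 10% (v/v) of the whole broth, a titre can be re-expressed per litre of whole broth. This is a minimal sketch, assuming that essentially all valencene partitions into the n-dodecane layer (an assumption made here, not a statement from this document); the example value of 575 mg/L is taken from Table 5.

```python
ORGANIC_FRACTION = 0.10  # n-dodecane as a volume fraction of the whole broth


def titre_per_litre_broth(titre_mg_per_l_dodecane, organic_fraction=ORGANIC_FRACTION):
    """Convert mg valencene per L n-dodecane into mg per L whole broth,
    assuming the product is found only in the organic phase."""
    return titre_mg_per_l_dodecane * organic_fraction


broth_titre = titre_per_litre_broth(575)
print(f"575 mg/L n-dodecane corresponds to about {broth_titre:.1f} mg/L whole broth")
```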

[0192] The results of these in vivo tests are given in Table 5.

TABLE 5
In vivo formation of valencene and germacrene A in shake flask experiments employing R. sphaeroides containing plasmids pBBR-K-PcrtE-mbp-valF, pBBR-K-PcrtE-mbp-valC-opt, pBBR-K-mev-op-4-89-PcrtE-mbp-valF, or pBBR-K-mev-op-4-89-PcrtE-mbp-valC-opt, and R. sphaeroides without plasmid.

                                                             Valencene in           Germacrene A in
                                                             n-dodecane (mg/L)      n-dodecane (mg/L) [a]
                                                             Average     Std        Average     Std        V/G
Entry  Rhodobacter sphaeroides strain                        Titre       Dev        Titre       Dev        ratio [b]
1      Rs265-9c/pBBR-K-PcrtE-mbp-valF [c]                    25          1          38          2          0.67
2      Rs265-9c/pBBR-K-PcrtE-mbp-valC-opt [d]                575         35         176         10         3.3
3      Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-mbp-valF [e]        249         13         259         28         0.96
4      Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-mbp-valC-opt [d]    3519        368        983         111        3.6
5      Rs265-9c                                              0.0         0.0        0.6         0.1        0

[a] Quantified as beta-elemene upon Cope thermal rearrangement of substrate germacrene A in the GC injector (300° C.).
[b] Valencene (V) to germacrene A (G) ratio.
[c] Valencene production for each strain was tested on seven clones in duplicate.
[d] Valencene production for each strain was tested on six clones in duplicate.
[e] Valencene production for each strain was tested on four clones in duplicate.

[0193] Whereas cultivation of the empty R. sphaeroides strain Rs265-9c did not result in detectable amounts of valencene (entry 5), the strain transformed with plasmid pBBR-K-PcrtE-mbp-valF expressing ValF from Citrus×paradisi with the E. coli MBP at its N-terminus formed 25 mg/L valencene (entry 1). The strain with the analogous plasmid pBBR-K-PcrtE-mbp-valC-opt expressing ValC from Chamaecyparis nootkatensis with the E. coli MBP at its N-terminus resulted in a valencene titre of 575 mg/L (entry 2), a 23-fold increase compared to the MBP-ValF expressing strain. Also in the presence of the mutated mevalonate operon from Paracoccus zeaxanthinifaciens expression of MBP-ValC led to significantly higher valencene titres than MBP-ValF. While R. sphaeroides containing pBBR-K-mev-op-4-89-PcrtE-mbp-valF produced 249 mg/L valencene (entry 3), 3519 mg/L was formed in case of R. sphaeroides containing pBBR-K-mev-op-4-89-PcrtE-mbp-valC-opt (entry 4), a 14-fold increase. Thus, this example shows that a valencene synthase according to the invention leads to a considerably higher in vivo valencene production than a known valencene synthase from citrus.

[0194] The novel valencene synthase ValC also forms much less germacrene A than the Citrus×paradisi valencene synthase ValF. The valencene to germacrene A ratio in the n-dodecane layer (germacrene A being observed as beta-elemene due to thermal rearrangement in the injection port of the GC-MS) was 0.67 and 0.96 for R. sphaeroides Rs265-9c with plasmids pBBR-K-PcrtE-mbp-valF and pBBR-K-mev-op-4-89-PcrtE-mbp-valF, respectively, indicating that under these conditions expression of MBP-ValF results in slightly more germacrene A than valencene (entries 1 & 3). This valencene to germacrene A ratio increased to 3.3 and 3.6 when R. sphaeroides with plasmids pBBR-K-PcrtE-mbp-valC-opt and pBBR-K-mev-op-4-89-PcrtE-mbp-valC-opt was cultivated (entries 2 & 4). Thus, this example shows that a valencene synthase according to the invention is also significantly more specific with respect to formation of valencene compared to germacrene A than the Citrus×paradisi valencene synthase.
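The fold increases and valencene to germacrene A ratios discussed for Table 5 can be recomputed from the tabulated average titres. The following sketch uses only those rounded averages, so small deviations from the tabulated ratios (for example for entry 1) are to be expected if the published ratios were derived from unrounded data; the dictionary keys are shorthand labels introduced here.

```python
# (valencene, germacrene A) average titres in mg/L n-dodecane, from Table 5
table5 = {
    "mbp-valF": (25, 38),                  # entry 1
    "mbp-valC-opt": (575, 176),            # entry 2
    "mev/mbp-valF": (249, 259),            # entry 3
    "mev/mbp-valC-opt": (3519, 983),       # entry 4
}

# Fold increase in valencene titre, ValC versus ValF
fold_without_mev = table5["mbp-valC-opt"][0] / table5["mbp-valF"][0]
fold_with_mev = table5["mev/mbp-valC-opt"][0] / table5["mev/mbp-valF"][0]

# Valencene to germacrene A (V/G) ratio per strain
vg_ratio = {name: v / g for name, (v, g) in table5.items()}

print(f"ValC vs ValF without mev operon: {fold_without_mev:.1f}-fold")
print(f"ValC vs ValF with mev operon:    {fold_with_mev:.1f}-fold")
for name, ratio in vg_ratio.items():
    print(f"V/G for {name}: {ratio:.2f}")
```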

Example 6

In Vivo Comparison of C. nootkatensis Full-Length Valencene Synthase (ValC) and C. nootkatensis N-Terminally Truncated Valencene Synthase (ValC-Short)

[0195] R. sphaeroides strains Rs265-9c (blank strain, no plasmid), Rs265-9c/pBBR-K-PcrtE-valC-opt (strain expressing the full-length valencene synthase gene valC-opt), and Rs265-9c/pBBR-K-PcrtE-valC-opt-short (strain expressing a truncated version of the valencene synthase gene valC-opt-short), as well as the R. sphaeroides strains expressing the corresponding valC genes but now translationally fused at their 5'-ends to the 3'-end of the E. coli mbp gene (Rs265-9c/pBBR-K-PcrtE-mbp-valC-opt and Rs265-9c/pBBR-K-PcrtE-mbp-valC-opt-short), were grown under the standard shake flask cultivation conditions as described above. Several clones of each of these five strains were tested for valencene production, and each shake-flask experiment was run in duplicate, unless stated otherwise. The valencene titre is reported in mg/L n-dodecane, wherein the organic phase n-dodecane constituted 10% (v/v) of the whole broth.

[0196] The results of these in vivo tests are presented in Table 6.

TABLE 6
In vivo formation of valencene in shake flask experiments employing R. sphaeroides containing plasmids pBBR-K-PcrtE-mbp-valC-opt, pBBR-K-PcrtE-mbp-valC-opt-short, pBBR-K-PcrtE-valC-opt, and pBBR-K-PcrtE-valC-opt-short, and R. sphaeroides without plasmid.

                                                        Valencene in n-dodecane (mg/L)
Entry  Rhodobacter sphaeroides strain                   Average Titre    Std Dev
1      Rs265-9c/pBBR-K-PcrtE-mbp-valC-opt [a]           575              35
2      Rs265-9c/pBBR-K-PcrtE-mbp-valC-opt-short [b]     592              38
3      Rs265-9c/pBBR-K-PcrtE-valC-opt [c]               299              22
4      Rs265-9c/pBBR-K-PcrtE-valC-opt-short [a]         20               5
5      Rs265-9c                                         0.0              0.0

[a] Valencene production for each strain was tested on six clones in duplicate.
[b] Valencene production for each strain was tested on four clones in duplicate.
[c] Valencene production for each strain was tested on five clones in duplicate.

[0197] The results in Table 6 show that cultivation of the R. sphaeroides strains expressing the full-length and the N-terminally truncated version of the C. nootkatensis valencene synthase with an N-terminal MBP-tag leads to quite similar valencene titres, i.e. 575 and 592 mg/L, respectively (entries 1 & 2). When expressed without N-terminal MBP-tag, however, very different valencene titres are obtained. While cultivation of the R. sphaeroides strain containing plasmid pBBR-K-PcrtE-valC-opt, thus forming the un-tagged full-length ValC, resulted in 299 mg/L valencene, which is a factor 1.9 lower than with the corresponding MBP-tagged ValC, only 20 mg/L valencene was obtained by cultivation of strain Rs265-9c/pBBR-K-PcrtE-valC-opt-short expressing the untagged and N-terminally truncated ValC. This is a factor 30 lower than with the equivalent MBP-tagged ValC-short.

[0198] Thus, this example proves that a valencene synthase according to the current invention can be expressed in active form in its native form, i.e. without use of an N-terminal tag-peptide. This example moreover shows that an increased terpenoid titre is obtainable by expressing a valencene synthase according to the current invention with an N-terminal tag-peptide; the effect of such N-terminal tag-peptide is more pronounced in case of expression of an N-terminally truncated version of a valencene synthase according to the current invention.

Example 7

In Vivo Comparison of the Expression of a Valencene Synthase with an N-Terminal Tag-Peptide (Invention) and without Such Tag-Peptide (Reference)

[0199] R. sphaeroides strains Rs265-9c (blank strain, no plasmid), Rs265-9c/pBBR-K-PcrtE-valFpoR, Rs265-9c/pBBR-K-PcrtE-valFpoR-rev, and Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-valFpoR-rev (three reference strains, no N-terminal tag-peptide), Rs265-9c/pBBR-K-PcrtE-mbp-valFpoR, Rs265-9c/pBBR-K-PcrtE-mbp-valFpoR-rev, and Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-mbp-valFpoR-rev (three strains expressing the Citrus×paradisi valencene synthase gene valFpoR translationally fused at its 5'-end to the 3'-end of the E. coli mbp gene), Rs265-9c/pBBR-K-PcrtE-nusA-valFpoR, Rs265-9c/pBBR-K-PcrtE-nusA-valFpoR-rev, and Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-nusA-valFpoR-rev (three strains expressing the Citrus×paradisi valencene synthase gene valFpoR translationally fused at its 5'-end to the 3'-end of the E. coli nusA gene), Rs265-9c/pBBR-K-PcrtE-set-valFpoR, Rs265-9c/pBBR-K-PcrtE-set-valFpoR-rev, and Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-set-valFpoR-rev (three strains expressing the Citrus×paradisi valencene synthase gene valFpoR translationally fused at its 5'-end to the 3'-end of the set tag), and Rs265-9c/pBBR-K-PcrtE-trx-valFpoR, Rs265-9c/pBBR-K-PcrtE-trx-valFpoR-rev, and Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-trx-valFpoR-rev (three strains expressing the Citrus×paradisi valencene synthase gene valFpoR translationally fused at its 5'-end to the 3'-end of the E. coli trx gene) were grown under the standard shake flask cultivation conditions described above. Several clones of each transformed R. sphaeroides strain were tested for valencene production, and each shake-flask experiment was run in duplicate, unless stated otherwise. The valencene titre is reported in mg/L n-dodecane, wherein the organic phase n-dodecane constituted 10% (v/v) of the whole broth.

[0200] The results of this experiment are given in Tables 7-9.

TABLE 7
In vivo formation of valencene in shake flask experiments employing R. sphaeroides containing plasmids pBBR-K-PcrtE-mbp-valFpoR, pBBR-K-PcrtE-nusA-valFpoR, pBBR-K-PcrtE-set-valFpoR, pBBR-K-PcrtE-trx-valFpoR, and pBBR-K-PcrtE-valFpoR, and R. sphaeroides without plasmid.

                                                 Valencene in n-dodecane (mg/L)
Rhodobacter sphaeroides strain                   Average Titre    Standard Deviation
Rs265-9c/pBBR-K-PcrtE-mbp-valFpoR [a]            26.2             1.6
Rs265-9c/pBBR-K-PcrtE-nusA-valFpoR [a]           7.5              0.9
Rs265-9c/pBBR-K-PcrtE-set-valFpoR [b]            3.5              0.7
Rs265-9c/pBBR-K-PcrtE-trx-valFpoR [a]            16.6             1.7
Rs265-9c/pBBR-K-PcrtE-valFpoR [a]                0.5              0.6
Rs265-9c [a]                                     0.0              0.0

[a] Valencene production for each strain was tested on three different clones.
[b] Valencene production for each strain was tested on two different clones.

TABLE 8
In vivo formation of valencene in shake flask experiments employing R. sphaeroides containing plasmids pBBR-K-PcrtE-mbp-valFpoR-rev, pBBR-K-PcrtE-nusA-valFpoR-rev, pBBR-K-PcrtE-set-valFpoR-rev, pBBR-K-PcrtE-trx-valFpoR-rev, and pBBR-K-PcrtE-valFpoR-rev, and R. sphaeroides without plasmid.

                                                 Valencene in n-dodecane (mg/L)
Rhodobacter sphaeroides strain                   Average Titre    Standard Deviation
Rs265-9c/pBBR-K-PcrtE-mbp-valFpoR-rev [a]        22.2             2.8
Rs265-9c/pBBR-K-PcrtE-nusA-valFpoR-rev [b]       5.1              0.7
Rs265-9c/pBBR-K-PcrtE-set-valFpoR-rev [a]        3.0              0.5
Rs265-9c/pBBR-K-PcrtE-trx-valFpoR-rev [c]        6.2              0.8
Rs265-9c/pBBR-K-PcrtE-valFpoR-rev [c]            0.2              0.1
Rs265-9c                                         0.0              0.0

[a] Valencene production for each strain was tested on two different clones.
[b] Valencene production for each strain was tested on one clone.
[c] Valencene production for each strain was tested on three different clones.

TABLE 9
In vivo formation of valencene in shake flask experiments employing R. sphaeroides containing plasmids pBBR-K-mev-op-4-89-PcrtE-mbp-valFpoR-rev, pBBR-K-mev-op-4-89-PcrtE-nusA-valFpoR-rev, pBBR-K-mev-op-4-89-PcrtE-set-valFpoR-rev, pBBR-K-mev-op-4-89-PcrtE-trx-valFpoR-rev, and pBBR-K-mev-op-4-89-PcrtE-valFpoR-rev, and R. sphaeroides without plasmid.

                                                         Valencene in n-dodecane (mg/L)
Rhodobacter sphaeroides strain                           Average Titre    Standard Deviation
Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-mbp-valFpoR-rev [a]    95.9             9.0
Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-nusA-valFpoR-rev [b]   23.9             3.0
Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-set-valFpoR-rev [c]    12.5             0.9
Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-trx-valFpoR-rev [c]    66.9             5.8
Rs265-9c/pBBR-K-mev-op-4-89-PcrtE-valFpoR-rev [d]        0.4              0.1
Rs265-9c                                                 0.0              0.0

[a] Valencene production for each strain was tested on six different clones.
[b] Valencene production for each strain was tested on four different clones.
[c] Valencene production for each strain was tested on three different clones.
[d] Valencene production for each strain was tested on two different clones.

[0201] The data in Tables 7-8 show that the R. sphaeroides strains in which the Citrus×paradisi valencene synthase ValF (with a two-amino-acid C-terminal extension, ValFpoR) is expressed with an N-terminal tag-peptide produced over 7-fold more valencene than the strains expressing ValFpoR in its native form. This positive effect of expressing ValFpoR with an N-terminal tag-peptide on the valencene production is most pronounced when the E. coli MBP is applied as peptide-tag.
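The size of the tag effect can be tabulated per tag from the average titres in Tables 7 and 8. This is a rough sketch based on the rounded averages only; since the untagged titres are close to zero (0.5 ± 0.6 and 0.2 ± 0.1 mg/L), the resulting fold values are indicative rather than precise. The dictionary layout is introduced here for illustration.

```python
# Average valencene titres (mg/L n-dodecane) from Tables 7 and 8
tables = {
    "Table 7": {"untagged": 0.5, "mbp": 26.2, "nusA": 7.5, "set": 3.5, "trx": 16.6},
    "Table 8": {"untagged": 0.2, "mbp": 22.2, "nusA": 5.1, "set": 3.0, "trx": 6.2},
}

fold = {}
for table_name, titres in tables.items():
    untagged = titres["untagged"]
    # Fold increase of each tagged construct over the untagged reference
    fold[table_name] = {
        tag: titre / untagged for tag, titre in titres.items() if tag != "untagged"
    }

for table_name, per_tag in fold.items():
    best = max(per_tag, key=per_tag.get)
    summary = ", ".join(f"{tag}: {value:.0f}x" for tag, value in per_tag.items())
    print(f"{table_name}: {summary} (largest effect: {best})")
```

In both tables the MBP fusion gives the largest ratio and the set tag the smallest, in line with the statement that the effect is most pronounced with MBP.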

[0202] A similar positive effect of a translational fusion of the valFpoR-rev gene at its 5'-end to the 3'-end of a tag-peptide encoding gene on the valencene production is observed with R. sphaeroides strains that co-express a mutated mevalonate operon from Paracoccus zeaxanthinifaciens (Table 9). Also in this case, this positive effect is largest when the E. coli mbp encoding gene is used as such tag-peptide encoding gene.

[0203] Thus, this example shows that expression of a terpene synthase enzyme comprising a tag-peptide at its N-terminus according to the invention in an isoprenoid producing organism leads to a higher isoprenoid production than when expressing the terpene synthase without such tag-peptide.

Example 8

In Vivo Comparison of the Expression of an Amorphadiene Synthase with an N-Terminal Tag-Peptide (Invention) and without Such Tag-Peptide (Reference)

[0204] R. sphaeroides strains Rs265-9c (blank strain, no plasmid), Rs265-9c/pBBR-K-PcrtE-valF and Rs265-9c/pBBR-K-PcrtE-aaas (two reference strains, no N-terminal tag-peptide), Rs265-9c/pBBR-K-PcrtE-mbp-valF (a strain expressing the Citrus×paradisi valencene synthase gene valF translationally fused at its 5'-end to the 3'-end of the E. coli mbp gene) and Rs265-9c/pBBR-K-PcrtE-mbp-aaas (a strain expressing the Artemisia annua amorphadiene synthase gene aaas translationally fused at its 5'-end to the 3'-end of the E. coli mbp gene) were grown under the standard shake flask cultivation conditions as described above. Several clones of each transformed R. sphaeroides strain were tested for valencene or amorphadiene production, and each shake-flask experiment was run in duplicate, unless stated otherwise. The valencene and amorphadiene titres are reported in mg/L n-dodecane, wherein the organic phase n-dodecane constituted 10% (v/v) of the whole broth.

[0205] The results of this experiment are given in Table 10.

TABLE 10
In vivo formation of valencene or amorphadiene in shake flask experiments employing R. sphaeroides containing plasmids pBBR-K-PcrtE-mbp-valF, pBBR-K-PcrtE-mbp-aaas, pBBR-K-PcrtE-valF, and pBBR-K-PcrtE-aaas, and R. sphaeroides without plasmid.

                                            Valencene or Amorphadiene in n-dodecane (mg/L)
Rhodobacter sphaeroides strain              Average Titre    Standard Deviation
Rs265-9c/pBBR-K-PcrtE-mbp-valF [a]          25.4             1.4
Rs265-9c/pBBR-K-PcrtE-mbp-aaas [b]          666              72
Rs265-9c/pBBR-K-PcrtE-valF [c]              2.0              0.1
Rs265-9c/pBBR-K-PcrtE-aaas [d]              361              30
Rs265-9c                                    0.0              0.0

[a] Valencene production was tested on seven different clones.
[b] Amorphadiene production was tested on seven different clones.
[c] Valencene production was tested on one clone.
[d] Amorphadiene production was tested on one clone.

[0206] The data in Table 10 show that the R. sphaeroides strains in which the Citrus×paradisi valencene synthase ValF is expressed with an N-terminal MBP-tag produced over 10-fold more valencene than the strains expressing ValF in its native form, and that the R. sphaeroides strains in which the Artemisia annua amorphadiene synthase Aaas is expressed with an N-terminal MBP-tag produced almost 2-fold more amorphadiene than the strains expressing Aaas in its native form. This positive effect of expressing a sesquiterpene synthase with an N-terminal MBP-tag on sesquiterpene production is thus clearly applicable to enzymes other than valencene synthase, such as amorphadiene synthase.

Example 9

In Vivo Expression of C. nootkatensis Valencene Synthase in Yeast

[0207] The full length open reading frame encoding the C. nootkatensis valencene synthase (ValC) was amplified from plasmid pAC-65-3 with the primers 65-3ATGDuetFw 5'-tatatggatccATGGCTGAAATGTTTAATGGAAATTCCAGC-3' [SEQ ID NO: 30] (BamHI recognition site underlined), and DuetAS1 5'-GATTATGCGGCCGTGTACAA-3' [SEQ ID NO: 31].
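Since the underlining of the BamHI recognition site is not preserved in this text, the site can be located in the forward primer programmatically. The following sketch searches SEQ ID NO: 30 for the BamHI recognition sequence GGATCC and checks that an ATG start codon follows it directly; the variable names are illustrative.

```python
FORWARD_PRIMER = "tatatggatccATGGCTGAAATGTTTAATGGAAATTCCAGC"  # SEQ ID NO: 30
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

seq = FORWARD_PRIMER.upper()
site_start = seq.find(BAMHI_SITE)          # -1 if the site were absent
site_end = site_start + len(BAMHI_SITE)
after_site = seq[site_end:]

has_bamhi_site = site_start != -1
starts_with_atg = after_site.startswith("ATG")

# Mark the recognition site with brackets for display
annotated = seq[:site_start] + "[" + BAMHI_SITE + "]" + after_site

print(annotated)
print(f"BamHI site at positions {site_start + 1}-{site_end} (1-based)")
print(f"ATG directly after the site: {starts_with_atg}")
```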

[0208] The annealing site of the 65-3ATGDuetFw primer was at the beginning of the native open reading frame of valC (SEQ ID NO:3) and the primer was designed to introduce a start codon and the BamHI site for cloning into the yeast vector. Reverse primer DuetAS1 was complementary to a region of the pAC-65-3 plasmid downstream of the valC open reading frame. The PCR conditions were as follows: an initial denaturation of 45 s at 98° C. was followed by thirty PCR cycles of 10 s at 98° C., 20 s at 58° C. and 2 min at 72° C., and by a final extension of 5 min at 72° C. The final concentration of PCR reagents was 1× Phusion HF Buffer (Finnzymes), 200 μM dNTPs, 0.5 μM primers, 3% DMSO and 0.02 U/μL Phusion DNA polymerase (Finnzymes). The obtained PCR fragment was electrophoresed to confirm the desired length of the PCR product (1.9 kb) and was subsequently excised from the agarose gel and purified via standard techniques.
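The primer design and cycling programme described above can be sanity-checked with a short script. This is an illustrative sketch only: the helper names are hypothetical, the template fragment is the first 30 nucleotides of SEQ ID NO: 3 as given in the sequence listing, and the run time ignores ramping between temperatures.

```python
FORWARD_PRIMER = "tatatggatccATGGCTGAAATGTTTAATGGAAATTCCAGC"  # SEQ ID NO: 30
VALC_ORF_START = "atggctgaaatgtttaatggaaattccagc"  # first 30 nt of SEQ ID NO: 3
BAMHI_SITE = "GGATCC"


def find_site(primer: str, site: str) -> int:
    """Return the 0-based position of a restriction site in the primer, or -1."""
    return primer.upper().find(site.upper())


def annealing_part(primer: str, template_start: str) -> str:
    """Return the longest 3' portion of the primer that matches the template start."""
    p, t = primer.upper(), template_start.upper()
    for i in range(len(p)):
        if t.startswith(p[i:]):
            return p[i:]
    return ""


def program_seconds(initial=45, cycles=30, steps=(10, 20, 120), final=300) -> int:
    """Total hold time of the cycling programme, excluding temperature ramps."""
    return initial + cycles * sum(steps) + final


site_at = find_site(FORWARD_PRIMER, BAMHI_SITE)
anneal = annealing_part(FORWARD_PRIMER, VALC_ORF_START)

print(f"BamHI site starts at primer position {site_at + 1}")
print(f"Annealing region: {len(anneal)} nt, begins with {anneal[:3]}")
print(f"Programme hold time: {program_seconds() / 60:.2f} min")
```

The check confirms that the primer carries the BamHI site in its 5' overhang and that its 3' portion coincides with the start of the valC open reading frame, including the ATG start codon.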

[0209] The purified PCR fragment was ligated into vector pGEM-T Easy (Promega) according to the product manual and transformed into E. coli XL-1 Blue using standard procedures. Recombinant bacteria were selected on LB plates supplemented with 100 μg/mL ampicillin. The presence of the valC gene in the recombinant E. coli clones was confirmed by colony PCR using M13(-20) (5'-TTGTAAAACGACGGCCAGTG-3', SEQ ID NO: 32) and SP6 Chip (5'-GTGACACTATAGAATACTCAAGC-3', SEQ ID NO: 33) primers and standard protocols. The plasmid pGEM-valC was isolated using QIAprep Spin Miniprep Kit (Qiagen) and the sequence of valC was confirmed by DETT sequencing.

[0210] The plasmid pGEM-valC and the yeast expression vector pYES3/CT (Invitrogen) were digested with the restriction enzymes BamHI and NotI. The two required restriction fragments were subsequently excised from an agarose gel for purification. The fragments were then ligated and transformed into E. coli XL-1 Blue using standard procedures. By this cloning procedure the valC open reading frame was positioned between the GAL1 promoter that enables high level protein induction in yeast by galactose and the CYC1 terminator. No N- or C-terminal tags were added. Recombinant bacteria were selected on LB plates supplemented with 100 μg/mL ampicillin. The presence of the valC gene in the recombinant E. coli colonies was verified by colony PCR using vector primers and standard conditions. The plasmid was isolated using QIAprep Spin Miniprep Kit (Qiagen) and the nucleotide sequence of valC was confirmed by DETT sequencing.

[0211] The plasmid was then transformed into yeast strain WAT11 (Urban, P., Mignotte, C., Kazmaier, M., Delorme, F. and Pompon, D. 1997. J. Biol. Chem. 272: 19176-19186) using standard protocols (Gietz, R. D., Woods R. A. 2002. Methods in Enzymology 350: 87-96). The recombinant yeast colonies were selected on solid Synthetic dextrose minimal medium (0.67% Difco yeast nitrogen base medium without amino acids, 2% D-glucose, 40 mg/L adenine sulphate, 20 mg/L L-arginine, 100 mg/L L-aspartic acid, 100 mg/L L-glutamic acid, 20 mg/L L-histidine, 60 mg/L L-leucine, 30 mg/L L-lysine, 20 mg/L L-methionine, 50 mg/L L-phenylalanine, 375 mg/L L-serine, 200 mg/L L-threonine, 30 mg/L L-tyrosine, 150 mg/L L-valine, 20 mg/L uracil, 2% agar) omitting L-tryptophan for auxotrophic selection.

[0212] A single yeast colony containing valC was inoculated into 5 mL of liquid Synthetic galactose minimal medium (0.67% Difco yeast nitrogen base medium without amino acids, 2% D-galactose, 40 mg/L adenine sulphate, 20 mg/L L-arginine, 100 mg/L L-aspartic acid, 100 mg/L L-glutamic acid, 20 mg/L L-histidine, 60 mg/L L-leucine, 30 mg/L L-lysine, 20 mg/L L-methionine, 50 mg/L L-phenylalanine, 375 mg/L L-serine, 200 mg/L L-threonine, 30 mg/L L-tyrosine, 150 mg/L L-valine, 20 mg/L uracil) without L-tryptophan and the starter yeast culture was grown overnight at 30° C. Yeast cultures transformed with the empty pYES3/CT vector were used as controls in shake-flask fermentation experiments. After overnight incubation the optical density (OD600) of the yeast cultures was measured. The cultures were subsequently diluted to OD600 of 0.05 in 50 mL of Synthetic galactose minimal medium and incubated at 200 rpm and 30° C. The cultures were overlaid with 5 mL of n-dodecane when the OD600 was in the range from 0.8 to 1, and cultivation was continued for 3 days. After three days of fermentation the n-dodecane layer was separated from the yeast cultures by a glass separation funnel and subsequently centrifuged at 1,200 rpm for 10 min, diluted 3-fold in ethyl acetate, dried using anhydrous Na2SO4 and then analyzed by GC-MS, which was operated as has been described in the "Valencene synthase activity test" in the general part of the experimental section.
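The dilution of the overnight starter culture to an OD600 of 0.05 follows from a simple C1·V1 = C2·V2 calculation. The sketch below is illustrative: the function names and the example starter OD600 are hypothetical, and it assumes that OD600 scales linearly with cell density over the measured range.

```python
def inoculum_volume_ml(starter_od: float,
                       target_od: float = 0.05,
                       final_volume_ml: float = 50.0) -> float:
    """Volume of starter culture needed to reach target_od in final_volume_ml.

    Uses C1 * V1 = C2 * V2; assumes OD600 is proportional to cell density.
    """
    if starter_od <= target_od:
        raise ValueError("starter culture is not denser than the target OD600")
    return target_od * final_volume_ml / starter_od


def overlay_fraction(overlay_ml: float = 5.0, culture_ml: float = 50.0) -> float:
    """n-Dodecane overlay volume relative to the culture volume."""
    return overlay_ml / culture_ml


# Hypothetical overnight OD600 of 2.5, chosen for illustration only.
volume = inoculum_volume_ml(starter_od=2.5)
print(f"Add {volume:.2f} mL starter and fill with medium to 50 mL")
print(f"Overlay corresponds to {overlay_fraction():.0%} of the culture volume")
```

In practice a dense overnight culture is usually diluted before the OD600 reading so that the measurement stays within the linear range of the photometer.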

[0213] (+)-Valencene was detected at a retention time of 14.051 min and was identified by comparison of its mass spectrum and retention time to those of an authentic standard of (+)-valencene. No compound was detected at this retention time in the yeast cultures transformed with the empty pYES3/CT vector. Germacrene A was formed as a minor side product in these yeast fermentations.

[0214] Quantification of the amount of (+)-valencene produced was conducted by determination of the total ion count (TIC) peak area of the (+)-valencene peaks from three independent shake-flask fermentation experiments. Absolute concentration of (+)-valencene was calculated from the peak area by comparison to a standard curve prepared by measuring the dilution series of authentic standards with a known concentration. The produced amount of (+)-valencene was 1.36±0.05 mg/L yeast culture. This example thus demonstrates the applicability of valC to produce (+)-valencene in yeast.
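Quantification against a standard curve, as described above, can be sketched as a linear least-squares fit of peak area against known concentration. All calibration points and sample peak areas below are hypothetical placeholders, not data from this example, and the function names are illustrative; only the 3-fold dilution in ethyl acetate is taken from the text.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# HYPOTHETICAL dilution series of an authentic standard (mg/L vs TIC peak area).
STANDARD_MG_PER_L = [1.0, 2.0, 5.0, 10.0]
STANDARD_AREAS = [1000.0, 2000.0, 5000.0, 10000.0]

slope, intercept = fit_line(STANDARD_MG_PER_L, STANDARD_AREAS)


def sample_concentration(area: float, dilution_factor: float = 3.0) -> float:
    """Concentration in the undiluted n-dodecane layer (mg/L).

    dilution_factor=3 reflects the 3-fold dilution in ethyl acetate.
    """
    return (area - intercept) / slope * dilution_factor


# HYPOTHETICAL peak areas from three independent shake-flask experiments.
REPLICATE_AREAS = [4400.0, 4500.0, 4600.0]
concs = [sample_concentration(a) for a in REPLICATE_AREAS]
print(f"{mean(concs):.2f} +/- {stdev(concs):.2f} mg/L n-dodecane")
```

Expressing the result per litre of yeast culture additionally requires the ratio of overlay volume to culture volume; how that conversion was applied is not described in the text, so it is left out of the sketch.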

Example 10

Expression of ValC in Plants

[0215] The full length open reading frame encoding the valC was excised from plasmid pAC-65-3 using restriction enzymes BamHI and NotI. In parallel, cloning vector pImpactVector 1.5 (http://www.pri.wur.nl/UK/products/ImpactVector/) was also digested with restriction enzymes BamHI and NotI. Both the required pImpactVector 1.5 and the valC DNA restriction fragments were isolated from an agarose gel, followed by purification of the required DNA fragments, their subsequent ligation and finally transformation into E. coli XL-1 Blue using standard procedures. Recombinant bacteria were selected on solid LB medium (1000 mL deionized water, with 10 g Bactotryptone, 5 g Bacto yeast, 5 g NaCl) with 1.5% technical agar, containing 20 μg/mL gentamycin for selection of transformants. After overnight growth of recombinant colonies in liquid culture (3 mL LB broth with 20 μg/mL gentamycin, 250 rpm, 37° C.), plasmid DNA was isolated using the Qiaprep Spin Miniprep kit (Qiagen). Isolated plasmid material was tested by restriction analysis using the enzymes BamHI and NotI. Finally, the insert of a correct vector, which was named pIV5-ValC, was checked by DETT sequencing with vector primers. Within pIV5-ValC, the ValC DNA is preceded by a CoxIV mitochondrial targeting sequence (Kohler R H, Zipfel W R, Webb W W, Hanson M R. Plant J. 1997; 11:613-21), and positioned between the RbcS1 promoter (Prbcs) and RbcS1 terminator (Trbcs) from Chrysanthemum morifolium (http://www.pri.wur.nl/UK/products/ImpactVector/; Outchkourov N S, Peters J, de Jong J, Rademakers W, Jongsma M A. Planta. 2003, 216(6):1003-12).

[0216] DNA from the plasmids pIV5-ValC and pBINPLUS (van Engelen F A, Molthoff J W, Conner A J, Nap J P, Pereira A, Stiekema W J. Transgenic Res. 1995 July; 4(4):288-90.) was digested with AscI and PacI restriction enzymes in the prescribed buffers. Both the required pBINPLUS and valC DNA restriction fragments were isolated from an agarose gel, followed by purification of the required DNA fragments, their subsequent ligation and finally transformation into E. coli XL-1 Blue using standard procedures. Recombinant bacteria were selected on LB plates containing 50 μg/mL kanamycin. After overnight growth of recombinant colonies in liquid culture (3 mL LB broth with 50 μg/mL kanamycin, 250 rpm, 37° C.), plasmid DNA was isolated using the Qiaprep Spin Miniprep kit (Qiagen). Isolated plasmid material was tested by restriction analysis using the enzymes AscI and PacI. A plasmid with a correct insertion of the Prbcs, ValC and Trbcs cassette was called pBIN-ValC.

[0217] The pBIN-ValC and control plasmid pBINPLUS were transformed into Agrobacterium tumefaciens LBA4404. Electrocompetent cells of Agrobacterium were prepared according to standard protocols, and 40 μl of competent cells were mixed with 1 μl of plasmid DNA. The mix was then transferred to a pre-cooled electroporation cuvette and kept on ice until electroporation. For electroporation, the cuvette was placed in the electroporation holder and electroporated under standard conditions (100 ohm, 250 capacitance, 2.50 kV and 25 cap). Immediately after the electroporation, 1 mL of SOC-medium was added, and the cells were incubated for 60 minutes at 37° C. under gentle shaking. Thereafter, bacteria were plated on LB-agar plates with rifampicin (100 μg/ml) and kanamycin (50 μg/ml). The presence of correct plasmid DNA in the transformed bacteria was confirmed by plasmid isolation and restriction analysis using BamHI and NotI restriction enzymes.

[0218] For transformation of Nicotiana benthamiana plants, the Agrobacterium tumefaciens LBA4404 strains with pBIN-ValC and control plasmid pBINPLUS were inoculated into a 10 mL starter culture of liquid LB broth with rifampicin (100 μg/ml) and kanamycin (50 μg/ml) and grown overnight at 28° C. with shaking at 250 rpm. Subsequently, 0.25 mL of each starter culture was added to 25 ml liquid LB broth with rifampicin (100 μg/ml) and kanamycin (50 μg/ml) and incubated overnight at 28° C. with shaking at 250 rpm. The next day, the overnight culture was centrifuged for 10 minutes at 8000×g and the supernatant discarded. The pellet was resuspended in 20 mL M300 liquid medium (4.4 g/l Murashige & Skoog (MS) salts with vitamins, 0.5 g/l 2-(N-morpholino)ethanesulfonic acid (MES), 30 g/l sucrose, pH 6.0) with acetosyringone (100 μM). All chemicals for preparing the media were from Duchefa. Cells were centrifuged again under the same conditions, the supernatant was discarded and the cells were again resuspended in 20 mL M300 medium with acetosyringone. The resuspension was diluted in 980 ml of M300 medium with acetosyringone.
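The M300 medium composition and the final dilution of the Agrobacterium suspension can be expressed as a small calculation. This is a sketch for bench planning only; the dictionary and function names are illustrative and are not part of the described procedure.

```python
# M300 components in g per litre, as listed in the text (pH adjusted to 6.0).
M300_G_PER_L = {
    "MS salts with vitamins": 4.4,
    "MES": 0.5,
    "sucrose": 30.0,
}


def amounts_for(volume_l: float) -> dict:
    """Grams of each M300 component needed for the given volume in litres."""
    return {name: g_per_l * volume_l for name, g_per_l in M300_G_PER_L.items()}


def dilution_factor(aliquot_ml: float = 20.0, diluent_ml: float = 980.0) -> float:
    """Fold dilution when an aliquot is mixed into a volume of diluent."""
    return (aliquot_ml + diluent_ml) / aliquot_ml


for name, grams in amounts_for(1.0).items():
    print(f"{name}: {grams:.1f} g per litre")

print(f"Agrobacterium resuspension is diluted {dilution_factor():.0f}-fold")
```

With the volumes given in the text, the 20 mL resuspension ends up in a total of 1000 mL, so a batch of at least one litre of M300 with acetosyringone is needed per construct in addition to the wash volumes.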

[0219] On the same day, Nicotiana benthamiana plants that had been seeded on sterile MS-medium with 0.6% agar six weeks before and raised in a sterile environment (16 hours of light per day, 25° C.) were cut into leaf discs (explants) of 5-7 mm, and explants were immediately put in M300 liquid medium to prevent drying. After all explants (120 per construct) were cut, the M300 medium was replaced by diluted Agrobacterium suspension in a Petri dish, and the Petri dish was sealed and incubated in the dark for three days at room temperature. Subsequently, the explants were washed in M300 medium with ticarcillin (500 mg/L) and laid on solid M300 with benzylaminopurine (1 mg/l), auxin (0.1 mg/L), ticarcillin (500 mg/L), kanamycin (50 μg/L) and microagar (0.6%). Explants were maintained on this medium in a growth chamber (16 hours of light per day, 25° C.) and transferred to fresh medium every 14 days. After callus formation had occurred (after approximately 4 weeks), calli were cut and transferred to solid M300 with benzylaminopurine (1 mg/l), ticarcillin (500 mg/l), kanamycin (50 μg/l) and microagar (0.6%). When shoots were formed (after 4 to 8 weeks), they were cut from the callus and transferred to solid M300 with ticarcillin (500 mg/l), kanamycin (50 μg/l) and microagar (0.6%) to stimulate rooting. For each line, 12 rooted plants were transferred to soil and further raised in a greenhouse (16 hours of light at 28° C. and 8 hours of darkness at 25° C.) until they had approximately 12 leaves. At this stage, experiments for determining production of valencene were started.

[0220] Three pBIN-ValC plants and three pBINPLUS plants were further analyzed. For each plant, three freshly cut N. benthamiana leaves of 0.4 to 1.0 g were weighed, and cut ends were placed in a 4-mL beaker covered with aluminum foil and containing 3 mL of water. Each beaker with a leaf was placed in a separate 0.5-liter sealed glass container. Leaves were then incubated at 21° C. in a light regime of 16 hours of light and 8 hours of darkness. A vacuum pump was used to draw air through the glass container at approximately 100 mL/min, with the incoming air being purified through steel sorbent cartridges (89 mm×6.4 mm O.D.; Markes) containing 200 mg Tenax TA 20/35. At the outlet, the volatiles emitted by the detached leaves were trapped on a similar cartridge. Volatiles were collected during 24 h. Outlet cartridges were eluted using 3 times 1 mL of pentane:diethyl ether (4:1). Non-concentrated samples were dehydrated using anhydrous Na2SO4, and analyzed by GC-MS using a gas chromatograph (5890 series II, Hewlett-Packard) equipped with a 30 m×0.25 mm, 0.25 μm film thickness column (5MS, Hewlett-Packard) and a mass-selective detector (model 5972A, Hewlett-Packard). For analysis, 1 μl was injected, and the column temperature was increased from 45° C. to 280° C. in 20 minutes. A range of valencene standard solutions in pentane:diethyl ether (80:20 v/v) was injected for reference and quantification. Valencene was found to elute at 13.87 minutes, and was identified in the plant headspace by comparison to the mass spectrum and retention time of the standard. The amount of valencene emitted was quantified for each plant by averaging the emitted micrograms of valencene per g leaf per 24 hours. While the pBINPLUS plants did not emit any detectable valencene, the three pBIN-ValC plants emitted (+)-valencene at 0.51, 0.63 and 0.48 μg valencene per g leaf per 24 hours, respectively. This demonstrated the ability of ValC to mediate valencene production in plants.
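The per-plant emission rates reported above can be summarised, and the GC oven programme expressed as a ramp rate, with a few lines of code. This is a minimal sketch: the function names are illustrative, the normalisation helper assumes emission is linear in leaf mass and time, and a single linear temperature ramp is assumed because the text gives only the start temperature, end temperature and duration.

```python
from statistics import mean, stdev

# Emission rates of the three pBIN-ValC plants (ug valencene per g leaf per 24 h).
EMISSIONS = [0.51, 0.63, 0.48]


def emission_rate(valencene_ug: float, leaf_mass_g: float,
                  hours: float = 24.0) -> float:
    """Normalise trapped valencene to ug per g leaf per 24 h."""
    return valencene_ug / leaf_mass_g * (24.0 / hours)


def ramp_rate(start_c: float = 45.0, end_c: float = 280.0,
              minutes: float = 20.0) -> float:
    """Oven ramp in degrees C per minute, assuming one linear ramp."""
    return (end_c - start_c) / minutes


print(f"Mean emission: {mean(EMISSIONS):.2f} ug/g leaf/24 h")
print(f"Sample standard deviation: {stdev(EMISSIONS):.2f}")
print(f"Oven ramp: {ramp_rate():.2f} C/min")
```

The standard deviation here describes plant-to-plant variation across the three transgenic plants; variation between the three leaves of a single plant is not reported in the text and is therefore not modelled.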

Sequence CWU 1

1

3511722DNACallitropsis nootkatensisCDS(1)..(1722) 1atg ccc gtg aag gac gcc ctt cgt cgg act gga aat cat cat cct aac 48Met Pro Val Lys Asp Ala Leu Arg Arg Thr Gly Asn His His Pro Asn 1 5 10 15 ttg tgg act gat gat ttc ata cag tcc ctc aat tct cca tat tcg gat 96Leu Trp Thr Asp Asp Phe Ile Gln Ser Leu Asn Ser Pro Tyr Ser Asp 20 25 30 tct tca tac cat aaa cat agg gaa ata cta att gat gag att cgt gat 144Ser Ser Tyr His Lys His Arg Glu Ile Leu Ile Asp Glu Ile Arg Asp 35 40 45 atg ttt tct aat gga gaa ggc gat gag ttc ggt gta ctt gaa aat att 192Met Phe Ser Asn Gly Glu Gly Asp Glu Phe Gly Val Leu Glu Asn Ile 50 55 60 tgg ttt gtt gat gtt gta caa cgt ttg gga ata gat cga cat ttt caa 240Trp Phe Val Asp Val Val Gln Arg Leu Gly Ile Asp Arg His Phe Gln 65 70 75 80 gag gaa atc aaa act gca ctt gat tat atc tac aag ttc tgg aat cat 288Glu Glu Ile Lys Thr Ala Leu Asp Tyr Ile Tyr Lys Phe Trp Asn His 85 90 95 gat agt att ttt ggc gat ctc aac atg gtg gct cta gga ttt cgg ata 336Asp Ser Ile Phe Gly Asp Leu Asn Met Val Ala Leu Gly Phe Arg Ile 100 105 110 cta cga ctg aat aga tat gtc gct tct tca gat gtt ttt aaa aag ttc 384Leu Arg Leu Asn Arg Tyr Val Ala Ser Ser Asp Val Phe Lys Lys Phe 115 120 125 aaa ggt gaa gaa gga caa ttc tct ggt ttt gaa tct agc gat caa gat 432Lys Gly Glu Glu Gly Gln Phe Ser Gly Phe Glu Ser Ser Asp Gln Asp 130 135 140 gca aaa tta gaa atg atg tta aat tta tat aaa gct tca gaa tta gat 480Ala Lys Leu Glu Met Met Leu Asn Leu Tyr Lys Ala Ser Glu Leu Asp 145 150 155 160 ttt cct gat gaa gat atc tta aaa gaa gca aga gcg ttt gct tct atg 528Phe Pro Asp Glu Asp Ile Leu Lys Glu Ala Arg Ala Phe Ala Ser Met 165 170 175 tac ctg aaa cat gtt atc aaa gaa tat ggt gac ata caa gaa tca aaa 576Tyr Leu Lys His Val Ile Lys Glu Tyr Gly Asp Ile Gln Glu Ser Lys 180 185 190 aat cca ctt cta atg gag ata gag tac act ttt aaa tat cct tgg aga 624Asn Pro Leu Leu Met Glu Ile Glu Tyr Thr Phe Lys Tyr Pro Trp Arg 195 200 205 tgt agg ctt cca agg ttg gag gct tgg aac ttt att cat ata atg aga 672Cys Arg Leu Pro Arg Leu Glu Ala 
Trp Asn Phe Ile His Ile Met Arg 210 215 220 caa caa gat tgc aat ata tca ctt gcc aat aac ctt tat aaa att cca 720Gln Gln Asp Cys Asn Ile Ser Leu Ala Asn Asn Leu Tyr Lys Ile Pro 225 230 235 240 aaa ata tat atg aaa aag ata ttg gaa cta gca ata ctg gac ttc aat 768Lys Ile Tyr Met Lys Lys Ile Leu Glu Leu Ala Ile Leu Asp Phe Asn 245 250 255 att ttg cag tca caa cat caa cat gaa atg aaa tta ata tcc aca tgg 816Ile Leu Gln Ser Gln His Gln His Glu Met Lys Leu Ile Ser Thr Trp 260 265 270 tgg aaa aat tca agt gca att caa ttg gat ttc ttt cgg cat cgt cac 864Trp Lys Asn Ser Ser Ala Ile Gln Leu Asp Phe Phe Arg His Arg His 275 280 285 ata gaa agt tat ttt tgg tgg gct agt cca tta ttt gaa cct gag ttc 912Ile Glu Ser Tyr Phe Trp Trp Ala Ser Pro Leu Phe Glu Pro Glu Phe 290 295 300 agt aca tgt aga att aat tgt acc aaa tta tct aca aaa atg ttc ctc 960Ser Thr Cys Arg Ile Asn Cys Thr Lys Leu Ser Thr Lys Met Phe Leu 305 310 315 320 ctt gac gat att tat gac aca tat ggg act gtt gag gaa ttg aaa cca 1008Leu Asp Asp Ile Tyr Asp Thr Tyr Gly Thr Val Glu Glu Leu Lys Pro 325 330 335 ttc aca aca aca tta aca aga tgg gat gtt tcc aca gtt gat aat cat 1056Phe Thr Thr Thr Leu Thr Arg Trp Asp Val Ser Thr Val Asp Asn His 340 345 350 cca gac tac atg aaa att gct ttc aat ttt tca tat gag ata tat aag 1104Pro Asp Tyr Met Lys Ile Ala Phe Asn Phe Ser Tyr Glu Ile Tyr Lys 355 360 365 gaa att gca agt gaa gcc gaa aga aag cat ggt ccc ttt gtt tac aaa 1152Glu Ile Ala Ser Glu Ala Glu Arg Lys His Gly Pro Phe Val Tyr Lys 370 375 380 tac ctt caa tct tgc tgg aag agt tat atc gag gct tat atg caa gaa 1200Tyr Leu Gln Ser Cys Trp Lys Ser Tyr Ile Glu Ala Tyr Met Gln Glu 385 390 395 400 gca gaa tgg ata gct tct aat cat ata cca ggt ttt gat gaa tac ttg 1248Ala Glu Trp Ile Ala Ser Asn His Ile Pro Gly Phe Asp Glu Tyr Leu 405 410 415 atg aat gga gta aaa agt agc ggc atg cga att cta atg ata cat gca 1296Met Asn Gly Val Lys Ser Ser Gly Met Arg Ile Leu Met Ile His Ala 420 425 430 cta ata cta atg gat act cct tta tct gat gaa att ttg gag caa ctt 1344Leu Ile 
Leu Met Asp Thr Pro Leu Ser Asp Glu Ile Leu Glu Gln Leu 435 440 445 gat atc cca tca tcc aag tcg caa gct ctt cta tca tta att act cga 1392Asp Ile Pro Ser Ser Lys Ser Gln Ala Leu Leu Ser Leu Ile Thr Arg 450 455 460 cta gtg gat gat gtc aaa gac ttt gag gat gaa caa gct cat ggg gag 1440Leu Val Asp Asp Val Lys Asp Phe Glu Asp Glu Gln Ala His Gly Glu 465 470 475 480 atg gca tca agt ata gag tgc tac atg aaa gac aac cat ggt tct aca 1488Met Ala Ser Ser Ile Glu Cys Tyr Met Lys Asp Asn His Gly Ser Thr 485 490 495 agg gaa gat gct ttg aat tat ctc aaa att cgt ata gag agt tgt gtg 1536Arg Glu Asp Ala Leu Asn Tyr Leu Lys Ile Arg Ile Glu Ser Cys Val 500 505 510 caa gag tta aat aag gag ctt ctc gag cct tca aat atg cat gga tct 1584Gln Glu Leu Asn Lys Glu Leu Leu Glu Pro Ser Asn Met His Gly Ser 515 520 525 ttt aga aac cta tat ctc aat gtt ggc atg cga gta ata ttt ttt atg 1632Phe Arg Asn Leu Tyr Leu Asn Val Gly Met Arg Val Ile Phe Phe Met 530 535 540 ctc aat gat ggt gat ctc ttt aca cac tcc aat aga aaa gag ata caa 1680Leu Asn Asp Gly Asp Leu Phe Thr His Ser Asn Arg Lys Glu Ile Gln 545 550 555 560 gat gca ata aca aaa ttt ttt gtg gaa cca atc att cca tag 1722Asp Ala Ile Thr Lys Phe Phe Val Glu Pro Ile Ile Pro 565 570 2573PRTCallitropsis nootkatensis 2Met Pro Val Lys Asp Ala Leu Arg Arg Thr Gly Asn His His Pro Asn 1 5 10 15 Leu Trp Thr Asp Asp Phe Ile Gln Ser Leu Asn Ser Pro Tyr Ser Asp 20 25 30 Ser Ser Tyr His Lys His Arg Glu Ile Leu Ile Asp Glu Ile Arg Asp 35 40 45 Met Phe Ser Asn Gly Glu Gly Asp Glu Phe Gly Val Leu Glu Asn Ile 50 55 60 Trp Phe Val Asp Val Val Gln Arg Leu Gly Ile Asp Arg His Phe Gln 65 70 75 80 Glu Glu Ile Lys Thr Ala Leu Asp Tyr Ile Tyr Lys Phe Trp Asn His 85 90 95 Asp Ser Ile Phe Gly Asp Leu Asn Met Val Ala Leu Gly Phe Arg Ile 100 105 110 Leu Arg Leu Asn Arg Tyr Val Ala Ser Ser Asp Val Phe Lys Lys Phe 115 120 125 Lys Gly Glu Glu Gly Gln Phe Ser Gly Phe Glu Ser Ser Asp Gln Asp 130 135 140 Ala Lys Leu Glu Met Met Leu Asn Leu Tyr Lys Ala Ser Glu Leu Asp 145 150 155 160 Phe Pro Asp 
Glu Asp Ile Leu Lys Glu Ala Arg Ala Phe Ala Ser Met 165 170 175 Tyr Leu Lys His Val Ile Lys Glu Tyr Gly Asp Ile Gln Glu Ser Lys 180 185 190 Asn Pro Leu Leu Met Glu Ile Glu Tyr Thr Phe Lys Tyr Pro Trp Arg 195 200 205 Cys Arg Leu Pro Arg Leu Glu Ala Trp Asn Phe Ile His Ile Met Arg 210 215 220 Gln Gln Asp Cys Asn Ile Ser Leu Ala Asn Asn Leu Tyr Lys Ile Pro 225 230 235 240 Lys Ile Tyr Met Lys Lys Ile Leu Glu Leu Ala Ile Leu Asp Phe Asn 245 250 255 Ile Leu Gln Ser Gln His Gln His Glu Met Lys Leu Ile Ser Thr Trp 260 265 270 Trp Lys Asn Ser Ser Ala Ile Gln Leu Asp Phe Phe Arg His Arg His 275 280 285 Ile Glu Ser Tyr Phe Trp Trp Ala Ser Pro Leu Phe Glu Pro Glu Phe 290 295 300 Ser Thr Cys Arg Ile Asn Cys Thr Lys Leu Ser Thr Lys Met Phe Leu 305 310 315 320 Leu Asp Asp Ile Tyr Asp Thr Tyr Gly Thr Val Glu Glu Leu Lys Pro 325 330 335 Phe Thr Thr Thr Leu Thr Arg Trp Asp Val Ser Thr Val Asp Asn His 340 345 350 Pro Asp Tyr Met Lys Ile Ala Phe Asn Phe Ser Tyr Glu Ile Tyr Lys 355 360 365 Glu Ile Ala Ser Glu Ala Glu Arg Lys His Gly Pro Phe Val Tyr Lys 370 375 380 Tyr Leu Gln Ser Cys Trp Lys Ser Tyr Ile Glu Ala Tyr Met Gln Glu 385 390 395 400 Ala Glu Trp Ile Ala Ser Asn His Ile Pro Gly Phe Asp Glu Tyr Leu 405 410 415 Met Asn Gly Val Lys Ser Ser Gly Met Arg Ile Leu Met Ile His Ala 420 425 430 Leu Ile Leu Met Asp Thr Pro Leu Ser Asp Glu Ile Leu Glu Gln Leu 435 440 445 Asp Ile Pro Ser Ser Lys Ser Gln Ala Leu Leu Ser Leu Ile Thr Arg 450 455 460 Leu Val Asp Asp Val Lys Asp Phe Glu Asp Glu Gln Ala His Gly Glu 465 470 475 480 Met Ala Ser Ser Ile Glu Cys Tyr Met Lys Asp Asn His Gly Ser Thr 485 490 495 Arg Glu Asp Ala Leu Asn Tyr Leu Lys Ile Arg Ile Glu Ser Cys Val 500 505 510 Gln Glu Leu Asn Lys Glu Leu Leu Glu Pro Ser Asn Met His Gly Ser 515 520 525 Phe Arg Asn Leu Tyr Leu Asn Val Gly Met Arg Val Ile Phe Phe Met 530 535 540 Leu Asn Asp Gly Asp Leu Phe Thr His Ser Asn Arg Lys Glu Ile Gln 545 550 555 560 Asp Ala Ile Thr Lys Phe Phe Val Glu Pro Ile Ile Pro 565 570 31770DNACallitropsis 
nootkatensisCDS(1)..(1770) 3atg gct gaa atg ttt aat gga aat tcc agc aat gat gga agt tct tgc 48Met Ala Glu Met Phe Asn Gly Asn Ser Ser Asn Asp Gly Ser Ser Cys 1 5 10 15 atg ccc gtg aag gac gcc ctt cgt cgg act gga aat cat cat cct aac 96Met Pro Val Lys Asp Ala Leu Arg Arg Thr Gly Asn His His Pro Asn 20 25 30 ttg tgg act gat gat ttc ata cag tcc ctc aat tct cca tat tcg gat 144Leu Trp Thr Asp Asp Phe Ile Gln Ser Leu Asn Ser Pro Tyr Ser Asp 35 40 45 tct tca tac cat aaa cat agg gaa ata cta att gat gag att cgt gat 192Ser Ser Tyr His Lys His Arg Glu Ile Leu Ile Asp Glu Ile Arg Asp 50 55 60 atg ttt tct aat gga gaa ggc gat gag ttc ggt gta ctt gaa aat att 240Met Phe Ser Asn Gly Glu Gly Asp Glu Phe Gly Val Leu Glu Asn Ile 65 70 75 80 tgg ttt gtt gat gtt gta caa cgt ttg gga ata gat cga cat ttt caa 288Trp Phe Val Asp Val Val Gln Arg Leu Gly Ile Asp Arg His Phe Gln 85 90 95 gag gaa atc aaa act gca ctt gat tat atc tac aag ttc tgg aat cat 336Glu Glu Ile Lys Thr Ala Leu Asp Tyr Ile Tyr Lys Phe Trp Asn His 100 105 110 gat agt att ttt ggc gat ctc aac atg gtg gct cta gga ttt cgg ata 384Asp Ser Ile Phe Gly Asp Leu Asn Met Val Ala Leu Gly Phe Arg Ile 115 120 125 cta cga ctg aat aga tat gtc gct tct tca gat gtt ttt aaa aag ttc 432Leu Arg Leu Asn Arg Tyr Val Ala Ser Ser Asp Val Phe Lys Lys Phe 130 135 140 aaa ggt gaa gaa gga caa ttc tct ggt ttt gaa tct agc gat caa gat 480Lys Gly Glu Glu Gly Gln Phe Ser Gly Phe Glu Ser Ser Asp Gln Asp 145 150 155 160 gca aaa tta gaa atg atg tta aat tta tat aaa gct tca gaa tta gat 528Ala Lys Leu Glu Met Met Leu Asn Leu Tyr Lys Ala Ser Glu Leu Asp 165 170 175 ttt cct gat gaa gat atc tta aaa gaa gca aga gcg ttt gct tct atg 576Phe Pro Asp Glu Asp Ile Leu Lys Glu Ala Arg Ala Phe Ala Ser Met 180 185 190 tac ctg aaa cat gtt atc aaa gaa tat ggt gac ata caa gaa tca aaa 624Tyr Leu Lys His Val Ile Lys Glu Tyr Gly Asp Ile Gln Glu Ser Lys 195 200 205 aat cca ctt cta atg gag ata gag tac act ttt aaa tat cct tgg aga 672Asn Pro Leu Leu Met Glu Ile Glu Tyr Thr Phe Lys Tyr Pro 
Trp Arg 210 215 220 tgt agg ctt cca agg ttg gag gct tgg aac ttt att cat ata atg aga 720Cys Arg Leu Pro Arg Leu Glu Ala Trp Asn Phe Ile His Ile Met Arg 225 230 235 240 caa caa gat tgc aat ata tca ctt gcc aat aac ctt tat aaa att cca 768Gln Gln Asp Cys Asn Ile Ser Leu Ala Asn Asn Leu Tyr Lys Ile Pro 245 250 255 aaa ata tat atg aaa aag ata ttg gaa cta gca ata ctg gac ttc aat 816Lys Ile Tyr Met Lys Lys Ile Leu Glu Leu Ala Ile Leu Asp Phe Asn 260 265 270 att ttg cag tca caa cat caa cat gaa atg aaa tta ata tcc aca tgg 864Ile Leu Gln Ser Gln His Gln His Glu Met Lys Leu Ile Ser Thr Trp 275 280 285 tgg aaa aat tca agt gca att caa ttg gat ttc ttt cgg cat cgt cac 912Trp Lys Asn Ser Ser Ala Ile Gln Leu Asp Phe Phe Arg His Arg His 290 295 300 ata gaa agt tat ttt tgg tgg gct agt cca tta ttt gaa cct gag ttc 960Ile Glu Ser Tyr Phe Trp Trp Ala Ser Pro Leu Phe Glu Pro Glu Phe 305 310 315 320 agt aca tgt aga att aat tgt acc aaa tta tct aca aaa atg ttc ctc 1008Ser Thr Cys Arg Ile Asn Cys Thr Lys Leu Ser Thr Lys Met Phe Leu 325 330 335 ctt gac gat att tat gac aca tat ggg act gtt gag gaa ttg aaa cca 1056Leu Asp Asp Ile Tyr Asp Thr Tyr Gly Thr Val Glu Glu Leu Lys Pro 340 345 350 ttc aca aca aca tta aca aga tgg gat gtt tcc aca gtt gat aat cat 1104Phe Thr Thr Thr Leu Thr Arg Trp Asp Val Ser Thr Val Asp Asn His 355 360 365 cca gac tac atg aaa att gct ttc aat ttt tca tat gag ata tat aag 1152Pro Asp Tyr Met Lys Ile Ala Phe Asn Phe Ser Tyr Glu Ile Tyr Lys 370 375 380 gaa att gca agt gaa gcc gaa aga aag cat ggt ccc ttt gtt tac aaa 1200Glu Ile Ala Ser Glu Ala Glu Arg Lys His Gly Pro Phe Val Tyr Lys 385 390 395 400 tac ctt caa tct tgc tgg aag agt tat atc gag gct tat atg caa gaa 1248Tyr Leu Gln Ser Cys Trp Lys Ser Tyr Ile Glu Ala Tyr Met Gln Glu 405 410 415 gca gaa tgg ata gct tct aat cat ata cca ggt ttt gat gaa tac ttg 1296Ala Glu Trp Ile Ala Ser Asn His Ile Pro Gly Phe Asp Glu Tyr Leu

420 425 430 atg aat gga gta aaa agt agc ggc atg cga att cta atg ata cat gca 1344Met Asn Gly Val Lys Ser Ser Gly Met Arg Ile Leu Met Ile His Ala 435 440 445 cta ata cta atg gat act cct tta tct gat gaa att ttg gag caa ctt 1392Leu Ile Leu Met Asp Thr Pro Leu Ser Asp Glu Ile Leu Glu Gln Leu 450 455 460 gat atc cca tca tcc aag tcg caa gct ctt cta tca tta att act cga 1440Asp Ile Pro Ser Ser Lys Ser Gln Ala Leu Leu Ser Leu Ile Thr Arg 465 470 475 480 cta gtg gat gat gtc aaa gac ttt gag gat gaa caa gct cat ggg gag 1488Leu Val Asp Asp Val Lys Asp Phe Glu Asp Glu Gln Ala His Gly Glu 485 490 495 atg gca tca agt ata gag tgc tac atg aaa gac aac cat ggt tct aca 1536Met Ala Ser Ser Ile Glu Cys Tyr Met Lys Asp Asn His Gly Ser Thr 500 505 510 agg gaa gat gct ttg aat tat ctc aaa att cgt ata gag agt tgt gtg 1584Arg Glu Asp Ala Leu Asn Tyr Leu Lys Ile Arg Ile Glu Ser Cys Val 515 520 525 caa gag tta aat aag gag ctt ctc gag cct tca aat atg cat gga tct 1632Gln Glu Leu Asn Lys Glu Leu Leu Glu Pro Ser Asn Met His Gly Ser 530 535 540 ttt aga aac cta tat ctc aat gtt ggc atg cga gta ata ttt ttt atg 1680Phe Arg Asn Leu Tyr Leu Asn Val Gly Met Arg Val Ile Phe Phe Met 545 550 555 560 ctc aat gat ggt gat ctc ttt aca cac tcc aat aga aaa gag ata caa 1728Leu Asn Asp Gly Asp Leu Phe Thr His Ser Asn Arg Lys Glu Ile Gln 565 570 575 gat gca ata aca aaa ttt ttt gtg gaa cca atc att cca tag 1770Asp Ala Ile Thr Lys Phe Phe Val Glu Pro Ile Ile Pro 580 585 4589PRTCallitropsis nootkatensis 4Met Ala Glu Met Phe Asn Gly Asn Ser Ser Asn Asp Gly Ser Ser Cys 1 5 10 15 Met Pro Val Lys Asp Ala Leu Arg Arg Thr Gly Asn His His Pro Asn 20 25 30 Leu Trp Thr Asp Asp Phe Ile Gln Ser Leu Asn Ser Pro Tyr Ser Asp 35 40 45 Ser Ser Tyr His Lys His Arg Glu Ile Leu Ile Asp Glu Ile Arg Asp 50 55 60 Met Phe Ser Asn Gly Glu Gly Asp Glu Phe Gly Val Leu Glu Asn Ile 65 70 75 80 Trp Phe Val Asp Val Val Gln Arg Leu Gly Ile Asp Arg His Phe Gln 85 90 95 Glu Glu Ile Lys Thr Ala Leu Asp Tyr Ile Tyr Lys Phe Trp Asn His 100 105 110 Asp Ser Ile 
Phe Gly Asp Leu Asn Met Val Ala Leu Gly Phe Arg Ile 115 120 125 Leu Arg Leu Asn Arg Tyr Val Ala Ser Ser Asp Val Phe Lys Lys Phe 130 135 140 Lys Gly Glu Glu Gly Gln Phe Ser Gly Phe Glu Ser Ser Asp Gln Asp 145 150 155 160 Ala Lys Leu Glu Met Met Leu Asn Leu Tyr Lys Ala Ser Glu Leu Asp 165 170 175 Phe Pro Asp Glu Asp Ile Leu Lys Glu Ala Arg Ala Phe Ala Ser Met 180 185 190 Tyr Leu Lys His Val Ile Lys Glu Tyr Gly Asp Ile Gln Glu Ser Lys 195 200 205 Asn Pro Leu Leu Met Glu Ile Glu Tyr Thr Phe Lys Tyr Pro Trp Arg 210 215 220 Cys Arg Leu Pro Arg Leu Glu Ala Trp Asn Phe Ile His Ile Met Arg 225 230 235 240 Gln Gln Asp Cys Asn Ile Ser Leu Ala Asn Asn Leu Tyr Lys Ile Pro 245 250 255 Lys Ile Tyr Met Lys Lys Ile Leu Glu Leu Ala Ile Leu Asp Phe Asn 260 265 270 Ile Leu Gln Ser Gln His Gln His Glu Met Lys Leu Ile Ser Thr Trp 275 280 285 Trp Lys Asn Ser Ser Ala Ile Gln Leu Asp Phe Phe Arg His Arg His 290 295 300 Ile Glu Ser Tyr Phe Trp Trp Ala Ser Pro Leu Phe Glu Pro Glu Phe 305 310 315 320 Ser Thr Cys Arg Ile Asn Cys Thr Lys Leu Ser Thr Lys Met Phe Leu 325 330 335 Leu Asp Asp Ile Tyr Asp Thr Tyr Gly Thr Val Glu Glu Leu Lys Pro 340 345 350 Phe Thr Thr Thr Leu Thr Arg Trp Asp Val Ser Thr Val Asp Asn His 355 360 365 Pro Asp Tyr Met Lys Ile Ala Phe Asn Phe Ser Tyr Glu Ile Tyr Lys 370 375 380 Glu Ile Ala Ser Glu Ala Glu Arg Lys His Gly Pro Phe Val Tyr Lys 385 390 395 400 Tyr Leu Gln Ser Cys Trp Lys Ser Tyr Ile Glu Ala Tyr Met Gln Glu 405 410 415 Ala Glu Trp Ile Ala Ser Asn His Ile Pro Gly Phe Asp Glu Tyr Leu 420 425 430 Met Asn Gly Val Lys Ser Ser Gly Met Arg Ile Leu Met Ile His Ala 435 440 445 Leu Ile Leu Met Asp Thr Pro Leu Ser Asp Glu Ile Leu Glu Gln Leu 450 455 460 Asp Ile Pro Ser Ser Lys Ser Gln Ala Leu Leu Ser Leu Ile Thr Arg 465 470 475 480 Leu Val Asp Asp Val Lys Asp Phe Glu Asp Glu Gln Ala His Gly Glu 485 490 495 Met Ala Ser Ser Ile Glu Cys Tyr Met Lys Asp Asn His Gly Ser Thr 500 505 510 Arg Glu Asp Ala Leu Asn Tyr Leu Lys Ile Arg Ile Glu Ser Cys Val 515 520 525 Gln Glu Leu Asn 
Lys Glu Leu Leu Glu Pro Ser Asn Met His Gly Ser 530 535 540 Phe Arg Asn Leu Tyr Leu Asn Val Gly Met Arg Val Ile Phe Phe Met 545 550 555 560 Leu Asn Asp Gly Asp Leu Phe Thr His Ser Asn Arg Lys Glu Ile Gln 565 570 575 Asp Ala Ile Thr Lys Phe Phe Val Glu Pro Ile Ile Pro 580 585 539DNAArtificialprimer 5atataggatc cggctgaaat gtttaatgga aattccagc 39640DNAArtificialprimer 6atatactgca gctctggatc tatggaatga ttggttccac 4071650DNAArtificialcodon optimized ValF gene 7atgtcgagcg gcgagacctt ccgccccacg gccgacttcc atccgtccct ctggcggaac 60cacttcctca agggggcctc cgatttcaag accgtggacc atacggcgac gcaggaacgg 120cacgaggccc tcaaggagga ggtccgccgc atgatcaccg acgccgaaga caagccggtc 180cagaagctcc gcctgatcga cgaggtccag cgcctgggcg tggcgtatca tttcgagaaa 240gaaatcgagg atgcgatcca gaagctctgc ccgatctata tcgatagcaa tcgcgccgat 300ctccataccg tgtcgctgca cttccgcctg ctgcggcagc agggcatcaa gatcagctgc 360gacgtgttcg aaaagttcaa ggacgacgag ggccgcttca agtcgtcgct gatcaacgac 420gtgcagggca tgctgtcgct gtacgaggcc gcgtacatgg ccgtgcgcgg cgagcatatc 480ctggacgaag ccatcgcgtt cacgaccacg catctgaagt cgctggtggc gcaggaccac 540gtgacgccga agctcgccga gcagatcaac cacgcgctgt atcggccgct ccgcaagacc 600ctcccgcgcc tcgaggcccg ctatttcatg agcatgatca actcgacctc ggatcacctg 660tacaataaga ccctgctcaa cttcgcgaaa ctggacttca atatcctcct cgagctgcac 720aaggaggagc tcaacgagct gaccaagtgg tggaaggatc tggacttcac caccaagctg 780ccgtacgccc gcgatcgcct cgtggagctg tatttctggg acctgggcac ctacttcgaa 840ccccagtacg ccttcgggcg gaagatcatg acccagctca attatatcct cagcatcatc 900gacgacacct atgacgcgta cggcacgctg gaggagctgt ccctgttcac ggaagccgtc 960cagcggtgga acatcgaggc cgtcgacatg ctccccgagt acatgaaact gatctaccgg 1020accctgctgg atgccttcaa cgagatcgag gaggacatgg cgaaacaggg ccggtcccac 1080tgcgtgcgct acgcgaagga agagaaccag aaggtcatcg gcgcctactc ggtccaggcg 1140aagtggttca gcgagggcta tgtgccgacg atcgaggaat atatgccgat cgcgctcacc 1200tcgtgcgcgt acacgttcgt gatcaccaat tcgttcctcg gcatgggcga tttcgcgacc 1260aaggaggtct tcgagtggat cagcaacaat ccgaaggtgg tgaaggcggc ctcggtcatc 1320tgccggctca 
tggatgacat gcaggggcat gagttcgaac agaagcgcgg ccacgtcgcg 1380tccgccatcg agtgctatac caagcagcat ggcgtgtcga aggaggaggc catcaagatg 1440ttcgaggagg aagtcgccaa cgcgtggaag gacatcaatg aggagctgat gatgaagccc 1500accgtcgtgg cccgccccct gctgggcacc atcctgaacc tcgcccgcgc catcgacttc 1560atctacaagg aggacgatgg gtatacgcat tcctatctga tcaaggacca gatcgcctcg 1620gtcctcggcg atcatgtccc gttctgataa 165081656DNAArtificialcodon optimized ValFpoR 8atg agc tcg ggc gag acc ttc cgc ccg acc gcc gat ttc cat ccc tcg 48Met Ser Ser Gly Glu Thr Phe Arg Pro Thr Ala Asp Phe His Pro Ser 1 5 10 15 ctc tgg cgc aac cat ttc ctg aag ggc gcc tcc gac ttc aag acc gtc 96Leu Trp Arg Asn His Phe Leu Lys Gly Ala Ser Asp Phe Lys Thr Val 20 25 30 gat cac acg gcc acc cag gag cgc cac gag gcg ctg aag gaa gag gtg 144Asp His Thr Ala Thr Gln Glu Arg His Glu Ala Leu Lys Glu Glu Val 35 40 45 cgc cgg atg atc acc gac gcc gag gac aag ccg gtg cag aag ctg cgg 192Arg Arg Met Ile Thr Asp Ala Glu Asp Lys Pro Val Gln Lys Leu Arg 50 55 60 ctg atc gac gag gtg cag cgt ctc ggc gtg gcc tat cac ttc gag aag 240Leu Ile Asp Glu Val Gln Arg Leu Gly Val Ala Tyr His Phe Glu Lys 65 70 75 80 gag atc gag gat gcg atc cag aag ctc tgc ccg atc tac atc gac agc 288Glu Ile Glu Asp Ala Ile Gln Lys Leu Cys Pro Ile Tyr Ile Asp Ser 85 90 95 aac cgc gcc gat ctg cac acg gtc tcg ctg cat ttc cgg ctg ctg cgc 336Asn Arg Ala Asp Leu His Thr Val Ser Leu His Phe Arg Leu Leu Arg 100 105 110 cag cag ggc atc aag atc tcc tgc gac gtc ttc gag aag ttc aag gac 384Gln Gln Gly Ile Lys Ile Ser Cys Asp Val Phe Glu Lys Phe Lys Asp 115 120 125 gac gag ggc cgc ttc aag tcc tcg ctg atc aac gac gtg cag ggg atg 432Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile Asn Asp Val Gln Gly Met 130 135 140 ctg tcg ctc tac gag gcg gcc tac atg gcg gtg cgc ggc gag cat atc 480Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala Val Arg Gly Glu His Ile 145 150 155 160 ctc gac gag gcg atc gcc ttc acc acc acc cat ctg aaa tcg ctc gtg 528Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr His Leu Lys Ser Leu Val 165 170 175 gcg cag gac cat gtc 
acg ccg aag ctc gcc gag cag atc aac cat gcg 576Ala Gln Asp His Val Thr Pro Lys Leu Ala Glu Gln Ile Asn His Ala 180 185 190 ctc tac cgc ccg ctg cgc aag acg ctg ccg cgg ctc gag gcg cgc tat 624Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro Arg Leu Glu Ala Arg Tyr 195 200 205 ttc atg tcg atg atc aac tcg acc tcg gac cat ctc tac aac aag acg 672Phe Met Ser Met Ile Asn Ser Thr Ser Asp His Leu Tyr Asn Lys Thr 210 215 220 ctg ctg aac ttc gcc aag ctc gac ttc aac atc ctg ctc gag ctg cac 720Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn Ile Leu Leu Glu Leu His 225 230 235 240 aag gaa gag ctg aac gag ctg acg aaa tgg tgg aag gat ctc gac ttc 768Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp Trp Lys Asp Leu Asp Phe 245 250 255 acc acc aag ctg ccc tat gcg cgc gac cgg ctg gtc gag ctc tat ttc 816Thr Thr Lys Leu Pro Tyr Ala Arg Asp Arg Leu Val Glu Leu Tyr Phe 260 265 270 tgg gat ctc ggc acc tat ttc gag ccg cag tat gcc ttc ggc cgc aag 864Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln Tyr Ala Phe Gly Arg Lys 275 280 285 atc atg acc cag ctg aac tac atc ctc tcg atc atc gac gac acc tac 912Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser Ile Ile Asp Asp Thr Tyr 290 295 300 gac gcc tac ggc acg ctg gaa gag ctg tcg ctc ttc acc gag gcg gtg 960Asp Ala Tyr Gly Thr Leu Glu Glu Leu Ser Leu Phe Thr Glu Ala Val 305 310 315 320 cag cgc tgg aac atc gag gcg gtc gac atg ctg ccg gaa tac atg aag 1008Gln Arg Trp Asn Ile Glu Ala Val Asp Met Leu Pro Glu Tyr Met Lys 325 330 335 ctg atc tac cgc acg ctg ctc gat gcc ttc aac gag atc gag gaa gac 1056Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe Asn Glu Ile Glu Glu Asp 340 345 350 atg gcg aaa caa ggg cgc agc cac tgc gtg cgc tat gcc aag gaa gag 1104Met Ala Lys Gln Gly Arg Ser His Cys Val Arg Tyr Ala Lys Glu Glu 355 360 365 aac cag aag gtc atc ggc gcc tat tcg gtc cag gcg aaa tgg ttc tcg 1152Asn Gln Lys Val Ile Gly Ala Tyr Ser Val Gln Ala Lys Trp Phe Ser 370 375 380 gaa ggc tat gtc ccc acg atc gag gaa tac atg ccg atc gcg ctg acc 1200Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr Met Pro Ile Ala Leu Thr 385 390 395 400 
tcc tgc gcc tat acc ttc gtc atc acc aac agc ttc ctc ggc atg ggc 1248Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn Ser Phe Leu Gly Met Gly 405 410 415 gac ttc gcc acc aag gaa gtc ttc gaa tgg atc tcg aac aac ccg aag 1296Asp Phe Ala Thr Lys Glu Val Phe Glu Trp Ile Ser Asn Asn Pro Lys 420 425 430 gtc gtc aag gcg gcc tcg gtc atc tgc cgg ctg atg gac gac atg cag 1344Val Val Lys Ala Ala Ser Val Ile Cys Arg Leu Met Asp Asp Met Gln 435 440 445 ggc cac gag ttc gag cag aag cgc ggc cat gtc gcc tcg gcc atc gaa 1392Gly His Glu Phe Glu Gln Lys Arg Gly His Val Ala Ser Ala Ile Glu 450 455 460 tgc tac acc aag cag cac ggc gtc tcg aag gaa gag gcg atc aag atg 1440Cys Tyr Thr Lys Gln His Gly Val Ser Lys Glu Glu Ala Ile Lys Met 465 470 475 480 ttc gaa gag gaa gtg gcc aat gcc tgg aag gac atc aac gag gaa ctg 1488Phe Glu Glu Glu Val Ala Asn Ala Trp Lys Asp Ile Asn Glu Glu Leu 485 490 495 atg atg aag ccc acc gtc gtg gcc cgt ccg ctg ctc ggc acg atc ctg 1536Met Met Lys Pro Thr Val Val Ala Arg Pro Leu Leu Gly Thr Ile Leu 500 505 510 aac ctc gcc cgc gcc atc gac ttc atc tac aag gaa gac gac ggc tat 1584Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr Lys Glu Asp Asp Gly Tyr 515 520 525 acc cat tcc tat ctg atc aag gac cag atc gcc tcg gtc ctc ggc gac 1632Thr His Ser Tyr Leu Ile Lys Asp Gln Ile Ala Ser Val Leu Gly Asp 530 535 540 cat gtg cct ttc att aat tga taa 1656His Val Pro Phe Ile Asn 545 550
SEQ ID NO: 9; length: 550; type: PRT; organism: Artificial; other information: Synthetic Construct
Met Ser Ser Gly Glu Thr Phe Arg Pro Thr Ala Asp Phe His Pro Ser 1 5 10 15 Leu Trp Arg Asn His Phe Leu Lys Gly Ala Ser Asp Phe Lys Thr Val 20 25 30 Asp His Thr Ala Thr Gln Glu Arg His Glu Ala Leu Lys Glu Glu Val 35 40 45 Arg Arg Met Ile Thr Asp Ala Glu Asp Lys Pro Val Gln Lys Leu Arg 50 55 60 Leu Ile Asp Glu Val Gln Arg Leu Gly Val Ala Tyr His Phe Glu Lys 65 70 75 80 Glu Ile Glu Asp Ala Ile Gln Lys Leu Cys Pro Ile Tyr Ile Asp Ser 85 90 95 Asn Arg Ala Asp Leu His Thr Val Ser Leu His Phe Arg Leu Leu Arg 100 105 110 Gln Gln Gly Ile Lys Ile Ser Cys Asp Val Phe Glu Lys Phe Lys Asp 115 120 125 
Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile Asn Asp Val Gln Gly Met 130 135 140 Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala Val Arg Gly Glu His Ile 145 150 155 160 Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr His Leu Lys Ser Leu Val
165 170 175 Ala Gln Asp His Val Thr Pro Lys Leu Ala Glu Gln Ile Asn His Ala 180 185 190 Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro Arg Leu Glu Ala Arg Tyr 195 200 205 Phe Met Ser Met Ile Asn Ser Thr Ser Asp His Leu Tyr Asn Lys Thr 210 215 220 Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn Ile Leu Leu Glu Leu His 225 230 235 240 Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp Trp Lys Asp Leu Asp Phe 245 250 255 Thr Thr Lys Leu Pro Tyr Ala Arg Asp Arg Leu Val Glu Leu Tyr Phe 260 265 270 Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln Tyr Ala Phe Gly Arg Lys 275 280 285 Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser Ile Ile Asp Asp Thr Tyr 290 295 300 Asp Ala Tyr Gly Thr Leu Glu Glu Leu Ser Leu Phe Thr Glu Ala Val 305 310 315 320 Gln Arg Trp Asn Ile Glu Ala Val Asp Met Leu Pro Glu Tyr Met Lys 325 330 335 Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe Asn Glu Ile Glu Glu Asp 340 345 350 Met Ala Lys Gln Gly Arg Ser His Cys Val Arg Tyr Ala Lys Glu Glu 355 360 365 Asn Gln Lys Val Ile Gly Ala Tyr Ser Val Gln Ala Lys Trp Phe Ser 370 375 380 Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr Met Pro Ile Ala Leu Thr 385 390 395 400 Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn Ser Phe Leu Gly Met Gly 405 410 415 Asp Phe Ala Thr Lys Glu Val Phe Glu Trp Ile Ser Asn Asn Pro Lys 420 425 430 Val Val Lys Ala Ala Ser Val Ile Cys Arg Leu Met Asp Asp Met Gln 435 440 445 Gly His Glu Phe Glu Gln Lys Arg Gly His Val Ala Ser Ala Ile Glu 450 455 460 Cys Tyr Thr Lys Gln His Gly Val Ser Lys Glu Glu Ala Ile Lys Met 465 470 475 480 Phe Glu Glu Glu Val Ala Asn Ala Trp Lys Asp Ile Asn Glu Glu Leu 485 490 495 Met Met Lys Pro Thr Val Val Ala Arg Pro Leu Leu Gly Thr Ile Leu 500 505 510 Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr Lys Glu Asp Asp Gly Tyr 515 520 525 Thr His Ser Tyr Leu Ile Lys Asp Gln Ile Ala Ser Val Leu Gly Asp 530 535 540 His Val Pro Phe Ile Asn 545 550
SEQ ID NO: 10; length: 2778; type: DNA; organism: Artificial; other information: synthetic fusion gene MBP-ValFpoR
atg aag atc gag gaa ggc aag ctc gtc atc tgg atc aac ggc gac aag 48Met Lys Ile Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp Lys 1 5 10 
15 ggc tac aac ggc ctc gcc gag gtg ggc aag aag ttc gag aag gac acg 96Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys Lys Phe Glu Lys Asp Thr 20 25 30 ggc atc aag gtc acc gtc gag cat ccc gac aag ctc gag gag aag ttc 144Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys Phe 35 40 45 ccg cag gtc gcc gcc acc ggc gac ggc ccc gac atc atc ttc tgg gcc 192Pro Gln Val Ala Ala Thr Gly Asp Gly Pro Asp Ile Ile Phe Trp Ala 50 55 60 cac gac cgc ttc ggc ggc tat gcg cag tcg ggc ctg ctc gcc gag atc 240His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu Ile 65 70 75 80 acg ccc gac aag gcc ttc cag gac aag ctc tat ccc ttc acc tgg gat 288Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp Asp 85 90 95 gcg gtg cgc tac aac ggc aag ctg atc gcc tat ccg atc gcc gtc gag 336Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala Tyr Pro Ile Ala Val Glu 100 105 110 gcg ctg tcg ctg atc tac aac aag gat ctg ctg ccg aac ccg ccg aag 384Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro Lys 115 120 125 acc tgg gaa gag atc ccg gcg ctc gac aag gaa ctg aag gcc aag ggc 432Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys Glu Leu Lys Ala Lys Gly 130 135 140 aag tcc gcg ctg atg ttc aac ctg cag gag ccc tat ttc acc tgg ccg 480Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp Pro 145 150 155 160 ctg atc gcc gcc gac ggc ggc tat gcc ttc aaa tac gag aac ggc aaa 528Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly Lys 165 170 175 tac gac atc aag gac gtg ggc gtc gac aat gcg ggc gcc aag gcc ggg 576Tyr Asp Ile Lys Asp Val Gly Val Asp Asn Ala Gly Ala Lys Ala Gly 180 185 190 ctg acc ttc ctc gtc gat ctg atc aag aac aag cac atg aat gcc gac 624Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn Ala Asp 195 200 205 acc gac tat tcc atc gcc gag gcg gcc ttc aac aag ggc gag acc gcc 672Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr Ala 210 215 220 atg acg atc aac ggg ccg tgg gcc tgg tcg aac atc gac acc tcg aag 720Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile Asp Thr Ser Lys 225 230 235 240 
gtc aat tac ggc gtc acg gtg ctg ccg acc ttc aag ggc cag ccc tcg 768Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro Ser 245 250 255 aaa ccc ttc gtc ggc gtg ctg tcg gcg ggc atc aac gcg gcc tcg ccg 816Lys Pro Phe Val Gly Val Leu Ser Ala Gly Ile Asn Ala Ala Ser Pro 260 265 270 aac aag gaa ctc gcc aag gag ttc ctc gag aac tac ctg ctg acc gac 864Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr Leu Leu Thr Asp 275 280 285 gag ggg ctc gag gcg gtg aac aag gac aag ccg ctc ggc gcg gtg gcg 912Glu Gly Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val Ala 290 295 300 ctg aaa tcc tac gag gaa gag ctc gtc aag gac ccg cgg atc gcc gcc 960Leu Lys Ser Tyr Glu Glu Glu Leu Val Lys Asp Pro Arg Ile Ala Ala 305 310 315 320 acg atg gag aat gcg cag aag ggc gag atc atg ccg aac atc ccg cag 1008Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met Pro Asn Ile Pro Gln 325 330 335 atg tcg gcc ttc tgg tat gcc gtc cgc acc gcg gtg atc aac gcg gcc 1056Met Ser Ala Phe Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala Ala 340 345 350 tcg ggc cgt cag acc gtc gac gag gcg ctg aag gat gcg cag act ggt 1104Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys Asp Ala Gln Thr Gly 355 360 365 gat gac gac gac aag att aat agc tcg ggc gag acc ttc cgc ccg acc 1152Asp Asp Asp Asp Lys Ile Asn Ser Ser Gly Glu Thr Phe Arg Pro Thr 370 375 380 gcc gat ttc cat ccc tcg ctc tgg cgc aac cat ttc ctg aag ggc gcc 1200Ala Asp Phe His Pro Ser Leu Trp Arg Asn His Phe Leu Lys Gly Ala 385 390 395 400 tcc gac ttc aag acc gtc gat cac acg gcc acc cag gag cgc cac gag 1248Ser Asp Phe Lys Thr Val Asp His Thr Ala Thr Gln Glu Arg His Glu 405 410 415 gcg ctg aag gaa gag gtg cgc cgg atg atc acc gac gcc gag gac aag 1296Ala Leu Lys Glu Glu Val Arg Arg Met Ile Thr Asp Ala Glu Asp Lys 420 425 430 ccg gtg cag aag ctg cgg ctg atc gac gag gtg cag cgt ctc ggc gtg 1344Pro Val Gln Lys Leu Arg Leu Ile Asp Glu Val Gln Arg Leu Gly Val 435 440 445 gcc tat cac ttc gag aag gag atc gag gat gcg atc cag aag ctc tgc 1392Ala Tyr His Phe Glu Lys Glu Ile Glu Asp Ala Ile Gln Lys 
Leu Cys 450 455 460 ccg atc tac atc gac agc aac cgc gcc gat ctg cac acg gtc tcg ctg 1440Pro Ile Tyr Ile Asp Ser Asn Arg Ala Asp Leu His Thr Val Ser Leu 465 470 475 480 cat ttc cgg ctg ctg cgc cag cag ggc atc aag atc tcc tgc gac gtc 1488His Phe Arg Leu Leu Arg Gln Gln Gly Ile Lys Ile Ser Cys Asp Val 485 490 495 ttc gag aag ttc aag gac gac gag ggc cgc ttc aag tcc tcg ctg atc 1536Phe Glu Lys Phe Lys Asp Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile 500 505 510 aac gac gtg cag ggg atg ctg tcg ctc tac gag gcg gcc tac atg gcg 1584Asn Asp Val Gln Gly Met Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala 515 520 525 gtg cgc ggc gag cat atc ctc gac gag gcg atc gcc ttc acc acc acc 1632Val Arg Gly Glu His Ile Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr 530 535 540 cat ctg aaa tcg ctc gtg gcg cag gac cat gtc acg ccg aag ctc gcc 1680His Leu Lys Ser Leu Val Ala Gln Asp His Val Thr Pro Lys Leu Ala 545 550 555 560 gag cag atc aac cat gcg ctc tac cgc ccg ctg cgc aag acg ctg ccg 1728Glu Gln Ile Asn His Ala Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro 565 570 575 cgg ctc gag gcg cgc tat ttc atg tcg atg atc aac tcg acc tcg gac 1776Arg Leu Glu Ala Arg Tyr Phe Met Ser Met Ile Asn Ser Thr Ser Asp 580 585 590 cat ctc tac aac aag acg ctg ctg aac ttc gcc aag ctc gac ttc aac 1824His Leu Tyr Asn Lys Thr Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn 595 600 605 atc ctg ctc gag ctg cac aag gaa gag ctg aac gag ctg acg aaa tgg 1872Ile Leu Leu Glu Leu His Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp 610 615 620 tgg aag gat ctc gac ttc acc acc aag ctg ccc tat gcg cgc gac cgg 1920Trp Lys Asp Leu Asp Phe Thr Thr Lys Leu Pro Tyr Ala Arg Asp Arg 625 630 635 640 ctg gtc gag ctc tat ttc tgg gat ctc ggc acc tat ttc gag ccg cag 1968Leu Val Glu Leu Tyr Phe Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln 645 650 655 tat gcc ttc ggc cgc aag atc atg acc cag ctg aac tac atc ctc tcg 2016Tyr Ala Phe Gly Arg Lys Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser 660 665 670 atc atc gac gac acc tac gac gcc tac ggc acg ctg gaa gag ctg tcg 2064Ile Ile Asp Asp Thr Tyr Asp 
Ala Tyr Gly Thr Leu Glu Glu Leu Ser 675 680 685 ctc ttc acc gag gcg gtg cag cgc tgg aac atc gag gcg gtc gac atg 2112Leu Phe Thr Glu Ala Val Gln Arg Trp Asn Ile Glu Ala Val Asp Met 690 695 700 ctg ccg gaa tac atg aag ctg atc tac cgc acg ctg ctc gat gcc ttc 2160Leu Pro Glu Tyr Met Lys Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe 705 710 715 720 aac gag atc gag gaa gac atg gcg aaa caa ggg cgc agc cac tgc gtg 2208Asn Glu Ile Glu Glu Asp Met Ala Lys Gln Gly Arg Ser His Cys Val 725 730 735 cgc tat gcc aag gaa gag aac cag aag gtc atc ggc gcc tat tcg gtc 2256Arg Tyr Ala Lys Glu Glu Asn Gln Lys Val Ile Gly Ala Tyr Ser Val 740 745 750 cag gcg aaa tgg ttc tcg gaa ggc tat gtc ccc acg atc gag gaa tac 2304Gln Ala Lys Trp Phe Ser Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr 755 760 765 atg ccg atc gcg ctg acc tcc tgc gcc tat acc ttc gtc atc acc aac 2352Met Pro Ile Ala Leu Thr Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn 770 775 780 agc ttc ctc ggc atg ggc gac ttc gcc acc aag gaa gtc ttc gaa tgg 2400Ser Phe Leu Gly Met Gly Asp Phe Ala Thr Lys Glu Val Phe Glu Trp 785 790 795 800 atc tcg aac aac ccg aag gtc gtc aag gcg gcc tcg gtc atc tgc cgg 2448Ile Ser Asn Asn Pro Lys Val Val Lys Ala Ala Ser Val Ile Cys Arg 805 810 815 ctg atg gac gac atg cag ggc cac gag ttc gag cag aag cgc ggc cat 2496Leu Met Asp Asp Met Gln Gly His Glu Phe Glu Gln Lys Arg Gly His 820 825 830 gtc gcc tcg gcc atc gaa tgc tac acc aag cag cac ggc gtc tcg aag 2544Val Ala Ser Ala Ile Glu Cys Tyr Thr Lys Gln His Gly Val Ser Lys 835 840 845 gaa gag gcg atc aag atg ttc gaa gag gaa gtg gcc aat gcc tgg aag 2592Glu Glu Ala Ile Lys Met Phe Glu Glu Glu Val Ala Asn Ala Trp Lys 850 855 860 gac atc aac gag gaa ctg atg atg aag ccc acc gtc gtg gcc cgt ccg 2640Asp Ile Asn Glu Glu Leu Met Met Lys Pro Thr Val Val Ala Arg Pro 865 870 875 880 ctg ctc ggc acg atc ctg aac ctc gcc cgc gcc atc gac ttc atc tac 2688Leu Leu Gly Thr Ile Leu Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr 885 890 895 aag gaa gac gac ggc tat acc cat tcc tat ctg atc aag gac cag atc 
2736Lys Glu Asp Asp Gly Tyr Thr His Ser Tyr Leu Ile Lys Asp Gln Ile 900 905 910 gcc tcg gtc ctc ggc gac cat gtg cct ttc att aat tga taa 2778Ala Ser Val Leu Gly Asp His Val Pro Phe Ile Asn 915 920
SEQ ID NO: 11; length: 924; type: PRT; organism: Artificial; other information: Synthetic Construct
Met Lys Ile Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp Lys 1 5 10 15 Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys Lys Phe Glu Lys Asp Thr 20 25 30 Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys Phe 35 40 45 Pro Gln Val Ala Ala Thr Gly Asp Gly Pro Asp Ile Ile Phe Trp Ala 50 55 60 His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu Ile 65 70 75 80 Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp Asp 85 90 95 Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala Tyr Pro Ile Ala Val Glu 100 105 110 Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro Lys 115 120 125 Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys Glu Leu Lys Ala Lys Gly 130 135 140 Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp Pro 145 150 155 160 Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly Lys 165 170 175 Tyr Asp Ile Lys Asp Val Gly Val Asp Asn Ala Gly Ala Lys Ala Gly 180 185 190 Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn Ala Asp 195 200 205 Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr Ala 210 215 220 Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile Asp Thr Ser Lys 225 230 235 240 Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro Ser 245 250 255 Lys Pro Phe Val Gly Val Leu Ser Ala Gly Ile Asn Ala Ala Ser Pro 260 265 270 Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr Leu Leu Thr Asp 275 280 285 Glu Gly Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val Ala 290 295 300 Leu Lys Ser Tyr Glu Glu Glu Leu Val Lys Asp Pro Arg Ile Ala Ala 305 310
315 320 Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met Pro Asn Ile Pro Gln 325 330 335 Met Ser Ala Phe Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala Ala 340 345 350 Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys Asp Ala Gln Thr Gly 355 360 365 Asp Asp Asp Asp Lys Ile Asn Ser Ser Gly Glu Thr Phe Arg Pro Thr 370 375 380 Ala Asp Phe His Pro Ser Leu Trp Arg Asn His Phe Leu Lys Gly Ala 385 390 395 400 Ser Asp Phe Lys Thr Val Asp His Thr Ala Thr Gln Glu Arg His Glu 405 410 415 Ala Leu Lys Glu Glu Val Arg Arg Met Ile Thr Asp Ala Glu Asp Lys 420 425 430 Pro Val Gln Lys Leu Arg Leu Ile Asp Glu Val Gln Arg Leu Gly Val 435 440 445 Ala Tyr His Phe Glu Lys Glu Ile Glu Asp Ala Ile Gln Lys Leu Cys 450 455 460 Pro Ile Tyr Ile Asp Ser Asn Arg Ala Asp Leu His Thr Val Ser Leu 465 470 475 480 His Phe Arg Leu Leu Arg Gln Gln Gly Ile Lys Ile Ser Cys Asp Val 485 490 495 Phe Glu Lys Phe Lys Asp Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile 500 505 510 Asn Asp Val Gln Gly Met Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala 515 520 525 Val Arg Gly Glu His Ile Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr 530 535 540 His Leu Lys Ser Leu Val Ala Gln Asp His Val Thr Pro Lys Leu Ala 545 550 555 560 Glu Gln Ile Asn His Ala Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro 565 570 575 Arg Leu Glu Ala Arg Tyr Phe Met Ser Met Ile Asn Ser Thr Ser Asp 580 585 590 His Leu Tyr Asn Lys Thr Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn 595 600 605 Ile Leu Leu Glu Leu His Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp 610 615 620 Trp Lys Asp Leu Asp Phe Thr Thr Lys Leu Pro Tyr Ala Arg Asp Arg 625 630 635 640 Leu Val Glu Leu Tyr Phe Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln 645 650 655 Tyr Ala Phe Gly Arg Lys Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser 660 665 670 Ile Ile Asp Asp Thr Tyr Asp Ala Tyr Gly Thr Leu Glu Glu Leu Ser 675 680 685 Leu Phe Thr Glu Ala Val Gln Arg Trp Asn Ile Glu Ala Val Asp Met 690 695 700 Leu Pro Glu Tyr Met Lys Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe 705 710 715 720 Asn Glu Ile Glu Glu Asp Met Ala Lys Gln Gly Arg Ser His Cys Val 725 730 
735 Arg Tyr Ala Lys Glu Glu Asn Gln Lys Val Ile Gly Ala Tyr Ser Val 740 745 750 Gln Ala Lys Trp Phe Ser Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr 755 760 765 Met Pro Ile Ala Leu Thr Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn 770 775 780 Ser Phe Leu Gly Met Gly Asp Phe Ala Thr Lys Glu Val Phe Glu Trp 785 790 795 800 Ile Ser Asn Asn Pro Lys Val Val Lys Ala Ala Ser Val Ile Cys Arg 805 810 815 Leu Met Asp Asp Met Gln Gly His Glu Phe Glu Gln Lys Arg Gly His 820 825 830 Val Ala Ser Ala Ile Glu Cys Tyr Thr Lys Gln His Gly Val Ser Lys 835 840 845 Glu Glu Ala Ile Lys Met Phe Glu Glu Glu Val Ala Asn Ala Trp Lys 850 855 860 Asp Ile Asn Glu Glu Leu Met Met Lys Pro Thr Val Val Ala Arg Pro 865 870 875 880 Leu Leu Gly Thr Ile Leu Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr 885 890 895 Lys Glu Asp Asp Gly Tyr Thr His Ser Tyr Leu Ile Lys Asp Gln Ile 900 905 910 Ala Ser Val Leu Gly Asp His Val Pro Phe Ile Asn 915 920
SEQ ID NO: 12; length: 3159; type: DNA; organism: Artificial; other information: synthetic fusion gene NusA-ValFpoR
atg aac aag gag atc ctg gcc gtc gtc gag gcg gtc tcg aac gag aag 48Met Asn Lys Glu Ile Leu Ala Val Val Glu Ala Val Ser Asn Glu Lys 1 5 10 15 gcg ctg ccg cgc gag aag atc ttc gag gcg ctg gaa tcc gcg ctg gcc 96Ala Leu Pro Arg Glu Lys Ile Phe Glu Ala Leu Glu Ser Ala Leu Ala 20 25 30 acc gcc acc aag aag aaa tac gag cag gag atc gac gtg cgc gtg cag 144Thr Ala Thr Lys Lys Lys Tyr Glu Gln Glu Ile Asp Val Arg Val Gln 35 40 45 atc gac agg aaa tcc ggc gac ttc gac acc ttc cgc cgc tgg ctc gtc 192Ile Asp Arg Lys Ser Gly Asp Phe Asp Thr Phe Arg Arg Trp Leu Val 50 55 60 gtc gac gag gtc acg cag ccg acc aag gag atc acg ctc gag gcg gcc 240Val Asp Glu Val Thr Gln Pro Thr Lys Glu Ile Thr Leu Glu Ala Ala 65 70 75 80 cgc tac gag gac gag agc ctg aac ctc ggc gac tat gtc gag gat cag 288Arg Tyr Glu Asp Glu Ser Leu Asn Leu Gly Asp Tyr Val Glu Asp Gln 85 90 95 atc gag agc gtc acc ttc gac cgg atc acc acg cag acc gcc aag cag 336Ile Glu Ser Val Thr Phe Asp Arg Ile Thr Thr Gln Thr Ala Lys Gln 100 105 110 gtc atc gtg cag aag gtc cgc gag gcc gag cgg gcg atg 
gtc gtc gat 384Val Ile Val Gln Lys Val Arg Glu Ala Glu Arg Ala Met Val Val Asp 115 120 125 cag ttc cgc gag cac gag ggc gag atc atc acc ggc gtg gtg aag aag 432Gln Phe Arg Glu His Glu Gly Glu Ile Ile Thr Gly Val Val Lys Lys 130 135 140 gtc aac cgc gac aac atc tcg ctc gat ctc ggc aac aat gcc gag gcg 480Val Asn Arg Asp Asn Ile Ser Leu Asp Leu Gly Asn Asn Ala Glu Ala 145 150 155 160 gtg atc ctg cgc gag gac atg ctg ccg cgc gag aac ttc cgc ccg ggc 528Val Ile Leu Arg Glu Asp Met Leu Pro Arg Glu Asn Phe Arg Pro Gly 165 170 175 gac cgg gtg cgc ggc gtg ctc tat tcc gtc cgt ccc gag gcg cgc ggc 576Asp Arg Val Arg Gly Val Leu Tyr Ser Val Arg Pro Glu Ala Arg Gly 180 185 190 gcg cag ctc ttc gtc acc cgc tcg aag ccc gag atg ctg atc gag ctg 624Ala Gln Leu Phe Val Thr Arg Ser Lys Pro Glu Met Leu Ile Glu Leu 195 200 205 ttc cgc atc gag gtg ccc gag atc ggc gag gaa gtg atc gag atc aag 672Phe Arg Ile Glu Val Pro Glu Ile Gly Glu Glu Val Ile Glu Ile Lys 210 215 220 gcc gcg gcc cgc gac ccg ggc tcg cgc gcc aag atc gcc gtc aag acc 720Ala Ala Ala Arg Asp Pro Gly Ser Arg Ala Lys Ile Ala Val Lys Thr 225 230 235 240 aac gac aag cgg atc gac ccg gtg ggc gcc tgc gtg ggc atg cgc ggc 768Asn Asp Lys Arg Ile Asp Pro Val Gly Ala Cys Val Gly Met Arg Gly 245 250 255 gcg cgg gtg cag gcc gtc tcg acc gag ctc ggc ggc gag cgg atc gac 816Ala Arg Val Gln Ala Val Ser Thr Glu Leu Gly Gly Glu Arg Ile Asp 260 265 270 atc gtg ctc tgg gac gac aat ccg gcg cag ttc gtc atc aat gcc atg 864Ile Val Leu Trp Asp Asp Asn Pro Ala Gln Phe Val Ile Asn Ala Met 275 280 285 gcg ccc gcc gac gtg gcc tcg atc gtc gtc gac gag gac aag cac acg 912Ala Pro Ala Asp Val Ala Ser Ile Val Val Asp Glu Asp Lys His Thr 290 295 300 atg gac atc gcc gtc gag gcg ggc aac ctc gcg cag gcc atc ggc cgc 960Met Asp Ile Ala Val Glu Ala Gly Asn Leu Ala Gln Ala Ile Gly Arg 305 310 315 320 aac ggg cag aac gtg cgg ctg gcc tcg cag ctc tcg ggc tgg gag ctg 1008Asn Gly Gln Asn Val Arg Leu Ala Ser Gln Leu Ser Gly Trp Glu Leu 325 330 335 aac gtg atg acc gtc gac gat ctg cag 
gcc aag cac cag gcc gag gcc 1056Asn Val Met Thr Val Asp Asp Leu Gln Ala Lys His Gln Ala Glu Ala 340 345 350 cat gcg gcc atc gac acc ttc acc aaa tat ctc gac atc gac gag gat 1104His Ala Ala Ile Asp Thr Phe Thr Lys Tyr Leu Asp Ile Asp Glu Asp 355 360 365 ttc gcc acg gtt ctc gtc gaa gag ggc ttc tcg acg ctg gaa gag ctg 1152Phe Ala Thr Val Leu Val Glu Glu Gly Phe Ser Thr Leu Glu Glu Leu 370 375 380 gcc tat gtg ccg atg aag gaa ctg ctc gag atc gag ggg ctc gac gag 1200Ala Tyr Val Pro Met Lys Glu Leu Leu Glu Ile Glu Gly Leu Asp Glu 385 390 395 400 ccg acc gtc gag gcg ctg cgc gag cgc gcc aag aac gcg ctg gcc acc 1248Pro Thr Val Glu Ala Leu Arg Glu Arg Ala Lys Asn Ala Leu Ala Thr 405 410 415 atc gcg cag gcg cag gaa gag agc ctc ggc gac aac aag ccc gcc gac 1296Ile Ala Gln Ala Gln Glu Glu Ser Leu Gly Asp Asn Lys Pro Ala Asp 420 425 430 gat ctg ctg aac ctc gag ggc gtc gac cgc gac ctg gcc ttc aag ctg 1344Asp Leu Leu Asn Leu Glu Gly Val Asp Arg Asp Leu Ala Phe Lys Leu 435 440 445 gcc gcg cgc ggc gtc tgc acg ctc gag gat ctg gcc gag cag ggc atc 1392Ala Ala Arg Gly Val Cys Thr Leu Glu Asp Leu Ala Glu Gln Gly Ile 450 455 460 gac gat ctg gcc gac atc gag ggg ctg acc gac gag aag gcg ggc gcg 1440Asp Asp Leu Ala Asp Ile Glu Gly Leu Thr Asp Glu Lys Ala Gly Ala 465 470 475 480 ctg atc atg gcc gcc cgc aac atc tgc tgg ttc ggc gac gaa ggt gat 1488Leu Ile Met Ala Ala Arg Asn Ile Cys Trp Phe Gly Asp Glu Gly Asp 485 490 495 gac gac gac aag att aat agc tcg ggc gag acc ttc cgc ccg acc gcc 1536Asp Asp Asp Lys Ile Asn Ser Ser Gly Glu Thr Phe Arg Pro Thr Ala 500 505 510 gat ttc cat ccc tcg ctc tgg cgc aac cat ttc ctg aag ggc gcc tcc 1584Asp Phe His Pro Ser Leu Trp Arg Asn His Phe Leu Lys Gly Ala Ser 515 520 525 gac ttc aag acc gtc gat cac acg gcc acc cag gag cgc cac gag gcg 1632Asp Phe Lys Thr Val Asp His Thr Ala Thr Gln Glu Arg His Glu Ala 530 535 540 ctg aag gaa gag gtg cgc cgg atg atc acc gac gcc gag gac aag ccg 1680Leu Lys Glu Glu Val Arg Arg Met Ile Thr Asp Ala Glu Asp Lys Pro 545 550 555 560 gtg cag 
aag ctg cgg ctg atc gac gag gtg cag cgt ctc ggc gtg gcc 1728Val Gln Lys Leu Arg Leu Ile Asp Glu Val Gln Arg Leu Gly Val Ala 565 570 575 tat cac ttc gag aag gag atc gag gat gcg atc cag aag ctc tgc ccg 1776Tyr His Phe Glu Lys Glu Ile Glu Asp Ala Ile Gln Lys Leu Cys Pro 580 585 590 atc tac atc gac agc aac cgc gcc gat ctg cac acg gtc tcg ctg cat 1824Ile Tyr Ile Asp Ser Asn Arg Ala Asp Leu His Thr Val Ser Leu His 595 600 605 ttc cgg ctg ctg cgc cag cag ggc atc aag atc tcc tgc gac gtc ttc 1872Phe Arg Leu Leu Arg Gln Gln Gly Ile Lys Ile Ser Cys Asp Val Phe 610 615 620 gag aag ttc aag gac gac gag ggc cgc ttc aag tcc tcg ctg atc aac 1920Glu Lys Phe Lys Asp Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile Asn 625 630 635 640 gac gtg cag ggg atg ctg tcg ctc tac gag gcg gcc tac atg gcg gtg 1968Asp Val Gln Gly Met Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala Val 645 650 655 cgc ggc gag cat atc ctc gac gag gcg atc gcc ttc acc acc acc cat 2016Arg Gly Glu His Ile Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr His 660 665 670 ctg aaa tcg ctc gtg gcg cag gac cat gtc acg ccg aag ctc gcc gag 2064Leu Lys Ser Leu Val Ala Gln Asp His Val Thr Pro Lys Leu Ala Glu 675 680 685 cag atc aac cat gcg ctc tac cgc ccg ctg cgc aag acg ctg ccg cgg 2112Gln Ile Asn His Ala Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro Arg 690 695 700 ctc gag gcg cgc tat ttc atg tcg atg atc aac tcg acc tcg gac cat 2160Leu Glu Ala Arg Tyr Phe Met Ser Met Ile Asn Ser Thr Ser Asp His 705 710 715 720 ctc tac aac aag acg ctg ctg aac ttc gcc aag ctc gac ttc aac atc 2208Leu Tyr Asn Lys Thr Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn Ile 725 730 735 ctg ctc gag ctg cac aag gaa gag ctg aac gag ctg acg aaa tgg tgg 2256Leu Leu Glu Leu His Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp Trp 740 745 750 aag gat ctc gac ttc acc acc aag ctg ccc tat gcg cgc gac cgg ctg 2304Lys Asp Leu Asp Phe Thr Thr Lys Leu Pro Tyr Ala Arg Asp Arg Leu 755 760 765 gtc gag ctc tat ttc tgg gat ctc ggc acc tat ttc gag ccg cag tat 2352Val Glu Leu Tyr Phe Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln 
Tyr 770 775 780 gcc ttc ggc cgc aag atc atg acc cag ctg aac tac atc ctc tcg atc 2400Ala Phe Gly Arg Lys Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser Ile 785 790 795 800 atc gac gac acc tac gac gcc tac ggc acg ctg gaa gag ctg tcg ctc 2448Ile Asp Asp Thr Tyr Asp Ala Tyr Gly Thr Leu Glu Glu Leu Ser Leu 805 810 815 ttc acc gag gcg gtg cag cgc tgg aac atc gag gcg gtc gac atg ctg 2496Phe Thr Glu Ala Val Gln Arg Trp Asn Ile Glu Ala Val Asp Met Leu 820 825 830 ccg gaa tac atg aag ctg atc tac cgc acg ctg ctc gat gcc ttc aac 2544Pro Glu Tyr Met Lys Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe Asn 835 840 845 gag atc gag gaa gac atg gcg aaa caa ggg cgc agc cac tgc gtg cgc 2592Glu Ile Glu Glu Asp Met Ala Lys Gln Gly Arg Ser His Cys Val Arg 850 855 860 tat gcc aag gaa gag aac cag aag gtc atc ggc gcc tat tcg gtc cag 2640Tyr Ala Lys Glu Glu Asn Gln Lys Val Ile Gly Ala Tyr Ser Val Gln 865 870 875 880 gcg aaa tgg ttc tcg gaa ggc tat gtc ccc acg atc gag gaa tac atg 2688Ala Lys Trp Phe Ser Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr Met 885 890 895 ccg atc gcg ctg acc tcc tgc gcc tat acc ttc gtc atc acc aac agc 2736Pro Ile Ala Leu Thr Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn Ser 900 905 910 ttc ctc ggc atg ggc gac ttc gcc acc aag gaa gtc ttc gaa tgg atc 2784Phe Leu Gly Met Gly Asp Phe Ala Thr Lys Glu Val Phe Glu Trp Ile 915 920 925 tcg aac aac ccg aag gtc gtc aag gcg gcc tcg gtc atc tgc cgg ctg 2832Ser Asn Asn Pro Lys Val Val Lys Ala Ala Ser Val Ile Cys Arg Leu 930 935 940 atg gac gac atg cag ggc cac gag ttc gag cag aag cgc ggc cat gtc 2880Met Asp Asp Met Gln Gly His Glu Phe Glu Gln Lys Arg Gly His Val 945 950 955 960 gcc tcg gcc atc gaa tgc tac acc aag cag cac ggc gtc tcg aag gaa 2928Ala Ser Ala Ile Glu Cys Tyr Thr Lys Gln His Gly Val Ser Lys Glu 965 970 975 gag gcg atc aag atg ttc gaa gag gaa gtg gcc aat gcc tgg aag gac 2976Glu Ala Ile Lys Met Phe Glu Glu Glu Val Ala Asn Ala Trp Lys Asp
980 985 990 atc aac gag gaa ctg atg atg aag ccc acc gtc gtg gcc cgt ccg ctg 3024Ile Asn Glu Glu Leu Met Met Lys Pro Thr Val Val Ala Arg Pro Leu 995 1000 1005 ctc ggc acg atc ctg aac ctc gcc cgc gcc atc gac ttc atc tac 3069Leu Gly Thr Ile Leu Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr 1010 1015 1020 aag gaa gac gac ggc tat acc cat tcc tat ctg atc aag gac cag 3114Lys Glu Asp Asp Gly Tyr Thr His Ser Tyr Leu Ile Lys Asp Gln 1025 1030 1035 atc gcc tcg gtc ctc ggc gac cat gtg cct ttc att aat tga taa 3159Ile Ala Ser Val Leu Gly Asp His Val Pro Phe Ile Asn 1040 1045 1050
SEQ ID NO: 13; length: 1051; type: PRT; organism: Artificial; other information: Synthetic Construct
Met Asn Lys Glu Ile Leu Ala Val Val Glu Ala Val Ser Asn Glu Lys 1 5 10 15 Ala Leu Pro Arg Glu Lys Ile Phe Glu Ala Leu Glu Ser Ala Leu Ala 20 25 30 Thr Ala Thr Lys Lys Lys Tyr Glu Gln Glu Ile Asp Val Arg Val Gln 35 40 45 Ile Asp Arg Lys Ser Gly Asp Phe Asp Thr Phe Arg Arg Trp Leu Val 50 55 60 Val Asp Glu Val Thr Gln Pro Thr Lys Glu Ile Thr Leu Glu Ala Ala 65 70 75 80 Arg Tyr Glu Asp Glu Ser Leu Asn Leu Gly Asp Tyr Val Glu Asp Gln 85 90 95 Ile Glu Ser Val Thr Phe Asp Arg Ile Thr Thr Gln Thr Ala Lys Gln 100 105 110 Val Ile Val Gln Lys Val Arg Glu Ala Glu Arg Ala Met Val Val Asp 115 120 125 Gln Phe Arg Glu His Glu Gly Glu Ile Ile Thr Gly Val Val Lys Lys 130 135 140 Val Asn Arg Asp Asn Ile Ser Leu Asp Leu Gly Asn Asn Ala Glu Ala 145 150 155 160 Val Ile Leu Arg Glu Asp Met Leu Pro Arg Glu Asn Phe Arg Pro Gly 165 170 175 Asp Arg Val Arg Gly Val Leu Tyr Ser Val Arg Pro Glu Ala Arg Gly 180 185 190 Ala Gln Leu Phe Val Thr Arg Ser Lys Pro Glu Met Leu Ile Glu Leu 195 200 205 Phe Arg Ile Glu Val Pro Glu Ile Gly Glu Glu Val Ile Glu Ile Lys 210 215 220 Ala Ala Ala Arg Asp Pro Gly Ser Arg Ala Lys Ile Ala Val Lys Thr 225 230 235 240 Asn Asp Lys Arg Ile Asp Pro Val Gly Ala Cys Val Gly Met Arg Gly 245 250 255 Ala Arg Val Gln Ala Val Ser Thr Glu Leu Gly Gly Glu Arg Ile Asp 260 265 270 Ile Val Leu Trp Asp Asp Asn Pro Ala Gln Phe Val Ile Asn Ala Met 275 280 285 Ala Pro Ala Asp Val Ala Ser 
Ile Val Val Asp Glu Asp Lys His Thr 290 295 300 Met Asp Ile Ala Val Glu Ala Gly Asn Leu Ala Gln Ala Ile Gly Arg 305 310 315 320 Asn Gly Gln Asn Val Arg Leu Ala Ser Gln Leu Ser Gly Trp Glu Leu 325 330 335 Asn Val Met Thr Val Asp Asp Leu Gln Ala Lys His Gln Ala Glu Ala 340 345 350 His Ala Ala Ile Asp Thr Phe Thr Lys Tyr Leu Asp Ile Asp Glu Asp 355 360 365 Phe Ala Thr Val Leu Val Glu Glu Gly Phe Ser Thr Leu Glu Glu Leu 370 375 380 Ala Tyr Val Pro Met Lys Glu Leu Leu Glu Ile Glu Gly Leu Asp Glu 385 390 395 400 Pro Thr Val Glu Ala Leu Arg Glu Arg Ala Lys Asn Ala Leu Ala Thr 405 410 415 Ile Ala Gln Ala Gln Glu Glu Ser Leu Gly Asp Asn Lys Pro Ala Asp 420 425 430 Asp Leu Leu Asn Leu Glu Gly Val Asp Arg Asp Leu Ala Phe Lys Leu 435 440 445 Ala Ala Arg Gly Val Cys Thr Leu Glu Asp Leu Ala Glu Gln Gly Ile 450 455 460 Asp Asp Leu Ala Asp Ile Glu Gly Leu Thr Asp Glu Lys Ala Gly Ala 465 470 475 480 Leu Ile Met Ala Ala Arg Asn Ile Cys Trp Phe Gly Asp Glu Gly Asp 485 490 495 Asp Asp Asp Lys Ile Asn Ser Ser Gly Glu Thr Phe Arg Pro Thr Ala 500 505 510 Asp Phe His Pro Ser Leu Trp Arg Asn His Phe Leu Lys Gly Ala Ser 515 520 525 Asp Phe Lys Thr Val Asp His Thr Ala Thr Gln Glu Arg His Glu Ala 530 535 540 Leu Lys Glu Glu Val Arg Arg Met Ile Thr Asp Ala Glu Asp Lys Pro 545 550 555 560 Val Gln Lys Leu Arg Leu Ile Asp Glu Val Gln Arg Leu Gly Val Ala 565 570 575 Tyr His Phe Glu Lys Glu Ile Glu Asp Ala Ile Gln Lys Leu Cys Pro 580 585 590 Ile Tyr Ile Asp Ser Asn Arg Ala Asp Leu His Thr Val Ser Leu His 595 600 605 Phe Arg Leu Leu Arg Gln Gln Gly Ile Lys Ile Ser Cys Asp Val Phe 610 615 620 Glu Lys Phe Lys Asp Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile Asn 625 630 635 640 Asp Val Gln Gly Met Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala Val 645 650 655 Arg Gly Glu His Ile Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr His 660 665 670 Leu Lys Ser Leu Val Ala Gln Asp His Val Thr Pro Lys Leu Ala Glu 675 680 685 Gln Ile Asn His Ala Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro Arg 690 695 700 Leu Glu Ala Arg Tyr Phe Met Ser 
Met Ile Asn Ser Thr Ser Asp His 705 710 715 720 Leu Tyr Asn Lys Thr Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn Ile 725 730 735 Leu Leu Glu Leu His Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp Trp 740 745 750 Lys Asp Leu Asp Phe Thr Thr Lys Leu Pro Tyr Ala Arg Asp Arg Leu 755 760 765 Val Glu Leu Tyr Phe Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln Tyr 770 775 780 Ala Phe Gly Arg Lys Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser Ile 785 790 795 800 Ile Asp Asp Thr Tyr Asp Ala Tyr Gly Thr Leu Glu Glu Leu Ser Leu 805 810 815 Phe Thr Glu Ala Val Gln Arg Trp Asn Ile Glu Ala Val Asp Met Leu 820 825 830 Pro Glu Tyr Met Lys Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe Asn 835 840 845 Glu Ile Glu Glu Asp Met Ala Lys Gln Gly Arg Ser His Cys Val Arg 850 855 860 Tyr Ala Lys Glu Glu Asn Gln Lys Val Ile Gly Ala Tyr Ser Val Gln 865 870 875 880 Ala Lys Trp Phe Ser Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr Met 885 890 895 Pro Ile Ala Leu Thr Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn Ser 900 905 910 Phe Leu Gly Met Gly Asp Phe Ala Thr Lys Glu Val Phe Glu Trp Ile 915 920 925 Ser Asn Asn Pro Lys Val Val Lys Ala Ala Ser Val Ile Cys Arg Leu 930 935 940 Met Asp Asp Met Gln Gly His Glu Phe Glu Gln Lys Arg Gly His Val 945 950 955 960 Ala Ser Ala Ile Glu Cys Tyr Thr Lys Gln His Gly Val Ser Lys Glu 965 970 975 Glu Ala Ile Lys Met Phe Glu Glu Glu Val Ala Asn Ala Trp Lys Asp 980 985 990 Ile Asn Glu Glu Leu Met Met Lys Pro Thr Val Val Ala Arg Pro Leu 995 1000 1005 Leu Gly Thr Ile Leu Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr 1010 1015 1020 Lys Glu Asp Asp Gly Tyr Thr His Ser Tyr Leu Ile Lys Asp Gln 1025 1030 1035 Ile Ala Ser Val Leu Gly Asp His Val Pro Phe Ile Asn 1040 1045 1050
SEQ ID NO: 14; length: 2007; type: DNA; organism: Artificial; other information: synthetic fusion gene Trx-ValFpoR
atg tcg gac aag atc atc cac ctg acc gac gac agc ttc gac acc gac 48Met Ser Asp Lys Ile Ile His Leu Thr Asp Asp Ser Phe Asp Thr Asp 1 5 10 15 gtg ctg aag gcc gac ggc gcc atc ctc gtc gat ttc tgg gcc gaa tgg 96Val Leu Lys Ala Asp Gly Ala Ile Leu Val Asp Phe Trp Ala Glu Trp 20 25 30 tgc ggc ccc tgc 
aag atg atc gcg ccg atc ctc gac gag atc gcc gac 144Cys Gly Pro Cys Lys Met Ile Ala Pro Ile Leu Asp Glu Ile Ala Asp 35 40 45 gaa tat cag ggc aag ctg acc gtc gcc aag ctg aac atc gac cag aac 192Glu Tyr Gln Gly Lys Leu Thr Val Ala Lys Leu Asn Ile Asp Gln Asn 50 55 60 ccg ggc acg gcg ccg aaa tac ggc atc cgc ggc atc ccg acg ctg ctg 240Pro Gly Thr Ala Pro Lys Tyr Gly Ile Arg Gly Ile Pro Thr Leu Leu 65 70 75 80 ctc ttc aag aac ggc gag gtg gcg gcc acc aag gtc ggc gcg ctg tcg 288Leu Phe Lys Asn Gly Glu Val Ala Ala Thr Lys Val Gly Ala Leu Ser 85 90 95 aag ggc cag ctg aag gag ttc ctc gat gcg aac ctc gcc ggt ggt gat 336Lys Gly Gln Leu Lys Glu Phe Leu Asp Ala Asn Leu Ala Gly Gly Asp 100 105 110 gac gac gac aag att aat agc tcg ggc gag acc ttc cgc ccg acc gcc 384Asp Asp Asp Lys Ile Asn Ser Ser Gly Glu Thr Phe Arg Pro Thr Ala 115 120 125 gat ttc cat ccc tcg ctc tgg cgc aac cat ttc ctg aag ggc gcc tcc 432Asp Phe His Pro Ser Leu Trp Arg Asn His Phe Leu Lys Gly Ala Ser 130 135 140 gac ttc aag acc gtc gat cac acg gcc acc cag gag cgc cac gag gcg 480Asp Phe Lys Thr Val Asp His Thr Ala Thr Gln Glu Arg His Glu Ala 145 150 155 160 ctg aag gaa gag gtg cgc cgg atg atc acc gac gcc gag gac aag ccg 528Leu Lys Glu Glu Val Arg Arg Met Ile Thr Asp Ala Glu Asp Lys Pro 165 170 175 gtg cag aag ctg cgg ctg atc gac gag gtg cag cgt ctc ggc gtg gcc 576Val Gln Lys Leu Arg Leu Ile Asp Glu Val Gln Arg Leu Gly Val Ala 180 185 190 tat cac ttc gag aag gag atc gag gat gcg atc cag aag ctc tgc ccg 624Tyr His Phe Glu Lys Glu Ile Glu Asp Ala Ile Gln Lys Leu Cys Pro 195 200 205 atc tac atc gac agc aac cgc gcc gat ctg cac acg gtc tcg ctg cat 672Ile Tyr Ile Asp Ser Asn Arg Ala Asp Leu His Thr Val Ser Leu His 210 215 220 ttc cgg ctg ctg cgc cag cag ggc atc aag atc tcc tgc gac gtc ttc 720Phe Arg Leu Leu Arg Gln Gln Gly Ile Lys Ile Ser Cys Asp Val Phe 225 230 235 240 gag aag ttc aag gac gac gag ggc cgc ttc aag tcc tcg ctg atc aac 768Glu Lys Phe Lys Asp Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile Asn 245 250 255 gac gtg cag 
ggg atg ctg tcg ctc tac gag gcg gcc tac atg gcg gtg 816Asp Val Gln Gly Met Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala Val 260 265 270 cgc ggc gag cat atc ctc gac gag gcg atc gcc ttc acc acc acc cat 864Arg Gly Glu His Ile Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr His 275 280 285 ctg aaa tcg ctc gtg gcg cag gac cat gtc acg ccg aag ctc gcc gag 912Leu Lys Ser Leu Val Ala Gln Asp His Val Thr Pro Lys Leu Ala Glu 290 295 300 cag atc aac cat gcg ctc tac cgc ccg ctg cgc aag acg ctg ccg cgg 960Gln Ile Asn His Ala Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro Arg 305 310 315 320 ctc gag gcg cgc tat ttc atg tcg atg atc aac tcg acc tcg gac cat 1008Leu Glu Ala Arg Tyr Phe Met Ser Met Ile Asn Ser Thr Ser Asp His 325 330 335 ctc tac aac aag acg ctg ctg aac ttc gcc aag ctc gac ttc aac atc 1056Leu Tyr Asn Lys Thr Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn Ile 340 345 350 ctg ctc gag ctg cac aag gaa gag ctg aac gag ctg acg aaa tgg tgg 1104Leu Leu Glu Leu His Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp Trp 355 360 365 aag gat ctc gac ttc acc acc aag ctg ccc tat gcg cgc gac cgg ctg 1152Lys Asp Leu Asp Phe Thr Thr Lys Leu Pro Tyr Ala Arg Asp Arg Leu 370 375 380 gtc gag ctc tat ttc tgg gat ctc ggc acc tat ttc gag ccg cag tat 1200Val Glu Leu Tyr Phe Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln Tyr 385 390 395 400 gcc ttc ggc cgc aag atc atg acc cag ctg aac tac atc ctc tcg atc 1248Ala Phe Gly Arg Lys Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser Ile 405 410 415 atc gac gac acc tac gac gcc tac ggc acg ctg gaa gag ctg tcg ctc 1296Ile Asp Asp Thr Tyr Asp Ala Tyr Gly Thr Leu Glu Glu Leu Ser Leu 420 425 430 ttc acc gag gcg gtg cag cgc tgg aac atc gag gcg gtc gac atg ctg 1344Phe Thr Glu Ala Val Gln Arg Trp Asn Ile Glu Ala Val Asp Met Leu 435 440 445 ccg gaa tac atg aag ctg atc tac cgc acg ctg ctc gat gcc ttc aac 1392Pro Glu Tyr Met Lys Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe Asn 450 455 460 gag atc gag gaa gac atg gcg aaa caa ggg cgc agc cac tgc gtg cgc 1440Glu Ile Glu Glu Asp Met Ala Lys Gln Gly Arg Ser His Cys Val Arg 465 
470 475 480 tat gcc aag gaa gag aac cag aag gtc atc ggc gcc tat tcg gtc cag 1488Tyr Ala Lys Glu Glu Asn Gln Lys Val Ile Gly Ala Tyr Ser Val Gln 485 490 495 gcg aaa tgg ttc tcg gaa ggc tat gtc ccc acg atc gag gaa tac atg 1536Ala Lys Trp Phe Ser Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr Met 500 505 510 ccg atc gcg ctg acc tcc tgc gcc tat acc ttc gtc atc acc aac agc 1584Pro Ile Ala Leu Thr Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn Ser 515 520 525 ttc ctc ggc atg ggc gac ttc gcc acc aag gaa gtc ttc gaa tgg atc 1632Phe Leu Gly Met Gly Asp Phe Ala Thr Lys Glu Val Phe Glu Trp Ile 530 535 540 tcg aac aac ccg aag gtc gtc aag gcg gcc tcg gtc atc tgc cgg ctg 1680Ser Asn Asn Pro Lys Val Val Lys Ala Ala Ser Val Ile Cys Arg Leu 545 550 555 560 atg gac gac atg cag ggc cac gag ttc gag cag aag cgc ggc cat gtc 1728Met Asp Asp Met Gln Gly His Glu Phe Glu Gln Lys Arg Gly His Val 565 570 575 gcc tcg gcc atc gaa tgc tac acc aag cag cac ggc gtc tcg aag gaa 1776Ala Ser Ala Ile Glu Cys Tyr Thr Lys Gln His Gly Val Ser Lys Glu 580 585 590 gag gcg atc aag atg ttc gaa gag gaa gtg gcc aat gcc tgg aag gac 1824Glu Ala Ile Lys Met Phe Glu Glu Glu Val Ala Asn Ala Trp Lys Asp 595 600 605 atc aac gag gaa ctg atg atg aag ccc acc gtc gtg gcc cgt ccg ctg 1872Ile Asn Glu Glu Leu Met Met Lys Pro Thr Val Val Ala Arg Pro Leu 610 615 620 ctc ggc acg atc ctg aac ctc gcc cgc gcc atc gac ttc atc tac aag 1920Leu Gly Thr Ile Leu Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr Lys 625 630 635 640 gaa gac gac ggc tat acc cat tcc tat ctg atc aag gac cag atc gcc 1968Glu Asp Asp Gly Tyr Thr His Ser Tyr Leu Ile Lys Asp Gln Ile Ala
645 650 655 tcg gtc ctc ggc gac cat gtg cct ttc att aat tga taa 2007 Ser Val Leu Gly Asp His Val Pro Phe Ile Asn 660 665 15 667 PRT Artificial Synthetic Construct 15 Met Ser Asp Lys Ile Ile His Leu Thr Asp Asp Ser Phe Asp Thr Asp 1 5 10 15 Val Leu Lys Ala Asp Gly Ala Ile Leu Val Asp Phe Trp Ala Glu Trp 20 25 30 Cys Gly Pro Cys Lys Met Ile Ala Pro Ile Leu Asp Glu Ile Ala Asp 35 40 45 Glu Tyr Gln Gly Lys Leu Thr Val Ala Lys Leu Asn Ile Asp Gln Asn 50 55 60 Pro Gly Thr Ala Pro Lys Tyr Gly Ile Arg Gly Ile Pro Thr Leu Leu 65 70 75 80 Leu Phe Lys Asn Gly Glu Val Ala Ala Thr Lys Val Gly Ala Leu Ser 85 90 95 Lys Gly Gln Leu Lys Glu Phe Leu Asp Ala Asn Leu Ala Gly Gly Asp 100 105 110 Asp Asp Asp Lys Ile Asn Ser Ser Gly Glu Thr Phe Arg Pro Thr Ala 115 120 125 Asp Phe His Pro Ser Leu Trp Arg Asn His Phe Leu Lys Gly Ala Ser 130 135 140 Asp Phe Lys Thr Val Asp His Thr Ala Thr Gln Glu Arg His Glu Ala 145 150 155 160 Leu Lys Glu Glu Val Arg Arg Met Ile Thr Asp Ala Glu Asp Lys Pro 165 170 175 Val Gln Lys Leu Arg Leu Ile Asp Glu Val Gln Arg Leu Gly Val Ala 180 185 190 Tyr His Phe Glu Lys Glu Ile Glu Asp Ala Ile Gln Lys Leu Cys Pro 195 200 205 Ile Tyr Ile Asp Ser Asn Arg Ala Asp Leu His Thr Val Ser Leu His 210 215 220 Phe Arg Leu Leu Arg Gln Gln Gly Ile Lys Ile Ser Cys Asp Val Phe 225 230 235 240 Glu Lys Phe Lys Asp Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile Asn 245 250 255 Asp Val Gln Gly Met Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala Val 260 265 270 Arg Gly Glu His Ile Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr His 275 280 285 Leu Lys Ser Leu Val Ala Gln Asp His Val Thr Pro Lys Leu Ala Glu 290 295 300 Gln Ile Asn His Ala Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro Arg 305 310 315 320 Leu Glu Ala Arg Tyr Phe Met Ser Met Ile Asn Ser Thr Ser Asp His 325 330 335 Leu Tyr Asn Lys Thr Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn Ile 340 345 350 Leu Leu Glu Leu His Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp Trp 355 360 365 Lys Asp Leu Asp Phe Thr Thr Lys Leu Pro Tyr Ala Arg Asp Arg Leu 370 375 380 Val Glu Leu Tyr 
Phe Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln Tyr 385 390 395 400 Ala Phe Gly Arg Lys Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser Ile 405 410 415 Ile Asp Asp Thr Tyr Asp Ala Tyr Gly Thr Leu Glu Glu Leu Ser Leu 420 425 430 Phe Thr Glu Ala Val Gln Arg Trp Asn Ile Glu Ala Val Asp Met Leu 435 440 445 Pro Glu Tyr Met Lys Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe Asn 450 455 460 Glu Ile Glu Glu Asp Met Ala Lys Gln Gly Arg Ser His Cys Val Arg 465 470 475 480 Tyr Ala Lys Glu Glu Asn Gln Lys Val Ile Gly Ala Tyr Ser Val Gln 485 490 495 Ala Lys Trp Phe Ser Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr Met 500 505 510 Pro Ile Ala Leu Thr Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn Ser 515 520 525 Phe Leu Gly Met Gly Asp Phe Ala Thr Lys Glu Val Phe Glu Trp Ile 530 535 540 Ser Asn Asn Pro Lys Val Val Lys Ala Ala Ser Val Ile Cys Arg Leu 545 550 555 560 Met Asp Asp Met Gln Gly His Glu Phe Glu Gln Lys Arg Gly His Val 565 570 575 Ala Ser Ala Ile Glu Cys Tyr Thr Lys Gln His Gly Val Ser Lys Glu 580 585 590 Glu Ala Ile Lys Met Phe Glu Glu Glu Val Ala Asn Ala Trp Lys Asp 595 600 605 Ile Asn Glu Glu Leu Met Met Lys Pro Thr Val Val Ala Arg Pro Leu 610 615 620 Leu Gly Thr Ile Leu Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr Lys 625 630 635 640 Glu Asp Asp Gly Tyr Thr His Ser Tyr Leu Ile Lys Asp Gln Ile Ala 645 650 655 Ser Val Leu Gly Asp His Val Pro Phe Ile Asn 660 665 162772DNAArtificialsynthetic fusion gene MBP-ValF 16atg aag atc gag gaa ggc aag ctc gtc atc tgg atc aac ggc gac aag 48Met Lys Ile Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp Lys 1 5 10 15 ggc tac aac ggc ctc gcc gag gtg ggc aag aag ttc gag aag gac acg 96Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys Lys Phe Glu Lys Asp Thr 20 25 30 ggc atc aag gtc acc gtc gag cat ccc gac aag ctc gag gag aag ttc 144Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys Phe 35 40 45 ccg cag gtc gcc gcc acc ggc gac ggc ccc gac atc atc ttc tgg gcc 192Pro Gln Val Ala Ala Thr Gly Asp Gly Pro Asp Ile Ile Phe Trp Ala 50 55 60 cac gac cgc ttc ggc ggc tat gcg cag tcg ggc 
ctg ctc gcc gag atc 240His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu Ile 65 70 75 80 acg ccc gac aag gcc ttc cag gac aag ctc tat ccc ttc acc tgg gat 288Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp Asp 85 90 95 gcg gtg cgc tac aac ggc aag ctg atc gcc tat ccg atc gcc gtc gag 336Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala Tyr Pro Ile Ala Val Glu 100 105 110 gcg ctg tcg ctg atc tac aac aag gat ctg ctg ccg aac ccg ccg aag 384Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro Lys 115 120 125 acc tgg gaa gag atc ccg gcg ctc gac aag gaa ctg aag gcc aag ggc 432Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys Glu Leu Lys Ala Lys Gly 130 135 140 aag tcc gcg ctg atg ttc aac ctg cag gag ccc tat ttc acc tgg ccg 480Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp Pro 145 150 155 160 ctg atc gcc gcc gac ggc ggc tat gcc ttc aaa tac gag aac ggc aaa 528Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly Lys 165 170 175 tac gac atc aag gac gtg ggc gtc gac aat gcg ggc gcc aag gcc ggg 576Tyr Asp Ile Lys Asp Val Gly Val Asp Asn Ala Gly Ala Lys Ala Gly 180 185 190 ctg acc ttc ctc gtc gat ctg atc aag aac aag cac atg aat gcc gac 624Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn Ala Asp 195 200 205 acc gac tat tcc atc gcc gag gcg gcc ttc aac aag ggc gag acc gcc 672Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr Ala 210 215 220 atg acg atc aac ggg ccg tgg gcc tgg tcg aac atc gac acc tcg aag 720Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile Asp Thr Ser Lys 225 230 235 240 gtc aat tac ggc gtc acg gtg ctg ccg acc ttc aag ggc cag ccc tcg 768Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro Ser 245 250 255 aaa ccc ttc gtc ggc gtg ctg tcg gcg ggc atc aac gcg gcc tcg ccg 816Lys Pro Phe Val Gly Val Leu Ser Ala Gly Ile Asn Ala Ala Ser Pro 260 265 270 aac aag gaa ctc gcc aag gag ttc ctc gag aac tac ctg ctg acc gac 864Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr Leu Leu Thr Asp 275 280 285 gag ggg ctc gag gcg gtg aac aag gac 
aag ccg ctc ggc gcg gtg gcg 912Glu Gly Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val Ala 290 295 300 ctg aaa tcc tac gag gaa gag ctc gtc aag gac ccg cgg atc gcc gcc 960Leu Lys Ser Tyr Glu Glu Glu Leu Val Lys Asp Pro Arg Ile Ala Ala 305 310 315 320 acg atg gag aat gcg cag aag ggc gag atc atg ccg aac atc ccg cag 1008Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met Pro Asn Ile Pro Gln 325 330 335 atg tcg gcc ttc tgg tat gcc gtc cgc acc gcg gtg atc aac gcg gcc 1056Met Ser Ala Phe Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala Ala 340 345 350 tcg ggc cgt cag acc gtc gac gag gcg ctg aag gat gcg cag act ggt 1104Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys Asp Ala Gln Thr Gly 355 360 365 gat gac gac gac aag att atg tcg agc ggc gag acc ttc cgc ccc acg 1152Asp Asp Asp Asp Lys Ile Met Ser Ser Gly Glu Thr Phe Arg Pro Thr 370 375 380 gcc gac ttc cat ccg tcc ctc tgg cgg aac cac ttc ctc aag ggg gcc 1200Ala Asp Phe His Pro Ser Leu Trp Arg Asn His Phe Leu Lys Gly Ala 385 390 395 400 tcc gat ttc aag acc gtg gac cat acg gcg acg cag gaa cgg cac gag 1248Ser Asp Phe Lys Thr Val Asp His Thr Ala Thr Gln Glu Arg His Glu 405 410 415 gcc ctc aag gag gag gtc cgc cgc atg atc acc gac gcc gaa gac aag 1296Ala Leu Lys Glu Glu Val Arg Arg Met Ile Thr Asp Ala Glu Asp Lys 420 425 430 ccg gtc cag aag ctc cgc ctg atc gac gag gtc cag cgc ctg ggc gtg 1344Pro Val Gln Lys Leu Arg Leu Ile Asp Glu Val Gln Arg Leu Gly Val 435 440 445 gcg tat cat ttc gag aaa gaa atc gag gat gcg atc cag aag ctc tgc 1392Ala Tyr His Phe Glu Lys Glu Ile Glu Asp Ala Ile Gln Lys Leu Cys 450 455 460 ccg atc tat atc gat agc aat cgc gcc gat ctc cat acc gtg tcg ctg 1440Pro Ile Tyr Ile Asp Ser Asn Arg Ala Asp Leu His Thr Val Ser Leu 465 470 475 480 cac ttc cgc ctg ctg cgg cag cag ggc atc aag atc agc tgc gac gtg 1488His Phe Arg Leu Leu Arg Gln Gln Gly Ile Lys Ile Ser Cys Asp Val 485 490 495 ttc gaa aag ttc aag gac gac gag ggc cgc ttc aag tcg tcg ctg atc 1536Phe Glu Lys Phe Lys Asp Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile 500 505 510 aac gac 
gtg cag ggc atg ctg tcg ctg tac gag gcc gcg tac atg gcc 1584Asn Asp Val Gln Gly Met Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala 515 520 525 gtg cgc ggc gag cat atc ctg gac gaa gcc atc gcg ttc acg acc acg 1632Val Arg Gly Glu His Ile Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr 530 535 540 cat ctg aag tcg ctg gtg gcg cag gac cac gtg acg ccg aag ctc gcc 1680His Leu Lys Ser Leu Val Ala Gln Asp His Val Thr Pro Lys Leu Ala 545 550 555 560 gag cag atc aac cac gcg ctg tat cgg ccg ctc cgc aag acc ctc ccg 1728Glu Gln Ile Asn His Ala Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro 565 570 575 cgc ctc gag gcc cgc tat ttc atg agc atg atc aac tcg acc tcg gat 1776Arg Leu Glu Ala Arg Tyr Phe Met Ser Met Ile Asn Ser Thr Ser Asp 580 585 590 cac ctg tac aat aag acc ctg ctc aac ttc gcg aaa ctg gac ttc aat 1824His Leu Tyr Asn Lys Thr Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn 595 600 605 atc ctc ctc gag ctg cac aag gag gag ctc aac gag ctg acc aag tgg 1872Ile Leu Leu Glu Leu His Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp 610 615 620 tgg aag gat ctg gac ttc acc acc aag ctg ccg tac gcc cgc gat cgc 1920Trp Lys Asp Leu Asp Phe Thr Thr Lys Leu Pro Tyr Ala Arg Asp Arg 625 630 635 640 ctc gtg gag ctg tat ttc tgg gac ctg ggc acc tac ttc gaa ccc cag 1968Leu Val Glu Leu Tyr Phe Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln 645 650 655 tac gcc ttc ggg cgg aag atc atg acc cag ctc aat tat atc ctc agc 2016Tyr Ala Phe Gly Arg Lys Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser 660 665 670 atc atc gac gac acc tat gac gcg tac ggc acg ctg gag gag ctg tcc 2064Ile Ile Asp Asp Thr Tyr Asp Ala Tyr Gly Thr Leu Glu Glu Leu Ser 675 680 685 ctg ttc acg gaa gcc gtc cag cgg tgg aac atc gag gcc gtc gac atg 2112Leu Phe Thr Glu Ala Val Gln Arg Trp Asn Ile Glu Ala Val Asp Met 690 695 700 ctc ccc gag tac atg aaa ctg atc tac cgg acc ctg ctg gat gcc ttc 2160Leu Pro Glu Tyr Met Lys Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe 705 710 715 720 aac gag atc gag gag gac atg gcg aaa cag ggc cgg tcc cac tgc gtg 2208Asn Glu Ile Glu Glu Asp Met Ala Lys Gln Gly Arg Ser His 
Cys Val 725 730 735 cgc tac gcg aag gaa gag aac cag aag gtc atc ggc gcc tac tcg gtc 2256Arg Tyr Ala Lys Glu Glu Asn Gln Lys Val Ile Gly Ala Tyr Ser Val 740 745 750 cag gcg aag tgg ttc agc gag ggc tat gtg ccg acg atc gag gaa tat 2304Gln Ala Lys Trp Phe Ser Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr 755 760 765 atg ccg atc gcg ctc acc tcg tgc gcg tac acg ttc gtg atc acc aat 2352Met Pro Ile Ala Leu Thr Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn 770 775 780 tcg ttc ctc ggc atg ggc gat ttc gcg acc aag gag gtc ttc gag tgg 2400Ser Phe Leu Gly Met Gly Asp Phe Ala Thr Lys Glu Val Phe Glu Trp 785 790 795 800 atc agc aac aat ccg aag gtg gtg aag gcg gcc tcg gtc atc tgc cgg 2448Ile Ser Asn Asn Pro Lys Val Val Lys Ala Ala Ser Val Ile Cys Arg 805 810 815 ctc atg gat gac atg cag ggg cat gag ttc gaa cag aag cgc ggc cac 2496Leu Met Asp Asp Met Gln Gly His Glu Phe Glu Gln Lys Arg Gly His 820 825 830 gtc gcg tcc gcc atc gag tgc tat acc aag cag cat ggc gtg tcg aag 2544Val Ala Ser Ala Ile Glu Cys Tyr Thr Lys Gln His Gly Val Ser Lys 835 840 845 gag gag gcc atc aag atg ttc gag gag gaa gtc gcc aac gcg tgg aag 2592Glu Glu Ala Ile Lys Met Phe Glu Glu Glu Val Ala Asn Ala Trp Lys 850 855 860 gac atc aat gag gag ctg atg atg aag ccc acc gtc gtg gcc cgc ccc 2640Asp Ile Asn Glu Glu Leu Met Met Lys Pro Thr Val Val Ala Arg Pro 865 870 875 880 ctg ctg ggc acc atc ctg aac ctc gcc cgc gcc atc gac ttc atc tac 2688Leu Leu Gly Thr Ile Leu Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr 885 890 895 aag gag gac gat ggg tat acg cat tcc tat ctg atc aag gac cag atc 2736Lys Glu Asp Asp Gly Tyr Thr His Ser Tyr Leu Ile Lys Asp Gln Ile 900 905 910 gcc tcg gtc ctc ggc gat cat gtc ccg ttc tga taa 2772Ala Ser Val Leu Gly Asp His Val Pro Phe 915 920

17 922 PRT Artificial Synthetic Construct 17 Met Lys Ile Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp Lys 1 5 10 15 Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys Lys Phe Glu Lys Asp Thr 20 25 30 Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys Phe 35 40 45 Pro Gln Val Ala Ala Thr Gly Asp Gly Pro Asp Ile Ile Phe Trp Ala 50 55 60 His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu Ile 65 70 75 80 Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp Asp 85 90 95 Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala Tyr Pro Ile Ala Val Glu 100 105 110 Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro Lys 115 120 125 Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys Glu Leu Lys Ala Lys Gly 130 135 140 Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp Pro 145 150 155 160 Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly Lys 165 170 175 Tyr Asp Ile Lys Asp Val Gly Val Asp Asn Ala Gly Ala Lys Ala Gly 180 185 190 Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn Ala Asp 195 200 205 Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr Ala 210 215 220 Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile Asp Thr Ser Lys 225 230 235 240 Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro Ser 245 250 255 Lys Pro Phe Val Gly Val Leu Ser Ala Gly Ile Asn Ala Ala Ser Pro 260 265 270 Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr Leu Leu Thr Asp 275 280 285 Glu Gly Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val Ala 290 295 300 Leu Lys Ser Tyr Glu Glu Glu Leu Val Lys Asp Pro Arg Ile Ala Ala 305 310 315 320 Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met Pro Asn Ile Pro Gln 325 330 335 Met Ser Ala Phe Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala Ala 340 345 350 Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys Asp Ala Gln Thr Gly 355 360 365 Asp Asp Asp Asp Lys Ile Met Ser Ser Gly Glu Thr Phe Arg Pro Thr 370 375 380 Ala Asp Phe His Pro Ser Leu Trp Arg Asn His Phe Leu Lys Gly Ala 385 390 395 400 Ser Asp Phe Lys Thr Val Asp His Thr Ala Thr Gln Glu Arg 
His Glu 405 410 415 Ala Leu Lys Glu Glu Val Arg Arg Met Ile Thr Asp Ala Glu Asp Lys 420 425 430 Pro Val Gln Lys Leu Arg Leu Ile Asp Glu Val Gln Arg Leu Gly Val 435 440 445 Ala Tyr His Phe Glu Lys Glu Ile Glu Asp Ala Ile Gln Lys Leu Cys 450 455 460 Pro Ile Tyr Ile Asp Ser Asn Arg Ala Asp Leu His Thr Val Ser Leu 465 470 475 480 His Phe Arg Leu Leu Arg Gln Gln Gly Ile Lys Ile Ser Cys Asp Val 485 490 495 Phe Glu Lys Phe Lys Asp Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile 500 505 510 Asn Asp Val Gln Gly Met Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala 515 520 525 Val Arg Gly Glu His Ile Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr 530 535 540 His Leu Lys Ser Leu Val Ala Gln Asp His Val Thr Pro Lys Leu Ala 545 550 555 560 Glu Gln Ile Asn His Ala Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro 565 570 575 Arg Leu Glu Ala Arg Tyr Phe Met Ser Met Ile Asn Ser Thr Ser Asp 580 585 590 His Leu Tyr Asn Lys Thr Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn 595 600 605 Ile Leu Leu Glu Leu His Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp 610 615 620 Trp Lys Asp Leu Asp Phe Thr Thr Lys Leu Pro Tyr Ala Arg Asp Arg 625 630 635 640 Leu Val Glu Leu Tyr Phe Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln 645 650 655 Tyr Ala Phe Gly Arg Lys Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser 660 665 670 Ile Ile Asp Asp Thr Tyr Asp Ala Tyr Gly Thr Leu Glu Glu Leu Ser 675 680 685 Leu Phe Thr Glu Ala Val Gln Arg Trp Asn Ile Glu Ala Val Asp Met 690 695 700 Leu Pro Glu Tyr Met Lys Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe 705 710 715 720 Asn Glu Ile Glu Glu Asp Met Ala Lys Gln Gly Arg Ser His Cys Val 725 730 735 Arg Tyr Ala Lys Glu Glu Asn Gln Lys Val Ile Gly Ala Tyr Ser Val 740 745 750 Gln Ala Lys Trp Phe Ser Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr 755 760 765 Met Pro Ile Ala Leu Thr Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn 770 775 780 Ser Phe Leu Gly Met Gly Asp Phe Ala Thr Lys Glu Val Phe Glu Trp 785 790 795 800 Ile Ser Asn Asn Pro Lys Val Val Lys Ala Ala Ser Val Ile Cys Arg 805 810 815 Leu Met Asp Asp Met Gln Gly His Glu Phe Glu Gln Lys Arg Gly 
His 820 825 830 Val Ala Ser Ala Ile Glu Cys Tyr Thr Lys Gln His Gly Val Ser Lys 835 840 845 Glu Glu Ala Ile Lys Met Phe Glu Glu Glu Val Ala Asn Ala Trp Lys 850 855 860 Asp Ile Asn Glu Glu Leu Met Met Lys Pro Thr Val Val Ala Arg Pro 865 870 875 880 Leu Leu Gly Thr Ile Leu Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr 885 890 895 Lys Glu Asp Asp Gly Tyr Thr His Ser Tyr Leu Ile Lys Asp Gln Ile 900 905 910 Ala Ser Val Leu Gly Asp His Val Pro Phe 915 920 181773DNAArtificialcodon optimized ValC gene 18atggccgaaa tgttcaatgg caattccagc aatgatggca gctcctgcat gccggtcaag 60gacgcgctgc gccgcaccgg gaaccaccat ccgaacctct ggaccgacga tttcatccag 120tcgctgaact ccccctattc ggattcctcg tatcataaac atcgcgagat cctgatcgat 180gagatccggg acatgttctc caacggcgag ggggatgagt tcggggtcct cgagaacatc 240tggttcgtcg acgtggtcca gcggctgggc atcgatcggc acttccagga agagatcaag 300acggccctgg attatatcta taagttctgg aaccatgata gcatcttcgg cgacctcaac 360atggtggcgc tggggttccg catcctgcgg ctcaatcgct acgtggcgtc gtcggacgtg 420ttcaagaagt tcaagggcga ggagggccag ttctcggggt tcgagagcag cgatcaggac 480gccaagctgg agatgatgct gaacctctac aaggcctcgg aactcgactt cccggatgag 540gacatcctca aggaagcgcg ggccttcgcg tcgatgtatc tcaagcatgt catcaaggag 600tatggggaca tccaggaatc gaagaacccc ctgctcatgg agatcgagta caccttcaag 660tacccctggc gctgccgcct cccgcggctg gaggcgtgga acttcatcca catcatgcgg 720cagcaggact gcaatatctc gctcgccaac aacctctata agatcccgaa gatctatatg 780aagaagatcc tggagctggc gatcctcgac ttcaacatcc tccagagcca gcatcagcat 840gagatgaaac tgatcagcac gtggtggaag aactcgtccg cgatccagct cgacttcttc 900cgccaccgcc atatcgagag ctacttctgg tgggccagcc cgctgttcga gcccgagttc 960tccacctgcc gcatcaactg caccaagctg tccaccaaga tgttcctcct ggacgacatc 1020tatgacacgt acgggaccgt cgaggaactc aagccgttca cgaccaccct cacgcgctgg 1080gatgtcagca cggtggacaa tcacccggac tacatgaaga tcgcgttcaa tttctcctac 1140gagatctaca aggagatcgc gtccgaggcc gagcgcaagc acggcccgtt cgtgtataag 1200tatctccagt cgtgctggaa gtcgtatatc gaggcgtata tgcaggaggc cgagtggatc 1260gcctccaacc acatccccgg cttcgacgag tacctgatga 
atggcgtgaa gagctcgggg 1320atgcgcatcc tcatgatcca tgcgctgatc ctgatggata cgcccctgtc cgacgagatc 1380ctcgagcagc tcgacatccc gagcagcaag agccaggccc tgctgtcgct catcacgcgg 1440ctcgtcgatg atgtgaagga tttcgaggac gagcaggcgc atggggagat ggcctcgtcg 1500atcgaatgct atatgaagga taatcacggc tccacgcgcg aggacgccct gaactacctg 1560aaaatccgca tcgagagctg cgtgcaggag ctcaacaagg aactcctcga accgagcaac 1620atgcatggca gcttccgcaa cctgtacctc aacgtgggca tgcgggtgat cttcttcatg 1680ctgaacgacg gggacctctt cacccattcg aatcggaagg agatccagga tgcgatcacg 1740aagttcttcg tggaaccgat catcccgtga taa 1773191725DNAArtificialcodon optimized ValC gene short 19atgccggtca aggacgcgct gcgccgcacc gggaaccacc atccgaacct ctggaccgac 60gatttcatcc agtcgctgaa ctccccctat tcggattcct cgtatcataa acatcgcgag 120atcctgatcg atgagatccg ggacatgttc tccaacggcg agggggatga gttcggggtc 180ctcgagaaca tctggttcgt cgacgtggtc cagcggctgg gcatcgatcg gcacttccag 240gaagagatca agacggccct ggattatatc tataagttct ggaaccatga tagcatcttc 300ggcgacctca acatggtggc gctggggttc cgcatcctgc ggctcaatcg ctacgtggcg 360tcgtcggacg tgttcaagaa gttcaagggc gaggagggcc agttctcggg gttcgagagc 420agcgatcagg acgccaagct ggagatgatg ctgaacctct acaaggcctc ggaactcgac 480ttcccggatg aggacatcct caaggaagcg cgggccttcg cgtcgatgta tctcaagcat 540gtcatcaagg agtatgggga catccaggaa tcgaagaacc ccctgctcat ggagatcgag 600tacaccttca agtacccctg gcgctgccgc ctcccgcggc tggaggcgtg gaacttcatc 660cacatcatgc ggcagcagga ctgcaatatc tcgctcgcca acaacctcta taagatcccg 720aagatctata tgaagaagat cctggagctg gcgatcctcg acttcaacat cctccagagc 780cagcatcagc atgagatgaa actgatcagc acgtggtgga agaactcgtc cgcgatccag 840ctcgacttct tccgccaccg ccatatcgag agctacttct ggtgggccag cccgctgttc 900gagcccgagt tctccacctg ccgcatcaac tgcaccaagc tgtccaccaa gatgttcctc 960ctggacgaca tctatgacac gtacgggacc gtcgaggaac tcaagccgtt cacgaccacc 1020ctcacgcgct gggatgtcag cacggtggac aatcacccgg actacatgaa gatcgcgttc 1080aatttctcct acgagatcta caaggagatc gcgtccgagg ccgagcgcaa gcacggcccg 1140ttcgtgtata agtatctcca gtcgtgctgg aagtcgtata tcgaggcgta tatgcaggag 
1200gccgagtgga tcgcctccaa ccacatcccc ggcttcgacg agtacctgat gaatggcgtg 1260aagagctcgg ggatgcgcat cctcatgatc catgcgctga tcctgatgga tacgcccctg 1320tccgacgaga tcctcgagca gctcgacatc ccgagcagca agagccaggc cctgctgtcg 1380ctcatcacgc ggctcgtcga tgatgtgaag gatttcgagg acgagcaggc gcatggggag 1440atggcctcgt cgatcgaatg ctatatgaag gataatcacg gctccacgcg cgaggacgcc 1500ctgaactacc tgaaaatccg catcgagagc tgcgtgcagg agctcaacaa ggaactcctc 1560gaaccgagca acatgcatgg cagcttccgc aacctgtacc tcaacgtggg catgcgggtg 1620atcttcttca tgctgaacga cggggacctc ttcacccatt cgaatcggaa ggagatccag 1680gatgcgatca cgaagttctt cgtggaaccg atcatcccgt gataa 1725202895DNAArtificialsynthetic fusion gene MBP-ValC 20atg aag atc gag gaa ggc aag ctc gtc atc tgg atc aac ggc gac aag 48Met Lys Ile Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp Lys 1 5 10 15 ggc tac aac ggc ctc gcc gag gtg ggc aag aag ttc gag aag gac acg 96Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys Lys Phe Glu Lys Asp Thr 20 25 30 ggc atc aag gtc acc gtc gag cat ccc gac aag ctc gag gag aag ttc 144Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys Phe 35 40 45 ccg cag gtc gcc gcc acc ggc gac ggc ccc gac atc atc ttc tgg gcc 192Pro Gln Val Ala Ala Thr Gly Asp Gly Pro Asp Ile Ile Phe Trp Ala 50 55 60 cac gac cgc ttc ggc ggc tat gcg cag tcg ggc ctg ctc gcc gag atc 240His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu Ile 65 70 75 80 acg ccc gac aag gcc ttc cag gac aag ctc tat ccc ttc acc tgg gat 288Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp Asp 85 90 95 gcg gtg cgc tac aac ggc aag ctg atc gcc tat ccg atc gcc gtc gag 336Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala Tyr Pro Ile Ala Val Glu 100 105 110 gcg ctg tcg ctg atc tac aac aag gat ctg ctg ccg aac ccg ccg aag 384Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro Lys 115 120 125 acc tgg gaa gag atc ccg gcg ctc gac aag gaa ctg aag gcc aag ggc 432Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys Glu Leu Lys Ala Lys Gly 130 135 140 aag tcc gcg ctg atg ttc aac ctg cag gag ccc tat ttc acc 
tgg ccg 480Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp Pro 145 150 155 160 ctg atc gcc gcc gac ggc ggc tat gcc ttc aaa tac gag aac ggc aaa 528Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly Lys 165 170 175 tac gac atc aag gac gtg ggc gtc gac aat gcg ggc gcc aag gcc ggg 576Tyr Asp Ile Lys Asp Val Gly Val Asp Asn Ala Gly Ala Lys Ala Gly 180 185 190 ctg acc ttc ctc gtc gat ctg atc aag aac aag cac atg aat gcc gac 624Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn Ala Asp 195 200 205 acc gac tat tcc atc gcc gag gcg gcc ttc aac aag ggc gag acc gcc 672Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr Ala 210 215 220 atg acg atc aac ggg ccg tgg gcc tgg tcg aac atc gac acc tcg aag 720Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile Asp Thr Ser Lys 225 230 235 240 gtc aat tac ggc gtc acg gtg ctg ccg acc ttc aag ggc cag ccc tcg 768Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro Ser 245 250 255 aaa ccc ttc gtc ggc gtg ctg tcg gcg ggc atc aac gcg gcc tcg ccg 816Lys Pro Phe Val Gly Val Leu Ser Ala Gly Ile Asn Ala Ala Ser Pro 260 265 270 aac aag gaa ctc gcc aag gag ttc ctc gag aac tac ctg ctg acc gac 864Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr Leu Leu Thr Asp 275 280 285 gag ggg ctc gag gcg gtg aac aag gac aag ccg ctc ggc gcg gtg gcg 912Glu Gly Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val Ala 290 295 300 ctg aaa tcc tac gag gaa gag ctc gtc aag gac ccg cgg atc gcc gcc 960Leu Lys Ser Tyr Glu Glu Glu Leu Val Lys Asp Pro Arg Ile Ala Ala 305 310 315 320 acg atg gag aat gcg cag aag ggc gag atc atg ccg aac atc ccg cag 1008Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met Pro Asn Ile Pro Gln 325 330 335 atg tcg gcc ttc tgg tat gcc gtc cgc acc gcg gtg atc aac gcg gcc 1056Met Ser Ala Phe Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala Ala 340 345 350 tcg ggc cgt cag acc gtc gac gag gcg ctg aag gat gcg cag act ggt 1104Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys Asp Ala Gln Thr Gly 355 360 365 gat gac gac gac aag att atg gcc gaa 
atg ttc aat ggc aat tcc agc 1152Asp Asp Asp Asp Lys Ile Met Ala Glu Met Phe Asn Gly Asn Ser Ser 370 375 380 aat gat ggc agc tcc tgc atg ccg gtc aag gac gcg ctg cgc cgc acc 1200Asn Asp Gly Ser Ser Cys Met Pro Val Lys Asp Ala Leu Arg Arg Thr 385 390 395 400 ggg aac cac cat ccg aac ctc tgg acc gac gat ttc atc cag tcg ctg 1248Gly Asn His His Pro Asn Leu Trp Thr Asp Asp Phe Ile Gln Ser Leu 405 410 415 aac tcc ccc tat tcg gat tcc tcg tat cat aaa cat cgc gag atc ctg 1296Asn Ser Pro Tyr Ser Asp Ser Ser Tyr His Lys His Arg Glu Ile Leu 420 425 430 atc gat gag atc cgg gac atg ttc tcc aac ggc gag ggg gat gag ttc 1344Ile Asp Glu Ile Arg Asp Met Phe Ser Asn Gly Glu Gly Asp Glu Phe 435 440 445 ggg gtc ctc gag aac atc tgg ttc gtc gac gtg gtc cag cgg ctg ggc 1392Gly Val Leu Glu Asn Ile Trp Phe Val Asp Val Val Gln Arg Leu Gly 450 455 460 atc gat cgg cac ttc cag gaa gag atc aag acg gcc ctg gat tat atc 1440Ile Asp Arg His Phe Gln Glu Glu Ile Lys Thr Ala Leu Asp Tyr Ile
465 470 475 480 tat aag ttc tgg aac cat gat agc atc ttc ggc gac ctc aac atg gtg 1488Tyr Lys Phe Trp Asn His Asp Ser Ile Phe Gly Asp Leu Asn Met Val 485 490 495 gcg ctg ggg ttc cgc atc ctg cgg ctc aat cgc tac gtg gcg tcg tcg 1536Ala Leu Gly Phe Arg Ile Leu Arg Leu Asn Arg Tyr Val Ala Ser Ser 500 505 510 gac gtg ttc aag aag ttc aag ggc gag gag ggc cag ttc tcg ggg ttc 1584Asp Val Phe Lys Lys Phe Lys Gly Glu Glu Gly Gln Phe Ser Gly Phe 515 520 525 gag agc agc gat cag gac gcc aag ctg gag atg atg ctg aac ctc tac 1632Glu Ser Ser Asp Gln Asp Ala Lys Leu Glu Met Met Leu Asn Leu Tyr 530 535 540 aag gcc tcg gaa ctc gac ttc ccg gat gag gac atc ctc aag gaa gcg 1680Lys Ala Ser Glu Leu Asp Phe Pro Asp Glu Asp Ile Leu Lys Glu Ala 545 550 555 560 cgg gcc ttc gcg tcg atg tat ctc aag cat gtc atc aag gag tat ggg 1728Arg Ala Phe Ala Ser Met Tyr Leu Lys His Val Ile Lys Glu Tyr Gly 565 570 575 gac atc cag gaa tcg aag aac ccc ctg ctc atg gag atc gag tac acc 1776Asp Ile Gln Glu Ser Lys Asn Pro Leu Leu Met Glu Ile Glu Tyr Thr 580 585 590 ttc aag tac ccc tgg cgc tgc cgc ctc ccg cgg ctg gag gcg tgg aac 1824Phe Lys Tyr Pro Trp Arg Cys Arg Leu Pro Arg Leu Glu Ala Trp Asn 595 600 605 ttc atc cac atc atg cgg cag cag gac tgc aat atc tcg ctc gcc aac 1872Phe Ile His Ile Met Arg Gln Gln Asp Cys Asn Ile Ser Leu Ala Asn 610 615 620 aac ctc tat aag atc ccg aag atc tat atg aag aag atc ctg gag ctg 1920Asn Leu Tyr Lys Ile Pro Lys Ile Tyr Met Lys Lys Ile Leu Glu Leu 625 630 635 640 gcg atc ctc gac ttc aac atc ctc cag agc cag cat cag cat gag atg 1968Ala Ile Leu Asp Phe Asn Ile Leu Gln Ser Gln His Gln His Glu Met 645 650 655 aaa ctg atc agc acg tgg tgg aag aac tcg tcc gcg atc cag ctc gac 2016Lys Leu Ile Ser Thr Trp Trp Lys Asn Ser Ser Ala Ile Gln Leu Asp 660 665 670 ttc ttc cgc cac cgc cat atc gag agc tac ttc tgg tgg gcc agc ccg 2064Phe Phe Arg His Arg His Ile Glu Ser Tyr Phe Trp Trp Ala Ser Pro 675 680 685 ctg ttc gag ccc gag ttc tcc acc tgc cgc atc aac tgc acc aag ctg 2112Leu Phe Glu Pro Glu Phe Ser Thr Cys 
Arg Ile Asn Cys Thr Lys Leu 690 695 700 tcc acc aag atg ttc ctc ctg gac gac atc tat gac acg tac ggg acc 2160Ser Thr Lys Met Phe Leu Leu Asp Asp Ile Tyr Asp Thr Tyr Gly Thr 705 710 715 720 gtc gag gaa ctc aag ccg ttc acg acc acc ctc acg cgc tgg gat gtc 2208Val Glu Glu Leu Lys Pro Phe Thr Thr Thr Leu Thr Arg Trp Asp Val 725 730 735 agc acg gtg gac aat cac ccg gac tac atg aag atc gcg ttc aat ttc 2256Ser Thr Val Asp Asn His Pro Asp Tyr Met Lys Ile Ala Phe Asn Phe 740 745 750 tcc tac gag atc tac aag gag atc gcg tcc gag gcc gag cgc aag cac 2304Ser Tyr Glu Ile Tyr Lys Glu Ile Ala Ser Glu Ala Glu Arg Lys His 755 760 765 ggc ccg ttc gtg tat aag tat ctc cag tcg tgc tgg aag tcg tat atc 2352Gly Pro Phe Val Tyr Lys Tyr Leu Gln Ser Cys Trp Lys Ser Tyr Ile 770 775 780 gag gcg tat atg cag gag gcc gag tgg atc gcc tcc aac cac atc ccc 2400Glu Ala Tyr Met Gln Glu Ala Glu Trp Ile Ala Ser Asn His Ile Pro 785 790 795 800 ggc ttc gac gag tac ctg atg aat ggc gtg aag agc tcg ggg atg cgc 2448Gly Phe Asp Glu Tyr Leu Met Asn Gly Val Lys Ser Ser Gly Met Arg 805 810 815 atc ctc atg atc cat gcg ctg atc ctg atg gat acg ccc ctg tcc gac 2496Ile Leu Met Ile His Ala Leu Ile Leu Met Asp Thr Pro Leu Ser Asp 820 825 830 gag atc ctc gag cag ctc gac atc ccg agc agc aag agc cag gcc ctg 2544Glu Ile Leu Glu Gln Leu Asp Ile Pro Ser Ser Lys Ser Gln Ala Leu 835 840 845 ctg tcg ctc atc acg cgg ctc gtc gat gat gtg aag gat ttc gag gac 2592Leu Ser Leu Ile Thr Arg Leu Val Asp Asp Val Lys Asp Phe Glu Asp 850 855 860 gag cag gcg cat ggg gag atg gcc tcg tcg atc gaa tgc tat atg aag 2640Glu Gln Ala His Gly Glu Met Ala Ser Ser Ile Glu Cys Tyr Met Lys 865 870 875 880 gat aat cac ggc tcc acg cgc gag gac gcc ctg aac tac ctg aaa atc 2688Asp Asn His Gly Ser Thr Arg Glu Asp Ala Leu Asn Tyr Leu Lys Ile 885 890 895 cgc atc gag agc tgc gtg cag gag ctc aac aag gaa ctc ctc gaa ccg 2736Arg Ile Glu Ser Cys Val Gln Glu Leu Asn Lys Glu Leu Leu Glu Pro 900 905 910 agc aac atg cat ggc agc ttc cgc aac ctg tac ctc aac gtg ggc atg 2784Ser Asn 
Met His Gly Ser Phe Arg Asn Leu Tyr Leu Asn Val Gly Met 915 920 925 cgg gtg atc ttc ttc atg ctg aac gac ggg gac ctc ttc acc cat tcg 2832Arg Val Ile Phe Phe Met Leu Asn Asp Gly Asp Leu Phe Thr His Ser 930 935 940 aat cgg aag gag atc cag gat gcg atc acg aag ttc ttc gtg gaa ccg 2880Asn Arg Lys Glu Ile Gln Asp Ala Ile Thr Lys Phe Phe Val Glu Pro 945 950 955 960 atc atc ccg tga taa 2895Ile Ile Pro
SEQ ID NO: 21 (length: 963; type: PRT; organism: Artificial; feature: Synthetic Construct)
Met Lys Ile Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp Lys 1 5 10 15 Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys Lys Phe Glu Lys Asp Thr 20 25 30 Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys Phe 35 40 45 Pro Gln Val Ala Ala Thr Gly Asp Gly Pro Asp Ile Ile Phe Trp Ala 50 55 60 His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu Ile 65 70 75 80 Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp Asp 85 90 95 Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala Tyr Pro Ile Ala Val Glu 100 105 110 Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro Lys 115 120 125 Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys Glu Leu Lys Ala Lys Gly 130 135 140 Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp Pro 145 150 155 160 Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly Lys 165 170 175 Tyr Asp Ile Lys Asp Val Gly Val Asp Asn Ala Gly Ala Lys Ala Gly 180 185 190 Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn Ala Asp 195 200 205 Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr Ala 210 215 220 Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile Asp Thr Ser Lys 225 230 235 240 Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro Ser 245 250 255 Lys Pro Phe Val Gly Val Leu Ser Ala Gly Ile Asn Ala Ala Ser Pro 260 265 270 Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr Leu Leu Thr Asp 275 280 285 Glu Gly Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val Ala 290 295 300 Leu Lys Ser Tyr Glu Glu Glu Leu Val Lys Asp Pro Arg Ile Ala Ala 305 310 315 320 Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met
Pro Asn Ile Pro Gln 325 330 335 Met Ser Ala Phe Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala Ala 340 345 350 Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys Asp Ala Gln Thr Gly 355 360 365 Asp Asp Asp Asp Lys Ile Met Ala Glu Met Phe Asn Gly Asn Ser Ser 370 375 380 Asn Asp Gly Ser Ser Cys Met Pro Val Lys Asp Ala Leu Arg Arg Thr 385 390 395 400 Gly Asn His His Pro Asn Leu Trp Thr Asp Asp Phe Ile Gln Ser Leu 405 410 415 Asn Ser Pro Tyr Ser Asp Ser Ser Tyr His Lys His Arg Glu Ile Leu 420 425 430 Ile Asp Glu Ile Arg Asp Met Phe Ser Asn Gly Glu Gly Asp Glu Phe 435 440 445 Gly Val Leu Glu Asn Ile Trp Phe Val Asp Val Val Gln Arg Leu Gly 450 455 460 Ile Asp Arg His Phe Gln Glu Glu Ile Lys Thr Ala Leu Asp Tyr Ile 465 470 475 480 Tyr Lys Phe Trp Asn His Asp Ser Ile Phe Gly Asp Leu Asn Met Val 485 490 495 Ala Leu Gly Phe Arg Ile Leu Arg Leu Asn Arg Tyr Val Ala Ser Ser 500 505 510 Asp Val Phe Lys Lys Phe Lys Gly Glu Glu Gly Gln Phe Ser Gly Phe 515 520 525 Glu Ser Ser Asp Gln Asp Ala Lys Leu Glu Met Met Leu Asn Leu Tyr 530 535 540 Lys Ala Ser Glu Leu Asp Phe Pro Asp Glu Asp Ile Leu Lys Glu Ala 545 550 555 560 Arg Ala Phe Ala Ser Met Tyr Leu Lys His Val Ile Lys Glu Tyr Gly 565 570 575 Asp Ile Gln Glu Ser Lys Asn Pro Leu Leu Met Glu Ile Glu Tyr Thr 580 585 590 Phe Lys Tyr Pro Trp Arg Cys Arg Leu Pro Arg Leu Glu Ala Trp Asn 595 600 605 Phe Ile His Ile Met Arg Gln Gln Asp Cys Asn Ile Ser Leu Ala Asn 610 615 620 Asn Leu Tyr Lys Ile Pro Lys Ile Tyr Met Lys Lys Ile Leu Glu Leu 625 630 635 640 Ala Ile Leu Asp Phe Asn Ile Leu Gln Ser Gln His Gln His Glu Met 645 650 655 Lys Leu Ile Ser Thr Trp Trp Lys Asn Ser Ser Ala Ile Gln Leu Asp 660 665 670 Phe Phe Arg His Arg His Ile Glu Ser Tyr Phe Trp Trp Ala Ser Pro 675 680 685 Leu Phe Glu Pro Glu Phe Ser Thr Cys Arg Ile Asn Cys Thr Lys Leu 690 695 700 Ser Thr Lys Met Phe Leu Leu Asp Asp Ile Tyr Asp Thr Tyr Gly Thr 705 710 715 720 Val Glu Glu Leu Lys Pro Phe Thr Thr Thr Leu Thr Arg Trp Asp Val 725 730 735 Ser Thr Val Asp Asn His Pro Asp Tyr Met Lys Ile 
Ala Phe Asn Phe 740 745 750 Ser Tyr Glu Ile Tyr Lys Glu Ile Ala Ser Glu Ala Glu Arg Lys His 755 760 765 Gly Pro Phe Val Tyr Lys Tyr Leu Gln Ser Cys Trp Lys Ser Tyr Ile 770 775 780 Glu Ala Tyr Met Gln Glu Ala Glu Trp Ile Ala Ser Asn His Ile Pro 785 790 795 800 Gly Phe Asp Glu Tyr Leu Met Asn Gly Val Lys Ser Ser Gly Met Arg 805 810 815 Ile Leu Met Ile His Ala Leu Ile Leu Met Asp Thr Pro Leu Ser Asp 820 825 830 Glu Ile Leu Glu Gln Leu Asp Ile Pro Ser Ser Lys Ser Gln Ala Leu 835 840 845 Leu Ser Leu Ile Thr Arg Leu Val Asp Asp Val Lys Asp Phe Glu Asp 850 855 860 Glu Gln Ala His Gly Glu Met Ala Ser Ser Ile Glu Cys Tyr Met Lys 865 870 875 880 Asp Asn His Gly Ser Thr Arg Glu Asp Ala Leu Asn Tyr Leu Lys Ile 885 890 895 Arg Ile Glu Ser Cys Val Gln Glu Leu Asn Lys Glu Leu Leu Glu Pro 900 905 910 Ser Asn Met His Gly Ser Phe Arg Asn Leu Tyr Leu Asn Val Gly Met 915 920 925 Arg Val Ile Phe Phe Met Leu Asn Asp Gly Asp Leu Phe Thr His Ser 930 935 940 Asn Arg Lys Glu Ile Gln Asp Ala Ile Thr Lys Phe Phe Val Glu Pro 945 950 955 960 Ile Ile Pro
SEQ ID NO: 22 (length: 2847; type: DNA; organism: Artificial; feature: synthetic fusion gene MBP - ValC short)
atg aag atc gag gaa ggc aag ctc gtc atc tgg atc aac ggc gac aag 48Met Lys Ile Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp Lys 1 5 10 15 ggc tac aac ggc ctc gcc gag gtg ggc aag aag ttc gag aag gac acg 96Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys Lys Phe Glu Lys Asp Thr 20 25 30 ggc atc aag gtc acc gtc gag cat ccc gac aag ctc gag gag aag ttc 144Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys Phe 35 40 45 ccg cag gtc gcc gcc acc ggc gac ggc ccc gac atc atc ttc tgg gcc 192Pro Gln Val Ala Ala Thr Gly Asp Gly Pro Asp Ile Ile Phe Trp Ala 50 55 60 cac gac cgc ttc ggc ggc tat gcg cag tcg ggc ctg ctc gcc gag atc 240His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu Ile 65 70 75 80 acg ccc gac aag gcc ttc cag gac aag ctc tat ccc ttc acc tgg gat 288Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp Asp 85 90 95 gcg gtg cgc tac aac ggc aag ctg atc gcc tat ccg atc gcc
gtc gag 336Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala Tyr Pro Ile Ala Val Glu 100 105 110 gcg ctg tcg ctg atc tac aac aag gat ctg ctg ccg aac ccg ccg aag 384Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro Lys 115 120 125 acc tgg gaa gag atc ccg gcg ctc gac aag gaa ctg aag gcc aag ggc 432Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys Glu Leu Lys Ala Lys Gly 130 135 140 aag tcc gcg ctg atg ttc aac ctg cag gag ccc tat ttc acc tgg ccg 480Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp Pro 145 150 155 160 ctg atc gcc gcc gac ggc ggc tat gcc ttc aaa tac gag aac ggc aaa 528Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly Lys 165 170 175 tac gac atc aag gac gtg ggc gtc gac aat gcg ggc gcc aag gcc ggg 576Tyr Asp Ile Lys Asp Val Gly Val Asp Asn Ala Gly Ala Lys Ala Gly 180 185 190 ctg acc ttc ctc gtc gat ctg atc aag aac aag cac atg aat gcc gac 624Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn Ala Asp 195 200 205 acc gac tat tcc atc gcc gag gcg gcc ttc aac aag ggc gag acc gcc 672Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr Ala 210 215 220 atg acg atc aac ggg ccg tgg gcc tgg tcg aac atc gac acc tcg aag 720Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile Asp Thr Ser Lys 225 230 235 240 gtc aat tac ggc gtc acg gtg ctg ccg acc ttc aag ggc cag ccc tcg 768Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro Ser 245 250 255 aaa ccc ttc gtc ggc gtg ctg tcg gcg ggc atc aac gcg gcc tcg ccg 816Lys Pro Phe Val Gly Val Leu Ser Ala Gly Ile Asn Ala Ala Ser Pro 260 265 270 aac aag gaa ctc gcc aag gag ttc ctc gag aac tac ctg ctg acc gac
864Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr Leu Leu Thr Asp 275 280 285 gag ggg ctc gag gcg gtg aac aag gac aag ccg ctc ggc gcg gtg gcg 912Glu Gly Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val Ala 290 295 300 ctg aaa tcc tac gag gaa gag ctc gtc aag gac ccg cgg atc gcc gcc 960Leu Lys Ser Tyr Glu Glu Glu Leu Val Lys Asp Pro Arg Ile Ala Ala 305 310 315 320 acg atg gag aat gcg cag aag ggc gag atc atg ccg aac atc ccg cag 1008Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met Pro Asn Ile Pro Gln 325 330 335 atg tcg gcc ttc tgg tat gcc gtc cgc acc gcg gtg atc aac gcg gcc 1056Met Ser Ala Phe Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala Ala 340 345 350 tcg ggc cgt cag acc gtc gac gag gcg ctg aag gat gcg cag act ggt 1104Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys Asp Ala Gln Thr Gly 355 360 365 gat gac gac gac aag att atg ccg gtc aag gac gcg ctg cgc cgc acc 1152Asp Asp Asp Asp Lys Ile Met Pro Val Lys Asp Ala Leu Arg Arg Thr 370 375 380 ggg aac cac cat ccg aac ctc tgg acc gac gat ttc atc cag tcg ctg 1200Gly Asn His His Pro Asn Leu Trp Thr Asp Asp Phe Ile Gln Ser Leu 385 390 395 400 aac tcc ccc tat tcg gat tcc tcg tat cat aaa cat cgc gag atc ctg 1248Asn Ser Pro Tyr Ser Asp Ser Ser Tyr His Lys His Arg Glu Ile Leu 405 410 415 atc gat gag atc cgg gac atg ttc tcc aac ggc gag ggg gat gag ttc 1296Ile Asp Glu Ile Arg Asp Met Phe Ser Asn Gly Glu Gly Asp Glu Phe 420 425 430 ggg gtc ctc gag aac atc tgg ttc gtc gac gtg gtc cag cgg ctg ggc 1344Gly Val Leu Glu Asn Ile Trp Phe Val Asp Val Val Gln Arg Leu Gly 435 440 445 atc gat cgg cac ttc cag gaa gag atc aag acg gcc ctg gat tat atc 1392Ile Asp Arg His Phe Gln Glu Glu Ile Lys Thr Ala Leu Asp Tyr Ile 450 455 460 tat aag ttc tgg aac cat gat agc atc ttc ggc gac ctc aac atg gtg 1440Tyr Lys Phe Trp Asn His Asp Ser Ile Phe Gly Asp Leu Asn Met Val 465 470 475 480 gcg ctg ggg ttc cgc atc ctg cgg ctc aat cgc tac gtg gcg tcg tcg 1488Ala Leu Gly Phe Arg Ile Leu Arg Leu Asn Arg Tyr Val Ala Ser Ser 485 490 495 gac gtg ttc aag aag ttc aag ggc gag 
gag ggc cag ttc tcg ggg ttc 1536Asp Val Phe Lys Lys Phe Lys Gly Glu Glu Gly Gln Phe Ser Gly Phe 500 505 510 gag agc agc gat cag gac gcc aag ctg gag atg atg ctg aac ctc tac 1584Glu Ser Ser Asp Gln Asp Ala Lys Leu Glu Met Met Leu Asn Leu Tyr 515 520 525 aag gcc tcg gaa ctc gac ttc ccg gat gag gac atc ctc aag gaa gcg 1632Lys Ala Ser Glu Leu Asp Phe Pro Asp Glu Asp Ile Leu Lys Glu Ala 530 535 540 cgg gcc ttc gcg tcg atg tat ctc aag cat gtc atc aag gag tat ggg 1680Arg Ala Phe Ala Ser Met Tyr Leu Lys His Val Ile Lys Glu Tyr Gly 545 550 555 560 gac atc cag gaa tcg aag aac ccc ctg ctc atg gag atc gag tac acc 1728Asp Ile Gln Glu Ser Lys Asn Pro Leu Leu Met Glu Ile Glu Tyr Thr 565 570 575 ttc aag tac ccc tgg cgc tgc cgc ctc ccg cgg ctg gag gcg tgg aac 1776Phe Lys Tyr Pro Trp Arg Cys Arg Leu Pro Arg Leu Glu Ala Trp Asn 580 585 590 ttc atc cac atc atg cgg cag cag gac tgc aat atc tcg ctc gcc aac 1824Phe Ile His Ile Met Arg Gln Gln Asp Cys Asn Ile Ser Leu Ala Asn 595 600 605 aac ctc tat aag atc ccg aag atc tat atg aag aag atc ctg gag ctg 1872Asn Leu Tyr Lys Ile Pro Lys Ile Tyr Met Lys Lys Ile Leu Glu Leu 610 615 620 gcg atc ctc gac ttc aac atc ctc cag agc cag cat cag cat gag atg 1920Ala Ile Leu Asp Phe Asn Ile Leu Gln Ser Gln His Gln His Glu Met 625 630 635 640 aaa ctg atc agc acg tgg tgg aag aac tcg tcc gcg atc cag ctc gac 1968Lys Leu Ile Ser Thr Trp Trp Lys Asn Ser Ser Ala Ile Gln Leu Asp 645 650 655 ttc ttc cgc cac cgc cat atc gag agc tac ttc tgg tgg gcc agc ccg 2016Phe Phe Arg His Arg His Ile Glu Ser Tyr Phe Trp Trp Ala Ser Pro 660 665 670 ctg ttc gag ccc gag ttc tcc acc tgc cgc atc aac tgc acc aag ctg 2064Leu Phe Glu Pro Glu Phe Ser Thr Cys Arg Ile Asn Cys Thr Lys Leu 675 680 685 tcc acc aag atg ttc ctc ctg gac gac atc tat gac acg tac ggg acc 2112Ser Thr Lys Met Phe Leu Leu Asp Asp Ile Tyr Asp Thr Tyr Gly Thr 690 695 700 gtc gag gaa ctc aag ccg ttc acg acc acc ctc acg cgc tgg gat gtc 2160Val Glu Glu Leu Lys Pro Phe Thr Thr Thr Leu Thr Arg Trp Asp Val 705 710 715 720 agc acg 
gtg gac aat cac ccg gac tac atg aag atc gcg ttc aat ttc 2208Ser Thr Val Asp Asn His Pro Asp Tyr Met Lys Ile Ala Phe Asn Phe 725 730 735 tcc tac gag atc tac aag gag atc gcg tcc gag gcc gag cgc aag cac 2256Ser Tyr Glu Ile Tyr Lys Glu Ile Ala Ser Glu Ala Glu Arg Lys His 740 745 750 ggc ccg ttc gtg tat aag tat ctc cag tcg tgc tgg aag tcg tat atc 2304Gly Pro Phe Val Tyr Lys Tyr Leu Gln Ser Cys Trp Lys Ser Tyr Ile 755 760 765 gag gcg tat atg cag gag gcc gag tgg atc gcc tcc aac cac atc ccc 2352Glu Ala Tyr Met Gln Glu Ala Glu Trp Ile Ala Ser Asn His Ile Pro 770 775 780 ggc ttc gac gag tac ctg atg aat ggc gtg aag agc tcg ggg atg cgc 2400Gly Phe Asp Glu Tyr Leu Met Asn Gly Val Lys Ser Ser Gly Met Arg 785 790 795 800 atc ctc atg atc cat gcg ctg atc ctg atg gat acg ccc ctg tcc gac 2448Ile Leu Met Ile His Ala Leu Ile Leu Met Asp Thr Pro Leu Ser Asp 805 810 815 gag atc ctc gag cag ctc gac atc ccg agc agc aag agc cag gcc ctg 2496Glu Ile Leu Glu Gln Leu Asp Ile Pro Ser Ser Lys Ser Gln Ala Leu 820 825 830 ctg tcg ctc atc acg cgg ctc gtc gat gat gtg aag gat ttc gag gac 2544Leu Ser Leu Ile Thr Arg Leu Val Asp Asp Val Lys Asp Phe Glu Asp 835 840 845 gag cag gcg cat ggg gag atg gcc tcg tcg atc gaa tgc tat atg aag 2592Glu Gln Ala His Gly Glu Met Ala Ser Ser Ile Glu Cys Tyr Met Lys 850 855 860 gat aat cac ggc tcc acg cgc gag gac gcc ctg aac tac ctg aaa atc 2640Asp Asn His Gly Ser Thr Arg Glu Asp Ala Leu Asn Tyr Leu Lys Ile 865 870 875 880 cgc atc gag agc tgc gtg cag gag ctc aac aag gaa ctc ctc gaa ccg 2688Arg Ile Glu Ser Cys Val Gln Glu Leu Asn Lys Glu Leu Leu Glu Pro 885 890 895 agc aac atg cat ggc agc ttc cgc aac ctg tac ctc aac gtg ggc atg 2736Ser Asn Met His Gly Ser Phe Arg Asn Leu Tyr Leu Asn Val Gly Met 900 905 910 cgg gtg atc ttc ttc atg ctg aac gac ggg gac ctc ttc acc cat tcg 2784Arg Val Ile Phe Phe Met Leu Asn Asp Gly Asp Leu Phe Thr His Ser 915 920 925 aat cgg aag gag atc cag gat gcg atc acg aag ttc ttc gtg gaa ccg 2832Asn Arg Lys Glu Ile Gln Asp Ala Ile Thr Lys Phe Phe Val Glu 
Pro 930 935 940 atc atc ccg tga taa 2847Ile Ile Pro 945
SEQ ID NO: 23 (length: 947; type: PRT; organism: Artificial; feature: Synthetic Construct)
Met Lys Ile Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp Lys 1 5 10 15 Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys Lys Phe Glu Lys Asp Thr 20 25 30 Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys Phe 35 40 45 Pro Gln Val Ala Ala Thr Gly Asp Gly Pro Asp Ile Ile Phe Trp Ala 50 55 60 His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu Ile 65 70 75 80 Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp Asp 85 90 95 Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala Tyr Pro Ile Ala Val Glu 100 105 110 Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro Lys 115 120 125 Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys Glu Leu Lys Ala Lys Gly 130 135 140 Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp Pro 145 150 155 160 Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly Lys 165 170 175 Tyr Asp Ile Lys Asp Val Gly Val Asp Asn Ala Gly Ala Lys Ala Gly 180 185 190 Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn Ala Asp 195 200 205 Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr Ala 210 215 220 Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile Asp Thr Ser Lys 225 230 235 240 Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro Ser 245 250 255 Lys Pro Phe Val Gly Val Leu Ser Ala Gly Ile Asn Ala Ala Ser Pro 260 265 270 Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr Leu Leu Thr Asp 275 280 285 Glu Gly Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val Ala 290 295 300 Leu Lys Ser Tyr Glu Glu Glu Leu Val Lys Asp Pro Arg Ile Ala Ala 305 310 315 320 Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met Pro Asn Ile Pro Gln 325 330 335 Met Ser Ala Phe Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala Ala 340 345 350 Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys Asp Ala Gln Thr Gly 355 360 365 Asp Asp Asp Asp Lys Ile Met Pro Val Lys Asp Ala Leu Arg Arg Thr 370 375 380 Gly Asn His His Pro Asn Leu Trp Thr Asp Asp Phe Ile Gln Ser Leu 385 390 395 400
Asn Ser Pro Tyr Ser Asp Ser Ser Tyr His Lys His Arg Glu Ile Leu 405 410 415 Ile Asp Glu Ile Arg Asp Met Phe Ser Asn Gly Glu Gly Asp Glu Phe 420 425 430 Gly Val Leu Glu Asn Ile Trp Phe Val Asp Val Val Gln Arg Leu Gly 435 440 445 Ile Asp Arg His Phe Gln Glu Glu Ile Lys Thr Ala Leu Asp Tyr Ile 450 455 460 Tyr Lys Phe Trp Asn His Asp Ser Ile Phe Gly Asp Leu Asn Met Val 465 470 475 480 Ala Leu Gly Phe Arg Ile Leu Arg Leu Asn Arg Tyr Val Ala Ser Ser 485 490 495 Asp Val Phe Lys Lys Phe Lys Gly Glu Glu Gly Gln Phe Ser Gly Phe 500 505 510 Glu Ser Ser Asp Gln Asp Ala Lys Leu Glu Met Met Leu Asn Leu Tyr 515 520 525 Lys Ala Ser Glu Leu Asp Phe Pro Asp Glu Asp Ile Leu Lys Glu Ala 530 535 540 Arg Ala Phe Ala Ser Met Tyr Leu Lys His Val Ile Lys Glu Tyr Gly 545 550 555 560 Asp Ile Gln Glu Ser Lys Asn Pro Leu Leu Met Glu Ile Glu Tyr Thr 565 570 575 Phe Lys Tyr Pro Trp Arg Cys Arg Leu Pro Arg Leu Glu Ala Trp Asn 580 585 590 Phe Ile His Ile Met Arg Gln Gln Asp Cys Asn Ile Ser Leu Ala Asn 595 600 605 Asn Leu Tyr Lys Ile Pro Lys Ile Tyr Met Lys Lys Ile Leu Glu Leu 610 615 620 Ala Ile Leu Asp Phe Asn Ile Leu Gln Ser Gln His Gln His Glu Met 625 630 635 640 Lys Leu Ile Ser Thr Trp Trp Lys Asn Ser Ser Ala Ile Gln Leu Asp 645 650 655 Phe Phe Arg His Arg His Ile Glu Ser Tyr Phe Trp Trp Ala Ser Pro 660 665 670 Leu Phe Glu Pro Glu Phe Ser Thr Cys Arg Ile Asn Cys Thr Lys Leu 675 680 685 Ser Thr Lys Met Phe Leu Leu Asp Asp Ile Tyr Asp Thr Tyr Gly Thr 690 695 700 Val Glu Glu Leu Lys Pro Phe Thr Thr Thr Leu Thr Arg Trp Asp Val 705 710 715 720 Ser Thr Val Asp Asn His Pro Asp Tyr Met Lys Ile Ala Phe Asn Phe 725 730 735 Ser Tyr Glu Ile Tyr Lys Glu Ile Ala Ser Glu Ala Glu Arg Lys His 740 745 750 Gly Pro Phe Val Tyr Lys Tyr Leu Gln Ser Cys Trp Lys Ser Tyr Ile 755 760 765 Glu Ala Tyr Met Gln Glu Ala Glu Trp Ile Ala Ser Asn His Ile Pro 770 775 780 Gly Phe Asp Glu Tyr Leu Met Asn Gly Val Lys Ser Ser Gly Met Arg 785 790 795 800 Ile Leu Met Ile His Ala Leu Ile Leu Met Asp Thr Pro Leu Ser Asp 805 810 815 Glu 
Ile Leu Glu Gln Leu Asp Ile Pro Ser Ser Lys Ser Gln Ala Leu 820 825 830 Leu Ser Leu Ile Thr Arg Leu Val Asp Asp Val Lys Asp Phe Glu Asp 835 840 845 Glu Gln Ala His Gly Glu Met Ala Ser Ser Ile Glu Cys Tyr Met Lys 850 855 860 Asp Asn His Gly Ser Thr Arg Glu Asp Ala Leu Asn Tyr Leu Lys Ile 865 870 875 880 Arg Ile Glu Ser Cys Val Gln Glu Leu Asn Lys Glu Leu Leu Glu Pro 885 890 895 Ser Asn Met His Gly Ser Phe Arg Asn Leu Tyr Leu Asn Val Gly Met 900 905 910 Arg Val Ile Phe Phe Met Leu Asn Asp Gly Asp Leu Phe Thr His Ser 915 920 925 Asn Arg Lys Glu Ile Gln Asp Ala Ile Thr Lys Phe Phe Val Glu Pro 930 935 940 Ile Ile Pro 945
SEQ ID NO: 24 (length: 1800; type: DNA; organism: Artificial; feature: synthetic gene set-ValFpoR)
atg gaa gag gcc tcg gtc acc tcg acc gaa gag acg ctg acg ccc gcg 48Met Glu Glu Ala Ser Val Thr Ser Thr Glu Glu Thr Leu Thr Pro Ala 1 5 10 15 cag gaa gcc gcg cgc acc cgc gcg gcc aac aag gcg cgc aag gaa gcc 96Gln Glu Ala Ala Arg Thr Arg Ala Ala Asn Lys Ala Arg Lys Glu Ala 20 25 30 gag ctc gcc gcg gcc acc gcc gag cag ggt gat gac gac gac aag att 144Glu Leu Ala Ala Ala Thr Ala Glu Gln Gly Asp Asp Asp Asp Lys Ile 35 40 45 aat agc tcg ggc gag acc ttc cgc ccg acc gcc gat ttc cat ccc tcg 192Asn Ser Ser Gly Glu Thr Phe Arg Pro Thr Ala Asp Phe His Pro Ser 50 55 60 ctc tgg cgc aac cat ttc ctg aag ggc gcc tcc gac ttc aag acc gtc 240Leu Trp Arg Asn His Phe Leu Lys Gly Ala Ser Asp Phe Lys Thr Val 65 70 75 80 gat cac acg gcc acc cag gag cgc cac gag gcg ctg aag gaa gag gtg 288Asp His Thr Ala Thr Gln Glu Arg His Glu Ala Leu Lys Glu Glu Val 85 90 95 cgc cgg atg atc acc gac gcc gag gac aag ccg
gtg cag aag ctg cgg 336Arg Arg Met Ile Thr Asp Ala Glu Asp Lys Pro Val Gln Lys Leu Arg 100 105 110 ctg atc gac gag gtg cag cgt ctc ggc gtg gcc tat cac ttc gag aag 384Leu Ile Asp Glu Val Gln Arg Leu Gly Val Ala Tyr His Phe Glu Lys 115 120 125 gag atc gag gat gcg atc cag aag ctc tgc ccg atc tac atc gac agc 432Glu Ile Glu Asp Ala Ile Gln Lys Leu Cys Pro Ile Tyr Ile Asp Ser 130 135 140 aac cgc gcc gat ctg cac acg gtc tcg ctg cat ttc cgg ctg ctg cgc 480Asn Arg Ala Asp Leu His Thr Val Ser Leu His Phe Arg Leu Leu Arg 145 150 155 160 cag cag ggc atc aag atc tcc tgc gac gtc ttc gag aag ttc aag gac 528Gln Gln Gly Ile Lys Ile Ser Cys Asp Val Phe Glu Lys Phe Lys Asp 165 170 175 gac gag ggc cgc ttc aag tcc tcg ctg atc aac gac gtg cag ggg atg 576Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile Asn Asp Val Gln Gly Met 180 185 190 ctg tcg ctc tac gag gcg gcc tac atg gcg gtg cgc ggc gag cat atc 624Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala Val Arg Gly Glu His Ile 195 200 205 ctc gac gag gcg atc gcc ttc acc acc acc cat ctg aaa tcg ctc gtg 672Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr His Leu Lys Ser Leu Val 210 215 220 gcg cag gac cat gtc acg ccg aag ctc gcc gag cag atc aac cat gcg 720Ala Gln Asp His Val Thr Pro Lys Leu Ala Glu Gln Ile Asn His Ala 225 230 235 240 ctc tac cgc ccg ctg cgc aag acg ctg ccg cgg ctc gag gcg cgc tat 768Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro Arg Leu Glu Ala Arg Tyr 245 250 255 ttc atg tcg atg atc aac tcg acc tcg gac cat ctc tac aac aag acg 816Phe Met Ser Met Ile Asn Ser Thr Ser Asp His Leu Tyr Asn Lys Thr 260 265 270 ctg ctg aac ttc gcc aag ctc gac ttc aac atc ctg ctc gag ctg cac 864Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn Ile Leu Leu Glu Leu His 275 280 285 aag gaa gag ctg aac gag ctg acg aaa tgg tgg aag gat ctc gac ttc 912Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp Trp Lys Asp Leu Asp Phe 290 295 300 acc acc aag ctg ccc tat gcg cgc gac cgg ctg gtc gag ctc tat ttc 960Thr Thr Lys Leu Pro Tyr Ala Arg Asp Arg Leu Val Glu Leu Tyr Phe 305 310 315 320 tgg gat ctc ggc acc tat ttc 
gag ccg cag tat gcc ttc ggc cgc aag 1008Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln Tyr Ala Phe Gly Arg Lys 325 330 335 atc atg acc cag ctg aac tac atc ctc tcg atc atc gac gac acc tac 1056Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser Ile Ile Asp Asp Thr Tyr 340 345 350 gac gcc tac ggc acg ctg gaa gag ctg tcg ctc ttc acc gag gcg gtg 1104Asp Ala Tyr Gly Thr Leu Glu Glu Leu Ser Leu Phe Thr Glu Ala Val 355 360 365 cag cgc tgg aac atc gag gcg gtc gac atg ctg ccg gaa tac atg aag 1152Gln Arg Trp Asn Ile Glu Ala Val Asp Met Leu Pro Glu Tyr Met Lys 370 375 380 ctg atc tac cgc acg ctg ctc gat gcc ttc aac gag atc gag gaa gac 1200Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe Asn Glu Ile Glu Glu Asp 385 390 395 400 atg gcg aaa caa ggg cgc agc cac tgc gtg cgc tat gcc aag gaa gag 1248Met Ala Lys Gln Gly Arg Ser His Cys Val Arg Tyr Ala Lys Glu Glu 405 410 415 aac cag aag gtc atc ggc gcc tat tcg gtc cag gcg aaa tgg ttc tcg 1296Asn Gln Lys Val Ile Gly Ala Tyr Ser Val Gln Ala Lys Trp Phe Ser 420 425 430 gaa ggc tat gtc ccc acg atc gag gaa tac atg ccg atc gcg ctg acc 1344Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr Met Pro Ile Ala Leu Thr 435 440 445 tcc tgc gcc tat acc ttc gtc atc acc aac agc ttc ctc ggc atg ggc 1392Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn Ser Phe Leu Gly Met Gly 450 455 460 gac ttc gcc acc aag gaa gtc ttc gaa tgg atc tcg aac aac ccg aag 1440Asp Phe Ala Thr Lys Glu Val Phe Glu Trp Ile Ser Asn Asn Pro Lys 465 470 475 480 gtc gtc aag gcg gcc tcg gtc atc tgc cgg ctg atg gac gac atg cag 1488Val Val Lys Ala Ala Ser Val Ile Cys Arg Leu Met Asp Asp Met Gln 485 490 495 ggc cac gag ttc gag cag aag cgc ggc cat gtc gcc tcg gcc atc gaa 1536Gly His Glu Phe Glu Gln Lys Arg Gly His Val Ala Ser Ala Ile Glu 500 505 510 tgc tac acc aag cag cac ggc gtc tcg aag gaa gag gcg atc aag atg 1584Cys Tyr Thr Lys Gln His Gly Val Ser Lys Glu Glu Ala Ile Lys Met 515 520 525 ttc gaa gag gaa gtg gcc aat gcc tgg aag gac atc aac gag gaa ctg 1632Phe Glu Glu Glu Val Ala Asn Ala Trp Lys Asp Ile Asn Glu Glu Leu 530 535 540 atg 
atg aag ccc acc gtc gtg gcc cgt ccg ctg ctc ggc acg atc ctg 1680Met Met Lys Pro Thr Val Val Ala Arg Pro Leu Leu Gly Thr Ile Leu 545 550 555 560 aac ctc gcc cgc gcc atc gac ttc atc tac aag gaa gac gac ggc tat 1728Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr Lys Glu Asp Asp Gly Tyr 565 570 575 acc cat tcc tat ctg atc aag gac cag atc gcc tcg gtc ctc ggc gac 1776Thr His Ser Tyr Leu Ile Lys Asp Gln Ile Ala Ser Val Leu Gly Asp 580 585 590 cat gtg cct ttc att aat tga taa 1800His Val Pro Phe Ile Asn 595
SEQ ID NO: 25 (length: 598; type: PRT; organism: Artificial; feature: Synthetic Construct)
Met Glu Glu Ala Ser Val Thr Ser Thr Glu Glu Thr Leu Thr Pro Ala 1 5 10 15 Gln Glu Ala Ala Arg Thr Arg Ala Ala Asn Lys Ala Arg Lys Glu Ala 20 25 30 Glu Leu Ala Ala Ala Thr Ala Glu Gln Gly Asp Asp Asp Asp Lys Ile 35 40 45 Asn Ser Ser Gly Glu Thr Phe Arg Pro Thr Ala Asp Phe His Pro Ser 50 55 60 Leu Trp Arg Asn His Phe Leu Lys Gly Ala Ser Asp Phe Lys Thr Val 65 70 75 80 Asp His Thr Ala Thr Gln Glu Arg His Glu Ala Leu Lys Glu Glu Val 85 90 95 Arg Arg Met Ile Thr Asp Ala Glu Asp Lys Pro Val Gln Lys Leu Arg 100 105 110 Leu Ile Asp Glu Val Gln Arg Leu Gly Val Ala Tyr His Phe Glu Lys 115 120 125 Glu Ile Glu Asp Ala Ile Gln Lys Leu Cys Pro Ile Tyr Ile Asp Ser 130 135 140 Asn Arg Ala Asp Leu His Thr Val Ser Leu His Phe Arg Leu Leu Arg 145 150 155 160 Gln Gln Gly Ile Lys Ile Ser Cys Asp Val Phe Glu Lys Phe Lys Asp 165 170 175 Asp Glu Gly Arg Phe Lys Ser Ser Leu Ile Asn Asp Val Gln Gly Met 180 185 190 Leu Ser Leu Tyr Glu Ala Ala Tyr Met Ala Val Arg Gly Glu His Ile 195 200 205 Leu Asp Glu Ala Ile Ala Phe Thr Thr Thr His Leu Lys Ser Leu Val 210 215 220 Ala Gln Asp His Val Thr Pro Lys Leu Ala Glu Gln Ile Asn His Ala 225 230 235 240 Leu Tyr Arg Pro Leu Arg Lys Thr Leu Pro Arg Leu Glu Ala Arg Tyr 245 250 255 Phe Met Ser Met Ile Asn Ser Thr Ser Asp His Leu Tyr Asn Lys Thr 260 265 270 Leu Leu Asn Phe Ala Lys Leu Asp Phe Asn Ile Leu Leu Glu Leu His 275 280 285 Lys Glu Glu Leu Asn Glu Leu Thr Lys Trp Trp Lys Asp Leu Asp Phe 290 295 300 Thr Thr Lys Leu Pro Tyr
Ala Arg Asp Arg Leu Val Glu Leu Tyr Phe 305 310 315 320 Trp Asp Leu Gly Thr Tyr Phe Glu Pro Gln Tyr Ala Phe Gly Arg Lys 325 330 335 Ile Met Thr Gln Leu Asn Tyr Ile Leu Ser Ile Ile Asp Asp Thr Tyr 340 345 350 Asp Ala Tyr Gly Thr Leu Glu Glu Leu Ser Leu Phe Thr Glu Ala Val 355 360 365 Gln Arg Trp Asn Ile Glu Ala Val Asp Met Leu Pro Glu Tyr Met Lys 370 375 380 Leu Ile Tyr Arg Thr Leu Leu Asp Ala Phe Asn Glu Ile Glu Glu Asp 385 390 395 400 Met Ala Lys Gln Gly Arg Ser His Cys Val Arg Tyr Ala Lys Glu Glu 405 410 415 Asn Gln Lys Val Ile Gly Ala Tyr Ser Val Gln Ala Lys Trp Phe Ser 420 425 430 Glu Gly Tyr Val Pro Thr Ile Glu Glu Tyr Met Pro Ile Ala Leu Thr 435 440 445 Ser Cys Ala Tyr Thr Phe Val Ile Thr Asn Ser Phe Leu Gly Met Gly 450 455 460 Asp Phe Ala Thr Lys Glu Val Phe Glu Trp Ile Ser Asn Asn Pro Lys 465 470 475 480 Val Val Lys Ala Ala Ser Val Ile Cys Arg Leu Met Asp Asp Met Gln 485 490 495 Gly His Glu Phe Glu Gln Lys Arg Gly His Val Ala Ser Ala Ile Glu 500 505 510 Cys Tyr Thr Lys Gln His Gly Val Ser Lys Glu Glu Ala Ile Lys Met 515 520 525 Phe Glu Glu Glu Val Ala Asn Ala Trp Lys Asp Ile Asn Glu Glu Leu 530 535 540 Met Met Lys Pro Thr Val Val Ala Arg Pro Leu Leu Gly Thr Ile Leu 545 550 555 560 Asn Leu Ala Arg Ala Ile Asp Phe Ile Tyr Lys Glu Asp Asp Gly Tyr 565 570 575 Thr His Ser Tyr Leu Ile Lys Asp Gln Ile Ala Ser Val Leu Gly Asp 580 585 590 His Val Pro Phe Ile Asn 595
SEQ ID NO: 26 (length: 1644; type: DNA; organism: Artificial; feature: synthetic gene aaaS)
atg gcc ctg acc gag gaa aag ccg atc cgc ccc atc gcg aac ttc ccg 48Met Ala Leu Thr Glu Glu Lys Pro Ile Arg Pro Ile Ala Asn Phe Pro 1 5 10 15 ccc agc atc tgg ggc gat cag ttc ctg atc tac gag aag cag gtg gag 96Pro Ser Ile Trp Gly Asp Gln Phe Leu Ile Tyr Glu Lys Gln Val Glu 20 25 30 cag ggc gtc gag cag atc gtg aac gat ctc aag aag gag gtg cgg cag 144Gln Gly Val Glu Gln Ile Val Asn Asp Leu Lys Lys Glu Val Arg Gln 35 40 45 ctg ctg aag gag gcc ctc gat atc ccc atg aag cac gcc aac ctc ctg 192Leu Leu Lys Glu Ala Leu Asp Ile Pro Met Lys His Ala Asn Leu Leu 50 55 60 aag ctg atc
gat gaa atc cag cgc ctc ggc atc ccg tat cac ttc gaa 240Lys Leu Ile Asp Glu Ile Gln Arg Leu Gly Ile Pro Tyr His Phe Glu 65 70 75 80 cgc gag atc gac cac gcg ctc cag tgc atc tat gag acc tac ggc gac 288Arg Glu Ile Asp His Ala Leu Gln Cys Ile Tyr Glu Thr Tyr Gly Asp 85 90 95 aac tgg aac ggc gac cgc tcg tcc ctc tgg ttc cgc ctg atg cgc aag 336Asn Trp Asn Gly Asp Arg Ser Ser Leu Trp Phe Arg Leu Met Arg Lys 100 105 110 cag ggc tat tac gtg acc tgc gat gtc ttc aac aac tat aag gac aag 384Gln Gly Tyr Tyr Val Thr Cys Asp Val Phe Asn Asn Tyr Lys Asp Lys 115 120 125 aac ggg gcg ttc aaa cag tcg ctc gcg aac gac gtg gag ggc ctg ctg 432Asn Gly Ala Phe Lys Gln Ser Leu Ala Asn Asp Val Glu Gly Leu Leu 130 135 140 gag ctg tat gag gcg acg agc atg cgc gtc ccc ggc gag atc atc ctg 480Glu Leu Tyr Glu Ala Thr Ser Met Arg Val Pro Gly Glu Ile Ile Leu 145 150 155 160 gag gac gcg ctc ggc ttc acg cgc tcg cgc ctc tcc atc atg acg aag 528Glu Asp Ala Leu Gly Phe Thr Arg Ser Arg Leu Ser Ile Met Thr Lys 165 170 175 gac gcc ttc tcg acg aac ccg gcg ctg ttc acc gag atc cag cgg gcg 576Asp Ala Phe Ser Thr Asn Pro Ala Leu Phe Thr Glu Ile Gln Arg Ala 180 185 190 ctc aag cag ccg ctg tgg aag cgc ctg ccc cgc atc gag gcg gcg cag 624Leu Lys Gln Pro Leu Trp Lys Arg Leu Pro Arg Ile Glu Ala Ala Gln 195 200 205 tac atc ccc ttc tat cag cag cag gat agc cat aac aag acg ctc ctc 672Tyr Ile Pro Phe Tyr Gln Gln Gln Asp Ser His Asn Lys Thr Leu Leu 210 215 220 aag ctc gcg aag ctc gag ttc aac ctg ctg cag tcg ctc cat aag gag 720Lys Leu Ala Lys Leu Glu Phe Asn Leu Leu Gln Ser Leu His Lys Glu 225 230 235 240 gag ctg tcg cat gtg tgc aag tgg tgg aag gcg ttc gat atc aaa aag 768Glu Leu Ser His Val Cys Lys Trp Trp Lys Ala Phe Asp Ile Lys Lys 245 250 255 aac gcc ccc tgc ctc cgg gac cgc atc gtc gag tgc tat ttc tgg ggc 816Asn Ala Pro Cys Leu Arg Asp Arg Ile Val Glu Cys Tyr Phe Trp Gly 260 265 270 ctg ggc tcg ggc tat gag ccg cag tac tcc cgc gcc cgg gtc ttc ttc 864Leu Gly Ser Gly Tyr Glu Pro Gln Tyr Ser Arg Ala Arg Val Phe Phe 275 280 285 acc 
aag gcg gtg gcg gtg atc acg ctc atc gac gat acg tac gac gcc 912Thr Lys Ala Val Ala Val Ile Thr Leu Ile Asp Asp Thr Tyr Asp Ala 290 295 300 tac ggc acg tac gag gaa ctg aaa atc ttc acc gag gcc gtg gaa cgc 960Tyr Gly Thr Tyr Glu Glu Leu Lys Ile Phe Thr Glu Ala Val Glu Arg 305 310 315 320 tgg tcg atc acc tgc ctc gat acg ctc ccg gag tat atg aag ccc atc 1008Trp Ser Ile Thr Cys Leu Asp Thr Leu Pro Glu Tyr Met Lys Pro Ile 325 330 335 tat aag ctc ttc atg gat acc tat acc gag atg gag gag ttc ctc gcg 1056Tyr Lys Leu Phe Met Asp Thr Tyr Thr Glu Met Glu Glu Phe Leu Ala 340 345 350 aag gag ggg cgc acg gac ctg ttc aac tgc ggc aag gag ttc gtc aag 1104Lys Glu Gly Arg Thr Asp Leu Phe Asn Cys Gly Lys Glu Phe Val Lys 355 360 365 gag ttc gtg cgc aac ctg atg gtg gag gcg aag tgg gcc aac gag ggg 1152Glu Phe Val Arg Asn Leu Met Val Glu Ala Lys Trp Ala Asn Glu Gly 370 375 380 cat atc ccc acg acg gag gag cat gac ccc gtg gtg atc atc acc ggc 1200His Ile Pro Thr Thr Glu Glu His Asp Pro Val Val Ile Ile Thr Gly 385 390 395 400 ggc gcc aac ctg ctc acc acc acc tgc tac ctg ggc atg tcc gac atc 1248Gly Ala Asn Leu Leu Thr Thr Thr Cys Tyr Leu Gly Met Ser Asp Ile 405 410 415 ttc acg aag gag agc gtg gag tgg gcg gtg tcc gcc ccc ccg ctc ttc 1296Phe Thr Lys Glu Ser Val Glu Trp Ala Val Ser Ala Pro Pro Leu Phe 420 425 430 cgc tat tcg ggc atc ctg ggc cgg cgg ctc aac gac ctc atg acc cac 1344Arg Tyr Ser Gly Ile Leu Gly Arg Arg Leu Asn Asp Leu Met Thr His 435 440 445 aaa gcg gag cag gag cgg aag cac tcc tcg agc agc ctg gaa agc tat 1392Lys Ala Glu Gln Glu Arg Lys His Ser Ser Ser Ser Leu Glu Ser Tyr 450 455 460 atg aag gaa tat aac gtg aac gag gag tac gcc cag acg ctg atc tac 1440Met Lys Glu Tyr Asn Val Asn Glu Glu Tyr Ala Gln Thr Leu Ile Tyr 465 470 475 480

aag gag gtc gag gat gtg tgg aag gac atc aac cgg gag tat ctc acg 1488Lys Glu Val Glu Asp Val Trp Lys Asp Ile Asn Arg Glu Tyr Leu Thr 485 490 495 acg aag aac atc ccc cgc ccg ctc ctc atg gcg gtc atc tac ctc tgc 1536Thr Lys Asn Ile Pro Arg Pro Leu Leu Met Ala Val Ile Tyr Leu Cys 500 505 510 cag ttc ctg gag gtc cag tat gcg ggc aag gat aat ttc acg cgc atg 1584Gln Phe Leu Glu Val Gln Tyr Ala Gly Lys Asp Asn Phe Thr Arg Met 515 520 525 ggc gat gag tat aag cac ctg atc aag tcg ctg ctc gtg tac ccc atg 1632Gly Asp Glu Tyr Lys His Leu Ile Lys Ser Leu Leu Val Tyr Pro Met 530 535 540 tcg atc tga taa 1644Ser Ile 545 27546PRTArtificialSynthetic Construct 27Met Ala Leu Thr Glu Glu Lys Pro Ile Arg Pro Ile Ala Asn Phe Pro 1 5 10 15 Pro Ser Ile Trp Gly Asp Gln Phe Leu Ile Tyr Glu Lys Gln Val Glu 20 25 30 Gln Gly Val Glu Gln Ile Val Asn Asp Leu Lys Lys Glu Val Arg Gln 35 40 45 Leu Leu Lys Glu Ala Leu Asp Ile Pro Met Lys His Ala Asn Leu Leu 50 55 60 Lys Leu Ile Asp Glu Ile Gln Arg Leu Gly Ile Pro Tyr His Phe Glu 65 70 75 80 Arg Glu Ile Asp His Ala Leu Gln Cys Ile Tyr Glu Thr Tyr Gly Asp 85 90 95 Asn Trp Asn Gly Asp Arg Ser Ser Leu Trp Phe Arg Leu Met Arg Lys 100 105 110 Gln Gly Tyr Tyr Val Thr Cys Asp Val Phe Asn Asn Tyr Lys Asp Lys 115 120 125 Asn Gly Ala Phe Lys Gln Ser Leu Ala Asn Asp Val Glu Gly Leu Leu 130 135 140 Glu Leu Tyr Glu Ala Thr Ser Met Arg Val Pro Gly Glu Ile Ile Leu 145 150 155 160 Glu Asp Ala Leu Gly Phe Thr Arg Ser Arg Leu Ser Ile Met Thr Lys 165 170 175 Asp Ala Phe Ser Thr Asn Pro Ala Leu Phe Thr Glu Ile Gln Arg Ala 180 185 190 Leu Lys Gln Pro Leu Trp Lys Arg Leu Pro Arg Ile Glu Ala Ala Gln 195 200 205 Tyr Ile Pro Phe Tyr Gln Gln Gln Asp Ser His Asn Lys Thr Leu Leu 210 215 220 Lys Leu Ala Lys Leu Glu Phe Asn Leu Leu Gln Ser Leu His Lys Glu 225 230 235 240 Glu Leu Ser His Val Cys Lys Trp Trp Lys Ala Phe Asp Ile Lys Lys 245 250 255 Asn Ala Pro Cys Leu Arg Asp Arg Ile Val Glu Cys Tyr Phe Trp Gly 260 265 270 Leu Gly Ser Gly Tyr Glu Pro Gln Tyr Ser Arg Ala Arg Val Phe Phe 
275 280 285 Thr Lys Ala Val Ala Val Ile Thr Leu Ile Asp Asp Thr Tyr Asp Ala 290 295 300 Tyr Gly Thr Tyr Glu Glu Leu Lys Ile Phe Thr Glu Ala Val Glu Arg 305 310 315 320 Trp Ser Ile Thr Cys Leu Asp Thr Leu Pro Glu Tyr Met Lys Pro Ile 325 330 335 Tyr Lys Leu Phe Met Asp Thr Tyr Thr Glu Met Glu Glu Phe Leu Ala 340 345 350 Lys Glu Gly Arg Thr Asp Leu Phe Asn Cys Gly Lys Glu Phe Val Lys 355 360 365 Glu Phe Val Arg Asn Leu Met Val Glu Ala Lys Trp Ala Asn Glu Gly 370 375 380 His Ile Pro Thr Thr Glu Glu His Asp Pro Val Val Ile Ile Thr Gly 385 390 395 400 Gly Ala Asn Leu Leu Thr Thr Thr Cys Tyr Leu Gly Met Ser Asp Ile 405 410 415 Phe Thr Lys Glu Ser Val Glu Trp Ala Val Ser Ala Pro Pro Leu Phe 420 425 430 Arg Tyr Ser Gly Ile Leu Gly Arg Arg Leu Asn Asp Leu Met Thr His 435 440 445 Lys Ala Glu Gln Glu Arg Lys His Ser Ser Ser Ser Leu Glu Ser Tyr 450 455 460 Met Lys Glu Tyr Asn Val Asn Glu Glu Tyr Ala Gln Thr Leu Ile Tyr 465 470 475 480 Lys Glu Val Glu Asp Val Trp Lys Asp Ile Asn Arg Glu Tyr Leu Thr 485 490 495 Thr Lys Asn Ile Pro Arg Pro Leu Leu Met Ala Val Ile Tyr Leu Cys 500 505 510 Gln Phe Leu Glu Val Gln Tyr Ala Gly Lys Asp Asn Phe Thr Arg Met 515 520 525 Gly Asp Glu Tyr Lys His Leu Ile Lys Ser Leu Leu Val Tyr Pro Met 530 535 540 Ser Ile 545 282766DNAArtificialsynthetic gene mbp-aaaS 28atg aag atc gag gaa ggc aag ctc gtc atc tgg atc aac ggc gac aag 48Met Lys Ile Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp Lys 1 5 10 15 ggc tac aac ggc ctc gcc gag gtg ggc aag aag ttc gag aag gac acg 96Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys Lys Phe Glu Lys Asp Thr 20 25 30 ggc atc aag gtc acc gtc gag cat ccc gac aag ctc gag gag aag ttc 144Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys Phe 35 40 45 ccg cag gtc gcc gcc acc ggc gac ggc ccc gac atc atc ttc tgg gcc 192Pro Gln Val Ala Ala Thr Gly Asp Gly Pro Asp Ile Ile Phe Trp Ala 50 55 60 cac gac cgc ttc ggc ggc tat gcg cag tcg ggc ctg ctc gcc gag atc 240His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu Ile 65 70 75 80 
acg ccc gac aag gcc ttc cag gac aag ctc tat ccc ttc acc tgg gat 288Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp Asp 85 90 95 gcg gtg cgc tac aac ggc aag ctg atc gcc tat ccg atc gcc gtc gag 336Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala Tyr Pro Ile Ala Val Glu 100 105 110 gcg ctg tcg ctg atc tac aac aag gat ctg ctg ccg aac ccg ccg aag 384Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro Lys 115 120 125 acc tgg gaa gag atc ccg gcg ctc gac aag gaa ctg aag gcc aag ggc 432Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys Glu Leu Lys Ala Lys Gly 130 135 140 aag tcc gcg ctg atg ttc aac ctg cag gag ccc tat ttc acc tgg ccg 480Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp Pro 145 150 155 160 ctg atc gcc gcc gac ggc ggc tat gcc ttc aaa tac gag aac ggc aaa 528Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly Lys 165 170 175 tac gac atc aag gac gtg ggc gtc gac aat gcg ggc gcc aag gcc ggg 576Tyr Asp Ile Lys Asp Val Gly Val Asp Asn Ala Gly Ala Lys Ala Gly 180 185 190 ctg acc ttc ctc gtc gat ctg atc aag aac aag cac atg aat gcc gac 624Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn Ala Asp 195 200 205 acc gac tat tcc atc gcc gag gcg gcc ttc aac aag ggc gag acc gcc 672Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr Ala 210 215 220 atg acg atc aac ggg ccg tgg gcc tgg tcg aac atc gac acc tcg aag 720Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile Asp Thr Ser Lys 225 230 235 240 gtc aat tac ggc gtc acg gtg ctg ccg acc ttc aag ggc cag ccc tcg 768Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro Ser 245 250 255 aaa ccc ttc gtc ggc gtg ctg tcg gcg ggc atc aac gcg gcc tcg ccg 816Lys Pro Phe Val Gly Val Leu Ser Ala Gly Ile Asn Ala Ala Ser Pro 260 265 270 aac aag gaa ctc gcc aag gag ttc ctc gag aac tac ctg ctg acc gac 864Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr Leu Leu Thr Asp 275 280 285 gag ggg ctc gag gcg gtg aac aag gac aag ccg ctc ggc gcg gtg gcg 912Glu Gly Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val Ala 290 
295 300 ctg aaa tcc tac gag gaa gag ctc gtc aag gac ccg cgg atc gcc gcc 960Leu Lys Ser Tyr Glu Glu Glu Leu Val Lys Asp Pro Arg Ile Ala Ala 305 310 315 320 acg atg gag aat gcg cag aag ggc gag atc atg ccg aac atc ccg cag 1008Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met Pro Asn Ile Pro Gln 325 330 335 atg tcg gcc ttc tgg tat gcc gtc cgc acc gcg gtg atc aac gcg gcc 1056Met Ser Ala Phe Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala Ala 340 345 350 tcg ggc cgt cag acc gtc gac gag gcg ctg aag gat gcg cag act ggt 1104Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys Asp Ala Gln Thr Gly 355 360 365 gat gac gac gac aag att atg gcc ctg acc gag gaa aag ccg atc cgc 1152Asp Asp Asp Asp Lys Ile Met Ala Leu Thr Glu Glu Lys Pro Ile Arg 370 375 380 ccc atc gcg aac ttc ccg ccc agc atc tgg ggc gat cag ttc ctg atc 1200Pro Ile Ala Asn Phe Pro Pro Ser Ile Trp Gly Asp Gln Phe Leu Ile 385 390 395 400 tac gag aag cag gtg gag cag ggc gtc gag cag atc gtg aac gat ctc 1248Tyr Glu Lys Gln Val Glu Gln Gly Val Glu Gln Ile Val Asn Asp Leu 405 410 415 aag aag gag gtg cgg cag ctg ctg aag gag gcc ctc gat atc ccc atg 1296Lys Lys Glu Val Arg Gln Leu Leu Lys Glu Ala Leu Asp Ile Pro Met 420 425 430 aag cac gcc aac ctc ctg aag ctg atc gat gaa atc cag cgc ctc ggc 1344Lys His Ala Asn Leu Leu Lys Leu Ile Asp Glu Ile Gln Arg Leu Gly 435 440 445 atc ccg tat cac ttc gaa cgc gag atc gac cac gcg ctc cag tgc atc 1392Ile Pro Tyr His Phe Glu Arg Glu Ile Asp His Ala Leu Gln Cys Ile 450 455 460 tat gag acc tac ggc gac aac tgg aac ggc gac cgc tcg tcc ctc tgg 1440Tyr Glu Thr Tyr Gly Asp Asn Trp Asn Gly Asp Arg Ser Ser Leu Trp 465 470 475 480 ttc cgc ctg atg cgc aag cag ggc tat tac gtg acc tgc gat gtc ttc 1488Phe Arg Leu Met Arg Lys Gln Gly Tyr Tyr Val Thr Cys Asp Val Phe 485 490 495 aac aac tat aag gac aag aac ggg gcg ttc aaa cag tcg ctc gcg aac 1536Asn Asn Tyr Lys Asp Lys Asn Gly Ala Phe Lys Gln Ser Leu Ala Asn 500 505 510 gac gtg gag ggc ctg ctg gag ctg tat gag gcg acg agc atg cgc gtc 1584Asp Val Glu Gly Leu Leu Glu Leu Tyr Glu 
Ala Thr Ser Met Arg Val 515 520 525 ccc ggc gag atc atc ctg gag gac gcg ctc ggc ttc acg cgc tcg cgc 1632Pro Gly Glu Ile Ile Leu Glu Asp Ala Leu Gly Phe Thr Arg Ser Arg 530 535 540 ctc tcc atc atg acg aag gac gcc ttc tcg acg aac ccg gcg ctg ttc 1680Leu Ser Ile Met Thr Lys Asp Ala Phe Ser Thr Asn Pro Ala Leu Phe 545 550 555 560 acc gag atc cag cgg gcg ctc aag cag ccg ctg tgg aag cgc ctg ccc 1728Thr Glu Ile Gln Arg Ala Leu Lys Gln Pro Leu Trp Lys Arg Leu Pro 565 570 575 cgc atc gag gcg gcg cag tac atc ccc ttc tat cag cag cag gat agc 1776Arg Ile Glu Ala Ala Gln Tyr Ile Pro Phe Tyr Gln Gln Gln Asp Ser 580 585 590 cat aac aag acg ctc ctc aag ctc gcg aag ctc gag ttc aac ctg ctg 1824His Asn Lys Thr Leu Leu Lys Leu Ala Lys Leu Glu Phe Asn Leu Leu 595 600 605 cag tcg ctc cat aag gag gag ctg tcg cat gtg tgc aag tgg tgg aag 1872Gln Ser Leu His Lys Glu Glu Leu Ser His Val Cys Lys Trp Trp Lys 610 615 620 gcg ttc gat atc aaa aag aac gcc ccc tgc ctc cgg gac cgc atc gtc 1920Ala Phe Asp Ile Lys Lys Asn Ala Pro Cys Leu Arg Asp Arg Ile Val 625 630 635 640 gag tgc tat ttc tgg ggc ctg ggc tcg ggc tat gag ccg cag tac tcc 1968Glu Cys Tyr Phe Trp Gly Leu Gly Ser Gly Tyr Glu Pro Gln Tyr Ser 645 650 655 cgc gcc cgg gtc ttc ttc acc aag gcg gtg gcg gtg atc acg ctc atc 2016Arg Ala Arg Val Phe Phe Thr Lys Ala Val Ala Val Ile Thr Leu Ile 660 665 670 gac gat acg tac gac gcc tac ggc acg tac gag gaa ctg aaa atc ttc 2064Asp Asp Thr Tyr Asp Ala Tyr Gly Thr Tyr Glu Glu Leu Lys Ile Phe 675 680 685 acc gag gcc gtg gaa cgc tgg tcg atc acc tgc ctc gat acg ctc ccg 2112Thr Glu Ala Val Glu Arg Trp Ser Ile Thr Cys Leu Asp Thr Leu Pro 690 695 700 gag tat atg aag ccc atc tat aag ctc ttc atg gat acc tat acc gag 2160Glu Tyr Met Lys Pro Ile Tyr Lys Leu Phe Met Asp Thr Tyr Thr Glu 705 710 715 720 atg gag gag ttc ctc gcg aag gag ggg cgc acg gac ctg ttc aac tgc 2208Met Glu Glu Phe Leu Ala Lys Glu Gly Arg Thr Asp Leu Phe Asn Cys 725 730 735 ggc aag gag ttc gtc aag gag ttc gtg cgc aac ctg atg gtg gag gcg 2256Gly Lys Glu 
Phe Val Lys Glu Phe Val Arg Asn Leu Met Val Glu Ala 740 745 750 aag tgg gcc aac gag ggg cat atc ccc acg acg gag gag cat gac ccc 2304Lys Trp Ala Asn Glu Gly His Ile Pro Thr Thr Glu Glu His Asp Pro 755 760 765 gtg gtg atc atc acc ggc ggc gcc aac ctg ctc acc acc acc tgc tac 2352Val Val Ile Ile Thr Gly Gly Ala Asn Leu Leu Thr Thr Thr Cys Tyr 770 775 780 ctg ggc atg tcc gac atc ttc acg aag gag agc gtg gag tgg gcg gtg 2400Leu Gly Met Ser Asp Ile Phe Thr Lys Glu Ser Val Glu Trp Ala Val 785 790 795 800 tcc gcc ccc ccg ctc ttc cgc tat tcg ggc atc ctg ggc cgg cgg ctc 2448Ser Ala Pro Pro Leu Phe Arg Tyr Ser Gly Ile Leu Gly Arg Arg Leu 805 810 815 aac gac ctc atg acc cac aaa gcg gag cag gag cgg aag cac tcc tcg 2496Asn Asp Leu Met Thr His Lys Ala Glu Gln Glu Arg Lys His Ser Ser 820 825 830 agc agc ctg gaa agc tat atg aag gaa tat aac gtg aac gag gag tac 2544Ser Ser Leu Glu Ser Tyr Met Lys Glu Tyr Asn Val Asn Glu Glu Tyr 835 840 845 gcc cag acg ctg atc tac aag gag gtc gag gat gtg tgg aag gac atc 2592Ala Gln Thr Leu Ile Tyr Lys Glu Val Glu Asp Val Trp Lys Asp Ile 850 855 860 aac cgg gag tat ctc acg acg aag aac atc ccc cgc ccg ctc ctc atg 2640Asn Arg Glu Tyr Leu Thr Thr Lys Asn Ile Pro Arg Pro Leu Leu Met 865 870 875 880 gcg gtc atc tac ctc tgc cag ttc ctg gag gtc cag tat gcg ggc aag 2688Ala Val Ile Tyr Leu Cys Gln Phe Leu Glu Val Gln Tyr Ala Gly Lys 885 890 895 gat aat ttc acg cgc atg ggc gat gag tat aag cac ctg atc aag tcg 2736Asp Asn Phe Thr Arg Met Gly Asp Glu Tyr Lys His Leu Ile Lys Ser 900 905 910 ctg ctc gtg tac ccc atg tcg atc tga taa 2766Leu Leu Val Tyr Pro Met Ser Ile 915 920 29920PRTArtificialSynthetic Construct 29Met Lys Ile Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp Lys 1 5 10

15 Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys Lys Phe Glu Lys Asp Thr 20 25 30 Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys Phe 35 40 45 Pro Gln Val Ala Ala Thr Gly Asp Gly Pro Asp Ile Ile Phe Trp Ala 50 55 60 His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu Ile 65 70 75 80 Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp Asp 85 90 95 Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala Tyr Pro Ile Ala Val Glu 100 105 110 Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro Lys 115 120 125 Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys Glu Leu Lys Ala Lys Gly 130 135 140 Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp Pro 145 150 155 160 Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly Lys 165 170 175 Tyr Asp Ile Lys Asp Val Gly Val Asp Asn Ala Gly Ala Lys Ala Gly 180 185 190 Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn Ala Asp 195 200 205 Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr Ala 210 215 220 Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile Asp Thr Ser Lys 225 230 235 240 Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro Ser 245 250 255 Lys Pro Phe Val Gly Val Leu Ser Ala Gly Ile Asn Ala Ala Ser Pro 260 265 270 Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr Leu Leu Thr Asp 275 280 285 Glu Gly Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val Ala 290 295 300 Leu Lys Ser Tyr Glu Glu Glu Leu Val Lys Asp Pro Arg Ile Ala Ala 305 310 315 320 Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met Pro Asn Ile Pro Gln 325 330 335 Met Ser Ala Phe Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala Ala 340 345 350 Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys Asp Ala Gln Thr Gly 355 360 365 Asp Asp Asp Asp Lys Ile Met Ala Leu Thr Glu Glu Lys Pro Ile Arg 370 375 380 Pro Ile Ala Asn Phe Pro Pro Ser Ile Trp Gly Asp Gln Phe Leu Ile 385 390 395 400 Tyr Glu Lys Gln Val Glu Gln Gly Val Glu Gln Ile Val Asn Asp Leu 405 410 415 Lys Lys Glu Val Arg Gln Leu Leu Lys Glu Ala Leu Asp Ile Pro Met 420 425 430 Lys His Ala Asn 
Leu Leu Lys Leu Ile Asp Glu Ile Gln Arg Leu Gly 435 440 445 Ile Pro Tyr His Phe Glu Arg Glu Ile Asp His Ala Leu Gln Cys Ile 450 455 460 Tyr Glu Thr Tyr Gly Asp Asn Trp Asn Gly Asp Arg Ser Ser Leu Trp 465 470 475 480 Phe Arg Leu Met Arg Lys Gln Gly Tyr Tyr Val Thr Cys Asp Val Phe 485 490 495 Asn Asn Tyr Lys Asp Lys Asn Gly Ala Phe Lys Gln Ser Leu Ala Asn 500 505 510 Asp Val Glu Gly Leu Leu Glu Leu Tyr Glu Ala Thr Ser Met Arg Val 515 520 525 Pro Gly Glu Ile Ile Leu Glu Asp Ala Leu Gly Phe Thr Arg Ser Arg 530 535 540 Leu Ser Ile Met Thr Lys Asp Ala Phe Ser Thr Asn Pro Ala Leu Phe 545 550 555 560 Thr Glu Ile Gln Arg Ala Leu Lys Gln Pro Leu Trp Lys Arg Leu Pro 565 570 575 Arg Ile Glu Ala Ala Gln Tyr Ile Pro Phe Tyr Gln Gln Gln Asp Ser 580 585 590 His Asn Lys Thr Leu Leu Lys Leu Ala Lys Leu Glu Phe Asn Leu Leu 595 600 605 Gln Ser Leu His Lys Glu Glu Leu Ser His Val Cys Lys Trp Trp Lys 610 615 620 Ala Phe Asp Ile Lys Lys Asn Ala Pro Cys Leu Arg Asp Arg Ile Val 625 630 635 640 Glu Cys Tyr Phe Trp Gly Leu Gly Ser Gly Tyr Glu Pro Gln Tyr Ser 645 650 655 Arg Ala Arg Val Phe Phe Thr Lys Ala Val Ala Val Ile Thr Leu Ile 660 665 670 Asp Asp Thr Tyr Asp Ala Tyr Gly Thr Tyr Glu Glu Leu Lys Ile Phe 675 680 685 Thr Glu Ala Val Glu Arg Trp Ser Ile Thr Cys Leu Asp Thr Leu Pro 690 695 700 Glu Tyr Met Lys Pro Ile Tyr Lys Leu Phe Met Asp Thr Tyr Thr Glu 705 710 715 720 Met Glu Glu Phe Leu Ala Lys Glu Gly Arg Thr Asp Leu Phe Asn Cys 725 730 735 Gly Lys Glu Phe Val Lys Glu Phe Val Arg Asn Leu Met Val Glu Ala 740 745 750 Lys Trp Ala Asn Glu Gly His Ile Pro Thr Thr Glu Glu His Asp Pro 755 760 765 Val Val Ile Ile Thr Gly Gly Ala Asn Leu Leu Thr Thr Thr Cys Tyr 770 775 780 Leu Gly Met Ser Asp Ile Phe Thr Lys Glu Ser Val Glu Trp Ala Val 785 790 795 800 Ser Ala Pro Pro Leu Phe Arg Tyr Ser Gly Ile Leu Gly Arg Arg Leu 805 810 815 Asn Asp Leu Met Thr His Lys Ala Glu Gln Glu Arg Lys His Ser Ser 820 825 830 Ser Ser Leu Glu Ser Tyr Met Lys Glu Tyr Asn Val Asn Glu Glu Tyr 835 840 845 Ala Gln Thr Leu Ile 
Tyr Lys Glu Val Glu Asp Val Trp Lys Asp Ile 850 855 860 Asn Arg Glu Tyr Leu Thr Thr Lys Asn Ile Pro Arg Pro Leu Leu Met 865 870 875 880 Ala Val Ile Tyr Leu Cys Gln Phe Leu Glu Val Gln Tyr Ala Gly Lys 885 890 895 Asp Asn Phe Thr Arg Met Gly Asp Glu Tyr Lys His Leu Ile Lys Ser 900 905 910 Leu Leu Val Tyr Pro Met Ser Ile 915 920

30 41 DNA Artificial primer
30 tatatggatc catggctgaa atgtttaatg gaaattccag c 41

31 20 DNA Artificial primer
31 gattatgcgg ccgtgtacaa 20

32 20 DNA Artificial primer
32 ttgtaaaacg acggccagtg 20

33 23 DNA Artificial primer
33 gtgacactat agaatactca agc 23

34 40 PRT Artificial SET tag
34 Glu Glu Ala Ser Val Thr Ser Thr Glu Glu Thr Leu Thr Pro Ala Gln 1 5 10 15 Glu Ala Ala Arg Thr Arg Ala Ala Asn Lys Ala Arg Lys Glu Ala Glu 20 25 30 Leu Ala Ala Ala Thr Ala Glu Gln 35 40

35 41 PRT Artificial SET tag (long)
35 Met Glu Glu Ala Ser Val Thr Ser Thr Glu Glu Thr Leu Thr Pro Ala 1 5 10 15 Gln Glu Ala Ala Arg Thr Arg Ala Ala Asn Lys Ala Arg Lys Glu Ala 20 25 30 Glu Leu Ala Ala Ala Thr Ala Glu Gln 35 40


