Patent application title: PRODUCTION OF MANOOL
Inventors:
Michel Schalk (Satigny, CH)
Laurent Daviet (Satigny, CH)
Letizia Rocci (Satigny, CH)
Daniel Solis Escalante (Satigny, CH)
IPC8 Class: AC12P702FI
Publication date: 2021-01-14
Patent application number: 20210010035
Abstract:
Described herein are methods of producing (+)-manool, the methods
including: contacting geranylgeranyl diphosphate with a copalyl
diphosphate (CPP) synthase to form a (9S, 10S)-copalyl diphosphate and
contacting the CPP with a sclareol synthase enzyme to form (+)-manool and
derivatives thereof. Also described herein are nucleic acids encoding CPP
synthases and sclareol synthases for use in the methods. Further
described herein are expression vectors and non-human host organisms and
cells including nucleic acids encoding a CPP synthase and a sclareol
synthase as described herein.
Claims:
1. A method of producing (+)-manool, the method comprising: a) contacting
geranylgeranyl diphosphate (GGPP) with a copalyl diphosphate (CPP)
synthase to form a copalyl diphosphate, wherein the CPP synthase
comprises a) an amino acid sequence having at least 50%, 55%, 60%, 65%,
70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ
ID NO: 14 or SEQ ID NO: 15; or b) an amino acid sequence having at least
71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to
SEQ ID NO: 17 or SEQ ID NO: 18; or c) an amino acid sequence having at
least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 20 or SEQ ID NO: 21; and b) contacting the CPP
with a sclareol synthase to form the (+)-manool; and c) optionally
isolating the (+)-manool.
2. The method of claim 1, wherein the CPP synthase comprises a) a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or b) a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or c) a polypeptide comprising an amino acid sequence having at least 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21.
3. The method of claim 1, wherein step a) further comprises culturing a non-human host organism or cell capable of producing GGPP and transformed to express at least one polypeptide comprising a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and having a CPP synthase activity, under conditions conducive to a production of CPP.
4. The method of claim 1, wherein the method further comprises, prior to step a), transforming a non-human host organism or cell capable of producing GGPP with a) at least one nucleic acid encoding a polypeptide comprising a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 17 and SEQ ID NO: 18; or c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and having a CPP synthase activity, so that said organism or cell expresses said polypeptide having a CPP synthase activity; and b) at least one nucleic acid encoding a polypeptide having a sclareol synthase activity, so that said organism or cell expresses said polypeptide having a sclareol synthase activity.
5. The method as recited in claim 4, wherein the polypeptide having sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25.
6. The method as recited in claim 1, further comprising processing the (+)-manool to a (+)-manool derivative using a chemical or biochemical synthesis or a combination of both.
7. The method as recited in claim 6, wherein the derivative is an alcohol, acetal, aldehyde, acid, ether, ketone, lactone, acetate or an ester.
8. The method as recited in claim 6, wherein the derivative is selected from the group consisting of copalol, copalal, manooloxy, Z-11, gamma-ambrol and ambrox.
9. A method for transforming a host cell or non-human organism, the method comprising transforming a host cell or non-human organism with a nucleic acid encoding a polypeptide having a copalyl diphosphate synthase activity and a nucleic acid encoding a polypeptide having a sclareol synthase activity, wherein the polypeptide having copalyl diphosphate activity comprises a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 and SEQ ID NO: 21; and wherein the polypeptide having sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25.
10. The method as recited in claim 4, wherein the host cell or non-human organism is a plant, a prokaryote, or a fungus.
11. The method as recited in claim 4, wherein the non-human host organism or cell is E. coli or Saccharomyces cerevisiae.
12. An expression vector comprising a) a nucleic acid encoding a polypeptide having a CPP synthase activity comprising a) an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or b) an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or c) an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; or b) a nucleic acid encoding a polypeptide having a CPP synthase activity comprising a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a nucleic acid sequence as set forth in SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
13. The expression vector of claim 12 further comprising a) a nucleic acid encoding a polypeptide having a sclareol synthase activity, wherein the polypeptide having sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25; or b) a nucleic acid encoding a polypeptide having a sclareol synthase activity comprising a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
14. A non-human host organism or cell comprising a) the expression vector as recited in claim 12; or b) a nucleic acid encoding a polypeptide having a copalyl diphosphate synthase activity and a nucleic acid encoding a polypeptide having a sclareol synthase activity, wherein the polypeptide having copalyl diphosphate activity comprises i. an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or ii. an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or iii. an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and wherein the polypeptide having sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25; and wherein at least one of the nucleic acids is heterologous to the non-human host organism or cell.
15. The non-human host organism or cell of claim 14, wherein the non-human host organism or cell is a plant, a prokaryote, a fungus, Escherichia coli, or Saccharomyces cerevisiae.
16. The method as recited in claim 9, wherein the host cell or non-human organism is a plant, a prokaryote, or a fungus.
17. The method as recited in claim 9, wherein the non-human host organism or cell is E. coli or Saccharomyces cerevisiae.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a Divisional Application of U.S. Non-Provisional application Ser. No. 16/472,120 filed Jun. 20, 2019, which claims priority to U.S. National Phase Application of PCT/EP2017/083372, filed Dec. 18, 2017, which claims the benefit of priority to European Patent Application No. 16206349.9, filed Dec. 22, 2016, the entire contents of which are hereby incorporated by reference herein.
TECHNICAL FIELD
[0002] Provided herein are biochemical methods of producing (+)-manool using a copalyl diphosphate synthase and a sclareol synthase.
BACKGROUND
[0003] Terpenes are found in most organisms (microorganisms, animals and plants). These compounds are made up of five carbon units called isoprene units and are classified by the number of these units present in their structure. Thus monoterpenes, sesquiterpenes and diterpenes are terpenes containing 10, 15 and 20 carbon atoms respectively. Sesquiterpenes, for example, are widely found in the plant kingdom. Many sesquiterpene molecules are known for their flavor and fragrance properties and their cosmetic, medicinal and antimicrobial effects. Numerous sesquiterpene hydrocarbons and sesquiterpenoids have been identified.
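The carbon-count rule in the preceding paragraph can be illustrated with a short sketch. The function name `classify_terpene` and the lookup table are illustrative assumptions and are not part of the application; only the three classes named in the text are covered.

```python
# Illustrative sketch: classify a terpene by its number of carbon atoms,
# using the rule that each isoprene unit contributes five carbon atoms.
ISOPRENE_CARBONS = 5

# Only the three classes named in the paragraph above are listed here.
TERPENE_CLASSES = {2: "monoterpene", 3: "sesquiterpene", 4: "diterpene"}


def classify_terpene(carbon_atoms: int) -> str:
    """Return the terpene class for a carbon count (hypothetical helper)."""
    units, remainder = divmod(carbon_atoms, ISOPRENE_CARBONS)
    if remainder != 0 or units not in TERPENE_CLASSES:
        return "unclassified"
    return TERPENE_CLASSES[units]


print(classify_terpene(20))  # a 20-carbon skeleton is classified as a diterpene
```

Under this rule a 20-carbon skeleton, four isoprene units, falls in the diterpene class that is the subject of the paragraphs that follow.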
[0004] Biosynthetic production of terpenes involves enzymes called terpene synthases. These enzymes convert an acyclic terpene precursor into one or more terpene products. In particular, diterpene synthases produce diterpenes by cyclization of the precursor geranylgeranyl diphosphate (GGPP). The cyclization of GGPP often requires two enzyme polypeptides, a type I and a type II diterpene synthase, working in combination in two successive enzymatic reactions. The type II diterpene synthases catalyze a cyclization/rearrangement of GGPP initiated by the protonation of the terminal double bond of GGPP, leading to a cyclic diterpene diphosphate intermediate. This intermediate is then further converted by a type I diterpene synthase catalyzing an ionization-initiated cyclization.
[0005] Diterpene synthases are present in plants and other organisms and use substrates such as GGPP, but they have different product profiles. Genes and cDNAs encoding diterpene synthases have been cloned and the corresponding recombinant enzymes characterized.
[0006] Copalyl diphosphate (CPP) synthases and sclareol synthases are enzymes that occur in plants. Hence, it is desirable to discover and use these enzymes and variants in biochemical processes to generate (+)-manool.
SUMMARY
[0007] Provided herein is a method of producing (+)-manool comprising:
[0008] a) contacting geranylgeranyl diphosphate (GGPP) with a copalyl diphosphate (CPP) synthase to form a copalyl diphosphate, wherein the CPP synthase comprises
[0009] i) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15; or
[0010] ii) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 17 and SEQ ID NO: 18; or
[0011] iii) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21; and
[0012] b) contacting the CPP with a sclareol synthase to form (+)-manool; and
[0013] c) optionally isolating the (+)-manool.
[0014] Provided herein is the above method further comprising processing the (+)-manool to a (+)-manool derivative.
[0015] Also provided herein is a polypeptide having CPP synthase activity, wherein the polypeptide comprises
[0016] a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15; or
[0017] b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 17 and SEQ ID NO: 18; or
[0018] c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21.
[0019] Further provided is a polypeptide having sclareol synthase activity, wherein the polypeptide comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, and SEQ ID NO: 25.
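The "at least N% sequence identity" thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-") and uses one common convention, identical positions divided by alignment columns; the application may rely on a different alignment tool or identity definition. The function names and the toy sequences are hypothetical and are not any of the SEQ ID NOs.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences (gaps written as '-').

    Assumed convention: identical residues / alignment columns, ignoring
    columns in which both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold_percent: float) -> bool:
    """True if the pair reaches at least the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy ten-residue peptides differing at one position (made-up sequences).
reference = "ACDEFGHIKL"
candidate = "ACDEFGHIKV"
print(percent_identity(reference, candidate))     # 90.0
print(meets_threshold(reference, candidate, 95))  # False
```

In this toy case the candidate would satisfy a 90% threshold but not a 95% threshold.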
[0020] Also provided herein is a nucleic acid encoding a polypeptide described above.
[0021] Also provided herein is a nucleic acid encoding a CPP synthase wherein the nucleic acid comprises a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to the nucleic acid sequence as set forth in SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
[0022] Further provided herein is a nucleic acid encoding a sclareol synthase wherein the nucleic acid comprises a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
[0023] Also provided is an expression vector comprising the nucleic acids described above, a non-human host organism or cell comprising the nucleic acids described above or comprising the expression vector, non-human host organisms or cells capable of producing GGPP, methods of transforming a non-human host organism or cell, and methods for culturing the non-human host organisms or cells for producing (+)-manool.
DESCRIPTION OF THE DRAWINGS
[0024] FIG. 1. Enzymatic pathway from geranylgeranyl diphosphate (GGPP) to (+)-manool.
[0025] FIG. 2. Enzymatic pathways from geranylgeranyl diphosphate (GGPP) to (+)-manool and sclareol.
[0026] FIG. 3. GCMS analysis of the in vitro enzymatic conversion of GGPP. A. Using the recombinant SmCPS enzyme. B. Using the recombinant ScScS enzyme. C. Combining the SmCPS with ScScS enzymes in a single assay.
[0027] FIG. 4. GCMS analysis of (+)-manool produced using Escherichia coli cells expressing SmCPS, ScScS and mevalonate pathway enzymes. A. Total ion chromatogram of an extract of the E. coli culture medium. B. Total ion chromatogram of a (+)-manool standard. C. Mass spectrum of the major peak (retention time of 14.55 min) in chromatogram A. D. Mass spectrum of the (+)-manool authentic standard.
[0028] FIG. 5. GCMS analysis of (+)-manool produced using E. coli cells expressing mevalonate pathway enzymes, a GGPP synthase, ScSCS and five different CPP synthases: SmCPS2 from Salvia miltiorrhiza, CfCPS1 from Coleus forskohlii, TaTps1 from Triticum aestivum, MvCps3 from Marrubium vulgare and RoCPS1 from Rosmarinus officinalis.
[0029] FIG. 6. GCMS analysis of (+)-manool produced using E. coli cells expressing mevalonate pathway enzymes, a GGPP synthase, SmCPS2 and a class I diterpene synthase: NgSCS-del29 from Nicotiana glutinosa or SsScS from Salvia sclarea.
[0030] FIG. 7. Saccharomyces cerevisiae expression plasmids were constructed in vivo by co-transformation of yeast with six DNA fragments: a) LEU2 yeast marker, b) AmpR E. coli marker, c) yeast origin of replication, d) E. coli replication origin, e) a fragment for co-expression of CrtE and one of the sclareol synthase coding sequences tested, and f) a fragment for expression of one of the copalyl diphosphate (CPP) synthase coding sequences tested.
[0031] FIG. 8. GCMS analysis of (+)-manool produced using the modified S. cerevisiae strain YST045 expressing a GGPP synthase, ScSCS and five different truncated versions of CPP synthases: SmCPS2 from Salvia miltiorrhiza, CfCPS1 from Coleus forskohlii, TaTps1 from Triticum aestivum, MvCps3 from Marrubium vulgare and RoCPS1 from Rosmarinus officinalis.
DETAILED DESCRIPTION
Definitions
[0032] For the descriptions herein and the appended claims, the use of "or" means "and/or" unless stated otherwise.
[0033] Similarly, "comprise," "comprises," "comprising," "include," "includes," and "including" are interchangeable and not intended to be limiting.
[0034] It is to be further understood that where descriptions of various embodiments use the term "comprising," those skilled in the art would understand that in some specific instances, an embodiment can be alternatively described using language "consisting essentially of" or "consisting of."
[0035] The following terms have the meanings ascribed to them unless specified otherwise.
[0036] The term "polypeptide" means an amino acid sequence of consecutively polymerized amino acid residues, for instance, at least 15 residues, at least 30 residues, at least 50 residues. In some embodiments provided herein, a polypeptide comprises an amino acid sequence that is an enzyme, or a fragment, or a variant thereof.
[0037] The term "isolated" polypeptide refers to an amino acid sequence that is removed from its natural environment by any method or combination of methods known in the art and includes recombinant, biochemical and synthetic methods.
[0038] The term "protein" refers to an amino acid sequence of any length wherein amino acids are linked by covalent peptide bonds, and includes oligopeptide, peptide, polypeptide and full length protein whether naturally occurring or synthetic.
[0039] The terms "biological function," "function," "biological activity" or "activity" refer to the ability of the CPP synthase and the sclareol synthase to catalyze the formation of (+)-manool.
[0040] The terms "nucleic acid sequence," "nucleic acid," and "polynucleotide" are used interchangeably, meaning a sequence of nucleotides. A nucleic acid sequence may be a single-stranded or double-stranded deoxyribonucleotide, or ribonucleotide of any length, and includes coding and non-coding sequences of a gene, exons, introns, sense and anti-sense complementary sequences, genomic DNA, cDNA, miRNA, siRNA, mRNA, rRNA, tRNA, recombinant nucleic acid sequences, isolated and purified naturally occurring DNA and/or RNA sequences, synthetic DNA and RNA sequences, fragments, primers and nucleic acid probes; and the complement of such sequences. The skilled artisan is aware that the nucleic acid sequences of RNA are identical to the DNA sequences with the difference of thymine (T) being replaced by uracil (U).
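The DNA-to-RNA correspondence stated in the last sentence above can be shown with a brief sketch. The helper name `dna_to_rna` and the example sequence are assumptions made for illustration only.

```python
def dna_to_rna(dna: str) -> str:
    """Write a DNA sequence as the corresponding RNA sequence (T replaced by U).

    Hypothetical helper; it accepts only the four unambiguous DNA letters.
    """
    dna = dna.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.replace("T", "U")


print(dna_to_rna("ATGCTT"))  # AUGCUU
```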
[0041] An isolated nucleic acid or isolated nucleic acid sequence refers to a nucleic acid or nucleic acid sequence that is in an environment different from that in which the nucleic acid or nucleic acid sequence naturally occurs. The term "naturally-occurring" as used herein as applied to a nucleic acid refers to a nucleic acid that is found in a cell in nature. For example, a nucleic acid sequence that is present in an organism, for instance in the cells of an organism, that can be isolated from a source in nature and which has not been intentionally modified by a human in the laboratory is naturally occurring.
[0042] The terms "purified," "substantially purified," and "isolated" as used herein refer to the state of being free of other, dissimilar compounds with which the compound of the invention is normally associated in its natural state, so that the "purified," "substantially purified," and "isolated" subject comprises at least 0.5%, 1%, 5%, 10%, or 20%, or at least 50% or 75% of the mass, by weight, of a given sample. In one particular embodiment, these terms refer to the compound of the invention comprising at least 95, 96, 97, 98, 99 or 100% of the mass, by weight, of a given sample. As used herein, the terms "purified," "substantially purified," and "isolated," when referring to a nucleic acid or protein, or classes of nucleic acids or proteins, also refer to a state of purification or concentration different than that which occurs naturally in a cell or organism. Any degree of purification or concentration greater than that which occurs naturally in a cell or organism, including (1) the purification from other associated structures or compounds or (2) the association with structures or compounds to which it is not normally associated in the cell or organism, is within the meaning of "isolated." The nucleic acid or protein or classes of nucleic acids or proteins, described herein, may be isolated, or otherwise associated with structures or compounds to which they are not normally associated in nature, according to a variety of methods and processes known to those of skill in the art.
[0043] As used herein, the terms "amplifying" and "amplification" refer to the use of any suitable amplification methodology for generating or detecting recombinant or naturally expressed nucleic acid, as described in detail below. For example, the invention provides methods and reagents (e.g., specific degenerate oligonucleotide primer pairs, oligo dT primer) for amplifying (e.g., by polymerase chain reaction, PCR) naturally expressed (e.g., genomic DNA or mRNA) or recombinant (e.g., cDNA) nucleic acids of the invention in vivo, ex vivo or in vitro.
[0044] "Recombinant nucleic acid sequences" are nucleic acid sequences that result from the use of laboratory methods (molecular cloning) to bring together genetic material from more than one source, creating a nucleic acid sequence that does not occur naturally and would not be otherwise found in biological organisms.
[0045] "Recombinant DNA technology" refers to molecular biology procedures to prepare a recombinant nucleic acid sequence as described, for instance, in Laboratory Manuals edited by Weigel and Glazebrook, 2002 Cold Spring Harbor Lab Press; and Sambrook et al., 1989 Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press.
[0046] The term "gene" means a DNA sequence comprising a region, which is transcribed into an RNA molecule, e.g., an mRNA in a cell, operably linked to suitable regulatory regions, e.g., a promoter. A gene may thus comprise several operably linked sequences, such as a promoter, a 5' leader sequence comprising, e.g., sequences involved in translation initiation, a coding region of cDNA or genomic DNA, introns, exons, and/or a 3' non-translated sequence comprising, e.g., transcription termination sites.
[0047] A "chimeric gene" refers to any gene, which is not normally found in nature in a species, in particular, a gene in which one or more parts of the nucleic acid sequence are present that are not associated with each other in nature. For example the promoter is not associated in nature with part or all of the transcribed region or with another regulatory region. The term "chimeric gene" is understood to include expression constructs in which a promoter or transcription regulatory sequence is operably linked to one or more coding sequences or to an antisense, i.e., reverse complement of the sense strand, or inverted repeat sequence (sense and antisense, whereby the RNA transcript forms double stranded RNA upon transcription). The term "chimeric gene" also includes genes obtained through the combination of portions of one or more coding sequences to produce a new gene.
[0048] A "3' UTR" or "3' non-translated sequence" (also referred to as "3' untranslated region," or "3' end") refers to the nucleic acid sequence found downstream of the coding sequence of a gene, which comprises for example a transcription termination site and (in most, but not all eukaryotic mRNAs) a polyadenylation signal such as AAUAAA or variants thereof. After termination of transcription, the mRNA transcript may be cleaved downstream of the polyadenylation signal and a poly(A) tail may be added, which is involved in the transport of the mRNA to the site of translation, e.g., cytoplasm.
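As a minimal illustration of the polyadenylation signal mentioned above, the following sketch scans an RNA sequence for the canonical AAUAAA hexamer. The function name `find_polya_signals` and the example transcript are made up for illustration; signal variants and real 3' UTR annotation are outside the scope of this sketch.

```python
from typing import List


def find_polya_signals(rna: str, signal: str = "AAUAAA") -> List[int]:
    """Return 0-based start positions of every occurrence of the signal.

    Hypothetical helper; overlapping occurrences are also reported.
    """
    rna = rna.upper()
    positions = []
    start = rna.find(signal)
    while start != -1:
        positions.append(start)
        start = rna.find(signal, start + 1)
    return positions


# Made-up 3' region containing the hexamer twice.
print(find_polya_signals("GGCAAUAAAGCUUAAUAAAC"))  # [3, 13]
```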
[0049] "Expression of a gene" involves transcription of the gene and translation of the mRNA into a protein. Overexpression refers to the production of the gene product as measured by levels of mRNA, polypeptide and/or enzyme activity in transgenic cells or organisms that exceeds levels of production in non-transformed cells or organisms of a similar genetic background.
[0050] "Expression vector" as used herein means a nucleic acid molecule engineered using molecular biology methods and recombinant DNA technology for delivery of foreign or exogenous DNA into a host cell. The expression vector typically includes sequences required for proper transcription of the nucleotide sequence. The coding region usually codes for a protein of interest but may also code for an RNA, e.g., an antisense RNA, siRNA and the like.
[0051] An "expression vector" as used herein includes any linear or circular recombinant vector including but not limited to viral vectors, bacteriophages and plasmids. The skilled person is capable of selecting a suitable vector according to the expression system. In one embodiment, the expression vector includes the nucleic acid of an embodiment herein operably linked to at least one regulatory sequence, which controls transcription, translation, initiation and termination, such as a transcriptional promoter, operator or enhancer, or an mRNA ribosomal binding site and, optionally, including at least one selection marker. Nucleotide sequences are "operably linked" when the regulatory sequence functionally relates to the nucleic acid of an embodiment herein. "Regulatory sequence" refers to a nucleic acid sequence that determines the expression level of the nucleic acid sequences of an embodiment herein and is capable of regulating the rate of transcription of the nucleic acid sequence operably linked to the regulatory sequence. Regulatory sequences comprise promoters, enhancers, transcription factors, promoter elements and the like.
[0052] "Promoter" refers to a nucleic acid sequence that controls the expression of a coding sequence by providing a binding site for RNA polymerase and other factors required for proper transcription including without limitation transcription factor binding sites, repressor and activator protein binding sites. The meaning of the term promoter also includes the term "promoter regulatory sequence". Promoter regulatory sequences may include upstream and downstream elements that may influence transcription, RNA processing or stability of the associated coding nucleic acid sequence. Promoters include naturally-derived and synthetic sequences. The coding nucleic acid sequence is usually located downstream of the promoter with respect to the direction of the transcription starting at the transcription initiation site.
[0053] The term "constitutive promoter" refers to an unregulated promoter that allows for continual transcription of the nucleic acid sequence it is operably linked to.
[0054] As used herein, the term "operably linked" refers to a linkage of polynucleotide elements in a functional relationship. A nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence. For instance, a promoter, or rather a transcription regulatory sequence, is operably linked to a coding sequence if it affects the transcription of the coding sequence. Operably linked means that the DNA sequences being linked are typically contiguous. The nucleotide sequence associated with the promoter sequence may be of homologous or heterologous origin with respect to the cell or organism, e.g. host cell, plant cell, plant, or microorganism, to be transformed. The sequence also may be entirely or partially synthetic. Regardless of the origin, the nucleic acid sequence associated with the promoter sequence will be expressed or silenced in accordance with promoter properties to which it is linked. The associated nucleic acid may code for a protein that is desired to be expressed or suppressed throughout the organism at all times or, alternatively, at a specific time or in specific tissues, cells, or cell compartment. Such nucleotide sequences particularly encode proteins conferring desirable phenotypic traits to the host cells or organism altered or transformed therewith. More particularly, the associated nucleotide sequence leads to the production of a (+)-manool synthase in the host cell or organism.
[0055] "Target peptide" refers to an amino acid sequence which targets a protein, or polypeptide to intracellular organelles, i.e., mitochondria, or plastids, or to the extracellular space (secretion signal peptide). A nucleic acid sequence encoding a target peptide may be fused to the nucleic acid sequence encoding the amino terminal end, e.g., N-terminal end, of the protein or polypeptide, or may be used to replace a native targeting polypeptide.
[0056] The term "primer" refers to a short nucleic acid sequence that is hybridized to a template nucleic acid sequence and is used for polymerization of a nucleic acid sequence complementary to the template.
[0057] As used herein, the term "host cell" or "transformed cell" refers to a cell (or organism) altered to harbor at least one nucleic acid molecule, for instance, a recombinant gene encoding a desired protein or nucleic acid sequence which upon transcription yields a CPP synthase protein and/or a sclareol synthase protein or which together produce (+)-manool.
[0058] The host cell is particularly a bacterial cell, a fungal cell or a plant cell. The host cell may contain a recombinant gene which has been integrated into the nuclear or organelle genomes of the host cell. Alternatively, the host may contain the recombinant gene extra-chromosomally. Homologous sequences include orthologous or paralogous sequences. Methods of identifying orthologs or paralogs including phylogenetic methods, sequence similarity and hybridization methods are known in the art and are described herein.
[0059] Paralogs result from gene duplication that gives rise to two or more genes with similar sequences and similar functions. Paralogs typically cluster together and are formed by duplications of genes within related plant species. Paralogs are found in groups of similar genes using pair-wise Blast analysis or during phylogenetic analysis of gene families using programs such as CLUSTAL. In paralogs, consensus sequences can be identified characteristic to sequences within related genes and having similar functions of the genes.
[0060] Orthologs, or orthologous sequences, are sequences similar to each other because they are found in species that descended from a common ancestor. For instance, plant species that have common ancestors are known to contain many enzymes that have similar sequences and functions. The skilled artisan can identify orthologous sequences and predict the functions of the orthologs, for example, by constructing a phylogenetic tree for a gene family of one species using, for example, CLUSTAL or BLAST programs.
[0061] The term "selectable marker" refers to any gene which upon expression may be used to select a cell or cells that include the selectable marker. Examples of selectable markers are described below. The skilled artisan will know that different antibiotic, fungicide, auxotrophic or herbicide selectable markers are applicable to different target species.
[0062] The term "organism" refers to any non-human multicellular or unicellular organism such as a plant, or a microorganism. Particularly, a microorganism is a bacterium, a yeast, an alga or a fungus.
[0063] The term "plant" is used interchangeably to include plant cells including plant protoplasts, plant tissues, plant cell tissue cultures giving rise to regenerated plants, or parts of plants, or plant organs such as roots, stems, leaves, buds, flowers, petioles, petals, pollen, ovules, embryos, tubers, fruits, seed, progeny thereof and the like. Any plant can be used to carry out the methods of an embodiment herein.
Particular Embodiments
[0064] In one embodiment provided herein is a method for transforming a host cell or non-human organism comprising transforming a host cell or non-human organism with a nucleic acid encoding a polypeptide having a copalyl diphosphate synthase activity and with a nucleic acid encoding a polypeptide having a sclareol synthase activity, wherein the polypeptide having the copalyl diphosphate synthase activity comprises
[0065] a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15; or
[0066] b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 17 and SEQ ID NO: 18; or
[0067] c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21.
[0068] In one embodiment, the polypeptide having the sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, and SEQ ID NO: 25.
[0069] In one embodiment provided herein is a method comprising cultivating a non-human host organism or cell capable of producing a geranylgeranyl diphosphate (GGPP) and transformed to express a polypeptide having a copalyl diphosphate synthase activity wherein the polypeptide having the copalyl diphosphate synthase activity comprises
[0070] a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15; or
[0071] b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 17 and SEQ ID NO: 18; or
[0072] c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21; and
further transformed to express a polypeptide having a sclareol synthase activity.
[0073] Particularly, the polypeptide having the sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, and SEQ ID NO: 25.
[0074] Further provided herein is an expression vector comprising a nucleic acid encoding a CPP synthase wherein the CPP synthase comprises a polypeptide comprising
[0075] a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
[0076] b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
[0077] c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and further the expression vector comprises a nucleic acid encoding a sclareol synthase enzyme.
[0078] Particularly, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25. In a particular embodiment, the two enzymes, i.e. the CPP synthase and the sclareol synthase, could be on two different vectors or plasmids transformed in the same cell. In a further embodiment, these two enzymes could be on two different vectors or plasmids transformed in two different cells.
[0079] Further provided herein is a non-human host organism or cell comprising or transformed to harbor at least one nucleic acid encoding a CPP synthase wherein the CPP synthase comprises
[0080] a) a polypeptide comprising an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
[0081] b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
[0082] c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and at least one nucleic acid encoding a sclareol synthase enzyme.
[0083] Particularly, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25.
[0084] Further provided herein is a non-human host organism or cell comprising or transformed to harbor at least one nucleic acid encoding a CPP synthase wherein the CPP synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 2; and
at least one nucleic acid encoding a sclareol synthase enzyme wherein the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 23 and SEQ ID NO: 25.
[0085] In one embodiment, the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence that has at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
[0086] In one embodiment, the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
[0087] In one embodiment, the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
[0088] In one embodiment, the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence having 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
[0089] In one embodiment, the nucleic acid that encodes for a CPP synthase provided herein comprises the nucleotide sequence as set forth in SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
[0090] In one embodiment, the CPP synthase comprises a polypeptide comprising
[0091] a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15; or
[0092] b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 17 and SEQ ID NO: 18; or
[0093] c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21.
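By way of illustration only, the broadest identity thresholds recited above for the CPP synthase can be held in a simple lookup table. The sketch below assumes that percent identities of a candidate polypeptide to the reference sequences have already been measured; the names CPP_MIN_IDENTITY and meets_cpp_identity are hypothetical and are not part of the described methods. Meeting a threshold says nothing about enzymatic activity, which would still have to be confirmed experimentally.

```python
# Illustrative lookup of the broadest minimum percent identities recited for the
# CPP synthase, keyed by reference SEQ ID NO (values taken from items a) to c)).
CPP_MIN_IDENTITY = {
    14: 50.0, 15: 50.0,   # item a)
    17: 71.0, 18: 71.0,   # item b)
    20: 90.0, 21: 90.0,   # item c)
}


def meets_cpp_identity(identities):
    """Return the reference SEQ ID NOs whose minimum identity is met.

    `identities` maps a reference SEQ ID NO to the measured percent identity
    of a candidate polypeptide against that reference (hypothetical input).
    """
    return sorted(
        seq_id
        for seq_id, pct in identities.items()
        if seq_id in CPP_MIN_IDENTITY and pct >= CPP_MIN_IDENTITY[seq_id]
    )


# Example with made-up identity values for a hypothetical candidate.
print(meets_cpp_identity({14: 62.0, 17: 65.0, 20: 90.0}))
```

In this made-up example the candidate meets the thresholds for SEQ ID NO: 14 and SEQ ID NO: 20 but not the 71% threshold for SEQ ID NO: 17.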
[0094] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% sequence identity to SEQ ID NO: 14.
[0095] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
[0096] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
[0097] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
[0098] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
[0099] In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 14.
[0100] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
[0101] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
[0102] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
[0103] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
[0104] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
[0105] In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 15.
[0106] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
[0107] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
[0108] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
[0109] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
[0110] In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 17.
[0111] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
[0112] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
[0113] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
[0114] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
[0115] In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 18.
[0116] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20.
[0117] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20.
[0118] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 20.
[0119] In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 20.
[0120] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 21.
[0121] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 21.
[0122] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 21.
[0123] In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 21.
[0124] In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 21.
[0125] In one embodiment, the nucleic acid encoding the sclareol synthase enzyme comprises a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
[0126] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4.
[0127] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4.
[0128] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 4.
[0129] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 4.
[0130] In one embodiment, the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 4.
[0131] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 5.
[0132] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 5.
[0133] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 5.
[0134] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 5.
[0135] In one embodiment, the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 5.
[0136] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 23.
[0137] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 23.
[0138] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 23.
[0139] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 23.
[0140] In one embodiment, the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 23.
[0141] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 25.
[0142] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 25.
[0143] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 25.
[0144] In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 25.
[0145] In one embodiment, the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 25.
[0146] In another embodiment, provided herein is an expression vector comprising at least one of the nucleic acids described herein.
[0147] In another embodiment, provided herein is a non-human host organism or cell that comprises one or more expression vectors comprising a nucleic acid encoding a CPP synthase as described herein and a nucleic acid encoding a sclareol synthase as described herein.
[0148] In another embodiment, provided herein is a non-human host organism or cell comprising or transformed to harbor at least one nucleic acid described herein so that it heterologously expresses or over-expresses at least one polypeptide described herein.
[0149] In an embodiment, the present invention provides a transformed cell or organism, in which the polypeptides are expressed in higher quantity than in the same cell or organism not so transformed.
[0150] There are several methods known in the art for the creation of transgenic host organisms or cells such as plants, fungi, prokaryotes, or cultures of higher eukaryotic cells. Appropriate cloning and expression vectors for use with bacterial, fungal, yeast, plant and mammalian cellular hosts are described, for example, in Pouwels et al., Cloning Vectors: A Laboratory Manual, 1985, Elsevier, New York and Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, 1989, Cold Spring Harbor Laboratory Press. Cloning and expression vectors for higher plants and/or plant cells in particular are available to the skilled person. See for example Schardl et al., Gene, 1987, 61:1-11.
[0151] Methods for transforming host organisms or cells to harbor transgenic nucleic acids are familiar to the skilled person. For the creation of transgenic plants, for example, current methods include: electroporation of plant protoplasts, liposome-mediated transformation, Agrobacterium-mediated transformation, polyethylene-glycol-mediated transformation, particle bombardment, microinjection of plant cells, and transformation using viruses.
[0152] In one embodiment, transformed DNA is integrated into a chromosome of a non-human host organism and/or cell such that a stable recombinant system results. Any chromosomal integration method known in the art may be used in the practice of the invention, including but not limited to recombinase-mediated cassette exchange (RMCE), viral site-specific chromosomal insertion, adenovirus and pronuclear injection.
[0153] In one embodiment for carrying out the method for producing (+)-manool, herein provided is a method of making at least one polypeptide having a CPP synthase activity and at least one polypeptide having a sclareol synthase activity as described in any embodiment of the invention.
[0154] One embodiment provides a method for producing manool comprising
[0155] a) contacting geranylgeranyl diphosphate (GGPP) with a copalyl diphosphate (CPP) synthase as described herein to form a copalyl diphosphate; and
[0156] b) contacting the CPP with a sclareol synthase as described herein to form (+)-manool; wherein step a) comprises culturing a non-human host organism or host cell capable of producing GGPP and transformed with one or more nucleic acids as described herein or with one or more expression vectors as described herein, so that the non-human host organism or host cell harbors a nucleic acid encoding a polypeptide having CPP synthase activity as described herein and a nucleic acid encoding a polypeptide having a sclareol synthase activity as described herein and expresses or over-expresses the polypeptides.
[0157] One embodiment provides the above method for producing manool further comprising prior to step a), transforming a non-human host organism or host cell capable of producing GGPP with
[0158] a) at least one nucleic acid encoding a polypeptide comprising
[0159] i. an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
[0160] ii. an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
[0161] iii. an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and
[0162] having a CPP synthase activity, so that said organism or cell expresses said polypeptide having a CPP synthase activity; and
[0163] b) at least one nucleic acid encoding a polypeptide having a sclareol synthase activity as described herein, so that said organism or cell expresses said polypeptide having a sclareol synthase activity.
[0164] In one embodiment, the non-human host organism or host cell capable of producing GGPP comprises
[0165] a) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 15 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 5; or
[0166] b) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 18 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 5; or
[0167] c) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 21 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 5; or
[0168] d) a nucleic acid comprising SEQ ID NO: 16 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 6 which encodes for a sclareol synthase; or
[0169] e) a nucleic acid comprising SEQ ID NO: 19 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 6 which encodes for a sclareol synthase; or
[0170] f) a nucleic acid comprising SEQ ID NO: 22 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 6 which encodes for a sclareol synthase; or
[0171] g) a nucleic acid comprising SEQ ID NO: 26 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which encodes for a sclareol synthase; or
[0172] h) a nucleic acid comprising SEQ ID NO: 29 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which encodes for a sclareol synthase; or
[0173] i) a nucleic acid comprising SEQ ID NO: 30 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which encodes for a sclareol synthase; or
[0174] j) a nucleic acid comprising SEQ ID NO: 31 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which encodes for a sclareol synthase; or
[0175] k) a nucleic acid comprising SEQ ID NO: 32 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which encodes for a sclareol synthase; or
[0176] l) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 2 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 23; or
[0177] m) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 15 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 23; or
[0178] n) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 18 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 23; or
[0179] o) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 21 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 23; or
[0180] p) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 2 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 25; or
[0181] q) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 15 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 25; or
[0182] r) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 18 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 25; or
[0183] s) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 21 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 25; or
[0184] t) a nucleic acid comprising SEQ ID NO: 16 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 24 which encodes for a sclareol synthase; or
[0185] u) a nucleic acid comprising SEQ ID NO: 19 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 24 which encodes for a sclareol synthase; or
[0186] v) a nucleic acid comprising SEQ ID NO: 22 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 24 which encodes for a sclareol synthase; or
[0187] w) a nucleic acid comprising SEQ ID NO: 26 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which encodes for a sclareol synthase; or
[0188] x) a nucleic acid comprising SEQ ID NO: 26 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which encodes for a sclareol synthase; or
[0189] y) a nucleic acid comprising SEQ ID NO: 29 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which encodes for a sclareol synthase; or
[0190] z) a nucleic acid comprising SEQ ID NO: 29 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which encodes for a sclareol synthase; or
[0191] aa) a nucleic acid comprising SEQ ID NO: 30 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which encodes for a sclareol synthase; or
[0192] bb) a nucleic acid comprising SEQ ID NO: 30 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which encodes for a sclareol synthase; or
[0193] cc) a nucleic acid comprising SEQ ID NO: 31 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which encodes for a sclareol synthase; or
[0194] dd) a nucleic acid comprising SEQ ID NO: 31 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which encodes for a sclareol synthase; or
[0195] ee) a nucleic acid comprising SEQ ID NO: 32 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which encodes for a sclareol synthase; or
[0196] ff) a nucleic acid comprising SEQ ID NO: 32 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which encodes for a sclareol synthase; wherein the above combinations of nucleic acid sequences and/or synthases also comprise the variants and various percent identities to the SEQ ID NO enumerated as described herein.
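Items w) to ff) above pair each of five CPP synthase nucleic acids with each of two sclareol synthase nucleic acids. As an illustrative sketch only, that subset of the list can be regenerated as a Cartesian product; the variable and function names below are hypothetical, and the remaining items a) to v) are not covered by this sketch.

```python
from itertools import product

# SEQ ID NOs of the nucleic acids named in items w) to ff) of the list above.
CPP_NUCLEIC_ACIDS = (26, 29, 30, 31, 32)
SCLAREOL_SYNTHASE_NUCLEIC_ACIDS = (33, 34)

# Item labels in the order in which the list presents them.
LABELS = ("w", "x", "y", "z", "aa", "bb", "cc", "dd", "ee", "ff")


def enumerate_pairings():
    """Map each item label to its (CPP synthase, sclareol synthase) SEQ ID NO pair."""
    pairs = product(CPP_NUCLEIC_ACIDS, SCLAREOL_SYNTHASE_NUCLEIC_ACIDS)
    return dict(zip(LABELS, pairs))


for label, (cpp_id, scs_id) in enumerate_pairings().items():
    print(f"{label}) SEQ ID NO: {cpp_id} with SEQ ID NO: {scs_id}")
```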
[0197] In one embodiment, the non-human host organism provided herein is a plant, a prokaryote or a fungus.
[0198] In one embodiment, the non-human host provided herein is a microorganism, particularly bacteria or yeast.
[0199] In one embodiment, the bacterium provided herein is Escherichia coli and the yeast is Saccharomyces cerevisiae.
[0200] In one embodiment, the non-human organism provided herein is Saccharomyces cerevisiae.
[0201] In one embodiment, the cell is a prokaryotic cell.
[0202] In another embodiment, the cell is a bacterial cell.
[0203] In one embodiment, the cell is a eukaryotic cell.
[0204] In one embodiment, the eukaryotic cell is a yeast cell or a plant cell.
[0205] In one embodiment, the manool can be produced by culturing the transformed bacteria or yeast described herein, including through fermentation, for example as described in Paddon et al., Nature, 2013, 496:528-532.
[0206] In one embodiment, the process of producing (+)-manool produces the (+)-manool at a purity of at least 98.5%.
[0207] In another embodiment, a method provided herein further comprises processing the (+)-manool to a derivative using a chemical or biochemical synthesis or a combination of both using methods commonly known in the art.
[0208] In one embodiment, the (+)-manool derivative is selected from the group consisting of a hydrocarbon, an alcohol, acetal, aldehyde, acid, ether, ketone, lactone, acetate and an ester.
[0209] According to any embodiment of the invention, said (+)-manool derivative is a C10 to C25 compound optionally comprising one, two or three oxygen atoms.
[0210] In a further embodiment, the derivative is selected from the group consisting of manool acetate ((3R)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-1-penten-3-yl acetate), copalol ((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-penten-1-ol), copalol acetate ((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-penten-1-yl acetate), copalal ((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-pentenal), (+)-manooloxy (4-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-butanone), Z-11 ((3S,5aR,7aS,11aS,11bR)-3,8,8,11a-tetramethyldodecahydro-3,5a-epoxynaphtho[2,1-c]oxepin), gamma-ambrol (2-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]ethanol) and Ambrox® ((3aR,5aS,9aS,9bR)-3a,6,6,9a-tetramethyldodecahydronaphtho[2,1-b]furan).
[0211] In another embodiment, a method provided herein further comprises contacting the (+)-manool with a suitable reacting system to convert said (+)-manool into a suitable (+)-manool derivative. Said suitable reacting system can be of enzymatic nature (e.g. requiring one or more enzymes) or of chemical nature (e.g. requiring one or more synthetic chemicals).
[0212] For example, (+)-manool may be enzymatically converted to manooloxy or gamma-ambrol using a process described in the literature, for example as set forth in U.S. Pat. No. 7,294,492, wherein said patent is hereby incorporated by reference in its entirety herein.
[0213] In yet another embodiment, the (+)-manool derivative is copalol or one of its esters with a C1-C5 carboxylic acid.
[0214] In yet another embodiment, the (+)-manool derivative is a (+)-manool ester with a C1-C5 carboxylic acid.
[0215] In one embodiment, the (+)-manool derivative is copalal.
[0216] In one embodiment, the (+)-manool derivative is manooloxy.
[0217] In yet another embodiment, the (+)-manool derivative is Z-11.
[0218] In one embodiment, the (+)-manool derivative is an ambrol, a mixture of ambrols, or an ester thereof with a C1-C5 carboxylic acid, and in particular gamma-ambrol and its esters.
[0219] In a further embodiment, the (+)-manool derivative is Ambrox®, sclareolide (also known as 3a,6,6,9a-tetramethyldecahydronaphtho[2,1-b]furan-2(1H)-one and all its diastereoisomers and stereoisomers), 3,4a,7,7,10a-pentamethyldodecahydro-1H-benzo[f]chromen-3-ol or 3,4a,7,7,10a-pentamethyl-4a,5,6,6a,7,8,9,10,10a,10b-decahydro-1H-benzo[f]chromene and all their diastereoisomers and stereoisomers cyclic ketone and open form, (1R,2R,4aS,8aS)-1-(2-hydroxyethyl)-2,5,5,8a-tetramethyldecahydronaphthalen-2-ol (DOL), or gamma-ambrol.
[0220] Specific examples of how said derivatives (e.g. a triene hydrocarbon, an acetate or copalol) can be obtained are detailed in the Examples.
[0221] For instance, the manool obtained according to the invention can be processed into Manooloxy (a ketone, as per known methods) and then into ambrol (an alcohol) and ambrox (an ether), according to EP 212254.
[0222] The ability of a polypeptide to catalyze the synthesis of a particular diterpene can be confirmed by performing the enzyme assay as detailed in the Examples provided herein.
[0223] Polypeptides are also meant to include truncated polypeptides provided that they keep their (+)-manool synthase activity and their sclareol synthase activity.
[0224] As intended herein below, a nucleotide sequence obtained by modifying the sequences described herein may be prepared using any method known in the art, for example by introducing any type of mutations such as deletion, insertion or substitution mutations. Examples of such methods are cited in the part of the description relative to the variant polypeptides and the methods to prepare them.
[0225] The percentage of identity between two peptide or nucleotide sequences is a function of the number of amino acids or nucleotide residues that are identical in the two sequences when an alignment of these two sequences has been generated. Identical residues are defined as residues that are the same in the two sequences in a given position of the alignment. The percentage of sequence identity, as used herein, is calculated from the optimal alignment by taking the number of residues identical between the two sequences, dividing it by the total number of residues in the shortest sequence and multiplying by 100. The optimal alignment is the alignment in which the percentage of identity is the highest possible. Gaps may be introduced into one or both sequences in one or more positions of the alignment to obtain the optimal alignment. These gaps are then taken into account as non-identical residues for the calculation of the percentage of sequence identity. Alignment for the purpose of determining the percentage of amino acid or nucleic acid sequence identity can be achieved in various ways using computer programs, for instance publicly available computer programs available on the world wide web. Preferably, the BLAST program (Tatusova et al., FEMS Microbiol Lett., 1999, 174:247-250) set to the default parameters, available from the National Center for Biotechnology Information (NCBI) at http://www.ncbi.nlm.nih.gov/BLAST/bl2seq/wblast2.cgi, can be used to obtain an optimal alignment of protein or nucleic acid sequences and to calculate the percentage of sequence identity.
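By way of illustration only, the identity calculation defined above can be sketched in Python as follows. The sketch assumes that the two sequences have already been aligned (for example with BLAST) and that gaps are written as "-"; the function name percent_identity and the example sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    A column counts as identical only if both sequences carry the same
    residue; a column with a gap in either sequence counts as non-identical.
    The denominator is the number of residues (gaps excluded) in the
    shorter of the two sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    residues_a = sum(1 for x in aligned_a if x != gap)
    residues_b = sum(1 for y in aligned_b if y != gap)
    shortest = min(residues_a, residues_b)
    if shortest == 0:
        raise ValueError("each sequence must contain at least one residue")
    return 100.0 * identical / shortest


# Hypothetical 7-column alignment: 5 identical columns, shorter sequence has 6 residues.
example = percent_identity("MKT-LLV", "MKTALIV")
print(f"{example:.1f}%")  # 83.3%
```

In this example the gap column and the L/I mismatch both count as non-identical, so the result is 5/6 x 100.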
[0226] The polypeptide to be contacted with GGPP in vitro can be obtained by extraction from any organism expressing it, using standard protein or enzyme extraction technologies. If the host organism is a unicellular organism or cell releasing the polypeptide of an embodiment herein into the culture medium, the polypeptide may simply be collected from the culture medium, for example by centrifugation, optionally followed by washing steps and re-suspension in suitable buffer solutions. In another embodiment, the GGPP may be contacted with the polypeptide in the culture medium where the polypeptide may be released from the host organism, unicellular organism or cell. If the organism or cell accumulates the polypeptide within its cells, the polypeptide may be obtained by disruption or lysis of the cells. The GGPP may be contacted with the polypeptide upon further extraction of the polypeptide from the cell lysate or through contact with the cell lysate without necessarily conducting such an extraction.
[0227] According to another particular embodiment, the method of any of the above-described embodiments is carried out in vivo. These embodiments provided herein are particularly advantageous since it is possible to carry out the method in vivo without previously isolating the polypeptide. The reaction occurs directly within the organism or cell transformed to express said polypeptide.
[0228] The organism or cell is meant to "express" a polypeptide, provided that the organism or cell is transformed to harbor a nucleic acid encoding said polypeptide, this nucleic acid is transcribed to mRNA and the polypeptide is found in the host organism or cell. The term "express" encompasses "heterologously express" and "over-express", the latter referring to levels of mRNA, polypeptide and/or enzyme activity over and above what is measured in a non-transformed organism or cell. A more detailed description of suitable methods to transform a non-human host organism or cell will be described later on in the part of the specification that is dedicated to such transformed non-human host organisms or cells.
[0229] A particular organism or cell is meant to be "capable of producing GGPP" when it produces GGPP naturally or when it does not produce GGPP naturally but is transformed to produce GGPP, either prior to the transformation with a nucleic acid as described herein or together with said nucleic acid. Organisms or cells transformed to produce a higher amount of GGPP than the naturally occurring organism or cell are also encompassed by the "organisms or cells capable of producing GGPP". Several methods to transform organisms, for example microorganisms, so that they produce GGPP are known, for example in Schalk et al., J. Am. Chem. Soc., 2013, 134:18900-18903.
[0230] Non-human host organisms suitable to carry out the method of an embodiment herein in vivo may be any non-human multicellular or unicellular organisms. In a particular embodiment, the non-human host organism used to carry out an embodiment herein in vivo is a plant, a prokaryote or a fungus. Any plant, prokaryote or fungus can be used. Particularly useful plants are those that naturally produce high amounts of terpenes. In a more particular embodiment the non-human host organism used to carry out the method of an embodiment herein in vivo is a microorganism. Any microorganism can be used but according to an even more particular embodiment said microorganism is a bacterium or yeast. Most particularly, said bacterium is E. coli and said yeast is Saccharomyces cerevisiae.
[0231] Some of these organisms do not produce GGPP naturally or only in small amounts. To be suitable to carry out the method of an embodiment herein, these organisms have to be transformed to produce said precursor or engineered to produce said precursor in larger amounts. They can be so transformed either before the modification with the nucleic acid described according to any of the above embodiments or simultaneously, as explained above.
[0232] In one embodiment, the non-human host organism or cell capable of producing GGPP is transformed with a nucleic acid encoding a CPP synthase or variant thereof as described herein and a nucleic acid encoding a sclareol synthase or variant thereof as described herein, wherein the non-human host organism or cell capable of producing GGPP has been engineered to over-express a GGPP synthase or transformed with a nucleic acid encoding a GGPP synthase.
[0233] In one embodiment, the non-human host organism or cell comprises a nucleic acid encoding a GGPP synthase, a nucleic acid encoding a CPP synthase or variant thereof as described herein, and a nucleic acid encoding a sclareol synthase or variant thereof as described herein, wherein at least one of said nucleic acids is heterologous to the non-human host organism or cell.
[0234] Isolated higher eukaryotic cells can also be used, instead of complete organisms, as hosts to carry out the method of an embodiment herein in vivo. Suitable eukaryotic cells may be any non-human cell, but are particularly plant or fungal cells.
[0235] According to another embodiment, the polypeptides having a CPP synthase activity used in any of the embodiments described herein or encoded by the nucleic acids described herein may be variants obtained by genetic engineering, provided that said variant keeps its CPP synthase activity.
[0236] According to another embodiment, the polypeptides having a sclareol synthase activity used in any of the embodiments described herein or encoded by the nucleic acids described herein may be variants obtained by genetic engineering, provided that said variant keeps its sclareol synthase activity or has manool synthase activity.
[0237] As used herein, the polypeptide is intended as a polypeptide or peptide fragment that encompasses the amino acid sequences identified herein, as well as truncated or variant polypeptides, provided that they keep their CPP synthase activity and their sclareol synthase activity and/or manool synthase activity.
[0238] Examples of variant polypeptides are naturally occurring proteins that result from alternate mRNA splicing events or from proteolytic cleavage of the polypeptides described herein. Variations attributable to proteolysis include, for example, differences in the N- or C-termini upon expression in different types of host cells, due to proteolytic removal of one or more terminal amino acids from the polypeptides of an embodiment herein. Polypeptides encoded by a nucleic acid obtained by natural or artificial mutation of a nucleic acid of an embodiment herein, as described thereafter, are also encompassed by an embodiment herein.
[0239] Polypeptide variants resulting from a fusion of additional peptide sequences at the amino and carboxyl terminal ends can also be used in the methods of an embodiment herein. In particular such a fusion can enhance expression of the polypeptides, be useful in the purification of the protein or improve the enzymatic activity of the polypeptide in a desired environment or expression system. Such additional peptide sequences may be signal peptides, for example. Accordingly, encompassed herein are methods using variant polypeptides, such as those obtained by fusion with other oligo- or polypeptides and/or those which are linked to signal peptides. Polypeptides resulting from a fusion with another functional protein, such as another protein from the terpene biosynthesis pathway, can also advantageously be used in the methods of an embodiment herein.
[0240] A variant may also differ from the polypeptide of an embodiment herein by attachment of modifying groups which are covalently or non-covalently linked to the polypeptide backbone.
[0241] The variant also includes a polypeptide which differs from the polypeptide described herein by introduced N-linked or O-linked glycosylation sites, and/or an addition of cysteine residues. The skilled artisan will recognize how to modify an amino acid sequence and preserve biological activity.
[0242] Therefore, in an embodiment, the present invention provides a method for preparing a variant polypeptide having a CPP synthase activity or a sclareol synthase activity or a manool synthase activity, as described in any of the above embodiments, and comprising the steps of:
[0243] (a) selecting a nucleic acid according to any of the embodiments exposed above;
[0244] (b) modifying the selected nucleic acid to obtain at least one mutant nucleic acid;
[0245] (c) transforming host cells or unicellular organisms with the mutant nucleic acid sequence to express a polypeptide encoded by the mutant nucleic acid sequence;
[0246] (d) screening the polypeptide for at least one modified property; and,
[0247] (e) optionally, if the polypeptide has no desired variant CPP synthase activity, sclareol synthase activity, or manool synthase activity repeating the process steps (a) to (d) until a polypeptide with a desired variant CPP synthase activity, sclareol synthase activity, or manool synthase activity is obtained;
[0248] (f) optionally, if a polypeptide having a desired variant CPP synthase activity or a sclareol synthase activity or manool synthase activity was identified in step (d), isolating the corresponding mutant nucleic acid obtained in step (c).
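Purely as a schematic illustration, the iterative logic of steps (a) to (f) can be sketched as a mutate-and-screen loop. Everything in this sketch is hypothetical: the function names, the library size, the use of single-base substitutions only, and in particular the screening function, which stands in for the expression and enzymatic assay of steps (c) and (d) and does not model any real enzyme property.

```python
import random
from typing import Callable

BASES = "ACGT"


def point_mutants(seq: str, n: int, rng: random.Random) -> list[str]:
    """Step (b): return n mutants of seq, each with one base substituted."""
    mutants = []
    for _ in range(n):
        pos = rng.randrange(len(seq))
        new_base = rng.choice([b for b in BASES if b != seq[pos]])
        mutants.append(seq[:pos] + new_base + seq[pos + 1:])
    return mutants


def evolve(parent: str, screen: Callable[[str], float], target: float,
           library_size: int = 20, max_rounds: int = 10,
           seed: int = 0) -> tuple[str, float]:
    """Repeat mutation and screening until the target score is reached."""
    rng = random.Random(seed)
    best_seq, best_score = parent, screen(parent)  # step (a)
    for _ in range(max_rounds):                    # step (e): repetition
        if best_score >= target:
            break
        # All mutants of a round derive from the best sequence at round start.
        for mutant in point_mutants(best_seq, library_size, rng):
            score = screen(mutant)                 # steps (c) and (d), simulated
            if score > best_score:
                best_seq, best_score = mutant, score
    return best_seq, best_score                    # step (f): retained mutant


def gc_fraction(seq: str) -> float:
    """Toy stand-in for an assay readout; not a real enzyme property."""
    return (seq.count("G") + seq.count("C")) / len(seq)


parent = "ATATATATAT"
best_seq, best_score = evolve(parent, gc_fraction, target=0.5)
print(best_seq, best_score)
```

In practice the screening step would be one of the enzymatic assays of the Examples, and the mutant library would be generated by one of the mutagenesis methods referred to in the description.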
[0249] According to an embodiment, the variant polypeptide prepared when in combination with either a polypeptide with CPP synthase activity or a sclareol synthase activity is capable of producing (+)-manool.
[0250] In step (b), a large number of mutant nucleic acid sequences may be created, for example by random mutagenesis, site-specific mutagenesis, or DNA shuffling. The detailed procedures of gene shuffling are found in Stemmer, DNA shuffling by random fragmentation and reassembly: in vitro recombination for molecular evolution (Proc Natl Acad Sci USA., 1994, 91(22): 10747-10751). In short, DNA shuffling refers to a process of random recombination of known sequences in vitro, involving at least two nucleic acids selected for recombination. For example, mutations can be introduced at particular loci by synthesizing oligonucleotides containing a mutant sequence, flanked by restriction sites enabling ligation to fragments of the native sequence. Following ligation, the resulting reconstructed sequence encodes an analog having the desired amino acid insertion, substitution, or deletion. Alternatively, oligonucleotide-directed site-specific mutagenesis procedures can be employed to provide an altered gene wherein predetermined codons can be altered by substitution, deletion or insertion.
[0251] Mutant nucleic acids may be obtained and separated, which may be used for transforming a host cell according to standard procedures, for example such as disclosed in the present examples.
[0252] In step (d), the polypeptide obtained in step (c) is screened for at least one modified property, for example a desired modified enzymatic activity. Examples of desired enzymatic activities, for which an expressed polypeptide may be screened, include enhanced or reduced enzymatic activity, as measured by K.sub.M or V.sub.max value, modified regio-chemistry or stereochemistry and altered substrate utilization or product distribution. The screening of enzymatic activity can be performed according to procedures familiar to the skilled person and those disclosed in the present examples.
[0253] Step (e) provides for repetition of process steps (a)-(d), which may preferably be performed in parallel. Accordingly, by creating a significant number of mutant nucleic acids, many host cells may be transformed with different mutant nucleic acids at the same time, allowing for the subsequent screening of an elevated number of polypeptides. The chances of obtaining a desired variant polypeptide may thus be increased at the discretion of the skilled person.
[0254] In addition to the gene sequences shown in the sequences disclosed herein, it will be apparent for the person skilled in the art that DNA sequence polymorphisms may exist within a given population, which may lead to changes in the amino acid sequence of the polypeptides disclosed herein. Such genetic polymorphisms may exist in cells from different populations or within a population due to natural allelic variation. Allelic variants may also include functional equivalents.
[0255] Further embodiments also relate to the molecules derived by such sequence polymorphisms from the concretely disclosed nucleic acids. These natural variations usually bring about a variance of about 1 to 5% in the nucleotide sequence of a gene or in the amino acid sequence of the polypeptides disclosed herein. As mentioned above, the nucleic acid encoding the polypeptide of an embodiment herein is a useful tool to modify non-human host organisms or cells intended to be used when the method is carried out in vivo.
[0256] A nucleic acid encoding a polypeptide according to any of the above-described embodiments is therefore also provided herein.
[0257] The nucleic acid of an embodiment herein can be defined as including deoxyribonucleotide or ribonucleotide polymers in either single- or double-stranded form (DNA and/or RNA). The term "nucleotide sequence" should also be understood as comprising a polynucleotide molecule or an oligonucleotide molecule in the form of a separate fragment or as a component of a larger nucleic acid. Nucleic acids of an embodiment herein also encompass certain isolated nucleotide sequences including those that are substantially free from contaminating endogenous material. The nucleic acid of an embodiment herein may be truncated, provided that it encodes a polypeptide encompassed herein, as described above.
[0258] In one embodiment, the nucleic acid of an embodiment herein that encodes for a CPP synthase can be either present naturally in a plant such as Salvia miltiorrhiza, or other species, such as Coleus forskohlii, Triticum aestivum, Marrubium vulgare or Rosmarinus officinalis, or be obtained by modifying SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
[0259] In a further embodiment, the nucleic acid of an embodiment herein that encodes for a sclareol synthase can be either present naturally in a plant such as Salvia sclarea, or other species such as Nicotiana glutinosa, or can be obtained by modifying SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
[0260] Mutations may be any kind of mutations of these nucleic acids, such as point mutations, deletion mutations, insertion mutations and/or frame shift mutations. A variant nucleic acid may be prepared in order to adapt its nucleotide sequence to a specific expression system. For example, bacterial expression systems are known to more efficiently express polypeptides if amino acids are encoded by particular codons.
[0261] Due to the degeneracy of the genetic code, more than one codon may encode the same amino acid, so that multiple nucleic acid sequences can code for the same protein or polypeptide, all these DNA sequences being encompassed by an embodiment herein. Where appropriate, the nucleic acid sequences encoding the CPP synthase and the sclareol synthase may be optimized for increased expression in the host cell. For example, nucleotides of an embodiment herein may be synthesized using codons particular to a host for improved expression.
[0262] Another important tool for transforming host organisms or cells suitable to carry out the method of an embodiment herein in vivo is an expression vector comprising a nucleic acid according to any embodiment herein. Such a vector is therefore also provided herein.
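As a minimal sketch of the degeneracy described above, the following Python example recodes a short open reading frame with synonymous codons and checks that the encoded peptide is unchanged. The codon assignments are those of the standard genetic code, but only a small subset of amino acids is covered, and the choice of the first-listed codon as the "preferred" one for a host is an assumption made for illustration; an actual optimization would rely on a full codon usage table for the chosen host. The example sequence is hypothetical.

```python
# Small subset of the standard genetic code; the first codon in each list is
# treated as the host-preferred codon (an illustrative assumption).
SYNONYMOUS = {
    "M": ["ATG"],
    "A": ["GCG", "GCC", "GCA", "GCT"],
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
    "*": ["TAA", "TAG", "TGA"],
}
CODON_TO_AA = {
    codon: aa for aa, codons in SYNONYMOUS.items() for codon in codons
}


def codons(dna: str) -> list[str]:
    """Split an in-frame DNA sequence into codons."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna: str) -> str:
    """Translate DNA to a one-letter peptide string ('*' marks a stop)."""
    return "".join(CODON_TO_AA[c] for c in codons(dna))


def recode(dna: str) -> str:
    """Replace every codon by the first-listed synonymous codon."""
    return "".join(SYNONYMOUS[CODON_TO_AA[c]][0] for c in codons(dna))


original = "ATGGCTAAGGAGTGA"   # hypothetical ORF encoding M-A-K-E-stop
recoded = recode(original)
print(recoded, translate(original) == translate(recoded))
```

The two DNA sequences differ at four of the five codons while encoding the same peptide.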
[0263] Recombinant non-human host organisms and cells transformed to harbor at least one nucleic acid of an embodiment herein so that it heterologously expresses or over-expresses at least one polypeptide of an embodiment herein are also very useful tools to carry out the method of an embodiment herein. Such non-human host organisms and cells are therefore also provided herein.
[0264] A nucleic acid according to any of the above-described embodiments can be used to transform the non-human host organisms and cells and the expressed polypeptide can be any of the above-described polypeptides.
[0265] Non-human host organisms of an embodiment herein may be any non-human multicellular or unicellular organisms. In a particular embodiment, the non-human host organism is a plant, a prokaryote or a fungus. Any plant, prokaryote or fungus is suitable to be transformed according to the methods provided herein. Particularly useful plants are those that naturally produce high amounts of terpenes.
[0266] In a more particular embodiment the non-human host organism is a microorganism. Any microorganism is suitable to be used herein, but according to an even more particular embodiment said microorganism is a bacterium or yeast. Most particularly, said bacterium is E. coli and said yeast is Saccharomyces cerevisiae.
[0267] Isolated higher eukaryotic cells can also be transformed, instead of complete organisms. As higher eukaryotic cells, we mean here any non-human eukaryotic cell except yeast cells. Particular higher eukaryotic cells are plant cells or fungal cells.
[0268] Embodiments provided herein include, but are not limited to cDNA, genomic DNA and RNA sequences.
[0269] Genes, including the polynucleotides of an embodiment herein, can be cloned on the basis of the available nucleotide sequence information, such as found in the attached sequence listing, and by methods known in the art. These include e.g. the design of DNA primers representing the flanking sequences of such a gene, of which one is generated in sense orientation and initiates synthesis of the sense strand, and the other is created in reverse complementary fashion and generates the antisense strand. Thermostable DNA polymerases such as those used in polymerase chain reaction are commonly used to carry out such experiments. Alternatively, DNA sequences representing genes can be chemically synthesized and subsequently introduced in DNA vector molecules that can be multiplied by e.g. compatible bacteria such as E. coli.
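The primer design principle described above can be illustrated by the following sketch, in which the forward primer is taken from the 5' end of the sense strand and the reverse primer is the reverse complement of the 3' end. The gene sequence, the primer length and the function names are hypothetical, and practical considerations such as melting temperature or added restriction sites are deliberately left out.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def flanking_primers(gene: str, length: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers matching the two ends of a gene.

    The forward primer is identical to the start of the sense strand; the
    reverse primer is the reverse complement of its end, so that it anneals
    to the sense strand and primes synthesis of the antisense strand.
    """
    gene = gene.upper()
    if len(gene) < length:
        raise ValueError("gene is shorter than the requested primer length")
    return gene[:length], reverse_complement(gene[-length:])


# Hypothetical 33-nt sequence, used only to show the orientation of each primer.
gene = "ATGGCGAAAATGAAGGAGAACTTTAAACGTTAA"
forward, reverse = flanking_primers(gene, length=10)
print(forward)  # ATGGCGAAAA
print(reverse)  # TTAACGTTTA
```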
[0270] Provided herein are nucleic acid sequences obtained by mutations of SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32, and SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34; such mutations can be routinely made. It is clear to the skilled artisan that mutations, deletions, insertions, and/or substitutions of one or more nucleotides can be introduced into these DNA sequence
[0271] The nucleic acid sequences of an embodiment herein encoding CPP synthase and the sclareol synthase proteins can be inserted in expression vectors and/or be contained in chimeric genes inserted in expression vectors, to produce CPP synthase and sclareol synthase in a host cell or host organism. The vectors for inserting transgenes into the genome of host cells are well known in the art and include plasmids, viruses, cosmids and artificial chromosomes. Binary or co-integration vectors into which a chimeric gene is inserted are also used for transforming host cells.
[0272] An embodiment provided herein provides recombinant expression vectors comprising a nucleic acid encoding a CPP synthase and a nucleic acid encoding a sclareol synthase, each of which is separately operably linked to associated nucleic acid sequences such as, for instance, promoter sequences.
[0273] Alternatively, the promoter sequence may already be present in a vector so that the nucleic acid sequence which is to be transcribed is inserted into the vector downstream of the promoter sequence. Vectors are typically engineered to have an origin of replication, a multiple cloning site, and a selectable marker.
EXAMPLES
Example 1
[0274] Diterpene Synthase Genes.
[0275] Two diterpene synthases are necessary for the conversion of geranylgeranyl diphosphate (GGPP) to manool: a type II and a type I diterpene synthase. In the following examples, several type II and type I diterpene synthase combinations were selected and evaluated for the production of manool. For the type II synthases, five copalyl diphosphate (CPP) synthases were selected:
[0276] SmCPS, NCBI accession No ABV57835.1, from Salvia miltiorrhiza.
[0277] CfCPS1, NCBI accession No AHW04046.1, from Coleus forskohlii.
[0278] TaTps1, NCBI accession No BAH56559.1, from Triticum aestivum.
[0279] MvCps3, NCBI accession No AIE77092.1, from Marrubium vulgare.
[0280] RoCPS1, NCBI accession No AHL67261.1, from Rosmarinus officinalis.
[0281] The codon usage of the cDNAs encoding the five CPP synthases was modified for optimal expression in E. coli (DNA 2.0, Menlo Park, Calif. 94025) and the NdeI and KpnI restriction sites were added at the 5'-end and 3'-end, respectively. In addition, the cDNAs were designed to express the recombinant CPP synthases with deletion of the predicted signal peptide (58, 63, 59, 63 and 67 amino acids for SmCPS, CfCPS1, TaTps1, MvCps3 and RoCPS1, respectively).
[0282] For the type I diterpene synthase, the sclareol synthase from Salvia sclarea (SsScS) was used (NCBI accession No AET21246.1, WO2009095366). The codon usage of the cDNA was optimized for E. coli expression (DNA 2.0, Menlo Park, Calif. 94025), the first 50 N-terminal codons were removed and the NdeI and KpnI restriction sites were added at the 5'-end and 3'-end, respectively. All the cDNAs were synthesized in vitro and cloned in the pJ208 or pJ401 plasmid (DNA 2.0, Menlo Park, Calif. 94025, USA).
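The N-terminal truncations listed above can be expressed as a short sketch. The deletion lengths are those given in this Example; the function name, the placeholder protein sequence and the optional addition of a start methionine to the truncated protein are assumptions made for illustration and are not stated in the text.

```python
# Number of N-terminal residues removed for each CPP synthase (from the text).
SIGNAL_PEPTIDE_LENGTH = {
    "SmCPS": 58,
    "CfCPS1": 63,
    "TaTps1": 59,
    "MvCps3": 63,
    "RoCPS1": 67,
}


def remove_signal_peptide(protein: str, n_removed: int,
                          add_start_met: bool = True) -> str:
    """Delete the first n_removed residues of a protein sequence.

    Prepending a methionine when the truncated sequence does not start with
    one is an assumption of this sketch, made so that the construct has a
    start codon.
    """
    if n_removed >= len(protein):
        raise ValueError("cannot remove the entire sequence")
    mature = protein[n_removed:]
    if add_start_met and not mature.startswith("M"):
        mature = "M" + mature
    return mature


# Placeholder sequence of 100 residues, not a real CPP synthase.
placeholder = "M" + "S" * 62 + "A" * 37
truncated = remove_signal_peptide(placeholder, SIGNAL_PEPTIDE_LENGTH["CfCPS1"])
print(len(placeholder), len(truncated))  # 100 38
```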
Example 2
[0283] Expression Plasmids.
[0284] The modified SmCPS-encoding cDNA (SmCPS2) and sclareol synthase (SsScS)-encoding cDNA (1132-2-5_opt) were digested with NdeI and KpnI and ligated into the pETDuet-1 plasmid providing the pETDuet-SmCPS2 and pETDuet-1132opt expression plasmids, respectively.
[0285] Another plasmid was constructed to co-express the SmCPS2 and SsScS enzymes together with a geranylgeranyl diphosphate (GGPP) synthase. For the GGPP synthase, the CrtE gene from Pantoea agglomerans (NCBI accession M38424.1) encoding for a GGPP synthase (NCBI accession number AAA24819.1) was used. The CrtE gene was synthesized with codon optimization and addition of the NcoI and BamHI restriction enzyme recognition sites at the 3' and 5' ends (DNA 2.0, Menlo Park, Calif. 94025, USA) and ligated between the NcoI and BamHI sites of the pETDuet-1 plasmid to obtain the pETDuet-CrtE plasmid. The SmCPS2 encoding cDNA was digested with NdeI and KpnI and ligated into the pETDuet-CrtE plasmid thus providing the pETDuet-CrtE-SmCPS2 construct. The optimized cDNA (1132-2-5_opt) encoding for the truncated SsScS was then introduced in the pETDuet-CrtE-SmCPS2 plasmid using the In-Fusion.RTM. technique (Clontech, Takara Bio Europe). For this cloning, the pETDuet-1132opt was used as template in a PCR amplification using the forward primer SmCPS2-1132Inf_F1 5'-CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGCGAAAATGAAGGAGAACTTTAAACG-3' (SEQ ID NO: 9) and the reverse primer 1132-pET_Inf_R1 5'-GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCATGTCCTCT-3' (SEQ ID NO: 10). The PCR product was ligated in the plasmid pETDuet-CrtE-SmCPS2 digested with the KpnI and XhoI restriction enzymes and using the In-Fusion.RTM. Dry-Down PCR Cloning Kit (Clontech, Takara Bio Europe), providing the new plasmid pETDuet-CrtE-SmCPS2-SsScS. In this plasmid the CrtE gene is under the control of the first T7 promoter of the pETDuet plasmid and the CPP synthase and sclareol synthase encoding cDNAs are organized in a bi-cistronic construct under the control of the second T7 promoter.
[0286] The pETDuet-CrtE-SmCPS2-SsScS plasmid was used as template for construction of new expression plasmids carrying the cDNAs encoding the four other CPP synthases. The SmCPS2 cDNA was replaced by one of the four new CPP synthase encoding cDNAs using an NdeI-KpnI restriction digestion-ligation approach providing the new plasmids pETDuet-CrtE-CfCPS1del63-SsScS, pETDuet-CrtE-TaTps1del59-SsScS, pETDuet-CrtE-MvCps3del63-SsScS and pETDuet-CrtE-RoCPS1del67-SsScS.
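As a simple consistency check on the cloning strategy described above, the two In-Fusion primers can be scanned for the recognition sequences of the restriction enzymes named in this Example. The primer sequences are SEQ ID NO: 9 and SEQ ID NO: 10 as given in the text; the recognition sequences are the standard ones for these enzymes; the function name find_sites is hypothetical.

```python
# Standard recognition sequences of the enzymes named in Examples 1 and 2.
SITES = {
    "NdeI": "CATATG",
    "KpnI": "GGTACC",
    "XhoI": "CTCGAG",
    "NcoI": "CCATGG",
    "BamHI": "GGATCC",
}


def find_sites(seq: str) -> dict[str, list[int]]:
    """Return 0-based start positions of each recognition site found in seq."""
    seq = "".join(seq.split()).upper()  # drop any whitespace from copy-paste
    hits: dict[str, list[int]] = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


forward_primer = ("CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGCGAAAATG"
                  "AAGGAGAACTTTAAACG")                      # SEQ ID NO: 9
reverse_primer = ("GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCATGTCCTCT")  # SEQ ID NO: 10

print(find_sites(forward_primer))
print(find_sites(reverse_primer))
```

The forward primer contains a KpnI site and the reverse primer an XhoI site, which is consistent with insertion of the PCR product into the KpnI/XhoI-digested pETDuet-CrtE-SmCPS2 plasmid.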
Example 3
[0287] Heterologous Expression in E. coli and Enzymatic Activities.
[0288] The expression plasmids (pETDuet-SmCPS2 or pETDuet-1132opt) were used to transform BL21(DE3) E. coli cells (Novagen, Madison, Wis.). Single colonies of transformed cells were used to inoculate 25 ml LB medium. After 5 to 6 hours incubation at 37.degree. C., the cultures were transferred to a 20.degree. C. incubator and left 1 hour for equilibration. Expression of the protein was then induced by the addition of 0.1 mM IPTG and the culture was incubated over-night at 20.degree. C. The next day, the cells were collected by centrifugation, re-suspended in 0.1 volume of 50 mM MOPSO (3-morpholino-2-hydroxypropanesulfonic acid sodium salt, 3-(N-morpholinyl)-2-hydroxypropanesulfonic acid sodium salt) buffer at pH 7, 10% glycerol, 1 mM DTT and lysed by sonication. The extracts were cleared by centrifugation (30 min at 20,000 g) and the supernatants containing the soluble proteins were used for further experiments.
Example 4
[0289] In Vitro Diterpene Synthase Activity Assays.
[0290] Enzymatic assays were performed in Teflon sealed glass tubes using 50 to 100 .mu.l of protein extract in a final volume of 1 mL of 50 mM MOPSO pH 7, 10% glycerol supplemented with 20 mM MgCl.sub.2 and 50 to 200 .mu.M purified geranylgeranyl diphosphate (GGPP) (prepared as described by Keller and Thompson, J. Chromatogr, 1993, 645(1):161-167). The tubes were incubated 5 to 48 hours at 30.degree. C. and the enzyme products were extracted twice with one volume of pentane. After concentration under a nitrogen flux, the extracts were analyzed by GC-MS and compared to extracts from control proteins (obtained from cells transformed with the empty plasmid). GC-MS analyses were performed on an Agilent 6890 series GC system equipped with a DB1 column (30 m.times.0.25 mm.times.0.25 .mu.m film thickness; Agilent) and coupled with a 5975 series mass spectrometer. The carrier gas was helium at a constant flow of 1 ml/min. Injection was in split-less mode with the injector temperature set at 260.degree. C. and the oven temperature was programmed from 100.degree. C. to 225.degree. C. at 10.degree. C./min and to 280.degree. C. at 30.degree. C./min. The identities of the products were confirmed based on the concordance of the retention indices and mass spectra with those of authentic standards.
[0291] In these conditions and with the recombinant protein from E. coli cells transformed with the plasmids pETDuet-SmCPS2 or pETDuet-1132opt (heterologously expressing the SmCPS or SsScS enzymes, respectively) no production of diterpene molecules was detected in the solvent extracts (the diphosphate-containing diterpenes are not detected in these conditions). Similar assays were then performed but combining the 2 protein extracts containing the recombinant SmCPS and SsScS in a single assay. In these assays, one major product was formed and was identified as being (+)-manool by matching of the mass spectrum and retention index with authentic standards (FIG. 3). This experiment demonstrated that a sclareol synthase can be used together with a CPP synthase to produce manool.
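For orientation, the duration of the oven temperature program described above can be worked out with a few lines of Python. The sketch counts only the two temperature ramps; since no hold times are stated for this program, none are assumed, and the function name is hypothetical.

```python
def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Time in minutes needed to go from start_c to end_c at a constant rate."""
    if rate_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return (end_c - start_c) / rate_c_per_min


# (start in degrees C, end in degrees C, rate in degrees C per minute)
oven_program = [
    (100, 225, 10),  # first ramp: 12.5 min
    (225, 280, 30),  # second ramp: about 1.83 min
]

total_minutes = sum(ramp_minutes(*segment) for segment in oven_program)
print(f"{total_minutes:.2f} min")  # 14.33 min
```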
Example 5
[0292] In Vivo Manool Production Using E. coli Cells.
[0293] The in vivo production of manool using cultures of whole cells was evaluated using E. coli cells. The CrtE gene inserted in the co-expression plasmids described in Example 2 encodes for an enzyme having GGPP synthase activity that uses farnesyl-diphosphate (FPP) to produce geranylgeranyl diphosphate (GGPP). To increase the level of the endogenous GGPP pool and therefore the productivity in diterpene of the cells, a heterologous complete mevalonate pathway leading to FPP was co-expressed in the same cells. The enzymes of this pathway were expressed using a single plasmid containing all the genes organized in two operons under the control of two promoters. The construction of this expression plasmid is described in patent application WO2013064411 or in Schalk et al. (J. Am. Chem. Soc., 2013, 134:18900-18903). Briefly, a first synthetic operon consisting of an E. coli acetoacetyl-CoA thiolase (atoB), a Staphylococcus aureus HMG-CoA synthase (mvaS), a Staphylococcus aureus HMG-CoA reductase (mvaA) and a Saccharomyces cerevisiae FPP synthase (ERG20) genes was synthetized in vitro (DNA2.0, Menlo Park, Calif., USA) and ligated into the NcoI-BamHI digested pACYCDuet-1 vector (Invitrogen) yielding pACYC-29258. A second operon containing a mevalonate kinase (MvaK1), a phosphomevalonate kinase (MvaK2), a mevalonate diphosphate decarboxylase (MvaD), and an isopentenyl diphosphate isomerase (idi) was amplified from genomic DNA of Streptococcus pneumoniae (ATCC BAA-334) and ligated into the second multicloning site of pACYC-29258 providing the plasmid pACYC-29258-4506. This plasmid thus contains the genes encoding all enzymes of the biosynthetic pathway leading from acetyl-coenzyme A to FPP.
[0294] KRX E. coli cells (Promega) were co-transformed with the plasmid pACYC-29258-4506 and one plasmid selected from pETDuet-CrtE-SmCPS2-SsSc, pETDuet-CrtE-CfCPS1del63-SsScS, pETDuet-CrtE-TaTps1del59-SsScS, pETDuet-CrtE-MvCps3del63-SsScS, or pETDuet-CrtE-RoCPS1del67-SsScS. Transformed cells were selected on carbenicillin (50 .mu.g/ml) and chloramphenicol (34 .mu.g/ml) LB-agarose plates. Single colonies were used to inoculate 5 mL liquid LB medium supplemented with the same antibiotics. The cultures were incubated overnight at 37.degree. C. The next day 2 mL of TB medium supplemented with the same antibiotics were inoculated with 0.2 mL of the overnight culture. After 6 hours incubation at 37.degree. C., the culture was cooled down to 28.degree. C. and 0.1 mM IPTG, 0.2% rhamnose and 1:10 volume of decane were added to each tube. The cultures were incubated for 48 hours at 28.degree. C. The cultures were then extracted twice with 2 volumes of MTBE (Methyl tert-butyl ether), the organic phase were concentrated to 500 .mu.L and analyzed by GC-MS as described above in Example 4 except for the oven temperature which was 1 min hold at 100.degree. C., followed by a temperature gradient of 10.degree. C./min to 220.degree. C. and 20.degree. C./min and to 3000.degree. C.
[0295] Under these culture conditions, manool was produced with each combination of type II diterpene synthase and the Salvia sclarea sclareol synthase (SsScS) (FIGS. 4 and 5). The amounts of diterpene compounds produced were quantified using an internal standard (alpha-longipinene). The table below shows the quantities of manool produced relative to the SmCPS/SsScS combination when the SsScS is combined with various type II diterpene synthases (under these experimental conditions, the concentration of manool produced by cells expressing the SmCPS and the SsScS was 300 to 500 mg/L (FIG. 4)). Under these conditions, the highest relative quantity of manool was produced with the TaTps1del59 combination.
TABLE-US-00001
Type II diterpene synthase    Type I diterpene synthase    Relative quantity of manool produced
SmCPS2                        SsScS                        100
CfCPS1del63                   SsScS                        125.3
TaTps1del59                   SsScS                        139.4
MvCps3del63                   SsScS                        14.9
RoCPS1del67                   SsScS                        77.7
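For orientation, the relative quantities in the table can be turned into approximate concentration ranges by scaling the 300 to 500 mg/L range reported for the SmCPS/SsScS reference. The sketch below assumes that the relative values scale linearly with that range, which the text does not state explicitly; the function name is hypothetical.

```python
# Range reported for the SmCPS/SsScS reference combination (mg/L).
REFERENCE_RANGE_MG_PER_L = (300.0, 500.0)

# Relative quantities of manool (reference = 100), from the table.
RELATIVE_MANOOL = {
    "SmCPS2": 100.0,
    "CfCPS1del63": 125.3,
    "TaTps1del59": 139.4,
    "MvCps3del63": 14.9,
    "RoCPS1del67": 77.7,
}


def estimated_range(relative_percent, reference_range=REFERENCE_RANGE_MG_PER_L):
    """Scale the reference range by a relative quantity given in percent."""
    low, high = reference_range
    return (low * relative_percent / 100.0, high * relative_percent / 100.0)


for synthase, relative in RELATIVE_MANOOL.items():
    low, high = estimated_range(relative)
    print(f"{synthase:12s} approx. {low:6.1f} to {high:6.1f} mg/L")
```

These figures are estimates derived from the table only and are not measured values.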
Example 6
[0296] Production of (+)-Manool Using Recombinant Cells, Purification and NMR Analysis.
[0297] One litre of E. coli culture was prepared in the conditions described in Example 5, using the SmCPS/SsScS enzyme combination, except that the decane organic phase was replaced by 50 g/L Amberlite XAD-4 for solid phase extraction. The culture medium was filtered to recover the resin. The resin was then washed with 3 column volumes of water, and eluted using 3 column volumes of MTBE. The product was then further purified by flash chromatography on silica gel using a mobile phase composed of heptane:MTBE 8:2 (v/v). The structure of manool was confirmed by 1H- and 13C-NMR using a Bruker Avance 500 MHz spectrometer. The optical rotation was measured using a Perkin-Elmer 241 polarimeter and the value of [.alpha.].sup.D.sub.20=+26.9.degree. (0.3%, CHCl.sub.3) confirmed the production of (+)-manool.
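The relationship between the specific rotation quoted above and the angle read on the polarimeter can be sketched as follows, using the conventional definition [alpha] = 100 x alpha_obs / (l x c), with l in dm and c in g/100 mL. The 1 dm path length and the reading of "0.3%" as 0.3 g/100 mL are assumptions made for the example; the function names are hypothetical.

```python
def specific_rotation(observed_deg, path_dm, conc_g_per_100ml):
    """Specific rotation from an observed angle (conventional definition)."""
    return 100.0 * observed_deg / (path_dm * conc_g_per_100ml)


def observed_rotation(specific_deg, path_dm, conc_g_per_100ml):
    """Observed angle expected for a given specific rotation."""
    return specific_deg * path_dm * conc_g_per_100ml / 100.0


# Assumed cell length (1 dm) and concentration (0.3 g/100 mL in CHCl3).
alpha_obs = observed_rotation(26.9, path_dm=1.0, conc_g_per_100ml=0.3)
print(f"Expected polarimeter reading: {alpha_obs:+.4f} degrees")
print(f"Back-calculated specific rotation: {specific_rotation(alpha_obs, 1.0, 0.3):+.1f}")
```

Under these assumptions the measured angle is small (under 0.1 degree), and its positive sign is what identifies the (+)-enantiomer.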
Example 7
[0298] In Vivo Manool Production in E. coli Cells Using Sclareol Synthases from Nicotiana glutinosa.
[0299] Sclareol synthases from the plant Nicotiana glutinosa are described in WO 2014/022434 and are shown to produce sclareol from labdenediol diphosphate (LPP). Two of the sclareol synthases described in WO 2014/022434, NgSCS-del29 (corresponding to SEQ ID NO: 78 in WO 2014/022434) and NgSCS-del38 (corresponding to SEQ ID NO: 40 of WO 2014/022434), were evaluated for the production of (+)-manool under conditions similar to Example 5.
[0300] A cDNA encoding NgSCS-del29 was designed with a codon usage optimal for E. coli expression and including the KpnI and XhoI sites at the 5'-end and 3'-end, respectively. This DNA was synthesized by DNA 2.0 (Newark, CA 94560).
[0301] The pETDuet-CrtE-SmCPS2-SsScS plasmid (Example 2) was used as template for construction of a new expression plasmid. The pETDuet-CrtE-SmCPS2-SsScS plasmid was digested with the KpnI and XhoI restriction enzymes to replace the SsScS cDNA with the NgSCS-del29 cDNA, providing the new pETDuet-CrtE-SmCPS2-del29 plasmid.
[0302] KRX E. coli cells (Promega) were co-transformed with the plasmid pACYC-29258-4506 (Example 5) and the pETDuet-CrtE-SmCPS2-del29 plasmid. Transformed cells were selected and cultivated in conditions for production of diterpene as described in Example 5. The production of diterpenes was evaluated using GC-MS analysis and the diterpene compounds produced were quantified using an internal standard (alpha-longipinene). With the new combination of the diterpene synthases SmCPS2 and NgSCS-del29, manool was produced by transformed E. coli cells (FIG. 6). The combination of the diterpene synthases SmCPS2 and NgSCS-del38 did not produce manool under the experimental conditions used. Thus at least one of the Nicotiana glutinosa sclareol synthases tested can also be used to produce manool from CPP. However, the quantities produced using the Nicotiana glutinosa synthase were much lower than with the SsScS synthase (see table below).
TABLE-US-00002
Type II diterpene synthase    Type I diterpene synthase    Relative quantity of manool produced
SmCPS2                        SsScS                        100
SmCPS2                        NgSCS-del29                  3.1
Example 8
[0303] The manool obtained in the above examples was converted into its esters according to the following experimental procedure (exemplified here below by conversion into its acetate):
##STR00001##
Following the literature (G. Ohloff, Helv. Chim. Acta 41, 845 (1958)), 32.0 g (0.11 mole) of pure crystalline (+)-Manool were treated with 20.0 g (0.25 mole) of acetyl chloride in 100 ml of dimethyl aniline for 5 days at room temperature. The mixture was additionally heated for 7 hours at 50.degree. to reach 100% conversion. After cooling, the reaction mixture was diluted with ether, washed successively with 10% H.sub.2SO.sub.4, aqueous NaHCO.sub.3 and water to neutrality. After drying (Na.sub.2SO.sub.4) and concentration, the product was distilled (bulb-to-bulb, B.p.=160.degree., 0.1 mbar) to give 20.01 g (79.4%) of Manool Acetate which was used without further purification.
[0304] MS: M.sup.+ 332 (0); m/e: 272 (27), 257 (83), 137 (62), 95 (90), 81 (100).
[0305] .sup.1H-NMR (CDCl.sub.3): 0.67, 0.80, 0.87, 1.54 and 2.01 (5 s, 3H each), 4.49 (s, 1H), 4.80 (s, 1H), 5.11 (m, 1H), 5.13 (m, 1H), 5.95 (m, 1H).
[0306] .sup.13C-NMR (CDCl.sub.3): 14.5 (q), 17.4 (t), 19.4 (t), 21.7 (q), 22.2 (q), 23.5 (q), 24.2 (t), 33.5 (s), 33.6 (t), 38.3 (t), 39.0 (t), 39.3 (t), 39.8 (s), 42.2 (t), 55.6 (d), 57.2 (t), 83.4 (s), 106.4 (t), 113.0 (t), 142.0 (d), 148.6 (s), 169.9 (s).
Example 9
[0307] The manool acetate obtained in the above examples was converted into its trienes according to the following experimental procedure (exemplified here below by conversion into Sclarene and (Z+E)-Biformene):
##STR00002##
To a solution of 0.4 g of Manool Acetate in 4 ml of cyclohexane at room temperature was added 0.029 g (0.05 eq.) of BF.sub.3.AcOH complex. After 15 minutes at room temperature, the reaction was quenched with aqueous NaHCO.sub.3 and washed with water to neutrality. GC-MS analysis showed only hydrocarbons which were identified as Sclarene, (Z) and (E)-biformene. No Copalol Acetate was detected. Another trial with more catalyst (0.15 eq) gave the same result.
[0308] Sclarene: MS: M.sup.+ 272 (18); m/e: 257 (100), 149 (15), 105 (15).
[0309] (Z) and (E)-Biformene (identical spectra): MS: M.sup.+ 272 (29); m/e: 257 (100), 187 (27), 161 (33), 105 (37).
Example 10
[0310] The manool obtained in the above examples was converted into Copalyl esters according to the following experimental procedure (exemplified here below by conversion into the acetate):
##STR00003##
To a solution of 0.474 g (0.826 mmole, 0.27 eq.) of BF.sub.3.AcOH in 100 ml of cyclohexane at room temperature was added 4.4 g of acetic anhydride and 12.1 g of acetic acid. At room temperature, 10.0 g (33 mmole) of pure crystalline Manool in 15 ml of cyclohexane were added (sl. exothermic) and the temperature was maintained at room temperature using a water bath. After 30 min. of stirring at room temperature, a GC control showed no starting material. The reaction mixture was quenched with 300 ml of aq. saturated NaHCO.sub.3 and treated as usual. The crude mixture (9.9 g) was purified by flash chromatography (SiO2, pentane/ether 95:5) and bulb-to-bulb distillation (Eb.=130.degree., 0.1 mbar) to give 4.34 g (37.1%) of a 27/73 mixture of (Z) and (E)-Copalyl Acetate.
[0311] (Z)-Copalyl Acetate:
[0312] MS: M.sup.+ 332 (0); m/e: 317 (2), 272 (35), 257 (100), 137 (48), 95 (68), 81 (70).
[0313] .sup.1H-NMR (CDCl.sub.3): 0.67, 0.80, 0.87, 1.76 and 2.04 (5s, 3H each), 4.86 (s, 1H), 5.35 (t: J=6 Hz, 1H).
[0314] (E)-Copalyl Acetate:
[0315] MS: M.sup.+ 332 (0); m/e: 317 (2), 272 (33), 257 (100), 137 (54), 95 (67), 81 (74).
[0316] .sup.1H-NMR (CDCl.sub.3): 0.68, 0.80, 0.87, 1.70 and 2.06 (5s, 3H each), 4.82 (s, 1H), 5.31 (t: J=6 Hz, 1H).
[0317] .sup.13C-NMR (CDCl.sub.3): (Spectrum recorded on (Z/E) mixture, only significant signals are given): 61.4 (t), 106.2 (t), 117.9 (d), 143.1 (s), 148.6 (s), 171.1 (s).
Example 11
[0318] The copalyl acetate obtained in the above examples was converted into Copalol according to the following experimental procedure:
##STR00004##
Copalyl Acetate (4.17 g, 12.5 mmole), KOH pellets (3.35 g, 59.7 mmole), water (1.5 g) and EtOH (9.5 ml) were mixed together and stirred for 3 hours at 50.degree.. After usual workup, 3.7 g of crude (Z+E)-Copalol were obtained and purified by flash chromatography (SiO2, pentane/ether 7:2). After evaporation of the solvent, a bulb-to-bulb distillation (Eb=170.degree., 0.1 mbar) furnished 3.25 g (92%) of a 27/73 mixture of (Z) and (E)-Copalol.
[0319] (Z)-Copalol
[0320] MS: M.sup.+ 290 (3); m/e: 275 (18), 272 (27), 257 (82), 137 (71), 95 (93), 81 (100), 69 (70).
[0321] .sup.1H-NMR (CDCl.sub.3): 0.67, 0.80, 0.87 and 1.74 (4s, 3H each); 4.06 (m, 2H), 4.55 (s, 1H), 4.86 (s, 1H), 5.42 (t: J=6 Hz, 1H).
[0322] (E)-Copalol
[0323] MS: M.sup.+ 290 (3); m/e: 275 (27), 272 (22), 257 (75), 137 (75), 95 (91), 81 (100), 69 (68).
[0324] .sup.1H-NMR (CDCl.sub.3): 0.68, 0.80, 0.87 and 1.67 (4s, 3H each); 4.15 (m, 2H), 4.51 (s, 1H), 4.83 (s, 1H), 5.39 (t: J=6 Hz, 1H).
[0325] .sup.13C-NMR (CDCl.sub.3): (Spectrum recorded on (Z/E) mixture, only significant signals are given): 59.4 (t), 106.2 (t), 123.0 (d), 140.6 (s), 148.6 (s).
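The mass spectra reported in Examples 8, 10 and 11 can be read in terms of simple neutral losses from the molecular ion. The sketch below computes these differences from the listed m/e values; assigning the nominal masses 15, 18 and 60 to a methyl group, water and acetic acid is a common interpretive assumption and is not stated in the examples. The function name is hypothetical.

```python
from itertools import combinations_with_replacement

# Nominal masses of common neutral losses (interpretive assumption).
NEUTRAL_LOSSES = {15: "CH3", 18: "H2O", 60: "CH3COOH"}


def explain_fragment(parent, fragment, losses=NEUTRAL_LOSSES, max_losses=2):
    """Return the neutral losses summing to parent - fragment, or None."""
    difference = parent - fragment
    for count in range(1, max_losses + 1):
        for combo in combinations_with_replacement(sorted(losses), count):
            if sum(combo) == difference:
                return [losses[mass] for mass in combo]
    return None


# Acetates (M+ 332) and copalol (M+ 290), with fragments from the spectra.
for parent, fragments in ((332, (317, 272, 257)), (290, (275, 272, 257))):
    for fragment in fragments:
        print(parent, "->", fragment, ":", explain_fragment(parent, fragment))
```

Read this way, the acetates and copalol both arrive at m/e 272 and 257, consistent with a shared hydrocarbon skeleton after loss of acetic acid or water.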
Example 12
[0326] In Vivo Manool Production in Saccharomyces cerevisiae Cells Using Different Combinations of CPP Synthases and Sclareol Synthases.
[0327] Different combinations of class I and class II diterpene synthases were evaluated for the production of manool in S. cerevisiae cells.
[0328] For the class II diterpene synthase, five CPP synthases were selected:
[0329] SmCPS, NCBI accession No ABV57835.1, from Salvia miltiorrhiza.
[0330] CfCPS1, NCBI accession No AHW04046.1, from Coleus forskohlii.
[0331] TaTps1, NCBI accession No BAH56559.1, from Triticum aestivum.
[0332] MvCps3, NCBI accession No AIE77092.1, from Marrubium vulgare.
[0333] RoCPS1, NCBI accession No AHL67261.1, from Rosmarinus officinalis.
[0334] For the class I diterpene synthase, two putative sclareol synthases from Nicotiana glutinosa and one from Salvia sclarea were selected:
[0335] NgSCS-del38 (corresponding to SEQ ID NO: 40 of WO 2014/022434).
[0336] NgSCS-del29 (corresponding to SEQ ID NO: 78 of WO 2014/022434).
[0337] SsScS, NCBI accession No AET21246.1, from Salvia sclarea.
[0338] The codon usage of the DNA encoding the different CPP synthases was modified for optimal expression in S. cerevisiae. In addition, the DNA sequences were designed to express the recombinant CPP synthase with deletion of the predicted signal peptide (58, 63, 59, 63 and 67 amino acids for SmCPS, CfCPS1, TaTps1, MvCps3 and RoCPS1, respectively). The NgSCS-del38, NgSCS-del29 and SsScS DNA sequences were also codon optimized for S. cerevisiae expression.
[0339] For expression of the different genes in S. cerevisiae, a set of plasmids was constructed in vivo using yeast endogenous homologous recombination as previously described in Kuijpers et al., Microb Cell Fact., 2013, 12:47. Each plasmid is composed of six DNA fragments which were used for S. cerevisiae co-transformation. The fragments were:
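The signal-peptide deletions listed above can be illustrated on the N-terminal residues given in the sequence listing. The sketch below assumes that a truncated protein is obtained by removing the stated number of residues and adding a new start methionine, which is consistent with SEQ ID NO: 1 versus 2 and SEQ ID NO: 14 versus 15. Only the first residues are used, the spaces in the listing are treated as line-wrap artifacts, and the function name is hypothetical.

```python
def truncate_with_start_met(full_length, n_deleted):
    """Drop the first n_deleted residues and prepend a start methionine."""
    return "M" + full_length[n_deleted:]


# N-terminal residues of SEQ ID NO: 1 (SmCPS) and SEQ ID NO: 14 (TaTps1).
SMCPS_NTERM = (
    "MASLSSTILSRSPAARRRITPASAKLHRPECFATSAWMGSSSKNLSL"
    "SYQLNHKKISVATVDAPQVHDHDG"
)
TATPS1_NTERM = (
    "MLTFTAALRHVPVLDQPTSEPWRRLSLHLHSQRRPCGLVLISKSPSY"
    "PEVDVGEWKVDEYRQRTDEPSETRQ"
)

smcps2_start = truncate_with_start_met(SMCPS_NTERM, 58)
tatps1_del59_start = truncate_with_start_met(TATPS1_NTERM, 59)

print("SmCPS2 starts with:      ", smcps2_start)
print("TaTps1-del59 starts with:", tatps1_del59_start)
```

The two printed prefixes correspond to the beginnings of SEQ ID NO: 2 and SEQ ID NO: 15 in the sequence listing.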
[0340] a) LEU2 yeast marker, constructed by PCR using the primers 5' AGGTGCAGTTCGCGTGCAATTATAACGTCGTGGCAACTGTTATCAGTCG TACCGCGCCATTCGACTACGTCGTAAGGCC-3' (SEQ ID NO: 44) and 5' TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCGTT GTTGCTGACCATCGACGGTCGAGGAGAACTT-3' (SEQ ID NO: 45) with the plasmid pESC-LEU (Agilent Technologies, California, USA) as template;
[0341] b) AmpR E. coli marker, constructed by PCR using the primers 5'-TGGTCAGCAACAACGCCGAAGAATCACTCTCGTGTTGAGAATTGCACGCC TTGACCACGACACGTTAAGGGATTTTGGTCATGAG-3' (SEQ ID NO: 37) and 5'-AACGCGTACCCTAAGTACGGCACCACAGTGACTATGCAGTCCGCACTTTG CCAATGCCAAAAATGTGCGCGGAACCCCTA-3' (SEQ ID NO: 38) with the plasmid pESC-URA as template;
[0342] c) Yeast origin of replication, obtained by PCR using the primers 5'-TTGGCATTGGCAAAGTGCGGACTGCATAGTCACTGTGGTGCCGTACTTAG GGTACGCGTTCCTGAACGAAGCATCTGTGCTTCA-3' (SEQ ID NO: 39) and 5'-CCGAGATGCCAAAGGATAGGTGCTATGTTGATGACTACGACACAGAACTG CGGGTGACATAATGATAGCATTGAAGGATGAGACT-3' (SEQ ID NO: 40) with pESC-URA as template;
[0343] d) E. coli replication origin, obtained by PCR using the primers 5'-ATGTCACCCGCAGTTCTGTGTCGTAGTCATCAACATAGCACCTATCCTTTG GCATCTCGGTGAGCAAAAGGCCAGCAAAAGG-3' (SEQ ID NO: 41) and 5'-CTCAGATGTACGGTGATCGCCACCATGTGACGGAAGCTATCCTGACAGTG TAGCAAGTGCTGAGCGTCAGACCCCGTAGAA-3' (SEQ ID NO: 42) with the plasmid pESC-URA as template;
[0344] e) a fragment composed of the last 60 nucleotides of the fragment "d", 200 nucleotides downstream of the stop codon of the yeast gene PGK1, the GGPP synthase coding sequence CrtE (from Pantoea agglomerans, NCBI accession M38424.1) codon optimized for its expression in S. cerevisiae, the bidirectional yeast promoter of GAL10/GAL1, one of the tested sclareol synthase coding sequences, 200 nucleotides downstream of the stop codon of the yeast gene CYC1 and the sequence 5'-ATTCCTAGTGACGGCCTTGGGAACTCGATACACGATGTTCAGTAGACCGC TCACACATGG-3' (SEQ ID NO: 43); this fragment was obtained by DNA synthesis (DNA 2.0, Menlo Park, Calif. 94025); and
[0345] f) a fragment composed of the last 60 nucleotides of fragment "e", 200 nucleotides downstream of the stop codon of the yeast gene CYC1, one of the tested CPP synthase coding sequences, the bidirectional yeast promoter of GAL10/GAL1 and 60 nucleotides corresponding to the beginning of the fragment "a"; this fragment was obtained by DNA synthesis (DNA 2.0, Menlo Park, Calif. 94025).
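The in vivo assembly described above relies on adjacent fragments sharing terminal homology. As a minimal sketch, the check below tests whether the first 60 nucleotides of the reverse primer of one fragment are the reverse complement of the first 60 nucleotides of the forward primer of the next fragment, for the junctions between fragments "a", "b", "c" and "d". The primer sequences are copied from SEQ ID NO: 45, 37, 38, 39, 40 and 41 with the internal spaces removed (assumed to be line-wrap artifacts); the 60-nucleotide tail length and the function names are assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def tails_are_homologous(reverse_primer, forward_primer, tail_len=60):
    """True if the two primer tails describe the same double-stranded arm."""
    return reverse_complement(forward_primer[:tail_len]) == reverse_primer[:tail_len]


SEQ45 = "TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCGTTGTTGCTGACCATCGACGGTCGAGGAGAACTT"
SEQ37 = "TGGTCAGCAACAACGCCGAAGAATCACTCTCGTGTTGAGAATTGCACGCCTTGACCACGACACGTTAAGGGATTTTGGTCATGAG"
SEQ38 = "AACGCGTACCCTAAGTACGGCACCACAGTGACTATGCAGTCCGCACTTTGCCAATGCCAAAAATGTGCGCGGAACCCCTA"
SEQ39 = "TTGGCATTGGCAAAGTGCGGACTGCATAGTCACTGTGGTGCCGTACTTAGGGTACGCGTTCCTGAACGAAGCATCTGTGCTTCA"
SEQ40 = "CCGAGATGCCAAAGGATAGGTGCTATGTTGATGACTACGACACAGAACTGCGGGTGACATAATGATAGCATTGAAGGATGAGACT"
SEQ41 = "ATGTCACCCGCAGTTCTGTGTCGTAGTCATCAACATAGCACCTATCCTTTGGCATCTCGGTGAGCAAAAGGCCAGCAAAAGG"

# (reverse primer of upstream fragment, forward primer of downstream fragment)
JUNCTIONS = {
    "a/b": (SEQ45, SEQ37),
    "b/c": (SEQ38, SEQ39),
    "c/d": (SEQ40, SEQ41),
}

junction_ok = {name: tails_are_homologous(rev, fwd) for name, (rev, fwd) in JUNCTIONS.items()}
for name, ok in junction_ok.items():
    print(f"junction {name}: 60-nt homology {'found' if ok else 'not found'}")
```

A matching pair means that the two PCR products end in the same 60 base pairs, which is the overlap that yeast homologous recombination uses to join them.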
[0346] In total, 15 plasmids were constructed, which cover all the possible combinations of class I and class II diterpene synthases listed above. The table below shows all the plasmids.
TABLE-US-00003
Plasmid name    Class II diterpene synthase    Class I diterpene synthase
Sm              SmCPS2                         SsScS
Cf              CfCPS1del63                    SsScS
Mv              MvCps3del63                    SsScS
Ro              RoCPS1del67                    SsScS
Ta              TaTps1del59                    SsScS
Nt_Sm           SmCPS2                         NgSCS-del38
Nt_Cf           CfCPS1del63                    NgSCS-del38
Nt_Mv           MvCps3del63                    NgSCS-del38
Nt_Ro           RoCPS1del67                    NgSCS-del38
Nt_Ta           TaTps1del59                    NgSCS-del38
Nt2_Sm          SmCPS2                         NgSCS-del29
Nt2_Cf          CfCPS1del63                    NgSCS-del29
Nt2_Mv          MvCps3del63                    NgSCS-del29
Nt2_Ro          RoCPS1del67                    NgSCS-del29
Nt2_Ta          TaTps1del59                    NgSCS-del29
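The count of 15 plasmids follows from pairing each of the five class II synthases with each of the three class I synthases. A minimal sketch of that enumeration, listed in the same order as the table (class I synthase varying slowest), is given below; the variable names are hypothetical.

```python
from itertools import product

CLASS_II = ["SmCPS2", "CfCPS1del63", "MvCps3del63", "RoCPS1del67", "TaTps1del59"]
CLASS_I = ["SsScS", "NgSCS-del38", "NgSCS-del29"]

# Outer loop over class I synthases, inner loop over class II synthases.
plasmid_combinations = [(cps, scs) for scs, cps in product(CLASS_I, CLASS_II)]

for cps, scs in plasmid_combinations:
    print(f"{cps:12s} + {scs}")
print("Total combinations:", len(plasmid_combinations))
```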
[0347] To increase the level of the endogenous farnesyl-diphosphate (FPP) pool in S. cerevisiae cells, an extra copy of all the yeast endogenous genes involved in the mevalonate pathway, from ERG10 coding for acetyl-CoA C-acetyltransferase to ERG20 coding for FPP synthetase, was integrated in the genome of the S. cerevisiae strain CEN.PK2-1C (Euroscarf, Frankfurt, Germany) under the control of galactose-inducible promoters, in a manner similar to that described in Paddon et al., Nature, 2013, 496:528-532. Briefly, three cassettes were integrated in the LEU2, TRP1 and URA3 loci, respectively. The first cassette contained the genes ERG20 and a truncated HMG1 (tHMG1), as described in Donald et al., Proc Natl Acad Sci USA, 1997, 109:E111-8, under the control of the bidirectional promoter GAL10/GAL1, and the genes ERG19 and ERG13, also under the control of the GAL10/GAL1 promoter; the cassette was flanked by two 100-nucleotide regions corresponding to the up- and down-stream sections of LEU2. In the second cassette, the genes IDI1 and tHMG1 were under the control of the GAL10/GAL1 promoter and the gene ERG13 was under the control of the promoter region of GAL7; the cassette was flanked by two 100-nucleotide regions corresponding to the up- and down-stream sections of TRP1. The third cassette contained the genes ERG10, ERG12, tHMG1 and ERG8, all under the control of GAL10/GAL1 promoters; the cassette was flanked by two 100-nucleotide regions corresponding to the up- and down-stream sections of URA3. All genes in the three cassettes included 200 nucleotides of their own terminator regions. Also, an extra copy of GAL4 under the control of a mutated version of its own promoter, as described in Griggs and Johnston, Proc Natl Acad Sci USA, 1991, 88:8597-8601, was integrated upstream of the ERG9 promoter region. In addition, the endogenous promoter of ERG9 was replaced by the yeast promoter region of CTR3, generating the strain YST035. Finally, YST035 was mated with the strain CEN.PK2-1D (Euroscarf, Frankfurt, Germany), yielding a diploid strain termed YST045.
[0348] YST045 was transformed with the above described fragments required for in vivo plasmid assembly. Yeast transformations were performed with the lithium acetate protocol as described in Gietz and Woods, Methods Enzymol., 2002, 350:87-96. Transformation mixtures were plated on SmLeu-media containing 6.7 g/L of Yeast Nitrogen Base without amino acids (BD Difco, New Jersey, USA), 1.6 g/L Dropout supplement without leucine (Sigma Aldrich, Missouri, USA), 20 g/L glucose and 20 g/L agar. Plates were incubated for 3-4 days at 30.degree. C. Single cells were used to produce manool in cultures as described in Westfall et al., Proc Natl Acad Sci USA, 2012, 109:E111-118.
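The composition of the selection plates given above is stated per litre and can be scaled to any batch size. The sketch below does this for a hypothetical 0.5 L batch; the batch volume, the dictionary keys and the function name are assumptions made for the example.

```python
# Plate medium composition from the text, in grams per litre.
COMPONENTS_G_PER_L = {
    "Yeast Nitrogen Base without amino acids": 6.7,
    "Dropout supplement without leucine": 1.6,
    "glucose": 20.0,
    "agar": 20.0,
}


def weigh_out(volume_l, recipe=COMPONENTS_G_PER_L):
    """Mass of each component (g) needed for the given volume (L)."""
    return {name: grams_per_l * volume_l for name, grams_per_l in recipe.items()}


batch = weigh_out(0.5)  # hypothetical 0.5 L batch
for name, grams in batch.items():
    print(f"{name}: {grams:.2f} g")
```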
[0349] Under these culture conditions, manool was produced with some combinations of type II and type I diterpene synthases. The production of manool was evaluated using GC-MS analysis and quantified using an internal standard. The table below shows the quantities of manool produced relative to the SmCPS/SsScS combination (under these experimental conditions, the concentration of manool produced by cells expressing the SmCPS and the SsScS was 100 to 250 mg/L, the highest quantity of manool produced).
TABLE-US-00004
Class II diterpene synthase    Class I diterpene synthase    Relative quantity of manool produced
SmCPS2                         SsScS                         100
CfCPS1del63                    SsScS                         67
MvCps3del63                    SsScS                         1
RoCPS1del67                    SsScS                         29
TaTps1del59                    SsScS                         16
SmCPS2                         NgSCS-del38                   0
CfCPS1del63                    NgSCS-del38                   0
MvCps3del63                    NgSCS-del38                   0
RoCPS1del67                    NgSCS-del38                   0
TaTps1del59                    NgSCS-del38                   0
SmCPS2                         NgSCS-del29                   0
CfCPS1del63                    NgSCS-del29                   0
MvCps3del63                    NgSCS-del29                   0
RoCPS1del67                    NgSCS-del29                   0
TaTps1del59                    NgSCS-del29                   0
TABLE-US-00005 Sequence Listing. SEQ ID NO: 1 SmCPS, full-length copalyl diphosphate synthase from Salvia miltiorrhiza MASLSSTILSRSPAARRRITPASAKLHRPECFATSAWMGSSSKNLSL SYQLNHKKISVATVDAPQVHDHDGTTVHQGHDAVKNIEDPIEYIRTL LRTTGDGRISVSPYDTAWVAMIKDVEGRDGPQFPSSLEWIVQNQLED GSWGDQKLFCVYDRLVNTIACVVALRSWNVHAHKVKRGVTYKENVDK LMEGNEEHMTCGFEWFPALLQKAKSLGIEDLPYDSPAVQEVYHVREQ KLKRIPLEIMHKIPTSLLFSLEGLENLDWDKLLKLQSADGSFLTSPS STAFAFMQTKDEKCYQFIKNTIDTFNGGAPHTYPVDWGRLWAIDRLQ RLGISRFFEPEIADCLSHIHKFWTDKGVFSGRESEFCDIDDTSMGMR LMRMHGYDVDPNVLRNFKQKDGKFSCYGGQMIESPSPIYNLYRASQL RFPGEEILEDAKRFAYDFLKEKLANNQILDKWVISKHLPDEIKLGLE MPWLATLPRVEAKYYIQYYAGSGDVWIGKTLYRMPEISNDTYHDLAK TDFKRCQAKHQFEWLYMQEWYESCGIEEFGISRKDLLLSYFLATASI FELERTNERIAWAKSQIIAKMITSFFNKETTSEEDKRALLNELGNIN GLNDTNGAGREGGAGSIALATLTQFLEGFDRYTRHQLKNAWSVWLTQ LQHGEADDAELLTNTLNICAGHIAFREEILAHNEYKALSNLTSKICR QLSFIQSEKEMGVEGEIAAKSSIKNKELEEDMQMLVKLVLEKYGGID RNIKKAFLAVAKTYYYRAYHAADTIDTHMFKVLFEPVA SEQ ID NO: 2 SmCPS2, truncated copalyl diphosphate synthase from S. miltiorrhiza MATVDAPQVHDHDGTTVHQGHDAVKNIEDPIEYIRTLLRTTGDGRIS VSPYDTAWVAMIKDVEGRDGPQFPSSLEWIVQNQLEDGSWGDQKLFC VYDRLVNTIACVVALRSWNVHAHKVKRGVTYIKENVDKLMEGNEEHM TCGFEVVFPALLQKAKSLGIEDLPYDSPAVQEVYHVREQKLKRIPLE IMHKIPTSLLFSLEGLENLDWDKLLKLQSADGSFLTSPSSTAFAFMQ TKDEKCYQFIKNTIDTFNGGAPHTYPVDVFGRLWAIDRLQRLGISRF FEPEIADCLSHIHKFWTDKGVFSGRESEFCDIDDTSMGMRLMRMHGY DVDPNVLRNFKQKDGKFSCYGGQMIESPSPrYNLYRASQLRFPGEEI LEDAKRFAYDFLKEKLANNQILDKWVISKHLPDEIKLGLEMPWLATL PRVEAKYYIQYYAGSGDVWIGKTLYRMPEISNDTYHDLAKTDFKRCQ AKHQFEWLYMQEWYESCGIEEFGISRKDLLLSYFLATASIFELERTN ERIAWAKSQIIAKMITSFFNKETTSEEDKRALLNELGNINGLNDTNG AGREGGAGSIALATLTQFLEGFDRYTRHQLKNAWSVWLTQLQHGEAD DAELLTNTLNICAGHIAFREEILAHNEYKALSNLTSKICRQLSFIQS EKEMGVEGEIAAKSSIKNKELEEDMQMLVKLVLEKYGGlDRNIKKAF LAVAKTYYYRAYHAADTrDTHMFKVLFEPVA SEQ ID NO: 3 SmCPS2opt, optimized cDNA for E. 
coli expression encoding for SmCPS2 ATGGCAACTGTTGACGCACCTCAAGTCCATGATCACGATGGCACCAC CGTTCACCAGGGTCACGACGCGGTGAAGAACATCGAGGACCCGATCG AATACATTCGTACCCTGCTGCGTACCACTGGTGATGGTCGCATCAGC GTCAGCCCGTATGACACGGCGTGGGTGGCGATGATTAAAGACGTCGA GGGTCGCGATGGCCCGCAATTTCCTTCTAGCCTGGAGTGGATTGTCC AAAATCAGCTGGAAGATGGCTCGTGGGGTGACCAGAAGCTGTTTTGT GTTTACGATCGCCTGGTTAATACCATCGCATGTGTGGTTGCGCTGC GTAGCTGGAATGTTCACGCTCATAAAGTCAAACGTGGCGTGACGTAT ATCAAGGAAAACGTGGATAAGCTGATGGAAGGCAACGAAGAACACAT GACGTGTGGCTTCGAGGTTGTTTTTCCAGCCTTGCTGCAGAAAGCAA AGTCCCTGGGTATTGAGGATCTGCCGTACGACTCGCCGGCAGTGCAA GAAGTCTATCACGTCCGCGAGCAGAAGCTGAAACGCATCCCGCTGGA GATTATGCATAAGATTCCGACCTCTCTGCTGTTCTCTCTGGAAGGTC TGGAGAACCTGGATTGGGACAAACTGCTGAAGCTGCAGTCCGCTGAC GGTAGCTTTCTGACCAGCCCGAGCAGCACGGCCTTTGCGTTTATGCA GACCAAAGATGAGAAGTGCTATCAATTCATCAAGAATACTATTGATA CCTTCAACGGTGGCGCACCGCACACGTACCCAGTAGACGTTTTTGGT CGCCTGTGGGCGATTGACCGTTTGCAGCGTCTGGGTATCAGCCGTTT CTTCGAGCCGGAGATTGCGGACTGCTTGAGCCATATTCACAAATTCT GGACGGACAAAGGCGTGTTCAGCGGTCGTGAGAGCGAGTTCTGCGAC ATCGACGATACGAGCATGGGTATGCGTCTGATGCGTATGCACGGTTA CGACGTGGACCCGAATGTGTTGCGCAACTTCAAGCAAAAAGATGGCA AGTTTAGCTGCTACGGTGGCCAAATGATTGAGAGCCCGAGCCCGATC TATAACTTATATCGTGCGAGCCAACTGCGTTTCCCGGGTGAAGAAAT TCTGGAAGATGCGAAGCGTTTTGCGTATGACTTCCTGAAGGAAAAGC TCGCAAACAATCAAATCTTGGATAAATGGGTGATCAGCAAGCACTTG CCGGATGAGATTAAACTGGGTCTGGAGATGCCGTGGTTGGCCACCCT GCCGAGAGTTGAGGCGAAATACTATATTCAGTATTACGCGGGTAGCG GTGATGTTTGGATTGGCAAGACCCTGTACCGCATGCCGGAGATCAGC AATGATACCTATCATGACCTGGCCAAGACCGACTTCAAACGCTGTCA AGCGAAACATCAATTTGAATGGTTATACATGCAAGAGTGGTACGAAA GCTGCGGCATCGAAGAGTTCGGTATCTCCCGTAAAGATCTGCTGCTG TCTTACTTTCTGGCAACGGCCAGCATTTTCGAGCTGGAGCGTACCAA TGAGCGTATTGCCTGGGCGAAATCACAAATCATTGCTAAGATGATTA CGAGCTTTTTCAATAAAGAAACCACGTCCGAGGAAGATAAACGTGCT CTGCTGAATGAACTGGGCAACATCAACGGTCTGAATGACACCAACGG TGCCGGTCGTGAGGGTGGCGCAGGCAGCATTGCACTGGCCACGCTGA CCCAGTTCCTGGAAGGTTTCGACCGCTACACCCGTCACCAGCTGAAG AACGCGTGGTCCGTCTGGCTGACCCAGCTGCAGCATGGTGAGGCAGA CGACGCGGAGCTGCTGACCAACACGTTGAATATCTGCGCTGGCCATA 
TCGCGTTTCGCGAAGAGATTCTGGCGCACAACGAGTACAAAGCCCTG AGCAATCTGACCTCTAAAATCTGTCGTCAGCTTAGCTTTATTCAGAG CGAGAAAGAAATGGGCGTGGAAGGTGAGATCGCGGCAAAATCCAGCA TCAAGAACAAAGAACTGGAAGAAGATATGCAGATGTTGGTCAAGCTC GTCCTGGAGAAGTATGGTGGCATCGACCGTAATATCAAGAAAGCGTT TCTGGCCGTGGCGAAAACGTATTACTACCGCGCGTACCACGCGGCAG ATACCATTGACACCCACATGTTTAAGGTTTTGTTTGAGCCGGTTGCT TAA SEQ ID NO: 4 Full-length sclareol synthase from Salvia sclarea MSLAFNVGVTPFSGQRVGSRKEKFPVQGFPVTTPNRSRLIVNCSLTT IDFMAKMKENFKREDDKFPTTTTLRSEDIPSNLCIIDTLQRLGVDQF FQYEINTILDNTFRLWQEKHKVIYGNVTTHAMAFRLLRVKGYEVSSE ELAPYGNQEAVSQQTNDLPMIIELYRAANERIYEEERSLEKILAWTT IFLNKQVQDNSIPDKKLHKLVEFYLRNYKGITIRLGARRNLELYDMT YYQALKSTNRFSNLCNEDFLVFAKQDFDIHEAQNQKGLQQLQRWYAD CRLDTLNFGRDVVIIANYLASLIIGDHAFDYVRLAFAKTSVLVTIMD DFFDCHGSSQECDKIIELVKEWKENPDAEYGSEELEILFMALYNTVN ELAERARVEQGRSVKEFLVKLWVEILSAFKIELDTWSNGTQQSFDEY ISSSWLSNGSRLTGLLTMQFVGVKLSDEMLMSEECTDLARHVCMVGR LLNDVCSSEREREENIAGKSYSILLATEKDGRKVSEDEAIAEINEMV EYHWRKVLQIVYKKESILPRRCKDVFLEMAKGTFYAYGINDELTSPQ QSKEDMKSFVF SEQ ID NO: 5 Truncated sclareol synthase from Salvia sclarea (SsScS) MAKMKENFKREDDKFPTTTTLRSEDIPSNLCIIDTLQRLGVDQFFQY EINTILDNTFRLWQEKHKVIYGNVTTHAMAFRLLRVKGYEVSSEELA PYGNQEAVSQQTNDLPMIIELYRAANERIYEEERSLEKILAWTTIFL NKQVQDNSIPDKKLHKLVEFYLRNYKGITIRLGARRNLELYDMTYYQ ALKSTNRFSNLCNEDFLVFAKQDFDIHEAQNQKGLQQLQRWYADCRL DTLNFGRDVVIIANYLASLIIGDHAFDYVRLAFAKTSVLVTIMDDFF DCHGSSQECDKIIELVKEWKENPDAEYGSEELEILFMALYNTVNELA ERARVEQGRSVKEFLVKLWVEILSAFKIELDTWSNGTQQSFDEYISS SWLSNGSRLTGLLTMQFVGVKLSDEMLMSEECTDLARHVCMVGRLLN DVCSSEREREENIAGKSYSILLATEKDGRKVSEDEAIAEINEMVEYH WRKVLQIVYKKESILPRRCKDVFLEMAKGTFYAYGINDELTSPQQSK EDMKSFVF SEQ ID NO: 6 1132-2-5_opt, optimized cDNA for E. 
coli expression encoding the truncated sclareol synthase from Salvia sclarea ATGGCGAAAATGAAGGAGAACTTTAAACGCGAGGACGATAAATTCCC GACGACCACGACCCTGCGCAGCGAGGATATCCCGAGCAACCTGTGCA TCATTGATACCCTGCAGCGCCTGGGTGTCGATCAGTTCTTCCAATAC GAAATCAATACCATTCTGGACAATACTTTTCGTCTGTGGCAAGAGAA ACACAAAGTGATCTACGGCAACGTTACCACCCACGCGATGGCGTTCC GTTTGTTGCGTGTCAAGGGCTACGAGGTTTCCAGCGAGGAACTGGCG CCGTACGGTAATCAGGAAGCAGTTAGCCAACAGACGAATGATCTGCC TATGATCATTGAGCTGTATCGCGCAGCAAATGAGCGTATCTACGAAG AGGAACGCAGCCTGGAAAAGATCCTGGCGTGGACCACGATCTTCCTG
AACAAACAAGTTCAAGACAATTCTATTCCTGATAAGAAGCTGCATAA ACTGGTCGAATTCTATCTGCGTAATTACAAGGGCATCACGATCCGTC TGGGCGCACGCCGTAACCTGGAGTTGTATGATATGACGTATTACCAG GCTCTGAAAAGCACCAATCGTTTCTCCAATCTGTGTAATGAGGATTT TCTGGTGTTCGCCAAGCAGGATTTTGACATCCACGAGGCGCAAAATC AAAAAGGTCTGCAACAACTGCAACGTTGGTACGCTGACTGTCGCCTG GACACCCTGAATTTCGGTCGCGACGTTGTCATTATTGCAAACTATCT GGCCAGCCTGATCATCGGTGATCACGCATTCGACTACGTCCGCCTGG CCTTCGCTAAGACCAGCGTTCTGGTGACCATTATGGATGATTTCTTC GATTGCCACGGTTCTAGCCAGGAATGCGACAAAATCATTGAGCTGGT GAAAGAGTGGAAAGAAAACCCTGATGCGGAATACGGTTCCGAAGAGT TGGAGATCCTGTTTATGGCCTTGTACAACACCGTGAATGAACTGGCC GAGCGTGCTCGTGTGGAGCAGGGCCGTTCTGTGAAGGAGTTTTTGGT CAAGTTGTGGGTGGAAATCCTGTCCGCGTTCAAGATCGAACTGGATA CGTGGTCGAATGGTACGCAACAGAGCTTCGACGAATACATCAGCAGC AGCTGGCTGAGCAATGGCAGCCGTCTGACCGGTTTGCTGACCATGCA ATTTGTGGGTGTTAAACTGTCCGATGAAATGCTGATGAGCGAAGAAT GCACCGACCTGGCACGCCATGTGTGTATGGTGGGTCGCCTGCTGAAC GACGTCTGCAGCAGCGAACGTGAGCGCGAGGAAAACATTGCAGGCAA GAGCTACAGCATCTTGTTGGCCACCGAGAAAGATGGTCGCAAAGTGT CTGAGGACGAAGCAATTGCAGAGATTAATGAAATGGTCGAGTACCAC TGGCGTAAGGTTTTGCAGATTGTGTATAAGAAAGAGAGCATCTTGCC GCGTCGCTGTAAGGATGTTTTCTTGGAGATGGCGAAGGGCACGTTCT ATGCGTACGGCATTAACGACGAGCTGACGAGCCCGCAACAATCGAAA GAGGACATGAAGAGCTTCGTGTTCTGAGGTAC SEQ ID NO: 7 GGPP synthase from Pantoea agglomerans MVSGSKAGVSPHREIEVMRQSIDDHLAGLLPETDSQDIVSLAMREGV MAPGKRIRPLLMLLAARDLRYQGSMPTLLDLACAVELTHTASLMLDD MPCMDNAELRRGQPTTHKKFGESVAILASVGLLSKAFGLIAATGDLP GERRAQAVNELSTAVGVQGLVLGQFRDLNDAALDRTPDAILSTNHLK TGILFSAMLQIVAIASASSPSTRETLHAFALDFGQAFQLLDDLRDDH PETGKDRNKDAGKSTLVNRLGADAARQKLREHIDSADKHLTFACPQG GAIRQFMHLWFGHHLADWSPVMKIA SEQ ID NO: 8 CrtEopt, optimized cDNA encoding for the GGPP synthase from Pantoea agglomeranes. 
ATGGTTTCTGGTTCGAAAGCAGGAGTATCACCTCATAGGGAAATCGA AGTCATGAGACAGTCCATTGATGACCACTTAGCAGGATTGTTGCCAG AAACAGATTCCCAGGATATCGTTAGCCTTGCTATGAGAGAAGGTGTT ATGGCACCTGGTAAACGTATCAGACCTTTGCTGATGTTACTTGCTGC AAGAGACCTGAGATATCAGGGTTCTATGCCTACACTACTGGATCTAG CTTGTGCTGTTGAACTGACACATACTGCTTCCTTGATGCTGGATGAC ATGCCTTGTATGGACAATGCGGAACTTAGAAGAGGTCAACCAACAAC CCACAAGAAATTCGGAGAATCTGTTGCCATTTTGGCTTCTGTAGGTC TGTTGTCGAAAGCTTTTGGCTTGATTGCTGCAACTGGTGATCTTCCA GGTGAAAGGAGAGCACAAGCTGTAAACGAGCTATCTACTGCAGTTGG TGTTCAAGGTCTAGTCTTAGGACAGTTCAGAGATTTGAATGACGCAG CTTTGGACAGAACTCCTGATGCTATCCTGTCTACGAACCATCTGAAG ACTGGCATCTTGTTCTCAGCTATGTTGCAAATCGTAGCCATTGCTTC TGCTTCTTCACCATCTACTAGGGAAACGTTACACGCATTCGCATTGG ACTTTGGTCAAGCCTTTCAACTGCTAGACGATTTGAGGGATGATCAT CCAGAGACAGGTAAAGACCGTAACAAAGACGCTGGTAAAAGCACTCT AGTCAACAGATTGGGTGCTGATGCAGCTAGACAGAAACTGAGAGAGC ACATTGACTCTGCTGACAAACACCTGACATTTGCATGTCCACAAGGA GGTGCTATAAGGCAGTTTATGCACCTATGGTTTGGACACCATCTTGC TGATTGGTCTCCAGTGATGAAGATCGCCTAA SEQ ID NO: 9 Forward primer SmCPS2-1132Inf_F1 CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGC GAAAATGAAGGAGAACTTTAAACG SEQ ID NO: 10 Reverse primer 1132-pET_Inf_R1 GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCA TGTCCTCT SEQ ID NO: 11 CfCPS1, full-length copalyl diphosphate synthase from Coleus forskohlii MGSLSTMNLNHSPMSYSGILPSSSAKAKLLLPGCFSISAWMNNGKNL NCQLTHKKISKVAE1RVATVNAPPVHDQDDSTENQCHDAVNNIEDPI EYIRTLLRTTGDGRISVSPYDTAWVALIKDLOGRDAPEFPSSLEWII QNQLADGSWGDAKFFCVYDRLVNTIACVVALRSWDVHAEKVERGVRY INENVEKLRDGNEEHMTCGFEVVFPALLQRAKSLGIQDLPYDAPVIQ EIYHSREQKSKRIPLEMMHKVPTSLLFSLEGLENLEWDKLLKLQSAD GSFLTSPSSTAFAFMQTRDPKCYQFIKNTIQTFNGGAPHTYPVDVFG RLWAIDRLQRLGISRFFESEIADCIAHIHRFWTEKGVFSGRESEFCD IDDTSMGVRLMRMHGYDVDPNVLKNFKKDDKFSCYGGQMIESPSPIY NLYRASQLRFPGEQILEDANKFAYDFLQEKLAHNQILDKWVISKHLP DEIKLGLEMPWYATLPRVEARYYIQYYAGSGDVWIGKTLYRMPEISN DTYFIELAKTDFKRCQAQHQFEWIYMQEWYESCNMEEFGISRKELLV AYFLATASIFELERANERIAWAKSQIISTIIASFFNNQNTSPEDKLA FLTDFKNGNSTNMALVTLTQFLEGFDRYTSHQLKNAWSVWLRKLQQG IEGNGGADAELLVNTLNICAGHIAFREELAHNDYKTLSNLTSKICRQ 
LSQIQNEKELETEGQKTSIKNKELEEDMQRLVKLVLEKSRVGINRDM KKTFLAVVKTYYYKAYHSAQAIDNHMFKVLFEPVA SEQ ID NO: 12 CfCPS1-del63, truncated copalyl diphosphate synthase from Coleus forskohlii MVATVNAPPVHDQDDSTENQCHDAVNNIEDPIEYIRTLLRTTGDGRI SVSPYDTAWVALIKDLQGRDAPEFPSSLEWIIQNQLADGSWGDAKFF CVYDRLVNTIACVVALRSWDVHAEKVERGVRYINENVEKLRDGNEEH MTCGFEVVFPALLQRAKSLGIQDLPYDAPVIQEIYHSREQKSKRIPL EMMHKVPTSLLFSLEGLENLEWDKLLKLQSADGSFLTSPSSTAFAFM QTRDPKCYQFIKNTIQTFNGGAPHTYPVDVFGRLWAIDRLQRLGISR FFESEIADCIAHIHRFWTEKGVFSGRESEFCDIDDTSMGVRLMRMHG YDVDPNVLKNFKKDDKFSCYGGQMIESPSPIYNLYRASQLRFPGEQI LEDANKFAYDFLQEKLAHNQILDKWVISKHLPDEIKLGLEMPWYATL PRVEARYYIQYYAGSGDVWIGKTLYRMPEISNDTYHELAKTDFKRCQ AQHQFEWIYMQEWYESCNMEEFGISRKELLVAYFLATASIFELERAN ERIAWAKSQIISTIIASFFNNQNTSPEDKLAFLTDFKNGNSTNMALV TLTQFLEGFDRYTSHQLKNAWSVWLRKLQQGEGNGGADAELLVNTLN ICAGHIAFREEILAHNDYKTLSNLTSKICRQLSQIQNEKELETEGQK TSIKNKELEEDMQRLVKLVLEKSRVGINRDMKKTFLAVVKTYYYKAY HSAQAIDNHMFKVLFEPVA SEQ ID NO: 13 Optimized cDNA for E. coli expression encoding for CfCPS1-del63 ATGGTCGCTACTGTCAATGCTCCACCGGTCCACGATCAAGACGACAG CACTGAGAATCAATGTCATGATGCCGTAAACAATATTGAAGATCCAA TCGAGTATATCCGTACCCTGTTGCGCACGACGGGTGATGGTCGTATC AGCGTCAGCCCGTACGATACCGCGTGGGTGGCGCTGATCAAAGATCT GCAGGGCCGTGACGCACCGGAGTTTCCGTCCTCTCTTGAGTGGATCA TTCAAAACCAGCTGGCCGACGGTTCTTGGGGCGACGCCAAATTTTTC TGCGTGTATGACCGTCTGGTGAACACCATCGCGTGCGTCGTTGCGCT GCGTTCCTGGGACGTCCACGCGGAAAAAGTTGAGCGTGGCGTGCGCT ATATCAACGAAAATGTCGAAAAGCTGCGCGACGGTAATGAAGAACAC ATGACCTGTGGCTTTGAAGTTGTTTTCCCGGCGCTCCTGCAGCGCGC GAAGTCTCTGGGTATTCAAGATCTGCCGTACGATGCTCCGGTGATCC AAGAGATTTATCACTCTCGTGAGCAGAAGTCCAAGCGTATCCCGTTG GAGATGATGCACAAAGTTCCGACGAGCCTGCTGTTCAGCTTGGAAGG CCTGGAAAATCTGGAGTGGGACAAACTGCTGAAGCTGCAGAGCGCGG ACGGTAGCTTCCTGACGAGCCCGAGCAGCACCGCATTTGCATTTATG CAGACCCGTGACCCGAAGTGTTACCAATTTATTAAGAACACGATTCA GACGTTTAACGGTGGTGCACCGCATACCTATCCGGTAGACGTCTTTG GTCGCCTGTGGGCAATTGATCGTCTGCAGCGTTTGGGTATCAGCCGC TTCTTCGAAAGCGAAATTGCAGATTGTATCGCACACATCCATCGTTT TTGGACCGAGAAAGGCGTCTTTAGCGGCCGTGAGTCTGAGTTCTGTG ACATCGATGACACGAGCATGGGTGTCCGTCTGATGCGTATGCATGGC 
TATGATGTTGACCCGAACGTGCTGAAGAATTTTAAAAAAGATGACAA GTTTAGCTGCTACGGCGGTCAGATGATTGAGAGCCCGAGCCCGATTT ATAATCTGTACCGCGCGAGCCAACTGCGTTTCCCGGGTGAACAGATT CTGGAAGATGCCAATAAATTCGCGTATGATTTCCTGCAGGAAAAACT GGCGCACAATCAGATCCTGGATAAATGGGTTATCAGCAAGCATCTGC CTGACGAAATCAAATTGGGCCTGGAGATGCCGTGGTATGCGACCTTG CCGCGTGTCGAAGCGCGTTACTACATCCAGTACTATGCGGGTAGCGG
CGATGTCTGGATTGGTAAGACGCTGTACCGTATGCCAGAGATTAGCA ACGACACCTACCATGAATTGGCAAAGACCGATTTCAAGCGTTGCCAA GCCCAACACCAGTTCGAGTGGATTTACATGCAAGAGTGGTACGAGTC GTGCAACATGGAAGAGTTCGGTATTAGCCGCAAAGAACTGCTGGTTG CATATTTCCTGGCCACGGCGAGCATCTTTGAGCTGGAGCGTGCGAAT GAACGCATTGCATGGGCAAAAAGCCAAATCATTTCTACCATTATCGC TTCGTTCTTTAATAACCAAAATACGAGCCCTGAGGATAAACTGGCGT TTCTGACTGATTTCAAAAATGGCAACAGCACCAACATGGCTCTGGTG ACCCTGACCCAGTTCCTGGAAGGCTTTGACCGCTACACTTCCCATCA ACTGAAAAACGCGTGGAGCGTTTGGCTGCGTAAGCTGCAACAGGGTG AGGGTAATGGCGGTGCCGACGCCGAGTTACTGGTGAATACGCTGAAC ATTTGCGCGGGTCACATCGCGTTCCGTGAAGAAATTCTGGCACATAA TGACTATAAAACGTTGTCGAACCTGACCAGCAAGATTTGTCGCCAGC TGAGCCAGATTCAGAATGAAAAAGAATTGGAAACCGAAGGCCAAAAG ACTTCCATTAAGAACAAAGAACTGGAAGAAGATATGCAGCGCCTGGT TAAACTGGTTTTGGAGAAAAGCCGTGTGGGTATCAATCGTGACATGA AGAAAACGTTCCTGGCTGTGGTGAAAACCTACTATTACAAAGCATAC CACTCCGCGCAGGCAATCGATAACCACATGTTCAAGGTTCTGTTCGA ACCGGTGGCCTAA SEQ ID NO: 14 TaTps1, full-length copalyl diphosphate synthase from Tritictim aestivum. MLTFTAALRHVPVLDQPTSEPWRRLSLHLHSQRRPCGLVLISKSPSY PEVDVGEWKVDEYRQRTDEPSETRQMIDDIRTALASLGDDETSMSVS AYDTALVALVKNLDGGDGPQFPSCIDWIVQNQLPDGSWGDPAFFMVQ DRMISTLACWAVKSWNIDRDNLCDRGVLFIKENMSRLVEEEQDWMPC GFEINFPALLEKAKDLDLDIPYDHPVLEEIYAKRNLKLLKIPLDVLH AIPTTLLFSVEGMVDLPLDWEKLLRLRCPDGSFHSSPAATAAALSHT GDKECHAFLDRLIQKFEGGVPCSHSMDTFEQLWVYDRLMRLGISRHF TSEIQQCLEFIYRRWTQKGLAHNMHCPIPDIDDTAMGFRLLRQHGYD VTPSVFKHFEKDGKFVCFPMETNHASVTPMHNTYRASQFMFPGDDDV LARAGRYCRAFLQERQSSNKLYDKWIITKDLPGEVGYTLNFPWKSSL PRIETRMYLDQYGGNNDVWIAKVLYRMNLVSNDLYLKMAKADFTEYQ RLSRIEWNGLRKWYFRNHLQRYGATPKSALKAYFLASANIFEPGRAA ERLAWARMAVLAEAVTTHFRHIGGPCYSTENLEELIDLVSFDDVSGG LREAWKQWLMAWTAKESHGSVDGDTALLFVRTIEICSGRIVSSEQKL NLWDYSQLEQLTSSICHKLATIGLSQNEASMENTEDLHQQVDLEMQE LSWRVHQGCHGINRETRQTFLNVVKSFYYSAHCSPETVDSHIAKVIF QDVI SEQ ID NO: 15 TaTps1-del59, truncated copalyl diphosphate synthase from Triticum aestivum. 
MYRQRTDEPSETRQMIDDIRTALASLGDDETSMSVSAYDTALVALVK NLDGGDGPQFPSCIDWIVQNQLPDGSWGDPAFFMVQDRMISTLACVV AVKSWNIDRDNLCDRGVLFIKENMSRLVEEEQDWMPCGFEINFPALL EKAKDLDLDIPYDHPVLEEIYAKRNLKLLKIPLDVLHAIPTTLLFSV EGMVDLPLDWEKLLRLRCPDGSFHSSPAATAAALSHTGDKECHAFLD RLIQKFEGGVPCSHSMDTFEQLWVVDRLMRLGISRHFTSEIQQCLEF IYRRWTQKGLAHNMHCPIPDIDDTAMGFRLLRQHGYDVTPSVFKHFE KDGKFVCFPMETNHASVTPMHNTYRASQFMFPGDDDVLARAGRYCRA FLQERQSSNKLYDKWIITKDLPGEVGYTLNFPWKSSLPRIETRMYLD QYGGNNDVWIAKVLYRMNLVSNDLYLKMAKADFTEYQRLSRIEWNGL RKWYFRNHLQRYGATPKSALKAYFLASANIFEPGRAAERLAWARMAV LAEAVTTHFRHIGGPCYSTENLEELIDLVSFDDVSGGLREAWKQWLM AWTAKESHGSVDGDTALLFVRTIEICSGRIVSSEQKLNLWDYSQLEQ LTSSICHKLATIGLSQNEASMENTEDLHQQVDLEMQELSWRVHQGCH GINRETRQTFLNVVKSFYYSAHCSPETVDSHIAKVIFQDVI SEQ ID NO: 16 Optimized cDNA for E. coli expression encoding for TaTps1-del59 ATGTATCGCCAAAGAACTGATGAGCCAAGCGAAACCCGCCAGATGAT CGATGATATTCGCACCGCTTTGGCTAGCCTGGGTGACGATGAAACCA GCATGAGCGTGAGCGCATACGACACCGCCCTGGTTGCCCTGGTGAAG AACCTGGACGGTGGCGATGGCCCGCAGTTCCCGAGCTGCATTGACTG GATTGTTCAGAACCAGCTGCCGGACGGTAGCTGGGGCGACCCGGCTT TCTTTATGGTTCAGGACCGTATGATCAGCACCCTGGCCTGTGTCGTG GCCGTGAAATCCTGGAATATCGATCGTGACAACTTGTGCGATCGTGG TGTCCTGTTTATCAAAGAAAACATGTCGCGTCTGGTTGAAGAAGAAC AAGATTGGATGCCATGTGGCTTCGAGATTAACTTTCCTGCACTGTTG GAGAAAGCTAAAGACCTGGACTTGGACATTCCGTACGATCATCCTGT GCTGGAAGAGATTTACGCGAAGCGTAATCTGAAACTGCTGAAGATTC CGTTAGATGTCCTCCATGCGATCCCGACGACGCTGTTGTTTTCCGTT GAGGGTATGGTCGATCTGCCGCTGGATTGGGAGAAACTGCTGCGTCT GCGTTGCCCGGACGGTTCTTTTCATTCTAGCCCGGCGGCGACGGCAG CGGCGCTGAGCCACACGGGTGACAAAGAGTGTCACGCCTTCCTGGAC CGCCTGATTCAAAAGTTCGAGGGTGGCGTCCCGTGCTCCCACAGCAT GGACACCTTCGAGCAACTGTGGGTTGTTGACCGTTTGATGCGTCTGG GTATCAGCCGTCATTTTACGAGCGAGATCCAGCAGTGCTTGGAGTTC ATCTATCGTCGTTGGACCCAGAAAGGTCTGGCGCACAATATGCACTG CCCGATCCCGGACATTGATGACACTGCGATGGGTTTTCGTCTGTTGA GACAGCACGGTTACGACGTGACCCCGTCGGTTTTCAAGCATTTCGAG AAAGACGGCAAGTTCGTATGCTTCCCGATGGAAACCAACCATGCGAG CGTGACGCCGATGCACAATACCTACCGTGCGAGCCAGTTCATGTTCC CGGGTGATGACGACGTGCTGGCCCGTGCCGGCCGCTACTGTCGCGCA TTCTTGCAAGAGCGTCAGAGCTCTAACAAGTTGTACGATAAGTGGAT 
TATCACGAAAGATCTGCCGGGTGAGGTTGGCTACACGCTGAACTTTC CGTGGAAAAGCTCCCTGCCGCGTATTGAAACTCGTATGTATCTGGAT CAGTACGGTGGCAATAACGATGTCTGGATTGCAAAGGTCCTGTATCG CATGAACCTGGTTAGCAATGACCTGTACCTGAAAATGGCGAAAGCCG ACTTTACCGAGTATCAACGTCTGTCTCGCATTGAGTGGAACGGCCTG CGCAAATGGTATTTTCGCAATCATCTGCAGCGTTACGGTGCGACCCC GAAGTCCGCGCTGAAAGCGTATTTCCTGGCGTCGGCAAACATCTTTG AGCCTGGCCGCGCAGCCGAGCGCCTGGCATGGGCACGTATGGCCGTG CTGGCTGAAGCTGTAACGACTCATTTCCGTCACATTGGCGGCCCGTG CTACAGCACCGAGAATCTGGAAGAACTGATCGACCTTGTTAGCTTCG ACGACGTGAGCGGCGGCTTGCGTGAGGCGTGGAAGCAATGGCTGATG GCGTGGACCGCAAAAGAATCACACGGCAGCGTGGACGGTGACACGGC ACTGCTGTTTGTCCGCACGATTGAGATTTGCAGCGGCCGCATCGTTT CCAGCGAGCAGAAACTGAATCTGTGGGATTACAGCCAGTTAGAGCAA TTGACCAGCAGCATCTGTCATAAACTGGCCACCATCGGTCTGAGCCA GAACGAAGCTAGCATGGAAAATACCGAAGATCTGCACCAACAAGTCG ATTTGGAAATGCAAGAACTGTCATGGCGTGTTCACCAGGGTTGTCAC GGTATTAATCGCGAAACCCGTCAAACCTTCCTGAATGTTGTTAAGTC TTTTTATTACTCCGCACACTGCAGCCCGGAAACCGTGGACAGCCATA TTGCAAAAGTGATCTTTCAAGACGTTATCTGA SEQ ID NO: 17 MvCps3, full-length copalyl diphosphate synthase from Marrubium vulgare. 
MGSLSTLNLIKTCVTLASSEKLNQPSQCYTISTCMKSSNNPPFNYYQ INGRKKMSTAIDSSVNAPPEQKYNSTALEHDTEIIEIEDHIECIRRL LRTAGDGRISVSPYDTAWIALIKDLDGHDSPQFPSSMEWVADNQLPD GSWGDEHFVCVYDRLVNTIACVVALRSWNVHAHKCEKGIKYIKENVH KLEDANEEHMTCGFEVVFPALLQRAQSMGIKGIPYNAPVIEEIYNSR EKKLKRIPMEVVHKVATSLLFSLEGLENLEWEKLLKLQSPDGSFLTS PSSTAFAFIHTKDRKCFNFINNIVHTFKGGAPHTYPVDIFGRLWAVD RLQRLGISRFFESEIAEFLSHVHRFWSDEAGVFSGRESVFCDIDDTS HMGLRLLRMHGYHVDPNVLKNFKQSDKFSCYGGQMMECSSPIYNLYR ASQLQFPGEEILEEANKFAYKFLQEKLESNQILDKWLISNLSDEIKV GLEMPWYATLPRVETSYYIHHYGGGDDVWIGKTLYRMPEISNDTYRE LARLDFRRCQAQHQLEWIYMQRWYESCRMQEFGISRKEVLRAYFLAS GTIFEVERAKERVAWARSQIISHMIKSFFNKETTSSDQKQALLTELL FGNISASETEKRELDGWVATLRQFLEGFDIGTRHQVKAAWDVWLRKV EQGEAHGGADAELCTTTLNTCANQHLSSHPDYNTLSKLTNKICHKLS QIQHQKEMKGGIKAKCSINNKEVDIEMQWLVKLVLEKSGLNRKAKQA FLSIAKTYYYRAYYADQTMDAHEFKVLFEPVV SEQ ID NO: 18 MvCps3-del63, truncated copalyl diphosphate synthase from Marrubium vulgare MAPPEQKYNSTALEHDTEIIEIEDHIECIRRLLRTAGDGRISVSPYD TAWIALIKDLDGHDSPQFPSSMEWVADNQLPDGSWGDEHFVCVYDRL VNTIACWALRSWNVHAHKCEKGIKYIKENVHKLEDANEEHMTCGFEV VFPALLQRAQSMGIKGIPYNAPVIEEIYNSREKKLKRIPMEVVHKVA TSLLFSLEGLENLEWEKLLKLQSPDGSFLTSPSSTAFAFIHTKDRKC
FNFINNIVHTFKGGAPHTYPVDIFGRLWAVDRLQRLGISRFFESEIA EFLSHVHRFWSDEAGVFSGRESVFCDIDDTSMGLRLLRMHGYHVDPN VLKNFKQSDKFSCYGGQMMECSSPIYNLYRASQLQFPGEEILEEANK FAYKFLQEKLESNQILDKWLISNHLSDEIKVGLEMPWYATLPRVETS YYIHHYGGGDDVWIGKTLYRMPEISNDTYRELARLDFRRCQAQHQLE WIYMQRWYESCRMQEFGISRKEVLRAYFLASGTIFEVERAKERVAWA RSQIISHMIKSFFNKETTSSDQKQALLTELLFGNISASETEKRELDG VVVATLRQFLEGFDIGTRHQVKAAWDVWLRKVEQGEAHGGADAELCT TTLNTCANQHLSSHPDYNTLSKLTNKICHKLSQIQHQKEMKGGIKAK CSINNKEVDIEMQWLVKLVLEKSGLNRKAKQAFLSIAKTYYYRAYYA DQTMDAHIFKVLFEPVV SEQ ID NO: 19 Optimized cDNA for E. coli expression encoding for MvCps3-del63 ATGGCCCCGCCGGAACAAAAGTACAACAGCACTGCATTAGAACACGA CACCGAGATTATTGAGATCGAGGACCACATCGAGTGTATCCGCCGTC TGCTGCGTACCGCGGGTGATGGTCGTATTAGCGTGAGCCCGTATGAT ACCGCGTGGATTGCACTGATTAAAGATTTGGATGGCCACGACTCCCC GCAATTCCCGTCGAGCATGGAATGGGTTGCTGATAATCAGCTGCCGG ACGGTAGCTGGGGTGACGAGCACTTCGTTTGCGTTTACGATCGCCTG GTTAATACCATCGCATGCGTCGTGGCGCTGCGCAGCTGGAATGTCCA TGCACATAAGTGCGAGAAAGGTATTAAGTACATTAAAGAAAATGTCC ACAAACTGGAAGATGCGAACGAAGAACACATGACTTGCGGCTTCGAA GTCGTTTTTCCGGCCTTGCTGCAGCGTGCACAGAGCATGGGTATTAA GGGCATCCCGTACAACGCGCCTGTCATTGAAGAAATTTACAATTCCC GTGAGAAAAAGCTGAAACGTATTCCGATGGAAGTTGTCCACAAAGTC GCGACCAGCCTGCTGTTCTCCCTGGAAGGTCTGGAGAACCTGGAGTG GGAGAAATTGCTGAAACTGCAGAGCCCGGACGGTTCGTTTCTGACCA GCCCGAGCTCTACGGCATTCGCGTTTATCCATACCAAAGACCGTAAA TGTTTTAACTTTATTAACAATATCGTTCATACCTTTAAGGGTGGTGC ACCGCACACGTACCCTGTGGACATCTTTGGCCGCCTGTGGGCAGTGG ATCGCTTGCAGCGTCTGGGTATTAGCCGCTTCTTCGAGAGCGAGATC GCGGAATTTCTGAGCCACGTGCACCGTTTTTGGAGCGACGAAGCGGG CGTTTTCAGCGGCCGTGAGAGCGTGTTCTGTGATATTGATGACACCA GCATGGGTCTGCGCCTGCTTCGTATGCATGGCTACCATGTAGACCCA AACGTTCTGAAGAACTTCAAGCAATCTGACAAGTTTAGCTGCTACGG TGGCCAGATGATGGAATGCAGCAGCCCAATTTACAATCTGTACCGTG CGAGCCAACTGCAATTTCCGGGTGAAGAAATCTTGGAAGAGGCTAAC AAATTCGCGTATAAGTTTTTGCAAGAGAAACTGGAGTCCAATCAGAT TCTGGACAAGTGGCTGATCTCCAACCACCTGAGCGACGAAATCAAAG TTGGCCTGGAAATGCCGTGGTATGCGACCTTGCCGCGCGTTGAGACT AGCTATTATATTCACCATTACGGCGGTGGCGACGATGTGTGGATTGG TAAAACGCTGTATCGCATGCCGGAAATTAGCAACGACACCTACCGTG 
AGCTGGCACGTCTGGACTTCCGCCGCTGCCAGGCGCAGCACCAGTTG GAATGGATCTATATGCAACGTTGGTATGAGAGCTGTCGTATGCAAGA ATTTGGTATTTCCCGCAAAGAAGTCCTGCGTGCCTACTTCCTGGCCT CTGGCACGATTTTCGAAGTTGAGCGCGCCAAAGAGCGCGTGGCGTGG GCTCGTAGCCAAATCATTTCCCACATGATCAAGAGCTTCTTCAATAA AGAAACCACGAGCAGCGATCAGAAACAAGCGCTGCTGACCGAGTTGC TGTTTGGTAACATCTCTGCAAGCGAGACTGAGAAACGTGAGCTGGAT GGTGTTGTGGTTGCGACCCTGCGTCAGTTCCTGGAAGGCTTCGATAT CGGCACCCGTCACCAAGTGAAGGCAGCGTGGGATGTGTGGCTGCGTA AAGTCGAACAGGGTGAGGCACATGGTGGCGCGGACGCCGAGTTGTGT ACGACGACGCTGAACACGTGCGCGAATCAGCATCTGTCTAGCCATCC GGACTACAATACCCTGTCGAAACTCACCAATAAGATTTGTCACAAGC TGTCCCAAATCCAGCATCAGAAAGAAATGAAGGGCGGTATTAAGGCA AAGTGCTCTATCAATAACAAAGAAGTGGATATCGAGATGCAATGGCT GGTCAAACTGGTCCTGGAGAAATCCGGTCTGAACCGCAAGGCTAAAC AAGCGTTTCTGAGCATTGCCAAAACCTATTATTATCGTGCTTACTAT GCCGACCAGACGATGGATGCCCACATCTTCAAGGTCCTGTTTGAACC GGTCGTGTAA SEQ ID NO: 20 RoCPS1, full-length copalyl diphosphate synthase from Rosmarinus officinalis MTSMSSLNLSRAPAISRRLQLPAKVQLPEFYAVCSWLNNSSKHTPLS CHIHRKQLSKVTKCRVASLDASQVSEKGTSSPVQTPEEVNEKIENYI EYIKNLLTTSGDGRISVSPYDTSIVALIKDLKGRDTPQFPSCLEWIA QHQMADGSWGDEFFCIYDRILNTLACVVALKSWNVHADMIEKGVTYVN ENVQKLEDGNLEHMTSGFEIVVPALVQRAQDLGIQGLPYDHPLIKEI ANTKEGRLKKIPKDMIYQKPTTLLFSLEGLGDLEWEKILKLQSGDGS FLTSPSSTAHVFMKTKDEKCLKFIENAVKNCNGGAPHTYPVDVFARL WAVDRLQRLGISRFFQQEIKYFLDHINSVWTENGVFSGRDSEFCDID DTSMGIRLLKMHGYDIDPNALEHFKQQDGKFSCYGGQMIESASPIYN LYRAAQLRFPGEEILEEATKFAYNFLQEKIANDQFQEKWVISDHLID EVKLGLKMPWYATLPRVEAAYYLQYYAGCGDVWIGKVFYRMPEIS NDTYKKLAILDFNRCQAQHQFEWIYMQEWYHRSSVSEFGISKKDLLR AYFLAAATIFEPERTQERLVWAKTQIVSGMITSFVNSGTTLSLHQKT ALLSQIGHNFDGLDEIISAMKDHGLAATLLTTFQQLLDGFDRYTRHQ LKNAWSQWFMKLQQGEASGGEDAELLANTLNICAGLIAFNEDVLSHH EYTTLSTLTNKICKRLTQIQDKKTLEVVDGSIKDKELEKDIQMLVKL VLEENGGGVDRNIKHTFLSVFKTFYYNAYHDDETTDVHIFKVLFGPV V SEQ ID NO: 21 RoCPS1-del67, truncated copalyl diphosphate synthase from Rosmarinus officinalis MASQVSEKGTSSPVQTPEEVNEKIENYIEYIKNLLTTSGDGRISVSP YDTSIVALIKDLKGRDTPQFPSCLEWIAQHQMADGSWGDEFFCIYDR ILNTLACVVALKSWNVHADMIEKGVTYVNENVQKLEDGNLEHMTSGF 
EIVVPALVQRAQDLGIQGLPYDHPLIKEIANTKEGRLKKIPKDMIYQ KPTTLLFSLEGLGDLEWEKILKLQSGDGSFLTSPSSTAHVFMKTKDE KCLKFIENAVKNCNGGAPHTYPVDVFARLWAVDRLQRLGISRFFQQE IKYFLDHINSVWTENGVFSGRDSEFCDIDDTSMGIRLLKMHGYDIDP NALEHFKQQDGKFSCYGGQMIESASPIYNLYRAAQLRFPGEEILEEA TKFAYNFLQEKIANDQFQEKWVISDHLIDEVKLGLKMPWYATLPRVE AAYYLQYYAGCGDVWIGKVFYRMPEISNDTYKKLAILDFNRCQAQHQ FEWIYMQEWYHRSSVSEFGISKKDLLRAYFLAAATIFEPERTQERL VWAKTQIVSGMITSFVNSGTTLSLHQKTALLSQIGHNFDGLDEIISA MKDHGLAATLLTTFQQLLDGFDRYTRHQLKNAWSQWFMKLQQGEASG GEDAELLANTLNICAGLIAFNEDVLSHHEYTTLSTLTNKICKRLTQI QDKKTLEVVDGSIKDKELEKDIQMLVKLVLEENGGGVDRNIKHTFLSV FKTFYYNAYHDDETTDVHIFKVLFGPVV SEQ ID NO: 22 Optimized cDNA for E. coli expression encoding for RoCPS1-del67 ATGGCATCACAAGTTAGCGAGAAAGGCACCAGCTCCCCAGTTCAAAC GCCAGAGGAAGTGAACGAAAAGATCGAGAATTACATTGAGTATATTA AAAATCTGCTGACTACTTCGGGCGACGGCCGCATCAGCGTCAGCCCG TACGACACGAGCATCGTTGCCCTGATTAAAGACCTGAAGGGTCGTGA CACCCCGCAGTTTCCGTCCTGTCTGGAGTGGATTGCCCAACACCAAA TGGCCGATGGTTCCTGGGGTGATGAATTTTTCTGCATTTACGACCGC GATCCTGAATACGCTGGCTTGTGTTGTCGCCCTGAAGTCCTGGAATT TCATGCAGACATGATCGAAAAGGGTGTCACTTACGTTAACGAAAACG TGCAGAAACTGGAAGATGGCAATCTGGAGCACATGACGAGCGGTTTC CGAGATTGTTGTCCCGGCGCTGGTTCAGAGAGCGCAAGACCTGGGCA TCCAGGGCCTGCCGTATGATATCCGTTGATCAAAGAAATCGCAAACA CCAAAGAGGGCCGCCTGAAGAAAATTCCTAAAGACATGATTTATCAG AAACCGACTACGCTGCTGTTCAGCCTGGAAGGCTTGGGCGACCTGGA GTGGGAAAAGATCCTGAAGTTACAGTCTGGTGATGGTTCTTTCCTGA CCAGCCCGAGCTCTACGGCCCATGTTTTCATGAAAACCAAAGATGAG AAGTGTCTGAAGTTTATTGAAAATGCCGTCAAGAATTGCAACGGTGG CGCGCCTCACACCTACCCGGTGGACGTTTTCGCTCGTCTGTGGGCCG TCGATCGTCTGCAACGCCTGGGCATCTCGCGTTTCTTCCAGCAAGAG ATTAAGTACTTCCTGGACCACATTAATAGCGTGTGGACCGAAAACGG CGTTTTCAGCGGTCGCGACAGCGAGTTTTGTGATATTGATGACACCT CTATGGGTATCCGTTTGCTGAAGATGCACGGTTACGACATTGACCCG AATGCCCTGGAGCACTTTAAACAACAGGATGGTAAGTTCTCCTGCTA CGGTGGTCAGATGATTGAGAGCGCGAGCCCGATCTACAACCTGTACC GTGCTGCGCAGCTGCGTTTTCCGGGTGAAGAGATTCTGGAAGAGGCC ACCAAATTTGCGTATAATTTTTTGCAAGAGAAAATTGCAAACGACCA AATTCCAGGAAAAATGGGTTATTAGCGATCACCTTATCGATGAAGTG
AAAACTGGGTTTGAAGATGCCGTGGTACGCGCGCTGCCACGTGTCGA GGCAGCGTATTATCTGCAGTATTATGCGGGCTGTGGTGTGTGTGGAT CGGCAAAGTGTTCTACCGTATGCCGGAAATCAGCAATGACACCTACA AGAAACTGGCCATCCTGGATTTCAACCGTTGCCAGGCGCAACACCAA TTCGAGTGGATCTACATGCAAGAGTGGTATCATCGTAGCAGCGTTTC TGAGTTTGGCATTTCCAAAAAAGACTTGCTGCGCGCGTATTTTCTGG CGGCAGCGACCATTTTCGAACCGGAGCGCACCCAGGAACGTCTGGTG TGGGCTAAGACGCAAATCGTCAGCGGTATGATTACGTCCTTTGTTAA TAGCGGTACGACTCTGAGCCTGCACCAGAAAACGGCACTGTTGAGCC AAATCGGTCATAACTTTGACGGCCTGGATGAGATTATCAGCGCGATG AAAGACCACGGCCTGGCAGCGACGCTGTTAACGACCTTTCAACAGCT GCTGGACGGCTTCGATCGCTACACCCGTCATCAGCTGAAAAACGCGT GGAGCCAGTGGTTCATGAAGCTGCAACAGGGTGAGGCGTCGGGTGGC GAAGATGCTGAGCTGCTGGCTAATACCCTGAACATTTGCGCGGGTTT GATTGCGTTTAATGAAGATGTGTTGAGCCACCATGAGTACACCACCC TGAGCACCCTGACCAACAAGATCTGTAAGCGCTTGACTCAAATCCAG GATAAGAAAACGCTGGAAGTCGTGGATGGTAGCATCAAAGATAAAGA ACTGGAAAAAGACATTCAAATGCTGGTGAAACTGGTCCTTGAAGAGA ACGGCGGTGGCGTTGACCGTAACATCAAGCACACCTTCCTGAGCGTC TTTAAAACCTTTTATTATAATGCCTATCATGACGATGAAACGACCGA CGTGCACATTTTCAAAGTTCTGTTCGGTCCGGTCGTGTAA SEQ ID NO: 23 NgSCS-del29, truncated putative sclareol synthase from Nicotiana glutinosa MANFHRPSRVRCSHSTASSLEEAKERIRETFGKNELSPSSYDTAWVA MVPSRYSMNQPCFPRCLDWILENQREDGSWGLNPSHPLLVKDSLSST LACLLALRKWRIGDNQVQRGLGFIETHGWAVDNVDQISPLGFDIIFP SMIKYAEKLNLDLPFDPNLVNMMLRERELTIERALKNEFEGNMANVE YFAEGLGELCHWKEIMLHQRRNGSLFDSPATTAAALIYHQHDEKCFG YLSSILKLHENWVPTIYPTKVHSNLFFVDALQNLGVDRYFKTELKSV LDEIYRLWLEKNEEIFSDIAHCAMAFRLLRMNNYEVSSEELEGFVDQ EHFFTTSGGKLISHVAILELHRASQVDIQEGKDLILDKISTWTRNFM EQELLDNQILDRSKKEMEFAMRKFYGTFDRVETRRYIESYKMDSFKI LKAAYRSSNINNIDLLKFSEFIDFNLCQARHKEELQQIKRWFADCKL EQVGSSQNYLYTSYFPIAAILFEPEYGDARLAFAKCGIIATTVDDFF DGFACNEELQNIIELVERWDGYPTVGFRSERVRIFFLALYKMIEEIA AKAETKQGRCVKDLLINLWIDLLKCMLVELDLWKIKSTTPSIEEYLS IACVTTGVKCLILISLHLLGPKLSKDVTESSEVSALWNCTAVVARLN NDIHSYKREQAESSTNMAAILISQSQRTISEEEAIRQIKEMMESKRR ELLGMVLQNKESQLPQVCKDLFWTTFKAAYSIYTHGDEYRFPQELKN HINDVIYKPLNQYSP SEQ ID NO: 24 Optimized cDNA for E. 
coli expression encoding for NgSCS-del29 ATGGCTAATTTCCATCGCCCATCCCGTGTTCGTTGTTCCCACTCTAC CGCAAGCTCCCTGGAAGAGGCAAAAGAGCGCATCCGTGAAACCTTCG GCAAAAATGAACTCTCTCCTTCTAGCTATGATACGGCCTGGGTTGCT ATGGTCCCGAGCCGCTACAGCATGAACCAGCCGTGCTTTCCGCGCTG CCTGGACTGGATTCTGGAGAACCAACGTGAGGATGGCAGCTGGGGTC TGAACCCGAGCCATCCGTTACTGGTGAAAGACAGCTTGAGCAGCACG CTGGCGTGTTTGCTGGCGCTGCGTAAGTGGCGTATTGGCGACAACCA AGTCCAGCGTGGCCTGGGTTTTATCGAGACTCATGGTTGGGCAGTGG ACAACGTAGACCAGATCTCTCCACTGGGTTTTGACATCATTTTCCCG AGCATGATTAAATATGCGGAAAAGCTGAATCTGGATTTGCCTTTTGA TCCGAACCTGGTGAACATGATGCTGCGCGAGCGCGAGCTGACGATCG AGCGTGCGCTGAAAAACGAATTTGAGGGTAATATGGCTAATGTCGAG TACTTCGCCGAGGGTTTGGGTGAGCTGTGTCACTGGAAAGAAATCAT GCTGCACCAACGCCGTAACGGTAGCCTGTTCGACTCTCCGGCAACGA CCGCCGCGGCTCTTATTTATCATCAGCACGATGAGAAGTGCTTCGGC TATCTGTCTAGCATCCTGAAATTACACGAGAACTGGGTGCCGACCAT CTATCCGACCAAGGTTCACTCCAATCTGTTTTTCGTCGATGCGCTGC AGAACCTGGGTGTTGACCGTTACTTCAAAACCGAACTGAAGTCCGTC CTGGATGAGATCTACCGTTTGTGGCTGGAGAAAAACGAAGAGATCTT CAGCGATATTGCGCACTGCGCAATGGCGTTTCGCCTGTTGCGCATGA ATAATTACGAGGTTAGCAGCGAAGAACTGGAAGGCTTCGTGGACCAA GAACATTTTTTCACCACGTCGGGTGGCAAGCTGATCAGCCACGTTGC CATCCTGGAACTGCACCGTGCAAGCCAAGTGGACATTCAGGAGGGCA AAGACCTGATCCTGGACAAAATTAGCACCTGGACTCGCAACTTTATG GAACAGGAACTGCTGGATAACCAGATCTTGGATCGTAGCAAAAAAGA AATGGAATTTGCAATGCGTAAGTTTTACGGTACGTTCGATCGCGTGG AAACCCGTCGTTATATTGAAAGCTACAAAATGGATTCCTTCAAGATC CTGAAGGCAGCGTACCGTAGCTCCAACATTAACAATATTGACCTGTT GAAGTTCAGCGAGCACGACTTCAATCTCTGCCAGGCGCGTCACAAGG AAGAACTGCAGCAAATCAAACGCTGGTTCGCAGATTGCAAACTGGAG CAAGTCGGTAGCAGCCAGAACTACTTGTACACCTCTTACTTCCCGAT CGCGGCCATTTTGTTCGAGCCGGAGTATGGCGACGCACGCCTGGCGT TCGCGAAGTGCGGTATTATCGCGACCACCGTTGACGATTTTTTTGAC GGTTTTGCATGTAATGAAGAACTGCAAAACATCATCGAACTGGTCGA GAGATGGGACGGTTATCCGACGGTTGGTTTCCGCTCCGAGCGTGTGC GCATTTTCTTTCTGGCGCTGTACAAAATGATTGAAGAAATTGCCGCG AAAGCGGAAACGAAACAGGGCCGTTGCGTGAAAGATCTGTTGATCAA TCTGTGGATTGATCTGCTGAAATGCATGCTGGTCGAACTGGATCTGT GGAAAATTAAGAGCACGACCCCGAGCATTGAAGAGTATCTGAGCATT GCCTGTGTGACGACCGGCGTTAAGTGCTTGATCCTGATTAGCCTGCA 
TCTGCTGGGCCCGAAACTGAGCAAAGACGTGACCGAATCCAGCGAAG TTAGCGCTCTGTGGAACTGTACGGCCGTGGTTGCGCGCCTGAACAAC GACATTCATAGCTACAAGCGTGAGCAAGCCGAGAGCAGCACTAATAT GGCCGCAATCCTGATTTCGCAAAGCCAGCGTACCATCTCAGAAGAAG AAGCTATCCGCCAGATCAAAGAGATGATGGAATCGAAACGCCGTGAG CTGCTGGGCATGGTGCTGCAGAATAAAGAGAGCCAATTGCCGCAAGT CTGCAAAGACCTGTTTTGGACCACCTTCAAAGGCGCGTACAGCATTT ATACCCACGGTGATGAGTACCGTTTTCCACAAGAACTGAAGAACCAT ATCAACGATGTCATCTATAAGCCGTTAAATCAATACAGCCCTTAA SEQ ID NO: 25 NgSCS-del38, putative sclareol synthase from Nicotiana glutinosa MSHSTASSLEEAKERIRETFGKNELSSSSYDTAWVAMVPSRYSMNQP CFPRCLDWILENQREDGSWGLNPSLPLLVKDSLSSTLACLLALRKWR IGDNQVQRGLGFIETHGWAVDNVDQISPLGFDIIFPSMIKYAEKLNL DLPFDPNLVNMMLRERELTIERALKNEFEGNMANVEYFAEGLGELCH WKEIMLHQRRNGSPFDSPATTAAALIYHQHDEKCFGYLSSILKLHEN WVPTIYPTKVHSNLFFVDALQNLGVDRYFKTELKSVLDEIYRLWLEK NEEIFSDIAHCAMAFRLLRMNNYEVSSEELEGFVDQEHFFTTSGGKL ISHVAILELHRASQVDIQEGKDLILDKISTWTRNFMEQELLDNQILD RSKKEMEFAMRKFYGTFDRVETRRYIESYKMDSFKILKAAYRSSNIN NIDLLKFSEHDFNLCQARHKEELQQIKRWFADCKLEQVGSSQNYLYT SYFPIAAILFEPEYGDARLAFAKCGIIATTVDDFFDGFACNEELQNI IELVERWDGYPTVGFRSERVRIFFLALYKMIEEIAAKAETKQGRCVK DLLINLWIDLLKCMLVELDLWKIKSTTPSIEEYLSIACVTTGVKCLI LISLHLLGPKLSKDVTESSEVSALWNCTAVVARLNNDIHSYKREQAE SSTNMVAILISQSQRTISEEEAIRQIKEMMESKRRELLGMVLQNKES QLPQVCKDLFWTTFKAAYSIYTHGDEYRFPQELKNHINDVIYKPLNQ YSP SEQ ID NO: 26 Optimized cDNA for Saccharomyces cerevisiae expression encoding for SmCPS2. 
ATGGCTACTGTTGACGCTCCACAAGTTCACGACCACGACGGTACTAC TGTTCACCAAGGTCACGACGCTGTTAAGAACATCGAAGACCCAATCG AATACATCAGAACTTTGTTGAGAACTACTGGTGACGGTAGAATCTCT GTTTCTCCATACGACACTGCTTGGGTTGCTATGATCAAGGACGTTGA AGGTAGAGACGGTCCACAATTCCCATCTTCTTTGGAATGGATCGTTC AAAACCAATTGGAAGACGGTTCTTGGGGTGACCAAAAGTTGTTCTGT GTTTACGACAGATTGGTTAACACTATCGCTTGTGTTGTTGCTTTGAG ATCTTGGAACGTTCACGCTCACAAGGTTAAGAGAGGTGTTACTTACA TCAAGGAAAACGTTGACAAGTTGATGGAAGGTAACGAAGAACACATG ACTTGTGGTTTCGAAGTTGTTTTCCCAGCTTTGTTGCAAAAGGCTAA GTCTTTGGGTATCGAAGACTTGCCATACGACTCTCCAGCTGTTCAAG AAGTTTACCACGTTAGAGAACAAAAGTTGAAGAGAATCCCATTGGAA ATCATGCACAAGATCCCAACTTCTTTGTTGTTCTCTTTGGAAGGTTT GGAAAACTTGGACTGGGACAAGTTGTTGAAGTTGCAATCTGCTGACG GTTCTTTCTTGACTTCTCCATCTTCTACTGCTTTCGCTTTCATGCAA ACTAAGGACGAAAAGTGTTACCAATTCATCAAGAACACTATCGACAC
TTTCAACGGTGGTGCTCCACACACTTACCCAGTTGACGTTTTCGGTA GATTGTGGGCTATCGACAGATTGCAAAGATTGGGTATCTCTAGATTC TTCGAACCAGAAATCGCTGACTGTTTGTCTCACATCCACAAGTTCTG GACTGACAAGGGTGTTTTCTCTGGTAGAGAATCTGAATTCTGTGACA TCGACGACACTTCTATGGGTATGAGATTGATGAGAATGCACGGTTAC GACGTTGACCCAAACGTTTTGAGAAACTTCAAGCAAAAGGACGGTAA GTTCTCTTGTTACGGTGGTCAAATGATCGAATCTCCATCTCCAATCT ACAACTTGTACAGAGCTTCTCAATTGAGATTCCCAGGTGAAGAAATC TTGGAAGACGCTAAGAGATTCGCTTACGACTTCTTGAAGGAAAAGTT GGCTAACAACCAAATCTTGGACAAGTGGGTTATCTCTAAGCACTTGC CAGACGAAATCAAGTTGGGTTTGGAAATGCCATGGTTGGCTACTTTG CCAAGAGTTGAAGCTAAGTACTACATCCAATACTACGCTGGTTCTGG TGACGTTTGGATCGGTAAGACTTTGTACAGAATGCCAGAAATCTCTA ACGACACTTACCACGACTTGGCTAAGACTGACTTCAAGAGATGTCAA GCTAAGCACCAATTCGAATGGTTGTACATGCAAGAATGGTACGAATC TTGTGGTATCGAAGAATTCGGTATCTCTAGAAAGGACTTGTTGTTGT CTTACTTCTTGGCTACTGCTTCTATCTTCGAATTGGAAAGAACTAAC GAAAGAATCGCTTGGGCTAAGTCTCAAATCATCGCTAAGATGATCAC TTCTTTCTTCAACAAGGAAACTACTTCTGAAGAAGACAAGAGAGCTT TGTTGAACGAATTGGGTAACATCAACGGTTTGAACGACACTAACGGT GCTGGTAGAGAAGGTGGTGCTGGTTCTATCGCTTTGGCTACTTTGAC TCAATTCTTGGAAGGTTTCGACAGATACACTAGACACCAATTGAAGA ACGCTTGGTCTGTTTGGTTGACTCAATTGCAACACGGTGAAGCTGAC GACGCTGAATTGTTGACTAACACTTTGAACATCTGTGCTGGTCACAT CGCTTTCAGAGAAGAAATCTTGGCTCACAACGAATACAAGGCTTTGT CTAACTTGACTTCTAAGATCTGTAGACAATTGTCTTTCATCCAATCT GAAAAGGAAATGGGTGTTGAAGGTGAAATCGCTGCTAAGTCTTCTAT CAAGAACAAGGAATTGGAAGAAGACATGCAAATGTTGGTTAAGTTGG TTTTGGAAAAGTACGGTGGTATCGACAGAAACATCAAGAAGGCTTTC TTGGCTGTTGCTAAGACTTACTACTACAGAGCTTACCACGCTGCTGA CACTATCGACACTCACATGTTCAAGGTTTTGTTCGAACCAGTTGCTT AA SEQ ID NO: 27 Optimized cDNA for S. 
cerevisiae expression encoding for truncated SsScS from Salvia sclarea ATGGCTAAGATGAAGGAAAACTTCAAGAGAGAAGACGACAAGTTCCC AACTACTACTACTTTGAGATCTGAAGACATCCCATCTAACTTGTGTA TCATCGACACTTTGCAAAGATTGGGTGTTGACCAATTCTTCCAATAC GAAATCAACACTATCTTGGACAACACTTTCAGATTGTGGCAAGAAAA GCACAAGGTTATCTACGGTAACGTTACTACTCACGCTATGGCTTTCA GATTGTTGAGAGTTAAGGGTTACGAAGTTTCTTCTGAAGAATTGGCT CCATACGGTAACCAAGAAGCTGTTTCTCAACAAACTAACGACTTGCC AATGATCATCGAATTGTACAGAGCTGCTAACGAAAGAATCTACGAAG AAGAAAGATCTTTGGAAAAGATCTTGGCTTGGACTACTATCTTCTTG AACAAGCAAGTTCAAGACAACTCTATCCCAGACAAGAAGTTGCACAA GTTGGTTGAATTCTACTTGAGAAACTACAAGGGTATCACTATCAGAT TGGGTGCTAGAAGAAACTTGGAATTGTACGACATGACTTACTACCAA GCTTTGAAGTCTACTAACAGATTCTCTAACTTGTGTAACGAAGACTT CTTGGTTTTCGCTAAGCAAGACTTCGACATCCACGAAGCTCAAAACC AAAAGGGTTTGCAACAATTGCAAAGATGGTACGCTGACTGTAGATTG GACACTTTGAACTTCGGTAGAGACGTTGTTATCATCGCTAACTACTT GGCTTCTTTGATCATCGGTGACCACGCTTTCGACTACGTTAGATTGG CTTTCGCTAAGACTTCTGTTTTGGTTACTATCATGGACGACTTCTTC GACTGTCACGGTTCTTCTCAAGAATGTGACAAGATCATCGAATTGGT TAAGGAATGGAAGGAAAACCCAGACGCTGAATACGGTTCTGAAGAAT TGGAAATCTTGTTCATGGCTTTGTACAACACTGTTAACGAATTGGCT GAAAGAGCTAGAGTTGAACAAGGTAGATCTGTTAAGGAATTCTTGGT TAAGTTGTGGGTTGAAATCTTGTCTGCTTTCAAGATCGAATTGGACA CTTGGTCTAACGGTACTCAACAATCTTTCGACGAATACATCTCTTCT TCTTGGTTGTCTAACGGTTCTAGATTGACTGGTTTGTTGACTATGCA ATTCGTTGGTGTTAAGTTGTCTGACGAAATGTTGATGTCTGAAGAAT GTACTGACTTGGCTAGACACGTTTGTATGGTTGGTAGATTGTTGAAC GACGTTTGTTCTTCTGAAAGAGAAAGAGAAGAAAACATCGCTGGTAA GTCTTACTCTATCTTGTTGGCTACTGAAAAGGACGGTAGAAAGGTTT CTGAAGACGAAGCTATCGCTGAAATCAACGAAATGGTTGAATACCAC TGGAGAAAGGTTTTGCAAATCGTTTACAAGAAGGAATCTATCTTGCC AAGAAGATGTAAGGACGTTTTCTTGGAAATGGCTAAGGGTACTTTCT ACGCTTACGGTATCAACGACGAATTGACTTCTCCACAACAATCTAAG GAAGACATGAAGTCTTTCGTTTTCTAA SEQ ID NO: 28 Optimized cDNA for S. 
cerevisiae expression encoding for the GGPP synthase from Pantoea agglomerans ATGGTTTCTGGTTCTAAGGCTGGTGTTTCTCCACACAGAGAAATCGA AGTTATGAGACAATCTATCGACGACCACTTGGCTGGTTTGTTGCCAG AAACTGACTCTCAAGACATCGTTTCTTTGGCTATGAGAGAAGGTGTT ATGGCTCCAGGTAAGAGAATCAGACCATTGTTGATGTTGTTGGCTGC TAGAGACTTGAGATACCAAGGTTCTATGCCAACTTTGTTGGACTTGG CTTGTGCTGTTGAATTGACTCACACTGCTTCTTTGATGTTGGACGAC ATGCCATGTATGGACAACGCTGAATTGAGAAGAGGTCAACCAACTAC TCACAAGAAGTTCGGTGAATCTGTTGCTATCTTGGCTTCTGTTGGTT TGTTGTCTAAGGCTTTCGGTTTGATCGCTGCTACTGGTGACTTGCCA GGTGAAAGAAGAGCTCAAGCTGTTAACGAATTGTCTACTGCTGTTGG TGTTCAAGGTTTGGTTTTGGGTCAATTCAGAGACTTGAACGACGCTG CTTTGGACAGAACTCCAGACGCTATCTTGTCTACTAACCACTTGAAG ACTGGTATCTTGTTCTCTGCTATGTTGCAAATCGTTGCTATCGCTTC TGCTTCTTCTCCATCTACTAGAGAAACTTTGCACGCTTTCGCTTTGG ACTTCGGTCAAGCTTTCCAATTGTTGGACGACTTGAGAGACGACCAC CCAGAAACTGGTAAGGACAGAAACAAGGACGCTGGTAAGTCTACTTT GGTTAACAGATTGGGTGCTGACGCTGCTAGACAAAAGTTGAGAGAAC ACATCGACTCTGCTGACAAGCACTTGACTTTCGCTTGTCCACAAGGT GGTGCTATCAGACAATTCATGCACTTGTGGTTCGGTCACCACTTGGC TGACTGGTCTCCAGTTATGAAGATCGCTTAA SEQ ID NO: 29 Optimized cDNA for S. 
cerevisiae expression encoding for CfCPS1-del63 ATGGTTGCTACTGTTAACGCTCCACCAGTTCACGACCAAGACGACTC TACTGAAAACCAATGTCACGACGCTGTTAACAACATCGAAGACCCAA TCGAATACATCAGAACTTTGTTGAGAACTACTGGTGACGGTAGAATC TCTGTTTCTCCATACGACACTGCTTGGGTTGCTTTGATCAAGGACTT GCAAGGTAGAGACGCTCCAGAATTCCCATCTTCTTTGGAATGGATCA TCCAAAACCAATTGGCTGACGGTTCTTGGGGTGACGCTAAGTTCTTC TGTGTTTACGACAGATTGGTTAACACTATCGCTTGTGTTGTTGCTTT GAGATCTTGGGACGTTCACGCTGAAAAGGTTGAAAGAGGTGTTAGAT ACATCAACGAAAACGTTGAAAAGTTGAGAGACGGTAACGAAGAACAC ATGACTTGTGGTTTCGAAGTTGTTTTCCCAGCTTTGTTGCAAAGAGC TAAGTCTTTGGGTATCCAAGACTTGCCATACGACGCTCCAGTTATCC AAGAAATCTACCACTCTAGAGAACAAAAGTCTAAGAGAATCCCATTG GAAATGATGCACAAGGTTCCAACTTCTTTGTTGTTCTCTTTGGAAGG TTTGGAAAACTTGGAATGGGACAAGTTGTTGAAGTTGCAATCTGCTG ACGGTTCTTTCTTGACTTCTCCATCTTCTACTGCTTTCGCTTTCATG CAAACTAGAGACCCAAAGTGTTACCAATTCATCAAGAACACTATCCA AACTTTCAACGGTGGTGCTCCACACACTTACCCAGTTGACGTTTTCG GTAGATTGTGGGCTATCGACAGATTGCAAAGATTGGGTATCTCTAGA TTCTTCGAATCTGAAATCGCTGACTGTATCGCTCACATCCACAGATT CTGGACTGAAAAGGGTGTTTTCTCTGGTAGAGAATCTGAATTCTGTG ACATCGACGACACTTCTATGGGTGTTAGATTGATGAGAATGCACGGT TACGACGTTGACCCAAACGTTTTGAAGAACTTCAAGAAGGACGACAA GTTCTCTTGTTACGGTGGTCAAATGATCGAATCTCCATCTCCAATCT ACAACTTGTACAGAGCTTCTCAATTGAGATTCCCAGGTGAACAAATC TTGGAAGACGCTAACAAGTTCGCTTACGACTTCTTGCAAGAAAAGTT GGCTCACAACCAAATCTTGGACAAGTGGGTTATCTCTAAGCACTTGC CAGACGAAATCAAGTTGGGTTTGGAAATGCCATGGTACGCTACTTTG CCAAGAGTTGAAGCTAGATACTACATCCAATACTACGCTGGTTCTGG TGACGTTTGGATCGGTAAGACTTTGTACAGAATGCCAGAAATCTCTA ACGACACTTACCACGAATTGGCTAAGACTGACTTCAAGAGATGTCAA GCTCAACACCAATTCGAATGGATCTACATGCAAGAATGGTACGAATC TTGTAACATGGAAGAATTCGGTATCTCTAGAAAGGAATTGTTGGTTG CTTACTTCTTGGCTACTGCTTCTATCTTCGAATTGGAAAGAGCTAAC GAAAGAATCGCTTGGGCTAAGTCTCAAATCATCTCTACTATCATCGC
TTCTTTCTTCAACAACCAAAACACTTCTCCAGAAGACAAGTTGGCTT TCTTGACTGACTTCAAGAACGGTAACTCTACTAACATGGCTTTGGTT ACTTTGACTCAATTCTTGGAAGGTTTCGACAGATACACTTCTCACCA ATTGAAGAACGCTTGGTCTGTTTGGTTGAGAAAGTTGCAACAAGGTG AAGGTAACGGTGGTGCTGACGCTGAATTGTTGGTTAACACTTTGAAC ATCTGTGCTGGTCACATCGCTTTCAGAGAAGAAATCTTGGCTCACAA CGACTACAAGACTTTGTCTAACTTGACTTCTAAGATCTGTAGACAAT TGTCTCAAATCCAAAACGAAAAGGAATTGGAAACTGAAGGTCAAAAG ACTTCTATCAAGAACAAGGAATTGGAAGAAGACATGCAAAGATTGGT TAAGTTGGTTTTGGAAAAGTCTAGAGTTGGTATCAACAGAGACATGA AGAAGACTTTCTTGGCTGTTGTTAAGACTTACTACTACAAGGCTTAC CACTCTGCTCAAGCTATCGACAACCACATGTTCAAGGTTTTGTTCGA ACCAGTTGCTTAA SEQ ID NO: 30 Optimized cDNA for S. cerevisiae expression encoding for TaTps1-del59 ATGTACAGACAAAGAACTGACGAACCATCTGAAACTAGACAAATGAT CGACGACATCAGAACTGCTTTGGCTTCTTTGGGTGACGACGAAACTT CTATGTCTGTTTCTGCTTACGACACTGCTTTGGTTGCTTTGGTTAAG AACTTGGACGGTGGTGACGGTCCACAATTCCCATCTTGTATCGACTG GATCGTTCAAAACCAATTGCCAGACGGTTCTTGGGGTGACCCAGCTT TCTTCATGGTTCAAGACAGAATGATCTCTACTTTGGCTTGTGTTGTT GCTGTTAAGTCTTGGAACATCGACAGAGACAACTTGTGTGACAGAGG TGTTTTGTTCATCAAGGAAAACATGTCTAGATTGGTTGAAGAAGAAC AAGACTGGATGCCATGTGGTTTCGAAATCAACTTCCCAGCTTTGTTG GAAAAGGCTAAGGACTTGGACTTGGACATCCCATACGACCACCCAGT TTTGGAAGAAATCTACGCTAAGAGAAACTTGAAGTTGTTGAAGATCC CATTGGACGTTTTGCACGCTATCCCAACTACTTTGTTGTTCTCTGTT GAAGGTATGGTTGACTTGCCATTGGACTGGGAAAAGTTGTTGAGATT GAGATGTCCAGACGGTTCTTTCCACTCTTCTCCAGCTGCTACTGCTG CTGCTTTGTCTCACACTGGTGACAAGGAATGTCACGCTTTCTTGGAC AGATTGATCCAAAAGTTCGAAGGTGGTGTTCCATGTTCTCACTCTAT GGACACTTTCGAACAATTGTGGGTTGTTGACAGATTGATGAGATTGG GTATCTCTAGACACTTCACTTCTGAAATCCAACAATGTTTGGAATTC ATCTACAGAAGATGGACTCAAAAGGGTTTGGCTCACAACATGCACTG TCCAATCCCAGACATCGACGACACTGCTATGGGTTTCAGATTGTTGA GACAACACGGTTACGACGTTACTCCATCTGTTTTCAAGCACTTCGAA AAGGACGGTAAGTTCGTTTGTTTCCCAATGGAAACTAACCACGCTTC TGTTACTCCAATGCACAACACTTACAGAGCTTCTCAATTCATGTTCC CAGGTGACGACGACGTTTTGGCTAGAGCTGGTAGATACTGTAGAGCT TTCTTGCAAGAAAGACAATCTTCTAACAAGTTGTACGACAAGTGGAT CATCACTAAGGACTTGCCAGGTGAAGTTGGTTACACTTTGAACTTCC CATGGAAGTCTTCTTTGCCAAGAATCGAAACTAGAATGTACTTGGAC 
CAATACGGTGGTAACAACGACGTTTGGATCGCTAAGGTTTTGTACAG AATGAACTTGGTTTCTAACGACTTGTACTTGAAGATGGCTAAGGCTG ACTTCACTGAATACCAAAGATTGTCTAGAATCGAATGGAACGGTTTG AGAAAGTGGTACTTCAGAAACCACTTGCAAAGATACGGTGCTACTCC AAAGTCTGCTTTGAAGGCTTACTTCTTGGCTTCTGCTAACATCTTCG AACCAGGTAGAGCTGCTGAAAGATTGGCTTGGGCTAGAATGGCTGTT TTGGCTGAAGCTGTTACTACTCACTTCAGACACATCGGTGGTCCATG TTACTCTACTGAAAACTTGGAAGAATTGATCGACTTGGTTTCTTTCG ACGACGTTTCTGGTGGTTTGAGAGAAGCTTGGAAGCAATGGTTGATG GCTTGGACTGCTAAGGAATCTCACGGTTCTGTTGACGGTGACACTGC TTTGTTGTTCGTTAGAACTATCGAAATCTGTTCTGGTAGAATCGTTT CTTCTGAACAAAAGTTGAACTTGTGGGACTACTCTCAATTGGAACAA TTGACTTCTTCTATCTGTCACAAGTTGGCTACTATCGGTTTGTCTCA AAACGAAGCTTCTATGGAAAACACTGAAGACTTGCACCAACAAGTTG ACTTGGAAATGCAAGAATTGTCTTGGAGAGTTCACCAAGGTTGTCAC GGTATCAACAGAGAAACTAGACAAACTTTCTTGAACGTTGTTAAGTC TTTCTACTACTCTGCTCACTGTTCTCCAGAAACTGTTGACTCTCACA TCGCTAAGGTTATCTTCCAAGACGTTATCTAA SEQ ID NO: 31 Optimized cDNA for S. cerevisiae expression encoding for MvCps3-del63 ATGGCTCCACCAGAACAAAAGTACAACTCTACTGCTTTGGAACACGA CACTGAAATCATCGAAATCGAAGACCACATCGAATGTATCAGAAGAT TGTTGAGAACTGCTGGTGACGGTAGAATCTCTGTTTCTCCATACGAC ACTGCTTGGATCGCTTTGATCAAGGACTTGGACGGTCACGACTCTCC ACAATTCCCATCTTCTATGGAATGGGTTGCTGACAACCAATTGCCAG ACGGTTCTTGGGGTGACGAACACTTCGTTTGTGTTTACGACAGATTG GTTAACACTATCGCTTGTGTTGTTGCTTTGAGATCTTGGAACGTTCA CGCTCACAAGTGTGAAAAGGGTATCAAGTACATCAAGGAAAACGTTC ACAAGTTGGAAGACGCTAACGAAGAACACATGACTTGTGGTTTCGAA GTTGTTTTCCCAGCTTTGTTGCAAAGAGCTCAATCTATGGGTATCAA GGGTATCCCATACAACGCTCCAGTTATCGAAGAAATCTACAACTCTA GAGAAAAGAAGTTGAAGAGAATCCCAATGGAAGTTGTTCACAAGGTT GCTACTTCTTTGTTGTTCTCTTTGGAAGGTTTGGAAAACTTGGAATG GGAAAAGTTGTTGAAGTTGCAATCTCCAGACGGTTCTTTCTTGACTT CTCCATCTTCTACTGCTTTCGCTTTCATCCACACTAAGGACAGAAAG TGTTTCAACTTCATCAACAACATCGTTCACACTTTCAAGGGTGGTGC TCCACACACTTACCCAGTTGACATCTTCGGTAGATTGTGGGCTGTTG ACAGATTGCAAAGATTGGGTATCTCTAGATTCTTCGAATCTGAAATC GCTGAATTCTTGTCTCACGTTCACAGATTCTGGTCTGACGAAGCTGG TGTTTTCTCTGGTAGAGAATCTGTTTTCTGTGACATCGACGACACTT CTATGGGTTTGAGATTGTTGAGAATGCACGGTTACCACGTTGACCCA AACGTTTTGAAGAACTTCAAGCAATCTGACAAGTTCTCTTGTTACGG 
TGGTCAAATGATGGAATGTTCTTCTCCAATCTACAACTTGTACAGAG CTTCTCAATTGCAATTCCCAGGTGAAGAAATCTTGGAAGAAGCTAAC AAGTTCGCTTACAAGTTCTTGCAAGAAAAGTTGGAATCTAACCAAAT CTTGGACAAGTGGTTGATCTCTAACCACTTGTCTGACGAAATCAAGG TTGGTTTGGAAATGCCATGGTACGCTACTTTGCCAAGAGTTGAAACT TCTTACTACATCCACCACTACGGTGGTGGTGACGACGTTTGGATCGG TAAGACTTTGTACAGAATGCCAGAAATCTCTAACGACACTTACAGAG AATTGGCTAGATTGGACTTCAGAAGATGTCAAGCTCAACACCAATTG GAATGGATCTACATGCAAAGATGGTACGAATCTTGTAGAATGCAAGA ATTCGGTATCTCTAGAAAGGAAGTTTTGAGAGCTTACTTCTTGGCTT CTGGTACTATCTTCGAAGTTGAAAGAGCTAAGGAAAGAGTTGCTTGG GCTAGATCTCAAATCATCTCTCACATGATCAAGTCTTTCTTCAACAA GGAAACTACTTCTTCTGACCAAAAGCAAGCTTTGTTGACTGAATTGT TGTTCGGTAACATCTCTGCTTCTGAAACTGAAAAGAGAGAATTGGAC GGTGTTGTTGTTGCTACTTTGAGACAATTCTTGGAAGGTTTCGACAT CGGTACTAGACACCAAGTTAAGGCTGCTTGGGACGTTTGGTTGAGAA AGGTTGAACAAGGTGAAGCTCACGGTGGTGCTGACGCTGAATTGTGT ACTACTACTTTGAACACTTGTGCTAACCAACACTTGTCTTCTCACCC AGACTACAACACTTTGTCTAAGTTGACTAACAAGATCTGTCACAAGT TGTCTCAAATCCAACACCAAAAGGAAATGAAGGGTGGTATCAAGGCT AAGTGTTCTATCAACAACAAGGAAGTTGACATCGAAATGCAATGGTT GGTTAAGTTGGTTTTGGAAAAGTCTGGTTTGAACAGAAAGGCTAAGC AAGCTTTCTTGTCTATCGCTAAGACTTACTACTACAGAGCTTACTAC GCTGACCAAACTATGGACGCTCACATCTTCAAGGTTTTGTTCGAACC AGTTGTTTAA SEQ ID NO: 32 Optimized cDNA for S. 
cerevisiae expression encoding for RoCPS1-del67 ATGGCTTCTCAAGTTTCTGAAAAGGGTACTTCTTCTCCAGTTCAAAC TCCAGAAGAAGTTAACGAAAAGATCGAAAACTACATCGAATACATCA AGAACTTGTTGACTACTTCTGGTGACGGTAGAATCTCTGTTTCTCCA TACGACACTTCTATCGTTGCTTTGATCAAGGACTTGAAGGGTAGAGA CACTCCACAATTCCCATCTTGTTTGGAATGGATCGCTCAACACCAAA TGGCTGACGGTTCTTGGGGTGACGAATTCTTCTGTATCTACGACAGA ATCTTGAACACTTTGGCTTGTGTTGTTGCTTTGAAGTCTTGGAACGT TCACGCTGACATGATCGAAAAGGGTGTTACTTACGTTAACGAAAACG TTCAAAAGTTGGAAGACGGTAACTTGGAACACATGACTTCTGGTTTC GAAATCGTTGTTCCAGCTTTGGTTCAAAGAGCTCAAGACTTGGGTAT CCAAGGTTTGCCATACGACCACCCATTGATCAAGGAAATCGCTAACA CTAAGGAAGGTAGATTGAAGAAGATCCCAAAGGACATGATCTACCAA AAGCCAACTACTTTGTTGTTCTCTTTGGAAGGTTTGGGTGACTTGGA ATGGGAAAAGATCTTGAAGTTGCAATCTGGTGACGGTTCTTTCTTGA CTTCTCCATCTTCTACTGCTCACGTTTTCATGAAGACTAAGGACGAA TAAGTGTTGAAGTTCATCGAAAACGCTGTTAAGAACTGTAACGGTGG
TGCTCCACACACTTACCCAGTTGACGTTTTCGCTAGATTGTGGGCTG TTGACAGATTGCAAAGATTGGGTATCTCTAGATTCTTCCAACAAGAA ATCAAGTACTTCTTGGACCACATCAACTCTGTTTGGACTGAAAACGG TGTTTTCTCTGGTAGAGACTCTGAATTCTGTGACATCGACGACACTT CTATGGGTATCAGATTGTTGAAGATGCACGGTTACGACATCGACCCA AACGCTTTGGAACACTTCAAGCAACAAGACGGTAAGTTCTCTTGTTA CGGTGGTCAAATGATCGAATCTGCTTCTCCAATCTACAACTTGTACA GAGCTGCTCAATTGAGATTCCCAGGTGAAGAAATCTTGGAAGAAGCT ACTAAGTTCGCTTACAACTTCTTGCAAGAAAAGATCGCTAACGACCA ATTCCAAGAAAAGTGGGTTATCTCTGACCACTTGATCGACGAAGTTA AGTTGGGTTTGAAGATGCCATGGTACGCTACTTTGCCAAGAGTTGAA TGCTGCTTACTACTTGCAATACTACGCTGGTTGTGGGACGTTTGGAT CGGTAAGGTTTTCTACAGAATGCCAGAAATCTCTAACGACACTTACA AGAAGTTGGCTATCTTGGACTTCAACAGATGTCAAGCTCAACACCAA TTCGAATGGATCTACATGCAAGAATGGTACCACAGATCTTCTGTTTC TGAATTCGGTATCTCTAAGAAGGACTTGTTGAGAGCTTACTTCTTGG CTGCTGCTACTATCTTCGAACCAGAAAGAACTCAAGAAAGATTGGTT TGGGCTAAGACTCAAATCGTTTCTGGTATGATCACTTCTTTCGTTAA CTCTGGTACTACTTTGTCTTTGCACCAAAAGACTGCTTTGTTGTCTC AAATCGGTCACAACTTCGACGGTTTGGACGAAATCATCTCTGCTATG AAGGACCACGGTTTGGCTGCTACTTTGTTGACTACTTTCCAACAATT GTTGGACGGTTTCGACAGATACACTAGACACCAATTGAAGAACGCTT GGTCTCAATGGTTCATGAAGTTGCAACAAGGTGAAGCTTCTGGTGGT GAAGACGCTGAATTGTTGGCTAACACTTTGAACATCTGTGCTGGTTT GATCGCTTTCAACGAAGACGTTTTGTCTCACCACGAATACACTACTT TGTCTACTTTGACTAACAAGATCTGTAAGAGATTGACTCAAATCCAA GACAAGAAGACTTTGGAAGTTGTTGACGGTTCTATCAAGGACAAGGA ATTGGAAAAGGACATCCAAATGTTGGTTAAGTTGGTTTTGGAAGAAA ACGGTGGTGGTGTTGACAGAAACATCAAGCACACTTTCTTGTCTGTT TTCAAGACTTTCTACTACAACGCTTACCACGACGACGAAACTACTGA CGTTCACATCTTCAAGGTTTTGTTCGGTCCAGTTGTTTAA SEQ ID NO: 33 Optimized cDNA for S. 
cerevisiae expression encoding for NgSCS-del29 ATGGCTAACTTCCACAGACCATCTAGAGTTAGATGTTCTCACTCTAC TGCTTCTTCTTTGGAAGAAGCTAAGGAAAGAATCAGAGAAACTTTCG GTAAGAACGAATTGTCTCCATCTTCTTACGACACTGCTTGGGTTGCT ATGGTTCCATCTAGATACTCTATGAACCAACCATGTTTCCCAAGATG TTTGGACTGGATCTTGGAAAACCAAAGAGAAGACGGTTCTTGGGGTT TGAACCCATCTCACCCATTGTTGGTTAAGGACTCTTTGTCTTCTACT TTGGCTTGTTTGTTGGCTTTGAGAAAGTGGAGAATCGGTGACAACCA AGTTCAAAGAGGTTTGGGTTTCATCGAAACTCACGGTTGGGCTGTTG ACAACGTTGACCAAATCTCTCCATTGGGTTTCGACATCATCTTCCCA TCTATGATCAAGTACGCTGAAAAGTTGAACTTGGACTTGCCATTCGA CCCAAACTTGGTTAACATGATGTTGAGAGAAAGAGAATTGACTATCG AAAGAGCTTTGAAGAACGAATTCGAAGGTAACATGGCTAACGTTGAA TACTTCGCTGAAGGTTTGGGTGAATTGTGTCACTGGAAGGAAATCAT GTTGCACCAAAGAAGAAACGGTTCTTTGTTCGACTCTCCAGCTACTA CTGCTGCTGCTTTGATCTACCACCAACACGACGAAAAGTGTTTCGGT TACTTGTCTTCTATCTTGAAGTTGCACGAAAACTGGGTTCCAACTAT CTACCCAACTAAGGTTCACTCTAACTTGTTCTTCGTTGACGCTTTGC AAAACTTGGGTGTTGACAGATACTTCAAGACTGAATTGAAGTCTGTT TTGGACGAAATCTACAGATTGTGGTTGGAAAAGAACGAAGAAATCTT CTCTGACATCGCTCACTGTGCTATGGCTTTCAGATTGTTGAGAATGA ACAACTACGAAGTTTCTTCTGAAGAATTGGAAGGTTTCGTTGACCAA GAACACTTCTTCACTACTTCTGGTGGTAAGTTGATCTCTCACGTTGC TATCTTGGAATTGCACAGAGCTTCTCAAGTTGACATCCAAGAAGGTA AGGACTTGATCTTGGACAAGATCTCTACTTGGACTAGAAACTTCATG GAACAAGAATTGTTGGACAACCAAATCTTGGACAGATCTAAGAAGGA AATGGAATTCGCTATGAGAAAGTTCTACGGTACTTTCGACAGAGTTG AAACTAGAAGATACATCGAATCTTACAAGATGGACTCTTTCAAGATC TTGAAGGCTGCTTACAGATCTTCTAACATCAACAACATCGACTTGTT GAAGTTCTCTGAACACGACTTCAACTTGTGTCAAGCTAGACACAAGG AAGAATTGCAACAAATCAAGAGATGGTTCGCTGACTGTAAGTTGGAA CAAGTTGGTTCTTCTCAAAACTACTTGTACACTTCTTACTTCCCAAT CGCTGCTATCTTGTTCGAACCAGAATACGGTGACGCTAGATTGGCTT TCGCTAAGTGTGGTATCATCGCTACTACTGTTGACGACTTCTTCGAC GGTTTCGCTTGTAACGAAGAATTGCAAAACATCATCGAATTGGTTGA AAGATGGGACGGTTACCCAACTGTTGGTTTCAGATCTGAAAGAGTTA GAATCTTCTTCTTGGCTTTGTACAAGATGATCGAAGAAATCGCTGCT AAGGCTGAAACTAAGCAAGGTAGATGTGTTAAGGACTTGTTGATCAA CTTGTGGATCGACTTGTTGAAGTGTATGTTGGTTGAATTGGACTTGT GGAAGATCAAGTCTACTACTCCATCTATCGAAGAATACTTGTCTATC GCTTGTGTTACTACTGGTGTTAAGTGTTTGATCTTGATCTCTTTGCA 
CTTGTTGGGTCCAAAGTTGTCTAAGGACGTTACTGAATCTTCTGAAG TTTCTGCTTTGTGGAACTGTACTGCTGTTGTTGCTAGATTGAACAAC GACATCCACTCTTACAAGAGAGAACAAGCTGAATCTTCTACTAACAT GGCTGCTATCTTGATCTCTCAATCTCAAAGAACTATCTCTGAAGAAG AAGCTATCAGACAAATCAAGGAAATGATGGAATCTAAGAGAAGAGAA TTGTTGGGTATGGTTTTGCAAAACAAGGAATCTCAATTGCCACAAGT TTGTAAGGACTTGTTCTGGACTACTTTCAAGGCTGCTTACTCTATCT ACACTCACGGTGACGAATACAGATTCCCACAAGAATTGAAGAACCAC ATCAACGACGTTATCTACAAGCCATTGAACCAATACTCTCCATAA SEQ ID NO: 34 Optimized cDNA for S. cerevisiae expression encoding for NgSCS-del38 ATGTCTCACTCTACTGCTTCTTCTTTGGAAGAAGCTAAGGAAAGAAT CAGAGAAACTTTCGGTAAGAACGAATTGTCTTCTTCTTCTTACGACA CTGCTTGGGTTGCTATGGTTCCATCTAGATACTCTATGAACCAACCA TGTTTCCCAAGATGTTTGGACTGGATCTTGGAAAACCAAAGAGAAGA CGGTTCTTGGGGTTTGAACCCATCTTTGCCATTGTTGGTTAAGGACT CTTTGTCTTCTACTTTGGCTTGTTTGTTGGCTTTGAGAAAGTGGAGA ATCGGTGACAACCAAGTTCAAAGAGGTTTGGGTTTCATCGAAACTCA CGGTTGGGCTGTTGACAACGTTGACCAAATCTCTCCATTGGGTTTCG ACATCATCTTCCCATCTATGATCAAGTACGCTGAAAAGTTGAACTTG GACTTGCCATTCGACCCAAACTTGGTTAACATGATGTTGAGAGAAAG AGAATTGACTATCGAAAGAGCTTTGAAGAACGAATTCGAAGGTAACA TGGCTAACGTTGAATACTTCGCTGAAGGTTTGGGTGAATTGTGTCAC TGGAAGGAAATCATGTTGCACCAAAGAAGAAACGGTTCTCCATTCGA CTCTCCAGCTACTACTGCTGCTGCTTTGATCTACCACCAACACGACG AAAAGTGTTTCGGTTACTTGTCTTCTATCTTGAAGTTGCACGAAAAC TGGGTTCCAACTATCTACCCAACTAAGGTTCACTCTAACTTGTTCTT CGTTGACGCTTTGCAAAACTTGGGTGTTGACAGATACTTCAAGACTG AATTGAAGTCTGTTTTGGACGAAATCTACAGATTGTGGTTGGAAAAG AACGAAGAAATCTTCTCTGACATCGCTCACTGTGCTATGGCTTTCAG ATTGTTGAGAATGAACAACTACGAAGTTTCTTCTGAAGAATTGGAAG GTTTCGTTGACCAAGAACACTTCTTCACTACTTCTGGTGGTAAGTTG ATCTCTCACGTTGCTATCTTGGAATTGCACAGAGCTTCTCAAGTTGA CATCCAAGAAGGTAAGGACTTGATCTTGGACAAGATCTCTACTTGGA CTAGAAACTTCATGGAACAAGAATTGTTGGACAACCAAATCTTGGAC AGATCTAAGAAGGAAATGGAATTCGCTATGAGAAAGTTCTACGGTAC TTTCGACAGAGTTGAAACTAGAAGATACATCGAATCTTACAAGATGG ACTCTTTCAAGATCTTGAAGGCTGCTTACAGATCTTCTAACATCAAC AACATCGACTTGTTGAAGTTCTCTGAACACGACTTCAACTTGTGTCA AGCTAGACACAAGGAAGAATTGCAACAAATCAAGAGATGGTTCGCTG ACTGTAAGTTGGAACAAGTTGGTTCTTCTCAAAACTACTTGTACACT 
TCTTACTTCCCAATCGCTGCTATCTTGTTCGAACCAGAATACGGTGA CGCTAGATTGGCTTTCGCTAAGTGTGGTATCATCGCTACTACTGTTG ACGACTTCTTCGACGGTTTCGCTTGTAACGAAGAATTGCAAAACATC ATCGAATTGGTTGAAAGATGGGACGGTTACCCAACTGTTGGTTTCAG ATCTGAAAGAGTTAGAATCTTCTTCTTGGCTTTGTACAAGATGATCG AAGAAATCGCTGCTAAGGCTGAAACTAAGCAAGGTAGATGTGTTAAG GACTTGTTGATCAACTTGTGGATCGACTTGTTGAAGTGTATGTTGGT TGAATTGGACTTGTGGAAGATCAAGTCTACTACTCCATCTATCGAAG AATACTTGTCTATCGCTTGTGTTACTACTGGTGTTAAGTGTTTGATC TTGATCTCTTTGCACTTGTTGGGTCCAAAGTTGTCTAAGGACGTTAC TGAATCTTCTGAAGTTTCTGCTTTGTGGAACTGTACTGCTGTTGTTG CTAGATTGAACAACGACATCCACTCTTACAAGAGAGAACAAGCTGAA
TCTTCTACTAACATGGTTGCTATCTTGATCTCTCAATCTCAAAGAAC TATCTCTGAAGAAGAAGCTATCAGACAAATCAAGGAAATGATGGAAT CTAAGAGAAGAGAATTGTTGGGTATGGTTTTGCAAAACAAGGAATCT CAATTGCCACAAGTTTGTAAGGACTTGTTCTGGACTACTTTCAAGGC TGCTTACTCTATCTACACTCACGGTGACGAATACAGATTCCCACAAG AATTGAAGAACCACATCAACGACGTTATCTACAAGCCATTGAACCAA TACTCTCCATAA
SEQ ID NO: 35 Primer for construction of fragment "a" (URA3 yeast marker) AGGTGCAGTTCGCGTGCAATTATAACGTCGTGGCAACTGTTATCAGT CGTACCGCGCCATTGAGAGTGCACCATACCACAGCTTT
SEQ ID NO: 36 Primer for construction of fragment "a" (URA3 yeast marker) TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCG TTGTTGCTGACCAGCGGTATTTCACACCGCATAGGGTA
SEQ ID NO: 37 Primer for construction of fragment "b" (AmpR E. coli marker) TGGTCAGCAACAACGCCGAAGAATCACTCTCGTGTTGAGAATTGCAC GCCTTGACCACGACACGTTAAGGGATTTTGGTCATGAG
SEQ ID NO: 38 Primer for construction of fragment "b" (AmpR E. coli marker) AACGCGTACCCTAAGTACGGCACCACAGTGACTATGCAGTCCGCACT TTGCCAATGCCAAAAATGTGCGCGGAACCCCTA
SEQ ID NO: 39 Primer for construction of fragment "c" (Yeast origin of replication) TTGGCATTGGCAAAGTGCGGACTGCATAGTCACTGTGGTGCCGTACT TAGGGTACGCGTTCCTGAACGAAGCATCTGTGCTTCA
SEQ ID NO: 40 Primer for construction of fragment "c" (Yeast origin of replication) CCGAGATGCCAAAGGATAGGTGCTATGTTGATGACTACGACACAGAA CTGCGGGTGACATAATGATAGCATTGAAGGATGAGACT
SEQ ID NO: 41 Primer for construction of fragment "d" (E. coli origin of replication) ATGTCACCCGCAGTTCTGTGTCGTAGTCATCAACATAGCACCTATCC TTTGGCATCTCGGTGAGCAAAAGGCCAGCAAAAGG
SEQ ID NO: 42 Primer for construction of fragment "d" (E. coli origin of replication) CTCAGATGTACGGTGATCGCCACCATGTGACGGAAGCTATCCTGACA GTGTAGCAAGTGCTGAGCGTCAGACCCCGTAGAA
SEQ ID NO: 43 Part of fragment "d" obtained by DNA synthesis ATTCCTAGTGACGGCCTTGGGAACTCGATACACGATGTTCAGTAGAC CGCTCACACATGG
SEQ ID NO: 44 Primer for construction of fragment "a" (LEU2 yeast marker) AGGTGCAGTTCGCGTGCAATTATAACGTCGTGGCAACTGTTATCAGT CGTACCGCGCCATTCGACTACGTCGTAAGGCC
SEQ ID NO: 45 Primer for construction of fragment "a" (LEU2 yeast marker) TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCG TTGTTGCTGACCATCGACGGTCGAGGAGAACTT
Sequence Listing (number of SEQ ID NOs: 45)
SEQ ID NO: 1 (length: 793; type: PRT; organism: Salvia miltiorrhiza)
Met Ala Ser Leu Ser Ser Thr Ile Leu Ser Arg
Ser Pro Ala Ala Arg1 5 10
15Arg Arg Ile Thr Pro Ala Ser Ala Lys Leu His Arg Pro Glu Cys Phe
20 25 30Ala Thr Ser Ala Trp Met Gly
Ser Ser Ser Lys Asn Leu Ser Leu Ser 35 40
45Tyr Gln Leu Asn His Lys Lys Ile Ser Val Ala Thr Val Asp Ala
Pro 50 55 60Gln Val His Asp His Asp
Gly Thr Thr Val His Gln Gly His Asp Ala65 70
75 80Val Lys Asn Ile Glu Asp Pro Ile Glu Tyr Ile
Arg Thr Leu Leu Arg 85 90
95Thr Thr Gly Asp Gly Arg Ile Ser Val Ser Pro Tyr Asp Thr Ala Trp
100 105 110Val Ala Met Ile Lys Asp
Val Glu Gly Arg Asp Gly Pro Gln Phe Pro 115 120
125Ser Ser Leu Glu Trp Ile Val Gln Asn Gln Leu Glu Asp Gly
Ser Trp 130 135 140Gly Asp Gln Lys Leu
Phe Cys Val Tyr Asp Arg Leu Val Asn Thr Ile145 150
155 160Ala Cys Val Val Ala Leu Arg Ser Trp Asn
Val His Ala His Lys Val 165 170
175Lys Arg Gly Val Thr Tyr Ile Lys Glu Asn Val Asp Lys Leu Met Glu
180 185 190Gly Asn Glu Glu His
Met Thr Cys Gly Phe Glu Val Val Phe Pro Ala 195
200 205Leu Leu Gln Lys Ala Lys Ser Leu Gly Ile Glu Asp
Leu Pro Tyr Asp 210 215 220Ser Pro Ala
Val Gln Glu Val Tyr His Val Arg Glu Gln Lys Leu Lys225
230 235 240Arg Ile Pro Leu Glu Ile Met
His Lys Ile Pro Thr Ser Leu Leu Phe 245
250 255Ser Leu Glu Gly Leu Glu Asn Leu Asp Trp Asp Lys
Leu Leu Lys Leu 260 265 270Gln
Ser Ala Asp Gly Ser Phe Leu Thr Ser Pro Ser Ser Thr Ala Phe 275
280 285Ala Phe Met Gln Thr Lys Asp Glu Lys
Cys Tyr Gln Phe Ile Lys Asn 290 295
300Thr Ile Asp Thr Phe Asn Gly Gly Ala Pro His Thr Tyr Pro Val Asp305
310 315 320Val Phe Gly Arg
Leu Trp Ala Ile Asp Arg Leu Gln Arg Leu Gly Ile 325
330 335Ser Arg Phe Phe Glu Pro Glu Ile Ala Asp
Cys Leu Ser His Ile His 340 345
350Lys Phe Trp Thr Asp Lys Gly Val Phe Ser Gly Arg Glu Ser Glu Phe
355 360 365Cys Asp Ile Asp Asp Thr Ser
Met Gly Met Arg Leu Met Arg Met His 370 375
380Gly Tyr Asp Val Asp Pro Asn Val Leu Arg Asn Phe Lys Gln Lys
Asp385 390 395 400Gly Lys
Phe Ser Cys Tyr Gly Gly Gln Met Ile Glu Ser Pro Ser Pro
405 410 415Ile Tyr Asn Leu Tyr Arg Ala
Ser Gln Leu Arg Phe Pro Gly Glu Glu 420 425
430Ile Leu Glu Asp Ala Lys Arg Phe Ala Tyr Asp Phe Leu Lys
Glu Lys 435 440 445Leu Ala Asn Asn
Gln Ile Leu Asp Lys Trp Val Ile Ser Lys His Leu 450
455 460Pro Asp Glu Ile Lys Leu Gly Leu Glu Met Pro Trp
Leu Ala Thr Leu465 470 475
480Pro Arg Val Glu Ala Lys Tyr Tyr Ile Gln Tyr Tyr Ala Gly Ser Gly
485 490 495Asp Val Trp Ile Gly
Lys Thr Leu Tyr Arg Met Pro Glu Ile Ser Asn 500
505 510Asp Thr Tyr His Asp Leu Ala Lys Thr Asp Phe Lys
Arg Cys Gln Ala 515 520 525Lys His
Gln Phe Glu Trp Leu Tyr Met Gln Glu Trp Tyr Glu Ser Cys 530
535 540Gly Ile Glu Glu Phe Gly Ile Ser Arg Lys Asp
Leu Leu Leu Ser Tyr545 550 555
560Phe Leu Ala Thr Ala Ser Ile Phe Glu Leu Glu Arg Thr Asn Glu Arg
565 570 575Ile Ala Trp Ala
Lys Ser Gln Ile Ile Ala Lys Met Ile Thr Ser Phe 580
585 590Phe Asn Lys Glu Thr Thr Ser Glu Glu Asp Lys
Arg Ala Leu Leu Asn 595 600 605Glu
Leu Gly Asn Ile Asn Gly Leu Asn Asp Thr Asn Gly Ala Gly Arg 610
615 620Glu Gly Gly Ala Gly Ser Ile Ala Leu Ala
Thr Leu Thr Gln Phe Leu625 630 635
640Glu Gly Phe Asp Arg Tyr Thr Arg His Gln Leu Lys Asn Ala Trp
Ser 645 650 655Val Trp Leu
Thr Gln Leu Gln His Gly Glu Ala Asp Asp Ala Glu Leu 660
665 670Leu Thr Asn Thr Leu Asn Ile Cys Ala Gly
His Ile Ala Phe Arg Glu 675 680
685Glu Ile Leu Ala His Asn Glu Tyr Lys Ala Leu Ser Asn Leu Thr Ser 690
695 700Lys Ile Cys Arg Gln Leu Ser Phe
Ile Gln Ser Glu Lys Glu Met Gly705 710
715 720Val Glu Gly Glu Ile Ala Ala Lys Ser Ser Ile Lys
Asn Lys Glu Leu 725 730
735Glu Glu Asp Met Gln Met Leu Val Lys Leu Val Leu Glu Lys Tyr Gly
740 745 750Gly Ile Asp Arg Asn Ile
Lys Lys Ala Phe Leu Ala Val Ala Lys Thr 755 760
765Tyr Tyr Tyr Arg Ala Tyr His Ala Ala Asp Thr Ile Asp Thr
His Met 770 775 780Phe Lys Val Leu Phe
Glu Pro Val Ala785 790
SEQ ID NO: 2 (length: 736; type: PRT; organism: Artificial Sequence; description: Truncated copalyl diphosphate synthase)
Met Ala Thr Val Asp Ala Pro Gln Val His Asp
His Asp Gly Thr Thr1 5 10
15Val His Gln Gly His Asp Ala Val Lys Asn Ile Glu Asp Pro Ile Glu
20 25 30Tyr Ile Arg Thr Leu Leu Arg
Thr Thr Gly Asp Gly Arg Ile Ser Val 35 40
45Ser Pro Tyr Asp Thr Ala Trp Val Ala Met Ile Lys Asp Val Glu
Gly 50 55 60Arg Asp Gly Pro Gln Phe
Pro Ser Ser Leu Glu Trp Ile Val Gln Asn65 70
75 80Gln Leu Glu Asp Gly Ser Trp Gly Asp Gln Lys
Leu Phe Cys Val Tyr 85 90
95Asp Arg Leu Val Asn Thr Ile Ala Cys Val Val Ala Leu Arg Ser Trp
100 105 110Asn Val His Ala His Lys
Val Lys Arg Gly Val Thr Tyr Ile Lys Glu 115 120
125Asn Val Asp Lys Leu Met Glu Gly Asn Glu Glu His Met Thr
Cys Gly 130 135 140Phe Glu Val Val Phe
Pro Ala Leu Leu Gln Lys Ala Lys Ser Leu Gly145 150
155 160Ile Glu Asp Leu Pro Tyr Asp Ser Pro Ala
Val Gln Glu Val Tyr His 165 170
175Val Arg Glu Gln Lys Leu Lys Arg Ile Pro Leu Glu Ile Met His Lys
180 185 190Ile Pro Thr Ser Leu
Leu Phe Ser Leu Glu Gly Leu Glu Asn Leu Asp 195
200 205Trp Asp Lys Leu Leu Lys Leu Gln Ser Ala Asp Gly
Ser Phe Leu Thr 210 215 220Ser Pro Ser
Ser Thr Ala Phe Ala Phe Met Gln Thr Lys Asp Glu Lys225
230 235 240Cys Tyr Gln Phe Ile Lys Asn
Thr Ile Asp Thr Phe Asn Gly Gly Ala 245
250 255Pro His Thr Tyr Pro Val Asp Val Phe Gly Arg Leu
Trp Ala Ile Asp 260 265 270Arg
Leu Gln Arg Leu Gly Ile Ser Arg Phe Phe Glu Pro Glu Ile Ala 275
280 285Asp Cys Leu Ser His Ile His Lys Phe
Trp Thr Asp Lys Gly Val Phe 290 295
300Ser Gly Arg Glu Ser Glu Phe Cys Asp Ile Asp Asp Thr Ser Met Gly305
310 315 320Met Arg Leu Met
Arg Met His Gly Tyr Asp Val Asp Pro Asn Val Leu 325
330 335Arg Asn Phe Lys Gln Lys Asp Gly Lys Phe
Ser Cys Tyr Gly Gly Gln 340 345
350Met Ile Glu Ser Pro Ser Pro Ile Tyr Asn Leu Tyr Arg Ala Ser Gln
355 360 365Leu Arg Phe Pro Gly Glu Glu
Ile Leu Glu Asp Ala Lys Arg Phe Ala 370 375
380Tyr Asp Phe Leu Lys Glu Lys Leu Ala Asn Asn Gln Ile Leu Asp
Lys385 390 395 400Trp Val
Ile Ser Lys His Leu Pro Asp Glu Ile Lys Leu Gly Leu Glu
405 410 415Met Pro Trp Leu Ala Thr Leu
Pro Arg Val Glu Ala Lys Tyr Tyr Ile 420 425
430Gln Tyr Tyr Ala Gly Ser Gly Asp Val Trp Ile Gly Lys Thr
Leu Tyr 435 440 445Arg Met Pro Glu
Ile Ser Asn Asp Thr Tyr His Asp Leu Ala Lys Thr 450
455 460Asp Phe Lys Arg Cys Gln Ala Lys His Gln Phe Glu
Trp Leu Tyr Met465 470 475
480Gln Glu Trp Tyr Glu Ser Cys Gly Ile Glu Glu Phe Gly Ile Ser Arg
485 490 495Lys Asp Leu Leu Leu
Ser Tyr Phe Leu Ala Thr Ala Ser Ile Phe Glu 500
505 510Leu Glu Arg Thr Asn Glu Arg Ile Ala Trp Ala Lys
Ser Gln Ile Ile 515 520 525Ala Lys
Met Ile Thr Ser Phe Phe Asn Lys Glu Thr Thr Ser Glu Glu 530
535 540Asp Lys Arg Ala Leu Leu Asn Glu Leu Gly Asn
Ile Asn Gly Leu Asn545 550 555
560Asp Thr Asn Gly Ala Gly Arg Glu Gly Gly Ala Gly Ser Ile Ala Leu
565 570 575Ala Thr Leu Thr
Gln Phe Leu Glu Gly Phe Asp Arg Tyr Thr Arg His 580
585 590Gln Leu Lys Asn Ala Trp Ser Val Trp Leu Thr
Gln Leu Gln His Gly 595 600 605Glu
Ala Asp Asp Ala Glu Leu Leu Thr Asn Thr Leu Asn Ile Cys Ala 610
615 620Gly His Ile Ala Phe Arg Glu Glu Ile Leu
Ala His Asn Glu Tyr Lys625 630 635
640Ala Leu Ser Asn Leu Thr Ser Lys Ile Cys Arg Gln Leu Ser Phe
Ile 645 650 655Gln Ser Glu
Lys Glu Met Gly Val Glu Gly Glu Ile Ala Ala Lys Ser 660
665 670Ser Ile Lys Asn Lys Glu Leu Glu Glu Asp
Met Gln Met Leu Val Lys 675 680
685Leu Val Leu Glu Lys Tyr Gly Gly Ile Asp Arg Asn Ile Lys Lys Ala 690
695 700Phe Leu Ala Val Ala Lys Thr Tyr
Tyr Tyr Arg Ala Tyr His Ala Ala705 710
715 720Asp Thr Ile Asp Thr His Met Phe Lys Val Leu Phe
Glu Pro Val Ala 725 730
735
SEQ ID NO: 3 (length: 2211; type: DNA; organism: Artificial Sequence; description: Optimized cDNA for E. coli expression encoding for SmCPS2)
atggcaactg ttgacgcacc tcaagtccat gatcacgatg
gcaccaccgt tcaccagggt 60cacgacgcgg tgaagaacat cgaggacccg atcgaataca
ttcgtaccct gctgcgtacc 120actggtgatg gtcgcatcag cgtcagcccg tatgacacgg
cgtgggtggc gatgattaaa 180gacgtcgagg gtcgcgatgg cccgcaattt ccttctagcc
tggagtggat tgtccaaaat 240cagctggaag atggctcgtg gggtgaccag aagctgtttt
gtgtttacga tcgcctggtt 300aataccatcg catgtgtggt tgcgctgcgt agctggaatg
ttcacgctca taaagtcaaa 360cgtggcgtga cgtatatcaa ggaaaacgtg gataagctga
tggaaggcaa cgaagaacac 420atgacgtgtg gcttcgaggt tgtttttcca gccttgctgc
agaaagcaaa gtccctgggt 480attgaggatc tgccgtacga ctcgccggca gtgcaagaag
tctatcacgt ccgcgagcag 540aagctgaaac gcatcccgct ggagattatg cataagattc
cgacctctct gctgttctct 600ctggaaggtc tggagaacct ggattgggac aaactgctga
agctgcagtc cgctgacggt 660agctttctga ccagcccgag cagcacggcc tttgcgttta
tgcagaccaa agatgagaag 720tgctatcaat tcatcaagaa tactattgat accttcaacg
gtggcgcacc gcacacgtac 780ccagtagacg tttttggtcg cctgtgggcg attgaccgtt
tgcagcgtct gggtatcagc 840cgtttcttcg agccggagat tgcggactgc ttgagccata
ttcacaaatt ctggacggac 900aaaggcgtgt tcagcggtcg tgagagcgag ttctgcgaca
tcgacgatac gagcatgggt 960atgcgtctga tgcgtatgca cggttacgac gtggacccga
atgtgttgcg caacttcaag 1020caaaaagatg gcaagtttag ctgctacggt ggccaaatga
ttgagagccc gagcccgatc 1080tataacttat atcgtgcgag ccaactgcgt ttcccgggtg
aagaaattct ggaagatgcg 1140aagcgttttg cgtatgactt cctgaaggaa aagctcgcaa
acaatcaaat cttggataaa 1200tgggtgatca gcaagcactt gccggatgag attaaactgg
gtctggagat gccgtggttg 1260gccaccctgc cgagagttga ggcgaaatac tatattcagt
attacgcggg tagcggtgat 1320gtttggattg gcaagaccct gtaccgcatg ccggagatca
gcaatgatac ctatcatgac 1380ctggccaaga ccgacttcaa acgctgtcaa gcgaaacatc
aatttgaatg gttatacatg 1440caagagtggt acgaaagctg cggcatcgaa gagttcggta
tctcccgtaa agatctgctg 1500ctgtcttact ttctggcaac ggccagcatt ttcgagctgg
agcgtaccaa tgagcgtatt 1560gcctgggcga aatcacaaat cattgctaag atgattacga
gctttttcaa taaagaaacc 1620acgtccgagg aagataaacg tgctctgctg aatgaactgg
gcaacatcaa cggtctgaat 1680gacaccaacg gtgccggtcg tgagggtggc gcaggcagca
ttgcactggc cacgctgacc 1740cagttcctgg aaggtttcga ccgctacacc cgtcaccagc
tgaagaacgc gtggtccgtc 1800tggctgaccc agctgcagca tggtgaggca gacgacgcgg
agctgctgac caacacgttg 1860aatatctgcg ctggccatat cgcgtttcgc gaagagattc
tggcgcacaa cgagtacaaa 1920gccctgagca atctgacctc taaaatctgt cgtcagctta
gctttattca gagcgagaaa 1980gaaatgggcg tggaaggtga gatcgcggca aaatccagca
tcaagaacaa agaactggaa 2040gaagatatgc agatgttggt caagctcgtc ctggagaagt
atggtggcat cgaccgtaat 2100atcaagaaag cgtttctggc cgtggcgaaa acgtattact
accgcgcgta ccacgcggca 2160gataccattg acacccacat gtttaaggtt ttgtttgagc
cggttgctta a 2211
SEQ ID NO: 4 (length: 575; type: PRT; organism: Salvia sclarea)
Met Ser Leu Ala Phe
Asn Val Gly Val Thr Pro Phe Ser Gly Gln Arg1 5
10 15Val Gly Ser Arg Lys Glu Lys Phe Pro Val Gln
Gly Phe Pro Val Thr 20 25
30Thr Pro Asn Arg Ser Arg Leu Ile Val Asn Cys Ser Leu Thr Thr Ile
35 40 45Asp Phe Met Ala Lys Met Lys Glu
Asn Phe Lys Arg Glu Asp Asp Lys 50 55
60Phe Pro Thr Thr Thr Thr Leu Arg Ser Glu Asp Ile Pro Ser Asn Leu65
70 75 80Cys Ile Ile Asp Thr
Leu Gln Arg Leu Gly Val Asp Gln Phe Phe Gln 85
90 95Tyr Glu Ile Asn Thr Ile Leu Asp Asn Thr Phe
Arg Leu Trp Gln Glu 100 105
110Lys His Lys Val Ile Tyr Gly Asn Val Thr Thr His Ala Met Ala Phe
115 120 125Arg Leu Leu Arg Val Lys Gly
Tyr Glu Val Ser Ser Glu Glu Leu Ala 130 135
140Pro Tyr Gly Asn Gln Glu Ala Val Ser Gln Gln Thr Asn Asp Leu
Pro145 150 155 160Met Ile
Ile Glu Leu Tyr Arg Ala Ala Asn Glu Arg Ile Tyr Glu Glu
165 170 175Glu Arg Ser Leu Glu Lys Ile
Leu Ala Trp Thr Thr Ile Phe Leu Asn 180 185
190Lys Gln Val Gln Asp Asn Ser Ile Pro Asp Lys Lys Leu His
Lys Leu 195 200 205Val Glu Phe Tyr
Leu Arg Asn Tyr Lys Gly Ile Thr Ile Arg Leu Gly 210
215 220Ala Arg Arg Asn Leu Glu Leu Tyr Asp Met Thr Tyr
Tyr Gln Ala Leu225 230 235
240Lys Ser Thr Asn Arg Phe Ser Asn Leu Cys Asn Glu Asp Phe Leu Val
245 250 255Phe Ala Lys Gln Asp
Phe Asp Ile His Glu Ala Gln Asn Gln Lys Gly 260
265 270Leu Gln Gln Leu Gln Arg Trp Tyr Ala Asp Cys Arg
Leu Asp Thr Leu 275 280 285Asn Phe
Gly Arg Asp Val Val Ile Ile Ala Asn Tyr Leu Ala Ser Leu 290
295 300Ile Ile Gly Asp His Ala Phe Asp Tyr Val Arg
Leu Ala Phe Ala Lys305 310 315
320Thr Ser Val Leu Val Thr Ile Met Asp Asp Phe Phe Asp Cys His Gly
325 330 335Ser Ser Gln Glu
Cys Asp Lys Ile Ile Glu Leu Val Lys Glu Trp Lys 340
345 350Glu Asn Pro Asp Ala Glu Tyr Gly Ser Glu Glu
Leu Glu Ile Leu Phe 355 360 365Met
Ala Leu Tyr Asn Thr Val Asn Glu Leu Ala Glu Arg Ala Arg Val 370
375 380Glu Gln Gly Arg Ser Val Lys Glu Phe Leu
Val Lys Leu Trp Val Glu385 390 395
400Ile Leu Ser Ala Phe Lys Ile Glu Leu Asp Thr Trp Ser Asn Gly
Thr 405 410 415Gln Gln Ser
Phe Asp Glu Tyr Ile Ser Ser Ser Trp Leu Ser Asn Gly 420
425 430Ser Arg Leu Thr Gly Leu Leu Thr Met Gln
Phe Val Gly Val Lys Leu 435 440
445Ser Asp Glu Met Leu Met Ser Glu Glu Cys Thr Asp Leu Ala Arg His 450
455 460Val Cys Met Val Gly Arg Leu Leu
Asn Asp Val Cys Ser Ser Glu Arg465 470
475 480Glu Arg Glu Glu Asn Ile Ala Gly Lys Ser Tyr Ser
Ile Leu Leu Ala 485 490
495Thr Glu Lys Asp Gly Arg Lys Val Ser Glu Asp Glu Ala Ile Ala Glu
500 505 510Ile Asn Glu Met Val Glu
Tyr His Trp Arg Lys Val Leu Gln Ile Val 515 520
525Tyr Lys Lys Glu Ser Ile Leu Pro Arg Arg Cys Lys Asp Val
Phe Leu 530 535 540Glu Met Ala Lys Gly
Thr Phe Tyr Ala Tyr Gly Ile Asn Asp Glu Leu545 550
555 560Thr Ser Pro Gln Gln Ser Lys Glu Asp Met
Lys Ser Phe Val Phe 565 570
575
SEQ ID NO: 5 (length: 525; type: PRT; organism: Artificial Sequence; description: Truncated sclareol synthase from Salvia sclarea (SsScS))
Met Ala Lys Met Lys Glu Asn Phe Lys Arg Glu Asp Asp
Lys Phe Pro1 5 10 15Thr
Thr Thr Thr Leu Arg Ser Glu Asp Ile Pro Ser Asn Leu Cys Ile 20
25 30Ile Asp Thr Leu Gln Arg Leu Gly
Val Asp Gln Phe Phe Gln Tyr Glu 35 40
45Ile Asn Thr Ile Leu Asp Asn Thr Phe Arg Leu Trp Gln Glu Lys His
50 55 60Lys Val Ile Tyr Gly Asn Val Thr
Thr His Ala Met Ala Phe Arg Leu65 70 75
80Leu Arg Val Lys Gly Tyr Glu Val Ser Ser Glu Glu Leu
Ala Pro Tyr 85 90 95Gly
Asn Gln Glu Ala Val Ser Gln Gln Thr Asn Asp Leu Pro Met Ile
100 105 110Ile Glu Leu Tyr Arg Ala Ala
Asn Glu Arg Ile Tyr Glu Glu Glu Arg 115 120
125Ser Leu Glu Lys Ile Leu Ala Trp Thr Thr Ile Phe Leu Asn Lys
Gln 130 135 140Val Gln Asp Asn Ser Ile
Pro Asp Lys Lys Leu His Lys Leu Val Glu145 150
155 160Phe Tyr Leu Arg Asn Tyr Lys Gly Ile Thr Ile
Arg Leu Gly Ala Arg 165 170
175Arg Asn Leu Glu Leu Tyr Asp Met Thr Tyr Tyr Gln Ala Leu Lys Ser
180 185 190Thr Asn Arg Phe Ser Asn
Leu Cys Asn Glu Asp Phe Leu Val Phe Ala 195 200
205Lys Gln Asp Phe Asp Ile His Glu Ala Gln Asn Gln Lys Gly
Leu Gln 210 215 220Gln Leu Gln Arg Trp
Tyr Ala Asp Cys Arg Leu Asp Thr Leu Asn Phe225 230
235 240Gly Arg Asp Val Val Ile Ile Ala Asn Tyr
Leu Ala Ser Leu Ile Ile 245 250
255Gly Asp His Ala Phe Asp Tyr Val Arg Leu Ala Phe Ala Lys Thr Ser
260 265 270Val Leu Val Thr Ile
Met Asp Asp Phe Phe Asp Cys His Gly Ser Ser 275
280 285Gln Glu Cys Asp Lys Ile Ile Glu Leu Val Lys Glu
Trp Lys Glu Asn 290 295 300Pro Asp Ala
Glu Tyr Gly Ser Glu Glu Leu Glu Ile Leu Phe Met Ala305
310 315 320Leu Tyr Asn Thr Val Asn Glu
Leu Ala Glu Arg Ala Arg Val Glu Gln 325
330 335Gly Arg Ser Val Lys Glu Phe Leu Val Lys Leu Trp
Val Glu Ile Leu 340 345 350Ser
Ala Phe Lys Ile Glu Leu Asp Thr Trp Ser Asn Gly Thr Gln Gln 355
360 365Ser Phe Asp Glu Tyr Ile Ser Ser Ser
Trp Leu Ser Asn Gly Ser Arg 370 375
380Leu Thr Gly Leu Leu Thr Met Gln Phe Val Gly Val Lys Leu Ser Asp385
390 395 400Glu Met Leu Met
Ser Glu Glu Cys Thr Asp Leu Ala Arg His Val Cys 405
410 415Met Val Gly Arg Leu Leu Asn Asp Val Cys
Ser Ser Glu Arg Glu Arg 420 425
430Glu Glu Asn Ile Ala Gly Lys Ser Tyr Ser Ile Leu Leu Ala Thr Glu
435 440 445Lys Asp Gly Arg Lys Val Ser
Glu Asp Glu Ala Ile Ala Glu Ile Asn 450 455
460Glu Met Val Glu Tyr His Trp Arg Lys Val Leu Gln Ile Val Tyr
Lys465 470 475 480Lys Glu
Ser Ile Leu Pro Arg Arg Cys Lys Asp Val Phe Leu Glu Met
485 490 495Ala Lys Gly Thr Phe Tyr Ala
Tyr Gly Ile Asn Asp Glu Leu Thr Ser 500 505
510Pro Gln Gln Ser Lys Glu Asp Met Lys Ser Phe Val Phe
515 520 525
SEQ ID NO: 6 (length: 1583; type: DNA; organism: Artificial Sequence; description: Optimized cDNA for E. coli expression encoding the truncated sclareol synthase from Salvia sclarea)
atggcgaaaa tgaaggagaa ctttaaacgc
gaggacgata aattcccgac gaccacgacc 60ctgcgcagcg aggatatccc gagcaacctg
tgcatcattg ataccctgca gcgcctgggt 120gtcgatcagt tcttccaata cgaaatcaat
accattctgg acaatacttt tcgtctgtgg 180caagagaaac acaaagtgat ctacggcaac
gttaccaccc acgcgatggc gttccgtttg 240ttgcgtgtca agggctacga ggtttccagc
gaggaactgg cgccgtacgg taatcaggaa 300gcagttagcc aacagacgaa tgatctgcct
atgatcattg agctgtatcg cgcagcaaat 360gagcgtatct acgaagagga acgcagcctg
gaaaagatcc tggcgtggac cacgatcttc 420ctgaacaaac aagttcaaga caattctatt
cctgataaga agctgcataa actggtcgaa 480ttctatctgc gtaattacaa gggcatcacg
atccgtctgg gcgcacgccg taacctggag 540ttgtatgata tgacgtatta ccaggctctg
aaaagcacca atcgtttctc caatctgtgt 600aatgaggatt ttctggtgtt cgccaagcag
gattttgaca tccacgaggc gcaaaatcaa 660aaaggtctgc aacaactgca acgttggtac
gctgactgtc gcctggacac cctgaatttc 720ggtcgcgacg ttgtcattat tgcaaactat
ctggccagcc tgatcatcgg tgatcacgca 780ttcgactacg tccgcctggc cttcgctaag
accagcgttc tggtgaccat tatggatgat 840ttcttcgatt gccacggttc tagccaggaa
tgcgacaaaa tcattgagct ggtgaaagag 900tggaaagaaa accctgatgc ggaatacggt
tccgaagagt tggagatcct gtttatggcc 960ttgtacaaca ccgtgaatga actggccgag
cgtgctcgtg tggagcaggg ccgttctgtg 1020aaggagtttt tggtcaagtt gtgggtggaa
atcctgtccg cgttcaagat cgaactggat 1080acgtggtcga atggtacgca acagagcttc
gacgaataca tcagcagcag ctggctgagc 1140aatggcagcc gtctgaccgg tttgctgacc
atgcaatttg tgggtgttaa actgtccgat 1200gaaatgctga tgagcgaaga atgcaccgac
ctggcacgcc atgtgtgtat ggtgggtcgc 1260ctgctgaacg acgtctgcag cagcgaacgt
gagcgcgagg aaaacattgc aggcaagagc 1320tacagcatct tgttggccac cgagaaagat
ggtcgcaaag tgtctgagga cgaagcaatt 1380gcagagatta atgaaatggt cgagtaccac
tggcgtaagg ttttgcagat tgtgtataag 1440aaagagagca tcttgccgcg tcgctgtaag
gatgttttct tggagatggc gaagggcacg 1500ttctatgcgt acggcattaa cgacgagctg
acgagcccgc aacaatcgaa agaggacatg 1560aagagcttcg tgttctgagg tac
1583
SEQ ID NO: 7 (length: 307; type: PRT; organism: Pantoea agglomerans)
Met Val
Ser Gly Ser Lys Ala Gly Val Ser Pro His Arg Glu Ile Glu1 5
10 15Val Met Arg Gln Ser Ile Asp Asp
His Leu Ala Gly Leu Leu Pro Glu 20 25
30Thr Asp Ser Gln Asp Ile Val Ser Leu Ala Met Arg Glu Gly Val
Met 35 40 45Ala Pro Gly Lys Arg
Ile Arg Pro Leu Leu Met Leu Leu Ala Ala Arg 50 55
60Asp Leu Arg Tyr Gln Gly Ser Met Pro Thr Leu Leu Asp Leu
Ala Cys65 70 75 80Ala
Val Glu Leu Thr His Thr Ala Ser Leu Met Leu Asp Asp Met Pro
85 90 95Cys Met Asp Asn Ala Glu Leu
Arg Arg Gly Gln Pro Thr Thr His Lys 100 105
110Lys Phe Gly Glu Ser Val Ala Ile Leu Ala Ser Val Gly Leu
Leu Ser 115 120 125Lys Ala Phe Gly
Leu Ile Ala Ala Thr Gly Asp Leu Pro Gly Glu Arg 130
135 140Arg Ala Gln Ala Val Asn Glu Leu Ser Thr Ala Val
Gly Val Gln Gly145 150 155
160Leu Val Leu Gly Gln Phe Arg Asp Leu Asn Asp Ala Ala Leu Asp Arg
165 170 175Thr Pro Asp Ala Ile
Leu Ser Thr Asn His Leu Lys Thr Gly Ile Leu 180
185 190Phe Ser Ala Met Leu Gln Ile Val Ala Ile Ala Ser
Ala Ser Ser Pro 195 200 205Ser Thr
Arg Glu Thr Leu His Ala Phe Ala Leu Asp Phe Gly Gln Ala 210
215 220Phe Gln Leu Leu Asp Asp Leu Arg Asp Asp His
Pro Glu Thr Gly Lys225 230 235
240Asp Arg Asn Lys Asp Ala Gly Lys Ser Thr Leu Val Asn Arg Leu Gly
245 250 255Ala Asp Ala Ala
Arg Gln Lys Leu Arg Glu His Ile Asp Ser Ala Asp 260
265 270Lys His Leu Thr Phe Ala Cys Pro Gln Gly Gly
Ala Ile Arg Gln Phe 275 280 285Met
His Leu Trp Phe Gly His His Leu Ala Asp Trp Ser Pro Val Met 290
295 300Lys Ile Ala305
SEQ ID NO: 8 (length: 924; type: DNA; organism: Artificial Sequence; description: Optimized cDNA encoding for the GGPP synthase from Pantoea agglomerans)
atggtttctg gttcgaaagc aggagtatca cctcataggg aaatcgaagt
catgagacag 60tccattgatg accacttagc aggattgttg ccagaaacag attcccagga
tatcgttagc 120cttgctatga gagaaggtgt tatggcacct ggtaaacgta tcagaccttt
gctgatgtta 180cttgctgcaa gagacctgag atatcagggt tctatgccta cactactgga
tctagcttgt 240gctgttgaac tgacacatac tgcttccttg atgctggatg acatgccttg
tatggacaat 300gcggaactta gaagaggtca accaacaacc cacaagaaat tcggagaatc
tgttgccatt 360ttggcttctg taggtctgtt gtcgaaagct tttggcttga ttgctgcaac
tggtgatctt 420ccaggtgaaa ggagagcaca agctgtaaac gagctatcta ctgcagttgg
tgttcaaggt 480ctagtcttag gacagttcag agatttgaat gacgcagctt tggacagaac
tcctgatgct 540atcctgtcta cgaaccatct gaagactggc atcttgttct cagctatgtt
gcaaatcgta 600gccattgctt ctgcttcttc accatctact agggaaacgt tacacgcatt
cgcattggac 660tttggtcaag cctttcaact gctagacgat ttgagggatg atcatccaga
gacaggtaaa 720gaccgtaaca aagacgctgg taaaagcact ctagtcaaca gattgggtgc
tgatgcagct 780agacagaaac tgagagagca cattgactct gctgacaaac acctgacatt
tgcatgtcca 840caaggaggtg ctataaggca gtttatgcac ctatggtttg gacaccatct
tgctgattgg 900tctccagtga tgaagatcgc ctaa
924
SEQ ID NO: 9 (length: 71; type: DNA; organism: Artificial Sequence; description: Primer Sequence)
ctgtttgagc
cggtcgccta aggtaccaga aggagataaa taatggcgaa aatgaaggag 60aactttaaac g
71
SEQ ID NO: 10 (length: 55; type: DNA; organism: Artificial Sequence; description: Primer Sequence)
gcagcggttt ctttaccaga
ctcgaggtca gaacacgaag ctcttcatgt cctct 55
SEQ ID NO: 11 (length: 786; type: PRT; organism: Coleus forskohlii)
Met Gly Ser Leu Ser Thr Met Asn Leu Asn His Ser Pro Met Ser
Tyr1 5 10 15Ser Gly Ile
Leu Pro Ser Ser Ser Ala Lys Ala Lys Leu Leu Leu Pro 20
25 30Gly Cys Phe Ser Ile Ser Ala Trp Met Asn
Asn Gly Lys Asn Leu Asn 35 40
45Cys Gln Leu Thr His Lys Lys Ile Ser Lys Val Ala Glu Ile Arg Val 50
55 60Ala Thr Val Asn Ala Pro Pro Val His
Asp Gln Asp Asp Ser Thr Glu65 70 75
80Asn Gln Cys His Asp Ala Val Asn Asn Ile Glu Asp Pro Ile
Glu Tyr 85 90 95Ile Arg
Thr Leu Leu Arg Thr Thr Gly Asp Gly Arg Ile Ser Val Ser 100
105 110Pro Tyr Asp Thr Ala Trp Val Ala Leu
Ile Lys Asp Leu Gln Gly Arg 115 120
125Asp Ala Pro Glu Phe Pro Ser Ser Leu Glu Trp Ile Ile Gln Asn Gln
130 135 140Leu Ala Asp Gly Ser Trp Gly
Asp Ala Lys Phe Phe Cys Val Tyr Asp145 150
155 160Arg Leu Val Asn Thr Ile Ala Cys Val Val Ala Leu
Arg Ser Trp Asp 165 170
175Val His Ala Glu Lys Val Glu Arg Gly Val Arg Tyr Ile Asn Glu Asn
180 185 190Val Glu Lys Leu Arg Asp
Gly Asn Glu Glu His Met Thr Cys Gly Phe 195 200
205Glu Val Val Phe Pro Ala Leu Leu Gln Arg Ala Lys Ser Leu
Gly Ile 210 215 220Gln Asp Leu Pro Tyr
Asp Ala Pro Val Ile Gln Glu Ile Tyr His Ser225 230
235 240Arg Glu Gln Lys Ser Lys Arg Ile Pro Leu
Glu Met Met His Lys Val 245 250
255Pro Thr Ser Leu Leu Phe Ser Leu Glu Gly Leu Glu Asn Leu Glu Trp
260 265 270Asp Lys Leu Leu Lys
Leu Gln Ser Ala Asp Gly Ser Phe Leu Thr Ser 275
280 285Pro Ser Ser Thr Ala Phe Ala Phe Met Gln Thr Arg
Asp Pro Lys Cys 290 295 300Tyr Gln Phe
Ile Lys Asn Thr Ile Gln Thr Phe Asn Gly Gly Ala Pro305
310 315 320His Thr Tyr Pro Val Asp Val
Phe Gly Arg Leu Trp Ala Ile Asp Arg 325
330 335Leu Gln Arg Leu Gly Ile Ser Arg Phe Phe Glu Ser
Glu Ile Ala Asp 340 345 350Cys
Ile Ala His Ile His Arg Phe Trp Thr Glu Lys Gly Val Phe Ser 355
360 365Gly Arg Glu Ser Glu Phe Cys Asp Ile
Asp Asp Thr Ser Met Gly Val 370 375
380Arg Leu Met Arg Met His Gly Tyr Asp Val Asp Pro Asn Val Leu Lys385
390 395 400Asn Phe Lys Lys
Asp Asp Lys Phe Ser Cys Tyr Gly Gly Gln Met Ile 405
410 415Glu Ser Pro Ser Pro Ile Tyr Asn Leu Tyr
Arg Ala Ser Gln Leu Arg 420 425
430Phe Pro Gly Glu Gln Ile Leu Glu Asp Ala Asn Lys Phe Ala Tyr Asp
435 440 445Phe Leu Gln Glu Lys Leu Ala
His Asn Gln Ile Leu Asp Lys Trp Val 450 455
460Ile Ser Lys His Leu Pro Asp Glu Ile Lys Leu Gly Leu Glu Met
Pro465 470 475 480Trp Tyr
Ala Thr Leu Pro Arg Val Glu Ala Arg Tyr Tyr Ile Gln Tyr
485 490 495Tyr Ala Gly Ser Gly Asp Val
Trp Ile Gly Lys Thr Leu Tyr Arg Met 500 505
510Pro Glu Ile Ser Asn Asp Thr Tyr His Glu Leu Ala Lys Thr
Asp Phe 515 520 525Lys Arg Cys Gln
Ala Gln His Gln Phe Glu Trp Ile Tyr Met Gln Glu 530
535 540Trp Tyr Glu Ser Cys Asn Met Glu Glu Phe Gly Ile
Ser Arg Lys Glu545 550 555
560Leu Leu Val Ala Tyr Phe Leu Ala Thr Ala Ser Ile Phe Glu Leu Glu
565 570 575Arg Ala Asn Glu Arg
Ile Ala Trp Ala Lys Ser Gln Ile Ile Ser Thr 580
585 590Ile Ile Ala Ser Phe Phe Asn Asn Gln Asn Thr Ser
Pro Glu Asp Lys 595 600 605Leu Ala
Phe Leu Thr Asp Phe Lys Asn Gly Asn Ser Thr Asn Met Ala 610
615 620Leu Val Thr Leu Thr Gln Phe Leu Glu Gly Phe
Asp Arg Tyr Thr Ser625 630 635
640His Gln Leu Lys Asn Ala Trp Ser Val Trp Leu Arg Lys Leu Gln Gln
645 650 655Gly Glu Gly Asn
Gly Gly Ala Asp Ala Glu Leu Leu Val Asn Thr Leu 660
665 670Asn Ile Cys Ala Gly His Ile Ala Phe Arg Glu
Glu Ile Leu Ala His 675 680 685Asn
Asp Tyr Lys Thr Leu Ser Asn Leu Thr Ser Lys Ile Cys Arg Gln 690
695 700Leu Ser Gln Ile Gln Asn Glu Lys Glu Leu
Glu Thr Glu Gly Gln Lys705 710 715
720Thr Ser Ile Lys Asn Lys Glu Leu Glu Glu Asp Met Gln Arg Leu
Val 725 730 735Lys Leu Val
Leu Glu Lys Ser Arg Val Gly Ile Asn Arg Asp Met Lys 740
745 750Lys Thr Phe Leu Ala Val Val Lys Thr Tyr
Tyr Tyr Lys Ala Tyr His 755 760
765Ser Ala Gln Ala Ile Asp Asn His Met Phe Lys Val Leu Phe Glu Pro 770
775 780Val Ala785
SEQ ID NO: 12 (length: 724; type: PRT; organism: Artificial Sequence; description: Truncated copalyl diphosphate synthase from Coleus forskohlii)
Met Val Ala Thr Val Asn Ala Pro Pro Val His Asp Gln Asp Asp
Ser1 5 10 15Thr Glu Asn
Gln Cys His Asp Ala Val Asn Asn Ile Glu Asp Pro Ile 20
25 30Glu Tyr Ile Arg Thr Leu Leu Arg Thr Thr
Gly Asp Gly Arg Ile Ser 35 40
45Val Ser Pro Tyr Asp Thr Ala Trp Val Ala Leu Ile Lys Asp Leu Gln 50
55 60Gly Arg Asp Ala Pro Glu Phe Pro Ser
Ser Leu Glu Trp Ile Ile Gln65 70 75
80Asn Gln Leu Ala Asp Gly Ser Trp Gly Asp Ala Lys Phe Phe
Cys Val 85 90 95Tyr Asp
Arg Leu Val Asn Thr Ile Ala Cys Val Val Ala Leu Arg Ser 100
105 110Trp Asp Val His Ala Glu Lys Val Glu
Arg Gly Val Arg Tyr Ile Asn 115 120
125Glu Asn Val Glu Lys Leu Arg Asp Gly Asn Glu Glu His Met Thr Cys
130 135 140Gly Phe Glu Val Val Phe Pro
Ala Leu Leu Gln Arg Ala Lys Ser Leu145 150
155 160Gly Ile Gln Asp Leu Pro Tyr Asp Ala Pro Val Ile
Gln Glu Ile Tyr 165 170
175His Ser Arg Glu Gln Lys Ser Lys Arg Ile Pro Leu Glu Met Met His
180 185 190Lys Val Pro Thr Ser Leu
Leu Phe Ser Leu Glu Gly Leu Glu Asn Leu 195 200
205Glu Trp Asp Lys Leu Leu Lys Leu Gln Ser Ala Asp Gly Ser
Phe Leu 210 215 220Thr Ser Pro Ser Ser
Thr Ala Phe Ala Phe Met Gln Thr Arg Asp Pro225 230
235 240Lys Cys Tyr Gln Phe Ile Lys Asn Thr Ile
Gln Thr Phe Asn Gly Gly 245 250
255Ala Pro His Thr Tyr Pro Val Asp Val Phe Gly Arg Leu Trp Ala Ile
260 265 270Asp Arg Leu Gln Arg
Leu Gly Ile Ser Arg Phe Phe Glu Ser Glu Ile 275
280 285Ala Asp Cys Ile Ala His Ile His Arg Phe Trp Thr
Glu Lys Gly Val 290 295 300Phe Ser Gly
Arg Glu Ser Glu Phe Cys Asp Ile Asp Asp Thr Ser Met305
310 315 320Gly Val Arg Leu Met Arg Met
His Gly Tyr Asp Val Asp Pro Asn Val 325
330 335Leu Lys Asn Phe Lys Lys Asp Asp Lys Phe Ser Cys
Tyr Gly Gly Gln 340 345 350Met
Ile Glu Ser Pro Ser Pro Ile Tyr Asn Leu Tyr Arg Ala Ser Gln 355
360 365Leu Arg Phe Pro Gly Glu Gln Ile Leu
Glu Asp Ala Asn Lys Phe Ala 370 375
380Tyr Asp Phe Leu Gln Glu Lys Leu Ala His Asn Gln Ile Leu Asp Lys385
390 395 400Trp Val Ile Ser
Lys His Leu Pro Asp Glu Ile Lys Leu Gly Leu Glu 405
410 415Met Pro Trp Tyr Ala Thr Leu Pro Arg Val
Glu Ala Arg Tyr Tyr Ile 420 425
430Gln Tyr Tyr Ala Gly Ser Gly Asp Val Trp Ile Gly Lys Thr Leu Tyr
435 440 445Arg Met Pro Glu Ile Ser Asn
Asp Thr Tyr His Glu Leu Ala Lys Thr 450 455
460Asp Phe Lys Arg Cys Gln Ala Gln His Gln Phe Glu Trp Ile Tyr
Met465 470 475 480Gln Glu
Trp Tyr Glu Ser Cys Asn Met Glu Glu Phe Gly Ile Ser Arg
485 490 495Lys Glu Leu Leu Val Ala Tyr
Phe Leu Ala Thr Ala Ser Ile Phe Glu 500 505
510Leu Glu Arg Ala Asn Glu Arg Ile Ala Trp Ala Lys Ser Gln
Ile Ile 515 520 525Ser Thr Ile Ile
Ala Ser Phe Phe Asn Asn Gln Asn Thr Ser Pro Glu 530
535 540Asp Lys Leu Ala Phe Leu Thr Asp Phe Lys Asn Gly
Asn Ser Thr Asn545 550 555
560Met Ala Leu Val Thr Leu Thr Gln Phe Leu Glu Gly Phe Asp Arg Tyr
565 570 575Thr Ser His Gln Leu
Lys Asn Ala Trp Ser Val Trp Leu Arg Lys Leu 580
585 590Gln Gln Gly Glu Gly Asn Gly Gly Ala Asp Ala Glu
Leu Leu Val Asn 595 600 605Thr Leu
Asn Ile Cys Ala Gly His Ile Ala Phe Arg Glu Glu Ile Leu 610
615 620Ala His Asn Asp Tyr Lys Thr Leu Ser Asn Leu
Thr Ser Lys Ile Cys625 630 635
640Arg Gln Leu Ser Gln Ile Gln Asn Glu Lys Glu Leu Glu Thr Glu Gly
645 650 655Gln Lys Thr Ser
Ile Lys Asn Lys Glu Leu Glu Glu Asp Met Gln Arg 660
665 670Leu Val Lys Leu Val Leu Glu Lys Ser Arg Val
Gly Ile Asn Arg Asp 675 680 685Met
Lys Lys Thr Phe Leu Ala Val Val Lys Thr Tyr Tyr Tyr Lys Ala 690
695 700Tyr His Ser Ala Gln Ala Ile Asp Asn His
Met Phe Lys Val Leu Phe705 710 715
720Glu Pro Val Ala
SEQ ID NO: 13 (length: 2175; type: DNA; organism: Artificial Sequence; description: Optimized cDNA for E. coli expression encoding for CfCPS1-del63)
atggtcgcta ctgtcaatgc
tccaccggtc cacgatcaag acgacagcac tgagaatcaa 60tgtcatgatg ccgtaaacaa
tattgaagat ccaatcgagt atatccgtac cctgttgcgc 120acgacgggtg atggtcgtat
cagcgtcagc ccgtacgata ccgcgtgggt ggcgctgatc 180aaagatctgc agggccgtga
cgcaccggag tttccgtcct ctcttgagtg gatcattcaa 240aaccagctgg ccgacggttc
ttggggcgac gccaaatttt tctgcgtgta tgaccgtctg 300gtgaacacca tcgcgtgcgt
cgttgcgctg cgttcctggg acgtccacgc ggaaaaagtt 360gagcgtggcg tgcgctatat
caacgaaaat gtcgaaaagc tgcgcgacgg taatgaagaa 420cacatgacct gtggctttga
agttgttttc ccggcgctcc tgcagcgcgc gaagtctctg 480ggtattcaag atctgccgta
cgatgctccg gtgatccaag agatttatca ctctcgtgag 540cagaagtcca agcgtatccc
gttggagatg atgcacaaag ttccgacgag cctgctgttc 600agcttggaag gcctggaaaa
tctggagtgg gacaaactgc tgaagctgca gagcgcggac 660ggtagcttcc tgacgagccc
gagcagcacc gcatttgcat ttatgcagac ccgtgacccg 720aagtgttacc aatttattaa
gaacacgatt cagacgttta acggtggtgc accgcatacc 780tatccggtag acgtctttgg
tcgcctgtgg gcaattgatc gtctgcagcg tttgggtatc 840agccgcttct tcgaaagcga
aattgcagat tgtatcgcac acatccatcg tttttggacc 900gagaaaggcg tctttagcgg
ccgtgagtct gagttctgtg acatcgatga cacgagcatg 960ggtgtccgtc tgatgcgtat
gcatggctat gatgttgacc cgaacgtgct gaagaatttt 1020aaaaaagatg acaagtttag
ctgctacggc ggtcagatga ttgagagccc gagcccgatt 1080tataatctgt accgcgcgag
ccaactgcgt ttcccgggtg aacagattct ggaagatgcc 1140aataaattcg cgtatgattt
cctgcaggaa aaactggcgc acaatcagat cctggataaa 1200tgggttatca gcaagcatct
gcctgacgaa atcaaattgg gcctggagat gccgtggtat 1260gcgaccttgc cgcgtgtcga
agcgcgttac tacatccagt actatgcggg tagcggcgat 1320gtctggattg gtaagacgct
gtaccgtatg ccagagatta gcaacgacac ctaccatgaa 1380ttggcaaaga ccgatttcaa
gcgttgccaa gcccaacacc agttcgagtg gatttacatg 1440caagagtggt acgagtcgtg
caacatggaa gagttcggta ttagccgcaa agaactgctg 1500gttgcatatt tcctggccac
ggcgagcatc tttgagctgg agcgtgcgaa tgaacgcatt 1560gcatgggcaa aaagccaaat
catttctacc attatcgctt cgttctttaa taaccaaaat 1620acgagccctg aggataaact
ggcgtttctg actgatttca aaaatggcaa cagcaccaac 1680atggctctgg tgaccctgac
ccagttcctg gaaggctttg accgctacac ttcccatcaa 1740ctgaaaaacg cgtggagcgt
ttggctgcgt aagctgcaac agggtgaggg taatggcggt 1800gccgacgccg agttactggt
gaatacgctg aacatttgcg cgggtcacat cgcgttccgt 1860gaagaaattc tggcacataa
tgactataaa acgttgtcga acctgaccag caagatttgt 1920cgccagctga gccagattca
gaatgaaaaa gaattggaaa ccgaaggcca aaagacttcc 1980attaagaaca aagaactgga
agaagatatg cagcgcctgg ttaaactggt tttggagaaa 2040agccgtgtgg gtatcaatcg
tgacatgaag aaaacgttcc tggctgtggt gaaaacctac 2100tattacaaag cataccactc
cgcgcaggca atcgataacc acatgttcaa ggttctgttc 2160gaaccggtgg cctaa
2175
SEQ ID NO: 14 (length: 757; type: PRT; organism: Triticum aestivum)
Met Leu Thr Phe Thr Ala Ala Leu Arg His Val Pro Val Leu Asp Gln1
5 10 15Pro Thr Ser Glu Pro Trp
Arg Arg Leu Ser Leu His Leu His Ser Gln 20 25
30Arg Arg Pro Cys Gly Leu Val Leu Ile Ser Lys Ser Pro
Ser Tyr Pro 35 40 45Glu Val Asp
Val Gly Glu Trp Lys Val Asp Glu Tyr Arg Gln Arg Thr 50
55 60Asp Glu Pro Ser Glu Thr Arg Gln Met Ile Asp Asp
Ile Arg Thr Ala65 70 75
80Leu Ala Ser Leu Gly Asp Asp Glu Thr Ser Met Ser Val Ser Ala Tyr
85 90 95Asp Thr Ala Leu Val Ala
Leu Val Lys Asn Leu Asp Gly Gly Asp Gly 100
105 110Pro Gln Phe Pro Ser Cys Ile Asp Trp Ile Val Gln
Asn Gln Leu Pro 115 120 125Asp Gly
Ser Trp Gly Asp Pro Ala Phe Phe Met Val Gln Asp Arg Met 130
135 140Ile Ser Thr Leu Ala Cys Val Val Ala Val Lys
Ser Trp Asn Ile Asp145 150 155
160Arg Asp Asn Leu Cys Asp Arg Gly Val Leu Phe Ile Lys Glu Asn Met
165 170 175Ser Arg Leu Val
Glu Glu Glu Gln Asp Trp Met Pro Cys Gly Phe Glu 180
185 190Ile Asn Phe Pro Ala Leu Leu Glu Lys Ala Lys
Asp Leu Asp Leu Asp 195 200 205Ile
Pro Tyr Asp His Pro Val Leu Glu Glu Ile Tyr Ala Lys Arg Asn 210
215 220Leu Lys Leu Leu Lys Ile Pro Leu Asp Val
Leu His Ala Ile Pro Thr225 230 235
240Thr Leu Leu Phe Ser Val Glu Gly Met Val Asp Leu Pro Leu Asp
Trp 245 250 255Glu Lys Leu
Leu Arg Leu Arg Cys Pro Asp Gly Ser Phe His Ser Ser 260
265 270Pro Ala Ala Thr Ala Ala Ala Leu Ser His
Thr Gly Asp Lys Glu Cys 275 280
285His Ala Phe Leu Asp Arg Leu Ile Gln Lys Phe Glu Gly Gly Val Pro 290
295 300Cys Ser His Ser Met Asp Thr Phe
Glu Gln Leu Trp Val Val Asp Arg305 310
315 320Leu Met Arg Leu Gly Ile Ser Arg His Phe Thr Ser
Glu Ile Gln Gln 325 330
335Cys Leu Glu Phe Ile Tyr Arg Arg Trp Thr Gln Lys Gly Leu Ala His
340 345 350Asn Met His Cys Pro Ile
Pro Asp Ile Asp Asp Thr Ala Met Gly Phe 355 360
365Arg Leu Leu Arg Gln His Gly Tyr Asp Val Thr Pro Ser Val
Phe Lys 370 375 380His Phe Glu Lys Asp
Gly Lys Phe Val Cys Phe Pro Met Glu Thr Asn385 390
395 400His Ala Ser Val Thr Pro Met His Asn Thr
Tyr Arg Ala Ser Gln Phe 405 410
415Met Phe Pro Gly Asp Asp Asp Val Leu Ala Arg Ala Gly Arg Tyr Cys
420 425 430Arg Ala Phe Leu Gln
Glu Arg Gln Ser Ser Asn Lys Leu Tyr Asp Lys 435
440 445Trp Ile Ile Thr Lys Asp Leu Pro Gly Glu Val Gly
Tyr Thr Leu Asn 450 455 460Phe Pro Trp
Lys Ser Ser Leu Pro Arg Ile Glu Thr Arg Met Tyr Leu465
470 475 480Asp Gln Tyr Gly Gly Asn Asn
Asp Val Trp Ile Ala Lys Val Leu Tyr 485
490 495Arg Met Asn Leu Val Ser Asn Asp Leu Tyr Leu Lys
Met Ala Lys Ala 500 505 510Asp
Phe Thr Glu Tyr Gln Arg Leu Ser Arg Ile Glu Trp Asn Gly Leu 515
520 525Arg Lys Trp Tyr Phe Arg Asn His Leu
Gln Arg Tyr Gly Ala Thr Pro 530 535
540Lys Ser Ala Leu Lys Ala Tyr Phe Leu Ala Ser Ala Asn Ile Phe Glu545
550 555 560Pro Gly Arg Ala
Ala Glu Arg Leu Ala Trp Ala Arg Met Ala Val Leu 565
570 575Ala Glu Ala Val Thr Thr His Phe Arg His
Ile Gly Gly Pro Cys Tyr 580 585
590Ser Thr Glu Asn Leu Glu Glu Leu Ile Asp Leu Val Ser Phe Asp Asp
595 600 605Val Ser Gly Gly Leu Arg Glu
Ala Trp Lys Gln Trp Leu Met Ala Trp 610 615
620Thr Ala Lys Glu Ser His Gly Ser Val Asp Gly Asp Thr Ala Leu
Leu625 630 635 640Phe Val
Arg Thr Ile Glu Ile Cys Ser Gly Arg Ile Val Ser Ser Glu
645 650 655Gln Lys Leu Asn Leu Trp Asp
Tyr Ser Gln Leu Glu Gln Leu Thr Ser 660 665
670Ser Ile Cys His Lys Leu Ala Thr Ile Gly Leu Ser Gln Asn
Glu Ala 675 680 685Ser Met Glu Asn
Thr Glu Asp Leu His Gln Gln Val Asp Leu Glu Met 690
695 700Gln Glu Leu Ser Trp Arg Val His Gln Gly Cys His
Gly Ile Asn Arg705 710 715
720Glu Thr Arg Gln Thr Phe Leu Asn Val Val Lys Ser Phe Tyr Tyr Ser
725 730 735Ala His Cys Ser Pro
Glu Thr Val Asp Ser His Ile Ala Lys Val Ile 740
745 750Phe Gln Asp Val Ile 755
SEQ ID NO: 15, length 699, PRT, Artificial Sequence, Truncated copalyl diphosphate synthase from Triticum aestivum
Met Tyr Arg Gln Arg Thr Asp Glu Pro Ser Glu Thr Arg Gln Met
Ile1 5 10 15Asp Asp Ile
Arg Thr Ala Leu Ala Ser Leu Gly Asp Asp Glu Thr Ser 20
25 30Met Ser Val Ser Ala Tyr Asp Thr Ala Leu
Val Ala Leu Val Lys Asn 35 40
45Leu Asp Gly Gly Asp Gly Pro Gln Phe Pro Ser Cys Ile Asp Trp Ile 50
55 60Val Gln Asn Gln Leu Pro Asp Gly Ser
Trp Gly Asp Pro Ala Phe Phe65 70 75
80Met Val Gln Asp Arg Met Ile Ser Thr Leu Ala Cys Val Val
Ala Val 85 90 95Lys Ser
Trp Asn Ile Asp Arg Asp Asn Leu Cys Asp Arg Gly Val Leu 100
105 110Phe Ile Lys Glu Asn Met Ser Arg Leu
Val Glu Glu Glu Gln Asp Trp 115 120
125Met Pro Cys Gly Phe Glu Ile Asn Phe Pro Ala Leu Leu Glu Lys Ala
130 135 140Lys Asp Leu Asp Leu Asp Ile
Pro Tyr Asp His Pro Val Leu Glu Glu145 150
155 160Ile Tyr Ala Lys Arg Asn Leu Lys Leu Leu Lys Ile
Pro Leu Asp Val 165 170
175Leu His Ala Ile Pro Thr Thr Leu Leu Phe Ser Val Glu Gly Met Val
180 185 190Asp Leu Pro Leu Asp Trp
Glu Lys Leu Leu Arg Leu Arg Cys Pro Asp 195 200
205Gly Ser Phe His Ser Ser Pro Ala Ala Thr Ala Ala Ala Leu
Ser His 210 215 220Thr Gly Asp Lys Glu
Cys His Ala Phe Leu Asp Arg Leu Ile Gln Lys225 230
235 240Phe Glu Gly Gly Val Pro Cys Ser His Ser
Met Asp Thr Phe Glu Gln 245 250
255Leu Trp Val Val Asp Arg Leu Met Arg Leu Gly Ile Ser Arg His Phe
260 265 270Thr Ser Glu Ile Gln
Gln Cys Leu Glu Phe Ile Tyr Arg Arg Trp Thr 275
280 285Gln Lys Gly Leu Ala His Asn Met His Cys Pro Ile
Pro Asp Ile Asp 290 295 300Asp Thr Ala
Met Gly Phe Arg Leu Leu Arg Gln His Gly Tyr Asp Val305
310 315 320Thr Pro Ser Val Phe Lys His
Phe Glu Lys Asp Gly Lys Phe Val Cys 325
330 335Phe Pro Met Glu Thr Asn His Ala Ser Val Thr Pro
Met His Asn Thr 340 345 350Tyr
Arg Ala Ser Gln Phe Met Phe Pro Gly Asp Asp Asp Val Leu Ala 355
360 365Arg Ala Gly Arg Tyr Cys Arg Ala Phe
Leu Gln Glu Arg Gln Ser Ser 370 375
380Asn Lys Leu Tyr Asp Lys Trp Ile Ile Thr Lys Asp Leu Pro Gly Glu385
390 395 400Val Gly Tyr Thr
Leu Asn Phe Pro Trp Lys Ser Ser Leu Pro Arg Ile 405
410 415Glu Thr Arg Met Tyr Leu Asp Gln Tyr Gly
Gly Asn Asn Asp Val Trp 420 425
430Ile Ala Lys Val Leu Tyr Arg Met Asn Leu Val Ser Asn Asp Leu Tyr
435 440 445Leu Lys Met Ala Lys Ala Asp
Phe Thr Glu Tyr Gln Arg Leu Ser Arg 450 455
460Ile Glu Trp Asn Gly Leu Arg Lys Trp Tyr Phe Arg Asn His Leu
Gln465 470 475 480Arg Tyr
Gly Ala Thr Pro Lys Ser Ala Leu Lys Ala Tyr Phe Leu Ala
485 490 495Ser Ala Asn Ile Phe Glu Pro
Gly Arg Ala Ala Glu Arg Leu Ala Trp 500 505
510Ala Arg Met Ala Val Leu Ala Glu Ala Val Thr Thr His Phe
Arg His 515 520 525Ile Gly Gly Pro
Cys Tyr Ser Thr Glu Asn Leu Glu Glu Leu Ile Asp 530
535 540Leu Val Ser Phe Asp Asp Val Ser Gly Gly Leu Arg
Glu Ala Trp Lys545 550 555
560Gln Trp Leu Met Ala Trp Thr Ala Lys Glu Ser His Gly Ser Val Asp
565 570 575Gly Asp Thr Ala Leu
Leu Phe Val Arg Thr Ile Glu Ile Cys Ser Gly 580
585 590Arg Ile Val Ser Ser Glu Gln Lys Leu Asn Leu Trp
Asp Tyr Ser Gln 595 600 605Leu Glu
Gln Leu Thr Ser Ser Ile Cys His Lys Leu Ala Thr Ile Gly 610
615 620Leu Ser Gln Asn Glu Ala Ser Met Glu Asn Thr
Glu Asp Leu His Gln625 630 635
640Gln Val Asp Leu Glu Met Gln Glu Leu Ser Trp Arg Val His Gln Gly
645 650 655Cys His Gly Ile
Asn Arg Glu Thr Arg Gln Thr Phe Leu Asn Val Val 660
665 670Lys Ser Phe Tyr Tyr Ser Ala His Cys Ser Pro
Glu Thr Val Asp Ser 675 680 685His
Ile Ala Lys Val Ile Phe Gln Asp Val Ile 690 695
SEQ ID NO: 16, length 2100, DNA, Artificial Sequence, Optimized cDNA for E. coli expression encoding for TaTps1-del59
atgtatcgcc aaagaactga tgagccaagc
gaaacccgcc agatgatcga tgatattcgc 60accgctttgg ctagcctggg tgacgatgaa
accagcatga gcgtgagcgc atacgacacc 120gccctggttg ccctggtgaa gaacctggac
ggtggcgatg gcccgcagtt cccgagctgc 180attgactgga ttgttcagaa ccagctgccg
gacggtagct ggggcgaccc ggctttcttt 240atggttcagg accgtatgat cagcaccctg
gcctgtgtcg tggccgtgaa atcctggaat 300atcgatcgtg acaacttgtg cgatcgtggt
gtcctgttta tcaaagaaaa catgtcgcgt 360ctggttgaag aagaacaaga ttggatgcca
tgtggcttcg agattaactt tcctgcactg 420ttggagaaag ctaaagacct ggacttggac
attccgtacg atcatcctgt gctggaagag 480atttacgcga agcgtaatct gaaactgctg
aagattccgt tagatgtcct ccatgcgatc 540ccgacgacgc tgttgttttc cgttgagggt
atggtcgatc tgccgctgga ttgggagaaa 600ctgctgcgtc tgcgttgccc ggacggttct
tttcattcta gcccggcggc gacggcagcg 660gcgctgagcc acacgggtga caaagagtgt
cacgccttcc tggaccgcct gattcaaaag 720ttcgagggtg gcgtcccgtg ctcccacagc
atggacacct tcgagcaact gtgggttgtt 780gaccgtttga tgcgtctggg tatcagccgt
cattttacga gcgagatcca gcagtgcttg 840gagttcatct atcgtcgttg gacccagaaa
ggtctggcgc acaatatgca ctgcccgatc 900ccggacattg atgacactgc gatgggtttt
cgtctgttga gacagcacgg ttacgacgtg 960accccgtcgg ttttcaagca tttcgagaaa
gacggcaagt tcgtatgctt cccgatggaa 1020accaaccatg cgagcgtgac gccgatgcac
aatacctacc gtgcgagcca gttcatgttc 1080ccgggtgatg acgacgtgct ggcccgtgcc
ggccgctact gtcgcgcatt cttgcaagag 1140cgtcagagct ctaacaagtt gtacgataag
tggattatca cgaaagatct gccgggtgag 1200gttggctaca cgctgaactt tccgtggaaa
agctccctgc cgcgtattga aactcgtatg 1260tatctggatc agtacggtgg caataacgat
gtctggattg caaaggtcct gtatcgcatg 1320aacctggtta gcaatgacct gtacctgaaa
atggcgaaag ccgactttac cgagtatcaa 1380cgtctgtctc gcattgagtg gaacggcctg
cgcaaatggt attttcgcaa tcatctgcag 1440cgttacggtg cgaccccgaa gtccgcgctg
aaagcgtatt tcctggcgtc ggcaaacatc 1500tttgagcctg gccgcgcagc cgagcgcctg
gcatgggcac gtatggccgt gctggctgaa 1560gctgtaacga ctcatttccg tcacattggc
ggcccgtgct acagcaccga gaatctggaa 1620gaactgatcg accttgttag cttcgacgac
gtgagcggcg gcttgcgtga ggcgtggaag 1680caatggctga tggcgtggac cgcaaaagaa
tcacacggca gcgtggacgg tgacacggca 1740ctgctgtttg tccgcacgat tgagatttgc
agcggccgca tcgtttccag cgagcagaaa 1800ctgaatctgt gggattacag ccagttagag
caattgacca gcagcatctg tcataaactg 1860gccaccatcg gtctgagcca gaacgaagct
agcatggaaa ataccgaaga tctgcaccaa 1920caagtcgatt tggaaatgca agaactgtca
tggcgtgttc accagggttg tcacggtatt 1980aatcgcgaaa cccgtcaaac cttcctgaat
gttgttaagt ctttttatta ctccgcacac 2040tgcagcccgg aaaccgtgga cagccatatt
gcaaaagtga tctttcaaga cgttatctga 2100
SEQ ID NO: 17, length 785, PRT, Marrubium vulgare
Met
Gly Ser Leu Ser Thr Leu Asn Leu Ile Lys Thr Cys Val Thr Leu1
5 10 15Ala Ser Ser Glu Lys Leu Asn
Gln Pro Ser Gln Cys Tyr Thr Ile Ser 20 25
30Thr Cys Met Lys Ser Ser Asn Asn Pro Pro Phe Asn Tyr Tyr
Gln Ile 35 40 45Asn Gly Arg Lys
Lys Met Ser Thr Ala Ile Asp Ser Ser Val Asn Ala 50 55
60Pro Pro Glu Gln Lys Tyr Asn Ser Thr Ala Leu Glu His
Asp Thr Glu65 70 75
80Ile Ile Glu Ile Glu Asp His Ile Glu Cys Ile Arg Arg Leu Leu Arg
85 90 95Thr Ala Gly Asp Gly Arg
Ile Ser Val Ser Pro Tyr Asp Thr Ala Trp 100
105 110Ile Ala Leu Ile Lys Asp Leu Asp Gly His Asp Ser
Pro Gln Phe Pro 115 120 125Ser Ser
Met Glu Trp Val Ala Asp Asn Gln Leu Pro Asp Gly Ser Trp 130
135 140Gly Asp Glu His Phe Val Cys Val Tyr Asp Arg
Leu Val Asn Thr Ile145 150 155
160Ala Cys Val Val Ala Leu Arg Ser Trp Asn Val His Ala His Lys Cys
165 170 175Glu Lys Gly Ile
Lys Tyr Ile Lys Glu Asn Val His Lys Leu Glu Asp 180
185 190Ala Asn Glu Glu His Met Thr Cys Gly Phe Glu
Val Val Phe Pro Ala 195 200 205Leu
Leu Gln Arg Ala Gln Ser Met Gly Ile Lys Gly Ile Pro Tyr Asn 210
215 220Ala Pro Val Ile Glu Glu Ile Tyr Asn Ser
Arg Glu Lys Lys Leu Lys225 230 235
240Arg Ile Pro Met Glu Val Val His Lys Val Ala Thr Ser Leu Leu
Phe 245 250 255Ser Leu Glu
Gly Leu Glu Asn Leu Glu Trp Glu Lys Leu Leu Lys Leu 260
265 270Gln Ser Pro Asp Gly Ser Phe Leu Thr Ser
Pro Ser Ser Thr Ala Phe 275 280
285Ala Phe Ile His Thr Lys Asp Arg Lys Cys Phe Asn Phe Ile Asn Asn 290
295 300Ile Val His Thr Phe Lys Gly Gly
Ala Pro His Thr Tyr Pro Val Asp305 310
315 320Ile Phe Gly Arg Leu Trp Ala Val Asp Arg Leu Gln
Arg Leu Gly Ile 325 330
335Ser Arg Phe Phe Glu Ser Glu Ile Ala Glu Phe Leu Ser His Val His
340 345 350Arg Phe Trp Ser Asp Glu
Ala Gly Val Phe Ser Gly Arg Glu Ser Val 355 360
365Phe Cys Asp Ile Asp Asp Thr Ser Met Gly Leu Arg Leu Leu
Arg Met 370 375 380His Gly Tyr His Val
Asp Pro Asn Val Leu Lys Asn Phe Lys Gln Ser385 390
395 400Asp Lys Phe Ser Cys Tyr Gly Gly Gln Met
Met Glu Cys Ser Ser Pro 405 410
415Ile Tyr Asn Leu Tyr Arg Ala Ser Gln Leu Gln Phe Pro Gly Glu Glu
420 425 430Ile Leu Glu Glu Ala
Asn Lys Phe Ala Tyr Lys Phe Leu Gln Glu Lys 435
440 445Leu Glu Ser Asn Gln Ile Leu Asp Lys Trp Leu Ile
Ser Asn His Leu 450 455 460Ser Asp Glu
Ile Lys Val Gly Leu Glu Met Pro Trp Tyr Ala Thr Leu465
470 475 480Pro Arg Val Glu Thr Ser Tyr
Tyr Ile His His Tyr Gly Gly Gly Asp 485
490 495Asp Val Trp Ile Gly Lys Thr Leu Tyr Arg Met Pro
Glu Ile Ser Asn 500 505 510Asp
Thr Tyr Arg Glu Leu Ala Arg Leu Asp Phe Arg Arg Cys Gln Ala 515
520 525Gln His Gln Leu Glu Trp Ile Tyr Met
Gln Arg Trp Tyr Glu Ser Cys 530 535
540Arg Met Gln Glu Phe Gly Ile Ser Arg Lys Glu Val Leu Arg Ala Tyr545
550 555 560Phe Leu Ala Ser
Gly Thr Ile Phe Glu Val Glu Arg Ala Lys Glu Arg 565
570 575Val Ala Trp Ala Arg Ser Gln Ile Ile Ser
His Met Ile Lys Ser Phe 580 585
590Phe Asn Lys Glu Thr Thr Ser Ser Asp Gln Lys Gln Ala Leu Leu Thr
595 600 605Glu Leu Leu Phe Gly Asn Ile
Ser Ala Ser Glu Thr Glu Lys Arg Glu 610 615
620Leu Asp Gly Val Val Val Ala Thr Leu Arg Gln Phe Leu Glu Gly
Phe625 630 635 640Asp Ile
Gly Thr Arg His Gln Val Lys Ala Ala Trp Asp Val Trp Leu
645 650 655Arg Lys Val Glu Gln Gly Glu
Ala His Gly Gly Ala Asp Ala Glu Leu 660 665
670Cys Thr Thr Thr Leu Asn Thr Cys Ala Asn Gln His Leu Ser
Ser His 675 680 685Pro Asp Tyr Asn
Thr Leu Ser Lys Leu Thr Asn Lys Ile Cys His Lys 690
695 700Leu Ser Gln Ile Gln His Gln Lys Glu Met Lys Gly
Gly Ile Lys Ala705 710 715
720Lys Cys Ser Ile Asn Asn Lys Glu Val Asp Ile Glu Met Gln Trp Leu
725 730 735Val Lys Leu Val Leu
Glu Lys Ser Gly Leu Asn Arg Lys Ala Lys Gln 740
745 750Ala Phe Leu Ser Ile Ala Lys Thr Tyr Tyr Tyr Arg
Ala Tyr Tyr Ala 755 760 765Asp Gln
Thr Met Asp Ala His Ile Phe Lys Val Leu Phe Glu Pro Val 770
775 780Val 785
SEQ ID NO: 18, length 723, PRT, Artificial Sequence, Truncated copalyl diphosphate synthase from Marrubium vulgare
Met Ala Pro
Pro Glu Gln Lys Tyr Asn Ser Thr Ala Leu Glu His Asp1 5
10 15Thr Glu Ile Ile Glu Ile Glu Asp His
Ile Glu Cys Ile Arg Arg Leu 20 25
30Leu Arg Thr Ala Gly Asp Gly Arg Ile Ser Val Ser Pro Tyr Asp Thr
35 40 45Ala Trp Ile Ala Leu Ile Lys
Asp Leu Asp Gly His Asp Ser Pro Gln 50 55
60Phe Pro Ser Ser Met Glu Trp Val Ala Asp Asn Gln Leu Pro Asp Gly65
70 75 80Ser Trp Gly Asp
Glu His Phe Val Cys Val Tyr Asp Arg Leu Val Asn 85
90 95Thr Ile Ala Cys Val Val Ala Leu Arg Ser
Trp Asn Val His Ala His 100 105
110Lys Cys Glu Lys Gly Ile Lys Tyr Ile Lys Glu Asn Val His Lys Leu
115 120 125Glu Asp Ala Asn Glu Glu His
Met Thr Cys Gly Phe Glu Val Val Phe 130 135
140Pro Ala Leu Leu Gln Arg Ala Gln Ser Met Gly Ile Lys Gly Ile
Pro145 150 155 160Tyr Asn
Ala Pro Val Ile Glu Glu Ile Tyr Asn Ser Arg Glu Lys Lys
165 170 175Leu Lys Arg Ile Pro Met Glu
Val Val His Lys Val Ala Thr Ser Leu 180 185
190Leu Phe Ser Leu Glu Gly Leu Glu Asn Leu Glu Trp Glu Lys
Leu Leu 195 200 205Lys Leu Gln Ser
Pro Asp Gly Ser Phe Leu Thr Ser Pro Ser Ser Thr 210
215 220Ala Phe Ala Phe Ile His Thr Lys Asp Arg Lys Cys
Phe Asn Phe Ile225 230 235
240Asn Asn Ile Val His Thr Phe Lys Gly Gly Ala Pro His Thr Tyr Pro
245 250 255Val Asp Ile Phe Gly
Arg Leu Trp Ala Val Asp Arg Leu Gln Arg Leu 260
265 270Gly Ile Ser Arg Phe Phe Glu Ser Glu Ile Ala Glu
Phe Leu Ser His 275 280 285Val His
Arg Phe Trp Ser Asp Glu Ala Gly Val Phe Ser Gly Arg Glu 290
295 300Ser Val Phe Cys Asp Ile Asp Asp Thr Ser Met
Gly Leu Arg Leu Leu305 310 315
320Arg Met His Gly Tyr His Val Asp Pro Asn Val Leu Lys Asn Phe Lys
325 330 335Gln Ser Asp Lys
Phe Ser Cys Tyr Gly Gly Gln Met Met Glu Cys Ser 340
345 350Ser Pro Ile Tyr Asn Leu Tyr Arg Ala Ser Gln
Leu Gln Phe Pro Gly 355 360 365Glu
Glu Ile Leu Glu Glu Ala Asn Lys Phe Ala Tyr Lys Phe Leu Gln 370
375 380Glu Lys Leu Glu Ser Asn Gln Ile Leu Asp
Lys Trp Leu Ile Ser Asn385 390 395
400His Leu Ser Asp Glu Ile Lys Val Gly Leu Glu Met Pro Trp Tyr
Ala 405 410 415Thr Leu Pro
Arg Val Glu Thr Ser Tyr Tyr Ile His His Tyr Gly Gly 420
425 430Gly Asp Asp Val Trp Ile Gly Lys Thr Leu
Tyr Arg Met Pro Glu Ile 435 440
445Ser Asn Asp Thr Tyr Arg Glu Leu Ala Arg Leu Asp Phe Arg Arg Cys 450
455 460Gln Ala Gln His Gln Leu Glu Trp
Ile Tyr Met Gln Arg Trp Tyr Glu465 470
475 480Ser Cys Arg Met Gln Glu Phe Gly Ile Ser Arg Lys
Glu Val Leu Arg 485 490
495Ala Tyr Phe Leu Ala Ser Gly Thr Ile Phe Glu Val Glu Arg Ala Lys
500 505 510Glu Arg Val Ala Trp Ala
Arg Ser Gln Ile Ile Ser His Met Ile Lys 515 520
525Ser Phe Phe Asn Lys Glu Thr Thr Ser Ser Asp Gln Lys Gln
Ala Leu 530 535 540Leu Thr Glu Leu Leu
Phe Gly Asn Ile Ser Ala Ser Glu Thr Glu Lys545 550
555 560Arg Glu Leu Asp Gly Val Val Val Ala Thr
Leu Arg Gln Phe Leu Glu 565 570
575Gly Phe Asp Ile Gly Thr Arg His Gln Val Lys Ala Ala Trp Asp Val
580 585 590Trp Leu Arg Lys Val
Glu Gln Gly Glu Ala His Gly Gly Ala Asp Ala 595
600 605Glu Leu Cys Thr Thr Thr Leu Asn Thr Cys Ala Asn
Gln His Leu Ser 610 615 620Ser His Pro
Asp Tyr Asn Thr Leu Ser Lys Leu Thr Asn Lys Ile Cys625
630 635 640His Lys Leu Ser Gln Ile Gln
His Gln Lys Glu Met Lys Gly Gly Ile 645
650 655Lys Ala Lys Cys Ser Ile Asn Asn Lys Glu Val Asp
Ile Glu Met Gln 660 665 670Trp
Leu Val Lys Leu Val Leu Glu Lys Ser Gly Leu Asn Arg Lys Ala 675
680 685Lys Gln Ala Phe Leu Ser Ile Ala Lys
Thr Tyr Tyr Tyr Arg Ala Tyr 690 695
700Tyr Ala Asp Gln Thr Met Asp Ala His Ile Phe Lys Val Leu Phe Glu705
710 715 720Pro Val
Val
SEQ ID NO: 19, length 2172, DNA, Artificial Sequence, Optimized cDNA for E. coli expression encoding for MvCps3-del63
atggccccgc cggaacaaaa gtacaacagc
actgcattag aacacgacac cgagattatt 60gagatcgagg accacatcga gtgtatccgc
cgtctgctgc gtaccgcggg tgatggtcgt 120attagcgtga gcccgtatga taccgcgtgg
attgcactga ttaaagattt ggatggccac 180gactccccgc aattcccgtc gagcatggaa
tgggttgctg ataatcagct gccggacggt 240agctggggtg acgagcactt cgtttgcgtt
tacgatcgcc tggttaatac catcgcatgc 300gtcgtggcgc tgcgcagctg gaatgtccat
gcacataagt gcgagaaagg tattaagtac 360attaaagaaa atgtccacaa actggaagat
gcgaacgaag aacacatgac ttgcggcttc 420gaagtcgttt ttccggcctt gctgcagcgt
gcacagagca tgggtattaa gggcatcccg 480tacaacgcgc ctgtcattga agaaatttac
aattcccgtg agaaaaagct gaaacgtatt 540ccgatggaag ttgtccacaa agtcgcgacc
agcctgctgt tctccctgga aggtctggag 600aacctggagt gggagaaatt gctgaaactg
cagagcccgg acggttcgtt tctgaccagc 660ccgagctcta cggcattcgc gtttatccat
accaaagacc gtaaatgttt taactttatt 720aacaatatcg ttcatacctt taagggtggt
gcaccgcaca cgtaccctgt ggacatcttt 780ggccgcctgt gggcagtgga tcgcttgcag
cgtctgggta ttagccgctt cttcgagagc 840gagatcgcgg aatttctgag ccacgtgcac
cgtttttgga gcgacgaagc gggcgttttc 900agcggccgtg agagcgtgtt ctgtgatatt
gatgacacca gcatgggtct gcgcctgctt 960cgtatgcatg gctaccatgt agacccaaac
gttctgaaga acttcaagca atctgacaag 1020tttagctgct acggtggcca gatgatggaa
tgcagcagcc caatttacaa tctgtaccgt 1080gcgagccaac tgcaatttcc gggtgaagaa
atcttggaag aggctaacaa attcgcgtat 1140aagtttttgc aagagaaact ggagtccaat
cagattctgg acaagtggct gatctccaac 1200cacctgagcg acgaaatcaa agttggcctg
gaaatgccgt ggtatgcgac cttgccgcgc 1260gttgagacta gctattatat tcaccattac
ggcggtggcg acgatgtgtg gattggtaaa 1320acgctgtatc gcatgccgga aattagcaac
gacacctacc gtgagctggc acgtctggac 1380ttccgccgct gccaggcgca gcaccagttg
gaatggatct atatgcaacg ttggtatgag 1440agctgtcgta tgcaagaatt tggtatttcc
cgcaaagaag tcctgcgtgc ctacttcctg 1500gcctctggca cgattttcga agttgagcgc
gccaaagagc gcgtggcgtg ggctcgtagc 1560caaatcattt cccacatgat caagagcttc
ttcaataaag aaaccacgag cagcgatcag 1620aaacaagcgc tgctgaccga gttgctgttt
ggtaacatct ctgcaagcga gactgagaaa 1680cgtgagctgg atggtgttgt ggttgcgacc
ctgcgtcagt tcctggaagg cttcgatatc 1740ggcacccgtc accaagtgaa ggcagcgtgg
gatgtgtggc tgcgtaaagt cgaacagggt 1800gaggcacatg gtggcgcgga cgccgagttg
tgtacgacga cgctgaacac gtgcgcgaat 1860cagcatctgt ctagccatcc ggactacaat
accctgtcga aactcaccaa taagatttgt 1920cacaagctgt cccaaatcca gcatcagaaa
gaaatgaagg gcggtattaa ggcaaagtgc 1980tctatcaata acaaagaagt ggatatcgag
atgcaatggc tggtcaaact ggtcctggag 2040aaatccggtc tgaaccgcaa ggctaaacaa
gcgtttctga gcattgccaa aacctattat 2100tatcgtgctt actatgccga ccagacgatg
gatgcccaca tcttcaaggt cctgtttgaa 2160ccggtcgtgt aa
2172
SEQ ID NO: 20, length 799, PRT, Rosmarinus officinalis
Met
Thr Ser Met Ser Ser Leu Asn Leu Ser Arg Ala Pro Ala Ile Ser1
5 10 15Arg Arg Leu Gln Leu Pro Ala
Lys Val Gln Leu Pro Glu Phe Tyr Ala 20 25
30Val Cys Ser Trp Leu Asn Asn Ser Ser Lys His Thr Pro Leu
Ser Cys 35 40 45His Ile His Arg
Lys Gln Leu Ser Lys Val Thr Lys Cys Arg Val Ala 50 55
60Ser Leu Asp Ala Ser Gln Val Ser Glu Lys Gly Thr Ser
Ser Pro Val65 70 75
80Gln Thr Pro Glu Glu Val Asn Glu Lys Ile Glu Asn Tyr Ile Glu Tyr
85 90 95Ile Lys Asn Leu Leu Thr
Thr Ser Gly Asp Gly Arg Ile Ser Val Ser 100
105 110Pro Tyr Asp Thr Ser Ile Val Ala Leu Ile Lys Asp
Leu Lys Gly Arg 115 120 125Asp Thr
Pro Gln Phe Pro Ser Cys Leu Glu Trp Ile Ala Gln His Gln 130
135 140Met Ala Asp Gly Ser Trp Gly Asp Glu Phe Phe
Cys Ile Tyr Asp Arg145 150 155
160Ile Leu Asn Thr Leu Ala Cys Val Val Ala Leu Lys Ser Trp Asn Val
165 170 175His Ala Asp Met
Ile Glu Lys Gly Val Thr Tyr Val Asn Glu Asn Val 180
185 190Gln Lys Leu Glu Asp Gly Asn Leu Glu His Met
Thr Ser Gly Phe Glu 195 200 205Ile
Val Val Pro Ala Leu Val Gln Arg Ala Gln Asp Leu Gly Ile Gln 210
215 220Gly Leu Pro Tyr Asp His Pro Leu Ile Lys
Glu Ile Ala Asn Thr Lys225 230 235
240Glu Gly Arg Leu Lys Lys Ile Pro Lys Asp Met Ile Tyr Gln Lys
Pro 245 250 255Thr Thr Leu
Leu Phe Ser Leu Glu Gly Leu Gly Asp Leu Glu Trp Glu 260
265 270Lys Ile Leu Lys Leu Gln Ser Gly Asp Gly
Ser Phe Leu Thr Ser Pro 275 280
285Ser Ser Thr Ala His Val Phe Met Lys Thr Lys Asp Glu Lys Cys Leu 290
295 300Lys Phe Ile Glu Asn Ala Val Lys
Asn Cys Asn Gly Gly Ala Pro His305 310
315 320Thr Tyr Pro Val Asp Val Phe Ala Arg Leu Trp Ala
Val Asp Arg Leu 325 330
335Gln Arg Leu Gly Ile Ser Arg Phe Phe Gln Gln Glu Ile Lys Tyr Phe
340 345 350Leu Asp His Ile Asn Ser
Val Trp Thr Glu Asn Gly Val Phe Ser Gly 355 360
365Arg Asp Ser Glu Phe Cys Asp Ile Asp Asp Thr Ser Met Gly
Ile Arg 370 375 380Leu Leu Lys Met His
Gly Tyr Asp Ile Asp Pro Asn Ala Leu Glu His385 390
395 400Phe Lys Gln Gln Asp Gly Lys Phe Ser Cys
Tyr Gly Gly Gln Met Ile 405 410
415Glu Ser Ala Ser Pro Ile Tyr Asn Leu Tyr Arg Ala Ala Gln Leu Arg
420 425 430Phe Pro Gly Glu Glu
Ile Leu Glu Glu Ala Thr Lys Phe Ala Tyr Asn 435
440 445Phe Leu Gln Glu Lys Ile Ala Asn Asp Gln Phe Gln
Glu Lys Trp Val 450 455 460Ile Ser Asp
His Leu Ile Asp Glu Val Lys Leu Gly Leu Lys Met Pro465
470 475 480Trp Tyr Ala Thr Leu Pro Arg
Val Glu Ala Ala Tyr Tyr Leu Gln Tyr 485
490 495Tyr Ala Gly Cys Gly Asp Val Trp Ile Gly Lys Val
Phe Tyr Arg Met 500 505 510Pro
Glu Ile Ser Asn Asp Thr Tyr Lys Lys Leu Ala Ile Leu Asp Phe 515
520 525Asn Arg Cys Gln Ala Gln His Gln Phe
Glu Trp Ile Tyr Met Gln Glu 530 535
540Trp Tyr His Arg Ser Ser Val Ser Glu Phe Gly Ile Ser Lys Lys Asp545
550 555 560Leu Leu Arg Ala
Tyr Phe Leu Ala Ala Ala Thr Ile Phe Glu Pro Glu 565
570 575Arg Thr Gln Glu Arg Leu Val Trp Ala Lys
Thr Gln Ile Val Ser Gly 580 585
590Met Ile Thr Ser Phe Val Asn Ser Gly Thr Thr Leu Ser Leu His Gln
595 600 605Lys Thr Ala Leu Leu Ser Gln
Ile Gly His Asn Phe Asp Gly Leu Asp 610 615
620Glu Ile Ile Ser Ala Met Lys Asp His Gly Leu Ala Ala Thr Leu
Leu625 630 635 640Thr Thr
Phe Gln Gln Leu Leu Asp Gly Phe Asp Arg Tyr Thr Arg His
645 650 655Gln Leu Lys Asn Ala Trp Ser
Gln Trp Phe Met Lys Leu Gln Gln Gly 660 665
670Glu Ala Ser Gly Gly Glu Asp Ala Glu Leu Leu Ala Asn Thr
Leu Asn 675 680 685Ile Cys Ala Gly
Leu Ile Ala Phe Asn Glu Asp Val Leu Ser His His 690
695 700Glu Tyr Thr Thr Leu Ser Thr Leu Thr Asn Lys Ile
Cys Lys Arg Leu705 710 715
720Thr Gln Ile Gln Asp Lys Lys Thr Leu Glu Val Val Asp Gly Ser Ile
725 730 735Lys Asp Lys Glu Leu
Glu Lys Asp Ile Gln Met Leu Val Lys Leu Val 740
745 750Leu Glu Glu Asn Gly Gly Gly Val Asp Arg Asn Ile
Lys His Thr Phe 755 760 765Leu Ser
Val Phe Lys Thr Phe Tyr Tyr Asn Ala Tyr His Asp Asp Glu 770
775 780Thr Thr Asp Val His Ile Phe Lys Val Leu Phe
Gly Pro Val Val785 790
795
SEQ ID NO: 21, length 733, PRT, Artificial Sequence, Truncated copalyl diphosphate synthase from Rosmarinus officinalis
Met Ala Ser Gln Val Ser Glu Lys Gly Thr Ser
Ser Pro Val Gln Thr1 5 10
15Pro Glu Glu Val Asn Glu Lys Ile Glu Asn Tyr Ile Glu Tyr Ile Lys
20 25 30Asn Leu Leu Thr Thr Ser Gly
Asp Gly Arg Ile Ser Val Ser Pro Tyr 35 40
45Asp Thr Ser Ile Val Ala Leu Ile Lys Asp Leu Lys Gly Arg Asp
Thr 50 55 60Pro Gln Phe Pro Ser Cys
Leu Glu Trp Ile Ala Gln His Gln Met Ala65 70
75 80Asp Gly Ser Trp Gly Asp Glu Phe Phe Cys Ile
Tyr Asp Arg Ile Leu 85 90
95Asn Thr Leu Ala Cys Val Val Ala Leu Lys Ser Trp Asn Val His Ala
100 105 110Asp Met Ile Glu Lys Gly
Val Thr Tyr Val Asn Glu Asn Val Gln Lys 115 120
125Leu Glu Asp Gly Asn Leu Glu His Met Thr Ser Gly Phe Glu
Ile Val 130 135 140Val Pro Ala Leu Val
Gln Arg Ala Gln Asp Leu Gly Ile Gln Gly Leu145 150
155 160Pro Tyr Asp His Pro Leu Ile Lys Glu Ile
Ala Asn Thr Lys Glu Gly 165 170
175Arg Leu Lys Lys Ile Pro Lys Asp Met Ile Tyr Gln Lys Pro Thr Thr
180 185 190Leu Leu Phe Ser Leu
Glu Gly Leu Gly Asp Leu Glu Trp Glu Lys Ile 195
200 205Leu Lys Leu Gln Ser Gly Asp Gly Ser Phe Leu Thr
Ser Pro Ser Ser 210 215 220Thr Ala His
Val Phe Met Lys Thr Lys Asp Glu Lys Cys Leu Lys Phe225
230 235 240Ile Glu Asn Ala Val Lys Asn
Cys Asn Gly Gly Ala Pro His Thr Tyr 245
250 255Pro Val Asp Val Phe Ala Arg Leu Trp Ala Val Asp
Arg Leu Gln Arg 260 265 270Leu
Gly Ile Ser Arg Phe Phe Gln Gln Glu Ile Lys Tyr Phe Leu Asp 275
280 285His Ile Asn Ser Val Trp Thr Glu Asn
Gly Val Phe Ser Gly Arg Asp 290 295
300Ser Glu Phe Cys Asp Ile Asp Asp Thr Ser Met Gly Ile Arg Leu Leu305
310 315 320Lys Met His Gly
Tyr Asp Ile Asp Pro Asn Ala Leu Glu His Phe Lys 325
330 335Gln Gln Asp Gly Lys Phe Ser Cys Tyr Gly
Gly Gln Met Ile Glu Ser 340 345
350Ala Ser Pro Ile Tyr Asn Leu Tyr Arg Ala Ala Gln Leu Arg Phe Pro
355 360 365Gly Glu Glu Ile Leu Glu Glu
Ala Thr Lys Phe Ala Tyr Asn Phe Leu 370 375
380Gln Glu Lys Ile Ala Asn Asp Gln Phe Gln Glu Lys Trp Val Ile
Ser385 390 395 400Asp His
Leu Ile Asp Glu Val Lys Leu Gly Leu Lys Met Pro Trp Tyr
405 410 415Ala Thr Leu Pro Arg Val Glu
Ala Ala Tyr Tyr Leu Gln Tyr Tyr Ala 420 425
430Gly Cys Gly Asp Val Trp Ile Gly Lys Val Phe Tyr Arg Met
Pro Glu 435 440 445Ile Ser Asn Asp
Thr Tyr Lys Lys Leu Ala Ile Leu Asp Phe Asn Arg 450
455 460Cys Gln Ala Gln His Gln Phe Glu Trp Ile Tyr Met
Gln Glu Trp Tyr465 470 475
480His Arg Ser Ser Val Ser Glu Phe Gly Ile Ser Lys Lys Asp Leu Leu
485 490 495Arg Ala Tyr Phe Leu
Ala Ala Ala Thr Ile Phe Glu Pro Glu Arg Thr 500
505 510Gln Glu Arg Leu Val Trp Ala Lys Thr Gln Ile Val
Ser Gly Met Ile 515 520 525Thr Ser
Phe Val Asn Ser Gly Thr Thr Leu Ser Leu His Gln Lys Thr 530
535 540Ala Leu Leu Ser Gln Ile Gly His Asn Phe Asp
Gly Leu Asp Glu Ile545 550 555
560Ile Ser Ala Met Lys Asp His Gly Leu Ala Ala Thr Leu Leu Thr Thr
565 570 575Phe Gln Gln Leu
Leu Asp Gly Phe Asp Arg Tyr Thr Arg His Gln Leu 580
585 590Lys Asn Ala Trp Ser Gln Trp Phe Met Lys Leu
Gln Gln Gly Glu Ala 595 600 605Ser
Gly Gly Glu Asp Ala Glu Leu Leu Ala Asn Thr Leu Asn Ile Cys 610
615 620Ala Gly Leu Ile Ala Phe Asn Glu Asp Val
Leu Ser His His Glu Tyr625 630 635
640Thr Thr Leu Ser Thr Leu Thr Asn Lys Ile Cys Lys Arg Leu Thr
Gln 645 650 655Ile Gln Asp
Lys Lys Thr Leu Glu Val Val Asp Gly Ser Ile Lys Asp 660
665 670Lys Glu Leu Glu Lys Asp Ile Gln Met Leu
Val Lys Leu Val Leu Glu 675 680
685Glu Asn Gly Gly Gly Val Asp Arg Asn Ile Lys His Thr Phe Leu Ser 690
695 700Val Phe Lys Thr Phe Tyr Tyr Asn
Ala Tyr His Asp Asp Glu Thr Thr705 710
715 720Asp Val His Ile Phe Lys Val Leu Phe Gly Pro Val
Val 725 730
SEQ ID NO: 22, length 2202, DNA, Artificial Sequence, Optimized cDNA for E. coli expression encoding for RoCPS1-del67
atggcatcac aagttagcga gaaaggcacc agctccccag ttcaaacgcc
agaggaagtg 60aacgaaaaga tcgagaatta cattgagtat attaaaaatc tgctgactac
ttcgggcgac 120ggccgcatca gcgtcagccc gtacgacacg agcatcgttg ccctgattaa
agacctgaag 180ggtcgtgaca ccccgcagtt tccgtcctgt ctggagtgga ttgcccaaca
ccaaatggcc 240gatggttcct ggggtgatga atttttctgc atttacgacc gcatcctgaa
tacgctggct 300tgtgttgtcg ccctgaagtc ctggaatgtt catgcagaca tgatcgaaaa
gggtgtcact 360tacgttaacg aaaacgtgca gaaactggaa gatggcaatc tggagcacat
gacgagcggt 420ttcgagattg ttgtcccggc gctggttcag agagcgcaag acctgggcat
ccagggcctg 480ccgtatgatc atccgttgat caaagaaatc gcaaacacca aagagggccg
cctgaagaaa 540attcctaaag acatgattta tcagaaaccg actacgctgc tgttcagcct
ggaaggcttg 600ggcgacctgg agtgggaaaa gatcctgaag ttacagtctg gtgatggttc
tttcctgacc 660agcccgagct ctacggccca tgttttcatg aaaaccaaag atgagaagtg
tctgaagttt 720attgaaaatg ccgtcaagaa ttgcaacggt ggcgcgcctc acacctaccc
ggtggacgtt 780ttcgctcgtc tgtgggccgt cgatcgtctg caacgcctgg gcatctcgcg
tttcttccag 840caagagatta agtacttcct ggaccacatt aatagcgtgt ggaccgaaaa
cggcgttttc 900agcggtcgcg acagcgagtt ttgtgatatt gatgacacct ctatgggtat
ccgtttgctg 960aagatgcacg gttacgacat tgacccgaat gccctggagc actttaaaca
acaggatggt 1020aagttctcct gctacggtgg tcagatgatt gagagcgcga gcccgatcta
caacctgtac 1080cgtgctgcgc agctgcgttt tccgggtgaa gagattctgg aagaggccac
caaatttgcg 1140tataattttt tgcaagagaa aattgcaaac gaccaattcc aggaaaaatg
ggttattagc 1200gatcacctta tcgatgaagt gaaactgggt ttgaagatgc cgtggtacgc
gacgctgcca 1260cgtgtcgagg cagcgtatta tctgcagtat tatgcgggct gtggtgatgt
gtggatcggc 1320aaagtgttct accgtatgcc ggaaatcagc aatgacacct acaagaaact
ggccatcctg 1380gatttcaacc gttgccaggc gcaacaccaa ttcgagtgga tctacatgca
agagtggtat 1440catcgtagca gcgtttctga gtttggcatt tccaaaaaag acttgctgcg
cgcgtatttt 1500ctggcggcag cgaccatttt cgaaccggag cgcacccagg aacgtctggt
gtgggctaag 1560acgcaaatcg tcagcggtat gattacgtcc tttgttaata gcggtacgac
tctgagcctg 1620caccagaaaa cggcactgtt gagccaaatc ggtcataact ttgacggcct
ggatgagatt 1680atcagcgcga tgaaagacca cggcctggca gcgacgctgt taacgacctt
tcaacagctg 1740ctggacggct tcgatcgcta cacccgtcat cagctgaaaa acgcgtggag
ccagtggttc 1800atgaagctgc aacagggtga ggcgtcgggt ggcgaagatg ctgagctgct
ggctaatacc 1860ctgaacattt gcgcgggttt gattgcgttt aatgaagatg tgttgagcca
ccatgagtac 1920accaccctga gcaccctgac caacaagatc tgtaagcgct tgactcaaat
ccaggataag 1980aaaacgctgg aagtcgtgga tggtagcatc aaagataaag aactggaaaa
agacattcaa 2040atgctggtga aactggtcct tgaagagaac ggcggtggcg ttgaccgtaa
catcaagcac 2100accttcctga gcgtctttaa aaccttttat tataatgcct atcatgacga
tgaaacgacc 2160gacgtgcaca ttttcaaagt tctgttcggt ccggtcgtgt aa
2202
SEQ ID NO: 23, length 766, PRT, Artificial Sequence, Truncated putative sclareol synthase from Nicotiana glutinosa
Met Ala Asn Phe His Arg Pro Ser
Arg Val Arg Cys Ser His Ser Thr1 5 10
15Ala Ser Ser Leu Glu Glu Ala Lys Glu Arg Ile Arg Glu Thr
Phe Gly 20 25 30Lys Asn Glu
Leu Ser Pro Ser Ser Tyr Asp Thr Ala Trp Val Ala Met 35
40 45Val Pro Ser Arg Tyr Ser Met Asn Gln Pro Cys
Phe Pro Arg Cys Leu 50 55 60Asp Trp
Ile Leu Glu Asn Gln Arg Glu Asp Gly Ser Trp Gly Leu Asn65
70 75 80Pro Ser His Pro Leu Leu Val
Lys Asp Ser Leu Ser Ser Thr Leu Ala 85 90
95Cys Leu Leu Ala Leu Arg Lys Trp Arg Ile Gly Asp Asn
Gln Val Gln 100 105 110Arg Gly
Leu Gly Phe Ile Glu Thr His Gly Trp Ala Val Asp Asn Val 115
120 125Asp Gln Ile Ser Pro Leu Gly Phe Asp Ile
Ile Phe Pro Ser Met Ile 130 135 140Lys
Tyr Ala Glu Lys Leu Asn Leu Asp Leu Pro Phe Asp Pro Asn Leu145
150 155 160Val Asn Met Met Leu Arg
Glu Arg Glu Leu Thr Ile Glu Arg Ala Leu 165
170 175Lys Asn Glu Phe Glu Gly Asn Met Ala Asn Val Glu
Tyr Phe Ala Glu 180 185 190Gly
Leu Gly Glu Leu Cys His Trp Lys Glu Ile Met Leu His Gln Arg 195
200 205Arg Asn Gly Ser Leu Phe Asp Ser Pro
Ala Thr Thr Ala Ala Ala Leu 210 215
220Ile Tyr His Gln His Asp Glu Lys Cys Phe Gly Tyr Leu Ser Ser Ile225
230 235 240Leu Lys Leu His
Glu Asn Trp Val Pro Thr Ile Tyr Pro Thr Lys Val 245
250 255His Ser Asn Leu Phe Phe Val Asp Ala Leu
Gln Asn Leu Gly Val Asp 260 265
270Arg Tyr Phe Lys Thr Glu Leu Lys Ser Val Leu Asp Glu Ile Tyr Arg
275 280 285Leu Trp Leu Glu Lys Asn Glu
Glu Ile Phe Ser Asp Ile Ala His Cys 290 295
300Ala Met Ala Phe Arg Leu Leu Arg Met Asn Asn Tyr Glu Val Ser
Ser305 310 315 320Glu Glu
Leu Glu Gly Phe Val Asp Gln Glu His Phe Phe Thr Thr Ser
325 330 335Gly Gly Lys Leu Ile Ser His
Val Ala Ile Leu Glu Leu His Arg Ala 340 345
350Ser Gln Val Asp Ile Gln Glu Gly Lys Asp Leu Ile Leu Asp
Lys Ile 355 360 365Ser Thr Trp Thr
Arg Asn Phe Met Glu Gln Glu Leu Leu Asp Asn Gln 370
375 380Ile Leu Asp Arg Ser Lys Lys Glu Met Glu Phe Ala
Met Arg Lys Phe385 390 395
400Tyr Gly Thr Phe Asp Arg Val Glu Thr Arg Arg Tyr Ile Glu Ser Tyr
405 410 415Lys Met Asp Ser Phe
Lys Ile Leu Lys Ala Ala Tyr Arg Ser Ser Asn 420
425 430Ile Asn Asn Ile Asp Leu Leu Lys Phe Ser Glu His
Asp Phe Asn Leu 435 440 445Cys Gln
Ala Arg His Lys Glu Glu Leu Gln Gln Ile Lys Arg Trp Phe 450
455 460Ala Asp Cys Lys Leu Glu Gln Val Gly Ser Ser
Gln Asn Tyr Leu Tyr465 470 475
480Thr Ser Tyr Phe Pro Ile Ala Ala Ile Leu Phe Glu Pro Glu Tyr Gly
485 490 495Asp Ala Arg Leu
Ala Phe Ala Lys Cys Gly Ile Ile Ala Thr Thr Val 500
505 510Asp Asp Phe Phe Asp Gly Phe Ala Cys Asn Glu
Glu Leu Gln Asn Ile 515 520 525Ile
Glu Leu Val Glu Arg Trp Asp Gly Tyr Pro Thr Val Gly Phe Arg 530
535 540Ser Glu Arg Val Arg Ile Phe Phe Leu Ala
Leu Tyr Lys Met Ile Glu545 550 555
560Glu Ile Ala Ala Lys Ala Glu Thr Lys Gln Gly Arg Cys Val Lys
Asp 565 570 575Leu Leu Ile
Asn Leu Trp Ile Asp Leu Leu Lys Cys Met Leu Val Glu 580
585 590Leu Asp Leu Trp Lys Ile Lys Ser Thr Thr
Pro Ser Ile Glu Glu Tyr 595 600
605Leu Ser Ile Ala Cys Val Thr Thr Gly Val Lys Cys Leu Ile Leu Ile 610
615 620Ser Leu His Leu Leu Gly Pro Lys
Leu Ser Lys Asp Val Thr Glu Ser625 630
635 640Ser Glu Val Ser Ala Leu Trp Asn Cys Thr Ala Val
Val Ala Arg Leu 645 650
655Asn Asn Asp Ile His Ser Tyr Lys Arg Glu Gln Ala Glu Ser Ser Thr
660 665 670Asn Met Ala Ala Ile Leu
Ile Ser Gln Ser Gln Arg Thr Ile Ser Glu 675 680
685Glu Glu Ala Ile Arg Gln Ile Lys Glu Met Met Glu Ser Lys
Arg Arg 690 695 700Glu Leu Leu Gly Met
Val Leu Gln Asn Lys Glu Ser Gln Leu Pro Gln705 710
715 720Val Cys Lys Asp Leu Phe Trp Thr Thr Phe
Lys Ala Ala Tyr Ser Ile 725 730
735Tyr Thr His Gly Asp Glu Tyr Arg Phe Pro Gln Glu Leu Lys Asn His
740 745 750Ile Asn Asp Val Ile
Tyr Lys Pro Leu Asn Gln Tyr Ser Pro 755 760
765
SEQ ID NO: 24 (2301 nt, DNA, Artificial Sequence): Optimized cDNA for E. coli expression encoding for NgSCS-del29
atggctaatt tccatcgccc
atcccgtgtt cgttgttccc actctaccgc aagctccctg 60gaagaggcaa aagagcgcat
ccgtgaaacc ttcggcaaaa atgaactctc tccttctagc 120tatgatacgg cctgggttgc
tatggtcccg agccgctaca gcatgaacca gccgtgcttt 180ccgcgctgcc tggactggat
tctggagaac caacgtgagg atggcagctg gggtctgaac 240ccgagccatc cgttactggt
gaaagacagc ttgagcagca cgctggcgtg tttgctggcg 300ctgcgtaagt ggcgtattgg
cgacaaccaa gtccagcgtg gcctgggttt tatcgagact 360catggttggg cagtggacaa
cgtagaccag atctctccac tgggttttga catcattttc 420ccgagcatga ttaaatatgc
ggaaaagctg aatctggatt tgccttttga tccgaacctg 480gtgaacatga tgctgcgcga
gcgcgagctg acgatcgagc gtgcgctgaa aaacgaattt 540gagggtaata tggctaatgt
cgagtacttc gccgagggtt tgggtgagct gtgtcactgg 600aaagaaatca tgctgcacca
acgccgtaac ggtagcctgt tcgactctcc ggcaacgacc 660gccgcggctc ttatttatca
tcagcacgat gagaagtgct tcggctatct gtctagcatc 720ctgaaattac acgagaactg
ggtgccgacc atctatccga ccaaggttca ctccaatctg 780tttttcgtcg atgcgctgca
gaacctgggt gttgaccgtt acttcaaaac cgaactgaag 840tccgtcctgg atgagatcta
ccgtttgtgg ctggagaaaa acgaagagat cttcagcgat 900attgcgcact gcgcaatggc
gtttcgcctg ttgcgcatga ataattacga ggttagcagc 960gaagaactgg aaggcttcgt
ggaccaagaa cattttttca ccacgtcggg tggcaagctg 1020atcagccacg ttgccatcct
ggaactgcac cgtgcaagcc aagtggacat tcaggagggc 1080aaagacctga tcctggacaa
aattagcacc tggactcgca actttatgga acaggaactg 1140ctggataacc agatcttgga
tcgtagcaaa aaagaaatgg aatttgcaat gcgtaagttt 1200tacggtacgt tcgatcgcgt
ggaaacccgt cgttatattg aaagctacaa aatggattcc 1260ttcaagatcc tgaaggcagc
gtaccgtagc tccaacatta acaatattga cctgttgaag 1320ttcagcgagc acgacttcaa
tctctgccag gcgcgtcaca aggaagaact gcagcaaatc 1380aaacgctggt tcgcagattg
caaactggag caagtcggta gcagccagaa ctacttgtac 1440acctcttact tcccgatcgc
ggccattttg ttcgagccgg agtatggcga cgcacgcctg 1500gcgttcgcga agtgcggtat
tatcgcgacc accgttgacg atttttttga cggttttgca 1560tgtaatgaag aactgcaaaa
catcatcgaa ctggtcgaga gatgggacgg ttatccgacg 1620gttggtttcc gctccgagcg
tgtgcgcatt ttctttctgg cgctgtacaa aatgattgaa 1680gaaattgccg cgaaagcgga
aacgaaacag ggccgttgcg tgaaagatct gttgatcaat 1740ctgtggattg atctgctgaa
atgcatgctg gtcgaactgg atctgtggaa aattaagagc 1800acgaccccga gcattgaaga
gtatctgagc attgcctgtg tgacgaccgg cgttaagtgc 1860ttgatcctga ttagcctgca
tctgctgggc ccgaaactga gcaaagacgt gaccgaatcc 1920agcgaagtta gcgctctgtg
gaactgtacg gccgtggttg cgcgcctgaa caacgacatt 1980catagctaca agcgtgagca
agccgagagc agcactaata tggccgcaat cctgatttcg 2040caaagccagc gtaccatctc
agaagaagaa gctatccgcc agatcaaaga gatgatggaa 2100tcgaaacgcc gtgagctgct
gggcatggtg ctgcagaata aagagagcca attgccgcaa 2160gtctgcaaag acctgttttg
gaccaccttc aaagccgcgt acagcattta tacccacggt 2220gatgagtacc gttttccaca
agaactgaag aaccatatca acgatgtcat ctataagccg 2280ttaaatcaat acagccctta a
2301
SEQ ID NO: 25 (755 aa, PRT, Nicotiana glutinosa)
Met Ser His Ser Thr Ala Ser Ser Leu Glu Glu Ala Lys Glu Arg
Ile1 5 10 15Arg Glu Thr
Phe Gly Lys Asn Glu Leu Ser Ser Ser Ser Tyr Asp Thr 20
25 30Ala Trp Val Ala Met Val Pro Ser Arg Tyr
Ser Met Asn Gln Pro Cys 35 40
45Phe Pro Arg Cys Leu Asp Trp Ile Leu Glu Asn Gln Arg Glu Asp Gly 50
55 60Ser Trp Gly Leu Asn Pro Ser Leu Pro
Leu Leu Val Lys Asp Ser Leu65 70 75
80Ser Ser Thr Leu Ala Cys Leu Leu Ala Leu Arg Lys Trp Arg
Ile Gly 85 90 95Asp Asn
Gln Val Gln Arg Gly Leu Gly Phe Ile Glu Thr His Gly Trp 100
105 110Ala Val Asp Asn Val Asp Gln Ile Ser
Pro Leu Gly Phe Asp Ile Ile 115 120
125Phe Pro Ser Met Ile Lys Tyr Ala Glu Lys Leu Asn Leu Asp Leu Pro
130 135 140Phe Asp Pro Asn Leu Val Asn
Met Met Leu Arg Glu Arg Glu Leu Thr145 150
155 160Ile Glu Arg Ala Leu Lys Asn Glu Phe Glu Gly Asn
Met Ala Asn Val 165 170
175Glu Tyr Phe Ala Glu Gly Leu Gly Glu Leu Cys His Trp Lys Glu Ile
180 185 190Met Leu His Gln Arg Arg
Asn Gly Ser Pro Phe Asp Ser Pro Ala Thr 195 200
205Thr Ala Ala Ala Leu Ile Tyr His Gln His Asp Glu Lys Cys
Phe Gly 210 215 220Tyr Leu Ser Ser Ile
Leu Lys Leu His Glu Asn Trp Val Pro Thr Ile225 230
235 240Tyr Pro Thr Lys Val His Ser Asn Leu Phe
Phe Val Asp Ala Leu Gln 245 250
255Asn Leu Gly Val Asp Arg Tyr Phe Lys Thr Glu Leu Lys Ser Val Leu
260 265 270Asp Glu Ile Tyr Arg
Leu Trp Leu Glu Lys Asn Glu Glu Ile Phe Ser 275
280 285Asp Ile Ala His Cys Ala Met Ala Phe Arg Leu Leu
Arg Met Asn Asn 290 295 300Tyr Glu Val
Ser Ser Glu Glu Leu Glu Gly Phe Val Asp Gln Glu His305
310 315 320Phe Phe Thr Thr Ser Gly Gly
Lys Leu Ile Ser His Val Ala Ile Leu 325
330 335Glu Leu His Arg Ala Ser Gln Val Asp Ile Gln Glu
Gly Lys Asp Leu 340 345 350Ile
Leu Asp Lys Ile Ser Thr Trp Thr Arg Asn Phe Met Glu Gln Glu 355
360 365Leu Leu Asp Asn Gln Ile Leu Asp Arg
Ser Lys Lys Glu Met Glu Phe 370 375
380Ala Met Arg Lys Phe Tyr Gly Thr Phe Asp Arg Val Glu Thr Arg Arg385
390 395 400Tyr Ile Glu Ser
Tyr Lys Met Asp Ser Phe Lys Ile Leu Lys Ala Ala 405
410 415Tyr Arg Ser Ser Asn Ile Asn Asn Ile Asp
Leu Leu Lys Phe Ser Glu 420 425
430His Asp Phe Asn Leu Cys Gln Ala Arg His Lys Glu Glu Leu Gln Gln
435 440 445Ile Lys Arg Trp Phe Ala Asp
Cys Lys Leu Glu Gln Val Gly Ser Ser 450 455
460Gln Asn Tyr Leu Tyr Thr Ser Tyr Phe Pro Ile Ala Ala Ile Leu
Phe465 470 475 480Glu Pro
Glu Tyr Gly Asp Ala Arg Leu Ala Phe Ala Lys Cys Gly Ile
485 490 495Ile Ala Thr Thr Val Asp Asp
Phe Phe Asp Gly Phe Ala Cys Asn Glu 500 505
510Glu Leu Gln Asn Ile Ile Glu Leu Val Glu Arg Trp Asp Gly
Tyr Pro 515 520 525Thr Val Gly Phe
Arg Ser Glu Arg Val Arg Ile Phe Phe Leu Ala Leu 530
535 540Tyr Lys Met Ile Glu Glu Ile Ala Ala Lys Ala Glu
Thr Lys Gln Gly545 550 555
560Arg Cys Val Lys Asp Leu Leu Ile Asn Leu Trp Ile Asp Leu Leu Lys
565 570 575Cys Met Leu Val Glu
Leu Asp Leu Trp Lys Ile Lys Ser Thr Thr Pro 580
585 590Ser Ile Glu Glu Tyr Leu Ser Ile Ala Cys Val Thr
Thr Gly Val Lys 595 600 605Cys Leu
Ile Leu Ile Ser Leu His Leu Leu Gly Pro Lys Leu Ser Lys 610
615 620Asp Val Thr Glu Ser Ser Glu Val Ser Ala Leu
Trp Asn Cys Thr Ala625 630 635
640Val Val Ala Arg Leu Asn Asn Asp Ile His Ser Tyr Lys Arg Glu Gln
645 650 655Ala Glu Ser Ser
Thr Asn Met Val Ala Ile Leu Ile Ser Gln Ser Gln 660
665 670Arg Thr Ile Ser Glu Glu Glu Ala Ile Arg Gln
Ile Lys Glu Met Met 675 680 685Glu
Ser Lys Arg Arg Glu Leu Leu Gly Met Val Leu Gln Asn Lys Glu 690
695 700Ser Gln Leu Pro Gln Val Cys Lys Asp Leu
Phe Trp Thr Thr Phe Lys705 710 715
720Ala Ala Tyr Ser Ile Tyr Thr His Gly Asp Glu Tyr Arg Phe Pro
Gln 725 730 735Glu Leu Lys
Asn His Ile Asn Asp Val Ile Tyr Lys Pro Leu Asn Gln 740
745 750Tyr Ser Pro 755
SEQ ID NO: 26 (2211 nt, DNA, Artificial Sequence): Optimized cDNA for Saccharomyces cerevisiae expression encoding for SmCPS2
atggctactg ttgacgctcc acaagttcac gaccacgacg
gtactactgt tcaccaaggt 60cacgacgctg ttaagaacat cgaagaccca atcgaataca
tcagaacttt gttgagaact 120actggtgacg gtagaatctc tgtttctcca tacgacactg
cttgggttgc tatgatcaag 180gacgttgaag gtagagacgg tccacaattc ccatcttctt
tggaatggat cgttcaaaac 240caattggaag acggttcttg gggtgaccaa aagttgttct
gtgtttacga cagattggtt 300aacactatcg cttgtgttgt tgctttgaga tcttggaacg
ttcacgctca caaggttaag 360agaggtgtta cttacatcaa ggaaaacgtt gacaagttga
tggaaggtaa cgaagaacac 420atgacttgtg gtttcgaagt tgttttccca gctttgttgc
aaaaggctaa gtctttgggt 480atcgaagact tgccatacga ctctccagct gttcaagaag
tttaccacgt tagagaacaa 540aagttgaaga gaatcccatt ggaaatcatg cacaagatcc
caacttcttt gttgttctct 600ttggaaggtt tggaaaactt ggactgggac aagttgttga
agttgcaatc tgctgacggt 660tctttcttga cttctccatc ttctactgct ttcgctttca
tgcaaactaa ggacgaaaag 720tgttaccaat tcatcaagaa cactatcgac actttcaacg
gtggtgctcc acacacttac 780ccagttgacg ttttcggtag attgtgggct atcgacagat
tgcaaagatt gggtatctct 840agattcttcg aaccagaaat cgctgactgt ttgtctcaca
tccacaagtt ctggactgac 900aagggtgttt tctctggtag agaatctgaa ttctgtgaca
tcgacgacac ttctatgggt 960atgagattga tgagaatgca cggttacgac gttgacccaa
acgttttgag aaacttcaag 1020caaaaggacg gtaagttctc ttgttacggt ggtcaaatga
tcgaatctcc atctccaatc 1080tacaacttgt acagagcttc tcaattgaga ttcccaggtg
aagaaatctt ggaagacgct 1140aagagattcg cttacgactt cttgaaggaa aagttggcta
acaaccaaat cttggacaag 1200tgggttatct ctaagcactt gccagacgaa atcaagttgg
gtttggaaat gccatggttg 1260gctactttgc caagagttga agctaagtac tacatccaat
actacgctgg ttctggtgac 1320gtttggatcg gtaagacttt gtacagaatg ccagaaatct
ctaacgacac ttaccacgac 1380ttggctaaga ctgacttcaa gagatgtcaa gctaagcacc
aattcgaatg gttgtacatg 1440caagaatggt acgaatcttg tggtatcgaa gaattcggta
tctctagaaa ggacttgttg 1500ttgtcttact tcttggctac tgcttctatc ttcgaattgg
aaagaactaa cgaaagaatc 1560gcttgggcta agtctcaaat catcgctaag atgatcactt
ctttcttcaa caaggaaact 1620acttctgaag aagacaagag agctttgttg aacgaattgg
gtaacatcaa cggtttgaac 1680gacactaacg gtgctggtag agaaggtggt gctggttcta
tcgctttggc tactttgact 1740caattcttgg aaggtttcga cagatacact agacaccaat
tgaagaacgc ttggtctgtt 1800tggttgactc aattgcaaca cggtgaagct gacgacgctg
aattgttgac taacactttg 1860aacatctgtg ctggtcacat cgctttcaga gaagaaatct
tggctcacaa cgaatacaag 1920gctttgtcta acttgacttc taagatctgt agacaattgt
ctttcatcca atctgaaaag 1980gaaatgggtg ttgaaggtga aatcgctgct aagtcttcta
tcaagaacaa ggaattggaa 2040gaagacatgc aaatgttggt taagttggtt ttggaaaagt
acggtggtat cgacagaaac 2100atcaagaagg ctttcttggc tgttgctaag acttactact
acagagctta ccacgctgct 2160gacactatcg acactcacat gttcaaggtt ttgttcgaac
cagttgctta a 2211
SEQ ID NO: 27 (1578 nt, DNA, Artificial Sequence): Optimized cDNA for S. cerevisiae expression encoding for truncated SsScS from Salvia sclarea
atggctaaga tgaaggaaaa cttcaagaga gaagacgaca agttcccaac
tactactact 60ttgagatctg aagacatccc atctaacttg tgtatcatcg acactttgca
aagattgggt 120gttgaccaat tcttccaata cgaaatcaac actatcttgg acaacacttt
cagattgtgg 180caagaaaagc acaaggttat ctacggtaac gttactactc acgctatggc
tttcagattg 240ttgagagtta agggttacga agtttcttct gaagaattgg ctccatacgg
taaccaagaa 300gctgtttctc aacaaactaa cgacttgcca atgatcatcg aattgtacag
agctgctaac 360gaaagaatct acgaagaaga aagatctttg gaaaagatct tggcttggac
tactatcttc 420ttgaacaagc aagttcaaga caactctatc ccagacaaga agttgcacaa
gttggttgaa 480ttctacttga gaaactacaa gggtatcact atcagattgg gtgctagaag
aaacttggaa 540ttgtacgaca tgacttacta ccaagctttg aagtctacta acagattctc
taacttgtgt 600aacgaagact tcttggtttt cgctaagcaa gacttcgaca tccacgaagc
tcaaaaccaa 660aagggtttgc aacaattgca aagatggtac gctgactgta gattggacac
tttgaacttc 720ggtagagacg ttgttatcat cgctaactac ttggcttctt tgatcatcgg
tgaccacgct 780ttcgactacg ttagattggc tttcgctaag acttctgttt tggttactat
catggacgac 840ttcttcgact gtcacggttc ttctcaagaa tgtgacaaga tcatcgaatt
ggttaaggaa 900tggaaggaaa acccagacgc tgaatacggt tctgaagaat tggaaatctt
gttcatggct 960ttgtacaaca ctgttaacga attggctgaa agagctagag ttgaacaagg
tagatctgtt 1020aaggaattct tggttaagtt gtgggttgaa atcttgtctg ctttcaagat
cgaattggac 1080acttggtcta acggtactca acaatctttc gacgaataca tctcttcttc
ttggttgtct 1140aacggttcta gattgactgg tttgttgact atgcaattcg ttggtgttaa
gttgtctgac 1200gaaatgttga tgtctgaaga atgtactgac ttggctagac acgtttgtat
ggttggtaga 1260ttgttgaacg acgtttgttc ttctgaaaga gaaagagaag aaaacatcgc
tggtaagtct 1320tactctatct tgttggctac tgaaaaggac ggtagaaagg tttctgaaga
cgaagctatc 1380gctgaaatca acgaaatggt tgaataccac tggagaaagg ttttgcaaat
cgtttacaag 1440aaggaatcta tcttgccaag aagatgtaag gacgttttct tggaaatggc
taagggtact 1500ttctacgctt acggtatcaa cgacgaattg acttctccac aacaatctaa
ggaagacatg 1560aagtctttcg ttttctaa
1578
SEQ ID NO: 28 (924 nt, DNA, Artificial Sequence): Optimized cDNA for S. cerevisiae expression encoding for the GGPP synthase from Pantoea agglomerans
atggtttctg gttctaaggc tggtgtttct ccacacagag aaatcgaagt
tatgagacaa 60tctatcgacg accacttggc tggtttgttg ccagaaactg actctcaaga
catcgtttct 120ttggctatga gagaaggtgt tatggctcca ggtaagagaa tcagaccatt
gttgatgttg 180ttggctgcta gagacttgag ataccaaggt tctatgccaa ctttgttgga
cttggcttgt 240gctgttgaat tgactcacac tgcttctttg atgttggacg acatgccatg
tatggacaac 300gctgaattga gaagaggtca accaactact cacaagaagt tcggtgaatc
tgttgctatc 360ttggcttctg ttggtttgtt gtctaaggct ttcggtttga tcgctgctac
tggtgacttg 420ccaggtgaaa gaagagctca agctgttaac gaattgtcta ctgctgttgg
tgttcaaggt 480ttggttttgg gtcaattcag agacttgaac gacgctgctt tggacagaac
tccagacgct 540atcttgtcta ctaaccactt gaagactggt atcttgttct ctgctatgtt
gcaaatcgtt 600gctatcgctt ctgcttcttc tccatctact agagaaactt tgcacgcttt
cgctttggac 660ttcggtcaag ctttccaatt gttggacgac ttgagagacg accacccaga
aactggtaag 720gacagaaaca aggacgctgg taagtctact ttggttaaca gattgggtgc
tgacgctgct 780agacaaaagt tgagagaaca catcgactct gctgacaagc acttgacttt
cgcttgtcca 840caaggtggtg ctatcagaca attcatgcac ttgtggttcg gtcaccactt
ggctgactgg 900tctccagtta tgaagatcgc ttaa
924
SEQ ID NO: 29 (2175 nt, DNA, Artificial Sequence): Optimized cDNA for S. cerevisiae expression encoding for CfCPS1-del63
atggttgcta
ctgttaacgc tccaccagtt cacgaccaag acgactctac tgaaaaccaa 60tgtcacgacg
ctgttaacaa catcgaagac ccaatcgaat acatcagaac tttgttgaga 120actactggtg
acggtagaat ctctgtttct ccatacgaca ctgcttgggt tgctttgatc 180aaggacttgc
aaggtagaga cgctccagaa ttcccatctt ctttggaatg gatcatccaa 240aaccaattgg
ctgacggttc ttggggtgac gctaagttct tctgtgttta cgacagattg 300gttaacacta
tcgcttgtgt tgttgctttg agatcttggg acgttcacgc tgaaaaggtt 360gaaagaggtg
ttagatacat caacgaaaac gttgaaaagt tgagagacgg taacgaagaa 420cacatgactt
gtggtttcga agttgttttc ccagctttgt tgcaaagagc taagtctttg 480ggtatccaag
acttgccata cgacgctcca gttatccaag aaatctacca ctctagagaa 540caaaagtcta
agagaatccc attggaaatg atgcacaagg ttccaacttc tttgttgttc 600tctttggaag
gtttggaaaa cttggaatgg gacaagttgt tgaagttgca atctgctgac 660ggttctttct
tgacttctcc atcttctact gctttcgctt tcatgcaaac tagagaccca 720aagtgttacc
aattcatcaa gaacactatc caaactttca acggtggtgc tccacacact 780tacccagttg
acgttttcgg tagattgtgg gctatcgaca gattgcaaag attgggtatc 840tctagattct
tcgaatctga aatcgctgac tgtatcgctc acatccacag attctggact 900gaaaagggtg
ttttctctgg tagagaatct gaattctgtg acatcgacga cacttctatg 960ggtgttagat
tgatgagaat gcacggttac gacgttgacc caaacgtttt gaagaacttc 1020aagaaggacg
acaagttctc ttgttacggt ggtcaaatga tcgaatctcc atctccaatc 1080tacaacttgt
acagagcttc tcaattgaga ttcccaggtg aacaaatctt ggaagacgct 1140aacaagttcg
cttacgactt cttgcaagaa aagttggctc acaaccaaat cttggacaag 1200tgggttatct
ctaagcactt gccagacgaa atcaagttgg gtttggaaat gccatggtac 1260gctactttgc
caagagttga agctagatac tacatccaat actacgctgg ttctggtgac 1320gtttggatcg
gtaagacttt gtacagaatg ccagaaatct ctaacgacac ttaccacgaa 1380ttggctaaga
ctgacttcaa gagatgtcaa gctcaacacc aattcgaatg gatctacatg 1440caagaatggt
acgaatcttg taacatggaa gaattcggta tctctagaaa ggaattgttg 1500gttgcttact
tcttggctac tgcttctatc ttcgaattgg aaagagctaa cgaaagaatc 1560gcttgggcta
agtctcaaat catctctact atcatcgctt ctttcttcaa caaccaaaac 1620acttctccag
aagacaagtt ggctttcttg actgacttca agaacggtaa ctctactaac 1680atggctttgg
ttactttgac tcaattcttg gaaggtttcg acagatacac ttctcaccaa 1740ttgaagaacg
cttggtctgt ttggttgaga aagttgcaac aaggtgaagg taacggtggt 1800gctgacgctg
aattgttggt taacactttg aacatctgtg ctggtcacat cgctttcaga 1860gaagaaatct
tggctcacaa cgactacaag actttgtcta acttgacttc taagatctgt 1920agacaattgt
ctcaaatcca aaacgaaaag gaattggaaa ctgaaggtca aaagacttct 1980atcaagaaca
aggaattgga agaagacatg caaagattgg ttaagttggt tttggaaaag 2040tctagagttg
gtatcaacag agacatgaag aagactttct tggctgttgt taagacttac 2100tactacaagg
cttaccactc tgctcaagct atcgacaacc acatgttcaa ggttttgttc 2160gaaccagttg
cttaa
2175
SEQ ID NO: 30 (2100 nt, DNA, Artificial Sequence): Optimized cDNA for S. cerevisiae expression encoding for TaTps1-del59
atgtacagac aaagaactga
cgaaccatct gaaactagac aaatgatcga cgacatcaga 60actgctttgg cttctttggg
tgacgacgaa acttctatgt ctgtttctgc ttacgacact 120gctttggttg ctttggttaa
gaacttggac ggtggtgacg gtccacaatt cccatcttgt 180atcgactgga tcgttcaaaa
ccaattgcca gacggttctt ggggtgaccc agctttcttc 240atggttcaag acagaatgat
ctctactttg gcttgtgttg ttgctgttaa gtcttggaac 300atcgacagag acaacttgtg
tgacagaggt gttttgttca tcaaggaaaa catgtctaga 360ttggttgaag aagaacaaga
ctggatgcca tgtggtttcg aaatcaactt cccagctttg 420ttggaaaagg ctaaggactt
ggacttggac atcccatacg accacccagt tttggaagaa 480atctacgcta agagaaactt
gaagttgttg aagatcccat tggacgtttt gcacgctatc 540ccaactactt tgttgttctc
tgttgaaggt atggttgact tgccattgga ctgggaaaag 600ttgttgagat tgagatgtcc
agacggttct ttccactctt ctccagctgc tactgctgct 660gctttgtctc acactggtga
caaggaatgt cacgctttct tggacagatt gatccaaaag 720ttcgaaggtg gtgttccatg
ttctcactct atggacactt tcgaacaatt gtgggttgtt 780gacagattga tgagattggg
tatctctaga cacttcactt ctgaaatcca acaatgtttg 840gaattcatct acagaagatg
gactcaaaag ggtttggctc acaacatgca ctgtccaatc 900ccagacatcg acgacactgc
tatgggtttc agattgttga gacaacacgg ttacgacgtt 960actccatctg ttttcaagca
cttcgaaaag gacggtaagt tcgtttgttt cccaatggaa 1020actaaccacg cttctgttac
tccaatgcac aacacttaca gagcttctca attcatgttc 1080ccaggtgacg acgacgtttt
ggctagagct ggtagatact gtagagcttt cttgcaagaa 1140agacaatctt ctaacaagtt
gtacgacaag tggatcatca ctaaggactt gccaggtgaa 1200gttggttaca ctttgaactt
cccatggaag tcttctttgc caagaatcga aactagaatg 1260tacttggacc aatacggtgg
taacaacgac gtttggatcg ctaaggtttt gtacagaatg 1320aacttggttt ctaacgactt
gtacttgaag atggctaagg ctgacttcac tgaataccaa 1380agattgtcta gaatcgaatg
gaacggtttg agaaagtggt acttcagaaa ccacttgcaa 1440agatacggtg ctactccaaa
gtctgctttg aaggcttact tcttggcttc tgctaacatc 1500ttcgaaccag gtagagctgc
tgaaagattg gcttgggcta gaatggctgt tttggctgaa 1560gctgttacta ctcacttcag
acacatcggt ggtccatgtt actctactga aaacttggaa 1620gaattgatcg acttggtttc
tttcgacgac gtttctggtg gtttgagaga agcttggaag 1680caatggttga tggcttggac
tgctaaggaa tctcacggtt ctgttgacgg tgacactgct 1740ttgttgttcg ttagaactat
cgaaatctgt tctggtagaa tcgtttcttc tgaacaaaag 1800ttgaacttgt gggactactc
tcaattggaa caattgactt cttctatctg tcacaagttg 1860gctactatcg gtttgtctca
aaacgaagct tctatggaaa acactgaaga cttgcaccaa 1920caagttgact tggaaatgca
agaattgtct tggagagttc accaaggttg tcacggtatc 1980aacagagaaa ctagacaaac
tttcttgaac gttgttaagt ctttctacta ctctgctcac 2040tgttctccag aaactgttga
ctctcacatc gctaaggtta tcttccaaga cgttatctaa 2100
SEQ ID NO: 31 (2172 nt, DNA, Artificial Sequence): Optimized cDNA for S. cerevisiae expression encoding for MvCps3-del63
atggctccac cagaacaaaa gtacaactct actgctttgg aacacgacac
tgaaatcatc 60gaaatcgaag accacatcga atgtatcaga agattgttga gaactgctgg
tgacggtaga 120atctctgttt ctccatacga cactgcttgg atcgctttga tcaaggactt
ggacggtcac 180gactctccac aattcccatc ttctatggaa tgggttgctg acaaccaatt
gccagacggt 240tcttggggtg acgaacactt cgtttgtgtt tacgacagat tggttaacac
tatcgcttgt 300gttgttgctt tgagatcttg gaacgttcac gctcacaagt gtgaaaaggg
tatcaagtac 360atcaaggaaa acgttcacaa gttggaagac gctaacgaag aacacatgac
ttgtggtttc 420gaagttgttt tcccagcttt gttgcaaaga gctcaatcta tgggtatcaa
gggtatccca 480tacaacgctc cagttatcga agaaatctac aactctagag aaaagaagtt
gaagagaatc 540ccaatggaag ttgttcacaa ggttgctact tctttgttgt tctctttgga
aggtttggaa 600aacttggaat gggaaaagtt gttgaagttg caatctccag acggttcttt
cttgacttct 660ccatcttcta ctgctttcgc tttcatccac actaaggaca gaaagtgttt
caacttcatc 720aacaacatcg ttcacacttt caagggtggt gctccacaca cttacccagt
tgacatcttc 780ggtagattgt gggctgttga cagattgcaa agattgggta tctctagatt
cttcgaatct 840gaaatcgctg aattcttgtc tcacgttcac agattctggt ctgacgaagc
tggtgttttc 900tctggtagag aatctgtttt ctgtgacatc gacgacactt ctatgggttt
gagattgttg 960agaatgcacg gttaccacgt tgacccaaac gttttgaaga acttcaagca
atctgacaag 1020ttctcttgtt acggtggtca aatgatggaa tgttcttctc caatctacaa
cttgtacaga 1080gcttctcaat tgcaattccc aggtgaagaa atcttggaag aagctaacaa
gttcgcttac 1140aagttcttgc aagaaaagtt ggaatctaac caaatcttgg acaagtggtt
gatctctaac 1200cacttgtctg acgaaatcaa ggttggtttg gaaatgccat ggtacgctac
tttgccaaga 1260gttgaaactt cttactacat ccaccactac ggtggtggtg acgacgtttg
gatcggtaag 1320actttgtaca gaatgccaga aatctctaac gacacttaca gagaattggc
tagattggac 1380ttcagaagat gtcaagctca acaccaattg gaatggatct acatgcaaag
atggtacgaa 1440tcttgtagaa tgcaagaatt cggtatctct agaaaggaag ttttgagagc
ttacttcttg 1500gcttctggta ctatcttcga agttgaaaga gctaaggaaa gagttgcttg
ggctagatct 1560caaatcatct ctcacatgat caagtctttc ttcaacaagg aaactacttc
ttctgaccaa 1620aagcaagctt tgttgactga attgttgttc ggtaacatct ctgcttctga
aactgaaaag 1680agagaattgg acggtgttgt tgttgctact ttgagacaat tcttggaagg
tttcgacatc 1740ggtactagac accaagttaa ggctgcttgg gacgtttggt tgagaaaggt
tgaacaaggt 1800gaagctcacg gtggtgctga cgctgaattg tgtactacta ctttgaacac
ttgtgctaac 1860caacacttgt cttctcaccc agactacaac actttgtcta agttgactaa
caagatctgt 1920cacaagttgt ctcaaatcca acaccaaaag gaaatgaagg gtggtatcaa
ggctaagtgt 1980tctatcaaca acaaggaagt tgacatcgaa atgcaatggt tggttaagtt
ggttttggaa 2040aagtctggtt tgaacagaaa ggctaagcaa gctttcttgt ctatcgctaa
gacttactac 2100tacagagctt actacgctga ccaaactatg gacgctcaca tcttcaaggt
tttgttcgaa 2160ccagttgttt aa
2172
SEQ ID NO: 32 (2202 nt, DNA, Artificial Sequence): Optimized cDNA for S. cerevisiae expression encoding for RoCPS1-del67
atggcttctc
aagtttctga aaagggtact tcttctccag ttcaaactcc agaagaagtt 60aacgaaaaga
tcgaaaacta catcgaatac atcaagaact tgttgactac ttctggtgac 120ggtagaatct
ctgtttctcc atacgacact tctatcgttg ctttgatcaa ggacttgaag 180ggtagagaca
ctccacaatt cccatcttgt ttggaatgga tcgctcaaca ccaaatggct 240gacggttctt
ggggtgacga attcttctgt atctacgaca gaatcttgaa cactttggct 300tgtgttgttg
ctttgaagtc ttggaacgtt cacgctgaca tgatcgaaaa gggtgttact 360tacgttaacg
aaaacgttca aaagttggaa gacggtaact tggaacacat gacttctggt 420ttcgaaatcg
ttgttccagc tttggttcaa agagctcaag acttgggtat ccaaggtttg 480ccatacgacc
acccattgat caaggaaatc gctaacacta aggaaggtag attgaagaag 540atcccaaagg
acatgatcta ccaaaagcca actactttgt tgttctcttt ggaaggtttg 600ggtgacttgg
aatgggaaaa gatcttgaag ttgcaatctg gtgacggttc tttcttgact 660tctccatctt
ctactgctca cgttttcatg aagactaagg acgaaaagtg tttgaagttc 720atcgaaaacg
ctgttaagaa ctgtaacggt ggtgctccac acacttaccc agttgacgtt 780ttcgctagat
tgtgggctgt tgacagattg caaagattgg gtatctctag attcttccaa 840caagaaatca
agtacttctt ggaccacatc aactctgttt ggactgaaaa cggtgttttc 900tctggtagag
actctgaatt ctgtgacatc gacgacactt ctatgggtat cagattgttg 960aagatgcacg
gttacgacat cgacccaaac gctttggaac acttcaagca acaagacggt 1020aagttctctt
gttacggtgg tcaaatgatc gaatctgctt ctccaatcta caacttgtac 1080agagctgctc
aattgagatt cccaggtgaa gaaatcttgg aagaagctac taagttcgct 1140tacaacttct
tgcaagaaaa gatcgctaac gaccaattcc aagaaaagtg ggttatctct 1200gaccacttga
tcgacgaagt taagttgggt ttgaagatgc catggtacgc tactttgcca 1260agagttgaag
ctgcttacta cttgcaatac tacgctggtt gtggtgacgt ttggatcggt 1320aaggttttct
acagaatgcc agaaatctct aacgacactt acaagaagtt ggctatcttg 1380gacttcaaca
gatgtcaagc tcaacaccaa ttcgaatgga tctacatgca agaatggtac 1440cacagatctt
ctgtttctga attcggtatc tctaagaagg acttgttgag agcttacttc 1500ttggctgctg
ctactatctt cgaaccagaa agaactcaag aaagattggt ttgggctaag 1560actcaaatcg
tttctggtat gatcacttct ttcgttaact ctggtactac tttgtctttg 1620caccaaaaga
ctgctttgtt gtctcaaatc ggtcacaact tcgacggttt ggacgaaatc 1680atctctgcta
tgaaggacca cggtttggct gctactttgt tgactacttt ccaacaattg 1740ttggacggtt
tcgacagata cactagacac caattgaaga acgcttggtc tcaatggttc 1800atgaagttgc
aacaaggtga agcttctggt ggtgaagacg ctgaattgtt ggctaacact 1860ttgaacatct
gtgctggttt gatcgctttc aacgaagacg ttttgtctca ccacgaatac 1920actactttgt
ctactttgac taacaagatc tgtaagagat tgactcaaat ccaagacaag 1980aagactttgg
aagttgttga cggttctatc aaggacaagg aattggaaaa ggacatccaa 2040atgttggtta
agttggtttt ggaagaaaac ggtggtggtg ttgacagaaa catcaagcac 2100actttcttgt
ctgttttcaa gactttctac tacaacgctt accacgacga cgaaactact 2160gacgttcaca
tcttcaaggt tttgttcggt ccagttgttt aa
2202
SEQ ID NO: 33 (2301 nt, DNA, Artificial Sequence): Optimized cDNA for S. cerevisiae expression encoding for NgSCS-del29
atggctaact tccacagacc
atctagagtt agatgttctc actctactgc ttcttctttg 60gaagaagcta aggaaagaat
cagagaaact ttcggtaaga acgaattgtc tccatcttct 120tacgacactg cttgggttgc
tatggttcca tctagatact ctatgaacca accatgtttc 180ccaagatgtt tggactggat
cttggaaaac caaagagaag acggttcttg gggtttgaac 240ccatctcacc cattgttggt
taaggactct ttgtcttcta ctttggcttg tttgttggct 300ttgagaaagt ggagaatcgg
tgacaaccaa gttcaaagag gtttgggttt catcgaaact 360cacggttggg ctgttgacaa
cgttgaccaa atctctccat tgggtttcga catcatcttc 420ccatctatga tcaagtacgc
tgaaaagttg aacttggact tgccattcga cccaaacttg 480gttaacatga tgttgagaga
aagagaattg actatcgaaa gagctttgaa gaacgaattc 540gaaggtaaca tggctaacgt
tgaatacttc gctgaaggtt tgggtgaatt gtgtcactgg 600aaggaaatca tgttgcacca
aagaagaaac ggttctttgt tcgactctcc agctactact 660gctgctgctt tgatctacca
ccaacacgac gaaaagtgtt tcggttactt gtcttctatc 720ttgaagttgc acgaaaactg
ggttccaact atctacccaa ctaaggttca ctctaacttg 780ttcttcgttg acgctttgca
aaacttgggt gttgacagat acttcaagac tgaattgaag 840tctgttttgg acgaaatcta
cagattgtgg ttggaaaaga acgaagaaat cttctctgac 900atcgctcact gtgctatggc
tttcagattg ttgagaatga acaactacga agtttcttct 960gaagaattgg aaggtttcgt
tgaccaagaa cacttcttca ctacttctgg tggtaagttg 1020atctctcacg ttgctatctt
ggaattgcac agagcttctc aagttgacat ccaagaaggt 1080aaggacttga tcttggacaa
gatctctact tggactagaa acttcatgga acaagaattg 1140ttggacaacc aaatcttgga
cagatctaag aaggaaatgg aattcgctat gagaaagttc 1200tacggtactt tcgacagagt
tgaaactaga agatacatcg aatcttacaa gatggactct 1260ttcaagatct tgaaggctgc
ttacagatct tctaacatca acaacatcga cttgttgaag 1320ttctctgaac acgacttcaa
cttgtgtcaa gctagacaca aggaagaatt gcaacaaatc 1380aagagatggt tcgctgactg
taagttggaa caagttggtt cttctcaaaa ctacttgtac 1440acttcttact tcccaatcgc
tgctatcttg ttcgaaccag aatacggtga cgctagattg 1500gctttcgcta agtgtggtat
catcgctact actgttgacg acttcttcga cggtttcgct 1560tgtaacgaag aattgcaaaa
catcatcgaa ttggttgaaa gatgggacgg ttacccaact 1620gttggtttca gatctgaaag
agttagaatc ttcttcttgg ctttgtacaa gatgatcgaa 1680gaaatcgctg ctaaggctga
aactaagcaa ggtagatgtg ttaaggactt gttgatcaac 1740ttgtggatcg acttgttgaa
gtgtatgttg gttgaattgg acttgtggaa gatcaagtct 1800actactccat ctatcgaaga
atacttgtct atcgcttgtg ttactactgg tgttaagtgt 1860ttgatcttga tctctttgca
cttgttgggt ccaaagttgt ctaaggacgt tactgaatct 1920tctgaagttt ctgctttgtg
gaactgtact gctgttgttg ctagattgaa caacgacatc 1980cactcttaca agagagaaca
agctgaatct tctactaaca tggctgctat cttgatctct 2040caatctcaaa gaactatctc
tgaagaagaa gctatcagac aaatcaagga aatgatggaa 2100tctaagagaa gagaattgtt
gggtatggtt ttgcaaaaca aggaatctca attgccacaa 2160gtttgtaagg acttgttctg
gactactttc aaggctgctt actctatcta cactcacggt 2220gacgaataca gattcccaca
agaattgaag aaccacatca acgacgttat ctacaagcca 2280ttgaaccaat actctccata a
2301
SEQ ID NO: 34 (2268 nt, DNA, Artificial Sequence): Optimized cDNA for S. cerevisiae expression encoding for NgSCS-del38
atgtctcact ctactgcttc ttctttggaa gaagctaagg aaagaatcag
agaaactttc 60ggtaagaacg aattgtcttc ttcttcttac gacactgctt gggttgctat
ggttccatct 120agatactcta tgaaccaacc atgtttccca agatgtttgg actggatctt
ggaaaaccaa 180agagaagacg gttcttgggg tttgaaccca tctttgccat tgttggttaa
ggactctttg 240tcttctactt tggcttgttt gttggctttg agaaagtgga gaatcggtga
caaccaagtt 300caaagaggtt tgggtttcat cgaaactcac ggttgggctg ttgacaacgt
tgaccaaatc 360tctccattgg gtttcgacat catcttccca tctatgatca agtacgctga
aaagttgaac 420ttggacttgc cattcgaccc aaacttggtt aacatgatgt tgagagaaag
agaattgact 480atcgaaagag ctttgaagaa cgaattcgaa ggtaacatgg ctaacgttga
atacttcgct 540gaaggtttgg gtgaattgtg tcactggaag gaaatcatgt tgcaccaaag
aagaaacggt 600tctccattcg actctccagc tactactgct gctgctttga tctaccacca
acacgacgaa 660aagtgtttcg gttacttgtc ttctatcttg aagttgcacg aaaactgggt
tccaactatc 720tacccaacta aggttcactc taacttgttc ttcgttgacg ctttgcaaaa
cttgggtgtt 780gacagatact tcaagactga attgaagtct gttttggacg aaatctacag
attgtggttg 840gaaaagaacg aagaaatctt ctctgacatc gctcactgtg ctatggcttt
cagattgttg 900agaatgaaca actacgaagt ttcttctgaa gaattggaag gtttcgttga
ccaagaacac 960ttcttcacta cttctggtgg taagttgatc tctcacgttg ctatcttgga
attgcacaga 1020gcttctcaag ttgacatcca agaaggtaag gacttgatct tggacaagat
ctctacttgg 1080actagaaact tcatggaaca agaattgttg gacaaccaaa tcttggacag
atctaagaag 1140gaaatggaat tcgctatgag aaagttctac ggtactttcg acagagttga
aactagaaga 1200tacatcgaat cttacaagat ggactctttc aagatcttga aggctgctta
cagatcttct 1260aacatcaaca acatcgactt gttgaagttc tctgaacacg acttcaactt
gtgtcaagct 1320agacacaagg aagaattgca acaaatcaag agatggttcg ctgactgtaa
gttggaacaa 1380gttggttctt ctcaaaacta cttgtacact tcttacttcc caatcgctgc
tatcttgttc 1440gaaccagaat acggtgacgc tagattggct ttcgctaagt gtggtatcat
cgctactact 1500gttgacgact tcttcgacgg tttcgcttgt aacgaagaat tgcaaaacat
catcgaattg 1560gttgaaagat gggacggtta cccaactgtt ggtttcagat ctgaaagagt
tagaatcttc 1620ttcttggctt tgtacaagat gatcgaagaa atcgctgcta aggctgaaac
taagcaaggt 1680agatgtgtta aggacttgtt gatcaacttg tggatcgact tgttgaagtg
tatgttggtt 1740gaattggact tgtggaagat caagtctact actccatcta tcgaagaata
cttgtctatc 1800gcttgtgtta ctactggtgt taagtgtttg atcttgatct ctttgcactt
gttgggtcca 1860aagttgtcta aggacgttac tgaatcttct gaagtttctg ctttgtggaa
ctgtactgct 1920gttgttgcta gattgaacaa cgacatccac tcttacaaga gagaacaagc
tgaatcttct 1980actaacatgg ttgctatctt gatctctcaa tctcaaagaa ctatctctga
agaagaagct 2040atcagacaaa tcaaggaaat gatggaatct aagagaagag aattgttggg
tatggttttg 2100caaaacaagg aatctcaatt gccacaagtt tgtaaggact tgttctggac
tactttcaag 2160gctgcttact ctatctacac tcacggtgac gaatacagat tcccacaaga
attgaagaac 2220cacatcaacg acgttatcta caagccattg aaccaatact ctccataa
2268
SEQ ID NO: 35 (85 nt, DNA, Artificial Sequence): Primer Sequence
aggtgcagtt cgcgtgcaat tataacgtcg tggcaactgt tatcagtcgt accgcgccat 60
tgagagtgca ccataccaca gcttt 85
SEQ ID NO: 36 (85 nt, DNA, Artificial Sequence): Primer Sequence
tcgtggtcaa ggcgtgcaat tctcaacacg agagtgattc ttcggcgttg ttgctgacca 60
gcggtatttc acaccgcata gggta 85
SEQ ID NO: 37 (85 nt, DNA, Artificial Sequence): Primer Sequence
tggtcagcaa caacgccgaa gaatcactct cgtgttgaga attgcacgcc ttgaccacga 60
cacgttaagg gattttggtc atgag 85
SEQ ID NO: 38 (80 nt, DNA, Artificial Sequence): Primer Sequence
aacgcgtacc ctaagtacgg caccacagtg actatgcagt ccgcactttg ccaatgccaa 60
aaatgtgcgc ggaaccccta 80
SEQ ID NO: 39 (84 nt, DNA, Artificial Sequence): Primer Sequence
ttggcattgg caaagtgcgg actgcatagt cactgtggtg ccgtacttag ggtacgcgtt 60
cctgaacgaa gcatctgtgc ttca 84
SEQ ID NO: 40 (85 nt, DNA, Artificial Sequence): Primer Sequence
ccgagatgcc aaaggatagg tgctatgttg atgactacga cacagaactg cgggtgacat 60
aatgatagca ttgaaggatg agact 85
SEQ ID NO: 41 (82 nt, DNA, Artificial Sequence): Primer Sequence
atgtcacccg cagttctgtg tcgtagtcat caacatagca cctatccttt ggcatctcgg 60
tgagcaaaag gccagcaaaa gg 82
SEQ ID NO: 42 (81 nt, DNA, Artificial Sequence): Primer Sequence
ctcagatgta cggtgatcgc caccatgtga cggaagctat cctgacagtg tagcaagtgc 60
tgagcgtcag accccgtaga a 81
SEQ ID NO: 43 (60 nt, DNA, Artificial Sequence): Primer Sequence
attcctagtg acggccttgg gaactcgata cacgatgttc agtagaccgc tcacacatgg 60
SEQ ID NO: 44 (79 nt, DNA, Artificial Sequence): Primer Sequence
aggtgcagtt cgcgtgcaat tataacgtcg tggcaactgt tatcagtcgt accgcgccat 60
tcgactacgt cgtaaggcc 79
SEQ ID NO: 45 (80 nt, DNA, Artificial Sequence): Primer Sequence
tcgtggtcaa ggcgtgcaat tctcaacacg agagtgattc ttcggcgttg ttgctgacca 60
tcgacggtcg aggagaactt 80