Patent application title: METHODS FOR CONTROLLING GENE EXPRESSION
Inventors:
IPC8 Class: AC12N1510FI
Publication date: 2021-01-21
Patent application number: 20210017514
Abstract:
The invention relates to methods for precisely controlling expression of
a target gene in an organism using a light-inducible kinase and a
response regulator. The invention also relates to nucleic acid constructs
and nucleic acids encoding the light-inducible kinase and response
regulator, as well as organisms expressing these constructs.
Claims:
1. A nucleic acid construct comprising a nucleic acid encoding a
light-responsive histidine kinase and/or a nucleic acid encoding a
response regulator, wherein the nucleic acid encodes a light-responsive
histidine kinase as defined in any one of SEQ ID NOs: 1, 3, 5, 7, 9 or 11
or a functional variant thereof and wherein the response regulator
encodes a response regulator as defined in SEQ ID NO 13 or 15 or a
functional variant thereof.
2. The nucleic acid construct of claim 1, wherein the nucleic acid encoding a light-responsive histidine kinase comprises or consists of SEQ ID NO 2, 4, 6, 8, 10 or 12 or a functional variant thereof or comprises or consists of SEQ ID NO: 47, 48, 49 or 50 or a functional variant thereof and wherein the nucleic acid encoding a response regulator comprises or consists of SEQ ID NO: 14 or 16 or a functional variant thereof.
3. (canceled)
4. The nucleic acid construct of claim 1, wherein the construct comprises at least one regulatory sequence operably linked to at least one of the light-responsive histidine kinase and the response regulator, wherein the regulatory sequence is a constitutive promoter.
5-11. (canceled)
12. The nucleic acid construct of claim 1, wherein the construct further comprises a target sequence operably linked to a regulatory sequence that is specifically activated by the response regulator.
13. The nucleic acid construct of claim 12, wherein the regulatory sequence comprises a nucleic acid sequence as defined in SEQ ID NO: 17 or a functional variant thereof.
14-15. (canceled)
16. A host cell comprising the nucleic acid construct of claim 1.
17. The host cell of claim 16, wherein the cell is a eukaryotic or prokaryotic cell.
18. (canceled)
19. A transgenic organism expressing the nucleic acid construct of claim 1.
20. (canceled)
21. A method of producing a transgenic organism as defined in claim 19, the method comprising: a. selecting a part of the organism; b. transfecting at least one cell of the part of the organism of part (a) with the nucleic acid construct of claim 1; and c. regenerating at least one organism derived from the transfected cell or cells.
22-23. (canceled)
24. A method of modulating expression of a target gene in an organism, the method comprising introducing and expressing a nucleic acid construct as defined in claim 1 in said organism and applying at least one wavelength of light, wherein preferably said wavelength of light activates or represses activation of a LRHK.
25. A method of modulating any biochemical response in an organism, the method comprising introducing and expressing at least one nucleic acid construct as defined in claim 1 in said organism and applying at least one wavelength of light, wherein preferably said wavelength of light activates or represses activation of a LRHK.
26. (canceled)
27. The method of claim 24 wherein expression of a target gene can be increased or decreased by applying at least one first wavelength of light.
28. The method of claim 27, wherein expression of a target gene can be decreased or increased or further decreased or increased by applying at least one second wavelength of light, wherein the first wavelength of light is different from the second wavelength of light.
29. The method of claim 24, wherein the wavelength of light may have one of the following ranges, 430 to 495 nm (blue light), 495 to 570 nm (green light), 600 to 750 nm (red light), white light or white light enriched with at least one of red, blue or green light or wherein the wavelength of light is dark light (no visible light).
30. (canceled)
31. The method of claim 27, wherein the first wavelength of light that increases expression of the target gene is preferably green, white or red light or is white light enriched with red light and wherein the first wavelength of light that decreases expression of the target gene is preferably blue light or is white light enriched with blue light.
32. (canceled)
33. The method of claim 28, wherein the second wavelength of light that further increases expression of a target gene is red light, and wherein the second wavelength of light that decreases expression of a target gene is blue light.
34-35. (canceled)
36. A photoreceptor molecule comprising a phytochrome and a chromophore, wherein the phytochrome comprises an amino acid sequence as defined in any of SEQ ID NOs 1, 3, 5, 7, 9 and 11.
37. The photoreceptor molecule of claim 36, wherein the chromophore is selected from PCB (phycocyanobilin), PΦB (phytochromobilin) and BV (biliverdin).
38-40. (canceled)
41. A nucleic acid construct comprising a target sequence operably linked to a regulatory sequence, wherein the regulatory sequence is a regulatory sequence that is specifically activated by the response regulator, wherein the regulatory sequence comprises a nucleic acid sequence as defined in SEQ ID NO: 17 or a functional variant thereof.
42. (canceled)
43. A nucleic acid comprising: a. a nucleic acid sequence encoding a polypeptide as defined in any of SEQ ID NOs 1, 3, 5, 7, 9, 11, 13 and 15; b. a nucleic acid sequence as defined in any of SEQ ID NOs 2, 4, 6, 8, 10, 12, 14, 16, 17, 47, 48, 49 or 50 or the complementary sequence thereof; c. a nucleic acid with at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% overall sequence identity to the nucleic acid sequence of (a) or (b); or d. a nucleic acid sequence that is capable of hybridising under stringent conditions as defined herein to the nucleic acid sequence of any of (a) to (c).
Description:
FIELD OF THE INVENTION
[0001] The invention relates to methods for precisely controlling expression levels of a nucleic acid sequence, such as a target gene, in an organism using a light-inducible kinase and a response regulator. The invention also relates to nucleic acid constructs and nucleic acids encoding the light-inducible kinase and response regulator, as well as organisms expressing these constructs.
BACKGROUND OF THE INVENTION
[0002] There are thousands of genes in cells which are regulated to orchestrate developmental processes and physiological activities. Some gene functions are unknown in certain contexts, and some are well defined and are of interest to manipulate to produce an advantageous effect. It is therefore of growing interest to be able to selectively regulate gene expression. In research, it is an important tool to probe the function of genes and/or processes controlled by genes, including developmental processes or biochemical activities. In the case of plants, it is of particular interest to manipulate genes relating to physiological processes such as flowering or germination, or pest resistance, for commercial and agroeconomic purposes.
[0003] Current systems for genetic manipulation, including inducing or repressing expression of genes, mainly rely on small-molecule inducers such as doxycycline (Motta-Mena et al, 2014, Nature Chemical Biology). These chemical inducers are associated with a number of disadvantages: the chemical can be pharmacologically active and therefore have off-target effects, is often limited by diffusion into tissue, cannot be localised to small areas or removed after application, and can be toxic to the target organism, people and the environment.
[0004] More recently, the field of "optogenetics" for the regulation of gene expression has grown. These optogenetic systems allow gene expression to be selectively controlled by exposure to minimally invasive light stimuli in a highly selective spatiotemporal manner. This technique circumvents the previously described problems of chemically inducible systems. In addition, light stimuli are cheap to generate, environmentally benign and can potentially be applied repeatedly over large areas and over long periods, which may be particularly advantageous in crop plants. Alternatively, light stimuli can be applied with very high resolution using lasers. Some optogenetic systems have been described; however, these are accompanied by a number of limitations or issues, including low transcriptional activation, long deactivation times, use of exotic chromophores not found endogenously, potential interference with endogenous signalling pathways and the need for multiple protein components (Motta-Mena et al, 2014). It is well known in the field that there are many biological challenges associated with optogenetic systems, including the development of appropriate light-sensitive proteins (Hunter, 2016, EMBO reports). In particular, the application of optogenetic tools in plants presents further difficulties in that plants require light for growth and development, and thus far only a red/far-red light inducible "on/off" system has been applied to plants (Ochoa-Fernandez et al. 2016, Methods in Molecular Biology).
[0005] The present invention addresses the need for an improved optogenetic system that can be used in any organism, including plants.
SUMMARY OF THE INVENTION
[0006] We have created a new tool for manipulating gene expression with light, named the "Highlighter system". This system repurposes a photoreversible two-component signal transduction system termed CcaS-CcaR, originally derived from the cyanobacterium Synechocystis sp. PCC6803, for use in cells and whole organisms, including plants. In nature, cyanobacteria use this system to change the composition of their light-harvesting pigments in response to green and red light for photosynthetic purposes or for resistance to photodamage (Hirose et al, 2010, PNAS; Abe et al, 2014, Microbial Biotechnology). When cyanobacteria are exposed to green light, for example, CcaS is activated by a chromophore-dependent, light-induced conformational change and phosphorylates CcaR. Phosphorylated CcaR then binds to a promoter region and drives transcription of a transcriptional regulator that controls synthesis of the light-harvesting pigment phycoerythrin.
[0007] This invention harnesses this natural phenomenon and functions, in its simplest form, by expressing in a target cell or organism a CcaS variant (in this invention known as the light-responsive histidine kinase (LRHK)) and a CcaR variant (in this invention known as the response regulator (RR)), along with a target gene of interest that is under the control of a response-regulator-specific promoter. In this way, expression of the target gene is controlled: when the LRHK is exposed to an activating wavelength of light, it phosphorylates the RR, which can then bind to its cognate promoter to drive transcription of the target gene. A strong advantage of the CcaS-CcaR system is that its components are not present in plants. The system is therefore orthogonal to plant signalling pathways and is less likely to interfere with, or be interfered with by, endogenous signalling pathways. This system has been used in cyanobacteria and E. coli to drive target gene expression upon green-light stimulation (Abe et al, 2014; Tabor et al, 2011). However, we have further altered this system through a number of modifications, so that it can be activated with a range of different light wavelengths, with a view to utilising the system in plants in particular.
[0008] These improvements include modifications to CcaS (codon optimisation, improved photoswitching with the PΦB chromophore present in plants, untethering of CcaS from the cell membrane and addition of a nuclear localisation signal) and to CcaR (codon optimisation, addition of a C-terminal nuclear localisation signal, addition of a eukaryotic transactivation domain). We have also created a plant vector expression system to deliver the system to plants that includes a synthetic promoter, whose activity level can be modulated via the response regulator, and optionally a fluorescent output reader for normalisation purposes, and ribosomal skipping sequences to reduce vector size. The system is designed to exhibit one target gene expression state during plant growth in normal light-dark cycles, and an altered target gene expression state following treatment with light spectra that are not found in horticultural environments.
[0009] There are many possible applications of this system, whereby gene expression can be precisely and effectively manipulated to study a range of biological processes, or to induce advantageous properties in an organism. The system can be used in a precise manner, both spatially and temporally, for example to target a certain area of the plant such as the leaves, or to trigger a biological process such as flowering or germination at a defined time. This would allow for specific interventions for improved agronomic outcomes.
[0010] The invention described here is thus aimed at providing light-regulated gene expression in cells and organisms and related methods, thus providing products and methods of research and agricultural importance.
[0011] In one aspect of the invention, there is provided a nucleic acid construct comprising a nucleic acid encoding a light-responsive histidine kinase and/or a nucleic acid encoding a response regulator, wherein the nucleic acid encodes a light-responsive histidine kinase as defined in any one of SEQ ID NOs: 1, 3, 5, 7, 9 or 11 or a functional variant thereof and wherein the response regulator encodes a response regulator as defined in any of SEQ ID NOs 13 or 15 or a functional variant thereof.
[0012] In one embodiment, the nucleic acid encoding a light-responsive histidine kinase comprises or consists of SEQ ID NO 2, 4, 6, 8, 10 or 12 or a functional variant thereof or comprises or consists of SEQ ID NO: 47, 48, 49 or 50 or a functional variant thereof.
[0013] In another embodiment, the nucleic acid encoding a response regulator comprises or consists of SEQ ID NO: 14 or 16 or a functional variant thereof.
[0014] In a further embodiment, the construct comprises at least one regulatory sequence operably linked to at least one of the light-responsive histidine kinase and the response regulator. Preferably, the regulatory sequence is operably linked to the light-responsive histidine kinase and the response regulator.
[0015] In another embodiment, the construct further comprises a reporter sequence. Preferably, the reporter sequence is operably linked to a regulatory sequence. More preferably, the light-responsive histidine kinase, the response regulator and the reporter sequence are operably linked to a single regulatory sequence.
[0016] In a further embodiment, the construct further comprises at least one terminator sequence operably linked to at least one, preferably at least two, more preferably all three of the light-responsive histidine kinase, the response regulator and the reporter sequence.
[0017] In one embodiment, the regulatory sequence is a constitutive promoter. For example, the promoter is the UBQ10 promoter or a functional variant thereof.
[0018] In a further embodiment, the construct further comprises a target sequence operably linked to a regulatory sequence that is specifically activated by the response regulator. In one embodiment, the regulatory sequence comprises a nucleic acid sequence as defined in SEQ ID NO: 17 or a functional variant thereof. In a further embodiment, the target sequence is operably linked to a terminator sequence.
[0019] In another aspect of the invention, there is provided a vector, preferably an expression vector, comprising the nucleic acid construct as described herein.
[0020] In a further aspect of the invention, there is provided a host cell comprising a nucleic acid construct as described herein or a vector as described herein. Preferably, the cell is a eukaryotic or prokaryotic cell. More preferably, the eukaryotic cell is a plant cell.
[0021] In another aspect of the invention, there is provided a transgenic organism expressing the nucleic acid construct as described herein or a vector as described herein. In a preferred embodiment, the organism is a plant.
[0022] In another aspect of the invention, there is provided a method of producing a transgenic organism as described herein, the method comprising:
[0023] a. selecting a part of the organism;
[0024] b. transfecting at least one cell of the part of the organism of part (a) with the nucleic acid construct as described herein or the vector as described herein; and
[0025] c. regenerating at least one organism derived from the transfected cell or cells.
[0026] In a further aspect, there is provided an organism obtained or obtainable by the method described herein. Preferably, the organism is a plant.
[0027] In another aspect of the invention, there is provided a method of modulating expression of a target gene in an organism, the method comprising introducing and expressing a nucleic acid construct as described herein or a vector as described herein in said organism and applying at least one wavelength of light. In one embodiment, the wavelength of light activates or represses activation of a LRHK.
[0028] In a further aspect of the invention, there is also provided a method of modulating any biochemical response in an organism, the method comprising introducing and expressing at least one nucleic acid construct as described herein or a vector as described herein in said organism and applying at least one wavelength of light. In one embodiment, the biochemical response is a developmental process or physiological response. Preferably, the biochemical response is modulated by modulating expression of at least one target gene. In one embodiment, the wavelength of light activates or represses activation of a LRHK.
[0029] The wavelength of light may be referred to as an activating or repressing wavelength.
[0030] In one embodiment, the wavelength of light may have one of the following ranges: 370 to 400 nm (ultraviolet light), 430 to 495 nm (blue light), 495 to 570 nm (green light), 570 to 600 nm (yellow/orange light), 600 to 750 nm (red light) or 750 to 850 nm (far-red light), or be a white light (as described below). In another embodiment, the wavelength of light may be dark light (as described below). In a further embodiment, the wavelength of light may be white light enriched with at least one of red, blue or green light.
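The named ranges in paragraph [0030] can be sketched as a simple lookup. The following Python is a minimal illustration only: the function name, the table name and the treatment of shared boundaries (for example 495 nm) are assumptions and are not taken from the application.

```python
# Wavelength bands as listed in paragraph [0030], in nanometres.
# BANDS and classify_wavelength are illustrative names (assumptions).
BANDS = [
    ("ultraviolet", 370, 400),
    ("blue", 430, 495),
    ("green", 495, 570),
    ("yellow/orange", 570, 600),
    ("red", 600, 750),
    ("far-red", 750, 850),
]


def classify_wavelength(nm):
    """Return the band name for a wavelength in nm, or None if no band matches.

    Assumption: both ends of each range are inclusive and the first matching
    band wins, so a shared boundary such as 495 nm is reported as the
    shorter-wavelength band.
    """
    for name, low, high in BANDS:
        if low <= nm <= high:
            return name
    return None
```

Under these assumptions, classify_wavelength(520) returns "green", and a value such as 410 nm, which lies between the listed ultraviolet and blue ranges, returns None.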
[0031] In one embodiment, expression of a target gene can be increased or decreased by applying at least one first wavelength of light.
[0032] In a further embodiment, expression of a target gene can be decreased or further increased by applying at least one second wavelength of light, wherein the first wavelength of light is different from the second wavelength of light.
[0033] In one embodiment, the first wavelength of light that increases expression of the target gene is preferably green, white, dark or red light or is white light enriched with red light.
[0034] In another embodiment, the first wavelength of light that decreases expression of the target gene is preferably blue light or is white light enriched with blue light.
[0035] In a further embodiment, the second wavelength of light that further increases expression of a target gene is red light. In this embodiment, the first wavelength of light is preferably white, green or dark light.
[0036] In another embodiment, the second wavelength of light that decreases expression of a target gene is blue light. In this embodiment, the first wavelength may be red, green, white or dark light.
[0037] In another embodiment, the first wavelength of light may be blue light and the second wavelength of light red light or vice versa.
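The preferred first and second wavelengths in paragraphs [0033] to [0036] can be summarised as a lookup. The sketch below is illustrative only: the light labels, dictionary and function names are assumptions, and combinations whose effect is not stated in those paragraphs are reported as "unspecified" rather than guessed.

```python
# Effect of the first wavelength on target gene expression,
# as listed in paragraphs [0033] and [0034]. Labels are illustrative.
FIRST_LIGHT_EFFECT = {
    "green": "increase",
    "white": "increase",
    "dark": "increase",
    "red": "increase",
    "white+red": "increase",   # white light enriched with red light
    "blue": "decrease",
    "white+blue": "decrease",  # white light enriched with blue light
}


def expected_first_effect(light):
    """Return the stated effect of a first wavelength, or 'unspecified'."""
    return FIRST_LIGHT_EFFECT.get(light, "unspecified")


def expected_second_effect(first, second):
    """Return the stated effect of a second wavelength applied after a first.

    Only the combinations given in paragraphs [0035] and [0036] are encoded.
    """
    if second == "red" and first in {"white", "green", "dark"}:
        return "further increase"
    if second == "blue" and first in {"red", "green", "white", "dark"}:
        return "decrease"
    return "unspecified"
```

For example, expected_second_effect("green", "red") returns "further increase", while the blue-then-red order of paragraph [0037] returns "unspecified" because no direction is stated for it there.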
[0038] In another aspect of the invention, there is provided a photoreceptor molecule comprising a phytochrome and a chromophore, wherein the phytochrome comprises an amino acid sequence as defined in any of SEQ ID NOs 1, 3, 5, 7, 9 and 11 or a variant thereof. Preferably, the chromophore is selected from PCB (phycocyanobilin), PΦB (phytochromobilin) and BV (biliverdin). More preferably, the chromophore is PΦB.
[0039] In a further aspect of the invention, there is provided the use of the nucleic acid construct as described above or a vector as described above to modulate expression of a target gene in an organism.
[0040] In another aspect of the invention, there is provided the use of the nucleic acid construct as described above or a vector as described above to modulate any biochemical response in an organism, preferably a developmental or physiological response.
[0041] In a further aspect of the invention, there is provided a nucleic acid construct comprising a target sequence operably linked to a regulatory sequence, wherein the regulatory sequence is a regulatory sequence that is specifically activated by the response regulator. In one embodiment, the regulatory sequence comprises a nucleic acid sequence as defined in SEQ ID NO: 17 or a functional variant thereof.
[0042] In a final aspect of the invention, there is provided a nucleic acid comprising:
[0043] a. a nucleic acid sequence encoding a polypeptide as defined in any of SEQ ID NOs 1, 3, 5, 7, 9, 11, 13 and 15;
[0044] b. a nucleic acid sequence as defined in any of SEQ ID NOs 2, 4, 6, 8, 10, 12, 14, 16 or 17 or the complementary sequence thereof;
[0045] c. a nucleic acid with at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% overall sequence identity to the nucleic acid sequence of (a) or (b); or
[0046] d. a nucleic acid sequence that is capable of hybridising under stringent conditions as defined herein to the nucleic acid sequence of any of (a) to (c).
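The sequence identity thresholds in item (c) can be illustrated with a short calculation. The sketch below assumes the two sequences have already been globally aligned to equal length with "-" as the gap character, and it counts identity over all aligned columns. Both the function names and this counting convention are assumptions; the application may define overall sequence identity differently.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = matching columns / total aligned columns, where
    columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=75.0):
    """True if the pair reaches the given percent identity (default 75%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For instance, percent_identity("ACGT", "ACGA") gives 75.0, which would just meet the lowest threshold listed in item (c).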
DESCRIPTION OF THE FIGURES
[0047] The invention is further described in the following non-limiting figures:
[0048] FIG. 1 shows the CcaS-CcaR system repurposed for control of gene expression in E. coli. In darkness, or upon red light illumination, the CcaS-CcaR system remains in/enters its inactive state where sfGFP expression is at its lowest. Upon green light illumination, the kinase activity of CcaS is activated and CcaS phosphorylates and hence activates CcaR (CcaR-P). CcaR-P binds the ccaR CRE inside the P_cpcG2-172 promoter sequence and induces sfgfp transcription.
[0049] FIG. 2 shows a photoswitching assay in E. coli. Serial dilutions of E. coli cultures expressing the CcaS-CcaR system were grown in 96-well plates (in LB media at 37 °C, shaking) while receiving light treatments, here blue light (Blue), green light (Green), red light (Red) and darkness (Dark). The GFP fluorescence was quantified on a fluorimeter, along with the cell density (OD600). The fluorescence was then plotted against the cell density (A). The fluorescence was then estimated at OD600=0.2 and converted into a heat map (B).
[0050] FIG. 3 shows chromophore dependency of the CcaS-CcaR system in E. coli. The system was tested under five light regimes: four-hour treatments with RGB-white (White), blue, green or red light, or in darkness (Dark). CcaS was always coexpressed with CcaR in combination with the biosynthetic machinery to produce PCB, PΦB, BV or no chromophore (O). The intensity of green in the heat map corresponds to the level of sfGFP expression observed under the tested conditions.
[0051] FIG. 4 shows the A92V mutation enhances CcaS photoswitching with PΦB. CcaS(A92V) with PΦB is repressed by blue light and RGB-white light (White) and activated by green light and red light. CcaS(A92V) behaves like CcaS in the presence of BV and in the absence of chromophores.
[0052] FIG. 5 shows bacterial validation of modifications made to CcaS in order for it to function in planta. We simultaneously tested the effects on the photoswitching properties of CcaS of the following modifications: the A92V mutation to allow for photoswitching with PΦB, removal of the transmembrane domain (Δ22 or Δ23), and the addition of an N-terminal NLS. The numbers in the table are fluorescence counts in millions.
[0053] FIG. 6 shows bacterial testing of the effects of 2A tails on CcaS function.
[0054] FIG. 7 shows a schematic of a pHighlighter plant expression vector. The input cassette constitutively expresses a light responsive histidine kinase (LRHK), a reporter (R_const) and a response regulator (RR). The constitutive expression of these three proteins from the input cassette is controlled by the UBQ10 promoter (P_UBQ10) (SEQ ID NO: 44) and the rbcS terminator (T_rbcS) (SEQ ID NO: 42). The output cassette holds a cognate promoter for the response regulator (P_RR), a target gene of interest (Target) and a NOS terminator (T_NOS) (SEQ ID NO: 43). When the LRHK is exposed to an activating wavelength of light, it phosphorylates the RR, which then binds to its cognate promoter, P_RR, and the Target is expressed. The constitutively expressed reporter, R_const, allows for the detection of transfected cells during transient transfections of plants and provides a normalization control if a fluorescent protein is used as Target. LB and RB are the left and right borders. ColEI and OriV are origins of replication, trfA is a replication initiation protein and AmpR is the bacterial resistance gene against ampicillin.
[0055] FIG. 8 shows the cognate promoter, P_RR, for the response regulator. The P_RR is made up of three ccaR CRE sequences, separated by spacers, and fused to the -51 35S minimal promoter (P_35Smin(-51)). +1 denotes the transcription start site (TSS).
[0056] FIG. 9 shows ribosomal skipping efficiency in tobacco. The efficiency of ribosomal skipping for P2A, F2A and F2A_30 was tested in transiently transfected tobacco. The graph shows the mean TagRFP signal in the nucleus/mean TagRFP signal in the cytosol. For this experiment, the LRHK, MM:NLS:CcaS(Δ23 A92V), was linked to a downstream TagRFP via the three different 2A sequences, P2A, F2A and F2A_30, and expressed from the P_UBQ-T_rbcS cassette. The controls for perfect ribosomal skipping and complete failure of skipping are TagRFP and NLS:TagRFP, respectively. n=4-6, error bars are S.D.
[0057] FIG. 10 shows transient expression of the Highlighter system in Tobacco: The plant expression vector, pHighlighter, was transformed into Agrobacterium and used to infiltrate tobacco leaves. The plants were left to express the system for 2 days in the greenhouse and light treated for a minimum of 18 hours.
[0058] FIG. 11 shows light-controlled induction of NLS:Venus expression, by four Highlighter system variants, in response to blue light, green light and darkness. The systems were transiently expressed in tobacco as described in FIG. 6. The numbers are YFP mean/RFP mean averages for plant nuclei under the given light condition. ± values are S.D., n=3 biological replicates (each n is an average of the YFP mean/RFP mean calculated for 15-20 nuclei).
[0059] FIG. 12 shows transient expression of the Highlighter system in Tobacco: The plant expression vector, pHighlighter, was transformed into Agrobacterium and used to infiltrate tobacco leaves. The plants were left to express the system for 2 days under continuous blue light conditions and light treated (RGB-white light (White), blue light, green light, red light and darkness) for a minimum of 24 hours.
[0060] FIG. 13 shows light-controlled induction of NLS:Venus expression, by four Highlighter system variants, in response to blue light, green light and darkness. The systems were transiently expressed in tobacco as described in FIG. 7. The numbers are YFP mean/RFP mean (specifically NLS:Venus mean signal/NLS:TagRFP mean signal) averages for plant nuclei under the given light condition. The values in the table are the YFP mean/RFP mean average calculated for 22-209 nuclei; ± values are 95% confidence intervals.
[0061] FIG. 14 shows light-controlled induction of NLS:Venus expression, by three Highlighter system variants. Induction of NLS:Venus expression was measured in response to what the human eye perceives as pure red light (RRR), very red enriched white light (RRW), slightly red enriched white light (RWW, i.e. red light proportion 42% and blue light proportion 32%), slightly blue enriched white light (WWB, i.e. red light proportion 18% and blue light proportion 60%), very blue enriched white light (WBB) and pure blue light (BBB). The systems were transiently expressed in tobacco as shown in FIG. 12. Confocal fluorescence images of tobacco epidermal cells were acquired and IMARIS software was used to segment and quantify fluorescence signals from individual nuclei. The values in the table are mean fluorescence emission values for YFP/RFP calculated for 12-132 nuclei ± 95% confidence intervals.
[0062] FIG. 15 shows quantification of LRHK variants in E. coli. E. coli strains expressing the LRHK variants were quantified after four-hour treatments of darkness and eight different light regimes: ultraviolet light (370 nm or 400 nm), blue light (450 nm), green light (520 nm), yellow light (590 nm), orange light (610 nm), red light (630 nm), far red light (700 nm). The LRHKs were coexpressed with CcaR, sfGFP under control of a CcaS/CcaR responsive promoter, and the biosynthetic machinery to produce PΦB. The values are fluorescence counts in millions, corresponding to the level of sfGFP expression observed under the tested light regimes.
[0063] FIG. 16 shows conditional complementation of the semi-dwarf phenotype of the ga3ox1-3, ga3ox2-1, nGPS1 Arabidopsis line by using the Highlighter system to control AtGA3OX1 expression levels with blue- and red-enriched white light. (A) The ga3ox1-3, ga3ox2-1, nGPS1 line grown in continuous blue-enriched white light. (B) The ga3ox1-3, ga3ox2-1, nGPS1 line, transformed with the Highlighter system to control GA3OX1 expression levels, grown in continuous blue-enriched white light. (C) The ga3ox1-3, ga3ox2-1, nGPS1 line grown in continuous red-enriched white light. (D) The ga3ox1-3, ga3ox2-1, nGPS1 line, transformed with the Highlighter system to control AtGA3OX1 expression levels, grown in continuous red-enriched white light.
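The YFP/RFP normalisation reported for FIGS. 11, 13 and 14 can be sketched as follows. This is a minimal illustration under stated assumptions: the inputs are hypothetical per-nucleus mean YFP and RFP signals, and the 95% confidence interval uses a normal approximation (1.96 times the standard error of the mean). The application does not state how its confidence intervals were calculated, and the function name is illustrative.

```python
import math
import statistics


def ratio_summary(yfp_means, rfp_means):
    """Return (mean YFP/RFP ratio, 95% CI half-width) across nuclei.

    Assumptions: one YFP and one RFP mean signal per nucleus, and a
    normal-approximation interval (1.96 * standard error of the mean).
    """
    if len(yfp_means) != len(rfp_means):
        raise ValueError("need one YFP and one RFP value per nucleus")
    if len(yfp_means) < 2:
        raise ValueError("need at least two nuclei to estimate spread")
    # Normalise each nucleus by its constitutively expressed reporter signal.
    ratios = [y / r for y, r in zip(yfp_means, rfp_means)]
    mean_ratio = statistics.mean(ratios)
    sem = statistics.stdev(ratios) / math.sqrt(len(ratios))
    return mean_ratio, 1.96 * sem
```

Dividing by the constitutively expressed reporter signal corrects for differences in transfection or expression level between nuclei, which is the normalisation role described for the reporter in FIG. 7.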
DETAILED DESCRIPTION OF THE INVENTION
[0064] The present invention will now be further described. In the following passages, different aspects of the invention are defined in more detail. Each aspect so defined may be combined with any other aspect or aspects unless clearly indicated to the contrary. In particular, any feature indicated as being preferred or advantageous may be combined with any other feature or features indicated as being preferred or advantageous.
[0065] The practice of the present invention will employ, unless otherwise indicated, conventional techniques of botany, microbiology, tissue culture, molecular biology, chemistry, biochemistry, recombinant DNA technology, and bioinformatics which are within the skill of the art. Such techniques are explained fully in the literature.
[0066] As used herein, the words "nucleic acid", "nucleic acid sequence", "nucleotide", "nucleic acid molecule" or "polynucleotide" are intended to include DNA molecules (e.g., cDNA or genomic DNA), RNA molecules (e.g., mRNA), naturally occurring, mutated, synthetic DNA or RNA molecules, and analogs of the DNA or RNA generated using nucleotide analogs. A nucleic acid can be single-stranded or double-stranded. Such nucleic acids or polynucleotides include, but are not limited to, coding sequences of structural genes, anti-sense sequences, and non-coding regulatory sequences that do not encode mRNAs or protein products. These terms also encompass a gene. The term "gene" or "gene sequence" is used broadly to refer to a DNA nucleic acid associated with a biological function. Thus, genes may include introns and exons as in the genomic sequence, or may comprise only a coding sequence as in cDNAs, and/or may include cDNAs in combination with regulatory sequences.
[0067] The terms "polypeptide" and "protein" are used interchangeably herein and refer to amino acids in a polymeric form of any length, linked together by peptide bonds.
[0068] In one aspect of the invention, there is provided a nucleic acid construct comprising a nucleic acid encoding a light-responsive histidine kinase (LRHK) and/or a nucleic acid encoding a response regulator (RR). In a preferred embodiment, the LRHK is a cyanobacteriochrome, more preferably, the cyanobacteriochrome CcaS (complementary chromatic acclimation sensor). In a further preferred embodiment, CcaS comprises a nuclear localisation signal and/or lacks a membrane anchor and/or has an A92V mutation. More preferably, as described above, the nucleic acid encodes a light-responsive histidine kinase as defined in any one of SEQ ID NOs: 1, 3, 5, 7, 9 or 11 or a functional variant thereof. Preferably, the construct comprises both an LRHK and an RR.
[0069] In another preferred embodiment, the RR is a transcriptional regulatory protein, preferably an OmpR-class response regulator, and more preferably CcaR (complementary chromatic acclimation regulator). In a preferred embodiment, CcaR comprises a C-terminal nuclear localisation signal and/or a transcription activation or repressor domain, preferably the VP64 eukaryotic transactivation domain. In a particularly preferred embodiment, the nucleic acid encodes a response regulator as defined in SEQ ID NO: 13 or 15 or a functional variant thereof.
[0070] In one embodiment, the nucleic acid encoding a light-responsive histidine kinase comprises or consists of SEQ ID NO 2, 4, 6, 8, 10, 12, 47, 48, 49 or 50 or a functional variant thereof. In a further embodiment, the nucleic acid encoding a response regulator comprises or consists of SEQ ID NO: 14 or 16 or a functional variant thereof.
[0071] SEQ ID NOs 1-12 and 47 to 50 relate to exemplary variants of CcaS that may be used in the invention. Similarly, SEQ ID NOs 13-16 relate to exemplary variants of CcaR that may be used in the invention.
CcaS Variants
[0072] SEQ ID NOs 1 and 2 (amino and nucleic acid sequences respectively) correspond to a CcaS mutant with an A92V point mutation that results in improved photoswitching with PΦB.
[0073] SEQ ID NOs 3 and 4 (amino and nucleic acid sequences respectively) correspond to a CcaS mutant with a truncation (removal of bases 1-69) and the addition of an NLS sequence (as described in SEQ ID NOs: 26 and 27).
[0074] SEQ ID NOs 5 and 6 (amino and nucleic acid sequences respectively) correspond to a CcaS mutant with an A92V point mutation that results in improved photoswitching with PΦB and a truncation (removal of bases 4-69).
[0075] SEQ ID NOs 7 and 8 (amino and nucleic acid sequences respectively) correspond to a CcaS mutant with an A92V point mutation that results in improved photoswitching with PΦB, the addition of an NLS sequence, and a truncation (removal of bases 1-69).
[0076] SEQ ID NOs 9 and 10 (amino and nucleic acid sequences respectively) correspond to a CcaS mutant with an A92V point mutation that results in improved photoswitching with PΦB, the addition of an NLS sequence, a truncation (removal of bases 1-69), and the addition of a peptide tail (amino acids 1-20) encoding a 2A ribosomal skipping sequence.
[0077] SEQ ID NOs 11 and 12 (amino and nucleic acid sequences respectively) correspond to a CcaS mutant with an A92V point mutation that results in improved photoswitching with PΦB, the addition of an NLS sequence, a truncation (removal of bases 1-69), and the addition of a peptide tail (amino acids 1-29) encoding a 2A ribosomal skipping sequence.
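By way of illustration only, the point-mutation notation used for the CcaS variants (e.g. A92V: alanine at position 92 replaced by valine) can be applied to a protein string as sketched below. The helper name apply_point_mutation and the toy input sequence are hypothetical and are not part of the sequence listing; the sketch assumes 1-based residue numbering and checks that the stated wild-type residue is actually present before substituting.

```python
import re


def apply_point_mutation(protein: str, mutation: str) -> str:
    """Apply a substitution written as <wild-type><position><new>, e.g. 'A92V'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation notation: {mutation!r}")
    wild_type, replacement = match.group(1), match.group(3)
    index = int(match.group(2)) - 1  # residue positions are 1-based
    if not 0 <= index < len(protein):
        raise ValueError(f"position {index + 1} is outside the sequence")
    if protein[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {index + 1}, found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]


# Toy placeholder sequence (NOT a CcaS sequence): alanine placed at position 92.
toy_protein = "M" + "G" * 90 + "A" + "S" * 8
toy_mutant = apply_point_mutation(toy_protein, "A92V")
```

The residue check guards against applying the notation to a sequence whose numbering is offset, for example by a truncation or an added N-terminal tag.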
CcaR Variants
[0078] SEQ ID NOs 13 and 14 (amino and nucleic acid sequences respectively) correspond to a CcaR variant with an NLS and VP64 domain fused to the N-terminus, as well as an N-terminal proline.
[0079] SEQ ID NOs 15 and 16 (amino and nucleic acid sequences respectively) correspond to a CcaR variant with an NLS and VP64 domain fused to the C-terminus, as well as an N-terminal proline.
[0080] The term "variant" or "functional variant" as used throughout with reference to any of SEQ ID NOs: 1 to 50 refers to a variant gene sequence or part of the gene sequence which retains the biological function of the full non-variant sequence. A functional variant also comprises a variant of the gene of interest, which has sequence alterations that do not affect function, for example in non-conserved residues. Also encompassed is a variant that is substantially identical, i.e. has only some sequence variations, for example in non-conserved residues, compared to the wild type sequences as shown herein and is biologically active. Alterations in a nucleic acid sequence that result in the production of a different amino acid at a given site without affecting the functional properties of the encoded polypeptide are well known in the art. For example, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can also be expected to produce a functionally equivalent product. Nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the polypeptide molecule would also not be expected to alter the activity of the polypeptide. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products.
[0081] As used in any aspect of the invention described throughout a "variant" or a "functional variant" has at least 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% overall sequence identity to the non-variant nucleic acid or amino acid sequence.
[0082] In one embodiment, the "CcaS" protein is a light-responsive histidine kinase, wherein the kinase is characterised by a number of domains or motifs. For example, the CcaS protein may comprise at least one of a GAF domain or GAF domain variant (for example, from AnPixjg2, slr1393g2, NpR1597g4 and UirSg), a His-Kinase domain and a nuclear localisation signal or NLS, as well as optionally at least one, preferably two PAS (or Per-Arnt-Sim) domains.
[0083] In one embodiment, the sequence of these domains comprises or consists of the following sequence or a functional variant thereof:
TABLE-US-00001
GAF domain (nucleic acid sequence) (SEQ ID NO: 18):
ATCAGACAATCTCTTAATTTGGAGACTGTTTTGAACACTACAG
TTGCTGAAGTTAAGACACTTTTGCAGGTTGATAGAGTTCTTAT
CTATAGAATCTGGCAAGATGGTACAGGATCTGTTATCACTGAG
TCTGTTAATGCTAACTACCCTTCTATTTTGGGTAGAACTTTTT
CTGATGAGGTTTTCCCAGTTGAATATCATCAAGCTTACACAAA
GGGAAAAGTTAGAGCTATTAATGATATCGATCAGGATGATATC
GAAATCTGTCTTGCTGATTTCGTTAAACAATTCGGTGTTAAGT
CTAAACTTGTTGTTCCTATCTTGCAGCATAATAGAGCTTCTTC
TTTGGATAACGAATCTGAGTTTCCATATCTTTGGGGACTTTTG
ATTACACATCAGTGTGCTTTCACTAGACCTTGGCAACCTTGGG
AAGTTGAGCTTATGAAGCAGTTGGCTAACCAAGTTGCTATTGC
TATC
GAF domain (amino acid sequence) (SEQ ID NO: 19):
IRQSLNLETVLNTTVAEVKTLLQVDRVLIYRIWQDGTGSVITE
SVNANYPSILGRTFSDEVFPVEYHQAYTKGKVRAINDIDQDDI
EICLADFVKQFGVKSKLVVPILQHNRASSLDNESEFPYLWGLL
ITHQCAFTRPWQPWEVELMKQLANQVAIAI
PAS domain (nucleic acid sequence); domain 1 (SEQ ID NO: 20):
ACTAACCATACACTTCAGTCTTTGATTGCTGCTTCTCCTAGAG
GTATCTTTACTCTTAATTTGGCTGATCAAATTCAGATCTGGAA
CCCAACAGCTGAGCGAATCTTCGGATGGACTGAAACAGAGATT
ATCGCTCATCCTGAGCTTTTGACATCTAACATCCTTTTGGAAG
ATTACCAACAGTTTAAGCAAAAGGTTCTTTCTGGTATGGTTTC
TCCATCT
PAS domain (amino acid sequence); domain 1 (SEQ ID NO: 21):
TNHTLQSLIAASPRGIFTLNLADQIQIWNPTAERIFGVVTETE
IIAHPELLTSNILLEDYQQFKQKVLSGMVSPS
PAS domain (nucleic acid sequence); domain 2 (SEQ ID NO: 22):
ATCGATGATCCTGGACCAAGAATCCTTTATGTTAATGAGGCTT
TCACTAAGATCACAGGATACACTGCTGAAGAGATGTTGGGAAA
GACTCCTAGAGTTCTTCAAGGACCAAAAACTTCAAGAACTGAG
TTGGATAGAGTTAGACAGGCTATCTCTCAATGG
PAS domain (amino acid sequence); domain 2 (SEQ ID NO: 23):
IDDPGPRILYVNEAFTKITGYTAEEMLGKTPRVLQGPKTSRTE
LDRVRQAISQW
His-Kinase domain (nucleic acid sequence) (SEQ ID NO: 24):
ATGGCTTCTCATGAGTTTAGAACACCACTTTCTACTGCTTTGG
CTGCTGCTCAACTTCTTGAAAATTCTGAAGTTGCTTGGCTTGA
TCCTGATAAGAGATCAAGAAACCTTCATAGAATCCAAAATTCT
GTTAAAAACATGGTTCAACTTTTGGATGATATCTTGATTATCA
ACAGAGCTGAGGCTGGAAAGCTTGAGTTTAATCCAAACTGGCT
TGATTTGAAGCTTTTGTTCCAACAGTTCATTGAAGAGATCCAG
CTTTCTGTTTCTGATCAATACTACTTCGATTTCATCTGTTCTG
CTCAAGATACTAAGGCTCTTGTTGATGAAAGATTGGTTAGATC
TATCCTTTCTAATCTTTTGTCTAACGCTATCAAGTACTCTCCT
GGAGGTGGACAGATTAAAATCGCTCTTTCTTTGGATTCTGAGC
AGATTATCTTCGAAGTTACAGATCAAGGTATTGGAATCTCTCC
TGAGGATCAAAAGCAGATCTTTGAACCATTCCATAGAGGAAAG
AATGTTAGAAACATTACTGGTACAGGACTTGGTTTGATGGTTG
CTAAGAAATGTGTTGATCTTCATTCTGGATCTATCCTTTTGAA
GTCTGCTGTGGATCAAGGAACAACTGTGACCATCTGTCTCAAA
AGGTACAAC
His-Kinase domain (amino acid sequence) (SEQ ID NO: 25):
MASHEFRTPLSTALAAAQLLENSEVAWLDPDKRSRNLHRIQNS
VKNMVQLLDDILIINRAEAGKLEFNPNWLDLKLLFQQFIEEIQ
LSVSDQYYFDFICSAQDTKALVDERLVRSILSNLLSNAIKYSP
GGGQIKIALSLDSEQIIFEVTDQGIGISPEDQKQIFEPFHRGK
NVRNITGTGLGLMVAKKCVDLHSGSILLKSAVDQGTTVTICLK
RYN
NLS (nucleic acid sequence) (SEQ ID NO: 26):
TTACAACCAAAGAAGAAAAGGAAGGTGGGTGGA
NLS (amino acid sequence) (SEQ ID NO: 27):
LQPKKKRKVGG
[0084] Accordingly, in one embodiment, a CcaS variant may have at least one of a GAF domain, a NLS and a His-Kinase domain and optionally at least one, preferably at least two PAS domains as defined above or a domain with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% overall sequence identity to any one of SEQ ID NOs 18 to 27.
[0085] In one embodiment, the "CcaR" protein is a transcriptional regulatory protein, wherein the regulator is characterised by a number of domains or motifs. For example, the CcaR may comprise at least one of a REC domain (receiver domain, preferably an N-terminal REC domain), a transcriptional activation or repression domain and a DNA-binding domain (preferably a C-terminal DNA-binding domain). Preferably, CcaR comprises a VP64 transactivation domain.
[0086] In one embodiment, the sequence of these domains comprises or consists of the following sequences:
TABLE-US-00002
REC domain (nucleic acid sequence) (SEQ ID NO: 28):
AGAATACTCCTCGTGGAAGATGATTTGCCATTAGCAGAAACCC
TCGCAGAAGCTTTGTCTGATCAACTTTACACTGTTGATATTGC
TACAGATGCTTCTTTGGCTTGGGATTATGCTTCTAGACTTGAA
TACGATTTGGTTATTCTTGATGTTATGTTGCCTGAGCTTGATG
GAATTACTCTTTGTCAGAAGTGGAGATCTCATTCTTATTTGAT
GCCAATCCTTATGATGACTGCTAGAGATACAATTAATGATAAG
ATCACAGGACTTGATGCTGGTGCTGATGATTACGTTGTTAAAC
CTGTTGATTTGGGTGAACTTTTTGCTAGAGTTAGAGCTCTTTT
G
REC domain (amino acid sequence) (SEQ ID NO: 29):
RILLVEDDLPLAETLAEALSDQLYTVDIATDASLAWDYASRLE
YDLVILDVMLPELDGITLCQKWRSHSYLMPILMMTARDTINDK
ITGLDAGADDYVVKPVDLGELFARVRALL
DNA binding domain (nucleic acid sequence) (SEQ ID NO: 30):
CAACCAGTTTTGGAGTGGGGTCCTATTAGACTTGATCCATCTA
CTTATGAAGTTTCTTACGATAATGAGGTTTTGTCTCTTACAAG
AAAGGAATACTCTATCTTGGAGCTTTTGCTTAGAAACGGAAGA
AGAGTTCTTTCTAGATCTATGATCATCGATTCTATCTGGAAGT
TGGAGTCTCCTCCAGAAGAGGATACAGTTAAAGTTCATGTTAG
ATCTTTGAGACAAAAGCTTAAGTCTGCTGGACTTTCTGCTGAT
GCTATTGAAACTGTTCATGGAATCGGTTACAGATTGGCTAAT
DNA binding domain (amino acid sequence) (SEQ ID NO: 31):
QPVLEWGPIRLDPSTYEVSYDNEVLSLTRKEYSILELLLRNGR
RVLSRSMIIDSIWKLESPPEEDTVKVHVRSLRQKLKSAGLSAD
AIETVHGIGYRLAN
NLS (nucleic acid sequence) (SEQ ID NO: 32):
CTCCAGCCTAAGAAGAAGAGAAAGGTTGGAGGT
NLS (amino acid sequence) (SEQ ID NO: 33):
LQPKKKRKVGG
VP64 domain (nucleic acid sequence) (SEQ ID NO: 34):
GATGCCCTCGACGATTTCGACCTCGATATGCTCGGTTCTGATG
CTCTCGATGACTTTGACCTTGACATGCTTGGATCAGACGCTTT
GGACGACTTCGACTTGGACATGTTGGGATCTGATGCACTTGAT
GATTTTGACCTTGATATGCTT
VP64 domain (amino acid sequence) (SEQ ID NO: 35):
DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALD
DFDLDML
[0087] Accordingly, in one embodiment, a CcaR variant has at least one of a REC domain, an NLS and a transcriptional activation or repression domain as defined in SEQ ID NOs: 28 to 35 or a domain with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% overall sequence identity to any one of SEQ ID NOs: 28 to 35.
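As an illustrative check, the two NLS coding sequences listed for CcaS (SEQ ID NO: 26) and CcaR (SEQ ID NO: 32) differ at the nucleotide level yet encode the same peptide (SEQ ID NOs: 27 and 33). The sketch below translates both with the standard genetic code; the function name translate and the constant names are hypothetical, and the sketch assumes the sequences are read in frame from their first base.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper().replace(" ", "")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


NLS_SEQ_ID_NO_26 = "TTACAACCAAAGAAGAAAAGGAAGGTGGGTGGA"  # CcaS NLS, nucleic acid
NLS_SEQ_ID_NO_32 = "CTCCAGCCTAAGAAGAAGAGAAAGGTTGGAGGT"  # CcaR NLS, nucleic acid

nls_from_26 = translate(NLS_SEQ_ID_NO_26)
nls_from_32 = translate(NLS_SEQ_ID_NO_32)
```

Both translations give LQPKKKRKVGG, which is a simple instance of nucleotide variation that leaves the encoded polypeptide unchanged.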
[0088] Two nucleic acid sequences or polypeptides are said to be "identical" if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same when aligned for maximum correspondence as described below. The terms "identical" or percent "identity," in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a comparison window, as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. When percentage of sequence identity is used in reference to proteins or peptides, it is recognised that residue positions that are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art. For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters. 
Non-limiting examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms.
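A minimal sketch of the percent identity calculation is given below, assuming the two sequences have already been aligned (for example by BLAST) and are supplied as equal-length strings with "-" marking gaps. The function name percent_identity and the choice of denominator (all alignment columns, excluding columns that are gaps in both sequences) are assumptions; alignment programs differ in how they count gapped columns.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when the aligned pair reaches the stated identity threshold (in percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, percent_identity("ACGT", "ACGA") evaluates to 75.0, so that pair would satisfy a 70% threshold but not an 80% threshold.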
[0089] In a further embodiment, a variant as used herein, can comprise a nucleic acid encoding a LRHK or RR as defined herein that is capable of binding or hybridising under stringent conditions as defined herein to a nucleic acid sequence as defined in any of SEQ ID NOs 1 to 50.
[0090] Hybridization of such sequences may be carried out under stringent conditions. By "stringent conditions" or "stringent hybridization conditions" is intended conditions under which a probe will hybridize to its target sequence to a detectably greater degree than to other sequences (e.g., at least 2-fold over background). Stringent conditions are sequence dependent and will be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences that are 100% complementary to the probe can be identified (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing). Generally, a probe is less than about 1000 nucleotides in length, preferably less than 500 nucleotides in length.
[0091] Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Duration of hybridization is generally less than about 24 hours, usually about 4 to 12 hours. Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
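The typical ranges stated above can be expressed as a simple check, sketched below. The class and function names are hypothetical, and the sketch treats the "about" values as hard cut-offs, which is a simplifying assumption; it does not model destabilizing agents such as formamide or the duration of hybridization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridizationConditions:
    sodium_molar: float      # Na ion concentration in mol/L
    ph: float
    temperature_c: float     # hybridization temperature in degrees Celsius
    probe_length_nt: int     # probe length in nucleotides


def is_typically_stringent(conditions: HybridizationConditions) -> bool:
    """Check conditions against the typical stringent ranges stated in the text."""
    # Short probes (up to 50 nt) need at least 30 C; longer probes at least 60 C.
    minimum_temperature = 30.0 if conditions.probe_length_nt <= 50 else 60.0
    return (
        conditions.sodium_molar < 1.5
        and 7.0 <= conditions.ph <= 8.3
        and conditions.temperature_c >= minimum_temperature
    )


long_probe_ok = is_typically_stringent(HybridizationConditions(0.5, 7.5, 65.0, 200))
long_probe_too_cool = is_typically_stringent(HybridizationConditions(0.5, 7.5, 42.0, 200))
short_probe_ok = is_typically_stringent(HybridizationConditions(0.5, 7.5, 42.0, 25))
```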
[0092] In another embodiment, the construct further comprises at least one regulatory sequence operably linked to at least one of the light-responsive histidine kinase and the response regulator. In one embodiment, the construct comprises a first regulatory sequence operably linked to the LRHK. In a second embodiment, the construct comprises a second regulatory sequence operably linked to the RR. However, preferably, the construct comprises a single regulatory sequence that is operably linked to both the LRHK and the RR.
[0093] To allow two proteins to be expressed as individual proteins from a single mRNA molecule, ribosomal skipping sequences may be added to the 5' and/or 3' end of the LRHK and/or RR gene. During translation, when the ribosome encounters a ribosomal skipping sequence it is prevented from creating the peptide bond with the last proline in the ribosomal skipping sequence. As a result, translation is stopped, the nascent polypeptide released and translation is re-initiated to produce a second polypeptide. This results in the addition of a C-terminal ribosomal skipping sequence (or the majority of such a sequence) to the first polypeptide chain, and a N-terminal proline to the next polypeptide.
[0094] Accordingly, in a further embodiment, the nucleic acid construct comprises at least one ribosomal skipping sequence.
[0095] In one example, the ribosomal skipping sequence may be selected from one of the following:
TABLE-US-00003
F2A; a 2A DNA sequence variant used between two CDS.
F2A (SEQ ID NO: 36):
GGACAACTTCTCAACTTTGACTTGCTAAAGTTA
GCTGGTGATGTTGAATCTAATCCTGGACCA
[0096] Use of the F2A sequence results in the addition of the F2Aaa1-20 polypeptide sequence to the C-terminus of the protein upstream of the ribosomal skipping site and a proline residue (F2Aaa21) to the downstream protein.
TABLE-US-00004
F2Aaa1-20 (SEQ ID NO: 37): GQLLNFDLLKLAGDVESNPG
F2Aaa21: P
F2A30; a 2A DNA sequence variant used between two CDS.
F2A30 (SEQ ID NO: 38):
CACAAACAGAAAATTGTGGCACCGGTGAAGCAGACTCTC
AACTTTGACTTGCTAAAGTTAGCTGGTGATGTTGAATCT
AATCCTGGACCA
[0097] Use of the F2A30 sequence results in the addition of the F2A30aa1-29 polypeptide sequence to the C-terminus of the protein upstream of the ribosomal skipping site and a proline residue (F2A30aa30) to the downstream protein.
TABLE-US-00005
F2A30aa1-29 (SEQ ID NO: 39): HKQKIVAPVKQTLNFDLLKLAGDVESNPG
F2A30aa30: P
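The outcome of ribosomal skipping described above can be sketched as a string operation on the encoded polyprotein: the chain is divided immediately before the final proline of the 2A peptide, so that the upstream protein keeps the 2A tail and the downstream protein starts with proline. The function name split_at_2a and the toy upstream and downstream sequences are hypothetical; the 2A peptides are taken from SEQ ID NOs: 37 and 39 with their terminal proline appended.

```python
F2A_PEPTIDE = "GQLLNFDLLKLAGDVESNPG" + "P"              # F2Aaa1-20 plus F2Aaa21
F2A30_PEPTIDE = "HKQKIVAPVKQTLNFDLLKLAGDVESNPG" + "P"   # F2A30aa1-29 plus F2A30aa30


def split_at_2a(polyprotein: str, skipping_peptide: str) -> tuple:
    """Return (upstream, downstream) products after skipping at the 2A site."""
    start = polyprotein.find(skipping_peptide)
    if start == -1:
        raise ValueError("skipping peptide not found in polyprotein")
    # No peptide bond is formed with the last proline of the 2A peptide,
    # so the division point is the index of that proline.
    cut = start + len(skipping_peptide) - 1
    return polyprotein[:cut], polyprotein[cut:]


# Toy placeholder proteins, NOT the CcaS or CcaR sequences.
toy_upstream = "MAAAK"
toy_downstream = "MSSSR"
first_product, second_product = split_at_2a(
    toy_upstream + F2A30_PEPTIDE + toy_downstream, F2A30_PEPTIDE
)
```

Here first_product is the toy upstream protein carrying the 29-residue F2A30 tail at its C-terminus, and second_product is the toy downstream protein with an added N-terminal proline.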
[0098] In one embodiment, the LRHK includes a C-terminal skipping sequence, preferably F2A30(aa1-29). The amino acid sequences of CcaS with such a skipping sequence are shown in SEQ ID NOs: 9 and 11, and the corresponding nucleic acid sequences in SEQ ID NOs: 10 and 12. Accordingly, where the nucleic acid construct comprises a single sequence for LRHK and RR, the LRHK preferably comprises a sequence comprising or consisting of SEQ ID NO: 10 or 12.
[0099] In a further embodiment, the RR includes an N-terminal skipping sequence residue, F2A30(aa30), i.e. a proline amino acid residue. The nucleic acid sequences of CcaR comprising such a residue are shown in SEQ ID NOs: 14 and 16, and the corresponding amino acid sequences in SEQ ID NOs: 13 and 15. Accordingly, where the nucleic acid construct comprises a single sequence for LRHK and RR, the RR preferably comprises a sequence comprising or consisting of SEQ ID NO: 14 or 16.
[0100] In a further alternative embodiment, an internal ribosomal entry site (IRES), tRNA sequence, a ribozyme (such as a Hammerhead (HH) ribozyme unit and/or a hepatitis delta virus (HDV) ribozyme unit) or direct repeat (DR) sequence could be used instead of a ribosomal skipping sequence. Again, such sequences may be added to the 5' and/or 3' end of the LRHK and/or RR gene and allow two proteins to be expressed as individual proteins from a single mRNA transcript and from a single regulatory sequence (promoter).
[0101] In a further embodiment, the nucleic acid construct may further comprise a reporter sequence. The reporter sequence may be used as a means to flag cells that have been successfully transformed with the nucleic acid construct. The reporter sequence may also be used as a control to allow quantification of the level of expression of a target gene, expressed concurrently (either on the same or on a different expression vector) as the vector comprising the LRHK and/or the RR. Accordingly, the reporter sequence may be any sequence that can perform this function. As an example, common tags include the fluorescent proteins, such as GFP, EGFP, Emerald, Superfolder GFP, Azami Green, mWasabi, TagGFP, TurboGFP, AcGFP, ZsGreen, T-Sapphire, EBFP, EBFP2, Azurite, mTagBFP, ECFP, mECFP, Cerulean, mTurquoise, CyPet, AmCyan1, Midori-Ishi Cyan, TagCFP, mTFP1, EYFP, Topaz, Venus, mCitrine, YPet, TagYFP, PhiYFP, ZsYellow1, mBanana, Kusabira Orange, Kusabira Orange2, mOrange, mOrange2, dTomato, dTomato-Tandem, TagRFP, TagRFP-T, DsRed, DsRed2, DsRed-Express (T1), DsRed-Monomer, mTangerine, mRuby, mApple, mStrawberry, AsRed2, mRFP1, JRed, mCherry, HcRed1, mRaspberry, dKeima-Tandem, HcRed-Tandem, mPlum and AQ143.
[0102] In a further embodiment, the reporter sequence is operably linked to a regulatory sequence. Preferably the reporter sequence is operably linked to a single regulatory sequence that is also operably linked to the LRHK and/or the RR. As discussed above, the reporter sequence may also comprise 5' or 3' ribosomal skipping sequences, such as one of the skipping sequences described above.
[0103] The term "operably linked" as used throughout refers to a functional linkage between the promoter sequence and the gene of interest, such that the promoter sequence is able to initiate transcription of the gene of interest.
[0104] In a further embodiment, the construct comprises at least one terminator sequence, which marks the end of the operon causing transcription to stop. A suitable terminator sequence would be well known to the skilled person, and may include Rho-dependent and Rho-independent sequences. In one example, the sequence may comprise or consist of SEQ ID NO: 42 and/or 43 or a functional variant thereof.
[0105] In one embodiment, the regulatory sequence is a promoter. According to all aspects of the invention, including the method above and including the plants, methods and uses as described below, the term "regulatory sequence" is used interchangeably herein with "promoter" and all terms are to be taken in a broad context to refer to regulatory nucleic acid sequences capable of effecting expression of the sequences to which they are ligated. The term "regulatory sequence" also encompasses a synthetic fusion molecule or derivative that confers, activates or enhances expression of a nucleic acid molecule in a cell, tissue or organ.
[0106] The term "promoter" typically refers to a nucleic acid control sequence located upstream from the transcriptional start of a gene and which is involved in the binding of RNA polymerase and other proteins, thereby directing transcription of an operably linked nucleic acid. Encompassed by the aforementioned terms are transcriptional regulatory sequences derived from a classical eukaryotic genomic gene (including the TATA box which is required for accurate transcription initiation, with or without a CCAAT box sequence) and additional regulatory elements (i.e. upstream activating sequences, enhancers and silencers) which alter gene expression in response to developmental and/or external stimuli, or in a tissue-specific manner. Also included within the term is a transcriptional regulatory sequence of a classical prokaryotic gene, in which case it may include a -35 box sequence and/or -10 box transcriptional regulatory sequences.
[0107] In a preferred embodiment, the promoter is a constitutive promoter, strong promoter or tissue-specific promoter.
[0108] A "constitutive promoter" refers to a promoter that is transcriptionally active during most, but not necessarily all, phases of growth and development and under most environmental conditions, in at least one cell, tissue or organ. Examples of constitutive promoters include the cauliflower mosaic virus promoter (CaMV35S or 19S), rice actin promoter, maize ubiquitin promoter, polyubiquitin (UBQ10) promoter, rubisco small subunit, maize or alfalfa H3 histone, OCS, SAD1 or 2, GOS2 or any promoter that gives enhanced expression.
[0109] A "strong promoter" refers to a promoter that leads to increased or overexpression of the target gene. Examples of strong promoters include, but are not limited to, CaMV-35S, CaMV-35Somega, Arabidopsis ubiquitin UBQ1, rice ubiquitin, actin, Maize alcohol dehydrogenase 1 promoter (Adh-1), AtPyk10, BdEF1α, FaRB7, HvIDS2, HvPht1.1, LjCCaMK, MtCCaMK, MtIPD3, MtPT1, MtPT2, OsAPX, OsCc1, OsCCaMK, OsCYCLOPS, OsPGD1, OsR1G1B, OsRCc3, OsRS1, OsRS2, OsSCP1, OsUBI3, SbCCaMK, SiCCaMK, TobRB7, ZmCCaMK, ZmEF1α, ZmPIP2.1, ZmRsyn7, ZmTUB1α, ZmTUB2α and ZmUBI.
[0110] Tissue specific promoters are transcriptional control elements that are only active in particular cells or tissues at specific times during plant development.
[0111] For the identification of functionally equivalent promoters, the promoter strength and/or expression pattern of a candidate promoter may be analysed for example by operably linking the promoter to a reporter gene and assaying the expression level and pattern of the reporter gene in various tissues of the plant. Suitable well-known reporter genes are known to the skilled person and include for example beta-glucuronidase or beta-galactosidase.
[0112] In one embodiment, the nucleic acid construct further comprises a target sequence operably linked to a regulatory sequence that is specifically activated by the response regulator. In an alternative embodiment, the regulatory sequence is constitutively active and binding of RR represses the activity of the regulatory sequence. Preferably the regulatory sequence is a promoter, more preferably an inducible promoter. In a preferred embodiment, the promoter comprises a core promoter element (such that the promoter has little or no activity without adjacent or distal activation sequences) and a cis-regulatory element (CRE) (non-variant or variant) recognised by CcaR. In one example, the core promoter element may comprise or consist of a sequence as defined in SEQ ID NO: 41 or a variant thereof and the CRE may comprise or consist of a sequence as defined in SEQ ID NO: 40 or a variant thereof. In a further preferred embodiment, the promoter comprises or consists of the nucleic acid sequence as defined in SEQ ID NO: 17 or a functional variant thereof. In one embodiment, the target sequence may be expressed using a promoter that drives overexpression. Overexpression according to the invention means that the target gene is expressed at a level that is higher than the expression of the endogenous target gene whose expression is driven by its endogenous counterpart.
[0113] As used herein, a "target sequence" may refer to any nucleic acid sequence or gene whose transcription level can be controlled and/or would be of value to control.
[0114] The construct may further comprise a second terminator sequence to define the end of the target sequence operon. A terminator sequence is defined above. Preferably the terminator sequence comprises or consists of SEQ ID NO: 43 or a variant thereof.
[0115] As described in detail below, in use, when the LRHK is exposed to an activating wavelength of light, it phosphorylates the RR, which then binds to its cognate promoter (the regulatory sequence that is specifically recognized by the RR), resulting in transcription of the target sequence.
[0116] In another aspect of the invention, there is provided a vector or expression vector comprising the nucleic acid construct described herein. In one embodiment, the vector backbone is pEAQ.
[0117] In another aspect of the invention there is provided a host cell comprising the nucleic acid construct or the vector. The host cell may be a prokaryotic or eukaryotic cell. Preferably the cell is a mammalian, bacterial or plant cell. Most preferably the cell is a plant cell.
[0118] In another aspect of the invention there is provided a transgenic organism where the transgenic organism expresses the nucleic acid construct or vector. Again, the organism is any prokaryote or eukaryote, but in a preferred embodiment, the organism is a plant.
[0119] In one embodiment, the progeny organism is transiently transformed with the nucleic acid construct or vector. In another embodiment, the progeny organism is stably transformed with the nucleic acid construct described herein and comprises the exogenous polynucleotide which is heritably maintained in at least one cell of the organism. The method may include steps to verify that the construct is stably integrated. Where the organism is a plant, the method may also comprise the additional step of collecting seeds from the selected progeny plant.
[0120] In a further aspect of the invention there is provided a method of producing a transgenic organism as described herein. In a different aspect there is provided a method of producing an organism that is capable of light-regulated expression of a target sequence. In either aspect the method comprises at least the following steps:
[0121] a. selecting a part of the organism;
[0122] b. transfecting at least one cell of the part of the organism of part (a) with the nucleic acid construct or the vector; and
[0123] c. regenerating at least one organism derived from the transfected cell or cells.
[0124] Transformation or transfection methods for generating a transgenic organism of the invention are known in the art. Thus, according to the various aspects of the invention, a nucleic acid construct as defined herein is introduced into an organism and expressed as a transgene. The nucleic acid construct is introduced into said organism through a process called transformation. The term "transfection", "introduction" or "transformation" as referred to herein encompasses the transfer of an exogenous polynucleotide into a host cell, irrespective of the method used for transfer. Such terms can also be used interchangeably in the present context. Where the organism is a plant, tissue capable of subsequent clonal propagation, whether by organogenesis or embryogenesis, may be transformed with a genetic construct of the present invention and a whole plant regenerated there from. The particular tissue chosen will vary depending on the clonal propagation systems available for, and best suited to, the particular species being transformed. Exemplary tissue targets include leaf disks, pollen, embryos, cotyledons, hypocotyls, megagametophytes, callus tissue, existing meristematic tissue (e.g., apical meristem, axillary buds, and root meristems), and induced meristem tissue (e.g., cotyledon meristem and hypocotyl meristem). The polynucleotide may be transiently or stably introduced into a host cell and may be maintained non-integrated, for example, as a plasmid. Alternatively, it may be integrated into the host genome. The resulting transformed plant cell may then be used to regenerate a transformed plant in a manner known to persons skilled in the art.
[0125] Transformation of plants is now a routine technique in many species. Advantageously, any of several transformation methods may be used to introduce the gene of interest into a suitable ancestor cell. The methods described for the transformation of an organism's cells may be utilized for transient or for stable transformation. Transformation methods include the use of liposomes, electroporation, chemicals that increase free DNA uptake, injection of the DNA directly into the plant, particle gun bombardment, transformation using viruses or pollen and microprojection. Methods may be selected from the calcium/polyethylene glycol method for protoplasts, electroporation of protoplasts, microinjection into plant material, DNA or RNA-coated particle bombardment, infection with (non-integrative) viruses and the like. Transgenic plants, including transgenic crop plants, are preferably produced via Agrobacterium tumefaciens mediated transformation.
[0126] To select transformed plants, the plant material obtained in the transformation is subjected to selective conditions so that transformed plants can be distinguished from untransformed plants. For example, the seeds obtained in the above-described manner can be planted and, after an initial growing period, subjected to a suitable selection by spraying. A further possibility is growing the seeds, if appropriate after sterilization, on agar plates using a suitable selection agent so that only the transformed seeds can grow into plants. Alternatively, the transformed plants are screened for the presence of a selectable marker or expression of a constitutively expressed reporter gene, as described above. Following DNA transfer and regeneration, putatively transformed plants may also be evaluated, for instance using Southern blot analysis, for the presence of the gene of interest, copy number and/or genomic organisation. Alternatively or additionally, expression levels of the newly introduced DNA may be monitored using Northern and/or Western blot analysis, both techniques being well known to persons having ordinary skill in the art.
[0127] The generated transformed plants may be propagated by a variety of means, such as by clonal propagation or classical breeding techniques. For example, a first generation (or T1) transformed plant may be selfed and homozygous second-generation (or T2) transformants selected, and the T2 plants may then further be propagated through classical breeding techniques. The generated transformed organisms may take a variety of forms. For example, they may be chimeras of transformed cells and non-transformed cells; clonal transformants (e.g., all cells transformed to contain the expression cassette); grafts of transformed and untransformed tissues (e.g., in plants, a transformed rootstock grafted to an untransformed scion).
[0128] In a further aspect of the invention, there is provided a plant obtained or obtainable by the methods described herein.
[0129] In another aspect of the invention there is provided a method of modulating expression of a target gene in an organism, the method comprising introducing and expressing at least one nucleic acid construct or vector as described herein in an organism, and applying at least one (activating and/or repressing) wavelength of light, wherein preferably the wavelength of light modulates expression of the target gene, as described herein. In one embodiment, the wavelength of light activates or represses activation of a LRHK. As described above, preferably the wavelength of light activates the LRHK causing phosphorylation of RR which then binds to its cognate promoter to drive transcription of the target gene. As such, as used throughout an "activating" wavelength is one that activates LRHK, and preferably causes the expression or increases the expression of target gene (although in alternative embodiments an activating wavelength may decrease expression of a target gene). Similarly, as also used throughout, a "repressing" wavelength of light is one that represses or prevents activation of LRHK, and preferably decreases or prevents the expression of a target gene, although, again in alternative embodiments, the repressing wavelength may increase expression of a target gene.
[0130] Preferably the target gene is operably linked to a regulatory sequence that may be specifically activated by the response regulator, as described above. Even more preferably, the target gene is a transgene (either an exogenous or endogenous transgene) operably linked to the regulatory sequence.
[0131] In one embodiment, the nucleic acid construct comprises a LRHK and a RR operably linked to at least one regulatory sequence, as described herein. Preferably, the construct also comprises a target gene operably linked to a regulatory sequence that may be specifically activated by the response regulator, as also described above.
[0132] In a further embodiment, the method may comprise introducing and expressing a first and second nucleic acid construct, wherein the first nucleic acid construct comprises a LRHK operably linked to a regulatory sequence and the second nucleic acid construct comprises a RR operably linked to a regulatory sequence. In a further preferred embodiment, the method may further comprise introducing a third nucleic acid construct, wherein the third nucleic acid construct comprises a target gene operably linked to a regulatory sequence that may be specifically activated by the response regulator. Alternatively, the target gene and regulatory sequence may be present on the first or second nucleic acid construct.
[0133] As used herein "modulating" may encompass an increase or decrease in expression of a target gene, preferably compared to the level of expression in a control organism. In particular, expression of a target gene may be increased by applying a wavelength of light, preferably a first activating or repressing wavelength of light. Expression of the target gene can then be decreased (or further increased) by applying a second wavelength of light that is different from the first wavelength of light and is applied after the first wavelength of light. This effect can again be reversed by subsequently applying an activating wavelength of light and so on. The result is an "on/off" system to control expression of a target gene. However, the present invention is also capable of more subtlety than a simple "on/off" switch for target gene expression. We have found that different wavelengths of light can stimulate or repress target gene expression to different levels.
[0134] Accordingly, in a further embodiment, the activating light wavelength can be a maximal activating wavelength or an intermediate activating wavelength. In such an example, the maximal activating wavelength results in the highest level of target gene expression--i.e. a level of target gene expression that is higher than the intermediate activating wavelength. Similarly, the intermediate activating wavelength results in expression of the target gene but to a level that is lower than that obtained by applying a maximal activating wavelength. By comparison, the repressing wavelength of light results in no or minimal expression of the target gene.
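By way of non-limiting illustration only, the distinction drawn above between maximal activating, intermediate activating and repressing wavelengths can be sketched in Python. The function name `label_conditions`, the cutoff `repress_fraction` and the example values are hypothetical and are not taken from the disclosure; in particular, the 10% cutoff used for "no or minimal expression" is an assumption made purely for the sketch.

```python
def label_conditions(expression, repress_fraction=0.1):
    """Label each light condition relative to the highest expression level.

    expression: dict mapping a light condition to a measured level of target
        gene expression (all values in the same arbitrary units).
    repress_fraction: hypothetical cutoff; a level at or below this fraction
        of the highest level is labelled 'repressing'.
    """
    top = max(expression.values())
    if top <= 0:
        raise ValueError("at least one condition must show expression")
    labels = {}
    for condition, level in expression.items():
        if level == top:
            labels[condition] = "maximal activating"
        elif level <= repress_fraction * top:
            labels[condition] = "repressing"
        else:
            labels[condition] = "intermediate activating"
    return labels


# Illustrative values only, not measured data
example = {"red": 100.0, "green": 40.0, "blue": 2.0}
print(label_conditions(example))
```

The labels are purely relative: the condition giving the highest level is taken as the maximal activating wavelength, and every other condition is ranked against it.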
[0135] In one embodiment, the level of target gene expression may be relative to a control organism, such as a plant, wherein the control plant does not express the transgene--for example, the plant does not express a nucleic acid construct, as described herein.
[0136] In an alternative embodiment, that may be particularly useful for defining a maximal or intermediate wavelength of light, the level of target gene expression may be relative to the level of gene expression in an organism where the light applied is white light or dark light (as defined below).
[0137] In a preferred embodiment of the methods described herein the organism is grown or cultured in light and/or darkness (darkness as used in this context refers to growth in the absence of light). In other words, the organism may be cultured in normal day and/or night conditions (normal day and/or night conditions for that organism or any experimentally set conditions). Where the organism is a plant, this may mean that the plant is exposed to a suitable day/night cycle. As such, expression of a target gene can be modulated (i.e. increased or decreased as defined herein) by the application of an (activating or repressing) wavelength of light in addition to normal light/dark conditions; this may lead to enriched white light, for example white light enriched with red or blue light. Accordingly, in a further embodiment, the increase or decrease in the level of target gene expression following application of an activating or repressing wavelength may be relative to the level of gene expression when the organism is cultured or grown in light or darkness (without application of an activating or repressing wavelength).
[0138] Accordingly, in a preferred embodiment, the method comprises applying enriched light, preferably enriched white light. In other words, the method comprises growing or culturing the organism in enriched light, preferably enriched white light.
[0139] As used here "white light" may refer to all visible light (for example, light between the wavelengths of 390 nm to 700 nm) or a combination of red, blue and green light as described below.
[0140] As used here "dark light" may refer to non-visible light. For example, dark light may refer to light in the infra-red portion (and beyond) of the spectrum (for example, above 700 nm, more preferably above 750 nm, and even more preferably between 710 and 850 nm) or light in the ultra-violet portion (and beyond) of the spectrum (for example, below 390 nm, more preferably between 10 and 400 nm).
[0141] As used here, "enriched light", preferably enriched white light may comprise a proportion of activating or repressing wavelength of light, wherein said activating or repressing wavelength of light may be as defined below, and wherein the proportion of the activating or repressing wavelength of light is at least 5%, 10%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% of the total light.
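By way of non-limiting illustration only, the proportion of an activating or repressing band in the total light can be estimated from a measured spectrum as sketched below in Python. The function names `band_fraction` and `is_enriched`, the dictionary representation of the spectrum and the example intensities are hypothetical; only the 5% lower bound and the 600 to 750 nm red band are taken from the present description.

```python
def band_fraction(spectrum, band):
    """Return the fraction of total light intensity that falls inside `band`.

    spectrum: dict mapping wavelength in nm to intensity (arbitrary units).
    band: (low_nm, high_nm) tuple, inclusive at both ends.
    """
    total = sum(spectrum.values())
    if total <= 0:
        raise ValueError("spectrum contains no light")
    low, high = band
    in_band = sum(i for wavelength, i in spectrum.items() if low <= wavelength <= high)
    return in_band / total


def is_enriched(spectrum, band, minimum=0.05):
    """True if the band makes up at least `minimum` (default 5%) of the total light."""
    return band_fraction(spectrum, band) >= minimum


# Hypothetical spectrum (illustrative values only): white light with added red
example = {450: 10.0, 530: 10.0, 660: 30.0}
red_band = (600, 750)
print(band_fraction(example, red_band))  # 0.6
print(is_enriched(example, red_band))    # True
```

Any of the other proportions listed above (10%, 20% and so on) can be checked by passing a different `minimum`.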
[0142] Accordingly, modulating target gene expression encompasses both turning on, and optionally turning off expression of a target gene, as well as modulating the level of increase or decrease of target gene expression. As explained above, this latter feature allows the system to exhibit a first level of target gene expression during normal light-dark cycles and a second, different level of target gene expression (that is either higher or lower than the first) following application of a specific light spectrum (such as red, blue or green) that is not found in a normal horticultural environment. As such, the invention allows for the very precise control of levels of target gene expression. Moreover, as the invention depends on the application of light to modulate gene expression, expression of a target gene can also be controlled (i.e. modulated) spatially (e.g. by directing the light source at a specific location on the organism) and temporally (e.g. by applying an activating or repressing wavelength at any point during the growth or life cycle of an organism).
[0143] As used throughout "increase", "higher" or "activate" (such terms may be used interchangeably) may mean an increase in target gene expression of at least 5%, 10%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% or more compared to a control as described above. Similarly, as also used throughout, "further increasing" the expression of a target gene in response to the application of a second wavelength of light may mean an increase in target gene expression of at least 5%, 10%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% or more compared to the level of gene expression following application of the first wavelength of light.
[0144] As also used throughout, "decrease" or "repress" (such terms may also be used interchangeably) may mean a decrease in target gene expression of at least 5%, 10%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% or more compared to a control as described above. Alternatively, such a decrease may be relative to the level of gene expression following application of the first wavelength of light.
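By way of non-limiting illustration only, the percentage definitions of "increase" and "decrease" given above can be expressed as the following Python sketch. The function names and the label "unchanged" are hypothetical; the reference level passed as `control` may be either a control organism or the level following application of the first wavelength of light, as described above.

```python
def percent_change(level, control):
    """Percent change of `level` relative to `control` (positive means increase)."""
    if control <= 0:
        raise ValueError("reference expression level must be positive")
    return (level - control) / control * 100.0


def classify_change(level, control, threshold=5.0):
    """Label a change as 'increase', 'decrease' or 'unchanged'.

    threshold: minimum percent change; 5% is the lowest value listed in the text.
    """
    change = percent_change(level, control)
    if change >= threshold:
        return "increase"
    if change <= -threshold:
        return "decrease"
    return "unchanged"


print(percent_change(150.0, 100.0))   # 50.0
print(classify_change(40.0, 100.0))   # decrease
```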
[0145] In one embodiment, the activating wavelength of light may fall within one of the following ranges 430-495 nm (blue light), 495 to 570 nm (green light), 600 to 750 nm (red light). Alternatively the wavelength may be described as dark light (as described above) or white light (as described above). In another embodiment, the activating wavelength of light may comprise white light, as described above, supplemented or enriched with a specific wavelength of light, for example, blue, green or red light. This latter option may be particularly valuable where the organism is a plant, and wherein the plant requires white light for growth, but can tolerate an additional specific light wavelength, such as blue or red light with minimal physiological effects.
[0146] In a further embodiment, the maximally activating wavelength of light preferably falls within the range of 600 to 750 nm (red light). In an alternative embodiment, the intermediate activating wavelength preferably falls within the range 390 nm to 700 nm (white light) or 495 to 570 nm (green light).
[0147] In an alternative embodiment, the repressing wavelength of light may fall within one of the following ranges, 430-495 nm (blue light), 495 to 570 nm (green light) and 600 to 750 nm (red light). Alternatively the light may be white light, as defined above or dark light. In a preferred embodiment, the repressing wavelength of light falls within the range 430-495 nm (blue light). In another embodiment, the repressing wavelength of light may comprise white light, as described above, supplemented or enriched with a specific wavelength of light, for example, blue, green or red light.
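By way of non-limiting illustration only, the exemplary colour ranges recited above can be applied to a given wavelength as sketched below in Python. The names `BANDS_NM` and `colour_bands` are hypothetical. Because the blue and green ranges as recited share the 495 nm boundary, the sketch returns every range that contains the wavelength rather than a single colour.

```python
# Exemplary ranges as recited in the description (nm, inclusive)
BANDS_NM = {
    "blue": (430, 495),
    "green": (495, 570),
    "red": (600, 750),
}


def colour_bands(wavelength_nm):
    """Return the names of all recited ranges that contain the wavelength."""
    return [
        name
        for name, (low, high) in BANDS_NM.items()
        if low <= wavelength_nm <= high
    ]


print(colour_bands(460))  # ['blue']
print(colour_bands(495))  # ['blue', 'green']
print(colour_bands(580))  # []
```

A wavelength such as 580 nm falls between the recited green and red ranges and therefore returns an empty list.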
[0148] In one embodiment, the activating or repressing wavelength of light is applied for sufficient time to modulate target gene expression as described above. Depending on the system and organism, the length of time could be seconds, minutes, hours or days. In one example, the light may be applied for at least 6 hours, more preferably at least 12 hours and even more preferably at least 18 hours.
[0149] It would be clear to the skilled person that other wavelengths of light, both in the visible and non-visible spectrum, and/or falling within the ranges described above, would be possible. The above ranges are intended as examples only.
[0150] In one embodiment, the light is applied using a light source having a desired wavelength as described above. Suitable light sources would be known to the skilled person, but may be one or more of a suitable LED, laser, white light source and the like.
[0151] In one example, the organism is cultured or grown for at least 1 hour, preferably at least 2, 6, 12 or 24 hours, or at least 2 or 7 days, before an activating and/or repressing wavelength of light is applied.
[0152] In one embodiment, the activating and/or repressing wavelength of light is preferably applied to an outer or external surface of the organism. Where the organism is a plant, this surface is preferably at least one leaf and/or at least one root and/or at least one shoot or stem.
[0153] In a further aspect of the invention, there is provided a method of modulating any biochemical pathway or response or biological process in a target organism, the method comprising introducing and expressing at least one nucleic acid construct or vector as described herein, and applying a (activating or repressing) wavelength of light, as described above. In one embodiment, the biochemical pathway is a developmental pathway or physiological response. Where the organism is a plant, the method may be used, for example, to modulate the concentration of phytohormones to modulate developmental traits such as organ size and plant architecture, to modulate flowering (i.e. prevent or induce flowering, including for purposes of synchronization), modulate germination (for example, prevent or induce germination, including for purposes of synchronization), modulate senescence (for example to prevent senescence in food products for increased shelf-life), modulate a stress response (for example, induce a drought stress response or produce drought stress tolerance) or modulate plant immunity (e.g. increase or decrease immunity to a plant pathogen or parasite). Alternatively, the method may be used to control expression or production of a natural or synthetic metabolite such as a pharmaceutical.
[0154] In a further aspect of the invention, there is provided the use of the nucleic acid or vector as described herein to modulate expression of a target gene.
[0155] In another aspect of the invention, there is provided a photoreceptor molecule, wherein the photoreceptor comprises a phytochrome or phytochrome-related photoreceptor protein and a chromophore. In one embodiment, the phytochrome-related photoreceptor is CcaS, as described herein. In one example, the chromophore is a tetrapyrrole. In one embodiment, the tetrapyrrole is selected from PCB (phycocyanobilin), PΦB (phytochromobilin), phycoviolobilin or phycoerythrin and BV (biliverdin). Similarly, there is also provided the use of a photoreceptor molecule as described herein to modulate any biochemical pathway or response or biological process in a target organism.
[0156] In a further embodiment, the nucleic acid constructs described above may further comprise at least one biosynthetic enzyme necessary to produce a chromophore, as described above, preferably from heme. In one example, the biosynthetic enzyme may be heme oxygenase and/or oxidoreductase, such as heme oxygenase 1 (ho1) and phycocyanobilin:ferredoxin oxidoreductase (pcyA).
[0157] In a further aspect of the invention, there is provided a nucleic acid construct comprising a target sequence operably linked to a regulatory sequence, wherein the regulatory sequence is specifically activated by the response regulator. In one embodiment, the regulatory sequence comprises or consists of a nucleic acid sequence as defined in SEQ ID NO: 17 or a functional variant thereof. A functional variant is defined above.
[0158] In a final aspect of the invention, there is provided a nucleic acid molecule comprising
[0159] a. a nucleic acid sequence encoding a polypeptide as defined in any of SEQ ID NOs 1, 3, 5, 7, 9, 11, 13 and 15;
[0160] b. a nucleic acid sequence as defined in any of SEQ ID NOs 2, 4, 6, 8, 10, 12, 14, 16, 17, 47, 48, 49 or 50 or the complementary sequence thereof;
[0161] c. a nucleic acid with at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% overall sequence identity to the nucleic acid sequence of (a) or (b); or
[0162] d. a nucleic acid sequence that is capable of hybridising under stringent conditions as defined herein to the nucleic acid sequence of any of (a) to (c).
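By way of non-limiting illustration only, a percent identity threshold such as that recited in item (c) can be checked as sketched below in Python. The sketch assumes that the two sequences have already been aligned to equal length by a separate alignment tool, with gaps written as "-"; the function names and the convention that gap columns count as mismatches are assumptions made for the sketch and are not a definition of "overall sequence identity" as used herein.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Gap characters ('-') never count as matches. Columns in which both
    sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a, aligned_b, minimum=75.0):
    """True if the aligned pair reaches the minimum percent identity."""
    return percent_identity(aligned_a, aligned_b) >= minimum


# Toy sequences for illustration only (not SEQ ID NOs from the disclosure)
print(percent_identity("ATGCATGC", "ATG-ATGA"))  # 75.0
```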
[0163] The term "organism" as used herein refers to any prokaryotic or eukaryotic organism. Some examples of eukaryotes include a human, a non-human primate/mammal, a livestock animal (e.g. cattle, horse, pig, sheep, goat, chicken, camel, donkey, cat, and dog), a mammalian model organism (mouse, rat, hamster, guinea pig, rabbit or other rodents), an amphibian (e.g., Xenopus), a fish, an insect (e.g. Drosophila), a nematode (e.g., C. elegans), a plant, an alga and a fungus. Examples of prokaryotes include bacteria (e.g. cyanobacteria) and archaea.
[0164] The term "plant" as used herein may refer to any plant. For example, the plant may be a monocot or dicot. Preferably, the plant is a crop plant. By crop plant is meant any plant which is grown on a commercial scale for human or animal consumption or use. In a preferred embodiment, the plant is a cereal. In another embodiment the plant is Arabidopsis or Medicago truncatula. In another example, the plant may be N. benthamiana.
[0165] The term "plant" as used herein encompasses whole plants, ancestors and progeny of the plants and plant parts, including seeds, fruit, shoots, stems, leaves, roots (including tubers), flowers, tissues and organs, wherein each of the aforementioned comprise the nucleic acid construct as described herein. The term "plant" also encompasses plant cells, suspension cultures, callus tissue, embryos, meristematic regions, gametophytes, sporophytes, pollen and microspores, again wherein each of the aforementioned comprises the nucleic acid construct.
[0166] The invention also extends to harvestable parts of a plant of the invention as described herein, including, but not limited to, seeds, leaves, fruits, flowers, stems, roots, rhizomes, tubers and bulbs. The aspects of the invention also extend to products derived, preferably directly derived, from a harvestable part of such a plant, such as dry pellets or powders, oil, fat and fatty acids, starch or proteins. Another product that may be derived from the harvestable parts of the plant of the invention is biodiesel. The invention also relates to food products and food supplements comprising the plant of the invention or parts thereof. In one embodiment, the food products may be animal feed. In another aspect of the invention, there is provided a product derived from a plant as described herein or from a part thereof.
[0167] In a most preferred embodiment, the plant part or harvestable product is a seed or grain. Therefore, in a further aspect of the invention, there is provided a seed produced from a transgenic or genetically altered plant as described herein.
[0168] In an alternative embodiment, the plant part is pollen, a propagule or progeny of the genetically altered plant described herein. Accordingly, in a further aspect of the invention there is provided pollen, a propagule or progeny produced from a transgenic or genetically altered plant as described herein.
[0169] A control organism, such as a plant, as used herein according to all of the aspects of the invention, is an organism that has not been modified according to the methods of the invention.
[0170] While the foregoing disclosure provides a general description of the subject matter encompassed within the scope of the present invention, including methods, as well as the best mode thereof, of making and using this invention, the following examples are provided to further enable those skilled in the art to practice this invention and to provide a complete written description thereof. However, those skilled in the art will appreciate that the specifics of these examples should not be read as limiting on the invention, the scope of which should be apprehended from the claims and equivalents thereof appended to this disclosure. Various further aspects and embodiments of the present invention will be apparent to those skilled in the art in view of the present disclosure.
[0171] "and/or" where used herein is to be taken as specific disclosure of each of the two specified features or components with or without the other. For example "A and/or B" is to be taken as specific disclosure of each of (i) A, (ii) B and (iii) A and B, just as if each is set out individually herein.
[0172] Unless context dictates otherwise, the descriptions and definitions of the features set out above are not limited to any particular aspect or embodiment of the invention and apply equally to all aspects and embodiments which are described.
[0173] The foregoing application, and all documents and sequence accession numbers cited therein or during their prosecution ("appln cited documents") and all documents cited or referenced in the appln cited documents, and all documents cited or referenced herein ("herein cited documents"), and all documents cited or referenced in herein cited documents, together with any manufacturer's instructions, descriptions, product specifications, and product sheets for any products mentioned herein or in any document incorporated by reference herein, are hereby incorporated herein by reference, and may be employed in the practice of the invention. More specifically, all referenced documents are incorporated by reference to the same extent as if each individual document was specifically and individually indicated to be incorporated by reference.
[0174] The invention is now described in the following non-limiting example.
Example 1
The CcaS-CcaR System
[0175] The CcaS-CcaR system is a green/red photoswitchable two-component system derived from Synechocystis PCC6803 and consists of a light-responsive histidine kinase (LRHK), CcaS, and its cognate response regulator (RR), CcaR. CcaS is a membrane-associated cyanobacteriochrome which covalently binds a linear tetrapyrrole molecule, phycocyanobilin (PCB), to a conserved cysteine residue in its GAF domain. This allows for reversible photoactivation of CcaS with maximal activation in response to green light (~535 nm) and maximal repression by red light (~672 nm). Activating light wavelengths trigger CcaS to phosphorylate and activate CcaR, which then binds a cognate DNA recognition element, the cis-regulatory element (CRE), and promotes transcription of target gene(s) in cis.
Characterization of the Chromophore Dependency of the CcaS-CcaR System by Heterologous Expression in E. coli
[0176] In plants, the native chromophore for CcaS, PCB, is not produced, but the near identical chromophore, phytochromobilin (PΦB) is. We therefore set out to test if the CcaS-CcaR system would photoswitch in E. coli with PΦB.
[0177] The CcaS-CcaR system, in E. coli, is designed as a two-vector system. From one vector, CcaS is synthesized along with the two proteins, HO1 and PCYA, which produce the chromophore PCB from heme. From the second vector, CcaR is produced. The second vector also holds a sfgfp gene under the control of the P.sub.cpcg2-172 promoter. To produce PΦB instead of PCB, we replaced the pcyA gene with the gene encoding the PΦB synthase from Arabidopsis, lacking a transit peptide (mHY2), as described by Mukougawa et al. (2006).sup.4. We also characterized the photoswitching of the system in the presence of the precursor molecule for PCB and PΦB, biliverdin (BV), and in the absence of any chromophore (O). In order to test this, we introduced stop mutations in pcyA and ho1, respectively.
[0178] Photoswitching Assay in E. coli. In order to examine the behaviour of the CcaS-CcaR system and its variants in E. coli, cells expressing the systems are cultured in defined light regimes and then tested for GFP fluorescence in a fluorimeter. GFP fluorescence serves as a reporter of photoactivation of CcaS and successful signal transduction through CcaR. An example of data from such an experiment is shown in the accompanying figures.
[0179] With PCB the CcaS-CcaR system is activated by a red-green-blue light mixture simulating white light (RGB-white), blue light, and green light and shows low activity in red light and in darkness. With PΦB, the system appears to be constitutively active under all tested light conditions. Only subtle changes in activity are observed in response to the different light regimes. With BV, the system is inactivated by RGB-white and blue light treatment. Low activity is observed under green and red light conditions and in darkness. Without the chromophore, the system is inactivated by RGB-white and blue light treatment and the system only shows very low activity under green and red light conditions and in darkness (FIG. 3).
Repurposing the CcaS-CcaR System for in Planta Function by Engineering in E. coli
[0180] We made several modifications to the CcaS-CcaR system for the purpose of creating a system that would function in plants. We tested some of these modifications in E. coli to confirm that the photoswitching function was not compromised. We also tested certain modifications in planta (described below).
[0181] Modifications to CcaS
[0182] a. Improved the photoswitching of CcaS with PΦB
[0183] b. Released CcaS from the cell membrane by removing its membrane anchor via an N-terminal deletion of 66 bases.
[0184] c. Added an N-terminal nuclear localization signal (NLS) to CcaS
[0185] d. Confirmed that peptide tails added by ribosomal skipping sequences were tolerated by CcaS
Improving Photoswitching of CcaS with PΦB
[0186] We first set out to adapt CcaS for improved photoswitching with PΦB by site-directed mutagenesis of residues in the chromophore binding pocket. By comparing sequences for proteins that utilize either phycoviolobilin (PVB), PCB, PΦB or BV as chromophores, including four cyanobacteriochromes (TePixJ, FdRcaE, SyCcaS and SyCph1), two bacteriophytochromes (PsBphP and DrBphP) and two plant phytochromes (AtPhyA and AtPhyB), we identified candidate amino acid residues that could be mutated in order to improve CcaS photoswitching with PΦB. The following 8 single amino acid residue mutations were created by site-directed mutagenesis of CcaS: L80M, I84F, A92V, I104Y, V113D, F114I, L142H and F149M. The A92V mutation improved CcaS photoswitching with PΦB but also altered the photochemical properties of the protein with respect to blue light and red light (FIG. 4). CcaS with the A92V mutation is from hereon referred to as CcaS(A92V). Rather than being activated by blue light and RGB-white light and repressed by red light, the CcaS(A92V) with PΦB system is repressed by blue light and RGB-white light and activated by red light. The low activity in RGB-white light might be a result of the blue light response being dominant.
Removing the Transmembrane Domain of CcaS to Make it Soluble and Adding an N-Terminal Nuclear Localization Signal
[0187] In order to release CcaS(A92V) from the cell membrane, bioinformatics software (Phobius and TMHMM-2.0) was used to predict the transmembrane domain (TMD). Phobius predicted the TMD to be encoded by bases 16-69 or 16-87 and TMHMM-2.0 predicted 13-69. A truncation was made, removing bases 4-69 in ccaS (corresponds to a G2_H23del in CcaS, referred to as Δ22). Δ22 was not well tolerated by CcaS. However, when removing bases 1-69 in ccaS (corresponds to an M1_H23del in CcaS, referred to as Δ23) and replacing them with an NLS sequence, the photoswitching properties were restored (FIG. 5).
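By way of non-limiting illustration only, the correspondence between the deleted base ranges in ccaS and the deleted residues in CcaS stated above can be checked with the following Python sketch. The function names are hypothetical; the sketch assumes that base 1 is the first base of the start codon and that the reading frame is uninterrupted.

```python
def residues_for_bases(first_base, last_base):
    """Map a 1-indexed, inclusive base range of a coding sequence to the
    1-indexed, inclusive range of codons (residues) that it touches."""
    if first_base < 1 or last_base < first_base:
        raise ValueError("invalid base range")
    first_residue = (first_base - 1) // 3 + 1
    last_residue = (last_base - 1) // 3 + 1
    return first_residue, last_residue


def residues_removed(first_base, last_base):
    """Number of residues touched by the base range."""
    first, last = residues_for_bases(first_base, last_base)
    return last - first + 1


# Bases 4-69: residues 2 to 23, i.e. 22 residues (G2_H23del)
print(residues_for_bases(4, 69), residues_removed(4, 69))  # (2, 23) 22
# Bases 1-69: residues 1 to 23, i.e. 23 residues (M1_H23del)
print(residues_for_bases(1, 69), residues_removed(1, 69))  # (1, 23) 23
```

The same mapping places the predicted transmembrane domain at residues 6 to 23 or 6 to 29 (bases 16-69 or 16-87) and at residues 5 to 23 (bases 13-69), so both truncations remove the shorter predicted domains in full.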
Testing the Effects of 2A Peptide Tails on CcaS Functionality
[0188] Ribosomal skipping is a technology used to express multiple proteins from a single mRNA in eukaryotes and can therefore be used to minimize the size of an expression vector, because fewer promoter and terminator sequences are required. We wished to explore if this technology was compatible with our system. During translation, a 2A sequence will cause translation to stop, release the nascent peptide chain and reinitiate translation to produce a second peptide chain. During this process, a peptide tail comprising the majority of the 2A ribosomal skipping sequence is added to the C-terminus of the upstream protein while a single proline is added to the N-terminus of the downstream protein. In order to test whether the addition of 2A peptide tails could affect CcaS function, we tested CcaS with three peptide tails, corresponding to the 2A sequences P2A, F2A and F2A.sub.30, in E. coli photoswitching assays (Table 4). As 2A sequences are not functional in E. coli, the sequences encoding the 2A tails were added to the 3' end of the tested CcaS variant (MM:NLS:CcaS (Δ23 A92V)). The F2A tail was not well tolerated, but both the P2A and the F2A30 sequence were tolerated well (FIG. 6).
Repurposing the CcaS-CcaR System for in Planta Function by Engineering in Tobacco
[0189] For the system to function in planta, we had to make a plant expression vector and several further modifications to the system.
[0190] Further modifications to CcaS
[0191] ccaS was codon optimized for expression in Arabidopsis.
[0192] Further modifications to CcaR
[0193] C-terminal NLS signal added to CcaR
[0194] VP64 eukaryotic transactivation domain added to CcaR
[0195] ccaR was codon optimized for expression in Arabidopsis.
[0196] Constructed a synthetic cognate promoter for CcaR or `upstream activation sequence` (UAS) consisting of three copies of a CcaR recognition element fused to a minimal CaMV 35S promoter sequence.
[0197] Added a GFP variant (NLS:Venus) as a fluorescence output reporter for light induced gene expression for the new system.
[0198] Added a GFP homolog (NLS:TagRFP) as a normalization control for expression of the system in plants.
[0199] Added ribosomal skipping sequences (e.g. F2A.sub.30) between ccaS and tagrfp and between tagrfp and ccaR in order to express all three system components from the same promoter-terminator cassette.
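By way of non-limiting illustration only, the order of parts implied by the modifications listed above can be sketched in Python using part labels rather than sequences. All labels (for example "CcaR_recognition_element" and "min35S") and function names are hypothetical placeholders and do not correspond to any SEQ ID NO of the disclosure.

```python
def build_uas(element="CcaR_recognition_element", copies=3, minimal_promoter="min35S"):
    """Tandem copies of a recognition element followed by a minimal promoter."""
    if copies < 1:
        raise ValueError("at least one copy of the element is required")
    return [element] * copies + [minimal_promoter]


def build_polycistron(orfs, skip="F2A30"):
    """Interleave open reading frame labels with a ribosomal skipping label."""
    parts = []
    for index, orf in enumerate(orfs):
        if index > 0:
            parts.append(skip)
        parts.append(orf)
    return parts


print(build_uas())
# ['CcaR_recognition_element', 'CcaR_recognition_element',
#  'CcaR_recognition_element', 'min35S']
print(build_polycistron(["ccaS", "tagrfp", "ccaR"]))
# ['ccaS', 'F2A30', 'tagrfp', 'F2A30', 'ccaR']
```

Placing a skipping sequence between each pair of open reading frames allows the three components to share one promoter-terminator cassette.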
Design of the Plant Expression Vector
[0200] To express and test variants of the Highlighter system in planta, we designed plant expression vectors with an input cassette and an output cassette. In principle, the input cassette expresses the proteins required for the Highlighter system to control expression of a target gene (Target) in planta via the output cassette. The input cassette was designed for constitutive expression of three proteins: a light-responsive histidine kinase (a CcaS variant), a reporter protein (TagRFP) and a response regulator (a CcaR variant). The output cassette was designed with a synthetic cognate promoter (P.sub.RR) to which the response regulator can bind in order to induce target gene expression in planta (FIG. 7).
The Vector Backbone Used to Create Our Plant Expression Vector
[0201] The vector backbone used to build our plant expression vector was obtained from collaborators at the DynaMo Center (University of Copenhagen, Associate Professor Meike Burow). The vector is based on pEAQ-HT, but the region between the RB and LB has been replaced with a cassette containing P.sub.UBQ10, a USER cassette and T.sub.rbcS.
Designing the Output Cassette: A Light-Controlled Gene Expression Cassette
[0202] The output cassette for the Highlighter system was designed as a Gateway cassette (to allow for easy exchange of the expressed gene), with the sequence of the cognate promoter for the RR upstream of the cassette and a T.sub.NOS sequence downstream. For our initial test, we decided to use NLS:Venus (NLS:edAFPt9) as the reporter to evaluate the light-induced gene expression.
Designing a Synthetic Plant Promoter and Cognate Transcription Activator
[0203] A synthetic plant promoter and transcription activator were designed for the Highlighter system, based on the idea behind the estrogen-inducible XVE system.sup.5. The XVE system is composed of a chimeric transcription activator, XVE (a fusion of the DNA-binding domain of the bacterial repressor LexA (X), the acidic transactivating domain of VP16 (V) and the regulatory region of the human estrogen receptor (E)), and its cognate promoter, which consists of eight copies of the LexA operator fused upstream of the -46 35S minimal promoter. In the presence of estrogen, XVE binds its cognate promoter and the downstream gene is transcribed.
[0204] Our synthetic promoter design consists of three copies of the ccaR CRE fused upstream of the -51 35S minimal promoter (FIG. 8). Inspired by the work of Qilai Huang et al..sup.6, we mimicked their construct 191 so that the ccaR CREs were spaced evenly around the DNA helix, offset at 120.degree. angles. This design was chosen as it effectively recruited transcription machinery components to the TATA box in eukaryotic HEK293T cells to form the transcription initiation complex.
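The relationship between the spacing of binding sites and their angular offset around the DNA helix can be sketched as follows. This is an illustration only: it assumes B-form DNA with about 10.5 base pairs per helical turn (a textbook value that is not stated in the text), and the spacings used are hypothetical, since the actual distances between the ccaR CREs are not given here.

```python
BP_PER_TURN = 10.5  # assumed B-DNA helical repeat; not stated in the text


def helical_phase_deg(spacing_bp: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Rotation around the helix axis (degrees, 0 to 360) between two sites
    whose centers are spacing_bp base pairs apart."""
    return (spacing_bp / bp_per_turn * 360.0) % 360.0


# Hypothetical center-to-center spacings, chosen for illustration only.
for spacing in (21.0, 24.5, 28.0):
    print(spacing, round(helical_phase_deg(spacing), 1))
```

Under this assumption, sites spaced by a whole number of turns face the same side of the helix, whereas adding one third of a turn (3.5 bp) between consecutive sites offsets them by 120.degree..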
Designing the Input Cassette: An Expression Cassette for the LRHK and the RR
[0205] To keep the size of the expression vector to a minimum and to attempt to balance expression of the LRHK and RR, both LRHK and RR variants, along with an expression reporter (TagRFP), were expressed from a single cassette controlled by P.sub.UBQ10 and T.sub.rbcS. To allow the three proteins to be expressed as individual proteins from one mRNA, F2A.sub.30 ribosomal skipping sequences were included between ccaS and tagrfp and between tagrfp and ccaR. Because TagRFP will be constitutively expressed from the input cassette, we can quantify the induction of a fluorescent Target (e.g. NLS:Venus) ratiometrically by dividing the YFP signal by the RFP signal. The TagRFP also serves as a reporter for cells expressing the Highlighter system.
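The layout of the input cassette described above can be written down as an ordered list of parts. The following is a minimal sketch for illustration; the data structure, part labels and function names are hypothetical and do not correspond to any cloning software or to the actual vector files.

```python
# Ordered parts of the input cassette, as described in the text.
INPUT_CASSETTE = [
    ("promoter", "P_UBQ10"),
    ("cds", "ccaS"),
    ("2A", "F2A30"),
    ("cds", "tagrfp"),
    ("2A", "F2A30"),
    ("cds", "ccaR"),
    ("terminator", "T_rbcS"),
]


def expected_proteins(cassette):
    """Coding sequences that should yield separate proteins, assuming
    every 2A sequence skips with full efficiency."""
    return [name for kind, name in cassette if kind == "cds"]


def is_single_transcription_unit(cassette):
    """True if one promoter leads and one terminator closes the cassette."""
    kinds = [kind for kind, _ in cassette]
    return (
        kinds[0] == "promoter"
        and kinds[-1] == "terminator"
        and kinds.count("promoter") == 1
        and kinds.count("terminator") == 1
    )


print(expected_proteins(INPUT_CASSETTE))
print(is_single_transcription_unit(INPUT_CASSETTE))
```

Three coding sequences separated by two 2A sequences share one promoter and one terminator, which is what keeps the vector small.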
Testing the Efficiency of Ribosomal Skipping of 2A Sequences in Planta (Transient Expression in Tobacco)
[0206] We tested the efficiency of ribosomal skipping of `2A-type` sequences in planta by transient expression in N. benthamiana (Tobacco). To evaluate the skipping efficiency of the p2a, f2a and f2a.sub.30 sequences, tagrfp was connected to the 3' end of the LRHK gene, encoding MM:NLS:CcaS(.DELTA.23 A92V), via the three different 2A sequences and expressed from the P.sub.UBQ-T.sub.rbcS cassette. With perfect skipping, the TagRFP fluorescence should not be limited to the nucleus. With failed skipping, TagRFP would be fused with MM:NLS:CcaS(.DELTA.23 A92V) and localized to the nucleus. As theoretical controls for perfect ribosomal skipping and complete failure of skipping, TagRFP and NLS:TagRFP, respectively, were expressed from the P.sub.UBQ-T.sub.rbcS cassette. All three 2A sequences worked with high efficiency in planta (FIG. 9). The F2A.sub.30 sequence was selected for further experiments.
Testing the Highlighter System in Planta
Photoswitching of the Highlighter System(s) in Response to Green Light, Blue Light and Darkness
[0207] The Highlighter system was tested by transient transfection of Tobacco leaves. Agrobacterium tumefaciens (Agrobacterium), transformed with variants of the Highlighter system, was used to infiltrate Tobacco leaves. The leaves were left to express the Highlighter system for .about.2 days in the greenhouse before they received light treatments (blue light, green light or darkness) for a minimum of 18 hours (FIG. 10). For the light treatment, the leaves were cut off the plant and kept in a humid environment inside plastic containers.
[0208] Light-controlled induction of YFP expression was evaluated by confocal imaging: the mean YFP fluorescence intensity in the plant cell nuclei was divided by the mean RFP fluorescence intensity in the same nuclei. As the YFP expression is inducible and the TagRFP expression is constitutive, a low ratio between the two signals can be interpreted as low target gene expression and a high ratio can be interpreted as high target gene expression.
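The ratiometric readout described above amounts to a simple calculation, sketched here for illustration. The function names and the intensity values are hypothetical (the numbers are made up and are not experimental data); the sketch assumes that nuclear intensities have already been extracted from the confocal images.

```python
from statistics import mean


def yfp_rfp_ratio(yfp_intensities, rfp_intensities):
    """Mean YFP nuclear intensity divided by mean RFP nuclear intensity."""
    rfp_mean = mean(rfp_intensities)
    if rfp_mean <= 0:
        raise ValueError("mean RFP intensity must be positive")
    return mean(yfp_intensities) / rfp_mean


def fold_change(ratio_a, ratio_b):
    """Fold-change in normalized target expression between two treatments."""
    return ratio_a / ratio_b


# Made-up nuclear intensities (arbitrary units), for illustration only.
dark = yfp_rfp_ratio([400, 440, 360], [200, 210, 190])
blue = yfp_rfp_ratio([50, 60, 40], [200, 190, 210])
print(dark, blue, fold_change(dark, blue))
```

Because the constitutive TagRFP signal is the denominator, differences in transformation efficiency or expression level between nuclei largely cancel out, so the ratio reflects induction of the target rather than overall construct abundance.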
[0209] Four variants of the Highlighter system were tested: Highlighter 209, Highlighter 210, Highlighter 213 and Highlighter 214. These systems test the importance of the A92V mutation (systems 209 and 213 have the A92V mutation, whereas 210 and 214 do not) and whether it is better to add the NLS and VP64 domain to the N- or the C-terminus of CcaR (systems 209 and 210 are N-terminal fusions and 213 and 214 are C-terminal fusions).
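The four variants form a two-by-two design, which can be captured in a small lookup table. This is only an illustrative sketch; the dictionary keys and the helper function are hypothetical names introduced here.

```python
# Two-by-two design of the tested variants, as described in the text.
HIGHLIGHTER_VARIANTS = {
    209: {"a92v": True, "ccar_fusion": "N-terminal"},
    210: {"a92v": False, "ccar_fusion": "N-terminal"},
    213: {"a92v": True, "ccar_fusion": "C-terminal"},
    214: {"a92v": False, "ccar_fusion": "C-terminal"},
}


def select_variants(a92v=None, ccar_fusion=None):
    """Variant numbers matching the given features (None means 'any')."""
    return sorted(
        number
        for number, features in HIGHLIGHTER_VARIANTS.items()
        if (a92v is None or features["a92v"] == a92v)
        and (ccar_fusion is None or features["ccar_fusion"] == ccar_fusion)
    )


print(select_variants(a92v=True))
print(select_variants(ccar_fusion="C-terminal"))
```

Comparing variants that differ in only one feature (for example 213 and 214) isolates the effect of that feature.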
[0210] The results revealed that, for all constructs, blue light treatment reduced target gene expression compared to the green light treatment and the dark treatment. The largest fold-change in expression between light treatments was observed for Highlighter 213 and 214, where the VP64 domain and NLS are fused to the C-terminus of CcaR (FIG. 11).
Second Test--RGB-White, Blue, Green, Red and Darkness
[0211] Next we evaluated the Highlighter systems 213 and 214 under more light regimes, this time including red light and RGB-white light. During expression of the system, while the leaves were still attached to the plant, the plants were grown in continuous blue light (FIG. 12).
[0212] In this experiment we included an NLS:Venus-only control and an NLS:TagRFP-only control. These two controls approximate the maximum (NLS:Venus only) and minimum (NLS:TagRFP only) ratios that can be achieved using our imaging system under the current experimental conditions and analysis methods. The systems, Highlighter 213 and Highlighter 214, were tested in duplicate.
[0213] In general, the systems are inactive under blue light conditions, intermediately active under green light and RGB-white light conditions and fully active under red light conditions and in the dark. The Highlighter system having the A92V mutation, Highlighter 213, exhibits broadly lower expression of the NLS:Venus target in the various light treatment regimes along with higher fold-change in expression between light treatments.
Potential Applications for the Highlighter System
[0214] There is great demand for a chemical-free, minimally invasive system for controlling target gene expression in plants. Such a tool would be of great value to both fundamental laboratory research and horticultural systems. With the Highlighter system we have accomplished this and demonstrated its effectiveness in directing target gene expression in the plant host N. benthamiana. We will now continue to demonstrate its function in other model systems, including Arabidopsis thaliana and Medicago truncatula.
[0215] In plants, the availability of optogenetic tools is presently limited, and Highlighter represents a major improvement over current technologies (e.g. cell-type specific promoters or chemical induction systems). Combined with laser-based light sources that offer high spatial and temporal resolution, the Highlighter system will enable research biologists to direct gene expression with unprecedented precision. Furthermore, light can be employed as a benign and low-cost regulator of gene expression, making it better suited than plant growth regulatory chemicals for directing developmental and physiological changes in crop plants.
Applications for the Highlighter System in Fundamental Research
[0216] Plant hosts, and potentially other eukaryotic hosts, expressing Highlighter can be reversibly directed to lower expression levels of a target gene using blue light treatment. This feature will allow biologists to examine the developmental and physiological responses of the organism to perturbation of nearly any biological process at the cell, tissue, organ, and organismal levels. Immediate interests include directing changes in the concentration of phytohormones. Examples below (Table 1).
TABLE-US-00006 TABLE 1 Precision genetics with the Highlighter system: Interrogating consequences of spatiotemporal genetic perturbation.
Genetic background            Highlighter Target             Basal expression   Blue light regime
Hormone biosynthetic mutant   Biosynthetic gene complement   Elevated           Spatiotemporal hormone depletion
Hormone catabolic mutant      Catabolic gene complement      Depleted           Spatiotemporal hormone elevation
Applications for the Highlighter System in Horticulture
[0217] Plant hosts expressing Highlighter can be directed to undergo key developmental transitions or physiological state changes through application of light treatments. The developed technology holds the potential to permit specific interventions for improved agronomic outcomes. Immediate interests include directing the timing of germination, flowering, senescence, drought tolerance, immune activation and synthetic metabolite production (i.e. use as `metabolic valve`). Examples below (Table 2).
TABLE-US-00007 TABLE 2 Precision horticulture with Highlighter: direct crop development and physiology to suit agricultural/agropharmaceutical needs
Genetic background                                                  Highlighter Target                          Blue light regime                 Red light or basal
Flowering mutant                                                    Floral regulator complement                 Non-flowering                     Synchronous flowering
Germination mutant                                                  Germination regulator complement            Non-germinating                   Synchronous germination
Abscisic acid (ABA) catabolic mutant                                Catabolic mutant complement                 Induced drought tolerance         Low drought tolerance/rapid growth
Salicylic acid (SA) biosynthetic mutant                             Biosynthetic mutant complement              Reduced biotroph immunity         Induction of biotroph immunity
Synthetic metabolite (e.g. pharmaceutical) line lacking regulator   Synthetic metabolite regulator complement   No production of pharmaceutical   Synchronous production of pharmaceutical
Example 2
Highlighter Response to Mixed Light Environments
[0218] Horticultural environments are typically mixed light environments, rather than monochromatic light. The responsiveness of the Highlighter system was therefore evaluated under light regimes where white light was enriched in either red (activating wavelengths) or blue light (inactivating wavelengths). Monochromatic red and blue light were used as control conditions to establish the maximum response for the system. In mixed light environments, a switch from white light with modest enrichment in red light to modest enrichment in blue light is sufficient to convert the Highlighter system 213 (tested in quadruplicate) from activation to inactivation of gene expression (FIG. 14).
Creating Spectral Variants of the LRHK for Multichromatic Control of Gene Regulation
[0219] Advanced control of gene regulatory networks can be achieved by developing multichromatic optogenetic systems. We therefore tested whether the LRHK we developed could be adapted to respond to alternative light stimuli. A segment of the GAF domain in the LRHK (from the extreme N-terminal part of the .beta.1 sheet (DRV motif) to the C-terminal part of the .beta.6 sheet (WGL motif)) was replaced by the corresponding segment of the following GAF domains: AnPixJg2, slr1393g2, NpR1597g4 and UirSg. The resulting LRHKs are referred to as LRHK1-01, LRHK1-05, LRHK1-10 and LRHK1-12, respectively. Gene induction (i.e. sfGFP fluorescence) downstream of the synthetic LRHKs was evaluated in response to darkness, ultraviolet light (370 nm and 400 nm), blue light (450 nm), green light (520 nm), yellow light (590 nm), orange light (610 nm), red light (630 nm), and far red light (700 nm) (FIG. 15).
[0220] The original LRHK is inactive in most light regimes, but strongly induces sfGFP expression in the green (520 nm), yellow (590 nm) and orange (610 nm) light regimes. In contrast, the LRHK1-01 induced sfGFP expression in all light regimes, except for the ultraviolet (370 nm and 400 nm) and blue (450 nm) light regimes. LRHK1-05 induced sfGFP expression in all light regimes, with the exception of blue light specifically. LRHK1-10 strongly induced sfGFP expression in all tested light regimes but still displays somewhat reduced induction of sfGFP expression in response to blue light (450 nm). LRHK1-12 is constitutively inactive in all light regimes. The results clearly demonstrate that the LRHK developed for the Highlighter system can be adapted to display new light responsive properties.
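The qualitative response profiles described above can be tabulated to ask which LRHKs would distinguish two given wavelengths. The sketch below is a deliberate simplification: it treats induction as on or off, omits the dark condition, and records the reduced response of LRHK1-10 at 450 nm as "on". The names used are hypothetical and the table is a reading of the paragraph above, not of the underlying data in FIG. 15.

```python
WAVELENGTHS_NM = (370, 400, 450, 520, 590, 610, 630, 700)

# Wavelengths (nm) at which each LRHK is described as inducing sfGFP.
INDUCING = {
    "LRHK (original)": {520, 590, 610},
    "LRHK1-01": set(WAVELENGTHS_NM) - {370, 400, 450},
    "LRHK1-05": set(WAVELENGTHS_NM) - {450},
    "LRHK1-10": set(WAVELENGTHS_NM),  # induction at 450 nm is reduced, not absent
    "LRHK1-12": set(),
}


def variants_distinguishing(wl_a, wl_b):
    """Names of LRHKs that are 'on' at exactly one of the two wavelengths."""
    for wl in (wl_a, wl_b):
        if wl not in WAVELENGTHS_NM:
            raise ValueError(f"{wl} nm was not a tested wavelength")
    return sorted(
        name for name, on in INDUCING.items() if (wl_a in on) != (wl_b in on)
    )


print(variants_distinguishing(400, 450))
print(variants_distinguishing(450, 520))
```

In this simplified view, pairing LRHKs whose profiles differ at a given pair of wavelengths is what would make multichromatic control possible.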
Control of Gene Expression in Stably Transformed Arabidopsis in a Light Dependent Manner Using the Highlighter System
[0221] To demonstrate that the Highlighter system is able to control gene expression levels in stably transformed plants we attempted to complement the semi-dwarf phenotype of an Arabidopsis thaliana ga3ox1-3, ga3ox2-1 double mutant line that also expresses a nuclear localized GIBBERELLIN PERCEPTION SENSOR 1 (nGPS1) construct (ga3ox1-3, ga3ox2-1, nGPS1, Rizza 2017). Because the ga3ox2-1 mutant does not have a visible growth phenotype (Mitchum 2006), we hypothesized that AtGA3OX1 expression controlled by the Highlighter system could be used to complement the semi-dwarf phenotype in a light-dependent manner. A semi-dwarf phenotype of the ga3ox1-3, ga3ox2-1, nGPS1 line was clearly visible when grown in continuous blue-enriched white light and in continuous red-enriched white light. For the ga3ox1-3, ga3ox2-1, nGPS1 line transformed with the Highlighter system controlling AtGA3OX1 expression, the semi-dwarf phenotype is only observed when grown in `inactivating` blue-enriched white light, whereas an undwarfed phenotype was observed in the same line grown in `activating` red-enriched white light (FIG. 16). These results correspond well with the results observed in the transient tobacco experiments driving NLS:Venus expression under control of the Highlighter system.
REFERENCES
[0222] 1. Hirose, Y., Narikawa, R., Katayama, M. & Ikeuchi, M. Cyanobacteriochrome CcaS regulates phycoerythrin accumulation in Nostoc punctiforme, a group II chromatic adapter. Proc. Natl. Acad. Sci. 107, 8854-8859 (2010).
[0223] 2. Schmidl, S. R., Sheth, R. U., Wu, A. & Tabor, J. J. Refactoring and optimization of light-switchable Escherichia coli two-component systems. ACS Synth. Biol. 3, 820-831 (2014).
[0224] 3. Tabor, J. J., Levskaya, A. & Voigt, C. A. Multichromatic control of gene expression in Escherichia coli. J. Mol. Biol. 405, 315-324 (2011).
[0225] 4. Mukougawa, K., Kanamoto, H., Kobayashi, T., Yokota, A. & Kohchi, T. Metabolic engineering to produce phytochromes with phytochromobilin, phycocyanobilin, or phycoerythrobilin chromophore in Escherichia coli. FEBS Lett. 580, 1333-1338 (2006).
[0226] 5. Zuo, J., Niu, Q.-W. & Chua, N.-H. An estrogen-based transactivator XVE mediates highly inducible gene expression in transgenic plants. Plant J. 24, 265-273 (2000).
[0227] 6. Huang, Q. et al. Distance and helical phase dependence of synergistic transcription activation in cis-regulatory module. PLoS One 7, 1-10 (2012).
[0228] 7. Ochoa-Fernandez, R., Samodelov, S. L., Brandl, S. M., Wehinger, E., Muller, K., Weber, W., Zurbriggen, M. D., Optogenetics in Plants: Red/Far-Red Light Control of Gene Expression. Methods in Molecular Biology. 1408, 125-139 (2016).
[0229] 8. Abe, K., Miyake, K., Nakamura, M., Kojima, K., Ferri, S., Ikebukuro, K., Sode, K. Engineering of a green-light inducible gene expression system in Synechocystis sp. PCC6803. Microbial Biotechnology. 7 (2) 177-183. (2013).
[0230] 9. Hunter, P. Shining a light on optogenetics. EMBO Reports 17(5), 634-637 (2016).
[0231] 10. Mitchum, M. G., Yamaguchi, S., Hanada, A., Kuwahara, A., Yoshioka, Y., Kato, T., Tabata, S., Kamiya, Y. & Sun, T.-P. Distinct and overlapping roles of two gibberellin 3-oxidases in Arabidopsis development. Plant J. 45(5), 804-818 (2006).
[0232] 11. Rizza, A., Walia, A., Lanquar, V., Frommer, W. B. & Jones, A. M. In vivo gibberellin gradients visualized in rapidly elongating tissues. Nat. Plants 3(10), 803-813 (2017).
TABLE-US-00008
[0232] SEQUENCE LISTING CcaS variants SEQ ID NO: 1 CcaS (A92V); amino acid sequence MGKFLIPIEFVFLAIAMTCYLWHRQNQERRRIEISIKQQTQRERF INQITQHIRQSLNLETVLNTTVAEVKTLLQVDRVLIYRIWQDGTG SVITESVNANYPSILGRTFSDEVFPVEYHQAYTKGKVRAINDIDQ DDIEICLADFVKQFGVKSKLVVPILQHNRASSLDNESEFPYLWGL LITHQCAFTRPWQPWEVELMKQLANQVAIAIQQSELYEQLQQLNK DLENRVEKRTQQLAATNQSLRMEISERQKTEAALRHTNHTLQSLI AASPRGIFTLNLADQIQIWNPTAERIFGWTETEIIAHPELLTSNI LLEDYQQFKQKVLSGMVSPSLELKCQKKDGSWIEIVLSAAPLLDS EENIAGLVAVVADITEQKRQAEQIRLLQSVVVNTNDAVVITEAEP IDDPGPRILYVNEAFTKITGYTAEEMLGKTPRVLQGPKTSRTELD RVRQAISQWQSVTVEVINYRKDGSEFWVEFSLVPVANKTGFYTHW IAVQRDVTERRRTEEVRLALEREKELSRLKTRFFSMASHEFRTPL STALAAAQLLENSEVAWLDPDKRSRNLHRIQNSVKNMVQLLDDIL IINRAEAGKLEFNPNWLDLKLLFQQFIEEIQLSVSDQYYFDFICS AQDTKALVDERLVRSILSNLLSNAIKYSPGGGQIKIALSLDSEQI IFEVTDQGIGISPEDQKQIFEPFHRGKNVRNITGTGLGLMVAKKC VDLHSGSILLKSAVDQGTTVTICLKRYNHLPRA SEQ ID NO: 2 CcaS (A92V); nucleic acid sequence ATGGGCAAATTTCTAATTCCAATCGAATTTGTTTTTCTGGCGATC GCCATGACCTGTTATTTATGGCACAGACAAAACCAAGAACGCCGC AGGATTGAAATTAGCATCAAGCAACAAACCCAACGGGAACGATTT ATTAACCAAATTACCCAACATATCCGCCAATCTTTAAACTTGGAA ACGGTTTTAAATACCACCGTCGCTGAAGTTAAAACCCTGTTGCAA GTTGATCGAGTTCTAATTTATCGCATTTGGCAAGATGGCACGGGC AGCGTCATTACGGAATCGGTGAATGCCAATTATCCTAGTATTTTA GGGCGGACCTTTTCCGATGAAGTTTTTCCCGTTGAATACCATCAA GCCTACACCAAAGGTAAAGTACGGGCCATTAATGACATTGACCAG GATGACATAGAGATTTGCCTAGCTGATTTCGTCAAACAATTTGGC GTGAAATCAAAATTAGTAGTGCCCATTCTTCAACATAATCGTGCT TCTTCCCTAGATAATGAATCAGAATTTCCCTATCTTTGGGGGCTG TTAATTACCCATCAATGTGCTTTTACCCGGCCATGGCAACCGTGG GAAGTGGAGTTAATGAAACAGCTAGCCAATCAGGTCGCGATCGCC ATCCAACAATCGGAATTATATGAGCAATTACAGCAACTCAATAAA GATTTGGAAAACCGAGTCGAAAAACGCACCCAGCAACTTGCCGCC ACCAATCAATCCCTAAGAATGGAAATCAGTGAGCGACAAAAAACG GAAGCCGCTCTCCGCCACACTAACCATACTCTGCAATCCCTGATT GCGGCCTCCCCCAGGGGTATTTTTACCCTTAATTTAGCAGACCAA ATTCAGATTTGGAATCCTACAGCAGAACGTATTTTTGGTTGGACA GAAACAGAAATTATTGCCCATCCAGAATTATTAACATCCAACATT TTGCTGGAAGATTATCAGCAATTTAAACAGAAAGTTTTATCAGGC ATGGTTTCCCCTAGCCTAGAATTAAAATGTCAAAAAAAAGATGGT 
AGTTGGATTGAAATTGTCCTTTCCGCTGCTCCCCTATTGGATAGT GAAGAAAATATTGCCGGATTGGTGGCGGTTGTCGCCGATATTACC GAGCAAAAGCGGCAGGCAGAACAAATTCGTTTGCTACAATCCGTT GTGGTTAATACTAATGATGCGGTGGTGATTACGGAAGCGGAGCCC ATTGATGATCCCGGGCCGAGAATTCTCTATGTCAATGAAGCATTT ACTAAAATCACCGGTTATACTGCTGAAGAAATGCTAGGCAAAACC CCCCGAGTTTTACAGGGACCAAAAACTAGTCGCACTGAATTAGAT AGGGTGCGGCAAGCCATTAGTCAATGGCAATCAGTTACCGTTGAA GTGATTAATTATCGTAAGGATGGCAGTGAGTTTTGGGTGGAATTT AGTCTGGTGCCCGTTGCCAATAAAACAGGTTTTTACACCCATTGG ATTGCTGTGCAAAGGGATGTCACTGAGCGCCGACGCACGGAGGAA GTCCGCCTAGCTTTAGAACGGGAAAAAGAATTAAGCCGCCTAAAA ACTCGTTTTTTCTCCATGGCTTCCCATGAATTTCGTACTCCCCTC AGTACGGCCTTAGCTGCTGCCCAATTACTGGAAAATTCTGAAGTG GCCTGGCTTGATCCCGATAAGCGTAGCCGGAACTTACACCGTATT CAAAATTCCGTGAAAAATATGGTACAGCTCCTGGATGATATTTTA ATCATTAACCGTGCCGAAGCGGGCAAATTGGAATTTAATCCTAAT TGGTTAGATTTGAAATTATTGTTCCAGCAATTTATCGAAGAAATT CAATTAAGTGTCAGTGACCAATATTATTTTGACTTTATTTGTAGC GCTCAAGATACGAAGGCATTGGTGGATGAAAGGTTAGTGCGGTCT ATTTTATCTAATCTGTTATCTAATGCGATTAAATACTCTCCCGGG GGAGGGCAGATTAAAATTGCCCTAAGCCTAGATTCGGAACAGATT ATTTTTGAAGTCACCGACCAGGGCATTGGCATTTCGCCAGAGGAC CAAAAGCAAATTTTTGAACCCTTTCATCGGGGCAAAAATGTCAGA AATATTACGGGAACAGGACTCGGTTTAATGGTTGCCAAGAAATGT GTTGACTTACACAGTGGCAGTATCTTGCTAAAAAGTGCAGTTGAC CAGGGAACAACAGTTACTATCTGTTTAAAACGCTATAACCATTTG CCTCGAGCTTAG SEQ ID NO: 3: M:NLS: CcaS (.DELTA.23); amino acid sequence MLQPKKKRKVGGRQNQERRRIEISIKQQTQRERFINQITQHIRQS LNLETVLNTTVAEVKTLLQVDRVLIYRIWQDGTGSAITESVNANY PSILGRTFSDEVFPVEYHQAYTKGKVRAINDIDQDDIEICLADFV KQFGVKSKLVVPILQHNRASSLDNESEFPYLWGLLITHQCAFTRP WQPWEVELMKQLANQVAIAIQQSELYEQLQQLNKDLENRVEKRTQ QLAATNQSLRMEISERQKTEAALRHTNHTLQSLIAASPRGIFTLN LADQIQIWNPTAERIFGWTETEIIAHPELLTSNILLEDYQQFKQK VLSGMVSPSLELKCQKKDGSWIEIVLSAAPLLDSEENIAGLVAVV ADITEQKRQAEQIRLLQSVVVNTNDAWITEAEPIDDPGPRILYVN EAFTKITGYTAEEMLGKTPRVLQGPKTSRTELDRVRQAISQWQSV TVEVINYRKDGSEFWVEFSLVPVANKTGFYTHWIAVQRDVTERRR TEEVRLALEREKELSRLKTRFFSMASHEFRTPLSTALAAAQLLEN SEVAWLDPDKRSRNLHRIQNSVKNMVQLLDDILIINRAEAGKLEF NPNWLDLKLLFQQFIEEIQLSVSDQYYFDFICSAQDTKALVDERL 
VRSILSNLLSNAIKYSPGGGQIKIALSLDSEQIIFEVTDQGIGIS PEDQKQIFEPFHRGKNVRNITGTGLGLMVAKKCVDLHSGSILLKS AVDQGTTVTICLKRYNHLPRA SEQ ID NO: 4 M:NLS:CcaS (.DELTA.23); nucleic acid sequence ATGTTACAACCAAAGAAGAAAAGGAAGGTGGGTGGAAGACAAAAC CAAGAACGCCGCAGGATTGAAATTAGCATCAAGCAACAAACCCAA CGGGAACGATTTATTAACCAAATTACCCAACATATCCGCCAATCT TTAAACTTGGAAACGGTTTTAAATACCACCGTCGCTGAAGTTAAA ACCCTGTTGCAAGTTGATCGAGTTCTAATTTATCGCATTTGGCAA GATGGCACGGGCAGCGCCATTACGGAATCGGTGAATGCCAATTAT CCTAGTATTTTAGGGCGGACCTTTTCCGATGAAGTTTTTCCCGTT GAATACCATCAAGCCTACACCAAAGGTAAAGTACGGGCCATTAAT GACATTGACCAGGATGACATAGAGATTTGCCTAGCTGATTTCGTC AAACAATTTGGCGTGAAATCAAAATTAGTAGTGCCCATTCTTCAA CATAATCGTGCTTCTTCCCTAGATAATGAATCAGAATTTCCCTAT CTTTGGGGGCTGTTAATTACCCATCAATGTGCTTTTACCCGGCCA TGGCAACCGTGGGAAGTGGAGTTAATGAAACAGCTAGCCAATCAG GTCGCGATCGCCATCCAACAATCGGAATTATATGAGCAATTACAG CAACTCAATAAAGATTTGGAAAACCGAGTCGAAAAACGCACCCAG CAACTTGCCGCCACCAATCAATCCCTAAGAATGGAAATCAGTGAG CGACAAAAAACGGAAGCCGCTCTCCGCCACACTAACCATACTCTG CAATCCCTGATTGCGGCCTCCCCCAGGGGTATTTTTACCCTTAAT TTAGCAGACCAAATTCAGATTTGGAATCCTACAGCAGAACGTATT TTTGGTTGGACAGAAACAGAAATTATTGCCCATCCAGAATTATTA ACATCCAACATTTTGCTGGAAGATTATCAGCAATTTAAACAGAAA GTTTTATCAGGCATGGTTTCCCCTAGCCTAGAATTAAAATGTCAA AAAAAAGATGGTAGTTGGATTGAAATTGTCCTTTCCGCTGCTCCC CTATTGGATAGTGAAGAAAATATTGCCGGATTGGTGGCGGTTGTC GCCGATATTACCGAGCAAAAGCGGCAGGCAGAACAAATTCGTTTG CTACAATCCGTTGTGGTTAATACTAATGATGCGGTGGTGATTACG GAAGCGGAGCCCATTGATGATCCCGGGCCGAGAATTCTCTATGTC AATGAAGCATTTACTAAAATCACCGGTTATACTGCTGAAGAAATG CTAGGCAAAACCCCCCGAGTTTTACAGGGACCAAAAACTAGTCGC ACTGAATTAGATAGGGTGCGGCAAGCCATTAGTCAATGGCAATCA GTTACCGTTGAAGTGATTAATTATCGTAAGGATGGCAGTGAGTTT TGGGTGGAATTTAGTCTGGTGCCCGTTGCCAATAAAACAGGTTTT TACACCCATTGGATTGCTGTGCAAAGGGATGTCACTGAGCGCCGA CGCACGGAGGAAGTCCGCCTAGCTTTAGAACGGGAAAAAGAATTA AGCCGCCTAAAAACTCGTTTTTTCTCCATGGCTTCCCATGAATTT CGTACTCCCCTCAGTACGGCCTTAGCTGCTGCCCAATTACTGGAA
AATTCTGAAGTGGCCTGGCTTGATCCCGATAAGCGTAGCCGGAAC TTACACCGTATTCAAAATTCCGTGAAAAATATGGTACAGCTCCTG GATGATATTTTAATCATTAACCGTGCCGAAGCGGGCAAATTGGAA TTTAATCCTAATTGGTTAGATTTGAAATTATTGTTCCAGCAATTT ATCGAAGAAATTCAATTAAGTGTCAGTGACCAATATTATTTTGAC TTTATTTGTAGCGCTCAAGATACGAAGGCATTGGTGGATGAAAGG TTAGTGCGGTCTATTTTATCTAATCTGTTATCTAATGCGATTAAA TACTCTCCCGGGGGAGGGCAGATTAAAATTGCCCTAAGCCTAGAT TCGGAACAGATTATTTTTGAAGTCACCGACCAGGGCATTGGCATT TCGCCAGAGGACCAAAAGCAAATTTTTGAACCCTTTCATCGGGGC AAAAATGTCAGAAATATTACGGGAACAGGACTCGGTTTAATGGTT GCCAAGAAATGTGTTGACTTACACAGTGGCAGTATCTTGCTAAAA AGTGCAGTTGACCAGGGAACAACAGTTACTATCTGTTTAAAACGC TATAACCATTTGCCTCGAGCTTAG SEQ ID NO: 5: CcaS (.DELTA. 22 A92V); amino acid sequence MRQNQERRRIEISIKQQTQRERFINQITQHIRQSLNLETVLNTTV AEVKTLLQVDRVLIYRIWQDGTGSVITESVNANYPSILGRTFSDE VFPVEYHQAYTKGKVRAINDIDQDDIEICLADFVKQFGVKSKLVV PILQHNRASSLDNESEFPYLWGLLITHQCAFTRPWQPWEVELMKQ LANQVAIAIQQSELYEQLQQLNKDLENRVEKRTQQLAATNQSLRM EISERQKTEAALRHTNHTLQSLIAASPRGIFTLNLADQIQIWNPT AERIFGWTETEIIAHPELLTSNILLEDYQQFKQKVLSGMVSPSLE LKCQKKDGSWIEIVLSAAPLLDSEENIAGLVAVVADITEQKRQAE QIRLLQSWVNTNDAVVITEAEPIDDPGPRILYVNEAFTKITGYTA EEMLGKTPRVLQGPKTSRTELDRVRQAISQWQSVTVEVINYRKDG SEFVWEFSLVPVANKTGFYTHWIAVQRDVTERRRTEEVRLALERE KELSRLKTRFFSMASHEFRTPLSTALAAAQLLENSEVAWLDPDKR SRNLHRIQNSVKNMVQLLDDILIINRAEAGKLEFNPNWLDLKLLF QQFIEEIQLSVSDQYYFDFICSAQDTKALVDERLVRSILSNLLSN AIKYSPGGGQIKIALSLDSEQIIFEVTDQGIGISPEDQKQIFEPF HRGKNVRNITGTGLGLMVAKKCVDLHSGSILLKSAVDQGTTVTIC LKRYNHLPRA SEQ ID NO: 6: CcaS (.DELTA. 
22 A92V); nucleic acid sequence ATGAGACAAAACCAAGAACGCCGCAGGATTGAAATTAGCATCAAG CAACAAACCCAACGGGAACGATTTATTAACCAAATTACCCAACAT ATCCGCCAATCTTTAAACTTGGAAACGGTTTTAAATACCACCGTC GCTGAAGTTAAAACCCTGTTGCAAGTTGATCGAGTTCTAATTTAT CGCATTTGGCAAGATGGCACGGGCAGCGTCATTACGGAATCGGTG AATGCCAATTATCCTAGTATTTTAGGGCGGACCTTTTCCGATGAA GTTTTTCCCGTTGAATACCATCAAGCCTACACCAAAGGTAAAGTA CGGGCCATTAATGACATTGACCAGGATGACATAGAGATTTGCCTA GCTGATTTCGTCAAACAATTTGGCGTGAAATCAAAATTAGTAGTG CCCATTCTTCAACATAATCGTGCTTCTTCCCTAGATAATGAATCA GAATTTCCCTATCTTTGGGGGCTGTTAATTACCCATCAATGTGCT TTTACCCGGCCATGGCAACCGTGGGAAGTGGAGTTAATGAAACAG CTAGCCAATCAGGTCGCGATCGCCATCCAACAATCGGAATTATAT GAGCAATTACAGCAACTCAATAAAGATTTGGAAAACCGAGTCGAA AAACGCACCCAGCAACTTGCCGCCACCAATCAATCCCTAAGAATG GAAATCAGTGAGCGACAAAAAACGGAAGCCGCTCTCCGCCACACT AACCATACTCTGCAATCCCTGATTGCGGCCTCCCCCAGGGGTATT TTTACCCTTAATTTAGCAGACCAAATTCAGATTTGGAATCCTACA GCAGAACGTATTTTTGGTTGGACAGAAACAGAAATTATTGCCCAT CCAGAATTATTAACATCCAACATTTTGCTGGAAGATTATCAGCAA TTTAAACAGAAAGTTTTATCAGGCATGGTTTCCCCTAGCCTAGAA TTAAAATGTCAAAAAAAAGATGGTAGTTGGATTGAAATTGTCCTT TCCGCTGCTCCCCTATTGGATAGTGAAGAAAATATTGCCGGATTG GTGGCGGTTGTCGCCGATATTACCGAGCAAAAGCGGCAGGCAGAA CAAATTCGTTTGCTACAATCCGTTGTGGTTAATACTAATGATGCG GTGGTGATTACGGAAGCGGAGCCCATTGATGATCCCGGGCCGAGA ATTCTCTATGTCAATGAAGCATTTACTAAAATCACCGGTTATACT GCTGAAGAAATGCTAGGCAAAACCCCCCGAGTTTTACAGGGACCA AAAACTAGTCGCACTGAATTAGATAGGGTGCGGCAAGCCATTAGT CAATGGCAATCAGTTACCGTTGAAGTGATTAATTATCGTAAGGAT GGCAGTGAGTTTTGGGTGGAATTTAGTCTGGTGCCCGTTGCCAAT AAAACAGGTTTTTACACCCATTGGATTGCTGTGCAAAGGGATGTC ACTGAGCGCCGACGCACGGAGGAAGTCCGCCTAGCTTTAGAACGG GAAAAAGAATTAAGCCGCCTAAAAACTCGTTTTTTCTCCATGGCT TCCCATGAATTTCGTACTCCCCTCAGTACGGCCTTAGCTGCTGCC CAATTACTGGAAAATTCTGAAGTGGCCTGGCTTGATCCCGATAAG CGTAGCCGGAACTTACACCGTATTCAAAATTCCGTGAAAAATATG GTACAGCTCCTGGATGATATTTTAATCATTAACCGTGCCGAAGCG GGCAAATTGGAATTTAATCCTAATTGGTTAGATTTGAAATTATTG TTCCAGCAATTTATCGAAGAAATTCAATTAAGTGTCAGTGACCAA TATTATTTTGACTTTATTTGTAGCGCTCAAGATACGAAGGCATTG GTGGATGAAAGGTTAGTGCGGTCTATTTTATCTAATCTGTTATCT 
AATGCGATTAAATACTCTCCCGGGGGAGGGCAGATTAAAATTGCC CTAAGCCTAGATTCGGAACAGATTATTTTTGAAGTCACCGACCAG GGCATTGGCATTTCGCCAGAGGACCAAAAGCAAATTTTTGAACCC TTTCATCGGGGCAAAAATGTCAGAAATATTACGGGAACAGGACTC GGTTTAATGGTTGCCAAGAAATGTGTTGACTTACACAGTGGCAGT ATCTTGCTAAAAAGTGCAGTTGACCAGGGAACAACAGTTACTATC TGTTTAAAACGCTATAACCATTTGCCTCGAGCTTAG SEQ ID NO: 7 M:NLS: CcaS (.DELTA.23 A92V); amino acid sequence MLQPKKKRKVGGRQNQERRRIEISIKQQTQRERFINQITQHIRQSL NLETVLNTTVAEVKTLLQVDRVLIYRIWQDGTGSVITESVNANYPS ILGRTFSDEVFPVEYHQAYTKGKVRAINDIDQDDIEICLADFVKQF GVKSKLVVPILQHNRASSLDNESEFPYLWGLLITHQCAFTRPWQPW EVELMKQIANQVAIAIQQSELYEQLQQLNKDLENRVEKRTQQLAAT NQSLRMEISERQKTEAALRHTNHTLQSLIAASPRGIFTLNLADQIQ IWNPTAERIFGWTETEIIAHPELLTSNILLEDYQQFKQKVLSGMVS PSLELKCQKKDGSWIEIVLSAAPLLDSEENIAGLVAVVADITEQKR QAEQIRLLQSVVVNTNDAVVITEAEPIDDPGPRILYVNEAFTKITG YTAEEMLGKTPRVLQGPKTSRTELDRVRQAISQWQSVTVEVINYRK DGSEFWVEFSLVPVANKTGFYTHWIAVQRDVTERRRTEEVRLALER EKELSRLKTRFFSMASHEFRTPLSTALAAAQLLENSEVAWLDPDKR SRNLHRIQNSVKNMVQLLDDILIINRAEAGKLEFNPNWLDLKLLFQ QFIEEIQLSVSDQYYFDFICSAQDTKALVDERLVRSILSNLLSNAI KYSPGGGQIKIALSLDSEQIIFEVTDQGIGISPEDQKQIFEPFHRG KNVRNITGTGLGLMVAKKCVDLHSGSILLKSAVDQGTTVTICLKRY NHLPRA SEQ ID NO: 8 M:NLS: CcaS (.DELTA.23 A92V); nucleic acid sequence ATGTTACAACCAAAGAAGAAAAGGAAGGTGGGTGGAAGACAAAACC AAGAACGCCGCAGGATTGAAATTAGCATCAAGCAACAAACCCAACG GGAACGATTTATTAACCAAATTACCCAACATATCCGCCAATCTTTA AACTTGGAAACGGTTTTAAATACCACCGTCGCTGAAGTTAAAACCC TGTTGCAAGTTGATCGAGTTCTAATTTATCGCATTTGGCAAGATGG CACGGGCAGCGTCATTACGGAATCGGTGAATGCCAATTATCCTAGT ATTTTAGGGCGGACCTTTTCCGATGAAGTTTTTCCCGTTGAATACC ATCAAGCCTACACCAAAGGTAAAGTACGGGCCATTAATGACATTGA CCAGGATGACATAGAGATTTGCCTAGCTGATTTCGTCAAACAATTT GGCGTGAAATCAAAATTAGTAGTGCCCATTCTTCAACATAATCGTG GCTTCTTCCCTAGATAATGAATCAGAATTTCCCTATCTTTGGGGCT GTTAATTACCCATCAATGTGCTTTTACCCGGCCATGGCAACCGTGG GAAGTGGAGTTAATGAAACAGCTAGCCAATCAGGTCGCGATCGCCA TCCAACAATCGGAATTATATGAGCAATTACAGCAACTCAATAAAGA TTTGGAAAACCGAGTCGAAAAACGCACCCAGCAACTTGCCGCCACC AATCAATCCCTAAGAATGGAAATCAGTGAGCGACAAAAAACGGAAG CCGCTCTCCGCCACACTAACCATACTCTGCAATCCCTGATTGCGGC 
CTCCCCCAGGGGTATTTTTACCCTTAATTTAGCAGACCAAATTCAG ATTTGGAATCCTACAGCAGAACGTATTTTTGGTTGGACAGAAACAG AAATTATTGCCCATCCAGAATTATTAACATCCAACATTTTGCTGGA AGATTATCAGCAATTTAAACAGAAAGTTTTATCAGGCATGGTTTCC CCTAGCCTAGAATTAAAATGTCAAAAAAAAGATGGTAGTTGGATTG AAATTGTCCTTTCCGCTGCTCCCCTATTGGATAGTGAAGAAAATAT TGCCGGATTGGTGGCGGTTGTCGCCGATATTACCGAGCAAAAGCGG
CAGGCAGAACAAATTCGTTTGCTACAATCCGTTGTGGTTAATACTA ATGATGCGGTGGTGATTACGGAAGCGGAGCCCATTGATGATCCCGG GCCGAGAATTCTCTATGTCAATGAAGCATTTACTAAAATCACCGGT TATACTGCTGAAGAAATGCTAGGCAAAACCCCCCGAGTTTTACAGG GACCAAAAACTAGTCGCACTGAATTAGATAGGGTGCGGCAAGCCAT TAGTCAATGGCAATCAGTTACCGTTGAAGTGATTAATTATCGTAAG GATGGCAGTGAGTTTTGGGTGGAATTTAGTCTGGTGCCCGTTGCCA ATAAAACAGGTTTTTACACCCATTGGATTGCTGTGCAAAGGGATGT CACTGAGCGCCGACGCACGGAGGAAGTCCGCCTAGCTTTAGAACGG GAAAAAGAATTAAGCCGCCTAAAAACTCGTTTTTTCTCCATGGCTT CCCATGAATTTCGTACTCCCCTCAGTACGGCCTTAGCTGCTGCCCA ATTACTGGAAAATTCTGAAGTGGCCTGGCTTGATCCCGATAAGCGT AGCCGGAACTTACACCGTATTCAAAATTCCGTGAAAAATATGGTAC AGCTCCTGGATGATATTTTAATCATTAACCGTGCCGAAGCGGGCAA ATTGGAATTTAATCCTAATTGGTTAGATTTGAAATTATTGTTCCAG CAATTTATCGAAGAAATTCAATTAAGTGTCAGTGACCAATATTATT TTGACTTTATTTGTAGCGCTCAAGATACGAAGGCATTGGTGGATGA AAGGTTAGTGCGGTCTATTTTATCTAATCTGTTATCTAATGCGATT AAATACTCTCCCGGGGGAGGGCAGATTAAAATTGCCCTAAGCCTAG ATTCGGAACAGATTATTTTTGAAGTCACCGACCAGGGCATTGGCAT TTCGCCAGAGGACCAAAAGCAAATTTTTGAACCCTTTCATCGGGGC AAAAATGTCAGAAATATTACGGGAACAGGACTCGGTTTAATGGTTG CCAAGAAATGTGTTGACTTACACAGTGGCAGTATCTTGCTAAAAAG TGCAGTTGACCAGGGAACAACAGTTACTATCTGTTTAAAACGCTAT AACCATTTGCCTCGAGCTTAG SEQ ID NO: 9 MM:NLS:CcaS(.DELTA.23):F2A30(aa1-29) amino acid sequence MMLQPKKKRKVGGRQNQERRRIEISIKQQTQRERFINQITQHIRQS LNLETVLNTTVAEVKTLLQVDRVLIYRIWQDGTGSAITESVNANYP SILGRTFSDEVFPVEYHQAYTKGKVRAINDIDQDDIEICLADFVKQ FGVKSKLVVPILQHNRASSLDNESEFPYLWGLLITHQCAFTRPWQP WEVELMKQLANQVAIAIQQSELYEQLQQLNKDLENRVEKRTQQLAA TNQSLRMEISERQKTEAALRHTNHTLQSLIAASPRGIFTLNLADQI QIWNPTAERIFGWTETEIIAHPELLTSNILLEDYQQFKQKVLSGMV SPSLELKCQKKDGSWIEIVLSAAPLLDSEENIAGLVAVVADITEQK RQAEQIRLLQSVVVNTNDAVVITEAEPIDDPGPRILYVNEAFTKIT GYTAEEMLGKTPRVLQGPKTSRTELDRVRQAISQWQSVTVEVINYR KDGSEFWVEFSLVPVANKTGFYTHWIAVQRDVTERRRTEEVRLALE REKELSRLKTRFFSMASHEFRTPLSTALAAAQLLENSEVAWLDPDK RSRNLHRIQNSVKNMVQLLDDILIINRAEAGKLEFNPNWLDLKLLF QQFIEEIQLSVSDQYYFDFICSAQDTKALVDERLVRSILSNLLSNA IKYSPGGGQIKIALSLDSEQIIFEVTDQGIGISPEDQKQIFEPFHR GKNVRNITGTGLGLMVAKKCVDLHSGSILLKSAVDQGTTVTICLKR YNHLPRA SEQ ID NO: 10 
MM:NLS: CcaS(.DELTA.23):F2A30 (aa1-29) nucleic acid sequence ATGATGTTACAACCAAAGAAGAAAAGGAAGGTGGGTGGAAGACAGA ACCAAGAACGAAGAAGAATAGAAATAAGTATCAAGCAGCAGACACA ACGTGAGAGGTTTATCAACCAAATCACACAGCATATCAGACAATCT CTTAATTTGGAGACTGTTTTGAACACTACAGTTGCTGAAGTTAAGA CACTTTTGCAGGTTGATAGAGTTCTTATCTATAGAATCTGGCAAGA TGGTACAGGATCTGCTATCACTGAGTCTGTTAATGCTAACTACCCT TCTATTTTGGGTAGAACTTTTTCTGATGAGGTTTTCCCAGTTGAAT ATCATCAAGCTTACACAAAGGGAAAAGTTAGAGCTATTAATGATAT CGATCAGGATGATATCGAAATCTGTCTTGCTGATTTCGTTAAACAA TTCGGTGTTAAGTCTAAACTTGTTGTTCCTATCTTGCAGCATAATA GAGCTTCTTCTTTGGATAACGAATCTGAGTTTCCATATCTTTGGGG ACTTTTGATTACACATCAGTGTGCTTTCACTAGACCTTGGCAACCT TGGGAAGTTGAGCTTATGAAGCAGTTGGCTAACCAAGTTGCTATTG CTATCCAACAGTCTGAGTTGTACGAACAACTTCAACAGTTGAATAA GGATCTTGAGAACAGAGTTGAAAAAAGAACACAACAGTTGGCTGCT ACTAATCAGTCTCTTAGGATGGAAATCTCTGAAAGACAAAAGACTG AGGCTGCTTTGAGACATACTAACCATACACTTCAGTCTTTGATTGC TGCTTCTCCTAGAGGTATCTTTACTCTTAATTTGGCTGATCAAATT CAGATCTGGAACCCAACAGCTGAGCGAATCTTCGGATGGACTGAAA CAGAGATTATCGCTCATCCTGAGCTTTTGACATCTAACATCCTTTT GGAAGATTACCAACAGTTTAAGCAAAAGGTTCTTTCTGGTATGGTT TCTCCATCTCTTGAGTTGAAGTGTCAGAAGAAAGATGGATCTTGGA TTGAAATCGTTTTGTCTGCTGCTCCTCTTTTGGATTCTGAAGAGAA CATTGCTGGTCTTGTTGCTGTTGTTGCTGATATCACTGAGCAAAAA AGACAGGCTGAACAAATCAGACTTTTGCAATCTGTTGTTGTTAACA CAAACGATGCTGTTGTTATTACTGAAGCTGAACCAATCGATGATCC TGGACCAAGAATCCTTTATGTTAATGAGGCTTTCACTAAGATCACA GGATACACTGCTGAAGAGATGTTGGGAAAGACTCCTAGAGTTCTTC AAGGACCAAAAACTTCAAGAACTGAGTTGGATAGAGTTAGACAGGC TATCTCTCAATGGCAGTCTGTTACAGTTGAAGTTATTAATTACAGA AAGGATGGTTCTGAGTTTTGGGTTGAATTTTCTCTTGTTCCTGTTG CTAACAAAACAGGATTTTACACTCATTGGATTGCTGTTCAAAGAGA TGTTACAGAGAGAAGAAGAACTGAAGAGGTTAGACTTGCTTTGGAA AGAGAGAAGGAACTTTCAAGATTGAAGACTAGATTTTTCTCTATGG CTTCTCATGAGTTTAGAACACCACTTTCTACTGCTTTGGCTGCTGC TCAACTTCTTGAAAATTCTGAAGTTGCTTGGCTTGATCCTGATAAG AGATCAAGAAACCTTCATAGAATCCAAAATTCTGTTAAAAACATGG TTCAACTTTTGGATGATATCTTGATTATCAACAGAGCTGAGGCTGG AAAGCTTGAGTTTAATCCAAACTGGCTTGATTTGAAGCTTTTGTTC CAACAGTTCATTGAAGAGATCCAGCTTTCTGTTTCTGATCAATACT ACTTCGATTTCATCTGTTCTGCTCAAGATACTAAGGCTCTTGTTGA 
TGAAAGATTGGTTAGATCTATCCTTTCTAATCTTTTGTCTAACGCT ATCAAGTACTCTCCTGGAGGTGGACAGATTAAAATCGCTCTTTCTT TGGATTCTGAGCAGATTATCTTCGAAGTTACAGATCAAGGTATTGG AATCTCTCCTGAGGATCAAAAGCAGATCTTTGAACCATTCCATAGA GGAAAGAATGTTAGAAACATTACTGGTACAGGACTTGGTTTGATGG TTGCTAAGAAATGTGTTGATCTTCATTCTGGATCTATCCTTTTGAA GTCTGCTGTGGATCAAGGAACAACTGTGACCATCTGTCTCAAAAGG TACAACCATCTCCCAAGGGCT SEQ ID NO: 11 MM:NLS:CcaS (.DELTA.23 A92V):F2A30 (aa1-29) amino acid sequence MMLQPKKKRKVGGRQNQERRRIEISIKQQTQRERFINQITQHIRQS LNLETVLNTTVAEVKTLLQVDRVLIYRIWQDGTGSVITESVNANYP SILGRTFSDEVFPVEYHQAYTKGKVRAINDIDQDDIEICLADFVKQ FGVKSKLVVPILQHNRASSLDNESEFPYLWGLLITHQCAFTRPWQP WEVELMKQLANQVAIAIQQSELYEQLQQLNKDLENRVEKRTQQLAA TNQSLRMEISERQKTEAALRHTNHTLQSLIAASPRGIFTLNLADQI QIWNPTAERIFGWTETEIIAHPELLTSNILLEDYQQFKQKVLSGMV SPSLELKCQKKDGSWIEIVLSAAPLLDSEENIAGLVAVVADITEQK RQAEQIRLLQSVVVNTNDAVVITEAEPIDDPGPRILYVNEAFTKIT GYTAEEMLGKTPRVLQGPKTSRTELDRVRQAISQWQSVTVEVINYR KDGSEFWVEFSLVPVANKTGFYTHWIAVQRDVTERRRTEEVRLALE REKELSRLKTRFFSMASHEFRTPLSTALAAAQLLENSEVAWLDPDK RSRNLHRIQNSVKNMVQLLDDILIINRAEAGKLEFNPNWLDLKLLF QQFIEEIQLSVSDQYYFDFICSAQDTKALVDERLVRSILSNLLSNA IKYSPGGGQIKIALSLDSEQIIFEVTDQGIGISPEDQKQIFEPFHR GKNVRNITGTGLGLMVAKKCVDLHSGSILLKSAVDQGTTVTICLKR YNHLPRA SEQ ID NO: 12 MM:NLS:CcaS(.DELTA.23 A92V):F2A30 (aa1-29) nucleic acid sequence ATGATGTTACAACCAAAGAAGAAAAGGAAGGTGGGTGGAAGACAGA ACCAAGAACGAAGAAGAATAGAAATAAGTATCAAGCAGCAGACACA ACGTGAGAGGTTTATCAACCAAATCACACAGCATATCAGACAATCT CTTAATTTGGAGACTGTTTTGAACACTACAGTTGCTGAAGTTAAGA CACTTTTGCAGGTTGATAGAGTTCTTATCTATAGAATCTGGCAAGA TGGTACAGGATCTGTTATCACTGAGTCTGTTAATGCTAACTACCCT TCTATTTTGGGTAGAACTTTTTCTGATGAGGTTTTCCCAGTTGAAT ATCATCAAGCTTACACAAAGGGAAAAGTTAGAGCTATTAATGATAT CGATCAGGATGATATCGAAATCTGTCTTGCTGATTTCGTTAAACAA TTCGGTGTTAAGTCTAAACTTGTTGTTCCTATCTTGCAGCATAATA GAGCTTCTTCTTTGGATAACGAATCTGAGTTTCCATATCTTTGGGG ACTTTTGATTACACATCAGTGTGCTTTCACTAGACCTTGGCAACCT TGGGAAGTTGAGCTTATGAAGCAGTTGGCTAACCAAGTTGCTATTG CTATCCAACAGTCTGAGTTGTACGAACAACTTCAACAGTTGAATAA
GGATCTTGAGAACAGAGTTGAAAAAAGAACACAACAGTTGGCTGCT ACTAATCAGTCTCTTAGGATGGAAATCTCTGAAAGACAAAAGACTG AGGCTGCTTTGAGACATACTAACCATACACTTCAGTCTTTGATTGC TGCTTCTCCTAGAGGTATCTTTACTCTTAATTTGGCTGATCAAATT CAGATCTGGAACCCAACAGCTGAGCGAATCTTCGGATGGACTGAAA CAGAGATTATCGCTCATCCTGAGCTTTTGACATCTAACATCCTTTT GGAAGATTACCAACAGTTTAAGCAAAAGGTTCTTTCTGGTATGGTT TCTCCATCTCTTGAGTTGAAGTGTCAGAAGAAAGATGGATCTTGGA TTGAAATCGTTTTGTCTGCTGCTCCTCTTTTGGATTCTGAAGAGAA CATTGCTGGTCTTGTTGCTGTTGTTGCTGATATCACTGAGCAAAAA AGACAGGCTGAACAAATCAGACTTTTGCAATCTGTTGTTGTTAACA CAAACGATGCTGTTGTTATTACTGAAGCTGAACCAATCGATGATCC TGGACCAAGAATCCTTTATGTTAATGAGGCTTTCACTAAGATCACA GGATACACTGCTGAAGAGATGTTGGGAAAGACTCCTAGAGTTCTTC AAGGACCAAAAACTTCAAGAACTGAGTTGGATAGAGTTAGACAGGC TATCTCTCAATGGCAGTCTGTTACAGTTGAAGTTATTAATTACAGA AAGGATGGTTCTGAGTTTTGGGTTGAATTTTCTCTTGTTCCTGTTG CTAACAAAACAGGATTTTACACTCATTGGATTGCTGTTCAAAGAGA TGTTACAGAGAGAAGAAGAACTGAAGAGGTTAGACTTGCTTTGGAA AGAGAGAAGGAACTTTCAAGATTGAAGACTAGATTTTTCTCTATGG CTTCTCATGAGTTTAGAACACCACTTTCTACTGCTTTGGCTGCTGC TCAACTTCTTGAAAATTCTGAAGTTGCTTGGCTTGATCCTGATAAG AGATCAAGAAACCTTCATAGAATCCAAAATTCTGTTAAAAACATGG TTCAACTTTTGGATGATATCTTGATTATCAACAGAGCTGAGGCTGG AAAGCTTGAGTTTAATCCAAACTGGCTTGATTTGAAGCTTTTGTTC CAACAGTTCATTGAAGAGATCCAGCTTTCTGTTTCTGATCAATACT ACTTCGATTTCATCTGTTCTGCTCAAGATACTAAGGCTCTTGTTGA TGAAAGATTGGTTAGATCTATCCTTTCTAATCTTTTGTCTAACGCT ATCAAGTACTCTCCTGGAGGTGGACAGATTAAAATCGCTCTTTCTT TGGATTCTGAGCAGATTATCTTCGAAGTTACAGATCAAGGTATTGG AATCTCTCCTGAGGATCAAAAGCAGATCTTTGAACCATTCCATAGA GGAAAGAATGTTAGAAACATTACTGGTACAGGACTTGGTTTGATGG TTGCTAAGAAATGTGTTGATCTTCATTCTGGATCTATCCTTTTGAA GTCTGCTGTGGATCAAGGAACAACTGTGACCATCTGTCTCAAAAGG TACAACCATCTCCCAAGGGCT CcaR variants SEQ ID NO: 13: F2A30(aa30):NLS:2xGGS:VP64: 4xGGS:Cca Ramino acid PGSLQPKKKRKVGGGGSGGSDALDDFDLDMLGSDALDDFDLDMLGS DALDDFDLDMLGSDALDDFDLDMLGGSGGSGGSGGSMRILLVEDDL PLAETLAEALSDQLYTVDIATDASLAWDYASRLEYDLVILDVMLPE LDGITLCQKWRSHSYLMPILMMTARDTINDKITGLDAGADDYVVKP VDLGELFARVRALLRRGCATCQPVLEWGPIRLDPSTYEVSYDNEVL SLTRKEYSILELLLRNGRRVLSRSMIIDSIWKLESPPEEDTVKVHV 
RSLRQKLKSAGLSADAIETVHGIGYRLANLTEKSLCQGKN SEQ ID NO: 14: F2A30(aa30):NLS:2xGGS:VP64: 4xGGS:CcaR nucleic acid CCAGGTTCACTCCAGCCTAAGAAGAAGAGAAAGGTTGGAGGTGGTG GCTCCGGAGGCTCTGATGCCCTCGACGATTTCGACCTCGATATGCT CGGTTCTGATGCTCTCGATGACTTTGACCTTGACATGCTTGGATCA GACGCTTTGGACGACTTCGACTTGGACATGTTGGGATCTGATGCAC TTGATGATTTTGACCTTGATATGCTTGGTGGTTCAGGAGGGTCTGG TGGATCAGGAGGATCTATGAGAATACTCCTCGTGGAAGATGATTTG CCATTAGCAGAAACCCTCGCAGAAGCTTTGTCTGATCAACTTTACA CTGTTGATATTGCTACAGATGCTTCTTTGGCTTGGGATTATGCTTC TAGACTTGAATACGATTTGGTTATTCTTGATGTTATGTTGCCTGAG CTTGATGGAATTACTCTTTGTCAGAAGTGGAGATCTCATTCTTATT TGATGCCAATCCTTATGATGACTGCTAGAGATACAATTAATGATAA GATCACAGGACTTGATGCTGGTGCTGATGATTACGTTGTTAAACCT GTTGATTTGGGTGAACTTTTTGCTAGAGTTAGAGCTCTTTTGAGAA GAGGATGTGCTACTTGTCAACCAGTTTTGGAGTGGGGTCCTATTAG ACTTGATCCATCTACTTATGAAGTTTCTTACGATAATGAGGTTTTG TCTCTTACAAGAAAGGAATACTCTATCTTGGAGCTTTTGCTTAGAA ACGGAAGAAGAGTTCTTTCTAGATCTATGATCATCGATTCTATCTG GAAGTTGGAGTCTCCTCCAGAAGAGGATACAGTTAAAGTTCATGTT AGATCTTTGAGACAAAAGCTTAAGTCTGCTGGACTTTCTGCTGATG CTATTGAAACTGTTCATGGAATCGGTTACAGATTGGCTAATCTTAC AGAGAAGTCTTTGTGTCAGGGAAAGAAT SEQ ID NO: 15: F2A30(aa30):CcaR:4xGSS:VP64: 2xGGS:NLS amino acid PMRILLVEDDLPLAETLAEALSDQLYTVDIATDASLAWDYASRLEY DLVILDVMLPELDGITLCQKWRSHSYLMPILMMTARDTINDKITGL DAGADDYVVKPVDLGELFARVRALLRRGCATCQPVLEWGPIRLDPS DTYEVSYDNEVLSLTRKEYSILELLLRNGRRVLSRSMIISIWKLES PPEEDTVKVHVRSLRQKLKSAGLSADAIETVHGIGYRLANLTEKSL NCQGKGGSGGSGGSGGSDALDDFDLDMLGSDALDDFDLDMLGSDAL DDFDLDMLGSDALDDFDLDMLGGSGGSLQPKKKRKVGG SEQ ID NO: 16: F2A30(aa30):CcaR:4xGSS:VP64: 2xGGS:NLS nucleic acid CCAATGAGAATACTCCTCGTGGAAGATGATTTGCCATTAGCAGAAA CCCTCGCAGAAGCTTTGTCTGATCAACTTTACACTGTTGATATTGC TACAGATGCTTCTTTGGCTTGGGATTATGCTTCTAGACTTGAATAC GATTTGGTTATTCTTGATGTTATGTTGCCTGAGCTTGATGGAATTA CTCTTTGTCAGAAGTGGAGATCTCATTCTTATTTGATGCCAATCCT TATGATGACTGCTAGAGATACAATTAATGATAAGATCACAGGACTT GATGCTGGTGCTGATGATTACGTTGTTAAACCTGTTGATTTGGGTG AACTTTTTGCTAGAGTTAGAGCTCTTTTGAGAAGAGGATGTGCTAC TTGTCAACCAGTTTTGGAGTGGGGTCCTATTAGACTTGATCCATCT TCACTTATGAAGTTTCTTACGATAATGAGGTTTTGTCTTACAAGAA 
AGGAATACTCTATCTTGGAGCTTTTGCTTAGAAACGGAAGAAGAGT TCTTTCTAGATCTATGATCATCGATTCTATCTGGAAGTTGGAGTCT CCTCCAGAAGAGGATACAGTTAAAGTTCATGTTAGATCTTTGAGAC AAAAGCTTAAGTCTGCTGGACTTTCTGCTGATGCTATTGAAACTGT TCATGGAATCGGTTACAGATTGGCTAATCTTACAGAGAAGTCTTTG TGTCAGGGAAAGAATGGAGGCTCCGGTGGGTCAGGTGGTTCTGGAG GCTCGGATGCCCTCGACGATTTCGACCTCGATATGCTCGGTTCTGA TGCTCTCGATGACTTTGACCTTGACATGCTTGGATCAGACGCTTTG GACGACTTCGACTTGGACATGTTGGGATCTGATGCACTTGATGATT TTGACCTTGATATGCTTGGCGGTTCCGGTGGATCACTCCAGCCTAA GAAGAAGAGAAAGGTTGGAGGT Synthetic plant promoter and cognate transcription activator SEQ ID NO: 17: CTTTCCGATTTCTTTACGATTTCCGCTTTCCGATTTCTTTACGATT TGGCTTTCCGATTTCTTTACGATTTATCCTTCGCAAGACCCTTCCT CTATATAAGGAAGTTCATTTCATTTGGAGAGGA SEQ ID NO: 40; ccaR CRE motif CTTTCCGATTTCTTTACGATTT SEQ ID NO: 41; P35Smin(-51) CTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGA GAGGA) SEQ ID NO: 42: Terminator sequence (Trbcs) AGCTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCAAGTTCAA TGCATCAGTTTCATTGCGCACACACCAGAATCCTACTGAGTTtGAG TATTATGGCATTGGGAAAacTGTTTTTCTTGTACCATTTGTTGTGC TTGTAATTTACTGTGTTTTTTATTCGGTTTTCGCTATCGAACTGTG AAATGGAAATGGATGGAGAAGAGTTAATGAATGATATGGTCCTTTT GTTCATTCTCAAATTAATATTATTTGTTTTTTCTCTTATTTGTTGT GTGTTGAATTTGAAAtTATAAGAGATATGCAAACATTTTGTTTTGA GTAAAAATGTGTCAAATCGTGGCCTCTAATGACCGAAGTTAATATG AGGAGTAAAACACTTGTAGTTGTACCATTATGCTTATTCACTAGGC AACAAATATATTTTCAGACCTAGAAAAGCTGCAAATGTTACTGAAT ACAAGTATGTCCTCTTGTGTTTTAGACATTTATGAACTTTCCTTTA TGTAATTTTCCAGAATCCTTGTCAGATTCTAATCATTGCTTTATAA TTATAGTTATACTCATGGATTTGTAGTTGAGTATGAAAATATTTTT TAATGCATTTTATGACTTGCCAATTGATTGACAACATGCATCAaTC G SEQ ID NO: 43: Terminator sequence (NOS terminator): TAGAGTAGATGCCGACCGAACAAGAGCTGATTTCGAGAACGCCTCA GCCAGCAACTCGCGCGAGCCTAGCAAGGCAAATGCGAGAGAACGGC CTTACGCTTGGTGGCACAGTTCTCGTCCACAGTTCGCTAAGCTCGC TCGGCTGGGTCGCGGGAGGGCCGGTCGCAGTGATTCAGGAATTAAT TCCCTAGAGTCAAGCAGATCGTTCAAACATTTGGCAATAAAGTTTC
TTAAGATTGAATCCTGTTGCCGGTCTTGCGATGATTATCATATAAT TTCTGTTGAATTACGTTAAGCATGTAATAATTAACATGTAATGCAT GACGTTATTTATGAGATGGGTTTTTATGATTAGAGTCCCGCAATTA TACATTTAATACGCGATAGAAAACAAAATATAGCGCGCAAACTAGG ATAAATTATCGCGCGCGGTGTCATCTATGTTACTAGATCGACCGGC ATGCAAGCTGAT SEQ ID NO: 44 UBQ10 promoter ACCCGACGAGtCAGTAATAAACGGCGTCAAAGTGGTTGCAGCCGGC ACACACGAGTCGTGTTTATCAACTCAAAGCACAAATACTTTTCCTC AACCTAAAAATAAGGCAATTAGCCAAAAACAACTTTGCGTGTAAAC AACGCTCAATACACGTGTCATTTTATTATTAGCTATTGCTTCACCG CCTTAGCTTTCTCGTGACCTAGTCGTCCTCGTCTTTTCTTCTTCTT CTTCTATAAAACAATACCCAAAGAGCTCTTCTTCTTCACAATTCAG ATTTCAATTTCTCAAAATCTTAAAAACTTTCTCTCAATTCTCTCTA CCGTGATCAAGGTAAATTTCTGTGTTCCTTATTCTCTCAAAATCTT CGATTTTGTTTTCGTTCGATCCCAATTTCGTATATGTTCTTTGGTT TAGATTCTGTTAATCTTAGATCGAAGACGATTTTCTGGGTTTGATC GTTAGATATCATCTTAATTCTCGATTAGGGTTTCATAGATATCATC CGATTTGTTCAAATAATTTGAGTTTTGTCGAATAATTACTCTTCGA TTTGTGATTTCTATCTAGATCTGGTGTTAGTTTCTAGTTTGTGCGA TCGAATTTGTAGATTAATCTGAGTTTTTCTGATTAACAGCTCGAGT GCGGGATC SEQ ID NO: 47 LRHK1-01 nucleic acid sequence ATGATGTTACAACCAAAGAAGAAAAGGAAGGTGGGTGGAAGACAAA ACCAAGAACGCCGCAGGATTGAAATTAGCATCAAGCAACAAACCCA ACGGGAACGATTTATTAACCAAATTACCCAACATATCCGCCAATCT TTAAACTTGGAAACGGTTTTAAATACCACCGTCGCTGAAGTTAAAA CCCTGTTGCAAGTTGATCGAGTTGCCGTGTACCGTTTTAACCCGGA TTGGAGCGGCGAGTTTGTGGCCGAAAGCGTGGGTAGCGGTTGGGTG AAACTGGTGGGCCCGGATATCAAAACCGTGTGGGAAGACACACATC TGCAAGAAACCCAAGGTGGTCGCTATCGCCATCAAGAAAGCTTCGT GGTGAACGACATTTATGAGGCCGGCCATTTCAGCTGCCATCTGGAG ATTTTAGAACAGTTTGAAATTAAAGCCTACATTATCGTGCCGGTTT TTGCCGCCGAAAAACTGTGGGGTTTACTGGCCGCCTATCAGAACAG TGGTACCCGCGAATGGGTGGAATGGGAAAGCAGCTTTCTGACCCAA GTTGGTCTGCAGTTCGGCATCGCCATCCAACAATCGGAATTATATG AGCAATTACAGCAACTCAATAAAGATTTGGAAAACCGAGTCGAAAA ACGCACCCAGCAACTTGCCGCCACCAATCAATCCCTAAGAATGGAA ATCAGTGAGCGACAAAAAACGGAAGCCGCTCTCCGCCACACTAACC ATACTCTGCAATCCCTGATTGCGGCCTCCCCCAGGGGTATTTTTAC CCTTAATTTAGCAGACCAAATTCAGATTTGGAATCCTACAGCAGAA CGTATTTTTGGTTGGACAGAAACAGAAATTATTGCCCATCCAGAAT TATTAACATCCAACATTTTGCTGGAAGATTATCAGCAATTTAAACA GAAAGTTTTATCAGGCATGGTTTCCCCTAGCCTAGAATTAAAATGT 
CAAAAAAAAGATGGTAGTTGGATTGAAATTGTCCTTTCCGCTGCTC CCCTATTGGATAGTGAAGAAAATATTGCCGGATTGGTGGCGGTTGT CGCCGATATTACCGAGCAAAAGCGGCAGGCAGAACAAATTCGTTTG CTACAATCCGTTGTGGTTAATACTAATGATGCGGTGGTGATTACGG AAGCGGAGCCCATTGATGATCCCGGGCCGAGAATTCTCTATGTCAA TGAAGCATTTACTAAAATCACCGGTTATACTGCTGAAGAAATGCTA GGCAAAACCCCCCGAGTTTTACAGGGACCAAAAACTAGTCGCACTG AATTAGATAGGGTGCGGCAAGCCATTAGTCAATGGCAATCAGTTAC CGTTGAAGTGATTAATTATCGTAAGGATGGCAGTGAGTTTTGGGTG GAATTTAGTCTGGTGCCCGTTGCCAATAAAACAGGTTTTTACACCC ATTGGATTGCTGTGCAAAGGGATGTCACTGAGCGCCGACGCACGGA GGAAGTCCGCCTAGCTTTAGAACGGGAAAAAGAATTAAGCCGCCTA AAAACTCGTTTTTTCTCCATGGCTTCCCATGAATTTCGTACTCCCC TCAGTACGGCCTTAGCTGCTGCCCAATTACTGGAAAATTCTGAAGT GGCCTGGCTTGATCCCGATAAGCGTAGCCGGAACTTACACCGTATT CAAAATTCCGTGAAAAATATGGTACAGCTCCTGGATGATATTTTAA TCATTAACCGTGCCGAAGCGGGCAAATTGGAATTTAATCCTAATTG GTTAGATTTGAAATTATTGTTCCAGCAATTTATCGAAGAAATTCAA TTAAGTGTCAGTGACCAATATTATTTTGACTTTATTTGTAGCGCTC AAGATACGAAGGCATTGGTGGATGAAAGGTTAGTGCGGTCTATTTT ATCTAATCTGTTATCTAATGCGATTAAATACTCTCCCGGGGGAGGG CAGATTAAAATTGCCCTAAGCCTAGATTCGGAACAGATTATTTTTG AAGTCACCGACCAGGGCATTGGCATTTCGCCAGAGGACCAAAAGCA AATTTTTGAACCCTTTCATCGGGGCAAAAATGTCAGAAATATTACG GGAACAGGACTCGGTTTAATGGTTGCCAAGAAATGTGTTGACTTAC ACAGTGGCAGTATCTTGCTAAAAAGTGCAGTTGACCAGGGAACAAC AGTTACTATCTGTTTAAAACGCTATAACCATTTGCCTCGAGCTCAC AAACAGAAAATTGTGGCACCGGTGAAGCAGACTCTCAACTTTGACT TGCTAAAGTTAGCTGGTGATGTTGAATCTAATCCTGGA SEQ ID NO: 48 LRHK1-05 nucleic acid sequence ATGATGTTACAACCAAAGAAGAAAAGGAAGGTGGGTGGAAGACAAA ACCAAGAACGCCGCAGGATTGAAATTAGCATCAAGCAACAAACCCA ACGGGAACGATTTATTAACCAAATTACCCAACATATCCGCCAATCT TTAAACTTGGAAACGGTTTTAAATACCACCGTCGCTGAAGTTAAAA CCCTGTTGCAAGTTGATCGAGTTCTGGTGTATCGCTTTAACCCGGA TTGGAGCGGCGAGTTTATCCATGAAAGCGTGGCCCAGATGTGGGAA CCGCTGAAGGATCTGCAGAACAACTTTCCGCTGTGGCAAGATACCT ATTTACAAGAAAATGAGGGTGGCCGCTACCGCAATCATGAAAGTCT GGCCGTGGGCGATGTGGAAACCGCCGGTTTCACCGATTGCCATTTA GATAATCTGCGTCGCTTCGAAATTCGCGCCTTTCTGACCGTGCCGG TTTTTGTTGGTGAACAGCTGTGGGGTCTGCTGGGCGCCTATCAGAA TGGTGCACCGCGCCATTGGCAAGCTCGCGAAATTCATCTGCTGCAC 
CAGATCGCCAACCAGCTGGGTATCGCCATCCAACAATCGGAATTAT ATGAGCAATTACAGCAACTCAATAAAGATTTGGAAAACCGAGTCGA AAAACGCACCCAGCAACTTGCCGCCACCAATCAATCCCTAAGAATG GAAATCAGTGAGCGACAAAAAACGGAAGCCGCTCTCCGCCACACTA ACCATACTCTGCAATCCCTGATTGCGGCCTCCCCCAGGGGTATTTT TACCCTTAATTTAGCAGACCAAATTCAGATTTGGAATCCTACAGCA GAACGTATTTTTGGTTGGACAGAAACAGAAATTATTGCCCATCCAG AATTATTAACATCCAACATTTTGCTGGAAGATTATCAGCAATTTAA ACAGAAAGTTTTATCAGGCATGGTTTCCCCTAGCCTAGAATTAAAA TGTCAAAAAAAAGATGGTAGTTGGATTGAAATTGTCCTTTCCGCTG CTCCCCTATTGGATAGTGAAGAAAATATTGCCGGATTGGTGGCGGT TGTCGCCGATATTACCGAGCAAAAGCGGCAGGCAGAACAAATTCGT TTGCTACAATCCGTTGTGGTTAATACTAATGATGCGGTGGTGATTA CGGAAGCGGAGCCCATTGATGATCCCGGGCCGAGAATTCTCTATGT CAATGAAGCATTTACTAAAATCACCGGTTATACTGCTGAAGAAATG CTAGGCAAAACCCCCCGAGTTTTACAGGGACCAAAAACTAGTCGCA CTGAATTAGATAGGGTGCGGCAAGCCATTAGTCAATGGCAATCAGT TACCGTTGAAGTGATTAATTATCGTAAGGATGGCAGTGAGTTTTGG GTGGAATTTAGTCTGGTGCCCGTTGCCAATAAAACAGGTTTTTACA CCCATTGGATTGCTGTGCAAAGGGATGTCACTGAGCGCCGACGCAC GGAGGAAGTCCGCCTAGCTTTAGAACGGGAAAAAGAATTAAGCCGC CTAAAAACTCGTTTTTTCTCCATGGCTTCCCATGAATTTCGTACTC CCCTCAGTACGGCCTTAGCTGCTGCCCAATTACTGGAAAATTCTGA AGTGGCCTGGCTTGATCCCGATAAGCGTAGCCGGAACTTACACCGT ATTCAAAATTCCGTGAAAAATATGGTACAGCTCCTGGATGATATTT TAATCATTAACCGTGCCGAAGCGGGCAAATTGGAATTTAATCCTAA TTGGTTAGATTTGAAATTATTGTTCCAGCAATTTATCGAAGAAATT CAATTAAGTGTCAGTGACCAATATTATTTTGACTTTATTTGTAGCG CTCAAGATACGAAGGCATTGGTGGATGAAAGGTTAGTGCGGTCTAT TTTATCTAATCTGTTATCTAATGCGATTAAATACTCTCCCGGGGGA GGGCAGATTAAAATTGCCCTAAGCCTAGATTCGGAACAGATTATTT TTGAAGTCACCGACCAGGGCATTGGCATTTCGCCAGAGGACCAAAA GCAAATTTTTGAACCCTTTCATCGGGGCAAAAATGTCAGAAATATT ACGGGAACAGGACTCGGTTTAATGGTTGCCAAGAAATGTGTTGACT TACACAGTGGCAGTATCTTGCTAAAAAGTGCAGTTGACCAGGGAAC AACAGTTACTATCTGTTTAAAACGCTATAACCATTTGCCTCGAGCT CACAAACAGAAAATTGTGGCACCGGTGAAGCAGACTCTCAACTTTG ACTTGCTAAAGTTAGCTGGTGATGTTGAATCTAATCCTGGA SEQ ID NO: 49 LRHK1-10 nucleic acid sequence ATGATGTTACAACCAAAGAAGAAAAGGAAGGTGGGTGGAAGACAAA ACCAAGAACGCCGCAGGATTGAAATTAGCATCAAGCAACAAACCCA ACGGGAACGATTTATTAACCAAATTACCCAACATATCCGCCAATCT
TTAAACTTGGAAACGGTTTTAAATACCACCGTCGCTGAAGTTAAAA CCCTGTTGCAAGTTGATCGAGTTACCATTTATCGTTTTCGCGCCGA TTGGAGCGGTGAATTTGTGGCCGAATCTTTAGCCCAAGGTTGGACA CCGGTGCGTGAAATTGTGCCGGTGGTTGCCGATGACTATCTGCAAG AAACCCAAGGTCGCAACTTTGCCAATGGCAAAAGCATCGTGATTAA AGATATTTACAGCGCCAACTACAGCATCTGCCACATTGCACTGCTG GAACTGATGCAAGCTCGCGCCTATATGATCGTGCCGATCTTCCAAG GTGAAAAGCTGTGGGGTCTGCTGGCCGCCTATCAGAACATCAAGCC TCGCGATTGGCAAGAAGATGAGGTGGATCTGGTGATGCAGATCGGT ACCCAGCTGGGCATCGCCATCCAACAATCGGAATTATATGAGCAAT TACAGCAACTCAATAAAGATTTGGAAAACCGAGTCGAAAAACGCAC CCAGCAACTTGCCGCCACCAATCAATCCCTAAGAATGGAAATCAGT GAGCGACAAAAAACGGAAGCCGCTCTCCGCCACACTAACCATACTC TGCAATCCCTGATTGCGGCCTCCCCCAGGGGTATTTTTACCCTTAA TTTAGCAGACCAAATTCAGATTTGGAATCCTACAGCAGAACGTATT TTTGGTTGGACAGAAACAGAAATTATTGCCCATCCAGAATTATTAA CATCCAACATTTTGCTGGAAGATTATCAGCAATTTAAACAGAAAGT TTTATCAGGCATGGTTTCCCCTAGCCTAGAATTAAAATGTCAAAAA AAAGATGGTAGTTGGATTGAAATTGTCCTTTCCGCTGCTCCCCTAT TGGATAGTGAAGAAAATATTGCCGGATTGGTGGCGGTTGTCGCCGA TATTACCGAGCAAAAGCGGCAGGCAGAACAAATTCGTTTGCTACAA TCCGTTGTGGTTAATACTAATGATGCGGTGGTGATTACGGAAGCGG AGCCCATTGATGATCCCGGGCCGAGAATTCTCTATGTCAATGAAGC ATTTACTAAAATCACCGGTTATACTGCTGAAGAAATGCTAGGCAAA ACCCCCCGAGTTTTACAGGGACCAAAAACTAGTCGCACTGAATTAG ATAGGGTGCGGCAAGCCATTAGTCAATGGCAATCAGTTACCGTTGA AGTGATTAATTATCGTAAGGATGGCAGTGAGTTTTGGGTGGAATTT AGTCTGGTGCCCGTTGCCAATAAAACAGGTTTTTACACCCATTGGA TTGCTGTGCAAAGGGATGTCACTGAGCGCCGACGCACGGAGGAAGT CCGCCTAGCTTTAGAACGGGAAAAAGAATTAAGCCGCCTAAAAACT CGTTTTTTCTCCATGGCTTCCCATGAATTTCGTACTCCCCTCAGTA CGGCCTTAGCTGCTGCCCAATTACTGGAAAATTCTGAAGTGGCCTG GCTTGATCCCGATAAGCGTAGCCGGAACTTACACCGTATTCAAAAT TCCGTGAAAAATATGGTACAGCTCCTGGATGATATTTTAATCATTA ACCGTGCCGAAGCGGGCAAATTGGAATTTAATCCTAATTGGTTAGA TTTGAAATTATTGTTCCAGCAATTTATCGAAGAAATTCAATTAAGT GTCAGTGACCAATATTATTTTGACTTTATTTGTAGCGCTCAAGATA CGAAGGCATTGGTGGATGAAAGGTTAGTGCGGTCTATTTTATCTAA TCTGTTATCTAATGCGATTAAATACTCTCCCGGGGGAGGGCAGATT AAAATTGCCCTAAGCCTAGATTCGGAACAGATTATTTTTGAAGTCA CCGACCAGGGCATTGGCATTTCGCCAGAGGACCAAAAGCAAATTTT TGAACCCTTTCATCGGGGCAAAAATGTCAGAAATATTACGGGAACA 
GGACTCGGTTTAATGGTTGCCAAGAAATGTGTTGACTTACACAGTG GCAGTATCTTGCTAAAAAGTGCAGTTGACCAGGGAACAACAGTTAC TATCTGTTTAAAACGCTATAACCATTTGCCTCGAGCTCACAAACAG AAAATTGTGGCACCGGTGAAGCAGACTCTCAACTTTGACTTGCTAA AGTTAGCTGGTGATGTTGAATCTAATCCTGGA SEQ ID NO: 50 LRHK1-12 nucleic acid sequence ATGATGTTACAACCAAAGAAGAAAAGGAAGGTGGGTGGAAGACAAA ACCAAGAACGCCGCAGGATTGAAATTAGCATCAAGCAACAAACCCA ACGGGAACGATTTATTAACCAAATTACCCAACATATCCGCCAATCT TTAAACTTGGAAACGGTTTTAAATACCACCGTCGCTGAAGTTAAAA CCCTGTTGCAAGTTGATCGAGTTGTTATTTTTCAGTTTTCACCCGA CTCTGACTTTTCCGTTGGTAATATTGTGGCAGAGTCGGTATTGGCT CCATTTAAGCCAATCATTAATAGTGCAATTGAAGAAACTTGTTTTA GTAATAACTATGCCCAAAGGTATCAGCAGGGCAGAATTCAGGTCAT TGAGGATATTCACCAGTCCCATCTTAGGCAATGCCACATTGACTTT CTTGCCAGGCTACAGGTCAGGGCAAACCTAGTGCTACCACTAATTA ATGATGCCATTTTGTGGGGCTTATTGTGTATTCATCAATGTGACAG TTCTAGAGTTTGGGAACAAACAGAAATTGATCTGCTCAAGCAGATC ACTAATCAGTTTGAAATCGCCATCCAACAATCGGAATTATATGAGC AATTACAGCAACTCAATAAAGATTTGGAAAACCGAGTCGAAAAACG CACCCAGCAACTTGCCGCCACCAATCAATCCCTAAGAATGGAAATC AGTGAGCGACAAAAAACGGAAGCCGCTCTCCGCCACACTAACCATA CTCTGCAATCCCTGATTGCGGCCTCCCCCAGGGGTATTTTTACCCT TAATTTAGCAGACCAAATTCAGATTTGGAATCCTACAGCAGAACGT ATTTTTGGTTGGACAGAAACAGAAATTATTGCCCATCCAGAATTAT TAACATCCAACATTTTGCTGGAAGATTATCAGCAATTTAAACAGAA AGTTTTATCAGGCATGGTTTCCCCTAGCCTAGAATTAAAATGTCAA AAAAAAGATGGTAGTTGGATTGAAATTGTCCTTTCCGCTGCTCCCC TATTGGATAGTGAAGAAAATATTGCCGGATTGGTGGCGGTTGTCGC CGATATTACCGAGCAAAAGCGGCAGGCAGAACAAATTCGTTTGCTA CAATCCGTTGTGGTTAATACTAATGATGCGGTGGTGATTACGGAAG CGGAGCCCATTGATGATCCCGGGCCGAGAATTCTCTATGTCAATGA AGCATTTACTAAAATCACCGGTTATACTGCTGAAGAAATGCTAGGC AAAACCCCCCGAGTTTTACAGGGACCAAAAACTAGTCGCACTGAAT TAGATAGGGTGCGGCAAGCCATTAGTCAATGGCAATCAGTTACCGT TGAAGTGATTAATTATCGTAAGGATGGCAGTGAGTTTTGGGTGGAA TTTAGTCTGGTGCCCGTTGCCAATAAAACAGGTTTTTACACCCATT GGATTGCTGTGCAAAGGGATGTCACTGAGCGCCGACGCACGGAGGA AGTCCGCCTAGCTTTAGAACGGGAAAAAGAATTAAGCCGCCTAAAA ACTCGTTTTTTCTCCATGGCTTCCCATGAATTTCGTACTCCCCTCA GTACGGCCTTAGCTGCTGCCCAATTACTGGAAAATTCTGAAGTGGC CTGGCTTGATCCCGATAAGCGTAGCCGGAACTTACACCGTATTCAA 
AATTCCGTGAAAAATATGGTACAGCTCCTGGATGATATTTTAATCA TTAACCGTGCCGAAGCGGGCAAATTGGAATTTAATCCTAATTGGTT AGATTTGAAATTATTGTTCCAGCAATTTATCGAAGAAATTCAATTA AGTGTCAGTGACCAATATTATTTTGACTTTATTTGTAGCGCTCAAG ATACGAAGGCATTGGTGGATGAAAGGTTAGTGCGGTCTATTTTATC TAATCTGTTATCTAATGCGATTAAATACTCTCCCGGGGGAGGGCAG ATTAAAATTGCCCTAAGCCTAGATTCGGAACAGATTATTTTTGAAG TCACCGACCAGGGCATTGGCATTTCGCCAGAGGACCAAAAGCAAAT TTTTGAACCCTTTCATCGGGGCAAAAATGTCAGAAATATTACGGGA ACAGGACTCGGTTTAATGGTTGCCAAGAAATGTGTTGACTTACACA GTGGCAGTATCTTGCTAAAAAGTGCAGTTGACCAGGGAACAACAGT TACTATCTGTTTAAAACGCTATAACCATTTGCCTCGAGCTCACAAA CAGAAAATTGTGGCACCGGTGAAGCAGACTCTCAACTTTGACTTGC TAAAGTTAGCTGGTGATGTTGAATCTAATCCTGGA
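The paired amino acid and nucleic acid entries above can be cross-checked by translating the nucleic acid sequence in frame 1 with the standard genetic code. The following is a minimal sketch, not part of the application as filed; the function name `translate` and the short test fragments (copied from the start of SEQ ID NO: 10 and from the region around position 92 in SEQ ID NOs: 10 and 12) are chosen here for illustration only.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace and case are ignored.

    Trailing bases that do not form a complete codon are dropped.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 13 codons of SEQ ID NO: 10 (illustrative fragment).
print(translate("ATGATGTTACAACCAAAGAAGAAAAGGAAGGTGGGTGGA"))  # MMLQPKKKRKVGG

# Five in-frame codons spanning the A92V position.
print(translate("GGATCTGCTATCACT"))  # GSAIT  (SEQ ID NO: 10, alanine)
print(translate("GGATCTGTTATCACT"))  # GSVIT  (SEQ ID NO: 12, valine)
```

The first output agrees with the start of SEQ ID NO: 9, and the single GCT to GTT codon change accounts for the alanine to valine difference between SEQ ID NO: 9 and SEQ ID NO: 11.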
Sequence Listing
Number of sequences: 50
SEQ ID NO: 1; length 753; type PRT; Artificial Sequence; CcaS (A92V)
Met Gly Lys Phe Leu Ile Pro Ile
Glu Phe Val Phe Leu Ala Ile Ala1 5 10
15Met Thr Cys Tyr Leu Trp His Arg Gln Asn Gln Glu Arg Arg
Arg Ile 20 25 30Glu Ile Ser
Ile Lys Gln Gln Thr Gln Arg Glu Arg Phe Ile Asn Gln 35
40 45Ile Thr Gln His Ile Arg Gln Ser Leu Asn Leu
Glu Thr Val Leu Asn 50 55 60Thr Thr
Val Ala Glu Val Lys Thr Leu Leu Gln Val Asp Arg Val Leu65
70 75 80Ile Tyr Arg Ile Trp Gln Asp
Gly Thr Gly Ser Val Ile Thr Glu Ser 85 90
95Val Asn Ala Asn Tyr Pro Ser Ile Leu Gly Arg Thr Phe
Ser Asp Glu 100 105 110Val Phe
Pro Val Glu Tyr His Gln Ala Tyr Thr Lys Gly Lys Val Arg 115
120 125Ala Ile Asn Asp Ile Asp Gln Asp Asp Ile
Glu Ile Cys Leu Ala Asp 130 135 140Phe
Val Lys Gln Phe Gly Val Lys Ser Lys Leu Val Val Pro Ile Leu145
150 155 160Gln His Asn Arg Ala Ser
Ser Leu Asp Asn Glu Ser Glu Phe Pro Tyr 165
170 175Leu Trp Gly Leu Leu Ile Thr His Gln Cys Ala Phe
Thr Arg Pro Trp 180 185 190Gln
Pro Trp Glu Val Glu Leu Met Lys Gln Leu Ala Asn Gln Val Ala 195
200 205Ile Ala Ile Gln Gln Ser Glu Leu Tyr
Glu Gln Leu Gln Gln Leu Asn 210 215
220Lys Asp Leu Glu Asn Arg Val Glu Lys Arg Thr Gln Gln Leu Ala Ala225
230 235 240Thr Asn Gln Ser
Leu Arg Met Glu Ile Ser Glu Arg Gln Lys Thr Glu 245
250 255Ala Ala Leu Arg His Thr Asn His Thr Leu
Gln Ser Leu Ile Ala Ala 260 265
270Ser Pro Arg Gly Ile Phe Thr Leu Asn Leu Ala Asp Gln Ile Gln Ile
275 280 285Trp Asn Pro Thr Ala Glu Arg
Ile Phe Gly Trp Thr Glu Thr Glu Ile 290 295
300Ile Ala His Pro Glu Leu Leu Thr Ser Asn Ile Leu Leu Glu Asp
Tyr305 310 315 320Gln Gln
Phe Lys Gln Lys Val Leu Ser Gly Met Val Ser Pro Ser Leu
325 330 335Glu Leu Lys Cys Gln Lys Lys
Asp Gly Ser Trp Ile Glu Ile Val Leu 340 345
350Ser Ala Ala Pro Leu Leu Asp Ser Glu Glu Asn Ile Ala Gly
Leu Val 355 360 365Ala Val Val Ala
Asp Ile Thr Glu Gln Lys Arg Gln Ala Glu Gln Ile 370
375 380Arg Leu Leu Gln Ser Val Val Val Asn Thr Asn Asp
Ala Val Val Ile385 390 395
400Thr Glu Ala Glu Pro Ile Asp Asp Pro Gly Pro Arg Ile Leu Tyr Val
405 410 415Asn Glu Ala Phe Thr
Lys Ile Thr Gly Tyr Thr Ala Glu Glu Met Leu 420
425 430Gly Lys Thr Pro Arg Val Leu Gln Gly Pro Lys Thr
Ser Arg Thr Glu 435 440 445Leu Asp
Arg Val Arg Gln Ala Ile Ser Gln Trp Gln Ser Val Thr Val 450
455 460Glu Val Ile Asn Tyr Arg Lys Asp Gly Ser Glu
Phe Trp Val Glu Phe465 470 475
480Ser Leu Val Pro Val Ala Asn Lys Thr Gly Phe Tyr Thr His Trp Ile
485 490 495Ala Val Gln Arg
Asp Val Thr Glu Arg Arg Arg Thr Glu Glu Val Arg 500
505 510Leu Ala Leu Glu Arg Glu Lys Glu Leu Ser Arg
Leu Lys Thr Arg Phe 515 520 525Phe
Ser Met Ala Ser His Glu Phe Arg Thr Pro Leu Ser Thr Ala Leu 530
535 540Ala Ala Ala Gln Leu Leu Glu Asn Ser Glu
Val Ala Trp Leu Asp Pro545 550 555
560Asp Lys Arg Ser Arg Asn Leu His Arg Ile Gln Asn Ser Val Lys
Asn 565 570 575Met Val Gln
Leu Leu Asp Asp Ile Leu Ile Ile Asn Arg Ala Glu Ala 580
585 590Gly Lys Leu Glu Phe Asn Pro Asn Trp Leu
Asp Leu Lys Leu Leu Phe 595 600
605Gln Gln Phe Ile Glu Glu Ile Gln Leu Ser Val Ser Asp Gln Tyr Tyr 610
615 620Phe Asp Phe Ile Cys Ser Ala Gln
Asp Thr Lys Ala Leu Val Asp Glu625 630
635 640Arg Leu Val Arg Ser Ile Leu Ser Asn Leu Leu Ser
Asn Ala Ile Lys 645 650
655Tyr Ser Pro Gly Gly Gly Gln Ile Lys Ile Ala Leu Ser Leu Asp Ser
660 665 670Glu Gln Ile Ile Phe Glu
Val Thr Asp Gln Gly Ile Gly Ile Ser Pro 675 680
685Glu Asp Gln Lys Gln Ile Phe Glu Pro Phe His Arg Gly Lys
Asn Val 690 695 700Arg Asn Ile Thr Gly
Thr Gly Leu Gly Leu Met Val Ala Lys Lys Cys705 710
715 720Val Asp Leu His Ser Gly Ser Ile Leu Leu
Lys Ser Ala Val Asp Gln 725 730
735Gly Thr Thr Val Thr Ile Cys Leu Lys Arg Tyr Asn His Leu Pro Arg
740 745 750Ala
SEQ ID NO: 2; length 2262; type DNA; Artificial Sequence; CcaS (A92V)
atgggcaaat ttctaattcc aatcgaattt gtttttctgg
cgatcgccat gacctgttat 60ttatggcaca gacaaaacca agaacgccgc aggattgaaa
ttagcatcaa gcaacaaacc 120caacgggaac gatttattaa ccaaattacc caacatatcc
gccaatcttt aaacttggaa 180acggttttaa ataccaccgt cgctgaagtt aaaaccctgt
tgcaagttga tcgagttcta 240atttatcgca tttggcaaga tggcacgggc agcgtcatta
cggaatcggt gaatgccaat 300tatcctagta ttttagggcg gaccttttcc gatgaagttt
ttcccgttga ataccatcaa 360gcctacacca aaggtaaagt acgggccatt aatgacattg
accaggatga catagagatt 420tgcctagctg atttcgtcaa acaatttggc gtgaaatcaa
aattagtagt gcccattctt 480caacataatc gtgcttcttc cctagataat gaatcagaat
ttccctatct ttgggggctg 540ttaattaccc atcaatgtgc ttttacccgg ccatggcaac
cgtgggaagt ggagttaatg 600aaacagctag ccaatcaggt cgcgatcgcc atccaacaat
cggaattata tgagcaatta 660cagcaactca ataaagattt ggaaaaccga gtcgaaaaac
gcacccagca acttgccgcc 720accaatcaat ccctaagaat ggaaatcagt gagcgacaaa
aaacggaagc cgctctccgc 780cacactaacc atactctgca atccctgatt gcggcctccc
ccaggggtat ttttaccctt 840aatttagcag accaaattca gatttggaat cctacagcag
aacgtatttt tggttggaca 900gaaacagaaa ttattgccca tccagaatta ttaacatcca
acattttgct ggaagattat 960cagcaattta aacagaaagt tttatcaggc atggtttccc
ctagcctaga attaaaatgt 1020caaaaaaaag atggtagttg gattgaaatt gtcctttccg
ctgctcccct attggatagt 1080gaagaaaata ttgccggatt ggtggcggtt gtcgccgata
ttaccgagca aaagcggcag 1140gcagaacaaa ttcgtttgct acaatccgtt gtggttaata
ctaatgatgc ggtggtgatt 1200acggaagcgg agcccattga tgatcccggg ccgagaattc
tctatgtcaa tgaagcattt 1260actaaaatca ccggttatac tgctgaagaa atgctaggca
aaaccccccg agttttacag 1320ggaccaaaaa ctagtcgcac tgaattagat agggtgcggc
aagccattag tcaatggcaa 1380tcagttaccg ttgaagtgat taattatcgt aaggatggca
gtgagttttg ggtggaattt 1440agtctggtgc ccgttgccaa taaaacaggt ttttacaccc
attggattgc tgtgcaaagg 1500gatgtcactg agcgccgacg cacggaggaa gtccgcctag
ctttagaacg ggaaaaagaa 1560ttaagccgcc taaaaactcg ttttttctcc atggcttccc
atgaatttcg tactcccctc 1620agtacggcct tagctgctgc ccaattactg gaaaattctg
aagtggcctg gcttgatccc 1680gataagcgta gccggaactt acaccgtatt caaaattccg
tgaaaaatat ggtacagctc 1740ctggatgata ttttaatcat taaccgtgcc gaagcgggca
aattggaatt taatcctaat 1800tggttagatt tgaaattatt gttccagcaa tttatcgaag
aaattcaatt aagtgtcagt 1860gaccaatatt attttgactt tatttgtagc gctcaagata
cgaaggcatt ggtggatgaa 1920aggttagtgc ggtctatttt atctaatctg ttatctaatg
cgattaaata ctctcccggg 1980ggagggcaga ttaaaattgc cctaagccta gattcggaac
agattatttt tgaagtcacc 2040gaccagggca ttggcatttc gccagaggac caaaagcaaa
tttttgaacc ctttcatcgg 2100ggcaaaaatg tcagaaatat tacgggaaca ggactcggtt
taatggttgc caagaaatgt 2160gttgacttac acagtggcag tatcttgcta aaaagtgcag
ttgaccaggg aacaacagtt 2220actatctgtt taaaacgcta taaccatttg cctcgagctt
ag 2262
SEQ ID NO: 3; length 742; type PRT; Artificial Sequence; CcaS (23)
Met Leu
Gln Pro Lys Lys Lys Arg Lys Val Gly Gly Arg Gln Asn Gln1 5
10 15Glu Arg Arg Arg Ile Glu Ile Ser
Ile Lys Gln Gln Thr Gln Arg Glu 20 25
30Arg Phe Ile Asn Gln Ile Thr Gln His Ile Arg Gln Ser Leu Asn
Leu 35 40 45Glu Thr Val Leu Asn
Thr Thr Val Ala Glu Val Lys Thr Leu Leu Gln 50 55
60Val Asp Arg Val Leu Ile Tyr Arg Ile Trp Gln Asp Gly Thr
Gly Ser65 70 75 80Ala
Ile Thr Glu Ser Val Asn Ala Asn Tyr Pro Ser Ile Leu Gly Arg
85 90 95Thr Phe Ser Asp Glu Val Phe
Pro Val Glu Tyr His Gln Ala Tyr Thr 100 105
110Lys Gly Lys Val Arg Ala Ile Asn Asp Ile Asp Gln Asp Asp
Ile Glu 115 120 125Ile Cys Leu Ala
Asp Phe Val Lys Gln Phe Gly Val Lys Ser Lys Leu 130
135 140Val Val Pro Ile Leu Gln His Asn Arg Ala Ser Ser
Leu Asp Asn Glu145 150 155
160Ser Glu Phe Pro Tyr Leu Trp Gly Leu Leu Ile Thr His Gln Cys Ala
165 170 175Phe Thr Arg Pro Trp
Gln Pro Trp Glu Val Glu Leu Met Lys Gln Leu 180
185 190Ala Asn Gln Val Ala Ile Ala Ile Gln Gln Ser Glu
Leu Tyr Glu Gln 195 200 205Leu Gln
Gln Leu Asn Lys Asp Leu Glu Asn Arg Val Glu Lys Arg Thr 210
215 220Gln Gln Leu Ala Ala Thr Asn Gln Ser Leu Arg
Met Glu Ile Ser Glu225 230 235
240Arg Gln Lys Thr Glu Ala Ala Leu Arg His Thr Asn His Thr Leu Gln
245 250 255Ser Leu Ile Ala
Ala Ser Pro Arg Gly Ile Phe Thr Leu Asn Leu Ala 260
265 270Asp Gln Ile Gln Ile Trp Asn Pro Thr Ala Glu
Arg Ile Phe Gly Trp 275 280 285Thr
Glu Thr Glu Ile Ile Ala His Pro Glu Leu Leu Thr Ser Asn Ile 290
295 300Leu Leu Glu Asp Tyr Gln Gln Phe Lys Gln
Lys Val Leu Ser Gly Met305 310 315
320Val Ser Pro Ser Leu Glu Leu Lys Cys Gln Lys Lys Asp Gly Ser
Trp 325 330 335Ile Glu Ile
Val Leu Ser Ala Ala Pro Leu Leu Asp Ser Glu Glu Asn 340
345 350Ile Ala Gly Leu Val Ala Val Val Ala Asp
Ile Thr Glu Gln Lys Arg 355 360
365Gln Ala Glu Gln Ile Arg Leu Leu Gln Ser Val Val Val Asn Thr Asn 370
375 380Asp Ala Val Val Ile Thr Glu Ala
Glu Pro Ile Asp Asp Pro Gly Pro385 390
395 400Arg Ile Leu Tyr Val Asn Glu Ala Phe Thr Lys Ile
Thr Gly Tyr Thr 405 410
415Ala Glu Glu Met Leu Gly Lys Thr Pro Arg Val Leu Gln Gly Pro Lys
420 425 430Thr Ser Arg Thr Glu Leu
Asp Arg Val Arg Gln Ala Ile Ser Gln Trp 435 440
445Gln Ser Val Thr Val Glu Val Ile Asn Tyr Arg Lys Asp Gly
Ser Glu 450 455 460Phe Trp Val Glu Phe
Ser Leu Val Pro Val Ala Asn Lys Thr Gly Phe465 470
475 480Tyr Thr His Trp Ile Ala Val Gln Arg Asp
Val Thr Glu Arg Arg Arg 485 490
495Thr Glu Glu Val Arg Leu Ala Leu Glu Arg Glu Lys Glu Leu Ser Arg
500 505 510Leu Lys Thr Arg Phe
Phe Ser Met Ala Ser His Glu Phe Arg Thr Pro 515
520 525Leu Ser Thr Ala Leu Ala Ala Ala Gln Leu Leu Glu
Asn Ser Glu Val 530 535 540Ala Trp Leu
Asp Pro Asp Lys Arg Ser Arg Asn Leu His Arg Ile Gln545
550 555 560Asn Ser Val Lys Asn Met Val
Gln Leu Leu Asp Asp Ile Leu Ile Ile 565
570 575Asn Arg Ala Glu Ala Gly Lys Leu Glu Phe Asn Pro
Asn Trp Leu Asp 580 585 590Leu
Lys Leu Leu Phe Gln Gln Phe Ile Glu Glu Ile Gln Leu Ser Val 595
600 605Ser Asp Gln Tyr Tyr Phe Asp Phe Ile
Cys Ser Ala Gln Asp Thr Lys 610 615
620Ala Leu Val Asp Glu Arg Leu Val Arg Ser Ile Leu Ser Asn Leu Leu625
630 635 640Ser Asn Ala Ile
Lys Tyr Ser Pro Gly Gly Gly Gln Ile Lys Ile Ala 645
650 655Leu Ser Leu Asp Ser Glu Gln Ile Ile Phe
Glu Val Thr Asp Gln Gly 660 665
670Ile Gly Ile Ser Pro Glu Asp Gln Lys Gln Ile Phe Glu Pro Phe His
675 680 685Arg Gly Lys Asn Val Arg Asn
Ile Thr Gly Thr Gly Leu Gly Leu Met 690 695
700Val Ala Lys Lys Cys Val Asp Leu His Ser Gly Ser Ile Leu Leu
Lys705 710 715 720Ser Ala
Val Asp Gln Gly Thr Thr Val Thr Ile Cys Leu Lys Arg Tyr
725 730 735Asn His Leu Pro Arg Ala
740
SEQ ID NO: 4; length 2229; type DNA; Artificial Sequence; MNLS CcaS (23)
atgttacaac caaagaagaa
aaggaaggtg ggtggaagac aaaaccaaga acgccgcagg 60attgaaatta gcatcaagca
acaaacccaa cgggaacgat ttattaacca aattacccaa 120catatccgcc aatctttaaa
cttggaaacg gttttaaata ccaccgtcgc tgaagttaaa 180accctgttgc aagttgatcg
agttctaatt tatcgcattt ggcaagatgg cacgggcagc 240gccattacgg aatcggtgaa
tgccaattat cctagtattt tagggcggac cttttccgat 300gaagtttttc ccgttgaata
ccatcaagcc tacaccaaag gtaaagtacg ggccattaat 360gacattgacc aggatgacat
agagatttgc ctagctgatt tcgtcaaaca atttggcgtg 420aaatcaaaat tagtagtgcc
cattcttcaa cataatcgtg cttcttccct agataatgaa 480tcagaatttc cctatctttg
ggggctgtta attacccatc aatgtgcttt tacccggcca 540tggcaaccgt gggaagtgga
gttaatgaaa cagctagcca atcaggtcgc gatcgccatc 600caacaatcgg aattatatga
gcaattacag caactcaata aagatttgga aaaccgagtc 660gaaaaacgca cccagcaact
tgccgccacc aatcaatccc taagaatgga aatcagtgag 720cgacaaaaaa cggaagccgc
tctccgccac actaaccata ctctgcaatc cctgattgcg 780gcctccccca ggggtatttt
tacccttaat ttagcagacc aaattcagat ttggaatcct 840acagcagaac gtatttttgg
ttggacagaa acagaaatta ttgcccatcc agaattatta 900acatccaaca ttttgctgga
agattatcag caatttaaac agaaagtttt atcaggcatg 960gtttccccta gcctagaatt
aaaatgtcaa aaaaaagatg gtagttggat tgaaattgtc 1020ctttccgctg ctcccctatt
ggatagtgaa gaaaatattg ccggattggt ggcggttgtc 1080gccgatatta ccgagcaaaa
gcggcaggca gaacaaattc gtttgctaca atccgttgtg 1140gttaatacta atgatgcggt
ggtgattacg gaagcggagc ccattgatga tcccgggccg 1200agaattctct atgtcaatga
agcatttact aaaatcaccg gttatactgc tgaagaaatg 1260ctaggcaaaa ccccccgagt
tttacaggga ccaaaaacta gtcgcactga attagatagg 1320gtgcggcaag ccattagtca
atggcaatca gttaccgttg aagtgattaa ttatcgtaag 1380gatggcagtg agttttgggt
ggaatttagt ctggtgcccg ttgccaataa aacaggtttt 1440tacacccatt ggattgctgt
gcaaagggat gtcactgagc gccgacgcac ggaggaagtc 1500cgcctagctt tagaacggga
aaaagaatta agccgcctaa aaactcgttt tttctccatg 1560gcttcccatg aatttcgtac
tcccctcagt acggccttag ctgctgccca attactggaa 1620aattctgaag tggcctggct
tgatcccgat aagcgtagcc ggaacttaca ccgtattcaa 1680aattccgtga aaaatatggt
acagctcctg gatgatattt taatcattaa ccgtgccgaa 1740gcgggcaaat tggaatttaa
tcctaattgg ttagatttga aattattgtt ccagcaattt 1800atcgaagaaa ttcaattaag
tgtcagtgac caatattatt ttgactttat ttgtagcgct 1860caagatacga aggcattggt
ggatgaaagg ttagtgcggt ctattttatc taatctgtta 1920tctaatgcga ttaaatactc
tcccggggga gggcagatta aaattgccct aagcctagat 1980tcggaacaga ttatttttga
agtcaccgac cagggcattg gcatttcgcc agaggaccaa 2040aagcaaattt ttgaaccctt
tcatcggggc aaaaatgtca gaaatattac gggaacagga 2100ctcggtttaa tggttgccaa
gaaatgtgtt gacttacaca gtggcagtat cttgctaaaa 2160agtgcagttg accagggaac
aacagttact atctgtttaa aacgctataa ccatttgcct 2220cgagcttag
2229
SEQ ID NO: 5; length 731; type PRT; Artificial Sequence; CcaS (22 A92V)
Met Arg Gln Asn Gln Glu Arg Arg Arg Ile Glu Ile
Ser Ile Lys Gln1 5 10
15Gln Thr Gln Arg Glu Arg Phe Ile Asn Gln Ile Thr Gln His Ile Arg
20 25 30Gln Ser Leu Asn Leu Glu Thr
Val Leu Asn Thr Thr Val Ala Glu Val 35 40
45Lys Thr Leu Leu Gln Val Asp Arg Val Leu Ile Tyr Arg Ile Trp
Gln 50 55 60Asp Gly Thr Gly Ser Val
Ile Thr Glu Ser Val Asn Ala Asn Tyr Pro65 70
75 80Ser Ile Leu Gly Arg Thr Phe Ser Asp Glu Val
Phe Pro Val Glu Tyr 85 90
95His Gln Ala Tyr Thr Lys Gly Lys Val Arg Ala Ile Asn Asp Ile Asp
100 105 110Gln Asp Asp Ile Glu Ile
Cys Leu Ala Asp Phe Val Lys Gln Phe Gly 115 120
125Val Lys Ser Lys Leu Val Val Pro Ile Leu Gln His Asn Arg
Ala Ser 130 135 140Ser Leu Asp Asn Glu
Ser Glu Phe Pro Tyr Leu Trp Gly Leu Leu Ile145 150
155 160Thr His Gln Cys Ala Phe Thr Arg Pro Trp
Gln Pro Trp Glu Val Glu 165 170
175Leu Met Lys Gln Leu Ala Asn Gln Val Ala Ile Ala Ile Gln Gln Ser
180 185 190Glu Leu Tyr Glu Gln
Leu Gln Gln Leu Asn Lys Asp Leu Glu Asn Arg 195
200 205Val Glu Lys Arg Thr Gln Gln Leu Ala Ala Thr Asn
Gln Ser Leu Arg 210 215 220Met Glu Ile
Ser Glu Arg Gln Lys Thr Glu Ala Ala Leu Arg His Thr225
230 235 240Asn His Thr Leu Gln Ser Leu
Ile Ala Ala Ser Pro Arg Gly Ile Phe 245
250 255Thr Leu Asn Leu Ala Asp Gln Ile Gln Ile Trp Asn
Pro Thr Ala Glu 260 265 270Arg
Ile Phe Gly Trp Thr Glu Thr Glu Ile Ile Ala His Pro Glu Leu 275
280 285Leu Thr Ser Asn Ile Leu Leu Glu Asp
Tyr Gln Gln Phe Lys Gln Lys 290 295
300Val Leu Ser Gly Met Val Ser Pro Ser Leu Glu Leu Lys Cys Gln Lys305
310 315 320Lys Asp Gly Ser
Trp Ile Glu Ile Val Leu Ser Ala Ala Pro Leu Leu 325
330 335Asp Ser Glu Glu Asn Ile Ala Gly Leu Val
Ala Val Val Ala Asp Ile 340 345
350Thr Glu Gln Lys Arg Gln Ala Glu Gln Ile Arg Leu Leu Gln Ser Val
355 360 365Val Val Asn Thr Asn Asp Ala
Val Val Ile Thr Glu Ala Glu Pro Ile 370 375
380Asp Asp Pro Gly Pro Arg Ile Leu Tyr Val Asn Glu Ala Phe Thr
Lys385 390 395 400Ile Thr
Gly Tyr Thr Ala Glu Glu Met Leu Gly Lys Thr Pro Arg Val
405 410 415Leu Gln Gly Pro Lys Thr Ser
Arg Thr Glu Leu Asp Arg Val Arg Gln 420 425
430Ala Ile Ser Gln Trp Gln Ser Val Thr Val Glu Val Ile Asn
Tyr Arg 435 440 445Lys Asp Gly Ser
Glu Phe Trp Val Glu Phe Ser Leu Val Pro Val Ala 450
455 460Asn Lys Thr Gly Phe Tyr Thr His Trp Ile Ala Val
Gln Arg Asp Val465 470 475
480Thr Glu Arg Arg Arg Thr Glu Glu Val Arg Leu Ala Leu Glu Arg Glu
485 490 495Lys Glu Leu Ser Arg
Leu Lys Thr Arg Phe Phe Ser Met Ala Ser His 500
505 510Glu Phe Arg Thr Pro Leu Ser Thr Ala Leu Ala Ala
Ala Gln Leu Leu 515 520 525Glu Asn
Ser Glu Val Ala Trp Leu Asp Pro Asp Lys Arg Ser Arg Asn 530
535 540Leu His Arg Ile Gln Asn Ser Val Lys Asn Met
Val Gln Leu Leu Asp545 550 555
560Asp Ile Leu Ile Ile Asn Arg Ala Glu Ala Gly Lys Leu Glu Phe Asn
565 570 575Pro Asn Trp Leu
Asp Leu Lys Leu Leu Phe Gln Gln Phe Ile Glu Glu 580
585 590Ile Gln Leu Ser Val Ser Asp Gln Tyr Tyr Phe
Asp Phe Ile Cys Ser 595 600 605Ala
Gln Asp Thr Lys Ala Leu Val Asp Glu Arg Leu Val Arg Ser Ile 610
615 620Leu Ser Asn Leu Leu Ser Asn Ala Ile Lys
Tyr Ser Pro Gly Gly Gly625 630 635
640Gln Ile Lys Ile Ala Leu Ser Leu Asp Ser Glu Gln Ile Ile Phe
Glu 645 650 655Val Thr Asp
Gln Gly Ile Gly Ile Ser Pro Glu Asp Gln Lys Gln Ile 660
665 670Phe Glu Pro Phe His Arg Gly Lys Asn Val
Arg Asn Ile Thr Gly Thr 675 680
685Gly Leu Gly Leu Met Val Ala Lys Lys Cys Val Asp Leu His Ser Gly 690
695 700Ser Ile Leu Leu Lys Ser Ala Val
Asp Gln Gly Thr Thr Val Thr Ile705 710
715 720Cys Leu Lys Arg Tyr Asn His Leu Pro Arg Ala
725 73062196DNAArtificial SequenceCcaS (22 A92V)
6atgagacaaa accaagaacg ccgcaggatt gaaattagca tcaagcaaca aacccaacgg
60gaacgattta ttaaccaaat tacccaacat atccgccaat ctttaaactt ggaaacggtt
120ttaaatacca ccgtcgctga agttaaaacc ctgttgcaag ttgatcgagt tctaatttat
180cgcatttggc aagatggcac gggcagcgtc attacggaat cggtgaatgc caattatcct
240agtattttag ggcggacctt ttccgatgaa gtttttcccg ttgaatacca tcaagcctac
300accaaaggta aagtacgggc cattaatgac attgaccagg atgacataga gatttgccta
360gctgatttcg tcaaacaatt tggcgtgaaa tcaaaattag tagtgcccat tcttcaacat
420aatcgtgctt cttccctaga taatgaatca gaatttccct atctttgggg gctgttaatt
480acccatcaat gtgcttttac ccggccatgg caaccgtggg aagtggagtt aatgaaacag
540ctagccaatc aggtcgcgat cgccatccaa caatcggaat tatatgagca attacagcaa
600ctcaataaag atttggaaaa ccgagtcgaa aaacgcaccc agcaacttgc cgccaccaat
660caatccctaa gaatggaaat cagtgagcga caaaaaacgg aagccgctct ccgccacact
720aaccatactc tgcaatccct gattgcggcc tcccccaggg gtatttttac ccttaattta
780gcagaccaaa ttcagatttg gaatcctaca gcagaacgta tttttggttg gacagaaaca
840gaaattattg cccatccaga attattaaca tccaacattt tgctggaaga ttatcagcaa
900tttaaacaga aagttttatc aggcatggtt tcccctagcc tagaattaaa atgtcaaaaa
960aaagatggta gttggattga aattgtcctt tccgctgctc ccctattgga tagtgaagaa
1020aatattgccg gattggtggc ggttgtcgcc gatattaccg agcaaaagcg gcaggcagaa
1080caaattcgtt tgctacaatc cgttgtggtt aatactaatg atgcggtggt gattacggaa
1140gcggagccca ttgatgatcc cgggccgaga attctctatg tcaatgaagc atttactaaa
1200atcaccggtt atactgctga agaaatgcta ggcaaaaccc cccgagtttt acagggacca
1260aaaactagtc gcactgaatt agatagggtg cggcaagcca ttagtcaatg gcaatcagtt
1320accgttgaag tgattaatta tcgtaaggat ggcagtgagt tttgggtgga atttagtctg
1380gtgcccgttg ccaataaaac aggtttttac acccattgga ttgctgtgca aagggatgtc
1440actgagcgcc gacgcacgga ggaagtccgc ctagctttag aacgggaaaa agaattaagc
1500cgcctaaaaa ctcgtttttt ctccatggct tcccatgaat ttcgtactcc cctcagtacg
1560gccttagctg ctgcccaatt actggaaaat tctgaagtgg cctggcttga tcccgataag
1620cgtagccgga acttacaccg tattcaaaat tccgtgaaaa atatggtaca gctcctggat
1680gatattttaa tcattaaccg tgccgaagcg ggcaaattgg aatttaatcc taattggtta
1740gatttgaaat tattgttcca gcaatttatc gaagaaattc aattaagtgt cagtgaccaa
1800tattattttg actttatttg tagcgctcaa gatacgaagg cattggtgga tgaaaggtta
1860gtgcggtcta ttttatctaa tctgttatct aatgcgatta aatactctcc cgggggaggg
1920cagattaaaa ttgccctaag cctagattcg gaacagatta tttttgaagt caccgaccag
1980ggcattggca tttcgccaga ggaccaaaag caaatttttg aaccctttca tcggggcaaa
2040aatgtcagaa atattacggg aacaggactc ggtttaatgg ttgccaagaa atgtgttgac
2100ttacacagtg gcagtatctt gctaaaaagt gcagttgacc agggaacaac agttactatc
2160tgtttaaaac gctataacca tttgcctcga gcttag
21967742PRTArtificial SequenceMNLS CcaS (23 A92V) 7Met Leu Gln Pro Lys
Lys Lys Arg Lys Val Gly Gly Arg Gln Asn Gln1 5
10 15Glu Arg Arg Arg Ile Glu Ile Ser Ile Lys Gln
Gln Thr Gln Arg Glu 20 25
30Arg Phe Ile Asn Gln Ile Thr Gln His Ile Arg Gln Ser Leu Asn Leu
35 40 45Glu Thr Val Leu Asn Thr Thr Val
Ala Glu Val Lys Thr Leu Leu Gln 50 55
60Val Asp Arg Val Leu Ile Tyr Arg Ile Trp Gln Asp Gly Thr Gly Ser65
70 75 80Val Ile Thr Glu Ser
Val Asn Ala Asn Tyr Pro Ser Ile Leu Gly Arg 85
90 95Thr Phe Ser Asp Glu Val Phe Pro Val Glu Tyr
His Gln Ala Tyr Thr 100 105
110Lys Gly Lys Val Arg Ala Ile Asn Asp Ile Asp Gln Asp Asp Ile Glu
115 120 125Ile Cys Leu Ala Asp Phe Val
Lys Gln Phe Gly Val Lys Ser Lys Leu 130 135
140Val Val Pro Ile Leu Gln His Asn Arg Ala Ser Ser Leu Asp Asn
Glu145 150 155 160Ser Glu
Phe Pro Tyr Leu Trp Gly Leu Leu Ile Thr His Gln Cys Ala
165 170 175Phe Thr Arg Pro Trp Gln Pro
Trp Glu Val Glu Leu Met Lys Gln Leu 180 185
190Ala Asn Gln Val Ala Ile Ala Ile Gln Gln Ser Glu Leu Tyr
Glu Gln 195 200 205Leu Gln Gln Leu
Asn Lys Asp Leu Glu Asn Arg Val Glu Lys Arg Thr 210
215 220Gln Gln Leu Ala Ala Thr Asn Gln Ser Leu Arg Met
Glu Ile Ser Glu225 230 235
240Arg Gln Lys Thr Glu Ala Ala Leu Arg His Thr Asn His Thr Leu Gln
245 250 255Ser Leu Ile Ala Ala
Ser Pro Arg Gly Ile Phe Thr Leu Asn Leu Ala 260
265 270Asp Gln Ile Gln Ile Trp Asn Pro Thr Ala Glu Arg
Ile Phe Gly Trp 275 280 285Thr Glu
Thr Glu Ile Ile Ala His Pro Glu Leu Leu Thr Ser Asn Ile 290
295 300Leu Leu Glu Asp Tyr Gln Gln Phe Lys Gln Lys
Val Leu Ser Gly Met305 310 315
320Val Ser Pro Ser Leu Glu Leu Lys Cys Gln Lys Lys Asp Gly Ser Trp
325 330 335Ile Glu Ile Val
Leu Ser Ala Ala Pro Leu Leu Asp Ser Glu Glu Asn 340
345 350Ile Ala Gly Leu Val Ala Val Val Ala Asp Ile
Thr Glu Gln Lys Arg 355 360 365Gln
Ala Glu Gln Ile Arg Leu Leu Gln Ser Val Val Val Asn Thr Asn 370
375 380Asp Ala Val Val Ile Thr Glu Ala Glu Pro
Ile Asp Asp Pro Gly Pro385 390 395
400Arg Ile Leu Tyr Val Asn Glu Ala Phe Thr Lys Ile Thr Gly Tyr
Thr 405 410 415Ala Glu Glu
Met Leu Gly Lys Thr Pro Arg Val Leu Gln Gly Pro Lys 420
425 430Thr Ser Arg Thr Glu Leu Asp Arg Val Arg
Gln Ala Ile Ser Gln Trp 435 440
445Gln Ser Val Thr Val Glu Val Ile Asn Tyr Arg Lys Asp Gly Ser Glu 450
455 460Phe Trp Val Glu Phe Ser Leu Val
Pro Val Ala Asn Lys Thr Gly Phe465 470
475 480Tyr Thr His Trp Ile Ala Val Gln Arg Asp Val Thr
Glu Arg Arg Arg 485 490
495Thr Glu Glu Val Arg Leu Ala Leu Glu Arg Glu Lys Glu Leu Ser Arg
500 505 510Leu Lys Thr Arg Phe Phe
Ser Met Ala Ser His Glu Phe Arg Thr Pro 515 520
525Leu Ser Thr Ala Leu Ala Ala Ala Gln Leu Leu Glu Asn Ser
Glu Val 530 535 540Ala Trp Leu Asp Pro
Asp Lys Arg Ser Arg Asn Leu His Arg Ile Gln545 550
555 560Asn Ser Val Lys Asn Met Val Gln Leu Leu
Asp Asp Ile Leu Ile Ile 565 570
575Asn Arg Ala Glu Ala Gly Lys Leu Glu Phe Asn Pro Asn Trp Leu Asp
580 585 590Leu Lys Leu Leu Phe
Gln Gln Phe Ile Glu Glu Ile Gln Leu Ser Val 595
600 605Ser Asp Gln Tyr Tyr Phe Asp Phe Ile Cys Ser Ala
Gln Asp Thr Lys 610 615 620Ala Leu Val
Asp Glu Arg Leu Val Arg Ser Ile Leu Ser Asn Leu Leu625
630 635 640Ser Asn Ala Ile Lys Tyr Ser
Pro Gly Gly Gly Gln Ile Lys Ile Ala 645
650 655Leu Ser Leu Asp Ser Glu Gln Ile Ile Phe Glu Val
Thr Asp Gln Gly 660 665 670Ile
Gly Ile Ser Pro Glu Asp Gln Lys Gln Ile Phe Glu Pro Phe His 675
680 685Arg Gly Lys Asn Val Arg Asn Ile Thr
Gly Thr Gly Leu Gly Leu Met 690 695
700Val Ala Lys Lys Cys Val Asp Leu His Ser Gly Ser Ile Leu Leu Lys705
710 715 720Ser Ala Val Asp
Gln Gly Thr Thr Val Thr Ile Cys Leu Lys Arg Tyr 725
730 735Asn His Leu Pro Arg Ala
74082229DNAArtificial SequenceMNLS CcaS (23 A92V) 8atgttacaac caaagaagaa
aaggaaggtg ggtggaagac aaaaccaaga acgccgcagg 60attgaaatta gcatcaagca
acaaacccaa cgggaacgat ttattaacca aattacccaa 120catatccgcc aatctttaaa
cttggaaacg gttttaaata ccaccgtcgc tgaagttaaa 180accctgttgc aagttgatcg
agttctaatt tatcgcattt ggcaagatgg cacgggcagc 240gtcattacgg aatcggtgaa
tgccaattat cctagtattt tagggcggac cttttccgat 300gaagtttttc ccgttgaata
ccatcaagcc tacaccaaag gtaaagtacg ggccattaat 360gacattgacc aggatgacat
agagatttgc ctagctgatt tcgtcaaaca atttggcgtg 420aaatcaaaat tagtagtgcc
cattcttcaa cataatcgtg cttcttccct agataatgaa 480tcagaatttc cctatctttg
ggggctgtta attacccatc aatgtgcttt tacccggcca 540tggcaaccgt gggaagtgga
gttaatgaaa cagctagcca atcaggtcgc gatcgccatc 600caacaatcgg aattatatga
gcaattacag caactcaata aagatttgga aaaccgagtc 660gaaaaacgca cccagcaact
tgccgccacc aatcaatccc taagaatgga aatcagtgag 720cgacaaaaaa cggaagccgc
tctccgccac actaaccata ctctgcaatc cctgattgcg 780gcctccccca ggggtatttt
tacccttaat ttagcagacc aaattcagat ttggaatcct 840acagcagaac gtatttttgg
ttggacagaa acagaaatta ttgcccatcc agaattatta 900acatccaaca ttttgctgga
agattatcag caatttaaac agaaagtttt atcaggcatg 960gtttccccta gcctagaatt
aaaatgtcaa aaaaaagatg gtagttggat tgaaattgtc 1020ctttccgctg ctcccctatt
ggatagtgaa gaaaatattg ccggattggt ggcggttgtc 1080gccgatatta ccgagcaaaa
gcggcaggca gaacaaattc gtttgctaca atccgttgtg 1140gttaatacta atgatgcggt
ggtgattacg gaagcggagc ccattgatga tcccgggccg 1200agaattctct atgtcaatga
agcatttact aaaatcaccg gttatactgc tgaagaaatg 1260ctaggcaaaa ccccccgagt
tttacaggga ccaaaaacta gtcgcactga attagatagg 1320gtgcggcaag ccattagtca
atggcaatca gttaccgttg aagtgattaa ttatcgtaag 1380gatggcagtg agttttgggt
ggaatttagt ctggtgcccg ttgccaataa aacaggtttt 1440tacacccatt ggattgctgt
gcaaagggat gtcactgagc gccgacgcac ggaggaagtc 1500cgcctagctt tagaacggga
aaaagaatta agccgcctaa aaactcgttt tttctccatg 1560gcttcccatg aatttcgtac
tcccctcagt acggccttag ctgctgccca attactggaa 1620aattctgaag tggcctggct
tgatcccgat aagcgtagcc ggaacttaca ccgtattcaa 1680aattccgtga aaaatatggt
acagctcctg gatgatattt taatcattaa ccgtgccgaa 1740gcgggcaaat tggaatttaa
tcctaattgg ttagatttga aattattgtt ccagcaattt 1800atcgaagaaa ttcaattaag
tgtcagtgac caatattatt ttgactttat ttgtagcgct 1860caagatacga aggcattggt
ggatgaaagg ttagtgcggt ctattttatc taatctgtta 1920tctaatgcga ttaaatactc
tcccggggga gggcagatta aaattgccct aagcctagat 1980tcggaacaga ttatttttga
agtcaccgac cagggcattg gcatttcgcc agaggaccaa 2040aagcaaattt ttgaaccctt
tcatcggggc aaaaatgtca gaaatattac gggaacagga 2100ctcggtttaa tggttgccaa
gaaatgtgtt gacttacaca gtggcagtat cttgctaaaa 2160agtgcagttg accagggaac
aacagttact atctgtttaa aacgctataa ccatttgcct 2220cgagcttag
22299743PRTArtificial
SequenceMMNLS CcaS (23) F2A30 (aa1-29) 9Met Met Leu Gln Pro Lys Lys Lys
Arg Lys Val Gly Gly Arg Gln Asn1 5 10
15Gln Glu Arg Arg Arg Ile Glu Ile Ser Ile Lys Gln Gln Thr
Gln Arg 20 25 30Glu Arg Phe
Ile Asn Gln Ile Thr Gln His Ile Arg Gln Ser Leu Asn 35
40 45Leu Glu Thr Val Leu Asn Thr Thr Val Ala Glu
Val Lys Thr Leu Leu 50 55 60Gln Val
Asp Arg Val Leu Ile Tyr Arg Ile Trp Gln Asp Gly Thr Gly65
70 75 80Ser Ala Ile Thr Glu Ser Val
Asn Ala Asn Tyr Pro Ser Ile Leu Gly 85 90
95Arg Thr Phe Ser Asp Glu Val Phe Pro Val Glu Tyr His
Gln Ala Tyr 100 105 110Thr Lys
Gly Lys Val Arg Ala Ile Asn Asp Ile Asp Gln Asp Asp Ile 115
120 125Glu Ile Cys Leu Ala Asp Phe Val Lys Gln
Phe Gly Val Lys Ser Lys 130 135 140Leu
Val Val Pro Ile Leu Gln His Asn Arg Ala Ser Ser Leu Asp Asn145
150 155 160Glu Ser Glu Phe Pro Tyr
Leu Trp Gly Leu Leu Ile Thr His Gln Cys 165
170 175Ala Phe Thr Arg Pro Trp Gln Pro Trp Glu Val Glu
Leu Met Lys Gln 180 185 190Leu
Ala Asn Gln Val Ala Ile Ala Ile Gln Gln Ser Glu Leu Tyr Glu 195
200 205Gln Leu Gln Gln Leu Asn Lys Asp Leu
Glu Asn Arg Val Glu Lys Arg 210 215
220Thr Gln Gln Leu Ala Ala Thr Asn Gln Ser Leu Arg Met Glu Ile Ser225
230 235 240Glu Arg Gln Lys
Thr Glu Ala Ala Leu Arg His Thr Asn His Thr Leu 245
250 255Gln Ser Leu Ile Ala Ala Ser Pro Arg Gly
Ile Phe Thr Leu Asn Leu 260 265
270Ala Asp Gln Ile Gln Ile Trp Asn Pro Thr Ala Glu Arg Ile Phe Gly
275 280 285Trp Thr Glu Thr Glu Ile Ile
Ala His Pro Glu Leu Leu Thr Ser Asn 290 295
300Ile Leu Leu Glu Asp Tyr Gln Gln Phe Lys Gln Lys Val Leu Ser
Gly305 310 315 320Met Val
Ser Pro Ser Leu Glu Leu Lys Cys Gln Lys Lys Asp Gly Ser
325 330 335Trp Ile Glu Ile Val Leu Ser
Ala Ala Pro Leu Leu Asp Ser Glu Glu 340 345
350Asn Ile Ala Gly Leu Val Ala Val Val Ala Asp Ile Thr Glu
Gln Lys 355 360 365Arg Gln Ala Glu
Gln Ile Arg Leu Leu Gln Ser Val Val Val Asn Thr 370
375 380Asn Asp Ala Val Val Ile Thr Glu Ala Glu Pro Ile
Asp Asp Pro Gly385 390 395
400Pro Arg Ile Leu Tyr Val Asn Glu Ala Phe Thr Lys Ile Thr Gly Tyr
405 410 415Thr Ala Glu Glu Met
Leu Gly Lys Thr Pro Arg Val Leu Gln Gly Pro 420
425 430Lys Thr Ser Arg Thr Glu Leu Asp Arg Val Arg Gln
Ala Ile Ser Gln 435 440 445Trp Gln
Ser Val Thr Val Glu Val Ile Asn Tyr Arg Lys Asp Gly Ser 450
455 460Glu Phe Trp Val Glu Phe Ser Leu Val Pro Val
Ala Asn Lys Thr Gly465 470 475
480Phe Tyr Thr His Trp Ile Ala Val Gln Arg Asp Val Thr Glu Arg Arg
485 490 495Arg Thr Glu Glu
Val Arg Leu Ala Leu Glu Arg Glu Lys Glu Leu Ser 500
505 510Arg Leu Lys Thr Arg Phe Phe Ser Met Ala Ser
His Glu Phe Arg Thr 515 520 525Pro
Leu Ser Thr Ala Leu Ala Ala Ala Gln Leu Leu Glu Asn Ser Glu 530
535 540Val Ala Trp Leu Asp Pro Asp Lys Arg Ser
Arg Asn Leu His Arg Ile545 550 555
560Gln Asn Ser Val Lys Asn Met Val Gln Leu Leu Asp Asp Ile Leu
Ile 565 570 575Ile Asn Arg
Ala Glu Ala Gly Lys Leu Glu Phe Asn Pro Asn Trp Leu 580
585 590Asp Leu Lys Leu Leu Phe Gln Gln Phe Ile
Glu Glu Ile Gln Leu Ser 595 600
605Val Ser Asp Gln Tyr Tyr Phe Asp Phe Ile Cys Ser Ala Gln Asp Thr 610
615 620Lys Ala Leu Val Asp Glu Arg Leu
Val Arg Ser Ile Leu Ser Asn Leu625 630
635 640Leu Ser Asn Ala Ile Lys Tyr Ser Pro Gly Gly Gly
Gln Ile Lys Ile 645 650
655Ala Leu Ser Leu Asp Ser Glu Gln Ile Ile Phe Glu Val Thr Asp Gln
660 665 670Gly Ile Gly Ile Ser Pro
Glu Asp Gln Lys Gln Ile Phe Glu Pro Phe 675 680
685His Arg Gly Lys Asn Val Arg Asn Ile Thr Gly Thr Gly Leu
Gly Leu 690 695 700Met Val Ala Lys Lys
Cys Val Asp Leu His Ser Gly Ser Ile Leu Leu705 710
715 720Lys Ser Ala Val Asp Gln Gly Thr Thr Val
Thr Ile Cys Leu Lys Arg 725 730
735Tyr Asn His Leu Pro Arg Ala 740102229DNAArtificial
SequenceMMNLS CcaS (23) F2A30 (aa1-29) 10atgatgttac aaccaaagaa gaaaaggaag
gtgggtggaa gacagaacca agaacgaaga 60agaatagaaa taagtatcaa gcagcagaca
caacgtgaga ggtttatcaa ccaaatcaca 120cagcatatca gacaatctct taatttggag
actgttttga acactacagt tgctgaagtt 180aagacacttt tgcaggttga tagagttctt
atctatagaa tctggcaaga tggtacagga 240tctgctatca ctgagtctgt taatgctaac
tacccttcta ttttgggtag aactttttct 300gatgaggttt tcccagttga atatcatcaa
gcttacacaa agggaaaagt tagagctatt 360aatgatatcg atcaggatga tatcgaaatc
tgtcttgctg atttcgttaa acaattcggt 420gttaagtcta aacttgttgt tcctatcttg
cagcataata gagcttcttc tttggataac 480gaatctgagt ttccatatct ttggggactt
ttgattacac atcagtgtgc tttcactaga 540ccttggcaac cttgggaagt tgagcttatg
aagcagttgg ctaaccaagt tgctattgct 600atccaacagt ctgagttgta cgaacaactt
caacagttga ataaggatct tgagaacaga 660gttgaaaaaa gaacacaaca gttggctgct
actaatcagt ctcttaggat ggaaatctct 720gaaagacaaa agactgaggc tgctttgaga
catactaacc atacacttca gtctttgatt 780gctgcttctc ctagaggtat ctttactctt
aatttggctg atcaaattca gatctggaac 840ccaacagctg agcgaatctt cggatggact
gaaacagaga ttatcgctca tcctgagctt 900ttgacatcta acatcctttt ggaagattac
caacagttta agcaaaaggt tctttctggt 960atggtttctc catctcttga gttgaagtgt
cagaagaaag atggatcttg gattgaaatc 1020gttttgtctg ctgctcctct tttggattct
gaagagaaca ttgctggtct tgttgctgtt 1080gttgctgata tcactgagca aaaaagacag
gctgaacaaa tcagactttt gcaatctgtt 1140gttgttaaca caaacgatgc tgttgttatt
actgaagctg aaccaatcga tgatcctgga 1200ccaagaatcc tttatgttaa tgaggctttc
actaagatca caggatacac tgctgaagag 1260atgttgggaa agactcctag agttcttcaa
ggaccaaaaa cttcaagaac tgagttggat 1320agagttagac aggctatctc tcaatggcag
tctgttacag ttgaagttat taattacaga 1380aaggatggtt ctgagttttg ggttgaattt
tctcttgttc ctgttgctaa caaaacagga 1440ttttacactc attggattgc tgttcaaaga
gatgttacag agagaagaag aactgaagag 1500gttagacttg ctttggaaag agagaaggaa
ctttcaagat tgaagactag atttttctct 1560atggcttctc atgagtttag aacaccactt
tctactgctt tggctgctgc tcaacttctt 1620gaaaattctg aagttgcttg gcttgatcct
gataagagat caagaaacct tcatagaatc 1680caaaattctg ttaaaaacat ggttcaactt
ttggatgata tcttgattat caacagagct 1740gaggctggaa agcttgagtt taatccaaac
tggcttgatt tgaagctttt gttccaacag 1800ttcattgaag agatccagct ttctgtttct
gatcaatact acttcgattt catctgttct 1860gctcaagata ctaaggctct tgttgatgaa
agattggtta gatctatcct ttctaatctt 1920ttgtctaacg ctatcaagta ctctcctgga
ggtggacaga ttaaaatcgc tctttctttg 1980gattctgagc agattatctt cgaagttaca
gatcaaggta ttggaatctc tcctgaggat 2040caaaagcaga tctttgaacc attccataga
ggaaagaatg ttagaaacat tactggtaca 2100ggacttggtt tgatggttgc taagaaatgt
gttgatcttc attctggatc tatccttttg 2160aagtctgctg tggatcaagg aacaactgtg
accatctgtc tcaaaaggta caaccatctc 2220ccaagggct
222911743PRTArtificial SequenceMMNLS
CcaS (23 A92V) F2A30 (aa1-29) 11Met Met Leu Gln Pro Lys Lys Lys Arg Lys
Val Gly Gly Arg Gln Asn1 5 10
15Gln Glu Arg Arg Arg Ile Glu Ile Ser Ile Lys Gln Gln Thr Gln Arg
20 25 30Glu Arg Phe Ile Asn Gln
Ile Thr Gln His Ile Arg Gln Ser Leu Asn 35 40
45Leu Glu Thr Val Leu Asn Thr Thr Val Ala Glu Val Lys Thr
Leu Leu 50 55 60Gln Val Asp Arg Val
Leu Ile Tyr Arg Ile Trp Gln Asp Gly Thr Gly65 70
75 80Ser Val Ile Thr Glu Ser Val Asn Ala Asn
Tyr Pro Ser Ile Leu Gly 85 90
95Arg Thr Phe Ser Asp Glu Val Phe Pro Val Glu Tyr His Gln Ala Tyr
100 105 110Thr Lys Gly Lys Val
Arg Ala Ile Asn Asp Ile Asp Gln Asp Asp Ile 115
120 125Glu Ile Cys Leu Ala Asp Phe Val Lys Gln Phe Gly
Val Lys Ser Lys 130 135 140Leu Val Val
Pro Ile Leu Gln His Asn Arg Ala Ser Ser Leu Asp Asn145
150 155 160Glu Ser Glu Phe Pro Tyr Leu
Trp Gly Leu Leu Ile Thr His Gln Cys 165
170 175Ala Phe Thr Arg Pro Trp Gln Pro Trp Glu Val Glu
Leu Met Lys Gln 180 185 190Leu
Ala Asn Gln Val Ala Ile Ala Ile Gln Gln Ser Glu Leu Tyr Glu 195
200 205Gln Leu Gln Gln Leu Asn Lys Asp Leu
Glu Asn Arg Val Glu Lys Arg 210 215
220Thr Gln Gln Leu Ala Ala Thr Asn Gln Ser Leu Arg Met Glu Ile Ser225
230 235 240Glu Arg Gln Lys
Thr Glu Ala Ala Leu Arg His Thr Asn His Thr Leu 245
250 255Gln Ser Leu Ile Ala Ala Ser Pro Arg Gly
Ile Phe Thr Leu Asn Leu 260 265
270Ala Asp Gln Ile Gln Ile Trp Asn Pro Thr Ala Glu Arg Ile Phe Gly
275 280 285Trp Thr Glu Thr Glu Ile Ile
Ala His Pro Glu Leu Leu Thr Ser Asn 290 295
300Ile Leu Leu Glu Asp Tyr Gln Gln Phe Lys Gln Lys Val Leu Ser
Gly305 310 315 320Met Val
Ser Pro Ser Leu Glu Leu Lys Cys Gln Lys Lys Asp Gly Ser
325 330 335Trp Ile Glu Ile Val Leu Ser
Ala Ala Pro Leu Leu Asp Ser Glu Glu 340 345
350Asn Ile Ala Gly Leu Val Ala Val Val Ala Asp Ile Thr Glu
Gln Lys 355 360 365Arg Gln Ala Glu
Gln Ile Arg Leu Leu Gln Ser Val Val Val Asn Thr 370
375 380Asn Asp Ala Val Val Ile Thr Glu Ala Glu Pro Ile
Asp Asp Pro Gly385 390 395
400Pro Arg Ile Leu Tyr Val Asn Glu Ala Phe Thr Lys Ile Thr Gly Tyr
405 410 415Thr Ala Glu Glu Met
Leu Gly Lys Thr Pro Arg Val Leu Gln Gly Pro 420
425 430Lys Thr Ser Arg Thr Glu Leu Asp Arg Val Arg Gln
Ala Ile Ser Gln 435 440 445Trp Gln
Ser Val Thr Val Glu Val Ile Asn Tyr Arg Lys Asp Gly Ser 450
455 460Glu Phe Trp Val Glu Phe Ser Leu Val Pro Val
Ala Asn Lys Thr Gly465 470 475
480Phe Tyr Thr His Trp Ile Ala Val Gln Arg Asp Val Thr Glu Arg Arg
485 490 495Arg Thr Glu Glu
Val Arg Leu Ala Leu Glu Arg Glu Lys Glu Leu Ser 500
505 510Arg Leu Lys Thr Arg Phe Phe Ser Met Ala Ser
His Glu Phe Arg Thr 515 520 525Pro
Leu Ser Thr Ala Leu Ala Ala Ala Gln Leu Leu Glu Asn Ser Glu 530
535 540Val Ala Trp Leu Asp Pro Asp Lys Arg Ser
Arg Asn Leu His Arg Ile545 550 555
560Gln Asn Ser Val Lys Asn Met Val Gln Leu Leu Asp Asp Ile Leu
Ile 565 570 575Ile Asn Arg
Ala Glu Ala Gly Lys Leu Glu Phe Asn Pro Asn Trp Leu 580
585 590Asp Leu Lys Leu Leu Phe Gln Gln Phe Ile
Glu Glu Ile Gln Leu Ser 595 600
605Val Ser Asp Gln Tyr Tyr Phe Asp Phe Ile Cys Ser Ala Gln Asp Thr 610
615 620Lys Ala Leu Val Asp Glu Arg Leu
Val Arg Ser Ile Leu Ser Asn Leu625 630
635 640Leu Ser Asn Ala Ile Lys Tyr Ser Pro Gly Gly Gly
Gln Ile Lys Ile 645 650
655Ala Leu Ser Leu Asp Ser Glu Gln Ile Ile Phe Glu Val Thr Asp Gln
660 665 670Gly Ile Gly Ile Ser Pro
Glu Asp Gln Lys Gln Ile Phe Glu Pro Phe 675 680
685His Arg Gly Lys Asn Val Arg Asn Ile Thr Gly Thr Gly Leu
Gly Leu 690 695 700Met Val Ala Lys Lys
Cys Val Asp Leu His Ser Gly Ser Ile Leu Leu705 710
715 720Lys Ser Ala Val Asp Gln Gly Thr Thr Val
Thr Ile Cys Leu Lys Arg 725 730
735Tyr Asn His Leu Pro Arg Ala 740122229DNAArtificial
SequenceMMNLS CcaS (23 A92V) F2A30 (aa1-29) 12atgatgttac aaccaaagaa
gaaaaggaag gtgggtggaa gacagaacca agaacgaaga 60agaatagaaa taagtatcaa
gcagcagaca caacgtgaga ggtttatcaa ccaaatcaca 120cagcatatca gacaatctct
taatttggag actgttttga acactacagt tgctgaagtt 180aagacacttt tgcaggttga
tagagttctt atctatagaa tctggcaaga tggtacagga 240tctgttatca ctgagtctgt
taatgctaac tacccttcta ttttgggtag aactttttct 300gatgaggttt tcccagttga
atatcatcaa gcttacacaa agggaaaagt tagagctatt 360aatgatatcg atcaggatga
tatcgaaatc tgtcttgctg atttcgttaa acaattcggt 420gttaagtcta aacttgttgt
tcctatcttg cagcataata gagcttcttc tttggataac 480gaatctgagt ttccatatct
ttggggactt ttgattacac atcagtgtgc tttcactaga 540ccttggcaac cttgggaagt
tgagcttatg aagcagttgg ctaaccaagt tgctattgct 600atccaacagt ctgagttgta
cgaacaactt caacagttga ataaggatct tgagaacaga 660gttgaaaaaa gaacacaaca
gttggctgct actaatcagt ctcttaggat ggaaatctct 720gaaagacaaa agactgaggc
tgctttgaga catactaacc atacacttca gtctttgatt 780gctgcttctc ctagaggtat
ctttactctt aatttggctg atcaaattca gatctggaac 840ccaacagctg agcgaatctt
cggatggact gaaacagaga ttatcgctca tcctgagctt 900ttgacatcta acatcctttt
ggaagattac caacagttta agcaaaaggt tctttctggt 960atggtttctc catctcttga
gttgaagtgt cagaagaaag atggatcttg gattgaaatc 1020gttttgtctg ctgctcctct
tttggattct gaagagaaca ttgctggtct tgttgctgtt 1080gttgctgata tcactgagca
aaaaagacag gctgaacaaa tcagactttt gcaatctgtt 1140gttgttaaca caaacgatgc
tgttgttatt actgaagctg aaccaatcga tgatcctgga 1200ccaagaatcc tttatgttaa
tgaggctttc actaagatca caggatacac tgctgaagag 1260atgttgggaa agactcctag
agttcttcaa ggaccaaaaa cttcaagaac tgagttggat 1320agagttagac aggctatctc
tcaatggcag tctgttacag ttgaagttat taattacaga 1380aaggatggtt ctgagttttg
ggttgaattt tctcttgttc ctgttgctaa caaaacagga 1440ttttacactc attggattgc
tgttcaaaga gatgttacag agagaagaag aactgaagag 1500gttagacttg ctttggaaag
agagaaggaa ctttcaagat tgaagactag atttttctct 1560atggcttctc atgagtttag
aacaccactt tctactgctt tggctgctgc tcaacttctt 1620gaaaattctg aagttgcttg
gcttgatcct gataagagat caagaaacct tcatagaatc 1680caaaattctg ttaaaaacat
ggttcaactt ttggatgata tcttgattat caacagagct 1740gaggctggaa agcttgagtt
taatccaaac tggcttgatt tgaagctttt gttccaacag 1800ttcattgaag agatccagct
ttctgtttct gatcaatact acttcgattt catctgttct 1860gctcaagata ctaaggctct
tgttgatgaa agattggtta gatctatcct ttctaatctt 1920ttgtctaacg ctatcaagta
ctctcctgga ggtggacaga ttaaaatcgc tctttctttg 1980gattctgagc agattatctt
cgaagttaca gatcaaggta ttggaatctc tcctgaggat 2040caaaagcaga tctttgaacc
attccataga ggaaagaatg ttagaaacat tactggtaca 2100ggacttggtt tgatggttgc
taagaaatgt gttgatcttc attctggatc tatccttttg 2160aagtctgctg tggatcaagg
aacaactgtg accatctgtc tcaaaaggta caaccatctc 2220ccaagggct
222913316PRTArtificial
SequenceF2A30(aa30)NLS2xGGSVP644xGGSCcaR 13Pro Gly Ser Leu Gln Pro Lys
Lys Lys Arg Lys Val Gly Gly Gly Gly1 5 10
15Ser Gly Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp
Met Leu Gly 20 25 30Ser Asp
Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala 35
40 45Leu Asp Asp Phe Asp Leu Asp Met Leu Gly
Ser Asp Ala Leu Asp Asp 50 55 60Phe
Asp Leu Asp Met Leu Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly65
70 75 80Gly Ser Met Arg Ile Leu
Leu Val Glu Asp Asp Leu Pro Leu Ala Glu 85
90 95Thr Leu Ala Glu Ala Leu Ser Asp Gln Leu Tyr Thr
Val Asp Ile Ala 100 105 110Thr
Asp Ala Ser Leu Ala Trp Asp Tyr Ala Ser Arg Leu Glu Tyr Asp 115
120 125Leu Val Ile Leu Asp Val Met Leu Pro
Glu Leu Asp Gly Ile Thr Leu 130 135
140Cys Gln Lys Trp Arg Ser His Ser Tyr Leu Met Pro Ile Leu Met Met145
150 155 160Thr Ala Arg Asp
Thr Ile Asn Asp Lys Ile Thr Gly Leu Asp Ala Gly 165
170 175Ala Asp Asp Tyr Val Val Lys Pro Val Asp
Leu Gly Glu Leu Phe Ala 180 185
190Arg Val Arg Ala Leu Leu Arg Arg Gly Cys Ala Thr Cys Gln Pro Val
195 200 205Leu Glu Trp Gly Pro Ile Arg
Leu Asp Pro Ser Thr Tyr Glu Val Ser 210 215
220Tyr Asp Asn Glu Val Leu Ser Leu Thr Arg Lys Glu Tyr Ser Ile
Leu225 230 235 240Glu Leu
Leu Leu Arg Asn Gly Arg Arg Val Leu Ser Arg Ser Met Ile
245 250 255Ile Asp Ser Ile Trp Lys Leu
Glu Ser Pro Pro Glu Glu Asp Thr Val 260 265
270Lys Val His Val Arg Ser Leu Arg Gln Lys Leu Lys Ser Ala
Gly Leu 275 280 285Ser Ala Asp Ala
Ile Glu Thr Val His Gly Ile Gly Tyr Arg Leu Ala 290
295 300Asn Leu Thr Glu Lys Ser Leu Cys Gln Gly Lys Asn305
310 31514948DNAArtificial
SequenceF2A30(aa30)NLS2xGGSVP644xGGSCcaR 14ccaggttcac tccagcctaa
gaagaagaga aaggttggag gtggtggctc cggaggctct 60gatgccctcg acgatttcga
cctcgatatg ctcggttctg atgctctcga tgactttgac 120cttgacatgc ttggatcaga
cgctttggac gacttcgact tggacatgtt gggatctgat 180gcacttgatg attttgacct
tgatatgctt ggtggttcag gagggtctgg tggatcagga 240ggatctatga gaatactcct
cgtggaagat gatttgccat tagcagaaac cctcgcagaa 300gctttgtctg atcaacttta
cactgttgat attgctacag atgcttcttt ggcttgggat 360tatgcttcta gacttgaata
cgatttggtt attcttgatg ttatgttgcc tgagcttgat 420ggaattactc tttgtcagaa
gtggagatct cattcttatt tgatgccaat ccttatgatg 480actgctagag atacaattaa
tgataagatc acaggacttg atgctggtgc tgatgattac 540gttgttaaac ctgttgattt
gggtgaactt tttgctagag ttagagctct tttgagaaga 600ggatgtgcta cttgtcaacc
agttttggag tggggtccta ttagacttga tccatctact 660tatgaagttt cttacgataa
tgaggttttg tctcttacaa gaaaggaata ctctatcttg 720gagcttttgc ttagaaacgg
aagaagagtt ctttctagat ctatgatcat cgattctatc 780tggaagttgg agtctcctcc
agaagaggat acagttaaag ttcatgttag atctttgaga 840caaaagctta agtctgctgg
actttctgct gatgctattg aaactgttca tggaatcggt 900tacagattgg ctaatcttac
agagaagtct ttgtgtcagg gaaagaat 94815314PRTArtificial
SequenceF2A30(aa30)CcaR4xGSSVP642xGGSNLS 15Pro Met Arg Ile Leu Leu Val
Glu Asp Asp Leu Pro Leu Ala Glu Thr1 5 10
15Leu Ala Glu Ala Leu Ser Asp Gln Leu Tyr Thr Val Asp
Ile Ala Thr 20 25 30Asp Ala
Ser Leu Ala Trp Asp Tyr Ala Ser Arg Leu Glu Tyr Asp Leu 35
40 45Val Ile Leu Asp Val Met Leu Pro Glu Leu
Asp Gly Ile Thr Leu Cys 50 55 60Gln
Lys Trp Arg Ser His Ser Tyr Leu Met Pro Ile Leu Met Met Thr65
70 75 80Ala Arg Asp Thr Ile Asn
Asp Lys Ile Thr Gly Leu Asp Ala Gly Ala 85
90 95Asp Asp Tyr Val Val Lys Pro Val Asp Leu Gly Glu
Leu Phe Ala Arg 100 105 110Val
Arg Ala Leu Leu Arg Arg Gly Cys Ala Thr Cys Gln Pro Val Leu 115
120 125Glu Trp Gly Pro Ile Arg Leu Asp Pro
Ser Thr Tyr Glu Val Ser Tyr 130 135
140Asp Asn Glu Val Leu Ser Leu Thr Arg Lys Glu Tyr Ser Ile Leu Glu145
150 155 160Leu Leu Leu Arg
Asn Gly Arg Arg Val Leu Ser Arg Ser Met Ile Ile 165
170 175Asp Ser Ile Trp Lys Leu Glu Ser Pro Pro
Glu Glu Asp Thr Val Lys 180 185
190Val His Val Arg Ser Leu Arg Gln Lys Leu Lys Ser Ala Gly Leu Ser
195 200 205Ala Asp Ala Ile Glu Thr Val
His Gly Ile Gly Tyr Arg Leu Ala Asn 210 215
220Leu Thr Glu Lys Ser Leu Cys Gln Gly Lys Asn Gly Gly Ser Gly
Gly225 230 235 240Ser Gly
Gly Ser Gly Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp
245 250 255Met Leu Gly Ser Asp Ala Leu
Asp Asp Phe Asp Leu Asp Met Leu Gly 260 265
270Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Ser
Asp Ala 275 280 285Leu Asp Asp Phe
Asp Leu Asp Met Leu Gly Gly Ser Gly Gly Ser Leu 290
295 300Gln Pro Lys Lys Lys Arg Lys Val Gly Gly305
31016942DNAArtificial SequenceF2A30(aa30)CcaR4xGSSVP642xGGSNLS
16ccaatgagaa tactcctcgt ggaagatgat ttgccattag cagaaaccct cgcagaagct
60ttgtctgatc aactttacac tgttgatatt gctacagatg cttctttggc ttgggattat
120gcttctagac ttgaatacga tttggttatt cttgatgtta tgttgcctga gcttgatgga
180attactcttt gtcagaagtg gagatctcat tcttatttga tgccaatcct tatgatgact
240gctagagata caattaatga taagatcaca ggacttgatg ctggtgctga tgattacgtt
300gttaaacctg ttgatttggg tgaacttttt gctagagtta gagctctttt gagaagagga
360tgtgctactt gtcaaccagt tttggagtgg ggtcctatta gacttgatcc atctacttat
420gaagtttctt acgataatga ggttttgtct cttacaagaa aggaatactc tatcttggag
480cttttgctta gaaacggaag aagagttctt tctagatcta tgatcatcga ttctatctgg
540aagttggagt ctcctccaga agaggataca gttaaagttc atgttagatc tttgagacaa
600aagcttaagt ctgctggact ttctgctgat gctattgaaa ctgttcatgg aatcggttac
660agattggcta atcttacaga gaagtctttg tgtcagggaa agaatggagg ctccggtggg
720tcaggtggtt ctggaggctc ggatgccctc gacgatttcg acctcgatat gctcggttct
780gatgctctcg atgactttga ccttgacatg cttggatcag acgctttgga cgacttcgac
840ttggacatgt tgggatctga tgcacttgat gattttgacc ttgatatgct tggcggttcc
900ggtggatcac tccagcctaa gaagaagaga aaggttggag gt
942
SEQ ID NO: 17; length 125; DNA; Artificial Sequence; promoter
ctttccgatt tctttacgat ttccgctttc cgatttcttt acgatttggc tttccgattt 60
ctttacgatt tatccttcgc aagacccttc ctctatataa ggaagttcat ttcatttgga 120
gagga 125
SEQ ID NO: 18; length 477; DNA; Artificial Sequence; GAF domain
atcagacaat ctcttaattt ggagactgtt ttgaacacta cagttgctga
agttaagaca 60cttttgcagg ttgatagagt tcttatctat agaatctggc aagatggtac
aggatctgtt 120atcactgagt ctgttaatgc taactaccct tctattttgg gtagaacttt
ttctgatgag 180gttttcccag ttgaatatca tcaagcttac acaaagggaa aagttagagc
tattaatgat 240atcgatcagg atgatatcga aatctgtctt gctgatttcg ttaaacaatt
cggtgttaag 300tctaaacttg ttgttcctat cttgcagcat aatagagctt cttctttgga
taacgaatct 360gagtttccat atctttgggg acttttgatt acacatcagt gtgctttcac
tagaccttgg 420caaccttggg aagttgagct tatgaagcag ttggctaacc aagttgctat
tgctatc 47719159PRTArtificial SequenceGAF domain 19Ile Arg Gln Ser
Leu Asn Leu Glu Thr Val Leu Asn Thr Thr Val Ala1 5
10 15Glu Val Lys Thr Leu Leu Gln Val Asp Arg
Val Leu Ile Tyr Arg Ile 20 25
30Trp Gln Asp Gly Thr Gly Ser Val Ile Thr Glu Ser Val Asn Ala Asn
35 40 45Tyr Pro Ser Ile Leu Gly Arg Thr
Phe Ser Asp Glu Val Phe Pro Val 50 55
60Glu Tyr His Gln Ala Tyr Thr Lys Gly Lys Val Arg Ala Ile Asn Asp65
70 75 80Ile Asp Gln Asp Asp
Ile Glu Ile Cys Leu Ala Asp Phe Val Lys Gln 85
90 95Phe Gly Val Lys Ser Lys Leu Val Val Pro Ile
Leu Gln His Asn Arg 100 105
110Ala Ser Ser Leu Asp Asn Glu Ser Glu Phe Pro Tyr Leu Trp Gly Leu
115 120 125Leu Ile Thr His Gln Cys Ala
Phe Thr Arg Pro Trp Gln Pro Trp Glu 130 135
140Val Glu Leu Met Lys Gln Leu Ala Asn Gln Val Ala Ile Ala Ile145
150 155
SEQ ID NO: 20; length 222; DNA; Artificial Sequence; PAS domain 1
actaaccata cacttcagtc tttgattgct gcttctccta gaggtatctt tactcttaat 60
ttggctgatc aaattcagat ctggaaccca acagctgagc gaatcttcgg atggactgaa 120
acagagatta tcgctcatcc tgagcttttg acatctaaca tccttttgga agattaccaa 180
cagtttaagc aaaaggttct ttctggtatg gtttctccat ct 222
SEQ ID NO: 21; length 74; PRT; Artificial Sequence; PAS domain 1
Thr Asn His Thr Leu Gln Ser Leu Ile Ala Ala Ser Pro Arg Gly Ile
1               5                   10                  15
Phe Thr Leu Asn Leu Ala Asp Gln Ile Gln Ile Trp Asn Pro Thr Ala
            20                  25                  30
Glu Arg Ile Phe Gly Trp Thr Glu Thr Glu Ile Ile Ala His Pro Glu
        35                  40                  45
Leu Leu Thr Ser Asn Ile Leu Leu Glu Asp Tyr Gln Gln Phe Lys Gln
    50                  55                  60
Lys Val Leu Ser Gly Met Val Ser Pro Ser
65                  70
SEQ ID NO: 22; length 162; DNA; Artificial Sequence; PAS domain 2
atcgatgatc ctggaccaag aatcctttat gttaatgagg ctttcactaa gatcacagga 60
tacactgctg aagagatgtt gggaaagact cctagagttc ttcaaggacc aaaaacttca 120
agaactgagt tggatagagt tagacaggct atctctcaat gg 162
SEQ ID NO: 23; length 54; PRT; Artificial Sequence; PAS domain 2
Ile Asp Asp Pro Gly Pro Arg Ile Leu Tyr Val Asn Glu Ala Phe Thr
1               5                   10                  15
Lys Ile Thr Gly Tyr Thr Ala Glu Glu Met Leu Gly Lys Thr Pro Arg
            20                  25                  30
Val Leu Gln Gly Pro Lys Thr Ser Arg Thr Glu Leu Asp Arg Val Arg
        35                  40                  45
Gln Ala Ile Ser Gln Trp
    50
SEQ ID NO: 24; length 654; DNA; Artificial Sequence; His-Kinase domain
atggcttctc atgagtttag aacaccactt tctactgctt
tggctgctgc tcaacttctt 60gaaaattctg aagttgcttg gcttgatcct gataagagat
caagaaacct tcatagaatc 120caaaattctg ttaaaaacat ggttcaactt ttggatgata
tcttgattat caacagagct 180gaggctggaa agcttgagtt taatccaaac tggcttgatt
tgaagctttt gttccaacag 240ttcattgaag agatccagct ttctgtttct gatcaatact
acttcgattt catctgttct 300gctcaagata ctaaggctct tgttgatgaa agattggtta
gatctatcct ttctaatctt 360ttgtctaacg ctatcaagta ctctcctgga ggtggacaga
ttaaaatcgc tctttctttg 420gattctgagc agattatctt cgaagttaca gatcaaggta
ttggaatctc tcctgaggat 480caaaagcaga tctttgaacc attccataga ggaaagaatg
ttagaaacat tactggtaca 540ggacttggtt tgatggttgc taagaaatgt gttgatcttc
attctggatc tatccttttg 600aagtctgctg tggatcaagg aacaactgtg accatctgtc
tcaaaaggta caac 65425218PRTArtificial SequenceHis-Kinase domain
25Met Ala Ser His Glu Phe Arg Thr Pro Leu Ser Thr Ala Leu Ala Ala1
5 10 15Ala Gln Leu Leu Glu Asn
Ser Glu Val Ala Trp Leu Asp Pro Asp Lys 20 25
30Arg Ser Arg Asn Leu His Arg Ile Gln Asn Ser Val Lys
Asn Met Val 35 40 45Gln Leu Leu
Asp Asp Ile Leu Ile Ile Asn Arg Ala Glu Ala Gly Lys 50
55 60Leu Glu Phe Asn Pro Asn Trp Leu Asp Leu Lys Leu
Leu Phe Gln Gln65 70 75
80Phe Ile Glu Glu Ile Gln Leu Ser Val Ser Asp Gln Tyr Tyr Phe Asp
85 90 95Phe Ile Cys Ser Ala Gln
Asp Thr Lys Ala Leu Val Asp Glu Arg Leu 100
105 110Val Arg Ser Ile Leu Ser Asn Leu Leu Ser Asn Ala
Ile Lys Tyr Ser 115 120 125Pro Gly
Gly Gly Gln Ile Lys Ile Ala Leu Ser Leu Asp Ser Glu Gln 130
135 140Ile Ile Phe Glu Val Thr Asp Gln Gly Ile Gly
Ile Ser Pro Glu Asp145 150 155
160Gln Lys Gln Ile Phe Glu Pro Phe His Arg Gly Lys Asn Val Arg Asn
165 170 175Ile Thr Gly Thr
Gly Leu Gly Leu Met Val Ala Lys Lys Cys Val Asp 180
185 190Leu His Ser Gly Ser Ile Leu Leu Lys Ser Ala
Val Asp Gln Gly Thr 195 200 205Thr
Val Thr Ile Cys Leu Lys Arg Tyr Asn 210
215

<210> 26
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> NLS
<400> 26
ttacaaccaa agaagaaaag gaaggtgggt gga  33

<210> 27
<211> 11
<212> PRT
<213> Artificial Sequence
<220>
<223> NLS
<400> 27
Leu Gln Pro Lys Lys Lys Arg Lys Val Gly Gly
1               5                   10

<210> 28
<211> 345
<212> DNA
<213> Artificial Sequence
<220>
<223> REC domain
<400> 28
agaatactcc tcgtggaaga tgatttgcca ttagcagaaa ccctcgcaga agctttgtct  60
gatcaacttt acactgttga tattgctaca gatgcttctt tggcttggga ttatgcttct  120
agacttgaat acgatttggt tattcttgat gttatgttgc ctgagcttga tggaattact  180
ctttgtcaga agtggagatc tcattcttat ttgatgccaa tccttatgat gactgctaga  240
gatacaatta atgataagat cacaggactt gatgctggtg ctgatgatta cgttgttaaa  300
cctgttgatt tgggtgaact ttttgctaga gttagagctc ttttg  345

<210> 29
<211> 115
<212> PRT
<213> Artificial Sequence
<220>
<223> REC domain
<400> 29
Arg Ile Leu Leu Val Glu Asp Asp Leu Pro Leu Ala Glu Thr Leu Ala
1               5                   10                  15
Glu Ala Leu Ser Asp Gln Leu Tyr Thr Val Asp Ile Ala Thr Asp Ala
            20                  25                  30
Ser Leu Ala Trp Asp Tyr Ala Ser Arg Leu Glu Tyr Asp Leu Val Ile
        35                  40                  45
Leu Asp Val Met Leu Pro Glu Leu Asp Gly Ile Thr Leu Cys Gln Lys
    50                  55                  60
Trp Arg Ser His Ser Tyr Leu Met Pro Ile Leu Met Met Thr Ala Arg
65                  70                  75                  80
Asp Thr Ile Asn Asp Lys Ile Thr Gly Leu Asp Ala Gly Ala Asp Asp
                85                  90                  95
Tyr Val Val Lys Pro Val Asp Leu Gly Glu Leu Phe Ala Arg Val Arg
            100                 105                 110
Ala Leu Leu
        115

<210> 30
<211> 300
<212> DNA
<213> Artificial Sequence
<220>
<223> DNA binding domain
<400> 30
caaccagttt tggagtgggg tcctattaga cttgatccat ctacttatga agtttcttac  60
gataatgagg ttttgtctct tacaagaaag gaatactcta tcttggagct tttgcttaga  120
aacggaagaa gagttctttc tagatctatg atcatcgatt ctatctggaa gttggagtct  180
cctccagaag aggatacagt taaagttcat gttagatctt tgagacaaaa gcttaagtct  240
gctggacttt ctgctgatgc tattgaaact gttcatggaa tcggttacag attggctaat  300

<210> 31
<211> 100
<212> PRT
<213> Artificial Sequence
<220>
<223> DNA binding domain
<400> 31
Gln Pro Val Leu Glu Trp Gly Pro Ile Arg Leu Asp Pro Ser Thr Tyr
1               5                   10                  15
Glu Val Ser Tyr Asp Asn Glu Val Leu Ser Leu Thr Arg Lys Glu Tyr
            20                  25                  30
Ser Ile Leu Glu Leu Leu Leu Arg Asn Gly Arg Arg Val Leu Ser Arg
        35                  40                  45
Ser Met Ile Ile Asp Ser Ile Trp Lys Leu Glu Ser Pro Pro Glu Glu
    50                  55                  60
Asp Thr Val Lys Val His Val Arg Ser Leu Arg Gln Lys Leu Lys Ser
65                  70                  75                  80
Ala Gly Leu Ser Ala Asp Ala Ile Glu Thr Val His Gly Ile Gly Tyr
                85                  90                  95
Arg Leu Ala Asn
            100

<210> 32
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> NLS
<400> 32
ctccagccta agaagaagag aaaggttgga ggt  33

<210> 33
<211> 11
<212> PRT
<213> Artificial Sequence
<220>
<223> NLS
<400> 33
Leu Gln Pro Lys Lys Lys Arg Lys Val Gly Gly
1               5                   10

<210> 34
<211> 150
<212> DNA
<213> Artificial Sequence
<220>
<223> VP64 domain
<400> 34
gatgccctcg acgatttcga cctcgatatg ctcggttctg atgctctcga tgactttgac  60
cttgacatgc ttggatcaga cgctttggac gacttcgact tggacatgtt gggatctgat  120
gcacttgatg attttgacct tgatatgctt  150

<210> 35
<211> 50
<212> PRT
<213> Artificial Sequence
<220>
<223> VP64 domain
<400> 35
Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu
1               5                   10                  15
Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe
            20                  25                  30
Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp
        35                  40                  45
Met Leu
    50

<210> 36
<211> 63
<212> DNA
<213> Artificial Sequence
<220>
<223> F2A
<400> 36
ggacaacttc tcaactttga cttgctaaag ttagctggtg atgttgaatc taatcctgga  60
cca  63

<210> 37
<211> 20
<212> PRT
<213> Artificial Sequence
<220>
<223> F2Aaa1-20
<400> 37
Gly Gln Leu Leu Asn Phe Asp Leu Leu Lys Leu Ala Gly Asp Val Glu
1               5                   10                  15
Ser Asn Pro Gly
            20

<210> 38
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> F2A30
<400> 38
cacaaacaga aaattgtggc accggtgaag cagactctca actttgactt gctaaagtta  60
gctggtgatg ttgaatctaa tcctggacca  90

<210> 39
<211> 29
<212> PRT
<213> Artificial Sequence
<220>
<223> F2Aaa1-20
<400> 39
His Lys Gln Lys Ile Val Ala Pro Val Lys Gln Thr Leu Asn Phe Asp
1               5                   10                  15
Leu Leu Lys Leu Ala Gly Asp Val Glu Ser Asn Pro Gly
            20                  25

<210> 40
<211> 22
<212> DNA
<213> Artificial Sequence
<220>
<223> ccaR CRE motif
<400> 40
ctttccgatt tctttacgat tt  22

<210> 41
<211> 51
<212> DNA
<213> Artificial Sequence
<220>
<223> P35Smin(-51)
<400> 41
cttcgcaaga cccttcctct atataaggaa gttcatttca tttggagagg a  51

42645DNAArtificial
SequenceTerminator sequence (Trbcs) 42agctttcgtt cgtatcatcg gtttcgacaa
cgttcgtcaa gttcaatgca tcagtttcat 60tgcgcacaca ccagaatcct actgagtttg
agtattatgg cattgggaaa actgtttttc 120ttgtaccatt tgttgtgctt gtaatttact
gtgtttttta ttcggttttc gctatcgaac 180tgtgaaatgg aaatggatgg agaagagtta
atgaatgata tggtcctttt gttcattctc 240aaattaatat tatttgtttt ttctcttatt
tgttgtgtgt tgaatttgaa attataagag 300atatgcaaac attttgtttt gagtaaaaat
gtgtcaaatc gtggcctcta atgaccgaag 360ttaatatgag gagtaaaaca cttgtagttg
taccattatg cttattcact aggcaacaaa 420tatattttca gacctagaaa agctgcaaat
gttactgaat acaagtatgt cctcttgtgt 480tttagacatt tatgaacttt cctttatgta
attttccaga atccttgtca gattctaatc 540attgctttat aattatagtt atactcatgg
atttgtagtt gagtatgaaa atatttttta 600atgcatttta tgacttgcca attgattgac
aacatgcatc aatcg 64543472DNAArtificial
SequenceTerminator sequence (NOS terminator) 43tagagtagat gccgaccgaa
caagagctga tttcgagaac gcctcagcca gcaactcgcg 60cgagcctagc aaggcaaatg
cgagagaacg gccttacgct tggtggcaca gttctcgtcc 120acagttcgct aagctcgctc
ggctgggtcg cgggagggcc ggtcgcagtg attcaggaat 180taattcccta gagtcaagca
gatcgttcaa acatttggca ataaagtttc ttaagattga 240atcctgttgc cggtcttgcg
atgattatca tataatttct gttgaattac gttaagcatg 300taataattaa catgtaatgc
atgacgttat ttatgagatg ggtttttatg attagagtcc 360cgcaattata catttaatac
gcgatagaaa acaaaatata gcgcgcaaac taggataaat 420tatcgcgcgc ggtgtcatct
atgttactag atcgaccggc atgcaagctg at 47244652DNAArtificial
SequenceUBQ10 promoter 44acccgacgag tcagtaataa acggcgtcaa agtggttgca
gccggcacac acgagtcgtg 60tttatcaact caaagcacaa atacttttcc tcaacctaaa
aataaggcaa ttagccaaaa 120acaactttgc gtgtaaacaa cgctcaatac acgtgtcatt
ttattattag ctattgcttc 180accgccttag ctttctcgtg acctagtcgt cctcgtcttt
tcttcttctt cttctataaa 240acaataccca aagagctctt cttcttcaca attcagattt
caatttctca aaatcttaaa 300aactttctct caattctctc taccgtgatc aaggtaaatt
tctgtgttcc ttattctctc 360aaaatcttcg attttgtttt cgttcgatcc caatttcgta
tatgttcttt ggtttagatt 420ctgttaatct tagatcgaag acgattttct gggtttgatc
gttagatatc atcttaattc 480tcgattaggg tttcatagat atcatccgat ttgttcaaat
aatttgagtt ttgtcgaata 540attactcttc gatttgtgat ttctatctag atctggtgtt
agtttctagt ttgtgcgatc 600gaatttgtag attaatctga gtttttctga ttaacagctc
gagtgcggga tc 6524574DNAArtificial SequencePRR 45caggctgcgc
aactggcttt ccgatttctt tacgatttcc gctttccgat ttctttacga 60tttggctttc
cgat
744674DNAArtificial SequencePRR 46ttctttacga tttatccttc gcaagaccct
tcctctatat aaggaagttc atttcatttg 60gagaggacac gctg
74472292DNAArtificial SequenceLRHK1-01
47atgatgttac aaccaaagaa gaaaaggaag gtgggtggaa gacaaaacca agaacgccgc
60aggattgaaa ttagcatcaa gcaacaaacc caacgggaac gatttattaa ccaaattacc
120caacatatcc gccaatcttt aaacttggaa acggttttaa ataccaccgt cgctgaagtt
180aaaaccctgt tgcaagttga tcgagttgcc gtgtaccgtt ttaacccgga ttggagcggc
240gagtttgtgg ccgaaagcgt gggtagcggt tgggtgaaac tggtgggccc ggatatcaaa
300accgtgtggg aagacacaca tctgcaagaa acccaaggtg gtcgctatcg ccatcaagaa
360agcttcgtgg tgaacgacat ttatgaggcc ggccatttca gctgccatct ggagatttta
420gaacagtttg aaattaaagc ctacattatc gtgccggttt ttgccgccga aaaactgtgg
480ggtttactgg ccgcctatca gaacagtggt acccgcgaat gggtggaatg ggaaagcagc
540tttctgaccc aagttggtct gcagttcggc atcgccatcc aacaatcgga attatatgag
600caattacagc aactcaataa agatttggaa aaccgagtcg aaaaacgcac ccagcaactt
660gccgccacca atcaatccct aagaatggaa atcagtgagc gacaaaaaac ggaagccgct
720ctccgccaca ctaaccatac tctgcaatcc ctgattgcgg cctcccccag gggtattttt
780acccttaatt tagcagacca aattcagatt tggaatccta cagcagaacg tatttttggt
840tggacagaaa cagaaattat tgcccatcca gaattattaa catccaacat tttgctggaa
900gattatcagc aatttaaaca gaaagtttta tcaggcatgg tttcccctag cctagaatta
960aaatgtcaaa aaaaagatgg tagttggatt gaaattgtcc tttccgctgc tcccctattg
1020gatagtgaag aaaatattgc cggattggtg gcggttgtcg ccgatattac cgagcaaaag
1080cggcaggcag aacaaattcg tttgctacaa tccgttgtgg ttaatactaa tgatgcggtg
1140gtgattacgg aagcggagcc cattgatgat cccgggccga gaattctcta tgtcaatgaa
1200gcatttacta aaatcaccgg ttatactgct gaagaaatgc taggcaaaac cccccgagtt
1260ttacagggac caaaaactag tcgcactgaa ttagataggg tgcggcaagc cattagtcaa
1320tggcaatcag ttaccgttga agtgattaat tatcgtaagg atggcagtga gttttgggtg
1380gaatttagtc tggtgcccgt tgccaataaa acaggttttt acacccattg gattgctgtg
1440caaagggatg tcactgagcg ccgacgcacg gaggaagtcc gcctagcttt agaacgggaa
1500aaagaattaa gccgcctaaa aactcgtttt ttctccatgg cttcccatga atttcgtact
1560cccctcagta cggccttagc tgctgcccaa ttactggaaa attctgaagt ggcctggctt
1620gatcccgata agcgtagccg gaacttacac cgtattcaaa attccgtgaa aaatatggta
1680cagctcctgg atgatatttt aatcattaac cgtgccgaag cgggcaaatt ggaatttaat
1740cctaattggt tagatttgaa attattgttc cagcaattta tcgaagaaat tcaattaagt
1800gtcagtgacc aatattattt tgactttatt tgtagcgctc aagatacgaa ggcattggtg
1860gatgaaaggt tagtgcggtc tattttatct aatctgttat ctaatgcgat taaatactct
1920cccgggggag ggcagattaa aattgcccta agcctagatt cggaacagat tatttttgaa
1980gtcaccgacc agggcattgg catttcgcca gaggaccaaa agcaaatttt tgaacccttt
2040catcggggca aaaatgtcag aaatattacg ggaacaggac tcggtttaat ggttgccaag
2100aaatgtgttg acttacacag tggcagtatc ttgctaaaaa gtgcagttga ccagggaaca
2160acagttacta tctgtttaaa acgctataac catttgcctc gagctcacaa acagaaaatt
2220gtggcaccgg tgaagcagac tctcaacttt gacttgctaa agttagctgg tgatgttgaa
2280tctaatcctg ga
2292482295DNAArtificial SequenceLRHK1-05 48atgatgttac aaccaaagaa
gaaaaggaag gtgggtggaa gacaaaacca agaacgccgc 60aggattgaaa ttagcatcaa
gcaacaaacc caacgggaac gatttattaa ccaaattacc 120caacatatcc gccaatcttt
aaacttggaa acggttttaa ataccaccgt cgctgaagtt 180aaaaccctgt tgcaagttga
tcgagttctg gtgtatcgct ttaacccgga ttggagcggc 240gagtttatcc atgaaagcgt
ggcccagatg tgggaaccgc tgaaggatct gcagaacaac 300tttccgctgt ggcaagatac
ctatttacaa gaaaatgagg gtggccgcta ccgcaatcat 360gaaagtctgg ccgtgggcga
tgtggaaacc gccggtttca ccgattgcca tttagataat 420ctgcgtcgct tcgaaattcg
cgcctttctg accgtgccgg tttttgttgg tgaacagctg 480tggggtctgc tgggcgccta
tcagaatggt gcaccgcgcc attggcaagc tcgcgaaatt 540catctgctgc accagatcgc
caaccagctg ggtatcgcca tccaacaatc ggaattatat 600gagcaattac agcaactcaa
taaagatttg gaaaaccgag tcgaaaaacg cacccagcaa 660cttgccgcca ccaatcaatc
cctaagaatg gaaatcagtg agcgacaaaa aacggaagcc 720gctctccgcc acactaacca
tactctgcaa tccctgattg cggcctcccc caggggtatt 780tttaccctta atttagcaga
ccaaattcag atttggaatc ctacagcaga acgtattttt 840ggttggacag aaacagaaat
tattgcccat ccagaattat taacatccaa cattttgctg 900gaagattatc agcaatttaa
acagaaagtt ttatcaggca tggtttcccc tagcctagaa 960ttaaaatgtc aaaaaaaaga
tggtagttgg attgaaattg tcctttccgc tgctccccta 1020ttggatagtg aagaaaatat
tgccggattg gtggcggttg tcgccgatat taccgagcaa 1080aagcggcagg cagaacaaat
tcgtttgcta caatccgttg tggttaatac taatgatgcg 1140gtggtgatta cggaagcgga
gcccattgat gatcccgggc cgagaattct ctatgtcaat 1200gaagcattta ctaaaatcac
cggttatact gctgaagaaa tgctaggcaa aaccccccga 1260gttttacagg gaccaaaaac
tagtcgcact gaattagata gggtgcggca agccattagt 1320caatggcaat cagttaccgt
tgaagtgatt aattatcgta aggatggcag tgagttttgg 1380gtggaattta gtctggtgcc
cgttgccaat aaaacaggtt tttacaccca ttggattgct 1440gtgcaaaggg atgtcactga
gcgccgacgc acggaggaag tccgcctagc tttagaacgg 1500gaaaaagaat taagccgcct
aaaaactcgt tttttctcca tggcttccca tgaatttcgt 1560actcccctca gtacggcctt
agctgctgcc caattactgg aaaattctga agtggcctgg 1620cttgatcccg ataagcgtag
ccggaactta caccgtattc aaaattccgt gaaaaatatg 1680gtacagctcc tggatgatat
tttaatcatt aaccgtgccg aagcgggcaa attggaattt 1740aatcctaatt ggttagattt
gaaattattg ttccagcaat ttatcgaaga aattcaatta 1800agtgtcagtg accaatatta
ttttgacttt atttgtagcg ctcaagatac gaaggcattg 1860gtggatgaaa ggttagtgcg
gtctatttta tctaatctgt tatctaatgc gattaaatac 1920tctcccgggg gagggcagat
taaaattgcc ctaagcctag attcggaaca gattattttt 1980gaagtcaccg accagggcat
tggcatttcg ccagaggacc aaaagcaaat ttttgaaccc 2040tttcatcggg gcaaaaatgt
cagaaatatt acgggaacag gactcggttt aatggttgcc 2100aagaaatgtg ttgacttaca
cagtggcagt atcttgctaa aaagtgcagt tgaccaggga 2160acaacagtta ctatctgttt
aaaacgctat aaccatttgc ctcgagctca caaacagaaa 2220attgtggcac cggtgaagca
gactctcaac tttgacttgc taaagttagc tggtgatgtt 2280gaatctaatc ctgga
2295492286DNAArtificial
SequenceLRHK1-10 49atgatgttac aaccaaagaa gaaaaggaag gtgggtggaa gacaaaacca
agaacgccgc 60aggattgaaa ttagcatcaa gcaacaaacc caacgggaac gatttattaa
ccaaattacc 120caacatatcc gccaatcttt aaacttggaa acggttttaa ataccaccgt
cgctgaagtt 180aaaaccctgt tgcaagttga tcgagttacc atttatcgtt ttcgcgccga
ttggagcggt 240gaatttgtgg ccgaatcttt agcccaaggt tggacaccgg tgcgtgaaat
tgtgccggtg 300gttgccgatg actatctgca agaaacccaa ggtcgcaact ttgccaatgg
caaaagcatc 360gtgattaaag atatttacag cgccaactac agcatctgcc acattgcact
gctggaactg 420atgcaagctc gcgcctatat gatcgtgccg atcttccaag gtgaaaagct
gtggggtctg 480ctggccgcct atcagaacat caagcctcgc gattggcaag aagatgaggt
ggatctggtg 540atgcagatcg gtacccagct gggcatcgcc atccaacaat cggaattata
tgagcaatta 600cagcaactca ataaagattt ggaaaaccga gtcgaaaaac gcacccagca
acttgccgcc 660accaatcaat ccctaagaat ggaaatcagt gagcgacaaa aaacggaagc
cgctctccgc 720cacactaacc atactctgca atccctgatt gcggcctccc ccaggggtat
ttttaccctt 780aatttagcag accaaattca gatttggaat cctacagcag aacgtatttt
tggttggaca 840gaaacagaaa ttattgccca tccagaatta ttaacatcca acattttgct
ggaagattat 900cagcaattta aacagaaagt tttatcaggc atggtttccc ctagcctaga
attaaaatgt 960caaaaaaaag atggtagttg gattgaaatt gtcctttccg ctgctcccct
attggatagt 1020gaagaaaata ttgccggatt ggtggcggtt gtcgccgata ttaccgagca
aaagcggcag 1080gcagaacaaa ttcgtttgct acaatccgtt gtggttaata ctaatgatgc
ggtggtgatt 1140acggaagcgg agcccattga tgatcccggg ccgagaattc tctatgtcaa
tgaagcattt 1200actaaaatca ccggttatac tgctgaagaa atgctaggca aaaccccccg
agttttacag 1260ggaccaaaaa ctagtcgcac tgaattagat agggtgcggc aagccattag
tcaatggcaa 1320tcagttaccg ttgaagtgat taattatcgt aaggatggca gtgagttttg
ggtggaattt 1380agtctggtgc ccgttgccaa taaaacaggt ttttacaccc attggattgc
tgtgcaaagg 1440gatgtcactg agcgccgacg cacggaggaa gtccgcctag ctttagaacg
ggaaaaagaa 1500ttaagccgcc taaaaactcg ttttttctcc atggcttccc atgaatttcg
tactcccctc 1560agtacggcct tagctgctgc ccaattactg gaaaattctg aagtggcctg
gcttgatccc 1620gataagcgta gccggaactt acaccgtatt caaaattccg tgaaaaatat
ggtacagctc 1680ctggatgata ttttaatcat taaccgtgcc gaagcgggca aattggaatt
taatcctaat 1740tggttagatt tgaaattatt gttccagcaa tttatcgaag aaattcaatt
aagtgtcagt 1800gaccaatatt attttgactt tatttgtagc gctcaagata cgaaggcatt
ggtggatgaa 1860aggttagtgc ggtctatttt atctaatctg ttatctaatg cgattaaata
ctctcccggg 1920ggagggcaga ttaaaattgc cctaagccta gattcggaac agattatttt
tgaagtcacc 1980gaccagggca ttggcatttc gccagaggac caaaagcaaa tttttgaacc
ctttcatcgg 2040ggcaaaaatg tcagaaatat tacgggaaca ggactcggtt taatggttgc
caagaaatgt 2100gttgacttac acagtggcag tatcttgcta aaaagtgcag ttgaccaggg
aacaacagtt 2160actatctgtt taaaacgcta taaccatttg cctcgagctc acaaacagaa
aattgtggca 2220ccggtgaagc agactctcaa ctttgacttg ctaaagttag ctggtgatgt
tgaatctaat 2280cctgga
2286502289DNAArtificial SequenceLRHK1-12 50atgatgttac
aaccaaagaa gaaaaggaag gtgggtggaa gacaaaacca agaacgccgc 60aggattgaaa
ttagcatcaa gcaacaaacc caacgggaac gatttattaa ccaaattacc 120caacatatcc
gccaatcttt aaacttggaa acggttttaa ataccaccgt cgctgaagtt 180aaaaccctgt
tgcaagttga tcgagttgtt atttttcagt tttcacccga ctctgacttt 240tccgttggta
atattgtggc agagtcggta ttggctccat ttaagccaat cattaatagt 300gcaattgaag
aaacttgttt tagtaataac tatgcccaaa ggtatcagca gggcagaatt 360caggtcattg
aggatattca ccagtcccat cttaggcaat gccacattga ctttcttgcc 420aggctacagg
tcagggcaaa cctagtgcta ccactaatta atgatgccat tttgtggggc 480ttattgtgta
ttcatcaatg tgacagttct agagtttggg aacaaacaga aattgatctg 540ctcaagcaga
tcactaatca gtttgaaatc gccatccaac aatcggaatt atatgagcaa 600ttacagcaac
tcaataaaga tttggaaaac cgagtcgaaa aacgcaccca gcaacttgcc 660gccaccaatc
aatccctaag aatggaaatc agtgagcgac aaaaaacgga agccgctctc 720cgccacacta
accatactct gcaatccctg attgcggcct cccccagggg tatttttacc 780cttaatttag
cagaccaaat tcagatttgg aatcctacag cagaacgtat ttttggttgg 840acagaaacag
aaattattgc ccatccagaa ttattaacat ccaacatttt gctggaagat 900tatcagcaat
ttaaacagaa agttttatca ggcatggttt cccctagcct agaattaaaa 960tgtcaaaaaa
aagatggtag ttggattgaa attgtccttt ccgctgctcc cctattggat 1020agtgaagaaa
atattgccgg attggtggcg gttgtcgccg atattaccga gcaaaagcgg 1080caggcagaac
aaattcgttt gctacaatcc gttgtggtta atactaatga tgcggtggtg 1140attacggaag
cggagcccat tgatgatccc gggccgagaa ttctctatgt caatgaagca 1200tttactaaaa
tcaccggtta tactgctgaa gaaatgctag gcaaaacccc ccgagtttta 1260cagggaccaa
aaactagtcg cactgaatta gatagggtgc ggcaagccat tagtcaatgg 1320caatcagtta
ccgttgaagt gattaattat cgtaaggatg gcagtgagtt ttgggtggaa 1380tttagtctgg
tgcccgttgc caataaaaca ggtttttaca cccattggat tgctgtgcaa 1440agggatgtca
ctgagcgccg acgcacggag gaagtccgcc tagctttaga acgggaaaaa 1500gaattaagcc
gcctaaaaac tcgttttttc tccatggctt cccatgaatt tcgtactccc 1560ctcagtacgg
ccttagctgc tgcccaatta ctggaaaatt ctgaagtggc ctggcttgat 1620cccgataagc
gtagccggaa cttacaccgt attcaaaatt ccgtgaaaaa tatggtacag 1680ctcctggatg
atattttaat cattaaccgt gccgaagcgg gcaaattgga atttaatcct 1740aattggttag
atttgaaatt attgttccag caatttatcg aagaaattca attaagtgtc 1800agtgaccaat
attattttga ctttatttgt agcgctcaag atacgaaggc attggtggat 1860gaaaggttag
tgcggtctat tttatctaat ctgttatcta atgcgattaa atactctccc 1920gggggagggc
agattaaaat tgccctaagc ctagattcgg aacagattat ttttgaagtc 1980accgaccagg
gcattggcat ttcgccagag gaccaaaagc aaatttttga accctttcat 2040cggggcaaaa
atgtcagaaa tattacggga acaggactcg gtttaatggt tgccaagaaa 2100tgtgttgact
tacacagtgg cagtatcttg ctaaaaagtg cagttgacca gggaacaaca 2160gttactatct
gtttaaaacg ctataaccat ttgcctcgag ctcacaaaca gaaaattgtg 2220gcaccggtga
agcagactct caactttgac ttgctaaagt tagctggtga tgttgaatct 2280aatcctgga
2289
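
By way of non-limiting illustration, the correspondence between paired nucleic acid and amino acid entries of the sequence listing can be checked computationally. The following Python sketch assumes the standard genetic code; the helper names (`clean_dna`, `translate`, `to_three_letter`) are hypothetical and are not part of the application. It translates the NLS coding sequences of SEQ ID NO: 26 and SEQ ID NO: 32 and compares the result with the residues listed for SEQ ID NO: 27 and SEQ ID NO: 33.

```python
# Illustrative sketch only: helper names are hypothetical, and the standard
# genetic code (NCBI translation table 1) is assumed.
import re

# Codon table built from the conventional T, C, A, G ordering of bases.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def clean_dna(raw):
    """Remove position counters, spaces and line breaks from a listing entry."""
    return re.sub(r"[^acgt]", "", raw.lower())


def translate(dna):
    """Translate in frame 1; an incomplete trailing codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def to_three_letter(protein):
    """Convert one-letter residues to the three-letter style of the listing."""
    return " ".join(THREE_LETTER[aa] for aa in protein)


# Entries copied from the sequence listing (trailing number is the base count).
SEQ_26 = "ttacaaccaa agaagaaaag gaaggtgggt gga 33"
SEQ_27 = "Leu Gln Pro Lys Lys Lys Arg Lys Val Gly Gly"
SEQ_32 = "ctccagccta agaagaagag aaaggttgga ggt 33"
SEQ_33 = "Leu Gln Pro Lys Lys Lys Arg Lys Val Gly Gly"

for dna, listed in ((SEQ_26, SEQ_27), (SEQ_32, SEQ_33)):
    print(to_three_letter(translate(clean_dna(dna))) == listed)
```

Under these assumptions both comparisons print `True`: the two NLS coding sequences use different codons but encode the same eleven residues. The same check could be applied to other DNA and protein pairs in the listing after removing the position counters.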