Patent application title: TRANSGENIC PLANTS WITH ENHANCED TRAITS
Inventors:
IPC8 Class: AC12N1582FI
USPC Class:
1 1
Class name:
Publication date: 2021-03-11
Patent application number: 20210071188
Abstract:
This disclosure provides recombinant DNA constructs and transgenic plants
having enhanced traits such as increased yield, increased nitrogen use
efficiency, and enhanced drought tolerance or water use efficiency.
Transgenic plants may include field crops as well as plant propagules,
plant parts and progeny of such transgenic plants. Methods of making and
using such transgenic plants are also provided. This disclosure also
provides methods of producing seed from such transgenic plants, growing
such seed, and selecting progeny plants with enhanced traits. Also
disclosed are transgenic plants with altered phenotypes which are useful
for screening and selecting transgenic events for the desired enhanced
trait.
Claims:
1. A recombinant DNA construct comprising: a) a polynucleotide sequence
with at least 90%, at least 91%, at least 92%, at least 93%, at least
94%, at least 95%, at least 96%, at least 97%, at least 98%, at least
99%, or 100% identity to a sequence selected from the group consisting of
SEQ ID NOs: 1-9; or b) a polynucleotide sequence that encodes a
polypeptide comprising an amino acid sequence with at least 90%, at least
91%, at least 92%, at least 93%, at least 94%, at least 95%, at least
96%, at least 97%, at least 98%, at least 99%, or 100% identity to a
sequence selected from the group consisting of SEQ ID NOs: 10-18 and
30-39.
2. The recombinant DNA construct of claim 1, further comprising a heterologous promoter functional in a plant cell and operably linked to the polynucleotide sequence.
3. A vector or plasmid comprising the recombinant DNA construct of claim 1.
4. A plant comprising the recombinant DNA construct of claim 1.
5. The plant of claim 4, wherein the plant is a field crop.
6. The plant of claim 5, wherein the field crop plant is selected from the group consisting of corn, soybean, cotton, canola, rice, barley, oat, wheat, turf grass, alfalfa, sugar beet, sunflower, quinoa and sugarcane.
7. The plant of claim 4, wherein the plant has an altered phenotype or an enhanced trait as compared to a control plant.
8. The plant of claim 7, wherein a) the enhanced trait is selected from the group consisting of: decreased days from planting to maturity, increased stalk size, increased number of leaves, increased plant height growth rate in vegetative stage, increased ear size, increased ear dry weight per plant, increased number of kernels per ear, increased weight per kernel, increased number of kernels per plant, decreased ear void, extended grain fill period, reduced plant height, increased number of root branches, increased total root length, increased yield, increased nitrogen use efficiency, and increased water use efficiency as compared to a control plant; or b) the altered phenotype is selected from the group consisting of plant height, biomass, canopy area, anthocyanin content, chlorophyll content, water applied, water content, and water use efficiency.
9. (canceled)
10. A plant part or propagule comprising the recombinant DNA construct of claim 1, wherein the plant part or propagule is selected from the group consisting of cells, pollen, ovule, flower, embryo, leaf, root, stem, shoot, meristem, grain and seed.
11. A method for altering a phenotype, enhancing a trait, increasing yield, increasing nitrogen use efficiency, or increasing water use efficiency in a plant comprising producing a transgenic plant comprising a recombinant DNA construct of claim 1.
12. The method of claim 11, wherein a) the recombinant DNA construct further comprises a heterologous promoter functional in a plant cell and operably linked to the polynucleotide sequence of the recombinant DNA construct; b) the transgenic plant is produced by transforming a plant cell or tissue with the recombinant DNA construct, and regenerating or developing the transgenic plant from the plant cell or tissue comprising the recombinant DNA construct; or c) the transgenic plant is produced by site-directed integration of the recombinant DNA construct into the genome of a plant cell or tissue using a donor template comprising the recombinant DNA construct, and regenerating or developing the transgenic plant from the plant cell or tissue comprising the recombinant DNA construct.
13. (canceled)
14. The method of claim 11, further comprising: producing a progeny plant comprising the recombinant DNA construct by crossing the transgenic plant with: a) itself; b) a second plant from the same plant line; c) a wild type plant; or d) a second plant from a different plant line, to produce a seed, growing the seed to produce a progeny plant; and selecting a progeny plant with increased yield, increased nitrogen use efficiency, or increased water use efficiency as compared to a control plant.
15. (canceled)
16. A plant produced by the method of claim 11.
17. A recombinant DNA molecule for use as a donor template in site-directed integration, wherein a) the recombinant DNA molecule comprises an insertion sequence comprising: i) a polynucleotide sequence with at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 1-9; or ii) a polynucleotide sequence that encodes a polypeptide comprising an amino acid sequence with at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 10-18 and 30-39; or b) the recombinant DNA molecule comprises an insertion sequence for modulation of expression of an endogenous gene, wherein the endogenous gene comprises: iii) a polynucleotide sequence encoding a mRNA molecule with at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 1-9; or iv) a polynucleotide sequence that encodes a polypeptide having an amino acid sequence with at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 10-18 and 30-39.
18. The recombinant DNA molecule of claim 17, wherein the insertion sequence further comprises a heterologous promoter functional in a plant cell and operably linked to the polynucleotide sequence.
19. The recombinant DNA molecule of claim 17, further comprising at least one homology arm flanking the insertion sequence.
20. The recombinant DNA molecule of claim 17, wherein a) the recombinant DNA molecule further comprises at least one cassette encoding site-specific nuclease, wherein the site specific nuclease is selected from the group comprising zinc-finger nuclease, an engineered or native meganuclease, a TALE endonuclease, or an RNA-guided endonuclease; or b) the recombinant DNA molecule further comprises at least one cassette encoding one or more guide RNAs.
21-22. (canceled)
23. The recombinant DNA molecule of claim 17, wherein the insertion sequence comprises a promoter, an enhancer, an intron, or a terminator region.
24-25. (canceled)
26. A method for altering a phenotype, enhancing a trait, increasing yield, increasing nitrogen use efficiency, or increasing water use efficiency in a plant comprising: a) modifying the genome of a plant cell by: i) identifying an endogenous gene of the plant corresponding to a gene selected from the list of genes in Tables 1 and 14, and their homologs, and ii) modifying a sequence of the endogenous gene in the plant cell via genome editing or site-directed integration to modify the expression level of the endogenous gene; and b) regenerating or developing a plant from the plant cell.
27. The method of claim 26, wherein the modifying step comprises modifying a regulatory or upstream sequence of the endogenous gene via genome editing.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Application No. 62/677,448, filed May 29, 2018, which is incorporated herein by reference in its entirety.
INCORPORATION OF SEQUENCE LISTING
[0002] The sequence listing file named "MONS:457WO.txt", which is 100 kilobytes (measured in MS-WINDOWS) and was created on May 28, 2019, is filed herewith and incorporated herein by reference in its entirety.
FIELD OF THE INVENTION
[0003] Disclosed herein are recombinant DNA constructs, plants having altered phenotypes, enhanced traits, increased yield, increased nitrogen use efficiency and increased water use efficiency; propagules, progenies and field crops of such plants; and methods of making and using such plants. Also disclosed are methods of producing seed from such plants, growing such seed and/or selecting progeny plants with altered phenotypes, enhanced traits, increased yield, increased nitrogen use efficiency and increased water use efficiency.
SUMMARY
[0004] In one aspect, the present disclosure provides recombinant DNA constructs each comprising: (a) a polynucleotide sequence with at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 1-9; or (b) a polynucleotide sequence that encodes a polypeptide comprising an amino acid sequence with at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 10-18 and 30-39.
[0005] Plants comprising a recombinant DNA construct may be a field crop plant, such as corn, soybean, cotton, canola, rice, barley, oat, wheat, turf grass, alfalfa, sugar beet, sunflower, quinoa and sugarcane. A plant comprising a recombinant DNA construct may have an altered phenotype or an enhanced trait as compared to a control plant. The enhanced trait may be, for example, decreased days from planting to maturity, increased stalk size, increased number of leaves, increased plant height growth rate in vegetative stage, increased ear size, increased ear dry weight per plant, increased number of kernels per ear, increased weight per kernel, increased number of kernels per plant, decreased ear void, extended grain fill period, reduced plant height, increased number of root branches, increased total root length, increased yield, increased nitrogen use efficiency, and increased water use efficiency as compared to a control plant. The altered phenotype may be, for example, plant height, biomass, canopy area, anthocyanin content, chlorophyll content, water applied, water content, and water use efficiency.
[0006] According to another aspect, the present disclosure provides methods for altering a phenotype, enhancing a trait, increasing yield, increasing nitrogen use efficiency, or increasing water use efficiency in a plant comprising producing a transgenic plant comprising a recombinant DNA construct of the present disclosure. The step of producing a transgenic plant may further comprise transforming a plant cell or tissue with the recombinant DNA construct, and regenerating or developing the transgenic plant from the plant cell or tissue comprising the recombinant DNA construct. The transgenic plant may then be crossed to (a) itself; (b) a second plant from the same plant line; (c) a wild type plant; or (d) a second plant from a different plant line, to produce one or more progeny plants; and a plant may be selected from the progeny plants having increased yield, increased nitrogen use efficiency, or increased water use efficiency, or other altered phenotype or enhanced trait as compared to a control plant. Plants produced by this method are further provided.
[0007] According to another aspect, the present disclosure provides recombinant DNA molecules for use as a donor template in site-directed integration, wherein a recombinant DNA molecule comprises an insertion sequence comprising: (a) a polynucleotide sequence with at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 1-9; or (b) a polynucleotide sequence that encodes a polypeptide comprising an amino acid sequence with at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 10-18 and 30-39.
[0008] The insertion sequence of a recombinant DNA molecule may comprise a heterologous promoter functional in a plant cell and operably linked to the polynucleotide sequence. The recombinant DNA molecule may further comprise at least one homology arm flanking the insertion sequence to direct the integration of the insertion sequence into a desired genomic locus. Plants, propagules and plant cells are further provided comprising the insertion sequence. According to some embodiments, the recombinant DNA molecule may further comprise an expression cassette encoding a site-specific nuclease and/or one or more guide RNAs.
[0009] According to another aspect, the present disclosure provides recombinant DNA molecules for use as a donor template in site-directed integration, wherein a recombinant DNA molecule comprises an insertion sequence for modulation of expression of an endogenous gene, wherein the endogenous gene comprises: (a) a polynucleotide sequence encoding an mRNA molecule with at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% identity, or 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 1-9; or (b) a polynucleotide sequence that encodes a polypeptide having an amino acid sequence with at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% identity, or 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 10-18 and 30-39.
[0010] The insertion sequence may comprise a promoter, an enhancer, an intron, or a terminator region, which may correspond to a promoter, an enhancer, an intron, or a terminator region of an endogenous gene. Plants, propagules and plant cells are further provided comprising the insertion sequence. The recombinant DNA molecule may further comprise at least one homology arm flanking the insertion sequence. According to some embodiments, the recombinant DNA molecule may further comprise an expression cassette encoding a site-specific nuclease and/or one or more guide RNAs.
[0011] According to another aspect, the present disclosure provides methods for altering a phenotype, enhancing a trait, increasing yield, increasing nitrogen use efficiency, or increasing water use efficiency in a plant comprising: (a) modifying the genome of a plant cell by: (i) identifying an endogenous gene of the plant corresponding to a gene selected from the list of genes in Tables 1 and 14 herein, and their homologs, and (ii) modifying a sequence of the endogenous gene in the plant cell via genome editing or site-directed integration to modify, augment, or increase the expression level of the endogenous gene; and (b) regenerating or developing a plant from the plant cell.
DETAILED DESCRIPTION
[0012] In the attached sequence listing:
[0013] SEQ ID NOs 1 to 9 are nucleotide sequences or DNA coding sequences or strands that may be used in recombinant DNA constructs to impart an enhanced trait in plants, each representing a coding sequence for a protein.
[0014] SEQ ID NOs 10 to 18 are amino acid sequences encoded by the nucleotide or DNA sequences of SEQ ID NOs 1 to 9, respectively, in the same order.
[0015] SEQ ID NOs 19 to 29 are nucleotide or DNA sequences that may be used in recombinant DNA constructs to impart an enhanced trait or altered phenotype in plants, each representing a promoter with a specific type of expression pattern.
[0016] SEQ ID NOs 30 to 39 are amino acid sequences of proteins homologous to proteins having the amino acid sequences of SEQ ID NOs 10 to 18.
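The layout of the sequence listing described in paragraphs [0013] to [0016] can be sketched as a small lookup. This is an illustrative sketch only: the function names and category labels are hypothetical, and the pairing of coding sequence n with protein n + 9 is inferred from the statement that SEQ ID NOs 10 to 18 are encoded by SEQ ID NOs 1 to 9 in the same order.

```python
# Illustrative sketch; function names and labels are hypothetical,
# not part of the disclosure.

def describe_seq_id(seq_id: int) -> str:
    """Return the category of a SEQ ID NO per paragraphs [0013]-[0016]."""
    if 1 <= seq_id <= 9:
        return "coding sequence"
    if 10 <= seq_id <= 18:
        return "encoded protein"
    if 19 <= seq_id <= 29:
        return "promoter"
    if 30 <= seq_id <= 39:
        return "homolog protein"
    raise ValueError(f"SEQ ID NO {seq_id} is outside the range 1-39")


def encoded_protein_seq_id(coding_seq_id: int) -> int:
    """Map a coding SEQ ID NO (1-9) to its protein SEQ ID NO (10-18).

    Assumes the same-order pairing stated in paragraph [0014].
    """
    if not 1 <= coding_seq_id <= 9:
        raise ValueError("coding SEQ ID NO must be between 1 and 9")
    return coding_seq_id + 9


print(describe_seq_id(25))         # promoter
print(encoded_protein_seq_id(3))   # 12
```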
[0017] Unless otherwise stated, nucleic acid sequences in the text of this specification are given, when read from left to right, in the 5' to 3' direction. One of skill in the art would be aware that a given DNA sequence is understood to define a corresponding RNA sequence which is identical to the DNA sequence except for replacement of the thymine (T) nucleotide of the DNA with uracil (U) nucleotide. Thus, providing a specific DNA sequence is understood to define the exact RNA equivalent. A given first polynucleotide sequence, whether DNA or RNA, further defines the sequence of its exact complement (which can be DNA or RNA), i.e., a second polynucleotide that hybridizes perfectly to the first polynucleotide by forming Watson-Crick base-pairs. By "essentially identical" or "essentially complementary" to a target gene or a fragment of a target gene is meant that a polynucleotide strand (or at least one strand of a double-stranded polynucleotide) is designed to hybridize (generally under physiological conditions such as those found in a living plant or animal cell) to a target gene or to a fragment of a target gene or to the transcript of the target gene or the fragment of a target gene; one of skill in the art would understand that such hybridization does not necessarily require 100% sequence identity or complementarity. As used herein, "operably linked" means the association of two or more DNA fragments in a recombinant DNA construct so that the expression or function of one (for example, protein-encoding DNA), is controlled or influenced by the other (for example, a promoter). A first nucleic acid sequence is "operably" connected or "linked" with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For example, a promoter sequence is "operably linked" to DNA if the promoter provides for transcription or expression of the DNA. 
Generally, operably linked DNA sequences are contiguous.
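The DNA-to-RNA and exact-complement conventions stated in paragraph [0017] can be illustrated with a minimal sketch. The helper names and the example sequence are hypothetical and are not taken from the sequence listing; the sketch assumes unambiguous A, C, G, T input written 5' to 3'.

```python
# Illustrative sketch; helper names and the example sequence are hypothetical.

# Watson-Crick partner for each DNA base.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def dna_to_rna(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence (T replaced by U)."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Return the exact DNA complement, written 5' to 3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


example = "ATGGCT"  # made-up sequence for illustration only
print(dna_to_rna(example))          # AUGGCU
print(reverse_complement(example))  # AGCCAT
```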
[0018] As used herein, the terms "percent identity" and "percent identical" (including any numerical percentage identity) in reference to two or more nucleotide or protein sequences is calculated by (i) comparing two optimally aligned sequences (nucleotide or protein) over a window of comparison, (ii) determining the number of positions at which the identical nucleic acid base (for nucleotide sequences) or amino acid residue (for proteins) occurs in both sequences to yield the number of matched positions, (iii) dividing the number of matched positions by the total number of positions in the window of comparison, and then (iv) multiplying this quotient by 100% to yield the percent identity. For percent identity, two or more polynucleotide or protein sequences are optimally aligned if the maximum number of ordered nucleotides or amino acids of the two or more sequences are linearly aligned or matched (i.e., identical) with allowance for gap(s) in their alignment. For purposes of calculating "percent identity" between DNA and RNA sequences, a uracil (U) of a RNA sequence is considered identical to a thymine (T) of a DNA sequence. If the window of comparison is defined as a region of alignment between two or more sequences (i.e., excluding nucleotides at the 5' and 3' ends of aligned polynucleotide sequences, or amino acids at the N-terminus and C-terminus of aligned protein sequences, that are not identical between the compared sequences), then the "percent identity" may also be referred to as a "percent alignment identity". If the "percent identity" is being calculated in relation to a reference sequence without a particular comparison window being specified, then the percent identity is determined by dividing the number of matched positions over the region of alignment by the total length of the reference sequence. 
Accordingly, for purposes of the present disclosure, when two sequences (query and subject) are optimally aligned (with allowance for gaps in their alignment), the "percent identity" for the query sequence is equal to the number of identical positions between the two sequences divided by the total number of positions in the query sequence over its length (or a comparison window), which is then multiplied by 100%.
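The percent identity calculation of paragraph [0018] can be sketched as follows. This is a minimal illustration, not a prescribed implementation: it assumes the two sequences have already been optimally aligned by some external tool and are supplied as equal-length strings with "-" marking gaps. The function name, parameter names, and example sequences are hypothetical.

```python
# Illustrative sketch; names and example sequences are hypothetical.
# Assumes the inputs are already optimally aligned, with "-" as the gap symbol.

GAP = "-"


def percent_identity(aligned_query: str,
                     aligned_subject: str,
                     over_query_length: bool = True,
                     nucleotide: bool = True) -> float:
    """Percent identity of a query against a subject.

    over_query_length=True divides matches by the number of query
    positions (gaps excluded); False divides by the alignment window.
    nucleotide=True treats U (RNA) as identical to T (DNA).
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    q, s = aligned_query.upper(), aligned_subject.upper()
    if nucleotide:
        q, s = q.replace("U", "T"), s.replace("U", "T")
    # Count positions where both sequences carry the same residue.
    matches = sum(1 for a, b in zip(q, s) if a == b and a != GAP)
    denominator = len(q) - q.count(GAP) if over_query_length else len(q)
    if denominator == 0:
        raise ValueError("no positions to compare")
    return 100.0 * matches / denominator


query = "ATGGCTAA-C"     # DNA query with one alignment gap
subject = "AUGGCAAAGC"   # RNA subject
print(round(percent_identity(query, subject), 1))                # 88.9
print(percent_identity(query, subject, over_query_length=False)) # 80.0
```

In this made-up example, 8 of the 9 query positions match, so the query would fall just below an "at least 90%" identity threshold.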
[0019] As used herein, the terms "percent complementarity" or "percent complementary" (including any numerical percentage complementarity) in reference to two nucleotide sequences is similar to the concept of percent identity, but refers to the percentage of nucleotides of a query sequence that optimally base-pair or hybridize to nucleotides of a subject sequence when the query and subject sequences are linearly arranged and optimally base paired. Such a percent complementarity may be between two DNA strands, two RNA strands, or a DNA strand and an RNA strand. The "percent complementarity" is calculated by (i) optimally base-pairing or hybridizing the two nucleotide sequences in a linear and fully extended arrangement (i.e., without folding or secondary structures) over a window of comparison, (ii) determining the number of positions that base-pair between the two sequences over the window of comparison to yield the number of complementary positions, (iii) dividing the number of complementary positions by the total number of positions in the window of comparison, and (iv) multiplying this quotient by 100% to yield the percent complementarity of the two sequences. Optimal base pairing of two sequences may be determined based on the known pairings of nucleotide bases, such as G-C, A-T, and A-U, through hydrogen bonding. If the "percent complementarity" is being calculated in relation to a reference sequence without specifying a particular comparison window, then the percent complementarity is determined by dividing the number of complementary positions between the two linear sequences by the total length of the reference sequence.
Thus, for purposes of the present disclosure, when two sequences (query and subject) are optimally base-paired (with allowance for mismatches or non-base-paired nucleotides but without folding or secondary structures), the "percent complementarity" for the query sequence is equal to the number of base-paired positions between the two sequences divided by the total number of positions in the query sequence over its length (or by the number of positions in the query sequence over a comparison window), which is then multiplied by 100%.
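The percent complementarity calculation of paragraph [0019] can be sketched in the same way. This simplified illustration assumes the two strands are paired end to end in a single antiparallel register with no gaps and no search for the optimal offset, so it stands in for the "optimally base-paired" arrangement only when that register is already the best one. The function name and example sequences are hypothetical.

```python
# Illustrative sketch; names and example sequences are hypothetical.
# Assumes a fixed, ungapped, end-to-end antiparallel pairing register.

# Base pairs named in the text: G-C, A-T and A-U (in either orientation).
PAIRS = {("G", "C"), ("C", "G"),
         ("A", "T"), ("T", "A"),
         ("A", "U"), ("U", "A")}


def percent_complementarity(query: str, subject: str) -> float:
    """Percent of query positions that base-pair with the subject.

    Both sequences are given 5' to 3'; the subject is reversed so that
    the two strands are compared in antiparallel orientation.
    """
    q = query.upper()
    s = subject.upper()[::-1]
    if not q:
        raise ValueError("query sequence is empty")
    paired = sum(1 for a, b in zip(q, s) if (a, b) in PAIRS)
    # Divide by the full query length, as for a reference sequence
    # with no comparison window specified.
    return 100.0 * paired / len(q)


print(percent_complementarity("AUGGCU", "AGCCAT"))            # 100.0
print(round(percent_complementarity("AUGGCU", "AGCGAT"), 1))  # 83.3
```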
[0020] As used herein, the term "expression" refers to the production of a polynucleotide or a protein by a plant, plant cell or plant tissue which can give rise to an altered phenotype or enhanced trait. Expression can also refer to the process by which information from a gene is used in the synthesis of functional gene products, which may include but are not limited to other polynucleotides or proteins which may serve, e.g., an enzymatic, structural or regulatory function. Gene products having a regulatory function include but are not limited to elements that affect the occurrence or level of transcription or translation of a target protein. In some cases, the expression product is a non-coding functional RNA.
[0021] "Modulation" of expression refers to the process of effecting either overexpression or suppression of a polynucleotide or a protein.
[0022] The term "suppression" as used herein refers to a lower expression level of a target polynucleotide or target protein in a plant, plant cell or plant tissue, as compared to the expression in a wild-type or control plant, cell or tissue, at any developmental or temporal stage for the gene. The term "target protein" as used in the context of suppression refers to a protein which is suppressed; similarly, "target mRNA" refers to a polynucleotide which can be suppressed or, once expressed, degraded so as to result in suppression of the target protein it encodes. The term "target gene" as used in the context of suppression refers to a "target protein" and/or "target mRNA". In alternative non-limiting embodiments, suppression of a target protein and/or target polynucleotide can give rise to an enhanced trait or altered phenotype directly or indirectly. In one exemplary embodiment, the target protein is one which can indirectly increase or decrease the expression of one or more other proteins, the increased or decreased expression, respectively, of which is associated with an enhanced trait or an altered phenotype. In another exemplary embodiment, the target protein can bind to one or more other proteins associated with an altered phenotype or enhanced trait to enhance or inhibit their function and thereby affect the altered phenotype or enhanced trait indirectly.
[0023] Suppression can be applied using numerous approaches. Non-limiting examples include: suppressing an endogenous gene(s) or a subset of genes in a pathway; suppressing one or more mutation(s) that has/have resulted in decreased activity of a protein; suppressing the production of an inhibitory agent, to elevate, reduce or eliminate the level of substrate that an enzyme requires for activity; producing a new protein; activating a normally silent gene; or accumulating a product that does not normally increase under natural conditions.
[0024] Conversely, the term "overexpression" as used herein refers to a greater expression level of a polynucleotide or a protein in a plant, plant cell or plant tissue, compared to expression in a wild-type plant, cell or tissue, at any developmental or temporal stage for the gene. Overexpression can take place in plant cells normally lacking expression of polypeptides functionally equivalent or identical to the present polypeptides. Overexpression can also occur in plant cells where endogenous expression of the present polypeptides or functionally equivalent molecules normally occurs, but such normal expression is at a lower level. Overexpression thus results in a greater than normal production, or "overproduction" of the polypeptide in the plant, cell or tissue.
[0025] The term "target protein" as used herein in the context of overexpression refers to a protein which is overexpressed; "target mRNA" refers to an mRNA which encodes and is translated to produce the target protein, which can also be overexpressed. The term "target gene" as used in the context of overexpression refers to a "target protein" and/or "target mRNA". In alternative embodiments, the target protein can effect an enhanced trait or altered phenotype directly or indirectly. In the latter case it may do so, for example, by affecting the expression, function or substrate available to one or more other proteins. In an exemplary embodiment, the target protein can bind to one or more other proteins associated with an altered phenotype or enhanced trait to enhance or inhibit their function.
[0026] Overexpression can be achieved using numerous approaches. In one embodiment, overexpression can be achieved by placing the DNA sequence encoding one or more polynucleotides and/or polypeptides under the control of a promoter, examples of which include but are not limited to endogenous promoters, heterologous promoters, inducible promoters and tissue specific promoters. In one exemplary embodiment, the promoter is a constitutive promoter, for example, the cauliflower mosaic virus 35S transcription initiation region. Thus, depending on the promoter used, overexpression can occur throughout a plant, in specific tissues of the plant, or in the presence or absence of different inducing or inducible agents, such as hormones or environmental signals.
[0027] As used herein a "plant" includes a whole plant, a transgenic plant, meristematic tissue, a shoot organ/structure (for example, leaf, stem and tuber), a root, a flower, a floral organ/structure (for example, a bract, a sepal, a petal, a stamen, a carpel, an anther and an ovule), a seed (including an embryo, endosperm, and a seed coat) and a fruit (the mature ovary), plant tissue (for example, vascular tissue, ground tissue, and the like) and a cell (for example, guard cell, egg cell, pollen, mesophyll cell, and the like), and progeny of same. The classes of plants that can be used in the disclosed methods are generally as broad as the classes of higher and lower plants amenable to transformation and breeding techniques, including angiosperms (monocotyledonous and dicotyledonous plants), gymnosperms, ferns, horsetails, psilophytes, lycophytes, bryophytes, and multicellular algae.
[0028] As used herein a "transgenic plant cell" means a plant cell that is transformed with stably-integrated, recombinant DNA, for example, by Agrobacterium-mediated transformation, by bombardment using microparticles coated with recombinant DNA, or by other means, such as site-directed integration. A plant cell of this disclosure can be an originally-transformed plant cell or a progeny plant cell that is regenerated into differentiated tissue, for example, into a transgenic plant with stably-integrated, recombinant DNA, or plant part, seed or pollen derived from a transgenic plant or a progeny plant thereof. As used herein, a "transgenic plant" and a "transgenic plant part" mean a plant or plant part, respectively, having in the genome of at least one cell of such plant or plant part a stably-integrated, recombinant DNA construct or sequence introduced using a transformation method.
[0029] As used herein a "control plant" means a plant that does not contain the recombinant DNA of the present disclosure that imparts an enhanced trait or altered phenotype. A control plant is used to identify and select a transgenic plant that has an enhanced trait or altered phenotype. A suitable control plant can be a non-transgenic plant of the parental line used to generate a transgenic plant, for example, a wild type plant devoid of a recombinant DNA. A suitable control plant can also be a transgenic plant that contains recombinant DNA that imparts other traits, for example, a transgenic plant having enhanced herbicide tolerance. A suitable control plant can in some cases be a progeny of a heterozygous or hemizygous transgenic plant line that does not contain the recombinant DNA, known as a negative segregant, or a negative isogenic line.
[0030] As used herein a "propagule" includes all products of meiosis and mitosis, including but not limited to, plant, seed and part of a plant able to propagate a new plant. Propagules include whole plants, cells, pollen, ovules, flowers, embryos, leaves, roots, stems, shoots, meristems, grains or seeds, or any plant part that is capable of growing into an entire plant. Propagule also includes graft where one portion of a plant is grafted to another portion of a different plant (even one of a different species) to create a living organism. Propagule also includes all plants and seeds produced by cloning or by bringing together meiotic products, or allowing meiotic products to come together to form an embryo or a fertilized egg (naturally or with human intervention).
[0031] As used herein a "progeny" includes any plant, seed, plant cell, and/or regenerable plant part comprising a recombinant DNA of the present disclosure derived from an ancestor plant. A progeny can be homozygous or heterozygous for the transgene. Progeny can be grown from seeds produced by a transgenic plant comprising a recombinant DNA of the present disclosure, and/or from seeds produced by a plant fertilized with pollen or ovule from a transgenic plant comprising a recombinant DNA of the present disclosure.
[0032] As used herein a "trait" is a physiological, morphological, biochemical, or physical characteristic of a plant or particular plant material or cell. In some instances, this characteristic is visible to the human eye and can be measured mechanically, such as seed or plant size, weight, shape, form, length, height, growth rate and development stage, or can be measured by biochemical techniques, such as detecting the protein, starch, certain metabolites, or oil content of seed or leaves, or by observation of a metabolic or physiological process, for example, by measuring tolerance to water deprivation or particular salt or sugar concentrations, or by the measurement of the expression level of a gene or genes, for example, by employing Northern analysis, RT-PCR, microarray gene expression assays, or reporter gene expression systems, or by agricultural observations such as hyperosmotic stress tolerance or yield. However, any technique can be used to measure the amount of, comparative level of, or difference in any selected chemical compound or macromolecule in the transgenic plants.
[0033] As used herein an "enhanced trait" means a characteristic of a transgenic plant as a result of stable integration and expression of a recombinant DNA in the transgenic plant. Such traits include, but are not limited to, an enhanced agronomic trait characterized by enhanced plant morphology, physiology, growth and development, yield, nutritional enhancement, disease or pest resistance, or environmental or chemical tolerance. In some specific aspects of this disclosure an enhanced trait is selected from the group consisting of decreased days from planting to maturity, increased stalk size, increased number of leaves, increased plant height growth rate in vegetative stage, increased ear size, increased ear dry weight per plant, increased number of kernels per ear, increased weight per kernel, increased number of kernels per plant, decreased ear void, extended grain fill period, reduced plant height, increased number of root branches, increased total root length, drought tolerance, increased water use efficiency, cold tolerance, increased nitrogen use efficiency, increased yield and altered phenotypes as shown in Tables 6-8 and 10-15. In another aspect of the disclosure the trait is increased yield under non-stress conditions or increased yield under environmental stress conditions. Stress conditions can include both biotic and abiotic stress, for example, drought, shade, fungal disease, viral disease, bacterial disease, insect infestation, nematode infestation, cold temperature exposure, heat exposure, osmotic stress, reduced nitrogen nutrient availability, reduced phosphorus nutrient availability and high plant density. 
"Yield" can be affected by many properties including without limitation, plant height, plant biomass, pod number, pod position on the plant, number of internodes, incidence of pod shatter, grain size, ear size, ear tip filling, kernel abortion, efficiency of nodulation and nitrogen fixation, efficiency of nutrient assimilation, resistance to biotic and abiotic stress, carbon assimilation, plant architecture, resistance to lodging, percent seed germination, seedling vigor, and juvenile traits. Yield can also be affected by efficiency of germination (including germination in stressed conditions), growth rate (including growth rate in stressed conditions), flowering time and duration, ear number, ear size, ear weight, seed number per ear or pod, seed size, composition of seed (starch, oil, protein) and characteristics of seed fill.
[0034] Also used herein, the term "trait modification" encompasses altering the naturally occurring trait by producing a detectable difference in a characteristic in a plant comprising a recombinant DNA of the present disclosure relative to a plant not comprising the recombinant DNA, such as a wild-type plant, or a negative segregant. In some cases, the trait modification can be evaluated quantitatively. For example, the trait modification can entail an increase or decrease in an observed trait characteristic or phenotype as compared to a control plant. It is known that there can be natural variations in a modified trait. Therefore, the trait modification observed entails a change of the normal distribution and magnitude of the trait characteristics or phenotype in the plants as compared to a control plant.
[0035] The present disclosure relates to a plant with improved economically important characteristics, more specifically increased yield. More specifically the present disclosure relates to a transgenic plant comprising a recombinant polynucleotide of this disclosure, wherein the plant has increased yield as compared to a control plant. Many plants of this disclosure exhibited increased yield or improved yield trait components as compared to a control plant. In an embodiment, a plant of the present disclosure exhibited an improved trait that is related to yield, including but not limited to increased nitrogen use efficiency, increased nitrogen stress tolerance, increased water use efficiency and increased drought tolerance, as defined and discussed infra.
[0036] Yield can be defined as the measurable produce of economic value from a crop. Yield can be defined in the scope of quantity and/or quality. Yield can be directly dependent on several factors, for example, the number and size of organs, plant architecture (such as the number of branches, plant biomass, etc.), flowering time and duration, grain fill period. Root architecture and development, photosynthetic efficiency, nutrient uptake, stress tolerance, early vigor, delayed senescence and functional stay green phenotypes can be important factors in determining yield. Optimizing the above-mentioned factors can therefore contribute to increasing crop yield.
[0037] Reference herein to an increase in yield-related traits can also be taken to mean an increase in biomass (weight) of one or more parts of a plant, which can include above ground and/or below ground (harvestable) plant parts. In particular, such harvestable parts are seeds, and performance of the methods of the disclosure results in plants with increased yield and in particular increased seed yield relative to the seed yield of suitable control plants. The term "yield" of a plant can relate to vegetative biomass (root and/or shoot biomass), to reproductive organs, and/or to propagules (such as seeds) of that plant.
[0038] Increased yield of a plant of the present disclosure can be measured in a number of ways, including test weight, seed number per plant, seed weight, seed number per unit area (for example, seeds, or weight of seeds, per acre), bushels per acre, tons per acre, or kilograms per hectare. For example, corn yield can be measured as production of shelled corn kernels per unit of production area, for example in bushels per acre or metric tons per hectare. This is often also reported on a moisture adjusted basis, for example at 15.5 percent moisture. Increased yield can result from improved utilization of key biochemical compounds, such as nitrogen, phosphorus and carbohydrate, or from improved responses to environmental stresses, such as cold, heat, drought, salt, shade, high plant density, and attack by pests or pathogens. This disclosure can also be used to provide plants with improved growth and development, and ultimately increased yield, as the result of modified expression of plant growth regulators or modification of cell cycle or photosynthesis pathways. Also of interest is the generation of plants that demonstrate increased yield with respect to a seed component that may or may not correspond to an increase in overall plant yield.
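As an illustration of reporting yield on a moisture adjusted basis, the following is a minimal sketch, not a procedure stated in this disclosure. It assumes the common dry-matter-conserving adjustment and a 56 lb bushel for shelled corn; the function names and the `lb_per_bushel` parameter are hypothetical.

```python
def moisture_adjusted_weight(harvest_weight, harvest_moisture_pct,
                             standard_moisture_pct=15.5):
    """Convert grain weight at harvest moisture to the weight the same
    dry matter would have at the standard moisture (assumed convention)."""
    for pct in (harvest_moisture_pct, standard_moisture_pct):
        if not 0.0 <= pct < 100.0:
            raise ValueError("moisture must be in [0, 100) percent")
    # Dry matter is conserved; only the water fraction changes.
    dry_matter = harvest_weight * (100.0 - harvest_moisture_pct) / 100.0
    return dry_matter / ((100.0 - standard_moisture_pct) / 100.0)


def bushels_per_acre(harvest_weight_lb, harvest_moisture_pct, area_acres,
                     lb_per_bushel=56.0):
    """Moisture-adjusted yield in bushels per acre.
    lb_per_bushel=56.0 is an assumed standard for shelled corn."""
    if area_acres <= 0:
        raise ValueError("area must be positive")
    adjusted = moisture_adjusted_weight(harvest_weight_lb, harvest_moisture_pct)
    return adjusted / lb_per_bushel / area_acres
```

Under these assumptions, 1000 lb of grain harvested at 20 percent moisture corresponds to roughly 946.7 lb at 15.5 percent moisture, so comparing a transgenic plot with a control plot on the adjusted basis removes differences that are due only to grain water content.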
[0039] In an embodiment, "alfalfa yield" can also be measured in forage yield, the amount of above ground biomass at harvest. Factors contributing to increased biomass include increased vegetative growth, branches, nodes and internodes, leaf area, and leaf area index.
[0040] In another embodiment, "canola yield" can also be measured in pod number, number of pods per plant, number of pods per node, number of internodes, incidence of pod shatter, seeds per silique, seed weight per silique, improved seed, oil, or protein composition.
[0041] Additionally, "corn or maize yield" can also be measured as production of shelled corn kernels per unit of production area, ears per acre, number of kernel rows per ear and number of kernels per row, kernel number or weight per ear, weight per kernel, ear number, ear weight, and fresh or dry ear biomass (weight).
[0042] In yet another embodiment, "cotton yield" can be measured as bolls per plant, size of bolls, fiber quality, seed cotton yield in g/plant, seed cotton yield in lb/acre, lint yield in lb/acre, and number of bales.
[0043] A specific embodiment for "rice yield" can also include panicles per hill, grain per hill, and filled grains per panicle.
[0044] A still further embodiment for "soybean yield" can also include pods per plant, pods per acre, seeds per plant, seeds per pod, weight per seed, weight per pod, pods per node, number of nodes, and the number of internodes per plant.
[0045] In a still further embodiment, "sugarcane yield" can be measured as cane yield (tons per acre; kg/hectare), total recoverable sugar (pounds per ton), and sugar yield (tons/acre).
[0046] In a yet still further embodiment, "wheat yield" can include: cereal per unit area, grain number, grain weight, grain size, grains per head, seeds per head, seeds per plant, heads per acre, number of viable tillers per plant, composition of seed (for example, carbohydrates, starch, oil, and protein) and characteristics of seed fill.
[0047] The terms "yield" and "seed yield" are defined above for a number of core crops. The terms "increased", "improved", and "enhanced" are interchangeable and are defined herein.
[0048] In another embodiment, the present disclosure provides a method for the production of plants having an altered phenotype, enhanced trait, or increased yield; performance of the method gives plants having an altered phenotype, enhanced trait, or increased yield.
[0049] "Increased yield" can manifest as one or more of the following: (i) increased plant biomass (weight) of one or more parts of a plant, particularly aboveground (harvestable) parts, of a plant, increased root biomass (increased number of roots, increased root thickness, increased root length) or increased biomass of any other harvestable part; or (ii) increased early vigor, defined herein as an improved seedling aboveground area approximately three weeks post-germination. "Early vigor" refers to active healthy plant growth especially during early stages of plant growth, and can result from increased plant fitness due to, for example, the plants being better adapted to their environment (for example, optimizing the use of energy resources, uptake of nutrients and partitioning carbon allocation between shoot and root). Early vigor in corn, for example, is a combination of the ability of corn seeds to germinate and emerge after planting and the ability of the young corn plants to grow and develop after emergence. Plants having early vigor also show increased seedling survival and better establishment of the crop, which often results in highly uniform fields with the majority of the plants reaching the various stages of development at substantially the same time, which often results in increased yield. Therefore early vigor can be determined by measuring various factors, such as kernel weight, percentage germination, percentage emergence, seedling growth, seedling height, root length, root and shoot biomass, canopy size and color and others.
[0050] Further, increased yield can also manifest as (iii) increased total seed yield, which may result from one or more of an increase in seed biomass (seed weight) due to an increase in the seed weight on a per plant and/or on an individual seed basis; an increased number of panicles per plant; an increased number of pods; an increased number of nodes; an increased number of flowers ("florets") per panicle/plant; increased seed fill rate; an increased number of filled seeds; increased seed size (length, width, area, perimeter), which can also influence the composition of seeds; and/or increased seed volume, which can also influence the composition of seeds. In one embodiment, increased yield can be increased seed yield, and is selected from one or more of the following: (i) increased seed weight; (ii) increased number of filled seeds; and (iii) increased harvest index.
[0051] Increased yield can also (iv) result in modified architecture, or can occur because of modified plant architecture.
[0052] Increased yield can also manifest as (v) increased harvest index, which is expressed as a ratio of the yield of harvestable parts, such as seeds, over the total biomass.
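The harvest index ratio described in paragraph [0052] can be sketched as follows. This is an illustrative sketch only; the function name, the argument names, and the assumption that both weights are measured in the same units and on the same moisture basis are not part of the disclosure.

```python
def harvest_index(harvestable_weight, total_biomass):
    """Ratio of the weight of harvestable parts (for example, seeds)
    to total plant biomass. Both inputs are assumed to share units."""
    if total_biomass <= 0:
        raise ValueError("total biomass must be positive")
    if not 0 <= harvestable_weight <= total_biomass:
        raise ValueError("harvestable weight must lie within total biomass")
    return harvestable_weight / total_biomass
```

For example, a plant with 50 g of seed and 100 g of total biomass has a harvest index of 0.5 under this definition.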
[0053] Increased yield can also manifest as (vi) increased kernel weight, which is extrapolated from the number of filled seeds counted and their total weight. An increased kernel weight can result from an increased seed size and/or seed weight, an increase in embryo size, increased endosperm size, aleurone and/or scutellum, or an increase with respect to other parts of the seed that result in increased kernel weight.
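As a hedged illustration of deriving kernel weight from the number of filled seeds counted and their total weight, the sketch below computes a mean weight per kernel and scales it to a thousand-kernel weight. The thousand-kernel scaling is an assumed reporting convention, and all names are hypothetical.

```python
def mean_kernel_weight(total_weight_g, filled_seed_count):
    """Mean weight per filled seed, in grams."""
    if filled_seed_count <= 0:
        raise ValueError("at least one filled seed must be counted")
    if total_weight_g < 0:
        raise ValueError("weight cannot be negative")
    return total_weight_g / filled_seed_count


def thousand_kernel_weight(total_weight_g, filled_seed_count):
    """Extrapolate the counted sample to the weight of 1000 kernels
    (an assumed convention, not stated in the disclosure)."""
    return 1000.0 * mean_kernel_weight(total_weight_g, filled_seed_count)
```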
[0054] Increased yield can also manifest as (vii) increased ear biomass, which is the weight of the ear and can be represented on a per ear, per plant or per plot basis.
[0055] The disclosure also extends to harvestable parts of a plant such as, but not limited to, seeds, leaves, fruits, flowers, bolls, pods, siliques, nuts, stems, rhizomes, tubers and bulbs. The disclosure furthermore relates to products derived from a harvestable part of such a plant, such as dry pellets, powders, oil, fat and fatty acids, starch or proteins.
[0056] The present disclosure provides a method for increasing "yield" of a plant or "broad acre yield" of a plant or plant part defined as the harvestable plant parts per unit area, for example seeds, or weight of seeds, per acre, pounds per acre, bushels per acre, tonnes per acre, tons per acre, kilograms per hectare.
[0057] This disclosure further provides a method of altering phenotype, enhancing trait, or increasing yield in a plant by producing a plant comprising a polynucleic acid sequence of this disclosure where the plant can be crossed with itself, a second plant from the same plant line, a wild type plant, or a plant from a different line of plants to produce a seed. The seed of the resultant plant can be harvested from fertile plants and be used to grow progeny generations of plant(s) of this disclosure. In addition to direct transformation of a plant with a recombinant DNA construct, transgenic plants can be prepared by crossing a first plant having a stably integrated recombinant DNA construct with a second plant lacking the DNA. For example, recombinant DNA can be introduced into a first plant line that is amenable to transformation to produce a transgenic plant which can be crossed with a second plant line to introgress the recombinant DNA into the second plant line.
[0058] Selected transgenic plants transformed with a recombinant DNA construct and having the polynucleotide of this disclosure provide the altered phenotype, enhanced trait, or increased yield compared to a control plant. Use of genetic markers associated with the recombinant DNA can facilitate production of transgenic progeny that is homozygous for the desired recombinant DNA. Progeny plants carrying DNA for both parental traits can be back-crossed into a parent line multiple times, for example, usually 6 to 8 generations, to produce a progeny plant with substantially the same genotype as the recurrent original transgenic parental line but having the recombinant DNA of the other transgenic parental line. The term "progeny" denotes the offspring of any generation of a parent plant prepared by the methods of this disclosure containing the recombinant polynucleotides as described herein.
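To illustrate why repeated back-crossing yields progeny with substantially the same genotype as the recurrent parent, the sketch below uses the standard expectation that each back-cross halves the remaining donor genome. It assumes no selection and no linkage drag around the introgressed DNA, which a real breeding program would need to account for; the function name is hypothetical.

```python
def expected_recurrent_parent_fraction(backcrosses):
    """Expected fraction of the genome derived from the recurrent parent
    after an initial cross followed by the given number of back-crosses.
    Assumes unlinked loci and no selection (simplifying assumptions)."""
    if backcrosses < 0:
        raise ValueError("number of back-crosses cannot be negative")
    # The initial cross gives 1/2; each back-cross halves the donor share.
    return 1.0 - 0.5 ** (backcrosses + 1)


if __name__ == "__main__":
    for n in (1, 6, 8):
        print(n, expected_recurrent_parent_fraction(n))
```

Under these assumptions, six back-crosses give an expected recurrent-parent fraction of 1 - 1/128, or about 99.2 percent.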
[0059] As used herein "nitrogen use efficiency" refers to the processes which lead to an increase in the plant's yield, biomass, vigor, and growth rate per nitrogen unit applied. The processes can include the uptake, assimilation, accumulation, signaling, sensing, retranslocation (within the plant) and use of nitrogen by the plant.
[0060] As used herein "nitrogen limiting conditions" refers to growth conditions or environments that provide less than optimal amounts of nitrogen needed for adequate or successful plant metabolism, growth, reproductive success and/or viability.
[0061] As used herein "increased nitrogen stress tolerance" refers to the ability of plants to grow, develop, or yield normally, or grow, develop, or yield faster or better when subjected to less than optimal amounts of available/applied nitrogen, or under nitrogen limiting conditions.
[0062] As used herein "increased nitrogen use efficiency" refers to the ability of plants to grow, develop, or yield faster or better than normal when subjected to the same amount of available/applied nitrogen as under normal or standard conditions; ability of plants to grow, develop, or yield normally, or grow, develop, or yield faster or better when subjected to less than optimal amounts of available/applied nitrogen, or under nitrogen limiting conditions.
[0063] Increased plant nitrogen use efficiency can be translated in the field into either harvesting similar quantities of yield, while supplying less nitrogen, or increased yield gained by supplying optimal/sufficient amounts of nitrogen. The increased nitrogen use efficiency can improve plant nitrogen stress tolerance, and can also improve crop quality and biochemical constituents of the seed such as protein yield and oil yield. The terms "increased nitrogen use efficiency", "enhanced nitrogen use efficiency", and "nitrogen stress tolerance" are used interchangeably in the present disclosure to refer to plants with improved productivity under nitrogen limiting conditions.
[0064] As used herein "water use efficiency" refers to the amount of carbon dioxide assimilated by leaves per unit of water vapor transpired. It constitutes one of the most important traits controlling plant productivity in dry environments. "Drought tolerance" refers to the degree to which a plant is adapted to arid or drought conditions. The physiological responses of plants to a deficit of water include leaf wilting, a reduction in leaf area, leaf abscission, and the stimulation of root growth by directing nutrients to the underground parts of the plants. Plants are more susceptible to drought during flowering and seed development (the reproductive stages), as the plant's resources are diverted to support root growth. In addition, abscisic acid (ABA), a plant stress hormone, induces the closure of leaf stomata (microscopic pores involved in gas exchange), thereby reducing water loss through transpiration, and decreasing the rate of photosynthesis. These responses improve the water-use efficiency of the plant in the short term. The terms "increased water use efficiency", "enhanced water use efficiency", and "increased drought tolerance" are used interchangeably in the present disclosure to refer to plants with improved productivity under water-limiting conditions.
[0065] As used herein "increased water use efficiency" refers to the ability of plants to grow, develop, or yield faster or better than normal when subjected to the same amount of available/applied water as under normal or standard conditions; ability of plants to grow, develop, or yield normally, or grow, develop, or yield faster or better when subjected to reduced amounts of available/applied water (water input) or under conditions of water stress or water deficit stress.
[0066] As used herein "increased drought tolerance" refers to the ability of plants to grow, develop, or yield normally, or grow, develop, or yield faster or better than normal when subjected to reduced amounts of available/applied water and/or under conditions of acute or chronic drought; ability of plants to grow, develop, or yield normally when subjected to reduced amounts of available/applied water (water input) or under conditions of water deficit stress or under conditions of acute or chronic drought.
[0067] As used herein "drought stress" refers to a period of dryness (acute or chronic/prolonged) that results in water deficit and subjects plants to stress and/or damage to plant tissues and/or negatively affects grain/crop yield; a period of dryness (acute or chronic/prolonged) that results in water deficit and/or higher temperatures and subjects plants to stress and/or damage to plant tissues and/or negatively affects grain/crop yield.
[0068] As used herein "water deficit" refers to the conditions or environments that provide less than optimal amounts of water needed for adequate/successful growth and development of plants.
[0069] As used herein "water stress" refers to the conditions or environments that provide improper (either less/insufficient or more/excessive) amounts of water than that needed for adequate/successful growth and development of plants/crops thereby subjecting the plants to stress and/or damage to plant tissues and/or negatively affecting grain/crop yield.
[0070] As used herein "water deficit stress" refers to the conditions or environments that provide less/insufficient amounts of water than that needed for adequate/successful growth and development of plants/crops thereby subjecting the plants to stress and/or damage to plant tissues and/or negatively affecting grain yield.
[0071] As used herein a "polynucleotide" is a nucleic acid molecule comprising a plurality of polymerized nucleotides. A polynucleotide may be referred to as a nucleic acid, an oligonucleotide, or any fragment thereof. In many instances, a polynucleotide encodes a polypeptide (or protein) or a domain or a fragment thereof. Additionally, a polynucleotide can comprise a promoter, an intron, an enhancer region, a polyadenylation site, a translation initiation site, 5' or 3' untranslated regions, a reporter gene, a selectable marker, a scorable marker, or the like. A polynucleotide can be single-stranded or double-stranded DNA or RNA. A polynucleotide optionally comprises modified bases or a modified backbone. A polynucleotide can be, for example, genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a cloned DNA, a synthetic DNA or RNA, or the like. A polynucleotide can be combined with carbohydrate(s), lipid(s), protein(s), or other materials to perform a particular activity such as transformation or form a composition such as a peptide nucleic acid (PNA). A polynucleotide can comprise a sequence in either sense or antisense orientations. "Oligonucleotide" is substantially equivalent to the terms amplimer, primer, oligomer, element, target, and probe and is preferably single-stranded.
[0072] As used herein a "recombinant polynucleotide" or "recombinant DNA" is a polynucleotide that is not in its native state, for example, a polynucleotide comprises a series of nucleotides (represented as a nucleotide sequence) not found in nature, or a polynucleotide is in a context other than that in which it is naturally found; for example, separated from polynucleotides with which it typically is in proximity in nature, or adjacent (or contiguous with) polynucleotides with which it typically is not in proximity. The "recombinant polynucleotide" or "recombinant DNA" refers to polynucleotide or DNA which has been genetically engineered and constructed outside of a cell including DNA containing naturally occurring DNA or cDNA or synthetic DNA. For example, the polynucleotide at issue can be cloned into a vector, or otherwise recombined with one or more additional nucleic acids.
[0073] As used herein a "polypeptide" comprises a plurality of consecutive polymerized amino acid residues, for example, at least about 15 consecutive polymerized amino acid residues. In many instances, a polypeptide comprises a series of polymerized amino acid residues that is a transcriptional regulator or a domain or portion or fragment thereof. Additionally, the polypeptide can comprise: (i) a localization domain; (ii) an activation domain; (iii) a repression domain; (iv) an oligomerization domain; (v) a protein-protein interaction domain; (vi) a DNA-binding domain; or the like. The polypeptide optionally comprises modified amino acid residues, naturally occurring amino acid residues not encoded by a codon, or non-naturally occurring amino acid residues.
[0074] As used herein "protein" refers to a series of amino acids, oligopeptide, peptide, polypeptide or portions thereof whether naturally occurring or synthetic.
[0075] As used herein a "recombinant polypeptide" is a polypeptide produced by translation of a recombinant polynucleotide.
[0076] A "synthetic polypeptide" is a polypeptide created by consecutive polymerization of isolated amino acid residues using methods known in the art.
[0077] An "isolated polypeptide", whether a naturally occurring or a recombinant polypeptide, is more enriched in (or out of) a cell than the polypeptide in its natural state in a wild-type cell, for example, more than about 5% enriched, more than about 10% enriched, or more than about 20%, or more than about 50%, or more, enriched, for example, alternatively denoted: 105%, 110%, 120%, 150% or more, enriched relative to wild type standardized at 100%. Such enrichment is not the result of a natural response of a wild-type plant. Alternatively, or additionally, the isolated polypeptide is separated from other cellular components, with which it is typically associated, for example, by any of the various protein purification methods.
[0078] As used herein, a "functional fragment" refers to a portion of a polypeptide provided herein which retains full or partial molecular, physiological or biochemical function of the full length polypeptide. A functional fragment often contains the domain(s), such as Pfam domains (see below), identified in the polypeptide provided in the sequence listing. In certain embodiments, fragments of any of SEQ ID NOs: 1-9 are provided comprising at least about 50, at least about 75, at least about 95, at least about 100, at least about 125, at least about 150, at least about 175, at least about 200, at least about 225, at least about 250, at least about 275, at least about 300, at least about 500, at least about 600, at least about 700, at least about 750, at least about 800, at least about 900, or at least about 1000 contiguous nucleotides, or at least about 1250 contiguous nucleotides, or at least about 1500 contiguous nucleotides, or at least about 1750 contiguous nucleotides, or at least about 2000 contiguous nucleotides, or at least about 2250 contiguous nucleotides, or at least about 2500 contiguous nucleotides, or at least about 2750 contiguous nucleotides, or longer, of any of SEQ ID NOs: 1-9, and having activity as disclosed herein. Further provided are fragments of any of SEQ ID NOs: 10-18 and 30-39 comprising at least about 50, at least about 75, at least about 95, at least about 100, at least about 125, at least about 150, at least about 175, at least about 200, at least about 225, at least about 250, at least about 275, at least about 300, at least about 500, at least about 600, at least about 700, at least about 750, at least about 800, at least about 900, or at least about 1000 contiguous amino acids, or longer, of any of SEQ ID NOs: 10-18 and 30-39, and having activity as disclosed herein.
[0079] A "recombinant DNA construct" as used in the present disclosure comprises at least one expression cassette having a promoter operable in plant cells and a polynucleotide of the present disclosure. DNA constructs can be used as a means of delivering recombinant DNA constructs to a plant cell in order to effect stable integration of the recombinant molecule into the plant cell genome. In one embodiment, the polynucleotide can encode a protein or variant of a protein or fragment of a protein that is functionally defined to maintain activity in transgenic host cells including plant cells, plant parts, explants and whole plants. In another embodiment, the polynucleotide can encode a non-coding RNA that interferes with the functioning of endogenous classes of small RNAs that regulate expression, including but not limited to taRNAs, siRNAs and miRNAs. Recombinant DNA constructs are assembled using methods known to persons of ordinary skill in the art and typically comprise a promoter operably linked to DNA, the expression of which provides the enhanced agronomic trait.
[0080] Other construct components can include additional regulatory elements, such as 5' leaders and introns for enhancing transcription, 3' untranslated regions (such as polyadenylation signals and sites), and DNA for transit or targeting or signal peptides.
[0081] As used herein, a "homolog" or "homologue" means a protein in a group of proteins that perform the same biological function, for example, proteins that belong to the same Pfam protein family and that provide a common enhanced trait in transgenic plants of this disclosure. Homologs are expressed by homologous genes. With reference to homologous genes, homologs include orthologs, for example, genes expressed in different species that evolved from common ancestral genes by speciation and encode proteins that retain the same function, but do not include paralogs, i.e., genes that are related by duplication but have evolved to encode proteins with different functions. Homologous genes include naturally occurring alleles and artificially-created variants.
[0082] Degeneracy of the genetic code provides the possibility to substitute at least one base of the protein encoding sequence of a gene with a different base without causing the amino acid sequence of the polypeptide produced from the gene to be changed. When optimally aligned, homolog proteins, or their corresponding nucleotide sequences, have typically at least about 60% identity, in some instances at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or even at least about 99.5% identity over the full length of a protein or its corresponding nucleotide sequence identified as being associated with imparting an enhanced trait or altered phenotype when expressed in plant cells. In one aspect of the disclosure homolog proteins have at least about 80%, at least about 85%, at least about 90%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 99.5% identity to a consensus amino acid sequence of proteins and homologs that can be built from sequences disclosed herein.
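As a hedged illustration of how a percent identity value of the kind recited above might be computed, the sketch below scores two sequences that have already been optimally aligned by an external tool. It assumes one particular convention, identical aligned positions divided by the full length of the reference sequence; other conventions exist, and the function names and the `-` gap character are assumptions.

```python
def percent_identity(aligned_reference, aligned_query, gap="-"):
    """Percent identity of a pre-aligned pair, measured over the full
    (ungapped) length of the reference sequence (assumed convention)."""
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    reference_length = sum(1 for r in aligned_reference if r != gap)
    if reference_length == 0:
        raise ValueError("reference sequence is empty")
    # Count columns where the reference residue is matched exactly.
    matches = sum(
        1
        for r, q in zip(aligned_reference, aligned_query)
        if r != gap and r == q
    )
    return 100.0 * matches / reference_length


def meets_identity_threshold(aligned_reference, aligned_query, threshold=90.0):
    """True when the pair reaches the given percent identity threshold."""
    return percent_identity(aligned_reference, aligned_query) >= threshold
```

For the hypothetical aligned pair `MKTAYIAKQR` and `MKT-YLAKQR`, eight of the ten reference residues are matched, giving 80 percent identity under this convention.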
[0083] Homologs are inferred from sequence similarity, by comparison of protein sequences, for example, manually or by use of a computer-based tool using known sequence comparison algorithms such as BLAST and FASTA. A sequence search and local alignment program, for example, BLAST, can be used to search query protein sequences of a base organism against a database of protein sequences of various organisms, to find similar sequences, and the summary Expectation value (E-value) can be used to measure the level of sequence similarity. Because a protein hit with the lowest E-value for a particular organism may not necessarily be an ortholog or be the only ortholog, a reciprocal query is used to filter hit sequences with significant E-values for ortholog identification. The reciprocal query entails search of the significant hits against a database of protein sequences of the base organism. A hit can be identified as an ortholog, when the reciprocal query's best hit is the query protein itself or a paralog of the query protein. With the reciprocal query process orthologs are further differentiated from paralogs among all the homologs, which allows for the inference of functional equivalence of genes. A further aspect of the homologs encoded by DNA useful in the transgenic plants of the invention are those proteins that differ from a disclosed protein as the result of deletion or insertion of one or more amino acids in a native sequence.
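The reciprocal query process described in paragraph [0083] can be sketched as below. This is a minimal sketch operating on hit lists that are assumed to have been produced beforehand by a search program such as BLAST; the data layout, the function names, the protein identifiers, and the 1e-5 E-value cutoff are all illustrative assumptions.

```python
def best_hit(hits):
    """Return the subject identifier with the lowest E-value, or None.
    hits is a list of (subject_id, e_value) tuples."""
    if not hits:
        return None
    return min(hits, key=lambda hit: hit[1])[0]


def find_orthologs(query_id, forward_hits, reciprocal_hits,
                   paralogs=frozenset(), evalue_cutoff=1e-5):
    """Keep forward hits whose reciprocal best hit is the query protein
    itself or one of its known paralogs.

    forward_hits: list of (subject_id, e_value) for the query protein.
    reciprocal_hits: dict mapping subject_id to its hits against the
        base organism, as (base_protein_id, e_value) tuples.
    """
    accepted = {query_id} | set(paralogs)
    orthologs = []
    for subject_id, e_value in forward_hits:
        if e_value > evalue_cutoff:
            continue  # not a significant hit under the assumed cutoff
        if best_hit(reciprocal_hits.get(subject_id, [])) in accepted:
            orthologs.append(subject_id)
    return orthologs


if __name__ == "__main__":
    forward = [("riceA", 1e-80), ("riceB", 1e-40), ("riceC", 0.5)]
    reciprocal = {
        "riceA": [("cornX", 1e-78), ("cornY", 1e-30)],
        "riceB": [("cornY", 1e-50), ("cornX", 1e-35)],
    }
    print(find_orthologs("cornX", forward, reciprocal))
```

In this made-up example only `riceA` is retained for query `cornX`; `riceB` would also be retained if `cornY` were supplied as a known paralog of the query.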
[0084] Other functional homolog proteins differ in one or more amino acids from those of a trait-improving protein disclosed herein as the result of one or more of known conservative amino acid substitutions, for example, valine is a conservative substitute for alanine and threonine is a conservative substitute for serine. Conservative substitutions for an amino acid within the native sequence can be selected from other members of a class to which the naturally occurring amino acid belongs. Representative amino acids within these various classes include, but are not limited to: (1) acidic (negatively charged) amino acids such as aspartic acid and glutamic acid; (2) basic (positively charged) amino acids such as arginine, histidine, and lysine; (3) neutral polar amino acids such as glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; and (4) neutral nonpolar (hydrophobic) amino acids such as alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine. Conserved substitutes for an amino acid within a native protein or polypeptide can be selected from other members of the group to which the naturally occurring amino acid belongs. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. Naturally conservative amino acid substitution groups are: valine-leucine, valine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, aspartic acid-glutamic acid, and asparagine-glutamine. A further aspect of the disclosure includes proteins that differ in one or more amino acids from those of a described protein sequence as the result of deletion or insertion of one or more amino acids in a native sequence.
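As a non-limiting illustration, the four representative amino acid classes listed above can be encoded as a lookup so that each substitution between a native sequence and a variant is labeled conservative or not. This is a minimal sketch that assumes two ungapped sequences of equal length in one-letter code. The function names and example sequences are hypothetical, and the class membership simply follows the four classes recited in the paragraph.

```python
from typing import List, Tuple

# Classes (1)-(4) from the paragraph above, in one-letter amino acid code.
AMINO_ACID_CLASSES = {
    "acidic": set("DE"),
    "basic": set("RHK"),
    "neutral_polar": set("GSTCYNQ"),
    "neutral_nonpolar": set("ALIVPFWM"),
}


def residue_class(residue: str) -> str:
    """Return the name of the class that contains the residue."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unrecognized amino acid: {residue!r}")


def is_conservative(native: str, substitute: str) -> bool:
    """A substitution is treated as conservative when both residues share a class."""
    return native != substitute and residue_class(native) == residue_class(substitute)


def classify_substitutions(native_seq: str, variant_seq: str) -> List[Tuple[int, str, str, bool]]:
    """List (1-based position, native, variant, conservative?) for each difference.
    Assumes equal-length, ungapped sequences; insertions and deletions are out of scope."""
    if len(native_seq) != len(variant_seq):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, a, b, is_conservative(a, b))
        for i, (a, b) in enumerate(zip(native_seq, variant_seq))
        if a != b
    ]


# Hypothetical sequences: A->V and S->T stay within a class, K->E does not.
for record in classify_substitutions("MASK", "MVTE"):
    print(record)
```

The alternative side-chain groupings recited in the paragraph (aliphatic, aromatic, and so on) could be substituted for the dictionary above without changing the rest of the sketch.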
[0085] In general, the term "variant" refers to molecules with some differences, generated synthetically or naturally, in their nucleotide or amino acid sequences as compared to reference (native) polynucleotides or polypeptides, respectively. These differences include substitutions, insertions, deletions or any desired combinations of such changes in a native polynucleotide or amino acid sequence.
[0086] With regard to polynucleotide variants, differences between presently disclosed polynucleotides and polynucleotide variants are limited so that the nucleotide sequences of the former and the latter are similar overall and, in many regions, identical. Due to the degeneracy of the genetic code, differences between the former and the latter nucleotide sequences may be silent (for example, the amino acids encoded by the polynucleotide are the same, and the variant polynucleotide sequence encodes the same amino acid sequence as the presently disclosed polynucleotide). Variant nucleotide sequences can encode different amino acid sequences, in which case such nucleotide differences will result in amino acid substitutions, additions, deletions, insertions, truncations or fusions with respect to the similarly disclosed polynucleotide sequences. These variations can result in polynucleotide variants encoding polypeptides that share at least one functional characteristic. The degeneracy of the genetic code also dictates that many different variant polynucleotides can encode identical and/or substantially similar polypeptides.
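By way of example only, the notion of a silent polynucleotide variant can be checked by translating both sequences with the standard genetic code and comparing the encoded amino acid sequences. The sketch below assumes in-frame coding sequences with no introns. The function names and example sequences are hypothetical and are not sequences of this disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate an in-frame DNA or RNA coding sequence into one-letter amino acids."""
    seq = coding_sequence.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_silent_variant(reference: str, variant: str) -> bool:
    """True when the nucleotide sequences differ but encode the same amino acids."""
    return reference.upper() != variant.upper() and translate(reference) == translate(variant)


reference = "ATGGCTTCTAAA"   # hypothetical reference coding sequence
synonymous = "ATGGCCAGCAAG"  # differs at several bases, same encoded peptide
missense = "ATGGTTTCTAAA"    # second codon changed from GCT to GTT

print(translate(reference), translate(synonymous), translate(missense))
print(is_silent_variant(reference, synonymous))
print(is_silent_variant(reference, missense))
```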
[0087] As used herein "gene" or "gene sequence" refers to the partial or complete coding sequence of a gene, its complement, and its 5' and/or 3' untranslated regions (UTRs) and their complements. A gene is also a functional unit of inheritance, and in physical terms is a particular segment or sequence of nucleotides along a molecule of DNA (or RNA, in the case of RNA viruses) involved in producing a polypeptide chain. The latter can be subjected to subsequent processing such as chemical modification or folding to obtain a functional protein or polypeptide. By way of example, a transcriptional regulator gene encodes a transcriptional regulator polypeptide, which can be functional or require processing to function as an initiator of transcription.
[0088] As used herein, the term "promoter" refers generally to a DNA molecule that is involved in recognition and binding of RNA polymerase II and other proteins (trans-acting transcription factors) to initiate transcription. A promoter can be initially isolated from the 5' untranslated region (5' UTR) of a genomic copy of a gene. Alternately, promoters can be synthetically produced or manipulated DNA molecules. Promoters can also be chimeric, that is a promoter produced through the fusion of two or more heterologous DNA molecules. Plant promoters include promoter DNA obtained from plants, plant viruses, fungi and bacteria such as Agrobacterium and Bradyrhizobium bacteria.
[0089] Promoters which initiate transcription in all or most tissues of the plant are referred to as "constitutive" promoters. Promoters which initiate transcription during certain periods or stages of development are referred to as "developmental" promoters. Promoters whose expression is enhanced in certain tissues of the plant relative to other plant tissues are referred to as "tissue enhanced" or "tissue preferred" promoters. Promoters which express within a specific tissue of the plant, with little or no expression in other plant tissues are referred to as "tissue specific" promoters. For example, a "seed enhanced" or "seed preferred" promoter drives enhanced or higher expression levels of an associated transgene or transcribable nucleotide sequence (i.e., operably linked to the promoter) in seed tissues relative to other tissues of the plant, whereas a "seed specific" promoter would drive expression of an associated transgene or transcribable nucleotide sequence (i.e., operably linked to the promoter) in seed tissues with little or no expression in other tissues of the plant. Other types of tissue specific or tissue preferred promoters for other tissue types, such as roots, meristem, leaf, etc., may also be described in this way. A promoter that expresses in a certain cell type of the plant, for example a microspore mother cell, is referred to as a "cell type specific" promoter. An "inducible" promoter is a promoter in which transcription is initiated in response to an environmental stimulus such as cold, drought or light; or other stimuli such as wounding or chemical application. Many physiological and biochemical processes in plants exhibit endogenous rhythms with a period of about 24 hours. A "diurnal promoter" is a promoter which exhibits altered expression profiles under the control of a circadian oscillator. Diurnal regulation is subject to environmental inputs such as light and temperature and coordination by the circadian clock.
[0090] Examples of seed preferred or seed specific promoters include promoters from genes expressed in seed tissues, such as napin as disclosed in U.S. Pat. No. 5,420,034, maize L3 oleosin as disclosed in U.S. Pat. No. 6,433,252, zein Z27 as disclosed by Russell et al. (1997) Transgenic Res. 6(2):157-166, globulin 1 as disclosed by Belanger et al. (1991) Genetics 129:863-872, glutelin 1 as disclosed by Russell (1997) supra, and peroxiredoxin antioxidant (Per1) as disclosed by Stacy et al. (1996) Plant Mol. Biol. 31(6):1205-1216. The contents and disclosures of each of the above references are incorporated herein by reference. Examples of meristem preferred or meristem specific promoters are provided, for example, in International Application No. PCT/US2017/057202, the contents and disclosure of which are incorporated herein by reference.
[0091] Many examples of constitutive promoters that may be used in plants are known in the art, such as a cauliflower mosaic virus (CaMV) 35S and 19S promoter (see, e.g., U.S. Pat. No. 5,352,605), an enhanced CaMV 35S promoter, such as a CaMV 35S promoter with Omega region (see, e.g., Holtorf, S. et al., Plant Molecular Biology, 29: 637-646 (1995)) or a dual enhanced CaMV promoter (see, e.g., U.S. Pat. No. 5,322,938), a Figwort Mosaic Virus (FMV) 35S promoter (see, e.g., U.S. Pat. No. 6,372,211), a Mirabilis Mosaic Virus (MMV) promoter (see, e.g., U.S. Pat. No. 6,420,547), a Peanut Chlorotic Streak Caulimovirus promoter (see, e.g., U.S. Pat. No. 5,850,019), a nopaline or octopine promoter, a ubiquitin promoter, such as a soybean polyubiquitin promoter (see, e.g., U.S. Pat. No. 7,393,948), an Arabidopsis S-Adenosylmethionine synthetase promoter (see, e.g., U.S. Pat. No. 8,809,628), etc., or any functional portion of the foregoing promoters. The contents and disclosures of each of the above references are incorporated herein by reference.
[0092] Examples of constitutive promoters that may be used in monocot plants, such as cereal or corn plants, include, for example, various actin gene promoters, such as a rice Actin 1 promoter (see, e.g., U.S. Pat. No. 5,641,876; see also SEQ ID NO: 75 or SEQ ID NO: 76) and a rice Actin 2 promoter (see, e.g., U.S. Pat. No. 6,429,357; see also, e.g., SEQ ID NO: 77 or SEQ ID NO: 78), a CaMV 35S or 19S promoter (see, e.g., U.S. Pat. No. 5,352,605; see also, e.g., SEQ ID NO: 79 for CaMV 35S), a maize ubiquitin promoter (see, e.g., U.S. Pat. No. 5,510,474), a Coix lacryma-jobi polyubiquitin promoter (see, e.g., SEQ ID NO: 80), a rice or maize Gos2 promoter (see, e.g., Pater et al., The Plant Journal, 2(6): 837-44 (1992); see also, e.g., SEQ ID NO: 81 for the rice Gos2 promoter), a FMV 35S promoter (see, e.g., U.S. Pat. No. 6,372,211), a dual enhanced CaMV promoter (see, e.g., U.S. Pat. No. 5,322,938), a MMV promoter (see, e.g., U.S. Pat. No. 6,420,547; see also, e.g., SEQ ID NO: 82), a PCLSV promoter (see, e.g., U.S. Pat. No. 5,850,019; see also, e.g., SEQ ID NO: 83), an Emu promoter (see, e.g., Last et al., Theor. Appl. Genet. 81:581 (1991); and McElroy et al., Mol. Gen. Genet. 231:150 (1991)), a tubulin promoter from maize, rice or other species, a nopaline synthase (nos) promoter, an octopine synthase (ocs) promoter, a mannopine synthase (mas) promoter, or a plant alcohol dehydrogenase (e.g., maize Adh1) promoter, any other promoters including viral promoters known or later-identified in the art to provide constitutive expression in a cereal or corn plant, any other constitutive promoters known in the art that may be used in monocot or cereal plants, and any functional sequence portion or truncation of any of the foregoing promoters. The contents and disclosures of each of the above references are incorporated herein by reference.
[0093] As used herein, the term "leader" refers to a DNA molecule isolated from the untranslated 5' region (5' UTR) of a genomic copy of a gene and is defined generally as a nucleotide segment between the transcription start site (TSS) and the protein coding sequence start site. Alternately, leaders can be synthetically produced or manipulated DNA elements. A leader can be used as a 5' regulatory element for modulating expression of an operably linked transcribable polynucleotide molecule. As used herein, the term "intron" refers to a DNA molecule that can be isolated or identified from the genomic copy of a gene and can be defined generally as a region spliced out during mRNA processing prior to translation. Alternately, an intron can be a synthetically produced or manipulated DNA element. An intron can contain enhancer elements that affect the transcription of operably linked genes. An intron can be used as a regulatory element for modulating expression of an operably linked transcribable polynucleotide molecule. A DNA construct can comprise an intron, and the intron may or may not be heterologous with respect to the transcribable polynucleotide molecule.
[0094] As used herein, the term "enhancer" or "enhancer element" refers to a cis-acting transcriptional regulatory element, a.k.a. cis-element, which confers an aspect of the overall expression pattern, but is usually insufficient alone to drive transcription, of an operably linked polynucleotide. Unlike promoters, enhancer elements do not usually include a transcription start site (TSS) or TATA box or equivalent sequence. A promoter can naturally comprise one or more enhancer elements that affect the transcription of an operably linked polynucleotide. An isolated enhancer element can also be fused to a promoter to produce a chimeric promoter cis-element, which confers an aspect of the overall modulation of gene expression. A promoter or promoter fragment can comprise one or more enhancer elements that affect the transcription of operably linked genes. Many promoter enhancer elements are believed to bind DNA-binding proteins and/or affect DNA topology, producing local conformations that selectively allow or restrict access of RNA polymerase to the DNA template or that facilitate selective opening of the double helix at the site of transcriptional initiation. An enhancer element can function to bind transcription factors that regulate transcription. Some enhancer elements bind more than one transcription factor, and transcription factors can interact with different affinities with more than one enhancer domain.
[0095] Expression cassettes of this disclosure can include a "transit peptide" or "targeting peptide" or "signal peptide" molecule located either 5' or 3' to or within the gene(s). These terms generally refer to peptide molecules that, when linked to a protein of interest, direct the protein to a particular tissue, cell, subcellular location, or cell organelle. Examples include, but are not limited to, chloroplast transit peptides (CTPs), chloroplast targeting peptides, mitochondrial targeting peptides, nuclear targeting signals, nuclear exporting signals, vacuolar targeting peptides, and vacuolar sorting peptides. For description of the use of chloroplast transit peptides see U.S. Pat. Nos. 5,188,642 and 5,728,925. For description of the transit peptide region of an Arabidopsis EPSPS gene in the present disclosure, see Klee, H. J. et al. (MGG (1987) 210:437-442). Expression cassettes of this disclosure can also include an intron or introns. Expression cassettes of this disclosure can contain a DNA near the 3' end of the cassette that acts as a signal to terminate transcription from a heterologous nucleic acid and that directs polyadenylation of the resultant mRNA. These are commonly referred to as "3'-untranslated regions" or "3'-non-coding sequences" or "3'-UTRs". The term "3' non-translated sequences" means DNA sequences located downstream of a structural nucleotide sequence and includes sequences encoding polyadenylation and other regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal functions in plants to cause the addition of polyadenylate nucleotides to the 3' end of the mRNA precursor. The polyadenylation signal can be derived from a natural gene, from a variety of plant genes, or from T-DNA. An example of a polyadenylation sequence is the nopaline synthase 3' sequence (nos 3'; Fraley et al., Proc. Natl. Acad. Sci. USA 80: 4803-4807, 1983). The use of different 3' non-translated sequences is exemplified by Ingelbrecht et al., Plant Cell 1:671-680, 1989.
[0096] Expression cassettes of this disclosure can also contain one or more genes that encode selectable markers and confer resistance to a selective agent such as an antibiotic or an herbicide. A number of selectable marker genes are known in the art and can be used in the present disclosure: selectable marker genes conferring tolerance to antibiotics like kanamycin and paromomycin (nptII), hygromycin B (aph IV), spectinomycin (aadA, U.S. Patent Publication 2009/0138985A1) and gentamycin (aac3 and aacC4), or tolerance to herbicides like glyphosate (for example, 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS), U.S. Pat. Nos. 5,627,061; 5,633,435; 6,040,497; 5,094,945), sulfonyl herbicides (for example, acetohydroxyacid synthase or acetolactate synthase conferring tolerance to acetolactate synthase inhibitors such as sulfonylurea, imidazolinone, triazolopyrimidine, pyrimidyloxybenzoates and phthalide (U.S. Pat. Nos. 6,225,105; 5,767,366; 4,761,373; 5,633,437; 6,613,963; 5,013,659; 5,141,870; 5,378,824; 5,605,011)), bialaphos or phosphinothricin or derivatives (e.g., phosphinothricin acetyltransferase (bar) conferring tolerance to phosphinothricin or glufosinate (U.S. Pat. Nos. 5,646,024; 5,561,236; 5,276,268; 5,637,489; 5,273,894)), dicamba (dicamba monooxygenase, Patent Application Publication US2003/0115626A1), or sethoxydim (modified acetyl-coenzyme A carboxylase for conferring tolerance to cyclohexanedione), and aryloxyphenoxypropionate (haloxyfop, U.S. Pat. No. 6,414,222).
[0097] Transformation vectors of this disclosure can contain one or more "expression cassettes", each comprising a native or non-native plant promoter operably linked to a polynucleotide sequence of interest, which is operably linked to a 3' UTR sequence and termination signal, for expression in an appropriate host cell. It also typically comprises sequences required for proper translation of the polynucleotide or transgene. As used herein, the term "transgene" refers to a polynucleotide molecule artificially incorporated into a host cell's genome. Such a transgene can be heterologous to the host cell. The term "transgenic plant" refers to a plant comprising such a transgene. The coding region usually codes for a protein of interest but can also code for a functional RNA of interest, for example an antisense RNA, a non-translated RNA, in the sense or antisense direction, a miRNA, a noncoding RNA, or a synthetic RNA used in either suppression or over expression of target gene sequences. The expression cassette comprising the nucleotide sequence of interest can be chimeric, meaning that at least one of its components is heterologous with respect to at least one of its other components. As used herein the term "chimeric" refers to a DNA molecule that is created from two or more genetically diverse sources, for example a first molecule from one gene or organism and a second molecule from another gene or organism.
[0098] As used herein, the term "heterologous" refers to the combination of two or more components, including DNA molecules, when such a combination is not normally found in nature. For example, the two DNA molecules may be derived from different species and/or the two DNA molecules may be derived from different genes, e.g., different genes from the same species or the same genes from different species, or one of the DNA molecules might be synthetic and not found in nature. A first DNA molecule is heterologous with respect to an operably linked second DNA molecule if such a combination is not normally found in nature, i.e., the second DNA molecule does not naturally occur operably linked to the first element.
[0099] Recombinant DNA constructs in this disclosure generally include a 3' element that typically contains a polyadenylation signal and site. Known 3' elements include those from Agrobacterium tumefaciens genes such as nos 3', tml 3', tmr 3', tms 3', ocs 3', tr7 3', for example disclosed in U.S. Pat. No. 6,090,627; 3' elements from plant genes such as wheat (Triticum aestivum) heat shock protein 17 (Hsp17 3'), a wheat ubiquitin gene, a wheat fructose-1,6-bisphosphatase gene, a rice glutelin gene, a rice lactate dehydrogenase gene and a rice beta-tubulin gene, all of which are disclosed in U.S. Patent Application Publication 2002/0192813 A1; and the pea (Pisum sativum) ribulose bisphosphate carboxylase gene (rbs 3'), and 3' elements from the genes within the host plant.
[0100] Transgenic plants can comprise a stack of one or more polynucleotides disclosed herein resulting in the production of multiple polypeptide sequences. Transgenic plants comprising stacks of polynucleotides can be obtained by either or both of traditional breeding methods or through genetic engineering methods. These methods include, but are not limited to, crossing individual transgenic lines each comprising a polynucleotide of interest, transforming a transgenic plant comprising a first gene disclosed herein with a second gene, and co-transformation of genes into a single plant cell. Co-transformation of genes can be carried out using single transformation vectors comprising multiple genes or genes carried separately on multiple vectors.
[0101] As an alternative to traditional transformation methods, a DNA sequence, such as a transgene, expression cassette(s), etc., may be inserted or integrated into a specific site or locus within the genome of a plant or plant cell via site-directed integration. Recombinant DNA construct(s) and molecule(s) of this disclosure may thus include a donor template sequence comprising at least one transgene, expression cassette, or other DNA sequence for insertion into the genome of the plant or plant cell. Such donor template for site-directed integration may further include one or two homology arms flanking an insertion sequence (i.e., the sequence, transgene, cassette, etc., to be inserted into the plant genome). The recombinant DNA construct(s) of this disclosure may further comprise an expression cassette(s) encoding a site-specific nuclease and/or any associated protein(s) to carry out site-directed integration. These nuclease expressing cassette(s) may be present in the same molecule or vector as the donor template (in cis) or on a separate molecule or vector (in trans).
[0102] Any site or locus within the genome of a plant may potentially be chosen for site-directed integration of a transgene, construct or transcribable DNA sequence provided herein. Several methods for site-directed integration are known in the art involving different proteins (or complexes of proteins and/or guide RNA) that cut the genomic DNA to produce a double strand break (DSB) or nick at a desired genomic site or locus. Briefly as understood in the art, during the process of repairing the DSB or nick introduced by the nuclease enzyme, the donor template DNA may become integrated into the genome at or near the site of the DSB or nick. The presence of the homology arm(s) in the donor template may promote the adoption and targeting of the insertion sequence into the plant genome during the repair process through homologous recombination, although an insertion event may also occur through non-homologous end joining (NHEJ). Examples of site-specific nucleases that may be used include zinc-finger nucleases, engineered or native meganucleases, TALE-endonucleases, and RNA-guided endonucleases (e.g., Cas9 or Cpf1). For methods using RNA-guided site-specific nucleases (e.g., Cas9 or Cpf1), the recombinant DNA construct(s) will also comprise a sequence encoding one or more guide RNAs to direct the nuclease to the desired site within the plant genome.
[0103] As used herein, the term "homology arm" refers to a polynucleotide sequence that has at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to a target sequence in a plant or plant cell that is being transformed. A homology arm can comprise at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 100, at least 250, at least 500, or at least 1000 nucleotides.
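For illustration only, the identity and length criteria recited above for a homology arm can be expressed as a simple check. The sketch assumes the candidate arm and the genomic target have already been aligned without gaps and are of equal length. The default thresholds are the lowest values recited in the paragraph (80% identity, 15 nucleotides), and the function names and example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two pre-aligned, ungapped sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def qualifies_as_homology_arm(arm: str, target: str,
                              min_identity: float = 80.0,
                              min_length: int = 15) -> bool:
    """Apply a minimum length and a minimum percent identity to a candidate arm."""
    return len(arm) >= min_length and percent_identity(arm, target) >= min_identity


# Hypothetical 20-nucleotide arm with a single mismatch against its target.
arm = "ACGTACGTACGTACGTACGT"
target = "ACGTACGTACGAACGTACGT"

print(percent_identity(arm, target))
print(qualifies_as_homology_arm(arm, target))
print(qualifies_as_homology_arm(arm, target, min_identity=97.0))
```

Stricter embodiments can be modeled by passing higher values for `min_identity` or `min_length`. A practical implementation would normally derive identity from a gapped alignment rather than a position-by-position comparison.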
[0104] As used herein "operably linked" means the association of two or more DNA fragments in a recombinant DNA construct so that the expression or function of one (for example, protein-encoding DNA), is controlled or influenced by the other (for example, a promoter).
[0105] Transgenic plants can comprise a stack of one or more polynucleotides disclosed herein resulting in the production of multiple polypeptide sequences. Transgenic plants comprising stacks of polynucleotides can be obtained by either or both of traditional breeding methods or through genetic engineering methods. These methods include, but are not limited to, crossing individual transgenic lines each comprising a polynucleotide of interest, transforming a transgenic plant comprising a first gene disclosed herein with a second gene, and co-transformation of genes into a single plant cell. Co-transformation of genes can be carried out using single transformation vectors comprising multiple genes or genes carried separately on multiple vectors.
[0106] Transgenic plants comprising or derived from plant cells that are transformed with a recombinant DNA of this disclosure can be further enhanced with stacked traits, for example, a crop plant having an enhanced trait resulting from expression of DNA disclosed herein in combination with herbicide and/or pest resistance traits. For example, genes of the current disclosure can be stacked with other traits of agronomic interest, such as a trait providing herbicide resistance, or insect resistance, such as using a gene from Bacillus thuringiensis to provide resistance against lepidopteran, coleopteran, homopteran, hemipteran, and other insects, or improved quality traits such as improved nutritional value. Herbicides for which transgenic plant tolerance has been demonstrated and to which the method of the present disclosure can be applied include, but are not limited to, glyphosate, dicamba, glufosinate, sulfonylurea, bromoxynil, norflurazon, 2,4-D ((2,4-dichlorophenoxy)acetic acid), aryloxyphenoxy propionates, p-hydroxyphenyl pyruvate dioxygenase (HPPD) inhibitors, and protoporphyrinogen oxidase (PPO) inhibitor herbicides. Polynucleotide molecules encoding proteins involved in herbicide tolerance are known in the art and include, but are not limited to, a polynucleotide molecule encoding 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) disclosed in U.S. Pat. Nos. 5,094,945; 5,627,061; 5,633,435 and 6,040,497 for imparting glyphosate tolerance; polynucleotide molecules encoding a glyphosate oxidoreductase (GOX) disclosed in U.S. Pat. No. 5,463,175 and a glyphosate-N-acetyl transferase (GAT) disclosed in U.S. Patent Application Publication 2003/0083480 A1 also for imparting glyphosate tolerance; dicamba monooxygenase disclosed in U.S. Patent Application Publication 2003/0135879 A1 for imparting dicamba tolerance; a polynucleotide molecule encoding bromoxynil nitrilase (Bxn) disclosed in U.S. Pat. No. 4,810,648 for imparting bromoxynil tolerance; a polynucleotide molecule encoding phytoene desaturase (crtI) described in Misawa et al. (1993) Plant J. 4:833-840 and in Misawa et al. (1994) Plant J. 6:481-489 for norflurazon tolerance; a polynucleotide molecule encoding acetohydroxyacid synthase (AHAS, aka ALS) described in Sathasivan et al. (1990) Nucl. Acids Res. 18:2188-2193 for imparting tolerance to sulfonylurea herbicides; polynucleotide molecules known as bar genes disclosed in DeBlock et al. (1987) EMBO J. 6:2513-2519 for imparting glufosinate and bialaphos tolerance; polynucleotide molecules disclosed in U.S. Patent Application Publication 2003/010609 A1 for imparting N-amino methyl phosphonic acid tolerance; polynucleotide molecules disclosed in U.S. Pat. No. 6,107,549 for imparting pyridine herbicide resistance; molecules and methods for imparting tolerance to multiple herbicides such as glyphosate, atrazine, ALS inhibitors, isoxaflutole and glufosinate herbicides are disclosed in U.S. Pat. No. 6,376,754 and U.S. Patent Application Publication 2002/0112260. Molecules and methods for imparting insect/nematode/virus resistance are disclosed in U.S. Pat. Nos. 5,250,515; 5,880,275; 6,506,599; 5,986,175 and U.S. Patent Application Publication 2003/0150017 A1.
Plant Cell Transformation Methods
[0107] Numerous methods for transforming chromosomes and plastids in a plant cell with a recombinant DNA, and/or introducing a recombinant DNA into chromosomes and plastids of a plant cell, are known in the art that may be used in methods of producing a transgenic plant cell and plant. Two effective methods for transformation are Agrobacterium-mediated transformation and microprojectile bombardment-mediated transformation. Microprojectile bombardment methods are illustrated, for example, in U.S. Pat. No. 5,015,580 (soybean); U.S. Pat. No. 5,550,318 (corn); U.S. Pat. No. 5,538,880 (corn); U.S. Pat. No. 5,914,451 (soybean); U.S. Pat. No. 6,160,208 (corn); U.S. Pat. No. 6,399,861 (corn); U.S. Pat. No. 6,153,812 (wheat) and U.S. Pat. No. 6,365,807 (rice). Agrobacterium-mediated transformation methods are described, for example, in U.S. Pat. No. 5,159,135 (cotton); U.S. Pat. No. 5,824,877 (soybean); U.S. Pat. No. 5,463,174 (canola); U.S. Pat. No. 5,591,616 (corn); U.S. Pat. No. 5,846,797 (cotton); U.S. Pat. No. 8,044,260 (cotton); U.S. Pat. No. 6,384,301 (soybean), U.S. Pat. No. 7,026,528 (wheat) and U.S. Pat. No. 6,329,571 (rice), U.S. Patent Application Publication No. 2004/0087030 A1 (cotton), and U.S. Patent Application Publication No. 2001/0042257 A1 (sugar beet), all of which are incorporated herein by reference in their entirety. Transformation of plant material is practiced in tissue culture on nutrient media, for example a mixture of nutrients that allow cells to grow in vitro. Recipient cell targets include, but are not limited to, meristem cells, shoot tips, hypocotyls, calli, immature or mature embryos, and gametic cells such as microspores, pollen, sperm and egg cells. Callus can be initiated from tissue sources including, but not limited to, immature or mature embryos, hypocotyls, seedling apical meristems, microspores and the like. Cells containing a transgenic nucleus are grown into transgenic plants.
[0108] As introduced above, another method for transforming plant cells and chromosomes in a plant cell is via insertion of a DNA sequence using a recombinant DNA donor template at a pre-determined site of the genome by methods of site-directed integration. Site-directed integration may be accomplished by any method known in the art, for example, by use of zinc-finger nucleases, engineered or native meganucleases, TALE-endonucleases, or an RNA-guided endonuclease (for example Cas9 or Cpf1). The recombinant DNA construct may be inserted at the pre-determined site by homologous recombination (HR) or by non-homologous end joining (NHEJ). In addition to insertion of a recombinant DNA construct into a plant chromosome at a pre-determined site, genome editing can be achieved through oligonucleotide-directed mutagenesis (ODM) (Oh and May, 2001; U.S. Pat. No. 8,268,622) or by introduction of a double-strand break (DSB) or nick with a site specific nuclease, followed by NHEJ or repair. The repair of the DSB or nick may be used to introduce insertions or deletions at the site of the DSB or nick, and these mutations may result in the introduction of frame-shifts, amino acid substitutions, and/or an early termination codon of protein translation or alteration of a regulatory sequence of a gene. Genome editing may be achieved with or without a donor template molecule.
[0109] In addition to direct transformation of a plant material with a recombinant DNA construct, a transgenic plant can be prepared by crossing a first plant comprising a recombinant DNA with a second plant lacking the recombinant DNA. For example, recombinant DNA can be introduced into a first plant line that is amenable to transformation, which can be crossed with a second plant line to introgress the recombinant DNA into the second plant line. A transgenic plant with recombinant DNA providing an enhanced trait, for example, enhanced yield or other yield component trait, can be crossed with a transgenic plant line having another recombinant DNA that confers another trait, for example herbicide resistance or pest resistance, to produce progeny plants having recombinant DNA sequences that confer both traits. The progeny of these crosses may segregate, such that some of the plants will carry the recombinant DNA for both parental traits and some will carry the recombinant DNA for one of the parental traits; and such plants can be identified by one or both of the parental traits and/or markers associated with one or both of the parental traits or the recombinant DNA. For example, marker identification may be performed by analysis or detection of the recombinant DNA, or in the case where a selectable marker is linked to the recombinant DNA, by application of a selection agent, such as a herbicide for use with a herbicide tolerance marker, or by selection for the enhanced trait, or using any molecular technique. Progeny plants carrying DNA for both parental traits can be crossed back to one of the parent lines multiple times, for example 6 to 8 generations, to produce a progeny plant with substantially the same genotype as the original transgenic parental line but for the recombinant DNA of the other transgenic parental line.
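As a rough illustration of why repeated backcrossing yields progeny with substantially the genotype of one parent line, the expected genome fraction contributed by the recurrent parent can be estimated with the standard textbook approximation 1 - (1/2)^(n+1) after n backcrosses of an F1. This approximation is not stated in this disclosure. It ignores selection and linkage around the introgressed recombinant DNA, and it assumes that the "6 to 8 generations" mentioned above are counted as backcross generations. The function name is hypothetical.

```python
def recurrent_parent_fraction(backcrosses: int) -> float:
    """Expected fraction of the genome derived from the recurrent parent after
    the given number of backcrosses of an F1 (0 backcrosses = the F1 itself).
    Textbook expectation only; ignores selection and linkage drag."""
    if backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


# Tabulate the expectation for the range of generations mentioned in the text.
for n in (1, 6, 7, 8):
    print(f"BC{n}: {recurrent_parent_fraction(n):.4%}")
```

Under these assumptions the expected recurrent-parent fraction exceeds 99% by the sixth backcross, which is consistent with the range of generations given in the paragraph.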
[0110] For transformation, DNA is typically introduced into only a small percentage of target plant cells in any one transformation experiment. Marker genes are used to provide an efficient system for identification of those cells that are stably transformed by receiving and integrating a recombinant DNA construct into their genomes. Preferred marker genes provide selective markers which confer resistance to a selective agent, such as an antibiotic or an herbicide. Any of the herbicides to which plants of this disclosure can be resistant is an agent for selective markers. Potentially transformed cells are exposed to the selective agent. In the population of surviving cells are those cells where, generally, the resistance-conferring gene is integrated and expressed at sufficient levels to permit cell survival. Cells can be tested further to confirm stable integration of the exogenous DNA. Commonly used selective marker genes include those conferring resistance to antibiotics such as kanamycin and paromomycin (nptII), hygromycin B (aph IV), spectinomycin (aadA) and gentamycin (aac3 and aacC4) or resistance to herbicides such as glufosinate (bar or pat), dicamba (DMO) and glyphosate (aroA or EPSPS). Examples of such selectable markers are illustrated in U.S. Pat. Nos. 5,550,318; 5,633,435; 5,780,708; 6,118,047 and 8,030,544. Markers which provide an ability to visually screen transformants can also be employed, for example, a gene expressing a colored or fluorescent protein such as a luciferase or green fluorescent protein (GFP) or a gene expressing a beta-glucuronidase or uidA gene (GUS) for which various chromogenic substrates are known.
[0111] Plant cells that survive exposure to a selective agent, or plant cells that have been scored positive in a screening assay, may be cultured in vitro to develop or regenerate plantlets. Developing plantlets regenerated from transformed plant cells can be transferred to plant growth mix, and hardened off, for example, in an environmentally controlled chamber at about 85% relative humidity, 600 ppm CO2, and 25-250 microEinsteins m^-2 s^-1 of light, prior to transfer to a greenhouse or growth chamber for maturation. Plants may be regenerated from about 6 weeks to 10 months after a transformant is identified, depending on the initial tissue and plant species. Plants can be pollinated using conventional plant breeding methods known to those of skill in the art to produce seeds; for example, cross-pollination and self-pollination are commonly used with transgenic corn and other plants. The regenerated transformed plant or its progeny seed or plants can be tested for expression of the recombinant DNA and selected for the presence of an altered phenotype or an enhanced agronomic trait.
Transgenic Plants and Seeds
[0112] Transgenic plants derived from transgenic plant cells having a transgenic nucleus of this disclosure are grown to generate transgenic plants having an altered phenotype or an enhanced trait as compared to a control plant, and produce transgenic seed and haploid pollen of this disclosure. Such plants with enhanced traits are identified by selection of transformed plants or progeny seed for the enhanced trait. For efficiency a selection method is designed to evaluate multiple transgenic plants (events) comprising the recombinant DNA, for example multiple plants from 2 to 20 or more transgenic events. Transgenic plants grown from transgenic seeds provided herein demonstrate improved agronomic traits that contribute to increased yield or other traits that provide increased plant value, including, for example, improved seed quality. Of particular interest are plants having increased water use efficiency or drought tolerance, enhanced high temperature or cold tolerance, increased yield, and increased nitrogen use efficiency.
[0113] Table 1 provides a list of sequences of protein-encoding genes as recombinant DNA for production of transgenic plants with enhanced traits. The elements of Table 1 are described by reference to: "NUC SEQ ID NO." which identifies a DNA sequence; "PEP SEQ ID NO." which identifies an amino acid sequence; "Gene ID" which refers to an identifier for the gene; and "Gene Name and Description" which is a common name and functional description of the gene.
TABLE 1 Sequences for Protein-Coding Genes
NUC SEQ ID NO.   PEP SEQ ID NO.   Gene ID   Gene Name and Description
1                10               TX7G1     Agrobacterium tumefaciens Isopentyl transferase (AGRtu.IPT)
2                11               TX7G2     Soybean 14-3-3-like protein A gene (Gm.SGF14A)
3                12               TX7G3     Nostoc sp. sucrose-phosphate phosphatase (sppA)
4                13               TX7G4     Corn isopropylmalate synthase
5                14               TX7G5     Arabidopsis thaliana tonoplast monosaccharide transporter1 gene (At.TMT1)
6                15               TX7G6     Chlamydomonas reinhardtii S-adenosyl-L-homocysteine hydrolase
7                16               TX7G7     Truncated corn sucrose phosphate synthase gene (Zm.SPS truncated)
8                17               TX7G8     Arabidopsis thaliana vacuolar glucose transporter 1 gene (At.VGT1)
9                18               TX7G9     Arabidopsis thaliana KURZ UND KLEIN gene (At.KUK1)
[0114] Table 2 provides a list of constructs with specific expression patterns, for expression or suppression of protein-coding genes, as recombinant DNA for production of transgenic plants with enhanced traits. The elements of Table 2 are described by reference to: "Construct ID" which identifies a construct with a particular expression pattern by a promoter operably linked to a polynucleotide sequence either expressing or suppressing a protein-coding gene; "Gene ID" which identifies either an expressed or suppressed gene from Table 1 or Table 2; and "Specific Expression Pattern" which describes the expected expression pattern or promoter type.
TABLE 2 Constructs for Gene expression
Construct ID   Gene ID   Specific Expression Pattern
TX7G1c01       TX7G1     Meristem Preferred
TX7G1c02       TX7G1     Meristem Preferred
TX7G1c03       TX7G1     Constitutive
TX7G1c04       TX7G1     Seed Preferred
TX7G1c05       TX7G1     Seed Preferred
TX7G1c06       TX7G1     Root Preferred
TX7G1c07       TX7G1     Ovule & Early Kernel Preferred
TX7G1c08       TX7G1     Root Preferred
TX7G1c09       TX7G1     Constitutive
TX7G1c10       TX7G1     Embryo Scutellum Preferred
TX7G1c11       TX7G1     Drought Responsive in Leaf & Root
TX7G1c12       TX7G1     Meristem Preferred
TX7G2c1        TX7G2     Constitutive
TX7G2c2        TX7G2     Constitutive
TX7G2c3        TX7G2     Constitutive
TX7G2c4        TX7G2     Drought Responsive
TX7G3c1        TX7G3     Above Ground Preferred; Medium
TX7G3c2        TX7G3     Leaf Bundle Sheath & Mesophyll Preferred
TX7G3c3        TX7G3     Constitutive
TX7G3c4        TX7G3     Leaf Bundle Sheath & Mesophyll Preferred
TX7G3c5        TX7G3     Above Ground Preferred; High
TX7G4c1        TX7G4     Constitutive
TX7G5c1        TX7G5     Constitutive
TX7G5c2        TX7G5     Constitutive
TX7G5c3        TX7G5     Leaf Preferred
TX7G6c1        TX7G6     Constitutive
TX7G7c1        TX7G7     Above Ground Preferred; High
TX7G7C2        TX7G7     Leaf Bundle Sheath & Mesophyll Preferred
TX7G8c1        TX7G8     Constitutive
TX7G8c2        TX7G8     Constitutive
TX7G9c1        TX7G9     Root Preferred
[0115] Table 3 provides a list of polynucleotide sequences of promoters with specific expression patterns. To convey the specific expression patterns, choices of promoters are not limited to those listed in Table 3.
TABLE 3 Promoter sequences and expression patterns
Nucleotide SEQ ID NO.   Promoter Expression Pattern
19                      Above Ground Preferred; High
20                      Above Ground Preferred; Medium
21                      Drought Responsive
22                      Drought Responsive in Leaf & Root
23                      Embryo Scutellum Preferred
24                      Leaf Bundle Sheath & Mesophyll Preferred
25                      Leaf Preferred
26                      Meristem Preferred
27                      Ovule & Early Kernel Preferred
28                      Root Preferred
29                      Seed Preferred
Selecting and Testing Transgenic Plants for Enhanced Traits
[0116] Within a population of transgenic plants, each developed or regenerated from a plant cell with a recombinant DNA, many plants that survive to become fertile transgenic plants producing seeds and progeny plants will not exhibit an enhanced agronomic trait. Selection from the population may be necessary to identify one or more transgenic plants with an enhanced trait. Further evaluation with rigorous testing may be important for understanding the contributing components to a trait, supporting trait advancement decisions and generating mode of action hypotheses. Transgenic plants having enhanced traits can be selected and tested from populations of plants developed, regenerated or derived from plant cells transformed as described herein by evaluating the plants in a variety of assays to detect an enhanced trait, for example, increased water use efficiency or drought tolerance, enhanced high temperature or cold tolerance, increased yield or yield components, desirable architecture, optimum life cycle, increased nitrogen use efficiency, or enhanced seed composition such as enhanced seed protein and enhanced seed oil.
[0117] These assays can take many forms including, but not limited to, direct screening for the trait in a greenhouse or field trial or screening for a surrogate trait. Such analyses can be directed to detecting changes in the chemical composition, biomass, yield components, physiological property, root architecture, morphology, or life cycle of the plant. Changes in chemical compositions such as nutritional composition of grain can be detected by analysis of the seed composition and content of protein, free amino acids, oils, free fatty acids, starch or tocopherols. Changes in chemical compositions can also be detected by analysis of contents in leaves, such as chlorophyll or carotenoid contents. Changes in biomass characteristics can be evaluated on greenhouse or field grown plants and can include plant height, stem diameter, root and shoot dry weights, canopy size; and, for corn plants, ear length and diameter. Changes in yield components can be measured by total number of kernels per unit area and individual kernel weight. Changes in physiological properties can be identified by evaluating responses to stress conditions, for example assays using imposed stress conditions such as water deficit, nitrogen deficiency, cold growing conditions, pathogen or insect attack or light deficiency, or increased plant density. Changes in root architecture can be evaluated by root length and branch number. Changes in morphology can be measured by visual observation of tendency of a transformed plant to appear to be a normal plant as compared to changes toward bushy, taller, thicker, narrower leaves, striped leaves, knotted trait, chlorosis, albino, anthocyanin production, or altered tassels, ears or roots. Changes in morphology can also be measured with morphometric analysis based on shape parameters, using dimensional measurement such as ear diameter, ear length, kernel row number, internode length, plant height, or stem volume. Changes in life cycle can be measured by macro or microscopic morphological changes partitioned into developmental stages, such as days to pollen shed, days to silking, and leaf extension rate. Other selection and testing properties include days to pollen shed, days to silking, leaf extension rate, chlorophyll content, leaf temperature, stand, seedling vigor, internode length, plant height, leaf number, leaf area, tillering, brace roots, stay green or delayed senescence, stalk lodging, root lodging, plant health, barrenness/prolificacy, green snap, and pest resistance. In addition, phenotypic characteristics of harvested grain can be evaluated, including number of kernels per row on the ear, number of rows of kernels on the ear, kernel abortion, kernel weight, kernel size, kernel density and physical grain quality.
[0118] Assays for screening for a desired trait are readily designed by those practicing in the art. The following illustrates screening assays for corn traits using hybrid corn plants. The assays can be adapted for screening other plants such as canola, wheat, cotton and soybean either as hybrids or inbreds.
[0119] Transgenic corn plants having increased nitrogen use efficiency can be identified by screening transgenic plants in the field under the same and sufficient amount of nitrogen supply as compared to control plants, where such plants provide higher yield as compared to control plants. Transgenic corn plants having increased nitrogen use efficiency can also be identified by screening transgenic plants in the field under reduced amount of nitrogen supply as compared to control plants, where such plants provide the same or similar yield as compared to control plants.
[0120] Transgenic corn plants having increased yield can be identified by screening using progenies of the transgenic plants over multiple locations for several years with plants grown under optimal production management practices and maximum weed and pest control or standard agronomic practices (SAP). Selection methods can be applied in multiple and diverse geographic locations, for example up to 16 or more locations, over one or more planting seasons, for example at least two planting seasons, to statistically distinguish yield improvement from natural environmental effects.
[0121] Transgenic corn plants having increased water use efficiency or drought tolerance can be identified by screening plants in an assay where water is withheld for a period to induce stress followed by watering to revive the plants. For example, a selection process imposes 3 drought/re-water cycles on plants over a total period of 15 days after an initial stress free growth period of 11 days. Each cycle consists of 5 days, with no water being applied for the first four days and a water quenching on the 5th day of the cycle. The primary phenotypes analyzed by the selection method may be changes in plant growth rate as determined by height and biomass during a vegetative drought treatment.
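The drought/re-water regime in the example above can be laid out as a day-by-day schedule. The following is a minimal sketch under the stated numbers (11 stress-free days, then 3 cycles of 5 days with water withheld for four days and applied on the fifth); the function name and the action labels are hypothetical and not part of the disclosure.

```python
def drought_schedule(stress_free_days=11, cycles=3, cycle_days=5):
    """Return a list of (day, action) tuples, with days numbered from 1.

    Actions (labels are illustrative assumptions):
      "water"    - normal watering during the stress-free growth period
      "withhold" - no water applied (first days of each cycle)
      "re-water" - water applied on the last day of each cycle
    """
    schedule = [(day, "water") for day in range(1, stress_free_days + 1)]
    day = stress_free_days
    for _ in range(cycles):
        for day_in_cycle in range(1, cycle_days + 1):
            day += 1
            action = "re-water" if day_in_cycle == cycle_days else "withhold"
            schedule.append((day, action))
    return schedule


plan = drought_schedule()
for day, action in plan:
    print(f"day {day:2d}: {action}")
```

With the default values the schedule spans 26 days in total: the 11-day stress-free period followed by the 15-day drought treatment.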
[0122] Although the plant cells and methods of this disclosure can be applied to any plant cell, plant, seed or pollen, for example, any fruit, vegetable, grass, tree or ornamental plant, the various aspects of the disclosure are applied to corn, soybean, cotton, canola, rice, barley, oat, wheat, turf grass, alfalfa, sugar beet, sunflower, quinoa and sugar cane plants.
EXAMPLES
Example 1. Corn Transformation
[0123] This example illustrates transformation methods to produce a transgenic corn plant cell, seed, and plant having altered phenotypes as shown in Tables 5-7, and enhanced traits, such as increased water use efficiency, increased nitrogen use efficiency, and increased yield, and altered traits and phenology as shown in Tables 9, 10, 12 and 13.
[0124] For Agrobacterium-mediated transformation of corn embryo cells, ears from corn plants were harvested and surface-sterilized by spraying or soaking the ears in ethanol, followed by air drying. Embryos were isolated from individual kernels of surface-sterilized ears. After excision, maize embryos were inoculated with Agrobacterium cells containing plasmid DNA with the gene of interest cassette and a plant selectable marker cassette, and then co-cultured with Agrobacterium for several days. Co-cultured embryos were transferred to various selection and regeneration media, and transformed R0 plants were recovered 6 to 8 weeks after initiation of selection and transplanted into potting soil. Regenerated R0 plants were selfed, and R1 and subsequent progeny generations were obtained.
[0125] The above process can be repeated to produce multiple events of transgenic corn plants from cells that were transformed with recombinant DNA constructs identified in Table 2. Progeny transgenic plants and seeds of the transformed plants were screened for the presence and single copy of the inserted gene, and for various altered or enhanced traits and phenotypes, such as increased water use efficiency, increased yield, and increased nitrogen use efficiency as shown in Tables 5-7 and 9, 10, 12 and 13. From each group of multiple events of transgenic plants with a specific recombinant DNA from Table 2, the event(s) that showed increased yield, increased water use efficiency, increased drought tolerance, increased nitrogen use efficiency, and altered phenotypes and traits were identified.
Example 2. Soybean Transformation
[0126] This example illustrates plant transformation in producing a transgenic soybean plant cell, seed, and plant having an altered phenotype or an enhanced trait, such as increased nitrogen use efficiency, increased water use efficiency, increased drought tolerance, and increased yield as shown in Table 13.
[0127] For Agrobacterium mediated transformation, soybean seeds were imbibed overnight and the meristem explants excised. Soybean explants were mixed with induced Agrobacterium cells containing plasmid DNA with the gene of interest cassette and a plant selectable marker cassette no later than 14 hours from the time of initiation of seed imbibition, and wounded using sonication. Following wounding, explants were placed in co-culture for 2-5 days at which point they were transferred to selection media to allow selection and growth of transgenic shoots. Resistant shoots were harvested in approximately 6-8 weeks and placed into selective rooting media for 2-3 weeks. Shoots producing roots were transferred to the greenhouse and potted in soil. Shoots that remained healthy on selection, but did not produce roots were transferred to non-selective rooting media for an additional two weeks. Roots from any shoots that produced roots off selection were tested for expression of the plant selectable marker before they were transferred to the greenhouse and potted in soil.
[0128] The above process can be repeated to produce multiple events of transgenic soybean plants from cells that were transformed with recombinant DNA having the constructs identified in Table 2. Progeny transgenic plants and seed of the transformed plants were screened for the presence and single copy of the inserted gene, and tested for various altered or enhanced phenotypes and traits as shown in Tables 11, 12 and 13.
Example 3. Identification of Altered Phenotypes in Automated Greenhouse
[0129] This example illustrates screening and identification of transgenic corn plants for altered phenotypes in an automated greenhouse (AGH). The apparatus and the methods for automated phenotypic screening of plants are disclosed, for example, in U.S. Patent Publication No. 2011/0135161, which is incorporated herein by reference in its entirety.
[0130] Corn plants were tested in three screens in the AGH under different conditions including non-stress, nitrogen deficit, and water deficit stress conditions. All screens began with non-stress conditions during the germination phase (days 0-5), after which the plants were grown for 22 days under the screen-specific conditions shown in Table 4.
TABLE 4 Description of the three AGH screens for corn plants
Screen             Description                             Germination Phase (5 days)   Screen Specific Phase (22 days)
Non-stress         well watered, sufficient nitrogen       55% VWC water                55% VWC; 8 mM nitrogen
Water deficit      limited watered, sufficient nitrogen    55% VWC water                30% VWC; 8 mM nitrogen
Nitrogen deficit   well watered, low nitrogen              55% VWC water                55% VWC; 2 mM nitrogen
[0131] Water deficit is defined as a specific Volumetric Water Content (VWC) that is lower than the VWC of a non-stressed plant. For example, a non-stressed plant might be maintained at 55% VWC, and the VWC for a water-deficit assay might be defined around 30% VWC. Data were collected using visible light and hyperspectral imaging as well as direct measurement of pot weight and amount of water and nutrient applied to individual plants on a daily basis.
[0132] Nitrogen deficit is defined (in part) as a specific mM concentration of nitrogen that is lower than the nitrogen concentration of a non-stressed plant. For example, a non-stressed plant might be maintained at 8 mM nitrogen, while the nitrogen concentration applied in a nitrogen-deficit assay might be maintained at a concentration of 2 mM.
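The definitions of water deficit and nitrogen deficit above are both relative to the non-stress condition. As a rough sketch, the three screen-specific conditions from Table 4 can be encoded and compared against the non-stress baseline; the class and function names here are hypothetical, and only the VWC and nitrogen values come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Screen:
    name: str
    vwc_percent: float   # Volumetric Water Content in the screen-specific phase
    nitrogen_mM: float   # nitrogen concentration in the screen-specific phase


NON_STRESS = Screen("non-stress", 55.0, 8.0)
SCREENS = [
    NON_STRESS,
    Screen("water deficit", 30.0, 8.0),
    Screen("nitrogen deficit", 55.0, 2.0),
]


def stresses(screen: Screen, baseline: Screen = NON_STRESS) -> list:
    """List the deficits a screen imposes relative to the baseline screen."""
    found = []
    if screen.vwc_percent < baseline.vwc_percent:
        found.append("water deficit")
    if screen.nitrogen_mM < baseline.nitrogen_mM:
        found.append("nitrogen deficit")
    return found


for s in SCREENS:
    print(s.name, "->", stresses(s) or ["none"])
```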
[0133] Up to ten parameters were measured for each screen. The visible light color imaging based measurements are: biomass, canopy area, and plant height. Biomass (Bmass) is defined as the estimated shoot fresh weight (g) of the plant obtained from images acquired from multiple angles of view. Canopy Area (Cnop) is defined as leaf area as seen in a top-down image (mm^2). Plant Height (PlntH) refers to the distance from the top of the pot to the highest point of the plant derived from a side image (mm). Anthocyanin score and area, chlorophyll score and concentration, and water content score are hyperspectral imaging-based parameters. Anthocyanin Score (AntS) is an estimate of anthocyanin in the leaf canopy obtained from a top-down hyperspectral image. Anthocyanin Area (AntA) is an estimate of anthocyanin in the stem obtained from a side-view hyperspectral image. Chlorophyll Score (ClrpS) and Chlorophyll Concentration (ClrpC) are both measurements of chlorophyll in the leaf canopy obtained from a top-down hyperspectral image, where Chlorophyll Score is measured in relative units, and Chlorophyll Concentration is measured in parts per million (ppm) units. Water Content Score (WtrCt) is a measurement of water in the leaf canopy obtained from a top-down hyperspectral image. Water Use Efficiency (WUE) is derived from the grams of plant biomass per liter of water added. Water Applied (WtrAp) is a direct measurement of water added to a pot (pot with no hole) during the course of an experiment to maintain a stable soil water content.
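Water Use Efficiency as described above is a simple ratio of biomass to water added. A minimal sketch follows, assuming biomass in grams (Bmass) and water applied in liters (WtrAp); the function name and the example values are hypothetical.

```python
def water_use_efficiency(biomass_g: float, water_applied_l: float) -> float:
    """Grams of plant biomass per liter of water added to the pot."""
    if water_applied_l <= 0:
        raise ValueError("water applied must be positive")
    return biomass_g / water_applied_l


# Hypothetical plant: 50 g estimated shoot fresh weight, 2 L of water added.
print(water_use_efficiency(50.0, 2.0), "g/L")
```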
[0134] These physiological screen runs were set up so that tested transgenic lines were compared to a control line. The collected data were analyzed against the control using % delta and certain p-value cutoff. Tables 5, 6 and 7 are summaries of transgenic corn plants comprising the disclosed recombinant DNA constructs with altered phenotypes under non stress, nitrogen deficit, and water deficit conditions, respectively. "ConstructID" refers to the construct identifier as defined in Table 2.
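The text does not define "% delta" explicitly. One common reading, assumed here, is the percent difference between the mean of the transgenic events and the mean of the control. The sketch below uses that assumption; the function name and sample values are hypothetical, and the accompanying p-value would come from a separate statistical test that is not shown.

```python
from statistics import mean


def percent_delta(transgenic, control):
    """Percent change of the transgenic mean relative to the control mean.

    Assumed definition: 100 * (mean(transgenic) - mean(control)) / mean(control).
    """
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    return 100.0 * (mean(transgenic) - control_mean) / control_mean


# Hypothetical biomass readings (g) for a transgenic event and its control.
print(f"{percent_delta([11.0, 13.0], [10.0, 10.0]):+.1f}% vs. control")
```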
[0135] The test results are represented by three numbers: the first number before letter "p" denotes number of events with an increase in the tested parameter at p ≤ 0.1; the second number before letter "n" denotes number of events with a decrease in the tested parameter at p ≤ 0.1; the third number before letter "t" denotes total number of transgenic events tested for a given parameter in a specific screen. The increase or decrease is measured in comparison to non-transgenic control plants. A designation of "--" indicates that it has not been tested. For example, 2p1n5t indicates that 5 transgenic plant events were screened, of which 2 events showed an increase, and 1 showed a decrease of the measured parameter.
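The three-number notation described above is regular enough to be read mechanically. The following sketch parses a table cell such as 2p1n5t into its three counts; the names are hypothetical, and the check that increases plus decreases do not exceed the total is an added sanity assumption rather than something stated in the text.

```python
import re
from typing import NamedTuple, Optional


class EventSummary(NamedTuple):
    increased: int   # number before "p"
    decreased: int   # number before "n"
    total: int       # number before "t"


_CELL = re.compile(r"^(\d+)p(\d+)n(\d+)t$")


def parse_summary(cell: str) -> Optional[EventSummary]:
    """Parse a cell like '2p1n5t'; return None for an untested cell."""
    cell = cell.strip()
    if cell in {"-", "--"}:
        return None
    match = _CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognized summary cell: {cell!r}")
    increased, decreased, total = (int(g) for g in match.groups())
    if increased + decreased > total:  # sanity check (assumption)
        raise ValueError(f"counts exceed total in {cell!r}")
    return EventSummary(increased, decreased, total)


print(parse_summary("2p1n5t"))
print(parse_summary("--"))
```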
TABLE 5 Summary of transgenic plants with altered phenotypes in AGH non-stress screens.
ConstructID   AntS     Bmass    Cnop     ClrpS    PlntH    WtrAp    WtrCt    WUE
TX7G3c3       0p0n8t   0p0n8t   1p1n8t   0p2n8t   0p1n8t   3p2n8t   0p0n8t   0p0n8t

TABLE 6 Summary of transgenic plants with altered phenotypes in AGH nitrogen-deficit screens.
ConstructID   AntA     AntS     Bmass    Cnop     ClrpC    ClrpS    PlntH    WtrAp    WtrCt    WUE
TX7G3c3       --       0p0n8t   3p0n8t   3p0n8t   --       2p1n8t   2p0n8t   3p0n8t   2p2n8t   3p0n8t
TX7G1c09      0p0n5t   1p0n5t   0p1n5t   0p2n5t   0p0n5t   --       1p0n5t   0p1n5t   --       0p1n5t
TX7G3c4       0p0n5t   0p0n5t   1p0n5t   1p0n5t   0p0n5t   --       0p0n5t   0p2n5t   --       1p0n5t

TABLE 7 Summary of transgenic plants with altered phenotypes in AGH water-deficit screens.
ConstructID   AntA     AntS     Bmass    Cnop     ClrpC    ClrpS    PlntH    WtrAp    WtrCt    WUE
TX7G3c3       --       0p1n8t   0p3n8t   0p3n8t   --       1p0n8t   0p2n8t   2p4n8t   2p3n8t   0p2n8t
TX7G3c4       1p0n5t   0p0n5t   0p1n5t   0p2n5t   0p1n5t   --       0p0n5t   0p1n5t   --       0p2n5t
Example 4. Evaluation of Transgenic Plants for Trait Characteristics
[0136] Trait assays were conducted to evaluate trait characteristics and phenotypic changes in transgenic plants as compared to non-transgenic controls. Corn and soybean plants were grown in field and greenhouse conditions. Up to 18 parameters were measured for corn in phenology, morphometrics, biomass, and yield component studies at certain plant developmental stages. For root assays, soybean plants were grown in the greenhouse in transparent nutrient medium to allow the root system to be imaged and analyzed.
[0137] Corn developmental stages are defined by the following development criteria:
[0138] Developed leaf: leaf with a visible leaf collar;
[0139] V-Stages: Number of developed leaves on a corn plant corresponds to the plant's vegetative growth stage--i.e., a V6 stage corn plant has 6 developed (fully unfolded) leaves;
[0140] R1 (Silking): Plants defined as R1 must have one or more silks extending outside the husk leaves. Determining the reproductive stage of the crop plant at R1 or later is based solely on the development of the primary ear;
[0141] R3 (Milk): Typically occurs 18-22 days after silking depending on temperature and relative maturity. Kernels are usually yellow in color and the fluid inside each kernel is milky white;
[0142] R6 (Physiological maturity): Typically occurs 55-65 days after silking (depending on temperature and relative maturity group of the germplasm being observed). Kernels have reached their maximum dry matter accumulation at this point, and kernel moisture is approximately 35%.
[0143] Soybean developmental stages are defined by the following criteria:
[0144] Fully developed trifoliate leaf node: A leaf is considered completely developed when the leaf at the node immediately above it has unrolled sufficiently so the two edges of each leaflet are no longer touching. At the terminal node on the main stem, the leaf is considered completely developed when the leaflets are flat and similar in appearance to older leaves on the plant;
[0145] VC: Cotyledons and Unifoliolates are fully expanded;
[0146] R1: Beginning of flowering--i.e., one open flower at any node on the main stem.
[0147] Table 8 describes the trait assays. TraitRefID is the reference ID of each trait assay. Trait Assay Name is the descriptive name of the assay. The Description provides what the assay measures, and how the measurement is conducted. Direction For Positive Call indicates whether an increase or decrease in the measurement quantity corresponds to a "positive" call in the assay results.
TABLE 8 Description of Trait Assays
TraitRefID | Trait Assay Name | Description | Direction For Positive Call
HINDXR6 | Harvest Index at R6 | Ratio of grain weight to total plant weight at harvest. Weights are determined on a dry weight basis. | Increase
DBMSR6 | Dry Biomass by Seed at R6 | Ratio of grain weight to total plant weight at R6 stage. Weights are determined on a dry weight basis. | Increase
AGDWR6 | Total Dry Biomass at R6 | Total aboveground oven-dried biomass at R6. Plants are cut at ground level, oven-dried at 70 °C to a constant weight, and weighed. | Increase
DFL50 | Days from Planting to 50% Flowering | Days from Planting to 50% Flowering | Neutral
PDPPR8 | Number of Pods per Plant at R8 | Total pods per soybean plant. Quotient of count of pods from plants in a defined linear distance (20'') on a plot row divided by number of plants. | Increase
PDNODER8 | Pods per Node at R8 | Total pods per flowering node on a soybean plant. Quotient from count of pods on plants in a defined linear distance (20'') on a plot row divided by count of nodes on those plants. | Increase
ARDR2 | Average Root Diameter at R2 | Estimated average diameter of all root classes of root at R2 stage, using WinRHIZO (TM) image analysis system software. | Increase
RBNR2 | Root branch number at R2 | Number of root branches per plant determined by automated analysis of digitized root images from field root digs. | Increase
DOV12 | Days from Planting to V12 | number of days from the date of planting to the date when 50% of the plants in a plot reach V12 stage. | Decrease
EAR6 | Ear Area at R6 | plot average of size of area of an ear from a 2-dimensional view. The measurement is done through imaging of ear, including kernels and void. Typically 10 representative ears are measured per plot. Measurement is taken at R6 stage. | Increase
EDR6 | Ear Diameter at R6 | plot average of the ear diameter. It measures maximal "wide" axis over the ear on the largest section of the ear. Measurement is taken at R6 stage. | Increase
EDWR1 | Ear Dry Weight at R6 | plot average of the ear dry weight of a plant. Measurement is taken at R6 stage. | Increase
ELR6 | Ear Length at R6 | plot average of the length of ear. It measures from tip of ear in a straight line to the base at the ear node. Measurement is taken at R6 stage. | Increase
ETVR6 | Ear Tip Void Percentage at R6 | plot average of area percentage of void at the top 30% area of an ear, from a 2-dimensional view. The measurement is done through imaging of ear, including kernels and void. Typically 10 representative ears are measured per plot. Measurement is taken at R6 stage. | Decrease
EVR6 | Ear Void Percentage at R6 | plot average of area percentage of void on an ear, from a 2-dimensional view. The measurement is done through imaging of ear, including kernels and void. Typically 10 representative ears are measured per plot. Measurement is taken at R6 stage. | Decrease
KPER6 | Kernels per Ear at R6 | plot average of the number of kernels per ear. It is calculated as (total kernel weight)/(Single Kernel Weight * total ear count), where total kernel weight and total ear count are measured from ear samples from an area between 0.19 to 10 square meters, and Single Kernel Weight (SKWTR6) is described below. Measurement is taken at R6 stage. | Increase
KRLR6 | Kernels per Row Longitudinally at R6 | (also known as rank number) the plot average of the number of kernels per row longitudinally. It is calculated as the ratio of (total kernel count per ear)/(kernel row number). Measurement is taken at R6 stage. | Increase
KRNR6 | Kernel Row Number at R6 | plot average of the number of rows of kernels on an ear, by counting around the circumference of the ear. Measurement is taken at R6 stage. | Increase
LFTNR3 | Leaf Tip Number at R3 | plot average of the number of leaves per plant, by counting the number of leaf tips. Measurement is taken at R3 stage. | Increase
P50DR1 | Days to 50% Pollen Shedding | number of days from the date of planting to the date when 50% of the plants in a plot reach Pollen Shed stage. | Decrease
PHTR3 | Plant Height at R3 | plot average of plant height. It measures from soil line to base of highest collared leaf. Measurement is taken at R3 stage. | Decrease
PLTHGR | Plant Height Growth Rate from V6 to V12 | plot average of growth rate of a plant from V6 to V12 stage. It is calculated as (Plant Height measured at V12 - Plant Height measured at V6)/Days between measurements. | Increase
RBPN | Root Branch Point Number at VC or V2 | number of root branch tip points of a plant. The measurement is done through imaging of the root system of a plant grown in a transparent Gelzan(TM) gum gel nutrient medium to VC stage for soybean, or to V2 stage for corn. The root system image is skeletonized for the root length measurement. Up to 40 images are taken at various angles around the root vertical axis and measurement is averaged over the images. Gelzan is a trademark of CP Kelco U.S., Inc. | Increase
RTL | Root Total Length at VC or V2 | cumulative length of roots of a plant, as if the roots were all lined up in a row. The measurement is done through imaging of the root system of a plant grown in a transparent Gelzan(TM) gum gel nutrient medium to VC stage for soybean, or to V2 stage for corn. The root system image is skeletonized for the root length measurement. Up to 40 images are taken at various angles around the root vertical axis and measurement is averaged over the images. Gelzan is a trademark of CP Kelco U.S., Inc. | Increase
S50DR1 | Days to 50% Visible Silk | number of days from the date of planting to the date when 50% of the plants in a plot reach visible Silking (R1) stage. | Decrease
SKWTR6 | Single Kernel Weight at R6 | plot average of weight per kernel. It is calculated as the ratio of (sample kernel weight adjusted to 15.5% moisture)/(sample kernel number). The sample kernel number ranges from 350 to 850. Measurement is taken at R6 stage. | Increase
STDIR3 | Stalk Diameter at R3 | plot average of the stalk diameter of a plant. It measures maximal "long" axis in the middle of the internode above first visible node. Measurement is taken at R3 stage. | Increase
EDWPPR6 | Ear Dry Weight Per Plant at R6 | plot average of the ear dry weight of a plant. Measurement is taken at R6 stage. | Increase
SPPR8 | Seeds per Plant at R8 | The number of seeds per plant at developmental stage R8 (maturity stage) | Increase
SW1000 | Weight of 1000 seeds | The weight of one thousand seeds | Increase
PDDWR6 | Pod Dry Weight at R6 | The weight of hand harvested pods from a plot at developmental stage R6 | Increase
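Several of the Table 8 assays (SKWTR6, KPER6, KRLR6 and PLTHGR) are defined by explicit ratios. The sketch below writes those ratios out as functions; the function names, units, and sample numbers are hypothetical, and the sample kernel weight is assumed to be already adjusted to 15.5% moisture before it is passed in.

```python
def single_kernel_weight(sample_kernel_weight, sample_kernel_number):
    """SKWTR6: moisture-adjusted sample kernel weight / sample kernel number."""
    return sample_kernel_weight / sample_kernel_number


def kernels_per_ear(total_kernel_weight, skw, total_ear_count):
    """KPER6: total kernel weight / (single kernel weight * total ear count)."""
    return total_kernel_weight / (skw * total_ear_count)


def kernels_per_row(kernel_count_per_ear, kernel_row_number):
    """KRLR6: total kernel count per ear / kernel row number."""
    return kernel_count_per_ear / kernel_row_number


def plant_height_growth_rate(height_v12, height_v6, days_between):
    """PLTHGR: (height at V12 - height at V6) / days between measurements."""
    return (height_v12 - height_v6) / days_between


# Hypothetical plot: 500 sampled kernels weighing 150 g; 10 ears with 1500 g of kernels.
skw = single_kernel_weight(150.0, 500)
kpe = kernels_per_ear(1500.0, skw, 10)
print("single kernel weight (g):", skw)
print("kernels per ear:", kpe)
print("kernels per row (16 rows):", kernels_per_row(kpe, 16))
print("growth rate (cm/day):", plant_height_growth_rate(180.0, 60.0, 20))
```

Because KPER6 is derived from SKWTR6, any error in the moisture adjustment of the kernel sample propagates into both values.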
[0148] These trait assays were set up so that the tested transgenic lines were compared to a control line. The collected data were analyzed against the control, and a result was scored as positive if the p-value was 0.2 or less. Tables 9-12 summarize the results for transgenic plants comprising the disclosed recombinant DNA constructs: corn phenology and morphometrics assays (Table 9), corn yield/trait component assays (Table 10), soybean phenology, morphometrics, and yield/trait component assays (Table 11), and corn and soybean root assays (Table 12).
[0149] The test results are represented by three numbers: the first number, before the letter "p", denotes the number of tests of events with a "positive" change as defined in Table 8; the second number, before the letter "n", denotes the number of tests of events with a "negative" change, which is in the opposite direction of "positive" as defined in Table 8; the third number, before the letter "t", denotes the total number of tests of transgenic events for a specific assay for a given gene. The "positive" or "negative" change is measured in comparison to non-transgenic control plants. A designation "--" indicates that the assay has not been tested. For example, 2p1n5t indicates that 5 transgenic plant events were tested, of which 2 events showed a "positive" change and 1 showed a "negative" change of the measured parameter. The assay is indicated with its TraitRefID as in Table 8.
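The result notation described above can be read and written mechanically. The following is a minimal sketch, assuming hypothetical helper names (`parse_result`, `format_result`) and a hypothetical "neutral" label for tests that are neither positive nor negative; it is not part of the disclosed assay pipeline.

```python
import re
from collections import Counter

# Matches codes such as "2p1n5t": positives, negatives, total tests.
_RESULT_RE = re.compile(r"^(\d+)p(\d+)n(\d+)t$")


def parse_result(code):
    """Parse a result code into (positive, negative, total).

    Returns None for an untested entry ("-" or "--").
    """
    code = code.strip()
    if code in ("-", "--"):
        return None
    match = _RESULT_RE.match(code)
    if match is None:
        raise ValueError(f"unrecognized result code: {code!r}")
    positive, negative, total = (int(group) for group in match.groups())
    if positive + negative > total:
        raise ValueError(f"positives plus negatives exceed total in {code!r}")
    return positive, negative, total


def format_result(calls):
    """Build a result code from a list of per-test calls.

    Each call is 'positive', 'negative', or 'neutral' (hypothetical label).
    """
    counts = Counter(calls)
    return f"{counts['positive']}p{counts['negative']}n{len(calls)}t"
```

Parsing "2p1n5t" yields 2 positive tests, 1 negative test, and 5 tests in total, which matches the worked example in the paragraph above.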
TABLE 9. Summary of assay results for corn phenology and morphometric trait assays.
Construct ID | P50DR1 | S50DR1 | KRNR6 | KRLR6
TX7G1c07 | -- | -- | 0p4n8t | 0p2n8t
TX7G1c10 | -- | -- | 2p4n8t | 0p2n8t
TX7G5c1 | 1p0n4t | 1p1n4t | -- | --
TX7G3c3 | 6p2n10t | 2p0n10t | 0p8n10t | 0p0n10t
TABLE 10. Summary of results for corn trait component assays.
Construct ID | EAR6 | EDR6 | EDWPPR6 | ERDWAR6 | ELR6
TX7G1c07 | 0p2n8t | 2p2n8t | -- | -- | 0p2n8t
TX7G1c10 | 0p6n8t | 0p4n8t | -- | -- | 0p6n8t
TX7G8c2 | 0p1n4t | 0p1n4t | 0p0n4t | 0p0n4t | 0p0n4t
TX7G5c1 | 1p5n16t | 1p3n16t | 2p1n12t | 0p3n12t | 1p2n16t
TX7G3c3 | 6p0n19t | 0p13n19t | 0p2n4t | 2p0n4t | 11p0n19t
TX7G3c4 | 0p1n4t | 0p1n4t | 0p1n4t | 0p0n4t | 0p1n4t
TX7G5c3 | 0p2n7t | 0p2n7t | 1p1n7t | 0p1n7t | 0p1n7t

Construct ID | ETVR6 | EVR6 | HINDXR6 | KPER6 | SKWTR6 | AGDWR6
TX7G1c07 | 0p0n8t | 0p0n8t | -- | 0p2n8t | 0p0n8t | --
TX7G1c10 | 4p0n8t | 4p0n8t | -- | 0p4n8t | 0p4n8t | --
TX7G8c2 | 2p0n4t | 2p0n4t | 0p0n4t | 0p0n4t | 1p0n4t | 0p1n4t
TX7G5c1 | 7p0n16t | 5p0n16t | 9p5n28t | 0p1n16t | 0p3n16t | 0p2n12t
TX7G3c3 | 4p4n23t | 12p0n24t | 0p2n4t | 2p2n19t | 3p0n19t | 2p0n4t
TX7G3c4 | 0p2n8t | 0p0n8t | 0p0n4t | 1p1n4t | 0p2n4t | 0p0n4t
TX7G5c3 | 3p1n7t | 4p0n7t | 2p0n7t | 0p1n7t | 2p1n7t | 0p1n7t
TABLE 11. Summary of results for soybean phenology, morphometrics and trait component assays.
Construct ID | DBMSR6 | PDPPR8 | PHTR8 | PDDWR6 | PDNODER8 | SW1000 | SPPR8 | AGDWR6
TX7G3c5 | 0p0n8t | -- | -- | 4p3n12t | -- | 0p6n12t | 0p5n12t | 4p3n12t
TX7G5c2 | -- | -- | -- | -- | -- | 2p0n4t | 0p1n4t | 0p1n4t
TX7G8c1 | -- | -- | -- | -- | -- | 0p1n4t | 0p0n4t | 0p0n4t
TX7G9c1 | -- | -- | -- | -- | -- | 0p1n6t | 0p1n6t | --
TX7G7c1 | 0p0n8t | -- | -- | 0p0n8t | -- | 0p6n8t | 0p2n8t | 4p0n8t
TX7G1c04 | -- | 0p3n4t | 0p4n4t | -- | 0p4n4t | 2p0n4t | 0p4n4t | --
TX7G1c01 | -- | 0p4n4t | 0p4n4t | -- | 1p1n4t | 4p0n4t | 0p4n4t | --
TX7G1c02 | -- | 0p2n4t | 2p1n4t | -- | 2p1n4t | 1p1n4t | 0p1n4t | --
TX7G6c1 | -- | -- | -- | -- | -- | 0p4n8t | 2p2n8t | 0p2n8t
TX7G1c03 | -- | 0p4n4t | 0p2n4t | -- | 0p4n4t | 4p0n4t | 0p4n4t | --
TABLE 12. Summary of results for corn and soybean root assays.
Crop | Construct ID | RTL | RBPN
Corn | TX7G1c09 | 0p1n4t | 0p3n4t
Soybean | TX7G9c1 | 2p0n4t | 2p0n4t
Example 5. Phenotypic Evaluation of Transgenic Plants in Field Trials for Increased Nitrogen Use Efficiency, Increased Water Use Efficiency, and Increased Yield
[0150] Corn field trials were conducted to identify genes that can improve nitrogen use efficiency (NUE) under nitrogen-limiting conditions, leading to increased yield performance as compared to non-transgenic controls. For the nitrogen field trial results shown in Table 13, each field was planted under a nitrogen-limiting condition (60 lbs/acre), and corn ear weight or yield was compared to non-transgenic control plants.
[0151] Corn field trials can be conducted to identify genes that can improve water use efficiency (WUE) under water-limiting conditions, leading to increased yield performance as compared to non-transgenic controls. The corn ear weight or yield can be compared to non-transgenic control plants.
[0152] Corn and soybean field trials were conducted to identify genes that can improve broad-acre yield (BAY) under standard agronomic practice. Results of the broad-acre yield trials conducted under standard agronomic practice are shown in Table 13, and the corn or soybean yield was compared to non-transgenic control plants.
[0153] Table 13 provides a list of genes that produce transgenic plants having increased nitrogen use efficiency (NUE), and/or increased broad-acre yield (BAY) as compared to a control plant. Polynucleotide sequences in constructs with at least one event showing significant yield or ear weight increase across multiple locations at p ≤ 0.2 are included. The genes were expressed with constitutive promoters unless noted otherwise under the "Specific Expression Pattern" column. A promoter of a specific expression pattern was chosen over a constitutive promoter, based on the understanding of the gene function, or based on the observed lack of significant yield increase when the gene was expressed with constitutive promoter. The elements of Table 13 are described as follows: "Crop" refers to the crop in trial, which is either corn or soybean; "Condition" refers to the type of field trial, which is BAY for broad acre yield trial under standard agronomic practice (SAP), and NUE for nitrogen use efficiency trial; "Construct ID" refers to the construct identifier as defined in Table 2; "Gene ID" refers to the gene identifier as defined in Table 1; "Yield Results" refers to the recombinant DNA in a construct with at least one event showing significant yield increase at p ≤ 0.2 across locations. The first number refers to the number of tests of events with significant yield or ear weight increase, whereas the second number refers to the total number of tests of events for each recombinant DNA in the construct. Typically 4 to 8 distinct events per construct are tested.
TABLE 13. Yield and nitrogen use efficiency with protein-coding transgenes.
Crop | Condition | Construct ID | Gene ID | Yield Results
Soybean | BAY | TX7G3c1 | TX7G3 | 3/18
Corn | BAY | TX7G3c2 | TX7G3 | 2/18
Corn | BAY | TX7G3c3 | TX7G3 | 13/49
Corn | NUE | TX7G3c3 | TX7G3 | 2/8
Soybean | BAY | TX7G2c2 | TX7G2 | 0/8
Corn | BAY | TX7G3c4 | TX7G3 | 0/8
Soybean | BAY | TX7G7c1 | TX7G7 | 0/6
Soybean | BAY | TX7G2c3 | TX7G2 | 4/18
Corn | BAY | TX7G2c4 | TX7G2 | 0/6
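The "positive tests/total tests" entries of Table 13 can be tallied per crop, condition, and gene. The sketch below is illustrative only: the row data are copied from Table 13 (crop names written in the singular), while the function names and the choice to pool constructs of the same gene are assumptions made for the example, not an analysis performed in this disclosure.

```python
from fractions import Fraction

# Rows copied from Table 13: crop, condition, construct ID, gene ID, yield result.
TABLE_13 = [
    ("Soybean", "BAY", "TX7G3c1", "TX7G3", "3/18"),
    ("Corn", "BAY", "TX7G3c2", "TX7G3", "2/18"),
    ("Corn", "BAY", "TX7G3c3", "TX7G3", "13/49"),
    ("Corn", "NUE", "TX7G3c3", "TX7G3", "2/8"),
    ("Soybean", "BAY", "TX7G2c2", "TX7G2", "0/8"),
    ("Corn", "BAY", "TX7G3c4", "TX7G3", "0/8"),
    ("Soybean", "BAY", "TX7G7c1", "TX7G7", "0/6"),
    ("Soybean", "BAY", "TX7G2c3", "TX7G2", "4/18"),
    ("Corn", "BAY", "TX7G2c4", "TX7G2", "0/6"),
]


def parse_yield_result(text):
    """Split 'positive/total' into two integers."""
    positive, total = (int(part) for part in text.split("/"))
    if total <= 0 or positive > total:
        raise ValueError(f"invalid yield result: {text!r}")
    return positive, total


def pooled_positive_rate(rows):
    """Pool tests by (crop, condition, gene) and return the positive fraction."""
    pooled = {}
    for crop, condition, _construct, gene, result in rows:
        positive, total = parse_yield_result(result)
        entry = pooled.setdefault((crop, condition, gene), [0, 0])
        entry[0] += positive
        entry[1] += total
    return {key: Fraction(p, t) for key, (p, t) in pooled.items()}
```

Pooling this way is only one possible summary; the disclosure itself reports results per construct.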
Example 6. Homolog Identification
[0154] This example illustrates the identification of homologs of proteins encoded by the DNA sequences identified in Table 1, which were used to provide transgenic seed and plants having enhanced agronomic traits. From the sequences of the homolog proteins, corresponding homologous DNA sequences can be identified for preparing additional transgenic seeds and plants with enhanced agronomic traits.
[0155] An "All Protein Database" was constructed of known protein sequences using a proprietary sequence database and the National Center for Biotechnology Information (NCBI) non-redundant amino acid database (nr.aa). For each organism from which a polynucleotide sequence provided herein was obtained, an "Organism Protein Database" was constructed of known protein sequences of the organism; it is a subset of the All Protein Database based on the NCBI taxonomy ID for the organism.
[0156] The All Protein Database was queried using amino acid sequences provided in Table 1 using NCBI "blastp" program with E-value cutoff of 1e-8. Up to 1000 top hits were kept, and separated by organism names. For each organism other than that of the query sequence, a list was kept for hits from the query organism itself with a more significant E-value than the best hit of the organism. The list contains likely duplicated genes of the polynucleotides provided herein, and is referred to as the Core List. Another list was kept for all the hits from each organism, sorted by E-value, and referred to as the Hit List.
[0157] The Organism Protein Database was queried using polypeptide sequences provided in Table 1 using NCBI "blastp" program with E-value cutoff of 1e-4. Up to 1000 top hits were kept. A BLAST searchable database was constructed based on these hits, and is referred to as "SubDB". SubDB was queried with each sequence in the Hit List using NCBI "blastp" program with E-value cutoff of 1e-8. The hit with the best E-value was compared with the Core List from the corresponding organism. The hit is deemed a likely ortholog if it belongs to the Core List; otherwise it is deemed not a likely ortholog and there is no further search of sequences in the Hit List for the same organism. Homologs with at least 95% identity over 95% of the length of the polypeptide sequences provided in Table 1 are reported below in Tables 14 and 15.
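The Core List, Hit List, and ortholog decision described in the two preceding paragraphs can be sketched over already-computed search results. This is a simplified illustration under stated assumptions: no BLAST program is invoked, the hit tuples and the `best_subdb_hit` lookup are hypothetical stand-ins for the two blastp searches, and all function names are invented for the example.

```python
from collections import defaultdict


def build_core_and_hit_lists(hits, query_organism):
    """Build per-organism Core Lists and Hit Lists.

    hits: iterable of (subject_id, organism, evalue) from the first search
    against the All Protein Database (already filtered by E-value cutoff).
    """
    by_organism = defaultdict(list)
    for subject_id, organism, evalue in hits:
        by_organism[organism].append((evalue, subject_id))
    for entries in by_organism.values():
        entries.sort()  # most significant (smallest E-value) first

    own_hits = by_organism.get(query_organism, [])
    core_lists, hit_lists = {}, {}
    for organism, entries in by_organism.items():
        if organism == query_organism:
            continue
        best_evalue = entries[0][0]
        # Query-organism hits more significant than this organism's best hit.
        core_lists[organism] = {sid for ev, sid in own_hits if ev < best_evalue}
        hit_lists[organism] = [sid for _, sid in entries]
    return core_lists, hit_lists


def likely_orthologs(core_lists, hit_lists, best_subdb_hit):
    """Walk each Hit List in order and keep candidates whose best SubDB hit
    is in the Core List; stop at the first candidate that fails.

    best_subdb_hit: callable returning the best SubDB hit ID for a candidate
    (hypothetical stand-in for the second blastp search).
    """
    result = {}
    for organism, candidates in hit_lists.items():
        accepted = []
        for candidate in candidates:
            if best_subdb_hit(candidate) in core_lists[organism]:
                accepted.append(candidate)
            else:
                break  # no further search of this organism's Hit List
        result[organism] = accepted
    return result
```

The early `break` mirrors the statement that no further sequences in the Hit List are searched for an organism once a candidate fails the reciprocal check.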
[0158] Table 14 provides a list of homolog genes, the elements of which are described as follows: "PEP SEQ ID NO." identifies an amino acid sequence; "Homolog ID" refers to an alphanumeric identifier, the numeric part of which is the NCBI GenBank GI number; and "Gene Name and Description" is a common name and functional description of the gene. Table 15 describes the correspondence between the protein-coding genes in Table 1 and their homologs, and the level of protein sequence alignment between the gene and its homolog.
TABLE 14. Homologous gene information
PEP SEQ ID NO. | Homolog ID | Gene Name and Description
30 | gi_1495273 | gi|1495273|emb|CAA90628.1| sugar transporter [Arabidopsis thaliana]
31 | gi_255645592 | gi|255645592|gb|ACU23290.1| [Glycine max]
32 | gi_15219062 | gi|15219062|ref|NP_176240.1| F-box family protein, containing similarity to MYB transcription factor isolog T01O24.1 [Arabidopsis thaliana]
33 | gi_4586310 | gi|4586310|dbj|BAA76344.1| isopentenyl transferase [Agrobacterium tumefaciens]
34 | gi_10955004 | gi|10955004|ref|NP_053424.1| hypothetical protein pTi-SAKURA_p186, isopentenyl transferase [Agrobacterium tumefaciens]
35 | gi_226529888 | gi|226529888|ref|NP_001146891.1| 2-isopropylmalate synthase B [Zea mays]
36 | gi_5042196 | gi|5042196|emb|CAB44641.1| isopentenyl transferase [Agrobacterium tumefaciens]
37 | gi_255638346 | gi|255638346|gb|ACU19485.1| [Glycine max]
38 | gi_4836905 | gi|4836905|gb|AAD30608.1|AC007369_18 Sugar transporter [Arabidopsis thaliana]
39 | gi_3023194 | gi|3023194|sp|Q96450|1433A_SOYBN 14-3-3-like protein A; AltName: SGF14A [Glycine max]
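The relationship between the "Homolog ID" and the "Gene Name and Description" entries of Table 14 can be shown by splitting a description string into its parts. The following is a minimal sketch; the regular expression and the field names are assumptions fitted to the ten entries of Table 14 and are not a general parser for sequence database identifiers.

```python
import re

# gi|<number>|<database>|<accession>|<optional locus> <description> [Organism]
_DEFLINE_RE = re.compile(
    r"^gi\|(?P<gi>\d+)\|(?P<db>[a-z]+)\|(?P<accession>[^|\s]*)\|"
    r"(?P<locus>\S*)\s*(?P<rest>.*)$"
)
_ORGANISM_RE = re.compile(r"\[([^\[\]]+)\]\s*$")


def parse_defline(defline):
    """Split a Table 14 description into identifier fields."""
    match = _DEFLINE_RE.match(defline.strip())
    if match is None:
        raise ValueError(f"unrecognized description: {defline!r}")
    rest = match.group("rest").strip()
    organism, description = None, rest
    organism_match = _ORGANISM_RE.search(rest)
    if organism_match:
        organism = organism_match.group(1)
        description = rest[: organism_match.start()].strip()
    return {
        # Homolog ID as used in Tables 14 and 15: "gi_" plus the GI number.
        "homolog_id": f"gi_{match.group('gi')}",
        "database": match.group("db"),
        "accession": match.group("accession"),
        "locus": match.group("locus"),
        "description": description,
        "organism": organism,
    }
```

Applied to the entry for PEP SEQ ID NO. 30, this yields the Homolog ID gi_1495273 and the organism Arabidopsis thaliana.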
TABLE 15. Correspondence of Genes and Homologs
Gene ID | Homolog ID | Percent Gene Coverage | Percent Homolog Coverage | Percent Identity
TX7G1 | gi_10955004 | 100 | 100 | 99
TX7G1 | gi_5042196 | 100 | 100 | 99
TX7G1 | gi_4586310 | 100 | 100 | 97
TX7G2 | gi_255645592 | 100 | 100 | 98
TX7G2 | gi_255638346 | 100 | 100 | 98
TX7G2 | gi_3023194 | 100 | 100 | 98
TX7G4 | gi_226529888 | 100 | 100 | 98
TX7G5 | gi_4836905 | 100 | 100 | 99
TX7G5 | gi_1495273 | 100 | 100 | 97
TX7G9 | gi_15219062 | 100 | 100 | 99
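The reporting threshold stated in Example 6 (at least 95% identity over 95% of the length of the Table 1 polypeptide) can be checked against the rows of Table 15. This sketch is illustrative: the data are copied from Table 15, the function names are hypothetical, and treating "Percent Gene Coverage" as the length criterion is an assumption about how the threshold maps onto the table columns.

```python
# Rows copied from Table 15:
# gene ID, homolog ID, % gene coverage, % homolog coverage, % identity.
TABLE_15 = [
    ("TX7G1", "gi_10955004", 100, 100, 99),
    ("TX7G1", "gi_5042196", 100, 100, 99),
    ("TX7G1", "gi_4586310", 100, 100, 97),
    ("TX7G2", "gi_255645592", 100, 100, 98),
    ("TX7G2", "gi_255638346", 100, 100, 98),
    ("TX7G2", "gi_3023194", 100, 100, 98),
    ("TX7G4", "gi_226529888", 100, 100, 98),
    ("TX7G5", "gi_4836905", 100, 100, 99),
    ("TX7G5", "gi_1495273", 100, 100, 97),
    ("TX7G9", "gi_15219062", 100, 100, 99),
]


def passes_threshold(gene_coverage, identity, min_coverage=95, min_identity=95):
    """True if a homolog meets the identity and coverage cutoffs.

    ASSUMPTION: the length criterion is applied to the gene (query) coverage.
    """
    return gene_coverage >= min_coverage and identity >= min_identity


def homologs_by_gene(rows):
    """Group qualifying homologs under their gene ID."""
    grouped = {}
    for gene, homolog, gene_cov, _homolog_cov, identity in rows:
        if passes_threshold(gene_cov, identity):
            grouped.setdefault(gene, []).append((homolog, identity))
    return grouped
```

Every row of Table 15 satisfies the cutoffs under this reading, which is consistent with the statement that only such homologs are reported.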
Example 7. Use of Site-Directed Integration to Introduce Transgenes or Modulate Expression of Endogenous Genes in Plants
[0159] As introduced above, a DNA sequence comprising a transgene(s), expression cassette(s), etc., such as one or more coding sequences of genes identified in Tables 1, 2 and 15, or homologs thereof, may be inserted or integrated into a specific site or locus within the genome of a plant or plant cell via site-directed integration. Recombinant DNA constructs and molecules of this disclosure may thus include a donor template having an insertion sequence comprising at least one transgene, expression cassette, or other DNA sequence for insertion into the genome of the plant or plant cell. Such donor template for site-directed integration may further include one or two homology arms flanking the insertion sequence to promote insertion of the insertion sequence at the desired site or locus. Any site or locus within the genome of a plant may be chosen for site-directed integration of the insertion sequence. Several methods for site-directed integration are known in the art involving different proteins (or complexes of proteins and/or guide RNA) that cut the genomic DNA to produce a double strand break (DSB) or nick at a desired genomic site or locus. Examples of site-specific nucleases that may be used include zinc-finger nucleases, engineered or native meganucleases, TALE-endonucleases, and RNA-guided endonucleases (e.g., Cas9 or Cpf1). For methods using RNA-guided site-specific nucleases (e.g., Cas9 or Cpf1), the recombinant DNA construct(s) will also comprise a sequence encoding one or more guide RNAs to direct the nuclease to the desired site within the plant genome. The recombinant DNA molecules or constructs of this disclosure may further comprise an expression cassette(s) encoding a site-specific nuclease, a guide RNA, and/or any associated protein(s) to carry out the desired site-directed integration event.
[0160] The endogenous genomic loci of a plant or plant cell corresponding to the genes identified in Tables 1 and 14, or a homolog thereof, may be selected for site-specific insertion of a recombinant DNA molecule or sequence capable of modulating expression of the corresponding endogenous genes. As described above, the recombinant DNA molecule or sequence serves as a donor template for integration of an insertion sequence into the plant genome. The donor template may also have one or two homology arms flanking the insertion sequence to promote the targeted insertion event. Although a transgene, expression cassette, or other DNA sequence may be inserted into a desired locus or site of the plant genome via site-directed integration, a donor template may instead be used to replace, insert, or modify a 5' untranslated region (UTR), upstream sequence, promoter, enhancer, intron, 3' UTR and/or terminator region of an endogenous gene, or any portion thereof, to modulate the expression level of the endogenous gene. Another method for modifying expression of an endogenous gene is by genome editing of an endogenous gene locus. For example, a targeted genome editing event may be made to disrupt or abolish a regulatory binding site for a transcriptional repressor of an endogenous gene to increase or modify expression of the endogenous gene.
[0161] For genome editing or site-specific integration of an insertion sequence of a donor template, a double-strand break (DSB) or nick is made in the selected genomic locus. The DSB or nick may be made with a site-specific nuclease, for example a zinc-finger nuclease, an engineered or native meganuclease, a TALE-endonuclease, or an RNA-guided endonuclease (for example Cas9 or Cpf1). In the presence of a donor template, the DSB or nick may be repaired by homologous recombination between the homology arms of the donor template and the plant genome, resulting in site-directed integration of the insertion sequence to make a targeted genomic modification or insertion at the site of the DSB or nick. For genes shown herein to cause or produce a desired phenotype or trait in a plant, an expression construct or transgene comprising the coding sequence of the gene operably linked to a plant expressible promoter may be inserted at a desired or selected site within the genome of the plant via site-directed integration as discussed above. Alternatively, the sequence of a corresponding endogenous gene, such as within a regulatory region of the endogenous gene, may be modified via genome editing or site-directed integration to augment or alter the expression level of the endogenous gene, such as by adding a promoter or intron sequence, or by modifying or replacing a 5' UTR sequence, promoter, enhancer, transcription factor or repressor binding site, intron, 3' UTR sequence, and/or terminator region, or any portion thereof, of the endogenous gene.
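The arrangement of a donor template (an insertion sequence flanked by two homology arms) and its integration at the break site can be modeled on plain strings. The following is a highly idealized sketch, not a description of the biological repair process or of any tool used in this disclosure: the function names, the toy sequences, and the exact-match arm lookup are all assumptions made for illustration.

```python
def build_donor_template(genome, cut_site, insertion, arm_length):
    """Return (left_arm, insertion, right_arm) for a cut at index cut_site.

    The arms are copied from the genome sequence on either side of the cut.
    """
    if not 0 <= cut_site <= len(genome):
        raise ValueError("cut_site is outside the genome sequence")
    left_arm = genome[max(0, cut_site - arm_length):cut_site]
    right_arm = genome[cut_site:cut_site + arm_length]
    return left_arm, insertion, right_arm


def integrate(genome, donor):
    """Insert the donor's insertion sequence where its arms match the genome.

    Idealized: requires an exact match of the joined arms and uses the
    first occurrence found.
    """
    left_arm, insertion, right_arm = donor
    junction = genome.find(left_arm + right_arm)
    if junction < 0:
        raise ValueError("homology arms do not match the genome")
    cut = junction + len(left_arm)
    return genome[:cut] + insertion + genome[cut:]
```

With the toy genome "AAACCCGGGTTT", a cut at index 6, arms of length 3, and insertion "ATGTAG", the donor is ("CCC", "ATGTAG", "GGG") and the edited sequence is "AAACCCATGTAGGGGTTT".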
[0162] Following transformation of a plant cell with a recombinant molecule(s) or construct(s), the resulting events are screened for site-directed insertion of the donor template insertion sequence or genome modification. Plants containing these confirmed edits, events or insertions may then be tested for modulation of an endogenous gene, expression of an integrated transgene and/or modification of yield traits or other phenotypes.
Sequence CWU
1
1
391723DNAAgrobacterium tumefaciens 1atggatctgc gtctaatttt cggtccaact
tgcacaggaa agacgtcgac cgcggtagct 60cttgcccagc agactgggct tccagtcctt
tcgctcgatc gggtccaatg ttgtcctcag 120ctgtcaaccg gaagcggacg accaacagtg
gaagaactga aaggaacgag ccgtctatac 180cttgatgatc ggcctctggt gaagggtatc
atcgcagcca agcaagctca tgaaaggctg 240atgggggggg tgtataatta tgaggcccac
ggcgggctta ttcttgaggg aggatctatc 300tcgttgctca agtgcatggc gcaaagcagt
tattggagtg cggattttcg ttggcatatt 360attcgccacg agttagcaga cgaagagacc
ttcatgaacg tggccaaggc cagagttaag 420cagatgttac gccctgctgc aggcctttct
attatccaag agttggttga tctttggaaa 480gagcctcggc tgaggcccat actgaaagag
atcgatggat atcgatatgc catgttgttt 540gctagccaga accagatcac atccgatatg
ctattgcagc ttgacgcaga tatggaggat 600aagttgattc atgggatcgc tcaggagtat
ctcatccatg cacgccgaca agaacagaaa 660ttccctcgag ttaacgcagc cgcttacgac
ggattcgaag gtcatccatt cggaatgtat 720tag
7232774DNAGlycine max 2atggcggatt
cttctcggga ggagaacgtt tacatggcga aattggcgga gcaggccgag 60cgttacgagg
agatggttga gttcatggag aaggttgcaa agactgtgga ggttgaggag 120ttgacggtgg
aggagaggaa tctcctctct gtggcttaca agaacgtgat tggtgcgagg 180agggcttcgt
ggaggatcat atcctccatt gagcagaagg aggagagcag gggcaatgag 240gaccacgtgg
ccattataaa ggagtacagg ggcaaaattg aggctgaact cagcaagatc 300tgtgatggga
ttttgaacct ccttgagtcc aacctcattc cttccgctgc atctcccgag 360agcaaagtct
tttaccttaa aatgaagggt gattaccaca ggtaccttgc tgagttcaag 420accggggcag
agaggaaaga ggctgcagag agtactttgc ttgcttacaa atccgctcag 480gatattgctc
ttgctgactt ggcccccact caccccatta ggttgggact tgctctcaac 540ttttctgtgt
tctattatga aatccttaac tcgccagatc gtgcttgtaa tcttgccaag 600caggcatttg
atgaggcaat ttccgagctt gacacattgg gtgaagagtc atacaaagat 660agtacattga
tcatgcaact tctccgtgac aatctgactt tgtggacatc agacatcacg 720gacgatgctg
gagatgagat caaggaaaca tctaagcaac aaccaggcga atag
7743750DNANostoc sp. PCC 7120 3atgaagccat ttctttttgt caccgatttg
gatcatactc tggtaggtaa tgatgcagcc 60ctggcagaac tcagccagat actcactcac
catcgtcaag aatatggcac aaagatagtt 120tatgccactg ggcgatcgcc tattctttac
aaagaactgc aagtagaaaa aaacctgata 180gaacctgatg ggttagtttt gtctgtgggt
acggaaatct atcttgatgg tagtggtaat 240cctgattctg actggtcaga aattcttaac
gatggctgga atcgagaact agtattgtcc 300gtaactaaaa aatttcctga attaatgctg
caaccagact cggaacaacg tccttttaaa 360gtcagttttt ttctgcatca agaagcctca
tttaaggtca taccacaact tgagacagag 420ttagcgaaat gtaaactaaa tataaagtta
atttatagta gcggtataga ccttgacatt 480gtaccattaa acagcgataa aggtcaggca
atgcagtttc ttcgtcaaaa gtggaaattt 540gcagcagaaa gaacagttgt ctgtggtgat
tcaggtaatg atattgcttt gttcgctgtg 600ggcaacgaaa ggggaatcat cgtcgggaat
gcccgaccgg agttgcttca gtggcacagt 660gagtatcccg cagaccatcg ctacctggca
aaaaactttt gtgcaggtgg aattattgaa 720ggtttacaat tctttggttt cctcgaatag
75041881DNAZea mays 4atggcgtcct
cgctgctctc ctcccccgct aaacccacca tcaccaccac caccaaaacg 60accccggccc
caagacccgc ccgctccgcc catgtccatg tcctctccgc cgcccgctgc 120ctccgcctcc
gcctccgcgc gtcgtcgcag catcctccgc ctccgcccac cccgcggtcg 180cggcggccgg
attacgtccc gaaccgcatc gacgacccca actacgtgcg catcttcgac 240accacgctgc
gcgacgggga gcagtcgccg ggcgccacca tgacgagcgc ccagaagctc 300gtcgtcgcgc
gccagctggc ccgcctcggc gtcgacatca tcgaggccgg gttcccggcc 360tcctcccccg
acgacctcga cgccgtgcgc tccatcgcca tcgaggtcgg caacccggcg 420ccaggacccg
ccggggagga ggacgccgtc cacgtgccgg tcatctgcgg cctctcgcgg 480tgcaaccgga
aggacatcga cgccgcgtgg gaggccgtgc gccacgcgcg ccgcccccgg 540atccacacct
tcatcgccac cagcgacatt cacatgcagc ataagctcag gaagacgccc 600gaccaggtgg
tcgccattgc cagggagatg gtggcctacg cccgcagcct cggatgcact 660gacgtcgagt
tcagccccga ggacgccggc aggtcaaata gagagttctt gtatcatatt 720ctaggggaag
tcataaaagc tggagctacg actctcaata tcccggacac tgtcggatac 780aatcttcctt
atgaatttgg aaagttgatt gctgatataa aggcaaacac tcctggaatt 840gaaaaggcta
tcatttccac tcattgtcag aatgaccttg gtcttgcgac tgccaacaca 900ctagcgggcg
ctcgtgcagg agcacggcag ttagaggtta ctattaatgg tattggtgaa 960agggctggaa
atgcttcttt ggaggaggtt gtcatggcaa ttaaatgccg cagagaactg 1020ttagatggtc
tctatactgg aatcgattcc cgacatatca ctttgacgag caaaatggtg 1080caagagcata
gtggacttca cgtgcagcca cataaagcta ttgttggtgc caatgcgttc 1140gctcatgaaa
gtggaattca tcaggatggg atgcttaaat ataaaggaac atacgaaatt 1200atatcgcctg
atgatattgg tctaacacgt gcaaatgaat ttggtattgt tcttgggaaa 1260ctcagcggaa
ggcatgctgt gagatctaag ctagtagagc ttggatatga aatcggtgac 1320aaggaatttg
aggacttctt taaacgctac aaagaggttg cagagaagaa aaagcgcgta 1380actgatgaag
acttagaagc gttattgtca gatgagatat tccagcctaa ggttatttgg 1440tcccttgctg
atgtacaggc aacatgtggt acacttgctt tatctacggc aacagtgaaa 1500ttggtagcac
cagatggaga ggagaaaata gcatgttcag tcggaacagg tccagtcgat 1560gcagcttaca
aggctgttga caaaataatc cagattccaa cggttctccg agaatacagt 1620atgacatcag
tcacagaagg cattgacgca atcgcgacaa ctcgggttgt tgtcactgga 1680gatgtgagca
acaacgccaa acatgccctg actggccagt ccttcaaccg ctccttcagc 1740gggagcgggg
catccatgga cattgtggtg tccagtgtca gagcttacct gagcgccctg 1800aacaagatct
gcagtttcgc tggcgccgtg aaagccagca gcgatgtagc tgagaccgca 1860agcgtcccga
gcacagaatg a
188152205DNAArabidopsis thaliana 5atgaagggag cgactctcgt tgctctcgcc
gccacaatcg gcaatttctt acaaggatgg 60gacaatgcca ccattgctgg agctatggtt
tatatcaaca aagacttgaa tctaccaacc 120tctgttcaag gtcttgtcgt tgctatgtca
ttgatcggtg caacggtcat cacgacttgc 180tcaggaccga tatctgattg gctcggcaga
cgccccatgc tcattttatc atcagttatg 240tatttcgtct gcggtttgat aatgttgtgg
tctcccaatg tctatgttct gtgctttgct 300aggcttctta atgggtttgg tgccgggctc
gcggttacac ttgtccctgt ttacatttct 360gaaaccgctc ctccggagat cagaggacag
ttaaatactc tccctcagtt tcttggctct 420ggtggaatgt ttttgtcata ctgtatggtt
ttcactatgt ccctgagtga ctcccctagc 480tggagagcca tgctcggtgt cctctcgatc
ccttctcttc tttatttgtt tctcacggtg 540ttttatttgc ccgagtctcc tcgttggctg
gttagtaaag gaagaatgga cgaggctaag 600cgagttcttc aacagttatg tggcagagaa
gatgttaccg atgagatggc tttactagtt 660gaaggactag atataggagg agaaaaaaca
atggaagatc tcttagtaac tttggaggat 720catgaaggtg atgatacact tgaaaccgtt
gatgaggatg gacaaatgcg gctttatgga 780acccacgaga atcaatcgta ccttgctaga
cctgtcccag aacaaaatag ctcacttggg 840ctacgctctc gccacggaag cttagcaaac
caaagcatga tccttaaaga tccgctcgtc 900aatctttttg gcagtctcca cgagaagatg
ccagaagcag gcggaaacac tcggagtggg 960attttccctc atttcggaag catgttcagt
actactgccg atgcgcctca cggtaaaccg 1020gctcattggg aaaaggacat agagagccat
tacaacaaag acaatgatga ctatgcgact 1080gatgatggtg cgggtgatga tgatgactcg
gacaacgatt tgcgtagccc cttaatgtcg 1140cgccagacca caagcatgga caaggatatg
atcccacatc ctacaagtgg aagcacttta 1200agcatgagac gacacagtac gcttatgcaa
ggcaacggcg aaagtagcat gggaattggt 1260ggtggttggc atatgggata tagatacgaa
aacgatgaat acaagaggta ttatcttaaa 1320gaagatggag ctgaatctcg ccgtggctcg
atcatctcta ttcccggagg tccggatggt 1380ggaggcagct acattcacgc ttctgccctt
gtaagcagat ctgttcttgg tcctaaatca 1440gttcatggat ccgccatggt tcccccggag
aaaattgctg cctctggacc actctggtct 1500gctcttcttg aacctggtgt taagcgtgcc
ttggttgttg gtgtcggcat tcaaatactg 1560cagcagtttt caggtatcaa tggagttctc
tactacactc ctcagattct cgaacgggct 1620ggcgtagata ttcttctttc gagcctcgga
ctaagttcca tctctgcgtc attcctcatc 1680agcggtttaa caacattact catgctccca
gccattgtcg ttgccatgag actcatggat 1740gtatccggaa gaaggtcatt acttctctgg
acaatcccag ttctcattgt ctcacttgtc 1800gtccttgtca tcagcgagct catccacatc
agcaaagtcg tgaacgcagc actctccaca 1860ggttgtgtcg tgctctactt ctgcttcttc
gtgatgggtt acggtcccat tccaaacatc 1920ctctgttctg aaatcttccc aacaagagtc
cgtggtctct gcatcgccat atgtgctatg 1980gtcttttgga ttggagacat tattgtcacg
tactcacttc ccgttctcct cagctcgatc 2040ggactagttg gtgttttcag catttacgct
gcggtttgcg ttatctcatg gatcttcgtt 2100tacatgaaag tcccggagac taaaggcatg
cctttggaag ttatcacaga ctactttgcc 2160tttggagctc aagctcaagc ttctgctcct
tctaaggata tataa 22056216DNAChlamydomonas reinhardtii
6atgggtgaag cggttctcat ctgctttagt ggagtgtctt gtggcgggtg tagaagggca
60tggagatcca ctcatgctgt tgaatgttcc ggagcattac aaagacttgc ggtcgcagcc
120ggtgctgtta tgcaagatgc gcgcgagcgt agagggatcg tgtgcactta catcaagcag
180tatgggaacg gagcagagcc ctgtcctgca caatga
21672760DNAArtificial SequenceTruncated corn sucrose phosphate synthase
gene (Zm.SPStruncated) 7atggcattcc agaggaactt ctctgacctt accgtctggt
ccgacgacaa taaggagaag 60aagctttaca ttgtgctcat cagtgtgcat ggtcttgttc
gtggcgaaaa catggaacta 120ggtcgtgatt ctgacaccgg tggccaggtg aaatatgttg
tcgaacttgc aagagcaatg 180tcaatgatgc ctggagtgta cagggtggac ctcttcactc
gtcaagtgtc atctcctgac 240gtggactgga gctatggtga gccgaccgag atgttatgct
ccggttccaa cgatggagag 300gggatgggtg agagtgccgg agcctacatt gtgcgcatac
cgtgtggacc acgggataaa 360tatctcaaga aggaagcact gtggccttac ctccaagagt
ttgtcgatgg agctcttgcg 420catattctta acatgtccaa ggctctggga gagcaggttg
gaaatgggag gccagtactg 480ccttacgtga tacatggaca ctatgccgac gctggagatg
ttgctgctct cctttccggt 540gcgctcaatg tacccatggt gctgactggt cactcacttg
ggaggaacaa gctggagcaa 600atcctgaagc aagggcgcat gtccaaggag gagatcgatt
caacatacaa gatcatgagg 660cgtatcgagg gtgaggagct ggccctggat gcgtcagagc
ttgtcatcac cagcacaagg 720caggagattg atgaacagtg gggattgtac gacggatttg
atgtcaagct tgagaaagtg 780ttgagggcac gggcgaggcg tggggttagc tgccatggtc
gtttcatgcc taggatggtg 840gtgattcctc caggaatgga tttcagcaat gttgtggttc
atgaagacat tgatggggat 900ggtgacagca aagatgatat cgttggtttg gagggtgctt
caccgaagtc aatgccccca 960atttgggccg aggtgatgcg gttcctaacc aatcctcaca
agccgatgat cctggcgctg 1020tcgaggccag acccgaagaa gaacatcact accctcgtca
aagcctttgg agagtgcccc 1080ccactcaggg aacttgcaaa ccttactcta atcatgggaa
acagagatga catcgacgat 1140atgtctgctg gcaatgccag tgtcctcacc acagttctga
agctgataga caagtatgat 1200ctgtatggaa gcgtagcgtt tcctaagcat cacaatcagg
ctgatgtccc ggagatctac 1260gccgtcgcgg ccaaaatgaa gggtgtcttc atcaaccctg
ctctcgttga gccgttcggt 1320ctcaccctga tcgaggctgc agcacacgga cttccaatag
tcgctaccaa gaatggtggt 1380ccggtcgaca ttacaactgc actgagcaac ggactgctcg
ttgacccgca cgaccagaac 1440gccatcgctc aagcactgct gaagctcgta gcagataaga
acctgtggca ggaatgccgg 1500agaaacgggc tgcgaaacat ccacctttac tcatggccgg
agcactgccg cacttacctc 1560accagggttg ctgggtgccg gttaaggaac ccgaggtggc
tgaaggacac accagcagat 1620gccggagcgg atgaggagga gttcctggag gattccatgg
acgctcagga cctgtcactc 1680cgtctgtcca tcgacggtga gaagagctcg ctgaacacca
acgacccatt gtcgtcggac 1740ccgcaggatc aggtgcagaa gatcatgaac aagatcaatc
agtcgtcagc acttccgccg 1800tccatgtcct cagtcgcaga cggtgccaag aacgcaaccg
agaccacggg cagcaccttg 1860aacaagtacc cactccctcg cggccggcgc ctgttcgtca
tcgccgtgga ctgctaccaa 1920gacgacggcc gtgctagcaa gaagatgctg caggtgatcc
aggaagtttt cagagcagtc 1980cgatcggact cccagatgtc caagatctca gggttcgcgc
tgtccactgc gatgccgttg 2040tccgaaacac tccagcttct gcagctcggc aagatcccag
cgaccgactt cgacaccctc 2100atctgtggca gtggcagcga ggtctactat cctggcacag
tgaactgcgt cgacgctgaa 2160ggaaagctgc gcccagacca ggactatctg atgcacatca
gccaccgctg gtcccatgac 2220ggcgcgaagc agaccatagc gaagctcatg gccactcagg
acggttcagg cgacactgtc 2280gagctggacc cggcgtctag taatgcacac tgcttcacgt
tccttatcaa agatcccaaa 2340aaggtgaaaa cggtcgatga gatgagggag aggctgagga
tgcgtggtct ccggtgccac 2400atcatgtact gcaggaactc gacaaggctt caggttgtcc
ctctgctagc atcaaggtca 2460caggcactca ggtatctttt tgtgcgctgg ggcctatatg
tggggaacat gtatctgatc 2520actggggaac atggcgacac cgatcatgag gagatgctat
ctgggttaca caagactgtg 2580attgtccggg gtgtcaccga gaaaggttcg gaagggctgc
tgaggagccc aggaagctac 2640aagaaggacg acgtcgtgcc gtctgagacc cccttggctg
cgtacacgac tggtgagctg 2700aaggccgatg agatcatgag ggctctgaaa caagtctcaa
agacttccag cggcatgtga 276081512DNAArabidopsis thaliana 8atggggtttg
atcccgagaa ccaatcgatc tcttccgttg gacaggttgt tggtgattct 60tcttcaggtg
ggattactgc tgaaaaggaa cctttgttaa aggaaaacca cagcccagag 120aactactctg
ttcttgcagc cattcctccg tttctctttc cagctcttgg agcattgctt 180tttggttatg
aaattggtgc aacatcttgt gctatcatgt ctcttaagtc gcctactcta 240agtggaattt
catggtacga cttgtcttca gtggatgttg gtataattac cagtggctca 300ctgtatggtg
ccttaattgg ctccattgtt gcatttagtg ttgccgacat tataggaagg 360agaaaggagc
tgattttggc tgcattcttg tatcttgttg gagccattgt gactgtagta 420gcacctgtct
tttccatact gataattgga cgagttacgt atggcatggg gattggactg 480accatgcacg
cggctccaat gtacattgca gagactgctc caagtcaaat acgtggacgg 540atgatatcac
taaaggaatt ctccactgtc cttgggatgg ttgggggtta tggaatcggt 600agcctttgga
ttacggttat ttctggttgg cgttacatgt acgcaacaat tctccctttt 660ccagttatta
tgggaactgg aatgtgttgg ctaccagcat ctccgaggtg gcttttactg 720cgcgctctcc
agggacaagg aaatggggag aatcttcaac aggctgcgat tagatctctt 780tgtcgcctta
gagggtctgt catagctgac tcagcagctg aacaagtaaa cgaaatattg 840gctgaacttt
cccttgtggg tgaagacaaa gaagctacat ttggtgaatt atttcgaggc 900aaatgcttga
aagctctcac tatagcagga gggttagtct tgttccaaca gataactggg 960caaccaagtg
tactatatta tgcaccatca atactacaga ctgccggctt ttctgctgca 1020gctgatgcaa
ctcggatctc aattctgctc ggcctattga agttggttat gacaggagtt 1080tctgtgatag
ttatcgacag agttggaagg agacctttac ttctttgtgg tgttagcgga 1140atggtgatct
cattgttcct cctgggatcc tactacatgt tttataaaaa tgtaccagct 1200gttgctgtag
ctgcattgct actgtatgta ggctgttacc agctgtcctt tggccctatt 1260ggttggctga
tgatttcaga gatatttccc ttaaaattaa gaggtagagg aatcagtcta 1320gcagtgcttg
tgaattttgg cgcaaacgca cttgtgacat tcgctttctc accgctaaag 1380gagctgttag
gagctggaat actgttctgt gcatttggag tgatatgtgt cgtgtctctc 1440ttcttcatat
actacattgt gccagagaca aagggtctca ctcttgaaga aattgaagcc 1500aaatgtctct
aa
15129852DNAArabidopsis thaliana 9atgaacggag gagaaaagtt agaatctatc
ccgattgatc tcattattga gatacattca 60agattaccag cggagtcagt cgcaaggttt
cgctgcgtgt cgaagctatg ggggtctatg 120tttcgccgtc catatttcac cgagctgttc
ttgaccaggt cgcgtgctcg tccacgtctc 180ttattcgtcc tccaacacaa ccgtaaatgg
agcttcagcg tcttctcttc gcctcaaaat 240cagaatatat atgtgaagcc gtcttttgta
gtagctgatt ttcacatgaa gttctctgta 300agcacgttcc cagattttca tagttgctct
ggtttgatcc atttctctat gatgaaaggc 360gcatatacag tgccggtggt atgtaactct
cgcacgggac aatatgcggt cctacctaaa 420ctgacaagga caaggtacga aaattcgtat
agctttgtag ggtatgatcc gattgagaag 480caaatcaagg tactgttcat gtctgatcca
gatagtggtg atgaccatag aattctgacg 540ttaggaacaa ctgaaaaaat gttggggagg
aagatcgaat gtagcttaac ccataatata 600ttgtctaatg aaggggtatg catcaatgga
gttttgtatt acaaagcttc ccgaattgtt 660gaatcgtcat ctgacgatga cacgtctgat
gatgatgatg atgatcatga acggtctgat 720gtgattgttt gctttgattt taggtgtgag
aaattcgagt ttattgtcat atgcttttat 780ggccagttga taaattccgt ccagttatca
ctaaaacaaa aatctcacca gaaacttgac 840ttaatcatat ag
85210240PRTAgrobacterium tumefaciens
10Met Asp Leu Arg Leu Ile Phe Gly Pro Thr Cys Thr Gly Lys Thr Ser1
5 10 15Thr Ala Val Ala Leu Ala
Gln Gln Thr Gly Leu Pro Val Leu Ser Leu 20 25
30Asp Arg Val Gln Cys Cys Pro Gln Leu Ser Thr Gly Ser
Gly Arg Pro 35 40 45Thr Val Glu
Glu Leu Lys Gly Thr Ser Arg Leu Tyr Leu Asp Asp Arg 50
55 60Pro Leu Val Lys Gly Ile Ile Ala Ala Lys Gln Ala
His Glu Arg Leu65 70 75
80Met Gly Gly Val Tyr Asn Tyr Glu Ala His Gly Gly Leu Ile Leu Glu
85 90 95Gly Gly Ser Ile Ser Leu
Leu Lys Cys Met Ala Gln Ser Ser Tyr Trp 100
105 110Ser Ala Asp Phe Arg Trp His Ile Ile Arg His Glu
Leu Ala Asp Glu 115 120 125Glu Thr
Phe Met Asn Val Ala Lys Ala Arg Val Lys Gln Met Leu Arg 130
135 140Pro Ala Ala Gly Leu Ser Ile Ile Gln Glu Leu
Val Asp Leu Trp Lys145 150 155
160Glu Pro Arg Leu Arg Pro Ile Leu Lys Glu Ile Asp Gly Tyr Arg Tyr
165 170 175Ala Met Leu Phe
Ala Ser Gln Asn Gln Ile Thr Ser Asp Met Leu Leu 180
185 190Gln Leu Asp Ala Asp Met Glu Asp Lys Leu Ile
His Gly Ile Ala Gln 195 200 205Glu
Tyr Leu Ile His Ala Arg Arg Gln Glu Gln Lys Phe Pro Arg Val 210
215 220Asn Ala Ala Ala Tyr Asp Gly Phe Glu Gly
His Pro Phe Gly Met Tyr225 230 235
24011257PRTGlycine max 11Met Ala Asp Ser Ser Arg Glu Glu Asn Val
Tyr Met Ala Lys Leu Ala1 5 10
15Glu Gln Ala Glu Arg Tyr Glu Glu Met Val Glu Phe Met Glu Lys Val
20 25 30Ala Lys Thr Val Glu Val
Glu Glu Leu Thr Val Glu Glu Arg Asn Leu 35 40
45Leu Ser Val Ala Tyr Lys Asn Val Ile Gly Ala Arg Arg Ala
Ser Trp 50 55 60Arg Ile Ile Ser Ser
Ile Glu Gln Lys Glu Glu Ser Arg Gly Asn Glu65 70
75 80Asp His Val Ala Ile Ile Lys Glu Tyr Arg
Gly Lys Ile Glu Ala Glu 85 90
95Leu Ser Lys Ile Cys Asp Gly Ile Leu Asn Leu Leu Glu Ser Asn Leu
100 105 110Ile Pro Ser Ala Ala
Ser Pro Glu Ser Lys Val Phe Tyr Leu Lys Met 115
120 125Lys Gly Asp Tyr His Arg Tyr Leu Ala Glu Phe Lys
Thr Gly Ala Glu 130 135 140Arg Lys Glu
Ala Ala Glu Ser Thr Leu Leu Ala Tyr Lys Ser Ala Gln145
150 155 160Asp Ile Ala Leu Ala Asp Leu
Ala Pro Thr His Pro Ile Arg Leu Gly 165
170 175Leu Ala Leu Asn Phe Ser Val Phe Tyr Tyr Glu Ile
Leu Asn Ser Pro 180 185 190Asp
Arg Ala Cys Asn Leu Ala Lys Gln Ala Phe Asp Glu Ala Ile Ser 195
200 205Glu Leu Asp Thr Leu Gly Glu Glu Ser
Tyr Lys Asp Ser Thr Leu Ile 210 215
220Met Gln Leu Leu Arg Asp Asn Leu Thr Leu Trp Thr Ser Asp Ile Thr225
230 235 240Asp Asp Ala Gly
Asp Glu Ile Lys Glu Thr Ser Lys Gln Gln Pro Gly 245
250 255
Glu
<210> 12
<211> 249
<212> PRT
<213> Nostoc sp. PCC 7120
<400> 12
Met Lys
Pro Phe Leu Phe Val Thr Asp Leu Asp His Thr Leu Val Gly1 5
10 15Asn Asp Ala Ala Leu Ala Glu Leu
Ser Gln Ile Leu Thr His His Arg 20 25
30Gln Glu Tyr Gly Thr Lys Ile Val Tyr Ala Thr Gly Arg Ser Pro
Ile 35 40 45Leu Tyr Lys Glu Leu
Gln Val Glu Lys Asn Leu Ile Glu Pro Asp Gly 50 55
60Leu Val Leu Ser Val Gly Thr Glu Ile Tyr Leu Asp Gly Ser
Gly Asn65 70 75 80Pro
Asp Ser Asp Trp Ser Glu Ile Leu Asn Asp Gly Trp Asn Arg Glu
85 90 95Leu Val Leu Ser Val Thr Lys
Lys Phe Pro Glu Leu Met Leu Gln Pro 100 105
110Asp Ser Glu Gln Arg Pro Phe Lys Val Ser Phe Phe Leu His
Gln Glu 115 120 125Ala Ser Phe Lys
Val Ile Pro Gln Leu Glu Thr Glu Leu Ala Lys Cys 130
135 140Lys Leu Asn Ile Lys Leu Ile Tyr Ser Ser Gly Ile
Asp Leu Asp Ile145 150 155
160Val Pro Leu Asn Ser Asp Lys Gly Gln Ala Met Gln Phe Leu Arg Gln
165 170 175Lys Trp Lys Phe Ala
Ala Glu Arg Thr Val Val Cys Gly Asp Ser Gly 180
185 190Asn Asp Ile Ala Leu Phe Ala Val Gly Asn Glu Arg
Gly Ile Ile Val 195 200 205Gly Asn
Ala Arg Pro Glu Leu Leu Gln Trp His Ser Glu Tyr Pro Ala 210
215 220Asp His Arg Tyr Leu Ala Lys Asn Phe Cys Ala
Gly Gly Ile Ile Glu225 230 235
240
Gly Leu Gln Phe Phe Gly Phe Leu Glu
245
<210> 13
<211> 626
<212> PRT
<213> Zea mays
<400> 13
Met Ala Ser Ser Leu Leu Ser Ser Pro Ala Lys Pro Thr Ile Thr Thr
1
5 10 15Thr Thr Lys Thr Thr
Pro Ala Pro Arg Pro Ala Arg Ser Ala His Val 20
25 30His Val Leu Ser Ala Ala Arg Cys Leu Arg Leu Arg
Leu Arg Ala Ser 35 40 45Ser Gln
His Pro Pro Pro Pro Pro Thr Pro Arg Ser Arg Arg Pro Asp 50
55 60Tyr Val Pro Asn Arg Ile Asp Asp Pro Asn Tyr
Val Arg Ile Phe Asp65 70 75
80Thr Thr Leu Arg Asp Gly Glu Gln Ser Pro Gly Ala Thr Met Thr Ser
85 90 95Ala Gln Lys Leu Val
Val Ala Arg Gln Leu Ala Arg Leu Gly Val Asp 100
105 110Ile Ile Glu Ala Gly Phe Pro Ala Ser Ser Pro Asp
Asp Leu Asp Ala 115 120 125Val Arg
Ser Ile Ala Ile Glu Val Gly Asn Pro Ala Pro Gly Pro Ala 130
135 140Gly Glu Glu Asp Ala Val His Val Pro Val Ile
Cys Gly Leu Ser Arg145 150 155
160Cys Asn Arg Lys Asp Ile Asp Ala Ala Trp Glu Ala Val Arg His Ala
165 170 175Arg Arg Pro Arg
Ile His Thr Phe Ile Ala Thr Ser Asp Ile His Met 180
185 190Gln His Lys Leu Arg Lys Thr Pro Asp Gln Val
Val Ala Ile Ala Arg 195 200 205Glu
Met Val Ala Tyr Ala Arg Ser Leu Gly Cys Thr Asp Val Glu Phe 210
215 220Ser Pro Glu Asp Ala Gly Arg Ser Asn Arg
Glu Phe Leu Tyr His Ile225 230 235
240Leu Gly Glu Val Ile Lys Ala Gly Ala Thr Thr Leu Asn Ile Pro
Asp 245 250 255Thr Val Gly
Tyr Asn Leu Pro Tyr Glu Phe Gly Lys Leu Ile Ala Asp 260
265 270Ile Lys Ala Asn Thr Pro Gly Ile Glu Lys
Ala Ile Ile Ser Thr His 275 280
285Cys Gln Asn Asp Leu Gly Leu Ala Thr Ala Asn Thr Leu Ala Gly Ala 290
295 300Arg Ala Gly Ala Arg Gln Leu Glu
Val Thr Ile Asn Gly Ile Gly Glu305 310
315 320Arg Ala Gly Asn Ala Ser Leu Glu Glu Val Val Met
Ala Ile Lys Cys 325 330
335Arg Arg Glu Leu Leu Asp Gly Leu Tyr Thr Gly Ile Asp Ser Arg His
340 345 350Ile Thr Leu Thr Ser Lys
Met Val Gln Glu His Ser Gly Leu His Val 355 360
365Gln Pro His Lys Ala Ile Val Gly Ala Asn Ala Phe Ala His
Glu Ser 370 375 380Gly Ile His Gln Asp
Gly Met Leu Lys Tyr Lys Gly Thr Tyr Glu Ile385 390
395 400Ile Ser Pro Asp Asp Ile Gly Leu Thr Arg
Ala Asn Glu Phe Gly Ile 405 410
415Val Leu Gly Lys Leu Ser Gly Arg His Ala Val Arg Ser Lys Leu Val
420 425 430Glu Leu Gly Tyr Glu
Ile Gly Asp Lys Glu Phe Glu Asp Phe Phe Lys 435
440 445Arg Tyr Lys Glu Val Ala Glu Lys Lys Lys Arg Val
Thr Asp Glu Asp 450 455 460Leu Glu Ala
Leu Leu Ser Asp Glu Ile Phe Gln Pro Lys Val Ile Trp465
470 475 480Ser Leu Ala Asp Val Gln Ala
Thr Cys Gly Thr Leu Ala Leu Ser Thr 485
490 495Ala Thr Val Lys Leu Val Ala Pro Asp Gly Glu Glu
Lys Ile Ala Cys 500 505 510Ser
Val Gly Thr Gly Pro Val Asp Ala Ala Tyr Lys Ala Val Asp Lys 515
520 525Ile Ile Gln Ile Pro Thr Val Leu Arg
Glu Tyr Ser Met Thr Ser Val 530 535
540Thr Glu Gly Ile Asp Ala Ile Ala Thr Thr Arg Val Val Val Thr Gly545
550 555 560Asp Val Ser Asn
Asn Ala Lys His Ala Leu Thr Gly Gln Ser Phe Asn 565
570 575Arg Ser Phe Ser Gly Ser Gly Ala Ser Met
Asp Ile Val Val Ser Ser 580 585
590Val Arg Ala Tyr Leu Ser Ala Leu Asn Lys Ile Cys Ser Phe Ala Gly
595 600 605Ala Val Lys Ala Ser Ser Asp
Val Ala Glu Thr Ala Ser Val Pro Ser 610 615
620
Thr Glu
625
<210> 14
<211> 734
<212> PRT
<213> Arabidopsis thaliana
<400> 14
Met Lys Gly Ala Thr Leu
Val Ala Leu Ala Ala Thr Ile Gly Asn Phe1 5
10 15Leu Gln Gly Trp Asp Asn Ala Thr Ile Ala Gly Ala
Met Val Tyr Ile 20 25 30Asn
Lys Asp Leu Asn Leu Pro Thr Ser Val Gln Gly Leu Val Val Ala 35
40 45Met Ser Leu Ile Gly Ala Thr Val Ile
Thr Thr Cys Ser Gly Pro Ile 50 55
60Ser Asp Trp Leu Gly Arg Arg Pro Met Leu Ile Leu Ser Ser Val Met65
70 75 80Tyr Phe Val Cys Gly
Leu Ile Met Leu Trp Ser Pro Asn Val Tyr Val 85
90 95Leu Cys Phe Ala Arg Leu Leu Asn Gly Phe Gly
Ala Gly Leu Ala Val 100 105
110Thr Leu Val Pro Val Tyr Ile Ser Glu Thr Ala Pro Pro Glu Ile Arg
115 120 125Gly Gln Leu Asn Thr Leu Pro
Gln Phe Leu Gly Ser Gly Gly Met Phe 130 135
140Leu Ser Tyr Cys Met Val Phe Thr Met Ser Leu Ser Asp Ser Pro
Ser145 150 155 160Trp Arg
Ala Met Leu Gly Val Leu Ser Ile Pro Ser Leu Leu Tyr Leu
165 170 175Phe Leu Thr Val Phe Tyr Leu
Pro Glu Ser Pro Arg Trp Leu Val Ser 180 185
190Lys Gly Arg Met Asp Glu Ala Lys Arg Val Leu Gln Gln Leu
Cys Gly 195 200 205Arg Glu Asp Val
Thr Asp Glu Met Ala Leu Leu Val Glu Gly Leu Asp 210
215 220Ile Gly Gly Glu Lys Thr Met Glu Asp Leu Leu Val
Thr Leu Glu Asp225 230 235
240His Glu Gly Asp Asp Thr Leu Glu Thr Val Asp Glu Asp Gly Gln Met
245 250 255Arg Leu Tyr Gly Thr
His Glu Asn Gln Ser Tyr Leu Ala Arg Pro Val 260
265 270Pro Glu Gln Asn Ser Ser Leu Gly Leu Arg Ser Arg
His Gly Ser Leu 275 280 285Ala Asn
Gln Ser Met Ile Leu Lys Asp Pro Leu Val Asn Leu Phe Gly 290
295 300Ser Leu His Glu Lys Met Pro Glu Ala Gly Gly
Asn Thr Arg Ser Gly305 310 315
320Ile Phe Pro His Phe Gly Ser Met Phe Ser Thr Thr Ala Asp Ala Pro
325 330 335His Gly Lys Pro
Ala His Trp Glu Lys Asp Ile Glu Ser His Tyr Asn 340
345 350Lys Asp Asn Asp Asp Tyr Ala Thr Asp Asp Gly
Ala Gly Asp Asp Asp 355 360 365Asp
Ser Asp Asn Asp Leu Arg Ser Pro Leu Met Ser Arg Gln Thr Thr 370
375 380Ser Met Asp Lys Asp Met Ile Pro His Pro
Thr Ser Gly Ser Thr Leu385 390 395
400Ser Met Arg Arg His Ser Thr Leu Met Gln Gly Asn Gly Glu Ser
Ser 405 410 415Met Gly Ile
Gly Gly Gly Trp His Met Gly Tyr Arg Tyr Glu Asn Asp 420
425 430Glu Tyr Lys Arg Tyr Tyr Leu Lys Glu Asp
Gly Ala Glu Ser Arg Arg 435 440
445Gly Ser Ile Ile Ser Ile Pro Gly Gly Pro Asp Gly Gly Gly Ser Tyr 450
455 460Ile His Ala Ser Ala Leu Val Ser
Arg Ser Val Leu Gly Pro Lys Ser465 470
475 480Val His Gly Ser Ala Met Val Pro Pro Glu Lys Ile
Ala Ala Ser Gly 485 490
495Pro Leu Trp Ser Ala Leu Leu Glu Pro Gly Val Lys Arg Ala Leu Val
500 505 510Val Gly Val Gly Ile Gln
Ile Leu Gln Gln Phe Ser Gly Ile Asn Gly 515 520
525Val Leu Tyr Tyr Thr Pro Gln Ile Leu Glu Arg Ala Gly Val
Asp Ile 530 535 540Leu Leu Ser Ser Leu
Gly Leu Ser Ser Ile Ser Ala Ser Phe Leu Ile545 550
555 560Ser Gly Leu Thr Thr Leu Leu Met Leu Pro
Ala Ile Val Val Ala Met 565 570
575Arg Leu Met Asp Val Ser Gly Arg Arg Ser Leu Leu Leu Trp Thr Ile
580 585 590Pro Val Leu Ile Val
Ser Leu Val Val Leu Val Ile Ser Glu Leu Ile 595
600 605His Ile Ser Lys Val Val Asn Ala Ala Leu Ser Thr
Gly Cys Val Val 610 615 620Leu Tyr Phe
Cys Phe Phe Val Met Gly Tyr Gly Pro Ile Pro Asn Ile625
630 635 640Leu Cys Ser Glu Ile Phe Pro
Thr Arg Val Arg Gly Leu Cys Ile Ala 645
650 655Ile Cys Ala Met Val Phe Trp Ile Gly Asp Ile Ile
Val Thr Tyr Ser 660 665 670Leu
Pro Val Leu Leu Ser Ser Ile Gly Leu Val Gly Val Phe Ser Ile 675
680 685Tyr Ala Ala Val Cys Val Ile Ser Trp
Ile Phe Val Tyr Met Lys Val 690 695
700Pro Glu Thr Lys Gly Met Pro Leu Glu Val Ile Thr Asp Tyr Phe Ala705
710 715 720Phe Gly Ala Gln
Ala Gln Ala Ser Ala Pro Ser Lys Asp Ile 725
730
<210> 15
<211> 71
<212> PRT
<213> Chlamydomonas reinhardtii
<400> 15
Met Gly Glu Ala Val Leu Ile Cys Phe Ser Gly Val Ser Cys Gly Gly
1               5                   10                  15
Cys Arg Arg Ala Trp Arg Ser Thr His Ala Val Glu Cys Ser Gly Ala
            20                  25                  30
Leu Gln Arg Leu Ala Val Ala Ala Gly Ala Val Met Gln Asp Ala Arg
        35                  40                  45
Glu Arg Arg Gly Ile Val Cys Thr Tyr Ile Lys Gln Tyr Gly Asn Gly
    50                  55                  60
Ala Glu Pro Cys Pro Ala Gln
65                  70
<210> 16
<211> 919
<212> PRT
<213> Artificial Sequence
<220>
<223> Truncated corn sucrose phosphate synthase gene (Zm.SPStruncated)
<400> 16
Met Ala Phe Gln Arg Asn Phe Ser Asp Leu Thr Val Trp
Ser Asp Asp1 5 10 15Asn
Lys Glu Lys Lys Leu Tyr Ile Val Leu Ile Ser Val His Gly Leu 20
25 30Val Arg Gly Glu Asn Met Glu Leu
Gly Arg Asp Ser Asp Thr Gly Gly 35 40
45Gln Val Lys Tyr Val Val Glu Leu Ala Arg Ala Met Ser Met Met Pro
50 55 60Gly Val Tyr Arg Val Asp Leu Phe
Thr Arg Gln Val Ser Ser Pro Asp65 70 75
80Val Asp Trp Ser Tyr Gly Glu Pro Thr Glu Met Leu Cys
Ser Gly Ser 85 90 95Asn
Asp Gly Glu Gly Met Gly Glu Ser Ala Gly Ala Tyr Ile Val Arg
100 105 110Ile Pro Cys Gly Pro Arg Asp
Lys Tyr Leu Lys Lys Glu Ala Leu Trp 115 120
125Pro Tyr Leu Gln Glu Phe Val Asp Gly Ala Leu Ala His Ile Leu
Asn 130 135 140Met Ser Lys Ala Leu Gly
Glu Gln Val Gly Asn Gly Arg Pro Val Leu145 150
155 160Pro Tyr Val Ile His Gly His Tyr Ala Asp Ala
Gly Asp Val Ala Ala 165 170
175Leu Leu Ser Gly Ala Leu Asn Val Pro Met Val Leu Thr Gly His Ser
180 185 190Leu Gly Arg Asn Lys Leu
Glu Gln Ile Leu Lys Gln Gly Arg Met Ser 195 200
205Lys Glu Glu Ile Asp Ser Thr Tyr Lys Ile Met Arg Arg Ile
Glu Gly 210 215 220Glu Glu Leu Ala Leu
Asp Ala Ser Glu Leu Val Ile Thr Ser Thr Arg225 230
235 240Gln Glu Ile Asp Glu Gln Trp Gly Leu Tyr
Asp Gly Phe Asp Val Lys 245 250
255Leu Glu Lys Val Leu Arg Ala Arg Ala Arg Arg Gly Val Ser Cys His
260 265 270Gly Arg Phe Met Pro
Arg Met Val Val Ile Pro Pro Gly Met Asp Phe 275
280 285Ser Asn Val Val Val His Glu Asp Ile Asp Gly Asp
Gly Asp Ser Lys 290 295 300Asp Asp Ile
Val Gly Leu Glu Gly Ala Ser Pro Lys Ser Met Pro Pro305
310 315 320Ile Trp Ala Glu Val Met Arg
Phe Leu Thr Asn Pro His Lys Pro Met 325
330 335Ile Leu Ala Leu Ser Arg Pro Asp Pro Lys Lys Asn
Ile Thr Thr Leu 340 345 350Val
Lys Ala Phe Gly Glu Cys Pro Pro Leu Arg Glu Leu Ala Asn Leu 355
360 365Thr Leu Ile Met Gly Asn Arg Asp Asp
Ile Asp Asp Met Ser Ala Gly 370 375
380Asn Ala Ser Val Leu Thr Thr Val Leu Lys Leu Ile Asp Lys Tyr Asp385
390 395 400Leu Tyr Gly Ser
Val Ala Phe Pro Lys His His Asn Gln Ala Asp Val 405
410 415Pro Glu Ile Tyr Ala Val Ala Ala Lys Met
Lys Gly Val Phe Ile Asn 420 425
430Pro Ala Leu Val Glu Pro Phe Gly Leu Thr Leu Ile Glu Ala Ala Ala
435 440 445His Gly Leu Pro Ile Val Ala
Thr Lys Asn Gly Gly Pro Val Asp Ile 450 455
460Thr Thr Ala Leu Ser Asn Gly Leu Leu Val Asp Pro His Asp Gln
Asn465 470 475 480Ala Ile
Ala Gln Ala Leu Leu Lys Leu Val Ala Asp Lys Asn Leu Trp
485 490 495Gln Glu Cys Arg Arg Asn Gly
Leu Arg Asn Ile His Leu Tyr Ser Trp 500 505
510Pro Glu His Cys Arg Thr Tyr Leu Thr Arg Val Ala Gly Cys
Arg Leu 515 520 525Arg Asn Pro Arg
Trp Leu Lys Asp Thr Pro Ala Asp Ala Gly Ala Asp 530
535 540Glu Glu Glu Phe Leu Glu Asp Ser Met Asp Ala Gln
Asp Leu Ser Leu545 550 555
560Arg Leu Ser Ile Asp Gly Glu Lys Ser Ser Leu Asn Thr Asn Asp Pro
565 570 575Leu Ser Ser Asp Pro
Gln Asp Gln Val Gln Lys Ile Met Asn Lys Ile 580
585 590Asn Gln Ser Ser Ala Leu Pro Pro Ser Met Ser Ser
Val Ala Asp Gly 595 600 605Ala Lys
Asn Ala Thr Glu Thr Thr Gly Ser Thr Leu Asn Lys Tyr Pro 610
615 620Leu Pro Arg Gly Arg Arg Leu Phe Val Ile Ala
Val Asp Cys Tyr Gln625 630 635
640Asp Asp Gly Arg Ala Ser Lys Lys Met Leu Gln Val Ile Gln Glu Val
645 650 655Phe Arg Ala Val
Arg Ser Asp Ser Gln Met Ser Lys Ile Ser Gly Phe 660
665 670Ala Leu Ser Thr Ala Met Pro Leu Ser Glu Thr
Leu Gln Leu Leu Gln 675 680 685Leu
Gly Lys Ile Pro Ala Thr Asp Phe Asp Thr Leu Ile Cys Gly Ser 690
695 700Gly Ser Glu Val Tyr Tyr Pro Gly Thr Val
Asn Cys Val Asp Ala Glu705 710 715
720Gly Lys Leu Arg Pro Asp Gln Asp Tyr Leu Met His Ile Ser His
Arg 725 730 735Trp Ser His
Asp Gly Ala Lys Gln Thr Ile Ala Lys Leu Met Ala Thr 740
745 750Gln Asp Gly Ser Gly Asp Thr Val Glu Leu
Asp Pro Ala Ser Ser Asn 755 760
765Ala His Cys Phe Thr Phe Leu Ile Lys Asp Pro Lys Lys Val Lys Thr 770
775 780Val Asp Glu Met Arg Glu Arg Leu
Arg Met Arg Gly Leu Arg Cys His785 790
795 800Ile Met Tyr Cys Arg Asn Ser Thr Arg Leu Gln Val
Val Pro Leu Leu 805 810
815Ala Ser Arg Ser Gln Ala Leu Arg Tyr Leu Phe Val Arg Trp Gly Leu
820 825 830Tyr Val Gly Asn Met Tyr
Leu Ile Thr Gly Glu His Gly Asp Thr Asp 835 840
845His Glu Glu Met Leu Ser Gly Leu His Lys Thr Val Ile Val
Arg Gly 850 855 860Val Thr Glu Lys Gly
Ser Glu Gly Leu Leu Arg Ser Pro Gly Ser Tyr865 870
875 880Lys Lys Asp Asp Val Val Pro Ser Glu Thr
Pro Leu Ala Ala Tyr Thr 885 890
895Thr Gly Glu Leu Lys Ala Asp Glu Ile Met Arg Ala Leu Lys Gln Val
900 905 910Ser Lys Thr Ser Ser
Gly Met
915
<210> 17
<211> 503
<212> PRT
<213> Arabidopsis thaliana
<400> 17
Met Gly Phe Asp Pro Glu
Asn Gln Ser Ile Ser Ser Val Gly Gln Val1 5
10 15Val Gly Asp Ser Ser Ser Gly Gly Ile Thr Ala Glu
Lys Glu Pro Leu 20 25 30Leu
Lys Glu Asn His Ser Pro Glu Asn Tyr Ser Val Leu Ala Ala Ile 35
40 45Pro Pro Phe Leu Phe Pro Ala Leu Gly
Ala Leu Leu Phe Gly Tyr Glu 50 55
60Ile Gly Ala Thr Ser Cys Ala Ile Met Ser Leu Lys Ser Pro Thr Leu65
70 75 80Ser Gly Ile Ser Trp
Tyr Asp Leu Ser Ser Val Asp Val Gly Ile Ile 85
90 95Thr Ser Gly Ser Leu Tyr Gly Ala Leu Ile Gly
Ser Ile Val Ala Phe 100 105
110Ser Val Ala Asp Ile Ile Gly Arg Arg Lys Glu Leu Ile Leu Ala Ala
115 120 125Phe Leu Tyr Leu Val Gly Ala
Ile Val Thr Val Val Ala Pro Val Phe 130 135
140Ser Ile Leu Ile Ile Gly Arg Val Thr Tyr Gly Met Gly Ile Gly
Leu145 150 155 160Thr Met
His Ala Ala Pro Met Tyr Ile Ala Glu Thr Ala Pro Ser Gln
165 170 175Ile Arg Gly Arg Met Ile Ser
Leu Lys Glu Phe Ser Thr Val Leu Gly 180 185
190Met Val Gly Gly Tyr Gly Ile Gly Ser Leu Trp Ile Thr Val
Ile Ser 195 200 205Gly Trp Arg Tyr
Met Tyr Ala Thr Ile Leu Pro Phe Pro Val Ile Met 210
215 220Gly Thr Gly Met Cys Trp Leu Pro Ala Ser Pro Arg
Trp Leu Leu Leu225 230 235
240Arg Ala Leu Gln Gly Gln Gly Asn Gly Glu Asn Leu Gln Gln Ala Ala
245 250 255Ile Arg Ser Leu Cys
Arg Leu Arg Gly Ser Val Ile Ala Asp Ser Ala 260
265 270Ala Glu Gln Val Asn Glu Ile Leu Ala Glu Leu Ser
Leu Val Gly Glu 275 280 285Asp Lys
Glu Ala Thr Phe Gly Glu Leu Phe Arg Gly Lys Cys Leu Lys 290
295 300Ala Leu Thr Ile Ala Gly Gly Leu Val Leu Phe
Gln Gln Ile Thr Gly305 310 315
320Gln Pro Ser Val Leu Tyr Tyr Ala Pro Ser Ile Leu Gln Thr Ala Gly
325 330 335Phe Ser Ala Ala
Ala Asp Ala Thr Arg Ile Ser Ile Leu Leu Gly Leu 340
345 350Leu Lys Leu Val Met Thr Gly Val Ser Val Ile
Val Ile Asp Arg Val 355 360 365Gly
Arg Arg Pro Leu Leu Leu Cys Gly Val Ser Gly Met Val Ile Ser 370
375 380Leu Phe Leu Leu Gly Ser Tyr Tyr Met Phe
Tyr Lys Asn Val Pro Ala385 390 395
400Val Ala Val Ala Ala Leu Leu Leu Tyr Val Gly Cys Tyr Gln Leu
Ser 405 410 415Phe Gly Pro
Ile Gly Trp Leu Met Ile Ser Glu Ile Phe Pro Leu Lys 420
425 430Leu Arg Gly Arg Gly Ile Ser Leu Ala Val
Leu Val Asn Phe Gly Ala 435 440
445Asn Ala Leu Val Thr Phe Ala Phe Ser Pro Leu Lys Glu Leu Leu Gly 450
455 460Ala Gly Ile Leu Phe Cys Ala Phe
Gly Val Ile Cys Val Val Ser Leu465 470
475 480Phe Phe Ile Tyr Tyr Ile Val Pro Glu Thr Lys Gly
Leu Thr Leu Glu 485 490
495
Glu Ile Glu Ala Lys Cys Leu
500
<210> 18
<211> 283
<212> PRT
<213> Arabidopsis thaliana
<400> 18
Met Asn Gly Gly Glu Lys Leu Glu Ser Ile Pro Ile Asp Leu Ile Ile
1
5 10 15Glu Ile His Ser Arg Leu
Pro Ala Glu Ser Val Ala Arg Phe Arg Cys 20 25
30Val Ser Lys Leu Trp Gly Ser Met Phe Arg Arg Pro Tyr
Phe Thr Glu 35 40 45Leu Phe Leu
Thr Arg Ser Arg Ala Arg Pro Arg Leu Leu Phe Val Leu 50
55 60Gln His Asn Arg Lys Trp Ser Phe Ser Val Phe Ser
Ser Pro Gln Asn65 70 75
80Gln Asn Ile Tyr Val Lys Pro Ser Phe Val Val Ala Asp Phe His Met
85 90 95Lys Phe Ser Val Ser Thr
Phe Pro Asp Phe His Ser Cys Ser Gly Leu 100
105 110Ile His Phe Ser Met Met Lys Gly Ala Tyr Thr Val
Pro Val Val Cys 115 120 125Asn Ser
Arg Thr Gly Gln Tyr Ala Val Leu Pro Lys Leu Thr Arg Thr 130
135 140Arg Tyr Glu Asn Ser Tyr Ser Phe Val Gly Tyr
Asp Pro Ile Glu Lys145 150 155
160Gln Ile Lys Val Leu Phe Met Ser Asp Pro Asp Ser Gly Asp Asp His
165 170 175Arg Ile Leu Thr
Leu Gly Thr Thr Glu Lys Met Leu Gly Arg Lys Ile 180
185 190Glu Cys Ser Leu Thr His Asn Ile Leu Ser Asn
Glu Gly Val Cys Ile 195 200 205Asn
Gly Val Leu Tyr Tyr Lys Ala Ser Arg Ile Val Glu Ser Ser Ser 210
215 220Asp Asp Asp Thr Ser Asp Asp Asp Asp Asp
Asp His Glu Arg Ser Asp225 230 235
240Val Ile Val Cys Phe Asp Phe Arg Cys Glu Lys Phe Glu Phe Ile
Val 245 250 255Ile Cys Phe
Tyr Gly Gln Leu Ile Asn Ser Val Gln Leu Ser Leu Lys 260
265 270Gln Lys Ser His Gln Lys Leu Asp Leu Ile
Ile
275 280
<210> 19
<211> 1318
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter sequence with specific expression pattern.
<400> 19
atcaacaaat tactcctcaa
tcacactcct atagaaaacg gtttaagcta tcattacatg 60tctagttggt tttactcagc
cctagaagtg ttgtttattg catcactttc cacgaagcac 120aatttttctt ttttacaatc
accagacctc acaggctcac acatatgctt tagagcacat 180tctaaacttt gaactataaa
agctgttaac actaatacac tatgcgttct tttttgctcc 240aaacactttt gatccattat
taggagacac tccacttaga aagattttct aatcctttgg 300tcaactagga agttcaaggt
ttttctaaac agaaattcat ttcacaagta atttaattta 360taaggaaatg aatagagaaa
tcaaatcatt gaagaactac aaaatataga ttcaaggtca 420ggtctaagaa aatattcctg
aagctcaaaa aagagttttc ctctcacatt atagaattgg 480cctttacttc aacattttcc
cacctattcc acatttggtc agaacatttt taattacttg 540tggatcaatt tccggttgaa
atgggtttgg tgaatatccg gttcagttat atggtggccg 600ttggaattgg cttattagtt
gtggccgttg ttgaagccgt tggtattggt aagggagaag 660cagacttgtg gctatgagtc
tatgaccatg actcgtgatt atggagctgt cttatgaccc 720tgaccatcac cttgatctgg
tggattccaa tgttttcttc ttcttctaat aaaatattat 780ggtcaataca ggtgctaatt
aagatggtaa taatttctta tgtttctgtg gtaaagtttg 840attcaattcc gtagttttag
ataatcttat ttccatacat aaattttata gttttatcta 900ctttgttctt atgttttatc
tctagccaag agttattatt attatcagaa gaagaaaaaa 960aaaagaagca tatatacaaa
aggtttaata aaatgtatta tacaaggcaa ttatccaaat 1020tttttttgtt ttggtttaca
ttgatgctct caggatttca taaggataga gagatctatt 1080cgtatacgtg tcacgtcatg
agtgggtgtt tcgccaatcc atgaaacgca cctagatatc 1140taaaacacat atcaattgcg
aatctgcgaa gtgcgagcca ttaaccacgt aagcaaacaa 1200acaatctaaa ccccaaaaaa
aatctatgac tagccaatag caacctcaga gattgatatt 1260tcaagataag acagtattta
gatttctgta ttatatatag cgaaaatcgc atcaatac 1318
<210> 20
<211> 1696
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter sequence with specific expression pattern.
<400> 20
caaatttatt atgtgttttt tttccgtggt cgagattgtg tattattctt tagttattac
60aagactttta gctaaaattt gaaagaattt actttaagaa aatcttaaca tctgagataa
120tttcagcaat agattatatt tttcattact ctagcagtat ttttgcagat caatcgcaac
180atatatggtt gttagaaaaa atgcactata tatatatata ttattttttc aattaaaagt
240gcatgatata taatatatat atatatatat atgtgtgtgt gtatatggtc aaagaaattc
300ttatacaaat atacacgaac acatatattt gacaaaatca aagtattaca ctaaacaatg
360agttggtgca tggccaaaac aaatatgtag attaaaaatt ccagcctcca aaaaaaaatc
420caagtgttgt aaagcattat atatatatag tagatcccaa atttttgtac aattccacac
480tgatcgaatt tttaaagttg aatatctgac gtaggatttt tttaatgtct tacctgacca
540tttactaata acattcatac gttttcattt gaaatatcct ctataattat attgaatttg
600gcacataata agaaacctaa ttggtgattt attttactag taaatttctg gtgatgggct
660ttctactaga aagctctcgg aaaatcttgg accaaatcca tattccatga cttcgattgt
720taaccctatt agttttcaca aacatactat caatatcatt gcaacggaaa aggtacaagt
780aaaacattca atccgatagg gaagtgatgt aggaggttgg gaagacaggc ccagaaagag
840atttatctga cttgttttgt gtatagtttt caatgttcat aaaggaagat ggagacttga
900gaagtttttt ttggactttg tttagctttg ttgggcgttt ttttttttga tcaataactt
960tgttgggctt atgatttgta atattttcgt ggactcttta gtttatttag acgtgctaac
1020tttgttgggc ttatgacttg ttgtaacata ttgtaacaga tgacttgatg tgcgactaat
1080ctttacacat taaacatagt tctgtttttt gaaagttctt attttcattt ttatttgaat
1140gttatatatt tttctatatt tataattcta gtaaaaggca aattttgctt ttaaatgaaa
1200aaaatatata ttccacagtt tcacctaatc ttatgcattt agcagtacaa attcaaaaat
1260ttcccatttt tattcatgaa tcataccatt atatattaac taaatccaag gtaaaaaaaa
1320ggtatgaaag ctctatagta agtaaaatat aaattcccca taaggaaagg gccaagtcca
1380ccaggcaagt aaaatgagca agcaccactc caccatcaca caatttcact catagataac
1440gataagattc atggaattat cttccacgtg gcattattcc agcggttcaa gccgataagg
1500gtctcaacac ctctccttag gcctttgtgg ccgttaccaa gtaaaattaa cctcacacat
1560atccacactc aaaatccaac ggtgtagatc ctagtccact tgaatctcat gtatcctaga
1620ccctccgatc actccaaagc ttgttctcat tgttgttatc attatatata gatgaccaaa
gcactagacc aaacct 1696
<210> 21
<211> 469
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter sequence with specific expression pattern.
<400> 21
ggtactcctg agatactata ccctcctgtt ttaaaatagt tggcattatc gaattatcat 60
tttacttttt aatgttttct cttcttttaa tatattttat gaattttaat gtattttaaa 120
atgttatgca gttcgctctg gacttttctg ctgcgcctac acttgggtgt actgggccta 180
aattcagcct gaccgaccgc ctgcattgaa taatggatga gcaccggtaa aatccgcgta 240
cccaactttc gagaagaacc gagacgtggc gggccgggcc accgacgcac ggcaccagcg 300
actgcacacg tcccgccggc gtacgtgtac gtgctgttcc ctcactggcc gcccaatcca 360
ctcatgcatg cccacgtaca cccctgccgt ggcgcgccca gatcctaatc ctttcgccgt 420
tctgcacttc tgctgcctat aaatggcggc atcgaccgtc acctgcttc 469
<210> 22
<211> 1099
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter sequence with specific expression pattern.
<400> 22
cagcggggca gcgcaacaca aaaagggggg
aggatgccgg cgaccacgct agtgaccatg 60aagcaagatg atgtgaaagg gaggaccgga
cgagggttgg acctctgctg ccgacatgaa 120gagcgtgatg tgtagaagga gatgttagac
cagatgccga cgcaactagc cctggcaagg 180tcacccgact gatatcgctg cttgcccttg
tcctcatgta cacaatcagc ttgcttatct 240ctcccatact ggtcgtttgt ttcccgtggc
cgaaatagaa gaagacagag gtaggttttg 300ttagagaatt ttagtggtat tgtagcctat
ttgtaatttt gttgtacttt attgtattaa 360tcaataaagg tgtttcattc tattttgact
caatgttgaa tccattgatc tcttggtgtt 420gcactcagta tgttagaata ttacattccg
ttgaaacaat cttggttaag ggttggaaca 480tttttatccg ttcgtgaaac atccgtaata
ttttcgttga aacaattttt atcgacagca 540ccgtccaaca atttacacca atttggacgt
gtgatacata gcagtcccca agtgaaactg 600accaccagtt gaaaggtata caaagtgaac
ttattcatct aaaagaccgc agagatgggc 660cgtgggccgt ggcctgcgaa acgcagcgtt
caggcccatg agcatttatt ttttaaaaaa 720atatttcaca acaaaaaaga gaacggataa
aatccatcga aaaaaaaaaa ctttcctacg 780catcctctcc tatctccatc cacggcgagc
actcatccaa accgtccatc cacgcgcaca 840gtacacacac atagttatcg tctctccccc
cgatgagtca ccacccgtgt cttcgagaaa 900cgcctcgccc gacaccgtac gtggcgccac
cgccgcgcct gccgcctgga cacgtccggc 960tcctctccac gccgcgctgg ccaccgtcca
ccggctcccg cacacgtctc cctgtctccc 1020tccacccatg ccgtggcaat cgagctcatc
tcctcgcctc ctccggctta taaatggcgg 1080
ccaccacctt cacctgctt 1099
<210> 23
<211> 246
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter sequence with specific expression pattern.
<400> 23
gtggcctggg aaaagagaga gcccaaccaa ggcggcccat ctgcgacgct tcggcactgt 60
caagcatccc gcacaggcgc agcaccgcag tcatcggtga catttgtcgc tacagctgtt 120
tcaggcatct gccacctcgg tcacatgccg tccgccacgt cgagaccgcg agctccctac 180
gtgtcacgcc cagccatgcc cgacgtctcc ggtgccgtgt tttaaagaac gcgccgtagc 240
gcactg 246
<210> 24
<211> 786
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter sequence with specific expression pattern.
<400> 24
gacatggagg tggaaggcct gacgtagata gagaagatgc tcttagcttt cattgtcttt
60cttttgtagt catctgattt acctctctcg tttatacaac tggtttttta aacactcctt
120aacttttcaa attgtctctt tctttaccct agactagata attttaatgg tgattttgct
180aatgtggcgc catgttagat agaggtaaaa tgaactagtt aaaagctcag agtgataaat
240caggctctca aaaattcata aactgttttt taaatatcca aatattttta catggaaaat
300aataaaattt agtttagtat taaaaaattc agttgaatat agttttgtct tcaaaaatta
360tgaaactgat cttaattatt tttccttaaa accgtgctct atctttgatg tctagtttga
420gacgattata taattttttt tgtgcttaac tacgacgagc tgaagtacgt agaaatacta
480gtggagtcgt gccgcgtgtg cctgtagcca ctcgtacgct acagcccaag cgctagagcc
540caagaggccg gaggtggaag gcgtcgcggc actatagcca ctcgccgcaa gagcccaaga
600gaccggagct ggaaggatga gggtctgggt gttcacgaat tgcctggagg caggaggctc
660gtcgtccgga gccacaggcg tggagacgtc cgggataagg tgagcagccg ctgcgatagg
720ggcgcgtgtg aaccccgtcg cgccccacgg atggtataag aataaaggca ttccgcgtgc
780
aggatt 786
<210> 25
<211> 728
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter sequence with specific expression pattern.
<400> 25
ctgcgtgtac aactaatata attgtccaaa caatttctgt ggcacgtact
taagtttgag 60ccaggataca aactttggcc gctaatggtt gctgtcgccg gtcaagaggg
cgttggctac 120ttgagttaga ttttggttgt gtttcatccc cacgtacgtc cagcaaagaa
aaattgaagc 180tagtgcatgc atggttcgtc atcaaatgca tggccggccg gatacaaatt
tgaactgtag 240ctatcgacgt acgcatgtat taatttatat cagagaagac aaggaacaca
gatacataca 300tgtcgaaaca atcattttct atggcacttg agctagctag catacaattt
tgttttaaat 360gaaatgaaac tgaagacgat cgatcgaatt gaaggttgtg gttcgtgagc
aatgcaatgc 420agtttcacag aacgttgcca atgcaacaag ccaccaagaa aagagaagtc
tactcgatct 480tgcaatgatt aggcttggat gatgcgtggg gccacgtacg tatggacatc
gaagaacccc 540atcctcagcg tgtggcctga gggtgatggc aaagctgatc cacacattgc
ggcccccttt 600cccccctcag agaccctgac ctcccgagca cagccagcca ccgcgcaacg
ccggccacca 660ccaccaccac catacctgct agcgctagct ctctttattt aacgccgccg
tgtgcgtgcc 720
tcgacgac 728
<210> 26
<211> 3912
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter sequence with specific expression pattern.
<400> 26
acaccaataa aaatacacag caataaaatc
gctacgtata tatatatata atatgtatta 60tctattacaa gatagtaata gagtatagca
agttgtatca tctaacaaac tatgcgaata 120aaatttgaac attgtgacat gtagatgtag
tgtaatttag ctaagtgctt atcatcagta 180acatagaccg acttaacttt ttacgaaaaa
aaaaaagtaa catagaccga aaaaatgcat 240atcgtaaatt taatggaaaa cacaatttac
gataagtaaa aaacaaaaag aaattacgat 300aagtcgagaa aaatgcaaca aattgagata
aagtattgat aaaaccatga aagtgtcggc 360gtatgtaaat gcggtgatta atgtgatcat
tagagcgtgt gtgttaaacg cggcggtttt 420agtggagatt gatcagctga taacactctt
accgggacga atctaattcc atattcatgg 480cttgttaaaa cctaagacat acgcaatctc
taatttgcta gtatagttag ttctatatta 540tttttcgact aataatgtaa acatatgatt
attaagtcgc aaaaagagtg cttaacaacc 600aaaaagtgga ttaattaact tggtgggaaa
agttacaaaa cctttaatga ttactctttg 660taccaagaat agtggcgaag cactataaga
gcagagaaaa gaagctcaat aatgtactaa 720aagttgtaga tttttacagc ttaaatacac
caaaattaat agaaaagttg gtaatttttt 780aattcatggc tactgattta gattttagaa
aacaatagta gtatcattgt cacatcttaa 840acacacaata ggtatgtttt aaatcaaagg
ccgtagttaa tttgtcaaaa atgtatgcat 900ttggtatttg gatgtctccg aaaggatgga
tatatggact tgttagataa tttcatacct 960cagtatcaat agtcatggag cccaaattgc
tcaaaaacat atttttaatt ccaagacttt 1020gatgaagacg taataatgag tccaatgggc
catcagatac aatgttcgga atttaacggg 1080tttgttagtt ataagtattg ggcttgacct
atctggttca atgatatgta ggaacaaccc 1140aatttgcaaa gctttattaa aagactcttt
agttgtcgtc aaggtttaac ttgtagtagt 1200tggtaagaaa ttctacgtga aataggcaac
attacaaaaa caaaaatcaa ttcgaaatca 1260tacaaaacga aaccaagtag taaccaacta
cactattatg acattaatga ttagacattc 1320ccaaatcata caagttcctg tcatgaagga
aacaatggtc cgtatttgca aacgattaca 1380aaaattcaaa ccaaaaatga aaaaacgagt
taaattattt ggtttataaa aatagtaatg 1440tcaacagaag actagattgg gaaacctgaa
gcgaacagag cttttaaaaa cgagtttgaa 1500cggctgggat catttggtac aatacccacc
gtaagtttgt ttaccctagg gatgcaagcc 1560aaaggcccaa atcagttact acttactgct
acaaccatcg tctcagcttt ttgtctcagc 1620tttttactaa tgaagcatac aatttcttgg
gcatgtcaca tctcgacacg tgtccactat 1680tctcttctct tattggctac tcgttcgtag
gcttctgtta atagatgatc tctctataac 1740tctaacagtc ttttctttct ctttatttcg
ttttggtatt ttaagtttca aattgaaaat 1800aataggagga aaagtctagt tttaaatatt
gtttttttac aagtgaacgt gaaccaattt 1860acctcttttt ttttatatat cctatcggct
aatctggtta gtatcggtag aaatgcaccg 1920aggtgctaca gagattaatg ctagggatag
tcagaccgct tgtatttctg actatcaagt 1980aaatctacgc ccaactcaca tatttcccaa
acaaatgtga tttttttttt tttttttttt 2040tttttttttt ttttgtaaca aatgtgattt
tgttttcaag gaaaatagaa cttacgtttg 2100ggaatttcac ccttcactaa agcttccttc
tgccattaga ccacaaaggc ttgggcaatt 2160taccattttt gtaaaagtag aaaacaaaat
gcctaaaatg ttcatacttc attacatcaa 2220caaggttatg cccacgatat agaggcatgt
aacatttata tatatagtgg aagaagccta 2280cgagctttat taataagtat aaactctgat
tattaggtaa ataaattact taaaacgatt 2340actcaactga caaaaccgta gttgaataat
aaggttacta tgaataccga ttgaatattg 2400caaagccgga attgaaaaat atataacaga
tcaaatgttc aagtgtggtc ataattctca 2460cataggtcat atagctgaac ccatgcatct
atttactagt ctatagaaag tactagagac 2520gcatacagct gaacctactc tattctttta
ttaattttgg ttctcgtgga tacaaaattc 2580ctccaacatt tattagaacg aataaaacca
atatgatgat gattagttat tggtaaacat 2640ataaacgttg agtaaacttc aaaatagatt
gaagtactat taagacttgc attttttccc 2700cttgggttat attcttgaat cgtttcgaag
tattttaact ttcaagaata gaaggttcct 2760caactataaa caattacatt aatcaaaacc
atttctatgt aaacaacata atttttgtat 2820attttagtct tccccaaaag tttgaccgat
agggcggttt agaccgtata gtacgactgt 2880acaacaaaaa ggactctgga gacctaaaga
tccaaaacta tgcaaaataa agatacggtc 2940ggaccaattt aatctaacaa aaccaaatcc
ttatactaaa ctatttaccg atacatttcc 3000atataacaca gtacacacaa ttaaatcaaa
cattattgga agaacaagat agaatattgg 3060cttaatctcg aacgattaga gttatcctag
agcctcggag cttttgtcac atataatata 3120aactatggta tatataaaca tgactctcat
ttgtatttat cgcaaggtac aattccacca 3180atttttttcg tcccactcat acagctttaa
ttgtgaaatc aatccataaa aaaccaacat 3240gtgacatggt ctctataact ataactataa
gatagtaaaa aattcacatc aacataaaag 3300aaaaccaatc atattggcta aaaaaaacta
acggtcgaaa aacgtataac cacaaaacca 3360aaccggtcca accggtgtcc ccaatcacta
tcaaagcatt aactaacttt cacaaggaaa 3420agcatagttc agtttctcta catcgcttcc
catcctctta accctgttta ctcgaatcat 3480ccaccgttgg atcaaacacg cgctacaaat
ctagcgcgtg accgaggttt ttacacagtg 3540gaatattacc atgcattgga aagcggcgtc
tacaacaaac ggcgggtcat gtcaccgtca 3600aaatcaacct ttcttaattc ctaacgccgt
tacttatctc cgtttactaa aaatgttaat 3660gcgtgtgaga gtgaagatca tatactaatt
agaagtggct aatgttttaa cgtgacatta 3720ttatcatagt taatggttcg atcagagttt
taagtagtaa atgatataag tgtgtgtata 3780taattgcata catatatact ctcacactct
gacagatttg tcgtggtctt agtattctct 3840ttcatggcta gttatatagg gctctagtac
attatctctc tctccccatt tctctgtctc 3900
tctcttcttt aa 3912
<210> 27
<211> 1718
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter sequence with specific expression pattern.
<400> 27
atctctcctc tcctctcctc
tcctcccgtt ggtgtgactg tagtagatcc tttgcccgtg 60tcagaacaag ctgctcctcg
gaccgggtaa tgttaaacat cggaggagcc tttgcctagg 120atccgtaacg gggaggaaag
agaaaaaaaa ctaaggatga ttatggatac cgtgtaataa 180ctgctaacta cagttagccc
atctcagcgg actctctgcc ctatattgta tgtcactttc 240tattataaac tacactatac
aacctatgat gtaaaataat gttttgcacg ttcatatata 300aatcagtcga agaaagggtg
cctcactaca gggaatggtt tctattggac accttagcat 360tcaatcagtc atgtcccccc
ccccccccaa aaaaaaaatg cacccatcca gtcgattttt 420gtcatatttg aattcggtgg
tgctccatgc acgcgtacct gctttgacca atttatacga 480tcaatatata acttacgttc
ttacggttct tagactttat gagactttgc aagtatgttt 540ggatacaaat cacactaatg
tgcatctttg taaactaaat tcttttgatt aaatttgtaa 600ttttaaggtt taacctgttt
ttgttgtgta gacgacgtta ggcaccgatc gtcgcttcgc 660tatatatctt tgttgtagac
gacgttagac tcctagatta aataagcgaa aaccgatcgt 720cgcttcgcta tctttgttta
tttgtttgtg gctgctctac gctgaagagc ccacaggcca 780cagccccaca cgacacgtta
ggcaccccca cccaccatcc gcgcataata taagctactg 840caaaatatat gccggcggag
cccgagcgag ctttgtactt gctccgccgt ggcctggctc 900caggatgctt tggatttcgt
gcggcgccgt acgtccaggc aaacagacaa gtggagctgc 960atgtcctaaa agcccggcaa
tcaaacacgc tctagcagca gcatggatca cagatatcag 1020tcatggggtg gcgctggcgc
gggtgggtgg ccaggtggag gtgggtgcat gtcgtcgtcg 1080tcgtcccata cagaaattgg
ctcacgtatg tatacgctgc gtacaggcag tagtacacaa 1140ttactagcac caatgcaatc
caacggatgg atcttcgcac acccgccacc cggttaaatt 1200aagctactcc tacctctccc
agtctccctt ggcctgcctc tatatttttg ggcagcctcc 1260accagccggg cggatggggt
tggatcgtcg tatctgaggc ggcgtggtcg tccaaggcga 1320aagcaacggc gcagggctgg
gaccctagta ggtgcatgag gtcgtgcatg gcgcgcgaga 1380tgcatggttt gggttaggcc
taggaggttc tctctccatg gcatgggtag ctcgcgccgc 1440ttggctgccg ttctcgtgta
tgcgcatgca ccaggcattt gcaccgcgcc gtgtatattt 1500ctggcgtggg ggccggcgcc
gcattggagc tgcagccccg tttcggcacg gacacgggac 1560acctcccgtt agggtaagcc
cggggcagtg ggtaactgcc cagcgccact actccgaatt 1620taccctcctt ttatttttaa
agcttgggag aggggagaat ggatggatgg atggatgtag 1680acgcgtgaaa aagatgcgcg
agaccggcag cgtgtgct 1718
<210> 28
<211> 1280
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter sequence with specific expression pattern.
<400> 28
tccaccgatc atcacacaca gccagtagtg ggggtgggcc aagcaatcag gcacccggca
60atgcgagctg atgcgtgatg atggtgctac caacaaactg actataaaat ttctgatttg
120aaagggattg gcctcgatat tttattagct ccccggcttt tgtcacgaca cgttagcatg
180cgtgccttct agaagctagt ccgggtatta ccgctagaaa gttcccgaaa tgaagcattt
240accacccgta aagctcattt ttctttatga tgagtagaca cggtaccaac attgaggacc
300gattggttgg ctcccaaaat ctgccctgcc aaactagggc aagttcataa attttgacat
360tcgcttggtt ggcaatcaat taaatcctat tctaaaattc ttgcctaggt tttgatataa
420catgccctat attttggtct actcaaattt tggtatggta aattttgaac accaacaaat
480caggctatta tttatcttat ctctttctca atttcattac acagcaaggc agtaattaaa
540aggaccgtat atacaatgga tgtaagaata aaatgtataa gtagaaatat attggcatgc
600ctcgtgctgg tgcatgtcga tatgctctca attagaagtt ggagacaggt tatgcttagg
660atagtcccaa cctatgatat ctgtgtgtct atactgccac ataagtaaga catcacttta
720gaaattacat tctacaacct ataatttctt agtgtggatc cttaattaat tcatcatctc
780tcctctcaat tcctcatcaa ttatgaagac accatcttct tccaatgcaa atttaacact
840gtctaggatc taggttcagg tgttgatact gggtcttgca tgagatccag tttcttgttc
900ttccaattct ctctcattta atatataatc acataagcaa aagatcctat gtagctgcac
960aattaatgct atggaaacta tcctaatcgg agggttggga ctgctcctgc ctatggcggc
1020ttattcccca tttgcctaac ctgaaaatcg aaagggagtg catgacaggg caaacactag
1080tgttgcctgc atcaataatc gtccatgatt atatagaggt agcatgactt ttttaggcgt
1140cgtgtcctaa tcaatcagaa aagaaagcca acctaatcgc tatgggccgc aaccaccgat
1200gcgactatgc gagtatatgg aacccgttgc tactccccca ctatatatcg tggagtctga
1260tggcaatcca acggcagacg
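1280
SEQ ID NO: 29; length: 51; type: DNA; organism: Artificial Sequence; other information: Promoter sequence with specific expression pattern.
tatatcatcg ttctctctat aaactttata gaactttgtt ctgattttct c 51
SEQ ID NO: 30; length: 734; type: PRT; organism: Arabidopsis thaliana
Met Lys Gly Ala Thr Leu Val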
Ala Leu Ala Ala Thr Ile Gly Asn Phe1 5 10
15Leu Gln Gly Trp Asp Asn Ala Thr Ile Ala Gly Ala Met
Val Tyr Ile 20 25 30Asn Lys
Asp Leu Asn Leu Pro Thr Ser Val Gln Gly Leu Val Val Ala 35
40 45Met Ser Leu Ile Gly Ala Thr Val Ile Thr
Thr Cys Ser Gly Pro Ile 50 55 60Ser
Asp Trp Leu Gly Arg Arg Pro Met Leu Ile Leu Ser Ser Val Met65
70 75 80Tyr Phe Val Cys Gly Leu
Ile Met Leu Trp Ser Pro Asn Val Tyr Val 85
90 95Leu Cys Phe Ala Arg Leu Leu Asn Gly Phe Gly Ala
Gly Leu Ala Val 100 105 110Thr
Leu Val Pro Val Tyr Ile Ser Glu Thr Ala Pro Pro Glu Ile Arg 115
120 125Gly Gln Leu Asn Thr Leu Pro Gln Phe
Leu Gly Ser Gly Gly Met Phe 130 135
140Leu Ser Tyr Cys Met Val Phe Thr Met Ser Leu Ser Asp Ser Pro Ser145
150 155 160Trp Arg Ala Met
Leu Gly Val Leu Ser Ile Pro Ser Leu Leu Tyr Leu 165
170 175Phe Leu Thr Val Phe Tyr Leu Pro Glu Ser
Pro Arg Trp Leu Val Ser 180 185
190Lys Gly Arg Met Asp Glu Ala Lys Arg Val Leu Gln Gln Leu Cys Gly
195 200 205Arg Glu Asp Val Thr Asp Glu
Met Ala Leu Leu Val Glu Gly Leu Asp 210 215
220Ile Gly Gly Glu Lys Thr Met Glu Asp Leu Leu Val Thr Leu Glu
Asp225 230 235 240His Glu
Gly Asp Asp Thr Leu Glu Thr Val Asp Glu Asp Gly Gln Ile
245 250 255Arg Leu Tyr Gly Thr His Glu
Asn Gln Ser Tyr Leu Ala Arg Pro Val 260 265
270Pro Glu Gln Asn Ser Ser Leu Gly Leu Arg Ser Arg His Gly
Ser Leu 275 280 285Ala Asn Gln Ser
Met Ile Leu Lys Asp Pro Leu Val Asn Leu Phe Gly 290
295 300Ser Leu His Glu Lys Met Pro Glu Ala Gly Gly Asn
Thr Arg Ser Gly305 310 315
320Ile Phe Pro His Phe Gly Ser Met Phe Ser Thr Thr Ala Asp Ala Pro
325 330 335His Gly Lys Pro Ala
His Trp Glu Lys Asp Ile Glu Ser His Tyr Asn 340
345 350Lys Asp Asn Asp Asp Tyr Ala Thr Asp Asp Gly Ala
Gly Asp Asp Asp 355 360 365Asp Ser
Asp Asn Asp Leu Arg Ser Pro Leu Met Ser Arg Gln Thr Thr 370
375 380Ser Met Asp Lys Asp Met Ile Pro His Pro Thr
Ser Gly Ser Thr Leu385 390 395
400Ser Met Arg Arg His Ser Thr Leu Met Gln Gly Asn Gly Glu Ser Ser
405 410 415Met Gly Ile Gly
Gly Gly Trp His Met Gly Tyr Arg Tyr Glu Asn Asp 420
425 430Glu Tyr Lys Arg Tyr Tyr Leu Lys Glu Asp Gly
Ala Glu Ser Arg Arg 435 440 445Gly
Ser Ile Ile Ser Ile Pro Gly Gly Pro Asp Gly Gly Gly Ser Tyr 450
455 460Ile His Ala Ser Ala Leu Val Ser Arg Ser
Val Leu Gly Pro Lys Ser465 470 475
480Val His Gly Ser Ala Met Val Pro Pro Glu Lys Ile Ala Ala Ser
Gly 485 490 495Pro Leu Trp
Ser Ala Leu Leu Glu Pro Gly Val Lys Arg Ala Leu Val 500
505 510Val Gly Val Gly Ile Gln Ile Leu Gln Gln
Phe Ser Gly Ile Asn Gly 515 520
525Val Leu Tyr Tyr Thr Pro Gln Ile Leu Glu Arg Ala Gly Val Asp Ile 530
535 540Leu Leu Ser Ser Leu Gly Leu Ser
Ser Ile Ser Ala Ser Phe Leu Ile545 550
555 560Ser Gly Leu Thr Thr Leu Leu Met Leu Pro Ala Ile
Val Val Ala Met 565 570
575Arg Leu Met Asp Val Ser Gly Arg Arg Ser Leu Leu Leu Trp Thr Ile
580 585 590Pro Val Leu Ile Val Ser
Leu Val Val Leu Val Ile Ser Glu Leu Ile 595 600
605His Ile Ser Lys Val Val Asn Ala Ala Leu Ser Thr Gly Cys
Val Val 610 615 620Leu Tyr Phe Cys Phe
Phe Val Met Gly Tyr Gly Pro Phe Gln Thr Ser625 630
635 640Ser Val Leu Lys Ser Ser Gln Gln Ala Asp
Arg Gly Leu Cys Ile Ala 645 650
655Ile Cys Ala Met Val Phe Trp Ile Gly Asp Ile Ile Val Thr Tyr Ser
660 665 670Leu Pro Val Leu Leu
Ser Ser Ile Glu Leu Val Gly Val Phe Ser Ile 675
680 685Tyr Ala Ala Val Cys Val Ile Ser Trp Ile Phe Val
Tyr Met Lys Val 690 695 700Pro Glu Thr
Lys Gly Met Pro Leu Glu Val Ile Thr Asp Tyr Phe Ala705
710 715 720Phe Gly Ala Gln Ala Gln Ala
Ser Ala Pro Ser Lys Asp Ile 725
730
SEQ ID NO: 31; length: 257; type: PRT; organism: Glycine max
Met Ala Asp Ser Ser Arg Glu Glu Asn Val Tyr Met
Ala Lys Leu Ala1 5 10
15Glu Gln Ala Glu Arg Tyr Glu Glu Met Val Glu Phe Met Glu Lys Val
20 25 30Ala Lys Thr Val Glu Val Glu
Glu Leu Thr Val Glu Glu Arg Asn Leu 35 40
45Leu Ser Val Ala Tyr Lys Asn Val Ile Gly Ala Arg Arg Ala Ser
Trp 50 55 60Arg Ile Ile Ser Ser Ile
Glu Gln Lys Glu Glu Ser Arg Gly Asn Glu65 70
75 80Asp His Val Ala Ile Ile Lys Glu Tyr Arg Gly
Lys Ile Glu Ala Glu 85 90
95Leu Ser Lys Ile Cys Asp Gly Ile Leu Asn Leu Leu Glu Ser Asn Leu
100 105 110Ile Pro Ser Ala Ala Ser
Pro Glu Ser Lys Val Phe Tyr Leu Lys Met 115 120
125Lys Gly Asp Tyr His Arg Tyr Leu Ala Glu Phe Lys Thr Gly
Ala Glu 130 135 140Arg Lys Gly Ala Ala
Glu Ser Thr Leu Leu Ala Tyr Lys Ser Ala Gln145 150
155 160Asp Ile Ala Leu Ala Asp Leu Ala Pro Thr
His Pro Ile Arg Leu Gly 165 170
175Leu Ala Leu Asn Phe Ser Val Phe Tyr Tyr Glu Ile Leu Asn Ser Pro
180 185 190Asp Arg Ala Cys Asn
Leu Ala Lys Gln Ala Phe Asp Glu Ala Ile Ser 195
200 205Glu Leu Asp Thr Leu Gly Glu Glu Ser Tyr Lys Asp
Ser Thr Leu Ile 210 215 220Met Gln Leu
Phe Arg Asp Asn Leu Thr Leu Trp Thr Ser Asp Ile Thr225
230 235 240Asp Asp Ala Gly Asp Glu Ile
Lys Glu Thr Phe Lys Gln Gln Pro Gly 245
250 255Glu
SEQ ID NO: 32; length: 283; type: PRT; organism: Arabidopsis thaliana
Met Asn Gly Gly
Glu Lys Leu Glu Ser Ile Pro Ile Asp Leu Ile Ile1 5
10 15Glu Ile His Ser Arg Leu Pro Ala Glu Ser
Val Ala Arg Phe Arg Cys 20 25
30Val Ser Lys Leu Trp Gly Ser Met Phe Arg Arg Pro Tyr Phe Thr Glu
35 40 45Leu Phe Leu Thr Arg Ser Arg Ala
Arg Pro Arg Leu Leu Phe Val Leu 50 55
60Gln His Asn Arg Lys Trp Ser Phe Ser Val Phe Ser Ser Pro Gln Asn65
70 75 80Gln Asn Ile Tyr Glu
Lys Pro Ser Phe Val Val Ala Asp Phe His Met 85
90 95Lys Phe Ser Val Ser Thr Phe Pro Asp Phe His
Ser Cys Ser Gly Leu 100 105
110Ile His Phe Ser Met Met Lys Gly Ala Tyr Thr Val Pro Val Val Cys
115 120 125Asn Pro Arg Thr Gly Gln Tyr
Ala Val Leu Pro Lys Leu Thr Arg Thr 130 135
140Arg Tyr Glu Asn Ser Tyr Ser Phe Val Gly Tyr Asp Pro Ile Glu
Lys145 150 155 160Gln Ile
Lys Val Leu Phe Met Ser Asp Pro Asp Ser Gly Asp Asp His
165 170 175Arg Ile Leu Thr Leu Gly Thr
Thr Glu Lys Met Leu Gly Arg Lys Ile 180 185
190Glu Cys Ser Leu Thr His Asn Ile Leu Ser Asn Glu Gly Val
Cys Ile 195 200 205Asn Gly Val Leu
Tyr Tyr Lys Ala Ser Arg Ile Val Glu Ser Ser Ser 210
215 220Asp Asp Asp Thr Ser Asp Asp Asp Asp Asp Asp His
Glu Arg Ser Asp225 230 235
240Val Ile Val Cys Phe Asp Phe Arg Cys Glu Lys Phe Glu Phe Ile Val
245 250 255Ile Cys Phe Tyr Gly
Gln Leu Ile Asn Ser Val Gln Leu Ser Leu Lys 260
265 270Gln Lys Ser His Gln Lys Leu Asp Leu Ile Ile
275 280
SEQ ID NO: 33; length: 240; type: PRT; organism: Agrobacterium tumefaciens
Met Asp Leu
Arg Leu Ile Phe Gly Pro Thr Cys Thr Gly Lys Thr Ser1 5
10 15Thr Ala Val Ala Leu Ala Gln Gln Thr
Gly Leu Pro Val Leu Ser Leu 20 25
30Asp Arg Val Gln Cys Cys Pro Gln Leu Ser Thr Gly Ser Gly Arg Pro
35 40 45Thr Val Glu Glu Leu Lys Gly
Thr Ser Arg Leu Tyr Leu Asp Asp Arg 50 55
60Pro Leu Val Lys Gly Ile Ile Ala Ala Lys Gln Ala His Glu Arg Leu65
70 75 80Met Gly Glu Val
Tyr Asn Tyr Glu Ala His Gly Gly Leu Ile Leu Glu 85
90 95Gly Gly Ser Ile Ser Leu Leu Lys Cys Met
Ala Gln Ser Ser Tyr Trp 100 105
110Ser Ala Asp Phe Arg Trp His Ile Ile Arg His Glu Leu Ala His Glu
115 120 125Glu Thr Phe Met Asn Val Ala
Lys Ala Arg Val Lys Gln Met Leu Arg 130 135
140Pro Ala Ser Gly Leu Ser Ile Ile Gln Glu Leu Val Asp Leu Trp
Lys145 150 155 160Glu Pro
Arg Leu Arg Arg Ile Leu Lys Glu Ile Asp Gly Tyr Arg Tyr
165 170 175Ala Met Leu Phe Val Ser Gln
Asn Gln Ile Thr Ser Asp Met Leu Leu 180 185
190Gln Leu Asp Ala Asp Met Glu Asp Lys Leu Ile His Gly Ile
Ala Gln 195 200 205Glu Tyr Leu Ile
His Ala Arg Arg Gln Glu Gln Lys Phe Pro Arg Val 210
215 220Asn Ala Ala Ala Tyr Asp Gly Phe Glu Gly His Pro
Phe Gly Met Tyr225 230 235
240
SEQ ID NO: 34; length: 240; type: PRT; organism: Agrobacterium tumefaciens
Met Asp Leu Arg Leu Ile Phe Gly
Pro Thr Cys Thr Gly Lys Thr Ser1 5 10
15Thr Ala Val Ala Leu Ala Gln Gln Thr Gly Leu Pro Val Leu
Ser Leu 20 25 30Asp Arg Val
Gln Cys Cys Pro Gln Leu Ser Thr Gly Ser Gly Arg Pro 35
40 45Thr Val Glu Glu Leu Lys Gly Thr Ser Arg Leu
Tyr Leu Asp Asp Arg 50 55 60Pro Leu
Val Lys Gly Ile Ile Ala Ala Lys Gln Ala His Glu Arg Leu65
70 75 80Met Gly Glu Val Tyr Asn Tyr
Glu Ala His Gly Gly Leu Ile Leu Glu 85 90
95Gly Gly Ser Ile Ser Leu Leu Lys Cys Met Ala Gln Ser
Ser Tyr Trp 100 105 110Ser Ala
Asp Phe Arg Trp His Ile Ile Arg His Glu Leu Ala Asp Glu 115
120 125Glu Thr Phe Met Asn Val Ala Lys Ala Arg
Val Lys Gln Met Leu Arg 130 135 140Pro
Ala Ala Gly Leu Ser Ile Ile Gln Glu Leu Val Asp Leu Trp Lys145
150 155 160Glu Pro Arg Leu Arg Pro
Ile Leu Lys Glu Ile Asp Gly Tyr Arg Tyr 165
170 175Ala Met Leu Phe Ala Ser Gln Asn Gln Ile Thr Ser
Asp Met Leu Leu 180 185 190Gln
Leu Asp Ala Asp Met Glu Asp Lys Leu Ile His Gly Ile Ala Gln 195
200 205Glu Tyr Leu Ile His Ala Arg Arg Gln
Glu Gln Lys Phe Pro Arg Val 210 215
220Asn Ala Ala Ala Tyr Asp Gly Phe Glu Gly His Pro Phe Gly Met Tyr225
230 235 240
SEQ ID NO: 35; length: 624; type: PRT; organism: Zea mays
Met Ala Ser Ser Leu Leu Ser Ser Pro Ala Lys Pro Thr Ile Thr Thr1
5 10 15Thr Thr Lys Thr Thr Pro
Ala Pro Arg Pro Ala Arg Ser Ala His Val 20 25
30His Val Leu Ser Ala Ala Arg Cys Leu Arg Leu Arg Leu
Arg Ala Ser 35 40 45Ser Gln His
Pro Pro Pro Pro Pro Thr Pro Arg Ser Arg Arg Pro Glu 50
55 60Tyr Val Pro Asn Arg Ile Asp Asp Pro Asn Tyr Val
Arg Ile Phe Asp65 70 75
80Thr Thr Leu Arg Asp Gly Glu Gln Ser Pro Gly Ala Thr Met Thr Ser
85 90 95Ala Gln Lys Leu Val Val
Ala Arg Gln Leu Ala Arg Leu Gly Val Asp 100
105 110Ile Ile Glu Ala Gly Phe Pro Ala Ser Ser Pro Asp
Asp Leu Asp Ala 115 120 125Val Arg
Ser Ile Ala Ile Glu Val Gly Asn Pro Ala Pro Ala Gly Glu 130
135 140Asp Ala Ala Val His Val Pro Val Ile Cys Gly
Leu Ser Arg Cys Asn145 150 155
160Arg Arg Asp Ile Asp Ala Ala Trp Glu Ala Val Arg His Ala Arg Arg
165 170 175Pro Arg Ile His
Thr Phe Ile Ala Thr Ser Asp Ile His Met Gln His 180
185 190Lys Leu Arg Lys Thr Pro Asp Gln Val Val Ala
Ile Ala Arg Glu Met 195 200 205Val
Ala Tyr Ala Arg Ser Leu Gly Cys Thr Asp Val Glu Phe Ser Pro 210
215 220Glu Asp Ala Gly Arg Ser Asn Arg Glu Phe
Leu Tyr His Ile Leu Gly225 230 235
240Glu Val Ile Lys Ala Gly Ala Thr Thr Leu Asn Ile Pro Asp Thr
Val 245 250 255Gly Tyr Asn
Leu Pro Tyr Glu Phe Gly Lys Leu Ile Ala Asp Ile Lys 260
265 270Ala Asn Thr Pro Gly Ile Glu Lys Ala Ile
Ile Ser Thr His Cys Gln 275 280
285Asn Asp Leu Gly Leu Ala Thr Ala Asn Thr Leu Ala Gly Ala Arg Ala 290
295 300Gly Ala Arg Gln Leu Glu Val Thr
Ile Asn Gly Ile Gly Glu Arg Ala305 310
315 320Gly Asn Ala Ser Leu Glu Glu Val Val Met Ala Ile
Lys Cys Arg Arg 325 330
335Glu Leu Leu Asp Gly Leu Tyr Thr Gly Ile Asp Ser Arg His Ile Thr
340 345 350Leu Thr Ser Lys Met Val
Gln Glu His Ser Gly Leu His Val Gln Pro 355 360
365His Lys Ala Ile Val Gly Ala Asn Ala Phe Ala His Glu Ser
Gly Ile 370 375 380His Gln Asp Gly Met
Leu Lys Tyr Lys Gly Thr Tyr Glu Ile Ile Ser385 390
395 400Pro Asp Asp Ile Gly Leu Thr Arg Ala Asn
Glu Phe Gly Ile Val Leu 405 410
415Gly Lys Leu Ser Gly Arg His Ala Val Arg Ser Lys Leu Val Glu Leu
420 425 430Gly Tyr Glu Ile Gly
Asp Lys Glu Phe Glu Asp Phe Phe Lys Arg Tyr 435
440 445Lys Glu Val Ala Glu Lys Lys Lys Arg Val Thr Asp
Glu Asp Leu Glu 450 455 460Ala Leu Leu
Ser Asp Glu Ile Phe Gln Pro Lys Val Ile Trp Ser Leu465
470 475 480Ala Asp Val Gln Ala Thr Cys
Gly Thr Leu Ala Leu Ser Thr Ala Thr 485
490 495Val Lys Leu Ile Ala Pro Asp Gly Glu Glu Lys Ile
Ala Cys Ser Val 500 505 510Gly
Thr Gly Pro Val Asp Ala Ala Tyr Lys Ala Val Asp Lys Ile Ile 515
520 525Gln Ile Pro Thr Val Leu Arg Glu Tyr
Ser Met Thr Ser Val Thr Glu 530 535
540Gly Ile Asp Ala Ile Ala Thr Thr Arg Val Val Val Thr Gly Asp Val545
550 555 560Ser Asn Asn Ala
Lys His Ala Leu Thr Gly Gln Ser Phe Asn Arg Ser 565
570 575Phe Ser Gly Ser Gly Ala Ser Met Asp Ile
Val Val Ser Ser Val Arg 580 585
590Ala Tyr Leu Ser Ala Leu Asn Lys Ile Cys Ser Phe Ala Gly Ala Val
595 600 605Lys Ala Ser Ser Asp Val Ala
Glu Thr Ala Ser Val Pro Ser Thr Glu 610 615
620
SEQ ID NO: 36; length: 240; type: PRT; organism: Agrobacterium tumefaciens
Met Asp Leu Arg Leu Ile Phe
Gly Pro Thr Cys Thr Gly Lys Thr Ser1 5 10
15Thr Ala Val Ala Leu Ala Gln Gln Thr Gly Leu Pro Val
Leu Ser Leu 20 25 30Asp Arg
Val Gln Cys Cys Pro Gln Leu Ser Thr Gly Ser Gly Arg Pro 35
40 45Thr Val Glu Glu Leu Lys Gly Thr Ser Arg
Leu Tyr Leu Asp Asp Arg 50 55 60Pro
Leu Val Lys Gly Ile Ile Ala Ala Lys Gln Ala His Glu Arg Leu65
70 75 80Met Gly Glu Val Tyr Asn
Tyr Glu Ala His Gly Gly Leu Ile Leu Glu 85
90 95Gly Gly Ser Ile Ser Leu Leu Lys Cys Met Ala Gln
Ser Ser Tyr Trp 100 105 110Ser
Ala Asp Phe Arg Trp Asp Ile Ile Arg His Glu Leu Ala Asp Glu 115
120 125Glu Thr Phe Met Asn Val Ala Lys Ala
Arg Val Lys Gln Met Leu Arg 130 135
140Pro Ala Ala Gly Leu Ser Ile Ile Gln Glu Leu Val Asp Leu Trp Lys145
150 155 160Glu Pro Arg Leu
Arg Pro Ile Leu Lys Glu Ile Asp Gly Tyr Arg Tyr 165
170 175Ala Met Leu Phe Ala Ser Gln Asn Gln Ile
Thr Ser Asp Met Leu Leu 180 185
190Gln Leu Asp Ala Asp Met Glu Asp Lys Leu Ile His Gly Ile Ala Gln
195 200 205Glu Tyr Leu Ile His Ala Arg
Arg Gln Glu Gln Lys Phe Pro Arg Val 210 215
220Asn Ala Ala Ala Tyr Asp Gly Phe Glu Gly His Pro Phe Gly Met
Tyr225 230 235
240
SEQ ID NO: 37; length: 257; type: PRT; organism: Glycine max
Met Ser Asp Ser Ser Arg Glu Glu Asn Val Tyr Met
Ala Lys Leu Ala1 5 10
15Glu Gln Ala Glu Arg Tyr Glu Glu Met Val Glu Phe Met Glu Lys Val
20 25 30Ala Lys Thr Val Glu Val Glu
Glu Leu Thr Val Glu Glu Arg Asn Leu 35 40
45Leu Ser Val Ala Tyr Lys Asn Val Ile Gly Ala Arg Arg Ala Ser
Trp 50 55 60Arg Ile Ile Ser Ser Ile
Glu Gln Lys Glu Glu Ser Arg Gly Asn Glu65 70
75 80Asp His Val Ala Ile Ile Lys Glu Tyr Arg Gly
Lys Ile Glu Ala Glu 85 90
95Leu Ser Lys Ile Cys Asp Gly Ile Leu Asn Leu Leu Glu Ser Asn Leu
100 105 110Ile Pro Ser Ala Ala Ser
Pro Glu Ser Lys Val Phe Tyr Leu Lys Met 115 120
125Lys Gly Asp Tyr His Arg Tyr Leu Ala Glu Phe Lys Thr Gly
Ala Glu 130 135 140Arg Lys Glu Ala Ala
Glu Ser Thr Leu Leu Ala Tyr Lys Ser Ala Gln145 150
155 160Asp Ile Ala Leu Ala Asp Leu Ala Pro Thr
His Pro Ile Arg Leu Gly 165 170
175Leu Ala Leu Asn Phe Ser Val Phe Tyr Tyr Glu Ile Leu Asn Ser Pro
180 185 190Asp Arg Ala Cys Asn
Leu Ala Lys Gln Ala Phe Asp Glu Ala Ile Ser 195
200 205Glu Leu Asp Thr Leu Gly Glu Glu Ser Tyr Lys Asp
Ser Thr Leu Ile 210 215 220Met Gln Leu
Phe Arg Asp Asn Leu Thr Leu Trp Thr Ser Asp Ile Thr225
230 235 240Asp Asp Ala Gly Asp Glu Ile
Lys Glu Thr Phe Lys Arg Gln Pro Gly 245
250 255Glu
SEQ ID NO: 38; length: 734; type: PRT; organism: Arabidopsis thaliana
Met Lys Gly Ala
Thr Leu Val Ala Leu Ala Ala Thr Ile Gly Asn Phe1 5
10 15Leu Gln Gly Trp Asp Asn Ala Thr Ile Ala
Gly Ala Met Val Tyr Ile 20 25
30Asn Lys Asp Leu Asn Leu Pro Thr Ser Val Gln Gly Leu Val Val Ala
35 40 45Met Ser Leu Ile Gly Ala Thr Val
Ile Thr Thr Cys Ser Gly Pro Ile 50 55
60Ser Asp Trp Leu Gly Arg Arg Pro Met Leu Ile Leu Ser Ser Val Met65
70 75 80Tyr Phe Val Cys Gly
Leu Ile Met Leu Trp Ser Pro Asn Val Tyr Val 85
90 95Leu Cys Phe Ala Arg Leu Leu Asn Gly Phe Gly
Ala Gly Leu Ala Val 100 105
110Thr Leu Val Pro Val Tyr Ile Ser Glu Thr Ala Pro Pro Glu Ile Arg
115 120 125Gly Gln Leu Asn Thr Leu Pro
Gln Phe Leu Gly Ser Gly Gly Met Phe 130 135
140Leu Ser Tyr Cys Met Val Phe Thr Met Ser Leu Ser Asp Ser Pro
Ser145 150 155 160Trp Arg
Ala Met Leu Gly Val Leu Ser Ile Pro Ser Leu Leu Tyr Leu
165 170 175Phe Leu Thr Val Phe Tyr Leu
Pro Glu Ser Pro Arg Trp Leu Val Ser 180 185
190Lys Gly Arg Met Asp Glu Ala Lys Arg Val Leu Gln Gln Leu
Cys Gly 195 200 205Arg Glu Asp Val
Thr Gly Lys Met Ala Leu Leu Val Glu Gly Leu Asp 210
215 220Ile Gly Gly Glu Lys Thr Met Glu Asp Leu Leu Val
Thr Leu Glu Asp225 230 235
240His Glu Gly Asp Asp Thr Leu Glu Thr Val Asp Glu Asp Gly Gln Met
245 250 255Arg Leu Tyr Gly Thr
His Glu Asn Gln Ser Tyr Leu Ala Arg Pro Val 260
265 270Pro Glu Gln Asn Ser Ser Leu Gly Leu Arg Ser Arg
His Gly Ser Leu 275 280 285Ala Asn
Gln Ser Met Ile Leu Lys Asp Pro Leu Val Asn Leu Phe Gly 290
295 300Ser Leu His Glu Lys Met Pro Glu Ala Gly Gly
Asn Thr Arg Ser Gly305 310 315
320Ile Phe Pro His Phe Gly Ser Met Phe Ser Thr Thr Ala Asp Ala Pro
325 330 335His Gly Lys Pro
Ala His Trp Glu Lys Asp Ile Glu Ser His Tyr Asn 340
345 350Lys Asp Asn Asp Asp Tyr Ala Thr Asp Asp Gly
Ala Gly Asp Asp Asp 355 360 365Asp
Ser Asp Asn Asp Leu Arg Ser Pro Leu Met Ser Arg Gln Thr Thr 370
375 380Ser Met Asp Lys Asp Met Ile Pro His Pro
Thr Ser Gly Ser Thr Leu385 390 395
400Ser Met Arg Arg His Ser Thr Leu Met Gln Gly Asn Gly Glu Ser
Ser 405 410 415Met Gly Ile
Gly Gly Gly Trp His Met Gly Tyr Arg Tyr Glu Asn Asp 420
425 430Glu Tyr Lys Arg Tyr Tyr Leu Lys Glu Asp
Gly Ala Glu Ser Arg Arg 435 440
445Gly Ser Ile Ile Ser Ile Pro Gly Gly Pro Asp Gly Gly Gly Ser Tyr 450
455 460Ile His Ala Ser Ala Leu Val Ser
Arg Ser Val Leu Gly Pro Lys Ser465 470
475 480Val His Gly Ser Ala Met Val Pro Pro Glu Lys Ile
Ala Ala Ser Gly 485 490
495Pro Leu Trp Ser Ala Leu Leu Glu Pro Gly Val Lys Arg Ala Leu Val
500 505 510Val Gly Val Gly Ile Gln
Ile Leu Gln Gln Phe Ser Gly Ile Asn Gly 515 520
525Val Leu Tyr Tyr Thr Pro Gln Ile Leu Glu Arg Ala Gly Val
Asp Ile 530 535 540Leu Leu Ser Ser Leu
Gly Leu Ser Ser Ile Ser Ala Ser Phe Leu Ile545 550
555 560Ser Gly Leu Thr Thr Leu Leu Met Leu Pro
Ala Ile Val Val Ala Met 565 570
575Arg Leu Met Asp Val Ser Gly Arg Arg Ser Leu Leu Leu Trp Thr Ile
580 585 590Pro Val Leu Ile Val
Ser Leu Val Val Leu Val Ile Ser Glu Leu Ile 595
600 605His Ile Ser Lys Val Val Asn Ala Ala Leu Ser Thr
Gly Cys Val Val 610 615 620Leu Tyr Phe
Cys Phe Phe Val Met Gly Tyr Gly Pro Ile Pro Asn Ile625
630 635 640Leu Cys Ser Glu Ile Phe Pro
Thr Arg Val Arg Gly Leu Cys Ile Ala 645
650 655Ile Cys Ala Met Val Phe Trp Ile Gly Asp Ile Ile
Val Thr Tyr Ser 660 665 670Leu
Pro Val Leu Leu Ser Ser Ile Gly Leu Val Gly Val Phe Ser Ile 675
680 685Tyr Ala Ala Val Cys Val Ile Ser Trp
Ile Phe Val Tyr Met Lys Val 690 695
700Pro Glu Thr Lys Gly Met Pro Leu Glu Val Ile Thr Asp Tyr Phe Ala705
710 715 720Phe Gly Ala Gln
Ala Gln Ala Ser Ala Pro Ser Lys Asp Ile 725
730
SEQ ID NO: 39; length: 257; type: PRT; organism: Glycine max
Met Ser Asp Ser Ser Arg Glu Glu Asn Val Tyr Met
Ala Lys Leu Ala1 5 10
15Asp Glu Ala Glu Arg Tyr Glu Glu Met Val Glu Phe Met Glu Lys Val
20 25 30Ala Lys Thr Val Glu Val Glu
Glu Leu Thr Val Glu Glu Arg Asn Leu 35 40
45Leu Ser Val Ala Tyr Lys Asn Val Ile Gly Ala Arg Arg Ala Ser
Trp 50 55 60Arg Ile Ile Ser Ser Ile
Glu Gln Lys Glu Glu Ser Arg Gly Asn Glu65 70
75 80Asp His Val Ala Ile Ile Lys Glu Tyr Arg Gly
Lys Ile Glu Ala Glu 85 90
95Leu Ser Lys Ile Cys Asp Gly Ile Leu Asn Leu Leu Glu Ser Asn Leu
100 105 110Ile Pro Ser Ala Ala Ser
Pro Glu Ser Lys Val Phe Tyr Leu Lys Met 115 120
125Lys Gly Asp Tyr His Arg Tyr Leu Ala Glu Phe Lys Thr Gly
Ala Glu 130 135 140Arg Lys Glu Ala Ala
Glu Ser Thr Leu Leu Ala Tyr Lys Ser Ala Gln145 150
155 160Asp Ile Ala Leu Ala Asp Leu Ala Pro Thr
His Pro Ile Arg Leu Gly 165 170
175Leu Ala Leu Asn Phe Ser Val Phe Tyr Tyr Glu Ile Leu Asn Ser Pro
180 185 190Asp Arg Ala Cys Asn
Leu Ala Lys Gln Ala Phe Asp Glu Ala Ile Ser 195
200 205Glu Leu Asp Thr Leu Gly Glu Glu Ser Tyr Lys Asp
Ser Thr Leu Ile 210 215 220Met Gln Leu
Leu Arg Asp Asn Leu Thr Leu Trp Thr Ser Asp Ile Thr225
230 235 240Asp Ile Ala Gly Asp Glu Ile
Lys Glu Thr Ser Lys Gln Gln Pro Gly 245
250 255Glu