Patent application title: PLANTS HAVING ENHANCED YIELD-RELATED TRAITS AND A METHOD FOR MAKING THE SAME
Inventors:
Yves Hatzfeld (Lille, FR)
Valerie Frankard (Waterloo, BE)
Steven Vandenabeele (Oudenaarde, BE)
IPC8 Class: AC12N1582FI
USPC Class:
800290
Class name: Multicellular living organisms and unmodified parts thereof and related processes method of introducing a polynucleotide molecule into or rearrangement of genetic material within a plant or plant part the polynucleotide alters plant part growth (e.g., stem or tuber length, etc.)
Publication date: 2014-11-20
Patent application number: 20140345002
Abstract:
The present invention relates generally to the field of molecular biology
and concerns a method for enhancing various plant yield-related traits by
modulating expression in a plant of a nucleic acid encoding a RHL1 (root
hairless 1) polypeptide, a TGase (transglutaminase) polypeptide, a
TRY-like (tryptichon) polypeptide, or a BZR (brassinazole-resistant)
polypeptide. Constructs useful for performing the methods, as well as
plants having enhanced various plant yield-related traits thus obtained,
are also provided.
Claims:
1. A method for enhancing yield-related traits in a plant relative to a
control plant, comprising: (a) modulating expression in a plant of a
nucleic acid encoding a Root Hairless 1 (RHL1) polypeptide, a
transglutaminase (TGase) polypeptide, a TRY-like (tryptichon)
polypeptide, or a Brassinazole-Resistant (BZR) polypeptide; and (b)
optionally selecting for a plant having enhanced yield-related traits
relative to a control plant, wherein: (i) said RHL1 polypeptide comprises
one or more of Motif 9 of SEQ ID NO: 37, Motif 10 of SEQ ID NO: 38, and
Motif 11 of SEQ ID NO: 39; (ii) said TGase polypeptide comprises a
plastidic transit peptide and has at least 50% sequence identity to the
amino acid sequence of SEQ ID NO: 27; (iii) said TRY-like polypeptide
comprises a Myb-like DNA-binding domain and/or one or more of motifs of
SEQ ID NO: 229 to SEQ ID NO: 232; or (iv) said BZR polypeptide comprises
a domain having at least 50% sequence identity to the domain located
between amino acid coordinates 10-157 in SEQ ID NO: 239, a domain having
at least 50% sequence identity to a bHLH-like domain comprising the amino
acid sequence of SEQ ID NO: 326, and/or a motif comprising the amino acid
sequence of SEQ ID NO: 323, 324, or 325.
2. The method of claim 1, wherein said modulated expression is effected by introducing and expressing in the plant said nucleic acid.
3. The method of claim 1, wherein: (a) said nucleic acid encoding a RHL1 polypeptide encodes any one of the proteins listed in Table A1, or is a portion of such a nucleic acid, or a nucleic acid capable of hybridizing with such a nucleic acid or the complement thereof; (b) said nucleic acid encoding a TGase polypeptide encodes the TGase polypeptide of SEQ ID NO: 45 or any one of the proteins listed in Table A2, or is a portion of such a nucleic acid, or a nucleic acid capable of hybridizing with such a nucleic acid or the complement thereof; (c) said nucleic acid encoding a TRY-like polypeptide encodes any one of the proteins listed in Table A3, or is a portion of such a nucleic acid, or a nucleic acid capable of hybridizing with such a nucleic acid; or (d) said nucleic acid encoding a BZR polypeptide encodes any one of the proteins listed in Table A4, or is a portion of such a nucleic acid, or a nucleic acid capable of hybridizing with such a nucleic acid.
4. The method of claim 1, wherein: (a) said nucleic acid encodes a RHL1 polypeptide, and wherein said enhanced yield-related traits comprise increased seed yield relative to a control plant; (b) said nucleic acid encodes a TGase polypeptide, and wherein said enhanced yield-related traits comprise increased total seed yield per plant, increased number of filled seeds, and/or increased harvest index relative to a control plant; (c) said nucleic acid encodes a TRY-like polypeptide, and wherein said enhanced yield-related traits comprise increased emergence vigour and/or increased yield relative to a control plant; or (d) said nucleic acid encodes a BZR polypeptide, and wherein said enhanced yield-related traits comprise increased total weight of the seed, increased number of filled seed and increased thousand kernel weight relative to a control plant.
5. The method of claim 1, wherein: (a) said nucleic acid encodes a RHL1 polypeptide, and wherein said enhanced yield-related traits are obtained under cultivation conditions of nitrogen deficiency; (b) said nucleic acid encodes a TRY-like polypeptide, and wherein said enhanced yield-related traits are obtained under non-stress conditions or under conditions of nitrogen deficiency; or (c) said nucleic acid encodes a BZR polypeptide, and wherein said enhanced yield-related traits are obtained under non-stress conditions.
6. The method of claim 1, wherein: (a) said nucleic acid encoding a RHL1 polypeptide is operably linked to a constitutive promoter, a GOS2 promoter, or a GOS2 promoter from rice; (b) said nucleic acid encoding a TGase polypeptide is operably linked to a seed-specific promoter; (c) said nucleic acid encoding a TGase polypeptide is operably linked to an alpha-globulin promoter, a rice alpha-globulin promoter, or an alpha-globulin promoter comprising the nucleotide sequence of SEQ ID NO: 72; (d) said nucleic acid encoding a TRY-like polypeptide is operably linked to a root-specific promoter, a RCc3 promoter, or a RCc3 promoter from rice; (e) said nucleic acid encoding a TRY-like polypeptide is operably linked to a constitutive promoter, a GOS2 promoter, or a GOS2 promoter from rice; or (f) said nucleic acid encoding a BZR polypeptide is operably linked to a constitutive promoter, a GOS2 promoter, or a GOS2 promoter from rice.
7. The method of claim 1, wherein: (a) said nucleic acid encoding a RHL1 polypeptide is of plant origin, from a dicotyledonous plant, from a plant of the family Brassicaceae, or from an Arabidopsis thaliana plant; (b) said nucleic acid encoding a TGase polypeptide is from a plant, from a monocotyledonous plant, from a plant of the family Poaceae, or from an Oryza sativa plant; (c) said nucleic acid encoding a TRY-like polypeptide is of plant origin, from a dicotyledonous plant, from a plant of the family Brassicaceae, from a plant of the genus Arabidopsis, or from an Arabidopsis thaliana plant; or (d) said nucleic acid encoding a BZR polypeptide is of plant origin, from a dicotyledonous plant, from a plant of the family Brassicaceae, or from an Arabidopsis thaliana plant.
8. A plant obtained by the method of claim 1, or a plant part, seed, or progeny of said plant, wherein said plant, or said plant part, seed, or progeny, comprises a recombinant nucleic acid encoding said RHL1 polypeptide, said TGase polypeptide, said TRY-like polypeptide, or said BZR polypeptide.
9. Harvestable parts of the plant of claim 8, wherein said harvestable parts comprise said recombinant nucleic acid and are preferably shoot biomass and/or seeds.
10. Products derived from the plant of claim 8 and/or from harvestable parts of said plant, wherein said products comprise said recombinant nucleic acid.
11. The method of claim 1, wherein the plant is a crop plant, a monocot or a cereal, or wherein said plant is rice, maize, wheat, barley, millet, rye, triticale, sorghum or oats.
12. The method of claim 1, wherein said nucleic acid encodes a BZR polypeptide and comprises: (a) the nucleotide sequence of SEQ ID NO: 238; (b) a nucleotide sequence encoding a polypeptide comprising the amino acid sequence of SEQ ID NO: 239; or (c) a nucleotide sequence encoding a polypeptide having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 239.
13. A construct comprising: (a) a nucleic acid sequence encoding a RHL1 polypeptide, a TGase polypeptide, a TRY-like polypeptide, or a BZR polypeptide as defined in claim 1; (b) one or more control sequences capable of driving expression of the nucleic acid sequence of (a); and optionally (c) a transcription termination sequence.
14. The construct of claim 13, wherein: (a) said nucleic acid encodes a RHL1 polypeptide, and wherein said one or more control sequences comprise a constitutive promoter, a GOS2 promoter, or a GOS2 promoter from rice; (b) said nucleic acid encodes a TGase polypeptide, and wherein said one or more control sequences comprise a seed-specific promoter; (c) said nucleic acid encodes a TGase polypeptide, and wherein said one or more control sequences comprise an alpha-globulin promoter, a rice alpha-globulin promoter, or an alpha-globulin promoter comprising the nucleotide sequence of SEQ ID NO: 72; (d) said nucleic acid encodes a TRY-like polypeptide, and wherein said one or more control sequences comprise a root-specific promoter, a RCc3 promoter, or a RCc3 promoter from rice; (e) said nucleic acid encodes a TRY-like polypeptide, and wherein said one or more control sequences comprise a constitutive promoter, a GOS2 promoter, or a GOS2 promoter from rice; or (f) said nucleic acid encodes a BZR polypeptide, and wherein said one or more control sequences comprise a constitutive promoter, a GOS2 promoter, or a GOS2 promoter from rice.
15. A plant, plant part or plant cell comprising the construct of claim 13.
16. A method for the production of a transgenic plant having increased yield relative to a control plant, comprising: (a) introducing and expressing in a plant or plant cell a nucleic acid encoding a RHL1 polypeptide, a TGase polypeptide, a TRY-like polypeptide, or a BZR polypeptide as defined in claim 1; (b) cultivating the plant or plant cell under conditions promoting plant growth and development; and optionally (c) selecting for a plant having increased yield relative to a control plant.
17. A transgenic plant having increased yield relative to a control plant, resulting from modulated expression of a nucleic acid encoding a RHL1 polypeptide, a TGase polypeptide, a TRY-like polypeptide, or a BZR polypeptide as defined in claim 1, or a transgenic plant cell derived from said transgenic plant.
18. The transgenic plant of claim 17, wherein said plant is a crop plant, a monocot or a cereal, or wherein said plant is rice, maize, wheat, barley, millet, rye, triticale, sorghum or oats.
19. Harvestable parts of the transgenic plant of claim 17, wherein said harvestable parts are preferably shoot biomass and/or seeds.
20. Products derived from the transgenic plant of claim 17 and/or from harvestable parts of said plant.
Description:
RELATED APPLICATIONS
[0001] This application is a continuation of patent application Ser. No. 13/000,067 filed on Dec. 20, 2010, which is a national stage application (under 35 U.S.C. §371) of PCT/EP2009/057722, filed Jun. 22, 2009, which claims benefit of European application 08159089.5, filed Jun. 26, 2008; U.S. Provisional Application 61/075,909, filed Jun. 26, 2008; European Application 08159099.4, filed Jun. 26, 2008; European Application 08159093.7, filed Jun. 26, 2008; U.S. Provisional Application 61/076,178, filed Jun. 27, 2008; U.S. Provisional Application 61/076,158, filed Jun. 27, 2008; European Application 08159746.0, filed Jul. 4, 2008 and U.S. Provisional Application 61/078,471 filed on Jul. 7, 2008. The entire content of each aforementioned application is hereby incorporated by reference in its entirety.
SUBMISSION OF SEQUENCE LISTING
[0002] The Sequence Listing associated with this application is filed in electronic format via EFS-Web and hereby incorporated by reference into the specification in its entirety. The name of the text file containing the Sequence Listing is Sequence_Listing--074040--0100--01. The size of the text file is 481 KB, and the text file was created on Jun. 17, 2014.
[0003] The present invention relates generally to the field of molecular biology and concerns a method for enhancing various plant yield-related traits by modulating expression in a plant of a nucleic acid encoding a RHL1 (Root Hairless 1) polypeptide. The present invention also concerns plants having modulated expression of a nucleic acid encoding a RHL1 polypeptide, which plants have various enhanced yield-related traits relative to corresponding wild type plants or other control plants. The invention also provides constructs useful in the methods of the invention.
[0004] The present invention relates generally to the field of molecular biology and concerns a method for increasing various plant seed yield-related traits by increasing expression in a plant of a nucleic acid sequence encoding a transglutaminase (TGase) polypeptide. The present invention also concerns plants having increased expression of a nucleic acid sequence encoding a TGase polypeptide, which plants have increased seed yield-related traits relative to control plants. The invention additionally relates to nucleic acid sequences, nucleic acid constructs, vectors and plants containing said nucleic acid sequences.
[0005] The present invention relates generally to the field of molecular biology and concerns a method for improving various plant growth characteristics by modulating expression in a plant of a nucleic acid encoding a TRY-like (Tryptichon) polypeptide. The present invention also concerns plants having modulated expression of a nucleic acid encoding a TRY-like polypeptide, which plants have improved growth characteristics relative to corresponding wild type plants or other control plants. The invention also provides constructs useful in the methods of the invention.
[0006] The present invention relates generally to the field of molecular biology and concerns a method for increasing seed yield in plants. More specifically, the present invention concerns a method for increasing seed yield in plants by modulating expression in a plant of a nucleic acid encoding a BZR (BRASSINAZOLE-RESISTANT) polypeptide. The present invention also concerns plants having modulated expression of a nucleic acid encoding a BZR polypeptide, which plants have increased seed yield relative to control plants. The invention also provides hitherto unknown BZR-encoding nucleic acids, and constructs comprising the same, useful in performing the methods of the invention.
[0007] The ever-increasing world population and the dwindling supply of arable land available for agriculture fuel research towards increasing the efficiency of agriculture. Conventional means for crop and horticultural improvements utilise selective breeding techniques to identify plants having desirable characteristics. However, such selective breeding techniques have several drawbacks, namely that these techniques are typically labour intensive and result in plants that often contain heterogeneous genetic components that may not always result in the desirable trait being passed on from parent plants. Advances in molecular biology have allowed mankind to modify the germplasm of animals and plants. Genetic engineering of plants entails the isolation and manipulation of genetic material (typically in the form of DNA or RNA) and the subsequent introduction of that genetic material into a plant. Such technology has the capacity to deliver crops or plants having various improved economic, agronomic or horticultural traits.
[0008] A trait of particular economic interest is increased yield. Yield is normally defined as the measurable produce of economic value from a crop. This may be defined in terms of quantity and/or quality. Yield is directly dependent on several factors, for example, the number and size of the organs, plant architecture (for example, the number of branches), seed production, leaf senescence and more. Root development, nutrient uptake, stress tolerance and early vigour may also be important factors in determining yield. Optimizing the abovementioned factors may therefore contribute to increasing crop yield.
[0009] Seed yield is a particularly important trait, since the seeds of many plants are important for human and animal nutrition. Crops such as corn, rice, wheat, canola and soybean account for over half the total human caloric intake, whether through direct consumption of the seeds themselves or through consumption of meat products raised on processed seeds. They are also a source of sugars, oils and many kinds of metabolites used in industrial processes. Seeds contain an embryo (the source of new shoots and roots) and an endosperm (the source of nutrients for embryo growth during germination and during early growth of seedlings). The development of a seed involves many genes, and requires the transfer of metabolites from the roots, leaves and stems into the growing seed. The endosperm, in particular, assimilates the metabolic precursors of carbohydrates, oils and proteins and synthesizes them into storage macromolecules to fill out the grain.
[0010] Plant biomass is yield for forage crops like alfalfa, silage corn and hay. Many proxies for yield have been used in grain crops. Chief amongst these are estimates of plant size. Plant size can be measured in many ways depending on species and developmental stage; measures include total plant dry weight, above-ground dry weight, above-ground fresh weight, leaf area, stem volume, plant height, rosette diameter, leaf length, root length, root mass, tiller number and leaf number. Many species maintain a conservative ratio between the size of different parts of the plant at a given developmental stage. These allometric relationships are used to extrapolate from one of these measures of size to another (e.g. Tittonell et al 2005 Agric Ecosys & Environ 105: 213). Plant size at an early developmental stage will typically correlate with plant size later in development. A larger plant with a greater leaf area can typically absorb more light and carbon dioxide than a smaller plant and therefore will likely gain a greater weight during the same period (Fasoula & Tollenaar 2005 Maydica 50:39). This is in addition to the potential continuation of the micro-environmental or genetic advantage that the plant had to achieve the larger size initially. There is a strong genetic component to plant size and growth rate (e.g. ter Steege et al 2005 Plant Physiology 139:1078), and so for a range of diverse genotypes plant size under one environmental condition is likely to correlate with size under another (Hittalmani et al 2003 Theoretical Applied Genetics 107:679). In this way a standard environment is used as a proxy for the diverse and dynamic environments encountered at different locations and times by crops in the field.
[0011] Another important trait for many crops is early vigour. Improving early vigour is an important objective of modern rice breeding programs in both temperate and tropical rice cultivars. Long roots are important for proper soil anchorage in water-seeded rice. Where rice is sown directly into flooded fields, and where plants must emerge rapidly through water, longer shoots are associated with vigour. Where drill-seeding is practiced, longer mesocotyls and coleoptiles are important for good seedling emergence. The ability to engineer early vigour into plants would be of great importance in agriculture. For example, poor early vigour has been a limitation to the introduction of maize (Zea mays L.) hybrids based on Corn Belt germplasm in the European Atlantic.
[0012] Harvest index, the ratio of seed yield to aboveground dry weight, is relatively stable under many environmental conditions and so a robust correlation between plant size and grain yield can often be obtained (e.g. Rebetzke et al 2002 Crop Science 42:739). These processes are intrinsically linked because the majority of grain biomass is dependent on current or stored photosynthetic productivity by the leaves and stem of the plant (Gardener et al 1985 Physiology of Crop Plants. Iowa State University Press, pp 68-73). Therefore, selecting for plant size, even at early stages of development, has been used as an indicator for future potential yield (e.g. Tittonell et al 2005 Agric Ecosys & Environ 105: 213). When testing for the impact of genetic differences on stress tolerance, the ability to standardize soil properties, temperature, water and nutrient availability and light intensity is an intrinsic advantage of greenhouse or plant growth chamber environments compared to the field. However, artificial limitations on yield due to poor pollination due to the absence of wind or insects, or insufficient space for mature root or canopy growth, can restrict the use of these controlled environments for testing yield differences. Therefore, measurements of plant size in early development, under standardized conditions in a growth chamber or greenhouse, are standard practices to provide indication of potential genetic yield advantages.
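As a minimal illustration of the harvest index defined above (the ratio of seed yield to above-ground dry weight), the following Python sketch computes it for a single plant. The function name, the parameter names and the example masses are hypothetical and are not taken from this application; both masses are assumed to be expressed in the same unit.

```python
def harvest_index(seed_yield_g: float, aboveground_dry_weight_g: float) -> float:
    """Return seed yield divided by above-ground dry weight.

    Both arguments are assumed to be masses in the same unit (grams here).
    """
    if aboveground_dry_weight_g <= 0:
        raise ValueError("above-ground dry weight must be positive")
    if seed_yield_g < 0:
        raise ValueError("seed yield cannot be negative")
    return seed_yield_g / aboveground_dry_weight_g


# Hypothetical illustrative values, not measurements from this application.
print(harvest_index(12.0, 30.0))
```

A dimensionless ratio of this kind can be compared between a transgenic plant and its control grown under the same standardized conditions.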
[0013] A further important trait is that of improved abiotic stress tolerance. Abiotic stress is a primary cause of crop loss worldwide, reducing average yields for most major crop plants by more than 50% (Wang et al., Planta (2003) 218: 1-14). Abiotic stresses may be caused by drought, salinity, extremes of temperature, chemical toxicity and oxidative stress. The ability to improve plant tolerance to abiotic stress would be of great economic advantage to farmers worldwide and would allow for the cultivation of crops during adverse conditions and in territories where cultivation of crops may not otherwise be possible.
[0014] Crop yield may therefore be increased by optimising one of the above-mentioned factors.
[0015] Depending on the end use, the modification of certain yield traits may be favoured over others. For example for applications such as forage or wood production, or bio-fuel resource, an increase in the vegetative parts of a plant may be desirable, and for applications such as flour, starch or oil production, an increase in seed parameters may be particularly desirable. Even amongst the seed parameters, some may be favoured over others, depending on the application. Various mechanisms may contribute to increasing seed yield, whether that is in the form of increased seed size or increased seed number.
[0016] One approach to increasing yield (seed yield and/or biomass) in plants may be through modification of the inherent growth mechanisms of a plant, such as the cell cycle or various signalling pathways involved in plant growth or in defense mechanisms.
[0017] Concerning BZR, depending on the end use, the modification of certain yield traits may be favoured over others. For example for applications such as forage or wood production, or bio-fuel resource, an increase in the vegetative parts of a plant may be desirable, and for applications such as flour, starch or oil production, an increase in seed parameters may be particularly desirable. Even amongst the seed parameters, some may be favoured over others, depending on the application. Various mechanisms may contribute to increasing seed yield, whether that is in the form of increased seed size, increased seed number or increased number of inflorescences.
[0018] It has now been found that various plant yield-related traits may be improved in plants by modulating expression in a plant of a nucleic acid encoding a RHL1 (Root Hairless 1) polypeptide.
[0019] Furthermore, it has now been found that various seed yield-related traits may be increased in plants relative to control plants, by increasing expression in a plant of a nucleic acid sequence encoding a transglutaminase (TGase) polypeptide. The increased seed yield-related traits comprise one or more of: increased total seed yield per plant, increased number of filled seeds, and increased harvest index.
[0020] Even furthermore, it has now been found that various growth characteristics may be improved in plants by modulating expression in a plant of a nucleic acid encoding a TRY-like (Tryptichon) polypeptide.
[0021] Yet furthermore, it has now been found that seed yield may be improved in plants by modulating expression in a plant of a nucleic acid encoding a BZR (BRASSINAZOLE-RESISTANT) polypeptide.
BACKGROUND
1. Root Hairless 1 (RHL1)
[0022] An RHL1 polypeptide was first described in 1998 by Schneider et al. (Genes Dev. 12, 2013-2021) as a nuclear targeted protein required for root hair initiation in Arabidopsis thaliana. RHL1 polypeptides are ubiquitous in the Viridiplantae kingdom. Sequence comparison of RHL1 polypeptides originating from different organisms reveals that they share an overall sequence identity of around 30-80%. RHL1 polypeptides comprise a number of putative nuclear localization signals as well as phosphorylation sites and a PEST sequence, which is a putative proteasome-dependent protein degradation motif. The presence of such motifs may reportedly confer some regulatory roles by modulating subcellular localization of topoisomerases (topos) and/or their interaction with other proteins. The C-terminus of RHL1 proteins has weak but significant sequence similarity to the C-terminus of mammalian Topo II-alpha protein (Sugimoto-Shirasu et al. 2005 PNAS 102, 18736-18741). Eukaryotic topo II proteins belong to the subclass of the type II topo (type IIA) that is required to unwind replicating double-stranded DNA. Physical interaction between an RHL1 polypeptide and a plant topo VI protein, At TOP6B, has been reported (Sugimoto-Shirasu et al. 2005). It has been suggested that RHL1 polypeptides function in a plant topo VI complex active during the mitotic cell cycle and endocycle of plant cells. Arabidopsis thaliana plants, hyp7, carrying mutations in an RHL1 gene exhibit an extreme dwarf phenotype and defects in endoreduplication (Sugimoto-Shirasu et al. 2005).
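To make the notion of percentage sequence identity used above more concrete, the following Python sketch counts identical residues over two sequences that have already been aligned. It is an illustration under stated assumptions only: the function name and the toy sequences are hypothetical, the alignment is assumed to have been produced beforehand by some alignment tool, and identity is expressed over the aligned columns that contain no gap, which is only one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over gap-free aligned columns.

    Assumes both strings come from the same pairwise alignment and
    therefore have equal length; gap columns are skipped (an assumption,
    other denominators are also used in practice).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not a sequence from the Sequence Listing.
identity = percent_identity("MKT-AYL", "MRTSAYV")
print(f"{identity:.1f}% identity")
```

With this convention, a threshold such as "at least 50% sequence identity" corresponds to a simple comparison of the returned value against 50.0.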
2. Transglutaminases (TGases)
[0023] Transglutaminases (TGases, EC 2.3.2.13; protein-glutamine-gamma-glutamyltransferase) are a family of enzymes that have a range of calcium (Ca)-dependent catalytic activities, most of which concern the post-translational modification of proteins. They catalyze the covalent attachment to proteins and polypeptides of a series of substances containing primary amine groups, i.e., they promote the formation of amide linkages, generally in a Ca-dependent fashion, between the primary amine of an amine donor substrate and the γ-carboxamide group of peptide-bound endo-glutamine residues in proteins or polypeptides that are the amine acceptors:
protein glutamine + alkylamine = protein N5-alkylglutamine + NH3.
[0024] Polyamines have been shown to serve as physiological substrates of TGases. Polyamines appear to play an essential role in growth and cell division process in animals, microorganisms, and plants. One of the roles of polyamines is their regulatory action by a TGase-mediated process of post-translational modification (addition of polyamine moieties) of enzymes and structural proteins.
[0025] TGase enzymes are found intracellularly and extracellularly, and are widely distributed in bacteria, animals and plants. In plants, the TGase activity is found in chloroplasts. Rubisco and apoproteins of the antenna complex have been shown to be substrates of TGase activity, thereby suggesting a role of these enzymes in photosynthesis-related processes, such as protection of photosystem antenna proteins (Villalobos et al. (2004) Gene 336: 93-104).
[0026] Transgenic rice plants (Claparols et al. (2004) Transgenic Research 13: 195-199) expressing a gene encoding rat prostate calcium-dependent transglutaminase polypeptide under the control of a maize constitutive promoter accumulated the recombinant enzyme in an inactive form.
[0027] International patent application WO 2003/102128 describes a nucleic acid sequence encoding a corn TGase polypeptide, vectors, micro-organisms and plants comprising such nucleic acid sequences, and the use of polypeptides with such TGase activity in food manipulation, processing and transformation.
[0028] Surprisingly, it has now been found that increasing expression in a plant of a nucleic acid sequence encoding a TGase polypeptide as defined herein, gives plants having increased seed yield-related traits relative to control plants.
[0029] According to one embodiment, there is provided a method for increasing seed yield-related traits in plants relative to control plants, comprising increasing expression in a plant of a nucleic acid sequence encoding a TGase polypeptide as defined herein. The increased seed yield-related traits comprise one or more of: increased total seed yield per plant, increased number of filled seeds, and increased harvest index.
3. Tryptichon (TRY-Like)
[0030] The Arabidopsis gene Tryptichon encodes a protein that reportedly negatively regulates trichome development and positively regulates root hair development. Trichome patterning in Arabidopsis is a model for the generation of a spacing pattern from initially equivalent cells. Schellmann et al. (EMBO J. 21, 5036-5046, 2002) show that the Tryptichon gene that functions in lateral inhibition encodes a single-repeat MYB-related transcription factor that lacks a recognizable activation domain. It has high sequence similarity to the root hair patterning gene Caprice. Both genes are expressed in trichomes and act together during lateral inhibition. They further show that Tryptichon and Caprice act redundantly in the position-dependent cell fate determination in the root epidermis. Thus, the same lateral inhibition mechanism seems to be involved in both de novo patterning and position-dependent cell determination (Schellmann et al., 2002).
4. Brassinazole Resistant1 (BZR1)
[0031] The regulation of gene expression is key to the viability of any cell. Several hundred proteins are involved in the regulation of gene transcription. In particular, transcription factors play a central role and act directly on gene promoters. Plant genomes devote approximately 7% of their coding sequence to transcription factors (TFs; Rushton et al. 2008 Plant Physiology 147: 280-295).
[0032] Plants encode a particular class of transcription factors, the BES or BZR proteins, which modulate gene response to fluctuations in plant steroid hormones such as brassinosteroids (BRs). BZR transcription factors (BZR TFs) are characterized by the presence of a conserved BZR1 repressor domain typically found at the N-terminus of the protein and involved in binding to the targeted gene promoter. Plants typically encode a small number of BZR TFs. For example, the Arabidopsis genome contains only 6 genes encoding BZR TFs, while tobacco, a plant in which this family of TFs is expanded, encodes 19 BZR TFs. All of these TFs comprise a conserved BZR1 repressor domain and are predicted to function in the modulation of BR signalling.
[0033] In Arabidopsis thaliana, the cascade of events in BR signalling is triggered upon binding of BRs to the BRASSINOSTEROID INSENSITIVE1 (BRI1)/BKI1 receptor complex at the plasma membrane, causing the release of BKI1. The subsequent dimerization of BRI1 and BRI1 ASSOCIATED RECEPTOR KINASE1 (BAK1) activates a downstream signal transduction pathway that leads to BRI1 EMS SUPPRESSOR1 (BES1) and BRASSINAZOLE RESISTANT1 (BZR1). The phosphorylation of BES1 and BZR1 by the kinase BIN2 appears to control their signalling activity by acting on the subcellular localization and stability of the protein. Dephosphorylated BZR1 accumulates in the nucleus, which is the site at which the transcriptional function is performed (Wang et al., 2006 Cell Res. 16: 427-434). Mechanistically, transcription factors of the BZR1 family directly bind to the promoter of the targeted gene and may act to activate or repress expression.
[0034] Methods for modulating the Brassinosteroid response pathway to modify a number of traits in plants have been disclosed (U.S. Pat. No. 6,921,848). The traits as defined in U.S. Pat. No. 6,921,848 comprised increased growth and cell elongation in various organs and tissues. However, those effects did not result in an increase in the number of organs, such as the number of seeds produced, and/or in an increase in the seed yield of the plant.
SUMMARY
1. Root Hairless 1 (RHL1)
[0035] Surprisingly, it has now been found that modulating expression of a nucleic acid encoding a RHL1 polypeptide gives plants having enhanced yield-related traits relative to control plants.
[0036] According to one embodiment, there is provided a method for enhancing yield-related traits of a plant relative to control plants, comprising modulating expression of a nucleic acid encoding a RHL1 polypeptide in a plant. The enhanced yield-related traits comprise increased early vigour, seed yield, number of seeds and harvest index of a plant.
2. Tryptichon (TRY-Like)
[0037] Surprisingly, it has now been found that modulating expression of a nucleic acid encoding a TRY-like polypeptide gives plants having enhanced yield-related traits in particular increased emergence vigour and/or increased yield relative to control plants.
[0038] According to one embodiment, there is provided a method for improving yield-related traits of a plant relative to control plants, comprising modulating expression of a nucleic acid encoding a TRY-like polypeptide in a plant. The improved yield-related traits comprise increased seed yield, including total weight of seeds.
3. Brassinazole Resistant1 (BZR1)
[0039] Surprisingly, it has now been found that modulating expression of a nucleic acid encoding a BZR polypeptide gives plants having increased seed yield relative to control plants.
[0040] According to one embodiment of the invention there is provided a method for increasing plant seed yield relative to control plants, comprising modulating expression of a nucleic acid encoding a BZR polypeptide in a plant.
DEFINITIONS
Polypeptide(s)/Protein(s)
[0041] The terms "polypeptide" and "protein" are used interchangeably herein and refer to amino acids in a polymeric form of any length, linked together by peptide bonds.
Polynucleotide(s)/Nucleic Acid(s)/Nucleic Acid Sequence(s)/Nucleotide Sequence(s)
[0042] The terms "polynucleotide(s)", "nucleic acid sequence(s)", "nucleotide sequence(s)", "nucleic acid(s)", "nucleic acid molecule" are used interchangeably herein and refer to nucleotides, either ribonucleotides or deoxyribonucleotides or a combination of both, in a polymeric unbranched form of any length.
Control Plant(s)
[0043] The choice of suitable control plants is a routine part of an experimental setup and may include corresponding wild type plants or corresponding plants without the gene of interest. The control plant is typically of the same plant species or even of the same variety as the plant to be assessed. The control plant may also be a nullizygote of the plant to be assessed. Nullizygotes are individuals missing the transgene by segregation. A "control plant" as used herein refers not only to whole plants, but also to plant parts, including seeds and seed parts.
Homologue(s)
[0044] "Homologues" of a protein encompass peptides, oligopeptides, polypeptides, proteins and enzymes having amino acid substitutions, deletions and/or insertions relative to the unmodified protein in question and having similar biological and functional activity as the unmodified protein from which they are derived.
[0045] A deletion refers to removal of one or more amino acids from a protein.
[0046] An insertion refers to one or more amino acid residues being introduced into a predetermined site in a protein. Insertions may comprise N-terminal and/or C-terminal fusions as well as intra-sequence insertions of single or multiple amino acids. Generally, insertions within the amino acid sequence will be smaller than N- or C-terminal fusions, of the order of about 1 to 10 residues. Examples of N- or C-terminal fusion proteins or peptides include the binding domain or activation domain of a transcriptional activator as used in the yeast two-hybrid system, phage coat proteins, (histidine)-6-tag, glutathione S-transferase-tag, protein A, maltose-binding protein, dihydrofolate reductase, Tag•100 epitope, c-myc epitope, FLAG®-epitope, lacZ, CMP (calmodulin-binding peptide), HA epitope, protein C epitope and VSV epitope.
[0047] A substitution refers to replacement of amino acids of the protein with other amino acids having similar properties (such as similar hydrophobicity, hydrophilicity, antigenicity, propensity to form or break α-helical structures or β-sheet structures). Amino acid substitutions are typically of single residues, but may be clustered depending upon functional constraints placed upon the polypeptide; insertions will usually be of the order of about 1 to 10 amino acid residues. The amino acid substitutions are preferably conservative amino acid substitutions. Conservative substitution tables are well known in the art (see for example Creighton (1984) Proteins. W.H. Freeman and Company (Eds) and Table 1 below).
TABLE 1 Examples of conserved amino acid substitutions
Residue | Conservative Substitutions
Ala | Ser
Arg | Lys
Asn | Gln; His
Asp | Glu
Gln | Asn
Cys | Ser
Glu | Asp
Gly | Pro
His | Asn; Gln
Ile | Leu; Val
Leu | Ile; Val
Lys | Arg; Gln
Met | Leu; Ile
Phe | Met; Leu; Tyr
Ser | Thr; Gly
Thr | Ser; Val
Trp | Tyr
Tyr | Trp; Phe
Val | Ile; Leu
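By way of illustration only, Table 1 may be transcribed into a lookup for checking whether a given substitution appears in the table. The sketch below assumes three-letter residue codes; the names `CONSERVATIVE` and `is_conservative` are hypothetical and are not part of the disclosure. The lookup is directional, exactly as the table is written (for example, Ala to Ser is listed, whereas Ser to Ala is not).

```python
# Table 1 transcribed as a lookup: residue -> conservative substitutions.
# Names are illustrative assumptions, not part of the original disclosure.
CONSERVATIVE = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Gln": {"Asn"},
    "Cys": {"Ser"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if Table 1 lists `replacement` for `original`."""
    # Residues absent from the table (e.g. Pro) have no listed substitutions.
    return replacement in CONSERVATIVE.get(original, set())
```

For instance, `is_conservative("Ile", "Val")` evaluates to true, while a residue with no row in Table 1, such as Pro, never yields a match.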
[0048] Amino acid substitutions, deletions and/or insertions may readily be made using peptide synthetic techniques well known in the art, such as solid phase peptide synthesis and the like, or by recombinant DNA manipulation. Methods for the manipulation of DNA sequences to produce substitution, insertion or deletion variants of a protein are well known in the art. For example, techniques for making substitution mutations at predetermined sites in DNA are well known to those skilled in the art and include M13 mutagenesis, T7-Gen in vitro mutagenesis (USB, Cleveland, Ohio), QuickChange Site Directed mutagenesis (Stratagene, San Diego, Calif.), PCR-mediated site-directed mutagenesis or other site-directed mutagenesis protocols.
Derivatives
[0049] "Derivatives" include peptides, oligopeptides, polypeptides which may, compared to the amino acid sequence of the naturally-occurring form of the protein, such as the protein of interest, comprise substitutions of amino acids with non-naturally occurring amino acid residues, or additions of non-naturally occurring amino acid residues. "Derivatives" of a protein also encompass peptides, oligopeptides, polypeptides which comprise naturally occurring altered (glycosylated, acylated, prenylated, phosphorylated, myristoylated, sulphated etc.) or non-naturally altered amino acid residues compared to the amino acid sequence of a naturally-occurring form of the polypeptide. A derivative may also comprise one or more non-amino acid substituents or additions compared to the amino acid sequence from which it is derived, for example a reporter molecule or other ligand, covalently or non-covalently bound to the amino acid sequence, such as a reporter molecule which is bound to facilitate its detection, and non-naturally occurring amino acid residues relative to the amino acid sequence of a naturally-occurring protein. Furthermore, "derivatives" also include fusions of the naturally-occurring form of the protein with tagging peptides such as FLAG, HIS6 or thioredoxin (for a review of tagging peptides, see Terpe, Appl. Microbiol. Biotechnol. 60, 523-533, 2003).
Orthologue(s)/Paralogue(s)
[0050] Orthologues and paralogues encompass evolutionary concepts used to describe the ancestral relationships of genes. Paralogues are genes within the same species that have originated through duplication of an ancestral gene; orthologues are genes from different organisms that have originated through speciation, and are also derived from a common ancestral gene.
Domain
[0051] The term "domain" refers to a set of amino acids conserved at specific positions along an alignment of sequences of evolutionarily related proteins. While amino acids at other positions can vary between homologues, amino acids that are highly conserved at specific positions indicate amino acids that are likely essential in the structure, stability or function of a protein. Identified by their high degree of conservation in aligned sequences of a family of protein homologues, they can be used as identifiers to determine if any polypeptide in question belongs to a previously identified polypeptide family.
Motif/Consensus Sequence/Signature
[0052] The term "motif" or "consensus sequence" or "signature" refers to a short conserved region in the sequence of evolutionarily related proteins. Motifs are frequently highly conserved parts of domains, but may also include only part of the domain, or be located outside of a conserved domain (if all of the amino acids of the motif fall outside of a defined domain).
Hybridisation
[0053] The term "hybridisation" as defined herein is a process wherein substantially homologous complementary nucleotide sequences anneal to each other. The hybridisation process can occur entirely in solution, i.e. both complementary nucleic acids are in solution. The hybridisation process can also occur with one of the complementary nucleic acids immobilised to a matrix such as magnetic beads, Sepharose beads or any other resin. The hybridisation process can furthermore occur with one of the complementary nucleic acids immobilised to a solid support such as a nitro-cellulose or nylon membrane or immobilised by e.g. photolithography to, for example, a siliceous glass support (the latter known as nucleic acid arrays or microarrays or as nucleic acid chips). In order to allow hybridisation to occur, the nucleic acid molecules are generally thermally or chemically denatured to melt a double strand into two single strands and/or to remove hairpins or other secondary structures from single stranded nucleic acids.
[0054] The term "stringency" refers to the conditions under which a hybridisation takes place. The stringency of hybridisation is influenced by conditions such as temperature, salt concentration, ionic strength and hybridisation buffer composition. Generally, low stringency conditions are selected to be about 30° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. Medium stringency conditions are when the temperature is 20° C. below Tm, and high stringency conditions are when the temperature is 10° C. below Tm. High stringency hybridisation conditions are typically used for isolating hybridising sequences that have high sequence similarity to the target nucleic acid sequence. However, nucleic acids may deviate in sequence and still encode a substantially identical polypeptide, due to the degeneracy of the genetic code. Therefore medium stringency hybridisation conditions may sometimes be needed to identify such nucleic acid molecules.
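By way of illustration only, the relationship between stringency and Tm stated above (about 30° C., 20° C. and 10° C. below Tm for low, medium and high stringency respectively) may be sketched as follows. The function and constant names are hypothetical, and the low-stringency offset is treated as exactly 30° C. although the text says "about".

```python
# Offsets below Tm (in degrees C) taken from the stringency definitions above.
# Names are illustrative assumptions only.
STRINGENCY_OFFSET_C = {"low": 30.0, "medium": 20.0, "high": 10.0}


def hybridisation_temperature(tm_c: float, stringency: str) -> float:
    """Return the hybridisation temperature for a given Tm and stringency."""
    if stringency not in STRINGENCY_OFFSET_C:
        raise ValueError(f"unknown stringency: {stringency!r}")
    return tm_c - STRINGENCY_OFFSET_C[stringency]
```

With a Tm of 75° C., high stringency would correspond to hybridisation at 65° C. under these definitions.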
[0055] The Tm is the temperature under defined ionic strength and pH, at which 50% of the target sequence hybridises to a perfectly matched probe. The Tm is dependent upon the solution conditions and the base composition and length of the probe. For example, longer sequences hybridise specifically at higher temperatures. The maximum rate of hybridisation is obtained from about 16° C. up to 32° C. below Tm. The presence of monovalent cations in the hybridisation solution reduces the electrostatic repulsion between the two nucleic acid strands, thereby promoting hybrid formation; this effect is visible for sodium concentrations of up to 0.4M (for higher concentrations, this effect may be ignored). Formamide reduces the melting temperature of DNA-DNA and DNA-RNA duplexes by 0.6 to 0.7° C. for each percent formamide, and addition of 50% formamide allows hybridisation to be performed at 30 to 45° C., though the rate of hybridisation will be lowered. Base pair mismatches reduce the hybridisation rate and the thermal stability of the duplexes. On average and for large probes, the Tm decreases about 1° C. per % base mismatch. The Tm may be calculated using the following equations, depending on the types of hybrids:
1) DNA-DNA hybrids (Meinkoth and Wahl, Anal. Biochem., 138: 267-284, 1984):
$T_m = 81.5\,^{\circ}\mathrm{C} + 16.6 \times \log_{10}[\mathrm{Na}^{+}]^{a} + 0.41 \times \%[\mathrm{G/C}^{b}] - 500 \times [L^{c}]^{-1} - 0.61 \times \%\ \mathrm{formamide}$
2) DNA-RNA or RNA-RNA hybrids:
$T_m = 79.8 + 18.5\,(\log_{10}[\mathrm{Na}^{+}]^{a}) + 0.58\,(\%\ \mathrm{G/C}^{b}) + 11.8\,(\%\ \mathrm{G/C}^{b})^{2} - 820/L^{c}$
3) oligo-DNA or oligo-RNA$^{d}$ hybrids:
For <20 nucleotides: $T_m = 2\,(l_n)$
For 20-35 nucleotides: $T_m = 22 + 1.46\,(l_n)$
a: or for other monovalent cation, but only accurate in the 0.01-0.4 M range.
b: only accurate for % GC in the 30% to 75% range.
c: L = length of duplex in base pairs.
d: oligo, oligonucleotide; $l_n$ = effective length of primer = 2×(no. of G/C) + (no. of A/T).
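By way of illustration only, the DNA-DNA and oligonucleotide equations above may be evaluated as sketched below. The function names are hypothetical. The sketch assumes the sodium concentration is given in mol/l, G/C content and formamide as percentages, and it reads the final term of footnote d as the number of A/T bases, which is an assumption about the intended formula. The DNA-RNA equation is omitted because the units of its squared G/C term are not stated.

```python
import math


def tm_dna_dna(na_molar: float, gc_percent: float, length_bp: int,
               formamide_percent: float = 0.0) -> float:
    """Equation 1 (DNA-DNA hybrids); result in degrees C."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 500.0 / length_bp
            - 0.61 * formamide_percent)


def effective_length(seq: str) -> int:
    """Footnote d: 2 x (no. of G/C) + (no. of A/T), an assumed reading."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T")
    return 2 * gc + at


def tm_oligo(seq: str) -> float:
    """Equation 3 (oligonucleotide hybrids); result in degrees C."""
    n = len(seq)
    ln = effective_length(seq)
    if n < 20:
        return 2.0 * ln
    if n <= 35:
        return 22.0 + 1.46 * ln
    raise ValueError("equation 3 covers oligonucleotides of up to 35 nt only")
```

Consistent with the text, the formamide term lowers the computed Tm by 0.61° C. per percent, so 50% formamide lowers it by 30.5° C. in this sketch.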
[0056] Non-specific binding may be controlled using any one of a number of known techniques such as, for example, blocking the membrane with protein-containing solutions, additions of heterologous RNA, DNA, and SDS to the hybridisation buffer, and treatment with RNase. For non-homologous probes, a series of hybridisations may be performed by either (i) progressively lowering the annealing temperature (for example from 68° C. to 42° C.) or (ii) progressively lowering the formamide concentration (for example from 50% to 0%). The skilled artisan is aware of various parameters which may be altered during hybridisation and which will either maintain or change the stringency conditions.
[0057] Besides the hybridisation conditions, specificity of hybridisation typically also depends on the function of post-hybridisation washes. To remove background resulting from non-specific hybridisation, samples are washed with dilute salt solutions. Critical factors of such washes include the ionic strength and temperature of the final wash solution: the lower the salt concentration and the higher the wash temperature, the higher the stringency of the wash. Wash conditions are typically performed at or below hybridisation stringency. A positive hybridisation gives a signal that is at least twice that of the background. Generally, suitable stringent conditions for nucleic acid hybridisation assays or gene amplification detection procedures are as set forth above. More or less stringent conditions may also be selected. The skilled artisan is aware of various parameters which may be altered during washing and which will either maintain or change the stringency conditions.
[0058] For example, typical high stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 65° C. in 1×SSC or at 42° C. in 1×SSC and 50% formamide, followed by washing at 65° C. in 0.3×SSC. Examples of medium stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 50° C. in 4×SSC or at 40° C. in 6×SSC and 50% formamide, followed by washing at 50° C. in 2×SSC. The length of the hybrid is the anticipated length for the hybridising nucleic acid. When nucleic acids of known sequence are hybridised, the hybrid length may be determined by aligning the sequences and identifying the conserved regions described herein. 1×SSC is 0.15M NaCl and 15 mM sodium citrate; the hybridisation solution and wash solutions may additionally include 5×Denhardt's reagent, 0.5-1.0% SDS, 100 μg/ml denatured, fragmented salmon sperm DNA, 0.5% sodium pyrophosphate.
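By way of illustration only, the SSC strengths named above may be converted into component concentrations from the stated definition of 1×SSC (0.15 M NaCl and 15 mM sodium citrate). The function name and the dictionary keys are hypothetical.

```python
def ssc_concentrations(fold: float) -> dict:
    """Return NaCl and sodium citrate molarity (mol/l) for an n x SSC solution."""
    if fold < 0:
        raise ValueError("SSC strength cannot be negative")
    # 1 x SSC is defined in the text as 0.15 M NaCl and 15 mM sodium citrate.
    return {
        "nacl_molar": 0.15 * fold,
        "sodium_citrate_molar": 0.015 * fold,
    }
```

Under this definition, the 0.3×SSC high-stringency wash corresponds to about 0.045 M NaCl and 4.5 mM sodium citrate.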
[0059] For the purposes of defining the level of stringency, reference can be made to Sambrook et al. (2001) Molecular Cloning: a laboratory manual, 3rd Edition, Cold Spring Harbor Laboratory Press, CSH, New York or to Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989 and yearly updates).
Splice Variant
[0060] The term "splice variant" as used herein encompasses variants of a nucleic acid sequence in which selected introns and/or exons have been excised, replaced, displaced or added, or in which introns have been shortened or lengthened. Such variants will be ones in which the biological activity of the protein is substantially retained; this may be achieved by selectively retaining functional segments of the protein. Such splice variants may be found in nature or may be manmade. Methods for predicting and isolating such splice variants are well known in the art (see for example Foissac and Schiex (2005) BMC Bioinformatics 6: 25).
Allelic Variant
[0061] Alleles or allelic variants are alternative forms of a given gene, located at the same chromosomal position. Allelic variants encompass Single Nucleotide Polymorphisms (SNPs), as well as Small Insertion/Deletion Polymorphisms (INDELs). The size of INDELs is usually less than 100 bp. SNPs and INDELs form the largest set of sequence variants in naturally occurring polymorphic strains of most organisms.
Gene Shuffling/Directed Evolution
[0062] Gene shuffling or directed evolution consists of iterations of DNA shuffling followed by appropriate screening and/or selection to generate variants of nucleic acids or portions thereof encoding proteins having a modified biological activity (Castle et al., (2004) Science 304(5674): 1151-4; U.S. Pat. Nos. 5,811,238 and 6,395,547).
Regulatory Element/Control Sequence/Promoter
[0063] The terms "regulatory element", "control sequence" and "promoter" are all used interchangeably herein and are to be taken in a broad context to refer to regulatory nucleic acid sequences capable of effecting expression of the sequences to which they are ligated. The term "promoter" typically refers to a nucleic acid control sequence located upstream from the transcriptional start of a gene and which is involved in recognising and binding of RNA polymerase and other proteins, thereby directing transcription of an operably linked nucleic acid. Encompassed by the aforementioned terms are transcriptional regulatory sequences derived from a classical eukaryotic genomic gene (including the TATA box which is required for accurate transcription initiation, with or without a CCAAT box sequence) and additional regulatory elements (i.e. upstream activating sequences, enhancers and silencers) which alter gene expression in response to developmental and/or external stimuli, or in a tissue-specific manner. Also included within the term is a transcriptional regulatory sequence of a classical prokaryotic gene, in which case it may include a -35 box sequence and/or -10 box transcriptional regulatory sequences. The term "regulatory element" also encompasses a synthetic fusion molecule or derivative that confers, activates or enhances expression of a nucleic acid molecule in a cell, tissue or organ.
[0064] A "plant promoter" comprises regulatory elements, which mediate the expression of a coding sequence segment in plant cells. Accordingly, a plant promoter need not be of plant origin, but may originate from viruses or micro-organisms, for example from viruses which attack plant cells. The "plant promoter" can also originate from a plant cell, e.g. from the plant which is transformed with the nucleic acid sequence to be expressed in the inventive process and described herein. This also applies to other "plant" regulatory signals, such as "plant" terminators. The promoters upstream of the nucleotide sequences useful in the methods of the present invention can be modified by one or more nucleotide substitution(s), insertion(s) and/or deletion(s) without interfering with the functionality or activity of either the promoters, the open reading frame (ORF) or the 3'-regulatory region such as terminators or other 3' regulatory regions which are located away from the ORF. It is furthermore possible that the activity of the promoters is increased by modification of their sequence, or that they are replaced completely by more active promoters, even promoters from heterologous organisms. For expression in plants, the nucleic acid molecule must, as described above, be linked operably to or comprise a suitable promoter which expresses the gene at the right point in time and with the required spatial expression pattern.
[0065] For the identification of functionally equivalent promoters, the promoter strength and/or expression pattern of a candidate promoter may be analysed for example by operably linking the promoter to a reporter gene and assaying the expression level and pattern of the reporter gene in various tissues of the plant. Suitable well-known reporter genes include for example beta-glucuronidase or beta-galactosidase. The promoter activity is assayed by measuring the enzymatic activity of the beta-glucuronidase or beta-galactosidase. The promoter strength and/or expression pattern may then be compared to that of a reference promoter (such as the one used in the methods of the present invention). Alternatively, promoter strength may be assayed by quantifying mRNA levels or by comparing mRNA levels of the nucleic acid used in the methods of the present invention, with mRNA levels of housekeeping genes such as 18S rRNA, using methods known in the art, such as Northern blotting with densitometric analysis of autoradiograms, quantitative real-time PCR or RT-PCR (Heid et al., 1996 Genome Methods 6: 986-994). Generally by "weak promoter" is intended a promoter that drives expression of a coding sequence at a low level. By "low level" is intended at levels of about 1/10,000 transcripts to about 1/100,000 transcripts, to about 1/500,000 transcripts per cell. Conversely, a "strong promoter" drives expression of a coding sequence at high level, or at about 1/10 transcripts to about 1/100 transcripts to about 1/1000 transcripts per cell. Generally, by "medium strength promoter" is intended a promoter that drives expression of a coding sequence at a lower level than a strong promoter, in particular at a level that is in all instances below that obtained when under the control of a 35S CaMV promoter.
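By way of illustration only, a comparison of mRNA levels of a gene of interest with those of a housekeeping gene by quantitative real-time PCR may be sketched as below. The text does not specify a calculation; the sketch uses the commonly applied 2^(-ΔCt) approach and assumes that the amount of product doubles at every cycle for both amplicons. The function name and parameters are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Target mRNA level relative to a reference gene, as 2 ** -(delta Ct).

    Assumes perfect doubling per PCR cycle for both amplicons; this is an
    illustrative simplification, not a method stated in the text.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)
```

A target that crosses the detection threshold five cycles later than the reference would, under these assumptions, be present at 1/32 of the reference level.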
Operably Linked
[0066] The term "operably linked" as used herein refers to a functional linkage between the promoter sequence and the gene of interest, such that the promoter sequence is able to initiate transcription of the gene of interest.
Constitutive Promoter
[0067] A "constitutive promoter" refers to a promoter that is transcriptionally active during most, but not necessarily all, phases of growth and development and under most environmental conditions, in at least one cell, tissue or organ. Table 2a below gives examples of constitutive promoters.
TABLE 2a Examples of constitutive promoters
Gene Source | Reference
Actin | McElroy et al, Plant Cell, 2: 163-171, 1990
HMGP | WO 2004/070039
CAMV 35S | Odell et al, Nature, 313: 810-812, 1985
CaMV 19S | Nilsson et al., Physiol. Plant. 100: 456-462, 1997
GOS2 | de Pater et al, Plant J November; 2(6): 837-44, 1992, WO 2004/065596
Ubiquitin | Christensen et al, Plant Mol. Biol. 18: 675-689, 1992
Rice cyclophilin | Buchholz et al, Plant Mol Biol. 25(5): 837-43, 1994
Maize H3 histone | Lepetit et al, Mol. Gen. Genet. 231: 276-285, 1992
Alfalfa H3 histone | Wu et al. Plant Mol. Biol. 11: 641-649, 1988
Actin 2 | An et al, Plant J. 10(1); 107-121, 1996
34S FMV | Sanger et al., Plant. Mol. Biol., 14, 1990: 433-443
Rubisco small subunit | U.S. Pat. No. 4,962,028
OCS | Leisner (1988) Proc Natl Acad Sci USA 85(5): 2553
SAD1 | Jain et al., Crop Science, 39 (6), 1999: 1696
SAD2 | Jain et al., Crop Science, 39 (6), 1999: 1696
nos | Shaw et al. (1984) Nucleic Acids Res. 12(20): 7831-7846
V-ATPase | WO 01/14572
Super promoter | WO 95/14098
G-box proteins | WO 94/12015
Ubiquitous Promoter
[0068] A ubiquitous promoter is active in substantially all tissues or cells of an organism.
Developmentally-Regulated Promoter
[0069] A developmentally-regulated promoter is active during certain developmental stages or in parts of the plant that undergo developmental changes.
Inducible Promoter
[0070] An inducible promoter has induced or increased transcription initiation in response to a chemical (for a review see Gatz 1997, Annu. Rev. Plant Physiol. Plant Mol. Biol., 48:89-108), environmental or physical stimulus, or may be "stress-inducible", i.e. activated when a plant is exposed to various stress conditions, or "pathogen-inducible", i.e. activated when a plant is exposed to various pathogens.
Organ-Specific/Tissue-Specific Promoter
[0071] An organ-specific or tissue-specific promoter is one that is capable of preferentially initiating transcription in certain organs or tissues, such as the leaves, roots, seed tissue etc. For example, a "root-specific promoter" is a promoter that is transcriptionally active predominantly in plant roots, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts. Promoters able to initiate transcription in certain cells only are referred to herein as "cell-specific".
[0072] Examples of root-specific promoters are listed in Table 2b below:
TABLE 2b Examples of root-specific promoters
Gene Source | Reference
RCc3 | Plant Mol Biol. 1995 January; 27(2): 237-48
Arabidopsis PHT1 | Koyama et al., 2005; Mudge et al. (2002, Plant J. 31: 341)
Medicago phosphate transporter | Xiao et al., 2006
Arabidopsis Pyk10 | Nitz et al. (2001) Plant Sci 161(2): 337-346
root-expressible genes | Tingey et al., EMBO J. 6: 1, 1987.
tobacco auxin-inducible gene | Van der Zaal et al., Plant Mol. Biol. 16, 983, 1991.
β-tubulin | Oppenheimer, et al., Gene 63: 87, 1988.
tobacco root-specific genes | Conkling, et al., Plant Physiol. 93: 1203, 1990.
B. napus G1-3b gene | U.S. Pat. No. 5,401,836
SbPRP1 | Suzuki et al., Plant Mol. Biol. 21: 109-119, 1993.
LRX1 | Baumberger et al. 2001, Genes & Dev. 15: 1128
BTG-26 Brassica napus | US 20050044585
LeAMT1 (tomato) | Lauter et al. (1996, PNAS 3: 8139)
The LeNRT1-1 (tomato) | Lauter et al. (1996, PNAS 3: 8139)
class I patatin gene (potato) | Liu et al., Plant Mol. Biol. 153: 386-395, 1991.
KDC1 (Daucus carota) | Downey et al. (2000, J. Biol. Chem. 275: 39420)
TobRB7 gene | W Song (1997) PhD Thesis, North Carolina State University, Raleigh, NC USA
OsRAB5a (rice) | Wang et al. 2002, Plant Sci. 163: 273
ALF5 (Arabidopsis) | Diener et al. (2001, Plant Cell 13: 1625)
NRT2; 1Np (N. plumbaginifolia) | Quesada et al. (1997, Plant Mol. Biol. 34: 265)
[0073] A seed-specific promoter is transcriptionally active predominantly in seed tissue, but not necessarily exclusively in seed tissue (in cases of leaky expression). The seed-specific promoter may be active during seed development and/or during germination. The seed specific promoter may be endosperm/aleurone/embryo specific. Examples of seed-specific promoters (endosperm/aleurone/embryo specific) are shown in Table 2c to Table 2f below. Further examples of seed-specific promoters are given in Qing Qu and Takaiwa (Plant Biotechnol. J. 2, 113-125, 2004), which disclosure is incorporated by reference herein as if fully set forth.
TABLE 2c Examples of seed-specific promoters
Gene source | Reference
seed-specific genes | Simon et al., Plant Mol. Biol. 5: 191, 1985; Scofield et al., J. Biol. Chem. 262: 12202, 1987; Baszczynski et al., Plant Mol. Biol. 14: 633, 1990.
Brazil Nut albumin | Pearson et al., Plant Mol. Biol. 18: 235-245, 1992.
legumin | Ellis et al., Plant Mol. Biol. 10: 203-214, 1988.
glutelin (rice) | Takaiwa et al., Mol. Gen. Genet. 208: 15-22, 1986; Takaiwa et al., FEBS Letts. 221: 43-47, 1987.
zein | Matzke et al Plant Mol Biol, 14(3): 323-32 1990
napA | Stalberg et al, Planta 199: 515-519, 1996.
wheat LMW and HMW glutenin-1 | Mol Gen Genet 216: 81-90, 1989; NAR 17: 461-2, 1989
wheat SPA | Albani et al, Plant Cell, 9: 171-184, 1997
wheat α, β, γ-gliadins | EMBO J. 3: 1409-15, 1984
barley Itr1 promoter | Diaz et al. (1995) Mol Gen Genet 248(5): 592-8
barley B1, C, D, hordein | Theor Appl Gen 98: 1253-62, 1999; Plant J 4: 343-55, 1993; Mol Gen Genet 250: 750-60, 1996
barley DOF | Mena et al, The Plant Journal, 116(1): 53-62, 1998
blz2 | EP99106056.7
synthetic promoter | Vicente-Carbajosa et al., Plant J. 13: 629-640, 1998.
rice prolamin NRP33 | Wu et al, Plant Cell Physiology 39(8) 885-889, 1998
rice α-globulin Glb-1 | Wu et al, Plant Cell Physiology 39(8) 885-889, 1998
rice OSH1 | Sato et al, Proc. Natl. Acad. Sci. USA, 93: 8117-8122, 1996
rice α-globulin REB/OHP-1 | Nakase et al. Plant Mol. Biol. 33: 513-522, 1997
rice ADP-glucose pyrophosphorylase | Trans Res 6: 157-68, 1997
maize ESR gene family | Plant J 12: 235-46, 1997
sorghum α-kafirin | DeRose et al., Plant Mol. Biol 32: 1029-35, 1996
KNOX | Postma-Haarsma et al, Plant Mol. Biol. 39: 257-71, 1999
rice oleosin | Wu et al, J. Biochem. 123: 386, 1998
sunflower oleosin | Cummins et al., Plant Mol. Biol. 19: 873-876, 1992
PRO0117, putative rice 40S ribosomal protein | WO 2004/070039
PRO0136, rice alanine aminotransferase | unpublished
PRO0147, trypsin inhibitor ITR1 (barley) | unpublished
PRO0151, rice WSI18 | WO 2004/070039
PRO0175, rice RAB21 | WO 2004/070039
PRO005 | WO 2004/070039
PRO0095 | WO 2004/070039
α-amylase (Amy32b) | Lanahan et al, Plant Cell 4: 203-211, 1992; Skriver et al, Proc Natl Acad Sci USA 88: 7266-7270, 1991
cathepsin β-like gene | Cejudo et al, Plant Mol Biol 20: 849-856, 1992
Barley Ltp2 | Kalla et al., Plant J. 6: 849-60, 1994
Chi26 | Leah et al., Plant J. 4: 579-89, 1994
Maize B-Peru | Selinger et al., Genetics 149; 1125-38, 1998
TABLE 2d Examples of endosperm-specific promoters
Gene source | Reference
glutelin (rice) | Takaiwa et al. (1986) Mol Gen Genet 208: 15-22; Takaiwa et al. (1987) FEBS Letts. 221: 43-47
zein | Matzke et al., (1990) Plant Mol Biol 14(3): 323-32
wheat LMW and HMW glutenin-1 | Colot et al. (1989) Mol Gen Genet 216: 81-90, Anderson et al. (1989) NAR 17: 461-2
wheat SPA | Albani et al. (1997) Plant Cell 9: 171-184
wheat gliadins | Rafalski et al. (1984) EMBO 3: 1409-15
barley Itr1 promoter | Diaz et al. (1995) Mol Gen Genet 248(5): 592-8
barley B1, C, D, hordein | Cho et al. (1999) Theor Appl Genet 98: 1253-62; Muller et al. (1993) Plant J 4: 343-55; Sorenson et al. (1996) Mol Gen Genet 250: 750-60
barley DOF | Mena et al, (1998) Plant J 116(1): 53-62
blz2 | Onate et al. (1999) J Biol Chem 274(14): 9175-82
synthetic promoter | Vicente-Carbajosa et al. (1998) Plant J 13: 629-640
rice prolamin NRP33 | Wu et al, (1998) Plant Cell Physiol 39(8) 885-889
rice globulin Glb-1 | Wu et al. (1998) Plant Cell Physiol 39(8) 885-889
rice globulin REB/OHP-1 | Nakase et al. (1997) Plant Molec Biol 33: 513-522
rice ADP-glucose pyrophosphorylase | Russell et al. (1997) Trans Res 6: 157-68
maize ESR gene family | Opsahl-Ferstad et al. (1997) Plant J 12: 235-46
sorghum kafirin | DeRose et al. (1996) Plant Mol Biol 32: 1029-35
TABLE 2e Examples of embryo-specific promoters
Gene source | Reference
rice OSH1 | Sato et al, Proc. Natl. Acad. Sci. USA, 93: 8117-8122, 1996
KNOX | Postma-Haarsma et al, Plant Mol. Biol. 39: 257-71, 1999
PRO0151 | WO 2004/070039
PRO0175 | WO 2004/070039
PRO005 | WO 2004/070039
PRO0095 | WO 2004/070039
TABLE 2f Examples of aleurone-specific promoters
Gene source | Reference
α-amylase (Amy32b) | Lanahan et al, Plant Cell 4: 203-211, 1992; Skriver et al, Proc Natl Acad Sci USA 88: 7266-7270, 1991
cathepsin β-like gene | Cejudo et al, Plant Mol Biol 20: 849-856, 1992
Barley Ltp2 | Kalla et al., Plant J. 6: 849-60, 1994
Chi26 | Leah et al., Plant J. 4: 579-89, 1994
Maize B-Peru | Selinger et al., Genetics 149; 1125-38, 1998
[0074] A green tissue-specific promoter as defined herein is a promoter that is transcriptionally active predominantly in green tissue, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts.
[0075] Examples of green tissue-specific promoters which may be used to perform the methods of the invention are shown in Table 2g below.
TABLE 2g Examples of green tissue-specific promoters
Gene | Expression | Reference
Maize Orthophosphate dikinase | Leaf specific | Fukayama et al., 2001
Maize Phosphoenolpyruvate carboxylase | Leaf specific | Kausch et al., 2001
Rice Phosphoenolpyruvate carboxylase | Leaf specific | Liu et al., 2003
Rice small subunit Rubisco | Leaf specific | Nomura et al., 2000
rice beta expansin EXBP9 | Shoot specific | WO 2004/070039
Pigeonpea small subunit Rubisco | Leaf specific | Panguluri et al., 2005
Pea RBCS3A | Leaf specific |
[0076] Another example of a tissue-specific promoter is a meristem-specific promoter, which is transcriptionally active predominantly in meristematic tissue, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts. Examples of green meristem-specific promoters which may be used to perform the methods of the invention are shown in Table 2h below.
TABLE-US-00009 TABLE 2h Examples of meristem-specific promoters
Gene source | Expression pattern | Reference
rice OSH1 | Shoot apical meristem, from embryo globular stage to seedling stage | Sato et al. (1996) Proc. Natl. Acad. Sci. USA, 93: 8117-8122
Rice metallothionein | Meristem specific | BAD87835.1
WAK1 & WAK 2 | Shoot and root apical meristems, and in expanding leaves and sepals | Wagner & Kohorn (2001) Plant Cell 13(2): 303-318
Terminator
[0077] The term "terminator" encompasses a control sequence which is a DNA sequence at the end of a transcriptional unit which signals 3' processing and polyadenylation of a primary transcript and termination of transcription. The terminator can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The terminator to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.
Modulation
[0078] The term "modulation" means, in relation to expression or gene expression, a process in which the expression level is changed by said gene expression in comparison to the control plant; the expression level may be increased or decreased. The original, unmodulated expression may be of any kind of expression of a structural RNA (rRNA, tRNA) or mRNA with subsequent translation. The term "modulating the activity" shall mean any change of the expression of the inventive nucleic acid sequences or encoded proteins, which leads to increased yield and/or increased growth of the plants.
Expression
[0079] The term "expression" or "gene expression" means the transcription of a specific gene or specific genes or specific genetic construct. The term "expression" or "gene expression" in particular means the transcription of a gene or genes or genetic construct into structural RNA (rRNA, tRNA) or mRNA with or without subsequent translation of the latter into a protein. The process includes transcription of DNA and processing of the resulting mRNA product.
Increased Expression/Overexpression
[0080] The term "increased expression" or "overexpression" as used herein means any form of expression that is additional to the original wild-type expression level.
[0081] Methods for increasing expression of genes or gene products are well documented in the art and include, for example, overexpression driven by appropriate promoters, the use of transcription enhancers or translation enhancers. Isolated nucleic acids which serve as promoter or enhancer elements may be introduced in an appropriate position (typically upstream) of a non-heterologous form of a polynucleotide so as to upregulate expression of a nucleic acid encoding the polypeptide of interest. For example, endogenous promoters may be altered in vivo by mutation, deletion, and/or substitution (see, Kmiec, U.S. Pat. No. 5,565,350; Zarling et al., WO9322443), or isolated promoters may be introduced into a plant cell in the proper orientation and distance from a gene of the present invention so as to control the expression of the gene.
[0082] If polypeptide expression is desired, it is generally desirable to include a polyadenylation region at the 3'-end of a polynucleotide coding region. The polyadenylation region can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The 3' end sequence to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.
[0083] An intron sequence may also be added to the 5' untranslated region (UTR) or the coding sequence of the partial coding sequence to increase the amount of the mature message that accumulates in the cytosol. Inclusion of a spliceable intron in the transcription unit in both plant and animal expression constructs has been shown to increase gene expression at both the mRNA and protein levels up to 1000-fold (Buchman and Berg (1988) Mol. Cell Biol. 8: 4395-4405; Callis et al. (1987) Genes Dev 1:1183-1200). Such intron enhancement of gene expression is typically greatest when the intron is placed near the 5' end of the transcription unit. Use of the maize introns Adh1-S intron 1, 2, and 6, and of the Bronze-1 intron, is known in the art. For general information see: The Maize Handbook, Chapter 116, Freeling and Walbot, Eds., Springer, N.Y. (1994).
Endogenous Gene
[0084] Reference herein to an "endogenous" gene not only refers to the gene in question as found in a plant in its natural form (i.e., without there being any human intervention), but also refers to that same gene (or a substantially homologous nucleic acid/gene) in an isolated form subsequently (re)introduced into a plant (a transgene). For example, a transgenic plant containing such a transgene may encounter a substantial reduction of the transgene expression and/or substantial reduction of expression of the endogenous gene. The isolated gene may be isolated from an organism or may be manmade, for example by chemical synthesis.
Decreased Expression
[0085] Reference herein to "decreased expression" or "reduction or substantial elimination" of expression is taken to mean a decrease in endogenous gene expression and/or polypeptide levels and/or polypeptide activity relative to control plants. The reduction or substantial elimination is in increasing order of preference at least 10%, 20%, 30%, 40% or 50%, 60%, 70%, 80%, 85%, 90%, or 95%, 96%, 97%, 98%, 99% or more reduced compared to that of control plants. Methods for decreasing expression are known in the art and the skilled person would readily be able to adapt the known methods for silencing so as to achieve reduction of expression of an endogenous gene in a whole plant or in parts thereof through the use of an appropriate promoter, for example.
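By way of illustration only, the percentage reduction referred to above can be computed from a measured expression level in a modified plant and in a control plant. The following is a minimal sketch; the function names and the example values are hypothetical, and the sketch assumes that both levels were measured in the same units.

```python
def percent_reduction(control_level: float, modified_level: float) -> float:
    """Return the reduction in expression relative to the control, in percent."""
    if control_level <= 0:
        raise ValueError("control expression level must be positive")
    return 100.0 * (control_level - modified_level) / control_level


def meets_threshold(control_level: float, modified_level: float,
                    threshold_pct: float) -> bool:
    """True if expression is reduced by at least threshold_pct percent."""
    return percent_reduction(control_level, modified_level) >= threshold_pct


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units.
    print(percent_reduction(200.0, 50.0))
    print(meets_threshold(200.0, 50.0, 70.0))
```

A plant whose expression level falls from 200 to 50 arbitrary units would thus satisfy the 70% level of preference but not the 80% level.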
[0086] For the reduction or substantial elimination of expression of an endogenous gene in a plant, a sufficient length of substantially contiguous nucleotides of a nucleic acid sequence is required. In order to perform gene silencing, this may be as little as 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10 or fewer nucleotides; alternatively, this may be as much as the entire gene (including the 5' and/or 3' UTR, either in part or in whole). The stretch of substantially contiguous nucleotides may be derived from the nucleic acid encoding the protein of interest (target gene), or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest. Preferably, the stretch of substantially contiguous nucleotides is capable of forming hydrogen bonds with the target gene (either sense or antisense strand), more preferably, the stretch of substantially contiguous nucleotides has, in increasing order of preference, 50%, 60%, 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 100% sequence identity to the target gene (either sense or antisense strand). A nucleic acid sequence encoding a (functional) polypeptide is not a requirement for the various methods discussed herein for the reduction or substantial elimination of expression of an endogenous gene.
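As a non-limiting illustration of the sequence identity levels recited above, the following sketch slides a short stretch of nucleotides along a target sequence and reports the best ungapped percent identity. It is a simplification: no gaps are allowed, and only one strand is compared (the antisense strand could be checked by first reverse-complementing the stretch). The function names and sequences are hypothetical.

```python
def percent_identity(stretch: str, target_region: str) -> float:
    """Ungapped percent identity between two sequences of equal length."""
    if not stretch or len(stretch) != len(target_region):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(stretch.upper(), target_region.upper()))
    return 100.0 * matches / len(stretch)


def best_identity(stretch: str, target: str) -> tuple:
    """Return (best percent identity, offset in target) over all ungapped placements."""
    best = (0.0, 0)
    for offset in range(len(target) - len(stretch) + 1):
        window = target[offset:offset + len(stretch)]
        pct = percent_identity(stretch, window)
        if pct > best[0]:
            best = (pct, offset)
    return best


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(best_identity("ACGT", "TTACGTTT"))
```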
[0087] Examples of various methods for the reduction or substantial elimination of expression in a plant of an endogenous gene, or for lowering levels and/or activity of a protein, are known to the skilled in the art. A skilled person would readily be able to adapt the known methods for silencing, so as to achieve reduction of expression of an endogenous gene in a whole plant or in parts thereof through the use of an appropriate promoter, for example.
[0088] This reduction or substantial elimination of expression may be achieved using routine tools and techniques. A preferred method for the reduction or substantial elimination of endogenous gene expression is by introducing and expressing in a plant a genetic construct into which the nucleic acid (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest) is cloned as an inverted repeat (in part or completely), separated by a spacer (non-coding DNA).
[0089] In such a preferred method, expression of the endogenous gene is reduced or substantially eliminated through RNA-mediated silencing using an inverted repeat of a nucleic acid or a part thereof (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest), preferably capable of forming a hairpin structure. The inverted repeat is cloned in an expression vector comprising control sequences. A non-coding DNA nucleic acid sequence (a spacer, for example a matrix attachment region fragment (MAR), an intron, a polylinker, etc.) is located between the two inverted nucleic acids forming the inverted repeat. After transcription of the inverted repeat, a chimeric RNA with a self-complementary structure is formed (partial or complete). This double-stranded RNA structure is referred to as the hairpin RNA (hpRNA). The hpRNA is processed by the plant into siRNAs that are incorporated into an RNA-induced silencing complex (RISC). The RISC further cleaves the mRNA transcripts, thereby substantially reducing the number of mRNA transcripts to be translated into polypeptides. For further general details see, for example, Grierson et al. (1998) WO 98/53083; Waterhouse et al. (1999) WO 99/53050.
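Purely as an illustration of the inverted repeat arrangement described above, the following sketch assembles a sense fragment, a spacer and the reverse complement of the same fragment, and then checks that the two arms are self-complementary, which is the property that allows a hairpin to form after transcription. The fragment and spacer sequences and the function names are hypothetical placeholders, not sequences of the invention.

```python
# Complement table for DNA letters; other characters are left unchanged.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_inverted_repeat(fragment: str, spacer: str) -> str:
    """Sense arm, then spacer, then antisense arm (reverse complement)."""
    return fragment + spacer + reverse_complement(fragment)


def arms_are_complementary(construct: str, arm_length: int) -> bool:
    """Check that the first and last arm_length bases can pair with each other."""
    left_arm = construct[:arm_length]
    right_arm = construct[-arm_length:]
    return reverse_complement(left_arm) == right_arm


if __name__ == "__main__":
    fragment = "ATGGC"   # hypothetical stretch from a target gene
    spacer = "TTTTT"     # hypothetical non-coding spacer
    construct = build_inverted_repeat(fragment, spacer)
    print(construct)
    print(arms_are_complementary(construct, len(fragment)))
```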
[0090] Performance of the methods of the invention does not rely on introducing and expressing in a plant a genetic construct into which the nucleic acid is cloned as an inverted repeat, but any one or more of several well-known "gene silencing" methods may be used to achieve the same effects.
[0091] One such method for the reduction of endogenous gene expression is RNA-mediated silencing of gene expression (downregulation). Silencing in this case is triggered in a plant by a double stranded RNA sequence (dsRNA) that is substantially similar to the target endogenous gene. This dsRNA is further processed by the plant into fragments of about 20 to about 26 nucleotides called short interfering RNAs (siRNAs). The siRNAs are incorporated into an RNA-induced silencing complex (RISC) that cleaves the mRNA transcript of the endogenous target gene, thereby substantially reducing the number of mRNA transcripts to be translated into a polypeptide. Preferably, the double stranded RNA sequence corresponds to a target gene.
[0092] Another example of an RNA silencing method involves the introduction of nucleic acid sequences or parts thereof (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest) in a sense orientation into a plant. "Sense orientation" refers to a DNA sequence that is homologous to an mRNA transcript thereof. Introduced into a plant would therefore be at least one copy of the nucleic acid sequence. The additional nucleic acid sequence will reduce expression of the endogenous gene, giving rise to a phenomenon known as co-suppression. The reduction of gene expression will be more pronounced if several additional copies of a nucleic acid sequence are introduced into the plant, as there is a positive correlation between high transcript levels and the triggering of co-suppression.
[0093] Another example of an RNA silencing method involves the use of antisense nucleic acid sequences. An "antisense" nucleic acid sequence comprises a nucleotide sequence that is complementary to a "sense" nucleic acid sequence encoding a protein, i.e. complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA transcript sequence. The antisense nucleic acid sequence is preferably complementary to the endogenous gene to be silenced. The complementarity may be located in the "coding region" and/or in the "non-coding region" of a gene. The term "coding region" refers to a region of the nucleotide sequence comprising codons that are translated into amino acid residues. The term "non-coding region" refers to 5' and 3' sequences that flank the coding region that are transcribed but not translated into amino acids (also referred to as 5' and 3' untranslated regions).
[0094] Antisense nucleic acid sequences can be designed according to the rules of Watson and Crick base pairing. The antisense nucleic acid sequence may be complementary to the entire nucleic acid sequence (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest), but may also be an oligonucleotide that is antisense to only a part of the nucleic acid sequence (including the mRNA 5' and 3' UTR). For example, the antisense oligonucleotide sequence may be complementary to the region surrounding the translation start site of an mRNA transcript encoding a polypeptide. The length of a suitable antisense oligonucleotide sequence is known in the art and may start from about 50, 45, 40, 35, 30, 25, 20, 15 or 10 nucleotides in length or less. An antisense nucleic acid sequence according to the invention may be constructed using chemical synthesis and enzymatic ligation reactions using methods known in the art. For example, an antisense nucleic acid sequence (e.g., an antisense oligonucleotide sequence) may be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acid sequences, e.g., phosphorothioate derivatives and acridine substituted nucleotides may be used. Examples of modified nucleotides that may be used to generate the antisense nucleic acid sequences are well known in the art. Known nucleotide modifications include methylation, cyclization and `caps` and substitution of one or more of the naturally occurring nucleotides with an analogue such as inosine. Other modifications of nucleotides are well known in the art.
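As a simple illustration of the Watson and Crick design rule mentioned above, the following sketch derives an antisense oligonucleotide complementary to the region surrounding a translation start site. The transcript is written in DNA letters (as in a cDNA), the example sequence is hypothetical, and the flank length is an arbitrary parameter chosen for illustration rather than a recommended value.

```python
# Complement table for DNA letters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(transcript: str, start_codon_index: int, flank: int = 10) -> str:
    """Antisense oligo covering the start codon plus `flank` bases on each side.

    The window is clipped at the ends of the transcript.
    """
    begin = max(0, start_codon_index - flank)
    end = min(len(transcript), start_codon_index + 3 + flank)
    return reverse_complement(transcript[begin:end])


if __name__ == "__main__":
    transcript = "GGCACCATGGCTTAA"      # hypothetical cDNA, ATG at index 6
    start = transcript.find("ATG")
    print(antisense_oligo(transcript, start, flank=3))
```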
[0095] The antisense nucleic acid sequence can be produced biologically using an expression vector into which a nucleic acid sequence has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest). Preferably, production of antisense nucleic acid sequences in plants occurs by means of a stably integrated nucleic acid construct comprising a promoter, an operably linked antisense oligonucleotide, and a terminator.
[0096] The nucleic acid molecules used for silencing in the methods of the invention (whether introduced into a plant or generated in situ) hybridize with or bind to mRNA transcripts and/or genomic DNA encoding a polypeptide to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid sequence which binds to DNA duplexes, through specific interactions in the major groove of the double helix. Antisense nucleic acid sequences may be introduced into a plant by transformation or direct injection at a specific tissue site. Alternatively, antisense nucleic acid sequences can be modified to target selected cells and then administered systemically. For example, for systemic administration, antisense nucleic acid sequences can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid sequence to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid sequences can also be delivered to cells using the vectors described herein.
[0097] According to a further aspect, the antisense nucleic acid sequence is an α-anomeric nucleic acid sequence. An α-anomeric nucleic acid sequence forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucl Ac Res 15: 6625-6641). The antisense nucleic acid sequence may also comprise a 2'-O-methylribonucleotide (Inoue et al. (1987) Nucl Ac Res 15, 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215, 327-330).
[0098] The reduction or substantial elimination of endogenous gene expression may also be performed using ribozymes. Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a single-stranded nucleic acid sequence, such as an mRNA, to which they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes, described in Haselhoff and Gerlach (1988) Nature 334, 585-591) can be used to catalytically cleave mRNA transcripts encoding a polypeptide, thereby substantially reducing the number of mRNA transcripts to be translated into a polypeptide. A ribozyme having specificity for a nucleic acid sequence can be designed (see for example: Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742). Alternatively, mRNA transcripts corresponding to a nucleic acid sequence can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules (Bartel and Szostak (1993) Science 261, 1411-1418). The use of ribozymes for gene silencing in plants is known in the art (e.g., Atkins et al. (1994) WO 94/00012; Lenne et al. (1995) WO 95/03404; Lutziger et al. (2000) WO 00/00619; Prinsen et al. (1997) WO 97/13865 and Scott et al. (1997) WO 97/38116).
[0099] Gene silencing may also be achieved by insertion mutagenesis (for example, T-DNA insertion or transposon insertion) or by strategies as described by, among others, Angell and Baulcombe ((1999) Plant J 20(3): 357-62), (Amplicon VIGS WO 98/36083), or Baulcombe (WO 99/15682).
[0100] Gene silencing may also occur if there is a mutation on an endogenous gene and/or a mutation on an isolated gene/nucleic acid subsequently introduced into a plant. The reduction or substantial elimination may be caused by a non-functional polypeptide. For example, the polypeptide may bind to various interacting proteins; one or more mutation(s) and/or truncation(s) may therefore provide for a polypeptide that is still able to bind interacting proteins (such as receptor proteins) but that cannot exhibit its normal function (such as signalling ligand).
[0101] A further approach to gene silencing is by targeting nucleic acid sequences complementary to the regulatory region of the gene (e.g., the promoter and/or enhancers) to form triple helical structures that prevent transcription of the gene in target cells. See Helene, C., Anticancer Drug Des. 6, 569-84, 1991; Helene et al., Ann. N.Y. Acad. Sci. 660, 27-36, 1992; and Maher, L. J., Bioessays 14, 807-15, 1992.
[0102] Other methods, such as the use of antibodies directed to an endogenous polypeptide for inhibiting its function in planta, or interference in the signalling pathway in which a polypeptide is involved, will be well known to the skilled man. In particular, it can be envisaged that manmade molecules may be useful for inhibiting the biological function of a target polypeptide, or for interfering with the signalling pathway in which the target polypeptide is involved.
[0103] Alternatively, a screening program may be set up to identify in a plant population natural variants of a gene, which variants encode polypeptides with reduced activity. Such natural variants may also be used for example, to perform homologous recombination.
[0104] Artificial and/or natural microRNAs (miRNAs) may be used to knock out gene expression and/or mRNA translation. Endogenous miRNAs are single stranded small RNAs of typically 19-24 nucleotides long. They function primarily to regulate gene expression and/or mRNA translation. Most plant microRNAs (miRNAs) have perfect or near-perfect complementarity with their target sequences. However, there are natural targets with up to five mismatches. They are processed from longer non-coding RNAs with characteristic fold-back structures by double-strand specific RNases of the Dicer family. Upon processing, they are incorporated in the RNA-induced silencing complex (RISC) by binding to its main component, an Argonaute protein. MiRNAs serve as the specificity components of RISC, since they base-pair to target nucleic acids, mostly mRNAs, in the cytoplasm. Subsequent regulatory events include target mRNA cleavage and destruction and/or translational inhibition. Effects of miRNA overexpression are thus often reflected in decreased mRNA levels of target genes.
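To illustrate the notion of complementarity between a miRNA and its target site, the following sketch counts mismatched positions between a small RNA and a candidate site of equal length and compares the count with a tolerance. It is a deliberately simplified model: the pairing is treated as strict Watson and Crick pairing (G:U wobble pairs count as mismatches), position-specific effects are ignored, and the default tolerance of five merely mirrors the upper figure mentioned above. The function names and sequences are hypothetical.

```python
# Complement table for RNA letters.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def perfect_target_site(small_rna: str) -> str:
    """Target site (5' to 3') that pairs perfectly with the small RNA."""
    return small_rna.translate(RNA_COMPLEMENT)[::-1]


def count_mismatches(small_rna: str, target_site: str) -> int:
    """Number of positions at which the site deviates from perfect pairing."""
    if len(small_rna) != len(target_site):
        raise ValueError("small RNA and target site must have equal length")
    expected = perfect_target_site(small_rna)
    return sum(a != b for a, b in zip(expected, target_site))


def within_tolerance(small_rna: str, target_site: str, max_mismatches: int = 5) -> bool:
    """True if the site has no more than max_mismatches mismatches."""
    return count_mismatches(small_rna, target_site) <= max_mismatches


if __name__ == "__main__":
    # Short hypothetical sequences, for illustration only.
    print(perfect_target_site("AACGU"))
    print(count_mismatches("AACGU", "ACGUA"))
```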
[0105] Artificial microRNAs (amiRNAs), which are typically 21 nucleotides in length, can be genetically engineered specifically to negatively regulate gene expression of single or multiple genes of interest. Determinants of plant microRNA target selection are well known in the art. Empirical parameters for target recognition have been defined and can be used to aid in the design of specific amiRNAs, (Schwab et al., Dev. Cell 8, 517-527, 2005). Convenient tools for design and generation of amiRNAs and their precursors are also available to the public (Schwab et al., Plant Cell 18, 1121-1133, 2006).
[0106] For optimal performance, the gene silencing techniques used for reducing expression in a plant of an endogenous gene require the use of nucleic acid sequences from monocotyledonous plants for transformation of monocotyledonous plants, and from dicotyledonous plants for transformation of dicotyledonous plants. Preferably, a nucleic acid sequence from any given plant species is introduced into that same species. For example, a nucleic acid sequence from rice is transformed into a rice plant. However, it is not an absolute requirement that the nucleic acid sequence to be introduced originates from the same plant species as the plant in which it will be introduced. It is sufficient that there is substantial homology between the endogenous target gene and the nucleic acid to be introduced.
[0107] Described above are examples of various methods for the reduction or substantial elimination of expression in a plant of an endogenous gene. A person skilled in the art would readily be able to adapt the aforementioned methods for silencing so as to achieve reduction of expression of an endogenous gene in a whole plant or in parts thereof through the use of an appropriate promoter, for example.
Selectable Marker (Gene)/Reporter Gene
[0108] "Selectable marker", "selectable marker gene" or "reporter gene" includes any gene that confers a phenotype on a cell in which it is expressed to facilitate the identification and/or selection of cells that are transfected or transformed with a nucleic acid construct of the invention. These marker genes enable the identification of a successful transfer of the nucleic acid molecules via a series of different principles. Suitable markers may be selected from markers that confer antibiotic or herbicide resistance, that introduce a new metabolic trait or that allow visual selection. Examples of selectable marker genes include genes conferring resistance to antibiotics (such as nptII that phosphorylates neomycin and kanamycin, or hpt, phosphorylating hygromycin, or genes conferring resistance to, for example, bleomycin, streptomycin, tetracycline, chloramphenicol, ampicillin, gentamycin, geneticin (G418), spectinomycin or blasticidin), to herbicides (for example bar which provides resistance to Basta®; aroA or gox providing resistance against glyphosate, or the genes conferring resistance to, for example, imidazolinone, phosphinothricin or sulfonylurea), or genes that provide a metabolic trait (such as manA that allows plants to use mannose as sole carbon source or xylose isomerase for the utilisation of xylose, or antinutritive markers such as the resistance to 2-deoxyglucose). Expression of visual marker genes results in the formation of colour (for example β-glucuronidase, GUS or β-galactosidase with its coloured substrates, for example X-Gal), luminescence (such as the luciferin/luciferase system) or fluorescence (Green Fluorescent Protein, GFP, and derivatives thereof). This list represents only a small number of possible markers. The skilled worker is familiar with such markers. Different markers are preferred, depending on the organism and the selection method.
[0109] It is known that upon stable or transient integration of nucleic acids into plant cells, only a minority of the cells takes up the foreign DNA and, if desired, integrates it into its genome, depending on the expression vector used and the transfection technique used. To identify and select these integrants, a gene coding for a selectable marker (such as the ones described above) is usually introduced into the host cells together with the gene of interest. These markers can for example be used in mutants in which these genes are not functional by, for example, deletion by conventional methods. Furthermore, nucleic acid molecules encoding a selectable marker can be introduced into a host cell on the same vector that comprises the sequence encoding the polypeptides of the invention or used in the methods of the invention, or else in a separate vector. Cells which have been stably transfected with the introduced nucleic acid can be identified for example by selection (for example, cells which have integrated the selectable marker survive whereas the other cells die). The marker genes may be removed or excised from the transgenic cell once they are no longer needed. Techniques for marker gene removal are known in the art, useful techniques are described above in the definitions section.
[0110] Since the marker genes, particularly genes for resistance to antibiotics and herbicides, are no longer required or are undesired in the transgenic host cell once the nucleic acids have been introduced successfully, the process according to the invention for introducing the nucleic acids advantageously employs techniques which enable the removal or excision of these marker genes. One such method is what is known as co-transformation. The co-transformation method employs two vectors simultaneously for the transformation, one vector bearing the nucleic acid according to the invention and a second bearing the marker gene(s). A large proportion of transformants receives or, in the case of plants, comprises (up to 40% or more of the transformants), both vectors. In case of transformation with Agrobacteria, the transformants usually receive only a part of the vector, i.e. the sequence flanked by the T-DNA, which usually represents the expression cassette. The marker genes can subsequently be removed from the transformed plant by performing crosses. In another method, marker genes integrated into a transposon are used for the transformation together with desired nucleic acid (known as the Ac/Ds technology). The transformants can be crossed with a transposase source or the transformants are transformed with a nucleic acid construct conferring expression of a transposase, transiently or stably. In some cases (approx. 10%), the transposon jumps out of the genome of the host cell once transformation has taken place successfully and is lost. In a further number of cases, the transposon jumps to a different location. In these cases the marker gene must be eliminated by performing crosses. In microbiology, techniques were developed which make possible, or facilitate, the detection of such events. A further advantageous method relies on what is known as recombination systems, whose advantage is that elimination by crossing can be dispensed with.
The best-known system of this type is what is known as the Cre/lox system. Cre1 is a recombinase that removes the sequences located between the loxP sequences. If the marker gene is integrated between the loxP sequences, it is removed once transformation has taken place successfully, by expression of the recombinase. Further recombination systems are the HIN/HIX, FLP/FRT and REP/STB system (Tribble et al., J. Biol. Chem., 275, 2000: 22255-22267; Velmurugan et al., J. Cell Biol., 149, 2000: 553-566). A site-specific integration into the plant genome of the nucleic acid sequences according to the invention is possible. Naturally, these methods can also be applied to microorganisms such as yeast, fungi or bacteria.
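The marker excision described above for the Cre/lox system can be pictured with the following sketch, which models a construct as an ordered list of named elements and removes everything located between the first two recombination sites. It is a schematic model only: it assumes that the two sites are in direct orientation, so that recombination excises the intervening segment and leaves a single site behind. The element names and the function name are hypothetical.

```python
from typing import List


def excise_between_sites(cassette: List[str], site: str = "loxP") -> List[str]:
    """Return the cassette after excision of the segment between two sites.

    If fewer than two sites are present, the cassette is returned unchanged.
    One site remains in the product.
    """
    positions = [i for i, element in enumerate(cassette) if element == site]
    if len(positions) < 2:
        return list(cassette)
    first, second = positions[0], positions[1]
    # Keep everything up to and including the first site, then skip to
    # whatever follows the second site.
    return cassette[:first + 1] + cassette[second + 1:]


if __name__ == "__main__":
    construct = ["promoter", "gene_of_interest", "terminator",
                 "loxP", "marker_gene", "loxP"]
    print(excise_between_sites(construct))
```

In this model the gene of interest and its control sequences are untouched, because they lie outside the pair of sites.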
Transgenic/Transgene/Recombinant
[0111] For the purposes of the invention, "transgenic", "transgene" or "recombinant" means with regard to, for example, a nucleic acid sequence, an expression cassette, gene construct or a vector comprising the nucleic acid sequence or an organism transformed with the nucleic acid sequences, expression cassettes or vectors according to the invention, all those constructions brought about by recombinant methods in which either
[0112] (a) the nucleic acid sequences encoding proteins useful in the methods of the invention, or
[0113] (b) genetic control sequence(s) which is operably linked with the nucleic acid sequence according to the invention, for example a promoter, or
[0114] (c) a) and b) are not located in their natural genetic environment or have been modified by recombinant methods, it being possible for the modification to take the form of, for example, a substitution, addition, deletion, inversion or insertion of one or more nucleotide residues. The natural genetic environment is understood as meaning the natural genomic or chromosomal locus in the original plant or the presence in a genomic library. In the case of a genomic library, the natural genetic environment of the nucleic acid sequence is preferably retained, at least in part. The environment flanks the nucleic acid sequence at least on one side and has a sequence length of at least 50 bp, preferably at least 500 bp, especially preferably at least 1000 bp, most preferably at least 5000 bp. A naturally occurring expression cassette--for example the naturally occurring combination of the natural promoter of the nucleic acid sequences with the corresponding nucleic acid sequence encoding a polypeptide useful in the methods of the present invention, as defined above--becomes a transgenic expression cassette when this expression cassette is modified by non-natural, synthetic ("artificial") methods such as, for example, mutagenic treatment. Suitable methods are described, for example, in U.S. Pat. No. 5,565,350 or WO 00/15815.
[0115] A transgenic plant for the purposes of the invention is thus understood as meaning, as above, that the nucleic acids used in the method of the invention are not at their natural locus in the genome of said plant, it being possible for the nucleic acids to be expressed homologously or heterologously. However, as mentioned, transgenic also means that, while the nucleic acids according to the invention or used in the inventive method are at their natural position in the genome of a plant, the sequence has been modified with regard to the natural sequence, and/or that the regulatory sequences of the natural sequences have been modified. Transgenic is preferably understood as meaning the expression of the nucleic acids according to the invention at an unnatural locus in the genome, i.e. homologous or, preferably, heterologous expression of the nucleic acids takes place. Preferred transgenic plants are mentioned herein.
Transformation
[0116] The term "introduction" or "transformation" as referred to herein encompasses the transfer of an exogenous polynucleotide into a host cell, irrespective of the method used for transfer. Plant tissue capable of subsequent clonal propagation, whether by organogenesis or embryogenesis, may be transformed with a genetic construct of the present invention and a whole plant regenerated therefrom. The particular tissue chosen will vary depending on the clonal propagation systems available for, and best suited to, the particular species being transformed. Exemplary tissue targets include leaf disks, pollen, embryos, cotyledons, hypocotyls, megagametophytes, callus tissue, existing meristematic tissue (e.g., apical meristem, axillary buds, and root meristems), and induced meristem tissue (e.g., cotyledon meristem and hypocotyl meristem). The polynucleotide may be transiently or stably introduced into a host cell and may be maintained non-integrated, for example, as a plasmid. Alternatively, it may be integrated into the host genome. The resulting transformed plant cell may then be used to regenerate a transformed plant in a manner known to persons skilled in the art.
[0117] The transfer of foreign genes into the genome of a plant is called transformation. Transformation of plant species is now a fairly routine technique. Advantageously, any of several transformation methods may be used to introduce the gene of interest into a suitable ancestor cell. The methods described for the transformation and regeneration of plants from plant tissues or plant cells may be utilized for transient or for stable transformation. Transformation methods include the use of liposomes, electroporation, chemicals that increase free DNA uptake, injection of the DNA directly into the plant, particle gun bombardment, transformation using viruses or pollen and microprojection. Methods may be selected from the calcium/polyethylene glycol method for protoplasts (Krens, F. A. et al., (1982) Nature 296, 72-74; Negrutiu I et al. (1987) Plant Mol Biol 8: 363-373); electroporation of protoplasts (Shillito R. D. et al. (1985) Bio/Technol 3, 1099-1102); microinjection into plant material (Crossway A et al., (1986) Mol. Gen Genet 202: 179-185); DNA or RNA-coated particle bombardment (Klein T M et al., (1987) Nature 327: 70) infection with (non-integrative) viruses and the like. Transgenic plants, including transgenic crop plants, are preferably produced via Agrobacterium-mediated transformation. An advantageous transformation method is the transformation in planta. To this end, it is possible, for example, to allow the agrobacteria to act on plant seeds or to inoculate the plant meristem with agrobacteria. It has proved particularly expedient in accordance with the invention to allow a suspension of transformed agrobacteria to act on the intact plant or at least on the flower primordia. The plant is subsequently grown on until the seeds of the treated plant are obtained (Clough and Bent, Plant J. (1998) 16, 735-743). 
Methods for Agrobacterium-mediated transformation of rice include well known methods for rice transformation, such as those described in any of the following: European patent application EP 1198985 A1, Aldemita and Hodges (Planta 199: 612-617, 1996); Chan et al. (Plant Mol Biol 22 (3): 491-506, 1993), Hiei et al. (Plant J 6 (2): 271-282, 1994), which disclosures are incorporated by reference herein as if fully set forth. In the case of corn transformation, the preferred method is as described in either Ishida et al. (Nat. Biotechnol 14(6): 745-50, 1996) or Frame et al. (Plant Physiol 129(1): 13-22, 2002), which disclosures are incorporated by reference herein as if fully set forth. Said methods are further described by way of example in B. Jenes et al., Techniques for Gene Transfer, in: Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press (1993) 128-143 and in Potrykus Annu. Rev. Plant Physiol. Plant Molec. Biol. 42 (1991) 205-225. The nucleic acids or the construct to be expressed are preferably cloned into a vector which is suitable for transforming Agrobacterium tumefaciens, for example pBin19 (Bevan et al., Nucl. Acids Res. 12 (1984) 8711). Agrobacteria transformed by such a vector can then be used in known manner for the transformation of plants, such as plants used as a model, like Arabidopsis (Arabidopsis thaliana is within the scope of the present invention not considered as a crop plant), or crop plants such as, by way of example, tobacco plants, for example by immersing bruised leaves or chopped leaves in an agrobacterial solution and then culturing them in suitable media. The transformation of plants by means of Agrobacterium tumefaciens is described, for example, by Hofgen and Willmitzer in Nucl. Acid Res. (1988) 16, 9877 or is known inter alia from F. F. White, Vectors for Gene Transfer in Higher Plants; in Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press, 1993, pp. 15-38.
[0118] In addition to the transformation of somatic cells, which then have to be regenerated into intact plants, it is also possible to transform the cells of plant meristems and in particular those cells which develop into gametes. In this case, the transformed gametes follow the natural plant development, giving rise to transgenic plants. Thus, for example, seeds of Arabidopsis are treated with agrobacteria and seeds are obtained from the developing plants of which a certain proportion is transformed and thus transgenic [Feldmann, K A and Marks M D (1987). Mol Gen Genet 208:274-289; Feldmann K (1992). In: C Koncz, N-H Chua and J Schell, eds, Methods in Arabidopsis Research. World Scientific, Singapore, pp. 274-289]. Alternative methods are based on the repeated removal of the inflorescences and incubation of the excision site in the center of the rosette with transformed agrobacteria, whereby transformed seeds can likewise be obtained at a later point in time (Chang (1994). Plant J. 5: 551-558; Katavic (1994). Mol Gen Genet, 245: 363-370). However, an especially effective method is the vacuum infiltration method with its modifications such as the "floral dip" method. In the case of vacuum infiltration of Arabidopsis, intact plants under reduced pressure are treated with an agrobacterial suspension [Bechtold, N (1993). C R Acad Sci Paris Life Sci, 316: 1194-1199], while in the case of the "floral dip" method the developing floral tissue is incubated briefly with a surfactant-treated agrobacterial suspension [Clough, S J and Bent A F (1998) The Plant J. 16, 735-743]. A certain proportion of transgenic seeds are harvested in both cases, and these seeds can be distinguished from non-transgenic seeds by growing under the above-described selective conditions. In addition, the stable transformation of plastids is advantageous because plastids are inherited maternally in most crops, reducing or eliminating the risk of transgene flow through pollen.
The transformation of the chloroplast genome is generally achieved by a process which has been schematically displayed in Klaus et al., 2004 [Nature Biotechnology 22 (2), 225-229]. Briefly, the sequences to be transformed are cloned together with a selectable marker gene between flanking sequences homologous to the chloroplast genome. These homologous flanking sequences direct site-specific integration into the plastome. Plastidal transformation has been described for many different plant species and an overview is given in Bock (2001) Transgenic plastids in basic research and plant biotechnology. J Mol Biol. 2001 Sep. 21; 312 (3):425-38 or Maliga, P (2003) Progress towards commercialization of plastid transformation technology. Trends Biotechnol. 21, 20-28. Further biotechnological progress has recently been reported in the form of marker-free plastid transformants, which can be produced by a transient co-integrated marker gene (Klaus et al., 2004, Nature Biotechnology 22(2), 225-229).
T-DNA Activation Tagging
[0119] T-DNA activation tagging (Hayashi et al. Science (1992) 1350-1353), involves insertion of T-DNA, usually containing a promoter (may also be a translation enhancer or an intron), in the genomic region of the gene of interest or 10 kb up- or downstream of the coding region of a gene in a configuration such that the promoter directs expression of the targeted gene. Typically, regulation of expression of the targeted gene by its natural promoter is disrupted and the gene falls under the control of the newly introduced promoter. The promoter is typically embedded in a T-DNA. This T-DNA is randomly inserted into the plant genome, for example, through Agrobacterium infection and leads to modified expression of genes near the inserted T-DNA. The resulting transgenic plants show dominant phenotypes due to modified expression of genes close to the introduced promoter.
TILLING
[0120] The term "TILLING" is an abbreviation of "Targeted Induced Local Lesions In Genomes" and refers to a mutagenesis technology useful to generate and/or identify nucleic acids encoding proteins with modified expression and/or activity. TILLING also allows selection of plants carrying such mutant variants. These mutant variants may exhibit modified expression, either in strength or in location or in timing (if the mutations affect the promoter for example). These mutant variants may exhibit higher activity than that exhibited by the gene in its natural form. TILLING combines high-density mutagenesis with high-throughput screening methods. The steps typically followed in TILLING are: (a) EMS mutagenesis (Redei G P and Koncz C (1992) In Methods in Arabidopsis Research, Koncz C, Chua N H, Schell J, eds. Singapore, World Scientific Publishing Co, pp. 16-82; Feldmann et al., (1994) In Meyerowitz E M, Somerville C R, eds, Arabidopsis. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp 137-172; Lightner J and Caspar T (1998) In J Martinez-Zapater, J Salinas, eds, Methods on Molecular Biology, Vol. 82. Humana Press, Totowa, N.J., pp 91-104); (b) DNA preparation and pooling of individuals; (c) PCR amplification of a region of interest; (d) denaturation and annealing to allow formation of heteroduplexes; (e) DHPLC, where the presence of a heteroduplex in a pool is detected as an extra peak in the chromatogram; (f) identification of the mutant individual; and (g) sequencing of the mutant PCR product. Methods for TILLING are well known in the art (McCallum et al., (2000) Nat Biotechnol 18: 455-457; reviewed by Stemple (2004) Nat Rev Genet 5(2): 145-50).
Homologous Recombination
[0121] Homologous recombination allows introduction in a genome of a selected nucleic acid at a defined selected position. Homologous recombination is a standard technology used routinely in biological sciences for lower organisms such as yeast or the moss Physcomitrella. Methods for performing homologous recombination in plants have been described not only for model plants (Offringa et al. (1990) EMBO J 9(10): 3077-84) but also for crop plants, for example rice (Terada et al. (2002) Nat Biotech 20(10): 1030-4; Iida and Terada (2004) Curr Opin Biotech 15(2): 132-8), and approaches exist that are generally applicable regardless of the target organism (Miller et al, Nature Biotechnol. 25, 778-785, 2007).
Yield
[0122] The term "yield" in general means a measurable produce of economic value, typically related to a specified crop, to an area, and to a period of time. Individual plant parts directly contribute to yield based on their number, size and/or weight, or the actual yield is the yield per square meter for a crop and year, which is determined by dividing total production (includes both harvested and appraised production) by planted square meters. The term "yield" of a plant may relate to vegetative biomass (root and/or shoot biomass), to reproductive organs, and/or to propagules (such as seeds) of that plant.
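The area-based definition of yield above reduces to a single division. The following is a minimal sketch of that arithmetic, not part of the described method; the function name, the choice of kilograms as the unit, and the example figures are assumptions made for illustration only.

```python
def yield_per_square_meter(harvested_kg: float,
                           appraised_kg: float,
                           planted_m2: float) -> float:
    """Yield per square meter for one crop and year.

    Total production (harvested plus appraised production) is divided
    by the planted area. Units (kg, m^2) are an assumption of this sketch.
    """
    if planted_m2 <= 0:
        raise ValueError("planted area must be positive")
    total_production = harvested_kg + appraised_kg
    return total_production / planted_m2


# Hypothetical figures: 4500 kg harvested, 500 kg appraised, 10000 m^2 planted.
example_yield = yield_per_square_meter(4500.0, 500.0, 10000.0)
print(example_yield)  # 0.5 (kg per square meter)
```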
Early Vigour
[0123] "Early vigour" refers to active healthy well-balanced growth especially during early stages of plant growth, and may result from increased plant fitness due to, for example, the plants being better adapted to their environment (i.e. optimizing the use of energy resources and partitioning between shoot and root). Plants having early vigour also show increased seedling survival and a better establishment of the crop, which often results in highly uniform fields (with the crop growing in uniform manner, i.e. with the majority of plants reaching the various stages of development at substantially the same time), and often better and higher yield. Therefore, early vigour may be determined by measuring various factors, such as thousand kernel weight, percentage germination, percentage emergence, seedling growth, seedling height, root length, root and shoot biomass and many more.
Increase/Improve/Enhance
[0124] The terms "increase", "improve" or "enhance" are interchangeable and shall mean in the sense of the application at least a 3%, 4%, 5%, 6%, 7%, 8%, 9% or 10%, preferably at least 15% or 20%, more preferably 25%, 30%, 35% or 40% more yield and/or growth in comparison to control plants as defined herein.
Seed Yield
[0125] Increased seed yield may manifest itself as one or more of the following: a) an increase in seed biomass (total seed weight), which may be on an individual seed basis and/or per plant and/or per square meter; b) an increased number of flowers per plant; c) an increased number of (filled) seeds; d) an increased seed filling rate (expressed as the ratio of the number of filled seeds to the total number of seeds); e) an increased harvest index, which is expressed as the ratio of the yield of harvestable parts, such as seeds, to the total biomass; f) an increased number of primary panicles; and g) an increased thousand kernel weight (TKW), which is extrapolated from the number of filled seeds counted and their total weight. An increased TKW may result from an increased seed size and/or seed weight, and may also result from an increase in embryo and/or endosperm size.
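Several of the seed yield parameters listed above are simple ratios. The sketch below shows one way they might be computed per plant; the class and field names are hypothetical, the example measurements are invented, and it is assumed that TKW is extrapolated from the number of filled seeds counted and their total weight.

```python
from dataclasses import dataclass


@dataclass
class PlantHarvest:
    """Per-plant measurements; all field names are hypothetical."""
    filled_seeds: int        # number of filled seeds counted
    total_seeds: int         # filled plus unfilled seeds
    seed_weight_g: float     # total weight of the filled seeds, in grams
    total_biomass_g: float   # total biomass, seeds included, in grams


def seed_filling_rate(h: PlantHarvest) -> float:
    """Number of filled seeds divided by the total number of seeds."""
    if h.total_seeds <= 0:
        raise ValueError("total_seeds must be positive")
    return h.filled_seeds / h.total_seeds


def harvest_index(h: PlantHarvest) -> float:
    """Yield of harvestable parts (here: seeds) divided by total biomass."""
    if h.total_biomass_g <= 0:
        raise ValueError("total_biomass_g must be positive")
    return h.seed_weight_g / h.total_biomass_g


def thousand_kernel_weight(h: PlantHarvest) -> float:
    """Weight of 1000 seeds, extrapolated from the filled seeds counted."""
    if h.filled_seeds <= 0:
        raise ValueError("filled_seeds must be positive")
    return 1000.0 * h.seed_weight_g / h.filled_seeds


# Invented example plant: 180 of 200 seeds filled, 4.5 g seed, 18 g biomass.
plant = PlantHarvest(filled_seeds=180, total_seeds=200,
                     seed_weight_g=4.5, total_biomass_g=18.0)
print(seed_filling_rate(plant), harvest_index(plant),
      thousand_kernel_weight(plant))
```

Comparing these values between a transgenic plant and a control plant grown under the same conditions would then indicate which of the parameters is increased.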
[0126] An increase in seed yield may also be manifested as an increase in seed size and/or seed volume. Furthermore, an increase in seed yield may also manifest itself as an increase in seed area and/or seed length and/or seed width and/or seed perimeter. Increased seed yield may also result in modified architecture, or may occur because of modified architecture.
Greenness Index
[0127] The "greenness index" as used herein is calculated from digital images of plants. For each pixel belonging to the plant object on the image, the ratio of the green value versus the red value (in the RGB model for encoding color) is calculated. The greenness index is expressed as the percentage of pixels for which the green-to-red ratio exceeds a given threshold. Under normal growth conditions, under salt stress growth conditions, and under reduced nutrient availability growth conditions, the greenness index of plants is measured in the last imaging before flowering. In contrast, under drought stress growth conditions, the greenness index of plants is measured in the first imaging after drought.
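The greenness index calculation described above can be sketched as follows. This is an illustrative sketch only: it assumes the plant pixels have already been segmented from the image and are supplied as (red, green, blue) tuples, and the threshold value used in the example is hypothetical, since no specific value is stated here. To avoid dividing by a red value of zero, the comparison green/red > threshold is written as green > threshold * red.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), e.g. 0-255 per channel


def greenness_index(plant_pixels: Iterable[Pixel], threshold: float) -> float:
    """Percentage of plant pixels whose green-to-red ratio exceeds threshold.

    Only pixels belonging to the plant object should be passed in;
    segmentation of the plant from the background is outside this sketch.
    """
    pixels = list(plant_pixels)
    if not pixels:
        raise ValueError("no plant pixels supplied")
    # green / red > threshold, rewritten without division (red may be 0)
    above = sum(1 for red, green, _blue in pixels if green > threshold * red)
    return 100.0 * above / len(pixels)


# Four invented plant pixels and a hypothetical threshold of 1.2.
pixels = [(100, 150, 40), (120, 110, 30), (80, 200, 60), (90, 95, 50)]
print(greenness_index(pixels, threshold=1.2))  # 50.0
```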
Plant
[0128] The term "plant" as used herein encompasses whole plants, ancestors and progeny of the plants and plant parts, including seeds, shoots, stems, leaves, roots (including tubers), flowers, and tissues and organs, wherein each of the aforementioned comprise the gene/nucleic acid of interest. The term "plant" also encompasses plant cells, suspension cultures, callus tissue, embryos, meristematic regions, gametophytes, sporophytes, pollen and microspores, again wherein each of the aforementioned comprises the gene/nucleic acid of interest.
[0129] Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs selected from the list comprising Acer spp., Actinidia spp., Abelmoschus spp., Agave sisalana, Agropyron spp., Agrostis stolonifera, Allium spp., Amaranthus spp., Ammophila arenaria, Ananas comosus, Annona spp., Apium graveolens, Arachis spp, Artocarpus spp., Asparagus officinalis, Avena spp. (e.g. Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida), Averrhoa carambola, Bambusa sp., Benincasa hispida, Bertholletia excelsea, Beta vulgaris, Brassica spp. (e.g. Brassica napus, Brassica rapa ssp. [canola, oilseed rape, turnip rape]), Cadaba farinosa, Camellia sinensis, Canna indica, Cannabis sativa, Capsicum spp., Carex elata, Carica papaya, Carissa macrocarpa, Carya spp., Carthamus tinctorius, Castanea spp., Ceiba pentandra, Cichorium endivia, Cinnamomum spp., Citrullus lanatus, Citrus spp., Cocos spp., Coffea spp., Colocasia esculenta, Cola spp., Corchorus sp., Coriandrum sativum, Corylus spp., Crataegus spp., Crocus sativus, Cucurbita spp., Cucumis spp., Cynara spp., Daucus carota, Desmodium spp., Dimocarpus longan, Dioscorea spp., Diospyros spp., Echinochloa spp., Elaeis (e.g. Elaeis guineensis, Elaeis oleifera), Eleusine coracana, Eragrostis tef, Erianthus sp., Eriobotrya japonica, Eucalyptus sp., Eugenia uniflora, Fagopyrum spp., Fagus spp., Festuca arundinacea, Ficus carica, Fortunella spp., Fragaria spp., Ginkgo biloba, Glycine spp. (e.g. Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g. Helianthus annuus), Hemerocallis fulva, Hibiscus spp., Hordeum spp. (e.g. 
Hordeum vulgare), Ipomoea batatas, Juglans spp., Lactuca sativa, Lathyrus spp., Lens culinaris, Linum usitatissimum, Litchi chinensis, Lotus spp., Luffa acutangula, Lupinus spp., Luzula sylvatica, Lycopersicon spp. (e.g. Lycopersicon esculentum, Lycopersicon lycopersicum, Lycopersicon pyriforme), Macrotyloma spp., Malus spp., Malpighia emarginata, Mammea americana, Mangifera indica, Manihot spp., Manilkara zapota, Medicago sativa, Melilotus spp., Mentha spp., Miscanthus sinensis, Momordica spp., Morus nigra, Musa spp., Nicotiana spp., Olea spp., Opuntia spp., Ornithopus spp., Oryza spp. (e.g. Oryza sativa, Oryza latifolia), Panicum miliaceum, Panicum virgatum, Passiflora edulis, Pastinaca sativa, Pennisetum sp., Persea spp., Petroselinum crispum, Phalaris arundinacea, Phaseolus spp., Phleum pratense, Phoenix spp., Phragmites australis, Physalis spp., Pinus spp., Pistacia vera, Pisum spp., Poa spp., Populus spp., Prosopis spp., Prunus spp., Psidium spp., Punica granatum, Pyrus communis, Quercus spp., Raphanus sativus, Rheum rhabarbarum, Ribes spp., Ricinus communis, Rubus spp., Saccharum spp., Salix sp., Sambucus spp., Secale cereale, Sesamum spp., Sinapis sp., Solanum spp. (e.g. Solanum tuberosum, Solanum integrifolium or Solanum lycopersicum), Sorghum bicolor, Spinacia spp., Syzygium spp., Tagetes spp., Tamarindus indica, Theobroma cacao, Trifolium spp., Tripsacum dactyloides, Triticale sp., Triticosecale rimpaui, Triticum spp. (e.g. Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum, Triticum monococcum or Triticum vulgare), Tropaeolum minus, Tropaeolum majus, Vaccinium spp., Vicia spp., Vigna spp., Viola odorata, Vitis spp., Zea mays, Zizania palustris, Ziziphus spp., amongst others.
DETAILED DESCRIPTION OF THE INVENTION
[0130] Surprisingly, it has now been found that modulating expression in a plant of a nucleic acid encoding a RHL1 polypeptide gives plants having enhanced yield-related traits relative to control plants. According to a first embodiment, the present invention provides a method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a RHL1 polypeptide.
[0131] Furthermore, surprisingly, it has now been found that increasing expression in a plant of a nucleic acid sequence encoding a TGase polypeptide as defined herein, gives plants having increased seed yield-related traits relative to control plants. According to a first embodiment, the present invention provides a method for increasing seed yield-related traits in plants relative to control plants, comprising increasing expression in a plant of a nucleic acid sequence encoding a TGase polypeptide.
[0132] Furthermore, surprisingly, it has now been found that modulating expression in a plant of a nucleic acid encoding a TRY-like polypeptide gives plants having enhanced yield-related traits relative to control plants. According to a first embodiment, the present invention provides a method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a TRY-like polypeptide and optionally selecting for plants having enhanced yield-related traits.
[0133] Furthermore, surprisingly, it has now been found that modulating expression in a plant of a nucleic acid encoding a BZR polypeptide gives plants having increased seed yield relative to control plants. According to a first embodiment, the present invention provides a method for increasing seed yield in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a BZR polypeptide.
[0134] A preferred method for modulating (preferably, increasing) expression of a nucleic acid sequence encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, is by introducing and expressing in a plant a nucleic acid sequence encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide.
[0135] Concerning BZR polypeptides, in a further preferred embodiment the invention provides a method for increasing seed yield in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a BZR polypeptide wherein said modulation is effected by introducing a nucleic acid encoding a BZR polypeptide under the control of a plant derived promoter.
[0136] Concerning RHL1 polypeptides, any reference hereinafter to a "protein useful in the methods of the invention" is taken to mean a RHL1 polypeptide as defined herein. Any reference hereinafter to a "nucleic acid useful in the methods of the invention" is taken to mean a nucleic acid capable of encoding such a RHL1 polypeptide. The nucleic acid to be introduced into a plant (and therefore useful in performing the methods of the invention) is any nucleic acid encoding the type of protein which will now be described, hereafter also named "RHL1 nucleic acid" or "RHL1 gene".
[0137] An "RHL1 polypeptide" as defined herein refers to any polypeptide comprising a sequence having in increasing order of preference 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of any of the polypeptides of Table A1.
[0138] A preferred RHL1 polypeptide useful in the methods of the invention comprises a sequence having in increasing order of preference 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of any of the polypeptides of SEQ ID NO: 2 or SEQ ID NO: 10, more preferably comprises SEQ ID NO: 2.
[0139] Various conserved protein motifs are found in RHL1 polypeptides. Methods to find conserved protein domains in a group of related sequences are well known in the art. Example 4 details the use of one such method, the MEME system, to identify conserved protein motifs in RHL1 polypeptides.
[0140] A further preferred RHL1 polypeptide useful in the methods of the invention comprises one or more of the following motifs:
[0141] (i) a motif having in increasing order of preference 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of Motif 1: [IV]R[RK][KG][SG]QRK[NS][RK][FY]LFSFPGLLAP (SEQ ID NO: 29);
[0142] (ii) a motif having in increasing order of preference 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of Motif 2: SGG[KR][IV]G[ED]L[KA]DL[GD]TKNP[ILV]LYLDFPQG[RQ]MKL](SEQ ID NO: 30);
[0143] (iii) a motif having in increasing order of preference 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of Motif 3: TP[VS]RQSARTAGKK[FL][KN][FY][AT]ExSS (SEQ ID NO: 31);
[0144] (iv) a motif having in increasing order of preference 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of Motif 4: GTK[ED]ENPEE[LA][RK]L[DE]FPKE[LF]Q[ENQ][GD](SEQ ID NO: 32);
[0145] (v) a motif having in increasing order of preference 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of Motif 5: [SN][GN][NL]L[LQV][SR][EDG]xP[AS][KA]PR[SA][APS]LAPSK[TAG]VL[KR][HL][HQ]G[KR]D (SEQ ID NO: 33);
[0146] (vi) a motif having in increasing order of preference 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of Motif 6: HA[ED][CY]DFKGGAGAA[CS]D[ES][KA]Q (SEQ ID NO: 34);
[0147] (vii) a motif having in increasing order of preference 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of Motif 7: [KSN][KEP]P[GEK][EKT][KTE][YT][VT][EG][EPST][ELQ]SP[KE][IT][ED][SLV][ED][DI][DV][LS]S[ED][DE][SD][NDS][LD]K[DK] (SEQ ID NO: 35);
[0148] (viii) a motif having in increasing order of preference 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of Motif 8: KG[PA]AAKKQRASP[EM][EA]K[HQ]P[TA]G[KI]K (SEQ ID NO: 36).
[0149] Wherein the amino acids between square brackets are alternatives.
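Because the bracket notation used for Motifs 1 to 8 coincides with regular-expression character classes, a candidate sequence can be screened for an exact occurrence of a motif with a few lines of code. The sketch below is illustrative only: it assumes that a lowercase or uppercase "x" outside brackets stands for any amino acid, it tests exact matches rather than the percentage identities recited above, and the two test sequences are invented, not taken from the sequence listing.

```python
import re
from typing import List


def motif_to_regex(motif: str) -> re.Pattern:
    """Turn a bracket-notation motif into a compiled regular expression.

    Bracketed alternatives such as [ED] are already valid character
    classes; 'x'/'X' (assumed to mean "any residue") becomes '.'.
    This assumes 'x' never occurs inside brackets, as in Motifs 1 to 8.
    """
    return re.compile(re.sub(r"[xX]", ".", motif))


def find_motif(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions of exact motif matches in sequence."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence)]


MOTIF_3 = "TP[VS]RQSARTAGKK[FL][KN][FY][AT]ExSS"   # SEQ ID NO: 31
MOTIF_6 = "HA[ED][CY]DFKGGAGAA[CS]D[ES][KA]Q"     # SEQ ID NO: 34

# Invented test sequences containing one instance of each motif.
print(find_motif("MSTHAECDFKGGAGAACDEKQLLP", MOTIF_6))  # [3]
print(find_motif("TPVRQSARTAGKKFKFAEQSS", MOTIF_3))     # [0]
```

A sequence that returns an empty list may still satisfy the identity thresholds given for each motif; an exact-match screen of this kind is only a quick first filter.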
[0150] Alternatively a preferred RHL1 polypeptide useful in the methods of the invention comprises:
A. a motif having in increasing order of preference 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of one or more of the following motifs:
(i) Motif 9 (SEQ ID NO: 37): (SN)VMC(ED)D(YV)F(DE)(NS)(ML)(IV)VFS(DE)AWWIG(TR)K(ED)ENPEE;
(ii) Motif 10 (SEQ ID NO: 38): L(AILV)A(PA)(IVA)(SA)GG(KR)(IVF)G(ED)L(KA)DL(GDS)(TS)KNP(IVL)LYLDFPQ;
(iii) Motif 11 (SEQ ID NO: 39): G(RQ)(ML)KLFGTI(VL)YPKN(RK)Y(LI)TLQF;
wherein the amino acids between brackets are alternative amino acids at that position; or
B. any one or more of the following motifs:
(i) Motif 9 (SEQ ID NO: 37): (SN)VMC(ED)D(YV)F(DE)(NS)(ML)(IV)VFS(DE)AWWIG(TR)K(ED)ENPEE;
(ii) Motif 10 (SEQ ID NO: 38): L(AILV)A(PA)(IVA)(SA)GG(KR)(IVF)G(ED)L(KA)DL(GDS)(TS)KNP(IVL)LYLDFPQ;
(iii) Motif 11 (SEQ ID NO: 39): G(RQ)(ML)KLFGTI(VL)YPKN(RK)Y(LI)TLQF;
[0151] Wherein the amino acids between brackets are alternatives (alternative amino acids at that position), and wherein in increasing order of preference 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 amino acids are substituted by any other amino acid, preferably by a conservative amino acid.
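Whether a sequence contains one of Motifs 9 to 11 with at most a given number of substituted amino acids can be checked by sliding a window along the sequence and counting mismatching positions. The following is a minimal sketch under stated assumptions: only substitutions are counted (no insertions or deletions), the motif is written without internal spaces, and the test sequences are invented for illustration.

```python
from typing import List, Optional, Set


def parse_motif(motif: str) -> List[Set[str]]:
    """Split a motif such as G(RQ)(ML)KL into per-position residue sets."""
    positions: List[Set[str]] = []
    i = 0
    while i < len(motif):
        if motif[i] == "(":
            end = motif.index(")", i)          # closing bracket of this set
            positions.append(set(motif[i + 1:end]))
            i = end + 1
        else:
            positions.append({motif[i]})       # single fixed residue
            i += 1
    return positions


def min_substitutions(sequence: str, motif: str) -> Optional[int]:
    """Smallest number of mismatching positions over all gapless windows.

    Returns None when the sequence is shorter than the motif.
    """
    positions = parse_motif(motif)
    width = len(positions)
    if len(sequence) < width:
        return None
    best = width
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = sum(1 for residue, allowed in zip(window, positions)
                         if residue not in allowed)
        best = min(best, mismatches)
    return best


def has_motif(sequence: str, motif: str, max_substitutions: int) -> bool:
    """True if some window matches the motif with few enough substitutions."""
    found = min_substitutions(sequence, motif)
    return found is not None and found <= max_substitutions


MOTIF_11 = "G(RQ)(ML)KLFGTI(VL)YPKN(RK)Y(LI)TLQF"  # SEQ ID NO: 39

# Invented sequence: Motif 11 with its final F replaced by W, plus flanks.
candidate = "MAGRMKLFGTIVYPKNRYLTLQWKK"
print(min_substitutions(candidate, MOTIF_11))  # 1
```

This sketch does not distinguish conservative from non-conservative substitutions; doing so would require a substitution table, which is not specified here.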
[0152] Even further preferred RHL1 polypeptides useful in the methods of the invention are paralogous or orthologous proteins of any of the polypeptides of Table A.
[0153] Alternatively, the homologue of an RHL1 protein has in increasing order of preference at least 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% overall sequence identity to the amino acid represented by SEQ ID NO: 2, provided that the homologous protein comprises one of the conserved motifs as outlined above. The overall sequence identity is determined using a global alignment algorithm, such as the Needleman Wunsch algorithm in the program GAP (GCG Wisconsin Package, Accelrys), preferably with default parameters. Compared to overall sequence identity, the sequence identity will generally be higher when only conserved domains or motifs are considered.
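To illustrate how an overall sequence identity can be derived from a global alignment, the sketch below implements a basic Needleman-Wunsch alignment and reports identity as the number of identical aligned positions divided by the alignment length. It is not the GAP program and does not use its default parameters: the scoring values (match +1, mismatch -1, linear gap -1) and the identity denominator are assumptions chosen for simplicity, so the percentages it produces may differ from those obtained with GAP.

```python
from typing import Tuple


def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> Tuple[str, str]:
    """Global alignment of a and b; returns the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned positions as a percentage of alignment length."""
    aligned_a, aligned_b = needleman_wunsch(a, b)
    if not aligned_a:
        raise ValueError("both sequences are empty")
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


# Invented short peptides differing at one position.
print(round(percent_identity("ACDEFG", "ACDQFG"), 1))  # 83.3
```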
[0154] Preferably, the polypeptide sequence, when used in the construction of a phylogenetic tree such as the one depicted in FIG. 3, clusters with any of the RHL1 polypeptides originating from a dicotyledonous plant comprising the amino acid sequence represented by SEQ ID NO: 2 rather than with any other group.
[0155] Concerning TGase, any reference hereinafter to a "protein useful in the methods of the invention" is taken to mean a TGase polypeptide as defined herein. Any reference hereinafter to a "nucleic acid sequence useful in the methods of the invention" is taken to mean a nucleic acid sequence capable of encoding such a TGase polypeptide. The nucleic acid sequence to be introduced into a plant (and therefore useful in performing the methods of the invention) is any nucleic acid sequence encoding the type of polypeptide, which will now be described, hereafter also named "TGase nucleic acid sequence" or "TGase gene".
[0156] A "TGase polypeptide" as defined herein refers to any polypeptide comprising (i) a plastidic transit peptide; (ii) in increasing order of preference at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more amino acid sequence identity to a domain comprising at least one coiled coil as represented by SEQ ID NO: 70; and (iii) an Integrated relational Enzyme database entry EC 2.3.2.13 for protein-glutamine γ-glutamyltransferase.
[0157] Alternatively or additionally, a "TGase polypeptide" as defined herein refers to any polypeptide sequence having (i) a plastidic transit peptide; (ii) in increasing order of preference at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more amino acid sequence identity to a polypeptide as represented by SEQ ID NO: 45.
[0158] Alternatively or additionally, a "TGase polypeptide" as defined herein refers to any polypeptide having in increasing order of preference at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more amino acid sequence identity to a TGase polypeptide as represented by SEQ ID NO: 45, or to any of the polypeptide sequences given in Table A2 herein.
[0159] Alternatively or additionally, a "TGase polypeptide" as defined herein refers to any polypeptide sequence which when used in the construction of a TGase phylogenetic tree, such as the one depicted in FIG. 5, clusters with the clade of TGase polypeptides comprising the polypeptide sequence as represented by SEQ ID NO: 45 (marked by an arrow in FIG. 5; TGases from plants are delimited by a bracket in FIG. 5), rather than with the other clades.
[0160] Alternatively or additionally, a "TGase polypeptide" is a polypeptide with enzymatic activity consisting in catalyzing the formation of amide linkages, generally in a Ca-dependent fashion, between the primary amine of an amine donor substrate and the γ-carboxamide group of peptide-bound endo-glutamine residues in proteins or polypeptides that are the amine acceptors.
[0161] Concerning TRY-like polypeptides, any reference hereinafter to a "protein useful in the methods of the invention" is taken to mean a TRY-like polypeptide as defined herein. Any reference hereinafter to a "nucleic acid useful in the methods of the invention" is taken to mean a nucleic acid capable of encoding such a TRY-like polypeptide. The nucleic acid to be introduced into a plant (and therefore useful in performing the methods of the invention) is any nucleic acid encoding the type of protein which will now be described, hereafter also named "TRY-like nucleic acid" or "TRY-like gene".
[0162] A "TRY-like polypeptide" as defined herein refers to any polypeptide comprising a Myb-like DNA-binding domain (PFam domain PF00249.17, SMART domain SM00717, ProfileScan domain PS50090, Panther PTHR10641:SF26).
[0163] Preferably, the TRY-like polypeptide comprises one or more of the following motifs:
Motif 12: [FM][ST]E[DQ]EE[DT]LIIRM[YHF][NKR]LVG[EDN]RW[SE]LIAGRI
Motif 13: PGR[TK]AEEIE[KR][YF]WT[SM][RK]
Motif 14: EEVSS[QT][ED][SW][EK][FL][IE]
Motif 15: E[ED][DET][LI][IV]X[RK][LFM][HY]XL[LFV]G[NED][RK]WX[LI]IA[GRK]R[LIV][PV]GR[TKEQ][DAP][NEKG][EQ][IVQ]
wherein X on position 6 may be any amino acid, but preferably one of I, V, L, Y, S, F, C, or T; and wherein X on position 10 may be any amino acid, but preferably one of R, K, E, S, T, or N; and wherein X on position 17 may be any amino acid, but preferably one of S, D, A, E, P, or T. Preferably Motif 15 is EE[DT][LI][IV]XRM[HY][RKN]LVG[NED]RWX[LI]IA[GR]R[IV][PV]GR[TKEQ][AP][NEKG]E[IVQ] wherein X on position 6 may be any amino acid, but preferably one of Y, S, F, C, or T; and wherein X on position 17 may be any amino acid, but preferably one of D, A, E, P, or T.
[0164] More preferably, the TRY-like polypeptide comprises in increasing order of preference, at least 2, at least 3, or all 4 motifs.
[0165] Alternatively, the homologue of a TRY-like protein has in increasing order of preference at least 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% overall sequence identity to the amino acid sequence represented by SEQ ID NO: 76, provided that the homologous protein comprises the conserved motifs as outlined above. The overall sequence identity is determined using a global alignment algorithm, such as the Needleman-Wunsch algorithm in the program GAP (GCG Wisconsin Package, Accelrys), preferably with default parameters and preferably with sequences of mature proteins (i.e. without taking into account secretion signals or transit peptides). Compared to overall sequence identity, the sequence identity will generally be higher when only conserved domains or motifs are considered. Preferably the motifs in a TRY-like polypeptide have, in increasing order of preference, at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the motifs represented by SEQ ID NO: 229 to SEQ ID NO: 232 (Motifs 12 to 15).
[0166] Preferably, the polypeptide sequence, when used in the construction of a phylogenetic tree constructed with the polypeptide sequences of Table A3, clusters with the group of TRY-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 76 (At5g53200) rather than with any other group.
[0167] Concerning BZR polypeptides, any reference hereinafter to a "protein useful in the methods of the invention" is taken to mean a BZR polypeptide as defined herein. Any reference hereinafter to a "nucleic acid useful in the methods of the invention" is taken to mean a nucleic acid capable of encoding such a BZR polypeptide. The nucleic acid to be introduced into a plant (and therefore useful in performing the methods of the invention) is any nucleic acid encoding the type of protein which will now be described, hereafter also named "BZR nucleic acid" or "BZR gene".
[0168] A "BZR polypeptide" as defined herein refers to any transcription factor polypeptide comprising a BZR1 transcriptional repressor domain (Interpro accession number: IPR008540). Typically the N-terminus of BZR polypeptides comprises one or more nuclear localization signals and a bHLH-like DNA binding domain (Yin et al. (2008) Plant Physiology 147:280-295).
[0169] BZR transcription factors are well known in the art. BZR polypeptides belong to a small family of proteins of plant origin which function as transcriptional modulators involved in controlling the response to Brassinosteroids (BRs).
[0170] The BZR polypeptide useful in the methods of the invention comprises a domain having in increasing order of preference at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the BZR1 transcriptional repressor domain in SEQ ID NO: 238 located at amino acid position (coordinates) 10 to 157 in SEQ ID NO: 238 or to a BZR transcriptional repressor domain comprised in any of the polypeptides of Table A4.
[0171] Typically, the BZR polypeptides useful in the methods of the invention have a conserved bHLH-like domain located at the N-terminus of the protein; for example, such a domain corresponds to the sequence RERRRRAIAAKIFTGLRSQGNYKLPKHCDNNEVLKALCLEAGWIVHEDGT (SEQ ID NO: 326), located at positions 27 to 76 of SEQ ID NO: 238.
[0172] Additionally the BZR polypeptide useful in the methods of the invention may comprise a domain having in increasing order of preference at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to a bHLH-like domain as represented by SEQ ID NO: 326.
[0173] Additionally, the BZR polypeptide useful in the methods of the invention may comprise any one or more of the following motifs:
[0174] (i) Motif 16: SAPVTPPLSSP (SEQ ID NO: 323), wherein 1, 2, 3 or 4 residues may be substituted by any amino acid.
[0175] (ii) Motif 17: VKPWEGERIHE (SEQ ID NO: 324), wherein 1, 2, 3 or 4 residues may be substituted by any amino acid.
[0176] (iii) Motif 18: DLELTLG (SEQ ID NO: 325), wherein 1, 2, 3 or 4 residues may be substituted by any amino acid.
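As a purely illustrative sketch of how a motif with a permitted number of substitutions might be located, the following code slides a window along a sequence and counts mismatches against the motif. The function name and the example sequences in the usage note are hypothetical, and gaps (insertions or deletions) are deliberately not considered.

```python
def find_with_substitutions(sequence, motif, max_subs=4):
    """Return (start, mismatches) for every window of the sequence that differs
    from the motif by at most max_subs substituted residues (0-based start)."""
    hits = []
    width = len(motif)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Count positions where the window residue differs from the motif residue.
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_subs:
            hits.append((start, mismatches))
    return hits


MOTIF_16 = "SAPVTPPLSSP"  # SEQ ID NO: 323
MOTIF_17 = "VKPWEGERIHE"  # SEQ ID NO: 324
MOTIF_18 = "DLELTLG"      # SEQ ID NO: 325
```

For instance, find_with_substitutions("AAASAPVTPPLSSPGGG", MOTIF_16, max_subs=0) reports a single exact hit starting at position 3. Note that allowing 4 substitutions in the 7-residue Motif 18 leaves only 3 fixed positions, so hits at that tolerance are expected to be frequent and are best read together with the domain criteria above.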
[0177] Alternatively, the homologue of a BZR protein has in increasing order of preference at least 20%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% overall sequence identity to the amino acid sequence represented by SEQ ID NO: 238, provided that the homologous protein comprises the conserved BZR domain as outlined above. The overall sequence identity is determined using a global alignment algorithm, such as the Needleman-Wunsch algorithm in the program GAP (GCG Wisconsin Package, Accelrys), preferably with default parameters and preferably with sequences of mature proteins (i.e. without taking into account secretion signals or transit peptides). Compared to overall sequence identity, the sequence identity will generally be higher when only conserved domains or motifs are considered.
[0178] The terms "domain", "signature" and "motif" are defined in the "definitions" section herein. Specialist databases exist for the identification of domains, for example, SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864; Letunic et al. (2002) Nucleic Acids Res 30, 242-244), InterPro (Mulder et al., (2003) Nucl. Acids. Res. 31, 315-318), Prosite (Bucher and Bairoch (1994), A generalized profile syntax for biomolecular sequences motifs and its function in automatic sequence interpretation. (In) ISMB-94; Proceedings 2nd International Conference on Intelligent Systems for Molecular Biology. Altman R., Brutlag D., Karp P., Lathrop R., Searls D., Eds., pp 53-61, AAAI Press, Menlo Park; Hulo et al., Nucl. Acids. Res. 32:D134-D137, (2004)), or Pfam (Bateman et al., Nucleic Acids Research 30(1): 276-280 (2002)). A set of tools for in silico analysis of protein sequences is available on the ExPASy proteomics server (Swiss Institute of Bioinformatics (Gasteiger et al., ExPASy: the proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res. 31:3784-3788(2003)). Domains or motifs may also be identified using routine techniques, such as by sequence alignment.
[0179] Concerning TGase polypeptides, the terms "domain" and "motif" are defined in the "definitions" section herein. Specialist databases exist for the identification of domains, for example, SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864; Letunic et al. (2002) Nucleic Acids Res 30, 242-244), InterPro (Mulder et al., (2003) Nucl. Acids. Res. 31, 315-318), Prosite (Bucher and Bairoch (1994), A generalized profile syntax for biomolecular sequences motifs and its function in automatic sequence interpretation. (In) ISMB-94; Proceedings 2nd International Conference on Intelligent Systems for Molecular Biology. Altman R., Brutlag D., Karp P., Lathrop R., Searls D., Eds., pp 53-61, AAAI Press, Menlo Park; Hulo et al., Nucl. Acids. Res. 32: D134-D137, (2004)), or Pfam (Bateman et al., Nucleic Acids Research 30(1): 276-280 (2002)). A set of tools for in silico analysis of protein sequences is available on the ExPASy proteomics server (Swiss Institute of Bioinformatics; Gasteiger et al., ExPASy: the proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res. 31:3784-3788 (2003)). Domains may also be identified using routine techniques, such as by sequence alignment. An alignment of the polypeptides of Table A2 herein is shown in FIG. 6. Such alignments are useful for identifying the most conserved domains or motifs between the TGase polypeptides as defined herein. One such domain is a domain comprising at least one coiled coil, marked by X's in FIG. 6, and as represented by SEQ ID NO: 70.
[0180] Concerning TGase polypeptides, coiled coils are domains that are important to identify for protein-protein interactions, such as oligomerization, either of identical proteins, of proteins of the same family, or of unrelated proteins. Recently much progress has been made in computational prediction of coiled coils from sequence data. Algorithms well known to a person skilled in the art, such as COILS, PAIRCOIL, PAIRCOIL2, MULTICOIL, or MARCOIL, are available among the ExPASy Proteomics tools hosted by the Swiss Institute for Bioinformatics. Example 4 and FIG. 5 show, respectively, the numerical and graphical results for SEQ ID NO: 45 as produced by the COILS algorithm analysis. A domain comprising at least one coiled coil is identified in the TGase polypeptide sequence as represented by SEQ ID NO: 45, and is represented by SEQ ID NO: 70.
[0181] Methods for the alignment of sequences for comparison are well known in the art; such methods include GAP, BESTFIT, BLAST, FASTA and TFASTA. GAP uses the algorithm of Needleman and Wunsch ((1970) J Mol Biol 48: 443-453) to find the global (i.e. spanning the complete sequences) alignment of two sequences that maximizes the number of matches and minimizes the number of gaps. The BLAST algorithm (Altschul et al. (1990) J Mol Biol 215: 403-10) calculates percent sequence identity and performs a statistical analysis of the similarity between the two sequences. The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information (NCBI). Homologues may readily be identified using, for example, the ClustalW multiple sequence alignment algorithm (version 1.83), with the default pairwise alignment parameters, and a scoring method in percentage. Global percentages of similarity and identity may also be determined using one of the methods available in the MatGAT software package (Campanella et al., BMC Bioinformatics. 2003 Jul. 10; 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences.). Minor manual editing may be performed to optimise alignment between conserved motifs, as would be apparent to a person skilled in the art. Furthermore, instead of using full-length sequences for the identification of homologues, specific domains may also be used. The sequence identity values may be determined over the entire nucleic acid or amino acid sequence or over selected domains or conserved motif(s), using the programs mentioned above with the default parameters. For local alignments, the Smith-Waterman algorithm is particularly useful (Smith T F, Waterman M S (1981) J. Mol. Biol 147(1): 195-7).
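To illustrate the principle of a global alignment and of a percentage identity derived from it, a minimal Needleman-Wunsch sketch is given below. It is not the GAP program: the scoring scheme (match +1, mismatch -1, linear gap penalty -1) and the choice of the alignment length as denominator are assumptions made for this example only, whereas GAP and similar programs use substitution matrices and their own default gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return the two gapped strings.
    Scoring values are illustrative assumptions, not the defaults of any program."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percentage of alignment columns carrying identical residues."""
    al_a, al_b = needleman_wunsch(a, b)
    if not al_a:
        return 0.0
    identical = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * identical / len(al_a)
```

With these assumed scores, aligning "ACDE" with "ACE" yields ACDE over AC-E, i.e. 3 identical columns out of 4. Other conventions for the denominator (for example the length of the shorter sequence) give different percentages, which is one reason the program and parameters are specified whenever identity thresholds are stated.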
[0182] Concerning TGase polypeptides, Example 3 herein describes in Table B2 the percentage identity between the TGase polypeptide as represented by SEQ ID NO: 45 and the TGase polypeptides listed in Table A2, which can be as low as 26% amino acid sequence identity.
[0183] The task of protein subcellular localisation prediction is important and well studied. Knowing a protein's localisation helps elucidate its function. Experimental methods for protein localization range from immunolocalization to tagging of proteins using green fluorescent protein (GFP) or beta-glucuronidase (GUS). Such methods are accurate although labor-intensive compared with computational methods. Recently much progress has been made in computational prediction of protein localisation from sequence data. Algorithms well known to a person skilled in the art are available among the ExPASy Proteomics tools hosted by the Swiss Institute for Bioinformatics, for example, PSort, TargetP, ChloroP, LocTree, Predotar, LipoP, MITOPROT, PATS, PTS1, SignalP, TMHMM, and others. The subcellular localisation of polypeptides useful in performing the methods of the invention was previously described in the literature (Villalobos et al. (2004) Gene 336: 93-104). In particular, SEQ ID NO: 45 of the present invention is assigned to the plastidic (chloroplastic) compartment of plant cells.
[0184] Methods for targeting to plastids are well known in the art and include the use of transit peptides. Table 3 below shows examples of transit peptides which can be used to target any TGase polypeptide to a plastid, which TGase polypeptide is not, in its natural form, normally targeted to a plastid, or which TGase polypeptide in its natural form is targeted to a plastid by virtue of a different transit peptide (for example, its natural transit peptide). Cloning a nucleic acid sequence encoding a transit peptide upstream and in-frame of a nucleic acid sequence encoding a polypeptide (for example, a TGase polypeptide lacking its own transit peptide), involves standard molecular techniques that are well-known in the art.
TABLE-US-00013 TABLE 3 Examples of transit peptide sequences useful in targeting polypeptides to plastids
NCBI Accession Number/SEQ ID NO | Source Organism | Protein Function | Transit Peptide Sequence
SEQ ID NO: P07839 | Chlamydomonas | Ferredoxin | MAMAMRSTFAARVGAKPAVRGARPASRMSCMA
SEQ ID NO: AAR23425 | Chlamydomonas | Rubisco activase | MQVTMKSSAVSGQRVGGARVATRSVRRAQLQV
SEQ ID NO: CAA56932 | Arabidopsis thaliana | Aspartate amino transferase | MASLMLSLGSTSLLPREINKDKLKLGTSASNPFLKAKSFSRVTMTVAVKPSR
SEQ ID NO: CAA31991 | Arabidopsis thaliana | Acyl carrier protein1 | MATQFSASVSLQTSCLATTRISFQKPALISNHGKTNLSFNLRRSIPSRRLSVSC
SEQ ID NO: CAB63798 | Arabidopsis thaliana | Acyl carrier protein2 | MASIAASASISLQARPRQLAIAASQVKSFSNGRRSSLSFNLRQLPTRLTVSCAAKPETVDKVCAVVRKQL
SEQ ID NO: CAB63799 | Arabidopsis thaliana | Acyl carrier protein3 | MASIATSASTSLQARPRQLVIGAKQVKSFSYGSRSNLSFNLRQLPTRLTVYCAAKPETVDKVCAVVRKQLSLKE
[0185] The TGase polypeptide is targeted to and active in the chloroplast, i.e., the TGase polypeptide is capable of catalyzing the formation of amide linkages, generally in a Ca-dependent fashion, between the primary amine of an amine donor substrate and the γ-carboxamide group of peptide-bound endo-glutamine residues in proteins or polypeptides that are the amine acceptors (Villalobos et al. (2004) Gene 336: 93-104).
[0186] Furthermore, RHL1 polypeptides typically have DNA binding activity. Tools and techniques for measuring DNA binding activity are well known in the art. Further details are provided in the Examples section.
[0187] In addition, RHL1 polypeptides, when expressed in rice according to the methods of the present invention as outlined in the Examples section, give plants having increased yield related traits, in particular any one of early vigour, increased total seed weight per plant, increased number of seeds, increased number of filled seeds and increased harvest index.
[0188] Furthermore, TRY-like polypeptides (at least in their native form) typically have DNA binding activity. Tools and techniques for measuring DNA binding activity are well known in the art.
[0189] In addition, TRY-like polypeptides, when expressed in rice according to the methods of the present invention as outlined in the Examples section, give plants having increased yield related traits, in particular one or more of increased emergence vigour, increased fill rate, increased harvest index, increased total number of seeds, increased thousand kernel weight, increased number of first panicles, increased number of filled seeds and/or increased total weight of seeds.
[0190] Furthermore, BZR polypeptides (at least in their native form) typically have DNA-binding activity and optionally protein-binding activity. Tools and techniques for measuring DNA binding activity are well known in the art. For example, the EMSA technique may be used, which is based on the observation that protein:DNA complexes migrate more slowly than free DNA molecules when subjected to non-denaturing polyacrylamide or agarose gel electrophoresis (He et al. Science 307, 134-138 (2005)). Techniques useful to determine interaction between polypeptides are well known in the art and include but are not limited to yeast two hybrid, immunoprecipitation, or affinity purification of tagged proteins such as that used in TAP (Tandem Affinity Purification) technology (Rigaut et al. Nat Biotechnol. 1999 October; 17(10): 1030-2).
[0191] Preferably, BZR polypeptides useful in the methods of the invention have DNA binding activity and bind a DNA fragment that is, in increasing order of preference, at least 20, 30, 40, 50, 75, 100, 150, 200, 250, or 300 nucleotides long and comprises a BRRE element (Brassinosteroid response element) as represented by SEQ ID NO: 327 and/or an E-box element as represented by SEQ ID NO: 328. BRRE elements and E-box elements are well known in the art (He et al. 2005; Yin et al. 2008). Preferably the BRRE element and the E-box element comprise a sequence having at least 70%, 80%, 85%, 90%, or 95% sequence identity to the sequence CGTGC(T/C)G (BRRE element: SEQ ID NO: 90) and CANNTC (E-box: SEQ ID NO: 328), respectively. More preferably the DNA fragment to which the BZR polypeptide binds is selected from the CPD, DWF4, UBC and CNX5 promoters as described by He et al. 2005.
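As an illustration only, the consensus sequences recited above can be located in a candidate DNA fragment with simple pattern matching. The sketch below uses the consensus strings exactly as written in this paragraph, scans the given strand only, and requires exact matches (the percentage-identity tolerance is not modelled). The function name and the example fragment in the usage note are hypothetical.

```python
import re

# Consensus sequences as recited in the text; lookaheads report overlapping sites.
BRRE = re.compile(r"(?=CGTGC[TC]G)")       # CGTGC(T/C)G
E_BOX = re.compile(r"(?=CA[ACGT]{2}TC)")   # CANNTC, N being any nucleotide


def find_elements(dna):
    """Return 0-based start positions of exact BRRE and E-box matches
    on the given strand (the reverse complement is not scanned)."""
    seq = dna.upper()
    return {
        "BRRE": [m.start() for m in BRRE.finditer(seq)],
        "E-box": [m.start() for m in E_BOX.finditer(seq)],
    }
```

For the hypothetical fragment "AACGTGCTGAACAGGTCAA", the function reports one BRRE match starting at position 2 and one E-box match starting at position 11. A fuller analysis would typically also scan the reverse-complement strand and check the fragment length against the preferred minimum lengths.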
[0192] In addition, BZR polypeptides, when expressed in rice according to the methods of the present invention as outlined in the Examples section, give plants having increased seed yield, in particular increased number of filled seeds.
[0193] Concerning RHL1 polypeptides, the present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 1, encoding the polypeptide sequence of SEQ ID NO: 2. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any RHL1-encoding nucleic acid or RHL1 polypeptide as defined herein.
[0194] Concerning RHL1 polypeptides, examples of nucleic acids encoding RHL1 polypeptides are given in Table A1 of Example 1 herein. Such nucleic acids are useful in performing the methods of the invention. The amino acid sequences given in Table A1 of Example 1 are example sequences of orthologues and paralogues of the RHL1 polypeptide represented by SEQ ID NO: 2, the terms "orthologues" and "paralogues" being as defined herein. Further orthologues and paralogues may readily be identified by performing a so-called reciprocal blast search. Typically, this involves a first BLAST involving BLASTing a query sequence (for example using any of the sequences listed in Table A1 of Example 1) against any sequence database, such as the publicly available NCBI database. BLASTN or TBLASTX (using standard default values) are generally used when starting from a nucleotide sequence, and BLASTP or TBLASTN (using standard default values) when starting from a protein sequence. The BLAST results may optionally be filtered. The full-length sequences of either the filtered results or non-filtered results are then BLASTed back (second BLAST) against sequences from the organism from which the query sequence is derived (where the query sequence is SEQ ID NO: 1 or SEQ ID NO: 2, the second BLAST would therefore be against Arabidopsis thaliana sequences). The results of the first and second BLASTs are then compared. A paralogue is identified if a high-ranking hit from the first blast is from the same species as from which the query sequence is derived, a BLAST back then ideally results in the query sequence amongst the highest hits; an orthologue is identified if a high-ranking hit in the first BLAST is not from the same species as from which the query sequence is derived, and preferably results upon BLAST back in the query sequence being among the highest hits.
[0195] Concerning TGase polypeptides, the present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 44, encoding the TGase polypeptide sequence of SEQ ID NO: 45. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any nucleic acid sequence encoding a TGase polypeptide as defined herein.
[0196] Concerning TGase polypeptides, examples of nucleic acid sequences encoding TGase polypeptides are given in Table A2 of Example 1 herein. Such nucleic acid sequences are useful in performing the methods of the invention. The polypeptide sequences given in Table A2 of Example 1 are example sequences of orthologues and paralogues of the TGase polypeptide represented by SEQ ID NO: 45, the terms "orthologues" and "paralogues" being as defined herein. Further orthologues and paralogues may readily be identified by performing a so-called reciprocal blast search. Typically, this involves a first BLAST involving BLASTing a query sequence (for example using any of the sequences listed in Table A2 of Example 1) against any sequence database, such as the publicly available NCBI database. BLASTN or TBLASTX (using standard default values) are generally used when starting from a nucleotide sequence, and BLASTP or TBLASTN (using standard default values) when starting from a protein sequence. The BLAST results may optionally be filtered. The full-length sequences of either the filtered results or non-filtered results are then BLASTed back (second BLAST) against sequences from the organism from which the query sequence is derived (where the query sequence is SEQ ID NO: 44 or SEQ ID NO: 45, the second BLAST would therefore be against Oryza sativa sequences). The results of the first and second BLASTs are then compared. A paralogue is identified if a high-ranking hit from the first blast is from the same species as from which the query sequence is derived, a BLAST back then ideally results in the query sequence amongst the highest hits; an orthologue is identified if a high-ranking hit in the first BLAST is not from the same species as from which the query sequence is derived, and preferably results upon BLAST back in the query sequence being among the highest hits.
[0197] Concerning TRY-like polypeptides, the present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 75, encoding the polypeptide sequence of SEQ ID NO: 76. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any TRY-like-encoding nucleic acid or TRY-like polypeptide as defined herein.
[0198] Concerning TRY-like polypeptides, examples of nucleic acids encoding TRY-like polypeptides are given in Table A3 of Example 1 herein. Such nucleic acids are useful in performing the methods of the invention. The amino acid sequences given in Table A3 of Example 1 are example sequences of orthologues and paralogues of the TRY-like polypeptide represented by SEQ ID NO: 76, the terms "orthologues" and "paralogues" being as defined herein. Further orthologues and paralogues may readily be identified by performing a so-called reciprocal blast search. Typically, this involves a first BLAST involving BLASTing a query sequence (for example using any of the sequences listed in Table A3 of Example 1) against any sequence database, such as the publicly available NCBI database. BLASTN or TBLASTX (using standard default values) are generally used when starting from a nucleotide sequence, and BLASTP or TBLASTN (using standard default values) when starting from a protein sequence. The BLAST results may optionally be filtered. The full-length sequences of either the filtered results or non-filtered results are then BLASTed back (second BLAST) against sequences from the organism from which the query sequence is derived (where the query sequence is SEQ ID NO: 75 or SEQ ID NO: 76, the second BLAST would therefore be against Arabidopsis sequences). The results of the first and second BLASTs are then compared. A paralogue is identified if a high-ranking hit from the first blast is from the same species as from which the query sequence is derived, a BLAST back then ideally results in the query sequence amongst the highest hits; an orthologue is identified if a high-ranking hit in the first BLAST is not from the same species as from which the query sequence is derived, and preferably results upon BLAST back in the query sequence being among the highest hits.
[0199] Concerning BZR polypeptides, the present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 238, encoding the polypeptide sequence of SEQ ID NO: 239. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any BZR-encoding nucleic acid or BZR polypeptide as defined herein.
[0200] Concerning BZR polypeptides, examples of nucleic acids encoding BZR polypeptides are given in Table A4 of Example 1 herein. Such nucleic acids are useful in performing the methods of the invention. The amino acid sequences given in Table A4 of Example 1 are example sequences of orthologues and paralogues of the BZR polypeptide represented by SEQ ID NO: 239, the terms "orthologues" and "paralogues" being as defined herein. Further orthologues and paralogues may readily be identified by performing a so-called reciprocal blast search. Typically, this involves a first BLAST involving BLASTing a query sequence (for example using any of the sequences listed in Table A4 of Example 1) against any sequence database, such as the publicly available NCBI database. BLASTN or TBLASTX (using standard default values) are generally used when starting from a nucleotide sequence, and BLASTP or TBLASTN (using standard default values) when starting from a protein sequence. The BLAST results may optionally be filtered. The full-length sequences of either the filtered results or non-filtered results are then BLASTed back (second BLAST) against sequences from the organism from which the query sequence is derived (where the query sequence is SEQ ID NO: 238 or SEQ ID NO: 239, the second BLAST would therefore be against Arabidopsis thaliana sequences). The results of the first and second BLASTs are then compared. A paralogue is identified if a high-ranking hit from the first blast is from the same species as from which the query sequence is derived, a BLAST back then ideally results in the query sequence amongst the highest hits; an orthologue is identified if a high-ranking hit in the first BLAST is not from the same species as from which the query sequence is derived, and preferably results upon BLAST back in the query sequence being among the highest hits.
[0201] High-ranking hits are those having a low E-value. The lower the E-value, the more significant the score (or in other words the lower the chance that the hit was found by chance). Computation of the E-value is well known in the art. In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In the case of large families, ClustalW may be used, followed by a neighbour joining tree, to help visualize clustering of related genes and to identify orthologues and paralogues.
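The comparison of the first and second BLAST results described above can be sketched as follows. This is a minimal illustration operating on already-parsed hit lists, not a wrapper around any BLAST program: the dictionary keys, the identifiers in the usage note, and the cut-off of the three best back-hits are all assumptions made for the example. The sketch also treats confirmation by BLAST back as mandatory, whereas the description presents it as the ideal or preferred outcome.

```python
def classify_hit(query_id, query_species, hit, back_hits, top_n=3):
    """Classify one high-ranking hit of the first BLAST.

    hit:       dict with keys 'subject' and 'species' (from the first BLAST).
    back_hits: list of dicts with keys 'subject' and 'evalue', obtained by
               BLASTing the hit back against the query organism's sequences.
    Returns 'paralogue', 'orthologue' or 'unconfirmed'.
    """
    # A lower E-value means a more significant, i.e. higher-ranking, hit.
    ranked = sorted(back_hits, key=lambda h: h["evalue"])
    top_ids = [h["subject"] for h in ranked[:top_n]]
    if query_id not in top_ids:
        return "unconfirmed"
    if hit["species"] == query_species:
        return "paralogue"
    return "orthologue"
```

For example, with a hypothetical query "QUERY_1" from Arabidopsis thaliana, a rice hit whose back-search returns "QUERY_1" with the lowest E-value is classified as an orthologue, while an Arabidopsis hit satisfying the same condition is classified as a paralogue.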
[0202] Concerning BZR polypeptides, preferably, the BZR polynucleotides useful in the methods of the invention encode a polypeptide having in increasing order of preference 50%, 55%, 60%, 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of any of the polypeptides of Table A4.
[0203] Nucleic acid variants may also be useful in practising the methods of the invention. Examples of such variants include nucleic acid sequences encoding homologues and derivatives of any one of the amino acid sequences given in Table A1 to A4 of Example 1, the terms "homologue" and "derivative" being as defined herein. Also useful in the methods of the invention are nucleic acid sequences encoding homologues and derivatives of orthologues or paralogues of any one of the amino acid sequences given in Table A1 to A4 of Example 1. Homologues and derivatives useful in the methods of the present invention have substantially the same biological and functional activity as the unmodified protein from which they are derived. Further variants useful in practising the methods of the invention are variants in which codon usage is optimised or in which miRNA target sites are removed.
[0204] Further nucleic acid variants useful in practising the methods of the invention include portions of nucleic acid sequences encoding RHL1 polypeptides, or TGase polypeptides, or TRY-like polypeptides, or BZR polypeptides, nucleic acid sequences hybridising to nucleic acid sequences encoding RHL1 polypeptides, or TGase polypeptides, or TRY-like polypeptides, or BZR polypeptides, splice variants of nucleic acid sequences encoding RHL1 polypeptides, or TRY-like polypeptides, or BZR polypeptides, allelic variants of nucleic acids encoding RHL1 polypeptides, or TGase polypeptides, or TRY-like polypeptides, or BZR polypeptides, and variants of nucleic acid sequences encoding RHL1 polypeptides, or TGase polypeptides, or TRY-like polypeptides, or BZR polypeptides, obtained by gene shuffling. The terms hybridising sequence, splice variant, allelic variant and gene shuffling are as described herein.
[0205] Nucleic acids encoding RHL1 polypeptides, or TGase polypeptides, or TRY-like polypeptides, or BZR polypeptides, need not be full-length nucleic acids, since performance of the methods of the invention does not rely on the use of full-length nucleic acid sequences. According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a portion of any one of the nucleic acid sequences given in Table A1 to A4 of Example 1, or a portion of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1 to A4 of Example 1.
[0206] A portion of a nucleic acid may be prepared, for example, by making one or more deletions to the nucleic acid. The portions may be used in isolated form or they may be fused to other coding (or non-coding) sequences in order to, for example, produce a protein that combines several activities. When fused to other coding sequences, the resultant polypeptide produced upon translation may be bigger than that predicted for the protein portion.
[0207] Concerning RHL1 polypeptides, portions useful in the methods of the invention, encode a RHL1 polypeptide as defined herein, and have substantially the same biological activity as the amino acid sequences given in Table A1 of Example 1. Preferably, the portion is a portion of any one of the nucleic acids given in Table A1 of Example 1, or is a portion of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A1 of Example 1. Preferably the portion is at least 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 consecutive nucleotides in length, the consecutive nucleotides being of any one of the nucleic acid sequences given in Table A1 of Example 1, or of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A1 of Example 1. Most preferably the portion is a portion of the nucleic acid of SEQ ID NO: 1. Preferably, the portion encodes a fragment of an amino acid sequence which, when used in the construction of a phylogenetic tree, such as the one depicted in FIG. 3, clusters with any of the RHL1 polypeptides originating from a dicotyledoneous plant comprising the amino acid sequence represented by SEQ ID NO: 2 rather than with any other group.
[0208] Concerning TGase polypeptides, portions useful in the methods of the invention encode a TGase polypeptide as defined herein, and have substantially the same biological activity as the polypeptide sequences given in Table A2 of Example 1. Preferably, the portion is a portion of any one of the nucleic acid sequences given in Table A2 of Example 1, or is a portion of a nucleic acid sequence encoding an orthologue or paralogue of any one of the polypeptide sequences given in Table A2 of Example 1. Preferably the portion is, in increasing order of preference, at least 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200 or more consecutive nucleotides in length, the consecutive nucleotides being of any one of the nucleic acid sequences given in Table A2 of Example 1, or of a nucleic acid sequence encoding an orthologue or paralogue of any one of the polypeptide sequences given in Table A2 of Example 1. Preferably, the portion is a portion of a nucleic acid sequence encoding a polypeptide sequence having, in increasing order of preference, at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more amino acid sequence identity to the TGase polypeptide as represented by SEQ ID NO: 45 or to any of the polypeptide sequences given in Table A2 herein. Most preferably, the portion is a portion of the nucleic acid sequence of SEQ ID NO: 44.
[0209] Concerning TRY-like polypeptides, portions useful in the methods of the invention encode a TRY-like polypeptide as defined herein, and have substantially the same biological activity as the amino acid sequences given in Table A3 of Example 1. Preferably, the portion is a portion of any one of the nucleic acids given in Table A3 of Example 1, or is a portion of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A3 of Example 1. Preferably the portion is at least 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, 1700, 1750, 1800, 1850, 1900, 1950, 2000, 2050, 2100, 2150, 2200, 2250, 2300, 2350, 2400, 2450, 2500 consecutive nucleotides in length, the consecutive nucleotides being of any one of the nucleic acid sequences given in Table A3 of Example 1, or of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A3 of Example 1. Most preferably the portion is a portion of the nucleic acid of SEQ ID NO: 75. Preferably, the portion encodes a fragment of a polypeptide comprising a Myb-like DNA-binding domain (PFam domain PF00249.17, SMART domain SM00717, ProfileScan domain PS50090, Panther PTHR10641:SF26).
[0210] Concerning BZR polypeptides, portions useful in the methods of the invention encode a BZR polypeptide as defined herein, and have substantially the same biological activity as the amino acid sequences given in Table A4 of Example 1. Preferably, the portion is a portion of any one of the nucleic acids given in Table A4 of Example 1, or is a portion of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A4 of Example 1. Preferably the portion is at least 20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 500, 550, 600, 700, 800, 900 consecutive nucleotides in length, the consecutive nucleotides being of any one of the nucleic acid sequences given in Table A4 of Example 1, or of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A4 of Example 1. Most preferably the portion is a portion of the nucleic acid of SEQ ID NO: 238. Preferably, the portion encodes a fragment of an amino acid sequence comprising a protein domain having, in increasing order of preference, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to a bHLH-like domain as represented by SEQ ID NO: 326.
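The length criterion for portions set out above can be illustrated with a minimal sketch. The function name `is_portion`, the toy reference sequence and the default threshold of 500 nucleotides (the lowest value listed for RHL1 portions) are illustrative assumptions and do not form part of the disclosure. The sketch checks only that a candidate is a run of consecutive nucleotides of a reference sequence and meets a minimum length; it says nothing about whether the encoded fragment retains biological activity.

```python
def is_portion(candidate: str, reference: str, min_length: int = 500) -> bool:
    """Return True if `candidate` consists of at least `min_length`
    consecutive nucleotides taken from `reference`.

    Hypothetical helper for illustration only; the threshold depends on
    the polypeptide class (for example 500 nt is the lowest value listed
    for RHL1 portions).
    """
    candidate = candidate.upper()
    reference = reference.upper()
    # A portion must be long enough and must occur contiguously in the reference.
    return len(candidate) >= min_length and candidate in reference


# Toy 1200-nucleotide reference (a 30-nt unit repeated 40 times); not a real gene.
reference = "ATGGCTAGCTAGGATCCGATCGATCGGCTA" * 40

long_portion = reference[100:700]   # 600 consecutive nucleotides
short_portion = reference[0:200]    # only 200 consecutive nucleotides
unrelated = "A" * 600               # long enough, but not found in the reference

print(is_portion(long_portion, reference))   # long enough and contiguous
print(is_portion(short_portion, reference))  # too short for the 500-nt threshold
print(is_portion(unrelated, reference))      # not a run of the reference
```

A different threshold can be passed for other polypeptide classes, for example `is_portion(seq, ref, min_length=600)` for the lowest TGase value listed above.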
[0211] Another nucleic acid variant useful in the methods of the invention is a nucleic acid sequence capable of hybridising, under reduced stringency conditions, preferably under stringent conditions, with a nucleic acid encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, as defined herein, or with a portion as defined herein.
[0212] Concerning RHL1 polypeptides, or TRY-like polypeptides, according to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a nucleic acid capable of hybridizing to any one of the nucleic acids given in Table A1, or Table A3 of Example 1, or comprising introducing and expressing in a plant a nucleic acid capable of hybridising to a nucleic acid encoding an orthologue, paralogue or homologue of any of the nucleic acid sequences given in Table A1, or Table A3 of Example 1.
[0213] Concerning TGase polypeptides, according to the present invention, there is provided a method for increasing seed yield-related traits in plants, comprising introducing and expressing in a plant, a nucleic acid sequence capable of hybridizing to any one of the nucleic acid sequences given in Table A2 of Example 1, or comprising introducing and expressing in a plant, a nucleic acid sequence capable of hybridising to a nucleic acid sequence encoding an orthologue, paralogue or homologue of any of the nucleic acid sequences given in Table A2 of Example 1.
[0214] Concerning BZR polypeptides, according to the present invention, there is provided a method for increasing seed yield in plants, comprising introducing and expressing in a plant a nucleic acid capable of hybridizing to any one of the nucleic acids given in Table A4 of Example 1, or comprising introducing and expressing in a plant a nucleic acid capable of hybridising to a nucleic acid encoding an orthologue, paralogue or homologue of any of the nucleic acid sequences given in Table A4 of Example 1.
[0215] Concerning RHL1 polypeptides, hybridising sequences useful in the methods of the invention encode a RHL1 polypeptide as defined herein, having substantially the same biological activity as the amino acid sequences given in Table A1 of Example 1. Preferably, the hybridising sequence is capable of hybridising to any one of the nucleic acids given in Table A1 of Example 1, or to a portion of any of these sequences, a portion being as defined above, or the hybridising sequence is capable of hybridising to a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A1 of Example 1. Most preferably, the hybridising sequence is capable of hybridising to a nucleic acid as represented by SEQ ID NO: 1 or to a portion thereof.
[0216] Concerning TGase polypeptides, hybridising sequences useful in the methods of the invention encode a TGase polypeptide as defined herein, and have substantially the same biological activity as the polypeptide sequences given in Table A2 of Example 1. Preferably, the hybridising sequence is capable of hybridising to any one of the nucleic acid sequences given in Table A2 of Example 1, or to a complement thereof, or to a portion of any of these sequences, a portion being as defined above, or wherein the hybridising sequence is capable of hybridising to a nucleic acid sequence encoding an orthologue or paralogue of any one of the polypeptide sequences given in Table A2 of Example 1, or to a complement thereof. Preferably, the hybridising sequence is capable of hybridising to a nucleic acid sequence encoding a polypeptide sequence having, in increasing order of preference, at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more amino acid sequence identity to the TGase polypeptide as represented by SEQ ID NO: 45 or to any of the polypeptide sequences given in Table A2 herein. Most preferably, the hybridising sequence is capable of hybridising to a nucleic acid sequence as represented by SEQ ID NO: 44 or to a portion thereof.
[0217] Concerning TRY-like polypeptides, hybridising sequences useful in the methods of the invention encode a TRY-like polypeptide as defined herein, having substantially the same biological activity as the amino acid sequences given in Table A3 of Example 1. Preferably, the hybridising sequence is capable of hybridising to the complement of any one of the nucleic acids given in Table A3 of Example 1, or to a portion of any of these sequences, a portion being as defined above, or the hybridising sequence is capable of hybridising to the complement of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A3 of Example 1. Most preferably, the hybridising sequence is capable of hybridising to the complement of a nucleic acid as represented by SEQ ID NO: 75 or to a portion thereof.
[0218] Concerning BZR polypeptides, hybridising sequences useful in the methods of the invention encode a BZR polypeptide as defined herein, having substantially the same biological activity as the amino acid sequences given in Table A4 of Example 1. Preferably, the hybridising sequence is capable of hybridising to the complement of any one of the nucleic acids given in Table A4 of Example 1, or to a portion of any of these sequences, a portion being as defined above, or the hybridising sequence is capable of hybridising to the complement of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A4 of Example 1. Most preferably, the hybridising sequence is capable of hybridising to the complement of a nucleic acid as represented by SEQ ID NO: 238 or to a portion thereof.
[0219] Concerning RHL1 polypeptides, preferably, the hybridising sequence encodes a polypeptide with an amino acid sequence which, when full-length and used in the construction of a phylogenetic tree, such as the one depicted in FIG. 3, clusters with any of the RHL1 polypeptides originating from a dicotyledonous plant comprising the amino acid sequence represented by SEQ ID NO: 2 rather than with any other group.
[0220] Concerning TRY-like polypeptides, preferably, the hybridising sequence encodes a polypeptide comprising a Myb-like DNA-binding domain (PFam domain PF00249.17, SMART domain SM00717, ProfileScan domain PS50090, Panther PTHR10641:SF26).
[0221] Concerning BZR polypeptides, preferably, the hybridising sequence comprises a protein domain or encodes a polypeptide with an amino acid sequence which, when full-length, comprises a protein domain having, in increasing order of preference, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to a bHLH-like domain as represented by SEQ ID NO: 326.
[0222] Another nucleic acid sequence variant useful in the methods of the invention is a splice variant encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, as defined hereinabove, a splice variant being as defined herein.
[0223] Concerning RHL1 polypeptides, or TRY-like polypeptides, according to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a splice variant of any one of the nucleic acid sequences given in Table A1, or Table A3 of Example 1, or a splice variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1, or Table A3 of Example 1.
[0224] Concerning TGase polypeptides, according to the present invention, there is provided a method for increasing seed yield-related traits, comprising introducing and expressing in a plant, a splice variant of any one of the nucleic acid sequences given in Table A2 of Example 1, or a splice variant of a nucleic acid sequence encoding an orthologue, paralogue or homologue of any of the polypeptide sequences given in Table A2 of Example 1, having substantially the same biological activity as the polypeptide sequence as represented by SEQ ID NO: 45 and any of the polypeptide sequences depicted in Table A2 of Example 1.
[0225] Concerning BZR polypeptides, according to the present invention, there is provided a method for increasing seed yield in plants, comprising introducing and expressing in a plant a splice variant of any one of the nucleic acid sequences given in Table A4 of Example 1, or a splice variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A4 of Example 1.
[0226] Concerning RHL1 polypeptides, preferred splice variants are splice variants of a nucleic acid represented by SEQ ID NO: 1, or a splice variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 2. Preferably, the amino acid sequence encoded by the splice variant, when used in the construction of a phylogenetic tree, such as the one depicted in FIG. 3, clusters with any of the RHL1 polypeptides originating from a dicotyledonous plant comprising the amino acid sequence represented by SEQ ID NO: 2 rather than with any other group.
[0227] Concerning TGase polypeptides, preferred splice variants are splice variants of a nucleic acid sequence represented by SEQ ID NO: 44, or a splice variant of a nucleic acid sequence encoding an orthologue or paralogue of SEQ ID NO: 45. Preferably, the splice variant is a splice variant of a nucleic acid sequence encoding a polypeptide sequence having in increasing order of preference at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more amino acid sequence identity to the TGase polypeptide as represented by SEQ ID NO: 45 or to any of the polypeptide sequences given in Table A2 herein.
[0228] Concerning TRY-like polypeptides, preferred splice variants are splice variants of a nucleic acid represented by SEQ ID NO: 75, or a splice variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 76. Preferably, the amino acid sequence encoded by the splice variant comprises a Myb-like DNA-binding domain (PFam domain PF00249.17, SMART domain SM00717, ProfileScan domain PS50090, Panther PTHR10641:SF26).
[0229] Concerning BZR polypeptides, preferred splice variants are splice variants of a nucleic acid represented by SEQ ID NO: 238, or a splice variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 239. Preferably, the amino acid sequence encoded by the splice variant comprises a protein domain having, in increasing order of preference, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to a bHLH-like domain as represented by SEQ ID NO: 326.
[0230] Another nucleic acid sequence variant useful in performing the methods of the invention is an allelic variant of a nucleic acid encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, as defined hereinabove, an allelic variant being as defined herein.
[0231] Concerning RHL1 polypeptides, or TRY-like polypeptides, according to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant an allelic variant of any one of the nucleic acids given in Table A1, or Table A3 of Example 1, or comprising introducing and expressing in a plant an allelic variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1, or Table A3 of Example 1.
[0232] Concerning TGase polypeptides, according to the present invention, there is provided a method for increasing seed yield-related traits, comprising introducing and expressing in a plant, an allelic variant of any one of the nucleic acid sequences given in Table A2 of Example 1, or comprising introducing and expressing in a plant, an allelic variant of a nucleic acid sequence encoding an orthologue, paralogue or homologue of any of the polypeptide sequences given in Table A2 of Example 1.
[0233] Concerning BZR polypeptides, according to the present invention, there is provided a method for increasing seed yield in plants, comprising introducing and expressing in a plant an allelic variant of any one of the nucleic acids given in Table A4 of Example 1, or comprising introducing and expressing in a plant an allelic variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A4 of Example 1.
[0234] Concerning RHL1 polypeptides, the polypeptides encoded by allelic variants useful in the methods of the present invention have substantially the same biological activity as the RHL1 polypeptide of SEQ ID NO: 2 and any of the amino acid sequences depicted in Table A1 of Example 1. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Preferably, the allelic variant is an allelic variant of SEQ ID NO: 1 or an allelic variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 2. Preferably, the amino acid sequence encoded by the allelic variant, when used in the construction of a phylogenetic tree, such as the one depicted in FIG. 3, clusters with any of the RHL1 polypeptides originating from a dicotyledonous plant comprising the amino acid sequence represented by SEQ ID NO: 2 rather than with any other group.
[0235] Concerning TGase polypeptides, the allelic variants useful in the methods of the present invention have substantially the same biological activity as the TGase polypeptide of SEQ ID NO: 45 and any of the polypeptide sequences depicted in Table A2 of Example 1. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Preferably, the allelic variant is an allelic variant of SEQ ID NO: 44 or an allelic variant of a nucleic acid sequence encoding an orthologue or paralogue of SEQ ID NO: 45. Preferably, the allelic variant is an allelic variant of a polypeptide sequence having in increasing order of preference at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more amino acid sequence identity to the TGase polypeptide as represented by SEQ ID NO: 45 or to any of the polypeptide sequences given in Table A2 herein.
[0236] Concerning TRY-like polypeptides, the polypeptides encoded by allelic variants useful in the methods of the present invention have substantially the same biological activity as the TRY-like polypeptide of SEQ ID NO: 76 and any of the amino acid sequences depicted in Table A3 of Example 1. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Allelic variants of SEQ ID NO: 76 are for example described in Schellmann et al. (2002). Preferably, the allelic variant is an allelic variant of SEQ ID NO: 75 or an allelic variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 76. Preferably, the amino acid sequence encoded by the allelic variant comprises a Myb-like DNA-binding domain (PFam domain PF00249.17, SMART domain SM00717, ProfileScan domain PS50090, Panther PTHR10641:SF26).
[0237] Concerning BZR polypeptides, the polypeptides encoded by allelic variants useful in the methods of the present invention have substantially the same biological activity as the BZR polypeptide of SEQ ID NO: 239 and any of the amino acid sequences depicted in Table A4 of Example 1. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Preferably, the allelic variant is an allelic variant of SEQ ID NO: 238 or an allelic variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 239. Preferably, the amino acid sequence encoded by the allelic variant comprises a protein domain having, in increasing order of preference, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to a bHLH-like domain as represented by SEQ ID NO: 326.
[0238] Gene shuffling or directed evolution may also be used to generate variants of nucleic acid sequences encoding RHL1 polypeptides, or TGase polypeptides, or TRY-like polypeptides, or BZR polypeptides, as defined above; the term "gene shuffling" being as defined herein.
[0239] Concerning RHL1 polypeptides, or TRY-like polypeptides, according to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a variant of any one of the nucleic acid sequences given in Table A1, or Table A3 of Example 1, or comprising introducing and expressing in a plant a variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1, or Table A3 of Example 1, which variant nucleic acid is obtained by gene shuffling.
[0240] Concerning TGase polypeptides, according to the present invention, there is provided a method for increasing seed yield-related traits, comprising introducing and expressing in a plant, a variant of any one of the nucleic acid sequences given in Table A2 of Example 1, or comprising introducing and expressing in a plant a variant of a nucleic acid sequence encoding an orthologue, paralogue or homologue of any of the polypeptide sequences given in Table A2 of Example 1, which variant nucleic acid sequence is obtained by gene shuffling.
[0241] Concerning BZR polypeptides, according to the present invention, there is provided a method for increasing seed yield in plants, comprising introducing and expressing in a plant a variant of any one of the nucleic acid sequences given in Table A4 of Example 1, or comprising introducing and expressing in a plant a variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A4 of Example 1, which variant nucleic acid is obtained by gene shuffling.
[0242] Concerning RHL1 polypeptides, preferably, the amino acid sequence encoded by the variant nucleic acid obtained by gene shuffling, when used in the construction of a phylogenetic tree such as the one depicted in FIG. 3, clusters with any of the RHL1 polypeptides originating from a dicotyledonous plant comprising the amino acid sequence represented by SEQ ID NO: 2 rather than with any other group.
[0243] Concerning TGase polypeptides, preferably, the variant nucleic acid sequence obtained by gene shuffling encodes a polypeptide sequence having in increasing order of preference at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more amino acid sequence identity to the TGase polypeptide as represented by SEQ ID NO: 45 or to any of the polypeptide sequences given in Table A2 herein.
[0244] Concerning TRY-like polypeptides, preferably, the amino acid sequence encoded by the variant nucleic acid obtained by gene shuffling, comprises a Myb-like DNA-binding domain (PFam domain PF00249.17, SMART domain SM00717, ProfileScan domain PS50090, Panther PTHR10641:SF26).
[0245] Concerning BZR polypeptides, preferably, the amino acid sequence encoded by the variant nucleic acid obtained by gene shuffling comprises a protein domain having, in increasing order of preference, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to a bHLH-like domain as represented by SEQ ID NO: 326.
[0246] Furthermore, nucleic acid variants may also be obtained by site-directed mutagenesis. Several methods are available to achieve site-directed mutagenesis, the most common being PCR-based methods (Current Protocols in Molecular Biology. Wiley Eds.).
[0247] Nucleic acids encoding RHL1 polypeptides may be derived from any natural or artificial source. The nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the RHL1 polypeptide-encoding nucleic acid is from a plant, further preferably from a dicotyledonous plant, more preferably from the family Brassicaceae, most preferably the nucleic acid is from Arabidopsis thaliana.
[0248] Nucleic acid sequences encoding TGase polypeptides may be derived from any natural or artificial source. The nucleic acid sequence may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the nucleic acid sequence encoding a TGase polypeptide is from a plant, further preferably from a monocotyledonous plant, more preferably from the family Poaceae, most preferably the nucleic acid sequence is from Oryza sativa.
[0249] Nucleic acids encoding TRY-like polypeptides may be derived from any natural or artificial source. The nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the TRY-like polypeptide-encoding nucleic acid is from a plant, further preferably from a dicotyledonous plant, more preferably from the family Brassicaceae, most preferably the nucleic acid is from Arabidopsis thaliana.
[0250] Nucleic acids encoding BZR polypeptides may be derived from any natural or artificial source. The nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the BZR polypeptide-encoding nucleic acid is from a plant, further preferably from a heterologous plant, more preferably from a dicotyledonous plant, even more preferably from the family Brassicaceae, most preferably the nucleic acid is from Arabidopsis thaliana.
[0251] Advantageously, the present invention provides hitherto unknown BZR nucleic acid and polypeptide sequences.
[0252] According to a further embodiment of the present invention, there is provided an isolated nucleic acid molecule comprising:
[0253] (i) a nucleic acid represented by any one of SEQ ID NO: 13, 15, 17, 19;
[0254] (ii) a nucleic acid or fragment thereof that is complementary to any one of SEQ ID NO: 13, 15, 17, 19;
[0255] (iii) a nucleic acid encoding a BZR polypeptide having, in increasing order of preference, at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or more sequence identity to one of SEQ ID NO: 14, 16, 18, 20;
[0256] (iv) a nucleic acid capable of hybridizing under stringent conditions to any one of the nucleic acids given in (i), (ii) or (iii) above.
[0257] According to a further embodiment of the present invention, there is therefore provided an isolated polypeptide comprising:
[0258] (i) an amino acid sequence having, in increasing order of preference, at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or more sequence identity to one of SEQ ID NO: 14, 16, 18, 20;
[0259] (ii) derivatives of any of the amino acid sequences given in (i).
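The percentage sequence identity thresholds recited above can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned (gaps written as "-") and uses one common convention, namely identical positions divided by the number of alignment columns, multiplied by 100. The function names `percent_identity` and `meets_threshold` and the short peptide strings are hypothetical; the way identity is actually determined for the purposes of the invention is set out in the "definitions" section, and alignment programs may use other conventions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumed convention (illustrative only): identical residues divided by
    the number of alignment columns, ignoring columns that are gaps in
    both sequences. Gaps are written as '-'.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-residue peptides differing at two positions.
query = "MKTAYIAKQR"
subject = "MKTAHIAKQK"
print(percent_identity(query, subject))       # 8 identical positions out of 10
print(meets_threshold(query, subject, 80.0))  # compared against the lowest value in (i)
```

In practice the alignment itself would be produced by a dedicated alignment program; the sketch only shows how a threshold such as "at least 80%" is applied once an alignment is available.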
[0260] Concerning RHL1 polypeptides, performance of the methods of the invention gives plants having enhanced yield-related traits. In particular performance of the methods of the invention gives plants having increased yield, especially increased seed yield relative to control plants. The terms "yield" and "seed yield" are described in more detail in the "definitions" section herein.
[0261] Concerning TGase polypeptides, performance of the methods of the invention gives plants having increased seed yield-related traits relative to control plants. The terms "yield" and "seed yield" are described in more detail in the "definitions" section herein.
[0262] Concerning TRY-like polypeptides, performance of the methods of the invention gives plants having enhanced yield-related traits. In particular performance of the methods of the invention gives plants having increased early vigour (emergence vigour) and increased yield, especially increased seed yield relative to control plants. The terms "early vigour", "yield" and "seed yield" are described in more detail in the "definitions" section herein.
[0263] Concerning BZR polypeptides, performance of the methods of the invention gives plants having increased seed yield.
[0264] Concerning RHL1 polypeptides, reference herein to enhanced yield-related traits is taken to mean an increase in biomass (weight) of one or more parts of a plant, which may include aboveground (harvestable) parts and/or (harvestable) parts below ground. In particular, such harvestable parts are seeds, and performance of the methods of the invention results in plants having increased seed yield relative to the seed yield of control plants.
[0265] Concerning TRY-like polypeptides, reference herein to enhanced yield-related traits is taken to mean an increase in early vigour and/or an increase in biomass (weight) of one or more parts of a plant, which may include aboveground (harvestable) parts and/or (harvestable) parts below ground. In particular, such harvestable parts are seeds, and performance of the methods of the invention results in plants having increased seed yield relative to the seed yield of control plants.
[0266] Concerning BZR polypeptides, reference herein to increased seed yield is taken to mean an increase in any one or more of the following seed parameters: the seed weight, the total number of seeds, the number of filled seeds, the seed filling rate, the proportion of filled seeds, the size of the seed, the volume of the seed harvested. The person skilled in the art will recognize that the abovementioned seed yield parameters may be expressed in different units, including but not limited to per panicle and/or per plant and/or per harvest. Performance of the methods of the invention results in plants having increased seed yield relative to the seed yield of control plants.
[0267] Taking corn as an example, a yield increase may be manifested as one or more of the following: increase in the number of plants established per square meter, an increase in the number of ears per plant, an increase in the number of rows, number of kernels per row, kernel weight, thousand kernel weight, ear length/diameter, increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), among others. Taking rice as an example, a yield increase may manifest itself as an increase in one or more of the following: number of plants per square meter, number of panicles per plant, number of spikelets per panicle, number of flowers (florets) per panicle (which is expressed as a ratio of the number of filled seeds over the number of primary panicles), increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), increase in thousand kernel weight, among others.
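The seed filling rate defined above (the number of filled seeds divided by the total number of seeds and multiplied by 100) can be computed as in the following minimal sketch. The function names and the example counts are illustrative assumptions. The thousand kernel weight helper additionally assumes that this parameter is extrapolated from a counted number of seeds and their total weight, which is a common convention and is not stated in this passage.

```python
def seed_filling_rate(filled_seeds: int, total_seeds: int) -> float:
    """Seed filling rate in percent: filled seeds / total seeds * 100."""
    if total_seeds <= 0:
        raise ValueError("total_seeds must be positive")
    if not 0 <= filled_seeds <= total_seeds:
        raise ValueError("filled_seeds must lie between 0 and total_seeds")
    return 100.0 * filled_seeds / total_seeds


def thousand_kernel_weight(total_weight_g: float, seed_count: int) -> float:
    """Weight of 1000 seeds in grams, extrapolated from a counted sample.

    Assumed convention for illustration: total sample weight divided by
    the number of seeds counted, scaled to 1000 seeds.
    """
    if seed_count <= 0:
        raise ValueError("seed_count must be positive")
    return 1000.0 * total_weight_g / seed_count


# Hypothetical counts for a single plant: 500 seeds in total, 450 of them filled,
# with the filled seeds weighing 11.25 g together.
print(seed_filling_rate(450, 500))          # percentage of filled seeds
print(thousand_kernel_weight(11.25, 450))   # grams per 1000 filled seeds
```

Comparing such values between transgenic and control plants grown under the same conditions is what the expression "relative to control plants" refers to.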
[0268] The present invention provides a method for increasing yield, especially seed yield of plants, relative to control plants, which method comprises modulating expression in a plant of a nucleic acid encoding a RHL1 polypeptide, or a TRY-like polypeptide, as defined herein.
[0269] The present invention also provides a method for increasing seed yield-related traits of plants relative to control plants, which method comprises increasing expression in a plant of a nucleic acid sequence encoding a TGase polypeptide as defined herein.
[0270] The present invention furthermore provides a method for increasing seed yield of plants, relative to control plants, which method comprises modulating expression in a plant of a nucleic acid encoding a BZR polypeptide as defined herein.
[0271] Since the transgenic plants according to the present invention have increased yield and/or increased seed yield-related traits and/or increased seed yield, it is likely that these plants exhibit an increased growth rate (during at least part of their life cycle), relative to the growth rate of control plants at a corresponding stage in their life cycle.
[0272] The increased growth rate may be specific to one or more parts of a plant (including seeds), or may be throughout substantially the whole plant. Plants having an increased growth rate may have a shorter life cycle. The life cycle of a plant may be taken to mean the time needed to grow from a dry mature seed up to the stage where the plant has produced dry mature seeds, similar to the starting material. This life cycle may be influenced by factors such as early vigour, growth rate, greenness index, flowering time and speed of seed maturation. The increase in growth rate may take place at one or more stages in the life cycle of a plant or during substantially the whole plant life cycle. Increased growth rate during the early stages in the life cycle of a plant may reflect enhanced vigour. The increase in growth rate may alter the harvest cycle of a plant allowing plants to be sown later and/or harvested sooner than would otherwise be possible (a similar effect may be obtained with earlier flowering time). If the growth rate is sufficiently increased, it may allow for the further sowing of seeds of the same plant species (for example sowing and harvesting of rice plants followed by sowing and harvesting of further rice plants all within one conventional growing period). Similarly, if the growth rate is sufficiently increased, it may allow for the further sowing of seeds of different plant species (for example the sowing and harvesting of corn plants followed by, for example, the sowing and optional harvesting of soybean, potato or any other suitable plant). Harvesting additional times from the same rootstock in the case of some crop plants may also be possible. Altering the harvest cycle of a plant may lead to an increase in annual biomass production per square meter (due to an increase in the number of times (say in a year) that any particular plant may be grown and harvested).
An increase in growth rate may also allow for the cultivation of transgenic plants in a wider geographical area than their wild-type counterparts, since the territorial limitations for growing a crop are often determined by adverse environmental conditions either at the time of planting (early season) or at the time of harvesting (late season). Such adverse conditions may be avoided if the harvest cycle is shortened. The growth rate may be determined by deriving various parameters from growth curves; such parameters may be T-Mid (the time taken for plants to reach 50% of their maximal size) and T-90 (the time taken for plants to reach 90% of their maximal size), amongst others.
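By way of illustration only, T-Mid and T-90 may be derived from a measured growth curve as sketched below. The sketch assumes that the maximal size is the largest observed measurement and that the time of reaching a given fraction is found by linear interpolation between neighbouring measurements. Neither assumption is stated in the application, and the function name and the sample data are hypothetical.

```python
from typing import Sequence


def time_to_fraction(times: Sequence[float], sizes: Sequence[float],
                     fraction: float) -> float:
    """Time at which the plant first reaches `fraction` of its maximal size.

    Assumes `times` is sorted in increasing order and that the maximal size
    is the largest value in `sizes`; interpolates linearly between points.
    """
    if len(times) != len(sizes) or len(times) == 0:
        raise ValueError("times and sizes must be non-empty and of equal length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the interval (0, 1]")
    target = fraction * max(sizes)
    if sizes[0] >= target:
        return float(times[0])
    for i in range(1, len(sizes)):
        if sizes[i] >= target:
            t0, t1 = times[i - 1], times[i]
            s0, s1 = sizes[i - 1], sizes[i]
            # s1 > s0 here, because sizes[i - 1] was still below the target.
            return t0 + (target - s0) * (t1 - t0) / (s1 - s0)
    raise RuntimeError("target size never reached")  # not expected for fraction <= 1


if __name__ == "__main__":
    # Hypothetical measurements: days after sowing and plant size (arbitrary units).
    days = [0, 7, 14, 21, 28]
    size = [0, 10, 40, 90, 100]
    print("T-Mid:", time_to_fraction(days, size, 0.5))
    print("T-90:", time_to_fraction(days, size, 0.9))
```

A lower T-Mid or T-90 for transgenic plants than for control plants grown under comparable conditions would be consistent with an increased growth rate.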
[0273] According to a preferred feature of the present invention, performance of the methods of the invention gives plants having an increased growth rate relative to control plants. Therefore, according to the present invention, there is provided a method for increasing the growth rate of plants, which method comprises modulating expression in a plant of a nucleic acid sequence encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, as defined herein.
[0274] Increased seed yield-related traits occur whether the plant is under non-stress conditions or whether the plant is exposed to various stresses compared to control plants grown under comparable conditions. Plants typically respond to exposure to stress by growing more slowly. In conditions of severe stress, the plant may even stop growing altogether. Mild stress on the other hand is defined herein as being any stress to which a plant is exposed which does not result in the plant ceasing to grow altogether without the capacity to resume growth. Mild stress in the sense of the invention leads to a reduction in the growth of the stressed plants of less than 40%, 35% or 30%, preferably less than 25%, 20% or 15%, more preferably less than 14%, 13%, 12%, 11% or 10% or less in comparison to the control plant under non-stress conditions. Due to advances in agricultural practices (irrigation, fertilization, pesticide treatments) severe stresses are not often encountered in cultivated crop plants. As a consequence, the compromised growth induced by mild stress is often an undesirable feature for agriculture. Mild stresses are the everyday biotic and/or abiotic (environmental) stresses to which a plant is exposed. Abiotic stresses may be due to drought or excess water, anaerobic stress, salt stress, chemical toxicity, oxidative stress and hot, cold or freezing temperatures. The abiotic stress may be an osmotic stress caused by a water stress (particularly due to drought), salt stress, oxidative stress or an ionic stress. Biotic stresses are typically those stresses caused by pathogens, such as bacteria, viruses, fungi, nematodes, and insects. The term "non-stress" conditions as used herein are those environmental conditions that allow optimal growth of plants. Persons skilled in the art are aware of normal soil conditions and climatic conditions for a given location.
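The growth-reduction criterion for mild stress given above may be sketched as follows. The sketch covers only the percentage threshold (40% by default, with the stricter preferred values passed as a parameter) and does not model the separate requirement that the plant retains the capacity to resume growth. The function names and the way growth is measured are assumptions made for illustration.

```python
def growth_reduction_percent(stressed_growth: float, control_growth: float) -> float:
    """Reduction in growth of stressed plants relative to non-stressed controls, in percent."""
    if control_growth <= 0:
        raise ValueError("control_growth must be positive")
    return (control_growth - stressed_growth) / control_growth * 100.0


def is_mild_stress(stressed_growth: float, control_growth: float,
                   threshold_percent: float = 40.0) -> bool:
    """True if the growth reduction is below the chosen threshold."""
    return growth_reduction_percent(stressed_growth, control_growth) < threshold_percent


if __name__ == "__main__":
    # Hypothetical growth measurements (for example biomass in grams).
    print(growth_reduction_percent(stressed_growth=80.0, control_growth=100.0))
    print(is_mild_stress(80.0, 100.0))                          # broadest threshold
    print(is_mild_stress(80.0, 100.0, threshold_percent=10.0))  # strictest preferred threshold
```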
[0275] In particular, the methods of the present invention may be performed under non-stress conditions or under conditions of mild drought to give plants having increased yield relative to control plants. As reported in Wang et al. (Planta (2003) 218: 1-14), abiotic stress leads to a series of morphological, physiological, biochemical and molecular changes that adversely affect plant growth and productivity. Drought, salinity, extreme temperatures and oxidative stress are known to be interconnected and may induce growth and cellular damage through similar mechanisms. Rabbani et al. (Plant Physiol (2003) 133: 1755-1767) describes a particularly high degree of "cross talk" between drought stress and high-salinity stress. For example, drought and/or salinisation are manifested primarily as osmotic stress, resulting in the disruption of homeostasis and ion distribution in the cell. Oxidative stress, which frequently accompanies high or low temperature, salinity or drought stress, may cause denaturing of functional and structural proteins. As a consequence, these diverse environmental stresses often activate similar cell signalling pathways and cellular responses, such as the production of stress proteins, up-regulation of anti-oxidants, accumulation of compatible solutes and growth arrest. The term "non-stress" conditions as used herein are those environmental conditions that allow optimal growth of plants. Persons skilled in the art are aware of normal soil conditions and climatic conditions for a given location. Plants with optimal growth conditions, (grown under non-stress conditions) typically yield in increasing order of preference at least 97%, 95%, 92%, 90%, 87%, 85%, 83%, 80%, 77% or 75% of the average production of such plant in a given environment. Average production may be calculated on harvest and/or season basis. Persons skilled in the art are aware of average yield productions of a crop.
[0276] The term "abiotic stress" as defined herein is taken to mean any one or more of: water stress (due to drought or excess water), anaerobic stress, salt stress, temperature stress (due to hot, cold or freezing temperatures), chemical toxicity stress and oxidative stress. According to one aspect of the invention, the abiotic stress is an osmotic stress, selected from water stress, salt stress, oxidative stress and ionic stress. Preferably, the water stress is drought stress. The term salt stress is not restricted to common salt (NaCl), but may be any stress caused by one or more of: NaCl, KCl, LiCl, MgCl2, CaCl2, amongst others.
[0277] Performance of the methods of the invention gives plants having increased seed yield-related traits, under abiotic stress conditions relative to control plants grown in comparable stress conditions. Therefore, according to the present invention, there is provided a method for increasing seed yield-related traits, in plants grown under abiotic stress conditions, which method comprises increasing expression in a plant of a nucleic acid sequence encoding a TGase polypeptide. According to one aspect of the invention, the abiotic stress is an osmotic stress, selected from one or more of the following: water stress, salt stress, oxidative stress and ionic stress.
[0278] Another example of abiotic environmental stress is the reduced availability of one or more nutrients that need to be assimilated by the plants for growth and development. Because of the strong influence of nutrition utilization efficiency on plant yield and product quality, a huge amount of fertilizer is poured onto fields to optimize plant growth and quality. Productivity of plants ordinarily is limited by three primary nutrients, phosphorus, potassium and nitrogen, of which nitrogen is usually the rate-limiting element in plant growth. Therefore the major nutritional element required for plant growth is nitrogen (N). It is a constituent of numerous important compounds found in living cells, including amino acids, proteins (enzymes), nucleic acids, and chlorophyll. Nitrogen makes up 1.5% to 2% of plant dry matter and approximately 16% of total plant protein. Thus, nitrogen availability is a major limiting factor for crop plant growth and production (Frink et al. (1999) Proc Natl Acad Sci USA 96(4): 1175-1180), and has as well a major impact on protein accumulation and amino acid composition. Therefore, of great interest are crop plants with increased seed yield-related traits, when grown under nitrogen-limiting conditions.
[0279] Performance of the methods of the invention gives plants grown under conditions of reduced nutrient availability, particularly under conditions of reduced nitrogen availability, having increased seed yield-related traits relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing seed yield-related traits in plants grown under conditions of reduced nutrient availability, preferably reduced nitrogen availability, which method comprises increasing expression in a plant of a nucleic acid sequence encoding a TGase polypeptide. Reduced nutrient availability may result from a deficiency or excess of nutrients such as nitrogen, phosphates and other phosphorous-containing compounds, potassium, calcium, cadmium, magnesium, manganese, iron and boron, amongst others. Preferably, reduced nutrient availability is reduced nitrogen availability.
[0280] Concerning RHL1 polypeptides, performance of the methods of the invention gives plants grown under non-stress conditions or under mild drought conditions increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under non-stress conditions or under mild drought conditions, which method comprises modulating expression in a plant of a nucleic acid sequence encoding a RHL1 polypeptide.
[0281] Concerning TGase polypeptides, performance of the methods of the invention gives plants grown under non-stress conditions or under mild stress conditions having increased seed yield-related traits, relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing seed yield-related traits in plants grown under non-stress conditions or under mild stress conditions, which method comprises increasing expression in a plant of a nucleic acid sequence encoding a TGase polypeptide.
[0282] Concerning TRY-like polypeptides, performance of the methods of the invention gives plants grown under non-stress conditions or under mild drought conditions increased yield relative to control plants grown under comparable conditions. Therefore, according to one embodiment of the present invention, there is provided a method for increasing yield in plants grown under non-stress conditions or under mild drought conditions, which method comprises modulating expression in a plant of a nucleic acid encoding a TRY-like polypeptide.
[0283] Concerning BZR polypeptides, performance of the methods of the invention gives plants grown under non-stress conditions or under mild drought conditions increased seed yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing seed yield in plants grown under non-stress conditions or under mild drought conditions, which method comprises modulating expression in a plant of a nucleic acid encoding a BZR polypeptide.
[0284] Concerning RHL1 polypeptides, performance of the methods of the invention gives plants grown under conditions of nutrient deficiency, particularly under conditions of nitrogen deficiency, increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of nutrient deficiency, which method comprises modulating expression in a plant of a nucleic acid encoding a RHL1 polypeptide. Nutrient deficiency may result from a lack of nutrients such as nitrogen, phosphates and other phosphorous-containing compounds, potassium, calcium, cadmium, magnesium, manganese, iron and boron, amongst others.
[0285] Concerning TRY-like polypeptides, performance of the methods of the invention gives plants grown under conditions of nutrient deficiency, particularly under conditions of nitrogen deficiency, increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of nutrient deficiency, which method comprises modulating expression in a plant of a nucleic acid encoding a TRY-like polypeptide. Nutrient deficiency may result from a lack of nutrients such as nitrogen, phosphates and other phosphorous-containing compounds, potassium, calcium, magnesium, manganese, iron and boron, amongst others. In another embodiment of the invention, the improved yield related traits are obtained under conditions of nitrogen deficiency.
[0286] Concerning BZR polypeptides, performance of the methods of the invention gives plants grown under conditions of nutrient deficiency, particularly under conditions of nitrogen deficiency, increased seed yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing seed yield in plants grown under conditions of nutrient deficiency, which method comprises modulating expression in a plant of a nucleic acid encoding a BZR polypeptide. Nutrient deficiency may result from a lack of nutrients such as nitrogen, phosphates and other phosphorous-containing compounds, potassium, calcium, cadmium, magnesium, manganese, iron and boron, amongst others.
[0287] Concerning TRY-like polypeptides, performance of the methods of the invention gives plants grown under conditions of salt stress, increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of salt stress, which method comprises modulating expression in a plant of a nucleic acid encoding a TRY-like polypeptide. The term salt stress is not restricted to common salt (NaCl), but may be any one or more of: NaCl, KCl, LiCl, MgCl2, CaCl2, amongst others.
[0288] Concerning BZR polypeptides, performance of the methods of the invention gives plants grown under conditions of salt stress, increased seed yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing seed yield in plants grown under conditions of salt stress, which method comprises modulating expression in a plant of a nucleic acid encoding a BZR polypeptide. The term salt stress is not restricted to common salt (NaCl), but may be any one or more of: NaCl, KCl, LiCl, MgCl2, CaCl2, amongst others.
[0289] The present invention encompasses plants or parts thereof (including seeds) or cells thereof obtainable by the methods according to the present invention. The plants or parts thereof comprise a nucleic acid transgene encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, as defined above, operably linked to a promoter functioning in plants.
[0290] The invention also provides genetic constructs and vectors to facilitate introduction and/or expression in plants of nucleic acid sequences encoding RHL1 polypeptides, or TGase polypeptides, or TRY-like polypeptides, or BZR polypeptides. The gene constructs may be inserted into vectors, which may be commercially available, suitable for transforming into plants and suitable for expression of the gene of interest in the transformed cells. The invention also provides use of a gene construct as defined herein in the methods of the invention.
[0291] More specifically, the present invention provides a construct comprising:
[0292] (a) a nucleic acid encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, as defined above;
[0293] (b) one or more control sequences capable of driving, or increasing expression of the nucleic acid sequence of (a); and optionally
[0294] (c) a transcription termination sequence.
[0295] Preferably, the nucleic acid sequence encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, is as defined above. The terms "control sequence" and "termination sequence" are as defined herein.
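Purely as an illustrative sketch, the three components (a) to (c) of the construct set out above may be represented as a simple data structure. The class name, the field names and the placeholder labels are hypothetical; actual constructs are nucleic acid molecules, and the strings below merely stand in for the corresponding sequences.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Construct:
    coding_sequence: str               # (a) nucleic acid encoding the polypeptide
    control_sequences: List[str]       # (b) one or more control sequences, e.g. a promoter
    terminator: Optional[str] = None   # (c) optional transcription termination sequence

    def __post_init__(self) -> None:
        if not self.coding_sequence:
            raise ValueError("component (a) is required")
        if not self.control_sequences:
            raise ValueError("component (b) needs at least one control sequence")

    def describe(self) -> str:
        """List the elements in 5' to 3' order for a quick sanity check."""
        parts = list(self.control_sequences) + [self.coding_sequence]
        if self.terminator:
            parts.append(self.terminator)
        return " -> ".join(parts)


if __name__ == "__main__":
    # Placeholder labels only; no sequence data from the application is used.
    construct = Construct("TGase coding sequence", ["GOS2 promoter"])
    print(construct.describe())
```

Making the terminator optional mirrors the wording "and optionally" in the list above, while the check on the control sequences reflects the requirement of one or more such sequences.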
[0296] Concerning TGase polypeptides, preferably, one of the control sequences of a construct is a seed-specific promoter isolated from a plant genome. An example of a seed-specific promoter is an alpha-globulin promoter, preferably a rice alpha-globulin promoter, more preferably an alpha-globulin promoter as represented by SEQ ID NO: 72. Alternatively, a control sequence is a constitutive promoter, for example a GOS2 promoter, preferably a GOS2 promoter from rice, most preferably a GOS2 sequence as represented by SEQ ID NO: 71.
[0297] Plants are transformed with a vector comprising any of the nucleic acids described above. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells containing the sequence of interest. The sequence of interest is operably linked to one or more control sequences (at least to a promoter).
[0298] Concerning RHL1 polypeptides, advantageously, any type of promoter, whether natural or synthetic, may be used to drive expression of the nucleic acid sequence, but preferably the promoter is of plant origin. A constitutive promoter is particularly useful in the methods. Preferably the constitutive promoter is also a ubiquitous promoter. See the "Definitions" section herein for definitions of the various promoter types. Also useful in the methods of the invention is a root-specific promoter.
[0299] Concerning TGase polypeptides, advantageously, any type of promoter, whether natural or synthetic, may be used to increase expression of the nucleic acid sequence. A constitutive promoter is particularly useful in the methods, preferably a constitutive promoter isolated from a plant genome. The plant constitutive promoter drives expression of a coding sequence at a level that is in all instances below that obtained under the control of a 35S CaMV viral promoter. An example of such a promoter is a GOS2 promoter as represented by SEQ ID NO: 71. Organ-specific promoters, for example for preferred expression in leaves, stems, tubers, meristems, are useful in performing the methods of the invention. Developmentally-regulated and inducible promoters are also useful in performing the methods of the invention. Preferably, seed-specific promoters are particularly useful in the methods of the invention. See the "Definitions" section herein for definitions of the various promoter types.
[0300] Concerning TRY-like polypeptides, advantageously, any type of promoter, whether natural or synthetic, may be used to drive expression of the nucleic acid sequence, but preferably the promoter is of plant origin. A constitutive promoter is particularly useful in the methods. Preferably the constitutive promoter is a ubiquitous constitutive promoter of medium strength. See the "Definitions" section herein for definitions of the various promoter types. Also useful in the methods of the invention is a root-specific promoter.
[0301] Concerning BZR polypeptides, advantageously, any type of promoter, whether natural or synthetic, may be used to drive expression of the nucleic acid sequence, but preferably the promoter is of plant origin. A constitutive promoter is particularly useful in the methods.
[0302] Preferably the constitutive promoter is also a ubiquitous promoter of medium strength. See the "Definitions" section herein for definitions of the various promoter types.
[0303] Concerning RHL1 polypeptides, it should be clear that the applicability of the present invention is not restricted to the RHL1 polypeptide-encoding nucleic acid represented by SEQ ID NO: 1, nor is the applicability of the invention restricted to expression of a RHL1 polypeptide-encoding nucleic acid when driven by a constitutive promoter, or when driven by a root-specific promoter.
[0304] The constitutive promoter is preferably a medium strength promoter, such as a GOS2 promoter, preferably the promoter is a GOS2 promoter from rice. Further preferably the constitutive promoter is represented by a nucleic acid sequence substantially similar to SEQ ID NO: 39, most preferably the constitutive promoter is as represented by SEQ ID NO: 39. See Table 2 in the "Definitions" section herein for further examples of constitutive promoters.
[0305] According to another preferred feature of the invention, the nucleic acid encoding a RHL1 polypeptide is operably linked to a root-specific promoter. The root-specific promoter is preferably an RCc3 promoter (Plant Mol Biol. 1995 January; 27(2):237-48), more preferably the RCc3 promoter is from rice, further preferably the RCc3 promoter is represented by a nucleic acid sequence substantially similar to SEQ ID NO: 43, most preferably the promoter is as represented by SEQ ID NO: 43. Examples of other root-specific promoters which may also be used to perform the methods of the invention are shown in Table 3 in the "Definitions" section above.
[0306] Concerning TGase polypeptides, it should be clear that the applicability of the present invention is not restricted to a nucleic acid sequence encoding the TGase polypeptide, as represented by SEQ ID NO: 45, nor is the applicability of the invention restricted to expression of a TGase polypeptide-encoding nucleic acid sequence when driven by a seed-specific promoter.
[0307] Concerning TRY-like polypeptides, it should be clear that the applicability of the present invention is not restricted to the TRY-like polypeptide-encoding nucleic acid represented by SEQ ID NO: 75, nor is the applicability of the invention restricted to expression of a TRY-like polypeptide-encoding nucleic acid when driven by a constitutive promoter, or when driven by a root-specific promoter.
[0308] The constitutive promoter is preferably a medium strength promoter, more preferably a promoter selected from a plant, such as a GOS2 promoter, more preferably the promoter is a GOS2 promoter from rice. Further preferably the constitutive promoter is represented by a nucleic acid sequence substantially similar to SEQ ID NO: 237, most preferably the constitutive promoter is as represented by SEQ ID NO: 237. See the "Definitions" section herein for further examples of constitutive promoters.
[0309] According to another preferred feature of the invention, the nucleic acid encoding a TRY-like polypeptide is operably linked to a root-specific promoter. The root-specific promoter is preferably an RCc3 promoter (Plant Mol Biol. 1995 January; 27(2):237-48), more preferably the RCc3 promoter is from rice, further preferably the RCc3 promoter is represented by a nucleic acid sequence substantially similar to SEQ ID NO: 235, most preferably the promoter is as represented by SEQ ID NO: 235. Examples of other root-specific promoters which may also be used to perform the methods of the invention are shown in Table 2 in the "Definitions" section above.
[0310] Concerning TRY-like polypeptides, optionally, one or more terminator sequences may be used in the construct introduced into a plant. Preferably, the construct comprises an expression cassette essentially similar or identical to SEQ ID NO: 236, comprising the RCc3 promoter and the nucleic acid encoding the TRY-like polypeptide, or an expression cassette wherein the nucleic acid encoding the TRY-like polypeptide is operably linked to a rice GOS2 promoter that is substantially similar to SEQ ID NO: 237.
[0311] Concerning BZR polypeptides, it should be clear that the applicability of the present invention is not restricted to the BZR polypeptide-encoding nucleic acid represented by SEQ ID NO: 238, nor is the applicability of the invention restricted to expression of a BZR polypeptide-encoding nucleic acid when driven by a constitutive promoter.
[0312] The constitutive promoter is preferably a medium strength promoter, more preferably selected from a plant derived promoter, such as a GOS2 promoter, more preferably the promoter is a GOS2 promoter from rice. Further preferably the constitutive promoter is represented by a nucleic acid sequence substantially similar to SEQ ID NO: 322, most preferably the constitutive promoter is as represented by SEQ ID NO: 322. See the "Definitions" section herein for further examples of plant derived and constitutive promoters. A plant derived promoter is preferably of plant origin. The plant derived promoter can be isolated by any of the well known techniques in the art from a plant or may be obtained via other methods such as chemical synthesis using any of the well-known suitable techniques in the art. The plant derived promoter preferably has substantially the same expression pattern and strength as that of a plant promoter of plant origin. Sequence and element structure of the plant derived promoter are similar to that of a promoter of plant origin. Preferably the plant derived promoter comprises a sequence having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to a promoter of plant origin.
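The sequence identity thresholds listed above may be illustrated with a minimal sketch that computes percentage identity between two sequences that have already been aligned. The sketch assumes that gaps are written as "-" and that identity is counted over all alignment columns other than columns where both sequences have a gap. Other conventions for the denominator exist, and the alignment step itself is not shown. The function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # a column with gaps in both sequences carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return matches / compared * 100.0


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_percent: float = 50.0) -> bool:
    """True if the identity is at least the given threshold (50% is the broadest value)."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Short made-up sequences, not promoter sequences from the application.
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
    print(meets_identity_threshold("ACGT-A", "ACGTTA", threshold_percent=80.0))
```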
[0313] Optionally, one or more terminator sequences may be used in the construct introduced into a plant. Additional regulatory elements may include transcriptional as well as translational enhancers. Those skilled in the art will be aware of terminator and enhancer sequences that may be suitable for use in performing the invention. An intron sequence may also be added to the 5' untranslated region (UTR) or in the coding sequence to increase the amount of the mature message that accumulates in the cytosol, as described in the definitions section. Other control sequences (besides promoter, enhancer, silencer, intron sequences, 3'UTR and/or 5'UTR regions) may be protein and/or RNA stabilizing elements. Such sequences would be known or may readily be obtained by a person skilled in the art.
[0314] The genetic constructs of the invention may further include an origin of replication sequence that is required for maintenance and/or replication in a specific cell type. One example is when a genetic construct is required to be maintained in a bacterial cell as an episomal genetic element (e.g. plasmid or cosmid molecule). Preferred origins of replication include, but are not limited to, the f1-ori and colE1.
[0315] For the detection of the successful transfer of the nucleic acid sequences as used in the methods of the invention and/or selection of transgenic plants comprising these nucleic acids, it is advantageous to use marker genes (or reporter genes). Therefore, the genetic construct may optionally comprise a selectable marker gene. Selectable markers are described in more detail in the "definitions" section herein. The marker genes may be removed or excised from the transgenic cell once they are no longer needed. Techniques for marker removal are known in the art, useful techniques are described above in the definitions section.
[0317] It is known that upon stable or transient integration of nucleic acid sequences into plant cells, only a minority of the cells takes up the foreign DNA and, if desired, integrates it into its genome, depending on the expression vector used and the transfection technique used. To identify and select these integrants, a gene coding for a selectable marker (such as the ones described above) is usually introduced into the host cells together with the gene of interest. These markers can for example be used in mutants in which these genes are not functional by, for example, deletion by conventional methods. Furthermore, nucleic acid sequence molecules encoding a selectable marker can be introduced into a host cell on the same vector that comprises the sequence encoding the polypeptides of the invention or used in the methods of the invention, or else in a separate vector. Cells which have been stably transfected with the introduced nucleic acid sequence can be identified for example by selection (for example, cells which have integrated the selectable marker survive whereas the other cells die). The marker genes may be removed or excised from the transgenic cell once they are no longer needed. Techniques for marker gene removal are known in the art, useful techniques are described above in the definitions section.
[0318] Concerning RHL1 polypeptides, or TRY-like polypeptides, the invention also provides a method for the production of transgenic plants having enhanced yield-related traits relative to control plants, comprising introduction and expression in a plant of any nucleic acid encoding a RHL1 polypeptide, or a TRY-like polypeptide, as defined hereinabove.
[0319] Concerning TGase polypeptides, the invention also provides a method for the production of transgenic plants having increased seed yield-related traits relative to control plants, comprising introduction and expression in a plant of any nucleic acid sequence encoding a TGase polypeptide as defined hereinabove.
[0320] Concerning BZR polypeptides, the invention also provides a method for the production of transgenic plants having increased seed yield relative to control plants, comprising introduction and expression in a plant of any nucleic acid encoding a BZR polypeptide as defined hereinabove.
[0321] Concerning RHL1 polypeptides, more specifically, the present invention provides a method for the production of transgenic plants having enhanced yield-related traits, particularly increased (seed) yield, which method comprises:
[0322] (i) introducing and expressing in a plant or plant cell a RHL1 polypeptide-encoding nucleic acid; and
[0323] (ii) cultivating the plant cell under conditions promoting plant growth and development.
[0324] The nucleic acid of (i) may be any of the nucleic acids capable of encoding a RHL1 polypeptide as defined herein.
[0325] Concerning TGase polypeptides, more specifically, the present invention provides a method for the production of transgenic plants having increased seed yield-related traits relative to control plants, which method comprises:
[0326] (i) introducing and expressing in a plant, plant part, or plant cell a nucleic acid sequence encoding a TGase polypeptide; and
[0327] (ii) cultivating the plant cell, plant part or plant under conditions promoting plant growth and development.
[0328] The nucleic acid sequence of (i) may be any of the nucleic acid sequences capable of encoding a TGase polypeptide as defined herein.
[0329] Concerning TRY-like polypeptides, more specifically, the present invention provides a method for the production of transgenic plants having enhanced yield-related traits, particularly increased early vigour and/or increased seed yield, which method comprises:
[0330] (i) introducing and expressing in a plant or plant cell a TRY-like polypeptide-encoding nucleic acid; and
[0331] (ii) cultivating the plant cell under conditions promoting plant growth and development.
[0332] The nucleic acid of (i) may be any of the nucleic acids capable of encoding a TRY-like polypeptide as defined herein.
[0333] Concerning BZR polypeptides, more specifically, the present invention provides a method for the production of transgenic plants having increased seed yield, which method comprises:
[0334] (i) introducing and expressing in a plant or plant cell a BZR polypeptide-encoding nucleic acid; and
[0335] (ii) cultivating the plant cell under conditions promoting plant growth and development.
[0336] The nucleic acid of (i) may be any of the nucleic acids capable of encoding a BZR polypeptide as defined herein.
[0337] The nucleic acid sequence may be introduced directly into a plant cell or into the plant itself (including introduction into a tissue, organ or any other part of a plant). According to a preferred feature of the present invention, the nucleic acid is introduced into a plant by transformation. The term "transformation" is described in more detail in the "definitions" section herein.
[0338] The genetically modified plant cells can be regenerated via all methods with which the skilled worker is familiar. Suitable methods can be found in the abovementioned publications by S. D. Kung and R. Wu, Potrykus or Hofgen and Willmitzer.
[0339] Generally after transformation, plant cells or cell groupings are selected for the presence of one or more markers which are encoded by plant-expressible genes co-transferred with the gene of interest, following which the transformed material is regenerated into a whole plant. To select transformed plants, the plant material obtained in the transformation is, as a rule, subjected to selective conditions so that transformed plants can be distinguished from untransformed plants. For example, the seeds obtained in the above-described manner can be planted and, after an initial growing period, subjected to a suitable selection by spraying. A further possibility consists in growing the seeds, if appropriate after sterilization, on agar plates using a suitable selection agent so that only the transformed seeds can grow into plants. Alternatively, the transformed plants are screened for the presence of a selectable marker such as the ones described above.
[0340] Following DNA transfer and regeneration, putatively transformed plants may also be evaluated, for instance using Southern analysis, for the presence of the gene of interest, copy number and/or genomic organisation. Alternatively or additionally, expression levels of the newly introduced DNA may be monitored using Northern and/or Western analysis, both techniques being well known to persons having ordinary skill in the art.
[0341] The generated transformed plants may be propagated by a variety of means, such as by clonal propagation or classical breeding techniques. For example, a first generation (or T1) transformed plant may be selfed and homozygous second-generation (or T2) transformants selected, and the T2 plants may then further be propagated through classical breeding techniques. The generated transformed organisms may take a variety of forms. For example, they may be chimeras of transformed cells and non-transformed cells; clonal transformants (e.g., all cells transformed to contain the expression cassette); grafts of transformed and untransformed tissues (e.g., in plants, a transformed rootstock grafted to an untransformed scion).
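By way of illustration only, the selfing step described above can be sketched numerically. The sketch assumes a single-locus insertion that is hemizygous in the selfed parent and inherited in a simple Mendelian fashion; these assumptions, the function name `selfing_ratios` and the allele labels are hypothetical and are not taken from the present description.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def selfing_ratios(parent=("T", "-")):
    """Expected genotype fractions among the progeny of a selfed parent.

    "T" marks the allele carrying the transgene and "-" the empty locus
    (hypothetical labels). Assumes one locus and Mendelian segregation.
    """
    # Every pairing of one maternal and one paternal gamete is equally likely.
    counts = Counter("".join(sorted(pair)) for pair in product(parent, repeat=2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


ratios = selfing_ratios()

# Progeny carrying at least one copy of the transgene (survive marker selection).
marker_positive = ratios["TT"] + ratios["-T"]

# Share of the marker-positive progeny that is homozygous for the insertion.
homozygous_among_positive = ratios["TT"] / marker_positive

for genotype, fraction in sorted(ratios.items()):
    print(genotype, fraction)
print("marker-positive:", marker_positive)
print("homozygous among marker-positive:", homozygous_among_positive)
```

Under these assumptions, only a minority of the marker-positive progeny is homozygous, which is why a further selection step for homozygous transformants is typically required. Multi-copy or multi-locus insertions would segregate differently.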
[0342] The present invention clearly extends to any plant cell or plant produced by any of the methods described herein, and to all plant parts and propagules thereof. The present invention extends further to encompass the progeny of a primary transformed or transfected cell, tissue, organ or whole plant that has been produced by any of the aforementioned methods, the only requirement being that progeny exhibit the same genotypic and/or phenotypic characteristic(s) as those produced by the parent in the methods according to the invention.
[0343] The invention also includes host cells containing an isolated nucleic acid sequence encoding a TGase polypeptide as defined hereinabove. Preferred host cells according to the invention are plant cells. Host plants for the nucleic acid sequences or the vector used in the method according to the invention, the expression cassette or construct or vector are, in principle, advantageously all plants, which are capable of synthesizing the polypeptides used in the inventive method.
[0344] The methods of the invention are advantageously applicable to any plant. Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs. According to a preferred embodiment of the present invention, the plant is a crop plant. Examples of crop plants include soybean, sunflower, canola, alfalfa, rapeseed, linseed, cotton, tomato, potato and tobacco. Further preferably, the plant is a monocotyledonous plant. Examples of monocotyledonous plants include sugarcane. More preferably the plant is a cereal. Examples of cereals include rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, secale, einkorn, teff, milo and oats.
[0345] The invention also extends to harvestable parts of a plant such as, but not limited to seeds, leaves, fruits, flowers, stems, roots, rhizomes, tubers and bulbs, which harvestable parts comprise a recombinant nucleic acid encoding a BZR polypeptide. The invention furthermore relates to products derived, preferably directly derived, from a harvestable part of such a plant, such as dry pellets or powders, oil, fat and fatty acids, starch or proteins.
[0346] According to a preferred feature of the invention, the modulated expression is increased expression. Methods for increasing expression of nucleic acids or genes, or gene products, are well documented in the art and examples are provided in the definitions section.
[0347] Concerning RHL1 polypeptides, or TRY-like polypeptides, as mentioned above, a preferred method for modulating expression of a nucleic acid sequence encoding a RHL1 polypeptide, or a TRY-like polypeptide, is by introducing and expressing in a plant a nucleic acid sequence encoding a RHL1 polypeptide, or a TRY-like polypeptide; however, the effects of performing the method, i.e. enhancing yield-related traits, may also be achieved using other well known techniques, including but not limited to T-DNA activation tagging, TILLING and homologous recombination. A description of these techniques is provided in the definitions section.
[0348] Concerning TGase polypeptides, as mentioned above, a preferred method for increasing expression of a nucleic acid sequence encoding a TGase polypeptide is by introducing and expressing in a plant a nucleic acid sequence encoding a TGase polypeptide; however the effects of performing the method, i.e. increasing seed yield-related traits, may also be achieved using other well known techniques, including but not limited to T-DNA activation tagging, TILLING, homologous recombination. A description of these techniques is provided in the definitions section.
[0349] Concerning BZR polypeptides, as mentioned above, a preferred method for modulating expression of a nucleic acid sequence encoding a BZR polypeptide is by introducing and expressing in a plant a nucleic acid sequence encoding a BZR polypeptide; however, the effects of performing the method, i.e. increasing seed yield, may also be achieved using other well known techniques, including but not limited to T-DNA activation tagging, TILLING and homologous recombination. A description of these techniques is provided in the definitions section.
[0350] The present invention also encompasses use of nucleic acid sequences encoding RHL1 polypeptides as described herein and use of these RHL1 polypeptides in enhancing any of the aforementioned yield-related traits in plants.
[0351] Furthermore, the present invention also encompasses use of nucleic acid sequences encoding TGase polypeptides as described herein and use of these TGase polypeptides in increasing any of the aforementioned seed yield-related traits in plants, under normal growth conditions, under abiotic stress growth conditions (preferably osmotic stress growth conditions), and under growth conditions of reduced nutrient availability, preferably under conditions of reduced nitrogen availability.
[0352] Even furthermore, the present invention also encompasses use of nucleic acids encoding TRY-like polypeptides as described herein and use of these TRY-like polypeptides in enhancing any of the aforementioned yield-related traits in plants.
[0353] Furthermore, the present invention also encompasses use of nucleic acids encoding BZR polypeptides as described herein and use of these BZR polypeptides in increasing any of the aforementioned seed yield parameters in plants.
[0354] Concerning RHL1 polypeptides, nucleic acid sequences encoding the RHL1 polypeptides described herein, or the RHL1 polypeptides themselves, may find use in breeding programmes in which a DNA marker is identified which may be genetically linked to a RHL1 polypeptide-encoding gene. The nucleic acid sequences/genes, or the RHL1 polypeptides themselves, may be used to define a molecular marker. This DNA or protein marker may then be used in breeding programmes to select plants having enhanced yield-related traits as defined hereinabove in the methods of the invention.
[0355] Concerning TGase polypeptides, nucleic acid sequences encoding TGase polypeptides described herein, or the TGase polypeptides themselves, may find use in breeding programmes in which a DNA marker is identified that may be genetically linked to a TGase polypeptide-encoding gene. The genes/nucleic acid sequences, or the TGase polypeptides themselves may be used to define a molecular marker. This DNA or protein marker may then be used in breeding programmes to select plants having increased seed yield-related traits, as defined hereinabove in the methods of the invention.
[0356] Concerning TRY-like polypeptides, nucleic acid sequences encoding the TRY-like polypeptides described herein, or the TRY-like polypeptides themselves, may find use in breeding programmes in which a DNA marker is identified which may be genetically linked to a TRY-like polypeptide-encoding gene. The nucleic acid sequences/genes, or the TRY-like polypeptides themselves, may be used to define a molecular marker. This DNA or protein marker may then be used in breeding programmes to select plants having enhanced yield-related traits as defined hereinabove in the methods of the invention.
[0357] Concerning BZR polypeptides, nucleic acid sequences encoding the BZR polypeptides described herein, or the BZR polypeptides themselves, may find use in breeding programmes in which a DNA marker is identified which may be genetically linked to a BZR polypeptide-encoding gene. The nucleic acid sequences/genes, or the BZR polypeptides themselves, may be used to define a molecular marker. This DNA or protein marker may then be used in breeding programmes to select plants having increased seed yield as defined hereinabove in the methods of the invention.
[0358] Allelic variants of a nucleic acid/gene encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, may also find use in marker-assisted breeding programmes. Such breeding programmes sometimes require introduction of allelic variation by mutagenic treatment of the plants, using for example EMS mutagenesis; alternatively, the programme may start with a collection of allelic variants of so called "natural" origin caused unintentionally. Identification of allelic variants then takes place, for example, by PCR. This is followed by a step for selection of superior allelic variants of the sequence in question and which give increased yield. Selection is typically carried out by monitoring growth performance of plants containing different allelic variants of the sequence in question. Growth performance may be monitored in a greenhouse or in the field. Further optional steps include crossing plants in which the superior allelic variant was identified with another plant. This could be used, for example, to make a combination of interesting phenotypic features.
[0359] Nucleic acid sequences encoding RHL1 polypeptides, or TGase polypeptides, or TRY-like polypeptides, or BZR polypeptides, may also be used as probes for genetically and physically mapping the genes that they are a part of, and as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes. Such use of nucleic acid sequences encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, requires only a nucleic acid sequence of at least 15 nucleotides in length. The nucleic acid sequences encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Sambrook J, Fritsch E F and Maniatis T (1989) Molecular Cloning, A Laboratory Manual) of restriction-digested plant genomic DNA may be probed with the nucleic acid sequences encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide. The resulting banding patterns may then be subjected to genetic analyses using computer programs such as MapMaker (Lander et al. (1987) Genomics 1: 174-181) in order to construct a genetic map. In addition, the nucleic acid sequences may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing parent and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the nucleic acid sequence encoding a RHL1 polypeptide, or a TGase polypeptide, or a TRY-like polypeptide, or a BZR polypeptide, in the genetic map previously obtained using this population (Botstein et al. (1980) Am. J. Hum. Genet. 32: 314-331).
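As a hedged illustration of how segregation data of the kind described above may be turned into a map position, the following sketch computes a simple two-point recombination fraction from scored progeny and converts it to a map distance with the Haldane and Kosambi mapping functions. It is not a description of how MapMaker operates internally, which relies on multipoint likelihood methods; the function names and the progeny counts are hypothetical.

```python
import math


def recombination_fraction(parental, recombinant):
    """Two-point estimate: share of scored progeny that are recombinant."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no progeny scored")
    return recombinant / total


def haldane_cm(r):
    """Map distance in centimorgans, assuming no crossover interference."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Map distance in centimorgans, allowing for some interference."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical backcross: 90 parental-type and 10 recombinant-type progeny
# for the RFLP band and a neighbouring marker.
r = recombination_fraction(parental=90, recombinant=10)
print(f"r = {r:.3f}")
print(f"Haldane: {haldane_cm(r):.2f} cM")
print(f"Kosambi: {kosambi_cm(r):.2f} cM")
```

For small recombination fractions the two functions give similar distances; they diverge as the markers lie further apart.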
[0360] The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky and Tanksley (1986) Plant Mol. Biol. Reporter 4: 37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.
[0361] The nucleic acid sequence probes may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel et al. In: Non-mammalian Genomic Analysis: A Practical Guide, Academic press 1996, pp. 319-346, and references cited therein).
[0362] In another embodiment, the nucleic acid sequence probes may be used in direct fluorescence in situ hybridisation (FISH) mapping (Trask (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favour use of large clones (several kb to several hundred kb; see Laan et al. (1995) Genome Res. 5:13-20), improvements in sensitivity may allow performance of FISH mapping using shorter probes.
[0363] A variety of nucleic acid sequence amplification-based methods for genetic and physical mapping may be carried out using the nucleic acid sequences. Examples include allele-specific amplification (Kazazian (1989) J. Lab. Clin. Med. 11:95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov (1990) Nucleic Acids Res. 18:3671), Radiation Hybrid Mapping (Walter et al. (1997) Nat. Genet. 7:22-28) and Happy Mapping (Dear and Cook (1989) Nucleic Acids Res. 17:6795-6807). For these methods, the sequence of a nucleic acid sequence is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art. In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.
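The principle of a CAPS-type marker mentioned above, in which a sequence difference between the parents of a mapping cross creates or abolishes a restriction site within a PCR-amplified fragment, can be sketched as follows. The two parental amplicon sequences are invented for illustration, the helper names are hypothetical, and the sketch treats each amplicon as a linear single-strand string. The EcoRI recognition site GAATTC, cut after the first G, is used as the example enzyme.

```python
def cut_positions(sequence, site):
    """Return the 0-based start of every occurrence of `site` in `sequence`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def fragment_lengths(sequence, site, cut_offset):
    """Lengths of the fragments left after cutting a linear amplicon.

    `cut_offset` is the number of bases into the site at which the
    enzyme cuts (1 for EcoRI, G^AATTC).
    """
    cuts = [p + cut_offset for p in cut_positions(sequence, site)]
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical amplicons from the two parents; they differ by one base
# (A versus G) inside the EcoRI site.
parent_a = "ACGTACGTGAATTCTTGACCGGTA"
parent_b = "ACGTACGTGAGTTCTTGACCGGTA"

fragments_a = fragment_lengths(parent_a, "GAATTC", cut_offset=1)
fragments_b = fragment_lengths(parent_b, "GAATTC", cut_offset=1)

print("parent A fragments:", fragments_a)
print("parent B fragments:", fragments_b)
print("polymorphic:", fragments_a != fragments_b)
```

A difference in the fragment pattern between the two parents is what allows the marker to be scored in their progeny after digestion and electrophoresis.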
[0364] The methods according to the present invention result in plants having enhanced yield-related traits, as described hereinbefore. These traits may also be combined with other economically advantageous traits, such as further yield-enhancing traits, tolerance to other abiotic and biotic stresses, traits modifying various architectural features and/or biochemical and/or physiological features.
[0365] Furthermore, the methods according to the present invention also result in plants having increased seed yield-related traits, as described hereinbefore. These traits may also be combined with other economically advantageous traits, such as further yield-increasing traits, tolerance to abiotic and biotic stresses, tolerance to herbicides, insecticides, traits modifying various architectural features and/or biochemical and/or physiological features.
[0366] Even furthermore, the methods according to the present invention also result in plants having increased seed yield, as described hereinbefore. These traits may also be combined with other economically advantageous traits, such as further yield-enhancing traits, tolerance to other abiotic and biotic stresses, traits modifying various architectural features and/or biochemical and/or physiological features.
Items
[0367] 1. A method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a Root Hairless polypeptide and optionally selecting for plants having enhanced yield-related traits.
[0368] 2. Method according to item 1, wherein said Root Hairless polypeptide comprises any one or more of the following motifs:
(i) Motif 9 (SEQ ID NO: 37): (SN)VMC(ED)D(YV)F(DE)(NS)(ML)(IV)VFS(DE)AWWIG(TR)K(ED)ENPEE;
(ii) Motif 10 (SEQ ID NO: 38): L(AILV)A(PA)(IVA)(SA)GG(KR)(IVF)G(ED)L(KA)DL(GDS)(TS)KNP(IVL)LYLDFPQ;
(iii) Motif 11 (SEQ ID NO: 39): G(RQ)(ML)KLFGTI(VL)YPKN(RK)Y(LI)TLQF;
[0369] wherein the amino acids between brackets are alternative amino acids at that position, and wherein, in increasing order of preference, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 amino acids are substituted by any other amino acid, preferably by a conservative amino acid.
[0370] 3. Method according to item 1 or 2 wherein said modulated expression is effected by introducing and expressing in a plant a nucleic acid encoding a Root Hairless polypeptide.
[0371] 4. Method according to any preceding item, wherein said nucleic acid encoding a Root Hairless polypeptide encodes any one of the proteins listed in Table A1 or is a portion of such a nucleic acid, or a nucleic acid capable of hybridising with such a nucleic acid or the complement thereof.
[0372] 5. Method according to any preceding item, wherein said nucleic acid sequence encodes an orthologue or paralogue of any of the proteins given in Table A1.
[0373] 6. Method according to any preceding item, wherein said enhanced yield-related traits comprise increased seed yield relative to control plants.
[0374] 7. Method according to any preceding item wherein said enhanced yield-related traits are obtained under cultivation conditions of nitrogen deficiency.
[0375] 8. Method according to any one of items 3 to 7, wherein said nucleic acid is operably linked to a constitutive promoter, preferably to a GOS2 promoter, most preferably to a GOS2 promoter from rice.
[0376] 9. Method according to any preceding item, wherein said nucleic acid encoding a Root Hairless polypeptide is of plant origin, preferably from a dicotyledonous plant, further preferably from the family Brassicaceae, most preferably from Arabidopsis thaliana.
[0377] 10. Plant or part thereof, including seeds, obtainable by a method according to any preceding item, wherein said plant or part thereof comprises a recombinant nucleic acid encoding a Root Hairless polypeptide.
[0378] 11. Construct comprising:
[0379] (i) nucleic acid encoding a Root Hairless polypeptide as defined in items 1, 2 or 3;
[0380] (ii) one or more control sequences capable of driving expression of the nucleic acid sequence of (i); and optionally
[0381] (iii) a transcription termination sequence.
[0382] 12. Construct according to item 11, wherein one of said control sequences is a constitutive promoter, preferably a GOS2 promoter, most preferably a GOS2 promoter from rice.
[0383] 13. Use of a construct according to item 11 or 12 in a method for making plants having increased yield, particularly increased seed yield relative to control plants.
[0384] 14. Plant, plant part or plant cell transformed with a construct according to item 11 or 12.
[0385] 15. Method for the production of a transgenic plant having increased yield, preferably increased seed yield relative to control plants, comprising:
[0386] (a) introducing and expressing in a plant a nucleic acid encoding a Root Hairless polypeptide as defined in item 1 or 2; and
[0387] (b) cultivating the plant cell under conditions promoting plant growth and development; and optionally
[0388] (c) selecting for plants having enhanced yield-related traits.
[0389] 16. Transgenic plant having increased yield, particularly increased biomass, relative to control plants, resulting from modulated expression of a nucleic acid encoding a Root Hairless polypeptide as defined in item 1 or 2 or a transgenic plant cell derived from said transgenic plant.
[0390] 17. Transgenic plant according to item 10, 14 or 16, or a transgenic plant cell derived thereof, wherein said plant is a crop plant or a monocot or a cereal, such as rice, maize, wheat, barley, millet, rye, triticale, sorghum and oats.
[0391] 18. Harvestable parts of a plant according to item 17, wherein said harvestable parts are preferably shoot biomass and/or seeds.
[0392] 19. Products derived from a plant according to item 17 and/or from harvestable parts of a plant according to item 18.
[0393] 20. Use of a nucleic acid encoding a Root Hairless polypeptide in increasing yield, particularly in increasing shoot and/or biomass in plants, relative to control plants.
[0394] 21. A method for increasing seed yield-related traits in plants relative to control plants, comprising increasing expression in a plant of a nucleic acid sequence encoding a transglutaminase (TGase) polypeptide, which TGase polypeptide comprises (i) a plastidic transit peptide; (ii) in increasing order of preference at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more amino acid sequence identity to a domain comprising at least one coiled coil as represented by SEQ ID NO: 27; (iii) and an Integrated relational Enzyme database entry EC 2.3.2.13 for protein-glutamine γ-glutamyltransferase.
[0395] 22. Method according to item 21, wherein said TGase polypeptide has (i) a plastidic transit peptide; (ii) in increasing order of preference at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more amino acid sequence identity to a polypeptide as represented by SEQ ID NO: 45.
[0396] 23. Method according to item 21 or 22, wherein said TGase polypeptide has in increasing order of preference at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more amino acid sequence identity to a TGase polypeptide as represented by SEQ ID NO: 45 or to any of the polypeptide sequences given in Table A2 herein.
[0397] 24. Method according to any one of items 21 to 23, wherein said TGase polypeptide is any polypeptide sequence which when used in the construction of a TGase phylogenetic tree, such as the one depicted in FIG. 5, clusters with the clade of TGase polypeptides comprising the polypeptide sequence as represented by SEQ ID NO: 45, rather than with the other clades.
[0398] 25. Method according to any one of items 21 to 24, wherein said TGase polypeptide is a polypeptide with enzymatic activity consisting in catalyzing the formation of amide linkages, generally in a Ca-dependent fashion, between the primary amine of an amine donor substrate and the γ-carboxamide group of peptide-bound endo-glutamine residues in proteins or polypeptides that are the amine acceptors.
[0399] 26. Method according to any one of items 21 to 25, wherein said nucleic acid sequence encoding a TGase polypeptide is represented by any one of the nucleic acid sequence SEQ ID NOs given in Table A2 or a portion thereof, or a sequence capable of hybridising with any one of the nucleic acid sequences SEQ ID NOs given in Table A2, or to a complement thereof.
[0400] 27. Method according to any one of items 21 to 26, wherein said nucleic acid sequence encodes an orthologue or paralogue of any of the polypeptide sequence SEQ ID NOs given in Table A2.
[0401] 28. Method according to any one of items 21 to 27, wherein said increased expression is effected by any one or more of: T-DNA activation tagging, TILLING, or homologous recombination.
[0402] 29. Method according to any one of items 21 to 28, wherein said increased expression is effected by introducing and expressing in a plant a nucleic acid sequence encoding a TGase polypeptide.
[0403] 30. Method according to any one of items 21 to 29, wherein said increased seed yield-related trait is one or more of: increased total seed yield per plant, increased number of filled seeds, and increased harvest index.
[0404] 31. Method according to any one of items 21 to 30, wherein said nucleic acid sequence is operably linked to a seed-specific promoter.
[0405] 32. Method according to item 31, wherein said seed-specific promoter is an alpha-globulin promoter, preferably a rice alpha-globulin promoter, more preferably an alpha-globulin promoter as represented by SEQ ID NO: 72.
[0406] 33. Method according to any one of items 21 to 32, wherein said nucleic acid sequence encoding a TGase polypeptide is from a plant, further preferably from a monocotyledonous plant, more preferably from the family Poaceae, most preferably the nucleic acid sequence is from Oryza sativa.
[0407] 34. Plants, parts thereof (including seeds), or plant cells obtainable by a method according to any one of items 21 to 33, wherein said plant, part or cell thereof comprises an isolated nucleic acid transgene encoding a TGase polypeptide.
[0408] 35. Construct comprising:
[0409] (a) a nucleic acid sequence encoding a TGase polypeptide as defined in any one of items 21 to 27;
[0410] (b) one or more control sequences capable of driving expression of the nucleic acid sequence of (a); and optionally
[0411] (c) a transcription termination sequence.
[0412] 36. Construct according to item 35, wherein said control sequence is a seed-specific promoter.
[0413] 37. Construct according to item 36, wherein said seed-specific promoter is an alpha-globulin promoter, preferably a rice alpha-globulin promoter, more preferably an alpha-globulin promoter as represented by SEQ ID NO: 72.
[0414] 38. Use of a construct according to any one of items 35 to 37, in a method for making plants having increased seed yield-related traits relative to control plants, which increased seed yield-related traits are one or more of: increased total seed yield per plant, increased number of filled seeds, and increased harvest index.
[0415] 39. Plant, plant part or plant cell transformed with a construct according to any one of items 35 to 37.
[0416] 40. Method for the production of transgenic plants having increased seed yield-related traits relative to control plants, comprising:
[0417] (i) introducing and expressing in a plant, plant part, or plant cell, a nucleic acid sequence encoding a TGase polypeptide as defined in any one of items 21 to 27; and
[0418] (ii) cultivating the plant cell, plant part, or plant under conditions promoting plant growth and development.
[0419] 41. Transgenic plant having increased seed yield-related traits relative to control plants, resulting from increased expression of an isolated nucleic acid sequence encoding a TGase polypeptide as defined in any one of items 21 to 27, or a transgenic plant cell or transgenic plant part derived from said transgenic plant.
[0420] 42. Transgenic plant according to item 34, 39 or 41, wherein said plant is a crop plant or a monocot or a cereal, such as rice, maize, wheat, barley, millet, rye, triticale, sorghum and oats, or a transgenic plant cell derived from said transgenic plant.
[0421] 43. Harvestable parts comprising an isolated nucleic acid sequence encoding a TGase polypeptide, of a plant according to item 42, wherein said harvestable parts are preferably seeds.
[0422] 44. Products derived from a plant according to item 42 and/or from harvestable parts of a plant according to item 43.
[0423] 45. Use of a nucleic acid sequence encoding a TGase polypeptide as defined in any one of items 21 to 27 in increasing seed yield-related traits, comprising one or more of: increased total seed yield per plant, increased number of filled seeds, and increased harvest index.
[0424] 46. A method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a TRY-like polypeptide, wherein said TRY-like polypeptide comprises a Myb-like DNA-binding domain (Panther PTHR10641:SF26; Gene3D G3DSA:1.10.10.60).
[0425] 47. Method according to item 46, wherein said TRY-like polypeptide comprises one or more of motifs 12 to 15 (SEQ ID NO: 229 to SEQ ID NO: 232).
[0426] 48. Method according to item 46 or 47, wherein said modulated expression is effected by introducing and expressing in a plant a nucleic acid encoding a TRY-like polypeptide.
[0427] 49. Method according to any one of items 46 to 48, wherein said nucleic acid encoding a TRY-like polypeptide encodes any one of the proteins listed in Table A3 or is a portion of such a nucleic acid, or a nucleic acid capable of hybridising with such a nucleic acid.
[0428] 50. Method according to any one of items 46 to 49, wherein said nucleic acid sequence encodes an orthologue or paralogue of any of the proteins given in Table A3.
[0429] 51. Method according to any one of items 46 to 50, wherein said enhanced yield-related traits comprise increased emergence vigour and/or increased yield, relative to control plants.
[0430] 52. Method according to any one of items 46 to 51, wherein said enhanced yield-related traits are obtained under non-stress conditions or under conditions of nitrogen deficiency.
[0431] 53. Method according to any one of items 48 to 52, wherein said nucleic acid is operably linked to a root-specific promoter, preferably to a RCc3 promoter, most preferably to a RCc3 promoter from rice, or wherein said nucleic acid is operably linked to a constitutive promoter, preferably to a GOS2 promoter, most preferably to a GOS2 promoter from rice.
[0432] 54. Method according to any one of items 46 to 53, wherein said nucleic acid encoding a TRY-like polypeptide is of plant origin, preferably from a dicotyledonous plant, further preferably from the family Brassicaceae, more preferably from the genus Arabidopsis, most preferably from Arabidopsis thaliana.
[0433] 55. Plant or part thereof, including seeds, obtainable by a method according to any one of items 46 to 48, wherein said plant or part thereof comprises a recombinant nucleic acid encoding a TRY-like polypeptide.
[0434] 56. Construct comprising:
[0435] (a) nucleic acid encoding a TRY-like polypeptide as defined in item 46 or 47;
[0436] (b) one or more control sequences capable of driving expression of the nucleic acid sequence of (a); and optionally
[0437] (c) a transcription termination sequence.
[0438] 57. Construct according to item 56, wherein one of said control sequences is a root-specific promoter, preferably a RCc3 promoter, most preferably a RCc3 promoter from rice, or wherein one of said control sequences is a constitutive promoter, preferably a GOS2 promoter, most preferably a GOS2 promoter from rice.
[0439] 58. Use of a construct according to item 56 or 57 in a method for making plants having increased yield-related traits, particularly increased emergence vigour and/or increased seed yield relative to control plants.
[0440] 59. Plant, plant part or plant cell transformed with a construct according to item 56 or 57.
[0441] 60. Method for the production of a transgenic plant having increased yield-related traits, particularly increased emergence vigour and/or increased seed yield relative to control plants, comprising:
[0442] (i) introducing and expressing in a plant a nucleic acid encoding a TRY-like polypeptide as defined in item 46 or 47; and
[0443] (ii) cultivating the plant cell under conditions promoting plant growth and development.
[0444] 61. Transgenic plant having increased emergence vigour and/or increased yield, relative to control plants, resulting from modulated expression of a nucleic acid encoding a TRY-like polypeptide as defined in item 46 or 47, or a transgenic plant cell derived from said transgenic plant.
[0445] 62. Transgenic plant according to item 55, 59 or 61, or a transgenic plant cell derived thereof, wherein said plant is a crop plant or a monocot or a cereal, such as rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, secale, einkorn, teff, milo and oats.
[0446] 63. Harvestable parts of a plant according to item 62, wherein said harvestable parts are preferably seeds.
[0447] 64. Products derived from a plant according to item 62 and/or from harvestable parts of a plant according to item 63.
[0448] 65. Use of a nucleic acid encoding a TRY-like polypeptide in increasing emergence vigour and/or increasing yield in plants, relative to control plants.
[0449] 66. A method for increasing seed yield in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a BZR (BRASSINAZOLE-RESISTANT) polypeptide and optionally selecting for plants having increased seed yield.
[0450] 67. Method according to item 66, wherein said BZR polypeptide comprises:
[0451] (i) a domain having in increasing order of preference at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the domain located between amino acid coordinates 10-157 in SEQ ID NO: 239 and/or
[0452] (ii) a domain having in increasing order of preference at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to a bHLH-like domain as represented by SEQ ID NO: 326 and/or
[0453] (iii) 50%, 55%, 60%, 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence of any of the polypeptides of Table A4; and/or
[0454] (iv) a motif as represented by any one of SEQ ID NO: 323, SEQ ID NO: 324, SEQ ID NO: 325, wherein 1, 2, 3 or 4 residues may be substituted by any amino acid.
[0455] 68. Method according to item 66 or 67, wherein said modulated expression is effected by introducing and expressing in a plant a nucleic acid encoding a BZR polypeptide.
[0456] 69. Method according to any one of items 66 to 68, wherein said nucleic acid encoding a BZR polypeptide encodes any one of the proteins listed in Table A4 or is a portion of such a nucleic acid, or a nucleic acid capable of hybridising with such a nucleic acid.
[0457] 70. Method according to any preceding item, wherein said nucleic acid sequence encodes an orthologue or paralogue of any of the proteins given in Table A4.
[0458] 71. Method according to any one of items 66 to 70, wherein said increased seed yield is selected from the total weight of the seed, the number of filled seed and the thousand kernel weight.
[0459] 72. Method according to any one of items 66 to 71, wherein said increased seed yield is obtained under non-stress conditions.
[0460] 73. Method according to any one of items 68 to 72, wherein said nucleic acid is operably linked to a plant derived constitutive promoter, preferably to a GOS2 promoter, most preferably to a GOS2 promoter from rice.
[0461] 74. Method according to any one of items 66 to 73, wherein said nucleic acid encoding a BZR polypeptide is of plant origin, preferably from a dicotyledonous plant, further preferably from the family Brassicaceae, most preferably from Arabidopsis thaliana.
[0463] 75. Plant or part thereof, including seeds, obtainable by a method according to any one of items 66 to 74, wherein said plant or part thereof comprises a recombinant nucleic acid encoding a BZR polypeptide.
[0464] 76. An isolated nucleic acid molecule comprising any one of the following features:
[0465] (i) a nucleic acid represented by any one of SEQ ID NO: 250, 252, 254, 256;
[0466] (ii) a nucleic acid or fragment thereof that is complementary to any one of SEQ ID NO: 250, 252, 254, 256;
[0467] (iii) a nucleic acid encoding a BZR polypeptide having, in increasing order of preference, at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or more sequence identity to one of SEQ ID NO: 252, 254, 256, 258;
[0468] (iv) a nucleic acid capable of hybridizing under stringent conditions to any one of the nucleic acids given in (i), (ii) or (iii) above.
[0469] 77. An isolated polypeptide comprising:
[0470] (i) an amino acid sequence having, in increasing order of preference, at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or more sequence identity to one of SEQ ID NO: 252, 254, 256, 258; and/or
[0471] (ii) derivatives of any of the amino acid sequences given in (i).
[0472] 78. Construct comprising:
[0473] (i) nucleic acid encoding a BZR polypeptide as defined in items 66, 67 or 77, or a nucleic acid according to item 76;
[0474] (ii) one or more control sequences capable of driving expression of the nucleic acid sequence of (i); and optionally
[0475] (iii) a transcription termination sequence.
[0476] 79. Construct according to item 78, wherein one of said control sequences is a constitutive promoter, preferably a GOS2 promoter, most preferably a GOS2 promoter from rice.
[0477] 80. Use of a construct according to item 78 or 79 in a method for making plants having increased seed yield relative to control plants.
[0478] 81. Plant, plant part or plant cell transformed with a construct according to item 78 or 79.
[0479] 82. Method for the production of a transgenic plant having increased seed yield relative to control plants, comprising:
[0480] (i) introducing and expressing in a plant a nucleic acid encoding a BZR polypeptide as defined in item 66, 67 or 77, or a nucleic acid according to item 76; and
[0481] (ii) cultivating the plant cell under conditions promoting plant growth and development;
[0482] and optionally
[0483] (iii) selecting for plants having increased seed yield.
[0484] 83. Transgenic plant having increased seed yield resulting from modulated expression of a nucleic acid encoding a BZR polypeptide as defined in item 66, 67 or 77 or a transgenic plant cell derived from said transgenic plant.
[0485] 84. Transgenic plant according to item 75, 81 or 83, or a transgenic plant cell derived thereof, wherein said plant is a crop plant or a monocot or a cereal, such as rice, maize, wheat, barley, millet, rye, triticale, sorghum and oats.
[0486] 85. Harvestable parts of a plant according to item 84, wherein said harvestable parts are preferably shoot biomass and/or seeds.
[0487] 86. Products derived from a plant according to item 84 and/or from harvestable parts of a plant according to item 85.
[0488] 87. Use of a nucleic acid encoding a BZR polypeptide in increasing seed yield relative to control plants.
DESCRIPTION OF FIGURES
[0489] The present invention will now be described with reference to the following figures in which:
[0490] FIG. 1 represents the sequence of SEQ ID NO: 2 with conserved putative nuclear localization signals.
[0491] FIG. 2 represents a multiple alignment of RHL1 polypeptides. O. sativa_Os07g07580: SEQ ID NO: 10; O. sativa_Os06g51380: SEQ ID NO: 12; Z. mays_TA180670: SEQ ID NO: 14; A. formosa_TA10038: SEQ ID NO: 16; V. vinifera_GSVIVT00027050001: SEQ ID NO: 28; M. domestica_TA43921: SEQ ID NO: 18; S. tuberosum_TA36268: SEQ ID NO: 24; p. trichocarpa_scaff_IV.277: SEQ ID NO: 8; A. thaliana_AT1G48380.1: SEQ ID NO: 2; P. patens_74926: SEQ ID NO: 20; and P. patens_173149: SEQ ID NO: 22.
[0492] FIG. 3 shows a phylogenetic tree of RHL1 polypeptides.
[0493] FIG. 4 represents the binary vector for increased expression in Oryza sativa of a RHL1-encoding nucleic acid under the control of a rice GOS2 promoter (pGOS2).
[0494] FIG. 5 shows a phylogenetic tree of TGase polypeptides from various source organisms, according to Villalobos et al. (2004; Gene 336: 93-104). TGases useful in performing the methods of the invention (essentially from plants) are shown with a bracket, the clade split with a circle, the arrow points to the Oryza sativa TGase polypeptide as represented by SEQ ID NO: 45.
[0495] FIG. 6 shows the graphical output of the COILS algorithm predicting at least one coiled coil domain in the polypeptide as represented by SEQ ID NO: 45. The X axis represents the amino acid residue coordinates, the Y axis the probability (ranging from 0 to 1) that a coiled coil domain is present, and the three lines, the three windows (14, 21, 28) examined.
[0496] FIG. 7 shows an AlignX (from Vector NTI 10.3, Invitrogen Corporation) multiple sequence alignment of the TGase polypeptides from Table A2. The N-terminal plastidic transit peptide is separated from the rest of the polypeptide (mature polypeptide) by a vertical bar. The putative calcium binding region is boxed, and marked with X's under the consensus sequence. The domain where at least one coiled coil is predicted using the Coils algorithm (and as represented by SEQ ID NO: 70) is marked with X's under the consensus region. Orysa_TGase: SEQ ID NO: 45; Sorbi_TGase: SEQ ID NO: 61; Sacof_TGase: SEQ ID NO: 59; Zeama_TGase 2: SEQ ID NO: 63; Zeama_TGase 3: SEQ ID NO: 65; Zeama_tgz15: SEQ ID NO: 67; Zeama_tgz21: SEQ ID NO: 69; Poptr_TGase: SEQ ID NO: 57; Lyces_TGase: SEQ ID NO: 51; Horvu_TGase: SEQ ID NO: 49; Orysa_TGase 2: SEQ ID NO: 53; Arath_TGase: SEQ ID NO: 47; Picsi_TGAse: SEQ ID NO: 55; and Consensus: SEQ ID NO: 329.
[0497] FIG. 8 shows the binary vector for increased expression in Oryza sativa plants of a nucleic acid sequence encoding a TGase polypeptide under the control of a promoter functioning in plants.
[0498] FIG. 9 represents the sequence of SEQ ID NO: 76 with conserved motifs or domains: the Myb-like DNA-binding domain as identified with HMMPfam (PF00249.17) is indicated in bold, the sequence that is covered by motif 12 is underlined.
[0499] FIG. 10 represents a multiple alignment of various TRY-like polypeptide sequences. A dot indicates conserved residues, a colon indicates highly conserved residues and an asterisk stands for perfectly conserved residues. The highest degree of sequence conservation is found in the region of the DNA-binding domain. A. sativa_CN818591: SEQ ID NO: 84; L. multiflorum_AU249134: SEQ ID NO: 148; A. capillaris_DV853805: SEQ ID NO: 78; A. capillaris_DV859458: SEQ ID NO: 80; H. vulgare_TC189825: SEQ ID NO: 138; T. aestivum_BE412359: SEQ ID NO: 216; O. minuta_CB884361: SEQ ID NO: 164; O. sativa_LOC_Os01g43230.2: SEQ ID NO: 166; S. bicolor_Sb03g028170.1: SEQ ID NO: 208; Z. mays_TC409725: SEQ ID NO: 228; T. androssowii_TA2313_189785: SEQ ID NO: 218; Triphysaria_sp_TC9313: SEQ ID NO: 220; S. tuberosum_CV505951: SEQ ID NO: 214; P. tremula_TA11725_113636: SEQ ID NO: 192; P. trichocarpa_594467: SEQ ID NO: 200; P. trichocarpa_562293: SEQ ID NO: 196; B. gymnorrhiza_TA2541_39984: SEQ ID NO: 102; M. esculenta_TA9427_3983: SEQ ID NO: 160; P. persica_BU039343: SEQ ID NO: 176; V. vinifera_GSVIVT00026045001: SEQ ID NO: 226; L. tulipifera_CV004984: SEQ ID NO: 154; G. hirsutum_TC121748: SEQ ID NO: 128; E. esula_DV121180: SEQ ID NO: 120; M. domestica_TC17597: SEQ ID NO: 156; P. tremula_BU888423: SEQ ID NO: 188; P. trichocarpa_568212: SEQ ID NO: 198; G. hirsutum_DW508052: SEQ ID NO: 122; G. hirsutum_TC116960: SEQ ID NO: 126; V. vinifera_GSVIVT00006915001: SEQ ID NO: 222; J. hindsii_×_regia_TA1295_43229: SEQ ID NO: 144; G. max_8223: SEQ ID NO: 132; P. vulgaris_CV538421: SEQ ID NO: 206; C. tetragonoloba_EG990179: SEQ ID NO: 118; G. max_29139: SEQ ID NO: 134; G. soja_TA4526_3848: SEQ ID NO: 136; L. japonicus_CB827663: SEQ ID NO: 146; C. canephora_DV693718: SEQ ID NO: 114; P. hybrida_EB175070: SEQ ID NO: 172; I. nil_TC6509: SEQ ID NO: 140; L. saligna_DW052030: SEQ ID NO: 150; A. hypogaea_CD038483: SEQ ID NO: 82; P. tremula_TA7610_113636: SEQ ID NO: 194; P. equestris_CB034844: SEQ ID NO: 168; V. vinifera_GSVIVT00010755001: SEQ ID NO: 224; P. pinaster_CT579117: SEQ ID NO: 178; P. taeda_DR096185: SEQ ID NO: 186; P. sitchensis_TA16538_3332: SEQ ID NO: 182; A. thaliana_At1g01380_CPC-like_: SEQ ID NO: 86; B. napus_EV055366: SEQ ID NO: 108; A. thaliana_At4g01060_CPC-like_: SEQ ID NO: 100; A. thaliana_At2g46410_CPC-like_: SEQ ID NO: 98; B. napus_TC92601: SEQ ID NO: 110; A. thaliana_At5g53200_CPC-like_: SEQ ID NO: 76; B. napus_EE451172: SEQ ID NO: 106; J. hindsii_×_regia_EL893054: SEQ ID NO: 142; G. hirsutum_TC102183: SEQ ID NO: 124; P. trichocarpa_807368: SEQ ID NO: 204; M. esculenta_DV443286: SEQ ID NO: 158; P. tremula_DN497189: SEQ ID NO: 190; P. trichocarpa_674550: SEQ ID NO: 202; C. longa_DY390653: SEQ ID NO: 116; A. thaliana_At2g30420_CPC-like_: SEQ ID NO: 96; B. napus_CD843377: SEQ ID NO: 104; B. napus_TC95812: SEQ ID NO: 112; G. max_Glyma11g02060.1: SEQ ID NO: 130; M. truncatula_CT033771_17.4: SEQ ID NO: 162; L. serriola_DW108811: SEQ ID NO: 152; S. miltiorrhiza_CV166339: SEQ ID NO: 210; S. miltiorrhiza_TA1626_226208: SEQ ID NO: 212; P. glauca_DR564374: SEQ ID NO: 170; P. sitchensis_TA17447_3332: SEQ ID NO: 184; P. pinaster_TA6535_71647: SEQ ID NO: 180; P. menziesii_TA3655_3357: SEQ ID NO: 174; A. thaliana_At1g71030_CPC-like_: SEQ ID NO: 94; A. thaliana_At1g18960_CPC-like_: SEQ ID NO: 90; A. thaliana_At1g09710_CPC-like_: SEQ ID NO: 88; and A. thaliana_At1g58220_CPC-like_: SEQ ID NO: 92.
[0500] FIG. 11 represents the binary vector for increased expression in Oryza sativa of a TRY-like-encoding nucleic acid under the control of a rice RCc3 promoter (pRCc3).
[0501] FIG. 12 represents a multiple alignment of BZR polypeptides. AT1G19350.1: SEQ ID NO: 241; AT1G75080.1: SEQ ID NO: 243; Pt_scaff_40.175: SEQ ID NO: 295; Pt_scaff_II.1237: SEQ ID NO: 299; Gm 1762729: SEQ ID NO: 251; Mt_TA28179_3880: SEQ ID NO: 273; Le LAT61: SEQ ID NO: 261; Le_TA51962_4081: SEQ ID NO: 267; Vv_TA44770_29760: SEQ ID NO: 311; AT3G50750.1: SEQ ID NO: 239; Gm 1768381: SEQ ID NO: 255; Gm 1768507: SEQ ID NO: 257; Mt_TA21345_3880: SEQ ID NO: 271; Pt_scaff_57.215: SEQ ID NO: 297; Pt_scaff_VII.1038: SEQ ID NO: 303; Le_DB718708: SEQ ID NO: 263; Le_TA37112_4081: SEQ ID NO: 265; Os07g0580500: SEQ ID NO: 281; Zm_AY107201: SEQ ID NO: 313; AT4G36780.1: SEQ ID NO: 249; Os02g0129600: SEQ ID NO: 277; Zm_EE158804: SEQ ID NO: 315; Zm_TA189809_4577: SEQ ID NO: 330; Pp 82495: SEQ ID NO: 287; Pp 17189: SEQ ID NO: 283; Pp 172161: SEQ ID NO: 285; Ps WS0287_023: SEQ ID NO: 289; Hv_TA37786_4513: SEQ ID NO: 259; Os01g0203000: SEQ ID NO: 275; Zm_TA178991_4577: SEQ ID NO: 319; Os06g0552300: SEQ ID NO: 279; Zm_TA175044_4577: SEQ ID NO: 317; AT1G78700.1: SEQ ID NO: 245; AT4G18890.1: SEQ ID NO: 247; Pt_scaff_IV.340: SEQ ID NO: 301; Pt_scaff_XI.678: SEQ ID NO: 305; Gm 1765606: SEQ ID NO: 253; Mt_BF635822: SEQ ID NO: 269; Pt WS01123_K11: SEQ ID NO: 291; Pt_scaff_178.36: SEQ ID NO: 293; Pt_scaff_XI.792: SEQ ID NO: 307; Sl FC26BA11: SEQ ID NO: 309; and Consensus: SEQ ID NO: 331.
[0502] FIG. 13 represents the binary vector for increased expression in Oryza sativa of a BZR-encoding nucleic acid under the control of a rice GOS2 promoter (pGOS2).
EXAMPLES
[0503] The present invention will now be described with reference to the following examples, which are by way of illustration alone. The following examples are not intended to completely define or otherwise limit the scope of the invention.
[0504] DNA manipulation: unless otherwise stated, recombinant DNA techniques are performed according to standard protocols described in (Sambrook (2001) Molecular Cloning: a laboratory manual, 3rd Edition Cold Spring Harbor Laboratory Press, CSH, New York) or in Volumes 1 and 2 of Ausubel et al. (1994), Current Protocols in Molecular Biology, Current Protocols. Standard materials and methods for plant molecular work are described in Plant Molecular Biology Labfax (1993) by R. D. D. Croy, published by BIOS Scientific Publications Ltd (UK) and Blackwell Scientific Publications (UK).
Example 1
Identification of Sequences Related to the Nucleic Acid Sequence Used in the Methods of the Invention
1.1. Root Hairless 1 (RHL1)
[0505] Sequences (full length cDNA, ESTs or genomic) related to the nucleic acid sequence used in the methods of the present invention were identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program is used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. For example, the polypeptide encoded by the nucleic acid used in the present invention was used for the TBLASTN algorithm, with default settings and with the filter to ignore low complexity sequences set off. The output of the analysis was viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons were also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search. For example, the E-value may be increased to show less stringent matches. This way, short nearly exact matches may be identified.
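By way of illustration only, the percentage identity described above may be sketched as follows. This is a minimal sketch and not the calculation performed by BLAST itself: the function name and the toy sequences are hypothetical, the two sequences are assumed to be already aligned with "-" as the gap character, and columns containing a gap are assumed to be excluded from the compared length.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of compared alignment columns holding identical residues.

    Assumption of this sketch: both inputs are rows of one pairwise
    alignment (equal length, '-' marks a gap) and gap columns are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap column: not counted in the compared length
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical toy alignment: 4 identical residues over 5 compared columns.
toy_identity = percent_identity("MKT-LV", "MKSALV")
```

Other conventions for the compared length (for example, counting gap columns) are equally possible and would give different values for the same alignment.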
[0506] Table A1 provides a list of nucleic acid sequences related to the nucleic acid sequence used in the methods of the present invention.
TABLE A1 Examples of RHL1 nucleic acids and polypeptides:
Name                                  Source organism         Nucleic acid   Protein
                                                              SEQ ID NO:     SEQ ID NO:
A.thaliana_AT1G48380.1 (Arath_RHL1)   Arabidopsis thaliana    1              2
p.trichocarpa_scaff_44.278            Populus trichocarpa     3              4
p.trichocarpa_scaff_184.3             Populus trichocarpa     5              6
p.trichocarpa_scaff_IV.277            Populus trichocarpa     7              8
O.sativa_Os07g07580 (Orysa_RHL1)      Oryza sativa            9              10
O.sativa_Os06g51380                   Oryza sativa            11             12
Z.mays_TA180670                       Zea mays                13             14
A.formosa_TA10038                     Aquilegia formosa       15             16
M.domestica_TA43921                   Malus domestica         17             18
P.patens_74926                        Physcomitrella patens   19             20
P.patens_173149                       Physcomitrella patens   21             22
S.tuberosum_TA36268                   Solanum tuberosum       23             24
V.shuttleworthii_TA2694               Vitis shuttleworthii    25             26
V.vinifera_GSVIVT00027050001          Vitis vinifera          27             28
[0507] In some instances, related sequences have tentatively been assembled and publicly disclosed by research institutions, such as The Institute for Genomic Research (TIGR). The Eukaryotic Gene Orthologs (EGO) database may be used to identify such related sequences, either by keyword search or by using the BLAST algorithm with the nucleic acid or polypeptide sequence of interest.
1.2. Transglutaminases (TGases)
[0508] Sequences (full length cDNA, ESTs or genomic) related to the nucleic acid sequence used in the methods of the present invention were identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program is used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. For example, the polypeptide encoded by the nucleic acid sequence of the present invention was used for the TBLASTN algorithm, with default settings and with the filter to ignore low complexity sequences set off. The output of the analysis was viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons were also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search. For example, the E-value may be increased to show less stringent matches. This way, short nearly exact matches may be identified.
[0509] Table A2 provides a list of nucleic acid sequences related to the nucleic acid sequence used in the methods of the present invention.
TABLE A2 Examples of TGase polypeptide sequences, and encoding nucleic acid sequences:
Name              Source organism         Public database accession number  Nucleic acid   Polypeptide
                                                                            sequence       sequence
                                                                            SEQ ID NO:     SEQ ID NO:
Orysa_TGase       Oryza sativa            na                                44             45
Arath_TGase       Arabidopsis thaliana    NM_105387                         46             47
Horvu_TGase       Hordeum vulgare         AK251411                          48             49
Lyces_TGase       Lycopersicon esculentum BT012898                          50             51
Orysa_TGase II    Oryza sativa            NM_001052696                      52             53
Picsi_TGAse       Picea sitchensis        EF087701                          54             55
Poptr_TGase       Populus tremuloides     TA16744_3694                      56             57
Sacof_TGase       Saccharum officinarum   CA246119, CA254082, CA265940      58             59
Sorbi_TGase       Sorghum bicolor         CL187991, ER757182.1, CW291038    60             61
Zeama_TGase II    Zea mays                DT641696.1, DV540831.1            62             63
Zeama_TGase III   Zea mays                na                                64             65
Zeama_tgz15       Zea mays                AJ421525                          66             67
Zeama_tgz21       Zea mays                AJ488103                          68             69
[0510] In some instances, related sequences have tentatively been assembled and publicly disclosed by research institutions, such as The Institute for Genomic Research (TIGR; beginning with TA). The Eukaryotic Gene Orthologs (EGO) database may be used to identify such related sequences, either by keyword search or by using the BLAST algorithm with the nucleic acid sequence or polypeptide sequence of interest. In other instances, special nucleic acid sequence databases have been created for particular organisms, such as by the Joint Genome Institute.
1.3. Tryptichon (TRY-Like)
[0511] Sequences (full length cDNA, ESTs or genomic) related to the nucleic acid sequence used in the methods of the present invention were identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program is used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. For example, the polypeptide encoded by the nucleic acid used in the present invention was used for the TBLASTN algorithm, with default settings and with the filter to ignore low complexity sequences set off. The output of the analysis was viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons were also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search. For example, the E-value may be increased to show less stringent matches. This way, short nearly exact matches may be identified.
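The ranking of hits by E-value, and the effect of raising the E-value cut-off, may be sketched as follows. This is an illustrative sketch only: the `Hit` record, the function name, the default cut-off of 1e-5 and the three hits are all hypothetical and are not taken from the searches reported here.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical summary of one database hit."""
    subject_id: str
    evalue: float        # lower E-value = more significant hit
    pct_identity: float  # percentage identity over the aligned length


def rank_hits(hits: List[Hit], max_evalue: float = 1e-5) -> List[Hit]:
    """Keep hits at or below the cut-off and sort them by ascending E-value.

    Ties on E-value are broken by higher percentage identity first.
    """
    kept = [h for h in hits if h.evalue <= max_evalue]
    return sorted(kept, key=lambda h: (h.evalue, -h.pct_identity))


# Made-up hits; hit_B stands for a short, nearly exact match whose
# E-value is comparatively high because the alignment is short.
hits = [
    Hit("hit_A", 1e-40, 62.0),
    Hit("hit_B", 3e-3, 95.0),
    Hit("hit_C", 1e-12, 48.5),
]
strict = rank_hits(hits)                    # hit_B is filtered out
relaxed = rank_hits(hits, max_evalue=10.0)  # hit_B is retained, ranked last
```

With the stricter cut-off only the two longer matches remain, whereas the relaxed cut-off also returns the short, nearly exact match, which is the behaviour described in the preceding paragraph.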
[0512] Table A3 provides a list of nucleic acid sequences related to the nucleic acid sequence used in the methods of the present invention.
TABLE A3 Examples of TRY-like polypeptides:
Name                                    Nucleic acid   Polypeptide
                                        SEQ ID NO:     SEQ ID NO:
At5g53200                               75             76
A.capillaris_DV853805                   77             78
A.capillaris_DV859458                   79             80
A.hypogaea_CD038483                     81             82
A.sativa_CN818591                       83             84
A.thaliana_At1g01380_CPC-like_ETC1      85             86
A.thaliana_At1g09710_CPC-like_NA        87             88
A.thaliana_At1g18960_CPC-like_NA        89             90
A.thaliana_At1g58220_CPC-like_NA        91             92
A.thaliana_At1g71030_CPC-like_ATMYBL2   93             94
A.thaliana_At2g30420_CPC-like_NA        95             96
A.thaliana_At2g46410_CPC-like_CPC       97             98
A.thaliana_At4g01060_CPC-like_ETC3      99             100
B.gymnorrhiza_TA2541_39984              101            102
B.napus_CD843377                        103            104
B.napus_EE451172                        105            106
B.napus_EV055366                        107            108
B.napus_TC92601                         109            110
B.napus_TC95812                         111            112
C.canephora_DV693718                    113            114
C.longa_DY390653                        115            116
C.tetragonoloba_EG990179                117            118
E.esula_DV121180                        119            120
G.hirsutum_DW508052                     121            122
G.hirsutum_TC102183                     123            124
G.hirsutum_TC116960                     125            126
G.hirsutum_TC121748                     127            128
G.max_Glyma11g02060.1                   129            130
G.max_8223                              131            132
G.max_29139                             133            134
G.soja_TA4526_3848                      135            136
H.vulgare_TC189825                      137            138
I.nil_TC6509                            139            140
J.hindsii_x_regia_EL893054              141            142
J.hindsii_x_regia_TA1295_432290         143            144
L.japonicus_CB827663                    145            146
L.multiflorum_AU249134                  147            148
L.saligna_DW052030                      149            150
L.serriola_DW108811                     151            152
L.tulipifera_CV004984                   153            154
M.domestica_TC17597                     155            156
M.esculenta_DV443286                    157            158
M.esculenta_TA9427_3983                 159            160
M.truncatula_CT033771_17.4              161            162
O.minuta_CB884361                       163            164
O.sativa_LOC_Os01g43230.2               165            166
P.equestris_CB034844                    167            168
P.glauca_DR564374                       169            170
P.hybrida_EB175070                      171            172
P.menziesii_TA3655_3357                 173            174
P.persica_BU039343                      175            176
P.pinaster_CT579117                     177            178
P.pinaster_TA6535_71647                 179            180
P.sitchensis_TA16538_3332               181            182
P.sitchensis_TA17447_3332               183            184
P.taeda_DR096185                        185            186
P.tremula_BU888423                      187            188
P.tremula_DN497189                      189            190
P.tremula_TA11725_113636                191            192
P.tremula_TA7610_113636                 193            194
P.trichocarpa_562293                    195            196
P.trichocarpa_568212                    197            198
P.trichocarpa_594467                    199            200
P.trichocarpa_674550                    201            202
P.trichocarpa_807368                    203            204
P.vulgaris_CV538421                     205            206
S.bicolor_Sb03g028170.1                 207            208
S.miltiorrhiza_CV166339                 209            210
S.miltiorrhiza_TA1626_226208            211            212
S.tuberosum_CV505951                    213            214
T.aestivum_BE412359                     215            216
T.androssowii_TA2313_189785             217            218
Triphysaria_sp_TC9313                   219            220
V.vinifera_GSVIVT00006915001            221            222
V.vinifera_GSVIVT00010755001            223            224
V.vinifera_GSVIVT00026045001            225            226
Z.mays_TC409725                         227            228
[0513] In some instances, related sequences have tentatively been assembled and publicly disclosed by research institutions, such as The Institute for Genomic Research (TIGR; beginning with TA). The Eukaryotic Gene Orthologs (EGO) database may be used to identify such related sequences, either by keyword search or by using the BLAST algorithm with the nucleic acid sequence or polypeptide sequence of interest. In other instances, special nucleic acid sequence databases have been created for particular organisms, such as by the Joint Genome Institute. Further, access to proprietary databases has allowed the identification of novel nucleic acid and polypeptide sequences.
1.4. Brassinazole Resistant1 (BZR1)
[0514] Sequences (full length cDNA, ESTs or genomic) related to the BZR nucleic acid sequence were identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program was used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. SEQ ID NO: 2 was used for the TBLASTN algorithm, under default settings and without filters to ignore low complexity sequences. The output of the analysis was viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons were also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length.
[0515] Table A4 provides a list of BZR nucleic acid sequences and encoded proteins thereof.
TABLE-US-00018 TABLE A4 Examples of BZR polypeptides:
Name                Plant Source               Nucleic acid SEQ ID NO:   Polypeptide SEQ ID NO:
AT3G50750.1         Arabidopsis thaliana       238                       239
AT1G19350.1         Arabidopsis thaliana       240                       241
AT1G75080.1         Arabidopsis thaliana       242                       243
AT1G78700.1         Arabidopsis thaliana       244                       245
AT4G18890.1         Arabidopsis thaliana       246                       247
AT4G36780.1         Arabidopsis thaliana       248                       249
Gm\1762729          Glycine max                250                       251
Gm\1765606          Glycine max                252                       253
Gm\1768381          Glycine max                254                       255
Gm\1768507          Glycine max                256                       257
Hv_TA37786_4513     Hordeum vulgare            258                       259
Le\LAT61            Lycopersicum esculentum    260                       261
Le_DB718708         Lycopersicum esculentum    262                       263
Le_TA37112_4081     Lycopersicum esculentum    264                       265
Le_TA51962_4081     Lycopersicum esculentum    266                       267
Mt_BF635822         Medicago truncatula        268                       269
Mt_TA21345_3880     Medicago truncatula        270                       271
Mt_TA28179_3880     Medicago truncatula        272                       273
Os01g0203000        Oryza sativa               274                       275
Os02g0129600        Oryza sativa               276                       277
Os06g0552300        Oryza sativa               278                       279
Os07g0580500        Oryza sativa               280                       281
Pp\17189            Physcomitrella patens      282                       283
Pp\172161           Physcomitrella patens      284                       285
Pp\82495            Physcomitrella patens      286                       287
Ps\WS0287_023       Picea sitchensis           288                       289
Pt\WS01123_K11      Populus trichocarpa        290                       291
Pt_scaff_178.36     Populus trichocarpa        292                       293
Pt_scaff_40.175     Populus trichocarpa        294                       295
Pt_scaff_57.215     Populus trichocarpa        296                       297
Pt_scaff_II.1237    Populus trichocarpa        298                       299
Pt_scaff_IV.340     Populus trichocarpa        300                       301
Pt_scaff_VII.1038   Populus trichocarpa        302                       303
Pt_scaff_XI.678     Populus trichocarpa        304                       305
Pt_scaff_XI.792     Populus trichocarpa        306                       307
Sl\FC26BA11         Solanum lycopersicum       308                       309
Vv_TA44770_29760    Vitis vinifera             310                       311
Zm_AY107201         Zea mays                   312                       313
Zm_EE158804         Zea mays                   314                       315
Zm_TA175044_4577    Zea mays                   316                       317
Zm_TA178991_4577    Zea mays                   318                       319
Example 2
Alignment of Sequences Related to the Nucleic Acid Sequence Used in the Methods of the Invention
2.1. Root Hairless 1 (RHL1)
[0516] Alignment of the RHL1 polypeptide sequences of Table A was performed using the Clustal W algorithm of progressive alignment (Larkin et al. Bioinformatics. 2007 Nov. 1; 23(21):2947-8; Thompson et al. (1997) Nucleic Acids Res 25:4876-4882; Chenna et al. (2003) Nucleic Acids Res 31:3497-3500). Default values were used: a gap open penalty of 10, a gap extension penalty of 0.1, and Blosum 62 as the selected weight matrix. The protein alignment is given in FIG. 2a.
[0517] A phylogenetic tree of the RHL1 polypeptide sequences of Table A (FIG. 2b) was constructed using a neighbour-joining clustering algorithm as provided in the Clustal W programme.
2.2. Transglutaminases (TGases)
[0518] Multiple sequence alignment of all the TGase polypeptide sequences in Table A was performed using the AlignX algorithm (from Vector NTI 10.3, Invitrogen Corporation). Results of the alignment are shown in FIG. 7 of the present application. The N-terminal plastidic transit peptide is separated from the rest of the polypeptide by a vertical bar. The putative calcium binding region is boxed, and marked with X's under the consensus sequence. The domain where at least one coiled coil is predicted using the Coils algorithm (and as represented by SEQ ID NO: 70) is marked with X's under the consensus region.
2.3. Tryptichon (TRY-Like)
[0519] Alignment of polypeptide sequences was performed using the ClustalW 2.0 algorithm of progressive alignment (Thompson et al. (1997) Nucleic Acids Res 25:4876-4882; Chenna et al. (2003) Nucleic Acids Res 31:3497-3500) with standard settings (slow alignment; similarity matrix: Gonnet; gap opening penalty: 10; gap extension penalty: 0.2). Minor manual editing was done to further optimise the alignment. Sequence conservation among TRY-like polypeptides is essentially in the DNA binding domain of the polypeptides, the C-terminus and N-terminus usually being more variable in sequence length and composition. The TRY-like polypeptides are aligned in FIG. 10.
[0520] This alignment can be used for determining conserved signature sequences of about 5 to 10 amino acids in length. Preferably the conserved regions of the proteins are used, recognisable by the asterisks (identical residues), the colons (highly conserved substitutions) and the dots (conserved substitutions).
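As a non-limiting sketch, conserved stretches may be located from the conservation line that Clustal W prints beneath an alignment (asterisks, colons and dots). The function name, the example conservation line and the minimum length of 5 columns are assumptions made for illustration; the example line is not taken from FIG. 10.

```python
import re
from typing import List, Tuple


def conserved_windows(conservation: str, min_len: int = 5,
                      symbols: str = "*:.") -> List[Tuple[int, int]]:
    """Return 1-based, inclusive column ranges of uninterrupted conserved runs.

    A column counts as conserved when its conservation symbol is one of
    `symbols`; a run is reported only if it spans at least `min_len` columns.
    """
    pattern = "[" + re.escape(symbols) + "]{" + str(min_len) + ",}"
    return [(m.start() + 1, m.end()) for m in re.finditer(pattern, conservation)]


# Hypothetical conservation line, for illustration only.
line = "  *:**.*   *  ***:*.****  "
print(conserved_windows(line))              # runs of any conserved symbol
print(conserved_windows(line, 3, "*"))      # runs of identical residues only
```

Restricting `symbols` to the asterisk alone gives a stricter reading in which only columns of identical residues count towards a signature sequence.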
2.4. Brassinazole Resistant1 (BZR1)
[0521] Alignment of polypeptide sequences was performed using the AlignX programme from Vector NTI (Invitrogen), which is based on the popular Clustal W algorithm of progressive alignment (Thompson et al. (1997) Nucleic Acids Res 25:4876-4882; Chenna et al. (2003) Nucleic Acids Res 31:3497-3500). Default values were as follows: gap open penalty of 10; gap extension penalty of 0.1; and the selected weight matrix was Blosum 62 (if polypeptides are aligned). The alignment of BZR polypeptides is shown in FIG. 12. The sequence Pp172161 in the alignment is truncated at the N- and C-termini. The highly conserved amino acid residues are indicated in the consensus sequence.
[0522] Sequence conservation among BZR polypeptides is essentially in the N-terminal part, along the BZR1 transcriptional repressor domain. The highly conserved bHLH-like DNA binding domain characteristic of BZR polypeptides is highlighted in FIG. 12. Conserved amino acid motifs were identified, such as SAPVTPPLSSP (SEQ ID NO: 323; located at positions 405-415 in the consensus sequence), VKPWEGERIHE (SEQ ID NO: 324; located at positions 634-644 in the consensus sequence) and DLELTLG (SEQ ID NO: 325; located at positions 656-662 in the consensus sequence).
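For illustration, the presence and position of the three conserved motifs in a candidate polypeptide may be checked with a simple exact-match search, as sketched below. The test sequence is a hypothetical construct assembled only to exercise the code; it is not a BZR polypeptide, and the positions it yields are unrelated to the consensus positions given above.

```python
from typing import Dict, Tuple

# Motifs as recited in the text, keyed by sequence identifier.
MOTIFS = {
    "SEQ ID NO: 323": "SAPVTPPLSSP",
    "SEQ ID NO: 324": "VKPWEGERIHE",
    "SEQ ID NO: 325": "DLELTLG",
}


def locate_motifs(sequence: str,
                  motifs: Dict[str, str] = MOTIFS) -> Dict[str, Tuple[int, int]]:
    """Map each motif found in `sequence` to its 1-based (start, end) position."""
    found = {}
    for label, motif in motifs.items():
        start = sequence.find(motif)  # first exact occurrence, or -1
        if start != -1:
            found[label] = (start + 1, start + len(motif))
    return found


# Hypothetical toy sequence, for illustration only.
toy = "MM" + "SAPVTPPLSSP" + "GGG" + "VKPWEGERIHE" + "A" + "DLELTLG"
print(locate_motifs(toy))
```

An exact-match search is the simplest choice; a tolerant search allowing conservative substitutions would need a scoring scheme that is not specified here.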
Example 3
Calculation of Global Percentage Identity Between Polypeptide Sequences Useful in Performing the Methods of the Invention
3.1. Root Hairless 1 (RHL1)
[0523] Global percentages of similarity and identity between full length polypeptide sequences useful in performing the methods of the invention were determined using one of the methods available in the art, the MatGAT (Matrix Global Alignment Tool) software (BMC Bioinformatics. 2003 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. Campanella J J, Bitincka L, Smalley J; software hosted by Ledion Bitincka). MatGAT software generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data. The program performs a series of pair-wise alignments using the Myers and Miller global alignment algorithm (with a gap opening penalty of 12, and a gap extension penalty of 2), calculates similarity and identity using for example Blosum 62 (for polypeptides), and then places the results in a distance matrix. Sequence similarity is shown below the diagonal dividing line and sequence identity is shown above the diagonal dividing line.
[0524] Parameters used in the comparison were:
[0525] Scoring matrix: Blosum62
[0526] First Gap: 12
[0527] Extending gap: 2
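The role of the two gap parameters listed above can be sketched with a small global alignment scorer that uses affine gap penalties (first gap position costs 12, each further position costs 2). This is a simplified illustration and not the MatGAT implementation: it uses a quadratic-space recurrence rather than the linear-space Myers and Miller algorithm, and it replaces the Blosum62 matrix with a hypothetical match/mismatch score (+5/-4) to keep the example self-contained.

```python
NEG_INF = float("-inf")


def global_affine_score(a: str, b: str, gap_open: int = 12, gap_extend: int = 2,
                        match: int = 5, mismatch: int = -4) -> float:
    """Best global alignment score of a and b under affine gap penalties.

    M[i][j]: a[i-1] aligned to b[j-1]; X[i][j]: a[i-1] aligned to a gap;
    Y[i][j]: b[j-1] aligned to a gap. match/mismatch are placeholder scores
    standing in for a substitution matrix (an assumption of this sketch).
    """
    n, m = len(a), len(b)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):          # leading gap in b
        X[i][0] = -gap_open - (i - 1) * gap_extend
    for j in range(1, m + 1):          # leading gap in a
        Y[0][j] = -gap_open - (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] - gap_open,
                          X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open)
            Y[i][j] = max(M[i][j - 1] - gap_open,
                          Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open)
    return max(M[n][m], X[n][m], Y[n][m])


print(global_affine_score("ACDE", "ACDE"))  # four matches
print(global_affine_score("ACDE", "ACE"))   # three matches and one gap position
```

With these placeholder scores a single gap position outweighs two matches, which illustrates why a high gap opening penalty favours alignments with few, longer gaps.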
[0528] Results of the software analysis are shown in Table B1 for the global similarity and identity over the full length of the polypeptide sequences. Percentage identity is given above the diagonal in bold and percentage similarity is given below the diagonal (normal face).
[0529] The percentage identity between the RHL1 polypeptide sequences useful in performing the methods of the invention can be as low as 31% amino acid identity compared to SEQ ID NO: 2.
TABLE-US-00019 TABLE B1 MatGAT results for global similarity and identity over the full length of the polypeptide sequences.
Polypeptide nr  Name polypeptide                   1    2    3    4    5    6    7    8    9    10   11
1               O. sativa_Os07g07580               100  88   74   42   42   49   39   40   39   35   34
2               O. sativa_Os06g51380               88   100  66   36   35   43   34   35   35   31   30
3               Z. mays_TA180670                   74   66   100  38   37   47   38   37   38   33   31
4               A. formosa_TA10038                 42   36   38   100  60   62   50   50   49   34   35
5               V. vinifera_GSVIVT00027050001      42   35   37   60   100  65   51   49   49   27   27
6               M. domestica_TA43921               49   43   47   62   65   100  58   59   56   41   39
7               S. tuberosum_TA36268               39   34   38   50   51   58   100  48   50   31   32
8               P. trichocarpa_scaff_IV.277        40   35   37   50   49   59   48   100  49   29   31
9               A. thaliana_AT1G48380.1            39   35   38   49   49   56   50   49   100  32   31
10              P. patens_74926                    35   31   33   34   27   41   31   29   32   100  52
11              P. patens_173149                   34   30   31   35   27   39   32   31   31   52   100
3.2. Transglutaminases (TGases)
[0530] Global percentages of similarity and identity between full length polypeptide sequences useful in performing the methods of the invention were determined using one of the methods available in the art, the MatGAT (Matrix Global Alignment Tool) software (BMC Bioinformatics. 2003 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. Campanella J J, Bitincka L, Smalley J; software hosted by Ledion Bitincka). MatGAT software generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data. The program performs a series of pair-wise alignments using the Myers and Miller global alignment algorithm (with a gap opening penalty of 12, and a gap extension penalty of 2), calculates similarity and identity using for example Blosum 62 (for polypeptides), and then places the results in a distance matrix. Sequence similarity is shown below the diagonal dividing line and sequence identity is shown above the diagonal dividing line.
[0531] Parameters used in the comparison were:
[0532] Scoring matrix: Blosum62
[0533] First Gap: 12
[0534] Extending gap: 2
[0535] Results of the software analysis are shown in Table B2 for the global similarity and identity over the full length of the polypeptide sequences (excluding the partial polypeptide sequences).
[0536] The percentage identity between the full length polypeptide sequences useful in performing the methods of the invention can be as low as 26% amino acid identity compared to SEQ ID NO: 44.
TABLE-US-00020 TABLE B2 MatGAT results for global similarity and identity over the full length of the polypeptide sequences of Table A.
                       1    2    3    4    5    6    7    8    9    10   11   12   13
1. Arath_TGase         -    36   41   26   31   33   30   24   24   21   23   18   16
2. Horvu_TGase         49   -    40   29   61   30   26   28   27   26   27   22   21
3. Lyces_TGase         58   51   -    29   36   30   26   28   28   27   28   21   21
4. Orysa_TGase         42   46   45   -    28   26   29   67   62   57   62   40   36
5. Orysa_TGase\II      43   69   47   42   -    25   23   27   27   26   25   20   20
6. Picsi_TGAse         49   41   42   43   34   -    32   24   22   20   21   17   16
7. Poptr_TGase         50   40   39   45   35   51   -    31   29   26   28   23   22
8. Sacof_TGase         40   45   45   75   42   41   44   -    88   75   81   48   44
9. Sorbi_TGase         40   46   46   70   41   38   41   91   -    81   87   49   45
10. Zeama_TGase\II     34   43   42   64   42   34   37   77   82   -    93   49   47
11. Zeama_TGase\III    38   46   44   70   41   37   39   83   88   93   -    50   46
12. Zeama_tgz15        32   37   38   49   37   29   34   56   56   58   57   -    91
13. Zeama_tgz21        29   34   35   45   34   27   32   52   51   55   53   91   -
[0537] The percentage amino acid identity can be significantly increased if the most conserved regions of the polypeptides are compared. For example, when comparing the amino acid sequence of the coiled coil domain of SEQ ID NO: 2 (as represented by SEQ ID NO: 27) with the coiled coil domain of the polypeptides of Table A, the percentage amino acid identity increased up to 50%.
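The matrix layout used in this example, with identity above the diagonal and similarity below it, may be read programmatically as sketched here. The three names and six values are those given for the first three TGase polypeptides in the MatGAT table of this section; the function name and the use of `None` for the diagonal are choices made for this illustration.

```python
from typing import List, Optional, Tuple

names = ["Arath_TGase", "Horvu_TGase", "Lyces_TGase"]

# Combined matrix: identity (%) above the diagonal, similarity (%) below it.
combined: List[List[Optional[float]]] = [
    [None, 36, 41],
    [49, None, 40],
    [58, 51, None],
]


def identity_and_similarity(matrix: List[List[Optional[float]]],
                            i: int, j: int) -> Tuple[float, float]:
    """Return (identity, similarity) for sequences i and j (0-based, i != j)."""
    if i == j:
        raise ValueError("i and j must refer to two different sequences")
    lo, hi = min(i, j), max(i, j)
    identity = matrix[lo][hi]    # upper triangle
    similarity = matrix[hi][lo]  # lower triangle
    return identity, similarity


for i in range(len(names)):
    for j in range(i + 1, len(names)):
        ident, simil = identity_and_similarity(combined, i, j)
        print(f"{names[i]} vs {names[j]}: identity {ident}%, similarity {simil}%")
```

Because the lookup orders the indices before reading the matrix, the same pair gives the same answer regardless of the order in which the two sequences are named.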
3.3. Tryptichon (TRY-Like)
[0538] Global percentages of similarity and identity between full length polypeptide sequences useful in performing the methods of the invention were determined using one of the methods available in the art, the MatGAT (Matrix Global Alignment Tool) software (BMC Bioinformatics. 2003 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. Campanella J J, Bitincka L, Smalley J; software hosted by Ledion Bitincka). MatGAT software generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data. The program performs a series of pair-wise alignments using the Myers and Miller global alignment algorithm (with a gap opening penalty of 12, and a gap extension penalty of 2), calculates similarity and identity using for example Blosum 62 (for polypeptides), and then places the results in a distance matrix. Sequence similarity is shown below the diagonal dividing line and sequence identity is shown above the diagonal dividing line.
[0539] Parameters used in the comparison were:
[0540] Scoring matrix: Blosum62
[0541] First Gap: 12
[0542] Extending gap: 2
[0543] Results of the software analysis are shown in Table B3 for the global similarity and identity over the full length of the polypeptide sequences. Percentage identity between At5g53200 and other TRY-like polypeptides is given above the diagonal in bold, for both the full length sequence and for the DNA-binding domain.
[0544] The percentage identity between the TRY-like polypeptide sequences useful in performing the methods of the invention can be as low as 12% sequence identity compared to SEQ ID NO: 76. However, the sequence conservation is much higher when the DNA binding domain is compared. Table B3 (part B) shows similarity and identity among the sequences representing the DNA binding domain (sequences that align with the DNA-binding domain as shown in FIG. 9, residues 30 to 75 in SEQ ID NO: 76). The sequence identity is generally higher than 50%.
TABLE-US-00021 TABLE B3 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 A: MatGAT results for global similarity and identity over the full length of the polypeptide sequences. 1. Pt_scaff_XII.135 38.1 57.3 19.1 36.1 15.7 43.3 15.3 13.2 15.1 51.8 52.0 10.3 12.7 31.5 34.0 52.6 12.8 63.0 56.0 11.8 37.1 12.2 13.8 2. Pt_scaff_64.55 52.6 36.5 15.5 93.2 12.9 40.7 14.8 12.3 12.8 32.1 41.2 10.6 11.3 31.5 34.9 45.8 11.4 31.1 38.3 10.4 41.8 11.1 13.3 3. Pt_scaff_II.1572 70.2 44.2 20.1 36.5 16.5 44.8 16.5 17.5 14.6 47.4 48.1 11.2 13.5 32.3 33.7 45.2 14.7 52.9 51.3 13.5 33.7 13.7 15.1 4. Pt_scaff_I.1021 29.9 22.2 30.9 16.0 37.8 15.5 44.3 36.1 40.8 19.1 16.0 29.8 35.3 15.5 15.5 19.6 35.0 21.1 20.1 28.4 16.0 37.4 38.7 5. Pt_scaff_VII.231 49.5 94.6 43.3 21.1 12.0 38.3 14.8 12.7 11.9 33.0 39.3 10.6 10.9 32.4 36.1 47.6 11.0 32.1 37.2 10.1 39.2 11.1 12.8 6. AT3G13540 22.1 17.3 25.7 51.4 17.7 12.4 40.4 35.2 38.2 16.9 14.1 33.7 45.0 14.5 15.3 13.5 33.9 16.1 14.9 36.7 14.1 36.8 41.0 7. AT4G01060 62.9 62.3 57.7 26.3 63.6 19.7 12.8 14.5 14.6 36.6 45.2 9.7 11.6 38.3 42.4 65.9 11.0 39.8 48.9 11.5 46.3 11.8 12.8 8. AT5G14750 25.1 23.6 30.5 57.1 23.6 53.4 22.2 55.0 58.9 17.2 15.8 28.0 35.3 18.2 20.2 14.3 31.3 18.1 16.2 31.2 19.2 31.5 38.7 9. AT3G27920 24.1 19.3 27.2 50.9 21.5 49.0 21.1 64.5 63.8 18.8 14.0 28.8 35.3 16.2 15.8 13.2 32.5 18.3 15.8 32.8 15.8 31.8 35.9 10. AT5G40330 24.7 19.2 22.4 54.3 18.3 52.6 21.0 71.2 75.9 17.8 14.5 27.1 34.2 17.1 15.8 13.7 34.3 19.0 16.0 30.6 16.4 35.6 36.3 11. AT2G30420 65.2 42.9 60.7 32.0 43.8 27.7 53.6 28.6 28.9 29.2 59.8 13.1 15.6 27.6 31.3 44.6 15.4 57.5 44.6 12.8 31.0 16.2 15.6 12. AT2G30432 72.2 53.6 60.6 24.2 51.2 20.1 66.7 28.1 22.4 24.7 66.1 9.0 12.4 28.8 35.3 51.8 11.7 56.5 44.7 10.4 32.6 10.7 13.8 13. Os03g29614 16.5 13.4 19.0 38.3 13.7 43.9 14.3 41.7 40.5 38.3 19.6 15.6 37.3 10.2 9.7 9.0 39.6 12.1 11.8 33.7 8.1 40.3 31.2 14. 
Os01g50110 21.5 16.7 21.8 44.7 16.0 57.1 18.2 46.9 48.7 44.4 21.5 17.8 48.6 12.7 11.6 12.0 37.4 15.3 14.2 35.1 12.4 37.5 34.5 15. Os01g43180 51.4 46.7 57.9 27.8 44.9 21.7 53.3 28.6 26.8 27.4 53.6 53.3 17.1 19.6 54.5 35.1 11.4 27.4 33.6 13.4 49.5 11.1 15.8 16. Os01g43230 54.6 49.4 51.9 24.7 50.6 21.7 59.0 27.6 23.7 23.7 48.2 60.7 15.0 16.4 64.5 37.2 12.8 32.1 37.9 10.8 60.2 12.5 15.9 17. AT1G01380 67.0 59.0 61.5 28.4 59.0 19.7 79.5 22.7 20.6 21.0 56.3 69.0 14.3 18.2 52.3 59.0 12.7 44.4 54.3 10.1 39.8 12.1 12.8 18. Zm_C1 19.8 16.1 21.6 44.3 15.4 54.2 17.2 42.9 45.8 46.9 23.8 19.8 48.6 50.5 20.1 17.9 19.4 13.9 13.9 34.4 10.3 81.8 37.7 19. At5g53200 79.2 40.6 69.8 33.5 42.5 25.7 55.7 30.5 28.1 27.9 72.3 64.2 18.4 22.2 55.1 50.9 57.5 20.1 47.7 14.2 30.2 15.9 16.1 20. AT2G46410 72.2 50.0 66.3 30.4 51.1 23.7 60.6 24.6 25.4 23.7 60.7 67.0 16.5 21.8 54.2 55.3 66.0 20.1 65.1 12.8 39.4 15.9 16.5 21. Zm_TA175111 18.8 15.6 20.5 44.1 14.9 52.1 17.0 45.8 43.4 44.1 22.6 16.3 48.3 50.0 19.1 16.3 16.7 45.5 20.8 19.4 9.7 36.1 48.6 22. Zm_TA218306 55.7 62.8 50.0 25.3 61.5 19.7 62.8 25.1 23.2 23.7 46.4 64.3 13.1 17.8 57.0 72.3 59.0 16.1 50.0 57.4 17.4 10.7 13.3 23. Zm_AY135018 20.7 16.6 19.9 45.0 15.5 53.9 17.7 44.3 48.0 49.1 24.7 18.5 50.8 50.2 19.9 18.8 18.5 84.6 22.5 20.3 49.7 16.6 36.2 24. Zm_TA175105 25.2 22.0 27.1 55.0 21.1 57.0 22.0 53.2 52.6 52.5 28.9 22.5 41.7 49.1 25.7 24.8 22.9 47.6 26.6 25.2 59.0 23.4 46.5 B: MatGAT results for global similarity and identity over the DNA binding domain of the polypeptide sequences. 1. Zm_C1 95.7 71.7 73.9 91.3 84.8 87.0 89.1 76.1 76.1 73.9 50.0 52.2 52.2 52.2 47.8 52.2 52.2 47.8 47.8 46.9 47.8 54.3 54.3 2. Zm_AY135018 100.0 69.6 71.7 91.3 82.6 84.8 91.3 76.1 76.1 73.9 47.8 54.3 54.3 50.0 45.7 50.0 54.3 50.0 45.7 44.9 45.7 54.3 54.3 3. Zm_TA175111 87.0 89.1 91.3 67.4 71.7 67.4 71.7 73.9 71.7 71.7 41.3 45.7 47.8 47.8 43.5 47.8 45.7 43.5 47.8 46.9 45.7 45.7 45.7 4. 
Zm_TA175105 91.3 91.3 93.5 73.9 73.9 69.6 73.9 76.1 73.9 73.9 39.1 43.5 45.7 45.7 41.3 45.7 43.5 41.3 45.7 44.9 43.5 45.7 45.7 5. Os03g29614 95.7 95.7 87.0 93.5 82.6 84.8 89.1 71.7 71.7 69.6 45.7 50.0 50.0 47.8 43.5 47.8 50.0 45.7 43.5 42.9 43.5 54.3 54.3 6. AT3G13540 93.5 93.5 87.0 91.3 91.3 87.0 84.8 78.3 78.3 76.1 50.0 50.0 52.2 54.3 47.8 52.2 50.0 50.0 52.2 51.0 52.2 52.2 52.2 7. Os01g50110 91.3 91.3 84.8 89.1 89.1 91.3 89.1 71.7 73.9 69.6 47.8 47.8 50.0 52.2 45.7 50.0 47.8 50.0 47.8 46.9 47.8 52.2 52.2 8. Pt_scaff_I.1021 93.5 93.5 93.5 95.7 91.3 93.5 93.5 80.4 82.6 78.3 50.0 56.5 54.3 52.2 47.8 52.2 54.3 50.0 45.7 44.9 45.7 52.2 52.2 9. AT5G14750 89.1 89.1 93.5 95.7 87.0 89.1 87.0 95.7 97.8 97.8 43.5 50.0 47.8 52.2 50.0 52.2 45.7 43.5 50.0 46.9 50.0 45.7 45.7 10. AT5G40330 89.1 89.1 93.5 95.7 87.0 89.1 87.0 95.7 100.0 95.7 41.3 47.8 45.7 50.0 47.8 50.0 45.7 43.5 47.8 44.9 47.8 43.5 43.5 11. AT3G27920 89.1 89.1 93.5 95.7 87.0 89.1 87.0 95.7 100.0 100.0 43.5 50.0 47.8 52.2 50.0 52.2 45.7 43.5 50.0 46.9 50.0 47.8 47.8 12. Pt_scaff_XII.135 67.4 67.4 63.0 67.4 65.2 65.2 67.4 71.7 67.4 67.4 67.4 87.0 73.9 78.3 73.9 78.3 67.4 78.3 65.2 53.1 63.0 65.2 65.2 13. Pt_scaff_II.1572 71.7 71.7 69.6 71.7 69.6 69.6 71.7 76.1 71.7 71.7 71.7 97.8 78.3 78.3 78.3 82.6 69.6 71.7 58.7 51.0 54.3 67.4 67.4 14. AT2G46410 71.7 71.7 67.4 69.6 69.6 71.7 73.9 73.9 69.6 69.6 69.6 89.1 87.0 65.2 63.0 71.7 78.3 73.9 65.2 55.1 63.0 65.2 65.2 15. AT2G30420 69.6 69.6 65.2 69.6 67.4 67.4 67.4 73.9 69.6 69.6 69.6 89.1 91.3 87.0 87.0 84.8 60.9 67.4 56.5 53.1 56.5 58.7 58.7 16. AT2G30432 65.2 65.2 60.9 65.2 63.0 63.0 63.0 69.6 69.6 69.6 69.6 95.7 93.5 84.8 93.5 89.1 60.9 65.2 54.3 53.1 50.0 56.5 56.5 17. At5g53200 69.6 69.6 65.2 69.6 67.4 67.4 67.4 73.9 69.6 69.6 69.6 97.8 95.7 91.3 91.3 93.5 63.0 69.6 58.7 53.1 54.3 60.9 60.9 18. AT4G01060 73.9 73.9 69.6 71.7 71.7 71.7 71.7 73.9 71.7 69.6 71.7 91.3 89.1 91.3 91.3 87.0 89.1 78.3 67.4 59.2 67.4 56.5 56.5 19. 
AT1G01380 69.6 69.6 63.0 65.2 67.4 67.4 69.6 69.6 65.2 63.0 67.4 91.3 89.1 91.3 84.8 84.8 87.0 91.3 60.9 51.0 58.7 63.0 63.0 20. Os01g43180 71.7 71.7 69.6 73.9 69.6 69.6 69.6 73.9 73.9 73.9 73.9 89.1 91.3 87.0 84.8 89.1 91.3 89.1 84.8 73.5 78.3 58.7 58.7 21. Os01g43230 61.2 61.2 59.2 63.3 59.2 59.2 59.2 63.3 63.3 63.3 63.3 73.5 73.5 71.4 67.3 71.4 73.5 73.5 69.4 83.7 65.3 51.0 51.0 22. Zm_TA218306 71.7 71.7 69.6 73.9 69.6 69.6 69.6 73.9 73.9 73.9 73.9 80.4 80.4 82.6 76.1 76.1 78.3 84.8 76.1 87.0 71.4 58.7 58.7 23. Pt_scaff_64.55 67.4 67.4 67.4 73.9 67.4 67.4 69.6 67.4 67.4 67.4 67.4 78.3 76.1 82.6 69.6 69.6 71.7 78.3 80.4 80.4 65.3 82.6 97.8 24. Pt_scaff_VII.231 69.6 69.6 69.6 76.1 69.6 69.6 71.7 69.6 69.6 69.6 69.6 78.3 76.1 82.6 69.6 69.6 71.7 78.3 80.4 80.4 65.3 82.6 97.8
3.4. Brassinazole Resistant1 (BZR1)
[0545] Global percentages of similarity and identity between full length BZR polypeptide sequences were determined using the MatGAT (Matrix Global Alignment Tool) software (BMC Bioinformatics. 2003 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. Campanella J J, Bitincka L, Smalley J; software hosted by Ledion Bitincka). MatGAT software generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data. The program performed a series of pair-wise alignments using the Myers and Miller global alignment algorithm (with a gap opening penalty of 12, and a gap extension penalty of 2), calculated similarity and identity using for example Blosum 62 (for polypeptides), and then placed the results in a distance matrix. Sequence similarity is shown below the diagonal dividing line and sequence identity is shown above the diagonal dividing line.
[0546] Parameters used in the comparison were:
[0547] Scoring matrix: Blosum62
[0548] First Gap: 12
[0549] Extending gap: 2
[0550] Results of the software analysis are shown in Table B4 for the global similarity and identity over the full length of the polypeptide sequences. Percentage identity is given above the diagonal in bold and percentage similarity is given below the diagonal (normal face).
[0551] The percentage identity between the BZR polypeptide sequences of Table B4 compared to SEQ ID NO: 239 ranges from 22.5% to 88.2%.
TABLE-US-00022 TABLE B4 MatGAT results for global similarity and identity over the full length of the polypeptide sequences.
                    1     2     3     4     5     6     8     9     10    11    20    21    22    23
1. AT1G19350.1      -     88.2  40.7  51.9  39.8  25.1  65.4  41.0  54.7  54.9  40.6  23.2  41.1  56.5
2. AT1G75080.1      92.3  -     39.4  53.4  39.3  24.0  65.7  40.9  55.4  55.5  40.6  23.0  40.5  55.1
3. AT1G78700.1      56.1  56.5  -     46.0  62.1  25.1  40.6  67.1  46.2  46.7  59.5  22.8  59.3  45.5
4. AT3G50750.1      62.7  64.0  55.1  -     47.3  33.1  54.3  42.9  58.6  60.3  40.6  22.5  40.9  57.4
5. AT4G18890.1      54.6  53.3  72.3  61.6  -     27.9  42.1  57.2  47.0  48.7  52.2  22.1  52.2  46.2
6. AT4G36780.1      35.2  35.4  33.2  42.0  36.6  -     26.5  24.7  29.0  29.8  23.6  18.7  26.5  28.0
8. Gm\1762729       78.5  78.0  55.7  67.2  55.6  37.0  -     43.1  56.9  56.1  41.9  24.6  41.6  55.6
9. Gm\1765606       58.8  58.6  80.8  53.6  68.3  31.1  56.6  -     45.4  44.4  56.6  22.6  58.4  43.9
10. Gm\1768381      69.9  71.7  59.1  71.3  57.4  36.1  73.3  56.9  -     93.6  41.9  21.4  44.0  63.1
11. Gm\1768507      69.9  69.3  56.6  72.1  56.5  37.7  71.4  56.6  96.1  -     40.1  21.0  43.2  64.0
20. Os01g0203000    53.4  52.1  69.9  49.9  62.2  31.2  52.1  69.6  52.6  52.1  -     23.3  64.8  43.4
21. Os02g0129600    36.9  37.2  38.2  34.8  35.3  26.2  36.1  37.2  35.9  34.0  37.4  -     23.3  22.6
22. Os06g0552300    53.5  52.7  71.0  51.5  62.3  31.8  53.5  70.7  56.3  55.2  78.4  36.6  -     43.5
23. Os07g0580500    69.6  69.0  58.5  71.1  58.1  37.2  71.1  56.3  75.2  76.3  54.2  35.1  55.2  -
Example 4
Identification of Domains Comprised in Polypeptide Sequences Useful in Performing the Methods of the Invention
4.1. Root Hairless 1 (RHL1)
[0552] Identification of highly conserved sequence motifs in the RHL1 polypeptides of Table A was carried out using the MEME system (Timothy et al. 1998. Journal of Computational Biology, Vol. 5, pp. 211-221; Timothy et al. 1998. Bioinformatics, Vol. 14, pp. 48-54).
TABLE-US-00023 TABLE C1 MEME scan results (major accession numbers) of the polypeptide sequence as represented by SEQ ID NO: 2.
Motif     Position of motif in SEQ ID NO: 2: Start   E-value**   Sequence*
Motif 1   63                                         1.5e-118    [IV]R[RK][KG][SG]QRK[NS][RK][FY]LFSFPGLLAP
Motif 2   85                                         3.2e-141    SGG[KR][IV]G[ED]L[KA]DL[GD]TKNP[ILV]LYLDFPQG[RQ]MKL
Motif 3   244                                        6.9e-073    TP[VS]RQSARTAGKK[FL][KN][FY][AT]ExSS
Motif 4   155                                        9.6e-072    GTK[ED]ENPEE[LA][RK]L[DE]FPKE[LF]Q[ENQ][GD]
Motif 5   34                                         2.3e-061    [SN][GN][NL]L[LQV][SR][EDG]xP[AS][KA]PR[SA][APS]LAPSK[TAG]VL[KR][HL][HQ]G[KR]D
Motif 6   177                                        1.1e-036    HA[ED][CY]DFKGGAGAA[CS]D[ES][KA]Q
Motif 7   198                                        2.5e-009    [KSN][KEP]P[GEK][EKT][KTE][YT][VT][EG][EPST][ELQ]SP[KE][IT][ED][SLV][ED][DI][DV][LS]S[ED][DE][SD][NDS][LD]K[DK]
Motif 8   334                                        7.3e-004    KG[PA]AAKKQRASP[EM][EA]K[HQ]P[TA]G[KI]K
*Amino acids given between brackets indicate any of the possible amino acids at the given position.
**E-value: Expectation value. The number of different alignments with scores equivalent to or better than a given score that are expected to occur in a database search by chance. The lower the E-value, the more significant the score.
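The bracket notation used for the motifs (alternatives listed between square brackets) maps directly onto a regular expression character class, so a motif can be searched for in a candidate sequence as sketched below. Treating the lowercase letter x as "any amino acid" is an assumption of this sketch, as are the function name and the hypothetical test sequences.

```python
import re


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile a bracket-notation motif into a regular expression.

    Whitespace left over from table wrapping is removed, and a lowercase
    'x' is interpreted as a wildcard for any residue (assumption).
    """
    cleaned = "".join(motif.split())
    return re.compile(cleaned.replace("x", "."))


# Motif 3 as listed in the MEME results of this example.
motif3 = motif_to_regex("TP[VS]RQSARTAGKK[FL][KN][FY][AT]ExSS")

# Hypothetical sequences, for illustration only.
candidate = "MK" + "TPVRQSARTAGKKFKFAEQSS" + "LL"
non_match = "MK" + "TPARQSARTAGKKFKFAEQSS" + "LL"  # 'A' is not allowed at position 3

hit = motif3.search(candidate)
print((hit.start() + 1, hit.end()) if hit else None)  # 1-based start and end
print(motif3.search(non_match))
```

A regular expression gives only a yes/no match; it does not reproduce the position-specific scoring that MEME uses to compute its E-values.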
4.2. Tryptichon (TRY-Like)
[0553] The Integrated Resource of Protein Families, Domains and Sites (InterPro) database is an integrated interface for the commonly used signature databases for text- and sequence-based searches. The InterPro database combines these databases, which use different methodologies and varying degrees of biological information about well-characterized proteins to derive protein signatures. Collaborating databases include SWISS-PROT, PROSITE, TrEMBL, PRINTS, ProDom, Pfam, SMART and TIGRFAMs. Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families. Pfam is hosted at the Sanger Institute server in the United Kingdom. InterPro is hosted at the European Bioinformatics Institute in the United Kingdom.
[0554] The results of the InterPro scan of the polypeptide sequence as represented by SEQ ID NO: 76 are presented in Table C2.
TABLE-US-00024 TABLE C2 InterPro scan results (major accession numbers) of the polypeptide sequence as represented by SEQ ID NO: 76.
Method        Accession          Domain               start   stop   E-value
HMMPanther    PTHR10641:SF26     TRIPTYCHON AND CPC   32      106    1.20E-60
HMMPanther    PTHR10641          MYB-RELATED          32      106    1.20E-60
Gene3D        G3DSA:1.10.10.60   no description       31      79     3.60E-08
HMMPfam       PF00249            Myb_DNA-binding      30      75     5.00E-07
superfamily   SSF46689           Homeodomain-like     26      79     6.90E-07
HMMSmart      SM00717            SANT                 29      77     3.10E-06
ProfileScan   PS50090            MYB_LIKE             34      71     6.307
4.3. Brassinazole Resistant1 (BZR1)
[0555] The Integrated Resource of Protein Families, Domains and Sites (InterPro) database is an integrated interface for the commonly used signature databases for text- and sequence-based searches. The InterPro database combines these databases, which use different methodologies and varying degrees of biological information about well-characterized proteins to derive protein signatures. Collaborating databases include SWISS-PROT, PROSITE, TrEMBL, PRINTS, ProDom, Pfam, SMART and TIGRFAMs. Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families. Pfam is hosted at the Sanger Institute server in the United Kingdom. InterPro is hosted at the European Bioinformatics Institute in the United Kingdom.
[0556] The results of the InterPro scan of the polypeptide sequence as represented by SEQ ID NO: 239 are presented in Table C3. The InterPro family corresponding to the BZR polypeptide is IPR008540.
TABLE-US-00025 TABLE C3 InterPro scan results (major accession numbers) of the polypeptide sequence as represented by SEQ ID NO: 239.
Database   Accession number   Accession name   E-value      Amino acid coordinates on SEQ ID NO: 239
Pfam       PF05687            DUF822           7.199E-89    10-157
Example 5
Prediction of Secondary Structure Features of TGase
[0557] Coiled coils usually contain a repeated seven amino acid residue pattern called a heptad repeat. Coiled coils are important to identify for protein-protein interactions, such as oligomerization, either of identical proteins, of proteins of the same family, or of unrelated proteins. Recently much progress has been made in computational prediction of coiled coils from sequence data. Many algorithms well known to a person skilled in the art are available at the ExPASy Proteomics tools. One of them, COILS, is a program that compares a sequence to a database of known parallel two-stranded coiled-coils and derives a similarity score. By comparing this score to the distribution of scores in globular and coiled-coil proteins, the program then calculates the probability that the sequence will adopt a coiled-coil conformation.
[0558] The TGase polypeptide as represented by SEQ ID NO: 45 has at least one predicted coiled coil domain, with a high probability, in all three windows (14, 21 and 28) examined. In Table D1, the residue coordinates, residues, the three windows and corresponding probability values are shown. FIG. 6 shows the graphical output of the COILS algorithm on the polypeptide as represented by SEQ ID NO: 45, where at least one predicted coiled coil is clearly visible in all three windows (as represented by the three lines).
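As an illustration of how the numerical COILS output may be summarised, the sketch below groups consecutive residues whose probability exceeds 0.9 into segments. The function name is hypothetical; the data are a short excerpt of the window 14 probabilities for residues 38 to 42 and 63 to 65 of Table D1, and the excerpt deliberately skips the residues in between.

```python
from typing import List, Tuple


def high_probability_segments(rows: List[Tuple[int, float]],
                              threshold: float = 0.9) -> List[Tuple[int, int]]:
    """Group consecutive residue positions with probability > threshold.

    `rows` holds (residue position, probability) pairs in ascending order.
    Returns (first, last) residue positions of each uninterrupted segment.
    """
    segments = []
    start = end = None
    for pos, prob in rows:
        if prob > threshold:
            if start is not None and pos == end + 1:
                end = pos                       # extend the current segment
            else:
                if start is not None:
                    segments.append((start, end))
                start = end = pos               # open a new segment
        elif start is not None:
            segments.append((start, end))       # close the current segment
            start = end = None
    if start is not None:
        segments.append((start, end))
    return segments


# Excerpt of window 14 probabilities (non-contiguous, for illustration only).
window14 = [(38, 0.813), (39, 0.980), (40, 0.980), (41, 0.980), (42, 0.996),
            (63, 0.971), (64, 0.921), (65, 0.848)]
print(high_probability_segments(window14))
```

Applied to the complete table, the same grouping could be run separately on each of the three windows to compare the extent of the predicted coiled coil regions.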
TABLE-US-00026 TABLE D1 Numerical output of the COILS algorithm on the polypeptide as represented by SEQ ID NO: 45. The residue coordinates (#), residues, the three windows and corresponding probability values are shown. Probabilities above 0.9 are shown in bold. # Residue Window 14 Prob Window 21 Prob Window 28 Prob 29 R e 0.001 e 0.082 e 0.946 30 Q f 0.003 f 0.082 f 0.946 31 P g 0.003 g 0.082 g 0.946 32 L a 0.708 a 0.991 a 1.000 33 D b 0.708 b 0.991 b 1.000 34 R c 0.708 c 0.991 c 1.000 35 A d 0.737 d 0.991 d 1.000 36 A e 0.737 e 0.993 e 1.000 37 T f 0.737 f 0.993 f 1.000 38 A g 0.813 g 0.999 g 1.000 39 L a 0.980 a 1.000 a 1.000 40 E b 0.980 b 1.000 b 1.000 41 I c 0.980 c 1.000 c 1.000 42 L d 0.996 d 1.000 d 1.000 43 E e 0.996 e 1.000 e 1.000 44 K f 0.996 f 1.000 f 1.000 45 K g 0.996 g 1.000 g 1.000 46 L a 0.996 a 1.000 a 1.000 47 A b 0.996 b 1.000 b 1.000 48 E c 0.996 c 1.000 c 1.000 49 Q d 0.996 d 1.000 d 1.000 50 T e 0.996 e 1.000 e 1.000 51 A f 0.996 f 1.000 f 1.000 52 E g 0.996 g 1.000 g 1.000 53 A a 0.996 a 1.000 a 1.000 54 E b 0.996 b 1.000 b 1.000 55 K c 0.996 c 1.000 c 1.000 56 L d 0.996 d 1.000 d 1.000 57 I e 0.971 e 1.000 e 1.000 58 R f 0.971 f 1.000 f 1.000 59 E g 0.971 g 1.000 g 1.000 60 N a 0.971 a 1.000 a 1.000 61 Q b 0.971 b 1.000 b 1.000 62 R c 0.971 c 1.000 c 1.000 63 L d 0.971 d 1.000 d 1.000 64 A e 0.921 e 0.997 e 1.000 65 S f 0.848 f 0.989 f 1.000 66 S g 0.231 g 0.917 g 1.000 67 H a 0.099 a 0.537 a 0.999 68 V b 0.179 b 0.379 b 0.979 69 V c 0.662 c 0.379 c 0.979 70 L d 0.736 d 0.379 d 0.979 71 R e 0.736 e 0.379 e 0.939 72 Q f 0.736 f 0.379 f 0.939 73 D g 0.736 g 0.297 g 0.939 74 I a 0.736 a 0.297 a 0.939 75 V b 0.736 b 0.297 b 0.939 76 D c 0.736 c 0.297 c 0.939 77 T d 0.736 d 0.297 d 0.939 78 E e 0.736 e 0.297 e 0.939 79 K f 0.736 f 0.297 f 0.939 80 E g 0.736 g 0.297 g 0.939 81 M a 0.736 a 0.297 a 0.939 82 Q b 0.736 b 0.297 b 0.900 83 M c 0.736 c 0.297 c 0.560 84 I d 0.265 d 0.297 d 0.560 85 R e 0.265 e 0.297 e 0.560 86 A f 0.151 f 0.297 f 
0.407 87 H g 0.047 g 0.297 g 0.123 88 L a 0.047 a 0.297 a 0.123 89 G b 0.026 b 0.297 b 0.123 90 D c 0.026 c 0.297 c 0.123 91 V d 0.026 d 0.108 d 0.123 92 Q e 0.013 e 0.108 e 0.123 93 T b 0.009 b 0.086 b 0.123 94 E c 0.025 c 0.259 c 0.123 95 T d 0.025 d 0.259 d 0.123 96 D e 0.025 e 0.259 e 0.123 97 M f 0.051 f 0.259 f 0.123 98 H g 0.159 g 0.259 g 0.124 99 M a 0.517 a 0.259 a 0.124 100 R b 0.556 b 0.259 b 0.124 101 D c 0.707 c 0.259 c 0.124 102 L d 0.707 d 0.259 d 0.124 103 M e 0.707 e 0.259 e 0.124 104 E f 0.707 f 0.259 f 0.124 105 R g 0.707 g 0.259 g 0.124 106 M a 0.707 a 0.259 a 0.124 107 R b 0.707 b 0.259 b 0.124 108 L c 0.707 c 0.259 c 0.124 109 M d 0.707 d 0.259 d 0.124 110 E e 0.707 e 0.259 e 0.124 111 A f 0.707 f 0.259 f 0.124 112 D g 0.707 g 0.259 g 0.124 113 I a 0.707 a 0.259 a 0.124 114 Q b 0.707 b 0.259 b 0.124 115 A c 0.561 c 0.092 c 0.124 116 G d 0.028 d 0.057 d 0.124 117 D b 0.040 b 0.424 b 0.842 118 A c 0.042 c 0.452 c 0.842 119 V d 0.074 d 0.452 d 0.918 120 K e 0.349 e 0.639 e 0.918 121 K f 0.349 f 0.639 f 0.918 122 E g 0.349 g 0.639 g 0.918 123 L a 0.349 a 0.639 a 0.918 124 H b 0.349 b 0.639 b 0.918 125 Q c 0.349 c 0.639 c 0.918 126 V d 0.349 d 0.639 d 0.918 127 H e 0.349 e 0.639 e 0.918 128 M f 0.349 f 0.639 f 0.918 129 E g 0.349 g 0.797 g 0.918 130 A a 0.349 a 0.797 a 0.918 131 K b 0.349 b 0.930 b 0.918 132 R c 0.349 c 0.930 c 0.918 133 L d 0.349 d 0.930 d 0.918 134 I e 0.190 e 0.930 e 0.918 135 A f 0.190 f 0.930 f 0.918 136 E g 0.190 g 0.930 g 0.918 137 R a 0.190 a 0.930 a 0.918 138 Q b 0.376 b 0.930 b 0.918 139 M c 0.376 c 0.930 c 0.918 140 L d 0.426 d 0.930 d 0.918 141 T e 0.426 e 0.930 e 0.918 142 V f 0.426 f 0.930 f 0.918 143 E g 0.426 g 0.930 g 0.918 144 M a 0.426 a 0.930 a 0.918 145 D b 0.426 b 0.930 b 0.918 146 K c 0.426 c 0.930 c 0.918 147 V d 0.426 d 0.930 d 0.918 148 T e 0.426 e 0.930 e 0.838 149 K f 0.426 f 0.930 f 0.838 150 E g 0.426 g 0.930 g 0.838 151 L a 0.426 a 0.930 a 0.838 152 H b 0.426 b 0.708 b 0.838 153 K c 0.426 c 0.708 c 
0.838 154 F d 0.066 d 0.334 d 0.838 155 S e 0.055 e 0.334 e 0.838 156 G f 0.055 f 0.099 f 0.792 157 D g 0.018 g 0.062 g 0.492 158 S e 0.071 e 0.755 e 0.146 159 K f 0.176 f 0.908 f 0.201 160 K g 0.176 g 0.908 g 0.288 161 L a 0.333 a 0.908 a 0.288 162 P b 0.333 b 0.908 b 0.288 163 E c 0.804 c 0.916 c 0.288 164 L d 0.804 d 0.916 d 0.288 165 L e 0.804 e 0.916 e 0.288 166 T f 0.807 f 0.916 f 0.288 167 E g 0.856 g 0.916 g 0.288 168 L a 0.856 a 0.916 a 0.288 169 D b 0.856 b 0.916 b 0.288 170 G c 0.856 c 0.916 c 0.288 171 L d 0.856 d 0.916 d 0.288 172 R e 0.856 e 0.916 e 0.288 173 K f 0.856 f 0.916 f 0.288 174 E g 0.856 g 0.916 g 0.288 175 H a 0.856 a 0.916 a 0.288 176 Q b 0.856 b 0.916 b 0.288 177 S c 0.856 c 0.916 c 0.288 178 L d 0.856 d 0.916 d 0.288 179 R e 0.856 e 0.916 e 0.288 180 S f 0.856 f 0.916 f 0.288 181 A g 0.405 g 0.916 g 0.288 182 F a 0.102 a 0.916 a 0.288 183 E b 0.102 b 0.916 b 0.288 184 Y c 0.016 c 0.283 c 0.288 185 E c 0.085 c 0.283 c 0.288 186 K d 0.085 d 0.283 d 0.638 187 N e 0.085 e 0.283 e 0.638 188 T f 0.085 f 0.058 f 0.638 189 N g 0.085 g 0.025 g 0.638 190 I a 0.085 a 0.026 a 0.638 191 K b 0.085 b 0.232 b 0.938 192 Q c 0.085 c 0.232 c 0.951 193 V d 0.085 d 0.696 d 0.988 194 E f 0.087 f 0.967 f 1.000 195 Q g 0.087 g 0.967 g 1.000 196 M a 0.087 a 0.967 a 1.000 197 R b 0.087 b 0.967 b 1.000 198 T c 0.087 c 0.967 c 1.000 199 M d 0.203 d 0.967 d 1.000 200 E e 0.497 e 0.967 e 1.000 201 M f 0.497 f 0.967 f 1.000 202 N g 0.497 g 0.994 g 1.000 203 L a 0.497 a 0.994 a 1.000 204 M b 0.497 b 0.994 b 1.000 205 T c 0.497 c 0.997 c 1.000 206 M d 0.792 d 0.997 d 1.000 207 T e 0.884 e 0.997 e 1.000 208 K f 0.993 f 0.997 f 1.000 209 E g 0.993 g 0.997 g 1.000 210 A a 0.993 a 0.997 a 1.000 211 D b 0.993 b 0.997 b 1.000 212 K c 0.993 c 0.997 c 1.000 213 L d 0.993 d 0.997 d 1.000 214 R e 0.993 e 0.997 e 1.000 215 A f 0.993 f 0.997 f 1.000 216 D g 0.993 g 0.997 g 1.000 217 V a 0.993 a 0.997 a 1.000 218 A b 0.993 b 0.997 b 1.000 219 N c 0.993 c 0.997 c 1.000 220 A d 0.993 
d 0.997 d 1.000 221 E e 0.993 e 0.997 e 1.000 222 K f 0.993 f 0.997 f 1.000 223 R g 0.978 g 0.997 g 0.999 224 A a 0.978 a 0.997 a 0.999 225 Q b 0.978 b 0.997 b 0.999 226 V c 0.745 c 0.996 c 0.999 227 A d 0.560 d 0.996 d 0.999 228 A e 0.348 e 0.989 e 0.999 229 A f 0.348 f 0.982 f 0.999 230 Q g 0.348 g 0.968 g 0.999 231 A a 0.296 a 0.968 a 0.999 232 V g 0.184 g 0.778 g 0.999 233 A a 0.184 a 0.632 a 0.999 234 A b 0.184 b 0.598 b 0.999 235 Q c 0.184 c 0.598 c 0.999 236 A d 0.184 d 0.598 d 0.998 237 G e 0.013 e 0.069 e 0.840 238 V f 0.004 f 0.069 f 0.840 239 A b 0.001 b 0.069 b 0.742 240 H c 0.001 c 0.017 c 0.383 241 V d 0.001 d 0.007 d 0.085 242 T e 0.001 e 0.001 e 0.042 243 A f 0.001 f 0.001 f 0.042 244 S g 0.000 g 0.001 g 0.013 245 Q f 0.000 f 0.001 f 0.002 246 P b 0.000 b 0.000 b 0.000
Example 6
Topology Prediction of the Polypeptide Sequences Useful in Performing the Methods of the Invention
6.1. Transglutaminases (TGases)
[0559] Many algorithms can be used to perform prediction of the subcellular localisation of polypeptides, including:
[0560] TargetP 1.1 hosted on the server of the Technical University of Denmark;
[0561] ChloroP 1.1 hosted on the server of the Technical University of Denmark;
[0562] Protein Prowler Subcellular Localisation Predictor version 1.2 hosted on the server of the Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia;
[0563] PENCE Proteome Analyst PA-GOSUB 2.5 hosted on the server of the University of Alberta, Edmonton, Alberta, Canada;
[0564] TMHMM, hosted on the server of the Technical University of Denmark.
[0565] By comparing the polypeptide sequence of SEQ ID NO: 45 with orthologs from other plant species for which subcellular localisation was identified, it is possible to deduce that the subcellular localization of the polypeptide sequence as represented by SEQ ID NO: 45 is the chloroplast (Villalobos et al. (2004) Gene 336: 93-104).
6.2. Tryptichon (TRY-Like)
[0566] TargetP 1.1 predicts the subcellular location of eukaryotic proteins. The location assignment is based on the predicted presence of any of the N-terminal pre-sequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP). Scores on which the final prediction is based are not really probabilities, and they do not necessarily add to one. However, the location with the highest score is the most likely according to TargetP, and the relationship between the scores (the reliability class) may be an indication of how certain the prediction is. The reliability class (RC) ranges from 1 to 5, where 1 indicates the strongest prediction. TargetP is maintained at the server of the Technical University of Denmark.
[0567] For the sequences predicted to contain an N-terminal presequence a potential cleavage site can also be predicted.
[0568] A number of parameters were selected, such as organism group (non-plant or plant), cutoff sets (none, predefined set of cutoffs, or user-specified set of cutoffs), and the calculation of prediction of cleavage sites (yes or no).
[0569] The results of TargetP 1.1 analysis of the polypeptide sequence as represented by SEQ ID NO: 76 are presented in Table E1. The "plant" organism group was selected, no cutoffs were defined, and the predicted length of the transit peptide was requested. No transit peptide is predicted, so the subcellular localization of the polypeptide sequence as represented by SEQ ID NO: 76 may be the cytoplasm or nucleus.
TABLE-US-00027 TABLE E1 TargetP 1.1 analysis of the polypeptide sequence as represented by SEQ ID NO: 76
Length (AA)                        106
Chloroplastic transit peptide      0.048
Mitochondrial transit peptide      0.348
Secretory pathway signal peptide   0.046
Other subcellular targeting        0.822
Predicted Location                 /
Reliability class                  3
Predicted transit peptide length   /
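The decision rule described for TargetP (highest score wins, reliability class derived from the relationship between the scores) can be sketched as follows. This is a minimal illustration and not the TargetP implementation: the function name and the dictionary keys are hypothetical, and the reliability-class boundaries (steps of 0.2 in the difference between the two highest scores) are an assumption based on the published TargetP 1.1 output description. Applied to the scores of Table E1, the sketch returns "other" with reliability class 3, in agreement with the table.

```python
def targetp_call(scores):
    """Return (location, reliability_class) from a dict of TargetP-style scores.

    Hypothetical helper: the location with the highest score is reported, and
    the reliability class is derived from the gap to the second-highest score.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best_location, best), (_, second) = ranked[0], ranked[1]
    diff = best - second
    # Assumed boundaries: RC 1 if diff > 0.8, 2 if > 0.6, 3 if > 0.4,
    # 4 if > 0.2, otherwise 5 (1 is the strongest prediction).
    for reliability_class, lower_bound in ((1, 0.8), (2, 0.6), (3, 0.4), (4, 0.2)):
        if diff > lower_bound:
            return best_location, reliability_class
    return best_location, 5


# Scores taken from Table E1 (SEQ ID NO: 76).
table_e1 = {"cTP": 0.048, "mTP": 0.348, "SP": 0.046, "other": 0.822}
location, reliability_class = targetp_call(table_e1)
print(location, reliability_class)
```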
[0570] When analysed with PSort, the probability of a nuclear localisation is 0.700; the protein is therefore likely a nuclear protein.
[0571] Many other algorithms can be used to perform such analyses, including:
[0572] ChloroP 1.1 hosted on the server of the Technical University of Denmark;
[0573] Protein Prowler Subcellular Localisation Predictor version 1.2 hosted on the server of the Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia;
[0574] PENCE Proteome Analyst PA-GOSUB 2.5 hosted on the server of the University of Alberta, Edmonton, Alberta, Canada;
[0575] TMHMM, hosted on the server of the Technical University of Denmark;
[0576] PSORT (URL: psort.org);
[0577] PLOC (Park and Kanehisa, Bioinformatics, 19, 1656-1663, 2003).
Example 7
Assay Related to the Polypeptide Sequences Useful in Performing the Methods of the Invention
7.1. Functional Assay of Root Hairless 1 (RHL1)
[0578] The DNA binding of an RHL1 polypeptide is assayed in vitro essentially as described by Sugimoto-Shirasu et al. 2005 PNAS vol. 102 no. 51, 18736-18741. Briefly, recombinant RHL1 protein is produced in a bacterial system and purified using standard methods. Purified RHL1 protein is incubated with DNA fragments, and binding of the RHL1 protein to the DNA fragments is detected using surface plasmon resonance (SPR).
7.2. Transglutaminases (TGases)
[0579] Polypeptides useful in performing the methods of the invention typically catalyze the formation of amide linkages, generally in a calcium-dependent fashion, between the primary amine of an amine donor substrate and the γ-carboxamide group of peptide-bound endo-glutamine residues in proteins or polypeptides that are the amine acceptors. More specifically, TGase activity can be measured using the radiolabeled putrescine method, or the gamma-glutamyl biotin cadaverine method, as described in Villalobos et al. (2004; supra).
[0580] A person skilled in the art is well aware of such experimental procedures to measure TGase polypeptide enzymatic activity, including the activity of a TGase polypeptide as represented by SEQ ID NO: 45.
7.3. Functional Assay of Tryptichon (TRY-Like)
[0581] TRY-like polypeptides typically have DNA-binding activity. One method for measuring and characterising DNA-binding properties of polypeptides is described in Xue (A CELD-fusion method for rapid determination of the DNA-binding sequence specificity of novel plant DNA-binding proteins. Plant Journal 41, 638-649, 2005).
Example 8
Cloning of the Nucleic Acid Sequence Used in the Methods of the Invention
8.1. Root Hairless 1 (RHL1)
[0582] The nucleic acid sequence used in the methods of the invention was amplified by PCR using as template a custom-made Arabidopsis thaliana seedlings cDNA library (in pCMV Sport 6.0; Invitrogen, Paisley, UK). PCR was performed using Hifi Taq DNA polymerase in standard conditions, using 200 ng of template in a 50 μl PCR mix. The primers used were: 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatggtacgagcttcatcgtc-3' (SEQ ID NO: 40; sense) and 5'-ggggaccactttgtacaagaaagctgggtttctggaaaagatttctttaagc-3' (SEQ ID NO: 41; reverse), which include the AttB sites for Gateway recombination. The amplified PCR fragment was also purified using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment recombined in vivo with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone", pArath_RHL1. Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.
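The structure of such Gateway primers (an attB adapter followed by a gene-specific part) can be illustrated with a short sketch. The function name is hypothetical, and the adapter constants are the standard attB1 and attB2 sequences as commonly published for the Gateway system, stated here as an assumption rather than taken from this description. The sketch strips the adapter from the primers of SEQ ID NO: 40 and SEQ ID NO: 41 and, for the sense primer, locates the ATG start codon in the remainder.

```python
# Standard attB adapters (assumed from the commonly published Gateway sequences).
ATTB1 = "ggggacaagtttgtacaaaaaagcaggct"  # forward (sense) primers
ATTB2 = "ggggaccactttgtacaagaaagctgggt"  # reverse primers


def split_gateway_primer(primer, adapter):
    """Return the part of the primer that follows the attB adapter.

    Spaces (layout residue) are removed and the sequence is lower-cased
    before the adapter is checked.
    """
    primer = primer.replace(" ", "").lower()
    if not primer.startswith(adapter):
        raise ValueError("primer does not start with the expected attB adapter")
    return primer[len(adapter):]


sense = "ggggacaagtttgtacaaaaaagcaggcttaaacaatggtacgagcttcatcgtc"   # SEQ ID NO: 40
reverse = "ggggaccactttgtacaagaaagctgggtttctggaaaagatttctttaagc"    # SEQ ID NO: 41

sense_tail = split_gateway_primer(sense, ATTB1)
reverse_tail = split_gateway_primer(reverse, ATTB2)

# In the sense primer, a short spacer precedes the ATG start codon.
start = sense_tail.find("atg")
print("spacer:", sense_tail[:start])
print("gene-specific (sense):", sense_tail[start:])
print("gene-specific (reverse):", reverse_tail)
```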
[0583] The entry clone comprising SEQ ID NO: 1 was then used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vivo recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice GOS2 promoter (SEQ ID NO: 42) for constitutive expression was located upstream of this Gateway cassette.
[0584] After the LR recombination step, the resulting expression vector pGOS2::Arath_RHL1 (FIG. 3) was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.
8.2. Transglutaminases (TGases)
[0585] The Oryza sativa nucleic acid sequence encoding a TGase polypeptide sequence as represented by SEQ ID NO: 45 was amplified by PCR using as template a cDNA bank constructed using RNA from rice plants at different developmental stages. The following primers, which include the AttB sites for Gateway recombination, were used for PCR amplification: prm02265 (SEQ ID NO: 73, sense): 5'-ggggacaagtttgtacaaaaaagcaggcttcacaatggcataccatggacag-3' and prm02266 (SEQ ID NO: 74, reverse, complementary): 5'-ggggaccactttgtacaagaaagctgggtatttcacctctggcctg-3'. PCR was performed using Hifi Taq DNA polymerase in standard conditions. A PCR fragment of the expected length (including attB sites) was amplified and also purified using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment recombined in vivo with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone". Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.
[0586] The entry clone comprising SEQ ID NO: 44 was subsequently used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vivo recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice alpha-globulin promoter (SEQ ID NO: 72) for seed-specific expression was located upstream of this Gateway cassette.
[0587] After the LR recombination step, the resulting expression vector pGlob::TGase (FIG. 4) for seed-specific expression was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.
8.3. Tryptichon (TRY-Like)
[0588] The nucleic acid sequence used in the methods of the invention was amplified by PCR using as template a custom-made Arabidopsis thaliana seedlings cDNA library (in pCMV Sport 6.0; Invitrogen, Paisley, UK). PCR was performed using Hifi Taq DNA polymerase in standard conditions, using 200 ng of template in a 50 μl PCR mix. The primers used were prm09014 (SEQ ID NO: 233; sense, start codon in bold): 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatggataacactgaccgtcgt-3' and prm09015 (SEQ ID NO: 234; reverse, complementary): 5'-ggggaccactttgtacaagaaagctgggttttttcgttggcttaaaaaca-3', which include the AttB sites for Gateway recombination. The amplified PCR fragment was also purified using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment recombined in vivo with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone", pTRY-like. Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.
[0589] The entry clone comprising SEQ ID NO: 75 was then used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vivo recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice GOS2 promoter (SEQ ID NO: 237) for constitutive expression was located upstream of this Gateway cassette. In an alternative embodiment, a root-specific promoter (RCc3 promoter; SEQ ID NO: 235) was located upstream of this Gateway cassette.
[0590] After the LR recombination step, the resulting expression vector pGOS2::TRY-like (FIG. 3) or pRCc3::TRY was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.
8.4. Brassinazole Resistant1 (BZR1)
[0591] The nucleic acid sequence used in the methods of the invention was amplified by PCR using as template a custom-made Arabidopsis thaliana seedlings cDNA library (in pCMV Sport 6.0; Invitrogen, Paisley, UK). PCR was performed using Hifi Taq DNA polymerase in standard conditions, using 200 ng of template in a 50 μl PCR mix. The primers used were (SEQ ID NO: 320; sense): 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatgacggcatcaggagga-3' and (SEQ ID NO: 321; reverse, complementary): 5'-ggggaccactttgtacaagaaagctgggtaccacgatattaacctagccg-3', which include the AttB sites for Gateway recombination. The amplified PCR fragment was also purified using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment recombined in vivo with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone", pBZR. Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.
[0592] The entry clone comprising SEQ ID NO: 238 was then used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vivo recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice GOS2 promoter (SEQ ID NO: 322) for constitutive expression was located upstream of this Gateway cassette.
[0593] After the LR recombination step, the resulting expression vector pGOS2::BZR (FIG. 3) was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.
Example 9
Plant Transformation
Rice Transformation
[0594] The Agrobacterium containing the expression vector was used to transform Oryza sativa plants. Mature dry seeds of the rice japonica cultivar Nipponbare were dehusked. Sterilization was carried out by incubating for one minute in 70% ethanol, followed by 30 minutes in 0.2% HgCl2, followed by six 15-minute washes with sterile distilled water. The sterile seeds were then germinated on a medium containing 2,4-D (callus induction medium). After incubation in the dark for four weeks, embryogenic, scutellum-derived calli were excised and propagated on the same medium. After two weeks, the calli were multiplied or propagated by subculture on the same medium for another 2 weeks. Embryogenic callus pieces were sub-cultured on fresh medium 3 days before co-cultivation (to boost cell division activity).
[0595] Agrobacterium strain LBA4404 containing the expression vector was used for co-cultivation. Agrobacterium was inoculated on AB medium with the appropriate antibiotics and cultured for 3 days at 28° C. The bacteria were then collected and suspended in liquid co-cultivation medium to a density (OD600) of about 1. The suspension was then transferred to a Petri dish and the calli immersed in the suspension for 15 minutes. The callus tissues were then blotted dry on a filter paper and transferred to solidified, co-cultivation medium and incubated for 3 days in the dark at 25° C. Co-cultivated calli were grown on 2,4-D-containing medium for 4 weeks in the dark at 28° C. in the presence of a selection agent. During this period, rapidly growing resistant callus islands developed. After transfer of this material to a regeneration medium and incubation in the light, the embryogenic potential was released and shoots developed in the next four to five weeks. Shoots were excised from the calli and incubated for 2 to 3 weeks on an auxin-containing medium from which they were transferred to soil. Hardened shoots were grown under high humidity and short days in a greenhouse.
[0596] Approximately 35 independent T0 rice transformants were generated for one construct. The primary transformants were transferred from a tissue culture chamber to a greenhouse. After a quantitative PCR analysis to verify copy number of the T-DNA insert, only single-copy transgenic plants that exhibited tolerance to the selection agent were kept for harvest of T1 seed. Seeds were then harvested three to five months after transplanting. The method yielded single-locus transformants at a rate of over 50% (Aldemita and Hodges 1996, Chan et al. 1993, Hiei et al. 1994).
Corn Transformation
[0597] Transformation of maize (Zea mays) is performed with a modification of the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. Transformation is genotype-dependent in corn and only specific genotypes are amenable to transformation and regeneration. The inbred line A188 (University of Minnesota) or hybrids with A188 as a parent are good sources of donor material for transformation, but other genotypes can be used successfully as well. Ears are harvested from corn plants approximately 11 days after pollination (DAP) when the length of the immature embryo is about 1 to 1.2 mm. Immature embryos are cocultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. Excised embryos are grown on callus induction medium, then maize regeneration medium, containing the selection agent (for example imidazolinone, but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to maize rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Wheat Transformation
[0598] Transformation of wheat is performed with the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. The cultivar Bobwhite (available from CIMMYT, Mexico) is commonly used in transformation. Immature embryos are co-cultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. After incubation with Agrobacterium, the embryos are grown in vitro on callus induction medium, then regeneration medium, containing the selection agent (for example imidazolinone but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Soybean Transformation
[0599] Soybean is transformed according to a modification of the method described in the Texas A&M patent U.S. Pat. No. 5,164,310. Several commercial soybean varieties are amenable to transformation by this method. The cultivar Jack (available from the Illinois Seed foundation) is commonly used for transformation. Soybean seeds are sterilised for in vitro sowing. The hypocotyl, the radicle and one cotyledon are excised from seven-day old young seedlings. The epicotyl and the remaining cotyledon are further grown to develop axillary nodes. These axillary nodes are excised and incubated with Agrobacterium tumefaciens containing the expression vector. After the cocultivation treatment, the explants are washed and transferred to selection media. Regenerated shoots are excised and placed on a shoot elongation medium. Shoots no longer than 1 cm are placed on rooting medium until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Rapeseed/Canola Transformation
[0600] Cotyledonary petioles and hypocotyls of 5-6 day old young seedlings are used as explants for tissue culture and transformed according to Babic et al. (1998, Plant Cell Rep 17: 183-188). The commercial cultivar Westar (Agriculture Canada) is the standard variety used for transformation, but other varieties can also be used. Canola seeds are surface-sterilized for in vitro sowing. The cotyledon petiole explants with the cotyledon attached are excised from the in vitro seedlings, and inoculated with Agrobacterium (containing the expression vector) by dipping the cut end of the petiole explant into the bacterial suspension. The explants are then cultured for 2 days on MSBAP-3 medium containing 3 mg/l BAP, 3% sucrose, 0.7% Phytagar at 23° C., 16 hr light. After two days of co-cultivation with Agrobacterium, the petiole explants are transferred to MSBAP-3 medium containing 3 mg/l BAP, cefotaxime, carbenicillin, or timentin (300 mg/l) for 7 days, and then cultured on MSBAP-3 medium with cefotaxime, carbenicillin, or timentin and selection agent until shoot regeneration. When the shoots are 5-10 mm in length, they are cut and transferred to shoot elongation medium (MSBAP-0.5, containing 0.5 mg/l BAP). Shoots of about 2 cm in length are transferred to the rooting medium (MS0) for root induction. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Alfalfa Transformation
[0601] A regenerating clone of alfalfa (Medicago sativa) is transformed using the method of McKersie et al. (1999 Plant Physiol 119: 839-847). Regeneration and transformation of alfalfa are genotype-dependent and therefore a regenerating plant is required. Methods to obtain regenerating plants have been described. For example, these can be selected from the cultivar Rangelander (Agriculture Canada) or any other commercial alfalfa variety as described by Brown D C W and A Atanassov (1985. Plant Cell Tissue Organ Culture 4: 111-112). Alternatively, the RA3 variety (University of Wisconsin) has been selected for use in tissue culture (Walker et al., 1978 Am J Bot 65:654-659). Petiole explants are cocultivated with an overnight culture of Agrobacterium tumefaciens C58C1 pMP90 (McKersie et al., 1999 Plant Physiol 119: 839-847) or LBA4404 containing the expression vector. The explants are cocultivated for 3 d in the dark on SH induction medium containing 288 mg/L Pro, 53 mg/L thioproline, 4.35 g/L K2SO4, and 100 μM acetosyringone. The explants are washed in half-strength Murashige-Skoog medium (Murashige and Skoog, 1962) and plated on the same SH induction medium without acetosyringone but with a suitable selection agent and suitable antibiotic to inhibit Agrobacterium growth. After several weeks, somatic embryos are transferred to BOi2Y development medium containing no growth regulators, no antibiotics, and 50 g/L sucrose. Somatic embryos are subsequently germinated on half-strength Murashige-Skoog medium. Rooted seedlings are transplanted into pots and grown in a greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Cotton Transformation
[0602] Cotton is transformed using Agrobacterium tumefaciens according to the method described in U.S. Pat. No. 5,159,135. Cotton seeds are surface sterilised in 3% sodium hypochlorite solution for 20 minutes and washed in distilled water with 500 μg/ml cefotaxime. The seeds are then transferred to SH-medium with 50 μg/ml benomyl for germination. Hypocotyls of 4- to 6-day-old seedlings are removed, cut into 0.5 cm pieces and placed on 0.8% agar. An Agrobacterium suspension (approx. 10⁸ cells per ml, diluted from an overnight culture transformed with the gene of interest and suitable selection markers) is used for inoculation of the hypocotyl explants. After 3 days at room temperature and lighting, the tissues are transferred to a solid medium (1.6 g/l Gelrite) with Murashige and Skoog salts with B5 vitamins (Gamborg et al., Exp. Cell Res. 50:151-158 (1968)), 0.1 mg/l 2,4-D, 0.1 mg/l 6-furfurylaminopurine and 750 μg/ml MgCl2, and with 50 to 100 μg/ml cefotaxime and 400-500 μg/ml carbenicillin to kill residual bacteria. Individual cell lines are isolated after two to three months (with subcultures every four to six weeks) and are further cultivated on selective medium for tissue amplification (30° C., 16 hr photoperiod). Transformed tissues are subsequently further cultivated on non-selective medium for 2 to 3 months to give rise to somatic embryos. Healthy looking embryos of at least 4 mm length are transferred to tubes with SH medium in fine vermiculite, supplemented with 0.1 mg/l indole acetic acid, 6-furfurylaminopurine and gibberellic acid. The embryos are cultivated at 30° C. with a photoperiod of 16 hrs, and plantlets at the 2 to 3 leaf stage are transferred to pots with vermiculite and nutrients. The plants are hardened and subsequently moved to the greenhouse for further cultivation.
Example 10
Phenotypic Evaluation Procedure
10.1 Evaluation Setup
[0603] Approximately 35 independent T0 rice transformants were generated. The primary transformants were transferred from a tissue culture chamber to a greenhouse for growing and harvest of T1 seed. Six events, of which the T1 progeny segregated 3:1 for presence/absence of the transgene, were retained. For each of these events, approximately 10 T1 seedlings containing the transgene (hetero- and homo-zygotes) and approximately 10 T1 seedlings lacking the transgene (nullizygotes) were selected by monitoring visual marker expression. The transgenic plants and the corresponding nullizygotes were grown side-by-side at random positions. Greenhouse conditions were short days (12 hours light), 28° C. in the light and 22° C. in the dark, and a relative humidity of 70%. Plants grown under non-stress conditions were watered at regular intervals to ensure that water and nutrients were not limiting and that plant needs for completing growth and development were satisfied.
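Whether the T1 progeny of an event fit a 3:1 segregation can be checked, for example, with a chi-square goodness-of-fit test. The description does not state which test was applied, so the sketch below is only one common possibility; the function name and the seedling counts are hypothetical. For one degree of freedom the chi-square upper-tail probability can be computed from the complementary error function, which keeps the example free of external libraries.

```python
import math


def segregation_3_to_1(n_with_transgene, n_without_transgene):
    """Chi-square goodness-of-fit test against a 3:1 ratio (1 degree of freedom).

    Returns (chi_square, p_value).
    """
    total = n_with_transgene + n_without_transgene
    expected_with = 0.75 * total
    expected_without = 0.25 * total
    chi_square = ((n_with_transgene - expected_with) ** 2 / expected_with
                  + (n_without_transgene - expected_without) ** 2 / expected_without)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical counts: 31 marker-positive and 9 marker-negative T1 seedlings.
chi_square, p_value = segregation_3_to_1(31, 9)
print(f"chi-square = {chi_square:.3f}, p = {p_value:.3f}")
```

A p value above the chosen significance level means the observed counts are compatible with 3:1 segregation.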
[0604] Four T1 events were further evaluated in the T2 generation following the same evaluation procedure as for the T1 generation but with more individuals per event. From the stage of sowing until the stage of maturity the plants were passed several times through a digital imaging cabinet. At each time point digital images (2048×1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles.
Drought Screen
[0605] Plants from T2 seeds were grown in potting soil under normal conditions until they approached the heading stage. They were then transferred to a "dry" section where irrigation was withheld. Humidity probes were inserted in randomly chosen pots to monitor the soil water content (SWC). When SWC went below certain thresholds, the plants were automatically re-watered continuously until a normal level was reached again. The plants were then transferred back to normal conditions. The rest of the cultivation (plant maturation, seed harvest) was the same as for plants not grown under abiotic stress conditions. Growth and yield parameters were recorded as detailed for growth under normal conditions.
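The automatic re-watering rule (start watering when SWC drops below a threshold, continue until a normal level is reached) behaves like a two-threshold switch. The sketch below illustrates that logic only; the function name and both threshold values are hypothetical, since the description gives no numerical thresholds.

```python
def rewatering_states(swc_readings, low_threshold=0.20, normal_level=0.60):
    """Return, for each SWC reading, whether re-watering is switched on.

    Watering starts when SWC falls below `low_threshold` and stops once SWC
    has returned to `normal_level`. Both values are illustrative assumptions.
    """
    watering = False
    states = []
    for swc in swc_readings:
        if not watering and swc < low_threshold:
            watering = True
        elif watering and swc >= normal_level:
            watering = False
        states.append(watering)
    return states


# Hypothetical probe readings (fraction of soil water content).
readings = [0.50, 0.30, 0.19, 0.40, 0.61, 0.50]
print(rewatering_states(readings))
```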
Nitrogen Use Efficiency Screen
[0606] Rice plants from T2 seeds were grown in potting soil under normal conditions except for the nutrient solution. The pots were watered from transplantation to maturation with a specific nutrient solution containing reduced nitrogen (N) content, usually between 7 and 8 times lower. The rest of the cultivation (plant maturation, seed harvest) was the same as for plants not grown under abiotic stress. Growth and yield parameters were recorded as detailed for growth under normal conditions.
Salt Stress Screen
[0607] Plants were grown on a substrate made of coco fibers and argex (3 to 1 ratio). A normal nutrient solution was used during the first two weeks after transplanting the plantlets in the greenhouse. After the first two weeks, 25 mM of salt (NaCl) was added to the nutrient solution until the plants were harvested. Seed-related parameters were then measured.
10.2 Statistical Analysis: F Test
[0608] A two factor ANOVA (analysis of variance) was used as a statistical model for the overall evaluation of plant phenotypic characteristics. An F test was carried out on all the parameters measured of all the plants of all the events transformed with the gene of the present invention. The F test was carried out to check for an effect of the gene over all the transformation events and to verify for an overall effect of the gene, also known as a global gene effect. The threshold for significance for a true global gene effect was set at a 5% probability level for the F test. A significant F test value points to a gene effect, meaning that it is not only the mere presence or position of the gene that is causing the differences in phenotype.
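The F statistic for a global gene effect can be illustrated on a balanced two-factor layout. The description does not spell out the model, so the sketch assumes the two factors are the transformation event and the presence or absence of the transgene, with the same number of plants in every cell; the function name and the data are hypothetical. Only the F statistic and its degrees of freedom are computed; the comparison with the 5% critical value of the F distribution would be done with a statistical table or library.

```python
def gene_effect_f(cells):
    """F statistic for the genotype main effect in a balanced two-factor ANOVA.

    `cells` maps (event, genotype) -> list of measurements, with the same
    number of replicates (at least 2) in every event x genotype cell.
    Returns (f_statistic, df_gene, df_error).
    """
    events = sorted({event for event, _ in cells})
    genotypes = sorted({genotype for _, genotype in cells})
    n = len(next(iter(cells.values())))
    if len(cells) != len(events) * len(genotypes) or n < 2 \
            or any(len(values) != n for values in cells.values()):
        raise ValueError("a balanced, fully crossed design with replicates is required")

    all_values = [y for values in cells.values() for y in values]
    grand_mean = sum(all_values) / len(all_values)

    # Sum of squares between genotype levels (transgenic vs. nullizygote).
    ss_gene = 0.0
    for genotype in genotypes:
        values = [y for event in events for y in cells[(event, genotype)]]
        ss_gene += len(values) * (sum(values) / len(values) - grand_mean) ** 2

    # Within-cell (error) sum of squares.
    ss_error = sum((y - sum(values) / len(values)) ** 2
                   for values in cells.values() for y in values)

    df_gene = len(genotypes) - 1
    df_error = len(events) * len(genotypes) * (n - 1)
    return (ss_gene / df_gene) / (ss_error / df_error), df_gene, df_error


# Hypothetical measurements for two events, two plants per cell.
data = {
    ("event1", "transgenic"): [11.0, 13.0],
    ("event1", "nullizygote"): [9.0, 11.0],
    ("event2", "transgenic"): [13.0, 15.0],
    ("event2", "nullizygote"): [11.0, 13.0],
}
f_statistic, df_gene, df_error = gene_effect_f(data)
print(f_statistic, df_gene, df_error)
```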
[0609] Because two experiments with overlapping events were carried out, a combined analysis was performed. This is useful to check consistency of the effects over the two experiments, and if this is the case, to accumulate evidence from both experiments in order to increase confidence in the conclusion. The method used was a mixed-model approach that takes into account the multilevel structure of the data (i.e. experiment-event-segregants). P values were obtained by comparing the likelihood ratio test statistic to a chi-square distribution.
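Obtaining a P value from a likelihood ratio test can be sketched as follows. The mixed models themselves are not fitted here; the two log-likelihood values are hypothetical inputs standing for a model with the gene effect and a model without it, and one degree of freedom (a single extra parameter) is assumed. The function name is illustrative.

```python
import math


def likelihood_ratio_pvalue(loglik_full, loglik_reduced):
    """Likelihood ratio test with 1 degree of freedom.

    Returns (statistic, p_value), where statistic = 2 * (loglik_full - loglik_reduced)
    is referred to a chi-square distribution with 1 degree of freedom.
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    if statistic < 0:
        raise ValueError("the full model cannot have a lower likelihood than the reduced model")
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    return statistic, math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical log-likelihoods of the model with and without the gene effect.
statistic, p_value = likelihood_ratio_pvalue(-100.00, -101.92)
print(f"LR statistic = {statistic:.2f}, p = {p_value:.4f}")
```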
10.3 Parameters Measured
Biomass-Related Parameter Measurement
[0610] From the stage of sowing until the stage of maturity the plants were passed several times through a digital imaging cabinet. At each time point digital images (2048×1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles.
[0611] The plant aboveground area (or leafy biomass) was determined by counting the total number of pixels on the digital images from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from the different angles and was converted to a physical surface value expressed in square mm by calibration. Experiments show that the aboveground plant area measured this way correlates with the biomass of plant parts above ground. The above ground area is the area measured at the time point at which the plant had reached its maximal leafy biomass. The early vigour is the plant (seedling) aboveground area three weeks post-germination. Increase in root biomass is expressed as an increase in total root biomass (measured as maximum biomass of roots observed during the lifespan of a plant); or as an increase in the root/shoot index (measured as the ratio between root mass and shoot mass in the period of active growth of root and shoot).
[0612] Early vigour was determined by counting the total number of pixels from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from different angles and was converted to a physical surface value expressed in square mm by calibration. The results described below are for plants three weeks post-germination.
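The conversion from pixel counts to a physical area may be illustrated, in a non-limiting manner, as follows. The calibration factor `mm2_per_pixel`, the function names and the data layout are hypothetical and serve only to show the averaging over view angles and the selection of the maximal area.

```python
from statistics import mean

def aboveground_area_mm2(pixel_counts, mm2_per_pixel):
    """Average plant pixel counts over all view angles and convert to mm^2."""
    if not pixel_counts:
        raise ValueError("need at least one image")
    return mean(pixel_counts) * mm2_per_pixel

def area_max(areas_by_timepoint):
    """Largest aboveground area observed over all imaging time points."""
    return max(areas_by_timepoint.values())

# Hypothetical pixel counts from six angles at one time point.
area = aboveground_area_mm2([1000, 1200, 1100, 900, 1000, 800], 0.25)
```

Early vigour would be the value returned by `aboveground_area_mm2` for the images taken three weeks post-germination.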
Seed-Related Parameter Measurements
[0613] The mature primary panicles were harvested, counted, bagged, barcode-labelled and then dried for three days in an oven at 37° C. The panicles were then threshed and all the seeds were collected and counted. The filled husks were separated from the empty ones using an air-blowing device. The empty husks were discarded and the remaining fraction was counted again. The filled husks were weighed on an analytical balance. The number of filled seeds was determined by counting the number of filled husks that remained after the separation step. The total seed yield was measured by weighing all filled husks harvested from a plant. Total seed number per plant was measured by counting the number of husks harvested from a plant. Thousand Kernel Weight (TKW) is extrapolated from the number of filled seeds counted and their total weight. The Harvest Index (HI) in the present invention is defined as the ratio between the total seed yield and the above ground area (mm²), multiplied by a factor 10⁶. The total number of flowers per panicle as defined in the present invention is the ratio between the total number of seeds and the number of mature primary panicles. The seed fill rate as defined in the present invention is the proportion (expressed as a %) of the number of filled seeds over the total number of seeds (or florets).
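The derived seed-related parameters defined above may be computed as in the following non-limiting sketch. The function and key names are hypothetical, and the weight is assumed to be expressed in grams, a unit that the present description does not state.

```python
def seed_parameters(total_seeds, filled_seeds, filled_weight_g,
                    area_max_mm2, primary_panicles):
    """Derive the seed-related parameters from raw counts and weights."""
    return {
        # Total seed yield: weight of all filled husks from one plant.
        "total_seed_yield_g": filled_weight_g,
        # TKW: weight per filled seed, extrapolated to 1000 seeds.
        "tkw_g": filled_weight_g / filled_seeds * 1000,
        # Harvest index: seed yield over aboveground area (mm^2), times 10^6.
        "harvest_index": filled_weight_g / area_max_mm2 * 1e6,
        # Flowers per panicle: total seeds over mature primary panicles.
        "flowers_per_panicle": total_seeds / primary_panicles,
        # Fill rate: filled seeds as a percentage of total seeds (florets).
        "fill_rate_pct": filled_seeds / total_seeds * 100,
    }

# Hypothetical plant: 500 husks, 400 of them filled, weighing 10 g in total.
params = seed_parameters(500, 400, 10.0, 50000.0, 5)
```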
Example 11
Results of the Phenotypic Evaluation of the Transgenic Plants
11.1. Root Hairless 1 (RHL1)
[0614] The results of the evaluation of transgenic rice plants in the T2 generation expressing the coding region of an Arath_RHL1 nucleic acid (SEQ ID NO: 1) under the nitrogen limitation growth conditions of Example 8 are presented below. An increase of at least 5% was observed for emergence vigour (early vigour, EmerVigor), total seed yield (totalweightseeds), number of filled seeds (Nr filled seeds), harvest index (harvestindex), root biomass (RootMax) and the total number of seeds on a plant (nrtotalseed) (Table F1).
TABLE-US-00028 TABLE F1 Evaluation of transgenic plants expressing the Arath_RHL1 gene under nitrogen limitation growth conditions.
Parameter           % increase in transgenic compared to control plant
EmerVigor           19
RootMax             7.6
totalweightseeds    17
Nr filled seeds     16
harvestindex        12
nrtotalseed         11
[0615] The results of the evaluation of transgenic rice plants in the T1 generation expressing the coding region of an Orysa_RHL1 nucleic acid (SEQ ID NO: 9) from the constitutive promoter GOS2 (SEQ ID NO: 39) under non-stress conditions are presented below (Table F2).
TABLE-US-00029 TABLE F2 Evaluation of transgenic plants expressing an Orysa_RHL1 nucleic acid under non-stress conditions.
Parameter       % increase in transgenic compared to control plant
AreaMax         8.1
TimetoFlower    1.25
RootMax         3.6
totalwgseeds    16.19
nrfilledseed    14.9
fillrate        5.7
harvestindex    10.0
HeightMax       2.6
GNbfFlow        7.4
nrtotalseed     8.6
[0616] The results of the evaluation of transgenic rice plants in the T1 generation expressing the coding region of an Orysa_RHL1 nucleic acid (SEQ ID NO: 9) driven from the root-specific promoter Rcc3 (SEQ ID NO: 40) and grown under nitrogen limiting conditions, as specified above in the nitrogen use efficiency screen, are shown in Table F3. EmerVigor (also referred to as early vigour) is a yield trait directly correlated with the vigour of the plant, in particular at the early, seedling stage of development.
TABLE-US-00030 TABLE F3 Evaluation of transgenic plants expressing an Orysa_RHL1 nucleic acid under nitrogen limiting conditions.
Parameter       % increase in transgenic compared to control plant
EmerVigor       16.6
totalwgseeds    12
nrfilledseed    15
11.2. Transglutaminases (TGases)
[0617] The results of the evaluation of T1 and T2 generation transgenic rice plants expressing the nucleic acid sequence encoding a TGase polypeptide as represented by SEQ ID NO: 45, under the control of a seed-specific promoter, and grown under normal growth conditions, are presented below.
[0618] There was a significant increase in the early vigor, in the aboveground biomass, in the total seed yield per plant, in the total number of seeds, in the number of filled seeds, in the seed filling rate, and in the harvest index of the transgenic plants compared to corresponding nullizygotes (controls), as shown in Table F4.
TABLE-US-00031 TABLE F4 Results of the evaluation of T1 and T2 generation transgenic rice plants expressing the nucleic acid sequence encoding a TGase polypeptide as represented by SEQ ID NO: 45, under the control of a promoter for seed-specific expression.
Trait                           Overall average % increase in     Overall average % increase in
                                8 events in the T1 generation     4 events in the T2 generation
Total seed yield per plant      26%                               15%
Total number of filled seeds    27%                               14%
Harvest index                   26%                               14%
11.3. Tryptichon (TRY-Like)
[0619] The evaluation of transgenic rice plants expressing a TRY-like nucleic acid under control of the RCc3 promoter, and grown under conditions of reduced nitrogen availability, revealed an increase of more than 5% for emergence vigour (early vigour), fill rate, harvest index, and total seed yield. These increases were observed in T1 generation plants as well as in T2 generation plants.
[0620] The results of the evaluation of transgenic rice plants, expressing a nucleic acid encoding the TRY polypeptide of SEQ ID NO: 76 under control of the constitutive promoter, and grown under non-stress conditions in the T1 and the T2 generation, are presented below in Tables F5 and F6 respectively. When grown under non-stress conditions, an increase of at least 5% was observed in T1 for seed yield (total weight of seeds, number of filled seeds, number of total seeds).
TABLE-US-00032 TABLE F5 Data summary for transgenic rice plants transformed with the pGOS2::TRY construct; for each parameter, the overall percent increase is shown for the T1 generation; for each parameter the p-value is ≤0.05.
Parameter         Overall increase
Nr total seeds    8.7
totalwgseeds      17.3
nrfilledseed      14.5
[0621] In the T2 generation, a strong increase (≥5%) was found for above ground biomass (AreaMax and firstpan), early vigour, and seed yield; details are given in Table F6:
TABLE-US-00033 TABLE F6 Data summary for transgenic rice plants transformed with the pGOS2::TRY construct; for each parameter, the overall percent increase is shown for the T2 generation; for each parameter the p-value is ≤0.05.
Parameter       Overall increase
AreaMax         5.0
EmerVigor       20.2
firstpan        8.3
totalwgseeds    8.8
nrfilledseed    7.5
11.4. Brassinazole Resistant1 (BZR1)
[0622] The results of the evaluation of transgenic rice plants expressing the coding region of the BZR nucleic acid of SEQ ID NO: 1 under non-stress conditions are presented below. An increase of at least 5% was observed for total seed yield, the number of filled seeds per plant, the seed filling rate and the harvest index, and of more than 2.5% for thousand kernel weight, compared to control plants (corresponding nullizygotes) (Table F7).
TABLE-US-00034 TABLE F7
Parameter                Total seed    Number of       Seed filling    Harvest    TKW
                         weight        filled seeds    rate            index
% increase in the        17.5          14.4            5.5             11.4       3.0
transgenic compared
to the control plant
Sequence CWU
1
1
33111359DNAArabidopsis thaliana 1gacaaagaca tttaaaagaa gaattttcga
agaaaaatta gagagagtag aagaagcaga 60agcagtaatg gtacgagctt catcgtcgaa
gaaaggagga tcaaaaggag gagacaaaga 120cgacgcagag tcaaaacaga ggaagagatt
aaaaacccta gctctcgata accaattgct 180ctctgattct ccggcgaaat ctcattcctc
tctcaaacct tcaaagcaag ttctcaaaca 240ccatggcacc gacatcatcc gcaaatctca
gcgcaagaat cgctttctct tctccttccc 300tggtcttctc gctcctatct ccgccgctac
catcggcgat ctcgatcgat tatctaccaa 360aaaccctgtc ctctacctta atttcccaca
gggtcgtatg aaactttttg gaacgatttt 420gtatccgaag aacagatact tgactcttca
attctctaga ggaggcaaaa atgtcttatg 480tgatgattat tttgataaca tgattgtgtt
ctctgagtca tggtggattg ggacaaaaga 540ggagaatcca gaagaagctc gtcttgattt
ccctaaagaa ctagctcagg cagagaatac 600tgagtttgat ttccaaggcg gtgctggagg
agcagcttcg gtgaagaagc tggcgagtcc 660tgaaattggt agccaaccaa cagagacaga
ctcacctgaa gttgacaacg aggatgtttt 720gtctgaggat ggagaattct tggacgataa
gatccaagta acaccaccag ttcaattaac 780accaccagtc caagtaactc cggtccgaca
gtctcagaga aattctggga agaaattcaa 840ctttgcagaa acttcctcag aggcctcctc
tggtgaaagt gaaggcaata catctgatga 900agatgagaaa cctctgttgg aacctgaatc
ttcaacaaga agtcgtgagg aatctcaaga 960tggtaatggt attactgcat ctgcaagcaa
gttgcctgaa gaacttccgg ctaaaaggga 1020aaaactaaag agcaaagaca gtaagctcgt
tcaagctact ttgtctaacc ttttcaagaa 1080agctgaggag aaaacagctg gaacttccaa
ggctaaatca tcctcaaaag cttaaagaaa 1140tcttttccag aagaaaatag aggtctgttg
tttctttgct gtgagaatga acagttttta 1200gttcttttag gtatgtttgt gtgagaaatt
gctacaagac tgatgtattc atcatgcagt 1260tggataatgt attcattatg cttattcgat
agtttgtgtt catcgagcta tgtataaatc 1320atctctgctc tttttaatac aacaaactgt
ctccatcta 13592355PRTArabidopsis thaliana 2Met
Val Arg Ala Ser Ser Ser Lys Lys Gly Gly Ser Lys Gly Gly Asp 1
5 10 15 Lys Asp Asp Ala Glu Ser
Lys Gln Arg Lys Arg Leu Lys Thr Leu Ala 20
25 30 Leu Asp Asn Gln Leu Leu Ser Asp Ser Pro
Ala Lys Ser His Ser Ser 35 40
45 Leu Lys Pro Ser Lys Gln Val Leu Lys His His Gly Thr Asp
Ile Ile 50 55 60
Arg Lys Ser Gln Arg Lys Asn Arg Phe Leu Phe Ser Phe Pro Gly Leu 65
70 75 80 Leu Ala Pro Ile Ser
Ala Ala Thr Ile Gly Asp Leu Asp Arg Leu Ser 85
90 95 Thr Lys Asn Pro Val Leu Tyr Leu Asn Phe
Pro Gln Gly Arg Met Lys 100 105
110 Leu Phe Gly Thr Ile Leu Tyr Pro Lys Asn Arg Tyr Leu Thr Leu
Gln 115 120 125 Phe
Ser Arg Gly Gly Lys Asn Val Leu Cys Asp Asp Tyr Phe Asp Asn 130
135 140 Met Ile Val Phe Ser Glu
Ser Trp Trp Ile Gly Thr Lys Glu Glu Asn 145 150
155 160 Pro Glu Glu Ala Arg Leu Asp Phe Pro Lys Glu
Leu Ala Gln Ala Glu 165 170
175 Asn Thr Glu Phe Asp Phe Gln Gly Gly Ala Gly Gly Ala Ala Ser Val
180 185 190 Lys Lys
Leu Ala Ser Pro Glu Ile Gly Ser Gln Pro Thr Glu Thr Asp 195
200 205 Ser Pro Glu Val Asp Asn Glu
Asp Val Leu Ser Glu Asp Gly Glu Phe 210 215
220 Leu Asp Asp Lys Ile Gln Val Thr Pro Pro Val Gln
Leu Thr Pro Pro 225 230 235
240 Val Gln Val Thr Pro Val Arg Gln Ser Gln Arg Asn Ser Gly Lys Lys
245 250 255 Phe Asn Phe
Ala Glu Thr Ser Ser Glu Ala Ser Ser Gly Glu Ser Glu 260
265 270 Gly Asn Thr Ser Asp Glu Asp Glu
Lys Pro Leu Leu Glu Pro Glu Ser 275 280
285 Ser Thr Arg Ser Arg Glu Glu Ser Gln Asp Gly Asn Gly
Ile Thr Ala 290 295 300
Ser Ala Ser Lys Leu Pro Glu Glu Leu Pro Ala Lys Arg Glu Lys Leu 305
310 315 320 Lys Ser Lys Asp
Ser Lys Leu Val Gln Ala Thr Leu Ser Asn Leu Phe 325
330 335 Lys Lys Ala Glu Glu Lys Thr Ala Gly
Thr Ser Lys Ala Lys Ser Ser 340 345
350 Ser Lys Ala 355 3674DNAPopulus trichocarpa
3acatcatcat caaagaactt ttttaaaaaa atggtgaaat ccaagaaaac agaagccagc
60aattctaaca gggaaaaccc ggatgtgtta gagagaaaaa gactgaaaaa gcttgcaata
120accaacaaca tagtatcaga cacacaagtc agatatatta ggaaatctca aagaaaaaac
180aggtacttgc cttcatttcc tggtcttctt gctcctgtca atggtggtgg caagattggc
240gagctcaaag acttgtcctc taaaagccct gttctttacc tcgattttcg tcagctgaca
300ttgcaattct ctaggagtgg aaagaatgtt atgtgtgagg attattttga tcacatgatt
360gtattttctg aggcatggtg gattggaacg aaagaagaga acccggagga attgaaactt
420gattttctta aggaactgtt tgaggagggg caggagcatg tagttgataa aagtggtggg
480acaaaatatg tgaaagaaga gtctcctgaa acagagcttg atgatgatga taacaaatat
540ttgaaaggtt tgaaggaagt tatgccaatt cggcagcatg caagaactct gtaaaaatat
600gttttaaaaa tatttttgaa agaaattaat atttttttaa ttttttttac ttcaaattaa
660tattttttta gatc
6744187PRTPopulus trichocarpa 4Met Val Lys Ser Lys Lys Thr Glu Ala Ser
Asn Ser Asn Arg Glu Asn 1 5 10
15 Pro Asp Val Leu Glu Arg Lys Arg Leu Lys Lys Leu Ala Ile Thr
Asn 20 25 30 Asn
Ile Val Ser Asp Thr Gln Val Arg Tyr Ile Arg Lys Ser Gln Arg 35
40 45 Lys Asn Arg Tyr Leu Pro
Ser Phe Pro Gly Leu Leu Ala Pro Val Asn 50 55
60 Gly Gly Gly Lys Ile Gly Glu Leu Lys Asp Leu
Ser Ser Lys Ser Pro 65 70 75
80 Val Leu Tyr Leu Asp Phe Arg Gln Leu Thr Leu Gln Phe Ser Arg Ser
85 90 95 Gly Lys
Asn Val Met Cys Glu Asp Tyr Phe Asp His Met Ile Val Phe 100
105 110 Ser Glu Ala Trp Trp Ile Gly
Thr Lys Glu Glu Asn Pro Glu Glu Leu 115 120
125 Lys Leu Asp Phe Leu Lys Glu Leu Phe Glu Glu Gly
Gln Glu His Val 130 135 140
Val Asp Lys Ser Gly Gly Thr Lys Tyr Val Lys Glu Glu Ser Pro Glu 145
150 155 160 Thr Glu Leu
Asp Asp Asp Asp Asn Lys Tyr Leu Lys Gly Leu Lys Glu 165
170 175 Val Met Pro Ile Arg Gln His Ala
Arg Thr Leu 180 185 5332DNAPopulus
trichocarpa 5ttacagcaac cccctcttaa atctaaataa atgcagagaa aacacaaaga
gcatttgaat 60caatgtagaa ggaaatctca aagaaaaaac aggtacttgc cttcatttcc
tggtcttctt 120gctcctgtca atggtggtgg caagattggc gagctcaaag acttgtcctc
taaaagccct 180gttctttacc tcgattttcg tcagggacgg atgaagctgc ttgggactgt
tgtgtatcca 240aaaaacagat agctgacatt gcaattctct aggagtggaa agaatgttat
gtgtgaggat 300tattttgatc acatggtttc tttctttctt gt
3326100PRTPopulus trichocarpa 6Met Gln Arg Lys His Lys Glu
His Leu Asn Gln Cys Arg Arg Lys Ser 1 5
10 15 Gln Arg Lys Asn Arg Tyr Leu Pro Ser Phe Pro
Gly Leu Leu Ala Pro 20 25
30 Val Asn Gly Gly Gly Lys Ile Gly Glu Leu Lys Asp Leu Ser Ser
Lys 35 40 45 Ser
Pro Val Leu Tyr Leu Asp Phe Arg Gln Gly Arg Met Lys Leu Leu 50
55 60 Gly Thr Val Val Tyr Pro
Lys Asn Arg Leu Thr Leu Gln Phe Ser Arg 65 70
75 80 Ser Gly Lys Asn Val Met Cys Glu Asp Tyr Phe
Asp His Met Val Ser 85 90
95 Phe Phe Leu Val 100 71202DNAPopulus trichocarpa
7atcttcatcg tcaaagagaa gggaaaaaaa atggtgaaat ccaagaagac agaagccagc
60aattctaaca gagaaaaccc ggatgtgtta gagagaaaaa gattgaaaaa acttgccata
120accaacaaca tagtatcaga cgcacaagtc aaggctccat attcattgaa cccatcaaaa
180actgttgcaa aacaccatgg taaagatatt attaggaaat ctcaaagaaa gaacaggttt
240ttgttttcat ttcctggtct tcttgcacct attaatggag gtggcaagat tggcgagctc
300aaagacttgt cctctaaaaa ccctgttctt tacctcgatt ttcctcaggg acagatgaag
360ctgtttggga caattttgca tccaaagaat agatatttga cattgcaatt ctctaggagt
420ggaaagaatg ttatgtgtga ggattatttt gatcacatga ttatattttc tgaggcatgg
480tggattggaa cgaaagaaga gaacccggaa gaattgaaac ttgattttcc caacgaactg
540tttgagggaa aaggtgttga atgtgatttt aaaggtgggg caggagcagg atctgtcaat
600aagcaagtac ttcaaaagag tggtggaacc aaatatgtaa aagaagagtc tcctgaaact
660gagcttgatg atgatttatc agatgataac aatgatttta aagatttgaa tgaaactaca
720ccaattcggc aatctgcaag aacttctggg aaaaaattca agtttactga agtttcctcg
780ggagatgatt ctgctgaaag aagtcctgat gccttggggg tggaggagga ggaggaggag
840gaggaggaaa agaaagtgaa aactaacatg tcctctggtc ttgacattga aagtgaaagt
900tctagagaag ggaatcatct ttctgagcaa attcaagcat ctataaccaa atctaaaaag
960ctttctgagt ctgctgcttc agtgacgata cctaaggaaa acttgtataa tagtcatggt
1020tcacttgttc agtcaaccat atccacgctg ttcaagaaag tgcaggaaaa gaagaaagtg
1080gtggaaaagg ttaggtttga caactttgag gctaatagct aaatatgccg tctagtcaga
1140acttgttgag aggctataat ctgtagtatt ttgctgcagt cactgaatag aaccatatag
1200gt
12028363PRTPopulus trichocarpa 8Met Val Lys Ser Lys Lys Thr Glu Ala Ser
Asn Ser Asn Arg Glu Asn 1 5 10
15 Pro Asp Val Leu Glu Arg Lys Arg Leu Lys Lys Leu Ala Ile Thr
Asn 20 25 30 Asn
Ile Val Ser Asp Ala Gln Val Lys Ala Pro Tyr Ser Leu Asn Pro 35
40 45 Ser Lys Thr Val Ala Lys
His His Gly Lys Asp Ile Ile Arg Lys Ser 50 55
60 Gln Arg Lys Asn Arg Phe Leu Phe Ser Phe Pro
Gly Leu Leu Ala Pro 65 70 75
80 Ile Asn Gly Gly Gly Lys Ile Gly Glu Leu Lys Asp Leu Ser Ser Lys
85 90 95 Asn Pro
Val Leu Tyr Leu Asp Phe Pro Gln Gly Gln Met Lys Leu Phe 100
105 110 Gly Thr Ile Leu His Pro Lys
Asn Arg Tyr Leu Thr Leu Gln Phe Ser 115 120
125 Arg Ser Gly Lys Asn Val Met Cys Glu Asp Tyr Phe
Asp His Met Ile 130 135 140
Ile Phe Ser Glu Ala Trp Trp Ile Gly Thr Lys Glu Glu Asn Pro Glu 145
150 155 160 Glu Leu Lys
Leu Asp Phe Pro Asn Glu Leu Phe Glu Gly Lys Gly Val 165
170 175 Glu Cys Asp Phe Lys Gly Gly Ala
Gly Ala Gly Ser Val Asn Lys Gln 180 185
190 Val Leu Gln Lys Ser Gly Gly Thr Lys Tyr Val Lys Glu
Glu Ser Pro 195 200 205
Glu Thr Glu Leu Asp Asp Asp Leu Ser Asp Asp Asn Asn Asp Phe Lys 210
215 220 Asp Leu Asn Glu
Thr Thr Pro Ile Arg Gln Ser Ala Arg Thr Ser Gly 225 230
235 240 Lys Lys Phe Lys Phe Thr Glu Val Ser
Ser Gly Asp Asp Ser Ala Glu 245 250
255 Arg Ser Pro Asp Ala Leu Gly Val Glu Glu Glu Glu Glu Glu
Glu Glu 260 265 270
Glu Lys Lys Val Lys Thr Asn Met Ser Ser Gly Leu Asp Ile Glu Ser
275 280 285 Glu Ser Ser Arg
Glu Gly Asn His Leu Ser Glu Gln Ile Gln Ala Ser 290
295 300 Ile Thr Lys Ser Lys Lys Leu Ser
Glu Ser Ala Ala Ser Val Thr Ile 305 310
315 320 Pro Lys Glu Asn Leu Tyr Asn Ser His Gly Ser Leu
Val Gln Ser Thr 325 330
335 Ile Ser Thr Leu Phe Lys Lys Val Gln Glu Lys Lys Lys Val Val Glu
340 345 350 Lys Val Arg
Phe Asp Asn Phe Glu Ala Asn Ser 355 360
92170DNAPopulus trichocarpa 9gaggcgtaaa gtagtgcagg caaggcgaga
agggcaagaa ggaggaggga ggagggatgg 60tgaagaagaa ggaggccggc gatgcggagg
ccgacgagcg gcgccgcctc cgctccctcg 120ccttctctaa tggcttactc cagcgcgggg
agccggcggc gccgcgctcg gcgctcgcgc 180cctccactgc cgtgtcgcgg ctgcagggcc
gcgacatcgt gcgccgcggc gggcagcgca 240agagccgctt cctcttctca ttccccggcc
tcctcgcgcc cgcggctgct gcctcgggcg 300gccgcgtcgg cgagctcgct gatcttggca
ccaaaaatcc tctgctctac ctcgacttcc 360cacaggggag gatgaagctg ttggggacgc
atgtgtaccc caagaacaag tatctgacac 420tgcagatgag caggtccacc aagggcgttg
tctgcgagga cgtcttcgag agcctgattg 480ttttttctga agcctggtgg attggaacaa
aagaagaaaa cccacaagaa ctgaaactgg 540attttccaaa agagttccag aatgatgggg
ctgttgcaga ttctgatttt aaaggtggag 600caggtgcttc ctgtgatgaa gctgttacca
tcaataaacc gccaaaggaa accaccacag 660gatccctttc cccaaagatt gaatctgaca
ttgattcttc cgaggattca gaccttaagg 720acgaggataa cacacaaagc actagtcaag
caccttcagt taggcagtct gctagaactg 780ctgggaaagc cttgaagtat actgagatat
cctctggaga cgattcatct gataatgacg 840atgagattga tgtccctgag gacatggatg
agaaggtgaa gagtccggca gttaagaatg 900aatcccaaag tgaagacatt aaacctgcag
attcatctgc gcagcctatc tcagctaaga 960aggagccact cgttcaggct actctgtcta
gcatgtttaa aaaagcagaa gaaaaaaaga 1020gatgtactag aagcccgaaa ggatctccag
caacaaaagg acctgctgct aagaagcagc 1080gagcaagtcc agaggaaaaa catccaacag
ggaagaagag tggtaagtgc agtagcaagt 1140ctgttgttag aagcatctaa tagaccttgg
tgcgattaca tttgttttaa gcctctgaag 1200gccttcttca atggagaaga cttgttctaa
aataatccaa aggctttggg aggcttactt 1260gagaagcttt cagcttttgt ctatgtaagg
ggcatttaaa atattaatcc atttcatgaa 1320gctttaggac catctcaagc ctgtcgaatc
ggaataaatg atggaattaa catctattgt 1380ggtggctcta tactaaatat gctatttgaa
ttttacaaga cttgatattg ctgggtaaaa 1440gctgtgatgc atggggggaa acaaatgcat
tttagtggct cttagtgcaa tgtgcaacga 1500tctatcagct tttctttaac tgagtacatc
agtacatgtt agtgttcttc atatgaagct 1560gacaattgta tttacatgga tgtagctggc
agaagtcaga aaaagagaaa aacacaggta 1620gaagatgacg aaattgaagt gctctcaagt
tcctcccagg ataacaatgt ggacgatgat 1680agcgatgaag actgggctga gtgatgtgca
gctgaagttg aaaggaatgt agcactcggt 1740gatgaagagt gaaggatgca agattgtggg
caattgtttt cttttggggc aatatcacaa 1800ttgatgtgtt tgaggagctt aggttctgac
ctgaccattg atgagatata tcatcatagt 1860cttttcatcg tactggtgaa attgaaaacc
gagaagtgtg atctgttagc actagattta 1920ttatttattt gacttgttgt aatgtaacat
aaacaagagc tgataaatca ttgttagggt 1980ctgcaatgca aattagtacg gcaattgcag
atataaacta taccattgat gaagaaaatg 2040attgttgtgt atttatttat ggtattgata
aattaaacgg ttttatctga acaactgaga 2100gggtcctggg gcttaaggca aacaggcctg
tttttgtaag agaaaaagct gaccccgagc 2160ttgtatgttc
217010367PRTPopulus trichocarpa 10Met
Val Lys Lys Lys Glu Ala Gly Asp Ala Glu Ala Asp Glu Arg Arg 1
5 10 15 Arg Leu Arg Ser Leu Ala
Phe Ser Asn Gly Leu Leu Gln Arg Gly Glu 20
25 30 Pro Ala Ala Pro Arg Ser Ala Leu Ala Pro
Ser Thr Ala Val Ser Arg 35 40
45 Leu Gln Gly Arg Asp Ile Val Arg Arg Gly Gly Gln Arg Lys
Ser Arg 50 55 60
Phe Leu Phe Ser Phe Pro Gly Leu Leu Ala Pro Ala Ala Ala Ala Ser 65
70 75 80 Gly Gly Arg Val Gly
Glu Leu Ala Asp Leu Gly Thr Lys Asn Pro Leu 85
90 95 Leu Tyr Leu Asp Phe Pro Gln Gly Arg Met
Lys Leu Leu Gly Thr His 100 105
110 Val Tyr Pro Lys Asn Lys Tyr Leu Thr Leu Gln Met Ser Arg Ser
Thr 115 120 125 Lys
Gly Val Val Cys Glu Asp Val Phe Glu Ser Leu Ile Val Phe Ser 130
135 140 Glu Ala Trp Trp Ile Gly
Thr Lys Glu Glu Asn Pro Gln Glu Leu Lys 145 150
155 160 Leu Asp Phe Pro Lys Glu Phe Gln Asn Asp Gly
Ala Val Ala Asp Ser 165 170
175 Asp Phe Lys Gly Gly Ala Gly Ala Ser Cys Asp Glu Ala Val Thr Ile
180 185 190 Asn Lys
Pro Pro Lys Glu Thr Thr Thr Gly Ser Leu Ser Pro Lys Ile 195
200 205 Glu Ser Asp Ile Asp Ser Ser
Glu Asp Ser Asp Leu Lys Asp Glu Asp 210 215
220 Asn Thr Gln Ser Thr Ser Gln Ala Pro Ser Val Arg
Gln Ser Ala Arg 225 230 235
240 Thr Ala Gly Lys Ala Leu Lys Tyr Thr Glu Ile Ser Ser Gly Asp Asp
245 250 255 Ser Ser Asp
Asn Asp Asp Glu Ile Asp Val Pro Glu Asp Met Asp Glu 260
265 270 Lys Val Lys Ser Pro Ala Val Lys
Asn Glu Ser Gln Ser Glu Asp Ile 275 280
285 Lys Pro Ala Asp Ser Ser Ala Gln Pro Ile Ser Ala Lys
Lys Glu Pro 290 295 300
Leu Val Gln Ala Thr Leu Ser Ser Met Phe Lys Lys Ala Glu Glu Lys 305
310 315 320 Lys Arg Cys Thr
Arg Ser Pro Lys Gly Ser Pro Ala Thr Lys Gly Pro 325
330 335 Ala Ala Lys Lys Gln Arg Ala Ser Pro
Glu Glu Lys His Pro Thr Gly 340 345
350 Lys Lys Ser Gly Lys Cys Ser Ser Lys Ser Val Val Arg Ser
Ile 355 360 365
111511DNAOryza sativa 11atggtgaaga agaagccggc cggcgatgcg gaggccgacg
agcggcgccg cctccgctcc 60ctcgccttct ctaatggctt actccagcgc ggggagccgg
cggcgccgcg ctcggctctc 120gcgccctcca ctgccgtgtc gcggctgcag ggccgcgaca
tcgtgcgccg cggcgggcag 180cgcaagagcc gcttcctctt ctccttcccc ggcctcctcg
cgcccgcggc tgctgcctcg 240ggcggccgcg tcggcgagct cgctgatctt ggcaccaaaa
atcctctgct ctacctcgac 300ttcccacagg tatcctatat ctatctatct attccgtcag
gggaggatga agctgttggg 360gacgcatgtg taccccaaga acaagtatct gacactgcag
atgacgtctt cgagagcctg 420attgtttttt ctgaagcctg gtggattgga acaaaagaag
aagaaaaccc acaagaactg 480aaactggatt ttccaaaaga gttccagaat gatgaggcgg
ttgcagattc tgattttaaa 540ggtggagcag gtgcttcctg tgatgaagct gtttccatca
ataaaccgcc aaaggaaacc 600accacaggat ccctttcccc taagattgaa tctgacattg
attcttccga ggattcagac 660cttaaggacg aggataacac acaaagcact agtcaagcac
cttcagttag gcagtctgct 720agaactgctg ggaaagcctt gaagtatact gagatatcat
ctggagacga ttcatctgat 780aatgacgatg agattgatgt ccctgaggac atggatgaga
agatgaagag tccagcagtt 840aagaatgaat cccaaagtga agacattaaa cctgcagatt
ggtctgcgca gcctatctca 900gctaagaagg agccactcgt tcaggccact ctgtctagca
tgtttaaaaa agcagaagaa 960aaaaaaggac ctgctgctaa gaagcagcga gcaagtccag
aggaaaaaca tccaacaggg 1020aagaagagtg ctggcagaag tcagaaaagg agaaaaacac
aggtagaaga tgacaaaatt 1080gaagtgctct caagttcctc ccaggataac aacgtggacg
atgatagcga tgaggactgg 1140gctgagtgat gtgcagctga agttgaaagg aatgatgcaa
gattgtgggc aattcttttc 1200tttgggggcg atatcacaat tgatgtgttt gaggagctta
ggttctggcc tgaccattga 1260tgagatatat catcatagtc ttttcatcgc actggtgaaa
ttgaaaaccg agaagtcgtg 1320tgatctgtta gcactagatt tattatttat ttgacttgtt
gtaatgtaac ataaacaaga 1380gctgataaat cattgttagg gtctgcaatg caaattagta
ctgcaattac agatataaac 1440tataccattg atgaagaaaa tgattgttgt gtatttattt
atggtattga taaattaaac 1500ggttttatct g
151112382PRTOryza sativa 12Met Val Lys Lys Lys Pro
Ala Gly Asp Ala Glu Ala Asp Glu Arg Arg 1 5
10 15 Arg Leu Arg Ser Leu Ala Phe Ser Asn Gly Leu
Leu Gln Arg Gly Glu 20 25
30 Pro Ala Ala Pro Arg Ser Ala Leu Ala Pro Ser Thr Ala Val Ser
Arg 35 40 45 Leu
Gln Gly Arg Asp Ile Val Arg Arg Gly Gly Gln Arg Lys Ser Arg 50
55 60 Phe Leu Phe Ser Phe Pro
Gly Leu Leu Ala Pro Ala Ala Ala Ala Ser 65 70
75 80 Gly Gly Arg Val Gly Glu Leu Ala Asp Leu Gly
Thr Lys Asn Pro Leu 85 90
95 Leu Tyr Leu Asp Phe Pro Gln Val Ser Tyr Ile Tyr Leu Ser Ile Pro
100 105 110 Ser Gly
Glu Asp Glu Ala Val Gly Asp Ala Cys Val Pro Gln Glu Gln 115
120 125 Val Ser Asp Thr Ala Asp Asp
Val Phe Glu Ser Leu Ile Val Phe Ser 130 135
140 Glu Ala Trp Trp Ile Gly Thr Lys Glu Glu Glu Asn
Pro Gln Glu Leu 145 150 155
160 Lys Leu Asp Phe Pro Lys Glu Phe Gln Asn Asp Glu Ala Val Ala Asp
165 170 175 Ser Asp Phe
Lys Gly Gly Ala Gly Ala Ser Cys Asp Glu Ala Val Ser 180
185 190 Ile Asn Lys Pro Pro Lys Glu Thr
Thr Thr Gly Ser Leu Ser Pro Lys 195 200
205 Ile Glu Ser Asp Ile Asp Ser Ser Glu Asp Ser Asp Leu
Lys Asp Glu 210 215 220
Asp Asn Thr Gln Ser Thr Ser Gln Ala Pro Ser Val Arg Gln Ser Ala 225
230 235 240 Arg Thr Ala Gly
Lys Ala Leu Lys Tyr Thr Glu Ile Ser Ser Gly Asp 245
250 255 Asp Ser Ser Asp Asn Asp Asp Glu Ile
Asp Val Pro Glu Asp Met Asp 260 265
270 Glu Lys Met Lys Ser Pro Ala Val Lys Asn Glu Ser Gln Ser
Glu Asp 275 280 285
Ile Lys Pro Ala Asp Trp Ser Ala Gln Pro Ile Ser Ala Lys Lys Glu 290
295 300 Pro Leu Val Gln Ala
Thr Leu Ser Ser Met Phe Lys Lys Ala Glu Glu 305 310
315 320 Lys Lys Gly Pro Ala Ala Lys Lys Gln Arg
Ala Ser Pro Glu Glu Lys 325 330
335 His Pro Thr Gly Lys Lys Ser Ala Gly Arg Ser Gln Lys Arg Arg
Lys 340 345 350 Thr
Gln Val Glu Asp Asp Lys Ile Glu Val Leu Ser Ser Ser Ser Gln 355
360 365 Asp Asn Asn Val Asp Asp
Asp Ser Asp Glu Asp Trp Ala Glu 370 375
380 131965DNAOryza sativa 13ctcgcatggc cgttttttca gcgtcgaagg
cggtctcgga tgaatccgct ttccgtgtgg 60ccggtagagt cttcgtcttc gtctcgtctc
gtctcgtccc aaacacgcga acgcgacacg 120ccgacacccg tatcgtacca cctaactttt
gctccgagat ctcctcctcc cgtctcatca 180gtcttttcca atcaatcacc agctcaaact
tgtacccaaa accctacacc ggagggagag 240ggatggtcaa gaaagccgtc tccaccgcgc
cagcagacgc cgaggccgac gagcgccgcc 300gcctccgctc gctcgccttc tccaacggcc
tgctccagcg aggcgacccc gcggcgccgc 360gggcgccgct cgcgccggcg gctgccgtca
cgcgcctaca gggccgcgac gtcgtccgcc 420gtggcggcca gcgaaagagc cgctacctct
tctccttccc tggcctcctc gcgccagcag 480cctcgggtgg ccgggtcgga gagctcgccg
acctcgggac caagaacccc ctgctgtacc 540tcgagttccc acagggaagg atgaagctct
tcgggacgca cgtttacccc aagaacaagt 600acctcacgct gcagatgacc aggtcggcca
agggcgtcgt ctgtgaggac gtctttgaga 660gcctgattgt gttttctgaa gcctggtggg
tgggaacaaa agaagataat ccggaagagc 720tcaaacttga gtttccaaaa gaattccaaa
atgatggcac gacagcagac tgtgatttca 780gaggtggtgc aggtggtgcc atcgatgaag
caactggaag caaagctgga aaggaaattg 840cagaacctcg ttccccaaag tttgcatctg
atgacgatgc tcctgaggat tcaaatcata 900aggatgagaa taacacacag actatgagtg
gaacaccagt tagacagtct gctaggaatg 960cagggaaaac cttgaaaagg tacacagact
tatcttctgg aggtgaatcg tctgacaata 1020ataatgaaac tgatatatct gaggacttgg
atgataagga ggtggagagt ccagaaatta 1080aggacgagat tgaaagcgaa gatgtcaaac
ccgcagattc ttcagcaatt tccctctcta 1140gcaagaagga gcctcttgtt caggctactt
tgtctagcat gtttataagg gcagaagaaa 1200aaaagagatc tacaaggagt cctaaagggt
cccctgcaac caaaggtgct gctgctaaga 1260agcagcgagc aagtccaatg gcaaaacagc
cagcagggat caagaaggtt agcggaactc 1320ggggaaagaa aaaaccaaag gtgggagaag
atgaaatcga agagctctca agttcctccc 1380aggataacga cgcagatgat gatagtgacg
aggactgggc cgaataatgc gatggtacag 1440acgacggata ggttgggccg ttggggaagc
atatgcataa ttgatgtgcg gctgaagaga 1500aggatggatt gggggatggc acagaaactg
cctgttttca tggactggtt tggctggtca 1560tgtctgacga cggatctaga atatggaata
tcctgatctg ttgtcaagct gagtccatga 1620aacgttggtt gatgaaacat tagttagtca
gaagcgggtt tgtttgtagt taggaactag 1680gaacacacac acccgaggtt gtgatggggt
tgcatacttg gactagccat tcccattcat 1740ctgtctgtac tctgttcgac tgttcctttg
tgtggctcac acctgttgtg tttctgatgg 1800tggtcatttc tccagccagg ggcaaagaac
acactgcgtt tcgctgatcc tttaattcag 1860acccatagtc cttttttttt tttcagcaca
ttctgatgtg gctctttgta ttgcagactg 1920tatcatggta tcagattggt ttcgacttga
aacccctctc tcttc 196514473PRTOryza sativa 14Met Ala Val
Phe Ser Ala Ser Lys Ala Val Ser Asp Glu Ser Ala Phe 1 5
10 15 Arg Val Ala Gly Arg Val Phe Val
Phe Val Ser Ser Arg Leu Val Pro 20 25
30 Asn Thr Arg Thr Arg His Ala Asp Thr Arg Ile Val Pro
Pro Asn Phe 35 40 45
Cys Ser Glu Ile Ser Ser Ser Arg Leu Ile Ser Leu Phe Gln Ser Ile 50
55 60 Thr Ser Ser Asn
Leu Tyr Pro Lys Pro Tyr Thr Gly Gly Arg Gly Met 65 70
75 80 Val Lys Lys Ala Val Ser Thr Ala Pro
Ala Asp Ala Glu Ala Asp Glu 85 90
95 Arg Arg Arg Leu Arg Ser Leu Ala Phe Ser Asn Gly Leu Leu
Gln Arg 100 105 110
Gly Asp Pro Ala Ala Pro Arg Ala Pro Leu Ala Pro Ala Ala Ala Val
115 120 125 Thr Arg Leu Gln
Gly Arg Asp Val Val Arg Arg Gly Gly Gln Arg Lys 130
135 140 Ser Arg Tyr Leu Phe Ser Phe Pro
Gly Leu Leu Ala Pro Ala Ala Ser 145 150
155 160 Gly Gly Arg Val Gly Glu Leu Ala Asp Leu Gly Thr
Lys Asn Pro Leu 165 170
175 Leu Tyr Leu Glu Phe Pro Gln Gly Arg Met Lys Leu Phe Gly Thr His
180 185 190 Val Tyr Pro
Lys Asn Lys Tyr Leu Thr Leu Gln Met Thr Arg Ser Ala 195
200 205 Lys Gly Val Val Cys Glu Asp Val
Phe Glu Ser Leu Ile Val Phe Ser 210 215
220 Glu Ala Trp Trp Val Gly Thr Lys Glu Asp Asn Pro Glu
Glu Leu Lys 225 230 235
240 Leu Glu Phe Pro Lys Glu Phe Gln Asn Asp Gly Thr Thr Ala Asp Cys
245 250 255 Asp Phe Arg Gly
Gly Ala Gly Gly Ala Ile Asp Glu Ala Thr Gly Ser 260
265 270 Lys Ala Gly Lys Glu Ile Ala Glu Pro
Arg Ser Pro Lys Phe Ala Ser 275 280
285 Asp Asp Asp Ala Pro Glu Asp Ser Asn His Lys Asp Glu Asn
Asn Thr 290 295 300
Gln Thr Met Ser Gly Thr Pro Val Arg Gln Ser Ala Arg Asn Ala Gly 305
310 315 320 Lys Thr Leu Lys Arg
Tyr Thr Asp Leu Ser Ser Gly Gly Glu Ser Ser 325
330 335 Asp Asn Asn Asn Glu Thr Asp Ile Ser Glu
Asp Leu Asp Asp Lys Glu 340 345
350 Val Glu Ser Pro Glu Ile Lys Asp Glu Ile Glu Ser Glu Asp Val
Lys 355 360 365 Pro
Ala Asp Ser Ser Ala Ile Ser Leu Ser Ser Lys Lys Glu Pro Leu 370
375 380 Val Gln Ala Thr Leu Ser
Ser Met Phe Ile Arg Ala Glu Glu Lys Lys 385 390
395 400 Arg Ser Thr Arg Ser Pro Lys Gly Ser Pro Ala
Thr Lys Gly Ala Ala 405 410
415 Ala Lys Lys Gln Arg Ala Ser Pro Met Ala Lys Gln Pro Ala Gly Ile
420 425 430 Lys Lys
Val Ser Gly Thr Arg Gly Lys Lys Lys Pro Lys Val Gly Glu 435
440 445 Asp Glu Ile Glu Glu Leu Ser
Ser Ser Ser Gln Asp Asn Asp Ala Asp 450 455
460 Asp Asp Ser Asp Glu Asp Trp Ala Glu 465
470 151363DNAAquilegia formosa 15tcgctgttgc
ctgaacaact gaactgaaca cgatggttcg agctacgtcg aagaagatag 60agaacgacga
cgatcgaagt agactaaaaa agcttgctct atcacgaaat ctcctttcgc 120aaactccttc
gaaaccttct tctacactat cactatcgaa aacagttctc aaacaccatg 180gtaaagatat
aatgaagaaa tcacagagaa agaacagatt tcttttttca tttcctggtc 240ttcttggtcc
tattactggt ggtaaggttg gtgaactgaa ggatttagga acaatgaagc 300caattcttta
tctcgatttc cctcagggaa gggtgaaaat gtttggtaca atagtttatc 360ccaagaacag
gtacttgacg ctgcatttct ctaaaggagg gaagaatgtg atgtgtgaag 420atcattttga
taacatggtt gttttctcag acgcatggtg gattgggaca aaagatgaga 480atccagaaga
ggttcaactc gaatttccta agaacctgat taagggtaag catacagacg 540ctgattttaa
aggtggagcc ggtgctggtg ccacatctga acaaaaacca ggtcctaaca 600agcctagaaa
agaatatgtt gaaacagaga ctcctagtac tgatgtagaa gatgtttctg 660aggattttga
ttccttaaat gaaaagaata aggatttgat ggaagtactg ccagttcgat 720cttctaccag
aacagctggg agaaaattca agtttacaga accttcatca gtagataatt 780ctactgaaag
tgactctgac tcatctaaag tgaggaaagg agttaaacag acacttgacg 840atgagactga
agatgccagc ttggtgggcc atgctattga taatccgaat gttgcaacaa 900aacaaattta
ccccaacaaa agcagcagtc ttctattcca gtgaattcta aagaaatttc 960ttccagtaaa
cgtggtcccc ttgttcaagc aaccatatcg aacttatttt cgaaagcaaa 1020agcaaaggat
gcgggcggaa gcgatgaagt gacacggatc gaaggaagaa agcctgtgat 1080aggaagcaga
acgaagaaaa agcagtctca ggtcgaagat gatgacatag aagagttctc 1140aaccgaatcg
gagttgtttg tgcaggatat tgaggaaagt gatgaagatt gggttgcctg 1200acacagtatg
gtccaatttt gtgttgctgg atatggacag gcaaggactg tgcacgtagt 1260tagaaagaag
acattgctgt gctctttcag aatatgtaat actgtttagc tctgtagaat 1320taaaaacatg
aaatataaga taatttgact ctggttaact ctg
136316303PRTAquilegia formosa 16Met Val Arg Ala Thr Ser Lys Lys Ile Glu
Asn Asp Asp Asp Arg Ser 1 5 10
15 Arg Leu Lys Lys Leu Ala Leu Ser Arg Asn Leu Leu Ser Gln Thr
Pro 20 25 30 Ser
Lys Pro Ser Ser Thr Leu Ser Leu Ser Lys Thr Val Leu Lys His 35
40 45 His Gly Lys Asp Ile Met
Lys Lys Ser Gln Arg Lys Asn Arg Phe Leu 50 55
60 Phe Ser Phe Pro Gly Leu Leu Gly Pro Ile Thr
Gly Gly Lys Val Gly 65 70 75
80 Glu Leu Lys Asp Leu Gly Thr Met Lys Pro Ile Leu Tyr Leu Asp Phe
85 90 95 Pro Gln
Gly Arg Val Lys Met Phe Gly Thr Ile Val Tyr Pro Lys Asn 100
105 110 Arg Tyr Leu Thr Leu His Phe
Ser Lys Gly Gly Lys Asn Val Met Cys 115 120
125 Glu Asp His Phe Asp Asn Met Val Val Phe Ser Asp
Ala Trp Trp Ile 130 135 140
Gly Thr Lys Asp Glu Asn Pro Glu Glu Val Gln Leu Glu Phe Pro Lys 145
150 155 160 Asn Leu Ile
Lys Gly Lys His Thr Asp Ala Asp Phe Lys Gly Gly Ala 165
170 175 Gly Ala Gly Ala Thr Ser Glu Gln
Lys Pro Gly Pro Asn Lys Pro Arg 180 185
190 Lys Glu Tyr Val Glu Thr Glu Thr Pro Ser Thr Asp Val
Glu Asp Val 195 200 205
Ser Glu Asp Phe Asp Ser Leu Asn Glu Lys Asn Lys Asp Leu Met Glu 210
215 220 Val Leu Pro Val
Arg Ser Ser Thr Arg Thr Ala Gly Arg Lys Phe Lys 225 230
235 240 Phe Thr Glu Pro Ser Ser Val Asp Asn
Ser Thr Glu Ser Asp Ser Asp 245 250
255 Ser Ser Lys Val Arg Lys Gly Val Lys Gln Thr Leu Asp Asp
Glu Thr 260 265 270
Glu Asp Ala Ser Leu Val Gly His Ala Ile Asp Asn Pro Asn Val Ala
275 280 285 Thr Lys Gln Ile
Tyr Pro Asn Lys Ser Ser Ser Leu Leu Phe Gln 290 295
300
SEQ ID NO: 17; length 1294; DNA; Malus domestica; misc_feature (1287)..(1287): n is a, c, g, or t
gcccattccc
atttttattt gtacgaaggt attcgaaagc agatcccggg aagtgctaaa 60agcggggatc
tgctggaggg tgggtgaggg gaagaacatc cgagtaagat atgatccatg 120gttgcctatt
ccaagaactt ttatgcctct aggacaatac ttgggtcggg tctgttgggt 180gcaatagaca
ttttagagag agaggtttaa aaccctacag agacaggagg tagggttggt 240ggtctccggt
ccggttgaat caaaatggcg cggacctcgt catcgaagaa gcggaaacac 300gaagatgatg
aaggggcaga ggcagaggca gaacctgagg tcgcgcagcg gaagaggctc 360aaagccctcg
ccttctccaa caaccagctc tcagagatcc ctgcaaagcc ccgcgcgcct 420ctcacacctt
caaacggtgt ccttaagcag catggcaagg acattgtgaa gaaatctcag 480cggaagaaca
agttcctctt ctccttccct ggcctccttg cccccattgg aggtggtaag 540atcggcgacc
tcaaggattt ggacaccaag aaccccgtcc tctacctcca attccctctg 600ggtcagatga
agttgtttgg gactcttgtg ttccccaaga acaggtatct gacaatgcag 660ttccccaagg
gtggaaagag tgtcatgtgc gaggactact tcgataatat gattgtattt 720tccgatgctt
ggtggattgg gacaaaagat gagaatcccg agtaagccca acttgatttt 780cctaaggaat
tgactgaggg acaacactct gagttcgact ttcaaggtgg cgcaggttca 840acatctgcca
aaaagcaaag tgatagtaaa aatgaaacta catatgttga agagtattcg 900ccccataaca
aggttgaaga taatttatca gatgaagaaa acaatgaatt aatgaaggca 960acaccagttc
gacattcagc aagaactgca ggaaaaaaat tcaagtttgg agaagcttct 1020tctggagatg
attctgctga aagtgatacc ccctcagctg aagggagaag ataaaaaagt 1080tggaagactt
gattcttcat ctgggaagca cagtagtgga aagactgaca atctcagctt 1140tggagatgca
gacattgata atgaggatcg tatgaaagga gctcaaactc ccaagcaaaa 1200tgaagattcc
tctctgtccg aagctaaatc aaagaaagag tcacattctg cctttgctgg 1260gactacatct
aaagaggact ctcatancaa tcat
1294
SEQ ID NO: 18; length 268; PRT; Malus domestica
Met Ala Arg Thr Ser Ser Ser Lys Lys Arg Lys
His Glu Asp Asp Glu 1 5 10
15 Gly Ala Glu Ala Glu Ala Glu Pro Glu Val Ala Gln Arg Lys Arg Leu
20 25 30 Lys Ala
Leu Ala Phe Ser Asn Asn Gln Leu Ser Glu Ile Pro Ala Lys 35
40 45 Pro Arg Ala Pro Leu Thr Pro
Ser Asn Gly Val Leu Lys Gln His Gly 50 55
60 Lys Asp Ile Val Lys Lys Ser Gln Arg Lys Asn Lys
Phe Leu Phe Ser 65 70 75
80 Phe Pro Gly Leu Leu Ala Pro Ile Gly Gly Gly Lys Ile Gly Asp Leu
85 90 95 Lys Asp Leu
Asp Thr Lys Asn Pro Val Leu Tyr Leu Gln Phe Pro Leu 100
105 110 Gly Gln Met Lys Leu Phe Gly Thr
Leu Val Phe Pro Lys Asn Arg Tyr 115 120
125 Leu Thr Met Gln Phe Pro Lys Gly Gly Lys Ser Val Met
Cys Glu Asp 130 135 140
Tyr Phe Asp Asn Met Ile Val Phe Ser Asp Ala Trp Trp Ile Gly Thr 145
150 155 160 Lys Asp Glu Asn
Pro Glu Ala Gln Leu Asp Phe Pro Lys Glu Leu Thr 165
170 175 Glu Gly Gln His Ser Glu Phe Asp Phe
Gln Gly Gly Ala Gly Ser Thr 180 185
190 Ser Ala Lys Lys Gln Ser Asp Ser Lys Asn Glu Thr Thr Tyr
Val Glu 195 200 205
Glu Tyr Ser Pro His Asn Lys Val Glu Asp Asn Leu Ser Asp Glu Glu 210
215 220 Asn Asn Glu Leu Met
Lys Ala Thr Pro Val Arg His Ser Ala Arg Thr 225 230
235 240 Ala Gly Lys Lys Phe Lys Phe Gly Glu Ala
Ser Ser Gly Asp Asp Ser 245 250
255 Ala Glu Ser Asp Thr Pro Ser Ala Glu Gly Arg Arg
260 265
SEQ ID NO: 19; length 1773; DNA; Physcomitrella patens
atggggaaaa agaaagttga ggaggtgagc cagacaaagg aggagaagac attagcgaag
60gaaagcaaga ggctgcggga gctggctttg acgtccgggt tgctgtcgga gaaaaaagct
120gtgccggatg cgccaatgca cccgcactct ggtatagtaa gatgtgacgg gaaggacatt
180tgtaaaaagg ggcataggaa gaacaagtac cttttctctt ttcctggtct cgtagctcct
240gtagctgttg gaaaattcgg tgatctgacg caattagaca caaaaaatcc aatattgtac
300gttgatttcc tacaggcaag tagagctttt gcgcagactg gacgactcaa attattcggc
360accattgtgt attccaaaaa caagtatatt actttgaact ttgtccgtgg ggcaggaagc
420atacaatgcg aagatatttt tgagaatctg gtggtatttt cggacgcgtg gtggatcggg
480acgaaggaag agaatcctga tgaactgcgc cttgaaatgc ctttggattt tcagcaggaa
540aggcatgctg tgtacgactt cgcaggtgga gctggcaagc ctagaaatat taaagatgat
600gttgacgtcc aagactcgca attagaacta gtccaggtat cggaattgga atccagtaaa
660cagtgcacac cgaagggcca actcaaaatg gaccggtggc ttttccaaaa gaaaccttcg
720gaaaacaaga ccttagagaa atcttttgag agtaatgcga aaacgaagcc aaaaaatgct
780agtgaatggg agtcagatga agacgaggaa ggttttgtca acctagggga cgcgccaact
840ccatcccgac aatctgctcg tatagctgag aagaaacact cgtatgcaga gtcttcctcg
900gaggagaacc agacagacgg cagtgatgaa cacgaccgtg cagatgctga acttagggat
960cctcgagaca aaactgttaa taaacttttc aatgatgtag aagatggctt cttggctcct
1020gaatcacaaa tatcacagat ggacttggca gatatggtga cggatagatg tgtatggaat
1080gtagtttatg atagcatacc cttaagcttt tttccaggtt gctgccgtat tcgaagccct
1140tgtgtgtttc atgagcgttg tgtgcaattt actaatagct taaagttcac ggctaatttg
1200gtttacatga agacttacat cgtctgtcat gcagtctcgc cacaaacttc agtaccaaaa
1260accagtgcaa gtggcatgat actttcaacc tctgtcctag cagatttaac tgccaacaca
1320ataggagctg ggtcgaagca atcaagtctc tctgcattct ttatgaagtc aagcgaaaag
1380tcaactttag acaacgatga tgctggtaaa gaagatgaaa acgtaactcc aaaagacgaa
1440atggaatcgg tattaccctg tacacctcct gatatcaatg attctaaacg caaaagaaaa
1500tctactccag atggaaagag aaattcgaga gatgatatga atatttcaag agtggataac
1560ttaccaccgg cattttctga taactgccgt gtaggaactg gaggaaatgg ctaccatatc
1620gaaggagcag ccgttaatac tggtacaatg ctgaccaaac agagtaatgt tcacaatgtt
1680gctactttat tttatggacc acagaataag tcttatacat catgtgagaa ctacgccctg
1740tttttcatgt tttgccatgc ttttggaaaa tga
1773
SEQ ID NO: 20; length 590; PRT; Physcomitrella patens
Met Gly Lys Lys Lys Val Glu Glu Val
Ser Gln Thr Lys Glu Glu Lys 1 5 10
15 Thr Leu Ala Lys Glu Ser Lys Arg Leu Arg Glu Leu Ala Leu
Thr Ser 20 25 30
Gly Leu Leu Ser Glu Lys Lys Ala Val Pro Asp Ala Pro Met His Pro
35 40 45 His Ser Gly Ile
Val Arg Cys Asp Gly Lys Asp Ile Cys Lys Lys Gly 50
55 60 His Arg Lys Asn Lys Tyr Leu Phe
Ser Phe Pro Gly Leu Val Ala Pro 65 70
75 80 Val Ala Val Gly Lys Phe Gly Asp Leu Thr Gln Leu
Asp Thr Lys Asn 85 90
95 Pro Ile Leu Tyr Val Asp Phe Leu Gln Ala Ser Arg Ala Phe Ala Gln
100 105 110 Thr Gly Arg
Leu Lys Leu Phe Gly Thr Ile Val Tyr Ser Lys Asn Lys 115
120 125 Tyr Ile Thr Leu Asn Phe Val Arg
Gly Ala Gly Ser Ile Gln Cys Glu 130 135
140 Asp Ile Phe Glu Asn Leu Val Val Phe Ser Asp Ala Trp
Trp Ile Gly 145 150 155
160 Thr Lys Glu Glu Asn Pro Asp Glu Leu Arg Leu Glu Met Pro Leu Asp
165 170 175 Phe Gln Gln Glu
Arg His Ala Val Tyr Asp Phe Ala Gly Gly Ala Gly 180
185 190 Lys Pro Arg Asn Ile Lys Asp Asp Val
Asp Val Gln Asp Ser Gln Leu 195 200
205 Glu Leu Val Gln Val Ser Glu Leu Glu Ser Ser Lys Gln Cys
Thr Pro 210 215 220
Lys Gly Gln Leu Lys Met Asp Arg Trp Leu Phe Gln Lys Lys Pro Ser 225
230 235 240 Glu Asn Lys Thr Leu
Glu Lys Ser Phe Glu Ser Asn Ala Lys Thr Lys 245
250 255 Pro Lys Asn Ala Ser Glu Trp Glu Ser Asp
Glu Asp Glu Glu Gly Phe 260 265
270 Val Asn Leu Gly Asp Ala Pro Thr Pro Ser Arg Gln Ser Ala Arg
Ile 275 280 285 Ala
Glu Lys Lys His Ser Tyr Ala Glu Ser Ser Ser Glu Glu Asn Gln 290
295 300 Thr Asp Gly Ser Asp Glu
His Asp Arg Ala Asp Ala Glu Leu Arg Asp 305 310
315 320 Pro Arg Asp Lys Thr Val Asn Lys Leu Phe Asn
Asp Val Glu Asp Gly 325 330
335 Phe Leu Ala Pro Glu Ser Gln Ile Ser Gln Met Asp Leu Ala Asp Met
340 345 350 Val Thr
Asp Arg Cys Val Trp Asn Val Val Tyr Asp Ser Ile Pro Leu 355
360 365 Ser Phe Phe Pro Gly Cys Cys
Arg Ile Arg Ser Pro Cys Val Phe His 370 375
380 Glu Arg Cys Val Gln Phe Thr Asn Ser Leu Lys Phe
Thr Ala Asn Leu 385 390 395
400 Val Tyr Met Lys Thr Tyr Ile Val Cys His Ala Val Ser Pro Gln Thr
405 410 415 Ser Val Pro
Lys Thr Ser Ala Ser Gly Met Ile Leu Ser Thr Ser Val 420
425 430 Leu Ala Asp Leu Thr Ala Asn Thr
Ile Gly Ala Gly Ser Lys Gln Ser 435 440
445 Ser Leu Ser Ala Phe Phe Met Lys Ser Ser Glu Lys Ser
Thr Leu Asp 450 455 460
Asn Asp Asp Ala Gly Lys Glu Asp Glu Asn Val Thr Pro Lys Asp Glu 465
470 475 480 Met Glu Ser Val
Leu Pro Cys Thr Pro Pro Asp Ile Asn Asp Ser Lys 485
490 495 Arg Lys Arg Lys Ser Thr Pro Asp Gly
Lys Arg Asn Ser Arg Asp Asp 500 505
510 Met Asn Ile Ser Arg Val Asp Asn Leu Pro Pro Ala Phe Ser
Asp Asn 515 520 525
Cys Arg Val Gly Thr Gly Gly Asn Gly Tyr His Ile Glu Gly Ala Ala 530
535 540 Val Asn Thr Gly Thr
Met Leu Thr Lys Gln Ser Asn Val His Asn Val 545 550
555 560 Ala Thr Leu Phe Tyr Gly Pro Gln Asn Lys
Ser Tyr Thr Ser Cys Glu 565 570
575 Asn Tyr Ala Leu Phe Phe Met Phe Cys His Ala Phe Gly Lys
580 585 590
SEQ ID NO: 21; length 2184; DNA; Physcomitrella patens
gagaatctca ttacgaagat tgattggatt
tttctacgct actgcatttc ccgtcgattg 60gtaacaaaac gccgtaacta ggagtgcact
gggggtatct atctaggagg aagctcttcg 120ttcatttaga aattcctcat gaaactggct
gtctgagctt gggtgaggaa gccgtggtag 180tcgggatacc gattcctaat ttgtttggcc
gtttcatcgt tgaggcgcgg tggttaaatg 240gtttagcaag catggggaag aagaagcagg
tagaagaggt gagtcagacg aaggaggata 300agtcgctgga gaagcagagt aagaagctgc
gggagctagc tcgatcgtgc ggattggtgt 360cggagaagaa agctttacca gcagaggcgt
tgcgtcctaa atggggtatc gtaaaatgcg 420atggtaaaga tatttgtaaa aaaggacata
gaaagaacaa atatctgttt tctttcccgg 480gtctcgtagc tcctgtttct ggtggtaagt
tcggtgaact gacgcaactg gactcgagaa 540atccaatcct ttatattgat tttccacagg
gtcgacttaa attatttgga accatcgtat 600atcccatcaa caagtacatt acaatgaatt
ttgttcgcgg agcaggaagc attctatgcg 660aagatctctt tgaaagtatg gtggtatttc
cggaggcttg gtgggtcggc aaaaaagaag 720aaaacccgga tgagctgcgt cttgatatgc
ctctggacct tcaacaggaa aaacatcagg 780tgtacgattt cacaggcgga gctggtgagc
caagagattc caggaaatat ggtgatgttc 840aacccataca atcggaactg gttcaggaaa
cacaactaga ttctagcaaa cttagcactc 900cgaaggctca ggtttctcag agaaaacctt
cagagaaaat tccgaagcac aaagttgtaa 960gtgaatggga gtcagatgat gatgatgatg
atcctggttt tgccattgtg ggagctgctc 1020cgacaccatc acgtcaatcg gctcgcacag
ctggaaagaa atactcgtat gcagaatcat 1080cttcagaaga gaatctatca gacgacgctg
atgaaagtga cgatttagat ggcaaacagg 1140gggaatctag aagcaaggct gccaataagg
ctattgaatt tgaggatgcc gacgatactc 1200tcttggttcc cgaatcacaa gcatctaaga
aggacgtgac tgatgcagtg gataaaaatc 1260cttctacaaa catgactatc acaattgatg
aacatgatga cgaagaggct agtgccatcg 1320accatttagc catgtcacaa accagagctc
caagcactgc aggtgacatg cttctttcaa 1380cttccgttca agcagttgca aatgcaagta
ctacaggagc tgggtccagg caatctactc 1440tgtctacgtt cttcttgaag tcaagcgaga
aggaaaaggt taaaaatgtg gagccgcaga 1500attctgtggt agatataggt ttcacaagga
cttatcaaag agaaaaactc acaaagctgc 1560agtcaacttt tgacaaggga cgcgctgctg
aagatgatga aaatgtggca tctaaaaccg 1620aagtggaatc tgtattacca tttacacctc
ctgaaagtaa cggttctaag cgcaaaagaa 1680aagctccatc agaaagaaaa caaatttcaa
aaggagttga aggaaagggt aagactcctg 1740taaaaagaag aaagaaaata gctgaagaca
aggagcctcg ggcaaaggat cagctgatac 1800tggtttctga tgacagtgat tcgagttgag
cacagagagt tggtcttaga atcctggact 1860ccacgacttc gcccttgagc acagtttagc
cctttcacga agtacgcaat gaaaaactta 1920tatggtaaag cgttttaatg tacacaataa
cgcctgttct gtgacagccg caagattcgg 1980tttaacgttg atggccttga cagcatatag
ttggactagc tgcgagccaa gatacattaa 2040ctgcattatt gatgtaaagg tcagtgtaca
tatgttccaa cacattacat tgattcagct 2100ggtatccgag acattcagca gattgtttgc
gacatgcatt caaactaatg tacatgacaa 2160aacgccactg gcagtcttca tcca
2184
SEQ ID NO: 22; length 525; PRT; Physcomitrella patens
Met
Gly Lys Lys Lys Gln Val Glu Glu Val Ser Gln Thr Lys Glu Asp 1
5 10 15 Lys Ser Leu Glu Lys Gln
Ser Lys Lys Leu Arg Glu Leu Ala Arg Ser 20
25 30 Cys Gly Leu Val Ser Glu Lys Lys Ala Leu
Pro Ala Glu Ala Leu Arg 35 40
45 Pro Lys Trp Gly Ile Val Lys Cys Asp Gly Lys Asp Ile Cys
Lys Lys 50 55 60
Gly His Arg Lys Asn Lys Tyr Leu Phe Ser Phe Pro Gly Leu Val Ala 65
70 75 80 Pro Val Ser Gly Gly
Lys Phe Gly Glu Leu Thr Gln Leu Asp Ser Arg 85
90 95 Asn Pro Ile Leu Tyr Ile Asp Phe Pro Gln
Gly Arg Leu Lys Leu Phe 100 105
110 Gly Thr Ile Val Tyr Pro Ile Asn Lys Tyr Ile Thr Met Asn Phe
Val 115 120 125 Arg
Gly Ala Gly Ser Ile Leu Cys Glu Asp Leu Phe Glu Ser Met Val 130
135 140 Val Phe Pro Glu Ala Trp
Trp Val Gly Lys Lys Glu Glu Asn Pro Asp 145 150
155 160 Glu Leu Arg Leu Asp Met Pro Leu Asp Leu Gln
Gln Glu Lys His Gln 165 170
175 Val Tyr Asp Phe Thr Gly Gly Ala Gly Glu Pro Arg Asp Ser Arg Lys
180 185 190 Tyr Gly
Asp Val Gln Pro Ile Gln Ser Glu Leu Val Gln Glu Thr Gln 195
200 205 Leu Asp Ser Ser Lys Leu Ser
Thr Pro Lys Ala Gln Val Ser Gln Arg 210 215
220 Lys Pro Ser Glu Lys Ile Pro Lys His Lys Val Val
Ser Glu Trp Glu 225 230 235
240 Ser Asp Asp Asp Asp Asp Asp Pro Gly Phe Ala Ile Val Gly Ala Ala
245 250 255 Pro Thr Pro
Ser Arg Gln Ser Ala Arg Thr Ala Gly Lys Lys Tyr Ser 260
265 270 Tyr Ala Glu Ser Ser Ser Glu Glu
Asn Leu Ser Asp Asp Ala Asp Glu 275 280
285 Ser Asp Asp Leu Asp Gly Lys Gln Gly Glu Ser Arg Ser
Lys Ala Ala 290 295 300
Asn Lys Ala Ile Glu Phe Glu Asp Ala Asp Asp Thr Leu Leu Val Pro 305
310 315 320 Glu Ser Gln Ala
Ser Lys Lys Asp Val Thr Asp Ala Val Asp Lys Asn 325
330 335 Pro Ser Thr Asn Met Thr Ile Thr Ile
Asp Glu His Asp Asp Glu Glu 340 345
350 Ala Ser Ala Ile Asp His Leu Ala Met Ser Gln Thr Arg Ala
Pro Ser 355 360 365
Thr Ala Gly Asp Met Leu Leu Ser Thr Ser Val Gln Ala Val Ala Asn 370
375 380 Ala Ser Thr Thr Gly
Ala Gly Ser Arg Gln Ser Thr Leu Ser Thr Phe 385 390
395 400 Phe Leu Lys Ser Ser Glu Lys Glu Lys Val
Lys Asn Val Glu Pro Gln 405 410
415 Asn Ser Val Val Asp Ile Gly Phe Thr Arg Thr Tyr Gln Arg Glu
Lys 420 425 430 Leu
Thr Lys Leu Gln Ser Thr Phe Asp Lys Gly Arg Ala Ala Glu Asp 435
440 445 Asp Glu Asn Val Ala Ser
Lys Thr Glu Val Glu Ser Val Leu Pro Phe 450 455
460 Thr Pro Pro Glu Ser Asn Gly Ser Lys Arg Lys
Arg Lys Ala Pro Ser 465 470 475
480 Glu Arg Lys Gln Ile Ser Lys Gly Val Glu Gly Lys Gly Lys Thr Pro
485 490 495 Val Lys
Arg Arg Lys Lys Ile Ala Glu Asp Lys Glu Pro Arg Ala Lys 500
505 510 Asp Gln Leu Ile Leu Val Ser
Asp Asp Ser Asp Ser Ser 515 520
525
SEQ ID NO: 23; length 1338; DNA; Solanum tuberosum
ggtttccagc aactgttccg gcggagttag
ggctatggct cgaggaggga agaaggcagc 60aaatggagaa tccaatccag atatggagga
gaagaagagg ttaaagaaac ttgcaatttc 120gaagcaaatg gtctcagaga atccttcaag
ggataataat tctctgaatc catcaaaaac 180tgtgattaaa catcatggta aagacatttt
gcgcaaatct caacggaaga atcgtttcct 240cttttctctt cccggtttac ttgccccggt
ttccgggggt aaaattggtg agctcaaaga 300ccttggtacc aaaaacccca ttctctacct
cgacttccct cagggtcaaa tgaagttgtt 360tgggacaatt gtatatccaa aaaatggtta
tctgactatg cagttctcca gaggtgggaa 420aaatgtagtg tgcgaagatt accttgacaa
tatgattgtg ttttctgatg catggtggat 480agggaggaaa gatgagaatc ctgaagaagc
acgactcgag tttccaaaag agctgaatgt 540gcagcaagag aaatcggagt gtgattttaa
aggtggtgct ggtgctacat gtgttcaaaa 600acgaagtact agtgaatgtg gggtcaagca
tgtggaacaa cagtctcctg aacatgaaca 660ggaggagtta ttatcagaaa gtcaaaatga
ttcaaaagag tttatcgaat taactccatc 720tcgtcgttca gcaagggcgg caggaaaaaa
aatcaatttt gcagaagttt cttccgggga 780tgaattggtt gacaatgaag tcgaatcttc
cgagggggag gagaaaactg gcagtgacat 840tctttgtgat gaaactgtag tacaaagtca
agttactgga aaaattactg cccttgccga 900aactgcttcc aagtctaaga aatccgctcg
tacaaagcaa agttctctcg ttcaggctac 960tatttcaaca atgtttaaga aagtggacaa
gcttgtcact ccagatagag tttctcaaag 1020gaaaacaaga aaatcaacaa acaaagggga
atccaacaca gaatgtggtt caaccatgcc 1080tgatcatgtt ggtacttctc agggtgaaga
tgacattgaa gagttgtcta gttcatctaa 1140ggatacagaa gctagtgatg aagattgggc
tgcttgagtt ttggatttta tgttttaatc 1200gatagaatat gaccaccatt tctcatggat
ctgcaaatta ctcttggcgg acaatggttg 1260taatccaatt gaagaagctg ctaatctgga
tgcaaaggga gacaccacga agtgaacatc 1320acatgggctt tgcatgtc
1338
SEQ ID NO: 24; length 380; PRT; Solanum tuberosum
Met Ala
Arg Gly Gly Lys Lys Ala Ala Asn Gly Glu Ser Asn Pro Asp 1 5
10 15 Met Glu Glu Lys Lys Arg Leu
Lys Lys Leu Ala Ile Ser Lys Gln Met 20 25
30 Val Ser Glu Asn Pro Ser Arg Asp Asn Asn Ser Leu
Asn Pro Ser Lys 35 40 45
Thr Val Ile Lys His His Gly Lys Asp Ile Leu Arg Lys Ser Gln Arg
50 55 60 Lys Asn Arg
Phe Leu Phe Ser Leu Pro Gly Leu Leu Ala Pro Val Ser 65
70 75 80 Gly Gly Lys Ile Gly Glu Leu
Lys Asp Leu Gly Thr Lys Asn Pro Ile 85
90 95 Leu Tyr Leu Asp Phe Pro Gln Gly Gln Met Lys
Leu Phe Gly Thr Ile 100 105
110 Val Tyr Pro Lys Asn Gly Tyr Leu Thr Met Gln Phe Ser Arg Gly
Gly 115 120 125 Lys
Asn Val Val Cys Glu Asp Tyr Leu Asp Asn Met Ile Val Phe Ser 130
135 140 Asp Ala Trp Trp Ile Gly
Arg Lys Asp Glu Asn Pro Glu Glu Ala Arg 145 150
155 160 Leu Glu Phe Pro Lys Glu Leu Asn Val Gln Gln
Glu Lys Ser Glu Cys 165 170
175 Asp Phe Lys Gly Gly Ala Gly Ala Thr Cys Val Gln Lys Arg Ser Thr
180 185 190 Ser Glu
Cys Gly Val Lys His Val Glu Gln Gln Ser Pro Glu His Glu 195
200 205 Gln Glu Glu Leu Leu Ser Glu
Ser Gln Asn Asp Ser Lys Glu Phe Ile 210 215
220 Glu Leu Thr Pro Ser Arg Arg Ser Ala Arg Ala Ala
Gly Lys Lys Ile 225 230 235
240 Asn Phe Ala Glu Val Ser Ser Gly Asp Glu Leu Val Asp Asn Glu Val
245 250 255 Glu Ser Ser
Glu Gly Glu Glu Lys Thr Gly Ser Asp Ile Leu Cys Asp 260
265 270 Glu Thr Val Val Gln Ser Gln Val
Thr Gly Lys Ile Thr Ala Leu Ala 275 280
285 Glu Thr Ala Ser Lys Ser Lys Lys Ser Ala Arg Thr Lys
Gln Ser Ser 290 295 300
Leu Val Gln Ala Thr Ile Ser Thr Met Phe Lys Lys Val Asp Lys Leu 305
310 315 320 Val Thr Pro Asp
Arg Val Ser Gln Arg Lys Thr Arg Lys Ser Thr Asn 325
330 335 Lys Gly Glu Ser Asn Thr Glu Cys Gly
Ser Thr Met Pro Asp His Val 340 345
350 Gly Thr Ser Gln Gly Glu Asp Asp Ile Glu Glu Leu Ser Ser
Ser Ser 355 360 365
Lys Asp Thr Glu Ala Ser Asp Glu Asp Trp Ala Ala 370
375 380
SEQ ID NO: 25; length 1083; DNA; Vitis shuttleworthii
ctacaaattt
gggatgaatc aacttgttga agctgcagcc ttaatttcat ccacaaacca 60tcatccttgc
tctaattgac tgtttgaagc ttagaaagtt gtcacactgg ttttctggga 120tatgattatc
agtaactctt aataactatt ctccaagaga agtagggcag gcagaatggt 180acgagtttca
aagaagaatg aaaatggtgg agtatctgaa ctgaatccag aagctgaaga 240gcgtaaaaga
cgaaaaaaat tggcgttctc caagaactta ctgtcagata ctccttcaaa 300agcgttttca
gctctgagcc cttcaaaaac agtgatcaaa caccatggaa aagatattct 360gaagaaatct
cagaggaaga atcggttcct cttctcattc ccaggtcttc ttgctcctat 420cgctggcggc
aagatcggtg aactcaagga tttgggaacc aagaatccta tactctacct 480tgatttccct
cagggtcaaa tgaagttgtt tgggactata gtttacccga agaacaggta 540tttgactctg
catttctcta gaggcggaaa aaatgtaatg tgtgaggatt actttgataa 600tatgattgta
ttttctgatg catggtggat tgggagaaag gaggagaatc cagaagaagc 660ccgactcgag
tttcctaaag aactgagtga aggacaaagt gttgaatacg attttaaagg 720gggtgcaggc
atggcatctg acagtaagca aggtgttaat aaacctgaaa tgaaatatgt 780agaaccgcag
tcacctaaac ctgagctaga agatgatttg tctggtgaag acagtttgaa 840agatgtggtt
gaaatgacac cgaaagatgt tgaagtgaca ccagttcgac attcacagag 900aactgcagga
aaaacattca attttgcaga agcttcttct ggagatgatt ctgttgaaaa 960tgatggcaac
atatctgatg gacaagaaaa ttctggctct gcaacacctg aaagtggcaa 1020tgaagatgct
gaagcaagga ctcgagcaac cacacaaatt caagagtctg ctggggcagc 1080tac
1083
SEQ ID NO: 26; length 303; PRT; Vitis shuttleworthii
Met Val Arg Val Ser Lys Lys Asn Glu Asn Gly Gly Val Ser
Glu Leu 1 5 10 15
Asn Pro Glu Ala Glu Glu Arg Lys Arg Arg Lys Lys Leu Ala Phe Ser
20 25 30 Lys Asn Leu Leu Ser
Asp Thr Pro Ser Lys Ala Phe Ser Ala Leu Ser 35
40 45 Pro Ser Lys Thr Val Ile Lys His His
Gly Lys Asp Ile Leu Lys Lys 50 55
60 Ser Gln Arg Lys Asn Arg Phe Leu Phe Ser Phe Pro Gly
Leu Leu Ala 65 70 75
80 Pro Ile Ala Gly Gly Lys Ile Gly Glu Leu Lys Asp Leu Gly Thr Lys
85 90 95 Asn Pro Ile Leu
Tyr Leu Asp Phe Pro Gln Gly Gln Met Lys Leu Phe 100
105 110 Gly Thr Ile Val Tyr Pro Lys Asn Arg
Tyr Leu Thr Leu His Phe Ser 115 120
125 Arg Gly Gly Lys Asn Val Met Cys Glu Asp Tyr Phe Asp Asn
Met Ile 130 135 140
Val Phe Ser Asp Ala Trp Trp Ile Gly Arg Lys Glu Glu Asn Pro Glu 145
150 155 160 Glu Ala Arg Leu Glu
Phe Pro Lys Glu Leu Ser Glu Gly Gln Ser Val 165
170 175 Glu Tyr Asp Phe Lys Gly Gly Ala Gly Met
Ala Ser Asp Ser Lys Gln 180 185
190 Gly Val Asn Lys Pro Glu Met Lys Tyr Val Glu Pro Gln Ser Pro
Lys 195 200 205 Pro
Glu Leu Glu Asp Asp Leu Ser Gly Glu Asp Ser Leu Lys Asp Val 210
215 220 Val Glu Met Thr Pro Lys
Asp Val Glu Val Thr Pro Val Arg His Ser 225 230
235 240 Gln Arg Thr Ala Gly Lys Thr Phe Asn Phe Ala
Glu Ala Ser Ser Gly 245 250
255 Asp Asp Ser Val Glu Asn Asp Gly Asn Ile Ser Asp Gly Gln Glu Asn
260 265 270 Ser Gly
Ser Ala Thr Pro Glu Ser Gly Asn Glu Asp Ala Glu Ala Arg 275
280 285 Thr Arg Ala Thr Thr Gln Ile
Gln Glu Ser Ala Gly Ala Ala Thr 290 295
300
SEQ ID NO: 27; length 1350; DNA; Vitis vinifera
atggtacgag tttcaaagaa
gaatgaaaat ggtggagtat ctgaactgaa tccagaagct 60gaagagcgta aaagacgaaa
aaaattggcg ttctccaaga acttactgtc agatactcct 120tcaaaagcgt tttcagctct
gagcccttca aaaacagtga tcaaacacca tggaaaagat 180attctgaaga aatctcagag
gaagaatcgg ttcctcttct cattcccagg tcttcttgct 240cctattgctg gtggcaagat
tggtgaactc aaggatttgg gaaccaagaa tcctatactc 300taccttgatt tccctcaggg
tcaaatgaag ttgtttggga ctatagttta cccgaagaac 360aggtatttga ctctgcattt
ctctagaggc ggaaaaaatg taatgtgtga ggattacttt 420gataatatga ttgtattttc
tgatgcatgg tggattggga gaaaggagga gaatccagaa 480gaagcccgac tcgagtttcc
taaagaactg agtgaaggac aaagtgttga atacgacttt 540aaagggggtg caggcatggc
atctgacagt aagcaaggtg ttaataaacc tgaaatgaaa 600tatgtagaac cgcagtcacc
taaacctgag ctagaagatg atttgtctgg tgaagacagt 660ttgaaagatg tggttgaaat
gacaccgaaa gatgttgaag tgacaccagt tcgacattca 720cagagaactg caggaaaaac
attcaatttt gcagaagctt cttctggaga tgattctgtt 780gaaaatgatg gcaacatatc
tgatggacaa gaaaattctg gctctgcaac acctgaaagt 840ggcaatgaag atgctgaagc
aaggactgga gcaaccacac aaattcaaga gtctgctggg 900gcagctacca agtcaaggaa
acgactatct caagctacta tatccacatt gtttaagaaa 960gtggaggaac agaaaacatc
cagaactcca aggaaatcct catcagccaa agcttctgct 1020cagaagactg attccaggaa
agctccggaa cacgggaaaa aaagaaaagt aattgaggaa 1080acaaaatctg agatagacat
ctcaacagaa agtgaacaat ctgatgagga aaagaaaaca 1140tctagaaccc caaggaaatc
gtcatcaacc aaagtttctg cccggaagac tgatgccagg 1200aaagcccagg gacccaggaa
aaggagaaaa gtaatcgagg aaacaaaatc tgagatagac 1260atctcaacag aaggcgagca
atctgataat ccgacctctg atgcttctgt tagagtgtac 1320aagagaaaga tgaaaagccc
tgcagcttaa 1350
SEQ ID NO: 28; length 449; PRT; Vitis vinifera
Met Val Arg Val Ser Lys Lys Asn Glu Asn Gly Gly Val Ser Glu Leu 1
5 10 15 Asn Pro Glu Ala
Glu Glu Arg Lys Arg Arg Lys Lys Leu Ala Phe Ser 20
25 30 Lys Asn Leu Leu Ser Asp Thr Pro Ser
Lys Ala Phe Ser Ala Leu Ser 35 40
45 Pro Ser Lys Thr Val Ile Lys His His Gly Lys Asp Ile Leu
Lys Lys 50 55 60
Ser Gln Arg Lys Asn Arg Phe Leu Phe Ser Phe Pro Gly Leu Leu Ala 65
70 75 80 Pro Ile Ala Gly Gly
Lys Ile Gly Glu Leu Lys Asp Leu Gly Thr Lys 85
90 95 Asn Pro Ile Leu Tyr Leu Asp Phe Pro Gln
Gly Gln Met Lys Leu Phe 100 105
110 Gly Thr Ile Val Tyr Pro Lys Asn Arg Tyr Leu Thr Leu His Phe
Ser 115 120 125 Arg
Gly Gly Lys Asn Val Met Cys Glu Asp Tyr Phe Asp Asn Met Ile 130
135 140 Val Phe Ser Asp Ala Trp
Trp Ile Gly Arg Lys Glu Glu Asn Pro Glu 145 150
155 160 Glu Ala Arg Leu Glu Phe Pro Lys Glu Leu Ser
Glu Gly Gln Ser Val 165 170
175 Glu Tyr Asp Phe Lys Gly Gly Ala Gly Met Ala Ser Asp Ser Lys Gln
180 185 190 Gly Val
Asn Lys Pro Glu Met Lys Tyr Val Glu Pro Gln Ser Pro Lys 195
200 205 Pro Glu Leu Glu Asp Asp Leu
Ser Gly Glu Asp Ser Leu Lys Asp Val 210 215
220 Val Glu Met Thr Pro Lys Asp Val Glu Val Thr Pro
Val Arg His Ser 225 230 235
240 Gln Arg Thr Ala Gly Lys Thr Phe Asn Phe Ala Glu Ala Ser Ser Gly
245 250 255 Asp Asp Ser
Val Glu Asn Asp Gly Asn Ile Ser Asp Gly Gln Glu Asn 260
265 270 Ser Gly Ser Ala Thr Pro Glu Ser
Gly Asn Glu Asp Ala Glu Ala Arg 275 280
285 Thr Gly Ala Thr Thr Gln Ile Gln Glu Ser Ala Gly Ala
Ala Thr Lys 290 295 300
Ser Arg Lys Arg Leu Ser Gln Ala Thr Ile Ser Thr Leu Phe Lys Lys 305
310 315 320 Val Glu Glu Gln
Lys Thr Ser Arg Thr Pro Arg Lys Ser Ser Ser Ala 325
330 335 Lys Ala Ser Ala Gln Lys Thr Asp Ser
Arg Lys Ala Pro Glu His Gly 340 345
350 Lys Lys Arg Lys Val Ile Glu Glu Thr Lys Ser Glu Ile Asp
Ile Ser 355 360 365
Thr Glu Ser Glu Gln Ser Asp Glu Glu Lys Lys Thr Ser Arg Thr Pro 370
375 380 Arg Lys Ser Ser Ser
Thr Lys Val Ser Ala Arg Lys Thr Asp Ala Arg 385 390
395 400 Lys Ala Gln Gly Pro Arg Lys Arg Arg Lys
Val Ile Glu Glu Thr Lys 405 410
415 Ser Glu Ile Asp Ile Ser Thr Glu Gly Glu Gln Ser Asp Asn Pro
Thr 420 425 430 Ser
Asp Ala Ser Val Arg Val Tyr Lys Arg Lys Met Lys Ser Pro Ala 435
440 445 Ala
SEQ ID NO: 29; length 21; PRT; Artificial sequence; motif 1
Ile Arg Arg Lys Ser Gln Arg Lys Asn Arg Phe Leu Phe Ser Phe Pro Gly Leu Leu Ala Pro
SEQ ID NO: 30; length 29; PRT; Artificial sequence; motif 2
Ser Gly Gly Lys Ile Gly Glu Leu Lys Asp Leu Gly Thr Lys Asn Pro Ile Leu Tyr Leu Asp Phe Pro Gln Gly Arg Met Lys Leu
SEQ ID NO: 31; length 21; PRT; Artificial sequence; motif 3
Thr Pro Val Arg Gln Ser Ala Arg Thr Ala Gly Lys Lys Phe Lys Phe Ala Glu Xaa Ser Ser
SEQ ID NO: 32; length 21; PRT; Artificial sequence; motif 4
Gly Thr Lys Glu Glu Asn Pro Glu Glu Leu Arg Leu Asp Phe Pro Lys Glu Leu Gln Glu Gly
SEQ ID NO: 33; length 29; PRT; Artificial sequence; motif 5
Ser Gly Asn Leu Leu Ser Glu Xaa Pro Ala Lys Pro Arg Ser Ala Leu Ala Pro Ser Lys Thr Val Leu Lys His His Gly Lys Asp
SEQ ID NO: 34; length 18; PRT; Artificial sequence; motif 6
His Ala Glu Cys Asp Phe Lys Gly Gly Ala Gly Ala Ala Cys Asp Glu Lys Gln
SEQ ID NO: 35; length 29; PRT; Artificial sequence; motif 7
Lys Lys Pro Gly Glu Lys Tyr Val Glu Glu Glu Ser Pro Lys Ile Glu Ser Glu Asp Asp Leu Ser Glu Asp Ser Asn Leu Lys Asp
SEQ ID NO: 36; length 21; PRT; Artificial sequence; motif 8
Lys Gly Pro Ala Ala Lys Lys Gln Arg Ala Ser Pro Glu Glu Lys His Pro Thr Gly Lys Lys
SEQ ID NO: 37; length 29; PRT; Artificial sequence; motif 9
Ser Val Met Cys Glu Asp Tyr Phe Asp Asn Met Ile Val Phe Ser Asp Ala Trp Trp Ile Gly Thr Lys Glu Glu Asn Pro Glu Glu
SEQ ID NO: 38; length 29; PRT; Artificial sequence; motif 10
Leu Ala Ala Pro Ile Ser Gly Gly Lys Ile Gly Glu Leu Lys Asp Leu Gly Thr Lys Asn Pro Ile Leu Tyr Leu Asp Phe Pro Gln
SEQ ID NO: 39; length 21; PRT; Artificial sequence; motif 11
Gly Arg Met Lys Leu Phe Gly Thr Ile Val Tyr Pro Lys Asn Arg Tyr Leu Thr Leu Gln Phe
SEQ ID NO: 40; length 55; DNA; Artificial sequence; primer 1
ggggacaagt ttgtacaaaa aagcaggctt aaacaatggt acgagcttca tcgtc
SEQ ID NO: 41; length 52; DNA; Artificial sequence; primer 2
ggggaccact ttgtacaaga aagctgggtt tctggaaaag atttctttaa gc
SEQ ID NO: 42; length 2194; DNA; Oryza sativa
aatccgaaaa
gtttctgcac cgttttcacc ccctaactaa caatataggg aacgtgtgct 60aaatataaaa
tgagacctta tatatgtagc gctgataact agaactatgc aagaaaaact 120catccaccta
ctttagtggc aatcgggcta aataaaaaag agtcgctaca ctagtttcgt 180tttccttagt
aattaagtgg gaaaatgaaa tcattattgc ttagaatata cgttcacatc 240tctgtcatga
agttaaatta ttcgaggtag ccataattgt catcaaactc ttcttgaata 300aaaaaatctt
tctagctgaa ctcaatgggt aaagagagag atttttttta aaaaaataga 360atgaagatat
tctgaacgta ttggcaaaga tttaaacata taattatata attttatagt 420ttgtgcattc
gtcatatcgc acatcattaa ggacatgtct tactccatcc caatttttat 480ttagtaatta
aagacaattg acttattttt attatttatc ttttttcgat tagatgcaag 540gtacttacgc
acacactttg tgctcatgtg catgtgtgag tgcacctcct caatacacgt 600tcaactagca
acacatctct aatatcactc gcctatttaa tacatttagg tagcaatatc 660tgaattcaag
cactccacca tcaccagacc acttttaata atatctaaaa tacaaaaaat 720aattttacag
aatagcatga aaagtatgaa acgaactatt taggtttttc acatacaaaa 780aaaaaaagaa
ttttgctcgt gcgcgagcgc caatctccca tattgggcac acaggcaaca 840acagagtggc
tgcccacaga acaacccaca aaaaacgatg atctaacgga ggacagcaag 900tccgcaacaa
ccttttaaca gcaggctttg cggccaggag agaggaggag aggcaaagaa 960aaccaagcat
cctccttctc ccatctataa attcctcccc ccttttcccc tctctatata 1020ggaggcatcc
aagccaagaa gagggagagc accaaggaca cgcgactagc agaagccgag 1080cgaccgcctt
ctcgatccat atcttccggt cgagttcttg gtcgatctct tccctcctcc 1140acctcctcct
cacagggtat gtgcctccct tcggttgttc ttggatttat tgttctaggt 1200tgtgtagtac
gggcgttgat gttaggaaag gggatctgta tctgtgatga ttcctgttct 1260tggatttggg
atagaggggt tcttgatgtt gcatgttatc ggttcggttt gattagtagt 1320atggttttca
atcgtctgga gagctctatg gaaatgaaat ggtttaggga tcggaatctt 1380gcgattttgt
gagtaccttt tgtttgaggt aaaatcagag caccggtgat tttgcttggt 1440gtaataaagt
acggttgttt ggtcctcgat tctggtagtg atgcttctcg atttgacgaa 1500gctatccttt
gtttattccc tattgaacaa aaataatcca actttgaaga cggtcccgtt 1560gatgagattg
aatgattgat tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga 1620tacagtagtc
cccatcacga aattcatgga aacagttata atcctcagga acaggggatt 1680ccctgttctt
ccgatttgct ttagtcccag aatttttttt cccaaatatc ttaaaaagtc 1740actttctggt
tcagttcaat gaattgattg ctacaaataa tgcttttata gcgttatcct 1800agctgtagtt
cagttaatag gtaatacccc tatagtttag tcaggagaag aacttatccg 1860atttctgatc
tccattttta attatatgaa atgaactgta gcataagcag tattcatttg 1920gattattttt
tttattagct ctcacccctt cattattctg agctgaaagt ctggcatgaa 1980ctgtcctcaa
ttttgttttc aaattcacat cgattatcta tgcattatcc tcttgtatct 2040acctgtagaa
gtttcttttt ggttattcct tgactgcttg attacagaaa gaaatttatg 2100aagctgtaat
cgggatagtt atactgcttg ttcttatgat tcatttcctt tgtgcagttc 2160ttggtgtagc
ttgccacttt caccagcaaa gttc
2194
SEQ ID NO: 43; length 1264; DNA; Oryza sativa
tcgacgctac tcaagtggtg ggaggccacc gcatgttcca
acgaagcgcc aaagaaagcc 60ttgcagactc taatgctatt agtcgcctag gatatttgga
atgaaaggaa ccgcagagtt 120tttcagcacc aagagcttcc ggtggctagt ctgatagcca
aaattaagga ggatgccaaa 180acatgggtct tggcgggcgc gaaacacctt gataggtggc
ttacctttta acatgttcgg 240gccaaaggcc ttgagacggt aaagttttct atttgcgctt
gcgcatgtac aattttattc 300ctctattcaa tgaaattggt ggctcactgg ttcattaaaa
aaaaaagaat ctagcctgtt 360cgggaagaag aggattttgt tcgtgagaga gagagagaga
gagagagaga gagagagaga 420gaaggaggag gaggattttc aggcttcgca ttgcccaacc
tctgcttctg ttggcccaag 480aagaatccca ggcgcccatg ggctggcagt ttaccacgga
cctacctagc ctaccttagc 540tatctaagcg ggccgaccta gtagccacgt gcctagtgta
gattaaagtt gccgggccag 600caggaagcca cgctgcaatg gcatcttccc ctgtccttcg
cgtacgtgaa aacaaaccca 660ggtaagctta gaatcttctt gcccgttgga ctgggacacc
caccaatccc accatgcccc 720gatattcctc cggtctcggt tcatgtgatg tcctctcttg
tgtgatcacg gagcaagcat 780tcttaaacgg caaaagaaaa tcaccaactt gctcacgcag
tcacgctgca ccgcgcgaag 840cgacgcccga taggccaaga tcgcgagata aaataacaac
caatgatcat aaggaaacaa 900gcccgcgatg tgtcgtgtgc agcaatcttg gtcatttgcg
ggatcgagtg cttcacagct 960aaccaaatat tcggccgatg atttaacaca ttatcagcgt
agatgtacgt acgatttgtt 1020aattaatcta cgagccttgc tagggcaggt gttctgccag
ccaatccaga tcgccctcgt 1080atgcacgctc acatgatggc agggcagggt tcacatgagc
tctaacggtc gattaattaa 1140tcccggggct cgactataaa tacctcccta atcccatgat
caaaaccatc tcaagcagcc 1200taatcatctc cagctgatca agagctctta attagctagc
tagtgattag ctgcgcttgt 1260gatc
1264
SEQ ID NO: 44; length 1221; DNA; Oryza sativa
atggcatacc atggacagct
ggatggacgc caagcttcag gtttgatgcg tgatggcgcc 60ttccctgcag ccagcctttc
tggccgccaa cctttggatc gcgctgccac cgctctggag 120atcttggaaa agaaacttgc
tgagcagacc gccgaggcag aaaagcttat cagagagaat 180cagcgattgg catctagcca
tgtcgtcttg aggcaggata ttgttgatac tgagaaagaa 240atgcaaatga tccgtgctca
cctaggtgat gttcagacag agactgatat gcatatgaga 300gatttgatgg agagaatgag
attgatggaa gcagatatac aagctggtga tgcagtgaag 360aaggaacttc atcaagtgca
tatggaggca aagagactta ttgctgagag gcagatgctc 420actgttgaga tggataaagt
aactaaagag ctacataagt tctctggtga cagtaagaaa 480cttcctgaat tactgactga
gctagatggg ctccgaaaag agcatcagag tctaagatct 540gcttttgaat atgagaaaaa
cacaaacatc aagcaagttg agcagatgcg gacgatggag 600atgaatttaa tgactatgac
caaagaggct gacaagttgc gtgctgatgt ggcaaatgct 660gaaaaacgag ctcaagtggc
agcggctcaa gcagtagcag cacaggcggg ggtggcacat 720gtgactgctt cacaaccagg
ggcagcacaa gctgtggcag tgccagctgc ctcaaaccca 780tattcaagtg catttaccgg
tcatccctct gcatatcacc aaggagccac ccaagctggg 840gtttatcagc aagggaccac
ccaagttggg gcatatcagc aaggatctac ccaagctggg 900gcatatgctt acccaactta
tgatgccgct actgcttacc agatgcatgc tgcgcaagca 960aatgcatacg cgggctatcc
tggttatcca gttgcagggt acacacaggc cgctttgccc 1020ggttatccta gtgcgtatgc
tgcaccgcag caaccaataa gcagtggtgt agctacagat 1080gttgcaagca tgtatggcgc
gatcagtagt gctggatatc ctgctggagt tgttcagtca 1140agcagtggag ctgccaatgc
aggacaagca ccagctactt atcctgtcgc atacgaccca 1200accagagcag gccagaggtg a
1221
SEQ ID NO: 45; length 406; PRT; Oryza sativa
Met Ala Tyr His Gly Gln Leu Asp Gly Arg Gln Ala Ser Gly Leu Met 1
5 10 15 Arg Asp Gly Ala
Phe Pro Ala Ala Ser Leu Ser Gly Arg Gln Pro Leu 20
25 30 Asp Arg Ala Ala Thr Ala Leu Glu Ile
Leu Glu Lys Lys Leu Ala Glu 35 40
45 Gln Thr Ala Glu Ala Glu Lys Leu Ile Arg Glu Asn Gln Arg
Leu Ala 50 55 60
Ser Ser His Val Val Leu Arg Gln Asp Ile Val Asp Thr Glu Lys Glu 65
70 75 80 Met Gln Met Ile Arg
Ala His Leu Gly Asp Val Gln Thr Glu Thr Asp 85
90 95 Met His Met Arg Asp Leu Met Glu Arg Met
Arg Leu Met Glu Ala Asp 100 105
110 Ile Gln Ala Gly Asp Ala Val Lys Lys Glu Leu His Gln Val His
Met 115 120 125 Glu
Ala Lys Arg Leu Ile Ala Glu Arg Gln Met Leu Thr Val Glu Met 130
135 140 Asp Lys Val Thr Lys Glu
Leu His Lys Phe Ser Gly Asp Ser Lys Lys 145 150
155 160 Leu Pro Glu Leu Leu Thr Glu Leu Asp Gly Leu
Arg Lys Glu His Gln 165 170
175 Ser Leu Arg Ser Ala Phe Glu Tyr Glu Lys Asn Thr Asn Ile Lys Gln
180 185 190 Val Glu
Gln Met Arg Thr Met Glu Met Asn Leu Met Thr Met Thr Lys 195
200 205 Glu Ala Asp Lys Leu Arg Ala
Asp Val Ala Asn Ala Glu Lys Arg Ala 210 215
220 Gln Val Ala Ala Ala Gln Ala Val Ala Ala Gln Ala
Gly Val Ala His 225 230 235
240 Val Thr Ala Ser Gln Pro Gly Ala Ala Gln Ala Val Ala Val Pro Ala
245 250 255 Ala Ser Asn
Pro Tyr Ser Ser Ala Phe Thr Gly His Pro Ser Ala Tyr 260
265 270 His Gln Gly Ala Thr Gln Ala Gly
Val Tyr Gln Gln Gly Thr Thr Gln 275 280
285 Val Gly Ala Tyr Gln Gln Gly Ser Thr Gln Ala Gly Ala
Tyr Ala Tyr 290 295 300
Pro Thr Tyr Asp Ala Ala Thr Ala Tyr Gln Met His Ala Ala Gln Ala 305
310 315 320 Asn Ala Tyr Ala
Gly Tyr Pro Gly Tyr Pro Val Ala Gly Tyr Thr Gln 325
330 335 Ala Ala Leu Pro Gly Tyr Pro Ser Ala
Tyr Ala Ala Pro Gln Gln Pro 340 345
350 Ile Ser Ser Gly Val Ala Thr Asp Val Ala Ser Met Tyr Gly
Ala Ile 355 360 365
Ser Ser Ala Gly Tyr Pro Ala Gly Val Val Gln Ser Ser Ser Gly Ala 370
375 380 Ala Asn Ala Gly Gln
Ala Pro Ala Thr Tyr Pro Val Ala Tyr Asp Pro 385 390
395 400 Thr Arg Ala Gly Gln Arg
405 46 1080 DNA Arabidopsis thaliana 46 atggaaagca aaggaagaat ccatccatct
catcatcata tgaggcgtcc tcttccaggt 60cccggtggct gtatagcgca tccggagact
ttcggtaatc acggtgctat accaccttct 120gctgctcaag gtgtgtatcc ttccttcaac
atgttacctc cacctgaagt tatggagcaa 180aagtttgtgg cacaacacgg ggaattacag
agacttgcta tagagaatca gagacttggt 240ggaactcatg gtagtttaag acaagagtta
gcagcagcac agcatgaaat acagatgttg 300cacgcgcaaa ttgggtcgat gaagtccgag
agagagcaac ggatgatggg tcttgctgag 360aaagttgcta aaatggagac tgagcttcag
aaatctgagg ctgttaagtt ggagatgcaa 420caagcacgtg ctgaggcacg gagtcttgtt
gtggctaggg aggagcttat gtctaaagtg 480catcagttga ctcaggaact tcaaaaatct
cgttctgatg tgcagcaaat acctgctctg 540atgtctgaac ttgagaatct aagacaggag
taccagcagt gcagggcaac atatgactat 600gagaagaagt tttataatga ccatctcgag
tcacttcagg caatggagaa gaactacatg 660actatggcta gggaagttga aaaacttcaa
gcacagttga tgaacaatgc aaattcagat 720agaagagcag gtggccctta tggtaacaac
ataaatgctg aaattgacgc ttctggacat 780cagagtggaa acggttatta tgaagatgct
tttggtcctc agggatatat tcctcaacca 840gtagctggta acgcaactgg accaaattca
gttgttggcg cagctcaata cccttatcaa 900ggagtaactc agccaggata cttccctcaa
agacccggtt acaactttcc aagaggccct 960cctggttcat atgacccaac aacaaggtta
ccaacaggac cttacggcgc tccattccca 1020cctggaccat ctaacaatac tccttacgcc
ggtacacacg gaaaccctag tcgcagatga 1080 47 359 PRT Arabidopsis thaliana 47 Met
Glu Ser Lys Gly Arg Ile His Pro Ser His His His Met Arg Arg 1
5 10 15 Pro Leu Pro Gly Pro Gly
Gly Cys Ile Ala His Pro Glu Thr Phe Gly 20
25 30 Asn His Gly Ala Ile Pro Pro Ser Ala Ala
Gln Gly Val Tyr Pro Ser 35 40
45 Phe Asn Met Leu Pro Pro Pro Glu Val Met Glu Gln Lys Phe
Val Ala 50 55 60
Gln His Gly Glu Leu Gln Arg Leu Ala Ile Glu Asn Gln Arg Leu Gly 65
70 75 80 Gly Thr His Gly Ser
Leu Arg Gln Glu Leu Ala Ala Ala Gln His Glu 85
90 95 Ile Gln Met Leu His Ala Gln Ile Gly Ser
Met Lys Ser Glu Arg Glu 100 105
110 Gln Arg Met Met Gly Leu Ala Glu Lys Val Ala Lys Met Glu Thr
Glu 115 120 125 Leu
Gln Lys Ser Glu Ala Val Lys Leu Glu Met Gln Gln Ala Arg Ala 130
135 140 Glu Ala Arg Ser Leu Val
Val Ala Arg Glu Glu Leu Met Ser Lys Val 145 150
155 160 His Gln Leu Thr Gln Glu Leu Gln Lys Ser Arg
Ser Asp Val Gln Gln 165 170
175 Ile Pro Ala Leu Met Ser Glu Leu Glu Asn Leu Arg Gln Glu Tyr Gln
180 185 190 Gln Cys
Arg Ala Thr Tyr Asp Tyr Glu Lys Lys Phe Tyr Asn Asp His 195
200 205 Leu Glu Ser Leu Gln Ala Met
Glu Lys Asn Tyr Met Thr Met Ala Arg 210 215
220 Glu Val Glu Lys Leu Gln Ala Gln Leu Met Asn Asn
Ala Asn Ser Asp 225 230 235
240 Arg Arg Ala Gly Gly Pro Tyr Gly Asn Asn Ile Asn Ala Glu Ile Asp
245 250 255 Ala Ser Gly
His Gln Ser Gly Asn Gly Tyr Tyr Glu Asp Ala Phe Gly 260
265 270 Pro Gln Gly Tyr Ile Pro Gln Pro
Val Ala Gly Asn Ala Thr Gly Pro 275 280
285 Asn Ser Val Val Gly Ala Ala Gln Tyr Pro Tyr Gln Gly
Val Thr Gln 290 295 300
Pro Gly Tyr Phe Pro Gln Arg Pro Gly Tyr Asn Phe Pro Arg Gly Pro 305
310 315 320 Pro Gly Ser Tyr
Asp Pro Thr Thr Arg Leu Pro Thr Gly Pro Tyr Gly 325
330 335 Ala Pro Phe Pro Pro Gly Pro Ser Asn
Asn Thr Pro Tyr Ala Gly Thr 340 345
350 His Gly Asn Pro Ser Arg Arg 355
48 1368 DNA Hordeum vulgare 48 atggggagca agggtcggat gcctccttct taccaccacc
ggccgctccc aggttccggc 60tctggcccgc cgcatggcat gatgcaccgt gatccgtacg
gcccgggcat gcacccaccg 120ccagggccgg ggccataccc ctacgatatg ttgccgccgc
ctgagatcct ggagcagaag 180ctggcggtgc agtgtggaga gatacagaag ctggcggtgg
agaacgaacg gctcgccacg 240agccacgtgt ctctgaggaa agagctggct gccgcgcagc
aggagctgca gaggctgcag 300gcgcagggtg aggcggcgaa ggccgccgag gagcaggaga
tgagggggct ccttgacaag 360gctgccaaga tggaggccga tctgaagtcg tacgagtctg
tcaaggcgga cctgcagcag 420gcgcacaccg aggcgcagaa cctggcggca gcaaggcagc
atttgtcggc ggaggtgcag 480aagctgaaca aagacctgca gaggaacttt ggggaggcac
aacagctgcc agcactcatg 540gctgatcttg atgctgctag acaggaatat cagcacctaa
gggctgcata tgagtatgaa 600aggaaactga agatggacca ctcggagtcg ctgcaggtaa
ccaagacaaa ttatgactcc 660atggttacag agttagagaa gcttcgtgct gagttgacaa
actcaactaa tattgacaga 720agtggtactt tgtacaatcc taatttggct cagaaggatg
gtggtacatc tggtcggcat 780tctgcttatg atggtggcta tgggggtgca caggctagga
cgccccctgg tatgccagac 840cctctaagcg gaagcccagc tggaactgct cctctttctg
gatatgatcc atcaagaggg 900aatgcatatg agacttctcg tcttgctaga gtccatgatg
catcaagggg tgctactggt 960tacgactctc taaaagttgc tggatatgat acttctagaa
tgcccgcact tggagctcag 1020acagcggctc caactgctca tgggagtagt gctggttact
atggatctgc acaggtgcca 1080ccatcatatg cttctgggcc agtctcgtct tcatcatacg
gcgcaacaac agcgcgacct 1140catggctcag ctcagggact atcatcatat ggacaaacac
aggctccatc ttcttatgca 1200cacacacaga taccaccatc ctatggacta gcacaggcat
catcacactt tggcccaact 1260cagggggggt caccgtatgg gttgtctgca cggccccagg
cctatggatc cgcgcaagca 1320gcacctaaca ctggtggtgc ttatcaaact ccacatggac
gtagataa 1368 49 455 PRT Hordeum vulgare 49 Met Gly Ser Lys Gly
Arg Met Pro Pro Ser Tyr His His Arg Pro Leu 1 5
10 15 Pro Gly Ser Gly Ser Gly Pro Pro His Gly
Met Met His Arg Asp Pro 20 25
30 Tyr Gly Pro Gly Met His Pro Pro Pro Gly Pro Gly Pro Tyr Pro
Tyr 35 40 45 Asp
Met Leu Pro Pro Pro Glu Ile Leu Glu Gln Lys Leu Ala Val Gln 50
55 60 Cys Gly Glu Ile Gln Lys
Leu Ala Val Glu Asn Glu Arg Leu Ala Thr 65 70
75 80 Ser His Val Ser Leu Arg Lys Glu Leu Ala Ala
Ala Gln Gln Glu Leu 85 90
95 Gln Arg Leu Gln Ala Gln Gly Glu Ala Ala Lys Ala Ala Glu Glu Gln
100 105 110 Glu Met
Arg Gly Leu Leu Asp Lys Ala Ala Lys Met Glu Ala Asp Leu 115
120 125 Lys Ser Tyr Glu Ser Val Lys
Ala Asp Leu Gln Gln Ala His Thr Glu 130 135
140 Ala Gln Asn Leu Ala Ala Ala Arg Gln His Leu Ser
Ala Glu Val Gln 145 150 155
160 Lys Leu Asn Lys Asp Leu Gln Arg Asn Phe Gly Glu Ala Gln Gln Leu
165 170 175 Pro Ala Leu
Met Ala Asp Leu Asp Ala Ala Arg Gln Glu Tyr Gln His 180
185 190 Leu Arg Ala Ala Tyr Glu Tyr Glu
Arg Lys Leu Lys Met Asp His Ser 195 200
205 Glu Ser Leu Gln Val Thr Lys Thr Asn Tyr Asp Ser Met
Val Thr Glu 210 215 220
Leu Glu Lys Leu Arg Ala Glu Leu Thr Asn Ser Thr Asn Ile Asp Arg 225
230 235 240 Ser Gly Thr Leu
Tyr Asn Pro Asn Leu Ala Gln Lys Asp Gly Gly Thr 245
250 255 Ser Gly Arg His Ser Ala Tyr Asp Gly
Gly Tyr Gly Gly Ala Gln Ala 260 265
270 Arg Thr Pro Pro Gly Met Pro Asp Pro Leu Ser Gly Ser Pro
Ala Gly 275 280 285
Thr Ala Pro Leu Ser Gly Tyr Asp Pro Ser Arg Gly Asn Ala Tyr Glu 290
295 300 Thr Ser Arg Leu Ala
Arg Val His Asp Ala Ser Arg Gly Ala Thr Gly 305 310
315 320 Tyr Asp Ser Leu Lys Val Ala Gly Tyr Asp
Thr Ser Arg Met Pro Ala 325 330
335 Leu Gly Ala Gln Thr Ala Ala Pro Thr Ala His Gly Ser Ser Ala
Gly 340 345 350 Tyr
Tyr Gly Ser Ala Gln Val Pro Pro Ser Tyr Ala Ser Gly Pro Val 355
360 365 Ser Ser Ser Ser Tyr Gly
Ala Thr Thr Ala Arg Pro His Gly Ser Ala 370 375
380 Gln Gly Leu Ser Ser Tyr Gly Gln Thr Gln Ala
Pro Ser Ser Tyr Ala 385 390 395
400 His Thr Gln Ile Pro Pro Ser Tyr Gly Leu Ala Gln Ala Ser Ser His
405 410 415 Phe Gly
Pro Thr Gln Gly Gly Ser Pro Tyr Gly Leu Ser Ala Arg Pro 420
425 430 Gln Ala Tyr Gly Ser Ala Gln
Ala Ala Pro Asn Thr Gly Gly Ala Tyr 435 440
445 Gln Thr Pro His Gly Arg Arg 450
455 50 1329 DNA Lycopersicon esculentum 50 atgggaagca aaggtcgagg
gccacctccc aacattaggc gtccgcctcc aggacccggc 60atgatgtatc ctgattcttt
tggtcctcct acgcataacc ctccaccagt tgatttcccc 120ccttttgaca ggctacctcc
tccagagatt ttggaacaga agattggtgc acaacatctt 180gagatgcaaa aacttactac
agaaaatcag aggcttgctg ccacccatgt aactttgagg 240cgagatttag ctgctgcaca
acatgagcta caaatgttgc atgttcagat agaaacagtc 300aaggccaaca gggaacaaga
gactaaaggc ctcagtgata aaatttctag gatagaggct 360gaacttcaag ctgctgaatc
tatcaaaaaa gaattgccgc aagcacaagg ggaagctcgc 420actttgtttg cagcaaggca
agaacttgtt actaaaatac aaatgctgac tcaggatctt 480caaagggctc acgctgatgt
gctacatatt cctcgtttgc tggctgagtt ggagagccta 540aaaaaggagt atcagcagtg
ccggactacc tatgagtgcg agaggaagtt atacagtgat 600catcttgaat ctcttcaagt
gatggagaag aactacatga ctatgtccag agaggtggaa 660aagcttaggg cagagttagc
gaacacttct aactctgaca gacaaacagg tggaccttat 720ggtggttcaa ctggatacaa
tgaaaatgat gccactaata attatgctac tgggcaaaac 780atctatgcag acggctatgg
agtttatcag ggtagaggct ccgtaccaac agggactaat 840gctggaggag ttcctgctgt
tgactcacca caagttggag ctcagtctgt gcctccgtca 900aacaggcctc cttatgatac
atcaaatatg tctggttatg atgcacaaag gggaattaga 960ggccctgttg gacatggtta
tgaagcacaa atgggatcaa gtggtcctgg ctatgatgcg 1020caaagaggat ctggtttagc
agcttatgaa gctcagaggg ggcatgggta tgatagggga 1080cctgggtatg atgctcagag
ggcagcgggt tatgaagctt acagaggacc tggctatgat 1140gcatatgggg cccctgttta
tgatcctagc aaggcctcta actatgacgc atcttccaaa 1200ggcggtgttg caactcaagg
acaggtagca cctataggaa atgctcctcc tggggcagct 1260ccctcaccgg gtcatattgg
tcctggatat gatgcatcag cacaaggtgg aaatccagca 1320cgtagatga
1329 51 442 PRT Lycopersicon esculentum
51 Met Gly Ser Lys Gly Arg Gly Pro Pro Pro Asn Ile Arg Arg Pro
Pro 1 5 10 15 Pro
Gly Pro Gly Met Met Tyr Pro Asp Ser Phe Gly Pro Pro Thr His
20 25 30 Asn Pro Pro Pro Val
Asp Phe Pro Pro Phe Asp Arg Leu Pro Pro Pro 35
40 45 Glu Ile Leu Glu Gln Lys Ile Gly Ala
Gln His Leu Glu Met Gln Lys 50 55
60 Leu Thr Thr Glu Asn Gln Arg Leu Ala Ala Thr His Val
Thr Leu Arg 65 70 75
80 Arg Asp Leu Ala Ala Ala Gln His Glu Leu Gln Met Leu His Val Gln
85 90 95 Ile Glu Thr Val
Lys Ala Asn Arg Glu Gln Glu Thr Lys Gly Leu Ser 100
105 110 Asp Lys Ile Ser Arg Ile Glu Ala Glu
Leu Gln Ala Ala Glu Ser Ile 115 120
125 Lys Lys Glu Leu Pro Gln Ala Gln Gly Glu Ala Arg Thr Leu
Phe Ala 130 135 140
Ala Arg Gln Glu Leu Val Thr Lys Ile Gln Met Leu Thr Gln Asp Leu 145
150 155 160 Gln Arg Ala His Ala
Asp Val Leu His Ile Pro Arg Leu Leu Ala Glu 165
170 175 Leu Glu Ser Leu Lys Lys Glu Tyr Gln Gln
Cys Arg Thr Thr Tyr Glu 180 185
190 Cys Glu Arg Lys Leu Tyr Ser Asp His Leu Glu Ser Leu Gln Val
Met 195 200 205 Glu
Lys Asn Tyr Met Thr Met Ser Arg Glu Val Glu Lys Leu Arg Ala 210
215 220 Glu Leu Ala Asn Thr Ser
Asn Ser Asp Arg Gln Thr Gly Gly Pro Tyr 225 230
235 240 Gly Gly Ser Thr Gly Tyr Asn Glu Asn Asp Ala
Thr Asn Asn Tyr Ala 245 250
255 Thr Gly Gln Asn Ile Tyr Ala Asp Gly Tyr Gly Val Tyr Gln Gly Arg
260 265 270 Gly Ser
Val Pro Thr Gly Thr Asn Ala Gly Gly Val Pro Ala Val Asp 275
280 285 Ser Pro Gln Val Gly Ala Gln
Ser Val Pro Pro Ser Asn Arg Pro Pro 290 295
300 Tyr Asp Thr Ser Asn Met Ser Gly Tyr Asp Ala Gln
Arg Gly Ile Arg 305 310 315
320 Gly Pro Val Gly His Gly Tyr Glu Ala Gln Met Gly Ser Ser Gly Pro
325 330 335 Gly Tyr Asp
Ala Gln Arg Gly Ser Gly Leu Ala Ala Tyr Glu Ala Gln 340
345 350 Arg Gly His Gly Tyr Asp Arg Gly
Pro Gly Tyr Asp Ala Gln Arg Ala 355 360
365 Ala Gly Tyr Glu Ala Tyr Arg Gly Pro Gly Tyr Asp Ala
Tyr Gly Ala 370 375 380
Pro Val Tyr Asp Pro Ser Lys Ala Ser Asn Tyr Asp Ala Ser Ser Lys 385
390 395 400 Gly Gly Val Ala
Thr Gln Gly Gln Val Ala Pro Ile Gly Asn Ala Pro 405
410 415 Pro Gly Ala Ala Pro Ser Pro Gly His
Ile Gly Pro Gly Tyr Asp Ala 420 425
430 Ser Ala Gln Gly Gly Asn Pro Ala Arg Arg 435
440 52 1530 DNA Oryza sativa 52 atggggagca aggggagggc
gccgccgcct taccaccacc ggggggcgca caagatgatg 60caccgggacc cgtacggggg
ggcaccgggg atgccggggc cgttcccgta cgacatgctg 120gcggcggcgg cgccgccgcc
ggagatcctg gagcagaagc tgatggcgca gcggggggag 180ctgcagaagc tggcggtgga
gaacgaccgg ctggcgatga gccacgactc gctgcggaag 240gagctcgccg cggcgcagca
ggaggcgcag aggctgcagg cgcaggggca ggcggcgatg 300gcggccgagg agcaggaggc
gagggggatc ctcgacaagg tcgccaagat ggaggccgac 360ctcaaggccc gcgaccccgt
caaggccgag ctgcagcagg cgcacgccga ggcgcagggc 420ctcgtcgtcg cgaggcagca
gctggccgcc gacacgcaga agctgagcaa ggacctgcag 480aggaacctcg gcgaggcgca
gcagctcccc gcgctcgtgg ccgagcgcga cgccgctagg 540caggagtatc agcacctcag
ggctacgtat gagtacgaaa ggaaactcag gatggatcac 600tccgagtcgc tgcaggtgat
gaagaggaat tatgacacca tggtcgctga gctagacaag 660cttcgtgctg agctgatgaa
cacggctaat attgacagag gaggcatgcc gtttatctgt 720tgttccattt tttttaccac
aatttttggt agcatatata gaaaccacac catgagacat 780tttctatgtg taggtatgct
atacaatact aatactgctc aaaaggatga tggcgcgcct 840agtcttcctg ttggacaaat
tgcttatgat agcggttatg gagctgcgca gggaaggaca 900ccacctgctg gactgggaga
ctctttaagc ggaaacccag ctggcacagc tcctcggact 960ggatttgatc catcaagagg
caatatgtat gacgcttctc gtattgctag cttcagttct 1020tcaaaagctg gaggacatga
tgcatcaagg ggtgccgcag gctacaattc tttgaaaggt 1080gctggatatg atccttctaa
agcacctgca cttggaggac aggcaacagc tgcagctgct 1140catgggagta gtgctgatta
ctatggatca aatcaggcaa caccaccttc atatgcttgg 1200ggacaagctg catccgctta
tggatctgca caagtgccac agtcacatgc atctggacct 1260cctgttcaat caacatccta
cagtgcaaca acagcacgta actttggctc tgcccaggct 1320ttaccatcat atgcacatgc
acaggagcaa ccttcatatg gacacgcaca gctaccatcc 1380tcatatggat tagcgcaagc
atcatttcca tttgccccag cgcaaggggt gtcaccctat 1440gggtcaggtg cacagcctcc
gcagtatgga gctgggcaag cagcaactaa tcctggcagt 1500gcttaccaag cacctcatgg
acgtaaataa 1530 53 509 PRT Oryza sativa
53 Met Gly Ser Lys Gly Arg Ala Pro Pro Pro Tyr His His Arg Gly Ala 1
5 10 15 His Lys Met Met
His Arg Asp Pro Tyr Gly Gly Ala Pro Gly Met Pro 20
25 30 Gly Pro Phe Pro Tyr Asp Met Leu Ala
Ala Ala Ala Pro Pro Pro Glu 35 40
45 Ile Leu Glu Gln Lys Leu Met Ala Gln Arg Gly Glu Leu Gln
Lys Leu 50 55 60
Ala Val Glu Asn Asp Arg Leu Ala Met Ser His Asp Ser Leu Arg Lys 65
70 75 80 Glu Leu Ala Ala Ala
Gln Gln Glu Ala Gln Arg Leu Gln Ala Gln Gly 85
90 95 Gln Ala Ala Met Ala Ala Glu Glu Gln Glu
Ala Arg Gly Ile Leu Asp 100 105
110 Lys Val Ala Lys Met Glu Ala Asp Leu Lys Ala Arg Asp Pro Val
Lys 115 120 125 Ala
Glu Leu Gln Gln Ala His Ala Glu Ala Gln Gly Leu Val Val Ala 130
135 140 Arg Gln Gln Leu Ala Ala
Asp Thr Gln Lys Leu Ser Lys Asp Leu Gln 145 150
155 160 Arg Asn Leu Gly Glu Ala Gln Gln Leu Pro Ala
Leu Val Ala Glu Arg 165 170
175 Asp Ala Ala Arg Gln Glu Tyr Gln His Leu Arg Ala Thr Tyr Glu Tyr
180 185 190 Glu Arg
Lys Leu Arg Met Asp His Ser Glu Ser Leu Gln Val Met Lys 195
200 205 Arg Asn Tyr Asp Thr Met Val
Ala Glu Leu Asp Lys Leu Arg Ala Glu 210 215
220 Leu Met Asn Thr Ala Asn Ile Asp Arg Gly Gly Met
Pro Phe Ile Cys 225 230 235
240 Cys Ser Ile Phe Phe Thr Thr Ile Phe Gly Ser Ile Tyr Arg Asn His
245 250 255 Thr Met Arg
His Phe Leu Cys Val Gly Met Leu Tyr Asn Thr Asn Thr 260
265 270 Ala Gln Lys Asp Asp Gly Ala Pro
Ser Leu Pro Val Gly Gln Ile Ala 275 280
285 Tyr Asp Ser Gly Tyr Gly Ala Ala Gln Gly Arg Thr Pro
Pro Ala Gly 290 295 300
Leu Gly Asp Ser Leu Ser Gly Asn Pro Ala Gly Thr Ala Pro Arg Thr 305
310 315 320 Gly Phe Asp Pro
Ser Arg Gly Asn Met Tyr Asp Ala Ser Arg Ile Ala 325
330 335 Ser Phe Ser Ser Ser Lys Ala Gly Gly
His Asp Ala Ser Arg Gly Ala 340 345
350 Ala Gly Tyr Asn Ser Leu Lys Gly Ala Gly Tyr Asp Pro Ser
Lys Ala 355 360 365
Pro Ala Leu Gly Gly Gln Ala Thr Ala Ala Ala Ala His Gly Ser Ser 370
375 380 Ala Asp Tyr Tyr Gly
Ser Asn Gln Ala Thr Pro Pro Ser Tyr Ala Trp 385 390
395 400 Gly Gln Ala Ala Ser Ala Tyr Gly Ser Ala
Gln Val Pro Gln Ser His 405 410
415 Ala Ser Gly Pro Pro Val Gln Ser Thr Ser Tyr Ser Ala Thr Thr
Ala 420 425 430 Arg
Asn Phe Gly Ser Ala Gln Ala Leu Pro Ser Tyr Ala His Ala Gln 435
440 445 Glu Gln Pro Ser Tyr Gly
His Ala Gln Leu Pro Ser Ser Tyr Gly Leu 450 455
460 Ala Gln Ala Ser Phe Pro Phe Ala Pro Ala Gln
Gly Val Ser Pro Tyr 465 470 475
480 Gly Ser Gly Ala Gln Pro Pro Gln Tyr Gly Ala Gly Gln Ala Ala Thr
485 490 495 Asn Pro
Gly Ser Ala Tyr Gln Ala Pro His Gly Arg Lys 500
505 54 930 DNA Picea sitchensis 54 atggctggaa gaaatcgcct
acctgcacac cctcttaagg gcggtccacg gggaatgcct 60ccaatgcgag agggccctta
tgccaggggt ccagggcctt tgccacctca tcctggcctt 120gttgaagaga ttcgtgatgg
cccctttgga agaggcccag gtcctctgcc cccacaccct 180gcattgatcg aggagaagct
tgcagctcag catcaagaga ttcagggact acttgtggag 240aatcagcggc ttgctgccac
tcatgtagct ttacgacagg aacttgcatc agcgcagcag 300gagctgcaac acatgaatca
tatggctgct aatatgcagg ctgacaaaga gcaccatctc 360agggagttgt atgacaaatc
tatgaagcta gaagcagatt tgcgtgcaaa tgagccaatg 420aaagctgaac ttatgcagct
gcgtgcagat aatcagaaga tgggtgctat caggcaagaa 480atgacagctc aggtgcaagc
acttacacaa gatttggtga gagctcgagc agatatgcag 540caggtgggtg ccatgagggc
agagatagaa agcatgcacc aggagctgca acgagcaaga 600actgccattg aatatgagaa
gaaggcacgt gctgaccagc tggagcaggg tcaggcaatg 660gagaaaaact tgatctcaat
ggctcgtgaa gttgagaaac ttcgagctga gcttgcaaat 720gctgacaaga gagggcgtgt
tgctgcaaac cctggtggag catatgctgg gaactatggt 780ggtgcagaaa tgggctactc
gggtggtgct tatggtgatg gttatggcgt gcacccggcc 840caagggggtg cagaaagtgg
tggtcagtat ggggctggag ctgctccatg gggagcatat 900gaaatgcagc gttcccatgt
acgtagataa 930 55 309 PRT Picea sitchensis
55 Met Ala Gly Arg Asn Arg Leu Pro Ala His Pro Leu Lys Gly Gly
Pro 1 5 10 15 Arg
Gly Met Pro Pro Met Arg Glu Gly Pro Tyr Ala Arg Gly Pro Gly
20 25 30 Pro Leu Pro Pro His
Pro Gly Leu Val Glu Glu Ile Arg Asp Gly Pro 35
40 45 Phe Gly Arg Gly Pro Gly Pro Leu Pro
Pro His Pro Ala Leu Ile Glu 50 55
60 Glu Lys Leu Ala Ala Gln His Gln Glu Ile Gln Gly Leu
Leu Val Glu 65 70 75
80 Asn Gln Arg Leu Ala Ala Thr His Val Ala Leu Arg Gln Glu Leu Ala
85 90 95 Ser Ala Gln Gln
Glu Leu Gln His Met Asn His Met Ala Ala Asn Met 100
105 110 Gln Ala Asp Lys Glu His His Leu Arg
Glu Leu Tyr Asp Lys Ser Met 115 120
125 Lys Leu Glu Ala Asp Leu Arg Ala Asn Glu Pro Met Lys Ala
Glu Leu 130 135 140
Met Gln Leu Arg Ala Asp Asn Gln Lys Met Gly Ala Ile Arg Gln Glu 145
150 155 160 Met Thr Ala Gln Val
Gln Ala Leu Thr Gln Asp Leu Val Arg Ala Arg 165
170 175 Ala Asp Met Gln Gln Val Gly Ala Met Arg
Ala Glu Ile Glu Ser Met 180 185
190 His Gln Glu Leu Gln Arg Ala Arg Thr Ala Ile Glu Tyr Glu Lys
Lys 195 200 205 Ala
Arg Ala Asp Gln Leu Glu Gln Gly Gln Ala Met Glu Lys Asn Leu 210
215 220 Ile Ser Met Ala Arg Glu
Val Glu Lys Leu Arg Ala Glu Leu Ala Asn 225 230
235 240 Ala Asp Lys Arg Gly Arg Val Ala Ala Asn Pro
Gly Gly Ala Tyr Ala 245 250
255 Gly Asn Tyr Gly Gly Ala Glu Met Gly Tyr Ser Gly Gly Ala Tyr Gly
260 265 270 Asp Gly
Tyr Gly Val His Pro Ala Gln Gly Gly Ala Glu Ser Gly Gly 275
280 285 Gln Tyr Gly Ala Gly Ala Ala
Pro Trp Gly Ala Tyr Glu Met Gln Arg 290 295
300 Ser His Val Arg Arg 305
56 894 DNA Populus tremuloides 56 atgtctgcaa gaaggcatat tcgaccaact ttagaagggc
gtgttatcca agcacctggg 60atgatgcgtc atggtccatt tcctgctggc caccatacat
cagaaccact ctctcgttct 120gatcttctag agcataggtt tgctgctcag gctgcggaca
ttgaacaact tgcaggggat 180aataatagac tggttactag tcacatggcc ttgagggagg
accttgctgc tgctcagcag 240gaagtgcaaa gactcaaggc acatattaga agcatccaga
ctgaaagtga tatccagatc 300agggttttgc tggataaaat tgcaaaaatg gaaaaagaca
tcagggctgg tgagaacgtg 360aaaaaggacc tcaaacaggc acatgtggag gcacagaact
tggtcaaaga aagacaagag 420cttgctacac aaatccaaca ggcttcacac gagttgcaga
aaatccacac tgatgtaaag 480agtataccag atctgcatgc tgagcttgag aattcaaggc
atgaactcaa gaggttaaga 540gctacattcg agtacgaaaa aggattaaat atagagaagg
tggagcaaat gcgagcaatg 600gaacagaatc tcataggtat ggcaagagaa atggaaaatt
tgcgcgttga tgtcttgaat 660gctgagacca gagcacgtgc tccaaaccaa tatattggtg
gctacgcaaa tcctgatgga 720tatgggaggc cttttgttca catgggtgtt ggaccagcag
gggaagggat aattccttac 780aacagtagca acagtgtagt gtccaatgtt gggtttggtg
gtgcagcaat gtctactact 840ggtggtgtcg ctcaatgggt agggcctttt gatccgtcac
atgctcgggg gtga 894 57 297 PRT Populus tremuloides 57 Met Ser Ala Arg
Arg His Ile Arg Pro Thr Leu Glu Gly Arg Val Ile 1 5
10 15 Gln Ala Pro Gly Met Met Arg His Gly
Pro Phe Pro Ala Gly His His 20 25
30 Thr Ser Glu Pro Leu Ser Arg Ser Asp Leu Leu Glu His Arg
Phe Ala 35 40 45
Ala Gln Ala Ala Asp Ile Glu Gln Leu Ala Gly Asp Asn Asn Arg Leu 50
55 60 Val Thr Ser His Met
Ala Leu Arg Glu Asp Leu Ala Ala Ala Gln Gln 65 70
75 80 Glu Val Gln Arg Leu Lys Ala His Ile Arg
Ser Ile Gln Thr Glu Ser 85 90
95 Asp Ile Gln Ile Arg Val Leu Leu Asp Lys Ile Ala Lys Met Glu
Lys 100 105 110 Asp
Ile Arg Ala Gly Glu Asn Val Lys Lys Asp Leu Lys Gln Ala His 115
120 125 Val Glu Ala Gln Asn Leu
Val Lys Glu Arg Gln Glu Leu Ala Thr Gln 130 135
140 Ile Gln Gln Ala Ser His Glu Leu Gln Lys Ile
His Thr Asp Val Lys 145 150 155
160 Ser Ile Pro Asp Leu His Ala Glu Leu Glu Asn Ser Arg His Glu Leu
165 170 175 Lys Arg
Leu Arg Ala Thr Phe Glu Tyr Glu Lys Gly Leu Asn Ile Glu 180
185 190 Lys Val Glu Gln Met Arg Ala
Met Glu Gln Asn Leu Ile Gly Met Ala 195 200
205 Arg Glu Met Glu Asn Leu Arg Val Asp Val Leu Asn
Ala Glu Thr Arg 210 215 220
Ala Arg Ala Pro Asn Gln Tyr Ile Gly Gly Tyr Ala Asn Pro Asp Gly 225
230 235 240 Tyr Gly Arg
Pro Phe Val His Met Gly Val Gly Pro Ala Gly Glu Gly 245
250 255 Ile Ile Pro Tyr Asn Ser Ser Asn
Ser Val Val Ser Asn Val Gly Phe 260 265
270 Gly Gly Ala Ala Met Ser Thr Thr Gly Gly Val Ala Gln
Trp Val Gly 275 280 285
Pro Phe Asp Pro Ser His Ala Arg Gly 290 295
58 1272 DNA Saccharum officinarum misc_feature (1219)..(1219) n is a, c, g, or t
58 atggctcgcc gtggacacct agatggactg actgcccaag ctccagctct gatgcgccat
60ggttccttcg ctgcaggcag cctgtctagc cactcacctt tgcagtcttc atccacactg
120gagatgctgg agagcaagct tgccatgcaa actgcagaag tggaaaagct tatcatggag
180aatcagcggt tagcatcaag ccatgtggtc ctgaggcagg acatcgttga tacggagaaa
240gagatgcaaa tgatccgcac ccacctaagt gaagttcaga cagagactga tctgcagatt
300agagatttgt tggagagaat cagattaatg gaggcagaca tacatagtgg tgatgcagtg
360aagaaagagc ttcaccaagt gcatatggag gcaaagagac ttattactga aaggcagatg
420ctaacccttg agatagataa tgtgactaaa gaattacata aaatctctgc ccctggtgac
480gggaaaagcc ttcctgaatt gcttgctgag ctagatgggc tacggaaaga gcatcataat
540ttacgatctc aatttgaata tgagaaaaat acaaacatca agcaagttga gcagatgcgg
600acaatggaaa tgaacctgat aaccatgact aaacaagctg agaagttacg tggtgatgtg
660gcaaatgctg aaagacgggc acaggcagct gcggctcaag cagcggcaca tgcagctggt
720gcacaggtga cagcttcaca gcctgggaca gctcaagcta cagcggtttc agcagcagct
780acagacccat atgctggtgc atatgccagt tacccctcgg catatcagca gggagcccag
840gctggggcat atcagcaggg agcccaggct ggggcatatc agcagggaac ccaagctggg
900gcatatcagc ttggggcata tcaacaggga acccaagctg gggcatatca aaaaggaaac
960caagctggaa catacaccta tgcttatgat gctgccaccg cttacacata tgcgggttac
1020tccggttatc caattgcagg ctatgcgcaa aaggcagtgc ccaattattc ctatgccgta
1080cctccgcagc caagcagcgg tgcagctaca gacgccgcaa gcctgtatgg cgcagctggt
1140agtgctggat atcctactgg gcaagttcag ccgagcagtg tcactgcaaa tgcagcgcaa
1200ccaccttctt caccactgnc gactgcacca tatcctagca catatgacca aaccagagga
1260gcccagagat ga
1272 59 423 PRT Saccharum officinarum misc_feature (407)..(407) Xaa can be any naturally occurring amino acid
59 Met Ala Arg Arg Gly His Leu Asp Gly Leu
Thr Ala Gln Ala Pro Ala 1 5 10
15 Leu Met Arg His Gly Ser Phe Ala Ala Gly Ser Leu Ser Ser His
Ser 20 25 30 Pro
Leu Gln Ser Ser Ser Thr Leu Glu Met Leu Glu Ser Lys Leu Ala 35
40 45 Met Gln Thr Ala Glu Val
Glu Lys Leu Ile Met Glu Asn Gln Arg Leu 50 55
60 Ala Ser Ser His Val Val Leu Arg Gln Asp Ile
Val Asp Thr Glu Lys 65 70 75
80 Glu Met Gln Met Ile Arg Thr His Leu Ser Glu Val Gln Thr Glu Thr
85 90 95 Asp Leu
Gln Ile Arg Asp Leu Leu Glu Arg Ile Arg Leu Met Glu Ala 100
105 110 Asp Ile His Ser Gly Asp Ala
Val Lys Lys Glu Leu His Gln Val His 115 120
125 Met Glu Ala Lys Arg Leu Ile Thr Glu Arg Gln Met
Leu Thr Leu Glu 130 135 140
Ile Asp Asn Val Thr Lys Glu Leu His Lys Ile Ser Ala Pro Gly Asp 145
150 155 160 Gly Lys Ser
Leu Pro Glu Leu Leu Ala Glu Leu Asp Gly Leu Arg Lys 165
170 175 Glu His His Asn Leu Arg Ser Gln
Phe Glu Tyr Glu Lys Asn Thr Asn 180 185
190 Ile Lys Gln Val Glu Gln Met Arg Thr Met Glu Met Asn
Leu Ile Thr 195 200 205
Met Thr Lys Gln Ala Glu Lys Leu Arg Gly Asp Val Ala Asn Ala Glu 210
215 220 Arg Arg Ala Gln
Ala Ala Ala Ala Gln Ala Ala Ala His Ala Ala Gly 225 230
235 240 Ala Gln Val Thr Ala Ser Gln Pro Gly
Thr Ala Gln Ala Thr Ala Val 245 250
255 Ser Ala Ala Ala Thr Asp Pro Tyr Ala Gly Ala Tyr Ala Ser
Tyr Pro 260 265 270
Ser Ala Tyr Gln Gln Gly Ala Gln Ala Gly Ala Tyr Gln Gln Gly Ala
275 280 285 Gln Ala Gly Ala
Tyr Gln Gln Gly Thr Gln Ala Gly Ala Tyr Gln Leu 290
295 300 Gly Ala Tyr Gln Gln Gly Thr Gln
Ala Gly Ala Tyr Gln Lys Gly Asn 305 310
315 320 Gln Ala Gly Thr Tyr Thr Tyr Ala Tyr Asp Ala Ala
Thr Ala Tyr Thr 325 330
335 Tyr Ala Gly Tyr Ser Gly Tyr Pro Ile Ala Gly Tyr Ala Gln Lys Ala
340 345 350 Val Pro Asn
Tyr Ser Tyr Ala Val Pro Pro Gln Pro Ser Ser Gly Ala 355
360 365 Ala Thr Asp Ala Ala Ser Leu Tyr
Gly Ala Ala Gly Ser Ala Gly Tyr 370 375
380 Pro Thr Gly Gln Val Gln Pro Ser Ser Val Thr Ala Asn
Ala Ala Gln 385 390 395
400 Pro Pro Ser Ser Pro Leu Xaa Thr Ala Pro Tyr Pro Ser Thr Tyr Asp
405 410 415 Gln Thr Arg Gly
Ala Gln Arg 420 60 1344 DNA Sorghum bicolor
60 atgttcagga taatggctca tcgtggtcac ctagatggac tgactgccca agctccagct
60ctgatgcacc atggttcctt cgctgctggc aaactctcta gccactcacc tttgcagtct
120tcatccacac tggagatgct ggagaacaag cttgccatgc aaactgcaga agtagaaaag
180cttatcatgg agaatcagcg gttagcatca agccatgtgg tcttgaggca ggacattgtt
240gatacggaga aagagatgca aatgatccgc acccacctag gtgaagttca gacagagact
300gatttgcaga ttagagattt gttggagaga atcagattaa tggaggcaga catacatagt
360ggtgatgcag tgaagaagga gcttcaccaa gtgcatatgg aggcaaagag acttattact
420gaaaggcaga tgctaaccct tgacatagag aatgtgatta aagaattaca gaaactctct
480gcctctggtg acggtaaaag ccttcctgaa ttgcttgctg agctagatgg gctacggaaa
540gagcatcata atttacgatc tcaatttgaa tttgagaaaa atacaaacat caagcaagtt
600gagcagatgc ggacaatgga aatgaacctg ataaccatga ctaaacaagc cgagaagtta
660cgtggtgatg tagcaaatgc tgaaagacgg gcacaggcag ctgcggctca agcagcggca
720catgcagctg gtgcgcaggt gacagcttca cagcctggga cagctcaagc tacagcggtt
780tcagcagcag ctacagaccc atatgcaggt gcatatgcca gttacccctc ggcatatcag
840cagggagccc aggctgcagc atatcagcag ggagcccagg ctgcggcata tcagcaggga
900gcccaggctg gggcatatca gcagggagcc caggctgggg catatcagca gggagcccag
960gctggggcat atcagcaggg agcccaggtt ggggcatatc agcacggaac ccaagctggg
1020gcatatcagc aaggaaacca ggctggagca tacacctatg cttatgatgc tgccacggct
1080tacgcatatg caggttactc tggctatcca ggctatgcgc aaagtgcagt gcccaattat
1140tcctatgccg tacctccgca gccaagcagc ggtgcaacta cagaggccgc aagcatgtat
1200ggcgcagctg gtagtgctgg atatcctact gcgcaagttc agccgagcag tgccactgca
1260aatgcagcgc aaccacctcc tccaccaccg cctgcagcac catatcctag cacatatgac
1320caaaccagag gagcccagag gtga
1344 61 443 PRT Sorghum bicolor 61 Met Ala His Arg Gly His Leu Asp Gly Leu Thr
Ala Gln Ala Pro Ala 1 5 10
15 Leu Met His His Gly Ser Phe Ala Ala Gly Lys Leu Ser Ser His Ser
20 25 30 Pro Leu
Gln Ser Ser Ser Thr Leu Glu Met Leu Glu Asn Lys Leu Ala 35
40 45 Met Gln Thr Ala Glu Val Glu
Lys Leu Ile Met Glu Asn Gln Arg Leu 50 55
60 Ala Ser Ser His Val Val Leu Arg Gln Asp Ile Val
Asp Thr Glu Lys 65 70 75
80 Glu Met Gln Met Ile Arg Thr His Leu Gly Glu Val Gln Thr Glu Thr
85 90 95 Asp Leu Gln
Ile Arg Asp Leu Leu Glu Arg Ile Arg Leu Met Glu Ala 100
105 110 Asp Ile His Ser Gly Asp Ala Val
Lys Lys Glu Leu His Gln Val His 115 120
125 Met Glu Ala Lys Arg Leu Ile Thr Glu Arg Gln Met Leu
Thr Leu Asp 130 135 140
Ile Glu Asn Val Ile Lys Glu Leu Gln Lys Leu Ser Ala Ser Gly Asp 145
150 155 160 Gly Lys Ser Leu
Pro Glu Leu Leu Ala Glu Leu Asp Gly Leu Arg Lys 165
170 175 Glu His His Asn Leu Arg Ser Gln Phe
Glu Phe Glu Lys Asn Thr Asn 180 185
190 Ile Lys Gln Val Glu Gln Met Arg Thr Met Glu Met Asn Leu
Ile Thr 195 200 205
Met Thr Lys Gln Ala Glu Lys Leu Arg Gly Asp Val Ala Asn Ala Glu 210
215 220 Arg Arg Ala Gln Ala
Ala Ala Ala Gln Ala Ala Ala His Ala Ala Gly 225 230
235 240 Ala Gln Val Thr Ala Ser Gln Pro Gly Thr
Ala Gln Ala Thr Ala Val 245 250
255 Ser Ala Ala Ala Thr Asp Pro Tyr Ala Gly Ala Tyr Ala Ser Tyr
Pro 260 265 270 Ser
Ala Tyr Gln Gln Gly Ala Gln Ala Ala Ala Tyr Gln Gln Gly Ala 275
280 285 Gln Ala Ala Ala Tyr Gln
Gln Gly Ala Gln Ala Gly Ala Tyr Gln Gln 290 295
300 Gly Ala Gln Ala Gly Ala Tyr Gln Gln Gly Ala
Gln Ala Gly Ala Tyr 305 310 315
320 Gln Gln Gly Ala Gln Val Gly Ala Tyr Gln His Gly Thr Gln Ala Gly
325 330 335 Ala Tyr
Gln Gln Gly Asn Gln Ala Gly Ala Tyr Thr Tyr Ala Tyr Asp 340
345 350 Ala Ala Thr Ala Tyr Ala Tyr
Ala Gly Tyr Ser Gly Tyr Pro Gly Tyr 355 360
365 Ala Gln Ser Ala Val Pro Asn Tyr Ser Tyr Ala Val
Pro Pro Gln Pro 370 375 380
Ser Ser Gly Ala Thr Thr Glu Ala Ala Ser Met Tyr Gly Ala Ala Gly 385
390 395 400 Ser Ala Gly
Tyr Pro Thr Ala Gln Val Gln Pro Ser Ser Ala Thr Ala 405
410 415 Asn Ala Ala Gln Pro Pro Pro Pro
Pro Pro Pro Ala Ala Pro Tyr Pro 420 425
430 Ser Thr Tyr Asp Gln Thr Arg Gly Ala Gln Arg
435 440 62 1512 DNA Zea mays 62 atggctcacc
gtggacacct agatggactg actgcccaag ctccagcact gatgcgccat 60ggttccttcg
ctgcaggcag cctttctagc cactcacctt tggagtcttc atctacactg 120gagatgctgg
agaacaagct tgccatgcag actgcagaag tggaaaagct tatcatggag 180aatcagcggt
tagcatcaag ccatgtggtc ttgaggcagg acattgttga tacagagaaa 240gagatgcaaa
taatccgcac ccacctaggt gaagttcaga cagagactga tttgcatatt 300agagatttat
tggagagaat tagattaatg gaggcagaca tacatagtgg tgatgcggtg 360aagaaggagc
ttcatcaagt gcatatggag gcaaagagac ttattactga aaggcagatg 420ctgacccttg
agacagagga tgtgaataaa gaattacaga aactctctgc ctctggtgac 480agtaaaagcc
ttcctgaatt gctagctgag ctagatgggc taaggaaaga gcatcttaat 540ttacgatctc
aatttgaatt tgagaaaaat acaaacatca agcaagttga gcagatgcgg 600acaatggaaa
tgaacttgat gaccatgact aaacaagctg agaagttacg aggtgatgtg 660gcaaatgctg
aaagacgggc acaggcagct gtggctaaag caacagggca tgcagctggt 720gcacaggtga
cagcttcaca gcctgggaca gctcaagcta cagcggttcc agcagcagct 780acagacccat
atgcaggtgc atatgccagt tacccccctg catatcagca gggagcccag 840gctggggcat
atcagcaggg agcccaggct ggggcatatc agcagggagc ccaggctggg 900gcatatcagc
aggggggcca ggatggggca tatcagcagg gggctcaggc tggggcatat 960cagcagggag
cccaggctgg ggcatatcag cagggagccc aggctggggc atatcagcag 1020ggtgctcagg
ctggggcata tcagcaggga gcccaggctg gggcatatca gcagggggcc 1080cagtctgggg
catatcagca gggggcccag gctggggcat atcagcaggg agcccaggat 1140ggggcatatc
agcagggagc ccaggatggg gcatatcagc agggtgctca ggctggagca 1200tacaactatg
cttatgatgc tggcacggct tatgcatatg caggttactc tggctatcca 1260gttgcaggct
acgcgcaaag tgcagtgccc aactattctt atgctgcacc tccgcagcca 1320acaagcagcg
gtgcagctac gaacgccgca ggaggccagt atggggcagt tggtagtgct 1380ggatatccta
ctgggcaagt tcagccgagc agtggcactg caaatgcagc gcaagcacct 1440cctcctccac
caccaccggc agcaccatat ccccccagca catatgacca aaccagagga 1500gcccagagat
aa 1512 63 503 PRT Zea mays
63 Met Ala His Arg Gly His Leu Asp Gly Leu Thr Ala Gln Ala Pro Ala 1
5 10 15 Leu Met Arg
His Gly Ser Phe Ala Ala Gly Ser Leu Ser Ser His Ser 20
25 30 Pro Leu Glu Ser Ser Ser Thr Leu
Glu Met Leu Glu Asn Lys Leu Ala 35 40
45 Met Gln Thr Ala Glu Val Glu Lys Leu Ile Met Glu Asn
Gln Arg Leu 50 55 60
Ala Ser Ser His Val Val Leu Arg Gln Asp Ile Val Asp Thr Glu Lys 65
70 75 80 Glu Met Gln Ile
Ile Arg Thr His Leu Gly Glu Val Gln Thr Glu Thr 85
90 95 Asp Leu His Ile Arg Asp Leu Leu Glu
Arg Ile Arg Leu Met Glu Ala 100 105
110 Asp Ile His Ser Gly Asp Ala Val Lys Lys Glu Leu His Gln
Val His 115 120 125
Met Glu Ala Lys Arg Leu Ile Thr Glu Arg Gln Met Leu Thr Leu Glu 130
135 140 Thr Glu Asp Val Asn
Lys Glu Leu Gln Lys Leu Ser Ala Ser Gly Asp 145 150
155 160 Ser Lys Ser Leu Pro Glu Leu Leu Ala Glu
Leu Asp Gly Leu Arg Lys 165 170
175 Glu His Leu Asn Leu Arg Ser Gln Phe Glu Phe Glu Lys Asn Thr
Asn 180 185 190 Ile
Lys Gln Val Glu Gln Met Arg Thr Met Glu Met Asn Leu Met Thr 195
200 205 Met Thr Lys Gln Ala Glu
Lys Leu Arg Gly Asp Val Ala Asn Ala Glu 210 215
220 Arg Arg Ala Gln Ala Ala Val Ala Lys Ala Thr
Gly His Ala Ala Gly 225 230 235
240 Ala Gln Val Thr Ala Ser Gln Pro Gly Thr Ala Gln Ala Thr Ala Val
245 250 255 Pro Ala
Ala Ala Thr Asp Pro Tyr Ala Gly Ala Tyr Ala Ser Tyr Pro 260
265 270 Pro Ala Tyr Gln Gln Gly Ala
Gln Ala Gly Ala Tyr Gln Gln Gly Ala 275 280
285 Gln Ala Gly Ala Tyr Gln Gln Gly Ala Gln Ala Gly
Ala Tyr Gln Gln 290 295 300
Gly Gly Gln Asp Gly Ala Tyr Gln Gln Gly Ala Gln Ala Gly Ala Tyr 305
310 315 320 Gln Gln Gly
Ala Gln Ala Gly Ala Tyr Gln Gln Gly Ala Gln Ala Gly 325
330 335 Ala Tyr Gln Gln Gly Ala Gln Ala
Gly Ala Tyr Gln Gln Gly Ala Gln 340 345
350 Ala Gly Ala Tyr Gln Gln Gly Ala Gln Ser Gly Ala Tyr
Gln Gln Gly 355 360 365
Ala Gln Ala Gly Ala Tyr Gln Gln Gly Ala Gln Asp Gly Ala Tyr Gln 370
375 380 Gln Gly Ala Gln
Asp Gly Ala Tyr Gln Gln Gly Ala Gln Ala Gly Ala 385 390
395 400 Tyr Asn Tyr Ala Tyr Asp Ala Gly Thr
Ala Tyr Ala Tyr Ala Gly Tyr 405 410
415 Ser Gly Tyr Pro Val Ala Gly Tyr Ala Gln Ser Ala Val Pro
Asn Tyr 420 425 430
Ser Tyr Ala Ala Pro Pro Gln Pro Thr Ser Ser Gly Ala Ala Thr Asn
435 440 445 Ala Ala Gly Gly
Gln Tyr Gly Ala Val Gly Ser Ala Gly Tyr Pro Thr 450
455 460 Gly Gln Val Gln Pro Ser Ser Gly
Thr Ala Asn Ala Ala Gln Ala Pro 465 470
475 480 Pro Pro Pro Pro Pro Pro Ala Ala Pro Tyr Pro Pro
Ser Thr Tyr Asp 485 490
495 Gln Thr Arg Gly Ala Gln Arg 500
64 1605 DNA Zea mays 64 atggctcatc gtggacatct agatggactg actggccaag
ctcctgctct tatgcgccat 60ggttccttcg ctgcaggcag cctctctagc cgctcacctt
tgcagtcttc atccacactg 120gagatgctgg agaacaagct tgccatgcaa actacagaag
tggaaaagct tatcacggag 180aatcagcggt tagcatcaag ccatgtggtc ttgaggcagg
acattgttga tacggagaaa 240gagatgcaaa tgatccgcac ccacctaggt gaagttcaga
cagagactga tttgcagatt 300agagatttgt tggagagaat cagattaatg gaggtagata
tacatagtgg taatgtagtg 360aacaaggagc ttcaccaaat gcatatggag gcaaagagac
ttattactga aaggcagatg 420ctaacccttg agatagagga tgtgactaaa gaattacaga
aactctctgc ctctggggac 480aataaaagcc ttcctgaatt gctttctgag ctagataggc
tacggaaaga gcatcataat 540ttacgatctc agtttgaatt tgagaaaaat acaaacgtca
agcaagttga gcagatgcgg 600acaatggaaa tgaacttgat aaccatgacc aaacaagctg
agaagttacg tgttgatgtg 660gcaaatgctg aaagacgggc acaagcagct gcggctcaag
cagcagcaca tgcagctggt 720gcacaggtga cagcttcgca gcctggacag ctcaagctac
cacggtttca gcagcagcag 780ccacagactc atatgcaggt gcatatacca gctacccccc
tgcatatcag cagggagccc 840aggctggggc atatcagcag ggtgctcagg ctggggtata
tcagcaggga gcccaggctg 900gggcatatca gcagggagcc caggctgggg catatcagca
ggggggccag gatggggcat 960atcagcaggg ggctcaggct ggggcatatc agcagggagc
ccaggctggg gcatatcagc 1020agggagccca ggctggggca tatcagcagg gtgctcaggc
tggggcatat cagcagggag 1080cccaggctgg ggcatatcag cagggggccc agtctggggc
atatcagcag ggggcccagg 1140ctggggcata tcagcaggga gcccaggatg gggcatatca
gcagggagcc caggatgggg 1200catatcagca gggtgctcag gctggagcat acaactatgc
ttatgatgct ggcacggctt 1260atgcatatgc aggttactct ggctatccag ttgcaggcta
cgcgcaaagt gcagtgccca 1320actattccta tgctgcacct ccgcagccaa caagcagcgg
tgcagctacg aacgccgcag 1380gaggccagta tggggcagtt ggtagtgctg gatatcctac
tgggcaagtt cagccgagca 1440gtggcactgc aaatgcagcg caagcacctc ctcctccacc
accaccggca gcaccatatc 1500cccccagcac atatgaccaa accagaggag cccagagata
aaatctggga tgtaaaccag 1560atggatgttt gccatgcaca tttgttgagc agacaaatat
ggtga 1605 65 469 PRT Zea mays 65 Met Ala His Arg Gly His
Leu Asp Gly Leu Thr Ala Gln Ala Pro Ala 1 5
10 15 Leu Met Arg His Gly Ser Phe Ala Ala Gly Ser
Leu Ser Ser His Ser 20 25
30 Pro Leu Glu Ser Ser Ser Thr Leu Glu Met Leu Glu Asn Lys Leu
Ala 35 40 45 Met
Gln Thr Ala Glu Val Glu Lys Leu Ile Met Glu Asn Gln Arg Leu 50
55 60 Ala Ser Ser His Val Val
Leu Arg Gln Asp Ile Val Asp Thr Glu Lys 65 70
75 80 Glu Met Gln Ile Ile Arg Thr His Leu Gly Glu
Val Gln Thr Glu Thr 85 90
95 Asp Leu His Ile Arg Asp Leu Leu Glu Arg Ile Arg Leu Met Glu Ala
100 105 110 Asp Ile
His Ser Gly Asp Ala Val Lys Lys Glu Leu His Gln Val His 115
120 125 Met Glu Ala Lys Arg Leu Ile
Thr Glu Arg Gln Met Leu Thr Leu Glu 130 135
140 Thr Glu Asp Val Thr Lys Glu Leu Gln Lys Leu Ser
Ala Ser Gly Asp 145 150 155
160 Ser Lys Ser Leu Pro Glu Leu Leu Ala Glu Leu Asp Gly Leu Arg Lys
165 170 175 Glu His Leu
Asn Leu Arg Ser Gln Phe Glu Phe Glu Lys Asn Thr Asn 180
185 190 Ile Lys Gln Val Glu Gln Met Arg
Thr Met Glu Met Asn Leu Met Thr 195 200
205 Met Thr Lys Gln Ala Glu Lys Leu Arg Gly Asp Val Ala
Asn Ala Glu 210 215 220
Arg Arg Ala Gln Ala Ala Val Ala Lys Ala Thr Gly His Ala Ala Gly 225
230 235 240 Ala Gln Val Thr
Ala Ser Gln Pro Gly Thr Ala Gln Ala Thr Ala Val 245
250 255 Pro Ala Ala Ala Thr Asp Pro Tyr Ala
Gly Ala Tyr Ala Ser Tyr Pro 260 265
270 Pro Ala Tyr Gln Gln Gly Ala Gln Ala Gly Ala Tyr Gln Gln
Gly Ala 275 280 285
Gln Ala Gly Thr Tyr Gln Gln Gly Ala Gly Thr Gln Ala Gly Ala Tyr 290
295 300 Gln Gln Gly Ala Gln
Ala Gly Ala Tyr Gln Gln Gly Ala Gln Ala Gly 305 310
315 320 Ala Tyr Gln Gln Gly Ala Gln Ser Gly Ala
Tyr Gln Gln Gly Ala Gln 325 330
335 Ala Gly Ala Tyr Gln Gln Gly Ala Gln Asp Gly Ala Tyr Gln Gln
Gly 340 345 350 Ala
Gln Asp Gly Ala Tyr Gln Gln Gly Ala Gln Ala Gly Ala Tyr Asn 355
360 365 Tyr Ala Tyr Asp Ala Gly
Thr Ala Tyr Ala Tyr Ala Gly Tyr Ser Gly 370 375
380 Tyr Pro Val Ala Gly Tyr Ala Gln Ser Ala Val
Pro Asn Tyr Ser Tyr 385 390 395
400 Ala Ala Pro Pro Gln Pro Thr Ser Ser Gly Ala Ala Thr Asn Ala Ala
405 410 415 Gly Gly
Gln Tyr Gly Ala Val Gly Ser Ala Gly Tyr Pro Thr Gly Gln 420
425 430 Val Gln Pro Ser Ser Gly Thr
Ala Asn Ala Ala Gln Ala Pro Pro Pro 435 440
445 Pro Pro Pro Pro Ala Ala Pro Tyr Pro Pro Ser Thr
Tyr Asp Gln Thr 450 455 460
Arg Gly Ala Gln Arg 465 66 1605 DNA Zea mays
66 atggctcatc gtggacatct agatggactg actggccaag ctcctgctct tatgcgccat
60ggttccttcg ctgcaggcag cctctctagc cgctcacctt tgcagtcttc atccacactg
120gagatgctgg agaacaagct tgccatgcaa actacagaag tggaaaagct tatcacggag
180aatcagcggt tagcatcaag ccatgtggtc ttgaggcagg acattgttga tacggagaaa
240gagatgcaaa tgatccgcac ccacctaggt gaagttcaga cagagactga tttgcagatt
300agagatttgt tggagagaat cagattaatg gaggtagata tacatagtgg taatgtagtg
360aacaaggagc ttcaccaaat gcatatggag gcaaagagac ttattactga aaggcagatg
420ctaacccttg agatagagga tgtgactaaa gaattacaga aactctctgc ctctggggac
480aataaaagcc ttcctgaatt gctttctgag ctagataggc tacggaaaga gcatcataat
540ttacgatctc agtttgaatt tgagaaaaat acaaacgtca agcaagttga gcagatgcgg
600acaatggaaa tgaacttgat aaccatgacc aaacaagctg agaagttacg tgttgatgtg
660gcaaatgctg aaagacgggc acaagcagct gcggctcaag cagcagcaca tgcagctggt
720gcacaggtga cagcttcgca gcctggacag ctcaagctac cacggtttca gcagcagcag
780ccacagactc atatgcaggt gcatatacca gctacccccc tgcatatcag cagggagccc
840aggctggggc atatcagcag ggtgctcagg ctggggtata tcagcaggga gcccaggctg
900gggcatatca gcagggagcc caggctgggg catatcagca ggggggccag gatggggcat
960atcagcaggg ggctcaggct ggggcatatc agcagggagc ccaggctggg gcatatcagc
1020agggagccca ggctggggca tatcagcagg gtgctcaggc tggggcatat cagcagggag
1080cccaggctgg ggcatatcag cagggggccc agtctggggc atatcagcag ggggcccagg
1140ctggggcata tcagcaggga gcccaggatg gggcatatca gcagggagcc caggatgggg
1200catatcagca gggtgctcag gctggagcat acaactatgc ttatgatgct ggcacggctt
1260atgcatatgc aggttactct ggctatccag ttgcaggcta cgcgcaaagt gcagtgccca
1320actattccta tgctgcacct ccgcagccaa caagcagcgg tgcagctacg aacgccgcag
1380gaggccagta tggggcagtt ggtagtgctg gatatcctac tgggcaagtt cagccgagca
1440gtggcactgc aaatgcagcg caagcacctc ctcctccacc accaccggca gcaccatatc
1500cccccagcac atatgaccaa accagaggag cccagagata aaatctggga tgtaaaccag
1560atggatgttt gccatgcaca tttgttgagc agacaaatat ggtga
1605 67 534 PRT Zea mays 67 Met Ala His Arg Gly His Leu Asp Gly Leu Thr Gly
Gln Ala Pro Ala 1 5 10
15 Leu Met Arg His Gly Ser Phe Ala Ala Gly Ser Leu Ser Ser Arg Ser
20 25 30 Pro Leu Gln
Ser Ser Ser Thr Leu Glu Met Leu Glu Asn Lys Leu Ala 35
40 45 Met Gln Thr Thr Glu Val Glu Lys
Leu Ile Thr Glu Asn Gln Arg Leu 50 55
60 Ala Ser Ser His Val Val Leu Arg Gln Asp Ile Val Asp
Thr Glu Lys 65 70 75
80 Glu Met Gln Met Ile Arg Thr His Leu Gly Glu Val Gln Thr Glu Thr
85 90 95 Asp Leu Gln Ile
Arg Asp Leu Leu Glu Arg Ile Arg Leu Met Glu Val 100
105 110 Asp Ile His Ser Gly Asn Val Val Asn
Lys Glu Leu His Gln Met His 115 120
125 Met Glu Ala Lys Arg Leu Ile Thr Glu Arg Gln Met Leu Thr
Leu Glu 130 135 140
Ile Glu Asp Val Thr Lys Glu Leu Gln Lys Leu Ser Ala Ser Gly Asp 145
150 155 160 Asn Lys Ser Leu Pro
Glu Leu Leu Ser Glu Leu Asp Arg Leu Arg Lys 165
170 175 Glu His His Asn Leu Arg Ser Gln Phe Glu
Phe Glu Lys Asn Thr Asn 180 185
190 Val Lys Gln Val Glu Gln Met Arg Thr Met Glu Met Asn Leu Ile
Thr 195 200 205 Met
Thr Lys Gln Ala Glu Lys Leu Arg Val Asp Val Ala Asn Ala Glu 210
215 220 Arg Arg Ala Gln Ala Ala
Ala Ala Gln Ala Ala Ala His Ala Ala Gly 225 230
235 240 Ala Gln Val Thr Ala Ser Gln Pro Gly Gln Leu
Lys Leu Pro Arg Phe 245 250
255 Gln Gln Gln Gln Pro Gln Thr His Met Gln Val His Ile Pro Ala Thr
260 265 270 Pro Leu
His Ile Ser Arg Glu Pro Arg Leu Gly His Ile Ser Arg Val 275
280 285 Leu Arg Leu Gly Tyr Ile Ser
Arg Glu Pro Arg Leu Gly His Ile Ser 290 295
300 Arg Glu Pro Arg Leu Gly His Ile Ser Arg Gly Ala
Arg Met Gly His 305 310 315
320 Ile Ser Arg Gly Leu Arg Leu Gly His Ile Ser Arg Glu Pro Arg Leu
325 330 335 Gly His Ile
Ser Arg Glu Pro Arg Leu Gly His Ile Ser Arg Val Leu 340
345 350 Arg Leu Gly His Ile Ser Arg Glu
Pro Arg Leu Gly His Ile Ser Arg 355 360
365 Gly Pro Ser Leu Gly His Ile Ser Arg Gly Pro Arg Leu
Gly His Ile 370 375 380
Ser Arg Glu Pro Arg Met Gly His Ile Ser Arg Glu Pro Arg Met Gly 385
390 395 400 His Ile Ser Arg
Val Leu Arg Leu Glu His Thr Thr Met Leu Met Met 405
410 415 Leu Ala Arg Leu Met His Met Gln Val
Thr Leu Ala Ile Gln Leu Gln 420 425
430 Ala Thr Arg Lys Val Gln Cys Pro Thr Ile Pro Met Leu His
Leu Arg 435 440 445
Ser Gln Gln Ala Ala Val Gln Leu Arg Thr Pro Gln Glu Ala Ser Met 450
455 460 Gly Gln Leu Val Val
Leu Asp Ile Leu Leu Gly Lys Phe Ser Arg Ala 465 470
475 480 Val Ala Leu Gln Met Gln Arg Lys His Leu
Leu Leu His His His Arg 485 490
495 Gln His His Ile Pro Pro Ala His Met Thr Lys Pro Glu Glu Pro
Arg 500 505 510 Asp
Lys Ile Trp Asp Val Asn Gln Met Asp Val Cys His Ala His Leu 515
520 525 Leu Ser Arg Gln Ile Trp
530 68 1767 DNA Zea mays 68 atggctcatc gtggacatct
agatggactg actggccaag ctcctgctct tatgcgccat 60ggttccttcg ctgcaggcag
cctctctagc cgctcacctt tgcagtcttc atccacactg 120gagatgctgg agaacaagct
tgccatgcaa actacagaag tggaaaagct tatcacggag 180aatcagcggt tagcatcaag
ccatgtggtc ttgaggcagg acattgttga tacggagaaa 240gagatgcaaa tgatccgcac
ccacctaggt gaagttcaga cagagactga tttgcagatt 300agagatttgt tggagagaat
cagattaatg gaggtagata tacatagtgg taatgtagtg 360aacaaggagc ttcaccaaat
gcatatggag gcaaagagac ttattactga aaggcagatg 420ctaacccttg agatagagga
tgtgactaaa gaattacaga aactctctgc ctctggggac 480aataaaagcc ttcctgaatt
gctttctgag ctagataggc tacggaaaga gcatcataat 540ttacgatctc agtttgaatt
tgagaaaaat acaaacgtca agcaagttga gcagatgcgg 600acaatggaaa tgaacttgat
aaccatgacc aaacaagctg agaagttacg tgttgatgtg 660gcaaatgctg aaagacgggc
acaagcagct gcggctcaag cagcagcaca tgcagctggt 720gcacaggtga cagcttcgca
gcctggacag ctcaagctac cacggtttca gcagcagcag 780ccacagactc atatgcaggt
gcatatacca gctacccccc tgcatatcag cagggagccc 840aggctggggc atatcagcag
ggtgctcagg ctggggtata tcagcaggga gcccaggctg 900gggcatatca gcagggagcc
caggctgggg catatcagca ggggggccag gatggggcat 960atcagcaggg ggctcaggct
ggggcatatc agcagggagc ccaggctggg gcatatcagc 1020agggagccca ggctggggca
tatcagcagg gtgctcaggc tggggcatat cagcagggtg 1080ctcaggctgg ggtatatcag
cagggaaccc aggctggggc atatcagcag ggagcccagg 1140ctggggcata tcagcagggg
ggccaggatg gggcatatca gcagggggct caggctgggg 1200catatcagca gggagcccag
gctggggcat atcagcaggg agcccaggct ggggcatatc 1260agcagggggc ccagtctggg
gcatatcagc agggggccca ggctggggca tatcagcagg 1320gagcccagga tggggcatat
cagcagggag cccaggatgg ggcatatcag cagggtgctc 1380aggctggagc atacaactat
gcttatgatg ctggcacggc ttatgcatat gcaggttact 1440ctggctatcc agttgcaggc
tacgcgcaaa gtgcagtgcc caactattcc tatgctgcac 1500ctccgcagcc aacaagcagc
ggtgcagcta cgaacgccgc aggaggccag tatggggcag 1560ttggtagtgc tggatatcct
actgggcaag ttcagccgag cagtggcact gcaaatgcag 1620cgcaagcacc tcctcctcca
ccaccaccgg cagcaccata tccccccagc acatatgacc 1680aaaccagagg agcccagaga
taaaatctgg gatgtaaacc agatggatgt ttgccatgca 1740catttgttga gcagacaaat
atggtga 1767 69 588 PRT Zea mays 69 Met
Ala His Arg Gly His Leu Asp Gly Leu Thr Gly Gln Ala Pro Ala 1
5 10 15 Leu Met Arg His Gly Ser
Phe Ala Ala Gly Ser Leu Ser Ser Arg Ser 20
25 30 Pro Leu Gln Ser Ser Ser Thr Leu Glu Met
Leu Glu Asn Lys Leu Ala 35 40
45 Met Gln Thr Thr Glu Val Glu Lys Leu Ile Thr Glu Asn Gln
Arg Leu 50 55 60
Ala Ser Ser His Val Val Leu Arg Gln Asp Ile Val Asp Thr Glu Lys 65
70 75 80 Glu Met Gln Met Ile
Arg Thr His Leu Gly Glu Val Gln Thr Glu Thr 85
90 95 Asp Leu Gln Ile Arg Asp Leu Leu Glu Arg
Ile Arg Leu Met Glu Val 100 105
110 Asp Ile His Ser Gly Asn Val Val Asn Lys Glu Leu His Gln Met
His 115 120 125 Met
Glu Ala Lys Arg Leu Ile Thr Glu Arg Gln Met Leu Thr Leu Glu 130
135 140 Ile Glu Asp Val Thr Lys
Glu Leu Gln Lys Leu Ser Ala Ser Gly Asp 145 150
155 160 Asn Lys Ser Leu Pro Glu Leu Leu Ser Glu Leu
Asp Arg Leu Arg Lys 165 170
175 Glu His His Asn Leu Arg Ser Gln Phe Glu Phe Glu Lys Asn Thr Asn
180 185 190 Val Lys
Gln Val Glu Gln Met Arg Thr Met Glu Met Asn Leu Ile Thr 195
200 205 Met Thr Lys Gln Ala Glu Lys
Leu Arg Val Asp Val Ala Asn Ala Glu 210 215
220 Arg Arg Ala Gln Ala Ala Ala Ala Gln Ala Ala Ala
His Ala Ala Gly 225 230 235
240 Ala Gln Val Thr Ala Ser Gln Pro Gly Gln Leu Lys Leu Pro Arg Phe
245 250 255 Gln Gln Gln
Gln Pro Gln Thr His Met Gln Val His Ile Pro Ala Thr 260
265 270 Pro Leu His Ile Ser Arg Glu Pro
Arg Leu Gly His Ile Ser Arg Val 275 280
285 Leu Arg Leu Gly Tyr Ile Ser Arg Glu Pro Arg Leu Gly
His Ile Ser 290 295 300
Arg Glu Pro Arg Leu Gly His Ile Ser Arg Gly Ala Arg Met Gly His 305
310 315 320 Ile Ser Arg Gly
Leu Arg Leu Gly His Ile Ser Arg Glu Pro Arg Leu 325
330 335 Gly His Ile Ser Arg Glu Pro Arg Leu
Gly His Ile Ser Arg Val Leu 340 345
350 Arg Leu Gly His Ile Ser Arg Val Leu Arg Leu Gly Tyr Ile
Ser Arg 355 360 365
Glu Pro Arg Leu Gly His Ile Ser Arg Glu Pro Arg Leu Gly His Ile 370
375 380 Ser Arg Gly Ala Arg
Met Gly His Ile Ser Arg Gly Leu Arg Leu Gly 385 390
395 400 His Ile Ser Arg Glu Pro Arg Leu Gly His
Ile Ser Arg Glu Pro Arg 405 410
415 Leu Gly His Ile Ser Arg Gly Pro Ser Leu Gly His Ile Ser Arg
Gly 420 425 430 Pro
Arg Leu Gly His Ile Ser Arg Glu Pro Arg Met Gly His Ile Ser 435
440 445 Arg Glu Pro Arg Met Gly
His Ile Ser Arg Val Leu Arg Leu Glu His 450 455
460 Thr Thr Met Leu Met Met Leu Ala Arg Leu Met
His Met Gln Val Thr 465 470 475
480 Leu Ala Ile Gln Leu Gln Ala Thr Arg Lys Val Gln Cys Pro Thr Ile
485 490 495 Pro Met
Leu His Leu Arg Ser Gln Gln Ala Ala Val Gln Leu Arg Thr 500
505 510 Pro Gln Glu Ala Ser Met Gly
Gln Leu Val Val Leu Asp Ile Leu Leu 515 520
525 Gly Lys Phe Ser Arg Ala Val Ala Leu Gln Met Gln
Arg Lys His Leu 530 535 540
Leu Leu His His His Arg Gln His His Ile Pro Pro Ala His Met Thr 545
550 555 560 Lys Pro Glu
Glu Pro Arg Asp Lys Ile Trp Asp Val Asn Gln Met Asp 565
570 575 Val Cys His Ala His Leu Leu Ser
Arg Gln Ile Trp 580 585
70 214 PRT Artificial sequence Coiled coil domain comprised in SEQ ID NO: 2
70 Arg Gln Pro Leu Asp Arg Ala Ala Thr Ala Leu Glu Ile Leu Glu Lys 1
5 10 15 Lys Leu Ala Glu
Gln Thr Ala Glu Ala Glu Lys Leu Ile Arg Glu Asn 20
25 30 Gln Arg Leu Ala Ser Ser His Val Val
Leu Arg Gln Asp Ile Val Asp 35 40
45 Thr Glu Lys Glu Met Gln Met Ile Arg Ala His Leu Gly Asp
Val Gln 50 55 60
Thr Glu Thr Asp Met His Met Arg Asp Leu Met Glu Arg Met Arg Leu 65
70 75 80 Met Glu Ala Asp Ile
Gln Ala Gly Asp Ala Val Lys Lys Glu Leu His 85
90 95 Gln Val His Met Glu Ala Lys Arg Leu Ile
Ala Glu Arg Gln Met Leu 100 105
110 Thr Val Glu Met Asp Lys Val Thr Lys Glu Leu His Lys Phe Ser
Gly 115 120 125 Asp
Ser Lys Lys Leu Pro Glu Leu Leu Thr Glu Leu Asp Gly Leu Arg 130
135 140 Lys Glu His Gln Ser Leu
Arg Ser Ala Phe Glu Tyr Glu Lys Asn Thr 145 150
155 160 Asn Ile Lys Gln Val Glu Gln Met Arg Thr Met
Glu Met Asn Leu Met 165 170
175 Thr Met Thr Lys Glu Ala Asp Lys Leu Arg Ala Asp Val Ala Asn Ala
180 185 190 Glu Lys
Arg Ala Gln Val Ala Ala Ala Gln Ala Val Ala Ala Gln Ala 195
200 205 Gly Val Ala His Val Thr
210 71 2194 DNA Oryza sativa 71 aatccgaaaa gtttctgcac
cgttttcacc ccctaactaa caatataggg aacgtgtgct 60aaatataaaa tgagacctta
tatatgtagc gctgataact agaactatgc aagaaaaact 120catccaccta ctttagtggc
aatcgggcta aataaaaaag agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg
gaaaatgaaa tcattattgc ttagaatata cgttcacatc 240tctgtcatga agttaaatta
ttcgaggtag ccataattgt catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa
ctcaatgggt aaagagagag atttttttta aaaaaataga 360atgaagatat tctgaacgta
ttggcaaaga tttaaacata taattatata attttatagt 420ttgtgcattc gtcatatcgc
acatcattaa ggacatgtct tactccatcc caatttttat 480ttagtaatta aagacaattg
acttattttt attatttatc ttttttcgat tagatgcaag 540gtacttacgc acacactttg
tgctcatgtg catgtgtgag tgcacctcct caatacacgt 600tcaactagca acacatctct
aatatcactc gcctatttaa tacatttagg tagcaatatc 660tgaattcaag cactccacca
tcaccagacc acttttaata atatctaaaa tacaaaaaat 720aattttacag aatagcatga
aaagtatgaa acgaactatt taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt
gcgcgagcgc caatctccca tattgggcac acaggcaaca 840acagagtggc tgcccacaga
acaacccaca aaaaacgatg atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca
gcaggctttg cggccaggag agaggaggag aggcaaagaa 960aaccaagcat cctccttctc
ccatctataa attcctcccc ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa
gagggagagc accaaggaca cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat
atcttccggt cgagttcttg gtcgatctct tccctcctcc 1140acctcctcct cacagggtat
gtgcctccct tcggttgttc ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat
gttaggaaag gggatctgta tctgtgatga ttcctgttct 1260tggatttggg atagaggggt
tcttgatgtt gcatgttatc ggttcggttt gattagtagt 1320atggttttca atcgtctgga
gagctctatg gaaatgaaat ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt
tgtttgaggt aaaatcagag caccggtgat tttgcttggt 1440gtaataaagt acggttgttt
ggtcctcgat tctggtagtg atgcttctcg atttgacgaa 1500gctatccttt gtttattccc
tattgaacaa aaataatcca actttgaaga cggtcccgtt 1560gatgagattg aatgattgat
tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga
aattcatgga aacagttata atcctcagga acaggggatt 1680ccctgttctt ccgatttgct
ttagtcccag aatttttttt cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat
gaattgattg ctacaaataa tgcttttata gcgttatcct 1800agctgtagtt cagttaatag
gtaatacccc tatagtttag tcaggagaag aacttatccg 1860atttctgatc tccattttta
attatatgaa atgaactgta gcataagcag tattcatttg 1920gattattttt tttattagct
ctcacccctt cattattctg agctgaaagt ctggcatgaa 1980ctgtcctcaa ttttgttttc
aaattcacat cgattatcta tgcattatcc tcttgtatct 2040acctgtagaa gtttcttttt
ggttattcct tgactgcttg attacagaaa gaaatttatg 2100aagctgtaat cgggatagtt
atactgcttg ttcttatgat tcatttcctt tgtgcagttc 2160ttggtgtagc ttgccacttt
caccagcaaa gttc 2194 72 1008 DNA Oryza sativa
72 gccccccgcc ggacctcccg tggccccgtg gcgcctggag ggaggagagg ggagagatgg
60tgagagagga ggaagaagag gaggggtgac aatgatatgt ggggccatgt gggccccacc
120attttttaat tcattctttt gttgaaactg acatgtgggt cccatgagat ttattatttt
180tcggatcgaa ttgccacgta agcgctacgt caatgctacg tcagatgaag accgagtcaa
240attagccacg taagcgccac gtcagccaaa accaccatcc aaaccgccga gggacctcat
300ctgcactggt tttgatagtt gagggacccg ttgtatctgg tttttcgatt gaaggacgag
360aatcaaattt gttgacaagt taagggacct taaatgaact tattccattt caaaatattc
420tgtgagccat atataccgtg ggcttccaat cctcctcaaa ttaaagggcc tttttaaaat
480agataattgc cttctttcag tcacccataa aagtacaaaa ctactaccaa caagcaacat
540gcgcagttac acacattttc tgcacatttc caccacgtca caaagagcta agagttatcc
600ctaggacaat ctcattagtg tagatacatc cattaatctt ttatcagagg caaacgtaaa
660gccgctcttt atgacaaaaa taggtgacac aaaagtgtta tctgccacat acataacttc
720agaaattacc caacaccaag agaaaaataa aaaaaaatct ttttgcaagc tccaaatctt
780ggaaaccttt ttcactcttt gcagcattgt actcttgctc tttttccaac cgatccatgt
840caccctcaag cttctacttg atctacacga agctcaccgt gcacacaacc atggccacaa
900aaaccctata aaaccccatc cgatcgccat catctcatca tcagttcatc accaacaaac
960aaaagaggaa aaaaaacata tacacttcta gtgattgtct gattgatc
1008 73 52 DNA Artificial sequence primer prm02265 73 ggggacaagt ttgtacaaaa
aagcaggctt cacaatggca taccatggac ag 52 74 46 DNA Artificial
sequence primer prm02266 74 ggggaccact ttgtacaaga aagctgggta tttcacctct
ggcctg 46 75 321 DNA Arabidopsis thaliana 75 atggataaca
ctgaccgtcg tcgccgtcgt aagcaacaca aaatcgccct ccatgactct 60gaagaagtga
gcagtatcga attggagttt atcaacatga ctgaacaaga agaagatctc 120atctttcgaa
tgtacagact tgtcggtgat aggtgggatt tgatagcagg aagagttcct 180ggaagacaac
cagaggagat agagagatat tggataatga gaaacagtga aggctttgct 240gataaacgac
gccagcttca ctcatcttcc cacaaacata ccaagcctca ccgtcctcgc 300ttttctatct
atccttccta g
321 76 106 PRT Arabidopsis thaliana 76 Met Asp Asn Thr Asp Arg Arg Arg Arg Arg
Lys Gln His Lys Ile Ala 1 5 10
15 Leu His Asp Ser Glu Glu Val Ser Ser Ile Glu Trp Glu Phe Ile
Asn 20 25 30 Met
Thr Glu Gln Glu Glu Asp Leu Ile Phe Arg Met Tyr Arg Leu Val 35
40 45 Gly Asp Arg Trp Asp Leu
Ile Ala Gly Arg Val Pro Gly Arg Gln Pro 50 55
60 Glu Glu Ile Glu Arg Tyr Trp Ile Met Arg Asn
Ser Glu Gly Phe Ala 65 70 75
80 Asp Lys Arg Arg Gln Leu His Ser Ser Ser His Lys His Thr Lys Pro
85 90 95 His Arg
Pro Arg Phe Ser Ile Tyr Pro Ser 100 105
77 231 DNA Agrostis capillaris 77 atgagcagtg gaagcttggt gaagaactcc aagacaatgg
gtgtccatga agcgaaagaa 60gttaatggca cttcacagca tttcgttgat ttcacagaag
cagaggagaa tctcgttttc 120agaatgcaca ggcttgtcgg gaccaggtgg gagcttatag
ctggagaaat ccccggaaga 180acggcaaaag aagtagagat gttttgggca aaaaagcccc
gggagcaatg a 231 78 76 PRT Agrostis capillaris 78 Met Ser Ser Gly
Ser Leu Val Lys Asn Ser Lys Thr Met Gly Val His 1 5
10 15 Glu Ala Lys Glu Val Asn Gly Thr Ser
Gln His Phe Val Asp Phe Thr 20 25
30 Glu Ala Glu Glu Asn Leu Val Phe Arg Met His Arg Leu Val
Gly Thr 35 40 45
Arg Trp Glu Leu Ile Ala Gly Glu Ile Pro Gly Arg Thr Ala Lys Glu 50
55 60 Val Glu Met Phe Trp
Ala Lys Lys Pro Arg Glu Gln 65 70 75
79 231 DNA Agrostis capillaris 79 atgagcagtg aaagcttggc gaagaactcc
aagatcatgg ctatccatga aacgaaagga 60aataatacca ctgcacagca tttcgttgat
ttcacagaag cagaggaaga tctcgttttc 120agaatgcaca ggcttgtcgg gaacaggtgg
gagcttatag ctggaagaat ccccggaaga 180acggcaaaag aagtagagat gttttgggca
aaaaagcacc aggagcaatg a 231 80 76 PRT Agrostis capillaris 80 Met
Ser Ser Glu Ser Leu Ala Lys Asn Ser Lys Ile Met Ala Ile His 1
5 10 15 Glu Thr Lys Gly Asn Asn
Thr Thr Ala Gln His Phe Val Asp Phe Thr 20
25 30 Glu Ala Glu Glu Asp Leu Val Phe Arg Met
His Arg Leu Val Gly Asn 35 40
45 Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala
Lys Glu 50 55 60
Val Glu Met Phe Trp Ala Lys Lys His Gln Glu Gln 65 70
75 81 198 DNA Arachis hypogaea 81 atgagacaag aggaggagcc
tactatgttg gaattctccg aagatgagga agatcttgtt 60gccaggatgt ttagattggt
tgggaagagg tggtctctta tcgctgggag aatccctgga 120agaacagcac aagagattga
aaagtattgg agttcaaagt gcgcatttcc cagtgaccaa 180tgctcttcct ctgcataa
198 82 65 PRT Arachis hypogaea
82 Met Arg Gln Glu Glu Glu Pro Thr Met Leu Glu Phe Ser Glu Asp Glu 1
5 10 15 Glu Asp Leu Val
Ala Arg Met Phe Arg Leu Val Gly Lys Arg Trp Ser 20
25 30 Leu Ile Ala Gly Arg Ile Pro Gly Arg
Thr Ala Gln Glu Ile Glu Lys 35 40
45 Tyr Trp Ser Ser Lys Cys Ala Phe Pro Ser Asp Gln Cys Ser
Ser Ser 50 55 60
Ala 65 83 231 DNA Avena sativa 83 atgagcagta aaagcttggc gaagaacttc
aagaccatgg gtgtccatga agcgaaagaa 60gttaatagca ctgcacagca tttcgttgat
ttcacagaag cagaggaaga tcttgttttc 120agaatgcaca ggcttgttgg gaacaggtgg
gaacttatag ctggaagaat ccccggaaga 180acagcaaaag aagtagagat gttttgggca
aaaaagcaca gggaacaatg a 231 84 76 PRT Avena sativa 84 Met Ser Ser
Lys Ser Leu Ala Lys Asn Phe Lys Thr Met Gly Val His 1 5
10 15 Glu Ala Lys Glu Val Asn Ser Thr
Ala Gln His Phe Val Asp Phe Thr 20 25
30 Glu Ala Glu Glu Asp Leu Val Phe Arg Met His Arg Leu
Val Gly Asn 35 40 45
Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala Lys Glu 50
55 60 Val Glu Met Phe
Trp Ala Lys Lys His Arg Glu Gln 65 70
75 85 252 DNA Arabidopsis thaliana 85 atgaatacgc agcgtaagtc gaagcatctt
aagaccaatc caaccattgt tgcctcttct 60tctgaagaag tgagcagtct tgagtgggaa
gaaatagcaa tggctcagga agaagaggat 120ttgatttgca ggatgtataa gcttgtcggt
gaaaggtggg atttaatagc tgggaggatt 180ccaggaagaa cagcagaaga gattgagagg
ttttgggtga tgaagaatca tcgaagatct 240caattacgtt ga
252 86 83 PRT Arabidopsis thaliana 86 Met
Asn Thr Gln Arg Lys Ser Lys His Leu Lys Thr Asn Pro Thr Ile 1
5 10 15 Val Ala Ser Ser Ser Glu
Glu Val Ser Ser Leu Glu Trp Glu Glu Ile 20
25 30 Ala Met Ala Gln Glu Glu Glu Asp Leu Ile
Cys Arg Met Tyr Lys Leu 35 40
45 Val Gly Glu Arg Trp Asp Leu Ile Ala Gly Arg Ile Pro Gly
Arg Thr 50 55 60
Ala Glu Glu Ile Glu Arg Phe Trp Val Met Lys Asn His Arg Arg Ser 65
70 75 80 Gln Leu Arg
87 1833 DNA Arabidopsis thaliana 87 atggttgcta ataataatac tagtagcaat
cgcaggaaga gaatcattac tgaaggcgac 60atcgccactc ttttgctgag atatgatatg
gagacgatac tgagaatgct acaggagata 120tcttattgtt ccgaaaccaa gatggactgg
aatgcgttgg tgaagaagac cactaccgga 180attactaatg ctagagagta ccagttgcta
tggcgtcatc tttcttatcg gcatcctctc 240ctccctgtgg aagatgatgc tctacctctg
gacgacgata gtgacatgga gtgcgaattg 300gaagcttctc ctgcagtcag ccatgaagca
tcagtggagg ctattgcaca tgtcaaagtg 360atggctgctt catatgttct aagtgagtct
gatatactcg acgattcaac agttgaggct 420cccttgacta taaacatacc ttatgctttg
cctgagggtt ctcaggaacc atcagagtct 480ccttggtcgt caagagggat gaatatcaac
tttccggtct gtcttcagaa agttacatct 540accgagggga tgaatggaaa tggttcagct
ggtattagca tggcttttcg gaggaaaagg 600aaaagatggt ctgctgagga ggatgaggag
ctgttcgccg ctgtaaagcg atgtggtgaa 660gggaactggg ctcatattgt taagggagac
tttagaggag agagaaccgc ctcccaactc 720tcgcagaggt gggcgcttat aagaaaaagg
tgtcacactt cgacctctgt tagccaatgt 780ggcctacaag gaactgaagc gaaactagca
gttaaccatg cattatcttt agctctggga 840aatcggcccc cttcaaataa gcttgcaata
ggtcttatgc caacgacgtc atcttgtacc 900atcacagaaa cggaagcgaa tgggggaagt
tcttctcaag gtcaacaaca gtccaaacca 960attgttcaag cattgcctcg ggcaggaaca
tcacttccgg ctgcaaagtc tcgagttgtt 1020aaaaaaacaa cagcaagctc cacttccaga
tcggatctta tggtaacagc taattcagta 1080gctgcagctg catgcatggg tgatgtattg
actgctgcat caggacgaaa ggtcgaacct 1140ggaaaaactg atgctccacg agtgccaaag
actaaacctg taaaacatgc ttctacagtc 1200tgcatgcctc agccctcagg tagcctctcc
atgccaaagg ttgaaccagg aacgagtgtt 1260gccgcctcta tacggtctct agctaatgga
aaattgaaac ctgttatggc ttcatcatct 1320tccaacaaac ctcctctcat agctcctcgt
tcagaaggat cttcaatgct ttctgcttcc 1380gcccccttgg cttctctatc aaggattgtc
tccaatcaga gagtttttgc aggctctgtc 1440ccagctactg agattgtcac ttgcaaacca
gatggtggac agaaagggca agctcgtgga 1500aatgaagcaa gctcatcggc tgcaatccag
ccacatcaaa taacctcaag aaacttggag 1560attagccagg gaaagcaggc tacacaggct
cagtccccta atctcttgcc taggaaagtt 1620ccagtagttc ggactgcagt tcattgtgcc
actaaccaaa agttgatgga taaaccatct 1680gatcaaactg tagtacctat cagaggagct
ggttcgcaat ctaaagccaa aggtgaagta 1740aacagtaagg ttggtccggt gatcaaagtg
agtagtgttt gcggaaaacc ccttgaggtt 1800gcaactgtgg cagggaccgg acagggtgtt
tag 1833 88 610 PRT Arabidopsis thaliana 88 Met
Val Ala Asn Asn Asn Thr Ser Ser Asn Arg Arg Lys Arg Ile Ile 1
5 10 15 Thr Glu Gly Asp Ile Ala
Thr Leu Leu Leu Arg Tyr Asp Met Glu Thr 20
25 30 Ile Leu Arg Met Leu Gln Glu Ile Ser Tyr
Cys Ser Glu Thr Lys Met 35 40
45 Asp Trp Asn Ala Leu Val Lys Lys Thr Thr Thr Gly Ile Thr
Asn Ala 50 55 60
Arg Glu Tyr Gln Leu Leu Trp Arg His Leu Ser Tyr Arg His Pro Leu 65
70 75 80 Leu Pro Val Glu Asp
Asp Ala Leu Pro Leu Asp Asp Asp Ser Asp Met 85
90 95 Glu Cys Glu Leu Glu Ala Ser Pro Ala Val
Ser His Glu Ala Ser Val 100 105
110 Glu Ala Ile Ala His Val Lys Val Met Ala Ala Ser Tyr Val Leu
Ser 115 120 125 Glu
Ser Asp Ile Leu Asp Asp Ser Thr Val Glu Ala Pro Leu Thr Ile 130
135 140 Asn Ile Pro Tyr Ala Leu
Pro Glu Gly Ser Gln Glu Pro Ser Glu Ser 145 150
155 160 Pro Trp Ser Ser Arg Gly Met Asn Ile Asn Phe
Pro Val Cys Leu Gln 165 170
175 Lys Val Thr Ser Thr Glu Gly Met Asn Gly Asn Gly Ser Ala Gly Ile
180 185 190 Ser Met
Ala Phe Arg Arg Lys Arg Lys Arg Trp Ser Ala Glu Glu Asp 195
200 205 Glu Glu Leu Phe Ala Ala Val
Lys Arg Cys Gly Glu Gly Asn Trp Ala 210 215
220 His Ile Val Lys Gly Asp Phe Arg Gly Glu Arg Thr
Ala Ser Gln Leu 225 230 235
240 Ser Gln Arg Trp Ala Leu Ile Arg Lys Arg Cys His Thr Ser Thr Ser
245 250 255 Val Ser Gln
Cys Gly Leu Gln Gly Thr Glu Ala Lys Leu Ala Val Asn 260
265 270 His Ala Leu Ser Leu Ala Leu Gly
Asn Arg Pro Pro Ser Asn Lys Leu 275 280
285 Ala Ile Gly Leu Met Pro Thr Thr Ser Ser Cys Thr Ile
Thr Glu Thr 290 295 300
Glu Ala Asn Gly Gly Ser Ser Ser Gln Gly Gln Gln Gln Ser Lys Pro 305
310 315 320 Ile Val Gln Ala
Leu Pro Arg Ala Gly Thr Ser Leu Pro Ala Ala Lys 325
330 335 Ser Arg Val Val Lys Lys Thr Thr Ala
Ser Ser Thr Ser Arg Ser Asp 340 345
350 Leu Met Val Thr Ala Asn Ser Val Ala Ala Ala Ala Cys Met
Gly Asp 355 360 365
Val Leu Thr Ala Ala Ser Gly Arg Lys Val Glu Pro Gly Lys Thr Asp 370
375 380 Ala Pro Arg Val Pro
Lys Thr Lys Pro Val Lys His Ala Ser Thr Val 385 390
395 400 Cys Met Pro Gln Pro Ser Gly Ser Leu Ser
Met Pro Lys Val Glu Pro 405 410
415 Gly Thr Ser Val Ala Ala Ser Ile Arg Ser Leu Ala Asn Gly Lys
Leu 420 425 430 Lys
Pro Val Met Ala Ser Ser Ser Ser Asn Lys Pro Pro Leu Ile Ala 435
440 445 Pro Arg Ser Glu Gly Ser
Ser Met Leu Ser Ala Ser Ala Pro Leu Ala 450 455
460 Ser Leu Ser Arg Ile Val Ser Asn Gln Arg Val
Phe Ala Gly Ser Val 465 470 475
480 Pro Ala Thr Glu Ile Val Thr Cys Lys Pro Asp Gly Gly Gln Lys Gly
485 490 495 Gln Ala
Arg Gly Asn Glu Ala Ser Ser Ser Ala Ala Ile Gln Pro His 500
505 510 Gln Ile Thr Ser Arg Asn Leu
Glu Ile Ser Gln Gly Lys Gln Ala Thr 515 520
525 Gln Ala Gln Ser Pro Asn Leu Leu Pro Arg Lys Val
Pro Val Val Arg 530 535 540
Thr Ala Val His Cys Ala Thr Asn Gln Lys Leu Met Asp Lys Pro Ser 545
550 555 560 Asp Gln Thr
Val Val Pro Ile Arg Gly Ala Gly Ser Gln Ser Lys Ala 565
570 575 Lys Gly Glu Val Asn Ser Lys Val
Gly Pro Val Ile Lys Val Ser Ser 580 585
590 Val Cys Gly Lys Pro Leu Glu Val Ala Thr Val Ala Gly
Thr Gly Gln 595 600 605
Gly Val 610 89 924 DNA Arabidopsis thaliana 89 atggtgagat cttgttcttc
aaaatcaaag aatccttgga caaacgaaga agacacaacc 60caaaagtttg tgtttgcgtc
tgcgtccaaa aacggatgtg cagctcctaa gaaaatagga 120cttaggagat gtggaaagag
ttgcagagtg agaaagactg atcattcagg aaccaaacat 180gagagcttca cttctgaaga
cgaagatctg atcatcaaga tgcacgcagc aatgggaagc 240agatggcaac ttattgcaca
acatttacca ggaaagacag aagaagaagt gaagatgttt 300tggaacacaa aactgaagaa
gaaactgtcg gaaatgggga tagatcatgt cactcaccgt 360cccttttctc acgtacttgc
tgaatacggc aacatcaatg gtggtggaaa cctaaaccct 420aatccctcga accaagccgg
atctcttgga cgcaatcact cgctcaatga tgatggtcat 480caacaacaac ctaatgattc
aggagatctc atgtttcatt tacaagcaat caagcttatg 540acagattcat cgaaccaagt
caagcctgag tctacgtttg tgtacgcctc ttcgtcttct 600tctaactcgt ctcctccatt
gttctcttca acttgttcta ccatagctca ggagaattca 660gaggttaact tcacttggtc
tgacttcctt cttgaccaag aaaccttcca tgaaaaccaa 720cagaatcatc ctcaagaact
agacagcttg tttgggaacg acttctccga ggtaacagca 780gctacaatgg ctaacacatc
aaccgtacca tctcagatcg aagaagaatc tttgagcaat 840gggttcgttg aatcgattat
cgctaaagaa aaggagtttt tcttgggatt tccgagctat 900ctggaacaac ctttccactt
ttag 924 90 307 PRT Arabidopsis
thaliana 90 Met Val Arg Ser Cys Ser Ser Lys Ser Lys Asn Pro Trp Thr Asn
Glu 1 5 10 15 Glu
Asp Thr Thr Gln Lys Phe Val Phe Ala Ser Ala Ser Lys Asn Gly
20 25 30 Cys Ala Ala Pro Lys
Lys Ile Gly Leu Arg Arg Cys Gly Lys Ser Cys 35
40 45 Arg Val Arg Lys Thr Asp His Ser Gly
Thr Lys His Glu Ser Phe Thr 50 55
60 Ser Glu Asp Glu Asp Leu Ile Ile Lys Met His Ala Ala
Met Gly Ser 65 70 75
80 Arg Trp Gln Leu Ile Ala Gln His Leu Pro Gly Lys Thr Glu Glu Glu
85 90 95 Val Lys Met Phe
Trp Asn Thr Lys Leu Lys Lys Lys Leu Ser Glu Met 100
105 110 Gly Ile Asp His Val Thr His Arg Pro
Phe Ser His Val Leu Ala Glu 115 120
125 Tyr Gly Asn Ile Asn Gly Gly Gly Asn Leu Asn Pro Asn Pro
Ser Asn 130 135 140
Gln Ala Gly Ser Leu Gly Arg Asn His Ser Leu Asn Asp Asp Gly His 145
150 155 160 Gln Gln Gln Pro Asn
Asp Ser Gly Asp Leu Met Phe His Leu Gln Ala 165
170 175 Ile Lys Leu Met Thr Asp Ser Ser Asn Gln
Val Lys Pro Glu Ser Thr 180 185
190 Phe Val Tyr Ala Ser Ser Ser Ser Ser Asn Ser Ser Pro Pro Leu
Phe 195 200 205 Ser
Ser Thr Cys Ser Thr Ile Ala Gln Glu Asn Ser Glu Val Asn Phe 210
215 220 Thr Trp Ser Asp Phe Leu
Leu Asp Gln Glu Thr Phe His Glu Asn Gln 225 230
235 240 Gln Asn His Pro Gln Glu Leu Asp Ser Leu Phe
Gly Asn Asp Phe Ser 245 250
255 Glu Val Thr Ala Ala Thr Met Ala Asn Thr Ser Thr Val Pro Ser Gln
260 265 270 Ile Glu
Glu Glu Ser Leu Ser Asn Gly Phe Val Glu Ser Ile Ile Ala 275
280 285 Lys Glu Lys Glu Phe Phe Leu
Gly Phe Pro Ser Tyr Leu Glu Gln Pro 290 295
300 Phe His Phe 305 91 2505 DNA Arabidopsis
thaliana 91 atggttgata acagtaacaa taagaagagg aaagagttca tcagtgaagc
agacatcgcc 60actcttttgc agagatatga tactgtgacg atactgaagt tgctacaaga
aatggcgtat 120tatgctgaag caaagatgaa ttggaatgag ttagtgaaga agacaagtac
tggaattact 180agtgctagag aatatcagtt gctttggcgg catcttgctt atagagattc
tctcgtccct 240gtgggaaata atgctcgagt tctggatgat gatagtgata tggagtgtga
attggaagca 300tcccctggag ttagtgttga tgtagtaacg gaagctgttg cgcatgtgaa
agtgatggct 360gcttcctatg tgccaagtga gtccgatatt cccgaagact caacggttga
ggctcccttg 420accattaaca taccttacag cctgcatagg gggcctcagg aaccatcaga
ctcatattgg 480tcatcaagag ggatgaatat cacctttcct gtttttcttc cgaaagcagc
tgaaggacat 540aatgggaatg ggttagccag tagcttggct cctcggaaga gaagaaaaaa
atggtcagct 600gaggaggatg aggagctgat tgctgctgtt aagcgacatg gtgaaggcag
ctgggccctt 660atctctaagg aagaatttga aggagagcga acagcctcac aactctcaca
gcggtggggg 720gctataagga gaaggactga tacttcaaac acttctaccc aaactggcct
acagcgaaca 780gaagcacaaa tggcagctaa tcgtgcatta tctttagcgg tgggaaatcg
gttaccctca 840aaaaaacttg cagtaggtat gactccaatg ctgtcatccg gtaccatcaa
gggagcacaa 900gccaatggtg ccagcagtgg tagtacattg caaggtcaac aacagcctca
gccacaaatt 960caagcattat cacgggcaac aacatcagtg ccagttgcaa aatctcgagt
tcctgtaaag 1020aaaacaacag ggaactccac ttcgagagca gacctaatgg taactgctaa
ttcagtagct 1080gctgcagcct gtatgtctgg cctggcaacc gctgtaacag tgcctaagat
tgaaccagga 1140aagaatgctg tttctgcgtt ggtgccgaag actgaacccg taaaaaccgc
ttccacagtt 1200tctatgcctc gtccttcagg tatatcatca gcactgaata ctgagcctgt
aaaaaccgct 1260gtggcagcct ctttgcctcg ttcatcaggt attatttcag caccaaaggt
tgagcctgta 1320aaaaccgctg cttcagcagc ctctttgcct cgtccatcag gaatgatatc
agcaccaaag 1380gttgagcctg tgaaaaccac cgcctctgta gcctctttgc ctcgtccatc
aggtattatt 1440tcagctccaa aggctgagcc tgtaaaaacc gctgcttctg cagcctcttc
gcctcgtcca 1500tcaggaatga tatcagcacc aaaggttgag tctgtgaaaa ccaccgcctc
tatgcctcgt 1560ccatcaggta ttatatccgc accaaaggct gagcttgtaa aatccgccgc
ttctgcagcc 1620tctttgcctt gtacatcagg tattatatct tcaccaaagg ctgagcttgt
aaaatccgcc 1680gcttctgcag cctcttttcc tcgcccatca agtatgctat cagcaccaaa
ggctgaccca 1740gtaaagattg ttcctgctgc tgccactaac actaaatcgg ttggaccttt
gaatttaagg 1800catgcagtca atggaagccc aaaccacacg ataccttcat caccctttac
taagccttta 1860catatggctc ctctctccaa aggatctaca atccagagta attcagttcc
tcctagtttt 1920gcatcgtcaa ggttggtccc cacacagaga gctcctgcgg ctactgttgt
cacgccacaa 1980aagccaagtg tggtagcggc agctactgtt gtcacgccac aaaagccaag
tgtgggagca 2040gcagctactg ttgtaacgcc acaaaagcca agtgtgggag cagcagctaa
tgttgtaacg 2100ccacaaaagc caagtgtggg atcagcagct actgttgtaa cgccacaaaa
gccaagtgtg 2160ggagcagcag ttaccgtcac ttccaagccg gttggtgtac agaaagagca
aactcaggga 2220aacagagcaa gccccttggt tacagcaaca cttccgccaa ataaaaccat
cccagcaaat 2280tcagtgattg gcacagcaaa agcggtggct gcgaaagtgg agactcctcc
tagccttatg 2340cctaagaaaa atgaagtagt tggcagttgc accgataaaa gttcattgga
taaaccacct 2400gagaaagaaa gtactaccac ggtgtcacct ctagctgtag ctgcgactaa
atcaaaaccc 2460aaagatgaag caaccgtgac agggaccgga ctgaaggagt tgtag
2505
92 834 PRT Arabidopsis thaliana
92 Met Val Asp Asn Ser Asn Asn
Lys Lys Arg Lys Glu Phe Ile Ser Glu 1 5
10 15 Ala Asp Ile Ala Thr Leu Leu Gln Arg Tyr Asp
Thr Val Thr Ile Leu 20 25
30 Lys Leu Leu Gln Glu Met Ala Tyr Tyr Ala Glu Ala Lys Met Asn
Trp 35 40 45 Asn
Glu Leu Val Lys Lys Thr Ser Thr Gly Ile Thr Ser Ala Arg Glu 50
55 60 Tyr Gln Leu Leu Trp Arg
His Leu Ala Tyr Arg Asp Ser Leu Val Pro 65 70
75 80 Val Gly Asn Asn Ala Arg Val Leu Asp Asp Asp
Ser Asp Met Glu Cys 85 90
95 Glu Leu Glu Ala Ser Pro Gly Val Ser Val Asp Val Val Thr Glu Ala
100 105 110 Val Ala
His Val Lys Val Met Ala Ala Ser Tyr Val Pro Ser Glu Ser 115
120 125 Asp Ile Pro Glu Asp Ser Thr
Val Glu Ala Pro Leu Thr Ile Asn Ile 130 135
140 Pro Tyr Ser Leu His Arg Gly Pro Gln Glu Pro Ser
Asp Ser Tyr Trp 145 150 155
160 Ser Ser Arg Gly Met Asn Ile Thr Phe Pro Val Phe Leu Pro Lys Ala
165 170 175 Ala Glu Gly
His Asn Gly Asn Gly Leu Ala Ser Ser Leu Ala Pro Arg 180
185 190 Lys Arg Arg Lys Lys Trp Ser Ala
Glu Glu Asp Glu Glu Leu Ile Ala 195 200
205 Ala Val Lys Arg His Gly Glu Gly Ser Trp Ala Leu Ile
Ser Lys Glu 210 215 220
Glu Phe Glu Gly Glu Arg Thr Ala Ser Gln Leu Ser Gln Arg Trp Gly 225
230 235 240 Ala Ile Arg Arg
Arg Thr Asp Thr Ser Asn Thr Ser Thr Gln Thr Gly 245
250 255 Leu Gln Arg Thr Glu Ala Gln Met Ala
Ala Asn Arg Ala Leu Ser Leu 260 265
270 Ala Val Gly Asn Arg Leu Pro Ser Lys Lys Leu Ala Val Gly
Met Thr 275 280 285
Pro Met Leu Ser Ser Gly Thr Ile Lys Gly Ala Gln Ala Asn Gly Ala 290
295 300 Ser Ser Gly Ser Thr
Leu Gln Gly Gln Gln Gln Pro Gln Pro Gln Ile 305 310
315 320 Gln Ala Leu Ser Arg Ala Thr Thr Ser Val
Pro Val Ala Lys Ser Arg 325 330
335 Val Pro Val Lys Lys Thr Thr Gly Asn Ser Thr Ser Arg Ala Asp
Leu 340 345 350 Met
Val Thr Ala Asn Ser Val Ala Ala Ala Ala Cys Met Ser Gly Leu 355
360 365 Ala Thr Ala Val Thr Val
Pro Lys Ile Glu Pro Gly Lys Asn Ala Val 370 375
380 Ser Ala Leu Val Pro Lys Thr Glu Pro Val Lys
Thr Ala Ser Thr Val 385 390 395
400 Ser Met Pro Arg Pro Ser Gly Ile Ser Ser Ala Leu Asn Thr Glu Pro
405 410 415 Val Lys
Thr Ala Val Ala Ala Ser Leu Pro Arg Ser Ser Gly Ile Ile 420
425 430 Ser Ala Pro Lys Val Glu Pro
Val Lys Thr Ala Ala Ser Ala Ala Ser 435 440
445 Leu Pro Arg Pro Ser Gly Met Ile Ser Ala Pro Lys
Val Glu Pro Val 450 455 460
Lys Thr Thr Ala Ser Val Ala Ser Leu Pro Arg Pro Ser Gly Ile Ile 465
470 475 480 Ser Ala Pro
Lys Ala Glu Pro Val Lys Thr Ala Ala Ser Ala Ala Ser 485
490 495 Ser Pro Arg Pro Ser Gly Met Ile
Ser Ala Pro Lys Val Glu Ser Val 500 505
510 Lys Thr Thr Ala Ser Met Pro Arg Pro Ser Gly Ile Ile
Ser Ala Pro 515 520 525
Lys Ala Glu Leu Val Lys Ser Ala Ala Ser Ala Ala Ser Leu Pro Cys 530
535 540 Thr Ser Gly Ile
Ile Ser Ser Pro Lys Ala Glu Leu Val Lys Ser Ala 545 550
555 560 Ala Ser Ala Ala Ser Phe Pro Arg Pro
Ser Ser Met Leu Ser Ala Pro 565 570
575 Lys Ala Asp Pro Val Lys Ile Val Pro Ala Ala Ala Thr Asn
Thr Lys 580 585 590
Ser Val Gly Pro Leu Asn Leu Arg His Ala Val Asn Gly Ser Pro Asn
595 600 605 His Thr Ile Pro
Ser Ser Pro Phe Thr Lys Pro Leu His Met Ala Pro 610
615 620 Leu Ser Lys Gly Ser Thr Ile Gln
Ser Asn Ser Val Pro Pro Ser Phe 625 630
635 640 Ala Ser Ser Arg Leu Val Pro Thr Gln Arg Ala Pro
Ala Ala Thr Val 645 650
655 Val Thr Pro Gln Lys Pro Ser Val Val Ala Ala Ala Thr Val Val Thr
660 665 670 Pro Gln Lys
Pro Ser Val Gly Ala Ala Ala Thr Val Val Thr Pro Gln 675
680 685 Lys Pro Ser Val Gly Ala Ala Ala
Asn Val Val Thr Pro Gln Lys Pro 690 695
700 Ser Val Gly Ser Ala Ala Thr Val Val Thr Pro Gln Lys
Pro Ser Val 705 710 715
720 Gly Ala Ala Val Thr Val Thr Ser Lys Pro Val Gly Val Gln Lys Glu
725 730 735 Gln Thr Gln Gly
Asn Arg Ala Ser Pro Leu Val Thr Ala Thr Leu Pro 740
745 750 Pro Asn Lys Thr Ile Pro Ala Asn Ser
Val Ile Gly Thr Ala Lys Ala 755 760
765 Val Ala Ala Lys Val Glu Thr Pro Pro Ser Leu Met Pro Lys
Lys Asn 770 775 780
Glu Val Val Gly Ser Cys Thr Asp Lys Ser Ser Leu Asp Lys Pro Pro 785
790 795 800 Glu Lys Glu Ser Thr
Thr Thr Val Ser Pro Leu Ala Val Ala Ala Thr 805
810 815 Lys Ser Lys Pro Lys Asp Glu Ala Thr Val
Thr Gly Thr Gly Leu Lys 820 825
830 Glu Leu
93 588 DNA Arabidopsis thaliana
93 atgaacaaaa
cccgccttcg tgctctctcc ccaccttccg gtatgcaaca ccgtaagaga 60tgtcgattga
gaggtcgaaa ctacgtaagg ccagaagtta aacaacgcaa cttctcaaaa 120gatgaagacg
atctcatcct caagcttcat gcacttcttg gcaatagatg gtcattgata 180gcgggaagat
tgccaggacg aaccgacaac gaagttagga tccattggga aacttaccta 240aaaaggaagc
tcgtaaaaat gggaatcgac ccaaccaatc atcgtctcca ccatcacacc 300aactacattt
ctagacgtca cctccattct tcacataagg aacatgaaac caagattatt 360agtgatcaat
cttcttcggt atccgaatca tgtggtgtaa caattttgcc cattccaagt 420accaattgct
cggaggatag tactagtacc ggacgaagtc atttgcctga cctaaacatt 480ggtctcatcc
cggccgtgac ttctttgcca gctctttgcc ttcaggactc tagcgaatcc 540tctaccaatg
gttcaacagg tcaagaaacg cttcttctat tccgatga
588
94 195 PRT Arabidopsis thaliana
94 Met Asn Lys Thr Arg Leu Arg Ala Leu Ser
Pro Pro Ser Gly Met Gln 1 5 10
15 His Arg Lys Arg Cys Arg Leu Arg Gly Arg Asn Tyr Val Arg Pro
Glu 20 25 30 Val
Lys Gln Arg Asn Phe Ser Lys Asp Glu Asp Asp Leu Ile Leu Lys 35
40 45 Leu His Ala Leu Leu Gly
Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu 50 55
60 Pro Gly Arg Thr Asp Asn Glu Val Arg Ile His
Trp Glu Thr Tyr Leu 65 70 75
80 Lys Arg Lys Leu Val Lys Met Gly Ile Asp Pro Thr Asn His Arg Leu
85 90 95 His His
His Thr Asn Tyr Ile Ser Arg Arg His Leu His Ser Ser His 100
105 110 Lys Glu His Glu Thr Lys Ile
Ile Ser Asp Gln Ser Ser Ser Val Ser 115 120
125 Glu Ser Cys Gly Val Thr Ile Leu Pro Ile Pro Ser
Thr Asn Cys Ser 130 135 140
Glu Asp Ser Thr Ser Thr Gly Arg Ser His Leu Pro Asp Leu Asn Ile 145
150 155 160 Gly Leu Ile
Pro Ala Val Thr Ser Leu Pro Ala Leu Cys Leu Gln Asp 165
170 175 Ser Ser Glu Ser Ser Thr Asn Gly
Ser Thr Gly Gln Glu Thr Leu Leu 180 185
190 Leu Phe Arg 195
95 339 DNA Arabidopsis thaliana
95 atggataata ccaaccgtct tcgtcttcgt cgcggtccca gtcttaggca
aactaagttc 60actcgatccc gatatgactc tgaagaagtg agtagcatcg aatgggagtt
tatcagtatg 120accgaacaag aagaagatct catctctcga atgtacagac ttgtcggtaa
taggtgggat 180ttaatagcag gaagagtcgt aggaagaaag gcaaatgaga ttgagagata
ctggattatg 240agaaactctg actatttttc tcacaaacga cgacgtctta ataattctcc
ctttttttct 300acttctcctc ttaatctcca agaaaatcta aaattgtaa
339
96 112 PRT Arabidopsis thaliana
96 Met Asp Asn Thr Asn Arg Leu
Arg Leu Arg Arg Gly Pro Ser Leu Arg 1 5
10 15 Gln Thr Lys Phe Thr Arg Ser Arg Tyr Asp Ser
Glu Glu Val Ser Ser 20 25
30 Ile Glu Trp Glu Phe Ile Ser Met Thr Glu Gln Glu Glu Asp Leu
Ile 35 40 45 Ser
Arg Met Tyr Arg Leu Val Gly Asn Arg Trp Asp Leu Ile Ala Gly 50
55 60 Arg Val Val Gly Arg Lys
Ala Asn Glu Ile Glu Arg Tyr Trp Ile Met 65 70
75 80 Arg Asn Ser Asp Tyr Phe Ser His Lys Arg Arg
Arg Leu Asn Asn Ser 85 90
95 Pro Phe Phe Ser Thr Ser Pro Leu Asn Leu Gln Glu Asn Leu Lys Leu
100 105 110
97 285 DNA Arabidopsis thaliana
97 atgtttcgtt cagacaaggc ggaaaaaatg
gataaacgac gacggagaca gagcaaagcc 60aaggcttctt gttccgaaga ggtgagtagt
atcgaatggg aagctgtgaa gatgtcagaa 120gaagaagaag atctcatttc tcggatgtat
aaactcgttg gcgacaggtg ggagttgatc 180gccggaagga tcccgggacg gacgccggag
gagatagaga gatattggct tatgaaacac 240ggcgtcgttt ttgccaacag acgaagagac
ttttttagga aatga 285
98 94 PRT Arabidopsis thaliana
98 Met
Phe Arg Ser Asp Lys Ala Glu Lys Met Asp Lys Arg Arg Arg Arg 1
5 10 15 Gln Ser Lys Ala Lys Ala
Ser Cys Ser Glu Glu Val Ser Ser Ile Glu 20
25 30 Trp Glu Ala Val Lys Met Ser Glu Glu Glu
Glu Asp Leu Ile Ser Arg 35 40
45 Met Tyr Lys Leu Val Gly Asp Arg Trp Glu Leu Ile Ala Gly
Arg Ile 50 55 60
Pro Gly Arg Thr Pro Glu Glu Ile Glu Arg Tyr Trp Leu Met Lys His 65
70 75 80 Gly Val Val Phe Ala
Asn Arg Arg Arg Asp Phe Phe Arg Lys 85
90
99 234 DNA Arabidopsis thaliana
99 atggataacc atcgcaggac
taagcaaccc aagaccaact ccatcgttac ttcttcttct 60gaaggaacag aagtgagtag
tcttgagtgg gaagttgtga acatgagtca agaagaagaa 120gatttggtct ctcgaatgca
taagcttgtc ggtgacaggt gggaactgat agctgggagg 180atcccaggaa gaaccgctgg
agaaattgag aggttttggg tcatgaaaaa ttga 234
100 77 PRT Arabidopsis thaliana
100 Met Asp Asn His Arg Arg Thr Lys Gln Pro Lys Thr Asn Ser Ile
Val 1 5 10 15 Thr
Ser Ser Ser Glu Gly Thr Glu Val Ser Ser Leu Glu Trp Glu Val
20 25 30 Val Asn Met Ser Gln
Glu Glu Glu Asp Leu Val Ser Arg Met His Lys 35
40 45 Leu Val Gly Asp Arg Trp Glu Leu Ile
Ala Gly Arg Ile Pro Gly Arg 50 55
60 Thr Ala Gly Glu Ile Glu Arg Phe Trp Val Met Lys Asn
65 70 75
101 225 DNA Bruguiera gymnorrhiza
101 atggctgacg tggataactc atccgttgat gaattctctg ttgattctag
agaggaatcc 60agccaggatt ctaagcttga gttctcagag gatgaggaga cccttattac
taggatgtac 120aatctggttg gtgagaggtg gcctttgatt gctgggagga ttcctggaag
gacagcagag 180gagattgaga agtactggac ttcaagatac tctacaagcc aatga
225
102 74 PRT Bruguiera gymnorrhiza
102 Met Ala Asp Val Asp Asn
Ser Ser Val Asp Glu Phe Ser Val Asp Ser 1 5
10 15 Arg Glu Glu Ser Ser Gln Asp Ser Lys Leu Glu
Phe Ser Glu Asp Glu 20 25
30 Glu Thr Leu Ile Thr Arg Met Tyr Asn Leu Val Gly Glu Arg Trp
Pro 35 40 45 Leu
Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala Glu Glu Ile Glu Lys 50
55 60 Tyr Trp Thr Ser Arg Tyr
Ser Thr Ser Gln 65 70
103 309 DNA Brassica napus
103 atggattcaa cgtatcgacg tcagcgtcac aactctgaag aagtgtgtag
cgtaaagtgg 60gatttcatca aaatgagcca acaggaggaa gatctcatct taagaatgta
cagactcgta 120ggcgataggt gggaaataat agcaggaaga gtaccggcga agaaaagctg
tggagataga 180gagatattgg atcatgagaa acaacacaca tgtccgccct ccatcttcca
aattttaacc 240tccttcttgt gctgtgcctc ccatgtttgt tttaagtgtt atattttcat
ttccaaaact 300aaaaactag
309
104 102 PRT Brassica napus
104 Met Asp Ser Thr Tyr Arg Arg Gln
Arg His Asn Ser Glu Glu Val Cys 1 5 10
15 Ser Val Lys Trp Asp Phe Ile Lys Met Ser Gln Gln Glu
Glu Asp Leu 20 25 30
Ile Leu Arg Met Tyr Arg Leu Val Gly Asp Arg Trp Glu Ile Ile Ala
35 40 45 Gly Arg Val Pro
Ala Lys Lys Ser Cys Gly Asp Arg Glu Ile Leu Asp 50
55 60 His Glu Lys Gln His Thr Cys Pro
Pro Ser Ile Phe Gln Ile Leu Thr 65 70
75 80 Ser Phe Leu Cys Cys Ala Ser His Val Cys Phe Lys
Cys Tyr Ile Phe 85 90
95 Ile Ser Lys Thr Lys Asn 100
105 324 DNA Brassica napus
105 atggataaca ctgaccgtcg tcgccgtcgt aagcaacaca aagtcactct
ccatgactct 60gaagaagtga gcagtattga atgggagttt atcaatatga cagaacaaga
agaagatctc 120atctttcgaa tgcatagact tgtcggtgat aggtgggatt taatagcagg
acgagtgcca 180ggaagacaac cagaagagat agagagatac tggataatga gaaacagtga
tggctttgct 240gagaaacgac gccaacttca tcactcctct tctcacaaaa gtaccaaacc
tcatcgtcca 300cgtttttcta tttatccttc ttag
324
106 107 PRT Brassica napus
106 Met Asp Asn Thr Asp Arg Arg Arg
Arg Arg Lys Gln His Lys Val Thr 1 5 10
15 Leu His Asp Ser Glu Glu Val Ser Ser Ile Glu Trp Glu
Phe Ile Asn 20 25 30
Met Thr Glu Gln Glu Glu Asp Leu Ile Phe Arg Met His Arg Leu Val
35 40 45 Gly Asp Arg Trp
Asp Leu Ile Ala Gly Arg Val Pro Gly Arg Gln Pro 50
55 60 Glu Glu Ile Glu Arg Tyr Trp Ile
Met Arg Asn Ser Asp Gly Phe Ala 65 70
75 80 Glu Lys Arg Arg Gln Leu His His Ser Ser Ser His
Lys Ser Thr Lys 85 90
95 Pro His Arg Pro Arg Phe Ser Ile Tyr Pro Ser 100
105
107 249 DNA Brassica napus
107 atggataagc agcgtaagtc
gaagcatccc aagaccaatg cttatgccac cattgtttcc 60tcttcttcgg aagaagtgag
cagtcttgag tgggaagaaa tagcaatgac acaagaagaa 120gaggatttga tctgcaggat
gtataagctt gtcggcgaaa ggtgggattt aataactggg 180aggattccag gaagaacggc
acaagtgatc gagaggtttt gggtcatgaa gaatcatcga 240agagcttga
249
108 82 PRT Brassica napus
108 Met Asp Lys Gln Arg Lys Ser Lys His Pro Lys Thr Asn Ala Tyr Ala 1
5 10 15 Thr Ile Val Ser
Ser Ser Ser Glu Glu Val Ser Ser Leu Glu Trp Glu 20
25 30 Glu Ile Ala Met Thr Gln Glu Glu Glu
Asp Leu Ile Cys Arg Met Tyr 35 40
45 Lys Leu Val Gly Glu Arg Trp Asp Leu Ile Thr Gly Arg Ile
Pro Gly 50 55 60
Arg Thr Ala Gln Val Ile Glu Arg Phe Trp Val Met Lys Asn His Arg 65
70 75 80 Arg Ala
109 258 DNA Brassica napus
109 atggatagac gacgtcgtag acagagcaag gccaaagcgt
cgtgttccga agaagtgagt 60agcatagaat gggaagctgt gaagatgacg gaggaagaag
aagatctcat ttctcggatg 120tataaactcg tcggagacag gtgggaattg atagccggaa
ggattccagg acggacgccg 180gaggagatag aaagatattg gcttatgaaa cacggtgtcg
tttttgccaa ccgaccaaga 240gattttgtta ggagatga
258
110 85 PRT Brassica napus
110 Met Asp Arg Arg Arg
Arg Arg Gln Ser Lys Ala Lys Ala Ser Cys Ser 1 5
10 15 Glu Glu Val Ser Ser Ile Glu Trp Glu Ala
Val Lys Met Thr Glu Glu 20 25
30 Glu Glu Asp Leu Ile Ser Arg Met Tyr Lys Leu Val Gly Asp Arg
Trp 35 40 45 Glu
Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Pro Glu Glu Ile Glu 50
55 60 Arg Tyr Trp Leu Met Lys
His Gly Val Val Phe Ala Asn Arg Pro Arg 65 70
75 80 Asp Phe Val Arg Arg 85
111 237 DNA Brassica napus
111 atggattcaa cgtatcgacg tcagcgtcac aactctgaag
aagtgtgtag cgtaaagtgg 60gatttcatca aaatgagcca acaggaggaa gatctcatct
taagaatgta cagactcgta 120ggcgataggt gggaaataat agcaggaaga gtaccgggaa
gaaaagctgt ggagatagag 180agatactgga tcatgagaaa caacacacat ttcttgcctc
catcttccaa attttaa 237
112 78 PRT Brassica napus
112 Met Asp Ser Thr Tyr
Arg Arg Gln Arg His Asn Ser Glu Glu Val Cys 1 5
10 15 Ser Val Lys Trp Asp Phe Ile Lys Met Ser
Gln Gln Glu Glu Asp Leu 20 25
30 Ile Leu Arg Met Tyr Arg Leu Val Gly Asp Arg Trp Glu Ile Ile
Ala 35 40 45 Gly
Arg Val Pro Gly Arg Lys Ala Val Glu Ile Glu Arg Tyr Trp Ile 50
55 60 Met Arg Asn Asn Thr His
Phe Leu Pro Pro Ser Ser Lys Phe 65 70
75
113 228 DNA Coffea canephora
113 atggctgact cggatcaatc
cacaactttg aatgaaaaat ctgtcggctc tcaagaggat 60aaaagtcaag actctgagct
tcatttctct gaagatgagg aaattctcat cattaggatg 120ttcaacttgg ttggtaaaag
gtggtcttta attgctggaa gaatccctgg aagaactgca 180aaggaaattg aggagtattg
gaatacaaga tctgcaacca gtccatga 228
114 75 PRT Coffea canephora
114 Met Ala Asp Ser Asp Gln Ser Thr Thr Leu Asn Glu Lys Ser Val
Gly 1 5 10 15 Ser
Gln Glu Asp Lys Ser Gln Asp Ser Glu Leu His Phe Ser Glu Asp
20 25 30 Glu Glu Ile Leu Ile
Ile Arg Met Phe Asn Leu Val Gly Lys Arg Trp 35
40 45 Ser Leu Ile Ala Gly Arg Ile Pro Gly
Arg Thr Ala Lys Glu Ile Glu 50 55
60 Glu Tyr Trp Asn Thr Arg Ser Ala Thr Ser Pro 65
70 75
115 261 DNA Curcuma longa
115 atggagcgcc
gtcgcaagaa gcagcgcagg tcgtccgacg actccgaaga agaggtgaac 60agtgtggagt
ggcagtccat cagcatgacc gagcaggagg aagacctcat ctgcagaatg 120tatcgcctcg
tcggcgacag gtgggatttg atagcagggc gagttccggg tcgaaaacct 180gaagaaatag
agaggttctg gatcatgaga catcgtcaag attcaagaag gcgttcctct 240ttcgccagtc
atgcaaaata a
261
116 86 PRT Curcuma longa
116 Met Glu Arg Arg Arg Lys Lys Gln Arg Arg Ser
Ser Asp Asp Ser Glu 1 5 10
15 Glu Glu Val Asn Ser Val Glu Trp Gln Ser Ile Ser Met Thr Glu Gln
20 25 30 Glu Glu
Asp Leu Ile Cys Arg Met Tyr Arg Leu Val Gly Asp Arg Trp 35
40 45 Asp Leu Ile Ala Gly Arg Val
Pro Gly Arg Lys Pro Glu Glu Ile Glu 50 55
60 Arg Phe Trp Ile Met Arg His Arg Gln Asp Ser Arg
Arg Arg Ser Ser 65 70 75
80 Phe Ala Ser His Ala Lys 85
117 213 DNA Cyamopsis tetragonoloba
117 atggatcgct cttctgatga tgtttctgca gattcttcag agcaacgcag
tcagggttcc 60aaggttgaat tttccgaaga tgaggaaact cttataatca gaatgtataa
actggttggg 120aagaggtggc ctttgattgc aggaagaatt cccggaagaa cggcagaaga
aatagaaaaa 180ttttggaatt caagattctc caatagcaaa tga
213
118 70 PRT Cyamopsis tetragonoloba
118 Met Asp Arg Ser Ser Asp
Asp Val Ser Ala Asp Ser Ser Glu Gln Arg 1 5
10 15 Ser Gln Gly Ser Lys Val Glu Phe Ser Glu Asp
Glu Glu Thr Leu Ile 20 25
30 Ile Arg Met Tyr Lys Leu Val Gly Lys Arg Trp Pro Leu Ile Ala
Gly 35 40 45 Arg
Ile Pro Gly Arg Thr Ala Glu Glu Ile Glu Lys Phe Trp Asn Ser 50
55 60 Arg Phe Ser Asn Ser Lys
65 70
119 231 DNA Euphorbia esula
119 atggctgatt ctgaacattc
ttctacttct gatgagatct atttggacta tcaagatgaa 60cagagtcatg agtactcaaa
gcaagaattc tctgaagatg aggaagaact tgtaattagg 120atgtacaatt tggttggaga
aaggtggcat ctaattgctg ggaggattcc aggaagaaca 180gcagatgaga ttgagaagta
ttggaattct agatattcaa ctagtgctta g 231
120 76 PRT Euphorbia esula
120 Met Ala Asp Ser Glu His Ser Ser Thr Ser Asp Glu Ile Tyr Leu Asp 1
5 10 15 Tyr Gln Asp Glu
Gln Ser His Glu Tyr Ser Lys Gln Glu Phe Ser Glu 20
25 30 Asp Glu Glu Glu Leu Val Ile Arg Met
Tyr Asn Leu Val Gly Glu Arg 35 40
45 Trp His Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala Asp
Glu Ile 50 55 60
Glu Lys Tyr Trp Asn Ser Arg Tyr Ser Thr Ser Ala 65 70
75
121 267 DNA Gossypium hirsutum
121 atggctgact
ctcaacattc ctcttctggt aaaacttatg tcaactctca agactttagt 60tcagaggagg
aaacaaatga agaatcaaag cttaaattct ctgaagatga ggaaacactt 120ataattagga
tgtttaatct ggttggagag aggtgggctt taattgctgg aagaatcccc 180ggtagaacag
ctgaagaaat tgaagagtat tggaatacca ggtattcaac aaggaaggac 240atttttaaag
aaaggggcca aatttga
267
122 88 PRT Gossypium hirsutum
122 Met Ala Asp Ser Gln His Ser Ser Ser Gly
Lys Thr Tyr Val Asn Ser 1 5 10
15 Gln Asp Phe Ser Ser Glu Glu Glu Thr Asn Glu Glu Ser Lys Leu
Lys 20 25 30 Phe
Ser Glu Asp Glu Glu Thr Leu Ile Ile Arg Met Phe Asn Leu Val 35
40 45 Gly Glu Arg Trp Ala Leu
Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala 50 55
60 Glu Glu Ile Glu Glu Tyr Trp Asn Thr Arg Tyr
Ser Thr Arg Lys Asp 65 70 75
80 Ile Phe Lys Glu Arg Gly Gln Ile 85
123 270 DNA Gossypium hirsutum
123 cttttaatcg aataaaactg caactacctt
cttccatttc ccatctcgac aaaaagacct 60tgtgccatac atcaacaaac ggcatgcact
gcaatttgca acccttacag ggaaggaaac 120agagcaccaa acacagacac agatcataaa
caacattaga cccaaaaaaa aacataacat 180tgttccatcg ctaatataaa agaaatgaat
gccgttattt tcaaacagaa ccgtgtctca 240tcttcagctc ccttcgcctc tttgcaaata
270
124 89 PRT Gossypium hirsutum
124 Met
Asp Lys Arg Asp Arg Lys Gln Ala Lys Thr Gly Ser Cys Cys Ser 1
5 10 15 Glu Glu Val Ser Ser Thr
Glu Trp Glu Phe Ile Asn Met Ser Glu Gln 20
25 30 Glu Glu Asp Leu Ile Tyr Arg Met Tyr Lys
Leu Val Gly Asp Arg Trp 35 40
45 Gly Leu Ile Ala Gly Arg Ile Pro Gly Gln Lys Ala Glu Glu
Ile Glu 50 55 60
Arg Phe Trp Ile Met Arg His Gly Glu Leu Phe Ala Lys Arg Arg Arg 65
70 75 80 Glu Leu Lys Met
Arg His Gly Ser Val 85
125 234 DNA Gossypium hirsutum
125 ataaaaacta gagggaaata aaattggaat
agaagggatg tataggatac gagaaacgca 60tccaacttac atctacattc aacattccta
tgttacaaca ataacatcat aattaatcat 120gctgtctacc atcaaagttg agcaaaattc
ccagaatgtt ttttcactgg cttgttgagt 180atcttgtgtt ccaatatttc tcaatctcct
cggctgttct tccagggatt ctcc 234
126 77 PRT Gossypium hirsutum
126 Met
Ala Glu Ser Glu Tyr Ser Ser Ser Glu Asn Ala Ser Thr Asp Ser 1
5 10 15 Asn Ser Ile Ala Glu Gln
Ser Lys Gln Asp Leu Glu Leu Gln Phe Ser 20
25 30 Glu Asp Glu Glu Thr Leu Val Ile Arg Met
Phe Asn Leu Val Gly Glu 35 40
45 Arg Trp Gly Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala
Glu Glu 50 55 60
Ile Glu Lys Tyr Trp Asn Thr Arg Tyr Ser Thr Ser Gln 65
70 75
127 228 DNA Gossypium hirsutum
127 atggctgaca
tggatggttc ttctgttgat tctaaagagg aatccagtga agattccaag 60cttgacttct
cagaagatga ggaaaccctc attattagaa tgttcaattt ggttggagaa 120aggtggtctt
tgatagcagg gagaatccct ggaagaacag ctgaggagat tcagaagtat 180tgggcttcta
gattctctta caataatcca atgcccaatc tgagttag
228
128 75 PRT Gossypium hirsutum
128 Met Ala Asp Met Asp Gly Ser Ser Val Asp
Ser Lys Glu Glu Ser Ser 1 5 10
15 Glu Asp Ser Lys Leu Asp Phe Ser Glu Asp Glu Glu Thr Leu Ile
Ile 20 25 30 Arg
Met Phe Asn Leu Val Gly Glu Arg Trp Ser Leu Ile Ala Gly Arg 35
40 45 Ile Pro Gly Arg Thr Ala
Glu Glu Ile Gln Lys Tyr Trp Ala Ser Arg 50 55
60 Phe Ser Tyr Asn Asn Pro Met Pro Asn Leu Ser
65 70 75
129 240 DNA Glycine max
129 atgtccacca ccgcaactac aacctctgaa gttagcagca atgagtggaa agtcatacac
60atgagcgagc aagaggagga tctcattcgc aggatgtaca agctagtcgg ggacaagtgg
120aatttgatag ccggtcgcat tcccagtcgt aaagcagaag aaatagagag attctggatt
180atgagacacg gtgatgcttt ctctgttaaa agacacagaa gtaaagccca agactcatga
240
130 79 PRT Glycine max
130 Met Ser Thr Thr Ala Thr Thr Thr Ser Glu Val Ser
Ser Asn Glu Trp 1 5 10
15 Lys Val Ile His Met Ser Glu Gln Glu Glu Asp Leu Ile Arg Arg Met
20 25 30 Tyr Lys Leu
Val Gly Asp Lys Trp Asn Leu Ile Ala Gly Arg Ile Pro 35
40 45 Ser Arg Lys Ala Glu Glu Ile Glu
Arg Phe Trp Ile Met Arg His Gly 50 55
60 Asp Ala Phe Ser Val Lys Arg His Arg Ser Lys Ala Gln
Asp Ser 65 70 75
131 228 DNA Glycine max
131 atggctgact cggatctctc ttcaagtcaa atttctacac
attctactga ttcaggaaat 60cgagggtctt ccaaagttga attttctgaa gatgaggaaa
ccctcatcat caggatgtat 120aaactggtag gggagaggtg gtctataatt gctggaagga
ttcctggaag aacagcagag 180gaaatagaga agtattggac ttcaagattc tcgggctcta
gtgaatga 228
132 75 PRT Glycine max
132 Met Ala Asp Ser Asp
Leu Ser Ser Ser Gln Ile Ser Thr His Ser Thr 1 5
10 15 Asp Ser Gly Asn Arg Gly Ser Ser Lys Val
Glu Phe Ser Glu Asp Glu 20 25
30 Glu Thr Leu Ile Ile Arg Met Tyr Lys Leu Val Gly Glu Arg Trp
Ser 35 40 45 Ile
Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala Glu Glu Ile Glu Lys 50
55 60 Tyr Trp Thr Ser Arg Phe
Ser Gly Ser Ser Glu 65 70 75
133 225 DNA Glycine max
133 atggctgaca tagatcgctc ctttgataat aatgtttctg
ctgtttctac tgagaaatca 60agccaagttt cagatgttga attttctgaa gctgaggaaa
tccttattgc catggtgtat 120aatctggttg gggagaggtg gtctttgatt gctggaagaa
ttcctggaag aactgcagaa 180gagatagaga aatattggac ttcaagattt tcgactagcc
aatga 225
134 74 PRT Glycine max
134 Met Ala Asp Ile Asp
Arg Ser Phe Asp Asn Asn Val Ser Ala Val Ser 1 5
10 15 Thr Glu Lys Ser Ser Gln Val Ser Asp Val
Glu Phe Ser Glu Ala Glu 20 25
30 Glu Ile Leu Ile Ala Met Val Tyr Asn Leu Val Gly Glu Arg Trp
Ser 35 40 45 Leu
Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala Glu Glu Ile Glu Lys 50
55 60 Tyr Trp Thr Ser Arg Phe
Ser Thr Ser Gln 65 70
135 222 DNA Glycine soja
135 atgactgaca tagatcgctc ctctgataat gtttcttctg attctattga gaaatcaagc
60caagtttctg atgttgaatt ttctgaagct gaggaaatcc ttattgccat ggtgtataat
120ctggttggag aaaggtggtc tttgattgct ggaagaattc ctggaagaac tgcagaagaa
180atagagaaat attggacttc aagattttcg actagccaat ga
222
136 73 PRT Glycine soja
136 Met Thr Asp Ile Asp Arg Ser Ser Asp Asn Val
Ser Ser Asp Ser Ile 1 5 10
15 Glu Lys Ser Ser Gln Val Ser Asp Val Glu Phe Ser Glu Ala Glu Glu
20 25 30 Ile Leu
Ile Ala Met Val Tyr Asn Leu Val Gly Glu Arg Trp Ser Leu 35
40 45 Ile Ala Gly Arg Ile Pro Gly
Arg Thr Ala Glu Glu Ile Glu Lys Tyr 50 55
60 Trp Thr Ser Arg Phe Ser Thr Ser Gln 65
70
137 231 DNA Hordeum vulgare
137 tccaggatct tcttgctcgt
gctgctctta ctgatgagca cattgttgag actcctatgg 60agctcaatgt tcacctttct
gctactgatg gtgatcccct gcctgacccg acgcgttatc 120gtcatcttgt tggcagtttt
gtttatctcg ctgtcactcg tctggatatc tcttatccgg 180ttcatattct gagtcagttc
gtctctgctc ccacatcggt tcactatagt c 231
138 76 PRT Hordeum vulgare
138 Met Ser Ser Lys Ser Leu Gly Lys Asn Ser Lys Ile Met Ser Gly Arg 1
5 10 15 Glu Arg Lys Glu
Val Asn Ser Asn Ala Lys His Phe Val Asp Phe Thr 20
25 30 Glu Ala Glu Glu Asp Leu Val Phe Arg
Met His Arg Leu Val Gly Asn 35 40
45 Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala
Glu Glu 50 55 60
Val Glu Met Phe Trp Ala Lys Lys His Gln Asp Gln 65 70
75
139 246 DNA Ipomoea nil
139 atggcagatt tggataattc
cagtacctgt ggagaagcag aagcttgtgt ggaaattaca 60gaagtggagg tcactagcca
agattcaaat aagctagttt tctcggtgga tgaggaagct 120ctggtagtca gaatgtataa
cttggtggga gagaggtggt cactaattgc tgggagaatc 180ccaggaagaa gtgcggagga
aattgagaag tactggaact caacacactc aactagtcat 240caatga
246
140 81 PRT Ipomoea nil
140 Met Ala Asp Leu Asp Asn Ser Ser Thr Cys Gly Glu Ala Glu Ala Cys 1
5 10 15 Val Glu Ile Thr
Glu Val Glu Val Thr Ser Gln Asp Ser Asn Lys Leu 20
25 30 Val Phe Ser Val Asp Glu Glu Ala Leu
Val Val Arg Met Tyr Asn Leu 35 40
45 Val Gly Glu Arg Trp Ser Leu Ile Ala Gly Arg Ile Pro Gly
Arg Ser 50 55 60
Ala Glu Glu Ile Glu Lys Tyr Trp Asn Ser Thr His Ser Thr Ser His 65
70 75 80 Gln
141 267 DNA Juglans hindsii x regia
141 atggataaac gtccgcggaa acaagcaaag
agtacaaaga gctccacatc agaagaagtg 60agcagtattg agtgggagtt cataaagatg
actgaacaag aagaggatct catcttccgg 120atgtacaaac ttgttgggga caggtgggat
ttgatagcag gtcgtgttcc agggcgaaaa 180ccagaagaaa tagagaggtt ttggattatg
agacacggtg aggtatttgc ccagaaaaga 240gacgcagctg ccaagaatta ttcgtga
267
142 88 PRT Juglans hindsii x regia
142 Met Asp Lys Arg Pro Arg Lys Gln Ala Lys Ser Thr Lys Ser Ser Thr 1
5 10 15 Ser Glu Glu Val
Ser Ser Ile Glu Trp Glu Phe Ile Lys Met Thr Glu 20
25 30 Gln Glu Glu Asp Leu Ile Phe Arg Met
Tyr Lys Leu Val Gly Asp Arg 35 40
45 Trp Asp Leu Ile Ala Gly Arg Val Pro Gly Arg Lys Pro Glu
Glu Ile 50 55 60
Glu Arg Phe Trp Ile Met Arg His Gly Glu Val Phe Ala Gln Lys Arg 65
70 75 80 Asp Ala Ala Ala Lys
Asn Tyr Ser 85
143 261 DNA Juglans hindsii x regia
143 atggctgact ctcaaaactc ttcttccaat ggggttagga atgaaacccc
ttccaatgat 60actttcgccg actctagatc agaggaaaca agtgaagcat ccaagctaga
gttctccgaa 120gacgaagaaa tgcttatcat aaggatgttc aatctggttg gagaaaggtg
gtctctgatt 180gccggaagga tcccaggaag aacagctgag gaaattgaga agtactggac
ttctagatat 240tcaacaagcg aaatgaaata a
261
144 86 PRT Juglans hindsii x regia
144 Met Ala Asp Ser Gln Asn
Ser Ser Ser Asn Gly Val Arg Asn Glu Thr 1 5
10 15 Pro Ser Asn Asp Thr Phe Ala Asp Ser Arg Ser
Glu Glu Thr Ser Glu 20 25
30 Ala Ser Lys Leu Glu Phe Ser Glu Asp Glu Glu Met Leu Ile Ile
Arg 35 40 45 Met
Phe Asn Leu Val Gly Glu Arg Trp Ser Leu Ile Ala Gly Arg Ile 50
55 60 Pro Gly Arg Thr Ala Glu
Glu Ile Glu Lys Tyr Trp Thr Ser Arg Tyr 65 70
75 80 Ser Thr Ser Glu Met Lys 85
145 222 DNA Lotus japonicus
145 atggctgaca gagaacactc ctctgataat
gtttctgcag attctacaga gaaatccagt 60caagcttcaa atgtggaatt ctctgaagat
gaggaaatcc ttattaccat ggtttataat 120ctggttgggg aaaggtggtc tttgattgct
ggaagaattc ctggaagaac agcagaagaa 180atagagaagt actggacttc aagatactcc
actagtgaat ga 222
146 73 PRT Lotus japonicus
146 Met Ala
Asp Arg Glu His Ser Ser Asp Asn Val Ser Ala Asp Ser Thr 1 5
10 15 Glu Lys Ser Ser Gln Ala Ser
Asn Val Glu Phe Ser Glu Asp Glu Glu 20 25
30 Ile Leu Ile Thr Met Val Tyr Asn Leu Val Gly Glu
Arg Trp Ser Leu 35 40 45
Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala Glu Glu Ile Glu Lys Tyr
50 55 60 Trp Thr Ser
Arg Tyr Ser Thr Ser Glu 65 70
147 252 DNA Lolium multiflorum
147 atgttgtctt ctcggaagga tatgagcagt
aaaagcttgg cgaagaactc caagacgatg 60ggtgtccatg aagcgaaaga agttactagc
accacacagc atttcgttga tttcacagaa 120gcagaggaag atctcgtatt cagaatgcac
aggcttgtcg ggaacaggtg ggaacttata 180gctggaagga tccccggaag aacagcagga
gaagtagaga tgttttgggc gaaaaagcaa 240aaggaacaat ga
252
148 83 PRT Lolium multiflorum
148 Met
Leu Ser Ser Arg Lys Asp Met Ser Ser Lys Ser Leu Ala Lys Asn 1
5 10 15 Ser Lys Thr Met Gly Val
His Glu Ala Lys Glu Val Thr Ser Thr Thr 20
25 30 Gln His Phe Val Asp Phe Thr Glu Ala Glu
Glu Asp Leu Val Phe Arg 35 40
45 Met His Arg Leu Val Gly Asn Arg Trp Glu Leu Ile Ala Gly
Arg Ile 50 55 60
Pro Gly Arg Thr Ala Gly Glu Val Glu Met Phe Trp Ala Lys Lys Gln 65
70 75 80 Lys Glu Gln
149 237 DNA Lactuca saligna
149 atggctaact tggataagta ctctacttcc aatgatactt
ccactcatac tagagggcca 60tcaaatcaag aatctcgggt tcatttctct gaagacgaaa
aaactctcat cactaggatg 120tataagcttg tcggagaaag atggtctttg attgctggaa
ggattcctgg aagatctgca 180gaggaaattg agaagtactg gacttccaaa tattcaagaa
ctaatgatca gatgtag 237
150 78 PRT Lactuca saligna
150 Met Ala Asn Leu
Asp Lys Tyr Ser Thr Ser Asn Asp Thr Ser Thr His 1 5
10 15 Thr Arg Gly Pro Ser Asn Gln Glu Ser
Arg Val His Phe Ser Glu Asp 20 25
30 Glu Lys Thr Leu Ile Thr Arg Met Tyr Lys Leu Val Gly Glu
Arg Trp 35 40 45
Ser Leu Ile Ala Gly Arg Ile Pro Gly Arg Ser Ala Glu Glu Ile Glu 50
55 60 Lys Tyr Trp Thr Ser
Lys Tyr Ser Arg Thr Asn Asp Gln Met 65 70
75
151 228 DNA Lactuca serriola
151 atgaagcaca attctgagtt
tgaagaggtg agcagtagaa aatgggagtt cattaatatg 60agcgaacaag aagaagatat
catttataga atgcacaaac ttgctggtaa caggtgggat 120ttaatagctg gtaggatttc
gggacggaat ccggaagaaa tagagagatt ttggttaatg 180agacatagtg aagcgtatga
ggatttaagg aaaagagtca aatcttga 228
152 75 PRT Lactuca serriola
152 Met Lys His Asn Ser Glu Phe Glu Glu Val Ser Ser Arg Lys Trp
Glu 1 5 10 15 Phe
Ile Asn Met Ser Glu Gln Glu Glu Asp Ile Ile Tyr Arg Met His
20 25 30 Lys Leu Ala Gly Asn
Arg Trp Asp Leu Ile Ala Gly Arg Ile Ser Gly 35
40 45 Arg Asn Pro Glu Glu Ile Glu Arg Phe
Trp Leu Met Arg His Ser Glu 50 55
60 Ala Tyr Glu Asp Leu Arg Lys Arg Val Lys Ser 65
70 75
153 219 DNA Liriodendron tulipifera
153 atggctgact tagatcattc atctgaggat gtttctgatg attctcaagg aaccagtcaa
60gattctaagc tggagtttac ggaggatgag cagactctca ttgaaaggat gtttaatctt
120cttggagaga ggtggtctct gattgctggg agaatcccag gaagaacagc agaagagatt
180gagaagtact ggacttcaag gtactcttca agtgaatga
219
154 72 PRT Liriodendron tulipifera
154 Met Ala Asp Leu Asp His Ser Ser Glu
Asp Val Ser Asp Asp Ser Gln 1 5 10
15 Gly Thr Ser Gln Asp Ser Lys Leu Glu Phe Thr Glu Asp Glu
Gln Thr 20 25 30
Leu Ile Glu Arg Met Phe Asn Leu Leu Gly Glu Arg Trp Ser Leu Ile
35 40 45 Ala Gly Arg Ile
Pro Gly Arg Thr Ala Glu Glu Ile Glu Lys Tyr Trp 50
55 60 Thr Ser Arg Tyr Ser Ser Ser Glu
65 70
155 225 DNA Malus domestica
155 atggcggact
ccgagcactc ttcttccgat gacactttct cggactctcg agagaagagc 60acagaaaaat
ctgagcttca attctccgag gatgaggaag cacttatcat aagaatgtac 120aatctagtag
gggagaggtg ggctttgatt gccgggagga ttccggggag aacagcagaa 180gaaattgaga
agtattggac ttctacacac tcaactagtc aataa
225
156 74 PRT Malus domestica
156 Met Ala Asp Ser Glu His Ser Ser Ser Asp Asp
Thr Phe Ser Asp Ser 1 5 10
15 Arg Glu Lys Ser Thr Glu Lys Ser Glu Leu Gln Phe Ser Glu Asp Glu
20 25 30 Glu Ala
Leu Ile Ile Arg Met Tyr Asn Leu Val Gly Glu Arg Trp Ala 35
40 45 Leu Ile Ala Gly Arg Ile Pro
Gly Arg Thr Ala Glu Glu Ile Glu Lys 50 55
60 Tyr Trp Thr Ser Thr His Ser Thr Ser Gln 65
70
157 264 DNA Manihot esculenta
157 atggatagac
gccgcaagaa gcaatccaag gcagcaactc cgcgctctga agaggtaagc 60agtattgaat
gggagttcat aaacatgtcc gaacaagaag aagatcttat ttatagaatg 120tataagcttg
tcggagacag gtgggctttg attgctggtc ggattccagg tcggaaagct 180gaagaaatag
aaaggttttg gataatgagg catggtgaag ggtttgccgg tcgacgaaaa 240gagctcaaga
agtccaaatg ttaa
264
158 87 PRT Manihot esculenta
158 Met Asp Arg Arg Arg Lys Lys Gln Ser Lys
Ala Ala Thr Pro Arg Ser 1 5 10
15 Glu Glu Val Ser Ser Ile Glu Trp Glu Phe Ile Asn Met Ser Glu
Gln 20 25 30 Glu
Glu Asp Leu Ile Tyr Arg Met Tyr Lys Leu Val Gly Asp Arg Trp 35
40 45 Ala Leu Ile Ala Gly Arg
Ile Pro Gly Arg Lys Ala Glu Glu Ile Glu 50 55
60 Arg Phe Trp Ile Met Arg His Gly Glu Gly Phe
Ala Gly Arg Arg Lys 65 70 75
80 Glu Leu Lys Lys Ser Lys Cys 85
159 225 DNA Manihot esculenta
159 atggctgact tggatcattc ttctagtgat gatgtttctg
ttgattctag agaggaatca 60agccaagaat ctaagcttga attcacagag gatgaagaaa
cccttattac taggatgtat 120aatcttgttg gagagaggtg gcctctaatt gctgggagaa
ttccaggaag aacagcggag 180gaaattgaga agtactggaa ttccagattc tcttcaagtc
aataa 225
160 74 PRT Manihot esculenta
160 Met Ala Asp Leu
Asp His Ser Ser Ser Asp Asp Val Ser Val Asp Ser 1 5
10 15 Arg Glu Glu Ser Ser Gln Glu Ser Lys
Leu Glu Phe Thr Glu Asp Glu 20 25
30 Glu Thr Leu Ile Thr Arg Met Tyr Asn Leu Val Gly Glu Arg
Trp Pro 35 40 45
Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala Glu Glu Ile Glu Lys 50
55 60 Tyr Trp Asn Ser Arg
Phe Ser Ser Ser Gln 65 70
161 276 DNA Medicago truncatula
161 atggaagaaa aacgacgttc ccattcccaa
aataaggcaa atatctcccc aaacacaagt 60caaacctctg aagctggtgg agaagtgagc
agcactgagt gggagttcat agagatgagc 120gagcaagagg aggatctcat tcgcaggatg
tacgacctag ttggagatag gtggaatttg 180atagcaggtc gcattccagg tcgtaaagca
gaagaaatag agagattctg gattatgaga 240cacactgatg ctttttctgc caaaagaaag
aagtga 276
162 91 PRT Medicago truncatula
162 Met
Glu Glu Lys Arg Arg Ser His Ser Gln Asn Lys Ala Asn Ile Ser 1
5 10 15 Pro Asn Thr Ser Gln Thr
Ser Glu Ala Gly Gly Glu Val Ser Ser Thr 20
25 30 Glu Trp Glu Phe Ile Glu Met Ser Glu Gln
Glu Glu Asp Leu Ile Arg 35 40
45 Arg Met Tyr Asp Leu Val Gly Asp Arg Trp Asn Leu Ile Ala
Gly Arg 50 55 60
Ile Pro Gly Arg Lys Ala Glu Glu Ile Glu Arg Phe Trp Ile Met Arg 65
70 75 80 His Thr Asp Ala Phe
Ser Ala Lys Arg Lys Lys 85 90
163 237 DNA Oryza minuta
163 atggatagca gcagtggtag ccagggaaag aattccaaaa
ccagtgatgg ttgtgaaaca 60aaagaagtta atagcactgc actgaatttt attcatttca
cggaagaaga ggaagatctc 120gttttcagaa tgcacaggct tgttgggaac aggtgggaac
ttatagctgg aagaatccct 180ggaaggacag caaaagaagt agaaatgttc tgggcaataa
agcaccagga cacataa 237
164 78 PRT Oryza minuta
164 Met Asp Ser Ser Ser
Gly Ser Gln Gly Lys Asn Ser Lys Thr Ser Asp 1 5
10 15 Gly Cys Glu Thr Lys Glu Val Asn Ser Thr
Ala Leu Asn Phe Ile His 20 25
30 Phe Thr Glu Glu Glu Glu Asp Leu Val Phe Arg Met His Arg Leu
Val 35 40 45 Gly
Asn Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala 50
55 60 Lys Glu Val Glu Met Phe
Trp Ala Ile Lys His Gln Asp Thr 65 70
75
165 237 DNA Oryza sativa
165 atggatagca gcagtggtag ccagggaaag
aattccaaaa ccagtgatgg ttgtgaaaca 60aaagaagtta ataacactgc acagaatttt
gttcatttca cggaagaaga ggaagatctc 120gttttcagaa tgcacaggct tgttgggaac
aggtgggaac ttatagctgg aagaatccct 180ggaagaacag caaaagaagt agaaatgttc
tgggcagtaa agcaccagaa tacataa 23716678PRTOryza sativa 166Met Asp
Ser Ser Ser Gly Ser Gln Gly Lys Asn Ser Lys Thr Ser Asp 1 5
10 15 Gly Cys Glu Thr Lys Glu Val
Asn Asn Thr Ala Gln Asn Phe Val His 20 25
30 Phe Thr Glu Glu Glu Glu Asp Leu Val Phe Arg Met
His Arg Leu Val 35 40 45
Gly Asn Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala
50 55 60 Lys Glu Val
Glu Met Phe Trp Ala Val Lys His Gln Asn Thr 65 70
75 167162DNAPhalaenopsis equestris 167atgtccaaac
ctaacttcac agaggaagaa gacgacctca ttgccagaat gtataagctc 60gttggagaca
gatggtctct gattgctgga aggatcccag gaagaacaag cgaggagatt 120gagaattact
ggaagtcaaa aaattctacc tcgtctacat aa
16216853PRTPhalaenopsis equestris 168Met Ser Lys Pro Asn Phe Thr Glu Glu
Glu Asp Asp Leu Ile Ala Arg 1 5 10
15 Met Tyr Lys Leu Val Gly Asp Arg Trp Ser Leu Ile Ala Gly
Arg Ile 20 25 30
Pro Gly Arg Thr Ser Glu Glu Ile Glu Asn Tyr Trp Lys Ser Lys Asn
35 40 45 Ser Thr Ser Ser
Thr 50 169324DNAPicea glauca 169cattatatgg gtaacttcag
tactttaaat cttgtccata cattgtattt ctctctgttg 60ccaggactga aatattttga
acgtgacaag cacatgataa aggtcagccg ttccattaat 120atgattggtt tcaattaagg
atatctgaat tgatctacat aacttgggac ctacattcaa 180aattgatccg ctacgaaatt
tcctatacca tcagaccaaa tacaaagagc aaaacctcta 240tactttttct ctgattagag
aaaacactgc aattaaggaa acctcgacta actttctctg 300attaaagaaa atacaatgaa
gaaa 324170107PRTPicea glauca
170Met Glu Lys Asn Val Tyr Cys Ser Ser Ala Ile Leu Glu Tyr Asp Thr 1
5 10 15 Glu Glu Gly Ser
Ser Leu Asp Trp Glu Cys Asp Met Ser Glu Glu Glu 20
25 30 Glu Asp Leu Ile Leu Arg Met Tyr Lys
Leu Val Gly Asn Lys Trp Ser 35 40
45 Leu Ile Ala Gly Arg Ile Pro Gly Arg Lys Ala Glu Glu Ile
Glu Arg 50 55 60
Tyr Trp Ala Met Arg Thr Gln Gln Leu Cys Gly Gly His Gly Ala Ile 65
70 75 80 Phe Thr Asn Lys Lys
Gln Thr Ala Asn Met Ile Ser Ile Gln Tyr Arg 85
90 95 Ile Asn Gly Cys Asn Asp Val Glu Val Asn
Ser 100 105 171258DNAPetunia hybrida
171atggcagata aaggacaaag ttcttcatct gtaaatactc cggctgattc tcaagatggg
60gtggctcctc ggatgttagt ttcaggaaag acatcaaaag tagctgaaat aaaattctct
120gaagaagaag aagacttgat cattaggatg tataatttgg ttggcgagag atggtctctt
180atagctggaa gaatcccagg aagaagtgca gaagagattg agaaatattg gaatactcga
240tcttcaacca gccaataa
25817285PRTPetunia hybrida 172Met Ala Asp Lys Gly Gln Ser Ser Ser Ser Val
Asn Thr Pro Ala Asp 1 5 10
15 Ser Gln Asp Gly Val Ala Pro Arg Met Leu Val Ser Gly Lys Thr Ser
20 25 30 Lys Val
Ala Glu Ile Lys Phe Ser Glu Glu Glu Glu Asp Leu Ile Ile 35
40 45 Arg Met Tyr Asn Leu Val Gly
Glu Arg Trp Ser Leu Ile Ala Gly Arg 50 55
60 Ile Pro Gly Arg Ser Ala Glu Glu Ile Glu Lys Tyr
Trp Asn Thr Arg 65 70 75
80 Ser Ser Thr Ser Gln 85 173327DNAPseudotsuga
menziesii 173atggagaaga atgtgtactg tagcacttct attctggagt atgacaccga
ggaagggagt 60agcttagatt gggaatgcga catgtccgag gaagaagaag atcttattct
cagaatgtac 120aaacttatcg ggaacaagtg gtcgctgatt gccggacgta ttcctggaag
aaaagcagag 180gagattgaga ggtactgggc gatgagaacc caacaatttt gcggcagcca
tggcgccacc 240attttcgcca gcaataagca gatgggcaat atgatctcga ttccatacca
cattaatgga 300tgcaatgacg ttgaagtaca ttcgtag
327174108PRTPseudotsuga menziesii 174Met Glu Lys Asn Val Tyr
Cys Ser Thr Ser Ile Leu Glu Tyr Asp Thr 1 5
10 15 Glu Glu Gly Ser Ser Leu Asp Trp Glu Cys Asp
Met Ser Glu Glu Glu 20 25
30 Glu Asp Leu Ile Leu Arg Met Tyr Lys Leu Ile Gly Asn Lys Trp
Ser 35 40 45 Leu
Ile Ala Gly Arg Ile Pro Gly Arg Lys Ala Glu Glu Ile Glu Arg 50
55 60 Tyr Trp Ala Met Arg Thr
Gln Gln Phe Cys Gly Ser His Gly Ala Thr 65 70
75 80 Ile Phe Ala Ser Asn Lys Gln Met Gly Asn Met
Ile Ser Ile Pro Tyr 85 90
95 His Ile Asn Gly Cys Asn Asp Val Glu Val His Ser 100
105 175225DNAPrunus persica 175atggctgact
tggatcactc cacctctgat gacaattctg tggattctag agaggaaagt 60agtcaagact
ctaagcttca cttctcagaa gatgaggaaa ctctaatcac taggatgttt 120aacctggttg
gtgagaggtg gtctctgatt gctggtagaa ttcctggaag atcagcagag 180gagattgaaa
agtactggac ttcaagatac tcaacaagtg aatga
22517674PRTPrunus persica 176Met Ala Asp Leu Asp His Ser Thr Ser Asp Asp
Asn Ser Val Asp Ser 1 5 10
15 Arg Glu Glu Ser Ser Gln Asp Ser Lys Leu His Phe Ser Glu Asp Glu
20 25 30 Glu Thr
Leu Ile Thr Arg Met Phe Asn Leu Val Gly Glu Arg Trp Ser 35
40 45 Leu Ile Ala Gly Arg Ile Pro
Gly Arg Ser Ala Glu Glu Ile Glu Lys 50 55
60 Tyr Trp Thr Ser Arg Tyr Ser Thr Ser Glu 65
70 177252DNAPinus pinaster 177atggatcgtg
cagacacaga tgaagaccga ctctcctctt cacataaaga agtagaagaa 60gcagggaaag
agagaacaag cagtacaaat ataaacgagg acgaagaaga tctcatcatt 120aggctgcaca
aattggtggg agataggtgg tcgctgattg ctggcagaat acctggacga 180accccagagg
agattgagaa gtactggaag tcgagaaagc aggaaaattc caaacgcaaa 240agaggcaaat
ga
25217883PRTPinus pinaster 178Met Asp Arg Ala Asp Thr Asp Glu Asp Arg Leu
Ser Ser Ser His Lys 1 5 10
15 Glu Val Glu Glu Ala Gly Lys Glu Arg Thr Ser Ser Thr Asn Ile Asn
20 25 30 Glu Asp
Glu Glu Asp Leu Ile Ile Arg Leu His Lys Leu Val Gly Asp 35
40 45 Arg Trp Ser Leu Ile Ala Gly
Arg Ile Pro Gly Arg Thr Pro Glu Glu 50 55
60 Ile Glu Lys Tyr Trp Lys Ser Arg Lys Gln Glu Asn
Ser Lys Arg Lys 65 70 75
80 Arg Gly Lys 179348DNAPinus pinaster 179atggctcgtt cctcctccct
cagtatggag aagaatatgt actgtagttc tactcttctg 60gagtatgata ctgaggaagg
gagtagttta gattgggaat gcgacatgtc cgaggaagaa 120gaagatctta tactcagaat
gtacaaactt atcggcaaca agtggtcgct gattgccggg 180cgcattcctg gaagaaaagc
agaggagatt gagaggtact gggccatgag aacccaacaa 240ttgtgtggcg gccatgatgc
tattttgacg aagaaacagc agaaaaccaa tatgatatcg 300attcagtacc gcattaatgg
acccaatgat gttgaagtaa attcgtag 348180115PRTPinus pinaster
180Met Ala Arg Ser Ser Ser Leu Ser Met Glu Lys Asn Met Tyr Cys Ser 1
5 10 15 Ser Thr Leu Leu
Glu Tyr Asp Thr Glu Glu Gly Ser Ser Leu Asp Trp 20
25 30 Glu Cys Asp Met Ser Glu Glu Glu Glu
Asp Leu Ile Leu Arg Met Tyr 35 40
45 Lys Leu Ile Gly Asn Lys Trp Ser Leu Ile Ala Gly Arg Ile
Pro Gly 50 55 60
Arg Lys Ala Glu Glu Ile Glu Arg Tyr Trp Ala Met Arg Thr Gln Gln 65
70 75 80 Leu Cys Gly Gly His
Asp Ala Ile Leu Thr Lys Lys Gln Gln Lys Thr 85
90 95 Asn Met Ile Ser Ile Gln Tyr Arg Ile Asn
Gly Pro Asn Asp Val Glu 100 105
110 Val Asn Ser 115 181246DNAPicea sitchensis
181atggacgtgc aagaccttca acagcagtcc tctgaaggag agagtgactc tcagggtgga
60aggagccggc aagggttatg tgattctgat atctctgctg acgaagaaga tttgattatc
120agactccaca agcttcttgg tgacaggtgg gcgttgattg ccgggcgcct cccatggcga
180acgactgagg aaattgagaa atactggaaa atgagaagtc aggagatcga tcagagcagc
240gattaa
24618281PRTPicea sitchensis 182Met Asp Val Gln Asp Leu Gln Gln Gln Ser
Ser Glu Gly Glu Ser Asp 1 5 10
15 Ser Gln Gly Gly Arg Ser Arg Gln Gly Leu Cys Asp Ser Asp Ile
Ser 20 25 30 Ala
Asp Glu Glu Asp Leu Ile Ile Arg Leu His Lys Leu Leu Gly Asp 35
40 45 Arg Trp Ala Leu Ile Ala
Gly Arg Leu Pro Trp Arg Thr Thr Glu Glu 50 55
60 Ile Glu Lys Tyr Trp Lys Met Arg Ser Gln Glu
Ile Asp Gln Ser Ser 65 70 75
80 Asp 183324DNAPicea sitchensis 183ttattggttt caattaagga
tatctgaatt gagctacata acttgggacc tacattcaaa 60attgatccgc tacgaaattc
cctataccat cagaccaaat acaaagagca aaacctctat 120actttttctc tgattagaga
aaacactgca attaaggaaa cctcgacgaa ctttctctga 180ttaaagaaaa tacaatgaag
aaaatctccc tgaatttgct ctaattggag aaaatacaca 240gagcaaaatc tctacgaatt
tacttcgacg tcattgcatc cattaatgcg gtattgaatc 300gagatcatat tggccgtctg
cttc 324184107PRTPicea
sitchensis 184Met Glu Lys Asn Val Tyr Cys Ser Ser Ala Ile Leu Glu Tyr Asp
Thr 1 5 10 15 Glu
Glu Gly Ser Ser Leu Asp Trp Glu Cys Asp Met Ser Glu Glu Glu
20 25 30 Glu Asp Leu Ile Leu
Arg Met Tyr Lys Leu Val Gly Asn Lys Trp Ser 35
40 45 Leu Ile Ala Gly Arg Ile Pro Gly Arg
Lys Ala Glu Glu Ile Glu Arg 50 55
60 Tyr Trp Ala Met Arg Thr Gln Gln Leu Cys Gly Gly His
Gly Ala Ile 65 70 75
80 Phe Thr Asn Lys Lys Gln Thr Ala Asn Met Ile Ser Ile Gln Tyr Arg
85 90 95 Ile Asn Gly Cys
Asn Asp Val Glu Val Asn Ser 100 105
185255DNAPinus taeda 185atggatcgtg cagacacaga tgaagaccga ctctcgtctt
cacataaaga agtagaagaa 60gcaggggaag agaggagaac aaggagtaca aatatgaacg
aggacgaaga agatctcatc 120attaggctgc acaaattgtt gggagagagg tggtcgctga
ttgctggcag aatacctgga 180cgaaccccag aggagattga gaagtactgg aagtcgagaa
agcaggaaaa ttccaaacgc 240aaaagaggca aatga
25518684PRTPinus taeda 186Met Asp Arg Ala Asp Thr
Asp Glu Asp Arg Leu Ser Ser Ser His Lys 1 5
10 15 Glu Val Glu Glu Ala Gly Glu Glu Arg Arg Thr
Arg Ser Thr Asn Met 20 25
30 Asn Glu Asp Glu Glu Asp Leu Ile Ile Arg Leu His Lys Leu Leu
Gly 35 40 45 Glu
Arg Trp Ser Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Pro Glu 50
55 60 Glu Ile Glu Lys Tyr Trp
Lys Ser Arg Lys Gln Glu Asn Ser Lys Arg 65 70
75 80 Lys Arg Gly Lys 187237DNAPopulus tremula
187atggctgact ctgaacattc ttcttctgat gaaacttttg cgtattcgag agaggaaaca
60agtcaggaaa caagtcagga atcaaggctt gaattctctg aggatgagga gacacttata
120attaggatgt ttaatctagt tggagagagg tggtctctga ttgctggaag gattcctgga
180agaacagctg aggaaataga gaagtactgg aacactagat actctacaag tgaatga
23718878PRTPopulus tremula 188Met Ala Asp Ser Glu His Ser Ser Ser Asp Glu
Thr Phe Ala Tyr Ser 1 5 10
15 Arg Glu Glu Thr Ser Gln Glu Thr Ser Gln Glu Ser Arg Leu Glu Phe
20 25 30 Ser Glu
Asp Glu Glu Thr Leu Ile Ile Arg Met Phe Asn Leu Val Gly 35
40 45 Glu Arg Trp Ser Leu Ile Ala
Gly Arg Ile Pro Gly Arg Thr Ala Glu 50 55
60 Glu Ile Glu Lys Tyr Trp Asn Thr Arg Tyr Ser Thr
Ser Glu 65 70 75
189288DNAPopulus tremula 189atggaaagta tggaccgccg ccggcgccgg aaacaaccta
aaattaacag ttctgagtct 60gaagaggtca gtagtattga atgggagttt ataaacatga
gcgagcaaga ggaagacctc 120atttacagaa tgcataaact tgttggtgaa aggtgggatt
tgatagctgg aaggattcct 180ggccgaaaag cagaagaaat agagaggttt tggataatga
aacaccgcga agggtttgct 240ggaaacggaa aattgtataa cgaagtgaag tctaggactt
ctagttga 28819095PRTPopulus tremula 190Met Glu Ser Met
Asp Arg Arg Arg Arg Arg Lys Gln Pro Lys Ile Asn 1 5
10 15 Ser Ser Glu Ser Glu Glu Val Ser Ser
Ile Glu Trp Glu Phe Ile Asn 20 25
30 Met Ser Glu Gln Glu Glu Asp Leu Ile Tyr Arg Met His Lys
Leu Val 35 40 45
Gly Glu Arg Trp Asp Leu Ile Ala Gly Arg Ile Pro Gly Arg Lys Ala 50
55 60 Glu Glu Ile Glu Arg
Phe Trp Ile Met Lys His Arg Glu Gly Phe Ala 65 70
75 80 Gly Asn Gly Lys Leu Tyr Asn Glu Val Lys
Ser Arg Thr Ser Ser 85 90
95 191225DNAPopulus tremula 191atggctgact tggatcactc ctctagtgat
gacaactctg ttgattctag agaggaaacc 60agccaagatt ccaagcttga attctcagaa
gatgaggaaa ctcttatcac caggatgtac 120aatctggctg gtgagaggtg gccattaatt
gctgggagga ttccaggaag aacagcagaa 180gaaattgaga agtactggac ttcaagatac
tctacgagtc agtaa 22519274PRTPopulus tremula 192Met Ala
Asp Leu Asp His Ser Ser Ser Asp Asp Asn Ser Val Asp Ser 1 5
10 15 Arg Glu Glu Thr Ser Gln Asp
Ser Lys Leu Glu Phe Ser Glu Asp Glu 20 25
30 Glu Thr Leu Ile Thr Arg Met Tyr Asn Leu Ala Gly
Glu Arg Trp Pro 35 40 45
Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala Glu Glu Ile Glu Lys
50 55 60 Tyr Trp Thr
Ser Arg Tyr Ser Thr Ser Gln 65 70
193234DNAPopulus tremula 193atggctggct cgggtcacac ctcaaataac acaaatcaag
ataccaaggc tgcaaagagt 60aatcaagact ccaacctgca ggatttctct gaagatgaag
agaatctcat tgctagaatg 120tttggcttgg ttgggaagag atggtcacta attgctggga
gaataccagg aagaacagca 180gaggagattg agaagtattg gacttcaaag cagaggtcat
caaaggaaag atga 23419477PRTPopulus tremula 194Met Ala Gly Ser
Gly His Thr Ser Asn Asn Thr Asn Gln Asp Thr Lys 1 5
10 15 Ala Ala Lys Ser Asn Gln Asp Ser Asn
Leu Gln Asp Phe Ser Glu Asp 20 25
30 Glu Glu Asn Leu Ile Ala Arg Met Phe Gly Leu Val Gly Lys
Arg Trp 35 40 45
Ser Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala Glu Glu Ile Glu 50
55 60 Lys Tyr Trp Thr Ser
Lys Gln Arg Ser Ser Lys Glu Arg 65 70
75 195222DNAPopulus trichocarpa 195atggctgact cggatcactc
ctctagtgat gatctctctg ttgattctag agatacaagc 60caagattcca agcttgaatt
ctcagaagac gaggaaactc ttattactag gatgtacaat 120ctggttggtg agaggtggac
tttaattgct gggaggattc ctggaagaac agcagaggaa 180attgagaagt actggacttc
aagatactct acaagtcagt aa 22219673PRTPopulus
trichocarpa 196Met Ala Asp Ser Asp His Ser Ser Ser Asp Asp Leu Ser Val
Asp Ser 1 5 10 15
Arg Asp Thr Ser Gln Asp Ser Lys Leu Glu Phe Ser Glu Asp Glu Glu
20 25 30 Thr Leu Ile Thr Arg
Met Tyr Asn Leu Val Gly Glu Arg Trp Thr Leu 35
40 45 Ile Ala Gly Arg Ile Pro Gly Arg Thr
Ala Glu Glu Ile Glu Lys Tyr 50 55
60 Trp Thr Ser Arg Tyr Ser Thr Ser Gln 65
70 197225DNAPopulus trichocarpa 197atggctgact ctgaacattc
ttcttctgat gaaacttttg tgtattcgag agaggaaaca 60agtaaggaat caaagcttga
attctctgag gatgaggaga cacttataat taggatgttt 120aatctagttg gagagaggtg
gtctttgatt gctggaagga ttcctggaag aacagctgag 180gaaatagaga agtactggaa
cactagatac tctacaagtg aatga 22519874PRTPopulus
trichocarpa 198Met Ala Asp Ser Glu His Ser Ser Ser Asp Glu Thr Phe Val
Tyr Ser 1 5 10 15
Arg Glu Glu Thr Ser Lys Glu Ser Lys Leu Glu Phe Ser Glu Asp Glu
20 25 30 Glu Thr Leu Ile Ile
Arg Met Phe Asn Leu Val Gly Glu Arg Trp Ser 35
40 45 Leu Ile Ala Gly Arg Ile Pro Gly Arg
Thr Ala Glu Glu Ile Glu Lys 50 55
60 Tyr Trp Asn Thr Arg Tyr Ser Thr Ser Glu 65
70 199225DNAPopulus trichocarpa 199atggctgact
tggatcactc ctctagtgat gacaactctg ttgattctag agaggaaacc 60agccaagatt
cgaagcttga attctcagaa gatgaggaaa ctcttatcac caggatgtac 120aatctggttg
gtgagaggtg gcccttaatt gctgggagga ttccaggaag aacagcagaa 180gaaattgaga
agtactggac ttcaagatac tctacaagtc agtaa
22520074PRTPopulus trichocarpa 200Met Ala Asp Leu Asp His Ser Ser Ser Asp
Asp Asn Ser Val Asp Ser 1 5 10
15 Arg Glu Glu Thr Ser Gln Asp Ser Lys Leu Glu Phe Ser Glu Asp
Glu 20 25 30 Glu
Thr Leu Ile Thr Arg Met Tyr Asn Leu Val Gly Glu Arg Trp Pro 35
40 45 Leu Ile Ala Gly Arg Ile
Pro Gly Arg Thr Ala Glu Glu Ile Glu Lys 50 55
60 Tyr Trp Thr Ser Arg Tyr Ser Thr Ser Gln 65
70 201294DNAPopulus trichocarpa
201atggaaagta tggaccgccg ccgccgccga cgccgcaaac aagctaaaat taacaattct
60gggtctgaag aggtcagtag tattgaatgg gagtttatag acatgagtga acaagaggaa
120gacctcattt acagaatgta taggcttgtt ggagaaaggt gggatttggt agctggaagg
180attccaggcc ggaaagcaga agaaatagag aggttttgga taatgaaaca ccgtgaaggg
240tttgctgaga aacgaaggtt gcatagcaaa gcgaagtcta agacttatcg ttag
29420297PRTPopulus trichocarpa 202Met Glu Ser Met Asp Arg Arg Arg Arg Arg
Arg Arg Lys Gln Ala Lys 1 5 10
15 Ile Asn Asn Ser Gly Ser Glu Glu Val Ser Ser Ile Glu Trp Glu
Phe 20 25 30 Ile
Asp Met Ser Glu Gln Glu Glu Asp Leu Ile Tyr Arg Met Tyr Arg 35
40 45 Leu Val Gly Glu Arg Trp
Asp Leu Val Ala Gly Arg Ile Pro Gly Arg 50 55
60 Lys Ala Glu Glu Ile Glu Arg Phe Trp Ile Met
Lys His Arg Glu Gly 65 70 75
80 Phe Ala Glu Lys Arg Arg Leu His Ser Lys Ala Lys Ser Lys Thr Tyr
85 90 95 Arg
203267DNAPopulus trichocarpa 203atggatagac gtcgcaagaa gcaagccaag
actacatctt gttgttctga acaagaggtg 60agcagcattg agtgggagtt cattaacatg
tcagaacaag aagaagatct catttacaga 120atgcataatc tggtggggga caggtgggct
ttgatcgctg gtcgaattcc aggacgcaaa 180gctgaagaaa tagagagatt ttggctaatg
agacacggtg aagggtttgc cagtcgacga 240agagagcaaa agagatgtca ttcctaa
26720488PRTPopulus trichocarpa 204Met
Asp Arg Arg Arg Lys Lys Gln Ala Lys Thr Thr Ser Cys Cys Ser 1
5 10 15 Glu Gln Glu Val Ser Ser
Ile Glu Trp Glu Phe Ile Asn Met Ser Glu 20
25 30 Gln Glu Glu Asp Leu Ile Tyr Arg Met His
Asn Leu Val Gly Asp Arg 35 40
45 Trp Ala Leu Ile Ala Gly Arg Ile Pro Gly Arg Lys Ala Glu
Glu Ile 50 55 60
Glu Arg Phe Trp Leu Met Arg His Gly Glu Gly Phe Ala Ser Arg Arg 65
70 75 80 Arg Glu Gln Lys Arg
Cys His Ser 85 205228DNAPhaseolus vulgaris
205atggctgact ccaatacctc ttccactcaa acttcttcac attcttctga ttcagggaag
60cgtggaactt ccaaggttga gttttctgaa gacgaggaaa ctcttattac caggatgtat
120aaactggttg ggaaaaggtg gtctttaatt gctggaagaa ttcctggaag aacagcagag
180gaaatagaga agtattggac ttcgaaactc tcgagttcta gtaaatga
22820675PRTPhaseolus vulgaris 206Met Ala Asp Ser Asn Thr Ser Ser Thr Gln
Thr Ser Ser His Ser Ser 1 5 10
15 Asp Ser Gly Lys Arg Gly Thr Ser Lys Val Glu Phe Ser Glu Asp
Glu 20 25 30 Glu
Thr Leu Ile Thr Arg Met Tyr Lys Leu Val Gly Lys Arg Trp Ser 35
40 45 Leu Ile Ala Gly Arg Ile
Pro Gly Arg Thr Ala Glu Glu Ile Glu Lys 50 55
60 Tyr Trp Thr Ser Lys Leu Ser Ser Ser Ser Lys
65 70 75 207237DNASorghum bicolor
207atggatagca gcagcggcag ccaggacaag aaatccaaag gcaatgatcg ccgtgaagca
60aaagaagcta atggcactgc acagcatttt gttgatttca cggaagcaga ggaagatctt
120gtttccagaa tgcacaggct tgtggggaac aggtgggaga ttatagcagg gagaatccca
180ggaaggacag cagaagaggt agagatgttc tggtccaaaa aacaccagga aagatga
23720878PRTSorghum bicolor 208Met Asp Ser Ser Ser Gly Ser Gln Asp Lys Lys
Ser Lys Gly Asn Asp 1 5 10
15 Arg Arg Glu Ala Lys Glu Ala Asn Gly Thr Ala Gln His Phe Val Asp
20 25 30 Phe Thr
Glu Ala Glu Glu Asp Leu Val Ser Arg Met His Arg Leu Val 35
40 45 Gly Asn Arg Trp Glu Ile Ile
Ala Gly Arg Ile Pro Gly Arg Thr Ala 50 55
60 Glu Glu Val Glu Met Phe Trp Ser Lys Lys His Gln
Glu Arg 65 70 75
209270DNASalvia miltiorrhiza 209atggataagt gtcggcagaa gcagatcaag
attcggaaat accctctgtg tgaagaggtg 60agcagtattg aatgggagtt tgtgaacatg
actgatcaag aagaagacat catcaacaga 120atgcacaagc ttgttgggga caggtggggt
ttgatagctg ggagacttcc tgggaggaaa 180gctgaggaga ttgagagatt ttggttgatg
agaaatagtg acaattttac agataaaaga 240aaggaatatc ataggagaca aaagtcttga
27021089PRTSalvia miltiorrhiza 210Met
Asp Lys Cys Arg Gln Lys Gln Ile Lys Ile Arg Lys Tyr Pro Leu 1
5 10 15 Cys Glu Glu Val Ser Ser
Ile Glu Trp Glu Phe Val Asn Met Thr Asp 20
25 30 Gln Glu Glu Asp Ile Ile Asn Arg Met His
Lys Leu Val Gly Asp Arg 35 40
45 Trp Gly Leu Ile Ala Gly Arg Leu Pro Gly Arg Lys Ala Glu
Glu Ile 50 55 60
Glu Arg Phe Trp Leu Met Arg Asn Ser Asp Asn Phe Thr Asp Lys Arg 65
70 75 80 Lys Glu Tyr His Arg
Arg Gln Lys Ser 85 211285DNASalvia
miltiorrhiza 211atggataagt gtagtagcac tcagaagcat cccaagattc agaatgaggc
aagctctctt 60gaatgggaat tcataaagat gacagagcaa gaagaagata tcatatgtag
aatgcacaag 120cttgtgggag acaagtggga gttaatagca ggaagaattc caggcagaag
tgcagaagag 180attgaaagat tttggttgat gagaaatggc gatgagagga agaggaaagc
aaataatatt 240gagcgggccc caccgctaca tgttcgagtt tcatcggccg actag
28521294PRTSalvia miltiorrhiza 212Met Asp Lys Cys Ser Ser Thr
Gln Lys His Pro Lys Ile Gln Asn Glu 1 5
10 15 Ala Ser Ser Leu Glu Trp Glu Phe Ile Lys Met
Thr Glu Gln Glu Glu 20 25
30 Asp Ile Ile Cys Arg Met His Lys Leu Val Gly Asp Lys Trp Glu
Leu 35 40 45 Ile
Ala Gly Arg Ile Pro Gly Arg Ser Ala Glu Glu Ile Glu Arg Phe 50
55 60 Trp Leu Met Arg Asn Gly
Asp Glu Arg Lys Arg Lys Ala Asn Asn Ile 65 70
75 80 Glu Arg Ala Pro Pro Leu His Val Arg Val Ser
Ser Ala Asp 85 90
213231DNASolanum tuberosum 213atgggcgctc ttcaaggctt acttggtatc gggaatggaa
tagataaagc atttgaggcc 60aaaaaggaag agagctcgaa gcttgaattt tcccaagatg
aggaaatcct tattactaaa 120atgttcaact tggttggtga gaggtggtca ttaattgctg
gaagaattcc agggagaact 180gcagaagaaa ttgagaagta ttggaactca agaaattcca
ccagccaata a 23121476PRTSolanum tuberosum 214Met Gly Ala Leu
Gln Gly Leu Leu Gly Ile Gly Asn Gly Ile Asp Lys 1 5
10 15 Ala Phe Glu Ala Lys Lys Glu Glu Ser
Ser Lys Leu Glu Phe Ser Gln 20 25
30 Asp Glu Glu Ile Leu Ile Thr Lys Met Phe Asn Leu Val Gly
Glu Arg 35 40 45
Trp Ser Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala Glu Glu Ile 50
55 60 Glu Lys Tyr Trp Asn
Ser Arg Asn Ser Thr Ser Gln 65 70 75
215231DNATriticum aestivum 215atgagcagcg aaagcttggg caagaactcc
aagatcatgg gtggccgtga aagaaaagaa 60gttaatagca ccgcaaagca ttttgttgat
ttcacagaag cagaggaaga tcttgttttc 120agaatgcaca ggcttgtcgg gaacaggtgg
gaacttatag ctggaagaat ccccggaaga 180acagcagaag aagtagagat gttctgggca
aaaaggcacc aggaccaatg a 23121676PRTTriticum aestivum 216Met
Ser Ser Glu Ser Leu Gly Lys Asn Ser Lys Ile Met Gly Gly Arg 1
5 10 15 Glu Arg Lys Glu Val Asn
Ser Thr Ala Lys His Phe Val Asp Phe Thr 20
25 30 Glu Ala Glu Glu Asp Leu Val Phe Arg Met
His Arg Leu Val Gly Asn 35 40
45 Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala
Glu Glu 50 55 60
Val Glu Met Phe Trp Ala Lys Arg His Gln Asp Gln 65 70
75 217198DNATamarix androssowii 217aaagtacaac
aaatgtcatg catgcaccga cgctgatctc cttcctaaaa acgactgaag 60aggtctggac
tcgagtctag tctaatctag tctagcagta tgataacgcc gctaatttcc 120aaaacaaaga
gaaagcactt aaaaatatac atagggagga tagaagagga gcacaaccgc 180tacaactttt
catacaac
19821865PRTTamarix androssowii 218Met Lys Ile Arg Ser Ile Pro Gln Ser Thr
Ala Thr Lys Ser Leu Ser 1 5 10
15 Arg Asn Phe Ser Glu Asp Glu Glu Thr Leu Ile Thr Arg Met Phe
Asn 20 25 30 Leu
Val Gly Glu Arg Trp Ser Leu Ile Ala Gly Arg Ile Pro Gly Arg 35
40 45 Thr Ala Glu Glu Ile Glu
Lys Tyr Trp Thr Ser Arg Tyr Tyr Thr Ser 50 55
60 Arg 65 219309DNATriphysaria sp
219atggctgatg atcaattgca gaaacctagt gctactaatg ataatgcaat agacggcaat
60aaagatgata aggtagtagc tgagagtcca agtatcgtcg atgattcgaa gcagcttgag
120attacagaag atgaagaaac cctaattaat aggatgtaca atttggttgg agaaagatgg
180tcattgattg ctggaagaat accggggaga agtgccgagg aaattgagaa gtattggaat
240tttagaccac aatctacact aaaacagtta ctcgataatg caatcttggt tgtggataat
300caaccatag
309220102PRTTriphysaria sp 220Met Ala Asp Asp Gln Leu Gln Lys Pro Ser Ala
Thr Asn Asp Asn Ala 1 5 10
15 Ile Asp Gly Asn Lys Asp Asp Lys Val Val Ala Glu Ser Pro Ser Ile
20 25 30 Val Asp
Asp Ser Lys Gln Leu Glu Ile Thr Glu Asp Glu Glu Thr Leu 35
40 45 Ile Asn Arg Met Tyr Asn Leu
Val Gly Glu Arg Trp Ser Leu Ile Ala 50 55
60 Gly Arg Ile Pro Gly Arg Ser Ala Glu Glu Ile Glu
Lys Tyr Trp Asn 65 70 75
80 Phe Arg Pro Gln Ser Thr Leu Lys Gln Leu Leu Asp Asn Ala Ile Leu
85 90 95 Val Val Asp
Asn Gln Pro 100 221228DNAVitis vinifera 221atggctgact
cagaatactc tacttctaat gacacttctt gtgttgattc tcaagagcaa 60agcagccaag
aagctaagct tgaattctct gaagacgagg aaacactgat cattaggatg 120tttaatctgg
ttggagagag gtgggctcta attgctggga ggatccctgg gagaacagca 180gaggacattg
agaagtactg gaattcaaga tactcaacca gtgagtga
22822275PRTVitis vinifera 222Met Ala Asp Ser Glu Tyr Ser Thr Ser Asn Asp
Thr Ser Cys Val Asp 1 5 10
15 Ser Gln Glu Gln Ser Ser Gln Glu Ala Lys Leu Glu Phe Ser Glu Asp
20 25 30 Glu Glu
Thr Leu Ile Ile Arg Met Phe Asn Leu Val Gly Glu Arg Trp 35
40 45 Ala Leu Ile Ala Gly Arg Ile
Pro Gly Arg Thr Ala Glu Asp Ile Glu 50 55
60 Lys Tyr Trp Asn Ser Arg Tyr Ser Thr Ser Glu 65
70 75 223279DNAVitis vinifera
223atgtcaccac tagtgaggac accaaaggta cccacacaac acccttctct gtcttcatct
60tgttcactga tctgggtttg ttctggttca gaggaaacag caaaggattc caaggtggag
120ttctctgaag atgaggagac actcatagct agaatgttta gattggtggg agacagatgg
180aatttgattg cgggaaggat cccgggaaga tctgcagaag agatcaagaa gtattggact
240tccaagtctg tctcatcgtc gactaaacaa catgattga
27922492PRTVitis vinifera 224Met Ser Pro Leu Val Arg Thr Pro Lys Val Pro
Thr Gln His Pro Ser 1 5 10
15 Leu Ser Ser Ser Cys Ser Leu Ile Trp Val Cys Ser Gly Ser Glu Glu
20 25 30 Thr Ala
Lys Asp Ser Lys Val Glu Phe Ser Glu Asp Glu Glu Thr Leu 35
40 45 Ile Ala Arg Met Phe Arg Leu
Val Gly Asp Arg Trp Asn Leu Ile Ala 50 55
60 Gly Arg Ile Pro Gly Arg Ser Ala Glu Glu Ile Lys
Lys Tyr Trp Thr 65 70 75
80 Ser Lys Ser Val Ser Ser Ser Thr Lys Gln His Asp 85
90 225222DNAVitis vinifera 225atggctgact
tggatcactc ctctgatggc agctctctgg attctagaga gggaagcagt 60caagattcca
agcttgaatt ctctgaagat gaggaaaccc tgatcactag gatgttcaat 120ctggttggag
agaggtggtc tctgattgct gggagaattc ctggaagaac ggcagaggaa 180attgagaagt
actggacttc aagatattca tcaagtgaat ga
22222673PRTVitis vinifera 226Met Ala Asp Leu Asp His Ser Ser Asp Gly Ser
Ser Leu Asp Ser Arg 1 5 10
15 Glu Gly Ser Ser Gln Asp Ser Lys Leu Glu Phe Ser Glu Asp Glu Glu
20 25 30 Thr Leu
Ile Thr Arg Met Phe Asn Leu Val Gly Glu Arg Trp Ser Leu 35
40 45 Ile Ala Gly Arg Ile Pro Gly
Arg Thr Ala Glu Glu Ile Glu Lys Tyr 50 55
60 Trp Thr Ser Arg Tyr Ser Ser Ser Glu 65
70 227237DNAZea mays 227atggatagca gcagtggtag
ccaggacaag aaattcagag acaatgatcg ccctgaagca 60aaagaagcta atagcaccgc
ccagcatctt gttgacttca cggaagcaga ggaagatctt 120gtttccagaa tgcacaggct
tgtggggaac aggtgggaga ttatagcagg aagaatccca 180ggaaggacag cagaagaggt
agagatgttc tggtccaaaa aataccagga aagatga 23722878PRTZea mays 228Met
Asp Ser Ser Ser Gly Ser Gln Asp Lys Lys Phe Arg Asp Asn Asp 1
5 10 15 Arg Pro Glu Ala Lys Glu
Ala Asn Ser Thr Ala Gln His Leu Val Asp 20
25 30 Phe Thr Glu Ala Glu Glu Asp Leu Val Ser
Arg Met His Arg Leu Val 35 40
45 Gly Asn Arg Trp Glu Ile Ile Ala Gly Arg Ile Pro Gly Arg
Thr Ala 50 55 60
Glu Glu Val Glu Met Phe Trp Ser Lys Lys Tyr Gln Glu Arg 65
70 75
SEQ ID NO: 229 (length 27; PRT; Artificial sequence; motif 12)
Phe Ser Glu Asp Glu Glu Asp Leu Ile Ile Arg Met Tyr Asn Leu Val
Gly Glu Arg Trp Ser Leu Ile Ala Gly Arg Ile
SEQ ID NO: 230 (length 15; PRT; Artificial sequence; motif 13)
Pro Gly Arg Thr Ala Glu Glu Ile Glu Lys Tyr Trp Thr Ser Arg
SEQ ID NO: 231 (length 11; PRT; Artificial sequence; motif 14)
Glu Glu Val Ser Ser Gln Glu Ser Glu Phe Ile
SEQ ID NO: 232 (length 31; PRT; Artificial sequence; motif 15)
Glu Glu Asp Leu Ile Xaa Arg Leu His Xaa Leu Leu Gly Asn Arg Trp
Xaa Leu Ile Ala Gly Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile
SEQ ID NO: 233 (length 56; DNA; Artificial sequence; primer prm09014)
ggggacaagt ttgtacaaaa aagcaggctt aaacaatgga taacactgac cgtcgt
SEQ ID NO: 234 (length 50; DNA; Artificial sequence; primer prm09015)
ggggaccact ttgtacaaga aagctgggtt ttttcgttgg cttaaaaaca
SEQ ID NO: 235 (length 1264; DNA; Oryza sativa)
tcgacgctac
tcaagtggtg ggaggccacc gcatgttcca acgaagcgcc aaagaaagcc 60ttgcagactc
taatgctatt agtcgcctag gatatttgga atgaaaggaa ccgcagagtt 120tttcagcacc
aagagcttcc ggtggctagt ctgatagcca aaattaagga ggatgccaaa 180acatgggtct
tggcgggcgc gaaacacctt gataggtggc ttacctttta acatgttcgg 240gccaaaggcc
ttgagacggt aaagttttct atttgcgctt gcgcatgtac aattttattc 300ctctattcaa
tgaaattggt ggctcactgg ttcattaaaa aaaaaagaat ctagcctgtt 360cgggaagaag
aggattttgt tcgtgagaga gagagagaga gagagagaga gagagagaga 420gaaggaggag
gaggattttc aggcttcgca ttgcccaacc tctgcttctg ttggcccaag 480aagaatccca
ggcgcccatg ggctggcagt ttaccacgga cctacctagc ctaccttagc 540tatctaagcg
ggccgaccta gtagccacgt gcctagtgta gattaaagtt gccgggccag 600caggaagcca
cgctgcaatg gcatcttccc ctgtccttcg cgtacgtgaa aacaaaccca 660ggtaagctta
gaatcttctt gcccgttgga ctgggacacc caccaatccc accatgcccc 720gatattcctc
cggtctcggt tcatgtgatg tcctctcttg tgtgatcacg gagcaagcat 780tcttaaacgg
caaaagaaaa tcaccaactt gctcacgcag tcacgctgca ccgcgcgaag 840cgacgcccga
taggccaaga tcgcgagata aaataacaac caatgatcat aaggaaacaa 900gcccgcgatg
tgtcgtgtgc agcaatcttg gtcatttgcg ggatcgagtg cttcacagct 960aaccaaatat
tcggccgatg atttaacaca ttatcagcgt agatgtacgt acgatttgtt 1020aattaatcta
cgagccttgc tagggcaggt gttctgccag ccaatccaga tcgccctcgt 1080atgcacgctc
acatgatggc agggcagggt tcacatgagc tctaacggtc gattaattaa 1140tcccggggct
cgactataaa tacctcccta atcccatgat caaaaccatc tcaagcagcc 1200taatcatctc
cagctgatca agagctctta attagctagc tagtgattag ctgcgcttgt 1260gatc
1264
SEQ ID NO: 236 (length 1640; DNA; Artificial sequence; expression cassette with RCc3 promoter)
tcgacgctac tcaagtggtg ggaggccacc gcatgttcca acgaagcgcc aaagaaagcc
60ttgcagactc taatgctatt agtcgcctag gatatttgga atgaaaggaa ccgcagagtt
120tttcagcacc aagagcttcc ggtggctagt ctgatagcca aaattaagga ggatgccaaa
180acatgggtct tggcgggcgc gaaacacctt gataggtggc ttacctttta acatgttcgg
240gccaaaggcc ttgagacggt aaagttttct atttgcgctt gcgcatgtac aattttattc
300ctctattcaa tgaaattggt ggctcactgg ttcattaaaa aaaaaagaat ctagcctgtt
360cgggaagaag aggattttgt tcgtgagaga gagagagaga gagagagaga gagagagaga
420gaaggaggag gaggattttc aggcttcgca ttgcccaacc tctgcttctg ttggcccaag
480aagaatccca ggcgcccatg ggctggcagt ttaccacgga cctacctagc ctaccttagc
540tatctaagcg ggccgaccta gtagccacgt gcctagtgta gattaaagtt gccgggccag
600caggaagcca cgctgcaatg gcatcttccc ctgtccttcg cgtacgtgaa aacaaaccca
660ggtaagctta gaatcttctt gcccgttgga ctgggacacc caccaatccc accatgcccc
720gatattcctc cggtctcggt tcatgtgatg tcctctcttg tgtgatcacg gagcaagcat
780tcttaaacgg caaaagaaaa tcaccaactt gctcacgcag tcacgctgca ccgcgcgaag
840cgacgcccga taggccaaga tcgcgagata aaataacaac caatgatcat aaggaaacaa
900gcccgcgatg tgtcgtgtgc agcaatcttg gtcatttgcg ggatcgagtg cttcacagct
960aaccaaatat tcggccgatg atttaacaca ttatcagcgt agatgtacgt acgatttgtt
1020aattaatcta cgagccttgc tagggcaggt gttctgccag ccaatccaga tcgccctcgt
1080atgcacgctc acatgatggc agggcagggt tcacatgagc tctaacggtc gattaattaa
1140tcccggggct cgactataaa tacctcccta atcccatgat caaaaccatc tcaagcagcc
1200taatcatctc cagctgatca agagctctta attagctagc tagtgattag ctgcgcttgt
1260gatcatttaa atcaactagg gatatcacaa gtttgtacaa aaaagcaggc ttaaacaatg
1320gataacactg accgtcgtcg ccgtcgtaag caacacaaaa tcgccctcca tgactctgaa
1380gaagtgagca gtatcgaatt ggagtttatc aacatgactg aacaagaaga agatctcatc
1440tttcgaatgt acagacttgt cggtgatagg tgggatttga tagcaggaag agttcctgga
1500agacaaccag aggagataga gagatattgg ataatgagaa acagtgaagg ctttgctgat
1560aaacgacgcc agcttcactc atcttcccac aaacatacca agcctcaccg tcctcgcttt
1620tctatctatc cttcctagtg
16402372194DNAOryza sativa 237aatccgaaaa gtttctgcac cgttttcacc ccctaactaa
caatataggg aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact
agaactatgc aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag
agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc
ttagaatata cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt
catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag
atttttttta aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata
taattatata attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct
tactccatcc caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc
ttttttcgat tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag
tgcacctcct caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa
tacatttagg tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata
atatctaaaa tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt
taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca
tattgggcac acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg
atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg cggccaggag
agaggaggag aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc
ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca
cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg
gtcgatctct tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc
ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta
tctgtgatga ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc
ggttcggttt gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat
ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag
caccggtgat tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg
atgcttctcg atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca
actttgaaga cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt
tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata
atcctcagga acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt
cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa
tgcttttata gcgttatcct 1800agctgtagtt cagttaatag gtaatacccc tatagtttag
tcaggagaag aacttatccg 1860atttctgatc tccattttta attatatgaa atgaactgta
gcataagcag tattcatttg 1920gattattttt tttattagct ctcacccctt cattattctg
agctgaaagt ctggcatgaa 1980ctgtcctcaa ttttgttttc aaattcacat cgattatcta
tgcattatcc tcttgtatct 2040acctgtagaa gtttcttttt ggttattcct tgactgcttg
attacagaaa gaaatttatg 2100aagctgtaat cgggatagtt atactgcttg ttcttatgat
tcatttcctt tgtgcagttc 2160ttggtgtagc ttgccacttt caccagcaaa gttc
2194238831DNAArabidopsis thaliana 238atgacggcat
caggaggagg atcaacggcg gcgacgggga ggatgccgac gtggaaggaa 60agagagaaca
acaagaagag agaaagaaga agaagagcaa tagcagctaa aatcttcacc 120ggacttagat
ctcaaggcaa ttataaactt cctaaacact gtgacaacaa tgaagtcctc 180aaagctcttt
gtcttgaagc tggttggatc gttcatgaag atggcaccac ttatcgaaag 240ggttctcgac
caacagaaac aacagtgccg tgttcgtcaa tccaacttag tccacaatca 300tcggcctttc
aaagcccaat tccttcgtat caagctagcc cttcatcgtc atcttaccca 360agtccgaccc
ggtttgaccc gaatcaatcc tcgacttatc tcattcccta tctccaaaac 420ctagcttcgt
ctggaaacct cgctcctcta cgaatttcca atagtgcccc tgttacaccg 480ccgatttctt
cacctagaag atcaaatccg agacttccga gatggcaaag cagtaatttc 540ccagtctcag
ctccgtcaag cccaacacgg cgtctccatc actacacatc gattccagaa 600tgcgatgaat
cggatgtttc gacggttgat tcttgtcgat ggggaaattt ccaatcggtt 660aacgtttctc
agacatgtcc tccgtcgccg acatttaacc tggtcggaaa aagcgttagc 720tccgtcggag
tagatgtgtc ggtgaagccg tgggaaggtg agaagattca cgatgttggt 780atcgatgact
tggagctgac gctaggtcac aacaccaaag gacgcggcta g
831239276PRTArabidopsis thaliana 239Met Thr Ala Ser Gly Gly Gly Ser Thr
Ala Ala Thr Gly Arg Met Pro 1 5 10
15 Thr Trp Lys Glu Arg Glu Asn Asn Lys Lys Arg Glu Arg Arg
Arg Arg 20 25 30
Ala Ile Ala Ala Lys Ile Phe Thr Gly Leu Arg Ser Gln Gly Asn Tyr
35 40 45 Lys Leu Pro Lys
His Cys Asp Asn Asn Glu Val Leu Lys Ala Leu Cys 50
55 60 Leu Glu Ala Gly Trp Ile Val His
Glu Asp Gly Thr Thr Tyr Arg Lys 65 70
75 80 Gly Ser Arg Pro Thr Glu Thr Thr Val Pro Cys Ser
Ser Ile Gln Leu 85 90
95 Ser Pro Gln Ser Ser Ala Phe Gln Ser Pro Ile Pro Ser Tyr Gln Ala
100 105 110 Ser Pro Ser
Ser Ser Ser Tyr Pro Ser Pro Thr Arg Phe Asp Pro Asn 115
120 125 Gln Ser Ser Thr Tyr Leu Ile Pro
Tyr Leu Gln Asn Leu Ala Ser Ser 130 135
140 Gly Asn Leu Ala Pro Leu Arg Ile Ser Asn Ser Ala Pro
Val Thr Pro 145 150 155
160 Pro Ile Ser Ser Pro Arg Arg Ser Asn Pro Arg Leu Pro Arg Trp Gln
165 170 175 Ser Ser Asn Phe
Pro Val Ser Ala Pro Ser Ser Pro Thr Arg Arg Leu 180
185 190 His His Tyr Thr Ser Ile Pro Glu Cys
Asp Glu Ser Asp Val Ser Thr 195 200
205 Val Asp Ser Cys Arg Trp Gly Asn Phe Gln Ser Val Asn Val
Ser Gln 210 215 220
Thr Cys Pro Pro Ser Pro Thr Phe Asn Leu Val Gly Lys Ser Val Ser 225
230 235 240 Ser Val Gly Val Asp
Val Ser Val Lys Pro Trp Glu Gly Glu Lys Ile 245
250 255 His Asp Val Gly Ile Asp Asp Leu Glu Leu
Thr Leu Gly His Asn Thr 260 265
270 Lys Gly Arg Gly 275 240 1008 DNA Arabidopsis
thaliana 240 atgacgtctg acggagcaac gtcgacgtca gctgcagctg cagcagcagc
gatggcgacg 60aggaggaaac cgtcgtggag agagagggag aacaatcgga gaagagagcg
gcggagaaga 120gctgttgcgg cgaagattta tactggtctt agagctcaag gtaactacaa
tcttccaaaa 180cattgtgaca acaatgaggt tcttaaggct ctttgttctg aagctggttg
ggttgttgaa 240gaagacggaa ctacttatcg caagggacac aagcctctac ctggtgacat
ggctggatca 300tcttctcgag caactcctta ctcttcccat aaccaaagtc ctctttcttc
cacttttgat 360agccccatct tatcttacca agtcagtcct tcctcttctt cattcccgag
tccttctcga 420gttggtgatc cacacaatat ctccacaatc ttccctttcc tcaggaatgg
tggtattcct 480tcatcgcttc ctccacttag aatctcaaac agtgctcctg tcactccacc
agtgtcatcc 540ccaacttcta gaaaccccaa accattgcct acttgggaat cttttaccaa
acaatccatg 600tccatggctg ctaaacagtc aatgacttct ttgaactacc cgttttatgc
ggtgtctgca 660cctgccagtc ctactcatca tcgccagttc catgctccgg ctactatacc
tgaatgtgat 720gagtctgact cttccactgt tgattctggt cattggataa gctttcaaaa
gtttgcacaa 780caacagccat tctctgcctc tatggtgcca acctcgccta ccttcaatct
cgtgaaacct 840gcaccacagc aattgtctcc aaacacagca gcaatccaag agattggtca
aagctccgag 900tttaagtttg agaacagcca agttaagcca tgggaagggg agaggatcca
tgatgtggct 960atggaggatc tagagctcac gcttggaaat ggtaaagctc atagttga
1008 241 335 PRT Arabidopsis thaliana 241 Met Thr Ser Asp Gly Ala
Thr Ser Thr Ser Ala Ala Ala Ala Ala Ala 1 5
10 15 Ala Met Ala Thr Arg Arg Lys Pro Ser Trp Arg
Glu Arg Glu Asn Asn 20 25
30 Arg Arg Arg Glu Arg Arg Arg Arg Ala Val Ala Ala Lys Ile Tyr
Thr 35 40 45 Gly
Leu Arg Ala Gln Gly Asn Tyr Asn Leu Pro Lys His Cys Asp Asn 50
55 60 Asn Glu Val Leu Lys Ala
Leu Cys Ser Glu Ala Gly Trp Val Val Glu 65 70
75 80 Glu Asp Gly Thr Thr Tyr Arg Lys Gly His Lys
Pro Leu Pro Gly Asp 85 90
95 Met Ala Gly Ser Ser Ser Arg Ala Thr Pro Tyr Ser Ser His Asn Gln
100 105 110 Ser Pro
Leu Ser Ser Thr Phe Asp Ser Pro Ile Leu Ser Tyr Gln Val 115
120 125 Ser Pro Ser Ser Ser Ser Phe
Pro Ser Pro Ser Arg Val Gly Asp Pro 130 135
140 His Asn Ile Ser Thr Ile Phe Pro Phe Leu Arg Asn
Gly Gly Ile Pro 145 150 155
160 Ser Ser Leu Pro Pro Leu Arg Ile Ser Asn Ser Ala Pro Val Thr Pro
165 170 175 Pro Val Ser
Ser Pro Thr Ser Arg Asn Pro Lys Pro Leu Pro Thr Trp 180
185 190 Glu Ser Phe Thr Lys Gln Ser Met
Ser Met Ala Ala Lys Gln Ser Met 195 200
205 Thr Ser Leu Asn Tyr Pro Phe Tyr Ala Val Ser Ala Pro
Ala Ser Pro 210 215 220
Thr His His Arg Gln Phe His Ala Pro Ala Thr Ile Pro Glu Cys Asp 225
230 235 240 Glu Ser Asp Ser
Ser Thr Val Asp Ser Gly His Trp Ile Ser Phe Gln 245
250 255 Lys Phe Ala Gln Gln Gln Pro Phe Ser
Ala Ser Met Val Pro Thr Ser 260 265
270 Pro Thr Phe Asn Leu Val Lys Pro Ala Pro Gln Gln Leu Ser
Pro Asn 275 280 285
Thr Ala Ala Ile Gln Glu Ile Gly Gln Ser Ser Glu Phe Lys Phe Glu 290
295 300 Asn Ser Gln Val Lys
Pro Trp Glu Gly Glu Arg Ile His Asp Val Ala 305 310
315 320 Met Glu Asp Leu Glu Leu Thr Leu Gly Asn
Gly Lys Ala His Ser 325 330
335 242 1011 DNA Arabidopsis thaliana 242 atgacttcgg atggagctac gtcgacatca
gcagctgcag ctgcggcggc ggcagcagcg 60gcgaggagga agccgtcgtg gagagaaagg
gagaataatc ggaggagaga aagacggaga 120agagctgtag ctgcgaagat atacactggg
cttagagctc aaggtgatta taatttgcct 180aaacattgtg ataataatga agtccttaaa
gctctttgtg ttgaagctgg ttgggttgtt 240gaagaagatg gtactactta tcgcaaggga
tgcaagcctt tacctggtga gatagctggg 300acttcatctc gagtaactcc atattcatca
cagaaccaga gccctctttc atcagccttt 360caaagtccca tcccatctta ccaagttagc
ccgtcttctt catcattccc gagtccttct 420cgcggtgaac caaataacaa catgtcctct
acattcttcc ctttcctcag aaatggtggc 480attccttctt ctcttccttc cctcagaatc
tcaaacagtt gtccagttac cccaccggtc 540tcatcgccga cttctaagaa cccgaaaccg
ttgcctaact gggaatctat cgctaagcaa 600tccatggcca ttgctaaaca atcaatggcg
tcttttaatt atcctttcta tgcggtttct 660gcacctgcta gtccgacaca tcgccaccag
tttcataccc cggctactat acctgaatgt 720gatgagtctg actcttccac tgttgattct
ggtcattgga taagctttca gaagtttgca 780caacaacagc cattctctgc ctctatggtg
ccaacctctc ctaccttcaa tcttgtgaaa 840cctgcgcctc agcagatgtc tccaaatact
gctgccttcc aagagattgg tcaaagctct 900gagtttaaat ttgagaatag ccaagttaaa
ccctgggaag gagagaggat acatgatgtg 960ggtatggagg atcttgagct tacacttgga
aatgggaagg ctcgtggttg a 1011 243 336 PRT Arabidopsis thaliana
243 Met Thr Ser Asp Gly Ala Thr Ser Thr Ser Ala Ala Ala Ala Ala Ala 1
5 10 15 Ala Ala Ala Ala
Ala Arg Arg Lys Pro Ser Trp Arg Glu Arg Glu Asn 20
25 30 Asn Arg Arg Arg Glu Arg Arg Arg Arg
Ala Val Ala Ala Lys Ile Tyr 35 40
45 Thr Gly Leu Arg Ala Gln Gly Asp Tyr Asn Leu Pro Lys His
Cys Asp 50 55 60
Asn Asn Glu Val Leu Lys Ala Leu Cys Val Glu Ala Gly Trp Val Val 65
70 75 80 Glu Glu Asp Gly Thr
Thr Tyr Arg Lys Gly Cys Lys Pro Leu Pro Gly 85
90 95 Glu Ile Ala Gly Thr Ser Ser Arg Val Thr
Pro Tyr Ser Ser Gln Asn 100 105
110 Gln Ser Pro Leu Ser Ser Ala Phe Gln Ser Pro Ile Pro Ser Tyr
Gln 115 120 125 Val
Ser Pro Ser Ser Ser Ser Phe Pro Ser Pro Ser Arg Gly Glu Pro 130
135 140 Asn Asn Asn Met Ser Ser
Thr Phe Phe Pro Phe Leu Arg Asn Gly Gly 145 150
155 160 Ile Pro Ser Ser Leu Pro Ser Leu Arg Ile Ser
Asn Ser Cys Pro Val 165 170
175 Thr Pro Pro Val Ser Ser Pro Thr Ser Lys Asn Pro Lys Pro Leu Pro
180 185 190 Asn Trp
Glu Ser Ile Ala Lys Gln Ser Met Ala Ile Ala Lys Gln Ser 195
200 205 Met Ala Ser Phe Asn Tyr Pro
Phe Tyr Ala Val Ser Ala Pro Ala Ser 210 215
220 Pro Thr His Arg His Gln Phe His Thr Pro Ala Thr
Ile Pro Glu Cys 225 230 235
240 Asp Glu Ser Asp Ser Ser Thr Val Asp Ser Gly His Trp Ile Ser Phe
245 250 255 Gln Lys Phe
Ala Gln Gln Gln Pro Phe Ser Ala Ser Met Val Pro Thr 260
265 270 Ser Pro Thr Phe Asn Leu Val Lys
Pro Ala Pro Gln Gln Met Ser Pro 275 280
285 Asn Thr Ala Ala Phe Gln Glu Ile Gly Gln Ser Ser Glu
Phe Lys Phe 290 295 300
Glu Asn Ser Gln Val Lys Pro Trp Glu Gly Glu Arg Ile His Asp Val 305
310 315 320 Gly Met Glu Asp
Leu Glu Leu Thr Leu Gly Asn Gly Lys Ala Arg Gly 325
330 335 244 978 DNA Arabidopsis thaliana
244 atgacatcag ggacgagaat gccgacatgg agggaaagag agaacaacaa gagaagagaa
60agacgacgga gagcaatcgc agctaagatc ttcaccggat taagaatgta cggtaattac
120gagcttccga agcattgcga caacaacgaa gtgcttaaag cactctgtaa cgaagctggt
180tggatcgtcg aacctgatgg aactacttac cgcaagggat gtagtagacc tgtagagcgt
240atggagatag gtggtggttc agcaaccgct agtccttgct cttcctatca gccaagtccc
300tgtgcttctt ataatcctag tccaggctcc tccaacttca tgagtcctgc ttcatcctca
360tttgctaatc ttacctctgg tgatggccaa tctctcatcc catggctaaa acacctctca
420acaacatcat cctcatcagc ttcttcatca tcaagactcc ctaattacct ctatatccct
480ggaggctcca taagcgctcc tgtaactcct cctttaagct ctccaacagc tcgtaccccg
540agaatgaaca ctgattggca gcaactcaac aactccttct ttgtctcctc aacaccgcca
600agtcccacgc gtcagatcat ccctgactct gaatggttct cagggattca actagcacaa
660agtgttccag cttcaccaac gtttagcctc gtctcacaaa acccatttgg attcaaagaa
720gaagcagcct ctgctgctgg aggcggagga gggtcaagga tgtggacacc aggtcaaagc
780ggaacctgct cccctgctat tcctcctggt gctgaccaga cagcagatgt tccaatgtct
840gaagccgtgg ctcctccaga gtttgctttt gggagtaata caaacgggct agtgaaagca
900tgggaaggag agaggataca cgaggagagt ggttcagatg atcttgaact cactcttgga
960aactcaagca ccaggtaa
978 245 325 PRT Arabidopsis thaliana 245 Met Thr Ser Gly Thr Arg Met Pro Thr
Trp Arg Glu Arg Glu Asn Asn 1 5 10
15 Lys Arg Arg Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile
Phe Thr 20 25 30
Gly Leu Arg Met Tyr Gly Asn Tyr Glu Leu Pro Lys His Cys Asp Asn
35 40 45 Asn Glu Val Leu
Lys Ala Leu Cys Asn Glu Ala Gly Trp Ile Val Glu 50
55 60 Pro Asp Gly Thr Thr Tyr Arg Lys
Gly Cys Ser Arg Pro Val Glu Arg 65 70
75 80 Met Glu Ile Gly Gly Gly Ser Ala Thr Ala Ser Pro
Cys Ser Ser Tyr 85 90
95 Gln Pro Ser Pro Cys Ala Ser Tyr Asn Pro Ser Pro Gly Ser Ser Asn
100 105 110 Phe Met Ser
Pro Ala Ser Ser Ser Phe Ala Asn Leu Thr Ser Gly Asp 115
120 125 Gly Gln Ser Leu Ile Pro Trp Leu
Lys His Leu Ser Thr Thr Ser Ser 130 135
140 Ser Ser Ala Ser Ser Ser Ser Arg Leu Pro Asn Tyr Leu
Tyr Ile Pro 145 150 155
160 Gly Gly Ser Ile Ser Ala Pro Val Thr Pro Pro Leu Ser Ser Pro Thr
165 170 175 Ala Arg Thr Pro
Arg Met Asn Thr Asp Trp Gln Gln Leu Asn Asn Ser 180
185 190 Phe Phe Val Ser Ser Thr Pro Pro Ser
Pro Thr Arg Gln Ile Ile Pro 195 200
205 Asp Ser Glu Trp Phe Ser Gly Ile Gln Leu Ala Gln Ser Val
Pro Ala 210 215 220
Ser Pro Thr Phe Ser Leu Val Ser Gln Asn Pro Phe Gly Phe Lys Glu 225
230 235 240 Glu Ala Ala Ser Ala
Ala Gly Gly Gly Gly Gly Ser Arg Met Trp Thr 245
250 255 Pro Gly Gln Ser Gly Thr Cys Ser Pro Ala
Ile Pro Pro Gly Ala Asp 260 265
270 Gln Thr Ala Asp Val Pro Met Ser Glu Ala Val Ala Pro Pro Glu
Phe 275 280 285 Ala
Phe Gly Ser Asn Thr Asn Gly Leu Val Lys Ala Trp Glu Gly Glu 290
295 300 Arg Ile His Glu Glu Ser
Gly Ser Asp Asp Leu Glu Leu Thr Leu Gly 305 310
315 320 Asn Ser Ser Thr Arg 325
246 855 DNA Arabidopsis thaliana 246 atgacgtcgg ggactagaac gccgacttgg
aaagagagag agaacaacaa acggcgagag 60cggcgaagac gagcgattgc ggctaagatc
ttcgcaggac taaggattca tggaaacttc 120aagctcccta aacactgcga caacaatgaa
gtcctcaaag ctttatgcaa tgaagctggt 180tggactgtag aagacgacgg aactacttac
cgcaagggat gcaaaccaat ggatcgaatg 240gacctcatga atggttctac ttcagctagt
ccatgctcat cgtatcaaca tagccctcgt 300gcttcctaca atccaagccc ttcgtcttca
tcattcccga gtcctacaaa cccatttggt 360gatgctaact cactaatccc atggctcaag
aacctctctt caaactcacc ttccaagctt 420cccttcttcc atggaaattc aataagcgct
cccgtgactc cgccattggc tcgaagccct 480actcgtgatc aagtaaccat ccctgactct
ggatggctct caggaatgca aactccgcag 540agcggaccgt cttctcctac tttcagttta
gtttcaagaa acccgttttt cgacaaagag 600gcttttaaaa tgggagattg taattcacca
atgtggactc ctggacaaag tggaaactgc 660tctccagcta ttcctgctgg tgttgatcag
aactctgatg tgccaatggc tgatggaatg 720acggctgagt ttgcgtttgg ttgtaacgca
atggctgcga atggaatggt gaagccttgg 780gaaggagaaa ggatacatgg agaatgtgtt
tcagatgatt tagaacttac acttggaaac 840tcaagaacca gatga
855 247 284 PRT Arabidopsis thaliana 247 Met
Thr Ser Gly Thr Arg Thr Pro Thr Trp Lys Glu Arg Glu Asn Asn 1
5 10 15 Lys Arg Arg Glu Arg Arg
Arg Arg Ala Ile Ala Ala Lys Ile Phe Ala 20
25 30 Gly Leu Arg Ile His Gly Asn Phe Lys Leu
Pro Lys His Cys Asp Asn 35 40
45 Asn Glu Val Leu Lys Ala Leu Cys Asn Glu Ala Gly Trp Thr
Val Glu 50 55 60
Asp Asp Gly Thr Thr Tyr Arg Lys Gly Cys Lys Pro Met Asp Arg Met 65
70 75 80 Asp Leu Met Asn Gly
Ser Thr Ser Ala Ser Pro Cys Ser Ser Tyr Gln 85
90 95 His Ser Pro Arg Ala Ser Tyr Asn Pro Ser
Pro Ser Ser Ser Ser Phe 100 105
110 Pro Ser Pro Thr Asn Pro Phe Gly Asp Ala Asn Ser Leu Ile Pro
Trp 115 120 125 Leu
Lys Asn Leu Ser Ser Asn Ser Pro Ser Lys Leu Pro Phe Phe His 130
135 140 Gly Asn Ser Ile Ser Ala
Pro Val Thr Pro Pro Leu Ala Arg Ser Pro 145 150
155 160 Thr Arg Asp Gln Val Thr Ile Pro Asp Ser Gly
Trp Leu Ser Gly Met 165 170
175 Gln Thr Pro Gln Ser Gly Pro Ser Ser Pro Thr Phe Ser Leu Val Ser
180 185 190 Arg Asn
Pro Phe Phe Asp Lys Glu Ala Phe Lys Met Gly Asp Cys Asn 195
200 205 Ser Pro Met Trp Thr Pro Gly
Gln Ser Gly Asn Cys Ser Pro Ala Ile 210 215
220 Pro Ala Gly Val Asp Gln Asn Ser Asp Val Pro Met
Ala Asp Gly Met 225 230 235
240 Thr Ala Glu Phe Ala Phe Gly Cys Asn Ala Met Ala Ala Asn Gly Met
245 250 255 Val Lys Pro
Trp Glu Gly Glu Arg Ile His Gly Glu Cys Val Ser Asp 260
265 270 Asp Leu Glu Leu Thr Leu Gly Asn
Ser Arg Thr Arg 275 280
248 510 DNA Arabidopsis thaliana 248 atggccgccg gaggaggagg aggaggagga
ggatcatcgt cgggacgtac tccgacgtgg 60aaagagagag agaacaataa gaagagagaa
agaagaagaa gagccatcac tgctaagatt 120tactctggtc ttagagctca aggtaactat
aagcttccta agcactgcga taacaacgag 180gttcttaaag ctctctgtct cgaagctggt
tggatcgtcg aagacgatgg caccacttat 240cgcaaggggt ttagccacca gcatcagata
tttcaggaac tcctacaaac ttcagcacaa 300attcatcaat ccaaccaagt ccacaatcat
cagcttttcc aagtcctgca ccttcgtacc 360acggaagtcc agtctcatca tccttcccga
gtccatctcg ctatgacgga aacccttctt 420cataccttct tcttccgttc ctacacaaca
tcgcttcttc gattcctgct aaccttccac 480ctcttagaat atccaacagt gcgcctgtga
510 249 169 PRT Arabidopsis thaliana 249 Met
Ala Ala Gly Gly Gly Gly Gly Gly Gly Gly Ser Ser Ser Gly Arg 1
5 10 15 Thr Pro Thr Trp Lys Glu
Arg Glu Asn Asn Lys Lys Arg Glu Arg Arg 20
25 30 Arg Arg Ala Ile Thr Ala Lys Ile Tyr Ser
Gly Leu Arg Ala Gln Gly 35 40
45 Asn Tyr Lys Leu Pro Lys His Cys Asp Asn Asn Glu Val Leu
Lys Ala 50 55 60
Leu Cys Leu Glu Ala Gly Trp Ile Val Glu Asp Asp Gly Thr Thr Tyr 65
70 75 80 Arg Lys Gly Phe Ser
His Gln His Gln Ile Phe Gln Glu Leu Leu Gln 85
90 95 Thr Ser Ala Gln Ile His Gln Ser Asn Gln
Val His Asn His Gln Leu 100 105
110 Phe Gln Val Leu His Leu Arg Thr Thr Glu Val Gln Ser His His
Pro 115 120 125 Ser
Arg Val His Leu Ala Met Thr Glu Thr Leu Leu His Thr Phe Phe 130
135 140 Phe Arg Ser Tyr Thr Thr
Ser Leu Leu Arg Phe Leu Leu Thr Phe His 145 150
155 160 Leu Leu Glu Tyr Pro Thr Val Arg Leu
165 250 936 DNA Glycine max 250 atggccgacg acggagcaac
ctcggcggcg acgagccgga gaaagccgtc gtggagagaa 60agagaaaaca acaggagaag
agaaagaagg agaagagcaa tagctgcgaa aatatactct 120ggacttcgag ctcaggggaa
cttcaacttg ccaaagcact gcgacaacaa cgaagttctg 180aaagctctct gcgcagaagc
tggttggtgc gtggaagaag acggaaccac ttatcgcaag 240ggttgcaagc cacctctggc
caatggtgca gggagctcca tgagaaacat taccttttct 300tcttctcaaa atccaagtcc
tctgtcttcg tcatttccca gcccaattcc ttcataccaa 360gtgagccctt cctcctcctc
tttcccgagc ccgtttcgtt tagatgtgga taaggacaat 420gtatcaaacc tcattccata
cattcgcaat gcgtccttgt ctcttcctcc tctcaggata 480tcaaacagtg cccctgtgac
accacctctt tcatcaccaa catcaagaaa tccaaaacca 540attcctactt gggagtctat
tgccaaagaa tccatggcct ccttcagtta ccctttcttt 600gcagcttctg cccctgctag
ccccacacac cgtcaccttt acactccgcc cactattcca 660gaatgcgatg aatccgatac
ttccaccggc gagtctggcc agtgggtgaa attccaagca 720tttgctcctt cttcatctgt
gttgccaatt tctccaacct ttaatcttgt taaacctgtg 780attccgcaca ggatgcctga
taactcaatc caagtgatga ggacgagttc agaagagttt 840ggggtacagg taaagccttg
ggttggggaa aaaattcatg aagtggcatt ggatgattta 900gagctcacac ttggaagtgg
gaaggtgcgg agttag 936 251 311 PRT Glycine max
251 Met Ala Asp Asp Gly Ala Thr Ser Ala Ala Thr Ser Arg Arg Lys Pro 1
5 10 15 Ser Trp Arg Glu
Arg Glu Asn Asn Arg Arg Arg Glu Arg Arg Arg Arg 20
25 30 Ala Ile Ala Ala Lys Ile Tyr Ser Gly
Leu Arg Ala Gln Gly Asn Phe 35 40
45 Asn Leu Pro Lys His Cys Asp Asn Asn Glu Val Leu Lys Ala
Leu Cys 50 55 60
Ala Glu Ala Gly Trp Cys Val Glu Glu Asp Gly Thr Thr Tyr Arg Lys 65
70 75 80 Gly Cys Lys Pro Pro
Leu Ala Asn Gly Ala Gly Ser Ser Met Arg Asn 85
90 95 Ile Thr Phe Ser Ser Ser Gln Asn Pro Ser
Pro Leu Ser Ser Ser Phe 100 105
110 Pro Ser Pro Ile Pro Ser Tyr Gln Val Ser Pro Ser Ser Ser Ser
Phe 115 120 125 Pro
Ser Pro Phe Arg Leu Asp Val Asp Lys Asp Asn Val Ser Asn Leu 130
135 140 Ile Pro Tyr Ile Arg Asn
Ala Ser Leu Ser Leu Pro Pro Leu Arg Ile 145 150
155 160 Ser Asn Ser Ala Pro Val Thr Pro Pro Leu Ser
Ser Pro Thr Ser Arg 165 170
175 Asn Pro Lys Pro Ile Pro Thr Trp Glu Ser Ile Ala Lys Glu Ser Met
180 185 190 Ala Ser
Phe Ser Tyr Pro Phe Phe Ala Ala Ser Ala Pro Ala Ser Pro 195
200 205 Thr His Arg His Leu Tyr Thr
Pro Pro Thr Ile Pro Glu Cys Asp Glu 210 215
220 Ser Asp Thr Ser Thr Gly Glu Ser Gly Gln Trp Val
Lys Phe Gln Ala 225 230 235
240 Phe Ala Pro Ser Ser Ser Val Leu Pro Ile Ser Pro Thr Phe Asn Leu
245 250 255 Val Lys Pro
Val Ile Pro His Arg Met Pro Asp Asn Ser Ile Gln Val 260
265 270 Met Arg Thr Ser Ser Glu Glu Phe
Gly Val Gln Val Lys Pro Trp Val 275 280
285 Gly Glu Lys Ile His Glu Val Ala Leu Asp Asp Leu Glu
Leu Thr Leu 290 295 300
Gly Ser Gly Lys Val Arg Ser 305 310
252 1005 DNA Glycine max 252 atgacgtccg tcgcgaggca gccaacgtgg aaggagcggg
agaacaacaa gaggcgagag 60aggaggagga gagccatcgc ggcgaagatc ttctccggcc
tgcgaatgta cggcaactac 120aagctcccca aacactgcga caacaacgaa gttctcaagg
ctctctgcaa cgaagccggc 180tggaccgtag aagccgatgg caccacctat cgcaagggat
gcaagcctcc tgttgaacgc 240atggacatag taggtggttc tgcagcagca agcccatgct
catcttacca tccaagtccc 300tgtgcttcct acaaccctag tccaggctct tcttgcttac
cgagccctcg cgcatccccc 360tttcctccaa acccaaatgc tgatggcaat tctctcattc
catggctcaa aaacctttca 420tcaggatcat catcggcatc ctcttccaag cttccgcagc
tgtacattcc gaatggctcc 480atcagtgctc cagtcactcc tccaatcagc tctccatcat
cccgaaagcc ccgaatcaaa 540gctgactggg aggatctgtc cactcgtccg gcagcgtggg
gcggacctgc atacaccttc 600ctgccctctt caactcctcc tagccccggt cgccaggttg
ctgaaacaga ttggttttcc 660aagatcagga ttcctcaggt aggactaaca ccaacttctc
caaccttcag cctggtctct 720tccaacccat ttggcttcaa ggaagatgct atgggtggca
gcggttcccg catgtggacg 780acaccagggg caagtggaac atgttctcca gccgtagctg
caggctctga aaacacttct 840gacattccaa tggctgaagc agtttcagat gaatttgcct
ttggaagcag ctcatcaggt 900ttagtgaatg cctggaaagg agagaggatc catgaagctt
cttttggaac agatgatctt 960gagctcactc ttgggagctc caagaccagg ttgctccata
agtga 1005 253 334 PRT Glycine max 253 Met Thr Ser Val Ala
Arg Gln Pro Thr Trp Lys Glu Arg Glu Asn Asn 1 5
10 15 Lys Arg Arg Glu Arg Arg Arg Arg Ala Ile
Ala Ala Lys Ile Phe Ser 20 25
30 Gly Leu Arg Met Tyr Gly Asn Tyr Lys Leu Pro Lys His Cys Asp
Asn 35 40 45 Asn
Glu Val Leu Lys Ala Leu Cys Asn Glu Ala Gly Trp Thr Val Glu 50
55 60 Ala Asp Gly Thr Thr Tyr
Arg Lys Gly Cys Lys Pro Pro Val Glu Arg 65 70
75 80 Met Asp Ile Val Gly Gly Ser Ala Ala Ala Ser
Pro Cys Ser Ser Tyr 85 90
95 His Pro Ser Pro Cys Ala Ser Tyr Asn Pro Ser Pro Gly Ser Ser Cys
100 105 110 Leu Pro
Ser Pro Arg Ala Ser Pro Phe Pro Pro Asn Pro Asn Ala Asp 115
120 125 Gly Asn Ser Leu Ile Pro Trp
Leu Lys Asn Leu Ser Ser Gly Ser Ser 130 135
140 Ser Ala Ser Ser Ser Lys Leu Pro Gln Leu Tyr Ile
Pro Asn Gly Ser 145 150 155
160 Ile Ser Ala Pro Val Thr Pro Pro Ile Ser Ser Pro Ser Ser Arg Lys
165 170 175 Pro Arg Ile
Lys Ala Asp Trp Glu Asp Leu Ser Thr Arg Pro Ala Ala 180
185 190 Trp Gly Gly Pro Ala Tyr Thr Phe
Leu Pro Ser Ser Thr Pro Pro Ser 195 200
205 Pro Gly Arg Gln Val Ala Glu Thr Asp Trp Phe Ser Lys
Ile Arg Ile 210 215 220
Pro Gln Val Gly Leu Thr Pro Thr Ser Pro Thr Phe Ser Leu Val Ser 225
230 235 240 Ser Asn Pro Phe
Gly Phe Lys Glu Asp Ala Met Gly Gly Ser Gly Ser 245
250 255 Arg Met Trp Thr Thr Pro Gly Ala Ser
Gly Thr Cys Ser Pro Ala Val 260 265
270 Ala Ala Gly Ser Glu Asn Thr Ser Asp Ile Pro Met Ala Glu
Ala Val 275 280 285
Ser Asp Glu Phe Ala Phe Gly Ser Ser Ser Ser Gly Leu Val Asn Ala 290
295 300 Trp Lys Gly Glu Arg
Ile His Glu Ala Ser Phe Gly Thr Asp Asp Leu 305 310
315 320 Glu Leu Thr Leu Gly Ser Ser Lys Thr Arg
Leu Leu His Lys 325 330
254 933 DNA Glycine max 254 atgaccggcg gcggatcgac ggggaggttg ccaacatgga
aggagagaga aaacaacaag 60aggagagaga ggagacgaag agcgatcgcc gctaagatct
acaccggtct tcgagctcag 120gggaactaca agcttccgaa gcactgtgac aacaacgagg
tcctgaaagc tctatgcgcc 180gaagctggtt ggatcgtgga agaggatggc accacttatc
gaaagggatg caagagaccc 240agcgcgagtg agattggagg aacagtggca aacataagcg
cgtgttcttc gattcagcca 300agtccacaat cctcgtcata cccgagtcct gtaccatcct
accacgctag cccaacctct 360tcctcgttcc caagccccac gcgcattgac ggaaaccacc
cttcttcctt tctcatccca 420ttcatccgca acataacttc catccccgcc aacctccctc
ctctcaggat atccaacagc 480gcccccgtca ccccacctct ctcctctcct cgtagctcta
agcgcaaggc tgatttcgac 540tccctccata acgcctccct ccgccaccct ctttttgaca
cctccgcccc ttccagcccc 600tctcgccgcc accacctcgc cacctccacc atcccggagt
gcgatgagtc cgacgcctcc 660accgttgact ccgcctctgg gcgctgggtt agttttcagg
ttcagacgac gatggcggcc 720gctcctcctt ctcccacctt taacctcatg aaacccgcca
tgcagcagat cgctgcccag 780gaaggcatgc tgtggggttc tgttgcggag cgagtcagag
gaggctccga ttttgacttc 840gagaatggca gagtcaaacc ctgggagggt gagagaatac
acgaggttgg aatggatgat 900ttggagctta ctctaggagt tggaaaggct tga
933 255 310 PRT Glycine max 255 Met Thr Gly Gly Gly Ser
Thr Gly Arg Leu Pro Thr Trp Lys Glu Arg 1 5
10 15 Glu Asn Asn Lys Arg Arg Glu Arg Arg Arg Arg
Ala Ile Ala Ala Lys 20 25
30 Ile Tyr Thr Gly Leu Arg Ala Gln Gly Asn Tyr Lys Leu Pro Lys
His 35 40 45 Cys
Asp Asn Asn Glu Val Leu Lys Ala Leu Cys Ala Glu Ala Gly Trp 50
55 60 Ile Val Glu Glu Asp Gly
Thr Thr Tyr Arg Lys Gly Cys Lys Arg Pro 65 70
75 80 Ser Ala Ser Glu Ile Gly Gly Thr Val Ala Asn
Ile Ser Ala Cys Ser 85 90
95 Ser Ile Gln Pro Ser Pro Gln Ser Ser Ser Tyr Pro Ser Pro Val Pro
100 105 110 Ser Tyr
His Ala Ser Pro Thr Ser Ser Ser Phe Pro Ser Pro Thr Arg 115
120 125 Ile Asp Gly Asn His Pro Ser
Ser Phe Leu Ile Pro Phe Ile Arg Asn 130 135
140 Ile Thr Ser Ile Pro Ala Asn Leu Pro Pro Leu Arg
Ile Ser Asn Ser 145 150 155
160 Ala Pro Val Thr Pro Pro Leu Ser Ser Pro Arg Ser Ser Lys Arg Lys
165 170 175 Ala Asp Phe
Asp Ser Leu His Asn Ala Ser Leu Arg His Pro Leu Phe 180
185 190 Asp Thr Ser Ala Pro Ser Ser Pro
Ser Arg Arg His His Leu Ala Thr 195 200
205 Ser Thr Ile Pro Glu Cys Asp Glu Ser Asp Ala Ser Thr
Val Asp Ser 210 215 220
Ala Ser Gly Arg Trp Val Ser Phe Gln Val Gln Thr Thr Met Ala Ala 225
230 235 240 Ala Pro Pro Ser
Pro Thr Phe Asn Leu Met Lys Pro Ala Met Gln Gln 245
250 255 Ile Ala Ala Gln Glu Gly Met Leu Trp
Gly Ser Val Ala Glu Arg Val 260 265
270 Arg Gly Gly Ser Asp Phe Asp Phe Glu Asn Gly Arg Val Lys
Pro Trp 275 280 285
Glu Gly Glu Arg Ile His Glu Val Gly Met Asp Asp Leu Glu Leu Thr 290
295 300 Leu Gly Val Gly Lys
Ala 305 310 256 927 DNA Glycine max 256 atgaccggcg gcggatccac
ggggaggttg ccgacgtgga aggagagaga gaacaacaag 60aggagagaga gaagacgaag
agcgattgca gctaagatct acactggcct tcgagcccag 120gggaactaca agcttccaaa
gcactgcgac aacaacgagg tcctgaaagc tctctgcgcc 180gaagctggct ggatcgtgga
agaagatggc acaacttatc gaaagggatg taagagaccc 240acgagtgaga ttggaggaac
accactgaac ttaagcgcgt gttcttccat tcaggcaagt 300ccacaatcct cgtcataccc
gagtcctgta ccatcctacc atgctagccc aacctcttcc 360tcgttcccaa gccccacgcg
cattgacgga aaccaccctt cttcctttct catcccattc 420atccgcaaca taacttccat
ccccgccaac ctccctcctc tcaggatatc caacagcgcc 480cccgtcaccc cacctctttc
ttctccccga agctcaaagc gcaaggcgga tttcgactcc 540ctccgccacc ctctttttgc
cacctccgcc ccgtccagcc ccacgcgccg ccaccacgtt 600gccacctcca ccatcccgga
gtgcgacgag tccgacgcct ccaccgtgga ctccgcctcg 660ggccgctggg ttagtttcca
ggttcagacg acgatggtgg ctgcggcggc ggctgctcct 720ccttcgccta cctttaacct
catgaagccc gcgatgcagc agatcgctgc ccaggaaggc 780atgcagtggg gttctgttgc
cgagagaggc agaggaggct ccgattttga cttcgagaat 840ggcagagtga aaccctggga
gggtgagaga atacacgagg ttggaatgga tgatttggag 900cttactctag gagttggaaa
ggcttga 927 257 308 PRT Glycine max
257 Met Thr Gly Gly Gly Ser Thr Gly Arg Leu Pro Thr Trp Lys Glu Arg 1
5 10 15 Glu Asn Asn Lys
Arg Arg Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys 20
25 30 Ile Tyr Thr Gly Leu Arg Ala Gln Gly
Asn Tyr Lys Leu Pro Lys His 35 40
45 Cys Asp Asn Asn Glu Val Leu Lys Ala Leu Cys Ala Glu Ala
Gly Trp 50 55 60
Ile Val Glu Glu Asp Gly Thr Thr Tyr Arg Lys Gly Cys Lys Arg Pro 65
70 75 80 Thr Ser Glu Ile Gly
Gly Thr Pro Leu Asn Leu Ser Ala Cys Ser Ser 85
90 95 Ile Gln Ala Ser Pro Gln Ser Ser Ser Tyr
Pro Ser Pro Val Pro Ser 100 105
110 Tyr His Ala Ser Pro Thr Ser Ser Ser Phe Pro Ser Pro Thr Arg
Ile 115 120 125 Asp
Gly Asn His Pro Ser Ser Phe Leu Ile Pro Phe Ile Arg Asn Ile 130
135 140 Thr Ser Ile Pro Ala Asn
Leu Pro Pro Leu Arg Ile Ser Asn Ser Ala 145 150
155 160 Pro Val Thr Pro Pro Leu Ser Ser Pro Arg Ser
Ser Lys Arg Lys Ala 165 170
175 Asp Phe Asp Ser Leu Arg His Pro Leu Phe Ala Thr Ser Ala Pro Ser
180 185 190 Ser Pro
Thr Arg Arg His His Val Ala Thr Ser Thr Ile Pro Glu Cys 195
200 205 Asp Glu Ser Asp Ala Ser Thr
Val Asp Ser Ala Ser Gly Arg Trp Val 210 215
220 Ser Phe Gln Val Gln Thr Thr Met Val Ala Ala Ala
Ala Ala Ala Pro 225 230 235
240 Pro Ser Pro Thr Phe Asn Leu Met Lys Pro Ala Met Gln Gln Ile Ala
245 250 255 Ala Gln Glu
Gly Met Gln Trp Gly Ser Val Ala Glu Arg Gly Arg Gly 260
265 270 Gly Ser Asp Phe Asp Phe Glu Asn
Gly Arg Val Lys Pro Trp Glu Gly 275 280
285 Glu Arg Ile His Glu Val Gly Met Asp Asp Leu Glu Leu
Thr Leu Gly 290 295 300
Val Gly Lys Ala 305 258 1053 DNA Hordeum vulgare 258 atggcgacgg
ggggtggagg aggagcggac ttcggggcgg cggggggagc gggcggcagg 60atgccgacgt
ggagggagcg ggagaacaac aagcggaggg agcggcggcg gcgggcgatc 120gccgccaaga
tattctccgg cctgcgggcg cacggcgggt acaagctccc caagcactgc 180gacaacaacg
aggtcctcaa ggccctctgc aacgaggccg gctgggtcgt cgagcccgac 240ggcaccacct
accgcaaggg atgcagacct gcagagcgca tggatgggat tgggtgctcc 300gtgtcgccaa
gcccatgttc ctcatatcag ccaagcccgc gggcatcata caacgcgagc 360cctacctcct
cttcattccc cagcggcgca tcgtcgccct tcctcccaca ttctaacaac 420atggtaaatg
gcgtcgatgc aactcccatc ctaccatggc tgcaaacgtt ctccaattcg 480aataagcggc
cgcatcttcc cccgctgctg attcacggcg gctccattag cgccccggtg 540actcctccac
tgagctcacc gactgcccgc acccctcgca tgaagacgga ctgggacgag 600tcggtgatcc
agccaccatg gcacggttca aacagtccct gcgtggtgaa ctccaccccg 660ccgagccccg
ggcgtcaaat ggttcctgac ccagcatggc tggccggcat ccagatctcg 720tcaacgagcc
cttcatcgcc cacgtttagt ctcatgtcct ccaacccatt cagcgtcttc 780aaagaagcga
tcccgggcgg cggttcgtcc aggatgtgca cgccggggca gagcggcacg 840tgctcgccgg
tgatccccgg catggcgcgg cacccggacg ttcacatgat ggacgtggtt 900tctgacgagt
ttgcatttgg aagcagcacc aacggtgttg ctcagcaggc caccgccgga 960ttggtgaggg
cgtgggaggg cgagaggatc cacgaggact ctgggtcgga cgagctggag 1020ctcactctcg
ggagcaccag gacgaggagc tga
1053 259 350 PRT Hordeum vulgare 259 Met Ala Thr Gly Gly Gly Gly Gly Ala Asp
Phe Gly Ala Ala Gly Gly 1 5 10
15 Ala Gly Gly Arg Met Pro Thr Trp Arg Glu Arg Glu Asn Asn Lys
Arg 20 25 30 Arg
Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile Phe Ser Gly Leu 35
40 45 Arg Ala His Gly Gly Tyr
Lys Leu Pro Lys His Cys Asp Asn Asn Glu 50 55
60 Val Leu Lys Ala Leu Cys Asn Glu Ala Gly Trp
Val Val Glu Pro Asp 65 70 75
80 Gly Thr Thr Tyr Arg Lys Gly Cys Arg Pro Ala Glu Arg Met Asp Gly
85 90 95 Ile Gly
Cys Ser Val Ser Pro Ser Pro Cys Ser Ser Tyr Gln Pro Ser 100
105 110 Pro Arg Ala Ser Tyr Asn Ala
Ser Pro Thr Ser Ser Ser Phe Pro Ser 115 120
125 Gly Ala Ser Ser Pro Phe Leu Pro His Ser Asn Asn
Met Val Asn Gly 130 135 140
Val Asp Ala Thr Pro Ile Leu Pro Trp Leu Gln Thr Phe Ser Asn Ser 145
150 155 160 Asn Lys Arg
Pro His Leu Pro Pro Leu Leu Ile His Gly Gly Ser Ile 165
170 175 Ser Ala Pro Val Thr Pro Pro Leu
Ser Ser Pro Thr Ala Arg Thr Pro 180 185
190 Arg Met Lys Thr Asp Trp Asp Glu Ser Val Ile Gln Pro
Pro Trp His 195 200 205
Gly Ser Asn Ser Pro Cys Val Val Asn Ser Thr Pro Pro Ser Pro Gly 210
215 220 Arg Gln Met Val
Pro Asp Pro Ala Trp Leu Ala Gly Ile Gln Ile Ser 225 230
235 240 Ser Thr Ser Pro Ser Ser Pro Thr Phe
Ser Leu Met Ser Ser Asn Pro 245 250
255 Phe Ser Val Phe Lys Glu Ala Ile Pro Gly Gly Gly Ser Ser
Arg Met 260 265 270
Cys Thr Pro Gly Gln Ser Gly Thr Cys Ser Pro Val Ile Pro Gly Met
275 280 285 Ala Arg His Pro
Asp Val His Met Met Asp Val Val Ser Asp Glu Phe 290
295 300 Ala Phe Gly Ser Ser Thr Asn Gly
Val Ala Gln Gln Ala Thr Ala Gly 305 310
315 320 Leu Val Arg Ala Trp Glu Gly Glu Arg Ile His Glu
Asp Ser Gly Ser 325 330
335 Asp Glu Leu Glu Leu Thr Leu Gly Ser Thr Arg Thr Arg Ser
340 345 350 260 1002 DNA Lycopersicon
esculentum 260 atgatgtggg aagctggaga atcaccagca tcttcttcgg ccggtgccgg
agctggtgga 60agtggaggtg ccggagttgg tttaccggaa agtggtggtg gtggtggtgg
tgggagaagg 120aaaccatcat ggagagaaag agagaataac aggagaagag agaggaggag
gagagctgta 180gctgctaaga tttatactgg tttaagagct caaggaaact ataatcttcc
gaagcactgt 240gataacaatg aagttcttaa agctctttgt actgaagctg gttggatcgt
tgaacctgat 300ggtaccactt atcgcaaggg atgcaagcca accccgatgg agattggagg
cacttcaaca 360aacatcacgc caagttcttc acggcatcca agtcccccat catcatactt
tgctagccca 420attccatctt atcagccaag tccaacttcc tcttctttcc ccagtccatc
tcgtgctgat 480gccaacatgt tatcacatcc atattctttt ctccaaaatg tcgttccttc
atcccttcct 540ccattacgaa tatcaaacag tgcccctgta actccacctc tttcatcacc
aactaggcat 600cctaagcaaa ctttcaattt agaaactttg gccaaagaat caatgtttgc
tttaaacatc 660cctttctttg ctgcttcagc cccagcaagc ccaactaggg ttcagcgttt
tactcctcca 720actatacccg agtgtgatga atctgactca tctaccattg attcaggcca
gtggatcaac 780tttcaaaagt atgcgtcaaa tgttccacct tctccaacat ttaatcttgt
aaaacctgtg 840cctcagccgc ttcgtcctaa tgatatgatc acagacaagg gtaagagcat
agacttcgac 900tttgaaaatg tatcagtcaa ggcatgggaa ggtgaaagga ttcacgatgt
aggattcgat 960gatctggaac tcacacttgg aagtggcaat gctcgcatat ga
1002 261 333 PRT Lycopersicon esculentum 261 Met Met Trp Glu Ala
Gly Glu Ser Pro Ala Ser Ser Ser Ala Gly Ala 1 5
10 15 Gly Ala Gly Gly Ser Gly Gly Ala Gly Val
Gly Leu Pro Glu Ser Gly 20 25
30 Gly Gly Gly Gly Gly Gly Arg Arg Lys Pro Ser Trp Arg Glu Arg
Glu 35 40 45 Asn
Asn Arg Arg Arg Glu Arg Arg Arg Arg Ala Val Ala Ala Lys Ile 50
55 60 Tyr Thr Gly Leu Arg Ala
Gln Gly Asn Tyr Asn Leu Pro Lys His Cys 65 70
75 80 Asp Asn Asn Glu Val Leu Lys Ala Leu Cys Thr
Glu Ala Gly Trp Ile 85 90
95 Val Glu Pro Asp Gly Thr Thr Tyr Arg Lys Gly Cys Lys Pro Thr Pro
100 105 110 Met Glu
Ile Gly Gly Thr Ser Thr Asn Ile Thr Pro Ser Ser Ser Arg 115
120 125 His Pro Ser Pro Pro Ser Ser
Tyr Phe Ala Ser Pro Ile Pro Ser Tyr 130 135
140 Gln Pro Ser Pro Thr Ser Ser Ser Phe Pro Ser Pro
Ser Arg Ala Asp 145 150 155
160 Ala Asn Met Leu Ser His Pro Tyr Ser Phe Leu Gln Asn Val Val Pro
165 170 175 Ser Ser Leu
Pro Pro Leu Arg Ile Ser Asn Ser Ala Pro Val Thr Pro 180
185 190 Pro Leu Ser Ser Pro Thr Arg His
Pro Lys Gln Thr Phe Asn Leu Glu 195 200
205 Thr Leu Ala Lys Glu Ser Met Phe Ala Leu Asn Ile Pro
Phe Phe Ala 210 215 220
Ala Ser Ala Pro Ala Ser Pro Thr Arg Val Gln Arg Phe Thr Pro Pro 225
230 235 240 Thr Ile Pro Glu
Cys Asp Glu Ser Asp Ser Ser Thr Ile Asp Ser Gly 245
250 255 Gln Trp Ile Asn Phe Gln Lys Tyr Ala
Ser Asn Val Pro Pro Ser Pro 260 265
270 Thr Phe Asn Leu Val Lys Pro Val Pro Gln Pro Leu Arg Pro
Asn Asp 275 280 285
Met Ile Thr Asp Lys Gly Lys Ser Ile Asp Phe Asp Phe Glu Asn Val 290
295 300 Ser Val Lys Ala Trp
Glu Gly Glu Arg Ile His Asp Val Gly Phe Asp 305 310
315 320 Asp Leu Glu Leu Thr Leu Gly Ser Gly Asn
Ala Arg Ile 325 330
262 687 DNA Lycopersicon esculentum misc_feature (481)..(481) n is a, c, g, or
t 262 atgacggccg gcaccggcgg tggaggatcg tcgggaaggt tgccgacgtg gaaggagagg
60gaaaacaata agagaagaga gaggagaaga agagctattg ccgccaaaat atttactggc
120ttacgaactc aaggtaactt caagcttcca aaacactgtg ataataacga ggtcttgaaa
180gctctatgta ttgaagctgg ttggatcgtt gaagatgatg gcaccactta tcgcaaggga
240cacaggcctc caccaattga aaatggatgt gtctctatga atatcagtgc atcttcatcg
300attcagccta gcccaatgtc atcctctttc cccagtcctg taccttctta ccatgccagc
360ccaacatcat cctcatttcc tagtccctcc cgttgtgacg ggaacccctc atcatacatc
420cttccctttc tccataactt agcttccatt ccctctactc tgccacctct tcgtatatct
480natagtgccc ctgttacccc tcctctttct tctcctaccc gacgttcaaa gccccccaaa
540cctttatggg aatccctctc cngggttcca ttnaattctt tccagcaccc actttttgct
600gcttctgcac catcaagtcc cactcgacgc cnannctcnn gcctgctaca attccnnaan
660gngatgagtc tgangccgcn canttga
687 263 228 PRT Lycopersicon esculentum misc_feature (161)..(161) Xaa can be any
naturally occurring amino acid 263 Met Thr Ala Gly Thr Gly Gly Gly Gly Ser
Ser Gly Arg Leu Pro Thr 1 5 10
15 Trp Lys Glu Arg Glu Asn Asn Lys Arg Arg Glu Arg Arg Arg Arg
Ala 20 25 30 Ile
Ala Ala Lys Ile Phe Thr Gly Leu Arg Thr Gln Gly Asn Phe Lys 35
40 45 Leu Pro Lys His Cys Asp
Asn Asn Glu Val Leu Lys Ala Leu Cys Ile 50 55
60 Glu Ala Gly Trp Ile Val Glu Asp Asp Gly Thr
Thr Tyr Arg Lys Gly 65 70 75
80 His Arg Pro Pro Pro Ile Glu Asn Gly Cys Val Ser Met Asn Ile Ser
85 90 95 Ala Ser
Ser Ser Ile Gln Pro Ser Pro Met Ser Ser Ser Phe Pro Ser 100
105 110 Pro Val Pro Ser Tyr His Ala
Ser Pro Thr Ser Ser Ser Phe Pro Ser 115 120
125 Pro Ser Arg Cys Asp Gly Asn Pro Ser Ser Tyr Ile
Leu Pro Phe Leu 130 135 140
His Asn Leu Ala Ser Ile Pro Ser Thr Leu Pro Pro Leu Arg Ile Ser 145
150 155 160 Xaa Ser Ala
Pro Val Thr Pro Pro Leu Ser Ser Pro Thr Arg Arg Ser 165
170 175 Lys Pro Pro Lys Pro Leu Trp Glu
Ser Leu Ser Xaa Val Pro Xaa Asn 180 185
190 Ser Phe Gln His Pro Leu Phe Ala Ala Ser Ala Pro Ser
Ser Pro Thr 195 200 205
Arg Arg Xaa Xaa Ser Xaa Leu Leu Gln Phe Xaa Xaa Xaa Met Ser Leu 210
215 220 Xaa Pro Xaa Xaa
225 264 960 DNA Lycopersicon esculentum 264 atgacggccg gcaccggcgg
tggaggatcg tcgggaaggt tgccgacgtg gaaggagagg 60gaaaacaata agagaagaga
gaggagaaga agagctattg ccgccaaaat atttactggc 120ttacgaactc aaggtaactt
caagcttcca aaacactgtg ataataacga ggtcttgaaa 180gctctatgta ttgaagctgg
ttggatcgtt gaagatgatg gcaccactta tcgcaaggga 240cacaggcctc caccaattga
aaatggatgt gtctctatga atatcagtgc atcttcatcg 300attcagccta gcccaatgtc
atcctctttc cccagtcctg taccttctta ccatgccagc 360ccaacatcat cctcatttcc
tagtccctcc cgttgtgacg ggaacccctc atcatacatc 420cttccctttc tccataactt
agcttccatt ccctctactc tgccacctct tcgtatatct 480aatagtgccc ctgttacccc
tcctctttct tctcctaccc gacgttcaaa gccccccaaa 540cctttatggg aatccctctc
cagggttcca ttgaattctt tccagcaccc actttttgct 600gcttctgcac catcaagtcc
cactcgacgc cgatactcta agcctgctac aattccagaa 660tgtgatgagt ctgatgccgc
ctcagttgaa tctgcacgct gggtcagctt ccagacggtg 720gcagctccaa cttcgcctac
ttttaacctt gtaaaacctc ttcctcagca gaacattctc 780ttagatgcct taagtggaca
tggaatggtt ggctggggcg aaacagcagc tcaaaaggga 840catggcgctg aatttgattt
tgagagctgt aaagtgaagg catgggaagg tgagagaata 900catgaagttg ctgtggatga
tctagagctc actcttggta gtgcaaaagc acgtgcttaa 960 265 319 PRT Lycopersicon
esculentum 265 Met Thr Ala Gly Thr Gly Gly Gly Gly Ser Ser Gly Arg Leu Pro
Thr 1 5 10 15 Trp
Lys Glu Arg Glu Asn Asn Lys Arg Arg Glu Arg Arg Arg Arg Ala
20 25 30 Ile Ala Ala Lys Ile
Phe Thr Gly Leu Arg Thr Gln Gly Asn Phe Lys 35
40 45 Leu Pro Lys His Cys Asp Asn Asn Glu
Val Leu Lys Ala Leu Cys Ile 50 55
60 Glu Ala Gly Trp Ile Val Glu Asp Asp Gly Thr Thr Tyr
Arg Lys Gly 65 70 75
80 His Arg Pro Pro Pro Ile Glu Asn Gly Cys Val Ser Met Asn Ile Ser
85 90 95 Ala Ser Ser Ser
Ile Gln Pro Ser Pro Met Ser Ser Ser Phe Pro Ser 100
105 110 Pro Val Pro Ser Tyr His Ala Ser Pro
Thr Ser Ser Ser Phe Pro Ser 115 120
125 Pro Ser Arg Cys Asp Gly Asn Pro Ser Ser Tyr Ile Leu Pro
Phe Leu 130 135 140
His Asn Leu Ala Ser Ile Pro Ser Thr Leu Pro Pro Leu Arg Ile Ser 145
150 155 160 Asn Ser Ala Pro Val
Thr Pro Pro Leu Ser Ser Pro Thr Arg Arg Ser 165
170 175 Lys Pro Pro Lys Pro Leu Trp Glu Ser Leu
Ser Arg Val Pro Leu Asn 180 185
190 Ser Phe Gln His Pro Leu Phe Ala Ala Ser Ala Pro Ser Ser Pro
Thr 195 200 205 Arg
Arg Arg Tyr Ser Lys Pro Ala Thr Ile Pro Glu Cys Asp Glu Ser 210
215 220 Asp Ala Ala Ser Val Glu
Ser Ala Arg Trp Val Ser Phe Gln Thr Val 225 230
235 240 Ala Ala Pro Thr Ser Pro Thr Phe Asn Leu Val
Lys Pro Leu Pro Gln 245 250
255 Gln Asn Ile Leu Leu Asp Ala Leu Ser Gly His Gly Met Val Gly Trp
260 265 270 Gly Glu
Thr Ala Ala Gln Lys Gly His Gly Ala Glu Phe Asp Phe Glu 275
280 285 Ser Cys Lys Val Lys Ala Trp
Glu Gly Glu Arg Ile His Glu Val Ala 290 295
300 Val Asp Asp Leu Glu Leu Thr Leu Gly Ser Ala Lys
Ala Arg Ala 305 310 315
266 999 DNA Lycopersicon esculentum
266 atgtgggaag ctggagaatc accagcatct
tcttcggccg gtgccggagc tggtggaagt 60ggaggtgccg gagttggttt accggaaagt
ggtggtggtg gtggtggtgg gagaaggaaa 120ccatcatgga gagaaagaga gaataacagg
agaagagaga ggaggaggag agctgtagct 180gctaagattt atactggttt aagagctcaa
ggaaactata atcttccgaa gcactgtgat 240aacaatgaag ttcttaaagc tctttgtact
gaagctggtt ggatcgttga acctgatggt 300accacttatc gcaagggatg caagccaacc
ccgatggaga ttggaggcac ttcaacaaac 360atcacgccaa gttcttcacg gcatccaagt
cccccatcat catactttgc tagcccaatt 420ccatcttatc agccaagtcc aacttcctct
tctttcccca gtccatctcg tgctgatgcc 480aacatgtcat cacatccata ttcttttctc
caaaatgtcg ttccttcatc ccttccccca 540ttacgaatat caaacagtgc ccctgtaact
ccacctcttt catcaccaac taggcatcct 600aagcaaactt tcaatttaga aactttggcc
aaagaaacaa tgtttgcttt aaacatcccc 660ttccttgctg cttcagcccc agcaagccca
actaggggtc agcgttttac tcctccaact 720atacccgagt gtgatgaatc tgactcatct
accattgatt caggccagtg gatcaacttt 780caaaagtatg cgtcaaatgt tccaccttct
ccaacattta atcttgtaaa acctgtgcct 840cagccgcttc gtcctaatga tatgatcaca
gacaagggta agagcataga cttcgacttt 900gaaaatgtat cagtcaaggc atgggaaggt
gaaaggattc acgatgtagg attcgatgat 960ctggaactca cacttggaag tggcaatgct
cgcatatga 999
267 332 PRT Lycopersicon esculentum
267 Met Trp Glu Ala Gly Glu Ser Pro Ala Ser Ser Ser Ala Gly Ala Gly 1
5 10 15 Ala Gly Gly Ser
Gly Gly Ala Gly Val Gly Leu Pro Glu Ser Gly Gly 20
25 30 Gly Gly Gly Gly Gly Arg Arg Lys Pro
Ser Trp Arg Glu Arg Glu Asn 35 40
45 Asn Arg Arg Arg Glu Arg Arg Arg Arg Ala Val Ala Ala Lys
Ile Tyr 50 55 60
Thr Gly Leu Arg Ala Gln Gly Asn Tyr Asn Leu Pro Lys His Cys Asp 65
70 75 80 Asn Asn Glu Val Leu
Lys Ala Leu Cys Thr Glu Ala Gly Trp Ile Val 85
90 95 Glu Pro Asp Gly Thr Thr Tyr Arg Lys Gly
Cys Lys Pro Thr Pro Met 100 105
110 Glu Ile Gly Gly Thr Ser Thr Asn Ile Thr Pro Ser Ser Ser Arg
His 115 120 125 Pro
Ser Pro Pro Ser Ser Tyr Phe Ala Ser Pro Ile Pro Ser Tyr Gln 130
135 140 Pro Ser Pro Thr Ser Ser
Ser Phe Pro Ser Pro Ser Arg Ala Asp Ala 145 150
155 160 Asn Met Ser Ser His Pro Tyr Ser Phe Leu Gln
Asn Val Val Pro Ser 165 170
175 Ser Leu Pro Pro Leu Arg Ile Ser Asn Ser Ala Pro Val Thr Pro Pro
180 185 190 Leu Ser
Ser Pro Thr Arg His Pro Lys Gln Thr Phe Asn Leu Glu Thr 195
200 205 Leu Ala Lys Glu Thr Met Phe
Ala Leu Asn Ile Pro Phe Leu Ala Ala 210 215
220 Ser Ala Pro Ala Ser Pro Thr Arg Gly Gln Arg Phe
Thr Pro Pro Thr 225 230 235
240 Ile Pro Glu Cys Asp Glu Ser Asp Ser Ser Thr Ile Asp Ser Gly Gln
245 250 255 Trp Ile Asn
Phe Gln Lys Tyr Ala Ser Asn Val Pro Pro Ser Pro Thr 260
265 270 Phe Asn Leu Val Lys Pro Val Pro
Gln Pro Leu Arg Pro Asn Asp Met 275 280
285 Ile Thr Asp Lys Gly Lys Ser Ile Asp Phe Asp Phe Glu
Asn Val Ser 290 295 300
Val Lys Ala Trp Glu Gly Glu Arg Ile His Asp Val Gly Phe Asp Asp 305
310 315 320 Leu Glu Leu Thr
Leu Gly Ser Gly Asn Ala Arg Ile 325 330
268 498 DNA Medicago truncatula
268 atgacatcag gtacaagact accaacatgg
aaggagagag agaacaacaa gaggagagag 60agaagaagaa gagctatagc tgctaagatc
ttctctggtt tgagaatgta tggtaacttt 120agattaccta aacattgtga taacaatgaa
gttcttaaag ctccttgtaa tgaagctggt 180tggactgttg aacctgatgg aaccacttat
cgtaagggat gcaagccttt agagaacatg 240gatatggttg gtggatcatc agctgcaagc
ccttgttcat cttaccatcc aagccccggc 300tcgtcttcct tcccgagtcc atcttcatcc
ccttacgctg caaatcgtaa cgctgatggt 360aattccctca ttccatggct caaaaacctc
tccacagctt catcttcagg atcatctccg 420aaacttcctc atccctactt tcatagtggc
tccatcagtg ctcctgtcac acctcccctg 480agctctccga cttcctaa
498269165PRTMedicago truncatula 269Met
Thr Ser Gly Thr Arg Leu Pro Thr Trp Lys Glu Arg Glu Asn Asn 1
5 10 15 Lys Arg Arg Glu Arg Arg
Arg Arg Ala Ile Ala Ala Lys Ile Phe Ser 20
25 30 Gly Leu Arg Met Tyr Gly Asn Phe Arg Leu
Pro Lys His Cys Asp Asn 35 40
45 Asn Glu Val Leu Lys Ala Pro Cys Asn Glu Ala Gly Trp Thr
Val Glu 50 55 60
Pro Asp Gly Thr Thr Tyr Arg Lys Gly Cys Lys Pro Leu Glu Asn Met 65
70 75 80 Asp Met Val Gly Gly
Ser Ser Ala Ala Ser Pro Cys Ser Ser Tyr His 85
90 95 Pro Ser Pro Gly Ser Ser Ser Phe Pro Ser
Pro Ser Ser Ser Pro Tyr 100 105
110 Ala Ala Asn Arg Asn Ala Asp Gly Asn Ser Leu Ile Pro Trp Leu
Lys 115 120 125 Asn
Leu Ser Thr Ala Ser Ser Ser Gly Ser Ser Pro Lys Leu Pro His 130
135 140 Pro Tyr Phe His Ser Gly
Ser Ile Ser Ala Pro Val Thr Pro Pro Leu 145 150
155 160 Ser Ser Pro Thr Ser 165
270 972 DNA Medicago truncatula
270 atgaccggcg gtggatcctc cgggagatta
ccaacatgga aggagagaga aaacaacaaa 60agaagagaga gaagaagaag agctattgct
gctaagatct attcaggttt acgagctcaa 120ggtaatttta agcttcctaa acactgtgat
aataatgaag tcttgaaagc tctttgttct 180gaagctggtt ggatcgttga agaagatggt
actacttatc gaaagggaag taagagacca 240ttaccaaatg agatgggagg aactcctaca
aatatgagtg cttgttcttc aatgcaacca 300agtccacaat cttcttcgtt cccaagtcca
caatcttctt cgttcccaag tccaatacca 360tcatacccta cgagtccaac tcgcatggat
ggaattacaa acccctcttc ctttctccta 420ccattcatcc gcaacataac ttcaatccca
acaaatcttc caccccttag gatttccaac 480agtgctcctg ttacgccacc tctttcttct
ccaagaagtt caaagcgaaa agcagatttt 540gaatcccttt gtaatggttc ctttaactcc
tcgtttcgcc accccctttt cgctacctct 600gcaccatcaa gcccctcgcg acgtaaccac
ttaccccctt ccaccattcc agaatgtgat 660gagtcagatg cttctacagt ggactctggt
cggtgggtta gttttcagac aacaactgcc 720catggtgcag ctcctccttc ccctactttt
aatcttatga aaccagcaat gcagatcact 780ccccagagtt cgatggatat gaaacatatg
aatgaagcca tgcaatggag tgcaggttca 840gctactgaga gaggtagagg ctcagatttt
gactttgaga atggcagagt tgtgaagccg 900tgggaaggtg agagaattca tgaggtagga
atggaagagt tggagcttac tctagggttt 960ggtaaggcct ga 972
271 323 PRT Medicago truncatula
271 Met
Thr Gly Gly Gly Ser Ser Gly Arg Leu Pro Thr Trp Lys Glu Arg 1
5 10 15 Glu Asn Asn Lys Arg Arg
Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys 20
25 30 Ile Tyr Ser Gly Leu Arg Ala Gln Gly Asn
Phe Lys Leu Pro Lys His 35 40
45 Cys Asp Asn Asn Glu Val Leu Lys Ala Leu Cys Ser Glu Ala
Gly Trp 50 55 60
Ile Val Glu Glu Asp Gly Thr Thr Tyr Arg Lys Gly Ser Lys Arg Pro 65
70 75 80 Leu Pro Asn Glu Met
Gly Gly Thr Pro Thr Asn Met Ser Ala Cys Ser 85
90 95 Ser Met Gln Pro Ser Pro Gln Ser Ser Ser
Phe Pro Ser Pro Gln Ser 100 105
110 Ser Ser Phe Pro Ser Pro Ile Pro Ser Tyr Pro Thr Ser Pro Thr
Arg 115 120 125 Met
Asp Gly Ile Thr Asn Pro Ser Ser Phe Leu Leu Pro Phe Ile Arg 130
135 140 Asn Ile Thr Ser Ile Pro
Thr Asn Leu Pro Pro Leu Arg Ile Ser Asn 145 150
155 160 Ser Ala Pro Val Thr Pro Pro Leu Ser Ser Pro
Arg Ser Ser Lys Arg 165 170
175 Lys Ala Asp Phe Glu Ser Leu Cys Asn Gly Ser Phe Asn Ser Ser Phe
180 185 190 Arg His
Pro Leu Phe Ala Thr Ser Ala Pro Ser Ser Pro Ser Arg Arg 195
200 205 Asn His Leu Pro Pro Ser Thr
Ile Pro Glu Cys Asp Glu Ser Asp Ala 210 215
220 Ser Thr Val Asp Ser Gly Arg Trp Val Ser Phe Gln
Thr Thr Thr Ala 225 230 235
240 His Gly Ala Ala Pro Pro Ser Pro Thr Phe Asn Leu Met Lys Pro Ala
245 250 255 Met Gln Ile
Thr Pro Gln Ser Ser Met Asp Met Lys His Met Asn Glu 260
265 270 Ala Met Gln Trp Ser Ala Gly Ser
Ala Thr Glu Arg Gly Arg Gly Ser 275 280
285 Asp Phe Asp Phe Glu Asn Gly Arg Val Val Lys Pro Trp
Glu Gly Glu 290 295 300
Arg Ile His Glu Val Gly Met Glu Glu Leu Glu Leu Thr Leu Gly Phe 305
310 315 320 Gly Lys Ala
272 522 DNA Medicago truncatula
272 atggcttctg acggagcaac ttcggcggcg
aattcaagtc gtcggaagcc gtcgtggagg 60gagagagaga acaacaggag gagagagaga
cggaggagag ctatagcggc aaagatttac 120gcgggattaa ggtctcaggg gaattataat
ttaccaaaac actgtgataa caatgaggtc 180ttgaaagctc tttgtgctga agctggttgg
actgttgaag aagatggcac cacttatcgc 240aggggatcaa gggcagaaac accaggcgat
ggtgcaggaa atttcaacag aaacaaccca 300ttttcatctc aaaatctaag tcctctttca
tcatcatttc caagtccaat cccttcctac 360caagttagcc cctcttcctc ttcattcccg
agcccgtctc gtatggatgc aaacaacaat 420gcatcaaatt acattccata tgctcgcacc
atgttcccca acatgtctct cccacctttg 480agaatatcaa acagcgcgcc cgtgactcca
cctgtctcat aa 522
273 173 PRT Medicago truncatula
273 Met Ala Ser Asp Gly Ala Thr Ser Ala Ala Asn Ser Ser Arg Arg Lys 1
5 10 15 Pro Ser Trp Arg
Glu Arg Glu Asn Asn Arg Arg Arg Glu Arg Arg Arg 20
25 30 Arg Ala Ile Ala Ala Lys Ile Tyr Ala
Gly Leu Arg Ser Gln Gly Asn 35 40
45 Tyr Asn Leu Pro Lys His Cys Asp Asn Asn Glu Val Leu Lys
Ala Leu 50 55 60
Cys Ala Glu Ala Gly Trp Thr Val Glu Glu Asp Gly Thr Thr Tyr Arg 65
70 75 80 Arg Gly Ser Arg Ala
Glu Thr Pro Gly Asp Gly Ala Gly Asn Phe Asn 85
90 95 Arg Asn Asn Pro Phe Ser Ser Gln Asn Leu
Ser Pro Leu Ser Ser Ser 100 105
110 Phe Pro Ser Pro Ile Pro Ser Tyr Gln Val Ser Pro Ser Ser Ser
Ser 115 120 125 Phe
Pro Ser Pro Ser Arg Met Asp Ala Asn Asn Asn Ala Ser Asn Tyr 130
135 140 Ile Pro Tyr Ala Arg Thr
Met Phe Pro Asn Met Ser Leu Pro Pro Leu 145 150
155 160 Arg Ile Ser Asn Ser Ala Pro Val Thr Pro Pro
Val Ser 165 170
274 1095 DNA Oryza sativa
274 atggcgacgg gaggaggagg aggaggggga gggatggggg
gaggaggtgt cggaggagga 60gcgggggcgg cgggggtggg ggtgggaggg aggatgccga
cgtggaggga gcgggagaac 120aacaagcgga gggagcggcg gcgtcgcgcg atcgccgcca
agatcttcgc cggcctccgc 180gcccacggcg gctacaagct ccccaagcac tgcgacaaca
acgaggtcct caaggccctc 240tgcaacgagg ccggctgggt cgtcgagccc gacggcacca
cctaccgcaa gggatacaag 300cctcctgaac gcatggaagt gattgggtgc tccgtatcac
caagcccgtg ttcctcgtat 360caaccaagcc cgcgggcatc atacaatgcg agtcctactt
cctcctcatt ccctagcggc 420gcatcctccc ccttccttcc tcaccctaac aacatggcca
atggtgttga tggtaatcct 480atcctcccat ggcttaaaac actgtccaat tctccatcat
caaagaaaca tccacagctt 540cccccactat tgattcacgg tggttccatt agtgcccctg
taactcctcc attgagttca 600ccaactgctc gcactcctcg catgaagaca gattgggatg
aatcaaatgt ccagcctacg 660tggactggtt cgaacagtcc ctgcgtggtg aactccacgc
cgcccagccc cggacgcaca 720atgcttccgg acccagcatg gttagctggt atccaaatat
catcaacaag tccatcatca 780ccgacattta gtcttgtgtc atcaaatcca tttagtgtct
ttaaagacgc gattctggtg 840ggcaacaatt catcgaggat gtgcacgcca gggcaaagcg
gcacatgctc ccctgcgatt 900cctggcatgg caccacaccc agatattcat atgatggatg
cggtttctga tgagtttgca 960tttggaagca gcacaaacgg tggccatcag gcggctggtc
tggtgagggc gtgggaaggc 1020gagaggatcc acgaggactc gggatcggac gacctagagc
tgactcttgg aagctctagg 1080acaagagctg ctgct
1095275365PRTOryza sativa 275Met Ala Thr Gly Gly
Gly Gly Gly Gly Gly Gly Met Gly Gly Gly Gly 1 5
10 15 Val Gly Gly Gly Ala Gly Ala Ala Gly Val
Gly Val Gly Gly Arg Met 20 25
30 Pro Thr Trp Arg Glu Arg Glu Asn Asn Lys Arg Arg Glu Arg Arg
Arg 35 40 45 Arg
Ala Ile Ala Ala Lys Ile Phe Ala Gly Leu Arg Ala His Gly Gly 50
55 60 Tyr Lys Leu Pro Lys His
Cys Asp Asn Asn Glu Val Leu Lys Ala Leu 65 70
75 80 Cys Asn Glu Ala Gly Trp Val Val Glu Pro Asp
Gly Thr Thr Tyr Arg 85 90
95 Lys Gly Tyr Lys Pro Pro Glu Arg Met Glu Val Ile Gly Cys Ser Val
100 105 110 Ser Pro
Ser Pro Cys Ser Ser Tyr Gln Pro Ser Pro Arg Ala Ser Tyr 115
120 125 Asn Ala Ser Pro Thr Ser Ser
Ser Phe Pro Ser Gly Ala Ser Ser Pro 130 135
140 Phe Leu Pro His Pro Asn Asn Met Ala Asn Gly Val
Asp Gly Asn Pro 145 150 155
160 Ile Leu Pro Trp Leu Lys Thr Leu Ser Asn Ser Pro Ser Ser Lys Lys
165 170 175 His Pro Gln
Leu Pro Pro Leu Leu Ile His Gly Gly Ser Ile Ser Ala 180
185 190 Pro Val Thr Pro Pro Leu Ser Ser
Pro Thr Ala Arg Thr Pro Arg Met 195 200
205 Lys Thr Asp Trp Asp Glu Ser Asn Val Gln Pro Thr Trp
Thr Gly Ser 210 215 220
Asn Ser Pro Cys Val Val Asn Ser Thr Pro Pro Ser Pro Gly Arg Thr 225
230 235 240 Met Leu Pro Asp
Pro Ala Trp Leu Ala Gly Ile Gln Ile Ser Ser Thr 245
250 255 Ser Pro Ser Ser Pro Thr Phe Ser Leu
Val Ser Ser Asn Pro Phe Ser 260 265
270 Val Phe Lys Asp Ala Ile Leu Val Gly Asn Asn Ser Ser Arg
Met Cys 275 280 285
Thr Pro Gly Gln Ser Gly Thr Cys Ser Pro Ala Ile Pro Gly Met Ala 290
295 300 Pro His Pro Asp Ile
His Met Met Asp Ala Val Ser Asp Glu Phe Ala 305 310
315 320 Phe Gly Ser Ser Thr Asn Gly Gly His Gln
Ala Ala Gly Leu Val Arg 325 330
335 Ala Trp Glu Gly Glu Arg Ile His Glu Asp Ser Gly Ser Asp Asp
Leu 340 345 350 Glu
Leu Thr Leu Gly Ser Ser Arg Thr Arg Ala Ala Ala 355
360 365
276 1179 DNA Oryza sativa
276 atgagcctga agcacccgca
ctctccggtg ctggacgggg acccgccgcc gcaccgccgc 60ccgcggggcc tcgtctccac
cccaccccca cccgccgtcg cggccgacac ctccccctcc 120ccctccccct cccccgcggc
gcctccgcct cggcggcgcg gcggcggcgg agggggaggc 180gagagggaga gggagaggga
gaaggagcgg acgaagctga gggagcggca ccgccgggcc 240atcaccagcc gcatgctgtc
cgggctgcgg cagcacggca acttcccgct ccccgcccgc 300gccgacatga acgacgtcct
cgccgccctc gcgcgcgccg cagggtggac cgtgcatccc 360gacggcacca ccttccgcgc
ctcgtcgcaa cccctccacc ctcccacccc ccaatcgcca 420gggatttttc atgttaattc
tgttgaaacc ccatctttta ctagtgttct caacagctac 480gccatcggga caccattaga
ctcgcaggct tctatgctac aaacagatga tagtttatcg 540ccatcatcgt tggactctgt
tgtggtggca gaccaaagca taaaaaatga gaaatatggg 600aattcagatt ctgtcagctc
tctgaattgt ttggaaaatc accagctgac gagagcatca 660gcagcgctgg caggtgatta
caccagaact ccatatatac cagtctatgc ttctctgcct 720atgggcatta tcaatagcca
ttgccaattg attgatccag agggcatacg tgcagaactg 780atgcatctga agtctttgaa
tgttgatgga gttatcgttg actgttggtg ggggatagtg 840gaagcctgga ttcctcacaa
atacgagtgg tctggttaca gggacctttt cggtatcatt 900aaagagttca agctaaaagt
tcaggctgta ttgtcattcc atgggtctgg ggagactgga 960tctggtggtg tgtctctccc
aaagtgggtc atggaaattg cacaagagaa ccaggatgta 1020ttttttactg atcgtgaagg
taggagaaat atggaatgtc tttcctgggg aattgacaaa 1080gagcgagtcc ttcgcgggag
aactggcatc gaggtattgg gtcatccttg gcgtattttg 1140atttcatgag gagctttcat
atggaattca gaaacctga 1179
277 391 PRT Oryza sativa
277 Met Ser Leu Lys His Pro His Ser Pro Val Leu Asp Gly Asp Pro Pro 1
5 10 15 Pro His Arg Arg
Pro Arg Gly Leu Val Ser Thr Pro Pro Pro Pro Ala 20
25 30 Val Ala Ala Asp Thr Ser Pro Ser Pro
Ser Pro Ser Pro Ala Ala Pro 35 40
45 Pro Pro Arg Arg Arg Gly Gly Gly Gly Gly Gly Gly Glu Arg
Glu Arg 50 55 60
Glu Arg Glu Lys Glu Arg Thr Lys Leu Arg Glu Arg His Arg Arg Ala 65
70 75 80 Ile Thr Ser Arg Met
Leu Ser Gly Leu Arg Gln His Gly Asn Phe Pro 85
90 95 Leu Pro Ala Arg Ala Asp Met Asn Asp Val
Leu Ala Ala Leu Ala Arg 100 105
110 Ala Ala Gly Trp Thr Val His Pro Asp Gly Thr Thr Phe Arg Ala
Ser 115 120 125 Ser
Gln Pro Leu His Pro Pro Thr Pro Gln Ser Pro Gly Ile Phe His 130
135 140 Val Asn Ser Val Glu Thr
Pro Ser Phe Thr Ser Val Leu Asn Ser Tyr 145 150
155 160 Ala Ile Gly Thr Pro Leu Asp Ser Gln Ala Ser
Met Leu Gln Thr Asp 165 170
175 Asp Ser Leu Ser Pro Ser Ser Leu Asp Ser Val Val Val Ala Asp Gln
180 185 190 Ser Ile
Lys Asn Glu Lys Tyr Gly Asn Ser Asp Ser Val Ser Ser Leu 195
200 205 Asn Cys Leu Glu Asn His Gln
Leu Thr Arg Ala Ser Ala Ala Leu Ala 210 215
220 Gly Asp Tyr Thr Arg Thr Pro Tyr Ile Pro Val Tyr
Ala Ser Leu Pro 225 230 235
240 Met Gly Ile Ile Asn Ser His Cys Gln Leu Ile Asp Pro Glu Gly Ile
245 250 255 Arg Ala Glu
Leu Met His Leu Lys Ser Leu Asn Val Asp Gly Val Ile 260
265 270 Val Asp Cys Trp Trp Gly Ile Val
Glu Ala Trp Ile Pro His Lys Tyr 275 280
285 Glu Trp Ser Gly Tyr Arg Asp Leu Phe Gly Ile Ile Lys
Glu Phe Lys 290 295 300
Leu Lys Val Gln Ala Val Leu Ser Phe His Gly Ser Gly Glu Thr Gly 305
310 315 320 Ser Gly Gly Val
Ser Leu Pro Lys Trp Val Met Glu Ile Ala Gln Glu 325
330 335 Asn Gln Asp Val Phe Phe Thr Asp Arg
Glu Gly Arg Arg Asn Met Glu 340 345
350 Cys Leu Ser Trp Gly Ile Asp Lys Glu Arg Val Leu Arg Gly
Arg Thr 355 360 365
Gly Ile Glu Val Leu Gly His Pro Trp Arg Ile Leu Ile Ser Gly Ala 370
375 380 Phe Ile Trp Asn Ser
Glu Thr 385 390
278 1068 DNA Oryza sativa
278 atgacgaacg
gggcgggggg aggaggagga ggagggggat tggggggcac gagggtgccg 60acgtggaggg
agcgggagaa caaccggcgg agggagcggc ggcggcgggc gatcgcggcc 120aagatctacg
ccgggctgcg cgcctacggc aactacaacc tccccaagca ctgcgacaac 180aacgaggtgc
tcaaggcgct ctgcaacgag gccggctgga ccgtcgagcc cgacggcacc 240acctaccgca
agggatgtaa acctcctcaa gcagagcgtc ctgatccaat tggaagatcg 300gcttcgccaa
gcccttgctc ttcatatcaa ccaagtccgc gggcttcata caacccaagt 360cctgcatcgt
cctcctttcc aagctctgga tcctcctcgc atatcactat tggtggaaac 420agcttgattg
gtggtgtcga gggaagctcc ctcattccat ggctgaagac acttccgttg 480agttcatcat
atgcctcctc ctccaagttc ccacagcttc accatttata tttcaatgga 540ggttccatta
gtgcaccagt gactcctcca tccagctccc ctactcgcac acctcgctta 600aggactgatt
gggagaacgc aagtgttcag ccaccatggg ctagtgcaaa ttatacatct 660cttcccaact
ctacaccacc gagcccaggc cacaagattg caccagaccc agcatggctc 720tcaggatttc
aaatatcatc tgctggtccc tcatcgccaa catacaatct tgtttcgccg 780aatccatttg
ggattttcaa agaagctatt gccagcactt ccagggtgtg cacccctggt 840cagagcggaa
catgttcccc ggtaatgggt ggcatgccgg ctcatcatga tgttcagatg 900gttgatggtg
cgccggatga ttttgccttt gggagcagca gcaatggcaa caatgaatca 960cctggactgg
tgaaggcatg ggagggggag cggatacatg aagaatgcgc ctccgatgag 1020ctggagctca
ctcttgggag ctcaaagact cgtgcggatc cctcctga 1068
279 355 PRT Oryza sativa
279 Met Thr Asn Gly Ala Gly Gly Gly Gly Gly Gly
Gly Gly Leu Gly Gly 1 5 10
15 Thr Arg Val Pro Thr Trp Arg Glu Arg Glu Asn Asn Arg Arg Arg Glu
20 25 30 Arg Arg
Arg Arg Ala Ile Ala Ala Lys Ile Tyr Ala Gly Leu Arg Ala 35
40 45 Tyr Gly Asn Tyr Asn Leu Pro
Lys His Cys Asp Asn Asn Glu Val Leu 50 55
60 Lys Ala Leu Cys Asn Glu Ala Gly Trp Thr Val Glu
Pro Asp Gly Thr 65 70 75
80 Thr Tyr Arg Lys Gly Cys Lys Pro Pro Gln Ala Glu Arg Pro Asp Pro
85 90 95 Ile Gly Arg
Ser Ala Ser Pro Ser Pro Cys Ser Ser Tyr Gln Pro Ser 100
105 110 Pro Arg Ala Ser Tyr Asn Pro Ser
Pro Ala Ser Ser Ser Phe Pro Ser 115 120
125 Ser Gly Ser Ser Ser His Ile Thr Ile Gly Gly Asn Ser
Leu Ile Gly 130 135 140
Gly Val Glu Gly Ser Ser Leu Ile Pro Trp Leu Lys Thr Leu Pro Leu 145
150 155 160 Ser Ser Ser Tyr
Ala Ser Ser Ser Lys Phe Pro Gln Leu His His Leu 165
170 175 Tyr Phe Asn Gly Gly Ser Ile Ser Ala
Pro Val Thr Pro Pro Ser Ser 180 185
190 Ser Pro Thr Arg Thr Pro Arg Leu Arg Thr Asp Trp Glu Asn
Ala Ser 195 200 205
Val Gln Pro Pro Trp Ala Ser Ala Asn Tyr Thr Ser Leu Pro Asn Ser 210
215 220 Thr Pro Pro Ser Pro
Gly His Lys Ile Ala Pro Asp Pro Ala Trp Leu 225 230
235 240 Ser Gly Phe Gln Ile Ser Ser Ala Gly Pro
Ser Ser Pro Thr Tyr Asn 245 250
255 Leu Val Ser Pro Asn Pro Phe Gly Ile Phe Lys Glu Ala Ile Ala
Ser 260 265 270 Thr
Ser Arg Val Cys Thr Pro Gly Gln Ser Gly Thr Cys Ser Pro Val 275
280 285 Met Gly Gly Met Pro Ala
His His Asp Val Gln Met Val Asp Gly Ala 290 295
300 Pro Asp Asp Phe Ala Phe Gly Ser Ser Ser Asn
Gly Asn Asn Glu Ser 305 310 315
320 Pro Gly Leu Val Lys Ala Trp Glu Gly Glu Arg Ile His Glu Glu Cys
325 330 335 Ala Ser
Asp Glu Leu Glu Leu Thr Leu Gly Ser Ser Lys Thr Arg Ala 340
345 350 Asp Pro Ser 355
280 897 DNA Oryza sativa
280 atgacgtccg gggcggcggc ggcggggagg acgccgacgt
ggaaggagag ggagaacaac 60aagaggcggg agcggcggcg gcgtgccatc gccgccaaga
tcttcacggg gctccgggcg 120ctcgggaact acaacctccc caagcactgc gacaacaacg
aggtgctcaa ggcgctctgc 180cgcgaggccg gctgggttgt cgaggacgac ggcaccacct
accgcaaggg atgtaagccg 240ccgccatcgt cggctggggg agcgtcggtg gggatgagcc
cctgctcgtc aacgcagctg 300ctgagcgcgc cgtcgtcgtc gttcccgagc ccggtgccgt
cgtaccacgc gagcccggcg 360tcgtcgagct tcccgagccc cagccggatc gacaacccga
gcgcctcctg cctcctcccg 420ttcctccggg ggctccccaa cctcccgccg ctccgcgtct
ccagcagcgc gcccgtcacg 480ccgccgctct cgtcgccgac ggcgtcgcgg ccgcccaaga
tcaggaagcc ggactgggac 540gtcgacccgt tccggcaccc cttcttcgcg gtctccgcgc
cggcgagccc cacccgcggc 600cgccgcctcg agcacccgga cacgataccg gagtgcgacg
agtccgacgt ctccacggtg 660gactccggcc ggtggatcag cttccagatg gccacgacgg
cgccgacgtc gcccacctac 720aacctcgtca acccgggcgc ctccacctcc aactccatgg
agatagaagg gacggccggc 780cgaggcggcg cggagttcga gttcgacaag gggagggtga
cgccatggga gggcgagagg 840atccacgagg tcgccgccga ggagctcgag ctcacgctcg
gcgtcggcgc gaaatga 897
281 298 PRT Oryza sativa
281 Met Thr Ser Gly Ala
Ala Ala Ala Gly Arg Thr Pro Thr Trp Lys Glu 1 5
10 15 Arg Glu Asn Asn Lys Arg Arg Glu Arg Arg
Arg Arg Ala Ile Ala Ala 20 25
30 Lys Ile Phe Thr Gly Leu Arg Ala Leu Gly Asn Tyr Asn Leu Pro
Lys 35 40 45 His
Cys Asp Asn Asn Glu Val Leu Lys Ala Leu Cys Arg Glu Ala Gly 50
55 60 Trp Val Val Glu Asp Asp
Gly Thr Thr Tyr Arg Lys Gly Cys Lys Pro 65 70
75 80 Pro Pro Ser Ser Ala Gly Gly Ala Ser Val Gly
Met Ser Pro Cys Ser 85 90
95 Ser Thr Gln Leu Leu Ser Ala Pro Ser Ser Ser Phe Pro Ser Pro Val
100 105 110 Pro Ser
Tyr His Ala Ser Pro Ala Ser Ser Ser Phe Pro Ser Pro Ser 115
120 125 Arg Ile Asp Asn Pro Ser Ala
Ser Cys Leu Leu Pro Phe Leu Arg Gly 130 135
140 Leu Pro Asn Leu Pro Pro Leu Arg Val Ser Ser Ser
Ala Pro Val Thr 145 150 155
160 Pro Pro Leu Ser Ser Pro Thr Ala Ser Arg Pro Pro Lys Ile Arg Lys
165 170 175 Pro Asp Trp
Asp Val Asp Pro Phe Arg His Pro Phe Phe Ala Val Ser 180
185 190 Ala Pro Ala Ser Pro Thr Arg Gly
Arg Arg Leu Glu His Pro Asp Thr 195 200
205 Ile Pro Glu Cys Asp Glu Ser Asp Val Ser Thr Val Asp
Ser Gly Arg 210 215 220
Trp Ile Ser Phe Gln Met Ala Thr Thr Ala Pro Thr Ser Pro Thr Tyr 225
230 235 240 Asn Leu Val Asn
Pro Gly Ala Ser Thr Ser Asn Ser Met Glu Ile Glu 245
250 255 Gly Thr Ala Gly Arg Gly Gly Ala Glu
Phe Glu Phe Asp Lys Gly Arg 260 265
270 Val Thr Pro Trp Glu Gly Glu Arg Ile His Glu Val Ala Ala
Glu Glu 275 280 285
Leu Glu Leu Thr Leu Gly Val Gly Ala Lys 290 295
282 1071 DNA Physcomitrella patens
282 atgacgtccg ggacgcgcct gcccacatgg
aaggagcgag agaacaacaa gaggagagag 60cggcgccggc gcgccattgc ggccaagatc
ttcgccggcc tgcgccttta tggcaactac 120aagctgccga agcactgcga caacaatgaa
gttctcaaag cgttgtgcgt ggaggcgggc 180tggacggtgg aagaggacgg caccacgtac
cgcaagggct cgaagccacc ggcgcagccc 240atggaggtct gcacctcccc atctgaggtg
agccctacga actcctaccc gggtgccacc 300gatggcactt ccctgattcc atggctgaag
gggctgtctt ccaatggagg cagtggggca 360gccaccccga gcagcagcgc gggtcttccg
cccttgcacg taatgcatgg aggctcctct 420agcgcacccg tcacgccacc actgagctct
cccactcacc gtggtcctcc agtcaaacca 480gattgggatc acatcaagga gactgatcac
catccccacg ggtttcctcc aaccggcacg 540cccacatgga accatcaccc tttcctggcc
gcggctgccg ccgcagctca agctgctgct 600tcaaatcagt cccatctccg ccctggctac
tgcgacactc cggacggcgc tcgcactccc 660attgaagagg gagattctga aatctctcca
gaggcggctc ttgaatttgc taccgtctgt 720ggctccaact ccagcaagtg ggccaacggg
gttagggtgc ggacgagctc ggaagggcgg 780ctgctgagcg ggatggcggg tctggggcct
ttcccatcag caaatagcga ttcccccttg 840gaaactttct cccatccctg gcggaatccg
atgcagaagt ccatcagtat gcctgtttct 900cccgtttctt ctcggatgaa gggctcattt
ggggataggc tcgggaggtg tccatccgag 960ctggagttcc ccggcgctgt gcagggattg
gggtcgctgt gggatggcct ggctcctgag 1020gttgggggga agatgaaact gccggcggac
gatttggagc tgaagctttg a 1071
283 356 PRT Physcomitrella patens
283 Met Thr Ser Gly Thr Arg Leu Pro Thr Trp Lys Glu Arg Glu Asn Asn 1
5 10 15 Lys Arg Arg Glu
Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile Phe Ala 20
25 30 Gly Leu Arg Leu Tyr Gly Asn Tyr Lys
Leu Pro Lys His Cys Asp Asn 35 40
45 Asn Glu Val Leu Lys Ala Leu Cys Val Glu Ala Gly Trp Thr
Val Glu 50 55 60
Glu Asp Gly Thr Thr Tyr Arg Lys Gly Ser Lys Pro Pro Ala Gln Pro 65
70 75 80 Met Glu Val Cys Thr
Ser Pro Ser Glu Val Ser Pro Thr Asn Ser Tyr 85
90 95 Pro Gly Ala Thr Asp Gly Thr Ser Leu Ile
Pro Trp Leu Lys Gly Leu 100 105
110 Ser Ser Asn Gly Gly Ser Gly Ala Ala Thr Pro Ser Ser Ser Ala
Gly 115 120 125 Leu
Pro Pro Leu His Val Met His Gly Gly Ser Ser Ser Ala Pro Val 130
135 140 Thr Pro Pro Leu Ser Ser
Pro Thr His Arg Gly Pro Pro Val Lys Pro 145 150
155 160 Asp Trp Asp His Ile Lys Glu Thr Asp His His
Pro His Gly Phe Pro 165 170
175 Pro Thr Gly Thr Pro Thr Trp Asn His His Pro Phe Leu Ala Ala Ala
180 185 190 Ala Ala
Ala Ala Gln Ala Ala Ala Ser Asn Gln Ser His Leu Arg Pro 195
200 205 Gly Tyr Cys Asp Thr Pro Asp
Gly Ala Arg Thr Pro Ile Glu Glu Gly 210 215
220 Asp Ser Glu Ile Ser Pro Glu Ala Ala Leu Glu Phe
Ala Thr Val Cys 225 230 235
240 Gly Ser Asn Ser Ser Lys Trp Ala Asn Gly Val Arg Val Arg Thr Ser
245 250 255 Ser Glu Gly
Arg Leu Leu Ser Gly Met Ala Gly Leu Gly Pro Phe Pro 260
265 270 Ser Ala Asn Ser Asp Ser Pro Leu
Glu Thr Phe Ser His Pro Trp Arg 275 280
285 Asn Pro Met Gln Lys Ser Ile Ser Met Pro Val Ser Pro
Val Ser Ser 290 295 300
Arg Met Lys Gly Ser Phe Gly Asp Arg Leu Gly Arg Cys Pro Ser Glu 305
310 315 320 Leu Glu Phe Pro
Gly Ala Val Gln Gly Leu Gly Ser Leu Trp Asp Gly 325
330 335 Leu Ala Pro Glu Val Gly Gly Lys Met
Lys Leu Pro Ala Asp Asp Leu 340 345
350 Glu Leu Lys Leu 355
284 1902 DNA Physcomitrella patens
284 atggcacagg catgtgcaag ccacacccca
cacattccgc atgtttccag aatgctcaga 60atttcaagcc gatttttgca acaattccag
agaggaagaa acagaagaaa aagaacagga 120aataaaaata acaggagggt aactgcgacg
acttcgtcta ctgctccttc tgcttcgtcg 180gcttcttgtg ctgtttctgc tactgctgct
ggttctggtt cttctggttc ttcttcttct 240tcttcgcctg caacctgcag tttcgggtgt
gcgtgtcgtc aatttgccgg tcaagactgt 300ttcaaaggcg ctgctttggt gatgatggtg
gagttcctcc gggagggttg ggtagatttg 360acttgtgtca gaacgcgaaa gagagcagca
ggggttttga gcgagggagg gtctggttcc 420ttccgagtcg ctagcccgca cgcgcgagca
acacctttat ttggagggtg cttcgttgtc 480cctgagcctg tgctgtatcg cccctcttgt
cttggttcga agatcggcga tatgacgtcc 540gggacgcgct tgcccacctg gaaggaacga
gagaacaaca agaggagaga gcgccgccgg 600cgcgccattg cagctaagat cttcgccggc
ctgcgtctct atggaaacta caagctgccc 660aagcactgcg acaacaatga agtcctcaag
gcgctctgcg tggaggctgg ctggactgtg 720gaagaagacg gcaccactta ccgtaaggga
tcgaagccgc ccgcgcagcc catggaagtg 780tgcacatcac cttctgaggc gagccccact
agttcctacc ccggcgcagc tgaaggcact 840tctctgattc cgtggcttaa ggggctatct
tcaaatggtg gcagtggcac cgctaccccg 900agcagcagcg cgggcctgcc acccttgcac
gtcatgcacg gtggctcctc cagcgcgccg 960gtcacgccac ctctttcgtc ccccactcac
cgcggtcctc cggtcaagcc cgactgggac 1020cacatcaagg acgccgatca ccattcccac
ggattccccc catccggccc tcccacatgg 1080aaccatcacc ctttcttagc tgccgctgcg
gctgcagctc aagctgccgc ttccaatcag 1140tcccatctcc ggcctggcta ctgcgatacg
cccgacggtg ctcgtacccc catcgaagag 1200gccgagtccg aaatctcgcc tgggaccgct
ttggagtttg ccaccgtctg tggctcgaac 1260tccagcaaat gggctaacgg ggttagggtg
cgaactagct caggggggcg gcttctcggg 1320ggaatggcag gccttggtcc atttccttcg
gctaataatg actccccctt ggagacattc 1380tcccatgcat ggcgaaatcc tatgcagaaa
tcgattagta tgcctgtttc gcctgtttcc 1440tcgcgaatga aggggtcatt cggagataga
cttggtaggt gtccgtcaga gctggagctc 1500tccggcgctg ttcaggggtt agggtcgctg
tgggagctgg acggggtggt acctgaagtt 1560ggtgcgaaga ggaaattgcc agccgacgat
ttagagctga agctaattgc tggtggtctg 1620gtgccgatcg ccactgaagt tgttcaggtt
aaactttggg tcttccaagc ggtgcactta 1680aaggatgtgg ggttagtgag cgttgtgtac
ttcgtcggtt tacacttgct ggagccgctt 1740catgagggcg cactctattc ctctacaacc
ttaatactgc agtttaatgc ggtgaagtct 1800gcactgtttg tttctgagag tggctccgcc
tcctctaccc caaaaaatcg tgctgaagtc 1860ttcccagtca ccagtgcgac ggagttatca
aggcaactat aa 1902
285 633 PRT Physcomitrella patens
285 Met Ala Gln Ala Cys Ala Ser His Thr Pro His Ile Pro His Val Ser 1
5 10 15 Arg Met Leu Arg
Ile Ser Ser Arg Phe Leu Gln Gln Phe Gln Arg Gly 20
25 30 Arg Asn Arg Arg Lys Arg Thr Gly Asn
Lys Asn Asn Arg Arg Val Thr 35 40
45 Ala Thr Thr Ser Ser Thr Ala Pro Ser Ala Ser Ser Ala Ser
Cys Ala 50 55 60
Val Ser Ala Thr Ala Ala Gly Ser Gly Ser Ser Gly Ser Ser Ser Ser 65
70 75 80 Ser Ser Pro Ala Thr
Cys Ser Phe Gly Cys Ala Cys Arg Gln Phe Ala 85
90 95 Gly Gln Asp Cys Phe Lys Gly Ala Ala Leu
Val Met Met Val Glu Phe 100 105
110 Leu Arg Glu Gly Trp Val Asp Leu Thr Cys Val Arg Thr Arg Lys
Arg 115 120 125 Ala
Ala Gly Val Leu Ser Glu Gly Gly Ser Gly Ser Phe Arg Val Ala 130
135 140 Ser Pro His Ala Arg Ala
Thr Pro Leu Phe Gly Gly Cys Phe Val Val 145 150
155 160 Pro Glu Pro Val Leu Tyr Arg Pro Ser Cys Leu
Gly Ser Lys Ile Gly 165 170
175 Asp Met Thr Ser Gly Thr Arg Leu Pro Thr Trp Lys Glu Arg Glu Asn
180 185 190 Asn Lys
Arg Arg Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile Phe 195
200 205 Ala Gly Leu Arg Leu Tyr Gly
Asn Tyr Lys Leu Pro Lys His Cys Asp 210 215
220 Asn Asn Glu Val Leu Lys Ala Leu Cys Val Glu Ala
Gly Trp Thr Val 225 230 235
240 Glu Glu Asp Gly Thr Thr Tyr Arg Lys Gly Ser Lys Pro Pro Ala Gln
245 250 255 Pro Met Glu
Val Cys Thr Ser Pro Ser Glu Ala Ser Pro Thr Ser Ser 260
265 270 Tyr Pro Gly Ala Ala Glu Gly Thr
Ser Leu Ile Pro Trp Leu Lys Gly 275 280
285 Leu Ser Ser Asn Gly Gly Ser Gly Thr Ala Thr Pro Ser
Ser Ser Ala 290 295 300
Gly Leu Pro Pro Leu His Val Met His Gly Gly Ser Ser Ser Ala Pro 305
310 315 320 Val Thr Pro Pro
Leu Ser Ser Pro Thr His Arg Gly Pro Pro Val Lys 325
330 335 Pro Asp Trp Asp His Ile Lys Asp Ala
Asp His His Ser His Gly Phe 340 345
350 Pro Pro Ser Gly Pro Pro Thr Trp Asn His His Pro Phe Leu
Ala Ala 355 360 365
Ala Ala Ala Ala Ala Gln Ala Ala Ala Ser Asn Gln Ser His Leu Arg 370
375 380 Pro Gly Tyr Cys Asp
Thr Pro Asp Gly Ala Arg Thr Pro Ile Glu Glu 385 390
395 400 Ala Glu Ser Glu Ile Ser Pro Gly Thr Ala
Leu Glu Phe Ala Thr Val 405 410
415 Cys Gly Ser Asn Ser Ser Lys Trp Ala Asn Gly Val Arg Val Arg
Thr 420 425 430 Ser
Ser Gly Gly Arg Leu Leu Gly Gly Met Ala Gly Leu Gly Pro Phe 435
440 445 Pro Ser Ala Asn Asn Asp
Ser Pro Leu Glu Thr Phe Ser His Ala Trp 450 455
460 Arg Asn Pro Met Gln Lys Ser Ile Ser Met Pro
Val Ser Pro Val Ser 465 470 475
480 Ser Arg Met Lys Gly Ser Phe Gly Asp Arg Leu Gly Arg Cys Pro Ser
485 490 495 Glu Leu
Glu Leu Ser Gly Ala Val Gln Gly Leu Gly Ser Leu Trp Glu 500
505 510 Leu Asp Gly Val Val Pro Glu
Val Gly Ala Lys Arg Lys Leu Pro Ala 515 520
525 Asp Asp Leu Glu Leu Lys Leu Ile Ala Gly Gly Leu
Val Pro Ile Ala 530 535 540
Thr Glu Val Val Gln Val Lys Leu Trp Val Phe Gln Ala Val His Leu 545
550 555 560 Lys Asp Val
Gly Leu Val Ser Val Val Tyr Phe Val Gly Leu His Leu 565
570 575 Leu Glu Pro Leu His Glu Gly Ala
Leu Tyr Ser Ser Thr Thr Leu Ile 580 585
590 Leu Gln Phe Asn Ala Val Lys Ser Ala Leu Phe Val Ser
Glu Ser Gly 595 600 605
Ser Ala Ser Ser Thr Pro Lys Asn Arg Ala Glu Val Phe Pro Val Thr 610
615 620 Ser Ala Thr Glu
Leu Ser Arg Gln Leu 625 630
286 1671 DNA Physcomitrella patens
286 atgcaggtcg gcgatcggag ttttgaacaa
ggtgaaagca gtgaggttcg caaatgcacc 60gtccgaggct gcataaagtc aacctccgga
ccgtggatag tacgccgtcc tccaggcaaa 120ggtcaatcaa ctgccccagc tgtacttagg
atgccctcgg cacgagagcg cgagaacaat 180aagaggaggg agaggcggcg acgagctata
gctgctaaaa tatttgctgg gttgcgcgcc 240catgggaact attgtcttcc caaacacgcc
gatcataatg aggtactcaa agccttgtgt 300caggaggccg gctggcaagt ggaggaggat
ggcaccatct tcagaaagaa cagttttcga 360gcagtacatc ctgttattca aagaattgta
gaagctaaac cgatccgtac cgttcagttg 420atcagcctac aaatgcagca tagcatagtt
cgccaattca ttaggaatca acagcaagga 480tcgcagccac cttcaaggga agtaactaca
gcgcacaata ctcctgaagg cacgccatca 540tacgagagat ccttcaagtc agataccagt
ccctcaacct cctgcagcca agctgggcaa 600accagtgatg aacccacgtg cactgctagg
agcggcggcg ccgaagtcag gcaccttggc 660agaataagtg ttgattctca gtttgaagat
aaaagacagc gctgtgaccc gctctccaac 720ttcaaaacag tggtagcatt tccttccgct
gtgcaagcaa ggaacccgaa tcctaattcc 780agagatccga agaatagggc aggacctaaa
tctgtggcag ggtttctgct tccagagcaa 840acggtgaggt tgcaccatgt tggcaacttg
aacgatccgc cagtacctgg cgcagaggat 900attgccgaag tctgcactgc actggcggtg
aagaacgaat gggaaaccac gcagggtact 960gcaggtgtgc tgtattcagg aggacaaact
gttggacaga cctatattgt gtcgtgtgca 1020tcggaaaagg acacctctga ttgctttgag
cgtgtatctg tgacggcagg acacgacagg 1080ttttcccatg acccccttgt tgcggacatg
atggattgtg tggaccttgg tcaacaactc 1140gaatgtggta gacgaaaaag gttccttgaa
caccagtcca agcagctgga atacgaccag 1200ctcaatccct acctcaacgt gcatatgaac
ggagactcgt ctgtggtatc ccaagttcaa 1260aggcaaaccc aagacccaga tccggggaag
cactacacgt tattccccga agcggccgac 1320ttgctcaacc aatcgcaacg agaacaaggg
gatcaatact catgcatcac tcacgaaatg 1380gtggatgtca ctggtcaagc atacaagtcc
ctcaaggatg gtctctgctt gtggtccgga 1440agagatggag cttccgtcag cacaggttca
actcgtttga gcttgcaccc agcagcagca 1500gcagcatcta ctacagcgag taaccgcggt
ggagcttcaa tcatctctct tcaacacaaa 1560aaggtcgacg cagacgagga tattgttaag
gacatcgctg atgacctcac gctcactctg 1620tgcacttcgg tgagacatac ccacactcca
gagtccagcc gtgtggtcta a 1671
287 556 PRT Physcomitrella patens
287 Met Gln Val Gly Asp Arg Ser Phe Glu Gln Gly Glu Ser Ser Glu Val 1
5 10 15 Arg Lys Cys Thr
Val Arg Gly Cys Ile Lys Ser Thr Ser Gly Pro Trp 20
25 30 Ile Val Arg Arg Pro Pro Gly Lys Gly
Gln Ser Thr Ala Pro Ala Val 35 40
45 Leu Arg Met Pro Ser Ala Arg Glu Arg Glu Asn Asn Lys Arg
Arg Glu 50 55 60
Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile Phe Ala Gly Leu Arg Ala 65
70 75 80 His Gly Asn Tyr Cys
Leu Pro Lys His Ala Asp His Asn Glu Val Leu 85
90 95 Lys Ala Leu Cys Gln Glu Ala Gly Trp Gln
Val Glu Glu Asp Gly Thr 100 105
110 Ile Phe Arg Lys Asn Ser Phe Arg Ala Val His Pro Val Ile Gln
Arg 115 120 125 Ile
Val Glu Ala Lys Pro Ile Arg Thr Val Gln Leu Ile Ser Leu Gln 130
135 140 Met Gln His Ser Ile Val
Arg Gln Phe Ile Arg Asn Gln Gln Gln Gly 145 150
155 160 Ser Gln Pro Pro Ser Arg Glu Val Thr Thr Ala
His Asn Thr Pro Glu 165 170
175 Gly Thr Pro Ser Tyr Glu Arg Ser Phe Lys Ser Asp Thr Ser Pro Ser
180 185 190 Thr Ser
Cys Ser Gln Ala Gly Gln Thr Ser Asp Glu Pro Thr Cys Thr 195
200 205 Ala Arg Ser Gly Gly Ala Glu
Val Arg His Leu Gly Arg Ile Ser Val 210 215
220 Asp Ser Gln Phe Glu Asp Lys Arg Gln Arg Cys Asp
Pro Leu Ser Asn 225 230 235
240 Phe Lys Thr Val Val Ala Phe Pro Ser Ala Val Gln Ala Arg Asn Pro
245 250 255 Asn Pro Asn
Ser Arg Asp Pro Lys Asn Arg Ala Gly Pro Lys Ser Val 260
265 270 Ala Gly Phe Leu Leu Pro Glu Gln
Thr Val Arg Leu His His Val Gly 275 280
285 Asn Leu Asn Asp Pro Pro Val Pro Gly Ala Glu Asp Ile
Ala Glu Val 290 295 300
Cys Thr Ala Leu Ala Val Lys Asn Glu Trp Glu Thr Thr Gln Gly Thr 305
310 315 320 Ala Gly Val Leu
Tyr Ser Gly Gly Gln Thr Val Gly Gln Thr Tyr Ile 325
330 335 Val Ser Cys Ala Ser Glu Lys Asp Thr
Ser Asp Cys Phe Glu Arg Val 340 345
350 Ser Val Thr Ala Gly His Asp Arg Phe Ser His Asp Pro Leu
Val Ala 355 360 365
Asp Met Met Asp Cys Val Asp Leu Gly Gln Gln Leu Glu Cys Gly Arg 370
375 380 Arg Lys Arg Phe Leu
Glu His Gln Ser Lys Gln Leu Glu Tyr Asp Gln 385 390
395 400 Leu Asn Pro Tyr Leu Asn Val His Met Asn
Gly Asp Ser Ser Val Val 405 410
415 Ser Gln Val Gln Arg Gln Thr Gln Asp Pro Asp Pro Gly Lys His
Tyr 420 425 430 Thr
Leu Phe Pro Glu Ala Ala Asp Leu Leu Asn Gln Ser Gln Arg Glu 435
440 445 Gln Gly Asp Gln Tyr Ser
Cys Ile Thr His Glu Met Val Asp Val Thr 450 455
460 Gly Gln Ala Tyr Lys Ser Leu Lys Asp Gly Leu
Cys Leu Trp Ser Gly 465 470 475
480 Arg Asp Gly Ala Ser Val Ser Thr Gly Ser Thr Arg Leu Ser Leu His
485 490 495 Pro Ala
Ala Ala Ala Ala Ser Thr Thr Ala Ser Asn Arg Gly Gly Ala 500
505 510 Ser Ile Ile Ser Leu Gln His
Lys Lys Val Asp Ala Asp Glu Asp Ile 515 520
525 Val Lys Asp Ile Ala Asp Asp Leu Thr Leu Thr Leu
Cys Thr Ser Val 530 535 540
Arg His Thr His Thr Pro Glu Ser Ser Arg Val Val 545
550 555
SEQ ID NO: 288, length 1104, DNA, Picea sitchensis
atgacgtctg
gttcgaggct gccgacctgg aaagaacggg agaataacaa gaagagagag 60cggcgaaggc
gagcgatcgc ggctaagatc tatgccgggc ttcggatgta tggaaactac 120aagctgccga
agcattgcga taacaacgag gttctcaagg ccctgtgtgc agaagccggc 180tggatggtcg
aggaagatgg aacaacttac cgcaagggat gcaagccgac ggagcgcata 240gaggtcgcgg
gatcgagttc cgttagcccg gcgtcttctt atcatccgag cccggcgcct 300tcctaccagc
ccagccctgc gtcgtcgtcg ttcgcgagcc cggcttcgtc atcgttcgag 360cccgcgggaa
ccggggcggc gaattctttg atcccgtggc tcaagaacct ctcgtcgtcg 420tcgtcggcgt
cttcttcggg ccggctgatt cacggcgggg gctcgatcag cgcgccggtg 480actccgcctc
tgagctcccc aacaggccgc ggcgcgcggg cgaagctgga ctgggacgcc 540atggttaaag
ccgttgccaa tgagagcaat gattgtccta attcggggtt ctctaccccc 600gtgagcccct
ggtcgaatta ccccttcgtg gcgtcctcca ccccggcgag cccggggcgt 660cacgccgaga
tggccacgca gttgagcaac gccgtggtgg acaaggggcg ctggatgggc 720gggatccgga
tgatggcgtt cccctcggcg gggccttcgt cgcccacgtt caatctgctg 780accccggccg
cgcagcttca gcacggcctt gcaacggagg gcggcaggct gtggacgccg 840gggcagagtg
gcgtctcgtc tccttgtaac aaccgggcag gtgaggagga gaggttattg 900ccgccgtttc
aagaaggtat ggatgcttca gacgagtttg ctttcggcag cgttgcggtc 960aagccgtggc
aaggcgagag gattcatgag gagtgcgggg gagagatagg atccgacgac 1020ctagagctta
cgctaggatc tttctcgtca tcttcttcaa agttaagatc tgaccgagaa 1080cccttgtttt
ctgttaaaga gtag
1104
SEQ ID NO: 289, length 367, PRT, Picea sitchensis
Met Thr Ser Gly Ser Arg Leu Pro Thr Trp
Lys Glu Arg Glu Asn Asn 1 5 10
15 Lys Lys Arg Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile Tyr
Ala 20 25 30 Gly
Leu Arg Met Tyr Gly Asn Tyr Lys Leu Pro Lys His Cys Asp Asn 35
40 45 Asn Glu Val Leu Lys Ala
Leu Cys Ala Glu Ala Gly Trp Met Val Glu 50 55
60 Glu Asp Gly Thr Thr Tyr Arg Lys Gly Cys Lys
Pro Thr Glu Arg Ile 65 70 75
80 Glu Val Ala Gly Ser Ser Ser Val Ser Pro Ala Ser Ser Tyr His Pro
85 90 95 Ser Pro
Ala Pro Ser Tyr Gln Pro Ser Pro Ala Ser Ser Ser Phe Ala 100
105 110 Ser Pro Ala Ser Ser Ser Phe
Glu Pro Ala Gly Thr Gly Ala Ala Asn 115 120
125 Ser Leu Ile Pro Trp Leu Lys Asn Leu Ser Ser Ser
Ser Ser Ala Ser 130 135 140
Ser Ser Gly Arg Leu Ile His Gly Gly Gly Ser Ile Ser Ala Pro Val 145
150 155 160 Thr Pro Pro
Leu Ser Ser Pro Thr Gly Arg Gly Ala Arg Ala Lys Leu 165
170 175 Asp Trp Asp Ala Met Val Lys Ala
Val Ala Asn Glu Ser Asn Asp Cys 180 185
190 Pro Asn Ser Gly Phe Ser Thr Pro Val Ser Pro Trp Ser
Asn Tyr Pro 195 200 205
Phe Val Ala Ser Ser Thr Pro Ala Ser Pro Gly Arg His Ala Glu Met 210
215 220 Ala Thr Gln Leu
Ser Asn Ala Val Val Asp Lys Gly Arg Trp Met Gly 225 230
235 240 Gly Ile Arg Met Met Ala Phe Pro Ser
Ala Gly Pro Ser Ser Pro Thr 245 250
255 Phe Asn Leu Leu Thr Pro Ala Ala Gln Leu Gln His Gly Leu
Ala Thr 260 265 270
Glu Gly Gly Arg Leu Trp Thr Pro Gly Gln Ser Gly Val Ser Ser Pro
275 280 285 Cys Asn Asn Arg
Ala Gly Glu Glu Glu Arg Leu Leu Pro Pro Phe Gln 290
295 300 Glu Gly Met Asp Ala Ser Asp Glu
Phe Ala Phe Gly Ser Val Ala Val 305 310
315 320 Lys Pro Trp Gln Gly Glu Arg Ile His Glu Glu Cys
Gly Gly Glu Ile 325 330
335 Gly Ser Asp Asp Leu Glu Leu Thr Leu Gly Ser Phe Ser Ser Ser Ser
340 345 350 Ser Lys Leu
Arg Ser Asp Arg Glu Pro Leu Phe Ser Val Lys Glu 355
360 365
SEQ ID NO: 290, length 987, DNA, Populus trichocarpa
atgacttcag gtacgagact accaacatgg aaagaaagag agaacaacaa gagaagagaa
60aggagaagaa gagccattgc agcaaagata ttttctgggc taagaatgta tggcaactac
120aagcttccaa aacattgtga caataatgaa gtccttaaag ccctctgcaa tgaggctggc
180tggaccgtcg agcccgatgg cactactttc cgtaagggat gcaaacctgt agaacgcatg
240gacattcttg gtgtttctgc aacaacaagt ccatgctcct cgtaccaccc tagcccttgt
300gcttcctaca acccaagtcc tggatcctct tcctttccca gtccagcttc atcctcatac
360gctgctaatg ccaatatgga ttgcaattcc ctcataccat ggctcaaaaa cctctcatca
420gcatcttcat cagcttcctc ctctaagttt cctcatctct acatccatgg tggctccata
480agtgctcctg ttactcctcc tttgagctct ccaacagctc gaaccgctcg aataaaagct
540gactgggaag accaatctat ccgcccaggc tggggtgggc agcactactc cttcttgcca
600tcttcaactc caccgagtcc tggccgccaa attgttcccg atccagaatg gtttaggggg
660attcgaatac cacaaggcgg tccaacgtct cccacattca gcctagttgc ctccaaccca
720tttggcttca aggaagaggc ttttggtggt ggtggatcca atggtggatc ccgcatgtgg
780actccaggtc aaagtggtac atgttcacct gccattgcag ctggctctga tcatacagct
840gatattccca tggccgagat ttcagatgag tttgcatttc gatgtaatgc aactggtcta
900gtgaagccat gggaaggaga gaggatccat gaagaatgtg gatcagacga tctagagctt
960acgcttggga actcaagaac caggtga
987
SEQ ID NO: 291, length 328, PRT, Populus trichocarpa
Met Thr Ser Gly Thr Arg Leu Pro Thr
Trp Lys Glu Arg Glu Asn Asn 1 5 10
15 Lys Arg Arg Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile
Phe Ser 20 25 30
Gly Leu Arg Met Tyr Gly Asn Tyr Lys Leu Pro Lys His Cys Asp Asn
35 40 45 Asn Glu Val Leu
Lys Ala Leu Cys Asn Glu Ala Gly Trp Thr Val Glu 50
55 60 Pro Asp Gly Thr Thr Phe Arg Lys
Gly Cys Lys Pro Val Glu Arg Met 65 70
75 80 Asp Ile Leu Gly Val Ser Ala Thr Thr Ser Pro Cys
Ser Ser Tyr His 85 90
95 Pro Ser Pro Cys Ala Ser Tyr Asn Pro Ser Pro Gly Ser Ser Ser Phe
100 105 110 Pro Ser Pro
Ala Ser Ser Ser Tyr Ala Ala Asn Ala Asn Met Asp Cys 115
120 125 Asn Ser Leu Ile Pro Trp Leu Lys
Asn Leu Ser Ser Ala Ser Ser Ser 130 135
140 Ala Ser Ser Ser Lys Phe Pro His Leu Tyr Ile His Gly
Gly Ser Ile 145 150 155
160 Ser Ala Pro Val Thr Pro Pro Leu Ser Ser Pro Thr Ala Arg Thr Ala
165 170 175 Arg Ile Lys Ala
Asp Trp Glu Asp Gln Ser Ile Arg Pro Gly Trp Gly 180
185 190 Gly Gln His Tyr Ser Phe Leu Pro Ser
Ser Thr Pro Pro Ser Pro Gly 195 200
205 Arg Gln Ile Val Pro Asp Pro Glu Trp Phe Arg Gly Ile Arg
Ile Pro 210 215 220
Gln Gly Gly Pro Thr Ser Pro Thr Phe Ser Leu Val Ala Ser Asn Pro 225
230 235 240 Phe Gly Phe Lys Glu
Glu Ala Phe Gly Gly Gly Gly Ser Asn Gly Gly 245
250 255 Ser Arg Met Trp Thr Pro Gly Gln Ser Gly
Thr Cys Ser Pro Ala Ile 260 265
270 Ala Ala Gly Ser Asp His Thr Ala Asp Ile Pro Met Ala Glu Ile
Ser 275 280 285 Asp
Glu Phe Ala Phe Arg Cys Asn Ala Thr Gly Leu Val Lys Pro Trp 290
295 300 Glu Gly Glu Arg Ile His
Glu Glu Cys Gly Ser Asp Asp Leu Glu Leu 305 310
315 320 Thr Leu Gly Asn Ser Arg Thr Arg
325
SEQ ID NO: 292, length 1017, DNA, Populus trichocarpa
atgacttcag
gtacgagact accaacatgg aaagaaagag agaacaacaa gagaagagaa 60aggagaagaa
gagccattgc agcaaagata ttttctgggc taagaatgta tggcaactac 120aagcttccaa
aacattgtga caataatgaa gtccttaaag ccctctgcaa tgaggctggc 180tggaccgtcg
agcccgatgg cactactttc cgtaagggat gcaaacctgt agaacgcatg 240gacattcttg
gtgtttctgc aacaacaagt ccatgctcct cgtaccaccc tagcccttgt 300gcttcctaca
acccaagtcc tggatcctct tcctttccca gtccagcttc atcctcatac 360gctgctaatg
ccaatatgga ttgcaattcc ctcataccat ggctcaaaaa cctctcatca 420gcatcttcat
cagcttcctc ctctaagttt cctcatctct acatccatgg tggctccata 480agtgctcctg
ttactcctcc tttgagctct ccaacagctc gaaccgctcg aataaaagct 540gactgggaag
accaatctat ccgcccaggc tggggtgggc agcactactc cttcttgcca 600tcttcaactc
caccgagtcc tggccgccaa attgttcccg atccagaatg gtttaggggg 660gttcgaatgc
cacaaggcgg tccaacgtct cccacattta gcctagttgc ctccaaccca 720tttggcttca
aggaagaggc ttttggtggt ggtggatcca atggtggatc ccgcatgtgg 780actccaggtc
aaagtggtac atgttcacct gccattgcag ctggctctga tcatacagct 840gatattccca
tggccgagat ttcagatgag tttgcatttc gatgtaatgc tactggtcta 900gtgaagccat
gggaaggaga gaggatccat gaagaatgtg gatcagacga tctagagcta 960acgcttggga
actcaagaac cagataaaca attgcaaaaa tgaaaattcc cgagtga
1017
SEQ ID NO: 293, length 328, PRT, Populus trichocarpa
Met Thr Ser Gly Thr Arg Leu Pro Thr
Trp Lys Glu Arg Glu Asn Asn 1 5 10
15 Lys Arg Arg Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile
Phe Ser 20 25 30
Gly Leu Arg Met Tyr Gly Asn Tyr Lys Leu Pro Lys His Cys Asp Asn
35 40 45 Asn Glu Val Leu
Lys Ala Leu Cys Asn Glu Ala Gly Trp Thr Val Glu 50
55 60 Pro Asp Gly Thr Thr Phe Arg Lys
Gly Cys Lys Pro Val Glu Arg Met 65 70
75 80 Asp Ile Leu Gly Val Ser Ala Thr Thr Ser Pro Cys
Ser Ser Tyr His 85 90
95 Pro Ser Pro Cys Ala Ser Tyr Asn Pro Ser Pro Gly Ser Ser Ser Phe
100 105 110 Pro Ser Pro
Ala Ser Ser Ser Tyr Ala Ala Asn Ala Asn Met Asp Cys 115
120 125 Asn Ser Leu Ile Pro Trp Leu Lys
Asn Leu Ser Ser Ala Ser Ser Ser 130 135
140 Ala Ser Ser Ser Lys Phe Pro His Leu Tyr Ile His Gly
Gly Ser Ile 145 150 155
160 Ser Ala Pro Val Thr Pro Pro Leu Ser Ser Pro Thr Ala Arg Thr Ala
165 170 175 Arg Ile Lys Ala
Asp Trp Glu Asp Gln Ser Ile Arg Pro Gly Trp Gly 180
185 190 Gly Gln His Tyr Ser Phe Leu Pro Ser
Ser Thr Pro Pro Ser Pro Gly 195 200
205 Arg Gln Ile Val Pro Asp Pro Glu Trp Phe Arg Gly Val Arg
Met Pro 210 215 220
Gln Gly Gly Pro Thr Ser Pro Thr Phe Ser Leu Val Ala Ser Asn Pro 225
230 235 240 Phe Gly Phe Lys Glu
Glu Ala Phe Gly Gly Gly Gly Ser Asn Gly Gly 245
250 255 Ser Arg Met Trp Thr Pro Gly Gln Ser Gly
Thr Cys Ser Pro Ala Ile 260 265
270 Ala Ala Gly Ser Asp His Thr Ala Asp Ile Pro Met Ala Glu Ile
Ser 275 280 285 Asp
Glu Phe Ala Phe Arg Cys Asn Ala Thr Gly Leu Val Lys Pro Trp 290
295 300 Glu Gly Glu Arg Ile His
Glu Glu Cys Gly Ser Asp Asp Leu Glu Leu 305 310
315 320 Thr Leu Gly Asn Ser Arg Thr Arg
325
SEQ ID NO: 294, length 954, DNA, Populus trichocarpa
atgacgtcag atggggcaac
ctcgacgtca gctgcaatgg cggcagctac aaggaggaag 60ccatcgtgga gggagaggga
gaataatagg aggagagaga ggaggaggag agctatagct 120gcaaaaatat ttactgggtt
aagggctcaa gggaattata atttgcctaa atattgtgac 180aataatgagg tattgaaagc
tctctgtgct gaggctggtt gggttgttga agaagatgga 240actacttatc gcaagggaca
caggccacct cctattgaga tagtaggtac ttcgacgagg 300gtaaccccat actcatccca
aaatcctagt ccactatctt cgttgtttcc cagtccaatt 360ccttcctatc aagccagtcc
ctcctcctcc tcgtttccta gccctactcg tggcgataac 420aatgcctctt ctaatctcct
tccattcctt cgaagtgcca ttccattgtc tcttcctcct 480cttcgaatct caaacagtgc
gcctgtaacc ccacctctct cgtccccgac ctcaagaaac 540cccaagccaa ttcccaactg
ggattttatt gccaaacaat ccatggcctc ctttagttac 600ccattcaatg cagtgtctgc
tccggctagc ccaactcatc gtcaatttca tgctccagcc 660actatacctg aatgtgatga
gtctgataca tccactgtgg agtctggtca gtggataagc 720tttcaaaagt ttgcgccttc
tgtggctgca gcaatgccaa cctctcctac ctataatctt 780gtgatacccg tggctcagca
aatttcgtct agcaatttgg tcaaagagag tgcagtgccc 840atggattttg agtttggtag
tgaacaggtg aaaccatggg aaggagagag gattcatgaa 900gtaggattag atgatctaga
gctcacactt ggaagtggca aggctcagag ttag 954
SEQ ID NO: 295, length 317, PRT, Populus trichocarpa
Met Thr Ser Asp Gly Ala Thr Ser Thr Ser Ala Ala Met Ala
Ala Ala 1 5 10 15
Thr Arg Arg Lys Pro Ser Trp Arg Glu Arg Glu Asn Asn Arg Arg Arg
20 25 30 Glu Arg Arg Arg Arg
Ala Ile Ala Ala Lys Ile Phe Thr Gly Leu Arg 35
40 45 Ala Gln Gly Asn Tyr Asn Leu Pro Lys
Tyr Cys Asp Asn Asn Glu Val 50 55
60 Leu Lys Ala Leu Cys Ala Glu Ala Gly Trp Val Val Glu
Glu Asp Gly 65 70 75
80 Thr Thr Tyr Arg Lys Gly His Arg Pro Pro Pro Ile Glu Ile Val Gly
85 90 95 Thr Ser Thr Arg
Val Thr Pro Tyr Ser Ser Gln Asn Pro Ser Pro Leu 100
105 110 Ser Ser Leu Phe Pro Ser Pro Ile Pro
Ser Tyr Gln Ala Ser Pro Ser 115 120
125 Ser Ser Ser Phe Pro Ser Pro Thr Arg Gly Asp Asn Asn Ala
Ser Ser 130 135 140
Asn Leu Leu Pro Phe Leu Arg Ser Ala Ile Pro Leu Ser Leu Pro Pro 145
150 155 160 Leu Arg Ile Ser Asn
Ser Ala Pro Val Thr Pro Pro Leu Ser Ser Pro 165
170 175 Thr Ser Arg Asn Pro Lys Pro Ile Pro Asn
Trp Asp Phe Ile Ala Lys 180 185
190 Gln Ser Met Ala Ser Phe Ser Tyr Pro Phe Asn Ala Val Ser Ala
Pro 195 200 205 Ala
Ser Pro Thr His Arg Gln Phe His Ala Pro Ala Thr Ile Pro Glu 210
215 220 Cys Asp Glu Ser Asp Thr
Ser Thr Val Glu Ser Gly Gln Trp Ile Ser 225 230
235 240 Phe Gln Lys Phe Ala Pro Ser Val Ala Ala Ala
Met Pro Thr Ser Pro 245 250
255 Thr Tyr Asn Leu Val Ile Pro Val Ala Gln Gln Ile Ser Ser Ser Asn
260 265 270 Leu Val
Lys Glu Ser Ala Val Pro Met Asp Phe Glu Phe Gly Ser Glu 275
280 285 Gln Val Lys Pro Trp Glu Gly
Glu Arg Ile His Glu Val Gly Leu Asp 290 295
300 Asp Leu Glu Leu Thr Leu Gly Ser Gly Lys Ala Gln
Ser 305 310 315
SEQ ID NO: 296, length 996, DNA, Populus trichocarpa
atgacagccg gtggatcctc agcgaggtta ccaacgtgga aagaaagaga
gaataacatg 60agaagagaaa gaaggagaag agctatagct gccaagatct atacaggact
taggactcaa 120ggaaattata agttaccaaa acattgtgat aataatgaag tcttgaaagc
tctttgtgct 180gaagctggtt ggattgtcga agaagacggt accacttatc gcaagggctg
caagccacct 240ccatctgaga ttgctggcat gccagcaaac atcagtgcat gctcctcaat
tcaaccaagc 300ccgcaatcct caaattttgc aagccctgtg ccttcctacc atgctagtcc
ctcatcctcc 360tcattcccaa gtcctacttg tttcgatgga aactcctcca cgtacctcct
ccctttcctc 420cgaaacatag cttccatccc cacaaacctc ccgcctctta gaatatccaa
tagtgctcct 480gtaaccccac cacgttcttc ccctacatgt agaagttcaa agcggaaagt
tgactgggaa 540tccctctcaa atggctccct aaactcgttt cgccatcccc tttttgcagc
ttctgctcct 600tcaagtccta cacggcgccc ccatctaaca cctgccacaa ttccagaatg
tgacgagtct 660gatgcctcta ccgtggactc tggccgctgg ttgagttttc aggcagtggc
accccaagta 720gcccctcctt caccaacatt taatcttgtt aaaccagtgg atcaacagtg
tgcttttcag 780attggagttg ataggcatga aggtttgagc tggggagtag cagcagaaag
ggggagaggt 840gctgagtttg agtttgagaa ttgtagggtg aagccatggg agggtgagag
gattcatgag 900attggggtgg atgatcttga gctcacactt ggaagtggaa aggtccatgg
tcaagcctcc 960attgatgatc tagcctggga acgtagcaac aagtag
996
SEQ ID NO: 297, length 331, PRT, Populus trichocarpa
Met Thr Ala Gly Gly Ser
Ser Ala Arg Leu Pro Thr Trp Lys Glu Arg 1 5
10 15 Glu Asn Asn Met Arg Arg Glu Arg Arg Arg Arg
Ala Ile Ala Ala Lys 20 25
30 Ile Tyr Thr Gly Leu Arg Thr Gln Gly Asn Tyr Lys Leu Pro Lys
His 35 40 45 Cys
Asp Asn Asn Glu Val Leu Lys Ala Leu Cys Ala Glu Ala Gly Trp 50
55 60 Ile Val Glu Glu Asp Gly
Thr Thr Tyr Arg Lys Gly Cys Lys Pro Pro 65 70
75 80 Pro Ser Glu Ile Ala Gly Met Pro Ala Asn Ile
Ser Ala Cys Ser Ser 85 90
95 Ile Gln Pro Ser Pro Gln Ser Ser Asn Phe Ala Ser Pro Val Pro Ser
100 105 110 Tyr His
Ala Ser Pro Ser Ser Ser Ser Phe Pro Ser Pro Thr Cys Phe 115
120 125 Asp Gly Asn Ser Ser Thr Tyr
Leu Leu Pro Phe Leu Arg Asn Ile Ala 130 135
140 Ser Ile Pro Thr Asn Leu Pro Pro Leu Arg Ile Ser
Asn Ser Ala Pro 145 150 155
160 Val Thr Pro Pro Arg Ser Ser Pro Thr Cys Arg Ser Ser Lys Arg Lys
165 170 175 Val Asp Trp
Glu Ser Leu Ser Asn Gly Ser Leu Asn Ser Phe Arg His 180
185 190 Pro Leu Phe Ala Ala Ser Ala Pro
Ser Ser Pro Thr Arg Arg Pro His 195 200
205 Leu Thr Pro Ala Thr Ile Pro Glu Cys Asp Glu Ser Asp
Ala Ser Thr 210 215 220
Val Asp Ser Gly Arg Trp Leu Ser Phe Gln Ala Val Ala Pro Gln Val 225
230 235 240 Ala Pro Pro Ser
Pro Thr Phe Asn Leu Val Lys Pro Val Asp Gln Gln 245
250 255 Cys Ala Phe Gln Ile Gly Val Asp Arg
His Glu Gly Leu Ser Trp Gly 260 265
270 Val Ala Ala Glu Arg Gly Arg Gly Ala Glu Phe Glu Phe Glu
Asn Cys 275 280 285
Arg Val Lys Pro Trp Glu Gly Glu Arg Ile His Glu Ile Gly Val Asp 290
295 300 Asp Leu Glu Leu Thr
Leu Gly Ser Gly Lys Val His Gly Gln Ala Ser 305 310
315 320 Ile Asp Asp Leu Ala Trp Glu Arg Ser Asn
Lys 325 330
SEQ ID NO: 298, length 954, DNA, Populus trichocarpa
atgacgtcag atggggccac ttctacatca gctgcagcgg cggcaactac
gaggaggaag 60ccgtcgtgga gggagagaga gaataatagg aggagagaga ggaggagaag
agccatagct 120gcaaaaatat ttactgggtt aagggctcaa gggaattata atttgcccaa
atattgtgac 180aataatgagg tgttaaaagc tctctgtgct gaggctggtt gggttgttga
agaggacggg 240actacttatc gcaagggaca caggccacct ccaatagaga tagtaggttc
atcaatgaga 300gtaaccccat actcatccca aaatccgagc ccgctatctt catcgtttcc
cagcccgatt 360ccttcctatc aagtcagtcc ctcctcctcg tcatttccta gccccactcg
tggtgataac 420aatgtctctt ctaatctcct tccattcctt caaagtgcca ttccgttgtc
tcttcctcct 480ctccgaatct caaacagtgc acctgtaacc ccacctctct cgtccccgac
ctcaagaaat 540cccaagccaa tacctaactg ggattttatt gctaaacaat ccatggcatc
cttcagttac 600cctttcaatg cagtgtctgc cccagctagc ccaactcacc gtcagtttca
tgctccagcc 660actatacctg aatgtgacga gtctgattca tccactgttg agtctggtca
gtggataagc 720tttcaaaagt ttgctccttc tgtggctgca gcaatgccca cctctcctac
ctataatctt 780gtgaaacctg tggctcggca aattttgtcc aacaatctgg tcaaagataa
tggaatgtca 840atggattttg agtttggtag cgaacaggtg aaaccatggg aaggagagag
gattcatgaa 900gtaggattag atgatctaga gctcacactt ggaggtggca aggctcggag
ttag 954
SEQ ID NO: 299, length 317, PRT, Populus trichocarpa
Met Thr Ser Asp Gly Ala
Thr Ser Thr Ser Ala Ala Ala Ala Ala Thr 1 5
10 15 Thr Arg Arg Lys Pro Ser Trp Arg Glu Arg Glu
Asn Asn Arg Arg Arg 20 25
30 Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile Phe Thr Gly Leu
Arg 35 40 45 Ala
Gln Gly Asn Tyr Asn Leu Pro Lys Tyr Cys Asp Asn Asn Glu Val 50
55 60 Leu Lys Ala Leu Cys Ala
Glu Ala Gly Trp Val Val Glu Glu Asp Gly 65 70
75 80 Thr Thr Tyr Arg Lys Gly His Arg Pro Pro Pro
Ile Glu Ile Val Gly 85 90
95 Ser Ser Met Arg Val Thr Pro Tyr Ser Ser Gln Asn Pro Ser Pro Leu
100 105 110 Ser Ser
Ser Phe Pro Ser Pro Ile Pro Ser Tyr Gln Val Ser Pro Ser 115
120 125 Ser Ser Ser Phe Pro Ser Pro
Thr Arg Gly Asp Asn Asn Val Ser Ser 130 135
140 Asn Leu Leu Pro Phe Leu Gln Ser Ala Ile Pro Leu
Ser Leu Pro Pro 145 150 155
160 Leu Arg Ile Ser Asn Ser Ala Pro Val Thr Pro Pro Leu Ser Ser Pro
165 170 175 Thr Ser Arg
Asn Pro Lys Pro Ile Pro Asn Trp Asp Phe Ile Ala Lys 180
185 190 Gln Ser Met Ala Ser Phe Ser Tyr
Pro Phe Asn Ala Val Ser Ala Pro 195 200
205 Ala Ser Pro Thr His Arg Gln Phe His Ala Pro Ala Thr
Ile Pro Glu 210 215 220
Cys Asp Glu Ser Asp Ser Ser Thr Val Glu Ser Gly Gln Trp Ile Ser 225
230 235 240 Phe Gln Lys Phe
Ala Pro Ser Val Ala Ala Ala Met Pro Thr Ser Pro 245
250 255 Thr Tyr Asn Leu Val Lys Pro Val Ala
Arg Gln Ile Leu Ser Asn Asn 260 265
270 Leu Val Lys Asp Asn Gly Met Ser Met Asp Phe Glu Phe Gly
Ser Glu 275 280 285
Gln Val Lys Pro Trp Glu Gly Glu Arg Ile His Glu Val Gly Leu Asp 290
295 300 Asp Leu Glu Leu Thr
Leu Gly Gly Gly Lys Ala Arg Ser 305 310
315
SEQ ID NO: 300, length 996, DNA, Populus trichocarpa
atgacgtcgg ggacgagaat
gccgacgtgg aaggagcgag agaacaataa gagaagagaa 60aggagaagga gagcgatcgc
agcgaagatc tattcaggac ttagaatgta cgggaattat 120aagctaccaa aacactgtga
caataatgaa gtgcttaaag ctctctgtaa agaagctggt 180tggactgttg aagaggatgg
cactacttat cgaaagggat gcaaacctgt ggaacgcatg 240gatattatgg gaggttctgc
atcagctagt ccatgttcat cataccatcg aagtccatgt 300gcatcctata atccaagccc
tgcctcatct tcttttccaa gtcctgtttc atcccattat 360gctgccaacg ctaatggtaa
tgctgatccc aattccctca tcccatggct caaaaacctc 420tcctctggct catcatcagc
ctctcccaag catcctcacc atctcttcat tcacactggt 480tccataagcg ctcctgttac
ccctccattg agctccccaa ctgcacgaac cccacgtacc 540aagaatgact gggatgacgc
agctgctggc caatcttgga tgggacagaa ctactcattt 600atgccctcat ctatgccctc
gtctacccca cctagtcctg gccgtcacgt cctaccagat 660tcaggttggc tagctggtat
ccaaattccc caaagtggac cctcatcacc aacatttagt 720cttgtatcac ggaatccatt
tggctttaga gaggaggctt tatcaggtgc aggatcacga 780atgtggactc ctggacaaag
tgggacatgc tctccagcaa ttccggcagg cattgatcag 840acagctgatg tgccaatgtc
ggacagtatg gcagccgagt ttgcatttgg aagcaatgca 900gcaggattgg tgaaaccttg
ggaaggagag aggatccatg aggaatgtgt ttctgatgat 960cttgagctta cactaggaaa
ctcaaatacc agatag 996
SEQ ID NO: 301, length 331, PRT, Populus trichocarpa
Met Thr Ser Gly Thr Arg Met Pro Thr Trp Lys Glu Arg Glu
Asn Asn 1 5 10 15
Lys Arg Arg Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile Tyr Ser
20 25 30 Gly Leu Arg Met Tyr
Gly Asn Tyr Lys Leu Pro Lys His Cys Asp Asn 35
40 45 Asn Glu Val Leu Lys Ala Leu Cys Lys
Glu Ala Gly Trp Thr Val Glu 50 55
60 Glu Asp Gly Thr Thr Tyr Arg Lys Gly Cys Lys Pro Val
Glu Arg Met 65 70 75
80 Asp Ile Met Gly Gly Ser Ala Ser Ala Ser Pro Cys Ser Ser Tyr His
85 90 95 Arg Ser Pro Cys
Ala Ser Tyr Asn Pro Ser Pro Ala Ser Ser Ser Phe 100
105 110 Pro Ser Pro Val Ser Ser His Tyr Ala
Ala Asn Ala Asn Gly Asn Ala 115 120
125 Asp Pro Asn Ser Leu Ile Pro Trp Leu Lys Asn Leu Ser Ser
Gly Ser 130 135 140
Ser Ser Ala Ser Pro Lys His Pro His His Leu Phe Ile His Thr Gly 145
150 155 160 Ser Ile Ser Ala Pro
Val Thr Pro Pro Leu Ser Ser Pro Thr Ala Arg 165
170 175 Thr Pro Arg Thr Lys Asn Asp Trp Asp Asp
Ala Ala Ala Gly Gln Ser 180 185
190 Trp Met Gly Gln Asn Tyr Ser Phe Met Pro Ser Ser Met Pro Ser
Ser 195 200 205 Thr
Pro Pro Ser Pro Gly Arg His Val Leu Pro Asp Ser Gly Trp Leu 210
215 220 Ala Gly Ile Gln Ile Pro
Gln Ser Gly Pro Ser Ser Pro Thr Phe Ser 225 230
235 240 Leu Val Ser Arg Asn Pro Phe Gly Phe Arg Glu
Glu Ala Leu Ser Gly 245 250
255 Ala Gly Ser Arg Met Trp Thr Pro Gly Gln Ser Gly Thr Cys Ser Pro
260 265 270 Ala Ile
Pro Ala Gly Ile Asp Gln Thr Ala Asp Val Pro Met Ser Asp 275
280 285 Ser Met Ala Ala Glu Phe Ala
Phe Gly Ser Asn Ala Ala Gly Leu Val 290 295
300 Lys Pro Trp Glu Gly Glu Arg Ile His Glu Glu Cys
Val Ser Asp Asp 305 310 315
320 Leu Glu Leu Thr Leu Gly Asn Ser Asn Thr Arg 325
330
SEQ ID NO: 302, length 954, DNA, Populus trichocarpa
atgacagccg gtggatcctc
agggaggtta ccaacatgga aggaaagaga gaataacaag 60agaagagaaa gaaggagaag
agctatagct gctaagatat atacaggtct tagaactcaa 120gggaatttta agttaccaaa
acactgtgat aataatgaag tcttgaaagc actttgtgct 180gaagctggtt ggattgttga
agaagatggt accacttatc gcaagggctg caagccgcct 240ccaactgaga ttgcaggcac
tccaacaaat atcagtgcat gttcctcaat tcaaccaagt 300ccacaatcct ccaattttcc
aagccctgta gcttcctacc atgctagtcc aacatcctcc 360tcattcccaa gcccctctcg
tttcgatgga aacccctcca cttacctcct cccattcctc 420cgaaacatag cttccatccc
cacaaacctc cctcctctta gaatatccaa tagtgctcct 480gtaaccccac cactttcttc
ccctacatct agaggttcga aacggaaagc tgactgggaa 540tccctctcaa atggcaccct
taactcgctt caccatcccc ttttggcagc ttctgcccca 600tcaagtccta cacggcgcca
ccatctaacg cctgccacaa taccagaatg tgacgagtct 660gatgcttcca ctgtggactc
tggccgctgg gtgagttttc tggcaggggc accccatgta 720gctcctccct cgccaacttt
taatcttgtt aaaccagtgg cacaacagag tggttttcag 780gatggagttg ataggcatgg
tggtttaagc tggggggcag cagcagagag ggggagaggt 840gcggagtttg agtttgagaa
ttgtagggtg aagccatggg agggcgagag gattcatgag 900attggggtag atgatcttga
gctcacactt ggaggtggaa aagctcgtgg ttaa 954
SEQ ID NO: 303, length 317, PRT, Populus trichocarpa
Met Thr Ala Gly Gly Ser Ser Gly Arg Leu Pro Thr Trp Lys
Glu Arg 1 5 10 15
Glu Asn Asn Lys Arg Arg Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys
20 25 30 Ile Tyr Thr Gly Leu
Arg Thr Gln Gly Asn Phe Lys Leu Pro Lys His 35
40 45 Cys Asp Asn Asn Glu Val Leu Lys Ala
Leu Cys Ala Glu Ala Gly Trp 50 55
60 Ile Val Glu Glu Asp Gly Thr Thr Tyr Arg Lys Gly Cys
Lys Pro Pro 65 70 75
80 Pro Thr Glu Ile Ala Gly Thr Pro Thr Asn Ile Ser Ala Cys Ser Ser
85 90 95 Ile Gln Pro Ser
Pro Gln Ser Ser Asn Phe Pro Ser Pro Val Ala Ser 100
105 110 Tyr His Ala Ser Pro Thr Ser Ser Ser
Phe Pro Ser Pro Ser Arg Phe 115 120
125 Asp Gly Asn Pro Ser Thr Tyr Leu Leu Pro Phe Leu Arg Asn
Ile Ala 130 135 140
Ser Ile Pro Thr Asn Leu Pro Pro Leu Arg Ile Ser Asn Ser Ala Pro 145
150 155 160 Val Thr Pro Pro Leu
Ser Ser Pro Thr Ser Arg Gly Ser Lys Arg Lys 165
170 175 Ala Asp Trp Glu Ser Leu Ser Asn Gly Thr
Leu Asn Ser Leu His His 180 185
190 Pro Leu Leu Ala Ala Ser Ala Pro Ser Ser Pro Thr Arg Arg His
His 195 200 205 Leu
Thr Pro Ala Thr Ile Pro Glu Cys Asp Glu Ser Asp Ala Ser Thr 210
215 220 Val Asp Ser Gly Arg Trp
Val Ser Phe Leu Ala Gly Ala Pro His Val 225 230
235 240 Ala Pro Pro Ser Pro Thr Phe Asn Leu Val Lys
Pro Val Ala Gln Gln 245 250
255 Ser Gly Phe Gln Asp Gly Val Asp Arg His Gly Gly Leu Ser Trp Gly
260 265 270 Ala Ala
Ala Glu Arg Gly Arg Gly Ala Glu Phe Glu Phe Glu Asn Cys 275
280 285 Arg Val Lys Pro Trp Glu Gly
Glu Arg Ile His Glu Ile Gly Val Asp 290 295
300 Asp Leu Glu Leu Thr Leu Gly Gly Gly Lys Ala Arg
Gly 305 310 315
SEQ ID NO: 304, length 1002, DNA, Populus trichocarpa
atgacgtcgg ggacgagaat gccgacgtgg aaggagcgag agaacaacaa
aagaagagaa 60aggagaagga gagcgattgc agcgaagatc tatgcaggac ttagaatgta
cgggagttat 120aagctaccaa aacactgtga caataatgaa gtgcttaaag ctctctgcaa
cgaagctggt 180tggactgttg aagaagacgg cactacttat cgaaagggat gcaaacctgt
ggaacgcatg 240gatattatag gtgggtctgc atcagctagt ccatgttcat cttaccatca
gagtccatgt 300gcatcctata atccaagtcc tgcctcatct tcgtttccta gtcctgtttc
atcccgttat 360gctgccaatg gtaatggtaa tgttgatgct gatgccaatt ccctcatccc
atggcttaga 420aacctctctt ccggctcatc ctcagcctca cccaagcatc caaaccatct
attcattcac 480actggttcca taagtgctcc cgtcacccct ccattgagct cccctactgc
acgaactccc 540cgtacaagaa atgactggga cgacccagct gctgggcaat cttggatggg
gcagaactac 600tcatttctgc cctcatctat gccctcgtct acaccaccta gccctggccg
tcaggttcta 660ccagattccg gctggctagc tggtattcaa atcccccaaa gcggaccctc
atcaccaaca 720tttagtcttg tatcccggaa tccatttggc tttaaagagg aggctttatc
aggtgcaggg 780tcgcgaatgt ggactcctgg acaaagcggg acatgctctc ctgcagttcc
ggcaggcatt 840gatcagacag ctgatgtgcc aatggcagac agtatggcag ctgagtttgc
atttggaagt 900aacacagcag ggttggtgaa accatgggaa ggagagagga tccatgagga
atgtgtttct 960gatgatcttg agcttacact tggaaactct agtaccaggt aa
1002
SEQ ID NO: 305, length 333, PRT, Populus trichocarpa
Met Thr Ser Gly Thr Arg
Met Pro Thr Trp Lys Glu Arg Glu Asn Asn 1 5
10 15 Lys Arg Arg Glu Arg Arg Arg Arg Ala Ile Ala
Ala Lys Ile Tyr Ala 20 25
30 Gly Leu Arg Met Tyr Gly Ser Tyr Lys Leu Pro Lys His Cys Asp
Asn 35 40 45 Asn
Glu Val Leu Lys Ala Leu Cys Asn Glu Ala Gly Trp Thr Val Glu 50
55 60 Glu Asp Gly Thr Thr Tyr
Arg Lys Gly Cys Lys Pro Val Glu Arg Met 65 70
75 80 Asp Ile Ile Gly Gly Ser Ala Ser Ala Ser Pro
Cys Ser Ser Tyr His 85 90
95 Gln Ser Pro Cys Ala Ser Tyr Asn Pro Ser Pro Ala Ser Ser Ser Phe
100 105 110 Pro Ser
Pro Val Ser Ser Arg Tyr Ala Ala Asn Gly Asn Gly Asn Val 115
120 125 Asp Ala Asp Ala Asn Ser Leu
Ile Pro Trp Leu Arg Asn Leu Ser Ser 130 135
140 Gly Ser Ser Ser Ala Ser Pro Lys His Pro Asn His
Leu Phe Ile His 145 150 155
160 Thr Gly Ser Ile Ser Ala Pro Val Thr Pro Pro Leu Ser Ser Pro Thr
165 170 175 Ala Arg Thr
Pro Arg Thr Arg Asn Asp Trp Asp Asp Pro Ala Ala Gly 180
185 190 Gln Ser Trp Met Gly Gln Asn Tyr
Ser Phe Leu Pro Ser Ser Met Pro 195 200
205 Ser Ser Thr Pro Pro Ser Pro Gly Arg Gln Val Leu Pro
Asp Ser Gly 210 215 220
Trp Leu Ala Gly Ile Gln Ile Pro Gln Ser Gly Pro Ser Ser Pro Thr 225
230 235 240 Phe Ser Leu Val
Ser Arg Asn Pro Phe Gly Phe Lys Glu Glu Ala Leu 245
250 255 Ser Gly Ala Gly Ser Arg Met Trp Thr
Pro Gly Gln Ser Gly Thr Cys 260 265
270 Ser Pro Ala Val Pro Ala Gly Ile Asp Gln Thr Ala Asp Val
Pro Met 275 280 285
Ala Asp Ser Met Ala Ala Glu Phe Ala Phe Gly Ser Asn Thr Ala Gly 290
295 300 Leu Val Lys Pro Trp
Glu Gly Glu Arg Ile His Glu Glu Cys Val Ser 305 310
315 320 Asp Asp Leu Glu Leu Thr Leu Gly Asn Ser
Ser Thr Arg 325 330
SEQ ID NO: 306, length 960, DNA, Populus trichocarpa
atgacttcag gtacgagact accaacatgg
aaagaaagag agaacaacaa gagaagagaa 60aggagaagaa gagccattgc agcaaagatc
ttttctgggc tacgaatgta tggcaacttc 120aagcttccaa agcactgtga caataatgaa
gtccttaaag ccctctgcaa tgaggctggt 180tgggccgtcg agcccgatgg caccacttac
cgcaagggat gcaaacctgc ggagcacatg 240gacattattg gtggttctgc tacagcaagc
ccatgctcct catacctccc tagcccctgt 300gcttcctata acccaagtcc tggatcctct
tcctttccca gtccagtttc atcctcctat 360gctgctaatg ccaatttgga tgacaattcc
ctcctcccgt ggctcaaaaa cctctcatcg 420gcttcctctt ctaagcttcc ccatctatac
atccatggtg gctctataag tgctcctgtt 480actcctccct tgagctcgcc aactgctaga
accccccgaa taaaaactgg ctgggaagac 540caaccaatcc acccaggctg gtgtgggcag
cactacttgc catcttcaac tccaccaagc 600cctggccgtc aaattgttcc tgatccagga
tggtttgctg ggattcgatt gccacaaggt 660ggtccaactt ctcccacatt cagcctggtt
gcctccaacc cgtttggctt caaggaagag 720gctttagctg gtggtgggtc ccgcatgtgg
actcctggtc aaagtggtac gtgttcacct 780gccattgcag ctggctctga ccagacagct
gatattccca tggcagaggt gatctcggac 840gagtttgcat tccgatgcaa tgcaactggg
ctagtgaagc catgggaagg ggagaggatc 900catgaagagt gtggatcaga tgatctagag
cttacacttg ggaactcaag aaccaggtga 960
SEQ ID NO: 307, length 319, PRT, Populus trichocarpa
Met Thr Ser Gly Thr Arg Leu Pro Thr Trp Lys Glu Arg Glu Asn Asn 1
5 10 15 Lys Arg Arg Glu
Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile Phe Ser 20
25 30 Gly Leu Arg Met Tyr Gly Asn Phe Lys
Leu Pro Lys His Cys Asp Asn 35 40
45 Asn Glu Val Leu Lys Ala Leu Cys Asn Glu Ala Gly Trp Ala
Val Glu 50 55 60
Pro Asp Gly Thr Thr Tyr Arg Lys Gly Cys Lys Pro Ala Glu His Met 65
70 75 80 Asp Ile Ile Gly Gly
Ser Ala Thr Ala Ser Pro Cys Ser Ser Tyr Leu 85
90 95 Pro Ser Pro Cys Ala Ser Tyr Asn Pro Ser
Pro Gly Ser Ser Ser Phe 100 105
110 Pro Ser Pro Val Ser Ser Ser Tyr Ala Ala Asn Ala Asn Leu Asp
Asp 115 120 125 Asn
Ser Leu Leu Pro Trp Leu Lys Asn Leu Ser Ser Ala Ser Ser Ser 130
135 140 Lys Leu Pro His Leu Tyr
Ile His Gly Gly Ser Ile Ser Ala Pro Val 145 150
155 160 Thr Pro Pro Leu Ser Ser Pro Thr Ala Arg Thr
Pro Arg Ile Lys Thr 165 170
175 Gly Trp Glu Asp Gln Pro Ile His Pro Gly Trp Cys Gly Gln His Tyr
180 185 190 Leu Pro
Ser Ser Thr Pro Pro Ser Pro Gly Arg Gln Ile Val Pro Asp 195
200 205 Pro Gly Trp Phe Ala Gly Ile
Arg Leu Pro Gln Gly Gly Pro Thr Ser 210 215
220 Pro Thr Phe Ser Leu Val Ala Ser Asn Pro Phe Gly
Phe Lys Glu Glu 225 230 235
240 Ala Leu Ala Gly Gly Gly Ser Arg Met Trp Thr Pro Gly Gln Ser Gly
245 250 255 Thr Cys Ser
Pro Ala Ile Ala Ala Gly Ser Asp Gln Thr Ala Asp Ile 260
265 270 Pro Met Ala Glu Val Ile Ser Asp
Glu Phe Ala Phe Arg Cys Asn Ala 275 280
285 Thr Gly Leu Val Lys Pro Trp Glu Gly Glu Arg Ile His
Glu Glu Cys 290 295 300
Gly Ser Asp Asp Leu Glu Leu Thr Leu Gly Asn Ser Arg Thr Arg 305
310 315
SEQ ID NO: 308, length 1023, DNA, Solanum lycopersicum
atgacgtcgg gaacaaggat gccgacgtgg aaggagcggg agaataataa
gaggagagaa 60cgacggcgaa aagccattgc agctaaaata ttcgccggat taagaatgta
cggtaactat 120caacttccta aacactgcga taataatgaa gtactaaaag ccctctgcaa
tgaagccgga 180tggacagttg agcccgatgg caccacctac cgcaagggct gcaaaccaat
ggagagattg 240gactttttag gtggttcaac atcattaagt ccatgttcat cttaccagcc
aagccccttc 300acttccaaca acccaagccc tgcttcctct tcctttccta gtccagcttc
gtcctcatac 360gcagcaaacc tgaacatgga cggaaaatcc ctcatcccgt ggcttaaaaa
cctctcctct 420ggatcatcgt ccgcttcctc ctccaaactt cctaactttc acatccatac
tggctccatc 480agtgctccag tgactcctcc tttcagctca ccaactgccc ggacccctcg
gattaaaaca 540gatgctggct gggctggatt tcgttaccct taccttccat catccacacc
agctagccct 600ggtcgtcaga atttcattaa tgcagaatgt tttgctggaa taagtggacc
tccttctcca 660acatatagtc ttgtttcgcc aaatccgttt gggttcaaaa tggatggtct
atcgcgtggt 720ggatctcgaa tgtgcactcc tggacagagt ggtgcatgtt cacctgctat
tgctgcagga 780ttagatcata atgccgatgt tcccatggct gaagtgatga tctctgatga
gtttgcattc 840ggaagcaacg tggcagggat ggtgaagccg tgggaaggag agaggatcca
tgaggactgt 900gttccagatg atcttgagct tactcttggg agttcaaaga caagataaaa
cttggaagtg 960tgcaagcaag cagaattggc tcttattgat gtcaaggtgc gacagtcgac
ggggttgtgt 1020taa
1023
SEQ ID NO: 309, length 339, PRT, Solanum lycopersicum
Met Thr Ser Gly Thr Arg
Met Pro Thr Trp Lys Glu Arg Glu Asn Asn 1 5
10 15 Lys Arg Arg Glu Arg Arg Arg Lys Ala Ile Ala
Ala Lys Ile Phe Ala 20 25
30 Gly Leu Arg Met Tyr Gly Asn Tyr Gln Leu Pro Lys His Cys Asp
Asn 35 40 45 Asn
Glu Val Leu Lys Ala Leu Cys Asn Glu Ala Gly Trp Thr Val Glu 50
55 60 Pro Asp Gly Thr Thr Tyr
Arg Lys Gly Cys Lys Pro Met Glu Arg Leu 65 70
75 80 Asp Phe Leu Gly Gly Ser Thr Ser Leu Ser Pro
Cys Ser Ser Tyr Gln 85 90
95 Pro Ser Pro Phe Thr Ser Asn Asn Pro Ser Pro Ala Ser Ser Ser Phe
100 105 110 Pro Ser
Pro Ala Ser Ser Ser Tyr Ala Ala Asn Leu Asn Met Asp Gly 115
120 125 Lys Ser Leu Ile Pro Trp Leu
Lys Asn Leu Ser Ser Gly Ser Ser Ser 130 135
140 Ala Ser Ser Ser Lys Leu Pro Asn Phe His Ile His
Thr Gly Ser Ile 145 150 155
160 Ser Ala Pro Val Thr Pro Pro Phe Ser Ser Pro Thr Ala Arg Thr Pro
165 170 175 Arg Ile Lys
Thr Asp Ala Gly Trp Ala Gly Phe Arg Tyr Pro Tyr Leu 180
185 190 Pro Ser Ser Thr Pro Ala Ser Pro
Gly Arg Gln Asn Phe Ile Asn Ala 195 200
205 Glu Cys Phe Ala Gly Ile Ser Gly Pro Pro Ser Pro Thr
Tyr Ser Leu 210 215 220
Val Ser Pro Asn Pro Phe Gly Phe Lys Met Asp Gly Leu Ser Arg Gly 225
230 235 240 Gly Ser Arg Met
Cys Thr Pro Gly Gln Ser Gly Ala Cys Ser Pro Ala 245
250 255 Ile Ala Ala Gly Leu Asp His Asn Ala
Asp Val Pro Met Ala Glu Val 260 265
270 Met Ile Ser Asp Glu Phe Ala Phe Gly Ser Asn Val Ala Gly
Met Val 275 280 285
Lys Pro Trp Glu Gly Glu Arg Ile His Glu Asp Cys Val Pro Asp Asp 290
295 300 Leu Glu Leu Thr Leu
Gly Ser Ser Lys Thr Arg Asn Leu Glu Val Cys 305 310
315 320 Lys Gln Ala Glu Leu Ala Leu Ile Asp Val
Lys Val Arg Gln Ser Thr 325 330
335 Gly Leu Cys 310924DNAVitis vinifera 310atgacgtcag
agaggacgcc gacgaggagg aaggcgtcgt ggaaggagag ggagaacaac 60atgaggagag
agaggaggag gagagccata gcggcaaaga tatatgcggg cctcagggct 120cagggcaact
atcgtcttcc aaaacactgc gataacaacg aggtcctcaa ggctctctgc 180tccgaagctg
gttggaccgt tgaagatgac ggcaccacct atcgcaaggg atgcaagcct 240cccccctcaa
ctgagattgc aggaacttcc acaaacaaca ctccctgctc ttcccagaaa 300ccaagcccac
catcttcctc ctttccaagc gcattcgctt cctaccaacc cagtccctca 360tcctcaaacc
tgtctttcat ggatgccaat gcctctctca atctccttcc atttctctac 420aagtctatcc
cttcatctct gcctcctctc cgaatatcaa acagtgctcc tgtaacacca 480cctatttcgt
ccccaacctc cagagttccc atgcctaaac ccaactggga gtcccttgcc 540aaagaatcca
tggcctctat ccatcaccat taccccatct ttgctgcttc tgccccagca 600agcccttctc
gctgtcagta tattgctcct gccactatac ctgaatatga ggagtctgac 660acctcaactg
ttgagtcagg ccagtgggtg agtttccaga cgtttgcacg ccatcttgct 720ccattgcccc
caaccttcaa tcttatgaaa cctgtggctc agaaaatttc accagatgga 780gcaaccaaag
agaaggggat aactcccgag ttggaaattg gaagtgcaca ggtgaagccc 840tgggaagggg
agaggattca tgagataggt ttggatgatc tggagcttac actaggaagt 900ggaaagagta
ggagtaaagg ctaa
924311307PRTVitis vinifera 311Met Thr Ser Glu Arg Thr Pro Thr Arg Arg Lys
Ala Ser Trp Lys Glu 1 5 10
15 Arg Glu Asn Asn Met Arg Arg Glu Arg Arg Arg Arg Ala Ile Ala Ala
20 25 30 Lys Ile
Tyr Ala Gly Leu Arg Ala Gln Gly Asn Tyr Arg Leu Pro Lys 35
40 45 His Cys Asp Asn Asn Glu Val
Leu Lys Ala Leu Cys Ser Glu Ala Gly 50 55
60 Trp Thr Val Glu Asp Asp Gly Thr Thr Tyr Arg Lys
Gly Cys Lys Pro 65 70 75
80 Pro Pro Ser Thr Glu Ile Ala Gly Thr Ser Thr Asn Asn Thr Pro Cys
85 90 95 Ser Ser Gln
Lys Pro Ser Pro Pro Ser Ser Ser Phe Pro Ser Ala Phe 100
105 110 Ala Ser Tyr Gln Pro Ser Pro Ser
Ser Ser Asn Leu Ser Phe Met Asp 115 120
125 Ala Asn Ala Ser Leu Asn Leu Leu Pro Phe Leu Tyr Lys
Ser Ile Pro 130 135 140
Ser Ser Leu Pro Pro Leu Arg Ile Ser Asn Ser Ala Pro Val Thr Pro 145
150 155 160 Pro Ile Ser Ser
Pro Thr Ser Arg Val Pro Met Pro Lys Pro Asn Trp 165
170 175 Glu Ser Leu Ala Lys Glu Ser Met Ala
Ser Ile His His His Tyr Pro 180 185
190 Ile Phe Ala Ala Ser Ala Pro Ala Ser Pro Ser Arg Cys Gln
Tyr Ile 195 200 205
Ala Pro Ala Thr Ile Pro Glu Tyr Glu Glu Ser Asp Thr Ser Thr Val 210
215 220 Glu Ser Gly Gln Trp
Val Ser Phe Gln Thr Phe Ala Arg His Leu Ala 225 230
235 240 Pro Leu Pro Pro Thr Phe Asn Leu Met Lys
Pro Val Ala Gln Lys Ile 245 250
255 Ser Pro Asp Gly Ala Thr Lys Glu Lys Gly Ile Thr Pro Glu Leu
Glu 260 265 270 Ile
Gly Ser Ala Gln Val Lys Pro Trp Glu Gly Glu Arg Ile His Glu 275
280 285 Ile Gly Leu Asp Asp Leu
Glu Leu Thr Leu Gly Ser Gly Lys Ser Arg 290 295
300 Ser Lys Gly 305 312414DNAZea
maysmisc_feature(18)..(18)n is a, c, g, or t 312atgacgtcgg ggtccgcngc
ggcggcggcg gtgggaggcc tcgggcggac gccgacatgg 60aaggagcggg agaacaacaa
gcgccgggag cgccggcgga gggccatcgc cgccaagatc 120ttcacgggcc tccgcgcgct
cggcaactac aagctgccca agcactgcga caacaacgag 180gtgctcaagg cgctgtgccg
cgaggcgggg tgggtcgtcg aggacgacgg caccacctac 240cgaaagggat gcaagccgcc
gccagggatg atgagcccgt gctcgtcctc gcagctgctg 300agcgcgccgt cctcgagctt
cccgagcccg gtgccgtcct accacgccag cccggcgtcg 360tcgagcttcc cgagcccgac
gcgcctcgac cacggcagcg gcagcaacac atga 414313137PRTZea mays
313Met Thr Ser Gly Ser Ala Ala Ala Ala Ala Val Gly Gly Leu Gly Arg 1
5 10 15 Thr Pro Thr Trp
Lys Glu Arg Glu Asn Asn Lys Arg Arg Glu Arg Arg 20
25 30 Arg Arg Ala Ile Ala Ala Lys Ile Phe
Thr Gly Leu Arg Ala Leu Gly 35 40
45 Asn Tyr Lys Leu Pro Lys His Cys Asp Asn Asn Glu Val Leu
Lys Ala 50 55 60
Leu Cys Arg Glu Ala Gly Trp Val Val Glu Asp Asp Gly Thr Thr Tyr 65
70 75 80 Arg Lys Gly Cys Lys
Pro Pro Pro Gly Met Met Ser Pro Cys Ser Ser 85
90 95 Ser Gln Leu Leu Ser Ala Pro Ser Ser Ser
Phe Pro Ser Pro Val Pro 100 105
110 Ser Tyr His Ala Ser Pro Ala Ser Ser Ser Phe Pro Ser Pro Thr
Arg 115 120 125 Leu
Asp His Gly Ser Gly Ser Asn Thr 130 135
314699DNAZea mays 314atgcagcagg cggggctggc cgatgacgac gacgaggaga
tatgggtaaa ggaggaggat 60gacgaggagg aggaggacgg gtactatatg gacccccgga
gcccggccgt gtggacgccc 120ggcggcaggg cgggagggac ctcaaaccgg cggcgcgcgc
gcgaggagaa ggagcggacc 180aagatgcggg agcggcagcg gcgcgcgatc acggggcgga
tcctggcggg cctgcgccag 240cacggcaact acaggctgcg ggcgcgcgcc gacatcaacg
aggtgatcgc cgcgctcgca 300agggaggccg gctgggttgt cctccccgac ggcaccactt
tcccctcttc atcctccttc 360gccgccgtgg ctgcacagcc gccccgcccc gtgatggtcg
ccgccgcgtc gccctccgcc 420accccgctcg cgctcccggc ctcctcggcg ctccccctcc
gcggcatcgc gcccgtcgcc 480gcgcgcccta tctcccaccg ccccgcgccc gcgttcgctc
tcctgttgcc cccgcgggcc 540gccgcggcct cgcgatcccc ggccgacgac gttcccgacg
ggaattcctc gcacctcctc 600gccgtccccg tccctgtccc catggacccc gccgccgctg
aagacgtccc cgttgccaag 660cagctgcagg tgcccgatgt gtcgccgcgc ccgccctga
699315232PRTZea mays 315Met Gln Gln Ala Gly Leu
Ala Asp Asp Asp Asp Glu Glu Ile Trp Val 1 5
10 15 Lys Glu Glu Asp Asp Glu Glu Glu Glu Asp Gly
Tyr Tyr Met Asp Pro 20 25
30 Arg Ser Pro Ala Val Trp Thr Pro Gly Gly Arg Ala Gly Gly Thr
Ser 35 40 45 Asn
Arg Arg Arg Ala Arg Glu Glu Lys Glu Arg Thr Lys Met Arg Glu 50
55 60 Arg Gln Arg Arg Ala Ile
Thr Gly Arg Ile Leu Ala Gly Leu Arg Gln 65 70
75 80 His Gly Asn Tyr Arg Leu Arg Ala Arg Ala Asp
Ile Asn Glu Val Ile 85 90
95 Ala Ala Leu Ala Arg Glu Ala Gly Trp Val Val Leu Pro Asp Gly Thr
100 105 110 Thr Phe
Pro Ser Ser Ser Ser Phe Ala Ala Val Ala Ala Gln Pro Pro 115
120 125 Arg Pro Val Met Val Ala Ala
Ala Ser Pro Ser Ala Thr Pro Leu Ala 130 135
140 Leu Pro Ala Ser Ser Ala Leu Pro Leu Arg Gly Ile
Ala Pro Val Ala 145 150 155
160 Ala Arg Pro Ile Ser His Arg Pro Ala Pro Ala Phe Ala Leu Leu Leu
165 170 175 Pro Pro Arg
Ala Ala Ala Ala Ser Arg Ser Pro Ala Asp Asp Val Pro 180
185 190 Asp Gly Asn Ser Ser His Leu Leu
Ala Val Pro Val Pro Val Pro Met 195 200
205 Asp Pro Ala Ala Ala Glu Asp Val Pro Val Ala Lys Gln
Leu Gln Val 210 215 220
Pro Asp Val Ser Pro Arg Pro Pro 225 230
3161068DNAZea mays 316atgacgagcg gcgccggggg agccgcggcc gggatcgggg
gcaccagggt gcccacgtgg 60agggagcgcg agaacaaccg ccgcagggag cgccggcgcc
gcgcgatcgc cgccaagatc 120ttcgccggcc tcagggccta cggcaactac aacctgccca
agcactgcga caacaacgag 180gtgctcaagg cgctctgcaa cgaggccggc tggaccgtcg
agcccgacgg caccacctac 240cgcaagggat gtaaacctct ggcaacagag cgtccagatc
caattgggag gtctgcatca 300ccaagcccct gctcttcata tcaaccaagt ccgcgagcct
catacaaccc aagcgcggca 360tcctcctcgt tcccaagctc tgggtcctcc tcccacatta
ctcttggcgg gagcaacttc 420atgggaggcg tcgagggcag ctcccttatc ccgtggctaa
agaacctctc gtcgagctcc 480tcgttcgcct cctcctccaa gttcccgcag cttcaccacc
tctacttcaa cggcggttcc 540atcagcgcgc cggtgacgcc tccatccagc tctccaaccc
gcacgcctcg catcaagact 600gactgggaga acccgagtgt tcagccaccg tgggctgggg
cgaactacgc gtctcttccc 660aactcccagc cgccgagccc tgggcaccag gttgctccgg
acccggcgtg gctagccggg 720ttccagattt cgtctgctgg cccttcgtct ccaacttaca
gccttgtggc tccgaatccg 780tttggtatct tcaaggagac catcgtcagc acctcaagaa
tgtgcacccc tgggcagagt 840ggaacgtgct ctcctgtaat gggcggtgcg ccgatccatc
acgatgtcca gatggctgat 900ggtgccccag atgacttcgc cttcgggagc agcagcaacg
gcaacaacga gtcgcctggt 960ctcgtgaagg catgggaggg ggaacggata cacgaggagt
gcgcctcgga cgagcatgag 1020ctggagctca cccttgggag ctcaaagact cgtgcagatc
cttcctga 1068317355PRTZea mays 317Met Thr Ser Gly Ala Gly
Gly Ala Ala Ala Gly Ile Gly Gly Thr Arg 1 5
10 15 Val Pro Thr Trp Arg Glu Arg Glu Asn Asn Arg
Arg Arg Glu Arg Arg 20 25
30 Arg Arg Ala Ile Ala Ala Lys Ile Phe Ala Gly Leu Arg Ala Tyr
Gly 35 40 45 Asn
Tyr Asn Leu Pro Lys His Cys Asp Asn Asn Glu Val Leu Lys Ala 50
55 60 Leu Cys Asn Glu Ala Gly
Trp Thr Val Glu Pro Asp Gly Thr Thr Tyr 65 70
75 80 Arg Lys Gly Cys Lys Pro Leu Ala Thr Glu Arg
Pro Asp Pro Ile Gly 85 90
95 Arg Ser Ala Ser Pro Ser Pro Cys Ser Ser Tyr Gln Pro Ser Pro Arg
100 105 110 Ala Ser
Tyr Asn Pro Ser Ala Ala Ser Ser Ser Phe Pro Ser Ser Gly 115
120 125 Ser Ser Ser His Ile Thr Leu
Gly Gly Ser Asn Phe Met Gly Gly Val 130 135
140 Glu Gly Ser Ser Leu Ile Pro Trp Leu Lys Asn Leu
Ser Ser Ser Ser 145 150 155
160 Ser Phe Ala Ser Ser Ser Lys Phe Pro Gln Leu His His Leu Tyr Phe
165 170 175 Asn Gly Gly
Ser Ile Ser Ala Pro Val Thr Pro Pro Ser Ser Ser Pro 180
185 190 Thr Arg Thr Pro Arg Ile Lys Thr
Asp Trp Glu Asn Pro Ser Val Gln 195 200
205 Pro Pro Trp Ala Gly Ala Asn Tyr Ala Ser Leu Pro Asn
Ser Gln Pro 210 215 220
Pro Ser Pro Gly His Gln Val Ala Pro Asp Pro Ala Trp Leu Ala Gly 225
230 235 240 Phe Gln Ile Ser
Ser Ala Gly Pro Ser Ser Pro Thr Tyr Ser Leu Val 245
250 255 Ala Pro Asn Pro Phe Gly Ile Phe Lys
Glu Thr Ile Val Ser Thr Ser 260 265
270 Arg Met Cys Thr Pro Gly Gln Ser Gly Thr Cys Ser Pro Val
Met Gly 275 280 285
Gly Ala Pro Ile His His Asp Val Gln Met Ala Asp Gly Ala Pro Asp 290
295 300 Asp Phe Ala Phe Gly
Ser Ser Ser Asn Gly Asn Asn Glu Ser Pro Gly 305 310
315 320 Leu Val Lys Ala Trp Glu Gly Glu Arg Ile
His Glu Glu Cys Ala Ser 325 330
335 Asp Glu His Glu Leu Glu Leu Thr Leu Gly Ser Ser Lys Thr Arg
Ala 340 345 350 Asp
Pro Ser 355 3181038DNAZea mays 318atggcgagcg gcggcggcgg
gggcctgggt gcggctggcg cgggaggccg gatgcccacg 60tggagggagc gcgagaacaa
caagcgccgg gagcgccgcc gccgcgcgat cgccgccaag 120atcttcgccg gcctccgcgc
gcacggcggc tacaagctgc ccaagcactg cgacaacaac 180gaggtgctca aggcgctctg
caacgaggcc ggctgggtcg tcgagcccga cggcaccacc 240taccgccagg gaagcaagcc
catggaacgc atggatccca tcggttgctc cgtgtcacca 300agcccatgtt cctcgtacca
accaagtccg cgggcgtcat acaatgcaag ccctacctcg 360tcctcattcc ccagcggcgc
atcctccccc ttcctccctc ccaacgaaat gcccaacggt 420atcgacggca atccaatcct
accatggctc aagacattct ccaacggcac tccatcaaag 480aaacacccgc tcctcccacc
gctgctgatc cacggcggct caatcagcgc tccggtaacc 540cctcctctaa gctcgccgtc
tgcccggacg ccccggatga agacggactg ggacgaagcg 600gccgtccagc ctccatggca
cggtgcaagc agccccacaa tagtgaactc cacgccgccc 660agccccggcc ggcccatcgc
gcctgacccg gcatggcttg ccggcatcca gatctcgtcc 720accagtccaa actccccgac
cttcagcctc gtctccacca acccgttcgg cgtcttcaag 780gagtccatcc cggtcggcgg
cggcgactcg tcgatgagga tgtgcacgcc ggggcagagc 840ggcgcctgct cccccgcgat
cccgggcatg ccacggcact cggacgtcca catgatggat 900gtggtctcgg acgagttcgc
gttcgggagc agcaccaacg gcgcgcagca ggccgccggg 960ctggtgaggg cctgggaggg
cgagcggatc cacgaggact ctgggtccga cgacctggag 1020ctgaccctga agctctag
1038319345PRTZea mays 319Met
Ala Ser Gly Gly Gly Gly Gly Leu Gly Ala Ala Gly Ala Gly Gly 1
5 10 15 Arg Met Pro Thr Trp Arg
Glu Arg Glu Asn Asn Lys Arg Arg Glu Arg 20
25 30 Arg Arg Arg Ala Ile Ala Ala Lys Ile Phe
Ala Gly Leu Arg Ala His 35 40
45 Gly Gly Tyr Lys Leu Pro Lys His Cys Asp Asn Asn Glu Val
Leu Lys 50 55 60
Ala Leu Cys Asn Glu Ala Gly Trp Val Val Glu Pro Asp Gly Thr Thr 65
70 75 80 Tyr Arg Gln Gly Ser
Lys Pro Met Glu Arg Met Asp Pro Ile Gly Cys 85
90 95 Ser Val Ser Pro Ser Pro Cys Ser Ser Tyr
Gln Pro Ser Pro Arg Ala 100 105
110 Ser Tyr Asn Ala Ser Pro Thr Ser Ser Ser Phe Pro Ser Gly Ala
Ser 115 120 125 Ser
Pro Phe Leu Pro Pro Asn Glu Met Pro Asn Gly Ile Asp Gly Asn 130
135 140 Pro Ile Leu Pro Trp Leu
Lys Thr Phe Ser Asn Gly Thr Pro Ser Lys 145 150
155 160 Lys His Pro Leu Leu Pro Pro Leu Leu Ile His
Gly Gly Ser Ile Ser 165 170
175 Ala Pro Val Thr Pro Pro Leu Ser Ser Pro Ser Ala Arg Thr Pro Arg
180 185 190 Met Lys
Thr Asp Trp Asp Glu Ala Ala Val Gln Pro Pro Trp His Gly 195
200 205 Ala Ser Ser Pro Thr Ile Val
Asn Ser Thr Pro Pro Ser Pro Gly Arg 210 215
220 Pro Ile Ala Pro Asp Pro Ala Trp Leu Ala Gly Ile
Gln Ile Ser Ser 225 230 235
240 Thr Ser Pro Asn Ser Pro Thr Phe Ser Leu Val Ser Thr Asn Pro Phe
245 250 255 Gly Val Phe
Lys Glu Ser Ile Pro Val Gly Gly Gly Asp Ser Ser Met 260
265 270 Arg Met Cys Thr Pro Gly Gln Ser
Gly Ala Cys Ser Pro Ala Ile Pro 275 280
285 Gly Met Pro Arg His Ser Asp Val His Met Met Asp Val
Val Ser Asp 290 295 300
Glu Phe Ala Phe Gly Ser Ser Thr Asn Gly Ala Gln Gln Ala Ala Gly 305
310 315 320 Leu Val Arg Ala
Trp Glu Gly Glu Arg Ile His Glu Asp Ser Gly Ser 325
330 335 Asp Asp Leu Glu Leu Thr Leu Lys Leu
340 345 32053DNAArtificial sequenceprimer 1
320ggggacaagt ttgtacaaaa aagcaggctt aaacaatgac ggcatcagga gga
5332150DNAArtificial sequenceprimer 2 321ggggaccact ttgtacaaga aagctgggta
ccacgatatt aacctagccg 503222194DNAOryza sativa
322aatccgaaaa gtttctgcac cgttttcacc ccctaactaa caatataggg aacgtgtgct
60aaatataaaa tgagacctta tatatgtagc gctgataact agaactatgc aagaaaaact
120catccaccta ctttagtggc aatcgggcta aataaaaaag agtcgctaca ctagtttcgt
180tttccttagt aattaagtgg gaaaatgaaa tcattattgc ttagaatata cgttcacatc
240tctgtcatga agttaaatta ttcgaggtag ccataattgt catcaaactc ttcttgaata
300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag atttttttta aaaaaataga
360atgaagatat tctgaacgta ttggcaaaga tttaaacata taattatata attttatagt
420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct tactccatcc caatttttat
480ttagtaatta aagacaattg acttattttt attatttatc ttttttcgat tagatgcaag
540gtacttacgc acacactttg tgctcatgtg catgtgtgag tgcacctcct caatacacgt
600tcaactagca acacatctct aatatcactc gcctatttaa tacatttagg tagcaatatc
660tgaattcaag cactccacca tcaccagacc acttttaata atatctaaaa tacaaaaaat
720aattttacag aatagcatga aaagtatgaa acgaactatt taggtttttc acatacaaaa
780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca tattgggcac acaggcaaca
840acagagtggc tgcccacaga acaacccaca aaaaacgatg atctaacgga ggacagcaag
900tccgcaacaa ccttttaaca gcaggctttg cggccaggag agaggaggag aggcaaagaa
960aaccaagcat cctccttctc ccatctataa attcctcccc ccttttcccc tctctatata
1020ggaggcatcc aagccaagaa gagggagagc accaaggaca cgcgactagc agaagccgag
1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg gtcgatctct tccctcctcc
1140acctcctcct cacagggtat gtgcctccct tcggttgttc ttggatttat tgttctaggt
1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta tctgtgatga ttcctgttct
1260tggatttggg atagaggggt tcttgatgtt gcatgttatc ggttcggttt gattagtagt
1320atggttttca atcgtctgga gagctctatg gaaatgaaat ggtttaggga tcggaatctt
1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag caccggtgat tttgcttggt
1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg atgcttctcg atttgacgaa
1500gctatccttt gtttattccc tattgaacaa aaataatcca actttgaaga cggtcccgtt
1560gatgagattg aatgattgat tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga
1620tacagtagtc cccatcacga aattcatgga aacagttata atcctcagga acaggggatt
1680ccctgttctt ccgatttgct ttagtcccag aatttttttt cccaaatatc ttaaaaagtc
1740actttctggt tcagttcaat gaattgattg ctacaaataa tgcttttata gcgttatcct
1800agctgtagtt cagttaatag gtaatacccc tatagtttag tcaggagaag aacttatccg
1860atttctgatc tccattttta attatatgaa atgaactgta gcataagcag tattcatttg
1920gattattttt tttattagct ctcacccctt cattattctg agctgaaagt ctggcatgaa
1980ctgtcctcaa ttttgttttc aaattcacat cgattatcta tgcattatcc tcttgtatct
2040acctgtagaa gtttcttttt ggttattcct tgactgcttg attacagaaa gaaatttatg
2100aagctgtaat cgggatagtt atactgcttg ttcttatgat tcatttcctt tgtgcagttc
2160ttggtgtagc ttgccacttt caccagcaaa gttc
219432311PRTArtificial sequencemotif 16 323Ser Ala Pro Val Thr Pro Pro
Leu Ser Ser Pro 1 5 10
32411PRTArtificial sequencemotif 17 324Val Lys Pro Trp Glu Gly Glu Arg
Ile His Glu 1 5 10 3257PRTArtificial
sequencemotif 18 325Asp Leu Glu Leu Thr Leu Gly 1 5
32650PRTArtificial sequenceBHLH-like domain 326Arg Glu Arg Arg Arg Arg
Ala Ile Ala Ala Lys Ile Phe Thr Gly Leu 1 5
10 15 Arg Ser Gln Gly Asn Tyr Lys Leu Pro Lys His
Cys Asp Asn Asn Glu 20 25
30 Val Leu Lys Ala Leu Cys Leu Glu Ala Gly Trp Ile Val His Glu
Asp 35 40 45 Gly
Thr 50 3278PRTArtificial sequenceDNA binding element BRRE 327Cys Gly
Thr Gly Cys Thr Thr Gly 1 5 3286PRTArtificial
sequenceE-box 328Cys Ala Asn Asn Thr Cys 1 5
329620PRTArtificial sequenceConsensus 329Met Ala His Arg Gly Xaa Leu Asp
Gly Leu Xaa Xaa Ala Gln Ala Pro 1 5 10
15 Ala Leu Met Arg His Gly Ser Phe Ala Ala Gly Xaa Xaa
Xaa Xaa Xaa 20 25 30
Xaa Xaa Xaa Xaa Xaa Xaa Leu Ser Ser Xaa Xaa Pro Leu Xaa Xaa Ser
35 40 45 Ser Ser Xaa Leu
Glu Met Leu Glu Asn Lys Leu Ala Met Gln Thr Ala 50
55 60 Glu Val Glu Lys Leu Ile Met Glu
Asn Gln Arg Leu Ala Ser Ser His 65 70
75 80 Val Val Leu Arg Gln Asp Ile Val Asp Thr Glu Lys
Glu Met Gln Met 85 90
95 Ile Arg Xaa His Leu Gly Glu Val Gln Thr Glu Thr Asp Leu Xaa Ile
100 105 110 Arg Asp Leu
Leu Glu Arg Ile Arg Leu Met Glu Ala Asp Ile Xaa Ser 115
120 125 Gly Asp Ala Val Lys Lys Glu Leu
His Gln Val His Met Glu Ala Lys 130 135
140 Arg Leu Ile Xaa Glu Arg Gln Met Leu Thr Leu Glu Ile
Glu Xaa Val 145 150 155
160 Thr Lys Glu Leu Gln Lys Leu Ser Ala Xaa Xaa Asp Xaa Lys Ser Leu
165 170 175 Pro Glu Leu Leu
Ala Glu Leu Asp Gly Leu Arg Lys Glu His Xaa Asn 180
185 190 Leu Arg Ser Xaa Phe Glu Tyr Glu Lys
Asn Thr Asn Ile Lys Gln Val 195 200
205 Glu Gln Met Arg Thr Met Glu Met Asn Leu Ile Thr Met Thr
Lys Glu 210 215 220
Ala Glu Lys Leu Arg Ala Asp Val Ala Asn Ala Glu Arg Arg Ala Gln 225
230 235 240 Ala Ala Ala Ala Xaa
Ala Ala Ala His Ala Ala Gly Xaa Ala Gln Val 245
250 255 Thr Ala Ser Gln Pro Gly Thr Ala Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa 260 265
270 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa 275 280 285 Xaa
Xaa Xaa Xaa Xaa Ala Ala Xaa Xaa Xaa Xaa Tyr Ala Gly Ala Xaa 290
295 300 Ala Xaa Xaa Pro Xaa Ala
Tyr Xaa Xaa Xaa Ala Xaa Xaa Ala Gly Xaa 305 310
315 320 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Gly Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa 325 330
335 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
340 345 350 Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 355
360 365 Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Gly Xaa Xaa 370 375
380 Xaa Xaa Xaa Xaa Xaa Xaa Gly Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa 385 390 395
400 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
405 410 415 Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 420
425 430 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 435 440
445 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Gly Xaa Xaa
Xaa Xaa Xaa 450 455 460
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 465
470 475 480 Gly Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Gly Ala Xaa Xaa Xaa Xaa Xaa 485
490 495 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa 500 505
510 Xaa Xaa Xaa Xaa Xaa Ala Ala Xaa Xaa Xaa Xaa Tyr Ala Xaa
Xaa Xaa 515 520 525
Gly Xaa Xaa Xaa Xaa Gly Tyr Xaa Xaa Xaa Xaa Xaa Pro Xaa Tyr Xaa 530
535 540 Xaa Xaa Tyr Ala Xaa
Xaa Xaa Gln Xaa Xaa Ser Xaa Xaa Xaa Xaa Xaa 545 550
555 560 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Ala Leu Gly Xaa 565 570
575 Ala Xaa Tyr Pro Xaa Gly Xaa Xaa Gln Xaa Xaa Xaa Ser Xaa Xaa
Ala 580 585 590 Xaa
Xaa Ala Gln Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Ala Xaa Xaa 595
600 605 Pro Xaa Xaa Xaa Tyr Asp
Xaa Xaa Xaa Gly Xaa Gln 610 615 620
330487PRTZea mays 330Met Gln Gln Ala Gly Leu Ala Asp Asp Asp Asp Glu Glu
Ile Trp Val 1 5 10 15
Lys Glu Glu Asp Asp Glu Glu Glu Glu Asp Gly Tyr Tyr Met Asp Pro
20 25 30 Arg Ser Pro Ala
Val Trp Thr Pro Gly Gly Arg Ala Gly Gly Thr Ser 35
40 45 Asn Arg Arg Arg Ala Arg Glu Glu Lys
Glu Arg Thr Lys Met Arg Glu 50 55
60 Arg Gln Arg Arg Ala Ile Thr Gly Arg Ile Leu Ala Gly
Leu Arg Gln 65 70 75
80 His Gly Asn Tyr Arg Leu Arg Ala Arg Ala Asp Ile Asn Glu Val Ile
85 90 95 Ala Ala Leu Ala
Arg Glu Ala Gly Trp Val Val Leu Pro Asp Gly Thr 100
105 110 Thr Phe Pro Ser Ser Ser Ser Phe Ala
Ala Val Ala Ala Gln Val Val 115 120
125 Met Ser Phe His Glu Cys Gly Gly Asn Val Gly Asp Asp Ile
Ser Ile 130 135 140
Pro Leu Pro His Trp Val Ile Glu Ile Gly Arg Ser Asn Pro Asp Ile 145
150 155 160 Tyr Phe Thr Asp Arg
Ala Gly Arg Arg Asn Thr Glu Cys Leu Ser Trp 165
170 175 Gly Val Asp Lys Glu Arg Val Leu Gln Gly
Arg Thr Ala Val Glu Val 180 185
190 Tyr Phe Asp Phe Met Arg Ser Phe Arg Val Glu Phe Asp Glu Tyr
Phe 195 200 205 Glu
Asp Gly Ile Ile Ser Glu Ile Glu Ile Gly Leu Gly Ala Cys Gly 210
215 220 Glu Leu Arg Tyr Pro Ser
Tyr Pro Ala Lys His Gly Trp Lys Tyr Pro 225 230
235 240 Gly Ile Gly Glu Phe Gln Cys Tyr Asp Arg Tyr
Leu Gln Lys Ser Leu 245 250
255 Arg Lys Ala Ala Glu Ala Arg Gly His Thr Ile Trp Ala Arg Gly Pro
260 265 270 Asp Asn
Ala Gly His Tyr Asn Ser Glu Pro Asn Leu Thr Gly Phe Phe 275
280 285 Cys Asp Gly Gly Asp Tyr Asp
Ser Tyr Tyr Gly Arg Phe Phe Leu Ser 290 295
300 Trp Tyr Ser Gln Ala Leu Val Asp His Ala Asp Arg
Val Leu Met Leu 305 310 315
320 Ala Arg Leu Ala Phe Glu Gly Thr Asn Ile Ala Val Lys Val Ser Gly
325 330 335 Val His Trp
Trp Tyr Lys Thr Ala Ser His Ala Ala Glu Leu Thr Ala 340
345 350 Gly Phe Tyr Asn Pro Cys Asn Arg
Asp Gly Tyr Ala Pro Ile Ala Ala 355 360
365 Val Leu Lys Lys Tyr Asp Ala Ala Leu Asn Phe Thr Cys
Val Glu Leu 370 375 380
Arg Thr Met Asp Gln His Glu Val Tyr Pro Glu Ala Phe Ala Asp Pro 385
390 395 400 Glu Gly Leu Val
Trp Gln Val Leu Asn Ala Ala Trp Asp Ala Gly Ile 405
410 415 Gln Val Ala Ser Glu Asn Ala Leu Pro
Cys Tyr Asp Arg Asp Gly Phe 420 425
430 Asn Lys Ile Leu Glu Asn Ala Lys Pro Leu Asn Asp Pro Asp
Gly Arg 435 440 445
His Leu Leu Gly Phe Thr Tyr Leu Arg Leu Gly Lys Asp Leu Phe Glu 450
455 460 Arg Pro Asn Phe Phe
Glu Phe Glu Arg Phe Ile Lys Arg Met His Gly 465 470
475 480 Glu Ala Val Leu Asp Leu Gln
485 331507PRTArtificial sequenceConsensus 331Met Thr Ser Gly
Gly Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 1 5
10 15 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa 20 25
30 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Leu Pro Thr Trp Lys Glu
Arg Glu 35 40 45
Asn Asn Lys Arg Arg Glu Arg Arg Arg Arg Ala Ile Ala Ala Lys Ile 50
55 60 Phe Thr Gly Leu Arg
Ala Tyr Gly Asn Tyr Lys Leu Pro Lys His Cys 65 70
75 80 Asp Asn Asn Glu Val Leu Lys Ala Leu Cys
Xaa Glu Ala Gly Trp Ile 85 90
95 Val Glu Glu Asp Gly Thr Thr Tyr Arg Lys Xaa Xaa Xaa Xaa Xaa
Xaa 100 105 110 Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 115
120 125 Xaa Xaa Xaa Gly Cys Lys
Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 130 135
140 Xaa Xaa Xaa Met Xaa Xaa Xaa Ser Ser Xaa Xaa
Xaa Xaa Ser Pro Xaa 145 150 155
160 Ser Ser Tyr Phe Pro Ser Pro Ile Xaa Ser Tyr Asn Xaa Ser Pro Xaa
165 170 175 Ser Ser
Ser Phe Pro Ser Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 180
185 190 Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Ser Ser Xaa Leu Ile Pro Phe 195 200
205 Leu Lys Asn Leu Ser Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa 210 215 220
Xaa Xaa Xaa Xaa Xaa Leu Xaa Xaa Leu Xaa Xaa Ser Xaa Ser Ala Pro 225
230 235 240 Val Thr Pro
Pro Leu Ser Ser Pro Thr Xaa Xaa Xaa Arg Xaa Pro Arg 245
250 255 Xaa Lys Xaa Asp Trp Glu Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 260 265
270 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa 275 280 285
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Ser Xaa Pro Ala Ser Pro Xaa Arg Xaa 290
295 300 Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 305 310
315 320 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa 325 330
335 Xaa Xaa Xaa Trp Xaa Xaa Xaa Xaa Xaa Ile Pro Xaa Xaa Xaa
Xaa Xaa 340 345 350
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Ser Xaa Thr Xaa Xaa Xaa
355 360 365 Xaa Ser Xaa Xaa
Xaa Xaa Xaa Phe Xaa Xaa Xaa Ala Xaa Xaa Xaa Xaa 370
375 380 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 385 390
395 400 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Pro Xaa Ile
Xaa Xaa Xaa Xaa 405 410
415 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
420 425 430 Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Glu Phe Xaa Phe Xaa 435
440 445 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 450 455
460 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Val Lys Pro
Trp Glu Gly 465 470 475
480 Glu Arg Ile His Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Asp
485 490 495 Leu Glu Leu Thr
Leu Gly Xaa Xaa Lys Xaa Arg 500 505