Patent application title: Plants Having Enhanced Yield-Related Traits and a Method for Making the Same

Inventors:  Yves Hatzfeld (Lille, FR)
Assignees:  BASF Plant Science GmbH
IPC8 Class: AC12N1582FI
USPC Class: 800290
Class name: Multicellular living organisms and unmodified parts thereof and related processes method of introducing a polynucleotide molecule into or rearrangement of genetic material within a plant or plant part the polynucleotide alters plant part growth (e.g., stem or tuber length, etc.)
Publication date: 2013-11-14
Patent application number: 20130305414



Abstract:

The present invention relates generally to the field of molecular biology and concerns a method for enhancing various economically important yield-related traits in plants. More specifically, the present invention concerns a method for enhancing yield-related traits in plants by modulating expression in a plant of a nucleic acid encoding a NITR (Nitrite Reductase) polypeptide or an ASNS (Asparagine Synthase) polypeptide. The present invention also concerns plants having modulated expression of a nucleic acid encoding a NITR polypeptide or an ASNS polypeptide, which plants have enhanced yield-related traits relative to control plants. The invention also provides constructs comprising NITR-encoding nucleic acids or ASNS-encoding nucleic acids, useful in performing the methods of the invention.

Claims:

1. A method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding an ASNS, wherein said ASNS is represented by SEQ ID NO: 63 or an orthologue or paralogue thereof.

2. The method according to claim 1, wherein said modulated expression is effected by introducing and expressing in a plant a nucleic acid encoding said ASNS polypeptide.

3. The method according to claim 1, wherein said nucleic acid encoding said ASNS polypeptide is a portion of SEQ ID NO: 62, or a nucleic acid capable of hybridizing with such a nucleic acid.

4. The method according to claim 1, wherein said nucleic acid sequence encodes SEQ ID NO: 63.

5. The method according to claim 1, wherein said enhanced yield-related traits comprise increased early vigour, increased yield, increased root thickness, and/or increased seed yield, relative to control plants.

6. The method according to claim 1, wherein said enhanced yield-related traits are obtained under non-stress conditions or under conditions of reduced nutrient availability.

7. The method according to claim 2, wherein said nucleic acid is operably linked to a constitutive promoter, a GOS2 promoter, or a GOS2 promoter from rice.

8. The method according to claim 1, wherein said nucleic acid encoding an ASNS polypeptide is of plant origin, from a monocotyledonous plant, from the family Poaceae, from the genus Oryza, or from Oryza sativa.

9. A plant or part thereof, including seeds, obtained by the method according to claim 1, wherein said plant or part thereof comprises a recombinant nucleic acid encoding an ASNS polypeptide.

10. A construct comprising: (i) the nucleic acid encoding an ASNS polypeptide as defined in claim 1; (ii) one or more control sequences capable of driving expression of the nucleic acid sequence of (i); and optionally (iii) a transcription termination sequence.

11. The construct according to claim 10, wherein one of said control sequences is a constitutive promoter, a GOS2 promoter, or a GOS2 promoter from rice.

12. A method for making plants having increased yield-related traits comprising transforming a plant with the construct according to claim 10.

13. A plant, plant part or plant cell transformed with the construct according to claim 10.

14. A method for the production of a transgenic plant having increased yield, increased biomass, and/or increased seed yield relative to control plants, comprising: (i) introducing and expressing in a plant the nucleic acid encoding an ASNS polypeptide as defined in claim 1; and (ii) cultivating the plant under conditions promoting plant growth and development.

15. A transgenic plant having increased yield-related traits resulting from increased expression of the nucleic acid encoding an ASNS polypeptide as defined in claim 1, or a transgenic plant cell derived from said transgenic plant.

16. The transgenic plant of claim 15, wherein the increased yield-related traits are selected from the group consisting of increased early vigour, increased root thickness, and increased seed yield relative to control plants.

17. The plant according to claim 9, or a transgenic plant cell derived thereof, wherein said plant is a crop plant, a monocot, a cereal, rice, maize, wheat, barley, millet, rye, triticale, sorghum, or oats.

18. Harvestable parts of the plant according to claim 17.

19. The harvestable parts of claim 18, wherein said harvestable parts are root biomass and/or seeds.

20. Products derived from the plant according to claim 17 and/or from harvestable parts of said plant.

Description:

RELATED APPLICATIONS

[0001] This application is a divisional of U.S. application Ser. No. 12/669,596, filed Jan. 19, 2010, which is a national stage application (under 35 U.S.C. §371) of PCT/EP2008/060030, filed Jul. 31, 2008, which claims benefit of European application 07113568.5, filed Jul. 31, 2007 and European application 07113569.3, filed Jul. 31, 2007, the entire contents of each of which are hereby incorporated by reference in this application.

SUBMISSION OF SEQUENCE LISTING

[0002] The Sequence Listing associated with this application is filed in electronic format via EFS-Web and hereby incorporated by reference into the specification in its entirety. The name of the text file containing the Sequence Listing is Sequence_Listing--32279--00061_US. The size of the text file is 455 KB, and the text file was created on Jul. 10, 2013.

[0003] The present invention relates generally to the field of molecular biology and concerns a method for improving various plant growth characteristics by modulating expression in a plant of a nucleic acid encoding a NITR (Nitrite Reductase). The present invention also concerns plants having modulated expression of a nucleic acid encoding a NITR, which plants have improved growth characteristics relative to corresponding wild type plants or other control plants. The present invention furthermore concerns a method for improving various plant growth characteristics by modulating expression in a plant of a nucleic acid encoding an ASNS (Asparagine Synthase). The present invention also concerns plants having modulated expression of a nucleic acid encoding an ASNS, which plants have improved growth characteristics relative to corresponding wild type plants or other control plants. The invention also provides constructs useful in the methods of the invention.

[0004] The ever-increasing world population and the dwindling supply of arable land available for agriculture fuel research towards increasing the efficiency of agriculture. Conventional means for crop and horticultural improvements utilise selective breeding techniques to identify plants having desirable characteristics. However, such selective breeding techniques have several drawbacks, namely that these techniques are typically labour intensive and result in plants that often contain heterogeneous genetic components that may not always result in the desirable trait being passed on from parent plants. Advances in molecular biology have allowed mankind to modify the germplasm of animals and plants. Genetic engineering of plants entails the isolation and manipulation of genetic material (typically in the form of DNA or RNA) and the subsequent introduction of that genetic material into a plant. Such technology has the capacity to deliver crops or plants having various improved economic, agronomic or horticultural traits.

[0005] A trait of particular economic interest is increased yield. Yield is normally defined as the measurable produce of economic value from a crop. This may be defined in terms of quantity and/or quality. Yield is directly dependent on several factors, for example, the number and size of the organs, plant architecture (for example, the number of branches), seed production, leaf senescence and more. Root development, nutrient uptake, stress tolerance and early vigour may also be important factors in determining yield. Optimizing the abovementioned factors may therefore contribute to increasing crop yield.

[0006] Seed yield is a particularly important trait, since the seeds of many plants are important for human and animal nutrition. Crops such as corn, rice, wheat, canola and soybean account for over half the total human caloric intake, whether through direct consumption of the seeds themselves or through consumption of meat products raised on processed seeds. They are also a source of sugars, oils and many kinds of metabolites used in industrial processes. Seeds contain an embryo (the source of new shoots and roots) and an endosperm (the source of nutrients for embryo growth during germination and during early growth of seedlings). The development of a seed involves many genes, and requires the transfer of metabolites from the roots, leaves and stems into the growing seed. The endosperm, in particular, assimilates the metabolic precursors of carbohydrates, oils and proteins and synthesizes them into storage macromolecules to fill out the grain.

[0007] Another important trait for many crops is early vigour. Improving early vigour is an important objective of modern rice breeding programs in both temperate and tropical rice cultivars. Long roots are important for proper soil anchorage in water-seeded rice. Where rice is sown directly into flooded fields, and where plants must emerge rapidly through water, longer shoots are associated with vigour. Where drill-seeding is practiced, longer mesocotyls and coleoptiles are important for good seedling emergence. The ability to engineer early vigour into plants would be of great importance in agriculture. For example, poor early vigor has been a limitation to the introduction of maize (Zea mays L.) hybrids based on Corn Belt germplasm in the European Atlantic.

[0008] A further important trait is that of improved abiotic stress tolerance. Abiotic stress is a primary cause of crop loss worldwide, reducing average yields for most major crop plants by more than 50% (Wang et al., Planta (2003) 218: 1-14). Abiotic stresses may be caused by drought, salinity, extremes of temperature, chemical toxicity and oxidative stress. The ability to improve plant tolerance to abiotic stress would be of great economic advantage to farmers worldwide and would allow for the cultivation of crops during adverse conditions and in territories where cultivation of crops may not otherwise be possible.

[0009] Crop yield may therefore be increased by optimising one of the above-mentioned factors.

[0010] Depending on the end use, the modification of certain yield traits may be favoured over others. For example for applications such as forage or wood production, or bio-fuel resource, an increase in the vegetative parts of a plant may be desirable, and for applications such as flour, starch or oil production, an increase in seed parameters may be particularly desirable. Even amongst the seed parameters, some may be favoured over others, depending on the application. Various mechanisms may contribute to increasing seed yield, whether that is in the form of increased seed size or increased seed number.

[0011] One approach to increasing yield (seed yield and/or biomass) in plants may be through modification of the inherent growth mechanisms of a plant, such as the cell cycle or various signalling pathways involved in plant growth or in defense mechanisms.

[0012] Surprisingly, it has now been found that modulating expression of a nucleic acid encoding a NITR polypeptide or of an ASNS polypeptide gives plants having enhanced yield-related traits relative to control plants.

[0013] According to one embodiment, there is provided a method for improving yield-related traits of a plant relative to control plants, comprising modulating expression of a nucleic acid encoding a NITR polypeptide in a plant. The improved yield-related traits comprise one or more of increased biomass, increased early vigour, and increased seed yield.

[0014] According to another embodiment, there is provided a method for improving yield-related traits of a plant relative to control plants, comprising modulating expression of a nucleic acid encoding an ASNS polypeptide in a plant. The improved yield-related traits comprise one or more of increased biomass, increased early vigour, and increased seed yield.

DEFINITIONS

Polypeptide(s)/Protein(s)

[0015] The terms "polypeptide" and "protein" are used interchangeably herein and refer to amino acids in a polymeric form of any length, linked together by peptide bonds.

Polynucleotide(s)/Nucleic Acid(s)/Nucleic Acid Sequence(s)/Nucleotide Sequence(s)

[0016] The terms "polynucleotide(s)", "nucleic acid sequence(s)", "nucleotide sequence(s)", "nucleic acid(s)", "nucleic acid molecule" are used interchangeably herein and refer to nucleotides, either ribonucleotides or deoxyribonucleotides or a combination of both, in a polymeric unbranched form of any length.

Control Plant(s)

[0017] The choice of suitable control plants is a routine part of an experimental setup and may include corresponding wild type plants or corresponding plants without the gene of interest. The control plant is typically of the same plant species or even of the same variety as the plant to be assessed. The control plant may also be a nullizygote of the plant to be assessed. Nullizygotes are individuals missing the transgene by segregation. A "control plant" as used herein refers not only to whole plants, but also to plant parts, including seeds and seed parts.

Homologue(s)

[0018] "Homologues" of a protein encompass peptides, oligopeptides, polypeptides, proteins and enzymes having amino acid substitutions, deletions and/or insertions relative to the unmodified protein in question and having similar biological and functional activity as the unmodified protein from which they are derived.

[0019] A deletion refers to removal of one or more amino acids from a protein.

[0020] An insertion refers to one or more amino acid residues being introduced into a predetermined site in a protein. Insertions may comprise N-terminal and/or C-terminal fusions as well as intra-sequence insertions of single or multiple amino acids. Generally, insertions within the amino acid sequence will be smaller than N- or C-terminal fusions, of the order of about 1 to 10 residues. Examples of N- or C-terminal fusion proteins or peptides include the binding domain or activation domain of a transcriptional activator as used in the yeast two-hybrid system, phage coat proteins, (histidine)-6-tag, glutathione S-transferase-tag, protein A, maltose-binding protein, dihydrofolate reductase, Tag•100 epitope, c-myc epitope, FLAG®-epitope, lacZ, CMP (calmodulin-binding peptide), HA epitope, protein C epitope and VSV epitope.

[0021] A substitution refers to replacement of amino acids of the protein with other amino acids having similar properties (such as similar hydrophobicity, hydrophilicity, antigenicity, propensity to form or break α-helical structures or β-sheet structures). Amino acid substitutions are typically of single residues, but may be clustered depending upon functional constraints placed upon the polypeptide; insertions will usually be of the order of about 1 to 10 amino acid residues. The amino acid substitutions are preferably conservative amino acid substitutions. Conservative substitution tables are well known in the art (see for example Creighton (1984) Proteins. W.H. Freeman and Company (Eds) and Table 1 below).

TABLE 1 Examples of conserved amino acid substitutions

Residue   Conservative Substitutions   Residue   Conservative Substitutions
Ala       Ser                          Leu       Ile; Val
Arg       Lys                          Lys       Arg; Gln
Asn       Gln; His                     Met       Leu; Ile
Asp       Glu                          Phe       Met; Leu; Tyr
Gln       Asn                          Ser       Thr; Gly
Cys       Ser                          Thr       Ser; Val
Glu       Asp                          Trp       Tyr
Gly       Pro                          Tyr       Trp; Phe
His       Asn; Gln                     Val       Ile; Leu
Ile       Leu; Val
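As an illustration of how Table 1 might be used in practice, the following minimal Python sketch encodes the table as a lookup and checks whether a given substitution is listed as conservative. The names `CONSERVATIVE_SUBSTITUTIONS` and `is_conservative` are hypothetical and do not appear in the application; the lookup is directional, exactly as the table is printed, and residues absent from the table (for example Pro) are simply treated as having no listed substitution.

```python
# Table 1 as a lookup: original residue -> substitutions listed as conservative.
# The mapping is directional, mirroring the table as printed.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Gln": {"Asn"},
    "Cys": {"Ser"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """Return True if Table 1 lists `replacement` for `original`.

    Residues are given as three-letter codes. A residue that has no row
    in the table yields False rather than an error.
    """
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())
```

For instance, `is_conservative("Ala", "Ser")` is true, whereas `is_conservative("Ser", "Ala")` is false because the table lists only Thr and Gly for Ser.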

[0022] Amino acid substitutions, deletions and/or insertions may readily be made using peptide synthetic techniques well known in the art, such as solid phase peptide synthesis and the like, or by recombinant DNA manipulation. Methods for the manipulation of DNA sequences to produce substitution, insertion or deletion variants of a protein are well known in the art. For example, techniques for making substitution mutations at predetermined sites in DNA are well known to those skilled in the art and include M13 mutagenesis, T7-Gen in vitro mutagenesis (USB, Cleveland, Ohio), QuickChange Site Directed mutagenesis (Stratagene, San Diego, Calif.), PCR-mediated site-directed mutagenesis or other site-directed mutagenesis protocols.

Derivatives

[0023] "Derivatives" include peptides, oligopeptides, polypeptides which may, compared to the amino acid sequence of the naturally-occurring form of the protein, such as the protein of interest, comprise substitutions of amino acids with non-naturally occurring amino acid residues, or additions of non-naturally occurring amino acid residues. "Derivatives" of a protein also encompass peptides, oligopeptides, polypeptides which comprise naturally occurring altered (glycosylated, acylated, prenylated, phosphorylated, myristoylated, sulphated etc.) or non-naturally altered amino acid residues compared to the amino acid sequence of a naturally-occurring form of the polypeptide. A derivative may also comprise one or more non-amino acid substituents or additions compared to the amino acid sequence from which it is derived, for example a reporter molecule or other ligand, covalently or non-covalently bound to the amino acid sequence, such as a reporter molecule which is bound to facilitate its detection, and non-naturally occurring amino acid residues relative to the amino acid sequence of a naturally-occurring protein. Furthermore, "derivatives" also include fusions of the naturally-occurring form of the protein with tagging peptides such as FLAG, HIS6 or thioredoxin (for a review of tagging peptides, see Terpe, Appl. Microbiol. Biotechnol. 60, 523-533, 2003).

Orthologue(s)/Paralogue(s)

[0024] Orthologues and paralogues encompass evolutionary concepts used to describe the ancestral relationships of genes. Paralogues are genes within the same species that have originated through duplication of an ancestral gene; orthologues are genes from different organisms that have originated through speciation, and are also derived from a common ancestral gene.

Domain

[0025] The term "domain" refers to a set of amino acids conserved at specific positions along an alignment of sequences of evolutionarily related proteins. While amino acids at other positions can vary between homologues, amino acids that are highly conserved at specific positions indicate amino acids that are likely essential in the structure, stability or function of a protein. Identified by their high degree of conservation in aligned sequences of a family of protein homologues, they can be used as identifiers to determine if any polypeptide in question belongs to a previously identified polypeptide family.

Motif/Consensus Sequence/Signature

[0026] The term "motif" or "consensus sequence" or "signature" refers to a short conserved region in the sequence of evolutionarily related proteins. Motifs are frequently highly conserved parts of domains, but may also include only part of the domain, or be located outside of a conserved domain (if all of the amino acids of the motif fall outside of a defined domain).

Hybridisation

[0027] The term "hybridisation" as defined herein is a process wherein substantially homologous complementary nucleotide sequences anneal to each other. The hybridisation process can occur entirely in solution, i.e. both complementary nucleic acids are in solution. The hybridisation process can also occur with one of the complementary nucleic acids immobilised to a matrix such as magnetic beads, Sepharose beads or any other resin. The hybridisation process can furthermore occur with one of the complementary nucleic acids immobilised to a solid support such as a nitro-cellulose or nylon membrane or immobilised by e.g. photolithography to, for example, a siliceous glass support (the latter known as nucleic acid arrays or microarrays or as nucleic acid chips). In order to allow hybridisation to occur, the nucleic acid molecules are generally thermally or chemically denatured to melt a double strand into two single strands and/or to remove hairpins or other secondary structures from single stranded nucleic acids.

[0028] The term "stringency" refers to the conditions under which a hybridisation takes place. The stringency of hybridisation is influenced by conditions such as temperature, salt concentration, ionic strength and hybridisation buffer composition. Generally, low stringency conditions are selected to be about 30° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. Medium stringency conditions are when the temperature is 20° C. below Tm, and high stringency conditions are when the temperature is 10° C. below Tm. High stringency hybridisation conditions are typically used for isolating hybridising sequences that have high sequence similarity to the target nucleic acid sequence. However, nucleic acids may deviate in sequence and still encode a substantially identical polypeptide, due to the degeneracy of the genetic code. Therefore medium stringency hybridisation conditions may sometimes be needed to identify such nucleic acid molecules.
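The relationship between stringency and Tm described above reduces to a fixed temperature offset, which the following minimal Python sketch makes explicit. The function name `hybridisation_temperature` and the dictionary `STRINGENCY_OFFSETS_C` are hypothetical; the offsets are the approximate values given in the text (the low-stringency figure is stated there as "about" 30° C.), so the result should be read as a starting point rather than a fixed protocol value.

```python
# Degrees Celsius below Tm for each stringency level, as given in the text.
STRINGENCY_OFFSETS_C = {"low": 30.0, "medium": 20.0, "high": 10.0}


def hybridisation_temperature(tm_celsius, stringency):
    """Return the hybridisation temperature (deg C) for a stringency level.

    `stringency` must be one of "low", "medium" or "high".
    """
    try:
        offset = STRINGENCY_OFFSETS_C[stringency]
    except KeyError:
        raise ValueError(f"unknown stringency level: {stringency!r}") from None
    return tm_celsius - offset
```

With a Tm of 80° C., the sketch gives 70° C. for high, 60° C. for medium and 50° C. for low stringency.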

[0029] The Tm is the temperature under defined ionic strength and pH, at which 50% of the target sequence hybridises to a perfectly matched probe. The Tm is dependent upon the solution conditions and the base composition and length of the probe. For example, longer sequences hybridise specifically at higher temperatures. The maximum rate of hybridisation is obtained from about 16° C. up to 32° C. below Tm. The presence of monovalent cations in the hybridisation solution reduces the electrostatic repulsion between the two nucleic acid strands, thereby promoting hybrid formation; this effect is visible for sodium concentrations of up to 0.4M (for higher concentrations, this effect may be ignored). Formamide reduces the melting temperature of DNA-DNA and DNA-RNA duplexes by 0.6 to 0.7° C. for each percent formamide, and addition of 50% formamide allows hybridisation to be performed at 30 to 45° C., though the rate of hybridisation will be lowered. Base pair mismatches reduce the hybridisation rate and the thermal stability of the duplexes. On average and for large probes, the Tm decreases about 1° C. per % base mismatch. The Tm may be calculated using the following equations, depending on the types of hybrids:

[0030] 1) DNA-DNA hybrids (Meinkoth and Wahl, Anal. Biochem., 138: 267-284, 1984):

$$T_m = 81.5\,^{\circ}\mathrm{C} + 16.6 \times \log_{10}[\mathrm{Na}^{+}]^{(a)} + 0.41 \times \%[\mathrm{G/C}]^{(b)} - 500 \times [L^{(c)}]^{-1} - 0.61 \times \%\,\mathrm{formamide}$$

[0031] 2) DNA-RNA or RNA-RNA hybrids:

$$T_m = 79.8 + 18.5\,(\log_{10}[\mathrm{Na}^{+}]^{(a)}) + 0.58\,(\%\,\mathrm{G/C}^{(b)}) + 11.8\,(\%\,\mathrm{G/C}^{(b)})^{2} - 820/L^{(c)}$$

[0032] 3) oligo-DNA or oligo-RNA (d) hybrids:

[0033] For <20 nucleotides: $T_m = 2\,(l_n)$

[0034] For 20-35 nucleotides: $T_m = 22 + 1.46\,(l_n)$

[0035] (a) or for other monovalent cation, but only accurate in the 0.01-0.4 M range.

[0036] (b) only accurate for % GC in the 30% to 75% range.

[0037] (c) L = length of duplex in base pairs.

[0038] (d) oligo, oligonucleotide; $l_n$ = effective length of primer = 2×(no. of G/C)+(no. of A/T).
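The DNA-DNA and oligonucleotide Tm formulas above can be sketched in Python as follows. This is an illustrative sketch only: the function names `tm_dna_dna`, `effective_length` and `tm_oligo` are hypothetical, the choice to raise an error outside the stated accuracy ranges is a design assumption rather than something the text requires, and U is counted like T for oligo-RNA as a further assumption. The DNA-RNA/RNA-RNA formula is deliberately left out.

```python
import math


def tm_dna_dna(na_molar, percent_gc, length_bp, percent_formamide=0.0):
    """Tm (deg C) of a DNA-DNA hybrid using the first formula in the text.

    Raises ValueError outside the ranges the text states as accurate
    (0.01-0.4 M monovalent cation, 30-75 % GC); that is a choice made here.
    """
    if not 0.01 <= na_molar <= 0.4:
        raise ValueError("cation concentration outside the 0.01-0.4 M range")
    if not 30 <= percent_gc <= 75:
        raise ValueError("% GC outside the 30-75 % range")
    if length_bp <= 0:
        raise ValueError("duplex length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * percent_gc
            - 500.0 / length_bp
            - 0.61 * percent_formamide)


def effective_length(oligo):
    """Effective primer length: 2 x (no. of G/C) + (no. of A/T)."""
    seq = oligo.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "ATU")  # U treated like T (assumption)
    if gc + at != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T, U")
    return 2 * gc + at


def tm_oligo(oligo):
    """Tm (deg C) of an oligonucleotide hybrid of up to 35 nucleotides."""
    n = len(oligo)
    ln = effective_length(oligo)
    if n < 20:
        return 2 * ln
    if n <= 35:
        return 22 + 1.46 * ln
    raise ValueError("oligo formulas in the text cover at most 35 nucleotides")
```

As a worked case, a 500 bp duplex with 50% GC in 0.1 M Na+ and no formamide gives 81.5 - 16.6 + 20.5 - 1.0 = 84.4° C.; adding 50% formamide lowers this by 30.5° C. to 53.9° C.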

[0039] Non-specific binding may be controlled using any one of a number of known techniques such as, for example, blocking the membrane with protein-containing solutions, additions of heterologous RNA, DNA, and SDS to the hybridisation buffer, and treatment with RNase. For non-homologous probes, a series of hybridisations may be performed by either (i) progressively lowering the annealing temperature (for example from 68° C. to 42° C.) or (ii) progressively lowering the formamide concentration (for example from 50% to 0%). The skilled artisan is aware of various parameters which may be altered during hybridisation and which will either maintain or change the stringency conditions.

[0040] Besides the hybridisation conditions, specificity of hybridisation typically also depends on the function of post-hybridisation washes. To remove background resulting from non-specific hybridisation, samples are washed with dilute salt solutions. Critical factors of such washes include the ionic strength and temperature of the final wash solution: the lower the salt concentration and the higher the wash temperature, the higher the stringency of the wash. Wash conditions are typically performed at or below hybridisation stringency. A positive hybridisation gives a signal that is at least twice that of the background. Generally, suitable stringent conditions for nucleic acid hybridisation assays or gene amplification detection procedures are as set forth above. More or less stringent conditions may also be selected. The skilled artisan is aware of various parameters which may be altered during washing and which will either maintain or change the stringency conditions.

[0041] For example, typical high stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 65° C. in 1×SSC or at 42° C. in 1×SSC and 50% formamide, followed by washing at 65° C. in 0.3×SSC. Examples of medium stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 50° C. in 4×SSC or at 40° C. in 6×SSC and 50% formamide, followed by washing at 50° C. in 2×SSC. The length of the hybrid is the anticipated length for the hybridising nucleic acid. When nucleic acids of known sequence are hybridised, the hybrid length may be determined by aligning the sequences and identifying the conserved regions described herein. 1×SSC is 0.15M NaCl and 15 mM sodium citrate; the hybridisation solution and wash solutions may additionally include 5×Denhardt's reagent, 0.5-1.0% SDS, 100 μg/ml denatured, fragmented salmon sperm DNA, 0.5% sodium pyrophosphate.
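Because the conditions above are all expressed as multiples of SSC, it can help to convert a given strength into absolute salt concentrations. The following minimal Python sketch does so from the stated definition of 1×SSC (0.15 M NaCl and 15 mM sodium citrate); the function name `ssc_composition` and the dictionary keys are hypothetical.

```python
# Composition of 1x SSC as defined in the text.
NACL_M_PER_X = 0.15        # mol/l NaCl in 1x SSC
CITRATE_MM_PER_X = 15.0    # mmol/l sodium citrate in 1x SSC


def ssc_composition(fold):
    """Return NaCl (M) and sodium citrate (mM) for an n-fold SSC solution."""
    if fold <= 0:
        raise ValueError("SSC strength must be positive")
    return {
        "NaCl_M": fold * NACL_M_PER_X,
        "sodium_citrate_mM": fold * CITRATE_MM_PER_X,
    }
```

For the 0.3×SSC high stringency wash this gives roughly 0.045 M NaCl and 4.5 mM sodium citrate, compared with 0.3 M NaCl and 30 mM sodium citrate for the 2×SSC medium stringency wash.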

[0042] For the purposes of defining the level of stringency, reference can be made to Sambrook et al. (2001) Molecular Cloning: a laboratory manual, 3rd Edition, Cold Spring Harbor Laboratory Press, CSH, New York or to Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989 and yearly updates).

Splice Variant

[0043] The term "splice variant" as used herein encompasses variants of a nucleic acid sequence in which selected introns and/or exons have been excised, replaced, displaced or added, or in which introns have been shortened or lengthened. Such variants will be ones in which the biological activity of the protein is substantially retained; this may be achieved by selectively retaining functional segments of the protein. Such splice variants may be found in nature or may be manmade. Methods for predicting and isolating such splice variants are well known in the art (see for example Foissac and Schiex (2005) BMC Bioinformatics 6: 25).

Allelic Variant

[0044] Alleles or allelic variants are alternative forms of a given gene, located at the same chromosomal position. Allelic variants encompass Single Nucleotide Polymorphisms (SNPs), as well as Small Insertion/Deletion Polymorphisms (INDELs). The size of INDELs is usually less than 100 bp. SNPs and INDELs form the largest set of sequence variants in naturally occurring polymorphic strains of most organisms.

Gene Shuffling/Directed Evolution

[0045] Gene shuffling or directed evolution consists of iterations of DNA shuffling followed by appropriate screening and/or selection to generate variants of nucleic acids or portions thereof encoding proteins having a modified biological activity (Castle et al., (2004) Science 304(5674): 1151-4; U.S. Pat. Nos. 5,811,238 and 6,395,547).

Regulatory Element/Control Sequence/Promoter

[0046] The terms "regulatory element", "control sequence" and "promoter" are all used interchangeably herein and are to be taken in a broad context to refer to regulatory nucleic acid sequences capable of effecting expression of the sequences to which they are ligated. The term "promoter" typically refers to a nucleic acid control sequence located upstream from the transcriptional start of a gene and which is involved in recognising and binding of RNA polymerase and other proteins, thereby directing transcription of an operably linked nucleic acid. Encompassed by the aforementioned terms are transcriptional regulatory sequences derived from a classical eukaryotic genomic gene (including the TATA box which is required for accurate transcription initiation, with or without a CCAAT box sequence) and additional regulatory elements (i.e. upstream activating sequences, enhancers and silencers) which alter gene expression in response to developmental and/or external stimuli, or in a tissue-specific manner. Also included within the term is a transcriptional regulatory sequence of a classical prokaryotic gene, in which case it may include a -35 box sequence and/or -10 box transcriptional regulatory sequences. The term "regulatory element" also encompasses a synthetic fusion molecule or derivative that confers, activates or enhances expression of a nucleic acid molecule in a cell, tissue or organ.

[0047] A "plant promoter" comprises regulatory elements, which mediate the expression of a coding sequence segment in plant cells. Accordingly, a plant promoter need not be of plant origin, but may originate from viruses or micro-organisms, for example from viruses which attack plant cells. The "plant promoter" can also originate from a plant cell, e.g. from the plant which is transformed with the nucleic acid sequence to be expressed in the inventive process and described herein. This also applies to other "plant" regulatory signals, such as "plant" terminators. The promoters upstream of the nucleotide sequences useful in the methods of the present invention can be modified by one or more nucleotide substitution(s), insertion(s) and/or deletion(s) without interfering with the functionality or activity of either the promoters, the open reading frame (ORF) or the 3'-regulatory region such as terminators or other 3' regulatory regions which are located away from the ORF. It is furthermore possible that the activity of the promoters is increased by modification of their sequence, or that they are replaced completely by more active promoters, even promoters from heterologous organisms. For expression in plants, the nucleic acid molecule must, as described above, be linked operably to or comprise a suitable promoter which expresses the gene at the right point in time and with the required spatial expression pattern.

[0048] For the identification of functionally equivalent promoters, the promoter strength and/or expression pattern of a candidate promoter may be analysed for example by operably linking the promoter to a reporter gene and assaying the expression level and pattern of the reporter gene in various tissues of the plant. Suitable well-known reporter genes include for example beta-glucuronidase or beta-galactosidase. The promoter activity is assayed by measuring the enzymatic activity of the beta-glucuronidase or beta-galactosidase. The promoter strength and/or expression pattern may then be compared to that of a reference promoter (such as the one used in the methods of the present invention). Alternatively, promoter strength may be assayed by quantifying mRNA levels or by comparing mRNA levels of the nucleic acid used in the methods of the present invention, with mRNA levels of housekeeping genes such as 18S rRNA, using methods known in the art, such as Northern blotting with densitometric analysis of autoradiograms, quantitative real-time PCR or RT-PCR (Heid et al., 1996 Genome Methods 6: 986-994). Generally by "weak promoter" is intended a promoter that drives expression of a coding sequence at a low level. By "low level" is intended at levels of about 1/10,000 transcripts to about 1/100,000 transcripts, to about 1/500,000 transcripts per cell. Conversely, a "strong promoter" drives expression of a coding sequence at high level, or at about 1/10 transcripts to about 1/100 transcripts to about 1/1000 transcripts per cell. Generally, by "medium strength promoter" is intended a promoter that drives expression of a coding sequence at a lower level than a strong promoter, in particular at a level that is in all instances below that obtained when under the control of a 35S CaMV promoter.

Operably Linked

[0049] The term "operably linked" as used herein refers to a functional linkage between the promoter sequence and the gene of interest, such that the promoter sequence is able to initiate transcription of the gene of interest.

Constitutive Promoter

[0050] A "constitutive promoter" refers to a promoter that is transcriptionally active during most, but not necessarily all, phases of growth and development and under most environmental conditions, in at least one cell, tissue or organ. Table 2a below gives examples of constitutive promoters.

TABLE 2a Examples of constitutive promoters
Gene Source | Reference
Actin | McElroy et al, Plant Cell, 2: 163-171, 1990
HMGP | WO 2004/070039
CaMV 35S | Odell et al, Nature, 313: 810-812, 1985
CaMV 19S | Nilsson et al., Physiol. Plant. 100: 456-462, 1997
GOS2 | de Pater et al, Plant J Nov; 2(6): 837-44, 1992, WO 2004/065596
Ubiquitin | Christensen et al, Plant Mol. Biol. 18: 675-689, 1992
Rice cyclophilin | Buchholz et al, Plant Mol Biol. 25(5): 837-43, 1994
Maize H3 histone | Lepetit et al, Mol. Gen. Genet. 231: 276-285, 1992
Alfalfa H3 histone | Wu et al. Plant Mol. Biol. 11: 641-649, 1988
Actin 2 | An et al, Plant J. 10(1); 107-121, 1996
34S FMV | Sanger et al., Plant. Mol. Biol., 14, 1990: 433-443
Rubisco small subunit | U.S. Pat. No. 4,962,028
OCS | Leisner (1988) Proc Natl Acad Sci USA 85(5): 2553
SAD1 | Jain et al., Crop Science, 39 (6), 1999: 1696
SAD2 | Jain et al., Crop Science, 39 (6), 1999: 1696
nos | Shaw et al. (1984) Nucleic Acids Res. 12(20): 7831-7846
V-ATPase | WO 01/14572
Super promoter | WO 95/14098
G-box proteins | WO 94/12015

Ubiquitous Promoter

[0051] A ubiquitous promoter is active in substantially all tissues or cells of an organism.

Developmentally-Regulated Promoter

[0052] A developmentally-regulated promoter is active during certain developmental stages or in parts of the plant that undergo developmental changes.

Inducible Promoter

[0053] An inducible promoter has induced or increased transcription initiation in response to a chemical (for a review see Gatz 1997, Annu. Rev. Plant Physiol. Plant Mol. Biol., 48:89-108), environmental or physical stimulus, or may be "stress-inducible", i.e. activated when a plant is exposed to various stress conditions, or "pathogen-inducible", i.e. activated when a plant is exposed to various pathogens.

Organ-Specific/Tissue-Specific Promoter

[0054] An organ-specific or tissue-specific promoter is one that is capable of preferentially initiating transcription in certain organs or tissues, such as the leaves, roots, seed tissue etc. For example, a "root-specific promoter" is a promoter that is transcriptionally active predominantly in plant roots, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts. Promoters able to initiate transcription in certain cells only are referred to herein as "cell-specific".

[0055] Examples of root-specific promoters are listed in Table 2b below:

TABLE 2b Examples of root-specific promoters
Gene Source | Reference
RCc3 | Plant Mol Biol. 1995 Jan; 27(2): 237-48
Arabidopsis PHT1 | Koyama et al., 2005; Mudge et al. (2002, Plant J. 31: 341)
Medicago phosphate transporter | Xiao et al., 2006
Arabidopsis Pyk10 | Nitz et al. (2001) Plant Sci 161(2): 337-346
root-expressible genes | Tingey et al., EMBO J. 6: 1, 1987.
tobacco auxin-inducible gene | Van der Zaal et al., Plant Mol. Biol. 16, 983, 1991.
β-tubulin | Oppenheimer, et al., Gene 63: 87, 1988.
tobacco root-specific genes | Conkling, et al., Plant Physiol. 93: 1203, 1990.
B. napus G1-3b gene | U.S. Pat. No. 5,401,836
SbPRP1 | Suzuki et al., Plant Mol. Biol. 21: 109-119, 1993.
LRX1 | Baumberger et al. 2001, Genes & Dev. 15: 1128
BTG-26 Brassica napus | US 20050044585
LeAMT1 (tomato) | Lauter et al. (1996, PNAS 3: 8139)
The LeNRT1-1 (tomato) | Lauter et al. (1996, PNAS 3: 8139)
class I patatin gene (potato) | Liu et al., Plant Mol. Biol. 153: 386-395, 1991.
KDC1 (Daucus carota) | Downey et al. (2000, J. Biol. Chem. 275: 39420)
TobRB7 gene | W Song (1997) PhD Thesis, North Carolina State University, Raleigh, NC USA
OsRAB5a (rice) | Wang et al. 2002, Plant Sci. 163: 273
ALF5 (Arabidopsis) | Diener et al. (2001, Plant Cell 13: 1625)
NRT2;1Np (N. plumbaginifolia) | Quesada et al. (1997, Plant Mol. Biol. 34: 265)

[0056] A seed-specific promoter is transcriptionally active predominantly in seed tissue, but not necessarily exclusively in seed tissue (in cases of leaky expression). The seed-specific promoter may be active during seed development and/or during germination. Examples of seed-specific promoters are shown in Table 2c to Table 2f below. Further examples of seed-specific promoters are given in Qing Qu and Takaiwa (Plant Biotechnol. J. 2, 113-125, 2004), which disclosure is incorporated by reference herein as if fully set forth.

TABLE 2c Examples of seed-specific promoters
Gene source | Reference
seed-specific genes | Simon et al., Plant Mol. Biol. 5: 191, 1985; Scofield et al., J. Biol. Chem. 262: 12202, 1987.; Baszczynski et al., Plant Mol. Biol. 14: 633, 1990.
Brazil Nut albumin | Pearson et al., Plant Mol. Biol. 18: 235-245, 1992.
legumin | Ellis et al., Plant Mol. Biol. 10: 203-214, 1988.
glutelin (rice) | Takaiwa et al., Mol. Gen. Genet. 208: 15-22, 1986; Takaiwa et al., FEBS Letts. 221: 43-47, 1987.
zein | Matzke et al Plant Mol Biol, 14(3): 323-32 1990
napA | Stalberg et al, Planta 199: 515-519, 1996.
wheat LMW and HMW glutenin-1 | Mol Gen Genet 216: 81-90, 1989; NAR 17: 461-2, 1989
wheat SPA | Albani et al, Plant Cell, 9: 171-184, 1997
wheat α, β, γ-gliadins | EMBO J. 3: 1409-15, 1984
barley ltr1 promoter | Diaz et al. (1995) Mol Gen Genet 248(5): 592-8
barley B1, C, D, hordein | Theor Appl Gen 98: 1253-62, 1999; Plant J 4: 343-55, 1993; Mol Gen Genet 250: 750-60, 1996
barley DOF | Mena et al, The Plant Journal, 116(1): 53-62, 1998
blz2 | EP99106056.7
synthetic promoter | Vicente-Carbajosa et al., Plant J. 13: 629-640, 1998.
rice prolamin NRP33 | Wu et al, Plant Cell Physiology 39(8) 885-889, 1998
rice α-globulin Glb-1 | Wu et al, Plant Cell Physiology 39(8) 885-889, 1998
rice OSH1 | Sato et al, Proc. Natl. Acad. Sci. USA, 93: 8117-8122, 1996
rice α-globulin REB/OHP-1 | Nakase et al. Plant Mol. Biol. 33: 513-522, 1997
rice ADP-glucose pyrophosphorylase | Trans Res 6: 157-68, 1997
maize ESR gene family | Plant J 12: 235-46, 1997
sorghum α-kafirin | DeRose et al., Plant Mol. Biol 32: 1029-35, 1996
KNOX | Postma-Haarsma et al, Plant Mol. Biol. 39: 257-71, 1999
rice oleosin | Wu et al, J. Biochem. 123: 386, 1998
sunflower oleosin | Cummins et al., Plant Mol. Biol. 19: 873-876, 1992
PRO0117, putative rice 40S ribosomal protein | WO 2004/070039
PRO0136, rice alanine aminotransferase | unpublished
PRO0147, trypsin inhibitor ITR1 (barley) | unpublished
PRO0151, rice WSI18 | WO 2004/070039
PRO0175, rice RAB21 | WO 2004/070039
PRO005 | WO 2004/070039
PRO0095 | WO 2004/070039
α-amylase (Amy32b) | Lanahan et al, Plant Cell 4: 203-211, 1992; Skriver et al, Proc Natl Acad Sci USA 88: 7266-7270, 1991
cathepsin β-like gene | Cejudo et al, Plant Mol Biol 20: 849-856, 1992
Barley Ltp2 | Kalla et al., Plant J. 6: 849-60, 1994
Chi26 | Leah et al., Plant J. 4: 579-89, 1994
Maize B-Peru | Selinger et al., Genetics 149; 1125-38, 1998

TABLE 2d Examples of endosperm-specific promoters
Gene source | Reference
glutelin (rice) | Takaiwa et al. (1986) Mol Gen Genet 208: 15-22; Takaiwa et al. (1987) FEBS Letts. 221: 43-47
zein | Matzke et al., (1990) Plant Mol Biol 14(3): 323-32
wheat LMW and HMW glutenin-1 | Colot et al. (1989) Mol Gen Genet 216: 81-90, Anderson et al. (1989) NAR 17: 461-2
wheat SPA | Albani et al. (1997) Plant Cell 9: 171-184
wheat gliadins | Rafalski et al. (1984) EMBO 3: 1409-15
barley ltr1 promoter | Diaz et al. (1995) Mol Gen Genet 248(5): 592-8
barley B1, C, D, hordein | Cho et al. (1999) Theor Appl Genet 98: 1253-62; Muller et al. (1993) Plant J 4: 343-55; Sorenson et al. (1996) Mol Gen Genet 250: 750-60
barley DOF | Mena et al, (1998) Plant J 116(1): 53-62
blz2 | Onate et al. (1999) J Biol Chem 274(14): 9175-82
synthetic promoter | Vicente-Carbajosa et al. (1998) Plant J 13: 629-640
rice prolamin NRP33 | Wu et al, (1998) Plant Cell Physiol 39(8) 885-889
rice globulin Glb-1 | Wu et al. (1998) Plant Cell Physiol 39(8) 885-889
rice globulin REB/OHP-1 | Nakase et al. (1997) Plant Molec Biol 33: 513-522
rice ADP-glucose pyrophosphorylase | Russell et al. (1997) Trans Res 6: 157-68
maize ESR gene family | Opsahl-Ferstad et al. (1997) Plant J 12: 235-46
sorghum kafirin | DeRose et al. (1996) Plant Mol Biol 32: 1029-35

TABLE 2e Examples of embryo-specific promoters
Gene source | Reference
rice OSH1 | Sato et al, Proc. Natl. Acad. Sci. USA, 93: 8117-8122, 1996
KNOX | Postma-Haarsma et al, Plant Mol. Biol. 39: 257-71, 1999
PRO0151 | WO 2004/070039
PRO0175 | WO 2004/070039
PRO005 | WO 2004/070039
PRO0095 | WO 2004/070039

TABLE 2f Examples of aleurone-specific promoters
Gene source | Reference
α-amylase (Amy32b) | Lanahan et al, Plant Cell 4: 203-211, 1992; Skriver et al, Proc Natl Acad Sci USA 88: 7266-7270, 1991
cathepsin β-like gene | Cejudo et al, Plant Mol Biol 20: 849-856, 1992
Barley Ltp2 | Kalla et al., Plant J. 6: 849-60, 1994
Chi26 | Leah et al., Plant J. 4: 579-89, 1994
Maize B-Peru | Selinger et al., Genetics 149; 1125-38, 1998

[0057] A green tissue-specific promoter as defined herein is a promoter that is transcriptionally active predominantly in green tissue, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts.

[0058] Examples of green tissue-specific promoters which may be used to perform the methods of the invention are shown in Table 2g below.

TABLE 2g Examples of green tissue-specific promoters
Gene | Expression | Reference
Maize Orthophosphate dikinase | Leaf specific | Fukayama et al., 2001
Maize Phosphoenolpyruvate carboxylase | Leaf specific | Kausch et al., 2001
Rice Phosphoenolpyruvate carboxylase | Leaf specific | Liu et al., 2003
Rice small subunit Rubisco | Leaf specific | Nomura et al., 2000
rice beta expansin EXBP9 | Shoot specific | WO 2004/070039
Pigeonpea small subunit Rubisco | Leaf specific | Panguluri et al., 2005
Pea RBCS3A | Leaf specific |

[0059] Another example of a tissue-specific promoter is a meristem-specific promoter, which is transcriptionally active predominantly in meristematic tissue, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts. Examples of meristem-specific promoters which may be used to perform the methods of the invention are shown in Table 2h below.

TABLE 2h Examples of meristem-specific promoters
Gene source | Expression pattern | Reference
rice OSH1 | Shoot apical meristem, from embryo globular stage to seedling stage | Sato et al. (1996) Proc. Natl. Acad. Sci. USA, 93: 8117-8122
Rice metallothionein | Meristem specific | BAD87835.1
WAK1 & WAK 2 | Shoot and root apical meristems, and in expanding leaves and sepals | Wagner & Kohorn (2001) Plant Cell 13(2): 303-318

Terminator

[0060] The term "terminator" encompasses a control sequence which is a DNA sequence at the end of a transcriptional unit which signals 3' processing and polyadenylation of a primary transcript and termination of transcription. The terminator can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The terminator to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.

Modulation

[0061] The term "modulation" means, in relation to expression or gene expression, a process in which the expression level is changed by said gene expression in comparison to the control plant; the expression level may be increased or decreased. The original, unmodulated expression may be of any kind of expression of a structural RNA (rRNA, tRNA) or mRNA with subsequent translation. The term "modulating the activity" shall mean any change of the expression of the inventive nucleic acid sequences or encoded proteins, which leads to increased yield and/or increased growth of the plants.

Expression

[0062] The term "expression" or "gene expression" means the transcription of a specific gene or specific genes or specific genetic construct. The term "expression" or "gene expression" in particular means the transcription of a gene or genes or genetic construct into structural RNA (rRNA, tRNA) or mRNA with or without subsequent translation of the latter into a protein. The process includes transcription of DNA and processing of the resulting mRNA product.

Increased Expression/Overexpression

[0063] The term "increased expression" or "overexpression" as used herein means any form of expression that is additional to the original wild-type expression level.

[0064] Methods for increasing expression of genes or gene products are well documented in the art and include, for example, overexpression driven by appropriate promoters, the use of transcription enhancers or translation enhancers. Isolated nucleic acids which serve as promoter or enhancer elements may be introduced in an appropriate position (typically upstream) of a non-heterologous form of a polynucleotide so as to upregulate expression of a nucleic acid encoding the polypeptide of interest. For example, endogenous promoters may be altered in vivo by mutation, deletion, and/or substitution (see, Kmiec, U.S. Pat. No. 5,565,350; Zarling et al., WO9322443), or isolated promoters may be introduced into a plant cell in the proper orientation and distance from a gene of the present invention so as to control the expression of the gene.

[0065] If polypeptide expression is desired, it is generally desirable to include a polyadenylation region at the 3'-end of a polynucleotide coding region. The polyadenylation region can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The 3' end sequence to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.

[0066] An intron sequence may also be added to the 5' untranslated region (UTR) or the coding sequence of the partial coding sequence to increase the amount of the mature message that accumulates in the cytosol. Inclusion of a spliceable intron in the transcription unit in both plant and animal expression constructs has been shown to increase gene expression at both the mRNA and protein levels up to 1000-fold (Buchman and Berg (1988) Mol. Cell Biol. 8: 4395-4405; Callis et al. (1987) Genes Dev 1:1183-1200). Such intron enhancement of gene expression is typically greatest when placed near the 5' end of the transcription unit. Use of the maize introns Adh1-S intron 1, 2, and 6, and the Bronze-1 intron, is known in the art. For general information see: The Maize Handbook, Chapter 116, Freeling and Walbot, Eds., Springer, N.Y. (1994).

Endogenous Gene

[0067] Reference herein to an "endogenous" gene not only refers to the gene in question as found in a plant in its natural form (i.e., without there being any human intervention), but also refers to that same gene (or a substantially homologous nucleic acid/gene) in an isolated form subsequently (re)introduced into a plant (a transgene). For example, a transgenic plant containing such a transgene may encounter a substantial reduction of the transgene expression and/or substantial reduction of expression of the endogenous gene. The isolated gene may be isolated from an organism or may be manmade, for example by chemical synthesis.

Decreased Expression

[0068] Reference herein to "decreased expression" or "reduction or substantial elimination" of expression is taken to mean a decrease in endogenous gene expression and/or polypeptide levels and/or polypeptide activity relative to control plants. The reduction or substantial elimination is in increasing order of preference at least 10%, 20%, 30%, 40% or 50%, 60%, 70%, 80%, 85%, 90%, or 95%, 96%, 97%, 98%, 99% or more reduced compared to that of control plants.
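As a purely illustrative sketch of the arithmetic behind these figures, the reduction relative to control plants can be computed as a percentage and matched against the listed levels. The function names and the measurement values are hypothetical; any expression or activity measure on a common scale could be used.

```python
from typing import Optional

# Reduction levels (percent) as listed in the paragraph above.
THRESHOLDS = (10, 20, 30, 40, 50, 60, 70, 80, 85, 90, 95, 96, 97, 98, 99)


def percent_reduction(control_level: float, modified_level: float) -> float:
    """Percent decrease of a measurement relative to the control plant."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return 100.0 * (control_level - modified_level) / control_level


def highest_threshold_met(reduction: float) -> Optional[int]:
    """Largest listed level that the observed reduction reaches, if any."""
    met = [t for t in THRESHOLDS if reduction >= t]
    return max(met) if met else None


# Hypothetical measurements in arbitrary units.
reduction = percent_reduction(control_level=200.0, modified_level=30.0)
print(reduction, highest_threshold_met(reduction))  # 85.0 85
```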

[0069] For the reduction or substantial elimination of expression of an endogenous gene in a plant, a sufficient length of substantially contiguous nucleotides of a nucleic acid sequence is required. In order to perform gene silencing, this may be as little as 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10 or fewer nucleotides, alternatively this may be as much as the entire gene (including the 5' and/or 3' UTR, either in part or in whole). The stretch of substantially contiguous nucleotides may be derived from the nucleic acid encoding the protein of interest (target gene), or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest. Preferably, the stretch of substantially contiguous nucleotides is capable of forming hydrogen bonds with the target gene (either sense or antisense strand), more preferably, the stretch of substantially contiguous nucleotides has, in increasing order of preference, 50%, 60%, 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 100% sequence identity to the target gene (either sense or antisense strand). A nucleic acid sequence encoding a (functional) polypeptide is not a requirement for the various methods discussed herein for the reduction or substantial elimination of expression of an endogenous gene.
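For illustration, the sequence identity of a candidate stretch to either strand of a target gene can be estimated as sketched below. This is a minimal, ungapped comparison; real analyses would typically use an alignment tool. The function names and the sequences are hypothetical and are not taken from the sequence listing.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def best_identity(stretch: str, target: str) -> float:
    """Best ungapped identity of the stretch against the sense or antisense strand."""
    n = len(stretch)
    if n == 0 or n > len(target):
        raise ValueError("stretch must be non-empty and no longer than the target")
    best = 0.0
    for strand in (target, reverse_complement(target)):
        for i in range(len(strand) - n + 1):
            best = max(best, percent_identity(stretch, strand[i:i + n]))
    return best


# Hypothetical target and a 20-nucleotide stretch taken from its antisense strand.
target = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
stretch = reverse_complement(target[5:25])
print(best_identity(stretch, target))  # 100.0
```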

[0070] This reduction or substantial elimination of expression may be achieved using routine tools and techniques. A preferred method for the reduction or substantial elimination of endogenous gene expression is by introducing and expressing in a plant a genetic construct into which the nucleic acid (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest) is cloned as an inverted repeat (in part or completely), separated by a spacer (non-coding DNA).

[0071] In such a preferred method, expression of the endogenous gene is reduced or substantially eliminated through RNA-mediated silencing using an inverted repeat of a nucleic acid or a part thereof (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest), preferably capable of forming a hairpin structure. The inverted repeat is cloned in an expression vector comprising control sequences. A non-coding DNA nucleic acid sequence (a spacer, for example a matrix attachment region fragment (MAR), an intron, a polylinker, etc.) is located between the two inverted nucleic acids forming the inverted repeat. After transcription of the inverted repeat, a chimeric RNA with a self-complementary structure is formed (partial or complete). This double-stranded RNA structure is referred to as the hairpin RNA (hpRNA). The hpRNA is processed by the plant into siRNAs that are incorporated into an RNA-induced silencing complex (RISC). The RISC further cleaves the mRNA transcripts, thereby substantially reducing the number of mRNA transcripts to be translated into polypeptides. For further general details see, for example, Grierson et al. (1998) WO 98/53083; Waterhouse et al. (1999) WO 99/53050.
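The layout of such an inverted repeat can be illustrated with a short sketch: a sense arm, a spacer, and the reverse complement of the same fragment as the antisense arm, so that the transcript can fold back on itself. The function names and both sequences are hypothetical placeholders; promoter, terminator and cloning sites are deliberately left out.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_inverted_repeat(fragment: str, spacer: str) -> str:
    """Sense arm + spacer + antisense arm (reverse complement of the fragment)."""
    if not fragment or not spacer:
        raise ValueError("fragment and spacer must be non-empty")
    return fragment + spacer + reverse_complement(fragment)


def arms_are_complementary(insert: str, arm_length: int) -> bool:
    """Check that the two arms can base-pair to form the hairpin stem."""
    sense_arm = insert[:arm_length]
    antisense_arm = insert[-arm_length:]
    return reverse_complement(antisense_arm) == sense_arm


# Hypothetical target-gene fragment and spacer, for illustration only.
fragment = "ATGGCTTCCAAGGTC"
spacer = "GTAAGTTTCTGCAG"
insert = build_inverted_repeat(fragment, spacer)
print(insert)
print(arms_are_complementary(insert, len(fragment)))  # True
```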

[0072] Performance of the methods of the invention does not rely on introducing and expressing in a plant a genetic construct into which the nucleic acid is cloned as an inverted repeat, but any one or more of several well-known "gene silencing" methods may be used to achieve the same effects.

[0073] One such method for the reduction of endogenous gene expression is RNA-mediated silencing of gene expression (downregulation). Silencing in this case is triggered in a plant by a double stranded RNA sequence (dsRNA) that is substantially similar to the target endogenous gene. This dsRNA is further processed by the plant into fragments of about 20 to about 26 nucleotides called short interfering RNAs (siRNAs). The siRNAs are incorporated into an RNA-induced silencing complex (RISC) that cleaves the mRNA transcript of the endogenous target gene, thereby substantially reducing the number of mRNA transcripts to be translated into a polypeptide. Preferably, the double stranded RNA sequence corresponds to a target gene.

[0074] Another example of an RNA silencing method involves the introduction of nucleic acid sequences or parts thereof (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest) in a sense orientation into a plant. "Sense orientation" refers to a DNA sequence that is homologous to an mRNA transcript thereof. Introduced into a plant would therefore be at least one copy of the nucleic acid sequence. The additional nucleic acid sequence will reduce expression of the endogenous gene, giving rise to a phenomenon known as co-suppression. The reduction of gene expression will be more pronounced if several additional copies of a nucleic acid sequence are introduced into the plant, as there is a positive correlation between high transcript levels and the triggering of co-suppression.

[0075] Another example of an RNA silencing method involves the use of antisense nucleic acid sequences. An "antisense" nucleic acid sequence comprises a nucleotide sequence that is complementary to a "sense" nucleic acid sequence encoding a protein, i.e. complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA transcript sequence. The antisense nucleic acid sequence is preferably complementary to the endogenous gene to be silenced. The complementarity may be located in the "coding region" and/or in the "non-coding region" of a gene. The term "coding region" refers to a region of the nucleotide sequence comprising codons that are translated into amino acid residues. The term "non-coding region" refers to 5' and 3' sequences that flank the coding region that are transcribed but not translated into amino acids (also referred to as 5' and 3' untranslated regions).

[0076] Antisense nucleic acid sequences can be designed according to the rules of Watson and Crick base pairing. The antisense nucleic acid sequence may be complementary to the entire nucleic acid sequence (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest), but may also be an oligonucleotide that is antisense to only a part of the nucleic acid sequence (including the mRNA 5' and 3' UTR). For example, the antisense oligonucleotide sequence may be complementary to the region surrounding the translation start site of an mRNA transcript encoding a polypeptide. The length of a suitable antisense oligonucleotide sequence is known in the art and may start from about 50, 45, 40, 35, 30, 25, 20, 15 or 10 nucleotides in length or less. An antisense nucleic acid sequence according to the invention may be constructed using chemical synthesis and enzymatic ligation reactions using methods known in the art. For example, an antisense nucleic acid sequence (e.g., an antisense oligonucleotide sequence) may be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acid sequences, e.g., phosphorothioate derivatives and acridine substituted nucleotides may be used. Examples of modified nucleotides that may be used to generate the antisense nucleic acid sequences are well known in the art. Known nucleotide modifications include methylation, cyclization and 'caps' and substitution of one or more of the naturally occurring nucleotides with an analogue such as inosine. Other modifications of nucleotides are well known in the art.

[0077] The antisense nucleic acid sequence can be produced biologically using an expression vector into which a nucleic acid sequence has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest). Preferably, production of antisense nucleic acid sequences in plants occurs by means of a stably integrated nucleic acid construct comprising a promoter, an operably linked antisense oligonucleotide, and a terminator.

[0078] The nucleic acid molecules used for silencing in the methods of the invention (whether introduced into a plant or generated in situ) hybridize with or bind to mRNA transcripts and/or genomic DNA encoding a polypeptide to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid sequence which binds to DNA duplexes, through specific interactions in the major groove of the double helix. Antisense nucleic acid sequences may be introduced into a plant by transformation or direct injection at a specific tissue site. Alternatively, antisense nucleic acid sequences can be modified to target selected cells and then administered systemically. For example, for systemic administration, antisense nucleic acid sequences can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid sequence to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid sequences can also be delivered to cells using the vectors described herein.

[0079] According to a further aspect, the antisense nucleic acid sequence is an α-anomeric nucleic acid sequence. An α-anomeric nucleic acid sequence forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucl Ac Res 15: 6625-6641). The antisense nucleic acid sequence may also comprise a 2'-O-methylribonucleotide (Inoue et al. (1987) Nucl Ac Res 15, 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215, 327-330).

[0080] The reduction or substantial elimination of endogenous gene expression may also be performed using ribozymes. Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a single-stranded nucleic acid sequence, such as an mRNA, to which they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes, described in Haselhoff and Gerlach (1988) Nature 334, 585-591) can be used to catalytically cleave mRNA transcripts encoding a polypeptide, thereby substantially reducing the number of mRNA transcripts to be translated into a polypeptide. A ribozyme having specificity for a nucleic acid sequence can be designed (see for example: Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742). Alternatively, mRNA transcripts corresponding to a nucleic acid sequence can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules (Bartel and Szostak (1993) Science 261, 1411-1418). The use of ribozymes for gene silencing in plants is known in the art (e.g., Atkins et al. (1994) WO 94/00012; Lenne et al. (1995) WO 95/03404; Lutziger et al. (2000) WO 00/00619; Prinsen et al. (1997) WO 97/13865 and Scott et al. (1997) WO 97/38116).

[0081] Gene silencing may also be achieved by insertion mutagenesis (for example, T-DNA insertion or transposon insertion) or by strategies as described by, among others, Angell and Baulcombe ((1999) Plant J 20(3): 357-62), (Amplicon VIGS WO 98/36083), or Baulcombe (WO 99/15682).

[0082] Gene silencing may also occur if there is a mutation on an endogenous gene and/or a mutation on an isolated gene/nucleic acid subsequently introduced into a plant. The reduction or substantial elimination may be caused by a non-functional polypeptide. For example, the polypeptide may bind to various interacting proteins; one or more mutation(s) and/or truncation(s) may therefore provide for a polypeptide that is still able to bind interacting proteins (such as receptor proteins) but that cannot exhibit its normal function (such as signalling ligand).

[0083] A further approach to gene silencing is by targeting nucleic acid sequences complementary to the regulatory region of the gene (e.g., the promoter and/or enhancers) to form triple helical structures that prevent transcription of the gene in target cells. See Helene, C., Anticancer Drug Des. 6, 569-84, 1991; Helene et al., Ann. N.Y. Acad. Sci. 660, 27-36 1992; and Maher, L. J. BioEssays 14, 807-15, 1992.

[0084] Other methods, such as the use of antibodies directed to an endogenous polypeptide for inhibiting its function in planta, or interference in the signalling pathway in which a polypeptide is involved, will be well known to the skilled man. In particular, it can be envisaged that manmade molecules may be useful for inhibiting the biological function of a target polypeptide, or for interfering with the signalling pathway in which the target polypeptide is involved.

[0085] Alternatively, a screening program may be set up to identify in a plant population natural variants of a gene, which variants encode polypeptides with reduced activity. Such natural variants may also be used, for example, to perform homologous recombination.

[0086] Artificial and/or natural microRNAs (miRNAs) may be used to knock out gene expression and/or mRNA translation. Endogenous miRNAs are single stranded small RNAs, typically 19-24 nucleotides long. They function primarily to regulate gene expression and/or mRNA translation. Most plant microRNAs (miRNAs) have perfect or near-perfect complementarity with their target sequences. However, there are natural targets with up to five mismatches. They are processed from longer non-coding RNAs with characteristic fold-back structures by double-strand specific RNases of the Dicer family. Upon processing, they are incorporated in the RNA-induced silencing complex (RISC) by binding to its main component, an Argonaute protein. MiRNAs serve as the specificity components of RISC, since they base-pair to target nucleic acids, mostly mRNAs, in the cytoplasm. Subsequent regulatory events include target mRNA cleavage and destruction and/or translational inhibition. Effects of miRNA overexpression are thus often reflected in decreased mRNA levels of target genes.
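The notion of near-perfect complementarity can be illustrated by counting mismatches between a miRNA and a candidate target site. The sketch below is a simplification: it counts only strict Watson-Crick pairs (G:U wobble pairs are treated as mismatches, which is an assumption), ignores position-dependent rules, and uses the figure of five mismatches from the text as a default cut-off. The function names and the sequence are hypothetical.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def rna_reverse_complement(seq: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return seq.upper().translate(_RNA_COMPLEMENT)[::-1]


def count_mismatches(mirna: str, target_site: str) -> int:
    """Number of non-Watson-Crick positions when the miRNA pairs antiparallel
    with the target site; both sequences are written 5'->3'."""
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have the same length")
    perfect_partner = rna_reverse_complement(target_site)
    return sum(a != b for a, b in zip(mirna.upper(), perfect_partner))


def is_candidate_target(mirna: str, target_site: str, max_mismatches: int = 5) -> bool:
    """True if the site is within the allowed number of mismatches."""
    return count_mismatches(mirna, target_site) <= max_mismatches


# Hypothetical 20-nucleotide miRNA and a perfectly complementary site.
mirna = "UGACAGAAGAGAGUGAGCAC"
site = rna_reverse_complement(mirna)
print(count_mismatches(mirna, site))  # 0
```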

[0087] Artificial microRNAs (amiRNAs), which are typically 21 nucleotides in length, can be genetically engineered specifically to negatively regulate gene expression of single or multiple genes of interest. Determinants of plant microRNA target selection are well known in the art. Empirical parameters for target recognition have been defined and can be used to aid in the design of specific amiRNAs (Schwab et al., Dev. Cell 8, 517-527, 2005). Convenient tools for design and generation of amiRNAs and their precursors are also available to the public (Schwab et al., Plant Cell 18, 1121-1133, 2006).

[0088] For optimal performance, the gene silencing techniques used for reducing expression in a plant of an endogenous gene require the use of nucleic acid sequences from monocotyledonous plants for transformation of monocotyledonous plants, and from dicotyledonous plants for transformation of dicotyledonous plants. Preferably, a nucleic acid sequence from any given plant species is introduced into that same species. For example, a nucleic acid sequence from rice is transformed into a rice plant. However, it is not an absolute requirement that the nucleic acid sequence to be introduced originates from the same plant species as the plant in which it will be introduced. It is sufficient that there is substantial homology between the endogenous target gene and the nucleic acid to be introduced.

[0089] Described above are examples of various methods for the reduction or substantial elimination of expression in a plant of an endogenous gene. A person skilled in the art would readily be able to adapt the aforementioned methods for silencing so as to achieve reduction of expression of an endogenous gene in a whole plant or in parts thereof through the use of an appropriate promoter, for example.

Selectable Marker (Gene)/Reporter Gene

[0090] "Selectable marker", "selectable marker gene" or "reporter gene" includes any gene that confers a phenotype on a cell in which it is expressed to facilitate the identification and/or selection of cells that are transfected or transformed with a nucleic acid construct of the invention. These marker genes enable the identification of a successful transfer of the nucleic acid molecules via a series of different principles. Suitable markers may be selected from markers that confer antibiotic or herbicide resistance, that introduce a new metabolic trait or that allow visual selection. Examples of selectable marker genes include genes conferring resistance to antibiotics (such as nptII that phosphorylates neomycin and kanamycin, or hpt, phosphorylating hygromycin, or genes conferring resistance to, for example, bleomycin, streptomycin, tetracycline, chloramphenicol, ampicillin, gentamycin, geneticin (G418), spectinomycin or blasticidin), to herbicides (for example bar which provides resistance to Basta®; aroA or gox providing resistance against glyphosate, or the genes conferring resistance to, for example, imidazolinone, phosphinothricin or sulfonylurea), or genes that provide a metabolic trait (such as manA that allows plants to use mannose as sole carbon source or xylose isomerase for the utilisation of xylose, or antinutritive markers such as the resistance to 2-deoxyglucose). Expression of visual marker genes results in the formation of colour (for example β-glucuronidase, GUS or β-galactosidase with its coloured substrates, for example X-Gal), luminescence (such as the luciferin/luciferase system) or fluorescence (Green Fluorescent Protein, GFP, and derivatives thereof). This list represents only a small number of possible markers. The skilled worker is familiar with such markers. Different markers are preferred, depending on the organism and the selection method.

[0091] It is known that upon stable or transient integration of nucleic acids into plant cells, only a minority of the cells takes up the foreign DNA and, if desired, integrates it into its genome, depending on the expression vector used and the transfection technique used. To identify and select these integrants, a gene coding for a selectable marker (such as the ones described above) is usually introduced into the host cells together with the gene of interest. These markers can for example be used in mutants in which these genes are not functional by, for example, deletion by conventional methods. Furthermore, nucleic acid molecules encoding a selectable marker can be introduced into a host cell on the same vector that comprises the sequence encoding the polypeptides of the invention or used in the methods of the invention, or else in a separate vector. Cells which have been stably transfected with the introduced nucleic acid can be identified for example by selection (for example, cells which have integrated the selectable marker survive whereas the other cells die).

[0092] Since the marker genes, particularly genes for resistance to antibiotics and herbicides, are no longer required or are undesired in the transgenic host cell once the nucleic acids have been introduced successfully, the process according to the invention for introducing the nucleic acids advantageously employs techniques which enable the removal or excision of these marker genes. One such method is what is known as co-transformation. The co-transformation method employs two vectors simultaneously for the transformation, one vector bearing the nucleic acid according to the invention and a second bearing the marker gene(s). A large proportion of transformants (up to 40% or more) receives or, in the case of plants, comprises both vectors. In the case of transformation with Agrobacteria, the transformants usually receive only a part of the vector, i.e. the sequence flanked by the T-DNA, which usually represents the expression cassette. The marker genes can subsequently be removed from the transformed plant by performing crosses. In another method, marker genes integrated into a transposon are used for the transformation together with the desired nucleic acid (known as the Ac/Ds technology). The transformants can be crossed with a transposase source or the transformants are transformed with a nucleic acid construct conferring expression of a transposase, transiently or stably. In some cases (approx. 10%), the transposon jumps out of the genome of the host cell once transformation has taken place successfully and is lost. In a further number of cases, the transposon jumps to a different location. In these cases the marker gene must be eliminated by performing crosses. In microbiology, techniques were developed which make possible, or facilitate, the detection of such events. A further advantageous method relies on what is known as recombination systems, whose advantage is that elimination by crossing can be dispensed with. The best-known system of this type is what is known as the Cre/lox system. Cre is a recombinase that removes the sequences located between the loxP sequences. If the marker gene is integrated between the loxP sequences, it is removed once transformation has taken place successfully, by expression of the recombinase. Further recombination systems are the HIN/HIX, FLP/FRT and REP/STB system (Tribble et al., J. Biol. Chem., 275, 2000: 22255-22267; Velmurugan et al., J. Cell Biol., 149, 2000: 553-566). A site-specific integration into the plant genome of the nucleic acid sequences according to the invention is possible. Naturally, these methods can also be applied to microorganisms such as yeast, fungi or bacteria.
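The marker excision step of the Cre/lox system can be pictured with a toy string model. The sketch below treats a construct as a DNA string, finds two directly repeated loxP sites and deletes everything between them, leaving a single loxP site behind. The 34-bp site used is the commonly cited loxP sequence; the flanking and marker sequences and the function name are hypothetical placeholders, and the model ignores site orientation and recombination efficiency.

```python
# Commonly cited 34-bp loxP site: two 13-bp inverted repeats around an 8-bp spacer.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def excise_floxed_region(construct: str, site: str = LOXP) -> str:
    """Remove the sequence lying between the first two copies of `site`.

    One copy of the site remains after excision. If fewer than two sites
    are present, the construct is returned unchanged.
    """
    first = construct.find(site)
    if first == -1:
        return construct
    second = construct.find(site, first + len(site))
    if second == -1:
        return construct
    # Keep everything up to and including the first site, then resume
    # directly after the second site.
    return construct[: first + len(site)] + construct[second + len(site):]


# Hypothetical placeholder sequences, for illustration only.
left_flank = "CCCC"          # stands in for the gene of interest
marker_gene = "GGGGTTTT"     # stands in for the selectable marker
right_flank = "AAAA"         # stands in for downstream genomic sequence

construct = left_flank + LOXP + marker_gene + LOXP + right_flank
after_cre = excise_floxed_region(construct)

print(marker_gene in construct)
print(marker_gene in after_cre)
```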

Transgenic/Transgene/Recombinant

[0093] For the purposes of the invention, "transgenic", "transgene" or "recombinant" means with regard to, for example, a nucleic acid sequence, an expression cassette, gene construct or a vector comprising the nucleic acid sequence or an organism transformed with the nucleic acid sequences, expression cassettes or vectors according to the invention, all those constructions brought about by recombinant methods in which either

[0094] (a) the nucleic acid sequences encoding proteins useful in the methods of the invention, or

[0095] (b) genetic control sequence(s) which is operably linked with the nucleic acid sequence according to the invention, for example a promoter, or

[0096] (c) a) and b) are not located in their natural genetic environment or have been modified by recombinant methods, it being possible for the modification to take the form of, for example, a substitution, addition, deletion, inversion or insertion of one or more nucleotide residues. The natural genetic environment is understood as meaning the natural genomic or chromosomal locus in the original plant or the presence in a genomic library. In the case of a genomic library, the natural genetic environment of the nucleic acid sequence is preferably retained, at least in part. The environment flanks the nucleic acid sequence at least on one side and has a sequence length of at least 50 bp, preferably at least 500 bp, especially preferably at least 1000 bp, most preferably at least 5000 bp. A naturally occurring expression cassette--for example the naturally occurring combination of the natural promoter of the nucleic acid sequences with the corresponding nucleic acid sequence encoding a polypeptide useful in the methods of the present invention, as defined above--becomes a transgenic expression cassette when this expression cassette is modified by non-natural, synthetic ("artificial") methods such as, for example, mutagenic treatment. Suitable methods are described, for example, in U.S. Pat. No. 5,565,350 or WO 00/15815.

[0097] A transgenic plant for the purposes of the invention is thus understood as meaning, as above, that the nucleic acids used in the method of the invention are not at their natural locus in the genome of said plant, it being possible for the nucleic acids to be expressed homologously or heterologously. However, as mentioned, transgenic also means that, while the nucleic acids according to the invention or used in the inventive method are at their natural position in the genome of a plant, the sequence has been modified with regard to the natural sequence, and/or that the regulatory sequences of the natural sequences have been modified. Transgenic is preferably understood as meaning the expression of the nucleic acids according to the invention at an unnatural locus in the genome, i.e. homologous or, preferably, heterologous expression of the nucleic acids takes place. Preferred transgenic plants are mentioned herein.

Transformation

[0098] The term "introduction" or "transformation" as referred to herein encompasses the transfer of an exogenous polynucleotide into a host cell, irrespective of the method used for transfer. Plant tissue capable of subsequent clonal propagation, whether by organogenesis or embryogenesis, may be transformed with a genetic construct of the present invention and a whole plant regenerated therefrom. The particular tissue chosen will vary depending on the clonal propagation systems available for, and best suited to, the particular species being transformed. Exemplary tissue targets include leaf disks, pollen, embryos, cotyledons, hypocotyls, megagametophytes, callus tissue, existing meristematic tissue (e.g., apical meristem, axillary buds, and root meristems), and induced meristem tissue (e.g., cotyledon meristem and hypocotyl meristem). The polynucleotide may be transiently or stably introduced into a host cell and may be maintained non-integrated, for example, as a plasmid. Alternatively, it may be integrated into the host genome. The resulting transformed plant cell may then be used to regenerate a transformed plant in a manner known to persons skilled in the art.

[0099] The transfer of foreign genes into the genome of a plant is called transformation. Transformation of plant species is now a fairly routine technique. Advantageously, any of several transformation methods may be used to introduce the gene of interest into a suitable ancestor cell. The methods described for the transformation and regeneration of plants from plant tissues or plant cells may be utilized for transient or for stable transformation. Transformation methods include the use of liposomes, electroporation, chemicals that increase free DNA uptake, injection of the DNA directly into the plant, particle gun bombardment, transformation using viruses or pollen and microprojection. Methods may be selected from the calcium/polyethylene glycol method for protoplasts (Krens, F. A. et al., (1982) Nature 296, 72-74; Negrutiu I et al. (1987) Plant Mol Biol 8: 363-373); electroporation of protoplasts (Shillito R. D. et al. (1985) Bio/Technol 3, 1099-1102); microinjection into plant material (Crossway A et al., (1986) Mol. Gen Genet 202: 179-185); DNA or RNA-coated particle bombardment (Klein T M et al., (1987) Nature 327: 70); infection with (non-integrative) viruses and the like. Transgenic plants, including transgenic crop plants, are preferably produced via Agrobacterium-mediated transformation. An advantageous transformation method is the transformation in planta. To this end, it is possible, for example, to allow the agrobacteria to act on plant seeds or to inoculate the plant meristem with agrobacteria. It has proved particularly expedient in accordance with the invention to allow a suspension of transformed agrobacteria to act on the intact plant or at least on the flower primordia. The plant is subsequently grown on until the seeds of the treated plant are obtained (Clough and Bent, Plant J. (1998) 16, 735-743). Methods for Agrobacterium-mediated transformation of rice include well known methods for rice transformation, such as those described in any of the following: European patent application EP 1198985 A1, Aldemita and Hodges (Planta 199: 612-617, 1996); Chan et al. (Plant Mol Biol 22 (3): 491-506, 1993), Hiei et al. (Plant J 6 (2): 271-282, 1994), which disclosures are incorporated by reference herein as if fully set forth. In the case of corn transformation, the preferred method is as described in either Ishida et al. (Nat. Biotechnol 14(6): 745-50, 1996) or Frame et al. (Plant Physiol 129(1): 13-22, 2002), which disclosures are incorporated by reference herein as if fully set forth. Said methods are further described by way of example in B. Jenes et al., Techniques for Gene Transfer, in: Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press (1993) 128-143 and in Potrykus, Annu. Rev. Plant Physiol. Plant Molec. Biol. 42 (1991) 205-225. The nucleic acids or the construct to be expressed is preferably cloned into a vector, which is suitable for transforming Agrobacterium tumefaciens, for example pBin19 (Bevan et al., Nucl. Acids Res. 12 (1984) 8711). Agrobacteria transformed by such a vector can then be used in known manner for the transformation of plants, such as plants used as a model, like Arabidopsis (Arabidopsis thaliana is within the scope of the present invention not considered as a crop plant), or crop plants such as, by way of example, tobacco plants, for example by immersing bruised leaves or chopped leaves in an agrobacterial solution and then culturing them in suitable media. The transformation of plants by means of Agrobacterium tumefaciens is described, for example, by Höfgen and Willmitzer in Nucl. Acid Res. (1988) 16, 9877 or is known inter alia from F. F. White, Vectors for Gene Transfer in Higher Plants; in Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press, 1993, pp. 15-38.

[0100] In addition to the transformation of somatic cells, which then have to be regenerated into intact plants, it is also possible to transform the cells of plant meristems and in particular those cells which develop into gametes. In this case, the transformed gametes follow the natural plant development, giving rise to transgenic plants. Thus, for example, seeds of Arabidopsis are treated with agrobacteria and seeds are obtained from the developing plants of which a certain proportion is transformed and thus transgenic [Feldmann, K A and Marks M D (1987). Mol Gen Genet 208:274-289; Feldmann K (1992). In: C Koncz, N-H Chua and J Schell, eds, Methods in Arabidopsis Research. World Scientific, Singapore, pp. 274-289]. Alternative methods are based on the repeated removal of the inflorescences and incubation of the excision site in the center of the rosette with transformed agrobacteria, whereby transformed seeds can likewise be obtained at a later point in time (Chang (1994). Plant J. 5: 551-558; Katavic (1994). Mol Gen Genet, 245: 363-370). However, an especially effective method is the vacuum infiltration method with its modifications such as the "floral dip" method. In the case of vacuum infiltration of Arabidopsis, intact plants under reduced pressure are treated with an agrobacterial suspension [Bechtold, N (1993). C R Acad Sci Paris Life Sci, 316: 1194-1199], while in the case of the "floral dip" method the developing floral tissue is incubated briefly with a surfactant-treated agrobacterial suspension [Clough, S J and Bent A F (1998) The Plant J. 16, 735-743]. A certain proportion of transgenic seeds are harvested in both cases, and these seeds can be distinguished from non-transgenic seeds by growing under the above-described selective conditions. In addition, the stable transformation of plastids is advantageous because plastids are inherited maternally in most crops, reducing or eliminating the risk of transgene flow through pollen. The transformation of the chloroplast genome is generally achieved by a process which has been schematically displayed in Klaus et al., 2004 [Nature Biotechnology 22 (2), 225-229]. Briefly, the sequences to be transformed are cloned together with a selectable marker gene between flanking sequences homologous to the chloroplast genome. These homologous flanking sequences direct site-specific integration into the plastome. Plastidal transformation has been described for many different plant species and an overview is given in Bock (2001) Transgenic plastids in basic research and plant biotechnology. J Mol Biol. 2001 Sep. 21; 312 (3):425-38 or Maliga, P (2003) Progress towards commercialization of plastid transformation technology. Trends Biotechnol. 21, 20-28. Further biotechnological progress has recently been reported in the form of marker-free plastid transformants, which can be produced by a transient co-integrated marker gene (Klaus et al., 2004, Nature Biotechnology 22(2), 225-229).

T-DNA Activation Tagging

[0101] T-DNA activation tagging (Hayashi et al. Science (1992) 1350-1353), involves insertion of T-DNA, usually containing a promoter (may also be a translation enhancer or an intron), in the genomic region of the gene of interest or 10 kb up- or downstream of the coding region of a gene in a configuration such that the promoter directs expression of the targeted gene. Typically, regulation of expression of the targeted gene by its natural promoter is disrupted and the gene falls under the control of the newly introduced promoter. The promoter is typically embedded in a T-DNA. This T-DNA is randomly inserted into the plant genome, for example, through Agrobacterium infection and leads to modified expression of genes near the inserted T-DNA. The resulting transgenic plants show dominant phenotypes due to modified expression of genes close to the introduced promoter.
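The positional criterion in the paragraph above (insertion within the gene region or within 10 kb up- or downstream of the coding region) can be expressed as a simple coordinate check. This is a minimal sketch under stated assumptions: coordinates are hypothetical base-pair positions on one chromosome, the function name is invented for illustration, and promoter orientation, which also matters for activation, is not modelled.

```python
def insertion_in_activation_window(
    insertion_pos: int,
    coding_start: int,
    coding_end: int,
    window_bp: int = 10_000,
) -> bool:
    """Return True if a T-DNA insertion lies in the coding region or
    within `window_bp` up- or downstream of it.

    All positions are base-pair coordinates on the same chromosome, with
    coding_start <= coding_end. The 10 kb default follows the text.
    """
    if coding_start > coding_end:
        raise ValueError("coding_start must not exceed coding_end")
    return coding_start - window_bp <= insertion_pos <= coding_end + window_bp


# Hypothetical gene spanning positions 12,000-15,000.
print(insertion_in_activation_window(5_000, 12_000, 15_000))
print(insertion_in_activation_window(1_000, 12_000, 15_000))
```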

TILLING

[0102] The term "TILLING" is an abbreviation of "Targeted Induced Local Lesions In Genomes" and refers to a mutagenesis technology useful to generate and/or identify nucleic acids encoding proteins with modified expression and/or activity. TILLING also allows selection of plants carrying such mutant variants. These mutant variants may exhibit modified expression, either in strength or in location or in timing (if the mutations affect the promoter for example). These mutant variants may exhibit higher activity than that exhibited by the gene in its natural form. TILLING combines high-density mutagenesis with high-throughput screening methods. The steps typically followed in TILLING are: (a) EMS mutagenesis (Redei G P and Koncz C (1992) In Methods in Arabidopsis Research, Koncz C, Chua N H, Schell J, eds. Singapore, World Scientific Publishing Co, pp. 16-82; Feldmann et al., (1994) In Meyerowitz E M, Somerville C R, eds, Arabidopsis. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp 137-172; Lightner J and Caspar T (1998) In J Martinez-Zapater, J Salinas, eds, Methods in Molecular Biology, Vol. 82. Humana Press, Totowa, N.J., pp 91-104); (b) DNA preparation and pooling of individuals; (c) PCR amplification of a region of interest; (d) denaturation and annealing to allow formation of heteroduplexes; (e) DHPLC, where the presence of a heteroduplex in a pool is detected as an extra peak in the chromatogram; (f) identification of the mutant individual; and (g) sequencing of the mutant PCR product. Methods for TILLING are well known in the art (McCallum et al., (2000) Nat Biotechnol 18: 455-457; reviewed by Stemple (2004) Nat Rev Genet 5(2): 145-50).

Homologous Recombination

[0103] Homologous recombination allows introduction in a genome of a selected nucleic acid at a defined selected position. Homologous recombination is a standard technology used routinely in biological sciences for lower organisms such as yeast or the moss Physcomitrella. Methods for performing homologous recombination in plants have been described not only for model plants (Offringa et al. (1990) EMBO J 9(10): 3077-84) but also for crop plants, for example rice (Terada et al. (2002) Nat Biotech 20(10): 1030-4; Iida and Terada (2004) Curr Opin Biotech 15(2): 132-8), and approaches exist that are generally applicable regardless of the target organism (Miller et al, Nature Biotechnol. 25, 778-785, 2007).

Yield

[0104] The term "yield" in general means a measurable produce of economic value, typically related to a specified crop, to an area, and to a period of time. Individual plant parts directly contribute to yield based on their number, size and/or weight, or the actual yield is the yield per square meter for a crop and year, which is determined by dividing total production (includes both harvested and appraised production) by planted square meters. The term "yield" of a plant may relate to vegetative biomass (root and/or shoot biomass), to reproductive organs, and/or to propagules (such as seeds) of that plant.

Early Vigour

[0105] "Early vigour" refers to active healthy well-balanced growth especially during early stages of plant growth, and may result from increased plant fitness due to, for example, the plants being better adapted to their environment (i.e. optimizing the use of energy resources and partitioning between shoot and root). Plants having early vigour also show increased seedling survival and a better establishment of the crop, which often results in highly uniform fields (with the crop growing in uniform manner, i.e. with the majority of plants reaching the various stages of development at substantially the same time), and often better and higher yield. Therefore, early vigour may be determined by measuring various factors, such as thousand kernel weight, percentage germination, percentage emergence, seedling growth, seedling height, root length, root and shoot biomass and many more.

Increase/Improve/Enhance

[0106] The terms "increase", "improve" or "enhance" are interchangeable and shall mean in the sense of the application at least a 3%, 4%, 5%, 6%, 7%, 8%, 9% or 10%, preferably at least 15% or 20%, more preferably 25%, 30%, 35% or 40% more yield and/or growth in comparison to control plants as defined herein.
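As an illustration of how this definition could be applied, the following minimal sketch computes the percentage difference of a measured trait relative to control plants and checks it against a minimum threshold. The function names and the example values are hypothetical; the 3% default is the lowest threshold named in the definition above.

```python
def percent_increase(test_value: float, control_value: float) -> float:
    """Percentage change of a trait in test plants relative to control plants."""
    if control_value <= 0:
        raise ValueError("control_value must be positive")
    return (test_value - control_value) / control_value * 100.0


def is_increased(test_value: float, control_value: float,
                 min_percent: float = 3.0) -> bool:
    """True if the trait is at least `min_percent` higher than in controls."""
    return percent_increase(test_value, control_value) >= min_percent


# Hypothetical mean seed weight per plant, in grams.
control_mean = 20.0
transgenic_mean = 21.0

print(round(percent_increase(transgenic_mean, control_mean), 2))
print(is_increased(transgenic_mean, control_mean))
```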

Seed Yield

[0107] Increased seed yield may manifest itself as one or more of the following: a) an increase in seed biomass (total seed weight) which may be on an individual seed basis and/or per plant and/or per square meter; b) increased number of flowers per plant; c) increased number of (filled) seeds; d) increased seed filling rate (which is expressed as the ratio between the number of filled seeds divided by the total number of seeds); e) increased harvest index, which is expressed as a ratio of the yield of harvestable parts, such as seeds, divided by the total biomass; and f) increased thousand kernel weight (TKW), which is extrapolated from the number of filled seeds counted and their total weight. An increased TKW may result from an increased seed size and/or seed weight, and may also result from an increase in embryo and/or endosperm size.
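The ratios named in items d) to f) can be written out as a short worked example. This is a sketch under stated assumptions: the function names and the sample measurements are hypothetical, weights are in grams, and the seed filling rate is returned as a fraction rather than a percentage.

```python
def seed_filling_rate(filled_seeds: int, total_seeds: int) -> float:
    """Number of filled seeds divided by the total number of seeds."""
    if total_seeds <= 0:
        raise ValueError("total_seeds must be positive")
    return filled_seeds / total_seeds


def harvest_index(harvestable_yield_g: float, total_biomass_g: float) -> float:
    """Yield of harvestable parts (e.g. seeds) divided by total biomass."""
    if total_biomass_g <= 0:
        raise ValueError("total_biomass_g must be positive")
    return harvestable_yield_g / total_biomass_g


def thousand_kernel_weight(filled_seeds: int, filled_seed_weight_g: float) -> float:
    """Extrapolate the weight of 1000 seeds from a counted and weighed sample."""
    if filled_seeds <= 0:
        raise ValueError("filled_seeds must be positive")
    return filled_seed_weight_g / filled_seeds * 1000.0


# Hypothetical measurements for a single plant.
total_seeds = 400
filled_seeds = 320
filled_seed_weight_g = 8.0
total_biomass_g = 40.0

print(seed_filling_rate(filled_seeds, total_seeds))
print(harvest_index(filled_seed_weight_g, total_biomass_g))
print(thousand_kernel_weight(filled_seeds, filled_seed_weight_g))
```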

[0108] An increase in seed yield may also be manifested as an increase in seed size and/or seed volume. Furthermore, an increase in seed yield may also manifest itself as an increase in seed area and/or seed length and/or seed width and/or seed perimeter. Increased yield may also result in modified architecture, or may occur because of modified architecture.

Greenness Index

[0109] The "greenness index" as used herein is calculated from digital images of plants. For each pixel belonging to the plant object on the image, the ratio of the green value versus the red value (in the RGB model for encoding color) is calculated. The greenness index is expressed as the percentage of pixels for which the green-to-red ratio exceeds a given threshold. Under normal growth conditions, under salt stress growth conditions, and under reduced nutrient availability growth conditions, the greenness index of plants is measured in the last imaging before flowering. In contrast, under drought stress growth conditions, the greenness index of plants is measured in the first imaging after drought.
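The calculation described above can be sketched in a few lines. The example assumes that the pixels belonging to the plant object have already been segmented from the image and are supplied as (red, green, blue) tuples; the function name, the threshold value and the sample pixels are hypothetical, since the application does not state the threshold used.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue) values of one plant pixel


def greenness_index(plant_pixels: Iterable[Pixel], threshold: float) -> float:
    """Percentage of plant pixels whose green-to-red ratio exceeds `threshold`.

    The comparison green > threshold * red is equivalent to
    green / red > threshold for red > 0 and avoids division by zero.
    """
    pixels = list(plant_pixels)
    if not pixels:
        raise ValueError("no plant pixels supplied")
    above = sum(1 for red, green, _blue in pixels if green > threshold * red)
    return 100.0 * above / len(pixels)


# Four hypothetical plant pixels and a hypothetical threshold of 1.2.
sample_pixels = [
    (50, 120, 30),   # ratio 2.4
    (100, 110, 40),  # ratio 1.1
    (80, 200, 60),   # ratio 2.5
    (150, 90, 50),   # ratio 0.6
]

print(greenness_index(sample_pixels, threshold=1.2))
```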

Plant

[0110] The term "plant" as used herein encompasses whole plants, ancestors and progeny of the plants and plant parts, including seeds, shoots, stems, leaves, roots (including tubers), flowers, and tissues and organs, wherein each of the aforementioned comprises the gene/nucleic acid of interest. The term "plant" also encompasses plant cells, suspension cultures, callus tissue, embryos, meristematic regions, gametophytes, sporophytes, pollen and microspores, again wherein each of the aforementioned comprises the gene/nucleic acid of interest.

[0111] Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs selected from the list comprising Acer spp., Actinidia spp., Abelmoschus spp., Agave sisalana, Agropyron spp., Agrostis stolonifera, Allium spp., Amaranthus spp., Ammophila arenaria, Ananas comosus, Annona spp., Apium graveolens, Arachis spp., Artocarpus spp., Asparagus officinalis, Avena spp. (e.g. Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida), Averrhoa carambola, Bambusa sp., Benincasa hispida, Bertholletia excelsa, Beta vulgaris, Brassica spp. (e.g. Brassica napus, Brassica rapa ssp. [canola, oilseed rape, turnip rape]), Cadaba farinosa, Camellia sinensis, Canna indica, Cannabis sativa, Capsicum spp., Carex elata, Carica papaya, Carissa macrocarpa, Carya spp., Carthamus tinctorius, Castanea spp., Ceiba pentandra, Cichorium endivia, Cinnamomum spp., Citrullus lanatus, Citrus spp., Cocos spp., Coffea spp., Colocasia esculenta, Cola spp., Corchorus sp., Coriandrum sativum, Corylus spp., Crataegus spp., Crocus sativus, Cucurbita spp., Cucumis spp., Cynara spp., Daucus carota, Desmodium spp., Dimocarpus longan, Dioscorea spp., Diospyros spp., Echinochloa spp., Elaeis (e.g. Elaeis guineensis, Elaeis oleifera), Eleusine coracana, Eragrostis tef, Erianthus sp., Eriobotrya japonica, Eucalyptus sp., Eugenia uniflora, Fagopyrum spp., Fagus spp., Festuca arundinacea, Ficus carica, Fortunella spp., Fragaria spp., Ginkgo biloba, Glycine spp. (e.g. Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g. Helianthus annuus), Hemerocallis fulva, Hibiscus spp., Hordeum spp. (e.g. Hordeum vulgare), Ipomoea batatas, Juglans spp., Lactuca sativa, Lathyrus spp., Lens culinaris, Linum usitatissimum, Litchi chinensis, Lotus spp., Luffa acutangula, Lupinus spp., Luzula sylvatica, Lycopersicon spp. (e.g. Lycopersicon esculentum, Lycopersicon lycopersicum, Lycopersicon pyriforme), Macrotyloma spp., Malus spp., Malpighia emarginata, Mammea americana, Mangifera indica, Manihot spp., Manilkara zapota, Medicago sativa, Melilotus spp., Mentha spp., Miscanthus sinensis, Momordica spp., Morus nigra, Musa spp., Nicotiana spp., Olea spp., Opuntia spp., Ornithopus spp., Oryza spp. (e.g. Oryza sativa, Oryza latifolia), Panicum miliaceum, Panicum virgatum, Passiflora edulis, Pastinaca sativa, Pennisetum sp., Persea spp., Petroselinum crispum, Phalaris arundinacea, Phaseolus spp., Phleum pratense, Phoenix spp., Phragmites australis, Physalis spp., Pinus spp., Pistacia vera, Pisum spp., Poa spp., Populus spp., Prosopis spp., Prunus spp., Psidium spp., Punica granatum, Pyrus communis, Quercus spp., Raphanus sativus, Rheum rhabarbarum, Ribes spp., Ricinus communis, Rubus spp., Saccharum spp., Salix sp., Sambucus spp., Secale cereale, Sesamum spp., Sinapis sp., Solanum spp. (e.g. Solanum tuberosum, Solanum integrifolium or Solanum lycopersicum), Sorghum bicolor, Spinacia spp., Syzygium spp., Tagetes spp., Tamarindus indica, Theobroma cacao, Trifolium spp., Tripsacum dactyloides, Triticosecale rimpaui, Triticum spp. (e.g. Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum, Triticum monococcum or Triticum vulgare), Tropaeolum minus, Tropaeolum majus, Vaccinium spp., Vicia spp., Vigna spp., Viola odorata, Vitis spp., Zea mays, Zizania palustris, Ziziphus spp., amongst others.

DETAILED DESCRIPTION OF THE INVENTION

I NITR

[0112] Surprisingly, it has now been found that modulating expression in a plant of a nucleic acid encoding a NITR polypeptide gives plants having enhanced yield-related traits relative to control plants. According to a first embodiment, the present invention provides a method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a NITR polypeptide.

[0113] A preferred method for modulating (preferably, increasing) expression of a nucleic acid encoding a NITR polypeptide is by introducing and expressing in a plant a nucleic acid encoding a NITR polypeptide.

[0114] Any reference hereinafter to a "protein useful in the methods of the invention" is taken to mean a NITR polypeptide as defined herein. Any reference hereinafter to a "nucleic acid useful in the methods of the invention" is taken to mean a nucleic acid capable of encoding such a NITR polypeptide. The nucleic acid to be introduced into a plant (and therefore useful in performing the methods of the invention) is any nucleic acid encoding the type of protein which will now be described, hereafter also named "NITR nucleic acid" or "NITR gene".

[0115] A "NITR polypeptide" as defined herein refers to the nitrite reductase protein represented by SEQ ID NO: 2 and to homologues (orthologues and paralogues) thereof. Nitrite reductases belong to the enzyme class EC 1.7.7.1 and catalyse the reduction of nitrite to ammonium.

[0116] Preferably, the homologues of SEQ ID NO: 2 have a NIR_SIR domain. NIR_SIR domains (Pfam entry PF01077, Nitrite and sulphite reductase 4Fe-4S region) are well known in the art and may readily be identified by persons skilled in the art. Preferably, the NITR polypeptides also comprise one or more of the following domains:

[0117] InterPro: IPR005117 (Nitrite/sulphite reductase, hemoprotein beta-component, ferredoxin-like)

[0118] PFAM: PF03460 (NIR_SIR_ferr)

[0119] InterPro: IPR006066 (Nitrite and sulphite reductase iron-sulphur/siroheme-binding site)

[0120] PRINTS: PR00397 (SIROHAEM)

[0121] PROSITE: PS00365 (NIR_SIR)

[0122] InterPro: IPR006067 (Nitrite and sulphite reductase 4Fe-4S region)

[0123] GENE3D: G3DSA:3.30.413.10 (G3DSA:3.30.413.10)

[0124] Alternatively, the homologue of a NITR protein has in increasing order of preference at least 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% overall sequence identity to the amino acid sequence represented by SEQ ID NO: 2, provided that the homologous protein comprises the conserved domains as outlined above. The overall sequence identity is determined using a global alignment algorithm, such as the Needleman Wunsch algorithm in the program GAP (GCG Wisconsin Package, Accelrys), preferably with default parameters and preferably with sequences of mature proteins (i.e. without taking into account secretion signals or transit peptides). Compared to overall sequence identity, the sequence identity will generally be higher when only conserved domains or motifs are considered.

[0125] Preferably, the polypeptide sequence, when used in the construction of a phylogenetic tree such as the one depicted in FIG. 3, clusters with the group of NITR polypeptides comprising the amino acid sequence represented by SEQ ID NO: 2 rather than with Sulfite Reductases or any other group.

[0126] The terms "domain" and "motif" are defined in the "definitions" section herein. Specialist databases exist for the identification of domains, for example, SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864; Letunic et al. (2002) Nucleic Acids Res 30, 242-244), InterPro (Mulder et al., (2003) Nucl. Acids. Res. 31, 315-318), Prosite (Bucher and Bairoch (1994), A generalized profile syntax for biomolecular sequences motifs and its function in automatic sequence interpretation. (In) ISMB-94; Proceedings 2nd International Conference on Intelligent Systems for Molecular Biology. Altman R., Brutlag D., Karp P., Lathrop R., Searls D., Eds., pp 53-61, AAAI Press, Menlo Park; Hulo et al., Nucl. Acids. Res. 32:D134-D137, (2004)), or Pfam (Bateman et al., Nucleic Acids Research 30(1): 276-280 (2002)). A set of tools for in silico analysis of protein sequences is available on the ExPASy proteomics server (Swiss Institute of Bioinformatics; Gasteiger et al., ExPASy: the proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res. 31:3784-3788 (2003)). Domains or motifs may also be identified using routine techniques, such as by sequence alignment.

[0127] Methods for the alignment of sequences for comparison are well known in the art; such methods include GAP, BESTFIT, BLAST, FASTA and TFASTA. GAP uses the algorithm of Needleman and Wunsch ((1970) J Mol Biol 48: 443-453) to find the global (i.e. spanning the complete sequences) alignment of two sequences that maximizes the number of matches and minimizes the number of gaps. The BLAST algorithm (Altschul et al. (1990) J Mol Biol 215: 403-10) calculates percent sequence identity and performs a statistical analysis of the similarity between the two sequences. The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information (NCBI). Homologues may readily be identified using, for example, the ClustalW multiple sequence alignment algorithm (version 1.83), with the default pairwise alignment parameters, and a scoring method in percentage. Global percentages of similarity and identity may also be determined using one of the methods available in the MatGAT software package (Campanella et al., BMC Bioinformatics. 2003 Jul. 10; 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences.). Minor manual editing may be performed to optimise alignment between conserved motifs, as would be apparent to a person skilled in the art. Furthermore, instead of using full-length sequences for the identification of homologues, specific domains may also be used. The sequence identity values may be determined over the entire nucleic acid or amino acid sequence or over selected domains or conserved motif(s), using the programs mentioned above with the default parameters.
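The global alignment described above can be illustrated with a minimal Python sketch of the Needleman-Wunsch algorithm followed by a percent identity calculation. This is not the GAP program: the simple match/mismatch/gap scores, the function names and the choice to report identity over the full alignment length (gaps included) are assumptions made for illustration only, whereas GAP uses substitution matrices and its own gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b as two gapped strings.

    Scoring values are illustrative assumptions, not GAP defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a, aln_b):
    """Identical positions as a percentage of the alignment length (gaps counted)."""
    matches = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    return 100.0 * matches / len(aln_a)


aln = needleman_wunsch("ACGT", "ACT")
print(aln, percent_identity(*aln))
```

Other conventions for the denominator exist (for example the length of the shorter sequence), so identity values from different programs are only comparable when the same convention is used.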

[0128] Furthermore, NITR polypeptides (at least in their native form), as far as SEQ ID NO: 2 and its homologues are concerned, typically have oxidoreductase activity. Tools and techniques for measuring oxidoreductase activity are well known in the art; see, for example, Ferrari and Varner, Plant Physiol., 47(6), 790-794 (1971).

[0129] Nitrite reductases group together with Sulfite Reductases (EC 1.8.1.2, Hilz et al., Biochem. Z. 332, 151-166, 1959), which catalyse the reaction:

hydrogen sulfide + 3 NADP⁺ + 3 H₂O = sulfite + 3 NADPH + 3 H⁺

However, it should be noted that the group of Sulfite Reductases is not encompassed by the term NITR polypeptides as used in the present invention.

[0130] The present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 1, encoding the polypeptide sequence of SEQ ID NO: 2. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any NITR-encoding nucleic acid or NITR polypeptide as defined herein (thereby excluding the Sulfite Reductases).

[0131] Examples of nucleic acids encoding NITR polypeptides (such as those provided in FIG. 2 or in the sequence listing) may be found in databases known in the art. Such nucleic acids are useful in performing the methods of the invention. Orthologues and paralogues, the terms "orthologues" and "paralogues" being as defined herein, may readily be identified by performing a so-called reciprocal BLAST search. Typically, this involves a first BLAST in which a query sequence (for example SEQ ID NO: 2) is BLASTed against any sequence database, such as the publicly available NCBI database. BLASTN or TBLASTX (using standard default values) are generally used when starting from a nucleotide sequence, and BLASTP or TBLASTN (using standard default values) when starting from a protein sequence. The BLAST results may optionally be filtered. The full-length sequences of either the filtered results or non-filtered results are then BLASTed back (second BLAST) against sequences from the organism from which the query sequence is derived (where the query sequence is SEQ ID NO: 1 or SEQ ID NO: 2, the second BLAST would therefore be against Arabidopsis thaliana sequences). The results of the first and second BLASTs are then compared. A paralogue is identified if a high-ranking hit from the first BLAST is from the same species as that from which the query sequence is derived; a BLAST back then ideally results in the query sequence being amongst the highest hits. An orthologue is identified if a high-ranking hit in the first BLAST is not from the same species as that from which the query sequence is derived, and preferably the BLAST back results in the query sequence being among the highest hits.
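The comparison of the first and second BLAST results can be sketched in Python as follows. The function name, the input layout (ranked lists of identifiers) and the `top_n` cut-off for what counts as a "high-ranking" hit are hypothetical choices made for this illustration; the sketch also treats the back-BLAST confirmation as a requirement, which is stricter than the "ideally" and "preferably" wording above.

```python
def classify_reciprocal_hits(query_id, query_species, first_hits, back_hits, top_n=3):
    """Label high-ranking first-BLAST hits as paralogue, orthologue or unconfirmed.

    first_hits: list of (hit_id, species) ranked best-first (first BLAST).
    back_hits:  dict mapping hit_id to a best-first list of subject ids obtained
                by BLASTing that hit back against the query organism (second BLAST).
    top_n:      assumed cut-off defining a "high-ranking" hit in either search.
    """
    labels = {}
    for hit_id, species in first_hits[:top_n]:
        if hit_id == query_id:
            continue  # the query finding itself is not informative
        # Reciprocal if the query is among the top hits of the back search
        reciprocal = query_id in back_hits.get(hit_id, [])[:top_n]
        if not reciprocal:
            labels[hit_id] = "unconfirmed"
        elif species == query_species:
            labels[hit_id] = "paralogue"
        else:
            labels[hit_id] = "orthologue"
    return labels


# Hypothetical identifiers, for illustration only
first = [("query", "Arabidopsis thaliana"),
         ("hitA", "Arabidopsis thaliana"),
         ("hitB", "Oryza sativa"),
         ("hitC", "Zea mays")]
back = {"hitA": ["hitA", "query"], "hitB": ["query"], "hitC": ["other"]}
print(classify_reciprocal_hits("query", "Arabidopsis thaliana", first, back, top_n=4))
```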

[0132] High-ranking hits are those having a low E-value. The lower the E-value, the more significant the score (or in other words the lower the chance that the hit was found by chance). Computation of the E-value is well known in the art. In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In the case of large families, ClustalW may be used, followed by a neighbour joining tree, to help visualize clustering of related genes and to identify orthologues and paralogues.
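As a brief illustration of why a lower E-value marks a more significant hit, the Karlin-Altschul relation E = K·m·n·exp(−λ·S) can be written as a short Python function. The parameters λ and K depend on the scoring system in use; the values in the usage line are arbitrary placeholders, not BLAST defaults, and the function names are hypothetical.

```python
import math


def expect_value(raw_score, query_len, db_len, lam, k):
    """Karlin-Altschul expectation: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def rank_hits(hits):
    """Sort (hit_id, e_value) pairs so the most significant hit comes first."""
    return sorted(hits, key=lambda h: h[1])


# Placeholder parameters: a higher raw score gives a smaller E-value
print(expect_value(40, 500, 1_000_000, lam=0.3, k=0.1))
print(expect_value(80, 500, 1_000_000, lam=0.3, k=0.1))
```

For fixed search-space size, the E-value falls exponentially as the alignment score rises, which is why ranking by ascending E-value places the best-supported hits first.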

[0133] Nucleic acid variants encoding homologues and derivatives of SEQ ID NO: 2 may also be useful in practising the methods of the invention, the terms "homologue" and "derivative" being as defined herein. Also useful in the methods of the invention are nucleic acids encoding homologues and derivatives of orthologues or paralogues of SEQ ID NO: 2. Homologues and derivatives useful in the methods of the present invention have substantially the same biological and functional activity as the unmodified protein from which they are derived.

[0134] Further nucleic acid variants useful in practising the methods of the invention include portions of nucleic acids encoding NITR polypeptides, nucleic acids hybridising to nucleic acids encoding NITR polypeptides, splice variants of nucleic acids encoding NITR polypeptides, allelic variants of nucleic acids encoding NITR polypeptides and variants of nucleic acids encoding NITR polypeptides obtained by gene shuffling. The terms hybridising sequence, splice variant, allelic variant and gene shuffling are as described herein.

[0135] Nucleic acids encoding NITR polypeptides need not be full-length nucleic acids, since performance of the methods of the invention does not rely on the use of full-length nucleic acid sequences. According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a portion of SEQ ID NO: 1, or a portion of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 2.

[0136] A portion of a nucleic acid may be prepared, for example, by making one or more deletions to the nucleic acid. The portions may be used in isolated form or they may be fused to other coding (or non-coding) sequences in order to, for example, produce a protein that combines several activities. When fused to other coding sequences, the resultant polypeptide produced upon translation may be bigger than that predicted for the protein portion.

[0137] Portions useful in the methods of the invention encode a NITR polypeptide as defined herein, and have substantially the same biological activity as the amino acid sequence given in SEQ ID NO: 2. Preferably, the portion is a portion of the nucleic acid given in SEQ ID NO: 1, or is a portion of a nucleic acid encoding an orthologue or paralogue of the amino acid sequence given in SEQ ID NO: 2. Preferably the portion is at least 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1450, 1500, 1550, 1600, 1650, 1700 or 1750 consecutive nucleotides in length, the consecutive nucleotides being of SEQ ID NO: 1, or of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 2. Most preferably the portion is a portion of the nucleic acid of SEQ ID NO: 1.

[0138] Another nucleic acid variant useful in the methods of the invention is a nucleic acid capable of hybridising, under reduced stringency conditions, preferably under stringent conditions, with a nucleic acid encoding a NITR polypeptide as defined herein, or with a portion as defined herein.

[0139] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a nucleic acid capable of hybridising to SEQ ID NO: 1, or comprising introducing and expressing in a plant a nucleic acid capable of hybridising to a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 2.

[0140] Hybridising sequences useful in the methods of the invention encode a NITR polypeptide as defined herein, having substantially the same biological activity as the amino acid sequences given in SEQ ID NO: 2. Preferably, the hybridising sequence is capable of hybridising to SEQ ID NO: 1, or to a portion of any of these sequences, a portion being as defined above, or the hybridising sequence is capable of hybridising to a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 2.

[0141] Another nucleic acid variant useful in the methods of the invention is a splice variant encoding a NITR polypeptide as defined hereinabove, a splice variant being as defined herein.

[0142] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a splice variant of SEQ ID NO: 1, or a splice variant of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 2.

[0143] Another nucleic acid variant useful in performing the methods of the invention is an allelic variant of a nucleic acid encoding a NITR polypeptide as defined hereinabove, an allelic variant being as defined herein.

[0144] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant an allelic variant of SEQ ID NO: 1, or comprising introducing and expressing in a plant an allelic variant of a nucleic acid encoding an orthologue, paralogue or homologue of the amino acid sequences represented by SEQ ID NO: 2.

[0145] The allelic variants useful in the methods of the present invention have substantially the same biological activity as the NITR polypeptide of SEQ ID NO: 2. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Gene shuffling or directed evolution may also be used to generate variants of nucleic acids encoding NITR polypeptides as defined above; the term "gene shuffling" being as defined herein.

[0146] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a variant of SEQ ID NO: 1, or comprising introducing and expressing in a plant a variant of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 2, which variant nucleic acid is obtained by gene shuffling.

[0147] Furthermore, nucleic acid variants may also be obtained by site-directed mutagenesis. Several methods are available to achieve site-directed mutagenesis, the most common being PCR based methods (Current Protocols in Molecular Biology. Wiley Eds.).

[0148] Nucleic acids encoding NITR polypeptides may be derived from any natural or artificial source. The nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the NITR polypeptide-encoding nucleic acid is from a plant. In the case of SEQ ID NO: 1, the NITR polypeptide-encoding nucleic acid is preferably from a dicotyledonous plant, more preferably from the family Brassicaceae, most preferably the nucleic acid is from Arabidopsis thaliana.

[0149] Performance of the methods of the invention gives plants having enhanced yield-related traits. In particular performance of the methods of the invention gives plants having increased early vigour and increased yield, especially increased biomass and increased seed yield relative to control plants. The terms "yield" and "seed yield" are described in more detail in the "definitions" section herein.

[0150] Reference herein to enhanced yield-related traits is taken to mean an increase in early vigour and/or in biomass (weight) of one or more parts of a plant, which may include aboveground (harvestable) parts and/or (harvestable) parts below ground. In particular, such harvestable parts are biomass and/or seeds, and performance of the methods of the invention results in plants having increased early vigour, biomass and/or seed yield relative to the early vigour, biomass or seed yield of control plants.

[0151] Taking corn as an example, a yield increase may be manifested as one or more of the following: increase in the number of plants established per square meter, an increase in the number of ears per plant, an increase in the number of rows, number of kernels per row, kernel weight, thousand kernel weight, ear length/diameter, increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), among others. Taking rice as an example, a yield increase may manifest itself as an increase in one or more of the following: number of plants per square meter, number of panicles per plant, number of spikelets per panicle, number of flowers (florets) per panicle (which is expressed as a ratio of the number of filled seeds over the number of primary panicles), increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), increase in thousand kernel weight, among others.
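The seed filling rate defined above (filled seeds divided by total seeds, multiplied by 100) can be computed as in the following Python sketch. The thousand kernel weight helper assumes that the value is extrapolated from the number and total weight of filled seeds; that definition and both function names are assumptions made for illustration.

```python
def seed_filling_rate(filled_seeds, total_seeds):
    """Seed filling rate in percent: filled seeds / total seeds * 100."""
    if total_seeds <= 0:
        raise ValueError("total_seeds must be positive")
    return 100.0 * filled_seeds / total_seeds


def thousand_kernel_weight(filled_seed_weight_g, filled_seeds):
    """Assumed definition: weight of filled seeds scaled to 1000 seeds, in grams."""
    if filled_seeds <= 0:
        raise ValueError("filled_seeds must be positive")
    return 1000.0 * filled_seed_weight_g / filled_seeds


print(seed_filling_rate(80, 100))        # 80 filled seeds out of 100
print(thousand_kernel_weight(2.5, 100))  # 2.5 g for 100 filled seeds
```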

[0152] The present invention provides a method for increasing yield, especially biomass and/or seed yield of plants, relative to control plants, which method comprises modulating expression, preferably increasing expression, in a plant of a nucleic acid encoding a NITR polypeptide as defined herein.

[0153] Since the transgenic plants according to the present invention have increased yield, it is likely that these plants exhibit an increased growth rate (during at least part of their life cycle), relative to the growth rate of control plants at a corresponding stage in their life cycle.

[0154] The increased growth rate may be specific to one or more parts of a plant (including seeds), or may be throughout substantially the whole plant. Plants having an increased growth rate may have a shorter life cycle. The life cycle of a plant may be taken to mean the time needed to grow from a dry mature seed up to the stage where the plant has produced dry mature seeds, similar to the starting material. This life cycle may be influenced by factors such as early vigour, growth rate, greenness index, flowering time and speed of seed maturation. The increase in growth rate may take place at one or more stages in the life cycle of a plant or during substantially the whole plant life cycle. Increased growth rate during the early stages in the life cycle of a plant may reflect enhanced vigour. The increase in growth rate may alter the harvest cycle of a plant allowing plants to be sown later and/or harvested sooner than would otherwise be possible (a similar effect may be obtained with earlier flowering time). If the growth rate is sufficiently increased, it may allow for the further sowing of seeds of the same plant species (for example sowing and harvesting of rice plants followed by sowing and harvesting of further rice plants all within one conventional growing period). Similarly, if the growth rate is sufficiently increased, it may allow for the further sowing of seeds of different plants species (for example the sowing and harvesting of corn plants followed by, for example, the sowing and optional harvesting of soybean, potato or any other suitable plant). Harvesting additional times from the same rootstock in the case of some crop plants may also be possible. Altering the harvest cycle of a plant may lead to an increase in annual biomass production per square meter (due to an increase in the number of times (say in a year) that any particular plant may be grown and harvested). 
An increase in growth rate may also allow for the cultivation of transgenic plants in a wider geographical area than their wild-type counterparts, since the territorial limitations for growing a crop are often determined by adverse environmental conditions either at the time of planting (early season) or at the time of harvesting (late season). Such adverse conditions may be avoided if the harvest cycle is shortened. The growth rate may be determined by deriving various parameters from growth curves, such parameters may be: T-Mid (the time taken for plants to reach 50% of their maximal size) and T-90 (time taken for plants to reach 90% of their maximal size), amongst others.
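The growth-curve parameters T-Mid and T-90 can be derived from time-series measurements as in the Python sketch below. It assumes that "maximal size" is the largest observed measurement and that linear interpolation between time points is acceptable; the function name and the sample data are hypothetical.

```python
def time_to_fraction(times, sizes, fraction):
    """First time at which the curve reaches fraction * maximal observed size.

    Uses linear interpolation between the two bracketing measurements.
    """
    target = fraction * max(sizes)
    for k, size in enumerate(sizes):
        if size >= target:
            if k == 0:
                return float(times[0])
            t0, t1 = times[k - 1], times[k]
            s0, s1 = sizes[k - 1], sizes[k]
            # s1 > s0 here because the previous point was still below target
            return t0 + (target - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("curve never reaches the requested fraction")


# Hypothetical measurements: days after sowing and plant size (arbitrary units)
days = [0, 10, 20, 30, 40]
size = [0, 20, 60, 90, 100]
t_mid = time_to_fraction(days, size, 0.5)  # time to 50% of maximal size
t_90 = time_to_fraction(days, size, 0.9)   # time to 90% of maximal size
print(t_mid, t_90)
```

A plant line with a higher growth rate would show lower T-Mid and T-90 values than its control when both reach a similar maximal size.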

[0155] According to a preferred feature of the present invention, performance of the methods of the invention gives plants having an increased growth rate relative to control plants. Therefore, according to the present invention, there is provided a method for increasing the growth rate of plants, which method comprises modulating expression, preferably increasing expression, in a plant of a nucleic acid encoding a NITR polypeptide as defined herein. In a particular embodiment, performance of the methods of the present invention gives plants with increased early vigour.

[0156] An increase in yield and/or growth rate occurs whether the plant is under non-stress conditions or whether the plant is exposed to various stresses compared to control plants. Plants typically respond to exposure to stress by growing more slowly. In conditions of severe stress, the plant may even stop growing altogether. Mild stress on the other hand is defined herein as being any stress to which a plant is exposed which does not result in the plant ceasing to grow altogether without the capacity to resume growth. Mild stress in the sense of the invention leads to a reduction in the growth of the stressed plants of less than 40%, 35% or 30%, preferably less than 25%, 20% or 15%, more preferably less than 14%, 13%, 12%, 11% or 10% or less in comparison to the control plant under non-stress conditions. Due to advances in agricultural practices (irrigation, fertilization, pesticide treatments) severe stresses are not often encountered in cultivated crop plants. As a consequence, the compromised growth induced by mild stress is often an undesirable feature for agriculture. Mild stresses are the everyday biotic and/or abiotic (environmental) stresses to which a plant is exposed. Abiotic stresses may be due to drought or excess water, anaerobic stress, salt stress, chemical toxicity, oxidative stress and hot, cold or freezing temperatures. The abiotic stress may be an osmotic stress caused by a water stress (particularly due to drought), salt stress, oxidative stress or an ionic stress. Biotic stresses are typically those stresses caused by pathogens, such as bacteria, viruses, fungi and insects.
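The growth-reduction thresholds used above to characterise mild stress can be checked with a small Python sketch. The function names are hypothetical, the default threshold of 40% is the broadest value given in the text, and the sketch covers only the percentage criterion, not the requirement that the plant retains the capacity to resume growth.

```python
def growth_reduction_percent(stressed_growth, control_growth):
    """Reduction in growth relative to the non-stressed control, in percent."""
    if control_growth <= 0:
        raise ValueError("control_growth must be positive")
    return 100.0 * (control_growth - stressed_growth) / control_growth


def is_mild_stress(stressed_growth, control_growth, threshold_percent=40.0):
    """True if the growth reduction is below the chosen threshold."""
    return growth_reduction_percent(stressed_growth, control_growth) < threshold_percent


# Hypothetical growth values: 8.5 units under stress against 10.0 in the control
print(growth_reduction_percent(8.5, 10.0), is_mild_stress(8.5, 10.0))
```

Passing a stricter `threshold_percent` (for example 25 or 10) corresponds to the more preferred ranges listed in the paragraph.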

[0157] In particular, the methods of the present invention may be performed under non-stress conditions or under conditions of mild drought to give plants having increased yield relative to control plants. As reported in Wang et al. (Planta (2003) 218: 1-14), abiotic stress leads to a series of morphological, physiological, biochemical and molecular changes that adversely affect plant growth and productivity. Drought, salinity, extreme temperatures and oxidative stress are known to be interconnected and may induce growth and cellular damage through similar mechanisms. Rabbani et al. (Plant Physiol (2003) 133: 1755-1767) describe a particularly high degree of "cross talk" between drought stress and high-salinity stress. For example, drought and/or salinisation are manifested primarily as osmotic stress, resulting in the disruption of homeostasis and ion distribution in the cell. Oxidative stress, which frequently accompanies high or low temperature, salinity or drought stress, may cause denaturing of functional and structural proteins. As a consequence, these diverse environmental stresses often activate similar cell signalling pathways and cellular responses, such as the production of stress proteins, up-regulation of anti-oxidants, accumulation of compatible solutes and growth arrest. "Non-stress" conditions as used herein are those environmental conditions that allow optimal growth of plants. Persons skilled in the art are aware of normal soil conditions and climatic conditions for a given location.

[0158] Performance of the methods of the invention gives plants grown under non-stress conditions or under mild drought conditions increased yield and/or increased early vigour, relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield and/or early vigour in plants grown under non-stress conditions or under mild drought conditions, which method comprises increasing expression in a plant of a nucleic acid encoding a NITR polypeptide.

[0159] Performance of the methods of the invention gives plants grown under conditions of nutrient deficiency, particularly under conditions of nitrogen deficiency, increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of nutrient deficiency, which method comprises increasing expression in a plant of a nucleic acid encoding a NITR polypeptide. Nutrient deficiency may result from a lack of nutrients such as nitrogen, phosphates and other phosphorus-containing compounds, potassium, calcium, cadmium, magnesium, manganese, iron and boron, amongst others.

[0160] Performance of the methods of the invention gives plants grown under conditions of salt stress increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of salt stress, which method comprises modulating expression in a plant of a nucleic acid encoding a NITR polypeptide. The term salt stress is not restricted to common salt (NaCl), but may be any one or more of: NaCl, KCl, LiCl, MgCl2, CaCl2, amongst others.

[0161] The present invention encompasses plants or parts thereof (including seeds) obtainable by the methods according to the present invention. The plants or parts thereof comprise a nucleic acid transgene encoding a NITR polypeptide as defined above.

[0162] The invention also provides genetic constructs and vectors to facilitate introduction and/or expression in plants of nucleic acids encoding NITR polypeptides. The gene constructs may be inserted into vectors, which may be commercially available, suitable for transforming into plants and suitable for expression of the gene of interest in the transformed cells. The invention also provides use of a gene construct as defined herein in the methods of the invention.

[0163] More specifically, the present invention provides a construct comprising:

[0164] (a) a nucleic acid encoding a NITR polypeptide as defined above;

[0165] (b) one or more control sequences capable of driving expression of the nucleic acid sequence of (a); and optionally

[0166] (c) a transcription termination sequence.

[0167] Preferably, the nucleic acid encoding a NITR polypeptide is as defined above. The terms "control sequence" and "termination sequence" are as defined herein.
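The three-part construct of (a) to (c) can be pictured as a simple record, as in the Python sketch below. The class name, field names and the string labels are hypothetical and serve only to show which elements are required and which are optional; the sketch does not model any real cloning software or vector.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Construct:
    """Illustrative record of the construct elements (a), (b) and optional (c)."""
    coding_sequence: str              # (a) nucleic acid encoding a NITR polypeptide
    control_sequences: List[str]      # (b) one or more control sequences
    terminator: Optional[str] = None  # (c) optional transcription termination sequence

    def __post_init__(self):
        # Elements (a) and (b) are mandatory; (c) may be omitted
        if not self.coding_sequence:
            raise ValueError("element (a) is required")
        if not self.control_sequences:
            raise ValueError("element (b) needs at least one control sequence")


# Hypothetical labels standing in for real sequences
construct = Construct(
    coding_sequence="NITR coding sequence (SEQ ID NO: 1)",
    control_sequences=["promoter"],
)
print(construct)
```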

[0168] Plants are transformed with a vector comprising any of the nucleic acids described above. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells containing the sequence of interest. The sequence of interest is operably linked to one or more control sequences (at least to a promoter).

[0169] Advantageously, any type of promoter, whether natural or synthetic, may be used to drive expression of the nucleic acid sequence. A constitutive promoter is particularly useful in the methods of the invention. Preferably the constitutive promoter is also a ubiquitous promoter. See the "Definitions" section herein for definitions of the various promoter types.

[0170] It should be clear that the applicability of the present invention is not restricted to the NITR polypeptide-encoding nucleic acid represented by SEQ ID NO: 1, nor is the applicability of the invention restricted to expression of a NITR polypeptide-encoding nucleic acid when driven by a constitutive promoter.

[0171] The constitutive promoter is preferably a medium strength promoter of plant origin, preferably a GOS2 promoter, more preferably a GOS2 promoter from rice. Further preferably the constitutive promoter is represented by a nucleic acid sequence substantially similar to SEQ ID NO: 3, most preferably the constitutive promoter is as represented by SEQ ID NO: 3. See Table 2 in the "Definitions" section herein for further examples of constitutive promoters.

[0172] Optionally, one or more terminator sequences may be used in the construct introduced into a plant. Additional regulatory elements may include transcriptional as well as translational enhancers. Those skilled in the art will be aware of terminator and enhancer sequences that may be suitable for use in performing the invention. An intron sequence may also be added to the 5' untranslated region (UTR) or in the coding sequence to increase the amount of the mature message that accumulates in the cytosol, as described in the definitions section. Other control sequences (besides promoter, enhancer, silencer, intron sequences, 3'UTR and/or 5'UTR regions) may be protein and/or RNA stabilizing elements. Such sequences would be known or may readily be obtained by a person skilled in the art.

[0173] The genetic constructs of the invention may further include an origin of replication sequence that is required for maintenance and/or replication in a specific cell type. One example is when a genetic construct is required to be maintained in a bacterial cell as an episomal genetic element (e.g. plasmid or cosmid molecule). Preferred origins of replication include, but are not limited to, the f1-ori and colE1.

[0174] For the detection of the successful transfer of the nucleic acid sequences as used in the methods of the invention and/or selection of transgenic plants comprising these nucleic acids, it is advantageous to use marker genes (or reporter genes). Therefore, the genetic construct may optionally comprise a selectable marker gene. Selectable markers are described in more detail in the "definitions" section herein. The marker genes may be removed or excised from the transgenic cell once they are no longer needed. Techniques for marker removal are known in the art, useful techniques are described above in the definitions section.

[0175] The invention also provides a method for the production of transgenic plants having enhanced yield-related traits relative to control plants, comprising introduction and expression in a plant of any nucleic acid encoding a NITR polypeptide as defined hereinabove.

[0176] More specifically, the present invention provides a method for the production of transgenic plants having enhanced yield-related traits, particularly increased early vigour and/or increased yield, which method comprises:

[0177] (i) introducing and expressing in a plant or plant cell a NITR polypeptide-encoding nucleic acid; and

[0178] (ii) cultivating the plant cell under conditions promoting plant growth and development.

[0179] The nucleic acid of (i) may be any of the nucleic acids capable of encoding a NITR polypeptide as defined herein.

[0180] The nucleic acid may be introduced directly into a plant cell or into the plant itself (including introduction into a tissue, organ or any other part of a plant). According to a preferred feature of the present invention, the nucleic acid is preferably introduced into a plant by transformation. The term "transformation" is described in more detail in the "definitions" section herein.

[0181] The genetically modified plant cells can be regenerated via all methods with which the skilled worker is familiar. Suitable methods can be found in the abovementioned publications by S. D. Kung and R. Wu, Potrykus or Hofgen and Willmitzer.

[0182] Generally after transformation, plant cells or cell groupings are selected for the presence of one or more markers which are encoded by plant-expressible genes co-transferred with the gene of interest, following which the transformed material is regenerated into a whole plant. To select transformed plants, the plant material obtained in the transformation is, as a rule, subjected to selective conditions so that transformed plants can be distinguished from untransformed plants. For example, the seeds obtained in the above-described manner can be planted and, after an initial growing period, subjected to a suitable selection by spraying. A further possibility consists in growing the seeds, if appropriate after sterilization, on agar plates using a suitable selection agent so that only the transformed seeds can grow into plants. Alternatively, the transformed plants are screened for the presence of a selectable marker such as the ones described above.

[0183] Following DNA transfer and regeneration, putatively transformed plants may also be evaluated, for instance using Southern analysis, for the presence of the gene of interest, copy number and/or genomic organisation. Alternatively or additionally, expression levels of the newly introduced DNA may be monitored using Northern and/or Western analysis, both techniques being well known to persons having ordinary skill in the art.

[0184] The generated transformed plants may be propagated by a variety of means, such as by clonal propagation or classical breeding techniques. For example, a first generation (or T1) transformed plant may be selfed and homozygous second-generation (or T2) transformants selected, and the T2 plants may then further be propagated through classical breeding techniques. The generated transformed organisms may take a variety of forms. For example, they may be chimeras of transformed cells and non-transformed cells; clonal transformants (e.g., all cells transformed to contain the expression cassette); grafts of transformed and untransformed tissues (e.g., in plants, a transformed rootstock grafted to an untransformed scion).

[0185] The present invention clearly extends to any plant cell or plant produced by any of the methods described herein, and to all plant parts and propagules thereof. The present invention extends further to encompass the progeny of a primary transformed or transfected cell, tissue, organ or whole plant that has been produced by any of the aforementioned methods, the only requirement being that progeny exhibit the same genotypic and/or phenotypic characteristic(s) as those produced by the parent in the methods according to the invention.

[0186] The invention also includes host cells containing an isolated nucleic acid encoding a NITR polypeptide as defined hereinabove. Preferred host cells according to the invention are plant cells. Host plants for the nucleic acids, expression cassette, construct or vector used in the method according to the invention are, in principle, advantageously all plants that are capable of synthesizing the polypeptides used in the inventive method.

[0187] The methods of the invention are advantageously applicable to any plant. Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs. According to a preferred embodiment of the present invention, the plant is a crop plant. Examples of crop plants include soybean, sunflower, canola, alfalfa, rapeseed, cotton, tomato, potato and tobacco. Further preferably, the plant is a monocotyledonous plant. Examples of monocotyledonous plants include sugarcane. More preferably the plant is a cereal. Examples of cereals include rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, secale, einkorn, teff, milo and oats.

[0188] The invention also extends to harvestable parts of a plant such as, but not limited to seeds, leaves, fruits, flowers, stems, roots, rhizomes, tubers and bulbs, which harvestable parts comprise a recombinant nucleic acid encoding a NITR polypeptide. The invention furthermore relates to products derived, preferably directly derived, from a harvestable part of such a plant, such as dry pellets or powders, oil, fat and fatty acids, starch or proteins.

[0189] According to a preferred feature of the invention, the modulated expression is increased expression. Methods for increasing expression of nucleic acids or genes, or gene products, are well documented in the art and examples are provided in the definitions section.

[0190] As mentioned above, a preferred method for modulating (preferably, increasing) expression of a nucleic acid encoding a NITR polypeptide is by introducing and expressing in a plant a nucleic acid encoding a NITR polypeptide; however, the effects of performing the method, i.e. enhancing yield-related traits, may also be achieved using other well known techniques, including but not limited to T-DNA activation tagging, TILLING and homologous recombination. A description of these techniques is provided in the definitions section.

[0191] The present invention also encompasses use of nucleic acids encoding NITR polypeptides as described herein and use of these NITR polypeptides in enhancing any of the aforementioned yield-related traits in plants.

[0192] Nucleic acids encoding the NITR polypeptides described herein, or the NITR polypeptides themselves, may find use in breeding programmes in which a DNA marker is identified which may be genetically linked to a NITR polypeptide-encoding gene. The nucleic acids/genes, or the NITR polypeptides themselves, may be used to define a molecular marker. This DNA or protein marker may then be used in breeding programmes to select plants having enhanced yield-related traits as defined hereinabove in the methods of the invention.

[0193] Allelic variants of a NITR polypeptide-encoding nucleic acid/gene may also find use in marker-assisted breeding programmes. Such breeding programmes sometimes require introduction of allelic variation by mutagenic treatment of the plants, using for example EMS mutagenesis; alternatively, the programme may start with a collection of allelic variants of so called "natural" origin caused unintentionally. Identification of allelic variants then takes place, for example, by PCR. This is followed by a step for selection of superior allelic variants of the sequence in question and which give increased yield. Selection is typically carried out by monitoring growth performance of plants containing different allelic variants of the sequence in question. Growth performance may be monitored in a greenhouse or in the field. Further optional steps include crossing plants in which the superior allelic variant was identified with another plant. This could be used, for example, to make a combination of interesting phenotypic features.

[0194] Nucleic acids encoding NITR polypeptides may also be used as probes for genetically and physically mapping the genes that they are a part of, and as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes. Such use of NITR polypeptide-encoding nucleic acids requires only a nucleic acid sequence of at least 15 nucleotides in length. The NITR polypeptide-encoding nucleic acids may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Sambrook J, Fritsch E F and Maniatis T (1989) Molecular Cloning, A Laboratory Manual) of restriction-digested plant genomic DNA may be probed with the NITR-encoding nucleic acids. The resulting banding patterns may then be subjected to genetic analyses using computer programs such as MapMaker (Lander et al. (1987) Genomics 1: 174-181) in order to construct a genetic map. In addition, the nucleic acids may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing parent and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the NITR polypeptide-encoding nucleic acid in the genetic map previously obtained using this population (Botstein et al. (1980) Am. J. Hum. Genet. 32:314-331).

[0195] The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky and Tanksley (1986) Plant Mol. Biol. Reporter 4: 37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.

[0196] The nucleic acid probes may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel et al. In: Non-mammalian Genomic Analysis: A Practical Guide, Academic press 1996, pp. 319-346, and references cited therein).

[0197] In another embodiment, the nucleic acid probes may be used in direct fluorescence in situ hybridisation (FISH) mapping (Trask (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favour use of large clones (several kb to several hundred kb; see Laan et al. (1995) Genome Res. 5:13-20), improvements in sensitivity may allow performance of FISH mapping using shorter probes.

[0198] A variety of nucleic acid amplification-based methods for genetic and physical mapping may be carried out using the nucleic acids. Examples include allele-specific amplification (Kazazian (1989) J. Lab. Clin. Med 11:95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov (1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping (Walter et al. (1997) Nat. Genet. 7:22-28) and Happy Mapping (Dear and Cook (1989) Nucleic Acid Res. 17:6795-6807). For these methods, the sequence of a nucleic acid is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art. In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.

[0199] The methods according to the present invention result in plants having enhanced yield-related traits, as described hereinbefore. These traits may also be combined with other economically advantageous traits, such as further yield-enhancing traits, tolerance to other abiotic and biotic stresses, traits modifying various architectural features and/or biochemical and/or physiological features.

II ASNS

[0200] Surprisingly, it has now been found that modulating expression in a plant of a nucleic acid encoding an ASNS polypeptide gives plants having enhanced yield-related traits relative to control plants. According to a first embodiment, the present invention provides a method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding an ASNS polypeptide.

[0201] A preferred method for modulating (preferably, increasing) expression of a nucleic acid encoding an ASNS polypeptide is by introducing and expressing in a plant a nucleic acid encoding an ASNS polypeptide.

[0202] Any reference hereinafter to a "protein useful in the methods of the invention" is taken to mean an ASNS polypeptide as defined herein. Any reference hereinafter to a "nucleic acid useful in the methods of the invention" is taken to mean a nucleic acid capable of encoding such an ASNS polypeptide. The nucleic acid to be introduced into a plant (and therefore useful in performing the methods of the invention) is any nucleic acid encoding the type of protein which will now be described, hereafter also named "ASNS nucleic acid" or "ASNS gene".

[0203] An "ASNS polypeptide" as defined herein refers to the Asparagine synthetase represented by SEQ ID NO: 63 and to homologues (orthologues and paralogues) thereof. SEQ ID NO: 63 comprises, compared to the wild type sequence (Os06g0265000, SEQ ID NO: 67), two point mutations: R382G and S165G (FIG. 6). Arginine at position 382 in SEQ ID NO: 67 is highly conserved among Asparagine synthetases and may be part of a large alpha-helix which delimits the molecular tunnel between the 2 active sites. It is also close to the AMP binding site. Serine at position 165 may be located in a distorted alpha-helix region, on the external side of the glutamine binding site according to the structure derived from E. coli. It is postulated that the S165G mutation will probably have little impact on the structure of this region.

[0204] Therefore, ASNS polypeptides useful in the methods of the present invention preferably have a substitution of the Arginine residue that corresponds to R382 in SEQ ID NO: 67, into an amino acid that distorts the alpha-helix, preferably into a Glycine. Optionally ASNS polypeptides useful in the methods of the present invention additionally have a substitution of the Serine residue that corresponds to S165 in SEQ ID NO: 67, into another amino acid, preferably into a Glycine. Arg residues corresponding to R382 in SEQ ID NO: 67 or Ser residues corresponding to S165 can be identified by aligning the amino acid sequence to the one of SEQ ID NO: 67, see for example the multiple alignment in FIG. 4. Such alignment methods are well known in the art.
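The identification of a residue corresponding to a given reference position may be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned by any suitable tool and are supplied as equal-length gapped strings; the function name and the toy sequences are hypothetical and are not taken from SEQ ID NO: 63 or SEQ ID NO: 67.

```python
def residue_at_reference_position(aligned_ref, aligned_query, ref_pos):
    """Return (query_residue, query_position) aligned to 1-based ref_pos.

    aligned_ref and aligned_query are rows of the same alignment, with '-'
    marking gaps. query_position is None when the reference residue is
    aligned to a gap in the query.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    query_count = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if query_char != "-":
            query_count += 1  # ungapped position in the query
        if ref_char != "-":
            ref_count += 1  # ungapped position in the reference
            if ref_count == ref_pos:
                if query_char == "-":
                    return "-", None
                return query_char, query_count
    raise ValueError("ref_pos exceeds the reference sequence length")


# Toy alignment (made up): reference residue 3 is 'R', aligned to query 'G'.
toy_ref = "MK-RLS"
toy_query = "MKAG-S"
match = residue_at_reference_position(toy_ref, toy_query, 3)
```

In practice the reference row would be SEQ ID NO: 67 and the positions of interest 382 and 165; a returned Glycine at the position corresponding to R382 would indicate the preferred substitution.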

[0205] Preferably, the homologues of SEQ ID NO: 63 have an Asn_synthase domain. Asn_synthase domains (Pfam entry PF00733) are well known in the art and may readily be identified by persons skilled in the art. Besides the Asn_synthase domain, ASNS polypeptides preferably also have a Glutamine amidotransferase, class-II domain (InterPro IPR000583; GATase_2 (HMMPfam entry PF00310), GATase_Type_II (PROSITE entry PS00443)) and/or an Asparagine synthase, glutamine-hydrolyzing domain (asn_synth_AEB: asparagine synthase (glutami (TIGRFAMs entry TIGR01536)).

[0206] Alternatively, the homologue of an ASNS protein has in increasing order of preference at least 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% overall sequence identity to the amino acid sequence represented by SEQ ID NO: 63, provided that the homologous protein comprises the conserved motifs as outlined above and the substitution of the Arg residue that corresponds to R382 in SEQ ID NO: 67. The overall sequence identity is determined using a global alignment algorithm, such as the Needleman Wunsch algorithm in the program GAP (GCG Wisconsin Package, Accelrys), preferably with default parameters. Compared to overall sequence identity, the sequence identity will generally be higher when only conserved domains or motifs are considered.

[0207] The terms "domain" and "motif" are defined in the "definitions" section herein. Specialist databases exist for the identification of domains, for example, SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864; Letunic et al. (2002) Nucleic Acids Res 30, 242-244), InterPro (Mulder et al., (2003) Nucl. Acids. Res. 31, 315-318), Prosite (Bucher and Bairoch (1994), A generalized profile syntax for biomolecular sequences motifs and its function in automatic sequence interpretation. (In) ISMB-94; Proceedings 2nd International Conference on Intelligent Systems for Molecular Biology. Altman R., Brutlag D., Karp P., Lathrop R., Searls D., Eds., pp 53-61, AAAI Press, Menlo Park; Hulo et al., Nucl. Acids. Res. 32:D134-D137, (2004)), or Pfam (Bateman et al., Nucleic Acids Research 30(1): 276-280 (2002)). A set of tools for in silico analysis of protein sequences is available on the ExPASy proteomics server (Swiss Institute of Bioinformatics (Gasteiger et al., ExPASy: the proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res. 31:3784-3788 (2003))). Domains or motifs may also be identified using routine techniques, such as by sequence alignment.

[0208] Methods for the alignment of sequences for comparison are well known in the art; such methods include GAP, BESTFIT, BLAST, FASTA and TFASTA. GAP uses the algorithm of Needleman and Wunsch ((1970) J Mol Biol 48: 443-453) to find the global (i.e. spanning the complete sequences) alignment of two sequences that maximizes the number of matches and minimizes the number of gaps. The BLAST algorithm (Altschul et al. (1990) J Mol Biol 215: 403-10) calculates percent sequence identity and performs a statistical analysis of the similarity between the two sequences. The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information (NCBI). Homologues may readily be identified using, for example, the ClustalW multiple sequence alignment algorithm (version 1.83), with the default pairwise alignment parameters, and a scoring method in percentage. Global percentages of similarity and identity may also be determined using one of the methods available in the MatGAT software package (Campanella et al., BMC Bioinformatics. 2003 Jul. 10; 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences.). Minor manual editing may be performed to optimise alignment between conserved motifs, as would be apparent to a person skilled in the art. Furthermore, instead of using full-length sequences for the identification of homologues, specific domains may also be used. The sequence identity values may be determined over the entire nucleic acid or amino acid sequence or over selected domains or conserved motif(s), using the programs mentioned above using the default parameters. For local alignments, the Smith-Waterman algorithm is particularly useful (Smith T F, Waterman M S (1981) J. Mol. Biol 147(1); 195-7).
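A global alignment of the Needleman and Wunsch type, and a percentage identity derived from it, may be sketched as follows. This is a simplified illustration and not the GAP program: the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for brevity, and identity is expressed here over the full alignment length including gap columns, which is only one of several conventions in use.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the alignment.
    i, j = n, m
    row_a, row_b = [], []
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return "".join(reversed(row_a)), "".join(reversed(row_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("alignment rows must be non-empty and of equal length")
    identical = sum(1 for x, y in zip(aligned_a, aligned_b)
                    if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)


row_1, row_2 = needleman_wunsch("ACGT", "ACT")
identity = percent_identity(row_1, row_2)
```

Production tools generally use substitution matrices and separate gap opening and extension penalties, so values obtained with this sketch are not directly comparable to those reported by GAP or MatGAT.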

[0209] Furthermore, ASNS polypeptides (at least in their native form), as far as SEQ ID NO: 2 and its homologues are concerned, typically have asparagine synthetase activity (Patterson and Orr, J. Biol. Chem. 243, 376-380, 1968; Enzyme Catalogue 6.3.5.4), with the reaction scheme:

ATP + L-aspartate + L-glutamine + H2O = AMP + diphosphate + L-asparagine + L-glutamate.

Tools and techniques for measuring asparagine synthetase activity are well known in the art.

[0210] The present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 62, encoding the polypeptide sequence of SEQ ID NO: 63. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any ASNS-encoding nucleic acid or ASNS polypeptide as defined herein.

[0211] Examples of nucleic acids encoding ASNS polypeptides may be found in databases known in the art, and some of them are listed in FIG. 5. Such nucleic acids are useful in performing the methods of the invention. Orthologues and paralogues, the terms "orthologues" and "paralogues" being as defined herein, may readily be identified by performing a so-called reciprocal BLAST search. Typically, this involves a first BLAST in which a query sequence (for example SEQ ID NO: 63) is BLASTed against any sequence database, such as the publicly available NCBI database. BLASTN or TBLASTX (using standard default values) are generally used when starting from a nucleotide sequence, and BLASTP or TBLASTN (using standard default values) when starting from a protein sequence. The BLAST results may optionally be filtered. The full-length sequences of either the filtered results or non-filtered results are then BLASTed back (second BLAST) against sequences from the organism from which the query sequence is derived (where the query sequence is SEQ ID NO: 62 or SEQ ID NO: 63, the second BLAST would therefore be against Oryza sativa sequences). The results of the first and second BLASTs are then compared. A paralogue is identified if a high-ranking hit from the first BLAST is from the same species as that from which the query sequence is derived; a BLAST back then ideally results in the query sequence being amongst the highest hits. An orthologue is identified if a high-ranking hit in the first BLAST is not from the same species as that from which the query sequence is derived, and preferably results upon BLAST back in the query sequence being among the highest hits. Examples of orthologues and paralogues of SEQ ID NO: 63 or SEQ ID NO: 67 are listed in FIG. 5.
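The comparison of the first and second BLAST results may be sketched as follows. The sketch does not run BLAST; it assumes the hits have already been parsed into simple records. The record keys, the identifiers, and the `top_n` cut-off used to interpret "among the highest hits" are all hypothetical choices made for illustration.

```python
def classify_hit(hit, query_id, query_species, back_blast_hits, top_n=3):
    """Classify one high-ranking hit from the first BLAST.

    hit: dict with hypothetical keys "id" and "species".
    back_blast_hits: dict mapping a hit id to the ranked list of sequence ids
        obtained when that hit is BLASTed back against the query organism.
    top_n: illustrative cut-off for "among the highest hits" (an assumption).
    """
    if hit["id"] == query_id:
        return "self"
    ranked = back_blast_hits.get(hit["id"], [])
    if query_id not in ranked[:top_n]:
        return "unconfirmed"  # BLAST back does not recover the query
    if hit["species"] == query_species:
        return "paralogue"
    return "orthologue"


# Made-up identifiers purely for illustration.
first_blast = [
    {"id": "query_seq", "species": "Oryza sativa"},
    {"id": "rice_hit_A", "species": "Oryza sativa"},
    {"id": "maize_hit_B", "species": "Zea mays"},
    {"id": "wheat_hit_C", "species": "Triticum aestivum"},
]
back_blast = {
    "rice_hit_A": ["rice_hit_A", "query_seq"],
    "maize_hit_B": ["query_seq", "rice_hit_A"],
    "wheat_hit_C": ["rice_other_1", "rice_other_2", "rice_other_3", "query_seq"],
}
labels = {
    h["id"]: classify_hit(h, "query_seq", "Oryza sativa", back_blast)
    for h in first_blast
}
```

The hit `wheat_hit_C` is left unconfirmed in this toy data because the query only ranks fourth in its BLAST back; a different cut-off would change that outcome.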

[0212] High-ranking hits are those having a low E-value. The lower the E-value, the more significant the score (or in other words the lower the chance that the hit was found by chance). Computation of the E-value is well known in the art. In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In the case of large families, ClustalW may be used, followed by a neighbour joining tree, to help visualize clustering of related genes and to identify orthologues and paralogues.
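The inverse relationship between score and E-value may be illustrated with the commonly cited relation E = m x n x 2^(-S'), where S' is the bit score, m the query length and n the database length. The sketch below is a simplification under that assumption; it ignores the effective-length corrections that BLAST implementations apply, and the function name is hypothetical.

```python
def expected_value(bit_score, query_length, database_length):
    """Approximate E-value from a bit score: E = m * n * 2 ** (-S').

    Simplified: uses raw lengths rather than effective lengths.
    """
    if query_length <= 0 or database_length <= 0:
        raise ValueError("lengths must be positive")
    return query_length * database_length * 2.0 ** (-bit_score)


# Each additional bit of score halves the expected number of chance hits.
e_low_score = expected_value(1, 10, 10)
e_high_score = expected_value(2, 10, 10)
```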

[0213] Nucleic acid variants encoding homologues and derivatives of SEQ ID NO: 63 may also be useful in practising the methods of the invention, the terms "homologue" and "derivative" being as defined herein. Also useful in the methods of the invention are nucleic acids encoding homologues and derivatives of orthologues or paralogues of SEQ ID NO: 63. Homologues and derivatives useful in the methods of the present invention have substantially the same biological and functional activity as the unmodified protein from which they are derived.

[0214] Further nucleic acid variants useful in practising the methods of the invention include portions of nucleic acids encoding ASNS polypeptides, nucleic acids hybridising to nucleic acids encoding ASNS polypeptides, splice variants of nucleic acids encoding ASNS polypeptides, allelic variants of nucleic acids encoding ASNS polypeptides and variants of nucleic acids encoding ASNS polypeptides obtained by gene shuffling. The terms hybridising sequence, splice variant, allelic variant and gene shuffling are as described herein.

[0215] Nucleic acids encoding ASNS polypeptides need not be full-length nucleic acids, since performance of the methods of the invention does not rely on the use of full-length nucleic acid sequences. According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a portion of SEQ ID NO: 62, or a portion of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 63.

[0216] A portion of a nucleic acid may be prepared, for example, by making one or more deletions to the nucleic acid. The portions may be used in isolated form or they may be fused to other coding (or non-coding) sequences in order to, for example, produce a protein that combines several activities. When fused to other coding sequences, the resultant polypeptide produced upon translation may be bigger than that predicted for the protein portion.

[0217] Portions useful in the methods of the invention encode an ASNS polypeptide as defined herein, and have substantially the same biological activity as the amino acid sequence given in SEQ ID NO: 63. Preferably, the portion is a portion of the nucleic acid given in SEQ ID NO: 62, or is a portion of a nucleic acid encoding an orthologue or paralogue of the amino acid sequence given in SEQ ID NO: 63. Preferably the portion is at least 800, 900, 1000, 1100, 1200, 1300, 1400, 1450, 1500, 1550, 1600, 1650, 1700, 1750 consecutive nucleotides in length, the consecutive nucleotides being of SEQ ID NO: 62, or of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 63. Most preferably the portion is a portion of the nucleic acid of SEQ ID NO: 62.

[0218] Another nucleic acid variant useful in the methods of the invention is a nucleic acid capable of hybridising, under reduced stringency conditions, preferably under stringent conditions, with a nucleic acid encoding an ASNS polypeptide as defined herein, or with a portion as defined herein.

[0219] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a nucleic acid capable of hybridising to SEQ ID NO: 62, or comprising introducing and expressing in a plant a nucleic acid capable of hybridising to a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 63.

[0220] Hybridising sequences useful in the methods of the invention encode an ASNS polypeptide as defined herein, having substantially the same biological activity as the amino acid sequences given in SEQ ID NO: 63. Preferably, the hybridising sequence is capable of hybridising to SEQ ID NO: 62, or to a portion of any of these sequences, a portion being as defined above, or the hybridising sequence is capable of hybridising to a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 63.

[0221] Another nucleic acid variant useful in the methods of the invention is a splice variant encoding an ASNS polypeptide as defined hereinabove, a splice variant being as defined herein.

[0222] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a splice variant of SEQ ID NO: 62, or a splice variant of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 63.

[0223] Another nucleic acid variant useful in performing the methods of the invention is an allelic variant of a nucleic acid encoding an ASNS polypeptide as defined hereinabove, an allelic variant being as defined herein.

[0224] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant an allelic variant of SEQ ID NO: 62, or comprising introducing and expressing in a plant an allelic variant of a nucleic acid encoding an orthologue, paralogue or homologue of the amino acid sequences represented by SEQ ID NO: 63.

[0225] The allelic variants useful in the methods of the present invention have substantially the same biological activity as the ASNS polypeptide of SEQ ID NO: 63. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Gene shuffling or directed evolution may also be used to generate variants of nucleic acids encoding ASNS polypeptides as defined above; the term "gene shuffling" being as defined herein.

[0226] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a variant of SEQ ID NO: 62, or comprising introducing and expressing in a plant a variant of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 63, which variant nucleic acid is obtained by gene shuffling.

[0227] Furthermore, nucleic acid variants may also be obtained by site-directed mutagenesis. Several methods are available to achieve site-directed mutagenesis, the most common being PCR based methods (Current Protocols in Molecular Biology. Wiley Eds.).

[0228] Nucleic acids encoding ASNS polypeptides may be derived from any natural or artificial source. The nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the ASNS polypeptide-encoding nucleic acid is from a plant. In the case of SEQ ID NO: 62, the ASNS polypeptide encoding nucleic acid is preferably from a monocotyledonous plant, more preferably from the family Poaceae, most preferably the nucleic acid is from Oryza sativa.

[0229] Performance of the methods of the invention gives plants having enhanced yield-related traits. In particular performance of the methods of the invention gives plants having increased early vigour and increased yield, especially increased biomass and increased seed yield relative to control plants. The terms "yield" and "seed yield" are described in more detail in the "definitions" section herein.

[0230] Reference herein to enhanced yield-related traits is taken to mean an increase in early vigour and/or in biomass (weight) of one or more parts of a plant, which may include aboveground (harvestable) parts and/or (harvestable) parts below ground. In particular, such harvestable parts are biomass and/or seeds, and performance of the methods of the invention results in plants having increased early vigour, biomass and/or seed yield relative to the early vigour, biomass or seed yield of control plants.

[0231] Taking corn as an example, a yield increase may be manifested as one or more of the following: increase in the number of plants established per square meter, an increase in the number of ears per plant, an increase in the number of rows, number of kernels per row, kernel weight, thousand kernel weight, ear length/diameter, increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), among others. Taking rice as an example, a yield increase may manifest itself as an increase in one or more of the following: number of plants per square meter, number of panicles per plant, number of spikelets per panicle, number of flowers (florets) per panicle (which is expressed as a ratio of the number of filled seeds over the number of primary panicles), increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), increase in thousand kernel weight, among others.
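The seed filling rate defined above may be computed as sketched below. The thousand kernel weight calculation is included for completeness and assumes the usual extrapolation from a counted, weighed seed sample, which is not spelled out in the text above; the function names and example figures are hypothetical.

```python
def seed_fill_rate(filled_seeds, total_seeds):
    """Number of filled seeds divided by total seeds, multiplied by 100."""
    if total_seeds <= 0:
        raise ValueError("total_seeds must be positive")
    return 100.0 * filled_seeds / total_seeds


def thousand_kernel_weight(sample_weight_g, seed_count):
    """Extrapolate the weight of 1000 seeds from a counted, weighed sample.

    Assumption: TKW = 1000 * sample weight / number of seeds in the sample.
    """
    if seed_count <= 0:
        raise ValueError("seed_count must be positive")
    return 1000.0 * sample_weight_g / seed_count


# Made-up example values.
fill_rate = seed_fill_rate(80, 100)
tkw = thousand_kernel_weight(5.0, 200)
```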

[0232] The present invention provides a method for increasing yield, especially biomass and/or seed yield of plants, relative to control plants, which method comprises modulating expression, preferably increasing expression, in a plant of a nucleic acid encoding an ASNS polypeptide as defined herein.

[0233] Since the transgenic plants according to the present invention have increased yield, it is likely that these plants exhibit an increased growth rate (during at least part of their life cycle), relative to the growth rate of control plants at a corresponding stage in their life cycle.

[0234] The increased growth rate may be specific to one or more parts of a plant (including seeds), or may be throughout substantially the whole plant. Plants having an increased growth rate may have a shorter life cycle. The life cycle of a plant may be taken to mean the time needed to grow from a dry mature seed up to the stage where the plant has produced dry mature seeds, similar to the starting material. This life cycle may be influenced by factors such as early vigour, growth rate, greenness index, flowering time and speed of seed maturation. The increase in growth rate may take place at one or more stages in the life cycle of a plant or during substantially the whole plant life cycle. Increased growth rate during the early stages in the life cycle of a plant may reflect enhanced vigour. The increase in growth rate may alter the harvest cycle of a plant allowing plants to be sown later and/or harvested sooner than would otherwise be possible (a similar effect may be obtained with earlier flowering time). If the growth rate is sufficiently increased, it may allow for the further sowing of seeds of the same plant species (for example sowing and harvesting of rice plants followed by sowing and harvesting of further rice plants all within one conventional growing period). Similarly, if the growth rate is sufficiently increased, it may allow for the further sowing of seeds of different plants species (for example the sowing and harvesting of corn plants followed by, for example, the sowing and optional harvesting of soybean, potato or any other suitable plant). Harvesting additional times from the same rootstock in the case of some crop plants may also be possible. Altering the harvest cycle of a plant may lead to an increase in annual biomass production per square meter (due to an increase in the number of times (say in a year) that any particular plant may be grown and harvested). 
An increase in growth rate may also allow for the cultivation of transgenic plants in a wider geographical area than their wild-type counterparts, since the territorial limitations for growing a crop are often determined by adverse environmental conditions either at the time of planting (early season) or at the time of harvesting (late season). Such adverse conditions may be avoided if the harvest cycle is shortened. The growth rate may be determined by deriving various parameters from growth curves, such parameters may be: T-Mid (the time taken for plants to reach 50% of their maximal size) and T-90 (time taken for plants to reach 90% of their maximal size), amongst others.
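The derivation of T-Mid and T-90 from a growth curve may be sketched as follows. The sketch assumes the curve is available as paired time and size measurements and uses linear interpolation between the two measurements that bracket the target size; the interpolation method, the function name and the example data are assumptions made for illustration.

```python
def time_to_percent_of_max(times, sizes, percent):
    """Time at which size first reaches `percent` of the maximal size.

    Uses linear interpolation between consecutive measurements.
    """
    if len(times) != len(sizes) or not times:
        raise ValueError("times and sizes must be non-empty and equal length")
    target = percent * max(sizes) / 100.0
    for k, size in enumerate(sizes):
        if size >= target:
            if k == 0:
                return float(times[0])
            t0, t1 = times[k - 1], times[k]
            s0, s1 = sizes[k - 1], sizes[k]
            # s0 < target <= s1 here, so s1 - s0 is strictly positive.
            return t0 + (target - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("target size is never reached")


# Made-up measurements: days after sowing and plant size in arbitrary units.
days = [0, 10, 20, 30, 40]
plant_size = [0, 20, 60, 90, 100]
t_mid = time_to_percent_of_max(days, plant_size, 50)
t_90 = time_to_percent_of_max(days, plant_size, 90)
```

A lower T-Mid or T-90 for a transgenic line than for its control would be consistent with an increased growth rate in the sense used above.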

[0235] According to a preferred feature of the present invention, performance of the methods of the invention gives plants having an increased growth rate relative to control plants. Therefore, according to the present invention, there is provided a method for increasing the growth rate of plants, which method comprises modulating expression, preferably increasing expression, in a plant of a nucleic acid encoding an ASNS polypeptide as defined herein. In a particular embodiment, performance of the methods of the present invention gives plants with increased early vigour.

[0236] An increase in yield and/or growth rate occurs whether the plant is under non-stress conditions or whether the plant is exposed to various stresses compared to control plants. Plants typically respond to exposure to stress by growing more slowly. In conditions of severe stress, the plant may even stop growing altogether. Mild stress on the other hand is defined herein as being any stress to which a plant is exposed which does not result in the plant ceasing to grow altogether without the capacity to resume growth. Mild stress in the sense of the invention leads to a reduction in the growth of the stressed plants of less than 40%, 35% or 30%, preferably less than 25%, 20% or 15%, more preferably less than 14%, 13%, 12%, 11% or 10% or less in comparison to the control plant under non-stress conditions. Due to advances in agricultural practices (irrigation, fertilization, pesticide treatments) severe stresses are not often encountered in cultivated crop plants. As a consequence, the compromised growth induced by mild stress is often an undesirable feature for agriculture. Mild stresses are the everyday biotic and/or abiotic (environmental) stresses to which a plant is exposed. Abiotic stresses may be due to drought or excess water, anaerobic stress, salt stress, chemical toxicity, oxidative stress and hot, cold or freezing temperatures. The abiotic stress may be an osmotic stress caused by a water stress (particularly due to drought), salt stress, oxidative stress or an ionic stress. Biotic stresses are typically those stresses caused by pathogens, such as bacteria, viruses, fungi and insects.
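The growth-reduction criterion for mild stress may be illustrated by the following minimal Python sketch. The function names and the sample values are hypothetical; the default threshold of 40% is the broadest of the values recited above, and a narrower value may be passed instead.

```python
def growth_reduction_percent(control_growth, stressed_growth):
    """Percentage reduction in growth of stressed plants relative to
    control plants grown under non-stress conditions."""
    if control_growth <= 0:
        raise ValueError("control growth must be positive")
    return 100.0 * (control_growth - stressed_growth) / control_growth


def is_mild_stress(control_growth, stressed_growth, threshold=40.0):
    """True if the growth reduction is below the chosen threshold (in %)."""
    return growth_reduction_percent(control_growth, stressed_growth) < threshold


# Hypothetical biomass values (e.g. grams per plant).
print(growth_reduction_percent(20.0, 15.0))
print(is_mild_stress(20.0, 15.0))
print(is_mild_stress(20.0, 15.0, threshold=25.0))
```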

[0237] In particular, the methods of the present invention may be performed under non-stress conditions or under conditions of mild drought to give plants having increased yield relative to control plants. As reported in Wang et al. (Planta (2003) 218: 1-14), abiotic stress leads to a series of morphological, physiological, biochemical and molecular changes that adversely affect plant growth and productivity. Drought, salinity, extreme temperatures and oxidative stress are known to be interconnected and may induce growth and cellular damage through similar mechanisms. Rabbani et al. (Plant Physiol (2003) 133: 1755-1767) describe a particularly high degree of "cross talk" between drought stress and high-salinity stress. For example, drought and/or salinisation are manifested primarily as osmotic stress, resulting in the disruption of homeostasis and ion distribution in the cell. Oxidative stress, which frequently accompanies high or low temperature, salinity or drought stress, may cause denaturing of functional and structural proteins. As a consequence, these diverse environmental stresses often activate similar cell signalling pathways and cellular responses, such as the production of stress proteins, up-regulation of anti-oxidants, accumulation of compatible solutes and growth arrest. "Non-stress" conditions as used herein are those environmental conditions that allow optimal growth of plants. Persons skilled in the art are aware of normal soil conditions and climatic conditions for a given location.

[0238] Performance of the methods of the invention gives plants grown under non-stress conditions or under mild drought conditions increased yield and/or increased early vigour, relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield and/or early vigour in plants grown under non-stress conditions or under mild drought conditions, which method comprises increasing expression in a plant of a nucleic acid encoding an ASNS polypeptide.

[0239] Performance of the methods of the invention gives plants grown under conditions of nutrient deficiency, particularly under conditions of nitrogen deficiency, increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of nutrient deficiency, which method comprises increasing expression in a plant of a nucleic acid encoding an ASNS polypeptide. Nutrient deficiency may result from a lack of nutrients such as nitrogen, phosphates and other phosphorus-containing compounds, potassium, calcium, cadmium, magnesium, manganese, iron and boron, amongst others.

[0240] Performance of the methods of the invention gives plants grown under conditions of salt stress increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of salt stress, which method comprises modulating expression in a plant of a nucleic acid encoding an ASNS polypeptide. The term salt stress is not restricted to common salt (NaCl), but may be any one or more of: NaCl, KCl, LiCl, MgCl2, CaCl2, amongst others.

[0241] The present invention encompasses plants or parts thereof (including seeds) obtainable by the methods according to the present invention. The plants or parts thereof comprise a nucleic acid transgene encoding an ASNS polypeptide as defined above.

[0242] The invention also provides genetic constructs and vectors to facilitate introduction and/or expression in plants of nucleic acids encoding ASNS polypeptides. The gene constructs may be inserted into vectors, which may be commercially available, suitable for transforming into plants and suitable for expression of the gene of interest in the transformed cells. The invention also provides use of a gene construct as defined herein in the methods of the invention.

[0243] More specifically, the present invention provides a construct comprising:

[0244] (a) a nucleic acid encoding an ASNS polypeptide as defined above;

[0245] (b) one or more control sequences capable of driving expression of the nucleic acid sequence of (a); and optionally

[0246] (c) a transcription termination sequence.

[0247] Preferably, the nucleic acid encoding an ASNS polypeptide is as defined above. The terms "control sequence" and "termination sequence" are as defined herein.

[0248] Plants are transformed with a vector comprising any of the nucleic acids described above. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells containing the sequence of interest. The sequence of interest is operably linked to one or more control sequences (at least to a promoter).

[0249] Advantageously, any type of promoter, whether natural or synthetic, may be used to drive expression of the nucleic acid sequence, but preferably the promoter is of plant origin. A constitutive promoter is particularly useful in the methods of the invention. Preferably the constitutive promoter is also a ubiquitous promoter of medium strength. See the "Definitions" section herein for definitions of the various promoter types.

[0250] It should be clear that the applicability of the present invention is not restricted to the ASNS polypeptide-encoding nucleic acid represented by SEQ ID NO: 62, nor is the applicability of the invention restricted to expression of an ASNS polypeptide-encoding nucleic acid when driven by a constitutive promoter.

[0251] The constitutive promoter is preferably a GOS2 promoter, preferably a GOS2 promoter from rice. Further preferably the constitutive promoter is represented by a nucleic acid sequence substantially similar to SEQ ID NO: 64, most preferably the constitutive promoter is as represented by SEQ ID NO: 64. See Table 2 in the "Definitions" section herein for further examples of constitutive promoters.

[0252] Optionally, one or more terminator sequences may be used in the construct introduced into a plant. Additional regulatory elements may include transcriptional as well as translational enhancers. Those skilled in the art will be aware of terminator and enhancer sequences that may be suitable for use in performing the invention. An intron sequence may also be added to the 5' untranslated region (UTR) or in the coding sequence to increase the amount of the mature message that accumulates in the cytosol, as described in the definitions section. Other control sequences (besides promoter, enhancer, silencer, intron sequences, 3'UTR and/or 5'UTR regions) may be protein and/or RNA stabilizing elements. Such sequences would be known or may readily be obtained by a person skilled in the art.

[0253] The genetic constructs of the invention may further include an origin of replication sequence that is required for maintenance and/or replication in a specific cell type. One example is when a genetic construct is required to be maintained in a bacterial cell as an episomal genetic element (e.g. plasmid or cosmid molecule). Preferred origins of replication include, but are not limited to, the f1-ori and colE1.

[0254] For the detection of the successful transfer of the nucleic acid sequences as used in the methods of the invention and/or selection of transgenic plants comprising these nucleic acids, it is advantageous to use marker genes (or reporter genes). Therefore, the genetic construct may optionally comprise a selectable marker gene. Selectable markers are described in more detail in the "definitions" section herein. The marker genes may be removed or excised from the transgenic cell once they are no longer needed. Techniques for marker removal are known in the art; useful techniques are described above in the definitions section.

[0255] The invention also provides a method for the production of transgenic plants having enhanced yield-related traits relative to control plants, comprising introduction and expression in a plant of any nucleic acid encoding an ASNS polypeptide as defined hereinabove.

[0256] More specifically, the present invention provides a method for the production of transgenic plants having enhanced yield-related traits, particularly increased early vigour and/or increased yield, which method comprises:

[0257] (i) introducing and expressing in a plant or plant cell an ASNS polypeptide-encoding nucleic acid; and

[0258] (ii) cultivating the plant cell under conditions promoting plant growth and development.

[0259] The nucleic acid of (i) may be any of the nucleic acids capable of encoding an ASNS polypeptide as defined herein.

[0260] The nucleic acid may be introduced directly into a plant cell or into the plant itself (including introduction into a tissue, organ or any other part of a plant). According to a preferred feature of the present invention, the nucleic acid is preferably introduced into a plant by transformation. The term "transformation" is described in more detail in the "definitions" section herein.

[0261] The genetically modified plant cells can be regenerated via all methods with which the skilled worker is familiar. Suitable methods can be found in the abovementioned publications by S. D. Kung and R. Wu, Potrykus or Hofgen and Willmitzer.

[0262] Generally after transformation, plant cells or cell groupings are selected for the presence of one or more markers which are encoded by plant-expressible genes co-transferred with the gene of interest, following which the transformed material is regenerated into a whole plant. To select transformed plants, the plant material obtained in the transformation is, as a rule, subjected to selective conditions so that transformed plants can be distinguished from untransformed plants. For example, the seeds obtained in the above-described manner can be planted and, after an initial growing period, subjected to a suitable selection by spraying. A further possibility consists in growing the seeds, if appropriate after sterilization, on agar plates using a suitable selection agent so that only the transformed seeds can grow into plants. Alternatively, the transformed plants are screened for the presence of a selectable marker such as the ones described above.

[0263] Following DNA transfer and regeneration, putatively transformed plants may also be evaluated, for instance using Southern analysis, for the presence of the gene of interest, copy number and/or genomic organisation. Alternatively or additionally, expression levels of the newly introduced DNA may be monitored using Northern and/or Western analysis, both techniques being well known to persons having ordinary skill in the art.

[0264] The generated transformed plants may be propagated by a variety of means, such as by clonal propagation or classical breeding techniques. For example, a first generation (or T1) transformed plant may be selfed and homozygous second-generation (or T2) transformants selected, and the T2 plants may then further be propagated through classical breeding techniques. The generated transformed organisms may take a variety of forms. For example, they may be chimeras of transformed cells and non-transformed cells; clonal transformants (e.g., all cells transformed to contain the expression cassette); grafts of transformed and untransformed tissues (e.g., in plants, a transformed rootstock grafted to an untransformed scion).

[0265] The present invention clearly extends to any plant cell or plant produced by any of the methods described herein, and to all plant parts and propagules thereof. The present invention extends further to encompass the progeny of a primary transformed or transfected cell, tissue, organ or whole plant that has been produced by any of the aforementioned methods, the only requirement being that progeny exhibit the same genotypic and/or phenotypic characteristic(s) as those produced by the parent in the methods according to the invention.

[0266] The invention also includes host cells containing an isolated nucleic acid encoding an ASNS polypeptide as defined hereinabove. Preferred host cells according to the invention are plant cells. Host plants for the nucleic acids or the vector used in the method according to the invention, the expression cassette or construct or vector are, in principle, advantageously all plants, which are capable of synthesizing the polypeptides used in the inventive method.

[0267] The methods of the invention are advantageously applicable to any plant. Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs. According to a preferred embodiment of the present invention, the plant is a crop plant. Examples of crop plants include soybean, sunflower, canola, alfalfa, rapeseed, cotton, tomato, potato and tobacco. Further preferably, the plant is a monocotyledonous plant. Examples of monocotyledonous plants include sugarcane. More preferably the plant is a cereal. Examples of cereals include rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, secale, einkorn, teff, milo and oats.

[0268] The invention also extends to harvestable parts of a plant such as, but not limited to seeds, leaves, fruits, flowers, stems, roots, rhizomes, tubers and bulbs. The invention furthermore relates to products derived, preferably directly derived, from a harvestable part of such a plant, such as dry pellets or powders, oil, fat and fatty acids, starch or proteins.

[0269] According to a preferred feature of the invention, the modulated expression is increased expression. Methods for increasing expression of nucleic acids or genes, or gene products, are well documented in the art and examples are provided in the definitions section.

[0270] As mentioned above, a preferred method for modulating (preferably, increasing) expression of a nucleic acid encoding an ASNS polypeptide is by introducing and expressing in a plant a nucleic acid encoding an ASNS polypeptide; however, the effects of performing the method, i.e. enhancing yield-related traits, may also be achieved using other well known techniques, including but not limited to T-DNA activation tagging, TILLING and homologous recombination. A description of these techniques is provided in the definitions section.

[0271] The present invention also encompasses use of nucleic acids encoding ASNS polypeptides as described herein and use of these ASNS polypeptides in enhancing any of the aforementioned yield-related traits in plants.

[0272] Nucleic acids encoding ASNS polypeptides described herein, or the ASNS polypeptides themselves, may find use in breeding programmes in which a DNA marker is identified which may be genetically linked to an ASNS polypeptide-encoding gene. The nucleic acids/genes, or the ASNS polypeptides themselves, may be used to define a molecular marker. This DNA or protein marker may then be used in breeding programmes to select plants having enhanced yield-related traits as defined hereinabove in the methods of the invention.

[0273] Allelic variants of an ASNS polypeptide-encoding nucleic acid/gene may also find use in marker-assisted breeding programmes. Such breeding programmes sometimes require introduction of allelic variation by mutagenic treatment of the plants, using for example EMS mutagenesis; alternatively, the programme may start with a collection of allelic variants of so called "natural" origin caused unintentionally. Identification of allelic variants then takes place, for example, by PCR. This is followed by a step for selection of superior allelic variants of the sequence in question and which give increased yield. Selection is typically carried out by monitoring growth performance of plants containing different allelic variants of the sequence in question. Growth performance may be monitored in a greenhouse or in the field. Further optional steps include crossing plants in which the superior allelic variant was identified with another plant. This could be used, for example, to make a combination of interesting phenotypic features.

[0274] Nucleic acids encoding ASNS polypeptides may also be used as probes for genetically and physically mapping the genes that they are a part of, and as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes. Such use of ASNS polypeptide-encoding nucleic acids requires only a nucleic acid sequence of at least 15 nucleotides in length. The ASNS polypeptide-encoding nucleic acids may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Sambrook J, Fritsch E F and Maniatis T (1989) Molecular Cloning, A Laboratory Manual) of restriction-digested plant genomic DNA may be probed with the ASNS-encoding nucleic acids. The resulting banding patterns may then be subjected to genetic analyses using computer programs such as MapMaker (Lander et al. (1987) Genomics 1: 174-181) in order to construct a genetic map. In addition, the nucleic acids may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing parent and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the ASNS polypeptide-encoding nucleic acid in the genetic map previously obtained using this population (Botstein et al. (1980) Am. J. Hum. Genet. 32:314-331).
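As a purely illustrative sketch of how segregation data may be turned into a map position, the following Python code estimates the recombination fraction between two markers from progeny counts and converts it into a map distance. The Haldane mapping function is used here as an assumption for illustration; it is not stated to be the calculation performed by MapMaker, and the function names and counts are hypothetical.

```python
import math


def recombination_fraction(recombinants, total):
    """Fraction of progeny showing recombination between two markers."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("counts out of range")
    return recombinants / total


def haldane_distance_cm(r):
    """Map distance in centimorgans from recombination fraction r,
    using the Haldane mapping function d = -50 * ln(1 - 2r)."""
    if not 0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical cross: 10 recombinant individuals among 100 progeny scored
# for an RFLP detected with the probe and for a neighbouring marker.
r = recombination_fraction(10, 100)
print(round(haldane_distance_cm(r), 2))
```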

[0275] The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky and Tanksley (1986) Plant Mol. Biol. Reporter 4: 37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.

[0276] The nucleic acid probes may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel et al. In: Non-mammalian Genomic Analysis: A Practical Guide, Academic press 1996, pp. 319-346, and references cited therein).

[0277] In another embodiment, the nucleic acid probes may be used in direct fluorescence in situ hybridisation (FISH) mapping (Trask (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favour use of large clones (several kb to several hundred kb; see Laan et al. (1995) Genome Res. 5:13-20), improvements in sensitivity may allow performance of FISH mapping using shorter probes.

[0278] A variety of nucleic acid amplification-based methods for genetic and physical mapping may be carried out using the nucleic acids. Examples include allele-specific amplification (Kazazian (1989) J. Lab. Clin. Med 11:95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov (1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping (Walter et al. (1997) Nat. Genet. 7:22-28) and Happy Mapping (Dear and Cook (1989) Nucleic Acid Res. 17:6795-6807). For these methods, the sequence of a nucleic acid is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art.

[0279] In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.

[0280] The methods according to the present invention result in plants having enhanced yield-related traits, as described hereinbefore. These traits may also be combined with other economically advantageous traits, such as further yield-enhancing traits, tolerance to other abiotic and biotic stresses, traits modifying various architectural features and/or biochemical and/or physiological features.

DESCRIPTION OF FIGURES

[0281] The present invention will now be described with reference to the following figures in which:

[0282] FIG. 1 represents the binary vector for increased expression in Oryza sativa of a NITR-encoding nucleic acid under the control of a rice GOS2 promoter (pGOS2::NITR).

[0283] FIG. 2 details examples of sequences useful in performing the methods according to the present invention.

[0284] FIG. 3 gives a phylogenetic tree of the NITR protein sequences listed in FIG. 2, in which tree the outgroup is represented by Sulfite Reductases, exemplified by SEQ ID NO: 9 (C. reinhardtii 59303), SEQ ID NO: 11 (C. reinhardtii 192232) and SEQ ID NO: 33 (A. thaliana At5g04590).

[0285] FIG. 4 represents the binary vector for increased expression in Oryza sativa of an ASNS-encoding nucleic acid under the control of a rice GOS2 promoter (pGOS2::ASNS).

[0286] FIG. 5 details examples of sequences useful in performing the methods according to the present invention.

[0287] FIG. 6 shows an alignment between SEQ ID NO: 63 and SEQ ID NO: 67. The S165G and R382G mutations are indicated.

[0288] FIG. 7 is a multiple alignment of examples of ASNS polypeptides. The asterisks represent amino acids that are identical in all sequences, the colons indicate highly conserved residues, and the dots represent conserved residues. The Arg residues corresponding to R382 in SEQ ID NO: 67 are shown in bold.

EXAMPLES

[0289] The present invention will now be described with reference to the following examples, which are by way of illustration alone. The following examples are not intended to completely define or otherwise limit the scope of the invention.

[0290] DNA manipulation: unless otherwise stated, recombinant DNA techniques are performed according to standard protocols described in (Sambrook (2001) Molecular Cloning: a laboratory manual, 3rd Edition Cold Spring Harbor Laboratory Press, CSH, New York) or in Volumes 1 and 2 of Ausubel et al. (1994), Current Protocols in Molecular Biology, Current Protocols. Standard materials and methods for plant molecular work are described in Plant Molecular Biology Labfax (1993) by R. D. D. Croy, published by BIOS Scientific Publications Ltd (UK) and Blackwell Scientific Publications (UK).

Example 1

Identification of Sequences Related to the Nucleic Acid Sequence Used in the Methods of the Invention

[0291] Sequences (full length cDNA, ESTs or genomic) related to the nucleic acid sequence used in the methods of the present invention are identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program is used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. For example, the polypeptide encoded by the nucleic acid used in the present invention is used for the TBLASTN algorithm, with default settings and the filter to ignore low complexity sequences set off. The output of the analysis is viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search. For example, the E-value may be increased to show less stringent matches. This way, short nearly exact matches may be identified.
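The percentage identity referred to above may be illustrated by the following minimal Python sketch, which counts identical positions in two sequences that have already been aligned. Taking the number of alignment columns as the length over which identity is calculated is one common convention and is an assumption here; the function name and the short sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns holding the same residue in both
    sequences. Columns that are gaps in both sequences are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments; "-" marks a gap introduced by the aligner.
print(percent_identity("MTSF-SLT", "MTAFSSLT"))
```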

[0292] In some instances, related sequences may tentatively be assembled and publicly disclosed by research institutions, such as The Institute for Genomic Research (TIGR). The Eukaryotic Gene Orthologs (EGO) database may be used to identify such related sequences, either by keyword search or by using the BLAST algorithm with the nucleic acid or polypeptide sequence of interest.

Example 2

Alignment of NITR Polypeptide Sequences

[0293] Alignment of polypeptide sequences is performed using the AlignX programme from the Vector NTI package (Invitrogen), which is based on the popular Clustal W algorithm of progressive alignment (Thompson et al. (1997) Nucleic Acids Res 25:4876-4882; Chenna et al. (2003). Nucleic Acids Res 31:3497-3500). Default values are a gap open penalty of 10 and a gap extension penalty of 0.1, and the selected weight matrix is Blosum 62 (if polypeptides are aligned). Minor manual editing may be done to further optimise the alignment.

[0294] A phylogenetic tree of NITR polypeptides is constructed using a neighbour-joining clustering algorithm as provided in the AlignX programme from Vector NTI (Invitrogen).

[0295] For the construction of the phylogenetic tree of FIG. 3, the proteins of FIG. 2 were aligned using MUSCLE (Edgar (2004), Nucleic Acids Research 32(5): 1792-97). A Neighbour-Joining tree was calculated using QuickTree (Howe et al. (2002), Bioinformatics 18(11): 1546-7). Support of the major branching is indicated for 100 bootstrap repetitions. A circular phylogram was drawn using Dendroscope (Huson et al. (2007), BMC Bioinformatics 8(1):460).

Example 3

Calculation of Global Percentage Identity Between Polypeptide Sequences Useful in Performing the Methods of the Invention

[0296] Global percentages of similarity and identity between full length polypeptide sequences useful in performing the methods of the invention are determined using one of the methods available in the art, the MatGAT (Matrix Global Alignment Tool) software (BMC Bioinformatics. 2003 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. Campanella J J, Bitincka L, Smalley J; software hosted by Ledion Bitincka). MatGAT software generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data. The program performs a series of pair-wise alignments using the Myers and Miller global alignment algorithm (with a gap opening penalty of 12, and a gap extension penalty of 2), calculates similarity and identity using for example Blosum 62 (for polypeptides), and then places the results in a distance matrix. Sequence similarity is shown in the bottom half, below the diagonal dividing line, and sequence identity is shown in the top half, above the diagonal dividing line.

[0297] Parameters used in the comparison were:

[0298] Scoring matrix: Blosum62

[0299] First Gap: 12

[0300] Extending gap: 2

[0301] A MATGAT table for local alignment of a specific domain, or data on % identity/similarity between specific domains may also be generated.
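The general shape of such a pairwise identity matrix may be sketched in Python as follows. This sketch is not the MatGAT implementation: it substitutes a simple Needleman-Wunsch global alignment with match/mismatch scoring and a linear gap penalty for the Myers and Miller algorithm, Blosum62 matrix and gap penalties of 12 and 2 recited above, and it fills only the identity (upper) half of the matrix. All function names and sequences are hypothetical.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.
    Returns the two aligned strings, with "-" marking gaps."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def identity(a, b):
    """Percentage of identical columns in the global alignment of a and b."""
    al_a, al_b = global_align(a, b)
    same = sum(1 for x, y in zip(al_a, al_b) if x == y)
    return 100.0 * same / len(al_a)


def identity_matrix(seqs):
    """Square matrix with identity above the diagonal; the diagonal and the
    lower half (reserved for similarity in a MatGAT-style table) stay None."""
    k = len(seqs)
    table = [[None] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            table[i][j] = identity(seqs[i], seqs[j])
    return table


for row in identity_matrix(["MTSFS", "MTFS", "MTSFS"]):
    print(row)
```

Filling the lower half with similarity values would additionally require a substitution matrix such as Blosum62 to decide which non-identical residue pairs count as similar.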

Example 4

Identification of Domains Comprised in Polypeptide Sequences Useful in Performing the Methods of the Invention

[0302] The Integrated Resource of Protein Families, Domains and Sites (InterPro) database is an integrated interface for the commonly used signature databases for text- and sequence-based searches. The InterPro database combines these databases, which use different methodologies and varying degrees of biological information about well-characterized proteins to derive protein signatures. Collaborating databases include SWISS-PROT, PROSITE, TrEMBL, PRINTS, ProDom and Pfam, Smart and TIGRFAMs. Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families. Pfam is hosted at the Sanger Institute server in the United Kingdom. Interpro is hosted at the European Bioinformatics Institute in the United Kingdom.

[0303] The protein sequences representing the NITR are used as query to search the InterPro database.

Example 5

Topology Prediction of the Polypeptide Sequences Useful in Performing the Methods of the Invention

[0304] TargetP 1.1 predicts the subcellular location of eukaryotic proteins. The location assignment is based on the predicted presence of any of the N-terminal pre-sequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP). Scores on which the final prediction is based are not really probabilities, and they do not necessarily add to one. However, the location with the highest score is the most likely according to TargetP, and the relationship between the scores (the reliability class) may be an indication of how certain the prediction is. The reliability class (RC) ranges from 1 to 5, where 1 indicates the strongest prediction. TargetP is maintained at the server of the Technical University of Denmark.
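The relationship between the scores and the reliability class may be sketched in Python as follows. The thresholds on the difference between the highest and second-highest scores (0.8, 0.6, 0.4 and 0.2) follow the definition given in the TargetP 1.1 output documentation as understood here and should be treated as an assumption; the function name and the example scores are hypothetical.

```python
def targetp_call(scores):
    """Return (predicted location, reliability class) from a dict that maps
    location codes (e.g. "cTP", "mTP", "SP", "other") to output scores."""
    if len(scores) < 2:
        raise ValueError("at least two scores are needed")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_location, best_score = ranked[0]
    second_score = ranked[1][1]
    diff = best_score - second_score
    # Reliability class: 1 is the strongest prediction, 5 the weakest.
    if diff > 0.8:
        rc = 1
    elif diff > 0.6:
        rc = 2
    elif diff > 0.4:
        rc = 3
    elif diff > 0.2:
        rc = 4
    else:
        rc = 5
    return best_location, rc


# Hypothetical scores for a single query sequence.
example = {"cTP": 0.793, "mTP": 0.250, "SP": 0.020, "other": 0.100}
print(targetp_call(example))
```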

[0305] For the sequences predicted to contain an N-terminal presequence a potential cleavage site can also be predicted.

[0306] A number of parameters were selected, such as organism group (non-plant or plant), cutoff sets (none, predefined set of cutoffs, or user-specified set of cutoffs), and the calculation of prediction of cleavage sites (yes or no).

[0307] The protein sequence represented by SEQ ID NO: 2 was used to query TargetP 1.1. The "plant" organism group is selected, no cutoffs defined, and the predicted length of the transit peptide requested. The protein has a predicted location in the chloroplast (probability 0.793, reliability class 3).

[0308] Many other algorithms can be used to perform such analyses, including:

[0309] ChloroP 1.1 hosted on the server of the Technical University of Denmark;

[0310] Protein Prowler Subcellular Localisation Predictor version 1.2 hosted on the server of the Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia;

[0311] PENCE Proteome Analyst PA-GOSUB 2.5 hosted on the server of the University of Alberta, Edmonton, Alberta, Canada;

[0312] TMHMM, hosted on the server of the Technical University of Denmark.

Example 6

Cloning of the Nucleic Acid Sequence Used in the Methods of the Invention

Cloning of SEQ ID NO: 1:

[0313] The NITR encoding nucleic acid sequence SEQ ID NO: 1 used in the methods of the invention was amplified by PCR using as template a custom-made Arabidopsis thaliana seedlings cDNA library (in pCMV Sport 6.0; Invitrogen, Paisley, UK). PCR was performed using Hifi Taq DNA polymerase in standard conditions, using 200 ng of template in a 50 μl PCR mix. The primers used were prm07073 (SEQ ID NO: 4; sense, start codon in bold): 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatgacttctttctctctcactt-3' and prm07074 (SEQ ID NO: 5; reverse, complementary): 5'-ggggaccactttgtacaagaaagctgggtcaatagcttttgaatcaatct-3', which include the AttB sites for Gateway recombination. The amplified PCR fragment was also purified using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment recombines in vitro with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone". Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.
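By way of illustration only, the structure of the two primers may be checked with a short script that separates the attB adapter from the gene-specific part. The adapter sequences used are the standard Gateway attB1 and attB2 primer extensions (including the four leading guanines), which is an assumption not spelled out in this application; all function and variable names are hypothetical.

```python
# Sketch: split Gateway primers into attB adapter and gene-specific part.

ATTB1 = "ggggacaagtttgtacaaaaaagcaggct"  # assumed standard attB1 extension
ATTB2 = "ggggaccactttgtacaagaaagctgggt"  # assumed standard attB2 extension

SENSE_PRIMER = "ggggacaagtttgtacaaaaaagcaggcttaaacaatgacttctttctctctcactt"
REVERSE_PRIMER = "ggggaccactttgtacaagaaagctgggtcaatagcttttgaatcaatct"


def reverse_complement(seq):
    """Return the reverse complement of a lowercase or uppercase DNA string."""
    pairs = {"a": "t", "c": "g", "g": "c", "t": "a"}
    return "".join(pairs[base] for base in reversed(seq.lower()))


def split_gateway_primer(primer):
    """Return (adapter_name, gene_specific_part); whitespace is ignored."""
    seq = "".join(primer.split()).lower()
    for name, adapter in (("attB1", ATTB1), ("attB2", ATTB2)):
        if seq.startswith(adapter):
            return name, seq[len(adapter):]
    raise ValueError("no attB adapter found at the 5' end")


name, gene_part = split_gateway_primer(SENSE_PRIMER)
print(name, gene_part, "start codon offset:", gene_part.find("atg"))
name, gene_part = split_gateway_primer(REVERSE_PRIMER)
print(name, "binds sense-strand sequence:", reverse_complement(gene_part))
```

The sense primer carries a short spacer between the attB1 site and the ATG start codon; the reverse primer is printed as its reverse complement so that it can be compared directly with the coding strand.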

[0314] The entry clone comprising SEQ ID NO: 1 was then used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vitro recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice GOS2 promoter (SEQ ID NO: 3) for constitutive expression was located upstream of this Gateway cassette.

[0315] After the LR recombination step, the resulting expression vector pGOS2::NITR (FIG. 1) was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.

Example 7

Plant Transformation

Rice Transformation

[0316] The Agrobacterium containing the expression vector was used to transform Oryza sativa plants. Mature dry seeds of the rice japonica cultivar Nipponbare were dehusked. Sterilization was carried out by incubating for one minute in 70% ethanol, followed by 30 minutes in 0.2% HgCl2, followed by a 6 times 15 minutes wash with sterile distilled water. The sterile seeds were then germinated on a medium containing 2,4-D (callus induction medium). After incubation in the dark for four weeks, embryogenic, scutellum-derived calli were excised and propagated on the same medium. After two weeks, the calli were multiplied or propagated by subculture on the same medium for another 2 weeks. Embryogenic callus pieces were sub-cultured on fresh medium 3 days before co-cultivation (to boost cell division activity).

[0317] Agrobacterium strain LBA4404 containing the expression vector was used for co-cultivation. Agrobacterium was inoculated on AB medium with the appropriate antibiotics and cultured for 3 days at 28° C. The bacteria were then collected and suspended in liquid co-cultivation medium to a density (OD600) of about 1. The suspension was then transferred to a Petri dish and the calli immersed in the suspension for 15 minutes. The callus tissues were then blotted dry on a filter paper and transferred to solidified, co-cultivation medium and incubated for 3 days in the dark at 25° C. Co-cultivated calli were grown on 2,4-D-containing medium for 4 weeks in the dark at 28° C. in the presence of a selection agent. During this period, rapidly growing resistant callus islands developed. After transfer of this material to a regeneration medium and incubation in the light, the embryogenic potential was released and shoots developed in the next four to five weeks. Shoots were excised from the calli and incubated for 2 to 3 weeks on an auxin-containing medium from which they were transferred to soil. Hardened shoots were grown under high humidity and short days in a greenhouse.

[0318] Approximately 35 independent T0 rice transformants were generated for one construct. The primary transformants were transferred from a tissue culture chamber to a greenhouse. After a quantitative PCR analysis to verify copy number of the T-DNA insert, only single copy transgenic plants that exhibit tolerance to the selection agent were kept for harvest of T1 seed. Seeds were then harvested three to five months after transplanting. The method yielded single locus transformants at a rate of over 50% (Aldemita and Hodges 1996, Chan et al. 1993, Hiei et al. 1994).

Corn Transformation

[0319] Transformation of maize (Zea mays) is performed with a modification of the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. Transformation is genotype-dependent in corn and only specific genotypes are amenable to transformation and regeneration. The inbred line A188 (University of Minnesota) or hybrids with A188 as a parent are good sources of donor material for transformation, but other genotypes can be used successfully as well. Ears are harvested from corn plants approximately 11 days after pollination (DAP) when the length of the immature embryo is about 1 to 1.2 mm. Immature embryos are cocultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. Excised embryos are grown on callus induction medium, then maize regeneration medium, containing the selection agent (for example imidazolinone, but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to maize rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Wheat Transformation

[0320] Transformation of wheat is performed with the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. The cultivar Bobwhite (available from CIMMYT, Mexico) is commonly used in transformation. Immature embryos are co-cultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. After incubation with Agrobacterium, the embryos are grown in vitro on callus induction medium, then regeneration medium, containing the selection agent (for example imidazolinone but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Soybean Transformation

[0321] Soybean is transformed according to a modification of the method described in the Texas A&M patent U.S. Pat. No. 5,164,310. Several commercial soybean varieties are amenable to transformation by this method. The cultivar Jack (available from the Illinois Seed foundation) is commonly used for transformation. Soybean seeds are sterilised for in vitro sowing. The hypocotyl, the radicle and one cotyledon are excised from seven-day old young seedlings. The epicotyl and the remaining cotyledon are further grown to develop axillary nodes. These axillary nodes are excised and incubated with Agrobacterium tumefaciens containing the expression vector. After the cocultivation treatment, the explants are washed and transferred to selection media. Regenerated shoots are excised and placed on a shoot elongation medium. Shoots no longer than 1 cm are placed on rooting medium until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Rapeseed/Canola Transformation

[0322] Cotyledonary petioles and hypocotyls of 5-6 day old young seedlings are used as explants for tissue culture and transformed according to Babic et al. (1998, Plant Cell Rep 17: 183-188). The commercial cultivar Westar (Agriculture Canada) is the standard variety used for transformation, but other varieties can also be used. Canola seeds are surface-sterilized for in vitro sowing. The cotyledon petiole explants with the cotyledon attached are excised from the in vitro seedlings, and inoculated with Agrobacterium (containing the expression vector) by dipping the cut end of the petiole explant into the bacterial suspension. The explants are then cultured for 2 days on MSBAP-3 medium containing 3 mg/l BAP, 3% sucrose, 0.7% Phytagar at 23° C., 16 hr light. After two days of co-cultivation with Agrobacterium, the petiole explants are transferred to MSBAP-3 medium containing 3 mg/l BAP, cefotaxime, carbenicillin, or timentin (300 mg/l) for 7 days, and then cultured on MSBAP-3 medium with cefotaxime, carbenicillin, or timentin and selection agent until shoot regeneration. When the shoots are 5-10 mm in length, they are cut and transferred to shoot elongation medium (MSBAP-0.5, containing 0.5 mg/l BAP). Shoots of about 2 cm in length are transferred to the rooting medium (MS0) for root induction. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Alfalfa Transformation

[0323] A regenerating clone of alfalfa (Medicago sativa) is transformed using the method of McKersie et al. (1999, Plant Physiol 119: 839-847). Regeneration and transformation of alfalfa is genotype dependent and therefore a regenerating plant is required. Methods to obtain regenerating plants have been described. For example, these can be selected from the cultivar Rangelander (Agriculture Canada) or any other commercial alfalfa variety as described by Brown D C W and A Atanassov (1985. Plant Cell Tissue Organ Culture 4: 111-112). Alternatively, the RA3 variety (University of Wisconsin) has been selected for use in tissue culture (Walker et al., 1978 Am J Bot 65:654-659). Petiole explants are cocultivated with an overnight culture of Agrobacterium tumefaciens C58C1 pMP90 (McKersie et al., 1999 Plant Physiol 119: 839-847) or LBA4404 containing the expression vector. The explants are cocultivated for 3 d in the dark on SH induction medium containing 288 mg/L Pro, 53 mg/L thioproline, 4.35 g/L K2SO4, and 100 μM acetosyringone. The explants are washed in half-strength Murashige-Skoog medium (Murashige and Skoog, 1962) and plated on the same SH induction medium without acetosyringone but with a suitable selection agent and suitable antibiotic to inhibit Agrobacterium growth. After several weeks, somatic embryos are transferred to BOi2Y development medium containing no growth regulators, no antibiotics, and 50 g/L sucrose. Somatic embryos are subsequently germinated on half-strength Murashige-Skoog medium. Rooted seedlings are transplanted into pots and grown in a greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Example 8

Phenotypic Evaluation Procedure

8.1 Evaluation Setup

[0324] Approximately 35 independent T0 rice transformants were generated. The primary transformants were transferred from a tissue culture chamber to a greenhouse for growing and harvest of T1 seed. Five events, of which the T1 progeny segregated 3:1 for presence/absence of the transgene, were retained. For each of these events, approximately 10 T1 seedlings containing the transgene (hetero- and homo-zygotes) and approximately 10 T1 seedlings lacking the transgene (nullizygotes) were selected by monitoring visual marker expression. The transgenic plants and the corresponding nullizygotes were grown side-by-side at random positions. Greenhouse conditions were of short days (12 hours light), 28° C. in the light and 22° C. in the dark, and a relative humidity of 70%.
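By way of illustration only, the retention of events whose T1 progeny segregate 3:1 may be sketched as a chi-square goodness-of-fit test. The application does not state which test was applied, so the test itself, the 5% critical value for one degree of freedom (3.841) and all names and counts below are assumptions.

```python
# Sketch: chi-square goodness-of-fit test against a 3:1 segregation ratio.

CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom


def chi_square_3_to_1(n_with_transgene, n_without_transgene):
    """Return the chi-square statistic for observed counts versus 3:1."""
    total = n_with_transgene + n_without_transgene
    expected = (0.75 * total, 0.25 * total)
    observed = (n_with_transgene, n_without_transgene)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def consistent_with_single_locus(n_with_transgene, n_without_transgene):
    """True if the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(n_with_transgene, n_without_transgene) < CRITICAL_5PCT_DF1


# Hypothetical marker counts for two events.
print(consistent_with_single_locus(31, 9))   # close to 3:1
print(consistent_with_single_locus(20, 20))  # far from 3:1
```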

Drought Screen

[0325] Plants from T2 seeds are grown in potting soil under normal conditions until they approach the heading stage. They are then transferred to a "dry" section where irrigation is withheld. Humidity probes are inserted in randomly chosen pots to monitor the soil water content (SWC). When SWC goes below certain thresholds, the plants are automatically re-watered continuously until a normal level is reached again. The plants are then transferred back to normal conditions. The rest of the cultivation (plant maturation, seed harvest) is the same as for plants not grown under abiotic stress conditions. Growth and yield parameters are recorded as detailed for growth under normal conditions.

Nitrogen Use Efficiency Screen

[0326] Rice plants from T2 seeds were grown in potting soil under normal conditions except for the nutrient solution. The pots were watered from transplantation to maturation with a specific nutrient solution containing a reduced nitrogen (N) content, usually between 7 and 8 times less. The rest of the cultivation (plant maturation, seed harvest) was the same as for plants not grown under abiotic stress. Growth and yield parameters were recorded as detailed for growth under normal conditions.

Salt Stress Screen

[0327] Plants are grown on a substrate made of coco fibers and argex (3 to 1 ratio). A normal nutrient solution is used during the first two weeks after transplanting the plantlets in the greenhouse. After the first two weeks, 25 mM of salt (NaCl) is added to the nutrient solution, until the plants are harvested. Seed-related parameters are then measured.

8.2 Statistical Analysis: F Test

[0328] A two factor ANOVA (analysis of variance) was used as a statistical model for the overall evaluation of plant phenotypic characteristics. An F test was carried out on all the parameters measured of all the plants of all the events transformed with the gene of the present invention. The F test was carried out to check for an effect of the gene over all the transformation events and to verify for an overall effect of the gene, also known as a global gene effect. The threshold for significance for a true global gene effect was set at a 5% probability level for the F test. A significant F test value points to a gene effect, meaning that it is not only the mere presence or position of the gene that is causing the differences in phenotype.
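By way of illustration only, an F statistic for a global gene effect may be computed as sketched below. The application states only that a two factor ANOVA was used; taking event and genotype (transgenic versus nullizygote) as the two factors, a balanced design, and the within-cell variance as the error term are assumptions, as are all names and the example data.

```python
# Sketch: F statistic for the genotype main effect in a balanced
# two-factor layout (event x genotype). Pure Python, no p-value lookup.

GENOTYPES = ("transgenic", "nullizygote")


def gene_effect_f(data):
    """data maps event -> {genotype: [values]}; returns (F, df_gene, df_error)."""
    events = list(data)
    n_per_cell = len(data[events[0]][GENOTYPES[0]])
    all_values = [v for e in events for g in GENOTYPES for v in data[e][g]]
    grand_mean = sum(all_values) / len(all_values)

    # Sum of squares for the genotype main effect, pooled over events.
    ss_gene = 0.0
    for g in GENOTYPES:
        values = [v for e in events for v in data[e][g]]
        ss_gene += len(values) * (sum(values) / len(values) - grand_mean) ** 2

    # Within-cell (error) sum of squares.
    ss_error = 0.0
    for e in events:
        for g in GENOTYPES:
            cell = data[e][g]
            cell_mean = sum(cell) / len(cell)
            ss_error += sum((v - cell_mean) ** 2 for v in cell)

    df_gene = len(GENOTYPES) - 1
    df_error = len(events) * len(GENOTYPES) * (n_per_cell - 1)
    f_value = (ss_gene / df_gene) / (ss_error / df_error)
    return f_value, df_gene, df_error


# Hypothetical measurements for two events, two plants per cell.
example = {
    "event1": {"transgenic": [2.0, 4.0], "nullizygote": [1.0, 3.0]},
    "event2": {"transgenic": [3.0, 5.0], "nullizygote": [2.0, 4.0]},
}
print(gene_effect_f(example))
```

The resulting F value would be compared with the F distribution for the returned degrees of freedom at the 5% level.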

[0329] Because two experiments with overlapping events are carried out, a combined analysis is performed. This is useful to check consistency of the effects over the two experiments, and if this is the case, to accumulate evidence from both experiments in order to increase confidence in the conclusion. The method used is a mixed-model approach that takes into account the multilevel structure of the data (i.e. experiment-event-segregants). P values are obtained by comparing the likelihood ratio test statistic to chi-square distributions.
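By way of illustration only, the conversion of a likelihood ratio into a P value may be sketched as follows. The sketch assumes that the two nested models differ by a single parameter (one degree of freedom), which is not stated in this application; the log-likelihood values and names are hypothetical.

```python
# Sketch: likelihood ratio test for nested models, 1 degree of freedom.
import math


def lrt_p_value(loglik_reduced, loglik_full):
    """Return (statistic, p_value) comparing a reduced and a full model."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    if statistic < 0:
        raise ValueError("full model cannot fit worse than the reduced model")
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods of models without and with the gene effect.
print(lrt_p_value(-100.0, -97.0))
```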

8.3 Parameters Measured

Biomass-Related Parameter Measurement

[0330] From the stage of sowing until the stage of maturity the plants were passed several times through a digital imaging cabinet. At each time point digital images (2048×1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles.

[0331] The plant aboveground area (or leafy biomass) was determined by counting the total number of pixels on the digital images from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from the different angles and was converted to a physical surface value expressed in square mm by calibration. Experiments show that the aboveground plant area measured this way correlates with the biomass of plant parts above ground. The above ground area is the area measured at the time point at which the plant had reached its maximal leafy biomass. The early vigour is the plant (seedling) aboveground area three weeks post-germination. Increase in root biomass is expressed as an increase in total root biomass (measured as maximum biomass of roots observed during the lifespan of a plant); or as an increase in the root/shoot index (measured as the ratio between root mass and shoot mass in the period of active growth of root and shoot).
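By way of illustration only, the pixel counting, averaging over angles and calibration described above may be sketched as follows. The representation of a segmented image as rows of 0/1 values, the calibration constant and all names are hypothetical.

```python
# Sketch: aboveground area from segmented images taken at several angles.

def aboveground_area_mm2(masks, mm2_per_pixel):
    """Average the plant pixel count over all masks and convert to mm2.

    masks: list of images, each a list of rows of 0/1 values where 1
    marks a pixel classified as aboveground plant tissue.
    mm2_per_pixel: calibration factor (assumed to be known).
    """
    counts = [sum(sum(1 for px in row if px) for row in mask) for mask in masks]
    mean_count = sum(counts) / len(counts)
    return mean_count * mm2_per_pixel


# Two tiny hypothetical masks standing in for images from two angles.
masks = [
    [[1, 0], [1, 1]],
    [[0, 0], [1, 0]],
]
print(aboveground_area_mm2(masks, mm2_per_pixel=0.5))
```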

[0332] Early vigour was determined by counting the total number of pixels from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from different angles and was converted to a physical surface value expressed in square mm by calibration. The results described below are for plants three weeks post-germination.

Seed-Related Parameter Measurements

[0333] The mature primary panicles were harvested, counted, bagged, barcode-labelled and then dried for three days in an oven at 37° C. The panicles were then threshed and all the seeds were collected and counted. The filled husks were separated from the empty ones using an air-blowing device. The empty husks were discarded and the remaining fraction was counted again. The filled husks were weighed on an analytical balance. The number of filled seeds was determined by counting the number of filled husks that remained after the separation step. The total seed yield was measured by weighing all filled husks harvested from a plant. Total seed number per plant was measured by counting the number of husks harvested from a plant. The Harvest Index (HI) in the present invention is defined as the ratio between the total seed yield and the above ground area (mm²), multiplied by a factor of 10⁶. The total number of flowers per panicle as defined in the present invention is the ratio between the total number of seeds and the number of mature primary panicles.
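By way of illustration only, the two derived parameters defined above may be computed as follows. The function names, units of seed yield and example values are hypothetical.

```python
# Sketch: derived seed-related parameters as defined in the text.

def harvest_index(total_seed_yield, aboveground_area_mm2):
    """Ratio of total seed yield to aboveground area (mm2), times 10^6."""
    return total_seed_yield / aboveground_area_mm2 * 1e6


def flowers_per_panicle(total_number_of_seeds, number_of_primary_panicles):
    """Ratio of total seed number to the number of mature primary panicles."""
    return total_number_of_seeds / number_of_primary_panicles


# Hypothetical plant: 20 units of seed yield, 100000 mm2 area,
# 600 seeds (filled and empty husks) on 5 primary panicles.
print(harvest_index(20.0, 100000.0))
print(flowers_per_panicle(600, 5))
```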

Example 9

Results of the Phenotypic Evaluation of the Transgenic Plants

[0334] The transgenic rice plants expressing the NITR nucleic acid represented by SEQ ID NO: 1 under control of the GOS2 promoter showed an increase of more than 5% for biomass (root and shoot), early vigour, total weight of seeds, number of filled seeds, harvest index, total number of seeds and number of flowers per panicle when grown under nitrogen deficiency-stress conditions. When evaluated over two generations (T1 and T2) the following data were obtained (Table 3):

TABLE 3: Yield increase for transgenic plants expressing the NITR nucleic acid compared to the control plants. For each parameter the p value is ≤ 0.05.

Parameter                 Overall increase (%)
Early vigour              17.5
Root/Shoot index          9.0
Total weight of seeds     6.1
Number of filled seeds    5.8
Total number of seeds     5.0

Example 10

Identification of Sequences Related to the Nucleic Acid Sequence Used in the Methods of the Invention

[0335] Sequences (full length cDNA, ESTs or genomic) related to the nucleic acid sequence used in the methods of the present invention are identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program is used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. For example, the polypeptide encoded by the nucleic acid used in the present invention is used for the TBLASTN algorithm, with default settings and the filter to ignore low complexity sequences set off. The output of the analysis is viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search. For example, the E-value may be increased to show less stringent matches. This way, short nearly exact matches may be identified.
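By way of illustration only, percentage identity over an aligned region may be computed as sketched below. Taking the number of alignment columns (excluding columns that are gaps in both sequences) as the "particular length" is an assumption, since several denominator conventions exist; the function name and example sequences are hypothetical.

```python
# Sketch: percentage identity between two sequences that are already aligned.

def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return identical positions as a percentage of compared columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared


# Hypothetical five-column alignment with one gap.
print(percent_identity("MKT-A", "MRTLA"))
```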

[0336] In some instances, related sequences may tentatively be assembled and publicly disclosed by research institutions, such as The Institute for Genomic Research (TIGR). The Eukaryotic Gene Orthologs (EGO) database may be used to identify such related sequences, either by keyword search or by using the BLAST algorithm with the nucleic acid or polypeptide sequence of interest.

Example 11

Alignment of ASNS Polypeptide Sequences

[0337] Alignment of polypeptide sequences is performed using the AlignX programme from the Vector NTI package (Invitrogen), which is based on the popular Clustal W algorithm of progressive alignment (Thompson et al. (1997) Nucleic Acids Res 25:4876-4882; Chenna et al. (2003). Nucleic Acids Res 31:3497-3500). Default values are a gap open penalty of 10 and a gap extension penalty of 0.1, and the selected weight matrix is Blosum 62 (if polypeptides are aligned). Minor manual editing may be done to further optimise the alignment. For the alignment of FIG. 4, the ClustalW 2.0 algorithm was used with default parameters (Matrix: Gonnet, Gap-opening penalty: 10, Gap-extension penalty: 0.1).

[0338] A phylogenetic tree of ASNS polypeptides is constructed using a neighbour-joining clustering algorithm as provided in the AlignX programme from Vector NTI (Invitrogen).

Example 12

Calculation of Global Percentage Identity Between Polypeptide Sequences Useful in Performing the Methods of the Invention

[0339] Global percentages of similarity and identity between full length polypeptide sequences useful in performing the methods of the invention are determined using one of the methods available in the art, the MatGAT (Matrix Global Alignment Tool) software (BMC Bioinformatics. 2003 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. Campanella J J, Bitincka L, Smalley J; software hosted by Ledion Bitincka). MatGAT software generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data. The program performs a series of pair-wise alignments using the Myers and Miller global alignment algorithm (with a gap opening penalty of 12, and a gap extension penalty of 2), calculates similarity and identity using for example Blosum 62 (for polypeptides), and then places the results in a distance matrix. Sequence similarity is shown in the bottom half, below the diagonal dividing line, and sequence identity is shown in the top half, above the diagonal dividing line.
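By way of illustration only, the layout of such a matrix, with identity above the diagonal and similarity below it, may be sketched as follows. The sketch does not perform any alignment; the pairwise values, sequence names and function name are hypothetical.

```python
# Sketch: arrange pairwise identity and similarity values in one matrix,
# identity in the upper triangle and similarity in the lower triangle.

def combined_matrix(names, identity, similarity):
    """Return an n x n list of lists; the diagonal is left as None.

    identity and similarity map frozenset({name_a, name_b}) -> percentage.
    """
    n = len(names)
    matrix = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            pair = frozenset((names[i], names[j]))
            matrix[i][j] = identity[pair]    # above the diagonal
            matrix[j][i] = similarity[pair]  # below the diagonal
    return matrix


# Hypothetical values for three sequences.
names = ["seqA", "seqB", "seqC"]
identity = {
    frozenset(("seqA", "seqB")): 80.0,
    frozenset(("seqA", "seqC")): 55.0,
    frozenset(("seqB", "seqC")): 60.0,
}
similarity = {
    frozenset(("seqA", "seqB")): 90.0,
    frozenset(("seqA", "seqC")): 70.0,
    frozenset(("seqB", "seqC")): 75.0,
}
for row in combined_matrix(names, identity, similarity):
    print(row)
```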

[0340] Parameters used in the comparison were:

[0341] Scoring matrix: Blosum62

[0342] First Gap: 12

[0343] Extending gap: 2

[0344] A MATGAT table for local alignment of a specific domain, or data on % identity/similarity between specific domains may also be generated.

Example 13

Identification of Domains Comprised in Polypeptide Sequences Useful in Performing the Methods of the Invention

[0345] The Integrated Resource of Protein Families, Domains and Sites (InterPro) database is an integrated interface for the commonly used signature databases for text- and sequence-based searches. The InterPro database combines these databases, which use different methodologies and varying degrees of biological information about well-characterized proteins to derive protein signatures. Collaborating databases include SWISS-PROT, PROSITE, TrEMBL, PRINTS, ProDom, Pfam, Smart and TIGRFAMs. Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families. Pfam is hosted at the Sanger Institute server in the United Kingdom. InterPro is hosted at the European Bioinformatics Institute in the United Kingdom.

[0346] The protein sequence represented by SEQ ID NO: 63 was used as query to search the InterPro database.

Example 14

Topology Prediction of the Polypeptide Sequences Useful in Performing the Methods of the Invention

[0347] TargetP 1.1 predicts the subcellular location of eukaryotic proteins. The location assignment is based on the predicted presence of any of the N-terminal pre-sequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP). Scores on which the final prediction is based are not really probabilities, and they do not necessarily add to one. However, the location with the highest score is the most likely according to TargetP, and the relationship between the scores (the reliability class) may be an indication of how certain the prediction is. The reliability class (RC) ranges from 1 to 5, where 1 indicates the strongest prediction. TargetP is maintained at the server of the Technical University of Denmark.

[0348] For the sequences predicted to contain an N-terminal presequence a potential cleavage site can also be predicted.

[0349] A number of parameters were selected, such as organism group (non-plant or plant), cutoff sets (none, predefined set of cutoffs, or user-specified set of cutoffs), and the calculation of prediction of cleavage sites (yes or no).

[0350] The protein sequence of SEQ ID NO: 63 was used to query TargetP 1.1. The "plant" organism group was selected, no cutoffs were defined, and the predicted length of the transit peptide was requested. No clear subcellular location was predicted by TargetP, but SubLoc (Hua and Sun, Bioinformatics) predicted a cytoplasmic localisation.

[0351] Many other algorithms can be used to perform such analyses, including:

[0352] ChloroP 1.1 hosted on the server of the Technical University of Denmark;

[0353] Protein Prowler Subcellular Localisation Predictor version 1.2 hosted on the server of the Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia;

[0354] PENCE Proteome Analyst PA-GOSUB 2.5 hosted on the server of the University of Alberta, Edmonton, Alberta, Canada;

[0355] TMHMM, hosted on the server of the Technical University of Denmark.

Example 15

Cloning of the Nucleic Acid Sequence Used in the Methods of the Invention

Cloning of SEQ ID NO: 62:

[0356] The nucleic acid sequence SEQ ID NO: 62 used in the methods of the invention was amplified by PCR using as template a custom-made Oryza sativa seedlings cDNA library (in pCMV Sport 6.0; Invitrogen, Paisley, UK). PCR was performed using Hifi Taq DNA polymerase in standard conditions, using 200 ng of template in a 50 μl PCR mix. The primers used were prm06049 (SEQ ID NO: 65; sense, start codon in bold): 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatgtgtggcatcctcgccgtgctcg-3' and prm06050 (SEQ ID NO: 66; reverse, complementary): 5'-ggggaccactttgtacaagaaagctgggtgcgacgatagaaagttaaacggcag-3', which include the AttB sites for Gateway recombination. The amplified PCR fragment was also purified using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment recombines in vitro with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone". Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.

[0357] The entry clone comprising SEQ ID NO: 62 was then used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vitro recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice GOS2 promoter (SEQ ID NO: 64) for constitutive expression was located upstream of this Gateway cassette.

[0358] After the LR recombination step, the resulting expression vector pGOS2::ASNS (FIG. 4) was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.

Example 16

Plant Transformation

Rice Transformation

[0359] The Agrobacterium containing the expression vector was used to transform Oryza sativa plants. Mature dry seeds of the rice japonica cultivar Nipponbare were dehusked. Sterilization was carried out by incubating for one minute in 70% ethanol, followed by 30 minutes in 0.2% HgCl2, followed by a 6 times 15 minutes wash with sterile distilled water. The sterile seeds were then germinated on a medium containing 2,4-D (callus induction medium). After incubation in the dark for four weeks, embryogenic, scutellum-derived calli were excised and propagated on the same medium. After two weeks, the calli were multiplied or propagated by subculture on the same medium for another 2 weeks. Embryogenic callus pieces were sub-cultured on fresh medium 3 days before co-cultivation (to boost cell division activity).

[0360] Agrobacterium strain LBA4404 containing the expression vector was used for co-cultivation. Agrobacterium was inoculated on AB medium with the appropriate antibiotics and cultured for 3 days at 28° C. The bacteria were then collected and suspended in liquid co-cultivation medium to a density (OD600) of about 1. The suspension was then transferred to a Petri dish and the calli immersed in the suspension for 15 minutes. The callus tissues were then blotted dry on a filter paper and transferred to solidified, co-cultivation medium and incubated for 3 days in the dark at 25° C. Co-cultivated calli were grown on 2,4-D-containing medium for 4 weeks in the dark at 28° C. in the presence of a selection agent. During this period, rapidly growing resistant callus islands developed. After transfer of this material to a regeneration medium and incubation in the light, the embryogenic potential was released and shoots developed in the next four to five weeks. Shoots were excised from the calli and incubated for 2 to 3 weeks on an auxin-containing medium from which they were transferred to soil. Hardened shoots were grown under high humidity and short days in a greenhouse.

[0361] Approximately 35 independent T0 rice transformants were generated for one construct. The primary transformants were transferred from a tissue culture chamber to a greenhouse. After a quantitative PCR analysis to verify copy number of the T-DNA insert, only single copy transgenic plants that exhibit tolerance to the selection agent were kept for harvest of T1 seed. Seeds were then harvested three to five months after transplanting. The method yielded single locus transformants at a rate of over 50% (Aldemita and Hodges 1996, Chan et al. 1993, Hiei et al. 1994).

Corn Transformation

[0362] Transformation of maize (Zea mays) is performed with a modification of the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. Transformation is genotype-dependent in corn and only specific genotypes are amenable to transformation and regeneration. The inbred line A188 (University of Minnesota) or hybrids with A188 as a parent are good sources of donor material for transformation, but other genotypes can be used successfully as well. Ears are harvested from corn plants approximately 11 days after pollination (DAP) when the length of the immature embryo is about 1 to 1.2 mm. Immature embryos are cocultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. Excised embryos are grown on callus induction medium, then maize regeneration medium, containing the selection agent (for example imidazolinone, but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to maize rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Wheat Transformation

[0363] Transformation of wheat is performed with the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. The cultivar Bobwhite (available from CIMMYT, Mexico) is commonly used in transformation. Immature embryos are co-cultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. After incubation with Agrobacterium, the embryos are grown in vitro on callus induction medium, then regeneration medium, containing the selection agent (for example imidazolinone but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Soybean Transformation

[0364] Soybean is transformed according to a modification of the method described in the Texas A&M patent U.S. Pat. No. 5,164,310. Several commercial soybean varieties are amenable to transformation by this method. The cultivar Jack (available from the Illinois Seed Foundation) is commonly used for transformation. Soybean seeds are sterilised for in vitro sowing. The hypocotyl, the radicle and one cotyledon are excised from seven-day-old seedlings. The epicotyl and the remaining cotyledon are further grown to develop axillary nodes. These axillary nodes are excised and incubated with Agrobacterium tumefaciens containing the expression vector. After the co-cultivation treatment, the explants are washed and transferred to selection media. Regenerated shoots are excised and placed on a shoot elongation medium. Shoots no longer than 1 cm are placed on rooting medium until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Rapeseed/Canola Transformation

[0365] Cotyledonary petioles and hypocotyls of 5-6-day-old seedlings are used as explants for tissue culture and transformed according to Babic et al. (1998, Plant Cell Rep 17: 183-188). The commercial cultivar Westar (Agriculture Canada) is the standard variety used for transformation, but other varieties can also be used. Canola seeds are surface-sterilized for in vitro sowing. The cotyledon petiole explants with the cotyledon attached are excised from the in vitro seedlings, and inoculated with Agrobacterium (containing the expression vector) by dipping the cut end of the petiole explant into the bacterial suspension. The explants are then cultured for 2 days on MSBAP-3 medium containing 3 mg/l BAP, 3% sucrose, 0.7% Phytagar at 23° C., 16 hr light. After two days of co-cultivation with Agrobacterium, the petiole explants are transferred to MSBAP-3 medium containing 3 mg/l BAP, cefotaxime, carbenicillin, or timentin (300 mg/l) for 7 days, and then cultured on MSBAP-3 medium with cefotaxime, carbenicillin, or timentin and selection agent until shoot regeneration. When the shoots are 5-10 mm in length, they are cut and transferred to shoot elongation medium (MSBAP-0.5, containing 0.5 mg/l BAP). Shoots of about 2 cm in length are transferred to the rooting medium (MS0) for root induction. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Alfalfa Transformation

[0366] A regenerating clone of alfalfa (Medicago sativa) is transformed using the method of McKersie et al. (1999, Plant Physiol 119: 839-847). Regeneration and transformation of alfalfa are genotype-dependent and therefore a regenerating plant is required. Methods to obtain regenerating plants have been described. For example, these can be selected from the cultivar Rangelander (Agriculture Canada) or any other commercial alfalfa variety as described by Brown D C W and A Atanassov (1985. Plant Cell Tissue Organ Culture 4: 111-112). Alternatively, the RA3 variety (University of Wisconsin) has been selected for use in tissue culture (Walker et al., 1978 Am J Bot 65:654-659). Petiole explants are co-cultivated with an overnight culture of Agrobacterium tumefaciens C58C1 pMP90 (McKersie et al., 1999 Plant Physiol 119: 839-847) or LBA4404 containing the expression vector. The explants are co-cultivated for 3 d in the dark on SH induction medium containing 288 mg/L Pro, 53 mg/L thioproline, 4.35 g/L K2SO4, and 100 μM acetosyringone. The explants are washed in half-strength Murashige-Skoog medium (Murashige and Skoog, 1962) and plated on the same SH induction medium without acetosyringone but with a suitable selection agent and suitable antibiotic to inhibit Agrobacterium growth. After several weeks, somatic embryos are transferred to BOi2Y development medium containing no growth regulators, no antibiotics, and 50 g/L sucrose. Somatic embryos are subsequently germinated on half-strength Murashige-Skoog medium. Rooted seedlings are transplanted into pots and grown in a greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Example 17

Phenotypic Evaluation Procedure

17.1 Evaluation Setup

[0367] Approximately 35 independent T0 rice transformants were generated. The primary transformants were transferred from a tissue culture chamber to a greenhouse for growing and harvest of T1 seed. Six events, of which the T1 progeny segregated 3:1 for presence/absence of the transgene, were retained. For each of these events, approximately 10 T1 seedlings containing the transgene (hetero- and homozygotes) and approximately 10 T1 seedlings lacking the transgene (nullizygotes) were selected by monitoring visual marker expression. The transgenic plants and the corresponding nullizygotes were grown side-by-side at random positions. Greenhouse conditions were short days (12 hours light), 28° C. in the light and 22° C. in the dark, and a relative humidity of 70%.
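The 3:1 segregation criterion used to retain events can be illustrated with a chi-square goodness-of-fit check. The sketch below is an assumption about how such a check might be done; the source does not state which test, if any, was applied, and the seedling counts and the 0.05 cut-off are hypothetical.

```python
import math

def chi_square_3_to_1(n_transgenic, n_null):
    """Chi-square goodness-of-fit statistic and p-value (1 df) for a 3:1 ratio."""
    total = n_transgenic + n_null
    expected_transgenic = 0.75 * total
    expected_null = 0.25 * total
    stat = ((n_transgenic - expected_transgenic) ** 2 / expected_transgenic
            + (n_null - expected_null) ** 2 / expected_null)
    # With 1 degree of freedom the chi-square upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical T1 counts for one event (not taken from the source).
stat, p_value = chi_square_3_to_1(n_transgenic=31, n_null=9)
# Hypothetical decision rule: keep the event if 3:1 is not rejected at 5%.
consistent_with_3_to_1 = p_value > 0.05
print(round(stat, 4), round(p_value, 3), consistent_with_3_to_1)
```

A non-significant result only means the counts are compatible with a 3:1 ratio; with small progeny numbers the check has limited power.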

[0368] Four T1 events were further evaluated in the T2 generation following the same evaluation procedure as for the T1 generation but with more individuals per event.

Drought Screen

[0369] Plants from T2 seeds are grown in potting soil under normal conditions until they approach the heading stage. They are then transferred to a "dry" section where irrigation is withheld. Humidity probes are inserted in randomly chosen pots to monitor the soil water content (SWC). When SWC goes below certain thresholds, the plants are automatically re-watered continuously until a normal level is reached again. The plants are then transferred back to normal conditions. The rest of the cultivation (plant maturation, seed harvest) is the same as for plants not grown under abiotic stress conditions. Growth and yield parameters are recorded as detailed for growth under normal conditions.
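The automatic re-watering rule described for the drought screen amounts to a simple two-threshold (hysteresis) controller. The following is a minimal sketch under stated assumptions: the function name, the averaging of probe readings, and both threshold values are hypothetical, since the source gives no numeric SWC thresholds.

```python
def irrigation_state(swc_readings, rewatering, low_threshold, normal_level):
    """Return True if irrigation should be on after this set of probe readings."""
    mean_swc = sum(swc_readings) / len(swc_readings)
    if not rewatering and mean_swc < low_threshold:
        return True   # SWC dropped below the threshold: start re-watering
    if rewatering and mean_swc >= normal_level:
        return False  # normal level reached again: stop re-watering
    return rewatering  # otherwise keep the current state

# Hypothetical thresholds, expressed as a fraction of saturated water content.
LOW_THRESHOLD, NORMAL_LEVEL = 0.20, 0.60

state = False
history = []
for reading in [0.50, 0.30, 0.18, 0.25, 0.45, 0.62]:
    state = irrigation_state([reading], state, LOW_THRESHOLD, NORMAL_LEVEL)
    history.append(state)
print(history)
```

Using two thresholds rather than one keeps irrigation running until the normal level is reached, which matches the "re-watered continuously" wording.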

Nitrogen Use Efficiency Screen

[0370] Rice plants from T2 seeds were grown in potting soil under normal conditions except for the nutrient solution. The pots were watered from transplantation to maturation with a specific nutrient solution containing a reduced nitrogen (N) content, usually 7 to 8 times less. The rest of the cultivation (plant maturation, seed harvest) was the same as for plants not grown under abiotic stress. Growth and yield parameters were recorded as detailed for growth under normal conditions.

Salt Stress Screen

[0371] Plants are grown on a substrate made of coco fibers and argex (3 to 1 ratio). A normal nutrient solution is used during the first two weeks after transplanting the plantlets in the greenhouse. After the first two weeks, 25 mM of salt (NaCl) is added to the nutrient solution, until the plants are harvested. Seed-related parameters are then measured.

17.2 Statistical Analysis: F Test

[0372] A two-factor ANOVA (analysis of variance) was used as a statistical model for the overall evaluation of plant phenotypic characteristics. An F test was carried out on all the parameters measured of all the plants of all the events transformed with the gene of the present invention. The F test was carried out to check for an effect of the gene over all the transformation events and to verify for an overall effect of the gene, also known as a global gene effect. The threshold for significance for a true global gene effect was set at a 5% probability level for the F test. A significant F test value points to a gene effect, meaning that it is not only the mere presence or position of the gene that is causing the differences in phenotype.
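The F statistic for a global gene effect can be sketched for a balanced two-factor design. This is an illustration only: the choice of factors (event, and transgenic versus nullizygous genotype), the fixed-effects error term, and the measurements are assumptions, as the source does not spell out the exact model.

```python
def gene_effect_f(data):
    """F statistic for the genotype main effect in a balanced two-factor layout.

    data maps event -> genotype -> list of measurements (same length per cell).
    Returns (F, df_gene, df_error).
    """
    events = list(data)
    genotypes = list(data[events[0]])
    n = len(data[events[0]][genotypes[0]])  # replicates per cell
    all_values = [x for e in events for g in genotypes for x in data[e][g]]
    grand_mean = sum(all_values) / len(all_values)

    # Between-genotype sum of squares (the "gene" main effect).
    ss_gene = 0.0
    for g in genotypes:
        values = [x for e in events for x in data[e][g]]
        g_mean = sum(values) / len(values)
        ss_gene += len(values) * (g_mean - grand_mean) ** 2

    # Within-cell (residual) sum of squares.
    ss_error = 0.0
    for e in events:
        for g in genotypes:
            cell = data[e][g]
            cell_mean = sum(cell) / len(cell)
            ss_error += sum((x - cell_mean) ** 2 for x in cell)

    df_gene = len(genotypes) - 1
    df_error = len(events) * len(genotypes) * (n - 1)
    f_value = (ss_gene / df_gene) / (ss_error / df_error)
    return f_value, df_gene, df_error

# Hypothetical measurements (e.g. seed weight per plant); not from the source.
example = {
    "event_A": {"transgenic": [12.0, 14.0], "nullizygote": [10.0, 10.0]},
    "event_B": {"transgenic": [13.0, 13.0], "nullizygote": [9.0, 11.0]},
}
f_value, df_gene, df_error = gene_effect_f(example)
print(f_value, df_gene, df_error)
```

The p-value at the 5% level would then come from the F distribution with `df_gene` and `df_error` degrees of freedom, for instance via a statistics library or a table of critical values.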

[0373] Because two experiments with overlapping events were carried out, a combined analysis was performed. This is useful to check consistency of the effects over the two experiments, and if this is the case, to accumulate evidence from both experiments in order to increase confidence in the conclusion. The method used was a mixed-model approach that takes into account the multilevel structure of the data (i.e. experiment-event-segregants). P values were obtained by comparing the likelihood ratio test statistic to chi-square distributions.
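The last step, turning a likelihood ratio into a P value, can be sketched as follows. The mixed-model fitting itself is not shown; the two log-likelihoods are hypothetical inputs, and one degree of freedom is assumed (a single gene-effect parameter separating the full and reduced models).

```python
import math

def likelihood_ratio_p_value(loglik_full, loglik_reduced):
    """Likelihood ratio statistic and chi-square (1 df) p-value."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # With 1 degree of freedom the chi-square upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods of models with and without the gene effect.
lr_stat, lr_p = likelihood_ratio_p_value(loglik_full=-120.0, loglik_reduced=-122.5)
print(lr_stat, round(lr_p, 4))
```

For models that differ by more than one parameter, the chi-square distribution with the matching degrees of freedom would be needed instead of the closed form used here.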

17.3 Parameters Measured

Biomass-Related Parameter Measurement

[0374] From the stage of sowing until the stage of maturity the plants were passed several times through a digital imaging cabinet. At each time point digital images (2048×1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles.

[0375] The plant aboveground area (or leafy biomass) was determined by counting the total number of pixels on the digital images from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from the different angles and was converted to a physical surface value expressed in square mm by calibration. Experiments show that the aboveground plant area measured this way correlates with the biomass of plant parts above ground. The above ground area is the area measured at the time point at which the plant had reached its maximal leafy biomass. The early vigour is the plant (seedling) aboveground area three weeks post-germination.

[0376] Early vigour was determined by counting the total number of pixels from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from different angles and was converted to a physical surface value expressed in square mm by calibration. The results described below are for plants three weeks post-germination.
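The pixel-to-area conversion described above can be sketched in a few lines. The calibration factor, the pixel counts, and the function name are hypothetical; the source states only that counts are averaged over the angles at one time point and converted to square mm by calibration.

```python
def aboveground_area_mm2(pixel_counts, mm2_per_pixel):
    """Average plant pixel counts over all angles and convert to square mm."""
    mean_pixels = sum(pixel_counts) / len(pixel_counts)
    return mean_pixels * mm2_per_pixel

# Hypothetical plant-pixel counts from 6 angles at one time point.
counts_week3 = [1000, 1100, 900, 1050, 950, 1000]
MM2_PER_PIXEL = 0.25  # hypothetical calibration factor

early_vigour = aboveground_area_mm2(counts_week3, MM2_PER_PIXEL)

# The reported aboveground area is the maximum over all imaging time points.
areas_over_time = [early_vigour, 900.0, 2400.0, 2300.0]  # hypothetical series
max_area = max(areas_over_time)
print(early_vigour, max_area)
```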

Seed-Related Parameter Measurements

[0377] The mature primary panicles were harvested, counted, bagged, barcode-labelled and then dried for three days in an oven at 37° C. The panicles were then threshed and all the seeds were collected and counted. The filled husks were separated from the empty ones using an air-blowing device. The empty husks were discarded and the remaining fraction was counted again. The filled husks were weighed on an analytical balance. The number of filled seeds was determined by counting the number of filled husks that remained after the separation step. The total seed yield was measured by weighing all filled husks harvested from a plant. Total seed number per plant was measured by counting the number of husks harvested from a plant. Thousand Kernel Weight (TKW) is extrapolated from the number of filled seeds counted and their total weight. The Harvest Index (HI) in the present invention is defined as the ratio between the total seed yield and the above ground area (mm²), multiplied by a factor of 10⁶. The total number of flowers per panicle as defined in the present invention is the ratio between the total number of seeds and the number of mature primary panicles. The seed fill rate as defined in the present invention is the proportion (expressed as a %) of the number of filled seeds over the total number of seeds (or florets). Increase in root biomass is expressed as root thickness, which is the maximum biomass of roots above a certain thickness threshold observed during the lifespan of a plant (obtained by a root-imaging system).
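The derived seed parameters defined above reduce to a few ratios. The sketch below follows those definitions; the input values are hypothetical, and grams are assumed as the weight unit because the source does not name one.

```python
def seed_parameters(filled_seeds, total_seeds, filled_weight_g,
                    panicles, aboveground_area_mm2):
    """Compute the derived seed-related parameters for one plant."""
    return {
        # Weight of 1000 filled seeds, extrapolated from count and total weight.
        "tkw_g": filled_weight_g / filled_seeds * 1000.0,
        # Total seed yield over aboveground area (mm^2), times 10^6.
        "harvest_index": filled_weight_g / aboveground_area_mm2 * 1e6,
        # Total number of seeds per mature primary panicle.
        "flowers_per_panicle": total_seeds / panicles,
        # Filled seeds as a percentage of all seeds (florets).
        "fill_rate_pct": 100.0 * filled_seeds / total_seeds,
    }

# Hypothetical measurements for a single plant (not from the source).
params = seed_parameters(filled_seeds=400, total_seeds=500,
                         filled_weight_g=10.0, panicles=5,
                         aboveground_area_mm2=50000.0)
print(params)
```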

Example 18

Results of the Phenotypic Evaluation of the Transgenic Plants

[0378] The transgenic rice plants expressing the ASNS nucleic acid represented by SEQ ID NO: 62 under control of the GOS2 promoter, whether grown under non-stress conditions or under conditions of reduced nitrogen availability, showed an increase of more than 5% for at least one of the following parameters: early vigour, total weight of seeds, number of filled seeds, fill rate, number of flowers per panicle, Harvest Index, total number of seeds and root thickness. For Thousand Kernel Weight the observed increase was at least 3%.

Sequence CWU 1

1

12211807DNAArabidopsis thaliana 1ggcttaaaca atgacttctt tctctctcac tttcacatct cctctcctcc cttcctcctc 60caccaaaccc aaaagatccg tccttgtcgc cgccgctcag accacagctc cggccgaatc 120caccgcctct gttgacgcag atcgtctcga gccaagagtt gagttgaaag atggtttttt 180tattctcaag gagaagtttc gaaaagggat caatcctcag gagaaggtta agatcgagag 240agagcccatg aagttgttta tggagaatgg tattgaagag cttgctaaga aatctatgga 300agagcttgat agtgaaaagt cttctaaaga tgatattgat gttagactca agtggcttgg 360tctctttcac cgtagaaagc atcagtatgg gaagtttatg atgaggttga agttaccaaa 420tggtgtgact acaagtgcac agactcggta tttagcgagt gtgattagga agtatggtga 480agatgggtgt gctgatgtga ctactagaca gaattggcag atccgtggtg ttgtgttgcc 540tgatgtgcct gagatcttga aaggtcttgc ttctgttggt ttaacgagtc ttcaaagtgg 600tatggataac gtgaggaacc cggttgggaa tcctatagct gggattgatc cggaggagat 660tgttgacacg aggccttaca cgaatctcct ttcgcagttt atcaccgcta attcacaagg 720aaaccccgat ttcaccaact tgccaagaaa gtggaatgtg tgtgtggtgg ggactcatga 780tctctatgag catccacata tcaatgattt ggcctacatg cctgctaata aagatggacg 840gtttggattc aatttgcttg tgggaggatt ctttagtccc aaaagatgtg aagaagcgat 900tcctcttgat gcttgggtcc ctgctgatga cgttcttcca ctctgcaaag ctgttctaga 960ggcttacaga gatcttggaa ctcgaggaaa ccgacagaag acaagaatga tgtggcttat 1020cgacgaactt ggtgttgaag gatttagaac tgaggtagag aagagaatgc caaatgggaa 1080actcgagaga ggatcttcag aggatcttgt gaacaaacag tgggagagga gagactattt 1140cggagtcaac cctcagaaac aagaaggtct tagcttcgtg gggcttcacg ttccggttgg 1200taggctacaa gctgatgaca tggatgagct tgctcggtta gctgatacct acgggtcagg 1260tgagctaaga ctcacagtag agcaaaacat catcatccca aatgtagaaa cctcgaaaac 1320cgaagctttg cttcaagagc cgtttctcaa gaaccgtttc tcccctgaac catctatcct 1380aatgaaaggc ttagttgctt gtaccggtag ccagttctgc ggacaagcga taatcgagac 1440taagctaaga gctttaaaag tgacagaaga agtagagaga cttgtatctg tgccaagacc 1500gataaggatg cattggacag gatgtcccaa tacttgcgga caagtccaag tagcagatat 1560cggattcatg ggatgcttaa cacgaggcga ggaaggaaag ccagtcgagg gtgctgacgt 1620gtacgtcggg ggacgaatag gaagtgactc gcatatcgga gagatctata agaaaggtgt 1680tcgtgtcacg gagttggttc 
cattggtggc tgagattctg atcaaagaat ttggtgctgt 1740gcctagagaa agagaagaga atgaagattg attcaaaagc tattgaccca gctttcttgt 1800acaaagt 18072586PRTArabidopsis thaliana 2Met Thr Ser Phe Ser Leu Thr Phe Thr Ser Pro Leu Leu Pro Ser Ser 1 5 10 15 Ser Thr Lys Pro Lys Arg Ser Val Leu Val Ala Ala Ala Gln Thr Thr 20 25 30 Ala Pro Ala Glu Ser Thr Ala Ser Val Asp Ala Asp Arg Leu Glu Pro 35 40 45 Arg Val Glu Leu Lys Asp Gly Phe Phe Ile Leu Lys Glu Lys Phe Arg 50 55 60 Lys Gly Ile Asn Pro Gln Glu Lys Val Lys Ile Glu Arg Glu Pro Met 65 70 75 80 Lys Leu Phe Met Glu Asn Gly Ile Glu Glu Leu Ala Lys Lys Ser Met 85 90 95 Glu Glu Leu Asp Ser Glu Lys Ser Ser Lys Asp Asp Ile Asp Val Arg 100 105 110 Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys His Gln Tyr Gly Lys 115 120 125 Phe Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser Ala Gln 130 135 140 Thr Arg Tyr Leu Ala Ser Val Ile Arg Lys Tyr Gly Glu Asp Gly Cys 145 150 155 160 Ala Asp Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val Val Leu 165 170 175 Pro Asp Val Pro Glu Ile Leu Lys Gly Leu Ala Ser Val Gly Leu Thr 180 185 190 Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly Asn Pro 195 200 205 Ile Ala Gly Ile Asp Pro Glu Glu Ile Val Asp Thr Arg Pro Tyr Thr 210 215 220 Asn Leu Leu Ser Gln Phe Ile Thr Ala Asn Ser Gln Gly Asn Pro Asp 225 230 235 240 Phe Thr Asn Leu Pro Arg Lys Trp Asn Val Cys Val Val Gly Thr His 245 250 255 Asp Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala 260 265 270 Asn Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe 275 280 285 Ser Pro Lys Arg Cys Glu Glu Ala Ile Pro Leu Asp Ala Trp Val Pro 290 295 300 Ala Asp Asp Val Leu Pro Leu Cys Lys Ala Val Leu Glu Ala Tyr Arg 305 310 315 320 Asp Leu Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu 325 330 335 Ile Asp Glu Leu Gly Val Glu Gly Phe Arg Thr Glu Val Glu Lys Arg 340 345 350 Met Pro Asn Gly Lys Leu Glu Arg Gly Ser Ser Glu Asp Leu Val Asn 355 360 365 Lys Gln Trp Glu Arg Arg Asp Tyr Phe Gly Val Asn Pro Gln Lys Gln 370 375 380 Glu Gly Leu Ser 
Phe Val Gly Leu His Val Pro Val Gly Arg Leu Gln 385 390 395 400 Ala Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Thr Tyr Gly Ser 405 410 415 Gly Glu Leu Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Val 420 425 430 Glu Thr Ser Lys Thr Glu Ala Leu Leu Gln Glu Pro Phe Leu Lys Asn 435 440 445 Arg Phe Ser Pro Glu Pro Ser Ile Leu Met Lys Gly Leu Val Ala Cys 450 455 460 Thr Gly Ser Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys Leu Arg 465 470 475 480 Ala Leu Lys Val Thr Glu Glu Val Glu Arg Leu Val Ser Val Pro Arg 485 490 495 Pro Ile Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Gly Gln Val 500 505 510 Gln Val Ala Asp Ile Gly Phe Met Gly Cys Leu Thr Arg Gly Glu Glu 515 520 525 Gly Lys Pro Val Glu Gly Ala Asp Val Tyr Val Gly Gly Arg Ile Gly 530 535 540 Ser Asp Ser His Ile Gly Glu Ile Tyr Lys Lys Gly Val Arg Val Thr 545 550 555 560 Glu Leu Val Pro Leu Val Ala Glu Ile Leu Ile Lys Glu Phe Gly Ala 565 570 575 Val Pro Arg Glu Arg Glu Glu Asn Glu Asp 580 585 33246DNAOryza sativa 3aatccgaaaa gtttctgcac cgttttcacc ccctaactaa caatataggg aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact agaactatgc aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc ttagaatata cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag atttttttta aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata taattatata attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct tactccatcc caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc ttttttcgat tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag tgcacctcct caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa tacatttagg tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata atatctaaaa tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca tattgggcac acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg atctaacgga 
ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg cggccaggag agaggaggag aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg gtcgatctct tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta tctgtgatga ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc ggttcggttt gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag caccggtgat tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg atgcttctcg atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca actttgaaga cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata atcctcagga acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa tggtgcaaat caggtctata 1800tgattgattt tgggctggcc aagaagtata gagactcatc aactcatcag catattccgt 1860atagagaaaa caaaaatttg acaggaactg ctagatacgc aagcatgaat actcatcttg 1920gcattgaaca aagtcgaagg gatgatttgg aatcgctggg ttatgtttta atgtacttct 1980taagaggaag tctcccttgg caggggctga aagcaggcac taagaaacag aagtatgaga 2040agatcagtga gaagaaagta tcaacatcaa tagagacctt gtgtagggga tatcctgcag 2100agtttgcatc atattttcat tactgtcgat cactaagatt tgatgataaa ccagattatg 2160cttatctgaa gagaattttc cgtgatcttt tcattcgtga agggtttcaa tttgattata 2220tatttgactg gaccattttg aaatatcagc aatcacagct tgccaatcct ccatctcgtg 2280ctcttggtgg tactgctggg ccaagctcag ggatgcctca tgctcttgtt aatgttgaga 2340ggcaatcagg tggagatgaa ggtcgaccaa ctggttggtc ttcatcaaat cttacacgta 2400ataagagcac ggggctgcat ttcaattctg gaagcttatt gaagcaaaaa ggcacagttg 2460ctaatgattt atccatgggt aaagagttat ccagttctaa ttttttccgg tcaagtggac 2520cattgaggcg tccagttgtc tctagcatcc gagacccagt gattgcaggg ggtgaacctg 2580acccctccgg cactctgaca 
aaagatgcaa gcccgggacc attgcgtaaa gtatccagtg 2640ctgcacggag gagttcacca gttgtgtcct cagatcacaa gcgcagctcc tctatcaaaa 2700atgccaacat aaagaattta gagtccaccg tcaagggaat agagggttta agttttcgat 2760gatgagggac tgcattagta gctgtgcttt gtctcagttc tccgttcact gtaaattttg 2820gcacaccaac ttggggagta agagttctga tattagttgc tgtcaggaag taccataaag 2880ctgaattata caattaaaat ttgggatcca atcgcaaaag cacattaagg atatgatggg 2940gttgcagatc caaactcaca gattccagtt tatgctcgtc catacagtta taggcacttt 3000ccatattctt ttctttaatc tctgtctctt gcttgttatt gttatgtcgt ggtattcttg 3060ttgaggtcat gtttgtgaat tgcgaagatg gtcatgtata attgccgaga aatcatgtac 3120tagtttgttt taaacatgag caaactgtta ttttgttcaa gctactttaa tatcaaaaaa 3180aaaaaaaaaa gggcggccgc tctagagtat ccctcgaggg gcccaagctt acgcgtaccc 3240agcttt 3246457DNAArtificial sequenceprimer prm07073 4ggggacaagt ttgtacaaaa aagcaggctt aaacaatgac ttctttctct ctcactt 57550DNAArtificial sequenceprimer prm07074 5ggggaccact ttgtacaaga aagctgggtc aatagctttt gaatcaatct 506600PRTAquilegia formosa 6Ser Lys Asn Glu Leu Cys Arg Leu Ser Ser Thr Phe Leu Ser Thr Met 1 5 10 15 Ala Ser Leu Gln Phe Leu Ala Pro Ser Ser Ser Pro Leu Gln Ser Asn 20 25 30 Arg Leu Met Val Arg Ala Thr Ser Ser Thr Ser Pro Ser Val Asn Gln 35 40 45 Thr Met Val Ala Pro Asp Leu Ser Arg Leu Glu Pro Arg Val Glu Glu 50 55 60 Arg Glu Gly Gly Tyr Trp Val Leu Lys Glu Lys Tyr Arg Glu Lys Ile 65 70 75 80 Asn Pro Gln Glu Lys Ile Lys Ile Glu Lys Glu Pro Met Lys Phe Val 85 90 95 Thr Glu Gly Gly Ile His Glu Leu Ala Lys Thr Pro Phe Glu Glu Leu 100 105 110 Glu Lys Ala Lys Leu Thr Lys Asp Asp Ile Asp Val Arg Leu Lys Trp 115 120 125 Leu Gly Leu Phe His Arg Arg Lys Asn His Tyr Gly Arg Phe Met Met 130 135 140 Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser Glu Gln Thr Arg Tyr 145 150 155 160 Leu Ala Ser Val Ile Arg Arg Tyr Gly Lys Asp Gly Cys Ala Asp Val 165 170 175 Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val Glu Leu Pro His Val 180 185 190 Pro Glu Ile Met Lys Gly Leu Asn Gln Val Gly Leu Thr Ser Leu Gln 195 200 205 Ser Gly Met Asp Asn Val Arg 
Asn Pro Val Gly Asn Pro Leu Ala Gly 210 215 220 Ile Asp Pro Leu Glu Ile Val Asp Thr Arg Pro Tyr Asn Asp Gln Leu 225 230 235 240 Ser Arg Phe Ile Thr Gly Asn Phe Lys Gly Asn Leu Ala Phe Thr Asn 245 250 255 Leu Pro Arg Lys Trp Asn Val Cys Val Val Gly Ser His Asp Leu Phe 260 265 270 Glu His Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala Thr Lys Asn 275 280 285 Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe Ser Pro Lys 290 295 300 Arg Cys Ala Glu Ala Ile Pro Leu Asp Ala Trp Val Ser Gly Glu Asp 305 310 315 320 Val Ile Pro Val Cys Lys Ala Ile Leu Glu Ala Tyr Arg Asp Leu Gly 325 330 335 Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu Ile Asp Glu 340 345 350 Leu Gly Val Glu Gly Phe Arg Ser Glu Val Val Lys Arg Met Pro Glu 355 360 365 Gln Glu Leu Glu Arg Ser Ser Thr Glu Glu Leu Val Gln Lys Gln Trp 370 375 380 Glu Arg Arg Asp Leu Ile Gly Val His Ala Gln Lys Gln Ala Gly Tyr 385 390 395 400 Ser Phe Val Gly Leu His Ile Pro Val Gly Arg Leu Gln Ala Asp Asp 405 410 415 Met Asp Glu Leu Ala Arg Ile Ala Asp Glu Tyr Gly Ser Gly Glu Leu 420 425 430 Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Val Glu Asn Ser 435 440 445 Arg Val Glu Ala Leu Leu Lys Glu Ala Leu Leu Arg Asp Arg Phe Ser 450 455 460 Pro Thr Pro Pro Leu Leu Met Lys Gly Leu Val Ala Cys Thr Gly Asn 465 470 475 480 Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys Ala Arg Ala Leu Lys 485 490 495 Val Thr Glu Glu Val Glu Arg Leu Val Ala Val Thr Lys Pro Val Arg 500 505 510 Met His Trp Thr Gly Cys Pro Asn Thr Cys Ala Gln Val Gln Val Ala 515 520 525 Asp Ile Gly Phe Met Gly Cys Met Ala Arg Asp Glu Asn Gly Lys Pro 530 535 540 Cys Glu Gly Ala Asp Val Tyr Leu Gly Gly Arg Ile Gly Ser Asp Ser 545 550 555 560 His Leu Gly Asp Ile Tyr Lys Lys Ser Val Pro Cys Lys Asp Leu Val 565 570 575 Pro Leu Val Val Asp Ile Leu Ile Glu Arg Phe Gly Ala Val Pro Arg 580 585 590 Glu Arg Glu Glu Asp Gly Glu Asp 595 600 7583PRTBetula pendula 7Met Ser Ser Leu Ser Val Arg Phe Leu Ser Pro Pro Leu Phe Ser Ser 1 5 10 15 Thr Pro Ala Trp Pro Arg Thr Gly Leu Ala Ala 
Thr Gln Ala Val Pro 20 25 30 Pro Val Val Ala Glu Val Asp Ala Gly Arg Leu Glu Pro Arg Val Glu 35 40 45 Glu Arg Glu Gly Tyr Trp Val Leu Lys Glu Lys Phe Arg Glu Gly Ile 50 55 60 Asn Pro Gln Glu Lys Leu Lys Leu Glu Arg Glu Pro Met Lys Leu Phe 65 70 75 80 Met Glu Gly Gly Ile Glu Asp Leu Ala Lys Met Ser Leu Glu Glu Ile 85 90 95 Asp Lys Asp Lys Ile Ser Lys Ser Asp Ile Asp Val Arg Leu Lys Trp 100 105 110 Leu Gly Leu Phe His Arg Arg Lys His His Tyr Gly Arg Phe Met Met 115 120 125 Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser Ala Gln Thr Arg Tyr 130 135 140 Leu Ala Ser Val Ile Arg Lys Tyr Gly Lys Asp Gly Cys Ala Asp Val 145 150 155 160 Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val Val Leu Ser Asp Val 165 170 175 Pro Glu Ile Leu Lys Gly Leu Asp Glu Val Gly Leu Thr Ser Leu Gln 180 185 190 Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly Asn Pro Leu Ala Gly 195 200 205 Ile Asp Ile His Glu Ile Val Ala Thr Arg Pro Tyr Asn Asn Leu Leu 210 215 220 Ser Gln Phe Ile Thr Ala Asn Ser Arg Gly Asn Leu Ala Phe Thr Asn 225 230 235 240 Leu Pro Arg Lys Trp Asn Val Cys Val Val Gly Ser His Asp Leu Phe 245 250 255 Glu His Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala Ile Lys Asp 260 265 270 Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe Ser Pro Arg 275

280 285 Arg Cys Ala Glu Ala Val Pro Leu Asp Ala Trp Val Ser Ala Asp Asp 290 295 300 Ile Ile Leu Val Cys Lys Ala Ile Leu Glu Ala Tyr Arg Asp Leu Gly 305 310 315 320 Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu Ile Asp Glu 325 330 335 Leu Gly Ile Glu Gly Phe Arg Ser Glu Val Val Lys Arg Met Pro Asn 340 345 350 Gln Glu Leu Glu Arg Ala Ala Pro Glu Asp Leu Ile Glu Lys Gln Trp 355 360 365 Glu Arg Arg Glu Leu Ile Gly Val His Pro Gln Lys Gln Glu Gly Leu 370 375 380 Ser Tyr Val Gly Leu His Ile Pro Val Gly Arg Val Gln Ala Asp Asp 385 390 395 400 Met Asp Glu Leu Ala Arg Leu Ala Asp Thr Tyr Gly Cys Gly Glu Leu 405 410 415 Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Ile Glu Asn Ser 420 425 430 Lys Leu Glu Ala Leu Leu Gly Glu Pro Leu Leu Lys Asp Arg Phe Ser 435 440 445 Pro Glu Pro Pro Ile Leu Met Lys Gly Leu Val Ala Cys Thr Gly Asn 450 455 460 Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys Ala Arg Ala Leu Lys 465 470 475 480 Val Thr Glu Glu Val Gln Arg Gln Val Ala Val Thr Arg Pro Val Arg 485 490 495 Met His Trp Thr Gly Cys Pro Asn Ser Cys Gly Gln Val Gln Val Ala 500 505 510 Asp Ile Gly Phe Met Gly Cys Met Ala Arg Asp Glu Asn Gly Lys Pro 515 520 525 Cys Glu Gly Ala Ala Val Phe Leu Gly Gly Arg Ile Gly Ser Asp Ser 530 535 540 His Leu Gly Asn Leu Tyr Lys Lys Gly Val Pro Cys Lys Asn Leu Val 545 550 555 560 Pro Leu Val Val Asp Ile Leu Val Lys His Phe Gly Ala Val Pro Arg 565 570 575 Glu Arg Glu Glu Ser Glu Asp 580 8612PRTCapsicum annuum 8Met Thr Ala Thr Ile Ile Thr Thr Leu Asn Asn Gln Glu Ser Thr Lys 1 5 10 15 Phe Leu Asn Ser Lys Phe Gly Glu Met Ala Ser Phe Ser Val Lys Phe 20 25 30 Ser Ala Thr Ser Ser Leu Thr Ser Ser Lys Arg Phe Ser Lys Leu His 35 40 45 Ala Thr Pro Pro Gln Thr Val Ala Val Pro Pro Ser Gly Ala Val Glu 50 55 60 Val Ala Ala Glu Arg Leu Glu Pro Arg Leu Glu Glu Arg Asp Gly Tyr 65 70 75 80 Trp Val Leu Lys Glu Lys Phe Arg Lys Gly Ile Asn Pro Ala Glu Lys 85 90 95 Ala Lys Ile Glu Lys Glu Pro Met Lys Leu Phe Thr Glu Asn Gly Ile 100 105 110 Glu Asp Ile Ala Lys Ile Ser Leu 
Glu Glu Ile Glu Lys Ser Lys Leu 115 120 125 Ala Lys Asp Asp Ile Asp Val Arg Leu Lys Trp Leu Gly Leu Phe His 130 135 140 Arg Arg Lys His Gln Tyr Gly Arg Phe Met Met Arg Leu Lys Leu Pro 145 150 155 160 Asn Gly Ile Thr Thr Ser Ala Gln Thr Arg Tyr Leu Ala Ser Val Ile 165 170 175 Arg Lys Tyr Gly Lys Asp Gly Cys Ala Asp Val Thr Thr Arg Gln Asn 180 185 190 Trp Gln Ile Arg Gly Val Val Leu Pro Asp Val Pro Glu Ile Leu Lys 195 200 205 Gly Leu Asp Glu Val Gly Leu Thr Ser Leu Gln Ser Gly Met Asp Asn 210 215 220 Val Arg Asn Pro Val Gly Asn Pro Leu Ala Gly Ile Asp Pro Gln Glu 225 230 235 240 Ile Val Asp Thr Arg Pro Tyr Ala Asn Leu Leu Ser Asn Leu Leu Ser 245 250 255 Gln Tyr Val Thr Ala Asn Phe Arg Gly Asn Leu Ser Val His Asn Leu 260 265 270 Pro Arg Lys Trp Asn Val Cys Val Ile Gly Ser His Asp Leu Tyr Glu 275 280 285 His Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala Thr Lys Asp Gly 290 295 300 Arg Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe Ser Pro Lys Arg 305 310 315 320 Cys Ala Glu Ala Ile Pro Leu Asp Ala Trp Val Pro Ala Asp Asp Val 325 330 335 Val Pro Val Cys Lys Thr Ile Leu Glu Ala Tyr Arg Asp Leu Gly Thr 340 345 350 Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu Ile Asp Glu Leu 355 360 365 Gly Val Glu Gly Phe Arg Ala Glu Val Val Lys Arg Met Pro Gln Lys 370 375 380 Lys Leu Glu Arg Glu Ser Thr Glu Asp Leu Val Gln Lys Gln Trp Glu 385 390 395 400 Arg Arg Glu Tyr Leu Gly Val Asn Pro Gln Lys Gln Glu Gly Tyr Ser 405 410 415 Phe Val Gly Leu His Ile Pro Val Gly Arg Val Gln Ala Asp Asp Met 420 425 430 Asp Glu Leu Ala Arg Leu Ala Glu Glu Tyr Gly Ser Gly Glu Leu Arg 435 440 445 Leu Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Ile Glu Asn Ser Lys 450 455 460 Ile Asp Ala Leu Leu Asn Glu Pro Leu Leu Lys Gln Ile Ser Pro Asp 465 470 475 480 Pro Pro Ile Leu Met Arg Asn Leu Val Ala Cys Thr Gly Asn Gln Phe 485 490 495 Cys Gly Gln Ala Ile Ile Glu Thr Lys Ala Arg Ser Met Lys Ile Thr 500 505 510 Glu Glu Val Gln Arg Leu Val Ser Val Thr Gln Pro Val Arg Met His 515 520 525 Trp Thr Gly Cys Pro Asn Ser Cys Gly 
Gln Val Gln Val Ala Asp Ile 530 535 540 Gly Phe Met Gly Cys Leu Thr Arg Lys Glu Gly Lys Thr Val Glu Gly 545 550 555 560 Ala Asp Val Phe Leu Gly Gly Arg Ile Gly Thr Asp Ser His Leu Gly 565 570 575 Asp Ile Tyr Lys Lys Ser Val Pro Cys Glu Asp Leu Val Pro Ile Ile 580 585 590 Val Asp Leu Leu Val Asn Asn Phe Gly Ala Val Pro Arg Glu Arg Glu 595 600 605 Glu Ala Glu Asp 610 9612PRTChlamydomonas reinhardtii 9Met Leu Leu His Ala Pro His Val Lys Pro Leu Gly Gln Arg Ser Ser 1 5 10 15 Ile Arg Arg Gly Asn Leu Val Val Ala Asn Val Ala Cys Thr Ala Gly 20 25 30 Lys Asn Pro Thr Ser Arg Pro Ala Lys Arg Ser Lys Val Glu Phe Ile 35 40 45 Lys Glu Asn Ser Asp His Leu Arg His Pro Leu Met Glu Glu Leu Val 50 55 60 Asn Asp Glu Thr Phe Ile Thr Glu Asp Ser Val Gln Leu Met Lys Phe 65 70 75 80 His Gly Ser Tyr Gln Gln Asp Asn Arg Glu Lys Arg Ala Phe Gly Gln 85 90 95 Gly Lys Ala Tyr Ser Phe Leu Met Arg Thr Arg Gln Pro Ala Gly Val 100 105 110 Val Pro Asn Arg Leu Tyr Leu Val Met Asp Asp Leu Ala Asp Gln Phe 115 120 125 Gly Asn Gly Thr Leu Arg Leu Thr Thr Arg Gln Ala Tyr Gln Leu His 130 135 140 Gly Val Leu Lys Lys Asp Leu Lys Thr Val Phe Ser Ser Val Ile Lys 145 150 155 160 Asn Met Gly Ser Thr Leu Ala Ala Cys Gly Asp Val Asn Arg Asn Val 165 170 175 Met Gly Pro Ala Ala Pro Phe Thr Asn Arg Pro Asp Tyr Leu Ala Ala 180 185 190 Gln Lys Ala Ala Leu Asp Leu Ala Asp Leu Leu Thr Pro Gln Ser Gly 195 200 205 Ala Tyr Tyr Asp Val Trp Leu Asp Gly Glu Lys Phe Met Ser Ser Tyr 210 215 220 Lys Glu Asp Pro Ala Val Thr Glu Ala Arg Ala Phe Asn Gly Phe Gly 225 230 235 240 Thr Asn Phe Asp Asn Ser Pro Glu Pro Ile Tyr Gly Ser Gln Tyr Leu 245 250 255 Pro Arg Lys Phe Lys Ile Ala Thr Thr Val Pro Gly Asp Asn Ser Val 260 265 270 Asp Leu Phe Thr Gln Asp Leu Gly Val Val Val Gln Gly Tyr Asn Leu 275 280 285 Tyr Val Gly Gly Gly Gln Gly Arg Ser His Arg Asp Ala Asp Thr Phe 290 295 300 Pro Arg Leu Ala Asp Pro Leu Gly Tyr Val Ala Ala Ala Asp Leu Phe 305 310 315 320 Ala Ala Ala Lys Ala Val Val Ala Val Phe Arg Asp Tyr Gly Arg Arg 325 330 335 Asp 
Asn Arg Lys Gln Ala Arg Thr Arg His Met Leu Ala Glu Trp Gly 340 345 350 Val Asp Lys Phe Arg Ser Val Ala Glu Gln Tyr Leu Gly Lys Arg Phe 355 360 365 Gln Glu Pro Val Pro Leu Pro Pro Trp Gln Tyr Lys Asp Tyr Leu Gly 370 375 380 Trp Gly Glu Gln Gly Asp Gly Arg Leu Tyr Cys Gly Val Tyr Val Gln 385 390 395 400 Asn Gly Arg Ile Lys Gly Glu Ala Lys Arg Ala Leu Arg Ala Ala Ile 405 410 415 Glu Arg Tyr Ser Leu Pro Val Val Leu Thr Pro His Gln Asn Leu Val 420 425 430 Leu Arg Asp Val Arg Pro Glu Asp Arg Glu Asp Ile Glu Gln Leu Leu 435 440 445 Arg Ala Gly Gly Val Lys Glu Leu Val Glu Trp Asp Gly Leu Asp Arg 450 455 460 Leu Ser Met Ala Cys Pro Ala Leu Pro Leu Cys Gly Leu Ala Val Thr 465 470 475 480 Glu Ala Glu Arg Ala Leu Pro Asp Val Asn Thr Arg Ile Arg Ala Met 485 490 495 Leu Thr Arg Ala Gly Leu Pro Pro Ser Gln Pro Leu His Val Arg Met 500 505 510 Thr Gly Cys Pro Asn Gly Cys Val Arg Pro Tyr Met Ala Glu Leu Gly 515 520 525 Leu Val Gly Asp Gly Pro Asn Ser Tyr Gln Leu Trp Leu Gly Gly Gly 530 535 540 Pro Ala Gln Thr Arg Leu Ala Gln Pro Tyr Ala Glu Arg Val Lys Val 545 550 555 560 Lys Asp Leu Glu Ser Thr Leu Glu Pro Leu Phe Gly Ala Trp Arg Ala 565 570 575 Gly Arg Gln Pro Asp Glu Ala Phe Gly Asp Trp Val Ala Arg Leu Gly 580 585 590 Phe Asp Ala Val Arg Gln Gln Ala Ala Ala Ala Ala Ala Ala Ala Pro 595 600 605 Val Gly Thr Ala 610 10589PRTChlamydomonas reinhardtii 10Met Gln Ser Arg Gln Cys Leu Asn Arg Lys Ala Ser Gly Ala Arg Pro 1 5 10 15 Cys Ala Asn Ser Arg Ser Leu Thr Ala Arg Val Leu Ala Thr Ala Ala 20 25 30 Pro Val Ala Pro Ser Ala Thr Pro Ala Ser Ala Pro Leu Pro Leu Pro 35 40 45 Asp Gly Val Gly Glu His Ser Gly Leu Lys His Leu Pro Glu Ala Ala 50 55 60 Arg Thr Arg Ala Leu Asp Lys Lys Ala Asn Lys Phe Glu Lys Val Lys 65 70 75 80 Val Glu Lys Cys Gly Ser Arg Ala Trp Asn Asp Val Phe Glu Leu Ser 85 90 95 Ser Leu Leu Lys Glu Gly Lys Thr Lys Trp Glu Asp Leu Asn Leu Asp 100 105 110 Asp Val Asp Ile Arg Leu Lys Trp Ala Gly Leu Phe His Arg Gly Lys 115 120 125 Arg Thr Pro Gly Lys Phe Met Met Arg Leu Lys Val 
Pro Asn Gly Glu 130 135 140 Leu Thr Ala Ala Gln Leu Arg Phe Leu Ala Ser Ser Ile Ala Pro Tyr 145 150 155 160 Gly Ala Asp Gly Cys Ala Asp Ile Thr Thr Arg Ala Asn Ile Gln Leu 165 170 175 Arg Gly Val Thr Met Glu Asp Ser Glu Thr Val Ile Lys Gly Leu Trp 180 185 190 Asp Val Gly Leu Thr Ser Phe Gln Ser Gly Met Asp Ser Val Arg Asn 195 200 205 Leu Thr Gly Asn Pro Ile Ala Gly Val Asp Pro His Glu Leu Val Asp 210 215 220 Thr Arg Pro Leu Leu Arg Asp Met Glu Ala Met Leu Phe Asn Asn Gly 225 230 235 240 Lys Gly Arg Glu Glu Phe Ala Asn Leu Pro Arg Lys Leu Asn Ile Cys 245 250 255 Ile Ser Ser Thr Arg Asp Asp Phe Pro His Thr His Ile Asn Asp Val 260 265 270 Gly Tyr Glu Ala Val Ala Lys Pro Asn Gly Glu Val Val Tyr Asn Val 275 280 285 Val Val Gly Gly Tyr Phe Ser Ile Lys Arg Asn Ile Met Ser Ile Pro 290 295 300 Leu Gly Cys Ser Ile Thr Gln Asp Gln Leu Met Pro Phe Thr Glu Ala 305 310 315 320 Leu Leu Arg Val Phe Arg Asp His Gly Pro Arg Gly Asp Arg Gln Gln 325 330 335 Thr Arg Leu Met Trp Leu Val Glu Ala Val Gly Val Asp Lys Phe Arg 340 345 350 Gln Leu Leu Ser Glu Tyr Met Gly Gly Ala Thr Phe Gly Glu Pro Val 355 360 365 His Val His His Asp Gln Pro Trp Glu Arg Arg Asn Leu Leu Gly Val 370 375 380 His Arg Gln Arg Gln Ala Gly Leu Asn Trp Val Gly Ala Cys Val Pro 385 390 395 400 Ala Gly Arg Leu His Ala Ala Asp Phe Glu Glu Ile Ala Ala Val Ala 405 410 415 Glu Lys Tyr Gly Asp Gly Thr Val Arg Ile Thr Cys Glu Glu Asn Val 420 425 430 Ile Phe Thr Asn Val Pro Asp Ala Lys Leu Glu Ala Met Lys Ala Glu 435 440 445 Pro Leu Phe Gln Arg Phe Pro Ile Phe Pro Gly Val Leu Leu Ser Gly 450 455 460 Met Val Ser Cys Thr Gly Asn Gln Phe Cys Gly Phe Gly Leu Ala Glu 465 470 475 480 Thr Lys Ala Lys Ala Val Lys Val Val Glu Ala Leu Asp Ala Gln Leu 485 490 495 Glu Leu Ser Arg Pro Val Arg Ile His Phe Thr Gly Cys Pro Asn Ser 500 505 510 Cys Gly Gln Ala Gln Val Gly Asp Ile Gly Leu Met Gly Ala Pro Ala 515 520 525 Lys His Glu Gly Lys Ala Val Glu Gly Tyr Lys Ile Phe Leu Gly Gly 530 535 540 Lys Ile Gly Glu Asn Pro Ala Leu Ala Thr Glu Phe Ala 
Gln Gly Val 545 550 555 560 Pro Ala Ile Glu Ser Val Leu Val Pro Arg Leu Lys Glu Ile Leu Ile 565 570 575 Ser Glu Phe Gly Ala Lys Glu Arg Ala Thr Ala Thr Ala 580 585 11469PRTChlamydomonas reinhardtii 11Met Leu Leu Lys Gly Ile Thr Thr Pro Met Leu Gly Gln Gln Arg Pro 1 5 10 15 Thr Arg Gly Gln Leu His Val Val Asn Val Ala Thr Pro Ser Lys Asn 20 25 30 Pro Ser Ser Arg Leu Ala Lys Arg Ser Lys Val Glu Ile Ile Lys Glu 35 40 45 Lys Ser Asp Tyr Leu Arg His Pro Leu Met Glu Glu Leu Val Asn Asp 50 55 60 Ala Thr Phe Ile Thr Glu Asp Ser Val Gln Leu Met Lys Phe His Gly 65 70 75 80 Ser Tyr Gln Gln Asp His Arg Glu Lys Arg Ala Phe Gly Gln Gly Lys 85 90 95 Ala Tyr Cys Phe Met Met Arg Thr Arg Gln Pro Ala Gly Val Val Pro 100 105 110 Asn Arg Leu Tyr Leu Val Met Asp Asp Leu Ala Asp Gln Tyr Gly Asn 115 120 125 Gly Thr Leu Arg Leu Thr Thr Arg Gln Ala Tyr Gln Leu His Gly Val 130 135 140 Leu Lys Lys Asp Leu Lys Thr Val Phe Ser Ser Val Ile Lys Asn Met 145 150 155 160 Gly Ser Thr Leu Ala Ala Cys Gly Asp Val Asn Arg Asn Val Met Gly 165 170 175 Pro Ser Ala Pro
Phe Thr Asn Arg Pro Asp Tyr Val Ala Ala Gln Lys 180 185 190 Ala Ala Asn Asp Ile Ala Asp Leu Leu Thr Pro Gln Ser Gly Ala Tyr 195 200 205 Tyr Asp Val Trp Leu Asp Gly Glu Lys Phe Met Ser Ala Tyr Lys Glu 210 215 220 Asp Pro Lys Val Thr Ala Asp Arg Ala Tyr Asn Gly Phe Gly Thr Asn 225 230 235 240 Phe Glu Asn Ser Pro Glu Pro Ile Tyr Gly Ala Gln Phe Leu Pro Arg 245 250 255 Lys Phe Lys Val Ala Thr Thr Val Pro Gly Asp Asn Ser Val Asp Leu 260 265 270 Phe Thr Gln Asp Leu Gly Val Val Val Ile Met Asp Glu Ser Gly Lys 275 280 285 Glu Val Lys Gly Tyr Asn Leu Thr Val Gly Gly Gly Met Gly Arg Thr 290 295 300 His Arg Asp Asp Glu Thr Phe Pro Arg Leu Ala Asp Pro Leu Gly Tyr 305 310 315 320 Val Asp Lys Asp Asp Leu Phe His Ala Val Lys Ala Val Val Ala Val 325 330 335 Gln Arg Asp Tyr Gly Arg Arg Asp Asn Arg Lys Gln Ala Arg Leu Lys 340 345 350 Tyr Leu Val Gly Leu Pro Ala Asp Gln Glu Leu His Val Arg Met Thr 355 360 365 Gly Cys Pro Asn Gly Cys Ala Arg Pro Tyr Met Ala Glu Leu Gly Phe 370 375 380 Val Gly Asp Gly Pro Asn Ser Tyr Gln Leu Tyr Phe Gly Gly Asn Val 385 390 395 400 Asn Gln Thr Arg Leu Ala Gln Leu Phe Ala Asp Arg Val Lys Val Lys 405 410 415 Asp Leu Glu Ser Thr Leu Glu Pro Ile Phe Ala Ala Trp Lys Ala Ser 420 425 430 Arg Arg Pro Lys Glu Ser Phe Gly Asp Trp Val Ser Arg Pro Ser Gln 435 440 445 Asp Pro Lys Asn Leu Ser Ser Val Gln Gln Gly Thr Gln His Glu Ser 450 455 460 Ala Val Val Ala His 465 12588PRTGossypium hirsutum 12Met Ser Ser Leu Ser Val Arg Phe Phe Ala Pro Gln Gln Pro Leu Leu 1 5 10 15 Pro Ser Thr Ala Ser Ser Phe Lys Pro Lys Thr Trp Val Met Ala Ala 20 25 30 Pro Thr Thr Ala Pro Ala Thr Ser Val Asp Val Asp Gly Gly Arg Leu 35 40 45 Glu Pro Arg Val Glu Glu Arg Glu Gly Tyr Phe Val Leu Lys Glu Lys 50 55 60 Phe Arg Asp Gly Ile Asn Pro Gln Glu Lys Ile Lys Ile Glu Lys Asp 65 70 75 80 Pro Leu Lys Leu Phe Met Glu Ala Gly Ile Asp Glu Leu Ala Lys Met 85 90 95 Ser Phe Glu Asp Leu Asp Lys Ala Lys Ala Thr Lys Asp Asp Ile Asp 100 105 110 Val Arg Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys His Gln Tyr 
115 120 125 Gly Arg Phe Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser 130 135 140 Ala Gln Thr Arg Tyr Leu Ala Ser Val Ile Arg Lys Tyr Gly Lys Glu 145 150 155 160 Gly Cys Ala Asp Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Ala 165 170 175 Val Leu Pro Asp Val Pro Glu Ile Leu Lys Gly Leu Asp Glu Val Gly 180 185 190 Leu Thr Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly 195 200 205 Asn Pro Leu Ala Gly Ile Asp Pro Glu Glu Ile Val Asp Thr Arg Pro 210 215 220 Tyr Thr Asn Leu Leu Ser Gln Phe Ile Thr Ala Asn Ser Arg Gly Asn 225 230 235 240 Pro Ala Val Ala Asn Leu Pro Arg Lys Trp Asn Val Cys Val Val Gly 245 250 255 Ser His Asp Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met 260 265 270 Pro Ala Thr Lys Asn Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly 275 280 285 Phe Phe Ser Ala Lys Arg Cys Asp Glu Ala Ile Pro Leu Asp Ala Trp 290 295 300 Val Ser Ala Asp Asp Val Ile Pro Leu Cys Lys Ala Val Leu Glu Ala 305 310 315 320 Tyr Arg Asp Leu Gly Tyr Arg Gly Asn Arg Gln Lys Thr Arg Met Met 325 330 335 Trp Leu Ile Asp Glu Leu Gly Ile Glu Val Phe Arg Ser Glu Val Ala 340 345 350 Lys Arg Met Pro Gln Lys Glu Leu Glu Arg Ala Ser Asp Glu Asp Leu 355 360 365 Val Gln Lys Gln Trp Glu Arg Arg Asp Tyr Leu Gly Val His Pro Gln 370 375 380 Lys Gln Glu Gly Phe Ser Tyr Ile Gly Ile His Ile Pro Val Gly Arg 385 390 395 400 Val Gln Ala Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Thr Tyr 405 410 415 Gly Ser Gly Glu Phe Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro 420 425 430 Asn Val Glu Asn Ser Lys Leu Glu Ala Leu Leu Asn Glu Pro Leu Leu 435 440 445 Lys Asp Arg Phe Ser Pro Gln Pro Ser Ile Leu Met Lys Gly Leu Val 450 455 460 Ala Cys Thr Gly Asn Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys 465 470 475 480 Ala Arg Ala Leu Lys Val Thr Glu Glu Val Glu Arg Leu Val Ser Val 485 490 495 Ser Arg Pro Val Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Gly 500 505 510 Gln Val Gln Val Ala Asp Ile Gly Phe Met Gly Cys Met Ala Arg Asp 515 520 525 Glu Asn Gly Lys Pro Cys Glu Gly Ala Asp Ile Phe Leu Gly Gly Arg 530 
535 540 Ile Gly Ser Asp Ser His Leu Gly Glu Leu Tyr Lys Lys Gly Val Pro 545 550 555 560 Cys Lys Asn Leu Val Pro Val Val Ala Asp Ile Leu Val Glu Pro Phe 565 570 575 Gly Ala Val Pro Arg Gln Arg Glu Glu Gly Glu Asp 580 585 13473PRTHordeum vulgareUNSURE(473)..(473)Unknown amino acid 13Met Ala Ser Ser Ala Ser Leu Gln Ser Phe Leu Pro Pro Ser Ala His 1 5 10 15 Ala Ala Thr Ser Ser Ser Arg Leu Arg Pro Ser Arg Ala Arg Pro Val 20 25 30 Gln Cys Ala Ala Val Ser Ala Pro Ser Ser Ser Ser Ser Ser Ala Ser 35 40 45 Pro Ser Ala Ser Ala Val Pro Ser Glu Arg Leu Glu Pro Arg Val Glu 50 55 60 Gln Arg Glu Gly Gly Tyr Trp Val Leu Lys Glu Lys Tyr Arg Thr Ser 65 70 75 80 Leu Asn Pro Gln Glu Lys Val Lys Leu Gly Lys Glu Pro Met Ala Leu 85 90 95 Phe Thr Glu Gly Gly Ile Asn Asp Leu Ala Lys Leu Pro Met Glu Gln 100 105 110 Ile Asp Ala Asp Lys Leu Thr Lys Glu Asp Val Asp Val Arg Leu Lys 115 120 125 Trp Leu Gly Leu Phe His Arg Arg Lys Gln Gln Tyr Gly Arg Phe Met 130 135 140 Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser Glu Gln Thr Arg 145 150 155 160 Tyr Leu Ala Ser Val Ile Asp Lys Tyr Gly Glu Glu Gly Cys Ala Asp 165 170 175 Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val Thr Leu Pro Asp 180 185 190 Val Pro Glu Ile Leu Asp Gly Leu Arg Ser Val Gly Leu Thr Ser Leu 195 200 205 Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly Ser Pro Leu Ala 210 215 220 Gly Ile Asp Pro Leu Glu Ile Val Asp Thr Arg Pro Tyr Thr Asn Leu 225 230 235 240 Leu Ser Ser Tyr Ile Thr Asn Asn Ser Glu Gly Asn Leu Ala Ile Thr 245 250 255 Asn Leu Pro Arg Lys Trp Asn Val Cys Val Ile Gly Thr His Asp Leu 260 265 270 Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala Glu Lys 275 280 285 Asp Gly Lys Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Ile Ser Pro 290 295 300 Lys Arg Trp Gly Glu Ala Leu Pro Leu Asp Ala Trp Val Pro Gly Asp 305 310 315 320 Asp Ile Ile Pro Val Cys Lys Ala Val Leu Glu Ala Phe Arg Asp Leu 325 330 335 Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu Ile Asp 340 345 350 Glu Leu Gly Met Glu Ala Phe Arg Ser Glu Ile Glu 
Lys Arg Met Pro 355 360 365 Asn Gly Val Leu Glu Arg Ala Ala Pro Glu Asp Leu Ile Asp Lys Lys 370 375 380 Trp Glu Arg Arg Asp Tyr Leu Gly Val His Pro Gln Lys Gln Glu Gly 385 390 395 400 Leu Ser Phe Val Gly Leu His Val Pro Val Gly Arg Leu Gln Ala Ala 405 410 415 Asp Met Phe Glu Leu Ala Arg Leu Ala Asp Glu Tyr Gly Ser Gly Glu 420 425 430 Leu Arg Leu Thr Val Glu Gln Asn Ile Val Leu Pro Asn Val Lys Asn 435 440 445 Glu Lys Val Glu Ala Leu Leu Ala Glu Pro Leu Leu His Lys Phe Ser 450 455 460 Ala His Pro Ser Leu Leu Met Lys Xaa 465 470 14582PRTLotus japonicus 14Met Ser Ser Ser Phe Ser Ile Arg Phe Leu Ala Pro Pro Phe Pro Ser 1 5 10 15 Thr Ser Arg Pro Lys Ser Cys Leu Ser Ala Ala Thr Pro Ala Val Ala 20 25 30 Pro Thr Asp Ala Ala Val Ser Arg Leu Glu Pro Arg Val Glu Glu Arg 35 40 45 Asn Gly Tyr Trp Val Leu Lys Glu Glu His Arg Gly Gly Ile Asn Pro 50 55 60 Gln Glu Lys Val Lys Leu Glu Lys Glu Pro Met Ala Leu Phe Met Glu 65 70 75 80 Gly Gly Ile Asp Glu Leu Ala Lys Val Ser Ile Glu Glu Leu Asp Ser 85 90 95 Ser Lys Leu Thr Lys Asp Asp Val Asp Val Arg Leu Lys Trp Leu Gly 100 105 110 Leu Phe His Arg Arg Lys His Gln Tyr Gly Arg Phe Met Met Arg Leu 115 120 125 Lys Leu Pro Asn Gly Val Thr Thr Ser Ala Gln Thr Arg Tyr Leu Ala 130 135 140 Ser Val Ile Arg Lys Tyr Gly Lys Asp Gly Cys Ala Asp Val Thr Thr 145 150 155 160 Arg His Asn Trp Gln Ile Arg Gly Val Val Leu Pro Asp Val Pro Glu 165 170 175 Ile Leu Lys Gly Leu Ala Glu Val Gly Leu Thr Ser Leu Gln Ser Gly 180 185 190 Met Asp Asn Val Arg Asn Pro Val Gly Asn Pro Leu Ala Gly Ile Asp 195 200 205 Pro Asp Glu Ile Val Asp Thr Arg Pro Tyr Thr Asn Leu Leu Ser His 210 215 220 Phe Ile Thr Ala Asn Ser Arg Gly Asn Pro Thr Val Ser Asn Leu Pro 225 230 235 240 Arg Lys Trp Asn Val Cys Val Val Gly Ser His Asp Leu Phe Glu His 245 250 255 Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala Asn Lys Asp Gly Arg 260 265 270 Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe Ser Pro Lys Arg Cys 275 280 285 Ala Glu Ala Ile Pro Leu Asp Ala Trp Val Ser Ala Glu Asp Val Ile 290 295 300 Pro 
Val Cys Lys Ala Ile Leu Glu Met Tyr Arg Asp Leu Gly Thr Arg 305 310 315 320 Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu Ile Asp Glu Leu Gly 325 330 335 Ile Glu Val Phe Arg Ser Glu Val Val Lys Arg Met Pro Leu Gly Gln 340 345 350 Gln Leu Glu Arg Ala Ser Gln Glu Asp Leu Val Gln Lys Gln Trp Glu 355 360 365 Arg Arg Asp Tyr Phe Gly Ala Asn Pro Gln Lys Gln Glu Gly Leu Ser 370 375 380 Tyr Val Gly Ile His Ile Pro Val Gly Arg Ile Gln Ala Asp Glu Met 385 390 395 400 Asp Glu Leu Ala Arg Leu Ala Asp Glu Tyr Gly Thr Gly Glu Leu Arg 405 410 415 Leu Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Val Glu Asn Ser Lys 420 425 430 Leu Ser Ala Leu Leu Asn Glu Pro Leu Leu Lys Glu Lys Phe Ser Pro 435 440 445 Glu Pro Ser Leu Leu Met Lys Thr Leu Val Ala Cys Thr Gly Ser Gln 450 455 460 Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys Ala Arg Ala Leu Lys Val 465 470 475 480 Thr Glu Glu Val Glu Arg Leu Val Ala Val Thr Arg Pro Val Arg Met 485 490 495 His Trp Thr Gly Cys Pro Asn Thr Cys Gly Gln Val Gln Val Ala Asp 500 505 510 Ile Gly Phe Met Gly Cys Met Ala Arg Asp Glu Asn Gly Lys Pro Gly 515 520 525 Glu Gly Val Asp Ile Phe Leu Gly Gly Arg Ile Gly Ser Asp Ser His 530 535 540 Leu Ala Glu Val Tyr Lys Lys Ala Val Pro Cys Lys Asp Leu Val Pro 545 550 555 560 Ile Val Ala Asp Ile Leu Val Lys His Phe Gly Ala Val Gln Arg Asn 565 570 575 Arg Glu Glu Gly Asp Asp 580 15587PRTNicotiana tabacum 15Met Ala Ser Phe Ser Val Lys Phe Ser Ala Thr Ser Leu Pro Asn Pro 1 5 10 15 Asn Arg Phe Ser Arg Thr Ala Lys Leu His Ala Thr Pro Pro Gln Thr 20 25 30 Val Ala Val Pro Pro Ser Gly Glu Ala Glu Ile Ala Ser Glu Arg Leu 35 40 45 Glu Pro Arg Val Glu Glu Lys Asp Gly Tyr Trp Val Leu Lys Glu Lys 50 55 60 Phe Arg Gln Gly Ile Asn Pro Ala Glu Lys Ala Lys Ile Glu Lys Glu 65 70 75 80 Pro Met Lys Leu Phe Met Glu Asn Gly Ile Glu Asp Leu Ala Lys Ile 85 90 95 Ser Leu Glu Glu Ile Glu Gly Ser Lys Leu Thr Lys Asp Asp Ile Asp 100 105 110 Val Arg Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys His His Tyr 115 120 125 Gly Arg Phe Met Met Arg Leu Lys Leu Pro Asn 
Gly Val Thr Thr Ser 130 135 140 Ala Gln Thr Arg Tyr Leu Ala Ser Val Ile Arg Lys Tyr Gly Lys Asp 145 150 155 160 Gly Cys Gly Asp Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val 165 170 175 Val Leu Pro Asp Val Pro Glu Ile Leu Lys Gly Leu Asp Glu Val Gly 180 185 190 Leu Thr Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly 195 200 205 Asn Pro Leu Ala Gly Ile Asp Pro His Glu Ile Val Asp Thr Arg Pro 210 215 220 Tyr Thr Asn Leu Leu Ser Gln Tyr Val Thr Ala Asn Phe Arg Gly Asn 225 230 235 240 Pro Ala Val Thr Asn Leu Pro Arg Lys Trp Asn Val Cys Val Ile Gly 245 250 255 Ser His Asp Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met 260 265 270 Pro Ala Ser Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly 275 280 285 Phe Phe Ser Pro Lys Arg Cys Ala Glu Ala Val Pro Leu Asp Ala Trp 290 295 300 Val Pro Ala Asp Asp Val Val Pro Val Cys Lys Ala Ile Leu Glu Ala 305 310 315 320 Tyr Arg Asp Leu Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met 325 330 335 Trp Leu Val Asp Glu Leu Gly Val Glu Gly Phe Arg Ala Glu Val Val 340 345 350 Lys Arg Met Pro Gln Gln Lys
Leu Asp Arg Glu Ser Thr Glu Asp Leu 355 360 365 Val Gln Lys Gln Trp Glu Arg Arg Glu Tyr Leu Gly Val His Pro Gln 370 375 380 Lys Gln Glu Gly Tyr Ser Phe Val Gly Leu His Ile Pro Val Gly Arg 385 390 395 400 Val Gln Ala Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Asn Tyr 405 410 415 Gly Ser Gly Glu Leu Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro 420 425 430 Asn Val Glu Asn Ser Lys Ile Glu Ser Leu Leu Asn Glu Pro Leu Leu 435 440 445 Lys Asn Arg Phe Ser Thr Asn Pro Pro Ile Leu Met Lys Asn Leu Val 450 455 460 Ala Cys Thr Gly Asn Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys 465 470 475 480 Ala Arg Ser Met Lys Ile Thr Glu Glu Val Gln Arg Leu Val Ser Val 485 490 495 Thr Lys Pro Val Arg Met His Trp Thr Gly Cys Pro Asn Ser Cys Gly 500 505 510 Gln Val Gln Val Ala Asp Ile Gly Phe Met Gly Cys Leu Thr Arg Lys 515 520 525 Glu Gly Lys Thr Val Glu Gly Ala Asp Val Tyr Leu Gly Gly Arg Ile 530 535 540 Gly Ser Asp Ser His Leu Gly Asp Val Tyr Lys Lys Ser Val Pro Cys 545 550 555 560 Glu Asp Leu Val Pro Ile Ile Val Asp Leu Leu Val Asn Asn Phe Gly 565 570 575 Ala Val Pro Arg Glu Arg Glu Glu Ala Glu Asp 580 585 16587PRTNicotiana tabacum 16Met Ala Ser Phe Ser Ile Lys Phe Leu Ala Pro Ser Leu Pro Asn Pro 1 5 10 15 Ala Arg Phe Ser Lys Asn Ala Val Lys Leu His Ala Thr Pro Pro Ser 20 25 30 Val Ala Ala Pro Pro Thr Gly Ala Pro Glu Val Ala Ala Glu Arg Leu 35 40 45 Glu Pro Arg Val Glu Glu Lys Asp Gly Tyr Trp Ile Leu Lys Glu Gln 50 55 60 Phe Arg Lys Gly Ile Asn Pro Gln Glu Lys Val Lys Ile Glu Lys Gln 65 70 75 80 Pro Met Lys Leu Phe Met Glu Asn Gly Ile Glu Glu Leu Ala Lys Ile 85 90 95 Pro Ile Glu Glu Ile Asp Gln Ser Lys Leu Thr Lys Asp Asp Ile Asp 100 105 110 Val Arg Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys Asn Gln Tyr 115 120 125 Gly Arg Phe Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser 130 135 140 Ala Gln Thr Arg Tyr Leu Ala Ser Val Ile Arg Lys Tyr Gly Lys Glu 145 150 155 160 Gly Cys Ala Asp Ile Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val 165 170 175 Val Leu Pro Asp Val Pro Glu Ile Leu Lys Gly 
Leu Ala Glu Val Gly 180 185 190 Leu Thr Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly 195 200 205 Asn Pro Leu Ala Gly Ile Asp Pro Glu Glu Ile Val Asp Thr Arg Pro 210 215 220 Tyr Thr Asn Leu Leu Ser Gln Phe Ile Thr Gly Asn Ser Arg Gly Asn 225 230 235 240 Pro Ala Val Ser Asn Leu Pro Arg Lys Trp Asn Pro Cys Val Val Gly 245 250 255 Ser His Asp Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met 260 265 270 Pro Ala Thr Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly 275 280 285 Phe Phe Ser Ala Lys Arg Cys Asp Glu Ala Ile Pro Leu Asp Ala Trp 290 295 300 Val Pro Ala Asp Asp Val Val Pro Val Cys Lys Ala Ile Leu Glu Ala 305 310 315 320 Phe Arg Asp Leu Gly Phe Arg Gly Asn Arg Gln Lys Cys Arg Met Met 325 330 335 Trp Leu Ile Asp Glu Leu Gly Val Glu Gly Phe Arg Ala Glu Val Glu 340 345 350 Lys Arg Met Pro Gln Gln Gln Leu Glu Arg Ala Ser Pro Glu Asp Leu 355 360 365 Val Gln Lys Gln Trp Glu Arg Arg Asp Tyr Leu Gly Val His Pro Gln 370 375 380 Lys Gln Glu Gly Tyr Ser Phe Ile Gly Leu His Ile Pro Val Gly Arg 385 390 395 400 Val Gln Ala Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Glu Tyr 405 410 415 Gly Ser Gly Glu Ile Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro 420 425 430 Asn Ile Glu Asn Ser Lys Ile Glu Ala Leu Leu Lys Glu Pro Val Leu 435 440 445 Ser Thr Phe Ser Pro Asp Pro Pro Ile Leu Met Lys Gly Leu Val Ala 450 455 460 Cys Thr Gly Asn Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys Ala 465 470 475 480 Arg Ser Leu Met Ile Thr Glu Glu Val Gln Arg Gln Val Ser Leu Thr 485 490 495 Arg Pro Val Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Ala Gln 500 505 510 Val Gln Val Ala Asp Ile Gly Phe Met Gly Cys Leu Thr Arg Asp Lys 515 520 525 Asn Gly Lys Thr Val Glu Gly Ala Asp Val Phe Leu Gly Gly Arg Ile 530 535 540 Gly Ser Asp Ser His Leu Gly Glu Val Tyr Lys Lys Ala Val Pro Cys 545 550 555 560 Asp Asp Leu Val Pro Leu Val Val Asp Leu Leu Val Asn Asn Phe Gly 565 570 575 Ala Val Pro Arg Glu Arg Glu Glu Thr Glu Asp 580 585 17584PRTNicotiana tabacum 17Met Ala Ser Phe Ser Val Lys Phe Ser Ala Thr 
Ser Leu Pro Asn His 1 5 10 15 Lys Arg Phe Ser Lys Leu His Ala Thr Pro Pro Gln Thr Val Ala Val 20 25 30 Ala Pro Ser Gly Ala Ala Glu Ile Ala Ser Glu Arg Leu Glu Pro Arg 35 40 45 Val Glu Glu Lys Asp Gly Tyr Trp Val Leu Lys Glu Lys Phe Arg Gln 50 55 60 Gly Ile Asn Pro Ala Glu Lys Ala Lys Ile Glu Lys Glu Pro Met Lys 65 70 75 80 Leu Phe Met Glu Asn Gly Ile Glu Asp Leu Ala Lys Ile Ser Leu Glu 85 90 95 Glu Ile Glu Gly Ser Lys Leu Thr Lys Asp Asp Ile Asp Val Arg Leu 100 105 110 Lys Trp Leu Gly Leu Phe His Arg Arg Lys His His Tyr Gly Arg Phe 115 120 125 Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser Ser Gln Thr 130 135 140 Arg Tyr Leu Ala Ser Val Ile Arg Lys Tyr Gly Lys Asp Gly Cys Ala 145 150 155 160 Asp Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val Val Leu Pro 165 170 175 Asp Val Pro Glu Ile Leu Lys Gly Leu Asp Glu Val Gly Leu Thr Ser 180 185 190 Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly Asn Pro Leu 195 200 205 Ala Gly Ile Asp Pro His Glu Ile Val Asp Thr Arg Pro Tyr Thr Asn 210 215 220 Leu Leu Ser Gln Tyr Val Thr Ala Asn Phe Arg Gly Asn Pro Ala Val 225 230 235 240 Thr Asn Leu Pro Arg Lys Trp Asn Val Cys Val Ile Gly Ser His Asp 245 250 255 Leu Tyr Glu His Pro Gln Ile Asn Asp Leu Ala Tyr Met Pro Ala Thr 260 265 270 Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe Ser 275 280 285 Pro Lys Arg Cys Ala Glu Ala Val Pro Leu Asp Ala Trp Val Pro Ala 290 295 300 Asp Asp Val Val Pro Val Cys Lys Ala Ile Leu Glu Ala Tyr Arg Asp 305 310 315 320 Leu Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu Val 325 330 335 Asp Glu Leu Gly Val Glu Gly Phe Arg Ala Glu Val Val Lys Arg Met 340 345 350 Pro Gln Gln Lys Leu Asp Arg Glu Ser Thr Glu Asp Leu Val Gln Lys 355 360 365 Gln Trp Glu Arg Arg Glu Tyr Leu Gly Val His Pro Gln Lys Gln Glu 370 375 380 Gly Tyr Ser Phe Val Gly Leu His Ile Pro Val Gly Arg Val Gln Ala 385 390 395 400 Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Glu Tyr Gly Ser Gly 405 410 415 Glu Leu Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Val Lys 
420 425 430 Asn Ser Lys Ile Glu Ala Leu Leu Asn Glu Pro Leu Leu Lys Asn Arg 435 440 445 Phe Ser Thr Asp Pro Pro Ile Leu Met Lys Asn Leu Val Ala Cys Thr 450 455 460 Gly Asn Gln Phe Cys Gly Lys Ala Ile Ile Glu Thr Lys Ala Arg Ser 465 470 475 480 Met Lys Ile Thr Glu Glu Val Gln Leu Leu Val Ser Ile Thr Gln Pro 485 490 495 Val Arg Met His Trp Thr Gly Cys Pro Asn Ser Cys Ala Gln Val Gln 500 505 510 Val Ala Asp Ile Gly Phe Met Gly Cys Leu Thr Arg Lys Glu Gly Lys 515 520 525 Thr Val Glu Gly Ala Asp Val Tyr Leu Gly Gly Arg Ile Gly Ser Asp 530 535 540 Ser His Leu Gly Asp Val Tyr Lys Lys Ser Val Pro Cys Glu Asp Leu 545 550 555 560 Val Pro Ile Ile Val Asp Leu Leu Val Asp Asn Phe Gly Ala Val Pro 565 570 575 Arg Glu Arg Glu Glu Ala Glu Asp 580 18594PRTOryza sativa 18Met Ala Ser Ser Ala Ser Leu Gln Arg Phe Leu Pro Pro Tyr Pro His 1 5 10 15 Ala Ala Ala Ser Arg Cys Arg Pro Pro Gly Val Arg Ala Arg Pro Val 20 25 30 Gln Ser Ser Thr Val Ser Ala Pro Ser Ser Ser Thr Pro Ala Ala Asp 35 40 45 Glu Ala Val Ser Ala Glu Arg Leu Glu Pro Arg Val Glu Gln Arg Glu 50 55 60 Gly Arg Tyr Trp Val Leu Lys Glu Lys Tyr Arg Thr Gly Leu Asn Pro 65 70 75 80 Gln Glu Lys Val Lys Leu Gly Lys Glu Pro Met Ser Leu Phe Met Glu 85 90 95 Gly Gly Ile Lys Glu Leu Ala Lys Met Pro Met Glu Glu Ile Glu Ala 100 105 110 Asp Lys Leu Ser Lys Glu Asp Ile Asp Val Arg Leu Lys Trp Leu Gly 115 120 125 Leu Phe His Arg Arg Lys His Gln Tyr Gly Arg Phe Met Met Arg Leu 130 135 140 Lys Leu Pro Asn Gly Val Thr Thr Ser Glu Gln Thr Arg Tyr Leu Ala 145 150 155 160 Ser Val Ile Glu Ala Tyr Gly Lys Glu Gly Cys Ala Asp Val Thr Thr 165 170 175 Arg Arg Gln Ile Arg Gly Val Thr Leu Pro Asp Val Pro Ala Ile Leu 180 185 190 Asp Gly Leu Asn Ala Val Gly Leu Thr Ser Leu Gln Ser Gly Met Asp 195 200 205 Asn Val Arg Asn Pro Val Gly Asn Pro Leu Ala Gly Ile Asp Pro Asp 210 215 220 Glu Ile Val Asp Thr Arg Ser Tyr Thr Asn Leu Leu Ser Ser Tyr Ile 225 230 235 240 Thr Ser Asn Phe Gln Gly Asn Pro Thr Ile Thr Asn Leu Pro Arg Lys 245 250 255 Trp Asn Val Cys Val Ile 
Gly Ser His Asp Leu Tyr Glu His Pro His 260 265 270 Ile Asn Asp Leu Ala Tyr Met Pro Ala Val Lys Gly Gly Lys Phe Gly 275 280 285 Phe Asn Leu Leu Val Gly Gly Phe Ile Ser Pro Lys Arg Trp Glu Glu 290 295 300 Ala Leu Pro Leu Asp Ala Trp Val Pro Gly Asp Asp Ile Ile Pro Val 305 310 315 320 Cys Lys Ala Val Leu Glu Ala Tyr Arg Asp Leu Gly Thr Arg Gly Asn 325 330 335 Arg Gln Lys Thr Arg Met Met Trp Leu Ile Asp Glu Leu Gly Met Glu 340 345 350 Ala Phe Arg Ser Glu Val Glu Lys Arg Met Pro Asn Gly Val Leu Glu 355 360 365 Arg Ala Ala Pro Glu Asp Leu Ile Asp Lys Lys Trp Gln Arg Arg Asp 370 375 380 Tyr Leu Gly Val His Pro Gln Lys Gln Glu Gly Met Ser Tyr Val Gly 385 390 395 400 Leu His Val Pro Val Gly Arg Val Gln Ala Ala Asp Met Phe Glu Leu 405 410 415 Ala Arg Leu Ala Asp Glu Tyr Gly Ser Gly Glu Leu Arg Leu Thr Val 420 425 430 Glu Gln Asn Ile Val Ile Pro Asn Val Lys Asn Glu Lys Val Glu Ala 435 440 445 Leu Leu Ser Glu Pro Leu Leu Gln Lys Phe Ser Pro Gln Pro Ser Leu 450 455 460 Leu Leu Lys Gly Leu Val Ala Cys Thr Gly Asn Gln Phe Cys Gly Gln 465 470 475 480 Ala Ile Ile Glu Thr Lys Gln Arg Ala Leu Leu Val Thr Ser Gln Val 485 490 495 Glu Lys Leu Val Ser Val Pro Arg Ala Val Arg Met His Trp Thr Gly 500 505 510 Cys Pro Asn Ser Cys Gly Gln Val Gln Val Ala Asp Ile Gly Phe Met 515 520 525 Gly Cys Leu Thr Lys Asp Ser Ala Gly Lys Ile Val Glu Ala Ala Asp 530 535 540 Ile Phe Val Gly Gly Arg Val Gly Ser Asp Ser His Leu Ala Gly Ala 545 550 555 560 Tyr Lys Lys Ser Val Pro Cys Asp Glu Leu Ala Pro Ile Val Ala Asp 565 570 575 Ile Leu Val Glu Arg Phe Gly Ala Val Arg Arg Glu Arg Glu Glu Asp 580 585 590 Glu Glu 19602PRTPhyscomitrella patens 19Met Gln Gly Ala Met Gln Thr Lys Met Trp Arg Gly Glu Leu Ile Ser 1 5 10 15 Thr Ser Thr His Phe Ile Gly Gly Thr Arg Leu Gln Pro Lys Leu Asn 20 25 30 Gln Asp Ala Arg Lys Pro Thr Lys Ser Glu Asn Cys Ile Val Arg Val 35 40 45 Ser Met Glu Arg Glu Val Lys Ala Lys Ala Ala Val Ser Pro Pro Ala 50 55 60 Val Ala Ala Asp Arg Leu Thr Pro Arg Val Gln Glu Arg Asp Gly Tyr 65 70 75 80 
Tyr Val Leu Lys Glu Glu Phe Arg Gln Gly Ile Asn Pro Gln Glu Lys 85 90 95 Ile Lys Leu Gly Lys Glu Pro Met Lys Phe Phe Ile Glu Asn Glu Ile 100 105 110 Glu Glu Leu Ala Lys Thr Pro Phe Ala Glu Leu Asp Ser Ser Lys Pro 115 120 125 Gly Lys Asp Asp Ile Asp Val Arg Leu Lys Trp Leu Gly Leu Phe His 130 135 140 Arg Arg Lys His Gln Tyr Gly Arg Phe Met Met Arg Phe Lys Leu Pro 145 150 155 160 Asn Gly Ile Thr Asn Ser Thr Gln Thr Arg Phe Leu Ala Glu Thr Ile 165 170 175 Ser Lys Tyr Gly Lys Glu Gly Cys Ala Asp Leu Thr Thr Arg Gln Asn 180 185 190 Trp Gln Ile Arg Gly Ile Met Leu Glu Asp Val Pro Ser Leu Leu Lys 195 200 205 Gly Leu Glu Ser Val Gly Leu Ser Ser Leu Gln Ser Gly Met Asp Asn 210 215 220 Val Arg Asn Ala Val Gly Asn Pro Leu Ala Gly Ile Asp Pro Asp Glu 225 230 235 240 Ile Val Asp Thr Ile Pro Ile Cys Gln Ala Leu Asn Asp Tyr Ile Ile 245 250 255 Asn Arg Gly Lys Gly Asn Thr Glu Ile Thr Asn Leu Pro Arg Lys Trp 260 265 270 Asn Val Cys Val Val Gly Thr His Asp Leu Phe Glu His Pro His Ile 275 280 285 Asn Asp Leu Ala Tyr Val Pro Ala Thr Lys Asn Gly Val Phe Gly Phe 290
295 300 Asn Ile Leu Val Gly Gly Phe Phe Ser Ser Lys Arg Cys Ala Glu Ala 305 310 315 320 Ile Pro Met Asp Ala Trp Val Pro Thr Asp Asp Val Val Pro Leu Cys 325 330 335 Lys Ala Ile Leu Glu Thr Tyr Arg Asp Leu Gly Thr Arg Gly Asn Arg 340 345 350 Gln Lys Thr Arg Met Met Trp Leu Ile Asp Glu Met Gly Val Glu Glu 355 360 365 Phe Arg Ala Glu Val Glu Arg Arg Met Pro Ser Gly Thr Ile Arg Arg 370 375 380 Ala Gly Gln Asp Leu Ile Asp Pro Ser Trp Lys Arg Arg Ser Phe Phe 385 390 395 400 Gly Val Asn Pro Gln Lys Gln Ala Gly Leu Asn Tyr Val Gly Leu His 405 410 415 Val Pro Val Gly Arg Leu His Ala Pro Glu Met Phe Glu Leu Ala Arg 420 425 430 Ile Ala Asp Glu Tyr Gly Asn Gly Glu Ile Arg Ile Thr Val Glu Gln 435 440 445 Asn Leu Ile Leu Pro Asn Ile Pro Thr Glu Lys Ile Asp Lys Leu Met 450 455 460 Gln Glu Pro Leu Leu Gln Lys Tyr Ser Pro Asn Pro Thr Pro Leu Leu 465 470 475 480 Ala Asn Leu Val Ala Cys Thr Gly Ser Gln Phe Cys Gly Gln Ala Ile 485 490 495 Ala Glu Thr Lys Ala Leu Ser Leu Gln Leu Thr Gln Gln Leu Glu Asp 500 505 510 Thr Met Glu Thr Thr Arg Pro Ile Arg Leu His Phe Thr Gly Cys Pro 515 520 525 Asn Thr Cys Ala Gln Ile Gln Val Ala Asp Ile Gly Phe Met Gly Thr 530 535 540 Met Ala Arg Asp Glu Asn Arg Lys Pro Val Glu Gly Phe Asp Ile Tyr 545 550 555 560 Leu Gly Gly Arg Ile Gly Ser Asp Ser His Leu Gly Glu Leu Val Val 565 570 575 Pro Gly Val Pro Ala Thr Lys Leu Leu Pro Val Val Gln Glu Leu Met 580 585 590 Ile Gln His Phe Gly Ala Lys Arg Lys Pro 595 600 20602PRTPhyscomitrella patens 20Met Gln Gly Thr Met Gln Ser Gln Met Trp Arg Gly Gln Val Ser Gly 1 5 10 15 Ala Ser Leu His Phe Thr Gly Ala Thr Arg Val Gln Gly Asn Ser His 20 25 30 Gln Asp Leu Val Tyr Pro Thr Gln Phe His Lys His Gly Val Arg Ala 35 40 45 Ser Ala Glu Arg Glu Val Lys Ala Lys Ala Val Ala Ala Pro Pro Thr 50 55 60 Ile Ala Ala Asp Arg Leu Val Pro Arg Val Glu Glu Arg Asp Gly Tyr 65 70 75 80 Tyr Val Leu Lys Glu Glu Phe Arg Gln Gly Ile Asn Pro Ser Glu Lys 85 90 95 Ile Lys Ile Ala Lys Glu Pro Met Lys Phe Phe Met Glu Asn Glu Ile 100 105 110 Glu Glu 
Leu Ala Lys Thr Pro Phe Ala Glu Leu Asp Ser Ser Lys Ala 115 120 125 Gly Lys Asp Asp Ile Asp Val Arg Leu Lys Trp Leu Gly Leu Phe His 130 135 140 Arg Arg Lys His Gln Tyr Gly Arg Phe Met Met Arg Phe Lys Leu Pro 145 150 155 160 Asn Gly Ile Thr Asn Ser Ser Gln Thr Arg Phe Leu Ala Glu Thr Ile 165 170 175 Ser Lys Tyr Gly Glu Tyr Gly Cys Ala Asp Leu Thr Thr Arg Gln Asn 180 185 190 Trp Gln Ile Arg Gly Ile Val Leu Glu Asp Val Pro Ala Leu Leu Lys 195 200 205 Gly Leu Glu Ser Val Gly Leu Ser Ser Leu Gln Ser Gly Met Asp Asn 210 215 220 Val Arg Asn Pro Val Gly Asn Pro Leu Ala Gly Ile Asp Pro Asp Glu 225 230 235 240 Ile Val Asp Thr Ala Pro Phe Cys Lys Val Leu Ser Asp Tyr Ile Ile 245 250 255 Asn Arg Gly Gln Gly Asn Pro Gln Ile Thr Asn Leu Pro Arg Lys Trp 260 265 270 Asn Val Cys Val Val Gly Thr His Asp Leu Phe Glu His Pro His Ile 275 280 285 Asn Asp Leu Ala Tyr Met Pro Ala Thr Lys Asn Gly Val Phe Gly Phe 290 295 300 Asn Ile Leu Val Gly Gly Phe Phe Ser Pro Lys Arg Cys Ala Glu Ala 305 310 315 320 Ile Pro Met Asp Ala Trp Val Pro Ala Asp Asp Val Val Pro Leu Cys 325 330 335 Lys Ala Ile Leu Glu Thr Tyr Arg Asp Leu Gly Thr Arg Gly Asn Arg 340 345 350 Gln Lys Thr Arg Met Met Trp Leu Ile Asp Glu Met Gly Ile Glu Glu 355 360 365 Phe Arg Ala Glu Val Glu Arg Arg Met Pro Gly Gly Ser Ile Leu Arg 370 375 380 Ala Gly Lys Asp Leu Val Asp Pro Ser Trp Thr Arg Arg Ser Phe Tyr 385 390 395 400 Gly Val Asn Pro Gln Lys Gln Pro Gly Leu Asn Tyr Val Gly Leu His 405 410 415 Ile Pro Val Gly Arg Leu His Ala Pro Glu Met Phe Glu Leu Ala Arg 420 425 430 Ile Ala Asp Glu Tyr Gly Asn Gly Glu Ile Arg Ile Ser Val Glu Gln 435 440 445 Asn Leu Ile Leu Pro Asn Val Pro Thr Glu Lys Ile Glu Lys Leu Leu 450 455 460 Lys Glu Pro Leu Leu Glu Lys Tyr Ser Pro Asn Pro Thr Pro Leu Leu 465 470 475 480 Ala Asn Leu Val Ala Cys Thr Gly Ser Gln Phe Cys Gly Gln Ala Ile 485 490 495 Ala Glu Thr Lys Ala Arg Ser Leu Gln Leu Thr Gln Glu Leu Glu Ala 500 505 510 Thr Met Glu Thr Thr Arg Pro Ile Arg Leu His Phe Thr Gly Cys Pro 515 520 525 Asn Thr Cys 
Ala Gln Ile Gln Val Ala Asp Ile Gly Phe Met Gly Thr 530 535 540 Met Ala Arg Asp Glu Asn Arg Lys Pro Val Glu Gly Phe Asp Ile Tyr 545 550 555 560 Leu Gly Gly Arg Ile Gly Ser Asp Ser His Leu Gly Glu Leu Val Val 565 570 575 Pro Gly Val Pro Ala Thr Lys Leu Leu Pro Val Val Gln Asp Leu Met 580 585 590 Ile Gln His Phe Gly Ala Lys Arg Lys Thr 595 600 21616PRTPinus taeda 21Met Asn Leu Ser Ser Pro Val Arg Phe Asp Glu Ile Arg Pro Leu Ala 1 5 10 15 His Val Val Tyr Asn Pro Val Cys Cys Gly His Lys Pro Asn Arg Leu 20 25 30 Arg Leu Met Thr Ala Ile Gln Val Arg Ala Val Asn His Gly Gly Arg 35 40 45 Asn Ser Glu Ile Ser Thr Asp Gly Asn Ser Lys Gly Thr Thr Ala Lys 50 55 60 Ala Val Ala Ser Pro Ala Gly Ser His Val Ala Val Asp Ala Ser Arg 65 70 75 80 Leu Glu Ala Arg Val Glu Glu Arg Asp Gly Tyr Trp Val Leu Lys Glu 85 90 95 Glu Phe Arg Ala Gly Ile Asn Pro Gln Glu Lys Ile Lys Leu Gln Arg 100 105 110 Glu Pro Met Lys Leu Phe Met Glu Asn Glu Ile Glu Glu Leu Ala Lys 115 120 125 Lys Pro Phe Ala Glu Ile Glu Ser Glu Lys Val Asn Lys Asp Asp Ile 130 135 140 Asp Val Arg Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys His His 145 150 155 160 Tyr Gly Arg Phe Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr 165 170 175 Ser Leu Gln Thr Arg Tyr Leu Ala Ser Val Ile Gln Gln Tyr Gly Pro 180 185 190 Glu Gly Cys Ala Asp Ile Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly 195 200 205 Val Val Leu Asp Asp Val Pro Ala Ile Leu Lys Gly Leu Lys Glu Val 210 215 220 Gly Leu Ser Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val 225 230 235 240 Gly Asn Pro Leu Ala Gly Ile Asp Ala Asp Glu Ile Ile Asp Thr Arg 245 250 255 Pro Tyr Thr Lys Val Leu Thr Asp Tyr Ile Val Asn Asn Gly Lys Gly 260 265 270 Asn Pro Ser Ile Thr Asn Leu Pro Arg Lys Trp Asn Val Cys Val Val 275 280 285 Gly Thr His Asp Leu Phe Glu His Pro His Ile Asn Asp Leu Ala Tyr 290 295 300 Ile Pro Ala Met Asn Ser Gly Arg Phe Gly Phe Asn Leu Leu Val Gly 305 310 315 320 Gly Phe Phe Ser Pro Lys Arg Cys Glu Glu Ala Val Pro Leu Asp Ala 325 330 335 Trp Val Ala Gly Glu Asp Val Val Pro Val 
Cys Arg Ala Ile Leu Glu 340 345 350 Val Tyr Arg Asp Leu Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg Met 355 360 365 Met Trp Leu Ile Asp Glu Leu Gly Ile Glu Gly Phe Arg Ser Glu Val 370 375 380 Val Lys Arg Met Pro Gly Glu Lys Leu Glu Arg Ala Ala Thr Glu Asp 385 390 395 400 Met Leu Asp Lys Ser Trp Glu Arg Arg Ser Tyr Leu Gly Val His Pro 405 410 415 Gln Lys Gln Glu Gly Leu Asn Phe Val Gly Leu His Val Pro Val Gly 420 425 430 Arg Leu Gln Ala Glu Asp Met Leu Glu Leu Ala Arg Leu Ala Glu Gln 435 440 445 Tyr Gly Thr Gln Glu Leu Arg Leu Thr Val Glu Gln Asn Ala Ile Ile 450 455 460 Pro Asn Val Pro Thr Asp Lys Ile Glu Ala Leu Leu Gln Glu Pro Leu 465 470 475 480 Leu Gln Lys Phe Ser Pro Ser Pro Pro Leu Leu Val Ser Thr Leu Val 485 490 495 Ala Cys Thr Gly Asn Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys 500 505 510 Ala Arg Ala Leu Lys Ile Thr Glu Glu Leu Asp Arg Thr Met Glu Val 515 520 525 Pro Lys Pro Val Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Gly 530 535 540 Gln Val Gln Val Ala Asp Ile Gly Phe Met Gly Cys Met Thr Arg Asp 545 550 555 560 Glu Asn Lys Lys Val Val Glu Gly Val Asp Ile Phe Ile Gly Gly Arg 565 570 575 Val Gly Ala Asp Ser His Leu Gly Asp Leu Ile His Lys Gly Val Pro 580 585 590 Cys Lys Asp Val Val Pro Val Val Gln Glu Leu Leu Ile Lys His Phe 595 600 605 Gly Ala Ile Arg Lys Thr Asp Met 610 615 22588PRTPopulus trichocarpa 22Met Ser Ser Leu Ser Val Arg Phe Leu Thr Pro Gln Leu Ser Pro Thr 1 5 10 15 Val Pro Ser Ser Ser Ala Arg Pro Arg Thr Arg Leu Phe Ala Gly Pro 20 25 30 Pro Thr Val Ala Gln Pro Ala Glu Thr Gly Val Asp Ala Gly Arg Leu 35 40 45 Glu Pro Arg Val Glu Lys Lys Asp Gly Tyr Tyr Val Leu Lys Glu Lys 50 55 60 Phe Arg Gln Gly Ile Asn Pro Gln Glu Lys Val Lys Ile Glu Lys Glu 65 70 75 80 Pro Met Lys Leu Phe Met Glu Asn Gly Ile Glu Glu Leu Ala Lys Leu 85 90 95 Ser Met Glu Glu Ile Asp Lys Glu Lys Ser Thr Lys Asp Asp Ile Asp 100 105 110 Val Arg Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys His Gln Tyr 115 120 125 Gly Arg Phe Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser 130 135 
140 Ala Gln Thr Arg Tyr Leu Ala Ser Val Ile Arg Lys Tyr Gly Lys Asp 145 150 155 160 Gly Cys Ala Asp Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val 165 170 175 Val Leu Pro Asp Val Pro Glu Ile Leu Arg Gly Leu Ala Glu Val Gly 180 185 190 Leu Thr Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly 195 200 205 Asn Pro Leu Ala Gly Ile Asp Pro Asp Glu Ile Val Asp Thr Arg Pro 210 215 220 Tyr Thr Asn Leu Leu Ser Gln Phe Ile Thr Ala Asn Ser Arg Gly Asn 225 230 235 240 Pro Glu Phe Thr Asn Leu Pro Arg Lys Trp Asn Val Cys Val Val Gly 245 250 255 Ser His Asp Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met 260 265 270 Pro Ala Met Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly 275 280 285 Phe Phe Ser Pro Lys Arg Cys Ala Glu Ala Ile Pro Leu Asp Ala Trp 290 295 300 Val Ser Ala Asp Asp Val Leu Pro Ser Cys Lys Ala Val Leu Glu Ala 305 310 315 320 Tyr Arg Asp Leu Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met 325 330 335 Trp Leu Ile Asp Glu Leu Gly Ile Glu Gly Phe Arg Ser Glu Val Val 340 345 350 Lys Arg Met Pro Arg Gln Glu Leu Glu Arg Glu Ser Ser Glu Asp Leu 355 360 365 Val Gln Lys Gln Trp Glu Arg Arg Asp Tyr Phe Gly Val His Pro Gln 370 375 380 Lys Gln Glu Gly Leu Ser Tyr Ala Gly Leu His Ile Pro Val Gly Arg 385 390 395 400 Val Gln Ala Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Ile Tyr 405 410 415 Gly Thr Gly Glu Leu Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro 420 425 430 Asn Ile Glu Asp Ser Lys Ile Glu Ala Leu Leu Lys Glu Pro Leu Leu 435 440 445 Lys Asp Arg Phe Ser Pro Glu Pro Pro Leu Leu Met Gln Gly Leu Val 450 455 460 Ala Cys Thr Gly Lys Glu Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys 465 470 475 480 Ala Arg Ala Met Lys Val Thr Glu Glu Val Gln Arg Leu Val Ser Val 485 490 495 Ser Lys Pro Val Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Gly 500 505 510 Gln Val Gln Val Ala Asp Ile Gly Phe Met Gly Cys Met Ala Arg Asp 515 520 525 Glu Asn Gly Lys Ile Cys Glu Gly Ala Asp Val Tyr Val Gly Gly Arg 530 535 540 Val Gly Ser Asp Ser His Leu Gly Glu Leu Tyr Lys Lys Ser Val Pro 545 550 555 
560 Cys Lys Asp Leu Val Pro Leu Val Val Asp Ile Leu Val Lys Gln Phe 565 570 575 Gly Ala Val Pro Arg Glu Arg Glu Glu Val Asp Asp 580 585 23587PRTSolanum lycopersicum 23Met Ala Ser Phe Ser Ile Lys Phe Leu Ala Pro Ser Leu Pro Asn Pro 1 5 10 15 Thr Arg Phe Ser Lys Ser Ser Ile Val Lys Leu Asn Ala Thr Pro Pro 20 25 30 Gln Thr Val Ala Ala Ala Gly Pro Pro Glu Val Ala Ala Glu Arg Leu 35 40 45 Glu Pro Arg Val Glu Glu Lys Asp Gly Tyr Trp Ile Leu Lys Glu Gln 50 55 60 Phe Arg Gln Gly Ile Asn Pro Gln Glu Lys Val Lys Ile Glu Lys Glu 65 70 75 80 Pro Met Lys Leu Phe Met Glu Asn Gly Ile Glu Glu Leu Ala Lys Ile 85 90 95 Pro Ile Glu Glu Ile Asp Gln Ser Lys Leu Thr Lys Asp Asp Ile Asp 100 105 110 Val Arg Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys Asn Gln Tyr 115 120 125 Gly Arg Phe Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser 130 135 140 Ala Gln Thr Arg Tyr Leu Ala Ser Val Ile Arg Lys Tyr Gly Glu Glu 145 150 155 160 Gly Cys Ala Asp Ile Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val 165 170 175 Val Leu Pro Asp Val Pro Glu Ile Leu Lys Gly Leu Glu Glu Val Gly

180 185 190 Leu Thr Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly 195 200 205 Asn Pro Leu Ala Gly Ile Asp Pro Glu Glu Ile Val Asp Thr Arg Pro 210 215 220 Tyr Thr Asn Leu Leu Ser Gln Phe Ile Thr Gly Asn Ser Arg Gly Asn 225 230 235 240 Pro Ala Val Ser Asn Leu Pro Arg Lys Trp Asn Pro Cys Val Val Gly 245 250 255 Ser His Asp Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met 260 265 270 Pro Ala Ile Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly 275 280 285 Phe Phe Ser Ala Lys Arg Cys Asp Glu Ala Ile Pro Leu Asp Ala Trp 290 295 300 Val Pro Ala Asp Asp Val Val Pro Val Cys Lys Ala Ile Leu Glu Ala 305 310 315 320 Phe Arg Asp Leu Gly Phe Arg Gly Asn Arg Gln Lys Cys Arg Met Met 325 330 335 Trp Leu Ile Asp Glu Leu Gly Val Glu Gly Phe Arg Ala Glu Val Val 340 345 350 Lys Arg Met Pro Gln Gln Glu Leu Glu Arg Ala Ser Pro Glu Asp Leu 355 360 365 Val Gln Lys Gln Trp Glu Arg Arg Asp Tyr Leu Gly Val His Pro Gln 370 375 380 Lys Gln Glu Gly Tyr Ser Phe Ile Gly Leu His Ile Pro Val Gly Arg 385 390 395 400 Val Gln Ala Asp Asp Met Asp Asp Leu Ala Arg Leu Ala Asp Glu Tyr 405 410 415 Gly Ser Gly Glu Leu Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro 420 425 430 Asn Ile Glu Asn Ser Lys Ile Asp Ala Leu Leu Lys Glu Pro Ile Leu 435 440 445 Ser Lys Phe Ser Pro Asp Pro Pro Ile Leu Met Lys Gly Leu Val Ala 450 455 460 Cys Thr Gly Asn Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys Ala 465 470 475 480 Arg Ser Leu Lys Ile Thr Glu Glu Val Gln Arg Gln Val Ser Leu Thr 485 490 495 Arg Pro Val Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Ala Gln 500 505 510 Val Gln Val Ala Asp Ile Gly Phe Met Gly Cys Leu Thr Arg Asp Lys 515 520 525 Asp Lys Lys Thr Val Glu Gly Ala Asp Val Phe Leu Gly Gly Arg Ile 530 535 540 Gly Ser Asp Ser His Leu Gly Glu Val Tyr Lys Lys Ala Val Pro Cys 545 550 555 560 Asp Glu Leu Val Pro Leu Ile Val Asp Leu Leu Ile Lys Asn Phe Gly 565 570 575 Ala Val Pro Arg Glu Arg Glu Glu Thr Glu Asp 580 585 24584PRTSolanum lycopersicum 24Met Thr Ser Phe Ser Val Lys Phe Ser Ala Thr Ser Leu Pro Asn 
Ser 1 5 10 15 Asn Arg Phe Ser Lys Leu His Ala Thr Pro Pro Gln Thr Val Ala Val 20 25 30 Pro Ser Tyr Gly Ala Ala Glu Ile Ala Ala Glu Arg Leu Glu Pro Arg 35 40 45 Val Glu Gln Arg Asp Gly Tyr Trp Val Val Lys Asp Lys Phe Arg Gln 50 55 60 Gly Ile Asn Pro Ala Glu Lys Ala Lys Ile Glu Lys Glu Pro Met Lys 65 70 75 80 Leu Phe Thr Glu Asn Gly Ile Glu Asp Leu Ala Lys Ile Ser Leu Glu 85 90 95 Glu Ile Glu Lys Ser Lys Leu Thr Lys Glu Asp Ile Asp Ile Arg Leu 100 105 110 Lys Trp Leu Gly Leu Phe His Arg Arg Lys His His Tyr Gly Arg Phe 115 120 125 Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser Asp Gln Thr 130 135 140 Arg Tyr Leu Gly Ser Val Ile Arg Lys Tyr Gly Lys Asp Gly Cys Gly 145 150 155 160 Asp Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val Val Leu Pro 165 170 175 Asp Val Pro Glu Ile Leu Lys Gly Leu Asp Glu Val Gly Leu Thr Ser 180 185 190 Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly Asn Pro Leu 195 200 205 Ala Gly Ile Asp Leu His Glu Ile Val Asp Thr Arg Pro Tyr Thr Asn 210 215 220 Leu Leu Ser Gln Tyr Val Thr Ala Asn Phe Arg Gly Asn Val Asp Val 225 230 235 240 Thr Asn Leu Pro Arg Lys Trp Asn Val Cys Val Ile Gly Ser His Asp 245 250 255 Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala Thr 260 265 270 Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe Ser 275 280 285 Pro Lys Arg Cys Ala Glu Ala Ile Pro Leu Asp Ala Trp Val Pro Ala 290 295 300 Asp Asp Val Val Pro Val Cys Lys Ala Ile Leu Glu Ala Tyr Arg Asp 305 310 315 320 Leu Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu Ile 325 330 335 Asp Glu Leu Gly Val Glu Gly Phe Arg Ala Glu Val Val Lys Arg Met 340 345 350 Pro Gln Lys Lys Leu Asp Arg Glu Ser Ser Glu Asp Leu Val Leu Lys 355 360 365 Gln Trp Glu Arg Arg Glu Tyr Leu Gly Val His Pro Gln Lys Gln Glu 370 375 380 Gly Tyr Ser Phe Val Gly Leu His Ile Pro Val Gly Arg Val Gln Ala 385 390 395 400 Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Glu Tyr Gly Ser Gly 405 410 415 Glu Leu Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Ile Glu 420 425 430 Asn 
Ser Lys Ile Asp Ala Leu Leu Asn Glu Pro Leu Leu Lys Asn Arg 435 440 445 Phe Ser Pro Asp Pro Pro Ile Leu Met Arg Asn Leu Val Ala Cys Thr 450 455 460 Gly Asn Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys Ala Arg Ser 465 470 475 480 Met Lys Ile Thr Glu Glu Val Gln Arg Leu Val Ser Val Thr Gln Pro 485 490 495 Val Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Gly Gln Val Gln 500 505 510 Val Ala Asp Ile Gly Phe Met Gly Cys Leu Thr Arg Lys Glu Gly Lys 515 520 525 Thr Val Glu Gly Ala Asp Val Phe Leu Gly Gly Arg Ile Gly Ser Asp 530 535 540 Ser His Leu Gly Glu Val Tyr Lys Lys Ser Val Pro Cys Glu Asp Leu 545 550 555 560 Val Pro Ile Ile Val Asp Leu Leu Ile Asn Asn Phe Gly Ala Val Pro 565 570 575 Arg Glu Arg Glu Glu Thr Glu Glu 580 25586PRTArabidopsis thaliana 25Met Thr Ser Phe Ser Leu Thr Phe Thr Ser Pro Leu Leu Pro Ser Ser 1 5 10 15 Ser Thr Lys Pro Lys Arg Ser Val Leu Val Ala Ala Ala Gln Thr Thr 20 25 30 Ala Pro Ala Glu Ser Thr Ala Ser Val Asp Ala Asp Arg Leu Glu Pro 35 40 45 Arg Val Glu Leu Lys Asp Gly Phe Phe Ile Leu Lys Glu Lys Phe Arg 50 55 60 Lys Gly Ile Asn Pro Gln Glu Lys Val Lys Ile Glu Arg Glu Pro Met 65 70 75 80 Lys Leu Phe Met Glu Asn Gly Ile Glu Glu Leu Ala Lys Lys Ser Met 85 90 95 Glu Glu Leu Asp Ser Glu Lys Ser Ser Lys Asp Asp Ile Asp Val Arg 100 105 110 Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys His Gln Tyr Gly Lys 115 120 125 Phe Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser Ala Gln 130 135 140 Thr Arg Tyr Leu Ala Ser Val Ile Arg Lys Tyr Gly Glu Asp Gly Cys 145 150 155 160 Ala Asp Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val Val Leu 165 170 175 Pro Asp Val Pro Glu Ile Leu Lys Gly Leu Ala Ser Val Gly Leu Thr 180 185 190 Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly Asn Pro 195 200 205 Ile Ala Gly Ile Asp Pro Glu Glu Ile Val Asp Thr Arg Pro Tyr Thr 210 215 220 Asn Leu Leu Ser Gln Phe Ile Thr Ala Asn Ser Gln Gly Asn Pro Asp 225 230 235 240 Phe Thr Asn Leu Pro Arg Lys Trp Asn Val Cys Val Val Gly Thr His 245 250 255 Asp Leu Tyr Glu His Pro His Ile 
Asn Asp Leu Ala Tyr Met Pro Ala 260 265 270 Asn Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe 275 280 285 Ser Pro Lys Arg Cys Glu Glu Ala Ile Pro Leu Asp Ala Trp Val Pro 290 295 300 Ala Asp Asp Val Leu Pro Leu Cys Lys Ala Val Leu Glu Ala Tyr Arg 305 310 315 320 Asp Leu Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu 325 330 335 Ile Asp Glu Leu Gly Val Glu Gly Phe Arg Thr Glu Val Glu Lys Arg 340 345 350 Met Pro Asn Gly Lys Leu Glu Arg Gly Ser Ser Glu Asp Leu Val Asn 355 360 365 Lys Gln Trp Glu Arg Arg Asp Tyr Phe Gly Val Asn Pro Gln Lys Gln 370 375 380 Glu Gly Leu Ser Phe Val Gly Leu His Val Pro Val Gly Arg Leu Gln 385 390 395 400 Ala Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Thr Tyr Gly Ser 405 410 415 Gly Glu Leu Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Val 420 425 430 Glu Thr Ser Lys Thr Glu Ala Leu Leu Gln Glu Pro Phe Leu Lys Asn 435 440 445 Arg Phe Ser Pro Glu Pro Ser Ile Leu Met Lys Gly Leu Val Ala Cys 450 455 460 Thr Gly Ser Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys Leu Arg 465 470 475 480 Ala Leu Lys Val Thr Glu Glu Val Glu Arg Leu Val Ser Val Pro Arg 485 490 495 Pro Ile Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Gly Gln Val 500 505 510 Gln Val Ala Asp Ile Gly Phe Met Gly Cys Leu Thr Arg Gly Glu Glu 515 520 525 Gly Lys Pro Val Glu Gly Ala Asp Val Tyr Val Gly Gly Arg Ile Gly 530 535 540 Ser Asp Ser His Ile Gly Glu Ile Tyr Lys Lys Gly Val Arg Val Thr 545 550 555 560 Glu Leu Val Pro Leu Val Ala Glu Ile Leu Ile Lys Glu Phe Gly Ala 565 570 575 Val Pro Arg Glu Arg Glu Glu Asn Glu Asp 580 585 26595PRTVitis vinifera 26Met Ala Ser Ile Ser Val Pro Phe Leu Ser Gln Ala Pro Thr His Leu 1 5 10 15 Ser Asn Ser Thr Ser Leu Arg Leu Lys Thr Arg Ile Ser Ala Thr Pro 20 25 30 Thr Pro Thr Pro Thr Pro Thr Thr Val Ala Pro Ser Ser Thr Ala Ala 35 40 45 Val Asp Ala Ser Arg Met Glu Pro Arg Val Glu Glu Arg Gly Gly Tyr 50 55 60 Trp Val Leu Lys Glu Lys Phe Arg Glu Gly Ile Asn Pro Gln Glu Lys 65 70 75 80 Val Lys Ile Glu Lys Asp Pro Met Lys Leu Phe Ile Glu 
Asp Gly Phe 85 90 95 Asn Glu Leu Ala Ser Met Ser Phe Glu Glu Ile Glu Lys Ser Lys His 100 105 110 Thr Lys Asp Asp Ile Asp Val Arg Leu Lys Trp Leu Gly Leu Phe His 115 120 125 Arg Arg Lys His Gln Tyr Gly Arg Phe Met Met Arg Leu Lys Leu Pro 130 135 140 Asn Gly Val Thr Ser Ser Ala Gln Thr Arg Tyr Leu Ala Ser Ala Ile 145 150 155 160 Arg Gln Tyr Gly Lys Glu Gly Cys Ala Asp Val Thr Thr Arg Gln Asn 165 170 175 Trp Gln Ile Arg Gly Val Val Leu Pro Asp Val Pro Glu Ile Leu Lys 180 185 190 Gly Leu Ser Glu Val Gly Leu Thr Ser Leu Gln Ser Gly Met Asp Asn 195 200 205 Val Arg Asn Pro Val Gly Asn Pro Leu Ala Gly Ile Asp Pro His Glu 210 215 220 Ile Val Asp Thr Arg Pro Tyr Thr Asn Leu Leu Ser Gln Phe Ile Thr 225 230 235 240 Ala Asn Ala Arg Gly Asn Thr Ala Phe Thr Asn Leu Pro Arg Lys Trp 245 250 255 Asn Val Cys Val Val Gly Ser His Asp Leu Tyr Glu His Pro His Ile 260 265 270 Asn Asp Leu Ala Tyr Met Pro Ala Thr Lys Lys Gly Arg Phe Gly Phe 275 280 285 Asn Leu Leu Val Gly Gly Phe Phe Ser Pro Lys Arg Cys Ala Asp Ala 290 295 300 Ile Pro Leu Asp Ala Trp Ile Pro Ala Asp Asp Val Leu Pro Val Cys 305 310 315 320 Gln Ala Val Leu Glu Ala Tyr Arg Asp Leu Gly Thr Arg Gly Asn Arg 325 330 335 Gln Lys Thr Arg Met Met Trp Leu Ile Asp Glu Leu Gly Ile Glu Gln 340 345 350 Phe Arg Ala Glu Val Val Lys Arg Met Pro Gln Gln Glu Leu Glu Arg 355 360 365 Ser Ser Ser Glu Asp Leu Val Gln Lys Gln Trp Glu Arg Arg Asp Tyr 370 375 380 Leu Gly Val His Pro Gln Lys Gln Glu Gly Phe Ser Phe Val Gly Ile 385 390 395 400 His Ile Pro Val Gly Arg Val Gln Ala Asp Asp Met Asp Glu Leu Ala 405 410 415 Arg Leu Ala Asp Glu Tyr Gly Ser Gly Glu Leu Arg Leu Thr Val Glu 420 425 430 Gln Asn Ile Ile Ile Pro Asn Val Glu Asn Ser Arg Leu Glu Ala Leu 435 440 445 Leu Lys Glu Pro Leu Leu Arg Asp Arg Phe Ser Pro Glu Pro Pro Ile 450 455 460 Leu Met Lys Gly Leu Val Ala Cys Thr Gly Asn Gln Phe Cys Gly Gln 465 470 475 480 Ala Ile Ile Glu Thr Lys Ala Arg Ala Leu Lys Val Thr Glu Asp Val 485 490 495 Gly Arg Leu Val Ser Val Thr Gln Pro Val Arg Met His Trp 
Thr Gly 500 505 510 Cys Pro Asn Ser Cys Gly Gln Val Gln Val Ala Asp Ile Gly Phe Met 515 520 525 Gly Cys Met Thr Arg Asp Glu Asn Gly Asn Val Cys Glu Gly Ala Asp 530 535 540 Val Phe Leu Gly Gly Arg Ile Gly Ser Asp Cys His Leu Gly Glu Val 545 550 555 560 Tyr Lys Lys Arg Val Pro Cys Lys Asp Leu Val Pro Leu Val Ala Glu 565 570 575 Ile Leu Val Asn His Phe Gly Gly Val Pro Arg Glu Arg Glu Glu Glu 580 585 590 Ala Glu Asp 595 27580PRTVolvox sp. 27Met Gln Ser Gln Ser Leu Ser Arg Arg Thr Cys Thr Arg Thr Leu Gly 1 5 10 15 Arg Gly Leu Val Thr Pro Val Leu Ala Thr Ala Ala Pro Ala Ser Ala 20 25 30 Ala Gln Ala Ala Asp Gly Ile Asn Ala His Ser Gly Leu Lys His Leu 35 40 45 Pro Glu Ala Ala Arg Val Arg Ala Leu Asp Arg Lys Ala Asn Lys Phe 50 55 60 Glu Lys Val Lys Val Glu Lys Cys Gly Ser Arg Ala Trp Thr Asp Val 65 70 75 80 Phe Glu Leu Ser Arg Leu Leu Lys Glu Gly Asn Thr Lys Trp Glu Asp 85 90 95 Leu Asp Leu Asp Asp Ile Asp Ile Arg Met Lys Trp Ala Gly Leu Phe 100 105 110 His Arg Gly Lys Arg Thr Pro Gly Lys Phe Met Met Arg Leu Lys Val 115

120 125 Pro Asn Gly Glu Leu Asp Ala Arg Gln Leu Arg Phe Leu Ala Ser Ala 130 135 140 Ile Ala Pro Tyr Gly Ala Asp Gly Cys Ala Asp Ile Thr Thr Arg Ala 145 150 155 160 Asn Ile Gln Leu Arg Gly Val Thr Leu Ala Asp Ala Asp Ala Ile Ile 165 170 175 Arg Gly Leu Trp Asp Val Gly Leu Thr Ser Phe Gln Ser Gly Met Asp 180 185 190 Ser Val Arg Asn Leu Thr Gly Asn Pro Ile Ala Gly Val Asp Pro His 195 200 205 Glu Leu Ile Asp Thr Arg Pro Leu Leu Arg Glu Met Glu Ala Met Leu 210 215 220 Phe Asn Asn Gly Lys Gly Arg Glu Glu Phe Ala Asn Leu Pro Arg Lys 225 230 235 240 Leu Asn Ile Cys Ile Ser Ser Thr Arg Asp Asp Phe Pro His Thr His 245 250 255 Ile Asn Asp Val Gly Phe Glu Ala Val Arg Arg Pro Asp Asp Gly Glu 260 265 270 Val Val Phe Asn Val Val Val Gly Gly Phe Phe Ser Ile Lys Arg Asn 275 280 285 Val Met Ser Ile Pro Leu Gly Cys Ser Val Thr Gln Asp Gln Leu Met 290 295 300 Pro Phe Thr Glu Ala Leu Leu Arg Val Phe Arg Asp His Gly Pro Arg 305 310 315 320 Gly Asp Arg Gln Gln Thr Arg Leu Met Trp Met Val Asp Ala Ile Gly 325 330 335 Val Glu Lys Phe Arg Gln Leu Leu Ser Glu Tyr Met Gly Gly Ala Glu 340 345 350 Leu Ala Pro Pro Val His Val His His Glu Gly Pro Trp Glu Arg Arg 355 360 365 Asp Val Leu Gly Val His Pro Gln Lys Gln Pro Gly Leu Asn Trp Val 370 375 380 Gly Ala Cys Val Pro Ala Gly Arg Leu Gln Ala Ala Asp Phe Asp Glu 385 390 395 400 Phe Ala Arg Ile Ala Glu Thr Tyr Gly Asp Gly Thr Val Arg Ile Thr 405 410 415 Cys Glu Glu Asn Val Ile Phe Thr Asn Val Pro Asp Ala Lys Leu Pro 420 425 430 Asp Met Leu Ala Glu Pro Leu Phe Gln Arg Phe Lys Val Asn Pro Gly 435 440 445 Leu Leu Leu Arg Gly Leu Val Ser Cys Thr Gly Asn Gln Phe Cys Gly 450 455 460 Phe Gly Leu Ala Glu Thr Lys Ala Arg Ala Val Lys Val Val Glu Met 465 470 475 480 Leu Glu Glu Gln Leu Glu Leu Thr Arg Pro Val Arg Ile His Phe Thr 485 490 495 Gly Cys Pro Asn Ser Cys Gly Gln Ala Gln Gln Val Gly Asp Ile Gly 500 505 510 Leu Met Gly Ala Pro Ala Lys Leu Asp Gly Lys Ala Val Glu Gly Tyr 515 520 525 Lys Ile Phe Leu Gly Gly Lys Ile Gly Glu Asn Pro Gln Leu Ala Thr 530 535 
540 Glu Phe Ala Gln Gly Ile Pro Ala Val Glu Ser His Leu Val Pro Lys 545 550 555 560 Leu Lys Glu Ile Leu Ile Lys Glu Phe Gly Ala Lys Glu Lys Glu Thr 565 570 575 Ala Val Val Val 580 28594PRTSpinacia oleracea 28Met Ala Ser Leu Pro Val Asn Lys Ile Ile Pro Ser Ser Thr Thr Leu 1 5 10 15 Leu Ser Ser Ser Asn Asn Asn Arg Arg Arg Asn Asn Ser Ser Ile Arg 20 25 30 Cys Gln Lys Ala Val Ser Pro Ala Ala Glu Thr Ala Ala Val Ser Pro 35 40 45 Ser Val Asp Ala Ala Arg Leu Glu Pro Arg Val Glu Glu Arg Asp Gly 50 55 60 Phe Trp Val Leu Lys Glu Glu Phe Arg Ser Gly Ile Asn Pro Ala Glu 65 70 75 80 Lys Val Lys Ile Glu Lys Asp Pro Met Lys Leu Phe Ile Glu Asp Gly 85 90 95 Ile Ser Asp Leu Ala Thr Leu Ser Met Glu Glu Val Asp Lys Ser Lys 100 105 110 His Asn Lys Asp Asp Ile Asp Val Arg Leu Lys Trp Leu Gly Leu Phe 115 120 125 His Arg Arg Lys His His Tyr Gly Arg Phe Met Met Arg Leu Lys Leu 130 135 140 Pro Asn Gly Val Thr Thr Ser Glu Gln Thr Arg Tyr Leu Ala Ser Val 145 150 155 160 Ile Lys Lys Tyr Gly Lys Asp Gly Cys Ala Asp Val Thr Thr Arg Gln 165 170 175 Asn Trp Gln Ile Arg Gly Val Val Leu Pro Asp Val Pro Glu Ile Ile 180 185 190 Lys Gly Leu Glu Ser Val Gly Leu Thr Ser Leu Gln Ser Gly Met Asp 195 200 205 Asn Val Arg Asn Pro Val Gly Asn Pro Leu Ala Gly Ile Asp Pro His 210 215 220 Glu Ile Val Asp Thr Arg Pro Phe Thr Asn Leu Ile Ser Gln Phe Val 225 230 235 240 Thr Ala Asn Ser Arg Gly Asn Leu Ser Ile Thr Asn Leu Pro Arg Lys 245 250 255 Trp Asn Pro Cys Val Ile Gly Ser His Asp Leu Tyr Glu His Pro His 260 265 270 Ile Asn Asp Leu Ala Tyr Met Pro Ala Thr Lys Asn Gly Lys Phe Gly 275 280 285 Phe Asn Leu Leu Val Gly Gly Phe Phe Ser Ile Lys Arg Cys Glu Glu 290 295 300 Ala Ile Pro Leu Asp Ala Trp Val Ser Ala Glu Asp Val Val Pro Val 305 310 315 320 Cys Lys Ala Met Leu Glu Ala Phe Arg Asp Leu Gly Phe Arg Gly Asn 325 330 335 Arg Gln Lys Cys Arg Met Met Trp Leu Ile Asp Glu Leu Gly Met Glu 340 345 350 Ala Phe Arg Gly Glu Val Glu Lys Arg Met Pro Glu Gln Val Leu Glu 355 360 365 Arg Ala Ser Ser Glu Glu Leu Val Gln Lys Asp 
Trp Glu Arg Arg Glu 370 375 380 Tyr Leu Gly Val His Pro Gln Lys Gln Gln Gly Leu Ser Phe Val Gly 385 390 395 400 Leu His Ile Pro Val Gly Arg Leu Gln Ala Asp Glu Met Glu Glu Leu 405 410 415 Ala Arg Ile Ala Asp Val Tyr Gly Ser Gly Glu Leu Arg Leu Thr Val 420 425 430 Glu Gln Asn Ile Ile Ile Pro Asn Val Glu Asn Ser Lys Ile Asp Ser 435 440 445 Leu Leu Asn Glu Pro Leu Leu Lys Glu Arg Tyr Ser Pro Glu Pro Pro 450 455 460 Ile Leu Met Lys Gly Leu Val Ala Cys Thr Gly Ser Gln Phe Cys Gly 465 470 475 480 Gln Ala Ile Ile Glu Thr Lys Ala Arg Ala Leu Lys Val Thr Glu Glu 485 490 495 Val Gln Arg Leu Val Ser Val Thr Arg Pro Val Arg Met His Trp Thr 500 505 510 Gly Cys Pro Asn Ser Cys Gly Gln Val Gln Val Ala Asp Ile Gly Phe 515 520 525 Met Gly Cys Met Thr Arg Asp Glu Asn Gly Lys Pro Cys Glu Gly Ala 530 535 540 Asp Val Phe Val Gly Gly Arg Ile Gly Ser Asp Ser His Leu Gly Asp 545 550 555 560 Ile Tyr Lys Lys Ala Val Pro Cys Lys Asp Leu Val Pro Val Val Ala 565 570 575 Glu Ile Leu Ile Asn Gln Phe Gly Ala Val Pro Arg Glu Arg Glu Glu 580 585 590 Ala Glu 29536PRTNostoc sp. 
29Met Thr Asp Thr Val Thr Thr Pro Lys Ala Ser Leu Asn Lys Phe Glu 1 5 10 15 Lys Phe Lys Ala Glu Lys Asp Gly Leu Ala Ile Lys Ser Glu Ile Glu 20 25 30 Lys Ile Ala Ser Leu Gly Trp Glu Ala Met Asp Ala Thr Asp Arg Asp 35 40 45 His Arg Leu Lys Trp Val Gly Val Phe Phe Arg Pro Val Thr Pro Gly 50 55 60 Lys Phe Met Met Arg Met Arg Met Pro Asn Gly Ile Leu Thr Ser Asp 65 70 75 80 Gln Met Arg Val Leu Ala Glu Val Val Gln Arg Tyr Gly Asp Asp Gly 85 90 95 Asn Ala Asp Ile Thr Thr Arg Gln Asn Ile Gln Leu Arg Gly Ile Arg 100 105 110 Ile Glu Asp Leu Pro His Ile Phe Asn Lys Phe His Ala Val Gly Leu 115 120 125 Thr Ser Val Gln Ser Gly Met Asp Asn Ile Arg Asn Ile Thr Gly Asp 130 135 140 Pro Ile Ala Gly Leu Asp Ala Asp Glu Leu Tyr Asp Thr Arg Glu Leu 145 150 155 160 Val Gln Gln Ile Gln Asp Met Leu Thr Asn Lys Gly Glu Gly Asn Arg 165 170 175 Glu Phe Ser Asn Leu Pro Arg Lys Phe Asn Ile Ala Ile Ala Gly Gly 180 185 190 Arg Asp Asn Ser Val His Ala Glu Ile Asn Asp Leu Ala Phe Val Pro 195 200 205 Ala Phe Lys Glu Gly Ile Gly Asp Trp Val Leu Gly Asn Gly Glu Glu 210 215 220 Ser Ser Thr Tyr Gln Lys Val Phe Gly Phe Asn Val Leu Val Gly Gly 225 230 235 240 Phe Phe Ser Ala Lys Arg Cys Glu Ala Ala Ile Pro Leu Asn Ala Trp 245 250 255 Val Thr Pro Glu Glu Val Leu Pro Leu Cys Arg Ala Ile Leu Glu Val 260 265 270 Tyr Arg Asp Asn Gly Leu Arg Ala Asn Arg Leu Lys Ser Arg Leu Met 275 280 285 Trp Leu Ile Asp Glu Trp Gly Ile Asp Lys Phe Arg Ala Glu Val Glu 290 295 300 Gln Arg Leu Gly Lys Ser Leu Leu Pro Ala Ala Pro Lys Asp Glu Ile 305 310 315 320 Asp Trp Glu Lys Arg Asp His Ile Gly Val Tyr Lys Gln Lys Gln Glu 325 330 335 Gly Leu Asn Tyr Val Gly Leu His Ile Pro Val Gly Arg Leu Tyr Ala 340 345 350 Glu Asp Met Phe Glu Leu Ala Arg Ile Ala Asp Val Tyr Gly Ser Gly 355 360 365 Glu Ile Arg Met Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Ile Thr 370 375 380 Asp Ser Arg Leu Arg Thr Leu Leu Thr Asp Pro Leu Leu Glu Arg Phe 385 390 395 400 Ser Leu Asp Pro Gly Ala Leu Thr Arg Ser Leu Val Ser Cys Thr Gly 405 410 415 Ala Gln Phe Cys Asn 
Phe Ala Leu Ile Glu Thr Lys Asn Arg Ala Leu 420 425 430 Glu Met Ile Lys Gly Leu Glu Ala Glu Leu Thr Phe Thr Arg Pro Val 435 440 445 Arg Ile His Trp Thr Gly Cys Pro Asn Ser Cys Gly Gln Pro Gln Val 450 455 460 Ala Asp Ile Gly Leu Met Gly Thr Lys Ala Arg Lys Asn Gly Lys Ala 465 470 475 480 Val Glu Gly Val Asp Ile Tyr Met Gly Gly Lys Val Gly Lys Asp Ala 485 490 495 His Leu Gly Ser Cys Val Gln Lys Gly Ile Pro Cys Glu Asp Leu His 500 505 510 Leu Val Leu Arg Asp Leu Leu Ile Thr Asn Phe Gly Ala Lys Pro Arg 515 520 525 Gln Glu Ala Leu Val Thr Ser Gln 530 535 30654PRTPlectonema boryanum 30Met Thr Asp Thr Leu Ala Ala Pro Thr Leu Asn Lys Phe Glu Lys Leu 1 5 10 15 Lys Ala Glu Lys Asp Gly Leu Ala Val Lys Ala Glu Leu Glu His Phe 20 25 30 Ala Arg Leu Gly Trp Glu Ala Met Asp Glu Thr Asp Arg Asp His Arg 35 40 45 Leu Lys Trp Leu Gly Val Phe Phe Arg Pro Val Thr Pro Gly Lys Phe 50 55 60 Met Leu Arg Met Arg Val Pro Asn Gly Ile Ile Thr Ser Gly Gln Thr 65 70 75 80 Arg Val Leu Gly Glu Ile Leu Gln Arg Tyr Gly Asp Asp Gly Asn Ala 85 90 95 Asp Ile Thr Thr Arg Gln Asn Phe Gln Leu Arg Gly Ile Arg Ile Glu 100 105 110 Asp Leu Pro Glu Ile Phe Arg Lys Phe Asp Gln Ala Gly Leu Thr Ser 115 120 125 Ile Gln Ser Gly Met Asp Asn Val Arg Asn Ile Thr Gly Ser Pro Val 130 135 140 Ala Gly Ile Asp Ala Asp Glu Leu Ile Asp Thr Arg Gly Leu Val Arg 145 150 155 160 Lys Val Gln Asp Met Ile Thr Asn Asn Gly Arg Gly Asn Ser Ser Phe 165 170 175 Ser Asn Leu Pro Arg Lys Phe Asn Ile Ala Ile Ala Gly Cys Arg Asp 180 185 190 Asn Ser Val His Ala Glu Ile Asn Asp Ile Ala Phe Val Pro Ala Phe 195 200 205 Lys Asp Gly Thr Leu Gly Phe Asn Ile Leu Val Gly Gly Phe Phe Ser 210 215 220 Gly Lys Arg Cys Glu Ala Ala Ile Pro Leu Asn Ala Trp Val Asp Pro 225 230 235 240 Arg Asp Val Val Ala Val Cys Glu Ala Ile Leu Thr Val Tyr Arg Asn 245 250 255 Leu Gly Leu Arg Ala Asn Arg Gln Lys Ala Arg Leu Met Trp Leu Ile 260 265 270 Asp Glu Met Gly Leu Glu Pro Phe Arg Glu Ala Val Glu Lys Gln Leu 275 280 285 Gly Tyr Ala Phe Thr Pro Ala Ala Ala Lys Asp Glu Ile 
Leu Trp Asp 290 295 300 Lys Arg Asp His Ile Gly Ile His Ala Gln Lys Gln Pro Gly Leu Asn 305 310 315 320 Tyr Val Gly Leu His Val Pro Val Gly Arg Leu Tyr Ala Gln Asp Leu 325 330 335 Phe Asp Leu Ala Arg Ile Ala Glu Val Tyr Gly Ser Gly Glu Ile Arg 340 345 350 Leu Thr Val Glu Gln Asn Val Ile Ile Pro Asn Val Pro Asp Ser Arg 355 360 365 Val Ser Ala Leu Leu Arg Glu Pro Ile Val Lys Arg Phe Ser Ile Glu 370 375 380 Pro Gln Asn Leu Ser Arg Ala Leu Val Ser Cys Thr Gly Ala Gln Phe 385 390 395 400 Cys Asn Phe Ala Leu Ile Glu Thr Lys Asn Arg Ala Val Ala Leu Met 405 410 415 Gln Glu Leu Glu Gln Asp Leu Tyr Cys Pro Arg Pro Val Arg Ile His 420 425 430 Trp Thr Gly Cys Pro Asn Ser Cys Gly Gln Pro Gln Val Ala Asp Ile 435 440 445 Gly Leu Met Gly Thr Lys Val Arg Lys Asp Gly Lys Thr Val Glu Gly 450 455 460 Val Asp Leu Tyr Met Gly Gly Lys Val Gly Lys His Ala Glu Leu Gly 465 470 475 480 Thr Cys Val Arg Lys Ser Ile Pro Cys Glu Asp Leu Lys Pro Ile Leu 485 490 495 Gln Glu Ile Leu Ile Glu Gln Phe Gly Ala Arg Leu Trp Ser Asp Leu 500 505 510 Pro Glu Ser Ala Arg Pro Asn Pro Thr Ala Leu Ile Thr Leu Asp Arg 515 520 525 Pro Thr Val Glu Thr Pro Asn Gly Lys Ser Thr Thr Val Gln Glu Leu 530 535 540 Asn Ala Gln Glu Phe Asp Tyr Val Leu Ser Ala Pro Pro Val Val Lys 545 550 555 560 Ala Pro Thr Glu Ile Ala Ala Pro Ala Thr Ile Arg Phe Ala Gln Ser 565 570 575 Gly Lys Glu Ile Thr Cys Thr Gln Asp Asp Leu Ile Leu Asp Ile Ala 580 585 590 Asp Gln Ala Glu Val Ala Ile Glu Ser Ser Cys Arg Ser Gly Thr Cys 595 600 605 Gly Ser Cys Lys Cys Thr Leu Leu Glu Gly Glu Val Ser Tyr Asp Ser 610 615 620 Glu Pro Asp Val Leu Asp Glu His Asp Arg Ala Ser Gly Gln Ile Leu 625 630 635 640 Thr Cys Ile Ala Arg Pro Val Gly Arg Ile Leu Leu Asp Ala 645 650 31536PRTAnabaena variabilis 31Met Thr Asp Thr Ala Thr Thr Pro Lys Ala Ser Leu Asn Lys Phe Glu 1 5 10 15 Lys Phe Lys Ala Glu Lys Asp Gly Leu Ala Ile Lys Ser Glu Ile Glu 20 25 30 Lys Ile Ala Ser Leu Gly Trp Glu Ala Met Asp Glu Thr Asp Arg Asp 35 40 45 His Arg Leu Lys Trp Val Gly Val Phe Phe Arg 
Pro Val Thr Pro Gly

50 55 60 Lys Phe Met Met Arg Met Arg Met Pro Asn Gly Ile Leu Thr Ser Asp 65 70 75 80 Gln Met Arg Val Leu Ala Glu Val Val Gln Arg Tyr Gly Asp Asp Gly 85 90 95 Asn Ala Asp Ile Thr Thr Arg Gln Asn Ile Gln Leu Arg Gly Ile Arg 100 105 110 Ile Glu Asp Leu Pro His Ile Phe Asn Lys Phe His Ala Val Gly Leu 115 120 125 Thr Ser Val Gln Ser Gly Met Asp Asn Ile Arg Asn Ile Thr Gly Asp 130 135 140 Pro Ile Ala Gly Leu Asp Ala Asp Glu Leu Tyr Asp Thr Arg Glu Leu 145 150 155 160 Val Gln Gln Ile Gln Asp Met Leu Thr Asn Lys Gly Glu Gly Asn Arg 165 170 175 Glu Phe Ser Asn Leu Pro Arg Lys Phe Asn Ile Ala Ile Ala Gly Gly 180 185 190 Arg Asp Asn Ser Val His Ala Glu Ile Asn Asp Leu Ala Phe Val Pro 195 200 205 Ala Phe Lys Glu Gly Ile Gly Asp Trp Val Leu Gly Gly Gly Glu Glu 210 215 220 Ser Ser Thr His Gln Lys Val Phe Gly Phe Asn Val Leu Val Gly Gly 225 230 235 240 Phe Phe Ser Ala Lys Arg Cys Glu Ala Ala Ile Pro Leu Asn Ala Trp 245 250 255 Val Thr Ala Glu Glu Val Val Ala Leu Cys Arg Ala Val Leu Glu Val 260 265 270 Tyr Arg Asp Asn Gly Leu Arg Ala Asn Arg Leu Lys Ser Arg Leu Met 275 280 285 Trp Leu Ile Asp Glu Trp Gly Ile Asp Lys Phe Arg Ala Glu Val Glu 290 295 300 Gln Arg Leu Gly Lys Ser Leu Leu Tyr Ala Ala Pro Lys Asp Glu Ile 305 310 315 320 Asp Trp Glu Lys Arg Asp His Ile Gly Val Tyr Lys Gln Lys Gln Glu 325 330 335 Gly Leu Asn Tyr Val Gly Leu His Ile Pro Val Gly Arg Leu Tyr Ala 340 345 350 Glu Asp Met Phe Glu Leu Ala Arg Ile Ala Asp Val Tyr Gly Ser Gly 355 360 365 Glu Ile Arg Met Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Ile Thr 370 375 380 Asp Ser Arg Leu Lys Thr Leu Leu Thr Asp Pro Leu Leu Glu Arg Phe 385 390 395 400 Ser Leu Asp Pro Gly Ala Leu Thr Arg Ser Leu Val Ser Cys Thr Gly 405 410 415 Ala Gln Phe Cys Asn Phe Ala Leu Ile Glu Thr Lys Asn Arg Ala Leu 420 425 430 Glu Met Ile Lys Gly Leu Glu Ala Glu Leu Thr Phe Thr Arg Pro Val 435 440 445 Arg Ile His Trp Thr Gly Cys Pro Asn Ser Cys Gly Gln Pro Gln Val 450 455 460 Ala Asp Ile Gly Leu Met Gly Thr Lys Ala Arg Lys Asn Gly Lys Ala 465 470 475 
480 Val Glu Gly Val Asp Ile Tyr Met Gly Gly Lys Val Gly Lys Asp Ala 485 490 495 His Leu Gly Ser Cys Val Gln Lys Gly Ile Pro Cys Glu Asp Leu His 500 505 510 Leu Val Leu Arg Asp Leu Leu Ile Thr Asn Phe Gly Ala Lys Pro Arg 515 520 525 Gln Glu Ala Leu Val Ser Ser Gln 530 535 32515PRTSynechococcus sp. 32Met Ala Asn Gln Phe Glu Arg Leu Lys Ser Glu Lys Asp Gly Leu Ala 1 5 10 15 Val Lys Ala Glu Leu Glu Ala Phe Ala Arg Met Gly Trp Glu Asn Ile 20 25 30 Pro Glu Asp Asp Arg Asp His Arg Leu Lys Trp Leu Gly Ile Phe Phe 35 40 45 Arg Lys Arg Thr Pro Gly Gln Phe Met Leu Arg Leu Arg Leu Pro Asn 50 55 60 Gly Ile Leu Thr Ser Gly Gln Met Arg Met Leu Gly Ala Ile Ile His 65 70 75 80 Pro Tyr Gly Glu Gln Gly Val Ala Asp Ile Thr Thr Arg Gln Asn Leu 85 90 95 Gln Leu Arg Gly Ile Pro Ile Glu Glu Met Pro Gln Ile Leu Gly Tyr 100 105 110 Leu Lys Glu Val Gly Leu Thr Ser Ile Gln Ser Gly Met Asp Asn Val 115 120 125 Arg Asn Ile Thr Gly Ser Pro Leu Ala Gly Ile Asp Pro Asp Glu Leu 130 135 140 Ile Asp Val Arg Gly Leu Thr Arg Lys Val Gln Asp Met Val Thr Asn 145 150 155 160 Asn Gly Glu Gly Asn Pro Ser Phe Ser Asn Leu Pro Arg Lys Phe Asn 165 170 175 Ile Ala Ile Cys Gly Cys Arg Asp Asn Ser Val His Ala Glu Ile Asn 180 185 190 Asp Leu Ala Phe Val Pro Ala Phe Lys Asn Gly Arg Leu Gly Phe Asn 195 200 205 Val Leu Val Gly Gly Phe Phe Ser Ala Arg Arg Cys Ala Glu Ala Ile 210 215 220 Gly Leu Asp Val Trp Val Asp Pro Arg Asp Val Val Pro Leu Cys Glu 225 230 235 240 Ala Val Leu Leu Val Tyr Arg Asp His Gly Leu Arg Ala Asn Arg Gln 245 250 255 Lys Ala Arg Leu Met Trp Leu Ile Asp Glu Trp Gly Leu Glu Lys Phe 260 265 270 Arg Ala Ala Val Glu Arg Gln Ile Gly His Pro Leu Pro Arg Ala Ala 275 280 285 Glu Lys Asp Glu Val Val Trp His Lys Arg Asp Leu Leu Gly Val His 290 295 300 Ala Gln Lys Gln Pro Gly Leu Asn Phe Val Gly Leu His Val Pro Val 305 310 315 320 Gly Arg Leu Asn Ala Leu Glu Met Met Glu Leu Ala Arg Leu Ala Glu 325 330 335 Val Tyr Gly Ser Gly Glu Leu Arg Leu Thr Val Glu Gln Asn Val Leu 340 345 350 Ile Pro Asn Val Pro Asp Ser 
Arg Val Ala Pro Leu Leu Lys Glu Pro 355 360 365 Leu Leu Lys Lys Phe Ser Pro Asn Pro Gly Pro Leu Gln Arg Gly Leu 370 375 380 Val Ser Cys Thr Gly Asn Gln Phe Cys Asn Phe Ala Leu Ile Glu Thr 385 390 395 400 Lys Asn Arg Ala Val Ala Leu Met Glu Glu Leu Glu Ala Glu Leu Glu 405 410 415 Ile Pro Gln Thr Val Arg Ile His Trp Thr Gly Cys Pro Asn Ser Cys 420 425 430 Gly Gln Pro Gln Val Ala Asp Ile Gly Leu Met Gly Thr Thr Ala Arg 435 440 445 Lys Asp Gly Arg Val Val Glu Ala Val Asp Ile Tyr Met Gly Gly Glu 450 455 460 Val Gly Lys Asp Ala Lys Leu Gly Glu Cys Val Arg Lys Gly Ile Pro 465 470 475 480 Cys Glu Asp Leu Lys Pro Val Leu Val Glu Leu Leu Ile Glu His Phe 485 490 495 Gly Ala Lys Pro Arg Gln His Pro Ser Ala Ala Gln Ala Ser Val Leu 500 505 510 Val Thr Arg 515 33642PRTArabidopsis thaliana 33Met Ser Ser Thr Phe Arg Ala Pro Ala Gly Ala Ala Thr Val Phe Thr 1 5 10 15 Ala Asp Gln Lys Ile Arg Leu Gly Arg Leu Asp Ala Leu Arg Ser Ser 20 25 30 His Ser Val Phe Leu Gly Arg Tyr Gly Arg Gly Gly Val Pro Val Pro 35 40 45 Pro Ser Ala Ser Ser Ser Ser Ser Ser Pro Ile Gln Ala Val Ser Thr 50 55 60 Pro Ala Lys Pro Glu Thr Ala Thr Lys Arg Ser Lys Val Glu Ile Ile 65 70 75 80 Lys Glu Lys Ser Asn Phe Ile Arg Tyr Pro Leu Asn Glu Glu Leu Leu 85 90 95 Thr Glu Ala Pro Asn Val Asn Glu Ser Ala Val Gln Leu Ile Lys Phe 100 105 110 His Gly Ser Tyr Gln Gln Tyr Asn Arg Glu Glu Arg Gly Gly Arg Ser 115 120 125 Tyr Ser Phe Met Leu Arg Thr Lys Asn Pro Ser Gly Lys Val Pro Asn 130 135 140 Gln Leu Tyr Leu Thr Met Asp Asp Leu Ala Asp Glu Phe Gly Ile Gly 145 150 155 160 Thr Leu Arg Leu Thr Thr Arg Gln Thr Phe Gln Leu His Gly Val Leu 165 170 175 Lys Gln Asn Leu Lys Thr Val Met Ser Ser Ile Ile Lys Asn Met Gly 180 185 190 Ser Thr Leu Gly Ala Cys Gly Asp Leu Asn Arg Asn Val Leu Ala Pro 195 200 205 Ala Ala Pro Tyr Val Lys Lys Asp Tyr Leu Phe Ala Gln Glu Thr Ala 210 215 220 Asp Asn Ile Ala Ala Leu Leu Ser Pro Gln Ser Gly Phe Tyr Tyr Asp 225 230 235 240 Met Trp Val Asp Gly Glu Gln Phe Met Thr Ala Glu Pro Pro Glu Val 245 250 255 
Val Lys Ala Arg Asn Asp Asn Ser His Gly Thr Asn Phe Val Asp Ser 260 265 270 Pro Glu Pro Ile Tyr Gly Thr Gln Phe Leu Pro Arg Lys Phe Lys Val 275 280 285 Ala Val Thr Val Pro Thr Asp Asn Ser Val Asp Leu Leu Thr Asn Asp 290 295 300 Ile Gly Val Val Val Val Ser Asp Glu Asn Gly Glu Pro Gln Gly Phe 305 310 315 320 Asn Ile Tyr Val Gly Gly Gly Met Gly Arg Thr His Arg Met Glu Ser 325 330 335 Thr Phe Ala Arg Leu Ala Glu Pro Ile Gly Tyr Val Pro Lys Glu Asp 340 345 350 Ile Leu Tyr Ala Val Lys Ala Ile Val Val Thr Gln Arg Glu His Gly 355 360 365 Arg Arg Asp Asp Arg Lys Tyr Ser Arg Met Lys Tyr Leu Ile Ser Ser 370 375 380 Trp Gly Ile Glu Lys Phe Arg Asp Val Val Glu Gln Tyr Tyr Gly Lys 385 390 395 400 Lys Phe Glu Pro Ser Arg Glu Leu Pro Glu Trp Glu Phe Lys Ser Tyr 405 410 415 Leu Gly Trp His Glu Gln Gly Asp Gly Ala Trp Phe Cys Gly Leu His 420 425 430 Val Asp Ser Gly Arg Val Gly Gly Ile Met Lys Lys Thr Leu Arg Glu 435 440 445 Val Ile Glu Lys Tyr Lys Ile Asp Val Arg Ile Thr Pro Asn Gln Asn 450 455 460 Ile Val Leu Cys Asp Ile Lys Thr Glu Trp Lys Arg Pro Ile Thr Thr 465 470 475 480 Val Leu Ala Gln Ala Gly Leu Leu Gln Pro Glu Phe Val Asp Pro Leu 485 490 495 Asn Gln Thr Ala Met Ala Cys Pro Ala Phe Pro Leu Cys Pro Leu Ala 500 505 510 Ile Thr Glu Ala Glu Arg Gly Ile Pro Ser Ile Leu Lys Arg Val Arg 515 520 525 Ala Met Phe Glu Lys Val Gly Leu Asp Tyr Asp Glu Ser Val Val Ile 530 535 540 Arg Val Thr Gly Cys Pro Asn Gly Cys Ala Arg Pro Tyr Met Ala Glu 545 550 555 560 Leu Gly Leu Val Gly Asp Gly Pro Asn Ser Tyr Gln Val Trp Leu Gly 565 570 575 Gly Thr Pro Asn Leu Thr Gln Ile Ala Arg Ser Phe Met Asp Lys Val 580 585 590 Lys Val His Asp Leu Glu Lys Val Cys Glu Pro Leu Phe Tyr His Trp 595 600 605 Lys Leu Glu Arg Gln Thr Lys Glu Ser Phe Gly Glu Tyr Thr Thr Arg 610 615 620 Met Gly Phe Glu Lys Leu Lys Glu Leu Ile Asp Thr Tyr Lys Gly Val 625 630 635 640 Ser Gln 342041DNAAquilegia formosa 34ctgatccaag aatgaactct gcagactttc ttctaccttt ctttcaacaa tggcttcatt 60acagtttctt gcaccttcat catcaccttt gcaatccaac 
cgactcatgg ttcgagccac 120tagtagtact agtccatcag tcaaccagac catggttgca ccagacttat caagattgga 180accaagagtt gaagaaagag aaggtggtta ttgggttttg aaagagaaat atagagagaa 240aataaatcca caagagaaaa tcaaaataga gaaagaacca atgaagtttg ttactgaagg 300tggtatacat gaattagcaa aaactccatt tgaagaactt gagaaagcta aacttactaa 360agatgatatt gatgttagac tcaagtggct tggtcttttt catagaagaa aaaatcatta 420tggtagattt atgatgagat tgaagttgcc taatggagtt acaactagtg aacaaacgcg 480atatcttgcg agtgttatta gaaggtatgg aaaggatgga tgtgctgatg ttacaactag 540acagaactgg caaattcgcg gtgttgagtt acctcatgtg cctgagataa tgaaaggatt 600aaatcaagtt ggattaacta gtcttcagag tggtatggat aatgtgcgta atcctgttgg 660taatccactt gctggtattg acccactaga gattgtcgat actagaccct acaatgatca 720gctatctcga tttattactg gcaattttaa agggaacctg gcttttacta atctgccgag 780gaaatggaat gtatgtgtgg tgggctctca tgatcttttt gagcatcccc acatcaatga 840tcttgcttac atgccagcca caaagaatgg ccgttttggg tttaatctgt tagtaggtgg 900tttcttcagt ccaaaaagat gtgcagaggc aattcctctc gatgcctggg tttcaggaga 960agacgtgatc ccagtttgca aagctatact tgaggcatac agagatcttg gcaccagagg 1020aaaccgacag aaaacacgaa tgatgtggtt gattgatgaa cttggggtag aaggatttag 1080gtcagaagtg gtgaaaagga tgcctgaaca agagctggag agatcttcca ctgaagagtt 1140ggttcaaaag caatgggaga ggagagatct aatcggtgtc catgcgcaaa agcaggcagg 1200ctacagtttt gttggtctcc acataccagt aggcaggctt caggctgatg acatggatga 1260actagcccgg atagctgatg agtatggctc aggggagctc cgtctcactg tggaacaaaa 1320tatcataatt cctaatgttg agaactcaag agttgaagct ttgctgaagg aagccctatt 1380gagggacagg ttttcaccca ctccacctct tctaatgaaa ggacttgtgg cctgcacagg 1440caaccagttc tgtggacaag ccatcattga gacaaaggca cgagcactga aggtgacaga 1500agaggttgaa agactggtgg cagtgactaa accagtaaga atgcattgga caggatgccc 1560aaacacctgc gcgcaggtgc aagtagctga tattgggttc atggggtgca tggcaagaga 1620tgaaaacggg aaaccgtgtg aaggagcaga tgtttactta ggtgggagga ttggtagtga 1680ttctcatttg ggagatatat ataagaaatc tgtgccttgt aaggacttgg ttcctctggt 1740agttgacatc ttgattgagc gctttggagc tgtccctagg gagagagaag aagatggcga 1800agactagatt atcaaattcc 
taaccgaaag ccctttctga ttttaataaa ctaatttgga 1860aggtgaatgc acatagacaa tttggatgaa taaaagccat gcagaagtgg ttctttttgg 1920acttgagttg aggaagcaac tttattgttg tatcagaaga caggttattt taaatttcaa 1980ttcgttctta tgtactcaga atacttggat catatctcta gacattctta atcaccgttt 2040t 2041352472DNABetula pendula 35aaaagctgct agagtatgga aacatgcttg tccaggagca ggacaatgtg aagagagttc 60aactggcaga cacgtacttg agccaagcag ctcttggaga tgcaaacgag gattcgatca 120agcggggaac tttctatggc aaggcaggcc aacaagttaa tgtacccgtt cctgaaggtt 180gcaccgatcc atctgctagt aactttgatc caacagctag gagcgataat ggtagctgcc 240agtattgagg ctaagccatt tctagccttc tacctgctag gctatataaa tgctgtatga 300ggttgggaga actattcatt tccactattg cttgctttct cgatacggag aagtattcct 360aattttgttg taatgaacgt ataattttat cttaatcaca accacgacta aaattaccat 420tacaagcttc agtttattac catgtcgtcg ctctcagtgc gctttctttc acctcccctt 480ttttcttcca cccctgcatg gccaagaaca gggcttgccg ccactcaggc ggtgccaccg 540gttgtggcgg aggtggacgc ggggaggctg gagccgagag tggaggagag agaagggtac 600tgggtgttga aggagaagtt cagagaaggc ataaatcctc aggagaaatt gaagctcgag 660agagagccta tgaagctttt catggaaggt gggatagaag atttggccaa gatgtcgctc 720gaggaaattg acaaggataa gatttcaaag agtgatattg atgtaaggct caagtggctt 780ggtctcttcc ataggagaaa gcatcattat ggtagattta tgatgagact gaagctacct 840aatggggtaa caacaagtgc acaaactcga tacttagcga gtgtgattag gaaatatgga 900aaggacgggt gcgcagatgt gaccaccagg caaaattggc aaattcgtgg tgtggtactg 960tctgatgtgc cagaaatact taaaggtctt gatgaagttg gcttgacaag cctgcagagt 1020ggaatggata atgtgagaaa ccctgttggg aacccccttg caggcattga catacatgag 1080attgttgcta cacggcctta caacaacttg ttatcacaat ttatcactgc taattcgcgc 1140ggtaatctgg ccttcactaa cttgccaagg aagtggaatg tgtgtgtagt gggttctcat 1200gatctctttg agcatcctca catcaatgat cttgcttaca tgcctgctat aaaggatgga 1260aggtttggtt tcaatctgct ggttggtggc ttctttagtc ccaggcgatg tgcagaagca 1320gtccctctcg atgcctgggt ctcagcggat gacataatcc tcgtgtgcaa agccatactg 1380gaggcttata gggatcttgg caccagaggg aacagacaga aaacaagaat gatgtggttg 1440attgatgaac ttggaataga aggattcagg tctgaggtag 
tgaaaagaat gcccaaccaa 1500gagctggaga gagctgctcc tgaagatcta attgagaagc aatgggaaag gagagagtta 1560attggtgtcc atccacagaa acaagaaggc cttagttacg tgggtcttca cattccggtg 1620ggtcgagtcc aagcagatga catggatgaa cttgctcgtt tagccgacac atatggctgt 1680ggcgaacttc ggctcactgt ggagcaaaac atcataattc ccaacattga gaactcaaag 1740ctcgaagcct tactcggaga gcctctattg aaagacagat tttcaccaga accgcctatt 1800ctcatgaaag ggttggtggc ttgcactggc aatcagttct gtgggcaagc cattatagag 1860acaaaggcca gggccttgaa ggtgactgag gaagttcaac ggcaagtggc agtgactcgg 1920ccggttagga tgcactggac aggctgtcca aatagctgtg ggcaggttca agtggctgat 1980attggtttca tggggtgtat ggcaagggat gagaatggga agccttgtga aggtgctgct 2040gtttttctgg gaggcagaat tgggagcgac tcacatttgg gaaatcttta caaaaagggt 2100gttccttgca agaacttggt gccattggta gtggacattc ttgttaaaca ttttggagct 2160gtaccaaggg agagggaaga
gagcgaggat tgattcaaac agcaagatta cttcttcttt 2220taccattttg gatgactccc tgcaaagcat ttgttctggg agagggaacg tgatgcatca 2280aagaaatcct tatgggacta aaatttgtga gagggaggca cattttagtg ctatacccag 2340cttttaacat gttggtttta taggtttggt acgctataag tactctgttt gaattaactt 2400atgtattaaa acagctaaga gttgaattgt aatatgaaag taataaaata ggaggctttt 2460ggtgcaaaaa aa 2472362242DNACapsicum annuum 36cccacctcac cccaccttac gactacaaaa atgatcttat ttcgccattt taaccatgac 60cgccacgatc atcaccaccc tcaataatca agaatcaact aaattcctca attccaaatt 120tggcgaaatg gcatcttttt ctgttaaatt ttcagcaact tcttcgctga caagttctaa 180gagattttcc aagcttcatg ccactccacc gcagacagtg gcagtacctc catctggggc 240agtggaggta gctgcagaga gactagagcc tagactggag gaaagagatg ggtattgggt 300acttaaggaa aagttcagaa aaggcataaa tcctgctgaa aaggccaaga ttgaaaagga 360acctatgaaa ttgttcactg aaaatggtat tgaagatatt gctaagatct cacttgaaga 420gatcgaaaaa tctaagcttg ctaaggatga tattgatgtt aggctcaagt ggcttggcct 480cttccatagg agaaagcatc aatatggacg attcatgatg cgactgaagc ttccaaatgg 540gataacgacg agtgcccaaa ctcgatattt agcaagtgtg attaggaaat atgggaaaga 600tggatgtgca gatgtgacta caaggcaaaa ttggcagatt cgtggggttg tgctacctga 660tgtgcctgag attctaaagg gactggatga agttggcttg accagtctgc aaagtggcat 720ggacaatgtt agaaatcccg tggggaaccc tctggcgggg attgatccac aagaaattgt 780ggacacaagg ccttacgcta atttgctatc caatttgcta tcccaatatg tcactgccaa 840ttttcgtggc aatctgtccg tgcataactt gccaaggaag tggaatgtat gtgtaatagg 900gtcacacgat ctttatgagc atccccatat caatgatctt gcctatatgc ctgcaacgaa 960agatggacga tttggattca acctgcttgt gggtggattc ttcagtccga agcgatgtgc 1020agaggcaatt cctcttgatg catgggttcc agctgatgat gtagtccctg tttgcaaaac 1080aatattagaa gcttatagag atcttggtac cagagggaac aggcagaaaa caagaatgat 1140gtggttaatt gacgaactgg gtgttgaagg attcagggca gaagttgtga agagaatgcc 1200tcaaaagaag ctagagagag aatccacaga ggatttggtg cagaaacaat gggaaaggag 1260agagtatctt ggggttaatc cacagaaaca ggaaggttac agctttgttg gtcttcacat 1320tccagtgggt cgtgtccaag cagatgacat ggatgagctt gctcgtttag cagaagagta 1380tggttcagga gagctccggc tgactgttga 
gcaaaacatc attattccga acattgagaa 1440ctcaaagatt gatgcattgc tcaatgaacc tcttctgaaa cagatttcac ccgatccacc 1500tattctcatg agaaatttgg tggcttgtac tggtaaccaa ttctgtgggc aagccataat 1560cgagactaaa gcacgttcaa tgaagataac tgaggaggtt caacggctag tctctgtgac 1620tcagcccgtg aggatgcact ggactggttg cccaaattca tgtggacaag ttcaagttgc 1680agatatcgga tttatgggat gcctgacaag aaaggaagga aagacagtgg aaggcgctga 1740tgttttcttg ggtggcagaa tagggactga ctcacacttg ggagatattt ataagaagtc 1800tgtcccctgt gaagatttgg taccaataat tgtggactta ctagttaaca actttggtgc 1860tgttccaaga gagagagaag aagcagaaga ttaatctcaa catttcagaa tcagctcgtg 1920gctttactca acatagtaaa ttggacgttg atggaatgtg cttaccatat taagatattt 1980ccaaggtaca gaactggtgg agctgttgtt ggaagttagt agaataatca gaacatgagc 2040tgttcttgac atgctatgtg tgacattcca cgatgcaaat acttgtactt gtttcagaat 2100attcacccgg tgtattgttt tggaaaagag ctgatccaaa ctaaaaggtt tttgaattgt 2160gggattccta ataatagatt ttttaaaaat gtaatttaat aatcatacat ttcaattttt 2220acctattatt atattctttg tt 2242372459DNAChlamydomonas reinhardtii 37ttgcatcgtt atctccttcg accaccttga attgcctgcg ggccccttga cctcatccga 60cgcagccatg cttctgcacg cgccgcatgt taagcccctg gggcagcgta gttcgatacg 120gcgtggaaat ttggtggttg cgaacgtagc gtgcacggcg ggcaagaacc cgacgtcgcg 180gccagcgaaa cgctccaagg tggagttcat caaggagaac agcgaccacc tgcgccaccc 240gctcatggaa gagctggtga atgacgagac attcatcacc gaggactcgg tgcagctgat 300gaaatttcac ggctcctacc aacaagacaa ccgtgagaaa cgcgccttcg gccaaggcaa 360agcttactca ttcctgatgc ggactcggca gcccgctggc gttgtgccca accggctcta 420cctggtgatg gacgacctcg ccgaccagtt cggcaacggc acgctgcgcc tgaccacgcg 480ccaggcctac cagctgcacg gcgtgctgaa gaaggacctc aagacggtgt tcagctccgt 540catcaagaac atgggatcca cactggccgc atgcggcgac gtcaaccgca acgtgatggg 600gcccgcagcg cccttcacca accgccccga ctacctggcc gcccagaagg cggcgctgga 660cctggcggat ctgctaacgc cgcagtcggg cgcctactac gacgtgtggc tggacggcga 720gaagttcatg agcagctaca aggaggaccc cgctgtgacc gaggcccgtg ccttcaacgg 780cttcggaacc aatttcgaca acagccccga gcccatctac ggctcccagt acctcccccg 840caagttcaag atcgccacca 
cggtgcctgg tgacaacagt gtggacctgt tcactcagga 900cctgggcgtg gtggttcagg gctacaacct gtatgtgggc ggtgggcagg gccgcagcca 960cagagacgca gacaccttcc cgcgcctggc ggacccgctg ggctacgtgg ccgccgccga 1020cctgttcgcc gcggccaagg cggtggtggc ggtgttccgc gactacggcc gccgtgacaa 1080ccgcaagcag gcgcgaacac ggcacatgct ggcggagtgg ggcgtggaca agttccgctc 1140ggtggcggag cagtacctgg gcaagcgctt ccaggagccg gtgccgctgc cgccctggca 1200gtacaaggac tacctgggct ggggcgagca gggcgacggg cggctgtact gcggcgtgta 1260tgtgcagaac gggcgcatca agggcgaggc caagcgggcg ctgcgtgcgg ccattgagcg 1320ctacagcctg ccggtggtac tcacgccgca ccagaacctg gtcctgcggg acgtgcggcc 1380cgaggaccgg gaggacattg agcagctgct gcgggccggc ggcgtcaagg agctggtgga 1440gtgggacggg ctggaccggc tgtccatggc ctgccccgcg ctgccgctgt gcggcctggc 1500ggtcacggag gcggagcggg cgctgccgga cgtcaacacg cgcatccggg ccatgttgac 1560acgggcgggc ctgcctccct cccagccgct gcacgtgcgc atgacgggct gccccaacgg 1620ctgcgtgcgg ccctacatgg ccgagttggg gctggtgggc gacggaccca acagctacca 1680gctgtggctg ggcggcgggc cggcgcagac acgcctggcg cagccgtacg cggagagggt 1740caaggtgaag gacttggagt ccacgctgga gcccctgttt ggcgcctgga gggccgggcg 1800ccagccggac gaggcctttg gagattgggt ggcgcggctc ggatttgacg ccgtgcggca 1860gcaggcggcg gcggcggcgg cggcggctcc tgtcggcacc gcgtgaggcg gcggctcggg 1920gctttcccgg tgcaaacgta cgtgcgtgcg tatgcgtgtt tacgtgtgtg taagtatgta 1980tctgtgtatg tgtaccgtat gtgtacgaga agcgaaaatg gtggacgacg actgcacagt 2040cgcagcaccg gcggcttgtg gggtaggctg tggctacctc tcgcaatgcg gccacgtaat 2100ggtattgcaa aatgcccctg cgtcaatgat aagagattgc gtattcatgc acgtgactga 2160ggagaaacgg ttcacaacga aaccctgcag cccggcaatg ccatgttcta gataggtcac 2220gcacgcaatc cgcatgcagc gcggtcttcg tatgtactat gtagcactac cctgtgcgca 2280gtgcaccatt tatatgcttt gctagcagca agcggttttg cttgaggttc cttttgcctg 2340gattcgcctg ccagccctcc gggagctagg ggtgctctgt agcgatcatg caaaagtaag 2400atgagttctg tttgggttgc gcggaagtgc tgaggcgctc ttgtgcaata cgagtacgg 2459383102DNAChlamydomonas reinhardtii 38cttgtaactt gacaaccaag gacaaccaag gaccagccgc ttataatcac tagggttgcg 60ctccagtcgg tgtcttgtga 
gcgttgattc ctcgctgaaa gctttatctt gagcaccata 120ctagttgagt cgtgattgca ttcgcaaggg caaaataacc cgaggcttgt gactacaatc 180aacaaacggc aatgcagtcg cgccagtgct tgaaccgcaa ggccagcggc gcgcggccct 240gcgctaactc gcgcagcctc acagctcgcg tactcgctac ggccgcgcct gtcgcgccgt 300ccgccacacc cgcctccgcc cccctgcccc tccccgatgg cgttggcgag cacagcggcc 360tgaagcacct gcccgaggcc gcccgcactc gtgcgctcga caagaaggcc aacaagtttg 420agaaggttaa ggtcgagaag tgcggctcgc gcgcctggaa cgacgtgttt gagctgtctt 480ccctgctgaa ggagggcaag accaagtggg aggaccttaa cctcgatgat gtcgacatcc 540gtctcaagtg ggccggcctg ttccaccgcg gcaagcgcac ccccggcaag ttcatgatgc 600gtctcaaggt gcccaacggc gagctcaccg ccgcgcagct gcgcttcctg gcctcctcca 660tcgcgcccta cggcgctgac ggctgcgccg acatcaccac ccgcgccaac atccagctgc 720gcggcgtcac catggaggac tcggagacgg tcatcaaggg gctgtgggat gtgggcctga 780cgtccttcca gtcgggcatg gactccgtgc gcaacctcac cggcaacccc atcgccggag 840tcgacccaca cgagctggtg gacacgcggc cgctgctgcg cgacatggag gcgatgctgt 900tcaacaacgg caagggccgc gaggagtttg ccaacctgcc gcgcaagctg aacatctgca 960tctcctccac ccgcgacgac ttcccgcaca cccacatcaa cgacgttggc tacgaggccg 1020tggccaagcc caacggcgag gtggtgtaca atgtggtggt gggcggctac ttctccatca 1080agcgcaacat catgtccatc ccgctgggct gctccatcac ccaggaccag ctgatgccct 1140tcactgaggc cctgctgcgc gtgttccggg atcacggccc gcgcggcgac cggcagcaga 1200cgcggctgat gtggctggtg gaggcggtgg gcgtggacaa gttccgccag ctgctgtcgg 1260agtacatggg cggcgccacc ttcggcgagc ccgtgcacgt tcaccacgac cagccctggg 1320agcggcgcaa cctgctgggc gtgcaccggc agaggcaggc cggcctgaac tgggtcggcg 1380cctgcgtgcc cgcgggccgc ctgcacgccg ccgactttga ggagatcgcg gctgtggctg 1440agaagtacgg cgacggcacg gtgcgcatca cgtgcgagga gaacgtgatc ttcaccaacg 1500tgcccgacgc caagctggag gcgatgaagg cggagccgct gttccagcgc ttccccatct 1560tccccggcgt gctgctgtcg ggcatggtgt cctgcaccgg caaccagttc tgcggcttcg 1620gtctggctga gaccaaggcg aaggccgtga aggtggtgga ggcgctggac gcgcagctgg 1680agctgagccg gcccgtgcgc atccacttca ccggctgccc caactcatgc ggccaggcgc 1740aggtgggcga catcgggctg atgggcgcgc ccgccaagca cgagggcaag gccgtggagg 
1800gctacaagat cttcctgggc ggcaagatcg gcgagaaccc cgcgctcgcc accgagttcg 1860cgcagggtgt gccggccatt gagagcgtgc tggtgcctcg gctaaaggag attctgatct 1920ccgagttcgg tgccaaggag cgcgccaccg ccaccgccta agagcgtggt gtcacgagcg 1980tggcggcagt ggaacgtgct tgcagcgttg gtgtttggag cgagctcctc agagcgtgag 2040tgccttgttg aacacgccgg cgttgcgtga tgggaaggtg ggattggtgg tcgccctgag 2100gtgcatgaag catgcagggc aggggagtgg gattggttgg agaggaaaat gagtaggagt 2160gatgcgcacc tgcggctgcc tatataacat aaggaagtaa gcgtgatgga tgcacgggct 2220gtgttttgct tgaagcggca gagccctgca ggagccagac ggccgacatg tactgctaag 2280gcaggagcca gttcctgcgt tgagaagaag cgtgcttgct tgccggcgga ggccgtcttg 2340cgtgccatac cagggcacgg cagcgctgga agactgcatg cgacgcagcg atcggagcac 2400gctgtggttc tttaccctcg ttttacatat gcgttgtcgt gttccttgtg tgtatgtacg 2460tgtgtgtgtg tgtgtgtgta cggtgtgtat acggcgtgcg gggcaggcag gcggaggctg 2520caaagggagc gcagatgcgc atccttaggg aaaggtacgt aggagccgcc gctgcgtgta 2580tgtatgtact agcagctaga tatgcacgtg gtgacctgca gcgctgtgct caatgcgtgc 2640tgtggcacca gcgcaggggc aagaagcgta ggcatttcgg tagtacggta ttgtgtgcgc 2700gtgctggcgc tgggaggcgg tgcagtggtg caggtttgtt ggcgccggcc gctgcacctg 2760ctgcgcttgc gactaggcag gcgccgtacg gtaatagggg ttgaggcaca ttgcgcatgc 2820atttgtctag ataatggtat gcggcgccga caagtggcaa ctagcgttag ggtggcttgt 2880ctgtactaaa ccacggccca taccgcagtg cggcgtgtgg ctgcaacacc cgtgccggcg 2940tgtaggagga gctgacgtgt gatctagagt gaataccaat ggtactggaa gaggtaacag 3000actttgcgac gagcgttgca atgcgaggcg cccgccgggg caggcgtgca cacaaccacc 3060tagatggctg catcccgggc gaatgtaaca acaccggaag ga 3102391467DNAChlamydomonas reinhardtii 39ctgtttcgtc acgtcgttat tgaattctat taagtggttt aaccgtaggt agcagccatg 60cttctcaagg gcattacaac cccgatgctg gggcagcagc gccccactcg cggccagctg 120cacgtcgtga acgtggctac gccctccaag aatccctcct ctcgcctggc gaagcgcagc 180aaggtggaga ttattaagga gaagagcgac tacctgcggc acccactcat ggaggagctg 240gttaacgacg ccaccttcat caccgaggac tcggtgcagc tcatgaagtt ccacggctcg 300taccagcagg accaccgcga gaagcgcgcg tttggtcagg gcaaggctta ctgctttatg 360atgcgcacgc gtcagcccgc 
tggtgtcgtg cccaaccgcc tgtacctggt gatggacgac 420ctggccgatc agtacggcaa cggcacgctg cgcctgacta cgcgccaggc ctaccagctg 480cacggcgtgc tgaagaagga cctcaagacg gtgttcagct ccgtcatcaa gaacatggga 540tccaccctgg ccgcctgcgg cgacgtcaac cgcaacgtta tgggcccctc cgcgcccttc 600accaaccgcc ccgactacgt ggccgcccag aaggccgcca acgacatcgc cgacctgctg 660acgccgcagt cgggcgccta ctacgacgtg tggctggacg gcgagaagtt catgtcggct 720tacaaggagg accccaaggt gaccgccgac cgtgcctaca acggcttcgg caccaacttt 780gagaacagcc ccgagcctat ctacggcgcg cagttcctgc cccgcaagtt caaggtggcc 840accacggtgc cgggcgacaa cagcgtggac ctgttcaccc aggacctggg cgtggtggtc 900atcatggacg agagcggcaa ggaggtcaag ggctacaacc tgacggtggg cggcggcatg 960ggccgcacac accgcgacga tgagaccttc ccgcgtctgg ctgacccgct gggctacgtg 1020gacaaggacg acctgttcca cgccgtcaag gcggttgttg cggttcagcg cgactacggc 1080cgccgcgaca accgcaagca ggcgcgcctc aagtacctgg tgggcctgcc cgccgaccag 1140gagctgcacg tgcgcatgac gggctgcccc aacggctgcg cgcggcccta catggccgag 1200ctgggcttcg tgggcgacgg ccccaacagc taccagctct acttcggcgg caacgtcaac 1260cagacgcgcc tggcgcagct gttcgcggac agggtcaagg tgaaggacct ggagtccacg 1320ctggagccca tcttcgccgc ctggaaggcc agccgccggc caaaggagtc gttcggcgac 1380tgggtgtcgc ggccgtccca agatcccaag aatctcagtt ctgtacaaca gggcacgcag 1440cacgagagcg ccgtcgtcgc gcactaa 1467402080DNAGossypium hirsutum 40tatcccttca cttatctttc caccaccaca attccaccag ttccaagctt cttttcaaac 60aacaaaaccc cacatgtctt ccttgtcggt ccgtttcttt gctccacaac agccgttact 120gccgtccaca gcttcctctt tcaagcccaa aacatgggtt atggcagctc ccacgacggc 180gccggcgact tcggtggatg tcgacggggg gaggttggaa ccccgagttg aagaacgaga 240ggggtacttc gtgttgaaag agaagttcag agatggcatc aaccctcagg agaaaataaa 300gatcgagaaa gaccctttga agcttttcat ggaagctggg attgatgaac tcgctaagat 360gtcgttcgag gatcttgata aagctaaggc tacaaaggac gacattgatg ttagacttaa 420atggctcggc ttgttccata ggagaaaaca tcaatatggg agatttatga tgagactaaa 480actaccaaat ggtgtaacaa caagtgcaca aacacggtac ttagccagtg tgataaggaa 540atacggcaaa gaagggtgtg ccgatgttac gacaaggcaa aactggcaaa tccgtggagc 600ggtgttgcct 
gatgtgcctg aaatacttaa gggtctcgac gaagtaggct tgacgagcct 660acagagtggc atggacaatg tgaggaaccc tgtcggtaat cctcttgccg gcatcgaccc 720cgaagagatt gtcgatactc gaccttatac caacttgtta tctcagttca tcaccgccaa 780ttcccgcggc aatccggctg ttgccaactt gcctaggaaa tggaatgtct gtgtcgtggg 840gtctcatgat ctttacgaac atccccatat caatgatctc gcttatatgc cggcgacgaa 900aaacggacga tttgggttta atttgctggt tggtgggttc tttagtgcca agagatgtga 960tgaggccatt cctcttgatg cttgggtctc agctgatgat gtgattccat tgtgcaaagc 1020tgtgttagaa gcctataggg atcttggata caggggcaat aggcaaaaga ctagaatgat 1080gtggctgatt gatgaactgg gtattgaagt gttcagatca gaagtagcca aaagaatgcc 1140tcagaaagag ttggagagag catctgatga agatttggtt caaaagcaat gggaaaggag 1200agactacctt ggtgtccatc cgcaaaagca agaaggtttc agctacatcg gcattcacat 1260cccagtcggt cgagtccaag ccgacgacat ggacgaacta gcccggttag ccgacacgta 1320tggctcgggc gaattcagac tcactgtgga gcaaaacatc ataatcccca acgttgagaa 1380ctcgaaacta gaagcattac taaacgagcc tctattgaaa gaccggtttt caccccaacc 1440aagtattctc atgaaagggc tagtagcttg tactggtaac cagttttgcg gacaagccat 1500tattgaaaca aaagctagag ccttgaaggt gacggaagag gttgaaaggc tagtgtcggt 1560gagccggccg gtgaggatgc attggaccgg ttgccccaac acgtgtggtc aagtccaagt 1620ggcggatata ggtttcatgg ggtgcatggc aagggatgag aatgggaaac catgtgaagg 1680ggcagacata ttcttgggag ggagaattgg gagtgactca catttaggag agctttataa 1740gaagggtgtc ccttgtaaga acttggtacc tgtagttgct gacattttgg tggaaccctt 1800tggagctgtc cctaggcaaa gggaagaagg ggaagattga ttcaaaatca acttcatttc 1860attccattac ttttatattt gttttatttt ttttttttaa taaccaagaa aaatgaaggg 1920tttgaaagat actggggagg attaaatttg gagaatattg atcaatggca tgatgatgaa 1980gggctttgta ttataaaata tgtaacattt tcagcatatg tattagaata aagttactgg 2040taatatattt tcagttaaaa tttagagatg atcatgtttg 2080411482DNAHordeum vulgare 41accaccatca ccgccacaga gcagcagcag cggcaccacc accaccgcaa ccacaagcag 60catccatggc gtcctcggcc tccctgcaga gcttcctccc gccctcggcc cacgcggcga 120cgtcgtcgtc ccggctccgg cccagccgcg cccgccccgt ccagtgcgct gccgtctccg 180cgccgtcgtc gtcgtcgtcg tccgcatcgc cgtcggcctc ggccgtcccg 
tcggagcggc 240tggagccgcg ggtggagcag cgggagggcg gctactgggt gctcaaggag aagtaccgca 300ccagcctgaa cccgcaggag aaggtgaagc tgggcaagga gcccatggcg ctcttcaccg 360agggcggcat caacgacctc gccaagctgc ccatggagca gatcgacgcc gacaagctca 420ccaaggagga cgtcgacgtg cgcctcaagt ggctcggcct cttccaccgc cgcaagcagc 480agtatgggcg gttcatgatg cggctgaagc tgcccaacgg cgtgacgacg agcgagcaga 540cgaggtacct ggcgagcgtg atcgacaagt acggcgagga ggggtgcgcc gacgtgacga 600cccggcagaa ctggcagatc cgcggcgtga cgctgccgga cgtgccggag atcctggacg 660ggctccgctc cgtcggcctc accagcctgc agagcggcat ggacaacgtg cgcaaccccg 720tcggcagccc gctcgccggc atcgaccccc tcgagatcgt cgacacgcgc ccctacacca 780acctcctctc ctcctacatc accaacaact ccgagggcaa cctcgccatc accaaccttc 840ctaggaagtg gaacgtgtgc gtgatcggca cacatgatct gtacgagcac ccgcacatca 900acgacctggc gtacatgccg gccgagaagg acggcaagtt cgggttcaac ctgctcgtgg 960gcgggttcat cagccccaag aggtggggtg aggccctgcc gctcgacgcc tgggtccccg 1020gcgacgacat catcccggtc tgcaaggccg tcctcgaggc gttccgcgac ctcggcacca 1080ggggcaaccg ccagaagacg cgcatgatgt ggctcatcga cgagctcggg atggaggcgt 1140tccggtcgga gatcgagaag aggatgccca acggcgtgct ggagcgcgcg gcgccggagg 1200acctgatcga caagaagtgg gagaggcgcg actacctcgg cgtgcacccg cagaagcagg 1260aggggctctc cttcgtcggc cttcacgtgc ccgtcggccg gctgcaggcc gcggacatgt 1320tcgagctggc ccgcctcgcc gacgagtacg gctccggcga gctccgcctc acggtggagc 1380agaacatcgt gctgcccaac gtgaagaacg agaaggtgga ggcgctgctg gcggagccgc 1440tgctgcacaa gttctcggcg cacccgtcgc tgctgatgaa gg 1482422092DNALotus japonicus 42tcaccatgtc ttcttccttc tccattcgct tcctcgctcc tccatttccc tccacctctc 60gccccaagtc atgtctctcc gccgccacgc cggctgtggc tccaaccgat gcggcggtgt 120cgaggttgga gcccagagtg gaggagagaa atgggtactg ggttttgaag gaagagcaca 180ggggtggcat taatccgcag gaaaaggtga agctggagaa agagcctatg gcccttttta 240tggaaggtgg gattgatgag ttggctaagg tttctattga agagcttgat agctctaagc 300ttactaagga tgatgttgat gttaggctca aatggcttgg tctttttcat aggagaaagc 360atcagtatgg tagatttatg atgaggctga aacttccaaa tggggtgaca acgagtgcgc 420agacacgata cttggcgagt gtgatcagga 
agtacgggaa agatgggtgt gctgatgtga 480ccacaaggca taattggcaa attcgtggtg tagtgctacc tgatgttcct gaaattctta 540agggccttgc agaggttggc ttgactagtc tgcagagtgg tatggacaat gtaagaaacc 600ctgtgggtaa ccctcttgca ggcattgacc ctgatgagat tgttgatacc cgaccttaca 660cgaacttgtt gtcccatttc atcactgcca attcacgtgg caacccaacc gtctcaaact 720tgccaaggaa gtggaatgta tgcgttgtgg gttctcatga tctctttgag catccccaca 780taaatgatct tgcttacatg cctgctaaca aagatggtcg ttttggattc aacttattgg 840tggggggttt ctttagtccc aagcgatgtg cagaggcaat tccacttgat gcatgggtct 900ctgcagaaga tgtaatccca gtttgtaaag caatcctcga gatgtacagg gatcttggca 960ccagaggaaa cagacagaaa acaagaatga tgtggttgat tgacgaactg gggatagaag 1020tattcaggtc agaggtggta aaaagaatgc cattagggca gcagctggag agagcatccc 1080aggaagatct ggttcagaaa caatgggaaa gaagagatta ctttggtgcc aatccacaga 1140aacaagaggg cttaagctat gttgggattc acattccagt tggtaggatc caagcagatg 1200agatggacga gctggcccgt ctggccgatg aatacggcac tggtgaactg aggctcactg 1260tagagcaaaa cataataatc ccaaatgtgg aaaactcaaa actcagtgcc ctgctcaatg 1320agcctctctt gaaagaaaag ttctcacctg aaccttccct tctaatgaaa acactggtgg 1380catgcactgg tagccaattt tgtgggcaag ccataattga gacaaaggcg agggcattga 1440aggtgactga agaagtggag agactagtgg cagtgactag gcctgtgaga atgcactgga 1500ctgggtgtcc caacacctgc gggcaagtgc aggttgctga tattggtttc atggggtgca
1560tggccagaga tgagaatggt aagcctggtg aaggtgtgga tattttcctg ggagggagga 1620taggaagtga ttcacactta gctgaggttt ataagaaggc tgttccttgc aaggacttgg 1680tgcccatagt ggcagacata ctagtaaaac attttggagc tgtccagagg aatagagaag 1740aaggagatga ttaagttatt taggtttaac ttttgaaatt aaaccttctg ttgtatctat 1800gacaaaatat cattttcttg tccaaaattt ataatagtag taagggtgat caagtgagat 1860ataccacatg tgccaatggg gaaaaaaagt cggatatgaa agttgtaatc ttacatgagt 1920ggttttgaaa ttacatgaca catttttatt gatcggacgg aaaagaagat ccaaacaaat 1980gtgtaagaaa tttttcttag tttctaattt ccactttcta ttcataaata aatgtgtaag 2040ctatggttct tactttgtga catttgttaa aataaatatt ttcacttttt tt 2092432149DNANicotiana tabacum 43atggcatctt tttctgttaa attctcagca acttcattgc caaatcctaa cagattttcc 60aggactgcta agcttcatgc aacaccgccg cagacggtgg cagtaccacc atctggggag 120gcggagatag cttccgagag gctagagcct agagtagagg aaaaagatgg gtattgggta 180ctcaaggaaa aattcagaca agggataaat ccagctgaaa aggccaagat tgagaaagaa 240ccaatgaaat tatttatgga aaatggtatt gaagatcttg ctaagatctc acttgaagag 300atcgaagggt ctaagcttac taaagatgat attgatgtta ggctcaagtg gcttggcctt 360ttccatagga gaaagcatca ttatggccga ttcatgatgc gattgaagct tccaaatggg 420gtaacaacga gtgcccaaac tcgatactta gccagtgtga taaggaaata tggaaaagat 480ggatgtggtg atgtgactac aaggcaaaat tggcagattc gcggggttgt actacctgat 540gtacccgaga ttctaaaggg actggatgaa gttggcttga ccagtctgca aagtggcatg 600gacaacgttc gaaatccggt gggaaatcct ctggcgggga ttgatccaca tgaaattgta 660gacacaaggc cttacactaa tttgctctcc caatatgtta ctgccaattt tcgtggcaat 720ccggctgtta ctaacttgcc aaggaagtgg aatgtatgtg taatagggtc acatgatctt 780tatgagcatc cccatatcaa tgatcttgcc tatatgccgg catcaaaaga tggacgattt 840ggattcaacc tgcttgtggg tggattcttc agtccgaagc gatgtgcaga ggcagttcct 900ctagatgcat gggttccagc tgatgacgtg gtccctgttt gcaaagcaat attagaagct 960tatagagatc ttggtaccag agggaacagg caaaaaacaa gaatgatgtg gttagttgat 1020gaactgggcg ttgaaggatt cagggcagag gtcgtaaaga gaatgcctca acaaaagcta 1080gatagagaat caacagagga cttggttcaa aaacaatggg aaaggagaga ataccttggc 1140gtgcatccgc agaaacaaga 
aggatacagc tttgttggcc ttcacattcc ggtaggtcgt 1200gtccaagcag atgacatgga cgagctagct cgtttagcgg ataactatgg ttcaggagag 1260ctccggttga ctgttgaaca gaacatcatt attcccaacg ttgagaactc aaagatcgag 1320tcattgctca atgagcctct cttaaagaac agattttcga ccaatccacc tattctcatg 1380aaaaatctgg tggcttgtac tggtaaccaa ttttgcgggc aagccataat tgagactaaa 1440gcgcgttcca tgaagataac tgaggaggta caacgactag tttctgtgac aaagccggtg 1500aggatgcatt ggactggttg cccgaattca tgtggacaag ttcaagtcgc ggatattgga 1560tttatgggat gcttgacaag aaaagaagga aaaactgtag aaggtgctga tgtttatttg 1620ggaggcagaa tagggagtga ctcacatttg ggagatgttt ataagaaatc agtaccttgt 1680gaggatttgg tgccaataat tgtggactta ctagttaaca actttggtgc tgttccaaga 1740gaaagagaag aagcagaaga ttaatttcaa gatttcataa cagctcgcgg atcgcgctgc 1800agaattggac attaatggaa tgtgcacacc atatcaagtt atttcgaagg tacagaaatg 1860gtgacactga tcctgaaaac caaggttttc tttattgaaa gttagttgaa taattggtat 1920atgtgccgtt attaacatgc tcatgtgtga tatagcacga cagaaatatt tgtacttgtt 1980tcagaataat tatattgtgt attcttttgg aaaaactgat acaaaccaaa aggcttttaa 2040accacccttc agttgggatt ctaataatcc atctttacat accaattaat catgttgttg 2100tattcttaat catattgtta tattataata atccattcgg tttgatgcc 2149441902DNANicotiana tabacum 44atggcatctt tttctattaa atttctggca ccttcattgc caaatccagc tagattttcc 60aagaatgctg tcaagctcca cgcaacaccg ccgtctgtgg cagcgccgcc aactggtgct 120ccagaggttg ctgctgagag gctagaaccc agagttgagg aaaaagatgg ttattggata 180ctcaaagagc agtttagaaa aggcataaat cctcaagaaa aggtcaagat tgagaagcaa 240cctatgaagt tgttcatgga aaatggtatt gaagagcttg ctaagatacc cattgaagag 300atagatcagt ccaagcttac taaggatgat attgatgtta ggcttaagtg gcttggcctc 360ttccatagga gaaagaacca atatgggcgg ttcatgatga gattgaagct tccaaatgga 420gtaacaacga gtgcacagac tcgatactta gcgagtgtga taaggaaata cgggaaggaa 480ggatgtgctg atattacgac aaggcaaaat tggcagattc gtggagttgt actgcctgat 540gtgccggaga tactaaaggg actagcagaa gttgggttga ccagtttgca gagtggcatg 600gacaatgtca ggaatccagt aggaaatcct ctggctggaa ttgatccaga agaaatagta 660gacacaaggc cttacactaa tttgctctcc caatttatca ctggcaattc 
acgaggcaat 720cccgcagttt ctaacttgcc aaggaagtgg aatccgtgtg tagtaggctc tcatgatctt 780tatgagcatc cccatatcaa cgatctcgcg tacatgcctg ccacgaaaga cgggcgattt 840ggattcaacc tgcttgtggg agggttcttc agtgcaaaaa gatgtgatga ggcaattcct 900cttgatgcat gggttccagc cgatgatgtt gttccggttt gcaaagcaat actggaagct 960tttagagatc ttggtttcag agggaacaga cagaaatgta gaatgatgtg gttaatcgat 1020gaactgggtg tagaaggatt cagggcagag gtcgagaaga gaatgccaca gcaacaacta 1080gagagagcat ctccagagga cttggttcag aaacaatggg aaagaagaga ttatcttggt 1140gtacatccac aaaaacaaga aggctacagc tttattggtc ttcacattcc agtgggtcgt 1200gttcaagcag acgatatgga tgagctagct cgtttagctg atgagtatgg ttcaggagag 1260atccggctta ctgtggaaca aaacattatt attcccaaca ttgagaactc aaagattgag 1320gcactgctca aagagcctgt tctgagcaca ttttcacctg atccacctat tctcatgaaa 1380ggtttagtgg cttgtactgg taaccagttt tgtggacaag ccataatcga gactaaagct 1440cgttccctga tgataactga agaggttcaa cggcaagttt ctttgacacg gccagtgagg 1500atgcactgga caggctgccc gaatacgtgt gcacaagttc aagttgcgga cattggattc 1560atgggatgcc tgactagaga taagaatgga aagactgtgg aaggcgccga tgttttctta 1620ggaggcagaa tagggagtga ttcacatttg ggagaagtat ataagaaggc tgttccttgt 1680gatgatttgg taccacttgt tgtggactta ctagttaaca actttggtgc agttccacga 1740gaaagagaag aaacagaaga ctaataaaat ttagaatagt tggtgatttt gctgtgttca 1800taacatgtaa tgtatgataa atcaatgcaa acatttctac ctacgtgaga attattacat 1860gctacatata ttcttttgaa gaaaattaca tgcgtactcc tc 1902451755DNANicotiana tabacum 45atggcatctt tttctgttaa attctcagct acttcattac caaatcataa aagattttca 60aagctacatg caacaccgcc gcagacggtg gctgtagccc catctggggc ggcggagata 120gcatcggaga ggttagagcc tagagtagaa gaaaaagatg ggtattgggt acttaaggaa 180aaattcagac aagggataaa tccagctgaa aaagctaaga ttgagaagga accaatgaaa 240ttgtttatgg aaaatggtat tgaagatcta gctaagatct cacttgaaga gatcgaaggg 300tctaagctta ctaaagatga tattgatgtt aggctcaagt ggcttggcct tttccatagg 360agaaagcatc actatggccg attcatgatg agattgaagc ttccaaatgg ggtaacaacg 420agttcccaaa ctcgatactt agccagtgtg ataaggaaat atgggaaaga tggatgtgct 480gatgtgacga caaggcaaaa 
ttggcagatt cgtggggttg tactacctga tgtacccgag 540attctaaagg gactggatga agttggctta accagtctgc agagtggcat ggacaatgtt 600agaaatccgg tgggaaatcc tctggcgggg attgatccac atgaaattgt agacacaagg 660ccttacacta atttgctctc ccaatatgtt actgccaatt ttcgtggcaa tccggctgtg 720actaacttgc caaggaagtg gaatgtatgt gtaatagggt cacacgatct ttatgagcat 780ccccagatca acgatcttgc ctatatgccg gcaacaaaag atggacgatt tggattcaac 840ctgcttgtgg gtggattctt cagtccgaag cgatgtgcag aggcagttcc tcttgatgca 900tgggttccag ctgatgacgt agtccctgtt tgcaaagcaa tattagaagc ttatagagat 960cttggcacca gagggaacag gcagaaaaca agaatgatgt ggttagttga tgaactgggc 1020gttgaaggat tcagggcaga ggttgtaaag agaatgcctc aacaaaagct agatagagaa 1080tcaacagagg acttggttca aaaacaatgg gaaaggagag aataccttgg cgtgcatcca 1140cagaaacaag aagggtacag ctttgttggt cttcacattc cagtgggtcg tgtccaagca 1200gatgacatgg acgagctagc tcgtttggcc gatgagtatg gttccggaga gctccggctg 1260actgttgaac aaaacatcat tattcccaat gttaagaact caaagatcga ggcattgctc 1320aatgaacctc tcttaaagaa cagattttca accgatccac ctattctcat gaaaaatttg 1380gtcgcttgta ctggtaacca attttgcggg aaagccataa ttgagactaa ggcacgatcc 1440atgaaaataa ctgaggaggt tcaactacta gtttctataa cgcagcctgt gaggatgcat 1500tggactggtt gcccgaattc atgtgcacaa gttcaggtcg cggatattgg atttatggga 1560tgcttgacaa gaaaagaagg aaaaactgta gaaggtgctg atgtttattt gggaggcaga 1620atagggagtg actcacattt gggagatgtt tataagaaat cagtaccttg tgaggatttg 1680gtgccaataa ttgtggactt actagttgac aactttggtg ctgttccaag agaaagagaa 1740gaagcagaag attaa 1755462496DNAOryza sativa 46gaaccttatc tccttctctc tcgtcgcttt ctgcgtctcc ccgtctctcc ttcgccaaca 60gccgagaaga ggcagagaga gcgccgcccc ccgtccctct ctctccctct cgtcctcgcc 120cccatccctc tcgtctttcc cttgccggca gcagaggagg cggcagcgac ggcttcagct 180gctcccacgg gccggatcgg gcagtggcgg tggcgtcggc ggcttccgct ggcgaatccg 240gcgggtggat acaaatcagt gttccgatag gtaaaaccct gctctcagca tctgcccttt 300tgaattcgcc aagagccagc atctgccctt ttgaattcgc caagggccag catctgccca 360tttgattttg aattcgccaa gagccagcaa cagcgccccc gcgccccctc cctcctccgc 420aataaacagc cacacgcgcc gcccccatgt 
ccaccctcat cgccacagcg caccaccacc 480accaccacca ccaccaccac caccgtctcc agccatggcc tcctccgcct ccctgcagcg 540cttcctcccc ccgtaccccc acgcggcagc atcccgctgc cgccctcccg gcgtccgcgc 600ccgccccgtg cagtcgtcga cggtgtccgc accgtcctcc tcgactccgg cggcggacga 660ggccgtgtcg gcggagcggc tggagccgcg ggtggagcag cgggagggcc ggtactgggt 720gctcaaggag aagtaccgga cggggctgaa cccgcaggag aaggtgaagc tggggaagga 780gcccatgtca ttgttcatgg agggcggcat caaggagctc gccaagatgc ccatggagga 840gatcgaggcc gacaagctct ccaaggagga catcgacgtg cggctcaagt ggctcggcct 900cttccaccgc cgcaagcatc agtatgggcg gttcatgatg cggctgaagc tgccaaacgg 960tgtgacgacg agcgagcaga cgaggtacct ggcgagcgtg atcgaggcgt acggcaagga 1020gggctgcgcc gacgtgacaa cccgccggca gatccgcggc gtcacgctcc ccgacgtgcc 1080ggccatcctc gacgggctca acgccgtcgg cctcaccagc ctccagagcg gcatggacaa 1140cgtccgcaac cccgtcggca acccgctcgc cggcatcgac cccgacgaga tcgtcgacac 1200gcgatcctac accaacctcc tctcctccta catcaccagc aacttccagg gcaaccccac 1260catcaccaac ctgccgagga agtggaacgt gtgcgtgatc gggtcgcacg atctgtacga 1320gcacccacac atcaacgacc tcgcgtacat gccggcggtg aagggcggca agttcgggtt 1380caacctcctc gtcggcgggt tcataagccc caagaggtgg gaggaggcgc tgccgctcga 1440cgcctgggtc cccggcgacg acatcatccc ggtgtgcaag gccgttctcg aggcgtaccg 1500cgacctcggc accaggggca accgccagaa gacccgcatg atgtggctca tcgacgaact 1560tggaatggag gcttttcggt cggaggtgga gaagaggatg ccgaacggcg tgctggagcg 1620cgcggcgccg gaggacctca tcgacaagaa atggcagagg agggactacc tcggcgtgca 1680cccgcagaag caggaaggga tgtcctacgt cggcctgcac gtgcccgtcg gccgggtgca 1740ggcggcggac atgttcgagc tcgcacgcct cgccgacgag tacggctccg gcgagctccg 1800cctcaccgtg gagcagaaca tcgtgatccc gaacgtcaag aacgagaagg tggaggcgct 1860gctctccgag ccgctgcttc agaagttctc cccgcagccg tcgctgctgc tcaagggcct 1920cgtcgcgtgc accggcaacc agttctgcgg ccaggccatc atcgagacga agcagcgggc 1980gctgctggtg acgtcgcagg tggagaagct cgtgtcggtg ccccgggcgg tgcggatgca 2040ctggaccggc tgccccaaca gctgcggcca ggtgcaggtc gccgacatcg gcttcatggg 2100ctgcctcacc aaggacagcg ccggcaagat cgttgaggcg gccgacatct tcgtcggcgg 2160ccgcgtcggc 
agcgactcgc acctcgccgg cgcgtacaag aagtccgtgc cgtgcgacga 2220gctggcgccg atcgtcgccg acatcctggt cgagcggttc ggggccgtgc ggagggagag 2280ggaggaggac gaggagtagg aacacagact ggggtgtttt gcttgctccg gtgatctctc 2340gccgtccttg taaagtagac gacaatatgc cttcgcccat ggcacgcttg tactgtcacg 2400ttttggtttg atcttgtagc ccaaaagttg tgttcattct cgttacagtc ttacagagga 2460tgattgattg ataaataaag aagaaacaga ttctgc 2496472265DNAPhyscomitrella patens 47attagagagt tgatggacat cgtttgatcg ttaactgcag cgaaataagt ccatggggtt 60tttaggaagt ggagtgatac atcgtcgcat agttactggg aaaattgtaa ttgctcgtgc 120tcaggctgga atttcaagca agttgaggat tgcaggcgaa atttactgaa gtaaaattcg 180ccaggcgcaa tgcaaggtgc aatgcagaca aagatgtgga ggggagagct gatcagcaca 240tcgacccact ttataggcgg cactcgactg cagcccaaac taaaccagga tgcaaggaaa 300cccacgaaaa gtgaaaattg tatcgttcga gtctccatgg agcgtgaggt caaggctaag 360gccgcggttt ctccacccgc tgttgctgca gaccgtctca ctccacgagt gcaagaaaga 420gatggctact acgttctcaa agaggaattc cgacaaggaa ttaaccccca agagaagatc 480aaacttggga aagagccgat gaaattcttc atagagaacg agatagagga gcttgcaaag 540acgccgttcg cggagctaga cagctcgaag cctgggaagg acgatatcga tgttagactc 600aagtggttgg gtctcttcca ccgccgcaaa catcaatatg gaaggttcat gatgcggttc 660aagcttccga atggaatcac gaacagtaca cagacgaggt ttttggccga gaccatctca 720aaatacggaa aggaagggtg tgcagatttg acgacaagac agaactggca aattcgtggg 780attatgctcg aagatgtgcc ctcccttctg aaaggactgg aatccgtggg cctatcgtct 840ctgcagagcg ggatggacaa tgtaagaaat gcggtcggta accctcttgc tggaatcgac 900cccgacgaaa tcgtcgacac cattcctatc tgtcaggcgc tgaacgacta catcatcaac 960agagggaaag gaaatactga gatcaccaac ttacctcgga agtggaacgt gtgcgtggtc 1020gggacgcacg acttatttga acatccgcac atcaacgatc ttgcgtacgt tcccgcaacc 1080aagaacggcg tcttcggttt caacattctt gttggaggat tcttcagctc aaagcggtgc 1140gccgaagcta ttccgatgga cgcttgggtg ccgacagacg acgtcgtccc gttgtgcaaa 1200gcaattctgg agacttatcg agacctcggg actcgcggca accgacagaa gactcgcatg 1260atgtggttga tcgatgagat gggagtcgag gagttcagag ccgaggtgga aaggcgcatg 1320cccagcggca ctatccggcg agccggacag gatctgatag acccgtcgtg 
gaagcgccgg 1380agcttcttcg gagtaaaccc ccagaagcaa gcagggctga actacgttgg tcttcacgtc 1440ccggtcgggc gtttgcacgc tccagagatg ttcgagctgg ctcgcattgc cgatgagtac 1500ggcaacggcg agatccggat cactgtggag cagaacctga ttctgcccaa catcccgacg 1560gagaaaattg acaagttgat gcaggagccc ctcttgcaga aatactctcc gaatcccacc 1620cccttgttgg cgaacttggt ggcctgcact ggcagccagt tctgcggcca agcgatcgcg 1680gagacgaagg ccctgtccct gcaactcacg cagcagctcg aagacaccat ggaaacgact 1740cgcccgatcc gattgcactt cacgggatgc cccaacacat gcgctcaaat ccaggttgcg 1800gatatcggat tcatgggcac catggctcga gatgaaaacc gaaagcccgt tgaagggttc 1860gacatctacc tcggaggccg catcggctcc gactctcact tgggagagct tgtcgtgcct 1920ggtgtgcctg ccaccaagct gcttccggtg gtgcaagagc tgatgatcca gcatttcggc 1980gctaaaagga aaccttgaga tgcaaatctg ggtatagtaa caaaaaatca ctactcgtca 2040cacacacaca cacaccgctg atgtataatt tacgtaaaac caatctatcg aatagcacga 2100ttcacagtta cgaaactctg ggtaaaaccc ggttataaat tgatgaccat tcattcgtct 2160tgtgcagcct tccagtgaca ttgtcagtgt cggtgggcat gagctctgtc gctaatcccc 2220acttctccaa taaagtttcg gcaaatctgt gcccacatga atcat 2265481809DNAPhyscomitrella patens 48atgcaaggca ctatgcagtc acaaatgtgg aggggacagg tgagcggcgc atcgctccac 60ttcacaggcg caacccgagt gcagggtaac agccaccagg atttagtata tcccacgcaa 120tttcacaaac atggcgttcg ggcctctgcg gagcgcgagg tcaaggccaa ggctgtagct 180gccccaccta ccatcgctgc agaccgcctc gtgccacgcg tggaagaacg agatggttat 240tacgttctta aggaggaatt tcgacagggc atcaacccgt cggagaagat aaaaatcgcc 300aaagaaccca tgaaattctt catggagaac gagatagaag agctggcgaa aacgccgttc 360gccgagctcg atagttcgaa ggcaggaaag gacgacattg atgtgagatt gaagtggttg 420ggcctcttcc accgtcgcaa acatcaatat gggagattca tgatgcggtt caagcttcca 480aatgggatca cgaatagctc gcagacgcgg ttcttggctg agacaatctc caagtacgga 540gagtatgggt gcgctgattt gacgacacgt caaaactggc aaatcagggg gattgttctc 600gaagacgtgc ctgctcttct gaagggattg gaatcagtag gcctgtcatc tttgcagagc 660ggcatggaca acgttaggaa cccagttggt aaccctcttg caggaatcga ccctgacgaa 720attgtcgaca ctgccccgtt ctgcaaggta ctcagcgatt acatcatcaa ccgagggcaa 780ggaaatcctc agatcaccaa 
tttacctcgg aaatggaacg tgtgcgtggt tggaacacat 840gacttgttcg agcacccgca catcaacgac ctggcgtaca tgccagccac aaagaacggt 900gtcttcggtt tcaacatcct ggtgggagga ttctttagcc ctaagcggtg tgcggaagca 960attcccatgg atgcttgggt gccagcagat gatgtcgttc ccttgtgcaa ggcaattctg 1020gaaacctacc gagaccttgg aacccgaggc aaccgacaga agacccgcat gatgtggttg 1080atcgacgaga tgggaattga ggaattcaga gccgaggtag agaggcgcat gcccggtggg 1140tccattctta gagccgggaa ggacctggtc gatccatcct ggacgcgccg gagcttctat 1200ggagtgaacc cgcagaagca accgggctta aactacgtag gcctccacat tcccgtcggc 1260cggctgcatg ctccagagat gttcgagctt gcgcgcattg cagacgagta cggcaacggg 1320gagattcgga tctcggtgga gcagaacctg atcctgccca acgtccccac ggagaaaatc 1380gagaagctat tgaaggagcc cctcctggag aaatactccc cgaatcccac ccctctgctc 1440gccaacttgg tggcctgcac aggcagccag ttctgtggcc aggccatcgc ggagaccaag 1500gcccggtcgt tgcagctcac gcaagagctg gaagccacca tggaaaccac tcgtcctatt 1560cggttgcact tcaccggatg ccccaacaca tgcgcccaaa tccaggttgc ggatattggc 1620ttcatgggta caatggcacg agacgaaaat agaaagcccg tggaggggtt tgacatctac 1680cttggaggtc gtatcggctc cgactcacat ttgggagagc tcgtggtgcc gggcgtgcct 1740gcgaccaagc tgctccccgt tgtgcaagac ctcatgatcc agcatttcgg cgccaagcgt 1800aagacttaa 1809492270DNAPinus taeda 49cggccggggg agacaagccc tcatcataga tttaattact gatctttgca tcttggattt 60gtaatcggag tagtcaggat gaatctctct agtccagtca gattcgatga gattcgtccc 120ttggcccatg tcgtttacaa tcctgtttgc tgtgggcata agccgaatcg gctcaggttg 180atgacagcaa tccaggttcg tgctgttaat catggtggac gcaattctga gatcagtaca 240gatgggaata gcaaagggac aacagccaag gctgtagcca gtcctgctgg ctctcatgtg 300gctgtagatg cctcaaggct ggaggctaga gttgaggaga gggatggata ctgggttctc 360aaagaggaat tcagggctgg aatcaaccct caggagaaga ttaagttgca gagggagccc 420atgaaattgt tcatggagaa tgagatcgaa gaacttgcaa agaagccctt cgctgaaatt 480gagagtgaga aggttaataa agatgatata gatgtacgcc tgaagtggtt gggtctcttt 540caccgaagaa aacatcacta tgggagattc atgatgagac ttaagcttcc gaatggagtg 600actaccagtc tccaaactcg atatttggca agcgtgattc aacaatatgg accagaggga 660tgcgcagata taacaactcg gcagaattgg 
cagattcgtg gagttgtgct ggatgacgtg 720cctgccatat tgaaagggct gaaggaggtt ggactgtcta gcttgcagag tggaatggac 780aacgttagaa accctgtggg aaatccttta gcagggattg atgctgatga aatcattgac 840acaaggccat atacaaaggt tctgactgac tacattgtca acaatggaaa gggcaatcca 900tccataacca acctgccacg taaatggaat gtctgtgttg tgggtacaca tgacttgttt 960gagcatcccc acatcaatga cctcgcctac attcctgcaa tgaatagtgg gagatttggt 1020ttcaatctgc tcgttggtgg attctttagt ccaaaacgct gtgaagaagc agttccactt 1080gatgcttggg ttgctggaga ggatgttgta ccagtatgca gagccatttt ggaggtttat 1140agagatctgg gcacccgggg aaatcgccag aaaactcgaa tgatgtggct gattgatgag 1200ttgggcatag agggcttccg ttcagaagtg gtgaagagaa tgccaggaga gaagttggaa 1260agagcagcaa cagaagacat gttagataaa tcatgggagc gcaggagtta tcttggtgtg 1320cacccacaga agcaggaagg cttgaatttc gtaggtctcc atgttccagt gggtcgactt 1380caggcagaag atatgttaga actggctcgt cttgcagaac aatatggcac gcaggaactc 1440cgcctcacag tagaacaaaa tgccatcatt ccaaacgtac ctacagataa gatagaggca 1500cttttacagg aacccctcct ccaaaaattc tccccttccc ctcctcttct tgttagcaca 1560ttagtggctt gtaccggcaa ccagttctgt ggtcaggcaa tcatcgaaac aaaagcaaga 1620gccttgaaaa tcacagagga attggataga accatggaag ttcccaagcc tgtgagaatg 1680cactggacag gatgccctaa tacatgtgga caagtgcagg ttgcagacat tggcttcatg 1740ggttgcatga ctagggatga aaacaagaaa gttgttgagg
gagtggacat attcattgga 1800ggtagggtgg gagcagattc acatctaggg gatttaatcc acaagggagt accttgcaag 1860gacgtggtac ctgtggttca agaactactt attaaacact ttggagccat caggaaaaca 1920gacatgtgaa aatgaattcc aatttctcat ccatcgccat cttcagtgga ggacaatcac 1980cagattgcta aggttctgag cgggtatcca actcattgaa atctgaataa ataaatgtag 2040agatgcaatg tatagatgta ttgtttacga agtccaacgt gttcagaaat aaaatagctg 2100attactgtgt tcacagcagg gtttttttac attaaactcg tcttgcactt ttgaacagta 2160tggaatacaa ataaaaacgg attagcccaa aaaaataatg gaataataga aattccagta 2220agattatgat aaaatctgta gaatttttga aaatctgagt ttcactggtg 2270501877DNAPopulus trichocarpa 50acacttctct agaaactatc taccatcatt atgtcatcac tttcagttcg ttttctcacg 60ccacaattgt cacccacagt tccaagctcc tctgcaagac caagaacaag actctttgct 120ggacctccca cagtggctca gccagcggag acgggggtgg atgcagggag gttggaacct 180agagtggaga agaaagacgg atactatgtg ttgaaagaga agtttaggca aggtattaat 240cctcaagaga aagtgaagat agagaaagag ccaatgaagc ttttcatgga aaatgggatc 300gaggagcttg ctaaattgtc gatggaagag attgacaaag agaagagcac taaagatgat 360attgatgtta gactcaagtg gctcggtctc tttcacagaa ggaagcacca atatggtaga 420tttatgatga gactaaagct accaaatggg gtaacaacaa gtgcacaaac aagatacttg 480gcaagcgtga tcaggaaata tgggaaagat ggctgtgcag atgtaacaac aagacaaaac 540tggcaaattc gtggagtggt gttgcctgat gtgccagaaa tactaagggg tctagctgaa 600gttggtctga caagcctgca gagtggcatg gacaacgtga gaaaccccgt cggaaatccg 660cttgcaggaa ttgatccgga tgagattgtt gataccagac cttataccaa cttgttgtcc 720caatttatca ctgccaattc tcgtggaaat cctgagttca ctaacttgcc aaggaagtgg 780aatgtatgtg tcgtgggttc tcatgatctt tatgagcatc ctcatatcaa tgatcttgct 840tacatgcctg ccatgaagga cgggcggttt ggattcaatt tgctggttgg tgggttcttt 900agtcccaagc gatgtgctga ggcaattcct cttgatgctt gggtttcagc tgatgatgtg 960ctcccatctt gcaaagcagt gttagaggcc tacagagatc ttggcaccag agggaacagg 1020caaaagacta gaatgatgtg gctgatcgac gagcttggca ttgaaggatt caggtcagaa 1080gtagtaaaaa gaatgccacg tcaagagcta gagagagaat cttctgaaga tttggttcaa 1140aagcaatggg aaaggaggga ctatttcggt gtccatccac agaagcaaga aggccttagc 1200tatgcaggtc 
ttcacattcc tgtcggtcgc gtccaagcag atgacatgga tgagctagct 1260cgtttagctg atatttatgg cactggcgaa ctcagactca ctgtggagca gaacatcata 1320attcccaaca ttgaggactc aaagattgaa gccctactta aagaacctct attaaaagac 1380aggttctcac ctgagccacc tcttctcatg caagggttgg tagcatgcac tggcaaagag 1440ttttgcgggc aagcaataat tgaaacaaag gctagggcca tgaaggtaac tgaggaggtg 1500cagaggttag tgtcggtgtc taaaccagtg agaatgcact ggacaggctg tcctaatacc 1560tgtgggcagg tacaagttgc cgatattggg ttcatgggtt gcatggcaag agatgaaaat 1620gggaaaatct gtgaaggagc agatgtgtac gtaggaggaa gagttgggag tgactcacat 1680ttgggagagc tttataagaa aagtgttcca tgcaaggact tggtgccttt ggttgtggac 1740attttagtta aacaattcgg agctgtacct agggagaggg aagaggtgga tgattagttc 1800atttaatcaa aatgttcatt cttgtttcat tgcaaattcg gaggggatct aatgcatgct 1860tttggaatcg gaaatga 1877512000DNASolanum lycopersicum 51caacaatcaa gagtccacta aacgttttgc cacacatcca tttactccca cagctctaca 60aaatgctctg acatctcttt tgcaacttcc aaaatggcat ctttttctat caaatttttg 120gcaccttcat tgccaaatcc aactagattt tccaagagta gtattgtcaa gctcaatgca 180actccgccgc agacagtggc tgcggcgggg cctccagagg ttgctgctga gagactagaa 240ccaagagttg aggaaaaaga tggatattgg atactaaaag agcagtttag gcaaggaatt 300aatcctcaag agaaggtgaa gattgagaag gaacctatga agttgttcat ggaaaatggt 360attgaggagt tagctaagat tccaattgaa gagatagatc aatcaaagct tactaaggat 420gacattgatg ttaggctcaa gtggcttggc ctcttccata ggagaaagaa tcaatatggg 480agattcatga tgaggttgaa acttccaaat ggagtaacaa caagtgctca gactcgatat 540ttggcgagtg tgataaggaa atatggagag gaaggatgtg ctgatattac gacaaggcaa 600aattggcaga ttcgtggagt agtgctgcct gatgtgcctg agattctaaa gggacttgaa 660gaagttggct tgactagttt gcagagtggc atggataatg tcaggaatcc agttggaaat 720cctctggctg gaattgatcc tgaagaaata gttgacacaa gaccttacac taatttgctc 780tcccaattta tcactggtaa ttcacgaggc aatccggctg tttctaactt gccaaggaag 840tggaatccgt gtgtagtagg gtctcatgat ctttatgagc accctcatat caatgatctt 900gcatacatgc ctgccataaa agatggacga tttggattca acctgcttgt gggagggttc 960ttcagtgcca aaagatgtga tgaggcaatt cctcttgatg catgggttcc agccgatgat 1020gttgttccgg 
tttgcaaggc aatactggaa gcttttagag accttgggtt cagagggaac 1080aggcagaagt gtagaatgat gtggttgatc gatgaactgg gtgtagaagg attcagggca 1140gaggtcgtaa agagaatgcc tcagcaagag ctagagagag catctccgga agacttggtt 1200cagaaacaat gggaaagaag agattatctt ggtgtacatc cacagaaaca ggaaggctat 1260agctttattg gtcttcacat tccagtgggt cgtgtacaag cagacgacat ggatgatcta 1320gctcgtttgg ctgatgagta cggctcagga gagctacggc tgactgtgga acagaacatt 1380attattccca acattgagaa ctcaaagatt gacgcactgc taaaagagcc tattttgagc 1440aaattttcac ctgatccacc tattctcatg aaaggtttag tggcttgtac tggtaaccag 1500ttttgtggac aagccattat tgaaacgaaa gctcgttccc tgaagatcac cgaagaggtt 1560caaaggcaag tatctctaac gaggccagta aggatgcact ggacaggctg cccaaatacg 1620tgtgcacaag ttcaagttgc agacattgga ttcatgggat gcctgactag agataaagac 1680aagaagactg tggaaggcgc cgatgttttc ttaggaggca gaatagggag tgactcacat 1740ttgggtgaag tatacaagaa ggcagttcct tgtgatgaat tagtaccact tattgtggac 1800ttacttatta agaactttgg tgcagttcca cgagaaagag aagaaacaga agattaataa 1860aatttggatt agatcataat gatggaatgt gcaattatgt ttagtgatta tggaggtata 1920tagctaagag ctggtttgaa taatcagaaa tatgttgtgt tcatatcatt tattgtacga 1980taaatcaaca caaacattcc 2000522135DNASolanum lycopersicum 52gacgatcacc gctacctcaa tcgactaaat tctcaatttt aagttggttt tgtaacttag 60ttgttctttt taatttgtcg aaatgacttc tttttctgtt aaattttcag ctacttcact 120tccaaattct aatagatttt ccaaacttca tgctactcca ccgcagacgg tggcggtacc 180gtcgtacggg gcggcggaga tagctgctga aagactagag cctagagttg agcaaagaga 240tgggtattgg gtagttaagg ataagttcag acaaggcata aatccagctg aaaaggcgaa 300gattgaaaag gaaccaatga aactattcac tgaaaatggt atcgaagatc ttgctaagat 360ctcgcttgaa gagatcgaga aatcaaagct aactaaagaa gatattgata ttcgcctcaa 420gtggcttgga ctcttccatc ggagaaaaca ccactatggt cgattcatga tgcgattgaa 480gcttccaaat ggagtaacga cgagtgatca aactcgatat ttaggtagtg tgattaggaa 540atatgggaaa gatggatgtg gtgatgtgac tacaaggcaa aattggcaga ttcgtggggt 600tgtgttacct gatgtgcctg agattctaaa ggggcttgat gaagttggct tgactagtct 660gcagagtggc atggataatg ttcgaaatcc ggtggggaat cctctcgcag ggattgatct 
720tcatgaaatt gtagacacaa ggccttacac taatttgctg tcccaatatg tcaccgccaa 780ttttcgtggc aatgtggatg tgactaactt gccaaggaag tggaatgtat gtgtaatagg 840gtcacatgat ctttatgagc atccgcatat caatgatctt gcgtatatgc ctgcaaccaa 900agatggacga tttggattca acctgcttgt gggtggattc ttcagtccga agcgatgtgc 960agaggcaatt cctcttgatg catgggttcc agctgatgat gtagtccctg tttgcaaagc 1020tatattagaa gcttatagag atcttggtac ccgagggaac aggcagaaaa caagaatgat 1080gtggttaatt gacgaactgg gtgttgaagg attcagggca gaagttgtga agagaatgcc 1140ccaaaagaag ctagatagag aatcttcaga ggatttggtc ctgaaacaat gggaaaggag 1200agagtacctt ggcgtgcatc cgcagaaaca ggaaggatac agctttgttg gtcttcacat 1260tccggttggt cgtgtccaag cagatgacat ggacgagcta gctcgtttgg ctgatgagta 1320tggttcagga gaactccggt tgactgttga acagaacatc attattccca acatcgagaa 1380ctcaaagatc gatgcattac tcaatgagcc tctcctaaag aacagatttt cacctgatcc 1440acctattctc atgagaaatt tggtggcttg tactggtaac caattctgtg ggcaagcaat 1500aatcgagact aaagcacgtt caatgaagat aaccgaggag gttcaacgtc tagtctctgt 1560gacacagcca gtgaggatgc actggacagg ttgcccaaat acatgtggac aagttcaagt 1620tgccgatatc ggattcatgg gatgcctgac tagaaaggaa ggcaaaactg ttgaaggtgc 1680tgatgttttc ttgggtggca gaatagggag cgactcgcat ttaggagaag tttataagaa 1740gtctgtacca tgtgaggatt tggtaccaat aatcgtcgac ttactaatta acaactttgg 1800tgctgttcca agagaaagag aagaaacaga ggagtaatct aaaatcttca gaatgtactt 1860tttatgatat tgaaatattt ccaaggtaca gcattgtaag ttagtaaaat aatcacaaca 1920tgagatgttg ttaacatgtt catgtgtgac atagcatgat gcaaatactt gaacttgttt 1980caaaatataa tcacattgtg tattcttttg gaaatactca tccaaactaa aaggcttttg 2040aattgttgaa ttcctaataa tacatttttt aaaatgtaat ttgatattca tttgttttga 2100ttattatatt cttaaaataa tttacttatt ctctc 2135532110DNAArabidopsis thaliana 53aagagctcat ctcttccctc tacaaaaatg gccgcacgtc tccaaccttc tcccaactcc 60ttcttccgcc atcatcatga cttctttctc tctcactttc acatctcctc tcctcccttc 120ctcctccacc aaacccaaaa gatccgtcct tgtcgccgcc gctcagacca cagctccggc 180cgaatccacc gcctctgttg acgcagatcg tctcgagcca agagttgagt tgaaagatgg 240tttttttatt ctcaaggaga agtttcgaaa 
agggatcaat cctcaggaga aggttaagat 300cgagagagag cccatgaagt tgtttatgga gaatggtatt gaagagcttg ctaagaaatc 360tatggaagag cttgatagtg aaaagtcttc taaagatgat attgatgtta gactcaagtg 420gcttggtctc tttcaccgta gaaagcatca gtatgggaag tttatgatga ggttgaagtt 480accaaatggt gtgactacaa gtgcacagac tcggtattta gcgagtgtga ttaggaagta 540tggtgaagat gggtgtgctg atgtgactac tagacagaat tggcagatcc gtggtgttgt 600gttgcctgat gtgcctgaga tcttgaaagg tcttgcttct gttggtttaa cgagtcttca 660aagtggtatg gataacgtga ggaacccggt tgggaatcct atagctggga ttgatccgga 720ggagattgtt gacacgaggc cttacacgaa tctcctttcg cagtttatca ccgctaattc 780acaaggaaac cccgatttca ccaacttgcc aagaaagtgg aatgtgtgtg tggtggggac 840tcatgatctc tatgagcatc cacatatcaa tgatttggcc tacatgcctg ctaataaaga 900tggacggttt ggattcaatt tgcttgtggg aggattcttt agtcccaaaa gatgtgaaga 960agcgattcct cttgatgctt gggtccctgc tgatgacgtt cttccactct gcaaagctgt 1020tctagaggct tacagagatc ttggaactcg aggaaaccga cagaagacaa gaatgatgtg 1080gcttatcgac gaacttggtg ttgaaggatt tagaactgag gtagagaaga gaatgccaaa 1140tgggaaactc gagagaggat cttcagagga tcttgtgaac aaacagtggg agaggagaga 1200ctatttcgga gtcaaccctc agaaacaaga aggtcttagc ttcgtggggc ttcacgttcc 1260ggttggtagg ctacaagctg atgacatgga tgagcttgct cggttagctg atacctacgg 1320gtcaggtgag ctaagactca cagtagagca aaacatcatc atcccaaatg tagaaacctc 1380gaaaaccgaa gctttgcttc aagagccgtt tctcaagaac cgtttctccc ctgaaccatc 1440tatcctaatg aaaggcttag ttgcttgtac cggtagccag ttctgcggac aagcgataat 1500cgagactaag ctaagagctt taaaagtgac agaagaagta gagagacttg tatctgtgcc 1560aagaccgata aggatgcatt ggacaggatg tcccaacact tgcggacaag tccaagtagc 1620agatatcgga ttcatgggat gcttaacacg aggcgaggaa ggaaagccag tcgagggtgc 1680tgacgtgtac gtcgggggac gaataggaag tgactcgcat atcggagaga tctataagaa 1740aggtgttcgt gtcacggagt tggttccatt ggtggctgag attctgatca aagaatttgg 1800tgctgtgcct agagaaagag aagagaatga agattgattc aaaagctatt ggattcttaa 1860taagtcaaga gacctatgaa tggttctctc tctggtttca gactttgata cttgatactt 1920gtatttgtat tgtgcccata attttgggtt ttgtagctct ctcctttgtt gtaacctgta 1980actttgtcct 
tggttgtttt gtaatatctt gttttttagt aatagtagta taatctgatt 2040ttttgtcata tattgtcttg atttctctgt gatatttata agaaataaac atttgtttct 2100ttttacctcc 2110541788DNAVitis vinifera 54atggcttcta tctctgttcc tttcctctct caggcaccca cccacctttc aaactccact 60tctctccgtc tcaaaaccag gatctctgcc accccgactc cgactccaac tccaaccacg 120gttgcaccgt cgtccacggc ggcggtggac gcctccagga tggagcccag ggtggaggag 180agagggggtt actgggtttt gaaggagaag ttcagggaag gtataaatcc acaggagaag 240gtgaagattg agaaggatcc tatgaagctc ttcatagaag atgggttcaa tgagctggcc 300agcatgtctt ttgaagaaat tgaaaagtct aagcatacta aggatgatat tgatgtgagg 360ctcaagtggc ttggactgtt tcataggagg aagcatcaat atggtagatt tatgatgaga 420ttgaagctgc caaatggggt gacatcaagt gcacaaactc gttacctggc cagtgcaata 480aggcaatacg ggaaggaggg atgtgccgat gtgactacgc ggcaaaactg gcaaattcga 540ggtgtggtac tgcctgatgt gcctgaaata ctaaagggtc tttcagaggt tggtttgacg 600agcctgcaga gtggcatgga caatgtgagg aatcctgttg gaaatcctct tgcaggcatt 660gaccctcatg agattgttga tacacgacct tacaccaact tgttatccca attcattact 720gccaatgctc gtgggaatac agccttcact aacttgccga ggaagtggaa tgtgtgtgtt 780gtaggctccc atgatctcta tgagcatccc cacatcaatg atctggcgta catgcctgcc 840acaaagaaag gaagatttgg attcaatctg ctagtaggcg ggttctttag tcccaaacgt 900tgtgctgatg ctattcctct cgatgcctgg atccctgccg acgatgtcct cccagtttgt 960caagcagtac tagaggctta cagggatctt ggtaccagag gaaaccgcca aaagacaaga 1020atgatgtggt taattgatga gctgggcata gagcagttcc gggcagaggt ggtgaaaaga 1080atgccccaac aagagctgga aagatcatct tctgaagacc tggttcagaa gcaatgggag 1140aggagagatt accttggtgt ccatccccag aaacaggaag gctttagctt tgtgggtatt 1200cacattccag tgggtcgagt ccaggcagat gacatggacg agctagctcg attggcagac 1260gaatatggct caggcgagct ccggctcact gtagagcaga acatcataat tcccaatgtg 1320gagaactcaa gacttgaagc cttgctcaaa gagcctctct tgagagacag attctctccg 1380gagcctccta ttctcatgaa aggcttggtg gcctgcaccg gcaatcagtt ttgtggacag 1440gccattatcg agaccaaggc cagagcattg aaggtgacgg aggatgtggg gcggctggtt 1500tcagtgaccc agccagtgag gatgcactgg accggctgcc caaactcctg cggccaggtg 1560caagtggcgg atatcggatt 
catggggtgc atgacaaggg acgagaatgg gaacgtttgt 1620gaaggggcag atgtattctt aggaggtaga attgggagcg actgtcattt gggagaggtt 1680tataagaagc gtgttccttg caaagactta gtgcccttgg ttgctgaaat tttggtaaat 1740cactttggag gagtccccag ggagagggaa gaagaagctg aagactga 1788552433DNAVolvox sp. 55atgcagtcgc agtcgctgtc ccgccgcacc tgcacccgta ctcttggccg cggcctcgtc 60acccctgtcc tggcaaccgc ggcaccggct tcagcagcgc aagcggccga tggcatcaac 120gcgcatagcg ggctgaagca cctgccagag gctgctcgcg ttcgcgctct cgaccgcaag 180gccaataagt ttgagaaggt caaggttgag aagtgcggat cacgcgcatg gacagatgtc 240ttcgagctgt cacggctgct gaaagaggga aacaccaagt gggaggattt ggatttggac 300gacatagaca tccgcatgaa gtgggcgggc ctgttccatc gcggaaagcg cacgcccggc 360aagttcatga tgcgcctcaa ggttcccaac ggcgagctgg atgcccgcca gctgcgcttc 420ctcgcctcgg caatcgcgcc atacggcgcc gacggctgcg ccgacatcac cacgcgcgcc 480aacatccagc tccgaggcgt gacgctggcg gacgccgacg ccatcattcg cggtctttgg 540gacgttggcc tcacgtcctt ccagagcggt atggacagcg tacggaactt gacgggcaac 600cccatcgcgg gtgtggaccc ccatgagctc atagataccc gtccgctgct gcgggaaatg 660gaggccatgc tgttcaacaa cggcaagggc cgcgaggagt ttgcgaacct gcctcgcaag 720ctcaacatct gcatttcctc aacccgcgac gacttcccgc acacgcacat caacgacgtg 780ggcttcgaag cggtgcgccg ccccgatgat ggcgaggtgg tgttcaatgt ggtcgttggc 840ggcttcttct ccatcaagcg caacgttatg tccatccctc ttggctgctc tgtcactcaa 900gaccagctga tgcccttcac ggaggctctg ctgcgggtgt tccgcgacca cgggccccgc 960ggggaccgcc agcagactcg cctgatgtgg atggtagatg cgattggcgt ggagaagttc 1020cgccagctgc tttcggagta catgggcggc gcggagctgg cgccgccggt gcacgtgcat 1080cacgaggggc cctgggagcg ccgtgacgtg ctgggtgtgc acccccagaa gcagccgggg 1140ctgaattggg tgggcgcctg tgttccggct ggcaggctgc aggctgccga ctttgacgag 1200ttcgcccgca tcgcggagac gtacggcgac ggcaccgtac ggatcacgtg cgaggagaac 1260gtgatcttta ccaacgtccc cgacgccaag ctgccggaca tgcttgctga gcccctgttc 1320cagcgcttca aagtcaatcc ggggctgctg ctccgggggc ttgtgtcctg cacgggcaac 1380cagttttgcg gcttcggtct ggcggagaca aaggcgcggg cggtcaaggt agttgagatg 1440ctggaggagc agttggagct cacccggcct gtcaggatcc acttcaccgg atgccccaac 
1500agttgcggcc aagcgcagca ggttggcgac attgggttaa tgggagcccc cgccaagctg 1560gatggcaagg cggtggaggg ctacaagatc tttttgggcg ggaagattgg ggagaacccg 1620cagctggcca cggaattcgc tcaagggatc ccggctgtgg agtctcatct ggtgcccaaa 1680ctcaaggaga tccttattaa ggagtttggt gccaaggaaa aggagactgc cgttgtcgtc 1740taaataggcg tcgttgcgta attaggtgct tataacggag aagggggaat gatagcttgg 1800tgtaagtgtt acataggatt ggggagggag tggtaggcac gggtttgatg cgtgatatac 1860tacatgtgac ctgatgtcgt attttgcata caagtatctt gtccggcgct tctcatgcgt 1920gtgcgtgtct gtttgttctg tttcggctag cagggcggcc aagtcgttta tgttcgggga 1980ttcctactac gggcgcaatt gcaatgataa aagaaggatg cgtgtcttgt ctggggcctg 2040tgaatcactc cttccgatat gccgcgacgt ttgctgtgcg cgcggcgtgc aggtcagggt 2100ttgtcgatag gtagcgtttg cacgtcgcgt ccgtgagtat ctatatcaga gcagcttgcg 2160catgtatgtg ttaaccaagt tttttttatt ggcgtgggaa ctgtgctccc gggcgaatta 2220tgctcgccag cgctgccggt ggtctgtgat tgattaggca ttggtcatct gtatccattc 2280gacttatcag acttatcatg tctcgcgatc ggatgttgtg ctgccttgtt ccattctttt 2340gcacatccgt tgtgtcgatg gcgtgggaag atgccgaggc tacgatgaag agtgtagata 2400gagggtcgcg ttcgtggtga tggtgccgca cag 2433562062DNASpinacia oleracea 56catcatcttc atcttcatct tcatcattca tagttgcaag aaacagagca accaaaaaaa 60atggcatcac ttccagtcaa caagatcata ccatcatcaa cgacattact gtcatcgtcg 120aacaacaaca gaagaagaaa taactcatca attcgatgcc agaaggcggt ttcacccgcg 180gcagaaacgg ctgcagtgtc gccgtctgtg gacgcggcga ggctggagcc gagagtggag 240gagagagatg ggttttgggt attgaaggag gaatttagga gtgggattaa cccagctgag 300aaagttaaga ttgagaaaga cccaatgaag ttgtttattg aggatgggat tagtgatctt 360gctactttgt caatggagga agttgataaa tctaagcata ataaggatga tattgatgtt 420agactcaagt ggcttggact tttccatcgc cgtaaacatc actatgggag attcatgatg 480aggttgaagc tgccgaatgg ggtaacaacg agtgagcaga cacggtacct agcaagcgtg 540atcaagaagt acggaaaaga tggatgtgcg gatgtaacaa caaggcaaaa ctggcaaatt 600agaggagttg ttctgcctga tgtgccagag atcatcaaag ggctggaatc cgttggtctt 660accagcttac agagtgggat ggacaatgta aggaaccctg taggtaaccc tcttgcaggg 720attgaccctc atgaaattgt tgacacccga ccttttacca 
acctaatttc ccaatttgtc 780actgccaatt cgcgtggaaa cctttctatt accaatctgc caaggaagtg gaatccatgt 840gttattgggt cccatgatct ttatgagcat ccacacatca atgaccttgc ttacatgcct 900gctacaaaga atgggaaatt cgggtttaat ttgttggttg gaggattctt tagcatcaaa 960agatgtgaag aggcaatccc actagacgct tgggtctcag cagaagatgt ggttcctgta 1020tgcaaagcta tgcttgaagc tttcagggac cttggcttta gaggaaacag gcagaagtgc 1080agaatgatgt ggcttattga tgagcttggt atggaagcat tcaggggaga ggttgagaag 1140agaatgcctg agcaagttct agaaagagca tcctcagaag agctggttca gaaggactgg 1200gagagaagag aatacttagg agttcaccct cagaaacaac aaggacttag ctttgtgggt 1260ctccacattc ctgtgggccg tctgcaagct gatgagatgg aagagttagc ccgtatagct 1320gatgtgtatg gatcagggga gctccgtctg acagtagagc agaacataat catcccaaat 1380gttgaaaact caaagataga ttcactacta aacgagcctc tgttaaaaga gcgttactcc 1440cctgaaccac ccatcttgat gaaggggctt gtggcctgta cggggagcca attttgtgga 1500caagccatta tcgagaccaa ggctagggca ctcaaggtga cagaagaggt acaacgacta 1560gtgtctgtaa cacggcctgt taggatgcat tggaccgggt gtcctaatag ttgtggtcaa 1620gtacaagtgg ctgatattgg gttcatgggt tgcatgacta gggatgagaa cggtaagcct 1680tgtgaaggag ctgatgtgtt tgtaggagga cgtataggaa gtgactcgca tctaggagac 1740atttacaaga aggcagtccc atgtaaagat ttggtgcctg ttgttgctga gatattgatc 1800aaccaattcg
gtgctgttcc tagggagagg gaagaggcag agtagtagct agactgtttt 1860gggtgcctgt tcttgttaac tgttatcggt attcggtaat tacttgtaat atttgcattt 1920tttttcaagc atataattaa attgcataaa gatcccttgt atgtctgcat aacaagatac 1980tcagttatgt aatgtcaata gcaggtttac tttgtttatt caataggcac tgtgaaaggg 2040aaagttcatt attcatttct ca 2062571611DNANostoc sp. 57atgacagata cagtaactac ccccaaagcc agcctcaata agtttgagaa attcaaagcc 60gaaaaagatg gacttgccat caagtcagag atcgaaaaaa ttgcctcttt gggatgggaa 120gcaatggacg caacagaccg agatcatcgc ctcaaatggg tgggtgtatt ctttcgccca 180gtcacccctg gtaaatttat gatgcggatg cggatgccga atggtatcct caccagcgat 240cagatgcgtg ttttagccga agtggtgcag cgttacggag atgacggcaa cgctgatatt 300acaactaggc agaatattca actacgaggt atcagaatag aagacttacc gcacatattc 360aataaatttc atgcagtagg tttaaccagt gtgcagtcag ggatggacaa catccgtaac 420atcacaggcg acccgatagc ggggttagat gcggatgagt tgtatgacac ccgtgagtta 480gtgcagcaaa ttcaggatat gctcaccaac aaaggagaag gcaatcgaga gtttagtaat 540ttgcctcgta aatttaatat tgcgatcgcc ggtggacggg ataattcagt tcatgcggaa 600atcaacgatt tagcctttgt tccagcattt aaagaaggga ttggagattg ggtattgggg 660aatggggaag aatcatctac ttaccaaaaa gtctttggat ttaacgtgtt agttggtggt 720ttcttttctg ctaaacgctg tgaggcggcg attcctttga atgcttgggt aactccggaa 780gaagtcttac ccttatgtag agcaatttta gaggtctatc gtgacaatgg actcagggct 840aatcggctca agtctcgctt gatgtggcta attgatgaat ggggtataga taagtttcgg 900gcagaagtcg aacagcgttt gggtaaatcc ttactccccg cagcccccaa agacgaaatt 960gattgggaaa aacgcgacca tatcggagtc tataagcaaa agcaagaggg attgaactat 1020gtagggttac acatccctgt aggtagattg tatgccgagg atatgtttga attggctcgg 1080atagccgatg tatacggtag cggtgaaatc cgcatgactg ttgaacaaaa catcatcatt 1140cccaacatta ccgactcgcg gttaaggact ttgttgacag atcccttact agagagattt 1200tctcttgatc ctggagcatt gacgcgatcg ctagtttcct gcacgggcgc acaattttgc 1260aacttcgccc tcatcgaaac caaaaaccgc gccctagaaa tgattaaagg cttagaagca 1320gaattgacct ttactcgtcc agtgcgaatc cattggacag gttgccccaa ctcctgcgga 1380cagccccaag ttgcagacat tggcttaatg ggaacaaaag ctcgtaaaaa cggtaaagcc 1440gtggaaggtg 
ttgacatcta tatgggtggc aaagtcggca aagatgcaca tttaggtagc 1500tgtgtacaaa aaggcatccc ctgcgaagac ttgcacctag tattacgaga cttactcatt 1560actaattttg gagccaaacc cagacaggaa gccttagtta ccagccaata a 1611582700DNAPlectonema boryanum 58aacactgccg gaactcgact catgacccat ccaacgcttg cccacgatag aaatgttctc 60cgacgcatga ggttctccta aagaacgata gaggaatagt gagtagggag tggggagtag 120ggtaaatcct ttctatctcc cactcctccc ccgctcccca ccaaattaca actatttcta 180aagtacgccc ttccccctct tcccgccgac agatgacgaa aacgaatcgg ctttatgcag 240aaacgtcata ttatgaaaag ttttgtaaca acagatacga atgtcctctg tgatcccgat 300tacctttact cagtaatcac cgcgaatcat caaacggttc cgcagttgat atcgatttgt 360gttcgctctg gaacacctta tattcatagg ctcaatccat gacagacacc cttgcagcac 420cgaccctcaa taagtttgaa aaactcaaag cagagaaaga tggtcttgcg gtgaaagcag 480aactcgagca ctttgctcgg ctcggctggg aagcaatgga tgaaaccgat cgtgatcatc 540gcttgaagtg gctcggtgtg ttctttcgcc ccgtaactcc tggcaaattt atgctgagaa 600tgcgggttcc gaatggcatt atcacgagcg gacaaacccg ggtgctagga gaaatccttc 660agcgctatgg agatgatggc aatgcagaca tcacgactcg ccagaacttt caactgcgag 720gaattcggat tgaagacctt cccgaaattt ttcgtaagtt tgaccaagct ggattgacga 780gcattcaatc cgggatggat aacgttcgta acattaccgg atcgcctgtt gctggcattg 840atgcagatga gctaattgat actcgtgggc tagttcgcaa agttcaagac atgatcacga 900acaatggtcg tggtaattcg agctttagta acttgcctcg gaaattcaat attgcgatcg 960cagggtgccg cgataactca gttcatgctg aaatcaatga cattgctttc gttcccgctt 1020tcaaagatgg cacattagga ttcaatatcc tagttggcgg attcttctct gggaaacgct 1080gcgaagctgc aattccactc aatgcttggg ttgacccgcg cgatgtcgtt gcggtctgcg 1140aagcaatttt aacggtctat cggaacttgg gactgagagc aaatcgtcaa aaagctcgct 1200taatgtggct gattgatgag atgggattgg aaccgttccg cgaagcggtt gaaaaacaat 1260tgggatatgc ttttacgcct gctgctgcca aagacgagat cctttgggac aagcgagatc 1320acattgggat tcatgcccaa aaacagcctg gattaaacta tgtgggcttg catgttccag 1380tgggacggtt atacgcgcaa gatttgtttg atttagctcg gatcgctgaa gtttacggca 1440gtggtgaaat tcgcttaact gtcgagcaga atgtgatcat tccgaatgtt ccggattcac 1500gagtttctgc attgctcaga gaacccattg tcaaacggtt 
ctcgatcgag cctcagaatc 1560tttcacgggc attagtgtct tgtactggcg cacagttttg taacttcgca ctgattgaaa 1620ctaaaaatcg tgcggttgct ttaatgcaag agctagaaca agacctgtac tgtcctcgtc 1680cagtgcgcat tcattggaca ggttgcccga actcttgtgg acaacctcaa gttgcagata 1740tcggactgat gggcacaaaa gtccgcaaag atggcaaaac agtcgaaggc gtggatctct 1800atatgggggg caaagttggc aaacatgctg aacttggaac ctgtgtgaga aaaagcattc 1860cctgtgaaga tctcaaaccg attctgcaag agattttgat cgagcaattt ggggcgcgtc 1920tctggtcaga cctgcccgaa tccgctcgtc caaatccgac cgccttgatc acgctcgatc 1980gtcccacggt ggaaacaccg aacgggaaat caacaaccgt gcaagagctt aatgcacaag 2040agtttgacta tgtgctgagt gcgccacctg ttgtaaaagc gccaacagaa atcgcagctc 2100cagcaacgat tcgttttgct cagtcaggaa aagaaatcac ctgcacccag gatgatttga 2160ttctagacat tgcagaccaa gccgaagtcg cgatcgaaag ttcttgccga tcaggaacgt 2220gtggaagttg taaatgcacc ttactcgaag gtgaagtcag ctatgacagc gaacccgatg 2280tgctcgatga gcacgatcgc gcttcgggtc agattctcac ctgtattgct cgtcctgtcg 2340gtcgtatctt gctcgatgct tgatccctaa gttttgttgc tccgctcatt gttctcacat 2400gcgccagctt tttgctgtgc ttccttttcc ttcagtacat tctctaaaaa ggacgatcca 2460tgtcttctaa tctttcaaga cgtaagttca ttttgaccgc aggcgcaacc gcagcaggcg 2520cagtgattgt gaatggttgt agcacaggtc taaataaaag tgcttctagc ggtgcgtcct 2580ctcctgctgc ctctcctgct gcaaatatca gtgcggcaga tgcaccagaa gtcacaacgg 2640ctaaattagg ctttatcgcc ctgaccgatt cggctccatt gatcattgcg ttagagaaag 2700591611DNAAnabaena variabilis 59atgacagata cagcaactac ccccaaagcc agtctcaata agtttgagaa attcaaagcc 60gaaaaagatg gccttgccat caagtcagag attgaaaaaa ttgcctcttt gggatgggaa 120gcaatggacg aaacagaccg agaccatcgc ctcaaatggg tgggtgtatt ctttcgtcca 180gtcacccctg gcaaattcat gatgcggatg cggatgccta atggtattct caccagcgat 240caaatgcgtg ttttagctga agtggtgcag cgttacggag atgatggcaa cgctgatatt 300acaactaggc agaatatcca actacgggga atcagaatag aagacttacc gcacatattc 360aataaatttc atgcagtagg tttaactagt gtgcagtcgg ggatggacaa tatccgcaat 420attacaggcg accccatagc agggttggat gcagatgaat tgtatgatac ccgtgagtta 480gtgcagcaaa tccaagatat gctcaccaac aagggagaag gtaatcgaga 
gtttagtaat 540ttaccacgga aatttaatat tgcgatcgct ggtggacggg ataattcagt tcatgcagaa 600atcaacgatt tagcttttgt tcccgcattc aaagaaggga ttggggattg ggtattggga 660ggtggtgaag aatcttctac tcaccaaaaa gtctttggat ttaacgtgtt agttggtggc 720ttcttttctg ccaaacgttg tgaagcggca attcctttaa atgcttgggt aacagctgaa 780gaagtcgtag ccttatgtag agcagttctg gaagtctatc gtgacaacgg acttagagct 840aatcggctta agtctcgctt gatgtggcta attgatgaat ggggtataga taagttccgt 900gcagaagtcg aacagcgttt gggtaaatcc ttactatacg ctgcacccaa agacgaaatt 960gattgggaaa aacgcgacca tatcggagtc tataaacaaa agcaagaggg attgaactat 1020gtaggcttac acatacccgt aggtagattg tatgccgaag atatgtttga actagctcgg 1080atagccgatg tttacggtag cggtgaaatc cgtatgactg ttgaacaaaa catcatcatt 1140cccaacatta ccgactcgcg gttaaagact ttgttgacag atcctttact agagagattt 1200tctcttgatc cgggagcatt gacgcgatcg ctagtttcct gcacaggcgc acaattttgc 1260aacttcgccc tcatcgaaac caaaaaccgc gccctagaaa tgattaaagg cttagaagca 1320gagttaacat tcacccgtcc agtgcgaatc cattggacag gttgccccaa ctcctgcgga 1380caaccccaag ttgcagacat cggtttaatg ggaacaaaag cccgtaagaa cggtaaagcc 1440gtcgaaggtg ttgacatcta tatggggggc aaagtcggca aagacgcaca tttaggtagt 1500tgtgtacaaa aaggcatccc ctgcgaagac ttgcacctag tattacgaga cttgctgatt 1560actaattttg gagccaaacc caggcaggaa gccttagtta gtagccagta g 1611601548DNASynechococcus sp. 
60atggcgaacc aatttgaacg cctcaaaagc gaaaaggatg ggctggcggt caaggccgag 60ctggaggcgt ttgcccggat gggttgggag aacattcctg aagacgaccg ggatcaccgc 120ctcaagtggc tggggatctt ctttcgcaag cgcaccccag gtcagttcat gctgcggctg 180cgcctgccca atgggatcct aaccagcggc caaatgcgga tgttgggcgc aatcatccac 240ccctatggag aacagggcgt agccgacatc accacccggc agaacctgca actgcgcggc 300atccccattg aagaaatgcc ccagatcctg ggctacctga aagaggtagg cctgaccagc 360atccagtcgg gcatggacaa cgtgcgcaac atcacgggat cccctctggc cggtattgac 420ccggatgagc tgatcgatgt gcgcggtctc acccgcaagg tgcaggacat ggttaccaac 480aacggcgagg gcaacccttc cttcagcaac ctgccgcgca agttcaacat cgccatctgc 540ggttgtcgcg acaactccgt gcatgcggag atcaacgacc tggcctttgt gcccgccttc 600aaaaatggcc gcctgggctt caacgtcctg gtgggcggct ttttctcggc tcgccgctgc 660gccgaggcaa ttggcctaga tgtctgggtg gatccccgcg atgtagttcc cctgtgcgag 720gcggtgctgc tggtctaccg ggatcacggc ctgcgggcca accggcaaaa ggcgcggttg 780atgtggctca ttgacgagtg gggcctagag aagttccggg cggctgtgga gcgccagata 840ggccaccctc tgcccagggc agcggaaaaa gacgaggtgg tctggcacaa gcgggatctg 900ctgggggtgc atgcccagaa gcagccgggc ctcaactttg tcggcctgca tgtgccggtg 960gggcggctca acgccctgga gatgatggag ctggcccgct tggcggaggt gtacggctcc 1020ggggagctgc ggctgacagt ggagcagaac gtgctcatcc ccaatgtgcc cgactcccga 1080gtggccccgc tcctcaaaga gccgctcttg aagaagttct cccccaaccc agggcccttg 1140cagcgggggt tggtgtcctg cacgggcaac cagttctgca actttgccct tatcgagacc 1200aaaaaccggg ctgtggcctt gatggaggag ctggaggcgg agctggagat cccccaaacg 1260gtgcgcatcc actggacggg ctgccccaac tcctgcggcc aaccccaagt agccgatatc 1320ggccttatgg gcaccactgc tcgcaaggac ggcagggtgg tggaggccgt ggacatctac 1380atggggggag aggtgggcaa agacgccaag ctgggcgaat gcgtgcgcaa agggatccct 1440tgcgaagacc tcaagccggt cttggtggag ctgctcattg aacactttgg ggccaagccg 1500cgtcagcatc cgtccgccgc ccaggcttct gttttggtaa cccgctag 1548612203DNAArabidopsis thaliana 61tctcacccac ccaaagccac tcactctctc ttctctctct ctgaagcgat gtcatcgacg 60tttcgagctc cggcgggagc cgctactgtg tttacggcgg atcagaagat cagacttggg 120aggctcgacg ctctgagatc ctctcattct 
gttttcttag gaagatatgg acgcggcggc 180gtcccggttc ctccttccgc ttcttcgtcg agttcttcgc ctattcaagc cgtctccact 240cctgcgaagc ctgagactgc gaccaagcgg agcaaagtcg aaattatcaa ggagaagagt 300aatttcataa ggtatccttt gaacgaggag cttttaacag aggctccaaa tgtcaacgag 360tcagccgtgc agcttatcaa gttccacggt agctaccaac agtacaacag agaagaacgt 420ggtggaagat cttactcctt catgcttcga actaagaatc catctgggaa ggtccctaac 480cagctctatt tgactatgga tgacttagct gatgagtttg gaattggtac tcttcgtttg 540accacaaggc agacgtttca gcttcatggt gttctgaagc agaatcttaa gactgtgatg 600agctcgatta ttaaaaatat gggtagcacg cttggtgcat gtggtgatct gaacagaaat 660gttcttgctc ctgctgcacc ttatgtgaag aaagactatc tctttgcaca agaaactgct 720gacaacattg cggctcttct ttctcctcaa tcagggttct attatgatat gtgggttgat 780ggagagcagt tcatgactgc tgaacctcca gaggtagtga aggctcgaaa tgataactcc 840catggaacta actttgtcga ctctcctgag cccatctatg gcacccagtt cttgcctaga 900aagttcaagg tcgctgtaac tgttcctaca gataattccg tcgacctcct caccaatgac 960attggcgttg ttgttgtttc agatgaaaat ggggaaccac agggtttcaa tatttatgtt 1020ggtgggggta tgggaagaac acacagaatg gagtctactt ttgcccgcct ggcagaacca 1080ataggttatg ttccaaagga agatattttg tatgctgtga aggccattgt agtcacacag 1140cgagaacacg ggagacgaga tgatcgtaaa tatagcagaa tgaaatattt gatcagctcc 1200tggggaattg agaagttcag agatgttgtt gagcaatatt atggtaaaaa gtttgagcct 1260tcccgtgaac ttccagagtg ggagttcaag agttacttgg gatggcatga acagggagat 1320ggtgcatggt tttgtgggct tcacgtagac agtggtcgtg ttggaggtat aatgaagaag 1380acgctgagag aagtaataga gaaatacaaa attgatgtcc gcatcacacc aaaccaaaac 1440attgtcttgt gtgatataaa gactgaatgg aagcgtccca tcaccacagt acttgctcag 1500gccggcttac tgcaacctga gtttgtcgac ccattaaacc aaactgcaat ggcttgccca 1560gcttttcctt tgtgccctct ggcaataact gaggcagagc gcgggatccc cagcattcta 1620aagagagtta gggcaatgtt tgaaaaggtt ggtctggact acgacgagtc tgttgtgata 1680agagtaaccg gttgtccaaa cggctgtgca agaccgtaca tggctgagct cggtctagtc 1740ggggatggtc ccaacagcta tcaggtttgg ctaggaggaa caccgaacct gacccagata 1800gcgagaagtt tcatggataa ggttaaggtt cacgacttag agaaagtctg cgagccattg 1860ttctatcact 
ggaaactaga gaggcaaact aaagaatcat ttggagaata cacaacccgc 1920atgggattcg agaaactgaa ggagctgata gatacataca aaggagtttc tcaatgagca 1980caacagagat catctttcgt tttataattc atgtaatgta atgtctctgt ctgaactgtt 2040actcttcggt aactctgatg gagaacttgt tctcgttttg gtttgatttt gtaccctctt 2100tttttttttt gtttttttgg attgctttgt ctttgattgg ataatgaagc attactgtat 2160caaggctaat tagcccatca ataagccttt ttaaagctct gga 2203622195DNAOryza sativa 62ttttttataa tgccaacttt gtacaaaaaa gcaggcttaa acaatgtgtg gcatcctcgc 60cgtgctcggc gtcgcagacg tctccctcgc caagcgctcc cgcatcatcg agctatcccg 120ccggttacgt catagaggcc ctgattggag tggtatacac tgctatcagg attgctatct 180tgcacaccag cggttggcta ttgttgatcc cacatccgga gaccagccgt tgtacaatga 240ggacaaatct gttgttgtga cggtgaatgg agagatctat aaccatgaag aattgaaagc 300taacctgaaa tctcataaat tccaaactgc tagcgattgt gaagttattg ctcatctgta 360tgaggaatat ggggaggaat ttgtggatat gttggatggg atgttcgctt ttgttcttct 420tgacacacgt gataaaagct tcattgcagc ccgtgatgct attggcattt gtcctttata 480catgggctgg ggtcttgatg gttcggtttg gttttcgtca gagatgaagg cattaggtga 540tgattgcgag cgattcatat ccttcccccc tgggcacttg tactccagca aaacaggtgg 600cctaaggaga tggtacaacc caccatggtt ttctgaaagc attccctcca ccccgtacaa 660tcctcttctt ctccgacaga gctttgagaa ggctattatt aagaggctaa tgacagatgt 720gccatttggt gttctcttgt ctggtggact ggactcttct ttggttgcat ctgttgtttc 780gcggcacttg gcagaggcaa aagttgccgc acagtgggga aacaaactgc atacattttg 840cattggtttg aaaggttctc ctgatcttag agctgctaag gaagttgcag actaccttgg 900tactgttcat cacgaactcc acttcacagt gcaggaaggc attgatgcac tggaggaagt 960catttaccat gttgagacat atgatgtaac gacaattaga gcaagcaccc caatgttctt 1020gatgtcacgt aaaattaaat ctttgggggt gaagatggtt ctttcgggag aaggttctga 1080tgagatattt ggcggttacc tttattttca caaggcacca aacaagaagg aattccatga 1140ggaaacatgt cggaagataa aagcccttca tttatatgat tgcttgggag cgaacaaatc 1200aacttctgca tggggtgttg aggcccgtgt tccgttcctt gacaaaaact tcatcaatgt 1260agctatggac attgatcctg aatggaaaat gataaaacgt gatcttggcc gtattgagaa 1320atgggttctc cggaatgcat ttgatgatga ggagaagccc tatttaccta 
agcacattct 1380atacaggcaa aaggagcaat tcagtgatgg tgttgggtac agttggattg atggattgaa 1440ggatcatgca aatgaacatg tatcagattc catgatgatg aacgctagct ttgtttaccc 1500agaaaacact ccagttacaa aagaagcgta ctattatagg acaatattcg agaaattctt 1560tcccaagaat gctgctaggt tgacagtacc tggaggtcct agcgtcgcgt gcagcactgc 1620taaagctgtt gaatgggacg cagcctggtc caaaaacctt gatccatctg gtcgtgctgc 1680tcttggtgtt catgatgctg catatgaaga tactctacaa aaatctcctg cctctgccaa 1740tcctgtcttg gataacggct ttggtccagc ccttggggaa agcatggtca aaaccgttgc 1800ttcagccact gccgtttaac tttctatcgt cgcacccagc tttcttgtac aaagttggca 1860ttataagaaa gcattgctta tcaatttgtt gcaacgaaca ggtcactatc agtcaaaata 1920atatcattat ttgccatcca gctgcagctc tggcccgtgt ctcaaaatct ctgatgttac 1980attgcacaag ataaaaatat atcatcatga acaataaaac tgtctgctta cataaacagt 2040aatacaaggg gtgttatgag ccatattcaa cgggaaacgt cgaggccgcg attaaattcc 2100aacatggatg ctgatttata tgggtataaa tgggctcgcg ataatgtcgg gcaatcaggt 2160gcgacaatct atcgcttgta tgggaagccc gatga 219563591PRTOryza sativa 63Met Cys Gly Ile Leu Ala Val Leu Gly Val Ala Asp Val Ser Leu Ala 1 5 10 15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Ile His Cys Tyr Gln Asp Cys Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Val Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Ser Val Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Lys Ala Asn Leu Lys Ser His Lys Phe Gln Thr Ala 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Glu 100 105 110 Phe Val Asp Met Leu Asp Gly Met Phe Ala Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Ile Cys Pro 130 135 140 Leu Tyr Met Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ser Ser Glu 145 150 155 160 Met Lys Ala Leu Gly Asp Asp Cys Glu Arg Phe Ile Ser Phe Pro Pro 165 170 175 Gly His Leu Tyr Ser Ser Lys Thr Gly Gly Leu Arg Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Ser Glu Ser Ile Pro Ser Thr Pro Tyr Asn Pro Leu 195 200 205 Leu Leu Arg Gln Ser Phe Glu 
Lys Ala Ile Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ser Val Val Ser Arg His Leu Ala Glu Ala Lys Val Ala Ala 245 250 255 Gln Trp Gly Asn Lys Leu His Thr Phe Cys Ile Gly Leu Lys Gly Ser 260 265 270 Pro Asp Leu Arg Ala Ala Lys Glu Val Ala Asp Tyr Leu Gly Thr Val 275 280 285 His His Glu Leu His Phe Thr Val Gln Glu Gly Ile Asp Ala Leu Glu 290 295 300 Glu Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe His Glu Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Leu Tyr Asp Cys Leu Gly Ala Asn 370 375 380 Lys Ser Thr Ser Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Asn Phe
Ile Asn Val Ala Met Asp Ile Asp Pro Glu Trp Lys Met 405 410 415 Ile Lys Arg Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Asn Ala 420 425 430 Phe Asp Asp Glu Glu Lys Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Asp His Ala Asn Glu His Val Ser Asp Ser Met Met Met Asn 465 470 475 480 Ala Ser Phe Val Tyr Pro Glu Asn Thr Pro Val Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Thr Ile Phe Glu Lys Phe Phe Pro Lys Asn Ala Ala Arg 500 505 510 Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ala Trp Ser Lys Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Asp Ala Ala Tyr Glu Asp Thr Leu Gln Lys 545 550 555 560 Ser Pro Ala Ser Ala Asn Pro Val Leu Asp Asn Gly Phe Gly Pro Ala 565 570 575 Leu Gly Glu Ser Met Val Lys Thr Val Ala Ser Ala Thr Ala Val 580 585 590 643246DNAOryza sativa 64aatccgaaaa gtttctgcac cgttttcacc ccctaactaa caatataggg aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact agaactatgc aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc ttagaatata cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag atttttttta aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata taattatata attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct tactccatcc caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc ttttttcgat tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag tgcacctcct caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa tacatttagg tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata atatctaaaa tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca tattgggcac acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg 
cggccaggag agaggaggag aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg gtcgatctct tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta tctgtgatga ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc ggttcggttt gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag caccggtgat tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg atgcttctcg atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca actttgaaga cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata atcctcagga acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa tggtgcaaat caggtctata 1800tgattgattt tgggctggcc aagaagtata gagactcatc aactcatcag catattccgt 1860atagagaaaa caaaaatttg acaggaactg ctagatacgc aagcatgaat actcatcttg 1920gcattgaaca aagtcgaagg gatgatttgg aatcgctggg ttatgtttta atgtacttct 1980taagaggaag tctcccttgg caggggctga aagcaggcac taagaaacag aagtatgaga 2040agatcagtga gaagaaagta tcaacatcaa tagagacctt gtgtagggga tatcctgcag 2100agtttgcatc atattttcat tactgtcgat cactaagatt tgatgataaa ccagattatg 2160cttatctgaa gagaattttc cgtgatcttt tcattcgtga agggtttcaa tttgattata 2220tatttgactg gaccattttg aaatatcagc aatcacagct tgccaatcct ccatctcgtg 2280ctcttggtgg tactgctggg ccaagctcag ggatgcctca tgctcttgtt aatgttgaga 2340ggcaatcagg tggagatgaa ggtcgaccaa ctggttggtc ttcatcaaat cttacacgta 2400ataagagcac ggggctgcat ttcaattctg gaagcttatt gaagcaaaaa ggcacagttg 2460ctaatgattt atccatgggt aaagagttat ccagttctaa ttttttccgg tcaagtggac 2520cattgaggcg tccagttgtc tctagcatcc gagacccagt gattgcaggg ggtgaacctg 2580acccctccgg cactctgaca aaagatgcaa gcccgggacc attgcgtaaa gtatccagtg 
2640ctgcacggag gagttcacca gttgtgtcct cagatcacaa gcgcagctcc tctatcaaaa 2700atgccaacat aaagaattta gagtccaccg tcaagggaat agagggttta agttttcgat 2760gatgagggac tgcattagta gctgtgcttt gtctcagttc tccgttcact gtaaattttg 2820gcacaccaac ttggggagta agagttctga tattagttgc tgtcaggaag taccataaag 2880ctgaattata caattaaaat ttgggatcca atcgcaaaag cacattaagg atatgatggg 2940gttgcagatc caaactcaca gattccagtt tatgctcgtc catacagtta taggcacttt 3000ccatattctt ttctttaatc tctgtctctt gcttgttatt gttatgtcgt ggtattcttg 3060ttgaggtcat gtttgtgaat tgcgaagatg gtcatgtata attgccgaga aatcatgtac 3120tagtttgttt taaacatgag caaactgtta ttttgttcaa gctactttaa tatcaaaaaa 3180aaaaaaaaaa gggcggccgc tctagagtat ccctcgaggg gcccaagctt acgcgtaccc 3240agcttt 32466560DNAArtificial sequenceprimer prm06049 65ggggacaagt ttgtacaaaa aagcaggctt aaacaatgtg tggcatcctc gccgtgctcg 606654DNAArtificial sequenceprimer prm06050 66ggggaccact ttgtacaaga aagctgggtg cgacgataga aagttaaacg gcag 5467591PRTOryza sativa 67Met Cys Gly Ile Leu Ala Val Leu Gly Val Ala Asp Val Ser Leu Ala 1 5 10 15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Ile His Cys Tyr Gln Asp Cys Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Val Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Ser Val Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Lys Ala Asn Leu Lys Ser His Lys Phe Gln Thr Ala 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Glu 100 105 110 Phe Val Asp Met Leu Asp Gly Met Phe Ala Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Ile Cys Pro 130 135 140 Leu Tyr Met Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ser Ser Glu 145 150 155 160 Met Lys Ala Leu Ser Asp Asp Cys Glu Arg Phe Ile Ser Phe Pro Pro 165 170 175 Gly His Leu Tyr Ser Ser Lys Thr Gly Gly Leu Arg Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Ser Glu Ser Ile Pro Ser Thr Pro Tyr Asn Pro Leu 195 200 205 Leu Leu Arg Gln Ser Phe Glu Lys Ala Ile Ile Lys Arg Leu Met Thr 
210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ser Val Val Ser Arg His Leu Ala Glu Ala Lys Val Ala Ala 245 250 255 Gln Trp Gly Asn Lys Leu His Thr Phe Cys Ile Gly Leu Lys Gly Ser 260 265 270 Pro Asp Leu Arg Ala Ala Lys Glu Val Ala Asp Tyr Leu Gly Thr Val 275 280 285 His His Glu Leu His Phe Thr Val Gln Glu Gly Ile Asp Ala Leu Glu 290 295 300 Glu Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe His Glu Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Leu Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ser Thr Ser Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Asn Phe Ile Asn Val Ala Met Asp Ile Asp Pro Glu Trp Lys Met 405 410 415 Ile Lys Arg Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Asn Ala 420 425 430 Phe Asp Asp Glu Glu Lys Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Asp His Ala Asn Glu His Val Ser Asp Ser Met Met Met Asn 465 470 475 480 Ala Ser Phe Val Tyr Pro Glu Asn Thr Pro Val Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Thr Ile Phe Glu Lys Phe Phe Pro Lys Asn Ala Ala Arg 500 505 510 Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ala Trp Ser Lys Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Asp Ala Ala Tyr Glu Asp Thr Leu Gln Lys 545 550 555 560 Ser Pro Ala Ser Ala Asn Pro Val Leu Asp Asn Gly Phe Gly Pro Ala 565 570 575 Leu Gly Glu Ser Met Val Lys Thr Val Ala Ser Ala Thr Ala Val 580 585 590 68591PRTAquilegia formosa 68Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Asp Ser Gln Ala 1 5 10 15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu Tyr Gln His Gly Asp Asn 
Phe Leu Ser His 35 40 45 Gln Arg Leu Ala Val Ile Asp Pro Ala Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Ser Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Ala Leu Arg Lys Arg Leu Pro Asn His Lys Phe Arg Thr Gly 85 90 95 Ser Asp Cys Asp Val Ile Ala His Leu Tyr Glu Glu Phe Gly Glu Asp 100 105 110 Phe Val Asp Met Leu Asp Gly Met Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Asn Ser Phe Leu Val Ala Arg Asp Ala Ile Gly Ile Thr Ser 130 135 140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Ile Trp Ile Ser Ser Glu 145 150 155 160 Met Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Cys Phe Pro Pro 165 170 175 Gly His Leu Tyr Ser Ser Lys Asn Ser Gly Phe Arg Arg Trp Tyr Asn 180 185 190 Pro Ser Trp Phe Ser Glu Ala Val Pro Ser Thr Pro Tyr Asp Pro Leu 195 200 205 Val Leu Arg Arg Ala Phe Glu Asn Ala Val Val Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ser Ile Thr Ala Arg His Leu Ala Glu Thr Lys Ala Ala Lys 245 250 255 Gln Trp Gly Ala Gln Leu His Ser Phe Cys Val Gly Leu Glu Gly Ser 260 265 270 Pro Asp Leu Lys Ala Gly Lys Glu Val Ala Asp Tyr Leu Gly Thr Val 275 280 285 His His Glu Phe His Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Asp Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Met Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Phe His Arg Glu Thr 355 360 365 Cys His Lys Ile Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ser Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Glu Phe Ile Asn Val Ala Met Ala Ile Asp Pro Glu Trp Lys Met 405 410 415 Ile Lys Arg Asp Gln Gly Arg Ile Glu Lys Trp Val Leu Arg Arg Ala 420 425 430 Phe Asp Asp Glu Asp His Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 
450 455 460 Leu Lys Ala His Ala Ala Ser His Val Thr Asp Lys Met Met Arg Asn 465 470 475 480 Ala Lys Asn Ile Phe Leu His Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala Lys 500 505 510 Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ser Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Ala Ser Ala Tyr Glu Ala Gln Leu Ser Ala 545 550 555 560 Pro Leu Ala Asn Gly Asn Val Pro Val Lys Ile Phe Asn Asn Val Pro 565 570 575 Arg Met Val Glu Val Gly Ala Pro Ala Ser Leu Thr Ile Arg Ser 580 585 590 69590PRTAsparagus officinalis 69Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Asp Ser Gln Ala 1 5 10 15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu Cys Gln His Gly Asp Cys Phe Leu Ser His 35 40 45 Gln Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Ser Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Arg Arg Arg Leu Pro Asp His Lys Tyr Arg Thr Gly 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Asp 100 105 110 Phe Val Asp Met Leu Asp Gly Met Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asn Asn Cys Phe Val Ala Ala Arg Asp Ala Val Gly Ile Thr Pro 130 135 140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Leu Ser Ser Glu 145 150 155 160 Met Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Val Phe Pro Pro 165 170 175 Gly Asn Leu Tyr Ser Ser Arg Ser Gly Ser Phe Arg Arg Trp Tyr Asn 180 185 190 Pro Gln Trp Tyr Asn Glu Thr Ile Pro Ser Ala Pro Tyr Asp Pro Leu 195 200 205 Val Leu Arg Lys Ala Phe Glu Asp Ala Val Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ala Val Thr Ala Arg His Leu Ala Gly Ser Lys Ala Ala Glu 245 250 255 Gln Trp Gly Thr Gln Leu His Ser Phe Cys Val Gly Leu Glu Gly Ser 260 265 270 Pro Asp Leu Lys Ala Ala Lys Glu Val Ala Glu Tyr Leu Gly 
Thr Val 275 280 285 His His Glu Phe His Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Asp Val Ile Phe His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ala Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Met Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Phe His His Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375
380 Lys Ala Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Glu Phe Met Asp Val Ala Met Ser Ile Asp Pro Glu Ser Lys Met 405 410 415 Ile Lys Pro Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Lys Ala 420 425 430 Phe Asp Asp Glu Glu Asn Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Ala His Ala Ala Lys His Val Thr Asp Arg Met Met Leu Asn 465 470 475 480 Ala Ala Arg Ile Tyr Pro His Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala Arg 500 505 510 Phe Thr Val Pro Gly Gly Pro Ser Ile Ala Cys Ser Thr Ala Lys Ala 515 520 525 Ile Glu Trp Asp Ala Arg Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Asp Ser Ala Tyr Asp Pro Pro Leu Pro Ser 545 550 555 560 Ser Ile Ser Ala Gly Lys Gly Ala Ala Met Ile Thr Asn Lys Lys Pro 565 570 575 Arg Ile Val Asp Val Ala Thr Pro Gly Val Val Ile Ser Thr 580 585 590 70586PRTBrassica oleracea 70Met Cys Gly Ile Leu Ala Leu Leu Gly Cys Ser Asp Asp Ser Gln Ala 1 5 10 15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Ile Tyr Gln Asn Gly Phe Asn Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Ile Asp Pro Asp Ser Gly Asp Gln Pro Leu Phe 50 55 60 Asn Glu Asp Lys Ser Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Arg Lys Gly Leu Lys Asn His Lys Phe His Thr Gly 85 90 95 Ser Asp Cys Asp Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Asn 100 105 110 Phe Val Asp Met Leu Asp Gly Ile Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Asn Ser Phe Met Val Ala Arg Asp Ala Val Gly Val Thr Ser 130 135 140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Leu Trp Val Ser Ser Glu 145 150 155 160 Met Lys Gly Leu His Glu Asp Cys Glu His Phe Glu Ala Phe Pro Pro 165 170 175 Gly His Leu Tyr Ser Ser Lys Ser Gly Gly Gly Phe Lys Gln Trp Tyr 180 185 190 Asn Pro Pro Trp Phe Asn Glu Ser Val Pro Ser Thr Pro Tyr Glu Pro 195 200 
205 Leu Ala Ile Arg Ser Ala Phe Glu Asp Ala Val Ile Lys Arg Leu Met 210 215 220 Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser 225 230 235 240 Leu Val Ala Ser Ile Thr Ala Arg His Leu Ala Gly Thr Lys Ala Ala 245 250 255 Lys Arg Trp Gly Pro Gln Leu His Ser Phe Cys Val Gly Leu Glu Gly 260 265 270 Ser Pro Asp Leu Lys Ala Gly Lys Glu Val Ala Glu Tyr Leu Gly Thr 275 280 285 Val His His Glu Phe His Phe Thr Val Gln Asp Gly Ile Asp Ala Ile 290 295 300 Glu Asp Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr Ile Arg 305 310 315 320 Ala Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly 325 330 335 Val Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly 340 345 350 Tyr Leu Tyr Phe His Lys Ala Pro Asn Lys Gln Glu Phe His Gln Glu 355 360 365 Thr Cys Arg Lys Ile Lys Ala Leu His Lys Tyr Asp Cys Leu Arg Ala 370 375 380 Asn Lys Ala Thr Ser Ala Phe Gly Leu Glu Ala Arg Val Pro Phe Leu 385 390 395 400 Asp Lys Glu Phe Ile Asn Thr Ala Met Ser Leu Asp Pro Glu Ser Lys 405 410 415 Met Ile Lys Pro Glu Glu Gly Arg Ile Glu Lys Trp Val Leu Arg Arg 420 425 430 Ala Phe Asp Asp Glu Glu Arg Pro Tyr Leu Pro Lys His Ile Leu Tyr 435 440 445 Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp 450 455 460 Gly Leu Lys Ala His Ala Ala Glu Asn Val Asn Asp Lys Met Met Ser 465 470 475 480 Lys Ala Ala Phe Ile Phe Pro His Asn Thr Pro Leu Thr Lys Glu Ala 485 490 495 Tyr Tyr Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala 500 505 510 Arg Leu Thr Val Pro Gly Gly Ala Thr Val Ala Cys Ser Thr Ala Lys 515 520 525 Ala Val Glu Trp Asp Ala Ser Trp Ser Asn Asn Met Asp Pro Ser Gly 530 535 540 Arg Ala Ala Ile Gly Val His Leu Ser Ala Tyr Asp Gly Ser Lys Val 545 550 555 560 Ala Leu Pro Leu Pro Ala Pro His Lys Ala Ile Asp Asp Ile Pro Met 565 570 575 Met Met Gly Gln Glu Val Val Ile Gln Thr 580 585 71578PRTChlamydomonas reinhardtii 71Met Cys Gly Ile Leu Ala Val Leu Asn Thr Thr Asp Asp Ser Gln Ala 1 5 10 15 Met Arg Ser Arg Val Leu Ala Leu Ser Arg Arg Gln Arg His Arg Gly 20 25 
30 Pro Asp Trp Ser Gly Met His Gln Phe Gly Asn Asn Phe Leu Ala His 35 40 45 Glu Arg Leu Ala Ile Met Asp Pro Ala Ser Gly Asp Gln Pro Leu Phe 50 55 60 Asn Glu Asp Arg Thr Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 Tyr Lys Glu Leu Arg Gln Gln Ile Thr Asp Ala Cys Pro Gly Lys Lys 85 90 95 Phe Ala Thr Asn Ser Asp Cys Glu Val Ile Ser His Leu Tyr Glu Leu 100 105 110 His Gly Glu Lys Val Ala Ser Met Leu Asp Gly Phe Phe Ala Phe Val 115 120 125 Val Leu Asp Thr Arg Asn Asn Thr Phe Tyr Ala Ala Arg Asp Pro Ile 130 135 140 Gly Ile Thr Cys Met Tyr Ile Gly Trp Gly Arg Asp Gly Ser Val Trp 145 150 155 160 Leu Ser Ser Glu Met Lys Cys Leu Lys Asp Asp Cys Thr Arg Phe Gln 165 170 175 Gln Phe Pro Pro Gly His Phe Tyr Asn Ser Lys Thr Gly Glu Phe Thr 180 185 190 Arg Tyr Tyr Asn Pro Lys Tyr Phe Leu Asp Phe Glu Ala Lys Pro Gln 195 200 205 Arg Phe Pro Ser Ala Pro Tyr Asp Pro Val Ala Leu Arg Gln Ala Phe 210 215 220 Glu Gln Ser Val Glu Lys Arg Met Met Ser Asp Val Pro Phe Gly Val 225 230 235 240 Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu Val Ala Ser Ile Ala Ala 245 250 255 Arg Lys Ile Lys Arg Glu Gly Ser Val Trp Gly Lys Leu His Ser Phe 260 265 270 Cys Val Gly Leu Pro Gly Ser Pro Asp Leu Lys Ala Gly Ala Gln Val 275 280 285 Ala Glu Phe Leu Gly Thr Asp His His Glu Phe His Phe Thr Val Gln 290 295 300 Glu Gly Ile Asp Ala Ile Ser Glu Val Ile Tyr His Ile Glu Thr Phe 305 310 315 320 Asp Val Thr Thr Ile Arg Ala Ser Thr Pro Met Phe Leu Met Ser Arg 325 330 335 Lys Ile Lys Ala Leu Gly Val Lys Met Val Leu Ser Gly Glu Gly Ser 340 345 350 Asp Glu Val Phe Gly Gly Tyr Leu Tyr Phe His Lys Ala Pro Asn Lys 355 360 365 Glu Glu Phe Gln Ser Glu Thr Val Arg Lys Ile Gln Asp Leu Tyr Lys 370 375 380 Tyr Asp Cys Leu Arg Ala Asn Lys Ser Thr Met Ala Trp Gly Val Glu 385 390 395 400 Ala Arg Val Pro Phe Leu Asp Arg His Phe Leu Asp Val Ala Met Glu 405 410 415 Ile Asp Pro Ala Glu Lys Met Ile Asp Lys Ser Lys Gly Arg Ile Glu 420 425 430 Lys Tyr Ile Leu Arg Lys Ala Phe Asp Thr Pro Glu Asp Pro Tyr Leu 435 440 445 Pro Asn Glu 
Val Leu Trp Arg Gln Lys Glu Gln Phe Ser Asp Gly Val 450 455 460 Gly Tyr Asn Trp Ile Asp Gly Leu Lys Ala His Ala Asp Ser Gln Val 465 470 475 480 Ser Asp Asp Met Met Lys Thr Ala Ala His Arg Tyr Pro Asp Asn Thr 485 490 495 Pro Arg Thr Lys Glu Ala Tyr Trp Tyr Arg Ser Ile Phe Glu Thr His 500 505 510 Phe Pro Gln Arg Ala Ala Val Glu Thr Val Pro Gly Gly Pro Ser Val 515 520 525 Ala Cys Ser Thr Ala Thr Ala Ala Leu Trp Asp Ala Thr Trp Ala Gly 530 535 540 Lys Glu Asp Pro Ser Gly Arg Ala Val Ala Gly Val His Asp Ser Ala 545 550 555 560 Tyr Asp Ala Ala Ala Ala Ala Asn Gly Glu Pro Ala Ala Lys Lys Ala 565 570 575 Lys Lys 72579PRTGlycine max 72Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Ser Ser Gln Ala 1 5 10 15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu His Gln Tyr Gly Asp Asn Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Val Asp Pro Ala Ser Gly Asp Gln Pro Leu Phe 50 55 60 Asn Glu Asp Lys Thr Val Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Arg Lys Gln Leu Pro Asn His Thr Phe Arg Thr Gly 85 90 95 Ser Asp Cys Asp Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Asn 100 105 110 Phe Val Asp Met Leu Asp Gly Ile Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Asn Ser Phe Ile Val Ala Arg Asp Ala Ile Gly Val Thr Ser 130 135 140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Ile Ser Ser Glu 145 150 155 160 Leu Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Ser Phe Pro Pro 165 170 175 Gly His Leu Tyr Ser Ser Lys Glu Arg Ala Phe Arg Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Ser Glu Ala Ile Pro Ser Ala Pro Tyr Asp Pro Leu 195 200 205 Ala Leu Arg His Ala Phe Glu Lys Ala Val Val Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ala Val Thr Ala Arg Tyr Leu Ala Gly Thr Asn Ala Ala Lys 245 250 255 Gln Trp Gly Thr Lys Leu His Ser Phe Cys Val Gly Leu Glu Gly Ala 260 265 270 Pro Asp Leu Lys Ala Ala Lys Glu Val Ala Asp Tyr Ile Gly Thr Val 275 280 285 His 
His Glu Phe His Tyr Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Asp Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Ile Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Trp Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Phe His Gln Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Lys Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ser Thr Phe Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Asp Phe Ile Arg Val Ala Met Asn Ile Asp Pro Asp Tyr Lys Met 405 410 415 Ile Lys Lys Glu Glu Gly Arg Ile Glu Lys Trp Val Leu Arg Arg Ala 420 425 430 Phe Asp Asp Glu Glu His Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Gly Trp Ile Asp Gly 450 455 460 Leu Lys Ala His Ala Glu Lys His Val Thr Asp Arg Met Met Leu Asn 465 470 475 480 Ala Ala Asn Ile Phe Pro Phe Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala Arg 500 505 510 Leu Ser Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ala Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Ala Ser Ala Tyr Gly Asn Gln Val Lys Ala 545 550 555 560 Val Glu Pro Glu Lys Ile Ile Pro Lys Met Glu Val Ser Pro Leu Gly 565 570 575 Val Ala Ile 73579PRTGlycine max 73Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Ser Ser Gln Ala 1 5 10 15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu His Gln Tyr Gly Asp Asn Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Val Asp Pro Ala Ser Gly Asp Gln Pro Leu Phe 50 55 60 Asn Glu Asp Lys Thr Val Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Arg Lys Gln Leu Pro Asn His Thr Phe Arg Thr Gly 85 90 95 Ser Asp Cys Asp Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Asn 100 105 110 Phe Met Asp Met Leu Asp Gly Ile Ser Ser Phe Val Leu Leu Asp Thr 115 
120 125 Arg Asp Asn Ser Phe Ile Val Ala Arg Asp Ala Ile Gly Val Thr Ser 130 135 140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Ile Ser Ser Glu 145 150 155 160 Leu Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Ser Phe Pro Pro 165 170 175 Gly His Leu Tyr Ser Ser Lys Glu Arg Ala Phe Arg Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Leu Ser Leu Ala Ile Pro Ser Ala Pro Tyr Asp Pro Leu 195 200 205 Ala Leu Arg His Ala Phe Glu Lys Leu Trp Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ala Val Thr Ala Arg Tyr Leu Ala Gly Thr Lys Ala Ala Lys 245 250 255 Gln Trp Gly Thr Lys Leu His Ser Phe Cys Val Gly Leu Glu Gly Ala 260 265 270 Pro Asp Leu Lys Ala Thr Lys Glu Val Ala Glu Tyr Ile Gly Thr Val 275 280 285 His His Glu Phe His Tyr Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Asp Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Ile Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Trp Val Ile Ser Gly Glu Gly Ser Asp Val Phe Phe Gly Gly Tyr 340
345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Phe His Gln Glu Thr 355 360 365 Cys Arg Thr Ile Ile Val Leu His Arg Tyr Asp Cys Ser Arg Ala Asn 370 375 380 Lys Ser Thr Phe Val Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Glu Phe Ile Arg Val Ala Met Asn Ile Asp Pro Glu Cys Lys Met 405 410 415 Ile Lys Lys Glu Glu Gly Arg Ile Glu Lys Trp Ala Leu Arg Arg Ala 420 425 430 Phe Asp Asp Glu Glu His Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Gly Trp Ile Asp Gly 450 455 460 Leu Lys Ala His Ala Glu Lys His Val Thr Asp Arg Met Met Leu Asn 465 470 475 480 Ala Ala Asn Ile Phe Pro Phe Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485 490 495 His Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Cys Arg 500 505 510 Leu Thr Val Pro Gly Gly Thr Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ala Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Ala Ser Ala Tyr Gly Asn Gln Val Lys Ala 545 550 555 560 Val Glu Pro Glu Lys Ile Ile Pro Lys Met Glu Val Ser Pro Leu Gly 565 570 575 Val Ala Ile 74581PRTGlycine max 74Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Asp Ser Arg Ala 1 5 10 15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu His Gln His Gly Asp Cys Phe Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Val Asp Pro Ala Ser Gly Asp Gln Pro Leu Phe 50 55 60 Asn Glu Asp Lys Ser Val Ile Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Arg Lys Gln Leu Pro Asn His Asn Phe Arg Thr Gly 85 90 95 Ser Asp Cys Asp Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Asp 100 105 110 Phe Val Asp Met Leu Asp Gly Ile Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Asn Ser Phe Ile Val Ala Arg Asp Ala Ile Gly Val Thr Ser 130 135 140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Ile Ser Ser Glu 145 150 155 160 Met Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Cys Phe Pro Pro 165 170 175 Gly His Leu Tyr Ser Ser Lys Glu Arg Gly Phe Arg Arg Trp 
Tyr Asn 180 185 190 Pro Pro Trp Phe Ser Glu Ala Ile Pro Ser Ala Pro Tyr Asp Pro Leu 195 200 205 Val Leu Arg His Ala Phe Glu Gln Ala Val Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ser Ile Thr Ser Arg Tyr Leu Ala Asn Thr Lys Ala Ala Glu 245 250 255 Gln Trp Gly Ser Lys Leu His Ser Phe Cys Val Gly Leu Glu Gly Ser 260 265 270 Pro Asp Leu Lys Ala Ala Lys Glu Val Ala Asp Tyr Leu Gly Thr Val 275 280 285 His His Glu Phe Thr Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Asp Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Trp Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Phe His Arg Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ser Thr Phe Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Ala Phe Ile Asn Ala Ala Met Ser Ile Asp Pro Glu Trp Lys Met 405 410 415 Ile Lys Arg Asp Glu Gly Arg Ile Glu Lys Trp Ile Leu Arg Arg Ala 420 425 430 Phe Asp Asp Glu Glu His Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Ala His Ala Ala Lys His Val Thr Glu Lys Met Met Leu Asn 465 470 475 480 Ala Gly Asn Ile Tyr Pro His Asn Thr Pro Lys Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala Arg 500 505 510 Leu Thr Val Pro Gly Gly Ala Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ala Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Ile Ser Ala Tyr Glu Asn Gln Asn Asn Lys 545 550 555 560 Gly Val Glu Ile Glu Lys Ile Ile Pro Met Asp Ala Ala Pro Leu Gly 565 570 575 Val Ala Ile Gln Gly 580 75569PRTGlycine max 75Met Cys Gly Ile Leu Ala Val Leu Gly Cys Val Asp Asn Ser Gln Thr 1 5 10 15 Lys Arg Ala Arg 
Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Ile His Cys Tyr Glu Asp Cys Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Val Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Thr Ile Ile Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Lys Gln Leu Arg Gln Lys Leu Ser Ser His Gln Phe Arg Thr Gly 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Glu 100 105 110 Phe Val Asn Met Leu Asp Gly Met Phe Ala Phe Ile Leu Leu Asp Thr 115 120 125 Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Ile Thr Pro 130 135 140 Leu Tyr Leu Gly Trp Gly His Asp Gly Ser Thr Trp Phe Ala Ser Glu 145 150 155 160 Met Lys Ala Leu Ser Asp Asp Cys Glu Arg Phe Ile Ser Phe Pro Pro 165 170 175 Gly His Ile Tyr Ser Ser Lys Gln Gly Gly Leu Arg Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Ser Glu Asp Ile Pro Ser Thr Pro Tyr Asp Pro Thr 195 200 205 Leu Leu Arg Glu Thr Phe Glu Arg Ala Val Val Lys Arg Met Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ala Val Val Asn Arg Tyr Leu Ala Glu Ser Glu Ser Ala Arg 245 250 255 Gln Trp Gly Ser Gln Leu His Thr Phe Cys Ile Gly Leu Lys Gly Ser 260 265 270 Pro Asp Leu Lys Ala Ala Lys Glu Val Ala Asp Tyr Leu Gly Thr Arg 275 280 285 His His Glu Leu Tyr Phe Thr Val Gln Glu Gly Ile Asp Ala Leu Glu 290 295 300 Glu Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Ala Met Phe Leu Met Ser Arg Lys Ile Lys Ala Leu Gly Val 325 330 335 Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe His Glu Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Leu Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ser Thr Ala Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Glu Phe Ile Asn Val Ala Met Ser Ile Asp Pro Glu Trp Lys Met 405 410 415 Ile Arg Pro Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Asn Ala 420 425 430 Phe Asp Asp Asp Lys Asn Pro Tyr Leu 
Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Asp His Ala Asn Lys Gln Val Thr Asp Ala Thr Met Met Ala 465 470 475 480 Ala Asn Phe Ile Tyr Pro Glu Asn Thr Pro Thr Thr Lys Glu Gly Tyr 485 490 495 Leu Tyr Arg Thr Ile Phe Glu Lys Phe Phe Pro Lys Asn Ala Ala Lys 500 505 510 Ala Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ala Trp Ser Lys Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Ile His Asp Ala Ala Tyr Asp Ala Val Asp Thr Lys 545 550 555 560 Ile Asp Glu Pro Lys Asn Gly Thr Leu 565 76581PRTPhyscomitrella patens 76Met Cys Gly Ile Leu Ala Ile Leu Gly Ser His Asp Ala Ser Pro Ala 1 5 10 15 Arg Arg Asp Arg Ile Leu Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu Phe Ala Gly Gln Lys Cys Trp Cys Tyr Leu 35 40 45 Ala His Glu Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly Asp Gln Pro 50 55 60 Leu Tyr Asn Glu Asn Lys Asp Ile Val Val Ala Ala Asn Gly Glu Ile 65 70 75 80 Tyr Asn His Glu Ala Leu Lys Lys Ser Met Lys Pro His Lys Tyr His 85 90 95 Thr Gln Ser Asp Cys Glu Val Ile Ala His Leu Phe Glu Asp Val Gly 100 105 110 Glu Asp Val Val Asn Met Leu Asp Gly Met Phe Ser Phe Val Leu Val 115 120 125 Asp Asn Arg Asp Asn Ser Phe Ile Ala Ala Arg Asp Pro Ile Gly Ile 130 135 140 Thr Pro Leu Tyr Tyr Gly Trp Gly Ala Asp Gly Ser Val Trp Phe Ala 145 150 155 160 Ser Glu Met Lys Ala Leu Lys Asp Asp Cys Glu Arg Phe Glu Ile Phe 165 170 175 Pro Pro Gly His Ile Tyr Ser Ser Lys Ala Gly Gly Leu Arg Arg Tyr 180 185 190 Tyr Asn Pro Ala Trp Phe Ser Glu Thr Phe Val Pro Ser Thr Pro Tyr 195 200 205 Gln Ser Leu Val Leu Arg Ala Ala Phe Glu Lys Ala Val Ile Lys Arg 210 215 220 Leu Met Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp 225 230 235 240 Ser Ser Leu Val Ala Ala Val Ala Ser Arg His Ile Ala Gly Thr Lys 245 250 255 Ala Ala Asn Ile Trp Gly Lys Gln Leu His Ser Phe Cys Val Gly Leu 260 265 270 Gln Gly Ser Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Asn Tyr 
Ile 275 280 285 Gly Thr Gln His His Glu Phe His Phe Thr Val Gln Glu Gly Leu Asp 290 295 300 Ala Leu Ser Asp Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr 305 310 315 320 Ile Arg Ala Ser Thr Pro Met Phe Leu Met Thr Arg Lys Ile Lys Ala 325 330 335 Leu Gly Val Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe 340 345 350 Gly Gly Tyr Leu Tyr Phe His Lys Ala Pro Asn Arg Glu Glu Phe His 355 360 365 His Glu Leu Val Arg Lys Ile Lys Ala Leu His Met Tyr Asp Cys Gln 370 375 380 Arg Ala Asn Lys Ser Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro 385 390 395 400 Phe Leu Asp Lys Glu Phe Met Glu Val Ala Met Ala Ile Asp Pro Ala 405 410 415 Glu Lys Leu Ile Arg Lys Asp Gln Gly Arg Ile Glu Lys Trp Val Leu 420 425 430 Arg Lys Ala Phe Tyr Asp Glu Lys Asn Pro Tyr Leu Pro Lys His Ile 435 440 445 Leu Tyr Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp 450 455 460 Ile Asp Gly Leu Lys Ala His Ala Gln Ser His Val Ser Asp Gln Met 465 470 475 480 Leu Lys His Ala Lys His Val Tyr Pro Tyr Asn Thr Pro Gln Thr Lys 485 490 495 Glu Ala Tyr Tyr Tyr Arg Met Leu Phe Glu Lys His Phe Pro Gln Gln 500 505 510 Ser Ala Arg Leu Thr Val Pro Gly Gly Ala Ser Val Ala Cys Ser Thr 515 520 525 Ala Thr Ala Val Ala Trp Asp Lys Ser Trp Ala Gly Asn Leu Asp Pro 530 535 540 Ser Gly Arg Ala Ala Leu Gly Cys His Asp Ala Ala Tyr Thr Glu Asn 545 550 555 560 Ser Ala Ala Met Ser Tyr Ile Thr Lys Asn Met Ser Asn Val Gly Gln 565 570 575 Lys Met Thr Ile His 580 77603PRTPhyscomitrella patens 77Met Cys Gly Ile Leu Ala Ile Leu Gly Ala Asp Gly Ala Val Pro Ser 1 5 10 15 Ala Gly Arg Asp Arg Ala Leu Ala Leu Ser Arg Arg Leu Arg His Arg 20 25 30 Gly Pro Asp Trp Ser Gly Leu Phe Glu Gly Lys Asp Ser Trp Cys Tyr 35 40 45 Leu Ala His Glu Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly Asp Gln 50 55 60 Pro Leu Tyr Asn Gly Thr Lys Asp Ile Val Val Ala Ala Asn Gly Glu 65 70 75 80 Ile Tyr Asn His Glu Leu Leu Lys Lys Asn Met Lys Pro His Glu Tyr 85 90 95 His Thr Gln Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Asp Val 100 105 110 Gly Glu Glu Val Val Asn 
Met Leu Asp Gly Met Trp Ser Phe Val Leu 115 120 125 Val Asp Ser Arg Asp Asn Ser Phe Ile Ala Ala Arg Asp Pro Ile Gly 130 135 140 Ile Thr Pro Leu Tyr Leu Gly Trp Gly Ala Asp Gly Arg Thr Val Trp 145 150 155 160 Phe Ala Ser Glu Met Lys Ala Leu Lys Asp Asp Cys Glu Arg Leu Glu 165 170 175 Val Phe Pro Pro Gly His Ile Tyr Ser Ser Lys Ala Gly Gly Leu Arg 180 185 190 Arg Tyr Tyr Asn Pro Gln Trp Phe Ser Glu Thr Phe Val Pro Glu Thr 195 200 205 Pro Tyr Gln Pro Leu Glu Leu Arg Ser Ala Phe Glu Lys Ala Val Val 210 215 220 Lys Arg Leu Met Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly 225 230 235 240 Leu Asp Ser Ser Leu Val Ala Ser Val Ala Ala Arg His Leu Ala Glu 245 250 255 Thr Lys Ala Val Arg Ile Trp Gly Asn Glu Leu His Ser Phe Cys Val 260 265 270 Gly Leu Glu Gly Ser Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Lys 275 280 285 Tyr Ile Gly Thr Arg His His Glu Phe Asn Phe Thr Val Gln Glu Gly 290 295 300 Leu Asp Ala Leu Ser Asp Val Ile Tyr His Val Glu Thr Tyr Asp Val 305 310 315 320 Thr Thr Ile Arg Ala Ser Thr Pro Met Phe Leu Met Thr Arg Lys Ile
325 330 335 Lys Ala Leu Gly Val Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu 340 345 350 Ile Phe Gly Gly Tyr Leu Tyr Phe His Lys Ala Pro Asn Arg Glu Glu 355 360 365 Phe His His Glu Leu Val Arg Lys Ile Lys Ala Leu His Leu Tyr Asp 370 375 380 Cys Gln Arg Ala Asn Lys Ser Thr Ser Ala Trp Gly Leu Glu Ala Arg 385 390 395 400 Val Pro Phe Leu Asp Lys Glu Phe Met Asp Val Ala Met Met Ile Asp 405 410 415 Pro Ser Glu Lys Met Ile Arg Lys Asp Leu Gly Arg Ile Glu Lys Trp 420 425 430 Val Leu Arg Lys Ala Phe Asp Asp Glu Glu Arg Pro Tyr Leu Pro Lys 435 440 445 His Ile Leu Tyr Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr 450 455 460 Ser Trp Ile Asp Gly Leu Lys Glu Tyr Ala Glu Ser His Val Thr Asp 465 470 475 480 Gln Met Met Lys His Ala Lys His Val Tyr Pro Phe Asn Thr Pro Asn 485 490 495 Thr Lys Glu Gly Tyr Tyr Tyr Arg Met Ile Phe Glu Lys His Phe Pro 500 505 510 Gln Gln Ser Ala Arg Met Thr Val Pro Gly Gly Pro Ser Val Ala Cys 515 520 525 Ser Thr Ala Thr Ala Val Ala Trp Asp Glu Ala Trp Ala Asn Asn Leu 530 535 540 Asp Pro Ser Gly Arg Ala Ala Leu Gly Cys His Asp Ser Ala Tyr Thr 545 550 555 560 Asp Lys His Ser Glu Lys Ala Ala Pro Ala Ala Glu Ala Asn Gly Thr 565 570 575 Ala Ser His Glu Asn Gly His Thr Phe Ser Lys Pro Lys Ser Thr Leu 580 585 590 Asp Ala Thr Ile Leu Lys Thr Gln Ala Val His 595 600 78592PRTPhyscomitrella patens 78Met Cys Gly Ile Leu Ala Ile Leu Gly Cys His Asp Lys Ser Val Thr 1 5 10 15 Arg Arg His Arg Cys Leu Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu Phe Val Asp Glu Ala Ser Gly Cys Tyr Leu 35 40 45 Ala His Glu Arg Leu Ala Ile Ile Asp Pro Thr Ser Gly Asp Gln Pro 50 55 60 Leu Phe Asn Glu Asn Lys Asp Ile Val Val Ala Val Asn Gly Glu Ile 65 70 75 80 Tyr Asn His Glu Ala Leu Lys Ala Ser Met Lys Ala His Lys Tyr His 85 90 95 Thr Gln Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Ile Gly 100 105 110 Glu Glu Val Val Glu Lys Leu Asp Gly Met Phe Ser Phe Val Leu Val 115 120 125 Asp Leu Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Pro Leu Gly Ile 130 135 140 Thr 
Pro Leu Tyr Leu Gly Trp Gly Asn Asp Gly Ser Val Trp Phe Ala 145 150 155 160 Ser Glu Met Lys Ala Leu Lys Asp Asp Cys Glu Arg Phe Glu Ser Phe 165 170 175 Pro Pro Gly His Met Tyr Ser Ser Lys Gln Gly Gly Leu Arg Arg Tyr 180 185 190 Tyr Asn Pro Pro Trp Phe Asn Glu Ser Ile Pro Ala Glu Pro Tyr Asp 195 200 205 Pro Leu Ile Leu Arg His Ala Phe Glu Lys Ser Val Ile Lys Arg Leu 210 215 220 Met Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser 225 230 235 240 Ser Leu Val Ala Ala Val Ala Gln Arg His Leu Ala Gly Ser Thr Ala 245 250 255 Ala Lys Gln Trp Gly Asn Lys Leu His Ser Phe Cys Val Gly Leu Glu 260 265 270 Gly Ser Pro Asp Leu Lys Ala Gly Arg Glu Val Ala Asp Tyr Ile Gly 275 280 285 Thr Val His Lys Glu Phe His Phe Thr Val Gln Glu Gly Leu Asp Ala 290 295 300 Ile Ser Asp Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile 305 310 315 320 Arg Ala Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ala Leu 325 330 335 Gly Val Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly 340 345 350 Gly Tyr Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Phe His Lys 355 360 365 Glu Thr Cys Arg Lys Leu Lys Ala Leu His Leu Tyr Asp Cys Leu Arg 370 375 380 Ala Asn Lys Ser Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro Phe 385 390 395 400 Leu Asp Arg Asp Phe Val Asn Leu Ala Met Ser Ile Asp Pro Ala Glu 405 410 415 Lys Met Ile Asn Lys Lys Glu Gly Lys Ile Glu Lys Trp Ile Ile Arg 420 425 430 Lys Ala Phe Asp Asp Glu Glu Asn Pro Tyr Leu Pro Lys His Ile Leu 435 440 445 Tyr Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile 450 455 460 Asp Gly Leu Lys Asp His Ala Ala Ser Gln Val Ser Asp Gln Met Leu 465 470 475 480 Ala Asn Ala Lys His Ile Tyr Pro His Asn Thr Pro Gly Thr Lys Glu 485 490 495 Gly Tyr Tyr Tyr Arg Met Ile Phe Glu Arg Cys Phe Pro Gln Glu Ser 500 505 510 Ala Arg Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala 515 520 525 Ala Ala Ile Ala Trp Asp Lys Ala Trp Ala Asn Asn Leu Asp Pro Ser 530 535 540 Gly Arg Ala Ala Thr Gly Val His Asp Ser Ala Tyr Glu Gly Gly Glu 545 550 555 560 Val 
Glu Ser Ser Ala Val Ser His Lys Glu Gly Gly Glu Asp Gly Leu 565 570 575 Ala Asn Ser Lys Val Gly Asp Lys Val Gln Glu Ala Ile Ala Val Ala 580 585 590 79589PRTPopulus trichocarpa 79Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Asp Ser Gln Ala 1 5 10 15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu Tyr Gln Cys Gly Asp Phe Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly Asp Gln Pro Leu Phe 50 55 60 Asn Glu Asp Gln Ala Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Arg Lys Arg Leu Pro Asn His Lys Phe Arg Thr Gly 85 90 95 Ser Asp Cys Asp Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Asn 100 105 110 Phe Val Asp Met Leu Asp Gly Met Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Asn Ser Phe Ile Val Ala Arg Asp Ala Ile Gly Ile Thr Pro 130 135 140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Ile Ser Ser Glu 145 150 155 160 Leu Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Cys Phe Pro Pro 165 170 175 Gly His Leu Tyr Ser Ser Lys Ser Gly Gly Leu Arg Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Cys Glu Ala Ile Pro Ser Thr Pro Tyr Asp Pro Leu 195 200 205 Val Leu Arg Arg Ala Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ala Val Thr Ala Arg His Leu Ala Gly Thr Lys Ala Ala Arg 245 250 255 Gln Trp Gly Ala Gln Leu His Ser Phe Cys Val Gly Leu Glu Asn Ser 260 265 270 Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Asp Tyr Leu Gly Thr Val 275 280 285 His His Glu Phe Tyr Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Asp Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ala Arg Lys Ile Lys Ala Leu Gly Val 325 330 335 Lys Met Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Leu His Arg Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375 380 
Lys Ala Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Asp Phe Ile Asn Val Ala Met Ala Ile Asp Pro Glu Trp Lys Met 405 410 415 Ile Lys Pro Gly Gln Gly His Ile Glu Lys Trp Val Leu Arg Lys Ala 420 425 430 Phe Asp Asp Glu Glu His Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Ala His Ala Ala Gln His Val Thr Asp Lys Met Met Gln Asn 465 470 475 480 Ala Glu His Ile Phe Pro His Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala Arg 500 505 510 Leu Ser Val Pro Gly Gly Ala Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ala Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Leu Ser Asp Tyr Asp Gln Gln Ala Ala Leu 545 550 555 560 Ala Asn Ala Gly Val Val Pro Pro Lys Ile Ile Asp Thr Leu Pro Arg 565 570 575 Met Leu Glu Val Ser Ala Ser Gly Val Ala Ile His Ser 580 585 80587PRTPopulus trichocarpa 80Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Asp Ser Gln Ala 1 5 10 15 Lys Arg Phe Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu Phe Gln His Gly Asp Phe Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly Asp Gln Pro Leu Phe 50 55 60 Asn Glu Asp Gln Ala Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Arg Lys Arg Leu Pro Asn His Lys Phe Arg Thr Gly 85 90 95 Ser Asp Cys Asp Val Ile Ser His Leu Tyr Glu Glu Tyr Gly Glu Asn 100 105 110 Phe Val Asp Met Leu Asp Gly Met Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Asn Ser Phe Ile Val Ala Arg Asp Ala Ile Gly Ile Thr Ser 130 135 140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Ile Ser Ser Glu 145 150 155 160 Leu Lys Gly Leu Asn Asp Asp Cys Glu His Phe Lys Cys Phe Pro Pro 165 170 175 Gly His Ile Tyr Ser Ser Lys Ser Gly Gly Leu Arg Arg Trp Tyr Asn 180 185 190 Pro Leu Trp Phe Ser Glu Ala Ile Pro Ser Thr Pro Tyr Asp Pro Leu 195 200 205 Ala Leu 
Arg Arg Ala Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ala Val Thr Ala Arg His Leu Ala Gly Thr Gln Ala Ala Arg 245 250 255 Gln Trp Gly Ala His Leu His Ser Phe Cys Val Gly Leu Glu Asn Ser 260 265 270 Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Asp Tyr Leu Gly Thr Ile 275 280 285 His His Glu Phe His Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Asp Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Leu Ala Arg Lys Ile Lys Ala Leu Gly Val 325 330 335 Lys Met Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Leu His Gly Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ala Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Asp Phe Ile Asn Val Ala Met Ala Ile Asp Pro Glu Trp Lys Met 405 410 415 Ile Lys Pro Gly Arg Ile Glu Lys Trp Val Leu Arg Lys Ala Phe Asp 420 425 430 Asp Glu Glu His Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg Gln Lys 435 440 445 Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly Leu Lys 450 455 460 Ala His Ala Glu Leu His Val His Asp Lys Met Met Gln Asn Ala Glu 465 470 475 480 His Ile Phe Pro His Asn Thr Pro Thr Thr Lys Glu Ala Tyr Tyr Tyr 485 490 495 Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala Arg Leu Thr 500 505 510 Val Pro Gly Gly Ala Ser Val Ala Cys Ser Thr Ala Lys Ala Val Glu 515 520 525 Trp Asp Ala Ser Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg Ala Ala 530 535 540 Leu Gly Val His Leu Ser Ala Tyr Glu Gln Gln Ala Ala Leu Ala Ser 545 550 555 560 Ala Gly Val Val Pro Pro Glu Ile Ile Asp Asn Leu Pro Arg Met Met 565 570 575 Lys Val Gly Ala Pro Gly Val Ala Ile Gln Ser 580 585 81578PRTArabidopsis thaliana 81Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ile Asp Asn Ser Gln Ala 1 5 10 15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp 
Trp Ser Gly Leu His Cys Tyr Glu Asp Cys Tyr Leu Ala His 35 40 45 Glu Arg Leu Ala Ile Ile Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Thr Val Ala Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Lys Ile Leu Arg Glu Lys Leu Lys Ser His Gln Phe Arg Thr Gly 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Glu 100 105 110 Phe Ile Asp Met Leu Asp Gly Met Phe Ala Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Ile Thr Pro 130 135 140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ala Ser Glu 145 150 155 160 Met Lys Ala Leu Ser Asp Asp Cys Glu Gln Phe Met Ser Phe Pro Pro 165 170 175 Gly His Ile Tyr Ser Ser Lys Gln Gly Gly Leu Arg Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Tyr Asn Glu Gln Val Pro Ser Thr Pro Tyr Asp Pro Leu 195 200 205 Val Leu Arg Asn Ala Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ala Val Ala Leu Arg His Leu
Glu Lys Ser Glu Ala Ala Arg 245 250 255 Gln Trp Gly Ser Gln Leu His Thr Phe Cys Ile Gly Leu Gln Gly Ser 260 265 270 Pro Asp Leu Lys Ala Gly Arg Glu Val Ala Asp Tyr Leu Gly Thr Arg 275 280 285 His His Glu Phe Gln Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Glu Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Leu Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe His Glu Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Gln Phe Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ser Thr Ser Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Glu Phe Leu Asn Val Ala Met Ser Ile Asp Pro Glu Trp Lys Leu 405 410 415 Ile Lys Pro Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Asn Ala 420 425 430 Phe Asp Asp Glu Glu Arg Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Asp His Ala Asn Lys His Val Ser Asp Thr Met Leu Ser Asn 465 470 475 480 Ala Ser Phe Val Phe Pro Asp Asn Thr Pro Leu Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Thr Ile Phe Glu Lys Phe Phe Pro Lys Ser Ala Ala Arg 500 505 510 Ala Thr Val Pro Gly Gly Pro Ser Ile Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Thr Trp Ser Lys Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Val Ala Ala Tyr Glu Glu Asp Lys Ala Ala 545 550 555 560 Ala Ala Ala Lys Ala Gly Ser Asp Leu Val Asp Pro Leu Pro Lys Asn 565 570 575 Gly Thr 82584PRTArabidopsis thaliana 82Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Asp Ser Gln Ala 1 5 10 15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu Tyr Gln Asn Gly Asp Asn Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Val Ile Asp Pro Ala Ser Gly Asp Gln Pro Leu Phe 50 55 60 Asn Glu Asp Lys Thr Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu 
Arg Lys Arg Leu Lys Asn His Lys Phe Arg Thr Gly 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Val Asp 100 105 110 Phe Val Asp Met Leu Asp Gly Ile Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Asn Ser Phe Met Val Ala Arg Asp Ala Ile Gly Val Thr Ser 130 135 140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Ile Ser Ser Glu 145 150 155 160 Met Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Thr Phe Pro Pro 165 170 175 Gly His Phe Tyr Ser Ser Lys Leu Gly Gly Phe Lys Gln Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Asn Glu Ser Val Pro Ser Thr Pro Tyr Glu Pro Leu 195 200 205 Ala Ile Arg Arg Ala Phe Glu Asn Ala Val Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ser Ile Thr Ala Arg His Leu Ala Gly Thr Lys Ala Ala Lys 245 250 255 Gln Trp Gly Pro Gln Leu His Ser Phe Cys Val Gly Leu Glu Gly Ser 260 265 270 Pro Asp Leu Lys Ala Gly Lys Glu Val Ala Glu Tyr Leu Gly Thr Val 275 280 285 His His Glu Phe His Phe Ser Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Asp Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Met Val Leu Ser Gly Glu Gly Ala Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe His Gln Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Lys Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ser Thr Ser Ala Phe Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Asp Phe Ile Asn Thr Ala Met Ser Leu Asp Pro Glu Ser Lys Met 405 410 415 Ile Lys Pro Glu Glu Gly Arg Ile Glu Lys Trp Val Leu Arg Arg Ala 420 425 430 Phe Asp Asp Glu Glu Arg Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Asp His Ala Ala Gln Asn Val Asn Asp Lys Met Met Ser Asn 465 470 475 480 Ala Gly His Ile Phe Pro His Asn Thr Pro Asn Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Met Ile 
Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala Arg 500 505 510 Leu Thr Val Pro Gly Gly Ala Thr Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ser Trp Ser Asn Asn Met Asp Pro Ser Gly Arg 530 535 540 Ala Ala Ile Gly Val His Leu Ser Ala Tyr Asp Gly Lys Asn Val Ala 545 550 555 560 Leu Thr Ile Pro Pro Leu Lys Ala Ile Asp Asn Met Pro Met Met Met 565 570 575 Gly Gln Gly Val Val Ile Gln Ser 580 83578PRTArabidopsis thaliana 83Met Cys Gly Ile Leu Ala Val Leu Gly Cys Val Asp Asn Ser Gln Ala 1 5 10 15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu His Cys Tyr Glu Asp Cys Tyr Leu Ala His 35 40 45 Glu Arg Leu Ala Ile Val Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Thr Ile Ala Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Lys Ala Leu Arg Glu Asn Leu Lys Ser His Gln Phe Arg Thr Gly 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Glu 100 105 110 Phe Val Asp Met Leu Asp Gly Met Phe Ala Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Ile Thr Pro 130 135 140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ala Ser Glu 145 150 155 160 Met Lys Ala Leu Ser Asp Asp Cys Glu Gln Phe Met Cys Phe Pro Pro 165 170 175 Gly His Ile Tyr Ser Ser Lys Gln Gly Gly Leu Arg Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Ser Glu Val Val Pro Ser Thr Pro Tyr Asp Pro Leu 195 200 205 Val Val Arg Asn Thr Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ser Val Ala Leu Arg His Leu Glu Lys Ser Glu Ala Ala Cys 245 250 255 Gln Trp Gly Ser Lys Leu His Thr Phe Cys Ile Gly Leu Lys Gly Ser 260 265 270 Pro Asp Leu Lys Ala Gly Arg Glu Val Ala Asp Tyr Leu Gly Thr Arg 275 280 285 His His Glu Leu His Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Glu Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys 
Ser Leu Gly Val 325 330 335 Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe His Glu Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ser Thr Ser Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Glu Phe Ile Asn Val Ala Met Ser Ile Asp Pro Glu Trp Lys Met 405 410 415 Ile Arg Pro Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Asn Ala 420 425 430 Phe Asp Asp Glu Lys Asn Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Asp His Ala Asn Lys His Val Ser Glu Thr Met Leu Met Asn 465 470 475 480 Ala Ser Phe Val Phe Pro Asp Asn Thr Pro Leu Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Thr Ile Phe Glu Lys Phe Phe Pro Lys Ser Ala Ala Arg 500 505 510 Ala Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ala Trp Ser Gln Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Val Ser Ala Tyr Gly Glu Asp Lys Thr Glu 545 550 555 560 Asp Ser Arg Pro Glu Lys Leu Gln Lys Leu Ala Glu Lys Thr Pro Ala 565 570 575 Ile Val 84581PRTTriticum aestivum 84Met Cys Gly Ile Leu Ala Val Leu Gly Cys Gly Asp Glu Ser Gln Gly 1 5 10 15 Lys Arg Val His Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu His Gln Val Ala Asp Asn Tyr Leu Cys His 35 40 45 Gln Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Ser Ile Ala Val Ala Val Asn Gly Glu Val Tyr Asn 65 70 75 80 His Glu Glu Leu Arg Ala Arg Leu Ser Gly His Arg Phe Arg Thr Gly 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Ser 100 105 110 Phe Ile Asp Met Leu Asp Gly Val Phe Ser Phe Val Leu Leu Asp Ala 115 120 125 Arg Asp Asn Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Val Thr Pro 130 135 140 Leu Tyr Ile Gly Trp Gly Ile Asp Gly Ser Val Trp Ile Ser Ser Glu 145 150 155 160 Met Lys Gly Leu Asn Asp Asp Cys 
Glu His Phe Glu Ile Phe Pro Pro 165 170 175 Gly Asn Leu Tyr Ser Ser Lys Glu Lys Ser Phe Lys Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Ser Glu Val Ile Pro Ser Val Pro Tyr Asp Pro Leu 195 200 205 Arg Leu Arg Ser Ala Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ala Val Ala Ala Arg His Phe Ala Gly Thr Lys Ala Ala Lys 245 250 255 Arg Trp Gly Thr Arg Leu His Ser Phe Cys Val Gly Leu Glu Gly Ser 260 265 270 Pro Asp Leu Lys Ala Ala Lys Glu Val Ala Asp His Leu Gly Thr Val 275 280 285 His His Glu Phe Asn Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Asp Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Leu Met Phe Gln Met Ser Arg Lys Ile Lys Ala Leu Gly Val 325 330 335 Lys Met Val Ile Ser Gly Glu Gly Ala Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Phe His Gln Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ala Thr Ser Ala Trp Gly Leu Glu Val Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Glu Phe Ile Asn Glu Ala Met Ser Ile Asp Pro Glu Trp Lys Met 405 410 415 Ile Arg Pro Asp Leu Gly Arg Ile Glu Lys Trp Ile Leu Arg Lys Ala 420 425 430 Phe Asp Asp Glu Glu Arg Pro Phe Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Asp His Ala Ala Ser Asn Val Ser Asp Lys Met Met Ser Asn 465 470 475 480 Ala Lys Phe Ile Tyr Pro His Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Met Ile Phe Glu Arg Tyr Phe Pro Gln Ser Ser Ala Ile 500 505 510 Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Ile Glu Trp Asp Ala Gln Trp Ser Gly Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Leu Ser Ala Tyr Glu Gln Asp Thr Val Ala 545 550 555 560 Val Gly Gly Ser Asn Lys Pro Gly Val Met Asn Thr Val Val Pro Gly 565 570 575 Val Ala Ile Glu Thr 580 
85585PRTTriticum aestivum 85Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ala Asp Asp Thr Gln Gly 1 5 10 15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Met His Gln Val Gly Asp Cys Tyr Leu Ser His 35 40 45 Gln Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Ser Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Gln Leu Arg Ala Gln Leu Ser Ser His Thr Phe Arg Thr Gly 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Asn 100 105 110 Phe Ile Asp Met Leu Asp Gly Val Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Asn Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Val Thr Pro 130 135 140 Leu Tyr Ile Gly Trp Gly Ile Asp Gly Ser Val Trp Ile Ser Ser Glu 145 150 155 160 Met Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Ile Phe Pro Pro 165 170 175 Gly His Leu Tyr Ser Ser Lys Gln Gly Gly Phe Lys Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Ser Glu Val Ile Pro Ser Val Pro Tyr Asp Pro Leu 195 200 205 Ala Leu Arg Lys Ala Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210
215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ala Val Thr Val Arg His Leu Ala Gly Thr Lys Ala Ala Lys 245 250 255 Arg Trp Gly Thr Lys Leu His Ser Phe Cys Val Gly Leu Glu Gly Ser 260 265 270 Pro Asp Leu Lys Ala Ala Lys Glu Val Ala Asn Tyr Leu Gly Thr Met 275 280 285 His His Glu Phe Thr Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295 300 Asp Val Ile Tyr His Thr Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Met Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Leu His Arg Glu Thr 355 360 365 Cys Gln Lys Ile Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ala Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Glu Phe Ile Asn Glu Ala Met Ser Ile Asp Pro Glu Trp Lys Met 405 410 415 Ile Arg Pro Asp Leu Gly Arg Ile Glu Lys Trp Met Leu Arg Lys Ala 420 425 430 Phe Asp Asp Glu Glu Gln Pro Phe Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Ala His Ala Glu Ser Asn Val Thr Asp Lys Met Met Ser Asn 465 470 475 480 Ala Lys Phe Ile Tyr Pro His Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485 490 495 Cys Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala Ile 500 505 510 Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Gln Trp Ser Gly Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Leu Ser Ala Tyr Glu Gln Glu His Leu Pro 545 550 555 560 Ala Thr Ile Met Ala Gly Thr Ser Lys Lys Pro Arg Met Ile Glu Val 565 570 575 Ala Ala Pro Gly Val Ala Ile Glu Ser 580 585 86589PRTVitis vinifera 86Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Asp Ser Gln Ala 1 5 10 15 Lys Arg Val Arg Leu Phe Tyr His Cys Tyr Leu Cys Phe Cys Asp Arg 20 25 30 Leu Lys His Arg Gly Pro Asp Trp Ser Gly Leu Tyr Gln His Gly Asp 35 40 45 Cys Tyr 
Leu Ala His Gln Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly 50 55 60 Asp Gln Pro Leu Tyr Asn Glu Asn Gln Ala Ile Val Val Thr Val Asn 65 70 75 80 Gly Glu Ile Tyr Asn His Glu Glu Leu Arg Lys Ser Met Pro Asn His 85 90 95 Lys Phe Arg Thr Gly Ser Asp Cys Asp Val Ile Ala His Leu Tyr Glu 100 105 110 Glu His Gly Glu Asn Phe Val Asp Met Leu Asp Gly Met Phe Ser Phe 115 120 125 Val Leu Leu Asp Thr Arg Asp Asp Ser Phe Ile Val Ala Arg Asp Ala 130 135 140 Ile Gly Ile Thr Ser Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Ser 145 150 155 160 Val Trp Ile Ser Ser Glu Leu Lys Gly Leu Asn Asp Asp Cys Glu His 165 170 175 Phe Glu Ser Phe Pro Pro Gly His Met Tyr Ser Ser Lys Glu Gly Gly 180 185 190 Phe Lys Arg Trp Tyr Asn Pro Pro Trp Phe Ser Glu Ala Ile Pro Ser 195 200 205 Ala Pro Tyr Asp Pro Leu Val Leu Arg Arg Ala Phe Glu Asn Ala Val 210 215 220 Ile Lys Arg Leu Met Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly 225 230 235 240 Gly Leu Asp Ser Ser Leu Val Ala Ser Ile Thr Ala Arg His Leu Ala 245 250 255 Gly Thr Lys Ala Ala Lys Gln Trp Gly Ala Gln Leu His Ser Phe Cys 260 265 270 Val Gly Leu Glu Gly Ser Pro Asp Leu Lys Ala Ala Lys Glu Val Ala 275 280 285 Asp Tyr Leu Gly Thr Val His His Glu Phe His Phe Thr Val Gln Asp 290 295 300 Gly Ile Asp Ala Ile Glu Asp Val Ile Tyr His Ile Glu Thr Tyr Asp 305 310 315 320 Val Thr Thr Ile Arg Ala Ser Thr Pro Met Phe Leu Met Ser Arg Lys 325 330 335 Ile Lys Ser Leu Gly Val Lys Met Val Ile Ser Gly Glu Gly Ser Asp 340 345 350 Glu Ile Phe Gly Gly Tyr Leu Tyr Phe His Lys Ala Pro Asn Lys Glu 355 360 365 Glu Phe His Arg Glu Thr Cys Arg Lys Ile Lys Ala Leu Tyr Gln Tyr 370 375 380 Asp Cys Leu Arg Ala Asn Lys Ser Thr Ser Ala Trp Gly Leu Glu Ala 385 390 395 400 Arg Val Pro Phe Leu Asp Lys Glu Phe Ile Lys Val Ala Met Asp Ile 405 410 415 Asp Pro Glu Trp Lys Met Ile Lys Pro Glu Gln Gly Arg Ile Glu Lys 420 425 430 Trp Val Leu Arg Arg Ala Phe Asp Asp Glu Glu Gln Pro Tyr Leu Pro 435 440 445 Lys His Ile Leu Tyr Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly 450 455 460 Tyr Ser Trp Ile Asp 
Gly Leu Lys Ala His Ala Ser Gln His Val Thr 465 470 475 480 Asp Lys Met Met Leu Asn Ala Ser His Ile Phe Pro His Asn Thr Pro 485 490 495 Thr Thr Lys Glu Ala Tyr Tyr Tyr Arg Met Ile Phe Glu Arg Phe Phe 500 505 510 Pro Gln Asn Ser Ala Arg Leu Thr Val Pro Gly Gly Ala Ser Val Ala 515 520 525 Cys Ser Thr Ala Lys Ala Val Glu Trp Asp Ser Ala Trp Ser Asn Asn 530 535 540 Leu Asp Pro Ser Gly Arg Ala Ala Leu Gly Val His Leu Ser Ala Tyr 545 550 555 560 Asp Gln Lys Leu Thr Thr Val Ser Ala Ala Asn Val Pro Thr Lys Ile 565 570 575 Ile Asp Asn Met Pro Arg Ile Met Glu Val Thr Ala Pro 580 585 87578PRTVolvox carteri 87Met Cys Gly Ile Leu Ala Val Leu Asn Ser Thr Asp Asp Ser Pro Ala 1 5 10 15 Met Arg Ala Lys Val Leu Ala Leu Ser Arg Arg Gln Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Met His Gln Phe Gly Asn Asn Phe Leu Ala His 35 40 45 Glu Arg Leu Ala Ile Met Asp Pro Ser Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Ser Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 Tyr Lys Glu Leu Arg Lys Glu Ile Ser Asp Lys Cys Pro Gly Lys Lys 85 90 95 Phe Arg Thr Asn Ser Asp Cys Glu Val Ile Ser His Leu Tyr Glu Leu 100 105 110 Tyr Gly Glu Ala Val Ala Asn Lys Leu Asp Gly Phe Phe Ala Phe Val 115 120 125 Leu Leu Asp Thr Arg Asn Asn Thr Phe Phe Ala Ala Arg Asp Pro Leu 130 135 140 Gly Val Thr Cys Met Tyr Ile Gly Trp Gly Arg Asp Gly Ser Val Trp 145 150 155 160 Leu Ser Ser Glu Met Lys Cys Leu Lys Asp Asp Cys Ala Arg Phe Gln 165 170 175 Gln Phe Pro Pro Gly His Tyr Tyr Ser Ser Lys Thr Gly Glu Phe Val 180 185 190 Arg Tyr Phe Asn Pro Gln Phe Tyr Leu Asp Phe Glu Ala Glu Pro Gln 195 200 205 Val Phe Pro Ser Val Pro Tyr Asp Pro Val Thr Leu Arg Thr Ala Phe 210 215 220 Glu Ala Ala Val Glu Lys Arg Met Met Ser Asp Val Pro Phe Gly Val 225 230 235 240 Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu Val Ala Ser Ile Ala Ala 245 250 255 Arg Lys Ile Lys Arg Glu Gly Ser Val Trp Gly Lys Leu His Ser Phe 260 265 270 Cys Val Gly Leu Glu Gly Ser Pro Asp Leu Lys Ala Gly Ala Ala Val 275 280 285 Ala Glu Phe Leu Gly Thr Asp His 
His Glu Phe His Phe Thr Val Gln 290 295 300 Glu Gly Ile Asp Ala Ile Ser Glu Val Ile Tyr His Ile Glu Thr Phe 305 310 315 320 Asp Val Thr Thr Ile Arg Ala Ser Thr Pro Met Phe Leu Met Ser Arg 325 330 335 Lys Ile Lys Ala Leu Gly Val Lys Met Val Leu Ser Gly Glu Gly Ser 340 345 350 Asp Glu Val Phe Gly Gly Tyr Leu Tyr Phe His Lys Ala Pro Ser Lys 355 360 365 Asp Glu Phe His Ser Glu Thr Val Arg Lys Leu Lys Asp Leu Phe Lys 370 375 380 Tyr Asp Cys Leu Arg Ala Asn Lys Ala Thr Met Ala Trp Gly Val Glu 385 390 395 400 Ala Arg Val Pro Phe Leu Asp Arg Ala Phe Leu Asp Val Ala Met Ser 405 410 415 Ile Asp Pro Ala Glu Lys Met Ile Asp Lys Ser Lys Gly Arg Ile Glu 420 425 430 Lys Tyr Ile Leu Arg Lys Ala Phe Asp Thr Pro Glu Asp Pro Tyr Leu 435 440 445 Pro Lys Glu Val Leu Trp Arg Gln Lys Glu Gln Phe Ser Asp Gly Val 450 455 460 Gly Tyr Asn Trp Ile Asp Gly Leu Lys Ala His Ala Glu Ser Gln Val 465 470 475 480 Ser Asp Glu Met Leu Lys Asn Ala Val His Arg Phe Pro Asp Asn Thr 485 490 495 Pro Arg Thr Lys Glu Ala Tyr Trp Tyr Arg Ser Ile Phe Glu Ser His 500 505 510 Phe Pro Gln Arg Ala Ala Met Glu Thr Val Pro Gly Gly Pro Ser Val 515 520 525 Ala Cys Ser Thr Ala Thr Ala Ala Leu Trp Asp Ala Ala Trp Ala Gly 530 535 540 Lys Glu Asp Pro Ser Gly Arg Ala Val Ala Gly Val His Asp Ala Ala 545 550 555 560 Tyr Glu Glu Gly Ala Glu Ala Asn Gly Glu Pro Ala Ser Lys Lys Gln 565 570 575 Lys Val 88588PRTZea mays 88Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Trp Ser Gln Ala 1 5 10 15 Lys Arg Ala Arg Ile Leu Ala Cys Ser Arg Arg Leu Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu Tyr Gln His Glu Gly Asn Phe Leu Ala Gln 35 40 45 Gln Arg Leu Ala Val Val Ser Pro Leu Ser Gly Asp Gln Pro Leu Phe 50 55 60 Asn Glu Asp Arg Thr Val Val Val Val Ala Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Lys Asn Val Arg Lys Gln Phe Thr Gly Thr His Asn Phe Ser Thr 85 90 95 Gly Ser Asp Cys Glu Val Ile Ile Pro Leu Tyr Glu Lys Tyr Gly Glu 100 105 110 Asn Phe Val Asp Met Leu Asp Gly Val Phe Ala Phe Val Leu Tyr Asp 115 120 125 Thr Arg Asp Arg Thr Tyr 
Val Ala Ala Arg Asp Ala Ile Gly Val Asn 130 135 140 Pro Leu Tyr Ile Gly Trp Gly Ser Asp Gly Ser Val Trp Ile Ala Ser 145 150 155 160 Glu Met Lys Ala Leu Asn Glu Asp Cys Val Arg Phe Glu Ile Phe Pro 165 170 175 Pro Gly His Leu Tyr Ser Ser Ala Gly Gly Gly Phe Arg Arg Trp Tyr 180 185 190 Thr Pro His Trp Phe Gln Glu Gln Val Pro Arg Met Pro Tyr Gln Pro 195 200 205 Leu Val Leu Arg Glu Ala Phe Glu Lys Ala Val Ile Lys Arg Leu Met 210 215 220 Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser 225 230 235 240 Leu Val Ala Ser Val Thr Lys Arg His Leu Val Glu Thr Glu Ala Ala 245 250 255 Glu Lys Phe Gly Thr Glu Leu His Ser Phe Val Val Gly Leu Glu Gly 260 265 270 Ser Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Asp Tyr Leu Gly Thr 275 280 285 Ile His His Glu Phe His Phe Thr Val Gln Asp Gly Ile Asp Ala Ile 290 295 300 Glu Glu Val Ile Tyr His Asp Glu Thr Tyr Asp Val Thr Thr Ile Arg 305 310 315 320 Ala Ser Thr Pro Met Phe Leu Met Ala Arg Lys Ile Lys Ser Leu Gly 325 330 335 Val Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Leu Leu Gly Gly 340 345 350 Tyr Leu Tyr Phe His Phe Ala Pro Asn Lys Glu Glu Phe His Arg Glu 355 360 365 Thr Cys Arg Lys Val Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala 370 375 380 Asn Lys Ala Thr Ser Ala Trp Gly Leu Glu Val Arg Val Pro Phe Leu 385 390 395 400 Asp Lys Glu Phe Ile Asn Val Ala Met Gly Met Asp Pro Glu Trp Lys 405 410 415 Met Tyr Asp Lys Asn Leu Gly Arg Ile Glu Lys Trp Val Met Arg Lys 420 425 430 Ala Phe Asp Asp Asp Glu His Pro Tyr Leu Pro Lys His Ile Leu Tyr 435 440 445 Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Asn Trp Ile Asp 450 455 460 Gly Leu Lys Ser Phe Thr Glu Gln Gln Val Thr Asp Glu Met Met Asn 465 470 475 480 Asn Ala Ala Gln Met Phe Pro Tyr Asn Thr Pro Val Asn Lys Glu Ala 485 490 495 Tyr Tyr Tyr Arg Met Ile Phe Glu Arg Leu Phe Pro Gln Asp Ser Ala 500 505 510 Arg Glu Thr Val Pro Trp Gly Pro Ser Ile Ala Cys Ser Thr Pro Ala 515 520 525 Ala Ile Glu Trp Val Glu Gln Trp Lys Ala Ser Asn Asp Pro Ser Gly 530 535 540 Arg Phe Ile Ser Ser His Asp 
Ser Ala Ala Thr Asp His Thr Gly Gly 545 550 555 560 Lys Pro Ala Val Ala Asn Gly Gly Gly His Gly Ala Ala Asn Gly Thr 565 570 575 Val Asn Gly Lys Asp Val Ala Val Ala Ile Ala Val 580 585 89586PRTZea mays 89Met Cys Gly Ile Leu Ala Val Leu Gly Val Val Glu Val Ser Leu Ala 1 5 10 15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu His Cys His Glu Asp Cys Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Ile Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Thr Val Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Lys Ala Lys Leu Lys Thr His Glu Phe Gln Thr Gly 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Glu 100 105 110 Phe Val Asp Met Leu Asp Gly Met Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Ile Cys Pro 130 135 140 Leu Tyr Met Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ser Ser Glu 145 150 155 160 Met Lys Ala Leu Ser Asp Asp Cys Glu Arg Phe Ile Thr Phe Pro Pro 165
170 175 Gly His Leu Tyr Ser Ser Lys Thr Gly Gly Leu Arg Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Ser Glu Thr Val Pro Ser Thr Pro Tyr Asn Ala Leu 195 200 205 Phe Leu Arg Glu Met Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ser Val Ala Ser Arg His Leu Asn Glu Thr Lys Val Asp Arg 245 250 255 Gln Trp Gly Asn Lys Leu His Thr Phe Cys Ile Gly Leu Lys Gly Ser 260 265 270 Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Asp Tyr Leu Ser Thr Val 275 280 285 His His Glu Phe His Phe Thr Val Gln Glu Gly Ile Asp Ala Leu Glu 290 295 300 Glu Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Met Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe Leu Glu Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Leu Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ala Thr Ser Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Ser Phe Ile Ser Val Ala Met Asp Ile Asp Pro Glu Trp Asn Met 405 410 415 Ile Lys Arg Asp Leu Gly Arg Ile Glu Lys Trp Val Met Arg Lys Ala 420 425 430 Phe Asp Asp Asp Glu His Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Asn Trp Ile Asp Gly 450 455 460 Leu Lys Ser Phe Thr Glu Gln Gln Val Thr Asp Glu Met Met Asn Asn 465 470 475 480 Ala Ala Gln Met Phe Pro Tyr Asn Thr Pro Val Asn Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Met Ile Phe Glu Arg Leu Phe Pro Gln Asp Ser Ala Arg 500 505 510 Glu Thr Val Pro Trp Gly Pro Ser Ile Ala Cys Ser Thr Pro Ala Ala 515 520 525 Ile Glu Trp Val Glu Gln Trp Lys Ala Ser Asn Asp Pro Ser Gly Arg 530 535 540 Phe Ile Ser Ser His Asp Ser Ala Ala Thr Asp His Thr Ala Val Ser 545 550 555 560 Arg Arg Trp Pro Thr Ala Ala Ala Arg Pro Ala Asn Gly Thr Val Asn 565 570 575 Gly Lys Asp Val Pro Val Pro Ile Ala Val 580 585 90606PRTZea mays 90Met 
Cys Gly Ile Leu Ala Val Leu Gly Cys Ala Asp Glu Ala Lys Gly 1 5 10 15 Ser Ser Lys Arg Ser Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His 20 25 30 Arg Gly Pro Asp Trp Ser Gly Leu Arg Gln Val Gly Asp Cys Tyr Leu 35 40 45 Ser His Gln Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly Asp Gln Pro 50 55 60 Leu Tyr Asn Glu Asp Gln Ser Val Val Val Ala Val Asn Gly Glu Ile 65 70 75 80 Tyr Asn His Leu Asp Leu Arg Ser Arg Leu Ala Gly Ala Gly His Ser 85 90 95 Phe Arg Thr Gly Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu 100 105 110 His Gly Glu Glu Phe Val Asp Met Leu Asp Gly Val Phe Ser Phe Val 115 120 125 Leu Leu Asp Thr Arg His Gly Asp Arg Ala Gly Ser Ser Phe Phe Met 130 135 140 Ala Ala Arg Asp Ala Ile Gly Val Thr Pro Leu Tyr Ile Gly Trp Gly 145 150 155 160 Val Asp Gly Ser Val Trp Ile Ser Ser Glu Met Lys Ala Leu His Asp 165 170 175 Glu Cys Glu His Phe Glu Ile Phe Pro Pro Gly His Leu Tyr Ser Ser 180 185 190 Asn Thr Gly Gly Phe Ser Arg Trp Tyr Asn Pro Pro Trp Tyr Asp Asp 195 200 205 Asp Asp Asp Glu Glu Ala Val Val Thr Pro Ser Val Pro Tyr Asp Pro 210 215 220 Leu Ala Leu Arg Lys Ala Phe Glu Lys Ala Val Val Lys Arg Leu Met 225 230 235 240 Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser 245 250 255 Leu Val Ala Thr Val Ala Val Arg His Leu Ala Arg Thr Glu Ala Ala 260 265 270 Arg Arg Trp Gly Thr Lys Leu His Ser Phe Cys Val Gly Leu Glu Gly 275 280 285 Ser Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Glu Tyr Leu Gly Thr 290 295 300 Leu His His Glu Phe His Phe Thr Val Gln Asp Gly Ile Asp Ala Ile 305 310 315 320 Glu Asp Val Ile Tyr His Thr Glu Thr Tyr Asp Val Thr Thr Ile Arg 325 330 335 Ala Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly 340 345 350 Val Lys Met Val Ile Ser Gly Glu Gly Ser Asp Glu Leu Phe Gly Gly 355 360 365 Tyr Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Leu His Arg Glu 370 375 380 Thr Cys Arg Lys Val Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala 385 390 395 400 Asn Lys Ala Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu 405 410 415 Asp Lys Glu Phe Ile Asn 
Ala Ala Met Ser Ile Asp Pro Glu Trp Lys 420 425 430 Met Val Gln Pro Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Lys 435 440 445 Ala Phe Asp Asp Glu Glu Gln Pro Phe Leu Pro Lys His Ile Leu Tyr 450 455 460 Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp 465 470 475 480 Gly Leu Lys Ala His Ala Thr Ser Asn Val Thr Asp Lys Met Leu Ser 485 490 495 Asn Ala Lys Phe Ile Phe Pro His Asn Thr Pro Thr Thr Lys Glu Ala 500 505 510 Tyr Tyr Tyr Arg Met Val Phe Glu Arg Phe Phe Pro Gln Lys Ser Ala 515 520 525 Ile Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys 530 535 540 Ala Ile Glu Trp Asp Ala Gln Trp Ser Gly Asn Leu Asp Pro Ser Gly 545 550 555 560 Arg Ala Ala Leu Gly Val His Leu Ala Ala Tyr Glu His Gln His Asp 565 570 575 Pro Glu His Val Pro Ala Ala Ile Ala Ala Gly Ser Gly Lys Lys Pro 580 585 590 Arg Thr Ile Arg Val Ala Pro Pro Gly Val Ala Ile Glu Gly 595 600 605 91606PRTZea mays 91Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Cys Ser Gln Ala 1 5 10 15 Arg Arg Ala Arg Ile Leu Ala Cys Ser Arg Arg Leu Lys His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu Tyr Gln His Glu Gly Asn Phe Leu Ala Gln 35 40 45 Gln Arg Leu Ala Ile Val Ser Pro Leu Ser Gly Asp Gln Pro Leu Phe 50 55 60 Asn Glu Asp Arg Thr Val Val Val Val Ala Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Lys Asn Val Arg Lys Gln Phe Thr Gly Ala His Ser Phe Ser Thr 85 90 95 Gly Ser Asp Cys Glu Val Ile Ile Pro Leu Tyr Glu Lys Tyr Gly Glu 100 105 110 Asn Phe Val Asp Met Leu Asp Gly Val Phe Ala Phe Val Leu Tyr Asp 115 120 125 Thr Arg Asp Arg Thr Tyr Val Ala Ala Arg Asp Ala Ile Gly Val Asn 130 135 140 Pro Leu Tyr Ile Gly Trp Gly Ser Asp Gly Ser Val Trp Met Ser Ser 145 150 155 160 Glu Met Lys Ala Leu Asn Glu Asp Cys Val Arg Phe Glu Ile Phe Pro 165 170 175 Pro Gly His Leu Tyr Ser Ser Ala Ala Gly Gly Phe Arg Arg Trp Tyr 180 185 190 Thr Pro His Trp Phe Gln Glu Gln Val Pro Arg Thr Pro Tyr Gln Pro 195 200 205 Leu Val Leu Arg Glu Ala Phe Glu Lys Ala Val Ile Lys Arg Leu Met 210 215 220 Thr Asp Val Pro Phe Gly Val Leu Leu 
Ser Gly Gly Leu Asp Ser Ser 225 230 235 240 Leu Val Ala Ser Val Thr Lys Arg His Leu Val Lys Thr Asp Ala Ala 245 250 255 Gly Lys Phe Gly Thr Glu Leu His Ser Phe Val Val Gly Leu Glu Gly 260 265 270 Ser Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Asp Tyr Leu Gly Thr 275 280 285 Thr His His Glu Phe His Phe Thr Val Gln Asp Gly Ile Asp Ala Ile 290 295 300 Glu Glu Val Ile Tyr His Asp Glu Thr Tyr Asp Val Thr Thr Ile Arg 305 310 315 320 Ala Ser Thr Pro Met Phe Leu Met Ala Arg Lys Ile Lys Ser Leu Gly 325 330 335 Val Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Leu Leu Gly Gly 340 345 350 Tyr Leu Tyr Phe His Phe Ala Pro Asn Arg Glu Glu Leu His Arg Glu 355 360 365 Thr Cys Arg Lys Val Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala 370 375 380 Asn Lys Ala Thr Ser Ala Trp Gly Leu Glu Val Arg Val Pro Phe Leu 385 390 395 400 Asp Lys Glu Phe Val Asp Val Ala Met Gly Met Asp Pro Glu Trp Lys 405 410 415 Met Tyr Asp Lys Asn Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Lys 420 425 430 Ala Phe Asp Asp Glu Glu His Pro Tyr Leu Pro Glu His Ile Leu Tyr 435 440 445 Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Asn Trp Ile Asp 450 455 460 Gly Leu Lys Ala Phe Thr Glu Gln Gln Val Asp Gly Arg Arg Arg Ser 465 470 475 480 Leu Thr Ser Ala Asp Val Pro Pro His Val Gln Val Thr Asp Glu Met 485 490 495 Met Asn Ser Ala Ala Gln Met Phe Pro Tyr Asn Thr Pro Val Asn Lys 500 505 510 Glu Ala Tyr Tyr Tyr Arg Met Ile Phe Glu Arg Leu Phe Pro Gln Asp 515 520 525 Ser Ala Arg Glu Thr Val Pro Trp Gly Pro Ser Ile Ala Cys Ser Thr 530 535 540 Pro Ala Ala Ile Glu Trp Val Glu Gln Trp Lys Ala Ser Asn Asp Pro 545 550 555 560 Ser Gly Arg Phe Ile Ser Ser His Asp Ser Ala Ala Thr Asp Arg Thr 565 570 575 Gly Asp Lys Leu Ala Val Val Asn Gly Asp Gly His Gly Ala Ala Asn 580 585 590 Gly Thr Val Asn Gly Asn Asp Val Ala Val Ala Ile Ala Val 595 600 605 92591PRTZea mays 92Met Cys Gly Ile Leu Ala Val Leu Gly Val Ala Glu Val Ser Leu Ala 1 5 10 15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu His Cys 
His Glu Asp Cys Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Ile Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Thr Val Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Lys Ala Lys Leu Lys Thr His Glu Phe Gln Thr Gly 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Glu 100 105 110 Phe Val Asp Met Leu Asp Gly Met Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Ile Cys Pro 130 135 140 Leu Tyr Met Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ser Ser Glu 145 150 155 160 Met Lys Ala Leu Ser Asp Asp Cys Glu Arg Phe Ile Thr Phe Pro Pro 165 170 175 Gly His Leu Tyr Ser Ser Lys Thr Gly Gly Leu Arg Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Ser Glu Thr Val Pro Ser Thr Pro Tyr Asn Ala Leu 195 200 205 Phe Leu Arg Glu Met Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ser Val Ala Ser Arg His Phe Asn Glu Thr Lys Gly Asp Arg 245 250 255 Gln Trp Gly Asn Lys Leu His Thr Phe Cys Ile Gly Leu Lys Gly Ser 260 265 270 Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Asp Tyr Leu Ser Thr Val 275 280 285 His His Glu Phe His Phe Thr Val Gln Glu Gly Ile Asp Ala Leu Glu 290 295 300 Glu Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325 330 335 Lys Met Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe His Glu Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Leu Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ala Thr Ser Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Ser Phe Ile Ser Val Ala Met Asp Ile Asp Pro Asp Trp Lys Met 405 410 415 Ile Lys Arg Asp Leu Gly Arg Ile Glu Lys Trp Val Ile Arg Asn Ala 420 425 430 Phe Asp Asp Asp Glu Arg Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser 
Trp Ile Asp Gly 450 455 460 Leu Lys Asp His Ala Ser Gln His Val Ser Asp Ser Met Met Met Asn 465 470 475 480 Ala Gly Phe Val Tyr Pro Glu Asn Thr Pro Thr Thr Lys Glu Gly Tyr 485 490 495 Tyr Tyr Arg Met Ile Phe Glu Lys Phe Phe Pro Lys Pro Ala Ala Arg 500 505 510 Ser Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ser Trp Ser Lys Asn Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Asp Ala Ala Tyr Glu Asp Thr Ala Gly Lys 545 550 555 560 Thr Pro Ala Ser Ala Asp Pro Val Ser Asp Lys Gly Leu Arg Pro Ala 565 570 575 Ile Gly Glu Ser Leu Gly Thr Pro Val Ala Ser Ala Thr Ala Val 580 585 590 93580PRTBrassica napus 93Met Cys Gly Ile Leu Ala Val Leu Gly Cys Val Asp Asn Ser Gln Ala 1 5 10 15 Thr Arg Ser Arg Ile Ile Lys Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Leu His Cys Tyr Glu Asp Cys Tyr Leu Ala His 35 40 45 Glu Arg Leu Ala Ile Ile Asp Pro Ile Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Ser Glu Asp Lys Thr Val Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75
80 His Lys Ala Leu Arg Glu Ser Glu Ser Leu Lys Ser His Lys Tyr His 85 90 95 Thr Gly Ser Asp Cys Glu Val Leu Ala His Leu Tyr Glu Glu His Gly 100 105 110 Glu Glu Phe Ile Asn Met Leu Asp Gly Met Phe Ala Phe Val Leu Leu 115 120 125 Asp Thr Lys Asp Lys Ser Tyr Ile Ala Val Arg Asp Ala Ile Gly Val 130 135 140 Ile Pro Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ala 145 150 155 160 Ser Glu Met Lys Ala Leu Ser Asp Asp Cys Glu Gln Phe Met Ala Phe 165 170 175 Pro Pro Gly His Ile Tyr Ser Ser Lys Gln Gly Gly Leu Arg Arg Trp 180 185 190 Tyr Asn Pro Pro Trp Phe Ser Glu Leu Val Pro Ser Thr Pro Tyr Asp 195 200 205 Pro Leu Val Leu Arg Asp Thr Phe Glu Lys Ala Val Ile Lys Arg Leu 210 215 220 Met Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser 225 230 235 240 Ser Leu Val Ala Ser Val Ala Ile Arg His Leu Glu Lys Ser Asp Ala 245 250 255 Arg Gln Trp Gly Ser Lys Leu His Thr Phe Cys Ile Gly Leu Lys Gly 260 265 270 Ser Pro Asp Leu Lys Ala Gly Lys Glu Val Ala Asp Tyr Leu Gly Thr 275 280 285 Arg His His Glu Leu His Phe Thr Val Gln Glu Gly Ile Asp Ala Ile 290 295 300 Glu Glu Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr Ile Arg 305 310 315 320 Ala Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly 325 330 335 Val Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly 340 345 350 Tyr Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Leu His Glu Glu 355 360 365 Thr Cys Arg Lys Ile Lys Ala Leu Tyr Gln Tyr Asp Cys Leu Arg Ala 370 375 380 Asn Lys Ser Thr Ser Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu 385 390 395 400 Asp Lys Ala Phe Leu Asp Val Ala Met Gly Ile Asp Pro Glu Trp Lys 405 410 415 Met Ile Arg Pro Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Asn 420 425 430 Ala Phe Asp Asp Glu Lys Asn Pro Tyr Leu Pro Lys His Ile Leu Tyr 435 440 445 Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp 450 455 460 Gly Leu Lys Asp His Ala Asn Lys His Val Ser Asp Ala Met Leu Thr 465 470 475 480 Asn Ala Asn Phe Val Phe Pro Glu Asn Thr Pro Leu Thr Lys Glu Ala 485 490 495 Tyr 
Tyr Tyr Arg Ala Ile Phe Glu Lys Phe Phe Pro Lys Ser Ala Ala 500 505 510 Arg Ala Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys 515 520 525 Ala Val Glu Trp Asp Ala Ala Trp Lys Gly Asn Leu Asp Pro Ser Gly 530 535 540 Arg Ala Ala Leu Gly Val His Val Ala Ala Tyr Glu Gly Asp Lys Ala 545 550 555 560 Glu Asp Pro Arg Pro Glu Lys Val Gln Lys Leu Ala Glu Lys Thr Ala 565 570 575 Glu Ala Ile Val 580 94589PRTTriticum aestivum 94Met Cys Gly Ile Leu Ala Val Leu Gly Val Gly Asp Val Ser Leu Ala 1 5 10 15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20 25 30 Pro Asp Trp Ser Gly Ile His Ser Phe Glu Asp Cys Tyr Leu Ala His 35 40 45 Gln Arg Leu Ala Ile Val Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50 55 60 Asn Glu Asp Lys Thr Val Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70 75 80 His Glu Glu Leu Lys Ala Lys Leu Lys Ser His Gln Phe Gln Thr Gly 85 90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Glu 100 105 110 Phe Val Asp Met Leu Asp Gly Met Phe Ser Phe Val Leu Leu Asp Thr 115 120 125 Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Ile Cys Pro 130 135 140 Leu Tyr Met Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ser Ser Glu 145 150 155 160 Met Lys Ala Leu Ser Asp Asp Cys Glu Arg Phe Ile Ser Phe Pro Pro 165 170 175 Gly His Leu Tyr Ser Ser Lys Thr Gly Gly Leu Arg Arg Trp Tyr Asn 180 185 190 Pro Pro Trp Phe Ser Glu Ser Ile Pro Ser Ala Pro Tyr Asp Pro Leu 195 200 205 Leu Ile Arg Glu Ser Ile Glu Lys Ala Ala Ile Lys Arg Leu Met Thr 210 215 220 Asp Val Thr Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230 235 240 Val Ala Ser Val Val Ser Arg Tyr Leu Ala Glu Thr Lys Val Ala Arg 245 250 255 Gln Trp Arg Asn Lys Leu His Thr Phe Cys Ile Gly Met Lys Gly Ser 260 265 270 Pro Asp Leu Lys Ala Ala Lys Glu Val Ala Asp Tyr Leu Gly Thr Val 275 280 285 His His Glu Leu His Phe Thr Val Gln Glu Gly Ile Asp Ala Leu Glu 290 295 300 Glu Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310 315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser 
Leu Gly Val 325 330 335 Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340 345 350 Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Leu His Glu Glu Thr 355 360 365 Cys Arg Lys Ile Lys Ala Leu His Leu Tyr Asp Cys Leu Arg Ala Asn 370 375 380 Lys Ala Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385 390 395 400 Lys Asn Phe Ile Asn Val Ala Met Asp Leu Asp Pro Glu Cys Lys Met 405 410 415 Ile Arg Arg Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Asn Ala 420 425 430 Phe Asp Asp Glu Glu Lys Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440 445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455 460 Leu Lys Asp His Ala Lys Ala His Val Ser Asp Ser Met Met Thr Asn 465 470 475 480 Ala Ser Phe Val Tyr Pro Glu Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485 490 495 Tyr Tyr Arg Thr Val Phe Glu Lys Phe Tyr Pro Lys Asn Ala Ala Arg 500 505 510 Leu Thr Val Pro Gly Gly Pro Ser Ile Ala Cys Ser Thr Ala Lys Ala 515 520 525 Val Glu Trp Asp Ala Ala Trp Ser Lys Leu Leu Asp Pro Ser Gly Arg 530 535 540 Ala Ala Leu Gly Val His Asp Ala Ala Tyr Lys Glu Lys Ala Pro Ala 545 550 555 560 Ser Val Asp Pro Ala Val Asp Asn Val Ser Arg Ser Pro Ala His Asp 565 570 575 Val Lys Arg Leu Lys Thr Ala Ile Ser Ala Ala Ala Val 580 585 952264DNAOryza sativa 95gcggattcca ttctcctctt ggcatcacga ggcggcgccg cttggtctag ctagtagcca 60cagggagagg tggtagccgc agccgccgcc gacgagacct cgccgccggg gggagggcac 120catgtgtggc atcctcgccg tgctcggcgt cgcagacgtc tccctcgcca agcgctcccg 180catcatcgag ctatcccgcc ggttacgtca tagaggccct gattggagtg gtatacactg 240ctatcaggat tgctatcttg cacaccagcg gttggctatt gttgatccca catccggaga 300ccagccgttg tacaatgagg acaaatctgt tgttgtgacg gtgaatggag agatctataa 360ccatgaagaa ttgaaagcta acctgaaatc tcataaattc caaactgcta gcgattgtga 420agttattgct catctgtatg aggaatatgg ggaggaattt gtggatatgt tggatgggat 480gttcgctttt gttcttcttg acacacgtga taaaagcttc attgcagccc gtgatgctat 540tggcatttgt cctttataca tgggctgggg tcttgatggt tcggtttggt tttcgtcaga 600gatgaaggca ttaagtgatg attgcgagcg attcatatcc 
ttcccccctg ggcacttgta 660ctccagcaaa acaggtggcc taaggagatg gtacaaccca ccatggtttt ctgaaagcat 720tccctccacc ccgtacaatc ctcttcttct ccgacagagc tttgagaagg ctattattaa 780gaggctaatg acagatgtgc catttggtgt tctcttgtct ggtggactgg actcttcttt 840ggttgcatct gttgtttcgc ggcacttggc agaggcaaaa gttgccgcac agtggggaaa 900caaactgcat acattttgca ttggtttgaa aggttctcct gatcttagag ctgctaagga 960agttgcagac taccttggta ctgttcatca cgaactccac ttcacagtgc aggaaggcat 1020tgatgcactg gaggaagtca tttaccatgt tgagacatat gatgtaacga caattagagc 1080aagcacccca atgttcttga tgtcacgtaa aattaaatct ttgggggtga agatggttct 1140ttcgggagaa ggttctgatg aaatatttgg cggttacctt tattttcaca aggcaccaaa 1200caagaaggaa ttccatgagg aaacatgtcg gaagataaaa gcccttcatt tatatgattg 1260cttgagagcg aacaaatcaa cttctgcatg gggtgttgag gcccgtgttc cgttccttga 1320caaaaacttc atcaatgtag ctatggacat tgatcctgaa tggaaaatga taaaacgtga 1380tcttggccgt attgagaaat gggttctccg gaatgcattt gatgatgagg agaagcccta 1440tttacctaag cacattctat acaggcaaaa ggagcaattc agtgatggtg ttgggtacag 1500ttggattgat ggattgaagg atcatgcaaa tgaacatgta tcagattcca tgatgatgaa 1560cgctagcttt gtttacccag aaaacactcc agttacaaaa gaagcgtact attataggac 1620aatattcgag aaattctttc ccaagaatgc tgctaggttg acagtacctg gaggtcctag 1680cgtcgcgtgc agcactgcta aagctgttga atgggacgca gcctggtcca aaaaccttga 1740tccatctggt cgtgctgctc ttggtgttca tgatgctgca tatgaagata ctctacaaaa 1800atctcctgcc tctgccaatc ctgtcttgga taacggcttt ggtccagccc ttggggaaag 1860catggtcaaa accgttgctt cagccactgc cgtttaactt tctatcgtcg cataaaactc 1920cgtagtttgt tgttcttggt tcaatcccag cttctttcag atgtcgttag tttcttcaaa 1980catgtaatgg agatgcgtgc ttttcctggc ttgttagtta ctgtatgctt gtcatcgtgt 2040atgttttctt ttcttttcca atatgcaaac tgtttggtcg tggactgatc agaacattgt 2100aaatatgaat aaccgcgact gatatcctca agttgctttt ggtttgcaat agttctaatc 2160ttgatgttct gctgggaatc ggaagatgtt atgcagtatg cgtattgttg gggtgtaacc 2220gtgtaagtgc atctgaaatg aagttacggg cgatggtaac tggg 2264962332DNAAquilegia formosa 96caagtgatta aatcatccac atttcttctt tctttctttc tttctttttt tttctttttt 
60tctttgttct ccttgtttat agaatcttat tatttattaa cagagcaaaa gcatttctct 120agctagctag ctttatttct ttgtgatcat caatcaataa atatatataa ttcatcatca 180tgtgtggaat tctagctgtt ttgggttgtt ctgatgattc tcaagccaaa agagttcgtg 240ttcttgagct ttctcgcaga ttgaagcacc gtgggcctga ttggagtggt ctgtatcagc 300atggtgacaa ttttctatct catcaaaggc ttgcagtcat tgatcctgct tctggggatc 360agcctcttta taatgaagac aaatcaattg tcgtaactgt gaatggagaa atttataacc 420atgaagcctt gaggaagcgc ttgccaaatc acaaatttcg aactggaagt gactgtgatg 480ttattgctca tctgtatgaa gaattcgggg aggattttgt tgacatgttg gacgggatgt 540tctcatttgt tttattggac acccgcgata acagcttcct tgtcgcccgg gatgccattg 600ggattacctc cctttatatt ggttggggac ttgatggttc aatttggatt tcatctgaga 660tgaaaggact aaatgatgac tgtgaacact ttgaatgctt tcctcctggt cacctttact 720cgagcaaaaa tagtggtttt cgtaggtggt acaatccctc atggttctca gaagctgttc 780catctacacc atatgatcca ctcgtcctca gacgtgcatt tgaaaatgct gtagttaaga 840ggctaatgac tgatgtacca tttggagttc tcctatctgg tggccttgat tcatcattag 900ttgcctccat cacggcacgc cacttggcag agacaaaggc tgccaagcaa tggggggcac 960aacttcattc cttctgtgtt ggtctggagg gctcacctga tttaaaggct ggaaaagagg 1020ttgccgatta tttgggtacc gttcaccatg agtttcactt cactgttcag gatggtatcg 1080atgccattga agatgtgatt taccatgtag aaacatatga tgtaacgact atccgggcga 1140gcacacctat gtttcttatg tctcgcaaga tcaagtcact aggagtgaag atggttatct 1200ctggagaagg ctccgatgaa atatttggtg ggtacttata tttccacaag gctcctaaca 1260aggaggagtt tcatcgcgag acatgtcata agataaaggc tcttcatcag tatgattgct 1320tgagagctaa taaatcgacc tctgcttggg gtctggaagc tcgggtgcca ttcttagaca 1380aagaattcat caatgttgca atggctattg accctgaatg gaagatgatt aaacgtgatc 1440aaggccgtat tgaaaagtgg gtactcagga gggcttttga tgatgaggac cacccctacc 1500tgccaaagca cattctctac aggcagaaag aacaatttag tgatggtgtt ggatatagtt 1560ggatcgatgg actcaaggcc cacgctgcat cacatgttac ggataagatg atgcgcaatg 1620ccaagaacat tttcctacac aacacaccaa ctaccaaaga agcctactac tacagaatga 1680tttttgagag gtttttccct cagaactcgg caaaattaac agttccaggt ggtccaagtg 1740ttgcttgcag cactgccaag gctgtcgaat gggatgcttc 
ttggtcaaat aatttggacc 1800cttctggcag ggctgcatta ggtgtccatg cttcagcata tgaagcacaa ctgtctgctc 1860ctcttgctaa tggtaatgtt ccagttaaga tttttaacaa tgtaccaaga atggttgaag 1920taggtgctcc agctagcctc acgatccgca gctaatattt ctggtgaatg tgccttattt 1980tgtatggatt tgaagttaag aggccatagt atgcaaggtt cttttttttt cttttttttt 2040ttcagtgtgc agtgtgtata tgtactagta gtccatatgt gaaggaagat gaaacaaaac 2100tatgtaaaag tccatgtctt ttatatttct gaaaaaagaa ggttcttgtg atttcttttt 2160tgctacaaat aggcataaaa tagctgattc catgtatcgg gcacccctgg caaacaccaa 2220tgtatgcagt ctgcatagcg ttgtggatca gccttctgct catcggtcaa cactttccct 2280tgttgttctg tgtaaactga tgtatgtgca tcaatccgat attcagatat tt 2332971925DNAAsparagus officinalis 97tctgcttgca ccttttgaga gagagggaga gagagagaga gagagagaga ggatcatgtg 60tgggatactt gcagtgctcg gttgctccga tgactctcag gcgaagaggg ttcgagttct 120cgagctctct cgcaggttga agcacagggg cccagattgg agcgggcttt gccaacatgg 180agattgtttc ttgtctcatc agagattggc gatcattgat cccgcctctg gtgatcaacc 240cctgtacaac gaggacaagt ccatcgttgt cacggtaaac ggagagattt acaaccacga 300agagctaagg cgacgcctgc ctgatcataa atacagaact ggaagcgact gtgaagtcat 360cgctcatctg tatgaggaac acggagaaga tttcgtcgat atgttggatg gaatgttctc 420cttcgttcta ttggacaccc gaaacaattg cttcgttgcg gcaagggatg cagtgggaat 480aacccccctc tacattggct ggggattaga cggctctgtt tggctctcgt cggaaatgaa 540aggattaaac gatgactgcg aacattttga agtatttcca cctggaaacc tgtactcaag 600cagatcaggc agcttcagaa gatggtataa tcctcagtgg tacaatgaga ctatcccttc 660ggccccctat gatcctcttg ttctgaggaa agcttttgag gatgctgtta taaagaggct 720gatgactgat gtgccatttg gggttctgtt atctggtggc ctcgattcct cgttggtcgc 780cgctgttact gctcggcatc ttgcaggaag taaagctgca gagcaatggg gaactcagct 840ccattctttc tgtgttggct tagagggatc accagatctc aaggctgcaa aagaggttgc 900agagtatctg ggtactgtcc accatgagtt tcacttcaca gttcaggatg gaattgatgc 960cattgaggat gtaatcttcc acattgaaac gtacgatgtg acaacaatca gggcaagcac 1020tccaatgttc ctcatggcca gaaaaatcaa gtccttagga gtaaaaatgg tgatctcagg 1080cgaaggctcg gatgaaatct ttggcgggta cttgtatttt cacaaagcac ctaacaaaga 
1140agaattccat cacgaaacat gtcgaaagat caaagctctg catcagtatg actgcctcag 1200agccaacaaa gcaacatcag catgggggct ggaagctcga gtgccatttt tagacaagga 1260gttcatggat gttgctatga gtatagatcc tgaatcgaaa atgattaagc ctgatctcgg 1320gaggatcgag aagtgggtac tgaggaaagc ttttgatgat gaagagaatc cctatcttcc 1380aaagcatatt ctctataggc aaaaggagca gttcagtgat ggtgttggat atagttggat 1440tgatgggctg aaggctcatg ctgcaaaaca tgtaactgat agaatgatgc tgaatgcagc 1500acgtatttac ccccacaaca caccaaccac aaaagaggct tattactaca gaatgatctt 1560tgaaaggttc ttccctcaga actcggcgag atttactgtc cctggaggtc caagcattgc 1620ttgcagcacg gcgaaggcta tcgaatggga cgctcgctgg tcgaacaatt tggatccgtc 1680ggggagagca gctctcggcg tccatgactc tgcctacgat cctcctcttc cttcttcgat 1740ttctgcagga aaaggagctg caatgatcac taacaagaag ccgaggattg tggatgtagc 1800aactccggga gttgttatta gtacctgatg ttggtttggt ttggtttggt tttgatgtac 1860aagttaaaat aaatgtgtgg ggcgttgtat tttggatgga gggtactaaa gcgtgtaatt 1920tgctg 1925982026DNABrassica oleracea 98ttgcgattaa ataagaaaaa tgtgtggaat acttgccctt ttaggatgct ccgacgattc 60tcaggccaag agagtacgcg ttcttgagct ttctcgcaga ttgaggcaca gaggacctga 120ttggagcgga atatatcaga acgggttcaa ttacttggcc catcaacgtc ttgctatcat 180cgatcctgat tccggtgatc aacctctctt taacgaggac aagtccattg ttgtcacggt 240gaacggagag atttataacc atgaggagct gagaaagggt ttgaagaatc acaagttcca 300caccggtagt gattgtgacg tcatagctca cctgtacgag gagcatggtg agaattttgt 360ggacatgttg gatggaatct tctcctttgt gttgctggac acaagagata actcattcat 420ggttgctcgt gacgcggttg gtgtcacttc gctctacatt ggttggggat tagatggatc 480tctgtgggtc tcttccgaga tgaaaggctt acacgaagat tgtgagcatt tcgaagcctt 540tcctccaggt catttgtatt caagcaaatc aggaggaggg tttaagcaat ggtacaatcc 600tccttggttc aatgaatctg ttccttctac gccttatgag cctctcgcaa ttagaagcgc 660ctttgaagac gctgtgataa agcggttgat gactgatgtc ccatttggag ttttgctatc 720tggtggtctt gattcttctc ttgttgcatc catcactgcc cgtcacttgg ccggtactaa 780ggccgctaag cgatggggtc ctcagctcca ttccttttgt gtcggtcttg agggctcgcc 840ggacttgaag gcggggaaag aagtggcgga gtatttgggg acggtgcacc atgagttcca 900tttcacggtg 
caagacggga ttgatgcgat tgaggatgtg atctaccatg tcgagacata 960tgatgtgacg acaattagag ctagcacacc catgttcttg atgtccagga aaatcaagtc 1020tctaggtgtt aagatggttc tttccggtga aggttctgat gagatctttg gagggtatct 1080ttacttccac aaggcaccta acaagcaaga atttcaccaa gaaacttgtc gcaagatcaa 1140ggctcttcac aaatacgatt gtttaagagc caacaaagct acctctgctt ttggtctaga 1200ggcgcgtgtt ccttttctgg acaaggagtt
tatcaacacc gctatgtctc tcgaccctga 1260atccaagatg atcaaaccag aggaagggag gatcgagaag tgggttctaa ggagagcctt 1320tgatgatgaa gaacgtcctt atttgccaaa acacattctc tacagacaga aagagcagtt 1380tagtgatggt gttggctaca gctggatcga tggcctcaaa gcccacgctg ctgaaaatgt 1440taatgacaag atgatgtcga aagctgcttt tatcttccct cacaacaccc cactcaccaa 1500agaagcatac tattacagaa tgatctttga gaggttcttc ccacagaact cggcaaggct 1560aactgttccc ggaggtgcga ccgtggcttg ctcgaccgca aaagcggtgg agtgggatgc 1620aagctggtcc aacaatatgg atccatctgg aagagctgcg attggagttc acctctcggc 1680ctacgacggc agcaaagtgg cattgccctt gccggcgcca cataaggcaa tcgacgacat 1740cccaatgatg atgggacaag aagttgtgat tcagacatga gtttgaagga tatatagggg 1800aattggagtt ctttaaagtt gtcctaatgg gtttaagtgt ttttgtatga tttcaaaata 1860aaattggttt cgtgttctta gggaaatatg aatgcataaa ttatttttct tgtactatta 1920gtaaatattc gaatgtactg tttctgcaaa atcgatgtac atcaatctta ttataattat 1980atgtattgta atatgatatg aaaaatgtga ttttgcttgt tttcac 2026993203DNAChlamydomonas reinhardtii 99cccctcccgc tccctccccg acatatgatc cagcattgat gggtgataca gacgaagcgc 60agaagcagca atccggtgtg tacgcatatg ggcacggcag cagctgctgg cagccccgga 120cgaaatccct agctgcactt tcgggcccgc gccagtccct tccaagcgct ttgtgacgtc 180ttctggctac ttacttgctc agcgtatcgc gcacgcccgg ctgcgccccg ctttgccctt 240gcgccacttc cgcacgaagg gtctgcacct tctccaggtc atccgctgca tcgtctgctt 300ccctgtccga gtacgttgcc cttatataag tcagcagcgg tgttttgatg tccacagtct 360ccgtcttctt gcaatgtatc gctaacataa ccgattgagc ggtcggcatt tttcaagagg 420cccttcgtga gcgtgccttg ctagatctgg ctagaggttg cagcgcgggt gtgaaaacgc 480agtgagggtt tggttgaatc gacatgcagc cccgtgcgcc catgcaactg tctttccgcg 540cgcagcaggg ccgatggatt cctttccttt acgcccaaac tacgctgggc acacacatct 600ttttgggtag ggctcttacg gtagccaaat tcttatagag tttggggagt gcgggtagca 660ctcaaaaatg tgcggcattc ttgccgtcct caacacgacg gatgacagcc aggctatgcg 720ctcgagggtg ctggccctga gccgtcgcca gcgtcaccgt ggccccgact ggtctggcat 780gcaccagttc ggcaacaact tccttgccca tgagcgcctt gcgattatgg accccgcctc 840gggtgaccag cccctgttca acgaggaccg cacaatcgtg gtcaccgtga acggtgagat 
900ctacaactac aaggagctgc gccagcagat cacggatgcc tgccccggca agaagttcgc 960caccaacagc gattgcgagg tgattagcca cctgtacgag ctgcacggcg agaaggtggc 1020ctccatgctg gacggcttct tcgccttcgt ggtgctggac acccgcaaca acaccttcta 1080cgccgcgcgc gaccccattg gcatcacctg catgtacatc ggctggggcc gtgacggcag 1140cgtgtggctg tcgagcgaga tgaagtgcct gaaggatgac tgcacccgct tccagcagtt 1200ccctcccggg cacttctaca actccaagac gggtgagttc acccgctact acaaccccaa 1260gtacttcctg gacttcgagg ccaagccgca gcgtttcccc agcgctccct acgaccccgt 1320cgcgctgcgt caggcgttcg agcagtccgt ggagaagcgc atgatgtcgg atgtgccgtt 1380cggcgtgctg ctgtcgggcg gcctggacag ctcgctggtg gcgtccatcg cggcgcgcaa 1440gattaagcgt gagggcagcg tgtggggcaa gctgcacagc ttctgcgtgg gcctgcccgg 1500cagccctgac ctgaaggctg gcgcccaggt ggctgagttc ctgggcaccg accaccacga 1560gttccacttc acggtgcagg agggcattga cgccatcagc gaggtcatct accacattga 1620gacctttgac gtcaccacca tccgcgcctc cacgcccatg ttcctgatga gccgcaagat 1680caaggcgctg ggcgtgaaga tggtgctgtc aggcgagggt tccgacgagg tgttcggcgg 1740ctacctgtac ttccacaagg cgcccaacaa ggaggagttc cagtcggaga ctgtgcgcaa 1800gatccaggac ctgtacaagt acgactgcct gcgcgccaac aagtccacca tggcttgggg 1860cgtggaggcg cgcgtgccgt tcctggaccg ccacttcctg gacgtggcca tggagatcga 1920ccccgccgag aagatgattg acaagagcaa gggccgcatc gagaagtaca tcctccggaa 1980agccttcgat acccccgagg acccctacct gcccaacgag gtgctctggc gccagaagga 2040gcagttcagc gacggcgtgg gctacaactg gatcgacggc ctcaaggcgc acgcggacag 2100ccaggtcagc gacgacatga tgaagacggc cgcgcatcgg taccccgaca acacgccccg 2160caccaaggag gcgtactggt accgcagcat cttcgagacc cacttccccc agcgtgccgc 2220cgtggagacg gtgccgggcg gcccctcggt ggcctgctcc accgccaccg ccgcgctgtg 2280ggacgccacc tgggctggca aggaggaccc ctcgggccgc gccgtggccg gcgtgcacga 2340ctcggcctac gacgccgccg ccgccgccaa cggcgagccg gctgccaaga aggccaagaa 2400gtaaacgggc cttgtccacc acttgcggtc ccgactgcgg cagctgagac tagctgtcag 2460aggttgctgc gcatggggcc gcggcgtgcg tcgctaccgg gaagcagcgt gctgtggggg 2520agtttgatgt gcttcctgat cagcatcgtg ctcgcggagt agcgagagcg agtccggatc 2580atgcacgcga tgcggctgca tgcataaaga 
gcagcacctc agctgcaccg ccgtctgtgc 2640atgcatggcc agtgattcca ccaggtgcac ggccttgcgt ttttgagcga agagcacacg 2700tcacggatgt caacgcgtta ttcgggggct acgagcctgc gcgctattgt gtcgtgtttt 2760actggcgtgg agtgtcgtgg atgctgtttc tgacagatgt ctttcactgc gagtgtgaat 2820cataggggtg acttgacggt caatgtagac gaggaacggg gagacgacat gcccattgac 2880aggatgacta ggtcttgacg gtggaggatg ggtcacgggc ggcacaagac gcgggggaac 2940aggcggtgcg aagtccagca catggattaa ttagataaag gggcgccagc aacttggcgc 3000ccgcgtagaa agtcatgaag ccatgctagg cggtagtcgc aaggaagcga gaacgggatg 3060ggacgcagct gcacacgtgc ggcggtgggg agccgctgaa gctctttaag aagacgttcc 3120gcagactctc tgatcccaac tgccattctg ccaacccgtt ttgcacgccg aaaacctggc 3180acactggaag cgctcatcac gct 32031002238DNAGlycine max 100tggaaccctt ctacgtgttc tccattccct ctctcactcc tccatctacg tttcttaaat 60catttccttc tttctctctt tctttatctt ctcattttcc tcattacact cttttttttt 120tctctcaact tttctcttat taaccatagt tcacatatta tatcatcaca tatcatagtg 180atatattata tcatatcaca atgtgtggca tacttgctgt gcttggttgc tctgattcat 240ctcaagccaa aagggttcgc gtccttgagc tttctcgcag attgaagcac cgtggtcctg 300actggagtgg gctccaccaa tatggtgata actatttggc tcatcaaagg ttagccatag 360ttgatccagc ttctggtgat caacccctct tcaatgaaga caaaactgtc gtggttacgg 420tgaatggaga gatctacaat catgaagaac tcaggaaaca gttgcctaat cacaccttcc 480gtacaggaag tgactgtgat gttattgctc acctgtatga ggagcacgga gaaaactttg 540tggacatgct tgatggtata ttttcgtttg ttctgctaga tactcgtgac aacagtttta 600tagtggcacg agatgcaatt ggggtcactt ccttgtacat tggttggggt ctagatggct 660ctgtctggat ttcatcagaa ttgaaggggt tgaatgatga ttgcgaacat tttgagtctt 720ttccacctgg tcacttgtac tctagcaaag agagagcgtt ccgcagatgg tacaatcctc 780catggttctc tgaggctatt ccctcagcac cttatgatcc tcttgctttg aggcatgcct 840ttgagaaggc tgtggtaaaa aggttgatga ctgatgttcc ctttggtgtt ttgctctctg 900gaggtttgga ctcttcattg gttgcagccg tcacggctcg ctacctggca ggcacaaatg 960ctgccaagca atggggaacc aaattacact ctttctgtgt aggccttgag ggtgcacctg 1020acctaaaggc agcaaaggaa gtagcagact acataggaac tgtacatcat gaatttcact 1080acactgttca ggatggcata gatgccattg 
aggatgtgat ctatcacatt gaaacatatg 1140atgtgacaac aattagagca agcattccca tgtttcttat gtctcgtaag atcaagtcat 1200tgggagtcaa atgggttata tctggagaag gatctgatga gatctttgga gggtatctat 1260atttccacaa ggcaccaaac aaagaagagt ttcatcaaga aacatgccgc aagattaaag 1320cactccacaa atatgattgc ttgcgagcca ataaatcgac ctttgcctgg ggtctagaag 1380ccagagtgcc atttttggac aaagatttta tcagagttgc aatgaacatt gatcctgatt 1440ataaaatgat taaaaaggaa gaagggcgaa ttgagaaatg ggtactgagg agggcctttg 1500atgatgaaga acatccttat ctgccaaagc acattttata caggcagaaa gaacaattca 1560gtgatggagt tggctatggt tggattgatg gccttaaagc tcatgctgag aaacatgtga 1620ctgacagaat gatgctcaat gctgctaaca ttttcccctt caacacacca accaccaaag 1680aagcatacta ctatagaatg atatttgaga ggttcttccc tcagaactca gccaggctga 1740gtgttcctgg aggaccaagt gttgcatgta gcacagccaa agctgtagag tgggatgctg 1800cttggtctaa caaccttgat ccatctggta gggcagcact tggagtgcat gcatcagctt 1860atggaaatca ggtcaaagct gtagaaccag agaagatcat accaaagatg gaagtttccc 1920cactaggagt tgccatatag agctagtatg agccatagca aaaactagta gttgccctag 1980aaccaaaata tattattata ctagtcatca atgactcatt aatcatcata aatgaaaatt 2040tggcctgctg tgtagtttat tcaggcaagg ctatatataa atagataagg ctctctatct 2100agctgtctta agtgttgttc catccacatc ttgtcttcgt tttctattta tgtcatctga 2160gcactatcat gatgtactgg atttccaaga aaatgttcag ttaaatttga atgcaaagtt 2220cactatttca gactttca 22381012108DNAGlycine max 101ggggcattgg attctcacca acgtttgcgt tactcaagcc gacattctcg cttccgttgg 60aaccgttctt cgtgttctcc attccctctc tcactccttc atctacttca catattatat 120catcacatat catagtgata tcatatcaca atgtgtggca tacttgctgt gcttggttgc 180tctgattcat ctcaagccaa aagggtccgc gtccttgagc tttctcgcag attgaagcac 240cgtggtcctg actggagtgg gctccaccaa tatggtgata actatttggc tcatcaacgg 300ttagccatag ttgatccagc ttctggtgat caacccctct tcaatgaaga caaaactgtt 360gttgttacgg tgaatggaga gatctacaat catgaagaac tcaggaaaca attgcctaat 420cacaccttcc gtacaggaag tgattgtgat gttattgctc acctgtatga ggagcacgga 480gaaaacttta tggacatgct tgatggtata tcttcatttg ttctgctgga tactcgtgac 540aacagtttta tagtggcgcg ggatgcaatt 
ggggtcactt ccttgtacat tggttggggt 600ttagatggct ctgtctggat ttcctctgaa ttgaaggggt tgaatgatga ttgcgaacat 660tttgagtctt ttccacctgg tcacttgtat tctagcaaag agagagcgtt ccgcagatgg 720tacaatcctc catggttgtc tctggctatt ccatctgccc cttatgatcc tcttgctttg 780agacatgcct ttgagaagct gtggataaaa aggttgatga ctgatgtgcc ctttggtgtt 840ttgctctctg gaggtttgga ctcttcattg gttgcagccg tcacggctcg ctacctggca 900ggcacaaaag ctgcgaagca atggggaact aaattacact ctttctgtgt aggccttgag 960ggtgcacccg acctaaaggc tacaaaggaa gtagcagagt acataggaac tgtccatcat 1020gaatttcact acactgttca ggatggcata gatgccatcg aagatgtgat ctatcacatt 1080gagacatatg atgtgacaac aattagagca agcattccca tgtttcttat gtctcggaag 1140atcaagtcat tgggagtcaa atgggttatc tctggagaag gatctgatgt tttttttgga 1200gggtatctat atttccacaa ggcacccaac aaagaagagt tccaccaaga aacatgccgc 1260acaattattg tactccacag gtatgattgc tcgcgagcca ataaatcgac ctttgtctgg 1320ggtctagaag ccagagtacc atttttggac aaagagttta tcagagttgc aatgaacatt 1380gatcctgagt gtaaaatgat aaaaaaggaa gaagggcgaa ttgagaaatg ggcactgagg 1440agggcctttg atgatgaaga acatccttat ctgccaaagc acattttata taggcagaaa 1500gaacaattca gtgatggagt tggctatggt tggattgatg gccttaaagc tcatgctgag 1560aaacatgtga ctgacagaat gatgctcaat gctgccaaca ttttcccctt caacactcca 1620accaccaaag aagcatacca ctatagaatg atatttgaga ggttcttccc tcagaactca 1680tgcaggctca ctgttcctgg aggaacaagt gttgcatgta gcacagcaaa agctgttgag 1740tgggatgctg cttggtctaa caaccttgat ccatcaggta gagcagcact tggagtgcat 1800gcatcagctt atggaaacca ggtcaaagct gtagaaccag agaagatcat acccaagatg 1860gaagtttctc cactaggagt tgccatatag agctagtatg agccatagca aggactagta 1920gttgccctag aaccagcata tattattatt atactaatca tcaaatcatg aaacatcagg 1980ttgctttgta gttatccagg gaatggtata taaatagata aggatctcta tctatctggc 2040tctctttctg ggccacccag atctagcctc aacttgcttt cgatgtcacc tgatgcacaa 2100tcataaag 21081022134DNAGlycine max 102ggcacgagct tcaacttcac ccattcatac gtggtgttgt tactgctgct cttttctctt 60ttcttttctc tttagttctc tcttcccctt tctttttctt tttcttcttc ttctgagctt 120gtttaagctt ttcttccatt aacatattat cacaatgtgt 
ggtattcttg ctgttcttgg 180ttgttctgat gactctcgag ccaaaagggt ccgcgtgctt gagctctctc gcagattgaa 240gcaccgtggc cctgactgga gtgggctcca tcaacatggt gactgctttt tggcacatca 300acggttagcc atagttgatc ctgcttctgg ggatcaacct ctctttaacg aggacaaatc 360cgtcattgtt acggtaaatg gagagattta caaccatgaa gagctcagga aacagctgcc 420taatcacaac ttccgaactg gaagtgattg tgatgttatt gcacacctgt acgaggaaca 480tggagaagac tttgtggaca tgctggatgg tatcttctca tttgttctac tggacacccg 540tgacaacagt tttatagtgg ctcgggatgc tattggggtc acttccttgt acattggatg 600ggggttagat ggctctgttt ggatttcatc agaaatgaaa ggcctgaatg atgattgtga 660acactttgag tgttttccac ctggtcactt gtactctagc aaagaaagag ggttccgcag 720atggtacaat cctccttggt tctctgaggc tattccatct gccccttatg atcctcttgt 780tttaagacac gcctttgagc aggcagtcat aaaaaggttg atgactgatg tgccttttgg 840tgttctactc tctggaggtt tggactcttc tttggttgca tccatcactt ctcgttactt 900ggccaacaca aaggctgctg agcagtgggg atcaaagtta cattcattct gtgtaggcct 960tgagggctca ccagatttga aggctgcaaa agaggttgct gactatctag gcactgtcca 1020ccatgagttt accttcactg ttcaggatgg aatagatgcc attgaagatg ttatctacca 1080tattgaaaca tatgatgtga ctacaattag agcaagcaca cctatgtttc tcatgtctcg 1140gaagattaaa tcacttggtg tcaaatgggt tatctcagga gaaggatctg atgagatctt 1200tggagggtat ttgtacttcc acaaggcacc caacaaggag gagttccaca gagaaacatg 1260ccgcaagatc aaagcacttc accaatatga ttgcttgcga gccaataaat caacatttgc 1320ttggggtcta gaagcccgtg taccattttt ggacaaggcg tttatcaatg ctgcaatgag 1380tattgaccct gagtggaaga tgataaaaag agatgaagga cgaattgaga agtggattct 1440gaggagagcc tttgatgatg aagagcatcc ttatctgcca aagcacattt tatacaggca 1500gaaagaacaa ttcagtgatg gagttggcta tagttggatt gatggcctta aggcccatgc 1560tgcaaaacat gtgactgaaa aaatgatgct taatgctggt aacatttacc cccacaacac 1620cccaaaaacc aaggaagcat attactacag aatgatcttt gagaggttct tccctcagaa 1680ctcagctagg ctcactgttc ctggaggagc aagtgttgca tgtagcacag ccaaagctgt 1740tgagtgggat gctgcttggt ctaacaacct tgatccctct ggtagagcag cacttggagt 1800gcacatttca gcctatgaaa accagaacaa caagggtgta gaaattgaga agataatacc 1860tatggatgct gctccccttg 
gtgttgccat ccagggctaa tacaaagatg tgacaaagaa 1920taatttgggc gacaatgaag ataactaagc taaaggtgaa tgaaaaattt gcctgcagtg 1980taatttcatc tgggcaaagc ttttatagtt tatagttata aggctttcta aaaagtgttg 2040cgtattgtat tatcttgaat gctgtgattt gaagtcttaa taaaagtgtt tcctttatca 2100gttcataatg aatgcaaagt ccattatttt aaaa 21341031986DNAGlycine max 103agcagtggta tcaacgcaga gtacgcggga gttctgttgt tgtgttgtgt tgtgtgtctt 60cccttgtgtg ttccagtttt tatttgcagc cgccatgtgc ggaatcctcg cagtgttggg 120ttgcgtcgac aactctcaga ccaagcgcgc tcgcatcatc gaattgtctc gcaggttgcg 180gcatagaggt cctgattgga gtggcataca ttgctatgag gattgttacc tagctcatca 240acgccttgct attgttgacc ctacttcagg ggaccaacct ttgtacaacg aagacaaaac 300tattattgtc actgtaaatg gggagatata caatcacaag caattgaggc agaaactgag 360ttcccatcaa tttcgaactg gtagtgattg tgaagtgatt gcccatcttt atgaagaaca 420tggagaagaa tttgttaata tgctggatgg gatgtttgcc tttattcttc ttgatactag 480ggataaaagt tttattgctg ctcgtgatgc tattggcatt acccctctat acttgggctg 540gggtcatgat ggatcaacat ggtttgcatc tgaaatgaaa gctctgagtg atgattgtga 600gagattcata tcttttcctc cagggcacat ctattccagc aaacagggag gattaagaag 660gtggtacaat ccaccatggt tttcagagga tattccatca actccctatg atccaaccct 720tttgcgtgag accttcgaga gggctgtagt taagagaatg atgactgatg taccttttgg 780agttcttttg tctggaggat tggactcatc acttgttgct gcagtggtca atcgttattt 840ggctgaatct gaatctgctc gtcaatgggg atcacagtta catactttct gcattggttt 900aaagggctct cctgacttga aagctgcaaa agaggtagca gattaccttg gtactcgtca 960ccatgaactt tatttcacgg ttcaggaagg tatagatgca cttgaagaag tcatttacca 1020tattgaaaca tatgatgtaa cgactatcag agcaagtact gcaatgtttc ttatgtccag 1080aaaaattaaa gccttgggag tgaaaatggt actttctgga gaaggttcag atgaaatatt 1140tggaggttac ctgtattttc acaaggcacc taataagaaa gagtttcatg aagaaacatg 1200tcgaaaaatt aaagctcttc atctttatga ctgcctgaga gccaataaat caactgcagc 1260atggggtgta gaggcacgtg taccattctt ggataaagaa tttatcaacg tagccatgag 1320tatagatccg gaatggaaaa tgataaggcc tgatcttgga aggatagaga agtgggtatt 1380acgcaatgca tttgatgacg ataagaatcc atatttacca aagcacatat tgtacaggca 1440gaaggaacaa 
ttcagtgatg gggttggtta cagctggatt gatggcttga aggatcacgc 1500aaacaaacaa gtcacagatg cgacgatgat ggctgccaat tttatttacc ctgaaaacac 1560tcctaccaca aaagaaggat acctctacag gacaattttt gagaagttct ttccaaagaa 1620tgcagcaaag gcaacagtgc caggaggtcc tagtgtggca tgcagtactg caaaagctgt 1680ggaatgggat gcagcatggt caaaaaatct tgatccttct ggtcgtgccg cacttggtat 1740tcatgatgct gcatatgatg cagtggatac caaaattgac gagcccaaaa atggaaccct 1800ttaaggccca taatcgattg tcaagagaaa aaaatgtatg caacaactgt ctagtgggga 1860tttaaacttc tagtaggcaa aactaatgag aagtgggatt gtttttattt tcagctcaaa 1920ttaatatgta ggttttgaac tgtttgtggg ttattttaaa taaatatcta tatttaaatt 1980ttgtag 19861041746DNAPhyscomitrella patens 104atgtgcggaa ttttggctat tctcggttcc cacgacgcgt cgcctgcgcg acgtgatcgc 60attctggagc tttcccgcag gctgcgccac cgcggtcccg actggagtgg gctgttcgca 120gggcagaagt gctggtgtta tctggctcat gagcgcttgg ccatcattga tcccgcctcg 180ggcgaccaac ctctgtacaa tgagaacaaa gatatcgtcg tcgctgccaa tggagaaatc 240tacaaccacg aggccttgaa gaagagcatg aagcctcaca agtatcacac gcagtccgac 300tgtgaagtta ttgctcatct ctttgaagat gtcggcgagg acgtggtcaa catgctggac 360ggcatgttct cattcgtgtt ggtcgacaac cgcgataatt ccttcatcgc cgcccgggat 420cccattggca tcacccctct ctactacggc tggggtgcgg atggaagtgt ttggtttgca 480tcggagatga aggccttgaa ggacgattgc gagcggttcg agattttccc acccggtcac 540atctactcta gcaaagctgg agggcttcgg cgatattaca acccagcttg gttctctgag 600acttttgtcc ccagcacccc ttaccagtct cttgttctcc gcgcagcctt cgagaaggct 660gtaatcaaga gactgatgac cgacgtgccc ttcggtgtac tcctatccgg agggctggat 720tcttcattag tggcagcagt ggcatcccgt catatcgcag gaactaaagc tgccaacatc 780tggggcaagc agcttcactc tttctgcgtc ggacttcagg gttctcctga cctgaaggct 840gctcgggaag tcgccaacta catcggcacc cagcaccacg agttccactt tactgtccaa 900gaaggtttgg acgctctgtc ggatgtgatc tatcatgtgg agacttacga cgtgaccacc 960atccgagcta gcacgcccat gttcctcatg acacgcaaga ttaaggctct gggtgtaaag 1020atggtgttgt ctggggaggg atccgatgaa atttttggtg gttacctcta tttccataaa 1080gcgcccaaca gggaggagtt ccaccatgag cttgttcgca agatcaaggc gctgcatatg 1140tatgattgcc agagagccaa 
taagtcgacg tctgcctggg gtttggaggc gcgtgttccc 1200ttcctagaca aagaatttat ggaagttgcc atggctatcg atcctgcgga aaagctgatc 1260aggaaggacc aaggaagaat agagaagtgg gtgctccgaa aagctttcta cgacgaaaag 1320aatccttacc tgcccaagca cattttgtat cgccagaagg agcaattcag cgatggcgtt 1380ggctacagct ggattgacgg cctcaaggct catgcacaga gccatgtatc cgaccaaatg 1440ctgaagcatg caaagcacgt gtacccctac aacacgccgc agactaaaga agcatactat 1500taccgaatgc tcttcgagaa acacttcccg cagcaatccg ctcgcttgac ggtccccgga 1560ggtgctagcg tcgcatgtag cacggccaca gcagttgcat gggacaagtc ctgggcgggc 1620aacctggacc catctggccg agcagcattg ggatgccacg acgcggccta cacggaaaac 1680agcgctgcaa tgagttacat aacaaaaaac atgtcaaatg ttggacaaaa aatgaccata 1740cattga 17461052199DNAPhyscomitrella patens 105atgtgtggaa ttctagcgat tctcggtgcc gacggcgccg ttccgtctgc cggacgtgat 60cgcgctctag cgctgtcccg aaggctgcgc catcgaggac ctgactggag tggactcttt 120gagggcaagg attcctggtg ttacctcgct catgagcgcc tggctatcat cgatccggct 180tcgggtgatc aacccctcta caatggcact aaggacatcg ttgtcgctgc taacggagag 240atttacaacc acgagttgtt gaagaagaac atgaaaccac acgagtacca cacgcagtcc 300gattgcgaag tcattgctca tctttatgag gatgtaggtg aggaggttgt gaacatgctt 360gacggcatgt ggtcgttcgt gctggtggac
agccgagaca actccttcat cgcagcccgc 420gaccccatcg gcatcactcc tctctatctt ggttggggag ccgatggtag aactgtgtgg 480tttgcctcgg agatgaaagc cttgaaggac gattgcgaac ggcttgaggt ctttccacca 540ggccacatct actcaagcaa agctggaggg ctccgtcgct actacaaccc acagtggttc 600tcagagactt ttgttcccga aactccttac cagcctctgg aactacgttc agccttcgag 660aaggctgtgg taaagaggct catgaccgac gtccccttcg gtgtgctcct ttccggaggc 720ttggattctt ccttggtggc atcagtggca gcccgacatc ttgccgaaac caaagctgtc 780agaatctggg gcaacgagct ccactccttc tgtgttggcc ttgagggttc tcccgacctg 840aaggctgcga gggaagttgc caagtacatc ggcacccgcc accacgaatt taacttcacc 900gtccaggaag gattggacgc tctgtctgac gtgatctacc atgtggagac ctacgacgtg 960accaccatta gggcgagcac accaatgttc ctcatgacac ggaagatcaa ggctctgggt 1020gtgaagatgg tgttgtctgg ggagggatcc gacgagatct ttggtggtta cctctacttc 1080cacaaagctc ccaacaggga ggagtttcac cacgaactag tccgcaagat caaggcgcta 1140cacttgtacg attgccagag agccaacaaa tcaacctctg cttggggtct ggaagctcgt 1200gttcccttcc ttgacaagga gttcatggac gttgcgatga tgatcgaccc tagcgagaag 1260atgatcagga aggacctggg cagaattgag aagtgggtgc tgcgtaaagc tttcgatgac 1320gaagagagac catacttgcc caagcacatt ttgtacaggc aaaaggagca attcagcgat 1380ggagtgggct acagctggat tgatggactc aaggaatatg cggagagcca tgtgacggat 1440cagatgatga agcacgcgaa gcatgtgtac cccttcaaca cgcccaacac caaagaagga 1500tattactacc gaatgatctt cgagaagcat ttcccccaac aatccgcccg gatgacggtc 1560cccggaggtc cttcggtagc atgcagcacc gccacagctg tggcatggga cgaagcatgg 1620gccaacaact tggacccctc cggcagagca gcattgggat gccatgactc agcttacaca 1680gacaaacaca gtgagaaagc tgcaccagcg gcagaagcta acggcacggc ttctcacgag 1740aacggccaca cattctccaa gcccaaatcc acactggatg ccaccattct gaaaactcag 1800gccgtgcact aatctctagc aagacacacg tttcagtagt tatctaagtg gcagcaactg 1860caaccaagcc tcagaatggg ctcccaacaa gctgggtttc catgtgaaga gctggagctt 1920gaattgcaac atgcgccctg taacaataat agaaaactcg ctcaaaacaa acgtagaaaa 1980atagaataaa gagtactgga ctgaaagacc gaagaccttt gcttgagtcc tctgaggcgc 2040tggtatggat ataaaccgga cagtgtatgg caaatagtgc gaggaaagta attttaataa 2100gttagcagct 
atagtttgag ctatggcagt cacagaccca tatctgtaca agcttcactt 2160cccctaagtt atgaattccc tcgtttccag tttcatata 21991062252DNAPhyscomitrella patens 106actgtgtggg cttgggtggt tgtggtgaag gaggacgagg aagagtaaga ggaagaggcg 60gattctgcat caagggttta tgatgctctt tgcacgacaa acctacgaat cctgacccag 120ctggtcgctt gtcgtccccc ctccttcctt tttggcttct ctcttgtctt tccgttagcg 180cttttgagga gacttgagcc gccgtcacaa tgtgtggaat tttagccatc cttgggtgcc 240atgacaagag cgtcacgcgg cggcatcgct gcctggagct ctctcgcagg ttgcggcacc 300ggggacctga ctggagtggt ttgttcgtgg acgaggcgtc gggatgttat ctggcgcacg 360aaaggttggc aattatcgat cccacgtcgg gcgaccagcc gttgttcaac gagaacaagg 420acattgtcgt cgcggtgaat ggcgagattt acaaccatga ggccctcaag gcgagcatga 480aggcacataa ataccacact cagagtgatt gtgaagttat tgcacatctg tacgaggaaa 540ttggggagga ggtggttgag aagctggatg gcatgttttc atttgtattg gtagacttgc 600gcgataagtc attcattgct gctcgcgatc cccttggaat cacaccactc tacctcgggt 660ggggcaatga tgggtctgta tggtttgcgt ctgagatgaa ggctttgaag gacgattgtg 720agcgctttga gtcgttccct ccaggtcaca tgtattccag caagcaaggt ggtctgcgta 780ggtattacaa cccaccttgg ttcaacgaaa gcatcccagc agaaccttat gacccgctca 840tactacgaca tgcctttgag aaatcagtca tcaaacggtt aatgacggat gtgccgtttg 900gagtgctgct gtcgggtggc cttgattcct cgttggtagc tgcggttgct caacgacatc 960tagccggcag tacagcagcc aagcaatggg ggaataagct tcattctttc tgtgttggac 1020tggagggctc tcccgatttg aaggctggac gggaagttgc tgattacatc ggtacggtgc 1080acaaagagtt tcatttcact gtccaggaag gtctggatgc catttctgat gtaatatatc 1140acattgaaac gtatgatgtc actacaattc gagctagtac acccatgttc ctcatgtctc 1200gaaaaatcaa agcccttggc gtgaagatgg ttctttctgg agagggttca gacgagatat 1260ttgggggtta cctttacttc cacaaagctc ctaacaagga ggagtttcac aaggaaactt 1320gtaggaagtt gaaggcactg cacttgtacg attgtttgag ggcaaacaaa tcaacatcag 1380cctggggttt ggaagctcgt gtaccattct tggataggga cttcgtaaac ctcgccatgt 1440cgatcgaccc tgctgagaaa atgataaaca agaaggaagg gaaaatcgag aagtggatca 1500tccgtaaagc ttttgatgat gaagagaacc catacctgcc caagcatatt ttgtacagac 1560agaaggagca gttcagtgac ggtgttggct acagttggat tgatggcttg 
aaggaccatg 1620cagccagtca ggtttctgac cagatgctgg caaatgctaa acacatttat ccccacaaca 1680ctccaggaac aaaggaaggt tactactacc gcatgatctt cgagagatgc ttcccacagg 1740agtcagcaag gcttacagtt ccaggaggac ctagtgtagc ttgcagtact gctgctgcca 1800ttgcctggga caaggcatgg gccaataact tggatccctc aggcagggca gctacaggtg 1860ttcacgattc cgcatatgaa ggtggtgagg tggagagctc agcagtgagc cacaaagaag 1920gtggtgagga tggtttggcc aactcgaaag tgggcgacaa ggttcaggaa gccatagctg 1980ttgcctgagg tgacgcatgg tgttctttga ttaggatgct cattgtaagc tgacccacct 2040actgtactgc aagcaattgt agctttatat gtattggtga acaattgcca ttttagagtg 2100atcagttttc atttccgttt actttgagat aaatgcctta tgtgtatttg agtaggaact 2160ggttaaagga cttttaaatt tgttgttgac cgtgaaagag atcaaccttc aggtatatat 2220tgttttcgaa tgagcttgtt tttcaaaccc tc 22521072289DNAPopulus trichocarpa 107ctcattcaac aataacaaaa caagctcttg ctctacgtgt tggtgtttcc tattaacagc 60ccatctcctt ctcctgccac ctcgctttcc tttttattac cagattttct tctttcatta 120ctacccaatt tcatctctat agtttatcca tccatttttc tctgtctttg tttttaagat 180atacatatct agcaaaatct tcttttatct gctatatcgt ttttttttaa gaaacgacga 240tgtgtgggat acttgctgtt ttgggttgtt ctgacgactc tcaggccaag agggttcggg 300tgctagagct ctctcgcagg ttgaagcacc gtggtccaga ttggagtggg ctctatcagt 360gcggtgactt ttacttggct catcaaaggc tggctattat cgatcctgct tctggtgacc 420agccactctt taatgaggac caagccatcg ttgtcacggt gaacggagaa atttacaacc 480atgaagaact aaggaagcgt ttgccaaatc acaagttccg aacaggcagt gactgtgatg 540ttatcgccca tctgtacgag gaatatggcg aaaattttgt ggacatgttg gatggaatgt 600tttcatttgt tctgctggat actcgtgaca acagtttcat tgttgctcgt gacgccattg 660ggatcacccc cctctatatt ggctggggac ttgatgggtc cgtgtggatt tcatctgaac 720tgaaaggtct gaatgacgac tgtgaacatt ttgagtgctt tcctcctggt catttgtact 780cgagtaaatc gggtggatta cgtcgttggt acaatcctcc ttggttctgc gaggccattc 840cctcaacccc atatgatcca cttgttctga gacgtgcatt tgaaaaggct gtgattaaaa 900ggctaatgac tgatgtgcct tttggagttc ttttatctgg aggcctagat tcatcactgg 960ttgctgctgt tactgctcgc catttggcag gtacaaaggc tgccagacaa tggggggcac 1020aactccattc cttctgtgtt ggcctagaga 
attcaccaga tttgaaggct gcaagagaag 1080ttgcagatta tctgggaacc gtccaccatg aattttactt cacggttcag gatggtatag 1140atgccattga ggatgtcata taccatatag aaacatatga tgttacaacc atcagagcaa 1200gtacccctat gttcctaatg gctcgtaaga tcaaggcact aggagtgaag atggttattt 1260ctggtgaagg ttctgatgag atttttggtg ggtatttgta ctttcataag gcacctaaca 1320aagaagagtt acaccgcgaa acatgtcgca agataaaggc ccttcatcaa tatgattgct 1380tgagagctaa caaggcaaca tctgcttggg gtttagaagc ccgtgtcccc ttcttggaca 1440aggattttat taatgttgca atggctattg atcctgaatg gaagatgatc aaacctggac 1500aaggccatat tgagaaatgg gtccttagga aagcctttga cgacgaggag catccttatc 1560tgcctaagca tattctttac aggcagaaag agcaatttag cgatggtgtt ggctatagct 1620ggatcgatgg tctcaaagct catgctgccc aacatgtgac tgacaagatg atgcaaaatg 1680ctgagcacat ctttccacat aataccccta ccaccaaaga agcctattac tacagaatga 1740tttttgagag gttcttccca cagaactcag ccaggctgtc tgttcctgga ggagccagtg 1800tagcatgcag cacagctaaa gctgttgaat gggatgctgc ctggtccaat aatctggatc 1860cttctggacg ggctgcattg ggtgtacatc tctctgatta tgatcagcag gcagctcttg 1920ccaatgcagg agtggtgcca ccaaaaatta ttgacactct tcctcgaatg ttggaagtta 1980gtgcttcggg agttgcgatc cacagttagc gcctgctgga ggactaagta ttggtgaatt 2040tgatatctat agccttggta ttatttaaac ttgtgttgcc ttgtatatgt aaaaatctta 2100gaggtcatat gtagatgtta caaataatga tccgtggtcc tttgaagtcg tgtgttgtca 2160ttactttgtg gtttttgtac aaggtaattc atgtatgtta tcaatgccct gtagctgttt 2220aaagctgcaa ggcaaccttt cctactgttt aaagctgtaa tgcaaccttt cctatggttt 2280ctttgcttc 22891082205DNAPopulus trichocarpa 108gctcattcaa caataacata acaggctatt actctacgga ttatggtttc ctgttaacac 60tccatctccc tctcctcctg cttctttgtt ttcccttttt ttttcccagt attattctct 120cgttattacc tggttccatc tttatcttcg atcttaagat atacttaagc tacttctatc 180ttcaatatcg aacgttttat ttttgaaaaa caaagaagga tgtgtgggat acttgctgtt 240ttgggttgtt ctgatgactc tcaggccaaa aggtttcgag tgcttgagct ctctcgcaga 300ttgaagcacc gtggtcctga ttggagtggg ctctttcagc acggtgactt ctacttggct 360catcaaaggc tagccattat tgatccggct tctggtgatc agcctctctt taatgaagac 420caagccatcg ttgtcacggt gaacggagaa 
atttacaatc atgaagaact gaggaagcgc 480ttgccaaatc acaagtttcg aacaggcagt gactgtgatg ttatctccca tttgtacgag 540gaatatggcg agaattttgt ggacatgttg gatggaatgt tttcatttgt tctgctggat 600actcgtgaca acagtttcat tgtcgcccga gacgccattg ggatcacctc cctctacatt 660ggctggggac ttgatgggtc tgtgtggatt tcgtcggaat tgaaaggtct gaatgatgac 720tgcgaacatt tcaagtgctt tccacctggt catatatact cgagcaaatc cggtggatta 780aggcgttggt ataatcctct ttggttctct gaggctattc cctcgacccc atatgaccca 840cttgctctga gaagggcatt tgaaaaggct gtgattaaga ggctgatgac tgatgttcct 900tttggagtgc ttttatccgg gggactagat tcgtcattgg ttgctgctgt gactgcccgg 960catttggcag gtacacaggc tgccagacaa tggggggcac atctccattc cttctgtgta 1020ggcctagaga attctccaga tctgaaggct gctagagaag ttgcagatta tttgggcacc 1080atccaccatg aatttcactt cacagttcag gatggtattg atgccattga agatgtcata 1140taccatgttg aaacatatga tgttacaacc atcagagcaa gtacccctat gttccttttg 1200gctcgtaaga tcaaggcgct aggagtgaag atggttattt ccggtgaagg ttctgatgag 1260atttttggtg ggtatttgta ctttcacaag gcacctaata aggaagagct ccacggcgaa 1320acatgtcgca agataaaagc ccttcatcaa tatgactgct tgagagctaa caaagcaaca 1380tctgcttggg gtctagaagc ccgcgtcccc ttcttggaca aggattttat taatgttgca 1440atggctattg atcctgaatg gaagatgatc aaacctggac gtatcgagaa atgggttctt 1500aggaaagcct ttgacgacga ggagcatcct tatctgccaa agcatattct gtacaggcag 1560aaagagcaat ttagtgatgg cgttggctac agttggattg atggtctcaa agctcatgct 1620gaattacatg tgcacgacaa gatgatgcaa aatgctgagc acatctttcc acataatacc 1680cctaccacca aagaggccta ttactacaga atgatttttg agaggttctt cccacagaac 1740tcagcgaggc tgactgttcc tggaggagcc agtgtagcat gcagcacagc taaagctgtt 1800gaatgggatg cttcctggtc caacaatctc gatccttccg gccgtgctgc attgggtgtg 1860catctttctg cttatgaaca gcaggcagct cttgccagtg ctggagtggt gccaccggag 1920attattgaca atcttcctcg aatgatgaaa gttggtgctc caggagttgc aatccaaagt 1980tagcttctgc tggaggaccg aagtacatgc cttgtacatg tataaatcat atagatcatg 2040tgtagaagtt acgaataata atctctgctc gtttgtagta gtgttggcac cttgttgttt 2100ctgtacaagg caattcaagt gtgcaatcga tgttctgtag ctgtttaaag ttgtaatgca 2160acctttcctc 
tggtttcctt acttcataga cgaatccttt gtttt 22051092069DNAArabidopsis thaliana 109ccattgttat ttgttttcgt tgccactcta acacaatgtg tgggattctc gctgttcttg 60gttgcatcga caactctcaa gctaaacgtt ctcgtatcat cgaactctct cgcagattga 120ggcacagagg tcctgattgg agtggactcc attgttatga agattgttat cttgcccatg 180agcgtttagc catcattgac cctacttcag gagaccaacc tctctataac gaagacaaga 240ccgtcgctgt cactgtaaat ggagagatat acaaccacaa gattttgcgt gaaaagctta 300agtctcatca gttccgtact ggtagtgact gtgaagtgat tgcacatctt tacgaagaac 360atggagagga atttatcgac atgttggatg gaatgttcgc gtttgtcctt cttgatactc 420gcgacaaaag ttttattgct gcaagggacg ctattggtat cactccactt tacattggat 480ggggtcttga tggttctgtc tggtttgctt cggagatgaa agcgcttagt gatgattgtg 540aacagtttat gtcttttcct cctggccaca tctactcaag taaacaagga gggcttagga 600ggtggtacaa tcctccgtgg tacaatgagc aggttccttc aaccccatat gatcctttag 660ttctgcgcaa tgctttcgag aaggctgtaa taaagagact tatgactgat gtgccttttg 720gagttctcct atctggagga ttggactcgt ctctcgttgc tgcagtagca ttacgccatt 780tggaaaaatc agaagctgct cgtcaatggg gttcacaatt gcacacgttt tgcatcggtt 840tgcagggatc gccagatctt aaagctggca gagaagttgc tgactatctt ggaacacgcc 900accacgagtt tcagtttaca gttcaggacg ggatagacgc gatagaggaa gtcatttacc 960atattgagac ttatgacgtt actacaataa gagctagcac cccaatgttt cttatgtcca 1020gaaaaattaa atctttaggt gtaaagatgg ttctttctgg ggaaggttct gatgaaatac 1080tggggggata cttgtacttc cataaggctc ccaacaagaa agaatttcat gaagaaacat 1140gccgaaagat caaagctctc caccaatttg attgtttgag agctaacaaa tcaacttctg 1200cgtggggtgt cgaagctcgt gtgcctttcc tagataaaga atttttaaat gttgcaatga 1260gcatcgatcc agagtggaag ttgatcaagc ctgatctcgg aaggatcgag aagtgggtgc 1320tacgcaatgc ctttgatgat gaagaacgac cttatctacc aaagcacatt ctatatagac 1380agaaagaaca gtttagtgat ggagttgggt atagctggat agatggtctg aaagatcatg 1440caaataaaca tgtctctgat actatgctgt caaacgcaag ctttgtcttc ccggataaca 1500cacctctgac aaaagaagcg tactactaca gaaccatctt cgagaagttc ttcccgaaga 1560gtgctgctag agcgactgta ccaggaggtc caagtatagc ttgcagtacc gcgaaagctg 1620tagaatggga tgcaacttgg tcaaagaatc ttgatccgtc 
aggccgtgcg gctcttggag 1680ttcatgttgc agcttatgaa gaggataaag cagctgctgc tgctaaggct ggatcggatt 1740tagtagatcc tcttcctaag aatggaacat aagagaacaa cactacaggc attgaggata 1800taagcaaatg ttttattctt ctacacagag agatcgttat cttctagagg gatcaatgaa 1860taaaagcttc gtccatttct agctggagat tccatggatc tccagttagt gcaagtgata 1920cacgttgtct acatttgtac ctaagtttct gcattttttg tcgttctttt gtgttagaca 1980agtcttggac cctagatgat acttcagttt cttagacgtt aaatttgatg aatccgaact 2040tgtttgattt caaagcctgg cctttctgc 20691102207DNAArabidopsis thaliana 110agacatcaaa aacacgaata tcgatagtac acttctacgt gcaattttct cctttctctt 60cctggacatc tgtctgttta ttacattttc ttgtaatctc tttttggggt tttacaatat 120ctatccccta aagtttcgga aaattctgtt tttctgttct cattcttcgt gatctttttc 180actttcttca aaaaaaaaac atgtgtggaa tacttgccgt gttaggatgt tccgatgatt 240ctcaggccaa gagagttcgt gttcttgagc tttctcgcag attgaggcac agaggacctg 300actggagtgg cttatatcag aacggagata attacttggc ccatcaacgt cttgccgtca 360tcgatcctgc ttccggtgat caacctcttt tcaacgagga caagaccatt gttgtcacgg 420tgaacggaga gatttataac catgaggagc tgagaaaacg tctgaagaat cacaagttcc 480gtactggtag tgattgtgaa gtcattgctc acttgtacga ggagtatggt gtggattttg 540ttgatatgtt ggatggaata ttctcctttg tgttgctcga cacacgagat aactctttca 600tggtggctcg tgatgcgatt ggtgtcactt cgctctacat tggttgggga ctagacggat 660ctgtgtggat atcttcagag atgaaaggcc taaacgatga ctgtgagcat ttcgaaacgt 720ttcctccagg tcatttttat tcaagcaaat taggagggtt taagcaatgg tataatcctc 780cttggttcaa tgaatctgtt ccttcaacgc cttatgagcc tcttgcgata agacgcgcct 840ttgaaaacgc tgtgattaag cggttgatga ctgatgttcc atttggagtt ttgctctctg 900gtggtcttga ttcttccctt gttgcctcca tcactgcacg tcacttggcc ggtactaagg 960cggctaagca atggggtcct cagctccatt ccttttgcgt tggtcttgag ggctcaccgg 1020acttgaaggc agggaaagag gtggcggaat atttggggac ggtgcaccac gagttccact 1080tctcggtgca ggacgggatt gatgcgatag aggatgtgat ttaccatgtt gagacctatg 1140atgtgacgac tatcagagcg agcacaccga tgttcttgat gtcccggaaa atcaagtctc 1200taggggtcaa gatggttctc tccggcgaag gtgcggacga gatctttgga gggtacctct 1260atttccacaa ggcacctaac 
aagaaagagt ttcaccaaga aacttgtcgc aagatcaagg 1320ctcttcacaa gtatgactgt ctaagagcca acaaatctac ctctgccttt ggactagagg 1380cacgtgttcc tttccttgac aaagacttca tcaacacagc tatgtctctc gaccctgaat 1440ccaagatgat caagccagag gaaggaagga tcgagaaatg ggttctaagg agagcctttg 1500acgacgaaga acgtccttat ctaccaaaac acattctcta cagacagaaa gaacagttca 1560gtgatggtgt tggctacagt tggatcgatg gcctgaaaga tcacgctgct caaaatgtca 1620atgacaagat gatgtcgaac gcggggcata tcttccctca caacactcca aacactaaag 1680aagcttacta ctacagaatg atctttgaaa ggttcttccc gcagaactct gcgagactaa 1740cggttcctgg aggtgccacc gtggcttgtt cgactgcaaa ggcagtggag tgggatgcaa 1800gctggtccaa caatatggat ccatcaggaa gagccgctat cggagttcac ctttcggcct 1860acgatggcaa gaacgtggca ttgaccatac caccacttaa ggcaattgac aacatgccga 1920tgatgatggg tcaaggagtt gtgattcagt cataacttcg aaggagaaat ggatgaaata 1980tgtgttatat cttcccaatg ggtgaagtgt tttgtatgat tttaaaaata agaatgtgat 2040cctttttttt tcctatgaag atctgaatgt ataatctatc ttgtaaaaat ttgtttcttt 2100gtaagatttg aatgtaccgc ttttacgtag atcgatgtac atcaatctta taagtttcaa 2160ttatgtatta tattatgtcg atttgccaaa aataaatcta aaacctc 22071112030DNAArabidopsis thaliana 111tccatttctc tgaagccgtt gtgttctctt attgccgcca ccaccaccat gtgtgggatt 60ctcgctgtgt taggctgcgt cgataactct caagctaaac gttcccgtat catcgaactc 120tctcgcagat tgaggcatag aggtcctgac tggagtggtc tacattgtta tgaggattgt 180tatttggctc atgagcgttt ggctatcgtt gaccccactt ctggagatca accactctat 240aacgaagata agaccattgc tgtcacggtc aatggagaga tttacaacca caaggctttg 300cgtgaaaatt tgaagtctca ccaattccgt actgggagtg attgtgaagt gattgcccat 360ctttacgaag aacatggaga ggaatttgtc gacatgttgg atggcatgtt tgcatttgtg 420cttcttgata cccgagacaa aagctttatt gctgcaaggg atgccattgg tatcactcca 480ctctacatcg ggtggggtct cgatggttct gtttggtttg cttccgagat gaaagcactt 540agtgatgatt gtgagcagtt tatgtgcttc cccccaggcc acatctattc aagtaaacaa 600ggtgggctta ggaggtggta caaccccccg tggttctctg aggttgttcc ttcaacccca 660tatgatcccc tagtggtgcg caatactttt gagaaggctg ttataaaacg actaatgact 720gatgtgcctt ttggtgtcct cctatctggt ggattagatt catcccttgt 
tgcttcagta 780gcattacgcc atctggaaaa atcagaagct gcttgtcagt ggggttcaaa gttgcacaca 840ttttgtatcg gtttgaaggg atccccggat cttaaagctg gcagagaagt cgctgactat 900ttaggaactc gccaccacga gttacacttt acagttcagg acggaataga tgccatagaa 960gaagtcatct accatgttga gacctatgat gtgactacta ttagagccag cactccaatg 1020tttcttatgt cgcgaaaaat caaatcgctt ggtgtaaaga tggttctttc tggggaaggc 1080tctgatgaaa tttttggagg atatttgtac ttccataaag ctcccaacaa gaaggaattt 1140catgaggaaa catgtcgaaa gatcaaagct cttcatcaat atgactgctt gagggctaac 1200aaatcaactt ctgcatgggg tgttgaggct cgtgtacctt tcctcgataa agaatttata 1260aatgtcgcaa tgagcatcga tccagagtgg aagatgatta ggcctgattt gggaaggatc 1320gagaaatggg tgttacgcaa tgcctttgat gatgagaaaa atccttacct accaaagcac 1380attctatata ggcagaaaga acagttcagt gatggagttg gatacagctg gattgatggt 1440ctaaaagatc atgcaaacaa acatgtctct gagacaatgc tgatgaacgc aagctttgtc 1500ttccctgata acacaccttt gacaaaagaa gcttactact acagaaccat ctttgaaaag 1560ttcttcccta agagtgctgc tagagcaact gtaccaggag gtccaagtgt ggcatgtagc 1620acagcaaaag ctgtggaatg ggacgcagct tggtcacaga atcttgaccc atcaggtcgt 1680gcggctcttg gagttcatgt ttcagcttat ggggaagata aaaccgaaga ttctcgtccc 1740gagaagctac agaaactagc agagaagact ccagccattg tttgaggata aacaaacaag 1800gtttcagcta atgttgaatc gtgcaatact cttattgtct caaagacaat agatatcctc 1860ttctataggt tctaaaaagg ctttcttttt

ttcttgtttt ctggggttct ttggatgtgt 1920acctaataag ttcctggtga atttctgtgt ttagtgttat tagacaatcc atgaaagctt 1980gatacttcag attatgaacg ttatttttca tgaatccgat tctttctttc 20301122141DNATriticum aestivum 112gcgacgtgta gccctgctct ccgccatctc cggccaggca tctatctacc tacaagtaga 60gccaagccat tcctgcacac ctccatacag aaacacaatt cagatcgact agctcgctgc 120tggctgtaga ggacgatcga cgacgatcca gaggagcagc ataaccgagg agagcggagc 180atgtgcggca tactagcggt gctggggtgc ggcgacgagt cgcaggggaa gagggtccac 240gtgctagagc tctcgcgcag gctcaagcac cggggcccgg actggagcgg cctgcaccag 300gtcgccgaca actacctctg ccaccagcgc ctcgccatca tcgacccggc ctccggcgac 360cagccgctct acaacgagga caagtccatc gccgttgccg tcaacgggga ggtctacaac 420catgaggagc ttcgggcacg gctctccgga cacaggttcc ggaccggcag tgactgcgag 480gtcatcgccc atctgtatga ggaatacgga gaaagcttca ttgacatgtt ggatggtgtt 540ttctccttcg tgttacttga cgcacgagat aacagcttca ttgctgctcg tgatgccatt 600ggtgtcacgc ctctctacat tggctgggga attgatggtt cagtgtggat atcttcggag 660atgaaaggac taaacgatga ttgtgagcac ttcgagatct tcccgcctgg taatctttac 720tccagcaaag agaagtcctt caagagatgg tataaccctc cttggttctc tgaggtcatc 780ccctcggttc cctatgaccc actgcgtctc agatcggcat ttgaaaaggc tgttatcaag 840aggctcatga cagatgttcc atttggcgtc ctcctctccg gtggtctcga ctcatcattg 900gtggctgctg tcgcagcccg tcatttcgct gggacgaagg ctgcaaagcg ctggggaact 960aggctccact ccttctgtgt ggggcttgag ggatcaccag atctcaaggc tgcaaaggag 1020gtcgcggatc acctgggtac cgtgcaccac gagttcaact tcacagttca ggatggcatc 1080gatgcaattg aagatgtgat ataccacatt gaaacatatg atgtgacgac gatcagggca 1140agcacactga tgttccagat gtcacgcaag atcaaggcgc ttggagtcaa gatggtcatc 1200tccggtgagg gtgccgatga gatcttcgga gggtacttgt atttccacaa ggcccctaac 1260aaggaggagt tccaccagga aacatgtcgg aagataaaag ctctccatca gtacgattgc 1320ttgagggcca acaaagcaac atctgcatgg ggccttgagg ttcgtgtgcc attcttggac 1380aaggagttca tcaatgaggc tatgagcata gatcccgaat ggaagatgat ccggcctgat 1440cttggaagaa ttgagaaatg gatactgagg aaagcgttcg atgatgagga gcgacccttc 1500ctgccgaagc atattctgta caggcagaag gagcagttta gtgatggtgt tgggtatagc 
1560tggattgatg gcctgaagga tcatgcagcc tcaaatgtga gtgataagat gatgtccaat 1620gcaaagttca tctacccaca caacacccca acaactaaag aggcctacta ttacaggatg 1680atctttgaga ggtacttccc ccagagctcg gcgatcctca cggtgccagg cgggccaagc 1740gtggcgtgca gcacagccaa ggctatagag tgggatgccc aatggtctgg gaacctggac 1800ccctctggga gagcagcgct tggagtccat ctctcagcct acgagcagga cacggtcgct 1860gtgggaggta gcaacaagcc tggggtgatg aacaccgtgg tacctggtgt tgccattgag 1920acttgatgaa tggtacatgt atcatatcgt gtcctactaa aggcaaataa gaacggttgt 1980gtgcatcgct tcatgtagag gccgggcata ctccttttca aaaaaaaaag agaaaataag 2040atgcatatgt tcttgtcagc gttgtaataa gacgggccta tgttttgcta tttaattaaa 2100gggttaatta tccttttgcc ttgagtgatg tctgtgtgct c 21411132032DNATriticum aestivum 113gcacgaggcc catcctcctt cagaagcaca gagagagatc ttctagctac atactgttgc 60cgtcgatcca ggaaaatgtg cggcatactg gcggtgctgg gctgcgctga tgacacccag 120gggaagagag tgcgcgtgct cgagctctcg cgcaggctca agcaccgcgg ccccgactgg 180agcggcatgc accaggttgg cgactgctac ctctcccacc agcgcctcgc cattatcgac 240cctgcctctg gcgaccagcc gctctacaac gaggacaagt ccatcgtcgt cacagtgaat 300ggagagatct acaaccatga acagctccgg gcgcagctct cctcccacac gttcaggaca 360ggcagcgact gcgaggtcat cgcacacctg tacgaggagc atggggagaa cttcatcgac 420atgctggatg gtgtcttctc cttcgtcttg ctcgatacac gcgacaacag cttcattgct 480gcacgtgatg ccattggcgt cacacccctc tatattggct ggggaattga tgggtcggta 540tggatatcat cagagatgaa gggcctgaat gatgattgtg agcactttga gatctttcct 600cctggccatc tctactccag caagcaggga ggcttcaaga gatggtacaa cccaccttgg 660ttctccgagg tcattccttc agtgccatat gacccacttg ctctcaggaa ggctttcgaa 720aaggctgtca tcaagaggct tatgacggac gttccattcg gtgttctact ctctggtggc 780cttgactcat cattggttgc agccgttaca gttcgtcacc tggcaggaac aaaggctgca 840aagcgctggg ggactaagct tcactctttt tgtgtcggac ttgaggggtc acctgatctg 900aaggctgcaa aggaggtagc caattacctg ggcaccatgc accatgagtt caccttcact 960gttcaggacg gcattgatgc aattgaggat gtgatttatc acaccgaaac atatgatgtg 1020acgacaatca gggcaagcac gccaatgttc ctgatgtcac gcaagatcaa gtcacttggc 1080gtcaagatgg tcatctctgg tgagggttcc 
gatgagattt tcggagggta cctctacttc 1140cacaaggcac ccaacaaaga ggagctccac cgtgagacat gtcaaaagat caaagctctg 1200catcagtacg attgcttgag ggccaacaag gcaacatctg catggggcct cgaagcacgt 1260gtgccattct tggacaagga gtttatcaat gaggcaatga gcattgatcc tgagtggaag 1320atgatccggc ctgatcttgg aagaattgag aaatggatgc tgaggaaagc atttgatgac 1380gaggagcaac cattcctgcc gaagcacatt ctgtacaggc agaaagagca gttcagtgat 1440ggtgttggct acagctggat tgatggccta aaggctcacg cagaatcaaa tgtgacagat 1500aagatgatgt caaatgcaaa gttcatctac ccacacaaca ccccgactac aaaagaggcc 1560tactgttaca ggatgatatt tgagaggttc ttcccccaga actcggcgat cctgacggtg 1620ccaggtgggc caagcgttgc atgcagcacg gcgaaggcgg tagagtggga tgcccagtgg 1680tcagggaacc tggatccctc agggagagca gcacttggag tccatctctc ggcctatgaa 1740caggagcatc tcccagcaac catcatggca ggaaccagca agaagccaag gatgatcgag 1800gttgcggcgc ctggtgtcgc aattgagagt tgatggtgtc ctgtcctgct tgccgtttct 1860gataagaaat aagatgtacc tggtcttgcc attagagtgg tgcagaccta aggtttgagt 1920gaagattgtg cattaatgtt tctattgttc ttatgaaatc ggagaccggt gatttctaat 1980cctttctggc aacttccatc aaaacattat tacatgatgg ttattatttg ac 20321141770DNAVitis vinifera 114atgtgcggaa tacttgcagt tctgggttgt tctgatgatt cccaggccaa aagggtccga 60ttgttttacc attgttattt atgcttctgt gataggttga agcatcgtgg tcctgactgg 120agtgggctat accaacatgg agattgttat ttagctcatc agcggctagc aatcatcgat 180ccagcttctg gtgatcagcc tctatataat gaaaaccaag ccattgttgt gacggtgaat 240ggagaaattt ataaccatga ggagttgagg aagagcatgc caaatcacaa gttcaggacc 300gggagcgatt gcgatgtcat tgcccatttg tacgaggagc atggggaaaa ttttgtggac 360atgttggatg gaatgttctc atttgtcctg ctggataccc gtgatgatag cttcattgtt 420gcccgagatg ccatcggaat cacctccctc tatattggtt ggggacttga tggtagctcg 480gtatggattt catctgagct caaaggtttg aatgatgact gtgaacattt tgagagcttt 540ccacctggtc acatgtactc tagcaaagag ggtggattca aaagatggta caacccccct 600tggttctctg aggctattcc atcggcacca tatgaccctc ttgttctgag gcgagctttt 660gagaatgccg tgatcaagag gttaatgacc gatgttcctt ttggggttct gctgtcagga 720ggtctggatt catccttagt tgcctctatt accgctcgcc acttagcagg cacaaaggct 
780gctaagcagt ggggagcaca gctccattcc ttctgtgttg ggctagaggg ctcaccggat 840ctgaaggctg caaaagaagt tgcagactat ttgggcaccg ttcaccacga gtttcacttc 900accgttcagg atggtatcga tgccattgag gatgttattt accatattga aacttatgat 960gtgacaacga tccgagcaag tacccctatg tttctcatgt cgcgtaagat taagtcacta 1020ggagtgaaga tggtgatatc cggagagggc tctgatgaga tttttggtgg gtacttatac 1080tttcacaagg cgcccaacaa ggaagagttc catagggaaa catgtcgcaa gataaaggca 1140ctctaccagt atgattgctt gagagctaat aaatcaacat ctgcatgggg tttggaagcc 1200cgggtcccct ttttagacaa ggaattcatt aaagttgcaa tggatattga ccctgagtgg 1260aagatgataa agccagaaca agggcgaatt gagaaatggg ttctgaggag ggcttttgat 1320gatgaggaac aaccctatct gccaaagcat attctctaca ggcaaaaaga gcaattcagt 1380gatggtgtcg gctacagttg gattgatggg ctcaaagccc atgcgtcaca acatgtgacc 1440gataaaatga tgctcaatgc ttcacatatc ttcccacaca atacccctac cacaaaagaa 1500gcctactatt accgaatgat ctttgagagg ttcttcccac agaactcagc taggctgact 1560gttccgggag gagcaagcgt tgcatgcagc actgccaaag cagttgaatg ggattctgcg 1620tggtcaaata accttgatcc ttctggcagg gcggcattag gagtccatct ttcagcttat 1680gaccagaagt taaccacagt cagtgctgca aatgtgccaa caaagatcat tgataatatg 1740ccgcggatta tggaagtaac cgcaccctga 17701151737DNAVolvox carteri 115atgtgcggaa tcctagctgt gctcaactcc acggatgata gcccggcgat gagggcgaag 60gtgctggcgc ttagtcgtcg ccagaagcat cgtggccccg actggtcggg gatgcaccag 120tttggcaaca acttcctggc gcatgagcgg cttgcgatta tggatcccag ctcgggcgat 180cagccgctgt acaacgagga caagtctatc gtcgtgacgg tgaacggcga gatctacaat 240tataaggaac tgcgcaagga gatctctgac aagtgccctg gcaagaagtt ccgcaccaac 300agcgactgtg aggtgatcag ccacctgtac gaactgtacg gcgaggcagt tgccaacaag 360ctggacggct tctttgcctt tgtactgctg gacactcgca acaacacctt cttcgcggcg 420cgcgatccgt tgggcgtcac ctgcatgtac attggctggg gccgggatgg cagcgtgtgg 480ctgtcctccg agatgaaatg tctcaaggac gactgtgcgc gcttccagca attccctccc 540ggccattatt actcgtccaa gacaggcgag tttgtgcggt acttcaaccc ccagttttac 600ctggactttg aggcagagcc gcaggttttc ccctcggtgc cctacgaccc cgtcacgttg 660cgcacggcgt ttgaggcggc cgtggagaag cgcatgatga gcgacgtgcc 
cttcggtgtg 720ctgctgagtg gcggtctgga cagcagcctt gtggcctcta tcgcggcccg caaaatcaag 780cgggagggca gtgtgtgggg caagctgcac agcttctgcg ttggtctgga gggcagcccc 840gacctcaagg caggtgccgc tgtggctgag tttctgggca ccgaccacca cgagttccac 900tttacagtgc aggagggcat tgacgccatc tcggaggtca tttaccacat cgagacgttt 960gacgtgacca cgatccgcgc ctccactccc atgttcctga tgagccgcaa gatcaaggcc 1020ctgggtgtca agatggtgct gtccggagaa ggctcggatg aggtgttcgg gggttacctc 1080tacttccata aggctcccag caaggatgag ttccacagcg aaacggttcg caagctgaag 1140gacctgttca agtacgactg cctgcgagcc aacaaggcca ccatggcctg gggtgtagag 1200gcgcgtgtgc ccttcctgga ccgggcattc ctggatgtgg ccatgtccat tgacccggcg 1260gagaagatga ttgacaagag caagggccgg atcgagaaat acattctccg gaaagccttc 1320gatacgcccg aggatccata cctgcctaag gaggtactgt ggcgccagaa ggagcagttc 1380agcgacggcg tgggctacaa ctggattgat gggctcaagg cgcatgctga gagccaagtc 1440agcgatgaga tgctcaagaa cgccgtgcac agattcccgg acaacacccc gcgcaccaag 1500gaggcctact ggtaccgctc tatctttgag agccacttcc cgcagcgtgc tgctatggag 1560acggtgccgg gtggtccctc agtggcttgc tccaccgcga cagccgccct gtgggatgca 1620gcgtgggccg ggaaggagga cccgtcgggc cgcgccgtgg cgggcgttca tgacgctgct 1680tacgaggaag gcgcggaagc caatggcgag cccgcatcca aaaagcaaaa ggtctga 17371162197DNAZea mays 116gatcgtctcg tctccctccc aaaaaaaaaa aaaaaaactg ctcggttgct gctcctgctc 60cgccgcgccg gcatcatgtg tggcatctta gccgtgctcg gttgctccga ctggtctcag 120gcaaagaggg ctcgcatcct cgcctgctcc agaaggttga agcacagggg ccccgactgg 180tcgggcctct accagcacga gggcaacttc ctggcgcagc agcggctcgc cgtcgtctcc 240ccgctgtccg gcgaccagcc gctgttcaac gaggaccgca ccgtcgtggt ggtggccaat 300ggagagatct acaaccacaa gaacgtccgg aagcagttca ccggcacaca caacttcagc 360acgggcagtg actgcgaggt catcatcccc ctgtacgaga agtacggcga gaacttcgtg 420gacatgctgg acggggtgtt cgcgttcgtg ctctacgaca cccgcgacag gacctacgtg 480gcggcgcgcg acgccatcgg cgtcaacccg ctctacatcg gctggggcag tgacggttcc 540gtctggatcg cgtccgagat gaaggcgctg aacgaggact gcgtgcgctt cgagatcttc 600ccgccgggcc acctctactc cagcgccggc ggcgggttcc ggcggtggta caccccgcac 660tggttccagg agcaggtgcc 
ccggatgccg taccagccgc tcgtcctcag agaggccttc 720gagaaggcgg tcatcaagag gctcatgact gacgtcccgt tcggggtcct cctctccggc 780ggcctcgact cctcgctcgt cgcctccgtc accaagcgcc acctcgtcga gaccgaggcc 840gccgagaagt tcggcaccga gctccactcc tttgtcgtcg gcctcgaggg ctctcctgac 900ctgaaggccg cacgagaggt cgctgactac cttggaacca tccatcacga gttccacttc 960accgtacagg acggcatcga cgcgatcgag gaggtgatct accacgacga gacgtacgac 1020gtgacgacga tccgggccag cacgcccatg ttcctgatgg ctcgcaagat caagtcgctg 1080ggcgtgaaga tggtgctgtc cggggagggc tccgacgagc tcctgggcgg ctacctctac 1140ttccacttcg cccccaacaa ggaggagttc cacagggaga cctgccgcaa ggtgaaggcc 1200ctgcaccagt acgactgcct gcgcgccaac aaggccacgt cggcgtgggg cctggaggtc 1260cgcgtgccgt tcctcgacaa ggagttcatc aacgtcgcga tgggcatgga ccccgaatgg 1320aaaatgtacg acaagaacct gggccgcatc gagaagtggg tcatgaggaa ggcgttcgac 1380gacgacgagc acccttacct gcccaagcat attctctaca ggcagaaaga acagttcagt 1440gacggcgttg gctacaactg gatcgatggc ctcaaatcct tcactgaaca gcaggtgacg 1500gatgagatga tgaacaacgc cgcccagatg ttcccctaca acacgcccgt caacaaggag 1560gcctactact accggatgat attcgagagg ctcttccctc aggactcggc gagggagacg 1620gtgccgtggg gcccgagcat cgcctgcagc acgcccgcgg ccatcgagtg ggtggagcag 1680tggaaggcct ccaacgaccc ctccggccgc ttcatctcct cccacgactc cgccgccacc 1740gaccacaccg gcggtaagcc ggcggtggcc aacggcggcg gccacggcgc cgcgaacggc 1800acggtcaacg gcaaggacgt cgcagtcgcg atcgcggtct gacgagagta cgtgctcgcg 1860cacctccctg ctagcttcta ccgggctgca gcctgcagca tgcactgtgc gagcacagcc 1920gatcagcgcc aataaactgg aggataagaa cgactggtag gtgtgtgtgt gtgtcgtgcg 1980tgcccaccgg ccatatcccg gtgcggcagc acgtgctatt gttacgtgtt gtactgccgc 2040cagcgtacgt gtctgtgtgt ctcgatcata tctgtacgtt tttagattta gaagaaaaaa 2100aaaaggcatg tccgtgtctg tatgtctgga tcatatctgt acgttcttag atttagaaga 2160aagaagaaaa acattatata cgtacgtcca tgtctct 21971172081DNAZea mays 117atgtgtggga ttctggcggt gctgggcgtc gttgaggtct ccctcgccaa gcgctcccgc 60atcattgagc tctcgcgcag gttacggcac cgagggcctg attggagtgg tttgcactgt 120catgaggatt gttaccttgc acaccagcgg ttggctatta tcgatcctac atctggagac 
180cagcctttgt acaatgagga taaaacagtt gttgtaacgg tgaacggcga aatttacaat 240catgaagaat tgaaagctaa gttgaaaact catgagttcc aaactggcag tgattgtgaa 300gttatagccc atctttacga agaatatggc gaagaatttg tggatatgtt ggatggaatg 360ttctcctttg ttcttcttga tacacgtgat aaaagcttca tcgcagctcg tgatgctatt 420ggcatctgcc ctttatacat gggatggggt cttgatggat cagtctggtt ttcttcagag 480atgaaggcat tgagtgatga ttgtgaacgc ttcataacat ttcccccagg gcatctctac 540tccagcaaga caggtggtct aaggagatgg tacaacccac catggttttc agagactgtc 600ccttcaaccc cttacaatgc tctcttcctc cgggagatgt ttgagaaggc tgttattaag 660aggctgatga ctgatgtgcc atttggtgtg cttttatctg gtggactcga ctcttctttg 720gttgcatctg ttgcttcgcg gcacttaaac gaaacaaagg ttgacaggca gtggggaaat 780aaattgcata ctttctgtat aggcttgaag ggttctcctg atcttaaagc tgctagagaa 840gttgctgatt acctcagcac tgtacatcat gagttccact tcacagtgca ggaggggatt 900gatgccttgg aagaagtcat ctaccatatt gagacatatg atgttacaac aatcagagca 960agtaccccaa tgtttttgat gtcacgcaaa atcaaatctt tgggtgtgaa gatggttatt 1020tctggcgaag gttcagatga aatttttggt ggttaccttt attttcacaa ggcaccaaac 1080aagaaagaat tcctagagga aacatgtcgg aagataaaag cactacatct gtatgactgc 1140ttgagagcta acaaagcaac ttctgcctgg ggtgttgagg ctcgtgttcc attccttgac 1200aaaagtttca tcagtgtagc aatggacatt gatcctgaat ggaacatgat aaaacgtgac 1260ctcggtcgaa ttgagaagtg ggtcatgagg aaggcgttcg acgacgacga gcacccttac 1320ctgcccaagc atattctcta caggcagaaa gaacagttca gtgacggcgt tggctacaac 1380tggatcgatg gcctcaaatc cttcactgaa cagcaggtga cggatgagat gatgaacaac 1440gccgcccaga tgttccccta caacacgccc gtcaacaagg aggcctacta ctaccggatg 1500atattcgaga ggctcttccc tcaggactcg gcgagggaga cggtgccgtg gggcccgagc 1560atcgcctgca gcacgcccgc ggccatcgag tgggtggagc agtggaaggc ctccaacgac 1620ccctccggcc gcttcatctc ctcccacgac tccgccgcca ccgaccacac ggcggtaagc 1680cggcggtggc caacggcggc ggcacggccg gcgaacggca cggtcaacgg caaggacgtg 1740ccagtgccga tcgcggtctg acgagagtac gtgctcgcgc acctccctgc tagcttctac 1800cgggctgcag cctgcagcat gcactgtgcg agcacagccg atcagcgcca ataaactgga 1860ggataagaac gactggtagc tgtgtgtgtg tgtgtcgtgc 
gtgcccaccg gccatatccc 1920ggtgcggcag cacgtgctat tgttacgtgt tgtactgcca ccagcgtacg tgtctgtgtg 1980tctcgatcat atctgtacgt ttttagattt agaagagaaa aaaaaagtat gcccgtgtct 2040gtatgtctgg atcatatctg tacgttctta gatttagaag a 20811182257DNAZea mays 118ggaattcccc gggatcaagg agcaccgtct gctgctcgct ctataaaacg aacggaggct 60gcagagcaga gcagagcaga gcaagaagct ttacagtgaa cgagtgagta tgtgcggcat 120acttgctgtg ctcgggtgcg ccgacgaggc caagggcagc agcaagaggt cccgggtgct 180ggagctgtcg cggcggctga agcaccgggg ccccgactgg agcggcctcc ggcaggtggg 240cgactgctac ctctctcacc agcgcctcgc catcatcgac ccggcctctg gcgaccagcc 300cctctacaac gaggaccagt cggtggtcgt cgccgtcaac ggcgagatct acaaccacct 360ggacctcagg agccgcctcg ccggcgcagg ccacagcttc aggaccggca gcgactgcga 420ggtcatcgcg cacctgtacg aggagcatgg agaagagttc gtggacatgc tggacggcgt 480cttctccttc gtgctgctgg acactcgcca tggcgaccgc gcgggcagca gcttcttcat 540ggctgctcgc gacgccatcg gtgtgacgcc cctctacatc ggatggggag tcgatgggtc 600ggtgtggatt tcgtcggaga tgaaggccct gcacgacgag tgtgagcact tcgagatctt 660ccctccgggg catctctact ccagcaacac cggcggattc agcaggtggt acaaccctcc 720ttggtacgac gacgacgacg acgaggaggc cgtcgtcacc ccctccgtcc cctacgaccc 780gctggcgcta aggaaggcgt tcgagaaggc cgtggtgaag cggctgatga cagacgtccc 840gttcggcgtc ctgctctccg gcgggctgga ctcgtcgctg gtggcgaccg tcgccgtgcg 900ccacctcgcc cggacagagg ccgccaggcg ctggggcacc aagctccact ccttctgcgt 960gggcctggag gggtcccctg acctcaaggc ggccagggag gtggcggagt acctgggcac 1020cctgcaccat gagttccact tcactgttca ggacggcatc gacgccatcg aggacgtgat 1080ctaccacacg gagacgtacg acgtcaccac gatcagggcg agcacgccca tgttcctcat 1140gtcgcgcaag atcaagtcgc tcggggtcaa gatggtcatc tccggcgagg gctccgacga 1200gctcttcgga ggctacctct acttccacaa ggcgcccaac aaggaggagt tgcaccgaga 1260gacgtgtagg aaggttaagg ctctgcatca gtacgactgc ctgagagcca acaaggcgac 1320atcagcttgg ggcctggagg ctcgcgtccc gttcctggac aaggagttca tcaatgcggc 1380catgagcatc gatcctgagt ggaagatggt ccagcctgat cttggaagga ttgagaagtg 1440ggtgctgagg aaggcattcg acgacgagga gcagccattc ctgcccaagc atatcctcta 1500cagacagaag gagcagttca 
gtgacggcgt tgggtacagc tggatcgatg gcctgaaggc 1560tcatgcaaca tcaaatgtga ctgacaagat gctgtcaaat gcaaagttca tcttcccaca 1620caacactccg accaccaagg aggcctacta ctacaggatg gtcttcgaga ggttcttccc 1680acagaaatct gctatcctga cggtacctgg tgggccaagt gtggcgtgca gcacagccaa 1740ggccatcgag tgggacgcac aatggtcagg aaatctggac ccctcgggaa gggcggcact 1800gggcgtccat ctcgccgcct acgaacacca acatgatccc gagcatgtcc cggcggccat 1860tgcagcagga agcggcaaga agccaaggac gattagggtg gcaccgcctg gcgttgccat 1920cgagggatag acgacgacgc atatataagc ttcctacttt tgtttcaatg catgcatgct 1980atgtatctgt gtccaccggc tgtctagcct tatcatcatc actgtctgca acaaattaat 2040aatcaagtgg tatggggtac ctacgtttaa tgtatacgga gtattgtatt gcttgtgtgt 2100ggtatgctta ggttggccgt gagtaaggga ttacaagtat tcgatatcgg gtgtttctat 2160aggttgaagt gctcataaag ggctccctat cctctatggt catgtttgta atagtttttt 2220ttcttaaaga gcttttctat gaatttggat tcctgtt 22571193700DNAZea mays 119cgagcgctca gcgtctcgtc tcctcctccc cacaaaaagc cgctgaattg ctccgtcggc 60gtcatgtgtg gcatcttagc cgtgctcgga tgctccgact gctcccaggc caggagggct 120cgcatcctcg cctgctccag aaggctgaag cacaggggcc ccgactggtc gggcctctac 180cagcacgagg gcaacttcct ggcgcagcag cggctcgcca tcgtctcccc gctgtccggc 240gaccagccgc tgttcaacga ggaccgcacc gtcgtggtgg tggccaatgg agagatctac 300aaccacaaga acgtccggaa gcagttcacc ggcgcgcaca gcttcagcac cggcagtgac

360tgcgaggtca tcatccccct gtacgagaag tacggcgaga acttcgtgga catgctggac 420ggagtcttcg cgttcgtgct ctacgacacg cgagacagga cctacgtggc ggcacgcgac 480gccatcggcg tcaacccgct ctacatcggc tggggcagcg acggttccgt ctggatgtcg 540tccgagatga aggcgctgaa cgaggactgc gtgcgcttcg agatcttccc gccggggcac 600ctctactcca gcgccgccgg cgggttccgc cggtggtaca ccccgcactg gttccaggag 660caggtgcccc ggacgccgta ccagccgctc gtccttagag aggccttcga gaaggcggtt 720atcaagaggc tcatgaccga cgtcccgttc ggggtcctcc tctccggcgg cctcgactcc 780tccctcgtcg cctccgtcac caagcgccac ctcgtcaaga ccgacgccgc cggaaagttc 840ggcacagagc tccactcctt cgtcgtcggc ctcgagggct cccctgacct gaaggccgca 900cgagaggtcg ctgactacct cggaaccacc catcacgagt tccatttcac cgtacaggac 960ggcatcgacg cgatcgagga ggtgatctac cacgacgaga cgtacgacgt gacgacgatc 1020cgggccagca cgcccatgtt cctgatggct cgcaagatca agtcgctggg cgtgaagatg 1080gtgctgtccg gggagggctc cgacgagctc ctgggcggct acctctactt ccacttcgcc 1140cccaacaggg aggagctcca cagggagacc tgccgcaagg tgaaggccct gcaccagtac 1200gactgcctgc gcgccaacaa ggcgacgtcg gcgtggggcc tggaggtccg cgtgccgttc 1260ctcgacaagg agttcgtcga cgtcgcgatg ggcatggacc ccgaatggaa aatgtacgac 1320aagaacctgg gtcgcatcga gaagtgggtc ctgaggaagg cgttcgacga cgaggagcac 1380ccttacctgc ccgagcatat tctgtacagg cagaaagaac agttcagtga cggagtgggc 1440tacaactgga tcgatggact caaagccttc accgaacagc aggttgatgg tcgtcgtcga 1500agttagctaa ccagcgctga cgttcccccc catgtccagg tgacggatga gatgatgaac 1560agcgccgccc agatgttccc gtacaacacg cccgtcaaca aggaggccta ctactaccgg 1620atgatattcg agaggctctt ccctcaggac tcggcgaggg agacggtgcc gtggggcccg 1680agcatcgcct gcagcacgcc cgcggccatc gagtgggtgg agcagtggaa ggcctccaac 1740gacccctccg gccgcttcat ctcctcccac gactccgccg ccaccgaccg caccggagac 1800aagctggcgg tggtcaacgg cgacgggcac ggcgcggcga acggcacggt caacggcaac 1860gacgtcgctg tcgccatcgc ggtgtaacag taatgaactg gaggataggg acgaacgaac 1920gactggtagg tgtggcgtac ctgccgcgtg cccaccggcc ggccatatat atcgaatccc 1980ggcccggcgc ggcagcacgt gctattgtta cgtgtcacca gcgtacgtgt ctgtgtagtg 2040cctcgatcgt atctgtacgt ctttaggaaa aggtgtgtcc 
gtgtgtattg tatgtgtgtg 2100agcaagcgtg cgtgacgcgc tctgcctgtg tgacaaagca gagcagtaca agctcaggca 2160ttttctgtcc gagcgatgat ttgaactgga tctatcatct ctgaattgaa ctcggccgga 2220cgacgaccta ccgctaaaat tattcccagc tggatttcgg tacgtgtccc cgttgttcgt 2280tctcgcggct gtgactgtga ccgaacctgc tgctacaagt gcgcgtaaag gatctggttc 2340cacgtgtccg gcacgccggg cacgcaccag tggatgcagt ccgtgtacgt ctgcgggtcg 2400gccctctgcg cgtcggtgag cagctcgccg ccggtctcgg tgtacacgga gacgtgggcg 2460tcgacgcggt gctccgtcag ctgcgtgacg ttgagcagcg tcacgggcac ccgcatccgg 2520cccaccacgt cggacatcac ctccatcatc cgccggtccg cgccgctgcc ccagtacccc 2580ttccgcgtga cgggcaacgt ctcgttgtag caccggatgc cgccctcccg gccccagtcc 2640tcgctcctca tgtgcgtggt ggagatcgac atgaagaaga cccttgtggc gttgggatcg 2700atgttggcgt ccacccagtt ggcccatgtc ttgagtccaa gccggaacgc gacccaggcg 2760tccagctcct cgtacccgtc gtccccgaac gacccccaca ctgatttgat cctgctgccg 2820gtcatccacc acacgtagct gtcgaagacg aggatgtcca cgcccttcca gtgcctagcg 2880tgcagctcga cggcgtcgac gtggagcacg cggccgtcgg cccccagccg gatgttgcgg 2940tcggagttgg cctccaccag gtacggcgcc cagtagaact cgatcgtcgc gttgtactcc 3000gtggcggtga agacggacag ggtggtgctg cgctccatgg accgcgcggt gtagggcacg 3060gcggagttga cgaggcagac catggagagc cactgcccca tctgcagcga gtcgccaacg 3120aacatcatcc gcttcccccg cagcgtctcc agcaacgcca ccgggtcgaa ccttgggaga 3180ctgcagtcgt ctaggtgcca atcccagcgc aggtagtcgc tgtccggcct gccgttcctc 3240tggcaagaga cctgcctgtc gatgaacggg catgtttggt ccgtgtaaag cagctccttg 3300gacctgttgt acgcccagta cccctccgtc acgctgcacc ggctcgggtc gaacgctgcc 3360ttcgccggct gcggcggcgg cgttagcggc atcttcctcg tcgtcgtccc ggcatgtaga 3420gacgtcgtcc tcttgtgctc cttcgctttc cccttcttct ccatgatctc agtgagtgag 3480cggaggtcgt cggtgaagat gacgcccgct agagccagcc cgccgatgat tgccaccacc 3540actgacagcg gggcccgccc cttcatccgc ttcaccgctg ctatctgaat ctgaaccatg 3600aagctcagat gctacgtgga tgctggcatg cagcaatgct agcttgttgc aggctcaagg 3660tgtgagacgg cttatcgatt tatttgcagc tgctctttgt 37001202189DNAZea maysmisc_feature(2173)..(2173)n is a, c, g, or t 120ccgaggcggc gcttttgggg tcggaagcga 
cacgggcgcc gggcgggtcc gcgggtggtg 60gtgctactgc tagcaagcag cagcaggcga cgctaggcga gagccccagt cggagcaggc 120caccatgtgc ggcatcctcg ctgtcctcgg cgtcgctgag gtctccctcg ccaagcgctc 180ccgcatcatt gagctctcgc gcaggttacg gcaccgaggg cctgattgga gtggtttgca 240ctgtcatgag gattgttacc ttgcacacca gcggttggct attatcgatc ctacatctgg 300agaccagcct ttgtacaatg aggataaaac agttgttgta acggtgaacg gagagatcta 360taaccatgaa gaattgaaag ctaagttgaa aactcatgag ttccaaactg gcagtgattg 420tgaagttata gcccatcttt acgaagaata tggcgaagaa tttgtggata tgttggatgg 480aatgttctcc tttgttcttc ttgatacacg tgataaaagc ttcatcgcag ctcgtgatgc 540tattggcatc tgccctttat acatgggatg gggtcttgat ggatcagtct ggttttcttc 600agagatgaag gcattgagtg atgattgtga acgcttcata acatttcccc cagggcatct 660ctactccagc aagacaggtg gtctaaggag atggtacaac ccaccatggt tttcagagac 720ggtcccttca accccttaca atgctctctt cctccgggag atgtttgaga aggctgttat 780taagaggctg atgactgatg tgccatttgg tgtgctttta tctggtggac tcgactcttc 840tttggttgca tctgttgctt cgcggcactt taacgaaaca aagggtgaca ggcagtgggg 900aaataaattg catactttct gtataggctt gaagggttct cctgatctta aagctgctag 960agaagttgct gattacctca gcactgtaca tcatgagttc cacttcacag tgcaggaggg 1020cattgatgcc ttggaagaag tcatctacca tattgagaca tatgatgtta caacaatcag 1080agcaagtacc ccaatgtttt tgatgtcacg caaaatcaaa tctttgggtg tgaagatggt 1140tatttctggc gaaggttcag atgaaatttt tggtggttac ctttattttc acaaggcacc 1200aaacaagaaa gaattccatg aggaaacatg tcggaagata aaagcactac atctgtatga 1260ctgcttgaga gctaacaaag caacttctgc ctggggtgtt gaggctcgtg ttccattcct 1320tgacaaaagt ttcatcagtg tagcaatgga cattgatcct gattggaaga tgataaaacg 1380tgacctcggt cgaattgaga aatgggttat ccgtaatgca tttgatgatg atgagaggcc 1440ctatttacct aagcacattc tctacaggca aaaggaacag ttcagtgatg gtgttgggta 1500tagttggatc gatggattga aggaccatgc cagccaacat gtctccgatt ccatgatgat 1560gaatgctggc tttgtttacc cagagaacac acccacaaca aaagaagggt actactacag 1620aatgatattc gagaaattct ttcccaagcc tgcagcaagg tcaactgttc ctggaggtcc 1680tagtgtggcc tgcagcactg ccaaagctgt tgaatgggac gcatcctggt ccaagaacct 1740tgatccttct 
ggccgtgctg ctttgggtgt tcacgatgct gcgtatgaag acactgcagg 1800gaaaactcct gcctctgctg atcctgtctc agacaagggc cttcgtccag ctattggcga 1860aagcctaggg acacccgttg cttcagccac agctgtctaa ccttatgttt atcacccagc 1920aatgcttgaa acagcaaagg ttgtccattg cttgtttcag tttccttccg atcatgtttt 1980tagttccatc aatcaagcaa tggagacatg cttgtgcttc atacttggca gcatcgtgtt 2040tgggttttca ctgggcagta ctgtttaatt tttatggact gaaaagactc agttttgtaa 2100atattcgtca ctgtgaccaa ttcctgtggt ggtttatgtg atttgcagat tgcagtggtt 2160agtgtatctt ccncaatttt cactccttt 21891212083DNABrassica napus 121gaattctccg ggtcgacgat ttcgtacgaa atcgtcattg ccgccaccat ccatcaacca 60tgtgtgggat tctcgctgtt ctaggctgcg tcgataactc tcaagccaca cgttctcgta 120tcatcaaact ctctcgcaga ttgaggcata gaggtcctga ttggagcggg cttcattgtt 180atgaggattg ttacttggct catgagcgtt tggccatcat tgaccccatt tctggagacc 240agcctctcta cagcgaagat aagaccgtcg ttgtcacggt gaatggagag atatacaatc 300acaaggcatt gcgtgaaagt gaaagtctga agtctcacaa gtaccatacc gggagtgatt 360gtgaagtgct tgcccatctt tatgaagaac atggagagga atttatcaac atgttggacg 420gcatgtttgc atttgtcctt cttgatacta aggacaaaag ttatattgct gtaagggatg 480ccattggtgt catcccactc tacattggct ggggtctcga tggttctgtc tggtttgctt 540ctgagatgaa agcacttagt gatgattgtg aacagtttat ggctttccca ccaggccaca 600tctattccag taaacaaggt ggtcttagga ggtggtacaa ccctccatgg ttctctgagc 660tcgttccttc aaccccttat gatcccttag tattgcgaga tactttcgag aaggctgtaa 720taaagagact aatgaccgat gtgccttttg gtgtcctact ctctggagga ctagactcat 780ctcttgttgc ttcagtggct atacgccatt tggaaaagtc agatgctcgt cagtggggtt 840ccaagctgca caccttttgc attggtttaa agggatctcc ggatcttaaa gctggtaaag 900aagttgctga ctatctagga actcgccacc acgagctcca ctttacagtt caggaaggga 960tagacgccat agaagaagtt atataccatg ttgagaccta tgacgtgact accataagag 1020caagcactcc catgtttctc atgtcgagaa aaatcaaatc gcttggtgtg aagatggttc 1080tctctggtga aggctctgat gagatctttg gagggtattt gtacttccac aaagcaccta 1140acaagaagga gttacacgag gaaacatgcc gaaagatcaa agcactttat caatatgatt 1200gcttgagggc taacaaatca acttctgcgt ggggtgttga ggctcgtgtg cctttccttg 
1260ataaagcgtt tctagatgta gcaatgggca ttgatccaga gtggaagatg atcaggcctg 1320acttgggaag gattgagaaa tgggtgttac gcaatgcctt tgatgatgag aagaatcctt 1380atctaccaaa gcacattctg tacaggcaga aggaacagtt cagtgatgga gttggataca 1440gctggattga cggtctgaaa gatcatgcaa acaaacatgt ctctgacgca atgctgacga 1500acgcaaactt tgtcttcccg gagaacacac ctttgacaaa ggaggcttac tactacagag 1560ccatctttga aaagttcttc cctaagagcg ctgctagagc aactgtacca ggaggtccaa 1620gtgtagcatg tagtactgca aaagctgtgg agtgggacgc agcttggaaa gggaaccttg 1680acccgtcggg tcgtgcggct cttggagttc atgttgcagc ttatgaagga gataaagctg 1740aagatcctcg tcctgagaag gtacagaagc tggcagagaa aactgcagaa gccattgttt 1800gaggatgaaa cgaatgtttg agtcgtgcgt ttcttttatt ttctcataag acaatacgtt 1860attatcatct tccgtaggat caataagtac aataagttgt ctctctttaa ctgaattgag 1920gtgggagtgt ctgaggttgt acctaagttg ttggtgattt tctggttctt tcatttgtca 1980caaagttttc agcgtttctt ttatgtatga tgtatcgttc acccctgtta atctagattt 2040ggttcagttc aaaaaaaaaa aaaaaaaaag cggacgctct aga 20831222288DNATriticum aestivum 122ggcctggccc gctacgaacc ccaaacgcgc atctctccta gccccctccc tgctgctcta 60ccaccaccgt gccgccgtag aacgccgtac ctgacccccc accaccacct gcgcctgcgt 120cgccgccggc gccgtcgccg tcgcccgtcc gtactagtcg gggcatcgcc ggtgattagt 180caaatcacct tcggagctcg cgaccaccca aatcacccgc ggagtctcgc caacgagcag 240ggaccgcccg ccggccgcca ccatgtgcgg catcctcgcc gtcctcggcg tcggcgacgt 300ctccctcgcc aagcgctccc gcatcatcga gctctcccgc cgattacggc acagaggccc 360tgattggagt ggtatacaca gctttgagga ttgctatctt gcacaccagc ggttggctat 420tgttgatcct acatctggag accagccatt gtacaacgag gacaaaacag ttgttgtgac 480ggtgaatgga gagatctata atcatgaaga actgaaagct aagctgaaat ctcatcaatt 540ccaaactggt agtgattgtg aagttattgc tcacctatat gaggaatacg gggaggaatt 600tgtggatatg ctggatggca tgttctcgtt tgtgcttctt gacacacgtg ataaaagctt 660cattgctgcc cgtgatgcta ttggcatctg tcctttgtac atgggctggg gtcttgatgg 720gtcagtttgg ttttcttcag agatgaaggc attgagtgat gattgcgagc gcttcatatc 780gttccctcct ggacacttgt actcaagcaa aacaggtggc ctaaggaggt ggtacaaccc 840cccatggttt tcagaaagca ttccctcagc 
cccctatgat cctctcctca tccgagagag 900tattgagaag gctgctatta agaggctaat gactgatgtg acatttggcg ttctcttgtc 960tggtgggctt gactcttctt tggtggcttc tgttgtttca cgctacttgg cagaaacaaa 1020agttgctagg cagtggcgaa acaaactgca caccttttgc atcggcatga agggttctcc 1080tgatcttaaa gctgctaagg aagttgctga ctaccttggc acagtccatc atgaattaca 1140cttcacagtg caggagggca ttgatgcttt ggaagaagtt atatatcaca tcgagacgta 1200tgatgtcacg accattagag caagtacccc aatgtttcta atgtctcgga aaatcaaatc 1260gttgggtgtg aagatggttc tttcgggaga aggctccgat gaaatatttg gtggttatct 1320ttattttcac aaggcaccaa acaaaaagga actacatgag gaaacatgta ggaagataaa 1380agctctccat ttatatgatt gtttgagagc gaacaaagca acttctgcct ggggtctcga 1440ggctcgtgtt ccattcctcg acaaaaactt catcaatgta gcaatggacc tggatccgga 1500atgtaagatg ataagacgtg atcttggccg gatcgagaaa tgggttctgc gtaatgcatt 1560tgatgatgag gagaagccct atttacccaa gcacattctt tacaggcaaa aagaacaatt 1620cagcgatggg gttgggtaca gttggattga tggattgaag gaccatgcta aagcacatgt 1680gtcggattcc atgatgacga acgccagctt tgtttaccct gaaaacacac ccacaacaaa 1740agaggcctac tattacagga ccgtattcga gaagttctat cccaagaatg ctgctaggct 1800aacggtgcca ggaggtccca gcatcgcatg cagcaccgct aaagctgtcg aatgggacgc 1860cgcctggtcc aagctcctcg acccgtctgg ccgcgccgct cttggcgtgc acgatgcggc 1920gtacaaagaa aaggctcctg catcggtcga tcctgccgtg gataacgtct cacgttcacc 1980tgcacatgac gtcaaaagac tcaaaaccgc catttcagca gctgctgtat aaccttccat 2040tccatggttc caaaaatgcc gtcgcttagt tttaatccta gcaatcctgt ctgtagttca 2100ttcagtcatg cagtgcagaa atcgctttgc tctacttttt cgttcatgtt gtgctttcgc 2160atgtatgtac caagttagtt tgtttatgca gcgagcgttt gcgtcgtaaa taaatatttc 2220accgtggttg atatccttgt gttgctcagt gtttggtttg caagctgcaa attgcactaa 2280taaattcc 2288



