Patent application title: METHODS AND COMPOSITIONS FOR TARGETED POLYNUCLEOTIDE MODIFICATION
Inventors:
IPC8 Class: AC12N1582FI
USPC Class:
1 1
Class name:
Publication date: 2020-01-30
Patent application number: 20200032281
Abstract:
A variety of methods and compositions are provided, including methods and
compositions for targeted modification of a specific target site in a
cell or organism, methods for integrating polynucleotides of interest,
methods to assess promoter activity, directly select transformed
organisms, minimize or eliminate expression resulting from random
integration into the genome of an organism, such as a plant, remove
polynucleotides of interest, combine multiple transfer cassettes, invert
or excise a polynucleotide, silence a gene, and identify and/or
characterize transcriptional regulating regions. The methods involve the
introduction of a cell proliferation factor and a double-strand
break-inducing enzyme into an organism.
Claims:
1. A method for modifying a target site of a plant cell, wherein said
target site of said plant cell comprises a recognition sequence, and
wherein said method comprises: a) introducing into said plant cell at
least one heterologous polynucleotide encoding a cell proliferation
factor comprising SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID
NO: 7, and SEQ ID NO: 8; wherein the heterologous polynucleotide is operably
linked to a promoter active in the plant cell; b) expressing said
heterologous polynucleotide encoding said cell proliferation factor; and
c) introducing a heterologous polynucleotide encoding a double-strand
break-inducing enzyme and expressing said heterologous polynucleotide
encoding said double-strand break-inducing enzyme, wherein said
double-strand break-inducing enzyme recognizes said recognition sequence
and introduces a double-strand break at or near the recognition sequence
to produce a modified target site.
2. (canceled)
3. The method of claim 1, wherein the cell proliferation factor further comprises at least one of the following amino acid sequences: a) the amino acid sequence set forth in SEQ ID NO: 9 or an amino acid sequence that differs from the amino acid sequence set forth in SEQ ID NO: 9 by one amino acid; and b) the amino acid sequence set forth in SEQ ID NO: 12 or an amino acid sequence that differs from the amino acid sequence set forth in SEQ ID NO: 12 by one amino acid.
4.-5. (canceled)
6. The method of claim 5, wherein said promoter operably linked to said heterologous polynucleotide encoding said cell proliferation factor is an oleosin promoter, a ubiquitin promoter, a nopaline synthase promoter, or an In2 promoter.
7. The method of claim 1, wherein said modified target site comprises a deletion, a mutation, a replacement, or an integration of a nucleotide sequence when compared to said target site.
8. The method of claim 1, wherein said double-strand break-inducing enzyme is selected from the group consisting of an endonuclease, a zinc finger nuclease, a transposase, a topoisomerase, and a site-specific recombinase.
9. The method of claim 8, wherein said endonuclease comprises a homing endonuclease.
10. The method of claim 9, wherein said homing endonuclease comprises a modified endonuclease that has been modified to specifically bind said recognition sequence.
11. The method of claim 10, wherein said modified homing endonuclease is derived from a homing endonuclease selected from the group consisting of I-SceI, I-SceII, I-SceIII, I-SceIV, I-SceV, I-SceVI, I-SceVII, I-CeuI, I-CeuAIIP, I-CreI, I-CrepsbIP, I-CrepsbIIP, I-CrepsbIIIP, I-CrepsbIVP, I-TliI, I-PpoI, PI-PspI, F-SceI, F-SceII, F-SuvI, F-TevI, F-TevII, I-AmaI, I-AniI, I-ChuI, I-CmoeI, I-CpaI, I-CpaII, I-CsmI, I-CvuI, I-CvuAIP, I-DdiI, I-DdiII, I-DirI, I-DmoI, I-HmuI, I-HmuII, I-HsNIP, I-LlaI, I-MsoI, I-NaaI, I-NanI, I-NclIP, I-NgrIP, I-NitI, I-NjaI, I-Nsp236IP, I-PakI, I-PboIP, I-PcuIP, I-PcuAI, I-PcuVI, I-PgrIP, I-PobIP, I-PorI, I-PorIIP, I-PbpIP, I-SpBetaIP, I-ScaI, I-SexIP, I-SneIP, I-SpomI, I-SpomCP, I-SpomIP, I-SpomIIP, I-SquIP, I-Ssp6803I, I-SthPhiJP, I-SthPhiST3P, I-SthPhiSTe3bP, I-TdeIP, I-TevI, I-TevII, I-TevIII, I-UarAP, I-UarHGPAIP, I-UarHGPA13P, I-VinIP, I-ZbiIP, PI-MtuI, PI-MtuHIP, PI-MtuHIIP, PI-PfuI, PI-PfuII, PI-PkoI, PI-PkoII, PI-Rma43812IP, PI-SpBetaIP, PI-SceI, PI-TfuI, PI-TfuII, PI-ThyI, PI-TliI, and PI-TliII.
12. The method of claim 1, wherein said double-strand break-inducing enzyme is a site-specific recombinase and said recognition sequence comprises a first recombination site.
13. The method of claim 12, wherein said site-specific recombinase is selected from the group consisting of FLP, Cre, SSV1, R, Gin, lambda Int, phiC31 Int, Tn1721, CinH, ParA, Tn5053, Bxb1, TP901-1, U153, and HK022 Int.
14. The method of claim 12, wherein said target site further comprises a second recombination site, wherein said target site comprises the following operably linked components: said first recombination site, a nucleic acid sequence, and a second recombination site.
15. The method of claim 14, wherein said first recombination site is recombinogenic with the second recombination site in the presence of said site-specific recombinase.
16. The method of claim 15, wherein said nucleic acid sequence is excised or inverted to produce the modified target site.
17. The method of claim 1, wherein said modified target site comprises an integrated polynucleotide of interest, and wherein said method further comprises introducing into said plant cell a transfer cassette comprising said polynucleotide of interest.
18. The method of claim 17, wherein said transfer cassette comprises at least a first region having homology to said target site.
19. The method of claim 18, wherein said transfer cassette comprises in the following order: said first region of homology to said target site, said polynucleotide of interest, and a second region of homology to said target site.
20. The method of claim 1, said method further comprising identifying cells comprising the modified target site and regenerating a plant having the modified target site.
21. The method of claim 20, wherein said method further comprises reducing the activity of said cell proliferation factor prior to regenerating a plant having the modified target site.
22. The method of claim 21, wherein reducing the activity of said cell proliferation factor comprises excising said heterologous polynucleotide encoding said cell proliferation factor.
23. The method of claim 22, wherein said heterologous polynucleotide encoding said cell proliferation factor is flanked by recombination sites, and wherein said method further comprises introducing into said plant cell a site-specific recombinase capable of recognizing and implementing recombination at the recombination sites flanking said heterologous polynucleotide encoding said cell proliferation factor, whereby said heterologous polynucleotide encoding said cell proliferation factor is excised in the presence of said site-specific recombinase.
24. The method of claim 23, wherein said site-specific recombinase capable of recognizing and implementing recombination at the recombination sites flanking said heterologous polynucleotide encoding said cell proliferation factor has the amino acid sequence set forth in SEQ ID NO: 43 or an amino acid sequence having at least 70% sequence identity to the amino acid sequence set forth in SEQ ID NO: 43.
25. The method of claim 23, wherein said introducing said site-specific recombinase capable of recognizing and implementing recombination at the recombination sites flanking said heterologous polynucleotide encoding said cell proliferation factor comprises introducing a heterologous polynucleotide encoding said site-specific recombinase capable of recognizing and implementing recombination at the recombination sites flanking said heterologous polynucleotide encoding said cell proliferation factor and expressing said heterologous polynucleotide encoding said site-specific recombinase capable of recognizing and implementing recombination at the recombination sites flanking said heterologous polynucleotide encoding said cell proliferation factor.
26. The method of claim 25, wherein said heterologous polynucleotide encoding said site-specific recombinase capable of recognizing and implementing recombination at the recombination sites flanking said heterologous polynucleotide encoding said cell proliferation factor is operably linked to an inducible promoter.
27. The method of claim 26, wherein said inducible promoter operably linked to said heterologous polynucleotide encoding said site-specific recombinase capable of recognizing and implementing recombination at the recombination sites flanking said heterologous polynucleotide encoding said cell proliferation factor has the nucleotide sequence set forth in SEQ ID NO: 54 or a nucleotide sequence having at least 70% sequence identity to the nucleotide sequence set forth in SEQ ID NO: 54.
28. The method of claim 25, wherein said heterologous polynucleotide encoding said cell proliferation factor and said heterologous polynucleotide encoding said site-specific recombinase capable of recognizing and implementing recombination at the recombination sites flanking said heterologous polynucleotide encoding said cell proliferation factor are flanked by said recombination sites, whereby said heterologous polynucleotide encoding said cell proliferation factor and said heterologous polynucleotide encoding said site-specific recombinase capable of recognizing and implementing recombination at the recombination sites flanking said heterologous polynucleotide encoding said cell proliferation factor is excised in the presence of said site-specific recombinase.
29. The method of claim 28, wherein said plant cell further comprises a heterologous polynucleotide encoding a Wuschel polypeptide, and wherein said heterologous polynucleotide encoding said cell proliferation factor, said heterologous polynucleotide encoding said site-specific recombinase capable of recognizing and implementing recombination at the recombination sites flanking said heterologous polynucleotide encoding said cell proliferation factor, and said heterologous polynucleotide encoding said Wuschel polypeptide are flanked by said recombination sites, whereby said heterologous polynucleotide encoding said cell proliferation factor, said heterologous polynucleotide encoding said site-specific recombinase capable of recognizing and implementing recombination at the recombination sites flanking said heterologous polynucleotide encoding said cell proliferation factor, and said heterologous polynucleotide encoding said Wuschel polypeptide is excised in the presence of said site-specific recombinase.
30. The method of claim 29, wherein said heterologous polynucleotide encoding said Wuschel polypeptide is operably linked to a nopaline synthase promoter or an In2-2 promoter.
31. The method of claim 29, wherein said heterologous polynucleotide encoding said Wuschel polypeptide is stably integrated into the genome of said plant cell.
32. The method of claim 29, wherein said heterologous polynucleotide encoding said Wuschel polypeptide is transiently expressed.
33. The method of claim 29, wherein said heterologous polynucleotide encoding said Wuschel polypeptide has a nucleotide sequence selected from the group consisting of: a) the nucleotide sequence set forth in SEQ ID NO: 51, 57, 99, or 97; b) a nucleotide sequence having at least 70% sequence identity to SEQ ID NO: 51, 57, 99, or 97; c) a nucleotide sequence encoding the amino acid sequence set forth in SEQ ID NO: 52, 58, 100, or 98; and d) a nucleotide sequence encoding an amino acid sequence having at least 70% sequence identity to SEQ ID NO: 52, 58, 100, or 98.
34. The method of claim 29, wherein said method further comprises reducing the activity of said Wuschel polypeptide prior to the regeneration of a plant having the modified target site.
35. The method of claim 1, wherein said plant cell is a dicot plant cell.
36. The method of claim 1, wherein said plant cell is a monocot plant cell.
37. The method of claim 36, wherein said monocot plant is selected from the group consisting of maize, rice, sorghum, barley, wheat, millet, oats, sugarcane, turfgrass, and switch grass.
38. A method for targeting the insertion of a polynucleotide of interest to a target site in a plant cell, wherein said target site comprises a first recombination site, said method comprising: a) introducing into said plant cell at least one heterologous polynucleotide encoding a cell proliferation factor and expressing said heterologous polynucleotide encoding said cell proliferation factor; wherein the cell proliferation factor comprises SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, and SEQ ID NO: 8; wherein the heterologous polynucleotide is operably linked to a promoter active in the plant cell; b) introducing into said plant cell a transfer cassette comprising a second recombination site and said polynucleotide of interest, wherein the first and said second recombination sites are recombinogenic with respect to one another; and c) introducing into said plant cell a site-specific recombinase that recognizes and implements recombination at said first and said second recombination sites, thereby inserting said polynucleotide of interest at the target site.
39. A method for targeting the insertion of a polynucleotide of interest to a target site in a plant cell, wherein said target site comprises a first and a second recombination site, wherein said first and said second recombination sites flank a nucleotide sequence and are non-recombinogenic with respect to one another, said method comprising: a) introducing into said plant cell at least one heterologous polynucleotide encoding a cell proliferation factor and expressing said heterologous polynucleotide encoding said cell proliferation factor; wherein the cell proliferation factor comprises SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, and SEQ ID NO: 8; wherein the heterologous polynucleotide is operably linked to a promoter active in the plant cell; b) introducing into said plant cell a transfer cassette comprising a third and a fourth recombination site flanking said polynucleotide of interest, wherein the third recombination site is recombinogenic with the first recombination site, and wherein the fourth recombination site is recombinogenic with the second recombination site; and c) introducing into said plant cell a site-specific recombinase that recognizes and implements recombination at the first, second, third, and fourth recombination sites; thereby replacing the nucleic acid sequence of the target site with the polynucleotide of interest from the transfer cassette.
40. A method to integrate multiple transfer cassettes at a target site in a plant cell, wherein said target site comprises at least a first and a second recombination site, said method comprising: a) introducing into said plant cell a first transfer cassette comprising in the following order: at least the first, a third, and the second recombination sites, wherein the first and the third recombination sites of the first transfer cassette flank a first polynucleotide of interest, and wherein said first, said second, and said third recombination sites are non-recombinogenic with respect to one another; b) introducing into said plant cell a first site-specific recombinase, wherein said site-specific recombinase recognizes and implements recombination at the first and the second recombination sites; c) introducing a second transfer cassette comprising at least the second and the third recombination sites, wherein the second and the third recombination sites of the second transfer cassette flank a second polynucleotide of interest; and d) introducing into said plant cell a second site-specific recombinase, wherein said second site-specific recombinase recognizes and implements recombination at the second and third recombination sites; whereby the first and the second transfer cassettes are integrated at the target site of the plant cell, and wherein said method further comprises introducing at least one heterologous polynucleotide encoding a cell proliferation factor into said plant cell and expressing said heterologous polynucleotide encoding said cell proliferation factor before or during the introduction of the first site-specific recombinase, the second site-specific recombinase, or both the first and the second site-specific recombinase; wherein the cell proliferation factor comprises SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, and SEQ ID NO: 8; and wherein the heterologous polynucleotide is operably linked to a promoter active in the plant cell.
41. A method to integrate multiple transfer cassettes at a target site in a plant cell, wherein said target site comprises in the following order at least a first, a second, and a third recombination site, wherein said first, said second, and said third recombination sites are non-recombinogenic with respect to one another, said method comprising: a) introducing into said plant cell a first transfer cassette comprising a first polynucleotide of interest flanked by the first and the second recombination sites; b) introducing into said plant cell a first site-specific recombinase, wherein said first site-specific recombinase recognizes and implements recombination at the first and the second recombination sites; c) introducing a second transfer cassette comprising a second polynucleotide of interest flanked by at least the second and the third recombination sites; and d) introducing into said plant cell a second site-specific recombinase, wherein said second site-specific recombinase recognizes and implements recombination at the second and third recombination sites; whereby the first and the second transfer cassettes are integrated at the target site of the plant cell, and wherein said method further comprises introducing at least one heterologous polynucleotide encoding a cell proliferation factor into said plant cell and expressing said heterologous polynucleotide encoding said cell proliferation factor before or during the introduction of the first site-specific recombinase, the second site-specific recombinase, or both the first and the second site-specific recombinase; and wherein the cell proliferation factor comprises SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, and SEQ ID NO: 8; and wherein the heterologous polynucleotide is operably linked to a promoter active in the plant cell.
42. A plant cell comprising a target site, wherein said target site of said plant cell comprises a recognition sequence, and wherein said plant cell further comprises: a) at least one heterologous polynucleotide encoding a cell proliferation factor operably linked to a promoter active in said plant, wherein the cell proliferation factor comprises SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, and SEQ ID NO: 8, b) a double-strand break-inducing enzyme capable of recognizing said recognition sequence and introducing a double-strand break at or near the recognition sequence, and c) a transfer cassette comprising a polynucleotide of interest, wherein said transfer cassette comprises a first region of homology with said target site.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a Continuation of pending U.S. application Ser. No. 15/890,698 filed 7 Feb. 2018, now allowed, which is a Continuation of U.S. application Ser. No. 14/215,110 filed 17 Mar. 2014, now U.S. Pat. No. 9,926,571 issued 27 Mar. 2018, which is a Continuation of U.S. application Ser. No. 12/982,013 filed on 30 Dec. 2010, now U.S. Pat. No. 8,704,041 issued 22 Apr. 2014, which claims the benefit of U.S. Provisional Application No. 61/291,207, filed on 30 Dec. 2009, the contents of all of which are hereby incorporated by reference in their entirety.
REFERENCE TO A SEQUENCE LISTING SUBMITTED AS A TEXT FILE VIA EFS-WEB
[0002] The official copy of the sequence listing is submitted electronically via EFS-Web as an ASCII formatted sequence listing with a file named 3526USCNT3_SeqListing.TXT, created on 7 Aug. 2019, and having a size of 441,977 bytes and is filed concurrently with the specification. The sequence listing contained in this ASCII formatted document is part of the specification and is herein incorporated by reference in its entirety.
FIELD OF THE INVENTION
[0003] The present invention relates to the field of molecular biology, specifically the targeted modification of polynucleotides, including targeted mutagenesis and recombination events.
BACKGROUND OF THE INVENTION
[0004] Random insertion of introduced DNA into the genome of a host cell can be lethal if the foreign DNA disrupts an important native gene or regulatory region. Even if a random insertion event does not impair the functioning of a sequence in a host cell, the expression of an inserted foreign nucleotide sequence may be influenced by position effects caused by the surrounding genomic DNA. In some cases, the nucleotide sequence is inserted into a site where the position effect suppresses the function or regulation of the introduced nucleotide sequence. In other instances, overproduction of the gene product may have deleterious effects on a cell.
[0005] For example, in plants, position effects can result in reduced agronomic performance, additional costs for further research, the need to create additional transgenic events, and slower product development. For these reasons, efficient methods are needed to target the insertion of nucleotide sequences into the genome of various organisms, such as plants, at chromosomal positions that allow for the desired function of the sequence of interest.
BRIEF SUMMARY OF THE INVENTION
[0006] Methods and compositions for targeted modification of a specific target site in a cell are provided. A variety of compositions and methods that can be used to modify a target site are provided, including methods to recombine polynucleotides, assess promoter activity, directly select transformed organisms, minimize or eliminate expression resulting from random integration into the genome of an organism, such as a plant, remove polynucleotides of interest, combine multiple transfer cassettes, invert or excise a polynucleotide, silence gene(s), and characterize transcriptional regulatory regions. The methods involve the introduction of a cell proliferation factor and a double-strand break-inducing enzyme into an organism, and in some embodiments, the introduction of a transfer cassette. Compositions also include plant cells and plants comprising a heterologous polynucleotide encoding a cell proliferation factor, a double-strand break-inducing enzyme and a transfer cassette comprising a recognition sequence that is recognized by the double-strand break-inducing enzyme.
BRIEF DESCRIPTION OF THE FIGURES
[0007] FIG. 1 provides a depiction of a phylogenetic analysis of 50 sequences with homology to maize babyboom (BBM).
[0008] FIGS. 2A-2M show the consensus motif sequences 1-10, 14, 15, and 19, respectively, discovered in the analysis described herein, along with the alignments of the regions of various polypeptides used to generate the consensus motifs.
[0009] FIG. 3 depicts the motifs found within 50 sequences with homology to maize BBM (ZmBBM).
[0010] FIGS. 4A, 4B, and 4C show alignments of the amino acid sequence of various BBM polypeptides: maize babyboom 2 (ZmBBM2; SEQ ID NO: 29), sorghum babyboom 2 (SbBBM2; SEQ ID NO: 41), rice babyboom 2 (OsBBM2; SEQ ID NO: 35), rice babyboom 3 (OsBBM3; SEQ ID NO: 37), rice babyboom 1 (OsBBM1; SEQ ID NO: 33), maize babyboom (ZmBBM; SEQ ID NO: 2), sorghum babyboom (SbBBM; SEQ ID NO: 39), rice babyboom (OsBBM; SEQ ID NO: 31), Brassica babyboom 1 (BnBBM1; SEQ ID NO: 19), Brassica babyboom 2 (BnBBM2; SEQ ID NO: 21), Arabidopsis babyboom (AtBBM; SEQ ID NO: 17), medicago babyboom (MtBBM; SEQ ID NO: 23), soybean babyboom (GmBBM; SEQ ID NO: 25), and grape babyboom (VvBBM; SEQ ID NO: 27).
[0011] FIG. 5 provides a depiction of the motifs found in babyboom polypeptides.
DETAILED DESCRIPTION OF THE INVENTION
[0012] Various compositions and methods for modifying a target site in a cell, for example a plant cell, are provided. The modification can include a deletion, mutation, replacement or insertion of a nucleotide sequence. The target site is modified through the activity of a double-strand break-inducing enzyme that recognizes a recognition sequence within the target site. The methods further involve the introduction of a cell proliferation factor, such as a babyboom polypeptide and/or a Wuschel polypeptide, that serves to enhance and promote the modification reaction.
[0013] Double-strand breaks induced by double-strand break-inducing enzymes can trigger DNA repair mechanisms, including the nonhomologous end-joining (NHEJ) pathway and homologous recombination. Error-prone DNA repair mechanisms can produce mutations at double-strand break sites. The NHEJ pathways are the most common repair mechanism for bringing the broken polynucleotide ends together (Bleuyard et al. (2006) DNA Repair 5:1-12). The structural integrity of chromosomes is typically preserved by the repair, but deletions, insertions, or other rearrangements are possible. The two ends of one double-strand break are the most prevalent substrates of NHEJ (Kirik et al. (2000) EMBO J 19:5562-6). If two different double-strand breaks occur, however, the free ends from different breaks can be ligated to one another, resulting in chromosomal deletions (Siebert and Puchta (2002) Plant Cell 14:1121-31), or chromosomal translocations between different chromosomes (Pacher et al. (2007) Genetics 175:21-9).
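The deletion outcome described above, in which free ends from two different breaks are ligated to one another, can be illustrated with a toy string model. This is a minimal sketch only, not a model of NHEJ biochemistry: it assumes clean blunt cuts at given 0-based indices on a single chromosome and ignores end processing. The function name `join_distal_ends` and the example sequence are hypothetical.

```python
def join_distal_ends(seq: str, break1: int, break2: int) -> tuple:
    """Toy model of two double-strand breaks on one chromosome.

    A break at index i cuts between seq[i-1] and seq[i]. Ligating the free
    end left of the first break to the free end right of the second break
    drops the intervening segment, i.e. a chromosomal deletion.
    Returns (repaired_sequence, deleted_segment).
    """
    if not 0 <= break1 <= break2 <= len(seq):
        raise ValueError("need 0 <= break1 <= break2 <= len(seq)")
    deleted = seq[break1:break2]
    repaired = seq[:break1] + seq[break2:]
    return repaired, deleted


# Hypothetical 20-nt sequence with breaks at indices 5 and 12.
repaired, deleted = join_distal_ends("AAAAACCCCCCCGGGGGTTT", 5, 12)
print(repaired)  # AAAAAGGGGGTTT
print(deleted)   # CCCCCCC
```

When both indices are equal, the model reduces to the repair of a single break with no loss of sequence, which corresponds to the more prevalent case in which the two ends of one break are rejoined.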
[0014] Episomal DNA molecules, for example T-DNAs, can also be ligated into the double-strand break, resulting in integration of the episomal DNA molecule into the host genome (Chilton and Que (2003) Plant Physiol 133:956-65; Salomon and Puchta (1998) EMBO J 17:6086-95). Once the sequence around the double-strand breaks is altered, for example, by exonuclease activities involved in the maturation of double-strand breaks, gene conversion pathways can restore the original structure if a homologous sequence is available, such as a homologous chromosome in non-dividing somatic cells, or a sister chromatid after DNA replication (S, G2, M phases of a cell cycle) (Molinier et al. (2004) Plant Cell 16:342-52). Ectopic and/or epigenic DNA sequences may also serve as a DNA repair template for homologous recombination (Puchta (1999) Genetics 152:1173-81).
[0015] DNA double-strand breaks (DSBs) appear to be an effective factor to stimulate homologous recombination pathways in every organism tested to date (Puchta et al. (1995) Plant Mol Biol 28:281-92; Tzfira and White (2005) Trends Biotechnol 23:567-9; Puchta (2005) J Exp Bot 56:1-14). For example, using DNA break-inducing enzymes, a two- to nine-fold increase of homologous recombination was observed between artificially constructed homologous DNA repeats in plants (Puchta et al. (1995) Plant Mol Biol 28:281-92). Thus, double-strand break-inducing enzymes can be used for targeted modification of polynucleotides in organisms and the provision of one or more cell proliferation factors enhances the frequency of targeted modification.
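As an illustration of targeted modification by homologous recombination, the following minimal sketch models a donor molecule in which a polynucleotide of interest is flanked by two regions of homology to the target site. It is a string-level simplification under stated assumptions: exact-match homology arms, a single target locus, and no modeling of repair efficiency. The function name `integrate_by_homology` and all sequences are hypothetical.

```python
def integrate_by_homology(genome: str, left_arm: str, cargo: str, right_arm: str) -> str:
    """Replace the genomic segment between two homology arms with `cargo`.

    The arms themselves are retained, mimicking a donor laid out as:
    first region of homology, polynucleotide of interest, second region
    of homology.
    """
    left_start = genome.find(left_arm)
    if left_start == -1:
        raise ValueError("left homology arm not found in genome")
    left_end = left_start + len(left_arm)
    # Search for the right arm only downstream of the left arm.
    right_start = genome.find(right_arm, left_end)
    if right_start == -1:
        raise ValueError("right homology arm not found downstream of left arm")
    return genome[:left_end] + cargo + genome[right_start:]


# Hypothetical locus: TTTT | ACGTAC | GGG | CATGCA | AAAA
genome = "TTTTACGTACGGGCATGCAAAAA"
edited = integrate_by_homology(genome, "ACGTAC", "TGTGTG", "CATGCA")
print(edited)  # TTTTACGTACTGTGTGCATGCAAAAA
```

Passing an empty `cargo` yields a deletion of the segment between the arms, so the same sketch covers replacement, insertion, and deletion outcomes at the string level.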
[0016] Cell proliferation factors can enhance the rate of targeted modification of a target site in a cell of an organism, such as a plant, that has been induced by a double-strand break-inducing enzyme. In these methods, at least one cell proliferation factor and a double-strand break-inducing enzyme are introduced into a cell having a target site with at least one recognition sequence. The double-strand break-inducing enzyme recognizes the recognition sequence and introduces a double-strand break at or near the recognition sequence to produce a modified target site. Modifications to the target site can include a deletion, mutation, replacement, homologous recombination, or insertion of a nucleotide sequence. In certain embodiments, the target site is stably integrated into the genome of the plant. In some of these embodiments, the genomic target site is a native genomic target site. These methods can be used to stimulate recombination at a target site, integrate polynucleotides into a target site, invert or excise a polynucleotide, directly select transformed organisms, minimize or eliminate expression resulting from random integration into the genome of an organism, combine multiple transfer cassettes, silence genes, and characterize transcriptional regulatory regions.
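The recognition step described above can be sketched as a search for the recognition sequence on either strand of the target site, followed by a cut at or near the match. This is a minimal sketch under stated assumptions: exact matching, a blunt cut, and a caller-supplied cut offset. The recognition sequence `GATTACA`, the target sequence, and the function names are hypothetical and do not correspond to any enzyme named in this document.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_recognition_sites(target: str, recognition: str) -> list:
    """Return sorted (start, strand) pairs for every match on either strand."""
    hits = []
    rc = reverse_complement(recognition)
    for strand, motif in (("+", recognition), ("-", rc)):
        if strand == "-" and rc == recognition:
            continue  # palindromic site: avoid reporting it twice
        start = target.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = target.find(motif, start + 1)
    return sorted(hits)


def cut_at(target: str, site_start: int, offset: int) -> tuple:
    """Blunt cut `offset` bases after the start of a recognized site."""
    pos = site_start + offset
    return target[:pos], target[pos:]


# Hypothetical target carrying the site once on each strand.
target = "CCGATTACAGGTTTGTAATCAA"
sites = find_recognition_sites(target, "GATTACA")
print(sites)                  # [(2, '+'), (13, '-')]
print(cut_at(target, 2, 3))   # ('CCGAT', 'TACAGGTTTGTAATCAA')
```

Real double-strand break-inducing enzymes often tolerate some sequence degeneracy and may leave staggered ends; neither is modeled here.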
[0017] The presently disclosed methods and compositions utilize cell proliferation factors to enhance rates of targeted polynucleotide modification. As used herein, a "cell proliferation factor" is a polypeptide or a polynucleotide capable of stimulating growth of a cell or tissue, including but not limited to promoting progression through the cell cycle, inhibiting cell death, such as apoptosis, stimulating cell division, and/or stimulating embryogenesis. The polynucleotides can fall into several categories, including but not limited to, cell cycle stimulatory polynucleotides, developmental polynucleotides, anti-apoptosis polynucleotides, hormone polynucleotides, or silencing constructs targeted against cell cycle repressors or pro-apoptotic factors. The following are provided as non-limiting examples of each category and are not considered a complete list of useful polynucleotides for each category: 1) cell cycle stimulatory polynucleotides including plant viral replicase genes such as RepA, cyclins, E2F, prolifera, cdc2 and cdc25; 2) developmental polynucleotides such as Lec1, Kn1 family, WUSCHEL, Zwille, BBM, Aintegumenta (ANT), FUS3, and members of the Knotted family, such as Kn1, STM, OSH1, and SbH1; 3) anti-apoptosis polynucleotides such as CED9, Bcl2, Bcl-X(L), Bcl-W, A1, McL-1, Macl, Boo, and Bax-inhibitors; 4) hormone polynucleotides such as IPT, TZS, and CKI-1; and 5) silencing constructs targeted against cell cycle repressors, such as Rb, CK1, prohibitin, and wee1, or stimulators of apoptosis such as APAF-1, bad, bax, CED-4, and caspase-3, and repressors of plant developmental transitions, such as Pickle and WD polycomb genes including FIE and Medea. The polynucleotides can be silenced by any known method such as antisense, RNA interference, cosuppression, chimerplasty, or transposon insertion.
[0018] The cell proliferation factors can be introduced into cells to enhance targeted polynucleotide modification through the introduction of a polynucleotide that encodes the proliferation factor. The use of the term "polynucleotide" is not intended to limit the compositions to polynucleotides comprising DNA. Polynucleotides can comprise ribonucleotides and combinations of ribonucleotides and deoxyribonucleotides. Such deoxyribonucleotides and ribonucleotides include both naturally occurring molecules and synthetic analogues. The polynucleotides also encompass all forms of sequences including, but not limited to, single-, double-, or multi-stranded forms, hairpins, stem-and-loop structures, circular plasmids, and the like. The polynucleotide encoding the cell proliferation factor may be native to the cell or heterologous. A native polypeptide or polynucleotide comprises a naturally occurring amino acid sequence or nucleotide sequence. "Heterologous" in reference to a polypeptide or a nucleotide sequence is a polypeptide or a sequence that originates from a different species, or if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention.
[0019] Any of a number of cell proliferation factors can be used. In certain embodiments, those cell proliferation factors that are capable of stimulating embryogenesis are used to enhance targeted polynucleotide modification. Such cell proliferation factors are referred to herein as embryogenesis-stimulating polypeptides and they include, but are not limited to, babyboom polypeptides.
[0020] In some embodiments, the cell proliferation factor is a member of the AP2/ERF family of proteins. The AP2/ERF family of proteins is a plant-specific class of putative transcription factors that regulate a wide variety of developmental processes and are characterized by the presence of an AP2 DNA binding domain that is predicted to form an amphipathic alpha helix that binds DNA (PFAM Accession PF00847). The AP2 domain was first identified in APETALA2, an Arabidopsis protein that regulates meristem identity, floral organ specification, seed coat development, and floral homeotic gene expression. The AP2/ERF proteins have been subdivided into distinct subfamilies based on the presence of conserved domains. Initially, the family was divided into two subfamilies based on the number of DNA binding domains, with the ERF subfamily having one DNA binding domain, and the AP2 subfamily having 2 DNA binding domains. As more sequences were identified, the family was subsequently subdivided into five subfamilies: AP2, DREB, ERF, RAV, and others. (Sakuma et al. (2002) Biochem Biophys Res Comm 290:998-1009).
[0021] Members of the APETALA2 (AP2) family of proteins function in a variety of biological events, including but not limited to, development, plant regeneration, cell division, embryogenesis, and cell proliferation (see, e.g., Riechmann and Meyerowitz (1998) Biol Chem 379:633-646; Saleh and Pages (2003) Genetika 35:37-50 and Database of Arabidopsis Transcription Factors at daft.cbi.pku.edu.cn). The AP2 family includes, but is not limited to, AP2, ANT, Glossy 15, AtBBM, BnBBM, and maize ODP2/BBM.
[0022] Provided herein is an analysis of fifty sequences with homology to a maize BBM sequence (also referred to as maize ODP2 or ZmODP2, the polynucleotide and amino acid sequence of the maize BBM is set forth in SEQ ID NO: 1 and 2, respectively; the polynucleotide and amino acid sequence of another ZmBBM is set forth in SEQ ID NO: 121 and 122, respectively; and genomic sequences of ZmBBM are set forth in SEQ ID NO: 59 and 101). The analysis identified three motifs (motifs 4-6; set forth in SEQ ID NOs: 6-8), along with the AP2 domains (motifs 2 and 3; SEQ ID NOs: 4 and 5) and linker sequence that bridges the AP2 domains (motif 1; SEQ ID NO: 3), that are found in all of the BBM homologues. Thus, motifs 1-6 distinguish these BBM homologues from other AP2-domain containing proteins (e.g., WRI, AP2, and RAP2.7). Thus, these BBM homologues comprise a subgroup of AP2 family of proteins referred to herein as the BBM/PLT subgroup. In some embodiments, the cell proliferation factor that is used in the methods and compositions is a member of the BBM/PLT group of AP2 domain-containing polypeptides. In these embodiments, the cell proliferation factor comprises two AP2 domains and motifs 4-6 (SEQ ID NOs: 6-8) or a fragment or variant thereof. In some of these embodiments, the AP2 domains have the sequence set forth in SEQ ID NOs: 4 and 5 or a fragment or variant thereof, and in particular embodiments, further comprises the linker sequence of SEQ ID NO: 3 or a fragment or variant thereof. In other embodiments, the cell proliferation factor comprises at least one of motifs 4-6 or a fragment or variant thereof, along with two AP2 domains, which in some embodiments have the sequence set forth in SEQ ID NO: 4 and/or 5 or a fragment or variant thereof, and in particular embodiments have the linker sequence of SEQ ID NO: 3 or a fragment or variant thereof. 
Based on the phylogenetic analysis provided herein, the subgroup of BBM/PLT polypeptides can be subdivided into the BBM, AIL6/7, PLT1/2, AIL1, PLT3, and ANT groups of polypeptides.
[0023] In some embodiments, the cell proliferation factor is a babyboom (BBM) polypeptide, which is a member of the AP2 family of transcription factors. The BBM protein from Arabidopsis (AtBBM) is preferentially expressed in the developing embryo and seeds and has been shown to play a central role in regulating embryo-specific pathways. Overexpression of AtBBM has been shown to induce spontaneous formation of somatic embryos and cotyledon-like structures on seedlings. See, Boutilier et al. (2002) The Plant Cell 14:1737-1749. The maize BBM protein also induces embryogenesis and promotes transformation (See, U.S. Pat. No. 7,579,529, which is herein incorporated by reference in its entirety). Thus, BBM polypeptides stimulate proliferation, induce embryogenesis, enhance the regenerative capacity of a plant, enhance transformation, and as demonstrated herein, enhance rates of targeted polynucleotide modification. As used herein, "regeneration" refers to a morphogenic response that results in the production of new tissues, organs, embryos, whole plants or parts of whole plants that are derived from a single cell or a group of cells. Regeneration may proceed indirectly via a callus phase or directly, without an intervening callus phase. "Regenerative capacity" refers to the ability of a plant cell to undergo regeneration.
[0024] In some embodiments, the babyboom polypeptide comprises two AP2 domains and at least one of motifs 7 and 10 (set forth in SEQ ID NO: 9 and 12, respectively) or a variant or fragment thereof. In certain embodiments, the AP2 domains are motifs 3 and 2 (SEQ ID NOs: 5 and 4, respectively) or a fragment or variant thereof, and in particular embodiments, the babyboom polypeptide further comprises a linker sequence between AP2 domain 1 and 2 having motif 1 (SEQ ID NO: 3) or a fragment or variant thereof. In particular embodiments, the BBM polypeptide further comprises motifs 4-6 (SEQ ID NOs 6-8) or a fragment or variant thereof. The BBM polypeptide can further comprise motifs 8 and 9 (SEQ ID NOs: 10 and 11, respectively) or a fragment or variant thereof, and in some embodiments, motif 10 (SEQ ID NO: 12) or a variant or fragment thereof. In some of these embodiments, the BBM polypeptide also comprises at least one of motif 14 (set forth in SEQ ID NO: 13), motif 15 (set forth in SEQ ID NO: 14), and motif 19 (set forth in SEQ ID NO: 15), or variants or fragments thereof. The variant of a particular amino acid motif can be an amino acid sequence having at least about 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater sequence identity with the motif disclosed herein. Alternatively, variants of a particular amino acid motif can be an amino acid sequence that differs from the amino acid motif by 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids.
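The two variant definitions given above (a minimum percent sequence identity, or a maximum number of differing amino acids) can be illustrated with a minimal sketch. The sketch assumes an ungapped, equal-length comparison, which is a simplification; sequence identity as used herein is determined with alignment programs described elsewhere. The function names and the ten-residue peptide are hypothetical placeholders and are not taken from the sequence listing.

```python
def substitution_count(motif: str, candidate: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(motif) != len(candidate) or not motif:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a != b for a, b in zip(motif, candidate))


def percent_identity(motif: str, candidate: str) -> float:
    """Percent of ungapped positions that match (simplifying assumption)."""
    matches = len(motif) - substitution_count(motif, candidate)
    return 100.0 * matches / len(motif)


def is_variant(motif: str, candidate: str,
               min_identity: float = 40.0,
               max_differences=None) -> bool:
    """Apply either definition: a difference cap if given, else an identity floor."""
    if max_differences is not None:
        return substitution_count(motif, candidate) <= max_differences
    return percent_identity(motif, candidate) >= min_identity


# Hypothetical placeholder peptide, NOT a motif from the sequence listing.
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQR"  # one substitution (I -> L)

print(substitution_count(reference, candidate))
print(percent_identity(reference, candidate))
print(is_variant(reference, candidate, max_differences=1))
```

Under these assumptions, a candidate qualifies under the second definition when its substitution count does not exceed the chosen limit (1 to 10), and under the first when its identity meets the chosen threshold.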
[0025] Non-limiting examples of babyboom polynucleotides or polypeptides that can be used in the methods and compositions include the Arabidopsis thaliana AtBBM (SEQ ID NOs: 16 and 17), Brassica napus BnBBM1 (SEQ ID NOs: 18 and 19), Brassica napus BnBBM2 (SEQ ID NOs: 20 and 21), Medicago truncatula MtBBM (SEQ ID NOs: 22 and 23), Glycine max GmBBM (SEQ ID NOs: 24 and 25), Vitis vinifera VvBBM (SEQ ID NOs: 26 and 27), Zea mays ZmBBM (SEQ ID NOs: 1 and 2 and genomic sequence set forth in SEQ ID NO: 59; and SEQ ID NOs: 104 and 105 and genomic sequence set forth in SEQ ID NO: 101) and ZmBBM2 (SEQ ID NOs: 28 and 29), Oryza sativa OsBBM (polynucleotide sequences set forth in SEQ ID NOs: 30 and 103 and amino acid sequence set forth in SEQ ID NO: 31; genomic sequence set forth in SEQ ID NO: 102), OsBBM1 (SEQ ID NOs: 32 and 33), OsBBM2 (SEQ ID NOs: 34 and 35), and OsBBM3 (SEQ ID NOs: 36 and 37), Sorghum bicolor SbBBM (SEQ ID NOs: 38 and 39 and genomic sequence set forth in SEQ ID NO: 60) and SbBBM2 (SEQ ID NOs: 40 and 41) or active fragments or variants thereof. In particular embodiments, the cell proliferation factor is a maize BBM polypeptide (SEQ ID NO: 2, 29, or 105) or a variant or fragment thereof, or is encoded by a maize BBM polynucleotide (SEQ ID NO: 1, 28, or 104) or a variant or fragment thereof.
[0026] In some embodiments, a polynucleotide encoding a cell proliferation factor has a nucleotide sequence having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the nucleotide sequence set forth in SEQ ID NO: 1, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 59, 101, 102, 103, 104, or 60 or the cell proliferation factor has an amino acid sequence having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence set forth in SEQ ID NO: 2, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 105, or 41. In some of these embodiments, the cell proliferation factor has at least one of motifs 7 and 10 (SEQ ID NO: 9 and 12, respectively) or a variant or fragment thereof at the corresponding amino acid residue positions in the babyboom polypeptide. In other embodiments, the cell proliferation factor further comprises at least one of motif 14 (set forth in SEQ ID NO: 13), motif 15 (set forth in SEQ ID NO: 14), and motif 19 (set forth in SEQ ID NO: 15) or a variant or fragment thereof at the corresponding amino acid residue positions in the babyboom polypeptide.
[0027] In other embodiments, other cell proliferation factors, such as, Lec1, Kn1 family, WUSCHEL (e.g., WUS1, the polynucleotide and amino acid sequence of which is set forth in SEQ ID NO: 51 and 52; WUS2, the polynucleotide and amino acid sequence of which is set forth in SEQ ID NO: 57 and 58; WUS2 alt, the polynucleotide and amino acid sequence of which is set forth in SEQ ID NO: 99 and 100; WUS3, the polynucleotide and amino acid sequence of which is set forth in SEQ ID NO: 97 and 98), Zwille, and Aintegumenta (ANT), may be used alone, or in combination with a babyboom polypeptide or other cell proliferation factor to enhance targeted polynucleotide modification in plants. See, for example, U.S. Application Publication No. 2003/0135889, International Application Publication No. WO 03/001902, and U.S. Pat. No. 6,512,165, each of which is herein incorporated by reference. When multiple cell proliferation factors are used, or when a babyboom polypeptide is used along with any of the abovementioned polypeptides, the polynucleotides encoding each of the factors can be present on the same expression cassette or on separate expression cassettes. Likewise, the polynucleotide(s) encoding the cell proliferation factor(s) and the polynucleotide encoding the double-strand break-inducing enzyme can be located on the same or different expression cassettes. When two or more factors are coded for by separate expression cassettes, the expression cassettes can be provided to the plant simultaneously or sequentially.
[0028] In some embodiments, polynucleotides or polypeptides having homology to a known babyboom polynucleotide or polypeptide and/or sharing conserved functional domains can be identified by screening sequence databases using programs such as BLAST. The databases can be queried using full length sequences, or with fragments including, but not limited to, conserved domains or motifs. In some embodiments, the sequences retrieved from the search can be further characterized by alignment programs, including those programs described in more detail elsewhere herein, to quickly identify and compare conserved functional domains, regions of highest homology, and nucleotide and/or amino acid differences between sequences, including insertions, deletions, or substitutions. The retrieved sequences can also be evaluated using a computer program to analyze and output the phylogenetic relationship between the sequences.
[0029] In other embodiments, polynucleotides or polypeptides having homology to a known babyboom polynucleotide or polypeptide and/or sharing conserved functional domains can be identified using standard nucleic acid hybridization techniques, such as those described in more detail elsewhere herein. Extensive guides on nucleic acid hybridization include Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology--Hybridization with Nucleic Acid Probes, Part I, Chapter 2 (Elsevier, N.Y.); Ausubel et al., eds. (1995) Current Protocols in Molecular Biology, Chapter 2 (Greene Publishing and Wiley-Interscience, N.Y.); and, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.).
[0030] According to the presently disclosed methods, cell proliferation factors are introduced into cells to enhance the modification of a target site within the cell. The terms "target site," and "target sequence," as used interchangeably herein, refer to a polynucleotide sequence present in a cell of an organism, such as a plant, that comprises at least one recognition sequence and/or a nick/cleavage site for a double-strand break-inducing enzyme. The target site may be part of the organism's native genome or integrated therein or may be present on an episomal polynucleotide. The genomic target sequence may be on any region of any chromosome, and may or may not be in a region encoding a protein or RNA. The target site may be native to the cell or heterologous. In some embodiments, the heterologous target sequence may have been transgenically inserted into the organism's genome, and may be on any region of any chromosome, including an artificial or satellite chromosome, and may or may not be in a region encoding a protein or RNA. It is recognized that the cell or the organism may comprise multiple target sites, which may be located at one or multiple loci within or across chromosomes. Multiple independent manipulations of each target site in the organism can be performed using the presently disclosed methods.
[0031] The target sites comprise at least one recognition sequence. As used herein, the terms "recognition sequence" or "recognition site," used interchangeably herein, refer to any nucleotide sequence that is specifically recognized and/or bound by a double-strand break-inducing enzyme. The length of the recognition site sequence can vary, and includes, for example, sequences that are at least about 3, 4, 6, 8, 10, 12, 14, 16, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 80, 90, 100, or more nucleotides in length. In some embodiments, the recognition site is of a sufficient length to only be present in a genome of an organism one time. In some embodiments, the recognition site is palindromic, that is, the sequence on one strand reads the same in the opposite direction on the complementary strand. The double-strand break-inducing enzyme recognizes the recognition sequence and introduces a double-strand break at or near the recognition sequence. The nick/cleavage site could be within the sequence that is specifically recognized by the enzyme or the nick/cleavage site could be outside of the sequence that is specifically recognized by the enzyme. In some embodiments, the double-strand break is introduced about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, or more nucleotides away from the recognition sequence.
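The definition of a palindromic recognition site given above (the sequence on one strand reads the same in the opposite direction on the complementary strand) amounts to the sequence being equal to its own reverse complement. The following is a minimal sketch of that check. The function names are illustrative assumptions, and the example sequences are not taken from this disclosure: GAATTC is the familiar EcoRI site, and GATTACA is an arbitrary non-palindromic string.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True when the site reads identically on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)


print(is_palindromic("GAATTC"))   # EcoRI site, used only as an example
print(is_palindromic("GATTACA"))  # arbitrary non-palindromic string
```

An odd-length DNA sequence can never satisfy this test, because no base is its own complement at the central position.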
[0032] In some embodiments, the cleavage occurs at nucleotide positions immediately opposite each other to produce a blunt end cut or, in alternative embodiments, the cuts are staggered to produce single-stranded overhangs, also called "sticky ends", which can be either 5' overhangs, or 3' overhangs. The recognition sequence can be endogenous (native) or heterologous to the plant cell. When the recognition site is an endogenous sequence, it may be recognized by a naturally-occurring, or native double-strand break-inducing enzyme. Alternatively, an endogenous recognition sequence may be recognized and/or bound by a modified or engineered double-strand break-inducing enzyme designed or selected to specifically recognize the endogenous recognition sequence to produce a double-strand break.
[0033] A double-strand break-inducing enzyme is any enzyme that recognizes and/or binds to a specific recognition sequence to produce a double-strand break at or near the recognition sequence. The double-strand break could be due to the enzymatic activity of the enzyme itself or the enzyme might introduce a single-stranded nick in the DNA that then leads to a double-strand break induced by other cellular machinery (e.g., cellular repair mechanisms). Examples of double-strand break-inducing enzymes include, but are not limited to, endonucleases, site-specific recombinases, transposases, topoisomerases, and zinc finger nucleases, and include modified derivatives, variants, and fragments thereof. A modified double-strand break-inducing enzyme can be derived from a native, naturally-occurring double-strand break-inducing enzyme or it can be artificially created or synthesized. A modified double-strand break-inducing enzyme that is derived from a native, naturally-occurring double-strand break-inducing enzyme can be modified to recognize a recognition sequence that differs from that of its native form by at least one nucleotide. In certain embodiments, the double-strand break-inducing enzyme recognizes recognition sequences that are of a sufficient length to have only one copy in a genome of an organism.
[0034] In some embodiments, the double-strand break-inducing enzyme can be provided to an organism through the introduction of a polynucleotide encoding the enzyme. In some of these embodiments, the polynucleotide can be modified to at least partially optimize codon usage in the organism, such as plants. See, for example, Campbell and Gowri (1990) Plant Physiol. 92:1-11 for a discussion of host-preferred codon usage. Methods are available in the art for synthesizing plant-preferred genes. See, for example, U.S. Pat. Nos. 5,380,831, and 5,436,391, WO 99/25841, and Murray et al. (1989) Nucleic Acids Res. 17:477-498, herein incorporated by reference. Such polynucleotides wherein the frequency of codon usage has been designed to mimic the frequency of preferred codon usage of the host cell are referred to herein as being "codon-modified", "codon-preferred", or "codon-optimized." The polynucleotide encoding the cell proliferation factor, and in some embodiments, the polynucleotide of interest, can also be at least partially modified to optimize codon usage in the host cell or organism.
[0035] In some embodiments, the double-strand break-inducing enzyme is a transposase. Transposases are polypeptides that mediate transposition of a transposon from one location in the genome to another. Transposases typically induce double-strand breaks to excise the transposon, recognize subterminal repeats, and bring together the ends of the excised transposon; in some systems, other proteins are also required to bring together the ends during transposition. Examples of transposons and transposases include, but are not limited to, the Ac/Ds, Dt/rdt, Mu-M1/Mn, and Spm(En)/dSpm elements from maize, the Tam elements from snapdragon, the Mu transposon from bacteriophage, bacterial transposons (Tn) and insertion sequences (IS), Ty elements of yeast (retrotransposon), Ta1 elements from Arabidopsis (retrotransposon), the P element transposon from Drosophila (Gloor et al. (1991) Science 253:1110-1117), the Copia, Mariner and Minos elements from Drosophila, the Hermes elements from the housefly, the piggyBac elements from Trichoplusia ni, Tc1 elements from C. elegans, and IAP elements from mice (retrotransposon).
[0036] In other embodiments, the double-strand break-inducing enzyme is a DNA topoisomerase. DNA topoisomerases modulate DNA secondary and higher order structures and functions related primarily to replication, transcription, recombination and repair. Topoisomerases share two characteristics: (i) the ability to cleave and reseal the phosphodiester backbone of DNA in two successive transesterification reactions; and (ii) once a topoisomerase-cleaved DNA intermediate is formed, the enzyme allows the severed DNA ends to come apart, allowing the passage of another single- or double-stranded DNA segment. DNA topoisomerases can be classified into three evolutionarily independent families: type IA, type IB and type II.
[0037] Type IA and type IB topoisomerases cleave only a single strand of DNA. The Escherichia coli topoisomerase I and topoisomerase III, Saccharomyces cerevisiae topoisomerase III and reverse gyrase belong to the type IA or type I-5' subfamily, as the protein link is to a 5' phosphate in the DNA. The prototypes of the type IB or I-3' enzymes, in which the protein is attached to a 3' phosphate, are found in all eukaryotes and also include vaccinia virus topoisomerase I. Despite differences in mechanism and specificity between the bacterial and eukaryotic enzymes, yeast DNA topoisomerase I can complement a bacterial DNA topoisomerase I mutant (Bjornsti et al. (1987) Proc Natl Acad Sci USA 84:8971-5). Type IA topoisomerases relax negatively supercoiled DNA and require magnesium and a single-stranded region of DNA. Type IB topoisomerases relax both positively and negatively supercoiled DNA with equal efficiency and do not require a single-stranded region of DNA or metal ions for function.
[0038] The type II family of DNA topoisomerases are homodimeric (eukaryotic topoisomerase II) or tetrameric (gyrase) enzymes that cleave both strands of a DNA duplex. Type II topoisomerases include, but are not limited to, E. coli DNA gyrase, E. coli topoisomerase IV (parE), eukaryotic type II topoisomerases, and archaeal topoisomerase VI. Preferred cutting sites are known for available topoisomerases.
[0039] In particular embodiments, the double-strand break-inducing enzyme is an endonuclease. Endonucleases are enzymes that cleave the phosphodiester bond within a polynucleotide chain, and include restriction endonucleases that cleave DNA at specific sites without damaging the bases. Restriction endonucleases include Type I, Type II, Type III, and Type IV endonucleases, which further include various subtypes. In the Type I and Type III systems, a single protein complex has both methylase and restriction activities.
[0040] Type I and Type III restriction endonucleases recognize specific recognition sequences, but typically cleave at a variable position from the recognition site, which can be hundreds of base pairs away from the recognition site. In Type II systems, the restriction activity is independent of any methylase activity, and typically cleavage occurs at specific sites within or near to the recognition site. Most Type II enzymes cut palindromic sequences; however, Type IIa enzymes recognize non-palindromic recognition sites and cleave outside of the recognition site; Type IIb enzymes cut sequences twice with both sites outside of the recognition site; and Type IIs endonucleases recognize an asymmetric recognition site and cleave on one side and at a defined distance of about 1-20 nucleotides from the recognition site.
[0041] Type IV restriction enzymes target methylated DNA. Restriction enzymes are further described and classified, for example in the REBASE database (on the world wide web at rebase.neb.com; Roberts et al. (2003) Nucleic Acids Res 31:418-20; Roberts et al. (2003) Nucleic Acids Res 31:1805-12; and Belfort et al. (2002) in Mobile DNA II, pp. 761-783, Eds. Craigie, et al., ASM Press, Washington, D.C., each of which is herein incorporated by reference in its entirety).
[0042] Endonucleases that are suitable for use in the presently described methods and compositions include homing endonucleases, which, like restriction endonucleases, bind and cut polynucleotides at a specific recognition sequence; however, the recognition sequences for homing endonucleases are typically longer, about 18 bp or more. These sequences are predicted to occur naturally only infrequently in a genome, typically at only one or two sites per genome.
[0043] Homing endonucleases, also known as meganucleases, have been classified into four families based on conserved sequence motifs: the LAGLIDADG, GIY-YIG, H-N-H, and His-Cys box families. These motifs participate in the coordination of metal ions and hydrolysis of phosphodiester bonds. Homing endonucleases are notable for their long recognition sites, and for tolerating some sequence polymorphisms in their DNA substrates. The naming convention for homing endonucleases is similar to the convention for other restriction endonucleases. Homing endonucleases are also characterized by a prefix of F-, I-, or PI- for enzymes encoded by free-standing ORFs, introns, and inteins, respectively. For example, the intron-, intein-, and freestanding gene-encoded homing endonucleases from Saccharomyces cerevisiae are denoted I-SceI, PI-SceI, and F-SceII (HO endonuclease), respectively. Homing endonuclease domains, structure and function are known (see for example, Guhan and Muniyappa (2003) Crit Rev Biochem Mol Biol 38:199-248; Lucas et al. (2001) Nucleic Acids Res 29:960-9; Jurica and Stoddard (1999) Cell Mol Life Sci 55:1304-26; Stoddard (2006) Q Rev Biophys 38:49-95; and Moure et al. (2002) Nat Struct Biol 9:764, each of which is herein incorporated by reference). In some embodiments, a naturally occurring variant, and/or an engineered derivative homing endonuclease is used. The cleavage specificity of a homing endonuclease can be changed by rational design of amino acid substitutions at the DNA binding domain and/or combinatorial assembly and selection of mutated monomers (see, for example, Arnould et al. (2006) J Mol Biol 355:443-58; Ashworth et al. (2006) Nature 441:656-9; Doyon et al. (2006) J Am Chem Soc 128:2477-84; Rosen et al. (2006) Nucleic Acids Res 34:4791-800; and Smith et al. (2006) Nucleic Acids Res 34:e149, each of which is herein incorporated by reference). 
Engineered homing endonucleases have been shown to cleave cognate mutant sites without broadening their specificity. The endonuclease can be a modified endonuclease that binds a non-native or heterologous recognition sequence and does not bind a native or endogenous recognition sequence. An engineered or modified endonuclease can have only a single modified amino acid or many amino acid changes. Methods for modifying the kinetics, cofactor interactions, expression, optimal conditions, and/or recognition site specificity of homing endonucleases, and subsequently screening for activity are known, see for example, Epinat et al. (2003) Nucleic Acids Res 31:2952-62; Chevalier et al. (2002) Mol Cell 10:895-905; Gimble et al. (2003) J Mol Biol 334:993-1008; Seligman et al. (2002) Nucleic Acids Res 30:3870-9; Sussman et al. (2004) J Mol Biol 342:31-41; Rosen et al. (2006) Nucleic Acids Res 34:4791-800; Chames et al. (2005) Nucleic Acids Res 33:e178; Smith et al. (2006) Nucleic Acids Res 34:e149; Gruen et al. (2002) Nucleic Acids Res 30:e29; Chen and Zhao, (2005) Nucleic Acids Res 33:e154; U.S. Application Publication No. US2007/0117128; and International Application Publication Nos. WO 05/105989, WO 03/078619, WO 06/097854, WO 06/097853, WO 06/097784, WO 04/031346, WO 04/067753, and WO 07/047859, each of which is herein incorporated by reference in its entirety.
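The prefix convention described in this paragraph (F-, I-, and PI- for enzymes encoded by free-standing ORFs, introns, and inteins, respectively) can be illustrated with a minimal sketch. The function and dictionary names are hypothetical and are introduced for illustration only; the example enzyme names are those given above.

```python
# Prefix-to-origin mapping taken from the naming convention described above.
PREFIX_ORIGIN = {
    "PI-": "intein",
    "I-": "intron",
    "F-": "free-standing ORF",
}


def encoding_origin(enzyme_name: str) -> str:
    """Return the genetic context implied by a homing endonuclease prefix."""
    # The longest prefix is tested first so that matching does not depend
    # on incidental ordering of the shorter prefixes.
    for prefix in sorted(PREFIX_ORIGIN, key=len, reverse=True):
        if enzyme_name.startswith(prefix):
            return PREFIX_ORIGIN[prefix]
    raise ValueError(f"unrecognized homing endonuclease prefix: {enzyme_name!r}")


for name in ("I-SceI", "PI-SceI", "F-SceII"):
    print(name, "->", encoding_origin(name))
```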
[0044] Any homing endonuclease can be used as a double-strand break inducing agent including, but not limited to, I-SceI, I-SceII, I-SceIII, I-SceIV, I-SceV, I-SceVI, I-SceVII, I-CeuI, I-CeuAIIP, I-CreI, I-CrepsbIP, I-CrepsbIIP, I-CrepsbIIIP, I-CrepsbIVP, I-TliI, I-PpoI, PI-PspI, F-SceI, F-SceII, F-SuvI, F-TevI, F-TevII, I-AmaI, I-AniI, I-ChuI, I-CmoeI, I-CpaI, I-CpaII, I-CsmI, I-CvuI, I-CvuAIP, I-DdiI, I-DdiII, I-DirI, I-DmoI, I-HmuI, I-HmuII, I-HsNIP, I-LlaI, I-MsoI, I-NaaI, I-NanI, I-NclIP, I-NgrIP, I-NitI, I-NjaI, I-Nsp236IP, I-PakI, I-PboIP, I-PcuIP, I-PcuAI, I-PcuVI, I-PgrIP, I-PobIP, I-PorI, I-PorIIP, I-PbpIP, I-SpBetaIP, I-ScaI, I-SexIP, I-SneIP, I-SpomI, I-SpomCP, I-SpomIP, I-SpomIIP, I-SquIP, I-Ssp6803I, I-SthPhiJP, I-SthPhiST3P, I-SthPhiSTe3bP, I-TdeIP, I-TevI, I-TevII, I-TevIII, I-UarAP, I-UarHGPAIP, I-UarHGPA13P, I-VinIP, I-ZbiIP, PI-MtuI, PI-MtuHIP, PI-MtuHIIP, PI-PfuI, PI-PfuII, PI-PkoI, PI-PkoII, PI-Rma43812IP, PI-SpBetaIP, PI-SceI, PI-TfuI, PI-TfuII, PI-ThyI, PI-TliI, PI-TliII, or any variant or derivative thereof.
[0045] In still other embodiments, the double-strand break-inducing enzyme is a zinc finger nuclease. Zinc finger nucleases (ZFNs) are engineered double-strand break-inducing agents composed of a zinc finger DNA binding domain and a double-strand break-inducing enzymatic domain. Recognition site specificity is conferred by the zinc finger domain, which typically comprises two, three, four, or more zinc fingers, for example having a C2H2 structure; however, other zinc finger structures are known and have been engineered. Zinc finger domains are amenable to the design of polypeptides which specifically bind a selected polynucleotide recognition sequence. ZFNs consist of an engineered DNA-binding zinc finger domain linked to a non-specific endonuclease domain, for example, a nuclease domain from a Type IIs endonuclease such as FokI. Additional functionalities can be fused to the zinc-finger binding domain, including transcriptional activator domains, transcription repressor domains, and methylases. In some examples, dimerization of the nuclease domain is required for cleavage activity. Each zinc finger recognizes three consecutive base pairs in the target DNA. For example, a 3-finger domain recognizes a sequence of nine contiguous nucleotides; because the nuclease must dimerize, two sets of zinc finger triplets are used to bind an 18-nucleotide recognition sequence. A recognition sequence of 18 nucleotides is long enough to be unique in a genome ($4^{18} \approx 6.9 \times 10^{10}$).
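The arithmetic behind the uniqueness statement above can be sketched as follows. The sketch assumes three base pairs per finger and two dimerizing monomers, as described in this paragraph. The estimate of chance occurrences further assumes a uniformly random base composition with both strands counted, which real genomes do not satisfy, and the genome size used is an illustrative round figure rather than a value from this disclosure. All function names are hypothetical.

```python
BASES_PER_FINGER = 3  # each zinc finger recognizes three consecutive base pairs


def recognition_length(fingers_per_monomer: int, monomers: int = 2) -> int:
    """Total recognition sequence length for a dimerizing ZFN pair."""
    return BASES_PER_FINGER * fingers_per_monomer * monomers


def sequence_space(length: int) -> int:
    """Number of distinct DNA sequences of the given length (4 bases)."""
    return 4 ** length


def expected_random_occurrences(genome_size_bp: float, length: int) -> float:
    """Expected chance matches in a random genome, counting both strands.

    Simplifying assumption: bases are independent and equally frequent.
    """
    return 2 * genome_size_bp / sequence_space(length)


site_length = recognition_length(fingers_per_monomer=3)
print(site_length)                  # two 3-finger domains
print(sequence_space(site_length))  # 4**18

# Illustrative genome size only (assumed, roughly 2.3 billion base pairs).
print(expected_random_occurrences(2.3e9, site_length))
```

Under these assumptions, the expected number of chance matches for an 18-nucleotide site is well below one, which is the sense in which such a site is long enough to be unique.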
[0046] To date, designer zinc finger modules predominantly recognize GNN and ANN triplets (Dreier et al. (2001) J Biol Chem 276:29466-78; Dreier et al. (2000) J Mol Biol 303:489-502; Liu et al. (2002) J Biol Chem 277:3850-6, each of which is herein incorporated by reference), but examples using CNN or TNN triplets are also known (Dreier et al. (2005) J Biol Chem 280:35588-97; Jamieson et al. (2003) Nature Rev Drug Discov 2:361-8). See also, Durai et al. (2005) Nucleic Acids Res 33:5978-90; Segal (2002) Methods 26:76-83; Porteus and Carroll (2005) Nat Biotechnol 23:967-73; Pabo et al. (2001) Ann Rev Biochem 70:313-40; Wolfe et al. (2000) Ann Rev Biophys Biomol Struct 29:183-212; Segal and Barbas (2001) Curr Opin Biotechnol 12:632-7; Segal et al. (2003) Biochemistry 42:2137-48; Beerli and Barbas (2002) Nat Biotechnol 20:135-41; Mani et al. (2005) Biochem Biophys Res Comm 335:447-57; Lloyd et al. (2005) Proc Natl Acad Sci USA 102:2232-7; Carroll et al. (2006) Nature Protocols 1:1329; Ordiz et al. (2002) Proc Natl Acad Sci USA 99:13290-5; Guan et al. (2002) Proc Natl Acad Sci USA 99:13296-301; Townsend et al. (2009) Nature 459:442-445; Sander et al. (2008) Nucl Acids Res 37:509-515; Fu et al. (2009) Nucl Acids Res 37:D279-283; Maeder et al. (2008) Mol Cell 31:294-301; Wright et al. (2005) Plant J 44:693-705; Wright et al. (2006) Nat Prot 1:1637-1652; zinc-finger consortium (website at www-dot-zincfinger-dot-org); International Application Publication Nos. WO 02/099084; WO 00/42219; WO 02/42459; WO 03/062455; U.S. Application Publication Nos. 2003/0059767 and 2003/0108880; and U.S. Pat. Nos. 6,534,261, 7,262,054, 7,378,510, 7,151,201, 6,140,466, 6,511,808 and 6,453,242; each of which is herein incorporated by reference in its entirety.
[0047] Alternatively, engineered zinc finger DNA binding domains can be fused to other double-strand break-inducing enzymes or derivatives thereof that retain DNA nicking/cleaving activity. For example, this type of fusion can be used to direct the double-strand break-inducing enzyme to a different recognition site, to alter the location of the nick or cleavage site, to direct the inducing agent to a shorter recognition site, or to direct the inducing agent to a longer recognition site. In some embodiments, a zinc finger DNA binding domain is fused to a site-specific recombinase, transposase, topoisomerase, endonuclease, or a derivative thereof that retains DNA nicking and/or cleaving activity.
[0048] In some embodiments, a site-specific recombinase is used as the double-strand break-inducing enzyme. A site-specific recombinase, also referred to herein as a recombinase, is a polypeptide that catalyzes conservative site-specific recombination between its compatible recombination sites, and includes native polypeptides as well as derivatives, variants and/or fragments that retain activity, and native polynucleotides, derivatives, variants, and/or fragments that encode a recombinase that retains activity. The recombinase used in the methods and compositions can be a native recombinase or a biologically active fragment or variant of the recombinase. In some embodiments, the site-specific recombinase is a recombinantly produced enzyme or variant thereof, which catalyzes conservative site-specific recombination between specified DNA recombination sites. For reviews of site-specific recombinases and their recognition sites, see Sauer (1994) Curr Op Biotechnol 5:521-527; and Sadowski (1993) FASEB 7:760-767, each of which is herein incorporated by reference in its entirety.
[0049] Any recombinase system can be used in the methods and compositions. A recombinase can be provided via a polynucleotide that encodes the recombinase, a modified polynucleotide encoding the recombinase, or the polypeptide itself. Non-limiting examples of site-specific recombinases that can be used to produce a double-strand break at a recognition sequence include FLP, Cre, SSV1, lambda Int, phi C31 Int, HK022, R, Gin, Tn1721, CinH, ParA, Tn5053, Bxb1, TP901-1, U153, and other site-specific recombinases known in the art, including those described in Thomson and Ow (2006) Genesis 44:465-476, which is herein incorporated by reference in its entirety. Examples of site-specific recombination systems used in plants can be found in U.S. Pat. Nos. 5,929,301, 6,175,056, and 6,331,661; and International Application Publication Nos. WO 99/25821, WO 99/25855, WO 99/25841, and WO 99/25840, the contents of each of which are herein incorporated by reference.
[0050] In some embodiments, recombinases from the Integrase or Resolvase families are used, including biologically active variants and fragments thereof. The Integrase family of recombinases has over one hundred members and includes, for example, FLP, Cre, lambda integrase, and R. The Integrase family has been grouped into two classes based on the structure of the active sites: serine recombinases and tyrosine recombinases. The tyrosine family, which includes Cre, FLP, SSV1, and lambda integrase, uses the catalytic tyrosine's hydroxyl group for a nucleophilic attack on the phosphodiester bond of the DNA. Typically, members of the tyrosine family initially nick the DNA, which later forms a double-strand break. In the serine recombinase family, which includes phiC31 integrase, a conserved serine residue forms a covalent link to the DNA target site (Grindley et al. (2006) Ann Rev Biochem 16:16). For other members of the Integrase family, see, for example, Esposito et al. (1997) Nucleic Acids Res 25:3605-3614; and Abremski et al. (1992) Protein Eng 5:87-91; each of which is herein incorporated by reference in its entirety. Other recombination systems include, for example, the Streptomycete bacteriophage phi C31 (Kuhstoss et al. (1991) J Mol Biol 20:897-908); the SSV1 site-specific recombination system from Sulfolobus shibatae (Muskhelishvili et al. (1993) Mol Gen Genet 237:334-342); and a retroviral integrase-based integration system (Tanaka et al. (1998) Gene 17:67-76). In some embodiments, the recombinase does not require cofactors or a supercoiled substrate. Such recombinases include Cre, FLP, or active variants or fragments thereof.
[0051] The FLP recombinase is a protein that catalyzes a site-specific reaction that is involved in amplifying the copy number of the two-micron plasmid of S. cerevisiae during DNA replication. FLP recombinase catalyzes site-specific recombination between two FRT sites. The FLP protein has been cloned and expressed (Cox (1983) Proc Natl Acad Sci USA 80:4223-4227). The FLP recombinase for use in the methods and compositions may be derived from the genus Saccharomyces. In some embodiments, a recombinase polynucleotide modified to comprise more plant-preferred codons is used. A recombinant FLP enzyme encoded by a nucleotide sequence comprising maize preferred codons (FLPm) that catalyzes site-specific recombination events is known (the polynucleotide and polypeptide sequences of which are set forth in SEQ ID NO: 42 and 43, respectively; see, e.g., U.S. Pat. No. 5,929,301, which is herein incorporated by reference in its entirety). Thus, in some embodiments, the site-specific recombinase used in the methods and compositions has the sequence set forth in SEQ ID NO: 43 (FLP) or has at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater sequence identity to SEQ ID NO: 43. In some of those embodiments wherein the site-specific recombinase is provided to the cell through the introduction of a polynucleotide that encodes the site-specific recombinase, the polynucleotide has the sequence set forth in SEQ ID NO: 42 (FLPm) or has at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater sequence identity to SEQ ID NO: 42. Additional functional variants and fragments of FLP are known (Buchholz et al. (1998) Nat Biotechnol 16:657-662; Hartung et al. (1998) J Biol Chem 273:22884-22891; Saxena et al. (1997) Biochim Biophys Acta 1340:187-204; Hartley et al. (1980) Nature 286:860-864; Voziyanov et al. (2002) Nucleic Acids Res 30:1656-1663; Zhu & Sadowski (1995) J Biol Chem 270:23044-23054; and U.S. Pat. No. 7,238,854, each of which is herein incorporated by reference in its entirety).
[0052] The bacteriophage recombinase Cre catalyzes site-specific recombination between two lox sites. The Cre recombinase is known (Guo et al. (1997) Nature 389:40-46; Abremski et al. (1984) J Biol Chem 259:1509-1514; Chen et al. (1996) Somat Cell Mol Genet 22:477-488; Shaikh et al. (1997) J Biol Chem 272:5695-5702; and, Buchholz et al. (1998) Nat Biotechnol 16:657-662, each of which is herein incorporated by reference in its entirety). Cre polynucleotide sequences may also be synthesized using plant-preferred codons; such sequences (maize optimized Cre (moCre); the polynucleotide and polypeptide sequences of which are set forth in SEQ ID NO: 44 and 45, respectively) are described, for example, in International Application Publication No. WO 99/25840, which is herein incorporated by reference in its entirety. Thus, in some embodiments, the site-specific recombinase used in the methods and compositions has the sequence set forth in SEQ ID NO: 45 (Cre) or has at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater sequence identity to SEQ ID NO: 45. In some of those embodiments wherein the site-specific recombinase is provided to the cell through the introduction of a polynucleotide that encodes the site-specific recombinase, the polynucleotide has the sequence set forth in SEQ ID NO: 44 (moCre) or has at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater sequence identity to SEQ ID NO: 44. Variants of the Cre recombinase are known (see, for example U.S. Pat. No. 6,890,726; Rufer & Sauer (2002) Nucleic Acids Res 30:2764-2772; Wierzbicki et al. (1987) J Mol Biol 195:785-794; Petyuk et al. (2004) J Biol Chem 279:37040-37048; Hartung & Kisters-Woike (1998) J Biol Chem 273:22884-22891; Santoro & Schultz (2002) Proc Natl Acad Sci USA 99:4185-4190; Koresawa et al. (2000) J Biochem (Tokyo) 127:367-372; and Vergunst et al. (2000) Science 290:979-982, each of which is herein incorporated by reference in its entirety).
[0053] In some embodiments, a chimeric recombinase is used. A chimeric recombinase is a recombinant fusion protein which is capable of catalyzing site-specific recombination between recombination sites that originate from different recombination systems. For example, if the set of recombination sites comprises a FRT site and a LoxP site, a chimeric FLP/Cre recombinase or active variant or fragment thereof can be used, or both recombinases may be separately provided. Methods for the production and use of such chimeric recombinases or active variants or fragments thereof are described, for example, in International Application Publication No. WO 99/25840; and Shaikh & Sadowski (2000) J Mol Biol 302:27-48, each of which is herein incorporated by reference in its entirety.
[0054] In other embodiments, a variant recombinase is used. Methods for modifying the kinetics, cofactor interaction and requirements, expression, optimal conditions, and/or recognition site specificity, and screening for activity of recombinases and variants are known, see for example Miller et al. (1980) Cell 20:721-9; Lange-Gustafson and Nash (1984) J Biol Chem 259:12724-32; Christ et al. (1998) J Mol Biol 288:825-36; Lorbach et al. (2000) J Mol Biol 296:1175-81; Vergunst et al. (2000) Science 290:979-82; Dorgai et al. (1995) J Mol Biol 252:178-88; Dorgai et al. (1998) J Mol Biol 277:1059-70; Yagu et al. (1995) J Mol Biol 252:163-7; Sclimenti et al. (2001) Nucleic Acids Res 29:5044-51; Santoro and Schultz (2002) Proc Natl Acad Sci USA 99:4185-90; Buchholz and Stewart (2001) Nat Biotechnol 19:1047-52; Voziyanov et al. (2002) Nucleic Acids Res 30:1656-63; Voziyanov et al. (2003) J Mol Biol 326:65-76; Klippel et al. (1988) EMBO J 7:3983-9; Arnold et al. (1999) EMBO J 18:1407-14; and International Application Publication Nos. WO 03/08045, WO 99/25840, and WO 99/25841; each of which is herein incorporated by reference in its entirety. The recognition sites range from about 30 nucleotide minimal sites to a few hundred nucleotides.
[0055] By "recombination site" is intended a polynucleotide (native or synthetic/artificial) that is recognized by the recombinase enzyme of interest. As outlined above, many recombination systems are known in the art and one of skill will recognize the appropriate recombination site to be used with the recombinase of interest.
[0056] Non-limiting examples of recombination sites include FRT sites including, for example, the native FRT site (FRT1, SEQ ID NO:46), and various functional variants of FRT, including but not limited to, FRT5 (SEQ ID NO:47), FRT6 (SEQ ID NO:48), FRT7 (SEQ ID NO:49), FRT12 (SEQ ID NO: 53), and FRT87 (SEQ ID NO:50). See, for example, International Application Publication Nos. WO 03/054189, WO 02/00900, and WO 01/23545; and Schlake et al. (1994) Biochemistry 33:12745-12751, each of which is herein incorporated by reference. Recombination sites from the Cre/Lox site-specific recombination system can be used. Such recombination sites include, for example, native LOX sites and various functional variants of LOX.
[0057] In some embodiments, the recombination site is a functional variant of a FRT site or functional variant of a LOX site, any combination thereof, or any other combination of recombinogenic or non-recombinogenic recombination sites known. Functional variants include chimeric recombination sites, such as an FRT site fused to a LOX site (see, for example, Luo et al. (2007) Plant Biotech J 5:263-274, which is herein incorporated by reference in its entirety). Functional variants also include minimal sites (FRT and/or LOX alone or in combination). The minimal native FRT recombination site (SEQ ID NO: 46) has been characterized and comprises a series of domains comprising a pair of 11 base pair symmetry elements, which are the FLP binding sites; the 8 base pair core, or spacer, region; and the polypyrimidine tracts. In some embodiments, at least one modified FRT recombination site is used. Modified or variant FRT recombination sites are sites having mutations such as alterations, additions, or deletions in the sequence. The modifications include sequence modification at any position, including but not limited to, a modification in at least one of the 8 base pair spacer domain, a symmetry element, and/or a polypyrimidine tract. FRT variants include minimal sites (see, e.g., Broach et al. (1982) Cell 29:227-234; Senecoff et al. (1985) Proc Natl Acad Sci USA 82:7270-7274; Gronostajski & Sadowski (1985) J Biol Chem 260:12320-12327; Senecoff et al. (1988) J Mol Biol 201:405-421; and International Application Publication No. WO99/25821), and sequence variants (see, for example, Schlake & Bode (1994) Biochemistry 33:12746-12751; Seibler & Bode (1997) Biochemistry 36:1740-1747; Umlauf & Cox (1988) EMBO J 7:1845-1852; Senecoff et al. (1988) J Mol Biol 201:405-421; Voziyanov et al. (2002) Nucleic Acids Res 30:7; International Application Publication Nos. WO 07/011733, WO 99/25854, WO 99/25840, WO 99/25855, WO 99/25853 and WO 99/25821; and U.S. Pat. Nos. 7,060,499 and 7,476,539; each of which is herein incorporated by reference in its entirety).
[0058] An analysis of the recombination activity of variant LOX sites is presented in Lee et al. (1998) Gene 216:55-65 and in U.S. Pat. No. 6,465,254. Also, see for example, Huang et al. (1991) Nucleic Acids Res 19:443-448; Sadowski (1995) In Progress in Nucleic Acid Research and Molecular Biology Vol. 51, pp. 53-91; U.S. Pat. No. 6,465,254; Cox (1989) In Mobile DNA, Berg and Howe (eds) American Society of Microbiology, Washington D.C., pp. 116-670; Dixon et al. (1995) Mol Microbiol 18:449-458; Buchholz et al. (1996) Nucleic Acids Res 24:3118-3119; Kilby et al. (1993) Trends Genet 9:413-421; Rossant & Nagy (1995) Nat Med 1:592-594; Albert et al. (1995) Plant J 7:649-659; Bayley et al. (1992) Plant Mol Biol 18:353-361; Odell et al. (1990) Mol Gen Genet 223:369-378; Dale & Ow (1991) Proc Natl Acad Sci USA 88:10558-10562; Qin et al. (1994) Proc Natl Acad Sci USA 91:1706-1710; Stuurman et al. (1996) Plant Mol Biol 32:901-913; Dale et al. (1990) Gene 91:79-85; and International Application Publication No. WO 01/11058; each of which is herein incorporated by reference in its entirety.
[0059] Naturally occurring recombination sites or biologically active variants thereof are of use. Methods to determine if a modified recombination site is recombinogenic are known (see, for example, International Application Publication No. WO 07/011733, which is herein incorporated by reference in its entirety). Variant recognition sites are known, see for example, Hoess et al. (1986) Nucleic Acids Res 14:2287-300; Albert et al. (1995) Plant J 7:649-59; Thomson et al. (2003) Genesis 36:162-7; Huang et al. (1991) Nucleic Acids Res 19:443-8; Seibler and Bode (1997) Biochemistry 36:1740-7; Schlake and Bode (1994) Biochemistry 33:12746-51; Thyagarajan et al. (2001) Mol Cell Biol 21:3926-34; Umlauf and Cox (1988) EMBO J 7:1845-52; Lee and Saito (1998) Gene 216:55-65; International Application Publication Nos. WO 01/23545, WO 99/25851, WO 01/11058, WO 01/07572; and U.S. Pat. No. 5,888,732; each of which is herein incorporated by reference in its entirety.
[0060] The recombination sites employed in the methods and compositions can be identical or dissimilar sequences. Recombination sites with dissimilar sequences can be either recombinogenic or non-recombinogenic with respect to one another.
[0061] By "recombinogenic" is intended that the set of recombination sites (i.e., dissimilar or corresponding) are capable of recombining with one another. Alternatively, by "non-recombinogenic" is intended that the set of recombination sites, in the presence of the appropriate recombinase, will not recombine with one another or that recombination between the sites is minimal. Accordingly, it is recognized that any suitable set of non-recombinogenic and/or recombinogenic recombination sites may be utilized, including a FRT site or functional variant thereof, a LOX site or functional variant thereof, any combination thereof, or any other combination of non-recombinogenic and/or recombinogenic recombination sites known in the art.
[0062] In some embodiments, the recombination sites are asymmetric, and the orientation of any two sites relative to each other will determine the recombination reaction product. Directly repeated recombination sites are those recombination sites in a set of recombinogenic recombination sites that are arranged in the same orientation, such that recombination between these sites results in excision, rather than inversion, of the intervening DNA sequence. Inverted recombination sites are those recombination sites in a set of recombinogenic recombination sites that are arranged in the opposite orientation, so that recombination between these sites results in inversion, rather than excision, of the intervening DNA sequence.
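The dependence of the reaction product on site orientation can be illustrated with a short sketch. This is a toy string model only, not a description of any enzyme mechanism: `SITE` is a hypothetical asymmetric sequence (not a real FRT or LOX site), the function names `reverse_complement` and `recombine` are assumptions introduced for illustration, and the DNA is represented as a single strand read 5' to 3'.

```python
# Toy model of how site orientation decides the recombination product.
# SITE is a hypothetical asymmetric sequence, not a real FRT or LOX site.
SITE = "ATTGCA"

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


# The same site placed in the opposite orientation.
SITE_RC = reverse_complement(SITE)


def recombine(dna: str) -> str:
    """Recombine the first forward site with the next site downstream."""
    first = dna.find(SITE)
    if first == -1:
        raise ValueError("no forward-oriented site found")
    end_first = first + len(SITE)
    direct = dna.find(SITE, end_first)
    inverted = dna.find(SITE_RC, end_first)
    if direct != -1 and (inverted == -1 or direct < inverted):
        # Directly repeated sites: the intervening DNA and one copy of
        # the site are excised, leaving a single site behind.
        return dna[:end_first] + dna[direct + len(SITE):]
    if inverted != -1:
        # Inverted sites: the intervening DNA is flipped in place.
        middle = dna[end_first:inverted]
        return dna[:end_first] + reverse_complement(middle) + dna[inverted:]
    raise ValueError("no second site found downstream of the first")


direct_repeat = "GGG" + SITE + "CCCC" + SITE + "GGG"
inverted_repeat = "GGG" + SITE + "AACC" + SITE_RC + "GGG"
excised = recombine(direct_repeat)
flipped = recombine(inverted_repeat)
```

In this model, `excised` retains one site and loses the intervening `CCCC`, while `flipped` keeps both sites and carries the reverse complement of the intervening `AACC`.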
[0063] Fragments and variants of the polynucleotides encoding double-strand break-inducing enzymes and cell proliferation factors and fragments and variants of the double-strand break-inducing enzymes and cell proliferation proteins can be used in the methods and compositions. By "fragment" is intended a portion of the polynucleotide and hence the protein encoded thereby or a portion of the polypeptide. Fragments of a polynucleotide may encode protein fragments that retain the biological activity of the native protein and hence implement a double-strand break (double-strand break-inducing enzyme) or stimulate cell growth (cell proliferation factor). Thus, fragments of a polynucleotide may range from at least about 20 nucleotides, about 50 nucleotides, about 100 nucleotides, about 500 nucleotides, about 1000 nucleotides, and up to the full-length polynucleotide encoding a double-strand break-inducing enzyme or cell proliferation factor.
[0064] A fragment of a polynucleotide that encodes a biologically active portion of a double-strand break-inducing enzyme or a cell proliferation protein will encode at least about 15, 25, 30, 50, 100, 150, 200, 250, 300, 320, 350, 375, 400, or 500 contiguous amino acids, or up to the total number of amino acids present in a full-length double-strand break-inducing enzyme or cell proliferation protein used in the methods or compositions.
[0065] A biologically active portion of a double-strand break-inducing enzyme or cell proliferation protein can be prepared by isolating a portion of one of the polynucleotides encoding the portion of the double-strand break-inducing enzyme or cell proliferation polypeptide and expressing the encoded portion of the double-strand break-inducing enzyme or cell proliferation protein, and assessing the activity of the portion of the double-strand break-inducing enzyme or cell proliferation factor. Polynucleotides that encode fragments of a double-strand break-inducing enzyme or cell proliferation polypeptide can comprise nucleotide sequence comprising at least about 15, 20, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 800, 900, 1,000, 1,100, or 1,500 nucleotides, or up to the number of nucleotides present in a full-length double-strand break-inducing enzyme or cell proliferation factor nucleotide sequence disclosed herein.
[0066] "Variant" sequences have a high degree of sequence similarity. For polynucleotides, conservative variants include those sequences that, because of the degeneracy of the genetic code, encode the amino acid sequence of one of the native recombinase polypeptides. Variants such as these can be identified with the use of well-known molecular biology techniques, such as, for example, with polymerase chain reaction (PCR) and hybridization techniques. Variant polynucleotides also include synthetically derived nucleotide sequences, such as those generated, for example, by using site-directed mutagenesis but which still encode a biologically active protein, such as a double-strand break inducing agent or a cell proliferation factor. Generally, variants of a particular polynucleotide will have at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to that particular polynucleotide as determined by known sequence alignment programs and parameters.
[0067] Variants of a particular polynucleotide (i.e., the reference polynucleotide) can also be evaluated by comparison of the percent sequence identity between the polypeptide encoded by a variant polynucleotide and the polypeptide encoded by the reference polynucleotide. Thus, for example, isolated polynucleotides that encode a polypeptide with a given percent sequence identity to the recombinase are known in the art. Percent sequence identity between any two polypeptides can be calculated using the sequence alignment programs and parameters described elsewhere herein. Where any given pair of polynucleotides is evaluated by comparison of the percent sequence identity shared by the two polypeptides they encode, the percent sequence identity between the two encoded polypeptides is at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more.
[0068] A variant protein can be derived from the native protein by deletion (so-called truncation) or addition of one or more amino acids to the N-terminal and/or C-terminal end of the native protein; deletion or addition of one or more amino acids at one or more sites in the native protein; or substitution of one or more amino acids at one or more sites in the native protein. Variant proteins are biologically active, that is they continue to possess the desired biological activity of the native protein, that is, introduce a double-strand break at or near a recognition sequence (double-strand break-inducing enzyme) or stimulate cell growth (cell proliferation factor). Such variants may result from, for example, genetic polymorphism or from human manipulation. Biologically active variants of a native double-strand break-inducing protein or cell proliferation factor will have at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence for the native protein as determined by known sequence alignment programs and parameters. A biologically active variant of a protein may differ from that protein by as few as 1-15 amino acid residues, as few as 1-10, such as 6-10, as few as 5, as few as 4, 3, 2, or even 1 amino acid residue.
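The percent sequence identity thresholds and residue-difference counts recited above can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned by an external program and are supplied as equal-length strings with `-` marking gaps. The function names, the example peptide strings, and the choice of denominator (all aligned columns, excluding columns that are gaps in both sequences) are assumptions for illustration; alignment programs differ in how they define the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a precomputed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def count_differences(aligned_a: str, aligned_b: str) -> int:
    """Number of aligned columns at which the two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)


# Made-up peptide strings used only to exercise the functions.
native = "MKTAYIAKQR"
variant = "MKTAYLAKQR"
identity = percent_identity(native, variant)
differences = count_differences(native, variant)
meets_90_percent = identity >= 90.0
```

Here the variant differs from the made-up native string at a single residue, so it would satisfy a 90% identity threshold under the stated denominator convention.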
[0069] The introduction of a cell proliferation factor into a cell can also enhance the rate of targeted integration of a polynucleotide of interest. In these methods, at least one cell proliferation factor is introduced into a cell and a double-strand break-inducing enzyme is introduced, along with a transfer cassette comprising the polynucleotide of interest. As used herein, a "transfer cassette" refers to a polynucleotide that can be introduced into a cell, wherein the polynucleotide comprises a polynucleotide of interest that is to be inserted into a target site of a cell. The introduction of a double-strand break can result in the integration of the polynucleotide of interest through non-homologous end joining or, if the transfer cassette comprises at least one region of homology to the target site, through homologous recombination.
[0070] Homology indicates at least two sequences that have structural similarity such that they are recognized as being structurally or functionally related sequences. For example, homology indicates that two polynucleotide sequences have sufficient structural similarity to act as substrates for a homologous recombination reaction. Homology can be described or identified by any known means. In some examples, homology is described using percent sequence identity or sequence similarity, for example by using computer implemented algorithms to search or measure the sequence identity and similarity. Sequence identity or similarity may exist over the full length of a sequence, or may be less evenly distributed, for example it may be significantly higher in a conserved domain region.
[0071] The amount of homology or sequence identity shared by two sequences can vary and includes total lengths and/or regions having unit integral values in the ranges of about 1-20 bp, 20-50 bp, 50-100 bp, 75-150 bp, 100-250 bp, 150-300 bp, 200-400 bp, 250-500 bp, 300-600 bp, 350-750 bp, 400-800 bp, 450-900 bp, 500-1000 bp, 600-1250 bp, 700-1500 bp, 800-1750 bp, 900-2000 bp, 1-2.5 kb, 1.5-3 kb, 2-4 kb, 2.5-5 kb, 3-6 kb, 3.5-7 kb, 4-8 kb, 5-10 kb, or up to and including the total length of the target site. These ranges include every integer within the range, for example, the range of 1-20 bp includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 and 20 bp. The amount of homology can also be described by percent sequence identity over the full aligned length of the two polynucleotides, which includes percent sequence identity of at least about 50%, 55%, 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%. Sufficient homology includes any combination of polynucleotide length, global percent sequence identity, and optionally conserved regions of contiguous nucleotides or local percent sequence identity, for example sufficient homology can be described as a region of 75-150 bp having at least 80% sequence identity to a region of the target locus.
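The example criterion at the end of the preceding paragraph (a region of 75-150 bp having at least 80% sequence identity to a region of the target locus) can be expressed as a simple check. The sketch below is illustrative only and makes several assumptions: the two regions are compared position by position without gaps, the function names and default thresholds are hypothetical, and the test sequences are synthetic repeats rather than real genomic sequence.

```python
def ungapped_identity(a: str, b: str) -> float:
    """Percent identity of two equal-length sequences, compared without gaps."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def has_sufficient_homology(
    cassette_region: str,
    target_region: str,
    min_len: int = 75,
    max_len: int = 150,
    min_identity: float = 80.0,
) -> bool:
    """Apply the example criterion: length in range and identity at threshold."""
    if len(cassette_region) != len(target_region):
        return False
    if not (min_len <= len(cassette_region) <= max_len):
        return False
    return ungapped_identity(cassette_region, target_region) >= min_identity


# Helper that changes every base in the first n positions (synthetic data only).
_SWAP = {"A": "C", "C": "A", "G": "T", "T": "G"}


def mutate_prefix(seq: str, n: int) -> str:
    return "".join(_SWAP[base] for base in seq[:n]) + seq[n:]


target = "ACGT" * 25                      # 100 bp synthetic target region
close_match = mutate_prefix(target, 10)   # 10 mismatches in 100 bp
poor_match = mutate_prefix(target, 30)    # 30 mismatches in 100 bp

close_ok = has_sufficient_homology(close_match, target)
poor_ok = has_sufficient_homology(poor_match, target)
short_ok = has_sufficient_homology(target[:50], target[:50])
```

Other combinations of length and identity recited in the paragraph can be explored by changing the keyword arguments.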
[0072] Homology can also be described by the predicted ability of two polynucleotides to specifically hybridize under high stringency conditions, which is described elsewhere herein (see, for example, Sambrook, et al., (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, N.Y.; Current Protocols in Molecular Biology, Ausubel, et al., Eds (1994) Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc; and, Tijssen, (1993) Laboratory Techniques in Biochemistry and Molecular Biology--Hybridization with Nucleic Acid Probes, Elsevier, New York).
[0073] In those embodiments wherein the transfer cassette comprises at least one region of homology to a region of the target site, there is sufficient homology between the two regions to allow for homologous recombination to occur between the transfer cassette and the target site. In some embodiments, the transfer cassette comprises a first region of homology to the target site, which can be the recognition sequence, and the polynucleotide of interest. In other embodiments, the transfer cassette comprises a first region of homology to the target site, a polynucleotide of interest, and a second region of homology to the target site. In some of these embodiments, the regions of homology are recombination sites and the double-strand break-inducing enzyme is a site-specific recombinase, such as FLP, Cre, SSV1, R, lambda Int, phiC31, or HK022. The first and the second recombination site can be recombinogenic or non-recombinogenic with respect to one another. In other embodiments, the region(s) of homology of the transfer cassette to the target site are homologous to other regions of the target site, which can comprise genomic sequence.
[0074] In specific embodiments, the double-strand break-inducing enzyme that is introduced into a cell along with at least one cell proliferation factor is a site-specific recombinase, the target site of the cell comprises a first recombination site, and a transfer cassette comprising a second recombination site and a polynucleotide of interest is further introduced into the cell. When the first and the second recombination sites are recombinogenic with each other in the presence of the site-specific recombinase, the polynucleotide of interest can be inserted at the target site. The first and the second recombination sites can be identical or dissimilar.
[0075] In other specific embodiments, the introduction of at least one cell proliferation factor into a cell can also enhance the rate of insertion of a polynucleotide of interest into a target site in a cell, wherein the target site comprises a first and a second recombination site that are dissimilar and non-recombinogenic with respect to one another, wherein the recombination sites flank a nucleotide sequence, through the further introduction of a site-specific recombinase, and a transfer cassette comprising a third and a fourth recombination site flanking a polynucleotide of interest, wherein the third recombination site is recombinogenic with the first recombination site, and the fourth recombination site is recombinogenic with the second recombination site in the presence of the site-specific recombinase. The nucleotide sequence between the recombination sites of the target site will be exchanged with the polynucleotide of interest between the recombination sites of the transfer cassette.
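The exchange described in the preceding paragraph can be sketched as an operation on ordered lists of named elements. This is a simplified illustration under stated assumptions: the element names (including `FRT1` and `FRT87`, used here purely as labels for two dissimilar sites), the function name `cassette_exchange`, and the rule that a site is recombinogenic only with an identical site are all assumptions made for the sketch, not requirements of the methods.

```python
from typing import List


def cassette_exchange(target: List[str], cassette: List[str]) -> List[str]:
    """Swap the sequence between two target sites for the cassette payload.

    Assumes the cassette is [site_a, payload..., site_b] and that each site
    recombines only with an identical site found in the target.
    """
    site_a, *payload, site_b = cassette
    if site_a == site_b:
        raise ValueError("flanking sites must be dissimilar")
    try:
        i = target.index(site_a)
        j = target.index(site_b, i + 1)
    except ValueError:
        raise ValueError(
            "target lacks the matching pair of sites in the same order"
        ) from None
    # Keep the flanking sites, replace everything between them.
    return target[: i + 1] + payload + target[j:]


target_locus = ["promoter", "FRT1", "original_sequence", "FRT87", "terminator"]
transfer_cassette = ["FRT1", "polynucleotide_of_interest", "FRT87"]
exchanged = cassette_exchange(target_locus, transfer_cassette)
```

After the call, the element between the two sites of the target is the polynucleotide of interest, and the regions outside the sites are unchanged.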
[0076] As used herein, the term "flanked by", when used in reference to the position of the recombination sites or regions of homology of the target site or the transfer cassette, refers to a position immediately adjacent to the sequence intended to be exchanged or inserted.
[0077] The recombination sites or regions of homology of the transfer cassette may be directly contiguous with the polynucleotide of interest or there may be one or more intervening sequences present between one or both ends of the polynucleotide of interest and the recombination sites or regions of homology. Intervening sequences of particular interest include linkers, adapters, selectable markers, additional polynucleotides of interest, promoters, and/or other sites that aid in vector construction or analysis. It is further recognized that the recombination sites or regions of homology can be contained within the polynucleotide of interest (e.g., within introns, coding sequence, or 5' and 3' untranslated regions).
[0078] A method to directly select a transformed cell or an organism (such as a plant or plant cell) is provided. The method comprises providing a cell or organism having a polynucleotide comprising a target site. The polynucleotide comprises, in the following order, a promoter and a target site. A transfer cassette is introduced into the cell or organism, where the transfer cassette comprises, in the following order, a first region of homology with the target site, a polynucleotide comprising a selectable marker not operably linked to a promoter, and a second region of homology with the target site. At least one cell proliferation factor (e.g., babyboom polypeptide) and a double-strand break-inducing enzyme are introduced into the cell or into the organism and the selectable marker is integrated into the target site. The cell or organism is then grown on the appropriate selective agent to recover the organism that has successfully undergone targeted integration of the selectable marker at the target site. In certain embodiments, the target site is stably integrated into the genome of the plant. In some of these embodiments, the genomic target site is a native genomic target site.
[0079] In specific embodiments of the method for directly selecting a transformed cell or an organism as described herein, the cell or the organism has a polynucleotide comprising, in the following order, a promoter and a target site that comprises a first and a second recombination site, wherein the first and the second recombination sites are dissimilar and non-recombinogenic with respect to one another. A transfer cassette is introduced into the cell or organism, wherein the transfer cassette comprises, in the following order, a first recombination site, a polynucleotide comprising a selectable marker not operably linked to a promoter, and a second recombination site, wherein the first and the second recombination sites are non-recombinogenic with respect to one another. A cell proliferation factor and a site-specific recombinase are introduced into the cell or organism and the selectable marker is integrated into the target site. The cell or organism is then grown to recover the organism with the targeted integration.
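The logic of the direct selection method described above can be sketched as follows. The sketch is a deliberately coarse model: loci are lists of named elements, "operably linked" is approximated as "immediately downstream of a promoter element", recombination sites and regions of homology are omitted, and it is assumed that a random insertion does not land next to an active promoter. All names are hypothetical.

```python
from typing import List

MARKER = "selectable_marker"  # promoterless marker carried by the cassette


def integrate_at_target(locus: List[str]) -> List[str]:
    """Targeted integration: the marker takes the place of the target site."""
    i = locus.index("target_site")
    return locus[:i] + [MARKER] + locus[i + 1:]


def integrate_randomly(locus: List[str], position: int) -> List[str]:
    """Random integration: the marker is inserted at an arbitrary position."""
    return locus[:position] + [MARKER] + locus[position:]


def marker_expressed(locus: List[str]) -> bool:
    """True when the marker sits immediately downstream of a promoter."""
    if MARKER not in locus:
        return False
    i = locus.index(MARKER)
    return i > 0 and locus[i - 1] == "promoter"


engineered_locus = ["promoter", "target_site", "terminator"]
other_locus = ["genomic_dna_1", "genomic_dna_2"]

targeted = integrate_at_target(engineered_locus)
random_event = integrate_randomly(other_locus, 1)

# Only events in which the marker is expressed survive on the selective agent.
survives_targeted = marker_expressed(targeted)
survives_random = marker_expressed(random_event)
```

In this model only the targeted event places the marker downstream of the promoter, which is why growth on the selective agent enriches for targeted integration.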
[0080] A selectable marker comprises a DNA segment that allows one to identify or select for or against a molecule or a cell that contains it, often under particular conditions. These markers can encode an activity, such as, but not limited to, production of RNA, peptide, or protein, or can provide a binding site for RNA, peptides, proteins, inorganic and organic compounds or compositions and the like. Examples of selectable markers include, but are not limited to, DNA segments that comprise restriction enzyme sites; DNA segments that encode products which provide resistance against otherwise toxic compounds (e.g., antibiotics, such as spectinomycin, ampicillin, kanamycin, tetracycline, Basta, neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT)); DNA segments that encode products which are otherwise lacking in the recipient cell (e.g., tRNA genes, auxotrophic markers); DNA segments that encode products which can be readily identified (e.g., phenotypic markers such as β-galactosidase, GUS; fluorescent proteins such as green fluorescent protein (GFP), cyan (CFP), yellow (YFP), red (RFP), and cell surface proteins); the generation of new primer sites for PCR (e.g., the juxtaposition of two DNA sequences not previously juxtaposed), the inclusion of DNA sequences not acted upon or acted upon by a restriction endonuclease or other DNA modifying enzyme, chemical, etc.; and, the inclusion of DNA sequences required for a specific modification (e.g., methylation) that allows its identification.
[0081] Additional selectable markers include genes that confer resistance to herbicidal compounds, such as glyphosate, sulfonylureas, glufosinate ammonium, bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D). See generally, Yarranton (1992) Curr. Opin. Biotech. 3:506-511; Christopherson et al. (1992) Proc. Natl. Acad. Sci. USA 89:6314-6318; Yao et al. (1992) Cell 71:63-72; Reznikoff (1992) Mol. Microbiol. 6:2419-2422; Barkley et al. (1980) in The Operon, pp. 177-220; Hu et al. (1987) Cell 48:555-566; Brown et al. (1987) Cell 49:603-612; Figge et al. (1988) Cell 52:713-722; Deuschle et al. (1989) Proc. Natl. Acad. Sci. USA 86:5400-5404; Fuerst et al. (1989) Proc. Natl. Acad. Sci. USA 86:2549-2553; Deuschle et al. (1990) Science 248:480-483; Gossen (1993) Ph.D. Thesis, University of Heidelberg; Reines et al. (1993) Proc. Natl. Acad. Sci. USA 90:1917-1921; Labow et al. (1990) Mol. Cell. Biol. 10:3343-3356; Zambretti et al. (1992) Proc. Natl. Acad. Sci. USA 89:3952-3956; Baim et al. (1991) Proc. Natl. Acad. Sci. USA 88:5072-5076; Wyborski et al. (1991) Nucleic Acids Res. 19:4647-4653; Hillen and Wissman (1989) Topics Mol. Struc. Biol. 10:143-162; Degenkolb et al. (1991) Antimicrob. Agents Chemother. 35:1591-1595; Kleinschnidt et al. (1988) Biochemistry 27:1094-1104; Bonin (1993) Ph.D. Thesis, University of Heidelberg; Gossen et al. (1992) Proc. Natl. Acad. Sci. USA 89:5547-5551; Oliva et al. (1992) Antimicrob. Agents Chemother. 36:913-919; Hlavka et al. (1985) Handbook of Experimental Pharmacology, Vol. 78 (Springer-Verlag, Berlin); Gill et al. (1988) Nature 334:721-724. Such disclosures are herein incorporated by reference. The above list of selectable markers is not meant to be limiting. Any selectable marker can be used in the methods and compositions.
[0082] The activity of various promoters at a characterized location in the genome of a cell or an organism can be determined. Thus, the desired activity and/or expression level of a nucleotide sequence of interest can be achieved, and promoters can be characterized for expression in the cell or the organism of interest.
[0083] In one embodiment, the method for assessing promoter activity in a cell or an organism comprises providing a cell or an organism comprising (e.g., in its genome) a target site having a first and a second recombination site, wherein the first and the second recombination sites are dissimilar and non-recombinogenic with respect to one another. A transfer cassette is introduced into the cell or the organism, where the transfer cassette comprises a promoter operably linked to a polynucleotide comprising a selectable marker and the transfer cassette is flanked by the first and the second recombination sites. At least one cell proliferation factor and a site-specific recombinase are provided, wherein the recombinase recognizes and implements recombination at the first and second recombination sites. Promoter activity is assessed by monitoring expression of the selectable marker. In this manner, different promoters can be integrated at the same position in the genome and their activity compared.
[0084] In some embodiments of the method for assessing promoter activity, the transfer cassette comprises in the following order: the first recombination site, a promoter operably linked to a third recombination site operably linked to a polynucleotide comprising a selectable marker, and the second recombination site, where the first, the second, and the third recombination sites are dissimilar and non-recombinogenic with respect to one another. This transfer cassette can be generically represented as RSa-P1::RSc::S1-RSb. Following the introduction of the transfer cassette at the target site, the activity of the promoter (P1) can be analyzed using methods known in the art. Once the activity of the promoter is characterized, additional transfer cassettes comprising a polynucleotide of interest flanked by the second and the third recombination site can be introduced into the organism. Upon recombination, the expression of the polynucleotide of interest will be regulated by the characterized promoter. Accordingly, organisms, such as plant lines, having promoters that achieve the desired expression levels in the desired tissues can be engineered so that nucleotide sequences of interest can be readily inserted downstream of the promoter and operably linked to the promoter and thereby expressed in a predictable manner.
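By way of illustration only, the sequential cassette exchange described above can be sketched as a toy model in which a locus is an ordered list of named elements. The function name cassette_exchange and the element labels (including "GOI" for a polynucleotide of interest) are hypothetical and are used only for this sketch; the model ignores all biochemistry and simply replaces the locus segment bounded by the two recombination sites that flank the incoming cassette.

```python
def cassette_exchange(locus, cassette):
    """Toy model: swap the locus segment bounded by the cassette's flanking sites.

    `locus` and `cassette` are lists of element names. The first and last
    elements of `cassette` are taken to be its flanking recombination sites.
    """
    left, right = cassette[0], cassette[-1]
    if left == right:
        # Dissimilar, non-recombinogenic sites are assumed in this sketch.
        raise ValueError("flanking recombination sites must be dissimilar")
    i = locus.index(left)          # raises ValueError if the site is absent
    j = locus.index(right, i + 1)  # the right site must lie downstream
    return locus[:i] + cassette + locus[j + 1:]


# Target site carrying the first (RSa) and second (RSb) recombination sites.
target = ["RSa", "RSb"]

# First transfer cassette: RSa-P1::RSc::S1-RSb.
locus_after_first = cassette_exchange(target, ["RSa", "P1", "RSc", "S1", "RSb"])

# Second transfer cassette flanked by the third (RSc) and second (RSb) sites.
locus_after_second = cassette_exchange(locus_after_first, ["RSc", "GOI", "RSb"])

print(locus_after_first)
print(locus_after_second)
```

In this sketch the second exchange leaves the characterized promoter P1 in place and positions the polynucleotide of interest downstream of it, which mirrors the arrangement the paragraph describes.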
[0085] It is further recognized that multiple promoters can be employed to regulate transcription at a single target site. In this method, the target site comprising the first and the second recombination sites is flanked by two convergent promoters. "Convergent promoters" refers to promoters that are oriented to face one another on either terminus of the target site. The same promoter, or different promoters may be used at the target site. Each of the convergent promoters is operably linked to either the first or the second recombination site. For example, the target site flanked by the convergent promoters can comprise P1→:R1-R2:←P2, where P is a promoter, the arrow indicates the direction of transcription, R is a recombination site, and the colon indicates the components are operably linked.
[0086] The transfer cassette employed with the target site having the convergent promoters can comprise, in the following order, the first recombination site, a first polynucleotide of interest orientated in the 5' to 3' direction, a second polynucleotide of interest orientated in the 3' to 5' direction, and a second recombination site. The insertion of the transfer cassette at the target site results in the first polynucleotide of interest operably linked to the first convergent promoter, and the second polynucleotide of interest operably linked to the second convergent promoter. The expression of the first and/or the second polynucleotide of interest may be increased or decreased in the cell or organism. The expression of the first and/or the second polynucleotide of interest may also be independently regulated depending upon which promoters are used. It is recognized that target sites can be flanked by other elements that influence transcription. For example, insulator elements can flank the target site to minimize position effects. See, for example, U.S. Publication No. 2005/0144665, herein incorporated by reference.
[0087] In further embodiments, methods are provided to identify a cis transcriptional regulatory region in an organism. By "transcriptional regulatory region" is intended any cis acting element that modulates the level of an RNA. Such elements include, but are not limited to, a promoter, an element of a promoter, an enhancer, an intron, or a terminator region that is capable of modulating the level of RNA in a cell. Thus, the methods find use in generating enhancer or promoter traps. In one embodiment, the reporter or marker gene of the target site is expressed only when it inserts close to (enhancer trap) or within (promoter trap) another gene. The expression pattern of the reporter gene will depend on the enhancer elements of the gene near or in which the reporter gene inserts. In this embodiment, the target site introduced into the cell or the organism can comprise a marker gene operably linked to a recombination site. In specific embodiments, the marker gene is flanked by dissimilar and non-recombinogenic recombination sites. The marker gene is either not operably linked to a promoter (promoter trap) or the marker gene is operably linked to a promoter that lacks enhancer elements (enhancer trap). Following insertion of the target site into the genome of the cell or the organism, the expression pattern of the marker gene is determined for each transformant. When a transformant with a marker gene expression pattern of interest is found, the enhancer/promoter trap sequences can be used as a probe to clone the gene that has that expression pattern, or alternatively to identify the promoter or enhancer regulating the expression. In addition, once a target site is integrated and under transcriptional control of a transcriptional regulatory element, methods can further be employed to introduce a transfer cassette having a polynucleotide of interest into that target in the cell or the organism. 
A recombination event between the target site and the transfer cassette will allow the nucleotide sequence of interest to come under the transcriptional control of the promoter and/or enhancer element. See, for example, Geisler et al. (2002) Plant Physiol 130:1747-1753; Topping et al. (1997) Plant Cell 10:1713-245; Friedrich et al. (1991) Genes Dev 5:1513-23; Dunn et al. (2003) Appl Environ Microbiol 1197-1205; and von Melchner et al. (1992) Genes Dev 6:919-27; all of which are herein incorporated by reference. In these methods, a cell proliferation factor (e.g., a babyboom polypeptide) is further introduced into the cell or organism to enhance recombination.
[0088] Further, methods are provided for locating preferred integration sites within the genome of a plant cell. Such methods comprise introducing into the plant cell a transfer cassette comprising in the following order: a first recombination site, a promoter active in the plant cell operably linked to a polynucleotide, and a second recombination site; wherein the first and second recombination sites are non-recombinogenic with respect to one another. A cell proliferation factor and site-specific recombinase that recognizes and implements recombination at the first and second recombination sites are introduced into the plant cell. The level of expression of the polynucleotide is determined using any method known in the art and the plant cell that is expressing the polynucleotide is selected.
[0089] Methods are also provided for the integration of multiple transfer cassettes at a target site in a cell. In some embodiments, the target site is constructed to have multiple sets of dissimilar and non-recombinogenic recombination sites. Thus, multiple genes or polynucleotides can be stacked or ordered. In specific embodiments, this method allows for the stacking of sequences of interest at precise locations in the genome of a cell or an organism. Likewise, once a target site has been established within a cell or an organism (for example, the target site can be stably integrated into the genome of the cell or organism), additional recombination sites may be introduced by incorporating such sites within the transfer cassette. Thus, once a target site has been established, it is possible to subsequently add sites or alter sites through recombination. Such methods are described in detail in International Application Publication No. WO 99/25821, herein incorporated by reference.
[0090] In one embodiment, the method comprises introducing into a cell having a target site comprising a first and a second recombination site a first transfer cassette comprising at least the first, a third, and the second recombination sites, wherein the first and the third recombination sites of the first transfer cassette flank a first polynucleotide of interest, and wherein the first, the second, and the third recombination sites are non-recombinogenic with respect to one another. Along with the first transfer cassette, a first site-specific recombinase is introduced into the cell, wherein the first site-specific recombinase recognizes and implements recombination at the first and the second recombination sites. A second transfer cassette is then introduced into the cell, comprising at least the second and the third recombination sites, wherein the second and the third recombination sites of the second transfer cassette flank a second polynucleotide of interest. In some embodiments, a single recombinase can recognize and implement recombination at the first and second recombination sites and at the second and third recombination sites. In other embodiments, along with the second transfer cassette, a second site-specific recombinase is introduced into the cell that recognizes and implements recombination at the second and the third recombination sites. The method further comprises introducing at least one cell proliferation factor to the cell before or during the introduction of the first recombinase, the second recombinase, or both the first and the second recombinase. 
In a related, alternative method, the cell has a target site comprising the first, second, and third recombination sites, the first transfer cassette comprises a first polynucleotide of interest flanked by the first and the second recombination sites, and the second transfer cassette comprises a second polynucleotide of interest flanked by at least the second and third recombination sites. A first and a second site-specific recombinase and a cell proliferation factor are introduced in a manner similar to the first method for the integration of multiple transfer cassettes described immediately above.
[0091] In other embodiments, methods are provided to minimize or eliminate expression resulting from random integration of DNA sequences into the genome of a cell or an organism, such as a plant. This method comprises providing a cell or an organism having stably incorporated into its genome a polynucleotide comprising the following components in the following order: a promoter active in the cell or the organism operably linked to an ATG translational start sequence operably linked to a target site comprising a first and a second functional recombination site, wherein the first and the second recombination sites are dissimilar and non-recombinogenic with respect to one another. A transfer cassette comprising a polynucleotide of interest flanked by the first and the second recombination site is introduced into the cell or the organism. The translational start sequence of the nucleotide sequence of interest in the transfer cassette has been replaced with the first recombination site. A cell proliferation factor (e.g., a babyboom polypeptide) and a recombinase are provided, wherein the recombinase recognizes and implements recombination at the recombination sites. Recombination with the target site results in the polynucleotide of interest being operably linked to the ATG translational start site of the target site contained in the polynucleotide. By operably linked is intended a fusion between adjacent elements; when used to refer to the linkage between a translational start, a promoter, and/or a recombination site, the term implies that the sequences are put together to generate an in-frame fusion that results in a properly expressed and functional gene product.
[0092] Methods for excising or inverting a polynucleotide of interest are provided. Such methods can comprise introducing into a cell having a target site comprising: a polynucleotide of interest flanked by a first and a second recombination site, wherein the first and the second sites are recombinogenic with respect to one another; at least one cell proliferation factor; and a double-strand break-inducing enzyme comprising a site-specific recombinase that recognizes and implements recombination at the first and the second recombination sites, thereby excising or inverting the polynucleotide of interest. Depending on the orientation of the recombination sites, the polynucleotide of interest will be excised or inverted when the appropriate recombinase is provided. For example, directly repeated recombination sites will allow for excision of the polynucleotide of interest and inverted repeats will allow for an inversion of the polynucleotide of interest.
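The dependence of the outcome on site orientation can be illustrated with a toy string model. This is a sketch under stated assumptions, not a description of any particular recombinase: the function names revcomp and recombine and the 7-base site "GATTACA" are hypothetical placeholders, and a real recombination site would be a longer, enzyme-specific sequence. A directly repeated second site leads to excision of the intervening segment (one site remains), while an inverted second site leads to inversion of the intervening segment.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def recombine(seq, site):
    """Toy model of recombination between the first two copies of `site`.

    Returns (outcome, product), where outcome is "excision" for directly
    repeated sites and "inversion" for inverted repeats.
    """
    if site == revcomp(site):
        raise ValueError("site must be asymmetric to have an orientation")
    first = seq.find(site)
    if first == -1:
        raise ValueError("first recombination site not found")
    after = first + len(site)
    direct = seq.find(site, after)
    inverted = seq.find(revcomp(site), after)
    if direct != -1 and (inverted == -1 or direct < inverted):
        # Direct repeats: drop the intervening segment and one site copy.
        return "excision", seq[:after] + seq[direct + len(site):]
    if inverted != -1:
        # Inverted repeats: flip the intervening segment in place.
        inner = seq[after:inverted]
        return "inversion", seq[:after] + revcomp(inner) + seq[inverted:]
    raise ValueError("second recombination site not found")


SITE = "GATTACA"  # hypothetical asymmetric site, for illustration only
direct_repeat = "AA" + SITE + "CCGT" + SITE + "TT"
inverted_repeat = "AA" + SITE + "CCGT" + revcomp(SITE) + "TT"

print(recombine(direct_repeat, SITE))
print(recombine(inverted_repeat, SITE))
```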
[0093] The cell proliferation factor, double-strand break-inducing enzyme or a polynucleotide encoding the same, and in some embodiments, a transfer cassette, are introduced into a cell or an organism according to the presently disclosed methods. "Introducing" is intended to mean presenting to the organism, such as a plant, or the cell the polynucleotide or polypeptide in such a manner that the sequence gains access to the interior of a cell of the organism or to the cell itself. The methods and compositions do not depend on a particular method for introducing a sequence into an organism, only that the polynucleotide or polypeptide gains access to the interior of at least one cell of the organism. Methods for introducing polynucleotides or polypeptides into plants are known in the art including, but not limited to, stable transformation methods, transient transformation methods, virus-mediated methods, and sexual breeding.
[0094] "Stable transformation" means that the nucleotide construct introduced into a host cell or an organism integrates into the genome of the host and is capable of being inherited by the progeny thereof. "Transient transformation" is intended to mean that a polynucleotide is introduced and does not integrate into the genome of the host or that a polypeptide is introduced into a host.
[0095] Protocols for introducing polypeptides or polynucleotide sequences into plants may vary depending on the type of plant or plant cell being targeted. Suitable methods of introducing polypeptides and polynucleotides into plant cells include microinjection (Crossway et al. (1986) Biotechniques 4:320-334), electroporation (Riggs et al. (1986) Proc. Natl. Acad. Sci. USA 83:5602-5606), Agrobacterium-mediated transformation (U.S. Pat. Nos. 5,563,055 and 5,981,840), direct gene transfer (Paszkowski et al. (1984) EMBO J. 3:2717-2722), and ballistic particle acceleration (see, for example, U.S. Pat. Nos. 4,945,050; 5,879,918; 5,886,244; and, 5,932,782; Tomes et al. (1995) in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg and Phillips (Springer-Verlag, Berlin); McCabe et al. (1988) Biotechnology 6:923-926); and Lec1 transformation (WO 00/28058). Also see Weissinger et al. (1988) Ann. Rev. Genet. 22:421-477; Sanford et al. (1987) Particulate Science and Technology 5:27-37 (onion); Christou et al. (1988) Plant Physiol. 87:671-674 (soybean); McCabe et al. (1988) Bio/Technology 6:923-926 (soybean); Finer and McMullen (1991) In Vitro Cell Dev. Biol. 27P:175-182 (soybean); Singh et al. (1998) Theor. Appl. Genet. 96:319-324 (soybean); Datta et al. (1990) Biotechnology 8:736-740 (rice); Klein et al. (1988) Proc. Natl. Acad. Sci. USA 85:4305-4309 (maize); Klein et al. (1988) Biotechnology 6:559-563 (maize); U.S. Pat. Nos. 5,240,855; 5,322,783; and, 5,324,646; Klein et al. (1988) Plant Physiol. 91:440-444 (maize); Fromm et al. (1990) Biotechnology 8:833-839 (maize); Hooykaas-Van Slogteren et al. (1984) Nature (London) 311:763-764; U.S. Pat. No. 5,736,369 (cereals); Bytebier et al. (1987) Proc. Natl. Acad. Sci. USA 84:5345-5349 (Liliaceae); De Wet et al. (1985) in The Experimental Manipulation of Ovule Tissues, ed. Chapman et al. (Longman, N.Y.), pp. 197-209 (pollen); Kaeppler et al. (1990) Plant Cell Reports 9:415-418 and Kaeppler et al. (1992) Theor. 
Appl. Genet. 84:560-566 (whisker-mediated transformation); D'Halluin et al. (1992) Plant Cell 4:1495-1505 (electroporation); Li et al. (1993) Plant Cell Reports 12:250-255 and Christou and Ford (1995) Annals of Botany 75:407-413 (rice); Osjoda et al. (1996) Nature Biotechnology 14:745-750 (maize via Agrobacterium tumefaciens); all of which are herein incorporated by reference.
[0096] In specific embodiments, the sequences can be provided to a plant using a variety of transient transformation methods. Such transient transformation methods include, but are not limited to, the introduction of the double-strand break-inducing enzyme or cell proliferation protein or variants and fragments thereof directly into the plant or the introduction of a double-strand break-inducing enzyme or cell proliferation factor transcript into the plant. Such methods include, for example, microinjection or particle bombardment. See, for example, Crossway et al. (1986) Mol Gen. Genet. 202:179-185; Nomura et al. (1986) Plant Sci. 44:53-58; Hepler et al. (1994) Proc. Natl. Acad. Sci. 91: 2176-2180 and Hush et al. (1994) The Journal of Cell Science 107:775-784, all of which are herein incorporated by reference. Alternatively, the polynucleotide can be transiently transformed into the plant using techniques known in the art. Such techniques include viral vector systems and the precipitation of the polynucleotide in a manner that precludes subsequent release of the DNA. Thus, transcription from the particle-bound DNA can occur, but the frequency with which it is released to become integrated into the genome is greatly reduced. Such methods include the use of particles coated with polyethylenimine (PEI; Sigma # P3143).
[0097] In other embodiments, the polynucleotide may be introduced into plants by contacting plants with a virus or viral nucleic acids. Generally, such methods involve incorporating a nucleotide construct within a viral DNA or RNA molecule. It is recognized that the double-strand break-inducing enzyme or cell proliferation factor may be initially synthesized as part of a viral polyprotein, which later may be processed by proteolysis in vivo or in vitro to produce the desired recombinant protein. Further, it is recognized that promoters also encompass promoters utilized for transcription by viral RNA polymerases. Methods for introducing polynucleotides into plants and expressing a protein encoded therein, involving viral DNA or RNA molecules, are known in the art. See, for example, U.S. Pat. Nos. 5,889,191, 5,889,190, 5,866,785, 5,589,367, 5,316,931, and Porta et al. (1996) Molecular Biotechnology 5:209-221; herein incorporated by reference.
[0098] The polynucleotides can be provided in a DNA construct. In addition, in specific embodiments, recognition sequences and/or the polynucleotide encoding an appropriate double-strand break-inducing enzyme are also contained in the DNA construct. The construct can include 5' and 3' regulatory sequences operably linked to the polynucleotide of interest. Generally, operably linked means that the nucleic acid sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame. However, it is recognized that intervening sequences can be present between operably linked elements and not disrupt the functional linkage. For example, an operable linkage between a promoter and a polynucleotide of interest comprises a linkage that allows for the promoter sequence to initiate and mediate transcription of the polynucleotide of interest. When used to refer to the linkage between a translational start and a recombination site, the term operably linked implies that the sequences are put together to generate an in-frame fusion that results in a properly expressed and functional gene product. Similarly, when used to refer to the linkage between a promoter and a recombination site, the linkage will allow for the promoter to transcribe a downstream nucleotide sequence. The cassette may additionally contain at least one additional gene to be introduced into the organism. Alternatively, the additional gene(s) can be provided on multiple DNA constructs.
[0099] Such a DNA construct may be provided with a plurality of restriction sites, recognition sequences, or recombination sites for insertion of the polynucleotide to be under the transcriptional regulation of the regulatory regions. The expression cassette may additionally contain selectable marker genes.
[0100] In some embodiments, the DNA construct can include in the 5' to 3' direction of transcription, a transcriptional and translational initiation region, a polynucleotide of interest, and a transcriptional and translational termination region functional in the organism of interest.
[0101] The transcriptional initiation region, the promoter, may be native, analogous, foreign, or heterologous to the host organism, and/or to the polynucleotide of interest. Additionally, the promoter may be the natural sequence or alternatively a synthetic sequence. Such constructs may change expression levels of the polynucleotide of interest in the organism.
[0102] The termination region may be native or heterologous with the transcriptional initiation region, it may be native or heterologous with the operably linked polynucleotide of interest, or it may be native or heterologous with the host organism. Convenient termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also Guerineau et al. (1991) Mol. Gen. Genet. 262:141-144; Proudfoot (1991) Cell 64:671-674; Sanfacon et al. (1991) Genes Dev. 5:141-149; Mogen et al. (1990) Plant Cell 2:1261-1272; Munroe et al. (1990) Gene 91:151-158; Ballas et al. (1989) Nucleic Acids Res. 17:7891-7903; and Joshi et al. (1987) Nucleic Acids Res. 15:9627-9639. The polynucleotide of interest can also be native or analogous or foreign or heterologous to the host organism.
[0103] Sequence modifications in addition to codon optimization are known to enhance gene expression in a cellular host. These include elimination of spurious polyadenylation signals, exon-intron splice site signals, transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression. The G-C content of the sequence may be adjusted to levels average for a given cellular host, as calculated by reference to known genes expressed in the host cell. When possible, the sequence is modified to avoid predicted hairpin secondary mRNA structures.
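As a non-limiting illustration, two of the checks mentioned above (G-C content and the presence of spurious polyadenylation signals) can be sketched as simple sequence scans. The function names gc_content and find_motif and the example sequence are hypothetical; AATAAA is used here as a commonly cited polyadenylation signal motif, and the motif set actually relevant to a given host is an assumption that would need to be chosen for that host.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq, motif):
    """Return the 0-based start positions of every (overlapping) motif match."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


# Hypothetical coding-sequence fragment used only to exercise the functions.
example = "CCAATAAAGGAATAAA"
polya_hits = find_motif(example, "AATAAA")

print(f"G-C content: {gc_content(example):.2f}")
print(f"candidate polyadenylation signals at: {polya_hits}")
```

A sequence flagged in this way could then be altered with synonymous codon changes so that the encoded polypeptide is unchanged, although choosing those changes is outside the scope of this sketch.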
[0104] The DNA construct may additionally contain 5' leader sequences. Such leader sequences can act to enhance translation. Translation leaders are known in the art and include: picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5' noncoding region) (Elroy-Stein et al. (1989) Proc. Natl. Acad. Sci. USA 86:6126-6130); potyvirus leaders, for example, TEV leader (Tobacco Etch Virus) (Gallie et al. (1995) Gene 165(2):233-238), MDMV leader (Maize Dwarf Mosaic Virus) (Virology 154:9-20), and human immunoglobulin heavy-chain binding protein (BiP) (Macejak et al. (1991) Nature 353:90-94); untranslated leader from the coat protein mRNA of alfalfa mosaic virus (AMV RNA 4) (Jobling et al. (1987) Nature 325:622-625); tobacco mosaic virus leader (TMV) (Gallie et al. (1989) in Molecular Biology of RNA, ed. Cech (Liss, New York), pp. 237-256); and maize chlorotic mottle virus leader (MCMV) (Lommel et al. (1991) Virology 81:382-385). See also, Della-Cioppa et al. (1987) Plant Physiol. 84:965-968. Other methods or sequences known to enhance translation can also be utilized, for example, introns, and the like.
[0105] In preparing the DNA construct, the various DNA fragments may be manipulated, so as to place the sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this end, adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions, may be involved.
[0106] Generally, the DNA construct will comprise a selectable marker gene for the selection of transformed cells. Selectable marker genes are utilized for the selection of transformed cells or tissues and have been discussed in detail elsewhere herein.
[0107] A number of promoters can be used. As used herein "promoter" includes reference to a region of DNA involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. A "plant promoter" is a promoter capable of initiating transcription in a plant cell. Any promoter can be used, and is typically selected based on the desired outcome (for a review of plant promoters, see Potenza et al. (2004) In Vitro Cell Dev Biol 40:1-22).
[0108] Constitutive promoters include, for example, the core promoter of the Rsyn7 promoter and other constitutive promoters disclosed in WO 99/43838 and U.S. Pat. No. 6,072,050; the core CaMV 35S promoter (Odell et al. (1985) Nature 313:810-812); rice actin (McElroy et al. (1990) Plant Cell 2:163-171); ubiquitin (Christensen et al. (1989) Plant Mol. Biol. 12:619-632 and Christensen et al. (1992) Plant Mol. Biol. 18:675-689); pEMU (Last et al. (1991) Theor. Appl. Genet. 81:581-588); MAS (Velten et al. (1984) EMBO J. 3:2723-2730); ALS promoter (U.S. Pat. No. 5,659,026), the Agrobacterium nopaline synthase (NOS) promoter (Bevan et al. (1983) Nucl. Acids Res. 11:369-385), and the like. Other constitutive promoters are described in, for example, U.S. Pat. Nos. 5,608,149; 5,608,144; 5,604,121; 5,569,597; 5,466,785; 5,399,680; 5,268,463; 5,608,142; and 6,177,611.
[0109] In some embodiments, an inducible promoter can be used, such as a pathogen-inducible promoter. Such promoters include those from pathogenesis-related proteins (PR proteins), which are induced following infection by a pathogen; e.g., PR proteins, SAR proteins, beta-1,3-glucanase, chitinase, etc. See, for example, Redolfi et al. (1983) Neth. J. Plant Pathol. 89:245-254; Uknes et al. (1992) Plant Cell 4:645-656; and Van Loon (1985) Plant Mol. Virol. 4:111-116. See also WO 99/43819, herein incorporated by reference. Promoters that are expressed locally at or near the site of pathogen infection include, for example, Marineau et al. (1987) Plant Mol. Biol. 9:335-342; Matton et al. (1989) Mol Plant-Microbe Interact 2:325-331; Somsisch et al. (1986) Proc. Natl. Acad. Sci. USA 83:2427-2430; Somsisch et al. (1988) Mol. Gen. Genet. 2:93-98; and Yang (1996) Proc. Natl. Acad. Sci. USA 93:14972-14977. See also, Chen et al. (1996) Plant J. 10:955-966; Zhang et al. (1994) Proc. Natl. Acad. Sci. USA 91:2507-2511; Warner et al. (1993) Plant J. 3:191-201; Siebertz et al. (1989) Plant Cell 1:961-968; U.S. Pat. No. 5,750,386 (nematode-inducible); and the references cited therein. Additional promoters include the inducible promoter for the maize PRms gene, whose expression is induced by the pathogen Fusarium moniliforme (see, for example, Cordero et al. (1992) Physiol. Mol. Plant Path. 41:189-200). Wound-inducible promoters include potato proteinase inhibitor (pin II) gene (Ryan (1990) Ann. Rev. Phytopath. 28:425-449; Duan et al. (1996) Nat Biotechnol 14:494-498); wun1 and wun2, U.S. Pat. No. 5,428,148; win1 and win2 (Stanford et al. (1989) Mol. Gen. Genet. 215:200-208); systemin (McGurl et al. (1992) Science 225:1570-1573); WIP1 (Rohmeier et al. (1993) Plant Mol. Biol. 22:783-792; Eckelkamp et al. (1993) FEBS Lett 323:73-76); MPI gene (Corderok et al. (1994) Plant J. 6:141-150); and the like, herein incorporated by reference. 
Another inducible promoter is the maize In2-2 promoter (deVeylder et al. (2007) Plant Cell Physiol 38:568-577, herein incorporated by reference).
[0110] Chemical-regulated promoters can be used to modulate the expression of a gene in a plant through the application of an exogenous chemical regulator. The promoter may be a chemical-inducible promoter, where application of the chemical induces gene expression, or a chemical-repressible promoter, where application of the chemical represses gene expression. Chemical-inducible promoters are known in the art and include, but are not limited to, the maize In2-2 promoter, which is activated by benzenesulfonamide herbicide safeners (De Veylder et al. (1997) Plant Cell Physiol. 38:568-77), the maize GST promoter (GST-II-27, WO 93/01294), which is activated by hydrophobic electrophilic compounds that are used as pre-emergent herbicides, the PR-1 promoter (Cao et al. (2006) Plant Cell Reports 6:554-60), which is activated by BTH or benzo(1,2,3)thiadiazole-7-carbothioic acid S-methyl ester, the tobacco PR-1a promoter (Ono et al. (2004) Biosci. Biotechnol. Biochem. 68:803-7), which is activated by salicylic acid, the copper-inducible ACE1 promoter (Mett et al. (1993) PNAS 90:4567-4571), the ethanol-inducible promoter AlcA (Caddick et al. (1998) Nature Biotechnol 16:177-80), an estradiol-inducible promoter (Bruce et al. (2000) Plant Cell 12:65-79), the XVE estradiol-inducible promoter (Zuo et al. (2000) Plant J 24:265-273), the VGE methoxyfenozide-inducible promoter (Padidam et al. (2003) Transgenic Res 12:101-109), and the TGV dexamethasone-inducible promoter (Bohner et al. (1999) Plant J 19:87-95). Other chemical-regulated promoters of interest include steroid-responsive promoters (see, for example, the glucocorticoid-inducible promoter in Schena et al. (1991) Proc. Natl. Acad. Sci. USA 88:10421-10425 and McNellis et al. (1998) Plant J. 14(2):247-257) and tetracycline-inducible and tetracycline-repressible promoters (see, for example, Gatz et al. (1991) Mol. Gen. Genet. 227:229-237; Gatz et al. (1992) Plant J 2:397-404; and U.S. Pat. Nos. 5,814,618 and 5,789,156), herein incorporated by reference.
[0111] Tissue-preferred promoters can be utilized to target enhanced expression of a sequence of interest within a particular plant tissue. Tissue-preferred promoters include Kawamata et al. (1997) Plant Cell Physiol. 38(7):792-803; Hansen et al. (1997) Mol. Gen Genet. 254(3):337-343; Russell et al. (1997) Transgenic Res. 6(2):157-168; Rinehart et al. (1996) Plant Physiol. 112(3):1331-1341; Van Camp et al. (1996) Plant Physiol. 112(2):525-535; Canevascini et al. (1996) Plant Physiol. 112(2):513-524; Lam (1994) Results Probl. Cell Differ. 20:181-196; and Guevara-Garcia et al. (1993) Plant J. 4(3):495-505.
[0112] Leaf-preferred promoters are known in the art. See, for example, Yamamoto et al. (1997) Plant J. 12:255-265; Kwon et al. (1994) Plant Physiol. 105:357-67; Yamamoto et al. (1994) Plant Cell Physiol. 35:773-778; Gotor et al. (1993) Plant J. 3:509-18; Orozco et al. (1993) Plant Mol. Biol. 23:1129-1138; and Matsuoka et al. (1993) Proc. Natl. Acad. Sci. USA 90:9586-9590. In addition, the promoters of the cab and rubisco genes can also be used. See, for example, Simpson et al. (1985) EMBO J 4:2723-2729 and Timko et al. (1985) Nature 318:57-58.
[0113] Root-preferred promoters are known and can be selected from the many available. See, for example, Hire et al. (1992) Plant Mol. Biol. 20:207-218 (soybean root-specific glutamine synthase gene); Keller and Baumgartner (1991) Plant Cell 3:1051-1061 (root-specific control element in the GRP 1.8 gene of French bean); Sanger et al. (1990) Plant Mol. Biol. 14:433-443 (root-specific promoter of the mannopine synthase (MAS) gene of Agrobacterium tumefaciens); and Miao et al. (1991) Plant Cell 3:11-22 (full-length cDNA clone encoding cytosolic glutamine synthase (GS), which is expressed in roots and root nodules of soybean). See also Bogusz et al. (1990) Plant Cell 2:633-641, where two root-specific promoters isolated from hemoglobin genes from the nitrogen-fixing nonlegume Parasponia andersonii and the related non-nitrogen-fixing nonlegume Trema tomentosa are described. Leach and Aoyagi (1991) describe their analysis of the promoters of the highly expressed rolC and rolD root-inducing genes of Agrobacterium rhizogenes (see Plant Sci (Limerick) 79:69-76). Teeri et al. (1989) used gene fusion to lacZ to show that the Agrobacterium T-DNA gene encoding octopine synthase is especially active in the epidermis of the root tip and that the TR2' gene is root specific in the intact plant and stimulated by wounding in leaf tissue (see EMBO J. 8:343-350). The TR1' gene, fused to nptII (neomycin phosphotransferase II), showed similar characteristics. Additional root-preferred promoters include the VfENOD-GRP3 gene promoter (Kuster et al. (1995) Plant Mol. Biol. 29:759-772) and the rolB promoter (Capana et al. (1994) Plant Mol. Biol. 25:681-691). See also U.S. Pat. Nos. 5,837,876; 5,750,386; 5,633,363; 5,459,252; 5,401,836; 5,110,732; and 5,023,179. Another root-preferred promoter is the promoter of the phaseolin gene (Murai et al. (1983) Science 222:476-482 and Sengupta-Gopalan et al. (1985) Proc. Natl. Acad. Sci. USA 82:3320-3324).
[0114] Seed-preferred promoters include both those promoters active during seed development as well as promoters active during seed germination. See Thompson et al. (1989) BioEssays 10:108, herein incorporated by reference. Such seed-preferred promoters include, but are not limited to, Cim1 (cytokinin-induced message); cZ19B1 (maize 19 kDa zein); and milps (myo-inositol-1-phosphate synthase); (see WO 00/11177 and U.S. Pat. No. 6,225,529; herein incorporated by reference). For dicots, seed-preferred promoters include, but are not limited to, bean .beta.-phaseolin, napin, .beta.-conglycinin, soybean lectin, cruciferin, and the like. For monocots, seed-preferred promoters include, but are not limited to, maize 15 kDa zein, 22 kDa zein, 27 kDa gamma zein, waxy, shrunken 1, shrunken 2, globulin 1, oleosin, nuc1, etc. See also WO 00/12733, where seed-preferred promoters from end1 and end2 genes are disclosed; herein incorporated by reference. In particular embodiments, the maize oleosin promoter set forth in SEQ ID NO: 55 or a variant or fragment thereof is used.
[0115] Where low-level expression is desired, weak promoters will be used. Generally, by "weak promoter" is intended a promoter that drives expression of a coding sequence at a low level. By "low level" is intended levels of about 1/1000 transcripts to about 1/100,000 transcripts to about 1/500,000 transcripts. Alternatively, it is recognized that weak promoters also encompass promoters that are expressed in only a few cells and not in others to give a total low level of expression. Where a promoter is expressed at unacceptably high levels, portions of the promoter sequence can be deleted or modified to decrease expression levels. Such weak constitutive promoters include, for example, the core promoter of the Rsyn7 promoter (WO 99/43838 and U.S. Pat. No. 6,072,050), the core 35S CaMV promoter, and the like.
[0116] Other promoters of interest include the Rab16 promoter (Mundy et al. (1990) PNAS 87: 1406-1410), the Brassica LEA3-1 promoter (U.S. Application Publication No. US 2008/0244793), the HVA1s, Dhn8s, and Dhn4s from barley and the wsi18j, rab16Bj from rice (Xiao and Xue (2001) Plant Cell Rep 20:667-73), and D113 from cotton (Luo et al. (2008) Plant Cell Rep 27:707-717).
[0117] In some embodiments, the polynucleotide encoding a cell proliferation factor (e.g., babyboom polypeptide) is operably linked to a maize ubiquitin promoter or a maize oleosin promoter (e.g., SEQ ID NO: 65 or a variant or fragment thereof).
[0118] In some embodiments, the methods further comprise identifying cells comprising the modified target locus and recovering plants comprising the modified target locus. In some examples, recovery of a plant having the modified target locus occurs at a higher frequency as compared to a control method without a cell proliferation factor.
[0119] Any method can be used to identify a plant cell or plant comprising a modified target locus. In some examples, plant cells or plants having a modified target locus are identified using one or more techniques including, but not limited to, PCR methods, hybridization methods such as Southern or Northern blots, restriction digest analyses, or DNA sequencing.
[0120] The cells having the introduced sequence may be grown into plants in accordance with conventional methods, see, for example, McCormick et al. (1986) Plant Cell Rep 5:81-84. These plants may then be grown, and either pollinated with the same transformed strain or with a different strain, and the resulting progeny expressing the desired phenotypic characteristic and/or comprising the introduced polynucleotide or polypeptide identified. Two or more generations may be grown to ensure that the polynucleotide is stably maintained and inherited, and seeds harvested. In this manner, transformed seed, also referred to as transgenic seed, having a polynucleotide, for example, comprising a modified target site, stably incorporated into their genome are provided.
[0121] In some embodiments, the activity and/or level of the cell proliferation factor (e.g., a babyboom polypeptide, Wuschel) is reduced prior to regenerating a plant from the plant cell having the modified target site. In some of these embodiments, the polynucleotide encoding the cell proliferation factor, and in particular embodiments, the polynucleotide encoding the double-strand break-inducing enzyme, as well, are excised prior to the regeneration of a plant. In some of these embodiments, the promoter and other regulatory elements that are operably linked to each of the heterologous polynucleotides are excised along with the heterologous polynucleotides. In certain embodiments, the polynucleotide encoding the cell proliferation factor (and in particular embodiments, the double-strand break-inducing enzyme) are flanked by recombination sites and an appropriate site-specific recombinase is introduced into the plant cell to excise the polynucleotide encoding the cell proliferation factor, and in some embodiments, the double-strand break-inducing enzyme, prior to regeneration of the plant cell into a plant. In some of those embodiments wherein both a babyboom polypeptide and a Wuschel polypeptide are provided to the plant cell, both the polynucleotide encoding the babyboom polypeptide and the polynucleotide encoding the Wuschel polypeptide are excised. The two polynucleotides can be present on the same or on different expression cassettes and, therefore, can be excised in one or two different excision reactions. In some of these embodiments, the polynucleotide encoding the site-specific recombinase for excising the babyboom and Wuschel polynucleotides can be located on the same expression cassette as the babyboom and Wuschel polynucleotides and all three polynucleotides can be excised through the activity of the site-specific recombinase.
[0122] In order to control the excision of the cell proliferation factor(s) (and in some embodiments, the double-strand break-inducing enzyme), the expression of the site-specific recombinase that is responsible for the excision can be controlled by a late embryo promoter or an inducible promoter. In some embodiments, the late embryo promoter is GZ (Ueda et al. (1994) Mol Cell Biol 14:4350-4359), gamma-kafirin promoter (Mishra et al. (2008) Mol Biol Rep 35:81-88), Glb1 promoter (Liu et al. (1998) Plant Cell Reports 17:650-655), ZM-LEG1 (U.S. Pat. No. 7,211,712), EEP1 (U.S. Patent Application No. US 2007/0169226), B22E (Klemsdal et al. (1991) Mol Gen Genet 228:9-16), or EAP1 (U.S. Pat. No. 7,321,031). In some embodiments, the inducible promoter that regulates the expression of the site-specific recombinase is a heat-shock promoter, a light-induced promoter, or a drought-inducible promoter, including but not limited to Hva1 (Straub et al. (1994) Plant Mol Biol 26:617-630), Dhn, and WSI18 (Xiao & Xue (2001) Plant Cell Rep 20:667-673). In other embodiments, expression of the site-specific recombinase is regulated by the maize rab17 promoter (nucleotides 1-558 or 51-558 of GenBank Acc. No. X1554 or active fragments or variants thereof; Vilardell et al. (1990) Plant Mol Biol 14:423-432; Vilardell et al. (1991) Plant Mol Biol 17:985-993; and U.S. Pat. Nos. 7,253,000 and 7,491,813; each of which is herein incorporated in its entirety), or a variant rab17 promoter (for example, the variant rab17 promoter set forth in SEQ ID NO: 54; see U.S. Provisional Application No. 61/291,257 and U.S. Utility Application entitled "Methods and compositions for the introduction and regulated expression of genes in plants," filed concurrently herewith and herein incorporated by reference in its entirety). The wild type or modified rab17 promoter can be induced through exposure of the plant cell, callus, or plant to abscisic acid, sucrose, or desiccation. In some embodiments, the site-specific recombinase that excises the polynucleotide encoding the cell proliferation factor is FLP.
[0123] Also provided are compositions comprising plant cells or plants comprising a heterologous polynucleotide encoding a cell proliferation factor, wherein the plant cell or plant comprises a target site comprising a recognition sequence; a double-strand break-inducing enzyme that recognizes the recognition sequence; and a transfer cassette comprising a polynucleotide of interest and at least one region of homology with the target site. In some embodiments, the region of homology is a recognition sequence. In these embodiments, the double-strand break-inducing enzyme is a site-specific recombinase capable of recognizing and implementing recombination at the recombination sites within the target site and the transfer cassette. In certain embodiments, the target site is stably integrated into the plant genome.
[0124] In some embodiments, the cell proliferation factor is a member of the AP2 family of polypeptides. In some of these embodiments, the cell proliferation factor is a babyboom polypeptide, and in particular embodiments, the babyboom polypeptide comprises two AP2 domains and at least one of: SEQ ID NO: 9 or a sequence having at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater sequence identity to SEQ ID NO: 9; or SEQ ID NO: 12 or a sequence having at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater sequence identity to SEQ ID NO: 12. In particular embodiments, the cell proliferation factor has the sequence set forth in SEQ ID NO: 2, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 105, or 41 or has at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater sequence identity to SEQ ID NO: 2, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 105, or 41. In some of these embodiments, both a babyboom polypeptide and a Wuschel polypeptide are provided to the plant cell.
[0125] In certain embodiments, the cell proliferation factor (e.g., babyboom polypeptide, Wuschel polypeptide) and/or the double-strand break-inducing enzyme is provided to the cell through the introduction of a polynucleotide encoding the cell proliferation factor and/or the double-strand break-inducing enzyme. In some of these embodiments, the polynucleotide encoding the cell proliferation factor has the sequence set forth in SEQ ID NO: 1, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 59, 101, 102, 103, 104, or 60 or has at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater sequence identity to SEQ ID NO: 1, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 59, 101, 102, 103, 104, or 60. In some of these embodiments, the polynucleotide encoding the cell proliferation factor is operably linked to an oleosin or ubiquitin promoter. In some of those embodiments wherein a Wuschel polynucleotide is also introduced into the plant cell, expression of Wuschel is regulated by the NOS or In2-2 promoter.
[0126] The double-strand break-inducing enzyme can be an endonuclease, a zinc finger nuclease, a transposase, a topoisomerase, or a site-specific recombinase. In some embodiments, the double-strand break-inducing enzyme is an endonuclease or a modified endonuclease, such as a meganuclease. In other embodiments, the double-strand break-inducing enzyme is a site-specific recombinase such as FLP or Cre and the recognition sequence comprises a recombination site (e.g., FRT1, FRT87, lox). In some of these embodiments, the site-specific recombinase has the sequence set forth in SEQ ID NO: 43 (FLP) or 45 (Cre) or has at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater sequence identity to SEQ ID NO: 43 or 45. In some of those embodiments wherein the site-specific recombinase is provided to the cell through the introduction of a polynucleotide that encodes the site-specific recombinase, the polynucleotide has the sequence set forth in SEQ ID NO: 42 (FLPm) or 44 (moCre) or has at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater sequence identity to SEQ ID NO: 42 or 44.
[0127] In particular embodiments, the plant cell or plant comprises a heterologous polynucleotide of interest encoding a cell proliferation factor, wherein the plant cell or plant comprises a target site comprising a first recombination site, a nucleotide sequence, and a second recombination site; a transfer cassette comprising a third recombination site, a polynucleotide of interest, and a fourth recombination site, wherein the first and the third recombination sites are recombinogenic with respect to one another, and the second and fourth recombination sites are recombinogenic with respect to one another; and a site-specific recombinase capable of recognizing and implementing recombination at the first and third and second and fourth recombination sites.
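The pairing of recombination sites described above can be pictured with a toy model. The following Python sketch is illustrative only: it assumes a locus can be represented as an ordered list of labelled elements, and the function name `cassette_exchange` and every element label other than FRT1 and FRT87 are hypothetical. It models only the outcome of the exchange (the sequence between the paired sites is replaced), not the recombinase enzymology.

```python
from typing import List, Sequence


def cassette_exchange(target_locus: Sequence[str], transfer_cassette: Sequence[str]) -> List[str]:
    """Toy model of recombinase-mediated cassette exchange.

    The transfer cassette is [third_site, payload..., fourth_site]. The third
    site must match the first site in the target locus and the fourth site
    must match the second site. Labels are hypothetical placeholders.
    """
    if len(transfer_cassette) < 2:
        raise ValueError("transfer cassette needs two flanking recombination sites")
    third_site, *payload, fourth_site = transfer_cassette
    if third_site == fourth_site:
        # Identical flanking sites would favor excision rather than exchange.
        raise ValueError("flanking sites must be dissimilar for exchange")
    # list.index raises ValueError if a matching site is absent from the locus.
    locus = list(target_locus)
    first = locus.index(third_site)
    second = locus.index(fourth_site, first + 1)
    # Keep both sites; swap everything between them for the cassette payload.
    return locus[: first + 1] + payload + locus[second:]


if __name__ == "__main__":
    target = ["genome_left", "FRT1", "placeholder_sequence", "FRT87", "genome_right"]
    cassette = ["FRT1", "polynucleotide_of_interest", "FRT87"]
    print(cassette_exchange(target, cassette))
```

In this sketch the two flanking sites are required to differ, reflecting the arrangement in which the first site pairs only with the third and the second only with the fourth.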
[0128] The plant cell or plant can comprise more than one cell proliferation factor. For example, along with a babyboom polypeptide, the plant or plant cell can comprise a Wuschel polypeptide.
[0129] In particular embodiments, the heterologous polynucleotide encoding the cell proliferation factor comprises flanking recombination sites to facilitate its excision. In these embodiments, the plant further comprises a site-specific recombinase that recognizes the recombination sites flanking the heterologous polynucleotide encoding the cell proliferation factor. In some embodiments, this site-specific recombinase comprises FLPm or an active variant or fragment thereof. In some of those embodiments wherein the plant cell or plant further comprise a Wuschel polypeptide, the polynucleotide encoding the Wuschel polypeptide and the heterologous polynucleotide encoding the cell proliferation factor are flanked by recombination sites to facilitate the excision of both polynucleotides.
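The excision arrangement can likewise be sketched as a list operation. This is a minimal illustration under stated assumptions, not a model of any particular construct: the function name `excise_flanked` and the element labels are hypothetical, the two flanking sites are assumed to be identical and directly repeated, and the recombinase is assumed to leave a single copy of the site behind after removing the intervening elements.

```python
from typing import List, Sequence


def excise_flanked(locus: Sequence[str], site: str) -> List[str]:
    """Remove everything between two copies of `site`, leaving one copy.

    Assumes the two sites are directly repeated so that recombination
    between them deletes the intervening elements.
    """
    elements = list(locus)
    first = elements.index(site)               # ValueError if no site present
    second = elements.index(site, first + 1)   # ValueError if only one site
    # Keep the first site, drop the flanked elements and the second site.
    return elements[: first + 1] + elements[second + 1:]


if __name__ == "__main__":
    locus = ["upstream", "FRT", "babyboom", "Wuschel", "FLPm", "FRT", "downstream"]
    print(excise_flanked(locus, "FRT"))
```

Placing the recombinase coding sequence between the same pair of sites, as in the hypothetical locus above, means a single excision event removes the cell proliferation factor polynucleotides and the recombinase polynucleotide together.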
[0130] Any plant species can be transformed, including, but not limited to, monocots and dicots. Examples of plant species of interest include, but are not limited to, corn (Zea mays), Brassica sp. (e.g., B. napus, B. rapa, B. juncea), particularly those Brassica species useful as sources of seed oil, alfalfa (Medicago sativa), rice (Oryza sativa), rye (Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet (Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), finger millet (Eleusine coracana)), sunflower (Helianthus annuus), safflower (Carthamus tinctorius), wheat (Triticum aestivum), soybean (Glycine max), tobacco (Nicotiana tabacum), potato (Solanum tuberosum), peanuts (Arachis hypogaea), cotton (Gossypium barbadense, Gossypium hirsutum), sweet potato (Ipomoea batatas), cassava (Manihot esculenta), coffee (Coffea spp.), coconut (Cocos nucifera), pineapple (Ananas comosus), citrus trees (Citrus spp.), cocoa (Theobroma cacao), tea (Camellia sinensis), banana (Musa spp.), avocado (Persea americana), fig (Ficus carica), guava (Psidium guajava), mango (Mangifera indica), olive (Olea europaea), papaya (Carica papaya), cashew (Anacardium occidentale), macadamia (Macadamia integrifolia), almond (Prunus amygdalus), sugar beets (Beta vulgaris), sugarcane (Saccharum spp.), oats (Avena), barley (Hordeum), Arabidopsis, switchgrass, vegetables, ornamentals, grasses, and conifers.
[0131] Vegetables include tomatoes (Lycopersicon esculentum), lettuce (e.g., Lactuca sativa), green beans (Phaseolus vulgaris), lima beans (Phaseolus limensis), peas (Lathyrus spp.), and members of the genus Cucumis such as cucumber (C. sativus), cantaloupe (C. cantalupensis), and musk melon (C. melo). Ornamentals include azalea (Rhododendron spp.), hydrangea (Hydrangea macrophylla), hibiscus (Hibiscus rosa-sinensis), roses (Rosa spp.), tulips (Tulipa spp.), daffodils (Narcissus spp.), petunias (Petunia hybrida), carnation (Dianthus caryophyllus), poinsettia (Euphorbia pulcherrima), and chrysanthemum. Conifers that may be employed in practicing the present invention include, for example, pines such as loblolly pine (Pinus taeda), slash pine (Pinus elliottii), ponderosa pine (Pinus ponderosa), lodgepole pine (Pinus contorta), and Monterey pine (Pinus radiata); Douglas-fir (Pseudotsuga menziesii); Western hemlock (Tsuga canadensis); Sitka spruce (Picea glauca); redwood (Sequoia sempervirens); true firs such as silver fir (Abies amabilis) and balsam fir (Abies balsamea); and cedars such as Western red cedar (Thuja plicata) and Alaska yellow-cedar (Chamaecyparis nootkatensis). In specific embodiments, plants of the present invention are crop plants (for example, corn, alfalfa, sunflower, Brassica, soybean, cotton, safflower, peanut, sorghum, wheat, millet, tobacco, etc.). In other embodiments, corn, soybean, and sugarcane plants are optimal, and in yet other embodiments corn plants are optimal.
[0132] Other plants of interest include grain plants that provide seeds of interest, oil-seed plants, and leguminous plants. Seeds of interest include grain seeds, such as corn, wheat, barley, rice, sorghum, rye, etc. Oil-seed plants include cotton, soybean, safflower, sunflower, Brassica, maize, alfalfa, palm, coconut, etc. Leguminous plants include beans and peas. Beans include guar, locust bean, fenugreek, soybean, garden beans, cowpea, mungbean, lima bean, fava bean, lentils, chickpea, etc.
[0133] As used herein, the term plant also includes plant cells, plant protoplasts, plant cell tissue cultures from which plants can be regenerated, plant calli, plant clumps, and plant cells that are intact in plants or parts of plants such as embryos, pollen, ovules, seeds, leaves, flowers, branches, fruit, kernels, ears, cobs, husks, stalks, roots, root tips, anthers, and the like. Grain is intended to mean the mature seed produced by commercial growers for purposes other than growing or reproducing the species. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the invention, provided that these parts comprise the introduced polynucleotides.
[0134] In some of those embodiments wherein the organism into which the cell proliferation factor, double-strand break-inducing enzyme, and in certain embodiments, a transfer cassette, are introduced is a plant, these elements can be introduced into a plant cell. In particular embodiments, the plant cell is a cell of a recalcitrant tissue or plant, such as an elite maize inbred. As used herein, a "recalcitrant tissue" or "recalcitrant plant" is a tissue or a plant that has a low rate of transformation using traditional methods of transformation, such as those disclosed elsewhere herein. In some embodiments, the recalcitrant tissue or plant is unable to be transformed in the absence of the cell proliferation factor. In other embodiments, the recalcitrant tissue or plant has a rate of successful transformation of less than about 20%, less than about 15%, less than about 10%, less than about 5%, less than about 1%, less than about 0.1%, less than about 0.01%, less than about 0.001%, or less. Non-limiting examples of recalcitrant tissues include mature seed or mature seed tissue, a leaf or leaf tissue, and a stem or stem tissue.
[0135] In some embodiments, the cell proliferation factor, double-strand break-inducing enzyme, and in certain embodiments, a transfer cassette, are introduced into a mature seed, mature seed tissue, or leaf tissue using the methods described in U.S. Provisional Application entitled "Methods and compositions for the introduction and regulated expression of genes in plants," filed concurrently herewith.
[0136] Some embodiments of the methods provide for the targeted insertion of a polynucleotide of interest. If the polynucleotide of interest is introduced into an organism, it may impart various changes in the organism, particularly plants, including, but not limited to, modification of the fatty acid composition in the plant, altering the amino acid content of the plant, altering pathogen resistance, and the like. These results can be achieved by providing expression of heterologous products, increased expression of endogenous products in plants, or suppressed expression of endogenous products in plants.
[0137] General categories of polynucleotides of interest include, for example, those genes involved in information, such as zinc fingers, those involved in communication, such as kinases, those involved in biosynthetic pathways, and those involved in housekeeping, such as heat shock proteins. More specific categories of transgenes, for example, include sequences encoding important traits for agronomics, insect resistance, disease resistance, herbicide resistance, sterility, grain characteristics, oil, starch, carbohydrate, phytate, protein, nutrient, metabolism, digestibility, kernel size, sucrose loading, and commercial products.
[0138] Traits such as oil, starch, and protein content can be genetically altered in addition to using traditional breeding methods. Modifications include increasing content of oleic acid, saturated and unsaturated oils, increasing levels of lysine and sulfur, providing essential amino acids, and also modification of starch. Protein modifications to alter amino acid levels are described in U.S. Pat. Nos. 5,703,049, 5,885,801, 5,885,802, and 5,990,389 and WO 98/20122, herein incorporated by reference.
[0139] Insect resistance genes may encode resistance to pests such as rootworm, cutworm, European Corn Borer, and the like. Such genes include, for example, Bacillus thuringiensis toxic protein genes (U.S. Pat. Nos. 5,366,892; 5,747,450; 5,737,514; 5,723,756; 5,593,881; and Geiser et al. (1986) Gene 48:109); lectins (Van Damme et al. (1994) Plant Mol. Biol. 24:825); and the like.
[0140] Genes encoding disease resistance traits include detoxification genes, such as against fumonisin (U.S. Pat. No. 5,792,931); avirulence (avr) and disease resistance (R) genes (Jones et al. (1994) Science 266:789; Martin et al. (1993) Science 262:1432; and Mindrinos et al. (1994) Cell 78:1089); and the like.
[0141] Herbicide resistance traits may include genes coding for resistance to herbicides that act to inhibit the action of acetolactate synthase (ALS), in particular the sulfonylurea-type herbicides (e.g., the S4 and/or Hra mutations in ALS), genes coding for resistance to herbicides that act to inhibit action of glutamine synthase, such as phosphinothricin or basta (e.g., the bar gene), genes providing resistance to glyphosate, such as GAT (glyphosate N-acetyltransferase; U.S. Pat. No. 6,395,485), EPSPS (enolpyruvylshikimate-3-phosphate synthase; U.S. Pat. Nos. 6,867,293, 5,188,642, 5,627,061), or GOX (glyphosate oxidoreductase; U.S. Pat. No. 5,463,175), or other such genes known in the art. The nptII gene encodes resistance to the antibiotics kanamycin and geneticin.
[0142] Sterility genes can also be encoded in an expression cassette and provide an alternative to physical detasseling. Examples of genes used in such ways include male tissue-preferred genes and genes with male sterility phenotypes such as QM, described in U.S. Pat. No. 5,583,210. Other genes include kinases and those encoding compounds toxic to either male or female gametophytic development.
[0143] Commercial traits can also be encoded on a gene or genes that could, for example, increase starch for ethanol production, or provide expression of proteins.
[0144] Reduction of the activity of specific genes (also known as gene silencing, or gene suppression) is desirable for several aspects of genetic engineering in plants. Many techniques for gene silencing are well known to one of skill in the art, including but not limited to antisense technology (see, e.g., Sheehy et al. (1988) Proc. Natl. Acad. Sci. USA 85:8805-8809; and U.S. Pat. Nos. 5,107,065; 5,453,566; and 5,759,829); cosuppression (e.g., Taylor (1997) Plant Cell 9:1245; Jorgensen (1990) Trends Biotech. 8(12):340-344; Flavell (1994) Proc. Natl. Acad. Sci. USA 91:3490-3496; Finnegan et al. (1994) Bio/Technology 12: 883-888; and Neuhuber et al. (1994) Mol. Gen. Genet. 244:230-241); RNA interference (Napoli et al. (1990) Plant Cell 2:279-289; U.S. Pat. No. 5,034,323; Sharp (1999) Genes Dev. 13:139-141; Zamore et al. (2000) Cell 101:25-33; Javier (2003) Nature 425:257-263; and, Montgomery et al. (1998) Proc. Natl. Acad. Sci. USA 95:15502-15507), virus-induced gene silencing (Burton, et al. (2000) Plant Cell 12:691-705; and Baulcombe (1999) Curr. Op. Plant Bio. 2:109-113); target-RNA-specific ribozymes (Haseloff et al. (1988) Nature 334: 585-591); hairpin structures (Smith et al. (2000) Nature 407:319-320; WO 99/53050; WO 02/00904; and WO 98/53083); ribozymes (Steinecke et al. (1992) EMBO J. 11:1525; U.S. Pat. No. 4,987,071; and, Perriman et al. (1993) Antisense Res. Dev. 3:253); oligonucleotide mediated targeted modification (e.g., WO 03/076574 and WO 99/25853); Zn-finger targeted molecules (e.g., WO 01/52620; WO 03/048345; and WO 00/42219); and other methods or combinations of the above methods known to those of skill in the art.
[0145] The following terms are used to describe the sequence relationships between two or more polynucleotides or polypeptides: (a) "reference sequence", (b) "comparison window", (c) "sequence identity", and, (d) "percentage of sequence identity."
[0146] (a) As used herein, "reference sequence" is a defined sequence used as a basis for sequence comparison. A reference sequence may be a subset or the entirety of a specified sequence; for example, as a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence.
[0147] (b) As used herein, "comparison window" makes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two polynucleotides. Generally, the comparison window is at least 20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 100, or longer. Those of skill in the art understand that to avoid a high similarity to a reference sequence due to inclusion of gaps in the polynucleotide sequence a gap penalty is typically introduced and is subtracted from the number of matches.
[0148] Methods of alignment of sequences for comparison are well known in the art. Thus, the determination of percent sequence identity between any two sequences can be accomplished using a mathematical algorithm. Non-limiting examples of such mathematical algorithms are the algorithm of Myers and Miller (1988) CABIOS 4:11-17; the local alignment algorithm of Smith et al. (1981) Adv. Appl. Math. 2:482; the global alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443-453; the search-for-local alignment method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. 85:2444-2448; and the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877.
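By way of illustration only, the global alignment approach of Needleman and Wunsch can be sketched in a few lines of Python. The scoring values used here (match +1, mismatch -1, and a linear gap penalty of -2) and the function name `needleman_wunsch` are assumptions chosen for the example; they are not the default parameters of any program named herein, and production tools typically use substitution matrices and affine gap penalties instead.

```python
from typing import List, Tuple


def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> Tuple[int, str, str]:
    """Global alignment by dynamic programming with a linear gap penalty.

    Returns (optimal score, aligned a, aligned b), with '-' marking gaps.
    Scoring values are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score: List[List[int]] = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align two residues
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a: List[str] = []
    out_b: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Because the gap penalty is subtracted from the score, an alignment cannot raise its apparent similarity simply by inserting gaps, which is the purpose of the gap penalty noted in the definition of "comparison window" above.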
[0149] Computer implementations of these mathematical algorithms can be utilized for comparison of sequences to determine sequence identity. Such implementations include, but are not limited to: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, Calif.); the ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the GCG Wisconsin Genetics Software Package, Version 10 (available from Accelrys Inc., 9685 Scranton Road, San Diego, Calif., USA). Alignments using these programs can be performed using the default parameters. The CLUSTAL program is well described by Higgins et al. (1988) Gene 73:237-244; Higgins et al. (1989) CABIOS 5:151-153; Corpet et al. (1988) Nucleic Acids Res. 16:10881-90; Huang et al. (1992) CABIOS 8:155-65; and Pearson et al. (1994) Meth. Mol. Biol. 24:307-331. The ALIGN program is based on the algorithm of Myers and Miller (1988) supra. A PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used with the ALIGN program when comparing amino acid sequences. The BLAST programs of Altschul et al. (1990) J. Mol. Biol. 215:403 are based on the algorithm of Karlin and Altschul (1990) supra. BLAST nucleotide searches can be performed with the BLASTN program, score=100, wordlength=12, to obtain nucleotide sequences homologous to a nucleotide sequence encoding a protein of the invention. BLAST protein searches can be performed with the BLASTX program, score=50, wordlength=3, to obtain amino acid sequences homologous to a protein or polypeptide of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST (in BLAST 2.0) can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389. Alternatively, PSI-BLAST (in BLAST 2.0) can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra. When utilizing BLAST, Gapped BLAST, or PSI-BLAST, the default parameters of the respective programs (e.g., BLASTN for nucleotide sequences, BLASTX for proteins) can be used. See www.ncbi.nlm.nih.gov. Alignment may also be performed manually by inspection.
[0150] Unless otherwise stated, sequence identity/similarity values provided herein refer to the value obtained using GAP Version 10 using the following parameters: % identity and % similarity for a nucleotide sequence using GAP Weight of 50 and Length Weight of 3, and the nwsgapdna.cmp scoring matrix; % identity and % similarity for an amino acid sequence using GAP Weight of 8 and Length Weight of 2, and the BLOSUM62 scoring matrix; or any equivalent program thereof. By "equivalent program" is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by GAP Version 10.
[0151] GAP uses the algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443-453, to find the alignment of two complete sequences that maximizes the number of matches and minimizes the number of gaps. GAP considers all possible alignments and gap positions and creates the alignment with the largest number of matched bases and the fewest gaps. It allows for the provision of a gap creation penalty and a gap extension penalty in units of matched bases. GAP must make a profit of gap creation penalty number of matches for each gap it inserts. If a gap extension penalty greater than zero is chosen, GAP must, in addition, make a profit for each gap inserted of the length of the gap times the gap extension penalty. Default gap creation penalty values and gap extension penalty values in Version 10 of the GCG Wisconsin Genetics Software Package for protein sequences are 8 and 2, respectively. For nucleotide sequences the default gap creation penalty is 50 while the default gap extension penalty is 3. The gap creation and gap extension penalties can be expressed as an integer selected from the group of integers consisting of from 0 to 200. Thus, for example, the gap creation and gap extension penalties can be 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65 or greater.
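The gap accounting described above can be illustrated with a minimal Python sketch that scores an alignment that has already been made. It is not the GAP program itself: the function names, the use of "-" for gap symbols, and the simplified scoring of 1 per match and 0 per mismatch are assumptions chosen so that penalties read directly in units of matched bases. Each gap is charged the gap creation penalty once plus the gap extension penalty times its length, as the paragraph states.

```python
def alignment_quality(aligned_a, aligned_b, gap_creation, gap_extension,
                      match_score=1.0, mismatch_score=0.0):
    """Score an existing pairwise alignment in which gaps are written as '-'.

    Assumption for illustration: every match scores `match_score` and every
    mismatch scores `mismatch_score`; a real scoring matrix would be used
    in practice.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    score = 0.0
    gap_in_a = gap_in_b = False  # are we currently inside a gap run?
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        if x == "-":
            if not gap_in_a:
                score -= gap_creation      # charged once per gap
            score -= gap_extension         # charged per gap position
            gap_in_a, gap_in_b = True, False
        elif y == "-":
            if not gap_in_b:
                score -= gap_creation
            score -= gap_extension
            gap_in_a, gap_in_b = False, True
        else:
            gap_in_a = gap_in_b = False
            score += match_score if x == y else mismatch_score
    return score


def gap_pays_off(extra_matches, gap_length, gap_creation, gap_extension,
                 match_score=1.0):
    """Return True if the matches gained by opening a gap exceed its cost."""
    cost = gap_creation + gap_length * gap_extension
    return extra_matches * match_score > cost
```

With the protein defaults quoted above (gap creation 8, gap extension 2), a two-residue gap costs 8 + 2 x 2 = 12 under this simplified scoring, so `gap_pays_off(13, 2, 8, 2)` is true while `gap_pays_off(12, 2, 8, 2)` is not.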
[0152] GAP presents one member of the family of best alignments. There may be many members of this family, but no other member has a better quality. GAP displays four figures of merit for alignments: Quality, Ratio, Identity, and Similarity. The Quality is the metric maximized in order to align the sequences. Ratio is the quality divided by the number of bases in the shorter segment. Percent Identity is the percent of the symbols that actually match. Percent Similarity is the percent of the symbols that are similar. Symbols that are across from gaps are ignored. A similarity is scored when the scoring matrix value for a pair of symbols is greater than or equal to 0.50, the similarity threshold. The scoring matrix used in Version 10 of the GCG Wisconsin Genetics Software Package is BLOSUM62 (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915).
[0153] (c) As used herein, "sequence identity" or "identity" in the context of two polynucleotides or polypeptide sequences makes reference to the residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window. When percentage of sequence identity is used in reference to proteins it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. When sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Sequences that differ by such conservative substitutions are said to have "sequence similarity" or "similarity". Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, Calif.).
[0154] (d) As used herein, "percentage of sequence identity" means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
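The calculation in (d) can be sketched in a few lines of Python. The function name and the convention of writing gaps as "-" are assumptions for illustration; the inputs are taken to be two sequences that have already been optimally aligned over the comparison window, so both strings have the same length.

```python
def percent_sequence_identity(aligned_a, aligned_b):
    """Percent identity over a comparison window of two aligned sequences.

    Matched positions are columns where both sequences carry the same
    residue; gap columns ('-') never count as matches but do count toward
    the total number of positions in the window.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != "-")
    return 100.0 * matched / len(aligned_a)
```

For example, a ten-position window with eight identical columns, one mismatch, and one gap column gives 8 / 10 x 100 = 80% under this definition.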
[0155] In hybridization techniques, all or part of a known polynucleotide is used as a probe that selectively hybridizes to other corresponding polynucleotides present in a population of cloned genomic DNA fragments or cDNA fragments (i.e., genomic or cDNA libraries) from a chosen organism. The hybridization probes may be genomic DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides, and may be labeled with a detectable group such as ³²P, or any other detectable marker. Thus, for example, probes for hybridization can be made by labeling synthetic oligonucleotides based on the babyboom polynucleotide. Methods for preparation of probes for hybridization and for construction of cDNA and genomic libraries are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.).
[0156] For example, the entire babyboom polynucleotide, or one or more portions thereof, may be used as a probe capable of specifically hybridizing to corresponding babyboom polynucleotides and messenger RNAs. To achieve specific hybridization under a variety of conditions, such probes include sequences that are unique among babyboom polynucleotide sequences and are optimally at least about 10 nucleotides in length, and most optimally at least about 20 nucleotides in length. Such probes may be used to amplify corresponding babyboom polynucleotides from a chosen plant by PCR. This technique may be used to isolate additional coding sequences from a desired plant or as a diagnostic assay to determine the presence of coding sequences in a plant. Hybridization techniques include hybridization screening of plated DNA libraries (either plaques or colonies; see, for example, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.)).
[0157] Hybridization of such sequences may be carried out under stringent conditions. By "stringent conditions" or "stringent hybridization conditions" is intended conditions under which a probe will hybridize to its target sequence to a detectably greater degree than to other sequences (e.g., at least 2-fold over background). Stringent conditions are sequence-dependent and will be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences that are 100% complementary to the probe can be identified (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing). Generally, a probe is less than about 1000 nucleotides in length, optimally less than 500 nucleotides in length.
[0158] Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37° C., and a wash in 1× to 2×SSC (20×SSC=3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55° C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37° C., and a wash in 0.5× to 1×SSC at 55 to 60° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.1×SSC at 60 to 65° C. Optionally, wash buffers may comprise about 0.1% to about 1% SDS. Duration of hybridization is generally less than about 24 hours, usually about 4 to about 12 hours. The duration of the wash time will be at least a length of time sufficient to reach equilibrium.
[0159] Specificity is typically the function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA hybrids, the T_m can be approximated from the equation of Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284: $T_m = 81.5\,^{\circ}\mathrm{C} + 16.6\,(\log M) + 0.41\,(\%\,\mathrm{GC}) - 0.61\,(\%\,\mathrm{form}) - 500/L$; where M is the molarity of monovalent cations, % GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. The T_m is the temperature (under defined ionic strength and pH) at which 50% of a complementary target sequence hybridizes to a perfectly matched probe. T_m is reduced by about 1° C. for each 1% of mismatching; thus, T_m, hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with ≥90% identity are sought, the T_m can be decreased 10° C. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (T_m) for the specific sequence and its complement at a defined ionic strength and pH. However, severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4° C. lower than the thermal melting point (T_m); moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10° C. lower than the thermal melting point (T_m); low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or 20° C. lower than the thermal melting point (T_m). Using the equation, hybridization and wash compositions, and desired T_m, those of ordinary skill will understand that variations in the stringency of hybridization and/or wash solutions are inherently described. If the desired degree of mismatching results in a T_m of less than 45° C. (aqueous solution) or 32° C. (formamide solution), it is optimal to increase the SSC concentration so that a higher temperature can be used. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology--Hybridization with Nucleic Acid Probes, Part I, Chapter 2 (Elsevier, New York); and Ausubel et al., eds. (1995) Current Protocols in Molecular Biology, Chapter 2 (Greene Publishing and Wiley-Interscience, New York). See Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.).
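As a worked illustration of the Meinkoth and Wahl approximation, the following Python sketch evaluates the equation and applies the rule of thumb of about 1° C. per 1% mismatch. The function and parameter names are assumptions for illustration, and the logarithm is taken as base 10, which is the usual reading of "log M" for this equation but is not stated explicitly in the text.

```python
import math


def melting_temperature(monovalent_molar, percent_gc, percent_formamide,
                        hybrid_length_bp):
    """Approximate T_m in degrees C for a DNA-DNA hybrid.

    Assumption: 'log' in the equation is the base-10 logarithm.
    """
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / hybrid_length_bp)


def mismatch_adjusted_tm(tm, percent_mismatch):
    """Lower T_m by about 1 degree C for each 1% of mismatching."""
    return tm - 1.0 * percent_mismatch


# Example: 1 M monovalent cation, 50% GC, 50% formamide, 500 bp hybrid.
tm = melting_temperature(1.0, 50.0, 50.0, 500)
# Seeking sequences with >= 90% identity allows up to 10% mismatch.
tm_90 = mismatch_adjusted_tm(tm, 10.0)
```

Under these example inputs the estimate is about 70.5° C. for a perfect match and about 60.5° C. when up to 10% mismatch is tolerated; the stringency offsets listed above (for example, about 5° C. below T_m) would then be subtracted from whichever value applies.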
[0160] It is to be noted that the term "a" or "an" entity refers to one or more of that entity; for example, "a polypeptide" is understood to represent one or more polypeptides. As such, the terms "a" (or "an"), "one or more," and "at least one" can be used interchangeably herein.
[0161] Throughout this specification and the claims, the words "comprise," "comprises," and "comprising" are used in a non-exclusive sense, except where the context requires otherwise.
[0162] As used herein, the term "about," when referring to a value, is meant to encompass variations of, in some embodiments ±50%, in some embodiments ±20%, in some embodiments ±10%, in some embodiments ±5%, in some embodiments ±1%, in some embodiments ±0.5%, and in some embodiments ±0.1% from the specified amount, as such variations are appropriate to perform the disclosed methods or employ the disclosed compositions.
[0163] Further, when an amount, concentration, or other value or parameter is given as either a range, preferred range, or a list of upper preferable values and lower preferable values, this is to be understood as specifically disclosing all ranges formed from any pair of any upper range limit or preferred value and any lower range limit or preferred value, regardless of whether ranges are separately disclosed. Where a range of numerical values is recited herein, unless otherwise stated, the range is intended to include the endpoints thereof, and all integers and fractions within the range. It is not intended that the scope of the presently disclosed subject matter be limited to the specific values recited when defining a range.
[0164] The following examples are offered by way of illustration and not by way of limitation.
EXPERIMENTAL
Example 1. Vector Construction
[0165] Maize recombination targets (RTL) were created using Agrobacterium transformation of immature maize embryos (Ishida et al. (1996) Nat Biotechnol 14:745-750). The LBA4404 Agrobacterium strain was used, which carried a specialized binary T-DNA plasmid system (Komari et al. (1996) Plant J 10:165-174) developed for high efficiency maize transformation. The binary Agrobacterium plasmid PHP21199 (similar to pSB124, Komari et al. (1996)), which is a T-DNA containing derivative of plasmid PHP10523 (similar to pSB1, Komari et al. (1996)), was constructed as follows. Visual and selectable marker genes were built into the T-DNA region of the intermediate construct, PHP21198 (similar to pSB12, Komari et al. (1996)), and then introduced into Agrobacterium to create the co-integrated binary plasmid, PHP21199. The selectable marker expression cassette in the PHP21199 plasmid consisted of the maize ubiquitin1 (UBI) promoter (Christensen & Quail (1996) Transgenic Res 5:213-218), 5' untranslated region (5'utr), and intron (UBI PRO), a sequence encoding glyphosate N-acetyltransferase (GAT4602) (Siehl et al. (2007) J Biol Chem 282:11446-11455), and a 3' region from the protease inhibitor 2 (PINII) gene of potato. The visual marker expression cassette in the PHP21199 plasmid consisted of the yellow fluorescent protein (YFP) gene (zs-yellow1 n1) (Clontech, Palo Alto, Calif.) expressed by the same promoter and terminator elements as the gat gene (UBI PRO, PINII). The wild-type FRT was inserted between the maize ubiquitin promoter and the YFP gene. The selectable and visual marker expression cassettes, as well as the properly positioned FRT sites, were assembled with the multi-site Gateway® (Invitrogen, Carlsbad, Calif.) system. The plasmid backbone of PHP21198 served as the destination plasmid (pDEST) with the destination site between the RB and LB in the T-DNA region, and three Gateway® entry vectors (pDONR) were provided: one for each marker gene and one for the downstream FRT87 recombinase site. The FRT87 recombinase site is located 3' of the final PINII 3' region. The PHP21199 plasmid therefore comprised RB-UBI PRO::FRT1::YFP+UBI PRO::GAT4602::FRT87-LB.
[0166] Site-specific integration (SSI) donor plasmids PHP22297 and PHP27064 were built using the multi-site Gateway® (Invitrogen) system using methods similar to those used to construct the PHP21198 vector, except that an Agrobacterium vector was not used since the donor plasmids were introduced into plant cells by particle bombardment. Instead, the destination site was provided by the commercially available pDEST R4-R3 vector (Invitrogen). The entry vector for the first position of PHP22297 consisted of a promoterless bar gene with the PINII terminator. In place of the promoter is a copy of the 35S cauliflower mosaic virus (CaMV 35S) termination region. This feature was included for the purpose of reducing potential bar gene expression due to random promoter trapping following donor integration into the plant genome outside the target site. The FRT1 site was placed between the CaMV 35S terminator and the bar gene to match the FRT1 in the target constructs and integrations. The second entry vector contained a cyan fluorescent protein (CFP) visual marker (am-cyan1) (Clontech) operably linked to maize UBI PRO and PINII 3' regions as described above. The FRT87 site was placed in the third and final entry vector in order to position the site downstream of all the genes in the donor construct and to match the FRT87 position in the target construct. PHP22297 comprises FRT1::BAR+UBI PRO::CFP::FRT87. Donor construct PHP27064 was also constructed using pDEST R4-R3 (Invitrogen). The first entry vector was nearly identical to that for PHP22297 except that the bar gene was replaced by GAT4621, a GAT gene variant with similar but improved function to GAT4602. This entry vector did not include the 35S CaMV terminator region upstream of the promoterless gat gene. The second entry vector for PHP27064 had YFP in place of CFP, along with the same expression elements as the second entry vector used in the construction of PHP22297. The third entry vector included only FRT87 and was the same as that used for PHP22297. PHP27064 comprises FRT1::GAT4621+UBI PRO::YFP::FRT87.
Example 2. Recombinant Target Lines (RTL)
[0167] Zea mays immature embryos were transformed by a modified Agrobacterium-mediated transformation procedure (Djukanovic et al. (2006) Plant Biotechnol J 4:345-357) to introduce the T-DNA from PHP21199. Briefly, 10-12 days after pollination (DAP) embryos were dissected from sterile kernels and placed into liquid medium. After embryo collection, the medium was replaced with 1 ml of Agrobacterium suspension at a concentration of 0.35-0.45 OD at 550 nm, wherein the Agrobacterium comprised the T-DNA. After a five minute incubation at room temperature, the embryo suspension was poured onto a media plate. Embryos were incubated in the dark for 3 days at 20° C., followed by a 4 day incubation in the dark at 28° C. and a subsequent transfer onto new media plates containing 0.1778 mg/L glyphosate and 100 mg/L carbenicillin. Embryos were subcultured every three weeks until transgenic events were identified. Regeneration was induced by transferring small sectors of tissue onto maturation media containing 0.1 μM ABA, 0.5 ml/L zeatin, 0.1778 mg/L glyphosate, and 100 mg/L carbenicillin. The plates were incubated in the dark for two weeks at 28° C. Somatic embryos were transferred onto media containing 2.15 g/L MS salts (Gibco 11117: Gibco, Grand Island, N.Y.), 2.5 ml/L MS Vitamins Stock Solution, 50 mg/L myo-inositol, 15.0 g/L sucrose, 0.1778 mg/L glyphosate, and 3.0 g/L Gelrite, pH 5.6 and incubated under artificial light at 28° C. One week later, plantlets were moved into glass tubes containing the same medium and grown until they were sampled and/or transplanted to soil. Target lines were screened by qPCR to assess the copy number of the transgenes and only single copy integration events were used as targets.
Example 3. Transformation and Regeneration of Recombinase-Mediated Cassette Exchange (RMCE) Events
[0168] Two plasmids were typically co-bombarded with SSI donor plasmids to facilitate recombination in PHWWE: PHP5096 and PHP21875. PHP5096 included a maize codon-optimized flp recombinase gene (SEQ ID NO: 42) under the control of maize UBI PRO and a pinII 3' sequence. The second co-bombarded plasmid, PHP21875, contained a maize odp2 gene (also referred to herein as maize BBM; see WO 2005/075655, which is herein incorporated by reference in its entirety) controlled by the maize UBI PRO and pinII terminator. Three plasmids were typically co-bombarded with SSI donor plasmids to facilitate recombination in PHI581. The FLP plasmid was PHP5096 as above, but the second plasmid with BBM is either PHP21875 or PHP31729 with BBM expression regulated by the maize oleosin promoter (OLE). The third plasmid introduced into PHI581 is PHP21139, which has an auxin-inducible promoter IN2-2 controlling the expression of the maize wuschel gene (ZmWUS2). Experiments were performed with or without the BBM expression cassette to assess its impact on the recovery of RMCE events.
i) Delivery of Donor Vector
[0169] The donor plasmid was delivered via biolistic-mediated transformation into hemizygous immature embryos containing the recombinant target site created by the integration of PHP21199. Immature embryos at 9 to 11 DAP (1-1.5 mm in size) dissected from sterilized kernels were plated with their axis down onto media comprising 4.0 g/L N6 Basal salts (Sigma C-1416), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 1.0 mg/L thiamine HCl, 1.5 mg/L 2,4-D, 0.690 g/L L-proline, 30 g/L sucrose, 0.85 mg/L silver nitrate, and 3.0 g/L Gelrite, pH 5.8 and incubated in the dark at 28° C. for 3 to 5 days before the introduction of DNA. Two to four hours prior to bombardment, the embryos were plasmolyzed by placing them on the above media containing 120 g/L of sucrose.
[0170] Plasmid DNA was associated with gold particles in preparation for biolistic-mediated transformation by mixing 100 μg of the donor plasmid, 10 μg of PHP5096 (encoding mFLP), and in some bombardments, 10 μg of the helper plasmid PHP21875 (UBI:ODP2) (the volume of the DNA solution was adjusted to 40 μl), 50 μl of 1-μm gold particles at 0.01 mg/μl, and 5 μl TFX-50 (Promega E1811/2). The solution was allowed to gently mix for 10 minutes. The particles and attached DNA were spun down for 1 minute at 10,000 rpm and then the supernatant was removed and replaced with 120 μl of 100% ethanol. The particles were then re-suspended by gentle sonication. 10 μl of the particle solution was spotted on each carrier disc and the ethanol was allowed to evaporate. The macro carrier was placed 2.5 cm from a 450 psi rupture disc with the immature embryos placed on a shelf 7.5 cm below the launch assembly.
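The quantities in this preparation imply a nominal loading per carrier disc, which the following Python sketch works out. It is only an arithmetic illustration: the function name is hypothetical, and it assumes that all DNA stays associated with the gold, that nothing is lost with the supernatant, and that the 120-unit ethanol suspension is divided evenly into 10-unit spots. None of these assumptions is stated in the text.

```python
def per_disc_loading(donor_ug=100.0, flp_ug=10.0, helper_ug=10.0,
                     gold_ul=50.0, gold_mg_per_ul=0.01,
                     resuspension_ul=120.0, spot_ul=10.0):
    """Nominal per-disc amounts, assuming complete binding and no losses."""
    discs = resuspension_ul / spot_ul          # number of discs per prep
    gold_ug_total = gold_ul * gold_mg_per_ul * 1000.0
    return {
        "discs": discs,
        "gold_ug_per_disc": gold_ug_total / discs,
        "donor_ug_per_disc": donor_ug / discs,
        "flp_ug_per_disc": flp_ug / discs,
        "helper_ug_per_disc": helper_ug / discs,
        # FLP and helper plasmids relative to donor in the mix
        "flp_to_donor_ratio": flp_ug / donor_ug,
        "helper_to_donor_ratio": helper_ug / donor_ug,
    }


loading = per_disc_loading()
```

With the stated volumes, one preparation covers 12 discs, and the FLP and helper plasmids are each present at one tenth of the donor amount.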
ii) Selection of RMCE Events
[0171] After bombardment, the embryos were removed from the high sucrose media and placed back on the same medium containing 30 g/L sucrose. The embryos were incubated in the dark at 28° C. for 7 days, at which time the embryos were moved to selection plates of the above media containing either 3.0 mg/L bialaphos (selection of first round RMCE events) or 0.1778 mg/L glyphosate (selection of second round RMCE events). Embryos were subcultured to fresh medium after 3 weeks and transgenic events were identified 4 weeks later. Transgenic events growing under selection were then observed for their fluorescent phenotype. Those that exhibited a fluorescent phenotype indicative of RMCE were regenerated under the appropriate selective agent (bialaphos or glyphosate) using the above protocol. Plantlets were sampled and/or transplanted to soil.
iii) Regeneration
[0172] Plant regeneration medium (288J) comprised 4.3 g/L MS salts (GIBCO 11117-074), 5.0 ml/L MS vitamins stock solution (0.100 g/L nicotinic acid, 0.02 g/L thiamine HCl, 0.10 g/L pyridoxine HCl, and 0.40 g/L glycine brought to volume with polished D-I H2O) (Murashige & Skoog (1962) Physiol Plant 15:473), 100 mg/L myo-inositol, 0.5 mg/L zeatin, 60 g/L sucrose and 1.0 ml/L of 0.1 mM abscisic acid (brought to volume with polished D-I H2O after adjusting to pH 5.6), 3.0 g/L Gelrite™ (added after bringing to volume with D-I H2O), and 1.0 mg/L indoleacetic acid and 3.0 mg/L bialaphos (added after sterilizing the medium and cooling to 60° C.). Hormone-free medium (272V) comprised 4.3 g/L MS salts (GIBCO 11117-074), 5.0 ml/L MS vitamins stock solution (0.100 g/L nicotinic acid, 0.02 g/L thiamine HCl, 0.10 g/L pyridoxine HCl, and 0.40 g/L glycine brought to volume with polished D-I H2O), 0.1 g/L myo-inositol, and 40.0 g/L sucrose (brought to volume with polished D-I H2O after adjusting pH to 5.6); and 6 g/L bacto-agar (added after bringing to volume with polished D-I H2O), and was sterilized and cooled to 60° C.
iv) Polymerase Chain Reaction
[0173] DNA was extracted via a modified alkaline lysis method using 1 punch (200 ng) of fresh leaf tissue (Truett et al. (2000) Biotechniques 29:52-54). For quantitative PCR (qPCR), each gene was quantitated using specific forward and reverse primers along with a corresponding FAM based MGB (Applied Biosystems, Foster City, Calif.) fluorogenic multiplexed probe. Each assay was primer titrated and normalized to an amplification signal from an endogenous gene which utilized a VIC®-based sequence specific probe and primer set. The amplification reactions for the bar and CFP genes were run simultaneously with the normalizing gene in a single tube reaction. Upon completion of the qPCR, all raw data were used to calculate the dCT values. Copy number determination was computed with the ΔΔCT method as described in the ABI User Bulletin #2 (Applied Biosystems, Foster City, Calif.). Endpoint positive and negative qPCR calls were made for flp, ubi:odp2, ubi:frt1:bar and the FrtX junctions according to the dCT estimates. A PCR reaction requiring 5 more cycles than the normalizing gene was considered negative for the transcript.
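The dCT and ΔΔCT arithmetic referred to above can be sketched as follows in Python. This is an illustration of the general comparative CT calculation, not the analysis pipeline actually used: the function names, the calibrator sample of known copy number, and the assumption of roughly 100% amplification efficiency (a doubling per cycle) are all assumptions, and the cutoff is read here as "5 or more cycles later than the normalizing gene."

```python
def delta_ct(target_ct, normalizer_ct):
    """dCT: target CT minus the CT of the endogenous normalizing gene."""
    return target_ct - normalizer_ct


def copy_number(sample_dct, calibrator_dct, calibrator_copies=1):
    """Estimate copy number by the comparative (delta-delta CT) method.

    Assumptions: the calibrator is a sample with a known copy number and
    amplification efficiency is close to 100% (doubling per cycle).
    """
    ddct = sample_dct - calibrator_dct
    return calibrator_copies * 2.0 ** (-ddct)


def endpoint_call(target_ct, normalizer_ct, cutoff_cycles=5.0):
    """Call a reaction negative if it lags the normalizer by >= cutoff."""
    if delta_ct(target_ct, normalizer_ct) >= cutoff_cycles:
        return "negative"
    return "positive"
```

For instance, a sample whose dCT is one cycle lower than that of a single-copy calibrator would be estimated at two copies, and a junction assay that crosses threshold six cycles after the normalizing gene would be called negative.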
v) Sequencing
[0174] qPCR samples identified as positive for recombinant junctions (UBI-FRT1-BAR, donor-FRT87-target) were further characterized by agarose gel electrophoresis (FIG. 5) and sequencing. Each qPCR reaction was run as an individual lane on a 2% agarose gel and visualized by ethidium bromide staining under UV light. DNA bands of the expected size were independently cut from each lane of the gel and extracted from the agarose using the QIAquick gel extraction kit (Qiagen, Valencia, Calif.). Samples of these extractions were submitted directly for DNA sequencing. Replicate DNA samples were submitted for sequencing with both forward and reverse sequencing primers.
vi) Southern Blots
[0175] Leaf tissue (2-10 grams fresh weight) was freeze-dried and ground to a fine powder. Ground tissue (350 mg) was re-suspended in 9 ml CTAB extraction buffer with β-mercaptoethanol (10 μl/ml). This solution was incubated at 65° C. for 1 hour. Every 20 minutes, tubes were inverted several times to mix the material and solution. Tubes were removed from the incubator and allowed to cool 10 minutes prior to adding 5 ml chloroform/octanol (24:1). Tubes were mixed by gently inverting for 5 minutes, and then centrifuged at 2500-3000 rpm (1100×g) for 30 minutes. The aqueous top layer was transferred to a fresh tube containing 11 ml precipitation buffer, and inverted several times gently. The tubes were allowed to stand at 25° C. (room temperature) for 30 minutes to 2 hours, were centrifuged at 2000 rpm for 20 minutes, and the supernatant was discarded. The tubes were inverted to dry the pellet. The dried pellet was completely dissolved in 2 ml of 100 mM Tris (pH 7.5), 10 mM EDTA (pH 7.5), 0.7 M NaCl, and precipitated in 5 ml of 95-100% ethanol. DNA was pipetted into a tube containing 1 ml of 76% ethanol, 0.2 M sodium acetate for 20 minutes, transferred to a fresh tube containing 1 ml 76% EtOH, 10 mM ammonium acetate for 1 minute, and then transferred again into a third tube and re-suspended.
Example 4. Transient Expression of ZmBBM and Recovery of RMCE Events in Maize
[0176] Recombinant Target Loci (RTL) were created by Agrobacterium-mediated transformation of immature maize embryos. The target sequence was flanked on the 5' side by the wild-type FLP recognition target site (FRT1) paired on the 3' side with a heterospecific FRT87. The integration copy number was determined by real-time quantitative PCR (qPCR) and transgenic events containing only a single RTL with a single copy of each gene were used. The RTL contained a yellow fluorescent protein gene (YFP) driven by the maize ubiquitin promoter. The wild-type FRT was inserted between the maize ubiquitin promoter and the YFP gene to act as a promoter trap for activation of a promoterless marker gene in the donor vector following FLP-mediated recombination at the FRT site. The target vectors also contained the selectable marker gene glyphosate acetyltransferase (GAT) driven by the maize ubiquitin promoter.
[0177] Immature embryos containing the RTL were re-transformed by particle bombardment, wherein the donor vector was co-delivered with the vector PHP5096 (UBI PRO::FLPm::pinII) in all experiments along with the helper plasmid PHP21875 (UBI PRO::ZmBBM::pinII) in the majority of experiments, both at 1/10 of the concentration of the donor vector. In this instance, transient expression of FLP and BBM was achieved through a reduction in the titer of both the FLP and BBM-containing plasmids, while effectively eliminating random integration and subsequent stable expression of both cassettes. Other means of promoting transient expression can also be used, such as delivery of FLP and/or BBM RNA or protein, in addition to the standard amount of donor plasmid as the substrate for RMCE.
[0178] In the first round of RMCE, the donor sequence, flanked by FRT1 and FRT87 sites, contained a promoterless bar gene and the gene encoding the cyan fluorescent protein (CFP) controlled by the maize ubiquitin promoter. RMCE resulted in the exchange of the YFP and GAT genes located at the RTL with bar and CFP from the donor plasmid. To demonstrate the ability to reuse a target site with the FLP/FRT recombination system, a second round of RMCE was performed. Two RTLs were chosen that contained the FRT1-FRT87 pair. The product of the first round of RMCE at the RTL became the target for a new round of RMCE. The next round of RMCE was initiated by delivering the PHP27064 donor vector by particle bombardment. The donor vector contained the wild type FRT1, a promoterless GAT gene for selection and Ubi:YFP flanked by the heterospecific FRT87. RMCE resulted in the exchange of the bar and CFP genes located at the RTL with GAT and YFP from the donor plasmid. The FLP protein used to mediate the recombination was again transiently expressed by co-delivery of the vector PHP5096.
[0179] In the first round of RMCE, replacement of the target sequence at the RTL by the donor sequence led to expression of the otherwise promoterless bar gene. Putative RMCE events were initially selected by placing bombarded embryos on bialaphos-containing media (Table 1, column 2). Growth of callus on bialaphos-containing media was indicative of site-specific integration, but some random integrations of the donor vector also resulted in expression of the promoterless bar gene. In fact, random integration of the donor plasmid and growth on bialaphos-containing media was more frequent than RMCE. On average, under our experimental conditions, 9 bialaphos-resistant calli were routinely recovered for every 1 RMCE event identified. Nevertheless, use of the promoter trap and selection on bialaphos-containing media enriched the population of selected calli for RMCE events.
[0180] Calli growing on bialaphos-containing media were further characterized by phenotypic loss and gain of expression of fluorescence marker genes. In the first round of RMCE, excision of the YFP gene resulted in calli that were negative for the YFP phenotype, while integration (targeted or random) of the CFP gene contained in the donor vector resulted in expression of CFP. In contrast, random integration of the donor vector did not result in replacement of the target sequence, and such calli remained positive for YFP.
[0181] In the second round of RMCE, activation of a promoterless GAT gene (in the donor cassette) was used to chemically select for RMCE prior to monitoring of the fluorescent phenotype. In this case, putative RMCE events were YFP positive due to the integration of the donor cassette and CFP negative due to the exchange and excision of the FRT flanked sequence at the RTL. Callus sectors showing the expected fluorescence pattern were transferred to plant regeneration media.
[0182] Molecular confirmation of RMCE was performed on DNA extracted from regenerated plantlets. Putative RMCE events were characterized with a series of six PCR reactions. PCR primers unique to the target and donor sequences were used in combination to amplify DNA fragments bridging the recombined FRT junctions. PCR amplification was observed only when recombination between FRT sites at the RTL and donor occurred. Routinely, real-time quantitative PCR was used for this analysis. To verify that the PCR product was generated across the recombinant junction, a sample of the qPCR products was run on a gel to confirm size and was sequenced to demonstrate the presence of target sequence, the FRT site, and donor sequence. The predicted fragment sizes of the recombinant products were confirmed by Southern blot hybridization. Putative RMCE events were analyzed by real-time quantitative PCR for copy number of genes in the donor cassette. Excision of the target sequence was verified by qPCR for the fluorescent marker gene initially at the RTL. qPCR was also used to determine whether the FLPm or ODP2 genes had integrated.
[0183] As can be seen in Table 1, RMCE events were identified through a sieving process, first by activation of a promoterless selectable marker, then by phenotyping of fluorescence and finally by molecular analysis of regenerated plants. Samples found to have both recombinant FRT junctions and excision of the target sequence were considered to be the result of RMCE.
[0184] As another means of confirming recombination, genomic DNA was extracted from several of the SSI events and sequenced across the FRT junctions to demonstrate the presence of both target and donor sequence and conservation of the FRT site itself. In one of the recombinant events, sequencing of the FRT87 site revealed a mutation in the 8 bp core region of the FRT site. The number of copies of integrated donor genes was determined by qPCR. Excision of the target sequence was verified by qPCR for the fluorescent marker gene initially at the RTL. qPCR was also used to determine whether the FLPm gene had integrated. Random integrants growing under selection and not expressing the target fluorescent marker were identified and eliminated based on the lack of PCR products for the FRT junctions (Table 1, column 3). Precise RMCE was identified by the pattern of the PCR results (Table 1, columns 4 and 5). Only those events containing both the 5' and 3' FRT junctions, a single copy of the donor cassette and the absence of the target sequence and FLPm were considered precise RMCE events (Table 2). An RMCE event was considered imprecise if it contained more than a single copy of either of the donor genes even though both FRT junctions were present. Of the events found to have recombined at both FRT sites, about 10% also contained a random integration locus which segregated independently in the next generation. Various other types of imprecise RMCE and site-specific integrations were also identified by molecular characterization. In all, forty precise RMCE events were identified in the first round of RMCE.
TABLE 1. Identification of RMCE events in re-transformed embryos.
Target embryos bombarded | Bialaphos resistant calli | Regenerable, bialaphos resistant, CFP+/YFP- | Random integration (No recombinant FRT junction) | Site-specific integration (Recombinant junction only) | RMCE (Both recombinant FRT1 and FRT junctions)
14,945 | 560 | 129 | 56 | 21 | 52
 | 3.75%* | 0.86% | 0.37% | 0.14% | 0.35%
* Percent of bombarded embryos
[0185] Although events were identified in which FRT sites in the donor cassette recombined with those at the RTL, not all resulted in clean RMCE events (Table 2). Of the 52 events that had recombination of both FRT sites and loss of the target sequence (RMCE), 12 were found to have additional integrations of the donor cassette or integration of FLP or ZmBBM. Recombination was also observed to occur at only the FRT1 site, resulting in the separation of YFP from the ubiquitin promoter with and without the excision of the entire target sequence. Random integration of the donor cassette, as observed previously, would result in growth under selection with loss of YFP expression due to excision of the target sequence by illegitimate recombination between heterospecific FRT sites or silencing of YFP.
TABLE 2. Genotyping of putative RMCE plantlets by real time quantitative PCR.
Integration | FRT1 junction | FRT87 junction | bar (est. copy) | CFP (est. copy) | YFP | FLPm | # events
Desired recombination product (Clean RMCE) | + | + | 1 | 1 | - | - | 40
Other patterns of integration observed:
RMCE - with additional donor cassette and/or integrated FLP or ZmBBM plasmid | + | + | ≥1 | ≥1 | - | +/- | 12
FRT1 recombination only - target sequence excised | + | - | ≥1 | ≥1 | - | +/- | 16
FRT1 recombination only - target sequence not excised | + | - | ≥1 | ≥1 | + | +/- | 5
Random integration - target sequence excised | - | - | ≥1 | ≥1 | - | +/- | 12
Random integration - target sequence not excised | - | - | ≥1 | ≥1 | + | +/- | 31
Unknown - Complex integration | +/- | +/- | ≥1 | ≥1 | +/- | +/- | 13
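The genotype patterns in Table 2 can be read as a small decision rule. The following Python sketch is offered only as an illustration of that reading; the function name, argument names, and the use of simple true/false calls for each assay are assumptions made here, and ambiguous (+/-) calls, which Table 2 groups under complex integration, are not modeled.

```python
def classify_event(frt1, frt87, bar_copies, cfp_copies, yfp, flpm):
    """Classify one regenerated event from its qPCR calls (illustrative only).

    frt1, frt87, yfp, flpm: True if the assay is positive, False if negative.
    bar_copies, cfp_copies: estimated copy number of each donor gene.
    """
    if frt1 and frt87:
        # Both junctions present: clean only with single-copy donor genes,
        # no YFP (target excised) and no integrated FLPm.
        if bar_copies == 1 and cfp_copies == 1 and not yfp and not flpm:
            return "clean RMCE"
        if not yfp:
            return "imprecise RMCE"
        return "complex"
    if frt1 and not frt87:
        return "FRT1 recombination only, target " + ("not excised" if yfp else "excised")
    if not frt1 and not frt87:
        return "random integration, target " + ("not excised" if yfp else "excised")
    # Any pattern not listed in Table 2 (e.g., FRT87 junction alone).
    return "complex"


print(classify_event(True, True, 1, 1, False, False))
print(classify_event(True, False, 2, 1, True, False))
```

A rule of this kind only restates the table; the actual calls were made from the full set of PCR, sequencing, and Southern blot results.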
[0186] About 30% of the regenerated events selected by phenotype (bialaphos resistant, CFP positive, YFP negative) were precise RMCE events based on molecular characterization, while about 70% of the regenerated events were eliminated. In about 60% of the discarded events, the FRT junctions were not found. These events may be the result of random integration of the donor plasmid. The remaining 40% of the discarded events appeared to have undergone site-specific integration at the target locus, but resulted in integration patterns reflecting either recombination at only the FRT1 site or an imprecise RMCE (Table 2). In a few events, FLPm was found to be integrated, but these events generally had other abnormalities.
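The approximate percentages in this paragraph can be recomputed from the event counts in Table 2. The short Python sketch below does so; the dictionary keys are shorthand labels chosen here, and grouping the "Unknown - Complex integration" events with the events lacking FRT junctions is an assumption inferred from the count of 56 reported in Table 1.

```python
# Event counts copied from Table 2 (labels are shorthand, not the table's wording).
table2_counts = {
    "clean RMCE": 40,
    "imprecise RMCE": 12,
    "FRT1 only, target excised": 16,
    "FRT1 only, target not excised": 5,
    "random, target excised": 12,
    "random, target not excised": 31,
    "unknown/complex": 13,
}

total = sum(table2_counts.values())
precise = table2_counts["clean RMCE"]
discarded = total - precise

# Assumption: events without a recombinant FRT junction are the two random
# integration classes plus the unknown/complex class (56 events in Table 1).
no_junction = (table2_counts["random, target excised"]
               + table2_counts["random, target not excised"]
               + table2_counts["unknown/complex"])
site_specific = discarded - no_junction

print(f"regenerated events: {total}")
print(f"precise RMCE: {100 * precise / total:.1f}% of regenerated events")
print(f"no FRT junction: {100 * no_junction / discarded:.1f}% of discarded events")
print(f"FRT1-only or imprecise RMCE: {100 * site_specific / discarded:.1f}% of discarded events")
```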
[0187] In the second round of RMCE, activation of a promoterless GAT gene in the donor sequence was used to select for RMCE. In this case, about 62.5% of the regenerated events selected by phenotype were precise RMCE events based on molecular characterization. 96% of the putative RMCE events selected based on phenotype that reached the plant stage were found to have recombined at least at FRT1. The frequency of single recombination events at FRT1 and imprecise RMCE was 45% in the first round of RMCE and 38% in the second round.
[0188] The PCR reactions crossing the FRT junctions that were used to identify RMCE events were verified by both sequencing the PCR products and by Southern blot hybridization. The PCR products derived from several events were sequenced to demonstrate the contribution of sequence from the target and donor flanking the FRT site. RMCE was also verified by Southern blot hybridization of genomic DNA extracted from 30 putative RMCE events.
[0189] In the above experiments, the numbers of non-ZmBBM and ZmBBM treatments analyzed were not equal, but embryos from many ears were evaluated in both treatments. Overall, inclusion of ZmBBM resulted in a 2-3 fold improvement in RMCE recovery in maize compared with experiments in which the ZmBBM expression cassette was not used.
Example 5. Controlled Expression of ZmBBM
[0190] Any method can be used to control the timing and/or location of expression of a cell proliferation factor, for example, ZmBBM. Molecular cloning and vector construction methods are well known and any such methods can be used to generate constructs with various elements or systems to regulate the timing or location of expression.
A. Transient Expression of ZmBBM
[0191] A particle gun was used to deliver the donor plasmid PHP22297 and PHP5096 plus or minus a UBI PRO::ZmBBM::pinII containing plasmid (PHP21875). During the TFX-mediated precipitation, 100 ng of PHP22297 and 10 ng of PHP5096 and PHP21875 (in the ZmBBM-containing treatment) were mixed. These plasmids, attached to gold particles as described in Example 3, were shot into immature embryos containing a single integrated copy of the T-DNA from PHP21199 (the target locus for RMCE). For this comparison (plus or minus ZmBBM), equal numbers of embryos from each ear, for a total of 176 ears, were used for side-by-side testing. For the control treatment (minus ZmBBM), 4551 bombarded embryos were taken through the selection protocol, and 13 RMCE events were recovered for an overall frequency of 0.29%. When ZmBBM was included in the bombardment, 4719 embryos produced 29 RMCE events for an overall frequency of 0.61%. This represented a consistent 2-fold increase in RMCE recovery when the ZmBBM gene was included.
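The frequencies and fold change reported above follow directly from the event and embryo counts. A minimal Python sketch of the arithmetic is given below; the function and variable names are chosen here for illustration only.

```python
def frequency(events, embryos):
    """Fraction of bombarded embryos that yielded an RMCE event."""
    return events / embryos


control = frequency(13, 4551)    # minus ZmBBM
with_bbm = frequency(29, 4719)   # plus ZmBBM
fold = with_bbm / control

print(f"control: {100 * control:.2f}%")
print(f"with ZmBBM: {100 * with_bbm:.2f}%")
print(f"fold change: {fold:.2f}")
```

The ratio of the two frequencies is slightly above 2, in line with the 2-fold increase stated in the text.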
B. Tissue-Preferred Expression of ZmBBM
[0192] The ZmBBM gene was placed under the control of a maize oleosin promoter (SEQ ID NO: 55), which is a seed-preferred promoter expressed only in the scutella of developing embryos. The resulting expression plasmid containing OLE PRO::ZmBBM::pinII (PHP31729) was co-delivered along with the donor vector PHP22297, into immature embryos containing a single copy of the recombination target locus. Following selection on bialaphos and screening for loss of YFP and gain of CFP, RMCE events have been recovered. Expression of ZmBBM in callus cells increases the frequency of RMCE.
C. Excision of ZmBBM
[0193] An excisable ZmBBM plasmid comprising two expression cassettes (loxP-Ubi::ZmBBM::pinII+Rab17::Cre-loxP) is created. These two expression cassettes are co-delivered, along with the donor vector PHP22297, into immature embryos containing a single copy of the Recombination Target Locus. Expression of ZmBBM in callus cells increases the frequency of RMCE. In these experiments, the promoter controlling the expression of Cre is inactive during callus growth and chemical selection of RMCE events. Upon mild desiccation of the callus, for example, by placing the callus on high osmoticum such as 18% sucrose or onto dry filter papers for 1-3 days, expression of Cre recombinase is stimulated and both the BBM and Cre expression cassettes, being flanked by loxP recombinase target sites, are excised. Regeneration of fertile RMCE events is performed as described elsewhere herein.
D. Inducible Expression of ZmBBM for Recovery of RMCE Events in Maize
[0194] The ZmBBM gene can be placed under the control of an inducible expression system, such as that described in U.S. Application Publication No. 2008/0201806 A1, which is herein incorporated by reference in its entirety. Expression cassettes comprising a Triple-Op 35S promoter (Gatz et al. (1992) Plant J 2:397-404) and a pinII 3' sequence operably linked to the ZmBBM gene and a UBI PRO-driven maize-codon modified Tet repressor are constructed. These expression cassettes are co-delivered, along with the donor vector PHP22297, into immature embryos containing a single copy of the Recombination Target Locus. The addition of 1 mg/L tetracycline to the culture medium induces BBM expression, which stimulates cell division and results in an increased recovery of RMCE events in maize.
E. Co-Expression of BBM and Wuschel
[0195] Developmental and inducible promoters were combined to control the expression of ZmBBM and ZmWUS2, respectively, in order to accomplish site specific integration (SSI) in maize inbred PH581. The experiments involved a different SSI target plasmid, PHP17797, although the basic function was identical to PHP21199 as described above. PHP17797 has the maize ubiquitin promoter driving FLP recombinase as the first gene that included the wild type FRT (FRT1) recombinase site. The second gene was CAMV35S PRO:BAR: pinII to provide bialaphos resistance in tissue culture. After the BAR gene, the FRT5 recombinase site was used instead of the FRT87 in PHP21199. Target immature embryos (PH581, 13 DAP) were bombarded using the particle gun for the co-delivery of donor constructs and developmental gene constructs. The ultimate goal was to recover normal fertile plants and then to segregate BBM and WUS2 from the transformation construct in the progeny. SSI donor vector, PHP33552, was bombarded with and without developmental gene constructs to compare the effect of including BBM and WUS2. PHP33552 included a promoterless gene encoding the yellow fluorescent protein (YFP, ZS-Yellow1 N1, Clontech, Palo Alto, Calif., USA). The genes in PHP33552 were flanked by FRT1 and FRT5 to facilitate recombinase-mediated cassette exchange (RMCE) in the presence of FLP recombinase. Correct site-specific integration activates YFP from a captured promoter in the target locus.
[0196] Using a particle gun for transformation, both SSI and standard transformation were attempted in SSI target lines without added BBM and/or WUS2 constructs. PH581 was capable of developing a low frequency of callus using standard transformation methods (0.3%) and a few events were regenerated. The regenerated plants were recovered to the greenhouse and set seed. When SSI methods were used, the numbers of transformed calli with the correct phenotype were lower than with standard transformation methods and no plants could be regenerated. PH581 plant regeneration from tissue culture occurs at a relatively low frequency compared to model maize lines for transformation, such as the public line Hi-II.
[0197] Constitutively expressed BBM and WUS2 were co-bombarded with donor vectors for SSI. In these experiments, the maize Ubi promoter controlled the expression of ZmBBM and the Agrobacterium nopaline synthase (NOS) promoter regulated ZmWUS2 expression. These treatments provided a higher frequency of callus with the SSI phenotype (10-30%). SSI was confirmed by real-time quantitative PCR (QPCR) analysis in callus that demonstrated continued growth in culture and exhibited the expected phenotype. Importantly, plants were able to be regenerated from the SSI positive callus. However, the plants demonstrated abnormal morphology, suspected to be due at least in part to the uncontrolled expression of BBM and WUS2. Roots showed the thickened phenotype attributable to BBM expression. As in past experiments with these developmental genes, regeneration frequency is negatively impacted by BBM and WUS2 expressed in this manner.
[0198] In another set of particle gun transformation experiments using immature PH581 embryos from SSI target lines, standard transformation and SSI were tested with the controlled expression of ZmBBM and ZmWUS2. The maize embryo-preferred promoter, oleosin (Ole Pro) was employed to regulate ZmBBM expression. This promoter is active in developing embryos during callus growth and kernel development. The maize IN2-2 PRO (deVeylder et al. (1997) Plant Cell Physiol 38:568-77) was used to express ZmWUS2. The IN2-2 PRO promoter has a low level constitutive activity, which can be further activated in the presence of auxin that can be provided in the tissue culture medium. This expression strategy allowed for the recovery of a number of callus events having the SSI phenotype. It also provided for the recovery of young T0 plants that were characterized with multiple qPCR assays to demonstrate SSI and to confirm the presence or absence of target genes, extra copies of genes from PHP33552, and integrated copies of the BBM and WUS2 plasmids. Young plants with the correct qPCR profile and YFP phenotype were advanced to the greenhouse where they developed into late-stage plants. In most cases, these plants were fertile. In some instances, plants exhibited delayed development or a stunted phenotype. During the flowering stage, the segregation of the cell proliferation transgenes was promoted by crossing tissue cultured plants with conventional PH581. Ears were harvested at about 13-15 DAP and immature embryos were plated on basic culture medium for embryo rescue. YFP positive kernels segregated 1:1 with null kernels as predicted when accounting for single, unlinked transgenic loci, one of which carries OLE PRO-ZmBBM and the second a recombined target locus. qPCR analysis of progeny plants confirmed that the YFP positive plants contained a recombinant SSI target locus. The kernels that were negative for YFP expression were the SSI null segregants.
[0199] By controlling the expression of ZmBBM and ZmWUS2 with developmental and inducible promoters, these developmental genes have been used to facilitate RMCE at numerous different target loci.
Example 6. Gene Targeting Using Homing Endonucleases
[0200] Molecular cloning and vector construction methods are well known and any such methods can be used to generate constructs to provide elements such as double-strand break-inducing enzymes, artificial target sites, targeting vectors, cell proliferation factors, or any other useful element. Vector construction is performed using standard molecular biology techniques. Any method of transformation can be used, and vector construction and/or insert preparation can be modified accordingly.
[0201] DNA double-strand break-inducing enzymes, such as an endonuclease, create double-strand breaks in the genome. Subsequent repair of the break can produce a mutation, a DNA insertion, or a homologous recombination product. In this manner, a double-strand break-inducing enzyme can be used for targeted modification of the genome to introduce a mutation, targeted insertion, or homologous recombination at a target locus. It is expected that the provision of one or more cell proliferation factors will enhance the targeted modification rates with double-strand break methods. Increased modification rates are expected at both artificial and endogenous target locus sites. Similarly, cell proliferation factors may also increase the rate of recovery of events in which a modification has occurred at the target locus. For example, one or more cell proliferation factors can be provided by introducing expression cassettes (e.g., Ubi Pro::Ubi intron::ZmBBM::pinII+nos Pro::ZmWUS2::pinII), resulting in enhanced gene targeting rates.
A. Artificial Target Site
[0202] An artificial target site (ATS) construct (ATS2) was constructed using a MDTP tetra-peptide linker to create a translational fusion between the selectable markers MoPAT (U.S. Pat. No. 6,096,947) and YFP (PHP21829). An in-frame insertion of the I-SceI recognition sequence in front of the MDTP-linker sequence of PHP21829 resulted in PHP22710. Upon delivery of the PHP21829 or PHP22710 construct into Hi-II maize immature embryos for functional evaluation, spots of yellow fluorescence were observed, confirming expression of the marker. Three stop codons were added to the PHP22710 fusion construct in front of the YFP coding sequence to create the artificial target site 2 (ATS2, PHP22709) construct. PHP22709 comprises the following operably linked components: Ubi pro::FLPm-rice actin pro::moPAT/I-SceI site/YFP::pin II-gAt. As expected, no visible yellow fluorescence was observed in Hi-II embryos bombarded with PHP22709.
[0203] ATS2 was designed with a minimal amount of sequences derived from maize to facilitate the interpretation of results. moPAT and YFP provide 5' and 3' homologous regions (.about.1 kb and .about.4.1 kb, respectively) for targeting in homologous recombination experiments. Homology of the 3' region was increased through the addition of 1578 bp of non-coding genomic sequence from Arabidopsis (gAt) following the pinII terminator. A FLP expression cassette was included in some experiments in order to test certain targeting vectors and other experimental design strategies.
B. Targeting Vectors
[0204] Several versions of targeting vectors were generated for delivery into maize embryos. Targeting vectors were designed that comprise a maize codon-modified I-SceI (moI-SceI) meganuclease expression vector derived from PHP22603 (U.S. Patent Application Publication No. 2009/0133152, which is herein incorporated by reference) and a positive selectable GAT4621 marker gene, flanked by two DNA segments homologous to the ATS2 target site. The homologous segments are 3019 bp (HR1) and 924 bp (HR2), respectively, in length. The GAT4621 gene is asymmetrically positioned within the homologous region to facilitate the identification of homologous recombinants by PCR. The basic vector was named TV-ATS2 (Targeting Vector for Artificial Target Site #2) and comprises the following operably linked components: Ubi pro::ubi 5' UTR::moI-SceI::pinII-HR1-ubi pro::ubi 5' UTR::GAT4621::pinII-HR2
[0205] A second targeting vector, named TV-ATS2Eraser, has two FRT sites directly flanking the TV-ATS2 elements, and was designed to provide a method to eliminate random integration events from selected material and to enrich the recovery of targeted events. TV-ATS2Eraser comprises the following operably linked components: FRT-ubi pro::ubi 5'UTR::moI-SceI::pinII-HR1-ubi pro::ubi 5' UTR::GAT4621::pinII-HR2-FRT
[0206] A third targeting vector (TV-ATS2Turbo) carries a T-DNA replication cassette. Replicating T-DNAs are expected to persist longer in the transformed cells, providing more substrate and time for DNA recombination, including homologous recombination. Replication activity is provided by a modified version of the wheat dwarf virus replication-associated protein (Rep) lacking the intron sequences between the two open reading frames RepA and RepB, along with its cognate origin of replication (LIR). The replicase function of Rep is provided by the longer transcript encompassing two open reading frames (RepAB). Testing confirmed replication activity in BMS cells upon the delivery of the TV-ATS2Turbo cassette. It is possible that strong expression of RepAB may negatively impact the growth of transformed tissues. If this is the case, the Rep cassette may also act as a form of negative selection against random integrations, thus helping to identify potential target modification events. TV-ATS2Turbo comprises the following operably linked components: Ubi pro::ubi 5' UTR::moI-SceI::pinII-WDV SIR::RepAB::WDV LIR-HR1-ubi pro::ubi 5' UTR::GAT4621::pinII-HR2.
[0207] A fourth targeting vector, TV-ATS2TurboEraser, combines all the elements of the TV-ATS2Turbo vector, including the moI-SceI expression cassette, the GAT4621 marker for selection of all transformation events, the RepAB gene for amplification of T-DNAs, and FRT sites to reduce the number of randomly integrated T-DNAs in selected material. TV-ATS2TurboEraser comprises the following operably linked components: FRT-Ubi pro::ubi 5' UTR::moI-SceI::pinII-WDV SIR::RepAB::WDV LIR-HR1-ubi pro::ubi 5' UTR::GAT4621::pinII-HR2-FRT.
[0208] A fifth targeting vector (TV-PHP30662) was constructed using the same elements as TV-ATS2, but the vector lacks the regions of homology to the target site. TV-PHP30662 comprises the following operably linked components: Ubi pro::ubi 5' UTR::moI-SceI::pinII-ubi pro::ubi 5' UTR::GAT4621::pinII.
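The five targeting vectors described above differ only in which elements are combined. The following Python sketch represents each vector as an ordered list of its components, purely as an illustration; splitting the component strings at the "-" separators, the variable names, and the three feature flags are assumptions made here for readability.

```python
# Shared building blocks, copied from the component strings in the text.
I_SCEI = "Ubi pro::ubi 5' UTR::moI-SceI::pinII"
GAT = "ubi pro::ubi 5' UTR::GAT4621::pinII"
REP = "WDV SIR::RepAB::WDV LIR"

TV_ATS2 = [I_SCEI, "HR1", GAT, "HR2"]
TV_ATS2_ERASER = ["FRT"] + TV_ATS2 + ["FRT"]
TV_ATS2_TURBO = [I_SCEI, REP, "HR1", GAT, "HR2"]
TV_ATS2_TURBO_ERASER = ["FRT"] + TV_ATS2_TURBO + ["FRT"]
TV_PHP30662 = [I_SCEI, GAT]


def features(vector):
    """Summarize which design elements a vector carries."""
    return {
        "homology_arms": "HR1" in vector and "HR2" in vector,
        "frt_flanked": vector[0] == "FRT" and vector[-1] == "FRT",
        "replicating": REP in vector,
    }


def as_string(vector):
    """Join components with '-' as in the written component lists."""
    return "-".join(vector)


for name, vec in [("TV-ATS2", TV_ATS2), ("TV-ATS2Eraser", TV_ATS2_ERASER),
                  ("TV-ATS2Turbo", TV_ATS2_TURBO),
                  ("TV-ATS2TurboEraser", TV_ATS2_TURBO_ERASER),
                  ("TV-PHP30662", TV_PHP30662)]:
    print(name, features(vec))
```

Viewed this way, the Eraser variants add flanking FRT sites, the Turbo variants add the Rep cassette, and TV-PHP30662 serves as the control lacking homology to the target site.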
C. Maize Lines Comprising a Target Site
[0209] Maize lines comprising an artificial target site stably integrated into the genome were produced by Agrobacterium-mediated transformation. Zea mays Hi-II immature embryos were transformed using Agrobacterium-mediated transformation essentially as described in Djukanovic et al. (2006) Plant Biotech J 4:345-57. Briefly, 10-12 DAP immature embryos (1-1.5 mm in size) were dissected from sterilized kernels and placed into liquid medium. After embryo collection, the medium was replaced with 1 ml Agrobacterium (at a concentration of 0.35-0.45 OD550) containing a T-DNA comprising an artificial target site, e.g., ATS2 (PHP22709). Maize embryos were incubated with Agrobacterium for 5 minutes at room temperature, and then the mixture was poured onto a media plate. Embryos were incubated axis down, in the dark for 3 days at 20.degree. C., then incubated 4 days in the dark at 28.degree. C., followed by a transfer to new media plates containing 3.0 mg/L Bialaphos and 100 mg/L carbenicillin. Embryos were subcultured every three weeks until transgenic events were identified. Somatic embryogenesis was induced by transferring a small amount of tissue onto regeneration medium (containing 0.1 .mu.M ABA, 1 mg/L IAA, 0.5 mg/L zeatin, 1.5 mg/L Bialaphos, and 100 mg/L carbenicillin) and incubated in the dark for two weeks at 28.degree. C. All material with visible shoots and roots was transferred onto media containing 4.3 g/L MS salts (Gibco 11117), 5.0 ml/L MS Vitamins Stock Solution, 100 mg/L myo-inositol, 40.0 g/L sucrose, and 1.5 g/L Gelrite, pH 5.6, and incubated under artificial light at 28.degree. C. One week later, plantlets were moved into glass tubes containing the same medium and grown until they were sampled and/or transplanted into soil.
Results
[0210] A total of 20 T0 transgenic plants were generated. Nineteen T0 plants survived to maturity. Leaf samples from these plants were collected for Southern analysis. Only single copy events that produced greater than 10 T1 kernels were used for further experiments. Twelve T0 events were identified from this process. T1 seeds produced by T0 self pollinations were planted for further characterization to confirm single copy ATS2 events by T1 segregation analysis. PAT activity was determined using a PAT protein detection kit. Four events (59, 60, 99, and 102) showed 1:2:1 Mendelian segregation for the target site. Events 99 and 102 also showed a 3:1 segregation of PAT expression, which also verified that the selected events were transcriptionally active. A total of 68 homozygous plants were produced from six selected single copy events and moved to the greenhouse for seed amplification and embryo production for transformation. Of the six selected events, events 59 and 99 showed a good tassel/ear developmental coordination. Embryos from these two events were used for a FLP activity assay to further confirm that the target site was transcriptionally active and to verify FLP function. FLP activity was assessed with the PHP10968 construct, in which the uidA coding sequence and the maize ubiquitin sequence are separated by the GFP coding sequence flanked by two FRT sites. FLP-mediated excision of this fragment is expected to reconstitute GUS expression. Every embryo from these events had GUS activity, indicating that ATS2 target sites in the two independent events were transcriptionally active. Six homozygous, single copy transgenic maize lines containing the ATS2 fragment were produced. Hemizygous embryos can be produced for re-transformation experiments by backcrossing or outcrossing. An ATS homozygous line is crossed to non-transgenic parental plants in order to produce the ATS hemizygous embryos for re-transformation experiments. All dissected embryos contained one copy of the artificial target site.
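Segregation ratios such as the 1:2:1 and 3:1 ratios mentioned above are commonly checked with a chi-square goodness-of-fit test. The Python sketch below illustrates one such check; it is not the analysis used in these experiments, the plant counts are hypothetical because the text reports no per-event counts, and the critical values are the standard 5% chi-square thresholds for 1 and 2 degrees of freedom.

```python
# Standard chi-square critical values at the 5% level, keyed by degrees of freedom.
CRITICAL_0_05 = {1: 3.841, 2: 5.991}


def chi_square(observed, ratio):
    """Chi-square statistic for observed class counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_ratio(observed, ratio):
    """True if the counts do not differ significantly (5% level) from the ratio."""
    df = len(observed) - 1
    return chi_square(observed, ratio) < CRITICAL_0_05[df]


# Hypothetical T1 counts: homozygous, hemizygous, null for the target site.
print(fits_ratio([12, 25, 11], (1, 2, 1)))
# Hypothetical T1 counts: PAT positive, PAT negative.
print(fits_ratio([36, 12], (3, 1)))
```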
D. Target Site Modification
[0211] Agrobacterium-mediated transformation, as described elsewhere herein, is used to re-transform 9-12 DAP immature target line embryos comprising the ATS2 target site. The target line embryos are transformed with an I-SceI expression vector, and/or a targeting vector, with or without the following cassette: Ole Pro::ZmBBM::pinII+nos Pro::ZmWUS2::pinII+ALS Pro::Zm-ALS (HRA)::pinII. Zm-ALS (HRA) is the maize acetolactate synthase with two mutated amino acids, making it resistant to sulfonylurea herbicides. Transgenic embryos containing the artificial target site (ATS2) are re-transformed with the targeting vectors delivered on T-DNA molecules. The target sites contain the I-SceI restriction site and the targeting vectors provide the I-SceI meganuclease activity. Re-transformation of transgenic embryos containing ATS2 with an I-SceI expression cassette produces double-strand breaks at the target site. As a result, targeted modifications including short deletions and other rearrangements are introduced at the target site. A GAT expression cassette is used to confirm construct delivery; therefore, embryo co-cultivation is followed by callus selection on media containing 1 mM glyphosate. Transgenic callus events are resistant to glyphosate and exhibit blue fluorescence. In the re-transformation experiments for targeting, the selection protocol does not rely on activation/inactivation of moPAT::YFP; instead, all glyphosate-resistant, CFP+ events are screened by PCR for modifications of ATS indicative of targeting events.
[0212] For high-throughput PCR screening of large numbers of samples, DNA is extracted by a HotSHOT protocol (Truett et al. (2000) Biotechniques 29:53-54). Briefly, one leaf punch, or a sample of equivalent size, 400 .mu.l of extraction buffer (25 mM NaOH, 0.2 mM EDTA), and two stainless steel beads are placed in each tube of a Mega titer rack. The samples are ground and extracted by shaking in a Genogrinder at 1650 rpm for 30-60 seconds, then incubating for 60-90 minutes at 95.degree. C. The extracts are cooled to room temperature, 400 .mu.l neutralization buffer (40 mM Tris-HCl, pH 5.0) is added, and the extracts are shaken at 500 rpm for 20-30 minutes. The samples are centrifuged at 4000 rpm for 5-10 minutes, followed by the collection of the supernatant. Two .mu.l of the supernatant from each sample is used for PCR.
[0213] For further evaluation of putative transformation events, DNA extraction is performed using the Qiagen DNeasy Plant Mini kit according to the provided protocol (Qiagen Inc., Valencia, Calif., USA). PCR reactions contain 2 µl of DNA extract (100-200 ng), 10 µl of REDExtract-N-Amp PCR mix (R4775, Sigma, St. Louis, Mo.), 0.05 µl of each primer at a 100 µM concentration, and 7.9 µl water. The Expand Long Template PCR amplification system (Roche Molecular Biochemicals, Indianapolis, Ind.) is used to amplify products of about 3 kb or larger. The Eppendorf Mastercycler Gradient cycler (Eppendorf North America, Westbury, N.Y.) is used with a PCR program specific for the particular primer annealing temperature and length of the desired PCR product. PCR products are evaluated and purified by agarose gel electrophoresis, by loading 15 µl of each PCR reaction on a 1% agarose gel. PCR products are purified using a Qiagen PCR purification kit (Qiagen Inc., Valencia, Calif.). Products less than 4 kb are directly sequenced, or cloned into the pCR4-TOPO vector (InVitrogen, Carlsbad, Calif., USA). Longer PCR products are first cloned into a vector and then sequenced.
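The reaction composition given above can be checked with simple arithmetic. The Python sketch below totals the component volumes and derives the final primer concentration; it assumes that "each primer" refers to two primers (one primer pair) per reaction, and the 10% overage used when scaling to several reactions is a hypothetical pipetting allowance, not a value from the text.

```python
import math

# Per-reaction volumes in microliters, as listed in the paragraph above.
# Assumption: two primers (one primer pair) per reaction.
volumes_ul = {
    "DNA extract": 2.0,
    "PCR mix": 10.0,
    "forward primer": 0.05,
    "reverse primer": 0.05,
    "water": 7.9,
}
PRIMER_STOCK_UM = 100.0  # primer stock concentration in micromolar

total_ul = sum(volumes_ul.values())
primer_final_um = PRIMER_STOCK_UM * volumes_ul["forward primer"] / total_ul


def master_mix(n_reactions, overage=0.1):
    """Scale the shared components (everything except DNA) for n reactions.

    The default 10% overage is a hypothetical allowance, not from the text.
    """
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in volumes_ul.items() if name != "DNA extract"}


print(f"total volume: {total_ul:.1f} ul")
print(f"final primer concentration: {primer_final_um:.2f} uM")
print(master_mix(10))
```

Under these assumptions the components sum to a 20 microliter reaction with each primer at a final concentration of 0.25 micromolar.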
[0214] Three PCR primer pairs are used to identify and characterize the transformation events: an ATS primer pair, an I-SceI primer pair, and an HR primer pair. Selected putative targeting events are further characterized by DNA sequencing using BigDye Terminator chemistry on an ABI 3700 capillary sequencing machine (Applied Biosystems, Foster City, Calif.). Each sequencing sample contains either 0.4-0.5 µg plasmid DNA or about 10 ng of the PCR product, and 6.4 pmole primer. Sequences are analyzed using the Sequencher program.
[0215] Selected events are further analyzed by Southern blots. Leaf tissue (about 1-2 grams fresh weight) is ground into a fine powder with liquid nitrogen. Twenty ml Puregene® Cell Lysis Solution is added to each sample and incubated 1 hour at 64°C, while shaking at 750 rpm. Samples are centrifuged 10 minutes at 4000 rpm. DNA extract supernatants are transferred to new tubes, mixed with 5 ml of phenol/chloroform (1:1) solution, and centrifuged 10 minutes at 4000 rpm. The upper phase is removed, and mixed with an equal volume of isopropanol to precipitate the DNA. The solutions are centrifuged for 10 min at 4000 rpm, followed by removal of the supernatant and the resuspension of pellets in 5 ml of TE buffer, pH 8.0, 0.4 ml of ethidium bromide (10 mg/ml), and 5 g of cesium chloride. The mixture is centrifuged overnight (12-17 hrs) at 390,000 g. The DNA extraction and ethidium bromide removal are performed essentially as described in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, N.Y. The final DNA preparations are dissolved in TE buffer to yield 1.0 µg/µl DNA solutions. Ten µg DNA from each sample is digested overnight with 50 units of selected restriction enzyme(s) and the resultant digestion product(s) are separated on a 0.7% agarose gel run at 35 mV overnight. The TurboBlotter and Blotting Stack (Schleicher & Schuell, Keene, N.H.) are used to transfer DNA onto a nylon membrane as described in the manufacturer's manual. The DNA fragments are linked to the membrane by UV irradiation at 1.2 kJ/m² in a UV Stratalinker (Stratagene, Cedar Creek, Tex.). The blots are pre-hybridized 2-3 hrs in 20 ml of ExpressHyb hybridization solution (Clontech, Palo Alto, Calif.) at 65°C. The random prime labeling system (Amersham Pharmacia Biotech, Piscataway, N.J.) is used with Redivue [³²P]dCTP to produce radioactively labeled DNA fragments according to the supplied protocol. Hybridizations are incubated overnight at 65°C. Blots are washed twice with 1% SSCE/0.1% SDS solution for 15 min at 65°C and then two additional washes are done with 0.1% SSCE/0.1% SDS under the same conditions.
E. Homing Endonuclease Activity in Plant Cells
[0216] It is beneficial to be able to evaluate the relative DNA cleavage activity in plant cells of any native, modified, or custom-designed double-strand break-inducing agent, for example a meganuclease or zinc-finger nuclease. Modifications include changes to meganuclease polynucleotide or amino acid sequences, such as codon optimization, UTRs, amino acid substitutions, or fusions. The meganuclease and target sequence can be provided to the plant cell using any appropriate delivery method. Any meganuclease and target sequence can be tested in any plant cell in this manner.
[0217] Briefly, a sequence encoding the homing endonuclease (EN) with its cognate target site sequence (TS) is integrated into a DNA construct, for example a T-DNA, and delivered to the plant cells. This construct also includes a recombinase, recombinase sites for excision, and viral replication elements. After a specified period of time, or at defined time points in a series, total DNA is extracted from the treated plant cells and used to transform E. coli. Only circular DNAs containing the target sites will be capable of transforming and propagating in E. coli. These DNA molecules are recovered from E. coli and at least a subset of these samples is analyzed for mutations produced by double-strand breaks at the target site. Mutated target sites can be identified by sequencing of PCR products, real-time PCR using fluorescent probes, PCR-based melting curve analysis, or other suitable methods.
[0218] For example, a T-DNA construct containing the following operably linked components is constructed: RB-FRT-cole1 ori-F1 ori-AMP-TS-WDV LIR-REP Exon1-REP Intron-REP Exon2-WDV SIR-FRT-UBI pro-UBI intron1-FLPm Exon1-ST LS1 Intron2-FLPm Exon2-pinII term-35S Enh-MN/ST_LS Intron2-Ubi Intron1-Ubi Pro-LB-SPC-cole1 ori-COS. SPC is a bacterial gene conferring resistance to spectinomycin.
[0219] The coding regions for both the homing endonuclease (EN) and the recombinase (FLPm) contain an intron (e.g., ST-LS Intron 2) to suppress the expression of the proteins in bacterial cells (Agrobacterium or E. coli). This vector can be constructed using FLP-mediated recombination between a WDV replicase expression vector containing the target site sequence and an acceptor T-DNA vector containing FLP and the MN.
[0220] Agrobacterium containing a plasmid with the above components is used to transform BMS cells. In BMS cells, the meganuclease is expressed and can act upon the target site sequence. FLP recombinase is also expressed, excising the TS-containing WDV replicase expression vector, which circularizes and replicates. The acceptor T-DNA vector may also circularize, but cannot replicate. Replication amplifies the quantity of circular TS-containing WDV replicon, which will be the predominant DNA provided to E. coli. Six days after transformation, total DNA is isolated from the BMS cells and used to transform E. coli. E. coli colonies are screened sequentially for resistance to ampicillin and resistance to spectinomycin to identify colonies containing Ti plasmid DNA. Ampicillin-resistant colonies are selected and screened for mutations at the target site. The target sites can be recovered either by extraction of plasmid DNA from the E. coli, or by PCR amplification. PCR amplification reactions allow more efficient analysis of a large number of samples. Mutated target sites can be identified by sequencing of PCR products, real-time PCR using fluorescent probes, PCR-based melting curve analysis, or other suitable methods.
[0221] Homing endonuclease and target site assay results are summarized in Table 3, wherein the I-SceI, I-CreI, Lig3-4, Lig3-4+, and Lig3-4++ homing endonucleases are combined with the corresponding target site (single or double copy).
TABLE-US-00003
TABLE 3. A summary of homing endonuclease and target site assay results.

Target Site      Homing endonuclease   # clones sequenced   # mutations   Mutation rate
I-SceI           None                  34                   0             0%
I-SceI           I-SceI                58                   49            84%
Double I-SceI    I-SceI                63                   57            90%
I-CreI           None                  34                   0             0%
I-CreI           I-CreI                904                  318           35%
Double I-CreI    I-CreI                66                   50            76%
LIG-1            Lig3-4                637                  3             0.5%
LIG-1            Lig3-4+               353                  1             0.3%
LIG-1            Lig3-4++              237                  56            24%
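The mutation rates in Table 3 follow directly from the clone counts: each rate is the number of mutated clones divided by the number of clones sequenced. The short Python sketch below recomputes them from the tabulated counts. The names `TABLE_3` and `mutation_rate` are illustrative choices, not terms used in the assay.

```python
# (target site, homing endonuclease, clones sequenced, mutations), from Table 3.
TABLE_3 = [
    ("I-SceI", "None", 34, 0),
    ("I-SceI", "I-SceI", 58, 49),
    ("Double I-SceI", "I-SceI", 63, 57),
    ("I-CreI", "None", 34, 0),
    ("I-CreI", "I-CreI", 904, 318),
    ("Double I-CreI", "I-CreI", 66, 50),
    ("LIG-1", "Lig3-4", 637, 3),
    ("LIG-1", "Lig3-4+", 353, 1),
    ("LIG-1", "Lig3-4++", 237, 56),
]


def mutation_rate(clones, mutations):
    """Percentage of sequenced clones that carry a mutated target site."""
    if clones <= 0:
        raise ValueError("clones must be positive")
    return 100.0 * mutations / clones


if __name__ == "__main__":
    for site, nuclease, clones, mutations in TABLE_3:
        rate = mutation_rate(clones, mutations)
        print(f"{site:<15}{nuclease:<10}{clones:>5}{mutations:>5}{rate:>7.1f}%")
```

Recomputing the column this way makes it easy to confirm that the reported percentages are rounded values of the underlying ratios.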
Example 7. Targeted Modification of an Endogenous Genomic Locus
[0222] A genomic sequence near the liguleless1 locus on chromosome 2 was characterized for use as an endogenous targeting locus. The targeting construct comprised a UBI::moPAT::pinII expression cassette flanked by 3150 bp and 1255 bp of sequence homologous to that of the endogenous genomic locus, in addition to a UBI PRO::I-CRE SC (LIG3/4)::pinII expression cassette encoding a homing endonuclease specific for the endogenous sequence ATATACCTCACACGTACGCGTA (SEQ ID NO: 56).
[0223] The targeting plasmid was delivered at 100 ng plasmid/bombardment to scutellar cells of PHWWE immature embryos either alone, or with 25 ng each of PHP21875 (UBI::ZmBBM::pinII) and PHP21139 (In2-2 PRO::ZmWUS2::In2-1 TERM). After particle bombardment of 569 embryos with all three plasmids, 74 callus events were selected for resistance to bialaphos, and one of these events produced a positive band after PCR screening across the newly formed hybrid junction, identifying a putative homologous recombination event. All eight plants regenerated from this event produced a positive PCR signal. Long-range PCR, producing longer bands across the newly formed junctions, was then used to further confirm successful introduction of the UBI::moPAT::pinII fragment into the endogenous LIG locus. Subsequent Southern analysis demonstrated that after cutting genomic DNA with either PstI or BamHI for probing with Probe 1, or cutting with SpeI or DraI for probing with Probe 2, the expected band sizes, indicative of perfect integration, were observed. Finally, PCR was used to verify that moPAT had integrated as a single copy, and that the I-CreI (LIG), ODP2 and WUS2 transgenic expression cassettes had not integrated into the genome. To date, two homologous recombination events have been identified and verified when ODP2 and WUS2 were co-delivered with the donor plasmid, after analyzing approximately 310 events to recover the first perfect homologous recombination (HR) and 74 events to recover the second perfect HR. In separate testing without ODP2 and WUS2, approximately 280 transgenic events were analyzed and no perfect homologous recombination events have been recovered.
[0224] Additionally, the developmental genes ZmBBM and ZmWUS2 have also been used to facilitate integration of transgenes at two different endogenous target sites on chromosome 1.
Example 8. Identification of BBM Motifs
[0225] Fifty genes from different plant species were identified through a homology search using the maize BBM amino acid sequence (SEQ ID NO: 2) queried against annotated protein sequences (see FIG. 1). The gene structure and sequences of these BBM homologs were manually inspected and compared with EST/cDNA alignments whenever possible. The fifty polypeptides are set forth in SEQ ID NOs: 2, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, and 61-96. To systematically identify possible motifs within the BBM homologs, protein sequences of these fifty homologs were submitted to the MEME web server, available on the world wide web at meme.nbcr.net/meme4_1/cgi-bin/meme.cgi, with the following specific parameters:
[0226] Number of different motifs: 20
[0227] Minimum motif width: 5
[0228] Maximum motif width: 300
[0229] Minimum number of sites: 5
[0230] Default values were applied for all other parameters. The raw results from MEME were manually compared with multiple sequence alignments generated by ClustalW. Only those candidates showing good consensus with the sequence alignments were considered as motifs for further analysis.
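The analysis above was run through the MEME web server. For readers who prefer a local run, the same four parameters map onto options of the MEME command-line program (`-nmotifs`, `-minw`, `-maxw`, `-minsites`). The Python sketch below only assembles such a command and does not execute it. Using a local installation at all, the `-protein` and `-oc` options, and the file and directory names are assumptions introduced here for illustration, and defaults may differ between the web server and a given local MEME version.

```python
import shlex

# Parameters stated in the text; all other MEME settings stay at their defaults.
MEME_PARAMS = {
    "nmotifs": 20,   # number of different motifs
    "minw": 5,       # minimum motif width
    "maxw": 300,     # maximum motif width
    "minsites": 5,   # minimum number of sites
}


def build_meme_command(fasta_path, out_dir, params=MEME_PARAMS):
    """Build an argument list for a local `meme` run on protein sequences."""
    cmd = ["meme", fasta_path, "-protein", "-oc", out_dir]
    for flag, value in params.items():
        cmd += [f"-{flag}", str(value)]
    return cmd


if __name__ == "__main__":
    # Hypothetical input file and output directory names.
    cmd = build_meme_command("bbm_homologs.fasta", "meme_out")
    print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the parameter set easy to inspect and record alongside the results.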
[0231] The fifty genes were subjected to a phylogenetic analysis and a total of six subgroups were identified, including BBM, PLT3, PLT1/2, AIL6/7, AIL1, and ANT (see FIG. 1). FIG. 3 depicts all 50 sequences with each of the motifs that were identified using the MEME web server. FIG. 2 provides the motif consensus sequences along with alignments of the various polypeptides used by the MEME web server to generate the consensus motif. With a few exceptions, motifs 1-6, as defined immediately hereinbelow, are present in all 50 genes. This includes motifs 1-3 (SEQ ID NOs 3-5, respectively), which represent the two AP2 domains and a sequence linking the two domains (linker sequence). Motif 4, with the consensus sequence of PK[L/V][E/A][D/N]FLG (SEQ ID NO: 6) is amino-terminal to the two AP2 domains. Motif 5 (SEQ ID NO: 7) flanks the two AP2 domains on the carboxy terminal end of the polypeptides. Near the amino terminus of the polypeptides is motif 6, with the consensus sequence of NWL[G/S]FSLSP (SEQ ID NO: 8).
[0232] There were motifs that were relatively specific for the BBM subgroup of the homologous sequences (referred to herein as BBM polypeptides). An alignment of the BBM polypeptides can be found in FIG. 4. Motif 7 is found in all BBM polypeptides at the amino terminus of the polypeptide and has the consensus sequence of [G/E]LSMIK[T/N]WLR (SEQ ID NO: 9). Another motif that is present in all of the BBM polypeptides except for the polypeptides from Brassica and from Arabidopsis is Motif 10. Motif 10 has the consensus sequence of WCK[Q/P]EQD (SEQ ID NO: 12) and is located downstream of the AP2 domains.
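The bracketed consensus notation used for motifs 4, 6, 7, and 10 can be read as a simple pattern, with each `[X/Y]` position accepting either residue. The Python sketch below converts that notation into regular expressions and scans short one-letter fragments transcribed from the sequence listing (maize SEQ ID NO: 2 and Arabidopsis SEQ ID NO: 17). It is a rough illustration only: the helper names are hypothetical, and a literal consensus match is a stricter test than MEME's position-specific scoring, so it is not the method used to assign motifs in this example.

```python
import re

# Consensus motifs as written in the text, with "/" separating alternatives.
CONSENSUS = {
    "motif 4": "PK[L/V][E/A][D/N]FLG",
    "motif 6": "NWL[G/S]FSLSP",
    "motif 7": "[G/E]LSMIK[T/N]WLR",
    "motif 10": "WCK[Q/P]EQD",
}

# Fragments transcribed from the sequence listing (one-letter code).
FRAGMENTS = {
    "maize BBM 1-80": (
        "MATVNNWLAFSLSPQELPPSQTTDSTLISAATADHVSGDVCFNIPQDW"
        "SMRGSELSALVAEPKLEDFLGGISFSEQHHKS"
    ),
    "maize BBM 145-176": "AGAASANGGGIGLSMIKNWLRSQPAPMQPRAA",
    "maize BBM 529-544": "GGGWCKQEQDHAVIAA",
    "Arabidopsis BBM 1-16": "MNSMNNWLGFSLSPHD",
}


def consensus_to_regex(consensus):
    """Turn '[A/B]' alternatives into the regex character class '[AB]'."""
    return re.compile(consensus.replace("/", ""))


def motifs_present(protein, consensus=CONSENSUS):
    """Return the names of all motifs with a literal consensus match."""
    return [
        name
        for name, pattern in consensus.items()
        if consensus_to_regex(pattern).search(protein)
    ]


if __name__ == "__main__":
    for label, fragment in FRAGMENTS.items():
        print(label, "->", motifs_present(fragment))
```

Note that the maize amino terminus reads NWLAFSLSP, with Ala at the variable position, so a literal pattern for motif 6 does not match it even though the surrounding residues agree with the consensus.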
[0233] There are three more motifs specific to the BBM group of polypeptides, including Motif 15 (SEQ ID NO: 14) which appears only in BBM orthologs, but not in the monocot BBM2 polypeptides; a monocot specific motif (Motif 19; SEQ ID NO: 15); and a general BBM specific motif (Motif 14; SEQ ID NO: 13), which appears in BBM homologs except for the Brassica and legume branch.
[0234] FIG. 5 provides a summary of the motif structure of the BBM homologs. The amino terminal motifs 4 and 6 and the AP2 flanking motif 5 distinguish the BBM homologous sequences from other two AP2 domain-containing homologs, such as WRI, AP2, and RAP2.7. Therefore, motifs 1-6 can be considered as core BBM/PLT family motifs. Many subgroups of the BBM/PLT family (BBM, PLT1/2, AIL1, and ANT) also have a carboxy-terminal motif (motif 8; SEQ ID NO: 10) and the third amino terminal motif (motif 9; SEQ ID NO: 11).
[0235] The BBM polypeptides all have one additional motif (motif 7; SEQ ID NO: 9) in the amino terminus, and all but the Brassica and Arabidopsis BBM homologs have an AP2 downstream motif (motif 10; SEQ ID NO: 12). Some other BBM/PLT family members (e.g., monocot AIL1) may have a motif similar to motif 7, but none of them also have motif 9. Motif 10 appears only in BBM polypeptides. In summary, the MEME-predicted motifs 1-10 can be regarded as BBM polypeptide motifs. All monocot BBM polypeptides (corn, sorghum, and rice) also have motifs 14, 15, and 19 (see FIG. 3). Some dicot BBM polypeptides and the second monocot BBM group (BBM2) have one or two of these motifs, but none have all three motifs.
[0236] All publications and patent applications mentioned in the specification are indicative of the level of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
[0237] Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
Sequence CWU
1
1
10512130DNAZea maysCDS(1)...(2130) 1atg gcc act gtg aac aac tgg ctc gct
ttc tcc ctc tcc ccg cag gag 48Met Ala Thr Val Asn Asn Trp Leu Ala Phe
Ser Leu Ser Pro Gln Glu 1 5 10
15ctg ccg ccc tcc cag acg acg gac tcc acg ctc atc tcg gcc gcc acc
96Leu Pro Pro Ser Gln Thr Thr Asp Ser Thr Leu Ile Ser Ala Ala Thr
20 25 30gcc gac cat gtc tcc ggc gat
gtc tgc ttc aac atc ccc caa gat tgg 144Ala Asp His Val Ser Gly Asp Val
Cys Phe Asn Ile Pro Gln Asp Trp 35 40
45agc atg agg gga tca gag ctt tcg gcg ctc gtc gcg gag ccg aag ctg
192Ser Met Arg Gly Ser Glu Leu Ser Ala Leu Val Ala Glu Pro Lys Leu 50
55 60gag gac ttc ctc ggc ggc atc tcc
ttc tcc gag cag cat cac aag tcc 240Glu Asp Phe Leu Gly Gly Ile Ser Phe
Ser Glu Gln His His Lys Ser 65 70 75
80aac tgc aac ttg ata ccc agc act agc agc aca gtt tgc tac
gcg agc 288Asn Cys Asn Leu Ile Pro Ser Thr Ser Ser Thr Val Cys Tyr Ala
Ser 85 90 95tca gct gct
agc acc ggc tac cat cac cag ctg tac cag ccc acc agc 336Ser Ala Ala Ser
Thr Gly Tyr His His Gln Leu Tyr Gln Pro Thr Ser 100
105 110tcc gcg ctc cac ttc gcg gac tcc gtc atg gtg
gcc tcc tcg gcc ggt 384Ser Ala Leu His Phe Ala Asp Ser Val Met Val Ala
Ser Ser Ala Gly 115 120 125gtc cac
gac ggc ggt tcc atg ctc agc gcg gcc gcc gct aac ggt gtc 432Val His Asp
Gly Gly Ser Met Leu Ser Ala Ala Ala Ala Asn Gly Val 130
135 140gct ggc gct gcc agt gcc aac ggc ggc ggc atc ggg
ctg tcc atg atc 480Ala Gly Ala Ala Ser Ala Asn Gly Gly Gly Ile Gly Leu
Ser Met Ile145 150 155
160aag aac tgg ctg cgg agc caa ccg gcg ccc atg cag ccg agg gcg gcg
528Lys Asn Trp Leu Arg Ser Gln Pro Ala Pro Met Gln Pro Arg Ala Ala
165 170 175gcg gct gag ggc gcg
cag ggg ctc tct ttg tcc atg aac atg gcg ggg 576Ala Ala Glu Gly Ala Gln
Gly Leu Ser Leu Ser Met Asn Met Ala Gly 180
185 190acg acc caa ggc gct gct ggc atg cca ctt ctc gct
gga gag cgc gca 624Thr Thr Gln Gly Ala Ala Gly Met Pro Leu Leu Ala Gly
Glu Arg Ala 195 200 205cgg gcg ccc
gag agt gta tcg acg tca gca cag ggt ggt gcc gtc gtc 672Arg Ala Pro Glu
Ser Val Ser Thr Ser Ala Gln Gly Gly Ala Val Val 210
215 220gtc acg gcg ccg aag gag gat agc ggt ggc agc ggt
gtt gcc ggt gct 720Val Thr Ala Pro Lys Glu Asp Ser Gly Gly Ser Gly Val
Ala Gly Ala225 230 235
240cta gta gcc gtg agc acg gac acg ggt ggc agc ggc ggc gcg tcg gct
768Leu Val Ala Val Ser Thr Asp Thr Gly Gly Ser Gly Gly Ala Ser Ala
245 250 255gac aac acg gca agg
aag acg gtg gac acg ttc ggg cag cgc acg tcg 816Asp Asn Thr Ala Arg Lys
Thr Val Asp Thr Phe Gly Gln Arg Thr Ser 260
265 270att tac cgt ggc gtg aca agg cat aga tgg act ggg
aga tat gag gca 864Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg
Tyr Glu Ala 275 280 285cat ctt tgg
gat aac agt tgc aga agg gaa gga caa act cgt aag ggt 912His Leu Trp Asp
Asn Ser Cys Arg Arg Glu Gly Gln Thr Arg Lys Gly 290
295 300cgt caa gtc tat tta ggt ggc tat gat aaa gag gag
aaa gct gct agg 960Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu Glu Lys
Ala Ala Arg305 310 315
320gct tat gat ctt gct gct ctg aag tac tgg ggt gcc aca aca aca aca
1008Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly Ala Thr Thr Thr Thr
325 330 335aat ttt cca gtg agt
aac tac gaa aag gag ctc gag gac atg aag cac 1056Asn Phe Pro Val Ser Asn
Tyr Glu Lys Glu Leu Glu Asp Met Lys His 340
345 350atg aca agg cag gag ttt gta gcg tct ctg aga agg
aag agc agt ggt 1104Met Thr Arg Gln Glu Phe Val Ala Ser Leu Arg Arg Lys
Ser Ser Gly 355 360 365ttc tcc aga
ggt gca tcc att tac agg gga gtg act agg cat cac caa 1152Phe Ser Arg Gly
Ala Ser Ile Tyr Arg Gly Val Thr Arg His His Gln 370
375 380cat gga aga tgg caa gca cgg att gga cga gtt gca
ggg aac aag gat 1200His Gly Arg Trp Gln Ala Arg Ile Gly Arg Val Ala Gly
Asn Lys Asp385 390 395
400ctt tac ttg ggc acc ttc agc acc cag gag gag gca gcg gag gcg tac
1248Leu Tyr Leu Gly Thr Phe Ser Thr Gln Glu Glu Ala Ala Glu Ala Tyr
405 410 415gac atc gcg gcg atc
aag ttc cgc ggc ctc aac gcc gtc acc aac ttc 1296Asp Ile Ala Ala Ile Lys
Phe Arg Gly Leu Asn Ala Val Thr Asn Phe 420
425 430gac atg agc cgc tac gac gtg aag agc atc ctg gac
agc agc gcc ctc 1344Asp Met Ser Arg Tyr Asp Val Lys Ser Ile Leu Asp Ser
Ser Ala Leu 435 440 445ccc atc ggc
agc gcc gcc aag cgt ctc aag gag gcc gag gcc gca gcg 1392Pro Ile Gly Ser
Ala Ala Lys Arg Leu Lys Glu Ala Glu Ala Ala Ala 450
455 460tcc gcg cag cac cac cac gcc ggc gtg gtg agc tac
gac gtc ggc cgc 1440Ser Ala Gln His His His Ala Gly Val Val Ser Tyr Asp
Val Gly Arg465 470 475
480atc gcc tcg cag ctc ggc gac ggc gga gcc cta gcg gcg gcg tac ggc
1488Ile Ala Ser Gln Leu Gly Asp Gly Gly Ala Leu Ala Ala Ala Tyr Gly
485 490 495gcg cac tac cac ggc
gcc gcc tgg ccg acc atc gcg ttc cag ccg ggc 1536Ala His Tyr His Gly Ala
Ala Trp Pro Thr Ile Ala Phe Gln Pro Gly 500
505 510gcc gcc acc aca ggc ctg tac cac ccg tac gcg cag
cag cca atg cgc 1584Ala Ala Thr Thr Gly Leu Tyr His Pro Tyr Ala Gln Gln
Pro Met Arg 515 520 525ggc ggc ggg
tgg tgc aag cag gag cag gac cac gcg gtg atc gcg gcc 1632Gly Gly Gly Trp
Cys Lys Gln Glu Gln Asp His Ala Val Ile Ala Ala 530
535 540gcg cac agc ctg cag gac ctc cac cac ttg aac ctg
ggc gcg gcc ggc 1680Ala His Ser Leu Gln Asp Leu His His Leu Asn Leu Gly
Ala Ala Gly545 550 555
560gcg cac gac ttt ttc tcg gca ggg cag cag gcc gcc gcc gca gct gcg
1728Ala His Asp Phe Phe Ser Ala Gly Gln Gln Ala Ala Ala Ala Ala Ala
565 570 575atg cac ggc ctg gct
agc atc gac agt gcg tcg ctc gag cac agc acc 1776Met His Gly Leu Ala Ser
Ile Asp Ser Ala Ser Leu Glu His Ser Thr 580
585 590ggc tcc aac tcc gtc gtc tac aac ggc ggg gtc ggc
gat agc aac ggc 1824Gly Ser Asn Ser Val Val Tyr Asn Gly Gly Val Gly Asp
Ser Asn Gly 595 600 605gcc agc gcc
gtt ggc agc ggc ggt ggc tac atg atg ccg atg agc gct 1872Ala Ser Ala Val
Gly Ser Gly Gly Gly Tyr Met Met Pro Met Ser Ala 610
615 620gcc gga gca acc act aca tcg gca atg gtg agc cac
gag cag atg cat 1920Ala Gly Ala Thr Thr Thr Ser Ala Met Val Ser His Glu
Gln Met His625 630 635
640gca cgg gcc tac gac gaa gcc aag cag gct gct cag atg ggg tac gag
1968Ala Arg Ala Tyr Asp Glu Ala Lys Gln Ala Ala Gln Met Gly Tyr Glu
645 650 655agc tac ctg gtg aac
gcg gag aac aat ggt ggc gga agg atg tct gca 2016Ser Tyr Leu Val Asn Ala
Glu Asn Asn Gly Gly Gly Arg Met Ser Ala 660
665 670tgg ggg acc gtc gtc tct gca gcc gcg gcg gca gca
gca agc agc aac 2064Trp Gly Thr Val Val Ser Ala Ala Ala Ala Ala Ala Ala
Ser Ser Asn 675 680 685gac aac att
gcc gcc gac gtc ggc cat ggc ggc gcg cag ctc ttc agt 2112Asp Asn Ile Ala
Ala Asp Val Gly His Gly Gly Ala Gln Leu Phe Ser 690
695 700gtc tgg aac gac act taa
2130Val Trp Asn Asp Thr7052709PRTZea mays 2Met Ala Thr Val
Asn Asn Trp Leu Ala Phe Ser Leu Ser Pro Gln Glu1 5
10 15Leu Pro Pro Ser Gln Thr Thr Asp Ser Thr
Leu Ile Ser Ala Ala Thr 20 25
30Ala Asp His Val Ser Gly Asp Val Cys Phe Asn Ile Pro Gln Asp Trp
35 40 45Ser Met Arg Gly Ser Glu Leu Ser
Ala Leu Val Ala Glu Pro Lys Leu 50 55
60Glu Asp Phe Leu Gly Gly Ile Ser Phe Ser Glu Gln His His Lys Ser65
70 75 80Asn Cys Asn Leu Ile
Pro Ser Thr Ser Ser Thr Val Cys Tyr Ala Ser 85
90 95Ser Ala Ala Ser Thr Gly Tyr His His Gln Leu
Tyr Gln Pro Thr Ser 100 105
110Ser Ala Leu His Phe Ala Asp Ser Val Met Val Ala Ser Ser Ala Gly
115 120 125Val His Asp Gly Gly Ser Met
Leu Ser Ala Ala Ala Ala Asn Gly Val 130 135
140Ala Gly Ala Ala Ser Ala Asn Gly Gly Gly Ile Gly Leu Ser Met
Ile145 150 155 160Lys Asn
Trp Leu Arg Ser Gln Pro Ala Pro Met Gln Pro Arg Ala Ala
165 170 175Ala Ala Glu Gly Ala Gln Gly
Leu Ser Leu Ser Met Asn Met Ala Gly 180 185
190Thr Thr Gln Gly Ala Ala Gly Met Pro Leu Leu Ala Gly Glu
Arg Ala 195 200 205Arg Ala Pro Glu
Ser Val Ser Thr Ser Ala Gln Gly Gly Ala Val Val 210
215 220Val Thr Ala Pro Lys Glu Asp Ser Gly Gly Ser Gly
Val Ala Gly Ala225 230 235
240Leu Val Ala Val Ser Thr Asp Thr Gly Gly Ser Gly Gly Ala Ser Ala
245 250 255Asp Asn Thr Ala Arg
Lys Thr Val Asp Thr Phe Gly Gln Arg Thr Ser 260
265 270Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly
Arg Tyr Glu Ala 275 280 285His Leu
Trp Asp Asn Ser Cys Arg Arg Glu Gly Gln Thr Arg Lys Gly 290
295 300Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu
Glu Lys Ala Ala Arg305 310 315
320Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly Ala Thr Thr Thr Thr
325 330 335Asn Phe Pro Val
Ser Asn Tyr Glu Lys Glu Leu Glu Asp Met Lys His 340
345 350Met Thr Arg Gln Glu Phe Val Ala Ser Leu Arg
Arg Lys Ser Ser Gly 355 360 365Phe
Ser Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr Arg His His Gln 370
375 380His Gly Arg Trp Gln Ala Arg Ile Gly Arg
Val Ala Gly Asn Lys Asp385 390 395
400Leu Tyr Leu Gly Thr Phe Ser Thr Gln Glu Glu Ala Ala Glu Ala
Tyr 405 410 415Asp Ile Ala
Ala Ile Lys Phe Arg Gly Leu Asn Ala Val Thr Asn Phe 420
425 430Asp Met Ser Arg Tyr Asp Val Lys Ser Ile
Leu Asp Ser Ser Ala Leu 435 440
445Pro Ile Gly Ser Ala Ala Lys Arg Leu Lys Glu Ala Glu Ala Ala Ala 450
455 460Ser Ala Gln His His His Ala Gly
Val Val Ser Tyr Asp Val Gly Arg465 470
475 480Ile Ala Ser Gln Leu Gly Asp Gly Gly Ala Leu Ala
Ala Ala Tyr Gly 485 490
495Ala His Tyr His Gly Ala Ala Trp Pro Thr Ile Ala Phe Gln Pro Gly
500 505 510Ala Ala Thr Thr Gly Leu
Tyr His Pro Tyr Ala Gln Gln Pro Met Arg 515 520
525Gly Gly Gly Trp Cys Lys Gln Glu Gln Asp His Ala Val Ile
Ala Ala 530 535 540Ala His Ser Leu Gln
Asp Leu His His Leu Asn Leu Gly Ala Ala Gly545 550
555 560Ala His Asp Phe Phe Ser Ala Gly Gln Gln
Ala Ala Ala Ala Ala Ala 565 570
575Met His Gly Leu Ala Ser Ile Asp Ser Ala Ser Leu Glu His Ser Thr
580 585 590Gly Ser Asn Ser Val
Val Tyr Asn Gly Gly Val Gly Asp Ser Asn Gly 595
600 605Ala Ser Ala Val Gly Ser Gly Gly Gly Tyr Met Met
Pro Met Ser Ala 610 615 620Ala Gly Ala
Thr Thr Thr Ser Ala Met Val Ser His Glu Gln Met His625
630 635 640Ala Arg Ala Tyr Asp Glu Ala
Lys Gln Ala Ala Gln Met Gly Tyr Glu 645
650 655Ser Tyr Leu Val Asn Ala Glu Asn Asn Gly Gly Gly
Arg Met Ser Ala 660 665 670Trp
Gly Thr Val Val Ser Ala Ala Ala Ala Ala Ala Ala Ser Ser Asn 675
680 685Asp Asn Ile Ala Ala Asp Val Gly His
Gly Gly Ala Gln Leu Phe Ser 690 695
700Val Trp Asn Asp Thr705331PRTArtificial SequenceConsensus sequence
motif 1VARIANT10Xaa = His or AsnVARIANT16Xaa = Phe or TyrVARIANT17Xaa =
Val or IleVARIANT19Xaa = Ser or His 3Tyr Glu Lys Glu Leu Glu Glu Met Lys
Xaa Met Thr Arg Gln Glu Xaa1 5 10
15Xaa Ala Xaa Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala
20 25 30463PRTArtificial
SequenceConsensus sequence motif 2VARIANT2Xaa = Ile or MetVARIANT36Xaa =
Gln or GluVARIANT45Xaa = Ile or ValVARIANT60Xaa = Asp or GluVARIANT61Xaa
= Met or IleVARIANT(62)...(62)Xaa = Ser or Asn 4Ser Xaa Tyr Arg Gly Val
Thr Arg His His Gln His Gly Arg Trp Gln1 5
10 15Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu
Tyr Leu Gly Thr 20 25 30Phe
Ser Thr Xaa Glu Glu Ala Ala Glu Ala Tyr Asp Xaa Ala Ala Ile 35
40 45Lys Phe Arg Gly Leu Asn Ala Val Thr
Asn Phe Xaa Xaa Xaa Arg 50 55
60568PRTArtificial SequenceConsensus sequence motif 3VARIANT2Xaa = Ile or
GlnVARIANT26Xaa = Arg or LysVARIANT30, 59Xaa = Ser or ThrVARIANT33Xaa =
Val or GlyVARIANT34Xaa = Tyr or ArgVARIANT(35)...(35)Xaa = Leu or
GlnVARIANT(42)...(42)Xaa = Glu or AspVARIANT(58)...(58)Xaa = Pro or
ThrVARIANT(61)...(61)Xaa = Thr or HisVARIANT(62)...(62)Xaa = Thr or
IleVARIANT(66)...(66)Xaa = Ile, Val, or Leu 5Ser Xaa Tyr Arg Gly Val Thr
Arg His Arg Trp Thr Gly Arg Tyr Glu1 5 10
15Ala His Leu Trp Asp Asn Ser Cys Arg Xaa Glu Gly Gln
Xaa Arg Lys 20 25 30Xaa Xaa
Xaa Gly Gly Tyr Asp Lys Glu Xaa Lys Ala Ala Arg Ala Tyr 35
40 45Asp Leu Ala Ala Leu Lys Tyr Trp Gly Xaa
Xaa Thr Xaa Xaa Asn Phe 50 55 60Pro
Xaa Ser Asn6568PRTArtificial SequenceConsensus sequence motif
4VARIANT3Xaa = Leu or ValVARIANT4Xaa = Glu or AlaVARIANT5Xaa = Asp or Asn
6Pro Lys Xaa Xaa Xaa Phe Leu Gly1 5713PRTArtificial
SequenceConsensus sequence motif 5VARIANT6Xaa = Ile or ValVARIANT9Xaa =
Ala or LeuVARIANT11, 12Xaa = Lys or ArgVARIANT13Xaa = Leu or Arg 7Ser Ser
Thr Leu Pro Xaa Gly Gly Xaa Ala Xaa Xaa Xaa1 5
1089PRTArtificial SequenceConsensus sequence motif 6VARIANT4Xaa =
Gly or Ser 8Asn Trp Leu Xaa Phe Ser Leu Ser Pro1
5910PRTArtificial SequenceConsensus sequence motif 7VARIANT1Xaa = Gly or
GluVARIANT7Xaa = Thr or Asn 9Xaa Leu Ser Met Ile Lys Xaa Trp Leu Arg1
5 10108PRTArtificial SequenceConsensus
sequence motif 8VARIANT2, 4, 5Xaa = Any Amino Acid 10Pro Xaa Phe Xaa Xaa
Trp Asn Asp1 5115PRTArtificial SequenceConsensus sequence
motif 9VARIANT2Xaa = Ser, Thr, or Ala 11Leu Xaa Leu Ser Met1
5127PRTArtificial SequenceConsensus sequence motif 10VARIANT4Xaa = Gln
or Pro 12Trp Cys Lys Xaa Glu Gln Asp1 5137PRTArtificial
SequenceConsensus sequence motif 14 13Trp Pro Thr Ile Ala Phe Gln1
51411PRTArtificial SequenceConsensus sequence motif 15VARIANT2Xaa
= Ser or Thr 14Ser Xaa Gly Ser Asn Ser Val Val Tyr Asn Gly1
5 10157PRTArtificial SequenceConsensus sequence motif
19VARIANT4Xaa = Ser or Asn 15Gln Asp Trp Xaa Met Arg Gly1
5161755DNAArabidopsis thalianaCDS(1)...(1755) 16atg aac tcg atg aat aac
tgg tta ggc ttc tct ctc tct cct cat gat 48Met Asn Ser Met Asn Asn Trp
Leu Gly Phe Ser Leu Ser Pro His Asp 1 5
10 15caa aat cat cac cgt acg gat gtt gac tcc tcc acc acc
aga acc gcc 96Gln Asn His His Arg Thr Asp Val Asp Ser Ser Thr Thr Arg
Thr Ala 20 25 30gta gat gtt
gcc gga ggg tac tgt ttt gat ctg gcc gct ccc tcc gat 144Val Asp Val Ala
Gly Gly Tyr Cys Phe Asp Leu Ala Ala Pro Ser Asp 35
40 45gaa tct tct gcc gtt caa aca tct ttt ctt tct cct
ttc ggt gtc acc 192Glu Ser Ser Ala Val Gln Thr Ser Phe Leu Ser Pro Phe
Gly Val Thr 50 55 60ctc gaa gct ttc
acc aga gac aat aat agt cac tcc cga gat tgg gac 240Leu Glu Ala Phe Thr
Arg Asp Asn Asn Ser His Ser Arg Asp Trp Asp 65 70
75 80atc aat ggt ggt gca tgc aat aca tta acc
aat aac gaa caa aat gga 288Ile Asn Gly Gly Ala Cys Asn Thr Leu Thr Asn
Asn Glu Gln Asn Gly 85 90
95cca aag ctt gag aat ttc ctc ggc cgc acc acc acg att tac aat acc
336Pro Lys Leu Glu Asn Phe Leu Gly Arg Thr Thr Thr Ile Tyr Asn Thr
100 105 110aac gag acc gtt gta gat
gga aat ggc gat tgt gga gga gga gac ggt 384Asn Glu Thr Val Val Asp Gly
Asn Gly Asp Cys Gly Gly Gly Asp Gly 115 120
125ggt ggt ggc ggc tca cta ggc ctt tcg atg ata aaa aca tgg ctg
agt 432Gly Gly Gly Gly Ser Leu Gly Leu Ser Met Ile Lys Thr Trp Leu Ser
130 135 140aat cat tcg gtt gct aat gct
aat cat caa gac aat ggt aac ggt gca 480Asn His Ser Val Ala Asn Ala Asn
His Gln Asp Asn Gly Asn Gly Ala145 150
155 160cga ggc ttg tcc ctc tct atg aat tca tct act agt
gat agc aac aac 528Arg Gly Leu Ser Leu Ser Met Asn Ser Ser Thr Ser Asp
Ser Asn Asn 165 170 175tac
aac aac aat gat gat gtc gtc caa gag aag act att gtt gat gtc 576Tyr Asn
Asn Asn Asp Asp Val Val Gln Glu Lys Thr Ile Val Asp Val 180
185 190gta gaa act aca ccg aag aaa act att
gag agt ttt gga caa agg acg 624Val Glu Thr Thr Pro Lys Lys Thr Ile Glu
Ser Phe Gly Gln Arg Thr 195 200
205tct ata tac cgc ggt gtt aca agg cat cgg tgg aca ggt aga tac gag
672Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu 210
215 220gca cat tta tgg gac aat agt tgc
aaa aga gaa ggc cag act cgc aaa 720Ala His Leu Trp Asp Asn Ser Cys Lys
Arg Glu Gly Gln Thr Arg Lys225 230 235
240gga aga caa gtt tat ctg gga ggt tat gac aaa gaa gaa aaa
gca gct 768Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu Glu Lys Ala
Ala 245 250 255agg gct tac
gat tta gcc gca cta aag tat tgg gga ccc acc act act 816Arg Ala Tyr Asp
Leu Ala Ala Leu Lys Tyr Trp Gly Pro Thr Thr Thr 260
265 270act aac ttc ccc ttg agt gaa tat gag aaa gag
gta gaa gag atg aag 864Thr Asn Phe Pro Leu Ser Glu Tyr Glu Lys Glu Val
Glu Glu Met Lys 275 280 285cac atg
acg agg caa gag tat gtt gcc tct ctg cgc agg aaa agt agt 912His Met Thr
Arg Gln Glu Tyr Val Ala Ser Leu Arg Arg Lys Ser Ser 290
295 300ggt ttc tct cgt ggt gca tcg att tat cga gga gta
aca agg cat cac 960Gly Phe Ser Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr
Arg His His305 310 315
320caa cat gga agg tgg caa gct agg atc gga aga gtc gcc ggt aac aaa
1008Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys
325 330 335gac ctc tac ttg gga
act ttc ggc aca cag gaa gag gct gct gag gct 1056Asp Leu Tyr Leu Gly Thr
Phe Gly Thr Gln Glu Glu Ala Ala Glu Ala 340
345 350tat gac att gca gcc att aaa ttc aga gga tta agc
gca gtg act aac 1104Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly Leu Ser Ala
Val Thr Asn 355 360 365ttc gac atg
aac aga tac aat gtt aaa gca atc ctc gag agc ccg agt 1152Phe Asp Met Asn
Arg Tyr Asn Val Lys Ala Ile Leu Glu Ser Pro Ser 370
375 380cta cct att ggt agt tct gcg aaa cgt ctc aag gac
gtt aac aat ccg 1200Leu Pro Ile Gly Ser Ser Ala Lys Arg Leu Lys Asp Val
Asn Asn Pro385 390 395
400gtt cca gct atg atg att agt aat aac gtt tca gag agt gca aat aat
1248Val Pro Ala Met Met Ile Ser Asn Asn Val Ser Glu Ser Ala Asn Asn
405 410 415gtt agc ggt tgg caa
aac act gcg ttt cag cat cat cag gga atg gat 1296Val Ser Gly Trp Gln Asn
Thr Ala Phe Gln His His Gln Gly Met Asp 420
425 430ttg agc tta ttg cag caa cag cag gag agg tac gtt
ggt tat tac aat 1344Leu Ser Leu Leu Gln Gln Gln Gln Glu Arg Tyr Val Gly
Tyr Tyr Asn 435 440 445gga gga aac
ttg tct acc gag agt act agg gtt tgt ttc aaa caa gag 1392Gly Gly Asn Leu
Ser Thr Glu Ser Thr Arg Val Cys Phe Lys Gln Glu 450
455 460gag gaa caa caa cac ttc ttg aga aac tcg ccg agt
cac atg act aat 1440Glu Glu Gln Gln His Phe Leu Arg Asn Ser Pro Ser His
Met Thr Asn465 470 475
480gtt gat cat cat agc tcg acc tct gat gat tct gtt acc gtt tgt gga
1488Val Asp His His Ser Ser Thr Ser Asp Asp Ser Val Thr Val Cys Gly
485 490 495aat gtt gtt agt tat
ggt ggt tat caa gga ttc gca atc cct gtt gga 1536Asn Val Val Ser Tyr Gly
Gly Tyr Gln Gly Phe Ala Ile Pro Val Gly 500
505 510aca tcg gtt aat tac gat ccc ttt act gct gct gag
att gct tac aac 1584Thr Ser Val Asn Tyr Asp Pro Phe Thr Ala Ala Glu Ile
Ala Tyr Asn 515 520 525gca aga aat
cat tat tac tat gct cag cat cag caa caa cag cag att 1632Ala Arg Asn His
Tyr Tyr Tyr Ala Gln His Gln Gln Gln Gln Gln Ile 530
535 540cag cag tcg ccg gga gga gat ttt ccg gtg gcg att
tcg aat aac cat 1680Gln Gln Ser Pro Gly Gly Asp Phe Pro Val Ala Ile Ser
Asn Asn His545 550 555
560agc tct aac atg tac ttt cac ggg gaa ggt ggt gga gaa ggg gct cca
1728Ser Ser Asn Met Tyr Phe His Gly Glu Gly Gly Gly Glu Gly Ala Pro
565 570 575acg ttt tca gtt tgg
aac gac act tag 1755Thr Phe Ser Val Trp Asn
Asp Thr 580
SEQ ID NO: 17; Length: 584; Type: PRT; Organism: Arabidopsis thaliana
Sequence 17: Met Asn Ser Met Asn
Asn Trp Leu Gly Phe Ser Leu Ser Pro His Asp1 5
10 15Gln Asn His His Arg Thr Asp Val Asp Ser Ser
Thr Thr Arg Thr Ala 20 25
30Val Asp Val Ala Gly Gly Tyr Cys Phe Asp Leu Ala Ala Pro Ser Asp
35 40 45Glu Ser Ser Ala Val Gln Thr Ser
Phe Leu Ser Pro Phe Gly Val Thr 50 55
60Leu Glu Ala Phe Thr Arg Asp Asn Asn Ser His Ser Arg Asp Trp Asp65
70 75 80Ile Asn Gly Gly Ala
Cys Asn Thr Leu Thr Asn Asn Glu Gln Asn Gly 85
90 95Pro Lys Leu Glu Asn Phe Leu Gly Arg Thr Thr
Thr Ile Tyr Asn Thr 100 105
110Asn Glu Thr Val Val Asp Gly Asn Gly Asp Cys Gly Gly Gly Asp Gly
115 120 125Gly Gly Gly Gly Ser Leu Gly
Leu Ser Met Ile Lys Thr Trp Leu Ser 130 135
140Asn His Ser Val Ala Asn Ala Asn His Gln Asp Asn Gly Asn Gly
Ala145 150 155 160Arg Gly
Leu Ser Leu Ser Met Asn Ser Ser Thr Ser Asp Ser Asn Asn
165 170 175Tyr Asn Asn Asn Asp Asp Val
Val Gln Glu Lys Thr Ile Val Asp Val 180 185
190Val Glu Thr Thr Pro Lys Lys Thr Ile Glu Ser Phe Gly Gln
Arg Thr 195 200 205Ser Ile Tyr Arg
Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu 210
215 220Ala His Leu Trp Asp Asn Ser Cys Lys Arg Glu Gly
Gln Thr Arg Lys225 230 235
240Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala
245 250 255Arg Ala Tyr Asp Leu
Ala Ala Leu Lys Tyr Trp Gly Pro Thr Thr Thr 260
265 270Thr Asn Phe Pro Leu Ser Glu Tyr Glu Lys Glu Val
Glu Glu Met Lys 275 280 285His Met
Thr Arg Gln Glu Tyr Val Ala Ser Leu Arg Arg Lys Ser Ser 290
295 300Gly Phe Ser Arg Gly Ala Ser Ile Tyr Arg Gly
Val Thr Arg His His305 310 315
320Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys
325 330 335Asp Leu Tyr Leu
Gly Thr Phe Gly Thr Gln Glu Glu Ala Ala Glu Ala 340
345 350Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly Leu
Ser Ala Val Thr Asn 355 360 365Phe
Asp Met Asn Arg Tyr Asn Val Lys Ala Ile Leu Glu Ser Pro Ser 370
375 380Leu Pro Ile Gly Ser Ser Ala Lys Arg Leu
Lys Asp Val Asn Asn Pro385 390 395
400Val Pro Ala Met Met Ile Ser Asn Asn Val Ser Glu Ser Ala Asn
Asn 405 410 415Val Ser Gly
Trp Gln Asn Thr Ala Phe Gln His His Gln Gly Met Asp 420
425 430Leu Ser Leu Leu Gln Gln Gln Gln Glu Arg
Tyr Val Gly Tyr Tyr Asn 435 440
445Gly Gly Asn Leu Ser Thr Glu Ser Thr Arg Val Cys Phe Lys Gln Glu 450
455 460Glu Glu Gln Gln His Phe Leu Arg
Asn Ser Pro Ser His Met Thr Asn465 470
475 480Val Asp His His Ser Ser Thr Ser Asp Asp Ser Val
Thr Val Cys Gly 485 490
495Asn Val Val Ser Tyr Gly Gly Tyr Gln Gly Phe Ala Ile Pro Val Gly
500 505 510Thr Ser Val Asn Tyr Asp
Pro Phe Thr Ala Ala Glu Ile Ala Tyr Asn 515 520
525Ala Arg Asn His Tyr Tyr Tyr Ala Gln His Gln Gln Gln Gln
Gln Ile 530 535 540Gln Gln Ser Pro Gly
Gly Asp Phe Pro Val Ala Ile Ser Asn Asn His545 550
555 560Ser Ser Asn Met Tyr Phe His Gly Glu Gly
Gly Gly Glu Gly Ala Pro 565 570
575Thr Phe Ser Val Trp Asn Asp Thr 580
SEQ ID NO: 18; Length: 1740; Type: DNA; Organism: Brassica napus; Feature: CDS (1)...(1740)
Sequence 18: atg aat aat aac tgg tta ggc ttt tct ctc tct cct
tat gaa caa aat 48Met Asn Asn Asn Trp Leu Gly Phe Ser Leu Ser Pro Tyr
Glu Gln Asn 1 5 10 15cac
cat cgt aag gac gtc tac tct tcc acc acc aca acc gtc gta gat 96His His
Arg Lys Asp Val Tyr Ser Ser Thr Thr Thr Thr Val Val Asp 20
25 30gtc gcc gga gag tac tgt tac gat ccg
acc gct gcc tcc gat gag tct 144Val Ala Gly Glu Tyr Cys Tyr Asp Pro Thr
Ala Ala Ser Asp Glu Ser 35 40
45tca gcc atc caa aca tcg ttt cct tct ccc ttt ggt gtc gtc gtc gat
192Ser Ala Ile Gln Thr Ser Phe Pro Ser Pro Phe Gly Val Val Val Asp 50
55 60gct ttc acc aga gac aac aat agt
cac tcc cga gat tgg gac atc aat 240Ala Phe Thr Arg Asp Asn Asn Ser His
Ser Arg Asp Trp Asp Ile Asn 65 70 75
80ggt tgt gca tgc aat aac atc cac aac gat gag caa gat gga
cca aag 288Gly Cys Ala Cys Asn Asn Ile His Asn Asp Glu Gln Asp Gly Pro
Lys 85 90 95ctt gag aat
ttc ctt ggc cgc acc acc acg att tac aac acc aac gaa 336Leu Glu Asn Phe
Leu Gly Arg Thr Thr Thr Ile Tyr Asn Thr Asn Glu 100
105 110aac gtt gga gat gga agt gga agt ggc tgt tat
gga gga gga gac ggt 384Asn Val Gly Asp Gly Ser Gly Ser Gly Cys Tyr Gly
Gly Gly Asp Gly 115 120 125ggt ggt
ggc tca cta gga ctt tcg atg ata aag aca tgg ctg aga aat 432Gly Gly Gly
Ser Leu Gly Leu Ser Met Ile Lys Thr Trp Leu Arg Asn 130
135 140caa ccc gtg gat aat gtt gat aat caa gaa aat ggc
aat gct gca aaa 480Gln Pro Val Asp Asn Val Asp Asn Gln Glu Asn Gly Asn
Ala Ala Lys145 150 155
160ggc ctg tcc ctc tca atg aac tca tct act tct tgt gat aac aac aac
528Gly Leu Ser Leu Ser Met Asn Ser Ser Thr Ser Cys Asp Asn Asn Asn
165 170 175gac agc aat aac aac
gtt gtt gcc caa ggg aag act att gat gat agc 576Asp Ser Asn Asn Asn Val
Val Ala Gln Gly Lys Thr Ile Asp Asp Ser 180
185 190gtt gaa gct aca ccg aag aaa act att gag agt ttt
gga cag agg acg 624Val Glu Ala Thr Pro Lys Lys Thr Ile Glu Ser Phe Gly
Gln Arg Thr 195 200 205tct ata tac
cgc ggt gtt aca agg cat cgg tgg aca gga aga tat gag 672Ser Ile Tyr Arg
Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu 210
215 220gca cat tta tgg gat aat agt tgt aaa aga gaa ggc
caa acg cgc aaa 720Ala His Leu Trp Asp Asn Ser Cys Lys Arg Glu Gly Gln
Thr Arg Lys225 230 235
240gga aga caa gtt tat ttg gga ggt tat gac aaa gaa gaa aaa gca gct
768Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala
245 250 255agg gct tat gat tta
gcc gca ctc aag tat tgg gga acc acc act act 816Arg Ala Tyr Asp Leu Ala
Ala Leu Lys Tyr Trp Gly Thr Thr Thr Thr 260
265 270act aac ttc ccc atg agc gaa tat gaa aaa gag gta
gaa gag atg aag 864Thr Asn Phe Pro Met Ser Glu Tyr Glu Lys Glu Val Glu
Glu Met Lys 275 280 285cac atg aca
agg caa gag tat gtt gcc tca ctg cgc agg aaa agt agt 912His Met Thr Arg
Gln Glu Tyr Val Ala Ser Leu Arg Arg Lys Ser Ser 290
295 300ggt ttc tct cgt ggt gca tcg att tat cgt gga gta
aca aga cat cac 960Gly Phe Ser Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr
Arg His His305 310 315
320caa cat gga aga tgg caa gct agg ata gga aga gtc gcc ggt aac aaa
1008Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys
325 330 335gac ctc tac ttg gga
act ttt ggc aca caa gaa gaa gct gca gag gca 1056Asp Leu Tyr Leu Gly Thr
Phe Gly Thr Gln Glu Glu Ala Ala Glu Ala 340
345 350tac gac att gcg gcc atc aaa ttc aga gga tta acc
gca gtg act aac 1104Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly Leu Thr Ala
Val Thr Asn 355 360 365ttc gac atg
aac aga tac aac gtt aaa gca atc ctc gaa agc cct agt 1152Phe Asp Met Asn
Arg Tyr Asn Val Lys Ala Ile Leu Glu Ser Pro Ser 370
375 380ctt cct att ggt agc gcc gca aaa cgt ctc aag gag
gct aac cgt ccg 1200Leu Pro Ile Gly Ser Ala Ala Lys Arg Leu Lys Glu Ala
Asn Arg Pro385 390 395
400gtt cca agt atg atg atg atc agt aat aac gtt tca gag agt gag aat
1248Val Pro Ser Met Met Met Ile Ser Asn Asn Val Ser Glu Ser Glu Asn
405 410 415agt gct agc ggt tgg
caa aac gct gcg gtt cag cat cat cag gga gta 1296Ser Ala Ser Gly Trp Gln
Asn Ala Ala Val Gln His His Gln Gly Val 420
425 430gat ttg agc tta ttg cac caa cat caa gag agg tac
aat ggt tat tat 1344Asp Leu Ser Leu Leu His Gln His Gln Glu Arg Tyr Asn
Gly Tyr Tyr 435 440 445tac aat gga
gga aac ttg tct tcg gag agt gct agg gct tgt ttc aaa 1392Tyr Asn Gly Gly
Asn Leu Ser Ser Glu Ser Ala Arg Ala Cys Phe Lys 450
455 460caa gag gat gat caa cac cat ttc ttg agc aac acg
cag agc ctc atg 1440Gln Glu Asp Asp Gln His His Phe Leu Ser Asn Thr Gln
Ser Leu Met465 470 475
480act aat atc gat cat caa agt tct gtt tcg gat gat tcg gtt act gtt
1488Thr Asn Ile Asp His Gln Ser Ser Val Ser Asp Asp Ser Val Thr Val
485 490 495tgt gga aat gtt gtt
ggt tat ggt ggt tat caa gga ttt gca gcc ccg 1536Cys Gly Asn Val Val Gly
Tyr Gly Gly Tyr Gln Gly Phe Ala Ala Pro 500
505 510gtt aac tgc gat gcc tac gct gct agt gag ttt gat
tat aac gca aga 1584Val Asn Cys Asp Ala Tyr Ala Ala Ser Glu Phe Asp Tyr
Asn Ala Arg 515 520 525aac cat tat
tac ttt gct cag cag cag cag acc cag cag tcg cca ggt 1632Asn His Tyr Tyr
Phe Ala Gln Gln Gln Gln Thr Gln Gln Ser Pro Gly 530
535 540gga gat ttt ccc gcg gca atg acg aat aat gtt ggc
tct aat atg tat 1680Gly Asp Phe Pro Ala Ala Met Thr Asn Asn Val Gly Ser
Asn Met Tyr545 550 555
560tac cat ggg gaa ggt ggt gga gaa gtt gct cca aca ttt aca gtt tgg
1728Tyr His Gly Glu Gly Gly Gly Glu Val Ala Pro Thr Phe Thr Val Trp
565 570 575aac gac aat tag
1740Asn Asp
Asn
SEQ ID NO: 19; Length: 579; Type: PRT; Organism: Brassica napus
Sequence 19: Met Asn Asn Asn Trp Leu Gly Phe Ser Leu Ser
Pro Tyr Glu Gln Asn1 5 10
15His His Arg Lys Asp Val Tyr Ser Ser Thr Thr Thr Thr Val Val Asp
20 25 30Val Ala Gly Glu Tyr Cys Tyr
Asp Pro Thr Ala Ala Ser Asp Glu Ser 35 40
45Ser Ala Ile Gln Thr Ser Phe Pro Ser Pro Phe Gly Val Val Val
Asp 50 55 60Ala Phe Thr Arg Asp Asn
Asn Ser His Ser Arg Asp Trp Asp Ile Asn65 70
75 80Gly Cys Ala Cys Asn Asn Ile His Asn Asp Glu
Gln Asp Gly Pro Lys 85 90
95Leu Glu Asn Phe Leu Gly Arg Thr Thr Thr Ile Tyr Asn Thr Asn Glu
100 105 110Asn Val Gly Asp Gly Ser
Gly Ser Gly Cys Tyr Gly Gly Gly Asp Gly 115 120
125Gly Gly Gly Ser Leu Gly Leu Ser Met Ile Lys Thr Trp Leu
Arg Asn 130 135 140Gln Pro Val Asp Asn
Val Asp Asn Gln Glu Asn Gly Asn Ala Ala Lys145 150
155 160Gly Leu Ser Leu Ser Met Asn Ser Ser Thr
Ser Cys Asp Asn Asn Asn 165 170
175Asp Ser Asn Asn Asn Val Val Ala Gln Gly Lys Thr Ile Asp Asp Ser
180 185 190Val Glu Ala Thr Pro
Lys Lys Thr Ile Glu Ser Phe Gly Gln Arg Thr 195
200 205Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr
Gly Arg Tyr Glu 210 215 220Ala His Leu
Trp Asp Asn Ser Cys Lys Arg Glu Gly Gln Thr Arg Lys225
230 235 240Gly Arg Gln Val Tyr Leu Gly
Gly Tyr Asp Lys Glu Glu Lys Ala Ala 245
250 255Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly
Thr Thr Thr Thr 260 265 270Thr
Asn Phe Pro Met Ser Glu Tyr Glu Lys Glu Val Glu Glu Met Lys 275
280 285His Met Thr Arg Gln Glu Tyr Val Ala
Ser Leu Arg Arg Lys Ser Ser 290 295
300Gly Phe Ser Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr Arg His His305
310 315 320Gln His Gly Arg
Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys 325
330 335Asp Leu Tyr Leu Gly Thr Phe Gly Thr Gln
Glu Glu Ala Ala Glu Ala 340 345
350Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly Leu Thr Ala Val Thr Asn
355 360 365Phe Asp Met Asn Arg Tyr Asn
Val Lys Ala Ile Leu Glu Ser Pro Ser 370 375
380Leu Pro Ile Gly Ser Ala Ala Lys Arg Leu Lys Glu Ala Asn Arg
Pro385 390 395 400Val Pro
Ser Met Met Met Ile Ser Asn Asn Val Ser Glu Ser Glu Asn
405 410 415Ser Ala Ser Gly Trp Gln Asn
Ala Ala Val Gln His His Gln Gly Val 420 425
430Asp Leu Ser Leu Leu His Gln His Gln Glu Arg Tyr Asn Gly
Tyr Tyr 435 440 445Tyr Asn Gly Gly
Asn Leu Ser Ser Glu Ser Ala Arg Ala Cys Phe Lys 450
455 460Gln Glu Asp Asp Gln His His Phe Leu Ser Asn Thr
Gln Ser Leu Met465 470 475
480Thr Asn Ile Asp His Gln Ser Ser Val Ser Asp Asp Ser Val Thr Val
485 490 495Cys Gly Asn Val Val
Gly Tyr Gly Gly Tyr Gln Gly Phe Ala Ala Pro 500
505 510Val Asn Cys Asp Ala Tyr Ala Ala Ser Glu Phe Asp
Tyr Asn Ala Arg 515 520 525Asn His
Tyr Tyr Phe Ala Gln Gln Gln Gln Thr Gln Gln Ser Pro Gly 530
535 540Gly Asp Phe Pro Ala Ala Met Thr Asn Asn Val
Gly Ser Asn Met Tyr545 550 555
560Tyr His Gly Glu Gly Gly Gly Glu Val Ala Pro Thr Phe Thr Val Trp
565 570 575Asn Asp
Asn
SEQ ID NO: 20; Length: 1740; Type: DNA; Organism: Brassica napus; Feature: CDS (1)...(1740)
Sequence 20: atg aat aat aac tgg tta ggc
ttt tct ctc tct cct tat gaa caa aat 48Met Asn Asn Asn Trp Leu Gly Phe
Ser Leu Ser Pro Tyr Glu Gln Asn 1 5 10
15cac cat cgt aag gac gtc tgc tct tcc acc acc aca acc gcc
gta gat 96His His Arg Lys Asp Val Cys Ser Ser Thr Thr Thr Thr Ala Val
Asp 20 25 30gtc gcc gga gag
tac tgt tac gat ccg acc gct gcc tcc gat gag tct 144Val Ala Gly Glu Tyr
Cys Tyr Asp Pro Thr Ala Ala Ser Asp Glu Ser 35
40 45tca gcc atc caa aca tcg ttt cct tct ccc ttt ggt gtc
gtc ctc gat 192Ser Ala Ile Gln Thr Ser Phe Pro Ser Pro Phe Gly Val Val
Leu Asp 50 55 60gct ttc acc aga gac
aac aat agt cac tcc cga gat tgg gac atc aat 240Ala Phe Thr Arg Asp Asn
Asn Ser His Ser Arg Asp Trp Asp Ile Asn 65 70
75 80ggt agt gca tgt aat aac atc cac aat gat gag
caa gat gga cca aaa 288Gly Ser Ala Cys Asn Asn Ile His Asn Asp Glu Gln
Asp Gly Pro Lys 85 90
95ctt gag aat ttc ctt ggc cgc acc acc acg att tac aac acc aac gaa
336Leu Glu Asn Phe Leu Gly Arg Thr Thr Thr Ile Tyr Asn Thr Asn Glu
100 105 110aac gtt gga gat atc gat
gga agt ggg tgt tat gga gga gga gac ggt 384Asn Val Gly Asp Ile Asp Gly
Ser Gly Cys Tyr Gly Gly Gly Asp Gly 115 120
125ggt ggt ggc tca cta gga ctt tcg atg ata aag aca tgg ctg aga
aat 432Gly Gly Gly Ser Leu Gly Leu Ser Met Ile Lys Thr Trp Leu Arg Asn
130 135 140caa ccc gtg gat aat gtt gat
aat caa gaa aat ggc aat ggt gca aaa 480Gln Pro Val Asp Asn Val Asp Asn
Gln Glu Asn Gly Asn Gly Ala Lys145 150
155 160ggc ctg tcc ctc tca atg aac tca tct act tct tgt
gat aac aac aac 528Gly Leu Ser Leu Ser Met Asn Ser Ser Thr Ser Cys Asp
Asn Asn Asn 165 170 175tac
agc agt aac aac ctt gtt gcc caa ggg aag act att gat gat agc 576Tyr Ser
Ser Asn Asn Leu Val Ala Gln Gly Lys Thr Ile Asp Asp Ser 180
185 190gtt gaa gct aca ccg aag aaa act att
gag agt ttt gga cag agg acg 624Val Glu Ala Thr Pro Lys Lys Thr Ile Glu
Ser Phe Gly Gln Arg Thr 195 200
205tct ata tac cgc ggt gtt aca agg cat cgg tgg aca gga aga tat gag
672Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu 210
215 220gca cat tta tgg gat aat agt tgt
aaa cga gaa ggc caa acg cgc aaa 720Ala His Leu Trp Asp Asn Ser Cys Lys
Arg Glu Gly Gln Thr Arg Lys225 230 235
240gga aga caa gtt tat ttg gga ggt tat gac aaa gaa gaa aaa
gca gct 768Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu Glu Lys Ala
Ala 245 250 255agg gct tat
gat tta gcc gca ctc aag tat tgg gga acc acc act act 816Arg Ala Tyr Asp
Leu Ala Ala Leu Lys Tyr Trp Gly Thr Thr Thr Thr 260
265 270act aac ttc ccc atg agc gaa tat gag aaa gag
ata gaa gag atg aag 864Thr Asn Phe Pro Met Ser Glu Tyr Glu Lys Glu Ile
Glu Glu Met Lys 275 280 285cac atg
aca agg caa gag tat gtt gcc tca ctt cgc agg aaa agt agt 912His Met Thr
Arg Gln Glu Tyr Val Ala Ser Leu Arg Arg Lys Ser Ser 290
295 300ggt ttc tct cgt ggt gca tcg att tat cgt gga gta
aca aga cat cac 960Gly Phe Ser Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr
Arg His His305 310 315
320caa cat gga aga tgg caa gct agg ata gga aga gtc gcc ggt aac aaa
1008Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys
325 330 335gac ctc tac ttg gga
act ttt ggc aca caa gaa gaa gct gca gag gca 1056Asp Leu Tyr Leu Gly Thr
Phe Gly Thr Gln Glu Glu Ala Ala Glu Ala 340
345 350tac gac att gcg gcc atc aaa ttc aga gga tta acc
gca gtg act aac 1104Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly Leu Thr Ala
Val Thr Asn 355 360 365ttc gac atg
aac aga tac aac gtt aaa gca atc ctc gaa agc cct agt 1152Phe Asp Met Asn
Arg Tyr Asn Val Lys Ala Ile Leu Glu Ser Pro Ser 370
375 380ctt cct att ggt agc gcc gca aaa cgt ctc aag gag
gct aac cgt ccg 1200Leu Pro Ile Gly Ser Ala Ala Lys Arg Leu Lys Glu Ala
Asn Arg Pro385 390 395
400gtt cca agt atg atg atg atc agt aat aac gtt tca gag agt gag aat
1248Val Pro Ser Met Met Met Ile Ser Asn Asn Val Ser Glu Ser Glu Asn
405 410 415aat gct agc ggt tgg
caa aac gct gcg gtt cag cat cat cag gga gta 1296Asn Ala Ser Gly Trp Gln
Asn Ala Ala Val Gln His His Gln Gly Val 420
425 430gat ttg agc tta ttg cag caa cat caa gag agg tac
aat ggt tat tat 1344Asp Leu Ser Leu Leu Gln Gln His Gln Glu Arg Tyr Asn
Gly Tyr Tyr 435 440 445tac aat gga
gga aac ttg tct tcg gag agt gct agg gct tgt ttc aaa 1392Tyr Asn Gly Gly
Asn Leu Ser Ser Glu Ser Ala Arg Ala Cys Phe Lys 450
455 460caa gag gat gat caa cac cat ttc ttg agc aac acg
cag agc ctc atg 1440Gln Glu Asp Asp Gln His His Phe Leu Ser Asn Thr Gln
Ser Leu Met465 470 475
480act aat atc gat cat caa agt tct gtt tca gat gat tcg gtt act gtt
1488Thr Asn Ile Asp His Gln Ser Ser Val Ser Asp Asp Ser Val Thr Val
485 490 495tgt gga aat gtt gtt
ggt tat ggt ggt tat caa gga ttt gca gcc ccg 1536Cys Gly Asn Val Val Gly
Tyr Gly Gly Tyr Gln Gly Phe Ala Ala Pro 500
505 510gtt aac tgc gat gcc tac gct gct agt gag ttt gac
tat aac gca aga 1584Val Asn Cys Asp Ala Tyr Ala Ala Ser Glu Phe Asp Tyr
Asn Ala Arg 515 520 525aac cat tat
tac ttt gct cag cag cag cag acc cag cat tcg cca gga 1632Asn His Tyr Tyr
Phe Ala Gln Gln Gln Gln Thr Gln His Ser Pro Gly 530
535 540gga gat ttt ccc gcg gca atg acg aat aat gtt ggc
tct aat atg tat 1680Gly Asp Phe Pro Ala Ala Met Thr Asn Asn Val Gly Ser
Asn Met Tyr545 550 555
560tac cat ggg gaa ggt ggt gga gaa gtt gct cca aca ttt aca gtt tgg
1728Tyr His Gly Glu Gly Gly Gly Glu Val Ala Pro Thr Phe Thr Val Trp
565 570 575aac gac aat tag
1740Asn Asp
Asn
SEQ ID NO: 21; Length: 579; Type: PRT; Organism: Brassica napus
Sequence 21: Met Asn Asn Asn Trp Leu Gly Phe Ser Leu Ser
Pro Tyr Glu Gln Asn1 5 10
15His His Arg Lys Asp Val Cys Ser Ser Thr Thr Thr Thr Ala Val Asp
20 25 30Val Ala Gly Glu Tyr Cys Tyr
Asp Pro Thr Ala Ala Ser Asp Glu Ser 35 40
45Ser Ala Ile Gln Thr Ser Phe Pro Ser Pro Phe Gly Val Val Leu
Asp 50 55 60Ala Phe Thr Arg Asp Asn
Asn Ser His Ser Arg Asp Trp Asp Ile Asn65 70
75 80Gly Ser Ala Cys Asn Asn Ile His Asn Asp Glu
Gln Asp Gly Pro Lys 85 90
95Leu Glu Asn Phe Leu Gly Arg Thr Thr Thr Ile Tyr Asn Thr Asn Glu
100 105 110Asn Val Gly Asp Ile Asp
Gly Ser Gly Cys Tyr Gly Gly Gly Asp Gly 115 120
125Gly Gly Gly Ser Leu Gly Leu Ser Met Ile Lys Thr Trp Leu
Arg Asn 130 135 140Gln Pro Val Asp Asn
Val Asp Asn Gln Glu Asn Gly Asn Gly Ala Lys145 150
155 160Gly Leu Ser Leu Ser Met Asn Ser Ser Thr
Ser Cys Asp Asn Asn Asn 165 170
175Tyr Ser Ser Asn Asn Leu Val Ala Gln Gly Lys Thr Ile Asp Asp Ser
180 185 190Val Glu Ala Thr Pro
Lys Lys Thr Ile Glu Ser Phe Gly Gln Arg Thr 195
200 205Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr
Gly Arg Tyr Glu 210 215 220Ala His Leu
Trp Asp Asn Ser Cys Lys Arg Glu Gly Gln Thr Arg Lys225
230 235 240Gly Arg Gln Val Tyr Leu Gly
Gly Tyr Asp Lys Glu Glu Lys Ala Ala 245
250 255Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly
Thr Thr Thr Thr 260 265 270Thr
Asn Phe Pro Met Ser Glu Tyr Glu Lys Glu Ile Glu Glu Met Lys 275
280 285His Met Thr Arg Gln Glu Tyr Val Ala
Ser Leu Arg Arg Lys Ser Ser 290 295
300Gly Phe Ser Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr Arg His His305
310 315 320Gln His Gly Arg
Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys 325
330 335Asp Leu Tyr Leu Gly Thr Phe Gly Thr Gln
Glu Glu Ala Ala Glu Ala 340 345
350Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly Leu Thr Ala Val Thr Asn
355 360 365Phe Asp Met Asn Arg Tyr Asn
Val Lys Ala Ile Leu Glu Ser Pro Ser 370 375
380Leu Pro Ile Gly Ser Ala Ala Lys Arg Leu Lys Glu Ala Asn Arg
Pro385 390 395 400Val Pro
Ser Met Met Met Ile Ser Asn Asn Val Ser Glu Ser Glu Asn
405 410 415Asn Ala Ser Gly Trp Gln Asn
Ala Ala Val Gln His His Gln Gly Val 420 425
430Asp Leu Ser Leu Leu Gln Gln His Gln Glu Arg Tyr Asn Gly
Tyr Tyr 435 440 445Tyr Asn Gly Gly
Asn Leu Ser Ser Glu Ser Ala Arg Ala Cys Phe Lys 450
455 460Gln Glu Asp Asp Gln His His Phe Leu Ser Asn Thr
Gln Ser Leu Met465 470 475
480Thr Asn Ile Asp His Gln Ser Ser Val Ser Asp Asp Ser Val Thr Val
485 490 495Cys Gly Asn Val Val
Gly Tyr Gly Gly Tyr Gln Gly Phe Ala Ala Pro 500
505 510Val Asn Cys Asp Ala Tyr Ala Ala Ser Glu Phe Asp
Tyr Asn Ala Arg 515 520 525Asn His
Tyr Tyr Phe Ala Gln Gln Gln Gln Thr Gln His Ser Pro Gly 530
535 540Gly Asp Phe Pro Ala Ala Met Thr Asn Asn Val
Gly Ser Asn Met Tyr545 550 555
560Tyr His Gly Glu Gly Gly Gly Glu Val Ala Pro Thr Phe Thr Val Trp
565 570 575Asn Asp
Asn
SEQ ID NO: 22; Length: 2070; Type: DNA; Organism: Medicago truncatula; Feature: CDS (1)...(2070)
Sequence 22: atg gcc tct atg aac ttg
tta ggt ttc tct cta tct cca caa gaa caa 48Met Ala Ser Met Asn Leu Leu
Gly Phe Ser Leu Ser Pro Gln Glu Gln 1 5
10 15cat cca tca aca caa gat caa acg gtg gct tcc cgt ttt
ggg ttc aac 96His Pro Ser Thr Gln Asp Gln Thr Val Ala Ser Arg Phe Gly
Phe Asn 20 25 30cct aat gaa
atc tca ggc tct gat gtt caa gga gat cac tgc tat gat 144Pro Asn Glu Ile
Ser Gly Ser Asp Val Gln Gly Asp His Cys Tyr Asp 35
40 45ctc tct tct cac aca act cct cat cat tca ctc aac
ctt tct cat cct 192Leu Ser Ser His Thr Thr Pro His His Ser Leu Asn Leu
Ser His Pro 50 55 60ttt tcc att tat
gaa gct ttc cac aca aat aac aac att cac acc act 240Phe Ser Ile Tyr Glu
Ala Phe His Thr Asn Asn Asn Ile His Thr Thr 65 70
75 80caa gat tgg aag gag aac tac aac aac caa
aac cta cta ttg gga aca 288Gln Asp Trp Lys Glu Asn Tyr Asn Asn Gln Asn
Leu Leu Leu Gly Thr 85 90
95tca tgc atg aac caa aat gtg aac aac aac aac caa caa gca caa cca
336Ser Cys Met Asn Gln Asn Val Asn Asn Asn Asn Gln Gln Ala Gln Pro
100 105 110aag cta gaa aac ttc ctc
ggt gga cac tct ttc acc gac cat caa gaa 384Lys Leu Glu Asn Phe Leu Gly
Gly His Ser Phe Thr Asp His Gln Glu 115 120
125tac ggt ggt agc aac tca tac tct tca tta cac ctc cca cct cat
cag 432Tyr Gly Gly Ser Asn Ser Tyr Ser Ser Leu His Leu Pro Pro His Gln
130 135 140ccg gaa gca tcc tgt ggc ggt
ggt gat ggt agt aca agt aac aat aac 480Pro Glu Ala Ser Cys Gly Gly Gly
Asp Gly Ser Thr Ser Asn Asn Asn145 150
155 160tca ata ggt tta tct atg ata aaa aca tgg ctc aga
aac caa cca cca 528Ser Ile Gly Leu Ser Met Ile Lys Thr Trp Leu Arg Asn
Gln Pro Pro 165 170 175cca
cca gaa aac aac aac aat aac aac aat gaa agt ggt gca cgt gtg 576Pro Pro
Glu Asn Asn Asn Asn Asn Asn Asn Glu Ser Gly Ala Arg Val 180
185 190cag aca cta tca ctt tct atg agt act
ggc tca cag tca agt tca tct 624Gln Thr Leu Ser Leu Ser Met Ser Thr Gly
Ser Gln Ser Ser Ser Ser 195 200
205gtg cct ctt ctc aat gca aat gtg atg agt ggt gag att tcc tca tcg
672Val Pro Leu Leu Asn Ala Asn Val Met Ser Gly Glu Ile Ser Ser Ser 210
215 220gaa aac aaa caa cca ccc aca act
gca gtt gta ctt gat agc aac caa 720Glu Asn Lys Gln Pro Pro Thr Thr Ala
Val Val Leu Asp Ser Asn Gln225 230 235
240aca agt gtc gtt gaa agt gct gtg cct aga aaa tcc gtt gat
aca ttt 768Thr Ser Val Val Glu Ser Ala Val Pro Arg Lys Ser Val Asp Thr
Phe 245 250 255gga caa aga
act tcc att tac cgt ggt gta aca agg cat aga tgg aca 816Gly Gln Arg Thr
Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr 260
265 270ggg aga tat gaa gct cac ctt tgg gat aat agt
tgt aga aga gag ggg 864Gly Arg Tyr Glu Ala His Leu Trp Asp Asn Ser Cys
Arg Arg Glu Gly 275 280 285cag act
cgc aaa gga agg caa gtt tac ttg gga ggt tat gac aaa gaa 912Gln Thr Arg
Lys Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu 290
295 300gaa aaa gca gct aga gcc tat gat ttg gca gca cta
aaa tat tgg gga 960Glu Lys Ala Ala Arg Ala Tyr Asp Leu Ala Ala Leu Lys
Tyr Trp Gly305 310 315
320aca act act aca aca aat ttt cca att agc cat tat gaa aaa gaa gtg
1008Thr Thr Thr Thr Thr Asn Phe Pro Ile Ser His Tyr Glu Lys Glu Val
325 330 335gaa gaa atg aag cat
atg aca agg caa gag tac gtt gcg tca ttg aga 1056Glu Glu Met Lys His Met
Thr Arg Gln Glu Tyr Val Ala Ser Leu Arg 340
345 350agg aaa agt agt ggt ttt tca cga ggt gca tcc att
tac cga gga gta 1104Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala Ser Ile Tyr
Arg Gly Val 355 360 365aca aga cat
cat caa cat ggt aga tgg caa gct agg att gga aga gtt 1152Thr Arg His His
Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg Val 370
375 380gca ggc aac aaa gat ctc tac cta gga act ttc agc
act caa gaa gag 1200Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe Ser Thr
Gln Glu Glu385 390 395
400gca gca gag gca tat gat gtg gca gca ata aaa ttc aga gga ctg agt
1248Ala Ala Glu Ala Tyr Asp Val Ala Ala Ile Lys Phe Arg Gly Leu Ser
405 410 415gca gtt aca aac ttt
gac atg agc aga tat gat gtc aaa acc ata ctt 1296Ala Val Thr Asn Phe Asp
Met Ser Arg Tyr Asp Val Lys Thr Ile Leu 420
425 430gag agc agc aca tta cca att ggt ggt gct gca aag
cgt tta aaa gac 1344Glu Ser Ser Thr Leu Pro Ile Gly Gly Ala Ala Lys Arg
Leu Lys Asp 435 440 445atg gag caa
gtt gaa ttg aat cat gtg aat gtt gat att agc cat aga 1392Met Glu Gln Val
Glu Leu Asn His Val Asn Val Asp Ile Ser His Arg 450
455 460act gaa caa gat cat agc atc atc aac aac act tcc
cat tta aca gaa 1440Thr Glu Gln Asp His Ser Ile Ile Asn Asn Thr Ser His
Leu Thr Glu465 470 475
480caa gcc atc tat gca gca aca aat gca tct aat tgg cat gca ctt tca
1488Gln Ala Ile Tyr Ala Ala Thr Asn Ala Ser Asn Trp His Ala Leu Ser
485 490 495ttc caa cat caa caa
cca cat cat cat tac aat gcc aac aac atg cag 1536Phe Gln His Gln Gln Pro
His His His Tyr Asn Ala Asn Asn Met Gln 500
505 510tta cag aat tat cct tat gga act caa act caa aag
ctt tgg tgc aaa 1584Leu Gln Asn Tyr Pro Tyr Gly Thr Gln Thr Gln Lys Leu
Trp Cys Lys 515 520 525caa gaa caa
gat tct gat gat cat agt act tat act act gct act gat 1632Gln Glu Gln Asp
Ser Asp Asp His Ser Thr Tyr Thr Thr Ala Thr Asp 530
535 540att cat caa cta cag tta ggg aat aat aat aac aat
act cac aat ttc 1680Ile His Gln Leu Gln Leu Gly Asn Asn Asn Asn Asn Thr
His Asn Phe545 550 555
560ttt ggt tta caa aat atc atg agt atg gat tct gct tcc atg gat aat
1728Phe Gly Leu Gln Asn Ile Met Ser Met Asp Ser Ala Ser Met Asp Asn
565 570 575agt tct gga tct aat
tct gtt gtt tat ggt ggt gga gat cat ggt ggt 1776Ser Ser Gly Ser Asn Ser
Val Val Tyr Gly Gly Gly Asp His Gly Gly 580
585 590tat gga gga aat ggt gga tat atg att cca atg gct
att gca aat gat 1824Tyr Gly Gly Asn Gly Gly Tyr Met Ile Pro Met Ala Ile
Ala Asn Asp 595 600 605ggt aac caa
aat cca aga agc aac aac aat ttt ggt gag agt gag att 1872Gly Asn Gln Asn
Pro Arg Ser Asn Asn Asn Phe Gly Glu Ser Glu Ile 610
615 620aaa gga ttt ggt tat gaa aat gtt ttt ggg act act
act gat cct tat 1920Lys Gly Phe Gly Tyr Glu Asn Val Phe Gly Thr Thr Thr
Asp Pro Tyr625 630 635
640cat gca cag gca gca agg aac ttg tac tat cag cca caa caa tta tct
1968His Ala Gln Ala Ala Arg Asn Leu Tyr Tyr Gln Pro Gln Gln Leu Ser
645 650 655gtt gat caa gga tca
aat tgg gtt cca act gct att cca aca ctt gct 2016Val Asp Gln Gly Ser Asn
Trp Val Pro Thr Ala Ile Pro Thr Leu Ala 660
665 670cca agg act acc aat gtc tct cta tgt cct cct ttc
act ttg ttg cat 2064Pro Arg Thr Thr Asn Val Ser Leu Cys Pro Pro Phe Thr
Leu Leu His 675 680 685gaa tag
2070Glu
SEQ ID NO: 23; Length: 689; Type: PRT; Organism: Medicago truncatula
Sequence 23: Met Ala Ser Met Asn Leu Leu Gly Phe
Ser Leu Ser Pro Gln Glu Gln1 5 10
15His Pro Ser Thr Gln Asp Gln Thr Val Ala Ser Arg Phe Gly Phe
Asn 20 25 30Pro Asn Glu Ile
Ser Gly Ser Asp Val Gln Gly Asp His Cys Tyr Asp 35
40 45Leu Ser Ser His Thr Thr Pro His His Ser Leu Asn
Leu Ser His Pro 50 55 60Phe Ser Ile
Tyr Glu Ala Phe His Thr Asn Asn Asn Ile His Thr Thr65 70
75 80Gln Asp Trp Lys Glu Asn Tyr Asn
Asn Gln Asn Leu Leu Leu Gly Thr 85 90
95Ser Cys Met Asn Gln Asn Val Asn Asn Asn Asn Gln Gln Ala
Gln Pro 100 105 110Lys Leu Glu
Asn Phe Leu Gly Gly His Ser Phe Thr Asp His Gln Glu 115
120 125Tyr Gly Gly Ser Asn Ser Tyr Ser Ser Leu His
Leu Pro Pro His Gln 130 135 140Pro Glu
Ala Ser Cys Gly Gly Gly Asp Gly Ser Thr Ser Asn Asn Asn145
150 155 160Ser Ile Gly Leu Ser Met Ile
Lys Thr Trp Leu Arg Asn Gln Pro Pro 165
170 175Pro Pro Glu Asn Asn Asn Asn Asn Asn Asn Glu Ser
Gly Ala Arg Val 180 185 190Gln
Thr Leu Ser Leu Ser Met Ser Thr Gly Ser Gln Ser Ser Ser Ser 195
200 205Val Pro Leu Leu Asn Ala Asn Val Met
Ser Gly Glu Ile Ser Ser Ser 210 215
220Glu Asn Lys Gln Pro Pro Thr Thr Ala Val Val Leu Asp Ser Asn Gln225
230 235 240Thr Ser Val Val
Glu Ser Ala Val Pro Arg Lys Ser Val Asp Thr Phe 245
250 255Gly Gln Arg Thr Ser Ile Tyr Arg Gly Val
Thr Arg His Arg Trp Thr 260 265
270Gly Arg Tyr Glu Ala His Leu Trp Asp Asn Ser Cys Arg Arg Glu Gly
275 280 285Gln Thr Arg Lys Gly Arg Gln
Val Tyr Leu Gly Gly Tyr Asp Lys Glu 290 295
300Glu Lys Ala Ala Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp
Gly305 310 315 320Thr Thr
Thr Thr Thr Asn Phe Pro Ile Ser His Tyr Glu Lys Glu Val
325 330 335Glu Glu Met Lys His Met Thr
Arg Gln Glu Tyr Val Ala Ser Leu Arg 340 345
350Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala Ser Ile Tyr Arg
Gly Val 355 360 365Thr Arg His His
Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg Val 370
375 380Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe Ser
Thr Gln Glu Glu385 390 395
400Ala Ala Glu Ala Tyr Asp Val Ala Ala Ile Lys Phe Arg Gly Leu Ser
405 410 415Ala Val Thr Asn Phe
Asp Met Ser Arg Tyr Asp Val Lys Thr Ile Leu 420
425 430Glu Ser Ser Thr Leu Pro Ile Gly Gly Ala Ala Lys
Arg Leu Lys Asp 435 440 445Met Glu
Gln Val Glu Leu Asn His Val Asn Val Asp Ile Ser His Arg 450
455 460Thr Glu Gln Asp His Ser Ile Ile Asn Asn Thr
Ser His Leu Thr Glu465 470 475
480Gln Ala Ile Tyr Ala Ala Thr Asn Ala Ser Asn Trp His Ala Leu Ser
485 490 495Phe Gln His Gln
Gln Pro His His His Tyr Asn Ala Asn Asn Met Gln 500
505 510Leu Gln Asn Tyr Pro Tyr Gly Thr Gln Thr Gln
Lys Leu Trp Cys Lys 515 520 525Gln
Glu Gln Asp Ser Asp Asp His Ser Thr Tyr Thr Thr Ala Thr Asp 530
535 540Ile His Gln Leu Gln Leu Gly Asn Asn Asn
Asn Asn Thr His Asn Phe545 550 555
560Phe Gly Leu Gln Asn Ile Met Ser Met Asp Ser Ala Ser Met Asp
Asn 565 570 575Ser Ser Gly
Ser Asn Ser Val Val Tyr Gly Gly Gly Asp His Gly Gly 580
585 590Tyr Gly Gly Asn Gly Gly Tyr Met Ile Pro
Met Ala Ile Ala Asn Asp 595 600
605Gly Asn Gln Asn Pro Arg Ser Asn Asn Asn Phe Gly Glu Ser Glu Ile 610
615 620Lys Gly Phe Gly Tyr Glu Asn Val
Phe Gly Thr Thr Thr Asp Pro Tyr625 630
635 640His Ala Gln Ala Ala Arg Asn Leu Tyr Tyr Gln Pro
Gln Gln Leu Ser 645 650
655Val Asp Gln Gly Ser Asn Trp Val Pro Thr Ala Ile Pro Thr Leu Ala
660 665 670Pro Arg Thr Thr Asn Val
Ser Leu Cys Pro Pro Phe Thr Leu Leu His 675 680
685Glu
SEQ ID NO: 24; Length: 2133; Type: DNA; Organism: Glycine max; Feature: CDS (1)...(2133)
Sequence 24: atg ggg tct atg
aat ttg tta ggt ttt tct ctc tct cct caa gaa cac 48Met Gly Ser Met Asn
Leu Leu Gly Phe Ser Leu Ser Pro Gln Glu His 1 5
10 15cct tct agt caa gat cac tct caa acg gca cct
tct cgt ttt tgc ttc 96Pro Ser Ser Gln Asp His Ser Gln Thr Ala Pro Ser
Arg Phe Cys Phe 20 25 30aac
cct gat gga atc tca agc act gat gta gca gga gac tgc ttt gat 144Asn Pro
Asp Gly Ile Ser Ser Thr Asp Val Ala Gly Asp Cys Phe Asp 35
40 45ctc act tct gac tca act cct cat tta ctc
aac ctt ccc tct tac ggc 192Leu Thr Ser Asp Ser Thr Pro His Leu Leu Asn
Leu Pro Ser Tyr Gly 50 55 60ata tac
gaa gct ttt cat agg agc aac aat att cac acc act caa gat 240Ile Tyr Glu
Ala Phe His Arg Ser Asn Asn Ile His Thr Thr Gln Asp 65
70 75 80tgg aag gag aac tac aac agc caa
aac ttg cta ttg gga act tca tgc 288Trp Lys Glu Asn Tyr Asn Ser Gln Asn
Leu Leu Leu Gly Thr Ser Cys 85 90
95agc aac caa aac atg aac cac aac cat cag caa caa caa caa caa
cag 336Ser Asn Gln Asn Met Asn His Asn His Gln Gln Gln Gln Gln Gln Gln
100 105 110cca aag ctt gaa aac
ttc ctc ggt gga cac tca ttt ggt gaa cat gag 384Pro Lys Leu Glu Asn Phe
Leu Gly Gly His Ser Phe Gly Glu His Glu 115 120
125caa ccc tac ggt ggt aac tca gcc tct aca gaa tac atg ttc
ccg gct 432Gln Pro Tyr Gly Gly Asn Ser Ala Ser Thr Glu Tyr Met Phe Pro
Ala 130 135 140cag ccg gta ttg gcc ggt
ggc ggc ggc ggt ggt agc aat agc agc aac 480Gln Pro Val Leu Ala Gly Gly
Gly Gly Gly Gly Ser Asn Ser Ser Asn145 150
155 160aca agc aac agt agc tcc ata ggg tta tcc atg ata
aag aca tgg ttg 528Thr Ser Asn Ser Ser Ser Ile Gly Leu Ser Met Ile Lys
Thr Trp Leu 165 170 175agg
aac caa cca cca cac tca gaa aac aac aat aac aac aac aat gaa 576Arg Asn
Gln Pro Pro His Ser Glu Asn Asn Asn Asn Asn Asn Asn Glu 180
185 190agt ggt ggc aat agt aga agc agt gtg
cag cag act cta tca ctt tcc 624Ser Gly Gly Asn Ser Arg Ser Ser Val Gln
Gln Thr Leu Ser Leu Ser 195 200
205atg agt act ggt tca caa tca agc aca tca cta ccc ctt ctc act gct
672Met Ser Thr Gly Ser Gln Ser Ser Thr Ser Leu Pro Leu Leu Thr Ala 210
215 220agt gtg gat aat gga gag agt tct
tct gat aac aaa caa cca cat acc 720Ser Val Asp Asn Gly Glu Ser Ser Ser
Asp Asn Lys Gln Pro His Thr225 230 235
240acg gct gca ctt gat aca acc caa acc gga gcc att gaa act
gca ccc 768Thr Ala Ala Leu Asp Thr Thr Gln Thr Gly Ala Ile Glu Thr Ala
Pro 245 250 255aga aag tcc
att gac act ttt gga cag aga act tct atc tac cgt ggt 816Arg Lys Ser Ile
Asp Thr Phe Gly Gln Arg Thr Ser Ile Tyr Arg Gly 260
265 270gta aca agg cat agg tgg acg ggg agg tat gag
gct cac ctg tgg gat 864Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala
His Leu Trp Asp 275 280 285aat agt
tgt aga aga gag gga caa act cgc aaa gga agg caa gtt tac 912Asn Ser Cys
Arg Arg Glu Gly Gln Thr Arg Lys Gly Arg Gln Val Tyr 290
295 300ttg gga ggt tat gac aaa gaa gaa aag gca gct aga
gcc tac gat ttg 960Leu Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg Ala
Tyr Asp Leu305 310 315
320gca gca cta aaa tac tgg gga aca act acg aca aca aat ttt cca att
1008Ala Ala Leu Lys Tyr Trp Gly Thr Thr Thr Thr Thr Asn Phe Pro Ile
325 330 335agc cac tat gag aaa
gag ttg gaa gaa atg aag cac atg act agg caa 1056Ser His Tyr Glu Lys Glu
Leu Glu Glu Met Lys His Met Thr Arg Gln 340
345 350gag tac gtt gcg tca ttg aga agg aag agt agt ggg
ttt tct cgc ggg 1104Glu Tyr Val Ala Ser Leu Arg Arg Lys Ser Ser Gly Phe
Ser Arg Gly 355 360 365gca tcc att
tat cga ggt gtg acg aga cac cat caa cat gga aga tgg 1152Ala Ser Ile Tyr
Arg Gly Val Thr Arg His His Gln His Gly Arg Trp 370
375 380caa gcg agg att gga aga gtt gct ggc aac aag gat
ctc tac ttg gga 1200Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu
Tyr Leu Gly385 390 395
400act ttc agc acc caa gag gag gca gca gaa gca tat gat gta gca gca
1248Thr Phe Ser Thr Gln Glu Glu Ala Ala Glu Ala Tyr Asp Val Ala Ala
405 410 415atc aaa ttc aga gga
cta agt gct gtt aca aac ttt gac atg agc aga 1296Ile Lys Phe Arg Gly Leu
Ser Ala Val Thr Asn Phe Asp Met Ser Arg 420
425 430tat gac gtg aaa agc ata ctt gag agc acc act ttg
cca att ggt ggt 1344Tyr Asp Val Lys Ser Ile Leu Glu Ser Thr Thr Leu Pro
Ile Gly Gly 435 440 445gct gca aag
cgt ttg aag gat atg gag cag gtg gaa ctg agg gtg gag 1392Ala Ala Lys Arg
Leu Lys Asp Met Glu Gln Val Glu Leu Arg Val Glu 450
455 460aat gtt cat aga gca gat caa gaa gat cat agt agc
atc atg aac tct 1440Asn Val His Arg Ala Asp Gln Glu Asp His Ser Ser Ile
Met Asn Ser465 470 475
480cac tta act caa gga atc att aac aac tat gca gca gga gga aca aca
1488His Leu Thr Gln Gly Ile Ile Asn Asn Tyr Ala Ala Gly Gly Thr Thr
485 490 495gcg act cat cat cat
aac tgg cac aat gct ctt gca ttc cac caa cct 1536Ala Thr His His His Asn
Trp His Asn Ala Leu Ala Phe His Gln Pro 500
505 510caa cct tgc acc acc ata cac tac cct tat gga caa
aga att aat tgg 1584Gln Pro Cys Thr Thr Ile His Tyr Pro Tyr Gly Gln Arg
Ile Asn Trp 515 520 525tgc aag caa
gaa caa gac aac tct gat gcc tct cac tct ttg tct tat 1632Cys Lys Gln Glu
Gln Asp Asn Ser Asp Ala Ser His Ser Leu Ser Tyr 530
535 540tca gat att cat caa cta cag cta ggg aac aat ggc
aca cac aac ttc 1680Ser Asp Ile His Gln Leu Gln Leu Gly Asn Asn Gly Thr
His Asn Phe545 550 555
560ttt cac aca aat tca ggg ttg cac cct atg tta agc atg gat tct gct
1728Phe His Thr Asn Ser Gly Leu His Pro Met Leu Ser Met Asp Ser Ala
565 570 575tcc att gac aat agc
tct tca tct aac tct gtt gtt tat gat ggt tat 1776Ser Ile Asp Asn Ser Ser
Ser Ser Asn Ser Val Val Tyr Asp Gly Tyr 580
585 590gga ggt ggt ggg ggc tat aat gtg att cct atg ggg
act act act act 1824Gly Gly Gly Gly Gly Tyr Asn Val Ile Pro Met Gly Thr
Thr Thr Thr 595 600 605gtt gtt gca
aat gat ggt gat caa aat cca aga agc aat cat ggt ttt 1872Val Val Ala Asn
Asp Gly Asp Gln Asn Pro Arg Ser Asn His Gly Phe 610
615 620ggt gat aat gag ata aag gca ctt ggt tat gaa agt
gtg tat ggt tct 1920Gly Asp Asn Glu Ile Lys Ala Leu Gly Tyr Glu Ser Val
Tyr Gly Ser625 630 635
640aca act gat cct tat cat gca cat gca agg aac ttg tat tat ctt act
1968Thr Thr Asp Pro Tyr His Ala His Ala Arg Asn Leu Tyr Tyr Leu Thr
645 650 655caa cag caa cca tct
tct gtt gat gca gtg aag gct agt gca tat gat 2016Gln Gln Gln Pro Ser Ser
Val Asp Ala Val Lys Ala Ser Ala Tyr Asp 660
665 670caa gga tct gca tgc aat act tgg gtt cca act gct
att cca act cat 2064Gln Gly Ser Ala Cys Asn Thr Trp Val Pro Thr Ala Ile
Pro Thr His 675 680 685gca cca agg
tct agt act agt atg gct ctc tgc cat ggt gct acg ccc 2112Ala Pro Arg Ser
Ser Thr Ser Met Ala Leu Cys His Gly Ala Thr Pro 690
695 700ttc tct tta ttg cat gaa tag
2133Phe Ser Leu Leu His Glu705
710
25 710 PRT Glycine max 25
Met Gly Ser Met Asn Leu Leu Gly Phe Ser Leu Ser
Pro Gln Glu His1 5 10
15Pro Ser Ser Gln Asp His Ser Gln Thr Ala Pro Ser Arg Phe Cys Phe
20 25 30Asn Pro Asp Gly Ile Ser Ser
Thr Asp Val Ala Gly Asp Cys Phe Asp 35 40
45Leu Thr Ser Asp Ser Thr Pro His Leu Leu Asn Leu Pro Ser Tyr
Gly 50 55 60Ile Tyr Glu Ala Phe His
Arg Ser Asn Asn Ile His Thr Thr Gln Asp65 70
75 80Trp Lys Glu Asn Tyr Asn Ser Gln Asn Leu Leu
Leu Gly Thr Ser Cys 85 90
95Ser Asn Gln Asn Met Asn His Asn His Gln Gln Gln Gln Gln Gln Gln
100 105 110Pro Lys Leu Glu Asn Phe
Leu Gly Gly His Ser Phe Gly Glu His Glu 115 120
125Gln Pro Tyr Gly Gly Asn Ser Ala Ser Thr Glu Tyr Met Phe
Pro Ala 130 135 140Gln Pro Val Leu Ala
Gly Gly Gly Gly Gly Gly Ser Asn Ser Ser Asn145 150
155 160Thr Ser Asn Ser Ser Ser Ile Gly Leu Ser
Met Ile Lys Thr Trp Leu 165 170
175Arg Asn Gln Pro Pro His Ser Glu Asn Asn Asn Asn Asn Asn Asn Glu
180 185 190Ser Gly Gly Asn Ser
Arg Ser Ser Val Gln Gln Thr Leu Ser Leu Ser 195
200 205Met Ser Thr Gly Ser Gln Ser Ser Thr Ser Leu Pro
Leu Leu Thr Ala 210 215 220Ser Val Asp
Asn Gly Glu Ser Ser Ser Asp Asn Lys Gln Pro His Thr225
230 235 240Thr Ala Ala Leu Asp Thr Thr
Gln Thr Gly Ala Ile Glu Thr Ala Pro 245
250 255Arg Lys Ser Ile Asp Thr Phe Gly Gln Arg Thr Ser
Ile Tyr Arg Gly 260 265 270Val
Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu Trp Asp 275
280 285Asn Ser Cys Arg Arg Glu Gly Gln Thr
Arg Lys Gly Arg Gln Val Tyr 290 295
300Leu Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg Ala Tyr Asp Leu305
310 315 320Ala Ala Leu Lys
Tyr Trp Gly Thr Thr Thr Thr Thr Asn Phe Pro Ile 325
330 335Ser His Tyr Glu Lys Glu Leu Glu Glu Met
Lys His Met Thr Arg Gln 340 345
350Glu Tyr Val Ala Ser Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly
355 360 365Ala Ser Ile Tyr Arg Gly Val
Thr Arg His His Gln His Gly Arg Trp 370 375
380Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu
Gly385 390 395 400Thr Phe
Ser Thr Gln Glu Glu Ala Ala Glu Ala Tyr Asp Val Ala Ala
405 410 415Ile Lys Phe Arg Gly Leu Ser
Ala Val Thr Asn Phe Asp Met Ser Arg 420 425
430Tyr Asp Val Lys Ser Ile Leu Glu Ser Thr Thr Leu Pro Ile
Gly Gly 435 440 445Ala Ala Lys Arg
Leu Lys Asp Met Glu Gln Val Glu Leu Arg Val Glu 450
455 460Asn Val His Arg Ala Asp Gln Glu Asp His Ser Ser
Ile Met Asn Ser465 470 475
480His Leu Thr Gln Gly Ile Ile Asn Asn Tyr Ala Ala Gly Gly Thr Thr
485 490 495Ala Thr His His His
Asn Trp His Asn Ala Leu Ala Phe His Gln Pro 500
505 510Gln Pro Cys Thr Thr Ile His Tyr Pro Tyr Gly Gln
Arg Ile Asn Trp 515 520 525Cys Lys
Gln Glu Gln Asp Asn Ser Asp Ala Ser His Ser Leu Ser Tyr 530
535 540Ser Asp Ile His Gln Leu Gln Leu Gly Asn Asn
Gly Thr His Asn Phe545 550 555
560Phe His Thr Asn Ser Gly Leu His Pro Met Leu Ser Met Asp Ser Ala
565 570 575Ser Ile Asp Asn
Ser Ser Ser Ser Asn Ser Val Val Tyr Asp Gly Tyr 580
585 590Gly Gly Gly Gly Gly Tyr Asn Val Ile Pro Met
Gly Thr Thr Thr Thr 595 600 605Val
Val Ala Asn Asp Gly Asp Gln Asn Pro Arg Ser Asn His Gly Phe 610
615 620Gly Asp Asn Glu Ile Lys Ala Leu Gly Tyr
Glu Ser Val Tyr Gly Ser625 630 635
640Thr Thr Asp Pro Tyr His Ala His Ala Arg Asn Leu Tyr Tyr Leu
Thr 645 650 655Gln Gln Gln
Pro Ser Ser Val Asp Ala Val Lys Ala Ser Ala Tyr Asp 660
665 670Gln Gly Ser Ala Cys Asn Thr Trp Val Pro
Thr Ala Ile Pro Thr His 675 680
685Ala Pro Arg Ser Ser Thr Ser Met Ala Leu Cys His Gly Ala Thr Pro 690
695 700Phe Ser Leu Leu His Glu705
710
26 1932 DNA Vitis vinifera CDS (1)...(1932) 26
atg gct tcc atg aac
aac tgg ttg ggt ttc tct ttg tcc cct cga gaa 48Met Ala Ser Met Asn Asn
Trp Leu Gly Phe Ser Leu Ser Pro Arg Glu 1 5
10 15ctt cca cca cag cct gaa aat cac tca cag aac agt
gtc tct aga ctt 96Leu Pro Pro Gln Pro Glu Asn His Ser Gln Asn Ser Val
Ser Arg Leu 20 25 30ggt ttc
aac tct gat gaa atc tct ggg act gat gtg tca ggt gag tgt 144Gly Phe Asn
Ser Asp Glu Ile Ser Gly Thr Asp Val Ser Gly Glu Cys 35
40 45ttt gat ctc act tca gat tcc act gct ccc tct
ctc aac ctc cct ccc 192Phe Asp Leu Thr Ser Asp Ser Thr Ala Pro Ser Leu
Asn Leu Pro Pro 50 55 60cct ttt ggg
ata ctt gaa gca ttc aac agg aat aat cag ccc caa gat 240Pro Phe Gly Ile
Leu Glu Ala Phe Asn Arg Asn Asn Gln Pro Gln Asp 65 70
75 80act aac tac aaa acc acc act tct gag
ctc tcc atg ctc atg ggt agt 288Thr Asn Tyr Lys Thr Thr Thr Ser Glu Leu
Ser Met Leu Met Gly Ser 85 90
95tca tgc agt agt cat cat aac ctc gaa aac caa gaa ccc aaa ctt gaa
336Ser Cys Ser Ser His His Asn Leu Glu Asn Gln Glu Pro Lys Leu Glu
100 105 110aat ttc ctg ggc tgc cgc
tct ttt gct gat cat gag cag aaa ctt caa 384Asn Phe Leu Gly Cys Arg Ser
Phe Ala Asp His Glu Gln Lys Leu Gln 115 120
125ggg tac tac att tcc att ggt tta tcc atg atc aag aca tgg ctg
cgg 432Gly Tyr Tyr Ile Ser Ile Gly Leu Ser Met Ile Lys Thr Trp Leu Arg
130 135 140aac caa cct gca ccc acc cat
cag gat aac aac aag agt act gat act 480Asn Gln Pro Ala Pro Thr His Gln
Asp Asn Asn Lys Ser Thr Asp Thr145 150
155 160ggg cct gtc ggt gga gcc gcc gct ggg aac cta ccc
aat gca cag acc 528Gly Pro Val Gly Gly Ala Ala Ala Gly Asn Leu Pro Asn
Ala Gln Thr 165 170 175tta
tcg ttg tcc atg agc acc ggc tcg cac cag acc ggt gcc att gaa 576Leu Ser
Leu Ser Met Ser Thr Gly Ser His Gln Thr Gly Ala Ile Glu 180
185 190acg gtg cca agg aag tcc att gat aca
ttt gga cag agg aca tcc ata 624Thr Val Pro Arg Lys Ser Ile Asp Thr Phe
Gly Gln Arg Thr Ser Ile 195 200
205tac cgt ggt gta aca agg cat aga tgg acg ggt aga tat gag gct cat
672Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His 210
215 220cta tgg gac aac agt tgc aga aga
gaa gga caa act cga aag gga agg 720Leu Trp Asp Asn Ser Cys Arg Arg Glu
Gly Gln Thr Arg Lys Gly Arg225 230 235
240caa gtt tat tta ggt ggt tat gac aaa gaa gaa aag gca gct
agg gct 768Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg
Ala 245 250 255tac gat tta
gca gca ctg aag tat tgg ggt acc acc acc aca aca aat 816Tyr Asp Leu Ala
Ala Leu Lys Tyr Trp Gly Thr Thr Thr Thr Thr Asn 260
265 270ttc cct att agc aac tat gaa aaa gag ata gag
gag atg aag cac atg 864Phe Pro Ile Ser Asn Tyr Glu Lys Glu Ile Glu Glu
Met Lys His Met 275 280 285aca agg
cag gag tac gta gca tct ctg cga agg aag agt agc ggg ttt 912Thr Arg Gln
Glu Tyr Val Ala Ser Leu Arg Arg Lys Ser Ser Gly Phe 290
295 300tct cgt gga gca tcc ata tat aga gga gtg acc aga
cac cat cag cat 960Ser Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr Arg His
His Gln His305 310 315
320ggg aga tgg cag gca agg att gga aga gtc gca ggc aac aaa gat ctt
1008Gly Arg Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu
325 330 335tac ttg gga act ttc
agc acc caa gag gaa gca gca gag gcc tat gac 1056Tyr Leu Gly Thr Phe Ser
Thr Gln Glu Glu Ala Ala Glu Ala Tyr Asp 340
345 350att gct gcc att aag ttt cga gga ttg aat gcg gtg
acc aac ttt gat 1104Ile Ala Ala Ile Lys Phe Arg Gly Leu Asn Ala Val Thr
Asn Phe Asp 355 360 365atg agt aga
tat gat gtt aat agc att cta gag agc agt acc ttg ccg 1152Met Ser Arg Tyr
Asp Val Asn Ser Ile Leu Glu Ser Ser Thr Leu Pro 370
375 380att ggt gga gct gca aag cgg ttg aaa gat gct gag
cag gct gaa atg 1200Ile Gly Gly Ala Ala Lys Arg Leu Lys Asp Ala Glu Gln
Ala Glu Met385 390 395
400act ata gat gga cag agg aca gac gat gag atg agc tca cag ctg act
1248Thr Ile Asp Gly Gln Arg Thr Asp Asp Glu Met Ser Ser Gln Leu Thr
405 410 415gat gga atc aac aac
tat gga gca cac cac cat ggc tgg cct act gtt 1296Asp Gly Ile Asn Asn Tyr
Gly Ala His His His Gly Trp Pro Thr Val 420
425 430gca ttc caa caa gct cag cca ttt agc atg cac tac
cct tat ggc cat 1344Ala Phe Gln Gln Ala Gln Pro Phe Ser Met His Tyr Pro
Tyr Gly His 435 440 445cag cag agg
gct gtt tgg tgt aag caa gag caa gac cct gat ggc aca 1392Gln Gln Arg Ala
Val Trp Cys Lys Gln Glu Gln Asp Pro Asp Gly Thr 450
455 460cac aac ttt caa gat ctt cac caa cta caa ttg gga
aac act cac aac 1440His Asn Phe Gln Asp Leu His Gln Leu Gln Leu Gly Asn
Thr His Asn465 470 475
480ttc ttc cag cct aat gtt ctg cac aac ctc atg agc atg gac tct tct
1488Phe Phe Gln Pro Asn Val Leu His Asn Leu Met Ser Met Asp Ser Ser
485 490 495tca atg gac cat agc
tca ggc tcc aat tca gtc atc tat agc ggt ggt 1536Ser Met Asp His Ser Ser
Gly Ser Asn Ser Val Ile Tyr Ser Gly Gly 500
505 510gga gcc gct gat ggc agc gct gca act ggc ggc agt
ggc agt ggg agc 1584Gly Ala Ala Asp Gly Ser Ala Ala Thr Gly Gly Ser Gly
Ser Gly Ser 515 520 525ttc caa ggg
gta ggt tat ggg aac aac att ggc ttt gtg atg ccc ata 1632Phe Gln Gly Val
Gly Tyr Gly Asn Asn Ile Gly Phe Val Met Pro Ile 530
535 540agc acc gtc atc gct cat gaa ggc ggc cat ggc cag
gga aat ggt ggc 1680Ser Thr Val Ile Ala His Glu Gly Gly His Gly Gln Gly
Asn Gly Gly545 550 555
560ttt gga gat agc gaa gtg aag gcg att ggt tac gac aac atg ttt gga
1728Phe Gly Asp Ser Glu Val Lys Ala Ile Gly Tyr Asp Asn Met Phe Gly
565 570 575tcg aca gat cct tac
cat gct agg agc ttg tac tat ctt tca cag caa 1776Ser Thr Asp Pro Tyr His
Ala Arg Ser Leu Tyr Tyr Leu Ser Gln Gln 580
585 590tca tct gca ggc atg gtg aag ggc agt agt gca tat
gat cag ggg tca 1824Ser Ser Ala Gly Met Val Lys Gly Ser Ser Ala Tyr Asp
Gln Gly Ser 595 600 605ggg tgt aac
aac tgg gtt cca act gca gtt cca acc cta gct cca agg 1872Gly Cys Asn Asn
Trp Val Pro Thr Ala Val Pro Thr Leu Ala Pro Arg 610
615 620act aac agc ttg gca gta tgc cat gga aca cct aca
ttc aca gta tgg 1920Thr Asn Ser Leu Ala Val Cys His Gly Thr Pro Thr Phe
Thr Val Trp625 630 635
640aat gat aca taa
1932Asn Asp Thr
27 643 PRT Vitis vinifera 27
Met Ala Ser Met Asn Asn Trp Leu
Gly Phe Ser Leu Ser Pro Arg Glu1 5 10
15Leu Pro Pro Gln Pro Glu Asn His Ser Gln Asn Ser Val Ser
Arg Leu 20 25 30Gly Phe Asn
Ser Asp Glu Ile Ser Gly Thr Asp Val Ser Gly Glu Cys 35
40 45Phe Asp Leu Thr Ser Asp Ser Thr Ala Pro Ser
Leu Asn Leu Pro Pro 50 55 60Pro Phe
Gly Ile Leu Glu Ala Phe Asn Arg Asn Asn Gln Pro Gln Asp65
70 75 80Thr Asn Tyr Lys Thr Thr Thr
Ser Glu Leu Ser Met Leu Met Gly Ser 85 90
95Ser Cys Ser Ser His His Asn Leu Glu Asn Gln Glu Pro
Lys Leu Glu 100 105 110Asn Phe
Leu Gly Cys Arg Ser Phe Ala Asp His Glu Gln Lys Leu Gln 115
120 125Gly Tyr Tyr Ile Ser Ile Gly Leu Ser Met
Ile Lys Thr Trp Leu Arg 130 135 140Asn
Gln Pro Ala Pro Thr His Gln Asp Asn Asn Lys Ser Thr Asp Thr145
150 155 160Gly Pro Val Gly Gly Ala
Ala Ala Gly Asn Leu Pro Asn Ala Gln Thr 165
170 175Leu Ser Leu Ser Met Ser Thr Gly Ser His Gln Thr
Gly Ala Ile Glu 180 185 190Thr
Val Pro Arg Lys Ser Ile Asp Thr Phe Gly Gln Arg Thr Ser Ile 195
200 205Tyr Arg Gly Val Thr Arg His Arg Trp
Thr Gly Arg Tyr Glu Ala His 210 215
220Leu Trp Asp Asn Ser Cys Arg Arg Glu Gly Gln Thr Arg Lys Gly Arg225
230 235 240Gln Val Tyr Leu
Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg Ala 245
250 255Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly
Thr Thr Thr Thr Thr Asn 260 265
270Phe Pro Ile Ser Asn Tyr Glu Lys Glu Ile Glu Glu Met Lys His Met
275 280 285Thr Arg Gln Glu Tyr Val Ala
Ser Leu Arg Arg Lys Ser Ser Gly Phe 290 295
300Ser Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr Arg His His Gln
His305 310 315 320Gly Arg
Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu
325 330 335Tyr Leu Gly Thr Phe Ser Thr
Gln Glu Glu Ala Ala Glu Ala Tyr Asp 340 345
350Ile Ala Ala Ile Lys Phe Arg Gly Leu Asn Ala Val Thr Asn
Phe Asp 355 360 365Met Ser Arg Tyr
Asp Val Asn Ser Ile Leu Glu Ser Ser Thr Leu Pro 370
375 380Ile Gly Gly Ala Ala Lys Arg Leu Lys Asp Ala Glu
Gln Ala Glu Met385 390 395
400Thr Ile Asp Gly Gln Arg Thr Asp Asp Glu Met Ser Ser Gln Leu Thr
405 410 415Asp Gly Ile Asn Asn
Tyr Gly Ala His His His Gly Trp Pro Thr Val 420
425 430Ala Phe Gln Gln Ala Gln Pro Phe Ser Met His Tyr
Pro Tyr Gly His 435 440 445Gln Gln
Arg Ala Val Trp Cys Lys Gln Glu Gln Asp Pro Asp Gly Thr 450
455 460His Asn Phe Gln Asp Leu His Gln Leu Gln Leu
Gly Asn Thr His Asn465 470 475
480Phe Phe Gln Pro Asn Val Leu His Asn Leu Met Ser Met Asp Ser Ser
485 490 495Ser Met Asp His
Ser Ser Gly Ser Asn Ser Val Ile Tyr Ser Gly Gly 500
505 510Gly Ala Ala Asp Gly Ser Ala Ala Thr Gly Gly
Ser Gly Ser Gly Ser 515 520 525Phe
Gln Gly Val Gly Tyr Gly Asn Asn Ile Gly Phe Val Met Pro Ile 530
535 540Ser Thr Val Ile Ala His Glu Gly Gly His
Gly Gln Gly Asn Gly Gly545 550 555
560Phe Gly Asp Ser Glu Val Lys Ala Ile Gly Tyr Asp Asn Met Phe
Gly 565 570 575Ser Thr Asp
Pro Tyr His Ala Arg Ser Leu Tyr Tyr Leu Ser Gln Gln 580
585 590Ser Ser Ala Gly Met Val Lys Gly Ser Ser
Ala Tyr Asp Gln Gly Ser 595 600
605Gly Cys Asn Asn Trp Val Pro Thr Ala Val Pro Thr Leu Ala Pro Arg 610
615 620Thr Asn Ser Leu Ala Val Cys His
Gly Thr Pro Thr Phe Thr Val Trp625 630
635 640Asn Asp Thr
28 2040 DNA Zea mays CDS (1)...(2040) 28
atg
gct tca gcg aac aac tgg ctg ggc ttc tcg ctc tcg ggc cag gat 48Met Ala
Ser Ala Asn Asn Trp Leu Gly Phe Ser Leu Ser Gly Gln Asp 1 5
10 15aac ccg cag cct aac cag gat agc
tcg cct gcc gcc ggt atc gac atc 96Asn Pro Gln Pro Asn Gln Asp Ser Ser
Pro Ala Ala Gly Ile Asp Ile 20 25
30tcc ggc gcc agc gac ttc tat ggc ctg ccc acg cag cag ggc tcc gac
144Ser Gly Ala Ser Asp Phe Tyr Gly Leu Pro Thr Gln Gln Gly Ser Asp
35 40 45ggg cat ctc ggc gtg ccg ggc
ctg cgg gac gat cac gct tct tat ggt 192Gly His Leu Gly Val Pro Gly Leu
Arg Asp Asp His Ala Ser Tyr Gly 50 55
60atc atg gag gcc tac aac agg gtt cct caa gaa acc caa gat tgg aac
240Ile Met Glu Ala Tyr Asn Arg Val Pro Gln Glu Thr Gln Asp Trp Asn 65
70 75 80atg agg ggc ttg
gac tac aac ggc ggt ggc tcg gag ctc tcg atg ctt 288Met Arg Gly Leu Asp
Tyr Asn Gly Gly Gly Ser Glu Leu Ser Met Leu 85
90 95gtg ggg tcc agc ggc ggc ggc ggg ggc aac ggc
aag agg gcc gtg gaa 336Val Gly Ser Ser Gly Gly Gly Gly Gly Asn Gly Lys
Arg Ala Val Glu 100 105 110gac
agc gag ccc aag ctc gaa gat ttc ctc ggc ggc aac tcg ttc gtc 384Asp Ser
Glu Pro Lys Leu Glu Asp Phe Leu Gly Gly Asn Ser Phe Val 115
120 125tcc gat caa gat cag tcc ggc ggt tac ctg
ttc tct gga gtc ccg ata 432Ser Asp Gln Asp Gln Ser Gly Gly Tyr Leu Phe
Ser Gly Val Pro Ile 130 135 140gcc agc
agc gcc aat agc aac agc ggg agc aac acc atg gag ctc tcc 480Ala Ser Ser
Ala Asn Ser Asn Ser Gly Ser Asn Thr Met Glu Leu Ser145
150 155 160atg atc aag acc tgg cta cgg
aac aac cag gtg gcc cag ccc cag ccg 528Met Ile Lys Thr Trp Leu Arg Asn
Asn Gln Val Ala Gln Pro Gln Pro 165 170
175cca gct cca cat cag ccg cag cct gag gaa atg agc acc gac
gcc agc 576Pro Ala Pro His Gln Pro Gln Pro Glu Glu Met Ser Thr Asp Ala
Ser 180 185 190ggc agc agc ttt
gga tgc tcg gat tcg atg gga agg aac agc atg gtg 624Gly Ser Ser Phe Gly
Cys Ser Asp Ser Met Gly Arg Asn Ser Met Val 195
200 205gcg gct ggt ggg agc tcg cag agc ctg gcg ctc tcg
atg agc acg ggc 672Ala Ala Gly Gly Ser Ser Gln Ser Leu Ala Leu Ser Met
Ser Thr Gly 210 215 220tcg cac ctg ccc
atg gtt gtg ccc agc ggc gcc gcc agc gga gcg gcc 720Ser His Leu Pro Met
Val Val Pro Ser Gly Ala Ala Ser Gly Ala Ala225 230
235 240tcg gag agc aca tcg tcg gag aac aag cga
gcg agc ggt gcc atg gat 768Ser Glu Ser Thr Ser Ser Glu Asn Lys Arg Ala
Ser Gly Ala Met Asp 245 250
255tcg ccc ggc agc gcg gta gaa gcc gta ccg agg aag tcc atc gac acg
816Ser Pro Gly Ser Ala Val Glu Ala Val Pro Arg Lys Ser Ile Asp Thr
260 265 270ttc ggg caa agg acc tct
ata tat cga ggt gta aca agg cat aga tgg 864Phe Gly Gln Arg Thr Ser Ile
Tyr Arg Gly Val Thr Arg His Arg Trp 275 280
285aca ggg cgg tat gag gct cat cta tgg gat aat agt tgt aga agg
gaa 912Thr Gly Arg Tyr Glu Ala His Leu Trp Asp Asn Ser Cys Arg Arg Glu
290 295 300ggg cag agt cgc aag ggt agg
caa gtt tac ctt ggt ggc tat gac aag 960Gly Gln Ser Arg Lys Gly Arg Gln
Val Tyr Leu Gly Gly Tyr Asp Lys305 310
315 320gag gac aag gca gca agg gct tat gat ttg gca gct
ctc aag tat tgg 1008Glu Asp Lys Ala Ala Arg Ala Tyr Asp Leu Ala Ala Leu
Lys Tyr Trp 325 330 335ggc
act acg aca aca aca aat ttc cct ata agc aac tac gaa aag gag 1056Gly Thr
Thr Thr Thr Thr Asn Phe Pro Ile Ser Asn Tyr Glu Lys Glu 340
345 350cta gaa gaa atg aaa cat atg act aga
cag gag tac att gca tac cta 1104Leu Glu Glu Met Lys His Met Thr Arg Gln
Glu Tyr Ile Ala Tyr Leu 355 360
365aga aga aat agc agt gga ttt tct cgt ggg gcg tca aag tat cgt gga
1152Arg Arg Asn Ser Ser Gly Phe Ser Arg Gly Ala Ser Lys Tyr Arg Gly
370 375 380gta act aga cat cat cag cat
ggg aga tgg caa gca agg ata ggg aga 1200Val Thr Arg His His Gln His Gly
Arg Trp Gln Ala Arg Ile Gly Arg385 390
395 400gtt gca gga aac aag gat ctc tac ttg ggc aca ttc
agc acc gag gag 1248Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe Ser
Thr Glu Glu 405 410 415gag
gcg gcg gag gcc tac gac atc gcc gcg atc aag ttc cgc ggt ctc 1296Glu Ala
Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly Leu 420
425 430aac gcc gtc acc aac ttc gac atg agc
cgc tac gac gtg aag agc atc 1344Asn Ala Val Thr Asn Phe Asp Met Ser Arg
Tyr Asp Val Lys Ser Ile 435 440
445ctc gag agc agc aca ctg cct gtc ggc ggt gcg gcc agg cgc ctc aag
1392Leu Glu Ser Ser Thr Leu Pro Val Gly Gly Ala Ala Arg Arg Leu Lys
450 455 460gac gcc gtg gac cac gtg gag
gcc ggc gcc acc atc tgg cgc gcc gac 1440Asp Ala Val Asp His Val Glu Ala
Gly Ala Thr Ile Trp Arg Ala Asp465 470
475 480atg gac ggc gcc gtg atc tcc cag ctg gcc gaa gcc
ggg atg ggc ggc 1488Met Asp Gly Ala Val Ile Ser Gln Leu Ala Glu Ala Gly
Met Gly Gly 485 490 495tac
gcc tcg tac ggc cac cac ggc tgg ccg acc atc gcg ttc cag cag 1536Tyr Ala
Ser Tyr Gly His His Gly Trp Pro Thr Ile Ala Phe Gln Gln 500
505 510ccg tcg ccg ctc tcc gtc cac tac ccg
tac ggc cag ccg tcc cgc ggg 1584Pro Ser Pro Leu Ser Val His Tyr Pro Tyr
Gly Gln Pro Ser Arg Gly 515 520
525tgg tgc aaa ccc gag cag gac gcg gcc gcc gcc gcg gcg cac agc ctg
1632Trp Cys Lys Pro Glu Gln Asp Ala Ala Ala Ala Ala Ala His Ser Leu
530 535 540cag gac ctc cag cag ctg cac
ctc ggc agc gcg gcc cac aac ttc ttc 1680Gln Asp Leu Gln Gln Leu His Leu
Gly Ser Ala Ala His Asn Phe Phe545 550
555 560cag gcg tcg tcg agc tcc aca gtc tac aac ggc ggc
gcc ggc gcc agt 1728Gln Ala Ser Ser Ser Ser Thr Val Tyr Asn Gly Gly Ala
Gly Ala Ser 565 570 575ggt
ggg tac cag ggc ctc ggt ggt ggc agc tct ttc ctc atg ccg tcg 1776Gly Gly
Tyr Gln Gly Leu Gly Gly Gly Ser Ser Phe Leu Met Pro Ser 580
585 590agc act gtc gtg gcg gcg gcc gac cag
ggg cac agc agc acg gcc aac 1824Ser Thr Val Val Ala Ala Ala Asp Gln Gly
His Ser Ser Thr Ala Asn 595 600
605cag ggg agc acg tgc agc tac ggg gac gac cac cag gag ggg aag ctc
1872Gln Gly Ser Thr Cys Ser Tyr Gly Asp Asp His Gln Glu Gly Lys Leu
610 615 620atc ggt tac gac gcc gcc atg
gtg gcg acc gca gct ggt gga gac ccg 1920Ile Gly Tyr Asp Ala Ala Met Val
Ala Thr Ala Ala Gly Gly Asp Pro625 630
635 640tac gct gcg gcg agg aac ggg tac cag ttc tcg cag
ggc tcg gga tcc 1968Tyr Ala Ala Ala Arg Asn Gly Tyr Gln Phe Ser Gln Gly
Ser Gly Ser 645 650 655acg
gtg agc atc gcg agg gcg aac ggg tac gct aac aac tgg agc tct 2016Thr Val
Ser Ile Ala Arg Ala Asn Gly Tyr Ala Asn Asn Trp Ser Ser 660
665 670cct ttc aac aac ggc atg ggg tga
2040Pro Phe Asn Asn Gly Met Gly
675
29 679 PRT Zea mays 29
Met Ala Ser Ala Asn Asn Trp Leu Gly Phe Ser Leu Ser
Gly Gln Asp1 5 10 15Asn
Pro Gln Pro Asn Gln Asp Ser Ser Pro Ala Ala Gly Ile Asp Ile 20
25 30Ser Gly Ala Ser Asp Phe Tyr Gly
Leu Pro Thr Gln Gln Gly Ser Asp 35 40
45Gly His Leu Gly Val Pro Gly Leu Arg Asp Asp His Ala Ser Tyr Gly
50 55 60Ile Met Glu Ala Tyr Asn Arg Val
Pro Gln Glu Thr Gln Asp Trp Asn65 70 75
80Met Arg Gly Leu Asp Tyr Asn Gly Gly Gly Ser Glu Leu
Ser Met Leu 85 90 95Val
Gly Ser Ser Gly Gly Gly Gly Gly Asn Gly Lys Arg Ala Val Glu
100 105 110Asp Ser Glu Pro Lys Leu Glu
Asp Phe Leu Gly Gly Asn Ser Phe Val 115 120
125Ser Asp Gln Asp Gln Ser Gly Gly Tyr Leu Phe Ser Gly Val Pro
Ile 130 135 140Ala Ser Ser Ala Asn Ser
Asn Ser Gly Ser Asn Thr Met Glu Leu Ser145 150
155 160Met Ile Lys Thr Trp Leu Arg Asn Asn Gln Val
Ala Gln Pro Gln Pro 165 170
175Pro Ala Pro His Gln Pro Gln Pro Glu Glu Met Ser Thr Asp Ala Ser
180 185 190Gly Ser Ser Phe Gly Cys
Ser Asp Ser Met Gly Arg Asn Ser Met Val 195 200
205Ala Ala Gly Gly Ser Ser Gln Ser Leu Ala Leu Ser Met Ser
Thr Gly 210 215 220Ser His Leu Pro Met
Val Val Pro Ser Gly Ala Ala Ser Gly Ala Ala225 230
235 240Ser Glu Ser Thr Ser Ser Glu Asn Lys Arg
Ala Ser Gly Ala Met Asp 245 250
255Ser Pro Gly Ser Ala Val Glu Ala Val Pro Arg Lys Ser Ile Asp Thr
260 265 270Phe Gly Gln Arg Thr
Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp 275
280 285Thr Gly Arg Tyr Glu Ala His Leu Trp Asp Asn Ser
Cys Arg Arg Glu 290 295 300Gly Gln Ser
Arg Lys Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys305
310 315 320Glu Asp Lys Ala Ala Arg Ala
Tyr Asp Leu Ala Ala Leu Lys Tyr Trp 325
330 335Gly Thr Thr Thr Thr Thr Asn Phe Pro Ile Ser Asn
Tyr Glu Lys Glu 340 345 350Leu
Glu Glu Met Lys His Met Thr Arg Gln Glu Tyr Ile Ala Tyr Leu 355
360 365Arg Arg Asn Ser Ser Gly Phe Ser Arg
Gly Ala Ser Lys Tyr Arg Gly 370 375
380Val Thr Arg His His Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg385
390 395 400Val Ala Gly Asn
Lys Asp Leu Tyr Leu Gly Thr Phe Ser Thr Glu Glu 405
410 415Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala
Ile Lys Phe Arg Gly Leu 420 425
430Asn Ala Val Thr Asn Phe Asp Met Ser Arg Tyr Asp Val Lys Ser Ile
435 440 445Leu Glu Ser Ser Thr Leu Pro
Val Gly Gly Ala Ala Arg Arg Leu Lys 450 455
460Asp Ala Val Asp His Val Glu Ala Gly Ala Thr Ile Trp Arg Ala
Asp465 470 475 480Met Asp
Gly Ala Val Ile Ser Gln Leu Ala Glu Ala Gly Met Gly Gly
485 490 495Tyr Ala Ser Tyr Gly His His
Gly Trp Pro Thr Ile Ala Phe Gln Gln 500 505
510Pro Ser Pro Leu Ser Val His Tyr Pro Tyr Gly Gln Pro Ser
Arg Gly 515 520 525Trp Cys Lys Pro
Glu Gln Asp Ala Ala Ala Ala Ala Ala His Ser Leu 530
535 540Gln Asp Leu Gln Gln Leu His Leu Gly Ser Ala Ala
His Asn Phe Phe545 550 555
560Gln Ala Ser Ser Ser Ser Thr Val Tyr Asn Gly Gly Ala Gly Ala Ser
565 570 575Gly Gly Tyr Gln Gly
Leu Gly Gly Gly Ser Ser Phe Leu Met Pro Ser 580
585 590Ser Thr Val Val Ala Ala Ala Asp Gln Gly His Ser
Ser Thr Ala Asn 595 600 605Gln Gly
Ser Thr Cys Ser Tyr Gly Asp Asp His Gln Glu Gly Lys Leu 610
615 620Ile Gly Tyr Asp Ala Ala Met Val Ala Thr Ala
Ala Gly Gly Asp Pro625 630 635
640Tyr Ala Ala Ala Arg Asn Gly Tyr Gln Phe Ser Gln Gly Ser Gly Ser
645 650 655Thr Val Ser Ile
Ala Arg Ala Asn Gly Tyr Ala Asn Asn Trp Ser Ser 660
665 670Pro Phe Asn Asn Gly Met Gly
675
30 2088 DNA Oryza sativa CDS (1)...(2088) 30
atg gcc acc atg aac aac tgg ctg
gcc ttc tcc ctc tcc ccg cag gat 48Met Ala Thr Met Asn Asn Trp Leu Ala
Phe Ser Leu Ser Pro Gln Asp 1 5 10
15cag ctc ccg ccg tct cag acc aac tcc act ctc atc tcc gcc gcc
gcc 96Gln Leu Pro Pro Ser Gln Thr Asn Ser Thr Leu Ile Ser Ala Ala Ala
20 25 30acc acc acc acc gcc
ggc gac tcc tcc acc ggc gac gtc tgc ttc aac 144Thr Thr Thr Thr Ala Gly
Asp Ser Ser Thr Gly Asp Val Cys Phe Asn 35 40
45atc ccc caa gat tgg agc atg agg gga tcg gag ctc tcg gcg
ctc gtc 192Ile Pro Gln Asp Trp Ser Met Arg Gly Ser Glu Leu Ser Ala Leu
Val 50 55 60gcc gag ccg aag ctg gag
gac ttc ctc ggc ggc atc tcc ttc tcg gag 240Ala Glu Pro Lys Leu Glu Asp
Phe Leu Gly Gly Ile Ser Phe Ser Glu 65 70
75 80cag cag cat cat cac ggc ggc aag ggc ggc gtg atc
ccg agc agc gcc 288Gln Gln His His His Gly Gly Lys Gly Gly Val Ile Pro
Ser Ser Ala 85 90 95gcc
gct tgc tac gcg agc tcc ggc agc agc gtc ggc tac ctg tac cct 336Ala Ala
Cys Tyr Ala Ser Ser Gly Ser Ser Val Gly Tyr Leu Tyr Pro 100
105 110cct cca agc tca tcc tcg ctc cag ttc
gcc gac tcc gtc atg gtg gcc 384Pro Pro Ser Ser Ser Ser Leu Gln Phe Ala
Asp Ser Val Met Val Ala 115 120
125acc tcc tcg ccc gtc gtc gcc cac gac ggc gtc agc ggc ggc ggc atg
432Thr Ser Ser Pro Val Val Ala His Asp Gly Val Ser Gly Gly Gly Met 130
135 140gtg agc gcc gcc gcc gcc gcg gcg
gcc agt ggc aac ggc ggc att ggc 480Val Ser Ala Ala Ala Ala Ala Ala Ala
Ser Gly Asn Gly Gly Ile Gly145 150 155
160ctg tcc atg atc aag aac tgg ctc cgg agc cag ccg gcg ccg
cag ccg 528Leu Ser Met Ile Lys Asn Trp Leu Arg Ser Gln Pro Ala Pro Gln
Pro 165 170 175gcg cag gcg
ctg tct ctg tcc atg aac atg gcg ggg acg acg acg gcg 576Ala Gln Ala Leu
Ser Leu Ser Met Asn Met Ala Gly Thr Thr Thr Ala 180
185 190cag ggc ggc ggc gcc atg gcg ctc ctc gcc ggc
gca ggg gag cga ggc 624Gln Gly Gly Gly Ala Met Ala Leu Leu Ala Gly Ala
Gly Glu Arg Gly 195 200 205cgg acg
acg ccc gcg tca gag agc ctg tcc acg tcg gcg cac gga gcg 672Arg Thr Thr
Pro Ala Ser Glu Ser Leu Ser Thr Ser Ala His Gly Ala 210
215 220acg acg gcg acg atg gct ggt ggt cgc aag gag att
aac gag gaa ggc 720Thr Thr Ala Thr Met Ala Gly Gly Arg Lys Glu Ile Asn
Glu Glu Gly225 230 235
240agc ggc agc gcc ggc gcc gtg gtt gcc gtc ggc tcg gag tca ggc ggc
768Ser Gly Ser Ala Gly Ala Val Val Ala Val Gly Ser Glu Ser Gly Gly
245 250 255agc ggc gcc gtg gtg
gag gcc ggc gcg gcg gcg gcg gcg gcg agg aag 816Ser Gly Ala Val Val Glu
Ala Gly Ala Ala Ala Ala Ala Ala Arg Lys 260
265 270tcc gtc gac acg ttc ggc cag aga aca tcg atc tac
cgc ggc gtg aca 864Ser Val Asp Thr Phe Gly Gln Arg Thr Ser Ile Tyr Arg
Gly Val Thr 275 280 285agg cat aga
tgg aca ggg agg tat gag gct cat ctt tgg gac aac agc 912Arg His Arg Trp
Thr Gly Arg Tyr Glu Ala His Leu Trp Asp Asn Ser 290
295 300tgc aga aga gag ggc caa act cgc aag ggt cgt caa
gtc tat cta ggt 960Cys Arg Arg Glu Gly Gln Thr Arg Lys Gly Arg Gln Val
Tyr Leu Gly305 310 315
320ggt tat gac aaa gag gaa aaa gct gct aga gct tat gat ttg gct gct
1008Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg Ala Tyr Asp Leu Ala Ala
325 330 335ctc aaa tac tgg ggc
ccg acg acg acg aca aat ttt ccg gta aat aac 1056Leu Lys Tyr Trp Gly Pro
Thr Thr Thr Thr Asn Phe Pro Val Asn Asn 340
345 350tat gaa aag gag ctg gag gag atg aag cac atg aca
agg cag gag ttc 1104Tyr Glu Lys Glu Leu Glu Glu Met Lys His Met Thr Arg
Gln Glu Phe 355 360 365gta gcc tct
ttg aga agg aag agc agt ggt ttc tcc aga ggt gca tcc 1152Val Ala Ser Leu
Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala Ser 370
375 380att tac cgt gga gta act agg cat cac cag cat ggg
aga tgg caa gca 1200Ile Tyr Arg Gly Val Thr Arg His His Gln His Gly Arg
Trp Gln Ala385 390 395
400agg ata gga aga gtt gca ggg aac aag gac ctc tac ttg ggc acc ttc
1248Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe
405 410 415agc acg cag gag gag
gcg gcg gag gcg tac gac atc gcg gcg atc aag 1296Ser Thr Gln Glu Glu Ala
Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys 420
425 430ttc cgg ggg ctc aac gcc gtc acc aac ttc gac atg
agc cgc tac gac 1344Phe Arg Gly Leu Asn Ala Val Thr Asn Phe Asp Met Ser
Arg Tyr Asp 435 440 445gtc aag agc
atc ctc gac agc gct gcc ctc ccc gtc ggc acc gcc gcc 1392Val Lys Ser Ile
Leu Asp Ser Ala Ala Leu Pro Val Gly Thr Ala Ala 450
455 460aag cgc ctc aag gac gcc gag gcc gcc gcc gcc tac
gac gtc ggc cgc 1440Lys Arg Leu Lys Asp Ala Glu Ala Ala Ala Ala Tyr Asp
Val Gly Arg465 470 475
480atc gcc tcg cac ctc ggc ggc gac ggc gcc tac gcc gcg cat tac ggc
1488Ile Ala Ser His Leu Gly Gly Asp Gly Ala Tyr Ala Ala His Tyr Gly
485 490 495cac cac cac cac tcg
gcc gcc gcc gcc tgg ccg acc atc gcg ttc cag 1536His His His His Ser Ala
Ala Ala Ala Trp Pro Thr Ile Ala Phe Gln 500
505 510gcg gcg gcg gcg ccg ccg ccg cac gcc gcc ggg ctt
tac cac ccg tac 1584Ala Ala Ala Ala Pro Pro Pro His Ala Ala Gly Leu Tyr
His Pro Tyr 515 520 525gcg cag ccg
ctg cgt ggg tgg tgc aag cag gag cag gac cac gcc gtg 1632Ala Gln Pro Leu
Arg Gly Trp Cys Lys Gln Glu Gln Asp His Ala Val 530
535 540atc gcg gcg gcg cac agc ctg cag gat ctc cac cac
ctc aac ctc ggc 1680Ile Ala Ala Ala His Ser Leu Gln Asp Leu His His Leu
Asn Leu Gly545 550 555
560gcc gcc gcc gcc gcg cat gac ttc ttc tcg cag gcg atg cag cag cag
1728Ala Ala Ala Ala Ala His Asp Phe Phe Ser Gln Ala Met Gln Gln Gln
565 570 575cac ggc ctc ggc agc
atc gac aac gcg tcg ctc gag cac agc acc ggc 1776His Gly Leu Gly Ser Ile
Asp Asn Ala Ser Leu Glu His Ser Thr Gly 580
585 590tcc aac tcc gtc gtc tac aac ggc gac aat ggc ggc
gga ggc ggc ggc 1824Ser Asn Ser Val Val Tyr Asn Gly Asp Asn Gly Gly Gly
Gly Gly Gly 595 600 605tac atc atg
gcg ccg atg agc gcc gtg tcg gcc acg gcc acc gcg gtg 1872Tyr Ile Met Ala
Pro Met Ser Ala Val Ser Ala Thr Ala Thr Ala Val 610
615 620gcg agc agc cac gat cac ggc ggc gac ggc ggg aag
cag gtg cag atg 1920Ala Ser Ser His Asp His Gly Gly Asp Gly Gly Lys Gln
Val Gln Met625 630 635
640ggg tac gac agc tac ctc gtc ggc gca gac gcc tac ggc ggc ggc ggc
1968Gly Tyr Asp Ser Tyr Leu Val Gly Ala Asp Ala Tyr Gly Gly Gly Gly
645 650 655gcc ggg agg atg cca
tcc tgg gcg atg acg ccg gcg tcg gcg ccg gcc 2016Ala Gly Arg Met Pro Ser
Trp Ala Met Thr Pro Ala Ser Ala Pro Ala 660
665 670gcc acg agc agc agc gac atg acc gga gtc tgc cat
ggc gca cag ctc 2064Ala Thr Ser Ser Ser Asp Met Thr Gly Val Cys His Gly
Ala Gln Leu 675 680 685ttc agc gtc
tgg aac gac aca taa 2088Phe Ser Val Trp
Asn Asp Thr 690 695
31 695 PRT Oryza sativa 31
Met Ala Thr
Met Asn Asn Trp Leu Ala Phe Ser Leu Ser Pro Gln Asp1 5
10 15Gln Leu Pro Pro Ser Gln Thr Asn Ser
Thr Leu Ile Ser Ala Ala Ala 20 25
30Thr Thr Thr Thr Ala Gly Asp Ser Ser Thr Gly Asp Val Cys Phe Asn
35 40 45Ile Pro Gln Asp Trp Ser Met
Arg Gly Ser Glu Leu Ser Ala Leu Val 50 55
60Ala Glu Pro Lys Leu Glu Asp Phe Leu Gly Gly Ile Ser Phe Ser Glu65
70 75 80Gln Gln His His
His Gly Gly Lys Gly Gly Val Ile Pro Ser Ser Ala 85
90 95Ala Ala Cys Tyr Ala Ser Ser Gly Ser Ser
Val Gly Tyr Leu Tyr Pro 100 105
110Pro Pro Ser Ser Ser Ser Leu Gln Phe Ala Asp Ser Val Met Val Ala
115 120 125Thr Ser Ser Pro Val Val Ala
His Asp Gly Val Ser Gly Gly Gly Met 130 135
140Val Ser Ala Ala Ala Ala Ala Ala Ala Ser Gly Asn Gly Gly Ile
Gly145 150 155 160Leu Ser
Met Ile Lys Asn Trp Leu Arg Ser Gln Pro Ala Pro Gln Pro
165 170 175Ala Gln Ala Leu Ser Leu Ser
Met Asn Met Ala Gly Thr Thr Thr Ala 180 185
190Gln Gly Gly Gly Ala Met Ala Leu Leu Ala Gly Ala Gly Glu
Arg Gly 195 200 205Arg Thr Thr Pro
Ala Ser Glu Ser Leu Ser Thr Ser Ala His Gly Ala 210
215 220Thr Thr Ala Thr Met Ala Gly Gly Arg Lys Glu Ile
Asn Glu Glu Gly225 230 235
240Ser Gly Ser Ala Gly Ala Val Val Ala Val Gly Ser Glu Ser Gly Gly
245 250 255Ser Gly Ala Val Val
Glu Ala Gly Ala Ala Ala Ala Ala Ala Arg Lys 260
265 270Ser Val Asp Thr Phe Gly Gln Arg Thr Ser Ile Tyr
Arg Gly Val Thr 275 280 285Arg His
Arg Trp Thr Gly Arg Tyr Glu Ala His Leu Trp Asp Asn Ser 290
295 300Cys Arg Arg Glu Gly Gln Thr Arg Lys Gly Arg
Gln Val Tyr Leu Gly305 310 315
320Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg Ala Tyr Asp Leu Ala Ala
325 330 335Leu Lys Tyr Trp
Gly Pro Thr Thr Thr Thr Asn Phe Pro Val Asn Asn 340
345 350Tyr Glu Lys Glu Leu Glu Glu Met Lys His Met
Thr Arg Gln Glu Phe 355 360 365Val
Ala Ser Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala Ser 370
375 380Ile Tyr Arg Gly Val Thr Arg His His Gln
His Gly Arg Trp Gln Ala385 390 395
400Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr
Phe 405 410 415Ser Thr Gln
Glu Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys 420
425 430Phe Arg Gly Leu Asn Ala Val Thr Asn Phe
Asp Met Ser Arg Tyr Asp 435 440
445Val Lys Ser Ile Leu Asp Ser Ala Ala Leu Pro Val Gly Thr Ala Ala 450
455 460Lys Arg Leu Lys Asp Ala Glu Ala
Ala Ala Ala Tyr Asp Val Gly Arg465 470
475 480Ile Ala Ser His Leu Gly Gly Asp Gly Ala Tyr Ala
Ala His Tyr Gly 485 490
495His His His His Ser Ala Ala Ala Ala Trp Pro Thr Ile Ala Phe Gln
500 505 510Ala Ala Ala Ala Pro Pro
Pro His Ala Ala Gly Leu Tyr His Pro Tyr 515 520
525Ala Gln Pro Leu Arg Gly Trp Cys Lys Gln Glu Gln Asp His
Ala Val 530 535 540Ile Ala Ala Ala His
Ser Leu Gln Asp Leu His His Leu Asn Leu Gly545 550
555 560Ala Ala Ala Ala Ala His Asp Phe Phe Ser
Gln Ala Met Gln Gln Gln 565 570
575His Gly Leu Gly Ser Ile Asp Asn Ala Ser Leu Glu His Ser Thr Gly
580 585 590Ser Asn Ser Val Val
Tyr Asn Gly Asp Asn Gly Gly Gly Gly Gly Gly 595
600 605Tyr Ile Met Ala Pro Met Ser Ala Val Ser Ala Thr
Ala Thr Ala Val 610 615 620Ala Ser Ser
His Asp His Gly Gly Asp Gly Gly Lys Gln Val Gln Met625
630 635 640Gly Tyr Asp Ser Tyr Leu Val
Gly Ala Asp Ala Tyr Gly Gly Gly Gly 645
650 655Ala Gly Arg Met Pro Ser Trp Ala Met Thr Pro Ala
Ser Ala Pro Ala 660 665 670Ala
Thr Ser Ser Ser Asp Met Thr Gly Val Cys His Gly Ala Gln Leu 675
680 685Phe Ser Val Trp Asn Asp Thr 690
695
32 1680 DNA Oryza sativa CDS (1)...(1680)
32 atg gcc tcc atc
acc aac tgg ctc ggc ttc tcc tcc tcc tcc ttc tcc 48Met Ala Ser Ile
Thr Asn Trp Leu Gly Phe Ser Ser Ser Ser Phe Ser1 5
10 15ggc gcc ggc gcc gac ccc gtc ctg ccc cac
ccg ccg ctg caa gag tgg 96Gly Ala Gly Ala Asp Pro Val Leu Pro His
Pro Pro Leu Gln Glu Trp 20 25
30ggg agc gct tat gag ggc ggc ggc acg gtg gcg gcc gcc ggc ggg gag
144Gly Ser Ala Tyr Glu Gly Gly Gly Thr Val Ala Ala Ala Gly Gly Glu
35 40 45gag acg gcg gcg ccg aag ctg gag
gac ttc ctc ggc atg cag gtg cag 192Glu Thr Ala Ala Pro Lys Leu Glu
Asp Phe Leu Gly Met Gln Val Gln 50 55
60cag gag acg gcc gcc gcg gcg gcg ggg cac ggc cgt gga ggc agc tcg
240Gln Glu Thr Ala Ala Ala Ala Ala Gly His Gly Arg Gly Gly Ser Ser65
70 75 80tcg gtc gtt ggg ctg
tcc atg atc aag aac tgg cta cgc agc cag ccg 288Ser Val Val Gly Leu
Ser Met Ile Lys Asn Trp Leu Arg Ser Gln Pro 85
90 95ccg ccc gcg gtg gtt ggg gga gaa gac gct atg
atg gcg ctc gcg gtg 336Pro Pro Ala Val Val Gly Gly Glu Asp Ala Met
Met Ala Leu Ala Val 100 105
110tcg acg tcg gcg tcg ccg ccg gtg gac gcg acg gtg ccg gcc tgc att
384Ser Thr Ser Ala Ser Pro Pro Val Asp Ala Thr Val Pro Ala Cys Ile
115 120 125tcg ccg gat ggg atg ggg tcg
aag gcg gcc gac ggc ggc ggc gcg gcc 432Ser Pro Asp Gly Met Gly Ser
Lys Ala Ala Asp Gly Gly Gly Ala Ala 130 135
140gag gcg gcg gcg gcg gcg gcg gcg cag agg atg aag gcg gcc atg gac
480Glu Ala Ala Ala Ala Ala Ala Ala Gln Arg Met Lys Ala Ala Met Asp145
150 155 160acg ttc ggg cag
cgg acg tcc atc tac cgg ggt gtc acc aag cac agg 528Thr Phe Gly Gln
Arg Thr Ser Ile Tyr Arg Gly Val Thr Lys His Arg 165
170 175tgg aca gga agg tat gaa gcc cat ctt tgg
gat aac agc tgc aga aga 576Trp Thr Gly Arg Tyr Glu Ala His Leu Trp
Asp Asn Ser Cys Arg Arg 180 185
190gaa ggt cag act cgc aaa ggc aga caa gta tat ctt gga gga tat gat
624Glu Gly Gln Thr Arg Lys Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp
195 200 205aag gaa gaa aaa gct gct agg
gct tat gat ttg gct gcc ctt aaa tac 672Lys Glu Glu Lys Ala Ala Arg
Ala Tyr Asp Leu Ala Ala Leu Lys Tyr 210 215
220tgg ggc act aca acg acg acg aat ttt ccg gta agc aac tac gaa aaa
720Trp Gly Thr Thr Thr Thr Thr Asn Phe Pro Val Ser Asn Tyr Glu Lys225
230 235 240gag ttg gat gaa
atg aag cac atg aat agg cag gaa ttt gtt gca tcc 768Glu Leu Asp Glu
Met Lys His Met Asn Arg Gln Glu Phe Val Ala Ser 245
250 255ctt aga aga aaa agc agt gga ttt tca cgt
ggt gct tcc ata tat cgt 816Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg
Gly Ala Ser Ile Tyr Arg 260 265
270ggt gtt aca aga cac cat cag cat gga agg tgg caa gca agg ata gga
864Gly Val Thr Arg His His Gln His Gly Arg Trp Gln Ala Arg Ile Gly
275 280 285cgg gtg gca gga aac aag gat
ctg tat ttg ggc aca ttt ggc acc caa 912Arg Val Ala Gly Asn Lys Asp
Leu Tyr Leu Gly Thr Phe Gly Thr Gln 290 295
300gag gaa gct gca gag gca tat gat atc gct gca atc aaa ttc cgt ggt
960Glu Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly305
310 315 320ctc aat gct gtg
aca aac ttt gac atg agc cgg tac gat gtc aag agc 1008Leu Asn Ala Val
Thr Asn Phe Asp Met Ser Arg Tyr Asp Val Lys Ser 325
330 335atc att gaa agc agc aat ctc cca att ggt
act gga acc acc cgg cga 1056Ile Ile Glu Ser Ser Asn Leu Pro Ile Gly
Thr Gly Thr Thr Arg Arg 340 345
350ttg aag gac tcc tct gat cac act gat aat gtc atg gac atc aat gtc
1104Leu Lys Asp Ser Ser Asp His Thr Asp Asn Val Met Asp Ile Asn Val
355 360 365aat acc gaa ccc aat aat gtg
gta tca tcc cac ttc acc aat ggg gtt 1152Asn Thr Glu Pro Asn Asn Val
Val Ser Ser His Phe Thr Asn Gly Val 370 375
380ggc aac tat ggt tcg cag cat tat ggt tac aat gga tgg tcg cca att
1200Gly Asn Tyr Gly Ser Gln His Tyr Gly Tyr Asn Gly Trp Ser Pro Ile385
390 395 400agc atg cag ccg
atc ccc tcg cag tac gcc aac ggc cag ccc agg gca 1248Ser Met Gln Pro
Ile Pro Ser Gln Tyr Ala Asn Gly Gln Pro Arg Ala 405
410 415tgg ttg aaa caa gag cag gac agc tct gtg
gtt aca gcg gcg cag aac 1296Trp Leu Lys Gln Glu Gln Asp Ser Ser Val
Val Thr Ala Ala Gln Asn 420 425
430ctg cac aat cta cat cat ttt agt tcc ttg ggc tac acc cac aac ttc
1344Leu His Asn Leu His His Phe Ser Ser Leu Gly Tyr Thr His Asn Phe
435 440 445ttc cag caa tct gat gtt cca
gac gtc aca ggt ttc gtt gat gcg cct 1392Phe Gln Gln Ser Asp Val Pro
Asp Val Thr Gly Phe Val Asp Ala Pro 450 455
460tcg agg tcc agt gac tca tac tcc ttc agg tac aat gga aca aat ggc
1440Ser Arg Ser Ser Asp Ser Tyr Ser Phe Arg Tyr Asn Gly Thr Asn Gly465
470 475 480ttt cat ggt ctc
ccg ggt gga atc agc tat gct atg ccg gtt gcg aca 1488Phe His Gly Leu
Pro Gly Gly Ile Ser Tyr Ala Met Pro Val Ala Thr 485
490 495gcg gtg gac caa ggt cag ggc atc cat ggc
tat gga gaa gat ggt gtg 1536Ala Val Asp Gln Gly Gln Gly Ile His Gly
Tyr Gly Glu Asp Gly Val 500 505
510gca ggc att gac acc aca cat gac ctg tat ggc agc cgt aat gtg tac
1584Ala Gly Ile Asp Thr Thr His Asp Leu Tyr Gly Ser Arg Asn Val Tyr
515 520 525tac ctt tcc gag ggt tcg ctt
ctt gcc gat gtc gaa aaa gaa ggc gac 1632Tyr Leu Ser Glu Gly Ser Leu
Leu Ala Asp Val Glu Lys Glu Gly Asp 530 535
540tat ggc caa tct gtg ggg ggc aac agc tgg gtt ttg ccg aca ccg tag
1680Tyr Gly Gln Ser Val Gly Gly Asn Ser Trp Val Leu Pro Thr Pro545
550 555
33 559 PRT Oryza sativa
33 Met Ala Ser Ile
Thr Asn Trp Leu Gly Phe Ser Ser Ser Ser Phe Ser1 5
10 15Gly Ala Gly Ala Asp Pro Val Leu Pro His
Pro Pro Leu Gln Glu Trp 20 25
30Gly Ser Ala Tyr Glu Gly Gly Gly Thr Val Ala Ala Ala Gly Gly Glu
35 40 45Glu Thr Ala Ala Pro Lys Leu Glu
Asp Phe Leu Gly Met Gln Val Gln 50 55
60Gln Glu Thr Ala Ala Ala Ala Ala Gly His Gly Arg Gly Gly Ser Ser65
70 75 80Ser Val Val Gly Leu
Ser Met Ile Lys Asn Trp Leu Arg Ser Gln Pro 85
90 95Pro Pro Ala Val Val Gly Gly Glu Asp Ala Met
Met Ala Leu Ala Val 100 105
110Ser Thr Ser Ala Ser Pro Pro Val Asp Ala Thr Val Pro Ala Cys Ile
115 120 125Ser Pro Asp Gly Met Gly Ser
Lys Ala Ala Asp Gly Gly Gly Ala Ala 130 135
140Glu Ala Ala Ala Ala Ala Ala Ala Gln Arg Met Lys Ala Ala Met
Asp145 150 155 160Thr Phe
Gly Gln Arg Thr Ser Ile Tyr Arg Gly Val Thr Lys His Arg
165 170 175Trp Thr Gly Arg Tyr Glu Ala
His Leu Trp Asp Asn Ser Cys Arg Arg 180 185
190Glu Gly Gln Thr Arg Lys Gly Arg Gln Val Tyr Leu Gly Gly
Tyr Asp 195 200 205Lys Glu Glu Lys
Ala Ala Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr 210
215 220Trp Gly Thr Thr Thr Thr Thr Asn Phe Pro Val Ser
Asn Tyr Glu Lys225 230 235
240Glu Leu Asp Glu Met Lys His Met Asn Arg Gln Glu Phe Val Ala Ser
245 250 255Leu Arg Arg Lys Ser
Ser Gly Phe Ser Arg Gly Ala Ser Ile Tyr Arg 260
265 270Gly Val Thr Arg His His Gln His Gly Arg Trp Gln
Ala Arg Ile Gly 275 280 285Arg Val
Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe Gly Thr Gln 290
295 300Glu Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala
Ile Lys Phe Arg Gly305 310 315
320Leu Asn Ala Val Thr Asn Phe Asp Met Ser Arg Tyr Asp Val Lys Ser
325 330 335Ile Ile Glu Ser
Ser Asn Leu Pro Ile Gly Thr Gly Thr Thr Arg Arg 340
345 350Leu Lys Asp Ser Ser Asp His Thr Asp Asn Val
Met Asp Ile Asn Val 355 360 365Asn
Thr Glu Pro Asn Asn Val Val Ser Ser His Phe Thr Asn Gly Val 370
375 380Gly Asn Tyr Gly Ser Gln His Tyr Gly Tyr
Asn Gly Trp Ser Pro Ile385 390 395
400Ser Met Gln Pro Ile Pro Ser Gln Tyr Ala Asn Gly Gln Pro Arg
Ala 405 410 415Trp Leu Lys
Gln Glu Gln Asp Ser Ser Val Val Thr Ala Ala Gln Asn 420
425 430Leu His Asn Leu His His Phe Ser Ser Leu
Gly Tyr Thr His Asn Phe 435 440
445Phe Gln Gln Ser Asp Val Pro Asp Val Thr Gly Phe Val Asp Ala Pro 450
455 460Ser Arg Ser Ser Asp Ser Tyr Ser
Phe Arg Tyr Asn Gly Thr Asn Gly465 470
475 480Phe His Gly Leu Pro Gly Gly Ile Ser Tyr Ala Met
Pro Val Ala Thr 485 490
495Ala Val Asp Gln Gly Gln Gly Ile His Gly Tyr Gly Glu Asp Gly Val
500 505 510Ala Gly Ile Asp Thr Thr
His Asp Leu Tyr Gly Ser Arg Asn Val Tyr 515 520
525Tyr Leu Ser Glu Gly Ser Leu Leu Ala Asp Val Glu Lys Glu
Gly Asp 530 535 540Tyr Gly Gln Ser Val
Gly Gly Asn Ser Trp Val Leu Pro Thr Pro545 550
555
34 2112 DNA Oryza sativa CDS (1)...(2112)
34 atg gct tct gca aac aac
tgg ctg ggc ttc tcg ctc tcc ggc caa gag 48Met Ala Ser Ala Asn Asn
Trp Leu Gly Phe Ser Leu Ser Gly Gln Glu1 5
10 15aat ccg cag cct cac cag gat agc tcg cct ccg gca
gcc atc gac gtc 96Asn Pro Gln Pro His Gln Asp Ser Ser Pro Pro Ala
Ala Ile Asp Val 20 25 30tcc
ggc gcc ggc gac ttc tat ggc ctg ccg acg tcg cag ccg acg gcg 144Ser
Gly Ala Gly Asp Phe Tyr Gly Leu Pro Thr Ser Gln Pro Thr Ala 35
40 45gcc gac gcg cac ctc ggc gtg gcg ggg
cat cat cac aac gcc tcg tat 192Ala Asp Ala His Leu Gly Val Ala Gly
His His His Asn Ala Ser Tyr 50 55
60ggc atc atg gag gcc ttc aat agg gga gct caa gag gca caa gat tgg
240Gly Ile Met Glu Ala Phe Asn Arg Gly Ala Gln Glu Ala Gln Asp Trp65
70 75 80aac atg agg ggg ctg
gac tac aac ggc ggc gcc tcg gag ctg tcg atg 288Asn Met Arg Gly Leu
Asp Tyr Asn Gly Gly Ala Ser Glu Leu Ser Met 85
90 95ctc gtc ggc tcc agc ggc ggc aag agg gcg gcg
gcg gtg gag gag acc 336Leu Val Gly Ser Ser Gly Gly Lys Arg Ala Ala
Ala Val Glu Glu Thr 100 105
110gag ccg aag ctg gag gac ttc ctc ggc ggc aac tcg ttc gtc tcc gag
384Glu Pro Lys Leu Glu Asp Phe Leu Gly Gly Asn Ser Phe Val Ser Glu
115 120 125caa gat cat cac gcg gcg ggg
ggc ttc ctc ttc tcc ggc gtc ccg atg 432Gln Asp His His Ala Ala Gly
Gly Phe Leu Phe Ser Gly Val Pro Met 130 135
140gcc agc agc acc aac agc aac agc ggg agc aac act atg gag ctc tcc
480Ala Ser Ser Thr Asn Ser Asn Ser Gly Ser Asn Thr Met Glu Leu Ser145
150 155 160atg atc aag acc
tgg ctc cgg aac aac ggc cag gtg ccc gcc ggc cac 528Met Ile Lys Thr
Trp Leu Arg Asn Asn Gly Gln Val Pro Ala Gly His 165
170 175cag ccg cag cag cag cag ccg gcg gcc gcg
gcc gcc gcc gcg cag cag 576Gln Pro Gln Gln Gln Gln Pro Ala Ala Ala
Ala Ala Ala Ala Gln Gln 180 185
190cag gcg cac gag gcg gcg gag atg agc acc gac gcg agc gcg agc agc
624Gln Ala His Glu Ala Ala Glu Met Ser Thr Asp Ala Ser Ala Ser Ser
195 200 205ttc ggg tgc tcc tcc gac gcg
atg ggg agg agt aac aac ggc ggc gcg 672Phe Gly Cys Ser Ser Asp Ala
Met Gly Arg Ser Asn Asn Gly Gly Ala 210 215
220gtc tcg gcg gcg gcc ggc ggg acg agc tcg cag agc ctg gcg ctc tcg
720Val Ser Ala Ala Ala Gly Gly Thr Ser Ser Gln Ser Leu Ala Leu Ser225
230 235 240atg agc acg ggc
tcg cac tcg cac ctg cct atc gtc gtc gcc ggc ggc 768Met Ser Thr Gly
Ser His Ser His Leu Pro Ile Val Val Ala Gly Gly 245
250 255ggg aac gcc agc ggc gga gcg gcc gag agc
aca tcg tcg gag aac aag 816Gly Asn Ala Ser Gly Gly Ala Ala Glu Ser
Thr Ser Ser Glu Asn Lys 260 265
270cgg gcc agc ggc gcc atg gat tcg ccg ggc ggt ggc gcg ata gag gcc
864Arg Ala Ser Gly Ala Met Asp Ser Pro Gly Gly Gly Ala Ile Glu Ala
275 280 285gtg ccg agg aag tcc atc gac
acg ttc ggg caa agg acc tcg ata tat 912Val Pro Arg Lys Ser Ile Asp
Thr Phe Gly Gln Arg Thr Ser Ile Tyr 290 295
300cga ggt gta aca agg cat aga tgg aca ggg cga tat gag gct cat ctc
960Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu305
310 315 320tgg gat aat agc
tgt aga aga gaa ggg cag agt cgc aag ggt agg caa 1008Trp Asp Asn Ser
Cys Arg Arg Glu Gly Gln Ser Arg Lys Gly Arg Gln 325
330 335gtt tat ctt ggt ggc tat gac aag gag gat
aaa gca gcg aga gct tat 1056Val Tyr Leu Gly Gly Tyr Asp Lys Glu Asp
Lys Ala Ala Arg Ala Tyr 340 345
350gat ttg gca gct ctg aag tat tgg ggc aca aca aca aca aca aat ttc
1104Asp Leu Ala Ala Leu Lys Tyr Trp Gly Thr Thr Thr Thr Thr Asn Phe
355 360 365cca ata agt aac tat gaa aaa
gag cta gat gaa atg aaa cat atg acc 1152Pro Ile Ser Asn Tyr Glu Lys
Glu Leu Asp Glu Met Lys His Met Thr 370 375
380agg cag gag tat att gca tac cta aga agg aat agc agt gga ttt tct
1200Arg Gln Glu Tyr Ile Ala Tyr Leu Arg Arg Asn Ser Ser Gly Phe Ser385
390 395 400cgt ggt gca tcg
aaa tat cgt ggt gta acc agg cac cat cag cat ggg 1248Arg Gly Ala Ser
Lys Tyr Arg Gly Val Thr Arg His His Gln His Gly 405
410 415aga tgg caa gca agg ata ggg agg gtt gca
gga aac aag gac ctc tac 1296Arg Trp Gln Ala Arg Ile Gly Arg Val Ala
Gly Asn Lys Asp Leu Tyr 420 425
430tta ggc acc ttc agc acc gag gag gag gcg gcg gag gcg tac gac atc
1344Leu Gly Thr Phe Ser Thr Glu Glu Glu Ala Ala Glu Ala Tyr Asp Ile
435 440 445gcg gcg atc aag ttc cgg ggg
ctc aac gcc gtc acc aac ttt gac atg 1392Ala Ala Ile Lys Phe Arg Gly
Leu Asn Ala Val Thr Asn Phe Asp Met 450 455
460agc cgc tac gac gtc aag agc atc ctg gag agc agc acg ctg ccg gtg
1440Ser Arg Tyr Asp Val Lys Ser Ile Leu Glu Ser Ser Thr Leu Pro Val465
470 475 480ggc ggc gcg gcg
agg cgg ctg aag gag gcg gcg gac cac gcg gag gcg 1488Gly Gly Ala Ala
Arg Arg Leu Lys Glu Ala Ala Asp His Ala Glu Ala 485
490 495gcc ggc gcc acc atc tgg cgc gcc gcc gac
atg gac ggc gcc ggc gtc 1536Ala Gly Ala Thr Ile Trp Arg Ala Ala Asp
Met Asp Gly Ala Gly Val 500 505
510atc tcc ggc ctg gcc gac gtc ggg atg ggc gcc tac gcc gcc tcg tac
1584Ile Ser Gly Leu Ala Asp Val Gly Met Gly Ala Tyr Ala Ala Ser Tyr
515 520 525cac cac cac cac cac cac ggc
tgg ccg acc atc gcg ttc cag cag ccg 1632His His His His His His Gly
Trp Pro Thr Ile Ala Phe Gln Gln Pro 530 535
540ccg ccg ctc gcc gtg cac tac ccg tac ggc cag gcg ccg gcg gcg ccg
1680Pro Pro Leu Ala Val His Tyr Pro Tyr Gly Gln Ala Pro Ala Ala Pro545
550 555 560tcg cgc ggg tgg
tgc aag ccc gag cag gac gcc gcc gtc gct gcc gcc 1728Ser Arg Gly Trp
Cys Lys Pro Glu Gln Asp Ala Ala Val Ala Ala Ala 565
570 575gcg cac agc ctc cag gac ctc cag cag ctg
cac ctc ggc agc gcc gcc 1776Ala His Ser Leu Gln Asp Leu Gln Gln Leu
His Leu Gly Ser Ala Ala 580 585
590gcc cac aac ttc ttc cag gcg tcg tcg agc tcg acg gtc tac aac ggc
1824Ala His Asn Phe Phe Gln Ala Ser Ser Ser Ser Thr Val Tyr Asn Gly
595 600 605ggc ggc ggc ggg tac cag ggc
ctc ggt ggc aac gcc ttc ttg atg ccg 1872Gly Gly Gly Gly Tyr Gln Gly
Leu Gly Gly Asn Ala Phe Leu Met Pro 610 615
620gcg agc acc gtc gtg gcc gac cag ggg cac agc agc acg gcc acc aac
1920Ala Ser Thr Val Val Ala Asp Gln Gly His Ser Ser Thr Ala Thr Asn625
630 635 640cat gga aac acc
tgc agc tac ggc aac gag gag cag ggg aag ctc atc 1968His Gly Asn Thr
Cys Ser Tyr Gly Asn Glu Glu Gln Gly Lys Leu Ile 645
650 655ggg tac gac gcc atg gcg atg gcg agc ggc
gcc gcc ggc ggc ggg tac 2016Gly Tyr Asp Ala Met Ala Met Ala Ser Gly
Ala Ala Gly Gly Gly Tyr 660 665
670cag ctg tcg cag ggc tcg gcg tcg acg gtg agc atc gcg agg gcg aac
2064Gln Leu Ser Gln Gly Ser Ala Ser Thr Val Ser Ile Ala Arg Ala Asn
675 680 685ggc tac tcg gcc aac tgg agc
tcg cct ttc aat ggc gcc atg gga tga 2112Gly Tyr Ser Ala Asn Trp Ser
Ser Pro Phe Asn Gly Ala Met Gly 690 695
700
35 703 PRT Oryza sativa
35 Met Ala Ser Ala Asn Asn Trp Leu Gly Phe Ser
Leu Ser Gly Gln Glu1 5 10
15Asn Pro Gln Pro His Gln Asp Ser Ser Pro Pro Ala Ala Ile Asp Val
20 25 30Ser Gly Ala Gly Asp Phe Tyr
Gly Leu Pro Thr Ser Gln Pro Thr Ala 35 40
45Ala Asp Ala His Leu Gly Val Ala Gly His His His Asn Ala Ser
Tyr 50 55 60Gly Ile Met Glu Ala Phe
Asn Arg Gly Ala Gln Glu Ala Gln Asp Trp65 70
75 80Asn Met Arg Gly Leu Asp Tyr Asn Gly Gly Ala
Ser Glu Leu Ser Met 85 90
95Leu Val Gly Ser Ser Gly Gly Lys Arg Ala Ala Ala Val Glu Glu Thr
100 105 110Glu Pro Lys Leu Glu Asp
Phe Leu Gly Gly Asn Ser Phe Val Ser Glu 115 120
125Gln Asp His His Ala Ala Gly Gly Phe Leu Phe Ser Gly Val
Pro Met 130 135 140Ala Ser Ser Thr Asn
Ser Asn Ser Gly Ser Asn Thr Met Glu Leu Ser145 150
155 160Met Ile Lys Thr Trp Leu Arg Asn Asn Gly
Gln Val Pro Ala Gly His 165 170
175Gln Pro Gln Gln Gln Gln Pro Ala Ala Ala Ala Ala Ala Ala Gln Gln
180 185 190Gln Ala His Glu Ala
Ala Glu Met Ser Thr Asp Ala Ser Ala Ser Ser 195
200 205Phe Gly Cys Ser Ser Asp Ala Met Gly Arg Ser Asn
Asn Gly Gly Ala 210 215 220Val Ser Ala
Ala Ala Gly Gly Thr Ser Ser Gln Ser Leu Ala Leu Ser225
230 235 240Met Ser Thr Gly Ser His Ser
His Leu Pro Ile Val Val Ala Gly Gly 245
250 255Gly Asn Ala Ser Gly Gly Ala Ala Glu Ser Thr Ser
Ser Glu Asn Lys 260 265 270Arg
Ala Ser Gly Ala Met Asp Ser Pro Gly Gly Gly Ala Ile Glu Ala 275
280 285Val Pro Arg Lys Ser Ile Asp Thr Phe
Gly Gln Arg Thr Ser Ile Tyr 290 295
300Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu305
310 315 320Trp Asp Asn Ser
Cys Arg Arg Glu Gly Gln Ser Arg Lys Gly Arg Gln 325
330 335Val Tyr Leu Gly Gly Tyr Asp Lys Glu Asp
Lys Ala Ala Arg Ala Tyr 340 345
350Asp Leu Ala Ala Leu Lys Tyr Trp Gly Thr Thr Thr Thr Thr Asn Phe
355 360 365Pro Ile Ser Asn Tyr Glu Lys
Glu Leu Asp Glu Met Lys His Met Thr 370 375
380Arg Gln Glu Tyr Ile Ala Tyr Leu Arg Arg Asn Ser Ser Gly Phe
Ser385 390 395 400Arg Gly
Ala Ser Lys Tyr Arg Gly Val Thr Arg His His Gln His Gly
405 410 415Arg Trp Gln Ala Arg Ile Gly
Arg Val Ala Gly Asn Lys Asp Leu Tyr 420 425
430Leu Gly Thr Phe Ser Thr Glu Glu Glu Ala Ala Glu Ala Tyr
Asp Ile 435 440 445Ala Ala Ile Lys
Phe Arg Gly Leu Asn Ala Val Thr Asn Phe Asp Met 450
455 460Ser Arg Tyr Asp Val Lys Ser Ile Leu Glu Ser Ser
Thr Leu Pro Val465 470 475
480Gly Gly Ala Ala Arg Arg Leu Lys Glu Ala Ala Asp His Ala Glu Ala
485 490 495Ala Gly Ala Thr Ile
Trp Arg Ala Ala Asp Met Asp Gly Ala Gly Val 500
505 510Ile Ser Gly Leu Ala Asp Val Gly Met Gly Ala Tyr
Ala Ala Ser Tyr 515 520 525His His
His His His His Gly Trp Pro Thr Ile Ala Phe Gln Gln Pro 530
535 540Pro Pro Leu Ala Val His Tyr Pro Tyr Gly Gln
Ala Pro Ala Ala Pro545 550 555
560Ser Arg Gly Trp Cys Lys Pro Glu Gln Asp Ala Ala Val Ala Ala Ala
565 570 575Ala His Ser Leu
Gln Asp Leu Gln Gln Leu His Leu Gly Ser Ala Ala 580
585 590Ala His Asn Phe Phe Gln Ala Ser Ser Ser Ser
Thr Val Tyr Asn Gly 595 600 605Gly
Gly Gly Gly Tyr Gln Gly Leu Gly Gly Asn Ala Phe Leu Met Pro 610
615 620Ala Ser Thr Val Val Ala Asp Gln Gly His
Ser Ser Thr Ala Thr Asn625 630 635
640His Gly Asn Thr Cys Ser Tyr Gly Asn Glu Glu Gln Gly Lys Leu
Ile 645 650 655Gly Tyr Asp
Ala Met Ala Met Ala Ser Gly Ala Ala Gly Gly Gly Tyr 660
665 670Gln Leu Ser Gln Gly Ser Ala Ser Thr Val
Ser Ile Ala Arg Ala Asn 675 680
685Gly Tyr Ser Ala Asn Trp Ser Ser Pro Phe Asn Gly Ala Met Gly 690
695 700
36 1977 DNA Oryza sativa CDS (1)...(1977)
36 atg gct tct gca gat aac tgg cta ggc ttc tcg ctc tcc ggc caa ggc 48Met
Ala Ser Ala Asp Asn Trp Leu Gly Phe Ser Leu Ser Gly Gln Gly 1
5 10 15aac cca cag cat cac cag aac
ggc tcg ccg tct gcc gcc ggc gac gcc 96Asn Pro Gln His His Gln Asn Gly
Ser Pro Ser Ala Ala Gly Asp Ala 20 25
30gcc atc gac atc tcc ggc tca ggc gac ttc tat ggt ctg cca acg
ccg 144Ala Ile Asp Ile Ser Gly Ser Gly Asp Phe Tyr Gly Leu Pro Thr Pro
35 40 45gac gca cac cac atc ggc
atg gcg ggc gaa gac gcg ccc tat ggc gtc 192Asp Ala His His Ile Gly Met
Ala Gly Glu Asp Ala Pro Tyr Gly Val 50 55
60atg gat gct ttc aac aga ggc acc cat gaa acc caa gat tgg gcg atg
240Met Asp Ala Phe Asn Arg Gly Thr His Glu Thr Gln Asp Trp Ala Met 65
70 75 80agg ggt ttg gac
tac ggc ggc ggc tcc tcc gac ctc tcg atg ctc gtc 288Arg Gly Leu Asp Tyr
Gly Gly Gly Ser Ser Asp Leu Ser Met Leu Val 85
90 95ggc tcg agc ggc ggc ggg agg agg acg gtg gcc
ggc gac ggc gtc ggc 336Gly Ser Ser Gly Gly Gly Arg Arg Thr Val Ala Gly
Asp Gly Val Gly 100 105 110gag
gcg ccg aag ctg gag aac ttc ctc gac ggc aac tca ttc tcc gac 384Glu Ala
Pro Lys Leu Glu Asn Phe Leu Asp Gly Asn Ser Phe Ser Asp 115
120 125gtg cac ggc caa gcc gcc ggc ggg tac ctc
tac tcc gga agc gct gtc 432Val His Gly Gln Ala Ala Gly Gly Tyr Leu Tyr
Ser Gly Ser Ala Val 130 135 140ggc ggc
gcc ggt ggt tac agt aac ggc gga tgc ggc ggc gga acc ata 480Gly Gly Ala
Gly Gly Tyr Ser Asn Gly Gly Cys Gly Gly Gly Thr Ile145
150 155 160gag ctg tcc atg atc aag acg
tgg ctc cgg agc aac cag tcg cag cag 528Glu Leu Ser Met Ile Lys Thr Trp
Leu Arg Ser Asn Gln Ser Gln Gln 165 170
175cag cca tcg ccg ccg cag cac gct gat cag ggc atg agc acc
gac gcc 576Gln Pro Ser Pro Pro Gln His Ala Asp Gln Gly Met Ser Thr Asp
Ala 180 185 190agc gcg agc agc
tac gcg tgc tcc gac gtg ctg gtg ggg agc tgc ggc 624Ser Ala Ser Ser Tyr
Ala Cys Ser Asp Val Leu Val Gly Ser Cys Gly 195
200 205ggc ggc ggc gcc ggg ggc acg gcg agc tcg cat ggg
cag ggc ctg gcg 672Gly Gly Gly Ala Gly Gly Thr Ala Ser Ser His Gly Gln
Gly Leu Ala 210 215 220ctg tcg atg agc
acg ggg tcg gtg gcc gcc gcc gga ggg ggc ggc gcc 720Leu Ser Met Ser Thr
Gly Ser Val Ala Ala Ala Gly Gly Gly Gly Ala225 230
235 240gtc gtc gcg gcc gag agc tcg tcg tcg gag
aac aag cgg gtg gat tcg 768Val Val Ala Ala Glu Ser Ser Ser Ser Glu Asn
Lys Arg Val Asp Ser 245 250
255ccg ggc ggc gcc gtg gac ggc gcc gtc ccg agg aaa tcc atc gac acc
816Pro Gly Gly Ala Val Asp Gly Ala Val Pro Arg Lys Ser Ile Asp Thr
260 265 270ttc ggg caa agg acg tct
ata tac cga ggt gta aca agg cat aga tgg 864Phe Gly Gln Arg Thr Ser Ile
Tyr Arg Gly Val Thr Arg His Arg Trp 275 280
285aca gga aga tat gaa gct cat ctg tgg gat aat agc tgt agg aga
gaa 912Thr Gly Arg Tyr Glu Ala His Leu Trp Asp Asn Ser Cys Arg Arg Glu
290 295 300ggc caa agt cgc aag ggg aga
cag gtt tat ttg ggc ggt tat gac aaa 960Gly Gln Ser Arg Lys Gly Arg Gln
Val Tyr Leu Gly Gly Tyr Asp Lys305 310
315 320gaa gat aag gcg gct cgg gct tat gat ttg gca gct
cta aaa tac tgg 1008Glu Asp Lys Ala Ala Arg Ala Tyr Asp Leu Ala Ala Leu
Lys Tyr Trp 325 330 335ggc
acg acc aca aca aca aat ttc cca atg agt aat tat gaa aag gag 1056Gly Thr
Thr Thr Thr Thr Asn Phe Pro Met Ser Asn Tyr Glu Lys Glu 340
345 350cta gag gaa atg aaa cac atg acc agg
cag gag tac att gca cat ctt 1104Leu Glu Glu Met Lys His Met Thr Arg Gln
Glu Tyr Ile Ala His Leu 355 360
365aga agg aat agc agt gga ttt tct cgt ggt gca tcc aaa tat cgt ggt
1152Arg Arg Asn Ser Ser Gly Phe Ser Arg Gly Ala Ser Lys Tyr Arg Gly
370 375 380gtt act agg cat cat cag cat
ggg aga tgg cag gca agg ata ggg cga 1200Val Thr Arg His His Gln His Gly
Arg Trp Gln Ala Arg Ile Gly Arg385 390
395 400gtt gca ggc aac aag gat atc tac cta ggc acc ttc
agc acc gag gag 1248Val Ala Gly Asn Lys Asp Ile Tyr Leu Gly Thr Phe Ser
Thr Glu Glu 405 410 415gag
gcc gcc gag gcg tac gac atc gcc gcc atc aag ttc cgc ggg ctc 1296Glu Ala
Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly Leu 420
425 430aac gcc gtc acc aac ttc gac atg agc
cgg tac gac gtc aag agc atc 1344Asn Ala Val Thr Asn Phe Asp Met Ser Arg
Tyr Asp Val Lys Ser Ile 435 440
445ctg gac agc agc acg ctg ccg gtc ggc ggc gcg gcg cgg cgg ctc aag
1392Leu Asp Ser Ser Thr Leu Pro Val Gly Gly Ala Ala Arg Arg Leu Lys
450 455 460gag gcg gag gtc gcc gcc gcc
gcc gcg ggc ggc ggc gtg atc gtc tcc 1440Glu Ala Glu Val Ala Ala Ala Ala
Ala Gly Gly Gly Val Ile Val Ser465 470
475 480cac ctg gcc gac ggc ggt gtg ggt ggg tac tac tac
ggg tgc ggc ccg 1488His Leu Ala Asp Gly Gly Val Gly Gly Tyr Tyr Tyr Gly
Cys Gly Pro 485 490 495acc
atc gcg ttc ggc ggc ggc ggc cag cag ccg gcg ccg ctc gcc gtg 1536Thr Ile
Ala Phe Gly Gly Gly Gly Gln Gln Pro Ala Pro Leu Ala Val 500
505 510cac tac ccg tcg tac ggc cag gcc agc
ggg tgg tgc aag ccg gag cag 1584His Tyr Pro Ser Tyr Gly Gln Ala Ser Gly
Trp Cys Lys Pro Glu Gln 515 520
525gac gcg gtg atc gcg gcc ggg cac tgc gcg acg gac ctc cag cac ctg
1632Asp Ala Val Ile Ala Ala Gly His Cys Ala Thr Asp Leu Gln His Leu
530 535 540cac ctc ggg agc ggc ggc gcc
gcc gcc acc cac aac ttc ttc cag cag 1680His Leu Gly Ser Gly Gly Ala Ala
Ala Thr His Asn Phe Phe Gln Gln545 550
555 560ccg gcg tca agc tcg gcc gtc tac ggc aac ggc ggc
ggc ggc ggc ggc 1728Pro Ala Ser Ser Ser Ala Val Tyr Gly Asn Gly Gly Gly
Gly Gly Gly 565 570 575aac
gcg ttc atg atg ccg atg ggc gcc gtg gtg gcc gcc gcc gat cac 1776Asn Ala
Phe Met Met Pro Met Gly Ala Val Val Ala Ala Ala Asp His 580
585 590ggc ggg cag agc agc gcc tac ggc ggt
ggc gac gag agc ggg agg ctc 1824Gly Gly Gln Ser Ser Ala Tyr Gly Gly Gly
Asp Glu Ser Gly Arg Leu 595 600
605gtc gtg ggg tac gac ggc gtc gtc gac ccg tac gcg gcc atg aga agc
1872Val Val Gly Tyr Asp Gly Val Val Asp Pro Tyr Ala Ala Met Arg Ser
610 615 620gcg tac gag ctc tcg cag ggc
tcg tcg tcg tcg tcg gtg agc gtc gcg 1920Ala Tyr Glu Leu Ser Gln Gly Ser
Ser Ser Ser Ser Val Ser Val Ala625 630
635 640aag gcg gcg aac ggg tac ccg gac aac tgg agc tcg
ccg ttc aac ggc 1968Lys Ala Ala Asn Gly Tyr Pro Asp Asn Trp Ser Ser Pro
Phe Asn Gly 645 650 655atg
gga tga 1977Met
Gly
37 658 PRT Oryza sativa
37 Met Ala Ser Ala Asp Asn Trp Leu Gly Phe Ser Leu
Ser Gly Gln Gly1 5 10
15Asn Pro Gln His His Gln Asn Gly Ser Pro Ser Ala Ala Gly Asp Ala
20 25 30Ala Ile Asp Ile Ser Gly Ser
Gly Asp Phe Tyr Gly Leu Pro Thr Pro 35 40
45Asp Ala His His Ile Gly Met Ala Gly Glu Asp Ala Pro Tyr Gly
Val 50 55 60Met Asp Ala Phe Asn Arg
Gly Thr His Glu Thr Gln Asp Trp Ala Met65 70
75 80Arg Gly Leu Asp Tyr Gly Gly Gly Ser Ser Asp
Leu Ser Met Leu Val 85 90
95Gly Ser Ser Gly Gly Gly Arg Arg Thr Val Ala Gly Asp Gly Val Gly
100 105 110Glu Ala Pro Lys Leu Glu
Asn Phe Leu Asp Gly Asn Ser Phe Ser Asp 115 120
125Val His Gly Gln Ala Ala Gly Gly Tyr Leu Tyr Ser Gly Ser
Ala Val 130 135 140Gly Gly Ala Gly Gly
Tyr Ser Asn Gly Gly Cys Gly Gly Gly Thr Ile145 150
155 160Glu Leu Ser Met Ile Lys Thr Trp Leu Arg
Ser Asn Gln Ser Gln Gln 165 170
175Gln Pro Ser Pro Pro Gln His Ala Asp Gln Gly Met Ser Thr Asp Ala
180 185 190Ser Ala Ser Ser Tyr
Ala Cys Ser Asp Val Leu Val Gly Ser Cys Gly 195
200 205Gly Gly Gly Ala Gly Gly Thr Ala Ser Ser His Gly
Gln Gly Leu Ala 210 215 220Leu Ser Met
Ser Thr Gly Ser Val Ala Ala Ala Gly Gly Gly Gly Ala225
230 235 240Val Val Ala Ala Glu Ser Ser
Ser Ser Glu Asn Lys Arg Val Asp Ser 245
250 255Pro Gly Gly Ala Val Asp Gly Ala Val Pro Arg Lys
Ser Ile Asp Thr 260 265 270Phe
Gly Gln Arg Thr Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp 275
280 285Thr Gly Arg Tyr Glu Ala His Leu Trp
Asp Asn Ser Cys Arg Arg Glu 290 295
300Gly Gln Ser Arg Lys Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys305
310 315 320Glu Asp Lys Ala
Ala Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp 325
330 335Gly Thr Thr Thr Thr Thr Asn Phe Pro Met
Ser Asn Tyr Glu Lys Glu 340 345
350Leu Glu Glu Met Lys His Met Thr Arg Gln Glu Tyr Ile Ala His Leu
355 360 365Arg Arg Asn Ser Ser Gly Phe
Ser Arg Gly Ala Ser Lys Tyr Arg Gly 370 375
380Val Thr Arg His His Gln His Gly Arg Trp Gln Ala Arg Ile Gly
Arg385 390 395 400Val Ala
Gly Asn Lys Asp Ile Tyr Leu Gly Thr Phe Ser Thr Glu Glu
405 410 415Glu Ala Ala Glu Ala Tyr Asp
Ile Ala Ala Ile Lys Phe Arg Gly Leu 420 425
430Asn Ala Val Thr Asn Phe Asp Met Ser Arg Tyr Asp Val Lys
Ser Ile 435 440 445Leu Asp Ser Ser
Thr Leu Pro Val Gly Gly Ala Ala Arg Arg Leu Lys 450
455 460Glu Ala Glu Val Ala Ala Ala Ala Ala Gly Gly Gly
Val Ile Val Ser465 470 475
480His Leu Ala Asp Gly Gly Val Gly Gly Tyr Tyr Tyr Gly Cys Gly Pro
485 490 495Thr Ile Ala Phe Gly
Gly Gly Gly Gln Gln Pro Ala Pro Leu Ala Val 500
505 510His Tyr Pro Ser Tyr Gly Gln Ala Ser Gly Trp Cys
Lys Pro Glu Gln 515 520 525Asp Ala
Val Ile Ala Ala Gly His Cys Ala Thr Asp Leu Gln His Leu 530
535 540His Leu Gly Ser Gly Gly Ala Ala Ala Thr His
Asn Phe Phe Gln Gln545 550 555
560Pro Ala Ser Ser Ser Ala Val Tyr Gly Asn Gly Gly Gly Gly Gly Gly
565 570 575Asn Ala Phe Met
Met Pro Met Gly Ala Val Val Ala Ala Ala Asp His 580
585 590Gly Gly Gln Ser Ser Ala Tyr Gly Gly Gly Asp
Glu Ser Gly Arg Leu 595 600 605Val
Val Gly Tyr Asp Gly Val Val Asp Pro Tyr Ala Ala Met Arg Ser 610
615 620Ala Tyr Glu Leu Ser Gln Gly Ser Ser Ser
Ser Ser Val Ser Val Ala625 630 635
640Lys Ala Ala Asn Gly Tyr Pro Asp Asn Trp Ser Ser Pro Phe Asn
Gly 645 650 655Met
Gly
38 2112 DNA Sorghum bicolor CDS (1)...(2112)
38 atg gct act gtg aac aac tgg
ctc gct ttc tcc ctc tcc ccg cag gag 48Met Ala Thr Val Asn Asn Trp
Leu Ala Phe Ser Leu Ser Pro Gln Glu1 5 10
15ctg ccg ccc acc cag acg gac tcc acc ctc atc tct gcc
gcc acc acc 96Leu Pro Pro Thr Gln Thr Asp Ser Thr Leu Ile Ser Ala
Ala Thr Thr 20 25 30gac gat
gtc tcc ggc gat gtc tgc ttc aac atc ccc caa gat tgg agc 144Asp Asp
Val Ser Gly Asp Val Cys Phe Asn Ile Pro Gln Asp Trp Ser 35
40 45atg agg gga tcc gag ctt tcg gcg ctc gtc
gcc gag ccg aag ctg gag 192Met Arg Gly Ser Glu Leu Ser Ala Leu Val
Ala Glu Pro Lys Leu Glu 50 55 60gac
ttc ctc ggc gga atc tcc ttc tcc gag cag cac cac aag gcc aac 240Asp
Phe Leu Gly Gly Ile Ser Phe Ser Glu Gln His His Lys Ala Asn65
70 75 80tgc aac atg atc ccc agc
act agc agc aca gct tgc tac gcg agc tcg 288Cys Asn Met Ile Pro Ser
Thr Ser Ser Thr Ala Cys Tyr Ala Ser Ser 85
90 95ggt gct acc gcc ggc tac cat cac cag ctg tac cac
cag ccc acc agc 336Gly Ala Thr Ala Gly Tyr His His Gln Leu Tyr His
Gln Pro Thr Ser 100 105 110tcc
gcg ctc cac ttc gct gac tcc gtc atg gtg gcc tcc tcg gcc ggc 384Ser
Ala Leu His Phe Ala Asp Ser Val Met Val Ala Ser Ser Ala Gly 115
120 125ggc gtc cac gac gga ggt gcc atg ctc
agc gcg gcc agc gct aat ggt 432Gly Val His Asp Gly Gly Ala Met Leu
Ser Ala Ala Ser Ala Asn Gly 130 135
140agc gct ggc gct ggc gct gcc agt gcc aat ggc agc ggc agc atc ggg
480Ser Ala Gly Ala Gly Ala Ala Ser Ala Asn Gly Ser Gly Ser Ile Gly145
150 155 160ctg tcc atg atc
aag aac tgg ctg cgg agc caa cca gct ccc atg cag 528Leu Ser Met Ile
Lys Asn Trp Leu Arg Ser Gln Pro Ala Pro Met Gln 165
170 175ccg agg gtg gcg gcg gct gag agc gtg cag
ggg ctc tct ttg tcc atg 576Pro Arg Val Ala Ala Ala Glu Ser Val Gln
Gly Leu Ser Leu Ser Met 180 185
190aac atg gcg ggg gcg acg caa ggc gcc gct ggc atg cca ctt ctt gct
624Asn Met Ala Gly Ala Thr Gln Gly Ala Ala Gly Met Pro Leu Leu Ala
195 200 205gga gag cgc ggc cgg gcg ccc
gag agt gtc tcg acg tcg gca cag ggt 672Gly Glu Arg Gly Arg Ala Pro
Glu Ser Val Ser Thr Ser Ala Gln Gly 210 215
220gga gcc gtc gtc acg gct cca aag gag gat agc ggt ggc agc ggt gtt
720Gly Ala Val Val Thr Ala Pro Lys Glu Asp Ser Gly Gly Ser Gly Val225
230 235 240gcc gcc acc ggc
gcc cta gta gcc gtg agc acg gac acg ggt ggc agc 768Ala Ala Thr Gly
Ala Leu Val Ala Val Ser Thr Asp Thr Gly Gly Ser 245
250 255ggc gcg tcg gct gac aac acg gca agg aag
acg gtg gac acg ttc ggg 816Gly Ala Ser Ala Asp Asn Thr Ala Arg Lys
Thr Val Asp Thr Phe Gly 260 265
270cag cgc acg tcg att tac cgt ggc gtg aca agg cat aga tgg act ggg
864Gln Arg Thr Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly
275 280 285aga tat gaa gca cat ctg tgg
gac aac agt tgc aga agg gaa gga caa 912Arg Tyr Glu Ala His Leu Trp
Asp Asn Ser Cys Arg Arg Glu Gly Gln 290 295
300act cgc aag ggt cgt caa gtc tat tta ggt ggc tat gat aaa gag gag
960Thr Arg Lys Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu Glu305
310 315 320aaa gct gct agg
gct tat gat ctg gct gct ctt aag tac tgg ggt ccc 1008Lys Ala Ala Arg
Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly Pro 325
330 335acg aca aca aca aat ttt cca gtg aat aac
tac gaa aag gag ctg gag 1056Thr Thr Thr Thr Asn Phe Pro Val Asn Asn
Tyr Glu Lys Glu Leu Glu 340 345
350gat atg aag cac atg aca agg cag gag ttt gta gcg tct ctg aga agg
1104Asp Met Lys His Met Thr Arg Gln Glu Phe Val Ala Ser Leu Arg Arg
355 360 365aag agc agt ggt ttc tcc aga
ggt gca tcc att tac agg gga gtg act 1152Lys Ser Ser Gly Phe Ser Arg
Gly Ala Ser Ile Tyr Arg Gly Val Thr 370 375
380agg cat cac cag cat gga aga tgg caa gca cgg att gga cga gtt gca
1200Arg His His Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg Val Ala385
390 395 400ggg aac aag gat
ctc tac ttg ggc acc ttc agc acg cag gag gag gca 1248Gly Asn Lys Asp
Leu Tyr Leu Gly Thr Phe Ser Thr Gln Glu Glu Ala 405
410 415gcg gag gca tac gac att gcg gcg atc aag
ttc cgc ggc ctc aac gcc 1296Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys
Phe Arg Gly Leu Asn Ala 420 425
430gtc aca aac ttc gac atg agc cgc tac gac gtc aag agc atc ctg gac
1344Val Thr Asn Phe Asp Met Ser Arg Tyr Asp Val Lys Ser Ile Leu Asp
435 440 445agc agt gcg ctc ccc atc ggc
agc gcc gcc aag cgt ctc aag gag gcc 1392Ser Ser Ala Leu Pro Ile Gly
Ser Ala Ala Lys Arg Leu Lys Glu Ala 450 455
460gag gcc gcc gcg tcc gca cag cac cat gcc ggc gtg gtg agc tac gac
1440Glu Ala Ala Ala Ser Ala Gln His His Ala Gly Val Val Ser Tyr Asp465
470 475 480gtc ggc cgc ata
gcc tca cag ctc ggc gac ggc ggc gcc ctg gcg gcg 1488Val Gly Arg Ile
Ala Ser Gln Leu Gly Asp Gly Gly Ala Leu Ala Ala 485
490 495gcg tac ggc gcg cac tac cat ggc gcc tgg
ccg acc atc gcg ttc cag 1536Ala Tyr Gly Ala His Tyr His Gly Ala Trp
Pro Thr Ile Ala Phe Gln 500 505
510ccg agc gcg gcc acg ggc ctg tac cac ccg tac gcg cag ccg atg cgc
1584Pro Ser Ala Ala Thr Gly Leu Tyr His Pro Tyr Ala Gln Pro Met Arg
515 520 525ggg tgg tgc aag cag gag cag
gac cac gcg gtg atc gcg gcc gcg cac 1632Gly Trp Cys Lys Gln Glu Gln
Asp His Ala Val Ile Ala Ala Ala His 530 535
540agc ctg cag gag ctc cac cac ctg aac ctg ggt gct gcc gcc ggc gcg
1680Ser Leu Gln Glu Leu His His Leu Asn Leu Gly Ala Ala Ala Gly Ala545
550 555 560cac gac ttc ttc
tcg gcg ggg cag cag gcg gcg atg cac ggc ctg ggt 1728His Asp Phe Phe
Ser Ala Gly Gln Gln Ala Ala Met His Gly Leu Gly 565
570 575agc atg gac aat gca tca ctc gag cac agc
acc ggc tcc aac tcc gtc 1776Ser Met Asp Asn Ala Ser Leu Glu His Ser
Thr Gly Ser Asn Ser Val 580 585
590gtg tac aac ggt gtt ggt gat agc aac ggc agc acc gtc gtc ggc agt
1824Val Tyr Asn Gly Val Gly Asp Ser Asn Gly Ser Thr Val Val Gly Ser
595 600 605ggt ggc tac atg atg cct atg
agc gct gcc acg gcg acg gct acc acg 1872Gly Gly Tyr Met Met Pro Met
Ser Ala Ala Thr Ala Thr Ala Thr Thr 610 615
620gca atg gtg agc cac gag cag gtg cat gca cgg gca cag ggt gat cac
1920Ala Met Val Ser His Glu Gln Val His Ala Arg Ala Gln Gly Asp His625
630 635 640cac gac gaa gcc
aag cag gct gct cag atg ggg tac gag agc tac ctg 1968His Asp Glu Ala
Lys Gln Ala Ala Gln Met Gly Tyr Glu Ser Tyr Leu 645
650 655gtg aac gca gag aac tat ggc ggc ggg agg
atg tct gcg gcc tgg gcg 2016Val Asn Ala Glu Asn Tyr Gly Gly Gly Arg
Met Ser Ala Ala Trp Ala 660 665
670act gtc tca gcg cca ccg gcg gca agc agc aac gat aac atg gcg gac
2064Thr Val Ser Ala Pro Pro Ala Ala Ser Ser Asn Asp Asn Met Ala Asp
675 680 685gtc ggc cat ggc ggc gca cag
ctc ttc agt gtc tgg aac gat act taa 2112Val Gly His Gly Gly Ala Gln
Leu Phe Ser Val Trp Asn Asp Thr 690 695
700
SEQ ID NO: 39 (length 703; PRT; Sorghum bicolor)
Met Ala Thr Val Asn Asn Trp Leu Ala Phe Ser
Leu Ser Pro Gln Glu1 5 10
15Leu Pro Pro Thr Gln Thr Asp Ser Thr Leu Ile Ser Ala Ala Thr Thr
20 25 30Asp Asp Val Ser Gly Asp Val
Cys Phe Asn Ile Pro Gln Asp Trp Ser 35 40
45Met Arg Gly Ser Glu Leu Ser Ala Leu Val Ala Glu Pro Lys Leu
Glu 50 55 60Asp Phe Leu Gly Gly Ile
Ser Phe Ser Glu Gln His His Lys Ala Asn65 70
75 80Cys Asn Met Ile Pro Ser Thr Ser Ser Thr Ala
Cys Tyr Ala Ser Ser 85 90
95Gly Ala Thr Ala Gly Tyr His His Gln Leu Tyr His Gln Pro Thr Ser
100 105 110Ser Ala Leu His Phe Ala
Asp Ser Val Met Val Ala Ser Ser Ala Gly 115 120
125Gly Val His Asp Gly Gly Ala Met Leu Ser Ala Ala Ser Ala
Asn Gly 130 135 140Ser Ala Gly Ala Gly
Ala Ala Ser Ala Asn Gly Ser Gly Ser Ile Gly145 150
155 160Leu Ser Met Ile Lys Asn Trp Leu Arg Ser
Gln Pro Ala Pro Met Gln 165 170
175Pro Arg Val Ala Ala Ala Glu Ser Val Gln Gly Leu Ser Leu Ser Met
180 185 190Asn Met Ala Gly Ala
Thr Gln Gly Ala Ala Gly Met Pro Leu Leu Ala 195
200 205Gly Glu Arg Gly Arg Ala Pro Glu Ser Val Ser Thr
Ser Ala Gln Gly 210 215 220Gly Ala Val
Val Thr Ala Pro Lys Glu Asp Ser Gly Gly Ser Gly Val225
230 235 240Ala Ala Thr Gly Ala Leu Val
Ala Val Ser Thr Asp Thr Gly Gly Ser 245
250 255Gly Ala Ser Ala Asp Asn Thr Ala Arg Lys Thr Val
Asp Thr Phe Gly 260 265 270Gln
Arg Thr Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly 275
280 285Arg Tyr Glu Ala His Leu Trp Asp Asn
Ser Cys Arg Arg Glu Gly Gln 290 295
300Thr Arg Lys Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu Glu305
310 315 320Lys Ala Ala Arg
Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly Pro 325
330 335Thr Thr Thr Thr Asn Phe Pro Val Asn Asn
Tyr Glu Lys Glu Leu Glu 340 345
350Asp Met Lys His Met Thr Arg Gln Glu Phe Val Ala Ser Leu Arg Arg
355 360 365Lys Ser Ser Gly Phe Ser Arg
Gly Ala Ser Ile Tyr Arg Gly Val Thr 370 375
380Arg His His Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg Val
Ala385 390 395 400Gly Asn
Lys Asp Leu Tyr Leu Gly Thr Phe Ser Thr Gln Glu Glu Ala
405 410 415Ala Glu Ala Tyr Asp Ile Ala
Ala Ile Lys Phe Arg Gly Leu Asn Ala 420 425
430Val Thr Asn Phe Asp Met Ser Arg Tyr Asp Val Lys Ser Ile
Leu Asp 435 440 445Ser Ser Ala Leu
Pro Ile Gly Ser Ala Ala Lys Arg Leu Lys Glu Ala 450
455 460Glu Ala Ala Ala Ser Ala Gln His His Ala Gly Val
Val Ser Tyr Asp465 470 475
480Val Gly Arg Ile Ala Ser Gln Leu Gly Asp Gly Gly Ala Leu Ala Ala
485 490 495Ala Tyr Gly Ala His
Tyr His Gly Ala Trp Pro Thr Ile Ala Phe Gln 500
505 510Pro Ser Ala Ala Thr Gly Leu Tyr His Pro Tyr Ala
Gln Pro Met Arg 515 520 525Gly Trp
Cys Lys Gln Glu Gln Asp His Ala Val Ile Ala Ala Ala His 530
535 540Ser Leu Gln Glu Leu His His Leu Asn Leu Gly
Ala Ala Ala Gly Ala545 550 555
560His Asp Phe Phe Ser Ala Gly Gln Gln Ala Ala Met His Gly Leu Gly
565 570 575Ser Met Asp Asn
Ala Ser Leu Glu His Ser Thr Gly Ser Asn Ser Val 580
585 590Val Tyr Asn Gly Val Gly Asp Ser Asn Gly Ser
Thr Val Val Gly Ser 595 600 605Gly
Gly Tyr Met Met Pro Met Ser Ala Ala Thr Ala Thr Ala Thr Thr 610
615 620Ala Met Val Ser His Glu Gln Val His Ala
Arg Ala Gln Gly Asp His625 630 635
640His Asp Glu Ala Lys Gln Ala Ala Gln Met Gly Tyr Glu Ser Tyr
Leu 645 650 655Val Asn Ala
Glu Asn Tyr Gly Gly Gly Arg Met Ser Ala Ala Trp Ala 660
665 670Thr Val Ser Ala Pro Pro Ala Ala Ser Ser
Asn Asp Asn Met Ala Asp 675 680
685Val Gly His Gly Gly Ala Gln Leu Phe Ser Val Trp Asn Asp Thr 690
695 700
SEQ ID NO: 40 (length 2082; DNA; Sorghum bicolor; CDS (1)...(2082))
atg gct tcg acg aac aac cac tgg ctg ggt ttc tcg
ctc tcg ggc cag 48Met Ala Ser Thr Asn Asn His Trp Leu Gly Phe Ser Leu
Ser Gly Gln 1 5 10 15gat
aac ccg cag cct aat cat cag gac agc tcg cct gcc gcc gcc ggc 96Asp Asn
Pro Gln Pro Asn His Gln Asp Ser Ser Pro Ala Ala Ala Gly 20
25 30atc gac atc tcc ggc gcc agc gac ttc
tat ggc ttg ccc acg cag cag 144Ile Asp Ile Ser Gly Ala Ser Asp Phe Tyr
Gly Leu Pro Thr Gln Gln 35 40
45ggc tcc gac ggg aat ctc ggc gtg ccg ggc ctg cgg gac gat cac gct
192Gly Ser Asp Gly Asn Leu Gly Val Pro Gly Leu Arg Asp Asp His Ala 50
55 60tct tat ggc atc atg gag gcc ttc
aac agg gtt cct caa gaa acc caa 240Ser Tyr Gly Ile Met Glu Ala Phe Asn
Arg Val Pro Gln Glu Thr Gln 65 70 75
80gat tgg aac atg agg gga ttg gac tac aac ggc ggt ggc tcg
gaa ctc 288Asp Trp Asn Met Arg Gly Leu Asp Tyr Asn Gly Gly Gly Ser Glu
Leu 85 90 95tcg atg ctt
gtg ggg tcc agc ggc ggc ggc ggg ggc ggc ggc aag agg 336Ser Met Leu Val
Gly Ser Ser Gly Gly Gly Gly Gly Gly Gly Lys Arg 100
105 110gcc gtg gaa gac agc gag ccc aag ctc gaa gat
ttc ctc ggc ggc aac 384Ala Val Glu Asp Ser Glu Pro Lys Leu Glu Asp Phe
Leu Gly Gly Asn 115 120 125tcg ttc
gtc tcc gag cat gat cag tcc ggc ggt tac ctg ttc tct gga 432Ser Phe Val
Ser Glu His Asp Gln Ser Gly Gly Tyr Leu Phe Ser Gly 130
135 140gtc ccg atg gcc agc agc acc aac agc aac agc ggg
agc aac acc atg 480Val Pro Met Ala Ser Ser Thr Asn Ser Asn Ser Gly Ser
Asn Thr Met145 150 155
160gag ctc tcc atg atc aag acc tgg ctc cgg aac aac cag gtg ccc cag
528Glu Leu Ser Met Ile Lys Thr Trp Leu Arg Asn Asn Gln Val Pro Gln
165 170 175ccg cag ccg cca gca
gct ccg cat cag gcg ccg cag act gag gag atg 576Pro Gln Pro Pro Ala Ala
Pro His Gln Ala Pro Gln Thr Glu Glu Met 180
185 190agc acc gac gcc aac gcc agc gcc agc agc ttt ggc
tgc tcg gat tcg 624Ser Thr Asp Ala Asn Ala Ser Ala Ser Ser Phe Gly Cys
Ser Asp Ser 195 200 205atg ggg agg
aac ggc acg gtg gcg gct gct ggg agc tcc cag agc ctg 672Met Gly Arg Asn
Gly Thr Val Ala Ala Ala Gly Ser Ser Gln Ser Leu 210
215 220gcg ctc tcg atg agc acg ggc tcg cac ctg ccg atg
gtt gtg gcc ggc 720Ala Leu Ser Met Ser Thr Gly Ser His Leu Pro Met Val
Val Ala Gly225 230 235
240ggc ggc gcc agc gga gcg gcc tcg gag agc acg tca tcg gag aac aag
768Gly Gly Ala Ser Gly Ala Ala Ser Glu Ser Thr Ser Ser Glu Asn Lys
245 250 255cga gcg agc ggc gcc
atg gat tcg ccc ggc agc gcg gta gaa gcc gtc 816Arg Ala Ser Gly Ala Met
Asp Ser Pro Gly Ser Ala Val Glu Ala Val 260
265 270ccg agg aag tcc atc gac acg ttc ggg caa agg acc
tct ata tat cga 864Pro Arg Lys Ser Ile Asp Thr Phe Gly Gln Arg Thr Ser
Ile Tyr Arg 275 280 285ggt gta aca
aga cat aga tgg aca ggg cga tat gag gct cat cta tgg 912Gly Val Thr Arg
His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu Trp 290
295 300gat aat agt tgt aga aga gaa ggg cag agt cgc aag
ggt agg caa gtt 960Asp Asn Ser Cys Arg Arg Glu Gly Gln Ser Arg Lys Gly
Arg Gln Val305 310 315
320tac ctt ggt ggc tat gac aag gaa gac aag gca gca agg gct tat gat
1008Tyr Leu Gly Gly Tyr Asp Lys Glu Asp Lys Ala Ala Arg Ala Tyr Asp
325 330 335ttg gca gct ctc aag
tat tgg ggc act act aca aca aca aat ttc cct 1056Leu Ala Ala Leu Lys Tyr
Trp Gly Thr Thr Thr Thr Thr Asn Phe Pro 340
345 350ata agc aac tat gaa aag gag cta gag gaa atg aaa
cat atg act agg 1104Ile Ser Asn Tyr Glu Lys Glu Leu Glu Glu Met Lys His
Met Thr Arg 355 360 365cag gag tat
att gca tac cta aga aga aat agc agt gga ttt tct cgt 1152Gln Glu Tyr Ile
Ala Tyr Leu Arg Arg Asn Ser Ser Gly Phe Ser Arg 370
375 380ggc gca tca aaa tat cgt gga gta act aga cat cat
cag cat ggg aga 1200Gly Ala Ser Lys Tyr Arg Gly Val Thr Arg His His Gln
His Gly Arg385 390 395
400tgg caa gca agg ata ggg aga gtt gca gga aac aag gat ctc tac ttg
1248Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu
405 410 415ggc aca ttc agc acc
gag gag gag gcg gcg gag gcc tac gac atc gcc 1296Gly Thr Phe Ser Thr Glu
Glu Glu Ala Ala Glu Ala Tyr Asp Ile Ala 420
425 430gcg atc aag ttc cgc ggt ctg aac gcc gtc acc aac
ttc gac atg agc 1344Ala Ile Lys Phe Arg Gly Leu Asn Ala Val Thr Asn Phe
Asp Met Ser 435 440 445cgc tac gac
gtc aag agc atc ctc gag agc agc acg ctg cct gtc ggc 1392Arg Tyr Asp Val
Lys Ser Ile Leu Glu Ser Ser Thr Leu Pro Val Gly 450
455 460ggc gcg gcc agg cgc ctc aag gat gcc gtg gac cac
gtg gag gcc ggc 1440Gly Ala Ala Arg Arg Leu Lys Asp Ala Val Asp His Val
Glu Ala Gly465 470 475
480gcc acc atc tgg cgc gcc gac atg gac ggc ggc gtg atc tcc cag ctc
1488Ala Thr Ile Trp Arg Ala Asp Met Asp Gly Gly Val Ile Ser Gln Leu
485 490 495gcc gaa gcc ggg atg
ggc ggc tac gcc tcg tac ggg cac cac gcc tgg 1536Ala Glu Ala Gly Met Gly
Gly Tyr Ala Ser Tyr Gly His His Ala Trp 500
505 510ccg acc atc gcg ttc cag cag ccg tcg ccg ctc tcc
gtc cac tac ccg 1584Pro Thr Ile Ala Phe Gln Gln Pro Ser Pro Leu Ser Val
His Tyr Pro 515 520 525tac ggg cag
ccg ccg tcc cgc ggg tgg tgc aag ccc gag cag gac gcg 1632Tyr Gly Gln Pro
Pro Ser Arg Gly Trp Cys Lys Pro Glu Gln Asp Ala 530
535 540gcc gtc gcc gcc gcc gcg cac agc ctg cag gac ctc
cag cag ctg cac 1680Ala Val Ala Ala Ala Ala His Ser Leu Gln Asp Leu Gln
Gln Leu His545 550 555
560ctc ggc agc gcg gca cac aac ttc ttc cag gcg tcg tcg agc tcg gca
1728Leu Gly Ser Ala Ala His Asn Phe Phe Gln Ala Ser Ser Ser Ser Ala
565 570 575gtc tac aac agc ggc
ggc ggc ggc gct agc ggc ggg tac cac cag ggc 1776Val Tyr Asn Ser Gly Gly
Gly Gly Ala Ser Gly Gly Tyr His Gln Gly 580
585 590ctc ggt ggc ggc agc agc tcc ttc ctc atg ccg tcg
agc act gtc gtg 1824Leu Gly Gly Gly Ser Ser Ser Phe Leu Met Pro Ser Ser
Thr Val Val 595 600 605gcg ggg gcc
gac cag ggg cac agc agc agc acg gcc aac cag ggg agc 1872Ala Gly Ala Asp
Gln Gly His Ser Ser Ser Thr Ala Asn Gln Gly Ser 610
615 620acg tgc agc tac ggg gac gat cac cag gaa ggg aag
ctc atc ggg tac 1920Thr Cys Ser Tyr Gly Asp Asp His Gln Glu Gly Lys Leu
Ile Gly Tyr625 630 635
640gac gcc atg gtg gcg gcg acc gca gcc ggc ggg gac ccg tac gcc gcg
1968Asp Ala Met Val Ala Ala Thr Ala Ala Gly Gly Asp Pro Tyr Ala Ala
645 650 655gcg agg agc ggg tac
cag ttc tcg tcg cag ggc tcg gga tcc acg gtg 2016Ala Arg Ser Gly Tyr Gln
Phe Ser Ser Gln Gly Ser Gly Ser Thr Val 660
665 670agc atc gcg agg gcg aac ggg tac tct aac aac tgg
agc tct cct ttc 2064Ser Ile Ala Arg Ala Asn Gly Tyr Ser Asn Asn Trp Ser
Ser Pro Phe 675 680 685aac ggc ggc
atg ggg tga 2082Asn Gly Gly Met
Gly 690
SEQ ID NO: 41 (length 693; PRT; Sorghum bicolor)
Met Ala Ser Thr Asn Asn His Trp Leu
Gly Phe Ser Leu Ser Gly Gln1 5 10
15Asp Asn Pro Gln Pro Asn His Gln Asp Ser Ser Pro Ala Ala Ala
Gly 20 25 30Ile Asp Ile Ser
Gly Ala Ser Asp Phe Tyr Gly Leu Pro Thr Gln Gln 35
40 45Gly Ser Asp Gly Asn Leu Gly Val Pro Gly Leu Arg
Asp Asp His Ala 50 55 60Ser Tyr Gly
Ile Met Glu Ala Phe Asn Arg Val Pro Gln Glu Thr Gln65 70
75 80Asp Trp Asn Met Arg Gly Leu Asp
Tyr Asn Gly Gly Gly Ser Glu Leu 85 90
95Ser Met Leu Val Gly Ser Ser Gly Gly Gly Gly Gly Gly Gly
Lys Arg 100 105 110Ala Val Glu
Asp Ser Glu Pro Lys Leu Glu Asp Phe Leu Gly Gly Asn 115
120 125Ser Phe Val Ser Glu His Asp Gln Ser Gly Gly
Tyr Leu Phe Ser Gly 130 135 140Val Pro
Met Ala Ser Ser Thr Asn Ser Asn Ser Gly Ser Asn Thr Met145
150 155 160Glu Leu Ser Met Ile Lys Thr
Trp Leu Arg Asn Asn Gln Val Pro Gln 165
170 175Pro Gln Pro Pro Ala Ala Pro His Gln Ala Pro Gln
Thr Glu Glu Met 180 185 190Ser
Thr Asp Ala Asn Ala Ser Ala Ser Ser Phe Gly Cys Ser Asp Ser 195
200 205Met Gly Arg Asn Gly Thr Val Ala Ala
Ala Gly Ser Ser Gln Ser Leu 210 215
220Ala Leu Ser Met Ser Thr Gly Ser His Leu Pro Met Val Val Ala Gly225
230 235 240Gly Gly Ala Ser
Gly Ala Ala Ser Glu Ser Thr Ser Ser Glu Asn Lys 245
250 255Arg Ala Ser Gly Ala Met Asp Ser Pro Gly
Ser Ala Val Glu Ala Val 260 265
270Pro Arg Lys Ser Ile Asp Thr Phe Gly Gln Arg Thr Ser Ile Tyr Arg
275 280 285Gly Val Thr Arg His Arg Trp
Thr Gly Arg Tyr Glu Ala His Leu Trp 290 295
300Asp Asn Ser Cys Arg Arg Glu Gly Gln Ser Arg Lys Gly Arg Gln
Val305 310 315 320Tyr Leu
Gly Gly Tyr Asp Lys Glu Asp Lys Ala Ala Arg Ala Tyr Asp
325 330 335Leu Ala Ala Leu Lys Tyr Trp
Gly Thr Thr Thr Thr Thr Asn Phe Pro 340 345
350Ile Ser Asn Tyr Glu Lys Glu Leu Glu Glu Met Lys His Met
Thr Arg 355 360 365Gln Glu Tyr Ile
Ala Tyr Leu Arg Arg Asn Ser Ser Gly Phe Ser Arg 370
375 380Gly Ala Ser Lys Tyr Arg Gly Val Thr Arg His His
Gln His Gly Arg385 390 395
400Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu
405 410 415Gly Thr Phe Ser Thr
Glu Glu Glu Ala Ala Glu Ala Tyr Asp Ile Ala 420
425 430Ala Ile Lys Phe Arg Gly Leu Asn Ala Val Thr Asn
Phe Asp Met Ser 435 440 445Arg Tyr
Asp Val Lys Ser Ile Leu Glu Ser Ser Thr Leu Pro Val Gly 450
455 460Gly Ala Ala Arg Arg Leu Lys Asp Ala Val Asp
His Val Glu Ala Gly465 470 475
480Ala Thr Ile Trp Arg Ala Asp Met Asp Gly Gly Val Ile Ser Gln Leu
485 490 495Ala Glu Ala Gly
Met Gly Gly Tyr Ala Ser Tyr Gly His His Ala Trp 500
505 510Pro Thr Ile Ala Phe Gln Gln Pro Ser Pro Leu
Ser Val His Tyr Pro 515 520 525Tyr
Gly Gln Pro Pro Ser Arg Gly Trp Cys Lys Pro Glu Gln Asp Ala 530
535 540Ala Val Ala Ala Ala Ala His Ser Leu Gln
Asp Leu Gln Gln Leu His545 550 555
560Leu Gly Ser Ala Ala His Asn Phe Phe Gln Ala Ser Ser Ser Ser
Ala 565 570 575Val Tyr Asn
Ser Gly Gly Gly Gly Ala Ser Gly Gly Tyr His Gln Gly 580
585 590Leu Gly Gly Gly Ser Ser Ser Phe Leu Met
Pro Ser Ser Thr Val Val 595 600
605Ala Gly Ala Asp Gln Gly His Ser Ser Ser Thr Ala Asn Gln Gly Ser 610
615 620Thr Cys Ser Tyr Gly Asp Asp His
Gln Glu Gly Lys Leu Ile Gly Tyr625 630
635 640Asp Ala Met Val Ala Ala Thr Ala Ala Gly Gly Asp
Pro Tyr Ala Ala 645 650
655Ala Arg Ser Gly Tyr Gln Phe Ser Ser Gln Gly Ser Gly Ser Thr Val
660 665 670Ser Ile Ala Arg Ala Asn
Gly Tyr Ser Asn Asn Trp Ser Ser Pro Phe 675 680
685Asn Gly Gly Met Gly 690
SEQ ID NO: 42 (length 1272; DNA; Artificial Sequence; Maize optimized FLP coding sequence; CDS (1)...(1272))
atg ccc cag
ttc gac atc ctc tgc aag acc ccc ccc aag gtg ctc gtg 48Met Pro Gln Phe
Asp Ile Leu Cys Lys Thr Pro Pro Lys Val Leu Val 1 5
10 15agg cag ttc gtg gag agg ttc gag agg ccc
tcc ggc gag aag atc gcc 96Arg Gln Phe Val Glu Arg Phe Glu Arg Pro Ser
Gly Glu Lys Ile Ala 20 25
30ctc tgc gcc gcc gag ctc acc tac ctc tgc tgg atg atc acc cac aac
144Leu Cys Ala Ala Glu Leu Thr Tyr Leu Cys Trp Met Ile Thr His Asn
35 40 45ggc acc gcc att aag agg gcc
acc ttc atg tca tac aac acc atc atc 192Gly Thr Ala Ile Lys Arg Ala Thr
Phe Met Ser Tyr Asn Thr Ile Ile 50 55
60tcc aac tcc ctc tcc ttc gac atc gtg aac aag tcc ctc cag ttc aaa
240Ser Asn Ser Leu Ser Phe Asp Ile Val Asn Lys Ser Leu Gln Phe Lys 65
70 75 80tac aag acc cag
aag gcc acc atc ctc gag gcc tcc ctc aag aag ctc 288Tyr Lys Thr Gln Lys
Ala Thr Ile Leu Glu Ala Ser Leu Lys Lys Leu 85
90 95atc ccc gcc tgg gag ttc acc atc atc ccc tac
tac ggc cag aag cac 336Ile Pro Ala Trp Glu Phe Thr Ile Ile Pro Tyr Tyr
Gly Gln Lys His 100 105 110cag
tcc gac atc acc gac atc gtg tca tcc ctc cag ctt cag ttc gag 384Gln Ser
Asp Ile Thr Asp Ile Val Ser Ser Leu Gln Leu Gln Phe Glu 115
120 125tcc tcc gag gag gct gac aag ggc aac tcc
cac tcc aag aag atg ctg 432Ser Ser Glu Glu Ala Asp Lys Gly Asn Ser His
Ser Lys Lys Met Leu 130 135 140aag gcc
ctc ctc tcc gag ggc gag tcc atc tgg gag atc acc gag aag 480Lys Ala Leu
Leu Ser Glu Gly Glu Ser Ile Trp Glu Ile Thr Glu Lys145
150 155 160atc ctc aac tcc ttc gag tac
acc tcc agg ttc act aag acc aag acc 528Ile Leu Asn Ser Phe Glu Tyr Thr
Ser Arg Phe Thr Lys Thr Lys Thr 165 170
175ctc tac cag ttc ctc ttc ctc gcc acc ttc atc aac tgc ggc
agg ttc 576Leu Tyr Gln Phe Leu Phe Leu Ala Thr Phe Ile Asn Cys Gly Arg
Phe 180 185 190tca gac atc aag
aac gtg gac ccc aag tcc ttc aag ctc gtg cag aac 624Ser Asp Ile Lys Asn
Val Asp Pro Lys Ser Phe Lys Leu Val Gln Asn 195
200 205aag tac ctc ggc gtg atc atc cag tgc ctc gtg acc
gag acc aag acc 672Lys Tyr Leu Gly Val Ile Ile Gln Cys Leu Val Thr Glu
Thr Lys Thr 210 215 220tcc gtg tcc agg
cac atc tac ttc ttc tcc gct cgc ggc agg atc gac 720Ser Val Ser Arg His
Ile Tyr Phe Phe Ser Ala Arg Gly Arg Ile Asp225 230
235 240ccc ctc gtg tac ctc gac gag ttc ctc agg
aac tca gag ccc gtg ctc 768Pro Leu Val Tyr Leu Asp Glu Phe Leu Arg Asn
Ser Glu Pro Val Leu 245 250
255aag agg gtg aac agg acc ggc aac tcc tcc tcc aac aag cag gag tac
816Lys Arg Val Asn Arg Thr Gly Asn Ser Ser Ser Asn Lys Gln Glu Tyr
260 265 270cag ctc ctc aag gac aac
ctc gtg agg tcc tac aac aag gcc ctc aag 864Gln Leu Leu Lys Asp Asn Leu
Val Arg Ser Tyr Asn Lys Ala Leu Lys 275 280
285aag aac gcc ccc tac tcc atc ttc gcc atc aag aac ggc ccc aag
tcc 912Lys Asn Ala Pro Tyr Ser Ile Phe Ala Ile Lys Asn Gly Pro Lys Ser
290 295 300cac atc ggt agg cac ctc atg
acc tcc ttc ctc tca atg aag ggc ctc 960His Ile Gly Arg His Leu Met Thr
Ser Phe Leu Ser Met Lys Gly Leu305 310
315 320acc gag ctc acc aac gtg gtg ggc aac tgg tcc gac
aag agg gcc tcc 1008Thr Glu Leu Thr Asn Val Val Gly Asn Trp Ser Asp Lys
Arg Ala Ser 325 330 335gcc
gtg gcc agg acc acc tac acc cac cag atc acc gcc atc ccc gac 1056Ala Val
Ala Arg Thr Thr Tyr Thr His Gln Ile Thr Ala Ile Pro Asp 340
345 350cac tac ttc gcc ctc gtg tca agg tac
tac gcc tac gac ccc atc tcc 1104His Tyr Phe Ala Leu Val Ser Arg Tyr Tyr
Ala Tyr Asp Pro Ile Ser 355 360
365aag gag atg atc gcc ctc aag gac gag act aac ccc atc gag gag tgg
1152Lys Glu Met Ile Ala Leu Lys Asp Glu Thr Asn Pro Ile Glu Glu Trp
370 375 380cag cac atc gag cag ctc aag
ggc tcc gcc gag ggc tcc atc agg tac 1200Gln His Ile Glu Gln Leu Lys Gly
Ser Ala Glu Gly Ser Ile Arg Tyr385 390
395 400ccc gcc tgg aac ggc atc atc tcc cag gag gtg ctc
gac tac ctc tcc 1248Pro Ala Trp Asn Gly Ile Ile Ser Gln Glu Val Leu Asp
Tyr Leu Ser 405 410 415tcc
tac atc aac agg agg atc tga 1272Ser Tyr
Ile Asn Arg Arg Ile 420
SEQ ID NO: 43 (length 423; PRT; Artificial Sequence; FLP)
Met
Pro Gln Phe Asp Ile Leu Cys Lys Thr Pro Pro Lys Val Leu Val1
5 10 15Arg Gln Phe Val Glu Arg Phe
Glu Arg Pro Ser Gly Glu Lys Ile Ala 20 25
30Leu Cys Ala Ala Glu Leu Thr Tyr Leu Cys Trp Met Ile Thr
His Asn 35 40 45Gly Thr Ala Ile
Lys Arg Ala Thr Phe Met Ser Tyr Asn Thr Ile Ile 50 55
60Ser Asn Ser Leu Ser Phe Asp Ile Val Asn Lys Ser Leu
Gln Phe Lys65 70 75
80Tyr Lys Thr Gln Lys Ala Thr Ile Leu Glu Ala Ser Leu Lys Lys Leu
85 90 95Ile Pro Ala Trp Glu Phe
Thr Ile Ile Pro Tyr Tyr Gly Gln Lys His 100
105 110Gln Ser Asp Ile Thr Asp Ile Val Ser Ser Leu Gln
Leu Gln Phe Glu 115 120 125Ser Ser
Glu Glu Ala Asp Lys Gly Asn Ser His Ser Lys Lys Met Leu 130
135 140Lys Ala Leu Leu Ser Glu Gly Glu Ser Ile Trp
Glu Ile Thr Glu Lys145 150 155
160Ile Leu Asn Ser Phe Glu Tyr Thr Ser Arg Phe Thr Lys Thr Lys Thr
165 170 175Leu Tyr Gln Phe
Leu Phe Leu Ala Thr Phe Ile Asn Cys Gly Arg Phe 180
185 190Ser Asp Ile Lys Asn Val Asp Pro Lys Ser Phe
Lys Leu Val Gln Asn 195 200 205Lys
Tyr Leu Gly Val Ile Ile Gln Cys Leu Val Thr Glu Thr Lys Thr 210
215 220Ser Val Ser Arg His Ile Tyr Phe Phe Ser
Ala Arg Gly Arg Ile Asp225 230 235
240Pro Leu Val Tyr Leu Asp Glu Phe Leu Arg Asn Ser Glu Pro Val
Leu 245 250 255Lys Arg Val
Asn Arg Thr Gly Asn Ser Ser Ser Asn Lys Gln Glu Tyr 260
265 270Gln Leu Leu Lys Asp Asn Leu Val Arg Ser
Tyr Asn Lys Ala Leu Lys 275 280
285Lys Asn Ala Pro Tyr Ser Ile Phe Ala Ile Lys Asn Gly Pro Lys Ser 290
295 300His Ile Gly Arg His Leu Met Thr
Ser Phe Leu Ser Met Lys Gly Leu305 310
315 320Thr Glu Leu Thr Asn Val Val Gly Asn Trp Ser Asp
Lys Arg Ala Ser 325 330
335Ala Val Ala Arg Thr Thr Tyr Thr His Gln Ile Thr Ala Ile Pro Asp
340 345 350His Tyr Phe Ala Leu Val
Ser Arg Tyr Tyr Ala Tyr Asp Pro Ile Ser 355 360
365Lys Glu Met Ile Ala Leu Lys Asp Glu Thr Asn Pro Ile Glu
Glu Trp 370 375 380Gln His Ile Glu Gln
Leu Lys Gly Ser Ala Glu Gly Ser Ile Arg Tyr385 390
395 400Pro Ala Trp Asn Gly Ile Ile Ser Gln Glu
Val Leu Asp Tyr Leu Ser 405 410
415Ser Tyr Ile Asn Arg Arg Ile 420
SEQ ID NO: 44 (length 1032; DNA; Artificial Sequence; Maize optimized Cre coding sequence; CDS (1)...(1032))
atg tcc aac
ctg ctc acg gtt cac cag aac ctt ccg gct ctt cca gtg 48Met Ser Asn Leu
Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 1 5
10 15gac gcg acg tcc gat gaa gtc agg aag aac
ctc atg gac atg ttc cgc 96Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu
Met Asp Met Phe Arg 20 25
30gac agg caa gcg ttc agc gag cac acc tgg aag atg ctg ctc tcc gtc
144Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val
35 40 45tgc cgc tcc tgg gct gca tgg
tgc aag ctg aac aac agg aag tgg ttc 192Cys Arg Ser Trp Ala Ala Trp Cys
Lys Leu Asn Asn Arg Lys Trp Phe 50 55
60ccc gct gag ccc gag gac gtg agg gat tac ctt ctg tac ctg caa gct
240Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 65
70 75 80cgc ggg ctg gca
gtg aag acc atc cag caa cac ctt gga caa ctg aac 288Arg Gly Leu Ala Val
Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 85
90 95atg ctt cac agg cgc tcc ggc ctc ccg cgc ccc
agc gac tcg aac gcc 336Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser
Asp Ser Asn Ala 100 105 110gtg
agc ctc gtc atg cgc cgc atc agg aag gaa aac gtc gat gcc ggc 384Val Ser
Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 115
120 125gaa agg gca aag cag gcc ctc gcg ttc gag
agg acc gat ttc gac cag 432Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg
Thr Asp Phe Asp Gln 130 135 140gtc cgc
agc ctg atg gag aac agc gac agg tgc cag gac att agg aac 480Val Arg Ser
Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn145
150 155 160ctg gcg ttc ctc gga att gca
tac aac acg ctc ctc agg atc gcg gaa 528Leu Ala Phe Leu Gly Ile Ala Tyr
Asn Thr Leu Leu Arg Ile Ala Glu 165 170
175att gcc cgc att cgc gtg aag gac att agc cgc acc gac ggc
ggc agg 576Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly
Arg 180 185 190atg ctt atc cac
att ggc agg acc aag acg ctc gtt tcc acc gca ggc 624Met Leu Ile His Ile
Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 195
200 205gtc gaa aag gcc ctc agc ctc gga gtg acc aag ctc
gtc gaa cgc tgg 672Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val
Glu Arg Trp 210 215 220atc tcc gtg tcc
ggc gtc gcg gac gac cca aac aac tac ctc ttc tgc 720Ile Ser Val Ser Gly
Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys225 230
235 240cgc gtc cgc aag aac ggg gtg gct gcc cct
agc gcc acc agc caa ctc 768Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser
Ala Thr Ser Gln Leu 245 250
255agc acg agg gcc ttg gaa ggt att ttc gag gcc acc cac cgc ctg atc
816Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile
260 265 270tac ggc gcg aag gat gac
agc ggt caa cgc tac ctc gca tgg tcc ggg 864Tyr Gly Ala Lys Asp Asp Ser
Gly Gln Arg Tyr Leu Ala Trp Ser Gly 275 280
285cac tcc gcc cgc gtt gga gct gct agg gac atg gcc cgc gcc ggt
gtt 912His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val
290 295 300tcc atc ccc gaa atc atg cag
gcg ggt gga tgg acg aac gtg aac att 960Ser Ile Pro Glu Ile Met Gln Ala
Gly Gly Trp Thr Asn Val Asn Ile305 310
315 320gtc atg aac tac att cgc aac ctt gac agc gag acg
ggc gca atg gtt 1008Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly
Ala Met Val 325 330 335cgc
ctc ctg gaa gat ggt gac tga 1032Arg Leu
Leu Glu Asp Gly Asp 340
SEQ ID NO: 45 (length 343; PRT; Artificial Sequence; Cre)
Met
Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val1
5 10 15Asp Ala Thr Ser Asp Glu Val
Arg Lys Asn Leu Met Asp Met Phe Arg 20 25
30Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu
Ser Val 35 40 45Cys Arg Ser Trp
Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 50 55
60Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr
Leu Gln Ala65 70 75
80Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn
85 90 95Met Leu His Arg Arg Ser
Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 100
105 110Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn
Val Asp Ala Gly 115 120 125Glu Arg
Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 130
135 140Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys
Gln Asp Ile Arg Asn145 150 155
160Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu
165 170 175Ile Ala Arg Ile
Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 180
185 190Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu
Val Ser Thr Ala Gly 195 200 205Val
Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 210
215 220Ile Ser Val Ser Gly Val Ala Asp Asp Pro
Asn Asn Tyr Leu Phe Cys225 230 235
240Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln
Leu 245 250 255Ser Thr Arg
Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 260
265 270Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg
Tyr Leu Ala Trp Ser Gly 275 280
285His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 290
295 300Ser Ile Pro Glu Ile Met Gln Ala
Gly Gly Trp Thr Asn Val Asn Ile305 310
315 320Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr
Gly Ala Met Val 325 330
335Arg Leu Leu Glu Asp Gly Asp 340
SEQ ID NO: 46 (length 34; DNA; Artificial Sequence; FRT1)
gaagttccta tactttctag agaataggaa cttc 34
SEQ ID NO: 47 (length 34; DNA; Artificial Sequence; FRT5)
gaagttccta tactcttttg agaataggaa cttc 34
SEQ ID NO: 48 (length 34; DNA; Artificial Sequence; FRT6)
gaagttccta tactttttga agaataggaa cttc 34
SEQ ID NO: 49 (length 34; DNA; Artificial Sequence; FRT7)
gaagttccta tacttattga agaataggaa cttc 34
SEQ ID NO: 50 (length 34; DNA; Artificial Sequence; FRT87)
gaagttccta tactttctgg agaataggaa cttc 34
SEQ ID NO: 51 (length 975; DNA; Zea mays; CDS (1)...(975))
atg gag acg cca cag cag
caa tcc gcc gcc gcc gcc gcc gcc gcc gcc 48Met Glu Thr Pro Gln Gln Gln
Ser Ala Ala Ala Ala Ala Ala Ala Ala 1 5
10 15cac ggg cag gac gac ggc ggg tcg ccg ccg atg tcg ccg
gcc tcc gcc 96His Gly Gln Asp Asp Gly Gly Ser Pro Pro Met Ser Pro Ala
Ser Ala 20 25 30gcg gcg gcg
gcg ctg gcg aac gcg cgg tgg aac ccg acc aag gag cag 144Ala Ala Ala Ala
Leu Ala Asn Ala Arg Trp Asn Pro Thr Lys Glu Gln 35
40 45gtg gcc gtg ctg gag ggg ctg tac gag cac ggc ctg
cgc acc ccc agc 192Val Ala Val Leu Glu Gly Leu Tyr Glu His Gly Leu Arg
Thr Pro Ser 50 55 60gcg gag cag ata
cag cag atc acg ggc agg ctg cgg gag cac ggc gcc 240Ala Glu Gln Ile Gln
Gln Ile Thr Gly Arg Leu Arg Glu His Gly Ala 65 70
75 80atc gag ggc aag aac gtc ttc tac tgg ttc
cag aac cac aag gcc cgc 288Ile Glu Gly Lys Asn Val Phe Tyr Trp Phe Gln
Asn His Lys Ala Arg 85 90
95cag cgc cag agg cag aag cag gac agc ttc gcc tac ttc agc agg ctc
336Gln Arg Gln Arg Gln Lys Gln Asp Ser Phe Ala Tyr Phe Ser Arg Leu
100 105 110ctc cgc cgg ccc ccg ccg
ctg ccc gtg ctc tcc atg ccc ccc gcg cca 384Leu Arg Arg Pro Pro Pro Leu
Pro Val Leu Ser Met Pro Pro Ala Pro 115 120
125ccg tac cat cac gcc cgc gtc ccg gcg ccg ccc gcg ata ccg atg
ccg 432Pro Tyr His His Ala Arg Val Pro Ala Pro Pro Ala Ile Pro Met Pro
130 135 140atg gcg ccg ccg ccg ccc gct
gca tgc aac gac aac ggc ggc gcg cgt 480Met Ala Pro Pro Pro Pro Ala Ala
Cys Asn Asp Asn Gly Gly Ala Arg145 150
155 160gtg atc tac agg aac cca ttc tac gtg gct gcg ccg
cag gcg ccc cct 528Val Ile Tyr Arg Asn Pro Phe Tyr Val Ala Ala Pro Gln
Ala Pro Pro 165 170 175gca
aat gcc gcc tac tac tac cca cag cca cag cag cag cag cag cag 576Ala Asn
Ala Ala Tyr Tyr Tyr Pro Gln Pro Gln Gln Gln Gln Gln Gln 180
185 190cag gtg aca gtc atg tac cag tac ccg
aga atg gag gta gcc ggc cag 624Gln Val Thr Val Met Tyr Gln Tyr Pro Arg
Met Glu Val Ala Gly Gln 195 200
205gac aag atg atg acc agg gcc gcg gcg cac cag cag cag cag cac aac
672Asp Lys Met Met Thr Arg Ala Ala Ala His Gln Gln Gln Gln His Asn 210
215 220ggc gcc ggg caa caa ccg gga cgc
gcc ggc cac ccc agc cgc gag acg 720Gly Ala Gly Gln Gln Pro Gly Arg Ala
Gly His Pro Ser Arg Glu Thr225 230 235
240ctc cag ctg ttc ccg ctc cag ccc acc ttc gtg ctg cgg cac
gac aag 768Leu Gln Leu Phe Pro Leu Gln Pro Thr Phe Val Leu Arg His Asp
Lys 245 250 255ggg cgc gcc
gcc aac ggc agt aat aac gac tcc ctg acg tcg acg tcg 816Gly Arg Ala Ala
Asn Gly Ser Asn Asn Asp Ser Leu Thr Ser Thr Ser 260
265 270acg gcg act gcg aca gcg aca gcg aca gcg aca
gcg tcc gct tcc atc 864Thr Ala Thr Ala Thr Ala Thr Ala Thr Ala Thr Ala
Ser Ala Ser Ile 275 280 285tcc gag
gac tcg gat ggc ctg gag agc ggc agc tcc ggc aag ggc gtc 912Ser Glu Asp
Ser Asp Gly Leu Glu Ser Gly Ser Ser Gly Lys Gly Val 290
295 300gag gag gcg ccc gcg ctg ccg ttc tat gac ttc ttc
ggg ctc cag tcc 960Glu Glu Ala Pro Ala Leu Pro Phe Tyr Asp Phe Phe Gly
Leu Gln Ser305 310 315
320tcc gga ggc cgc tga
975Ser Gly Gly Arg
SEQ ID NO: 52 (length 324; PRT; Zea mays)
Met Glu Thr Pro Gln Gln Gln Ser Ala
Ala Ala Ala Ala Ala Ala Ala1 5 10
15His Gly Gln Asp Asp Gly Gly Ser Pro Pro Met Ser Pro Ala Ser
Ala 20 25 30Ala Ala Ala Ala
Leu Ala Asn Ala Arg Trp Asn Pro Thr Lys Glu Gln 35
40 45Val Ala Val Leu Glu Gly Leu Tyr Glu His Gly Leu
Arg Thr Pro Ser 50 55 60Ala Glu Gln
Ile Gln Gln Ile Thr Gly Arg Leu Arg Glu His Gly Ala65 70
75 80Ile Glu Gly Lys Asn Val Phe Tyr
Trp Phe Gln Asn His Lys Ala Arg 85 90
95Gln Arg Gln Arg Gln Lys Gln Asp Ser Phe Ala Tyr Phe Ser
Arg Leu 100 105 110Leu Arg Arg
Pro Pro Pro Leu Pro Val Leu Ser Met Pro Pro Ala Pro 115
120 125Pro Tyr His His Ala Arg Val Pro Ala Pro Pro
Ala Ile Pro Met Pro 130 135 140Met Ala
Pro Pro Pro Pro Ala Ala Cys Asn Asp Asn Gly Gly Ala Arg145
150 155 160Val Ile Tyr Arg Asn Pro Phe
Tyr Val Ala Ala Pro Gln Ala Pro Pro 165
170 175Ala Asn Ala Ala Tyr Tyr Tyr Pro Gln Pro Gln Gln
Gln Gln Gln Gln 180 185 190Gln
Val Thr Val Met Tyr Gln Tyr Pro Arg Met Glu Val Ala Gly Gln 195
200 205Asp Lys Met Met Thr Arg Ala Ala Ala
His Gln Gln Gln Gln His Asn 210 215
220Gly Ala Gly Gln Gln Pro Gly Arg Ala Gly His Pro Ser Arg Glu Thr225
230 235 240Leu Gln Leu Phe
Pro Leu Gln Pro Thr Phe Val Leu Arg His Asp Lys 245
250 255Gly Arg Ala Ala Asn Gly Ser Asn Asn Asp
Ser Leu Thr Ser Thr Ser 260 265
270Thr Ala Thr Ala Thr Ala Thr Ala Thr Ala Thr Ala Ser Ala Ser Ile
275 280 285Ser Glu Asp Ser Asp Gly Leu
Glu Ser Gly Ser Ser Gly Lys Gly Val 290 295
300Glu Glu Ala Pro Ala Leu Pro Phe Tyr Asp Phe Phe Gly Leu Gln
Ser305 310 315 320Ser Gly
Gly Arg
SEQ ID NO: 53 (length 30; DNA; Artificial Sequence; FRT12)
agttcctata ctctatgtag aataggaact 30
SEQ ID NO: 54 (length 665; DNA; Artificial Sequence; Promoter construct comprising Zea mays rab17 promoter and attB1 site)
ctatagtatt ttaaaattgc attaacaaac atgtcctaat tggtactcct
gagatactat 60accctcctgt tttaaaatag ttggcattat cgaattatca ttttactttt
taatgttttc 120tcttctttta atatatttta tgaattttaa tgtattttaa aatgttatgc
agttcgctct 180ggacttttct gctgcgccta cacttgggtg tactgggcct aaattcagcc
tgaccgaccg 240cctgcattga ataatggatg agcaccggta aaatccgcgt acccaacttt
cgagaagaac 300cgagacgtgg cgggccgggc caccgacgca cggcaccagc gactgcacac
gtcccgccgg 360cgtacgtgta cgtgctgttc cctcactggc cgcccaatcc actcatgcat
gcccacgtac 420acccctgccg tggcgcgccc agatcctaat cctttcgccg ttctgcactt
ctgctgccta 480taaatggcgg catcgaccgt cacctgcttc accaccggcg agccacatcg
agaacacgat 540cgagcacaca agcacgaaga ctcgtttagg agaaaccaca aaccaccaag
ccgtgcaagc 600accaagcttg gtcacccggt ccgggcctag aaggccagct tcaagtttgt
acaaaaaagc 660aggct
665
SEQ ID NO: 55 (length 961; DNA; Zea mays)
gatccgattg actatctcat tcctccaaac
ccaaacacct caaatatatc tgctatcggg 60attggcattc ctgtatccct acgcccgtgt
accccctgtt tagagaacct cccaaggtat 120aagatggcga agattattgt tgtcttgtct
ttcatcatat atcgagtctt tccctaggat 180attattattg gcaatgagca ttacacggtt
aatcgattga gagaacatgc atctcacctt 240cagcaaataa ttacgataat ccatatttta
cgcttcgtaa cttctcatga gtttcgatat 300acaaatttgt tttctggaca ccctaccatt
catcctcttc ggagaagaga ggaagtgtcc 360tcaatttaaa tatgttgtca tgctgtagtt
cttcacccaa tctcaacagg taccaagcac 420attgtttcca caaattatat tttagtcaca
ataaatctat attattatta atatactaaa 480actatactga cgctcagatg cttttactag
ttcttgctag tatgtgatgt aggtctacgt 540ggaccagaaa atagtgagac acggaagaca
aaagaagtaa aagaggcccg gactacggcc 600cacatgagat tcggccccgc cacctccggc
aaccagcggc cgatccaacg gaagtgcgcg 660cacacacaca acctcgtata tatcgccgcg
cggaagcggc gcgaccgagg aagccttgtc 720ctcgacaccc cctacacagg tgtcgcgctg
cccccgacac gagtcccgca tgcgtcccac 780gcggccgcgc cagatcccgc ctccgcgcgt
tgccacgccc tctataaaca cccagctctc 840cctcgccctc atctacctca ctcgtagtcg
tagctcaagc atcagcggca gcggcagcgg 900caggagctct gggcagcgtg cgcacgtggg
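gtacctagct cgctctgcta gcctacctta 960a 961
SEQ ID NO: 56 (length: 22; type: DNA; organism: Artificial Sequence; description: Homing endonuclease target site)
atatacctca cacgtacgcg ta 22
SEQ ID NO: 57 (length: 909; type: DNA; organism: Zea mays; feature: CDS (1)...(909))
atg gcg gcc aat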
gcg ggc ggc ggt gga gcg gga gga ggc agc ggc agc 48Met Ala Ala Asn Ala
Gly Gly Gly Gly Ala Gly Gly Gly Ser Gly Ser 1 5
10 15ggc agc gtg gct gcg ccg gcg gtg tgc cgc ccc
agc ggc tcg cgg tgg 96Gly Ser Val Ala Ala Pro Ala Val Cys Arg Pro Ser
Gly Ser Arg Trp 20 25 30acg
ccg acg ccg gag cag atc agg atg ctg aag gag ctc tac tac ggc 144Thr Pro
Thr Pro Glu Gln Ile Arg Met Leu Lys Glu Leu Tyr Tyr Gly 35
40 45tgc ggc atc cgg tcg ccc agc tcg gag cag
atc cag cgc atc acc gcc 192Cys Gly Ile Arg Ser Pro Ser Ser Glu Gln Ile
Gln Arg Ile Thr Ala 50 55 60atg ctg
cgg cag cac ggc aag atc gag ggc aag aac gtc ttc tac tgg 240Met Leu Arg
Gln His Gly Lys Ile Glu Gly Lys Asn Val Phe Tyr Trp 65
70 75 80ttc cag aac cac aag gcc cgc gag
cgc cag aag cgc cgc ctc acc agc 288Phe Gln Asn His Lys Ala Arg Glu Arg
Gln Lys Arg Arg Leu Thr Ser 85 90
95ctc gac gtc aac gtg ccc gcc gcc ggc gcg gcc gac gcc acc acc
agc 336Leu Asp Val Asn Val Pro Ala Ala Gly Ala Ala Asp Ala Thr Thr Ser
100 105 110caa ctc ggc gtc ctc
tcg ctg tcg tcg ccg ccg cct tca ggc gcg gcg 384Gln Leu Gly Val Leu Ser
Leu Ser Ser Pro Pro Pro Ser Gly Ala Ala 115 120
125cct ccc tcg ccc acc ctc ggc ttc tac gcc gcc ggc aat ggc
ggc gga 432Pro Pro Ser Pro Thr Leu Gly Phe Tyr Ala Ala Gly Asn Gly Gly
Gly 130 135 140tcg gct gtg ctg ctg gac
acg agt tcc gac tgg ggc agc agc ggc gct 480Ser Ala Val Leu Leu Asp Thr
Ser Ser Asp Trp Gly Ser Ser Gly Ala145 150
155 160gcc atg gcc acc gag aca tgc ttc ctg cag gac tac
atg ggc gtg acg 528Ala Met Ala Thr Glu Thr Cys Phe Leu Gln Asp Tyr Met
Gly Val Thr 165 170 175gac
acg ggc agc tcg tcg cag tgg cca cgc ttc tcg tcg tcg gac acg 576Asp Thr
Gly Ser Ser Ser Gln Trp Pro Arg Phe Ser Ser Ser Asp Thr 180
185 190ata atg gcg gcg gcc gcg gcg cgg gcg
gcg acg acg cgg gcg ccc gag 624Ile Met Ala Ala Ala Ala Ala Arg Ala Ala
Thr Thr Arg Ala Pro Glu 195 200
205acg ctc cct ctc ttc ccg acc tgc ggc gac gac ggc ggc agc ggt agc
672Thr Leu Pro Leu Phe Pro Thr Cys Gly Asp Asp Gly Gly Ser Gly Ser 210
215 220agc agc tac ttg ccg ttc tgg ggt
gcc gcg tcc aca act gcc ggc gcc 720Ser Ser Tyr Leu Pro Phe Trp Gly Ala
Ala Ser Thr Thr Ala Gly Ala225 230 235
240act tct tcc gtt gcg atc cag cag caa cac cag ctg cag gag
cag tac 768Thr Ser Ser Val Ala Ile Gln Gln Gln His Gln Leu Gln Glu Gln
Tyr 245 250 255agc ttt tac
agc aac agc aac agc acc cag ctg gcc ggc acc ggc aac 816Ser Phe Tyr Ser
Asn Ser Asn Ser Thr Gln Leu Ala Gly Thr Gly Asn 260
265 270caa gac gta tcg gca aca gca gca gca gcc gcc
gcc ctg gag ctg agc 864Gln Asp Val Ser Ala Thr Ala Ala Ala Ala Ala Ala
Leu Glu Leu Ser 275 280 285ctc agc
tca tgg tgc tcc cct tac cct gct gca ggg agt atg tga 909Leu Ser Ser
Trp Cys Ser Pro Tyr Pro Ala Ala Gly Ser Met 290 295 300
SEQ ID NO: 58 (length: 302; type: PRT; organism: Zea mays)
Met Ala Ala Asn Ala Gly Gly Gly Gly
Ala Gly Gly Gly Ser Gly Ser1 5 10
15Gly Ser Val Ala Ala Pro Ala Val Cys Arg Pro Ser Gly Ser Arg
Trp 20 25 30Thr Pro Thr Pro
Glu Gln Ile Arg Met Leu Lys Glu Leu Tyr Tyr Gly 35
40 45Cys Gly Ile Arg Ser Pro Ser Ser Glu Gln Ile Gln
Arg Ile Thr Ala 50 55 60Met Leu Arg
Gln His Gly Lys Ile Glu Gly Lys Asn Val Phe Tyr Trp65 70
75 80Phe Gln Asn His Lys Ala Arg Glu
Arg Gln Lys Arg Arg Leu Thr Ser 85 90
95Leu Asp Val Asn Val Pro Ala Ala Gly Ala Ala Asp Ala Thr
Thr Ser 100 105 110Gln Leu Gly
Val Leu Ser Leu Ser Ser Pro Pro Pro Ser Gly Ala Ala 115
120 125Pro Pro Ser Pro Thr Leu Gly Phe Tyr Ala Ala
Gly Asn Gly Gly Gly 130 135 140Ser Ala
Val Leu Leu Asp Thr Ser Ser Asp Trp Gly Ser Ser Gly Ala145
150 155 160Ala Met Ala Thr Glu Thr Cys
Phe Leu Gln Asp Tyr Met Gly Val Thr 165
170 175Asp Thr Gly Ser Ser Ser Gln Trp Pro Arg Phe Ser
Ser Ser Asp Thr 180 185 190Ile
Met Ala Ala Ala Ala Ala Arg Ala Ala Thr Thr Arg Ala Pro Glu 195
200 205Thr Leu Pro Leu Phe Pro Thr Cys Gly
Asp Asp Gly Gly Ser Gly Ser 210 215
220Ser Ser Tyr Leu Pro Phe Trp Gly Ala Ala Ser Thr Thr Ala Gly Ala225
230 235 240Thr Ser Ser Val
Ala Ile Gln Gln Gln His Gln Leu Gln Glu Gln Tyr 245
250 255Ser Phe Tyr Ser Asn Ser Asn Ser Thr Gln
Leu Ala Gly Thr Gly Asn 260 265
270Gln Asp Val Ser Ala Thr Ala Ala Ala Ala Ala Ala Leu Glu Leu Ser
275 280 285Leu Ser Ser Trp Cys Ser Pro
Tyr Pro Ala Ala Gly Ser Met 290 295 300
SEQ ID NO: 59 (length: 2260; type: DNA; organism: Zea mays)
cttccctaac ctttgcactg tccaaaatgg cttcctgatc
ccctcacttc ctcgaatcaa 60tctaagaaga aactcaagcc gcaaccatta ggggcagatt
aattgctgca ctttcagata 120atcaaccatg gccactgtga acaactggct cgctttctcc
ctctccccgc aggagctgcc 180gccctcccag acgacggact ccacactcat ctcggccgcc
accgccgacc atgtctccgg 240cgatgtctgc ttcaacatcc cccaagattg gagcatgagg
ggatcagagc tttcggcgct 300cgtcgcggag ccgaagctgg aggacttcct cggcggcatc
tccttctccg agcagcatca 360caaggccaac tgcaacatga tacccagcac tagcagcaca
gtttgctacg cgagctcagg 420tgctagcacc ggctaccatc accagctgta ccaccagccc
accagctcag cgctccactt 480cgcggactcc gtaatggtgg cctcctcggc cggtgtccac
gacggcggtg ccatgctcag 540cgcggccgcc gctaacggtg tcgctggcgc tgccagtgcc
aacggcggcg gcatcgggct 600gtccatgatt aagaactggc tgcggagcca accggcgccc
atgcagccga gggtggcggc 660ggctgagggc gcgcaggggc tctctttgtc catgaacatg
gcggggacga cccaaggcgc 720tgctggcatg ccacttctcg ctggagagcg cgcacgggcg
cccgagagtg tatcgacgtc 780agcacagggt ggagccgtcg tcgtcacggc gccgaaggag
gatagcggtg gcagcggtgt 840tgccggcgct ctagtagccg tgagcacgga cacgggtggc
agcggcggcg cgtcggctga 900caacacggca aggaagacgg tggacacgtt cgggcagcgc
acgtcgattt accgtggcgt 960gacaaggcat agatggactg ggagatatga ggcacatctt
tgggataaca gttgcagaag 1020ggaagggcaa actcgtaagg gtcgtcaagt ctatttaggt
ggctatgata aagaggagaa 1080agctgctagg gcttatgatc ttgctgctct gaagtactgg
ggtgccacaa caacaacaaa 1140ttttccagtg agtaactacg aaaaggagct cgaggacatg
aagcacatga caaggcagga 1200gtttgtagcg tctctgagaa ggaagagcag tggtttctcc
agaggtgcat ccatttacag 1260gggagtgact aggcatcacc aacatggaag atggcaagca
cggattggac gagttgcagg 1320gaacaaggat ctttacttgg gcaccttcag cacccaggag
gaggcagcgg aggcgtacga 1380catcgcggcg atcaagttcc gcggcctcaa cgccgtcacc
aacttcgaca tgagccgcta 1440cgacgtgaag agcatcctgg acagcagcgc cctccccatc
ggcagcgccg ccaagcgcct 1500caaggaggcc gaggccgcag cgtccgcgca gcaccaccac
gccggcgtgg tgagctacga 1560cgtcggccgc atcgcctcgc agctcggcga cggcggagcc
ctggcggcgg cgtacggcgc 1620gcactaccac ggcgccgcct ggccgaccat cgcgttccag
ccgggcgccg ccagcacagg 1680cctgtaccac ccgtacgcgc agcagccaat gcgcggcggc
gggtggtgca agcaggagca 1740ggaccacgcg gtgatcgcgg ccgcgcacag cctgcaggac
ctccaccacc tgaacctggg 1800cgcggccggc gcgcacgact ttttctcggc agggcagcag
gccgccgccg ctgcgatgca 1860cggcctgggt agcatcgaca gtgcgtcgct cgagcacagc
accggctcca actccgtcgt 1920ctacaacggc ggggtcggcg acagcaacgg cgccagcgcc
gtcggcggca gtggcggtgg 1980ctacatgatg ccgatgagcg ctgccggagc aaccactaca
tcggcaatgg tgagccacga 2040gcaggtgcat gcacgggcct acgacgaagc caagcaggct
gctcagatgg ggtacgagag 2100ctacctggtg aacgcggaga acaatggtgg cggaaggatg
tctgcatggg ggactgtcgt 2160gtctgcagcc gcggcggcag cagcaagcag caacgacaac
atggccgccg acgtcggcca 2220tggcggcgcg cagctcttca gtgtctggaa cgacacttaa 2260
SEQ ID NO: 60 (length: 3766; type: DNA; organism: Sorghum bicolor)
atggctactg
tgaacaactg gctcgctttc tccctctccc cgcaggagct gccgcccacc 60cagacggact
ccaccctcat ctctgccgcc accaccgacg atgtctccgg cgatgtctgc 120ttcaacatcc
cccaaggtat gcatctatcg atcgatatat gtacgtacag tgcgcatata 180tatatatatc
tgcagtttgt ggtacgaata ctgattgaag ctagcatgaa atgtcgtttg 240ttctttcaga
ttggagcatg aggggatccg agctttcggc gctcgtcgcc gagccgaagc 300tggaggactt
cctcggcgga atctccttct ccgagcagca ccacaaggcc aactgcaaca 360tgatccccag
cactagcagc acagcttgct acgcgagctc gggtgctacc gccggctacc 420atcaccagct
gtaccaccag cccaccagct ccgcgctcca cttcgctgac tccgtcatgg 480tggcctcctc
ggccggcggc gtccacgacg gaggtgccat gctcagcgcg gccagcgcta 540atggtagcgc
tggcgctggc gctgccagtg ccaatggcag cggcagcatc gggctgtcca 600tgatcaagaa
ctggctgcgg agccaaccag ctcccatgca gccgagggtg gcggcggctg 660agagcgtgca
ggggctctct ttgtccatga acatggcggg ggcgacgcaa ggcgccgctg 720gcatgccact
tcttgctgga gagcgcggcc gggcgcccga gagtgtctcg acgtcggcac 780agggtggagc
cgtcgtcacg gctccaaagg aggatagcgg tggcagcggt gttgccgcca 840ccggcgccct
agtagccgtg agcacggaca cgggtggcag cggcgcgtcg gctgacaaca 900cggcaaggaa
gacggtggac acgttcgggc agcgcacgtc gatttaccgt ggcgtgacaa 960ggtaataagg
gtccggtatt acaatgaatc gtcacttcgt cagagaacta aactagcaca 1020aatcagcaat
gaatcaagta atatcatgaa atttagaaaa gccgttagca atgcaaggag 1080ctatcattat
agatttgatt gcatctagac agttctgaat taaatgagta gggcaatgtg 1140tagcctttga
tgatctcgct gattattagg agtgccattt gtattggcta tgattgtggt 1200atatacagca
gtagacaatt aacaaaaggc taccactttc gaattatttt aggcatagat 1260ggactgggag
atatgaagca catctgtggg acaacagttg cagaagggaa ggacaaactc 1320gcaagggtcg
tcaaggtacc aatataatgc aatacaccgt atttaaatat atatgctttt 1380ctgtaattaa
gtttatactt tcacaaaact gacattactt cgcattatca tttttggatt 1440gtcgtcgtca
tgattggcgg gattgaaatg aactattgaa tctacagtct atttaggtaa 1500gcgatttcac
ttggttatta atttgggacc aactacttaa tccagtttgt ttttccccta 1560taaccattat
tttttcatct gtgttctcaa ctcttacttt tccatcttgt tccactgata 1620ggtggctatg
ataaagagga gaaagctgct agggcttatg atctggctgc tcttaagtac 1680tggggtccca
cgacaacaac aaattttcca gtatgtatat gtagaatgca gttttacttc 1740actgaagatc
atacctttgc tatgtctcaa atgccgttca ttagttagtg gatctgaagt 1800gaaggttctg
taatttttgt taactatgta cattgctgga attgtactta aagtcatttg 1860tttttgtata
tctaggtgaa taactacgaa aaggagctgg aggatatgaa gcacatgaca 1920aggcaggagt
ttgtagcgtc tctgagaagg tcggtcgaac agcattgatt aatcaatgcc 1980aactctattg
aataaacatc tactctgtta attgttaaag tttgagagaa agatctgcat 2040gttagatctt
aatagaccac tgtatatgaa tgcaggaaga gcagtggttt ctccagaggt 2100gcatccattt
acaggggagt gactaggtat gaattcatat aatggcgtca acaaacacac 2160atacactttg
attgaggagg cgaatgcacg catggattga atgtgaatgg tgttttactt 2220gaactatgta
attataggca tcaccagcat ggaagatggc aagcacggat tggacgagtt 2280gcagggaaca
aggatctcta cttgggcacc ttcagtaagt atcagagatg ttttctcatt 2340gtatatagag
gagtacttct atatgtatat atacattcag ttattcacca cacaaaagca 2400aattgcagtc
aactaataac aatctcaacg caatgagaag caagtgttac agctgatagt 2460acacatttgt
agaccttctg catatggatg ttatatatga tgactattaa aaatgtgacc 2520attgcatcaa
gtcatgcaaa gttgcattgc agtagtacat acattactta gtgcatgctc 2580ctcaagtggc
tttttcaaac ctgatcccat gtctggcgct attgttgtct cccattcacc 2640cgtgcatcag
gtcaaaatag tactatgcct caataagaaa cacatgagca tgcactggca 2700gcagcagact
aatcaagttc tatcatttac taataaacta attaggctac agcatccaaa 2760agattctacc
cattaagcca caactgttca tgcatgcatt cataaaccag gataccacca 2820tgcatgcgtg
caccgtgttc gtgcttggaa tattgagctg agccgagtgc acccttgcgt 2880ggatgcaggc
acgcaggagg aggcagcgga ggcatacgac attgcggcga tcaagttccg 2940cggcctcaac
gccgtcacaa acttcgacat gagccgctac gacgtcaaga gcatcctgga 3000cagcagtgcg
ctccccatcg gcagcgccgc caagcgtctc aaggaggccg aggccgccgc 3060gtccgcacag
caccatgccg gcgtggtgag ctacgacgtc ggccgcatag cctcacagct 3120cggcgacggc
ggcgccctgg cggcggcgta cggcgcgcac taccatggcg cctggccgac 3180catcgcgttc
cagccgagcg cggccacggg cctgtaccac ccgtacgcgc agccgatgcg 3240cgggtggtgc
aagcaggagc aggaccacgc ggtgatcgcg gccgcgcaca gcctgcagga 3300gctccaccac
ctgaacctgg gtgctgccgc cggcgcgcac gacttcttct cggcggggca 3360gcaggcggcg
atgcacggcc tgggtagcat ggacaatgca tcactcgagc acagcaccgg 3420ctccaactcc
gtcgtgtaca acggtgttgg tgatagcaac ggcagcaccg tcgtcggcag 3480tggtggctac
atgatgccta tgagcgctgc cacggcgacg gctaccacgg caatggtgag 3540ccacgagcag
gtgcatgcac gggcacaggg tgatcaccac gacgaagcca agcaggctgc 3600tcagatgggg
tacgagagct acctggtgaa cgcagagaac tatggcggcg ggaggatgtc 3660tgcggcctgg
gcgactgtct cagcgccacc ggcggcaagc agcaacgata acatggcgga 3720cgtcggccat
ggcggcgcac agctcttcag tgtctggaac gatact 3766
SEQ ID NO: 61 (length: 530; type: PRT; organism: Glycine max)
Met Asp Ser Ser Ser Ser Ser Pro Pro Asn Ser Thr
Asn Asn Asn Ser1 5 10
15Leu Ala Phe Ser Leu Ser Asn His Phe Pro Asn Pro Ser Ser Ser Pro
20 25 30Leu Ser Leu Phe His Ser Phe
Thr Tyr Pro Ser Leu Ser Leu Thr Gly 35 40
45Ser Asn Thr Val Asp Ala Pro Pro Glu Pro Thr Ala Gly Ala Gly
Pro 50 55 60Thr Asn Leu Ser Ile Phe
Thr Gly Gly Pro Lys Phe Glu Asp Phe Leu65 70
75 80Gly Gly Ser Ala Ala Thr Ala Thr Thr Val Ala
Cys Ala Pro Pro Gln 85 90
95Leu Pro Gln Phe Ser Thr Asp Asn Asn Asn His Leu Tyr Asp Ser Glu
100 105 110Leu Lys Ser Thr Ile Ala
Ala Cys Phe Pro Arg Ala Leu Ala Ala Glu 115 120
125Gln Ser Thr Glu Pro Gln Lys Pro Ser Pro Lys Lys Thr Val
Asp Thr 130 135 140Phe Gly Gln Arg Thr
Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp145 150
155 160Thr Gly Arg Tyr Glu Ala His Leu Trp Asp
Asn Ser Cys Arg Arg Glu 165 170
175Gly Gln Ser Arg Lys Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys
180 185 190Glu Asp Lys Ala Ala
Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp 195
200 205Gly Pro Thr Thr Thr Thr Asn Phe Pro Ile Ser Asn
Tyr Glu Lys Glu 210 215 220Leu Glu Glu
Met Lys Asn Met Thr Arg Gln Glu Phe Val Ala Ser Leu225
230 235 240Arg Arg Lys Ser Ser Gly Phe
Ser Arg Gly Ala Ser Ile Tyr Arg Gly 245
250 255Val Thr Arg His His Gln His Gly Arg Trp Gln Ala
Arg Ile Gly Arg 260 265 270Val
Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe Ser Thr Gln Glu 275
280 285Glu Ala Ala Glu Ala Tyr Asp Ile Ala
Ala Ile Lys Phe Arg Gly Leu 290 295
300Asn Ala Val Thr Asn Phe Asp Met Ser Arg Tyr Asp Val Lys Ser Ile305
310 315 320Ala Asn Ser Thr
Leu Pro Ile Gly Gly Leu Ser Gly Lys Asn Lys Asn 325
330 335Ser Thr Asp Ser Ala Ser Glu Ser Lys Ser
His Glu Pro Ser Gln Ser 340 345
350Asp Gly Asp Pro Ser Ser Ala Ser Ser Val Thr Phe Ala Ser Gln Gln
355 360 365Gln Pro Ser Ser Ser Asn Leu
Ser Phe Ala Ile Pro Ile Lys Gln Asp 370 375
380Pro Ser Asp Tyr Trp Ser Ile Leu Gly Tyr His Asn Thr Pro Leu
Asp385 390 395 400Asn Ser
Gly Ile Arg Asn Thr Thr Ser Thr Val Thr Thr Thr Thr Phe
405 410 415Pro Ser Ser Asn Asn Gly Thr
Ala Ser Ser Leu Thr Pro Phe Asn Met 420 425
430Glu Phe Ser Ser Ala Pro Ser Ser Thr Gly Ser Asp Asn Asn
Ala Ala 435 440 445Phe Phe Ser Gly
Gly Gly Ile Phe Val Gln Gln Gln Thr Ser His Gly 450
455 460His Gly Asn Ala Ser Ser Gly Ser Ser Ser Ser Ser
Leu Ser Cys Ser465 470 475
480Ile Pro Phe Ala Thr Pro Ile Phe Ser Leu Asn Ser Asn Thr Ser Tyr
485 490 495Glu Ser Ser Ala Gly
Tyr Gly Asn Trp Ile Gly Pro Thr Leu His Thr 500
505 510Phe Gln Ser His Ala Lys Pro Ser Leu Phe Gln Thr
Pro Ile Phe Gly 515 520 525Met Glu 530
SEQ ID NO: 62 (length: 528; type: PRT; organism: Glycine max)
Met Asp Ser Cys Ser Ser Pro Pro Asn Asn Asn
Ser Leu Ala Phe Ser1 5 10
15Leu Ser Asn His Phe Pro Asn Pro Ser Ser Ser Pro Leu Ser Leu Phe
20 25 30His Ser Phe Thr Tyr Pro Ser
Leu Ser Leu Thr Gly Ser His Thr Ala 35 40
45Asp Ala Pro Pro Glu Pro Ile Ala Gly Gly Gly Ala Thr Asn Leu
Ser 50 55 60Ile Phe Thr Gly Ala Pro
Lys Phe Glu Asp Phe Leu Gly Gly Ser Ser65 70
75 80Ala Thr Ala Thr Ala Thr Thr Cys Ala Pro Pro
Gln Leu Pro Gln Phe 85 90
95Ser Thr Asp Asn Asn Asn His Leu Tyr Asp Ser Glu Leu Lys Thr Thr
100 105 110Ile Ala Ala Cys Phe Pro
Arg Ala Phe Ala Ala Glu Pro Thr Thr Glu 115 120
125Pro Gln Lys Pro Ser Pro Lys Lys Thr Val Asp Thr Phe Gly
Gln Arg 130 135 140Thr Ser Ile Tyr Arg
Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr145 150
155 160Glu Ala His Leu Trp Asp Asn Ser Cys Arg
Arg Glu Gly Gln Ser Arg 165 170
175Lys Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu Asp Lys Ala
180 185 190Ala Arg Ala Tyr Asp
Leu Ala Ala Leu Lys Tyr Trp Gly Pro Thr Thr 195
200 205Thr Thr Asn Phe Pro Ile Ser Asn Tyr Glu Lys Glu
Leu Glu Glu Met 210 215 220Lys Asn Met
Thr Arg Gln Glu Phe Val Ala Ser Leu Arg Arg Lys Ser225
230 235 240Ser Gly Phe Ser Arg Gly Ala
Ser Ile Tyr Arg Gly Val Thr Arg His 245
250 255His Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg
Val Ala Gly Asn 260 265 270Lys
Asp Leu Tyr Leu Gly Thr Phe Ser Thr Gln Glu Glu Ala Ala Glu 275
280 285Ala Tyr Asp Ile Ala Ala Ile Lys Phe
Arg Gly Leu Asn Ala Val Thr 290 295
300Asn Phe Asp Met Ser Arg Tyr Asp Val Lys Ser Ile Ala Asn Ser Thr305
310 315 320Leu Pro Ile Gly
Gly Leu Ser Gly Lys Asn Lys Asn Ser Thr Asp Ser 325
330 335Ala Ser Glu Ser Lys Ser His Glu Ala Ser
Arg Ser Asp Glu Arg Asp 340 345
350Pro Ser Ala Ala Ser Ser Val Thr Phe Ala Ser Gln Gln Gln Pro Ser
355 360 365Ser Ser Thr Leu Ser Phe Ala
Ile Pro Ile Lys Gln Asp Pro Ser Asp 370 375
380Tyr Trp Ser Ile Leu Gly Tyr His Asn Ser Pro Leu Asp Asn Thr
Gly385 390 395 400Ile Arg
Asn Thr Thr Ser Val Thr Ala Thr Ser Phe Pro Ser Ser Asn
405 410 415Asn Gly Thr Thr Ser Ser Leu
Thr Pro Phe His Met Glu Phe Ser Asn 420 425
430Ala Pro Thr Ser Thr Gly Ser Asp Asn Asp Ala Ala Phe Phe
Ser Gly 435 440 445Gly Gly Ile Phe
Val Gln Gln Gln Ser Gly His Gly Asn Gly His Gly 450
455 460Ser Gly Ser Ser Gly Ser Ser Ser Ser Ser Leu Ser
Cys Ser Ile Pro465 470 475
480Phe Ala Thr Pro Ile Phe Ser Leu Asn Ser Asn Thr Ser Tyr Glu Asn
485 490 495Ser Ala Gly Tyr Gly
Asn Trp Ile Gly Pro Thr Leu His Thr Phe Gln 500
505 510Ser His Ala Lys Pro Ser Leu Phe Gln Thr Pro Ile
Phe Gly Met Glu 515 520 525
SEQ ID NO: 63 (length: 488; type: PRT; organism: Zea mays)
Met Asp Met Asp Met Ser Ser Ala Tyr Pro His His Trp
Leu Ser Phe1 5 10 15Ser
Leu Ser Asn Asn Tyr His His Gly Leu Leu Glu Ala Phe Ser Asn 20
25 30Ser Ser Gly Thr Pro Leu Gly Asp
Glu Gln Gly Ala Val Glu Glu Ser 35 40
45Pro Arg Thr Val Glu Asp Phe Leu Gly Gly Val Gly Gly Ala Gly Ala
50 55 60Pro Pro Gln Pro Ala Ala Ala Ala
Asp Gln Asp His Gln Leu Val Cys65 70 75
80Gly Glu Leu Gly Ser Ile Thr Ala Arg Phe Leu Arg His
Tyr Pro Ala 85 90 95Ala
Pro Ala Gly Thr Thr Val Glu Asn Pro Gly Ala Val Thr Val Ala
100 105 110Ala Met Ser Ser Thr Asp Val
Ala Gly Ala Glu Ser Asp Gln Ala Arg 115 120
125Arg Pro Ala Glu Thr Phe Gly Gln Arg Thr Ser Ile Tyr Arg Gly
Val 130 135 140Thr Arg His Arg Trp Thr
Gly Arg Tyr Glu Ala His Leu Trp Asp Asn145 150
155 160Ser Cys Arg Arg Glu Gly Gln Ser Arg Lys Gly
Arg Gln Val Tyr Leu 165 170
175Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg Ala Tyr Asp Leu Ala
180 185 190Ala Leu Lys Tyr Trp Gly
Pro Thr Thr Thr Thr Asn Phe Pro Val Ser 195 200
205Asn Tyr Glu Lys Glu Leu Glu Glu Met Lys Ser Met Thr Arg
Gln Glu 210 215 220Phe Ile Ala Ser Leu
Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala225 230
235 240Ser Ile Tyr Arg Gly Val Thr Arg His His
Gln His Gly Arg Trp Gln 245 250
255Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr
260 265 270Phe Ser Thr Gln Glu
Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala Ile 275
280 285Lys Phe Arg Gly Leu Asn Ala Val Thr Asn Phe Asp
Met Ser Arg Tyr 290 295 300Asp Val Glu
Ser Ile Leu Ser Ser Asp Leu Pro Val Gly Gly Gly Ala305
310 315 320Ser Gly Arg Ala Pro Ala Lys
Phe Pro Leu Asp Ser Leu Gln Pro Gly 325
330 335Ser Ala Ala Ala Met Met Leu Ala Gly Ala Ala Ala
Ala Ser Gln Ala 340 345 350Thr
Met Pro Pro Ser Glu Lys Asp Tyr Trp Ser Leu Leu Ala Leu His 355
360 365Tyr Gln Gln Gln Gln Glu Gln Glu Arg
Gln Phe Pro Ala Ser Ala Tyr 370 375
380Glu Ala Tyr Gly Ser Gly Gly Val Asn Val Asp Phe Thr Met Gly Thr385
390 395 400Ser Ser Gly Asn
Asn Asn Asn Asn Thr Gly Ser Gly Val Met Trp Gly 405
410 415Ala Thr Thr Gly Ala Val Val Val Gly Gln
Gln Asp Ser Ser Gly Lys 420 425
430Gln Gly Asn Gly Tyr Ala Ser Asn Ile Pro Tyr Ala Ala Ala Ala Met
435 440 445Val Ser Gly Ser Ala Gly Tyr
Glu Gly Ser Thr Gly Asp Asn Gly Thr 450 455
460Trp Val Thr Thr Thr Thr Ser Ser Asn Thr Gly Thr Ala Pro His
Tyr465 470 475 480Tyr Asn
Tyr Leu Phe Gly Met Glu 485
SEQ ID NO: 64 (length: 495; type: PRT; organism: Oryza sativa)
Met Asp
Met Asp Thr Ser His His Tyr Pro Trp Leu Asn Phe Ser Leu1 5
10 15Ala His His Cys Glu Met Glu Glu
Glu Glu Arg Gly Ala Ala Ala Glu 20 25
30Leu Ala Ala Ile Ala Gly Ala Ala Pro Pro Pro Lys Leu Glu Asp
Phe 35 40 45Leu Gly Gly Gly Cys
Asn Gly Gly Ser Ser Gly Gly Ala Cys Pro Pro 50 55
60Val Gln Thr Thr Ala Pro Thr Ala Ala Glu Leu Tyr Glu Ser
Glu Leu65 70 75 80Lys
Phe Leu Ala Ala Gly Phe Gln Leu Ser Gly Ala Ala Gly Ala Ala
85 90 95Pro Pro Val Pro Ala Leu Leu
Pro Ala Ala Ala Leu Glu Gln Thr Asp 100 105
110Glu Thr Lys Gln Leu Ala Leu Pro Pro Gln Ala Ala Val Ala
Pro Pro 115 120 125Pro Glu Gln Lys
Lys Ala Val Asp Ser Phe Gly Gln Arg Thr Ser Ile 130
135 140Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg
Tyr Glu Ala His145 150 155
160Leu Trp Asp Asn Ser Cys Arg Arg Glu Gly Gln Ser Arg Lys Gly Arg
165 170 175Gln Val Tyr Leu Gly
Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg Ala 180
185 190Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly Pro Ser
Thr Thr Thr Asn 195 200 205Phe Pro
Val Ala Glu Tyr Glu Lys Glu Leu Glu Glu Met Lys His Met 210
215 220Thr Arg Gln Glu Phe Val Ala Ser Leu Arg Arg
Lys Ser Ser Gly Phe225 230 235
240Ser Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr Arg His His Gln His
245 250 255Gly Arg Trp Gln
Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu 260
265 270Tyr Leu Gly Thr Phe Gly Thr Glu Glu Glu Ala
Ala Glu Ala Tyr Asp 275 280 285Ile
Ala Ala Ile Lys Phe Arg Gly Leu Asn Ala Val Thr Asn Phe Glu 290
295 300Ile Gly Arg Tyr Asn Val Glu Ser Ile Ile
Ser Ser Asn Leu Pro Ile305 310 315
320Gly Ser Met Ala Gly Asn Arg Ser Thr Lys Ala Gly Leu Glu Leu
Ala 325 330 335Pro Ser Ser
Ser Ala Asp Ala Ile Ala Ala Thr Glu Ala Asn His Thr 340
345 350Gly Val Ala Pro Pro Ser Thr Leu Ala Phe
Thr Ala Leu Pro Met Lys 355 360
365Tyr Asp Gln Ala Asp Tyr Leu Ser Tyr Leu Ala Leu Gln His His Gln 370
375 380Gln Gly Asn Leu Gln Gly Leu Gly
Phe Gly Leu Tyr Ser Ser Gly Val385 390
395 400Asn Leu Asp Phe Ala Asn Ala Asn Gly Asn Gly Ala
Met Ser Asn Cys 405 410
415Tyr Thr Asn Val Ser Leu His Glu Gln Gln Gln Gln His Gln His Gln
420 425 430His Gln Gln Glu Gln Gln
Gln Asp Gln Gln Asp Asp Gln Ser Gln Ser 435 440
445Ser Asn Asn Ser Cys Gly Ser Ile Pro Phe Ala Thr Pro Ile
Ala Phe 450 455 460Ser Gly Ser Tyr Glu
Ser Ser Met Thr Ala Ala Gly Thr Phe Gly Tyr465 470
475 480Tyr Pro Asn Val Ala Ala Phe Gln Thr Pro
Ile Phe Gly Met Glu 485 490 495
SEQ ID NO: 65 (length: 558; type: PRT; organism: Arabidopsis thaliana)
Met Lys Asn Asn Asn Asn Lys Ser Ser
Ser Ser Ser Ser Tyr Asp Ser1 5 10
15Ser Leu Ser Pro Ser Ser Ser Ser Ser Ser His Gln Asn Trp Leu
Ser 20 25 30Phe Ser Leu Ser
Asn Asn Asn Asn Asn Phe Asn Ser Ser Ser Asn Pro 35
40 45Asn Leu Thr Ser Ser Thr Ser Asp His His His Pro
His Pro Ser His 50 55 60Leu Ser Leu
Phe Gln Ala Phe Ser Thr Ser Pro Val Glu Arg Gln Asp65 70
75 80Gly Ser Pro Gly Val Ser Pro Ser
Asp Ala Thr Ala Val Leu Ser Val 85 90
95Tyr Pro Gly Gly Pro Lys Leu Glu Asn Phe Leu Gly Gly Gly
Ala Ser 100 105 110Thr Thr Thr
Thr Arg Pro Met Gln Gln Val Gln Ser Leu Gly Gly Val 115
120 125Val Phe Ser Ser Asp Leu Gln Pro Pro Leu His
Pro Pro Ser Ala Ala 130 135 140Glu Ile
Tyr Asp Ser Glu Leu Lys Ser Ile Ala Ala Ser Phe Leu Gly145
150 155 160Asn Tyr Ser Gly Gly His Ser
Ser Glu Val Ser Ser Val His Lys Gln 165
170 175Gln Pro Asn Pro Leu Ala Val Ser Glu Ala Ser Pro
Thr Pro Lys Lys 180 185 190Asn
Val Glu Ser Phe Gly Gln Arg Thr Ser Ile Tyr Arg Gly Val Thr 195
200 205Arg His Arg Trp Thr Gly Arg Tyr Glu
Ala His Leu Trp Asp Asn Ser 210 215
220Cys Arg Arg Glu Gly Gln Ser Arg Lys Gly Arg Gln Val Tyr Leu Gly225
230 235 240Gly Tyr Asp Lys
Glu Asp Lys Ala Ala Arg Ala Tyr Asp Leu Ala Ala 245
250 255Leu Lys Tyr Trp Gly Pro Thr Thr Thr Thr
Asn Phe Pro Ile Ser Asn 260 265
270Tyr Glu Ser Glu Leu Glu Glu Met Lys His Met Thr Arg Gln Glu Phe
275 280 285Val Ala Ser Leu Arg Arg Lys
Ser Ser Gly Phe Ser Arg Gly Ala Ser 290 295
300Met Tyr Arg Gly Val Thr Arg His His Gln His Gly Arg Trp Gln
Ala305 310 315 320Arg Ile
Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe
325 330 335Ser Thr Gln Glu Glu Ala Ala
Glu Ala Tyr Asp Ile Ala Ala Ile Lys 340 345
350Phe Arg Gly Leu Asn Ala Val Thr Asn Phe Asp Ile Ser Arg
Tyr Asp 355 360 365Val Lys Ser Ile
Ala Ser Cys Asn Leu Pro Val Gly Gly Leu Met Pro 370
375 380Lys Pro Ser Pro Ala Thr Ala Ala Ala Asp Lys Thr
Val Asp Leu Ser385 390 395
400Pro Ser Asp Ser Pro Ser Leu Thr Thr Pro Ser Leu Thr Phe Asn Val
405 410 415Ala Thr Pro Val Asn
Asp His Gly Gly Thr Phe Tyr His Thr Gly Ile 420
425 430Pro Ile Lys Pro Asp Pro Ala Asp His Tyr Trp Ser
Asn Ile Phe Gly 435 440 445Phe Gln
Ala Asn Pro Lys Ala Glu Met Arg Pro Leu Ala Asn Phe Gly 450
455 460Ser Asp Leu His Asn Pro Ser Pro Gly Tyr Ala
Ile Met Pro Val Met465 470 475
480Gln Glu Gly Glu Asn Asn Phe Gly Gly Ser Phe Val Gly Ser Asp Gly
485 490 495Tyr Asn Asn His
Ser Ala Ala Ser Asn Pro Val Ser Ala Ile Pro Leu 500
505 510Ser Ser Thr Thr Thr Met Ser Asn Gly Asn Glu
Gly Tyr Gly Gly Asn 515 520 525Ile
Asn Trp Ile Asn Asn Asn Ile Ser Ser Ser Tyr Gln Thr Ala Lys 530
535 540Ser Asn Leu Ser Val Leu His Thr Pro Val
Phe Gly Leu Glu545 550 555
SEQ ID NO: 66 (length: 568; type: PRT; organism: Arabidopsis thaliana)
Met Asn Ser Asn Asn Trp Leu Ala Phe Pro
Leu Ser Pro Thr His Ser1 5 10
15Ser Leu Pro Pro His Ile His Ser Ser Gln Asn Ser His Phe Asn Leu
20 25 30Gly Leu Val Asn Asp Asn
Ile Asp Asn Pro Phe Gln Asn Gln Gly Trp 35 40
45Asn Met Ile Asn Pro His Gly Gly Gly Gly Glu Gly Gly Glu
Val Pro 50 55 60Lys Val Ala Asp Phe
Leu Gly Val Ser Lys Ser Gly Asp His His Thr65 70
75 80Asp His Asn Leu Val Pro Tyr Asn Asp Ile
His Gln Thr Asn Ala Ser 85 90
95Asp Tyr Tyr Phe Gln Thr Asn Ser Leu Leu Pro Thr Val Val Thr Cys
100 105 110Ala Ser Asn Ala Pro
Asn Asn Tyr Glu Leu Gln Glu Ser Ala His Asn 115
120 125Leu Gln Ser Leu Thr Leu Ser Met Gly Ser Thr Gly
Ala Ala Ala Ala 130 135 140Glu Val Ala
Thr Val Lys Ala Ser Pro Ala Glu Thr Ser Ala Asp Asn145
150 155 160Ser Ser Ser Thr Thr Asn Thr
Ser Gly Gly Ala Ile Val Glu Ala Thr 165
170 175Pro Arg Arg Thr Leu Glu Thr Phe Gly Gln Arg Thr
Ser Ile Tyr Arg 180 185 190Gly
Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu Trp 195
200 205Asp Asn Ser Cys Arg Arg Glu Gly Gln
Ser Arg Lys Gly Arg Gln Val 210 215
220Tyr Leu Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg Ala Tyr Asp225
230 235 240Leu Ala Ala Leu
Lys Tyr Trp Gly Pro Ser Thr Thr Thr Asn Phe Pro 245
250 255Ile Thr Asn Tyr Glu Lys Glu Val Glu Glu
Met Lys Asn Met Thr Arg 260 265
270Gln Glu Phe Val Ala Ser Ile Arg Arg Lys Ser Ser Gly Phe Ser Arg
275 280 285Gly Ala Ser Met Tyr Arg Gly
Val Thr Arg His His Gln His Gly Arg 290 295
300Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr
Leu305 310 315 320Gly Thr
Phe Ser Thr Glu Glu Glu Ala Ala Glu Ala Tyr Asp Ile Ala
325 330 335Ala Ile Lys Phe Arg Gly Leu
Asn Ala Val Thr Asn Phe Glu Ile Asn 340 345
350Arg Tyr Asp Val Lys Ala Ile Leu Glu Ser Asn Thr Leu Pro
Ile Gly 355 360 365Gly Gly Ala Ala
Lys Arg Leu Lys Glu Ala Gln Ala Leu Glu Ser Ser 370
375 380Arg Lys Arg Glu Glu Met Ile Ala Leu Gly Ser Asn
Phe His Gln Tyr385 390 395
400Gly Ala Ala Ser Gly Ser Ser Ser Val Ala Ser Ser Ser Arg Leu Gln
405 410 415Leu Gln Pro Tyr Pro
Leu Ser Ile Gln Gln Pro Phe Glu His Leu His 420
425 430His His Gln Pro Leu Leu Thr Leu Gln Asn Asn Asn
Asp Ile Ser Gln 435 440 445Tyr His
Asp Ser Phe Ser Tyr Ile Gln Thr Gln Leu His Leu His Gln 450
455 460Gln Gln Thr Asn Asn Tyr Leu Gln Ser Ser Ser
His Thr Ser Gln Leu465 470 475
480Tyr Asn Ala Tyr Leu Gln Ser Asn Pro Gly Leu Leu His Gly Phe Val
485 490 495Ser Asp Asn Asn
Asn Thr Ser Gly Phe Leu Gly Asn Asn Gly Ile Gly 500
505 510Ile Gly Ser Ser Ser Thr Val Gly Ser Ser Ala
Glu Glu Glu Phe Pro 515 520 525Ala
Val Lys Val Asp Tyr Asp Met Pro Pro Ser Gly Gly Ala Thr Gly 530
535 540Tyr Gly Gly Trp Asn Ser Gly Glu Ser Ala
Gln Gly Ser Asn Pro Gly545 550 555
560Gly Val Phe Thr Met Trp Asn Glu 565
SEQ ID NO: 67 (length: 474; type: PRT; organism: Sorghum bicolor)
Met Asp Met Asp Met Ser Ser Ala Tyr Pro His
His Trp Leu Ser Phe1 5 10
15Ser Leu Ser Asn Asn Tyr His His Gly Leu Leu Glu Ala Phe Ser Asn
20 25 30Ser Ser Ser Ala Ala Pro Leu
Gly Asp Glu Gln Gly Thr Val Glu Glu 35 40
45Ser Pro Lys Met Val Glu Asp Phe Leu Gly Gly Val Gly Gly Ala
Gly 50 55 60Ala Pro Pro Ala Ala Ala
Thr Ala Ala Glu Asp His Gln Leu Val Cys65 70
75 80Gly Glu Leu Gly Ser Ile Thr Ala Gly Phe Leu
Arg His Tyr Pro Ala 85 90
95Pro Gly Thr Thr Val Glu Asn Pro Gly Ala Val Thr Val Ala Ala Met
100 105 110Ser Thr Asp Val Ala Glu
Ser Asp Gln Ala Arg Arg Pro Ala Glu Thr 115 120
125Phe Gly Gln Arg Thr Ser Ile Tyr Arg Gly Val Thr Arg His
Arg Trp 130 135 140Thr Gly Arg Tyr Glu
Ala His Leu Trp Asp Asn Ser Cys Arg Arg Glu145 150
155 160Gly Gln Ser Arg Lys Gly Arg Gln Val Tyr
Leu Gly Gly Tyr Asp Lys 165 170
175Glu Glu Lys Ala Ala Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp
180 185 190Gly Ala Thr Thr Thr
Thr Asn Phe Pro Val Ser Asn Tyr Glu Lys Glu 195
200 205Leu Glu Glu Met Lys Ser Met Thr Arg Gln Glu Phe
Ile Ala Ser Leu 210 215 220Arg Arg Lys
Ser Ser Gly Phe Ser Arg Gly Ala Ser Ile Tyr Arg Gly225
230 235 240Val Thr Arg His His Gln His
Gly Arg Trp Gln Ala Arg Ile Gly Arg 245
250 255Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe
Ser Thr Gln Glu 260 265 270Glu
Ala Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly Leu 275
280 285Asn Ala Val Thr Asn Phe Asp Met Ser
Arg Tyr Asp Val Asp Ser Ile 290 295
300Leu Asn Ser Asp Leu Pro Val Gly Gly Gly Ala Ala Gly Arg Ala Ser305
310 315 320Lys Phe Pro Leu
Asp Ser Leu Gln Pro Gly Ser Ala Ala Ala Met Ile 325
330 335Ala Gly Ala Ala Ser Gln Ala Met Pro Pro
Ser Glu Lys Asp Tyr Trp 340 345
350Ser Leu Leu Ala Leu His Tyr Gln Gln Gln Gln Gln Gln Gln Gln Phe
355 360 365Pro Ala Ser Ala Tyr Glu Ala
Tyr Gly Ser Gly Val Asn Val Asp Phe 370 375
380Thr Met Gly Thr Ser Ser His Ser Ser Ser Asn Thr Gly Ser Gly
Val385 390 395 400Met Trp
Gly Thr Thr Thr Gly Ala Met Gly Gln Gln Asp Ser Ser Ser
405 410 415Ser Lys Gln Gly Asn Gly Tyr
Ala Ser Asn Ile Pro Tyr Ala Ala Ala 420 425
430Ala Ala Ala Met Val Ser Gly Ser Ala Gly Tyr Glu Gly Ser
Thr Gly 435 440 445Asn Asn Gly Thr
Trp Val Thr Ser Ser Thr Ser Thr Ser Thr Ala Pro 450
455 460Gln Tyr Tyr Asn Tyr Leu Phe Gly Met Glu465 470
SEQ ID NO: 68 (length: 549; type: PRT; organism: Oryza sativa)
Met Asp Met Asn Ser Gly Trp Leu Gly Phe
Ser Leu Ser Ser Ser Ser1 5 10
15Ala Arg Gly Tyr Gly Asp Gly Cys Gly Glu Gly Asn Gly Gly Gly Asp
20 25 30Gly Asp Gly Ser Cys Ser
Ser Pro Val Ala Ala Ser Pro Leu Val Ala 35 40
45Met Pro Leu His Ser Asp Gly Ser Val His Tyr Asp Ala Pro
Asp Trp 50 55 60Arg His Ala Glu Ala
Lys Asp Pro Lys Leu Glu Asp Phe Met Ser Val65 70
75 80Ser Tyr Ser Asn Lys Ser Ser Ser Asn Leu
Tyr Gly Ser Ser Ser Ser 85 90
95Ser Ser Cys Gly His Ala Asp Gln Ile Lys Tyr His His Val His Asp
100 105 110Val Gln Ala Phe Ser
Thr Pro Tyr Phe Tyr Gly His Gly Gly Ser Gly 115
120 125Val Gly Ile Asp Ile Asn Met Asn Ala Pro Pro Ala
Gly Cys Thr Gly 130 135 140Val Leu Pro
Asp His Arg Pro Pro Pro Pro Gln Gln Asp His Ile Phe145
150 155 160Leu Pro Pro His Gly Gln Tyr
Phe Leu Gly Pro Pro Asn Pro Met Ala 165
170 175Pro Ala Pro Met Tyr Asn Ala Gly Gly Gly Gly Gly
Gly Val Val Asp 180 185 190Gly
Ser Met Ser Ile Ser Gly Ile Lys Ser Trp Leu Arg Gln Ala Met 195
200 205Tyr Val Pro Glu Arg Ser Ala Ala Ala
Leu Ser Leu Ser Val Pro Ala 210 215
220Ala Pro Pro Ser Glu Ala Pro Leu Pro Pro Ala Ala Met Pro Val Val225
230 235 240Arg Lys Pro Ala
Gln Thr Phe Gly Gln Arg Thr Ser Gln Phe Arg Gly 245
250 255Val Thr Arg His Arg Trp Thr Gly Arg Tyr
Glu Ala His Leu Trp Asp 260 265
270Asn Thr Cys Arg Lys Glu Gly Gln Thr Arg Lys Gly Arg Gln Val Tyr
275 280 285Leu Gly Gly Tyr Asp Lys Glu
Glu Lys Ala Ala Arg Ala Tyr Asp Leu 290 295
300Ala Ala Leu Lys Tyr Trp Gly Pro Thr Thr His Ile Asn Phe Pro
Leu305 310 315 320Ser Thr
Tyr Glu Lys Glu Leu Glu Glu Met Lys His Met Thr Arg Gln
325 330 335Glu Phe Ile Ala His Leu Arg
Arg Asn Ser Ser Gly Phe Ser Arg Gly 340 345
350Ala Ser Met Tyr Arg Gly Val Thr Arg His His Gln His Gly
Arg Trp 355 360 365Gln Ala Arg Ile
Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly 370
375 380Thr Phe Ser Thr Gln Glu Glu Ala Ala Glu Ala Tyr
Asp Ile Ala Ala385 390 395
400Ile Lys Phe Arg Gly Leu Asn Ala Val Thr Asn Phe Asp Ile Ser Lys
405 410 415Tyr Asp Val Lys Arg
Ile Cys Ser Ser Thr His Leu Ile Gly Gly Asp 420
425 430Leu Ala Cys Arg Arg Ser Pro Thr Arg Met Leu Pro
Pro Asp Ala Pro 435 440 445Ala Gly
Ala Ala Gly Val Asp Val Val Val Ala Pro Gly Asp His Gln 450
455 460Gln Ile Ser Ala Gly Gly Gly Gly Ala Ser Asp
Asn Ser Asp Thr Ala465 470 475
480Ser Asp Gly His Arg Gly Ala His Leu Leu His Gly Leu Gln Tyr Ala
485 490 495His Ala Met Lys
Phe Glu Ala Gly Glu Ser Ser Gly Gly Gly Gly Gly 500
505 510Asp Gly Ala Thr Thr Asn Trp Met Ala Ala Ala
Ala Ala Ala Ala Arg 515 520 525Pro
Val Ala Gly Ile Pro Thr Thr Val His His Gln Leu Pro Val Phe 530
535 540Ala Leu Trp Asn Asp545
SEQ ID NO: 69 (length: 553; type: PRT; organism: Glycine max)
Met Asn Asn Asn Trp Leu Ser Phe Pro Leu Ser Pro Thr His Ser Ser1
5 10 15Leu Pro Ala His Asp Leu
Gln Ala Thr Gln Tyr His Gln Phe Ser Leu 20 25
30Gly Leu Val Asn Glu Asn Met Asp Asn Pro Phe Gln Asn
His Asp Trp 35 40 45Asn Leu Ile
Asn Thr His Ser Ser Asn Glu Ile Pro Lys Val Ala Asp 50
55 60Phe Leu Gly Val Ser Lys Ser Glu Asn Gln Ser Asp
Leu Ala Ala Leu65 70 75
80Asn Glu Ile His Ser Asn Asp Ser Asp Tyr Leu Phe Thr Asn Asn Ser
85 90 95Leu Val Pro Met Gln Asn
Pro Val Leu Asp Thr Pro Ser Asn Glu Tyr 100
105 110Gln Glu Asn Ala Asn Ser Asn Leu Gln Ser Leu Thr
Leu Ser Met Gly 115 120 125Ser Gly
Lys Asp Ser Thr Cys Glu Thr Ser Gly Glu Asn Ser Thr Asn 130
135 140Thr Thr Val Glu Val Ala Pro Arg Arg Thr Leu
Asp Thr Phe Gly Gln145 150 155
160Arg Thr Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg
165 170 175Tyr Glu Ala His
Leu Trp Asp Asn Ser Cys Arg Arg Glu Gly Gln Ser 180
185 190Arg Lys Gly Arg Gln Val Tyr Leu Gly Gly Tyr
Asp Lys Glu Glu Lys 195 200 205Ala
Ala Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly Thr Ser 210
215 220Thr Thr Thr Asn Phe Pro Ile Ser Asn Tyr
Glu Lys Glu Leu Asp Glu225 230 235
240Met Lys His Met Thr Arg Gln Glu Phe Val Ala Ala Ile Arg Arg
Lys 245 250 255Ser Ser Gly
Phe Ser Arg Gly Ala Ser Met Tyr Arg Gly Val Thr Arg 260
265 270His His Gln His Gly Arg Trp Gln Ala Arg
Ile Gly Arg Val Ala Gly 275 280
285Asn Lys Asp Leu Tyr Leu Gly Thr Phe Ser Thr Glu Glu Glu Ala Ala 290
295 300Glu Ala Tyr Asp Ile Ala Ala Ile
Lys Phe Arg Gly Leu Asn Ala Val305 310
315 320Thr Asn Phe Asp Met Ser Arg Tyr Asp Val Lys Ala
Ile Leu Glu Ser 325 330
335Asn Thr Leu Pro Ile Gly Gly Gly Ala Ala Lys Arg Leu Lys Glu Ala
340 345 350Gln Ala Leu Glu Ser Ser
Arg Lys Arg Glu Glu Met Ile Ala Leu Gly 355 360
365Ser Ser Ser Thr Phe Gln Tyr Gly Thr Ser Ala Ser Ser Ser
Arg Leu 370 375 380His Ala Tyr Pro Leu
Met Gln His His His Gln Phe Glu Gln Pro Gln385 390
395 400Pro Leu Leu Thr Leu Gln Asn His Asp Ile
Ser Ser Ser His Phe Ser 405 410
415His Gln Gln Asp Pro Leu His His Gln Gly Tyr Ile Gln Thr Gln Leu
420 425 430Gln Leu His Gln Gln
Ser Gly Ala Ser Ser Tyr Ser Phe Gln Asn Asn 435
440 445Ala Gln Phe Tyr Asn Gly Tyr Leu Gln Asn His Pro
Ala Leu Leu Gln 450 455 460Gly Met Met
Asn Met Gly Ser Ser Ser Ser Ser Ser Ser Val Leu Glu465
470 475 480Asn Asn Asn Ser Asn Asn Asn
Asn Asn Asn Val Gly Gly Phe Val Gly 485
490 495Ser Gly Phe Gly Met Ala Ser Asn Ala Thr Ala Gly
Asn Thr Val Gly 500 505 510Thr
Ala Glu Glu Leu Gly Leu Val Lys Val Asp Tyr Asp Met Pro Ala 515
520 525Gly Gly Tyr Gly Gly Trp Ser Ala Ala
Asp Ser Met Gln Thr Ser Asn 530 535
540Gly Gly Val Phe Thr Met Trp Asn Asp545
550
70 509 PRT Medicago truncatula
70 Met Asp Lys Ser Ser Ser Ser Pro Pro Thr
Asn Thr Asn Asn Thr Ser1 5 10
15Leu Ala Phe Ser Leu Ser Asn Asn Asn Phe Pro Asn Pro Ser His Ser
20 25 30Ser Ser Ser His Leu Ser
Leu Phe His Ser Phe Thr Pro Tyr Pro Ser 35 40
45Ser Ile Ile Pro Pro Ser Leu Thr Leu Thr Gly Ser Asn Asn
Pro Val 50 55 60Glu Ala Ser Pro Glu
Ala Thr Asp Gly Gly Thr Thr Asn Leu Ser Ile65 70
75 80Phe Thr Gly Gly His Lys Phe Glu Asp Phe
Leu Gly Ser Ser Val Ala 85 90
95Pro Thr Arg Thr Ala Ala Ala Thr Cys Ala Pro Thr Gln Leu Gln Gln
100 105 110Phe Ser Thr Asp Asn
Asp Val Tyr Asn Ser Glu Leu Lys Lys Thr Ile 115
120 125Ala Ala Cys Phe Pro Gly Gly Tyr Pro Thr Glu Pro
Asn Ser Glu Pro 130 135 140Gln Lys Pro
Ser Pro Lys Lys Thr Val Asp Thr Phe Gly Gln Arg Thr145
150 155 160Ser Ile Tyr Arg Gly Val Thr
Arg His Arg Trp Thr Gly Arg Tyr Glu 165
170 175Ala His Leu Trp Asp Asn Ser Cys Arg Arg Glu Gly
Gln Ser Arg Lys 180 185 190Gly
Arg Gln Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg Ala Tyr 195
200 205Asp Leu Ala Ala Leu Lys Tyr Trp Gly
Pro Thr Thr Thr Thr Asn Phe 210 215
220Pro Ile Ser Asn Tyr Glu Lys Glu Ile Asp Asp Met Lys Asn Met Thr225
230 235 240Arg Gln Glu Phe
Val Ala Ser Leu Arg Arg Lys Ser Ser Gly Phe Ser 245
250 255Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr
Arg His His Gln His Gly 260 265
270Arg Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr
275 280 285Leu Gly Thr Phe Ser Thr Gln
Glu Glu Ala Ala Glu Ala Tyr Asp Ile 290 295
300Ala Ala Ile Lys Phe Arg Gly Leu Asn Ala Val Thr Asn Phe Asp
Met305 310 315 320Ser Arg
Tyr Asp Val Lys Ser Ile Ala Asn Cys Ser Leu Pro Ile Gly
325 330 335Gly Leu Ser Asn Lys Asn Asn
Lys Asn Ser Thr Asp Cys Val Ser Glu 340 345
350Thr Lys Ile Asn Glu Pro Ile Gln Ser Asp Glu Ile Asp His
Pro Ser 355 360 365Ser Thr Ser Ser
Ala Thr Thr Leu Ser Phe Ala Leu Pro Ile Lys Gln 370
375 380Asp Pro Ser Thr Asp Tyr Trp Ser Asn Ile Leu Gly
Phe His Asn Asn385 390 395
400Pro Ser Ala Val Thr Thr Thr Thr Ile Pro Phe Asn Met Asp Phe Ser
405 410 415Ala His Val Pro Ser
Asn Thr Asn Ser Asp Asn Pro His Asn Ala Ala 420
425 430Phe Phe Ser Gly Ser Gly Ile Phe Val Gln Gln Gln
Asn Met Asn Gly 435 440 445Ser Ser
Gly Ser Asn Ser Ser Ser Ser Ser Ser Ala Ser Thr Ser Ser 450
455 460Ile Pro Phe Ala Thr Pro Ile Phe Ser Leu Asn
Ser Asn Ser Ser Ser465 470 475
480Tyr Gly Asn Gly Asn Asn Trp Ile Gly His Thr Phe Gln Thr His Ala
485 490 495Lys Pro Ser Leu
Phe Gln Thr Pro Ile Phe Gly Met Glu 500
505
71 492 PRT Zea mays
71 Met Asp Thr Ser His His Tyr His Pro Trp Leu Asn Phe
Ser Leu Ala1 5 10 15His
His Cys Asp Leu Glu Glu Glu Glu Arg Gly Ala Ala Ala Glu Leu 20
25 30Ala Ala Ile Ala Gly Ala Ala Pro
Pro Pro Lys Leu Glu Asp Phe Leu 35 40
45Gly Gly Gly Val Ala Thr Gly Gly Pro Glu Ala Val Ala Pro Ala Glu
50 55 60Met Tyr Asp Ser Asp Leu Lys Phe
Ile Ala Ala Ala Gly Phe Leu Gly65 70 75
80Gly Ser Ala Ala Ala Ala Ala Thr Ser Pro Leu Ser Ser
Leu Asp Gln 85 90 95Ala
Gly Ser Lys Leu Ala Leu Pro Ala Ala Ala Ala Ala Pro Ala Pro
100 105 110Glu Gln Arg Lys Ala Val Asp
Ser Phe Gly Gln Arg Thr Ser Ile Tyr 115 120
125Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His
Leu 130 135 140Trp Asp Asn Ser Cys Arg
Arg Glu Gly Gln Ser Arg Lys Gly Arg Gln145 150
155 160Val Tyr Leu Gly Gly Tyr Asp Lys Glu Glu Lys
Ala Ala Arg Ala Tyr 165 170
175Asp Leu Ala Ala Leu Lys Tyr Trp Gly Ser Ser Thr Thr Thr Asn Phe
180 185 190Pro Val Ala Glu Tyr Glu
Lys Glu Val Glu Glu Met Lys Asn Met Thr 195 200
205Arg Gln Glu Phe Val Ala Ser Leu Arg Arg Lys Ser Ser Gly
Phe Ser 210 215 220Arg Gly Ala Ser Ile
Tyr Arg Gly Val Thr Arg His His Gln His Gly225 230
235 240Arg Trp Gln Ala Arg Ile Gly Arg Val Ala
Gly Asn Lys Asp Leu Tyr 245 250
255Leu Gly Thr Phe Ser Thr Glu Glu Glu Ala Ala Glu Ala Tyr Asp Ile
260 265 270Ala Ala Ile Lys Phe
Arg Gly Leu Asn Ala Val Thr Asn Phe Glu Ile 275
280 285Ser Arg Tyr Asn Val Glu Thr Ile Met Ser Ser Asn
Leu Pro Val Ala 290 295 300Ser Met Ser
Ser Ser Ala Ala Ala Ala Ala Gly Gly Arg Ser Ser Lys305
310 315 320Ala Leu Glu Ser Pro Pro Ser
Gly Ser Leu Asp Gly Gly Gly Gly Met 325
330 335Pro Val Val Glu Ala Ser Thr Ala Pro Pro Leu Phe
Ile Pro Val Lys 340 345 350Tyr
Asp Gln Gln Gln Gln Glu Tyr Leu Ser Met Leu Ala Leu Gln Gln 355
360 365His His Gln Gln Gln Gln Ala Gly Asn
Leu Leu Gln Gly Pro Leu Val 370 375
380Gly Phe Gly Gly Leu Tyr Ser Ser Gly Val Asn Leu Asp Phe Ala Asn385
390 395 400Ser His Gly Thr
Ala Ala Pro Ser Ser Met Ala His His Cys Tyr Ala 405
410 415Asn Gly Thr Ala Ser Ala Ser His Glu His
Gln His Gln Met Gln Gln 420 425
430Gly Gly Glu Asn Glu Thr Gln Pro Gln Pro Gln Gln Ser Ser Ser Ser
435 440 445Cys Ser Ser Leu Pro Phe Ala
Thr Pro Val Ala Phe Asn Gly Ser Tyr 450 455
460Glu Ser Ser Ile Thr Ala Ala Gly Pro Phe Gly Tyr Ser Tyr Pro
Asn465 470 475 480Val Ala
Ala Phe Gln Thr Pro Ile Tyr Gly Met Glu 485
490
72 469 PRT Oryza sativa
72 Met Asp Met Asp Met Ser Ser Ala Tyr Pro His
His Trp Leu Ser Phe1 5 10
15Ser Leu Ser Asn Asn Tyr His His Gly Leu Leu Glu Ala Leu Ser Thr
20 25 30Thr Ser Ala Pro Pro Leu Gly
Glu Glu Gly Pro Ala Glu Gly Ala Pro 35 40
45Lys Met Glu Asp Phe Leu Gly Gly Leu Gly Gly Gly Gly Gly Ala
Val 50 55 60Ala Ala Ala Pro Ala Ala
Ala Pro Glu Asp Gln Leu Ser Cys Gly Glu65 70
75 80Leu Gly Ser Ile Ala Ala Gly Phe Leu Arg Arg
Tyr Pro Ala Pro Glu 85 90
95Asn Ala Gly Gly Val Thr Ile Ala Met Ala Thr Asp Ala Ala Ala Glu
100 105 110Leu Ala Asp Pro Ala Arg
Arg Thr Ala Glu Thr Phe Gly Gln Arg Thr 115 120
125Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg
Tyr Glu 130 135 140Ala His Leu Trp Asp
Asn Ser Cys Arg Arg Glu Gly Gln Ser Arg Lys145 150
155 160Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp
Lys Glu Glu Lys Ala Ala 165 170
175Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly Pro Thr Thr Thr
180 185 190Thr Asn Phe Pro Val
Ala Asn Tyr Glu Thr Glu Leu Glu Glu Met Lys 195
200 205Ser Met Thr Arg Gln Glu Phe Ile Ala Ser Leu Arg
Arg Lys Ser Ser 210 215 220Gly Phe Ser
Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr Arg His His225
230 235 240Gln His Gly Arg Trp Gln Ala
Arg Ile Gly Arg Val Ala Gly Asn Lys 245
250 255Asp Leu Tyr Leu Gly Thr Phe Ser Thr Gln Glu Glu
Ala Ala Glu Ala 260 265 270Tyr
Asp Ile Ala Ala Ile Lys Phe Arg Gly Leu Asn Ala Val Thr Asn 275
280 285Phe Asp Met Ser Arg Tyr Asp Val Asp
Ser Ile Leu Asn Ser Asp Leu 290 295
300Pro Val Gly Gly Gly Ala Ala Thr Arg Ala Ser Lys Phe Pro Ser Asp305
310 315 320Pro Ser Leu Pro
Leu Pro Ser Pro Ala Met Pro Pro Ser Glu Lys Asp 325
330 335Tyr Trp Ser Leu Leu Ala Leu His Tyr His
His His Gln Gln Gln Gln 340 345
350Gln Gln Gln Gln Phe Pro Ala Ser Ala Phe Asp Thr Tyr Gly Cys Ser
355 360 365Ser Gly Val Asn Val Asp Phe
Thr Met Gly Thr Ser Ser His Ser Gly 370 375
380Ser Asn Ser Asn Ser Ser Ser Ser Ser Ala Ile Trp Gly Thr Ala
Ala385 390 395 400Gly Ala
Ala Met Gly Arg Gln Gln Asn Gly Gly Ser Ser Asn Lys Gln
405 410 415Ser Asn Ser Tyr Ser Gly Asn
Asn Ile Pro Tyr Ala Ala Ala Ala Ala 420 425
430Met Thr Ser Gly Ser Ala Leu Tyr Gly Gly Ser Thr Gly Ser
Asn Gly 435 440 445Thr Trp Val Ala
Ser Asn Thr Ser Thr Ala Pro His Phe Tyr Asn Tyr 450
455 460Leu Phe Gly Met Glu465
73 562 PRT Glycine max
73 Met
Asn Asn Asn Trp Leu Ser Phe Pro Leu Ser Pro Thr His Ser Ser1
5 10 15Leu Pro Ala His Asp Leu Gln
Ala Thr Gln Tyr His Gln Phe Ser Leu 20 25
30Gly Leu Val Asn Glu Asn Met Glu Asn Pro Phe Gln Asn His
Asp Trp 35 40 45Ser Leu Ile Asn
Thr His Ser Ser Ser Glu Val Pro Lys Val Ala Asp 50 55
60Phe Leu Gly Val Ser Lys Ser Glu Asn Glu Ser Asp Leu
Ala Ala Ser65 70 75
80Leu Asn Glu Ile Gln Ser Asn Asp Ser Asp Tyr Leu Phe Thr Asn Asn
85 90 95Ser Leu Val Pro Met Gln
Asn Pro Ala Val Asp Thr Pro Ser Asn Glu 100
105 110Tyr Gln Glu Asn Ala Asn Ser Ser Leu Gln Ser Leu
Thr Leu Ser Met 115 120 125Gly Ser
Gly Lys Asp Ser Thr Cys Glu Thr Ser Gly Asp Asn Ser Thr 130
135 140Asn Thr Thr Thr Thr Thr Thr Val Glu Ala Ala
Pro Arg Arg Thr Leu145 150 155
160Asp Thr Phe Gly Gln Arg Thr Ser Ile Tyr Arg Gly Val Thr Arg His
165 170 175Arg Trp Thr Gly
Arg Tyr Glu Ala His Leu Trp Asp Asn Ser Cys Arg 180
185 190Arg Glu Gly Gln Ser Arg Lys Gly Arg Gln Val
Tyr Leu Gly Gly Tyr 195 200 205Asp
Lys Glu Glu Lys Ala Ala Arg Ser Tyr Asp Leu Ala Ala Leu Lys 210
215 220Tyr Trp Gly Thr Ser Thr Thr Thr Asn Phe
Pro Ile Ser Asn Tyr Glu225 230 235
240Lys Glu Leu Asp Glu Met Lys His Met Thr Arg Gln Glu Phe Val
Ala 245 250 255Ala Ile Arg
Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala Ser Met Tyr 260
265 270Arg Gly Val Thr Arg His His Gln His Gly
Arg Trp Gln Ala Arg Ile 275 280
285Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe Ser Thr 290
295 300Glu Glu Glu Ala Ala Glu Ala Tyr
Asp Ile Ala Ala Ile Lys Phe Arg305 310
315 320Gly Leu Asn Ala Val Thr Asn Phe Asp Met Ser Arg
Tyr Asp Val Lys 325 330
335Ala Ile Leu Glu Ser Asn Thr Leu Pro Ile Gly Gly Gly Ala Ala Lys
340 345 350Arg Leu Lys Glu Ala Gln
Ala Leu Glu Ser Ser Arg Lys Arg Glu Glu 355 360
365Met Ile Ala Leu Gly Ser Ser Thr Phe Gln Tyr Gly Thr Thr
Ser Ser 370 375 380Asn Ser Arg Leu His
Ala Tyr Pro Leu Met Gln His His His Gln Phe385 390
395 400Glu Gln Pro Gln Pro Leu Leu Thr Leu Gln
Asn His Asp Ile Ser Ser 405 410
415His Phe Ser His Gln Gln Asp Pro Leu His Gln Gly Tyr Ile Gln Thr
420 425 430Gln Leu Gln Leu His
Gln Gln Gln Ser Gly Gly Ser Ser Ser Tyr Ser 435
440 445Phe Gln Asn Asn Asn Ile Asn Asn Ala Gln Phe Tyr
Asn Gly Tyr Asn 450 455 460Leu Gln Asn
His Pro Ala Leu Leu Gln Gly Met Ile Asn Met Gly Ser465
470 475 480Ser Ser Ser Ser Ser Val Leu
Glu Asn Asn Asn Ser Asn Asn Asn Asn 485
490 495Val Gly Gly Phe Val Gly Ser Gly Phe Gly Met Ala
Ser Asn Ala Thr 500 505 510Ser
Gly Asn Thr Val Gly Thr Ala Glu Glu Leu Gly Leu Val Lys Val 515
520 525Asp Tyr Asp Met Pro Thr Gly Gly Tyr
Gly Gly Trp Ser Ala Ala Ala 530 535
540Ala Ala Glu Ser Met Gln Thr Ser Asn Ser Gly Val Phe Thr Met Trp545
550 555 560Asn
Asp
74 574 PRT Arabidopsis thaliana
74 Met Asn Ser Asn Asn Trp Leu Gly Phe Pro
Leu Ser Pro Asn Asn Ser1 5 10
15Ser Leu Pro Pro His Glu Tyr Asn Leu Gly Leu Val Ser Asp His Met
20 25 30Asp Asn Pro Phe Gln Thr
Gln Glu Trp Asn Met Ile Asn Pro His Gly 35 40
45Gly Gly Gly Asp Glu Gly Gly Glu Val Pro Lys Val Ala Asp
Phe Leu 50 55 60Gly Val Ser Lys Pro
Asp Glu Asn Gln Ser Asn His Leu Val Ala Tyr65 70
75 80Asn Asp Ser Asp Tyr Tyr Phe His Thr Asn
Ser Leu Met Pro Ser Val 85 90
95Gln Ser Asn Asp Val Val Val Ala Ala Cys Asp Ser Asn Thr Pro Asn
100 105 110Asn Ser Ser Tyr His
Glu Leu Gln Glu Ser Ala His Asn Leu Gln Ser 115
120 125Leu Thr Leu Ser Met Gly Thr Thr Ala Gly Asn Asn
Val Val Asp Lys 130 135 140Ala Ser Pro
Ser Glu Thr Thr Gly Asp Asn Ala Ser Gly Gly Ala Leu145
150 155 160Ala Val Val Glu Thr Ala Thr
Pro Arg Arg Ala Leu Asp Thr Phe Gly 165
170 175Gln Arg Thr Ser Ile Tyr Arg Gly Val Thr Arg His
Arg Trp Thr Gly 180 185 190Arg
Tyr Glu Ala His Leu Trp Asp Asn Ser Cys Arg Arg Glu Gly Gln 195
200 205Ser Arg Lys Gly Arg Gln Val Tyr Leu
Gly Gly Tyr Asp Lys Glu Asp 210 215
220Lys Ala Ala Arg Ser Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly Pro225
230 235 240Ser Thr Thr Thr
Asn Phe Pro Ile Thr Asn Tyr Glu Lys Glu Val Glu 245
250 255Glu Met Lys His Met Thr Arg Gln Glu Phe
Val Ala Ala Ile Arg Arg 260 265
270Lys Ser Ser Gly Phe Ser Arg Gly Ala Ser Met Tyr Arg Gly Val Thr
275 280 285Arg His His Gln His Gly Arg
Trp Gln Ala Arg Ile Gly Arg Val Ala 290 295
300Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe Ser Thr Glu Glu Glu
Ala305 310 315 320Ala Glu
Ala Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly Leu Asn Ala
325 330 335Val Thr Asn Phe Glu Ile Asn
Arg Tyr Asp Val Lys Ala Ile Leu Glu 340 345
350Ser Ser Thr Leu Pro Ile Gly Gly Gly Ala Ala Lys Arg Leu
Lys Glu 355 360 365Ala Gln Ala Leu
Glu Ser Ser Arg Lys Arg Glu Ala Glu Met Ile Ala 370
375 380Leu Gly Ser Ser Phe Gln Tyr Gly Gly Gly Ser Ser
Thr Gly Ser Gly385 390 395
400Ser Thr Ser Ser Arg Leu Gln Leu Gln Pro Tyr Pro Leu Ser Ile Gln
405 410 415Gln Pro Leu Glu Pro
Phe Leu Ser Leu Gln Asn Asn Asp Ile Ser His 420
425 430Tyr Asn Asn Asn Asn Ala His Asp Ser Ser Ser Phe
Asn His His Ser 435 440 445Tyr Ile
Gln Thr Gln Leu His Leu His Gln Gln Thr Asn Asn Tyr Leu 450
455 460Gln Gln Gln Ser Ser Gln Asn Ser Gln Gln Leu
Tyr Asn Ala Tyr Leu465 470 475
480His Ser Asn Pro Ala Leu Leu His Gly Leu Val Ser Thr Ser Ile Val
485 490 495Asp Asn Asn Asn
Asn Asn Gly Gly Ser Ser Gly Ser Tyr Asn Thr Ala 500
505 510Ala Phe Leu Gly Asn His Gly Ile Gly Ile Gly
Ser Ser Ser Thr Val 515 520 525Gly
Ser Thr Glu Glu Phe Pro Thr Val Lys Thr Asp Tyr Asp Met Pro 530
535 540Ser Ser Asp Gly Thr Gly Gly Tyr Ser Gly
Trp Thr Ser Glu Ser Val545 550 555
560Gln Gly Ser Asn Pro Gly Gly Val Phe Thr Met Trp Asn Glu
565 570
75 543 PRT Medicago truncatula
75 Met Asn Asn
Asn Trp Leu Ser Phe Pro Leu Ser Pro Ser His Ser Ser1 5
10 15Leu Pro Ser Asn Asp Leu Gln Ala Thr
Gln Tyr His His Phe Pro Leu 20 25
30Gly Leu Val Asn Asp Asn Met Glu Asn Pro Phe Gln Asn His Asp Trp
35 40 45Asn Leu Met Asn Thr His Asn
Ser Asn Glu Val Pro Lys Val Ala Asp 50 55
60Phe Leu Gly Val Cys Lys Ser Glu Asn His Ser Asp Leu Ala Thr Pro65
70 75 80Asn Glu Ile Gln
Ser Asn Asp Ser Asp Tyr Leu Phe Thr Asn Asn Asn 85
90 95Thr Leu Met Pro Met Gln Asn Gln Met Val
Thr Thr Cys Thr Asn Glu 100 105
110Tyr Gln Glu Lys Ala Ser Asn Ser Asn Leu Gln Ser Leu Thr Leu Ser
115 120 125Met Gly Ser Gly Lys Asp Ser
Thr Cys Glu Thr Ser Gly Glu Asn Ser 130 135
140Thr Asn Thr Val Glu Val Ala Val Pro Lys Arg Thr Ser Glu Thr
Phe145 150 155 160Gly Gln
Arg Thr Ser Ile Tyr Arg Gly Val Thr Lys His Arg Trp Thr
165 170 175Gly Arg Tyr Glu Ala His Leu
Trp Asp Asn Ser Cys Arg Arg Glu Gly 180 185
190Gln Ser Arg Lys Gly Arg Gln Gly Gly Tyr Asp Lys Glu Glu
Lys Ala 195 200 205Ala Arg Ser Tyr
Asp Leu Ala Ala Leu Lys Tyr Trp Gly Thr Ser Thr 210
215 220Thr Thr Asn Phe Pro Val Ser Asn Tyr Glu Lys Glu
Ile Asp Glu Met225 230 235
240Lys His Met Thr Arg Gln Glu Phe Val Ala Ser Ile Arg Arg Lys Ser
245 250 255Ser Gly Phe Ser Arg
Gly Ala Ser Met Tyr Arg Gly Val Thr Arg His 260
265 270His Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg
Val Ala Gly Asn 275 280 285Lys Asp
Leu Tyr Leu Gly Thr Phe Ser Thr Glu Glu Glu Ala Ala Glu 290
295 300Ala Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly
Leu Asn Ala Val Thr305 310 315
320Asn Phe Asp Met Thr Arg Tyr Asp Val Lys Ala Ile Leu Glu Ser Asn
325 330 335Thr Leu Pro Ile
Gly Gly Gly Ala Ala Lys Arg Leu Lys Glu Ala Gln 340
345 350Ala Leu Glu Thr Ser Arg Lys Arg Glu Glu Met
Leu Ala Leu Asn Ser 355 360 365Ser
Ser Phe Gln Tyr Gly Thr Ser Ser Ser Ser Asn Thr Arg Leu Gln 370
375 380Pro Tyr Pro Leu Met Gln Tyr His His Gln
Phe Glu Gln Pro Gln Pro385 390 395
400Leu Leu Thr Leu Gln Asn Asn His Glu Ser Leu Asn Ser Gln Gln
Phe 405 410 415Ser Gln His
Gln Gly Gly Gly Tyr Phe Gln Thr Gln Leu Glu Leu Cys 420
425 430Gln Gln Gln Asn Gln Gln Pro Ser Gln Asn
Ser Asn Ile Gly Ser Phe 435 440
445Tyr Asn Gly Tyr Tyr Gln Asn His Pro Gly Leu Phe Gln Met Asn Asn 450
455 460Ile Gly Ser Ser Ser Ser Ser Ser
Val Met Gly Asn Asn Gly Gly Gly465 470
475 480Ser Ser Gly Ile Tyr Ser Asn Ser Gly Gly Leu Ile
Ser Asn Asn Ala 485 490
495Val Glu Glu Phe Val Pro Val Lys Val Asp Tyr Asp Met Gln Gly Asp
500 505 510Gly Ser Gly Phe Gly Gly
Trp Ser Ala Ala Gly Glu Asn Met Gln Thr 515 520
525Ala Asp Leu Phe Thr Met Trp Asn Asp Tyr Glu Thr Arg Glu
Asn 530 535 540
76 543 PRT Zea mays
76 Met
Asp Met Asn Asn Gly Trp Leu Gly Phe Ser Leu Ser Pro Ser Ala1
5 10 15Ala Ser Arg Gly Gly Tyr Gly
Tyr Gly Asp Gly Gly Gly Gly Ala Ser 20 25
30Ala Ser Ala Cys Gly Asp Gly Glu Gly Ser Cys Pro Ser Pro
Ala Ala 35 40 45Ala Ala Ser Pro
Leu Pro Leu Val Ala Met Pro Leu Asp Asp Ser Leu 50 55
60His Tyr Ser Ser Ala Pro Asp Trp Arg His Gly Ala Ala
Glu Ala Lys65 70 75
80Gly Pro Lys Leu Glu Asp Phe Met Ser Ile Thr Cys Ser Asn Lys Ser
85 90 95Ser Gly Arg Ser Leu Tyr
Asp Ser Cys Gly His His Asp Asp Glu Gln 100
105 110Ala Ser Lys Tyr His Glu Val His Gly Ile His Pro
Leu Ser Cys Gly 115 120 125Ser Tyr
Tyr His Gly Cys Ile Ser Ser Gly Gly Gly Gly Gly Gly Gly 130
135 140Ile Gly Leu Gly Ile Asn Met Asn Ala Pro Pro
Cys Thr Gly Gly Phe145 150 155
160Pro Asp His Gln His His Gln Phe Val Pro Ser Ser His His Gly Gln
165 170 175Tyr Phe Leu Gly
Ala Pro Ala Ala Ser Ala Gly Pro Pro Ala Gly Ala 180
185 190Ala Met Pro Met Tyr Asn Ala Gly Gly Gly Ser
Val Val Gly Gly Ser 195 200 205Met
Ser Ile Ser Gly Ile Lys Ser Trp Leu Arg Glu Ala Met Tyr Val 210
215 220Pro Pro Glu Arg Pro Ala Ala Ala Ala Leu
Ser Leu Ala Val Thr Asp225 230 235
240Asp Val Pro Pro Ala Glu Pro Pro Gln Leu Leu Pro Ala Pro Leu
Pro 245 250 255Val His Arg
Lys Pro Ala Gln Thr Phe Gly Gln Arg Thr Ser Gln Phe 260
265 270Arg Gly Val Thr Arg His Arg Trp Thr Gly
Arg Tyr Glu Ala His Leu 275 280
285Trp Asp Asn Thr Cys Arg Lys Glu Gly Gln Thr Arg Lys Gly Arg Gln 290
295 300Val Tyr Leu Gly Gly Tyr Asp Arg
Glu Glu Lys Ala Ala Arg Ala Tyr305 310
315 320Asp Leu Ala Ala Leu Lys Tyr Trp Gly Pro Ser Thr
His Ile Asn Phe 325 330
335Pro Leu Ser His Tyr Glu Lys Glu Leu Glu Glu Met Lys His Met Ser
340 345 350Arg Gln Glu Phe Ile Ala
His Leu Arg Arg Asn Ser Ser Gly Phe Ser 355 360
365Arg Gly Ala Ser Met Tyr Arg Gly Val Thr Arg His His Gln
His Gly 370 375 380Arg Trp Gln Ala Arg
Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr385 390
395 400Leu Gly Thr Phe Ser Thr Gln Glu Glu Ala
Ala Glu Ala Tyr Asp Ile 405 410
415Ala Ala Ile Lys Phe Arg Gly Leu Asn Ala Val Thr Asn Phe Asp Ile
420 425 430Ser Lys Tyr Asp Val
Lys Arg Ile Cys Ala Ser Thr His Leu Ile Gly 435
440 445Gly Gly Asp Ala Cys Arg Arg Ser Pro Thr Arg Pro
Pro Asp Ala Ala 450 455 460Pro Ala Leu
Ala Gly Gly Ala Asp Arg Ser Ser Asp Ala Pro Gly Asp465
470 475 480Gln Ala Ala Ser Asp Asn Ser
Asp Thr Ser Asp Gly His Arg Gly Ala 485
490 495His Leu Leu His Gly Leu Gln Tyr Gly His Pro Met
Lys Leu Glu Ala 500 505 510Gly
Glu Gly Ser Ser Trp Met Ala Ala Ala Ala Ala Ala Arg Pro Val 515
520 525Pro Gly Val His Gln Leu Pro Met Phe
Ala Leu Trp Asn Asp Cys 530 535
540
77 512 PRT Glycine max
77 Met Ser Asn Trp Leu Gly Phe Ser Leu Thr Pro His
Leu Arg Ile Asp1 5 10
15Glu Glu Phe Glu Arg Glu Asn Gln Glu Arg Gly Gly Gly Ile Ile Leu
20 25 30Phe Glu Lys Lys Lys Thr Lys
Trp Arg Tyr Asp Ser Ala Ile Gly Gly 35 40
45Gly Asn Ser Asn Glu Glu Gly Pro Lys Leu Glu Asp Phe Leu Gly
Cys 50 55 60Tyr Ser Asn Ser Pro Ala
Lys Val Phe Cys Gln Asp Ser Gln Pro Asp65 70
75 80Gln Asn Gln Ser Gln Asn Asn Val Ser Lys Ile
Asn Ile Glu Thr Gly 85 90
95Asp Asn Leu Thr Asn Pro Ser Ser Leu Leu His Ser Phe His Ala Tyr
100 105 110Asn Asp Asn Ser His Ala
Leu Ile Pro Thr Asn Gly Met Tyr Lys Ser 115 120
125Trp Leu Ala Gln Thr Gln Phe Ser Ser Asp Gly Lys Pro Ser
Asn Glu 130 135 140Ala Asn Gly Cys Asn
Phe Gln Ser Leu Ser Leu Thr Met Ser Pro Ser145 150
155 160Val Gln Asn Gly Val Gly Ala Ile Ser Ser
Val Gln Val Asn Glu Asp 165 170
175Ser Arg Lys Arg Val Met Ala Lys Ser His Ala Arg Glu Pro Val Pro
180 185 190Arg Lys Ser Ile Asp
Thr Phe Gly Gln Arg Thr Ser Gln Tyr Arg Gly 195
200 205Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala
His Leu Trp Asp 210 215 220Asn Ser Cys
Arg Lys Glu Gly Gln Thr Arg Lys Gly Arg Gln Gly Gly225
230 235 240Tyr Asp Lys Glu Glu Lys Ala
Ala Lys Ala Tyr Asp Leu Ala Ala Leu 245
250 255Lys Tyr Trp Gly Pro Thr Thr His Ile Asn Phe Pro
Leu Ser Thr Tyr 260 265 270Glu
Lys Glu Leu Glu Glu Met Lys His Met Thr Arg Gln Glu Phe Val 275
280 285Ala Asn Leu Arg Arg Lys Ser Ser Gly
Phe Ser Arg Gly Ala Ser Val 290 295
300Tyr Arg Gly Val Thr Arg His His Gln His Gly Arg Trp Gln Ala Arg305
310 315 320Ile Gly Arg Val
Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe Ser 325
330 335Thr Gln Glu Glu Ala Ala Glu Ala Tyr Asp
Ile Ala Ala Ile Lys Phe 340 345
350Arg Gly Thr Ser Ala Val Thr Asn Phe Asp Ile Ser Arg Tyr Asp Val
355 360 365Lys Arg Ile Cys Ser Ser Ser
Thr Leu Ile Ala Gly Asp Leu Ala Lys 370 375
380Arg Ser Pro Lys Glu Ser Pro Ala Pro Pro Pro Pro Leu Ala Ile
Thr385 390 395 400Asp Gly
Glu His Ser Asp Glu Leu Ser Asn Met Met Trp Asn Ala Asn
405 410 415Asn Ser Asp Glu Gln Ala Gln
Asn Glu Ser Gly Gly Ala Glu Phe Asn 420 425
430Asn Asn Val Thr Glu Ser Ser Ser Ser Gln Gln Val Ser Pro
Ser Ser 435 440 445Asn Lys Asp Ala
Leu Asn Pro Gln Ser Pro Asn Glu Phe Gly Val Ser 450
455 460Gly Ala Asp Tyr Gly His Gly Tyr Phe Thr Leu Asp
Gly Pro Lys Tyr465 470 475
480Asp Asp Gly Asn Asn Glu Asn Asp His Met Ser Thr Asn Arg Leu Gly
485 490 495Asn Leu Gly Leu Val
Asn Gln Val Pro Met Phe Ala Leu Trp Asn Glu 500
505 510
78 485 PRT Sorghum bicolor
78 Met Asp Thr Ser His His
Tyr Pro Trp Leu Asn Phe Ser Leu Ala His1 5
10 15His Gly Asp Leu Glu Glu Glu Glu Arg Gly Ala Ala
Ala Glu Leu Ala 20 25 30Ala
Ile Ala Gly Ala Ala Pro Pro Pro Lys Leu Glu Asp Phe Leu Gly 35
40 45Gly Gly Val Ile Asn Gly Glu Ser Ala
Arg Ser Gly Gly Gly Val Pro 50 55
60Val Ala Ala Pro Glu Val Ser Ala Pro Ala Glu Met Tyr Asp Ser Asp65
70 75 80Leu Lys Phe Ile Ala
Ala Ala Gly Phe Leu Gly Gly Gly Ser Ala Ala 85
90 95Gly Pro Val Ala Thr Ser Pro Leu Ser Ser Leu
Asp Gln Ala Asp Pro 100 105
110Lys Leu Ala Leu Pro Ala Ala Ala Ala Ala Ala Pro Ala Pro Glu Gln
115 120 125Arg Lys Ala Val Asp Ser Phe
Gly Gln Arg Thr Ser Ile Tyr Arg Gly 130 135
140Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu Trp
Asp145 150 155 160Asn Ser
Cys Arg Arg Glu Gly Gln Ser Arg Lys Gly Arg Gln Gly Gly
165 170 175Tyr Asp Lys Glu Glu Lys Ala
Ala Arg Ala Tyr Asp Leu Ala Ala Leu 180 185
190Lys Tyr Trp Gly Ser Ser Thr Thr Thr Asn Phe Pro Val Ala
Glu Tyr 195 200 205Glu Lys Glu Leu
Glu Glu Met Lys Thr Met Thr Arg Gln Glu Phe Val 210
215 220Ala Ser Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg
Gly Ala Ser Ile225 230 235
240Tyr Arg Gly Val Thr Arg His His Gln His Gly Arg Trp Gln Ala Arg
245 250 255Ile Gly Arg Val Ala
Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe Ser 260
265 270Thr Glu Glu Glu Ala Ala Glu Ala Tyr Asp Ile Ala
Ala Ile Lys Phe 275 280 285Arg Gly
Leu Asn Ala Val Thr Asn Phe Glu Ile Ser Arg Tyr Asn Val 290
295 300Glu Ser Ile Met Asn Ser Asn Ile Pro Met Gly
Ser Met Ser Ala Gly305 310 315
320Gly Arg Ser Asn Lys Ala Leu Glu Ser Pro Pro Ser Gly Ser Pro Asp
325 330 335Ala Met Pro Val
Glu Ala Ser Thr Ala Pro Leu Phe Ala Ala Leu Pro 340
345 350Val Lys Tyr Asp Gln Gln Gln Gln Asp Tyr Leu
Ser Met Leu Ala Leu 355 360 365Gln
His His Gln Gln Gly Asn Leu Gln Gly Leu Gly Phe Gly Leu Tyr 370
375 380Ser Ser Gly Val Asn Leu Asp Phe Ala Asn
Ser His Ser Thr Ala Ser385 390 395
400Ser Met Thr His Cys Tyr Val Asn Gly Gly Thr Val Ser Ser His
Glu 405 410 415Gln His Gln
His His Gln Gln Leu Gln Asp His Gln Gln Gln Gly Glu 420
425 430Ser Glu Thr Gln Gln Ser Ser Asn Ser Cys
Ser Ser Leu Pro Phe Ala 435 440
445Thr Pro Ile Ala Phe Asn Gly Ser Tyr Glu Ser Ser Met Thr Ala Ala 450
455 460Gly Pro Phe Gly Tyr Ser Tyr Pro
Asn Val Ala Ala Phe Gln Thr Pro465 470
475 480Ile Tyr Gly Met Glu
485
79 507 PRT Glycine max
79 Met Ala Arg Ala Thr Asn Trp Leu Ser Phe Ser Leu
Ser Pro Met Glu1 5 10
15Met Leu Arg Thr Ser Glu Pro Gln Phe Leu Gln Tyr Asp Ala Ala Ser
20 25 30Ala Thr Ser Ser His His Tyr
Tyr Leu Asp Asn Leu Tyr Thr Asn Gly 35 40
45Trp Gly Asn Gly Ser Leu Lys Phe Glu Gln Asn Leu Asn His Ser
Asp 50 55 60Val Ser Phe Val Glu Ser
Ser Ser Gln Ser Val Gly His Val Pro Pro65 70
75 80Pro Pro Pro Lys Leu Glu Asp Phe Leu Gly Asp
Ser Ser Ala Val Met 85 90
95Arg Tyr Ser Asp Ser Gln Thr Glu Thr Gln Asp Ser Ser Leu Thr His
100 105 110Ile Tyr Asp His His His
His His His His His His Gly Ser Thr Ser 115 120
125Tyr Phe Gly Gly Asp Gln Gln Asp Leu Lys Ala Ile Thr Gly
Phe Gln 130 135 140Ala Phe Ser Thr Asn
Ser Gly Ser Glu Val Asp Asp Ser Ala Ser Ile145 150
155 160Gly Lys Ala Gln Ala Ser Glu Phe Gly Thr
His Ser Ile Glu Ser Ser 165 170
175Gly Asn Glu Phe Ala Ala Phe Ser Gly Gly Thr Thr Gly Thr Leu Ser
180 185 190Leu Ala Val Ala Leu
Ser Ser Glu Lys Ala Val Val Ala Ala Glu Ser 195
200 205Asn Ser Ser Lys Lys Ile Val Asp Thr Phe Gly Gln
Arg Thr Ser Ile 210 215 220Tyr Arg Gly
Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His225
230 235 240Leu Trp Asp Asn Ser Cys Arg
Arg Glu Gly Gln Ala Arg Lys Gly Arg 245
250 255Gln Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg
Ala Tyr Asp Leu 260 265 270Ala
Ala Leu Lys Tyr Trp Gly Pro Thr Ala Thr Thr Asn Phe Pro Val 275
280 285Ser Asn Tyr Ser Lys Glu Val Glu Glu
Met Lys His Val Thr Lys Gln 290 295
300Glu Phe Ile Ala Ser Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly305
310 315 320Ala Ser Ile Tyr
Arg Gly Val Thr Arg His His Gln Gln Gly Arg Trp 325
330 335Gln Ala Arg Ile Gly Arg Val Ala Gly Asn
Lys Asp Leu Tyr Leu Gly 340 345
350Thr Phe Ala Thr Glu Glu Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala
355 360 365Ile Lys Phe Arg Gly Ala Asn
Ala Val Thr Asn Phe Glu Met Asn Arg 370 375
380Tyr Asp Val Glu Ala Ile Met Lys Ser Ser Leu Pro Val Gly Gly
Ala385 390 395 400Ala Lys
Arg Leu Arg Leu Ser Leu Glu Ser Glu Gln Lys Ala Pro Pro
405 410 415Val Asn Ser Ser Ser Gln Gln
Gln Asn Pro Gln Cys Gly Asn Val Ser 420 425
430Gly Ser Ile Asn Phe Ser Ala Ile His Gln Pro Ile Ala Ser
Ile Pro 435 440 445Cys Gly Ile Pro
Phe Asp Ser Thr Thr Ala Tyr Tyr Pro His Asn Leu 450
455 460Phe Gln His Phe His Pro Thr Asn Ala Gly Ala Ala
Ala Ser Ala Val465 470 475
480Thr Ser Ala Asn Ala Thr Ala Leu Thr Ala Leu Pro Ala Ser Ala Ala
485 490 495Thr Glu Phe Phe Ile
Trp Pro His Gln Ser Tyr 500
505
80 569 PRT Arabidopsis thaliana
80 Met Glu Met Leu Arg Ser Ser Asp Gln Ser
Gln Phe Val Ser Tyr Asp1 5 10
15Ala Ser Ser Ala Ala Ser Ser Ser Pro Tyr Leu Leu Asp Asn Phe Tyr
20 25 30Gly Trp Ser Asn Gln Lys
Pro Gln Glu Phe Phe Lys Glu Glu Ala Gln 35 40
45Leu Ala Ala Ala Ala Ser Met Ala Asp Ser Thr Ile Leu Thr
Thr Phe 50 55 60Val Asp Pro Gln Ser
His His Ser Gln Asn His Ile Pro Lys Leu Glu65 70
75 80Asp Phe Leu Gly Asp Ser Ser Ser Ile Val
Arg Tyr Ser Asp Asn Ser 85 90
95Gln Thr Asp Thr Gln Asp Ser Ser Leu Thr Gln Ile Tyr Asp Pro Arg
100 105 110His His His Asn Gln
Thr Gly Phe Tyr Ser Asp His His Asp Phe Lys 115
120 125Thr Met Ala Gly Phe Gln Ser Ala Phe Ser Thr Asn
Ser Gly Ser Glu 130 135 140Val Asp Asp
Ser Ala Ser Ile Gly Arg Thr His Leu Ala Gly Asp Tyr145
150 155 160Leu Gly His Val Val Glu Ser
Ser Gly Pro Glu Leu Gly Phe His Gly 165
170 175Gly Ser Thr Gly Ala Leu Ser Leu Gly Val Asn Val
Asn Asn Asn Thr 180 185 190Asn
His Arg Asn Asp Asn Asp Asn His Tyr Arg Gly Asn Asn Asn Gly 195
200 205Glu Arg Ile Asn Asn Asn Asn Asn Asn
Asp Asn Glu Lys Thr Asp Ser 210 215
220Glu Lys Glu Lys Ala Val Val Ala Val Glu Thr Ser Asp Cys Ser Asn225
230 235 240Lys Lys Ile Ala
Asp Thr Phe Gly Gln Arg Thr Ser Ile Tyr Arg Gly 245
250 255Val Thr Arg His Arg Trp Thr Gly Arg Tyr
Glu Ala His Leu Trp Asp 260 265
270Asn Ser Cys Arg Arg Glu Gly Gln Ala Arg Lys Gly Arg Gln Val Tyr
275 280 285Leu Gly Gly Tyr Asp Lys Glu
Asp Lys Ala Ala Arg Ala Tyr Asp Leu 290 295
300Ala Ala Leu Lys Tyr Trp Asn Ala Thr Ala Thr Thr Asn Phe Pro
Ile305 310 315 320Thr Asn
Tyr Ser Lys Glu Val Glu Glu Met Lys His Met Thr Lys Gln
325 330 335Glu Phe Ile Ala Ser Leu Arg
Arg Lys Ser Ser Gly Phe Ser Arg Gly 340 345
350Ala Ser Ile Tyr Arg Gly Val Thr Arg His His Gln Gln Gly
Arg Trp 355 360 365Gln Ala Arg Ile
Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly 370
375 380Thr Phe Ala Thr Glu Glu Glu Ala Ala Glu Ala Tyr
Asp Ile Ala Ala385 390 395
400Ile Lys Phe Arg Gly Ile Asn Ala Val Thr Asn Phe Glu Met Asn Arg
405 410 415Tyr Asp Val Glu Ala
Ile Met Lys Ser Ala Leu Pro Ile Gly Gly Ala 420
425 430Ala Lys Arg Leu Lys Leu Ser Leu Glu Ala Ala Ala
Ser Ser Glu Gln 435 440 445Lys Pro
Ile Leu Gly His His Gln Leu His His Phe Gln Gln Gln Gln 450
455 460Gln Gln Gln Gln Leu Gln Leu Gln Ser Ser Pro
Asn His Ser Ser Ile465 470 475
480Asn Phe Ala Leu Cys Pro Asn Ser Ala Val Gln Ser Gln Gln Ile Ile
485 490 495Pro Cys Gly Ile
Pro Phe Glu Ala Ala Ala Leu Tyr His His His Gln 500
505 510Gln Gln Gln Gln His Gln Gln Gln Gln Gln Gln
Gln Asn Phe Phe Gln 515 520 525His
Phe Pro Ala Asn Ala Ala Ser Asp Ser Thr Gly Ser Asn Asn Asn 530
535 540Ser Asn Val Gln Gly Thr Met Gly Leu Met
Ala Pro Asn Pro Ala Glu545 550 555
560Phe Phe Leu Trp Pro Asn Gln Ser Tyr
565
81 574 PRT Medicago truncatula
81 Met Ser Asn Trp Leu Gly Phe Ser Leu Thr
Pro His Leu Arg Ile Asp1 5 10
15Glu Glu Phe Gly Thr Glu Asn Gln Asn Gln Asn Gln Asn His Val Ala
20 25 30Glu Gly Ser Glu Ile Gly
Arg Asn Tyr Val Thr Pro Ser Ser His Pro 35 40
45His Pro His His Leu Ser Ile Met Pro Leu Arg Ser Asp Gly
Ser Leu 50 55 60Cys Val Ser Asp Ser
Phe Thr Pro Gln Glu Trp Arg Tyr Glu Asn Ala65 70
75 80Ile Thr Asp Gly Asn Ser Asn Glu Glu Gly
Pro Lys Leu Glu Asp Phe 85 90
95Leu Gly Cys Tyr Ser Asn Gln Asn Gln Asn Ser Thr Thr Thr Ser Thr
100 105 110Met Ser Lys Ile Asn
Val Asn Val Ser Pro Ser Phe Cys Thr Asn Asn 115
120 125Asn Pro Glu Ile Asp Thr Arg Glu Asn Leu Thr Asn
Gln Ser Leu Ile 130 135 140His Ser Phe
His Ala Tyr Asn Asp His Ser Asn Asn Asn His His Ala145
150 155 160Leu Ile His Asp Asn Ser Met
Tyr Lys Ser Trp Met Thr Gln Thr Gln 165
170 175Phe Ser Ser Glu Gly Lys Thr Thr Ser Ser Asp Gly
Asn Gly Phe Gln 180 185 190Ser
Leu Asn Leu Thr Met Ser Pro Cys Val Gln Asn Gly Val Gly Gly 195
200 205Gly Val Gly Ser Ala Ile Ser Asn Val
Gln Val Asn Glu Asp Pro Arg 210 215
220Lys Arg Ser Leu Ser Lys Ser Asn Ala Arg Glu Pro Val Pro Arg Lys225
230 235 240Ser Ile Asp Thr
Phe Gly Gln Arg Thr Ser Gln Tyr Arg Gly Val Thr 245
250 255Arg His Arg Trp Thr Gly Arg Tyr Glu Ala
His Leu Trp Asp Asn Ser 260 265
270Cys Arg Lys Glu Gly Gln Thr Arg Lys Gly Arg Gln Gly Gly Tyr Asp
275 280 285Lys Glu Glu Lys Ala Ala Lys
Ala Tyr Asp Leu Ala Ala Leu Lys Tyr 290 295
300Trp Gly Pro Thr Thr His Ile Asn Phe Pro Leu Ser Thr Tyr Asp
Lys305 310 315 320Glu Leu
Glu Glu Met Lys His Met Thr Arg Gln Glu Phe Val Ala Asn
325 330 335Leu Arg Arg Lys Ser Ser Gly
Phe Ser Arg Gly Ala Ser Val Tyr Arg 340 345
350Gly Val Thr Arg His His Gln His Gly Arg Trp Gln Ala Arg
Ile Gly 355 360 365Arg Val Ala Gly
Asn Lys Asp Leu Tyr Leu Gly Thr Phe Ser Thr Gln 370
375 380Glu Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala Ile
Lys Phe Arg Gly385 390 395
400Thr Ser Ala Val Thr Asn Phe Asp Ile Ser Arg Tyr Asp Val Lys Arg
405 410 415Ile Cys Ser Ser Ser
Thr Leu Ile Thr Gly Asp Leu Ala Lys Arg Ser 420
425 430Pro Lys Asp Ser Thr Pro Pro Ala Thr Thr Ala Glu
Asp Phe Asn Ser 435 440 445Cys Gly
Ser Ser Ser Thr Leu Ser Gln Pro Pro Pro Leu Thr Ile Thr 450
455 460Asp Gly Glu Gln His Ser Asp Glu Leu Ser Asn
Met Val Trp Asn Ser465 470 475
480Asn Asn Asp Glu Gln Lys Pro Gln Asn Gly Thr Asn Ile Thr Glu Ser
485 490 495Ser Gln His Gly
Ser Pro Ser Asn Lys Asn Glu Met Asn Pro Gln Ser 500
505 510Pro Lys Cys Ser Leu Gly Leu Pro Asn Glu Phe
Gly Val Ser Gly Ala 515 520 525Asp
Tyr Gly His Gly Tyr Phe Thr Leu His Gly Pro Lys Phe Asp Asp 530
535 540Gly Ser Asn Glu Asn Asp His Met Asn Asn
Asn Arg Leu Gly Asn Leu545 550 555
560Gly Leu Val Asn Gln Val Pro Met Phe Ala Leu Trp Asn Glu
565 570
82 541 PRT Sorghum bicolor 82
Met Asp Met Asn
Asn Gly Trp Leu Gly Phe Ser Leu Ser Pro Ser Ala1 5
10 15Gly Arg Gly Gly Tyr Gly Asp Gly Gly Ala
Ser Ala Ser Gly Asp Gly 20 25
30Gly Asp Gly Ser Cys Ser Ser Pro Ala Ala Ala Ala Ser Pro Val Pro
35 40 45Leu Val Ala Met Pro Leu Gln Pro
Asp Gly Ser Leu His Tyr Thr Ser 50 55
60Ala Pro Asp Trp Arg His Gly Ala Ala Glu Ala Asn Gly Pro Lys Leu65
70 75 80Glu Asp Phe Met Ser
Val Thr Cys Ser Ser Asn Asn Lys Arg Ser Ser 85
90 95Ser Ser Ser Ser Phe Tyr Asp Arg Cys Ser His
Ala Glu Gln Ala Asn 100 105
110Lys Tyr His Glu Val His Asp Leu Gln Pro Leu Ser Cys Gly Ser Tyr
115 120 125Tyr His Gly Ser Ser Gly Gly
Gly Gly Asn Gly Ile Ala Leu Gly Ile 130 135
140Asn Met Asn Ala Pro Pro Cys Ser Gly Gly Gly Phe Pro Asp His
His145 150 155 160His His
His Gln Phe Val Ser Ser His His Gly Gln Tyr Phe Leu Gly
165 170 175Ala Pro Leu Asn Ala Ser Pro
Pro Gly Ala Val Pro Met Tyr Ser Ala 180 185
190Gly Gly Gly Gly Val Gly Gly Ser Met Ser Ile Ser Gly Ile
Lys Ser 195 200 205Trp Leu Arg Glu
Ala Met Tyr Val Pro Pro Glu Arg Pro Val Ala Ala 210
215 220Ala Ala Ala Leu Ser Leu Ala Val Thr Asp Asp Val
Gly Ala Glu Pro225 230 235
240Pro Gln Leu Leu Pro Ala Ala Pro Met Pro Pro Val His Arg Lys Pro
245 250 255Ala Gln Thr Phe Gly
Gln Arg Thr Ser Gln Phe Arg Gly Val Thr Arg 260
265 270His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu Trp
Asp Asn Thr Cys 275 280 285Arg Lys
Glu Gly Gln Thr Arg Lys Gly Arg Gln Gly Gly Tyr Asp Arg 290
295 300Glu Glu Lys Ala Ala Arg Ala Tyr Asp Leu Ala
Ala Leu Lys Tyr Trp305 310 315
320Gly Pro Ser Thr His Ile Asn Phe Pro Leu Ser His Tyr Glu Lys Glu
325 330 335Leu Glu Glu Met
Lys His Met Ser Arg Gln Glu Phe Ile Ala His Leu 340
345 350Arg Arg Asn Ser Ser Gly Phe Ser Arg Gly Ala
Ser Met Tyr Arg Gly 355 360 365Val
Thr Arg His His Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg 370
375 380Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly
Thr Phe Ser Thr Gln Glu385 390 395
400Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly
Leu 405 410 415Asn Ala Val
Thr Asn Phe Asp Ile Ser Lys Tyr Asp Val Lys Arg Ile 420
425 430Cys Ala Ser Thr His Leu Ile Gly Gly Gly
Asp Ala Cys Arg Arg Ser 435 440
445Pro Thr Gln Pro Pro Asp Ala Pro Ala Leu Ala Ile Asp Ala Ala Gly 450
455 460Ala Asp Arg Ser Ser Asp Ala Pro
Gly Gly Gly Asp Gln Ala Val Ser465 470
475 480Asp Asn Ser Asp Thr Ser Ala Gly His Arg Gly Ala
His Leu Leu His 485 490
495Gly Leu Gln Tyr Gly His Pro Met Lys Leu Glu Ala Gly Glu Gly Ser
500 505 510Ser Trp Met Ala Ala Ala
Thr Ala Ala Ala Ala Arg Pro Val Ala Gly 515 520
525Val His Gln Leu Pro Val Phe Ala Leu Trp Asn Asp Cys
530 535 540
83 555 PRT Arabidopsis thaliana 83
Met Lys Ser Phe Cys Asp Asn Asp Asp Asn Asn His Ser Asn Thr Thr1
5 10 15Asn Leu Leu Gly Phe Ser
Leu Ser Ser Asn Met Met Lys Met Gly Gly 20 25
30Arg Gly Gly Arg Glu Ala Ile Tyr Ser Ser Ser Thr Ser
Ser Ala Ala 35 40 45Thr Ser Ser
Ser Ser Val Pro Pro Gln Leu Val Val Gly Asp Asn Thr 50
55 60Ser Asn Phe Gly Val Cys Tyr Gly Ser Asn Pro Asn
Gly Gly Ile Tyr65 70 75
80Ser His Met Ser Val Met Pro Leu Arg Ser Asp Gly Ser Leu Cys Leu
85 90 95Met Glu Ala Leu Asn Arg
Ser Ser His Ser Asn His His Gln Asp Ser 100
105 110Ser Pro Lys Val Glu Asp Phe Phe Gly Thr His His
Asn Asn Thr Ser 115 120 125His Lys
Glu Ala Met Asp Leu Ser Leu Asp Ser Leu Phe Tyr Asn Thr 130
135 140Thr His Glu Pro Asn Thr Thr Thr Asn Phe Gln
Glu Phe Phe Ser Phe145 150 155
160Pro Gln Thr Arg Asn His Glu Glu Glu Thr Arg Asn Tyr Gly Asn Asp
165 170 175Pro Ser Leu Thr
His Gly Gly Ser Phe Asn Val Gly Val Tyr Gly Glu 180
185 190Phe Gln Gln Ser Leu Ser Leu Ser Met Ser Pro
Gly Ser Gln Ser Ser 195 200 205Cys
Ile Thr Gly Ser His His His Gln Gln Asn Gln Asn Gln Asn His 210
215 220Gln Ser Gln Asn His Gln Gln Ile Ser Glu
Ala Leu Val Glu Thr Ser225 230 235
240Val Gly Phe Glu Thr Thr Thr Met Ala Ala Ala Lys Lys Lys Arg
Gly 245 250 255Gln Glu Asp
Val Val Val Val Gly Gln Lys Gln Ile Val His Arg Lys 260
265 270Ser Ile Asp Thr Phe Gly Gln Arg Thr Ser
Gln Tyr Arg Gly Val Thr 275 280
285Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu Trp Asp Asn Ser 290
295 300Phe Lys Lys Glu Gly His Ser Arg
Lys Gly Arg Gln Val Tyr Leu Gly305 310
315 320Gly Tyr Asp Met Glu Glu Lys Ala Ala Arg Ala Tyr
Asp Leu Ala Ala 325 330
335Leu Lys Tyr Trp Gly Pro Ser Thr His Thr Asn Phe Ser Ala Glu Asn
340 345 350Tyr Gln Lys Glu Ile Glu
Asp Met Lys Asn Met Thr Arg Gln Glu Tyr 355 360
365Val Ala His Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly
Ala Ser 370 375 380Ile Tyr Arg Gly Val
Thr Arg His His Gln His Gly Arg Trp Gln Ala385 390
395 400Arg Ile Gly Arg Val Ala Gly Asn Lys Asp
Leu Tyr Leu Gly Thr Phe 405 410
415Gly Thr Gln Glu Glu Ala Ala Glu Ala Tyr Asp Val Ala Ala Ile Lys
420 425 430Phe Arg Gly Thr Asn
Ala Val Thr Asn Phe Asp Ile Thr Arg Tyr Asp 435
440 445Val Asp Arg Ile Met Ser Ser Asn Thr Leu Leu Ser
Gly Glu Leu Ala 450 455 460Arg Arg Asn
Asn Asn Ser Ile Val Val Arg Asn Thr Glu Asp Gln Thr465
470 475 480Ala Leu Asn Ala Val Val Glu
Gly Gly Ser Asn Lys Glu Val Ser Thr 485
490 495Pro Glu Arg Leu Leu Ser Phe Pro Ala Ile Phe Ala
Leu Pro Gln Val 500 505 510Asn
Gln Lys Met Phe Gly Ser Asn Met Gly Gly Asn Met Ser Pro Trp 515
520 525Thr Ser Asn Pro Asn Ala Glu Leu Lys
Thr Val Ala Leu Thr Leu Pro 530 535
540Gln Met Pro Val Phe Ala Ala Trp Ala Asp Ser545 550
555
84 678 PRT Sorghum bicolor 84
Met Thr Asn Asn Asn Gly Asn Gly
Thr Asn Ala Ala Ala Ser Ser Trp1 5 10
15Leu Gly Phe Ser Leu Ser Pro His Met Ala Ser Ala Met Asp
Glu His 20 25 30His His Val
Gln Gln Gln Gln Gln His His His His His Ser Leu Phe 35
40 45Phe Pro Ser Val Thr Ala Ala Ala Ala Ala Ala
Tyr Gly Leu Gly Gly 50 55 60Ser Asp
Gly Gly Val Ala Thr Ser Ala Ser Pro Tyr Tyr Thr Pro Gln65
70 75 80Leu Ala Ser Met Pro Leu Lys
Ser Asp Gly Ser Leu Cys Ile Met Glu 85 90
95Ala Leu Arg Arg Ser Asp Gln Pro Asp His His Gly Pro
Lys Leu Glu 100 105 110Asp Phe
Leu Gly Ala Ala Ala Ala Gln Ser Gln Ala Met Ala Leu Ser 115
120 125Leu Gln Asp Asn Pro Ala Ala Ala Ala Ser
Ser Phe Tyr Tyr Tyr Gly 130 135 140Asn
Gly Gly Gly Gly Gly Ser Gly His Gln His His Gly Gly Phe Leu145
150 155 160Gln Pro Cys Ala Asp Leu
Tyr Gly Gly Pro Ser Glu Ala Ser Leu Val 165
170 175Ala Asp Asp Asp Glu Ala Ala Ala Ala Ala Thr Ala
Met Ala Ser Trp 180 185 190Val
Ala Ala Arg Ala Gly Glu Ser Gly Gly Val Leu Ser Ala Ala Ala 195
200 205Ala Ala Ala Gly His Gln His His His
His Ala Leu Ala Leu Ser Met 210 215
220Ser Ser Gly Ser Leu Ser Ser Cys Val Thr Ala His Pro Gly Ala Ala225
230 235 240Ala Ala Asp Tyr
Gly Val Val Ala Ala Thr Ala Ser Ala Ser Leu Asp 245
250 255Gly Gly Arg Lys Arg Gly Gly Ala Ala Gly
Gln Lys Gln Pro Val His 260 265
270His Arg Lys Ser Ile Asp Thr Phe Gly Gln Arg Thr Ser Gln Tyr Arg
275 280 285Gly Val Thr Arg His Arg Trp
Thr Gly Arg Tyr Glu Ala His Leu Trp 290 295
300Asp Asn Ser Cys Lys Lys Glu Gly Gln Thr Arg Lys Gly Arg Gln
Gly305 310 315 320Gly Tyr
Asp Met Glu Glu Lys Ala Ala Arg Ala Tyr Asp Leu Ala Ala
325 330 335Leu Lys Tyr Trp Gly Pro Ser
Thr His Ile Asn Phe Pro Leu Glu Asp 340 345
350Tyr Gln Glu Glu Leu Glu Glu Met Lys Asn Met Thr Arg Gln
Glu Tyr 355 360 365Val Ala His Leu
Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala Ser 370
375 380Met Tyr Arg Gly Val Thr Arg His His Gln His Gly
Arg Trp Gln Ala385 390 395
400Arg Ile Gly Arg Val Ser Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe
405 410 415Ser Thr Gln Glu Glu
Ala Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys 420
425 430Phe Arg Gly Leu Asn Ala Val Thr Asn Phe Asp Ile
Thr Arg Tyr Asp 435 440 445Val Asp
Lys Ile Met Ala Ser Asn Thr Leu Leu Pro Gly Asp Leu Ala 450
455 460Arg Arg Arg Lys Asp Asp Asp Pro Ala Ala Val
Ile Ala Gly Ala Asp465 470 475
480Ala Ser Asn Gly Gly Gly Val Thr Thr Ala Ala Ala Ala Ala Ala Leu
485 490 495Val Gln Gln Ala
Ala Ala Ala Ala Ala Ala Gly Ala Gly Gly Asn His 500
505 510Ser Ala Ser Ser Ser Glu Thr Trp Ile Lys Val
Ala Ala Ala Ala Ala 515 520 525Leu
Gln Ala Ala Gly Ala Ala Pro Arg Asp Gly Asn His His His His 530
535 540His His Asp Val Leu Ser Gly Glu Ala Phe
Ser Val Leu His Asp Leu545 550 555
560Val Val Thr Ala Ala Asp Gly Gly Asn Gly Asn Gly Asn Gly Gly
His 565 570 575His His His
His Val His Asn Ser Ala Ala Thr Ala Gln His Met Ser 580
585 590Met Ser Ser Ala Ser Ser Leu Val Thr Ser
Leu Gly Asn Ser Arg Glu 595 600
605Gly Ser Pro Asp Arg Gly Gly Gly Leu Ser Met Leu Phe Ser Lys Pro 610
615 620Pro Ala Pro Ala Pro Ala Ala Ser
Ala His Ala Ala Asn Lys Pro Met625 630
635 640Ser Pro Leu Met Pro Leu Gly Ser Trp Ala Ser Thr
Ala Ala Ala Ser 645 650
655Ala Arg Ala Ala Ala Ala Ala Val Ser Ile Ala His Met Pro Val Phe
660 665 670Ala Ala Trp Thr Asp Ala
675
85 509 PRT Glycine max 85
Met Ala Arg Ala Ser Thr Asn Trp Leu Ser Phe
Ser Leu Ser Pro Met1 5 10
15Asp Met Leu Arg Thr Pro Glu Pro Gln Phe Val Gln Tyr Asp Ala Ala
20 25 30Ser Asp Thr Ser Ser His His
Tyr Tyr Leu Asp Asn Leu Tyr Thr Asn 35 40
45Gly Trp Gly Asn Gly Ser Leu Lys Phe Glu Gln Asn Leu Asn His
Ser 50 55 60Asp Val Ser Phe Val Gln
Ser Ser Ser Gln Ser Val Ser His Ala Pro65 70
75 80Pro Lys Leu Glu Asp Phe Leu Gly Asp Ser Ser
Ala Val Met Arg Tyr 85 90
95Ser Asp Ser Gln Thr Glu Thr Gln Asp Ser Ser Leu Thr His Ile Tyr
100 105 110Asp His His His His His
His His Gly Ser Ser Ala Tyr Phe Gly Gly 115 120
125Asp His Gln Asp Leu Lys Ala Ile Thr Gly Phe Gln Ala Phe
Ser Thr 130 135 140Asn Ser Gly Ser Glu
Val Asp Asp Ser Ala Ser Ile Gly Lys Ala Gln145 150
155 160Gly Ser Glu Phe Gly Thr His Ser Ile Glu
Ser Ser Val Asn Glu Phe 165 170
175Ala Ala Phe Ser Gly Gly Thr Asn Thr Gly Gly Thr Leu Ser Leu Ala
180 185 190Val Ala Gln Ser Ser
Glu Lys Ala Val Ala Ala Ala Ala Glu Ser Asp 195
200 205Arg Ser Lys Lys Val Val Asp Thr Phe Gly Gln Arg
Thr Ser Ile Tyr 210 215 220Arg Gly Val
Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu225
230 235 240Trp Asp Asn Ser Cys Arg Arg
Glu Gly Gln Ala Arg Lys Gly Arg Gln 245
250 255Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg Ser
Tyr Asp Leu Ala 260 265 270Ala
Leu Lys Tyr Trp Gly Pro Thr Ala Thr Thr Asn Phe Pro Val Ser 275
280 285Asn Tyr Ser Lys Glu Val Glu Glu Met
Lys His Val Thr Lys Gln Glu 290 295
300Phe Ile Ala Ser Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala305
310 315 320Ser Ile Tyr Arg
Gly Val Thr Arg His His Gln Gln Gly Arg Trp Gln 325
330 335Ala Arg Ile Gly Arg Val Ala Gly Asn Lys
Asp Leu Tyr Leu Gly Thr 340 345
350Phe Ala Thr Glu Glu Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala Ile
355 360 365Lys Phe Arg Gly Ala Asn Ala
Val Thr Asn Phe Glu Met Asn Arg Tyr 370 375
380Asp Val Glu Ala Ile Met Lys Ser Ser Leu Pro Val Gly Gly Ala
Ala385 390 395 400Lys Arg
Leu Lys Leu Ser Leu Glu Ser Glu Gln Lys Ala Leu Pro Val
405 410 415Ser Ser Ser Ser Ser Ser Ser
Gln Gln Gln Asn Pro Gln Cys Gly Asn 420 425
430Val Ser Ala Ser Ile Asn Phe Ser Ser Ile His Gln Pro Ile
Ala Ser 435 440 445Ile Pro Cys Gly
Ile Pro Phe Asp Ser Thr Thr Ala Tyr Tyr His His 450
455 460Asn Leu Phe Gln His Phe His Pro Thr Asn Ala Gly
Thr Ala Ala Ser465 470 475
480Ala Val Thr Ser Ala Asn Ala Asn Ala Leu Thr Ala Leu Pro Pro Thr
485 490 495Ala Ala Ala Glu Phe
Phe Ile Trp Pro His Gln Ser Tyr 500
505
86 638 PRT Zea mays 86
Met Thr Ser Asn Ser Ser Gln Asn Met Ser Ser Cys Ser
Thr Gly Gly1 5 10 15Ser
Asp Ala Ala Val Gly Gly Gly Ser Trp Leu Gly Phe Ser Leu Ser 20
25 30Pro His Met Ala Ala Thr Met Asp
Gly Ala Ala Asp Gly Val Pro Val 35 40
45Gln His His His His Glu Gly Leu Phe Tyr Pro Pro Val Val Ser Ser
50 55 60Ser Pro Ala Pro Phe Cys Tyr Ala
Leu Gly Gly Gly Gln Asp Gly Leu65 70 75
80Ala Thr Ala Ala Ala Asn Gly Gly Gly Gly Phe Tyr Pro
Gly Leu Ser 85 90 95Ser
Met Pro Leu Lys Ser Asp Gly Ser Leu Cys Ile Leu Glu Ala Leu
100 105 110His Arg Ser Glu Gln Glu Arg
His Gly Val Val Val Ser Ser Ser Ser 115 120
125Pro Lys Leu Glu Asp Phe Leu Gly Ala Ser Ala Ser Thr Ala Met
Ala 130 135 140Leu Ser Leu Asp Ser Ser
Ser Phe Tyr Tyr Gly Cys Gly His Gly His145 150
155 160Gly His Asp Gln Gly Gly Tyr Leu Gln Pro Met
Gln Cys Ala Val Met 165 170
175Pro Gly Ser Gly Gly His Asp Val Tyr Gly Gly Gly His Ala Gln Met
180 185 190Val Asp Glu Gln Ser Ala
Ala Ala Met Ala Ala Ser Trp Phe Ser Ala 195 200
205Arg Gly Asn Gly Gly Tyr Asp Val Asp Gly Ala Gly Ala Gly
Ala Ile 210 215 220Val Pro Leu Gln Gly
His Pro His Pro Leu Ala Leu Ser Met Ser Ser225 230
235 240Gly Thr Gly Ser Gln Ser Ser Ser Val Thr
Met Gln Val Gly Ser Ala 245 250
255His Ala Asp Ala Val Thr Glu Tyr Ile Ala Met Asp Gly Ser Lys Lys
260 265 270Arg Gly Ala Gly Asn
Gly Ala Ser Ala Gly Gln Lys Gln Pro Thr Ile 275
280 285His Arg Lys Thr Ile Asp Thr Phe Gly Gln Arg Thr
Ser Gln Tyr Arg 290 295 300Gly Val Thr
Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu Trp305
310 315 320Asp Asn Ser Cys Arg Lys Glu
Gly Gln Thr Arg Lys Gly Arg Gln Val 325
330 335Tyr Leu Gly Gly Tyr Asp Val Glu Glu Lys Ala Ala
Arg Ala Tyr Asp 340 345 350Leu
Ala Ala Leu Lys Tyr Trp Gly Thr Ser Thr His Val Asn Phe Pro 355
360 365Val Glu Asp Tyr Arg Glu Glu Leu Glu
Glu Met Lys Asn Met Thr Arg 370 375
380Gln Glu Tyr Val Ala His Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg385
390 395 400Gly Ala Ser Ile
Tyr Arg Gly Val Thr Arg His His Gln His Gly Arg 405
410 415Trp Gln Ala Arg Ile Gly Arg Val Ser Gly
Asn Lys Asp Leu Tyr Leu 420 425
430Gly Thr Phe Ser Thr Gln Glu Glu Ala Ala Glu Ala Tyr Asp Val Ala
435 440 445Ala Ile Lys Phe Arg Gly Leu
Ser Ala Val Thr Asn Phe Asp Ile Thr 450 455
460Arg Tyr Asp Val Asp Lys Ile Met Glu Ser Ser Thr Leu Leu Pro
Gly465 470 475 480Glu Gln
Val Arg Arg Arg Lys Glu Gly Ala Asp Ala Ala Val Ser Glu
485 490 495Ala Ala Ala Ala Leu Val Gln
Ala Gly Asn Cys Met Thr Asp Thr Trp 500 505
510Lys Ile Gln Ala Ala Leu Pro Ala Ala Ala Arg Ala Asp Glu
Arg Gly 515 520 525Ala Gly Gln Gln
Gln Arg Gln Asp Leu Leu Ser Ser Glu Ala Phe Ser 530
535 540Leu Leu His Asp Ile Val Ser Val Asp Ala Ala Ala
Gly Thr Gly Thr545 550 555
560Gly Gly Met Ser Asn Ala Ser Ser Ser Leu Ala Pro Ser Val Ser Asn
565 570 575Ser Arg Glu Gln Ser
Pro Asp Arg Gly Gly Ala Ser Leu Ala Met Leu 580
585 590Phe Ala Lys Pro Ala Ala Ala Pro Lys Leu Ala Cys
Pro Leu Pro Leu 595 600 605Gly Ser
Trp Val Ser Pro Ser Ala Val Ser Ala Arg Pro Pro Gly Val 610
615 620Ser Ile Ala His Leu Pro Val Phe Ala Ala Trp
Thr Asp Ala625 630 635
87 652 PRT Oryza sativa 87
Met Ala Ser Gly Asn Ser Ser Ser Ser Ser Gly Ser Met Ala Ala Thr1
5 10 15Ala Gly Gly Val
Gly Gly Trp Leu Gly Phe Ser Leu Ser Pro His Met 20
25 30Ala Thr Tyr Cys Ala Gly Gly Val Asp Asp Val
Gly His His His His 35 40 45His
His Val His Gln His Gln Gln Gln His Gly Gly Gly Leu Phe Tyr 50
55 60Asn Pro Ala Ala Val Ala Ser Ser Phe Tyr
Tyr Gly Gly Gly His Asp65 70 75
80Ala Val Val Thr Ser Ala Ala Gly Gly Gly Ser Tyr Tyr Gly Ala
Gly 85 90 95Phe Ser Ser
Met Pro Leu Lys Ser Asp Gly Ser Leu Cys Ile Met Glu 100
105 110Ala Leu Arg Gly Gly Asp Gln Glu Gln Gln
Gly Val Val Val Ser Ala 115 120
125Ser Pro Lys Leu Glu Asp Phe Leu Gly Ala Gly Pro Ala Met Ala Leu 130
135 140Ser Leu Asp Asn Ser Ala Phe Tyr
Tyr Gly Gly His Gly His His Gln145 150
155 160Gly His Ala Gln Asp Gly Gly Ala Val Gly Gly Asp
Pro His His Gly 165 170
175Gly Gly Gly Phe Leu Gln Cys Ala Val Ile Pro Gly Ala Gly Ala Gly
180 185 190His Asp Ala Ala Leu Val
His Asp Gln Ser Ala Ala Ala Val Ala Ala 195 200
205Gly Trp Ala Ala Met His Gly Gly Gly Tyr Asp Ile Ala Asn
Ala Ala 210 215 220Ala Asp Asp Val Cys
Ala Ala Gly Pro Ile Ile Pro Thr Gly Gly His225 230
235 240Leu His Pro Leu Thr Leu Ser Met Ser Ser
Ala Gly Ser Gln Ser Ser 245 250
255Cys Val Thr Val Gln Ala Ala Ala Ala Gly Glu Pro Tyr Met Ala Met
260 265 270Asp Ala Val Ser Lys
Lys Arg Gly Gly Ala Asp Arg Ala Gly Gln Lys 275
280 285Gln Pro Val His Arg Lys Ser Ile Asp Thr Phe Gly
Gln Arg Thr Ser 290 295 300Gln Tyr Arg
Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu Ala305
310 315 320His Leu Trp Asp Asn Ser Cys
Lys Lys Glu Gly Gln Thr Arg Lys Gly 325
330 335Arg Gln Gly Gly Tyr Asp Met Glu Glu Lys Ala Ala
Arg Ala Tyr Asp 340 345 350Leu
Ala Ala Leu Lys Tyr Trp Gly Pro Ser Thr His Ile Asn Phe Pro 355
360 365Leu Glu Asp Tyr Gln Glu Glu Leu Glu
Glu Met Lys Asn Met Ser Arg 370 375
380Gln Glu Tyr Val Ala His Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg385
390 395 400Gly Ala Ser Ile
Tyr Arg Gly Val Thr Arg His His Gln His Gly Arg 405
410 415Trp Gln Ala Arg Ile Gly Arg Val Ser Gly
Asn Lys Asp Leu Tyr Leu 420 425
430Gly Thr Phe Ser Thr Gln Glu Glu Ala Ala Glu Ala Tyr Asp Val Ala
435 440 445Ala Ile Lys Phe Arg Gly Leu
Asn Ala Val Thr Asn Phe Asp Ile Thr 450 455
460Arg Tyr Asp Val Asp Lys Ile Leu Glu Ser Ser Thr Leu Leu Pro
Gly465 470 475 480Glu Leu
Ala Arg Arg Lys Gly Lys Val Gly Asp Gly Gly Gly Ala Ala
485 490 495Ala Val Ala Asp Ala Ala Ala
Ala Leu Val Gln Ala Gly Asn Val Ala 500 505
510Glu Trp Lys Met Ala Thr Ala Ala Ala Leu Pro Ala Ala Ala
Arg Thr 515 520 525Glu Gln Gln Gln
Gln His Gly His Gly Gly His Gln His His Asp Leu 530
535 540Leu Pro Ser Asp Ala Phe Ser Val Leu Gln Asp Ile
Val Ser Thr Val545 550 555
560Asp Ala Ala Gly Ala Pro Pro Arg Ala Pro His Met Ser Met Ala Ala
565 570 575Thr Ser Leu Gly Asn
Ser Arg Glu Gln Ser Pro Asp Arg Gly Val Gly 580
585 590Gly Gly Gly Gly Gly Gly Val Leu Ala Thr Leu Phe
Ala Lys Pro Ala 595 600 605Ala Ala
Ser Lys Leu Tyr Ser Pro Val Pro Leu Asn Thr Trp Ala Ser 610
615 620Pro Ser Pro Ala Val Ser Ser Val Pro Ala Arg
Ala Gly Val Ser Ile625 630 635
640Ala His Leu Pro Met Phe Ala Ala Trp Thr Asp Ala
645 650
88 440 PRT Arabidopsis thaliana 88
Met Ala Asp Ser Thr
Thr Leu Ser Thr Phe Phe Asp His Ser Gln Thr1 5
10 15Gln Ile Pro Lys Leu Glu Asp Phe Leu Gly Asp
Ser Phe Val Arg Tyr 20 25
30Ser Asp Asn Gln Thr Glu Thr Gln Asp Ser Ser Ser Leu Thr Pro Phe
35 40 45Tyr Asp Pro Arg His Arg Thr Val
Ala Glu Gly Val Thr Gly Phe Phe 50 55
60Ser Asp His His Gln Pro Asp Phe Lys Thr Ile Asn Ser Gly Pro Glu65
70 75 80Ile Phe Asp Asp Ser
Thr Thr Ser Asn Ile Gly Gly Thr His Leu Ser 85
90 95Ser His Val Val Glu Ser Ser Thr Thr Ala Lys
Leu Gly Phe Asn Gly 100 105
110Asp Cys Thr Thr Thr Gly Gly Val Leu Ser Leu Gly Val Asn Asn Thr
115 120 125Ser Asp Gln Pro Leu Ser Cys
Asn Asn Gly Glu Arg Gly Gly Asn Ser 130 135
140Asn Lys Lys Lys Thr Val Ser Lys Lys Glu Thr Ser Asp Asp Ser
Lys145 150 155 160Lys Lys
Ile Val Glu Thr Leu Gly Gln Arg Thr Ser Ile Tyr Arg Gly
165 170 175Val Thr Arg His Arg Trp Thr
Gly Arg Tyr Glu Ala His Leu Trp Asp 180 185
190Asn Ser Cys Arg Arg Glu Gly Gln Ala Arg Lys Gly Arg Gln
Val Tyr 195 200 205Leu Gly Gly Tyr
Asp Lys Glu Asp Arg Ala Ala Arg Ala Tyr Asp Leu 210
215 220Ala Ala Leu Lys Tyr Trp Gly Ser Thr Ala Thr Thr
Asn Phe Pro Val225 230 235
240Ser Ser Tyr Ser Lys Glu Leu Glu Glu Met Asn His Met Thr Lys Gln
245 250 255Glu Phe Ile Ala Ser
Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly 260
265 270Ala Ser Ile Tyr Arg Gly Val Thr Arg His His Gln
Gln Gly Arg Trp 275 280 285Gln Ala
Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly 290
295 300Thr Phe Ala Thr Glu Glu Glu Ala Ala Glu Ala
Tyr Asp Ile Ala Ala305 310 315
320Ile Lys Phe Arg Gly Ile Asn Ala Val Thr Asn Phe Glu Met Asn Arg
325 330 335Tyr Asp Ile Glu
Ala Val Met Asn Ser Ser Leu Pro Val Gly Gly Ala 340
345 350Ala Ala Lys Arg His Lys Leu Lys Leu Ala Leu
Glu Ser Pro Ser Ser 355 360 365Ser
Ser Ser Asp His Asn Leu Gln Gln Gln Gln Leu Leu Pro Ser Ser 370
375 380Ser Pro Ser Asp Gln Asn Pro Asn Ser Ile
Pro Cys Gly Ile Pro Phe385 390 395
400Glu Pro Ser Val Leu Tyr Tyr His Gln Asn Phe Phe Gln His Tyr
Pro 405 410 415Leu Val Ser
Asp Ser Thr Ile Gln Ala Pro Met Asn Gln Ala Glu Phe 420
425 430Phe Leu Trp Pro Asn Gln Ser Tyr
435 440
89 651 PRT Zea mays 89
Met Ala Asn Gly Ser Asn Trp Leu
Gly Phe Ser Leu Ser Pro His Thr1 5 10
15Ala Met Glu Val Pro Ser Val Ser Glu Pro Ala Ser Thr His
His Ala 20 25 30Pro Pro Pro
Pro Ser Ser Ser Thr Thr Ile Ser Ser Ser Ser Thr Asn 35
40 45Asn Thr Ile Ser Ser Asn Phe Leu Phe Ser Pro
Met Ala Ser Pro Tyr 50 55 60Pro Gly
Tyr Tyr Cys Val Gly Gly Ala Tyr Gly Asp Gly Thr Ser Ala65
70 75 80Ala Gly Val Tyr Tyr Ser His
Leu Pro Ala Met Pro Asn Lys Ser Asp 85 90
95Asp Gly Thr Leu Cys Asn Met Glu Gly Met Val Pro Ser
Ser Pro Pro 100 105 110Lys Leu
Glu Asp Phe Leu Gly Gly Gly Asn Gly Gly Gly Gln Glu Thr 115
120 125Ala Thr Tyr Tyr Ser His Gln Gln Gln Gly
Gln Glu Glu Gly Ala Ser 130 135 140Arg
Asp Tyr Arg Gln Tyr His Tyr Gln His Gln Gln Leu Val Pro Tyr145
150 155 160Asn Phe Gln Pro Leu Thr
Glu Ala Glu Met Leu Gln Glu Gly Ala Ala 165
170 175Pro Met Glu Glu Ala Met Ala Ala Ala Lys Asn Phe
Leu Leu Ala Ser 180 185 190Tyr
Gly Ala Cys Tyr Ser Asn Glu Glu Thr Arg Pro Leu Ser Leu Ser 195
200 205Met Met Ser Pro Gly Thr Gln Leu Ser
Ser Cys Val Ser Ala Ala Pro 210 215
220Gln Gln Gln His Gln Met Ala Ala Thr Val Ala Thr Ala Ala Thr Ala225
230 235 240Ala Ala Ala Leu
Gly Arg Ser Asn Gly Asp Gly Glu Gln Cys Val Gly 245
250 255Arg Lys Arg Ser Thr Gly Lys Gly Gly His
Lys Gln Thr Val His Arg 260 265
270Lys Ser Ile Asp Thr Phe Gly Gln Arg Thr Ser Arg Tyr Arg Gly Val
275 280 285Thr Arg His Arg Trp Thr Gly
Arg Tyr Glu Ala His Leu Trp Asp Asn 290 295
300Ser Cys Arg Lys Asp Gly Gln Thr Arg Lys Gly Arg Gln Val Tyr
Leu305 310 315 320Gly Gly
Tyr Asp Thr Glu Asp Lys Ala Ala Arg Ala Tyr Asp Leu Ala
325 330 335Ala Leu Lys Tyr Trp Gly Pro
Ala Thr His Val Asn Phe Pro Val Glu 340 345
350Asn Tyr Arg Asp Glu Leu Glu Glu Met Lys Gly Met Thr Arg
Gln Glu 355 360 365Phe Val Ala His
Leu Arg Arg Arg Ser Ser Gly Phe Ser Arg Gly Ala 370
375 380Ser Ile Tyr Arg Gly Val Thr Arg His His Gln Gln
Gly Arg Trp Gln385 390 395
400Ser Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr
405 410 415Phe Thr Thr Gln Glu
Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala Ile 420
425 430Lys Phe Arg Gly Leu Asn Ala Val Thr Asn Phe Asp
Ile Ala Arg Tyr 435 440 445Asp Val
Asp Lys Ile Met Glu Ser Ser Thr Leu Leu Ala Val Glu Glu 450
455 460Ala Arg Lys Val Lys Ala Val Glu Ala Ala Ser
Ser Ala Pro Met Thr465 470 475
480His Thr His Ser Gly Gly Lys Glu Gln Leu Asn Ala Thr Thr Ala Glu
485 490 495Glu Thr Ser Ser
Ala Gly Trp Arg Met Val Leu His Gly Ser Pro His 500
505 510Gln Leu Glu Ala Ala Arg Cys Pro Glu Ala Ala
Asp Leu Gln Ser Ala 515 520 525Ile
Met Asn Asn Asp Ser His Pro Arg Pro Ser Leu His Gly Ile Ala 530
535 540Gly Leu Asp Ile Glu Cys Ala Val His Asp
His His Asp His Leu Asp545 550 555
560Val Pro Ala Gly Ser Arg Thr Thr Ala Ala Gly Ser Ile Asn Phe
Ser 565 570 575Asn Ser Ser
Ser Gln Val Thr Ser Leu Gly Asn Ser Arg Glu Gly Ser 580
585 590Pro Glu Arg Leu Gly Leu Ala Met Met Tyr
Gly Lys Gln Pro Ser Ser 595 600
605Ala Val Ser Leu Ala Ala Thr Met Ser Pro Trp Thr Pro Val Ala Ala 610
615 620Gln Thr Val Ala His Val Leu Lys
Gln Gln Pro Asn Val Val Val Ser625 630
635 640His Arg Pro Val Phe Ala Ala Trp Ala Asp Ala
645 650
90 656 PRT Medicago truncatula 90
Met Lys Arg
Met Glu Asn Asn Asp Asp Ser Val Asp Ile Asn Asn Glu1 5
10 15Asn Asn Trp Leu Gly Phe Ser Leu Ser
Pro Gln Met Asn Asn Ile Gly 20 25
30Val Ser Ser His Thr His His His Ser Leu Pro Ser Ala Thr Ala Thr
35 40 45Ala Ser Glu Val Val Pro Leu
Gln Ala Ser Phe Tyr His Ser Ser Pro 50 55
60Leu Ser Asn Phe Cys Tyr Ser Tyr Gly Leu Glu His Glu Asn Ala Gly65
70 75 80Leu Tyr Ser Leu
Leu Pro Ile Met Pro Leu Lys Ser Asp Gly Ser Leu 85
90 95Phe Glu Met Glu Ala Leu Ser Arg Ser Gln
Thr Gln Ala Met Ser Thr 100 105
110Thr Ser Ala Pro Lys Leu Glu Asn Phe Leu Gly Asn Glu Ala Met Gly
115 120 125Thr Pro His Tyr Ala Cys Ser
Ser Thr Val Thr Glu Thr Met Pro Leu 130 135
140Ser Leu Asp Ser Met Phe Gln Asn Gln Ile Gln Gln Asn Met Asn
Met145 150 155 160Asn Asn
Gln Gln His Leu Ser Tyr Tyr Asn Ser Thr Leu Arg Asn His
165 170 175Glu Leu Met Leu Glu Gly Ser
Lys Gln Ser Gln Thr Ser Ser Gly Asn 180 185
190Phe His Gln Ser Asn Met Gly Glu Asp His Gly Leu Ser Gly
Leu Lys 195 200 205Asn Trp Val Leu
Arg Asn Phe Pro Ala Ser His Gly His Asp Gln Ser 210
215 220Lys Met Ile Val Pro Val Val Glu Glu Asn Glu Gly
Glu Cys Gly Ser225 230 235
240Asn Ile Gly Ser Met Ala Tyr Gly Asp Leu His Ser Leu Ser Leu Ser
245 250 255Met Ser Pro Ser Ser
Gln Ser Ser Cys Val Thr Thr Ser Gln Asn Met 260
265 270Ser Ser Ala Val Val Glu Asn Ser Val Ala Met Asp
Thr Lys Lys Arg 275 280 285Gly Ser
Glu Lys Phe Glu Gln Lys Gln Ile Val His Arg Lys Ser Ile 290
295 300Asp Thr Phe Gly Gln Arg Thr Ser Gln Tyr Arg
Gly Val Thr Arg His305 310 315
320Arg Trp Thr Gly Arg Tyr Glu Ala His Leu Trp Asp Asn Ser Cys Lys
325 330 335Lys Glu Gly Gln
Ser Arg Lys Gly Arg Gln Gly Gly Tyr Asp Met Glu 340
345 350Glu Lys Ala Ala Arg Ala Tyr Asp Gln Ala Ala
Leu Lys Tyr Trp Gly 355 360 365Pro
Ser Thr His Ile Asn Phe Pro Leu Glu Asn Tyr Gln Asn Gln Leu 370
375 380Glu Glu Met Lys Asn Met Thr Arg Gln Glu
Tyr Val Ala His Leu Arg385 390 395
400Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala Ser Met Tyr Arg Gly
Val 405 410 415Thr Ser Arg
His His Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg 420
425 430Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly
Thr Phe Ser Thr Gln Glu 435 440
445Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly Ala 450
455 460Asn Ala Val Thr Asn Phe Asp Ile
Ile Lys Tyr Asp Val Glu Lys Ile465 470
475 480Met Ala Ser Ser Asn Leu Leu Asn Ile Glu Gln Ala
Arg Arg Asn Lys 485 490
495Glu Val Val Asp Ile Ser Ser Thr Gln Tyr Ile Asp Gln Asn Lys Pro
500 505 510Ser Ser Ala Tyr Asp Asn
Asn Ser Thr Gln Glu Ala Ile Ser Met Gln 515 520
525Lys Ser Met Val Leu Tyr Gln Ser Ser Gln His Gln Gln Leu
Gln Gln 530 535 540Asn Gln Pro Arg Phe
Glu Asn Glu Arg Thr His Gln Thr Phe Ser Ser545 550
555 560Val Ser Leu Asp Asn Met Phe His Gln Glu
Val Val Glu Glu Ala Ser 565 570
575Lys Met Arg Thr His Val Ser Asn Ala Ser Ser Leu Ala Thr Ser Leu
580 585 590Ser Ser Ser Arg Glu
Gly Thr Pro Asp Arg Thr Ser Leu Gln Asn Leu 595
600 605Ser Gly Ile Met Pro Ser Thr Ala Ser Lys Leu Leu
Val Thr Ser Ala 610 615 620Pro Asn Ser
Asn Leu Asn Ser Trp Asp Pro Ser Gln His Leu Arg Pro625
630 635 640Ser Leu Ser Leu Pro Gln Met
Pro Val Phe Ala Ala Trp Thr Asp Ala 645
650 655
91 546 PRT Glycine max 91
Met Lys Arg Met Asn Glu Ser
Asn Asn Thr Asp Asp Gly Asn Asn His1 5 10
15Asn Trp Leu Gly Phe Ser Leu Ser Pro His Met Lys Met
Glu Val Thr 20 25 30Ser Ala
Ala Thr Val Ser Asp Asn Asn Val Pro Thr Thr Phe Tyr Met 35
40 45Ser Pro Ser His Met Ser Asn Ser Gly Met
Cys Tyr Ser Val Gly Glu 50 55 60Asn
Gly Asn Phe His Ser Pro Leu Thr Val Met Pro Leu Lys Ser Asp65
70 75 80Gly Ser Leu Gly Ile Leu
Glu Ala Leu Asn Arg Ser Gln Thr Gln Val 85
90 95Met Val Pro Thr Ser Ser Pro Lys Leu Glu Asp Phe
Leu Gly Gly Ala 100 105 110Thr
Met Gly Thr His Glu Tyr Gly Asn His Glu Arg Gly Leu Ser Leu 115
120 125Asp Ser Ile Tyr Tyr Asn Ser Gln Asn
Ala Glu Ala Gln Pro Asn Arg 130 135
140Asn Leu Leu Ser His Pro Phe Arg Gln Gln Gly His Ala Pro Ser Glu145
150 155 160Glu Glu Ala Thr
Lys Glu Thr His Val Ser Val Met Pro Gln Met Thr 165
170 175Gly Gly Gly Leu Gln Asn Trp Ile Leu Glu
Gln Gln Met Asn Cys Gly 180 185
190Ile Trp Asn Glu Arg Ser Gly Val Ser Val Gly Thr Val Gly Cys Gly
195 200 205Glu Leu Gln Ser Leu Ser Leu
Ser Met Ser Pro Gly Ser Gln Ser Ser 210 215
220Cys Val Thr Ala Pro Ser Gly Thr Asp Ser Val Ala Val Asp Ala
Lys225 230 235 240Lys Arg
Gly His Ala Lys Leu Gly Gln Lys Gln Pro Val His Arg Lys
245 250 255Ser Ile Asp Thr Phe Gly Gln
Arg Thr Ser Gln Tyr Arg Gly Val Thr 260 265
270Arg His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu Trp Asp
Asn Ser 275 280 285Cys Lys Lys Glu
Gly Gln Thr Arg Lys Gly Arg Gln Gly Gly Tyr Asp 290
295 300Met Glu Glu Lys Ala Ala Arg Ala Tyr Asp Leu Ala
Ala Leu Lys Tyr305 310 315
320Trp Gly Pro Ser Thr His Ile Asn Phe Ser Ile Glu Asn Tyr Gln Val
325 330 335Gln Leu Glu Glu Met
Lys Asn Met Ser Arg Gln Glu Tyr Val Ala His 340
345 350Leu Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala
Ser Ile Tyr Arg 355 360 365Gly Val
Thr Arg His His Gln His Gly Arg Trp Gln Ala Arg Ile Gly 370
375 380Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly
Thr Phe Ser Thr Gln385 390 395
400Glu Glu Ala Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys Phe Arg Gly
405 410 415Ala Asn Ala Val
Thr Asn Phe Asp Ile Ser Arg Tyr Asp Val Glu Arg 420
425 430Ile Met Ala Ser Ser Asn Leu Leu Ala Gly Glu
Leu Ala Arg Arg Asn 435 440 445Lys
Asp Asn Asp Pro Arg Asn Glu Ala Ile Asp Tyr Asn Lys Ser Val 450
455 460Phe Lys Gln Glu Thr Thr Met Lys Met Ile
Arg Ser Gly Arg Cys Leu465 470 475
480Ser Ser Ser Arg Glu Ala Ser Pro Glu Lys Met Gly Pro Ser Leu
Leu 485 490 495Phe Pro Lys
Pro Pro Pro Met Glu Thr Lys Ile Val Asn Pro Ile Gly 500
505 510Thr Ser Val Thr Ser Trp Leu Pro Ser Pro
Thr Val Gln Met Arg Pro 515 520
525Ser Pro Ala Ile Ser Leu Ser His Leu Pro Val Phe Ala Ala Trp Thr 530
535 540Asp Thr545
92 415 PRT Arabidopsis thaliana 92
Met Lys Lys Trp Leu Gly Phe Ser Leu Thr Pro Pro Leu Arg Ile
Cys1 5 10 15Asn Ser Glu
Glu Glu Glu Leu Arg His Asp Gly Ser Asp Val Trp Arg 20
25 30Tyr Asp Ile Asn Phe Asp His His His His
Asp Glu Asp Val Pro Lys 35 40
45Val Glu Asp Leu Leu Ser Asn Ser His Gln Thr Glu Tyr Pro Ile Asn 50
55 60His Asn Gln Thr Asn Val Asn Cys Thr
Thr Val Val Asn Arg Leu Asn65 70 75
80Pro Pro Gly Tyr Leu Leu His Asp Gln Thr Val Val Thr Pro
His Tyr 85 90 95Pro Asn
Leu Asp Pro Asn Leu Ser Asn Asp Tyr Gly Gly Phe Glu Arg 100
105 110Val Gly Ser Val Ser Val Phe Lys Ser
Trp Leu Glu Gln Gly Thr Pro 115 120
125Ala Phe Pro Leu Ser Ser His Tyr Val Thr Glu Glu Ala Gly Thr Ser
130 135 140Asn Asn Ile Ser His Phe Ser
Asn Glu Glu Thr Gly Tyr Asn Thr Asn145 150
155 160Gly Ser Met Leu Ser Leu Ala Leu Ser His Gly Ala
Cys Ser Asp Leu 165 170
175Ile Asn Glu Ser Asn Val Ser Ala Arg Val Glu Glu Pro Val Lys Val
180 185 190Asp Glu Lys Arg Lys Arg
Leu Val Val Lys Pro Gln Val Lys Glu Ser 195 200
205Val Pro Arg Lys Ser Val Asp Ser Tyr Gly Gln Arg Thr Ser
Gln Tyr 210 215 220Arg Gly Val Thr Arg
His Arg Trp Thr Gly Arg Tyr Glu Ala His Leu225 230
235 240Trp Asp Asn Ser Cys Lys Lys Glu Gly Gln
Thr Arg Arg Gly Arg Gln 245 250
255Val Tyr Leu Gly Gly Tyr Asp Glu Glu Glu Lys Ala Ala Arg Ala Tyr
260 265 270Asp Leu Ala Ala Leu
Lys Tyr Trp Gly Pro Thr Thr His Leu Asn Phe 275
280 285Pro Leu Ser Asn Tyr Glu Lys Glu Ile Glu Glu Leu
Asn Asn Met Asn 290 295 300Arg Gln Glu
Phe Val Ala Met Leu Arg Arg Asn Ser Ser Gly Phe Ser305
310 315 320Arg Gly Ala Ser Val Tyr Arg
Gly Val Thr Arg His His Gln His Gly 325
330 335Arg Trp Gln Ala Arg Ile Gly Arg Val Ala Gly Asn
Lys Asp Leu Tyr 340 345 350Leu
Gly Thr Phe Ser Thr Gln Glu Glu Ala Ala Glu Ala Tyr Asp Ile 355
360 365Ala Ala Ile Lys Phe Arg Gly Leu Asn
Ala Val Thr Asn Phe Asp Ile 370 375
380Asn Arg Tyr Asp Val Lys Arg Ile Cys Ser Ser Ser Thr Ile Val Asp385
390 395 400Ser Asp Gln Ala
Lys His Ser Pro Thr Ser Ser Gly Ala Gly His 405
410 415
93 428 PRT Zea mays 93
Met Ser Pro Pro Thr Asn
Gly Ala Ile Ser Leu Ala Tyr Ala Pro Ser1 5
10 15Met Met Leu Gly Ala Gly Ala Leu Thr Asn Pro Pro
Leu Leu Pro Phe 20 25 30Asp
Gly Phe Thr Asp Glu Asp Phe Leu Ala Ser Ala Asp Ala Ala Leu 35
40 45Leu Gly Glu Ala Gly Thr Asp Gln Thr
Leu Leu Leu Leu Pro Ser Cys 50 55
60Pro Gly Ala Asn Cys Cys Gly Gly Ser Ser Ser Asp Gln Gly Leu Gly65
70 75 80Ala Leu Ala Cys Glu
Val Thr Thr Ala Gly Ser Phe Ser Leu Leu Gly 85
90 95Gln Pro Ala Pro Gly Gln Val Ser Trp Glu Val
Thr Thr Ala Val Ala 100 105
110Ala Asp Arg Asn Thr Phe Ser Arg Ala Arg Asp Pro Ala Pro Ser Pro
115 120 125Pro Pro Ser Pro Ala Leu Pro
Leu Val Gln Thr Thr Ser Gln Ser Gln 130 135
140Arg Thr Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly
Arg145 150 155 160Tyr Glu
Ala His Leu Trp Asp Asn Thr Cys Arg Lys Glu Gly Gln Lys
165 170 175Arg Lys Gly Arg Gln Val Tyr
Leu Gly Gly Tyr Asp Lys Glu Asp Lys 180 185
190Ala Ala Arg Ala Tyr Asp Ile Ala Ala Leu Lys Tyr Trp Gly
Asp Asn 195 200 205Ala Thr Thr Asn
Phe Pro Arg Glu Asn Tyr Ile Arg Glu Ile Gln Asp 210
215 220Met Gln Asn Met Asn Arg Arg Asp Val Val Ala Ser
Leu Arg Arg Lys225 230 235
240Ser Ser Gly Phe Ser Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr Lys
245 250 255His His Gln His Gly
Arg Trp Gln Ala Arg Ile Gly Arg Val Ala Gly 260
265 270Asn Lys Asp Leu Tyr Leu Gly Thr Phe Ala Thr Glu
Gln Glu Ala Ala 275 280 285Glu Ala
Tyr Asp Ile Ala Ala Leu Lys Phe Arg Gly Glu Asn Ala Val 290
295 300Thr Asn Phe Glu Pro Ser Arg Tyr Asn Leu Leu
Ala Ile Ala Gln Arg305 310 315
320Asp Ile Pro Ile Leu Gly Arg Lys Leu Ile Gln Lys Pro Ala Pro Glu
325 330 335Ala Glu Asp Gln
Ala Ala Leu Ser Ala Arg Ser Phe Ser Gln Ser Gln 340
345 350Gln Ser Ser Asn Ser Leu Pro Pro Tyr Phe Leu
Thr Asn Leu Leu Gln 355 360 365Pro
Leu Pro Ser Gln His Ser Leu Ala Gln Ala Leu Pro Ser Tyr Asn 370
375 380Asn Leu Gly Phe Gly Glu Pro Ser Leu Tyr
Trp Pro Cys Pro Cys Gly385 390 395
400Asp Pro Gly Glu Gln Lys Val Gln Leu Gly Ser Lys Leu Glu Ile
Val 405 410 415Asp Gly Leu
Val Gln Leu Ala Asn Ser Ala Ala Asn 420
425
94 438 PRT Arabidopsis thaliana
94 Met Lys Lys Arg Leu Thr Thr Ser Thr Cys
Ser Ser Ser Pro Ser Ser1 5 10
15Ser Val Ser Ser Ser Thr Thr Thr Ser Ser Pro Ile Gln Ser Glu Ala
20 25 30Pro Arg Pro Lys Arg Ala
Lys Arg Ala Lys Lys Ser Ser Pro Ser Gly 35 40
45Asp Lys Ser His Asn Pro Thr Ser Pro Ala Ser Thr Arg Arg
Ser Ser 50 55 60Ile Tyr Arg Gly Val
Thr Arg His Arg Trp Thr Gly Arg Phe Glu Ala65 70
75 80His Leu Trp Asp Lys Ser Ser Trp Asn Ser
Ile Gln Asn Lys Lys Gly 85 90
95Lys Gln Val Tyr Leu Gly Ala Tyr Asp Ser Glu Glu Ala Ala Ala His
100 105 110Thr Tyr Asp Leu Ala
Ala Leu Lys Tyr Trp Gly Pro Asp Thr Ile Leu 115
120 125Asn Phe Pro Ala Glu Thr Tyr Thr Lys Glu Leu Glu
Glu Met Gln Arg 130 135 140Val Thr Lys
Glu Glu Tyr Leu Ala Ser Leu Arg Arg Gln Ser Ser Gly145
150 155 160Phe Ser Arg Gly Val Ser Lys
Tyr Arg Gly Val Ala Arg His His His 165
170 175Asn Gly Arg Trp Glu Ala Arg Ile Gly Arg Val Phe
Gly Asn Lys Tyr 180 185 190Leu
Tyr Leu Gly Thr Tyr Asn Thr Gln Glu Glu Ala Ala Ala Ala Tyr 195
200 205Asp Met Ala Ala Ile Glu Tyr Arg Gly
Ala Asn Ala Val Thr Asn Phe 210 215
220Asp Ile Ser Asn Tyr Ile Asp Arg Leu Lys Lys Lys Gly Val Phe Pro225
230 235 240Phe Pro Val Asn
Gln Ala Asn His Gln Glu Gly Ile Leu Val Glu Ala 245
250 255Lys Gln Glu Val Glu Thr Arg Glu Ala Lys
Glu Glu Pro Arg Glu Glu 260 265
270Val Lys Gln Gln Tyr Val Glu Glu Pro Pro Gln Glu Glu Glu Glu Lys
275 280 285Glu Glu Glu Lys Ala Glu Gln
Gln Glu Ala Glu Ile Val Gly Tyr Ser 290 295
300Glu Glu Ala Ala Val Val Asn Cys Cys Ile Asp Ser Ser Thr Ile
Met305 310 315 320Glu Met
Asp Arg Cys Gly Asp Asn Asn Glu Leu Ala Trp Asn Phe Cys
325 330 335Met Met Asp Thr Gly Phe Ser
Pro Phe Leu Thr Asp Gln Asn Leu Ala 340 345
350Asn Glu Asn Pro Ile Glu Tyr Pro Glu Leu Phe Asn Glu Leu
Ala Phe 355 360 365Glu Asp Asn Ile
Asp Phe Met Phe Asp Asp Gly Lys His Glu Cys Leu 370
375 380Asn Leu Glu Asn Leu Asp Cys Cys Val Val Gly Arg
Glu Ser Pro Pro385 390 395
400Ser Ser Ser Ser Pro Leu Ser Cys Leu Ser Thr Asp Ser Ala Ser Ser
405 410 415Thr Thr Thr Thr Thr
Thr Ser Val Ser Cys Asn Tyr Leu Phe Gln Gly 420
425 430Leu Phe Val Gly Ser Glu
435
95 432 PRT Arabidopsis thaliana
95 Met Trp Asp Leu Asn Asp Ala Pro His Gln
Thr Gln Arg Glu Glu Glu1 5 10
15Ser Glu Glu Phe Cys Tyr Ser Ser Pro Ser Lys Arg Val Gly Ser Phe
20 25 30Ser Asn Ser Ser Ser Ser
Ala Val Val Ile Glu Asp Gly Ser Asp Asp 35 40
45Asp Glu Leu Asn Arg Val Arg Pro Asn Asn Pro Leu Val Thr
His Gln 50 55 60Phe Phe Pro Glu Met
Asp Ser Asn Gly Gly Gly Val Ala Ser Gly Phe65 70
75 80Pro Arg Ala His Trp Phe Gly Val Lys Phe
Cys Gln Ser Asp Leu Ala 85 90
95Thr Gly Ser Ser Ala Gly Lys Ala Thr Asn Val Ala Ala Ala Val Val
100 105 110Glu Pro Ala Gln Pro
Leu Lys Lys Ser Arg Arg Gly Pro Arg Ser Arg 115
120 125Ser Ser Gln Tyr Arg Gly Val Thr Phe Tyr Arg Arg
Thr Gly Arg Trp 130 135 140Glu Ser His
Ile Trp Asp Cys Gly Lys Gln Val Tyr Leu Gly Gly Phe145
150 155 160Asp Thr Ala His Ala Ala Ala
Arg Ala Tyr Asp Arg Ala Ala Ile Lys 165
170 175Phe Arg Gly Val Glu Ala Asp Ile Asn Phe Asn Ile
Asp Asp Tyr Asp 180 185 190Asp
Asp Leu Lys Gln Met Thr Asn Leu Thr Lys Glu Glu Phe Val His 195
200 205Val Leu Arg Arg Gln Ser Thr Gly Phe
Pro Arg Gly Ser Ser Lys Tyr 210 215
220Arg Gly Val Thr Leu His Lys Cys Gly Arg Trp Glu Ala Arg Met Gly225
230 235 240Gln Phe Leu Gly
Lys Lys Tyr Val Tyr Leu Gly Leu Phe Asp Thr Glu 245
250 255Val Glu Ala Ala Arg Ala Tyr Asp Lys Ala
Ala Ile Lys Cys Asn Gly 260 265
270Lys Asp Ala Val Thr Asn Phe Asp Pro Ser Ile Tyr Asp Glu Glu Leu
275 280 285Asn Ala Glu Ser Ser Gly Asn
Pro Thr Thr Pro Gln Asp His Asn Leu 290 295
300Asp Leu Ser Leu Gly Asn Ser Ala Asn Ser Lys His Lys Ser Gln
Asp305 310 315 320Met Arg
Leu Arg Met Asn Gln Gln Gln Gln Asp Ser Leu His Ser Asn
325 330 335Glu Val Leu Gly Leu Gly Gln
Thr Gly Met Leu Asn His Thr Pro Asn 340 345
350Ser Asn His Gln Phe Pro Gly Ser Ser Asn Ile Gly Ser Gly
Gly Gly 355 360 365Phe Ser Leu Phe
Pro Ala Ala Glu Asn His Arg Phe Asp Gly Arg Ala 370
375 380Ser Thr Asn Gln Val Leu Thr Asn Ala Ala Ala Ser
Ser Gly Phe Ser385 390 395
400Pro His His His Asn Gln Ile Phe Asn Ser Thr Ser Thr Pro His Gln
405 410 415Asn Trp Leu Gln Thr
Asn Gly Phe Gln Pro Pro Leu Met Arg Pro Ser 420
425 430
96 449 PRT Arabidopsis thaliana
96 Met Leu Asp Leu
Asn Leu Asn Ala Asp Ser Pro Glu Ser Thr Gln Tyr1 5
10 15Gly Gly Asp Ser Tyr Leu Asp Arg Gln Thr
Ser Asp Asn Ser Ala Gly 20 25
30Asn Arg Val Glu Glu Ser Gly Thr Ser Thr Ser Ser Val Ile Asn Ala
35 40 45Asp Gly Asp Glu Asp Ser Cys Ser
Thr Arg Ala Phe Thr Leu Ser Phe 50 55
60Asp Ile Leu Lys Val Gly Ser Ser Ser Gly Gly Asp Glu Ser Pro Ala65
70 75 80Ala Ser Ala Ser Val
Thr Lys Glu Phe Phe Pro Val Ser Gly Asp Cys 85
90 95Gly His Leu Arg Asp Val Glu Gly Ser Ser Ser
Ser Arg Asn Trp Ile 100 105
110Asp Leu Ser Phe Asp Arg Ile Gly Asp Gly Glu Thr Lys Leu Val Thr
115 120 125Pro Val Pro Thr Pro Ala Pro
Val Pro Ala Gln Val Lys Lys Ser Arg 130 135
140Arg Gly Pro Arg Ser Arg Ser Ser Gln Tyr Arg Gly Val Thr Phe
Tyr145 150 155 160Arg Arg
Thr Gly Arg Trp Glu Ser His Ile Trp Asp Cys Gly Lys Gln
165 170 175Val Tyr Leu Gly Gly Phe Asp
Thr Ala His Ala Ala Ala Arg Ala Tyr 180 185
190Asp Arg Ala Ala Ile Lys Phe Arg Gly Val Asp Ala Asp Ile
Asn Phe 195 200 205Thr Leu Gly Asp
Tyr Glu Glu Asp Met Lys Gln Val Gln Asn Leu Ser 210
215 220Lys Glu Glu Phe Val His Ile Leu Arg Arg Gln Ser
Thr Gly Phe Ser225 230 235
240Arg Gly Ser Ser Lys Tyr Arg Gly Val Thr Leu His Lys Cys Gly Arg
245 250 255Trp Glu Ala Arg Met
Gly Gln Phe Leu Gly Lys Lys Ala Tyr Asp Lys 260
265 270Ala Ala Ile Asn Thr Asn Gly Arg Glu Ala Val Thr
Asn Phe Glu Met 275 280 285Ser Ser
Tyr Gln Asn Glu Ile Asn Ser Glu Ser Asn Asn Ser Glu Ile 290
295 300Asp Leu Asn Leu Gly Ile Ser Leu Ser Thr Gly
Asn Ala Pro Lys Gln305 310 315
320Asn Gly Arg Leu Phe His Phe Pro Ser Asn Thr Tyr Glu Thr Gln Arg
325 330 335Gly Val Ser Leu
Arg Ile Asp Asn Glu Tyr Met Gly Lys Pro Val Asn 340
345 350Thr Pro Leu Pro Tyr Gly Ser Ser Asp His Arg
Leu Tyr Trp Asn Gly 355 360 365Ala
Cys Pro Ser Tyr Asn Asn Pro Ala Glu Gly Arg Ala Thr Glu Lys 370
375 380Arg Ser Glu Ala Glu Gly Met Met Ser Asn
Trp Gly Trp Gln Arg Pro385 390 395
400Gly Gln Thr Ser Ala Val Arg Pro Gln Pro Pro Gly Pro Gln Pro
Pro 405 410 415Pro Leu Phe
Ser Val Ala Ala Ala Ser Ser Gly Phe Ser His Phe Arg 420
425 430Pro Gln Pro Pro Asn Asp Asn Ala Thr Arg
Gly Tyr Phe Tyr Pro His 435 440
445Pro
97 663 DNA Zea mays CDS (1)...(663)
97 atg gag gcg ctg agc ggg cgg gta
ggc gtc aag tgc ggg cgg tgg aac 48Met Glu Ala Leu Ser Gly Arg Val Gly
Val Lys Cys Gly Arg Trp Asn 1 5 10
15cct acg gcg gag cag gtg aag gtc ctg acg gag ctc ttc cgc gcg
ggg 96Pro Thr Ala Glu Gln Val Lys Val Leu Thr Glu Leu Phe Arg Ala Gly
20 25 30ctg cgg acg ccc agc
acg gag cag atc cag cgc atc tcc acc cac ctc 144Leu Arg Thr Pro Ser Thr
Glu Gln Ile Gln Arg Ile Ser Thr His Leu 35 40
45agc gcc ttc ggc aag gtg gag agc aag aac gtc ttc tac tgg
ttc cag 192Ser Ala Phe Gly Lys Val Glu Ser Lys Asn Val Phe Tyr Trp Phe
Gln 50 55 60aac cac aag gcc cgc gag
cgc cac cac cac aag aag cgc cgc cgc ggc 240Asn His Lys Ala Arg Glu Arg
His His His Lys Lys Arg Arg Arg Gly 65 70
75 80gcg tcg tcg tcc tcc ccc gac agc ggc agc ggc agg
gga agc aac aac 288Ala Ser Ser Ser Ser Pro Asp Ser Gly Ser Gly Arg Gly
Ser Asn Asn 85 90 95gag
gaa gac ggc cgt ggt gcc gcc tcg cag tcg cac gac gcc gac gcc 336Glu Glu
Asp Gly Arg Gly Ala Ala Ser Gln Ser His Asp Ala Asp Ala 100
105 110gac gcc gac ctc gtg ctg caa ccg cca
gag agc aag cgg gag gcc aga 384Asp Ala Asp Leu Val Leu Gln Pro Pro Glu
Ser Lys Arg Glu Ala Arg 115 120
125agc tat ggc cac cat cac cgg ctc gtg aca tgc tac gtc agg gac gtg
432Ser Tyr Gly His His His Arg Leu Val Thr Cys Tyr Val Arg Asp Val 130
135 140gtg gag cag cag gag gcg tcg ccg
tcg tgg gag cgg ccg acg agg gag 480Val Glu Gln Gln Glu Ala Ser Pro Ser
Trp Glu Arg Pro Thr Arg Glu145 150 155
160gtg gag acg cta gag ctc ttc ccc ctc aag tcg tac ggc gac
ctc gag 528Val Glu Thr Leu Glu Leu Phe Pro Leu Lys Ser Tyr Gly Asp Leu
Glu 165 170 175gcg gcg gag
aag gtc cgg tcg tac gtc aga ggc atc gcc gcc acc agc 576Ala Ala Glu Lys
Val Arg Ser Tyr Val Arg Gly Ile Ala Ala Thr Ser 180
185 190gag cag tgc agg gag ttg tcc ttc ttc gac gtc
tcc gcc ggc cgg gat 624Glu Gln Cys Arg Glu Leu Ser Phe Phe Asp Val Ser
Ala Gly Arg Asp 195 200 205ccg ccg
ctc gag ctc agg ctc tgc agc ttc ggt ccc tag 663Pro Pro Leu
Glu Leu Arg Leu Cys Ser Phe Gly Pro 210 215
220
98 220 PRT Zea mays
98 Met Glu Ala Leu Ser Gly Arg Val Gly Val Lys
Cys Gly Arg Trp Asn1 5 10
15Pro Thr Ala Glu Gln Val Lys Val Leu Thr Glu Leu Phe Arg Ala Gly
20 25 30Leu Arg Thr Pro Ser Thr Glu
Gln Ile Gln Arg Ile Ser Thr His Leu 35 40
45Ser Ala Phe Gly Lys Val Glu Ser Lys Asn Val Phe Tyr Trp Phe
Gln 50 55 60Asn His Lys Ala Arg Glu
Arg His His His Lys Lys Arg Arg Arg Gly65 70
75 80Ala Ser Ser Ser Ser Pro Asp Ser Gly Ser Gly
Arg Gly Ser Asn Asn 85 90
95Glu Glu Asp Gly Arg Gly Ala Ala Ser Gln Ser His Asp Ala Asp Ala
100 105 110Asp Ala Asp Leu Val Leu
Gln Pro Pro Glu Ser Lys Arg Glu Ala Arg 115 120
125Ser Tyr Gly His His His Arg Leu Val Thr Cys Tyr Val Arg
Asp Val 130 135 140Val Glu Gln Gln Glu
Ala Ser Pro Ser Trp Glu Arg Pro Thr Arg Glu145 150
155 160Val Glu Thr Leu Glu Leu Phe Pro Leu Lys
Ser Tyr Gly Asp Leu Glu 165 170
175Ala Ala Glu Lys Val Arg Ser Tyr Val Arg Gly Ile Ala Ala Thr Ser
180 185 190Glu Gln Cys Arg Glu
Leu Ser Phe Phe Asp Val Ser Ala Gly Arg Asp 195
200 205Pro Pro Leu Glu Leu Arg Leu Cys Ser Phe Gly Pro
210 215 220
99 978 DNA Zea mays CDS (1)...(978)
99 atg gcg gcc aat gcg ggc ggc ggt gga gcg gga gga ggc agc ggc agc 48Met
Ala Ala Asn Ala Gly Gly Gly Gly Ala Gly Gly Gly Ser Gly Ser 1
5 10 15ggc agc gtg gct gcg ccg gcg
gtg tgc cgc ccc agc ggc tcg cgg tgg 96Gly Ser Val Ala Ala Pro Ala Val
Cys Arg Pro Ser Gly Ser Arg Trp 20 25
30acg ccg acg ccg gag cag atc agg atg ctg aag gag ctc tac tac
ggc 144Thr Pro Thr Pro Glu Gln Ile Arg Met Leu Lys Glu Leu Tyr Tyr Gly
35 40 45tgc ggc atc cgg tcg ccc
agc tcg gag cag atc cag cgc atc acc gcc 192Cys Gly Ile Arg Ser Pro Ser
Ser Glu Gln Ile Gln Arg Ile Thr Ala 50 55
60atg ctg cgg cag cac ggc aag atc gag ggc aag aac gtc ttc tac tgg
240Met Leu Arg Gln His Gly Lys Ile Glu Gly Lys Asn Val Phe Tyr Trp 65
70 75 80ttc cag aac cac
aag gcc cgc gag cgc cag aag cgc cgc ctc acc agc 288Phe Gln Asn His Lys
Ala Arg Glu Arg Gln Lys Arg Arg Leu Thr Ser 85
90 95ctc gac gtc aac gtg ccc gcc gcc ggc gcg gcc
gac gcc acc acc agc 336Leu Asp Val Asn Val Pro Ala Ala Gly Ala Ala Asp
Ala Thr Thr Ser 100 105 110caa
ctc ggc gtc ctc tcg ctg tcg tcg ccg cct tca ggc gcg gcg cct 384Gln Leu
Gly Val Leu Ser Leu Ser Ser Pro Pro Ser Gly Ala Ala Pro 115
120 125ccc tcg ccc acc ctc ggc ttc tac gcc gcc
ggc aat ggc ggc gga tcg 432Pro Ser Pro Thr Leu Gly Phe Tyr Ala Ala Gly
Asn Gly Gly Gly Ser 130 135 140gct ggg
ctg ctg gac acg agt tcc gac tgg ggc agc agc ggc gct gct 480Ala Gly Leu
Leu Asp Thr Ser Ser Asp Trp Gly Ser Ser Gly Ala Ala145
150 155 160atg gcc acc gag aca tgc ttc
ctg cag gac tac atg ggc gtg acg gac 528Met Ala Thr Glu Thr Cys Phe Leu
Gln Asp Tyr Met Gly Val Thr Asp 165 170
175acg ggc agc tcg tcg cag tgg cca tgc ttc tcg tcg tcg gac
acg ata 576Thr Gly Ser Ser Ser Gln Trp Pro Cys Phe Ser Ser Ser Asp Thr
Ile 180 185 190atg gcg gcg gcg
gcg gcc gcg gcg cgg gtg gcg acg acg cgg gcg ccc 624Met Ala Ala Ala Ala
Ala Ala Ala Arg Val Ala Thr Thr Arg Ala Pro 195
200 205gag aca ctc cct ctc ttc ccg acc tgc ggc gac gac
gac gac gac gac 672Glu Thr Leu Pro Leu Phe Pro Thr Cys Gly Asp Asp Asp
Asp Asp Asp 210 215 220agc cag ccc ccg
ccg cgg ccg cgg cac gca gtc cca gtc ccg gca ggc 720Ser Gln Pro Pro Pro
Arg Pro Arg His Ala Val Pro Val Pro Ala Gly225 230
235 240gag acc atc cgc ggc ggc ggc ggc agc agc
agc agc tac ttg ccg ttc 768Glu Thr Ile Arg Gly Gly Gly Gly Ser Ser Ser
Ser Tyr Leu Pro Phe 245 250
255tgg ggt gcc ggt gcc gcg tcc aca act gcc ggc gcc act tct tcc gtt
816Trp Gly Ala Gly Ala Ala Ser Thr Thr Ala Gly Ala Thr Ser Ser Val
260 265 270gcg atc cag cag caa cac
cag ctg cag gag cag tac agc ttt tac agc 864Ala Ile Gln Gln Gln His Gln
Leu Gln Glu Gln Tyr Ser Phe Tyr Ser 275 280
285aac agc acc cag ctg gcc ggc acc ggc agc caa gac gta tcg gct
tca 912Asn Ser Thr Gln Leu Ala Gly Thr Gly Ser Gln Asp Val Ser Ala Ser
290 295 300gcg gcc gcc ctg gag ctg agc
ctc agc tca tgg tgc tcc cct tac cct 960Ala Ala Ala Leu Glu Leu Ser Leu
Ser Ser Trp Cys Ser Pro Tyr Pro305 310
315 320gct gca ggg agc atg tga
978Ala Ala Gly Ser Met 325
100 325 PRT Zea mays
100 Met Ala Ala Asn Ala Gly Gly Gly Gly Ala Gly Gly Gly Ser Gly Ser1
5 10 15Gly Ser Val Ala Ala Pro
Ala Val Cys Arg Pro Ser Gly Ser Arg Trp 20 25
30Thr Pro Thr Pro Glu Gln Ile Arg Met Leu Lys Glu Leu
Tyr Tyr Gly 35 40 45Cys Gly Ile
Arg Ser Pro Ser Ser Glu Gln Ile Gln Arg Ile Thr Ala 50
55 60Met Leu Arg Gln His Gly Lys Ile Glu Gly Lys Asn
Val Phe Tyr Trp65 70 75
80Phe Gln Asn His Lys Ala Arg Glu Arg Gln Lys Arg Arg Leu Thr Ser
85 90 95Leu Asp Val Asn Val Pro
Ala Ala Gly Ala Ala Asp Ala Thr Thr Ser 100
105 110Gln Leu Gly Val Leu Ser Leu Ser Ser Pro Pro Ser
Gly Ala Ala Pro 115 120 125Pro Ser
Pro Thr Leu Gly Phe Tyr Ala Ala Gly Asn Gly Gly Gly Ser 130
135 140Ala Gly Leu Leu Asp Thr Ser Ser Asp Trp Gly
Ser Ser Gly Ala Ala145 150 155
160Met Ala Thr Glu Thr Cys Phe Leu Gln Asp Tyr Met Gly Val Thr Asp
165 170 175Thr Gly Ser Ser
Ser Gln Trp Pro Cys Phe Ser Ser Ser Asp Thr Ile 180
185 190Met Ala Ala Ala Ala Ala Ala Ala Arg Val Ala
Thr Thr Arg Ala Pro 195 200 205Glu
Thr Leu Pro Leu Phe Pro Thr Cys Gly Asp Asp Asp Asp Asp Asp 210
215 220Ser Gln Pro Pro Pro Arg Pro Arg His Ala
Val Pro Val Pro Ala Gly225 230 235
240Glu Thr Ile Arg Gly Gly Gly Gly Ser Ser Ser Ser Tyr Leu Pro
Phe 245 250 255Trp Gly Ala
Gly Ala Ala Ser Thr Thr Ala Gly Ala Thr Ser Ser Val 260
265 270Ala Ile Gln Gln Gln His Gln Leu Gln Glu
Gln Tyr Ser Phe Tyr Ser 275 280
285Asn Ser Thr Gln Leu Ala Gly Thr Gly Ser Gln Asp Val Ser Ala Ser 290
295 300Ala Ala Ala Leu Glu Leu Ser Leu
Ser Ser Trp Cys Ser Pro Tyr Pro305 310
315 320Ala Ala Gly Ser Met
325
101 3727 DNA Zea mays
101 atggccactg tgaacaactg gctcgctttc tccctctccc
cgcaggagct gccgccctcc 60cagacgacgg actccacact catctcggcc gccaccgccg
accatgtctc cggcgatgtc 120tgcttcaaca tcccccaaga ttggagcatg aggggatcag
agctttcggc gctcgtcgcg 180gagccgaagc tggaggactt cctcggcggc atctccttct
ccgagcagca tcacaaggcc 240aactgcaaca tgatacccag cactagcagc acagtttgct
acgcgagctc aggtgctagc 300accggctacc atcaccagct gtaccaccag cccaccagct
cagcgctcca cttcgcggac 360tccgtaatgg tggcctcctc ggccggtgtc cacgacggcg
gtgccatgct cagcgcggcc 420gccgctaacg gtgtcgctgg cgctgccagt gccaacggcg
gcggcatcgg gctgtccatg 480atcaagaact ggctgcggag ccaaccggcg cccatgcagc
cgagggcggc ggcggctgag 540ggcgcgcagg ggctctcttt gtccatgaac atggcgggga
cgacccaagg cgctgctggc 600atgccacttc tcgctggaga gcgcgcacgg gcgcccgaga
gtgtatcgac gtcagcacag 660ggtggtgccg tcgtcgtcac ggcgccgaag gaggatagcg
gtggcagcgg tgttgccggt 720gctctagtag ccgtgagcac ggacacgggt ggcagcggcg
gcgcgtcggc tgacaacacg 780gcaaggaaga cggtggacac gttcgggcag cgcacgtcga
tttaccgtgg cgtgacaagg 840taagggggtg gatgaatcaa gtaatcatga aattttgaaa
agccattggt aatccaagga 900actgtcatga tagatttgat tgcatctaga catagttccg
atcgaatcaa atgagtaggc 960caatgtttag cctttgggga tctcgctgat tattaggagt
accattgtat tgggcatggt 1020tgtggtatag tagtagacaa ttaacaaaaa agctaccact
tttcaattat tttaggcata 1080gatggactgg gagatatgag gcacatcttt gggataacag
ttgcagaagg gaaggacaaa 1140ctcgtaaggg tcgtcaaggt atacaaatat aatgcaacat
actgtcatta aatatgcttt 1200ttctgtaagt tttatatttc accaatgatg ttgttattgt
taactgacat tgcttcacac 1260tatcaatttt ggattcggcg caatgatttg tgggattgaa
atcaaatctt aaatctacag 1320tctatttagg tacgcgattt ctctccaact acttaatgca
gttcgtttct ccctataacc 1380atattctttt tcatctcaaa tctcactcga ctcttttttt
ttatcttgta ccattgatag 1440gtggctatga taaagaggag aaagctgcta gggcttatga
tcttgctgct ctgaagtact 1500ggggtcccac aacaacaaca aatttcccag tatgtatatg
tagcatccag ttttacttta 1560ctgaagttca tatctcgtta tgggctataa atatgtatca
aatgatgtcc attagctagt 1620gatctggagt gaaggttcta tagtaaagta aacgctgtgt
gcggagtgca gtagcgggag 1680gtctctcttc tattttctaa gaaaaatgga cattgctgaa
attgtactta aagtcgttta 1740ttttattttt ttgtatttcc aggtgagtaa ctacgaaaag
gagctcgagg acatgaagca 1800catgacaagg caggagtttg tagcgtctct gagaaggtcg
gtctaacagc attgattaat 1860cagtaccacc tctactgaat aaaatctgct gctatttgtt
aaattttgag cgaggtcaac 1920tgcatatttg atcttattag accactgtat atgaatgcag
gaagagcagt ggtttctcca 1980gaggtgcatc catttacagg ggagtgacta ggtatgaatt
catatagcta agaacttaac 2040atcaacaaaa acacacatac acttgggttg atgtggcaga
tgcatgcatg gattgaaaat 2100gtgtgcatgt tgttttactt gaactcgatc tctgtattta
taggcatcac caacatggaa 2160gatggcaagc acggattgga cgagttgcag ggaacaagga
tctttacttg ggcaccttca 2220gtaagtagca aacaaatatg tttttgcatt gtatatagag
tacccttgaa tatataaatt 2280caccacatat acaagcaagt tacagtcaac taacacaatc
tcaacgcaac gagaaagcaa 2340gtgttccagc tgatagtaca catttgtaga ccagccgcat
atggttgttt tgtatgcatg 2400atgactatta aaaatgtgac catcgcatta agtcatgcaa
agttgcattg cagtagtaca 2460ttgcttagtg catgctcctc aagtggcttt tttcaaacct
gatcccatgt ctggtgctat 2520tgttgtctcc cattcacccg tgcatcaggt caaaatagta
ccatgcctga ataagaaaaa 2580caaaacgagc atgcactggc agcagcagac taataaacaa
agttccagca tttactaata 2640aactaattag gctacagcat ccaaaagatt cttccaatta
agccacaact gttcatgcat 2700acatgggtat gccacccagg ataccatgca tgcaccgtgc
acgacgaaag cgaaacgctc 2760gttctcggaa tattagaact gacgaagccg agtgcaacct
tctgtcgtgg atgcaggcac 2820ccaggaggag gcagcggagg cgtacgacat cgcggcgatc
aagttccgcg gcctaaacgc 2880cgtcaccaac ttcgacatga gccgctacga cgtgaagagc
atcctggaca gcagcgccct 2940ccccatcggc agcgccgcca agcgcctcaa ggaggccgag
gccgcagcgt ccgcgcagca 3000ccaccacgcc ggcgtggtga gttacgacgt cggccgcatc
gcctcgcagc tcggcgacgg 3060cggagccctg gcggcggcgt acggcgcgca ctaccacggc
gccgcctggc cgaccatcgc 3120gttccagccg ggcgccgcca ccacaggcct gtaccacccg
tacgcgcagc agccaatgcg 3180cggcggcggg tggtgcaagc aggagcagga ccacgcggtg
atcgcggccg cgcacagcct 3240gcaggacctc caccacctga acctgggcgc ggccggcgcg
cacgactttt tctcggcagg 3300gcagcaggcc gccgccgctg cgatgcacgg cctgggtagc
atcgacagtg cgtcgctcga 3360gcacagcacc ggctccaact ccgtcgtcta caacggcggg
gtcggcgaca gcaacggcgc 3420cagcgccgtc ggcggcagtg gcggtggcta catgatgccg
atgagcgctg ccggagcaac 3480cactacatcg gcaatggtga gccacgagca ggtgcatgca
cgggcctacg acgaagccaa 3540gcaggctgct cagatggggt acgagagcta cctggtgaac
gcggagaaca atggtggcgg 3600aaggatgtct gcatggggga ctgtcgtgtc tgcagccgcg
gcggcagcag caagcagcaa 3660cgacaacatg gccgccgacg tcggccatgg cggcgcgcag
ctcttcagtg tctggaacga 3720cacttaa
3727
102 4325 DNA Oryza sativa
102 atgcatatct atcttatata
aatatctacc agtgatactg ttgcttagtg ctccaaacct 60ctcttgacct cttcttcttc
ttctcagtta gcttagctta agcttcccct aaccttgagc 120tcaccacaac aatggcgact
tgatctaaca gagcttaacc aagtagcaaa tcatacatat 180aaccatagct taattcgcat
tgaatcttgt cttgttcagt gtgaatcatc aaccatggcc 240accatgaaca actggctggc
cttctccctc tccccgcagg atcagctccc gccgtctcag 300accaactcca ctctcatctc
cgccgccgcc accaccacca ccgccggcga ctcctccacc 360ggcgacgtct gcttcaacat
cccccaaggt aattaagctc accaatcgat gcatgcattc 420atgagctaga tatagctagt
gttggttggg atttgaagag acatgcatgt ttgattgatt 480gatttgatgt gcagattgga
gcatgagggg atcggagctc tcggcgctcg tcgccgagcc 540gaagctggag gacttcctcg
gcggcatctc cttctcggag cagcagcatc atcacggcgg 600caagggcggc gtgatcccga
gcagcgccgc cgcttgctac gcgagctccg gcagcagcgt 660cggctacctg taccctcctc
caagctcatc ctcgctccag ttcgccgact ccgtcatggt 720ggccacctcc tcgcccgtcg
tcgcccacga cggcgtcagc ggcggcggca tggtgagcgc 780cgccgccgcc gcggcggcca
gtggcaacgg cggcattggc ctgtccatga tcaagaactg 840gctccggagc cagccggcgc
cgcagccggc gcaggcgctg tctctgtcca tgaacatggc 900ggggacgacg acggcgcagg
gcggcggcgc catggcgctc ctcgccggcg caggggagcg 960aggccggacg acgcccgcgt
cagagagcct gtccacgtcg gcgcacggag cgacgacggc 1020gacgatggct ggtggtcgca
aggagattaa cgaggaaggc agcggcagcg ccggcgccgt 1080ggttgccgtc ggctcggagt
caggcggcag cggcgccgtg gtggaggccg gcgcggcggc 1140ggcggcggcg aggaagtccg
tcgacacgtt cggccagaga acatcgatct accgcggcgt 1200gacaaggtat ttagggtgca
attaattaat catctatcta tattttgctc aaaaaagttc 1260atctactagc tagcttagca
caaatcatca tcagtgtaat catatatatt ctttgatgat 1320ttaactgtgt tgcatgaatt
cattcctatt tgatgtttgt gatttggatc ccattttcta 1380ggatagctat ataggtgata
gattgatcat tagatttgta ggatttatca ttatgtcatt 1440attatgtggg acatgattgt
tgtgattaac aaagttgtaa tatcttttgg tttggttata 1500ggcatagatg gacagggagg
tatgaggctc atctttggga caacagctgc agaagagagg 1560gccaaactcg caagggtcgt
caaggtaggc taactagtgc catttaaatc gattaattgt 1620ttttttatgc tccaatggcg
attgatactg atcttgtttc tttttctaat gatcatttcg 1680ggatcgaatg atcttcctct
gtttgatcga acttggcttt tgaatctaca gtctatctag 1740gtgagtgaga ttccttgaac
ctagatgttc tgtttgcgat gcatgtatat attcggtaga 1800ttgaattatt tgctgatctt
tgctttcttg aagtttaatg atcttataaa ttgtaatgct 1860gataggtggt tatgacaaag
aggaaaaagc tgctagagct tatgatttgg ctgctctcaa 1920atactggggc ccgacgacga
cgacaaattt tccggtgtgt ttataattaa tatacagatt 1980gtgtcacatt gttattttct
cactctttta tttgatactg atctagtgta atgatgatta 2040ctaaaactgt acttaaaggc
aatggtttct gtatttttca ggtaaataac tatgaaaagg 2100agctggagga gatgaagcac
atgacaaggc aggagttcgt agcctctttg agaaggttgg 2160tctctacaat caagatatcc
atactatact aattaatttc cttttagatt tatagtaatt 2220tatctatcgc attgaagtta
attaattatc tgatgcttac tgatactaac aaatactgtt 2280ccttatatgt gcaggaagag
cagtggtttc tccagaggtg catccattta ccgtggagta 2340actaggtaca tatatatatg
catcattgta caattaattt ttttaatttt tttagggtaa 2400aaaatgaaga ctgtgatata
gatccattaa tttgatcttg tgtacttgta aatataggca 2460tcaccagcat gggagatggc
aagcaaggat aggaagagtt gcagggaaca aggacctcta 2520cttgggcacc ttcagtaagt
acaaatattc atatttatac tgcaaaacca tataaatcca 2580tattaataag tatgtccttt
ctcattgagt atacaaaata tcatattttc ttggcaagta 2640caatttattc attcagggca
aaatagtagt agtaagaaag aggggtgact cttcaaagaa 2700cacagagctt acttaagcct
gtaactaatt aattaaacta aaaatgtgat ctgcaagtca 2760tgtcaagttg cattacacca
ctaatatata tactctgtgc atgcttgcat gctctcctca 2820tgtggctagc taccttttca
aaccttccat gtctggtgct actcctgtct ccattcacca 2880ctgcacctgg tcaagatcct
cactaattaa gaaacaataa tgcattattt gcagtaaata 2940atttaactag tgttaatcac
attctttgca acacaaacta atcaccaatt aagctagcta 3000gctagccaaa atgataatct
tgcttgcatg cgctaatggt gtgtgtgatg atggtggtgt 3060cacgcatgca ggcacgcagg
aggaggcggc ggaggcgtac gacatcgcgg cgatcaagtt 3120ccgggggctc aacgccgtca
ccaacttcga catgagccgc tacgacgtca agagcatcct 3180cgacagcgct gccctccccg
tcggcaccgc cgccaagcgc ctcaaggacg ccgaggccgc 3240cgccgcctac gacgtcggcc
gcatcgcctc gcacctcggc ggcgacggcg cctacgccgc 3300gcattacggc caccaccacc
actcggccgc cgccgcctgg ccgaccatcg cgttccaggc 3360ggcggcggcg ccgccgccgc
acgccgccgg gctttaccac ccgtacgcgc agccgctgcg 3420tgggtggtgc aagcaggagc
aggaccacgc cgtgatcgcg gcggcgcaca gcctgcagga 3480tctccaccac ctcaacctcg
gcgccgccgc cgccgcgcat gacttcttct cgcaggcgat 3540gcagcagcag cacggcctcg
gcagcatcga caacgcgtcg ctcgagcaca gcaccggctc 3600caactccgtc gtctacaacg
gcgacaatgg cggcggaggc ggcggctaca tcatggcgcc 3660gatgagcgcc gtgtcggcca
cggccaccgc ggtggcgagc agccacgatc acggcggcga 3720cggcgggaag caggtgcaga
tggggtacga cagctacctc gtcggcgcag acgcctacgg 3780cggcggcggc gccgggagga
tgccatcctg ggcgatgacg ccggcgtcgg cgccggccgc 3840cacgagcagc agcgacatga
ccggagtctg ccatggcgca cagctcttca gcgtctggaa 3900cgacacataa aaaaaaaact
aggttagcca gcttaattag cagggtaaac cactgacaca 3960attaagccat acttaaatta
gggttcatga gatgaccatt aagcaggtta ttatcattaa 4020tgatgtttaa tttctcaatt
agtacttagc tcaaaaggag gggatttctt ctgaaggatg 4080gtgatggctt gtgaaattga
acctggtgtt cttgccatga tttttttttc acaagctgcc 4140attttggggt tcaggttcag
aaggatcctg attattatta accagccata tatatataga 4200agggtagaaa tggaggtatc
ctgcttgtaa attggggcaa tggtagctag agttgatgca 4260atgaccatgc ttcatgtgat
gagaactaat tgtcttcctc tgatcaaatt aagcaggaag 4320attaa
4325
103 2088 DNA Oryza sativa CDS (1)...(2088)
103 atg gcc act atg aac aac tgg ctc gcc ttc tcg ctc
tcg ccg cag gac 48Met Ala Thr Met Asn Asn Trp Leu Ala Phe Ser Leu Ser
Pro Gln Asp 1 5 10 15caa
ctc cca ccg tcg cag acc aat agc act ctc atc tcc gct gct gca 96Gln Leu
Pro Pro Ser Gln Thr Asn Ser Thr Leu Ile Ser Ala Ala Ala 20
25 30acc acc aca acc gca ggc gat tcg tca
acg ggc gac gtc tgc ttc aac 144Thr Thr Thr Thr Ala Gly Asp Ser Ser Thr
Gly Asp Val Cys Phe Asn 35 40
45atc cct caa gac tgg tcc atg cgc gga agc gag ctt agc gct ctc gtc
192Ile Pro Gln Asp Trp Ser Met Arg Gly Ser Glu Leu Ser Ala Leu Val 50
55 60gcg gag ccc aag ttg gag gat ttc
ttg gga ggc atc tcc ttc tcg gag 240Ala Glu Pro Lys Leu Glu Asp Phe Leu
Gly Gly Ile Ser Phe Ser Glu 65 70 75
80caa cag cat cat cac ggc gga aag ggc ggt gtt atc cca agc
tct gct 288Gln Gln His His His Gly Gly Lys Gly Gly Val Ile Pro Ser Ser
Ala 85 90 95gcc gca tgc
tat gca agc tcc ggc tcc agc gtg ggc tac ctc tac cct 336Ala Ala Cys Tyr
Ala Ser Ser Gly Ser Ser Val Gly Tyr Leu Tyr Pro 100
105 110ccg cct tca tcc tcg tca ctt cag ttt gca gac
agc gtg atg gtc gca 384Pro Pro Ser Ser Ser Ser Leu Gln Phe Ala Asp Ser
Val Met Val Ala 115 120 125acc tca
tct cca gtg gtt gcg cac gat ggc gtg agc ggt ggc ggt atg 432Thr Ser Ser
Pro Val Val Ala His Asp Gly Val Ser Gly Gly Gly Met 130
135 140gtc tca gca gca gcg gct gca gca gct tcg ggt aat
ggc ggg att ggc 480Val Ser Ala Ala Ala Ala Ala Ala Ala Ser Gly Asn Gly
Gly Ile Gly145 150 155
160ctc tcc atg atc aag aac tgg ctc agg agc caa ccg gct ccg caa cct
528Leu Ser Met Ile Lys Asn Trp Leu Arg Ser Gln Pro Ala Pro Gln Pro
165 170 175gcg caa gca ctc agc
ctg tcg atg aac atg gct ggt act act acc gct 576Ala Gln Ala Leu Ser Leu
Ser Met Asn Met Ala Gly Thr Thr Thr Ala 180
185 190caa ggt gga ggc gca atg gca ctt ctc gca ggc gct
ggc gaa aga gga 624Gln Gly Gly Gly Ala Met Ala Leu Leu Ala Gly Ala Gly
Glu Arg Gly 195 200 205agg acc aca
cca gca tcc gag agc ctc tct act tcc gcg cac gga gcc 672Arg Thr Thr Pro
Ala Ser Glu Ser Leu Ser Thr Ser Ala His Gly Ala 210
215 220acc acg gct aca atg gct ggc ggg agg aaa gag atc
aac gag gaa gga 720Thr Thr Ala Thr Met Ala Gly Gly Arg Lys Glu Ile Asn
Glu Glu Gly225 230 235
240tct gga tcc gct ggt gcc gtg gtt gca gtt ggc tca gaa tca ggt gga
768Ser Gly Ser Ala Gly Ala Val Val Ala Val Gly Ser Glu Ser Gly Gly
245 250 255tcc ggc gct gtt gtt
gaa gct ggt gcc gct gcg gca gcg gct cgg aag 816Ser Gly Ala Val Val Glu
Ala Gly Ala Ala Ala Ala Ala Ala Arg Lys 260
265 270agc gtt gat act ttc ggc caa aga acg agc atc tac
aga ggc gtt act 864Ser Val Asp Thr Phe Gly Gln Arg Thr Ser Ile Tyr Arg
Gly Val Thr 275 280 285cgg cac cgc
tgg acc ggc agg tac gag gca cac ttg tgg gac aac agc 912Arg His Arg Trp
Thr Gly Arg Tyr Glu Ala His Leu Trp Asp Asn Ser 290
295 300tgt cgc cgc gag ggc caa act agg aag gga aga cag
gtc tat cta gga 960Cys Arg Arg Glu Gly Gln Thr Arg Lys Gly Arg Gln Val
Tyr Leu Gly305 310 315
320gga tat gac aaa gag gag aag gct gcc aga gcg tac gac ctg gcc gcg
1008Gly Tyr Asp Lys Glu Glu Lys Ala Ala Arg Ala Tyr Asp Leu Ala Ala
325 330 335ttg aag tac tgg ggt
cca aca acg acg acc aac ttc ccg gtg aac aac 1056Leu Lys Tyr Trp Gly Pro
Thr Thr Thr Thr Asn Phe Pro Val Asn Asn 340
345 350tac gag aag gag ctg gaa gag atg aag cac atg acg
cgg cag gag ttc 1104Tyr Glu Lys Glu Leu Glu Glu Met Lys His Met Thr Arg
Gln Glu Phe 355 360 365gtc gct tct
ctc agg cgc aag tca tct ggt ttc tcc aga ggt gcg tcg 1152Val Ala Ser Leu
Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly Ala Ser 370
375 380atc tat aga gga gtt acc cgc cac cac cag cac gga
agg tgg cag gca 1200Ile Tyr Arg Gly Val Thr Arg His His Gln His Gly Arg
Trp Gln Ala385 390 395
400aga atc ggg aga gtc gcc ggt aac aag gac ctg tac ttg gga acc ttc
1248Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly Thr Phe
405 410 415tcg act cag gag gag
gca gcg gaa gcg tat gac att gcg gcg atc aag 1296Ser Thr Gln Glu Glu Ala
Ala Glu Ala Tyr Asp Ile Ala Ala Ile Lys 420
425 430ttc cgc ggt ctc aat gcc gtg acc aac ttc gac atg
tca cgc tat gat 1344Phe Arg Gly Leu Asn Ala Val Thr Asn Phe Asp Met Ser
Arg Tyr Asp 435 440 445gtc aag tcg
att ctg gat agc gct gcg ttg cct gtg gga acc gct gcc 1392Val Lys Ser Ile
Leu Asp Ser Ala Ala Leu Pro Val Gly Thr Ala Ala 450
455 460aaa cgc ctc aag gac gcg gaa gca gct gcc gcg tac
gat gtt ggc agg 1440Lys Arg Leu Lys Asp Ala Glu Ala Ala Ala Ala Tyr Asp
Val Gly Arg465 470 475
480att gcc tca cat ctc ggt gga gat gga gct tac gct gcc cac tac ggg
1488Ile Ala Ser His Leu Gly Gly Asp Gly Ala Tyr Ala Ala His Tyr Gly
485 490 495cat cat cac cac tct
gca gcc gca gct tgg cct aca ata gca ttc caa 1536His His His His Ser Ala
Ala Ala Ala Trp Pro Thr Ile Ala Phe Gln 500
505 510gcg gca gcg gct cct cct cca cac gct gct ggt ctt
tac cat ccg tac 1584Ala Ala Ala Ala Pro Pro Pro His Ala Ala Gly Leu Tyr
His Pro Tyr 515 520 525gcg caa cct
ctc cgc ggt tgg tgt aag cag gaa caa gat cat gcg gtg 1632Ala Gln Pro Leu
Arg Gly Trp Cys Lys Gln Glu Gln Asp His Ala Val 530
535 540att gcg gct gca cac agc ttg caa gat ctg cat cac
ctc aat ctg gga 1680Ile Ala Ala Ala His Ser Leu Gln Asp Leu His His Leu
Asn Leu Gly545 550 555
560gcc gca gca gct gcc cat gac ttc ttc tca caa gcc atg cag cag cag
1728Ala Ala Ala Ala Ala His Asp Phe Phe Ser Gln Ala Met Gln Gln Gln
565 570 575cat ggc ctg ggc agc
ata gac aat gcg tct ctg gag cac tcc acc gga 1776His Gly Leu Gly Ser Ile
Asp Asn Ala Ser Leu Glu His Ser Thr Gly 580
585 590tcg aac tcg gtg gtg tac aat gga gac aac ggc gga
gga ggt gga ggt 1824Ser Asn Ser Val Val Tyr Asn Gly Asp Asn Gly Gly Gly
Gly Gly Gly 595 600 605tac atc atg
gca cct atg tca gcg gtc tct gct acc gct acg gcg gtg 1872Tyr Ile Met Ala
Pro Met Ser Ala Val Ser Ala Thr Ala Thr Ala Val 610
615 620gcc tca tcc cac gac cac ggt gga gac ggc ggc aag
cag gtc caa atg 1920Ala Ser Ser His Asp His Gly Gly Asp Gly Gly Lys Gln
Val Gln Met625 630 635
640ggc tac gac tcc tac ctt gtg gga gct gac gct tac ggc gga gga gga
1968Gly Tyr Asp Ser Tyr Leu Val Gly Ala Asp Ala Tyr Gly Gly Gly Gly
645 650 655gct ggt cgc atg cct
agc tgg gcc atg acg cct gct tct gct cct gcg 2016Ala Gly Arg Met Pro Ser
Trp Ala Met Thr Pro Ala Ser Ala Pro Ala 660
665 670gct acg agc tcg tcg gat atg aca gga gtg tgt cat
ggc gcc caa ctg 2064Ala Thr Ser Ser Ser Asp Met Thr Gly Val Cys His Gly
Ala Gln Leu 675 680 685ttc tcg gtg
tgg aat gat aca tag 2088Phe Ser Val Trp
Asn Asp Thr 690 695
104 2133 DNA Zea mays CDS (1)...(2133)
104 atg gcc act gtg aac aac tgg ctc gct ttc tcc ctc tcc ccg cag gag
48Met Ala Thr Val Asn Asn Trp Leu Ala Phe Ser Leu Ser Pro Gln Glu 1
5 10 15ctg ccg ccc tcc cag acg
acg gac tcc aca ctc atc tcg gcc gcc acc 96Leu Pro Pro Ser Gln Thr Thr
Asp Ser Thr Leu Ile Ser Ala Ala Thr 20 25
30gcc gac cat gtc tcc ggc gat gtc tgc ttc aac atc ccc caa
gat tgg 144Ala Asp His Val Ser Gly Asp Val Cys Phe Asn Ile Pro Gln Asp
Trp 35 40 45agc atg agg gga tca
gag ctt tcg gcg ctc gtc gcg gag ccg aag ctg 192Ser Met Arg Gly Ser Glu
Leu Ser Ala Leu Val Ala Glu Pro Lys Leu 50 55
60gag gac ttc ctc ggc ggc atc tcc ttc tcc gag cag cat cac aag
gcc 240Glu Asp Phe Leu Gly Gly Ile Ser Phe Ser Glu Gln His His Lys Ala
65 70 75 80aac tgc aac
atg ata ccc agc act agc agc aca gtt tgc tac gcg agc 288Asn Cys Asn Met
Ile Pro Ser Thr Ser Ser Thr Val Cys Tyr Ala Ser 85
90 95tca ggt gct agc acc ggc tac cat cac cag
ctg tac cac cag ccc acc 336Ser Gly Ala Ser Thr Gly Tyr His His Gln Leu
Tyr His Gln Pro Thr 100 105
110agc tca gcg ctc cac ttc gcg gac tcc gta atg gtg gcc tcc tcg gcc
384Ser Ser Ala Leu His Phe Ala Asp Ser Val Met Val Ala Ser Ser Ala
115 120 125ggt gtc cac gac ggc ggt gcc
atg ctc agc gcg gcc gcc gct aac ggt 432Gly Val His Asp Gly Gly Ala Met
Leu Ser Ala Ala Ala Ala Asn Gly 130 135
140gtc gct ggc gct gcc agt gcc aac ggc ggc ggc atc ggg ctg tcc atg
480Val Ala Gly Ala Ala Ser Ala Asn Gly Gly Gly Ile Gly Leu Ser Met145
150 155 160att aag aac tgg
ctg cgg agc caa ccg gcg ccc atg cag ccg agg gtg 528Ile Lys Asn Trp Leu
Arg Ser Gln Pro Ala Pro Met Gln Pro Arg Val 165
170 175gcg gcg gct gag ggc gcg cag ggg ctc tct ttg
tcc atg aac atg gcg 576Ala Ala Ala Glu Gly Ala Gln Gly Leu Ser Leu Ser
Met Asn Met Ala 180 185 190ggg
acg acc caa ggc gct gct ggc atg cca ctt ctc gct gga gag cgc 624Gly Thr
Thr Gln Gly Ala Ala Gly Met Pro Leu Leu Ala Gly Glu Arg 195
200 205gca cgg gcg ccc gag agt gta tcg acg tca
gca cag ggt gga gcc gtc 672Ala Arg Ala Pro Glu Ser Val Ser Thr Ser Ala
Gln Gly Gly Ala Val 210 215 220gtc gtc
acg gcg ccg aag gag gat agc ggt ggc agc ggt gtt gcc ggc 720Val Val Thr
Ala Pro Lys Glu Asp Ser Gly Gly Ser Gly Val Ala Gly225
230 235 240gct cta gta gcc gtg agc acg
gac acg ggt ggc agc ggc ggc gcg tcg 768Ala Leu Val Ala Val Ser Thr Asp
Thr Gly Gly Ser Gly Gly Ala Ser 245 250
255gct gac aac acg gca agg aag acg gtg gac acg ttc ggg cag
cgc acg 816Ala Asp Asn Thr Ala Arg Lys Thr Val Asp Thr Phe Gly Gln Arg
Thr 260 265 270tcg att tac cgt
ggc gtg aca agg cat aga tgg act ggg aga tat gag 864Ser Ile Tyr Arg Gly
Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu 275
280 285gca cat ctt tgg gat aac agt tgc aga agg gaa ggg
caa act cgt aag 912Ala His Leu Trp Asp Asn Ser Cys Arg Arg Glu Gly Gln
Thr Arg Lys 290 295 300ggt cgt caa gtc
tat tta ggt ggc tat gat aaa gag gag aaa gct gct 960Gly Arg Gln Val Tyr
Leu Gly Gly Tyr Asp Lys Glu Glu Lys Ala Ala305 310
315 320agg gct tat gat ctt gct gct ctg aag tac
tgg ggt gcc aca aca aca 1008Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp
Gly Ala Thr Thr Thr 325 330
335aca aat ttt cca gtg agt aac tac gaa aag gag ctc gag gac atg aag
1056Thr Asn Phe Pro Val Ser Asn Tyr Glu Lys Glu Leu Glu Asp Met Lys
340 345 350cac atg aca agg cag gag
ttt gta gcg tct ctg aga agg aag agc agt 1104His Met Thr Arg Gln Glu Phe
Val Ala Ser Leu Arg Arg Lys Ser Ser 355 360
365ggt ttc tcc aga ggt gca tcc att tac agg gga gtg act agg cat
cac 1152Gly Phe Ser Arg Gly Ala Ser Ile Tyr Arg Gly Val Thr Arg His His
370 375 380caa cat gga aga tgg caa gca
cgg att gga cga gtt gca ggg aac aag 1200Gln His Gly Arg Trp Gln Ala Arg
Ile Gly Arg Val Ala Gly Asn Lys385 390
395 400gat ctt tac ttg ggc acc ttc agc acc cag gag gag
gca gcg gag gcg 1248Asp Leu Tyr Leu Gly Thr Phe Ser Thr Gln Glu Glu Ala
Ala Glu Ala 405 410 415tac
gac atc gcg gcg atc aag ttc cgc ggc ctc aac gcc gtc acc aac 1296Tyr Asp
Ile Ala Ala Ile Lys Phe Arg Gly Leu Asn Ala Val Thr Asn 420
425 430ttc gac atg agc cgc tac gac gtg aag
agc atc ctg gac agc agc gcc 1344Phe Asp Met Ser Arg Tyr Asp Val Lys Ser
Ile Leu Asp Ser Ser Ala 435 440
445ctc ccc atc ggc agc gcc gcc aag cgc ctc aag gag gcc gag gcc gca
1392Leu Pro Ile Gly Ser Ala Ala Lys Arg Leu Lys Glu Ala Glu Ala Ala
450 455 460gcg tcc gcg cag cac cac cac
gcc ggc gtg gtg agc tac gac gtc ggc 1440Ala Ser Ala Gln His His His Ala
Gly Val Val Ser Tyr Asp Val Gly465 470
475 480cgc atc gcc tcg cag ctc ggc gac ggc gga gcc ctg
gcg gcg gcg tac 1488Arg Ile Ala Ser Gln Leu Gly Asp Gly Gly Ala Leu Ala
Ala Ala Tyr 485 490 495ggc
gcg cac tac cac ggc gcc gcc tgg ccg acc atc gcg ttc cag ccg 1536Gly Ala
His Tyr His Gly Ala Ala Trp Pro Thr Ile Ala Phe Gln Pro 500
505 510ggc gcc gcc agc aca ggc ctg tac cac
ccg tac gcg cag cag cca atg 1584Gly Ala Ala Ser Thr Gly Leu Tyr His Pro
Tyr Ala Gln Gln Pro Met 515 520
525cgc ggc ggc ggg tgg tgc aag cag gag cag gac cac gcg gtg atc gcg
1632Arg Gly Gly Gly Trp Cys Lys Gln Glu Gln Asp His Ala Val Ile Ala
530 535 540gcc gcg cac agc ctg cag gac
ctc cac cac ctg aac ctg ggc gcg gcc 1680Ala Ala His Ser Leu Gln Asp Leu
His His Leu Asn Leu Gly Ala Ala545 550
555 560ggc gcg cac gac ttt ttc tcg gca ggg cag cag gcc
gcc gcc gct gcg 1728Gly Ala His Asp Phe Phe Ser Ala Gly Gln Gln Ala Ala
Ala Ala Ala 565 570 575atg
cac ggc ctg ggt agc atc gac agt gcg tcg ctc gag cac agc acc 1776Met His
Gly Leu Gly Ser Ile Asp Ser Ala Ser Leu Glu His Ser Thr 580
585 590ggc tcc aac tcc gtc gtc tac aac ggc
ggg gtc ggc gac agc aac ggc 1824Gly Ser Asn Ser Val Val Tyr Asn Gly Gly
Val Gly Asp Ser Asn Gly 595 600
605gcc agc gcc gtc ggc ggc agt ggc ggt ggc tac atg atg ccg atg agc
1872Ala Ser Ala Val Gly Gly Ser Gly Gly Gly Tyr Met Met Pro Met Ser
610 615 620gct gcc gga gca acc act aca
tcg gca atg gtg agc cac gag cag gtg 1920Ala Ala Gly Ala Thr Thr Thr Ser
Ala Met Val Ser His Glu Gln Val625 630
635 640cat gca cgg gcc tac gac gaa gcc aag cag gct gct
cag atg ggg tac 1968His Ala Arg Ala Tyr Asp Glu Ala Lys Gln Ala Ala Gln
Met Gly Tyr 645 650 655gag
agc tac ctg gtg aac gcg gag aac aat ggt ggc gga agg atg tct 2016Glu Ser
Tyr Leu Val Asn Ala Glu Asn Asn Gly Gly Gly Arg Met Ser 660
665 670gca tgg ggg act gtc gtg tct gca gcc
gcg gcg gca gca gca agc agc 2064Ala Trp Gly Thr Val Val Ser Ala Ala Ala
Ala Ala Ala Ala Ser Ser 675 680
685aac gac aac atg gcc gcc gac gtc ggc cat ggc ggc gcg cag ctc ttc
2112Asn Asp Asn Met Ala Ala Asp Val Gly His Gly Gly Ala Gln Leu Phe
690 695 700agt gtc tgg aac gac act taa
2133
Ser Val Trp Asn Asp Thr
705 710
105 710 PRT Zea mays 105
Met Ala Thr Val Asn Asn Trp Leu Ala Phe
Ser Leu Ser Pro Gln Glu1 5 10
15Leu Pro Pro Ser Gln Thr Thr Asp Ser Thr Leu Ile Ser Ala Ala Thr
20 25 30Ala Asp His Val Ser Gly
Asp Val Cys Phe Asn Ile Pro Gln Asp Trp 35 40
45Ser Met Arg Gly Ser Glu Leu Ser Ala Leu Val Ala Glu Pro
Lys Leu 50 55 60Glu Asp Phe Leu Gly
Gly Ile Ser Phe Ser Glu Gln His His Lys Ala65 70
75 80Asn Cys Asn Met Ile Pro Ser Thr Ser Ser
Thr Val Cys Tyr Ala Ser 85 90
95Ser Gly Ala Ser Thr Gly Tyr His His Gln Leu Tyr His Gln Pro Thr
100 105 110Ser Ser Ala Leu His
Phe Ala Asp Ser Val Met Val Ala Ser Ser Ala 115
120 125Gly Val His Asp Gly Gly Ala Met Leu Ser Ala Ala
Ala Ala Asn Gly 130 135 140Val Ala Gly
Ala Ala Ser Ala Asn Gly Gly Gly Ile Gly Leu Ser Met145
150 155 160Ile Lys Asn Trp Leu Arg Ser
Gln Pro Ala Pro Met Gln Pro Arg Val 165
170 175Ala Ala Ala Glu Gly Ala Gln Gly Leu Ser Leu Ser
Met Asn Met Ala 180 185 190Gly
Thr Thr Gln Gly Ala Ala Gly Met Pro Leu Leu Ala Gly Glu Arg 195
200 205Ala Arg Ala Pro Glu Ser Val Ser Thr
Ser Ala Gln Gly Gly Ala Val 210 215
220Val Val Thr Ala Pro Lys Glu Asp Ser Gly Gly Ser Gly Val Ala Gly225
230 235 240Ala Leu Val Ala
Val Ser Thr Asp Thr Gly Gly Ser Gly Gly Ala Ser 245
250 255Ala Asp Asn Thr Ala Arg Lys Thr Val Asp
Thr Phe Gly Gln Arg Thr 260 265
270Ser Ile Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr Glu
275 280 285Ala His Leu Trp Asp Asn Ser
Cys Arg Arg Glu Gly Gln Thr Arg Lys 290 295
300Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Lys Glu Glu Lys Ala
Ala305 310 315 320Arg Ala
Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly Ala Thr Thr Thr
325 330 335Thr Asn Phe Pro Val Ser Asn
Tyr Glu Lys Glu Leu Glu Asp Met Lys 340 345
350His Met Thr Arg Gln Glu Phe Val Ala Ser Leu Arg Arg Lys
Ser Ser 355 360 365Gly Phe Ser Arg
Gly Ala Ser Ile Tyr Arg Gly Val Thr Arg His His 370
375 380Gln His Gly Arg Trp Gln Ala Arg Ile Gly Arg Val
Ala Gly Asn Lys385 390 395
400Asp Leu Tyr Leu Gly Thr Phe Ser Thr Gln Glu Glu Ala Ala Glu Ala
405 410 415Tyr Asp Ile Ala Ala
Ile Lys Phe Arg Gly Leu Asn Ala Val Thr Asn 420
425 430Phe Asp Met Ser Arg Tyr Asp Val Lys Ser Ile Leu
Asp Ser Ser Ala 435 440 445Leu Pro
Ile Gly Ser Ala Ala Lys Arg Leu Lys Glu Ala Glu Ala Ala 450
455 460Ala Ser Ala Gln His His His Ala Gly Val Val
Ser Tyr Asp Val Gly465 470 475
480Arg Ile Ala Ser Gln Leu Gly Asp Gly Gly Ala Leu Ala Ala Ala Tyr
485 490 495Gly Ala His Tyr
His Gly Ala Ala Trp Pro Thr Ile Ala Phe Gln Pro 500
505 510Gly Ala Ala Ser Thr Gly Leu Tyr His Pro Tyr
Ala Gln Gln Pro Met 515 520 525Arg
Gly Gly Gly Trp Cys Lys Gln Glu Gln Asp His Ala Val Ile Ala 530
535 540Ala Ala His Ser Leu Gln Asp Leu His His
Leu Asn Leu Gly Ala Ala545 550 555
560Gly Ala His Asp Phe Phe Ser Ala Gly Gln Gln Ala Ala Ala Ala
Ala 565 570 575Met His Gly
Leu Gly Ser Ile Asp Ser Ala Ser Leu Glu His Ser Thr 580
585 590Gly Ser Asn Ser Val Val Tyr Asn Gly Gly
Val Gly Asp Ser Asn Gly 595 600
605Ala Ser Ala Val Gly Gly Ser Gly Gly Gly Tyr Met Met Pro Met Ser 610
615 620Ala Ala Gly Ala Thr Thr Thr Ser
Ala Met Val Ser His Glu Gln Val625 630
635 640His Ala Arg Ala Tyr Asp Glu Ala Lys Gln Ala Ala
Gln Met Gly Tyr 645 650
655Glu Ser Tyr Leu Val Asn Ala Glu Asn Asn Gly Gly Gly Arg Met Ser
660 665 670Ala Trp Gly Thr Val Val
Ser Ala Ala Ala Ala Ala Ala Ala Ser Ser 675 680
685Asn Asp Asn Met Ala Ala Asp Val Gly His Gly Gly Ala Gln
Leu Phe 690 695 700Ser Val Trp Asn Asp
Thr705 710