Patent application title: METHOD OF INCREASING RESISTANCE AGAINST SOYBEAN RUST IN TRANSGENIC PLANTS BY INCREASING THE SCOPOLETIN CONTENT

Inventors:
IPC8 Class: AC12N1582FI
USPC Class: 1 1
Class name:
Publication date: 2018-01-11
Patent application number: 20180010144



Abstract:

A method for increasing fungal resistance in a plant, a plant part, or a plant cell wherein the method comprises the step of increasing the production and/or accumulation of scopoletin and/or a derivative thereof in the plant, plant part, or plant cell in comparison to a wild type plant, wild type plant part, or wild type plant cell.

Claims:

1. A method for increasing fungal resistance in a plant, a plant part, or a plant cell wherein the method comprises the step of increasing the production and/or accumulation of scopoletin and/or a derivative thereof in the plant, plant part, or plant cell in comparison to a wild type plant, wild type plant part, or wild type plant cell.

2. The method according to claim 1, wherein the derivative of the scopoletin is scopolin.

3. The method for increasing fungal resistance in a plant, a plant part, or a plant cell or the method of claim 1, wherein the method comprises increasing the expression and/or biological activity of a F6H1 protein in the plant, plant part, or plant cell in comparison to a wild type plant, wild type plant part, or wild type plant cell, wherein said F6H1 protein is encoded by (i) an exogenous nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 1, or a functional fragment thereof or a splice variant thereof; (ii) an exogenous nucleic acid encoding a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 2, or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) an exogenous nucleic acid encoding the same F6H1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code.

4. The method according to claim 3, wherein the method further comprises increasing the expression and/or biological activity of at least one or more additional protein(s) selected from the group consisting of a CCoAOMT1 protein, an ABCG37 protein and a UGT71C1 protein in the plant, plant part, or plant cell in comparison to a wild type plant, wild type plant part, or wild type plant cell, (a) wherein said CCoAOMT1 protein is encoded by (i) an exogenous nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 3, or a functional fragment thereof, or a splice variant thereof; (ii) an exogenous nucleic acid encoding a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 4, or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) an exogenous nucleic acid encoding the same CCoAOMT1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code, (b) wherein said ABCG37 protein is encoded by (i) an exogenous nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 5, or a functional fragment thereof, or a splice variant thereof; (ii) an exogenous nucleic acid encoding a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 6, or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) an exogenous nucleic acid encoding the same ABCG37 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code, and (c) wherein said UGT71C1 protein is encoded by (i) an exogenous nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 7, or a functional fragment thereof, or a splice variant thereof; (ii) an exogenous nucleic acid encoding a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 8, or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) an exogenous nucleic acid encoding the same UGT71C1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code.

5. The method according to claim 4, comprising the steps of (a) stably transforming a plant cell with expression cassette(s) comprising exogenous nucleic acids encoding an F6H1 protein and optionally encoding one or more additional protein(s) selected from the group consisting of a CCoAOMT1 protein, an ABCG37 protein and a UGT71C1 protein, (b) regenerating the plant from the plant cell; and (c) expressing said exogenous nucleic acids, wherein the exogenous nucleic acid encoding an F6H1 protein and the exogenous nucleic acid(s) encoding one or more additional protein(s) selected from the group consisting of CCoAOMT1 protein, ABCG37 protein and UGT71C1 protein are located on the same expression cassette or different expression cassettes.

6. A recombinant vector construct comprising one or more of the nucleic acids selected from the group consisting of: (a) a nucleic acid encoding F6H1 protein wherein said F6H1 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 1 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 2 or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) a nucleic acid encoding the same F6H1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code; operably linked with a promoter and a transcription termination sequence, (b) a nucleic acid encoding CCoAOMT1 protein wherein said CCoAOMT1 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 3 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 4 or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) a nucleic acid encoding the same CCoAOMT1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code; operably linked with a promoter and a transcription termination sequence, (c) a nucleic acid encoding ABCG37 protein wherein said ABCG37 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 5 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 6 or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) a nucleic acid encoding the same ABCG37 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code; operably linked with a promoter and a transcription termination sequence; and (d) a nucleic acid encoding UGT71C1 protein wherein said UGT71C1 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 7 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 8 or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) a nucleic acid encoding the same UGT71C1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code; operably linked with a promoter and a transcription termination sequence.

7. The recombinant vector construct according to claim 6, wherein the promoter is a constitutive promoter, a pathogen-inducible promoter, a mesophyll-specific promoter or an epidermis-specific promoter.

8. A transgenic plant, transgenic plant part, or transgenic plant cell transformed with one or more recombinant vector construct(s) according to claim 6, wherein the nucleic acid(s) encoding a F6H1 protein, a CCoAOMT1 protein, a ABCG37 protein and/or a UGT71C1 protein are located on the same recombinant vector construct or different vector constructs.

9. A transgenic plant, transgenic plant part, or transgenic plant cell overexpressing an exogenous F6H1 protein optionally in combination with one or more additional exogenous protein(s) selected from the group consisting of a CCoAOMT1 protein, an ABCG37 protein and a UGT71C1 protein, (a) wherein said F6H1 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 1 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 2 or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) a nucleic acid encoding the same F6H1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code; operably linked with a promoter and a transcription termination sequence, (b) wherein said CCoAOMT1 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 3 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 4 or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) a nucleic acid encoding the same CCoAOMT1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code; operably linked with a promoter and a transcription termination sequence, (c) wherein said ABCG37 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 5 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 6 or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) a nucleic acid encoding the same ABCG37 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code; operably linked with a promoter and a transcription termination sequence, and (d) wherein said UGT71C1 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 7 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 8 or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) a nucleic acid encoding the same UGT71C1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code; operably linked with a promoter and a transcription termination sequence.

10. A method for the production of a transgenic plant, transgenic plant part, or transgenic plant cell having increased fungal resistance, comprising (i) introducing an exogenous nucleic acid encoding the F6H1 protein optionally in combination with one or more exogenous nucleic acid(s) encoding the exogenous protein(s) selected from the group consisting of CCoAOMT1 protein, ABCG37 protein and UGT71C1 protein into a plant, a plant part, or a plant cell, (ii) generating a transgenic plant, transgenic plant part, or transgenic plant cell from the plant, plant part or plant cell; and (iii) expressing the protein(s) encoded by the recombinant vector construct(s), wherein the exogenous nucleic acid(s) encoding the F6H1 protein, the CCoAOMT1 protein, the ABCG37 protein and/or the UGT71C1 protein are located on the same or different vector constructs, (a) wherein said F6H1 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 1 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 2 or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) a nucleic acid encoding the same F6H1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code; operably linked with a promoter and a transcription termination sequence, (b) wherein said CCoAOMT1 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 3 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 4 or a functional fragment thereof; (iii) a nucleic acid capable of hybridizing under stringent conditions with any of the nucleic acids according to (i) or (ii) or a complementary sequence thereof; or (iv) a nucleic acid encoding the same CCoAOMT1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code; operably linked with a promoter and a transcription termination sequence, (c) wherein said ABCG37 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 5 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 6 or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) a nucleic acid encoding the same ABCG37 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code, and (d) wherein said UGT71C1 protein is encoded by (i) a nucleic acid having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 7 or a functional fragment thereof, or a splice variant thereof; (ii) a nucleic acid coding for a protein having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity with SEQ ID NO: 8 or a functional fragment thereof; (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); or (iv) a nucleic acid encoding the same UGT71C1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code.

11. The method of claim 10, further comprising the step of harvesting the seeds of the transgenic plant and planting the seeds and growing the seeds to plants, wherein the grown plants comprise the exogenous nucleic acid encoding the F6H1 protein, and optionally further comprise one or more exogenous nucleic acid(s) selected from the group consisting of exogenous nucleic acid(s) encoding F6H1 protein, CCoAOMT1 protein, ABCG37 protein and UGT71C1 protein.

12. (canceled)

13. A harvestable part of a transgenic plant described in claim 8, wherein the harvestable part of the transgenic plant comprises the exogenous nucleic acid encoding an F6H1 protein and/or the F6H1 protein, and optionally further comprises one or more additional exogenous nucleic acid(s) and/or the additional exogenous proteins themselves encoded by said additional exogenous nucleic acid(s), wherein said additional exogenous nucleic acid(s) are selected from the group consisting of exogenous nucleic acid(s) encoding F6H1 protein, CCoAOMT1 protein, ABCG37 protein and UGT71C1 protein and/or said additional exogenous proteins are selected from the group consisting of F6H1 protein, CCoAOMT1 protein, ABCG37 protein and UGT71C1 protein, wherein the harvestable part is preferably a transgenic seed of the transgenic plant.

14. A product derived from a plant described in claim 8, wherein the product comprises the exogenous nucleic acid encoding the F6H1 protein and/or the F6H1 protein, and optionally further comprises one or more exogenous nucleic acid(s) and/or the exogenous proteins encoded by said exogenous nucleic acid(s), wherein said exogenous nucleic acid(s) are selected from the group consisting of exogenous nucleic acid(s) encoding F6H1 protein, CCoAOMT1 protein, ABCG37 protein and UGT71C1 protein and/or said additional exogenous proteins are selected from the group consisting of F6H1 protein, CCoAOMT1 protein, ABCG37 protein and UGT71C1 protein, wherein the product is preferably soy oil.

15. A method for the production of a product comprising a) growing a plant of claim 8 and b) producing said product from or by the plant and/or part, preferably seeds, of the plant, wherein the product comprises the exogenous nucleic acid encoding the F6H1 protein and/or the F6H1 protein, and optionally further comprises one or more exogenous nucleic acid(s) and/or the exogenous proteins encoded by said exogenous nucleic acid(s), wherein said exogenous nucleic acid(s) are selected from the group consisting of exogenous nucleic acid(s) encoding F6H1 protein, CCoAOMT1 protein, ABCG37 protein and UGT71C1 protein and/or said additional exogenous proteins are selected from the group consisting of F6H1 protein, CCoAOMT1 protein, ABCG37 protein and UGT71C1 protein.

16. The method according to claim 15, comprising a) growing the plant and removing the harvestable parts from the plant; and b) producing said product from or by the harvestable parts of the plant.

17. The method according to claim 1, wherein the fungal resistance is resistance against rust fungus, downy mildew, powdery mildew, leaf spot, late blight, Fusarium and/or Septoria.

18. The method according to claim 17, wherein the fungal resistance is a resistance against soybean rust and/or Fusarium.

19. The method according to claim 18, wherein the fungal resistance is against Phakopsora meibomiae, Phakopsora pachyrhizi, Fusarium graminearum and/or Fusarium verticillioides.

20. The method according to claim 1, wherein the plant is selected from the group consisting of beans, soy, pea, clover, kudzu, lucerne, lentils, lupins, vetches, groundnut, rice, wheat, barley, Arabidopsis, banana, canola, cotton, potato, corn, sugar cane, alfalfa, and sugar beet, preferably wherein the plant is soy or corn.

21. A method for breeding a fungal resistant plant comprising (i) crossing the plant of claim 8 with a second plant; (ii) obtaining seed from the cross of step (i); (iii) planting said seeds and growing the seeds to plants; and (iv) selecting from said plants those expressing F6H1 protein and optionally expressing one or more additional protein(s) selected from the group consisting of CCoAOMT1 protein, ABCG37 protein and UGT71C1 protein.

22. A method for applying scopoletin and/or a derivative thereof to a surface of a plant, plant part or plant cell, wherein the resistance to a fungal pathogen of the plant, plant part or plant cell is increased by applying scopoletin and/or a derivative thereof to the surface of the plant, plant part or plant cell in comparison to a plant, plant part or plant cell to the surface of which scopoletin and/or a derivative thereof has not been applied, wherein the plant is soy and/or corn.

23. (canceled)

24. A plant, plant part or plant cell having a surface coated with scopoletin and/or a derivative thereof, wherein the plant is soy and/or corn.

Description:

SUMMARY OF THE INVENTION

[0001] The present invention relates to a method of increasing resistance against fungal pathogens, in particular soybean rust and/or Fusarium graminearum and/or Fusarium verticillioides, in plants, plant parts, and/or plant cells. This is achieved by increasing the content of scopoletin and/or a derivative thereof, in particular by increasing the expression of F6H1 in a plant, plant part and/or plant cell. This can also be achieved by application of a formulation or solution containing scopoletin and/or a derivative thereof.

[0002] Furthermore, the invention relates to recombinant expression vector constructs comprising a sequence that is identical or homologous to a sequence encoding F6H1 protein.

BACKGROUND OF THE INVENTION

[0003] The cultivation of agricultural crop plants serves mainly for the production of foodstuffs for humans and animals. Monocultures in particular, which are the rule nowadays, are highly susceptible to an epidemic-like spreading of diseases. The result is markedly reduced yields. To date, the pathogenic organisms have been controlled mainly by using pesticides. Nowadays, the possibility of directly modifying the genetic disposition of a plant or pathogen is also open to man. Alternatively, naturally occurring fungicides produced by the plants after fungal infection can be synthesized and applied to the plants.

[0004] Resistance generally describes the ability of a plant to prevent, or at least curtail, the infestation and colonization by a harmful pathogen. Different mechanisms can be discerned in the naturally occurring resistance, with which the plants fend off colonization by phytopathogenic organisms (Schopfer and Brennicke (1999) Pflanzenphysiologie, Springer Verlag, Berlin-Heidelberg, Germany).

[0005] With regard to race-specific resistance, also called host resistance, a differentiation is made between compatible and incompatible interactions. In the compatible interaction, an interaction occurs between a virulent pathogen and a susceptible plant. The pathogen survives, and may build up reproduction structures, while the host is seriously hampered in development or dies off. An incompatible interaction occurs, on the other hand, when the pathogen infects the plant but is inhibited in its growth before or after weak development of symptoms (mostly by the presence of R genes of the NBS-LRR family, see below). In the latter case, the plant is resistant to the respective pathogen (Schopfer and Brennicke, vide supra). However, this type of resistance is mostly specific for a certain strain or pathogen.

[0006] In both compatible and incompatible interactions a defensive and specific reaction of the host to the pathogen occurs. In nature, however, this resistance is often overcome because of the rapid evolutionary development of new virulent races of the pathogens (Neu et al. (2003) American Phytopathol. Society, MPMI 16 No. 7: 626-633).

[0007] Most pathogens are plant-species specific. This means that a pathogen can induce a disease in a certain plant species, but not in other plant species (Heath (2002) Can. J. Plant Pathol. 24: 259-264). The resistance against a pathogen in certain plant species is called non-host resistance. The non-host resistance offers strong, broad, and permanent protection from phytopathogens. Genes providing non-host resistance provide the opportunity of a strong, broad and permanent protection against certain diseases in non-host plants. In particular, such a resistance works for different strains of the pathogen.

[0008] Fungi are distributed worldwide. Approximately 100 000 different fungal species are known to date. Of these, rusts are of great importance. They can have a complicated development cycle with up to five different spore stages (spermatium, aecidiospore, uredospore, teleutospore and basidiospore).

[0009] During the infection of plants by pathogenic fungi, different phases are usually observed. The first phases of the interaction between phytopathogenic fungi and their potential host plants are decisive for the colonization of the plant by the fungus. During the first stage of the infection, the spores become attached to the surface of the plants, germinate, and the fungus penetrates the plant. Fungi may penetrate the plant via existing ports such as stomata, lenticels, hydathodes and wounds, or else they penetrate the plant epidermis directly as the result of mechanical force and with the aid of cell-wall-digesting enzymes. Specific infection structures are developed for penetration of the plant. To counteract this, plants have developed physical barriers, such as wax layers, and chemical compounds having antifungal effects to inhibit spore germination, hyphal growth or penetration.

[0010] The soybean rust Phakopsora pachyrhizi directly penetrates the plant epidermis. After crossing the epidermal cell, the fungus reaches the intercellular space of the mesophyll, where the fungus starts to spread through the leaves. To acquire nutrients the fungus penetrates mesophyll cells and develops haustoria inside the mesophyll cell. During the penetration process the plasma membrane of the penetrated mesophyll cell stays intact.

[0011] Fusarium species are important plant pathogens that attack a wide range of plant species including many important crops such as maize and wheat. They cause seed rots, seedling blights as well as root rots, stalk rots and ear rots. Pathogens of the genus Fusarium infect the plants via infected seeds, roots or silks or they penetrate the plant via wounds or natural openings and cracks. After a very short establishment phase the Fusarium fungi start to secrete mycotoxins such as trichothecenes, zearalenone and fusaric acid into the infected host tissues, leading to cell death and maceration of the infected tissue. Feeding on dead tissue, the fungus then starts to spread through the infected plant, leading to severe yield losses and decreases in quality of the harvested grain.

[0012] Biotrophic phytopathogenic fungi depend for their nutrition on the metabolism of living cells of the plants. This group includes many rust fungi, powdery mildew fungi and oomycete pathogens such as the genera Phytophthora and Peronospora. Necrotrophic phytopathogenic fungi depend for their nutrition on dead cells of the plants, e.g. species from the genera Fusarium, Rhizoctonia or Mycosphaerella. Soybean rust occupies an intermediate position, since it penetrates the epidermis directly, whereupon the penetrated cell becomes necrotic. After the penetration, the fungus changes over to an obligate biotrophic lifestyle. The subgroup of the biotrophic fungal pathogens which essentially follows such an infection strategy is heminecrotrophic.

[0013] Scopoletin and scopolin are antimicrobial phenolic hydroxycoumarins that accumulate in different plants upon infection with various pathogens such as fungi or bacteria or in response to insect feeding damage, mechanical injury, dehydration or various other abiotic stresses.

[0014] Scopoletin shows broad antimicrobial activity and can inhibit development and growth of various fungi or bacteria in vitro (Goy, P. A., Signer, H., Reist, R., Aichholz, R., Blum, W., Schmidt, E., and Kessmann, H. (1993). Accumulation of scopoletin is associated with the high disease resistance of the hybrid Nicotiana glutinosa × Nicotiana debneyi. Planta 41: 200-206; Tal, B. and Robeson, D. J. (1986b). The Metabolism of Sunflower Phytoalexins Ayapin and Scopoletin: Plant-Fungus Interactions. Plant Physiology 82: 167-172).

[0015] Scopoletin and its glucoside scopolin originate from the phenylpropanoid pathway (FIG. 1; Kai, K., Mizutani, M., Kawamura, N., Yamamoto, R., Tamai, M., Yamaguchi, H., Sakata, K., and Shimizu, B. (2008). Scopoletin is biosynthesized via ortho-hydroxylation of feruloyl CoA by a 2-oxoglutarate-dependent dioxygenase in Arabidopsis thaliana. Plant Journal 55: 989-99).

[0016] Key steps of scopoletin/scopolin synthesis comprise ortho-hydroxylation of feruloyl-CoA, trans/cis isomerization of the side chain, lactonization and, for scopolin synthesis, glycosylation (Kai et al., 2008). In Arabidopsis it has recently been shown that scopoletin production depends on ortho-hydroxylation of feruloyl-CoA by the Fe(II)- and 2-oxoglutarate-dependent dioxygenase F6H1 (At3g13610). E-Z isomerization of the side chain and lactonization were found to occur spontaneously (Kai et al., 2008).

[0017] Scopoletin accumulating in planta can finally be glucosylated to produce scopolin. Several Arabidopsis glucosyltransferases (e.g. UGT71C1) (Lim, E.-K., Baldauf, S., Li, Y., Elias, L., Worrall, D., Spencer, S. P., Jackson, R. G., Taguchi, G., Ross, J., and Bowles, D. J. (2003). Evolution of substrate recognition across a multigene family of glycosyltransferases in Arabidopsis. Glycobiology 13: 139-45) as well as two different tobacco glucosyltransferases (Togt1 and Togt2) (Fraissinet-Tachet, L., Baltz, R., Chong, J., Kauffmann, S., Fritig, B., and Saindrenan, P. (1998). Two tobacco genes induced by infection, elicitor and salicylic acid encode glucosyltransferases acting on phenylpropanoids and benzoic acid derivatives, including salicylic acid. FEBS Letters 437: 319-23) have been identified that can catalyze glycosylation of scopoletin in vitro.
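
The pathway described in paragraphs [0016] and [0017] can be summarized as an ordered list of steps, each either enzyme-catalyzed or spontaneous. The following Python sketch is purely illustrative and is not part of the application: the names `Step`, `PATHWAY` and `enzymes_for` are hypothetical, and UGT71C1 is used only as one example of the several glucosyltransferases mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    name: str
    catalyst: Optional[str]  # None means the step occurs spontaneously
    product: Optional[str]   # named end product of the step, if any


# Step order and catalysts follow the text (Kai et al., 2008; Lim et al., 2003).
# UGT71C1 is one example enzyme; other glucosyltransferases may act here too.
PATHWAY = (
    Step("ortho-hydroxylation of feruloyl-CoA", "F6H1", None),
    Step("E-Z isomerization of the side chain", None, None),
    Step("lactonization", None, "scopoletin"),
    Step("glucosylation of scopoletin", "UGT71C1", "scopolin"),
)


def enzymes_for(target: str) -> List[str]:
    """Return the enzymes needed, in pathway order, to reach `target`."""
    enzymes: List[str] = []
    for step in PATHWAY:
        if step.catalyst is not None:
            enzymes.append(step.catalyst)
        if step.product == target:
            return enzymes
    raise ValueError(f"unknown product: {target}")


if __name__ == "__main__":
    print(enzymes_for("scopoletin"))
    print(enzymes_for("scopolin"))
```

In this sketch, reaching scopoletin requires only F6H1 because the two following steps are spontaneous, which matches the emphasis the text places on F6H1 expression.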

[0018] Scopolin is generally regarded as a less potent antimicrobial agent than scopoletin. Following pathogen-induced mechanical injury or hypersensitive reactions (HR), decompartmentalization of scopolin-containing cells might lead to the release of scopolin from vacuoles into the cytoplasm and subsequent hydrolysis of the glucose conjugate by β-glucosidases.

[0019] Scopoletin and its glucoside scopolin are widely distributed across the plant kingdom and have been detected in various plant organs of approximately 80 different plant families. Interestingly, scopoletin biosynthesis seems to be absent in several economically important crops (e.g. Glycine max, Zea mays, Triticum aestivum, Oryza sativa), indicating that the ability to synthesize this antimicrobial substance might have been lost during breeding. However, this does not apply to sweet potato, tobacco, sunflower, cotton or cassava, since scopoletin has been shown to accumulate in these crops in response to infection (summarized by Gnonlonfin, G. J. B., Sanni, A., and Brimer, L. (2012). Review Scopoletin--A Coumarin Phytoalexin with Medicinal Properties. Critical Reviews in Plant Sciences 31: 47-56).

[0020] Soybean rust has become increasingly important in recent times. The disease may be caused by the biotrophic rusts Phakopsora pachyrhizi (Sydow) and Phakopsora meibomiae (Arthur). They belong to the class Basidiomycota, order Uredinales, family Phakopsoraceae. Both rusts infect a wide spectrum of leguminous host plants. P. pachyrhizi, also referred to as Asian rust, is the more aggressive pathogen on soy (Glycine max), and is therefore, at least currently, of great importance for agriculture. P. pachyrhizi can be found in nearly all tropical and subtropical soy growing regions of the world. P. pachyrhizi is capable of infecting 31 species from 17 families of the Leguminosae under natural conditions and is capable of growing on a further 60 species under controlled conditions (Sinclair et al. (eds.), Proceedings of the rust workshop (1995), National Soya Research Laboratory, Publication No. 1 (1996); Rytter J. L. et al., Plant Dis. 87, 818 (1984)). P. meibomiae has been found in the Caribbean Basin and in Puerto Rico, and has not caused substantial damage as yet.

[0021] P. pachyrhizi can currently be controlled in the field only by means of fungicides. Soy plants with resistance to the entire spectrum of the isolates are not available. When searching for resistant soybean accessions, six dominant R-genes of the NBS-LRR family, which mediate resistance of soy to P. pachyrhizi, were discovered. However, this resistance was lost rapidly, as P. pachyrhizi develops new virulent races.

[0022] Increasing resistance to Fusarium is one of the most important goals in maize breeding. Although maize shows great natural diversity in its interaction phenotypes with Fusarium species, resistance seems to be distributed over many weak QTLs with low heritability. Therefore, only little progress has been made in increasing resistance against Fusarium by breeding.

[0023] In recent years, fungal diseases, e.g. soybean rust and Fusarium graminearum, have gained in importance as pests in agricultural production. There was therefore a demand in the prior art for developing methods to control fungi and to provide fungal resistant plants.

[0024] Much research has been performed in the field of powdery and downy mildew infecting the epidermal layer of plants. However, the problem of coping with soybean rust, which infects the mesophyll, or with Fusarium fungi, which infect inaccessible inner tissues, remains unsolved.

[0025] The object of the present invention is inter alia to provide a method of increasing resistance against fungal pathogens, preferably against fungal pathogens of the family Phakopsoraceae, more preferably against fungal pathogens of the genus Phakopsora, most preferably against Phakopsora pachyrhizi (Sydow) and/or Phakopsora meibomiae (Arthur), also known as soybean rust.

[0026] A further object of the present invention is inter alia to provide a method of increasing resistance against fungal pathogens, preferably against fungal pathogens of the genus Fusarium, most preferably against Fusarium graminearum and/or Fusarium verticillioides.

[0027] Surprisingly, we found that fungal pathogens, in particular of the genus Phakopsora, for example soybean rust and/or of the genus Fusarium, for example Fusarium graminearum and/or Fusarium verticillioides, can be controlled by increased production or increased accumulation of scopoletin or derivatives thereof in a plant and by direct application of scopoletin or derivatives thereof to the plant.

[0028] Surprisingly, we found that fungal pathogens, in particular of the genus Phakopsora, for example soybean rust and of the genus Fusarium, for example Fusarium graminearum and/or Fusarium verticillioides, can be controlled by increased expression of the F6H1 protein, optionally in combination with one or more proteins selected from the group consisting of CCoAOMT1, ABCG37 and UGT71C1.

[0029] The present invention therefore provides a method of increasing resistance against fungal pathogens, preferably against fungal pathogens of the family Phakopsoraceae and/or Nectriaceae, more preferably against fungal pathogens of the genus Phakopsora and/or Fusarium, most preferably against Phakopsora pachyrhizi (Sydow), Phakopsora meibomiae (Arthur), Fusarium graminearum and/or Fusarium verticillioides in transgenic plants, plant parts, or transgenic plant cells by increasing the production and/or accumulation of scopoletin and/or derivatives thereof or by exogenous application of scopoletin and/or derivatives thereof to plants, plant parts, or plant cells.

[0030] A further object is to provide transgenic plants resistant against fungal pathogens, preferably of the family Phakopsoraceae and/or Nectriaceae, more preferably against fungal pathogens of the genus Phakopsora and/or Fusarium, most preferably against Phakopsora pachyrhizi (Sydow), Phakopsora meibomiae (Arthur), Fusarium graminearum and/or Fusarium verticillioides, a method for producing such plants as well as a recombinant vector construct useful for the above methods.

[0031] The present invention also refers to a recombinant vector construct and a transgenic plant, plant part, or plant cell comprising exogenous nucleic acids or fragment thereof which lead to enhanced production of scopoletin and/or derivatives thereof. Furthermore, a method for the production of a transgenic plant, plant part or plant cell using the nucleic acids of the present invention is claimed herein. In addition, the use of a nucleic acid or the recombinant vector of the present invention for the transformation of a plant, plant part, or plant cell is claimed herein.

[0032] The present invention also refers to a method for applying scopoletin and/or derivatives thereof to a surface of a plant, plant part or plant cell, as well as to a plant surface or plant part surface coated with scopoletin and/or derivatives thereof.

[0033] The objects of the present invention, as outlined above, are achieved by the subject-matter of the main claims. Preferred embodiments of the invention are defined by the subject matter of the dependent claims.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

Figures

[0034] FIG. 1 shows the key steps of the scopoletin and scopolin synthesis in Arabidopsis thaliana (as proposed by Kai et al, Plant J. 2008 September; 55(6):989-99)

[0035] (a) 3-O-methylation of the caffeate unit occurs mainly via CCoAOMT1 using caffeoyl CoA. Ortho-hydroxylation of feruloyl CoA is catalyzed by F6H1, followed by trans/cis isomerization of the side chain and lactonization to form scopoletin. C3H, p-coumarate 3-hydroxylase; 4CL, 4-coumarate:CoA ligase.

[0036] (b) An ionic mechanism of trans/cis isomerization of the side chain and lactonization is proposed for the thioester.

[0037] FIG. 2a shows the schematic illustration of plant transformation vectors harboring p35S::F6H1 (At3g13610) for transient production of scopoletin in N. benthamiana leaves (see example 3).

[0038] FIG. 2b shows the schematic illustration of plant transformation vectors harboring P35S::FLAG-tag:F6H1 (At3g13610) for expression of FLAG-tagged F6H1 used for transient production of scopoletin in N. benthamiana leaves (see example 3)

[0039] FIG. 2c shows the schematic illustration of plant transformation vectors harboring pUbi: F6H1 (At3g13610) for stable production of scopoletin in soybean plants (see examples 6-10).

[0040] FIG. 3 shows the schematic illustration of plant transformation vectors harboring pUbi F6H1 (AT3G13610)+pSUPER CCoAOMT1 (At4g34050) for stable production of scopoletin in soybean plants (see examples 7-11).

[0041] FIG. 4 shows the schematic illustration of plant transformation vectors harboring pUbi F6H1 (AT3G13610)+pSUPER CCoAOMT1 (At4g34050)+pGlyma14g06680 ABCG37 (PDR9; AT3G53480) for stable production of scopoletin in soybean plants (see examples 7-11).

[0042] FIG. 5 shows the schematic illustration of plant transformation vectors harboring pUbi F6H1 (AT3G13610)+pSUPER UGT71C1 (At2g29750) for stable production of scopoletin in soybean plants (see examples 7-11).

[0043] FIG. 6 contains a brief description of the sequences of the sequence listing.

[0044] FIG. 7a shows the nucleotide sequence of the F6H1 (At3g13610) gene from Arabidopsis thaliana having SEQ ID No: 1.

[0045] FIG. 7b shows the protein sequence of the F6H1 (At3g13610) gene from Arabidopsis thaliana having SEQ ID No: 2.

[0046] FIG. 8a shows the nucleotide sequence of the CCoAOMT1 (At4g34050) gene from Arabidopsis thaliana having SEQ ID No: 3.

[0047] FIG. 8b shows the protein sequence of the CCoAOMT1 (At4g34050) gene from Arabidopsis thaliana having SEQ ID No: 4.

[0048] FIG. 9a shows the nucleotide sequence of the ABCG37 (PDR9; AT3G53480) gene from Arabidopsis thaliana having SEQ ID No: 5.

[0049] FIG. 9b shows the protein sequence of the ABCG37 (PDR9; AT3G53480) gene from Arabidopsis thaliana having SEQ ID No: 6.

[0050] FIG. 10a shows the nucleotide sequence of the UGT71C1 gene from Arabidopsis thaliana having SEQ ID No:7.

[0051] FIG. 10b shows the protein sequence of the UGT71C1 gene from Arabidopsis thaliana having SEQ ID No: 8.

[0052] FIG. 11 shows the scoring system used to determine the level of diseased leaf area of wildtype and transgenic soy plants against the rust fungus P. pachyrhizi (as described in Godoy, C. V., Koga, L. J. & Canteri, M. G. Diagrammatic scale for assessment of soybean rust severity. Fitopatologia Brasileira 31:063-068. 2006).

[0053] FIG. 12a shows the production of scopoletin in transiently transformed N. benthamiana leaves by overexpression of F6H1. Leaves of Nicotiana benthamiana were transiently transformed by infiltration with Agrobacterium tumefaciens AGL01 harboring one of the plasmids shown in FIGS. 2a and 2b (see example 3). Scopoletin produced by transiently transformed N. benthamiana leaves was identified and quantified by HPLC as described in example 3b.

[0054] Untransformed (wildtype) N. benthamiana is not able to produce scopoletin. Transient expression of the F6H1 enzyme, either as the original sequence (F6H1; FIG. 2a) or FLAG-tagged (Omega-F6H1-FLAG; FIG. 2b), leads to the production and accumulation of scopoletin in leaves of N. benthamiana, independent of the construct used.

[0055] FIG. 12b shows the enhancement of the production of scopoletin and scopolin in transiently transformed N. benthamiana leaves by co-overexpression of F6H1 and CCoAOMT1. Leaves of Nicotiana benthamiana were transiently transformed by infiltration with Agrobacterium tumefaciens harboring plasmids containing the F6H1 gene or the F6H1 gene and the CCoAOMT1 gene (see FIG. 2b and example 3). Untransformed (wildtype) N. benthamiana is not able to produce scopoletin. Transient expression of the F6H1 enzyme (Omega-F6H1-FLAG) leads to the production and accumulation of scopoletin. Transient co-overexpression of the F6H1 enzyme in combination with CCoAOMT1 (Omega-F6H1-FLAG+CCoAOMT1) leads to an enhanced production and accumulation of scopoletin in comparison to F6H1 alone, as visible in a larger peak area in the HPLC chromatogram. This result shows that scopoletin accumulation in F6H1-expressing leaves could be enhanced by coexpression of CCoAOMT1.

[0056] FIG. 13 Scopoletin inhibits the germination of ASR (Asian soy rust) spores in vivo. Leaves of Arabidopsis Col-0 wildtype plants were treated with 1 mM scopoletin either 6 h before inoculation (bi) with P. pachyrhizi (striped bar) or in parallel with the inoculation with P. pachyrhizi (black bar); plants not treated with scopoletin are shown as a light grey bar. Germination of ASR spores was assessed microscopically 48 hours after infection (see example 6.1). Quantitative microscopic analysis showed that the germination of spores of Phakopsora pachyrhizi is strongly inhibited by the presence of 1 mM scopoletin on the leaves of Arabidopsis thaliana, independent of the application method (co-application or pre-treatment).

[0057] FIG. 14a Scopoletin inhibits the germination of ASR spores in vitro.

[0058] Spores of Phakopsora pachyrhizi were germinated on glass slides in water containing 0 (grey dotted bar), 10 µM (vertically striped bar), 100 µM (diagonally striped bar), 500 µM (horizontally striped bar) and 1 mM (black bar) scopoletin. Morphological status of spores was determined microscopically 6 h after inoculation (see example 5a). Quantitative microscopic analysis showed that the germination and appressorium formation of Phakopsora pachyrhizi is strongly inhibited by the presence of scopoletin in a dose-dependent manner.

[0059] FIG. 14b Scopoletin reduces soybean rust disease symptoms in planta.

[0060] Leaves of soybean plants were treated with 10 µM, 100 µM or 1 mM scopoletin in parallel with the inoculation with P. pachyrhizi (co-application). Plants not treated with scopoletin are marked as control (see example 6.2). The diseased leaf area was assessed according to FIG. 11 and as described in example 10.

[0061] Quantitative analysis of the ratio of the infected leaf area showed that the diseased leaf area caused by Phakopsora pachyrhizi infection is strongly reduced in a dose dependent manner by co-application of scopoletin.

[0062] FIG. 14c Scopoletin reduces soybean rust disease symptoms in planta.

[0063] Primary leaves (grey dotted bars), first trifoliate leaves (vertically striped bars) and second trifoliate leaves (diagonally striped bars) of soy plants were treated with 1 mM scopoletin either 6 h before inoculation with P. pachyrhizi (pre-treatment) or in parallel with the inoculation with P. pachyrhizi (co-application). Plants not treated with scopoletin are marked "ASR-only" (see example 6.2). The diseased leaf area was assessed according to FIG. 11 and as described in example 10.

[0064] Quantitative analysis of the ratio of the infected leaf area showed that the diseased leaf area caused by Phakopsora pachyrhizi infection is strongly reduced by either pre-treatment or co-application of 1 mM scopoletin on primary leaves (grey dotted bars), first trifoliate leaves (vertically striped bars) and second trifoliate leaves (diagonally striped bars).

[0065] FIG. 15 shows the impact of scopoletin on the growth of Fusarium graminearum in vitro. Fusarium graminearum was grown on PDA plates containing either 1 mM scopoletin (dissolved in methanol) or methanol alone as control. The growth rate of Fusarium graminearum in mm/day was determined microscopically (see example 5b).

[0066] The presence of 1 mM scopoletin in the agar leads to a reduction of the Fusarium graminearum growth rate per day by 61% in comparison to Fusarium graminearum grown on PDA+methanol, indicating that scopoletin is also toxic against Fusarium graminearum.

[0067] FIG. 16 shows soybean leaves expressing F6H1 enzyme in comparison to wildtype control. Expression of F6H1 enzyme leads to accumulation of the antifungal molecule scopoletin, as visible by fluorescence under UV light. Fluorescence was elicited with a B-100AP UV lamp (UVP LLC, Upland, Calif.) using 365 nm longwave UV.

[0068] FIG. 17 shows the result of the scoring of 25 transgenic soy plants (derived from 5 independent events) accumulating Scopoletin by overexpression of F6H1 enzyme (construct see FIG. 2c) compared with wildtype plants.

DETAILED DESCRIPTION OF THE INVENTION

[0069] The present invention may be understood more readily by reference to the following detailed description of the preferred embodiments of the invention and the examples included herein.

Definitions

[0070] Unless otherwise noted, the terms used herein are to be understood according to conventional usage by those of ordinary skill in the relevant art. In addition to the definitions of terms provided herein, definitions of common terms in molecular biology may also be found in Rieger et al., 1991 Glossary of genetics: classical and molecular, 5th Ed., Berlin: Springer-Verlag; and in Current Protocols in Molecular Biology, F. M. Ausubel et al., Eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1998 Supplement).

[0071] It is to be understood that as used in the specification and in the claims, "a" or "an" can mean one or more, depending upon the context in which it is used. Thus, for example, reference to "a cell" can mean that at least one cell can be utilized. It is to be understood that the terminology used herein is for the purpose of describing specific embodiments only and is not intended to be limiting.

[0072] Throughout this application, various publications are referenced. The disclosures of all of these publications and those references cited within those publications in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art to which this invention pertains. Standard techniques for cloning, DNA isolation, amplification and purification, for enzymatic reactions involving DNA ligase, DNA polymerase, restriction endonucleases and the like, and various separation techniques are those known and commonly employed by those skilled in the art. A number of standard techniques are described in Sambrook et al., 1989 Molecular Cloning, Second Edition, Cold Spring Harbor Laboratory, Plainview, N.Y.; Maniatis et al., 1982 Molecular Cloning, Cold Spring Harbor Laboratory, Plainview, N.Y.; Wu (Ed.) 1993 Meth. Enzymol. 218, Part I; Wu (Ed.) 1979 Meth Enzymol. 68; Wu et al., (Eds.) 1983 Meth. Enzymol. 100 and 101; Grossman and Moldave (Eds.) 1980 Meth. Enzymol. 65; Miller (Ed.) 1972 Experiments in Molecular Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; Old and Primrose, 1981 Principles of Gene Manipulation, University of California Press, Berkeley; Schleif and Wensink, 1982 Practical Methods in Molecular Biology; Glover (Ed.) 1985 DNA Cloning Vol. I and II, IRL Press, Oxford, UK; Hames and Higgins (Eds.) 1985 Nucleic Acid Hybridization, IRL Press, Oxford, UK; and Setlow and Hollaender 1979 Genetic Engineering: Principles and Methods, Vols. 1-4, Plenum Press, New York. Abbreviations and nomenclature, where employed, are deemed standard in the field and commonly used in professional journals such as those cited herein.

[0073] "Homologues" of a protein encompass peptides, oligopeptides, polypeptides, proteins and/or enzymes having amino acid substitutions, deletions and/or insertions relative to the unmodified protein in question and having the same, essentially the same or similar biological activity as the unmodified protein from which they are derived.

[0074] "Homologues" of a nucleic acid encompass nucleotides and/or polynucleotides having nucleic acid substitutions, deletions and/or insertions relative to the unmodified nucleic acid in question, wherein the protein coded by such nucleic acids has the same, essentially the same or similar biological activity as the unmodified protein coded by the unmodified nucleic acid from which they are derived. In particular, homologues of a nucleic acid may encompass substitutions on the basis of the degeneracy of the genetic code.

[0075] The terms "identity", "homology" and "similarity" are used herein interchangeably. "Identity" or "homology" or "similarity" between two nucleic acid sequences or amino acid sequences refers in each case to identity over at least 70%, at least 80% or at least 90% of the entire length of the respective F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acid sequence or the respective F6H1, CCoAOMT, ABCG37 and/or UGT71C1 amino acid sequence, preferably over the entire length of the respective F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acid sequence or the respective F6H1, CCoAOMT, ABCG37 and/or UGT71C1 amino acid sequence.

[0076] Preferably, "percentage of sequence identity" is calculated by comparing two optimally aligned sequences over a particular region, determining the number of positions at which the identical base or amino acid occurs in both sequences in order to yield the number of matched positions, dividing the number of such positions by the total number of positions in the region being compared and multiplying the result by 100.
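The calculation described above can be illustrated with a minimal Python sketch. The function name `percent_identity` and the example sequences are hypothetical and not part of the specification. The sketch assumes that the two sequences have already been optimally aligned (with "-" marking gaps) and that every aligned column, including gap columns, counts towards the total number of positions in the compared region.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over an already aligned region.

    Assumption (not from the specification): gap columns ('-') count as
    compared positions but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned region is empty")
    # Number of positions at which the identical base or amino acid occurs.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    # Divide by the total number of positions compared and multiply by 100.
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical aligned fragments: 7 matching columns out of 9.
    print(round(percent_identity("ACGT-ACGT", "ACGTTACCT"), 1))
```

Whether gap columns are included in the denominator differs between tools, so values obtained with this sketch need not agree exactly with those reported by the programs named in this specification.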

[0077] Methods for the alignment of sequences for comparison are well known in the art; such methods include GAP, BESTFIT, BLAST, FASTA and TFASTA. GAP uses the algorithm of Needleman and Wunsch ((1970) J Mol Biol 48: 443-453) to find the global (i.e. spanning the complete sequences) alignment of two sequences that maximizes the number of matches and minimizes the number of gaps. The BLAST algorithm (Altschul et al. (1990) J Mol Biol 215: 403-10) calculates percent sequence identity or similarity or homology and performs a statistical analysis of the identity or similarity or homology between the two sequences. The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information (NCBI). Homologues may readily be identified using, for example, the ClustalW multiple sequence alignment algorithm (version 1.83), with the default pairwise alignment parameters, and a scoring method in percentage. Global percentages of similarity/homology/identity may also be determined using one of the methods available in the MatGAT software package (Campanella et al., BMC Bioinformatics. 2003 Jul. 10; 4:29. MatGAT: an application that generates similarity/homology/identity matrices using protein or DNA sequences.). Minor manual editing may be performed to optimise alignment between conserved motifs, as would be apparent to a person skilled in the art. Furthermore, instead of using full-length sequences for the identification of homologues, specific domains may also be used. The sequence identity values may be determined over the entire nucleic acid or amino acid sequence or over selected domains or conserved motif(s), using the programs mentioned above with the default parameters. For local alignments, the Smith-Waterman algorithm is particularly useful (Smith T F, Waterman M S (1981) J. Mol. Biol 147(1): 195-7).
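For illustration only, the global alignment principle of Needleman and Wunsch mentioned above may be sketched in Python as follows. This is not the GAP program itself: the function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for readability and do not correspond to the default parameters of any of the programs named above.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Global alignment by dynamic programming (illustrative scoring only)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align two residues
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    top, bottom, total = needleman_wunsch("ACGT", "AGT")
    print(top)
    print(bottom)
    print(total)
```

The two returned strings have equal length and span both complete input sequences, which is what distinguishes a global alignment from the local alignments produced by the Smith-Waterman algorithm.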

[0078] The sequence identity may also be calculated by means of the Vector NTI Suite 7.1 program of the company Informax (USA) employing the Clustal Method (Higgins D G, Sharp P M. Fast and sensitive multiple sequence alignments on a microcomputer. Comput Appl. Biosci. 1989 April; 5(2):151-1) with the following settings:

Multiple Alignment Parameter:

[0079] Gap opening penalty 10

Gap extension penalty 10

Gap separation penalty range 8

Gap separation penalty off

% identity for alignment delay 40

Residue specific gaps off

Hydrophilic residue gap off

Transition weighing 0

Pairwise Alignment Parameter:

[0080] FAST algorithm on

K-tuple size 1

Gap penalty 3

Window size 5

Number of best diagonals 5

[0081] Alternatively, the identity may be determined according to Chenna, Ramu, Sugawara, Hideaki, Koike, Tadashi, Lopez, Rodrigo, Gibson, Toby J, Higgins, Desmond G, Thompson, Julie D. Multiple sequence alignment with the Clustal series of programs. (2003) Nucleic Acids Res 31 (13):3497-500, the web page: http://www.ebi.ac.uk/Tools/clustalw/index.html# and the following settings:

DNA Gap Open Penalty 15.0

DNA Gap Extension Penalty 6.66

DNA Matrix Identity

Protein Gap Open Penalty 10.0

Protein Gap Extension Penalty 0.2

[0082] Protein matrix Gonnet

Protein/DNA ENDGAP -1

Protein/DNA GAPDIST 4

[0083] Sequence identity between the nucleic acid or protein useful according to the present invention and the F6H1, CCoAOMT, ABCG37 and UGT71C1 nucleic acids and the F6H1, CCoAOMT, ABCG37 and UGT71C1 proteins may be optimized by sequence comparison and alignment algorithms known in the art (see Gribskov and Devereux, Sequence Analysis Primer, Stockton Press, 1991, and references cited therein) and calculating the percent difference between the nucleotide or protein sequences by, for example, the Smith-Waterman algorithm as implemented in the BESTFIT software program using default parameters (e.g., University of Wisconsin Genetic Computing Group).

[0084] A "deletion" refers to removal of one or more amino acids from a protein or to the removal of one or more nucleic acids from DNA, ssRNA and/or dsRNA.

[0085] An "insertion" refers to one or more amino acid residues or nucleic acid residues being introduced into a predetermined site in a protein or the nucleic acid.

[0086] A "substitution" refers to replacement of amino acids of the protein with other amino acids having similar properties (such as similar hydrophobicity, hydrophilicity, antigenicity, propensity to form or break α-helical structures or β-sheet structures).

[0087] On the nucleic acid level, a substitution refers to a replacement of one or more nucleotides with other nucleotides within a nucleic acid, wherein the protein coded by the modified nucleic acid has essentially the same or a similar function. In particular, homologues of a nucleic acid encompass substitutions on the basis of the degeneracy of the genetic code.

[0088] Amino acid substitutions are typically of single residues, but may be clustered depending upon functional constraints placed upon the protein and may range from 1 to 10 amino acids; insertions or deletions will usually be of the order of about 1 to 10 amino acid residues. The amino acid substitutions are preferably conservative amino acid substitutions. Conservative substitution tables are well known in the art (see for example Taylor W. R. (1986) The classification of amino acid conservation J Theor Biol., 119:205-18 and Table 1 below).

TABLE 1 Examples of conserved amino acid substitutions

Residue   Conservative Substitutions   Residue   Conservative Substitutions
A         G, V, I, L, M                L         M, I, V, A, G
C         S, T                         N         Q
E         D                            Q         N
D         E                            P         (none listed)
G         A, V, I, L, M                S         T, C
F         Y, W                         R         K, H
I         V, A, G, L, M                T         S, C
H         R, K                         W         Y, F
K         R, H                         V         I, A, G, L, M
M         L, I, V, A, G                Y         F, W
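As a minimal illustration, Table 1 can be expressed as a lookup structure in Python. The names `CONSERVATIVE` and `is_conservative` are hypothetical, and the entries reflect one reading of Table 1 in which no conservative substitution is listed for proline (P).

```python
# Table 1 as a dictionary: residue -> set of conservative substitutions.
# Assumption: the entry for P is empty, following the reading of Table 1
# in which no substitution is listed for proline.
_ALIPHATIC = "AGVILM"
CONSERVATIVE = {r: set(_ALIPHATIC) - {r} for r in _ALIPHATIC}
CONSERVATIVE.update({
    "C": {"S", "T"}, "S": {"T", "C"}, "T": {"S", "C"},
    "N": {"Q"}, "Q": {"N"},
    "E": {"D"}, "D": {"E"},
    "F": {"Y", "W"}, "W": {"Y", "F"}, "Y": {"F", "W"},
    "R": {"K", "H"}, "H": {"R", "K"}, "K": {"R", "H"},
    "P": set(),
})


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the exchange is listed as conservative in Table 1."""
    return replacement.upper() in CONSERVATIVE[original.upper()]


if __name__ == "__main__":
    print(is_conservative("A", "G"))  # alanine to glycine
    print(is_conservative("K", "D"))  # lysine to aspartate
```

Such a lookup could be used, for example, to classify each difference between a homologue and the reference protein as conservative or non-conservative.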

[0089] Amino acid substitutions, deletions and/or insertions may readily be made using peptide synthetic techniques well known in the art, such as solid phase peptide synthesis and the like, or by recombinant DNA manipulation.

[0090] Methods for the manipulation of DNA sequences to produce substitution, insertion or deletion variants of a protein are well known in the art. For example, techniques for making substitution mutations at predetermined sites in DNA are well known to those skilled in the art and include M13 mutagenesis, T7-Gene in vitro mutagenesis (USB, Cleveland, Ohio), QuikChange Site Directed mutagenesis (Stratagene, San Diego, Calif.), PCR-mediated site-directed mutagenesis or other site-directed mutagenesis protocols.

[0091] The terms "encode" or "coding for" are used for the capability of a nucleic acid to contain the information for the amino acid sequence of a protein via the genetic code, i.e., the succession of codons, each being a sequence of three nucleotides, which specify which amino acid will be added next during protein synthesis. The terms "encode" or "coding for" therefore include all possible reading frames of a nucleic acid. Furthermore, the terms "encode" or "coding for" also apply to a nucleic acid whose coding sequence is interrupted by non-coding nucleic acid sequences, which are removed prior to translation, e.g., a nucleic acid sequence comprising introns.
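The relationship between codons, reading frames and the degeneracy of the genetic code can be illustrated with a short Python sketch. The function name `translate` and the example sequences are hypothetical. The sketch assumes the standard genetic code, an intron-free DNA coding strand, and translation that stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order;
# '*' marks a stop codon.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 0) -> str:
    """Translate a DNA coding strand in reading frame 0, 1 or 2."""
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # Two hypothetical nucleic acids differing in one nucleotide (GCT vs GCC)
    # encode the same peptide because the genetic code is degenerate.
    print(translate("ATGGCTTAA"))
    print(translate("ATGGCCTAA"))
    # The same nucleic acid read in a different frame yields another peptide.
    print(translate("ATGGCTTAA", frame=1))
```

The first two calls give identical peptides although the nucleic acids differ, which is the situation addressed by the references to the degeneracy of the genetic code in this specification.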

[0092] The term "domain" refers to a set of amino acids conserved at specific positions along an alignment of sequences of evolutionarily related proteins. While amino acids at other positions can vary between homologues, amino acids that are highly conserved at specific positions indicate amino acids that are likely essential in the structure, stability or function of a protein.

[0093] Specialist databases exist for the identification of domains, for example, SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864; Letunic et al. (2002) Nucleic Acids Res 30, 242-244), InterPro (Mulder et al., (2003) Nucl. Acids. Res. 31, 315-318), Prosite (Bucher and Bairoch (1994), A generalized profile syntax for biomolecular sequences motifs and its function in automatic sequence interpretation. (In) ISMB-94; Proceedings 2nd International Conference on Intelligent Systems for Molecular Biology. Altman R., Brutlag D., Karp P., Lathrop R., Searls D., Eds., pp 53-61, AAAI Press, Menlo Park; Hulo et al., Nucl. Acids. Res. 32:D134-D137, (2004)), or Pfam (Bateman et al., Nucleic Acids Research 30(1): 276-280 (2002)). A set of tools for in silico analysis of protein sequences is available on the ExPASy proteomics server (Swiss Institute of Bioinformatics; Gasteiger et al., ExPASy: the proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res. 31:3784-3788 (2003)). Domains or motifs may also be identified using routine techniques, such as by sequence alignment.

[0094] The nucleic acids according to the present invention may comprise domains as defined herein below when analysed with the software tool InterProScan (version 4.8; see Zdobnov E. M. and Apweiler R.; "InterProScan--an integration platform for the signature-recognition methods in InterPro."; Bioinformatics, 2001, 17(9): 847-8; InterPro database, release 42 (Apr. 4, 2013)).

[0095] As used herein the terms "fungal-resistance", "resistant to a fungus" and/or "fungal-resistant" mean reducing, preventing, or delaying an infection by fungi. Preferably fungal resistance is soybean rust-resistance and/or fusarium-resistance. The term "resistance" refers to fungal resistance. Resistance does not imply that the plant necessarily has 100% resistance to infection. In preferred embodiments, enhancing or increasing fungal resistance means that resistance in a resistant plant is greater than 10%, greater than 20%, greater than 30%, greater than 40%, greater than 50%, greater than 60%, greater than 70%, greater than 80%, greater than 90%, or greater than 95% in comparison to a wild type plant. Preferably the wild type plant is a plant of a similar, more preferably identical, genotype as the plant having increased resistance to fungi, in particular soy-rust and/or fusarium, but does not comprise an exogenous F6H1 nucleic acid optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT, ABCG37 and UGT71C1 nucleic acids. Preferably, the wildtype plant is not capable of producing more than 10 µM scopoletin and/or a derivative thereof, more preferably more than 5 µM scopoletin and/or a derivative thereof, most preferably the wildtype plant is not capable of producing scopoletin and/or a derivative thereof.

[0096] As used herein the terms "soybean rust-resistance", "resistant to a soybean rust", "soybean rust-resistant", "rust-resistance", "resistant to a rust", or "rust-resistant" mean reducing or preventing or delaying an infection of a plant, plant part, or plant cell by Phakopsoraceae, in particular Phakopsora, more particularly soybean rust or Asian Soybean Rust (ASR), more particularly Phakopsora pachyrhizi, Phakopsora meibomiae and/or Fusarium solani, as compared to a wild type plant, wild type plant part, or wild type plant cell.

[0097] As used herein the terms "fusarium-resistance", "resistant to a fusarium", or "fusarium-resistant" mean reducing or preventing or delaying an infection of a plant, plant part, or plant cell by Fusarium, in particular Fusarium graminearum, Fusarium sporotrichioides, Fusarium pseudograminearum, Fusarium culmorum, Fusarium poae, Fusarium verticillioides (Fusarium moniliforme), Fusarium subglutinans, Fusarium proliferatum, Fusarium fujikuroi, Fusarium avenaceum, Fusarium oxysporum, Fusarium virguliforme and/or Fusarium solani as compared to a wild type plant, wild type plant part, or wild type plant cell.

[0098] The level of fungal resistance of a plant can be determined in various ways, e.g. by scoring/measuring the infected leaf area or three-dimensional space in relation to the overall area or three-dimensional space. Another possibility to determine the level of resistance is to count the number of fusarium colonies on the plant or to measure the amount of spores produced by these colonies. Another way to resolve the degree of fungal infestation is to specifically measure the amount of fungal DNA by quantitative (q) PCR. Specific probes and primer sequences for most fungal pathogens are available in the literature (Frederick R D, Snyder C L, Peterson G L, et al. 2002 Polymerase chain reaction assays for the detection and discrimination of the rust pathogens Phakopsora pachyrhizi and P. meibomiae, Phytopathology 92(2) 217-227; Nicolaisen M, Suproniene S, Nielsen L K, Lazzaro I, Spliid N H, Justesen A F. 2009 Real-time PCR for quantification of eleven individual Fusarium species in cereals. J Microbiol Methods 76(3): 234-40). Another way of evaluating fungal biomass is to biochemically determine the amount of fungal-specific compounds, such as ergosterol or chitin (L. M. Reid, R. W. Nicol, T. Ouellet, M. Savard, J. D. Miller, J. C. Young, D. W. Stewart, and A. W. Schaafsma (1999) Interaction of Fusarium graminearum and F. moniliforme in Maize Ears: Disease Progress, Fungal Biomass, and Mycotoxin Accumulation Phytopathology 89(11) 1028-1037; CA Roberts, R R Marquardt, A A Frohlich, R L McGraw, R G Rotter, J C Henning (1991) Chemical and spectral quantification of mold in contaminated barley; Cereal Chemistry 68(3):272-275).
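By way of illustration only, the first scoring approach (infected leaf area in relation to overall leaf area) can be sketched as follows. The function names, the area inputs and the idea of expressing the result as a reduction relative to a wild type control are assumptions made for this sketch and are not taken from the application; how the areas are measured is left open.

```python
def diseased_area_percent(infected_area: float, total_area: float) -> float:
    """Infected leaf area as a percentage of the total leaf area.

    Both areas must use the same unit (e.g. mm^2 or pixel counts).
    """
    if total_area <= 0:
        raise ValueError("total_area must be positive")
    if not 0 <= infected_area <= total_area:
        raise ValueError("infected_area must lie between 0 and total_area")
    return 100.0 * infected_area / total_area


def relative_reduction_percent(wild_type_pct: float, transgenic_pct: float) -> float:
    """Reduction of diseased area in the transgenic plant relative to wild type.

    Hypothetical helper: one possible way to express resistance "in
    comparison to a wild type plant" as a single percentage.
    """
    if wild_type_pct <= 0:
        raise ValueError("wild type diseased area must be positive")
    return 100.0 * (wild_type_pct - transgenic_pct) / wild_type_pct
```

For example, a wild type leaf with 40% diseased area and a transgenic leaf with 30% diseased area would give a relative reduction of 25% under this reading.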

[0099] The term "hybridization" as used herein includes "any process by which a strand of nucleic acid molecule joins with a complementary strand through base pairing" (J. Coombs (1994) Dictionary of Biotechnology, Stockton Press, New York). Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acid molecules) are impacted by such factors as the degree of complementarity between the nucleic acid molecules, stringency of the conditions involved, the Tm of the formed hybrid, and the G:C ratio within the nucleic acid molecules.

[0100] As used herein, the term "Tm" is used in reference to the "melting temperature." The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. The equation for calculating the Tm of nucleic acid molecules is well known in the art. As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation: Tm=81.5+0.41(% G+C), when a nucleic acid molecule is in aqueous solution at 1 M NaCl (see e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)). Other references include more sophisticated computations, which take structural as well as sequence characteristics into account for the calculation of Tm. Stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6.
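The simple estimate given above, Tm=81.5+0.41(% G+C), can be written out as a short calculation. The following is a minimal sketch; the function names are chosen for illustration, and the result is in degrees Celsius and applies only under the stated condition (aqueous solution at 1 M NaCl), without the structural corrections mentioned for more sophisticated computations.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C residues in a nucleic acid sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    if set(seq) - set("ACGTU"):
        raise ValueError("sequence contains characters other than A, C, G, T, U")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(sequence: str) -> float:
    """Simple Tm estimate: Tm = 81.5 + 0.41 * (% G+C), in degrees Celsius."""
    return 81.5 + 0.41 * gc_percent(sequence)
```

A sequence with 50% G+C content gives an estimate of about 102 under this formula, and each additional percentage point of G+C raises the estimate by 0.41.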

[0101] In particular, the term "stringency conditions" refers to conditions wherein 100 contiguous nucleotides or more, 150 contiguous nucleotides or more, 200 contiguous nucleotides or more or 250 contiguous nucleotides or more which are a fragment of or identical to the complementary nucleic acid molecule (DNA, RNA, ssDNA or ssRNA) hybridize under conditions equivalent to hybridization in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50° C. with washing in 2×SSC, 0.1% SDS at 50° C. or 65° C., preferably at 65° C., with a specific nucleic acid molecule (DNA, RNA, ssDNA or ssRNA). Preferably, the hybridizing conditions are equivalent to hybridization in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50° C. with washing in 1×SSC, 0.1% SDS at 50° C. or 65° C., preferably 65° C., more preferably the hybridizing conditions are equivalent to hybridization in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 50° C. or 65° C., preferably 65° C. Preferably, the complementary nucleotides hybridize with a fragment or the whole nucleic acids of the exogenous F6H1, CCoAOMT, ABCG37 and UGT71C1 genes, respectively. Alternatively, preferred hybridization conditions encompass hybridisation at 65° C. in 1×SSC or at 42° C. in 1×SSC and 50% formamide, followed by washing at 65° C. in 0.3×SSC or hybridisation at 50° C. in 4×SSC or at 40° C. in 6×SSC and 50% formamide, followed by washing at 50° C. in 2×SSC. Further preferred hybridization conditions are 0.1% SDS, 0.1 SSD and 65° C.

[0102] The term "plant" is intended to encompass plants at any stage of maturity or development, as well as any tissues or organs (plant parts) taken or derived from any such plant unless otherwise clearly indicated by context. Plant parts include, but are not limited to, plant cells, stems, roots, flowers, ovules, stamens, seeds, leaves, embryos, meristematic regions, callus tissue, anther cultures, gametophytes, sporophytes, pollen, microspores, protoplasts, hairy root cultures, and/or the like. The present invention also includes seeds produced by the plants of the present invention. Preferably, the seeds comprise the exogenous F6H1 nucleic acid optionally in combination with one or more nucleic acids selected from CCoAOMT, ABCG37 and UGT71C1 nucleic acids. In one embodiment, the seeds can develop into plants with increased resistance to fungal infection as compared to a wild-type variety of the plant seed. As used herein, a "plant cell" includes, but is not limited to, a protoplast, gamete producing cell, and a cell that regenerates into a whole plant. Tissue culture of various tissues of plants and regeneration of plants therefrom is well known in the art and is widely published.

[0103] Reference herein to an "endogenous" nucleic acid and/or protein refers to the nucleic acid and/or protein in question as found in a plant in its natural form (i.e., without there being any human intervention).

[0104] The term "exogenous" nucleic acid refers to a nucleic acid that has been introduced in a plant by means of gene technology. An "exogenous" nucleic acid can either not occur in a plant in its natural form, be different from the nucleic acid in question as found in a plant in its natural form, or can be identical to a nucleic acid found in a plant in its natural form, but not integrated within its natural genetic environment. The corresponding meaning of "exogenous" is applied in the context of protein expression. For example, a transgenic plant containing a transgene, i.e., an exogenous nucleic acid, may, when compared to the expression of the endogenous gene, encounter a substantial increase of the expression of the respective gene or protein in total. A transgenic plant according to the present invention includes an exogenous F6H1 nucleic acid optionally in combination with one or more exogenous nucleic acid(s) selected from CCoAOMT, ABCG37 and UGT71C1 nucleic acids integrated at any genetic loci and optionally the plant may also include the endogenous gene within the natural genetic background. Preferably the plant, plant part or plant cell does not include endogenous F6H1 nucleic acid optionally in combination with one or more endogenous nucleic acid(s) selected from CCoAOMT, ABCG37 and UGT71C1.

[0105] For the purposes of the invention, "recombinant" means with regard to, for example, a nucleic acid sequence, a nucleic acid molecule, an expression cassette or a vector construct comprising F6H1 nucleic acid optionally in combination with any one or more of CCoAOMT, ABCG37 and/or UGT71C1 nucleic acid(s), all those constructions brought about by man by gene technology methods in which either

[0106] (a) the sequences of the F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acids or a part thereof, or

[0107] (b) genetic control sequence(s) which are operably linked with the F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acid sequences according to the invention, for example a promoter, or

[0108] (c) (a) and (b) are not located in their natural genetic environment within the genome of the wildtype plant or have been modified by man by gene technology methods. The modification may take the form of, for example, a substitution, addition, deletion, inversion or insertion of one or more nucleotide residues. The natural genetic environment is understood as meaning the natural genomic or chromosomal locus in the original plant or the presence in a genomic library or the combination with the natural promoter.

[0109] For instance, a naturally occurring expression cassette--for example the naturally occurring combination of the natural promoter of the nucleic acid sequences with the corresponding nucleic acid sequence encoding a protein useful in the methods of the present invention, as defined above--becomes a recombinant expression cassette when this expression cassette is modified by man by non-natural, synthetic ("artificial") methods such as, for example, mutagenic treatment. Suitable methods are described, for example, in U.S. Pat. No. 5,565,350, WO 00/15815 or US200405323. Furthermore, a naturally occurring expression cassette--for example the naturally occurring combination of the natural promoter of the nucleic acid sequences with the corresponding nucleic acid sequence encoding a protein useful in the methods of the present invention, as defined above--becomes a recombinant expression cassette when this expression cassette is not integrated in the natural genetic environment but in a different genetic environment.

[0110] The term "isolated nucleic acid" or "isolated protein" refers to a nucleic acid or protein that is not located in its natural environment, in particular its natural cellular environment. Thus, an isolated nucleic acid or isolated protein is essentially separated from other components of its natural environment. However, the skilled person in the art is aware that preparations of an isolated nucleic acid or an isolated protein can display a certain degree of impurity depending on the isolation procedure used. Methods for purifying nucleic acids and proteins are well known in the art. The isolated gene may be isolated from an organism or may be manmade, for example by chemical synthesis. In this regard, a recombinant nucleic acid may also be in an isolated form.

[0111] As used herein, the term "transgenic" refers to an organism, e.g., a plant, plant cell, callus, plant tissue, or plant part that exogenously contains the nucleic acid, recombinant construct, vector or expression cassette described herein or a part thereof which is preferably introduced by non-essentially biological processes, preferably by Agrobacteria transformation. The recombinant construct or a part thereof is stably integrated into a chromosome, so that it is passed on to successive generations by clonal propagation, vegetative propagation or sexual propagation. Preferred successive generations are transgenic too. Essentially biological processes may be crossing of plants and/or natural recombination.

[0112] Preferably, the nucleic acids according to the invention or used according to the invention comprise

F6H1 nucleic acid, F6H1 and CCoAOMT nucleic acids, F6H1 and ABCG37 nucleic acids, or F6H1 and UGT71C1 nucleic acids, or F6H1, CCoAOMT and ABCG37 nucleic acids or F6H1, CCoAOMT, ABCG37 and UGT71C1 nucleic acids.

[0113] A transgenic plant, plant cell or tissue for the purposes of the invention is thus understood as meaning that an exogenous F6H1 nucleic acid optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT, ABCG37 and UGT71C1 nucleic acids is integrated into the genome by means of gene technology.

[0114] A recombinant construct, vector or expression cassette for the purposes of the invention comprises a F6H1 nucleic acid optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT, ABCG37 and UGT71C1 nucleic acids and is prepared by means of gene technology.

[0115] A "wild type" plant, "wild type" plant part, or "wild type" plant cell means that said plant, plant part, or plant cell does not express exogenous F6H1, CCoAOMT, ABCG37 and UGT71C1 nucleic acids and exogenous F6H1, CCoAOMT, ABCG37 and UGT71C1 proteins. Preferably, the wildtype plant is not capable of producing more than 10 µM scopoletin and/or a derivative thereof, more preferably not more than 5 µM scopoletin and/or a derivative thereof and most preferably the wildtype plant is not capable of producing scopoletin and/or a derivative thereof. A derivative of scopoletin is e.g. scopolin. Preferably, the wildtype plant does not express endogenous F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acids and endogenous F6H1, CCoAOMT, ABCG37 and/or UGT71C1 proteins.

[0116] Natural locus means the location on a specific chromosome and/or the location between certain genes and/or the same sequence background as in the original plant which is transformed.

[0117] Preferably, the transgenic plant, plant cell or tissue thereof expresses the F6H1 nucleic acids optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT, ABCG37 and UGT71C1 nucleic acids. Preferably, the transgenic plant, plant cell or tissue thereof is transformed with recombinant vector constructs comprising F6H1 nucleic acids optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT, ABCG37 and UGT71C1 nucleic acids described herein. F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acids may be located on the same vector or different recombinant vectors.

[0118] The term "expression" or "gene expression" means the transcription of a specific gene or specific genes or specific genetic vector construct. The term "expression" or "gene expression" in particular means the transcription of a gene or genes or genetic vector construct into structural RNA (rRNA, tRNA), or mRNA with or without subsequent translation of the latter into a protein. The process includes transcription of DNA and processing of the resulting RNA product. The term "expression" or "gene expression" can also include the translation of the mRNA and therewith the synthesis of the encoded protein, i.e., protein expression.

[0119] The term "increased expression" or "enhanced expression" or "overexpression" or "increase of content" as used herein means any form of expression that is additional to the original wild-type expression level. For the purposes of this invention, the original wild-type expression level might also be zero (absence of expression).

[0120] Methods for increasing expression of genes or gene products are well documented in the art and include, for example, overexpression driven by appropriate promoters, the use of transcription enhancers or translation enhancers, or RNAa (Li et al 2006, PNAS 103(46) 17337-42). Isolated nucleic acids which serve as promoter or enhancer elements may be introduced in an appropriate position (typically upstream) of a non-heterologous form of a polynucleotide so as to upregulate expression of a nucleic acid encoding the protein of interest. For example, endogenous promoters may be altered in vivo by mutation, deletion, and/or substitution (see, Kmiec, U.S. Pat. No. 5,565,350; Zarling et al., WO9322443), or isolated promoters may be introduced into a plant cell in the proper orientation and distance from a gene of the present invention so as to control the expression of the gene.

[0121] The term "functional fragment" refers to any nucleic acid or protein which comprises merely a part of the full-length nucleic acid or full-length protein, respectively, but still provides essentially the same or similar function, e.g., increased fungal resistance and/or the same, essentially the same or similar biological activity when expressed in a plant. Preferably, the fragment comprises at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% of the original sequence. Preferably, the functional fragment comprises contiguous nucleic acids or amino acids as in the original nucleic acid or original protein, respectively. In one embodiment the fragment of any of the F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acids has an identity as defined above over a length of at least 70%, at least 75%, at least 90% of the nucleotides of the respective F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acid.
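The length criterion for a fragment (a contiguous part covering at least a given percentage of the original sequence) can be illustrated with the following minimal sketch. The function names and the default threshold of 70% are chosen for illustration; the check covers only contiguity and length and says nothing about whether the fragment retains the required function or biological activity.

```python
def fragment_coverage_percent(fragment: str, original: str) -> float:
    """Length of a contiguous fragment as a percentage of the original sequence."""
    if not fragment or not original:
        raise ValueError("fragment and original must not be empty")
    if fragment.upper() not in original.upper():
        raise ValueError("fragment is not a contiguous part of the original")
    return 100.0 * len(fragment) / len(original)


def meets_length_threshold(fragment: str, original: str,
                           min_percent: float = 70.0) -> bool:
    """True if the fragment covers at least min_percent of the original."""
    return fragment_coverage_percent(fragment, original) >= min_percent
```

For a hypothetical 10-residue original and an 8-residue contiguous fragment, the coverage is 80%, which meets a 70% threshold but not a 90% threshold.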

[0122] The term "the same biological activity", "essentially the same biological activity", "similar biological activity" or "increased biological activity" preferably means leading to an increased production and/or accumulation, compared to the wildtype plant, wild type plant part, or wild type plant cell, of more than 0.1 µM, preferably more than 1 µM, preferably more than 2 µM, more preferably more than 5 µM, most preferably more than 10 µM scopoletin and/or a derivative thereof when F6H1 and optionally CCoAOMT, ABCG37 and/or UGT71C1 nucleic acids or fragments thereof are expressed in a plant.

[0123] The term "splice variant" as used herein encompasses variants of a nucleic acid sequence in which selected introns and/or exons or parts thereof have been excised, replaced, displaced or added, or in which introns have been shortened or lengthened. Thus, a splice variant can have one or more or even all introns removed or added or partially removed or partially added. According to this definition, a cDNA is considered as a splice variant of the respective intron-containing genomic sequence and vice versa. Such splice variants may be found in nature or may be manmade. Methods for predicting and isolating such splice variants are well known in the art (see for example Foissac and Schiex (2005) BMC Bioinformatics 6: 25).
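The relationship between an intron-containing genomic sequence and the corresponding intron-free variant can be illustrated by a short sketch that excises given intron intervals. The function name and the coordinate convention (0-based start, end-exclusive) are assumptions made for this example; predicting where introns lie is a separate problem and is not addressed here.

```python
from typing import Iterable, Tuple


def excise_introns(genomic: str, introns: Iterable[Tuple[int, int]]) -> str:
    """Remove the given intron intervals from a genomic sequence.

    Each interval is (start, end), 0-based and end-exclusive. Intervals
    must lie within the sequence and must not overlap.
    """
    pieces = []
    position = 0
    for start, end in sorted(introns):
        if start < position or end <= start or end > len(genomic):
            raise ValueError("invalid or overlapping intron interval")
        pieces.append(genomic[position:start])  # keep the exon before the intron
        position = end                          # skip the intron itself
    pieces.append(genomic[position:])           # keep the final exon
    return "".join(pieces)
```

Passing an empty list of introns returns the genomic sequence unchanged, and removing all introns yields the sequence that the definition above treats as a splice variant of the genomic sequence.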

[0124] The wildtype plant may express the respective endogenous F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acids. As far as overexpression of exogenous F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acids is concerned, for the purposes of this invention, the original wild-type expression level of the corresponding endogenous nucleic acids might also be zero (absence of expression).

[0125] With respect to a vector construct and/or the recombinant nucleic acid molecules, the term "operatively linked" is intended to mean that the nucleic acid to be expressed is linked to the regulatory sequence, including promoters, terminators, enhancers and/or other expression control elements (e.g. polyadenylation signals), in a manner which allows for expression of the nucleic acid (e.g. in a host plant cell when the vector is introduced into the host plant cell). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) and Gruber and Crosby, in: Methods in Plant Molecular Biology and Biotechnology, Eds. Glick and Thompson, Chapter 7, 89-108, CRC Press: Boca Raton, Fla., including the references therein. Regulatory sequences include those that direct constitutive expression of a nucleotide sequence in many types of host cells and those that direct expression of the nucleotide sequence only in certain host cells or under certain conditions. It will be appreciated by those skilled in the art that the design of the vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of nucleic acid desired, and the like.

[0126] The term "introduction" or "transformation" as referred to herein encompasses the transfer of an exogenous polynucleotide into a host cell, irrespective of the method used for transfer. Plant tissue capable of subsequent clonal propagation, whether by organogenesis or embryogenesis, may be transformed with a vector construct of the present invention and a whole plant regenerated therefrom. The particular tissue chosen will vary depending on the clonal propagation systems available for, and best suited to, the particular species being transformed. Exemplary tissue targets include leaf disks, pollen, embryos, cotyledons, hypocotyls, megagametophytes, callus tissue, existing meristematic tissue (e.g., apical meristem, axillary buds, and root meristems), and induced meristem tissue (e.g., cotyledon meristem and hypocotyl meristem). The polynucleotide may be transiently or stably introduced into a host cell and may be maintained non-integrated, for example, as a plasmid. Alternatively, it may be integrated into the host genome. The host genome includes the nucleic acid contained in the nucleus as well as the nucleic acid contained in the plastids, e.g., chloroplasts, and/or mitochondria. The resulting transformed plant cell may then be used to regenerate a transformed plant in a manner known to persons skilled in the art.

[0127] The term "terminator" encompasses a control sequence which is a DNA sequence at the end of a transcriptional unit which signals 3' processing and polyadenylation of a primary transcript and termination of transcription. The terminator can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The terminator to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.

DETAILED DESCRIPTION

[0128] F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acids

[0129] The F6H1 nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens, e.g., of the family Phacopsoraceae, for example soybean rust, or of the genus of Fusarium, in particular Fusarium graminearum and/or Fusarium verticillioides, is preferably a nucleic acid

consisting of or comprising a nucleic acid selected from the group consisting of:

[0130] (i) a nucleic acid having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the nucleic acid sequence represented by SEQ ID NO: 1 or a functional fragment, or a splice variant thereof;

[0131] (ii) a nucleic acid encoding a F6H1 protein comprising an amino acid sequence having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the amino acid sequence represented by SEQ ID NO: 2 or a functional fragment; preferably the F6H1 protein has essentially the same or similar biological activity as a F6H1 protein encoded by SEQ ID NO: 2; preferably the F6H1 protein confers enhanced fungal resistance relative to control plants;

[0132] (iii) a nucleic acid molecule which hybridizes with a complementary sequence of any of the nucleic acid molecules of (i) or (ii) under high stringency hybridization conditions; preferably encoding a F6H1 protein; preferably wherein the nucleic acid molecule codes for a polypeptide which has essentially identical properties to the polypeptide described in SEQ ID NO: 2; preferably the encoded protein confers enhanced fungal resistance relative to control plants; and

[0133] (iv) a nucleic acid encoding the same F6H1 protein as the F6H1 nucleic acids of (i) to (iii) above, but differing from the F6H1 nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code.
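Item (iv) relies on the degeneracy of the genetic code: different nucleic acids can encode the same protein because several codons specify the same amino acid. The following minimal sketch illustrates this with the standard genetic code; the function names and the short example sequences are hypothetical and are unrelated to the sequences of the sequence listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a coding sequence up to (not including) the first stop codon."""
    seq = coding_sequence.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    protein = []
    for i in range(0, len(seq), 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def encode_same_protein(first: str, second: str) -> bool:
    """True if two coding sequences translate to the same protein."""
    return translate(first) == translate(second)
```

For instance, the hypothetical sequences ATGGCTCTGTAA and ATGGCCTTATAG differ at several positions but both translate to Met-Ala-Leu, so they encode the same protein in the sense of item (iv).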

[0134] The F6H1 nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens is for example a nucleic acid selected from SEQ ID No. 1, 9, 11, 13, 15, 17, 19 and 21. The F6H1 nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens is for example a nucleic acid encoding a F6H1 protein selected from SEQ ID No. 2, 10, 12, 14, 16, 18, 20 and 22.

[0135] The F6H1 protein may comprise a domain as defined in SEQ ID No. 63, having at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the protein sequence represented by the respective sequence.

[0136] The CCoAOMT nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens, e.g., of the family Phacopsoraceae, for example soybean rust, or of the genus of Fusarium, in particular Fusarium graminearum and/or Fusarium verticillioides, is preferably a nucleic acid

consisting of or comprising a nucleic acid selected from the group consisting of:

[0137] (i) a nucleic acid having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the nucleic acid sequence represented by SEQ ID NO: 3 or a functional fragment, or a splice variant thereof;

[0138] (ii) a nucleic acid encoding a CCoAOMT protein comprising an amino acid sequence having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the amino acid sequence represented by SEQ ID NO: 4 or a functional fragment; preferably the protein has essentially the same or similar biological activity as a CCoAOMT protein encoded by SEQ ID NO: 4; preferably the CCoAOMT protein confers enhanced fungal resistance relative to control plants;

[0139] (iii) a nucleic acid molecule which hybridizes with a complementary sequence of any of the nucleic acid molecules of (i) or (ii) under high stringency hybridization conditions; preferably encoding a CCoAOMT protein; preferably wherein the nucleic acid molecule codes for a polypeptide which has essentially identical properties to the polypeptide described in SEQ ID NO: 4; preferably the encoded protein confers enhanced fungal resistance relative to control plants; and

[0140] (iv) a nucleic acid encoding the same CCoAOMT protein as the CCoAOMT nucleic acids of (i) to (iii) above, but differing from the CCoAOMT nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code.

[0141] The CCoAOMT nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens is for example a nucleic acid selected from SEQ ID No. 3, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45 and 47. The CCoAOMT nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens is for example a nucleic acid encoding a CCoAOMT protein selected from SEQ ID No. 4, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46 and 48.

[0142] The CCoAOMT protein may comprise a domain as defined in SEQ ID No. 64, having at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the protein sequence represented by the respective sequence.

[0143] The ABCG37 nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens, e.g., of the family Phacopsoraceae, for example soybean rust, or of the genus of Fusarium, in particular Fusarium graminearum and/or Fusarium verticillioides, is preferably a nucleic acid

consisting of or comprising a nucleic acid selected from the group consisting of:

[0144] (i) a nucleic acid having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the nucleic acid sequence represented by SEQ ID NO: 5 or a functional fragment thereof, or a splice variant thereof;

[0145] (ii) a nucleic acid encoding a ABCG37 protein comprising an amino acid sequence having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the amino acid sequence represented by SEQ ID NO: 6 or a functional fragment thereof; preferably the protein has essentially the same or similar biological activity as a ABCG37 protein encoded by SEQ ID NO: 6; preferably the ABCG37 protein confers enhanced fungal resistance relative to control plants;

[0146] (iii) a nucleic acid molecule which hybridizes with a complementary sequence of any of the nucleic acid molecules of (i) or (ii) under high stringency hybridization conditions; preferably encoding a ABCG37 protein; preferably wherein the nucleic acid molecule codes for a polypeptide which has essentially identical properties to the polypeptide described in SEQ ID NO: 6; preferably the encoded protein confers enhanced fungal resistance relative to control plants; and

[0147] (iv) a nucleic acid encoding the same ABCG37 protein as the ABCG37 nucleic acids of (i) to (iii) above, but differing from the ABCG37 nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code.

[0148] The ABCG37 nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens is for example a nucleic acid selected from SEQ ID No. 5, 49, 51 and 53. The ABCG37 nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens is for example a nucleic acid encoding a ABCG37 protein selected from SEQ ID No. 6, 50, 52 and 54.

[0149] The ABCG37 protein may comprise at least one domain selected from the group as defined in SEQ ID No. 65, 66, 67 and/or 68 having at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the protein sequence represented by the respective sequence.

[0150] The UGT71C1 nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens, e.g., of the family Phacopsoraceae, for example soybean rust, or of the genus of Fusarium, in particular Fusarium graminearum and/or Fusarium verticillioides, is preferably a nucleic acid

consisting of or comprising a nucleic acid selected from the group consisting of:

[0151] (i) a nucleic acid having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the nucleic acid sequence represented by SEQ ID NO: 7 or a functional fragment thereof, or a splice variant thereof;

[0152] (ii) a nucleic acid encoding a UGT71C1 protein comprising an amino acid sequence having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the amino acid sequence represented by SEQ ID NO: 8 or a functional fragment thereof; preferably the protein has essentially the same or similar biological activity as a UGT71C1 protein encoded by SEQ ID NO: 8; preferably the UGT71C1 protein confers enhanced fungal resistance relative to control plants;

[0153] (iii) a nucleic acid molecule which hybridizes with a complementary sequence of any of the nucleic acid molecules of (i) or (ii) under high stringency hybridization conditions; preferably encoding a UGT71C1 protein; preferably wherein the nucleic acid molecule codes for a polypeptide which has essentially identical properties to the polypeptide described in SEQ ID NO: 8; preferably the encoded protein confers enhanced fungal resistance relative to control plants; and

[0154] (iv) a nucleic acid encoding the same UGT71C1 protein as the UGT71C1 nucleic acids of (i) to (iii) above, but differing from the UGT71C1 nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code.
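The degeneracy of the genetic code referred to in (iv) can be illustrated with a minimal sketch. The two short sequences below are hypothetical and are not taken from any SEQ ID NO; only the few codons needed for the illustration are tabulated, and the names `CODON_TABLE` and `translate` are introduced here for illustration only.

```python
# Partial standard genetic code: only the codons used in this illustration.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A",
    "CTT": "L", "CTG": "L",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon.

    Raises KeyError for any codon outside the partial table above.
    """
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical sequences that differ only by synonymous codon substitutions.
seq_a = "ATGGCTCTTAAATAA"
seq_b = "ATGGCCCTGAAGTGA"

# Different nucleic acids, identical encoded protein.
same_protein = seq_a != seq_b and translate(seq_a) == translate(seq_b)
```

Here `same_protein` evaluates to `True`: the nucleotide sequences differ, yet both translate to the same amino acid sequence.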

[0155] The UGT71C1 nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens is for example a nucleic acid selected from SEQ ID No. 55, 57, 59 and 61. The UGT71C1 nucleic acid to be overexpressed in order to achieve increased resistance to fungal pathogens is for example a nucleic acid encoding a UGT71C1 protein selected from SEQ ID No. 56, 58, 60 and 62.

[0156] The UGT71C1 protein may comprise at least one domain selected from the group as defined in SEQ ID No. 69 and/or 70 having at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the protein sequence represented by the respective sequence.

[0157] Percentages of identity of a nucleic acid are indicated with reference to the entire nucleotide region given in a sequence specifically disclosed herein.
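As a minimal sketch of how a percentage of identity relative to the entire reference region could be computed, the following function counts matching positions in two sequences that are assumed to be already aligned (with "-" marking gaps) and divides by the full length of the reference. The function name `percent_identity` and the toy sequences are hypothetical; in practice an alignment program would produce the aligned input.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity relative to the entire reference region.

    Both inputs must be the same length and already aligned,
    with '-' marking gap positions.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    # Denominator: every residue of the reference, not just the aligned part.
    ref_length = sum(1 for r in aligned_ref if r != "-")
    if ref_length == 0:
        raise ValueError("reference contains no residues")
    matches = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r != "-" and r == q
    )
    return 100.0 * matches / ref_length


# Hypothetical 10-nucleotide reference; the query has one mismatch and one gap.
identity = percent_identity("ATGGCTCTTA", "ATGGCACT-A")
```

With these toy inputs, 8 of the 10 reference positions match, so `identity` is 80.0.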

[0158] Preferably the portion of the F6H1 nucleic acid fragment is about 500-600, about 600-700, about 700-800, about 800-900, about 900-1000, or about 1000-1086 nucleotides, preferably consecutive nucleotides, preferably counted from the 5' or 3' end of the nucleic acid, in length, of the nucleic acid sequences given in SEQ ID NO: 1.

[0159] Preferably the portion of the CCoAOMT nucleic acid fragment is about 400-500, about 500-600, about 600-700, or about 700-780 nucleotides, preferably consecutive nucleotides, preferably counted from the 5' or 3' end of the nucleic acid, in length, of the nucleic acid sequences given in SEQ ID NO: 3.

[0160] Preferably the portion of the ABCG37 nucleic acid fragment is about 2500-2600, about 2600-2700, about 2700-2800, about 2800-2900, about 2900-3000, about 3000-3100, about 3100-3200, about 3200-3300, about 3300-3400, about 3400-3500, about 3500-3600, about 3600-3700, about 3700-3800, about 3800-3900, about 3900-4000, about 4000-4100, about 4100-4200, or about 4300-4353 nucleotides, preferably consecutive nucleotides, preferably counted from the 5' or 3' end of the nucleic acid, in length, of the nucleic acid sequences given in SEQ ID NO: 5.

[0161] Preferably the portion of the UGT71C1 nucleic acid fragment is about 500-600, about 600-700, about 700-800, about 800-900, about 900-1000, about 1000-1100, about 1100-1200, about 1200-1300, about 1300-1400 or about 1400-1446 nucleotides, preferably consecutive nucleotides, preferably counted from the 5' or 3' end of the nucleic acid, in length, of the nucleic acid sequences given in SEQ ID NO: 7.
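Taking a fragment of consecutive nucleotides counted from the 5' or 3' end can be sketched as a simple slice. The helper `terminal_fragment` and the toy sequence are hypothetical and serve only to show the two counting directions; they do not reproduce any disclosed sequence.

```python
def terminal_fragment(sequence: str, length: int, end: str = "5'") -> str:
    """Return `length` consecutive nucleotides counted from the 5' or 3' end."""
    if not 0 < length <= len(sequence):
        raise ValueError("length must be between 1 and the sequence length")
    if end == "5'":
        return sequence[:length]
    if end == "3'":
        return sequence[-length:]
    raise ValueError("end must be \"5'\" or \"3'\"")


# Hypothetical 15-nucleotide sequence written 5' to 3'.
toy = "ATGGCTCTTAAATAA"
five_prime = terminal_fragment(toy, 6, "5'")
three_prime = terminal_fragment(toy, 6, "3'")
```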

[0162] All the nucleic acid sequences mentioned herein (single-stranded and double-stranded DNA and RNA sequences, for example cDNA and mRNA) can be produced in a known way by chemical synthesis from the nucleotide building blocks, e.g. by fragment condensation of individual overlapping, complementary nucleic acid building blocks of the double helix. Chemical synthesis of oligonucleotides can, for example, be performed in a known way, by the phosphoramidite method (Voet, Voet, 2nd edition, Wiley Press, New York, pages 896-897). The accumulation of synthetic oligonucleotides and filling of gaps by means of the Klenow fragment of DNA polymerase and ligation reactions as well as general cloning techniques are described in Sambrook et al. (1989), see below.

[0163] The F6H1, CCoAOMT, ABCG37 and/or UGT71C1 nucleic acids described herein are useful in the constructs, methods, plants, harvestable parts and products of the invention.

F6H1, CCoAOMT, ABCG37 and/or UGT71C1 Proteins

[0164] In one embodiment of the invention, the F6H1 protein is encoded by a nucleic acid comprising an exogenous nucleic acid having

[0165] (i) a nucleic acid having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity with SEQ ID NO: 1, a functional fragment thereof, or a splice variant thereof; or by

[0166] (ii) an exogenous nucleic acid encoding a protein comprising an amino acid sequence having at least F6H1 homology with SEQ ID NO: 2, a functional fragment thereof, preferably the encoded protein confers enhanced fungal resistance relative to control plants;

[0167] (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); preferably encoding a F6H1 protein; preferably wherein the nucleic acid molecule codes for a polypeptide which has essentially identical properties to the polypeptide described in SEQ ID NO: 2; preferably the encoded protein confers enhanced fungal resistance relative to control plants; or by

[0168] (iv) an exogenous nucleic acid encoding the same F6H1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code.

[0169] In one embodiment of the invention, the CCoAOMT protein is encoded by a nucleic acid comprising an exogenous nucleic acid having

[0170] (i) a nucleic acid having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity with SEQ ID NO: 3, a functional fragment thereof, or a splice variant thereof; or by

[0171] (ii) an exogenous nucleic acid encoding a protein comprising an amino acid sequence having at least CCoAOMT homology with SEQ ID NO: 4, a functional fragment thereof, preferably the encoded protein confers enhanced fungal resistance relative to control plants;

[0172] (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); preferably encoding a CCoAOMT protein; preferably wherein the nucleic acid molecule codes for a polypeptide which has essentially identical properties to the polypeptide described in SEQ ID NO: 4; preferably the encoded protein confers enhanced fungal resistance relative to control plants; or by

[0173] (iv) an exogenous nucleic acid encoding the same CCoAOMT protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code.

[0174] In one embodiment of the invention, the ABCG37 protein is encoded by a nucleic acid comprising an exogenous nucleic acid having

[0175] (i) a nucleic acid having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity with SEQ ID NO: 5, a functional fragment thereof, or a splice variant thereof; or by

[0176] (ii) an exogenous nucleic acid encoding a protein comprising an amino acid sequence having at least ABCG37 homology with SEQ ID NO: 6, a functional fragment thereof, preferably the encoded protein confers enhanced fungal resistance relative to control plants;

[0177] (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); preferably encoding a ABCG37 protein; preferably wherein the nucleic acid molecule codes for a polypeptide which has essentially identical properties to the polypeptide described in SEQ ID NO: 6; preferably the encoded protein confers enhanced fungal resistance relative to control plants; or by

[0178] (iv) an exogenous nucleic acid encoding the same ABCG37 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code.

[0179] In one embodiment of the invention, the UGT71C1 protein is encoded by a nucleic acid comprising an exogenous nucleic acid having

[0180] (i) a nucleic acid having in increasing order of preference at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity with SEQ ID NO: 7, a functional fragment thereof, or a splice variant thereof; or by

[0181] (ii) an exogenous nucleic acid encoding a protein comprising an amino acid sequence having at least UGT71C1 homology with SEQ ID NO: 8, a functional fragment thereof, preferably the encoded protein confers enhanced fungal resistance relative to control plants;

[0182] (iii) an exogenous nucleic acid capable of hybridizing under stringent conditions with a complementary sequence of any of the nucleic acids according to (i) or (ii); preferably encoding a UGT71C1 protein; preferably wherein the nucleic acid molecule codes for a polypeptide which has essentially identical properties to the polypeptide described in SEQ ID NO: 8; preferably the encoded protein confers enhanced fungal resistance relative to control plants; or by

[0183] (iv) an exogenous nucleic acid encoding the same UGT71C1 protein as the nucleic acids of (i) to (iii) above, but differing from the nucleic acids of (i) to (iii) above due to the degeneracy of the genetic code.

[0184] Preferably, the F6H1 polypeptide comprises about 200-225, about 225-250, about 250-275, about 275-300, about 300-325, about 325-350, or about 350-362 amino acid residues, preferably consecutive amino acid residues, preferably counted from the N-terminus or C-terminus of the amino acid sequence, or up to the full length of any of the amino acid sequences encoded by the nucleic acid sequences set out in SEQ ID NO: 1.

[0185] Preferably, the CCoAOMT polypeptide comprises about 100-125, about 125-150, about 150-175, about 175-200, about 200-225, about 225-250, or about 250-260 amino acid residues, preferably consecutive amino acid residues, preferably counted from the N-terminus or C-terminus of the amino acid sequence, or up to the full length of any of the amino acid sequences encoded by the nucleic acid sequences set out in SEQ ID NO: 3.

[0186] Preferably, the ABCG37 polypeptide comprises about 1100-1125, about 1125-1150, about 1150-1175, about 1175-1200, about 1200-1225, about 1225-1250, about 1250-1275, about 1275-1300, about 1300-1325, about 1325-1350, about 1350-1375, about 1375-1400, about 1400-1425, or about 1425-1451 amino acid residues, preferably consecutive amino acid residues, preferably counted from the N-terminus or C-terminus of the amino acid sequence, or up to the full length of any of the amino acid sequences encoded by the nucleic acid sequences set out in SEQ ID NO: 5.

[0187] Preferably, the UGT71C1 polypeptide comprises about 225-250, about 250-275, about 275-300, about 300-325, about 325-350, about 350-375, about 375-400, about 400-425, about 425-450, about 450-475, or about 475-482 amino acid residues, preferably consecutive amino acid residues, preferably counted from the N-terminus or C-terminus of the amino acid sequence, or up to the full length of any of the amino acid sequences encoded by the nucleic acid sequences set out in SEQ ID NO: 7.
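The upper bounds of the residue ranges in [0184] to [0187] can be cross-checked against the upper bounds of the nucleotide ranges in [0158] to [0161], since three nucleotides make up one codon. The sketch below is an arithmetic check only; the dictionary and function names are hypothetical, and whether the nucleotide counts include a stop codon is not stated in the text and is not assumed here.

```python
# Upper bounds quoted in the text for each sequence.
NUCLEOTIDE_UPPER_BOUND = {
    "F6H1": 1086, "CCoAOMT": 780, "ABCG37": 4353, "UGT71C1": 1446,
}
RESIDUE_UPPER_BOUND = {
    "F6H1": 362, "CCoAOMT": 260, "ABCG37": 1451, "UGT71C1": 482,
}


def codon_count(nucleotides: int) -> int:
    """Number of complete codons in a region of the given length."""
    return nucleotides // 3


# For each protein, check that nucleotides / 3 equals the quoted residue bound.
consistent = {
    name: codon_count(nt) == RESIDUE_UPPER_BOUND[name]
    for name, nt in NUCLEOTIDE_UPPER_BOUND.items()
}
```

All four entries of `consistent` are `True`, so each quoted residue bound equals one third of the corresponding nucleotide bound.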

[0188] The F6H1, CCoAOMT, ABCG37 and/or UGT71C1 proteins described herein are useful in the constructs, methods, plants, harvestable parts and products of the invention.

Methods for Increasing Fungal Resistance

[0189] One embodiment of the present invention is a method according to the present invention for increasing fungal resistance in a plant, a plant part, or a plant cell, wherein the method comprises the step of increasing the production of scopoletin and/or a derivative thereof in the plant, plant part, or plant cell in comparison to a wild type plant, wild type plant part, or wild type plant cell. The derivative of scopoletin may be scopolin.

[0190] Scopoletin is defined by the structural formula:

##STR00001##

[0191] Scopolin is defined by the structural formula:

##STR00002##

[0192] One embodiment of the present invention is a method for increasing fungal resistance in a plant, a plant part, or a plant cell, wherein the method comprises increasing the expression and/or biological activity of a F6H1 protein in the plant, plant part, or plant cell in comparison to a wild type plant, wild type plant part, or wild type plant cell, wherein said F6H1 protein is encoded by a nucleic acid as defined above. In a preferred embodiment said method further comprises increasing the expression and/or biological activity of one or more additional protein(s) selected from the group consisting of a CCoAOMT1 protein, an ABCG37 protein and a UGT71C1 protein in the plant, plant part, or plant cell in comparison to a wild type plant, wild type plant part, or wild type plant cell, wherein said CCoAOMT1 protein, ABCG37 protein and UGT71C1 protein are as defined above. Preferably, said method comprises increasing the production and/or accumulation of scopoletin and/or a derivative thereof in a plant, plant part or plant cell.

[0193] One embodiment of the invention is a method for increasing fungal resistance, preferably resistance to Phacopsoraceae and/or Fusarium, in a plant, plant part, or plant cell by increasing the expression and/or biological activity of a F6H1 protein, optionally in combination with increasing the expression and/or biological activity of one or more of the protein(s) selected from the group consisting of CCoAOMT, ABCG37 and/or UGT71C1 protein(s), or a functional fragment or homologue thereof, in comparison to wild-type plants, wild-type plant parts or wild-type plant cells. Preferably, the F6H1 protein is expressed from an exogenous nucleic acid. Preferably, the F6H1 protein and one or more of the proteins selected from the group consisting of CCoAOMT, ABCG37 and/or UGT71C1 protein(s) are expressed from an exogenous nucleic acid.

[0194] One embodiment of the invention is a method for increasing fungal resistance in a plant, a plant part, or a plant cell, wherein the method comprises

[0195] (a) stably transforming a plant cell with an expression cassette comprising an exogenous nucleic acid encoding a F6H1 protein,

[0196] (b) regenerating the plant from the plant cell; and

[0197] (c) expressing said exogenous nucleic acid.

[0198] A preferred method according to the present invention comprises

[0199] (a) stably transforming a plant cell with expression cassette(s) comprising an exogenous nucleic acid encoding a F6H1 protein and one or more exogenous nucleic acid(s) encoding CCoAOMT1, ABCG37 and/or UGT71C1 protein(s),

[0200] (b) regenerating the plant from the plant cell; and

[0201] (c) expressing said exogenous nucleic acids, optionally wherein the nucleic acid(s) which codes for a CCoAOMT1, ABCG37 and/or UGT71C1 protein(s) is expressed in an amount and for a period sufficient to generate or to increase fungal resistance in said plant.

[0202] Preferably the nucleic acid(s) encoding F6H1, CCoAOMT1, ABCG37 and/or UGT71C1 protein(s) are in functional linkage with a promoter. Preferably, the promoter is a constitutive promoter, a pathogen-inducible (preferably fungal-inducible) promoter, a mesophyll-specific promoter, an epidermis-specific promoter, a stalk-specific promoter, and/or an ear- or kernel-specific promoter.

[0203] Preferably, the production of scopoletin and/or a derivative thereof in the plant, plant part, or plant cell in comparison to a wild type plant, wild type plant part, or wild type plant cell is increased.

[0204] In preferred embodiments, the protein amount and/or biological activity of the F6H1 protein in the plant is increased by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% or more in comparison to a wild type plant that is not transformed with the F6H1 nucleic acid.

[0205] In preferred embodiments, the protein amount and/or biological activity of the CCoAOMT protein in the plant is increased by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% or more in comparison to a wild type plant that is not transformed with the CCoAOMT nucleic acid.

[0206] In preferred embodiments, the protein amount and/or biological activity of the ABCG37 protein in the plant is increased by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% or more in comparison to a wild type plant that is not transformed with the ABCG37 nucleic acid.

[0207] In preferred embodiments, the protein amount and/or biological activity of the UGT71C1 protein in the plant is increased by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% or more in comparison to a wild type plant that is not transformed with the UGT71C1 nucleic acid.
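The "increased by at least X%" comparisons in [0204] to [0207] amount to a relative increase over the wild-type value. A minimal sketch follows, assuming a single positive wild-type measurement and a single transgenic measurement in the same units; the function names and the example values are hypothetical and do not represent measured data.

```python
from typing import Optional

# Thresholds listed in the text, in percent.
THRESHOLDS = (10, 20, 30, 40, 50, 60, 70, 80, 90, 95)


def percent_increase(wild_type: float, transgenic: float) -> float:
    """Relative increase of the transgenic value over the wild-type value, in percent."""
    if wild_type <= 0:
        raise ValueError("wild-type value must be positive")
    return 100.0 * (transgenic - wild_type) / wild_type


def highest_threshold_met(increase: float) -> Optional[int]:
    """Largest listed threshold that the increase reaches, or None if below 10%."""
    met = [t for t in THRESHOLDS if increase >= t]
    return max(met) if met else None


# Hypothetical measurements in arbitrary units.
increase = percent_increase(wild_type=2.0, transgenic=3.5)
level = highest_threshold_met(increase)
```

For these illustrative values the increase is 75.0%, which satisfies the "at least 70%" level but not the "at least 80%" level.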

[0208] The exogenous nucleic acids encoding F6H1, CCoAOMT1, ABCG37 and/or UGT71C1 are located on the same or different expression cassettes. Preferably, one expression cassette comprises exogenous nucleic acid encoding F6H1, optionally in combination with one or more exogenous nucleic acid(s) encoding CCoAOMT1, ABCG37 and/or UGT71C1. Preferably, the expression cassette comprises exogenous nucleic acid encoding

[0209] F6H1,

[0210] F6H1 and CCoAOMT1,

[0211] F6H1 and ABCG37,

[0212] F6H1 and UGT71C1,

[0213] F6H1, CCoAOMT1 and ABCG37

[0214] F6H1, CCoAOMT1 and UGT71C1,

[0215] F6H1, UGT71C1 and ABCG37 or

[0216] F6H1, CCoAOMT1, ABCG37 and UGT71C1 proteins.
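The eight combinations listed in [0209] to [0216] correspond to F6H1 together with every subset of the three optional proteins. A short sketch that enumerates them is given below; the function name `cassette_combinations` is hypothetical, and the order of names within a combination carries no meaning.

```python
from itertools import combinations

# Proteins that may optionally accompany F6H1.
OPTIONAL = ("CCoAOMT1", "ABCG37", "UGT71C1")


def cassette_combinations():
    """F6H1 alone, then F6H1 with each subset of the optional proteins."""
    combos = []
    for size in range(len(OPTIONAL) + 1):
        for extra in combinations(OPTIONAL, size):
            combos.append(("F6H1",) + extra)
    return combos


all_combos = cassette_combinations()
for combo in all_combos:
    print(", ".join(combo))
```

The enumeration yields 2^3 = 8 combinations, matching the number of entries in the list above.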

[0217] In another embodiment the exogenous nucleic acid encoding

[0218] F6H1 and CCoAOMT1,

[0219] F6H1 and ABCG37,

[0220] F6H1 and UGT71C1,

[0221] F6H1, CCoAOMT1 and ABCG37

[0222] F6H1, CCoAOMT1 and UGT71C1

[0223] F6H1, UGT71C1 and ABCG37 or

[0224] F6H1, CCoAOMT1, ABCG37 and UGT71C1 proteins are located on different expression cassettes.

[0225] The fungal pathogens or fungus-like pathogens (such as, for example, Chromista) can belong to the group comprising Plasmodiophoromycota, Oomycota, Ascomycota, Chytridiomycetes, Zygomycetes, Basidiomycota or Deuteromycetes (Fungi imperfecti). Pathogens which may be mentioned by way of example, but not by limitation, are those detailed in Tables 2 and 3, and the diseases which are associated with them.

TABLE 2 Diseases caused by biotrophic and/or heminecrotrophic phytopathogenic fungi

Disease | Pathogen
Leaf rust | Puccinia recondita
Yellow rust | P. striiformis
Powdery mildew | Erysiphe graminis/Blumeria graminis
Rust (common corn) | Puccinia sorghi
Rust (Southern corn) | Puccinia polysora
Tobacco leaf spot | Cercospora nicotianae
Rust (soybean) | Phakopsora pachyrhizi, P. meibomiae
Rust (tropical corn) | Physopella pallescens, P. zeae = Angiopsora zeae

TABLE 3 Diseases caused by necrotrophic and/or hemibiotrophic fungi and Oomycetes

Disease | Pathogen
Glume blotch | Septoria (Stagonospora) nodorum
Leaf blotch | Septoria tritici
Ear fusarioses | Fusarium spp.
Late blight | Phytophthora infestans
Anthracnose leaf blight, Anthracnose stalk rot | Colletotrichum graminicola (teleomorph: Glomerella graminicola Politis); Glomerella tucumanensis (anamorph: Glomerella falcatum Went)
Curvularia leaf spot | Curvularia clavata, C. eragrostidis = C. maculans (teleomorph: Cochliobolus eragrostidis), Curvularia inaequalis, C. intermedia (teleomorph: Cochliobolus intermedius), Curvularia lunata (teleomorph: Cochliobolus lunatus), Curvularia pallescens (teleomorph: Cochliobolus pallescens), Curvularia senegalensis, C. tuberculata (teleomorph: Cochliobolus tuberculatus)
Didymella leaf spot | Didymella exitalis
Diplodia leaf spot or streak | Stenocarpella macrospora = Diplodia macrospora
Brown stripe downy mildew | Sclerophthora rayssiae var. zeae
Crazy top downy mildew | Sclerophthora macrospora = Sclerospora macrospora
Green ear downy mildew (graminicola downy mildew) | Sclerospora graminicola
Leaf spots, minor | Alternaria alternata, Ascochyta maydis, A. tritici, A. zeicola, Bipolaris victoriae = Helminthosporium victoriae (teleomorph: Cochliobolus victoriae), C. sativus (anamorph: Bipolaris sorokiniana = H. sorokinianum = H. sativum), Epicoccum nigrum, Exserohilum prolatum = Drechslera prolata (teleomorph: Setosphaeria prolata), Graphium penicillioides, Leptosphaeria maydis, Leptothyrium zeae, Ophiosphaerella herpotricha (anamorph: Scolecosporiella sp.), Paraphaeosphaeria michotii, Phoma sp., Septoria zeae, S. zeicola, S. zeina
Northern corn leaf blight (white blast, crown stalk rot, stripe) | Setosphaeria turcica (anamorph: Exserohilum turcicum = Helminthosporium turcicum)
Northern corn leaf spot, Helminthosporium ear rot (race 1) | Cochliobolus carbonum (anamorph: Bipolaris zeicola = Helminthosporium carbonum)
Phaeosphaeria leaf spot | Phaeosphaeria maydis = Sphaerulina maydis
Rostratum leaf spot (Helminthosporium leaf disease, ear and stalk rot) | Setosphaeria rostrata (anamorph: Exserohilum rostratum = Helminthosporium rostratum)
Java downy mildew | Peronosclerospora maydis = Sclerospora maydis
Philippine downy mildew | Peronosclerospora philippinensis = Sclerospora philippinensis
Sorghum downy mildew | Peronosclerospora sorghi = Sclerospora sorghi
Spontaneum downy mildew | Peronosclerospora spontanea = Sclerospora spontanea
Sugarcane downy mildew | Peronosclerospora sacchari = Sclerospora sacchari
Sclerotium ear rot (southern blight) | Sclerotium rolfsii Sacc. (teleomorph: Athelia rolfsii)
Seed rot-seedling blight | Bipolaris sorokiniana, B. zeicola = Helminthosporium carbonum, Diplodia maydis, Exserohilum pedicillatum, Exserohilum turcicum = Helminthosporium turcicum, Fusarium avenaceum, F. culmorum, F. moniliforme, Gibberella zeae (anamorph: F. graminearum), Macrophomina phaseolina, Penicillium spp., Phomopsis sp., Pythium spp., Rhizoctonia solani, R. zeae, Sclerotium rolfsii, Spicaria sp.
Selenophoma leaf spot | Selenophoma sp.
Yellow leaf blight | Ascochyta ischaemi, Phyllosticta maydis (teleomorph: Mycosphaerella zeae-maydis)
Zonate leaf spot | Gloeocercospora sorghi

[0226] Preferred fungal pathogens are of the order Pucciniales, in particular the family Phacopsoraceae, in particular the genus Phakopsora, more particularly the species Phakopsora pachyrhizi and/or Phakopsora meibomiae, also known as soybean rust or Asian Soybean Rust (ASR). Additionally or alternatively, preferred fungal pathogens are of the family Nectriaceae, in particular the genus Fusarium, in particular the species Fusarium graminearum, Fusarium sporotrichioides, Fusarium pseudograminearum, Fusarium culmorum, Fusarium poae, Fusarium verticillioides (Fusarium moniliforme), Fusarium subglutinans, Fusarium proliferatum, Fusarium fujikuroi, Fusarium avenaceum, Fusarium oxysporum, Fusarium virguliforme and/or Fusarium solani. Most preferred are Fusarium graminearum and/or Fusarium verticillioides.

[0227] F6H1, CCoAOMT1, ABCG37 and/or UGT71C1 expression constructs and vector constructs

[0228] One embodiment of the present invention is a recombinant vector construct comprising the nucleic acid encoding F6H1 protein as defined above operably linked with a promoter and a transcription termination sequence.

[0229] One embodiment of the present invention is a recombinant vector construct comprising the nucleic acid encoding CCoAOMT1 protein as defined above operably linked with a promoter and a transcription termination sequence.

[0230] One embodiment of the present invention is a recombinant vector construct comprising the nucleic acid encoding ABCG37 protein as defined above operably linked with a promoter and a transcription termination sequence.

[0231] One embodiment of the present invention is a recombinant vector construct comprising the nucleic acid encoding UGT71C1 protein as defined above operably linked with a promoter and a transcription termination sequence.

[0232] In one embodiment the nucleic acids encoding F6H1 protein, CCoAOMT1 protein, ABCG37 protein and/or UGT71C1 protein are located on the same recombinant vector construct. In another embodiment the nucleic acids encoding F6H1 protein, CCoAOMT1 protein and/or ABCG37 protein are located on different vector constructs. Preferably, one expression cassette comprises the exogenous nucleic acid encoding F6H1, optionally in combination with one or more exogenous nucleic acid(s) encoding CCoAOMT1, ABCG37 and/or UGT71C1. Preferably, the recombinant vector construct comprises exogenous nucleic acid encoding

[0233] F6H1,

[0234] F6H1 and CCoAOMT1,

[0235] F6H1 and ABCG37,

[0236] F6H1 and UGT71C1,

[0237] F6H1, CCoAOMT1 and ABCG37

[0238] F6H1, CCoAOMT1 and UGT71C1

[0239] F6H1, UGT71C1 and ABCG37 or

[0240] F6H1, CCoAOMT1, ABCG37 and UGT71C1 proteins.

[0241] Promoters according to the present invention may be constitutive, inducible, in particular pathogen-inducible, developmental stage-preferred, cell type-preferred, tissue-preferred or organ-preferred. Examples for suitable promoters and terminators are:

[0242] p-PcUbi::F6H1::t-ocs

[0243] p-SUPER::CCoAOMT1::t-nos

[0244] p-Glyma14g06680::ABCG37::t-StCATHD

[0245] p-SUPER::UGT71C1::t-nos

[0246] The PcUbi promoter regulates constitutive expression of the ubi4-2 gene (accession number X64345) of Petroselinum crispum (Kawalleck, P., Somssich, I. E., Feldbrugge, M., Hahlbrock, K., & Weisshaar, B. (1993). Polyubiquitin gene expression and structural properties of the ubi4-2 gene in Petroselinum crispum. Plant molecular biology, 21(4), 673-684). The p-Super promoter consists of three identical octopine synthase enhancers followed by a MAS promoter (Lee et al., 2007, Plant Physiology, Vol. 145, Issue 4, 1294-1300). The p-Glyma14g06680 promoter was identified in a screen for genes that are predominantly expressed in the leaf of soybean. The promoter regulates the expression of the gene Glyma14g06680, which most likely encodes a water channel protein (WO12127373). The t-ocs and t-NOS terminators are both derived from Agrobacterium (Gielen, J., et al. "The complete nucleotide sequence of the TL-DNA of the Agrobacterium tumefaciens plasmid pTiAch5." The EMBO journal 3.4 (1984): 835). T-ocs is the terminator of the octopine synthase gene and t-NOS is the terminator of the nopaline synthase gene of Agrobacterium tumefaciens. StCATHD-pA is the terminator of the cathepsin D inhibitor gene from Solanum tuberosum (t-StCat) (Herbers et al. 1994).
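The construct notation used in [0242] to [0245] can be read as promoter::coding sequence::terminator, with the prefixes "p-" and "t-" marking the promoter and terminator. That reading is an assumption based on the prefixes and the description above, and the `Cassette` type and `parse_cassette` helper below are hypothetical names used only to sketch how the notation might be split into its parts.

```python
from typing import NamedTuple


class Cassette(NamedTuple):
    promoter: str
    coding_sequence: str
    terminator: str


def parse_cassette(notation: str) -> Cassette:
    """Split 'p-<promoter>::<gene>::t-<terminator>' into its three elements."""
    parts = notation.split("::")
    if len(parts) != 3:
        raise ValueError(f"expected promoter::gene::terminator, got {notation!r}")
    promoter, coding_sequence, terminator = parts
    if not promoter.startswith("p-") or not terminator.startswith("t-"):
        raise ValueError(f"missing 'p-' or 't-' prefix in {notation!r}")
    # Strip the two-character prefixes before storing the element names.
    return Cassette(promoter[2:], coding_sequence, terminator[2:])


# The four example constructs given in the text.
EXAMPLES = [
    "p-PcUbi::F6H1::t-ocs",
    "p-SUPER::CCoAOMT1::t-nos",
    "p-Glyma14g06680::ABCG37::t-StCATHD",
    "p-SUPER::UGT71C1::t-nos",
]
cassettes = [parse_cassette(example) for example in EXAMPLES]
```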

[0247] One type of recombinant vector construct is a "plasmid," which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vector constructs are capable of autonomous replication in a host plant cell into which they are introduced. Other vector constructs are integrated into the genome of a host plant cell upon introduction into the host cell, and thereby are replicated along with the host genome. In particular the vector construct is capable of directing the expression of genes to which the vector is operatively linked. However, the invention is intended to include such other forms of expression vector constructs, such as viral vectors (e.g., potato virus X, tobacco rattle virus, and/or geminivirus), which serve equivalent functions.

Transgenic Organisms; Transgenic Plants, Plant Parts, and Plant Cells

[0248] A preferred embodiment is a transgenic plant, transgenic plant part, or transgenic plant cell overexpressing an exogenous F6H1 protein, optionally in combination with overexpressing one or more of CCoAOMT1 protein, ABCG37 protein and/or UGT71C1 protein encoded by a nucleic acid as defined above.

[0249] In preferred embodiments the biological activity of the F6H1 protein, and optionally the biological activity of one or more of CCoAOMT1 protein, ABCG37 protein and/or UGT71C1 protein, is increased in said transgenic plant, transgenic plant part, or transgenic plant cell.

[0250] In preferred embodiments, the protein amount of a F6H1 protein in the transgenic plant is increased by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% or more in comparison to a wild type plant that is not transformed with the F6H1 nucleic acid.

[0251] In preferred embodiments, the protein amount of a CCoAOMT1 protein in the transgenic plant is increased by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% or more in comparison to a wild type plant that is not transformed with the CCoAOMT1 nucleic acid.

[0252] In preferred embodiments, the protein amount of a ABCG37 protein in the transgenic plant is increased by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% or more in comparison to a wild type plant that is not transformed with the ABCG37 nucleic acid.

[0253] In preferred embodiments, the protein amount of a UGT71C1 protein in the transgenic plant is increased by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% or more in comparison to a wild type plant that is not transformed with the UGT71C1 nucleic acid.

[0254] In preferred embodiments the amount of F6H1 protein in combination with CCoAOMT1 and/or ABCG37 and/or UGT71C1 in the transgenic plant is increased by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% or more in comparison to a wild type plant that is not transformed with the respective nucleic acid(s).

[0255] More preferably, the transgenic plant, transgenic plant part, or transgenic plant cell according to the present invention has been obtained by transformation with one or more recombinant vector construct(s) described herein. In one embodiment a transgenic plant, transgenic plant part, or transgenic plant cell is transformed with one or more recombinant vector construct(s) as described, wherein the nucleic acid(s) encoding a F6H1 protein, and/or a CCoAOMT1 protein, and/or a ABCG37 protein and/or a UGT71C1 protein are located on the same recombinant vector construct or different vector constructs. Preferably, the recombinant vector construct comprises exogenous nucleic acid encoding F6H1 and CCoAOMT1, F6H1 and ABCG37, F6H1 and UGT71C1 or F6H1, CCoAOMT1, ABCG37 and UGT71C1 proteins.

[0256] A preferred embodiment comprises a transgenic plant, transgenic plant part, or transgenic plant cell overexpressing an exogenous F6H1 protein optionally in combination with one or more additional exogenous protein(s) selected from the group consisting of a CCoAOMT1 protein, an ABCG37 protein and a UGT71C1 protein, wherein the nucleic acid(s) encoding the respective protein(s) is operably linked with a promoter and a transcription termination sequence.

[0257] Suitable methods for transforming or transfecting host cells including plant cells are well known in the art of plant biotechnology. Any method may be used to transform the recombinant expression vector into plant cells to yield the transgenic plants of the invention. General methods for transforming dicotyledonous plants are disclosed, for example, in U.S. Pat. Nos. 4,940,838; 5,464,763, and the like. Methods for transforming specific dicotyledonous plants, for example, cotton, are set forth in U.S. Pat. Nos. 5,004,863; 5,159,135; and 5,846,797. Soy transformation methods set forth in U.S. Pat. Nos. 4,992,375; 5,416,011; 5,569,834; 5,824,877; 6,384,301 and in EP 0301749B1 may be used. Transformation methods may include direct and indirect methods of transformation. Suitable direct methods include polyethylene glycol induced DNA uptake, liposome-mediated transformation (U.S. Pat. No. 4,536,475), biolistic methods using the gene gun (Fromm M E et al., Bio/Technology. 8(9):833-9, 1990; Gordon-Kamm et al. Plant Cell 2:603, 1990), electroporation, incubation of dry embryos in DNA-comprising solution, and microinjection. In the case of these direct transformation methods, the plasmids used need not meet any particular requirements. Simple plasmids, such as those of the pUC series, pBR322, M13mp series, pACYC184 and the like can be used. If intact plants are to be regenerated from the transformed cells, an additional selectable marker gene is preferably located on the plasmid. The direct transformation techniques are equally suitable for dicotyledonous and monocotyledonous plants.

[0258] Transformation can also be carried out by bacterial infection by means of Agrobacterium (for example EP 0 116 718), viral infection by means of viral vectors (EP 0 067 553; U.S. Pat. No. 4,407,956; WO 95/34668; WO 93/03161) or by means of pollen (EP 0 270 356; WO 85/01856; U.S. Pat. No. 4,684,611). Agrobacterium based transformation techniques (especially for dicotyledonous plants) are well known in the art. The Agrobacterium strain (e.g., Agrobacterium tumefaciens or Agrobacterium rhizogenes) comprises a plasmid (Ti or Ri plasmid) and a T-DNA element which is transferred to the plant following infection with Agrobacterium. The T-DNA (transferred DNA) is integrated into the genome of the plant cell. The T-DNA may be localized on the Ri- or Ti-plasmid or is separately comprised in a so-called binary vector. Methods for the Agrobacterium-mediated transformation are described, for example, in Horsch R B et al. (1985) Science 225:1229. The Agrobacterium-mediated transformation is best suited to dicotyledonous plants but has also been adapted to monocotyledonous plants. The transformation of plants by Agrobacteria is described in, for example, White F F, Vectors for Gene Transfer in Higher Plants, Transgenic Plants, Vol. 1, Engineering and Utilization, edited by S. D. Kung and R. Wu, Academic Press, 1993, pp. 15-38; Jenes B et al. Techniques for Gene Transfer, Transgenic Plants, Vol. 1, Engineering and Utilization, edited by S. D. Kung and R. Wu, Academic Press, 1993, pp. 128-143; Potrykus (1991) Annu Rev Plant Physiol Plant Molec Biol 42:205-225. Transformation may result in transient or stable transformation and expression. Although a nucleotide sequence of the present invention can be inserted into any plant and plant cell falling within these broad classes, it is particularly useful in crop plant cells.

[0259] The genetically modified plant cells can be regenerated via all methods with which the skilled worker is familiar. Suitable methods can be found in the abovementioned publications by S. D. Kung and R. Wu, Potrykus or Hofgen and Willmitzer.

[0260] After transformation, plant cells or cell groupings may be selected for the presence of one or more markers which are encoded by plant-expressible genes co-transferred with the gene of interest, following which the transformed material is regenerated into a whole plant.

[0261] To select transformed plants, the plant material obtained in the transformation is, as a rule, subjected to selective conditions so that transformed plants can be distinguished from untransformed plants. For example, the seeds obtained in the above-described manner can be planted and, after an initial growing period, subjected to a suitable selection by spraying. A further possibility consists in growing the seeds, if appropriate after sterilization, on agar plates using a suitable selection agent so that only the transformed seeds can grow into plants. Alternatively, the transformed plants are screened for the presence of a selectable marker such as the ones described above. The transformed plants may also be directly selected by screening for the presence of the nucleic acid(s) encoding the F6H1, CCoAOMT1, ABCG37 and/or UGT71C1 protein(s).

[0262] Following DNA transfer and regeneration, putatively transformed plants may also be evaluated, for instance using Southern analysis, for the presence of the gene of interest, copy number and/or genomic organisation. Alternatively or additionally, expression levels of the newly introduced DNA may be monitored using Northern and/or Western analysis, both techniques being well known to persons having ordinary skill in the art.

[0263] The generated transformed plants may be propagated by a variety of means, such as by clonal propagation or classical breeding techniques. For example, a first generation (or T1) transformed plant may be selfed and homozygous second-generation (or T2) transformants selected, and the T2 plants may then further be propagated through classical breeding techniques or crossed with appropriate tester lines to generate hybrids. The generated transformed organisms may take a variety of forms. For example, they may be chimeras of transformed cells and non-transformed cells; clonal transformants (e.g., all cells transformed to contain the expression cassette); or grafts of transformed and untransformed tissues (e.g., in plants, a transformed rootstock grafted to an untransformed scion). Preferably, constructs or vectors or expression cassettes are not present in the genome of the original plant or are present in the genome of the transgenic plant at a position other than their natural locus in the genome of the original plant.
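As a rough illustration of the selfing step described above, the following sketch enumerates the progeny expected when a plant carrying one hemizygous transgene insertion is selfed. The single-locus Mendelian model, the function name `self_cross` and the allele labels `T` (transgene present) and `-` (transgene absent) are assumptions made for illustration only and are not taken from this disclosure.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def self_cross(parent=("T", "-")):
    """Expected fraction of progeny by transgene copy number (0, 1 or 2).

    Assumes one insertion locus and equal transmission of both alleles.
    """
    counts = Counter(
        sum(allele == "T" for allele in gametes)
        for gametes in product(parent, repeat=2)
    )
    total = sum(counts.values())
    return {copies: Fraction(n, total) for copies, n in counts.items()}


expected = self_cross()
for copies in (2, 1, 0):
    print(f"{copies} transgene copies: {expected[copies]}")
```

Under these assumptions about one quarter of the progeny would be homozygous for the insertion; multiple insertions or distorted transmission would change the ratio.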

[0264] Preferably, the transgenic plant of the present invention or the plant obtained by the method of the present invention has increased resistance against fungal pathogens, preferably rust pathogens (i.e., fungal pathogens of the order Pucciniales), preferably against fungal pathogens of the family Phakopsoraceae, more preferably against fungal pathogens of the genus Phakopsora, most preferably against Phakopsora pachyrhizi and Phakopsora meibomiae, also known as soybean rust. Preferably, resistance against Phakopsora pachyrhizi and/or Phakopsora meibomiae is increased.

[0265] Preferably, the plant, plant part, or plant cell is a plant or derived from a plant selected from the group consisting of beans, soya, pea, clover, kudzu, lucerne, lentils, lupins, vetches, groundnut, rice, wheat, barley, Arabidopsis, banana, canola, cotton, potato, corn, sugar cane, alfalfa, and sugar beet.

[0266] In one embodiment of the present invention the plant is selected from the group consisting of beans, soy, pea, clover, kudzu, lucerne, lentils, lupins, vetches, and/or groundnut. Preferably, the plant is a legume, comprising plants of the genus Phaseolus (comprising French bean, dwarf bean, climbing bean (Phaseolus vulgaris), Lima bean (Phaseolus lunatus L.), Tepary bean (Phaseolus acutifolius A. Gray), runner bean (Phaseolus coccineus)); the genus Glycine (comprising Glycine soja, soybeans (Glycine max (L.) Merrill)); pea (Pisum) (comprising shelling peas (Pisum sativum L. convar. sativum), also called smooth or round-seeded peas; marrowfat pea (Pisum sativum L. convar. medullare Alef. emend. C. O. Lehm), sugar pea (Pisum sativum L. convar. axiphium Alef. emend. C. O. Lehm), also called snow pea, edible-podded pea or mangetout, (Pisum granda sneida L. convar. sneidulo p. shneiderium)); peanut (Arachis hypogaea), clover (Trifolium spec.), medick (Medicago), kudzu vine (Pueraria lobata), common lucerne, alfalfa (M. sativa L.), chickpea (Cicer), lentils (Lens) (Lens culinaris Medik.), lupins (Lupinus); vetches (Vicia), field bean, broad bean (Vicia faba), vetchling (Lathyrus) (comprising chickling pea (Lathyrus sativus), heath pea (Lathyrus tuberosus)); genus Vigna (comprising moth bean (Vigna aconitifolia (Jacq.) Marechal), adzuki bean (Vigna angularis (Willd.) Ohwi & H. Ohashi), urd bean (Vigna mungo (L.) Hepper), mung bean (Vigna radiata (L.) R. Wilczek), bambara groundnut (Vigna subterranea (L.) Verdc.), rice bean (Vigna umbellata (Thunb.) Ohwi & H. Ohashi), Vigna vexillata (L.) A. Rich., Vigna unguiculata (L.) Walp., in the three subspecies asparagus bean, cowpea, catjang bean); pigeonpea (Cajanus cajan (L.) Millsp.), the genus Macrotyloma (comprising geocarpa groundnut (Macrotyloma geocarpum (Harms) Marechal & Baudet), horse bean (Macrotyloma uniflorum (Lam.) Verdc.)); goa bean (Psophocarpus tetragonolobus (L.) DC.), African yam bean (Sphenostylis stenocarpa (Hochst. ex A. Rich.) Harms), Egyptian black bean, dolichos bean, lablab bean (Lablab purpureus (L.) Sweet), yam bean (Pachyrhizus), guar bean (Cyamopsis tetragonolobus (L.) Taub.); and/or the genus Canavalia (comprising jack bean (Canavalia ensiformis (L.) DC.), sword bean (Canavalia gladiata (Jacq.) DC.)).

[0267] Further preferred is a plant selected from the group consisting of beans, soya, pea, clover, kudzu, lucerne, lentils, lupins, vetches, and groundnut. Most preferably, the plant, plant part, or plant cell is or is derived from soy and/or corn.

[0268] Preferably, the transgenic plant of the present invention or the plant obtained by the method of the present invention is a soybean plant and has increased resistance against fungal pathogens of the order Pucciniales (rust), preferably of the family Phakopsoraceae, more preferably against fungal pathogens of the genus Phakopsora, most preferably against Phakopsora pachyrhizi and Phakopsora meibomiae, also known as soybean rust. Preferably, resistance against Phakopsora pachyrhizi and/or Phakopsora meibomiae is increased.

[0269] Preferably, the transgenic plant of the present invention or the plant obtained by the method of the present invention is a corn plant and has increased resistance against fungal pathogens of the family Nectriaceae, in particular the genus Fusarium, in particular the species Fusarium graminearum, Fusarium sporotrichioides, Fusarium pseudograminearum, Fusarium culmorum, Fusarium poae, Fusarium verticillioides (Fusarium moniliforme), Fusarium subglutinans, Fusarium proliferatum, Fusarium fujikuroi, Fusarium avenaceum, Fusarium oxysporum, Fusarium virguliforme and/or Fusarium solani. Most preferred is Fusarium graminearum and/or Fusarium verticillioides.

Methods for the Production of Transgenic Plants

[0270] One embodiment according to the present invention provides a method for the production of a transgenic plant, transgenic plant part, or transgenic plant cell having increased fungal resistance, comprising introducing

[0271] a) an exogenous nucleic acid encoding F6H1 protein, wherein said F6H1 protein is encoded by a nucleic acid as defined above operably linked with a promoter and a transcription termination sequence, and further optionally introducing one or more nucleic acids selected from the group consisting of

[0272] b) exogenous nucleic acids encoding CCoAOMT1 protein as defined above operably linked with a promoter and a transcription termination sequence,

[0273] c) exogenous nucleic acids encoding ABCG37 protein as defined above operably linked with a promoter and a transcription termination sequence, and

[0274] d) exogenous nucleic acids encoding UGT71C1 protein as defined above operably linked with a promoter and a transcription termination sequence into a plant, a plant part, or a plant cell, wherein the exogenous nucleic acids encoding F6H1, CCoAOMT1, ABCG37 and/or UGT71C1 protein are located on the same or different vector constructs, generating a transgenic plant, transgenic plant part, or transgenic plant cell from the plant, plant part or plant cell; and expressing the protein(s) encoded by the recombinant vector construct(s).

[0275] In one embodiment, the present invention refers to a method for the production of a transgenic plant, transgenic plant part, or transgenic plant cell having increased fungal resistance, comprising

[0276] (a) introducing a recombinant vector construct according to the present invention into a plant, a plant part or a plant cell and

[0277] (b) generating a transgenic plant from the plant, plant part or plant cell and optionally

[0278] (c) expressing the F6H1 protein and one or more proteins selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 protein(s).

[0279] Preferably, said introducing and expressing does not comprise an essentially biological process.

[0280] Preferably, the method for the production of the transgenic plant, transgenic plant part, or transgenic plant cell further comprises the step of selecting a transgenic plant expressing F6H1 protein and one or more proteins selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 protein(s).

[0281] Preferably, the method for the production of the transgenic plant, transgenic plant part, or transgenic plant cell additionally comprises the step of harvesting the seeds of the transgenic plant and planting the seeds and growing the seeds to plants, wherein the grown plant(s) comprises a nucleic acid encoding F6H1 protein and one or more nucleic acids encoding proteins selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 protein(s) operably linked with a promoter and a transcription termination sequence.

[0282] Preferably, the step of harvesting the seeds of the transgenic plant and planting the seeds and growing the seeds to plants is repeated more than one time, preferably, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 times.

[0283] The transgenic plants may be selected by known methods as described above (e.g., by screening for the presence of one or more markers which are encoded by plant-expressible genes co-transferred with the F6H1, CCoAOMT1, ABCG37 and/or UGT71C1 gene(s) or by directly screening for the F6H1, CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid(s)).

[0284] Furthermore, the use of the exogenous F6H1 nucleic acid optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid(s) or use of the recombinant vector construct comprising the F6H1 nucleic acid optionally in combination with one or more nucleic acid(s) selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid(s) for the transformation of a plant, plant part, or plant cell to provide a fungal resistant plant, plant part, or plant cell is provided.

Harvestable Parts and Products

[0285] Harvestable parts of the transgenic plant according to the present invention are part of the invention. Preferably, the harvestable parts comprise the F6H1 nucleic acid optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid(s) or F6H1 protein optionally in combination with one or more protein(s) selected from the group consisting of CCoAOMT1, ABCG37 and UGT71C1 protein(s). The harvestable parts may be seeds, roots, leaves and/or flowers. Preferred parts of soy plants are soy beans. Preferred parts of corn plants are corn grains.

[0286] Products derived from a transgenic plant according to the present invention, parts thereof or harvestable parts thereof are part of the invention. A preferred product is oil, preferably, corn oil or soybean oil.

[0287] Preferred parts of soy plants are soy beans comprising the F6H1 nucleic acid optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid(s) or F6H1 protein optionally in combination with one or more protein(s) selected from the group consisting of CCoAOMT1, ABCG37 and UGT71C1 protein(s).

[0288] Preferred parts of corn plants are corn grains comprising the F6H1 nucleic acid optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid(s) or F6H1 protein optionally in combination with one or more protein(s) selected from the group consisting of CCoAOMT1, ABCG37 and UGT71C1 protein(s).

[0289] In a preferred embodiment a product is derived from the plant described above or from the harvestable part of the plant described above, wherein the product is preferably soybean oil and/or corn oil.

[0290] Preferably the soybean oil comprises the F6H1 nucleic acid optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid(s) or F6H1 protein optionally in combination with one or more protein(s) selected from the group consisting of CCoAOMT1, ABCG37 and UGT71C1 protein(s).

[0291] Preferably the corn oil comprises the F6H1 nucleic acid optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid(s) or F6H1 protein optionally in combination with one or more protein(s) selected from the group consisting of CCoAOMT1, ABCG37 and UGT71C1 protein(s).

Methods for Manufacturing a Product

[0292] In one embodiment the method for the production of a product comprises

[0293] a) growing the plants of the invention or obtainable by the methods of the invention and

[0294] b) producing said product from or by the plants of the invention and/or parts, e.g. seeds, of these plants.

[0295] In a further embodiment the method comprises the steps a) growing the plants of the invention, b) removing the harvestable parts as defined above from the plants and c) producing said product from or by the harvestable parts of the invention.

[0296] Preferably the product obtained by said method comprises exogenous nucleic acid(s) and/or protein(s) according to the invention.

[0297] In a further embodiment, the method for the production of a product comprises

[0298] a) growing a plant according to the invention or obtainable by the method according to the invention and

[0299] b) producing said product from or by the plant and/or part, preferably seeds, of the plant, wherein the product comprises the F6H1 nucleic acid optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid(s) or the proteins encoded by said nucleic acids.

[0300] The product may be produced at the site where the plant has been grown, or the plants and/or parts thereof may be removed from the site where the plants have been grown to produce the product. Typically, the plant is grown, the desired harvestable parts are removed from the plant, if feasible in repeated cycles, and the product is made from the harvestable parts of the plant. The step of growing the plant may be performed only once each time the method of the invention is performed, while the steps of product production may be repeated, e.g., by repeated removal of harvestable parts of the plants of the invention and, if necessary, further processing of these parts to arrive at the product. It is also possible that the step of growing the plants of the invention is repeated and plants or harvestable parts are stored until the production of the product is then performed once for the accumulated plants or plant parts. Also, the steps of growing the plants and producing the product may be performed with an overlap in time, even simultaneously to a large extent, or sequentially. Generally, the plants are grown for some time before the product is produced.

[0301] In one embodiment the products produced by said methods of the invention are plant products such as, but not limited to, a foodstuff, feedstuff, a food supplement, feed supplement, fiber, cosmetic and/or pharmaceutical. Foodstuffs are regarded as compositions used for nutrition and/or for supplementing nutrition. Animal feedstuffs and animal feed supplements, in particular, are regarded as foodstuffs.

[0302] In another embodiment the inventive methods for the production are used to make agricultural products such as, but not limited to, plant extracts, proteins, amino acids, carbohydrates, fats, oils, polymers, vitamins, and the like.

[0303] It is possible that a plant product consists of one or more agricultural products to a large extent.

Methods for Breeding/Methods for Plant Improvement/Methods for Plant Variety Production

[0304] The transgenic plants of the invention may be crossed with similar transgenic plants or with transgenic plants lacking the nucleic acids of the invention or with non-transgenic plants, using known methods of plant breeding, to prepare seeds. Further, the transgenic plant cells or plants of the present invention may comprise, and/or be crossed to another transgenic plant that comprises, one or more exogenous nucleic acids, thus creating a "stack" of transgenes in the plant and/or its progeny. The seed is then planted to obtain a crossed fertile transgenic plant comprising the F6H1 nucleic acid optionally in combination with nucleic acids selected from the group consisting of CCoAOMT1, ABCG37 and UGT71C1 nucleic acid(s). The crossed fertile transgenic plant may have the particular expression cassette inherited through a female parent or through a male parent. The second plant may be an inbred plant. The crossed fertile transgenic plant may be a hybrid. Also included within the present invention are seeds of any of these crossed fertile transgenic plants. The seeds of this invention can be harvested from fertile transgenic plants and be used to grow progeny generations of transformed plants of this invention including hybrid plant lines comprising the exogenous nucleic acid.

[0305] Thus, one embodiment of the present invention is a method for breeding a fungal resistant plant comprising the steps of

[0306] (a) crossing a transgenic plant described herein or a plant obtainable by a method described herein with a second plant;

[0307] (b) obtaining a seed or seeds resulting from the crossing step described in (a);

[0308] (c) planting said seed or seeds and growing the seed or seeds to plants; and

[0309] (d) selecting from said plants the plants expressing a F6H1 protein optionally in combination with one or more proteins selected from the group consisting of CCoAOMT1, ABCG37 and UGT71C1 protein(s).

[0310] Another preferred embodiment is a method for plant improvement comprising

[0311] (a) obtaining a transgenic plant by any of the methods of the present invention;

[0312] (b) combining within one plant cell the genetic material of at least one plant cell of the plant of (a) with the genetic material of at least one cell differing in one or more genes from the plant cells of the plants of (a) or crossing the transgenic plant of (a) with a second plant;

[0313] (c) obtaining seed from at least one plant generated from the one plant cell of (b) or the plant of the cross of step (b);

[0314] (d) planting said seeds and growing the seeds to plants; and

[0315] (e) selecting from said plants, plants expressing the nucleic acid encoding F6H1 protein optionally in combination with one or more protein(s) selected from the group consisting of CCoAOMT1, ABCG37 and UGT71C1 protein(s); and optionally

[0316] (f) producing propagation material from the plants expressing the nucleic acid encoding F6H1 protein optionally in combination with one or more protein(s) selected from the group consisting of CCoAOMT1, ABCG37 and UGT71C1 protein(s).

[0317] The transgenic plants may be selected by known methods as described above (e.g., by screening for the presence of one or more markers which are encoded by plant-expressible genes co-transferred with the F6H1, CCoAOMT1, ABCG37 and/or UGT71C1 gene or screening for the F6H1, CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid itself).

[0318] According to the present invention, the introduced F6H1 nucleic acid, optionally in combination with one or more nucleic acids selected from the group consisting of CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid, may be maintained in the plant cell stably if it is incorporated into a non-chromosomal autonomous replicon or integrated into the plant chromosomes. Whether present in an extra-chromosomal non-replicating or replicating vector construct or a vector construct that is integrated into a chromosome, the exogenous F6H1, CCoAOMT1, ABCG37 and/or UGT71C1 nucleic acid preferably resides in one or more plant expression cassettes. A plant expression cassette preferably contains regulatory sequences capable of driving gene expression in plant cells that are functionally linked so that each sequence can fulfill its function, for example, termination of transcription by polyadenylation signals. Preferred polyadenylation signals are those originating from Agrobacterium tumefaciens T-DNA such as the gene 3 known as octopine synthase of the Ti-plasmid pTiACH5 (Gielen et al., 1984, EMBO J. 3:835) or functional equivalents thereof, but also all other terminators functionally active in plants are suitable. As plant gene expression is very often not limited to the transcriptional level, a plant expression cassette preferably contains other functionally linked sequences like translational enhancers such as the overdrive sequence containing the 5'-untranslated leader sequence from tobacco mosaic virus, which increases the polypeptide per RNA ratio (Gallie et al., 1987, Nucl. Acids Research 15:8693-8711). Examples of plant expression vectors include those detailed in: Becker, D. et al., 1992, New plant binary vectors with selectable markers located proximal to the left border, Plant Mol. Biol. 20:1195-1197; Bevan, M. W., 1984, Binary Agrobacterium vectors for plant transformation, Nucl. Acid. Res. 12:8711-8721; and Vectors for Gene Transfer in Higher Plants; in: Transgenic Plants, Vol. 1, Engineering and Utilization, eds.: Kung and R. Wu, Academic Press, 1993, pp. 15-38.

[0319] A preferred method according to the invention is a method for applying scopoletin and/or a derivative thereof to a surface of a plant, plant part or plant cell, wherein the resistance to a fungal pathogen of the plant, plant part or plant cell is increased by applying scopoletin and/or a derivative thereof to the surface of the plant, plant part or plant cell in comparison to a plant, plant part or plant cell to whose surface scopoletin and/or a derivative thereof has not been applied, wherein the plant is soy and/or corn.

[0320] In one embodiment according to the invention a plant surface or plant part surface is coated with scopoletin and/or a derivative thereof, wherein the plant is soy and/or corn.

[0321] In one embodiment according to the invention a plant, plant part or plant cell has a surface coated with scopoletin and/or a derivative thereof, wherein the plant is soy and/or corn.

EXAMPLES

[0322] The following examples are not intended to limit the scope of the claims to the invention, but are rather intended to be exemplary of certain embodiments. Any variations in the exemplified methods that occur to the skilled artisan are intended to fall within the scope of the present invention.

Example 1: General Methods

[0323] The chemical synthesis of oligonucleotides can be effected, for example, in the known fashion using the phosphoramidite method (Voet, Voet, 2nd Edition, Wiley Press New York, pages 896-897). The cloning steps carried out for the purposes of the present invention such as, for example, restriction cleavages, agarose gel electrophoresis, purification of DNA fragments, transfer of nucleic acids to nitrocellulose and nylon membranes, linking DNA fragments, transformation of E. coli cells, bacterial cultures, phage multiplication and sequence analysis of recombinant DNA, are carried out as described by Sambrook et al. Cold Spring Harbor Laboratory Press (1989), ISBN 0-87969-309-6. The sequencing of recombinant DNA molecules is carried out with an MWG-Licor laser fluorescence DNA sequencer following the method of Sanger (Sanger et al., Proc. Natl. Acad. Sci. USA 74, 5463 (1977)).

Example 2: Cloning of Overexpression Vector Constructs for Transient N. benthamiana Transformation

[0324] To obtain cDNA, RNA was extracted from leaf tissue of Arabidopsis thaliana pen2 mutants that had been inoculated with P. pachyrhizi two days before harvest. cDNA was produced using RevertAid H minus reverse transcriptase (Thermo Scientific). All steps of cDNA preparation and purification were performed as described in the manual.

[0325] The SEQ ID NO: 1 sequence (F6H1) was amplified from the cDNA by PCR as described in the protocol of the Phusion High-Fidelity hot-start DNA polymerase (Thermo Scientific) or of the Pfu Ultra, Pfu Turbo or Herculase DNA polymerase (Stratagene). The composition of the PCR reaction was as follows: 1× PCR buffer, 0.2 mM of each dNTP, 100 ng cDNA of Arabidopsis thaliana (var. Columbia-0), 50 pmol forward primer, 50 pmol reverse primer, and 1 U Phusion hot-start, Pfu Ultra, Pfu Turbo or Herculase DNA polymerase.

[0326] The amplification cycles were as follows:

[0327] 1 cycle of 30 seconds at 98°C, followed by 35 cycles of in each case 10 seconds at 98°C, 30 seconds at 62°C and 40 seconds at 72°C, followed by 1 cycle of 10 minutes at 72°C, then 4°C.
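For orientation, the cycling program above can be written down as data and its programmed hold time summed. The sketch below is illustrative only: the variable and function names are hypothetical, ramp times between temperatures are ignored, and the final 4°C hold is treated as open-ended and left out.

```python
# Each step: (label, duration in seconds, temperature in °C), taken from the text.
INITIAL = [("initial denaturation", 30, 98)]
CYCLE = [("denaturation", 10, 98), ("annealing", 30, 62), ("extension", 40, 72)]
FINAL = [("final extension", 600, 72)]
N_CYCLES = 35


def hold_seconds(steps):
    """Sum the programmed durations of a list of steps."""
    return sum(seconds for _, seconds, _ in steps)


def program_seconds(initial, cycle, n_cycles, final):
    """Total programmed hold time; instrument ramp times are not modelled."""
    return hold_seconds(initial) + n_cycles * hold_seconds(cycle) + hold_seconds(final)


total = program_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total} s (about {total / 60:.0f} min) before the 4°C hold")
```

The actual run time on a given thermocycler would be longer, since heating and cooling between steps is not included.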

[0328] The following primer sequences were used to specifically amplify the F6H1 full-length ORF for cloning purposes:

TABLE-US-00004
i) F6H1_attB1 forward primer (SEQ ID NO: 76): 5'-GGGGACAAGTTTGTACAAAAAAGCAGGCTTAATGGCTCCAACACTCTTGAC-3'
ii) F6H1_attB2 reverse primer (SEQ ID NO: 77): 5'-GGGGACCACTTTGTACAAGAAAGCTGGGTATCAGATCTTGGCGTAATCG-3'
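The two primers above each consist of an attB extension followed by a template-specific portion. The following sketch separates the two parts; it assumes that the first 29 nucleotides of each primer are the commonly used Gateway attB1 and attB2 extensions, and the helper names `strip_extension` and `reverse_complement` are hypothetical.

```python
# Assumed attB extensions (the shared 5' portions of the primers listed above).
ATTB1 = "GGGGACAAGTTTGTACAAAAAAGCAGGCT"
ATTB2 = "GGGGACCACTTTGTACAAGAAAGCTGGGT"

FWD = "GGGGACAAGTTTGTACAAAAAAGCAGGCTTAATGGCTCCAACACTCTTGAC"  # SEQ ID NO: 76
REV = "GGGGACCACTTTGTACAAGAAAGCTGGGTATCAGATCTTGGCGTAATCG"    # SEQ ID NO: 77


def strip_extension(primer, extension):
    """Return the part of the primer that follows the attB extension."""
    if not primer.startswith(extension):
        raise ValueError("primer does not start with the expected extension")
    return primer[len(extension):]


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


fwd_tail = strip_extension(FWD, ATTB1)
rev_tail_coding = reverse_complement(strip_extension(REV, ATTB2))

print("forward, after attB1:", fwd_tail)
print("reverse, after attB2 (coding-strand orientation):", rev_tail_coding)
```

Read this way, the forward primer continues with an ATG shortly after the attB1 extension, and the reverse primer, viewed on the coding strand, carries a TGA triplet near its 3' end, which is consistent with amplification of a full-length ORF.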

[0329] The amplified fragments were gel purified and cloned into the pDONR 207 entry vector (Invitrogen) using Gateway® cloning according to the manufacturer's instructions. Using this cloning technique, the full-length F6H1 fragment is inserted in sense direction between the attL1 and attL2 recombination sites of the entry vector. To prepare an untagged F6H1 overexpression construct, an LR reaction (Gateway system, Invitrogen, Life Technologies, Carlsbad, Calif., USA) was performed according to the manufacturer's protocol using a pDONR207 vector containing the F6H1 fragment. As target, a binary pB2GW7 vector (Ghent University, Belgium) was used, which is composed of: (1) a spectinomycin resistance cassette for bacterial selection; (2) a pVS1 origin for replication in Agrobacteria; (3) a pBR322 origin of replication for stable maintenance in E. coli; and (4) between the right and left border, a bar selection gene under control of a pNos promoter. The recombination reaction was transformed into competent E. coli (DH5alpha), mini-prepped and screened by specific restriction digestions. A positive clone of each vector construct was sequenced and submitted to soy transformation (see FIG. 2a).

[0330] To prepare a FLAG-tagged F6H1 overexpression construct, the amplified fragments were gel purified and MultiSite Gateway® cloning was applied according to the manufacturer's manual. First, the Ω-FLAG sequence was PCR amplified from the vector pTA7002 (Shuqun Zhang, Columbia University, Missouri, USA) harboring the Arabidopsis thaliana MKK4 gene 5' flanked by a tobacco mosaic virus Ω translational enhancer and a FLAG tag sequence, using the following primers (the primers carry 5' attB extensions):

TABLE-US-00005
(i) Ω-FLAG-attB1 forward primer (SEQ ID NO: 78): GGGGACAAGTTTGTACAAAAAAGCAGGCTCATATTTTTACAACAATTACCAACAACA
(ii) Ω-FLAG-attB5r reverse primer (SEQ ID NO: 79): GGGGACAACTTTTGTATACAAAGTTGTCTTGTCATCGTCGTCCTTGT

[0331] Second, the F6H1 full-length coding sequence was PCR amplified from pen2 cDNA prepared as described above. The following primer sequences carrying 5' attB5 and attB2 extensions were used for F6H1 amplification by PCR:

TABLE-US-00006
(i) F6H1-attB5 forward primer (SEQ ID NO: 80): GGGGACAACTTTGTATACAAAAGTTGCAATGGCTCCAACACTCTTGAC
(ii) F6H1-attB2 reverse primer (SEQ ID NO: 77, see above): GGGGACCACTTTGTACAAGAAAGCTGGGTATCAGATCTTGGCGTAATCG

[0332] PCR products were gel purified and the attB1-Ω-FLAG-attB5r sequence was introduced into pDONR221 P1-P5r via Gateway cloning (BP reaction). Analogously, the attB5-F6H1-attB2 sequence was cloned into pDONR 221 P5-P2. Recombination reactions were transformed into competent E. coli (DH5alpha). Following plasmid extraction, both vectors were used for LR recombination with the pB2GW7 destination vector. The resulting expression clone containing both sequences (Ω-FLAG and F6H1) was screened by specific restriction digestions and sequenced prior to transformation (FIG. 2b).

Example 3a: Transient Transformation of N. benthamiana Leaves

[0333] Transient transformation of N. benthamiana leaves was done according to a slightly modified protocol from Popescu et al. 2007 (Popescu, S. C., Popescu, G. V., Bachan, S., Zhang, Z., Seay, M., Gerstein, M., Snyder, M., and Dinesh-Kumar, S. P. (2007). Differential binding of calmodulin-related proteins to their targets revealed through high-density Arabidopsis protein microarrays. Proc Natl Acad Sci USA 104, 4730-4735). A single Agrobacterium (strain AGL01) carrying a DNA construct of interest (see FIGS. 2a and 2b) was cultured in YEB medium with appropriate antibiotics for 14-16 h at 28°C. Cells were harvested by centrifugation (5000 rpm, 10 min), resuspended to an OD of 0.4-0.8 in buffer containing 10 mM MgCl2, 10 mM MES pH 5.6 and 150 µM acetosyringone, and incubated for 2-5 h at room temperature. Agrobacteria transformed with the DNA construct of interest were then mixed with an equal volume of Agrobacteria containing the p19 silencing suppressor gene from tomato bushy stunt virus (TBSV), and the 1:1 mixtures were syringe-infiltrated into leaves of 6-week-old N. benthamiana plants. Three days after Agrobacterium infiltration, leaves were frozen in liquid nitrogen and stored at -80°C until analysis.
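The resuspension to a defined OD can be planned with a simple C1·V1 = C2·V2 calculation. The sketch below is a hypothetical worked example: it assumes that the pellet is first resuspended in a small volume of infiltration buffer, that the OD of this stock is measured, and that OD scales linearly with dilution. The stock OD of 2.4 and the 10 ml final volume are invented for illustration; only the 0.4-0.8 target range comes from the text.

```python
def dilution_volumes(od_stock, od_target, final_ml):
    """Return (ml of stock, ml of buffer) giving final_ml at od_target.

    Assumes OD is proportional to cell density over the range used.
    """
    if not 0 < od_target <= od_stock:
        raise ValueError("target OD must be positive and not exceed the stock OD")
    stock_ml = final_ml * od_target / od_stock
    return stock_ml, final_ml - stock_ml


# Hypothetical numbers: stock measured at OD 2.4, target OD 0.6, 10 ml needed.
stock_ml, buffer_ml = dilution_volumes(od_stock=2.4, od_target=0.6, final_ml=10.0)
print(f"{stock_ml:.2f} ml stock + {buffer_ml:.2f} ml buffer")

# After mixing 1:1 with the p19 strain, each strain is at half its starting OD.
od_each_after_mix = 0.6 / 2
```

Note that the 1:1 mixing step halves the density of each strain, so under these assumptions a suspension adjusted to OD 0.6 contributes an OD of about 0.3 to the infiltration mixture.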

Example 3b: Scopoletin Extraction and HPLC Based Analytics (FIGS. 12a and 12b)

[0334] Plant material was ground in liquid N2 and extracted for 24 h with 90% (v/v) methanol (1 ml per 0.5 g fresh material) supplemented with 4-methylumbelliferone as an internal standard. Extracts were centrifuged for 10 min at 15,000 g. The supernatants were concentrated in a speed vac and the dried residue was redissolved in 150 µl 100% methanol. Samples (20 µl injection volume) were subsequently subjected to reverse-phase high-performance liquid chromatography (HPLC) analysis on a Nucleosil C18 column (EC 150/4.6 Nucleosil 100-5 C18; Macherey-Nagel) with a gradient mobile phase built with 1% (v/v) formic acid in water (A) and 1% (v/v) formic acid in methanol (B), and a flow rate of 1.0 ml/min at RT. The gradient program started at 15% B for 2 min, then increased linearly to 21.5% B over 18 min, followed by a linear increase to 55% B between 20 and 40 min. The gradient then increased to 95% B within 5 min. This proportion was maintained for 10 min and then returned to initial conditions in 5 min. Scopoletin was detected with a fluorescence detector with an excitation wavelength of 345 nm and an emission wavelength of 460 nm and identified by comparison with the pure reference compound (Scopoletin, SIGMA-ALDRICH).
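The gradient program described above can be written down as a set of breakpoints. The following is only a minimal sketch, assuming that the ramp to 21.5% B ends at 20 min and that the later steps follow back to back (40-45, 45-55 and 55-60 min); the names GRADIENT and percent_b are illustrative and not part of the original method.

```python
from bisect import bisect_right

# Breakpoints (time in min, % solvent B) read from the gradient description.
# Assumption: the steps run back to back, giving a 60 min program in total.
GRADIENT = [
    (0, 15.0), (2, 15.0), (20, 21.5), (40, 55.0),
    (45, 95.0), (55, 95.0), (60, 15.0),
]


def percent_b(t_min: float) -> float:
    """Linearly interpolate the percentage of solvent B at time t_min."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the gradient program")
    times = [t for t, _ in GRADIENT]
    # Index of the breakpoint that closes the segment containing t_min.
    i = min(bisect_right(times, t_min), len(GRADIENT) - 1)
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

Calling percent_b at an analyte's retention time gives the approximate solvent composition at elution under these assumptions.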

Example 4: Determining Abundance of Gene Transcripts

[0335] Total RNA was extracted from leaves of the described Arabidopsis mutants as described by Chomczynski and Sacchi (1987). 1 µg RNA was transcribed to cDNA using random primers (9-mers) and RevertAid™ reverse transcriptase (Fermentas) according to the manufacturer's instructions. Accumulation of gene transcripts was quantified in an ABI7300 using SYBR green (Invitrogen) at the following conditions for RT-qPCR: 50 °C for 2 min, 95 °C for 10 min, 95 °C for 15 s, 60 °C for 1 min, 95 °C for 15 s, 60 °C for 1 min, and 95 °C for 15 s (the third and fourth steps were repeated 40 times).
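The cycling conditions listed above can be captured as a small data structure, for example to estimate the run time. This is a hedged sketch: it assumes the trailing 95/60/95 °C steps form a single dissociation pass, and it ignores ramp times between temperatures, which depend on the instrument. PROGRAM and total_hold_seconds are illustrative names.

```python
# Each stage: (list of (temperature in deg C, hold time in s), number of repeats).
# Assumption: the final 95/60/95 steps run once; ramp times are not included.
PROGRAM = [
    ([(50, 120)], 1),
    ([(95, 600)], 1),
    ([(95, 15), (60, 60)], 40),            # third and fourth steps, 40 cycles
    ([(95, 15), (60, 60), (95, 15)], 1),
]


def total_hold_seconds(program) -> int:
    """Sum all hold times, multiplying each stage by its repeat count."""
    return sum(repeats * sum(hold for _, hold in steps)
               for steps, repeats in program)


if __name__ == "__main__":
    print(total_hold_seconds(PROGRAM) / 60, "min of hold time")
```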

TABLE-US-00007
Primers specifically hybridizing to the F6H1 gene (SEQ ID No 1):
F6H1_RT_F (SEQ ID NO: 81): 5'-CTCAGCCTCTTCTTTGTCTC-3'
F6H1_RT_R (SEQ ID NO: 82): 5'-AAGCCTCCTCACCATCTTC-3'
Primers specifically hybridizing to CCoAOMT1 (SEQ ID No 3):
CCoAOMT1_RT_F (SEQ ID NO: 83): 5'-ATGGCGACGACAACAACAGAAGC-3'
CCoAOMT1_RT_R (SEQ ID NO: 84): 5'-GCCAATCACTCCTCCAATTTTCACA-3'
Primers specifically hybridizing to ABCG37 (SEQ ID No 5):
ABCG37_RT_F (SEQ ID NO: 85): 5'-GATCGACTCTCCTTGATGATGGCGA-3'
ABCG37_RT_R (SEQ ID NO: 86): 5'-CGCACTCGGCCACCACTTTTAAACT-3'
Primers specifically hybridizing to UGT71C1 (SEQ ID No 7):
UGT71C1_RT_F (SEQ ID NO: 87): 5'-CTCGCAACAATCGAACTCGCCAAA-3'
UGT71C1_RT_R (SEQ ID NO: 88): 5'-TCGGCAAATTCCACAAAGAGTTCCA-3'

[0336] All primers were designed according to standard criteria (Udvardi et al., 2008), with an off-target search using the Primer-BLAST tool at NCBI (http://www.ncbi.nlm.nih.gov/tools/primer-blast/). Expression of the genes was normalized to Actin2. Data were analyzed using the ABI 7300 software and the expression relative to actin was calculated according to Livak and Schmittgen (2001) as $2^{-(Ct_{F6H1} - Ct_{Actin2})}$.
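The normalization to Actin2 can be illustrated with a short calculation. This is a sketch of the formula as stated, not the ABI 7300 software's implementation; the function name and the Ct values in the usage example are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of a target gene relative to a reference gene.

    Implements 2^-(Ct_target - Ct_reference), which assumes that the
    amplicon doubles in every cycle (100% amplification efficiency).
    """
    return 2.0 ** -(ct_target - ct_reference)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses the threshold 4 cycles
    # later than Actin2, i.e. it is 16-fold less abundant.
    print(relative_expression(ct_target=24.0, ct_reference=20.0))
```

The same function applies to each of the four target genes by passing the respective Ct value together with the Actin2 Ct from the same sample.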

Example 5: In Vitro Germination Tests

Example 5a Growth Inhibition of Phakopsora pachyrhizi

[0337] Spores of Phakopsora pachyrhizi were resuspended in H2O supplemented with Tween-20 and 10 µM, 100 µM, 500 µM or 1 mM scopoletin. Spores of Phakopsora pachyrhizi resuspended in H2O supplemented with Tween-20 were used as control. All resuspended spores were transferred onto glass slides. After six hours of incubation the ASR spores had germinated and started to form appressoria.

[0338] The germination rate and the appressoria formation rate were determined by quantitative microscopic analysis. Spores showing visible germ tube formation but no thickening of the germ tube tip were counted as "germinated", whereas the presence of a thickened germ tube tip indicated the formation of an appressorium (FIG. 14a).

[0339] Application of scopoletin to ASR spores decreases germination and appressoria formation in a dose-dependent manner. At 1 mM concentration scopoletin completely abolishes spore germination in vitro.
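The scoring described in this example amounts to assigning each spore to one of three mutually exclusive categories and reporting percentages. The sketch below is illustrative only; the category labels, the function name and the counts are assumptions, not data from the experiments.

```python
from collections import Counter

# Mutually exclusive categories, following the scoring rule in the text:
# a germ tube without a thickened tip counts as "germinated", a thickened
# tip counts as "appressorium".
CATEGORIES = ("ungerminated", "germinated", "appressorium")


def category_rates(observations):
    """Return the percentage of spores in each category."""
    counts = Counter(observations)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no spores were scored")
    return {c: 100.0 * counts[c] / total for c in CATEGORIES}


if __name__ == "__main__":
    # Hypothetical counts for one slide.
    scored = ["ungerminated"] * 50 + ["germinated"] * 30 + ["appressorium"] * 20
    print(category_rates(scored))
```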

Example 5b Growth Inhibition of Fusarium graminearum

[0340] Agar plugs (1 cm²) from 7-day-old F. graminearum cultures grown on potato dextrose agar (PDA) at 24 °C were placed on fresh PDA plates supplemented with 1 mM scopoletin in methanol or equal volumes of methanol lacking scopoletin as control. Fungal spores were stained by spraying Uvitex 2B solution (0.1% Uvitex 2B (Polyscience, Warrington, UK) dissolved in 0.1 M Tris-HCl buffer, pH 8.5). Fungal growth was measured daily using a fluorescence microscope to determine the average growth rate of the fungus. 100 spores were counted per sample (see FIG. 15b).

Example 6: In Vivo Spore Germination Tests

6.1 Arabidopsis

[0341] Arabidopsis seeds were sown on soil (type VM, Einheitserde Werkverband) and stratified at 4 °C for two days. Plants were grown at short day conditions (in a chamber at 8 h photoperiod, 120 µmol m^-2 s^-1 photon irradiance), 22 °C, and 65% humidity. Five to six-week-old plants were inoculated with P. pachyrhizi as described below.

[0342] For pre-treatment experiments Arabidopsis plants were sprayed with 1 mM scopoletin (dissolved in H2O, 0.01% Tween-20), incubated for 6 h at short day conditions and subsequently inoculated with 1 mg/ml P. pachyrhizi uredospores. For co-treatment experiments spores of Phakopsora pachyrhizi were suspended in 0.01% Tween-20 supplemented with 1 mM scopoletin.

[0343] Following inoculation plants were covered with moistened plastic domes to ensure high humidity and incubated at short day conditions (see above). 24 h later plastic domes were removed and plants incubated at the same conditions for another 24 h. Leaves were harvested 2 dpi and destained on tissue soaked with a saturated (2.5 g/ml) chloral hydrate solution. Germination and penetration on destained leaves were determined by quantitative microscopic analysis. Spores showing visible germ tube formation were assigned to the category "germinated". Pretreated as well as co-treated plants showed a drastically reduced proportion of germinated spores, demonstrating the toxic effect of scopoletin against soybean rust fungus (FIG. 13). We never observed any phytotoxic effect of scopoletin leading to pleiotropic effects in Arabidopsis.

6.2 Soybean

[0344] Soy seeds were sown on soil (type VM, Einheitserde Werkverband) and grown at short day conditions in a chamber (at 8 h photoperiod, 120 µmol m^-2 s^-1 photon irradiance), 22 °C, and 65% humidity. Five to six-week-old plants were inoculated with P. pachyrhizi as described below.

[0345] For co-treatment experiments spores of Phakopsora pachyrhizi were suspended in 0.01% Tween-20 supplemented with 10 µM, 100 µM, 500 µM or 1 mM scopoletin (FIGS. 14b and c).

[0346] For pre-treatment experiments soy plants were sprayed with 1 mM scopoletin (dissolved in H2O, 0.01% Tween-20), incubated for 6 h and subsequently inoculated with 1 mg/ml P. pachyrhizi uredospores (FIG. 14c).

[0347] Following inoculation plants were covered with moistened plastic domes to ensure high humidity and incubated at short day conditions (see above). 24 h later plastic domes were removed and plants incubated at the same conditions for another 11 days. At 12 dpi the diseased leaf area was rated on primary leaves, first and second trifoliate leaves by using the program Assess2.0 (Lobet G., Draye X., Perilleux C. 2013 An online database for plant image analysis software tools, Plant Methods, vol. 9 (38)). The average of the percentage of the leaf area showing fungal colonies or strong yellowing/browning on all leaves is considered as diseased leaf area.
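The diseased leaf area defined above is an average over the rated leaves of a plant. A minimal sketch of that averaging step follows, assuming the per-leaf percentages have already been exported from the image analysis program; the function names and the numbers in the usage example are hypothetical.

```python
from statistics import mean


def diseased_leaf_area(leaf_percentages):
    """Average the per-leaf diseased area (in %) over all rated leaves."""
    values = list(leaf_percentages)
    if not values:
        raise ValueError("at least one rated leaf is required")
    if any(not 0 <= v <= 100 for v in values):
        raise ValueError("percentages must lie between 0 and 100")
    return mean(values)


def relative_reduction(control: float, treated: float) -> float:
    """Reduction of diseased leaf area relative to the control, in %."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (control - treated) / control


if __name__ == "__main__":
    # Hypothetical ratings: primary leaf, first and second trifoliate leaf.
    control = diseased_leaf_area([40.0, 30.0, 20.0])
    treated = diseased_leaf_area([10.0, 5.0, 3.0])
    print(control, treated, relative_reduction(control, treated))
```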

[0348] Pretreated as well as co-treated plants showed a drastically reduced formation of infected leaf area (FIGS. 14b and 14c) showing the potential of scopoletin to inhibit soybean rust disease. Any phytotoxic effect of scopoletin leading to pleiotropic effects in soybean was never observed, so the toxic effects are fungus specific.

Example 7: Cloning of Overexpression Vector Constructs for Stable Soybean Transformation

[0349] The DNA sequence of the F6H1 (AT3G13610, SEQ ID No: 1), CCoAOMT1 (At4g34050, SEQ ID No: 3), ABCG37(PDR9; AT3G53480, SEQ ID No: 5) and UGT71C1 (SEQ ID No: 7) genes mentioned in this application were generated by DNA synthesis (Geneart, Regensburg, Germany).

[0350] The F6H1 DNA (as shown in SEQ ID No: 1) was synthesized in a way that a PacI restriction site is located in front of the start-ATG and an AscI restriction site downstream of the stop-codon. The synthesized DNA was digested using the restriction enzymes PacI and AscI (NEB Biolabs) and ligated in a PacI/AscI digested Gateway pENTRY-C vector (Invitrogen, Life Technologies, Carlsbad, Calif., USA) in a way that the full-length fragment is located in sense direction between the parsley ubiquitin promoter and the Agrobacterium tumefaciens derived octopine synthase terminator (t-OCS). The PcUbi promoter regulates constitutive expression of the ubi4-2 gene (accession number X64345) of Petroselinum crispum (Kawalleck et al. 1993 Plant Molecular Biology 21(4): 673-684).

[0351] To obtain the binary plant transformation vector, a triple LR reaction (Gateway system, Invitrogen, Life Technologies, Carlsbad, Calif., USA) was performed according to manufacturer's protocol by using an empty pENTRY-A vector, an empty pENTRY-B vector, and the PcUbi promoter::F6H1::OCS-terminator in the above described pENTRY-C vector. As target a binary pDEST vector was used which is composed of: (1) a Spectinomycin/Streptomycin resistance cassette for bacterial selection (2) a pVS1 origin for replication in Agrobacteria (3) a ColE1 origin of replication for stable maintenance in E. coli and (4) between the right and left border an AHAS selection under control of a AtAHASL-promoter (see FIG. 2c). The recombination reaction was transformed into E. coli (DH5alpha), mini-prepped and screened by specific restriction digestions. A positive clone from the vector construct (FIG. 2c) was sequenced and submitted for soy transformation.

[0352] To obtain the F6H1-CCoAOMT1 double gene construct (FIG. 3) the CCoAOMT1 DNA (as shown in SEQ ID No: 3) was synthesized in a way that a PacI restriction site is located in front of the start-ATG and an AscI restriction site downstream of the stop-codon. The synthesized DNA was digested using the restriction enzymes PacI and AscI (NEB Biolabs) and ligated in a PacI/AscI digested Gateway pENTRY-B vector (Invitrogen, Life Technologies, Carlsbad, Calif., USA) in a way that the full-length fragment is located in sense direction between the pSuper promoter (Lee et al., 2007 Plant Physiology Vol 145 Issue 4 1294-1300) and the Agrobacterium tumefaciens derived nopaline synthase terminator (t-nos). The Super promoter consists of three identical octopine synthase enhancers followed by a MAS promoter (Lee et al., 2007 Plant Physiology Vol 145 Issue 4 1294-1300).

[0353] To obtain the binary plant transformation vector containing F6H1 and CCoAOMT1, a triple LR reaction (Gateway system, Invitrogen, Life Technologies, Carlsbad, Calif., USA) was performed according to manufacturer's protocol by using an empty pENTRY-A vector, the pSuper promoter::CCoAOMT1::nos-terminator in the above described pENTRY-B vector and the PcUbi promoter::F6H1::OCS-terminator in the above described pENTRY-C vector. As target a binary pDEST vector was used which is composed of: (1) a Spectinomycin/Streptomycin resistance cassette for bacterial selection (2) a pVS1 origin for replication in Agrobacteria (3) a ColE1 origin of replication for stable maintenance in E. coli and (4) between the right and left border an AHAS selection under control of a AtAHASL-promoter (see FIG. 3). The recombination reaction was transformed into E. coli (DH5alpha), mini-prepped and screened by specific restriction digestions. A positive clone from each vector construct was sequenced and submitted for soy transformation.

[0354] To obtain the F6H1-UGT71C1 double gene construct (FIG. 5) the UGT71C1 DNA (as shown in SEQ ID No: 7) was synthesized in a way that a PacI restriction site is located in front of the start-ATG and an AscI restriction site downstream of the stop-codon. The synthesized DNA was digested using the restriction enzymes PacI and AscI (NEB Biolabs) and ligated in a PacI/AscI digested Gateway pENTRY-B vector (Invitrogen, Life Technologies, Carlsbad, Calif., USA) in a way that the full-length fragment is located in sense direction between the pSuper promoter (Lee et al., 2007 Plant Physiology Vol 145 Issue 4 1294-1300) and the Agrobacterium tumefaciens derived nopaline synthase terminator (t-nos). The Super promoter consists of three identical octopine synthase enhancers followed by a MAS promoter (Lee et al., 2007 Plant Physiology Vol 145 Issue 4 1294-1300).

[0355] To obtain the binary plant transformation vector containing F6H1 and UGT71C1, a triple LR reaction (Gateway system, Invitrogen, Life Technologies, Carlsbad, Calif., USA) was performed according to manufacturer's protocol by using an empty pENTRY-A vector, the pSuper promoter::UGT71C1::nos-terminator in the above described pENTRY-B vector and the PcUbi promoter::F6H1::OCS-terminator in the above described pENTRY-C vector. As target a binary pDEST vector was used which is composed of: (1) a Spectinomycin/Streptomycin resistance cassette for bacterial selection (2) a pVS1 origin for replication in Agrobacteria (3) a ColE1 origin of replication for stable maintenance in E. coli and (4) between the right and left border an AHAS selection under control of a AtAHASL-promoter (see FIG. 5). The recombination reaction was transformed into E. coli (DH5alpha), mini-prepped and screened by specific restriction digestions. A positive clone from each vector construct was sequenced and submitted for soy transformation.

[0356] To obtain the F6H1-CCoAOMT1-ABCG37 triple gene construct (FIG. 4) the ABCG37 DNA (as shown in SEQ ID No 5) was synthesized in a way that a PacI restriction site is located in front of the start-ATG and an AscI restriction site downstream of the stop-codon.

[0357] The synthesized DNA was digested using the restriction enzymes PacI and AscI (NEB Biolabs) and ligated in a PacI/AscI digested Gateway pENTRY-A vector (Invitrogen, Life Technologies, Carlsbad, Calif., USA) in a way that the full-length fragment is located in sense direction between the pGlyma14g06680 promoter (see WO 2012/127373) and the Solanum tuberosum cathepsin D inhibitor (Herbers, Karin, Salome Prat, and Lothar Willmitzer. "Functional analysis of a leucine aminopeptidase from Solanum tuberosum L." Planta 194.2 (1994): 230-240.). The pGlyma14g06680 promoter mediates a medium strong constitutive expression in soybean.

[0358] To obtain the binary plant transformation vector containing F6H1, CCoAOMT1 and ABCG37, a triple LR reaction (Gateway system, Invitrogen, Life Technologies, Carlsbad, Calif., USA) was performed according to manufacturer's protocol by using the Glyma14g06680 promoter::ABCG37::cathepsin inhibitor terminator in the pENTRY-A vector, as described above, the pSuper promoter::CCoAOMT1::nos-terminator in the above described pENTRY-B vector and the PcUbi promoter::F6H1::OCS-terminator in the above described pENTRY-C vector.

[0359] As target a binary pDEST vector was used which is composed of: (1) a Spectinomycin/Streptomycin resistance cassette for bacterial selection (2) a pVS1 origin for replication in Agrobacteria (3) a ColE1 origin of replication for stable maintenance in E. coli and (4) between the right and left border an AHAS selection under control of a AtAHASL-promoter (see FIG. 4). The recombination reaction was transformed into E. coli (DH5alpha), mini-prepped and screened by specific restriction digestions. A positive clone from each vector construct was sequenced and submitted for soy transformation.

Example 8: Soy Transformation

[0360] The expression vector constructs (see example 2) are transformed into soy.

8.1 Sterilization and Germination of Soy Seeds

[0361] Virtually any seed of any soy variety can be employed in the method of the invention. A variety of soybean cultivars (including Jack, Williams 82, Jake, Stoddard, CD215 and Resnik) are appropriate for soy transformation. Soy seeds are sterilized in a chamber with chlorine gas produced by adding 3.5 ml 12N HCl dropwise into 100 ml bleach (5.25% sodium hypochlorite) in a desiccator with a tightly fitting lid. After 24 to 48 hours in the chamber, seeds are removed and approximately 18 to 20 seeds are plated on solid GM medium with or without 5 µM 6-benzyl-aminopurine (BAP) in 100 mm Petri dishes. Seedlings grown without BAP are more elongated and develop roots, especially secondary and lateral roots. BAP strengthens the seedling by forming a shorter and stockier seedling.

[0362] Seven-day-old seedlings grown in the light (>100 µEinstein/m²s) at 25 °C are used as explant material for the three explant types. At this time, the seed coat has split, and the epicotyl with the unifoliate leaves has grown to, at minimum, the length of the cotyledons.

[0363] The epicotyl should be at least 0.5 cm to avoid the cotyledonary-node tissue (since soy cultivars and seed lots may vary in the developmental time, a description of the germination stage is more accurate than a specific germination time).

[0364] For inoculation of entire seedlings, see Method A (examples 8.3.1 and 8.3.2); for leaf explants, see Method B (example 8.3.3).

[0365] For method C (see example 8.3.4), the hypocotyl and one and a half or part of both cotyledons are removed from each seedling. The seedlings are then placed on propagation media for 2 to 4 weeks. The seedlings produce several branched shoots to obtain explants from. The majority of the explants originated from the plantlet growing from the apical bud. These explants are preferably used as target tissue.

8.2--Growth and Preparation of Agrobacterium Culture

[0366] Agrobacterium cultures are prepared by streaking Agrobacterium (e.g., A. tumefaciens or A. rhizogenes) carrying the desired binary vector (e.g. H. Klee, R. Horsch and S. Rogers 1987 Agrobacterium-Mediated Plant Transformation and its further Applications to Plant Biology; Annual Review of Plant Physiology Vol. 38: 467-486) onto solid YEP growth medium (YEP media: 10 g yeast extract, 10 g Bacto Peptone, 5 g NaCl; adjust pH to 7.0 and bring final volume to 1 liter with H2O; for YEP agar plates add 20 g agar; autoclave) and incubating at 25 °C until colonies appear (about 2 days). Depending on the selectable marker genes present on the Ti or Ri plasmid, the binary vector, and the bacterial chromosomes, different selection compounds are used for A. tumefaciens and A. rhizogenes selection in the YEP solid and liquid media. Various Agrobacterium strains can be used for the transformation method.

[0367] After approximately two days, a single colony is picked with a sterile toothpick and used to inoculate 50 ml of liquid YEP with antibiotics, which is shaken at 175 rpm (25 °C) until an OD600 between 0.8 and 1.0 is reached (approximately 2 d). Working glycerol stocks (15%) for transformation are prepared and 1 ml of Agrobacterium stock is aliquoted into 1.5 ml Eppendorf tubes and then stored at -80 °C.

[0368] The day before explant inoculation, 200 ml of YEP are inoculated with 5 µl to 3 ml of working Agrobacterium stock in a 500 ml Erlenmeyer flask. The flask is shaken overnight at 25 °C until the OD600 is between 0.8 and 1.0. Before preparing the soy explants, the Agrobacteria are pelleted by centrifugation for 10 min at 5,500×g at 20 °C. The pellet is resuspended in liquid CCM to the desired density (OD600 0.5-0.8) and placed at room temperature at least 30 min before use.
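The resuspension step above can be planned with a simple proportionality calculation. The sketch below assumes that OD600 scales linearly with cell density in this range and that all cells are recovered in the pellet; both are simplifications, and the function name and example values are illustrative.

```python
def resuspension_volume_ml(culture_volume_ml: float,
                           culture_od600: float,
                           target_od600: float) -> float:
    """Volume of liquid CCM needed to bring a pelleted culture to target_od600.

    Assumes OD600 is proportional to cell density and that the pellet
    contains all cells from the culture (C1 * V1 = C2 * V2).
    """
    if min(culture_volume_ml, culture_od600, target_od600) <= 0:
        raise ValueError("all arguments must be positive")
    return culture_volume_ml * culture_od600 / target_od600


if __name__ == "__main__":
    # Hypothetical overnight culture: 200 ml at OD600 0.9, target OD600 0.6.
    print(round(resuspension_volume_ml(200, 0.9, 0.6), 1), "ml CCM")
```

In practice the density would still be checked after resuspension, since pellet losses make the calculated volume an upper estimate.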

8.3--Explant Preparation and Co-Cultivation (Inoculation)

8.3.1 Method A: Explant Preparation on the Day of Transformation.

[0369] Seedlings at this time had elongated epicotyls from at least 0.5 cm but generally between 0.5 and 2 cm. Elongated epicotyls up to 4 cm in length are successfully employed. Explants are then prepared with: i) with or without some roots, ii) with a partial, one or both cotyledons, all preformed leaves are removed including apical meristem, and the node located at the first set of leaves is injured with several cuts using a sharp scalpel.

[0370] This cutting at the node not only induces Agrobacterium infection but also distributes the axillary meristem cells and damaged pre-formed shoots. After wounding and preparation, the explants are set aside in a Petri dish and subsequently co-cultivated with the liquid CCM/Agrobacterium mixture for 30 minutes. The explants are then removed from the liquid medium and plated on top of a sterile filter paper on 15×100 mm Petri plates with solid co-cultivation medium. The wounded target tissues are placed such that they are in direct contact with the medium.

8.3.2 Modified Method A: Epicotyl Explant Preparation

[0371] Soy epicotyl segments prepared from 4 to 8 d old seedlings are used as explants for regeneration and transformation. Seeds of soya cv. L00106CN, 93-41131 and Jack are germinated in 1/10 MS salts or a similar composition medium with or without cytokinins for 4 to 8 d. Epicotyl explants are prepared by removing the cotyledonary node and stem node from the stem section. The epicotyl is cut into 2 to 5 segments. Especially preferred are segments attached to the primary or higher node comprising axillary meristematic tissue.

[0372] The explants are used for Agrobacterium infection. Agrobacterium AGL1 harboring a plasmid with the gene of interest (GOI) and the AHAS, bar or dsdA selectable marker gene is cultured in LB medium with appropriate antibiotics overnight, harvested and resuspended in an inoculation medium with acetosyringone. Freshly prepared epicotyl segments are soaked in the Agrobacterium suspension for 30 to 60 min and then the explants are blotted dry on sterile filter papers. The inoculated explants are then cultured for 2 to 4 d on a co-culture medium with L-cysteine and TTD and other chemicals such as acetosyringone for increasing T-DNA delivery. The infected epicotyl explants are then placed on a shoot induction medium with selection agents such as imazapyr (for AHAS gene), glufosinate (for bar gene), or D-serine (for dsdA gene). The regenerated shoots are subcultured on elongation medium with the selective agent.

[0373] For regeneration of transgenic plants the segments are then cultured on a medium with cytokinins such as BAP, TDZ and/or Kinetin for shoot induction. After 4 to 8 weeks, the cultured tissues are transferred to a medium with lower concentration of cytokinin for shoot elongation. Elongated shoots are transferred to a medium with auxin for rooting and plant development. Multiple shoots are regenerated.

[0374] Many stable transformed sectors showing strong cDNA expression are recovered. Soybean plants are regenerated from epicotyl explants. Efficient T-DNA delivery and stable transformed sectors are demonstrated.

8.3.3 Method B: Leaf Explants

[0375] For the preparation of the leaf explant the cotyledon is removed from the hypocotyl. The cotyledons are separated from one another and the epicotyl is removed. The primary leaves, which consist of the lamina, the petiole, and the stipules, are removed from the epicotyl by carefully cutting at the base of the stipules such that the axillary meristems are included on the explant. To wound the explant as well as to stimulate de novo shoot formation, any pre-formed shoots are removed and the area between the stipules is cut with a sharp scalpel 3 to 5 times.

[0376] The explants are either completely immersed or the wounded petiole end is dipped into the Agrobacterium suspension immediately after explant preparation. After inoculation, the explants are blotted onto sterile filter paper to remove excess Agrobacterium culture and placed with the wounded side in contact with a round 7 cm Whatman paper overlaying the solid CCM medium (see above). This filter paper prevents A. tumefaciens overgrowth on the soy explants. Five plates are wrapped with Parafilm™ "M" (American National Can, Chicago, Ill., USA) and incubated for three to five days in the dark or light at 25 °C.

8.3.4 Method C: Propagated Axillary Meristem

[0377] For the preparation of the propagated axillary meristem explant, propagated 3-4 week-old plantlets are used. Axillary meristem explants can be prepared from the first to the fourth node. An average of three to four explants can be obtained from each seedling. The explants are prepared from plantlets by cutting 0.5 to 1.0 cm below the axillary node on the internode and removing the petiole and leaf from the explant. The tip where the axillary meristems lie is cut with a scalpel to induce de novo shoot growth and allow access of target cells to the Agrobacterium. Therefore, a 0.5 cm explant includes the stem and a bud.

[0378] Once cut, the explants are immediately placed in the Agrobacterium suspension for 20 to 30 minutes. After inoculation, the explants are blotted onto sterile filter paper to remove excess Agrobacterium culture then placed almost completely immersed in solid CCM or on top of a round 7 cm filter paper overlaying the solid CCM, depending on the Agrobacterium strain. This filter paper prevents Agrobacterium overgrowth on the soy explants. Plates are wrapped with Parafilm™ "M" (American National Can, Chicago, Ill., USA) and incubated for two to three days in the dark at 25 °C.

8.4--Shoot Induction

[0379] After 3 to 5 days co-cultivation in the dark at 25 °C, the explants are rinsed in liquid SIM medium (to remove excess Agrobacterium; SIM, see Olhoft et al 2007 A novel Agrobacterium rhizogenes-mediated transformation method of soy using primary-node explants from seedlings In Vitro Cell. Dev. Biol.--Plant (2007) 43:536-549) or Modwash medium (1× B5 major salts, 1× B5 minor salts, 1× MSIII iron, 3% sucrose, 1× B5 vitamins, 30 mM MES, 350 mg/L Timentin, pH 5.6; WO 2005/121345) and blotted dry on sterile filter paper (to prevent damage especially on the lamina) before placing on the solid SIM medium. The approximately 5 explants (Method A) or 10 to 20 explants (Methods B and C) are placed such that the target tissue is in direct contact with the medium. During the first 2 weeks, the explants can be cultured with or without selective medium. Preferably, explants are transferred onto SIM without selection for one week.

[0380] For leaf explants (Method B), the explant should be placed into the medium such that it is perpendicular to the surface of the medium with the petiole embedded into the medium and the lamina out of the medium.

[0381] For propagated axillary meristem (Method C), the explant is placed into the medium such that it is parallel to the surface of the medium (basipetal) with the explant partially embedded into the medium.

[0382] Plates are wrapped with Scotch 394 venting tape (3M, St. Paul, Minn., USA) and placed in a growth chamber for two weeks with a temperature averaging 25 °C under an 18 h light/6 h dark cycle at 70-100 µE/m²s. The explants remain on the SIM medium with or without selection until de novo shoot growth occurs at the target area (e.g., axillary meristems at the first node above the epicotyl). Transfers to fresh medium can occur during this time. Explants are transferred from the SIM with or without selection to SIM with selection after about one week. At this time, there is considerable de novo shoot development at the base of the petiole of the leaf explants in a variety of SIM (Method B), at the primary node for seedling explants (Method A), and at the axillary nodes of propagated explants (Method C).

[0383] Preferably, all shoots formed before transformation are removed up to 2 weeks after co-cultivation to stimulate new growth from the meristems. This helped to reduce chimerism in the primary transformant and increase amplification of transgenic meristematic cells. During this time the explant may or may not be cut into smaller pieces (i.e. detaching the node from the explant by cutting the epicotyl).

8.5--Shoot Elongation

[0384] After 2 to 4 weeks (or until a mass of shoots is formed) on SIM medium (preferably with selection), the explants are transferred to SEM medium (shoot elongation medium, see Olhoft et al 2007 A novel Agrobacterium rhizogenes-mediated transformation method of soy using primary-node explants from seedlings. In Vitro Cell. Dev. Biol.--Plant (2007) 43:536-549) that stimulates shoot elongation of the shoot primordia. This medium may or may not contain a selection compound.

[0385] Every 2 to 3 weeks, the explants are transferred to fresh SEM medium (preferably containing selection) after carefully removing dead tissue. The explants should hold together, not fragment into pieces, and remain reasonably healthy. The explants continue to be transferred until the explant dies or shoots elongate. Elongated shoots >3 cm are removed and placed into RM medium for about 1 week (Methods A and B), or about 2 to 4 weeks depending on the cultivar (Method C), at which time roots begin to form. In the case of explants with roots, they are transferred directly into soil. Rooted shoots are transferred to soil and hardened in a growth chamber for 2 to 3 weeks before transferring to the greenhouse. Regenerated plants obtained using this method are fertile and produce on average 500 seeds per plant.

[0386] After 5 days of co-cultivation with Agrobacterium tumefaciens, transient expression of the gene of interest (GOI) is widespread on the seedling axillary meristem explants, especially in the regions wounded during explant preparation (Method A). Explants are placed into shoot induction medium without selection to see how the primary node responds to shoot induction and regeneration. Thus far, greater than 70% of the explants formed new shoots at this region. Expression of the GOI is stable after 14 days on SIM, implying integration of the T-DNA into the soybean genome. In addition, preliminary experiments result in the formation of cDNA-expressing shoots after 3 weeks on SIM.

[0387] For Method C, the average regeneration time of a soybean plantlet using the propagated axillary meristem protocol is 14 weeks from explant inoculation. Therefore, this method has a quick regeneration time that leads to fertile, healthy soybean plants.

Example 9: Pathogen Assay for Soybean

9.1. Growth of Plants

[0388] 10 T1 soy plants per event are potted and grown for 3-4 weeks in the phytochamber (16 h day and 8 h night rhythm at a temperature of 16 °C and 22 °C and a humidity of 75%) until the first 2 trifoliate leaves are fully expanded.

9.2 Inoculation

[0389] The plants are inoculated with spores of P. pachyrhizi.

[0390] In order to obtain appropriate spore material for the inoculation, soybean leaves which were infected with rust 15-20 days earlier are taken 2-3 days before the inoculation and transferred to agar plates (1% agar in H2O). The leaves are placed with their upper side onto the agar, which allows the fungus to grow through the tissue and to produce very young spores. For the inoculation solution, the spores are knocked off the leaves and are added to a Tween-H2O solution. The counting of spores is performed under a light microscope by means of a Thoma counting chamber. For the inoculation of the plants, the spore suspension is added into a compressed-air operated spray flask and applied uniformly onto the plants or the leaves until the leaf surface is well moisturized. For macroscopic assays a spore density of 1-5×10^5 spores/ml is used. For the microscopy, a density of >5×10^5 spores/ml is used. The inoculated plants are placed for 24 hours in a greenhouse chamber with an average of 22 °C and >90% air humidity. The following cultivation is performed in a chamber with an average of 25 °C and 70% air humidity.
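Converting a counting-chamber count into spores/ml and diluting to the target density can be sketched as follows. This is an illustration under stated assumptions: the chamber depth defaults to 0.1 mm (typical for Thoma chambers, but to be checked against the chamber actually used), the counted area is supplied by the user, and all function names and numbers are hypothetical.

```python
MM3_PER_ML = 1000.0  # 1 ml = 1000 mm^3


def spores_per_ml(spore_count: int,
                  counted_area_mm2: float,
                  chamber_depth_mm: float = 0.1,
                  dilution_factor: float = 1.0) -> float:
    """Spore density from a counting-chamber count.

    chamber_depth_mm = 0.1 is an assumption; use the depth engraved on
    the chamber. dilution_factor corrects for any dilution made before
    loading the chamber.
    """
    if counted_area_mm2 <= 0 or chamber_depth_mm <= 0:
        raise ValueError("area and depth must be positive")
    volume_mm3 = counted_area_mm2 * chamber_depth_mm
    return spore_count / volume_mm3 * MM3_PER_ML * dilution_factor


def dilution_volumes(stock_density: float,
                     target_density: float,
                     final_volume_ml: float):
    """Return (ml of stock suspension, ml of diluent) for the target density."""
    if target_density > stock_density:
        raise ValueError("stock is less dense than the target")
    stock_ml = final_volume_ml * target_density / stock_density
    return stock_ml, final_volume_ml - stock_ml


if __name__ == "__main__":
    # Hypothetical count: 50 spores over 1 mm^2 of the grid.
    density = spores_per_ml(50, counted_area_mm2=1.0)
    print(density, dilution_volumes(density, 1e5, final_volume_ml=100))
```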

Example 10: Microscopical Screening

[0391] For the evaluation of the pathogen development, the inoculated leaves of plants are stained with aniline blue 48 hours after infection.

[0392] The aniline blue staining serves for the detection of fluorescent substances. During the defense reactions in host interactions and non-host interactions, substances such as phenols, callose or lignin accumulate or are produced and are incorporated at the cell wall either locally in papillae or in the whole cell (hypersensitive reaction, HR). Complexes are formed in association with aniline blue, which lead, e.g. in the case of callose, to yellow fluorescence. The leaf material is transferred to falcon tubes or dishes containing destaining solution II (ethanol/acetic acid 6/1) and is incubated in a water bath at 90° C. for 10-15 minutes. The destaining solution II is removed immediately thereafter, and the leaves are washed twice with water. For the staining, the leaves are incubated for 1.5-2 hours in staining solution II (0.05% aniline blue=methyl blue, 0.067 M di-potassium hydrogen phosphate) and analyzed by microscopy immediately thereafter.
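As a non-limiting illustration, the amounts needed to prepare the destaining and staining solutions named above can be estimated as follows. The sketch assumes that "0.05%" means weight per volume, that the 6/1 ratio is volume per volume, and that anhydrous di-potassium hydrogen phosphate (about 174.18 g/mol) is used; none of these details is stated in this application, and the function names are hypothetical.

```python
# Sketch: reagent amounts for destaining solution II and staining solution II.
# Assumptions (not from the source): 0.05% is w/v, the 6/1 ratio is v/v, and
# anhydrous K2HPO4 with a molar mass of about 174.18 g/mol is used.

K2HPO4_G_PER_MOL = 174.18


def destaining_solution(total_ml: float) -> dict:
    """Split total_ml into ethanol and acetic acid at a 6:1 volume ratio."""
    return {
        "ethanol_ml": total_ml * 6 / 7,
        "acetic_acid_ml": total_ml * 1 / 7,
    }


def staining_solution(total_ml: float) -> dict:
    """Grams of aniline blue (0.05% w/v) and K2HPO4 (0.067 M) for total_ml."""
    litres = total_ml / 1000
    return {
        "aniline_blue_g": 0.05 / 100 * total_ml,  # 0.05 g per 100 ml
        "k2hpo4_g": 0.067 * K2HPO4_G_PER_MOL * litres,
    }


if __name__ == "__main__":
    print(destaining_solution(70))
    print(staining_solution(1000))
```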

[0393] The different interaction types are evaluated (counted) by microscopy. An Olympus UV microscope BX61 (incident light) and a UV Longpath filter (excitation: 375/15, Beam splitter: 405 LP) are used. After aniline blue staining, the spores appear blue under UV light. The papillae can be recognized beneath the fungal appressorium by a green/yellow staining. The hypersensitive reaction (HR) is characterized by a whole cell fluorescence.

Example 11: Evaluating the Susceptibility to Soybean Rust

[0394] The progression of the soybean rust disease is scored by estimating the diseased area (the area covered by sporulating uredinia) on the backside (abaxial side) of the leaf. Additionally, the yellowing of the leaf is taken into account (for scheme see FIG. 11).

[0395] In total, 50 T1 soybean plants per construct are inoculated with spores of Phakopsora pachyrhizi. The macroscopic disease symptoms caused by P. pachyrhizi on the inoculated soybean plants are scored 14 days after inoculation.

[0396] The average of the percentage of the leaf area showing fungal colonies or strong yellowing/browning on all leaves is considered as diseased leaf area. In total, 50 soybean T1 plants per construct (expression checked by RT-PCR) are evaluated in parallel to non-transgenic control plants. Non-transgenic soy plants grown in parallel to the transgenic plants are used as controls.
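As a non-limiting illustration, the averaging described above can be sketched as follows: the visually rated percentages of all leaves of a plant are averaged, and the plant values are then averaged per construct. The function names and the example ratings are hypothetical and are not data from this application.

```python
# Sketch: diseased leaf area as the mean of per-leaf visual ratings (percent).
# All names and numbers are hypothetical illustrations, not measured data.
from statistics import mean
from typing import Sequence


def plant_diseased_area(leaf_ratings: Sequence[float]) -> float:
    """Mean diseased area (%) over all rated leaves of one plant."""
    if not leaf_ratings:
        raise ValueError("at least one leaf rating is required")
    if any(r < 0 or r > 100 for r in leaf_ratings):
        raise ValueError("ratings are percentages between 0 and 100")
    return mean(leaf_ratings)


def construct_diseased_area(plants: Sequence[Sequence[float]]) -> float:
    """Mean diseased area (%) over all plants of one construct."""
    return mean(plant_diseased_area(leaves) for leaves in plants)


if __name__ == "__main__":
    # Two hypothetical plants with two rated leaves each.
    print(construct_diseased_area([[10, 30], [40, 60]]))
```

Averaging per plant first keeps plants with many leaves from dominating the construct mean.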

[0397] The expression of the F6H1 gene will lead to enhanced resistance of soy against Phakopsora pachyrhizi.

Example 12: Cloning of Overexpression Vector Constructs for Stable Corn Transformation

[0398] The DNA sequences of the F6H1 (AT3G13610), CCoAOMT1 (At4g34050) and ABCG37 (PDR9; AT3G53480) genes mentioned in this application were generated by DNA synthesis (Geneart, Regensburg, Germany).

[0399] The F6H1 DNA (as shown in SEQ ID No: 1) was synthesized such that a PacI restriction site is located in front of the start-ATG and an AscI restriction site downstream of the stop-codon. The synthesized DNA was digested using the restriction enzymes PacI and AscI (NEB Biolabs) and ligated in a PacI/AscI digested Gateway pENTRY-C vector (Invitrogen, Life Technologies, Carlsbad, Calif., USA) such that the full-length fragment is located in sense direction between the maize ubiquitin promoter and the Agrobacterium tumefaciens derived octopine synthase terminator (t-OCS).
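As a non-limiting illustration, the design of a synthetic fragment flanked by a PacI site (recognition sequence TTAATTAA) upstream of the start-ATG and an AscI site (recognition sequence GGCGCGCC) downstream of the stop-codon can be sketched as follows. The sketch places the sites directly adjacent to the coding sequence, which is a simplifying assumption; the fragment actually synthesized may contain additional spacer bases. The function names and the toy coding sequence are hypothetical.

```python
# Sketch: flank a coding sequence with PacI (5') and AscI (3') sites and verify
# that each site occurs exactly once in the resulting fragment.
# Assumption (not from the source): sites are placed directly next to the CDS.

PACI = "TTAATTAA"   # PacI recognition sequence (palindromic)
ASCI = "GGCGCGCC"   # AscI recognition sequence (palindromic)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of site in seq, including overlapping ones."""
    width = len(site)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == site)


def add_cloning_sites(cds: str) -> str:
    """Return PacI + CDS + AscI after basic sanity checks on the CDS."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    if not cds.startswith("ATG"):
        raise ValueError("CDS must begin with the start-ATG")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("CDS must end with a stop codon")
    fragment = PACI + cds + ASCI
    # Both sites are palindromic, so checking one strand is sufficient.
    for name, site in (("PacI", PACI), ("AscI", ASCI)):
        if count_sites(fragment, site) != 1:
            raise ValueError(f"fragment must contain exactly one {name} site")
    return fragment


if __name__ == "__main__":
    # Toy CDS: the first six codons of SEQ ID NO: 1 followed by a stop codon.
    print(add_cloning_sites("atggctccaacactcttgtga"))
```

Rejecting a coding sequence with an internal PacI or AscI site matters because such a site would cut the insert during the PacI/AscI digest.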

[0400] To obtain the binary plant transformation vector, a triple LR reaction (Gateway system, Invitrogen, Life Technologies, Carlsbad, Calif., USA) was performed according to manufacturer's protocol by using an empty pENTRY-A vector, an empty pENTRY-B vector, and the ZmUbi promoter::F6H1::OCS-terminator in the above described pENTRY-C vector. As target a binary pDEST vector was used which is composed of: (1) a Spectinomycin/Streptomycin resistance cassette for bacterial selection; (2) a pVS1 origin for replication in Agrobacteria; (3) a ColE1 origin of replication for stable maintenance in E. coli; and (4) between the right and left border an AHAS selection (Z. mays acetohydroxyacid synthase (AHAS108) gene) under control of a Maize AHASL2 promoter. The recombination reaction was transformed into E. coli (DH5alpha), mini-prepped and screened by specific restriction digestions. A positive clone from the vector construct was sequenced and submitted to corn transformation.

[0401] To obtain the F6H1-CCoAOMT1 double gene construct, the CCoAOMT1 DNA (as shown in SEQ ID No: 3) was synthesized such that a PacI restriction site is located in front of the start-ATG and an AscI restriction site downstream of the stop-codon. The synthesized DNA was digested using the restriction enzymes PacI and AscI (NEB Biolabs) and ligated in a PacI/AscI digested Gateway pENTRY-B vector (Invitrogen, Life Technologies, Carlsbad, Calif., USA) such that the full-length fragment is located in sense direction between the pSuper promoter (Lee et al., 2007 Plant Physiology Vol 145 Issue 4 1294-1300) and the Agrobacterium tumefaciens derived nopaline synthase terminator (t-nos). The Super promoter consists of three identical Octopine Synthase Enhancers followed by a MAS promoter (Lee et al., 2007 Plant Physiology Vol 145 Issue 4 1294-1300).

[0402] To obtain the binary plant transformation vector containing F6H1 and CCoAOMT1, a triple LR reaction (Gateway system, Invitrogen, Life Technologies, Carlsbad, Calif., USA) was performed according to manufacturer's protocol by using an empty pENTRY-A vector, the pSuper promoter::CCoAOMT1::nos-terminator in the above described pENTRY-B vector and the ZmUbi promoter::F6H1::OCS-terminator in the above described pENTRY-C vector. As target a binary pDEST vector was used which is composed of: (1) a Spectinomycin/Streptomycin resistance cassette for bacterial selection; (2) a pVS1 origin for replication in Agrobacteria; (3) a ColE1 origin of replication for stable maintenance in E. coli; and (4) between the right and left border an AHAS selection (Z. mays acetohydroxyacid synthase (AHAS108) gene) under control of a Maize AHASL2 promoter. The recombination reaction was transformed into E. coli (DH5alpha), mini-prepped and screened by specific restriction digestions. A positive clone from the vector construct was sequenced and submitted to corn transformation.

[0403] To obtain the F6H1-UGT71C1 double gene construct, the UGT71C1 DNA (as shown in SEQ ID No: 7) was synthesized such that a PacI restriction site is located in front of the start-ATG and an AscI restriction site downstream of the stop-codon. The synthesized DNA was digested using the restriction enzymes PacI and AscI (NEB Biolabs) and ligated in a PacI/AscI digested Gateway pENTRY-B vector (Invitrogen, Life Technologies, Carlsbad, Calif., USA) such that the full-length fragment is located in sense direction between the pSuper promoter (Lee et al., 2007 Plant Physiology Vol 145 Issue 4 1294-1300) and the Agrobacterium tumefaciens derived nopaline synthase terminator (t-nos). The Super promoter consists of three identical Octopine Synthase Enhancers followed by a MAS promoter (Lee et al., 2007 Plant Physiology Vol 145 Issue 4 1294-1300).

[0404] To obtain the binary plant transformation vector containing F6H1 and UGT71C1, a triple LR reaction (Gateway system, Invitrogen, Life Technologies, Carlsbad, Calif., USA) was performed according to manufacturer's protocol by using an empty pENTRY-A vector, the pSuper promoter::UGT71C1::nos-terminator in the above described pENTRY-B vector and the ZmUbi promoter::F6H1::OCS-terminator in the above described pENTRY-C vector. As target a binary pDEST vector was used which is composed of: (1) a Spectinomycin/Streptomycin resistance cassette for bacterial selection; (2) a pVS1 origin for replication in Agrobacteria; (3) a ColE1 origin of replication for stable maintenance in E. coli; and (4) between the right and left border an AHAS selection (Z. mays acetohydroxyacid synthase (AHAS108) gene) under control of a Maize AHASL2 promoter. The recombination reaction was transformed into E. coli (DH5alpha), mini-prepped and screened by specific restriction digestions. A positive clone from the vector construct was sequenced and submitted to corn transformation.

[0405] To obtain the F6H1-CCoAOMT1-ABCG37 triple gene construct, the ABCG37 DNA (as shown in SEQ ID No: 5) was synthesized such that a PacI restriction site is located in front of the start-ATG and an AscI restriction site downstream of the stop-codon. The synthesized DNA was digested using the restriction enzymes PacI and AscI (NEB Biolabs) and ligated in a PacI/AscI digested Gateway pENTRY-A vector (Invitrogen, Life Technologies, Carlsbad, Calif., USA) such that the full-length fragment is located in sense direction between the ScBV promoter (Bouhida, Mohammed, B. E. Lockhart, and Neil E. Olszewski. "An analysis of the complete sequence of a sugarcane bacilliform virus genome infectious to banana and rice." The Journal of general virology 74 (1993): 15-22.) and the Solanum tuberosum cathepsin D inhibitor terminator (Herbers, Karin, Salome Prat, and Lothar Willmitzer. "Functional analysis of a leucine aminopeptidase from Solanum tuberosum L." Planta 194.2 (1994): 230-240.). The ScBV promoter mediates a medium-strong constitutive expression in corn.

[0406] To obtain the binary plant transformation vector containing F6H1, CCoAOMT1 and ABCG37, a triple LR reaction (Gateway system, Invitrogen, Life Technologies, Carlsbad, Calif., USA) was performed according to manufacturer's protocol by using the ScBV promoter::ABCG37::cathepsin inhibitor terminator in the pENTRY-A vector, as described above, the pSuper promoter::CCoAOMT1::nos-terminator in the above described pENTRY-B vector and the ZmUbi promoter::F6H1::OCS-terminator in the above described pENTRY-C vector.

[0407] As target a binary pDEST vector was used which is composed of: (1) a Spectinomycin/Streptomycin resistance cassette for bacterial selection; (2) a pVS1 origin for replication in Agrobacteria; (3) a ColE1 origin of replication for stable maintenance in E. coli; and (4) between the right and left border an AHAS selection (Z. mays acetohydroxyacid synthase (AHAS108) gene) under control of a Maize AHASL2 promoter. The recombination reaction was transformed into E. coli (DH5alpha), mini-prepped and screened by specific restriction digestions. A positive clone from the vector construct was sequenced and submitted to corn transformation.
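As a non-limiting illustration, the four corn constructs of this example can be summarized as the three entry positions (pENTRY-A, pENTRY-B, pENTRY-C) that enter each triple LR reaction. The class and field names are hypothetical; `None` marks an empty entry vector. The sketch assumes that the single-gene construct uses empty pENTRY-A and pENTRY-B vectors and that every construct carries the same maize ubiquitin promoter F6H1 cassette in pENTRY-C.

```python
# Sketch: entry-vector layout of the triple LR reactions of Example 12.
# Assumptions (labelled, not confirmed by the source): the single-gene construct
# uses empty pENTRY-A and pENTRY-B vectors, and all constructs share the same
# F6H1 cassette in pENTRY-C. Class and field names are hypothetical.
from dataclasses import dataclass
from typing import List, Optional

F6H1 = "ZmUbi promoter::F6H1::OCS-terminator"
CCOAOMT1 = "pSuper promoter::CCoAOMT1::nos-terminator"
UGT71C1 = "pSuper promoter::UGT71C1::nos-terminator"
ABCG37 = "ScBV promoter::ABCG37::cathepsin D inhibitor terminator"


@dataclass(frozen=True)
class Construct:
    name: str
    entry_a: Optional[str]  # None means an empty pENTRY-A vector
    entry_b: Optional[str]  # None means an empty pENTRY-B vector
    entry_c: Optional[str]

    def cassettes(self) -> List[str]:
        """Expression cassettes in A, B, C order, skipping empty vectors."""
        slots = (self.entry_a, self.entry_b, self.entry_c)
        return [slot for slot in slots if slot is not None]

    def genes(self) -> List[str]:
        """Gene names, taken from the middle field of promoter::gene::terminator."""
        return [cassette.split("::")[1] for cassette in self.cassettes()]


CONSTRUCTS = [
    Construct("F6H1", None, None, F6H1),
    Construct("F6H1-CCoAOMT1", None, CCOAOMT1, F6H1),
    Construct("F6H1-UGT71C1", None, UGT71C1, F6H1),
    Construct("F6H1-CCoAOMT1-ABCG37", ABCG37, CCOAOMT1, F6H1),
]

if __name__ == "__main__":
    for construct in CONSTRUCTS:
        print(construct.name, "->", ", ".join(construct.genes()))
```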

Example 13: Maize Transformation

[0408] Agrobacterium cells harboring a plasmid containing the gene of interest (see above) and the mutated maize AHAS gene were grown in YP medium supplemented with appropriate antibiotics for 1-2 days. One loop of Agrobacterium cells was collected and suspended in 1.8 ml M-LS-002 medium (LS-inf). The cultures were incubated while shaking at 1,200 rpm for 5 min-3 hrs. Corn cobs were harvested at 8-11 days after pollination. The cobs were sterilized in 20% Clorox solution for 5 min, followed by spraying with 70% Ethanol and then thoroughly rinsed with sterile water. Immature embryos 0.8-2.0 mm in size were dissected into the tube containing Agrobacterium cells in LS-inf solution.

[0409] The constructs were transformed into immature embryos by a protocol modified from the Japan Tobacco Agrobacterium mediated plant transformation method (U.S. Pat. Nos. 5,591,616; 5,731,179; 6,653,529; and U.S. Patent Application Publication No. 2009/0249514). Two types of plasmid vectors were used for transformation. One type had a single T-DNA border on each of the left and right sides, with the selectable marker gene and the gene of interest located between the left and right T-DNA borders. The other type was the so-called "two T-DNA construct" as described in Japan Tobacco U.S. Pat. No. 5,731,179. In the two T-DNA constructs, the selectable marker gene was located between one set of T-DNA borders and the gene of interest was included between the second set of T-DNA borders. Either plasmid vector can be used. The plasmid vector was electroporated into Agrobacterium.

[0410] Agrobacterium infection of the embryos was carried out by inverting the tube several times. The mixture was poured onto a filter paper disk on the surface of a plate containing co-cultivation medium (M-LS-011). The liquid agro-solution was removed and the embryos were checked under a microscope and placed scutellum side up. Embryos were cultured in the dark at 22° C. for 2-4 days, and transferred to M-MS-101 medium without selection and incubated for four to seven days. Embryos were then transferred to M-LS-202 medium containing 0.75 µM imazethapyr and grown for three weeks at 27° C. to select for transformed callus cells.

[0411] Plant regeneration was initiated by transferring resistant calli to M-LS-504 medium supplemented with 0.75 µM imazethapyr and growing under light at 26° C. for two to three weeks. Regenerated shoots were then transferred to a rooting box with M-MS-618 medium (0.5 µM imazethapyr). Plantlets with roots were transferred to soil-less potting mixture and grown in a growth chamber for a week, then transplanted to larger pots and maintained in a greenhouse until maturity.
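As a non-limiting illustration, the culture steps whose durations are stated above can be added up to estimate the time from Agrobacterium infection to transplanting. The rooting step on M-MS-618 medium is left out because no duration is given for it, so the totals are lower bounds. The names in the sketch are hypothetical.

```python
# Sketch: minimum and maximum duration of the stated maize culture steps.
# The rooting step (M-MS-618) has no stated duration and is therefore excluded,
# which makes both totals lower bounds. Names are hypothetical.
from typing import List, Tuple

Stage = Tuple[str, int, int]  # (description, minimum days, maximum days)

STAGES: List[Stage] = [
    ("co-cultivation on M-LS-011, dark, 22 C", 2, 4),
    ("M-MS-101 without selection", 4, 7),
    ("callus selection on M-LS-202, 0.75 uM imazethapyr", 21, 21),
    ("regeneration on M-LS-504, 0.75 uM imazethapyr", 14, 21),
    ("growth chamber in soil-less potting mixture", 7, 7),
]


def total_days(stages: List[Stage]) -> Tuple[int, int]:
    """Return (minimum, maximum) total days over all listed stages."""
    shortest = sum(low for _, low, _ in stages)
    longest = sum(high for _, _, high in stages)
    return shortest, longest


if __name__ == "__main__":
    low, high = total_days(STAGES)
    print(f"stated steps take {low} to {high} days, plus rooting time")
```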

[0412] Transgenic maize plant production is also described, for example, in U.S. Pat. Nos. 5,591,616 and 6,653,529; U.S. Patent Application Publication No. 2009/0249514; and WO/2006136596, each of which is hereby incorporated by reference in its entirety.

[0413] Transformation of maize may be performed using Agrobacterium transformation, as described in U.S. Pat. Nos. 5,591,616; 5,731,179; U.S. Patent Application Publication No. 2002/0104132, and the like. Transformation of maize (Zea mays L.) can also be performed with a modification of the method described by Ishida et al. (Nature Biotech., 1996, 14:745-750). The inbred line A188 (University of Minnesota) or hybrids with A188 as a parent are good sources of donor material for transformation (Fromm et al., Biotech, 1990, 8:833), but other genotypes can be used successfully as well. Ears are harvested from corn plants at approximately 11 days after pollination (DAP) when the length of immature embryos is about 1 to 1.2 mm. Immature embryos are co-cultivated with Agrobacterium tumefaciens that carry "super binary" vectors and transgenic plants are recovered through organogenesis. The super binary vector system is described in WO 94/00977 and WO 95/06722. Vectors are constructed as described. Various selection marker genes are used including the maize gene encoding a mutated acetohydroxy acid synthase (AHAS) enzyme (U.S. Pat. No. 6,025,541). Similarly, various promoters are used to regulate the trait gene to provide constitutive, developmental, inducible, tissue or environmental regulation of gene transcription. Excised embryos can be used and can be grown on callus induction medium, then maize regeneration medium, containing imidazolinone as a selection agent. The Petri dishes are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to maize rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the imidazolinone herbicides and which are PCR positive for the transgenes.

Example 14: Fusarium and Colletotrichum Resistance Screening

[0414] Transgenic maize plants expressing the F6H1 DNA alone or in combination with CCoAOMT1, ABCG37 or UGT71C1 (as described above) are grown in a greenhouse or phyto-chamber under standard growing conditions in a controlled environment (20-25° C., 60-90% humidity).

[0415] Shortly after the transgenic maize plants enter the reproductive phase, they are inoculated near the base of the stalk using a fungal suspension of spores (10^5 spores in PBS solution) of Fusarium spp. or Colletotrichum graminicola. Plants are incubated for 2-4 weeks at 20-25° C. and 60-90% humidity.

[0416] For scoring the stalk rot disease, stalks are split and the progression of the disease is scored by observation of the characteristic brown to black color of the fungus as it grows up the stalk. Disease ratings are conducted by assigning a visual score. Per experiment, the diseased leaf area of more than 10 transgenic plants (and wild-type plants as control) is scored. For analysis, the average of the diseased leaf area of the non-transgenic mother plant is set to 100% to calculate the relative diseased leaf area of the transgenic lines.
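As a non-limiting illustration, the normalization described above, in which the average of the non-transgenic control is set to 100%, can be sketched as follows. The function name and the example scores are hypothetical and are not data from this application.

```python
# Sketch: relative diseased area of a transgenic line with the control set to 100%.
# Function name and example scores are hypothetical, not measured data.
from statistics import mean
from typing import Sequence


def relative_diseased_area(transgenic_scores: Sequence[float],
                           control_scores: Sequence[float]) -> float:
    """Mean transgenic score as a percentage of the mean control score."""
    control_mean = mean(control_scores)
    if control_mean == 0:
        raise ValueError("control mean is zero; relative value is undefined")
    return 100 * mean(transgenic_scores) / control_mean


if __name__ == "__main__":
    # Hypothetical visual scores for two transgenic and two control plants.
    print(relative_diseased_area([20, 30], [40, 60]))
```

A value below 100 indicates less disease in the transgenic line than in the control.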

[0417] The expression of the F6H1 gene will lead to enhanced resistance of corn against Fusarium spp. and Colletotrichum graminicola.

Example 15: Evaluating the Effect of Scopoletin Accumulation and Susceptibility to Soybean Rust

[0418] The effect of Scopoletin accumulation in leaves on resistance was evaluated. To achieve accumulation of Scopoletin in leaves, a F6H1 overexpression construct was generated. The F6H1 overexpression construct (FIG. 2c) carries the coding sequence of the F6H1 enzyme (SEQ-ID-No. 1) under control of a constitutively and ubiquitously expressing promoter (as described in example 7). The construct was transformed into soybean as described in example 8 (Method C) and resulting T1 soybean seeds were planted and cultivated for 3 weeks as described in example 9.

[0419] The 5 best working independent events were selected for further analysis. As trait efficacy varies depending on the T-DNA insertion site, the average of those 5 independent events is seen as a good measure to estimate the overall effect of F6H1 overexpression.

[0420] In total, 5 transgenic plants were cultivated per event. Additionally, 11 non-transgenic wild type soybean plants were grown in parallel as controls. Presence of the construct was confirmed by qPCR, and Scopoletin accumulation was confirmed by presence of fluorescence (FIG. 16). Elicitation of fluorescence was done using a B-100AP UV lamp (UVP LLC, Upland, Calif., USA) using 365 nm longwave UV. Occurrence of fluorescence is a qualitative measure only (not quantitative).

[0421] Three-week-old plants (V1 stage) were inoculated with spores of Phakopsora pachyrhizi as described in example 9.

[0422] The progression of the soybean rust disease was scored 14 days after infection by visual rating of the diseased leaf area. Diseased leaf area is defined as area showing fungal colonies or strong yellowing/browning. The relative diseased area in percent is defined as diseased leaf area divided by overall leaf area (for scheme see FIG. 11).

[0423] Evaluation of Scopoletin accumulating plants was done in parallel to the evaluation of the non-transformed wildtype controls. The average of the diseased leaf area for soybean plants transformed with the F6H1 overexpression construct (FIG. 2c) resulting in Scopoletin accumulation is shown in FIG. 17.

[0424] Expression of F6H1 (construct 1, FIG. 2c) leads to a relative diseased leaf area of 34.9%, in comparison to the wild type, which shows a relative diseased leaf area of 43.9%. Thus, the expression of F6H1 (construct 1, FIG. 2c) leads to a significant (p<0.05, t-test, * FIG. 17) relative increase of soybean rust resistance of 20.6% on average over 5 independent events.
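As a non-limiting illustration, the relative increase in resistance can be read as the reduction of the diseased leaf area relative to the wild type. The sketch below applies this reading to the rounded means stated above; with these rounded inputs it yields about 20.5%, close to the stated 20.6%, the small difference presumably arising from rounding of the underlying means. The function name is hypothetical.

```python
# Sketch: relative reduction of diseased leaf area versus the wild type.
# Inputs are the rounded means stated in the text; the function name is hypothetical.


def relative_reduction(control_pct: float, transgenic_pct: float) -> float:
    """Reduction of diseased area as a percentage of the control value."""
    if control_pct <= 0:
        raise ValueError("control value must be positive")
    return 100 * (control_pct - transgenic_pct) / control_pct


if __name__ == "__main__":
    reduction = relative_reduction(control_pct=43.9, transgenic_pct=34.9)
    print(f"relative reduction: {reduction:.1f}%")
```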

[0425] These data clearly indicate that the in-planta accumulation of Scopoletin leads to less disease in transgenic plants compared to non-transgenic wild type controls. Thus, the expression of F6H1 in soybean significantly (p<0.05) increases the resistance of soy against soybean rust.

Sequence Listing

8811086DNAArabidopsis thalianaCDS(1)..(1086) 1atg gct cca aca ctc ttg aca acc caa ttc tca aat cca gct gaa gta 48Met Ala Pro Thr Leu Leu Thr Thr Gln Phe Ser Asn Pro Ala Glu Val 1 5 10 15 acc gac ttt gta gtc tac aaa gga aat ggt gtt aag ggt tta tca gaa 96Thr Asp Phe Val Val Tyr Lys Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 aca gga atc aaa gct ctt cca gaa caa tac att cag cca ctt gaa gaa 144Thr Gly Ile Lys Ala Leu Pro Glu Gln Tyr Ile Gln Pro Leu Glu Glu 35 40 45 cga ctc atc aac aaa ttc gtc aac gaa aca gat gaa gcc att cca gtt 192Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 atc gat atg tcg aac cct gat gag gac aga gtc gct gaa gct gtt tgt 240Ile Asp Met Ser Asn Pro Asp Glu Asp Arg Val Ala Glu Ala Val Cys 65 70 75 80 gat gct gct gag aaa tgg ggg ttc ttt caa gtg atc aat cat gga gtt 288Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 cct ttg gaa gtt ctt gat gac gtc aag gct gcg act cac aag ttc ttc 336Pro Leu Glu Val Leu Asp Asp Val Lys Ala Ala Thr His Lys Phe Phe 100 105 110 aat ctc cct gtt gaa gag aag cgc aag ttc act aaa gag aat tcg ctg 384Asn Leu Pro Val Glu Glu Lys Arg Lys Phe Thr Lys Glu Asn Ser Leu 115 120 125 tcg acg act gtt agg ttt ggg acg agt ttt agt cct ctt gca gag caa 432Ser Thr Thr Val Arg Phe Gly Thr Ser Phe Ser Pro Leu Ala Glu Gln 130 135 140 gcg ctt gag tgg aaa gat tat ctc agc ctc ttc ttt gtc tct gaa gct 480Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Ala 145 150 155 160 gaa gct gaa cag ttc tgg cct gat atc tgc agg aat gaa acg tta gag 528Glu Ala Glu Gln Phe Trp Pro Asp Ile Cys Arg Asn Glu Thr Leu Glu 165 170 175 tac att aac aag tca aag aag atg gtg agg agg ctt cta gag tat ttg 576Tyr Ile Asn Lys Ser Lys Lys Met Val Arg Arg Leu Leu Glu Tyr Leu 180 185 190 gga aag aat ctc aat gtt aaa gag ctt gac gag acg aaa gaa tca ctc 624Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 ttt atg ggc tcg att cga gtc aac ctt aac tac tac ccc atc tgc cct 672Phe Met Gly Ser Ile Arg Val Asn Leu Asn 
Tyr Tyr Pro Ile Cys Pro 210 215 220 aat ccg gac cta aca gtt ggt gtt ggt cgc cac tca gac gtc tct tct 720Asn Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 ctc acc att ctc tta caa gac cag atc ggt ggt cta cac gtg cgt tct 768Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 ctg gct tca ggg aac tgg gtt cac gtg cct ccg gtt gct gga tct ttt 816Leu Ala Ser Gly Asn Trp Val His Val Pro Pro Val Ala Gly Ser Phe 260 265 270 gtg atc aac atc gga gat gcg atg cag atc atg agc aat ggt ctg tac 864Val Ile Asn Ile Gly Asp Ala Met Gln Ile Met Ser Asn Gly Leu Tyr 275 280 285 aag agc gtg gag cat cgt gtc tta gcc aat ggt tac aat aat aga atc 912Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Tyr Asn Asn Arg Ile 290 295 300 tct gtt cct atc ttt gtg aac cca aaa cca gag tca gtt att ggt cct 960Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 cta cct gag gtg att gca aac gga gag gaa ccg att tac aga gac gtc 1008Leu Pro Glu Val Ile Ala Asn Gly Glu Glu Pro Ile Tyr Arg Asp Val 325 330 335 ctg tac tct gat tac gtc aag tat ttc ttc agg aag gca cac gat gga 1056Leu Tyr Ser Asp Tyr Val Lys Tyr Phe Phe Arg Lys Ala His Asp Gly 340 345 350 aag aaa acc gtc gat tac gcc aag atc tga 1086Lys Lys Thr Val Asp Tyr Ala Lys Ile 355 360 2361PRTArabidopsis thaliana 2Met Ala Pro Thr Leu Leu Thr Thr Gln Phe Ser Asn Pro Ala Glu Val 1 5 10 15 Thr Asp Phe Val Val Tyr Lys Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 Thr Gly Ile Lys Ala Leu Pro Glu Gln Tyr Ile Gln Pro Leu Glu Glu 35 40 45 Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 Ile Asp Met Ser Asn Pro Asp Glu Asp Arg Val Ala Glu Ala Val Cys 65 70 75 80 Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 Pro Leu Glu Val Leu Asp Asp Val Lys Ala Ala Thr His Lys Phe Phe 100 105 110 Asn Leu Pro Val Glu Glu Lys Arg Lys Phe Thr Lys Glu Asn Ser Leu 115 120 125 Ser Thr Thr Val Arg Phe Gly Thr Ser Phe Ser Pro Leu Ala Glu Gln 130 135 140 Ala Leu Glu Trp Lys 
Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Ala 145 150 155 160 Glu Ala Glu Gln Phe Trp Pro Asp Ile Cys Arg Asn Glu Thr Leu Glu 165 170 175 Tyr Ile Asn Lys Ser Lys Lys Met Val Arg Arg Leu Leu Glu Tyr Leu 180 185 190 Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 Phe Met Gly Ser Ile Arg Val Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 Asn Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 Leu Ala Ser Gly Asn Trp Val His Val Pro Pro Val Ala Gly Ser Phe 260 265 270 Val Ile Asn Ile Gly Asp Ala Met Gln Ile Met Ser Asn Gly Leu Tyr 275 280 285 Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Tyr Asn Asn Arg Ile 290 295 300 Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 Leu Pro Glu Val Ile Ala Asn Gly Glu Glu Pro Ile Tyr Arg Asp Val 325 330 335 Leu Tyr Ser Asp Tyr Val Lys Tyr Phe Phe Arg Lys Ala His Asp Gly 340 345 350 Lys Lys Thr Val Asp Tyr Ala Lys Ile 355 360 3780DNAArabidopsis thalianaCDS(1)..(780) 3atg gcg acg aca aca aca gaa gca acg aag aca tca tcg acc aat gga 48Met Ala Thr Thr Thr Thr Glu Ala Thr Lys Thr Ser Ser Thr Asn Gly 1 5 10 15 gaa gat cag aag cag tct cag aat ctt cga cat caa gaa gtt ggt cac 96Glu Asp Gln Lys Gln Ser Gln Asn Leu Arg His Gln Glu Val Gly His 20 25 30 aag agt ctc tta cag agc gat gat ctc tac cag tat ata ctg gag aca 144Lys Ser Leu Leu Gln Ser Asp Asp Leu Tyr Gln Tyr Ile Leu Glu Thr 35 40 45 agt gtg tat cct aga gaa cca gaa tca atg aag gaa ctc agg gaa gtg 192Ser Val Tyr Pro Arg Glu Pro Glu Ser Met Lys Glu Leu Arg Glu Val 50 55 60 aca gca aaa cat cca tgg aac ata atg acc aca tca gct gat gaa gga 240Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr Ser Ala Asp Glu Gly 65 70 75 80 cag ttc tta aac atg ctt atc aag ctc gtt aac gcc aag aac aca atg 288Gln Phe Leu Asn Met Leu Ile Lys Leu Val Asn Ala Lys Asn Thr Met 85 90 95 gag atc gga gtt tac act ggc tac tct ctt ctc gcc acc gct ctt gct 336Glu Ile Gly Val Tyr Thr 
Gly Tyr Ser Leu Leu Ala Thr Ala Leu Ala 100 105 110 ctc cct gaa gac ggc aaa att ctg gct atg gat gtc aac aga gag aat 384Leu Pro Glu Asp Gly Lys Ile Leu Ala Met Asp Val Asn Arg Glu Asn 115 120 125 tac gaa ttg ggt tta ccg atc att gag aaa gcc ggc gtt gct cac aag 432Tyr Glu Leu Gly Leu Pro Ile Ile Glu Lys Ala Gly Val Ala His Lys 130 135 140 atc gac ttc agg gaa ggc cct gct ctt ccc gtt ctt gat gaa atc gtt 480Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Val Leu Asp Glu Ile Val 145 150 155 160 gct gac gag aag aac cat gga aca tat gac ttt ata ttc gtt gat gct 528Ala Asp Glu Lys Asn His Gly Thr Tyr Asp Phe Ile Phe Val Asp Ala 165 170 175 gac aaa gac aac tac atc aac tac cac aag cgt ttg atc gat ctt gtg 576Asp Lys Asp Asn Tyr Ile Asn Tyr His Lys Arg Leu Ile Asp Leu Val 180 185 190 aaa att gga gga gtg att ggc tac gac aac act ctg tgg aat ggt tct 624Lys Ile Gly Gly Val Ile Gly Tyr Asp Asn Thr Leu Trp Asn Gly Ser 195 200 205 gtc gtg gct cct cct gat gca cca atg agg aag tac gtt cgt tac tac 672Val Val Ala Pro Pro Asp Ala Pro Met Arg Lys Tyr Val Arg Tyr Tyr 210 215 220 aga gac ttt gtt ctt gag ctt aac aag gct ctt gct gct gac cct cgg 720Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala Ala Asp Pro Arg 225 230 235 240 atc gag atc tgt atg ctc cct gtt ggt gat gga atc act atc tgc cgt 768Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile Thr Ile Cys Arg 245 250 255 cgg atc agt tga 780Arg Ile Ser 4259PRTArabidopsis thaliana 4 Met Ala Thr Thr Thr Thr Glu Ala Thr Lys Thr Ser Ser Thr Asn Gly 1 5 10 15 Glu Asp Gln Lys Gln Ser Gln Asn Leu Arg His Gln Glu Val Gly His 20 25 30 Lys Ser Leu Leu Gln Ser Asp Asp Leu Tyr Gln Tyr Ile Leu Glu Thr 35 40 45 Ser Val Tyr Pro Arg Glu Pro Glu Ser Met Lys Glu Leu Arg Glu Val 50 55 60 Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr Ser Ala Asp Glu Gly 65 70 75 80 Gln Phe Leu Asn Met Leu Ile Lys Leu Val Asn Ala Lys Asn Thr Met 85 90 95 Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala Thr Ala Leu Ala 100 105 110 Leu Pro Glu Asp Gly Lys Ile Leu Ala Met Asp Val Asn Arg Glu Asn 115 120 
125 Tyr Glu Leu Gly Leu Pro Ile Ile Glu Lys Ala Gly Val Ala His Lys 130 135 140 Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Val Leu Asp Glu Ile Val 145 150 155 160 Ala Asp Glu Lys Asn His Gly Thr Tyr Asp Phe Ile Phe Val Asp Ala 165 170 175 Asp Lys Asp Asn Tyr Ile Asn Tyr His Lys Arg Leu Ile Asp Leu Val 180 185 190 Lys Ile Gly Gly Val Ile Gly Tyr Asp Asn Thr Leu Trp Asn Gly Ser 195 200 205 Val Val Ala Pro Pro Asp Ala Pro Met Arg Lys Tyr Val Arg Tyr Tyr 210 215 220 Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala Ala Asp Pro Arg 225 230 235 240 Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile Thr Ile Cys Arg 245 250 255 Arg Ile Ser 54353DNAArabidopsis thalianaCDS(1)..(4353) 5atg gct cat atg gtt gga gca gac gat att gag tca ttg aga gta gag 48Met Ala His Met Val Gly Ala Asp Asp Ile Glu Ser Leu Arg Val Glu 1 5 10 15 ctt gca gag atc gga aga agc atc aga tca tca ttc cgg aga cat act 96Leu Ala Glu Ile Gly Arg Ser Ile Arg Ser Ser Phe Arg Arg His Thr 20 25 30 tcg agt ttc aga agc agc tct tca ata tat gaa gtt gaa aat gat ggt 144Ser Ser Phe Arg Ser Ser Ser Ser Ile Tyr Glu Val Glu Asn Asp Gly 35 40 45 gat gtt aat gat cat gat gca gag tat gct ctg caa tgg gct gag att 192Asp Val Asn Asp His Asp Ala Glu Tyr Ala Leu Gln Trp Ala Glu Ile 50 55 60 gag aga tta cca act gtc aag cga atg aga tcg act ctc ctt gat gat 240Glu Arg Leu Pro Thr Val Lys Arg Met Arg Ser Thr Leu Leu Asp Asp 65 70 75 80 ggc gat gag tcc atg acc gag aaa gga aga aga gtc gtt gat gtc aca 288Gly Asp Glu Ser Met Thr Glu Lys Gly Arg Arg Val Val Asp Val Thr 85 90 95 aag ctt gga gcc gtg gaa cgt cat ctg atg att gag aaa ctc atc aaa 336Lys Leu Gly Ala Val Glu Arg His Leu Met Ile Glu Lys Leu Ile Lys 100 105 110 cac att gag aat gat aat ctc aag ttg ctc aag aaa atc agg aga aga 384His Ile Glu Asn Asp Asn Leu Lys Leu Leu Lys Lys Ile Arg Arg Arg 115 120 125 ata gac aga gtc ggg atg gag tta ccg acc ata gaa gtg agg tac gag 432Ile Asp Arg Val Gly Met Glu Leu Pro Thr Ile Glu Val Arg Tyr Glu 130 135 140 agt tta aaa gtg gtg gcc gag tgc gag gtt gtc gaa ggg 
aag gca ctt 480Ser Leu Lys Val Val Ala Glu Cys Glu Val Val Glu Gly Lys Ala Leu 145 150 155 160 cca aca ctg tgg aac act gct aag cgt gtt tta tct gaa ctg gtg aag 528Pro Thr Leu Trp Asn Thr Ala Lys Arg Val Leu Ser Glu Leu Val Lys 165 170 175 ctc act ggt gca aaa aca cat gaa gcc aag ata aac att att aat gat 576Leu Thr Gly Ala Lys Thr His Glu Ala Lys Ile Asn Ile Ile Asn Asp 180 185 190 gtt aat ggc att ata aag cca gga agg tta aca ctg ttg ctt ggt cct 624Val Asn Gly Ile Ile Lys Pro Gly Arg Leu Thr Leu Leu Leu Gly Pro 195 200 205 cct agc tgc gga aaa aca act ttg tta aag gcc ttg tct gga aat tta 672Pro Ser Cys Gly Lys Thr Thr Leu Leu Lys Ala Leu Ser Gly Asn Leu 210 215 220 gaa aac aat cta aag tgt tca ggt gaa ata tct tac aat gga cac aga 720Glu Asn Asn Leu Lys Cys Ser Gly Glu Ile Ser Tyr Asn Gly His Arg 225 230 235 240 ctg gat gag ttt gtt cct cag aaa act tca gcg tac ata agt caa tat 768Leu Asp Glu Phe Val Pro Gln Lys Thr Ser Ala Tyr Ile Ser Gln Tyr 245 250 255 gat ctg cac att gca gag atg aca gtg agg gag aca gtt gac ttc tca 816Asp Leu His Ile Ala Glu Met Thr Val Arg Glu Thr Val Asp Phe Ser 260 265 270 gct cgt tgt cag ggc gtt ggt agc cga aca gat att atg atg gaa gtt 864Ala Arg Cys Gln Gly Val Gly Ser Arg Thr Asp Ile Met Met Glu Val 275 280 285 agt aaa aga gaa aag gaa aaa gga atc att cct gac aca gaa gtg gat 912Ser Lys Arg Glu Lys Glu Lys Gly Ile Ile Pro Asp Thr Glu Val Asp 290 295 300 gct tac atg aaa gca att tct gtt gaa gga ctc caa aga agt ctg caa 960Ala Tyr Met Lys Ala Ile Ser Val Glu Gly Leu Gln Arg Ser Leu Gln 305 310 315 320 aca gat tac att ttg aag att ctc gga ctt gat att tgt gca gaa ata 1008Thr Asp Tyr Ile Leu Lys Ile Leu Gly Leu Asp Ile Cys Ala Glu Ile 325 330 335

ttg att gga gat gtg atg agg aga ggt ata tca gga ggt caa aag aag 1056Leu Ile Gly Asp Val Met Arg Arg Gly Ile Ser Gly Gly Gln Lys Lys 340 345 350 cgt ctt acc aca gct gag atg atc gtt ggc ccg aca aag gct ctg ttt 1104Arg Leu Thr Thr Ala Glu Met Ile Val Gly Pro Thr Lys Ala Leu Phe 355 360 365 atg gat gaa ata aca aat ggc cta gac agc tcc aca gct ttt cag att 1152Met Asp Glu Ile Thr Asn Gly Leu Asp Ser Ser Thr Ala Phe Gln Ile 370 375 380 gtc aaa tct ctt cag cag ttt gct cac ata tca agc gct act gta ctt 1200Val Lys Ser Leu Gln Gln Phe Ala His Ile Ser Ser Ala Thr Val Leu 385 390 395 400 gtt tcg ctt ctt caa ccc gcc cca gaa tcc tat gac ctc ttt gat gac 1248Val Ser Leu Leu Gln Pro Ala Pro Glu Ser Tyr Asp Leu Phe Asp Asp 405 410 415 att atg ctg atg gcc aaa gga aga atc gtg tat cat ggt cca cgc ggt 1296Ile Met Leu Met Ala Lys Gly Arg Ile Val Tyr His Gly Pro Arg Gly 420 425 430 gaa gtc ctt aac ttc ttt gag gat tgt gga ttc cga tgc cct gaa agg 1344Glu Val Leu Asn Phe Phe Glu Asp Cys Gly Phe Arg Cys Pro Glu Arg 435 440 445 aag ggt gtt gca gac ttt ctc cag gag gtt ata tcc aaa aaa gat caa 1392Lys Gly Val Ala Asp Phe Leu Gln Glu Val Ile Ser Lys Lys Asp Gln 450 455 460 gca caa tac tgg tgg cac gag gat tta cct tac agt ttt gtc tcg gta 1440Ala Gln Tyr Trp Trp His Glu Asp Leu Pro Tyr Ser Phe Val Ser Val 465 470 475 480 gaa atg ttg tcg aag aag ttc aag gac ttg agt att ggg aaa aag atc 1488Glu Met Leu Ser Lys Lys Phe Lys Asp Leu Ser Ile Gly Lys Lys Ile 485 490 495 gaa gac act ctg tca aag cca tat gat aga tcc aaa agc cat aag gat 1536Glu Asp Thr Leu Ser Lys Pro Tyr Asp Arg Ser Lys Ser His Lys Asp 500 505 510 gct ttg tcc ttc agt gtg tat tct ctt cca aac tgg gag ctg ttc ata 1584Ala Leu Ser Phe Ser Val Tyr Ser Leu Pro Asn Trp Glu Leu Phe Ile 515 520 525 gca tgc ata tca aga gag tat ctt ctc atg aag aga aac tat ttc gtc 1632Ala Cys Ile Ser Arg Glu Tyr Leu Leu Met Lys Arg Asn Tyr Phe Val 530 535 540 tat att ttc aag act gct cag ctt gtt atg gcc gca ttc atc act atg 1680Tyr Ile Phe Lys Thr Ala Gln Leu Val Met Ala Ala Phe 
Ile Thr Met 545 550 555 560 aca gtg ttt atc cga aca cgg atg ggt att gat atc att cat gga aat 1728Thr Val Phe Ile Arg Thr Arg Met Gly Ile Asp Ile Ile His Gly Asn 565 570 575 tct tac atg agt gcc ctc ttt ttc gcc ctc att ata ctt ctt gtt gac 1776Ser Tyr Met Ser Ala Leu Phe Phe Ala Leu Ile Ile Leu Leu Val Asp 580 585 590 gga ttc cca gag ttg tct atg acg gct caa cgt cta gcc gtg ttt tat 1824Gly Phe Pro Glu Leu Ser Met Thr Ala Gln Arg Leu Ala Val Phe Tyr 595 600 605 aag cag aag cag ttg tgt ttc tat cct gca tgg gcg tat gca atc cct 1872Lys Gln Lys Gln Leu Cys Phe Tyr Pro Ala Trp Ala Tyr Ala Ile Pro 610 615 620 gca aca gtg tta aag gtc cct ctc tcg ttc ttt gaa tct ctc gtt tgg 1920Ala Thr Val Leu Lys Val Pro Leu Ser Phe Phe Glu Ser Leu Val Trp 625 630 635 640 acc tgc ctc tca tac tat gtc att gga tac acc cct gaa gca tcc agg 1968Thr Cys Leu Ser Tyr Tyr Val Ile Gly Tyr Thr Pro Glu Ala Ser Arg 645 650 655 ttc ttc aag cag ttc att cta ctc ttt gct gtt cac ttc acc tcg ata 2016Phe Phe Lys Gln Phe Ile Leu Leu Phe Ala Val His Phe Thr Ser Ile 660 665 670 tcc atg ttc cgg tgt cta gct gca atc ttc cag aca gta gtt gct tca 2064Ser Met Phe Arg Cys Leu Ala Ala Ile Phe Gln Thr Val Val Ala Ser 675 680 685 atc aca gct ggc agt ttt ggt ata tta ttc aca ttt gtc ttt gcc ggt 2112Ile Thr Ala Gly Ser Phe Gly Ile Leu Phe Thr Phe Val Phe Ala Gly 690 695 700 ttc gtc att cca cca cct tct atg cca gca tgg ctc aag tgg ggt ttc 2160Phe Val Ile Pro Pro Pro Ser Met Pro Ala Trp Leu Lys Trp Gly Phe 705 710 715 720 tgg gca aat cct ttg agt tac ggt gag att ggg tta tca gta aac gag 2208Trp Ala Asn Pro Leu Ser Tyr Gly Glu Ile Gly Leu Ser Val Asn Glu 725 730 735 ttt ctt gct cca agg tgg aat cag atg caa ccc aat aat ttt acc tta 2256Phe Leu Ala Pro Arg Trp Asn Gln Met Gln Pro Asn Asn Phe Thr Leu 740 745 750 gga cga acc ata ctc caa acc cgt gga atg gac tac aac ggt tac atg 2304Gly Arg Thr Ile Leu Gln Thr Arg Gly Met Asp Tyr Asn Gly Tyr Met 755 760 765 tac tgg gta tca tta tgt gcc ttg ttg ggt ttc act gtg ctc ttc aac 2352Tyr Trp Val Ser Leu Cys 
Ala Leu Leu Gly Phe Thr Val Leu Phe Asn 770 775 780 atc att ttc act ctg gct cta acg ttc ttg aaa tca ccc aca tca tct 2400Ile Ile Phe Thr Leu Ala Leu Thr Phe Leu Lys Ser Pro Thr Ser Ser 785 790 795 800 cga gcc atg att tcg caa gac aaa ctc tct gag ctg caa gga aca gaa 2448Arg Ala Met Ile Ser Gln Asp Lys Leu Ser Glu Leu Gln Gly Thr Glu 805 810 815 aag tca aca gaa gat tct tct gtc agg aaa aag acc aca gac tcc cct 2496Lys Ser Thr Glu Asp Ser Ser Val Arg Lys Lys Thr Thr Asp Ser Pro 820 825 830 gta aag acc gaa gaa gaa gac aaa atg gtc tta cca ttc aag cct ctc 2544Val Lys Thr Glu Glu Glu Asp Lys Met Val Leu Pro Phe Lys Pro Leu 835 840 845 act gta aca ttt caa gac ttg aac tat ttc gtt gac atg cca gtg gag 2592Thr Val Thr Phe Gln Asp Leu Asn Tyr Phe Val Asp Met Pro Val Glu 850 855 860 atg aga gac caa gga tat gat cag aag aaa cta caa ctt ctc tca gat 2640Met Arg Asp Gln Gly Tyr Asp Gln Lys Lys Leu Gln Leu Leu Ser Asp 865 870 875 880 atc aca gga gct ttc cgt ccc gga atc cta acg gca cta atg gga gtg 2688Ile Thr Gly Ala Phe Arg Pro Gly Ile Leu Thr Ala Leu Met Gly Val 885 890 895 agt gga gct gga aaa acc act ctt ctc gac gtt cta gcc gga agg aaa 2736Ser Gly Ala Gly Lys Thr Thr Leu Leu Asp Val Leu Ala Gly Arg Lys 900 905 910 aca agc gga tac atc gaa gga gac att aga atc agt ggc ttc cct aaa 2784Thr Ser Gly Tyr Ile Glu Gly Asp Ile Arg Ile Ser Gly Phe Pro Lys 915 920 925 gtc caa gaa aca ttc gct aga gtc tca ggc tac tgt gaa caa aca gat 2832Val Gln Glu Thr Phe Ala Arg Val Ser Gly Tyr Cys Glu Gln Thr Asp 930 935 940 att cac tca cca aac atc act gta gaa gaa tcc gta atc tac tcg gct 2880Ile His Ser Pro Asn Ile Thr Val Glu Glu Ser Val Ile Tyr Ser Ala 945 950 955 960 tgg ctt cgt cta gct cct gag atc gat gcc aca aca aaa acc aaa ttc 2928Trp Leu Arg Leu Ala Pro Glu Ile Asp Ala Thr Thr Lys Thr Lys Phe 965 970 975 gtg aag caa gtg ctt gag acg atc gaa tta gat gag att aaa gat tca 2976Val Lys Gln Val Leu Glu Thr Ile Glu Leu Asp Glu Ile Lys Asp Ser 980 985 990 ttg gtg gga gtc acc gga gtt agt gga tta tcg acg gag caa agg aag 
3024Leu Val Gly Val Thr Gly Val Ser Gly Leu Ser Thr Glu Gln Arg Lys 995 1000 1005 aga ttg acg att gcg gtg gag ttg gtg gcg aat ccg tcg att ata 3069Arg Leu Thr Ile Ala Val Glu Leu Val Ala Asn Pro Ser Ile Ile 1010 1015 1020 ttt atg gat gag cca acg acg ggg cta gac gca aga gca gct gcc 3114Phe Met Asp Glu Pro Thr Thr Gly Leu Asp Ala Arg Ala Ala Ala 1025 1030 1035 att gtt atg aga gct gtg aag aac gtc gct gat act gga cga acc 3159Ile Val Met Arg Ala Val Lys Asn Val Ala Asp Thr Gly Arg Thr 1040 1045 1050 atc gtc tgt act att cat cag cct agt atc gac att ttt gaa gcc 3204Ile Val Cys Thr Ile His Gln Pro Ser Ile Asp Ile Phe Glu Ala 1055 1060 1065 ttc gac gag ctg gtg ctt ctt aaa aga ggt ggt cgc atg atc tac 3249Phe Asp Glu Leu Val Leu Leu Lys Arg Gly Gly Arg Met Ile Tyr 1070 1075 1080 aca gga cca tta ggc caa cat tca cgt cac att atc gag tat ttt 3294Thr Gly Pro Leu Gly Gln His Ser Arg His Ile Ile Glu Tyr Phe 1085 1090 1095 gag agt gtt cct gaa att cct aaa ata aaa gac aac cac aat cca 3339Glu Ser Val Pro Glu Ile Pro Lys Ile Lys Asp Asn His Asn Pro 1100 1105 1110 gca aca tgg atg ctt gat gtt agt tca cag tcg gta gaa att gaa 3384Ala Thr Trp Met Leu Asp Val Ser Ser Gln Ser Val Glu Ile Glu 1115 1120 1125 ctt ggt gtc gat ttc gca aaa atc tac cat gac tct gct ctt tac 3429Leu Gly Val Asp Phe Ala Lys Ile Tyr His Asp Ser Ala Leu Tyr 1130 1135 1140 aag cga aac tca gag ctt gtg aaa cag ttg agc cag cca gat tca 3474Lys Arg Asn Ser Glu Leu Val Lys Gln Leu Ser Gln Pro Asp Ser 1145 1150 1155 gga tca agt gat ata cag ttt aag aga acc ttt gca caa agc tgg 3519Gly Ser Ser Asp Ile Gln Phe Lys Arg Thr Phe Ala Gln Ser Trp 1160 1165 1170 tgg gga caa ttc aaa tct att cta tgg aaa atg aac ttg tct tat 3564Trp Gly Gln Phe Lys Ser Ile Leu Trp Lys Met Asn Leu Ser Tyr 1175 1180 1185 tgg aga agc cct tct tat aac cta atg cgt atg atg cac act tta 3609Trp Arg Ser Pro Ser Tyr Asn Leu Met Arg Met Met His Thr Leu 1190 1195 1200 gtc tct tct ttg atc ttc ggc gca ctt ttc tgg aaa caa ggc caa 3654Val Ser Ser Leu Ile Phe Gly Ala Leu Phe Trp 
Lys Gln Gly Gln 1205 1210 1215 aat cta gat act caa cag agt atg ttc aca gta ttt gga gcg atc 3699Asn Leu Asp Thr Gln Gln Ser Met Phe Thr Val Phe Gly Ala Ile 1220 1225 1230 tac ggt ttg gta ctc ttc tta ggg ata aac aat tgt gca tca gct 3744Tyr Gly Leu Val Leu Phe Leu Gly Ile Asn Asn Cys Ala Ser Ala 1235 1240 1245 ctt caa tat ttc gaa aca gag aga aat gtt atg tac cgg gaa aga 3789Leu Gln Tyr Phe Glu Thr Glu Arg Asn Val Met Tyr Arg Glu Arg 1250 1255 1260 ttc gca ggg atg tac tca gcg act gct tat gca ttg ggt caa gtg 3834Phe Ala Gly Met Tyr Ser Ala Thr Ala Tyr Ala Leu Gly Gln Val 1265 1270 1275 gtg act gag ata cct tat ata ttc ata caa gct gcc gag ttt gtg 3879Val Thr Glu Ile Pro Tyr Ile Phe Ile Gln Ala Ala Glu Phe Val 1280 1285 1290 atc gta aca tat cca atg atc ggt ttc tat cct tca gcc tac aaa 3924Ile Val Thr Tyr Pro Met Ile Gly Phe Tyr Pro Ser Ala Tyr Lys 1295 1300 1305 gtc ttt tgg tca ctc tac tct atg ttt tgc tca cta ctc act ttc 3969Val Phe Trp Ser Leu Tyr Ser Met Phe Cys Ser Leu Leu Thr Phe 1310 1315 1320 aac tac ctt gcg atg ttc ctc gtc tcc atc acg cca aac ttc atg 4014Asn Tyr Leu Ala Met Phe Leu Val Ser Ile Thr Pro Asn Phe Met 1325 1330 1335 gtt gcc gcg att ctt caa tcg ctc ttt tat gtt ggt ttc aac ctt 4059Val Ala Ala Ile Leu Gln Ser Leu Phe Tyr Val Gly Phe Asn Leu 1340 1345 1350 ttt tcg ggg ttt ttg atc ccc caa acg caa gta cca ggg tgg tgg 4104Phe Ser Gly Phe Leu Ile Pro Gln Thr Gln Val Pro Gly Trp Trp 1355 1360 1365 att tgg tta tat tat cta aca cca acg tct tgg aca ctc aac ggg 4149Ile Trp Leu Tyr Tyr Leu Thr Pro Thr Ser Trp Thr Leu Asn Gly 1370 1375 1380 ttt atc tcg tcc caa tac ggc gat att cat gaa gag atc aat gtc 4194Phe Ile Ser Ser Gln Tyr Gly Asp Ile His Glu Glu Ile Asn Val 1385 1390 1395 ttt gga caa tcc acg acg gtt gca aga ttc ttg aaa gac tat ttt 4239Phe Gly Gln Ser Thr Thr Val Ala Arg Phe Leu Lys Asp Tyr Phe 1400 1405 1410 gga ttt cat cat gac ctt ttg gcg gtt acc gcg gtt gtt caa atc 4284Gly Phe His His Asp Leu Leu Ala Val Thr Ala Val Val Gln Ile 1415 1420 1425 gct ttt ccc att gcc 
tta gct tct atg ttt gca ttc ttc gtg ggc 4329Ala Phe Pro Ile Ala Leu Ala Ser Met Phe Ala Phe Phe Val Gly 1430 1435 1440 aaa ctc aac ttc caa cga aga tga 4353Lys Leu Asn Phe Gln Arg Arg 1445 1450 61450PRTArabidopsis thaliana 6Met Ala His Met Val Gly Ala Asp Asp Ile Glu Ser Leu Arg Val Glu 1 5 10 15 Leu Ala Glu Ile Gly Arg Ser Ile Arg Ser Ser Phe Arg Arg His Thr 20 25 30 Ser Ser Phe Arg Ser Ser Ser Ser Ile Tyr Glu Val Glu Asn Asp Gly 35 40 45 Asp Val Asn Asp His Asp Ala Glu Tyr Ala Leu Gln Trp Ala Glu Ile 50 55 60 Glu Arg Leu Pro Thr Val Lys Arg Met Arg Ser Thr Leu Leu Asp Asp 65 70 75 80 Gly Asp Glu Ser Met Thr Glu Lys Gly Arg Arg Val Val Asp Val Thr 85 90 95 Lys Leu Gly Ala Val Glu Arg His Leu Met Ile Glu Lys Leu Ile Lys 100 105 110 His Ile Glu Asn Asp Asn Leu Lys Leu Leu Lys Lys Ile Arg Arg Arg 115 120 125 Ile Asp Arg Val Gly Met Glu Leu Pro Thr Ile Glu Val Arg Tyr Glu 130 135 140 Ser Leu Lys Val Val Ala Glu Cys Glu Val Val Glu Gly Lys Ala Leu 145 150 155 160 Pro Thr Leu Trp Asn Thr Ala Lys Arg Val Leu Ser Glu Leu Val Lys 165 170 175 Leu Thr Gly Ala Lys Thr His Glu Ala Lys Ile Asn Ile Ile Asn Asp 180 185 190 Val Asn Gly Ile Ile Lys Pro Gly Arg Leu Thr Leu Leu Leu Gly Pro 195 200 205 Pro Ser Cys Gly Lys Thr Thr Leu Leu Lys Ala Leu Ser Gly Asn Leu 210 215 220 Glu Asn Asn Leu Lys Cys Ser Gly Glu Ile Ser Tyr Asn Gly His Arg 225 230 235 240 Leu Asp Glu Phe Val Pro Gln Lys Thr Ser Ala Tyr Ile Ser Gln Tyr 245 250 255 Asp Leu His Ile Ala Glu Met Thr Val Arg Glu Thr Val Asp Phe Ser 260 265 270 Ala Arg Cys Gln Gly Val Gly Ser Arg Thr Asp Ile Met Met Glu Val 275 280 285 Ser Lys Arg Glu Lys Glu Lys Gly Ile Ile Pro Asp Thr Glu Val Asp 290 295 300 Ala Tyr Met Lys Ala Ile Ser Val Glu Gly Leu Gln Arg Ser Leu Gln 305 310 315
320 Thr Asp Tyr Ile Leu Lys Ile Leu Gly Leu Asp Ile Cys Ala Glu Ile 325 330 335 Leu Ile Gly Asp Val Met Arg Arg Gly Ile Ser Gly Gly Gln Lys Lys 340 345 350 Arg Leu Thr Thr Ala Glu Met Ile Val Gly Pro Thr Lys Ala Leu Phe 355 360 365 Met Asp Glu Ile Thr Asn Gly Leu Asp Ser Ser Thr Ala Phe Gln Ile 370 375 380 Val Lys Ser Leu Gln Gln Phe Ala His Ile Ser Ser Ala Thr Val Leu 385 390 395 400 Val Ser Leu Leu Gln Pro Ala Pro Glu Ser Tyr Asp Leu Phe Asp Asp 405 410 415 Ile Met Leu Met Ala Lys Gly Arg Ile Val Tyr His Gly Pro Arg Gly 420 425 430 Glu Val Leu Asn Phe Phe Glu Asp Cys Gly Phe Arg Cys Pro Glu Arg 435 440 445 Lys Gly Val Ala Asp Phe Leu Gln Glu Val Ile Ser Lys Lys Asp Gln 450 455 460 Ala Gln Tyr Trp Trp His Glu Asp Leu Pro Tyr Ser Phe Val Ser Val 465 470 475 480 Glu Met Leu Ser Lys Lys Phe Lys Asp Leu Ser Ile Gly Lys Lys Ile 485 490 495 Glu Asp Thr Leu Ser Lys Pro Tyr Asp Arg Ser Lys Ser His Lys Asp 500 505 510 Ala Leu Ser Phe Ser Val Tyr Ser Leu Pro Asn Trp Glu Leu Phe Ile 515 520 525 Ala Cys Ile Ser Arg Glu Tyr Leu Leu Met Lys Arg Asn Tyr Phe Val 530 535 540 Tyr Ile Phe Lys Thr Ala Gln Leu Val Met Ala Ala Phe Ile Thr Met 545 550 555 560 Thr Val Phe Ile Arg Thr Arg Met Gly Ile Asp Ile Ile His Gly Asn 565 570 575 Ser Tyr Met Ser Ala Leu Phe Phe Ala Leu Ile Ile Leu Leu Val Asp 580 585 590 Gly Phe Pro Glu Leu Ser Met Thr Ala Gln Arg Leu Ala Val Phe Tyr 595 600 605 Lys Gln Lys Gln Leu Cys Phe Tyr Pro Ala Trp Ala Tyr Ala Ile Pro 610 615 620 Ala Thr Val Leu Lys Val Pro Leu Ser Phe Phe Glu Ser Leu Val Trp 625 630 635 640 Thr Cys Leu Ser Tyr Tyr Val Ile Gly Tyr Thr Pro Glu Ala Ser Arg 645 650 655 Phe Phe Lys Gln Phe Ile Leu Leu Phe Ala Val His Phe Thr Ser Ile 660 665 670 Ser Met Phe Arg Cys Leu Ala Ala Ile Phe Gln Thr Val Val Ala Ser 675 680 685 Ile Thr Ala Gly Ser Phe Gly Ile Leu Phe Thr Phe Val Phe Ala Gly 690 695 700 Phe Val Ile Pro Pro Pro Ser Met Pro Ala Trp Leu Lys Trp Gly Phe 705 710 715 720 Trp Ala Asn Pro Leu Ser Tyr Gly Glu Ile Gly Leu Ser Val Asn Glu 725 730 735 
Phe Leu Ala Pro Arg Trp Asn Gln Met Gln Pro Asn Asn Phe Thr Leu 740 745 750 Gly Arg Thr Ile Leu Gln Thr Arg Gly Met Asp Tyr Asn Gly Tyr Met 755 760 765 Tyr Trp Val Ser Leu Cys Ala Leu Leu Gly Phe Thr Val Leu Phe Asn 770 775 780 Ile Ile Phe Thr Leu Ala Leu Thr Phe Leu Lys Ser Pro Thr Ser Ser 785 790 795 800 Arg Ala Met Ile Ser Gln Asp Lys Leu Ser Glu Leu Gln Gly Thr Glu 805 810 815 Lys Ser Thr Glu Asp Ser Ser Val Arg Lys Lys Thr Thr Asp Ser Pro 820 825 830 Val Lys Thr Glu Glu Glu Asp Lys Met Val Leu Pro Phe Lys Pro Leu 835 840 845 Thr Val Thr Phe Gln Asp Leu Asn Tyr Phe Val Asp Met Pro Val Glu 850 855 860 Met Arg Asp Gln Gly Tyr Asp Gln Lys Lys Leu Gln Leu Leu Ser Asp 865 870 875 880 Ile Thr Gly Ala Phe Arg Pro Gly Ile Leu Thr Ala Leu Met Gly Val 885 890 895 Ser Gly Ala Gly Lys Thr Thr Leu Leu Asp Val Leu Ala Gly Arg Lys 900 905 910 Thr Ser Gly Tyr Ile Glu Gly Asp Ile Arg Ile Ser Gly Phe Pro Lys 915 920 925 Val Gln Glu Thr Phe Ala Arg Val Ser Gly Tyr Cys Glu Gln Thr Asp 930 935 940 Ile His Ser Pro Asn Ile Thr Val Glu Glu Ser Val Ile Tyr Ser Ala 945 950 955 960 Trp Leu Arg Leu Ala Pro Glu Ile Asp Ala Thr Thr Lys Thr Lys Phe 965 970 975 Val Lys Gln Val Leu Glu Thr Ile Glu Leu Asp Glu Ile Lys Asp Ser 980 985 990 Leu Val Gly Val Thr Gly Val Ser Gly Leu Ser Thr Glu Gln Arg Lys 995 1000 1005 Arg Leu Thr Ile Ala Val Glu Leu Val Ala Asn Pro Ser Ile Ile 1010 1015 1020 Phe Met Asp Glu Pro Thr Thr Gly Leu Asp Ala Arg Ala Ala Ala 1025 1030 1035 Ile Val Met Arg Ala Val Lys Asn Val Ala Asp Thr Gly Arg Thr 1040 1045 1050 Ile Val Cys Thr Ile His Gln Pro Ser Ile Asp Ile Phe Glu Ala 1055 1060 1065 Phe Asp Glu Leu Val Leu Leu Lys Arg Gly Gly Arg Met Ile Tyr 1070 1075 1080 Thr Gly Pro Leu Gly Gln His Ser Arg His Ile Ile Glu Tyr Phe 1085 1090 1095 Glu Ser Val Pro Glu Ile Pro Lys Ile Lys Asp Asn His Asn Pro 1100 1105 1110 Ala Thr Trp Met Leu Asp Val Ser Ser Gln Ser Val Glu Ile Glu 1115 1120 1125 Leu Gly Val Asp Phe Ala Lys Ile Tyr His Asp Ser Ala Leu Tyr 1130 1135 1140 Lys Arg Asn Ser 
Glu Leu Val Lys Gln Leu Ser Gln Pro Asp Ser 1145 1150 1155 Gly Ser Ser Asp Ile Gln Phe Lys Arg Thr Phe Ala Gln Ser Trp 1160 1165 1170 Trp Gly Gln Phe Lys Ser Ile Leu Trp Lys Met Asn Leu Ser Tyr 1175 1180 1185 Trp Arg Ser Pro Ser Tyr Asn Leu Met Arg Met Met His Thr Leu 1190 1195 1200 Val Ser Ser Leu Ile Phe Gly Ala Leu Phe Trp Lys Gln Gly Gln 1205 1210 1215 Asn Leu Asp Thr Gln Gln Ser Met Phe Thr Val Phe Gly Ala Ile 1220 1225 1230 Tyr Gly Leu Val Leu Phe Leu Gly Ile Asn Asn Cys Ala Ser Ala 1235 1240 1245 Leu Gln Tyr Phe Glu Thr Glu Arg Asn Val Met Tyr Arg Glu Arg 1250 1255 1260 Phe Ala Gly Met Tyr Ser Ala Thr Ala Tyr Ala Leu Gly Gln Val 1265 1270 1275 Val Thr Glu Ile Pro Tyr Ile Phe Ile Gln Ala Ala Glu Phe Val 1280 1285 1290 Ile Val Thr Tyr Pro Met Ile Gly Phe Tyr Pro Ser Ala Tyr Lys 1295 1300 1305 Val Phe Trp Ser Leu Tyr Ser Met Phe Cys Ser Leu Leu Thr Phe 1310 1315 1320 Asn Tyr Leu Ala Met Phe Leu Val Ser Ile Thr Pro Asn Phe Met 1325 1330 1335 Val Ala Ala Ile Leu Gln Ser Leu Phe Tyr Val Gly Phe Asn Leu 1340 1345 1350 Phe Ser Gly Phe Leu Ile Pro Gln Thr Gln Val Pro Gly Trp Trp 1355 1360 1365 Ile Trp Leu Tyr Tyr Leu Thr Pro Thr Ser Trp Thr Leu Asn Gly 1370 1375 1380 Phe Ile Ser Ser Gln Tyr Gly Asp Ile His Glu Glu Ile Asn Val 1385 1390 1395 Phe Gly Gln Ser Thr Thr Val Ala Arg Phe Leu Lys Asp Tyr Phe 1400 1405 1410 Gly Phe His His Asp Leu Leu Ala Val Thr Ala Val Val Gln Ile 1415 1420 1425 Ala Phe Pro Ile Ala Leu Ala Ser Met Phe Ala Phe Phe Val Gly 1430 1435 1440 Lys Leu Asn Phe Gln Arg Arg 1445 1450 71446DNAArabidopsis thalianaCDS(1)..(1446) 7atg ggg aag caa gaa gat gca gag ctc gtc atc ata cct ttc cct ttc 48Met Gly Lys Gln Glu Asp Ala Glu Leu Val Ile Ile Pro Phe Pro Phe 1 5 10 15 tcc gga cac att ctc gca aca atc gaa ctc gcc aaa cgt ctc ata agt 96Ser Gly His Ile Leu Ala Thr Ile Glu Leu Ala Lys Arg Leu Ile Ser 20 25 30 caa gac aat cct cgg atc cac acc atc acc atc ctc tat tgg gga tta 144Gln Asp Asn Pro Arg Ile His Thr Ile Thr Ile Leu Tyr Trp Gly Leu 35 40 45 cct ttt att 
cct caa gct gac aca atc gct ttc ctc cga tcc cta gtc 192Pro Phe Ile Pro Gln Ala Asp Thr Ile Ala Phe Leu Arg Ser Leu Val 50 55 60 aaa aat gag cct cgt atc cgt ctc gtt acg ttg ccc gaa gtc caa gac 240Lys Asn Glu Pro Arg Ile Arg Leu Val Thr Leu Pro Glu Val Gln Asp 65 70 75 80 cct cca cca atg gaa ctc ttt gtg gaa ttt gcc gaa tct tac att ctt 288Pro Pro Pro Met Glu Leu Phe Val Glu Phe Ala Glu Ser Tyr Ile Leu 85 90 95 gaa tac gtc aag aaa atg gtt ccc atc atc aga gaa gct ctc tcc act 336Glu Tyr Val Lys Lys Met Val Pro Ile Ile Arg Glu Ala Leu Ser Thr 100 105 110 ctc ttg tct tcc cgc gat gaa tcg ggt tca gtt cgt gtg gct gga ttg 384Leu Leu Ser Ser Arg Asp Glu Ser Gly Ser Val Arg Val Ala Gly Leu 115 120 125 gtt ctt gac ttc ttc tgc gtc cct atg atc gat gta gga aac gag ttt 432Val Leu Asp Phe Phe Cys Val Pro Met Ile Asp Val Gly Asn Glu Phe 130 135 140 aat ctc cct tct tac att ttc ttg acg tgt agc gca ggg ttc ttg ggt 480Asn Leu Pro Ser Tyr Ile Phe Leu Thr Cys Ser Ala Gly Phe Leu Gly 145 150 155 160 atg atg aag tat ctt cca gag aga cac cgc gaa atc aaa tcg gaa ttc 528Met Met Lys Tyr Leu Pro Glu Arg His Arg Glu Ile Lys Ser Glu Phe 165 170 175 aac cgg agc ttc aac gag gag ttg aat ctc att cct ggt tat gtc aac 576Asn Arg Ser Phe Asn Glu Glu Leu Asn Leu Ile Pro Gly Tyr Val Asn 180 185 190 tct gtt cct act aag gtt ttg ccg tca ggt cta ttc atg aaa gag acc 624Ser Val Pro Thr Lys Val Leu Pro Ser Gly Leu Phe Met Lys Glu Thr 195 200 205 tac gag cct tgg gtc gaa cta gca gag agg ttt cct gaa gct aag ggt 672Tyr Glu Pro Trp Val Glu Leu Ala Glu Arg Phe Pro Glu Ala Lys Gly 210 215 220 att ttg gtt aat tca tac aca gct ctc gag cca aac ggt ttt aaa tat 720Ile Leu Val Asn Ser Tyr Thr Ala Leu Glu Pro Asn Gly Phe Lys Tyr 225 230 235 240 ttc gat cgt tgt ccg gat aac tac cca acc att tac cca atc ggg ccg 768Phe Asp Arg Cys Pro Asp Asn Tyr Pro Thr Ile Tyr Pro Ile Gly Pro 245 250 255 ata tta tgc tcc aac gac cgt ccg aat ttg gac tca tcg gaa cga gat 816Ile Leu Cys Ser Asn Asp Arg Pro Asn Leu Asp Ser Ser Glu Arg Asp 260 265 270 cgg atc 
ata act tgg cta gat gac caa ccc gag tca tcg gtc gtg ttc 864Arg Ile Ile Thr Trp Leu Asp Asp Gln Pro Glu Ser Ser Val Val Phe 275 280 285 ctc tgt ttc ggg agc ttg aag aat ctc agc gct act cag atc aac gag 912Leu Cys Phe Gly Ser Leu Lys Asn Leu Ser Ala Thr Gln Ile Asn Glu 290 295 300 ata gct caa gcc tta gag atc gtt gac tgc aaa ttc atc tgg tcg ttt 960Ile Ala Gln Ala Leu Glu Ile Val Asp Cys Lys Phe Ile Trp Ser Phe 305 310 315 320 cga acc aac ccg aag gag tac gcg agc cct tac gag gct cta cca cac 1008Arg Thr Asn Pro Lys Glu Tyr Ala Ser Pro Tyr Glu Ala Leu Pro His 325 330 335 ggg ttc atg gac cgg gtc atg gat caa ggc att gtt tgt ggt tgg gct 1056Gly Phe Met Asp Arg Val Met Asp Gln Gly Ile Val Cys Gly Trp Ala 340 345 350 cct caa gtt gaa atc cta gcc cat aaa gct gtg gga gga ttc gta tct 1104Pro Gln Val Glu Ile Leu Ala His Lys Ala Val Gly Gly Phe Val Ser 355 360 365 cat tgt ggt tgg aac tcg ata ttg gag agt ttg ggt ttc ggc gtt cca 1152His Cys Gly Trp Asn Ser Ile Leu Glu Ser Leu Gly Phe Gly Val Pro 370 375 380 atc gcc acg tgg ccg atg tac gcg gaa caa caa cta aac gcg ttc acg 1200Ile Ala Thr Trp Pro Met Tyr Ala Glu Gln Gln Leu Asn Ala Phe Thr 385 390 395 400 atg gtg aag gag ctt ggt tta gcc ttg gag atg cgg ttg gat tac gtg 1248Met Val Lys Glu Leu Gly Leu Ala Leu Glu Met Arg Leu Asp Tyr Val 405 410 415 tcg gaa gat gga gat ata gtg aaa gct gat gag atc gca gga acc gtt 1296Ser Glu Asp Gly Asp Ile Val Lys Ala Asp Glu Ile Ala Gly Thr Val 420 425 430 aga tct tta atg gac ggt gtg gat gtg ccg aag agt aaa gtg aag gag 1344Arg Ser Leu Met Asp Gly Val Asp Val Pro Lys Ser Lys Val Lys Glu 435 440 445 att gct gag gcg gga aaa gaa gct gtg gac ggt gga tct tcg ttt ctt 1392Ile Ala Glu Ala Gly Lys Glu Ala Val Asp Gly Gly Ser Ser Phe Leu 450 455 460 gcg gtt aaa aga ttc atc ggt gac ttg atc gac ggc gtt tct ata agt 1440Ala Val Lys Arg Phe Ile Gly Asp Leu Ile Asp Gly Val Ser Ile Ser 465 470 475 480 aag tag 1446Lys 8481PRTArabidopsis thaliana 8Met Gly Lys Gln Glu Asp Ala Glu Leu Val Ile Ile Pro Phe Pro Phe 1 5 10 15 Ser Gly 
His Ile Leu Ala Thr Ile Glu Leu Ala Lys Arg Leu Ile Ser 20 25 30 Gln Asp Asn Pro Arg Ile His Thr Ile Thr Ile Leu Tyr Trp Gly Leu 35 40 45 Pro Phe Ile Pro Gln Ala Asp Thr Ile Ala Phe Leu Arg Ser Leu Val 50 55 60 Lys Asn Glu Pro Arg Ile Arg Leu Val Thr Leu Pro Glu Val Gln Asp 65 70 75 80 Pro Pro Pro Met Glu Leu Phe Val Glu Phe Ala Glu Ser Tyr Ile Leu 85 90 95 Glu Tyr Val Lys Lys Met Val Pro Ile Ile Arg Glu Ala Leu Ser Thr 100 105 110 Leu Leu Ser Ser Arg Asp Glu Ser Gly Ser Val Arg Val Ala Gly Leu 115 120 125 Val Leu Asp Phe Phe Cys Val Pro Met Ile Asp Val Gly Asn Glu Phe 130 135 140 Asn Leu Pro Ser Tyr Ile Phe Leu Thr Cys Ser Ala Gly Phe Leu Gly 145 150 155 160 Met Met Lys Tyr Leu Pro Glu Arg His Arg Glu Ile Lys Ser Glu Phe 165 170 175 Asn Arg Ser Phe Asn Glu Glu Leu Asn Leu Ile Pro Gly Tyr Val Asn 180 185 190 Ser Val Pro Thr Lys Val Leu Pro Ser Gly Leu Phe Met Lys Glu Thr 195 200 205 Tyr Glu Pro Trp Val Glu Leu Ala Glu Arg Phe Pro Glu Ala Lys Gly 210 215 220 Ile Leu Val Asn Ser Tyr Thr Ala Leu Glu Pro Asn Gly Phe Lys Tyr 225 230 235 240 Phe Asp Arg Cys Pro Asp Asn Tyr Pro Thr Ile Tyr Pro Ile Gly Pro 245 250 255 Ile Leu Cys Ser Asn Asp Arg Pro Asn Leu Asp Ser Ser Glu Arg Asp 260 265 270 Arg Ile Ile Thr Trp Leu Asp Asp Gln Pro Glu Ser Ser Val Val Phe 275 280 285 Leu Cys Phe Gly Ser Leu Lys Asn Leu Ser Ala Thr Gln Ile Asn Glu 290
295 300 Ile Ala Gln Ala Leu Glu Ile Val Asp Cys Lys Phe Ile Trp Ser Phe 305 310 315 320 Arg Thr Asn Pro Lys Glu Tyr Ala Ser Pro Tyr Glu Ala Leu Pro His 325 330 335 Gly Phe Met Asp Arg Val Met Asp Gln Gly Ile Val Cys Gly Trp Ala 340 345 350 Pro Gln Val Glu Ile Leu Ala His Lys Ala Val Gly Gly Phe Val Ser 355 360 365 His Cys Gly Trp Asn Ser Ile Leu Glu Ser Leu Gly Phe Gly Val Pro 370 375 380 Ile Ala Thr Trp Pro Met Tyr Ala Glu Gln Gln Leu Asn Ala Phe Thr 385 390 395 400 Met Val Lys Glu Leu Gly Leu Ala Leu Glu Met Arg Leu Asp Tyr Val 405 410 415 Ser Glu Asp Gly Asp Ile Val Lys Ala Asp Glu Ile Ala Gly Thr Val 420 425 430 Arg Ser Leu Met Asp Gly Val Asp Val Pro Lys Ser Lys Val Lys Glu 435 440 445 Ile Ala Glu Ala Gly Lys Glu Ala Val Asp Gly Gly Ser Ser Phe Leu 450 455 460 Ala Val Lys Arg Phe Ile Gly Asp Leu Ile Asp Gly Val Ser Ile Ser 465 470 475 480 Lys 91086DNABrassica rapaCDS(1)..(1086) 9atg gct cca aca ctc tct acc tta cag ttc gca gat cca gct gaa gta 48Met Ala Pro Thr Leu Ser Thr Leu Gln Phe Ala Asp Pro Ala Glu Val 1 5 10 15 acc gag ttc gtg gtc aac aaa gga aac ggc gta aag ggt tta tca gaa 96Thr Glu Phe Val Val Asn Lys Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 aca ggg atc aaa gct ctt ccc gac caa tac att caa cca ttc gaa gag 144Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Phe Glu Glu 35 40 45 cgt ctc atc aac aag ttc gtc aac gaa aca gac gag gcc att ccc gtc 192Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 atc gac atg tcc tac ccc gaa gag gac aaa gtc gct gaa gct gta tgt 240Ile Asp Met Ser Tyr Pro Glu Glu Asp Lys Val Ala Glu Ala Val Cys 65 70 75 80 gac gct gct gag aga tgg ggt ttc ttt caa gtg atc aac cat gga gtt 288Asp Ala Ala Glu Arg Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 cct ctt gaa gtt ctt gac aac gtg aag gct gcg act cat agg ttc ttt 336Pro Leu Glu Val Leu Asp Asn Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 aat ctc cct gtt gag gag aag agt agg ttc aca agg gag aac tcg ttg 384Asn Leu Pro Val Glu Glu Lys Ser Arg Phe Thr Arg 
Glu Asn Ser Leu 115 120 125 tcg acg aat gta agg ttt gga acg agt ttt agt cct cgt gca gag aaa 432Ser Thr Asn Val Arg Phe Gly Thr Ser Phe Ser Pro Arg Ala Glu Lys 130 135 140 gct ctt gag tgg aaa gat tat ctc agt ctc ttc ttt gtt tct gaa act 480Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Thr 145 150 155 160 gaa gct gaa cag tac tgg cct aat gct tgc aag aac gaa gct cta gag 528Glu Ala Glu Gln Tyr Trp Pro Asn Ala Cys Lys Asn Glu Ala Leu Glu 165 170 175 tac atg aac aag tcc aag aca atg gtg agg aag ctt tta gag tat tta 576Tyr Met Asn Lys Ser Lys Thr Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 ggg aag aat ctc aac gtg aag gag cta gac gag acc aaa gaa tca ctc 624Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 ttc atg ggt tca att cga atc aac ctc aac tac tat ccc atc tgt cct 672Phe Met Gly Ser Ile Arg Ile Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 agt ccc gac cta acc gtt ggc gtt ggt cga cac tca gat gtc tct tcc 720Ser Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 ctc acc att ctc tta caa gac cag atc ggt ggc ctc cac gtg cgt tct 768Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 cta acg tca ggg aac tgg gtt cac gtg cca ccg gtt cct gga tct ttc 816Leu Thr Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 gtg atc aac atc gga gac gcc atg cag atc ttg agc aat ggt cgt tac 864Val Ile Asn Ile Gly Asp Ala Met Gln Ile Leu Ser Asn Gly Arg Tyr 275 280 285 aag agc gtg gag cat cgt gtc tta gcc aac ggt agc aac aac aga atc 912Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Ser Asn Asn Arg Ile 290 295 300 tct gtt cct atc ttc gtg aat cca aaa cca gag tct gtg att ggt cct 960Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 ctt act gag gtg gtc tca aat gga gag gaa ccc gtt tat aga gac gtt 1008Leu Thr Glu Val Val Ser Asn Gly Glu Glu Pro Val Tyr Arg Asp Val 325 330 335 gtg tac tct gat tac gtc aga tac ttt ttc aag aag gcg cac gac gga 1056Val Tyr Ser Asp Tyr Val Arg Tyr 
Phe Phe Lys Lys Ala His Asp Gly 340 345 350 aag aaa acc atc gat ttc gcg aag att tga 1086Lys Lys Thr Ile Asp Phe Ala Lys Ile 355 360 10361PRTBrassica rapa 10Met Ala Pro Thr Leu Ser Thr Leu Gln Phe Ala Asp Pro Ala Glu Val 1 5 10 15 Thr Glu Phe Val Val Asn Lys Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Phe Glu Glu 35 40 45 Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 Ile Asp Met Ser Tyr Pro Glu Glu Asp Lys Val Ala Glu Ala Val Cys 65 70 75 80 Asp Ala Ala Glu Arg Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 Pro Leu Glu Val Leu Asp Asn Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 Asn Leu Pro Val Glu Glu Lys Ser Arg Phe Thr Arg Glu Asn Ser Leu 115 120 125 Ser Thr Asn Val Arg Phe Gly Thr Ser Phe Ser Pro Arg Ala Glu Lys 130 135 140 Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Thr 145 150 155 160 Glu Ala Glu Gln Tyr Trp Pro Asn Ala Cys Lys Asn Glu Ala Leu Glu 165 170 175 Tyr Met Asn Lys Ser Lys Thr Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 Phe Met Gly Ser Ile Arg Ile Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 Ser Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 Leu Thr Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 Val Ile Asn Ile Gly Asp Ala Met Gln Ile Leu Ser Asn Gly Arg Tyr 275 280 285 Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Ser Asn Asn Arg Ile 290 295 300 Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 Leu Thr Glu Val Val Ser Asn Gly Glu Glu Pro Val Tyr Arg Asp Val 325 330 335 Val Tyr Ser Asp Tyr Val Arg Tyr Phe Phe Lys Lys Ala His Asp Gly 340 345 350 Lys Lys Thr Ile Asp Phe Ala Lys Ile 355 360 111086DNABrassica rapaCDS(1)..(1086) 11atg gct cca aca gtc tct aca acc caa ttc tcg gac cca gct gaa gta 48Met Ala Pro Thr Val Ser 
Thr Thr Gln Phe Ser Asp Pro Ala Glu Val 1 5 10 15 acc gag ttc gtt gtc aac caa gga aac ggc gta aag ggt ttg tca gaa 96Thr Glu Phe Val Val Asn Gln Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 aca gga ata aaa gct ctt cca gat caa tac att caa cca ttc gaa gaa 144Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Phe Glu Glu 35 40 45 cgt ctc atc aac aat ttc gtc aac gag aca gac gaa gcc att cct gtc 192Arg Leu Ile Asn Asn Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 atc gac atg tcg tac ccc gac gag agc aaa gtg gct aaa gct atc tgt 240Ile Asp Met Ser Tyr Pro Asp Glu Ser Lys Val Ala Lys Ala Ile Cys 65 70 75 80 gac gct gct gag aaa tgg ggt ttc ttt caa gtg atc aac cat gga gtt 288Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 cct ttg gaa gtt ctt gac aac gtg aag gcc gct act cac aga ttc ttc 336Pro Leu Glu Val Leu Asp Asn Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 aat ctt cct gta gaa gag aag agc aaa tac aca aag gag aat tct ctg 384Asn Leu Pro Val Glu Glu Lys Ser Lys Tyr Thr Lys Glu Asn Ser Leu 115 120 125 tcg acc aat gtt agg ttc ggt acg agt ttc agt cct cgt gca gag aag 432Ser Thr Asn Val Arg Phe Gly Thr Ser Phe Ser Pro Arg Ala Glu Lys 130 135 140 gct ttg gag tgg aaa gat tat ctc agt ctc ttc ttt gtc tct gaa act 480Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Thr 145 150 155 160 gaa gca tca cag ttt tgg cct gat gtt tgc aag aat gaa gct cta gac 528Glu Ala Ser Gln Phe Trp Pro Asp Val Cys Lys Asn Glu Ala Leu Asp 165 170 175 tac atg aac aag tcc aag aca atg gtg agg aag ctt cta gag tat ttg 576Tyr Met Asn Lys Ser Lys Thr Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 ggg aag aac ctc aat gtg aaa gag cta gac gag acc aaa gag tca ctc 624Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 ttc atg ggt tcg att cga gtc aac ctc aac tac tat ccc atc tgt cct 672Phe Met Gly Ser Ile Arg Val Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 aac cct gac cta acc gtt ggc gtt ggc cgc cac tct gac gtc tct tcc 720Asn Pro Asp Leu Thr Val Gly Val 
Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 ctc acc att gtc tta caa gac cag atc gat ggt ctc cac gtg cgt tct 768Leu Thr Ile Val Leu Gln Asp Gln Ile Asp Gly Leu His Val Arg Ser 245 250 255 ctg gtg tca ggg aac tgg gtt cac gtg cca ccg gtt ccc gga tct ttc 816Leu Val Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 gtg atc aac atc gga gac acc atg cag atc ttg agc aat ggt cgt tac 864Val Ile Asn Ile Gly Asp Thr Met Gln Ile Leu Ser Asn Gly Arg Tyr 275 280 285 aag agc gtg gag cct cgt gtc tta gct aac ggt agc aac aac aga atc 912Lys Ser Val Glu Pro Arg Val Leu Ala Asn Gly Ser Asn Asn Arg Ile 290 295 300 tcg gta cct atc ttt gtg aat cca aaa cca gag tca gtg att ggt cct 960Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 ctt ctc gag gtg ata gca aat gga gag gaa ccg atc gat aga gac gtc 1008Leu Leu Glu Val Ile Ala Asn Gly Glu Glu Pro Ile Asp Arg Asp Val 325 330 335 gtg tac tct gat tac gtt agg tac ttc ttc aag aag gca cat gat gga 1056Val Tyr Ser Asp Tyr Val Arg Tyr Phe Phe Lys Lys Ala His Asp Gly 340 345 350 aag aag acc gtt gat ttt gcc aag ata tga 1086Lys Lys Thr Val Asp Phe Ala Lys Ile 355 360 12361PRTBrassica rapa 12Met Ala Pro Thr Val Ser Thr Thr Gln Phe Ser Asp Pro Ala Glu Val 1 5 10 15 Thr Glu Phe Val Val Asn Gln Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Phe Glu Glu 35 40 45 Arg Leu Ile Asn Asn Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 Ile Asp Met Ser Tyr Pro Asp Glu Ser Lys Val Ala Lys Ala Ile Cys 65 70 75 80 Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 Pro Leu Glu Val Leu Asp Asn Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 Asn Leu Pro Val Glu Glu Lys Ser Lys Tyr Thr Lys Glu Asn Ser Leu 115 120 125 Ser Thr Asn Val Arg Phe Gly Thr Ser Phe Ser Pro Arg Ala Glu Lys 130 135 140 Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Thr 145 150 155 160 Glu Ala Ser Gln Phe Trp Pro Asp Val Cys Lys Asn Glu Ala Leu Asp 165 170 175 Tyr 
Met Asn Lys Ser Lys Thr Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 Phe Met Gly Ser Ile Arg Val Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 Asn Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 Leu Thr Ile Val Leu Gln Asp Gln Ile Asp Gly Leu His Val Arg Ser 245 250 255 Leu Val Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 Val Ile Asn Ile Gly Asp Thr Met Gln Ile Leu Ser Asn Gly Arg Tyr 275 280 285 Lys Ser Val Glu Pro Arg Val Leu Ala Asn Gly Ser Asn Asn Arg Ile 290 295 300 Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 Leu Leu Glu Val Ile Ala Asn Gly Glu Glu Pro Ile Asp Arg Asp Val 325 330 335 Val Tyr Ser Asp Tyr Val Arg Tyr Phe Phe Lys Lys Ala His Asp Gly 340 345 350 Lys Lys Thr Val Asp Phe Ala Lys Ile 355 360 131086DNABrassica rapaCDS(1)..(1086) 13atg gct cca act ctc tct acc gct aac ttc gca gac cca gct gaa gta 48Met Ala Pro Thr Leu Ser Thr Ala Asn Phe Ala Asp Pro Ala Glu Val 1 5 10 15 acc gag ttc gtg gtc aac aaa ggc aat ggc gta aag ggt ttg tca gaa 96Thr Glu Phe Val Val Asn Lys Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 aca gga atc aaa gct ctt ccg gac caa tac att caa cca ttt gaa gag 144Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Phe Glu Glu 35 40 45 cgt ctc atc aac aag ttc gtc aac gag aca gac gaa gct att cca gtc 192Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 atc gac atg tcg gac cct gat gag aac aaa gtc gct gaa gct atc tgt
240Ile Asp Met Ser Asp Pro Asp Glu Asn Lys Val Ala Glu Ala Ile Cys 65 70 75 80 gac gct gct gag aaa tgg ggt ttc ttt cag gtg atc aac cat gga gtt 288Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 cct ttg gat gtt ctt gac aac gtg aag gct gcg act cac agg ttc ttt 336Pro Leu Asp Val Leu Asp Asn Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 aat ctt cct gtt gag gag aag agc agg ttc aca aag gag aat tct ctg 384Asn Leu Pro Val Glu Glu Lys Ser Arg Phe Thr Lys Glu Asn Ser Leu 115 120 125 acg acc aat gtt agg ttc ggt act agt ttc agt cct cgt gct gag aag 432Thr Thr Asn Val Arg Phe Gly Thr Ser Phe Ser Pro Arg Ala Glu Lys 130 135 140 gct ctc gag tgg aaa gat tat ctc agt ctc ttc ttt gtg tcc gaa gcc 480Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Ala 145 150 155 160 gaa gct gaa cag ttt tgg cct gat gtt tgc aag aat gaa gct cta gag 528Glu Ala Glu Gln Phe Trp Pro Asp Val Cys Lys Asn Glu Ala Leu Glu 165 170 175 tac atg aac aag tcc aag aca atg gtg cgg aag ctt cta gag tat tta 576Tyr Met Asn Lys Ser Lys Thr Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 gga aaa aat ctc aac gtg aaa gag cta gac gag acc aaa gaa tca ctc 624Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 ttc atg ggc tca atc cga gtc aac ctc aac tac tat ccc atc tgt cct 672Phe Met Gly Ser Ile Arg Val Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 aac cct gac cta acc gtt ggc gtt ggt cgt cac tca gac gtc tct tcc 720Asn Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 ctc acc att ctc tta caa gac caa atc ggt ggc ctc cac gtg cgt tct 768Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 cta tct tca ggg aac tgg gtt cac gtg cca ccg gtt cct gga tcc ttt 816Leu Ser Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 gtc atc aac ata gga gac gcc atg cag atc ttg agc aac ggt cgt tac 864Val Ile Asn Ile Gly Asp Ala Met Gln Ile Leu Ser Asn Gly Arg Tyr 275 280 285 aag agc gtg gag cat cgt gtc tta gct aac ggt agt aac aac 
aga atc 912Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Ser Asn Asn Arg Ile 290 295 300 tct gtt cct atc ttt gtg aat cca aaa cca gag tca gtg att ggt cct 960Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 ctc cct gag gtg gtc gca aat ggt gag gaa ccg att tat aaa gac gtt 1008Leu Pro Glu Val Val Ala Asn Gly Glu Glu Pro Ile Tyr Lys Asp Val 325 330 335 gtg tac tct gat tac gtc agg tac ttc ttc aag aag gca cat gat gga 1056Val Tyr Ser Asp Tyr Val Arg Tyr Phe Phe Lys Lys Ala His Asp Gly 340 345 350 aag aaa acc gtt gac ttc gcc aag ata tga 1086Lys Lys Thr Val Asp Phe Ala Lys Ile 355 360 14361PRTBrassica rapa 14Met Ala Pro Thr Leu Ser Thr Ala Asn Phe Ala Asp Pro Ala Glu Val 1 5 10 15 Thr Glu Phe Val Val Asn Lys Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Phe Glu Glu 35 40 45 Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 Ile Asp Met Ser Asp Pro Asp Glu Asn Lys Val Ala Glu Ala Ile Cys 65 70 75 80 Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 Pro Leu Asp Val Leu Asp Asn Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 Asn Leu Pro Val Glu Glu Lys Ser Arg Phe Thr Lys Glu Asn Ser Leu 115 120 125 Thr Thr Asn Val Arg Phe Gly Thr Ser Phe Ser Pro Arg Ala Glu Lys 130 135 140 Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Ala 145 150 155 160 Glu Ala Glu Gln Phe Trp Pro Asp Val Cys Lys Asn Glu Ala Leu Glu 165 170 175 Tyr Met Asn Lys Ser Lys Thr Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 Phe Met Gly Ser Ile Arg Val Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 Asn Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 Leu Ser Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 Val Ile Asn Ile Gly Asp Ala Met Gln Ile Leu Ser Asn Gly Arg Tyr 275 280 285 Lys 
Ser Val Glu His Arg Val Leu Ala Asn Gly Ser Asn Asn Arg Ile 290 295 300 Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 Leu Pro Glu Val Val Ala Asn Gly Glu Glu Pro Ile Tyr Lys Asp Val 325 330 335 Val Tyr Ser Asp Tyr Val Arg Tyr Phe Phe Lys Lys Ala His Asp Gly 340 345 350 Lys Lys Thr Val Asp Phe Ala Lys Ile 355 360 151086DNAArabidopsis thalianaCDS(1)..(1086) 15atg aat caa aca ctc gct gcc caa ttc tta acc cga gac caa gtc acc 48Met Asn Gln Thr Leu Ala Ala Gln Phe Leu Thr Arg Asp Gln Val Thr 1 5 10 15 aac ttt gtt gta cac gaa ggt aac ggt gtt aaa ggc ttg tct gag acc 96Asn Phe Val Val His Glu Gly Asn Gly Val Lys Gly Leu Ser Glu Thr 20 25 30 gga atc aaa gtt ctt cct gac caa tac att cag cca ttc gaa gag aga 144Gly Ile Lys Val Leu Pro Asp Gln Tyr Ile Gln Pro Phe Glu Glu Arg 35 40 45 ctg atc aac ttc cac gta aaa gag gat tca gac gaa tcc ata ccc gtg 192Leu Ile Asn Phe His Val Lys Glu Asp Ser Asp Glu Ser Ile Pro Val 50 55 60 atc gac ata tca aat tta gac gag aag agt gtc tcc aag gcc gta tgt 240Ile Asp Ile Ser Asn Leu Asp Glu Lys Ser Val Ser Lys Ala Val Cys 65 70 75 80 gat gct gca gaa gaa tgg ggt ttc ttt cag gtg atc aac cat ggc gtg 288Asp Ala Ala Glu Glu Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 tca atg gaa gtt ctt gag aat atg aaa aca gct act cac aga ttc ttc 336Ser Met Glu Val Leu Glu Asn Met Lys Thr Ala Thr His Arg Phe Phe 100 105 110 ggt tta ccg gta gaa gag aaa aga aag ttc tca aga gag aag tct ttg 384Gly Leu Pro Val Glu Glu Lys Arg Lys Phe Ser Arg Glu Lys Ser Leu 115 120 125 tca acg aat gtg aga ttc ggg acg agt ttt agt cct cat gct gag aaa 432Ser Thr Asn Val Arg Phe Gly Thr Ser Phe Ser Pro His Ala Glu Lys 130 135 140 gct ctc gag tgg aaa gat tat ctg agc ctc ttc ttt gtc tct gaa gct 480Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Ala 145 150 155 160 gaa gca tca caa ctc tgg cct gac tct tgc agg agt gaa acg cta gaa 528Glu Ala Ser Gln Leu Trp Pro Asp Ser Cys Arg Ser Glu Thr Leu Glu 165 170 175 tac atg aac gag aca aaa cct cta gtg aag aaa 
ctc tta cgg ttt cta 576Tyr Met Asn Glu Thr Lys Pro Leu Val Lys Lys Leu Leu Arg Phe Leu 180 185 190 ggc gag aat ctg aac gtg aaa gag cta gac aag acc aaa gag tca ttc 624Gly Glu Asn Leu Asn Val Lys Glu Leu Asp Lys Thr Lys Glu Ser Phe 195 200 205 ttc atg ggt tca aca cgt atc aac ctc aac tat tac cct att tgt ccc 672Phe Met Gly Ser Thr Arg Ile Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 aat cca gaa ctc acg gtt gga gtc gga cgt cac tct gat gtt tcc tca 720Asn Pro Glu Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 ctc aca atc ctc tta caa gac gag atc ggt ggt ctc cac gtt cgt tct 768Leu Thr Ile Leu Leu Gln Asp Glu Ile Gly Gly Leu His Val Arg Ser 245 250 255 ctc acc acg ggg aga tgg gtt cac gtg cct cca atc tcc gga tct tta 816Leu Thr Thr Gly Arg Trp Val His Val Pro Pro Ile Ser Gly Ser Leu 260 265 270 gtc att aac att gga gac gct atg caa atc atg agt aat ggt cgt tac 864Val Ile Asn Ile Gly Asp Ala Met Gln Ile Met Ser Asn Gly Arg Tyr 275 280 285 aag agt gtt gag cat cgt gtc tta gct aac ggt tct tat aac aga atc 912Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Ser Tyr Asn Arg Ile 290 295 300 tct gtt cct att ttc gtg agc ccg aaa cca gag tct gtg atc ggt cct 960Ser Val Pro Ile Phe Val Ser Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 ctt ctt gaa gtg atc gaa aat gga gag aaa ccg gtt tat aaa gat att 1008Leu Leu Glu Val Ile Glu Asn Gly Glu Lys Pro Val Tyr Lys Asp Ile 325 330 335 ctt tat acc gat tac gtg aaa cat ttc ttc aga aaa gct cat gat ggg 1056Leu Tyr Thr Asp Tyr Val Lys His Phe Phe Arg Lys Ala His Asp Gly 340 345 350 aag aaa acc atc gat ttt gcc aac att tga 1086Lys Lys Thr Ile Asp Phe Ala Asn Ile 355 360 16361PRTArabidopsis thaliana 16Met Asn Gln Thr Leu Ala Ala Gln Phe Leu Thr Arg Asp Gln Val Thr 1 5 10 15 Asn Phe Val Val His Glu Gly Asn Gly Val Lys Gly Leu Ser Glu Thr 20 25 30 Gly Ile Lys Val Leu Pro Asp Gln Tyr Ile Gln Pro Phe Glu Glu Arg 35 40 45 Leu Ile Asn Phe His Val Lys Glu Asp Ser Asp Glu Ser Ile Pro Val 50 55 60 Ile Asp Ile Ser Asn Leu Asp Glu Lys Ser Val Ser 
Lys Ala Val Cys 65 70 75 80 Asp Ala Ala Glu Glu Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 Ser Met Glu Val Leu Glu Asn Met Lys Thr Ala Thr His Arg Phe Phe 100 105 110 Gly Leu Pro Val Glu Glu Lys Arg Lys Phe Ser Arg Glu Lys Ser Leu 115 120 125 Ser Thr Asn Val Arg Phe Gly Thr Ser Phe Ser Pro His Ala Glu Lys 130 135 140 Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Ala 145 150 155 160 Glu Ala Ser Gln Leu Trp Pro Asp Ser Cys Arg Ser Glu Thr Leu Glu 165 170 175 Tyr Met Asn Glu Thr Lys Pro Leu Val Lys Lys Leu Leu Arg Phe Leu 180 185 190 Gly Glu Asn Leu Asn Val Lys Glu Leu Asp Lys Thr Lys Glu Ser Phe 195 200 205 Phe Met Gly Ser Thr Arg Ile Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 Asn Pro Glu Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 Leu Thr Ile Leu Leu Gln Asp Glu Ile Gly Gly Leu His Val Arg Ser 245 250 255 Leu Thr Thr Gly Arg Trp Val His Val Pro Pro Ile Ser Gly Ser Leu 260 265 270 Val Ile Asn Ile Gly Asp Ala Met Gln Ile Met Ser Asn Gly Arg Tyr 275 280 285 Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Ser Tyr Asn Arg Ile 290 295 300 Ser Val Pro Ile Phe Val Ser Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 Leu Leu Glu Val Ile Glu Asn Gly Glu Lys Pro Val Tyr Lys Asp Ile 325 330 335 Leu Tyr Thr Asp Tyr Val Lys His Phe Phe Arg Lys Ala His Asp Gly 340 345 350 Lys Lys Thr Ile Asp Phe Ala Asn Ile 355 360 171086DNAArabidopsis lyratamisc_featuresubsp. 
lyrataCDS(1)..(1086) 17atg gct cca aca ctc tca aca acc caa ttc tca aac cca gct gaa gta 48Met Ala Pro Thr Leu Ser Thr Thr Gln Phe Ser Asn Pro Ala Glu Val 1 5 10 15 acc gac ttc gta gtc cac aaa gga aat ggt gta aag ggt tta tca gaa 96Thr Asp Phe Val Val His Lys Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 aca gga atc aaa gct ctt cca gat caa tac atc cag cca ttt gaa gaa 144Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Phe Glu Glu 35 40 45 cga ctc atc aac aaa ttc gtc aac gaa aca gac gaa gcc att ccg gtg 192Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 atc gat atg tcg aac cct gac gag aac aga gtc gct gaa gct gtc tgt 240Ile Asp Met Ser Asn Pro Asp Glu Asn Arg Val Ala Glu Ala Val Cys 65 70 75 80 gat gct gct gag aaa tgg ggt ttc ttt caa gtg atc aac cat gga gtc 288Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 cct ttg gaa gtt ctt gac gat gtt aag gcg gcg act cac aga ttc ttc 336Pro Leu Glu Val Leu Asp Asp Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 aat ctc cct gtt gaa gag aag tgc aaa ttc act aaa gag aat tct ctg 384Asn Leu Pro Val Glu Glu Lys Cys Lys Phe Thr Lys Glu Asn Ser Leu 115 120 125 tcg acg act gtt agg ttt ggg acg agt ttt agt cct ctt gca gag caa 432Ser Thr Thr Val Arg Phe Gly Thr Ser Phe Ser Pro Leu Ala Glu Gln 130 135 140 gct ctc gag tgg aaa gat tat ctc agt ctc ttc ttt gtc tct gaa gct 480Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Ala 145 150 155 160 gaa gct gaa cag ttc tgg cct gat atc tgc agg aat gaa acg tta gag 528Glu Ala Glu Gln Phe Trp Pro Asp Ile Cys Arg Asn Glu Thr Leu Glu 165 170 175 tac att gac aag tca aag aag atg gtg agg aag ctt cta gag tat ttg 576Tyr Ile Asp Lys Ser Lys Lys Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 ggg aag aat ctc aac gtg aag gag cta gac gag acg aaa gaa tca ctc 624Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 ttt atg ggt tcg att cga gtc aac ctc aac tac tat ccg att tgt cct 672Phe Met Gly Ser Ile Arg Val Asn Leu Asn Tyr Tyr Pro Ile Cys 
Pro 210 215 220 aac ccg gac cta acc gtt ggt gtt ggt cgc cac tca gac gtc tct tct 720Asn Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 ctc acc atc ctc tta caa gac cag atc ggt ggt cta
cac gtg cgt tct 768Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 ctg gca tca ggg aac tgg gtt cac gtg cca ccg gtt ccc ggg tct ttt 816Leu Ala Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 gtg atc aac atc gga gat gcg atg cag atc ttg agc aat ggt cgg tac 864Val Ile Asn Ile Gly Asp Ala Met Gln Ile Leu Ser Asn Gly Arg Tyr 275 280 285 aag agc gtg gag cat cgt gtc tta gcc aac ggt aac aat aac aga atc 912Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Asn Asn Asn Arg Ile 290 295 300 tct gtt cct atc ttt gtg aat cca aaa cca gag tca gtg att ggt cct 960Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 cta cct gag gtg att gca aac gga gag gaa ccg att tac aga gac gtc 1008Leu Pro Glu Val Ile Ala Asn Gly Glu Glu Pro Ile Tyr Arg Asp Val 325 330 335 ctg tac tct gat tac gtc aag tat ttc ttc agg aag gca cac gat gga 1056Leu Tyr Ser Asp Tyr Val Lys Tyr Phe Phe Arg Lys Ala His Asp Gly 340 345 350 aag aaa acc gtc gat tac gcc aag atc tga 1086Lys Lys Thr Val Asp Tyr Ala Lys Ile 355 360 18361PRTArabidopsis lyrata 18Met Ala Pro Thr Leu Ser Thr Thr Gln Phe Ser Asn Pro Ala Glu Val 1 5 10 15 Thr Asp Phe Val Val His Lys Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Phe Glu Glu 35 40 45 Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 Ile Asp Met Ser Asn Pro Asp Glu Asn Arg Val Ala Glu Ala Val Cys 65 70 75 80 Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 Pro Leu Glu Val Leu Asp Asp Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 Asn Leu Pro Val Glu Glu Lys Cys Lys Phe Thr Lys Glu Asn Ser Leu 115 120 125 Ser Thr Thr Val Arg Phe Gly Thr Ser Phe Ser Pro Leu Ala Glu Gln 130 135 140 Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Ala 145 150 155 160 Glu Ala Glu Gln Phe Trp Pro Asp Ile Cys Arg Asn Glu Thr Leu Glu 165 170 175 Tyr Ile Asp Lys Ser Lys Lys Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 Gly Lys Asn Leu Asn 
Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 Phe Met Gly Ser Ile Arg Val Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 Asn Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 Leu Ala Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 Val Ile Asn Ile Gly Asp Ala Met Gln Ile Leu Ser Asn Gly Arg Tyr 275 280 285 Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Asn Asn Asn Arg Ile 290 295 300 Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 Leu Pro Glu Val Ile Ala Asn Gly Glu Glu Pro Ile Tyr Arg Asp Val 325 330 335 Leu Tyr Ser Asp Tyr Val Lys Tyr Phe Phe Arg Lys Ala His Asp Gly 340 345 350 Lys Lys Thr Val Asp Tyr Ala Lys Ile 355 360 191086DNAArabidopsis lyratamisc_featuresubsp. lyrataCDS(1)..(1086) 19atg gct cca aca ctc tca aca acc caa ttc tca aac cca gct gaa gta 48Met Ala Pro Thr Leu Ser Thr Thr Gln Phe Ser Asn Pro Ala Glu Val 1 5 10 15 acc gac ttc gta gtt cac aaa gga aat ggt gta aag ggt tta tca gaa 96Thr Asp Phe Val Val His Lys Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 act gga atc aaa gct ctt cca gat caa tac atc cag cca ctt gaa gaa 144Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Leu Glu Glu 35 40 45 cga ctc atc aac aaa ttc gtc aac gaa aca gat gaa gcc att ccg gtg 192Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 atc gat atg tcg agc cct gac gag aac aga gtc gct gaa gct gtc tgt 240Ile Asp Met Ser Ser Pro Asp Glu Asn Arg Val Ala Glu Ala Val Cys 65 70 75 80 gat gct gct gag aaa tgg ggt ttc ttt caa gtt atc aat cat gga gtc 288Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 cct ttg gaa gtt ctt gac gac gtg aag gct gcg act cac aga ttc ttc 336Pro Leu Glu Val Leu Asp Asp Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 aat ctc cct gtt gaa gag aag tgc aaa ttc act aaa gag aat tct ctg 384Asn Leu Pro Val Glu Glu Lys Cys Lys Phe Thr Lys Glu Asn Ser Leu 115 120 125 tcg acg 
aat gtt agg ttt ggg acg agt ttt agt ccc ctt gca gag aaa 432Ser Thr Asn Val Arg Phe Gly Thr Ser Phe Ser Pro Leu Ala Glu Lys 130 135 140 tct ctc gag tgg aaa gat tat ctc agt ctc ttc ttt gtc tct gaa gct 480Ser Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Ala 145 150 155 160 gaa gct gaa cag ttc tgg cct gat atc tgc agg aat gaa aca tta gag 528Glu Ala Glu Gln Phe Trp Pro Asp Ile Cys Arg Asn Glu Thr Leu Glu 165 170 175 tac atg aac aag tca aag aag atg gtg agg aag ctt cta gag tat ttg 576Tyr Met Asn Lys Ser Lys Lys Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 ggg aag aat ctc aat gtt aaa gag ctc gac gag acg aaa gaa tca ctc 624Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 ttt atg ggt tcg att cga gtc aac ctc aac tac tat ccg atc tgc cct 672Phe Met Gly Ser Ile Arg Val Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 aac ccg gac cta acc gtc ggt gtt ggt cgc cac tca gac gtc tct tct 720Asn Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 ctc act att ctc tta caa gat cag atc ggc ggt cta cac gtg cgt tct 768Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 ctg gcg tca ggg aac tgg gtt cac gtg cca ccg gtt ccc gga tct ttt 816Leu Ala Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 gtg atc aac atc gga gat gcg atg cag atc ttg agc aat ggt cgg tac 864Val Ile Asn Ile Gly Asp Ala Met Gln Ile Leu Ser Asn Gly Arg Tyr 275 280 285 aag agc gtg gag cat cgt gtc tta gcc aat ggc aac aat aac aga atc 912Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Asn Asn Asn Arg Ile 290 295 300 tct gtt cct atc ttt gtg aat cca aaa cca gag tca gtg att ggt cct 960Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 cta cct gag gtg att gca aat gga gag gaa ccg att tac aga gac gtc 1008Leu Pro Glu Val Ile Ala Asn Gly Glu Glu Pro Ile Tyr Arg Asp Val 325 330 335 ctg tac tct gat tac gtc agg tat ttc ttc agg aag gca cac gac gga 1056Leu Tyr Ser Asp Tyr Val Arg Tyr Phe Phe Arg Lys Ala His Asp Gly 340 
345 350 aag aaa acc gtc gat tac gcc aag atc tga 1086Lys Lys Thr Val Asp Tyr Ala Lys Ile 355 360 20361PRTArabidopsis lyrata 20Met Ala Pro Thr Leu Ser Thr Thr Gln Phe Ser Asn Pro Ala Glu Val 1 5 10 15 Thr Asp Phe Val Val His Lys Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Leu Glu Glu 35 40 45 Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 Ile Asp Met Ser Ser Pro Asp Glu Asn Arg Val Ala Glu Ala Val Cys 65 70 75 80 Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Ile Asn His Gly Val 85 90 95 Pro Leu Glu Val Leu Asp Asp Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 Asn Leu Pro Val Glu Glu Lys Cys Lys Phe Thr Lys Glu Asn Ser Leu 115 120 125 Ser Thr Asn Val Arg Phe Gly Thr Ser Phe Ser Pro Leu Ala Glu Lys 130 135 140 Ser Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Glu Ala 145 150 155 160 Glu Ala Glu Gln Phe Trp Pro Asp Ile Cys Arg Asn Glu Thr Leu Glu 165 170 175 Tyr Met Asn Lys Ser Lys Lys Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 Phe Met Gly Ser Ile Arg Val Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 Asn Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 Leu Ala Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 Val Ile Asn Ile Gly Asp Ala Met Gln Ile Leu Ser Asn Gly Arg Tyr 275 280 285 Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Asn Asn Asn Arg Ile 290 295 300 Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 Leu Pro Glu Val Ile Ala Asn Gly Glu Glu Pro Ile Tyr Arg Asp Val 325 330 335 Leu Tyr Ser Asp Tyr Val Arg Tyr Phe Phe Arg Lys Ala His Asp Gly 340 345 350 Lys Lys Thr Val Asp Tyr Ala Lys Ile 355 360 211086DNACapsella rubellaCDS(1)..(1086) 21atg gct cct act ctc tcc aca gct cag ttc tca acc cca gct gaa gta 48Met Ala Pro Thr Leu Ser Thr Ala Gln Phe Ser Thr Pro 
Ala Glu Val 1 5 10 15 acc gac ttc gta gtc cac aga gga aac ggt gta aag ggt ttg tca gaa 96Thr Asp Phe Val Val His Arg Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 aca ggg atc aaa gct ctt cca gac caa tac att cag cca ctt gaa gag 144Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Leu Glu Glu 35 40 45 cgg ctc atc aac aaa ttc gtc aac gaa aca gac gaa gcc att ccg gtg 192Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 atc gac atg tcc aac cct gat gag aaa aaa gtc gct gaa gct gtc tgt 240Ile Asp Met Ser Asn Pro Asp Glu Lys Lys Val Ala Glu Ala Val Cys 65 70 75 80 gat gct gct gag aaa tgg ggt ttc ttc cag gtg gtc aat cat gga gtt 288Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Val Asn His Gly Val 85 90 95 cct ttg gag gtt ctt gat aac gtc aag gcc gcg act cac aga ttc ttt 336Pro Leu Glu Val Leu Asp Asn Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 aat ctc cct gtg gag gag aag agc aag ttc act aag gag aac tct ttg 384Asn Leu Pro Val Glu Glu Lys Ser Lys Phe Thr Lys Glu Asn Ser Leu 115 120 125 tcg gct act gtt agg ttt ggt acg agt ttt agt cct ctt gca gag aaa 432Ser Ala Thr Val Arg Phe Gly Thr Ser Phe Ser Pro Leu Ala Glu Lys 130 135 140 gct ctt gag tgg aaa gat tat ctt agt ctc ttc ttc gtc tct gac gct 480Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Asp Ala 145 150 155 160 gaa gct gaa cag ttc tgg cct gat gct tgc agg aat gaa acg tta gag 528Glu Ala Glu Gln Phe Trp Pro Asp Ala Cys Arg Asn Glu Thr Leu Glu 165 170 175 tac ata gac aag tca aag aag atg gtg agg aag ctt tta gag tat ttg 576Tyr Ile Asp Lys Ser Lys Lys Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 ggg aag aat ctc aac gtt aaa gag ctc gac gag acg aaa gaa tca ctc 624Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 ttc atg ggt tcg att cga gtc aac ctc aac tac tac ccc atc tgc cct 672Phe Met Gly Ser Ile Arg Val Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 aac ccg gac cta acc gtc ggt gtt ggt cgc cac tca gac gtc tct tct 720Asn Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser 
Ser 225 230 235 240 ctc acc atc ctc tta caa gac cag atc ggt ggt cta cac gtg cgt tct 768Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 ctg gcg tca ggg aac tgg gtt cac gtg cca ccg gtt cct gga tct ttt 816Leu Ala Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 gtg atc aac atc gga gat gcg atg cag atc ttg agc aat ggt ctg tac 864Val Ile Asn Ile Gly Asp Ala Met Gln Ile Leu Ser Asn Gly Leu Tyr 275 280 285 aag agc gtg gag cat cgt gtc tta gcc aat ggt agc aat aac aga atc 912Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Ser Asn Asn Arg Ile 290 295 300 tct gtt cct atc ttt gtg aat cca aaa cca gag tcc gtg att ggt cct 960Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 cta cct gag gtc att gca aaa gga gag gag ccg att tac aga gac gtc 1008Leu Pro Glu Val Ile Ala Lys Gly Glu Glu Pro Ile Tyr Arg Asp Val 325 330 335 gtc tac tct gac tac gtc aag tat ttc ttc agg aag gca cac gac gga 1056Val Tyr Ser Asp Tyr Val Lys Tyr Phe Phe Arg Lys Ala His Asp Gly 340 345 350 aag aaa acc gtt gat ttc gcc aag ata tga 1086Lys Lys Thr Val Asp Phe Ala Lys Ile 355 360 22361PRTCapsella rubella 22Met Ala Pro Thr Leu Ser Thr Ala Gln Phe Ser Thr Pro Ala Glu Val 1 5 10 15 Thr Asp Phe Val Val His Arg Gly Asn Gly Val Lys Gly Leu Ser Glu 20 25 30 Thr Gly Ile Lys Ala Leu Pro Asp Gln Tyr Ile Gln Pro Leu Glu Glu 35 40 45 Arg Leu Ile Asn Lys Phe Val Asn Glu Thr Asp Glu Ala Ile Pro Val 50 55 60 Ile Asp Met Ser Asn Pro Asp Glu Lys Lys Val Ala Glu Ala Val Cys 65 70 75 80 Asp Ala Ala Glu Lys Trp Gly Phe Phe Gln Val Val Asn His Gly Val
85 90 95 Pro Leu Glu Val Leu Asp Asn Val Lys Ala Ala Thr His Arg Phe Phe 100 105 110 Asn Leu Pro Val Glu Glu Lys Ser Lys Phe Thr Lys Glu Asn Ser Leu 115 120 125 Ser Ala Thr Val Arg Phe Gly Thr Ser Phe Ser Pro Leu Ala Glu Lys 130 135 140 Ala Leu Glu Trp Lys Asp Tyr Leu Ser Leu Phe Phe Val Ser Asp Ala 145 150 155 160 Glu Ala Glu Gln Phe Trp Pro Asp Ala Cys Arg Asn Glu Thr Leu Glu 165 170 175 Tyr Ile Asp Lys Ser Lys Lys Met Val Arg Lys Leu Leu Glu Tyr Leu 180 185 190 Gly Lys Asn Leu Asn Val Lys Glu Leu Asp Glu Thr Lys Glu Ser Leu 195 200 205 Phe Met Gly Ser Ile Arg Val Asn Leu Asn Tyr Tyr Pro Ile Cys Pro 210 215 220 Asn Pro Asp Leu Thr Val Gly Val Gly Arg His Ser Asp Val Ser Ser 225 230 235 240 Leu Thr Ile Leu Leu Gln Asp Gln Ile Gly Gly Leu His Val Arg Ser 245 250 255 Leu Ala Ser Gly Asn Trp Val His Val Pro Pro Val Pro Gly Ser Phe 260 265 270 Val Ile Asn Ile Gly Asp Ala Met Gln Ile Leu Ser Asn Gly Leu Tyr 275 280 285 Lys Ser Val Glu His Arg Val Leu Ala Asn Gly Ser Asn Asn Arg Ile 290 295 300 Ser Val Pro Ile Phe Val Asn Pro Lys Pro Glu Ser Val Ile Gly Pro 305 310 315 320 Leu Pro Glu Val Ile Ala Lys Gly Glu Glu Pro Ile Tyr Arg Asp Val 325 330 335 Val Tyr Ser Asp Tyr Val Lys Tyr Phe Phe Arg Lys Ala His Asp Gly 340 345 350 Lys Lys Thr Val Asp Phe Ala Lys Ile 355 360 23747DNALinum usitatissimumCDS(1)..(747) 23atg gcg gaa gag cag aag cag agc agc agc gag aat gtc agc cgg cac 48Met Ala Glu Glu Gln Lys Gln Ser Ser Ser Glu Asn Val Ser Arg His 1 5 10 15 cag gaa gtc ggc cac aag agc ctc ctc cag agc gac gcc ctt tac cag 96Gln Glu Val Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln 20 25 30 tat att ctt gag acg agt gtt tat cct aga gag cca gag tcc atg aag 144Tyr Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Ser Met Lys 35 40 45 gag ctc aga gaa gtc aca gcc aaa cac ccc tgg aac ata atg acg acg 192Glu Leu Arg Glu Val Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr 50 55 60 tcg gcc gac gaa gga cag ttc ctg aac atg ctg ttg aag ctc atc aac 240Ser Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Leu 
Lys Leu Ile Asn 65 70 75 80 gcc aag aac acc atg gag atc ggc gtc tac acc ggt tac tcc ctc ctc 288Ala Lys Asn Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu 85 90 95 gcc acc gcc cta gct atc ccc gac gac ggc aag atc ttg gcc atg gac 336Ala Thr Ala Leu Ala Ile Pro Asp Asp Gly Lys Ile Leu Ala Met Asp 100 105 110 atc aac cgg gag aac tac gag atc gga ctt ccg atc atc gag aag gcc 384Ile Asn Arg Glu Asn Tyr Glu Ile Gly Leu Pro Ile Ile Glu Lys Ala 115 120 125 ggc ctc gct cac aag atc gag ttc cgt gaa ggc cct gcg ttg ccg gcg 432Gly Leu Ala His Lys Ile Glu Phe Arg Glu Gly Pro Ala Leu Pro Ala 130 135 140 ctc gac ctg atg gtt gaa gac aaa tcg ttg cac gga acc tac gac ttc 480Leu Asp Leu Met Val Glu Asp Lys Ser Leu His Gly Thr Tyr Asp Phe 145 150 155 160 ata ttc gtg gac gcg gac aag gac aac tac atc aac tat cac aag agg 528Ile Phe Val Asp Ala Asp Lys Asp Asn Tyr Ile Asn Tyr His Lys Arg 165 170 175 ttg atc gac ctg gtg aaa atc ggg gga gtg atc ggg tat gac aac acc 576Leu Ile Asp Leu Val Lys Ile Gly Gly Val Ile Gly Tyr Asp Asn Thr 180 185 190 cta tgg aac gga tcg gtg gtc gcg cct ccc gac gct ccg ttg agg aag 624Leu Trp Asn Gly Ser Val Val Ala Pro Pro Asp Ala Pro Leu Arg Lys 195 200 205 tac gtt agg tac tac agg gat ttc gtg ctc gag ctc aac aag gcg ctc 672Tyr Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu 210 215 220 gcc gcg gac ccc agg atc gag att tgc atg ctc ccc gtc ggt gat gga 720Ala Ala Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly 225 230 235 240 atc act ctc tgc cgt cgg atc agt tga 747Ile Thr Leu Cys Arg Arg Ile Ser 245 24248PRTLinum usitatissimum 24Met Ala Glu Glu Gln Lys Gln Ser Ser Ser Glu Asn Val Ser Arg His 1 5 10 15 Gln Glu Val Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln 20 25 30 Tyr Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Ser Met Lys 35 40 45 Glu Leu Arg Glu Val Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr 50 55 60 Ser Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Leu Lys Leu Ile Asn 65 70 75 80 Ala Lys Asn Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu 
Leu 85 90 95 Ala Thr Ala Leu Ala Ile Pro Asp Asp Gly Lys Ile Leu Ala Met Asp 100 105 110 Ile Asn Arg Glu Asn Tyr Glu Ile Gly Leu Pro Ile Ile Glu Lys Ala 115 120 125 Gly Leu Ala His Lys Ile Glu Phe Arg Glu Gly Pro Ala Leu Pro Ala 130 135 140 Leu Asp Leu Met Val Glu Asp Lys Ser Leu His Gly Thr Tyr Asp Phe 145 150 155 160 Ile Phe Val Asp Ala Asp Lys Asp Asn Tyr Ile Asn Tyr His Lys Arg 165 170 175 Leu Ile Asp Leu Val Lys Ile Gly Gly Val Ile Gly Tyr Asp Asn Thr 180 185 190 Leu Trp Asn Gly Ser Val Val Ala Pro Pro Asp Ala Pro Leu Arg Lys 195 200 205 Tyr Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu 210 215 220 Ala Ala Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly 225 230 235 240 Ile Thr Leu Cys Arg Arg Ile Ser 245 25729DNAVitis viniferaCDS(1)..(729) 25atg gcc acg aag caa gaa gct ggg agg cac cag gag gtt ggc cac aag 48Met Ala Thr Lys Gln Glu Ala Gly Arg His Gln Glu Val Gly His Lys 1 5 10 15 agc ctt ttg cag agt gat gct ctt tat cag tat ata ctt gaa acc agt 96Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr Ile Leu Glu Thr Ser 20 25 30 gtg tac cca aga gag ccc gaa tcc atg aag gag ctc aga gag ttg act 144Val Tyr Pro Arg Glu Pro Glu Ser Met Lys Glu Leu Arg Glu Leu Thr 35 40 45 gcc cag cat cca tgg aac atc atg act acg tct gct gat gaa ggg cag 192Ala Gln His Pro Trp Asn Ile Met Thr Thr Ser Ala Asp Glu Gly Gln 50 55 60 ttc ttg aac atg ctt ctc aag ctc atc aat gcc aag aac acc atg gag 240Phe Leu Asn Met Leu Leu Lys Leu Ile Asn Ala Lys Asn Thr Met Glu 65 70 75 80 ata ggc gtc tac act ggc tac tct ctt ctg gcc aca gcc ctt gct ctc 288Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala Thr Ala Leu Ala Leu 85 90 95 ccc gat gac gga aag atc ctg gct atg gac atc aac aaa gaa aat tac 336Pro Asp Asp Gly Lys Ile Leu Ala Met Asp Ile Asn Lys Glu Asn Tyr 100 105 110 gag ctg ggc ctg cca gta att caa aag gca ggg gtt gcc cac aag att 384Glu Leu Gly Leu Pro Val Ile Gln Lys Ala Gly Val Ala His Lys Ile 115 120 125 gac ttc aaa gaa ggc cct gct ttg cct gtt ctt gat cag atg atc gaa 432Asp Phe Lys Glu Gly Pro 
Ala Leu Pro Val Leu Asp Gln Met Ile Glu 130 135 140 gat ggg aag tat cac ggg tcg ttc gac ttc ata ttc gtg gac gca gac 480Asp Gly Lys Tyr His Gly Ser Phe Asp Phe Ile Phe Val Asp Ala Asp 145 150 155 160 aag gac aat tat ctg aac tac cac aag aga ttg atc gat ttg gtg aag 528Lys Asp Asn Tyr Leu Asn Tyr His Lys Arg Leu Ile Asp Leu Val Lys 165 170 175 gtg ggg gga atc atc ggc tac gac aac acc ctc tgg aac ggg tcg gtg 576Val Gly Gly Ile Ile Gly Tyr Asp Asn Thr Leu Trp Asn Gly Ser Val 180 185 190 gtg gcg cca ccc gat gct ccg ctg cgg aag tac gtg agg tac tac aga 624Val Ala Pro Pro Asp Ala Pro Leu Arg Lys Tyr Val Arg Tyr Tyr Arg 195 200 205 gac ttc gtg ttg gag ctg aac aag gct ctt gct gct gac cca aga atc 672Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala Ala Asp Pro Arg Ile 210 215 220 gag atc tgt atg ctt ccg gtt ggt gac ggg atc acc ctt tgc cgt cgg 720Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile Thr Leu Cys Arg Arg 225 230 235 240 cta agc tga 729Leu Ser 26242PRTVitis vinifera 26Met Ala Thr Lys Gln Glu Ala Gly Arg His Gln Glu Val Gly His Lys 1 5 10 15 Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr Ile Leu Glu Thr Ser 20 25 30 Val Tyr Pro Arg Glu Pro Glu Ser Met Lys Glu Leu Arg Glu Leu Thr 35 40 45 Ala Gln His Pro Trp Asn Ile Met Thr Thr Ser Ala Asp Glu Gly Gln 50 55 60 Phe Leu Asn Met Leu Leu Lys Leu Ile Asn Ala Lys Asn Thr Met Glu 65 70 75 80 Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala Thr Ala Leu Ala Leu 85 90 95 Pro Asp Asp Gly Lys Ile Leu Ala Met Asp Ile Asn Lys Glu Asn Tyr 100 105 110 Glu Leu Gly Leu Pro Val Ile Gln Lys Ala Gly Val Ala His Lys Ile 115 120 125 Asp Phe Lys Glu Gly Pro Ala Leu Pro Val Leu Asp Gln Met Ile Glu 130 135 140 Asp Gly Lys Tyr His Gly Ser Phe Asp Phe Ile Phe Val Asp Ala Asp 145 150 155 160 Lys Asp Asn Tyr Leu Asn Tyr His Lys Arg Leu Ile Asp Leu Val Lys 165 170 175 Val Gly Gly Ile Ile Gly Tyr Asp Asn Thr Leu Trp Asn Gly Ser Val 180 185 190 Val Ala Pro Pro Asp Ala Pro Leu Arg Lys Tyr Val Arg Tyr Tyr Arg 195 200 205 Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala Ala Asp Pro 
Arg Ile 210 215 220 Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile Thr Leu Cys Arg Arg 225 230 235 240 Leu Ser 27978DNASolanum lycopersicumCDS(23)..(751) 27ctgtttcaga gtcaaaaaag ca atg gca acc aac gga gaa aat gga aga cat 52 Met Ala Thr Asn Gly Glu Asn Gly Arg His 1 5 10 caa gaa gtt gga cac aag agt cta ttg caa agt gat gcc ctt tat cag 100Gln Glu Val Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln 15 20 25 tat att ctt gaa acc agt gtg tac cca aga gag cct gaa gcc atg aaa 148Tyr Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Ala Met Lys 30 35 40 gag cta aga gag att act gca aaa cac cct tgg aac ctt atg acc act 196Glu Leu Arg Glu Ile Thr Ala Lys His Pro Trp Asn Leu Met Thr Thr 45 50 55 tct gct gac gaa ggg cag ttc ttg aat atg ctt ctc aaa ctc atc aat 244Ser Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Leu Lys Leu Ile Asn 60 65 70 gcc aaa aac aca atg gaa att ggg gtt ttt act ggt tac tct ctg ctt 292Ala Lys Asn Thr Met Glu Ile Gly Val Phe Thr Gly Tyr Ser Leu Leu 75 80 85 90 gct act gcc atg gct ctt cct gat gat ggc aag att cta gcc atg gat 340Ala Thr Ala Met Ala Leu Pro Asp Asp Gly Lys Ile Leu Ala Met Asp 95 100 105 atc aac cgc gat aac tat gag att gga ctt cca gta att gaa aag gct 388Ile Asn Arg Asp Asn Tyr Glu Ile Gly Leu Pro Val Ile Glu Lys Ala 110 115 120 ggt cta gcg cac aaa att gaa ttc aga gaa ggc cct gca cta cct gtt 436Gly Leu Ala His Lys Ile Glu Phe Arg Glu Gly Pro Ala Leu Pro Val 125 130 135 ctt gac caa atg att gaa gac ggc caa tac cat gga tca tat gat ttc 484Leu Asp Gln Met Ile Glu Asp Gly Gln Tyr His Gly Ser Tyr Asp Phe 140 145 150 ata ttt gtg gat gct gac aag gac aat tac ttg aac tat cac aag aga 532Ile Phe Val Asp Ala Asp Lys Asp Asn Tyr Leu Asn Tyr His Lys Arg 155 160 165 170 tta atc gac ttg gtc aag att ggt gga tta att ggc tat gac aac acc 580Leu Ile Asp Leu Val Lys Ile Gly Gly Leu Ile Gly Tyr Asp Asn Thr 175 180 185 cta tgg aat gga tca gta gtt gca cca cct gat gca ccc ctc agg aaa 628Leu Trp Asn Gly Ser Val Val Ala Pro Pro Asp Ala Pro Leu Arg Lys 190 195 200 tat gtt agg tat tac agg 
gat ttc gta ttg gaa ctt aac aag gcg ttg 676Tyr Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu 205 210 215 gct gct gat ccc aga atc gaa att tgc cag ctt cct gtt ggt gat ggc 724Ala Ala Asp Pro Arg Ile Glu Ile Cys Gln Leu Pro Val Gly Asp Gly 220 225 230 atc act ctt tgc cgt cgc atc agt taa aatattcgta tagtactatt 771Ile Thr Leu Cys Arg Arg Ile Ser 235 240 ggtggcaatc aacaactcat gagtcatgac gatagaggat ttatcatttt tgaaatcccc 831tgttttactc attcgtttaa ttttatcatt ttagttcgta ttatggcaaa agattgcatt 891gtctatgtta ccaaatgctt atttcacaat gtatttgatg aataaaaaaa gaaagaaatt 951caagttgaaa aaaaaaaaaa aaaaaaa 97828242PRTSolanum lycopersicum 28Met Ala Thr Asn Gly Glu Asn Gly Arg His Gln Glu Val Gly His Lys 1 5 10 15 Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr Ile Leu Glu Thr Ser 20 25 30 Val Tyr Pro Arg Glu Pro Glu Ala Met Lys Glu Leu Arg Glu Ile Thr 35 40 45 Ala Lys His Pro Trp Asn Leu Met Thr Thr Ser Ala Asp Glu Gly Gln 50 55 60 Phe Leu Asn Met Leu Leu Lys Leu Ile Asn Ala Lys Asn Thr Met Glu 65 70 75 80 Ile Gly Val Phe Thr Gly Tyr Ser Leu Leu Ala Thr Ala Met Ala Leu 85 90 95 Pro Asp Asp Gly Lys Ile Leu Ala Met Asp Ile Asn Arg Asp Asn Tyr 100 105 110 Glu Ile Gly Leu Pro Val Ile Glu Lys Ala Gly Leu Ala His Lys Ile 115 120 125 Glu Phe Arg Glu Gly Pro Ala Leu Pro Val Leu Asp Gln Met Ile Glu 130 135 140 Asp Gly Gln Tyr His Gly Ser Tyr Asp Phe Ile Phe Val Asp Ala Asp 145 150 155 160 Lys Asp Asn Tyr Leu Asn Tyr His Lys Arg Leu Ile Asp Leu Val Lys 165 170 175 Ile Gly Gly Leu Ile Gly Tyr Asp
Asn Thr Leu Trp Asn Gly Ser Val 180 185 190 Val Ala Pro Pro Asp Ala Pro Leu Arg Lys Tyr Val Arg Tyr Tyr Arg 195 200 205 Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala Ala Asp Pro Arg Ile 210 215 220 Glu Ile Cys Gln Leu Pro Val Gly Asp Gly Ile Thr Leu Cys Arg Arg 225 230 235 240 Ile Ser 29744DNACicer arietinumCDS(1)..(744) 29atg gca acc aac gag gat caa aag caa act gaa tct gga agg cat caa 48Met Ala Thr Asn Glu Asp Gln Lys Gln Thr Glu Ser Gly Arg His Gln 1 5 10 15 gag gtt ggt cac aaa agc ctt ctg caa agt gat gct ctt tac cag tat 96Glu Val Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr 20 25 30 att cta gag aca agc gtg ttc cca aga gaa cat gaa gcc atg aaa gag 144Ile Leu Glu Thr Ser Val Phe Pro Arg Glu His Glu Ala Met Lys Glu 35 40 45 ttg aga gag gtc aca gca aaa cat cca tgg aac atc atg aca acc tct 192Leu Arg Glu Val Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr Ser 50 55 60 gca gac gag gga caa ttt ttg aac atg ctc ctt aaa ctt atc aat gcc 240Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Leu Lys Leu Ile Asn Ala 65 70 75 80 aag aat acc atg gaa att ggt gtc tac act ggc tac tcc ctt ctt gcc 288Lys Asn Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala 85 90 95 act gcc ctt gct ctt cct gaa gat gga aag att ttg gcc atg gac att 336Thr Ala Leu Ala Leu Pro Glu Asp Gly Lys Ile Leu Ala Met Asp Ile 100 105 110 aac aag gaa aat tac gaa ttg ggt ctg ccc gta att aaa aaa gct ggt 384Asn Lys Glu Asn Tyr Glu Leu Gly Leu Pro Val Ile Lys Lys Ala Gly 115 120 125 gtt gcc cac aaa att gat ttc aga gaa ggc cct gct ctt ccg gtt ctt 432Val Ala His Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Val Leu 130 135 140 gat gaa atg gtt aaa gat gaa aag aat cat ggg agc tac gat ttc atc 480Asp Glu Met Val Lys Asp Glu Lys Asn His Gly Ser Tyr Asp Phe Ile 145 150 155 160 ttc gtg gat gcg gac aaa gac aat tac atc aac tac cat aag agg tta 528Phe Val Asp Ala Asp Lys Asp Asn Tyr Ile Asn Tyr His Lys Arg Leu 165 170 175 att gaa ctt gtt aaa gtg gga ggt gtg atc ggg tac gac aac acc ttg 576Ile Glu Leu Val Lys Val Gly Gly Val Ile Gly Tyr 
Asp Asn Thr Leu 180 185 190 tgg aat gga tct gta gtg gca cct cct gat gct cct ctc agg aaa tat 624Trp Asn Gly Ser Val Val Ala Pro Pro Asp Ala Pro Leu Arg Lys Tyr 195 200 205 gtt agg tat tac agg gat ttc gtg ttg gaa ctt aac aag gct ttg gct 672Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala 210 215 220 gtc gac cct agg att gaa atc tgt atg ctt cct gtt ggt gat gga atc 720Val Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile 225 230 235 240 act atc tgc cgt cgg atc aag taa 744Thr Ile Cys Arg Arg Ile Lys 245 30247PRTCicer arietinum 30Met Ala Thr Asn Glu Asp Gln Lys Gln Thr Glu Ser Gly Arg His Gln 1 5 10 15 Glu Val Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr 20 25 30 Ile Leu Glu Thr Ser Val Phe Pro Arg Glu His Glu Ala Met Lys Glu 35 40 45 Leu Arg Glu Val Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr Ser 50 55 60 Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Leu Lys Leu Ile Asn Ala 65 70 75 80 Lys Asn Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala 85 90 95 Thr Ala Leu Ala Leu Pro Glu Asp Gly Lys Ile Leu Ala Met Asp Ile 100 105 110 Asn Lys Glu Asn Tyr Glu Leu Gly Leu Pro Val Ile Lys Lys Ala Gly 115 120 125 Val Ala His Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Val Leu 130 135 140 Asp Glu Met Val Lys Asp Glu Lys Asn His Gly Ser Tyr Asp Phe Ile 145 150 155 160 Phe Val Asp Ala Asp Lys Asp Asn Tyr Ile Asn Tyr His Lys Arg Leu 165 170 175 Ile Glu Leu Val Lys Val Gly Gly Val Ile Gly Tyr Asp Asn Thr Leu 180 185 190 Trp Asn Gly Ser Val Val Ala Pro Pro Asp Ala Pro Leu Arg Lys Tyr 195 200 205 Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala 210 215 220 Val Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile 225 230 235 240 Thr Ile Cys Arg Arg Ile Lys 245 31744DNACoffea canephoraCDS(1)..(744) 31atg gcc cag aat gga gaa gga aag gat agc caa aat ctc agg cat caa 48Met Ala Gln Asn Gly Glu Gly Lys Asp Ser Gln Asn Leu Arg His Gln 1 5 10 15 gaa gta ggc cac aaa agc ctt ctg caa agt gat gca ctc tac cag tac 96Glu Val Gly His Lys Ser Leu Leu 
Gln Ser Asp Ala Leu Tyr Gln Tyr 20 25 30 atc ctg gaa acc agc gtg tat cca aga gag cca gag ccc atg aaa gag 144Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Pro Met Lys Glu 35 40 45 ctg aga gaa ctg aca gca aag cat cca tgg aat att atg act aca tct 192Leu Arg Glu Leu Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr Ser 50 55 60 gct gat gaa ggg cag ttc ttg aac atg att atc aag ttg atc aat gcc 240Ala Asp Glu Gly Gln Phe Leu Asn Met Ile Ile Lys Leu Ile Asn Ala 65 70 75 80 aag aaa acc atg gag att gga gtt tac act ggt tac tcg ctt ctg gct 288Lys Lys Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala 85 90 95 aca gct ctc gct ctt cca gaa gat ggg aag ata ttg gcc atg gat att 336Thr Ala Leu Ala Leu Pro Glu Asp Gly Lys Ile Leu Ala Met Asp Ile 100 105 110 aac aga gaa aac tac gaa ttg ggt ctg ccc gtg atc gaa agg gct ggt 384Asn Arg Glu Asn Tyr Glu Leu Gly Leu Pro Val Ile Glu Arg Ala Gly 115 120 125 gtg tcc cat aaa att gac ttc aga gaa ggc cct gct ttg cca gtg ctt 432Val Ser His Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Val Leu 130 135 140 gat gag ttg att gaa gat gac aag aac cat gga agt ttt gat ttc atc 480Asp Glu Leu Ile Glu Asp Asp Lys Asn His Gly Ser Phe Asp Phe Ile 145 150 155 160 ttc gtg gat gct gac aag gac aac tat ctc aac tac cac aag agg ata 528Phe Val Asp Ala Asp Lys Asp Asn Tyr Leu Asn Tyr His Lys Arg Ile 165 170 175 atc gag ttg gtc aag gtt ggg gga atg att ggg tac gac aac acc cta 576Ile Glu Leu Val Lys Val Gly Gly Met Ile Gly Tyr Asp Asn Thr Leu 180 185 190 tgg aac ggc tcc gtg gtg gcc cca cca gat gct cca atg agg aag tac 624Trp Asn Gly Ser Val Val Ala Pro Pro Asp Ala Pro Met Arg Lys Tyr 195 200 205 gtg agg tac tac agg gac ttc gtc ttg gag ctc aac aaa gcc ctg gcc 672Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala 210 215 220 gct gat ccc agg atc gag atc tgc atg ctc ccc gtt ggc gac ggt atc 720Ala Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile 225 230 235 240 acc ctg tgc cgc cgc gtc agc taa 744Thr Leu Cys Arg Arg Val Ser 245 32247PRTCoffea canephora 32Met 
Ala Gln Asn Gly Glu Gly Lys Asp Ser Gln Asn Leu Arg His Gln 1 5 10 15 Glu Val Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr 20 25 30 Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Pro Met Lys Glu 35 40 45 Leu Arg Glu Leu Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr Ser 50 55 60 Ala Asp Glu Gly Gln Phe Leu Asn Met Ile Ile Lys Leu Ile Asn Ala 65 70 75 80 Lys Lys Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala 85 90 95 Thr Ala Leu Ala Leu Pro Glu Asp Gly Lys Ile Leu Ala Met Asp Ile 100 105 110 Asn Arg Glu Asn Tyr Glu Leu Gly Leu Pro Val Ile Glu Arg Ala Gly 115 120 125 Val Ser His Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Val Leu 130 135 140 Asp Glu Leu Ile Glu Asp Asp Lys Asn His Gly Ser Phe Asp Phe Ile 145 150 155 160 Phe Val Asp Ala Asp Lys Asp Asn Tyr Leu Asn Tyr His Lys Arg Ile 165 170 175 Ile Glu Leu Val Lys Val Gly Gly Met Ile Gly Tyr Asp Asn Thr Leu 180 185 190 Trp Asn Gly Ser Val Val Ala Pro Pro Asp Ala Pro Met Arg Lys Tyr 195 200 205 Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala 210 215 220 Ala Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile 225 230 235 240 Thr Leu Cys Arg Arg Val Ser 245 33780DNABambusa oldhamiiCDS(1)..(780) 33atg gcc acc gcg acc gcc gat gcg acg acg gcg acc aag gag caa acc 48Met Ala Thr Ala Thr Ala Asp Ala Thr Thr Ala Thr Lys Glu Gln Thr 1 5 10 15 agc ggc ggc ggc ggc gag cag aag acg cgc cac tcc gag gtc ggg cac 96Ser Gly Gly Gly Gly Glu Gln Lys Thr Arg His Ser Glu Val Gly His 20 25 30 aag agc ctg ctc cag agc gac gcg ctc tac cag tac ata ctg gag acg 144Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr Ile Leu Glu Thr 35 40 45 agc gtg tac ccg cgc gag cac gag tgc atg aag gag ctc cgc gag gtc 192Ser Val Tyr Pro Arg Glu His Glu Cys Met Lys Glu Leu Arg Glu Val 50 55 60 acc gcc aag cac cca tgg aac ctg atg acg acg tcg gcg gac gag ggg 240Thr Ala Lys His Pro Trp Asn Leu Met Thr Thr Ser Ala Asp Glu Gly 65 70 75 80 cag ttc ctg aac atg ctg ctc aag ctc atc ggc gcc aag aag acc atg 288Gln Phe Leu Asn Met Leu Leu 
Lys Leu Ile Gly Ala Lys Lys Thr Met 85 90 95 gag atc ggc gtc tac acc ggc tac tcc ctc ctc gcc acc gcg ctc gcc 336Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala Thr Ala Leu Ala 100 105 110 atc ccc gag gac ggc acg atc ttg gcc atg gac atc aac cgc gag aac 384Ile Pro Glu Asp Gly Thr Ile Leu Ala Met Asp Ile Asn Arg Glu Asn 115 120 125 tac gag ctc ggc ctg ccc tgc atc gag aag gcc ggc gtc gcc cac aag 432Tyr Glu Leu Gly Leu Pro Cys Ile Glu Lys Ala Gly Val Ala His Lys 130 135 140 atc gac ttc cgc gag ggc ccc gcc ctc ccc gtc ctc gac cag ctc ctc 480Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Val Leu Asp Gln Leu Leu 145 150 155 160 gag gac gag gcc aac cac ggc tcg ttc gac ttc gtc ttc gtc gac gcc 528Glu Asp Glu Ala Asn His Gly Ser Phe Asp Phe Val Phe Val Asp Ala 165 170 175 gac aag gac aac tac ctc aac tac cac gac cgc ctg atg aag ctg gtc 576Asp Lys Asp Asn Tyr Leu Asn Tyr His Asp Arg Leu Met Lys Leu Val 180 185 190 aag gtc ggc ggc ctc gtt ggc tac gac aac acg ctc tgg aac ggc tcc 624Lys Val Gly Gly Leu Val Gly Tyr Asp Asn Thr Leu Trp Asn Gly Ser 195 200 205 gtc gtg ctc ccc gcc gac gcg ccc atg cgc aag tac atc cgc tac tac 672Val Val Leu Pro Ala Asp Ala Pro Met Arg Lys Tyr Ile Arg Tyr Tyr 210 215 220 cgc gac ttc gtg ctc gag ctc aac aag gcc ctc gcc gcc gac gag cgc 720Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala Ala Asp Glu Arg 225 230 235 240 gtc gag atc tgc cag ctc ccc gtc ggc gac ggc atc acc ctc tgc cgc 768Val Glu Ile Cys Gln Leu Pro Val Gly Asp Gly Ile Thr Leu Cys Arg 245 250 255 cgc gcc aag tga 780Arg Ala Lys 34259PRTBambusa oldhamii 34Met Ala Thr Ala Thr Ala Asp Ala Thr Thr Ala Thr Lys Glu Gln Thr 1 5 10 15 Ser Gly Gly Gly Gly Glu Gln Lys Thr Arg His Ser Glu Val Gly His 20 25 30 Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr Ile Leu Glu Thr 35 40 45 Ser Val Tyr Pro Arg Glu His Glu Cys Met Lys Glu Leu Arg Glu Val 50 55 60 Thr Ala Lys His Pro Trp Asn Leu Met Thr Thr Ser Ala Asp Glu Gly 65 70 75 80 Gln Phe Leu Asn Met Leu Leu Lys Leu Ile Gly Ala Lys Lys Thr Met 85 90 95 Glu Ile Gly Val 
Tyr Thr Gly Tyr Ser Leu Leu Ala Thr Ala Leu Ala 100 105 110 Ile Pro Glu Asp Gly Thr Ile Leu Ala Met Asp Ile Asn Arg Glu Asn 115 120 125 Tyr Glu Leu Gly Leu Pro Cys Ile Glu Lys Ala Gly Val Ala His Lys 130 135 140 Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Val Leu Asp Gln Leu Leu 145 150 155 160 Glu Asp Glu Ala Asn His Gly Ser Phe Asp Phe Val Phe Val Asp Ala 165 170 175 Asp Lys Asp Asn Tyr Leu Asn Tyr His Asp Arg Leu Met Lys Leu Val 180 185 190 Lys Val Gly Gly Leu Val Gly Tyr Asp Asn Thr Leu Trp Asn Gly Ser 195 200 205 Val Val Leu Pro Ala Asp Ala Pro Met Arg Lys Tyr Ile Arg Tyr Tyr 210 215 220 Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala Ala Asp Glu Arg 225 230 235 240 Val Glu Ile Cys Gln Leu Pro Val Gly Asp Gly Ile Thr Leu Cys Arg 245 250 255 Arg Ala Lys 35744DNAEucalyptus camaldulensisCDS(1)..(744) 35atg gca gcc aac gca gag cct cag cag acc caa cca gcg aag cat tcg 48Met Ala Ala Asn Ala Glu Pro Gln Gln Thr Gln Pro Ala Lys His Ser 1 5 10 15 gaa gtc ggc cac aag agc ctc ttg cag agc gat gct ctc tac cag tac 96Glu Val Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr 20 25 30 ata ttg gag acc agc gtc tac cca aga gag cca gag tcc atg aag gag 144Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Ser Met Lys Glu 35 40 45 ctc agg gaa ata aca gcc aaa cat cca tgg aac ctg atg acc aca tcg 192Leu Arg Glu Ile Thr Ala Lys His Pro Trp Asn Leu Met Thr Thr Ser 50 55 60 gct gat gaa ggg cag ttc ctg aac atg ctc ctc aag ctc atc aac gcc 240Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Leu Lys Leu Ile Asn Ala 65 70 75
80 aag aac acc atg gag atc ggt gtc tac acc ggc tac tct ctc ctc gcc 288Lys Asn Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala 85 90 95 acc gcc ctt gct ctt cct gat gac gga aag atc ttg gcc atg gac atc 336Thr Ala Leu Ala Leu Pro Asp Asp Gly Lys Ile Leu Ala Met Asp Ile 100 105 110 aat agg gag aac ttc gag atc ggg ctg ccc gtc atc cag aag gcc ggc 384Asn Arg Glu Asn Phe Glu Ile Gly Leu Pro Val Ile Gln Lys Ala Gly 115 120 125 ctt gcc cac aag atc gat ttc aga gaa ggc cct gcc ctg ccg ctc ctt 432Leu Ala His Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Leu Leu 130 135 140 gat cag ctc gtg caa gat gag aag aac cat gga acg tac gac ttc ata 480Asp Gln Leu Val Gln Asp Glu Lys Asn His Gly Thr Tyr Asp Phe Ile 145 150 155 160 ttc gtg gat gcc gac aag gac aac tac atc aac tac cac aag agg ctg 528Phe Val Asp Ala Asp Lys Asp Asn Tyr Ile Asn Tyr His Lys Arg Leu 165 170 175 atc gac ctg gtc aag gtt ggc ggc ctg atc gga tac gac aac acc ctg 576Ile Asp Leu Val Lys Val Gly Gly Leu Ile Gly Tyr Asp Asn Thr Leu 180 185 190 tgg aac ggc tcc gtg gtc gcg ccc gcc gac gcg ccc ctc cgc aag tac 624Trp Asn Gly Ser Val Val Ala Pro Ala Asp Ala Pro Leu Arg Lys Tyr 195 200 205 gtg cgg tac tac cgg gac ttc gtg ctg gag ctc aac aag gcc ctc gcc 672Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala 210 215 220 gtg gac ccg agg atc gag atc tgc atg ctt ccc gtc ggg gat ggt atc 720Val Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile 225 230 235 240 acc ctg tgc cgc cgg gtc agc tga 744Thr Leu Cys Arg Arg Val Ser 245 36247PRTEucalyptus camaldulensis 36Met Ala Ala Asn Ala Glu Pro Gln Gln Thr Gln Pro Ala Lys His Ser 1 5 10 15 Glu Val Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr 20 25 30 Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Ser Met Lys Glu 35 40 45 Leu Arg Glu Ile Thr Ala Lys His Pro Trp Asn Leu Met Thr Thr Ser 50 55 60 Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Leu Lys Leu Ile Asn Ala 65 70 75 80 Lys Asn Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala 85 90 95 Thr Ala Leu Ala 
Leu Pro Asp Asp Gly Lys Ile Leu Ala Met Asp Ile 100 105 110 Asn Arg Glu Asn Phe Glu Ile Gly Leu Pro Val Ile Gln Lys Ala Gly 115 120 125 Leu Ala His Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Leu Leu 130 135 140 Asp Gln Leu Val Gln Asp Glu Lys Asn His Gly Thr Tyr Asp Phe Ile 145 150 155 160 Phe Val Asp Ala Asp Lys Asp Asn Tyr Ile Asn Tyr His Lys Arg Leu 165 170 175 Ile Asp Leu Val Lys Val Gly Gly Leu Ile Gly Tyr Asp Asn Thr Leu 180 185 190 Trp Asn Gly Ser Val Val Ala Pro Ala Asp Ala Pro Leu Arg Lys Tyr 195 200 205 Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala 210 215 220 Val Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile 225 230 235 240 Thr Leu Cys Arg Arg Val Ser 245 37753DNAGossypium hirsutumCDS(1)..(753) 37atg gca acc aac aaa aca gaa gag cag cag cag caa tct cag gcg ggt 48Met Ala Thr Asn Lys Thr Glu Glu Gln Gln Gln Gln Ser Gln Ala Gly 1 5 10 15 agg cac caa gaa gtt ggc cat aag agc ctt tta caa agc gat gct ctt 96Arg His Gln Glu Val Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu 20 25 30 tac cag tat atc ctg gag aca agt gta tat ccc agg gag cct gaa ccc 144Tyr Gln Tyr Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Pro 35 40 45 atg aaa gag ctc aga gag ata aca gcc aag cat cca tgg aac ctt atg 192Met Lys Glu Leu Arg Glu Ile Thr Ala Lys His Pro Trp Asn Leu Met 50 55 60 aca aca tca gct gat gaa ggc caa ttc ttg aac atg ctt ctt aag ttg 240Thr Thr Ser Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Leu Lys Leu 65 70 75 80 atc aat gcc aag aac acc atg gag att ggt gtt tac act ggc tac tct 288Ile Asn Ala Lys Asn Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser 85 90 95 ctt tta gcc acg gcc ctt gct ctc ccc gat gat ggg aag atc ttc gcc 336Leu Leu Ala Thr Ala Leu Ala Leu Pro Asp Asp Gly Lys Ile Phe Ala 100 105 110 atg gat att aac aga gaa aac tac gag ttg ggt cta cct gta atc caa 384Met Asp Ile Asn Arg Glu Asn Tyr Glu Leu Gly Leu Pro Val Ile Gln 115 120 125 aaa gct ggt gtt gct cac aaa att gat ttc aaa gaa ggg cct gca atg 432Lys Ala Gly Val Ala His Lys Ile Asp Phe Lys Glu Gly 
Pro Ala Met 130 135 140 cca gtt ctt gat gaa ctt gtc caa gat gaa aag aat cac gga tcc ttt 480Pro Val Leu Asp Glu Leu Val Gln Asp Glu Lys Asn His Gly Ser Phe 145 150 155 160 gac ttc ata ttc gtg gat gct gat aag gac aac tac tta aac tac cat 528Asp Phe Ile Phe Val Asp Ala Asp Lys Asp Asn Tyr Leu Asn Tyr His 165 170 175 aag agg ttg att gag ttg gtg aaa gtg gga ggt tta atc ggc tac gac 576Lys Arg Leu Ile Glu Leu Val Lys Val Gly Gly Leu Ile Gly Tyr Asp 180 185 190 aac acc cta tgg aac ggc tcg gtg gtg gcg ccg cct gat gct ccg ctc 624Asn Thr Leu Trp Asn Gly Ser Val Val Ala Pro Pro Asp Ala Pro Leu 195 200 205 agg aag tac gtc agg tat tat aga gac ttt gtt ttg gaa ctc aac aag 672Arg Lys Tyr Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys 210 215 220 gct ctt gct gtt gac cct agg att gag atc tgc atg ctc cct gtt ggt 720Ala Leu Ala Val Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly 225 230 235 240 gat gga atc acc ctt tgc cgt cgc ctc aaa tga 753Asp Gly Ile Thr Leu Cys Arg Arg Leu Lys 245 250 38250PRTGossypium hirsutum 38Met Ala Thr Asn Lys Thr Glu Glu Gln Gln Gln Gln Ser Gln Ala Gly 1 5 10 15 Arg His Gln Glu Val Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu 20 25 30 Tyr Gln Tyr Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Pro 35 40 45 Met Lys Glu Leu Arg Glu Ile Thr Ala Lys His Pro Trp Asn Leu Met 50 55 60 Thr Thr Ser Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Leu Lys Leu 65 70 75 80 Ile Asn Ala Lys Asn Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser 85 90 95 Leu Leu Ala Thr Ala Leu Ala Leu Pro Asp Asp Gly Lys Ile Phe Ala 100 105 110 Met Asp Ile Asn Arg Glu Asn Tyr Glu Leu Gly Leu Pro Val Ile Gln 115 120 125 Lys Ala Gly Val Ala His Lys Ile Asp Phe Lys Glu Gly Pro Ala Met 130 135 140 Pro Val Leu Asp Glu Leu Val Gln Asp Glu Lys Asn His Gly Ser Phe 145 150 155 160 Asp Phe Ile Phe Val Asp Ala Asp Lys Asp Asn Tyr Leu Asn Tyr His 165 170 175 Lys Arg Leu Ile Glu Leu Val Lys Val Gly Gly Leu Ile Gly Tyr Asp 180 185 190 Asn Thr Leu Trp Asn Gly Ser Val Val Ala Pro Pro Asp Ala Pro Leu 195 200 205 Arg Lys 
Tyr Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys 210 215 220 Ala Leu Ala Val Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly 225 230 235 240 Asp Gly Ile Thr Leu Cys Arg Arg Leu Lys 245 250 39744DNAEucalyptus globulusmisc_featuresubsp. globulusCDS(1)..(744)misc_feature(12)..(12)s is g or c 39atg gcc acc gcs gga gag gag agc cag acc caa gcc ggg agg cac cag 48Met Ala Thr Ala Gly Glu Glu Ser Gln Thr Gln Ala Gly Arg His Gln 1 5 10 15 gag gtt ggc cac aag tct ctc cat att cag agt gat gct ctt tac caa 96Glu Val Gly His Lys Ser Leu His Ile Gln Ser Asp Ala Leu Tyr Gln 20 25 30 tat att ttg gag acc agc gtg tac cca aga gag cct gag ccc atg aag 144Tyr Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Pro Met Lys 35 40 45 gag ctc agg gaa ata aca gca aaa cat cca tgg aac ata atg aca aca 192Glu Leu Arg Glu Ile Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr 50 55 60 tca gca gac gaa ggg cag ttc ttg aac atg ctt ctc aag ctc atc aac 240Ser Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Leu Lys Leu Ile Asn 65 70 75 80 gcc aag aac acc atg gag att ggt gtc ttc act ggc tac tct ctc ctt 288Ala Lys Asn Thr Met Glu Ile Gly Val Phe Thr Gly Tyr Ser Leu Leu 85 90 95 gcc acc gct ctt gct ctt cct gat gac gga aag att ttg gct atg gac 336Ala Thr Ala Leu Ala Leu Pro Asp Asp Gly Lys Ile Leu Ala Met Asp 100 105 110 att aac aga gag aac tat gaa ctt ggc ctg ccg gtc atc caa aaa gcc 384Ile Asn Arg Glu Asn Tyr Glu Leu Gly Leu Pro Val Ile Gln Lys Ala 115 120 125 ggt gtt gcc gac aag att gac ttc aga gaa ggc cct gct ttg cct att 432Gly Val Ala Asp Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Ile 130 135 140 ctt gat cag ttg atc gaa gat ggg aag caa ggg tcg ttc gac ttc ata 480Leu Asp Gln Leu Ile Glu Asp Gly Lys Gln Gly Ser Phe Asp Phe Ile 145 150 155 160 ttc gtg gac gcg gac aag gac aat tac ctc aac tac cac aag agg ctg 528Phe Val Asp Ala Asp Lys Asp Asn Tyr Leu Asn Tyr His Lys Arg Leu 165 170 175 atc gag ctt gtc aag gtt gga ggc ctc att ggc tac gac aac acc cta 576Ile Glu Leu Val Lys Val Gly Gly Leu Ile Gly Tyr Asp Asn Thr Leu 180 185 
190 tgg aac ggc tcc gtg gtt gcg ccg ccg gac gcc ccg ctc agg aag tat 624Trp Asn Gly Ser Val Val Ala Pro Pro Asp Ala Pro Leu Arg Lys Tyr 195 200 205 gtg agg tac tac agg gat ttt gtg ctg gag ctc aac aag gct ctt gcc 672Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala 210 215 220 gct gat cct agg att gag atc tgc atg ctc ccc gtg ggt gat ggc atc 720Ala Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile 225 230 235 240 act ctc tgc cgt cgg atc agc tga 744Thr Leu Cys Arg Arg Ile Ser 245 40247PRTEucalyptus globulus 40Met Ala Thr Ala Gly Glu Glu Ser Gln Thr Gln Ala Gly Arg His Gln 1 5 10 15 Glu Val Gly His Lys Ser Leu His Ile Gln Ser Asp Ala Leu Tyr Gln 20 25 30 Tyr Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Pro Met Lys 35 40 45 Glu Leu Arg Glu Ile Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr 50 55 60 Ser Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Leu Lys Leu Ile Asn 65 70 75 80 Ala Lys Asn Thr Met Glu Ile Gly Val Phe Thr Gly Tyr Ser Leu Leu 85 90 95 Ala Thr Ala Leu Ala Leu Pro Asp Asp Gly Lys Ile Leu Ala Met Asp 100 105 110 Ile Asn Arg Glu Asn Tyr Glu Leu Gly Leu Pro Val Ile Gln Lys Ala 115 120 125 Gly Val Ala Asp Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Ile 130 135 140 Leu Asp Gln Leu Ile Glu Asp Gly Lys Gln Gly Ser Phe Asp Phe Ile 145 150 155 160 Phe Val Asp Ala Asp Lys Asp Asn Tyr Leu Asn Tyr His Lys Arg Leu 165 170 175 Ile Glu Leu Val Lys Val Gly Gly Leu Ile Gly Tyr Asp Asn Thr Leu 180 185 190 Trp Asn Gly Ser Val Val Ala Pro Pro Asp Ala Pro Leu Arg Lys Tyr 195 200 205 Val Arg Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala 210 215 220 Ala Asp Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile 225 230 235 240 Thr Leu Cys Arg Arg Ile Ser 245 41768DNACunninghamia lanceolataCDS(1)..(768) 41atg gca agt aca aat gta cag aat ggt gca gat gca tcc aag gat tcg 48Met Ala Ser Thr Asn Val Gln Asn Gly Ala Asp Ala Ser Lys Asp Ser 1 5 10 15 act aag cag gtt agc cgt cac cag gaa gta ggc cac aag agc ctt ctt 96Thr Lys Gln Val Ser Arg His Gln Glu Val Gly 
His Lys Ser Leu Leu 20 25 30 cag agc gat gcc ctt tat cag tat ata ttg gaa aca agt gta tat ccc 144Gln Ser Asp Ala Leu Tyr Gln Tyr Ile Leu Glu Thr Ser Val Tyr Pro 35 40 45 cgt gag cct gag tca atg agg gag ctc aga gaa ata act gcc aag cat 192Arg Glu Pro Glu Ser Met Arg Glu Leu Arg Glu Ile Thr Ala Lys His 50 55 60 cca tgg aat ctg atg act act tcg gct gat gag ggc caa ttt tta aat 240Pro Trp Asn Leu Met Thr Thr Ser Ala Asp Glu Gly Gln Phe Leu Asn 65 70 75 80 ctg ttg ttg aag ctg ata aat gcc aag aac acc atg gag att ggt gtg 288Leu Leu Leu Lys Leu Ile Asn Ala Lys Asn Thr Met Glu Ile Gly Val 85 90 95 tat act ggt tac tcg ctt ctc agc act gct ctt gcc ctg cct gat gat 336Tyr Thr Gly Tyr Ser Leu Leu Ser Thr Ala Leu Ala Leu Pro Asp Asp 100 105 110 gga aag ata ata gca atg gac att aac agg gag aac tat gag ttg ggg 384Gly Lys Ile Ile Ala Met Asp Ile Asn Arg Glu Asn Tyr Glu Leu Gly 115 120 125 ctg cct gta att caa aaa gca ggg gtt gcc cac aaa att gac ttc aga 432Leu Pro Val Ile Gln Lys Ala Gly Val Ala His Lys Ile Asp Phe Arg 130 135 140 gag ggc cct gcc ctg cca gtt ctt gat caa atg ttg gaa aat aag gaa 480Glu Gly Pro Ala Leu Pro Val Leu Asp Gln Met Leu Glu Asn Lys Glu 145 150 155 160 atg cat ggc tcc ttc gat ttc ata ttt gtg gac gca gac aaa gac aat 528Met His Gly Ser Phe Asp Phe Ile Phe Val Asp Ala Asp Lys Asp Asn 165 170 175 tat ctg aat tac cac aag cgg ctg att gat ctg gtt aag att ggg gga 576Tyr Leu Asn Tyr His Lys Arg Leu Ile Asp Leu Val Lys Ile Gly Gly 180 185 190 gtg atc ggc tat gac aat act ctg tgg aat gga tca gtg gtg gct cca 624Val Ile Gly Tyr Asp Asn Thr Leu Trp Asn
Gly Ser Val Val Ala Pro 195 200 205 ccc gat gcc ccg cta agg aaa tat gtg aga tat tac aga gat ttt gta 672Pro Asp Ala Pro Leu Arg Lys Tyr Val Arg Tyr Tyr Arg Asp Phe Val 210 215 220 att gaa ctg aac aag gcc ctg gct gca gac cct cgt att gaa atc agc 720Ile Glu Leu Asn Lys Ala Leu Ala Ala Asp Pro Arg Ile Glu Ile Ser 225 230 235 240 caa att cca gta gga gat ggc atc act ctt tgc agg agg gtt tct taa 768Gln Ile Pro Val Gly Asp Gly Ile Thr Leu Cys Arg Arg Val Ser 245 250 255 42255PRTCunninghamia lanceolata 42Met Ala Ser Thr Asn Val Gln Asn Gly Ala Asp Ala Ser Lys Asp Ser 1 5 10 15 Thr Lys Gln Val Ser Arg His Gln Glu Val Gly His Lys Ser Leu Leu 20 25 30 Gln Ser Asp Ala Leu Tyr Gln Tyr Ile Leu Glu Thr Ser Val Tyr Pro 35 40 45 Arg Glu Pro Glu Ser Met Arg Glu Leu Arg Glu Ile Thr Ala Lys His 50 55 60 Pro Trp Asn Leu Met Thr Thr Ser Ala Asp Glu Gly Gln Phe Leu Asn 65 70 75 80 Leu Leu Leu Lys Leu Ile Asn Ala Lys Asn Thr Met Glu Ile Gly Val 85 90 95 Tyr Thr Gly Tyr Ser Leu Leu Ser Thr Ala Leu Ala Leu Pro Asp Asp 100 105 110 Gly Lys Ile Ile Ala Met Asp Ile Asn Arg Glu Asn Tyr Glu Leu Gly 115 120 125 Leu Pro Val Ile Gln Lys Ala Gly Val Ala His Lys Ile Asp Phe Arg 130 135 140 Glu Gly Pro Ala Leu Pro Val Leu Asp Gln Met Leu Glu Asn Lys Glu 145 150 155 160 Met His Gly Ser Phe Asp Phe Ile Phe Val Asp Ala Asp Lys Asp Asn 165 170 175 Tyr Leu Asn Tyr His Lys Arg Leu Ile Asp Leu Val Lys Ile Gly Gly 180 185 190 Val Ile Gly Tyr Asp Asn Thr Leu Trp Asn Gly Ser Val Val Ala Pro 195 200 205 Pro Asp Ala Pro Leu Arg Lys Tyr Val Arg Tyr Tyr Arg Asp Phe Val 210 215 220 Ile Glu Leu Asn Lys Ala Leu Ala Ala Asp Pro Arg Ile Glu Ile Ser 225 230 235 240 Gln Ile Pro Val Gly Asp Gly Ile Thr Leu Cys Arg Arg Val Ser 245 250 255 43777DNAPanicum virgatumCDS(1)..(777) 43atg gcc agc acg gcg gcc gag gcg gcg aag gcg gcg gag cag ccg gcc 48Met Ala Ser Thr Ala Ala Glu Ala Ala Lys Ala Ala Glu Gln Pro Ala 1 5 10 15 aac ggc aac ggc gag cag aag acg cgc cac tcc gag gtc ggc cac aag 96Asn Gly Asn Gly Glu Gln Lys Thr Arg His Ser Glu 
Val Gly His Lys 20 25 30 agc ctg ctc aag agc gac gac ctc tac cag tac atc ctg gac acg agc 144Ser Leu Leu Lys Ser Asp Asp Leu Tyr Gln Tyr Ile Leu Asp Thr Ser 35 40 45 gtg tac ccg cgg gag ccc gag agc atg aag gag ctc cgc gag atc acc 192Val Tyr Pro Arg Glu Pro Glu Ser Met Lys Glu Leu Arg Glu Ile Thr 50 55 60 gcc aag cac ccg tgg aac ctg atg acg acg tcg gcg gac gag ggg cag 240Ala Lys His Pro Trp Asn Leu Met Thr Thr Ser Ala Asp Glu Gly Gln 65 70 75 80 ttc ctc aac atg ctc atc aag ctc atc ggc gcc aag aag acc atg gag 288Phe Leu Asn Met Leu Ile Lys Leu Ile Gly Ala Lys Lys Thr Met Glu 85 90 95 atc ggc gtc tac acc ggc tac tcc ctc ctc gcc acc gcc ctc gcg ctc 336Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala Thr Ala Leu Ala Leu 100 105 110 ccc gag gac ggc acg atc ttg gcc atg gac atc aac cgc gag aac tac 384Pro Glu Asp Gly Thr Ile Leu Ala Met Asp Ile Asn Arg Glu Asn Tyr 115 120 125 gag ctc ggc ctg ccc tgc atc gag aag gcc ggc gtc gcc cac aag atc 432Glu Leu Gly Leu Pro Cys Ile Glu Lys Ala Gly Val Ala His Lys Ile 130 135 140 gac ttc cgc gag ggc ccc gcg ctc ccc gtc ctc gac gac ctc atc gcc 480Asp Phe Arg Glu Gly Pro Ala Leu Pro Val Leu Asp Asp Leu Ile Ala 145 150 155 160 gac gag aag aac cac ggc acc ttc gac ttc gcc ttc gtg gac gcc gac 528Asp Glu Lys Asn His Gly Thr Phe Asp Phe Ala Phe Val Asp Ala Asp 165 170 175 aag gac aac tac ctc aac tac cac gag cgg ctg ctc aag ctc gtg aag 576Lys Asp Asn Tyr Leu Asn Tyr His Glu Arg Leu Leu Lys Leu Val Lys 180 185 190 ctc ggc ggc ctc atc ggc tac gac aac acg ctg tgg aac ggc tcc gtc 624Leu Gly Gly Leu Ile Gly Tyr Asp Asn Thr Leu Trp Asn Gly Ser Val 195 200 205 gtg ctc ccc gac gac gcg ccc atg cgc aag tac atc cgc tac tac cgc 672Val Leu Pro Asp Asp Ala Pro Met Arg Lys Tyr Ile Arg Tyr Tyr Arg 210 215 220 gac ttc gtg ctc gtg ctc aac aag gcg ctc gcc gcc gac gag cgc gtc 720Asp Phe Val Leu Val Leu Asn Lys Ala Leu Ala Ala Asp Glu Arg Val 225 230 235 240 gag atc tgc cag ctc ccc gtc ggc gac ggc gtc acc ctc tgc cgc cgc 768Glu Ile Cys Gln Leu Pro Val Gly Asp Gly Val Thr 
Leu Cys Arg Arg 245 250 255 gtc aag tga 777Val Lys 44258PRTPanicum virgatum 44Met Ala Ser Thr Ala Ala Glu Ala Ala Lys Ala Ala Glu Gln Pro Ala 1 5 10 15 Asn Gly Asn Gly Glu Gln Lys Thr Arg His Ser Glu Val Gly His Lys 20 25 30 Ser Leu Leu Lys Ser Asp Asp Leu Tyr Gln Tyr Ile Leu Asp Thr Ser 35 40 45 Val Tyr Pro Arg Glu Pro Glu Ser Met Lys Glu Leu Arg Glu Ile Thr 50 55 60 Ala Lys His Pro Trp Asn Leu Met Thr Thr Ser Ala Asp Glu Gly Gln 65 70 75 80 Phe Leu Asn Met Leu Ile Lys Leu Ile Gly Ala Lys Lys Thr Met Glu 85 90 95 Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala Thr Ala Leu Ala Leu 100 105 110 Pro Glu Asp Gly Thr Ile Leu Ala Met Asp Ile Asn Arg Glu Asn Tyr 115 120 125 Glu Leu Gly Leu Pro Cys Ile Glu Lys Ala Gly Val Ala His Lys Ile 130 135 140 Asp Phe Arg Glu Gly Pro Ala Leu Pro Val Leu Asp Asp Leu Ile Ala 145 150 155 160 Asp Glu Lys Asn His Gly Thr Phe Asp Phe Ala Phe Val Asp Ala Asp 165 170 175 Lys Asp Asn Tyr Leu Asn Tyr His Glu Arg Leu Leu Lys Leu Val Lys 180 185 190 Leu Gly Gly Leu Ile Gly Tyr Asp Asn Thr Leu Trp Asn Gly Ser Val 195 200 205 Val Leu Pro Asp Asp Ala Pro Met Arg Lys Tyr Ile Arg Tyr Tyr Arg 210 215 220 Asp Phe Val Leu Val Leu Asn Lys Ala Leu Ala Ala Asp Glu Arg Val 225 230 235 240 Glu Ile Cys Gln Leu Pro Val Gly Asp Gly Val Thr Leu Cys Arg Arg 245 250 255 Val Lys 45738DNACamellia sinensisCDS(1)..(738) 45atg gca aca aac gga gaa gga gaa cag aat ctc agg cac caa gag gtc 48Met Ala Thr Asn Gly Glu Gly Glu Gln Asn Leu Arg His Gln Glu Val 1 5 10 15 ggc cac aag agt ctt tta cag agc gat gct ctc tac cag tat ata ctt 96Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr Ile Leu 20 25 30 gag acc agt gtt tac cca aga gag cca gag gcg atg aag gag ctc aga 144Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Ala Met Lys Glu Leu Arg 35 40 45 gag gtc act gca aaa cat cca tgg aac atc atg act acc tct gcc gac 192Glu Val Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr Ser Ala Asp 50 55 60 gaa ggt cag ttc ttg aac atg ctt ttg aag ctt atc aac gcc aag aac 240Glu Gly Gln Phe Leu Asn Met Leu Leu 
Lys Leu Ile Asn Ala Lys Asn 65 70 75 80 acg atg gaa atc ggt gtt tac act ggt tac tct ctt cta gcc acc gcc 288Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala Thr Ala 85 90 95 ctt gct ctc ccc gat gat ggg aag att ttg gca atg gac att aac aga 336Leu Ala Leu Pro Asp Asp Gly Lys Ile Leu Ala Met Asp Ile Asn Arg 100 105 110 gat aac ttc gaa atc ggt ctg ccg ata att gaa aag gcc ggc gtc gct 384Asp Asn Phe Glu Ile Gly Leu Pro Ile Ile Glu Lys Ala Gly Val Ala 115 120 125 cac aaa atc gac ttc aga gaa ggc ccc gct ctg cct gct ctc gat aaa 432His Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Ala Leu Asp Lys 130 135 140 atg atc gaa gat gga aag cat cat ggg tcg ttt gat ttc att ttc gtg 480Met Ile Glu Asp Gly Lys His His Gly Ser Phe Asp Phe Ile Phe Val 145 150 155 160 gac gct gac aag gac aac tac aac aac tac cac aag agg ctg att gat 528Asp Ala Asp Lys Asp Asn Tyr Asn Asn Tyr His Lys Arg Leu Ile Asp 165 170 175 ctg gtg aag gtt ggg gga ctg atc ggc tac gat aac acc ctc tgg aac 576Leu Val Lys Val Gly Gly Leu Ile Gly Tyr Asp Asn Thr Leu Trp Asn 180 185 190 ggc tct gtg gtg gcg cct ccg gac gct ccg atg agg aag tac gta agg 624Gly Ser Val Val Ala Pro Pro Asp Ala Pro Met Arg Lys Tyr Val Arg 195 200 205 tac tac aga gac ttc gtc ctg gag ctc aac aag gca ctc gcc gcc gat 672Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala Ala Asp 210 215 220 ccc cgc atc gag atc tgc atg ctt ccc gtc ggc gat ggc att acc ctg 720Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile Thr Leu 225 230 235 240 tgc cgg cgt gtc tgc tga 738Cys Arg Arg Val Cys 245 46245PRTCamellia sinensis 46Met Ala Thr Asn Gly Glu Gly Glu Gln Asn Leu Arg His Gln Glu Val 1 5 10 15 Gly His Lys Ser Leu Leu Gln Ser Asp Ala Leu Tyr Gln Tyr Ile Leu 20 25 30 Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Ala Met Lys Glu Leu Arg 35 40 45 Glu Val Thr Ala Lys His Pro Trp Asn Ile Met Thr Thr Ser Ala Asp 50 55 60 Glu Gly Gln Phe Leu Asn Met Leu Leu Lys Leu Ile Asn Ala Lys Asn 65 70 75 80 Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser Leu Leu Ala Thr Ala 85 90 95 Leu 
Ala Leu Pro Asp Asp Gly Lys Ile Leu Ala Met Asp Ile Asn Arg 100 105 110 Asp Asn Phe Glu Ile Gly Leu Pro Ile Ile Glu Lys Ala Gly Val Ala 115 120 125 His Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Ala Leu Asp Lys 130 135 140 Met Ile Glu Asp Gly Lys His His Gly Ser Phe Asp Phe Ile Phe Val 145 150 155 160 Asp Ala Asp Lys Asp Asn Tyr Asn Asn Tyr His Lys Arg Leu Ile Asp 165 170 175 Leu Val Lys Val Gly Gly Leu Ile Gly Tyr Asp Asn Thr Leu Trp Asn 180 185 190 Gly Ser Val Val Ala Pro Pro Asp Ala Pro Met Arg Lys Tyr Val Arg 195 200 205 Tyr Tyr Arg Asp Phe Val Leu Glu Leu Asn Lys Ala Leu Ala Ala Asp 210 215 220 Pro Arg Ile Glu Ile Cys Met Leu Pro Val Gly Asp Gly Ile Thr Leu 225 230 235 240 Cys Arg Arg Val Cys 245 47801DNAZea maysCDS(1)..(801) 47atg gcc acc acg gcg acc gag gcg acc aag acg act gca ccg gcg cgg 48Met Ala Thr Thr Ala Thr Glu Ala Thr Lys Thr Thr Ala Pro Ala Arg 1 5 10 15 gag cag cag gcc aac ggc aac ggc aac ggc aac ggc gag cag aag acg 96Glu Gln Gln Ala Asn Gly Asn Gly Asn Gly Asn Gly Glu Gln Lys Thr 20 25 30 cgc cac tcc gag gtc ggc cac aag agc ctg ctc aag agc gac gac ctc 144Arg His Ser Glu Val Gly His Lys Ser Leu Leu Lys Ser Asp Asp Leu 35 40 45 tac cag tac atc ctg gac acg agc gtg tac ccg cgg gag ccg gag agc 192Tyr Gln Tyr Ile Leu Asp Thr Ser Val Tyr Pro Arg Glu Pro Glu Ser 50 55 60 atg aag gag ctg cgc gag atc acc gcc aag cac cca tgg aac ctg atg 240Met Lys Glu Leu Arg Glu Ile Thr Ala Lys His Pro Trp Asn Leu Met 65 70 75 80 acc acc tcc gcc gac gag ggc cag ttc ctc aac atg ctc atc aag ctc 288Thr Thr Ser Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Ile Lys Leu 85 90 95 atc ggc gcc aag aag acc atg gag atc ggc gtc tac acc ggc tac tcg 336Ile Gly Ala Lys Lys Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser 100 105 110 ctc ctc gcc acc gcg ctc gca ctc ccg gag gac ggc acg atc ttg gcc 384Leu Leu Ala Thr Ala Leu Ala Leu Pro Glu Asp Gly Thr Ile Leu Ala 115 120 125 atg gac atc aac cgc gag aac tac gag cta ggc ctt ccc tgc atc aac 432Met Asp Ile Asn Arg Glu Asn Tyr Glu Leu Gly Leu Pro Cys Ile 
Asn 130 135 140 aag gcc ggc gtg ggc cac aag atc gac ttc cgc gag ggc ccc gcg ctc 480Lys Ala Gly Val Gly His Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu 145 150 155 160 ccc gtc ctg gac gac ctc gtg gcg gac aag gag cag cac ggg tcg ttc 528Pro Val Leu Asp Asp Leu Val Ala Asp Lys Glu Gln His Gly Ser Phe 165 170 175 gac ttc gcc ttc gtg gac gcc gac aag gac aac tac ctc agc tac cac 576Asp Phe Ala Phe Val Asp Ala Asp Lys Asp Asn Tyr Leu Ser Tyr His 180 185 190 gag cgg ctc ctg aag ctg gtg agg ccc ggc ggc ctc atc ggc tac gac 624Glu Arg Leu Leu Lys Leu Val Arg Pro Gly Gly Leu Ile Gly Tyr Asp 195 200 205 aac acg ctg tgg aac ggc tcc gtc gtg ctc ccc gac gac gcg ccc atg 672Asn Thr Leu Trp Asn Gly Ser Val Val Leu Pro Asp Asp Ala Pro Met 210 215 220 cgc aag tac atc cgc ttc tac cgc gac ttc gtc ctc gcc ctc aac agc 720Arg Lys Tyr Ile Arg Phe Tyr Arg Asp Phe Val Leu Ala Leu Asn Ser 225 230 235 240 gcg ctc gcc gcc gac gac cgc gtc gag atc tgc cag ctc ccc gtc ggc 768Ala Leu Ala Ala Asp Asp Arg Val Glu Ile Cys Gln Leu Pro Val Gly 245 250 255 gac ggc gtc acg ctc tgc cgc cgc gtc aag tga 801Asp Gly Val Thr Leu Cys Arg Arg Val Lys 260 265 48266PRTZea mays 48Met Ala Thr Thr Ala Thr Glu Ala Thr Lys Thr Thr Ala Pro Ala Arg 1 5 10 15 Glu Gln Gln Ala Asn Gly Asn Gly Asn Gly Asn Gly Glu Gln Lys Thr 20 25 30 Arg His Ser Glu Val Gly His Lys Ser Leu Leu Lys Ser Asp Asp Leu 35 40 45 Tyr Gln Tyr Ile Leu Asp Thr Ser Val Tyr Pro Arg Glu Pro Glu Ser 50 55 60 Met Lys Glu Leu Arg Glu Ile Thr Ala Lys His Pro Trp
Asn Leu Met 65 70 75 80 Thr Thr Ser Ala Asp Glu Gly Gln Phe Leu Asn Met Leu Ile Lys Leu 85 90 95 Ile Gly Ala Lys Lys Thr Met Glu Ile Gly Val Tyr Thr Gly Tyr Ser 100 105 110 Leu Leu Ala Thr Ala Leu Ala Leu Pro Glu Asp Gly Thr Ile Leu Ala 115 120 125 Met Asp Ile Asn Arg Glu Asn Tyr Glu Leu Gly Leu Pro Cys Ile Asn 130 135 140 Lys Ala Gly Val Gly His Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu 145 150 155 160 Pro Val Leu Asp Asp Leu Val Ala Asp Lys Glu Gln His Gly Ser Phe 165 170 175 Asp Phe Ala Phe Val Asp Ala Asp Lys Asp Asn Tyr Leu Ser Tyr His 180 185 190 Glu Arg Leu Leu Lys Leu Val Arg Pro Gly Gly Leu Ile Gly Tyr Asp 195 200 205 Asn Thr Leu Trp Asn Gly Ser Val Val Leu Pro Asp Asp Ala Pro Met 210 215 220 Arg Lys Tyr Ile Arg Phe Tyr Arg Asp Phe Val Leu Ala Leu Asn Ser 225 230 235 240 Ala Leu Ala Ala Asp Asp Arg Val Glu Ile Cys Gln Leu Pro Val Gly 245 250 255 Asp Gly Val Thr Leu Cys Arg Arg Val Lys 260 265 494431DNABrassica rapaCDS(1)..(4431) 49atg gct aat atg gct gga gca gac gag att gag tcg ttg aga gtg gag 48Met Ala Asn Met Ala Gly Ala Asp Glu Ile Glu Ser Leu Arg Val Glu 1 5 10 15 ctt gca gag att gga aga agc atc aga tca tcg ttc cat aga cac acc 96Leu Ala Glu Ile Gly Arg Ser Ile Arg Ser Ser Phe His Arg His Thr 20 25 30 tcg agt ttc aga agc ggc tct tca agg tat gaa cct gat cat gat ggt 144Ser Ser Phe Arg Ser Gly Ser Ser Arg Tyr Glu Pro Asp His Asp Gly 35 40 45 gag ggc aat aat acg aat gca gag tat gct ctg caa tgg gct gag atc 192Glu Gly Asn Asn Thr Asn Ala Glu Tyr Ala Leu Gln Trp Ala Glu Ile 50 55 60 gag aga ttg cca acc gtc aaa cgc atg aga tcc tct ctc ctt gat gat 240Glu Arg Leu Pro Thr Val Lys Arg Met Arg Ser Ser Leu Leu Asp Asp 65 70 75 80 ggt gat gag tcc atg gcc gag aaa ggt aaa aga gtc gtt gat gtc acg 288Gly Asp Glu Ser Met Ala Glu Lys Gly Lys Arg Val Val Asp Val Thr 85 90 95 aag ctt gga gcc atg gaa cgt cat ctg atg att gag aaa ctc atc aaa 336Lys Leu Gly Ala Met Glu Arg His Leu Met Ile Glu Lys Leu Ile Lys 100 105 110 cac att gag aat gat aat ctc aag ttg ctc aag aaa atc agg aga 
aga 384His Ile Glu Asn Asp Asn Leu Lys Leu Leu Lys Lys Ile Arg Arg Arg 115 120 125 ata gac aga gtt gga atg gag tta ccg acc ata gaa gtg agg tat gag 432Ile Asp Arg Val Gly Met Glu Leu Pro Thr Ile Glu Val Arg Tyr Glu 130 135 140 ggt tta aaa gtg gag gca gag tgc gag att gtt gaa ggg aag gca ctt 480Gly Leu Lys Val Glu Ala Glu Cys Glu Ile Val Glu Gly Lys Ala Leu 145 150 155 160 cca aca ctg tgg aac act gct aag cgt gtt ttg tct gaa ctg gtg aag 528Pro Thr Leu Trp Asn Thr Ala Lys Arg Val Leu Ser Glu Leu Val Lys 165 170 175 ctc act ggt gca aaa aca cga gaa gcc aag ata agc att ctt aat gat 576Leu Thr Gly Ala Lys Thr Arg Glu Ala Lys Ile Ser Ile Leu Asn Asp 180 185 190 gtt aat ggc att ata aaa cca gga agg tta aca ctg ttg ctt ggt cct 624Val Asn Gly Ile Ile Lys Pro Gly Arg Leu Thr Leu Leu Leu Gly Pro 195 200 205 cct gga tgt gga aaa acg act ttg tta aag gcc tta tca gga aac tta 672Pro Gly Cys Gly Lys Thr Thr Leu Leu Lys Ala Leu Ser Gly Asn Leu 210 215 220 gaa aac aat cta aag tgt tca ggt gaa atc tcc tac aat ggg cat aga 720Glu Asn Asn Leu Lys Cys Ser Gly Glu Ile Ser Tyr Asn Gly His Arg 225 230 235 240 ctt gac gag ttt gtt cct cag aaa aca tcc gcg tac ata agc caa tat 768Leu Asp Glu Phe Val Pro Gln Lys Thr Ser Ala Tyr Ile Ser Gln Tyr 245 250 255 gat ctg cac att gct gag atg aca gtg agg gag aca gtc gac ttc tca 816Asp Leu His Ile Ala Glu Met Thr Val Arg Glu Thr Val Asp Phe Ser 260 265 270 gct cgt tgt cag ggt gtt gga agc cga aca gaa att atg atg gaa gtt 864Ala Arg Cys Gln Gly Val Gly Ser Arg Thr Glu Ile Met Met Glu Val 275 280 285 agt aaa aga gaa aag gaa gca gga atc att cct gac aca gaa gtg gat 912Ser Lys Arg Glu Lys Glu Ala Gly Ile Ile Pro Asp Thr Glu Val Asp 290 295 300 gct tac atg aaa gca ata tct gtt gaa gga ctt gaa aga agt ctg caa 960Ala Tyr Met Lys Ala Ile Ser Val Glu Gly Leu Glu Arg Ser Leu Gln 305 310 315 320 aca gat tac atc ttg aag att ctt gga ctc gac att tgc gca gaa aca 1008Thr Asp Tyr Ile Leu Lys Ile Leu Gly Leu Asp Ile Cys Ala Glu Thr 325 330 335 ttg att gga gat gtg atg agg aga ggc ata tca 
ggg ggc caa aag aaa 1056Leu Ile Gly Asp Val Met Arg Arg Gly Ile Ser Gly Gly Gln Lys Lys 340 345 350 cgt ctt acc aca gcc gag atg atc gtt ggt cca aca aag gca ctg ttt 1104Arg Leu Thr Thr Ala Glu Met Ile Val Gly Pro Thr Lys Ala Leu Phe 355 360 365 atg gat gaa ata aca aac ggc tta gac agt tcc acg gct ttt cag att 1152Met Asp Glu Ile Thr Asn Gly Leu Asp Ser Ser Thr Ala Phe Gln Ile 370 375 380 gtt aaa tct ctt cag cag ctg gct cac ata tca aac gct act gtt gtt 1200Val Lys Ser Leu Gln Gln Leu Ala His Ile Ser Asn Ala Thr Val Val 385 390 395 400 gtt tcg ctt ctt caa cct gct cca gag tcc ttt gac ctc ttt gat gac 1248Val Ser Leu Leu Gln Pro Ala Pro Glu Ser Phe Asp Leu Phe Asp Asp 405 410 415 gtt atg ctg atg gcc aag ggg aaa ata gtg tat cat ggc cca cgc ggt 1296Val Met Leu Met Ala Lys Gly Lys Ile Val Tyr His Gly Pro Arg Gly 420 425 430 gag gtc ctg aac ttc ttt gag gag tgt gga ttc caa tgc cct gaa agg 1344Glu Val Leu Asn Phe Phe Glu Glu Cys Gly Phe Gln Cys Pro Glu Arg 435 440 445 aaa ggt gtt gca gac tat ctc cag gag gtt ata tca aga aaa gac caa 1392Lys Gly Val Ala Asp Tyr Leu Gln Glu Val Ile Ser Arg Lys Asp Gln 450 455 460 gca caa tac tgg cgg cat gag gat gta cct tat agc ttt gtc tcg gta 1440Ala Gln Tyr Trp Arg His Glu Asp Val Pro Tyr Ser Phe Val Ser Val 465 470 475 480 gac atg ttg tcg aag aaa ttc aag gac ttc agc atc ggg aag aag att 1488Asp Met Leu Ser Lys Lys Phe Lys Asp Phe Ser Ile Gly Lys Lys Ile 485 490 495 gag gac gct cta tct aag cca tat gat aga tca aaa agc cat aag gat 1536Glu Asp Ala Leu Ser Lys Pro Tyr Asp Arg Ser Lys Ser His Lys Asp 500 505 510 gct ctt tcc ttc agc gtg tac tct cta cca aac tgg gag atg ttc ata 1584Ala Leu Ser Phe Ser Val Tyr Ser Leu Pro Asn Trp Glu Met Phe Ile 515 520 525 gct tgc ata tca aga gag tat ctt ctc atg aag aga aac tat ttc gtc 1632Ala Cys Ile Ser Arg Glu Tyr Leu Leu Met Lys Arg Asn Tyr Phe Val 530 535 540 tat ata ttc aag acg ggt cag ctt gtg atg gca gca ttc atc act atg 1680Tyr Ile Phe Lys Thr Gly Gln Leu Val Met Ala Ala Phe Ile Thr Met 545 550 555 560 act gtg ttt atc 
cga aca cgg atg ggt att gat atc ctt cat gga aac 1728Thr Val Phe Ile Arg Thr Arg Met Gly Ile Asp Ile Leu His Gly Asn 565 570 575 tct tac atg agt gcc ctc ttc ttc gcc gtc atc att ctt ctt gtt gat 1776Ser Tyr Met Ser Ala Leu Phe Phe Ala Val Ile Ile Leu Leu Val Asp 580 585 590 gga ttc cct gag ttg gct atg acg gct caa cgc tta gcg gtg ttt tac 1824Gly Phe Pro Glu Leu Ala Met Thr Ala Gln Arg Leu Ala Val Phe Tyr 595 600 605 aaa cag aag cag ttg tgt ttc tat cca gca tgg gct tat gca atc cct 1872Lys Gln Lys Gln Leu Cys Phe Tyr Pro Ala Trp Ala Tyr Ala Ile Pro 610 615 620 gca acg gtg tta aag gtc cca ctg tca tta ctg gaa tct ttc gtt tgg 1920Ala Thr Val Leu Lys Val Pro Leu Ser Leu Leu Glu Ser Phe Val Trp 625 630 635 640 acc ggc ctg aca tac tat gtc att ggg tac acc cct gaa gct tcc agg 1968Thr Gly Leu Thr Tyr Tyr Val Ile Gly Tyr Thr Pro Glu Ala Ser Arg 645 650 655 ttc ttc aag cag ttc att cta ctg ttt ctt gtt cac ttc act tcg ata 2016Phe Phe Lys Gln Phe Ile Leu Leu Phe Leu Val His Phe Thr Ser Ile 660 665 670 tcc atg ttt cgg tgc ctc gct gca atc ttc cag aca gta gtt gct tca 2064Ser Met Phe Arg Cys Leu Ala Ala Ile Phe Gln Thr Val Val Ala Ser 675 680 685 gtc aca gct ggc agt ttt ggt ata tta atc aca ttt gtc ttt gcc ggt 2112Val Thr Ala Gly Ser Phe Gly Ile Leu Ile Thr Phe Val Phe Ala Gly 690 695 700 ttt gtc att cca cca cct tct atg cct gca tgg ctc aag tgg ggt ttc 2160Phe Val Ile Pro Pro Pro Ser Met Pro Ala Trp Leu Lys Trp Gly Phe 705 710 715 720 tgg gcg aat cct ttg agt tac agt gag att ggg cta tcg gta aat gag 2208Trp Ala Asn Pro Leu Ser Tyr Ser Glu Ile Gly Leu Ser Val Asn Glu 725 730 735 ttt ctt gct cca agg tgg aac cag ata caa cca agt act aat ctt acc 2256Phe Leu Ala Pro Arg Trp Asn Gln Ile Gln Pro Ser Thr Asn Leu Thr 740 745 750 tta ggt aga acc ata ctc gaa agc cgt gga ctg aac tac gat ggt tat 2304Leu Gly Arg Thr Ile Leu Glu Ser Arg Gly Leu Asn Tyr Asp Gly Tyr 755 760 765 atg tat tgg gta tca ctc tgt gcc ttg gtg ggt ttc act gtg ctc ttc 2352Met Tyr Trp Val Ser Leu Cys Ala Leu Val Gly Phe Thr Val Leu Phe 770 
775 780 aac aca att ttc act ctg gcg ctg act ttc ctg aaa tca cca aca tca 2400Asn Thr Ile Phe Thr Leu Ala Leu Thr Phe Leu Lys Ser Pro Thr Ser 785 790 795 800 tca cga gcc atg atc tca caa gaa aaa ctc tct gag ctg caa gga aca 2448Ser Arg Ala Met Ile Ser Gln Glu Lys Leu Ser Glu Leu Gln Gly Thr 805 810 815 gaa gat aca aca gac tac tct tcc atc aag aaa aag acc aca gat tcc 2496Glu Asp Thr Thr Asp Tyr Ser Ser Ile Lys Lys Lys Thr Thr Asp Ser 820 825 830 cct gta aaa aca gaa ggc aag atg gtg tta cct ttc aag ccc ctc act 2544Pro Val Lys Thr Glu Gly Lys Met Val Leu Pro Phe Lys Pro Leu Thr 835 840 845 gta aca ttt caa gaa cta aac tac ttc gtt gac act cca gtg gag atg 2592Val Thr Phe Gln Glu Leu Asn Tyr Phe Val Asp Thr Pro Val Glu Met 850 855 860 aga gag caa gga tat gct aac aag aag ctg caa cta ctc aca gac atc 2640Arg Glu Gln Gly Tyr Ala Asn Lys Lys Leu Gln Leu Leu Thr Asp Ile 865 870 875 880 acc gga gct ttc cgt ccg gga atc cta acg gcg tta atg gga gtg agc 2688Thr Gly Ala Phe Arg Pro Gly Ile Leu Thr Ala Leu Met Gly Val Ser 885 890 895 gga gcc gga aag acc aca ctc ctc gac gtc cta gcc gga aga aaa acg 2736Gly Ala Gly Lys Thr Thr Leu Leu Asp Val Leu Ala Gly Arg Lys Thr 900 905 910 agc gga tac ata gaa ggc gac atc aga atc agc ggc ttc cct aaa gtc 2784Ser Gly Tyr Ile Glu Gly Asp Ile Arg Ile Ser Gly Phe Pro Lys Val 915 920 925 caa gaa acg ttc gcc aga gtc tca ggc tac tgc gaa caa aca gat att 2832Gln Glu Thr Phe Ala Arg Val Ser Gly Tyr Cys Glu Gln Thr Asp Ile 930 935 940 cac tca cca aac atc acc gtc gaa gaa tcc gtc atc tac tcc gct tgg 2880His Ser Pro Asn Ile Thr Val Glu Glu Ser Val Ile Tyr Ser Ala Trp 945 950 955 960 ctc cgt ctc gct cct gag atc gag tcc gca acc aaa acc gta cgc atc 2928Leu Arg Leu Ala Pro Glu Ile Glu Ser Ala Thr Lys Thr Val Arg Ile 965 970 975 tcc tcc ttc ttc ttc ttc ttc ctt ctt ctt ccc cgc gca aat tcg aca 2976Ser Ser Phe Phe Phe Phe Phe Leu Leu Leu Pro Arg Ala Asn Ser Thr 980 985 990 cca atc tca acc caa tct tta cag gaa ttc gtg agg caa gtg ctg gag 3024Pro Ile Ser Thr Gln Ser Leu Gln Glu Phe 
Val Arg Gln Val Leu Glu 995 1000 1005 acg atc gag tta gac gag atc aag gat gcg ttg gtg gga gtc gcc 3069Thr Ile Glu Leu Asp Glu Ile Lys Asp Ala Leu Val Gly Val Ala 1010 1015 1020 gga gag agc gga tta tcg acg gag cag agg aaa cgg ctt acg atc 3114Gly Glu Ser Gly Leu Ser Thr Glu Gln Arg Lys Arg Leu Thr Ile 1025 1030 1035 gcg gtg gag ttg gtg gcg aat ccg tcg atc atc ttc atg gac gag 3159Ala Val Glu Leu Val Ala Asn Pro Ser Ile Ile Phe Met Asp Glu 1040 1045 1050 cct acg acg gga ttg gat gca aga gca gcc gcc att gtt atg aga 3204Pro Thr Thr Gly Leu Asp Ala Arg Ala Ala Ala Ile Val Met Arg 1055 1060 1065 gct gtg aag aac gta gct gac act gga cga acc atc gtc tgc act 3249Ala Val Lys Asn Val Ala Asp Thr Gly Arg Thr Ile Val Cys Thr 1070 1075 1080 att cat cag cct agc ata gat att ttc gaa gct ttc gac gag ttg 3294Ile His Gln Pro Ser Ile Asp Ile Phe Glu Ala Phe Asp Glu Leu 1085 1090 1095 gtc ctt ctc aaa aga ggt ggt cgc atg atc tac aca gga cca cta 3339Val Leu Leu Lys Arg Gly Gly Arg Met Ile Tyr Thr Gly Pro Leu 1100 1105 1110 ggc cta aac tca tgt cat att att gag tat ttt gag aat gtt ccc 3384Gly Leu Asn Ser Cys His Ile Ile Glu Tyr Phe Glu Asn Val Pro 1115 1120 1125 gga gtt cct aaa ata aga gac aac cac aat cct gca aca tgg atg 3429Gly Val Pro Lys Ile Arg Asp Asn His Asn Pro Ala Thr Trp Met 1130 1135 1140 ctt gat gtt agt tca caa tct gcg gaa gtt gaa ctt ggt gtc gat 3474Leu Asp Val Ser Ser Gln Ser Ala Glu Val Glu Leu Gly Val Asp 1145 1150 1155 ttc gct aaa atc tac cac gaa tcc cct ctt ttc aag agc aac tca 3519Phe Ala Lys Ile Tyr His Glu Ser Pro Leu Phe Lys Ser Asn Ser 1160 1165 1170 gag ctt gtg aaa cag ttg agc caa cca gat tca ggg tca agt gat 3564Glu Leu Val Lys Gln Leu Ser Gln Pro Asp Ser Gly Ser Ser Asp 1175 1180 1185 tta cag ttt aaa aga act tat gca cag agc tgg tat gga caa ttc 3609Leu Gln Phe Lys Arg Thr Tyr Ala Gln Ser Trp Tyr Gly Gln Phe 1190 1195 1200 aaa tcc att ttg tgg aag atg aac ttg tct tac tgg agg aac cct 3654Lys Ser Ile Leu Trp Lys Met Asn Leu Ser Tyr Trp Arg Asn Pro
1205 1210 1215 tct tat aac cta atg cgt ttg att cac aca tta atc tct tct ttg 3699Ser Tyr Asn Leu Met Arg Leu Ile His Thr Leu Ile Ser Ser Leu 1220 1225 1230 atc ttc ggc gca ctc ttt tgg aaa caa ggc cag aaa ata gat act 3744Ile Phe Gly Ala Leu Phe Trp Lys Gln Gly Gln Lys Ile Asp Thr 1235 1240 1245 caa caa agt gtg ttc act gta gtt gga gcg atc tat ggg gct gtg 3789Gln Gln Ser Val Phe Thr Val Val Gly Ala Ile Tyr Gly Ala Val 1250 1255 1260 ctt ttc tta ggg att aac aat tgt gca tca gct ctt cgg aat tta 3834Leu Phe Leu Gly Ile Asn Asn Cys Ala Ser Ala Leu Arg Asn Leu 1265 1270 1275 gaa aca gaa cgt aat gtt atg tac cgt gaa aga ttt gca gga atg 3879Glu Thr Glu Arg Asn Val Met Tyr Arg Glu Arg Phe Ala Gly Met 1280 1285 1290 tac tca gca aca gct tat gca tta ggt caa gtt gtg act gag ata 3924Tyr Ser Ala Thr Ala Tyr Ala Leu Gly Gln Val Val Thr Glu Ile 1295 1300 1305 cct tac ttg ttc ata caa gca gcc gag ttt gtg atc ata aca tat 3969Pro Tyr Leu Phe Ile Gln Ala Ala Glu Phe Val Ile Ile Thr Tyr 1310 1315 1320 cct atg atc ggt ttc tat cct tcg acc tac aaa gtc ttt tgg gca 4014Pro Met Ile Gly Phe Tyr Pro Ser Thr Tyr Lys Val Phe Trp Ala 1325 1330 1335 ctc tac tct atg ttc act tca ctt ctc act tac aac tat ctc gca 4059Leu Tyr Ser Met Phe Thr Ser Leu Leu Thr Tyr Asn Tyr Leu Ala 1340 1345 1350 atg ttc ctc atc tcc atc aca cca aac ttc atg gtt gcc tcg att 4104Met Phe Leu Ile Ser Ile Thr Pro Asn Phe Met Val Ala Ser Ile 1355 1360 1365 ctt cag tcc atc ttc ttt gtt aac ttt aac ctc ttt tcc ggg ttc 4149Leu Gln Ser Ile Phe Phe Val Asn Phe Asn Leu Phe Ser Gly Phe 1370 1375 1380 ttg att cct gaa acg caa gtt cca agg tgg tgg att tgg tta tat 4194Leu Ile Pro Glu Thr Gln Val Pro Arg Trp Trp Ile Trp Leu Tyr 1385 1390 1395 tat ata aca cca acg tca tgg aca ctc aac ggg ttt ttc tcg gct 4239Tyr Ile Thr Pro Thr Ser Trp Thr Leu Asn Gly Phe Phe Ser Ala 1400 1405 1410 cag tat gaa aat att cat gag gag atc att gtc ttt gga gaa tcc 4284Gln Tyr Glu Asn Ile His Glu Glu Ile Ile Val Phe Gly Glu Ser 1415 1420 1425 acg acg gct tca aaa ttc tta gaa gac 
tat ttt gga ttc cat cat 4329Thr Thr Ala Ser Lys Phe Leu Glu Asp Tyr Phe Gly Phe His His 1430 1435 1440 gac cgt ttg gca gtt aca gca gtt gtt caa atc gct ttt cct att 4374Asp Arg Leu Ala Val Thr Ala Val Val Gln Ile Ala Phe Pro Ile 1445 1450 1455 gca ttg gct ttg atg ttt gca ttc ttt gtt ggc aaa ctc aat ttc 4419Ala Leu Ala Leu Met Phe Ala Phe Phe Val Gly Lys Leu Asn Phe 1460 1465 1470 caa aga aga tga 4431Gln Arg Arg 1475 501476PRTBrassica rapa 50Met Ala Asn Met Ala Gly Ala Asp Glu Ile Glu Ser Leu Arg Val Glu 1 5 10 15 Leu Ala Glu Ile Gly Arg Ser Ile Arg Ser Ser Phe His Arg His Thr 20 25 30 Ser Ser Phe Arg Ser Gly Ser Ser Arg Tyr Glu Pro Asp His Asp Gly 35 40 45 Glu Gly Asn Asn Thr Asn Ala Glu Tyr Ala Leu Gln Trp Ala Glu Ile 50 55 60 Glu Arg Leu Pro Thr Val Lys Arg Met Arg Ser Ser Leu Leu Asp Asp 65 70 75 80 Gly Asp Glu Ser Met Ala Glu Lys Gly Lys Arg Val Val Asp Val Thr 85 90 95 Lys Leu Gly Ala Met Glu Arg His Leu Met Ile Glu Lys Leu Ile Lys 100 105 110 His Ile Glu Asn Asp Asn Leu Lys Leu Leu Lys Lys Ile Arg Arg Arg 115 120 125 Ile Asp Arg Val Gly Met Glu Leu Pro Thr Ile Glu Val Arg Tyr Glu 130 135 140 Gly Leu Lys Val Glu Ala Glu Cys Glu Ile Val Glu Gly Lys Ala Leu 145 150 155 160 Pro Thr Leu Trp Asn Thr Ala Lys Arg Val Leu Ser Glu Leu Val Lys 165 170 175 Leu Thr Gly Ala Lys Thr Arg Glu Ala Lys Ile Ser Ile Leu Asn Asp 180 185 190 Val Asn Gly Ile Ile Lys Pro Gly Arg Leu Thr Leu Leu Leu Gly Pro 195 200 205 Pro Gly Cys Gly Lys Thr Thr Leu Leu Lys Ala Leu Ser Gly Asn Leu 210 215 220 Glu Asn Asn Leu Lys Cys Ser Gly Glu Ile Ser Tyr Asn Gly His Arg 225 230 235 240 Leu Asp Glu Phe Val Pro Gln Lys Thr Ser Ala Tyr Ile Ser Gln Tyr 245 250 255 Asp Leu His Ile Ala Glu Met Thr Val Arg Glu Thr Val Asp Phe Ser 260 265 270 Ala Arg Cys Gln Gly Val Gly Ser Arg Thr Glu Ile Met Met Glu Val 275 280 285 Ser Lys Arg Glu Lys Glu Ala Gly Ile Ile Pro Asp Thr Glu Val Asp 290 295 300 Ala Tyr Met Lys Ala Ile Ser Val Glu Gly Leu Glu Arg Ser Leu Gln 305 310 315 320 Thr Asp Tyr Ile Leu Lys Ile Leu Gly 
Leu Asp Ile Cys Ala Glu Thr 325 330 335 Leu Ile Gly Asp Val Met Arg Arg Gly Ile Ser Gly Gly Gln Lys Lys 340 345 350 Arg Leu Thr Thr Ala Glu Met Ile Val Gly Pro Thr Lys Ala Leu Phe 355 360 365 Met Asp Glu Ile Thr Asn Gly Leu Asp Ser Ser Thr Ala Phe Gln Ile 370 375 380 Val Lys Ser Leu Gln Gln Leu Ala His Ile Ser Asn Ala Thr Val Val 385 390 395 400 Val Ser Leu Leu Gln Pro Ala Pro Glu Ser Phe Asp Leu Phe Asp Asp 405 410 415 Val Met Leu Met Ala Lys Gly Lys Ile Val Tyr His Gly Pro Arg Gly 420 425 430 Glu Val Leu Asn Phe Phe Glu Glu Cys Gly Phe Gln Cys Pro Glu Arg 435 440 445 Lys Gly Val Ala Asp Tyr Leu Gln Glu Val Ile Ser Arg Lys Asp Gln 450 455 460 Ala Gln Tyr Trp Arg His Glu Asp Val Pro Tyr Ser Phe Val Ser Val 465 470 475 480 Asp Met Leu Ser Lys Lys Phe Lys Asp Phe Ser Ile Gly Lys Lys Ile 485 490 495 Glu Asp Ala Leu Ser Lys Pro Tyr Asp Arg Ser Lys Ser His Lys Asp 500 505 510 Ala Leu Ser Phe Ser Val Tyr Ser Leu Pro Asn Trp Glu Met Phe Ile 515 520 525 Ala Cys Ile Ser Arg Glu Tyr Leu Leu Met Lys Arg Asn Tyr Phe Val 530 535 540 Tyr Ile Phe Lys Thr Gly Gln Leu Val Met Ala Ala Phe Ile Thr Met 545 550 555 560 Thr Val Phe Ile Arg Thr Arg Met Gly Ile Asp Ile Leu His Gly Asn 565 570 575 Ser Tyr Met Ser Ala Leu Phe Phe Ala Val Ile Ile Leu Leu Val Asp 580 585 590 Gly Phe Pro Glu Leu Ala Met Thr Ala Gln Arg Leu Ala Val Phe Tyr 595 600 605 Lys Gln Lys Gln Leu Cys Phe Tyr Pro Ala Trp Ala Tyr Ala Ile Pro 610 615 620 Ala Thr Val Leu Lys Val Pro Leu Ser Leu Leu Glu Ser Phe Val Trp 625 630 635 640 Thr Gly Leu Thr Tyr Tyr Val Ile Gly Tyr Thr Pro Glu Ala Ser Arg 645 650 655 Phe Phe Lys Gln Phe Ile Leu Leu Phe Leu Val His Phe Thr Ser Ile 660 665 670 Ser Met Phe Arg Cys Leu Ala Ala Ile Phe Gln Thr Val Val Ala Ser 675 680 685 Val Thr Ala Gly Ser Phe Gly Ile Leu Ile Thr Phe Val Phe Ala Gly 690 695 700 Phe Val Ile Pro Pro Pro Ser Met Pro Ala Trp Leu Lys Trp Gly Phe 705 710 715 720 Trp Ala Asn Pro Leu Ser Tyr Ser Glu Ile Gly Leu Ser Val Asn Glu 725 730 735 Phe Leu Ala Pro Arg Trp Asn Gln Ile Gln 
Pro Ser Thr Asn Leu Thr 740 745 750 Leu Gly Arg Thr Ile Leu Glu Ser Arg Gly Leu Asn Tyr Asp Gly Tyr 755 760 765 Met Tyr Trp Val Ser Leu Cys Ala Leu Val Gly Phe Thr Val Leu Phe 770 775 780 Asn Thr Ile Phe Thr Leu Ala Leu Thr Phe Leu Lys Ser Pro Thr Ser 785 790 795 800 Ser Arg Ala Met Ile Ser Gln Glu Lys Leu Ser Glu Leu Gln Gly Thr 805 810 815 Glu Asp Thr Thr Asp Tyr Ser Ser Ile Lys Lys Lys Thr Thr Asp Ser 820 825 830 Pro Val Lys Thr Glu Gly Lys Met Val Leu Pro Phe Lys Pro Leu Thr 835 840 845 Val Thr Phe Gln Glu Leu Asn Tyr Phe Val Asp Thr Pro Val Glu Met 850 855 860 Arg Glu Gln Gly Tyr Ala Asn Lys Lys Leu Gln Leu Leu Thr Asp Ile 865 870 875 880 Thr Gly Ala Phe Arg Pro Gly Ile Leu Thr Ala Leu Met Gly Val Ser 885 890 895 Gly Ala Gly Lys Thr Thr Leu Leu Asp Val Leu Ala Gly Arg Lys Thr 900 905 910 Ser Gly Tyr Ile Glu Gly Asp Ile Arg Ile Ser Gly Phe Pro Lys Val 915 920 925 Gln Glu Thr Phe Ala Arg Val Ser Gly Tyr Cys Glu Gln Thr Asp Ile 930 935 940 His Ser Pro Asn Ile Thr Val Glu Glu Ser Val Ile Tyr Ser Ala Trp 945 950 955 960 Leu Arg Leu Ala Pro Glu Ile Glu Ser Ala Thr Lys Thr Val Arg Ile 965 970 975 Ser Ser Phe Phe Phe Phe Phe Leu Leu Leu Pro Arg Ala Asn Ser Thr 980 985 990 Pro Ile Ser Thr Gln Ser Leu Gln Glu Phe Val Arg Gln Val Leu Glu 995 1000 1005 Thr Ile Glu Leu Asp Glu Ile Lys Asp Ala Leu Val Gly Val Ala 1010 1015 1020 Gly Glu Ser Gly Leu Ser Thr Glu Gln Arg Lys Arg Leu Thr Ile 1025 1030 1035 Ala Val Glu Leu Val Ala Asn Pro Ser Ile Ile Phe Met Asp Glu 1040 1045 1050 Pro Thr Thr Gly Leu Asp Ala Arg Ala Ala Ala Ile Val Met Arg 1055 1060 1065 Ala Val Lys Asn Val Ala Asp Thr Gly Arg Thr Ile Val Cys Thr 1070 1075 1080 Ile His Gln Pro Ser Ile Asp Ile Phe Glu Ala Phe Asp Glu Leu 1085 1090 1095 Val Leu Leu Lys Arg Gly Gly Arg Met Ile Tyr Thr Gly Pro Leu 1100 1105 1110 Gly Leu Asn Ser Cys His Ile Ile Glu Tyr Phe Glu Asn Val Pro 1115 1120 1125 Gly Val Pro Lys Ile Arg Asp Asn His Asn Pro Ala Thr Trp Met 1130 1135 1140 Leu Asp Val Ser Ser Gln Ser Ala Glu Val Glu Leu Gly Val 
Asp 1145 1150 1155 Phe Ala Lys Ile Tyr His Glu Ser Pro Leu Phe Lys Ser Asn Ser 1160 1165 1170 Glu Leu Val Lys Gln Leu Ser Gln Pro Asp Ser Gly Ser Ser Asp 1175 1180 1185 Leu Gln Phe Lys Arg Thr Tyr Ala Gln Ser Trp Tyr Gly Gln Phe 1190 1195 1200 Lys Ser Ile Leu Trp Lys Met Asn Leu Ser Tyr Trp Arg Asn Pro 1205 1210 1215 Ser Tyr Asn Leu Met Arg Leu Ile His Thr Leu Ile Ser Ser Leu 1220 1225 1230 Ile Phe Gly Ala Leu Phe Trp Lys Gln Gly Gln Lys Ile Asp Thr 1235 1240 1245 Gln Gln Ser Val Phe Thr Val Val Gly Ala Ile Tyr Gly Ala Val 1250 1255 1260 Leu Phe Leu Gly Ile Asn Asn Cys Ala Ser Ala Leu Arg Asn Leu 1265 1270 1275 Glu Thr Glu Arg Asn Val Met Tyr Arg Glu Arg Phe Ala Gly Met 1280 1285 1290 Tyr Ser Ala Thr Ala Tyr Ala Leu Gly Gln Val Val Thr Glu Ile 1295 1300 1305 Pro Tyr Leu Phe Ile Gln Ala Ala Glu Phe Val Ile Ile Thr Tyr 1310 1315 1320 Pro Met Ile Gly Phe Tyr Pro Ser Thr Tyr Lys Val Phe Trp Ala 1325 1330 1335 Leu Tyr Ser Met Phe Thr Ser Leu Leu Thr Tyr Asn Tyr Leu Ala 1340 1345 1350 Met Phe Leu Ile Ser Ile Thr Pro Asn Phe Met Val Ala Ser Ile 1355 1360 1365 Leu Gln Ser Ile Phe Phe Val Asn Phe Asn Leu Phe Ser Gly Phe 1370 1375 1380 Leu Ile Pro Glu Thr Gln Val Pro Arg Trp Trp Ile Trp Leu Tyr 1385 1390 1395 Tyr Ile Thr Pro Thr Ser Trp Thr Leu Asn Gly Phe Phe Ser Ala 1400 1405 1410 Gln Tyr Glu Asn Ile His Glu Glu Ile Ile Val Phe Gly Glu Ser 1415 1420 1425 Thr Thr Ala Ser Lys Phe Leu Glu Asp Tyr Phe Gly Phe His His 1430 1435 1440 Asp Arg Leu Ala Val Thr Ala Val Val Gln Ile Ala Phe Pro Ile 1445 1450 1455 Ala Leu Ala Leu Met Phe Ala Phe Phe Val Gly Lys Leu Asn Phe 1460 1465 1470 Gln Arg Arg 1475 514353DNAArabidopsis lyratamisc_featuresubsp. 
lyrataCDS(1)..(4353) 51atg gct cat atg gtt gga gca gac gag att gag tcg ttg aga gtg gag 48Met Ala His Met Val Gly Ala Asp Glu Ile Glu Ser Leu Arg Val Glu 1 5 10 15 ctt gca gag att gga aga agc atc aga tca tcg ttc cgg aga cac act 96Leu Ala Glu Ile Gly Arg Ser Ile Arg Ser Ser Phe Arg Arg His Thr 20 25 30 tcg agt ttc aga agc agc tct tca aga tat gaa ctt gaa aat gat ggt 144Ser Ser Phe Arg Ser Ser Ser Ser Arg Tyr Glu Leu Glu Asn Asp Gly 35 40 45 gat gtt att gat cat gat gca gag tat gct ctg caa tgg gct gag att 192Asp Val Ile Asp His Asp Ala Glu Tyr Ala Leu Gln Trp Ala Glu Ile 50 55 60 gag aga tta cca act gtc aaa cga atg aga tcg act ctc ctt gat gat 240Glu Arg Leu Pro Thr Val Lys Arg Met Arg Ser Thr Leu Leu Asp Asp 65 70 75 80 ggc gat gag tcc atg tcc gag aaa gga aga agg gtc gtt gat gtc aca 288Gly Asp Glu Ser Met Ser Glu Lys Gly Arg Arg Val Val Asp Val Thr 85 90 95 aag ctt gga gcc atg gaa cgt cat ctg atg att gag aaa ctc atc aaa 336Lys Leu Gly Ala Met Glu Arg His Leu Met Ile Glu Lys Leu Ile Lys 100 105 110 cac att gag aat gat aat ctc aaa ttg ctc aag aaa atc agg aaa aga 384His Ile Glu Asn Asp Asn Leu Lys Leu Leu Lys Lys Ile Arg Lys Arg 115 120 125 ata gac aga gtc ggg atg gag tta ccg acc ata gaa gtg agg tac gag 432Ile Asp Arg Val Gly Met Glu Leu Pro Thr Ile Glu Val Arg Tyr Glu 130 135 140 agt tta aaa gtg gag gcc gag tgc gag att gtt gaa ggg aag gca ctt 480Ser Leu Lys Val Glu Ala Glu Cys Glu Ile Val Glu Gly Lys Ala Leu 145 150 155 160 cca aca ctg tgg aac act gct aag cgc gtt tta tct gaa ctg gtg aag 528Pro Thr Leu Trp Asn Thr Ala Lys Arg Val Leu Ser Glu Leu
Val Lys 165 170 175 ctc act ggt gca aaa aca cac gaa gcg aag ata aac att att aat gat 576Leu Thr Gly Ala Lys Thr His Glu Ala Lys Ile Asn Ile Ile Asn Asp 180 185 190 gtt aat ggc gtt ata aag ccg gga agg tta aca ctg ttg ctt ggt cct 624Val Asn Gly Val Ile Lys Pro Gly Arg Leu Thr Leu Leu Leu Gly Pro 195 200 205 cct gga tgt gga aaa aca act ttg tta aag gcc ttg tct gga aat tta 672Pro Gly Cys Gly Lys Thr Thr Leu Leu Lys Ala Leu Ser Gly Asn Leu 210 215 220 gaa aac aat cta aag tgt tca ggt gaa ata tct tac aat gga cac aga 720Glu Asn Asn Leu Lys Cys Ser Gly Glu Ile Ser Tyr Asn Gly His Arg 225 230 235 240 ctg gac gag ttt gtt cct cag aaa act tcg gcg tac ata agt caa tat 768Leu Asp Glu Phe Val Pro Gln Lys Thr Ser Ala Tyr Ile Ser Gln Tyr 245 250 255 gat ctg cac att gca gag atg aca gtg aga gag aca gtt gat ttc tca 816Asp Leu His Ile Ala Glu Met Thr Val Arg Glu Thr Val Asp Phe Ser 260 265 270 gct cgt tgt cag gga gtt ggt agc cga aca gat ata atg atg gaa gtc 864Ala Arg Cys Gln Gly Val Gly Ser Arg Thr Asp Ile Met Met Glu Val 275 280 285 agt aaa aga gaa aag gaa aaa gga atc att cct gac aca gaa gtg gat 912Ser Lys Arg Glu Lys Glu Lys Gly Ile Ile Pro Asp Thr Glu Val Asp 290 295 300 gct tac atg aaa gca att tct gtt gaa gga ctc caa aga aat ctg caa 960Ala Tyr Met Lys Ala Ile Ser Val Glu Gly Leu Gln Arg Asn Leu Gln 305 310 315 320 aca gat tac atc ttg aag att ctc gga ctt gat att tgt gca gaa aca 1008Thr Asp Tyr Ile Leu Lys Ile Leu Gly Leu Asp Ile Cys Ala Glu Thr 325 330 335 ttg att gga gat gtg atg agg aga ggt ata tca gga ggt caa aag aag 1056Leu Ile Gly Asp Val Met Arg Arg Gly Ile Ser Gly Gly Gln Lys Lys 340 345 350 cgt ctt acc aca gct gag atg att gtt ggc ccg aca aag gct ctg ttt 1104Arg Leu Thr Thr Ala Glu Met Ile Val Gly Pro Thr Lys Ala Leu Phe 355 360 365 atg gat gaa ata aca aat ggc tta gac agt tcc aca gct ttt cag att 1152Met Asp Glu Ile Thr Asn Gly Leu Asp Ser Ser Thr Ala Phe Gln Ile 370 375 380 gtc aaa tct ctt cag cag ttt gct cac ata tca agc gct act gtg ctt 1200Val Lys Ser Leu Gln Gln Phe Ala His Ile 
Ser Ser Ala Thr Val Leu 385 390 395 400 gtt tcg ctt ctt caa ccc gcc cca gag tcc ttt gac ctc ttt gat gac 1248Val Ser Leu Leu Gln Pro Ala Pro Glu Ser Phe Asp Leu Phe Asp Asp 405 410 415 ata atg ctg atg gcc aaa gga aga atc atg tat cat ggt cca cgc ggt 1296Ile Met Leu Met Ala Lys Gly Arg Ile Met Tyr His Gly Pro Arg Gly 420 425 430 gag gtc ctc aac ttc ttt gag gat tgt gga ttc cga tgc cct gaa agg 1344Glu Val Leu Asn Phe Phe Glu Asp Cys Gly Phe Arg Cys Pro Glu Arg 435 440 445 aaa ggt gtc gca gac ttt ctc cag gag gtt ata tcc aaa aaa gac caa 1392Lys Gly Val Ala Asp Phe Leu Gln Glu Val Ile Ser Lys Lys Asp Gln 450 455 460 gca caa tac tgg cgg cac gag gat tta cct tat agt ttt gtc tcg gta 1440Ala Gln Tyr Trp Arg His Glu Asp Leu Pro Tyr Ser Phe Val Ser Val 465 470 475 480 gat atg ttg tca aag aag ttc aag gag ttg agt att gga aaa aag atg 1488Asp Met Leu Ser Lys Lys Phe Lys Glu Leu Ser Ile Gly Lys Lys Met 485 490 495 gaa cac act ctg tca aag cca tat gat aga tcc aaa agc cat aag gat 1536Glu His Thr Leu Ser Lys Pro Tyr Asp Arg Ser Lys Ser His Lys Asp 500 505 510 gct ttg tcc ttc agt gtg tat tct ctt cca aac tgg gag ctg ttc ata 1584Ala Leu Ser Phe Ser Val Tyr Ser Leu Pro Asn Trp Glu Leu Phe Ile 515 520 525 gca tgc ata tca aga gaa tat ctt ctc atg aag aga aac tat ttc gtc 1632Ala Cys Ile Ser Arg Glu Tyr Leu Leu Met Lys Arg Asn Tyr Phe Val 530 535 540 tat att ttc aag aca tct cag ctt gtt atg gcc gca ttt atc act atg 1680Tyr Ile Phe Lys Thr Ser Gln Leu Val Met Ala Ala Phe Ile Thr Met 545 550 555 560 act gtg tat atc cga aca cgg atg ggt att gat atc att cat gga aat 1728Thr Val Tyr Ile Arg Thr Arg Met Gly Ile Asp Ile Ile His Gly Asn 565 570 575 tct tac atg agt gcc ctc ttt ttc gcc ctc att ata ctt ctt gtt gac 1776Ser Tyr Met Ser Ala Leu Phe Phe Ala Leu Ile Ile Leu Leu Val Asp 580 585 590 gga ttc cca gag ttg tct atg acg gct caa cgc cta gcc gtg ttt tac 1824Gly Phe Pro Glu Leu Ser Met Thr Ala Gln Arg Leu Ala Val Phe Tyr 595 600 605 aag cag aag cag ttg tgt ttc tat cct gca tgg gcg tat gca atc cct 1872Lys Gln Lys 
Gln Leu Cys Phe Tyr Pro Ala Trp Ala Tyr Ala Ile Pro 610 615 620 gca aca gtg tta aag gtc cct ctc tcg ttc ttt gaa tct ctc gtt tgg 1920Ala Thr Val Leu Lys Val Pro Leu Ser Phe Phe Glu Ser Leu Val Trp 625 630 635 640 acc ggc ctc aca tac tat gtc att gga tac acc cct gaa gca tcc agg 1968Thr Gly Leu Thr Tyr Tyr Val Ile Gly Tyr Thr Pro Glu Ala Ser Arg 645 650 655 ttt ttc aag cag ttc att cta ctc ttt gct gtc cac ttc acc tcg ata 2016Phe Phe Lys Gln Phe Ile Leu Leu Phe Ala Val His Phe Thr Ser Ile 660 665 670 tcc atg ttc cgg tgt cta gct gca atc ttc cag aca gta gtt gct tca 2064Ser Met Phe Arg Cys Leu Ala Ala Ile Phe Gln Thr Val Val Ala Ser 675 680 685 atc acc gct ggc agt ttt ggt ata tta ttc aca ttt gtc ttt gcc ggt 2112Ile Thr Ala Gly Ser Phe Gly Ile Leu Phe Thr Phe Val Phe Ala Gly 690 695 700 ttc gtc att cca cca cct tct atg cca gca tgg ctt aag tgg ggt ttc 2160Phe Val Ile Pro Pro Pro Ser Met Pro Ala Trp Leu Lys Trp Gly Phe 705 710 715 720 tgg gta aat cct ttg agt tac ggt gag att ggg cta tcg gta aac gag 2208Trp Val Asn Pro Leu Ser Tyr Gly Glu Ile Gly Leu Ser Val Asn Glu 725 730 735 ttt ctt gct cca agg tgg aat cag atg caa ccc aat aat gtt acc tta 2256Phe Leu Ala Pro Arg Trp Asn Gln Met Gln Pro Asn Asn Val Thr Leu 740 745 750 gga cga acc ata ctc caa acc cgt gga atg gac tac gat ggt tac atg 2304Gly Arg Thr Ile Leu Gln Thr Arg Gly Met Asp Tyr Asp Gly Tyr Met 755 760 765 tac tgg gta tca ttg tat gcc ttg ttg ggt ttc act gtg ctc ttc aac 2352Tyr Trp Val Ser Leu Tyr Ala Leu Leu Gly Phe Thr Val Leu Phe Asn 770 775 780 atc att ttc act ctg gct cta acg ttc ttg aaa tca ccc aca tca tct 2400Ile Ile Phe Thr Leu Ala Leu Thr Phe Leu Lys Ser Pro Thr Ser Ser 785 790 795 800 cga gcc atg att tcg caa gac aaa ctc tca gag ctg caa gga aca gaa 2448Arg Ala Met Ile Ser Gln Asp Lys Leu Ser Glu Leu Gln Gly Thr Glu 805 810 815 aat tca aca gac gac tct tct gtc aag aaa aag acc aca gat tcc cct 2496Asn Ser Thr Asp Asp Ser Ser Val Lys Lys Lys Thr Thr Asp Ser Pro 820 825 830 gta aag acg gaa gaa gaa ggc aat atg gtc tta cca ttc 
aag cct ctc 2544Val Lys Thr Glu Glu Glu Gly Asn Met Val Leu Pro Phe Lys Pro Leu 835 840 845 act gta aca ttt caa gac ttg aag tat ttc gtt gac atg ccc gtg gag 2592Thr Val Thr Phe Gln Asp Leu Lys Tyr Phe Val Asp Met Pro Val Glu 850 855 860 atg aga gac caa gga tat gat cag aag aaa cta caa ctt ctc tca gat 2640Met Arg Asp Gln Gly Tyr Asp Gln Lys Lys Leu Gln Leu Leu Ser Asp 865 870 875 880 atc aca gga gct ttc cgt ccc gga att cta acg gca tta atg gga gtg 2688Ile Thr Gly Ala Phe Arg Pro Gly Ile Leu Thr Ala Leu Met Gly Val 885 890 895 agt gga gcc gga aaa aca act ctc ctc gac gtt tta gcc gga aga aaa 2736Ser Gly Ala Gly Lys Thr Thr Leu Leu Asp Val Leu Ala Gly Arg Lys 900 905 910 acc agc gga tac atc gaa gga gac atc aga atc agt ggc ttc cct aaa 2784Thr Ser Gly Tyr Ile Glu Gly Asp Ile Arg Ile Ser Gly Phe Pro Lys 915 920 925 atc caa gaa aca ttc gct aga gtc tca ggg tac tgt gaa caa aca gat 2832Ile Gln Glu Thr Phe Ala Arg Val Ser Gly Tyr Cys Glu Gln Thr Asp 930 935 940 att cac tca cca aac atc acc gtc gaa gaa tcc gta atc tac tcc gct 2880Ile His Ser Pro Asn Ile Thr Val Glu Glu Ser Val Ile Tyr Ser Ala 945 950 955 960 tgg ctt cgt cta gct cct gag atc gat tcc gca acc aaa acc aaa ttt 2928Trp Leu Arg Leu Ala Pro Glu Ile Asp Ser Ala Thr Lys Thr Lys Phe 965 970 975 gtg aag caa gtg ctt gag acg atc gaa tta gat gaa atc aaa gat tca 2976Val Lys Gln Val Leu Glu Thr Ile Glu Leu Asp Glu Ile Lys Asp Ser 980 985 990 ttg gtg gga gtc acc gga gtg agt gga tta tcg acg gag cag agg aag 3024Leu Val Gly Val Thr Gly Val Ser Gly Leu Ser Thr Glu Gln Arg Lys 995 1000 1005 aga ttg acg att gcg gtg gaa ttg gtg gcg aat ccg tcg att ata 3069Arg Leu Thr Ile Ala Val Glu Leu Val Ala Asn Pro Ser Ile Ile 1010 1015 1020 ttc atg gac gag cca acg acg ggg cta gac gca aga gca gcc gcc 3114Phe Met Asp Glu Pro Thr Thr Gly Leu Asp Ala Arg Ala Ala Ala 1025 1030 1035 att gtt atg aga gct gtg aag aac gtt gct gat act gga cga acc 3159Ile Val Met Arg Ala Val Lys Asn Val Ala Asp Thr Gly Arg Thr 1040 1045 1050 atc gtc tgc act att cat cag cct agt atc 
gac att ttt gaa gcc 3204Ile Val Cys Thr Ile His Gln Pro Ser Ile Asp Ile Phe Glu Ala 1055 1060 1065 ttc gac gag ttg gtg ctt ctt aaa aga ggt ggt cgc atg att tac 3249Phe Asp Glu Leu Val Leu Leu Lys Arg Gly Gly Arg Met Ile Tyr 1070 1075 1080 aca gga cca ttg ggt caa cat tca cgt cat att att gag tat ttt 3294Thr Gly Pro Leu Gly Gln His Ser Arg His Ile Ile Glu Tyr Phe 1085 1090 1095 gag agt gtt cct gaa att cct aaa ata aaa gac aac cat aat cca 3339Glu Ser Val Pro Glu Ile Pro Lys Ile Lys Asp Asn His Asn Pro 1100 1105 1110 gca aca tgg atg ctt gat gtt agt tca caa tct gta gaa gtt gaa 3384Ala Thr Trp Met Leu Asp Val Ser Ser Gln Ser Val Glu Val Glu 1115 1120 1125 ctt ggc gtc gat ttt gct aaa atc tac cat gac tct gct ctt tac 3429Leu Gly Val Asp Phe Ala Lys Ile Tyr His Asp Ser Ala Leu Tyr 1130 1135 1140 aag aga aac gca gag ctt gtg aaa cag ttg agc caa cca gat tca 3474Lys Arg Asn Ala Glu Leu Val Lys Gln Leu Ser Gln Pro Asp Ser 1145 1150 1155 gga tca agt gat ata cag ttt aag aga act ttt gca caa agt tgg 3519Gly Ser Ser Asp Ile Gln Phe Lys Arg Thr Phe Ala Gln Ser Trp 1160 1165 1170 tgg gga caa ttc aga tct att cta tgg aaa atg aac ttg tct tat 3564Trp Gly Gln Phe Arg Ser Ile Leu Trp Lys Met Asn Leu Ser Tyr 1175 1180 1185 tgg aga agc cct tct tat aac cta atg cgt atg att cac aca tta 3609Trp Arg Ser Pro Ser Tyr Asn Leu Met Arg Met Ile His Thr Leu 1190 1195 1200 gtc tct tct ttg atc ttc ggc tca ctt ttc tgg aaa caa ggc cag 3654Val Ser Ser Leu Ile Phe Gly Ser Leu Phe Trp Lys Gln Gly Gln 1205 1210 1215 aat ata gat act caa cag ggt atg ttc act gtg ttt gga gcg atc 3699Asn Ile Asp Thr Gln Gln Gly Met Phe Thr Val Phe Gly Ala Ile 1220 1225 1230 tat ggt ttg gtg ctc ttc tta ggg ata aac aat tgt tca tca gct 3744Tyr Gly Leu Val Leu Phe Leu Gly Ile Asn Asn Cys Ser Ser Ala 1235 1240 1245 att caa tat ata gaa aca gag cga aat gtt atg tac cgc gaa aga 3789Ile Gln Tyr Ile Glu Thr Glu Arg Asn Val Met Tyr Arg Glu Arg 1250 1255 1260 ttc gca gga atg tac tca gcg act gct tac gca ttg ggt caa gtg 3834Phe Ala Gly Met Tyr Ser Ala 
Thr Ala Tyr Ala Leu Gly Gln Val 1265 1270 1275 gtg act gag ata cct tat ata ttc ata caa gcc gcc gag ttt gtg 3879 Val Thr Glu Ile Pro Tyr Ile Phe Ile Gln Ala Ala Glu Phe Val 1280 1285 1290 atc ata aca tat cca atg atc ggt ttc tat cct tca acc tac aaa 3924 Ile Ile Thr Tyr Pro Met Ile Gly Phe Tyr Pro Ser Thr Tyr Lys 1295 1300 1305 gtc ttc tgg tca ctc tac tct atg ttt tgc tca ctt ctc act ttt 3969 Val Phe Trp Ser Leu Tyr Ser Met Phe Cys Ser Leu Leu Thr Phe 1310 1315 1320 aac tac ctt gcg atg ttc ctc gtc tcc atc acg cca aac ttc atg 4014 Asn Tyr Leu Ala Met Phe Leu Val Ser Ile Thr Pro Asn Phe Met 1325 1330 1335 gtt gcc gcg att ctt caa tcg ctc ttc tat gtt aat ttc aac ctt 4059 Val Ala Ala Ile Leu Gln Ser Leu Phe Tyr Val Asn Phe Asn Leu 1340 1345 1350 ttt tcc ggg ttt ttg atc ccc caa acg caa gtt cca ggg tgg tgg 4104 Phe Ser Gly Phe Leu Ile Pro Gln Thr Gln Val Pro Gly Trp Trp 1355 1360 1365 att tgg tta tat tat cta aca cca acg tct tgg aca ctg aac gga 4149 Ile Trp Leu Tyr Tyr Leu Thr Pro Thr Ser Trp Thr Leu Asn Gly 1370 1375 1380 ttt ttc tcg tcc caa tac ggt gat att gac gaa aag atc aat gtc 4194 Phe Phe Ser Ser Gln Tyr Gly Asp Ile Asp Glu Lys Ile Asn Val 1385 1390 1395 ttt gga gaa tcc acg acg gtt gca aga ttc ttg aaa gac tat ttt 4239 Phe Gly Glu Ser Thr Thr Val Ala Arg Phe Leu Lys Asp Tyr Phe 1400 1405 1410 gga ttt cat cat gac cgt ttg gcg gtt acg gcg gtt gtt caa atc 4284 Gly Phe His His Asp Arg Leu Ala Val Thr Ala Val Val Gln Ile 1415 1420 1425 gct ttt ccc att gcg tta gct tct atg ttt gca ttc ttc gtg ggc 4329 Ala Phe Pro Ile Ala Leu Ala Ser Met Phe Ala Phe Phe Val Gly 1430 1435 1440 aaa ctc aac ttc caa cga aga tga 4353 Lys Leu Asn Phe Gln Arg Arg 1445 1450
52 1450 PRT Arabidopsis lyrata
52 Met Ala His Met Val Gly Ala Asp Glu Ile Glu Ser Leu Arg Val Glu 1 5 10 15 Leu Ala Glu Ile Gly Arg Ser Ile Arg Ser Ser Phe Arg Arg His Thr 20 25 30 Ser Ser
Phe Arg Ser Ser Ser Ser Arg Tyr Glu Leu Glu Asn Asp Gly 35 40 45 Asp Val Ile Asp His Asp Ala Glu Tyr Ala Leu Gln Trp Ala Glu Ile 50 55 60 Glu Arg Leu Pro Thr Val Lys Arg Met Arg Ser Thr Leu Leu Asp Asp 65 70 75 80 Gly Asp Glu Ser Met Ser Glu Lys Gly Arg Arg Val Val Asp Val Thr 85 90 95 Lys Leu Gly Ala Met Glu Arg His Leu Met Ile Glu Lys Leu Ile Lys 100 105 110 His Ile Glu Asn Asp Asn Leu Lys Leu Leu Lys Lys Ile Arg Lys Arg 115 120 125 Ile Asp Arg Val Gly Met Glu Leu Pro Thr Ile Glu Val Arg Tyr Glu 130 135 140 Ser Leu Lys Val Glu Ala Glu Cys Glu Ile Val Glu Gly Lys Ala Leu 145 150 155 160 Pro Thr Leu Trp Asn Thr Ala Lys Arg Val Leu Ser Glu Leu Val Lys 165 170 175 Leu Thr Gly Ala Lys Thr His Glu Ala Lys Ile Asn Ile Ile Asn Asp 180 185 190 Val Asn Gly Val Ile Lys Pro Gly Arg Leu Thr Leu Leu Leu Gly Pro 195 200 205 Pro Gly Cys Gly Lys Thr Thr Leu Leu Lys Ala Leu Ser Gly Asn Leu 210 215 220 Glu Asn Asn Leu Lys Cys Ser Gly Glu Ile Ser Tyr Asn Gly His Arg 225 230 235 240 Leu Asp Glu Phe Val Pro Gln Lys Thr Ser Ala Tyr Ile Ser Gln Tyr 245 250 255 Asp Leu His Ile Ala Glu Met Thr Val Arg Glu Thr Val Asp Phe Ser 260 265 270 Ala Arg Cys Gln Gly Val Gly Ser Arg Thr Asp Ile Met Met Glu Val 275 280 285 Ser Lys Arg Glu Lys Glu Lys Gly Ile Ile Pro Asp Thr Glu Val Asp 290 295 300 Ala Tyr Met Lys Ala Ile Ser Val Glu Gly Leu Gln Arg Asn Leu Gln 305 310 315 320 Thr Asp Tyr Ile Leu Lys Ile Leu Gly Leu Asp Ile Cys Ala Glu Thr 325 330 335 Leu Ile Gly Asp Val Met Arg Arg Gly Ile Ser Gly Gly Gln Lys Lys 340 345 350 Arg Leu Thr Thr Ala Glu Met Ile Val Gly Pro Thr Lys Ala Leu Phe 355 360 365 Met Asp Glu Ile Thr Asn Gly Leu Asp Ser Ser Thr Ala Phe Gln Ile 370 375 380 Val Lys Ser Leu Gln Gln Phe Ala His Ile Ser Ser Ala Thr Val Leu 385 390 395 400 Val Ser Leu Leu Gln Pro Ala Pro Glu Ser Phe Asp Leu Phe Asp Asp 405 410 415 Ile Met Leu Met Ala Lys Gly Arg Ile Met Tyr His Gly Pro Arg Gly 420 425 430 Glu Val Leu Asn Phe Phe Glu Asp Cys Gly Phe Arg Cys Pro Glu Arg 435 440 445 Lys Gly Val Ala Asp Phe 
Leu Gln Glu Val Ile Ser Lys Lys Asp Gln 450 455 460 Ala Gln Tyr Trp Arg His Glu Asp Leu Pro Tyr Ser Phe Val Ser Val 465 470 475 480 Asp Met Leu Ser Lys Lys Phe Lys Glu Leu Ser Ile Gly Lys Lys Met 485 490 495 Glu His Thr Leu Ser Lys Pro Tyr Asp Arg Ser Lys Ser His Lys Asp 500 505 510 Ala Leu Ser Phe Ser Val Tyr Ser Leu Pro Asn Trp Glu Leu Phe Ile 515 520 525 Ala Cys Ile Ser Arg Glu Tyr Leu Leu Met Lys Arg Asn Tyr Phe Val 530 535 540 Tyr Ile Phe Lys Thr Ser Gln Leu Val Met Ala Ala Phe Ile Thr Met 545 550 555 560 Thr Val Tyr Ile Arg Thr Arg Met Gly Ile Asp Ile Ile His Gly Asn 565 570 575 Ser Tyr Met Ser Ala Leu Phe Phe Ala Leu Ile Ile Leu Leu Val Asp 580 585 590 Gly Phe Pro Glu Leu Ser Met Thr Ala Gln Arg Leu Ala Val Phe Tyr 595 600 605 Lys Gln Lys Gln Leu Cys Phe Tyr Pro Ala Trp Ala Tyr Ala Ile Pro 610 615 620 Ala Thr Val Leu Lys Val Pro Leu Ser Phe Phe Glu Ser Leu Val Trp 625 630 635 640 Thr Gly Leu Thr Tyr Tyr Val Ile Gly Tyr Thr Pro Glu Ala Ser Arg 645 650 655 Phe Phe Lys Gln Phe Ile Leu Leu Phe Ala Val His Phe Thr Ser Ile 660 665 670 Ser Met Phe Arg Cys Leu Ala Ala Ile Phe Gln Thr Val Val Ala Ser 675 680 685 Ile Thr Ala Gly Ser Phe Gly Ile Leu Phe Thr Phe Val Phe Ala Gly 690 695 700 Phe Val Ile Pro Pro Pro Ser Met Pro Ala Trp Leu Lys Trp Gly Phe 705 710 715 720 Trp Val Asn Pro Leu Ser Tyr Gly Glu Ile Gly Leu Ser Val Asn Glu 725 730 735 Phe Leu Ala Pro Arg Trp Asn Gln Met Gln Pro Asn Asn Val Thr Leu 740 745 750 Gly Arg Thr Ile Leu Gln Thr Arg Gly Met Asp Tyr Asp Gly Tyr Met 755 760 765 Tyr Trp Val Ser Leu Tyr Ala Leu Leu Gly Phe Thr Val Leu Phe Asn 770 775 780 Ile Ile Phe Thr Leu Ala Leu Thr Phe Leu Lys Ser Pro Thr Ser Ser 785 790 795 800 Arg Ala Met Ile Ser Gln Asp Lys Leu Ser Glu Leu Gln Gly Thr Glu 805 810 815 Asn Ser Thr Asp Asp Ser Ser Val Lys Lys Lys Thr Thr Asp Ser Pro 820 825 830 Val Lys Thr Glu Glu Glu Gly Asn Met Val Leu Pro Phe Lys Pro Leu 835 840 845 Thr Val Thr Phe Gln Asp Leu Lys Tyr Phe Val Asp Met Pro Val Glu 850 855 860 Met Arg Asp Gln Gly Tyr Asp 
Gln Lys Lys Leu Gln Leu Leu Ser Asp 865 870 875 880 Ile Thr Gly Ala Phe Arg Pro Gly Ile Leu Thr Ala Leu Met Gly Val 885 890 895 Ser Gly Ala Gly Lys Thr Thr Leu Leu Asp Val Leu Ala Gly Arg Lys 900 905 910 Thr Ser Gly Tyr Ile Glu Gly Asp Ile Arg Ile Ser Gly Phe Pro Lys 915 920 925 Ile Gln Glu Thr Phe Ala Arg Val Ser Gly Tyr Cys Glu Gln Thr Asp 930 935 940 Ile His Ser Pro Asn Ile Thr Val Glu Glu Ser Val Ile Tyr Ser Ala 945 950 955 960 Trp Leu Arg Leu Ala Pro Glu Ile Asp Ser Ala Thr Lys Thr Lys Phe 965 970 975 Val Lys Gln Val Leu Glu Thr Ile Glu Leu Asp Glu Ile Lys Asp Ser 980 985 990 Leu Val Gly Val Thr Gly Val Ser Gly Leu Ser Thr Glu Gln Arg Lys 995 1000 1005 Arg Leu Thr Ile Ala Val Glu Leu Val Ala Asn Pro Ser Ile Ile 1010 1015 1020 Phe Met Asp Glu Pro Thr Thr Gly Leu Asp Ala Arg Ala Ala Ala 1025 1030 1035 Ile Val Met Arg Ala Val Lys Asn Val Ala Asp Thr Gly Arg Thr 1040 1045 1050 Ile Val Cys Thr Ile His Gln Pro Ser Ile Asp Ile Phe Glu Ala 1055 1060 1065 Phe Asp Glu Leu Val Leu Leu Lys Arg Gly Gly Arg Met Ile Tyr 1070 1075 1080 Thr Gly Pro Leu Gly Gln His Ser Arg His Ile Ile Glu Tyr Phe 1085 1090 1095 Glu Ser Val Pro Glu Ile Pro Lys Ile Lys Asp Asn His Asn Pro 1100 1105 1110 Ala Thr Trp Met Leu Asp Val Ser Ser Gln Ser Val Glu Val Glu 1115 1120 1125 Leu Gly Val Asp Phe Ala Lys Ile Tyr His Asp Ser Ala Leu Tyr 1130 1135 1140 Lys Arg Asn Ala Glu Leu Val Lys Gln Leu Ser Gln Pro Asp Ser 1145 1150 1155 Gly Ser Ser Asp Ile Gln Phe Lys Arg Thr Phe Ala Gln Ser Trp 1160 1165 1170 Trp Gly Gln Phe Arg Ser Ile Leu Trp Lys Met Asn Leu Ser Tyr 1175 1180 1185 Trp Arg Ser Pro Ser Tyr Asn Leu Met Arg Met Ile His Thr Leu 1190 1195 1200 Val Ser Ser Leu Ile Phe Gly Ser Leu Phe Trp Lys Gln Gly Gln 1205 1210 1215 Asn Ile Asp Thr Gln Gln Gly Met Phe Thr Val Phe Gly Ala Ile 1220 1225 1230 Tyr Gly Leu Val Leu Phe Leu Gly Ile Asn Asn Cys Ser Ser Ala 1235 1240 1245 Ile Gln Tyr Ile Glu Thr Glu Arg Asn Val Met Tyr Arg Glu Arg 1250 1255 1260 Phe Ala Gly Met Tyr Ser Ala Thr Ala Tyr Ala Leu Gly Gln 
Val 1265 1270 1275 Val Thr Glu Ile Pro Tyr Ile Phe Ile Gln Ala Ala Glu Phe Val 1280 1285 1290 Ile Ile Thr Tyr Pro Met Ile Gly Phe Tyr Pro Ser Thr Tyr Lys 1295 1300 1305 Val Phe Trp Ser Leu Tyr Ser Met Phe Cys Ser Leu Leu Thr Phe 1310 1315 1320 Asn Tyr Leu Ala Met Phe Leu Val Ser Ile Thr Pro Asn Phe Met 1325 1330 1335 Val Ala Ala Ile Leu Gln Ser Leu Phe Tyr Val Asn Phe Asn Leu 1340 1345 1350 Phe Ser Gly Phe Leu Ile Pro Gln Thr Gln Val Pro Gly Trp Trp 1355 1360 1365 Ile Trp Leu Tyr Tyr Leu Thr Pro Thr Ser Trp Thr Leu Asn Gly 1370 1375 1380 Phe Phe Ser Ser Gln Tyr Gly Asp Ile Asp Glu Lys Ile Asn Val 1385 1390 1395 Phe Gly Glu Ser Thr Thr Val Ala Arg Phe Leu Lys Asp Tyr Phe 1400 1405 1410 Gly Phe His His Asp Arg Leu Ala Val Thr Ala Val Val Gln Ile 1415 1420 1425 Ala Phe Pro Ile Ala Leu Ala Ser Met Phe Ala Phe Phe Val Gly 1430 1435 1440 Lys Leu Asn Phe Gln Arg Arg 1445 1450
53 4347 DNA Capsella rubella CDS (1)..(4347)
53 atg gct cac atg gtt gga cca gac gag att gag tcc ttg aga gtg gag 48 Met Ala His Met Val Gly Pro Asp Glu Ile Glu Ser Leu Arg Val Glu 1 5 10 15 ctt gca gag att gga aga agc atc aga tca tct ttc cgg aga cac act 96 Leu Ala Glu Ile Gly Arg Ser Ile Arg Ser Ser Phe Arg Arg His Thr 20 25 30 tct agt ttc aga agc agc tct tca atc tat gaa gct gat aat gac ggt 144 Ser Ser Phe Arg Ser Ser Ser Ser Ile Tyr Glu Ala Asp Asn Asp Gly 35 40 45 gat gtt aat gat gat cat cat gat gca gag tat gct ctg caa tgg gct 192 Asp Val Asn Asp Asp His His Asp Ala Glu Tyr Ala Leu Gln Trp Ala 50 55 60 aag att gag aga tta cca act gcc aaa cgc atg aga tcg act ctc ctc 240 Lys Ile Glu Arg Leu Pro Thr Ala Lys Arg Met Arg Ser Thr Leu Leu 65 70 75 80 gat gaa tcc atc acc gag aat gga aaa aga gtc gtt gat gtc tca aag 288 Asp Glu Ser Ile Thr Glu Asn Gly Lys Arg Val Val Asp Val Ser Lys 85 90 95 ctt gga gcc acc gaa cgt cat ctg atg att gag gga ctt atc aaa cac 336 Leu Gly Ala Thr Glu Arg His Leu Met Ile Glu Gly Leu Ile Lys His 100 105 110 att gag aat gat aat ctc aag ttg ctc aag aaa atc aga aga aga ata 384 Ile Glu Asn Asp Asn Leu
Lys Leu Leu Lys Lys Ile Arg Arg Arg Ile 115 120 125 gac agg gtg ggg atg gag tta ccg acc ata gaa gtg agg tac acg agt 432Asp Arg Val Gly Met Glu Leu Pro Thr Ile Glu Val Arg Tyr Thr Ser 130 135 140 tta aaa gta gag gcc gag tgc gag att gtt gaa ggg aag gca ctt cca 480Leu Lys Val Glu Ala Glu Cys Glu Ile Val Glu Gly Lys Ala Leu Pro 145 150 155 160 aca ctg tgg aac act gcc aag cgc att ttc tct gaa ctg gtg aag ctc 528Thr Leu Trp Asn Thr Ala Lys Arg Ile Phe Ser Glu Leu Val Lys Leu 165 170 175 act ggt gca aaa gca cac gaa gcc aat ata agc att ctt aat gat gtt 576Thr Gly Ala Lys Ala His Glu Ala Asn Ile Ser Ile Leu Asn Asp Val 180 185 190 aat ggc att ata aag ccc gga agg tta aca ctg ttg ctt ggt cct cct 624Asn Gly Ile Ile Lys Pro Gly Arg Leu Thr Leu Leu Leu Gly Pro Pro 195 200 205 gga tgc ggt aaa aca act atg tta aag gcc ttg tct gga aat tta gaa 672Gly Cys Gly Lys Thr Thr Met Leu Lys Ala Leu Ser Gly Asn Leu Glu 210 215 220 aac aat cta aag tgt tca ggt gaa atc tct tac aat gga cac aga cta 720Asn Asn Leu Lys Cys Ser Gly Glu Ile Ser Tyr Asn Gly His Arg Leu 225 230 235 240 gac gag ttc gtt cct cag aaa acc tcg gca tat ata agt caa tat gac 768Asp Glu Phe Val Pro Gln Lys Thr Ser Ala Tyr Ile Ser Gln Tyr Asp 245 250 255 ctg cat att gcg gag atg acg gtg agg gag act gtt gac ttc tca gct 816Leu His Ile Ala Glu Met Thr Val Arg Glu Thr Val Asp Phe Ser Ala 260 265 270 cgt tgt cag ggc gtt ggt agc cga aca gat att atg atg gaa gtc agt 864Arg Cys Gln Gly Val Gly Ser Arg Thr Asp Ile Met Met Glu Val Ser 275 280 285 aaa cga gaa aag gaa aaa gga atc att cct gac aca gaa gtg gat gct 912Lys Arg Glu Lys Glu Lys Gly Ile Ile Pro Asp Thr Glu Val Asp Ala 290 295 300 tac atg aaa gca att tct gtt gaa gga ctc aaa aga agt ctg caa aca 960Tyr Met Lys Ala Ile Ser Val Glu Gly Leu Lys Arg Ser Leu Gln Thr 305 310 315 320 gat tac atc ttg aag att ctc gga cta gac att tgt gca gaa aca ctg 1008Asp Tyr Ile Leu Lys Ile Leu Gly Leu Asp Ile Cys Ala Glu Thr Leu 325 330 335 att gga gat gtg atg agg aga ggt ata tca gga ggc caa aag aag cgt 1056Ile Gly 
Asp Val Met Arg Arg Gly Ile Ser Gly Gly Gln Lys Lys Arg 340 345 350 ctt acg aca gcc gag atg att gtt ggc ccg aca aag gct ctg ttt atg 1104Leu Thr Thr Ala Glu Met Ile Val Gly Pro Thr Lys Ala Leu Phe Met 355 360 365 gat gaa ata aca aat ggc tta gac agt tcc aca gct ttt cag att gtc 1152Asp Glu Ile Thr Asn Gly Leu Asp Ser Ser Thr Ala Phe Gln Ile Val 370 375 380 aaa tct ctt cag caa ttt gct cac ata tca agt gct act gtg ctt gtt 1200Lys Ser Leu Gln Gln Phe Ala His Ile Ser Ser Ala Thr Val Leu Val 385 390 395 400 tcg ctt ctt caa ccg gcc cca gaa tct ttc gat ctc ttt gat gac gtt 1248Ser Leu Leu Gln Pro Ala Pro Glu Ser Phe Asp Leu Phe Asp Asp Val 405 410 415 atg ctg atg gcc aaa gga aga att gtg tat cat ggt cca cgc ggc gaa 1296Met Leu Met Ala Lys Gly Arg Ile Val Tyr His Gly Pro Arg Gly Glu 420 425 430 gtc ctg aaa ttc ttt gag gat tgt gga ttc caa tgc cct gaa agg aaa 1344Val Leu Lys Phe Phe Glu Asp Cys Gly Phe Gln Cys Pro Glu Arg Lys 435 440 445 ggt gtt gca gac ttt ctc cag gag gtt ata tcc aaa aaa gac caa gca 1392Gly Val Ala Asp Phe Leu Gln Glu Val Ile Ser Lys Lys Asp Gln Ala 450 455 460 caa tac tgg cgg cac gag gat tta cct tat agt ttt gtc tcg gtg gaa 1440Gln Tyr Trp Arg His Glu Asp Leu Pro Tyr Ser Phe Val Ser Val Glu 465 470 475 480 atg ttg tca aag aag ttc aag gac ttg agt att gga aaa aag att gag 1488Met Leu Ser Lys Lys Phe Lys Asp Leu Ser Ile Gly Lys Lys Ile Glu 485 490 495 gaa aca ctt tct aag ccg tat gat aga tcc aaa agc cat aag gat gcc
1536Glu Thr Leu Ser Lys Pro Tyr Asp Arg Ser Lys Ser His Lys Asp Ala 500 505 510 tta tcc ttc agt gtg tat tca ctt cca aac tgg gag ttg ttc atc gca 1584Leu Ser Phe Ser Val Tyr Ser Leu Pro Asn Trp Glu Leu Phe Ile Ala 515 520 525 tgc ata tca aga gag tat ctt ctc atg aag aga aac tat ttc gtc tat 1632Cys Ile Ser Arg Glu Tyr Leu Leu Met Lys Arg Asn Tyr Phe Val Tyr 530 535 540 att ttc aag aca tct cag ctt gtt atg gct gca ttc atc act atg act 1680Ile Phe Lys Thr Ser Gln Leu Val Met Ala Ala Phe Ile Thr Met Thr 545 550 555 560 gtg tat atc cga aca cgg atg ggt att gat atc att cat ggg aat tct 1728Val Tyr Ile Arg Thr Arg Met Gly Ile Asp Ile Ile His Gly Asn Ser 565 570 575 tac atg agt gcc ctc ttt ttt gcc ctc gtt ata ctt ctt gtt gac gga 1776Tyr Met Ser Ala Leu Phe Phe Ala Leu Val Ile Leu Leu Val Asp Gly 580 585 590 ttc cct gag ttg tct atg acg gct caa cgc cta gcc gtg ttt tac aag 1824Phe Pro Glu Leu Ser Met Thr Ala Gln Arg Leu Ala Val Phe Tyr Lys 595 600 605 cag aag cag ttg tgt ttc tat cct gca tgg gcg tat gca atc cct gca 1872Gln Lys Gln Leu Cys Phe Tyr Pro Ala Trp Ala Tyr Ala Ile Pro Ala 610 615 620 aca gtg cta aag gtc cct ctc tcg ttc ttc gaa tct tta gtt tgg acc 1920Thr Val Leu Lys Val Pro Leu Ser Phe Phe Glu Ser Leu Val Trp Thr 625 630 635 640 ggc ctc aca tac tat gtc att gga tac acc cct gaa gcc tcc agg ttc 1968Gly Leu Thr Tyr Tyr Val Ile Gly Tyr Thr Pro Glu Ala Ser Arg Phe 645 650 655 ttc aag cag ttc att cta ctg ttt gct gtt cac ttc acc tcg ata tcc 2016Phe Lys Gln Phe Ile Leu Leu Phe Ala Val His Phe Thr Ser Ile Ser 660 665 670 atg ttt cgg tgt cta gct gca atc ttc cag aca gta gtt gct tca atc 2064Met Phe Arg Cys Leu Ala Ala Ile Phe Gln Thr Val Val Ala Ser Ile 675 680 685 aca gct ggc agt ttt ggt ata tta ttc aca ttt gtc ttt gct ggt ttt 2112Thr Ala Gly Ser Phe Gly Ile Leu Phe Thr Phe Val Phe Ala Gly Phe 690 695 700 gtc att cca cca aca tca atg cca gca tgg ctc aag tgg ggt ttc tgg 2160Val Ile Pro Pro Thr Ser Met Pro Ala Trp Leu Lys Trp Gly Phe Trp 705 710 715 720 gca aat cct ttg agt tac ggt gag att 
ggg cta tcg gta aac gag ttc 2208Ala Asn Pro Leu Ser Tyr Gly Glu Ile Gly Leu Ser Val Asn Glu Phe 725 730 735 ctt gcc ccc agg tgg aat cag atg caa ccc aat aat gtt acc tta ggg 2256Leu Ala Pro Arg Trp Asn Gln Met Gln Pro Asn Asn Val Thr Leu Gly 740 745 750 cga acc ata ctc caa acc cgt gga atg gac tac gat ggt tac atg tac 2304Arg Thr Ile Leu Gln Thr Arg Gly Met Asp Tyr Asp Gly Tyr Met Tyr 755 760 765 tgg gta tca tta tgt gcc ttg ttg gga ttc act gtg ctc ttt aac atc 2352Trp Val Ser Leu Cys Ala Leu Leu Gly Phe Thr Val Leu Phe Asn Ile 770 775 780 att ttc acc ctg gca ctg act ttc ttg aaa tca ccc aca tca tct aaa 2400Ile Phe Thr Leu Ala Leu Thr Phe Leu Lys Ser Pro Thr Ser Ser Lys 785 790 795 800 gct atg att tcg caa gaa aaa ctc ttt gag ctg caa gga aaa gaa gct 2448Ala Met Ile Ser Gln Glu Lys Leu Phe Glu Leu Gln Gly Lys Glu Ala 805 810 815 tca aca ggc gac act tca gtc aag aac aag act aca ggt tcc cct gta 2496Ser Thr Gly Asp Thr Ser Val Lys Asn Lys Thr Thr Gly Ser Pro Val 820 825 830 aac aca gaa gaa ggc aag atg gtc tta cct ttc aag ccc ctc aca gta 2544Asn Thr Glu Glu Gly Lys Met Val Leu Pro Phe Lys Pro Leu Thr Val 835 840 845 aca ttt caa gat ttg aac tat ttc gtt gac atg ccc gtg gag atg aga 2592Thr Phe Gln Asp Leu Asn Tyr Phe Val Asp Met Pro Val Glu Met Arg 850 855 860 gac caa gga tat gac cag aag aaa cta caa ctt cta tca aat atc acc 2640Asp Gln Gly Tyr Asp Gln Lys Lys Leu Gln Leu Leu Ser Asn Ile Thr 865 870 875 880 gga gct ttc cgc cct gga atc cta acg gct ttg atg gga gtg agt gga 2688Gly Ala Phe Arg Pro Gly Ile Leu Thr Ala Leu Met Gly Val Ser Gly 885 890 895 gcc gga aaa acc aca ctc ctc gat gtt cta gcc gga aga aaa aca agt 2736Ala Gly Lys Thr Thr Leu Leu Asp Val Leu Ala Gly Arg Lys Thr Ser 900 905 910 gga tac atc gaa gga gac atc aga atc agt ggt ttc cct aaa gtt cag 2784Gly Tyr Ile Glu Gly Asp Ile Arg Ile Ser Gly Phe Pro Lys Val Gln 915 920 925 gaa acg ttc gct aga gtc tca ggc tac tgc gaa caa aca gat atc cac 2832Glu Thr Phe Ala Arg Val Ser Gly Tyr Cys Glu Gln Thr Asp Ile His 930 935 940 tca cca aac 
atc acc gtc ggt gaa tct gtg att tac tca gct tgg ctt 2880Ser Pro Asn Ile Thr Val Gly Glu Ser Val Ile Tyr Ser Ala Trp Leu 945 950 955 960 cgt ctt gct cct gag atc gat tcc gca acc aaa acc caa ttc gtg aaa 2928Arg Leu Ala Pro Glu Ile Asp Ser Ala Thr Lys Thr Gln Phe Val Lys 965 970 975 caa gtg ctc gag acg atc gaa tta gat gaa atc aaa gac gca ttg gtg 2976Gln Val Leu Glu Thr Ile Glu Leu Asp Glu Ile Lys Asp Ala Leu Val 980 985 990 gga gtc gcc gga gtg agc ggg ttg tcg acg gag cag agg aag aga ctg 3024Gly Val Ala Gly Val Ser Gly Leu Ser Thr Glu Gln Arg Lys Arg Leu 995 1000 1005 acg att gcg gtg gag ttg gtg gcg aat ccg tcg atc atc ttc atg 3069Thr Ile Ala Val Glu Leu Val Ala Asn Pro Ser Ile Ile Phe Met 1010 1015 1020 gac gag ccc acg acg ggg cta gac gca aga gca gcc gcc att gtt 3114Asp Glu Pro Thr Thr Gly Leu Asp Ala Arg Ala Ala Ala Ile Val 1025 1030 1035 atg aga gct gtg aag aac gtc gct gat act gga cga acc atc gtc 3159Met Arg Ala Val Lys Asn Val Ala Asp Thr Gly Arg Thr Ile Val 1040 1045 1050 tgt act att cat cag cct agt atc gac att ttc gaa gct ttc gac 3204Cys Thr Ile His Gln Pro Ser Ile Asp Ile Phe Glu Ala Phe Asp 1055 1060 1065 gag ttg gtg ctt ctt aaa aga ggt ggt cgc atg atc tac aca gga 3249Glu Leu Val Leu Leu Lys Arg Gly Gly Arg Met Ile Tyr Thr Gly 1070 1075 1080 cca tta ggc cta cat tca tgt cac att atc gag tat ttt gag agt 3294Pro Leu Gly Leu His Ser Cys His Ile Ile Glu Tyr Phe Glu Ser 1085 1090 1095 gtt cct gaa att cct aaa ata aga gac aac cac aat cca gca aca 3339Val Pro Glu Ile Pro Lys Ile Arg Asp Asn His Asn Pro Ala Thr 1100 1105 1110 tgg atg ctt gat gtt agt tca caa tct gta gaa gtt gaa ctt ggc 3384Trp Met Leu Asp Val Ser Ser Gln Ser Val Glu Val Glu Leu Gly 1115 1120 1125 gtc gat ttc gca aat atc tac cat gag tct gct ctt tac aag aga 3429Val Asp Phe Ala Asn Ile Tyr His Glu Ser Ala Leu Tyr Lys Arg 1130 1135 1140 aac tca gag ctt gtt aaa cag tta agc caa cca gat gca gaa tca 3474Asn Ser Glu Leu Val Lys Gln Leu Ser Gln Pro Asp Ala Glu Ser 1145 1150 1155 agt gat ata cag ttt aag aga act ttt gca 
caa agt tgg tgg ggg 3519Ser Asp Ile Gln Phe Lys Arg Thr Phe Ala Gln Ser Trp Trp Gly 1160 1165 1170 caa ttc aaa tct att cta tgg aaa atg agt ttg tca tat tgg aga 3564Gln Phe Lys Ser Ile Leu Trp Lys Met Ser Leu Ser Tyr Trp Arg 1175 1180 1185 agc cct tct tat aac ctt atg cgt atg att cac act ttg atc tct 3609Ser Pro Ser Tyr Asn Leu Met Arg Met Ile His Thr Leu Ile Ser 1190 1195 1200 tct ttg atc ttt ggc gca ctc ttc tgg aaa caa ggc caa aaa ata 3654Ser Leu Ile Phe Gly Ala Leu Phe Trp Lys Gln Gly Gln Lys Ile 1205 1210 1215 gat act caa cag agt ttg ttc acc gta ttt gga gcc atc tac ggt 3699Asp Thr Gln Gln Ser Leu Phe Thr Val Phe Gly Ala Ile Tyr Gly 1220 1225 1230 ttg gta ctc ttc tta ggg ata aac aac tgt tca tca gct ctt cag 3744Leu Val Leu Phe Leu Gly Ile Asn Asn Cys Ser Ser Ala Leu Gln 1235 1240 1245 tat ttt gaa acg gag aga aat gta atg tat cga gaa aga ttc gca 3789Tyr Phe Glu Thr Glu Arg Asn Val Met Tyr Arg Glu Arg Phe Ala 1250 1255 1260 ggg atg tac tca gcg aca gct tac gcg ttg agt caa gtg gtg aca 3834Gly Met Tyr Ser Ala Thr Ala Tyr Ala Leu Ser Gln Val Val Thr 1265 1270 1275 gag ata cct tat ata ttc ata caa gct gcg gag ttt gtg atc ata 3879Glu Ile Pro Tyr Ile Phe Ile Gln Ala Ala Glu Phe Val Ile Ile 1280 1285 1290 aca tat cca atg atc ggt ttc tat cct tcg acc tac aaa gtc ttt 3924Thr Tyr Pro Met Ile Gly Phe Tyr Pro Ser Thr Tyr Lys Val Phe 1295 1300 1305 tgg tca ctc tac tct atg ttt tgc tca ctt ctc act ttc aac tac 3969Trp Ser Leu Tyr Ser Met Phe Cys Ser Leu Leu Thr Phe Asn Tyr 1310 1315 1320 ctt gcc atg ttc ctc gta tcc atc acg cca aac ttc atg gtt gcc 4014Leu Ala Met Phe Leu Val Ser Ile Thr Pro Asn Phe Met Val Ala 1325 1330 1335 gcg att ctt cag tcg ctt ttc tat gtt aat ttc aac ctc ttc tcc 4059Ala Ile Leu Gln Ser Leu Phe Tyr Val Asn Phe Asn Leu Phe Ser 1340 1345 1350 ggg ttt ttg atc ccc caa acg caa gtt cca ggg tgg tgg att tgg 4104Gly Phe Leu Ile Pro Gln Thr Gln Val Pro Gly Trp Trp Ile Trp 1355 1360 1365 tta tat tat cta aca cca acg tca tgg aca ctc aac ggg ttc atc 4149Leu Tyr Tyr Leu Thr Pro Thr 
Ser Trp Thr Leu Asn Gly Phe Ile 1370 1375 1380 tcg tct cag tac gga gat att cat gac gag atc aat gtc ttt gga 4194 Ser Ser Gln Tyr Gly Asp Ile His Asp Glu Ile Asn Val Phe Gly 1385 1390 1395 gaa aca acg act gtt gca gca ttc ttg aaa gac tat ttt gga ttt 4239 Glu Thr Thr Thr Val Ala Ala Phe Leu Lys Asp Tyr Phe Gly Phe 1400 1405 1410 cac cat gaa cgt ttg gcg att acg gcg gtt gtt caa atc gct ttt 4284 His His Glu Arg Leu Ala Ile Thr Ala Val Val Gln Ile Ala Phe 1415 1420 1425 cca att gcg ttt gcg tct atg ttt gcc ttc ttc gtg ggc aaa ctc 4329 Pro Ile Ala Phe Ala Ser Met Phe Ala Phe Phe Val Gly Lys Leu 1430 1435 1440 aac ttc caa cga cga tga 4347 Asn Phe Gln Arg Arg 1445
54 1448 PRT Capsella rubella
54 Met Ala His Met Val Gly Pro Asp Glu Ile Glu Ser Leu Arg Val Glu 1 5 10 15 Leu Ala Glu Ile Gly Arg Ser Ile Arg Ser Ser Phe Arg Arg His Thr 20 25 30 Ser Ser Phe Arg Ser Ser Ser Ser Ile Tyr Glu Ala Asp Asn Asp Gly 35 40 45 Asp Val Asn Asp Asp His His Asp Ala Glu Tyr Ala Leu Gln Trp Ala 50 55 60 Lys Ile Glu Arg Leu Pro Thr Ala Lys Arg Met Arg Ser Thr Leu Leu 65 70 75 80 Asp Glu Ser Ile Thr Glu Asn Gly Lys Arg Val Val Asp Val Ser Lys 85 90 95 Leu Gly Ala Thr Glu Arg His Leu Met Ile Glu Gly Leu Ile Lys His 100 105 110 Ile Glu Asn Asp Asn Leu Lys Leu Leu Lys Lys Ile Arg Arg Arg Ile 115 120 125 Asp Arg Val Gly Met Glu Leu Pro Thr Ile Glu Val Arg Tyr Thr Ser 130 135 140 Leu Lys Val Glu Ala Glu Cys Glu Ile Val Glu Gly Lys Ala Leu Pro 145 150 155 160 Thr Leu Trp Asn Thr Ala Lys Arg Ile Phe Ser Glu Leu Val Lys Leu 165 170 175 Thr Gly Ala Lys Ala His Glu Ala Asn Ile Ser Ile Leu Asn Asp Val 180 185 190 Asn Gly Ile Ile Lys Pro Gly Arg Leu Thr Leu Leu Leu Gly Pro Pro 195 200 205 Gly Cys Gly Lys Thr Thr Met Leu Lys Ala Leu Ser Gly Asn Leu Glu 210 215 220 Asn Asn Leu Lys Cys Ser Gly Glu Ile Ser Tyr Asn Gly His Arg Leu 225 230 235 240 Asp Glu Phe Val Pro Gln Lys Thr Ser Ala Tyr Ile Ser Gln Tyr Asp 245 250 255 Leu His Ile Ala Glu Met Thr Val Arg Glu Thr Val Asp Phe Ser Ala 260 265 270 Arg Cys Gln Gly Val Gly Ser
Arg Thr Asp Ile Met Met Glu Val Ser 275 280 285 Lys Arg Glu Lys Glu Lys Gly Ile Ile Pro Asp Thr Glu Val Asp Ala 290 295 300 Tyr Met Lys Ala Ile Ser Val Glu Gly Leu Lys Arg Ser Leu Gln Thr 305 310 315 320 Asp Tyr Ile Leu Lys Ile Leu Gly Leu Asp Ile Cys Ala Glu Thr Leu 325 330 335 Ile Gly Asp Val Met Arg Arg Gly Ile Ser Gly Gly Gln Lys Lys Arg 340 345 350 Leu Thr Thr Ala Glu Met Ile Val Gly Pro Thr Lys Ala Leu Phe Met 355 360 365 Asp Glu Ile Thr Asn Gly Leu Asp Ser Ser Thr Ala Phe Gln Ile Val 370 375 380 Lys Ser Leu Gln Gln Phe Ala His Ile Ser Ser Ala Thr Val Leu Val 385 390 395 400 Ser Leu Leu Gln Pro Ala Pro Glu Ser Phe Asp Leu Phe Asp Asp Val 405 410 415 Met Leu Met Ala Lys Gly Arg Ile Val Tyr His Gly Pro Arg Gly Glu 420 425 430 Val Leu Lys Phe Phe Glu Asp Cys Gly Phe Gln Cys Pro Glu Arg Lys 435 440 445 Gly Val Ala Asp Phe Leu Gln Glu Val Ile Ser Lys Lys Asp Gln Ala 450 455 460 Gln Tyr Trp Arg His Glu Asp Leu Pro Tyr Ser Phe Val Ser Val Glu 465 470 475 480 Met Leu Ser Lys Lys Phe Lys Asp Leu Ser Ile Gly Lys Lys Ile Glu 485 490 495 Glu Thr Leu Ser Lys Pro Tyr Asp Arg Ser Lys Ser His Lys Asp Ala 500 505 510 Leu Ser Phe Ser Val Tyr Ser Leu Pro Asn Trp Glu Leu Phe Ile Ala 515 520 525 Cys Ile Ser Arg Glu Tyr Leu Leu Met Lys Arg Asn Tyr Phe Val Tyr 530 535 540 Ile Phe Lys Thr Ser Gln Leu Val Met Ala Ala Phe Ile Thr Met Thr 545 550 555 560 Val Tyr Ile Arg Thr Arg Met Gly Ile Asp Ile Ile His Gly Asn Ser 565 570 575 Tyr Met Ser Ala Leu Phe Phe Ala Leu Val Ile Leu Leu Val Asp Gly 580 585 590 Phe Pro Glu Leu Ser Met Thr Ala Gln Arg Leu Ala Val Phe Tyr Lys 595
600 605 Gln Lys Gln Leu Cys Phe Tyr Pro Ala Trp Ala Tyr Ala Ile Pro Ala 610 615 620 Thr Val Leu Lys Val Pro Leu Ser Phe Phe Glu Ser Leu Val Trp Thr 625 630 635 640 Gly Leu Thr Tyr Tyr Val Ile Gly Tyr Thr Pro Glu Ala Ser Arg Phe 645 650 655 Phe Lys Gln Phe Ile Leu Leu Phe Ala Val His Phe Thr Ser Ile Ser 660 665 670 Met Phe Arg Cys Leu Ala Ala Ile Phe Gln Thr Val Val Ala Ser Ile 675 680 685 Thr Ala Gly Ser Phe Gly Ile Leu Phe Thr Phe Val Phe Ala Gly Phe 690 695 700 Val Ile Pro Pro Thr Ser Met Pro Ala Trp Leu Lys Trp Gly Phe Trp 705 710 715 720 Ala Asn Pro Leu Ser Tyr Gly Glu Ile Gly Leu Ser Val Asn Glu Phe 725 730 735 Leu Ala Pro Arg Trp Asn Gln Met Gln Pro Asn Asn Val Thr Leu Gly 740 745 750 Arg Thr Ile Leu Gln Thr Arg Gly Met Asp Tyr Asp Gly Tyr Met Tyr 755 760 765 Trp Val Ser Leu Cys Ala Leu Leu Gly Phe Thr Val Leu Phe Asn Ile 770 775 780 Ile Phe Thr Leu Ala Leu Thr Phe Leu Lys Ser Pro Thr Ser Ser Lys 785 790 795 800 Ala Met Ile Ser Gln Glu Lys Leu Phe Glu Leu Gln Gly Lys Glu Ala 805 810 815 Ser Thr Gly Asp Thr Ser Val Lys Asn Lys Thr Thr Gly Ser Pro Val 820 825 830 Asn Thr Glu Glu Gly Lys Met Val Leu Pro Phe Lys Pro Leu Thr Val 835 840 845 Thr Phe Gln Asp Leu Asn Tyr Phe Val Asp Met Pro Val Glu Met Arg 850 855 860 Asp Gln Gly Tyr Asp Gln Lys Lys Leu Gln Leu Leu Ser Asn Ile Thr 865 870 875 880 Gly Ala Phe Arg Pro Gly Ile Leu Thr Ala Leu Met Gly Val Ser Gly 885 890 895 Ala Gly Lys Thr Thr Leu Leu Asp Val Leu Ala Gly Arg Lys Thr Ser 900 905 910 Gly Tyr Ile Glu Gly Asp Ile Arg Ile Ser Gly Phe Pro Lys Val Gln 915 920 925 Glu Thr Phe Ala Arg Val Ser Gly Tyr Cys Glu Gln Thr Asp Ile His 930 935 940 Ser Pro Asn Ile Thr Val Gly Glu Ser Val Ile Tyr Ser Ala Trp Leu 945 950 955 960 Arg Leu Ala Pro Glu Ile Asp Ser Ala Thr Lys Thr Gln Phe Val Lys 965 970 975 Gln Val Leu Glu Thr Ile Glu Leu Asp Glu Ile Lys Asp Ala Leu Val 980 985 990 Gly Val Ala Gly Val Ser Gly Leu Ser Thr Glu Gln Arg Lys Arg Leu 995 1000 1005 Thr Ile Ala Val Glu Leu Val Ala Asn Pro Ser Ile Ile Phe Met 1010 1015 
1020 Asp Glu Pro Thr Thr Gly Leu Asp Ala Arg Ala Ala Ala Ile Val 1025 1030 1035 Met Arg Ala Val Lys Asn Val Ala Asp Thr Gly Arg Thr Ile Val 1040 1045 1050 Cys Thr Ile His Gln Pro Ser Ile Asp Ile Phe Glu Ala Phe Asp 1055 1060 1065 Glu Leu Val Leu Leu Lys Arg Gly Gly Arg Met Ile Tyr Thr Gly 1070 1075 1080 Pro Leu Gly Leu His Ser Cys His Ile Ile Glu Tyr Phe Glu Ser 1085 1090 1095 Val Pro Glu Ile Pro Lys Ile Arg Asp Asn His Asn Pro Ala Thr 1100 1105 1110 Trp Met Leu Asp Val Ser Ser Gln Ser Val Glu Val Glu Leu Gly 1115 1120 1125 Val Asp Phe Ala Asn Ile Tyr His Glu Ser Ala Leu Tyr Lys Arg 1130 1135 1140 Asn Ser Glu Leu Val Lys Gln Leu Ser Gln Pro Asp Ala Glu Ser 1145 1150 1155 Ser Asp Ile Gln Phe Lys Arg Thr Phe Ala Gln Ser Trp Trp Gly 1160 1165 1170 Gln Phe Lys Ser Ile Leu Trp Lys Met Ser Leu Ser Tyr Trp Arg 1175 1180 1185 Ser Pro Ser Tyr Asn Leu Met Arg Met Ile His Thr Leu Ile Ser 1190 1195 1200 Ser Leu Ile Phe Gly Ala Leu Phe Trp Lys Gln Gly Gln Lys Ile 1205 1210 1215 Asp Thr Gln Gln Ser Leu Phe Thr Val Phe Gly Ala Ile Tyr Gly 1220 1225 1230 Leu Val Leu Phe Leu Gly Ile Asn Asn Cys Ser Ser Ala Leu Gln 1235 1240 1245 Tyr Phe Glu Thr Glu Arg Asn Val Met Tyr Arg Glu Arg Phe Ala 1250 1255 1260 Gly Met Tyr Ser Ala Thr Ala Tyr Ala Leu Ser Gln Val Val Thr 1265 1270 1275 Glu Ile Pro Tyr Ile Phe Ile Gln Ala Ala Glu Phe Val Ile Ile 1280 1285 1290 Thr Tyr Pro Met Ile Gly Phe Tyr Pro Ser Thr Tyr Lys Val Phe 1295 1300 1305 Trp Ser Leu Tyr Ser Met Phe Cys Ser Leu Leu Thr Phe Asn Tyr 1310 1315 1320 Leu Ala Met Phe Leu Val Ser Ile Thr Pro Asn Phe Met Val Ala 1325 1330 1335 Ala Ile Leu Gln Ser Leu Phe Tyr Val Asn Phe Asn Leu Phe Ser 1340 1345 1350 Gly Phe Leu Ile Pro Gln Thr Gln Val Pro Gly Trp Trp Ile Trp 1355 1360 1365 Leu Tyr Tyr Leu Thr Pro Thr Ser Trp Thr Leu Asn Gly Phe Ile 1370 1375 1380 Ser Ser Gln Tyr Gly Asp Ile His Asp Glu Ile Asn Val Phe Gly 1385 1390 1395 Glu Thr Thr Thr Val Ala Ala Phe Leu Lys Asp Tyr Phe Gly Phe 1400 1405 1410 His His Glu Arg Leu Ala Ile Thr Ala Val Val 
Gln Ile Ala Phe 1415 1420 1425 Pro Ile Ala Phe Ala Ser Met Phe Ala Phe Phe Val Gly Lys Leu 1430 1435 1440 Asn Phe Gln Arg Arg 1445 551658DNAArabidopsis thalianaCDS(38)..(1462) 55gactactaag ttgatctaga aaaaaatcgc cggaaga atg gcg aag cag caa gaa 55 Met Ala Lys Gln Gln Glu 1 5 gca gag ctc atc ttc atc cca ttt cca atc ccc gga cac att ctc gcc 103Ala Glu Leu Ile Phe Ile Pro Phe Pro Ile Pro Gly His Ile Leu Ala 10 15 20 aca atc gaa ctc gcg aaa cgt ctc atc agt cac caa cct agt cgg atc 151Thr Ile Glu Leu Ala Lys Arg Leu Ile Ser His Gln Pro Ser Arg Ile 25 30 35 cac acc atc acc atc ctc cat tgg agc tta cct ttt ctt cct caa tct 199His Thr Ile Thr Ile Leu His Trp Ser Leu Pro Phe Leu Pro Gln Ser 40 45 50 gac act atc gcc ttc ctc aaa tcc cta atc gaa aca gag tct cgt atc 247Asp Thr Ile Ala Phe Leu Lys Ser Leu Ile Glu Thr Glu Ser Arg Ile 55 60 65 70 cgt ctc att acc tta ccc gat gtc caa aac cct cca cca atg gag cta 295Arg Leu Ile Thr Leu Pro Asp Val Gln Asn Pro Pro Pro Met Glu Leu 75 80 85 ttt gtg aaa gct tcc gaa tct tac att ctt gaa tac gtc aag aaa atg 343Phe Val Lys Ala Ser Glu Ser Tyr Ile Leu Glu Tyr Val Lys Lys Met 90 95 100 gtt cct ttg gtc aga aac gct ctc tcc act ctc ttg tct tct cgt gat 391Val Pro Leu Val Arg Asn Ala Leu Ser Thr Leu Leu Ser Ser Arg Asp 105 110 115 gaa tcg gat tca gtt cat gtc gcc gga tta gtt ctt gat ttc ttc tgt 439Glu Ser Asp Ser Val His Val Ala Gly Leu Val Leu Asp Phe Phe Cys 120 125 130 gtc cct ttg atc gat gtc gga aac gag ttt aat ctc cct tct tac atc 487Val Pro Leu Ile Asp Val Gly Asn Glu Phe Asn Leu Pro Ser Tyr Ile 135 140 145 150 ttc ttg acg tgt agc gca agt ttc ttg ggt atg atg aag tat ctt ctg 535Phe Leu Thr Cys Ser Ala Ser Phe Leu Gly Met Met Lys Tyr Leu Leu 155 160 165 gag aga aac cgc gaa acc aaa ccg gaa ctt aac cgg agc tct gac gag 583Glu Arg Asn Arg Glu Thr Lys Pro Glu Leu Asn Arg Ser Ser Asp Glu 170 175 180 gaa aca ata tca gtt cct ggt ttt gtt aac tcc gtt ccg gtt aaa gtt 631Glu Thr Ile Ser Val Pro Gly Phe Val Asn Ser Val Pro Val Lys Val 185 190 195 ttg cca ccg ggt ttg 
ttc acg act gag tct tac gaa gct tgg gtc gaa 679Leu Pro Pro Gly Leu Phe Thr Thr Glu Ser Tyr Glu Ala Trp Val Glu 200 205 210 atg gcg gaa agg ttc cct gaa gcc aag ggt att ttg gtc aat tca ttt 727Met Ala Glu Arg Phe Pro Glu Ala Lys Gly Ile Leu Val Asn Ser Phe 215 220 225 230 gaa tct cta gaa cgt aac gct ttt gat tat ttc gat cgt cgt ccg gat 775Glu Ser Leu Glu Arg Asn Ala Phe Asp Tyr Phe Asp Arg Arg Pro Asp 235 240 245 aat tac cca ccc gtt tac cca atc ggg cca att cta tgc tcc aac gat 823Asn Tyr Pro Pro Val Tyr Pro Ile Gly Pro Ile Leu Cys Ser Asn Asp 250 255 260 cgt ccg aat ttg gat tta tcg gaa cga gac cgg atc ttg aaa tgg ctc 871Arg Pro Asn Leu Asp Leu Ser Glu Arg Asp Arg Ile Leu Lys Trp Leu 265 270 275 gat gac caa ccc gag tca tct gtt gtg ttt ctc tgc ttc ggg agc ttg 919Asp Asp Gln Pro Glu Ser Ser Val Val Phe Leu Cys Phe Gly Ser Leu 280 285 290 aag agt ctc gct gcg tct cag att aaa gag atc gct caa gcc tta gag 967Lys Ser Leu Ala Ala Ser Gln Ile Lys Glu Ile Ala Gln Ala Leu Glu 295 300 305 310 ctc gtc gga atc aga ttc ctc tgg tcg att cga acg gac ccg aag gag 1015Leu Val Gly Ile Arg Phe Leu Trp Ser Ile Arg Thr Asp Pro Lys Glu 315 320 325 tac gcg agc ccg aac gag att tta ccg gac ggg ttt atg aac cga gtc 1063Tyr Ala Ser Pro Asn Glu Ile Leu Pro Asp Gly Phe Met Asn Arg Val 330 335 340 atg ggt ttg ggc ctt gtt tgt ggt tgg gct cct caa gtt gaa att ctg 1111Met Gly Leu Gly Leu Val Cys Gly Trp Ala Pro Gln Val Glu Ile Leu 345 350 355 gcc cat aaa gca att gga ggg ttc gtg tca cac tgc ggt tgg aac tcg 1159Ala His Lys Ala Ile Gly Gly Phe Val Ser His Cys Gly Trp Asn Ser 360 365 370 ata ttg gag agt ttg cgt ttc gga gtt cca att gcc acg tgg cca atg 1207Ile Leu Glu Ser Leu Arg Phe Gly Val Pro Ile Ala Thr Trp Pro Met 375 380 385 390 tac gcg gaa caa caa cta aac gcg ttc acg att gtg aag gag ctt ggt 1255Tyr Ala Glu Gln Gln Leu Asn Ala Phe Thr Ile Val Lys Glu Leu Gly 395 400 405 ttg gcg ttg gag atg cgg ttg gat tac gtg tcg gaa tat gga gaa atc 1303Leu Ala Leu Glu Met Arg Leu Asp Tyr Val Ser Glu Tyr Gly Glu Ile 410 415 
420 gtg aaa gct gat gaa atc gca gga gcc gta cga tct ttg atg gac ggt 1351Val Lys Ala Asp Glu Ile Ala Gly Ala Val Arg Ser Leu Met Asp Gly 425 430 435 gag gat gtg ccg agg agg aaa ctg aag gag att gcg gag gcg gga aaa 1399Glu Asp Val Pro Arg Arg Lys Leu Lys Glu Ile Ala Glu Ala Gly Lys 440 445 450 gag gct gtg atg gac ggt gga tct tcg ttt gtt gcg gtt aaa aga ttc 1447Glu Ala Val Met Asp Gly Gly Ser Ser Phe Val Ala Val Lys Arg Phe 455 460 465 470 ata gat ggg ctt tga tcggtgatgg gttttaaagt ttttacacca tgcaaacgtt 1502Ile Asp Gly Leu gtcgttttat gtaatttaag cttgctttga gtgagtctct aatggctttg agctttatcc 1562aactctataa aagtcctcct tttgatagta tgcatgatct tttgtgttta ctcatttgtt 1622atatatctaa atagctcatt ttgcattttg ttttat 165856474PRTArabidopsis thaliana 56Met Ala Lys Gln Gln Glu Ala Glu Leu Ile Phe Ile Pro Phe Pro Ile 1 5 10 15 Pro Gly His Ile Leu Ala Thr Ile Glu Leu Ala Lys Arg Leu Ile Ser 20 25 30 His Gln Pro Ser Arg Ile His Thr Ile Thr Ile Leu His Trp Ser Leu 35 40 45 Pro Phe Leu Pro Gln Ser Asp Thr Ile Ala Phe Leu Lys Ser Leu Ile 50 55 60 Glu Thr Glu Ser Arg Ile Arg Leu Ile Thr Leu Pro Asp Val Gln Asn 65 70 75 80 Pro Pro Pro Met Glu Leu Phe Val Lys Ala Ser Glu Ser Tyr Ile Leu 85 90 95 Glu Tyr Val Lys Lys Met Val Pro Leu Val Arg Asn Ala Leu Ser Thr 100 105 110 Leu Leu Ser Ser Arg Asp Glu Ser Asp Ser Val His Val Ala Gly Leu 115 120 125 Val Leu Asp Phe Phe Cys Val Pro Leu Ile Asp Val Gly Asn Glu Phe 130 135 140 Asn Leu Pro Ser Tyr Ile Phe Leu Thr Cys Ser Ala Ser Phe Leu Gly 145 150 155 160 Met Met Lys Tyr Leu Leu Glu Arg Asn Arg Glu Thr Lys Pro Glu Leu 165 170 175 Asn Arg Ser Ser Asp Glu Glu Thr Ile Ser Val Pro Gly Phe Val Asn 180 185 190 Ser Val Pro Val Lys Val Leu Pro Pro Gly Leu Phe Thr Thr Glu Ser 195 200 205 Tyr Glu Ala Trp Val Glu Met Ala Glu Arg Phe Pro Glu Ala Lys Gly 210 215 220 Ile Leu Val Asn Ser Phe Glu Ser Leu Glu Arg Asn Ala Phe Asp Tyr 225 230 235 240 Phe Asp Arg Arg Pro Asp Asn Tyr Pro Pro Val Tyr Pro Ile Gly Pro 245 250 255 Ile Leu Cys Ser Asn Asp Arg Pro Asn Leu Asp Leu 
Ser Glu Arg Asp 260 265 270 Arg Ile Leu Lys Trp Leu Asp Asp Gln Pro Glu Ser Ser Val Val Phe 275 280 285 Leu Cys Phe Gly Ser Leu Lys Ser Leu Ala Ala Ser Gln Ile Lys Glu 290 295 300 Ile Ala Gln Ala Leu Glu Leu Val Gly Ile Arg Phe Leu Trp Ser Ile 305 310 315 320 Arg Thr Asp Pro Lys Glu Tyr Ala Ser Pro Asn Glu Ile Leu Pro Asp 325 330 335 Gly Phe Met Asn Arg Val Met Gly Leu Gly Leu Val Cys Gly Trp Ala 340 345 350 Pro Gln Val Glu Ile Leu Ala His Lys Ala Ile Gly Gly Phe Val Ser 355 360 365 His Cys Gly Trp Asn Ser Ile Leu Glu Ser Leu Arg Phe Gly Val Pro 370 375 380 Ile Ala Thr Trp Pro Met Tyr Ala Glu Gln Gln Leu Asn Ala Phe Thr 385 390 395 400 Ile Val Lys Glu Leu Gly Leu Ala Leu Glu Met Arg Leu Asp Tyr Val 405 410 415 Ser Glu Tyr Gly Glu Ile Val Lys Ala Asp Glu Ile Ala Gly Ala Val 420 425 430 Arg Ser Leu Met Asp Gly Glu Asp Val Pro Arg Arg Lys Leu Lys Glu 435 440 445 Ile Ala Glu Ala Gly Lys Glu Ala Val Met Asp Gly Gly Ser Ser Phe 450 455 460 Val Ala Val Lys Arg Phe Ile Asp Gly Leu 465 470 571425DNAArabidopsis lyratamisc_featuresubsp. lyrataCDS(1)..(1425) 57atg gag gag aag caa gaa gca gag ctc ata ttc atc cca ttt cca atc 48Met Glu Glu Lys Gln Glu Ala Glu Leu Ile Phe Ile Pro Phe Pro Ile 1 5 10 15 cct gga cac atg ctt gcc aca atc gaa ctc gcg aaa cgt ctc atc aat 96Pro Gly His Met Leu Ala Thr Ile Glu Leu Ala Lys Arg Leu Ile Asn 20 25 30 cac aaa cct cgt cgg atc cat acc atc acc atc ctc cat tgg agc tta 144His Lys Pro Arg Arg Ile His Thr Ile Thr Ile Leu His
Trp Ser Leu 35 40 45 cct ttt ctt cct caa tct gac act atc tcc ttc ctc aaa tcc cta atc 192Pro Phe Leu Pro Gln Ser Asp Thr Ile Ser Phe Leu Lys Ser Leu Ile 50 55 60 caa aca gag tct cgt atc cgt ctt gtt acc tta ccc gac gtc cca aac 240Gln Thr Glu Ser Arg Ile Arg Leu Val Thr Leu Pro Asp Val Pro Asn 65 70 75 80 cct cca cca atg gaa ctt ttc gtg aaa gct tca gaa tct tac att ctt 288Pro Pro Pro Met Glu Leu Phe Val Lys Ala Ser Glu Ser Tyr Ile Leu 85 90 95 gaa ttc gtc aag aaa atg gtt cct ttg gtt aaa aaa gct ctc tcc act 336Glu Phe Val Lys Lys Met Val Pro Leu Val Lys Lys Ala Leu Ser Thr 100 105 110 ctc ttg tct tct cgt gat gaa tcg gat tca gtt cgt gtc gcc gga tta 384Leu Leu Ser Ser Arg Asp Glu Ser Asp Ser Val Arg Val Ala Gly Leu 115 120 125 gtt ctc gat ttc ttc tgt gtc cct ttg att gat gtt gga aac gag ttt 432Val Leu Asp Phe Phe Cys Val Pro Leu Ile Asp Val Gly Asn Glu Phe 130 135 140 aat ctc cct tct tac att ttc ttg acg tgt agc gca agt ttc ttg ggt 480Asn Leu Pro Ser Tyr Ile Phe Leu Thr Cys Ser Ala Ser Phe Leu Gly 145 150 155 160 atg atg aag tat ctc cca gag aga cac cgc aaa atc aaa ccg gaa ttt 528Met Met Lys Tyr Leu Pro Glu Arg His Arg Lys Ile Lys Pro Glu Phe 165 170 175 aac cgg agc tct ggc gag gaa aca ata ccg gtt cct ggc ttt gtt aac 576Asn Arg Ser Ser Gly Glu Glu Thr Ile Pro Val Pro Gly Phe Val Asn 180 185 190 tcc gtt ccg gtt aag gtt ttg cca ccg ggt ctg ttc atg aga gag tct 624Ser Val Pro Val Lys Val Leu Pro Pro Gly Leu Phe Met Arg Glu Ser 195 200 205 tac gaa gct tgg gtc gaa atg gcg gag agg ttc cct gaa gcc aag ggt 672Tyr Glu Ala Trp Val Glu Met Ala Glu Arg Phe Pro Glu Ala Lys Gly 210 215 220 atc ttg gta aat tct ttc gaa tct cta gaa cgt aac gct ttt gat tat 720Ile Leu Val Asn Ser Phe Glu Ser Leu Glu Arg Asn Ala Phe Asp Tyr 225 230 235 240 ttc gat cat cgt ccg gat aat tac cca ccc gtt tac cca atc ggg ccg 768Phe Asp His Arg Pro Asp Asn Tyr Pro Pro Val Tyr Pro Ile Gly Pro 245 250 255 att cta tgc tcc aac gat cgt ccg aat ttg gat tta tcg gaa cga gat 816Ile Leu Cys Ser Asn Asp Arg Pro Asn Leu Asp Leu 
Ser Glu Arg Asp 260 265 270 cgg atc ttg aga tgg ctc gat gac caa ccc gag tca tca gtt gtg ttc 864Arg Ile Leu Arg Trp Leu Asp Asp Gln Pro Glu Ser Ser Val Val Phe 275 280 285 ttc tgc ttc ggg agc ttg aag agt ctc gct gct tct cag att aaa gag 912Phe Cys Phe Gly Ser Leu Lys Ser Leu Ala Ala Ser Gln Ile Lys Glu 290 295 300 atc gct caa gcc att gaa ctc gtc gga ttc aga ttc ctc tgg tcg att 960Ile Ala Gln Ala Ile Glu Leu Val Gly Phe Arg Phe Leu Trp Ser Ile 305 310 315 320 cga aca gat ccg aac gag tac ccg aac ccg tac gag att tta ccg gac 1008Arg Thr Asp Pro Asn Glu Tyr Pro Asn Pro Tyr Glu Ile Leu Pro Asp 325 330 335 ggg ttt atg aac cgg gtc atg ggt ttg ggt ctt gtt tgt ggt tgg gct 1056Gly Phe Met Asn Arg Val Met Gly Leu Gly Leu Val Cys Gly Trp Ala 340 345 350 cct caa gtt gaa att ctg gcc cat aaa gca atc gga ggg ttc gtg tca 1104Pro Gln Val Glu Ile Leu Ala His Lys Ala Ile Gly Gly Phe Val Ser 355 360 365 cac tgc ggt tgg aac tcg att ttg gag agt ttg cgt ttc ggg gtt cca 1152His Cys Gly Trp Asn Ser Ile Leu Glu Ser Leu Arg Phe Gly Val Pro 370 375 380 atc gcc acg tgg cca atg tac gca gaa caa caa cta aac gcg ttc acg 1200Ile Ala Thr Trp Pro Met Tyr Ala Glu Gln Gln Leu Asn Ala Phe Thr 385 390 395 400 att gtg aag gag ctt ggt ttg gcg ttg gag atg cgg ttg gat tac gtg 1248Ile Val Lys Glu Leu Gly Leu Ala Leu Glu Met Arg Leu Asp Tyr Val 405 410 415 tgg gct cat gga gaa atc gtg aaa gct gat gaa atc gca ggt gcc gta 1296Trp Ala His Gly Glu Ile Val Lys Ala Asp Glu Ile Ala Gly Ala Val 420 425 430 cga tct tta atg gac ggt gag gat gtg cgg agg agg aaa ctg aag gag 1344Arg Ser Leu Met Asp Gly Glu Asp Val Arg Arg Arg Lys Leu Lys Glu 435 440 445 att gcg gag gcg gca aaa gag gct gtg atg gac ggt gga tct tcg ttt 1392Ile Ala Glu Ala Ala Lys Glu Ala Val Met Asp Gly Gly Ser Ser Phe 450 455 460 gtt gcg gtt aaa aga ttc ata gat ggg ctt tga 1425Val Ala Val Lys Arg Phe Ile Asp Gly Leu 465 470 58474PRTArabidopsis lyrata 58Met Glu Glu Lys Gln Glu Ala Glu Leu Ile Phe Ile Pro Phe Pro Ile 1 5 10 15 Pro Gly His Met Leu Ala Thr Ile Glu Leu 
Ala Lys Arg Leu Ile Asn 20 25 30 His Lys Pro Arg Arg Ile His Thr Ile Thr Ile Leu His Trp Ser Leu 35 40 45 Pro Phe Leu Pro Gln Ser Asp Thr Ile Ser Phe Leu Lys Ser Leu Ile 50 55 60 Gln Thr Glu Ser Arg Ile Arg Leu Val Thr Leu Pro Asp Val Pro Asn 65 70 75 80 Pro Pro Pro Met Glu Leu Phe Val Lys Ala Ser Glu Ser Tyr Ile Leu 85 90 95 Glu Phe Val Lys Lys Met Val Pro Leu Val Lys Lys Ala Leu Ser Thr 100 105 110 Leu Leu Ser Ser Arg Asp Glu Ser Asp Ser Val Arg Val Ala Gly Leu 115 120 125 Val Leu Asp Phe Phe Cys Val Pro Leu Ile Asp Val Gly Asn Glu Phe 130 135 140 Asn Leu Pro Ser Tyr Ile Phe Leu Thr Cys Ser Ala Ser Phe Leu Gly 145 150 155 160 Met Met Lys Tyr Leu Pro Glu Arg His Arg Lys Ile Lys Pro Glu Phe 165 170 175 Asn Arg Ser Ser Gly Glu Glu Thr Ile Pro Val Pro Gly Phe Val Asn 180 185 190 Ser Val Pro Val Lys Val Leu Pro Pro Gly Leu Phe Met Arg Glu Ser 195 200 205 Tyr Glu Ala Trp Val Glu Met Ala Glu Arg Phe Pro Glu Ala Lys Gly 210 215 220 Ile Leu Val Asn Ser Phe Glu Ser Leu Glu Arg Asn Ala Phe Asp Tyr 225 230 235 240 Phe Asp His Arg Pro Asp Asn Tyr Pro Pro Val Tyr Pro Ile Gly Pro 245 250 255 Ile Leu Cys Ser Asn Asp Arg Pro Asn Leu Asp Leu Ser Glu Arg Asp 260 265 270 Arg Ile Leu Arg Trp Leu Asp Asp Gln Pro Glu Ser Ser Val Val Phe 275 280 285 Phe Cys Phe Gly Ser Leu Lys Ser Leu Ala Ala Ser Gln Ile Lys Glu 290 295 300 Ile Ala Gln Ala Ile Glu Leu Val Gly Phe Arg Phe Leu Trp Ser Ile 305 310 315 320 Arg Thr Asp Pro Asn Glu Tyr Pro Asn Pro Tyr Glu Ile Leu Pro Asp 325 330 335 Gly Phe Met Asn Arg Val Met Gly Leu Gly Leu Val Cys Gly Trp Ala 340 345 350 Pro Gln Val Glu Ile Leu Ala His Lys Ala Ile Gly Gly Phe Val Ser 355 360 365 His Cys Gly Trp Asn Ser Ile Leu Glu Ser Leu Arg Phe Gly Val Pro 370 375 380 Ile Ala Thr Trp Pro Met Tyr Ala Glu Gln Gln Leu Asn Ala Phe Thr 385 390 395 400 Ile Val Lys Glu Leu Gly Leu Ala Leu Glu Met Arg Leu Asp Tyr Val 405 410 415 Trp Ala His Gly Glu Ile Val Lys Ala Asp Glu Ile Ala Gly Ala Val 420 425 430 Arg Ser Leu Met Asp Gly Glu Asp Val Arg Arg Arg Lys Leu Lys 
Glu 435 440 445 Ile Ala Glu Ala Ala Lys Glu Ala Val Met Asp Gly Gly Ser Ser Phe 450 455 460 Val Ala Val Lys Arg Phe Ile Asp Gly Leu 465 470 591449DNAArabidopsis lyratamisc_featuresubsp. lyrataCDS(1)..(1449) 59atg ggg atg caa gaa gaa gca gag ctc gtc atc atc cct ttc ccc ttc 48Met Gly Met Gln Glu Glu Ala Glu Leu Val Ile Ile Pro Phe Pro Phe 1 5 10 15 tcc ggg cac att ctc gca acc atc gaa ctc gcg aaa cgt ctc ata agt 96Ser Gly His Ile Leu Ala Thr Ile Glu Leu Ala Lys Arg Leu Ile Ser 20 25 30 caa gac aat cct cgg atc cac acc atc acc atc ctc tat tgg gga cta 144Gln Asp Asn Pro Arg Ile His Thr Ile Thr Ile Leu Tyr Trp Gly Leu 35 40 45 ccc ttt att cct caa gct gac aca atc gct ttc ctc caa tcc cta gtc 192Pro Phe Ile Pro Gln Ala Asp Thr Ile Ala Phe Leu Gln Ser Leu Val 50 55 60 aaa aat gag tct cgt atc cgt ctc gtt acg ttg ccc gag gtc caa aac 240Lys Asn Glu Ser Arg Ile Arg Leu Val Thr Leu Pro Glu Val Gln Asn 65 70 75 80 cct cca cca atg gaa ctc ttt gtg gaa ttt gct gaa tct tac att ctt 288Pro Pro Pro Met Glu Leu Phe Val Glu Phe Ala Glu Ser Tyr Ile Leu 85 90 95 gaa tac gtc aag aaa atg att ccc att gtg aga gat ggt ctc tcc act 336Glu Tyr Val Lys Lys Met Ile Pro Ile Val Arg Asp Gly Leu Ser Thr 100 105 110 ctc ttg tct tct cgc gat gaa tcg gat tca gtt cgt gtg gct gga ttg 384Leu Leu Ser Ser Arg Asp Glu Ser Asp Ser Val Arg Val Ala Gly Leu 115 120 125 gtt ctt gat ttc ttc tgc gtc cct atg atc gat gtg gga aac gag ttt 432Val Leu Asp Phe Phe Cys Val Pro Met Ile Asp Val Gly Asn Glu Phe 130 135 140 aat ctc cct tct tac att ttc ttg acg tgt agc gca ggg ttc ttg ggt 480Asn Leu Pro Ser Tyr Ile Phe Leu Thr Cys Ser Ala Gly Phe Leu Gly 145 150 155 160 atg atg aag tat ctt cca gag aga cac cgc aaa atc aaa tcg gaa ttt 528Met Met Lys Tyr Leu Pro Glu Arg His Arg Lys Ile Lys Ser Glu Phe 165 170 175 acc cgg agc tct aac gag gag tta aac cct att cct ggt ttt gtc aac 576Thr Arg Ser Ser Asn Glu Glu Leu Asn Pro Ile Pro Gly Phe Val Asn 180 185 190 tct gtt cca act aag gtt ttg ccg tca ggt ctg ttc atg aaa gag act 624Ser Val Pro Thr Lys 
Val Leu Pro Ser Gly Leu Phe Met Lys Glu Thr 195 200 205 tac gag cct tgg gtc gta cta gcc gag aga ttt cct gaa gct aag ggt 672Tyr Glu Pro Trp Val Val Leu Ala Glu Arg Phe Pro Glu Ala Lys Gly 210 215 220 att ttg gta aat tcc tac aca tct ctc gag cca aac ggt ttt aaa tat 720Ile Leu Val Asn Ser Tyr Thr Ser Leu Glu Pro Asn Gly Phe Lys Tyr 225 230 235 240 ttc gat cgt tgt ccg gat aac tac cca acc gtt tac cca atc ggg ccg 768Phe Asp Arg Cys Pro Asp Asn Tyr Pro Thr Val Tyr Pro Ile Gly Pro 245 250 255 att tta tgc tcc aac gac cgt ccg aat ttg gac tca tcg gaa cgc gat 816Ile Leu Cys Ser Asn Asp Arg Pro Asn Leu Asp Ser Ser Glu Arg Asp 260 265 270 cgg atc ata aga tgg ctc gat gac caa ccc gag tca tca gtc gtg ttc 864Arg Ile Ile Arg Trp Leu Asp Asp Gln Pro Glu Ser Ser Val Val Phe 275 280 285 ctt tgt ttc ggg agc ttg aag aat ctc agt gct act cag atc aac gag 912Leu Cys Phe Gly Ser Leu Lys Asn Leu Ser Ala Thr Gln Ile Asn Glu 290 295 300 atc gct caa gcc tta gag ctc gtt gaa tgc aaa ttc atc tgg tcg ttc 960Ile Ala Gln Ala Leu Glu Leu Val Glu Cys Lys Phe Ile Trp Ser Phe 305 310 315 320 cga acc aac ccg aag gag tac gca agc ccg tac gag gcc tta cca gac 1008Arg Thr Asn Pro Lys Glu Tyr Ala Ser Pro Tyr Glu Ala Leu Pro Asp 325 330 335 ggg ttc atg gac cgg gtc atg gat caa ggc ctc gtt tgt ggt tgg gct 1056Gly Phe Met Asp Arg Val Met Asp Gln Gly Leu Val Cys Gly Trp Ala 340 345 350 cct caa gtt gaa att tta gct cat aaa gct gtc gga gga ttt gta tcg 1104Pro Gln Val Glu Ile Leu Ala His Lys Ala Val Gly Gly Phe Val Ser 355 360 365 cac tgc ggt tgg aac tcg ata tta gaa agt ttg ggt ttc ggc gtt cca 1152His Cys Gly Trp Asn Ser Ile Leu Glu Ser Leu Gly Phe Gly Val Pro 370 375 380 atc gcc acg tgg cca atg tac gca gaa caa caa cta aac gcg ttc acg 1200Ile Ala Thr Trp Pro Met Tyr Ala Glu Gln Gln Leu Asn Ala Phe Thr 385 390 395 400 atg gtg aag gaa ctt ggt tta gcc ttg gag atg cgg ttg gat tac gtg 1248Met Val Lys Glu Leu Gly Leu Ala Leu Glu Met Arg Leu Asp Tyr Val 405 410 415 tcg gaa gat gga gat ata gtg aaa gct gat gaa atc gca gga acc att 
1296Ser Glu Asp Gly Asp Ile Val Lys Ala Asp Glu Ile Ala Gly Thr Ile 420 425 430 aga tct tta atg gac ggt gtg gat gtg cca aag agt aaa gtg aag gag 1344Arg Ser Leu Met Asp Gly Val Asp Val Pro Lys Ser Lys Val Lys Glu 435 440 445 att gct gag gcg gga aaa gaa gct gtt ctg gac ggt gga tct tcg ttt 1392Ile Ala Glu Ala Gly Lys Glu Ala Val Leu Asp Gly Gly Ser Ser Phe 450 455 460 gtt gcg gtt aaa aga ttc att ggt gac ttg atc gac ggc gtt tct ata 1440Val Ala Val Lys Arg Phe Ile Gly Asp Leu Ile Asp Gly Val Ser Ile 465 470 475 480 agg aag tag 1449Arg Lys 60482PRTArabidopsis lyrata 60Met Gly Met Gln Glu Glu Ala Glu Leu Val Ile Ile Pro Phe Pro Phe 1 5 10 15 Ser Gly His Ile Leu Ala Thr Ile Glu Leu Ala Lys Arg Leu Ile Ser 20 25 30 Gln Asp Asn Pro Arg Ile His Thr Ile Thr Ile Leu Tyr Trp Gly Leu 35 40 45 Pro Phe Ile Pro Gln Ala Asp Thr Ile Ala Phe Leu Gln Ser Leu Val 50 55 60 Lys Asn Glu Ser Arg Ile Arg Leu Val Thr Leu Pro Glu Val Gln Asn 65 70 75 80 Pro Pro Pro Met Glu Leu Phe Val Glu Phe Ala Glu Ser Tyr Ile Leu 85 90 95 Glu Tyr Val Lys Lys Met Ile Pro Ile Val Arg Asp Gly Leu Ser Thr 100 105 110 Leu Leu Ser Ser Arg Asp Glu Ser Asp Ser Val Arg Val Ala Gly Leu 115 120 125 Val Leu Asp Phe Phe Cys Val Pro Met Ile Asp Val Gly Asn Glu Phe 130 135 140 Asn Leu Pro Ser Tyr Ile Phe Leu Thr Cys Ser Ala Gly Phe Leu Gly 145 150 155 160 Met Met Lys Tyr Leu Pro Glu Arg His Arg Lys Ile Lys Ser Glu Phe 165 170 175 Thr Arg Ser Ser Asn Glu Glu Leu Asn Pro Ile Pro Gly Phe Val Asn 180 185 190 Ser Val Pro Thr Lys Val Leu Pro Ser Gly Leu Phe Met Lys Glu Thr 195 200 205 Tyr Glu Pro Trp Val Val Leu Ala Glu Arg Phe Pro Glu Ala Lys Gly 210 215 220 Ile Leu Val Asn Ser
Tyr Thr Ser Leu Glu Pro Asn Gly Phe Lys Tyr 225 230 235 240 Phe Asp Arg Cys Pro Asp Asn Tyr Pro Thr Val Tyr Pro Ile Gly Pro 245 250 255 Ile Leu Cys Ser Asn Asp Arg Pro Asn Leu Asp Ser Ser Glu Arg Asp 260 265 270 Arg Ile Ile Arg Trp Leu Asp Asp Gln Pro Glu Ser Ser Val Val Phe 275 280 285 Leu Cys Phe Gly Ser Leu Lys Asn Leu Ser Ala Thr Gln Ile Asn Glu 290 295 300 Ile Ala Gln Ala Leu Glu Leu Val Glu Cys Lys Phe Ile Trp Ser Phe 305 310 315 320 Arg Thr Asn Pro Lys Glu Tyr Ala Ser Pro Tyr Glu Ala Leu Pro Asp 325 330 335 Gly Phe Met Asp Arg Val Met Asp Gln Gly Leu Val Cys Gly Trp Ala 340 345 350 Pro Gln Val Glu Ile Leu Ala His Lys Ala Val Gly Gly Phe Val Ser 355 360 365 His Cys Gly Trp Asn Ser Ile Leu Glu Ser Leu Gly Phe Gly Val Pro 370 375 380 Ile Ala Thr Trp Pro Met Tyr Ala Glu Gln Gln Leu Asn Ala Phe Thr 385 390 395 400 Met Val Lys Glu Leu Gly Leu Ala Leu Glu Met Arg Leu Asp Tyr Val 405 410 415 Ser Glu Asp Gly Asp Ile Val Lys Ala Asp Glu Ile Ala Gly Thr Ile 420 425 430 Arg Ser Leu Met Asp Gly Val Asp Val Pro Lys Ser Lys Val Lys Glu 435 440 445 Ile Ala Glu Ala Gly Lys Glu Ala Val Leu Asp Gly Gly Ser Ser Phe 450 455 460 Val Ala Val Lys Arg Phe Ile Gly Asp Leu Ile Asp Gly Val Ser Ile 465 470 475 480 Arg Lys 611785DNACapsella rubellaCDS(134)..(1576) 61ctttaaaagt agctaacaat aagcatcaac acatacaaaa cacaactttc tagaaaaaaa 60cagctttgca caatctcagt ttcattttga ttttgtcatt ttccttattg acttttgagt 120ttctcagaac aca atg ggg aac caa gaa gca gag ctc gtc atc atc cct 169 Met Gly Asn Gln Glu Ala Glu Leu Val Ile Ile Pro 1 5 10 cac ccg ttc tcc gga cat att ctc gca acc atc gaa ctg gcg aaa cgt 217His Pro Phe Ser Gly His Ile Leu Ala Thr Ile Glu Leu Ala Lys Arg 15 20 25 ctc atc agt caa gac aat cct cgg atc cac acc atc acc atc ctc tac 265Leu Ile Ser Gln Asp Asn Pro Arg Ile His Thr Ile Thr Ile Leu Tyr 30 35 40 tgg gga cta ccc ttt att cct caa gct gac acg atc gcc ttc ctc cag 313Trp Gly Leu Pro Phe Ile Pro Gln Ala Asp Thr Ile Ala Phe Leu Gln 45 50 55 60 tcc cta gtc aaa aat gag cca cgt atc cgt ctc gtt acc 
ttg ccc gac 361Ser Leu Val Lys Asn Glu Pro Arg Ile Arg Leu Val Thr Leu Pro Asp 65 70 75 gtc gag aac cct cca ccg atg gag ctc ttc ttg gaa gca gct gaa gct 409Val Glu Asn Pro Pro Pro Met Glu Leu Phe Leu Glu Ala Ala Glu Ala 80 85 90 tac att ctt gaa tac gtc aag aag atg gtt ccc atc gtg agg gat ggt 457Tyr Ile Leu Glu Tyr Val Lys Lys Met Val Pro Ile Val Arg Asp Gly 95 100 105 ctc tcc act ctc ttg tct tct cgt gac gaa tct gat cca gtt cgc gtg 505Leu Ser Thr Leu Leu Ser Ser Arg Asp Glu Ser Asp Pro Val Arg Val 110 115 120 gcg gga ttg gtt ctt gat ttc ttc tgc gtc ccc atg att gat gtt gga 553Ala Gly Leu Val Leu Asp Phe Phe Cys Val Pro Met Ile Asp Val Gly 125 130 135 140 aac gag ttc aac ctc cct tct tac att ttc ttg acg tgc agc gca ggt 601Asn Glu Phe Asn Leu Pro Ser Tyr Ile Phe Leu Thr Cys Ser Ala Gly 145 150 155 ttc ttg ggt atg atg aag tat ctc cca gag aga cac agc gaa acc aac 649Phe Leu Gly Met Met Lys Tyr Leu Pro Glu Arg His Ser Glu Thr Asn 160 165 170 tca gag ttt aac cgg agc tct aac gag gag tta aac cgg gtt cct ggt 697Ser Glu Phe Asn Arg Ser Ser Asn Glu Glu Leu Asn Arg Val Pro Gly 175 180 185 ttt gtc aac tct gtt cct acc aag gtt ttg ccg tca ggt ctg ttc atg 745Phe Val Asn Ser Val Pro Thr Lys Val Leu Pro Ser Gly Leu Phe Met 190 195 200 aaa gag act tac gag cct tgg gtc gtg cta gca gag agg ttt cct gaa 793Lys Glu Thr Tyr Glu Pro Trp Val Val Leu Ala Glu Arg Phe Pro Glu 205 210 215 220 gct aag ggt atc tta gta aat tca ttc acg tct tta gag cca aac gct 841Ala Lys Gly Ile Leu Val Asn Ser Phe Thr Ser Leu Glu Pro Asn Ala 225 230 235 ttt gaa tat ttt gat ggt tgt ccg gat aat tac cca ccc gtt tac cca 889Phe Glu Tyr Phe Asp Gly Cys Pro Asp Asn Tyr Pro Pro Val Tyr Pro 240 245 250 atc ggg ccg ata ctc tgc tcc aac gat cgt ccg aat ctg gac tca tcg 937Ile Gly Pro Ile Leu Cys Ser Asn Asp Arg Pro Asn Leu Asp Ser Ser 255 260 265 gaa cga gac cgg atc ata aca tgg ctc gat gat cag aca gag tca tcg 985Glu Arg Asp Arg Ile Ile Thr Trp Leu Asp Asp Gln Thr Glu Ser Ser 270 275 280 gtt gtg ttc ctt tgc ttc ggg agc ttg aag aat att 
tct cag aca cag 1033Val Val Phe Leu Cys Phe Gly Ser Leu Lys Asn Ile Ser Gln Thr Gln 285 290 295 300 atc aaa gag atc gct caa gcc ttg gag ctc gtt gac tgc aaa ttc ctc 1081Ile Lys Glu Ile Ala Gln Ala Leu Glu Leu Val Asp Cys Lys Phe Leu 305 310 315 tgg tca ata aga acc gac ccg aaa gag tac tcg agc ccg tac gaa gct 1129Trp Ser Ile Arg Thr Asp Pro Lys Glu Tyr Ser Ser Pro Tyr Glu Ala 320 325 330 tta cca gac ggg ttc atg gac cgg gtt atg gat caa ggt ctt gtt tgt 1177Leu Pro Asp Gly Phe Met Asp Arg Val Met Asp Gln Gly Leu Val Cys 335 340 345 ggt tgg gct cct caa gtt gag att ctg gcc cat aaa gca atc gga ggg 1225Gly Trp Ala Pro Gln Val Glu Ile Leu Ala His Lys Ala Ile Gly Gly 350 355 360 ttc gtg tct cac tgc ggt tgg aac tct att ttg gag agt ttg ggt tac 1273Phe Val Ser His Cys Gly Trp Asn Ser Ile Leu Glu Ser Leu Gly Tyr 365 370 375 380 ggc gtt ccc atc gcc acg tgg ccg atg tac gcg gaa cag cag cta aac 1321Gly Val Pro Ile Ala Thr Trp Pro Met Tyr Ala Glu Gln Gln Leu Asn 385 390 395 gcg ttc acg atg gtg aag gag ctt ggt atc gca ttg gag atg cgg ttg 1369Ala Phe Thr Met Val Lys Glu Leu Gly Ile Ala Leu Glu Met Arg Leu 400 405 410 gat tac gtg tcg gaa gat gga cat ata gtg aaa gct gat gag atc gca 1417Asp Tyr Val Ser Glu Asp Gly His Ile Val Lys Ala Asp Glu Ile Ala 415 420 425 gaa acc gta cga tct ttg atg gac ggt gag gat cgt gcg ctg aag aat 1465Glu Thr Val Arg Ser Leu Met Asp Gly Glu Asp Arg Ala Leu Lys Asn 430 435 440 aca gtg gag gag att gct aat gcg gga aaa gtg gct gtg atg gac ggt 1513Thr Val Glu Glu Ile Ala Asn Ala Gly Lys Val Ala Val Met Asp Gly 445 450 455 460 gga tct tcg ttt gct gcg att aaa aga ttt atc ggt gat ttg atc atc 1561Gly Ser Ser Phe Ala Ala Ile Lys Arg Phe Ile Gly Asp Leu Ile Ile 465 470 475 ggc gat ggt ttg tag aaacgtcgta gtttcacttg gcgtgtggtg accatgatgc 1616Gly Asp Gly Leu 480 tcggctcaga ttcctttgtt cgttattaaa taatagaaga ctgagtcttc ttacaagtat 1676tttcaccagt tccatgtttt gtaaaggagt caacgattcc attatttgct tccacgtaat 1736gttgtatact tgtatcatct catatttaag gatcaaaaac gagttattc 178562480PRTCapsella 
rubella 62Met Gly Asn Gln Glu Ala Glu Leu Val Ile Ile Pro His Pro Phe Ser 1 5 10 15 Gly His Ile Leu Ala Thr Ile Glu Leu Ala Lys Arg Leu Ile Ser Gln 20 25 30 Asp Asn Pro Arg Ile His Thr Ile Thr Ile Leu Tyr Trp Gly Leu Pro 35 40 45 Phe Ile Pro Gln Ala Asp Thr Ile Ala Phe Leu Gln Ser Leu Val Lys 50 55 60 Asn Glu Pro Arg Ile Arg Leu Val Thr Leu Pro Asp Val Glu Asn Pro 65 70 75 80 Pro Pro Met Glu Leu Phe Leu Glu Ala Ala Glu Ala Tyr Ile Leu Glu 85 90 95 Tyr Val Lys Lys Met Val Pro Ile Val Arg Asp Gly Leu Ser Thr Leu 100 105 110 Leu Ser Ser Arg Asp Glu Ser Asp Pro Val Arg Val Ala Gly Leu Val 115 120 125 Leu Asp Phe Phe Cys Val Pro Met Ile Asp Val Gly Asn Glu Phe Asn 130 135 140 Leu Pro Ser Tyr Ile Phe Leu Thr Cys Ser Ala Gly Phe Leu Gly Met 145 150 155 160 Met Lys Tyr Leu Pro Glu Arg His Ser Glu Thr Asn Ser Glu Phe Asn 165 170 175 Arg Ser Ser Asn Glu Glu Leu Asn Arg Val Pro Gly Phe Val Asn Ser 180 185 190 Val Pro Thr Lys Val Leu Pro Ser Gly Leu Phe Met Lys Glu Thr Tyr 195 200 205 Glu Pro Trp Val Val Leu Ala Glu Arg Phe Pro Glu Ala Lys Gly Ile 210 215 220 Leu Val Asn Ser Phe Thr Ser Leu Glu Pro Asn Ala Phe Glu Tyr Phe 225 230 235 240 Asp Gly Cys Pro Asp Asn Tyr Pro Pro Val Tyr Pro Ile Gly Pro Ile 245 250 255 Leu Cys Ser Asn Asp Arg Pro Asn Leu Asp Ser Ser Glu Arg Asp Arg 260 265 270 Ile Ile Thr Trp Leu Asp Asp Gln Thr Glu Ser Ser Val Val Phe Leu 275 280 285 Cys Phe Gly Ser Leu Lys Asn Ile Ser Gln Thr Gln Ile Lys Glu Ile 290 295 300 Ala Gln Ala Leu Glu Leu Val Asp Cys Lys Phe Leu Trp Ser Ile Arg 305 310 315 320 Thr Asp Pro Lys Glu Tyr Ser Ser Pro Tyr Glu Ala Leu Pro Asp Gly 325 330 335 Phe Met Asp Arg Val Met Asp Gln Gly Leu Val Cys Gly Trp Ala Pro 340 345 350 Gln Val Glu Ile Leu Ala His Lys Ala Ile Gly Gly Phe Val Ser His 355 360 365 Cys Gly Trp Asn Ser Ile Leu Glu Ser Leu Gly Tyr Gly Val Pro Ile 370 375 380 Ala Thr Trp Pro Met Tyr Ala Glu Gln Gln Leu Asn Ala Phe Thr Met 385 390 395 400 Val Lys Glu Leu Gly Ile Ala Leu Glu Met Arg Leu Asp Tyr Val Ser 405 410 415 Glu Asp Gly 
His Ile Val Lys Ala Asp Glu Ile Ala Glu Thr Val Arg 420 425 430 Ser Leu Met Asp Gly Glu Asp Arg Ala Leu Lys Asn Thr Val Glu Glu 435 440 445 Ile Ala Asn Ala Gly Lys Val Ala Val Met Asp Gly Gly Ser Ser Phe 450 455 460 Ala Ala Ile Lys Arg Phe Ile Gly Asp Leu Ile Ile Gly Asp Gly Leu 465 470 475 480 6361PRTArtificial sequenceprotein patternVariant(3)..(3)Xaa in position 3 is Leu or PheVariant(10)..(10)Xaa in position 10 is Ala or ThrVariant(14)..(14)Xaa in position 14 is Leu or MetVariant(18)..(18)Xaa in position 18 is any amino acidVariant(24)..(24)Xaa in position 24 is any amino acidVariant(31)..(33)Xaa in position 31 to 33 is any or no amino acidVariant(43)..(43)Xaa in position 43 is Asn or SerVariant(54)..(54)Xaa in position 54 is any amino acidVariant(57)..(57)Xaa in position 57 is Ile or ValVariant(58)..(58)Xaa in position 58 is Ala, Glu or SerVariant(59)..(59)Xaa in position 59 is any amino acid 63Gly Ser Xaa Val Ile Asn Ile Gly Asp Xaa Met Gln Ile Xaa Ser Asn 1 5 10 15 Gly Xaa Tyr Lys Ser Val Glu Xaa Arg Val Leu Ala Asn Gly Xaa Xaa 20 25 30 Xaa Asn Arg Ile Ser Val Pro Ile Phe Val Xaa Pro Lys Pro Glu Ser 35 40 45 Val Ile Gly Pro Leu Xaa Glu Val Xaa Xaa Xaa Gly Glu 50 55 60 6449PRTArtificial sequenceprotein patternVariant(1)..(1)Xaa in position 1 is Lys or GlnVariant(4)..(4)Xaa in position 4 is Ala or AspVariant(11)..(11)Xaa in position 11 is Asp or GluVariant(15)..(15)Xaa in position 15 is Phe or TyrVariant(19)..(19)Xaa in position 19 is any amino acidVariant(21)..(21)Xaa in position 21 is any amino acidVariant(23)..(23)Xaa in position 23 is Lys or ArgVariant(28)..(28)Xaa in position 28 is Ile, Leu or ValVariant(31)..(31)Xaa in position 31 is any amino acidVariant(36)..(36)Xaa in position 36 is Ile or LeuVariant(49)..(49)Xaa in position 49 is Leu or Met 64Xaa Ser Asp Xaa Leu Tyr Gln Tyr Ile Leu Xaa Thr Ser Val Xaa Pro 1 5 10 15 Arg Glu Xaa Glu Xaa Met Xaa Glu Leu Arg Glu Xaa Thr Ala Xaa His 20 25 30 Pro Trp Asn Xaa Met Thr Thr Ser Ala Asp Glu Gly Gln Phe Leu Asn 
35 40 45 Xaa 6560PRTArtificial sequenceprotein pattern 65Gly Asn Leu Glu Asn Asn Leu Lys Cys Ser Gly Glu Ile Ser Tyr Asn 1 5 10 15 Gly His Arg Leu Asp Glu Phe Val Pro Gln Lys Thr Ser Ala Tyr Ile 20 25 30 Ser Gln Tyr Asp Leu His Ile Ala Glu Met Thr Val Arg Glu Thr Val 35 40 45 Asp Phe Ser Ala Arg Cys Gln Gly Val Gly Ser Arg 50 55 60 6661PRTArtificial sequenceprotein patternVariant(21)..(21)Xaa in position 21 is Ile or ValVariant(45)..(45)Xaa in position 45 is any or no amino acidVariant(47)..(47)Xaa in position 47 is any or no amino acid 66Ala Gly Arg Lys Thr Ser Gly Tyr Ile Glu Gly Asp Ile Arg Ile Ser 1 5 10 15 Gly Phe Pro Lys Xaa Gln Glu Thr Phe Ala Arg Val Ser Gly Tyr Cys 20 25 30 Glu Gln Thr Asp Ile His Ser Pro Asn Ile Thr Val Xaa Glu Xaa Ser 35 40 45 Val Ile Tyr Ser Ala Trp Leu Arg Leu Ala Pro Glu Ile 50 55 60 6760PRTArtificial sequenceprotein patternVariant(6)..(6)Xaa in position 6 is any amino acid 67Asp Ile Cys Ala Glu Xaa Leu Ile Gly Asp Val Met Arg Arg Gly Ile 1 5 10 15 Ser Gly Gly Gln Lys Lys Arg Leu Thr Thr Ala Glu Met Ile Val Gly 20 25 30 Pro Thr Lys Ala Leu Phe Met Asp Glu Ile Thr Asn Gly Leu Asp Ser 35 40 45 Ser Thr Ala Phe Gln Ile Val Lys Ser Leu Gln Gln 50 55 60 6860PRTArtificial sequenceprotein patternVariant(2)..(2)Xaa in position 2 is any amino acid 68Gly Xaa Ser Gly Leu Ser Thr Glu Gln Arg Lys Arg Leu Thr Ile Ala 1 5 10 15 Val Glu Leu Val Ala Asn Pro Ser Ile Ile Phe Met Asp Glu Pro Thr 20 25 30 Thr Gly Leu Asp Ala Arg Ala Ala Ala Ile Val Met Arg Ala Val Lys 35 40 45 Asn Val Ala Asp Thr Gly Arg Thr Ile Val Cys Thr 50 55 60 6959PRTArtificial sequenceprotein patternVariant(3)..(3)Xaa in position 3 is Asp or GlyVariant(4)..(4)Xaa in position 4 is any amino acidVariant(6)..(6)Xaa in position 6 is Ile or LeuVariant(22)..(22)Xaa in position 22 is Ile or ValVariant(39)..(39)Xaa in position 39 is any amino acidVariant(40)..(40)Xaa in position 40 is Phe or Tyr 69Val Met Xaa Xaa Gly Xaa Val Cys Gly Trp Ala Pro Gln Val Glu Ile 1 5 10 15 Leu Ala His Lys Ala 
Xaa Gly Gly Phe Val Ser His Cys Gly Trp Asn 20 25 30 Ser Ile Leu Glu Ser Leu Xaa Xaa Gly Val Pro Ile Ala Thr Trp Pro 35
40 45 Met Tyr Ala Glu Gln Gln Leu Asn Ala Phe Thr 50 55 7060PRTArtificial sequenceprotein patternVariant(11)..(11)Xaa in position 11 is Asp or GlyVariant(12)..(12)Xaa in position 12 is Pro or SerVariant(14)..(14)Xaa in position 14 is His or ArgVariant(27)..(27)Xaa in position 27 is Leu or MetVariant(47)..(47)Xaa in position 47 is Gly or SerVariant(56)..(56)Xaa in position 56 is Leu or ProVariant(59)..(59)Xaa in position 59 is any amino acidVariant(60)..(60)Xaa in position 60 is Arg or Ser 70Ser Thr Leu Leu Ser Ser Arg Asp Glu Ser Xaa Xaa Val Xaa Val Ala 1 5 10 15 Gly Leu Val Leu Asp Phe Phe Cys Val Pro Xaa Ile Asp Val Gly Asn 20 25 30 Glu Phe Asn Leu Pro Ser Tyr Ile Phe Leu Thr Cys Ser Ala Xaa Phe 35 40 45 Leu Gly Met Met Lys Tyr Leu Xaa Glu Arg Xaa Xaa 50 55 60 7110304DNAArtificial Sequencevector 71gaagattagc ctcttcaatt tcagaaagaa tgctgaccca cagatggtta gagaggccta 60cgcggcaggt ctcatcaaga cgatctaccc gagtaataat ctccaggaga tcaaatacct 120tcccaagaag gttaaagatg cagtcaaaag attcaggact aactgcatca agaacacaga 180gaaagatata tttctcaaga tcagaagtac tattccagta tggacgattc aaggcttgct 240tcataaacca aggcaagtaa tagagattgg agtctctaag aaagtagttc ctactgaatc 300aaaggccatg gagtcaaaaa ttcagatcga ggatctaaca gaactcgccg tgaagactgg 360cgaacagttc atacagagtc ttttacgact caatgacaag aagaaaatct tcgtcaacat 420ggtggagcac gacactctcg tctactccaa gaatatcaaa gatacagtct cagaagacca 480aagggctatt gagacttttc aacaaagggt aatatcggga aacctcctcg gattccattg 540cccagctatc tgtcacttca tcaaaaggac agtagaaaag gaaggtggca cctacaaatg 600ccatcattgc gataaaggaa aggctatcgt tcaagatgcc tctgccgaca gtggtcccaa 660agatggaccc ccacccacga ggagcatcgt ggaaaaagaa gacgttccaa ccacgtcttc 720aaagcaagtg gattgatgtg atatctccac tgacgtaagg gatgacgcac aatcccacta 780tccttcgcaa gacccttcct ctatataagg aagttcattt catttggaga ggactccggt 840atttttacaa caataccaca acaaaacaaa caacaaacaa cattacaatt tactattcta 900gtcgacctgc aggcggccgc actagtgata tcacaagttt gtacaaaaaa gcaggctcat 960atttttacaa caattaccaa caacaacaaa caacaaacaa cattacaatt actatttaca 1020attacaatta 
ccatggacta caaggacgac gatgacaaga caactttgta tacaaaagtt 1080gcaatggctc caacactctt gacaacccaa ttctcaaatc cagctgaagt aaccgacttt 1140gtagtctaca aaggaaatgg tgttaagggt ttatcagaaa caggaatcaa agctcttcca 1200gaacaataca ttcagccact tgaagaacga ctcatcaaca aattcgtcaa cgaaacagat 1260gaagccattc cagttatcga tatgtcgaac cctgatgagg acagagtcgc tgaagctgtt 1320tgtgatgctg ctgagaaatg ggggttcttt caagtgatca atcatggagt tcctttggaa 1380gttcttgatg acgtcaaggc tgcgactcac aagttcttca atctccctgt tgaagagaag 1440cgcaagttca ctaaagagaa ttcgctgtcg acgactgtta ggtttgggac gagttttagt 1500cctcttgcag agcaagcgct tgagtggaaa gattatctca gcctcttctt tgtctctgaa 1560gctgaagctg aacagttctg gcctgatatc tgcaggaatg aaacgttaga gtacattaac 1620aagtcaaaga agatggtgag gaggcttcta gagtatttgg gaaagaatct caatgttaaa 1680gagcttgacg agacgaaaga atcactcttt atgggctcga ttcgagtcaa ccttaactac 1740taccccatct gccctaatcc ggacctaaca gttggtgttg gtcgccactc agacgtctct 1800tctctcacca ttctcttaca agaccagatc ggtggtctac acgtgcgttc tctggcttca 1860gggaactggg ttcacgtgcc tccggttgct ggatcttttg tgatcaacat cggagatgcg 1920atgcagatca tgagcaatgg tctgtacaag agcgtggagc atcgtgtctt agccaatggt 1980tacaataata gaatctctgt tcctatcttt gtgaacccaa aaccagagtc agttattggt 2040cctctacctg aggtgattgc aaacggagag gaaccgattt acagagacgt cctgtactct 2100gattacgtca agtatttctt caggaaggca cacgatggaa agaaaaccgt cgattacgcc 2160aagatctgat acccagcttt cttgtacaaa gtggtgatat cccgcggcca tgctagagtc 2220cgcaaaaatc accagtctct ctctacaaat ctatctctct ctatttttct ccagaataat 2280gtgtgagtag ttcccagata agggaattag ggttcttata gggtttcgct catgtgttga 2340gcatataaga aacccttagt atgtatttgt atttgtaaaa tacttctatc aataaaattt 2400ctaattccta aaaccaaaat ccagtgacct gcaggcatgc gacgtcgggc ccaagcttag 2460cttgagcttg gatcagattg tcgtttcccg ccttcagttt aaactatcag tgtttgacag 2520gatatattgg cgggtaaacc taagagaaaa gagcgtttat tagaataacg gatatttaaa 2580agggcgtgaa aaggtttatc cgttcgtcca tttgtatgtg catgccaacc acagggttcc 2640cctcgggatc aaagtacttt gatccaaccc ctccgctgct atagtgcagt cggcttctga 2700cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 
agtcctaagt tacgcgacag 2760gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt gttttagtcg cataaagtag 2820aatacttgcg actagaaccg gagacattac gccatgaaca agagcgccgc cgctggcctg 2880ctgggctatg cccgcgtcag caccgacgac caggacttga ccaaccaacg ggccgaactg 2940cacgcggccg gctgcaccaa gctgttttcc gagaagatca ccggcaccag gcgcgaccgc 3000ccggagctgg ccaggatgct tgaccaccta cgccctggcg acgttgtgac agtgaccagg 3060ctagaccgcc tggcccgcag cacccgcgac ctactggaca ttgccgagcg catccaggag 3120gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg acaccaccac gccggccggc 3180cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg agcgttccct aatcatcgac 3240cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg tgaagtttgg cccccgccct 3300accctcaccc cggcacagat cgcgcacgcc cgcgagctga tcgaccagga aggccgcacc 3360gtgaaagagg cggctgcact gcttggcgtg catcgctcga ccctgtaccg cgcacttgag 3420cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg gtgccttccg tgaggacgca 3480ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac gccaagagga acaagcatga 3540aaccgcacca ggacggccag gacgaaccgt ttttcattac cgaagagatc gaggcggaga 3600tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt ctcaaccgtg cggctgcatg 3660aaatcctggc cggtttgtct gatgccaagc tggcggcctg gccggccagc ttggccgctg 3720aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt tgagtaaaac agcttgcgtc 3780atgcggtcgc tgcgtatatg atgcgatgag taaataaaca aatacgcaag gggaacgcat 3840gaaggttatc gctgtactta accagaaagg cgggtcaggc aagacgacca tcgcaaccca 3900tctagcccgc gccctgcaac tcgccggggc cgatgttctg ttagtcgatt ccgatcccca 3960gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa ccgctaaccg ttgtcggcat 4020cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc cggcgcgact tcgtagtgat 4080cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg atcaaggcag ccgacttcgt 4140gctgattccg gtgcagccaa gcccttacga catatgggcc accgccgacc tggtggagct 4200ggttaagcag cgcattgagg tcacggatgg aaggctacaa gcggcctttg tcgtgtcgcg 4260ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag gcgctggccg ggtacgagct 4320gcccattctt gagtcccgta tcacgcagcg cgtgagctac ccaggcactg ccgccgccgg 4380cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc cgcgaggtcc aggcgctggc 4440cgctgaaatt 
aaatcaaaac tcatttgagt taatgaggta aagagaaaat gagcaaaagc 4500acaaacacgc taagtgccgg ccgtccgagc gcacgcagca gcaaggctgc aacgttggcc 4560agcctggcag acacgccagc catgaagcgg gtcaactttc agttgccggc ggaggatcac 4620accaagctga agatgtacgc ggtacgccaa ggcaagacca ttaccgagct gctatctgaa 4680tacatcgcgc agctaccaga gtaaatgagc aaatgaataa atgagtagat gaattttagc 4740ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc accgacgccg tggaatgccc 4800catgtgtgga ggaacgggcg gttggccagg cgtaagcggc tgggttgtct gccggccctg 4860caatggcact ggaaccccca agcccgagga atcggcgtga cggtcgcaaa ccatccggcc 4920cggtacaaat cggcgcggcg ctgggtgatg acctggtgga gaagttgaag gccgcgcagg 4980ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg tgaatcgtgg caagcggccg 5040ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc cggtgcgccg tcgattagga 5100agccgcccaa gggcgacgag caaccagatt ttttcgttcc gatgctctat gacgtgggca 5160cccgcgatag tcgcagcatc atggacgtgg ccgttttccg tctgtcgaag cgtgaccgac 5220gagctggcga ggtgatccgc tacgagcttc cagacgggca cgtagaggtt tccgcagggc 5280cggccggcat ggccagtgtg tgggattacg acctggtact gatggcggtt tcccatctaa 5340ccgaatccat gaaccgatac cgggaaggga agggagacaa gcccggccgc gtgttccgtc 5400cacacgttgc ggacgtactc aagttctgcc ggcgagccga tggcggaaag cagaaagacg 5460acctggtaga aacctgcatt cggttaaaca ccacgcacgt tgccatgcag cgtacgaaga 5520aggccaagaa cggccgcctg gtgacggtat ccgagggtga agccttgatt agccgctaca 5580agatcgtaaa gagcgaaacc gggcggccgg agtacatcga gatcgagcta gctgattgga 5640tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct gacggttcac cccgattact 5700ttttgatcga tcccggcatc ggccgttttc tctaccgcct ggcacgccgc gccgcaggca 5760aggcagaagc cagatggttg ttcaagacga tctacgaacg cagtggcagc gccggagagt 5820tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc aaatgacctg ccggagtacg 5880atttgaagga ggaggcgggg caggctggcc cgatcctagt catgcgctac cgcaacctga 5940tcgagggcga agcatccgcc ggttcctaat gtacggagca gatgctaggg caaattgccc 6000tagcagggga aaaaggtcga aaaggtctct ttcctgtgga tagcacgtac attgggaacc 6060caaagccgta cattgggaac cggaacccgt acattgggaa cccaaagccg tacattggga 6120accggtcaca catgtaagtg actgatataa aagagaaaaa 
aggcgatttt tccgcctaaa 6180actctttaaa acttattaaa actcttaaaa cccgcctggc ctgtgcataa ctgtctggcc 6240agcgcacagc cgaagagctg caaaaagcgc ctacccttcg gtcgctgcgc tccctacgcc 6300ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc aaaaatggct ggcctacggc 6360caggcaatct accagggcgc ggacaagccg cgccgtcgcc actcgaccgc cggcgcccac 6420atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg aaaacctctg acacatgcag 6480ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg ggagcagaca agcccgtcag 6540ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca tgacccagtc acgtagcgat 6600agcggagtgt atactggctt aactatgcgg catcagagca gattgtactg agagtgcacc 6660atatgcggtg tgaaataccg cacagatgcg taaggagaaa ataccgcatc aggcgctctt 6720ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 6780ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 6840tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 6900tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 6960gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 7020ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 7080tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 7140agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 7200atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 7260acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 7320actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag ccagttacct 7380tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 7440tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 7500tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 7560tgcatgatat atctcccaat ttgtgtaggg cttattatgc acgcttaaaa ataataaaag 7620cagacttgac ctgatagttt ggctgtgagc aattatgtgc ttagtgcatc taatcgcttg 7680agttaacgcc ggcgaagcgg cgtcggcttg aacgaatttc tagctagaca ttatttgccg 7740actaccttgg tgatctcgcc tttcacgtag tggacaaatt cttccaactg atctgcgcgc 7800gaggccaagc gatcttcttc ttgtccaaga taagcctgtc tagcttcaag tatgacgggc 7860tgatactggg 
ccggcaggcg ctccattgcc cagtcggcag cgacatcctt cggcgcgatt 7920ttgccggtta ctgcgctgta ccaaatgcgg gacaacgtaa gcactacatt tcgctcatcg 7980ccagcccagt cgggcggcga gttccatagc gttaaggttt catttagcgc ctcaaataga 8040tcctgttcag gaaccggatc aaagagttcc tccgccgctg gacctaccaa ggcaacgcta 8100tgttctcttg cttttgtcag caagatagcc agatcaatgt cgatcgtggc tggctcgaag 8160atacctgcaa gaatgtcatt gcgctgccat tctccaaatt gcagttcgcg cttagctgga 8220taacgccacg gaatgatgtc gtcgtgcaca acaatggtga cttctacagc gcggagaatc 8280tcgctctctc caggggaagc cgaagtttcc aaaaggtcgt tgatcaaagc tcgccgcgtt 8340gtttcatcaa gccttacggt caccgtaacc agcaaatcaa tatcactgtg tggcttcagg 8400ccgccatcca ctgcggagcc gtacaaatgt acggccagca acgtcggttc gagatggcgc 8460tcgatgacgc caactacctc tgatagttga gtcgatactt cggcgatcac cgcttccccc 8520atgatgttta actttgtttt agggcgactg ccctgctgcg taacatcgtt gctgctccat 8580aacatcaaac atcgacccac ggcgtaacgc gcttgctgct tggatgcccg aggcatagac 8640tgtaccccaa aaaaacatgt cataacaaga agccatgaaa accgccactg cgccgttacc 8700accgctgcgt tcggtcaagg ttctggacca gttgcgtgac ggcagttacg ctacttgcat 8760tacagcttac gaaccgaacg aggcttatgt ccactgggtt cgtgcccgaa ttgatcacag 8820gcagcaacgc tctgtcatcg ttacaatcaa catgctaccc tccgcgagat catccgtgtt 8880tcaaacccgg cagcttagtt gccgttcttc cgaatagcat cggtaacatg agcaaagtct 8940gccgccttac aacggctctc ccgctgacgc cgtcccggac tgatgggctg cctgtatcga 9000gtggtgattt tgtgccgagc tgccggtcgg ggagctgttg gctggctggt ggcaggatat 9060attgtggtgt aaacaaattg acgcttagac aacttaataa cacattgcgg acgtttttaa 9120tgtactgaat taacgccgaa ttgaattatc agcttgcatg ccggtcgatc tagtaacata 9180tagatgacac cgcgcgcgat aatttatcct agtttgcgcg ctatattttg ttttctatcg 9240cgtattaaat gtataattgc gggactctaa tcataaaaac ccatctcata aataacgtca 9300tgcattacat gttaattatt acatgcttaa cgtaattcaa cagaaattat atgataatca 9360tcgcaagacc ggcaacagga ttcaatctta agaaacttta ttgccaaatg tttgaacgat 9420ctgcttgact ctaggggtca tcagatttcg gtgacgggca ggaccggacg gggcggcacc 9480ggcaggctga agtccagctg ccagaaaccc acgtcatgcc agttcccgtg cttgaagccg 9540gccgcccgca gcatgccgcg gggggcatat ccgagcgcct 
cgtgcatgcg cacgctcggg 9600tcgttgggca gcccgatgac agcgaccacg ctcttgaagc cctgtgcctc cagggacttc 9660agcaggtggg tgtagagcgt ggagcccagt cccgtccgct ggtggcgggg ggagacgtac 9720acggtcgact cggccgtcca gtcgtaggcg ttgcgtgcct tccagggacc cgcgtaggcg 9780atgccggcga cctcgccgtc cacctcggcg acgagccagg gatagcgctc ccgcagacgg 9840acgaggtcgt ccgtccactc ctgcggttcc tgcggctcgg tacggaagtt gaccgtgctt 9900gtctcgatgt agtggttgac gatggtgcag accgccggca tgtccgcctc ggtggcacgg 9960cggatgtcgg ccgggcgtcg ttctgggctc atggtagatc ccctcgatcg agttgagagt 10020gaatatgaga ctctaattgg ataccgaggg gaatttatgg aacgtcagtg gagcattttt 10080gacaagaaat atttgctagc tgatagtgac cttaggcgac ttttgaacgc gcaataatgg 10140tttctgacgt atgtgcttag ctcattaaac tccagaaacc cgcggctcag tggctccttc 10200aacgttgcgg ttctgtcagt tccaaacgta aaacggcttg tcccgcgtca tcggcggggg 10260tcataacgtg actcccttaa ttctcatgta tgataattcg agct 103047210317DNAArtificial Sequencevector 72ctcccatatg gtcgactaga gccaagctga tctcctttgc cccggagatc accatggacg 60actttctcta tctctacgat ctaggaagaa agttcgacgg agaaggtgac gataccatgt 120tcaccaccga taatgagaag attagcctct tcaatttcag aaagaatgct gacccacaga 180tggttagaga ggcctacgcg gcaggtctca tcaagacgat ctacccgagt aataatctcc 240aggagatcaa ataccttccc aagaaggtta aagatgcagt caaaagattc aggactaact 300gcatcaagaa cacagagaaa gatatatttc tcaagatcag aagtactatt ccagtatgga 360cgattcaagg cttgcttcat aaaccaaggc aagtaataga gattggagtc tctaagaaag 420tagttcctac tgaatcaaag gccatggagt caaaaattca gatcgaggat ctaacagaac 480tcgccgtgaa gactggcgaa cagttcatac agagtctttt acgactcaat gacaagaaga 540aaatcttcgt caacatggtg gagcacgaca ctctcgtcta ctccaagaat atcaaagata 600cagtctcaga agaccaaagg gctattgaga cttttcaaca aagggtaata tcgggaaacc 660tcctcggatt ccattgccca gctatctgtc acttcatcaa aaggacagta gaaaaggaag 720gtggcaccta caaatgccat cattgcgata aaggaaaggc tatcgttcaa gatgcctctg 780ccgacagtgg tcccaaagat ggacccccac ccacgaggag catcgtggaa aaagaagacg 840ttccaaccac gtcttcaaag caagtggatt gatgtgatat ctccactgac gtaagggatg 900acgcacaatc ccactatcct tcgcaagacc cttcctctat ataaggaagt tcatttcatt 
960tggagaggac tccggtattt ttacaacaat accacaacaa aacaaacaac aaacaacatt 1020acaatttact attctagtcg acctgcaggc ggccgcacta gtgatatcac aagtttgtac 1080aaaaaagcag gcttaatggc tccaacactc ttgacaaccc aattctcaaa tccagctgaa 1140gtaaccgact ttgtagtcta caaaggaaat ggtgttaagg gtttatcaga aacaggaatc 1200aaagctcttc cagaacaata cattcagcca cttgaagaac gactcatcaa caaattcgtc 1260aacgaaacag atgaagccat tccagttatc gatatgtcga accctgatga ggacagagtc 1320gctgaagctg tttgtgatgc tgctgagaaa tgggggttct ttcaagtgat caatcatgga 1380gttcctttgg aagttcttga tgacgtcaag gctgcgactc acaagttctt caatctccct 1440gttgaagaga agcgcaagtt cactaaagag aattcgctgt cgacgactgt taggtttggg 1500acgagtttta gtcctcttgc agagcaagcg cttgagtgga aagattatct cagcctcttc 1560tttgtctctg aagctgaagc tgaacagttc tggcctgata tctgcaggaa tgaaacgtta 1620gagtacatta acaagtcaaa gaagatggtg aggaggcttc tagagtattt gggaaagaat 1680ctcaatgtta aagagcttga cgagacgaaa gaatcactct ttatgggctc gattcgagtc 1740aaccttaact actaccccat ctgccctaat ccggacctaa cagttggtgt tggtcgccac 1800tcagacgtct cttctctcac cattctctta caagaccaga tcggtggtct acacgtgcgt 1860tctctggctt cagggaactg ggttcacgtg cctccggttg ctggatcttt tgtgatcaac 1920atcggagatg cgatgcagat catgagcaat ggtctgtaca agagcgtgga gcatcgtgtc 1980ttagccaatg gttacaataa tagaatctct gttcctatct ttgtgaaccc aaaaccagag 2040tcagttattg gtcctctacc tgaggtgatt gcaaacggag aggaaccgat ttacagagac 2100gtcctgtact ctgattacgt caagtatttc ttcaggaagg cacacgatgg aaagaaaacc 2160gtcgattacg ccaagatctg atacccagct ttcttgtaca aagtggtgat atcccgcggc 2220catgctagag tccgcaaaaa tcaccagtct ctctctacaa atctatctct ctctattttt 2280ctccagaata atgtgtgagt agttcccaga taagggaatt agggttctta tagggtttcg 2340ctcatgtgtt gagcatataa gaaaccctta gtatgtattt gtatttgtaa aatacttcta 2400tcaataaaat ttctaattcc taaaaccaaa atccagtgac ctgcaggcat gcgacgtcgg 2460gcccaagctt agcttgagct tggatcagat tgtcgtttcc cgccttcagt ttaaactatc 2520agtgtttgac aggatatatt ggcgggtaaa cctaagagaa aagagcgttt attagaataa 2580cggatattta aaagggcgtg aaaaggttta tccgttcgtc catttgtatg tgcatgccaa 2640ccacagggtt cccctcggga tcaaagtact 
ttgatccaac ccctccgctg ctatagtgca 2700gtcggcttct gacgttcagt gcagccgtct tctgaaaacg acatgtcgca caagtcctaa 2760gttacgcgac aggctgccgc cctgcccttt tcctggcgtt ttcttgtcgc gtgttttagt 2820cgcataaagt agaatacttg cgactagaac cggagacatt acgccatgaa caagagcgcc 2880gccgctggcc tgctgggcta tgcccgcgtc agcaccgacg accaggactt gaccaaccaa 2940cgggccgaac tgcacgcggc cggctgcacc aagctgtttt ccgagaagat caccggcacc 3000aggcgcgacc gcccggagct ggccaggatg cttgaccacc tacgccctgg cgacgttgtg 3060acagtgacca ggctagaccg cctggcccgc agcacccgcg acctactgga cattgccgag 3120cgcatccagg aggccggcgc gggcctgcgt agcctggcag agccgtgggc cgacaccacc 3180acgccggccg gccgcatggt gttgaccgtg ttcgccggca ttgccgagtt cgagcgttcc 3240ctaatcatcg accgcacccg gagcgggcgc gaggccgcca aggcccgagg cgtgaagttt 3300ggcccccgcc ctaccctcac cccggcacag atcgcgcacg cccgcgagct gatcgaccag 3360gaaggccgca ccgtgaaaga ggcggctgca ctgcttggcg tgcatcgctc gaccctgtac 3420cgcgcacttg agcgcagcga ggaagtgacg cccaccgagg ccaggcggcg cggtgccttc 3480cgtgaggacg cattgaccga ggccgacgcc ctggcggccg ccgagaatga acgccaagag 3540gaacaagcat gaaaccgcac caggacggcc aggacgaacc gtttttcatt accgaagaga 3600tcgaggcgga gatgatcgcg gccgggtacg tgttcgagcc gcccgcgcac gtctcaaccg 3660tgcggctgca tgaaatcctg gccggtttgt ctgatgccaa gctggcggcc tggccggcca 3720gcttggccgc tgaagaaacc gagcgccgcc gtctaaaaag gtgatgtgta tttgagtaaa 3780acagcttgcg tcatgcggtc
gctgcgtata tgatgcgatg agtaaataaa caaatacgca 3840aggggaacgc atgaaggtta tcgctgtact taaccagaaa ggcgggtcag gcaagacgac 3900catcgcaacc catctagccc gcgccctgca actcgccggg gccgatgttc tgttagtcga 3960ttccgatccc cagggcagtg cccgcgattg ggcggccgtg cgggaagatc aaccgctaac 4020cgttgtcggc atcgaccgcc cgacgattga ccgcgacgtg aaggccatcg gccggcgcga 4080cttcgtagtg atcgacggag cgccccaggc ggcggacttg gctgtgtccg cgatcaaggc 4140agccgacttc gtgctgattc cggtgcagcc aagcccttac gacatatggg ccaccgccga 4200cctggtggag ctggttaagc agcgcattga ggtcacggat ggaaggctac aagcggcctt 4260tgtcgtgtcg cgggcgatca aaggcacgcg catcggcggt gaggttgccg aggcgctggc 4320cgggtacgag ctgcccattc ttgagtcccg tatcacgcag cgcgtgagct acccaggcac 4380tgccgccgcc ggcacaaccg ttcttgaatc agaacccgag ggcgacgctg cccgcgaggt 4440ccaggcgctg gccgctgaaa ttaaatcaaa actcatttga gttaatgagg taaagagaaa 4500atgagcaaaa gcacaaacac gctaagtgcc ggccgtccga gcgcacgcag cagcaaggct 4560gcaacgttgg ccagcctggc agacacgcca gccatgaagc gggtcaactt tcagttgccg 4620gcggaggatc acaccaagct gaagatgtac gcggtacgcc aaggcaagac cattaccgag 4680ctgctatctg aatacatcgc gcagctacca gagtaaatga gcaaatgaat aaatgagtag 4740atgaatttta gcggctaaag gaggcggcat ggaaaatcaa gaacaaccag gcaccgacgc 4800cgtggaatgc cccatgtgtg gaggaacggg cggttggcca ggcgtaagcg gctgggttgt 4860ctgccggccc tgcaatggca ctggaacccc caagcccgag gaatcggcgt gacggtcgca 4920aaccatccgg cccggtacaa atcggcgcgg cgctgggtga tgacctggtg gagaagttga 4980aggccgcgca ggccgcccag cggcaacgca tcgaggcaga agcacgcccc ggtgaatcgt 5040ggcaagcggc cgctgatcga atccgcaaag aatcccggca accgccggca gccggtgcgc 5100cgtcgattag gaagccgccc aagggcgacg agcaaccaga ttttttcgtt ccgatgctct 5160atgacgtggg cacccgcgat agtcgcagca tcatggacgt ggccgttttc cgtctgtcga 5220agcgtgaccg acgagctggc gaggtgatcc gctacgagct tccagacggg cacgtagagg 5280tttccgcagg gccggccggc atggccagtg tgtgggatta cgacctggta ctgatggcgg 5340tttcccatct aaccgaatcc atgaaccgat accgggaagg gaagggagac aagcccggcc 5400gcgtgttccg tccacacgtt gcggacgtac tcaagttctg ccggcgagcc gatggcggaa 5460agcagaaaga cgacctggta gaaacctgca ttcggttaaa caccacgcac 
gttgccatgc 5520agcgtacgaa gaaggccaag aacggccgcc tggtgacggt atccgagggt gaagccttga 5580ttagccgcta caagatcgta aagagcgaaa ccgggcggcc ggagtacatc gagatcgagc 5640tagctgattg gatgtaccgc gagatcacag aaggcaagaa cccggacgtg ctgacggttc 5700accccgatta ctttttgatc gatcccggca tcggccgttt tctctaccgc ctggcacgcc 5760gcgccgcagg caaggcagaa gccagatggt tgttcaagac gatctacgaa cgcagtggca 5820gcgccggaga gttcaagaag ttctgtttca ccgtgcgcaa gctgatcggg tcaaatgacc 5880tgccggagta cgatttgaag gaggaggcgg ggcaggctgg cccgatccta gtcatgcgct 5940accgcaacct gatcgagggc gaagcatccg ccggttccta atgtacggag cagatgctag 6000ggcaaattgc cctagcaggg gaaaaaggtc gaaaaggtct ctttcctgtg gatagcacgt 6060acattgggaa cccaaagccg tacattggga accggaaccc gtacattggg aacccaaagc 6120cgtacattgg gaaccggtca cacatgtaag tgactgatat aaaagagaaa aaaggcgatt 6180tttccgccta aaactcttta aaacttatta aaactcttaa aacccgcctg gcctgtgcat 6240aactgtctgg ccagcgcaca gccgaagagc tgcaaaaagc gcctaccctt cggtcgctgc 6300gctccctacg ccccgccgct tcgcgtcggc ctatcgcggc cgctggccgc tcaaaaatgg 6360ctggcctacg gccaggcaat ctaccagggc gcggacaagc cgcgccgtcg ccactcgacc 6420gccggcgccc acatcaaggc accctgcctc gcgcgtttcg gtgatgacgg tgaaaacctc 6480tgacacatgc agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga 6540caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc ggggcgcagc catgacccag 6600tcacgtagcg atagcggagt gtatactggc ttaactatgc ggcatcagag cagattgtac 6660tgagagtgca ccatatgcgg tgtgaaatac cgcacagatg cgtaaggaga aaataccgca 6720tcaggcgctc ttccgcttcc tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc 6780gagcggtatc agctcactca aaggcggtaa tacggttatc cacagaatca ggggataacg 6840caggaaagaa catgtgagca aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt 6900tgctggcgtt tttccatagg ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa 6960gtcagaggtg gcgaaacccg acaggactat aaagatacca ggcgtttccc cctggaagct 7020ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc 7080cttcgggaag cgtggcgctt tctcatagct cacgctgtag gtatctcagt tcggtgtagg 7140tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct 7200tatccggtaa ctatcgtctt 
gagtccaacc cggtaagaca cgacttatcg ccactggcag 7260cagccactgg taacaggatt agcagagcga ggtatgtagg cggtgctaca gagttcttga 7320agtggtggcc taactacggc tacactagaa ggacagtatt tggtatctgc gctctgctga 7380agccagttac cttcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg 7440gtagcggtgg tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag 7500aagatccttt gatcttttct acggggtctg acgctcagtg gaacgaaaac tcacgttaag 7560ggattttggt catgcatgat atatctccca atttgtgtag ggcttattat gcacgcttaa 7620aaataataaa agcagacttg acctgatagt ttggctgtga gcaattatgt gcttagtgca 7680tctaatcgct tgagttaacg ccggcgaagc ggcgtcggct tgaacgaatt tctagctaga 7740cattatttgc cgactacctt ggtgatctcg cctttcacgt agtggacaaa ttcttccaac 7800tgatctgcgc gcgaggccaa gcgatcttct tcttgtccaa gataagcctg tctagcttca 7860agtatgacgg gctgatactg ggccggcagg cgctccattg cccagtcggc agcgacatcc 7920ttcggcgcga ttttgccggt tactgcgctg taccaaatgc gggacaacgt aagcactaca 7980tttcgctcat cgccagccca gtcgggcggc gagttccata gcgttaaggt ttcatttagc 8040gcctcaaata gatcctgttc aggaaccgga tcaaagagtt cctccgccgc tggacctacc 8100aaggcaacgc tatgttctct tgcttttgtc agcaagatag ccagatcaat gtcgatcgtg 8160gctggctcga agatacctgc aagaatgtca ttgcgctgcc attctccaaa ttgcagttcg 8220cgcttagctg gataacgcca cggaatgatg tcgtcgtgca caacaatggt gacttctaca 8280gcgcggagaa tctcgctctc tccaggggaa gccgaagttt ccaaaaggtc gttgatcaaa 8340gctcgccgcg ttgtttcatc aagccttacg gtcaccgtaa ccagcaaatc aatatcactg 8400tgtggcttca ggccgccatc cactgcggag ccgtacaaat gtacggccag caacgtcggt 8460tcgagatggc gctcgatgac gccaactacc tctgatagtt gagtcgatac ttcggcgatc 8520accgcttccc ccatgatgtt taactttgtt ttagggcgac tgccctgctg cgtaacatcg 8580ttgctgctcc ataacatcaa acatcgaccc acggcgtaac gcgcttgctg cttggatgcc 8640cgaggcatag actgtacccc aaaaaaacat gtcataacaa gaagccatga aaaccgccac 8700tgcgccgtta ccaccgctgc gttcggtcaa ggttctggac cagttgcgtg acggcagtta 8760cgctacttgc attacagctt acgaaccgaa cgaggcttat gtccactggg ttcgtgcccg 8820aattgatcac aggcagcaac gctctgtcat cgttacaatc aacatgctac cctccgcgag 8880atcatccgtg tttcaaaccc ggcagcttag ttgccgttct tccgaatagc 
atcggtaaca 8940tgagcaaagt ctgccgcctt acaacggctc tcccgctgac gccgtcccgg actgatgggc 9000tgcctgtatc gagtggtgat tttgtgccga gctgccggtc ggggagctgt tggctggctg 9060gtggcaggat atattgtggt gtaaacaaat tgacgcttag acaacttaat aacacattgc 9120ggacgttttt aatgtactga attaacgccg aattgaatta tcagcttgca tgccggtcga 9180tctagtaaca tatagatgac accgcgcgcg ataatttatc ctagtttgcg cgctatattt 9240tgttttctat cgcgtattaa atgtataatt gcgggactct aatcataaaa acccatctca 9300taaataacgt catgcattac atgttaatta ttacatgctt aacgtaattc aacagaaatt 9360atatgataat catcgcaaga ccggcaacag gattcaatct taagaaactt tattgccaaa 9420tgtttgaacg atctgcttga ctctaggggt catcagattt cggtgacggg caggaccgga 9480cggggcggca ccggcaggct gaagtccagc tgccagaaac ccacgtcatg ccagttcccg 9540tgcttgaagc cggccgcccg cagcatgccg cggggggcat atccgagcgc ctcgtgcatg 9600cgcacgctcg ggtcgttggg cagcccgatg acagcgacca cgctcttgaa gccctgtgcc 9660tccagggact tcagcaggtg ggtgtagagc gtggagccca gtcccgtccg ctggtggcgg 9720ggggagacgt acacggtcga ctcggccgtc cagtcgtagg cgttgcgtgc cttccaggga 9780cccgcgtagg cgatgccggc gacctcgccg tccacctcgg cgacgagcca gggatagcgc 9840tcccgcagac ggacgaggtc gtccgtccac tcctgcggtt cctgcggctc ggtacggaag 9900ttgaccgtgc ttgtctcgat gtagtggttg acgatggtgc agaccgccgg catgtccgcc 9960tcggtggcac ggcggatgtc ggccgggcgt cgttctgggc tcatggtaga tcccctcgat 10020cgagttgaga gtgaatatga gactctaatt ggataccgag gggaatttat ggaacgtcag 10080tggagcattt ttgacaagaa atatttgcta gctgatagtg accttaggcg acttttgaac 10140gcgcaataat ggtttctgac gtatgtgctt agctcattaa actccagaaa cccgcggctc 10200agtggctcct tcaacgttgc ggttctgtca gttccaaacg taaaacggct tgtcccgcgt 10260catcggcggg ggtcataacg tgactccctt aattctcatg tatgataatt cgagctc 103177312323DNAArtificial Sequencevector 73aaaagttgcc atgattacgc caagcttggc cactaaggcc aatttaaatc tactaggccg 60gccaaagtag gcgcctacta ccggtaattc ccgggattag cggccgctag tctgtgcgca 120cttgtatcct gcaggtcaat cgtttaaaca ctgtacggac cgtggcctaa taggccggta 180cccaagtttg tacaaaaaag caggctccat gattacgcca agcttggcca ctaaggccaa 240tttaaatcta ctaggccggc caaagtaggc gcctactacc ggtaattccc 
gggattagcg 300gccgctagtc tgtgcgcact tgtatcctgc aggtcaatcg tttaaacact gtacggaccg 360tggcctaata ggccggtacc acccagcttt cttgtacaaa gtggccatga ttacgccaag 420cttggccact aaggccaatt taaatctact aggccggccc aggtaccaat tcgaatccaa 480aaattacgga tatgaatata ggcatatccg tatccgaatt atccgtttga cagctagcaa 540cgattgtaca attgcttctt taaaaaagga agaaagaaag aaagaaaaga atcaacatca 600gcgttaacaa acggccccgt tacggcccaa acggtcatat agagtaacgg cgttaagcgt 660tgaaagactc ctatcgaaat acgtaaccgc aaacgtgtca tagtcagatc ccctcttcct 720tcaccgcctc aaacacaaaa ataatcttct acagcctata tatacaaccc ccccttctat 780ctctcctttc tcacaattca tcatctttct ttctctaccc ccaattttaa gaaatcctct 840cttctcctct tcattttcaa ggtaaatctc tctctctctc tctctctctg ttattccttg 900ttttaattag gtatgtatta ttgctagttt gttaatctgc ttatcttatg tatgccttat 960gtgaatatct ttatcttgtt catctcatcc gtttagaagc tataaatttg ttgatttgac 1020tgtgtatcta cacgtggtta tgtttatatc taatcagata tgaatttctt catattgttg 1080cgtttgtgtg taccaatccg aaatcgttga tttttttcat ttaatcgtgt agctaattgt 1140acgtatacat atggatctac gtatcaattg ttcatctgtt tgtgtttgta tgtatacaga 1200tctgaaaaca tcacttctct catctgattg tgttgttaca tacatagata tagatctgtt 1260atatcatttt ttttattaat tgtgtatata tatatgtgca tagatctgga ttacatgatt 1320gtgattattt acatgatttt gttatttacg tatgtatata tgtagatctg gactttttgg 1380agttgttgac ttgattgtat ttgtgtgtgt atatgtgtgt tctgatcttg atatgttatg 1440tatgtgcagt taattaacca tggctccaac actcttgaca acccaattct caaatccagc 1500tgaagtaacc gactttgtag tctacaaagg aaatggtgtt aagggtttat cagaaacagg 1560aatcaaagct cttccagaac aatacattca gccacttgaa gaacgactca tcaacaaatt 1620cgtcaacgaa acagatgaag ccattccagt tatcgatatg tcgaaccctg atgaggacag 1680agtcgctgaa gctgtttgtg atgctgctga gaaatggggg ttctttcaag tgatcaatca 1740tggagttcct ttggaagttc ttgatgacgt caaggctgcg actcacaagt tcttcaatct 1800ccctgttgaa gagaagcgca agttcactaa agagaattcg ctgtcgacga ctgttaggtt 1860tgggacgagt tttagtcctc ttgcagagca agcgcttgag tggaaagatt atctcagcct 1920cttctttgtc tctgaagctg aagctgaaca gttctggcct gatatctgca ggaatgaaac 1980gttagagtac attaacaagt caaagaagat 
ggtgaggagg cttctagagt atttgggaaa 2040gaatctcaat gttaaagagc ttgacgagac gaaagaatca ctctttatgg gctcgattcg 2100agtcaacctt aactactacc ccatctgccc taatccggac ctaacagttg gtgttggtcg 2160ccactcagac gtctcttctc tcaccattct cttacaagac cagatcggtg gtctacacgt 2220gcgttctctg gcttcaggga actgggttca cgtgcctccg gttgctggat cttttgtgat 2280caacatcgga gatgcgatgc agatcatgag caatggtctg tacaagagcg tggagcatcg 2340tgtcttagcc aatggttaca ataatagaat ctctgttcct atctttgtga acccaaaacc 2400agagtcagtt attggtcctc tacctgaggt gattgcaaac ggagaggaac cgatttacag 2460agacgtcctg tactctgatt acgtcaagta tttcttcagg aaggcacacg atggaaagaa 2520aaccgtcgat tacgccaaga tctgaggcgc gccctgcttt aatgagatat gcgagacgcc 2580tatgatcgca tgatatttgc tttcaattct gttgtgcacg ttgtaaaaaa cctgagcatg 2640tgtagctcag atccttaccg ccggtttcgg ttcattctaa tgaatatatc acccgttact 2700atcgtatttt tatgaataat attctccgtt caatttactg attgtggcgc ctactaccgg 2760taattcccgg gattagcggc cgctagtctg tgcgcacttg tatcctgcag gtcaatcgtt 2820taaacactgt acggaccgtg gcctaatagg ccggtaccca actttattat acatagttga 2880taattcactg gccggatgta ccgaattcgc ggccgcaagc ttggtacctt tctttacgag 2940gtaattgatc tcgcattata tatctacatt ttggttatgt tacttgacat atagtcattg 3000attcaatagt tctgttaatt cctttaaaga tcattttgac tagaccacat tcttggttca 3060ttcctcaata atttgtaatc atattggtgg atatagaagt agattggtta tagatcagat 3120agtggaagac tttaggatga atttcagcta gttttttttt ttggcttatt gtctcaaaag 3180attagtgctt tgctgtctcc attgcttctg ctatcgacac gcttctgtct ccttgtatct 3240ttattatatc tattcgtccc atgagttttg tttgttctgt attcgttcgc tctggtgtca 3300tggatggagt ctctgttcca tgtttctgta atgcatgttg ggttgtttca tgcaagaaat 3360gctgagataa acactcattt gtgaaagttt ctaaactctg aatcgcgcta caggcaatgc 3420tccgaggagt aggaggagaa gaacgaacca aacgacatta tcagcccttt gaggaagctc 3480ttagttttgt tattgttttt gtagccaaat tctccattct tattccattt tcacttatct 3540cttgttcctt atagacctta taagtttttt attcatgtat acaaattata ttgtcatcaa 3600gaagtatctt taaaatctaa atctcaaatc accaggacta tgtttttgtc caattcgtgg 3660aaccaacttg cagcttgtat ccattctctt aaccaataaa aaaagaaaga aagatcaatt 
3720tgataaattt ctcagccaca aattctacat ttaggtttta gcatatcgaa ggctcaatca 3780caaatacaat agatagacta gagattccag cgtcacgtga gttttatcta taaataaagg 3840accaaaaatc aaatcccgag ggcattttcg taatccaaca taaaaccctt aaacttcaag 3900tctcattttt aaacaaatca tgttcacaag tctcttcttc ttctctgttt ctctatctct 3960tgctcgggcc cttagatctc gtgccgtcgt gcgacgttgt tttccggtac gtttattcct 4020gttgattcct tctctgtctc tctcgattca ctgctacttc tgtttggatt cctttcgcgc 4080gatctctgga tccgtgcgtt attcattggc tcgtcgtttt cagatctgtt gcgtttcttc 4140tgttttctgt tatgagtgga tgcgttttct tgtgattcgc ttgtttgtaa tgctggatct 4200gtatctgcgt cgtgggaatt caaagtgata gtagttgata ttttttccag atcaggcatg 4260ttctcgtata atcaggtcta atggttgatg attctgcgga attatagatc taagatcttg 4320attgatttag atttgaggat atgaatgaga ttcgtaggtc cacaaaggtc ttgttatctc 4380tgctgctaga tagatgatta tccaattgcg tttcgtagtt atttttatgg attcaaggaa 4440ttgcgtgtaa ttgagagttt tactctgttt tgtgaacagg cttgatcaaa ctcgagatct 4500ttctcctgaa ccatggcggc ggcaacaaca acaacaacaa catcttcttc gatctccttc 4560tccaccaaac catctccttc ctcctccaaa tcaccattac caatctccag attctccctc 4620ccattctccc taaaccccaa caaatcatcc tcctcctccc gccgccgcgg tatcaaatcc 4680agctctccct cctccatctc cgccgtgctc aacacaacca ccaatgtcac aaccactccc 4740tctccaacca aacctaccaa acccgaaaca ttcatctccc gattcgctcc agatcaaccc 4800cgcaaaggcg ctgatatcct cgtcgaggct ttagaacgtc aaggcgtaga aaccgtattc 4860gcttaccctg gaggtacatc aatggagatt caccaagcct taacccgctc ttcctcaatc 4920cgtaacgtcc ttcctcgtca cgaacaagga ggtgtattcg cagcagaagg atacgctcga 4980tcctcaggta aaccaggtat ctgtatagcc acttcaggtc ccggagctac aaatctcgtt 5040agcggattag ccgatgcgtt gttagatagt gttcctcttg tagcaatcac aggacaagtc 5100cctcgtcgta tgattggtac agatgcgttt caagagactc cgattgttga ggtaacgcgt 5160tcgattacga agcataacta tcttgtgatg gatgttgaag atatcccaag gattattgaa 5220gaggctttct ttttagctac ttctggtaga cctggacctg ttttggttga tgttcctaaa 5280gatattcaac aacagcttgc gattcctaat tgggaacagg ctatgagatt acctggttat 5340atgtctagga tgcctaaacc tccggaagat tctcatttgg agcagattgt taggttgatt 5400tctgagtcta agaagcctgt gttgtatgtt 
ggtggtggtt gtcttaattc tagcgatgaa 5460ttgggtaggt ttgttgagct tacgggcatc cctgttgcga gtacgttgat ggggctggga 5520tcttatcctt gtgatgatga gttgtcgtta catatgcttg gaatgcatgg gactgtgtat 5580gcaaattacg ctgtggagca tagtgatttg ttgttggcgt ttggggtaag gtttgatgat 5640cgtgtcacgg gtaaacttga ggcttttgct agtagggcta agattgttca tattgatatt 5700gactcggctg agattgggaa gaataagact cctcatgtgt ctgtgtgtgg tgatgttaag 5760ctggctttgc aagggatgaa taaggttctt gagaaccgag cggaggagct taaacttgat 5820tttggagttt ggaggaatga gttgaacgta cagaaacaga agtttccgtt gagctttaag 5880acgtttgggg aagctattcc tccacagtat gcgattaagg tccttgatga gttgactgat 5940ggaaaagcca taataagtac tggtgtcggg caacatcaaa tgtgggcggc gcagttctac 6000aattacaaga aaccaaggca gtggctatca tcaggaggcc ttggagctat gggatttgga 6060cttcctgctg cgattggagc gtctgttgct aaccctgatg cgatagttgt ggatattgac 6120ggagatggaa gttttataat gaatgtgcaa gagctagcca ctattcgtgt agagaatctt 6180ccagtgaagg tacttttatt aaacaaccag catcttggca tggttatgca atgggaagat 6240cggttctaca aagctaaccg agctcacaca tttctcgggg acccggctca ggaggacgag 6300atattcccga acatgttgct gtttgcagca gcttgcggga ttccagcggc gagggtgaca 6360aagaaagcag atctccgaga agctattcag acaatgctgg atacaccagg accttacctg 6420ttggatgtga tttgtccgca ccaagaacat gtgttgccga tgatcccgaa tggtggcact 6480ttcaacgatg tcataacgga aggagatggc cggattaaat actgagagat gaaaccggtg 6540attatcagaa ccttttatgg tctttgtatg catatggtaa aaaaacttag tttgcaattt 6600cctgtttgtt ttggtaattt gagtttcttt tagttgttga tctgcctgct ttttggttta 6660cgtcagacta ctactgctgt tgttgtttgg tttcctttct ttcattttat aaataaataa 6720tccggttcgg tttactcctt gtgactggct cagtttggtt attgcgaaat gcgaatggta 6780aattgagtaa ttgaaattcg ttattagggt tctaagctgt tttaacagtc actgggttaa 6840tatctctcga atcttgcatg gaaaatgctc ttaccattgg tttttaattg aaatgtgctc 6900atatgggccg tggtttccaa attaaataaa actacgatgt catcgagaag taaaatcaac 6960tgtgtccaca ttatcagttt tgtgtatacg atgaaatagg gtaattcaaa atctagcttg 7020atatgccttt tggttcattt taaccttctg taaacatttt ttcagatttt gaacaagtaa 7080atccaaaaaa aaaaaaaaaa aatctcaact caacactaaa ttattttaat gtataaaaga 
7140tgcttaaaac atttggctta aaagaaagaa gctaaaaaca tagagaactc ttgtaaattg 7200aagtatgaaa atatactgaa ttgggtatta tatgaatttt tctgatttag gattcacatg 7260atccaaaaag gaaatccaga agcactaatc agacattgga agtaggattt aaatttaatc 7320gcagtactta atcagtgatc agtaactaaa ttcagtacat taaagacgtc cgcaatgtgt 7380tattaagttg tctaagcgtc aatttgttta caccacaata tatcctgcca ccagccagcc 7440aacagctccc cgaccggcag ctcggcacaa aatcactgat catctaaaaa ggtgatgtgt 7500atttgagtaa aacagcttgc gtcatgcggt cgctgcgtat atgatgcgat gagtaaataa 7560acaaatacgc aaggggaacg catgaaggtt atcgctgtac ttaaccagaa aggcgggtca 7620ggcaagacga ccatcgcaac ccatctagcc cgcgccctgc aactcgccgg ggccgatgtt 7680ctgttagtcg attccgatcc ccagggcagt gcccgcgatt gggcggccgt gcgggaagat 7740caaccgctaa ccgttgtcgg catcgaccgc ccgacgattg accgcgacgt gaaggccatc 7800ggccggcgcg acttcgtagt gatcgacgga gcgccccagg cggcggactt ggctgtgtcc 7860gcgatcaagg cagccgactt cgtgctgatt ccggtgcagc caagccctta cgacatttgg 7920gccaccgccg acctggtgga gctggttaag cagcgcattg aggtcacgga tggaaggcta 7980caagcggcct ttgtcgtgtc gcgggcgatc aaaggcacgc gcatcggcgg tgaggttgcc 8040gaggcgctgg ccgggtacga gctgcccatt cttgagtccc gtatcacgca gcgcgtgagc 8100tacccaggca ctgccgccgc cggcacaacc gttcttgaat cagaacccga gggcgacgct 8160gcccgcgagg tccaggcgct ggccgctgaa attaaatcaa aactcatttg agttaatgag 8220gtaaagagaa aatgagcaaa agcacaaaca cgctaagtgc cggccgtccg agcgcacgca 8280gcagcaaggc tgcaacgttg gccagcctgg cagacacgcc agccatgaag cgggtcaact 8340ttcagttgcc ggcggaggat cacaccaagc tgaagatgta cgcggtacgc caaggcaaga 8400ccattaccga gctgctatct gaatacatcg cgcagctacc agagtaaatg agcaaatgaa 8460taaatgagta gatgaatttt agcggctaaa ggaggcggca tggaaaatca

agaacaacca 8520ggcaccgacg ccgtggaatg ccccatgtgt ggaggaacgg gcggttggcc aggcgtaagc 8580ggctgggttg tctgccggcc ctgcaatggc actggaaccc ccaagcccga ggaatcggcg 8640tgagcggtcg caaaccatcc ggcccggtac aaatcggcgc ggcgctgggt gatgacctgg 8700tggagaagtt gaaggccgcg caggccgccc agcggcaacg catcgaggca gaagcacgcc 8760ccggtgaatc gtggcaaggg gccgctgatc gaatccgcaa agaatcccgg caaccgccgg 8820cagccggtgc gccgtcgatt aggaagccgc ccaagggcga cgagcaacca gattttttcg 8880ttccgatgct ctatgacgtg ggcacccgcg atagtcgcag catcatggac gtggccgttt 8940tccgtctgtc gaagcgtgac cgacgagctg gcgaggtgat ccgctacgag cttccagacg 9000ggcacgtaga ggtttccgca ggccccgccg gcatggccag tgtgtgggat tacgacctgg 9060tactgatggc ggtttcccat ctaaccgaat ccatgaaccg ataccgggaa gggaagggag 9120acaagcccgg ccgcgtgttc cgtccacacg ttgcggacgt actcaagttc tgccggcgag 9180ccgatggcgg aaagcagaaa gacgacctgg tagaaacctg cattcggtta aacaccacgc 9240acgttgccat gcagcgtacc aagaaggcca agaacggccg cctggtgacg gtatccgagg 9300gtgaagcctt gattagccgc tacaagatcg taaagagcga aaccgggcgg ccggagtaca 9360tcgagatcga gcttgctgat tggatgtacc gcgagatcac agaaggcaag aacccggacg 9420tgctgacggt tcaccccgat tactttttga tcgaccccgg catcggccgt tttctctacc 9480gcctggcacg ccgcgccgca ggcaaggcag aagccagatg gttgttcaag acgatctacg 9540aacgcagtgg cagcgccgga gagttcaaga agttctgttt caccgtgcgc aagctgatcg 9600ggtcaaatga cctgccggag tacgatttga aggaggaggc ggggcaggct ggcccgatcc 9660tagtcatgcg ctaccgcaac ctgatcgagg gcgaagcatc cgccggttcc taatgtacgg 9720agcagatgct agggcaaatt gccctagcag gggaaaaagg tcgaaaaggt ctctttcctg 9780tggatagcac gtacattggg aacccaaagc cgtacattgg gaaccggaac ccgtacattg 9840ggaacccaaa gccgtacatt gggaaccggt cacacatgta agtgactgat ataaaagaga 9900aaaaaggcga tttttccgcc taaaactctt taaaacttat taaaactctt aaaacccgcc 9960tggcctgtgc ataactgtct ggccagcgca cagccgaaga gctgcaaaaa gcgcctaccc 10020ttcggtcgct gcgctcccta cgccccgccg cttcgcgtcg gcctatcgcg gcctatgcgg 10080tgtgaaatac cgcacagatg cgtaaggaga aaataccgca tcaggcgctc ttccgcttcc 10140tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca 10200aaggcggtaa 
tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca 10260aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg 10320ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg 10380acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt 10440ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt 10500tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc 10560tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt 10620gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt 10680agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc 10740tacactagaa ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa 10800agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt 10860tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct 10920acggggtcct tcaactcatc gatagtttgg ctgtgagcaa ttatgtgctt agtgcatcta 10980acgcttgagt taagccgcgc cgcgaagcgg cgtcggcttg aacgaatttc tagctagaca 11040ttatttgcca acgaccttcg tgatctcgcc cttgacatag tggacaaatt cttcgagctg 11100gtcggcccgg gacgcgagac ggtcttcttc ttggcccaga taggcttggc gcgcttcgag 11160gatcacgggc tggtattgcg ccggaaggcg ctccatcgcc cagtcggcgg cgacatcctt 11220cggcgcgatc ttgccggtaa ccgccgagta ccaaatccgg ctcagcgtaa ggaccacatt 11280gcgctcatcg cccgcccaat ccggcgggga gttccacagg gtcagcgtct cgttcagtgc 11340ttcgaacaga tcctgttccg gcaccgggtc gaaaagttcc tcggccgcgg ggccgacgag 11400ggccacgcta tgctcccggg ccttggtgag caggatcgcc agatcaatgt cgatggtggc 11460cggttcaaag atacccgcca gaatatcatt acgctgccat tcgccgaact ggagttcgcg 11520tttggccgga tagcgccagg ggatgatgtc atcgtgcacc acaatcgtca cctcaaccgc 11580gcgcaggatt tcgctctcgc cgggggaggc ggacgtttcc agaaggtcgt tgataagcgc 11640gcggcgcgtg gtctcgtcga gacggacggt aacggtgaca agcaggtcga tgtccgaatg 11700gggcttaagg ccgccgtcaa cggcgctacc atacagatgc acggcgagga gggtcggttc 11760gaggtggcgc tcgatgacac ccacgacttc cgacagctgg gtggacacct cggcgatgac 11820cgcttcaccc atgatgttta actttgtttt agggcgactg ccctgctgcg taacatcgtt 11880gctgctccat aacatcaaac 
atcgacccac ggcgtaacgc gcttgctgct tggatgcccg 11940aggcatagac tgtaccccaa aaaaacagtc ataacaagcc atgaaaaccg ccactgcgtt 12000ccatgaatat tcaaacaaac acatacagcg cgacttatca tggatattga catacaaatg 12060gacgaacgga taaacctttt cacgcccttt taaatatccg attattctaa taaacgctct 12120tttctcttag gtttacccgc caatatatcc tgtcaaacac tgatagttta aactgaaggc 12180gggaaacgac aatctgatca ctgattagta actaaggcct ttaattaatc tagaggcgcg 12240ccgggccccc tgcagggagc tcggccggcc aatttaaatt gatatcggta catcgattac 12300gccaagctat caactttgta tag 12323
74 14404 DNA Artificial Sequence vector 74
ttatacatag ttgataattc actggccgga tgtaccgaat tcgcggccgc aagcttggta 60cctttcttta cgaggtaatt gatctcgcat tatatatcta cattttggtt atgttacttg 120acatatagtc attgattcaa tagttctgtt aattccttta aagatcattt tgactagacc 180acattcttgg ttcattcctc aataatttgt aatcatattg gtggatatag aagtagattg 240gttatagatc agatagtgga agactttagg atgaatttca gctagttttt ttttttggct 300tattgtctca aaagattagt gctttgctgt ctccattgct tctgctatcg acacgcttct 360gtctccttgt atctttatta tatctattcg tcccatgagt tttgtttgtt ctgtattcgt 420tcgctctggt gtcatggatg gagtctctgt tccatgtttc tgtaatgcat gttgggttgt 480ttcatgcaag aaatgctgag ataaacactc atttgtgaaa gtttctaaac tctgaatcgc 540gctacaggca atgctccgag gagtaggagg agaagaacga accaaacgac attatcagcc 600ctttgaggaa gctcttagtt ttgttattgt ttttgtagcc aaattctcca ttcttattcc 660attttcactt atctcttgtt ccttatagac cttataagtt ttttattcat gtatacaaat 720tatattgtca tcaagaagta tctttaaaat ctaaatctca aatcaccagg actatgtttt 780tgtccaattc gtggaaccaa cttgcagctt gtatccattc tcttaaccaa taaaaaaaga 840aagaaagatc aatttgataa atttctcagc cacaaattct acatttaggt tttagcatat 900cgaaggctca atcacaaata caatagatag actagagatt ccagcgtcac gtgagtttta 960tctataaata aaggaccaaa aatcaaatcc cgagggcatt ttcgtaatcc aacataaaac 1020ccttaaactt caagtctcat ttttaaacaa atcatgttca caagtctctt cttcttctct 1080gtttctctat ctcttgctcg ggcccttaga tctcgtgccg tcgtgcgacg ttgttttccg 1140gtacgtttat tcctgttgat tccttctctg tctctctcga ttcactgcta cttctgtttg 1200gattcctttc gcgcgatctc tggatccgtg cgttattcat tggctcgtcg ttttcagatc
1260tgttgcgttt cttctgtttt ctgttatgag tggatgcgtt ttcttgtgat tcgcttgttt 1320gtaatgctgg atctgtatct gcgtcgtggg aattcaaagt gatagtagtt gatatttttt 1380ccagatcagg catgttctcg tataatcagg tctaatggtt gatgattctg cggaattata 1440gatctaagat cttgattgat ttagatttga ggatatgaat gagattcgta ggtccacaaa 1500ggtcttgtta tctctgctgc tagatagatg attatccaat tgcgtttcgt agttattttt 1560atggattcaa ggaattgcgt gtaattgaga gttttactct gttttgtgaa caggcttgat 1620caaactcgag atctttctcc tgaaccatgg cggcggcaac aacaacaaca acaacatctt 1680cttcgatctc cttctccacc aaaccatctc cttcctcctc caaatcacca ttaccaatct 1740ccagattctc cctcccattc tccctaaacc ccaacaaatc atcctcctcc tcccgccgcc 1800gcggtatcaa atccagctct ccctcctcca tctccgccgt gctcaacaca accaccaatg 1860tcacaaccac tccctctcca accaaaccta ccaaacccga aacattcatc tcccgattcg 1920ctccagatca accccgcaaa ggcgctgata tcctcgtcga ggctttagaa cgtcaaggcg 1980tagaaaccgt attcgcttac cctggaggta catcaatgga gattcaccaa gccttaaccc 2040gctcttcctc aatccgtaac gtccttcctc gtcacgaaca aggaggtgta ttcgcagcag 2100aaggatacgc tcgatcctca ggtaaaccag gtatctgtat agccacttca ggtcccggag 2160ctacaaatct cgttagcgga ttagccgatg cgttgttaga tagtgttcct cttgtagcaa 2220tcacaggaca agtccctcgt cgtatgattg gtacagatgc gtttcaagag actccgattg 2280ttgaggtaac gcgttcgatt acgaagcata actatcttgt gatggatgtt gaagatatcc 2340caaggattat tgaagaggct ttctttttag ctacttctgg tagacctgga cctgttttgg 2400ttgatgttcc taaagatatt caacaacagc ttgcgattcc taattgggaa caggctatga 2460gattacctgg ttatatgtct aggatgccta aacctccgga agattctcat ttggagcaga 2520ttgttaggtt gatttctgag tctaagaagc ctgtgttgta tgttggtggt ggttgtctta 2580attctagcga tgaattgggt aggtttgttg agcttacggg catccctgtt gcgagtacgt 2640tgatggggct gggatcttat ccttgtgatg atgagttgtc gttacatatg cttggaatgc 2700atgggactgt gtatgcaaat tacgctgtgg agcatagtga tttgttgttg gcgtttgggg 2760taaggtttga tgatcgtgtc acgggtaaac ttgaggcttt tgctagtagg gctaagattg 2820ttcatattga tattgactcg gctgagattg ggaagaataa gactcctcat gtgtctgtgt 2880gtggtgatgt taagctggct ttgcaaggga tgaataaggt tcttgagaac cgagcggagg 2940agcttaaact tgattttgga gtttggagga 
atgagttgaa cgtacagaaa cagaagtttc 3000cgttgagctt taagacgttt ggggaagcta ttcctccaca gtatgcgatt aaggtccttg 3060atgagttgac tgatggaaaa gccataataa gtactggtgt cgggcaacat caaatgtggg 3120cggcgcagtt ctacaattac aagaaaccaa ggcagtggct atcatcagga ggccttggag 3180ctatgggatt tggacttcct gctgcgattg gagcgtctgt tgctaaccct gatgcgatag 3240ttgtggatat tgacggagat ggaagtttta taatgaatgt gcaagagcta gccactattc 3300gtgtagagaa tcttccagtg aaggtacttt tattaaacaa ccagcatctt ggcatggtta 3360tgcaatggga agatcggttc tacaaagcta accgagctca cacatttctc ggggacccgg 3420ctcaggagga cgagatattc ccgaacatgt tgctgtttgc agcagcttgc gggattccag 3480cggcgagggt gacaaagaaa gcagatctcc gagaagctat tcagacaatg ctggatacac 3540caggacctta cctgttggat gtgatttgtc cgcaccaaga acatgtgttg ccgatgatcc 3600cgaatggtgg cactttcaac gatgtcataa cggaaggaga tggccggatt aaatactgag 3660agatgaaacc ggtgattatc agaacctttt atggtctttg tatgcatatg gtaaaaaaac 3720ttagtttgca atttcctgtt tgttttggta atttgagttt cttttagttg ttgatctgcc 3780tgctttttgg tttacgtcag actactactg ctgttgttgt ttggtttcct ttctttcatt 3840ttataaataa ataatccggt tcggtttact ccttgtgact ggctcagttt ggttattgcg 3900aaatgcgaat ggtaaattga gtaattgaaa ttcgttatta gggttctaag ctgttttaac 3960agtcactggg ttaatatctc tcgaatcttg catggaaaat gctcttacca ttggttttta 4020attgaaatgt gctcatatgg gccgtggttt ccaaattaaa taaaactacg atgtcatcga 4080gaagtaaaat caactgtgtc cacattatca gttttgtgta tacgatgaaa tagggtaatt 4140caaaatctag cttgatatgc cttttggttc attttaacct tctgtaaaca ttttttcaga 4200ttttgaacaa gtaaatccaa aaaaaaaaaa aaaaaatctc aactcaacac taaattattt 4260taatgtataa aagatgctta aaacatttgg cttaaaagaa agaagctaaa aacatagaga 4320actcttgtaa attgaagtat gaaaatatac tgaattgggt attatatgaa tttttctgat 4380ttaggattca catgatccaa aaaggaaatc cagaagcact aatcagacat tggaagtagg 4440atttaaattt aatcgcagta cttaatcagt gatcagtaac taaattcagt acattaaaga 4500cgtccgcaat gtgttattaa gttgtctaag cgtcaatttg tttacaccac aatatatcct 4560gccaccagcc agccaacagc tccccgaccg gcagctcggc acaaaatcac tgatcatcta 4620aaaaggtgat gtgtatttga gtaaaacagc ttgcgtcatg cggtcgctgc gtatatgatg 
4680cgatgagtaa ataaacaaat acgcaagggg aacgcatgaa ggttatcgct gtacttaacc 4740agaaaggcgg gtcaggcaag acgaccatcg caacccatct agcccgcgcc ctgcaactcg 4800ccggggccga tgttctgtta gtcgattccg atccccaggg cagtgcccgc gattgggcgg 4860ccgtgcggga agatcaaccg ctaaccgttg tcggcatcga ccgcccgacg attgaccgcg 4920acgtgaaggc catcggccgg cgcgacttcg tagtgatcga cggagcgccc caggcggcgg 4980acttggctgt gtccgcgatc aaggcagccg acttcgtgct gattccggtg cagccaagcc 5040cttacgacat ttgggccacc gccgacctgg tggagctggt taagcagcgc attgaggtca 5100cggatggaag gctacaagcg gcctttgtcg tgtcgcgggc gatcaaaggc acgcgcatcg 5160gcggtgaggt tgccgaggcg ctggccgggt acgagctgcc cattcttgag tcccgtatca 5220cgcagcgcgt gagctaccca ggcactgccg ccgccggcac aaccgttctt gaatcagaac 5280ccgagggcga cgctgcccgc gaggtccagg cgctggccgc tgaaattaaa tcaaaactca 5340tttgagttaa tgaggtaaag agaaaatgag caaaagcaca aacacgctaa gtgccggccg 5400tccgagcgca cgcagcagca aggctgcaac gttggccagc ctggcagaca cgccagccat 5460gaagcgggtc aactttcagt tgccggcgga ggatcacacc aagctgaaga tgtacgcggt 5520acgccaaggc aagaccatta ccgagctgct atctgaatac atcgcgcagc taccagagta 5580aatgagcaaa tgaataaatg agtagatgaa ttttagcggc taaaggaggc ggcatggaaa 5640atcaagaaca accaggcacc gacgccgtgg aatgccccat gtgtggagga acgggcggtt 5700ggccaggcgt aagcggctgg gttgtctgcc ggccctgcaa tggcactgga acccccaagc 5760ccgaggaatc ggcgtgagcg gtcgcaaacc atccggcccg gtacaaatcg gcgcggcgct 5820gggtgatgac ctggtggaga agttgaaggc cgcgcaggcc gcccagcggc aacgcatcga 5880ggcagaagca cgccccggtg aatcgtggca aggggccgct gatcgaatcc gcaaagaatc 5940ccggcaaccg ccggcagccg gtgcgccgtc gattaggaag ccgcccaagg gcgacgagca 6000accagatttt ttcgttccga tgctctatga cgtgggcacc cgcgatagtc gcagcatcat 6060ggacgtggcc gttttccgtc tgtcgaagcg tgaccgacga gctggcgagg tgatccgcta 6120cgagcttcca gacgggcacg tagaggtttc cgcaggcccc gccggcatgg ccagtgtgtg 6180ggattacgac ctggtactga tggcggtttc ccatctaacc gaatccatga accgataccg 6240ggaagggaag ggagacaagc ccggccgcgt gttccgtcca cacgttgcgg acgtactcaa 6300gttctgccgg cgagccgatg gcggaaagca gaaagacgac ctggtagaaa cctgcattcg 6360gttaaacacc acgcacgttg ccatgcagcg 
taccaagaag gccaagaacg gccgcctggt 6420gacggtatcc gagggtgaag ccttgattag ccgctacaag atcgtaaaga gcgaaaccgg 6480gcggccggag tacatcgaga tcgagcttgc tgattggatg taccgcgaga tcacagaagg 6540caagaacccg gacgtgctga cggttcaccc cgattacttt ttgatcgacc ccggcatcgg 6600ccgttttctc taccgcctgg cacgccgcgc cgcaggcaag gcagaagcca gatggttgtt 6660caagacgatc tacgaacgca gtggcagcgc cggagagttc aagaagttct gtttcaccgt 6720gcgcaagctg atcgggtcaa atgacctgcc ggagtacgat ttgaaggagg aggcggggca 6780ggctggcccg atcctagtca tgcgctaccg caacctgatc gagggcgaag catccgccgg 6840ttcctaatgt acggagcaga tgctagggca aattgcccta gcaggggaaa aaggtcgaaa 6900aggtctcttt cctgtggata gcacgtacat tgggaaccca aagccgtaca ttgggaaccg 6960gaacccgtac attgggaacc caaagccgta cattgggaac cggtcacaca tgtaagtgac 7020tgatataaaa gagaaaaaag gcgatttttc cgcctaaaac tctttaaaac ttattaaaac 7080tcttaaaacc cgcctggcct gtgcataact gtctggccag cgcacagccg aagagctgca 7140aaaagcgcct acccttcggt cgctgcgctc cctacgcccc gccgcttcgc gtcggcctat 7200cgcggcctat gcggtgtgaa ataccgcaca gatgcgtaag gagaaaatac cgcatcaggc 7260gctcttccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg 7320tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa 7380agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg 7440cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga 7500ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg 7560tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg 7620gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc 7680gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg 7740gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca 7800ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt 7860ggcctaacta cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag 7920ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg 7980gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc 8040ctttgatctt ttctacgggg tccttcaact catcgatagt ttggctgtga gcaattatgt 
8100gcttagtgca tctaacgctt gagttaagcc gcgccgcgaa gcggcgtcgg cttgaacgaa 8160tttctagcta gacattattt gccaacgacc ttcgtgatct cgcccttgac atagtggaca 8220aattcttcga gctggtcggc ccgggacgcg agacggtctt cttcttggcc cagataggct 8280tggcgcgctt cgaggatcac gggctggtat tgcgccggaa ggcgctccat cgcccagtcg 8340gcggcgacat ccttcggcgc gatcttgccg gtaaccgccg agtaccaaat ccggctcagc 8400gtaaggacca cattgcgctc atcgcccgcc caatccggcg gggagttcca cagggtcagc 8460gtctcgttca gtgcttcgaa cagatcctgt tccggcaccg ggtcgaaaag ttcctcggcc 8520gcggggccga cgagggccac gctatgctcc cgggccttgg tgagcaggat cgccagatca 8580atgtcgatgg tggccggttc aaagataccc gccagaatat cattacgctg ccattcgccg 8640aactggagtt cgcgtttggc cggatagcgc caggggatga tgtcatcgtg caccacaatc 8700gtcacctcaa ccgcgcgcag gatttcgctc tcgccggggg aggcggacgt ttccagaagg 8760tcgttgataa gcgcgcggcg cgtggtctcg tcgagacgga cggtaacggt gacaagcagg 8820tcgatgtccg aatggggctt aaggccgccg tcaacggcgc taccatacag atgcacggcg 8880aggagggtcg gttcgaggtg gcgctcgatg acacccacga cttccgacag ctgggtggac 8940acctcggcga tgaccgcttc acccatgatg tttaactttg ttttagggcg actgccctgc 9000tgcgtaacat cgttgctgct ccataacatc aaacatcgac ccacggcgta acgcgcttgc 9060tgcttggatg cccgaggcat agactgtacc ccaaaaaaac agtcataaca agccatgaaa 9120accgccactg cgttccatga atattcaaac aaacacatac agcgcgactt atcatggata 9180ttgacataca aatggacgaa cggataaacc ttttcacgcc cttttaaata tccgattatt 9240ctaataaacg ctcttttctc ttaggtttac ccgccaatat atcctgtcaa acactgatag 9300tttaaactga aggcgggaaa cgacaatctg atcactgatt agtaactaag gcctttaatt 9360aatctagagg cgcgccgggc cccctgcagg gagctcggcc ggccaattta aattgatatc 9420ggtacatcga ttacgccaag ctatcaactt tgtatagaaa agttgccatg attacgccaa 9480gcttggccac taaggccaat ttaaatctac taggccggcc aaagtaggcg cctactaccg 9540gtaattcccg ggattagcgg ccgctagtct gtgcgcactt gtatcctgca ggtcaatcgt 9600ttaaacactg tacggaccgt ggcctaatag gccggtaccc aagtttgtac aaaaaagcag 9660gctcccggga tacctgcagg ttaggccggc ccaggtaccc tagattcgac ggtatcgata 9720agctcgcgga tccctgaaag cgacgttgga tgttaacatc tacaaattgc cttttcttat 9780cgaccatgta cgtaagcgct tacgtttttg 
gtggaccctt gaggaaactg gtagctgttg 9840tgggcctgtg gtctcaagat ggatcattaa tttccacctt cacctacgat ggggggcatc 9900gcaccggtga gtaatattgt acggctaaga gcgaatttgg cctgtaggat ccctgaaagc 9960gacgttggat gttaacatct acaaattgcc ttttcttatc gaccatgtac gtaagcgctt 10020acgtttttgg tggacccttg aggaaactgg tagctgttgt gggcctgtgg tctcaagatg 10080gatcattaat ttccaccttc acctacgatg gggggcatcg caccggtgag taatattgta 10140cggctaagag cgaatttggc ctgtaggatc cctgaaagcg acgttggatg ttaacatcta 10200caaattgcct tttcttatcg accatgtacg taagcgctta cgtttttggt ggacccttga 10260ggaaactggt agctgttgtg ggcctgtggt ctcaagatgg atcattaatt tccaccttca 10320cctacgatgg ggggcatcgc accggtgagt aatattgtac ggctaagagc gaatttggcc 10380tgtaggatcc gcgagctggt caatcccatt gcttttgaag cagctcaaca ttgatctctt 10440tctcgatcga gggagatttt tcaaatcagt gcgcaagacg tgacgtaagt atccgagtca 10500gtttttattt ttctactaat ttggtcgttt atttcggcgt gtaggacatg gcaaccgggc 10560ctgaatttcg cgggtattct gtttctattc caactttttc ttgatccgca gccattaacg 10620acttttgaat agatacgctg acacgccaag cctcgctagt caaaagtgta ccaaacaacg 10680ctttacagca agaacggaat gcgcgtgacg ctcgcggtga cgccatttcg ccttttcaga 10740aatggataaa tagccttgct tcctattata tcttcccaaa ttaccaatac attacactag 10800catctgaatt tcataaccaa tctcgataca ccaaatcgat taattaacca tggcgacgac 10860aacaacagaa gcaacgaaga catcatcgac caatggagaa gatcagaagc agtctcagaa 10920tcttcgacat caagaagttg gtcacaagag tctcttacag agcgatgatc tctaccagta 10980tatactggag acaagtgtgt atcctagaga accagaatca atgaaggaac tcagggaagt 11040gacagcaaaa catccatgga acataatgac cacatcagct gatgaaggac agttcttaaa 11100catgcttatc aagctcgtta acgccaagaa cacaatggag atcggagttt acactggcta 11160ctctcttctc

gccaccgctc ttgctctccc tgaagacggc aaaattctgg ctatggatgt 11220caacagagag aattacgaat tgggtttacc gatcattgag aaagccggcg ttgctcacaa 11280gatcgacttc agggaaggcc ctgctcttcc cgttcttgat gaaatcgttg ctgacgagaa 11340gaaccatgga acatatgact ttatattcgt tgatgctgac aaagacaact acatcaacta 11400ccacaagcgt ttgatcgatc ttgtgaaaat tggaggagtg attggctacg acaacactct 11460gtggaatggt tctgtcgtgg ctcctcctga tgcaccaatg aggaagtacg ttcgttacta 11520cagagacttt gttcttgagc ttaacaaggc tcttgctgct gaccctcgga tcgagatctg 11580tatgctccct gttggtgatg gaatcactat ctgccgtcgg atcagttgag gcgcgccgat 11640cgttcaaaca tttggcaata aagtttctta agattgaatc ctgttgccgg tcttgcgatg 11700attatcatat aatttctgtt gaattacgtt aagcatgtaa taattaacat gtaatgcatg 11760acgttattta tgagatgggt ttttatgatt agagtcccgc aattatacat ttaatacgcg 11820atagaaaaca aaatatagcg cgcaaactag gataaattat cgcgcgcggt gtcatctatg 11880ttactagatc ggcgcctaag tttaaactaa gcggccgcac ccagctttct tgtacaaagt 11940ggccatgatt acgccaagct tggccactaa ggccaattta aatctactag gccggcccag 12000gtaccaattc gaatccaaaa attacggata tgaatatagg catatccgta tccgaattat 12060ccgtttgaca gctagcaacg attgtacaat tgcttcttta aaaaaggaag aaagaaagaa 12120agaaaagaat caacatcagc gttaacaaac ggccccgtta cggcccaaac ggtcatatag 12180agtaacggcg ttaagcgttg aaagactcct atcgaaatac gtaaccgcaa acgtgtcata 12240gtcagatccc ctcttccttc accgcctcaa acacaaaaat aatcttctac agcctatata 12300tacaaccccc ccttctatct ctcctttctc acaattcatc atctttcttt ctctaccccc 12360aattttaaga aatcctctct tctcctcttc attttcaagg taaatctctc tctctctctc 12420tctctctgtt attccttgtt ttaattaggt atgtattatt gctagtttgt taatctgctt 12480atcttatgta tgccttatgt gaatatcttt atcttgttca tctcatccgt ttagaagcta 12540taaatttgtt gatttgactg tgtatctaca cgtggttatg tttatatcta atcagatatg 12600aatttcttca tattgttgcg tttgtgtgta ccaatccgaa atcgttgatt tttttcattt 12660aatcgtgtag ctaattgtac gtatacatat ggatctacgt atcaattgtt catctgtttg 12720tgtttgtatg tatacagatc tgaaaacatc acttctctca tctgattgtg ttgttacata 12780catagatata gatctgttat atcatttttt ttattaattg tgtatatata tatgtgcata 12840gatctggatt acatgattgt 
gattatttac atgattttgt tatttacgta tgtatatatg 12900tagatctgga ctttttggag ttgttgactt gattgtattt gtgtgtgtat atgtgtgttc 12960tgatcttgat atgttatgta tgtgcagtta attaaccatg gctccaacac tcttgacaac 13020ccaattctca aatccagctg aagtaaccga ctttgtagtc tacaaaggaa atggtgttaa 13080gggtttatca gaaacaggaa tcaaagctct tccagaacaa tacattcagc cacttgaaga 13140acgactcatc aacaaattcg tcaacgaaac agatgaagcc attccagtta tcgatatgtc 13200gaaccctgat gaggacagag tcgctgaagc tgtttgtgat gctgctgaga aatgggggtt 13260ctttcaagtg atcaatcatg gagttccttt ggaagttctt gatgacgtca aggctgcgac 13320tcacaagttc ttcaatctcc ctgttgaaga gaagcgcaag ttcactaaag agaattcgct 13380gtcgacgact gttaggtttg ggacgagttt tagtcctctt gcagagcaag cgcttgagtg 13440gaaagattat ctcagcctct tctttgtctc tgaagctgaa gctgaacagt tctggcctga 13500tatctgcagg aatgaaacgt tagagtacat taacaagtca aagaagatgg tgaggaggct 13560tctagagtat ttgggaaaga atctcaatgt taaagagctt gacgagacga aagaatcact 13620ctttatgggc tcgattcgag tcaaccttaa ctactacccc atctgcccta atccggacct 13680aacagttggt gttggtcgcc actcagacgt ctcttctctc accattctct tacaagacca 13740gatcggtggt ctacacgtgc gttctctggc ttcagggaac tgggttcacg tgcctccggt 13800tgctggatct tttgtgatca acatcggaga tgcgatgcag atcatgagca atggtctgta 13860caagagcgtg gagcatcgtg tcttagccaa tggttacaat aatagaatct ctgttcctat 13920ctttgtgaac ccaaaaccag agtcagttat tggtcctcta cctgaggtga ttgcaaacgg 13980agaggaaccg atttacagag acgtcctgta ctctgattac gtcaagtatt tcttcaggaa 14040ggcacacgat ggaaagaaaa ccgtcgatta cgccaagatc tgaggcgcgc cctgctttaa 14100tgagatatgc gagacgccta tgatcgcatg atatttgctt tcaattctgt tgtgcacgtt 14160gtaaaaaacc tgagcatgtg tagctcagat ccttaccgcc ggtttcggtt cattctaatg 14220aatatatcac ccgttactat cgtattttta tgaataatat tctccgttca atttactgat 14280tgtggcgcct actaccggta attcccggga ttagcggccg ctagtctgtg cgcacttgta 14340tcctgcaggt caatcgttta aacactgtac ggaccgtggc ctaataggcc ggtacccaac 14400ttta 14404
75 20077 DNA Artificial Sequence vector 75
ttatacatag ttgataattc actggccgga tgtaccgaat tcgcggccgc aagcttggta 60cctttcttta cgaggtaatt gatctcgcat tatatatcta cattttggtt
atgttacttg 120acatatagtc attgattcaa tagttctgtt aattccttta aagatcattt tgactagacc 180acattcttgg ttcattcctc aataatttgt aatcatattg gtggatatag aagtagattg 240gttatagatc agatagtgga agactttagg atgaatttca gctagttttt ttttttggct 300tattgtctca aaagattagt gctttgctgt ctccattgct tctgctatcg acacgcttct 360gtctccttgt atctttatta tatctattcg tcccatgagt tttgtttgtt ctgtattcgt 420tcgctctggt gtcatggatg gagtctctgt tccatgtttc tgtaatgcat gttgggttgt 480ttcatgcaag aaatgctgag ataaacactc atttgtgaaa gtttctaaac tctgaatcgc 540gctacaggca atgctccgag gagtaggagg agaagaacga accaaacgac attatcagcc 600ctttgaggaa gctcttagtt ttgttattgt ttttgtagcc aaattctcca ttcttattcc 660attttcactt atctcttgtt ccttatagac cttataagtt ttttattcat gtatacaaat 720tatattgtca tcaagaagta tctttaaaat ctaaatctca aatcaccagg actatgtttt 780tgtccaattc gtggaaccaa cttgcagctt gtatccattc tcttaaccaa taaaaaaaga 840aagaaagatc aatttgataa atttctcagc cacaaattct acatttaggt tttagcatat 900cgaaggctca atcacaaata caatagatag actagagatt ccagcgtcac gtgagtttta 960tctataaata aaggaccaaa aatcaaatcc cgagggcatt ttcgtaatcc aacataaaac 1020ccttaaactt caagtctcat ttttaaacaa atcatgttca caagtctctt cttcttctct 1080gtttctctat ctcttgctcg ggcccttaga tctcgtgccg tcgtgcgacg ttgttttccg 1140gtacgtttat tcctgttgat tccttctctg tctctctcga ttcactgcta cttctgtttg 1200gattcctttc gcgcgatctc tggatccgtg cgttattcat tggctcgtcg ttttcagatc 1260tgttgcgttt cttctgtttt ctgttatgag tggatgcgtt ttcttgtgat tcgcttgttt 1320gtaatgctgg atctgtatct gcgtcgtggg aattcaaagt gatagtagtt gatatttttt 1380ccagatcagg catgttctcg tataatcagg tctaatggtt gatgattctg cggaattata 1440gatctaagat cttgattgat ttagatttga ggatatgaat gagattcgta ggtccacaaa 1500ggtcttgtta tctctgctgc tagatagatg attatccaat tgcgtttcgt agttattttt 1560atggattcaa ggaattgcgt gtaattgaga gttttactct gttttgtgaa caggcttgat 1620caaactcgag atctttctcc tgaaccatgg cggcggcaac aacaacaaca acaacatctt 1680cttcgatctc cttctccacc aaaccatctc cttcctcctc caaatcacca ttaccaatct 1740ccagattctc cctcccattc tccctaaacc ccaacaaatc atcctcctcc tcccgccgcc 1800gcggtatcaa atccagctct ccctcctcca 
tctccgccgt gctcaacaca accaccaatg 1860tcacaaccac tccctctcca accaaaccta ccaaacccga aacattcatc tcccgattcg 1920ctccagatca accccgcaaa ggcgctgata tcctcgtcga ggctttagaa cgtcaaggcg 1980tagaaaccgt attcgcttac cctggaggta catcaatgga gattcaccaa gccttaaccc 2040gctcttcctc aatccgtaac gtccttcctc gtcacgaaca aggaggtgta ttcgcagcag 2100aaggatacgc tcgatcctca ggtaaaccag gtatctgtat agccacttca ggtcccggag 2160ctacaaatct cgttagcgga ttagccgatg cgttgttaga tagtgttcct cttgtagcaa 2220tcacaggaca agtccctcgt cgtatgattg gtacagatgc gtttcaagag actccgattg 2280ttgaggtaac gcgttcgatt acgaagcata actatcttgt gatggatgtt gaagatatcc 2340caaggattat tgaagaggct ttctttttag ctacttctgg tagacctgga cctgttttgg 2400ttgatgttcc taaagatatt caacaacagc ttgcgattcc taattgggaa caggctatga 2460gattacctgg ttatatgtct aggatgccta aacctccgga agattctcat ttggagcaga 2520ttgttaggtt gatttctgag tctaagaagc ctgtgttgta tgttggtggt ggttgtctta 2580attctagcga tgaattgggt aggtttgttg agcttacggg catccctgtt gcgagtacgt 2640tgatggggct gggatcttat ccttgtgatg atgagttgtc gttacatatg cttggaatgc 2700atgggactgt gtatgcaaat tacgctgtgg agcatagtga tttgttgttg gcgtttgggg 2760taaggtttga tgatcgtgtc acgggtaaac ttgaggcttt tgctagtagg gctaagattg 2820ttcatattga tattgactcg gctgagattg ggaagaataa gactcctcat gtgtctgtgt 2880gtggtgatgt taagctggct ttgcaaggga tgaataaggt tcttgagaac cgagcggagg 2940agcttaaact tgattttgga gtttggagga atgagttgaa cgtacagaaa cagaagtttc 3000cgttgagctt taagacgttt ggggaagcta ttcctccaca gtatgcgatt aaggtccttg 3060atgagttgac tgatggaaaa gccataataa gtactggtgt cgggcaacat caaatgtggg 3120cggcgcagtt ctacaattac aagaaaccaa ggcagtggct atcatcagga ggccttggag 3180ctatgggatt tggacttcct gctgcgattg gagcgtctgt tgctaaccct gatgcgatag 3240ttgtggatat tgacggagat ggaagtttta taatgaatgt gcaagagcta gccactattc 3300gtgtagagaa tcttccagtg aaggtacttt tattaaacaa ccagcatctt ggcatggtta 3360tgcaatggga agatcggttc tacaaagcta accgagctca cacatttctc ggggacccgg 3420ctcaggagga cgagatattc ccgaacatgt tgctgtttgc agcagcttgc gggattccag 3480cggcgagggt gacaaagaaa gcagatctcc gagaagctat tcagacaatg ctggatacac 
3540caggacctta cctgttggat gtgatttgtc cgcaccaaga acatgtgttg ccgatgatcc 3600cgaatggtgg cactttcaac gatgtcataa cggaaggaga tggccggatt aaatactgag 3660agatgaaacc ggtgattatc agaacctttt atggtctttg tatgcatatg gtaaaaaaac 3720ttagtttgca atttcctgtt tgttttggta atttgagttt cttttagttg ttgatctgcc 3780tgctttttgg tttacgtcag actactactg ctgttgttgt ttggtttcct ttctttcatt 3840ttataaataa ataatccggt tcggtttact ccttgtgact ggctcagttt ggttattgcg 3900aaatgcgaat ggtaaattga gtaattgaaa ttcgttatta gggttctaag ctgttttaac 3960agtcactggg ttaatatctc tcgaatcttg catggaaaat gctcttacca ttggttttta 4020attgaaatgt gctcatatgg gccgtggttt ccaaattaaa taaaactacg atgtcatcga 4080gaagtaaaat caactgtgtc cacattatca gttttgtgta tacgatgaaa tagggtaatt 4140caaaatctag cttgatatgc cttttggttc attttaacct tctgtaaaca ttttttcaga 4200ttttgaacaa gtaaatccaa aaaaaaaaaa aaaaaatctc aactcaacac taaattattt 4260taatgtataa aagatgctta aaacatttgg cttaaaagaa agaagctaaa aacatagaga 4320actcttgtaa attgaagtat gaaaatatac tgaattgggt attatatgaa tttttctgat 4380ttaggattca catgatccaa aaaggaaatc cagaagcact aatcagacat tggaagtagg 4440atttaaattt aatcgcagta cttaatcagt gatcagtaac taaattcagt acattaaaga 4500cgtccgcaat gtgttattaa gttgtctaag cgtcaatttg tttacaccac aatatatcct 4560gccaccagcc agccaacagc tccccgaccg gcagctcggc acaaaatcac tgatcatcta 4620aaaaggtgat gtgtatttga gtaaaacagc ttgcgtcatg cggtcgctgc gtatatgatg 4680cgatgagtaa ataaacaaat acgcaagggg aacgcatgaa ggttatcgct gtacttaacc 4740agaaaggcgg gtcaggcaag acgaccatcg caacccatct agcccgcgcc ctgcaactcg 4800ccggggccga tgttctgtta gtcgattccg atccccaggg cagtgcccgc gattgggcgg 4860ccgtgcggga agatcaaccg ctaaccgttg tcggcatcga ccgcccgacg attgaccgcg 4920acgtgaaggc catcggccgg cgcgacttcg tagtgatcga cggagcgccc caggcggcgg 4980acttggctgt gtccgcgatc aaggcagccg acttcgtgct gattccggtg cagccaagcc 5040cttacgacat ttgggccacc gccgacctgg tggagctggt taagcagcgc attgaggtca 5100cggatggaag gctacaagcg gcctttgtcg tgtcgcgggc gatcaaaggc acgcgcatcg 5160gcggtgaggt tgccgaggcg ctggccgggt acgagctgcc cattcttgag tcccgtatca 5220cgcagcgcgt gagctaccca ggcactgccg 
ccgccggcac aaccgttctt gaatcagaac 5280ccgagggcga cgctgcccgc gaggtccagg cgctggccgc tgaaattaaa tcaaaactca 5340tttgagttaa tgaggtaaag agaaaatgag caaaagcaca aacacgctaa gtgccggccg 5400tccgagcgca cgcagcagca aggctgcaac gttggccagc ctggcagaca cgccagccat 5460gaagcgggtc aactttcagt tgccggcgga ggatcacacc aagctgaaga tgtacgcggt 5520acgccaaggc aagaccatta ccgagctgct atctgaatac atcgcgcagc taccagagta 5580aatgagcaaa tgaataaatg agtagatgaa ttttagcggc taaaggaggc ggcatggaaa 5640atcaagaaca accaggcacc gacgccgtgg aatgccccat gtgtggagga acgggcggtt 5700ggccaggcgt aagcggctgg gttgtctgcc ggccctgcaa tggcactgga acccccaagc 5760ccgaggaatc ggcgtgagcg gtcgcaaacc atccggcccg gtacaaatcg gcgcggcgct 5820gggtgatgac ctggtggaga agttgaaggc cgcgcaggcc gcccagcggc aacgcatcga 5880ggcagaagca cgccccggtg aatcgtggca aggggccgct gatcgaatcc gcaaagaatc 5940ccggcaaccg ccggcagccg gtgcgccgtc gattaggaag ccgcccaagg gcgacgagca 6000accagatttt ttcgttccga tgctctatga cgtgggcacc cgcgatagtc gcagcatcat 6060ggacgtggcc gttttccgtc tgtcgaagcg tgaccgacga gctggcgagg tgatccgcta 6120cgagcttcca gacgggcacg tagaggtttc cgcaggcccc gccggcatgg ccagtgtgtg 6180ggattacgac ctggtactga tggcggtttc ccatctaacc gaatccatga accgataccg 6240ggaagggaag ggagacaagc ccggccgcgt gttccgtcca cacgttgcgg acgtactcaa 6300gttctgccgg cgagccgatg gcggaaagca gaaagacgac ctggtagaaa cctgcattcg 6360gttaaacacc acgcacgttg ccatgcagcg taccaagaag gccaagaacg gccgcctggt 6420gacggtatcc gagggtgaag ccttgattag ccgctacaag atcgtaaaga gcgaaaccgg 6480gcggccggag tacatcgaga tcgagcttgc tgattggatg taccgcgaga tcacagaagg 6540caagaacccg gacgtgctga cggttcaccc cgattacttt ttgatcgacc ccggcatcgg 6600ccgttttctc taccgcctgg cacgccgcgc cgcaggcaag gcagaagcca gatggttgtt 6660caagacgatc tacgaacgca gtggcagcgc cggagagttc aagaagttct gtttcaccgt 6720gcgcaagctg atcgggtcaa atgacctgcc ggagtacgat ttgaaggagg aggcggggca 6780ggctggcccg atcctagtca tgcgctaccg caacctgatc gagggcgaag catccgccgg 6840ttcctaatgt acggagcaga tgctagggca aattgcccta gcaggggaaa aaggtcgaaa 6900aggtctcttt cctgtggata gcacgtacat tgggaaccca aagccgtaca ttgggaaccg 
6960gaacccgtac attgggaacc caaagccgta cattgggaac cggtcacaca tgtaagtgac 7020tgatataaaa gagaaaaaag gcgatttttc cgcctaaaac tctttaaaac ttattaaaac 7080tcttaaaacc cgcctggcct gtgcataact gtctggccag cgcacagccg aagagctgca 7140aaaagcgcct acccttcggt cgctgcgctc cctacgcccc gccgcttcgc gtcggcctat 7200cgcggcctat gcggtgtgaa ataccgcaca gatgcgtaag gagaaaatac cgcatcaggc 7260gctcttccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg 7320tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa 7380agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg 7440cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga 7500ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg 7560tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg 7620gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc 7680gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg 7740gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca 7800ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt 7860ggcctaacta cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag 7920ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg 7980gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc 8040ctttgatctt ttctacgggg tccttcaact catcgatagt ttggctgtga gcaattatgt 8100gcttagtgca tctaacgctt gagttaagcc gcgccgcgaa gcggcgtcgg cttgaacgaa 8160tttctagcta gacattattt gccaacgacc ttcgtgatct cgcccttgac atagtggaca 8220aattcttcga gctggtcggc ccgggacgcg agacggtctt cttcttggcc cagataggct 8280tggcgcgctt cgaggatcac gggctggtat tgcgccggaa ggcgctccat cgcccagtcg 8340gcggcgacat ccttcggcgc gatcttgccg gtaaccgccg agtaccaaat ccggctcagc 8400gtaaggacca cattgcgctc atcgcccgcc caatccggcg gggagttcca cagggtcagc 8460gtctcgttca gtgcttcgaa cagatcctgt tccggcaccg ggtcgaaaag ttcctcggcc 8520gcggggccga cgagggccac gctatgctcc cgggccttgg tgagcaggat cgccagatca 8580atgtcgatgg tggccggttc aaagataccc gccagaatat cattacgctg ccattcgccg 8640aactggagtt cgcgtttggc cggatagcgc 
caggggatga tgtcatcgtg caccacaatc 8700gtcacctcaa ccgcgcgcag gatttcgctc tcgccggggg aggcggacgt ttccagaagg 8760tcgttgataa gcgcgcggcg cgtggtctcg tcgagacgga cggtaacggt gacaagcagg 8820tcgatgtccg aatggggctt aaggccgccg tcaacggcgc taccatacag atgcacggcg 8880aggagggtcg gttcgaggtg gcgctcgatg acacccacga cttccgacag ctgggtggac 8940acctcggcga tgaccgcttc acccatgatg tttaactttg ttttagggcg actgccctgc 9000tgcgtaacat cgttgctgct ccataacatc aaacatcgac ccacggcgta acgcgcttgc 9060tgcttggatg cccgaggcat agactgtacc ccaaaaaaac agtcataaca agccatgaaa 9120accgccactg cgttccatga atattcaaac aaacacatac agcgcgactt atcatggata 9180ttgacataca aatggacgaa cggataaacc ttttcacgcc cttttaaata tccgattatt 9240ctaataaacg ctcttttctc ttaggtttac ccgccaatat atcctgtcaa acactgatag 9300tttaaactga aggcgggaaa cgacaatctg atcactgatt agtaactaag gcctttaatt 9360aatctagagg cgcgccgggc cccctgcagg gagctcggcc ggccaattta aattgatatc 9420ggtacatcga ttacgccaag ctatcaactt tgtatagaaa agttgccatg attacgccaa 9480gcttggccac taaggccaat ttaaatctac taggccggcc caggtaccat ttccaactcc 9540tgactgagaa gtggatttca tatcaacatt agcaattagt agaatactat catctttcac 9600gctacaaaac attggtactt tggtaggtaa agatttgcaa acacgaataa gtaattaaga 9660aaggttcata cacattcaat gattctggat tcctacctta cgttatttgt ttcgaaatac 9720ctagatgaga gcatcttgtt atttattact acatattaat tttccctgtg taccttgtcg 9780tagtttaaat ttattatttt ttcaatcata aataaatata agaaatattt ttttcttaat 9840ataattttat tttatattta aaaataaatc ataatttgaa agagctacaa atttatacca 9900catgtgggaa gtattgttgg tttctccaac catacttatt gagaataact tgaatttata 9960ttcaacgtat taattgcttc acctttaacg tgccaaaata ataataataa aaaacttaaa 10020actactgtat taatcgcgtg tggttgaatg gaggcaaatt ctattctaaa aaagaaaagc 10080attaacaaaa ggagaaaaga aaaactgttg acacctgaca gcagtaacag ggaactggga 10140agtagcagta ggagtatttg cgtgttggtt tccaactctg gaatccaccg tgccaaactg 10200cgaatgcagg agaaatcgac acgtgtccat ttgcaggcgc gagttgaacg tgacaatgca 10260ccaccgccca gcatcgaacg cagccaagga ccacgtcgaa accacagtaa tccacgttcc 10320agtgctgcgc ggaacatggt cggtctttct aggagtggtt ggaatcacgc cagctaggac 
10380aaaccccatc aatcattggt cattatcaaa caaaacattt caaaaattca acatattacg 10440cctcgggacc cacctcccac tacacctcac cctcacttct attaactcga acacattcgg 10500gttataaatc cgcaaccctc cttctcactc actcactcac tcactcactc actcgcaagc 10560aaaaagaaag aatcccaggc gaggagaaag ttaattaacc atggctcata tggttggagc 10620agacgatatt gagtcattga gagtagagct tgcagagatc ggaagaagca tcagatcatc 10680attccggaga catacttcga gtttcagaag cagctcttca atatatgaag ttgaaaatga 10740tggtgatgtt aatgatcatg atgcagagta tgctctgcaa tgggctgaga ttgagagatt 10800accaactgtc aagcgaatga gatcgactct ccttgatgat ggcgatgagt ccatgaccga 10860gaaaggaaga agagtcgttg atgtcacaaa gcttggagcc gtggaacgtc atctgatgat 10920tgagaaactc atcaaacaca ttgagaatga taatctcaag ttgctcaaga aaatcaggag 10980aagaatagac agagtcggga tggagttacc gaccatagaa gtgaggtacg agagtttaaa 11040agtggtggcc gagtgcgagg ttgtcgaagg gaaggcactt ccaacactgt ggaacactgc 11100taagcgtgtt ttatctgaac tggtgaagct cactggtgca aaaacacatg aagccaagat 11160aaacattatt aatgatgtta atggcattat aaagccagga aggttaacac tgttgcttgg 11220tcctcctagc tgcggaaaaa caactttgtt aaaggccttg tctggaaatt tagaaaacaa 11280tctaaagtgt tcaggtgaaa tatcttacaa tggacacaga ctggatgagt ttgttcctca 11340gaaaacttca gcgtacataa gtcaatatga tctgcacatt gcagagatga cagtgaggga 11400gacagttgac ttctcagctc gttgtcaggg cgttggtagc cgaacagata ttatgatgga 11460agttagtaaa agagaaaagg aaaaaggaat cattcctgac acagaagtgg atgcttacat 11520gaaagcaatt tctgttgaag gactccaaag aagtctgcaa acagattaca ttttgaagat 11580tctcggactt gatatttgtg cagaaatatt gattggagat gtgatgagga gaggtatatc 11640aggaggtcaa aagaagcgtc ttaccacagc tgagatgatc gttggcccga caaaggctct 11700gtttatggat gaaataacaa

atggcctaga cagctccaca gcttttcaga ttgtcaaatc 11760tcttcagcag tttgctcaca tatcaagcgc tactgtactt gtttcgcttc ttcaacccgc 11820cccagaatcc tatgacctct ttgatgacat tatgctgatg gccaaaggaa gaatcgtgta 11880tcatggtcca cgcggtgaag tccttaactt ctttgaggat tgtggattcc gatgccctga 11940aaggaagggt gttgcagact ttctccagga ggttatatcc aaaaaagatc aagcacaata 12000ctggtggcac gaggatttac cttacagttt tgtctcggta gaaatgttgt cgaagaagtt 12060caaggacttg agtattggga aaaagatcga agacactctg tcaaagccat atgatagatc 12120caaaagccat aaggatgctt tgtccttcag tgtgtattct cttccaaact gggagctgtt 12180catagcatgc atatcaagag agtatcttct catgaagaga aactatttcg tctatatttt 12240caagactgct cagcttgtta tggccgcatt catcactatg acagtgttta tccgaacacg 12300gatgggtatt gatatcattc atggaaattc ttacatgagt gccctctttt tcgccctcat 12360tatacttctt gttgacggat tcccagagtt gtctatgacg gctcaacgtc tagccgtgtt 12420ttataagcag aagcagttgt gtttctatcc tgcatgggcg tatgcaatcc ctgcaacagt 12480gttaaaggtc cctctctcgt tctttgaatc tctcgtttgg acctgcctct catactatgt 12540cattggatac acccctgaag catccaggtt cttcaagcag ttcattctac tctttgctgt 12600tcacttcacc tcgatatcca tgttccggtg tctagctgca atcttccaga cagtagttgc 12660ttcaatcaca gctggcagtt ttggtatatt attcacattt gtctttgccg gtttcgtcat 12720tccaccacct tctatgccag catggctcaa gtggggtttc tgggcaaatc ctttgagtta 12780cggtgagatt gggttatcag taaacgagtt tcttgctcca aggtggaatc agatgcaacc 12840caataatttt accttaggac gaaccatact ccaaacccgt ggaatggact acaacggtta 12900catgtactgg gtatcattat gtgccttgtt gggtttcact gtgctcttca acatcatttt 12960cactctggct ctaacgttct tgaaatcacc cacatcatct cgagccatga tttcgcaaga 13020caaactctct gagctgcaag gaacagaaaa gtcaacagaa gattcttctg tcaggaaaaa 13080gaccacagac tcccctgtaa agaccgaaga agaagacaaa atggtcttac cattcaagcc 13140tctcactgta acatttcaag acttgaacta tttcgttgac atgccagtgg agatgagaga 13200ccaaggatat gatcagaaga aactacaact tctctcagat atcacaggag ctttccgtcc 13260cggaatccta acggcactaa tgggagtgag tggagctgga aaaaccactc ttctcgacgt 13320tctagccgga aggaaaacaa gcggatacat cgaaggagac attagaatca gtggcttccc 13380taaagtccaa gaaacattcg ctagagtctc 
aggctactgt gaacaaacag atattcactc 13440accaaacatc actgtagaag aatccgtaat ctactcggct tggcttcgtc tagctcctga 13500gatcgatgcc acaacaaaaa ccaaattcgt gaagcaagtg cttgagacga tcgaattaga 13560tgagattaaa gattcattgg tgggagtcac cggagttagt ggattatcga cggagcaaag 13620gaagagattg acgattgcgg tggagttggt ggcgaatccg tcgattatat ttatggatga 13680gccaacgacg gggctagacg caagagcagc tgccattgtt atgagagctg tgaagaacgt 13740cgctgatact ggacgaacca tcgtctgtac tattcatcag cctagtatcg acatttttga 13800agccttcgac gagctggtgc ttcttaaaag aggtggtcgc atgatctaca caggaccatt 13860aggccaacat tcacgtcaca ttatcgagta ttttgagagt gttcctgaaa ttcctaaaat 13920aaaagacaac cacaatccag caacatggat gcttgatgtt agttcacagt cggtagaaat 13980tgaacttggt gtcgatttcg caaaaatcta ccatgactct gctctttaca agcgaaactc 14040agagcttgtg aaacagttga gccagccaga ttcaggatca agtgatatac agtttaagag 14100aacctttgca caaagctggt ggggacaatt caaatctatt ctatggaaaa tgaacttgtc 14160ttattggaga agcccttctt ataacctaat gcgtatgatg cacactttag tctcttcttt 14220gatcttcggc gcacttttct ggaaacaagg ccaaaatcta gatactcaac agagtatgtt 14280cacagtattt ggagcgatct acggtttggt actcttctta gggataaaca attgtgcatc 14340agctcttcaa tatttcgaaa cagagagaaa tgttatgtac cgggaaagat tcgcagggat 14400gtactcagcg actgcttatg cattgggtca agtggtgact gagatacctt atatattcat 14460acaagctgcc gagtttgtga tcgtaacata tccaatgatc ggtttctatc cttcagccta 14520caaagtcttt tggtcactct actctatgtt ttgctcacta ctcactttca actaccttgc 14580gatgttcctc gtctccatca cgccaaactt catggttgcc gcgattcttc aatcgctctt 14640ttatgttggt ttcaaccttt tttcggggtt tttgatcccc caaacgcaag taccagggtg 14700gtggatttgg ttatattatc taacaccaac gtcttggaca ctcaacgggt ttatctcgtc 14760ccaatacggc gatattcatg aagagatcaa tgtctttgga caatccacga cggttgcaag 14820attcttgaaa gactattttg gatttcatca tgaccttttg gcggttaccg cggttgttca 14880aatcgctttt cccattgcct tagcttctat gtttgcattc ttcgtgggca aactcaactt 14940ccaacgaaga tgaggcgcgc ccctgcagat agactatact atgttttagc ctgcctgctg 15000gctagctact atgttatgtt atgttgtaaa ataaacacct gctaaggtat atctatctat 15060attttagcat ggctttctca ataaattgtc tttccttatc 
gtttactatc ttatacctaa 15120taatgaaata ataatatcac atatgaggaa cggggcaggt ttaggcatat atatacgagt 15180gtagggcgga gtggggtaag gcgcctacta ccggtaattc ccgggattag cggccgctag 15240tctgtgcgca cttgtatcct gcaggtcaat cgtttaaaca ctgtacggac cgtggcctaa 15300taggccggta cccaagtttg tacaaaaaag caggctcccg ggatacctgc aggttaggcc 15360ggcccaggta ccctagattc gacggtatcg ataagctcgc ggatccctga aagcgacgtt 15420ggatgttaac atctacaaat tgccttttct tatcgaccat gtacgtaagc gcttacgttt 15480ttggtggacc cttgaggaaa ctggtagctg ttgtgggcct gtggtctcaa gatggatcat 15540taatttccac cttcacctac gatggggggc atcgcaccgg tgagtaatat tgtacggcta 15600agagcgaatt tggcctgtag gatccctgaa agcgacgttg gatgttaaca tctacaaatt 15660gccttttctt atcgaccatg tacgtaagcg cttacgtttt tggtggaccc ttgaggaaac 15720tggtagctgt tgtgggcctg tggtctcaag atggatcatt aatttccacc ttcacctacg 15780atggggggca tcgcaccggt gagtaatatt gtacggctaa gagcgaattt ggcctgtagg 15840atccctgaaa gcgacgttgg atgttaacat ctacaaattg ccttttctta tcgaccatgt 15900acgtaagcgc ttacgttttt ggtggaccct tgaggaaact ggtagctgtt gtgggcctgt 15960ggtctcaaga tggatcatta atttccacct tcacctacga tggggggcat cgcaccggtg 16020agtaatattg tacggctaag agcgaatttg gcctgtagga tccgcgagct ggtcaatccc 16080attgcttttg aagcagctca acattgatct ctttctcgat cgagggagat ttttcaaatc 16140agtgcgcaag acgtgacgta agtatccgag tcagttttta tttttctact aatttggtcg 16200tttatttcgg cgtgtaggac atggcaaccg ggcctgaatt tcgcgggtat tctgtttcta 16260ttccaacttt ttcttgatcc gcagccatta acgacttttg aatagatacg ctgacacgcc 16320aagcctcgct agtcaaaagt gtaccaaaca acgctttaca gcaagaacgg aatgcgcgtg 16380acgctcgcgg tgacgccatt tcgccttttc agaaatggat aaatagcctt gcttcctatt 16440atatcttccc aaattaccaa tacattacac tagcatctga atttcataac caatctcgat 16500acaccaaatc gattaattaa ccatggcgac gacaacaaca gaagcaacga agacatcatc 16560gaccaatgga gaagatcaga agcagtctca gaatcttcga catcaagaag ttggtcacaa 16620gagtctctta cagagcgatg atctctacca gtatatactg gagacaagtg tgtatcctag 16680agaaccagaa tcaatgaagg aactcaggga agtgacagca aaacatccat ggaacataat 16740gaccacatca gctgatgaag gacagttctt aaacatgctt atcaagctcg 
ttaacgccaa 16800gaacacaatg gagatcggag tttacactgg ctactctctt ctcgccaccg ctcttgctct 16860ccctgaagac ggcaaaattc tggctatgga tgtcaacaga gagaattacg aattgggttt 16920accgatcatt gagaaagccg gcgttgctca caagatcgac ttcagggaag gccctgctct 16980tcccgttctt gatgaaatcg ttgctgacga gaagaaccat ggaacatatg actttatatt 17040cgttgatgct gacaaagaca actacatcaa ctaccacaag cgtttgatcg atcttgtgaa 17100aattggagga gtgattggct acgacaacac tctgtggaat ggttctgtcg tggctcctcc 17160tgatgcacca atgaggaagt acgttcgtta ctacagagac tttgttcttg agcttaacaa 17220ggctcttgct gctgaccctc ggatcgagat ctgtatgctc cctgttggtg atggaatcac 17280tatctgccgt cggatcagtt gaggcgcgcc gatcgttcaa acatttggca ataaagtttc 17340ttaagattga atcctgttgc cggtcttgcg atgattatca tataatttct gttgaattac 17400gttaagcatg taataattaa catgtaatgc atgacgttat ttatgagatg ggtttttatg 17460attagagtcc cgcaattata catttaatac gcgatagaaa acaaaatata gcgcgcaaac 17520taggataaat tatcgcgcgc ggtgtcatct atgttactag atcggcgcct aagtttaaac 17580taagcggccg cacccagctt tcttgtacaa agtggccatg attacgccaa gcttggccac 17640taaggccaat ttaaatctac taggccggcc caggtaccaa ttcgaatcca aaaattacgg 17700atatgaatat aggcatatcc gtatccgaat tatccgtttg acagctagca acgattgtac 17760aattgcttct ttaaaaaagg aagaaagaaa gaaagaaaag aatcaacatc agcgttaaca 17820aacggccccg ttacggccca aacggtcata tagagtaacg gcgttaagcg ttgaaagact 17880cctatcgaaa tacgtaaccg caaacgtgtc atagtcagat cccctcttcc ttcaccgcct 17940caaacacaaa aataatcttc tacagcctat atatacaacc cccccttcta tctctccttt 18000ctcacaattc atcatctttc tttctctacc cccaatttta agaaatcctc tcttctcctc 18060ttcattttca aggtaaatct ctctctctct ctctctctct gttattcctt gttttaatta 18120ggtatgtatt attgctagtt tgttaatctg cttatcttat gtatgcctta tgtgaatatc 18180tttatcttgt tcatctcatc cgtttagaag ctataaattt gttgatttga ctgtgtatct 18240acacgtggtt atgtttatat ctaatcagat atgaatttct tcatattgtt gcgtttgtgt 18300gtaccaatcc gaaatcgttg atttttttca tttaatcgtg tagctaattg tacgtataca 18360tatggatcta cgtatcaatt gttcatctgt ttgtgtttgt atgtatacag atctgaaaac 18420atcacttctc tcatctgatt gtgttgttac atacatagat atagatctgt tatatcattt 
18480tttttattaa ttgtgtatat atatatgtgc atagatctgg attacatgat tgtgattatt 18540tacatgattt tgttatttac gtatgtatat atgtagatct ggactttttg gagttgttga 18600cttgattgta tttgtgtgtg tatatgtgtg ttctgatctt gatatgttat gtatgtgcag 18660ttaattaacc atggctccaa cactcttgac aacccaattc tcaaatccag ctgaagtaac 18720cgactttgta gtctacaaag gaaatggtgt taagggttta tcagaaacag gaatcaaagc 18780tcttccagaa caatacattc agccacttga agaacgactc atcaacaaat tcgtcaacga 18840aacagatgaa gccattccag ttatcgatat gtcgaaccct gatgaggaca gagtcgctga 18900agctgtttgt gatgctgctg agaaatgggg gttctttcaa gtgatcaatc atggagttcc 18960tttggaagtt cttgatgacg tcaaggctgc gactcacaag ttcttcaatc tccctgttga 19020agagaagcgc aagttcacta aagagaattc gctgtcgacg actgttaggt ttgggacgag 19080ttttagtcct cttgcagagc aagcgcttga gtggaaagat tatctcagcc tcttctttgt 19140ctctgaagct gaagctgaac agttctggcc tgatatctgc aggaatgaaa cgttagagta 19200cattaacaag tcaaagaaga tggtgaggag gcttctagag tatttgggaa agaatctcaa 19260tgttaaagag cttgacgaga cgaaagaatc actctttatg ggctcgattc gagtcaacct 19320taactactac cccatctgcc ctaatccgga cctaacagtt ggtgttggtc gccactcaga 19380cgtctcttct ctcaccattc tcttacaaga ccagatcggt ggtctacacg tgcgttctct 19440ggcttcaggg aactgggttc acgtgcctcc ggttgctgga tcttttgtga tcaacatcgg 19500agatgcgatg cagatcatga gcaatggtct gtacaagagc gtggagcatc gtgtcttagc 19560caatggttac aataatagaa tctctgttcc tatctttgtg aacccaaaac cagagtcagt 19620tattggtcct ctacctgagg tgattgcaaa cggagaggaa ccgatttaca gagacgtcct 19680gtactctgat tacgtcaagt atttcttcag gaaggcacac gatggaaaga aaaccgtcga 19740ttacgccaag atctgaggcg cgccctgctt taatgagata tgcgagacgc ctatgatcgc 19800atgatatttg ctttcaattc tgttgtgcac gttgtaaaaa acctgagcat gtgtagctca 19860gatccttacc gccggtttcg gttcattcta atgaatatat cacccgttac tatcgtattt 19920ttatgaataa tattctccgt tcaatttact gattgtggcg cctactaccg gtaattcccg 19980ggattagcgg ccgctagtct gtgcgcactt gtatcctgca ggtcaatcgt ttaaacactg 20040tacggaccgt ggcctaatag gccggtaccc aacttta 200777651DNAArtificialprimer 76ggggacaagt ttgtacaaaa aagcaggctt aatggctcca acactcttga c 
51

SEQ ID NO: 77 (length 49, DNA, Artificial, primer)
ggggaccact ttgtacaaga aagctgggta tcagatcttg gcgtaatcg 49

SEQ ID NO: 78 (length 55, DNA, Artificial, primer)
ggggacaagt ttgtacaaaa aagcaggctc atatttttac aacaattacc aacaa 55

SEQ ID NO: 79 (length 47, DNA, Artificial, primer)
ggggacaact tttgtataca aagttgtctt gtcatcgtcg tccttgt 47

SEQ ID NO: 80 (length 48, DNA, Artificial, primer)
ggggacaact ttgtatacaa aagttgcaat ggctccaaca ctcttgac 48

SEQ ID NO: 81 (length 20, DNA, Artificial, primer)
ctcagcctct tctttgtctc 20

SEQ ID NO: 82 (length 19, DNA, Artificial, primer)
aagcctcctc accatcttc 19

SEQ ID NO: 83 (length 23, DNA, Artificial, primer)
atggcgacga caacaacaga agc 23

SEQ ID NO: 84 (length 25, DNA, Artificial, primer)
gccaatcact cctccaattt tcaca 25

SEQ ID NO: 85 (length 25, DNA, Artificial, primer)
gatcgactct ccttgatgat ggcga 25

SEQ ID NO: 86 (length 25, DNA, Artificial, primer)
cgcactcggc caccactttt aaact 25

SEQ ID NO: 87 (length 24, DNA, Artificial, primer)
ctcgcaacaa tcgaactcgc caaa 24

SEQ ID NO: 88 (length 25, DNA, Artificial, primer)
tcggcaaatt ccacaaagag ttcca 25


