Patent application title: PLANT SEEDS WITH ALTERED STORAGE COMPOUND LEVELS, RELATED CONSTRUCTS AND METHODS INVOLVING GENES ENCODING OXIDOREDUCTASE MOTIF POLYPEPTIDES
Inventors:
Knut Meyer (Wilmington, DE, US)
Kevin L. Stecca (New Castle, DE, US)
Assignees:
E.I. DU PONT DE NEMOURS AND COMPANY
IPC8 Class: AC12N1587FI
USPC Class:
800278
Class name: Multicellular living organisms and unmodified parts thereof and related processes method of introducing a polynucleotide molecule into or rearrangement of genetic material within a plant or plant part
Publication date: 2011-09-08
Patent application number: 20110219474
Abstract:
This invention is in the field of plant molecular biology. More
specifically, this invention pertains to isolated nucleic acid fragments
encoding ORM proteins in plants and seeds and the use of such fragments
to modulate expression of a gene encoding ORM protein activity in a
transformed host cell.
Claims:
1. A transgenic plant comprising a recombinant DNA construct comprising a
polynucleotide operably linked to at least one regulatory element,
wherein said polynucleotide encodes a polypeptide having an amino acid
sequence of at least 70% sequence identity, based on the Clustal V method
of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46,
48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or
117 and wherein seed obtained from said transgenic plant has an altered,
i.e., increased or decreased, oil, protein, starch and/or soluble
carbohydrate content when compared to a control plant not comprising said
recombinant DNA construct.
2. A transgenic seed obtained from the transgenic plant of claim 1 comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein said transgenic seed has an altered oil, protein, starch and/or soluble carbohydrate content when compared to a seed from a control plant not comprising said recombinant DNA construct.
3. A transgenic seed comprising: a recombinant DNA construct comprising: (a) a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 or (b) a suppression DNA construct comprising at least one regulatory element operably linked to: (i) all or part of: (A) a nucleic acid sequence encoding a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (B) a full complement of the nucleic acid sequence of (b)(i)(A); or (ii) a region derived from all or part of a sense strand or antisense strand of a target gene of interest, said region having a nucleic acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to said all or part of a sense strand or antisense strand from which said region is derived, and wherein said target gene of interest encodes an ORM protein, and wherein said plant has an altered, increased or decreased oil, protein, starch and/or soluble carbohydrate content when compared to a control plant not comprising said recombinant DNA construct.
4. The transgenic seed of claim 1, wherein the oil content is increased by at least 2% when compared to the oil content of a non-transgenic seed.
5. A transgenic seed comprising a recombinant DNA construct comprising: (a) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (b) the full-length complement of (a): wherein (a) or (b) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant and further wherein said seed has an increase in oil content of at least 2%, on a dry-weight basis, as compared to seed obtained from a non-transgenic plant.
6. A method for producing a transgenic plant, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and regenerating a plant from the transformed plant cell.
7. A method for producing transgenic seeds, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a transgenic seed obtained from a non-transgenic plant.
8. A method for producing transgenic seeds, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an increased starch content of at least 0.5% as compared to a transgenic seed obtained from a non-transgenic plant.
9. A method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a transgenic seed obtained from a non-transgenic plant.
10. A method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an increase in oil content of at least 2%, on a dry-weight basis, as compared to a transgenic seed obtained from a non-transgenic plant.
11. The transgenic seed of any one of claim 1, 2, 3, 4, or 5, wherein the transgenic seed is obtained from a monocot or dicot plant.
12. The transgenic seed of any one of claim 1, 2, 3, 4, or 5, wherein the transgenic seed is obtained from a maize or soybean plant.
13. The transgenic seed of any one of claim 1, 2, 3, 4, or 5, wherein the at least one regulatory element is a seed-specific or seed-preferred promoter.
14. The transgenic seed of any one of claim 1, 2, 3, 4, or 5, wherein at least one regulatory element is an endosperm or embryo-specific promoter.
15. The method of any one of claim 6, 7, 8, 9, or 10, wherein the transgenic seed is obtained from a transgenic dicot plant comprising in its genome the recombinant construct.
16. The method of any one of claim 6, 7, 8, 9, or 10, wherein the dicot plant is soybean.
17. Transgenic seed obtained by the method of any one of claims 6, 7, 8, 9, or 10.
18. A product and/or by-product obtained from the transgenic seed of any one of claim 6, 7, 8, 9, or 10.
19. The transgenic seed obtained by the method of any one of claim 6, 7, 8, 9, 10 or 11, wherein the transgenic seed is obtained from a monocot or dicot plant.
20. A product and/or by-product from transgenic seed of claim 2 wherein the plant is maize or soybean.
21. A product and/or by-product from the transgenic seed of claim 3, wherein the plant is maize or soybean.
22. A product and/or by-product from the transgenic seed of claim 4, wherein the plant is maize or soybean.
23. A product and/or by-product from the transgenic seed of claim 5, wherein the plant is maize or soybean.
24. An isolated polynucleotide comprising: (a) a nucleotide sequence encoding a polypeptide required for altering, i.e., increasing or decreasing, oil, protein, starch and/or soluble carbohydrate content in a plant, wherein, based on the Clustal V method of alignment with pairwise alignment default parameters of KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5, the polypeptide has an amino acid sequence of at least 70% sequence identity when compared to SEQ ID NO: 32, 102, 104, 113, or 116; or (b) the full complement of the nucleotide sequence of (a).
25. The polynucleotide of claim 24, wherein the amino acid sequence of the polypeptide comprises SEQ ID NO: 32, 102, 104, 113, or 116.
26. The polynucleotide of claim 24, wherein the nucleotide sequence comprises SEQ ID NO:31, 101, 103, 112, or 115.
27. A plant or seed comprising a recombinant DNA construct, wherein the recombinant DNA construct comprises the polynucleotide of any one of claims 24 to 26 operably linked to at least one regulatory sequence.
Description:
FIELD OF THE INVENTION
[0001] This invention is in the field of plant molecular biology. More specifically, this invention pertains to isolated nucleic acid fragments encoding oxidoreductase motif proteins in plants and seeds and the use of such fragments to modulate expression of a gene encoding oxidoreductase activity.
BACKGROUND OF THE INVENTION
[0002] At maturity, about 40% of soybean seed dry weight is protein and 20% extractable oil. These constitute the economically valuable products of the soybean crop. Plant oils, for example, are the most energy-rich biomass available from plants; they have twice the energy content of carbohydrates. It also requires very little energy to extract plant oils and convert them to fuels. Of the remaining 40% of seed weight, about 10% is soluble carbohydrate. The soluble carbohydrate portion contributes little to the economic value of soybean seeds, and the main component of the soluble carbohydrate fraction, raffinosaccharides, is deleterious both to processing and to the food value of soybean meal in monogastric animals (Coon et al., (1988) Proceedings Soybean Utilization Alternatives, Univ. of Minnesota, pp. 203-211).
[0003] As the pathways of storage compound biosynthesis in seeds are becoming better understood, it is clear that it may be possible to modulate the size of the storage compound pools in plant cells by altering the catalytic activity of specific enzymes in the oil, starch and soluble carbohydrate biosynthetic pathways (Taiz L., et al. Plant Physiology; The Benjamin/Cummings Publishing Company: New York, 1991). For example, studies investigating the over-expression of LPAT and DAGAT showed that the final steps acylating the glycerol backbone exert significant control over flux to lipids in seeds. Seed oil content could also be increased in oil-seed rape by over-expression of a yeast glycerol-3-phosphate dehydrogenase, whereas over-expression of the individual genes involved in de novo fatty acid synthesis in the plastid, such as acetyl-CoA carboxylase and fatty acid synthase, did not substantially alter the amount of lipids accumulated (Vigeolas H., et al. Plant Biotechnology J. 5, 431-441 (2007)). A low-seed-oil mutant, wrinkled 1, has been identified in Arabidopsis. The mutation apparently causes a deficiency in the seed-specific regulation of carbohydrate metabolism (Focks, Nicole et al., Plant Physiol. (1998), 118(1), 91-101). There is a continued interest in identifying the genes that encode proteins that can modulate the synthesis of storage compounds, such as oil, protein, starch and soluble carbohydrates, in plants.
[0004] The biochemical term oxidoreductase refers to enzymes involved in the transfer of electrons from one molecule (the reductant, also called the hydrogen or electron donor) to another (the oxidant, also called the hydrogen or electron acceptor). For some oxidoreductase proteins catalytic properties are known, while other proteins are identified only by the presence of a motif also found in known oxidoreductase enzymes. Small proteins, 10-30 kDa in size, with an oxidoreductase motif (ORM) and unknown catalytic properties are prevalent in eukaryotes ranging from unicellular yeast and algae to the animal and plant kingdoms. Yoshikawa et al. (FEMS Yeast Research (2009), 9(1), 32-44) disclose that a Saccharomyces cerevisiae strain with a disruption of YPL107W, which encodes a protein with an oxidoreductase motif and mitochondrial localization, is hypersensitive to osmotic and ethanol stress. Although proteins with an oxidoreductase motif closely related to that of YPL107W have been identified in every plant that was subjected to in-depth genome or EST sequencing, few studies have been conducted on the role of these proteins. In view of the ubiquitous nature of genes encoding ORM proteins in plants, further investigation of their role in plant growth and development, and specifically in the regulation of storage compound content in seed, is of great interest.
SUMMARY OF THE INVENTION
[0005] In a first embodiment the present invention concerns a transgenic plant comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein seeds from said transgenic plant have an altered oil, protein, starch and/or soluble carbohydrate content when compared to seeds from a control plant not comprising said recombinant DNA construct.
[0006] In a second embodiment the present invention concerns transgenic seed comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein said transgenic seed has an altered oil, protein, starch and/or soluble carbohydrate content when compared to a control seed not comprising said recombinant DNA construct.
[0007] In a third embodiment the present invention concerns transgenic seed comprising: a recombinant DNA construct comprising: (a) a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (b) a suppression DNA construct comprising at least one regulatory element operably linked to: (i) all or part of: (A) a nucleic acid sequence encoding a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (B) a full complement of the nucleic acid sequence of (b)(i)(A); or (ii) a region derived from all or part of a sense strand or antisense strand of a target gene of interest, said region having a nucleic acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to said all or part of a sense strand or antisense strand from which said region is derived, and wherein said target gene of interest encodes an ORM protein, and wherein said plant has an altered oil, protein, starch and/or soluble carbohydrate content when compared to a control plant not comprising said recombinant DNA construct.
[0008] In a fourth embodiment the invention concerns transgenic seed having an increased oil content of at least 2% on a dry-weight basis when compared to the oil content of a non-transgenic seed, wherein said transgenic seed comprises a recombinant DNA construct comprising: (a) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (b) the full-length complement of (a): wherein (a) or (b) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant and further wherein said seed has an increase in oil content of at least 2% on a dry-weight basis, as compared to seed obtained from a non-transgenic plant.
[0009] In a fifth embodiment the invention concerns transgenic seed comprising a recombinant DNA construct comprising: (a) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (b) the full-length complement of (a): wherein (a) or (b) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant and further wherein said seed has an increase in oil content of at least 2% on a dry-weight basis, as compared to seed obtained from a non-transgenic plant.
[0010] In a sixth embodiment the present invention concerns a method for producing transgenic seeds, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a transgenic seed obtained from a non-transgenic plant.
[0011] In a seventh embodiment this invention concerns a method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a transgenic seed obtained from a non-transgenic plant.
[0012] In an eighth embodiment, the present invention concerns a method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an increase in oil content of at least 2% on a dry-weight basis, as compared to a transgenic seed obtained from a non-transgenic plant.
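The selection step (c) can be illustrated with a minimal Python sketch. It is not part of the disclosed method. The text does not say whether "at least 2%" means percentage points of seed dry weight or a relative increase; the sketch assumes percentage points. The class name, function name, line identifiers and oil values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SeedLot:
    line_id: str           # hypothetical identifier of a transgenic line
    oil_pct_dry_wt: float  # oil content as a percentage of seed dry weight


def select_high_oil(transgenic, control_oil_pct, min_increase=2.0):
    """Keep lots whose oil content exceeds the control by at least
    `min_increase` percentage points (an assumed reading of "at least 2%")."""
    return [
        lot for lot in transgenic
        if lot.oil_pct_dry_wt - control_oil_pct >= min_increase
    ]


# Illustrative values only; these are not measurements from the application.
lots = [SeedLot("event_A", 22.5), SeedLot("event_B", 20.4), SeedLot("event_C", 23.1)]
selected = select_high_oil(lots, control_oil_pct=20.0)
print([lot.line_id for lot in selected])
```

If a relative increase were intended instead, the comparison would become a ratio against the control value rather than a difference.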
[0013] In a ninth embodiment the invention concerns a transgenic seed comprising: a recombinant DNA construct comprising: (a) a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 or (b) a suppression DNA construct comprising at least one regulatory element operably linked to: (i) all or part of: (A) a nucleic acid sequence encoding a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (B) a full complement of the nucleic acid sequence of (b)(i)(A); or (ii) a region derived from all or part of a sense strand or antisense strand of a target gene of interest, said region having a nucleic acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to said all or part of a sense strand or antisense strand from which said region is derived, and wherein said target gene of interest encodes an ORM protein, and wherein said plant has an altered, increased or decreased oil, protein, starch and/or soluble carbohydrate content when compared to a control plant not comprising said recombinant DNA construct.
[0014] In a tenth embodiment, the present invention includes an isolated polynucleotide comprising: (a) a nucleotide sequence encoding a polypeptide required for altering, i.e., increasing or decreasing, oil, protein, starch and/or soluble carbohydrate content, wherein the polypeptide has an amino acid sequence of at least 70% sequence identity when compared to SEQ ID NO: 32, 102, 104, 113, or 116, or (b) a full complement of the nucleotide sequence, wherein the full complement and the nucleotide sequence consist of the same number of nucleotides and are 100% complementary. The polypeptide may comprise the amino acid sequence of SEQ ID NO: 32, 102, 104, 113, or 116. The nucleotide sequence may comprise the nucleotide sequence of SEQ ID NO:31, 101, 103, 112, or 115.
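The "full complement" condition (same number of nucleotides, 100% complementary) can be sketched in Python as follows. This is an illustrative sketch, not a definition taken from the application. It assumes unambiguous DNA bases (A, C, G, T) and writes the complementary strand 5' to 3', i.e. as the reverse complement. The function names and the example fragment are hypothetical.

```python
# Watson-Crick pairing for unambiguous DNA bases (assumption: no IUPAC codes).
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def full_complement(seq: str) -> str:
    """Return the complementary strand written 5'->3' (reverse complement)."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def is_full_complement(seq: str, candidate: str) -> bool:
    """True if `candidate` has the same length as `seq` and pairs with it
    at every position."""
    return len(seq) == len(candidate) and candidate.upper() == full_complement(seq)


example = "ATGGCC"  # hypothetical fragment, not taken from the Sequence Listing
print(full_complement(example))
print(is_full_complement(example, "GGCCAT"))
```

A candidate that is shorter than the reference, or that mismatches at any position, fails the check.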
[0015] In another embodiment, the present invention concerns a recombinant DNA construct comprising any of the isolated polynucleotides of the present invention operably linked to at least one regulatory sequence, and a cell, a plant, and a seed comprising the recombinant DNA construct. The cell may be eukaryotic, e.g., a yeast, insect or plant cell, or prokaryotic, e.g., a bacterial cell.
[0016] Seeds obtained from monocot and dicot plants (such as for example maize and soybean, respectively) comprising the recombinant constructs of the invention are within the scope of the present invention. Also included are seed-specific or seed-preferred promoters driving the expression of the nucleic acid sequences of the invention. Embryo or endosperm specific promoters driving the expression of the nucleic acid sequences of the invention are also included. Furthermore, the methods of the present invention are useful for obtaining transgenic seeds from monocot plants (such as maize and rice) and dicot plants (such as soybean and canola).
[0017] Also within the scope of the invention are product(s) and/or by-product(s) obtained from the transgenic seed obtained from monocot or dicot plants, such as maize and soybean, respectively.
[0018] In another embodiment, this invention relates to a method for suppressing in a plant the level of expression of a gene encoding a polypeptide having ORM protein activity, wherein the method comprises transforming a monocot or dicot plant with any of the nucleic acid fragments of the present invention.
BRIEF DESCRIPTION OF THE DRAWING AND SEQUENCE LISTING
[0019] The invention can be more fully understood from the following detailed description and the accompanying Drawing and Sequence Listing which form a part of this application.
[0020] FIG. 1A-1B shows an alignment of the amino acid sequences of ORM proteins encoded by the nucleotide sequences derived from the following: Brassica rapa (SEQ ID NO:26, 28, and 30); Helianthus annuus (SEQ ID NO:32); Ricinus communis (SEQ ID NO:34); Glycine max (SEQ ID NO:36 and 38); Zea mays (SEQ ID NO:40, 42, 44, and 66, which corresponds to NCBI GI NO:195615148); Oryza sativa (SEQ ID NO:46); Sorghum bicolor (SEQ ID NO:48); Populus trichocarpa (SEQ ID NO:64; NCBI GI NO.:118481427); SEQ ID NO:65 corresponding to SEQ ID NO:36271 from US Patent Application US20060123505; SEQ ID NO:67 corresponding to SEQ ID NO:233249 of US Patent Application US20040214272; and Arabidopsis thaliana (SEQ ID NO:69, At5G17280). For the alignment, amino acids which are conserved among all sequences at a given position are indicated with an asterisk (*). Dashes are used by the program to maximize the alignment of the sequences. A conserved sequence motif is boxed in the alignment and corresponds to SEQ ID NO:70.
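The asterisk convention described for the alignment can be sketched in Python as follows. This illustrates the marking rule only and is not the alignment program used to prepare the figures. It assumes the sequences are already aligned to equal length with dashes as gap characters, and that a column containing a gap is not counted as conserved. The function name and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def conservation_line(aligned):
    """Return a string with '*' under every alignment column in which all
    sequences carry the same residue, and a space elsewhere."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    marks = []
    for column in zip(*aligned):
        first = column[0]
        # Assumption: a gap ('-') never counts as a conserved residue.
        conserved = first != "-" and all(res == first for res in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)


toy_alignment = ["MKT-LLV", "MKTALLV", "MRT-LIV"]  # hypothetical sequences
for seq in toy_alignment:
    print(seq)
print(conservation_line(toy_alignment))
```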
[0021] FIG. 2 shows a chart of the percent sequence identity for each pair of amino acid sequences displayed in FIGS. 1A-1B.
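A pairwise percent-identity chart of this kind can be approximated with the Python sketch below. The identity values in the figures were obtained with the Clustal V method, whose scoring is not reproduced here. The sketch assumes a simpler definition: identical residues divided by the number of aligned positions where neither sequence has a gap. The function names, the 70% threshold check and the toy sequences are illustrative assumptions.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def identity_chart(aligned: dict) -> dict:
    """Map each unordered pair of sequence names to its percent identity."""
    names = list(aligned)
    return {
        (p, q): percent_identity(aligned[p], aligned[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


# Hypothetical aligned sequences; not taken from the Sequence Listing.
toy = {"seq1": "MKT-LLV", "seq2": "MKTALLV", "seq3": "MRT-LIV"}
chart = identity_chart(toy)
for pair, value in chart.items():
    print(pair, round(value, 1), "meets 70%" if value >= 70.0 else "below 70%")
```

Other conventions, such as dividing by the full alignment length or by the shorter sequence length, give different percentages, so a chart is only comparable to a claim threshold when the same method and parameters are used.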
[0022] FIG. 3A-3C shows an alignment of the amino acid sequences of ORM proteins encoded by the nucleotide sequences derived from the following: Brassica rapa (SEQ ID NO:26, 28, and 30); Helianthus annuus (SEQ ID NO:32); Ricinus communis (SEQ ID NO:34); Glycine max (SEQ ID NO:36 and 38); Zea mays (SEQ ID NO:40, 42, 44, and 66, which corresponds to NCBI GI NO:195615148); Oryza sativa (SEQ ID NO:46); Sorghum bicolor (SEQ ID NO:48); Populus trichocarpa (SEQ ID NO:64; NCBI GI NO.:118481427); SEQ ID NO:65 corresponding to SEQ ID NO:36271 from US Patent Application US20060123505; SEQ ID NO:67 corresponding to SEQ ID NO:233249 of US Patent Application US20040214272; Arabidopsis thaliana (SEQ ID NO:69, At5G17280); Guar (SEQ ID NO:102, Ids2c.pk014.b22); Bahia (SEQ ID NO:104, contig); Arabidopsis lyrata (SEQ ID NO:105, NCBI GI NO:297807753); Picea sitchensis (SEQ ID NO:106, NCBI GI NO:116782186); Hordeum vulgare (SEQ ID NO:108); Raphanus sativus (SEQ ID NO:110); Dennstaedtia punctiloba (SEQ ID NO:113); and Osmunda cinnamomea (SEQ ID NO:116). For the alignment, amino acids which are conserved among all sequences at a given position are indicated with an asterisk (*). Dashes are used by the program to maximize the alignment of the sequences. A conserved sequence motif is boxed in the alignment and corresponds to SEQ ID NO:117.
[0023] FIG. 4 shows a chart of the percent sequence identity for each pair of amino acid sequences displayed in FIGS. 3A-3C.
[0024] The sequence descriptions and Sequence Listing attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825.
SEQ ID NO:1 corresponds to the nucleotide sequence of vector PHSbarENDS2. SEQ ID NO:2 corresponds to the nucleotide sequence of vector pUC9 and a polylinker. SEQ ID NO:3 corresponds to the nucleotide sequence of vector pKR85. SEQ ID NO:4 corresponds to the nucleotide sequence of vector pKR278. SEQ ID NO:5 corresponds to the nucleotide sequence of vector pKR407. SEQ ID NO:6 corresponds to the nucleotide sequence of vector pKR1468. SEQ ID NO:7 corresponds to the nucleotide sequence of vector pKR1475. SEQ ID NO:8 corresponds to the nucleotide sequence of vector pKR92. SEQ ID NO:9 corresponds to the nucleotide sequence of vector pKR1478. SEQ ID NO:10 corresponds to SAIFF and genomic DNA of lo17849. SEQ ID NO:11 corresponds to the forward primer ORM ORF FWD. SEQ ID NO:12 corresponds to the reverse primer ORM ORF REV. SEQ ID NO:13 corresponds to the nucleotide sequence of vector pENTR comprising ORM. SEQ ID NO:14 corresponds to the nucleotide sequence of vector pKR1478-ORM. SEQ ID NO:15 corresponds to the nucleotide sequence of PKR1482. SEQ ID NO:16 corresponds to the AthLcc In forward primer. SEQ ID NO:17 corresponds to the AthLcc In reverse primer. SEQ ID NO:18 corresponds to the PCR product with the laccase intron. SEQ ID NO:19 corresponds to the nucleotide sequence of PSM1318. SEQ ID NO:20 corresponds to the nucleotide sequence of pMBL18 ATTR12 INT. SEQ ID NO:21 corresponds to the nucleotide sequence of PSM1789. SEQ ID NO:22 corresponds to the nucleotide sequence of pMBL18 ATTR12 INT ATTR21. SEQ ID NO:23 corresponds to the nucleotide sequence of vector pKR1480. SEQ ID NO:24 corresponds to the nucleotide sequence of pKR1482-ORM.
[0025] Table 1 lists the polypeptides that are described herein, the designation of the clones that comprise the nucleic acid fragments encoding polypeptides representing all or a substantial portion of these polypeptides, and the corresponding identifier (SEQ ID NO:) as used in the attached Sequence Listing. Table 1 also identifies the cDNA clones as individual ESTs ("EST"), the sequences of the entire cDNA inserts comprising the indicated cDNA clones ("FIS"), contigs assembled from two or more ESTs ("Contig"), contigs assembled from an FIS and one or more ESTs ("Contig*"), or sequences encoding the entire or functional protein derived from an FIS, a contig, an EST and PCR, or an FIS and PCR ("CGS").
TABLE 1
ORM Proteins

                                                          SEQ ID NO:     SEQ ID NO:
Protein (Plant Source)     Clone Designation    Status    (Nucleotide)   (Amino Acid)
ORM (Brassica rapa)        TC44737              CGS       25             26
ORM (Brassica rapa)        TC52165              CGS       27             28
ORM (Brassica rapa)        TC52879              CGS       29             30
ORM (Helianthus annuus)    hso1c.pk014.c16      CGS       31             32
ORM (Ricinus communis)     XM_002533611         CGS       33             34
ORM (Glycine max)          Glyma02g05870        CGS       35             36
ORM (Glycine max)          Glyma16g24560        CGS       37             38
ORM (Zea mays)             GRMZM2G1312101       CGS       39             40
ORM (Zea mays)             pco642986            CGS       41             42
ORM (Zea mays)             pco597536            CGS       43             44
ORM (Oryza sativa)         Os09g36120           CGS       45             46
ORM (Sorghum bicolor)      Sb02g030770          CGS       47             48
SEQ ID NO:49 is the nucleic acid sequence of the linker described in Example 19. SEQ ID NO:50 is the nucleic acid sequence of vector pKS133 described in Example 18. SEQ ID NO:51 corresponds to the single copy of ELVISLIVES. SEQ ID NO:52 corresponds to two copies of ELVISLIVES. SEQ ID NO:53 corresponds to the primer described in Example 20. SEQ ID NO:54 corresponds to the primer described in Example 20. SEQ ID NO:55 corresponds to a synthetic PCR primer (SA195). SEQ ID NO:56 corresponds to a synthetic PCR primer (SA196). SEQ ID NO:57 corresponds to a synthetic PCR primer (SA200). SEQ ID NO:58 corresponds to a synthetic PCR primer (SA201). SEQ ID NO:59 corresponds to pGemTA. SEQ ID NO:60 corresponds to pGemTB. SEQ ID NO:61 corresponds to pGemT-ORM-HP. SEQ ID NO:62 corresponds to pKS433. SEQ ID NO:63 corresponds to pKS120. SEQ ID NO:64 corresponds to NCBI GI NO: 118481427 (Populus trichocarpa). SEQ ID NO:65 corresponds to SEQ ID NO:36271 from US Patent Application US20060123505. SEQ ID NO:66 corresponds to NCBI GI NO: 195615148 (Zea mays). SEQ ID NO:67 corresponds to SEQ ID NO:233249 of US20040214272. SEQ ID NO:68 corresponds to the nucleotide sequence of At5G17280. SEQ ID NO:69 corresponds to the amino acid sequence encoded by SEQ ID NO:68. SEQ ID NO:70 is a conserved sequence motif associated with sequences included in the present invention as shown in FIGS. 1A and 1B. SEQ ID NO:71 corresponds to the SA3 11 primer. SEQ ID NO:72 corresponds to the SA3 12 primer. SEQ ID NO:73 corresponds to the SA3 13 primer. SEQ ID NO:74 corresponds to the SA3 14 primer. SEQ ID NO:75 corresponds to the SA3 15 primer. SEQ ID NO:76 corresponds to the SA3 16 primer. SEQ ID NO:77 corresponds to the nucleotide sequence of pGEM T Easy-C. SEQ ID NO:78 corresponds to the nucleotide sequence of pGEM T Easy-D. SEQ ID NO:79 corresponds to the nucleotide sequence of pGEM T Easy-E. SEQ ID NO:80 corresponds to the nucleotide sequence of pBluescript SK+-C.
SEQ ID NO:81 corresponds to the nucleotide sequence of pBluescript SK+-CD. SEQ ID NO:82 corresponds to the nucleotide sequence of pBluescript SK+-CDE. SEQ ID NO:83 corresponds to the nucleotide sequence of KS442. SEQ ID NO:84 corresponds to the nucleotide sequence of KS442-CDE. SEQ ID NO:85 corresponds to the nucleotide sequence of lo127. SEQ ID NO:86 corresponds to the sequence of artificial microRNA, OX16. SEQ ID NO:87 corresponds to the sequence of artificial microRNA, OX2. SEQ ID NO:88 corresponds to the sequence of artificial microRNA, OX16. SEQ ID NO:89 corresponds to the sequence of artificial microRNA, OX2. SEQ ID NO:90 corresponds to the microRNA 396 precursor. SEQ ID NO:91 corresponds to the microRNA 396 precursor v3. SEQ ID NO:92 corresponds to OX16 primer A. SEQ ID NO:93 corresponds to OX16 primer B. SEQ ID NO:94 corresponds to the nucleotide sequence of plasmid OX16. SEQ ID NO:95 corresponds to the microRNA 159 precursor. SEQ ID NO:96 corresponds to the in-fusion ready microRNA 159 precursor. SEQ ID NO:97 corresponds to the 1590×2 primer A. SEQ ID NO:98 corresponds to the 1590×2 primer B. SEQ ID NO:99 corresponds to the nucleotide sequence of plasmid 159-OX2. SEQ ID NO:100 corresponds to the nucleotide sequence of plasmid KS434. SEQ ID NO:101 corresponds to the nucleotide sequence of a Guar ORM (Ids2c.pk014.b22). SEQ ID NO:102 corresponds to the amino acid sequence of the Guar ORM encoded by the nucleotides of SEQ ID NO:101. SEQ ID NO:103 corresponds to the nucleotide sequence of a contig of a Bahia ORM. SEQ ID NO:104 corresponds to the amino acid sequence encoded by the nucleotides of SEQ ID NO:103. SEQ ID NO:105 corresponds to NCBI GI NO: 297807753 (Arabidopsis lyrata). SEQ ID NO:106 corresponds to NCBI GI NO: 116782186 (Picea sitchensis). SEQ ID NO:107 corresponds to a Hordeum vulgare ORM sequence, obtained from a Hordeum vulgare seedling shoot EST library. SEQ ID NO:108 corresponds to the partial amino acid sequence encoded by SEQ ID NO:107.
SEQ ID NO:109 corresponds to a partial ORM nucleotide sequence obtained from Raphanus sativus. SEQ ID NO:110 corresponds to the amino acid sequence encoded by SEQ ID NO:109. SEQ ID NO:111 corresponds to the ORM nucleotide sequence from Dennstaedtia punctilobula. SEQ ID NO:112 corresponds to the nucleotide sequence of the ORM-ORF of SEQ ID NO:111. SEQ ID NO:113 corresponds to the amino acid sequence encoded by SEQ ID NO:112. SEQ ID NO:114 corresponds to the ORM nucleotide sequence from Osmunda cinnamomea. SEQ ID NO:115 corresponds to the nucleotide sequence of the ORM-ORF of SEQ ID NO:114. SEQ ID NO:116 corresponds to the amino acid sequence encoded by SEQ ID NO:115. SEQ ID NO:117 corresponds to a conserved sequence motif associated with sequences included in the present invention as shown in FIGS. 3A-3C. SEQ ID NO:118 corresponds to the amino acid sequence from Glycine max in US Patent US2004031072-A1-14947. SEQ ID NO:119 corresponds to the amino acid sequence from Sorghum bicolor (NCBI GI: 8062081). SEQ ID NO:120 corresponds to the amino acid sequence from Arabidopsis thaliana (BAB10515). SEQ ID NO:121 corresponds to the amino acid sequence from Oryza sativa (NCBI GI: 5207721).
[0026] The Sequence Listing contains the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB standards described in Nucleic Acids Res. 13:3021-3030 (1985) and in the Biochemical J. 219 (No. 2):345-373 (1984) which are herein incorporated by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.
DETAILED DESCRIPTION OF THE INVENTION
[0027] All patents, patent applications, and publications cited throughout the application are hereby incorporated by reference in their entirety.
[0028] As used herein and in the appended claims, the singular forms "a", "an", and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, reference to "a plant" includes a plurality of such plants, reference to "a cell" includes one or more cells and equivalents thereof known to those skilled in the art, and so forth.
[0029] In the context of this disclosure a number of terms and abbreviations are used. The following definitions are provided.
[0030] "Open reading frame" is abbreviated ORF.
[0031] "Polymerase chain reaction" is abbreviated PCR.
[0032] "Triacylglycerols" are abbreviated TAGs.
[0033] "Co-enzyme A" is abbreviated CoA.
[0034] "Pyrophosphatase" is abbreviated PPiase.
[0035] The term "fatty acids" refers to long chain aliphatic acids (alkanoic acids) of varying chain length, from about C12 to C22 (although both longer and shorter chain-length acids are known). The predominant chain lengths are between C16 and C22. The structure of a fatty acid is represented by a simple notation system of "X:Y", where X is the total number of carbon (C) atoms in the particular fatty acid and Y is the number of double bonds.
[0036] Generally, fatty acids are classified as saturated or unsaturated. The term "saturated fatty acids" refers to those fatty acids that have no "double bonds" along their carbon backbone. In contrast, "unsaturated fatty acids" have "double bonds" along their carbon backbones (which are most commonly in the cis-configuration). "Monounsaturated fatty acids" have only one "double bond" along the carbon backbone (e.g., usually between the 9th and 10th carbon atoms, as for palmitoleic acid (16:1) and oleic acid (18:1)), while "polyunsaturated fatty acids" (or "PUFAs") have at least two double bonds along the carbon backbone (e.g., between the 9th and 10th, and 12th and 13th carbon atoms for linoleic acid (18:2); and between the 9th and 10th, 12th and 13th, and 15th and 16th for α-linolenic acid (18:3)).
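By way of illustration only, the "X:Y" notation and the saturation classes defined above can be sketched in a few lines of Python. The function name classify_fatty_acid and the returned dictionary layout are hypothetical and form no part of the disclosure.

```python
def classify_fatty_acid(notation: str) -> dict:
    """Parse 'X:Y' shorthand, where X = carbon atoms and Y = double bonds."""
    carbons_text, bonds_text = notation.strip().split(":")
    carbons, double_bonds = int(carbons_text), int(bonds_text)

    # Classification follows the definitions given in the text above.
    if double_bonds == 0:
        saturation = "saturated"
    elif double_bonds == 1:
        saturation = "monounsaturated"
    else:
        saturation = "polyunsaturated"

    return {"carbons": carbons, "double_bonds": double_bonds, "class": saturation}


if __name__ == "__main__":
    for name, code in [("oleic acid", "18:1"), ("linoleic acid", "18:2")]:
        print(name, classify_fatty_acid(code))
```

Under this sketch, oleic acid (18:1) is reported as monounsaturated and linoleic acid (18:2) as polyunsaturated, matching the examples in the paragraph above.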
[0037] The terms "triacylglycerol", "oil" and "TAGs" refer to neutral lipids composed of three fatty acyl residues esterified to a glycerol molecule (and such terms will be used interchangeably throughout the present disclosure herein). Such oils can contain long chain PUFAs, as well as shorter saturated and unsaturated fatty acids and longer chain saturated fatty acids. Thus, "oil biosynthesis" generically refers to the synthesis of TAGs in the cell.
The term "modulation" or "alteration" in the context of the present invention refers to increases or decreases of ORM protein expression, protein level or enzyme activity, as well as to an increase or decrease in the storage compound levels, such as oil, protein, starch or soluble carbohydrates.
[0038] The term "plant" includes reference to whole plants, plant parts or organs (e.g., leaves, stems, roots, etc.), plant cells, seeds and progeny of same. Plant cell, as used herein includes, without limitation, cells obtained from or found in the following: seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen and microspores. Plant cells can also be understood to include modified cells, such as protoplasts, obtained from the aforementioned tissues. The class of plants which can be used in the methods of the invention is generally as broad as the class of higher plants amenable to transformation techniques, including both monocotyledonous and dicotyledonous plants.
[0039] Examples of monocots include, but are not limited to, maize (corn), wheat, rice, sorghum, millet, barley, palm, lily, Alstroemeria, rye, and oat.
[0040] Examples of dicots include, but are not limited to, soybean, rape, sunflower, canola, grape, guayule, columbine, cotton, tobacco, peas, beans, flax, safflower, and alfalfa.
[0041] Plant tissue includes differentiated and undifferentiated tissues or plants, including but not limited to, roots, stems, shoots, leaves, pollen, seeds, tumor tissue, and various forms of cells and culture such as single cells, protoplasm, embryos, and callus tissue.
[0042] The term "plant organ" refers to plant tissue or group of tissues that constitute a morphologically and functionally distinct part of a plant.
[0043] The term "genome" refers to the following: 1. The entire complement of genetic material (genes and non-coding sequences) present in each cell of an organism, or virus or organelle. 2. A complete set of chromosomes inherited as a (haploid) unit from one parent. The term "stably integrated" refers to the transfer of a nucleic acid fragment into the genome of a host organism or cell resulting in genetically stable inheritance.
[0044] The terms "polynucleotide", "polynucleotide sequence", "nucleic acid", "nucleic acid sequence", and "nucleic acid fragment" are used interchangeably herein. These terms encompass nucleotide sequences and the like. A polynucleotide may be a polymer of RNA or DNA that is single- or double-stranded, that optionally contains synthetic, non-natural or altered nucleotide bases. A polynucleotide in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA, synthetic DNA, or mixtures thereof.
[0045] The term "isolated" refers to materials, such as "isolated nucleic acid fragments" and/or "isolated polypeptides", which are substantially free or otherwise removed from components that normally accompany or interact with the materials in a naturally occurring environment. Isolated polynucleotides may be purified from a host cell in which they naturally occur. Conventional nucleic acid purification methods known to skilled artisans may be used to obtain isolated polynucleotides. The term also embraces recombinant polynucleotides and chemically synthesized polynucleotides.
[0046] The term "isolated nucleic acid fragment" is used interchangeably with "isolated polynucleotide" and is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA. Nucleotides (usually found in their 5'-monophosphate form) are referred to by their single letter designation as follows: "A" for adenylate or deoxyadenylate (for RNA or DNA, respectively), "C" for cytidylate or deoxycytidylate, "G" for guanylate or deoxyguanylate, "U" for uridylate, "T" for deoxythymidylate, "R" for purines (A or G), "Y" for pyrimidines (C or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and "N" for any nucleotide.
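By way of illustration only, the single letter designations listed above can be used to expand a degenerate DNA sequence into the concrete sequences it represents. The sketch below covers only the DNA letters named in the paragraph ("U" and "I" are omitted for simplicity); the names AMBIGUITY_CODES and expand_ambiguous are hypothetical.

```python
from itertools import product

# Ambiguity codes exactly as listed in the paragraph above (DNA alphabet only).
AMBIGUITY_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purines
    "Y": "CT",    # pyrimidines
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def expand_ambiguous(sequence: str) -> list[str]:
    """Return every unambiguous DNA sequence matched by a degenerate sequence."""
    choices = [AMBIGUITY_CODES[base] for base in sequence.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(expand_ambiguous("ARY"))  # ['AAC', 'AAT', 'AGC', 'AGT']
```

Because the number of expansions grows multiplicatively with each degenerate position, such an expansion is practical only for short sequences such as primers.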
[0047] The terms "subfragment that is functionally equivalent" and "functionally equivalent subfragment" are used interchangeably herein. These terms refer to a portion or subsequence of an isolated nucleic acid fragment in which the ability to alter gene expression or produce a certain phenotype is retained whether or not the fragment or subfragment encodes an active enzyme. For example, the fragment or subfragment can be used in the design of recombinant DNA constructs to produce the desired phenotype in a transformed plant. Recombinant DNA constructs can be designed for use in co-suppression or antisense by linking a nucleic acid fragment or subfragment thereof, whether or not it encodes an active enzyme, in the appropriate orientation relative to a plant promoter sequence.
[0048] "Cosuppression" refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar native genes (U.S. Pat. No. 5,231,020). Cosuppression technology constitutes the subject matter of U.S. Pat. No. 5,231,020, which issued to Jorgensen et al. on Jul. 27, 1993. The phenomenon observed by Napoli et al. in petunia was referred to as "cosuppression" since expression of both the endogenous gene and the introduced transgene was suppressed (for reviews see Vaucheret et al., Plant J. 16:651-659 (1998); and Gura, Nature 404:804-808 (2000)).
[0049] Co-suppression constructs in plants previously have been designed by focusing on overexpression of a nucleic acid sequence having homology to an endogenous mRNA, in the sense orientation, which results in the reduction of all RNA having homology to the overexpressed sequence (see Vaucheret et al. (1998) Plant J 16:651-659; and Gura (2000) Nature 404:804-808). The overall efficiency of this phenomenon is low, and the extent of the RNA reduction is widely variable. Recent work has described the use of "hairpin" structures that incorporate all, or part, of an mRNA encoding sequence in a complementary orientation that results in a potential "stem-loop" structure for the expressed RNA (PCT Publication WO 99/53050 published on Oct. 21, 1999). This increases the frequency of co-suppression in the recovered transgenic plants. Another variation describes the use of plant viral sequences to direct the suppression, or "silencing", of proximal mRNA encoding sequences (PCT Publication WO 98/36083 published on Aug. 20, 1998). Both of these co-suppressing phenomena have not been elucidated mechanistically, although recent genetic evidence has begun to unravel this complex situation (Elmayan et al. (1998) Plant Cell 10:1747-1757).
[0050] In addition to cosuppression, antisense technology has also been used to block the function of specific genes in cells. Antisense RNA is complementary to the normally expressed RNA, and presumably inhibits gene expression by interacting with the normal RNA strand. The mechanisms by which the expression of a specific gene is inhibited by either antisense or sense RNA are on their way to being understood. However, the frequencies of obtaining the desired phenotype in a transgenic plant may vary with the design of the construct, the gene, the strength and specificity of its promoter, the method of transformation and the complexity of transgene insertion events (Baulcombe, Curr. Biol. 12(3):R82-84 (2002); Tang et al., Genes Dev. 17(1):49-63 (2003); Yu et al., Plant Cell. Rep. 22(3):167-174 (2003)). Cosuppression and antisense inhibition are also referred to as "gene silencing", "post-transcriptional gene silencing" (PTGS), RNA interference or RNAi. See for example U.S. Pat. No. 6,506,559.
[0051] MicroRNAs (miRNAs) are small regulatory RNAs that control gene expression. miRNAs bind to regions of target RNAs and inhibit their translation and, thus, interfere with production of the polypeptide encoded by the target RNA. miRNAs can be designed to be complementary to any region of the target sequence RNA including the 3' untranslated region, coding region, etc. miRNAs are derived from highly structured RNA precursors that are processed by the action of a ribonuclease III termed DICER. While the exact mechanism of action of miRNAs is unknown, it appears that they function to regulate expression of the target gene. See, e.g., U.S. Patent Publication No. 2004/0268441 A1 which was published on Dec. 30, 2004.
[0052] The term "expression", as used herein, refers to the production of a functional end-product, be it mRNA or translation of mRNA into a polypeptide. "Antisense inhibition" refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein. "Co-suppression" refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar foreign or endogenous genes (U.S. Pat. No. 5,231,020).
[0053] "Overexpression" refers to the production of a functional end-product in transgenic organisms that exceeds levels of production when compared to expression of that functional end-product in a normal, wild type or non-transformed organism.
[0054] "Stable transformation" refers to the transfer of a nucleic acid fragment into a genome of a host organism, including both nuclear and organellar genomes, resulting in genetically stable inheritance. In contrast, "transient transformation" refers to the transfer of a nucleic acid fragment into the nucleus, or DNA-containing organelle, of a host organism resulting in gene expression without integration or stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" organisms. The preferred method of cell transformation of rice, corn and other monocots is using particle-accelerated or "gene gun" transformation technology (Klein et al. (1987) Nature (London) 327:70-73; U.S. Pat. No. 4,945,050), or an Agrobacterium-mediated method (Ishida Y. et al. (1996) Nature Biotech. 14:745-750). The term "transformation" as used herein refers to both stable transformation and transient transformation.
[0055] "Antisense inhibition" refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein.
[0056] As stated herein, "suppression" refers to the reduction of the level of enzyme activity or protein functionality detectable in a transgenic plant when compared to the level of enzyme activity or protein functionality detectable in a plant with the native enzyme or protein. The level of enzyme activity in a plant with the native enzyme is referred to herein as "wild type" activity. The level of protein functionality in a plant with the native protein is referred to herein as "wild type" functionality. The term "suppression" includes lower, reduce, decline, decrease, inhibit, eliminate and prevent. This reduction may be due to the decrease in translation of the native mRNA into an active enzyme or functional protein. It may also be due to the transcription of the native DNA into decreased amounts of mRNA and/or to rapid degradation of the native mRNA. The term "native enzyme" refers to an enzyme that is produced naturally in the desired cell.
[0057] "Gene silencing," as used herein, is a general term that refers to decreasing mRNA levels as compared to wild-type plants. The term does not specify a mechanism and is inclusive of, and not limited to, anti-sense, cosuppression, viral-suppression, hairpin suppression and stem-loop suppression.
[0058] The terms "homology", "homologous", "substantially similar" and "corresponding substantially" are used interchangeably herein. They refer to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate gene expression or produce a certain phenotype. These terms also refer to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotides that do not substantially alter the functional properties of the resulting nucleic acid fragment relative to the initial, unmodified fragment. For example, alterations in a nucleic acid fragment which result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded polypeptide, are well known in the art. Thus, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can also be expected to produce a functionally equivalent product. Nucleotide changes that result in alteration of the N-terminal and C-terminal portions of the polypeptide molecule would also not be expected to alter the activity of the polypeptide. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. It is therefore understood, as those skilled in the art will appreciate, that the invention encompasses more than the specific exemplary sequences.
[0059] Moreover, the skilled artisan recognizes that substantially similar nucleic acid sequences encompassed by this invention are also defined by their ability to hybridize, under moderately stringent conditions (for example, 1×SSC, 0.1% SDS, 60° C.) with the sequences exemplified herein, or to any portion of the nucleotide sequences reported herein and which are functionally equivalent to the gene or the promoter of the invention. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions. One set of preferred conditions involves a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2×SSC, 0.5% SDS at 45° C. for 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at 50° C. for 30 min. A more preferred set of stringent conditions involves the use of higher temperatures, in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS is increased to 60° C. Another preferred set of highly stringent conditions involves the use of two final washes in 0.1×SSC, 0.1% SDS at 65° C.
[0060] With respect to the degree of substantial similarity between the target (endogenous) mRNA and the RNA region in the construct having homology to the target mRNA, such sequences should be at least 25 nucleotides in length, preferably at least 50 nucleotides in length, more preferably at least 100 nucleotides in length, again more preferably at least 200 nucleotides in length, and most preferably at least 300 nucleotides in length; and should be at least 80% identical, preferably at least 85% identical, more preferably at least 90% identical, and most preferably at least 95% identical.
[0061] Substantially similar nucleic acid fragments of the instant invention may also be characterized by the percent identity of the amino acid sequences that they encode to the amino acid sequences disclosed herein, as determined by algorithms commonly employed by those skilled in this art. Suitable nucleic acid fragments (isolated polynucleotides of the present invention) encode polypeptides that are at least 70% identical, preferably at least 80% identical to the amino acid sequences reported herein. Preferred nucleic acid fragments encode amino acid sequences that are at least 85% identical to the amino acid sequences reported herein. More preferred nucleic acid fragments encode amino acid sequences that are at least 90% identical to the amino acid sequences reported herein. Most preferred are nucleic acid fragments that encode amino acid sequences that are at least 95% identical to the amino acid sequences reported herein.
[0062] It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying related polypeptide sequences. Useful examples of percent identities are 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100%.
[0063] Sequence alignments and percent similarity calculations may be determined using a variety of comparison methods designed to detect homologous sequences including, but not limited to, the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Unless stated otherwise, multiple alignment of the sequences provided herein was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4. After alignment of the sequences, using the Clustal V program, it is possible to obtain a "percent identity" by viewing the "sequence distances" table in the same program.
[0064] Unless otherwise stated, "BLAST" sequence identity/similarity values provided herein refer to the value obtained using the BLAST 2.0 suite of programs using default parameters (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)). Software for performing BLAST analyses is publicly available, e.g., through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
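By way of illustration only, the extension of a word hit described above can be sketched for nucleotide sequences as follows. This is a simplified, ungapped sketch and not the BLAST implementation itself: it assumes an exact word hit has already been located, it uses the M=5 and N=-4 values stated above, and the drop-off value x_drop=20 is an arbitrary illustrative assumption. The function name extend_hit is hypothetical.

```python
def extend_hit(query, subject, q_start, s_start, word_len,
               match=5, mismatch=-4, x_drop=20):
    """Extend an exact word hit in both directions without gaps.

    Returns (best_score, query_start, query_end), with query_end exclusive.
    """
    def pair_score(a, b):
        return match if a == b else mismatch

    def extend(q_pos, s_pos, step, start_score):
        score = best = start_score
        length = best_len = 0
        # Stop at the end of either sequence.
        while 0 <= q_pos < len(query) and 0 <= s_pos < len(subject):
            score += pair_score(query[q_pos], subject[s_pos])
            length += 1
            if score > best:
                best, best_len = score, length
            # Stop when the score is zero or below, or has fallen by X
            # from the maximum achieved value.
            if score <= 0 or best - score >= x_drop:
                break
            q_pos += step
            s_pos += step
        return best, best_len

    seed = sum(pair_score(query[q_start + i], subject[s_start + i])
               for i in range(word_len))
    right_best, right_len = extend(q_start + word_len, s_start + word_len, 1, seed)
    total_best, left_len = extend(q_start - 1, s_start - 1, -1, right_best)
    return total_best, q_start - left_len, q_start + word_len + right_len


if __name__ == "__main__":
    # The 4-letter word "ACGT" at position 2 of both sequences is the seed.
    print(extend_hit("TTACGTACGTGG", "CCACGTACGTAA", 2, 2, 4))  # (40, 2, 10)
```

In the usage example the seed grows rightward across four further matches and stops growing at the first mismatches, so the reported segment is the eight matching bases ACGTACGT with a score of 8 × 5 = 40.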
[0065] "Sequence identity" or "identity" in the context of nucleic acid or polypeptide sequences refers to the nucleic acid bases or amino acid residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window.
[0066] Thus, "Percentage of sequence identity" refers to the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the results by 100 to yield the percentage of sequence identity. Useful examples of percent sequence identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100%. These identities can be determined using any of the programs described herein.
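By way of illustration only, the calculation described above (matched positions divided by the total number of positions in the comparison window, multiplied by 100) can be sketched as follows for two sequences that have already been aligned. The function name percent_identity and the use of "-" as the gap character are assumptions of this sketch; alignment programs such as those named herein may treat gaps differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a comparison window of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")

    # A matched position carries the identical residue in both sequences;
    # a gap aligned to anything is never counted as a match.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # 4 matched positions in a window of 6 positions.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 2))  # 66.67
```

Here the window length includes gapped positions, following the wording "total number of positions in the window of comparison".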
[0067] Sequence alignments and percent identity or similarity calculations may be determined using a variety of comparison methods designed to detect homologous sequences including, but not limited to, the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences is performed using the Clustal V method of alignment (Higgins, D. G. and Sharp, P. M. (1989) Comput. Appl. Biosci. 5:151-153; Higgins, D. G. et al. (1992) Comput. Appl. Biosci. 8:189-191) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4.
[0068] It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying polypeptides, from other plant species, wherein such polypeptides have the same or similar function or activity. Useful examples of percent identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100%. Indeed, any integer amino acid identity from 50%-100% may be useful in describing the present invention. Also, of interest is any full or partial complement of this isolated nucleotide fragment.
[0069] The term "recombinant" means, for example, that a nucleic acid sequence is made by an artificial combination of two otherwise separated segments of sequence, e.g., by chemical synthesis or by the manipulation of isolated nucleic acids by genetic engineering techniques.
[0070] As used herein, "contig" refers to a nucleotide sequence that is assembled from two or more constituent nucleotide sequences that share common or overlapping regions of sequence homology. For example, the nucleotide sequences of two or more nucleic acid fragments can be compared and aligned in order to identify common or overlapping sequences. Where common or overlapping sequences exist between two or more nucleic acid fragments, the sequences (and thus their corresponding nucleic acid fragments) can be assembled into a single contiguous nucleotide sequence.
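By way of illustration only, the assembly of two overlapping fragments into a single contiguous sequence can be sketched as follows. The sketch requires an exact suffix-to-prefix overlap, whereas practical assemblers tolerate mismatches and handle many fragments; the function name merge_overlap and the min_overlap default are hypothetical.

```python
from typing import Optional


def merge_overlap(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two fragments where the end of `left` equals the start of `right`.

    Returns the merged contig, or None if no overlap of at least
    `min_overlap` bases exists.
    """
    longest_possible = min(len(left), len(right))
    # Try the longest candidate overlap first so the merge is maximal.
    for size in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


if __name__ == "__main__":
    # The fragments share the 4-base overlap "CTTA".
    print(merge_overlap("ATGGCCTTA", "CTTAGGCA"))  # ATGGCCTTAGGCA
```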
[0071] "Codon degeneracy" refers to divergence in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment comprising a nucleotide sequence that encodes all or a substantial portion of the amino acid sequences set forth herein. The skilled artisan is well aware of the "codon-bias" exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a nucleic acid fragment for improved expression in a host cell, it is desirable to design the nucleic acid fragment such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.
[0072] The terms "synthetic nucleic acid" or "synthetic genes" refer to nucleic acid molecules assembled either in whole or in part from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form larger nucleic acid fragments which may then be enzymatically assembled to construct the entire desired nucleic acid fragment. "Chemically synthesized", as related to a nucleic acid fragment, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of nucleic acid fragments may be accomplished using well established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. Accordingly, the nucleic acid fragments can be tailored for optimal gene expression based on optimization of the nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.
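By way of illustration only, a survey of host coding sequences to determine preferred codons can be sketched as follows. The codon table below is deliberately partial (alanine, glycine and lysine only) to keep the example short, and the names CODON_TO_AA and preferred_codons are hypothetical; a practical survey would use the complete genetic code and a large set of host genes.

```python
from collections import Counter, defaultdict

# Partial standard codon table, for illustration only.
CODON_TO_AA = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "AAA": "Lys", "AAG": "Lys",
}


def preferred_codons(coding_sequences):
    """Return the most frequently used codon for each amino acid surveyed."""
    counts = defaultdict(Counter)
    for cds in coding_sequences:
        usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
        for i in range(0, usable, 3):
            codon = cds[i:i + 3].upper()
            amino_acid = CODON_TO_AA.get(codon)
            if amino_acid is not None:
                counts[amino_acid][codon] += 1
    return {aa: usage.most_common(1)[0][0] for aa, usage in counts.items()}


if __name__ == "__main__":
    host_genes = ["GCCGCCGCTAAG", "GCCAAGAAA"]  # toy sequences, not real genes
    print(preferred_codons(host_genes))  # {'Ala': 'GCC', 'Lys': 'AAG'}
```

A synthetic nucleic acid fragment could then be designed by substituting, for each amino acid, the codon reported as preferred for the intended host.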
[0073] "Gene" refers to a nucleic acid fragment that is capable of directing expression of a specific protein or functional RNA.
[0074] "Native gene" refers to a gene as found in nature with its own regulatory sequences.
[0075] "Chimeric gene" and "recombinant DNA construct" are used interchangeably herein, and refer to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature, or to an isolated native gene optionally modified and reintroduced into a host cell.
[0076] A chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. In one embodiment, a regulatory region and a coding sequence region are assembled from two different sources. In another embodiment, a regulatory region and a coding sequence region are derived from the same source but arranged in a manner different than that found in nature. In another embodiment, the coding sequence region is assembled from at least two different sources. In another embodiment, the coding region is assembled from the same source but in a manner not found in nature.
[0077] The term "endogenous gene" refers to a native gene in its natural location in the genome of an organism.
[0078] The term "foreign gene" refers to a gene not normally found in the host organism that is introduced into the host organism by gene transfer.
[0079] The term "transgene" refers to a gene that has been introduced into a host cell by a transformation procedure. Transgenes may become physically inserted into a genome of the host cell (e.g., through recombination) or may be maintained outside of a genome of the host cell (e.g., on an extrachromosomal array).
[0080] An "allele" is one of several alternative forms of a gene occupying a given locus on a chromosome. When the alleles present at a given locus on a pair of homologous chromosomes in a diploid plant are the same, that plant is homozygous at that locus. If the alleles present at a given locus on a pair of homologous chromosomes in a diploid plant differ, that plant is heterozygous at that locus. If a transgene is present on only one of a pair of homologous chromosomes in a diploid plant, that plant is hemizygous at that locus.
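The three zygosity terms can be expressed as a small decision rule. The Python sketch below is illustrative only: the function name `zygosity` and the use of `None` to represent a homolog that carries no copy at the locus (as with a transgene present on one chromosome only) are assumptions made for this example.

```python
def zygosity(allele_1, allele_2):
    """Classify a locus in a diploid plant from the alleles on its two homologs.

    None stands for "nothing present at this locus on that homolog"
    (an assumption of this sketch, used to model a transgene insertion).
    """
    if allele_1 is None and allele_2 is None:
        return "absent"
    if allele_1 is None or allele_2 is None:
        return "hemizygous"
    return "homozygous" if allele_1 == allele_2 else "heterozygous"


print(zygosity("A", "A"))           # same allele on both homologs
print(zygosity("A", "a"))           # different alleles
print(zygosity("transgene", None))  # transgene on one homolog only
```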
[0081] The term "coding sequence" refers to a DNA fragment that codes for a polypeptide having a specific amino acid sequence, or a structural RNA. The boundaries of a protein coding sequence are generally determined by a ribosome binding site (prokaryotes) or by an ATG start codon (eukaryotes) located at the 5' end of the mRNA and a transcription terminator sequence located just downstream of the open reading frame at the 3' end of the mRNA. A coding sequence can include, but is not limited to, DNA, cDNA, and recombinant nucleic acid sequences.
[0082] "Mature" protein refers to a post-translationally processed polypeptide; i.e., one from which any pre- or pro-peptides present in the primary translation product have been removed. "Precursor" protein refers to the primary product of translation of mRNA; i.e., with pre- and pro-peptides still present. Pre- and pro-peptides may be, but are not limited to, intracellular localization signals.
[0083] "RNA transcript" refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript; an RNA sequence derived from post-transcriptional processing of the primary transcript is referred to as the mature RNA. "Messenger RNA (mRNA)" refers to the RNA that is without introns and that can be translated into protein by the cell. "cDNA" refers to a DNA that is complementary to and synthesized from an mRNA template using the enzyme reverse transcriptase. The cDNA can be single-stranded or converted into the double-stranded form using the Klenow fragment of DNA polymerase I. "Sense" RNA refers to RNA transcript that includes the mRNA and can be translated into protein within a cell or in vitro. "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target isolated nucleic acid fragment (U.S. Pat. No. 5,107,065). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has an effect on cellular processes. The terms "complement" and "reverse complement" are used interchangeably herein with respect to mRNA transcripts, and are meant to define the antisense RNA of the message.
[0084] The term "endogenous RNA" refers to any RNA which is encoded by any nucleic acid sequence present in the genome of the host prior to transformation with the recombinant construct of the present invention, whether naturally-occurring or non-naturally occurring, i.e., introduced by recombinant means, mutagenesis, etc.
[0085] The term "non-naturally occurring" means artificial, not consistent with what is normally found in nature.
[0086] "Messenger RNA (mRNA)" refers to the RNA that is without introns and that can be translated into protein by the cell.
[0087] "cDNA" refers to a DNA that is complementary to and synthesized from an mRNA template using the enzyme reverse transcriptase. The cDNA can be single-stranded or converted into the double-stranded form using the Klenow fragment of DNA polymerase I.
[0088] "Sense" RNA refers to RNA transcript that includes the mRNA and can be translated into protein within a cell or in vitro.
[0089] "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA, and that blocks the expression of a target gene (U.S. Pat. No. 5,107,065). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence.
[0090] "Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that may not be translated, yet has an effect on cellular processes. The terms "complement" and "reverse complement" are used interchangeably herein with respect to mRNA transcripts, and are meant to define the antisense RNA of the message.
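As an illustration of the "reverse complement" relationship between a message and its antisense RNA, the following Python sketch derives the antisense sequence of an mRNA written 5' to 3'. The function name `antisense_rna` and the example sequence are hypothetical; the sketch assumes an unambiguous A/C/G/U alphabet and full-length complementarity, whereas the definition above also covers antisense RNA complementary to only part of a transcript.

```python
# Watson-Crick pairing for RNA: A-U and G-C.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna):
    """Return the reverse complement of an mRNA, written 5' to 3'."""
    return mrna.upper().translate(RNA_COMPLEMENT)[::-1]


message = "AUGGCU"              # illustrative sequence only
print(antisense_rna(message))   # AGCCAU
```

Applying the function twice returns the original message, which is a quick consistency check on the pairing table.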
[0091] The term "recombinant DNA construct" refers to a DNA construct assembled from nucleic acid fragments obtained from different sources. The types and origins of the nucleic acid fragments may be very diverse.
[0092] A "recombinant expression construct" contains a nucleic acid fragment operably linked to at least one regulatory element, that is capable of effecting expression of the nucleic acid fragment. The recombinant expression construct may also affect expression of a homologous sequence in a host cell.
[0093] In one embodiment the choice of recombinant expression construct is dependent upon the method that will be used to transform host cells. The skilled artisan is well aware of the genetic elements that must be present on the recombinant expression construct in order to successfully transform, select and propagate host cells. The skilled artisan will also recognize that different independent transformation events may be screened to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by, but is not limited to, Southern analysis of DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.
[0094] The term "operably linked" refers to the association of nucleic acid fragments on a single nucleic acid fragment so that the function of one is regulated by the other. For example, a promoter is operably linked with a coding sequence when it is capable of regulating the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in a sense or antisense orientation. In another example, the complementary RNA regions of the invention can be operably linked, either directly or indirectly, 5' to the target mRNA, or 3' to the target mRNA, or within the target mRNA, or a first complementary region is 5' and its complement is 3' to the target mRNA.
[0095] "Regulatory sequences" refer to nucleotides located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which may influence the transcription, RNA processing, stability, or translation of the associated coding sequence. Regulatory sequences may include, and are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
[0096] "Promoter" refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoter sequences can also be located within the transcribed portions of genes, and/or downstream of the transcribed sequences. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of an isolated nucleic acid fragment in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause an isolated nucleic acid fragment to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg, (1989) Biochemistry of Plants 15:1-82. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of some variation may have identical promoter activity.
[0097] Specific examples of promoters that may be useful in expressing the nucleic acid fragments of the invention include, but are not limited to, the oleosin promoter (PCT Publication WO99/65479, published Dec. 12, 1999), the maize 27 kD zein promoter (Ueda et al (1994) Mol. Cell. Biol. 14:4350-4359), the ubiquitin promoter (Christensen et al (1992) Plant Mol. Biol. 18:675-680), the SAM synthetase promoter (PCT Publication WO00/37662, published Jun. 29, 2000), the CaMV 35S (Odell et al (1985) Nature 313:810-812), and the promoter described in PCT Publication WO02/099063 published Dec. 12, 2002.
[0098] The "translation leader sequence" refers to a polynucleotide fragment located between the promoter of a gene and the coding sequence. The translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (Turner, R. and Foster, G. D. (1995) Mol. Biotechnol. 3:225-236).
[0099] An "intron" is an intervening sequence in a gene that does not encode a portion of the protein sequence. Thus, such sequences are transcribed into RNA but are then excised and are not translated. The term is also used for the excised RNA sequences.
[0100] The "3' non-coding sequences" refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. The use of different 3' non-coding sequences is exemplified by Ingelbrecht, I. L., et al. (1989) Plant Cell 1:671-680.
[0101] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989. Transformation methods are well known to those skilled in the art and are described below.
[0102] "PCR" or "Polymerase Chain Reaction" is a technique for the synthesis of large quantities of specific DNA segments, and consists of a series of repetitive cycles (Perkin Elmer Cetus Instruments, Norwalk, Conn.). Typically, the double stranded DNA is heat denatured, the two primers complementary to the 3' boundaries of the target segment are annealed at low temperature and then extended at an intermediate temperature. One set of these three consecutive steps is referred to as a cycle.
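The reason repeated cycles yield large quantities of a target segment can be shown with simple arithmetic. The Python sketch below assumes an idealized model in which each cycle multiplies the number of target copies by (1 + efficiency), with an efficiency of 1.0 meaning perfect doubling. The function name `pcr_copies` and the `efficiency` parameter are assumptions of this sketch; real reactions plateau as primers and nucleotides are consumed.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of PCR cycles.

    efficiency is the fraction of templates copied per cycle (0.0 to 1.0);
    this simple exponential model ignores the late-cycle plateau.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0.0 and 1.0")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule after 30 ideal cycles: 2**30 copies.
print(pcr_copies(1, 30))
# The same reaction at an assumed 90% per-cycle efficiency.
print(pcr_copies(1, 30, efficiency=0.9))
```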
[0103] "Stable transformation" refers to the transfer of a nucleic acid fragment into a genome of a host organism, including nuclear and organellar genomes, resulting in genetically stable inheritance.
[0104] In contrast, "transient transformation" refers to the transfer of a nucleic acid fragment into the nucleus, or DNA-containing organelle, of a host organism resulting in gene expression without integration or stable inheritance.
[0105] Host organisms comprising the transformed nucleic acid fragments are referred to as "transgenic" organisms.
[0106] The term "amplified" means the construction of multiple copies of a nucleic acid sequence or multiple copies complementary to the nucleic acid sequence using at least one of the nucleic acid sequences as a template. Amplification systems include the polymerase chain reaction (PCR) system, ligase chain reaction (LCR) system, nucleic acid sequence based amplification (NASBA, Cangene, Mississauga, Ontario), Q-Beta Replicase systems, transcription-based amplification system (TAS), and strand displacement amplification (SDA). See, e.g., Diagnostic Molecular Microbiology: Principles and Applications, D. H. Persing et al., Ed., American Society for Microbiology, Washington, D.C. (1993). The product of amplification is termed an amplicon.
[0107] The term "chromosomal location" includes reference to a length of a chromosome which may be measured by reference to the linear segment of DNA which it comprises. The chromosomal location can be defined by reference to two unique DNA sequences, i.e., markers.
[0108] The term "marker" includes reference to a locus on a chromosome that serves to identify a unique position on the chromosome. A "polymorphic marker" includes reference to a marker which appears in multiple forms (alleles) such that different forms of the marker, when they are present in a homologous pair, allow transmission of each of the chromosomes in that pair to be followed. A genotype may be defined by use of one or a plurality of markers.
[0109] The present invention includes, inter alia, compositions and methods for altering or modulating (i.e., increasing or decreasing) the level of ORM polypeptides described herein in plants. The size of the oil, protein, starch and soluble carbohydrate pools in soybean seeds can be modulated or altered (i.e. increased or decreased) by altering the expression of a specific gene, encoding ORM protein.
[0110] In one embodiment, the present invention concerns a transgenic plant comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein seed obtained from said transgenic plant has an altered oil, protein, starch and/or soluble carbohydrate content when compared to seed obtained from a control plant not comprising said recombinant DNA construct.
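To make the percent-identity thresholds concrete, the following Python sketch computes identity from two sequences that have already been aligned, with `-` marking gaps. It is not an implementation of the Clustal V method named above: it assumes the alignment is supplied by such a tool, and it counts identity over columns in which neither sequence has a gap, which is one of several conventions in use. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def meets_identity_threshold(aligned_a, aligned_b, threshold=70.0):
    """True if the aligned pair is at or above the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignment: 9 gap-free columns, 8 identical residues.
query = "MKT-AYIAKQ"
reference = "MKTLAYLAKQ"
print(round(percent_identity(query, reference), 1))
print(meets_identity_threshold(query, reference, 70.0))
```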
[0111] In a second embodiment the present invention concerns a transgenic seed obtained from the transgenic plant comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein said transgenic seed has an altered oil, protein, starch and/or soluble carbohydrate content when compared to a control plant not comprising said recombinant DNA construct.
[0112] In a third embodiment the present invention concerns a transgenic seed obtained from the transgenic plant comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein said transgenic seed has an increased starch content of at least 0.5%, 1%, 1.5%, 2%, 2.5%, 3.0%, 3.5%, 4.0%, 4.5%, 5.0%, 5.5%, 6.0%, 6.5%, 7.0%, 7.5%, 8.0%, 8.5%, 9.0%, 9.5%, 10.0%, 10.5%, 11%, 11.5%, 12.0%, 12.5%, 13.0%, 13.5%, 14.0%, 14.5%, 15.0%, 15.5%, 16.0%, 16.5%, 17.0%, 17.5%, 18.0%, 18.5%, 19.0%, 19.5%, 20.0%, 20.5%, 21.0%, 21.5%, 22.0%, 22.5%, 23.0%, 23.5%, 24.0%, 24.5%, 25.0%, 25.5%, 26.0%, 26.5%, 27.0%, 27.5%, 28.0%, 28.5%, 29%, 29.5%, 30.0%, 30.5%, 31.0%, 31.5%, 32.0%, 32.5%, 33.0%, 33.5%, 34.0%, 35.0%, 35.5%, 36.0%, 36.5%, 37.0%, 37.5%, 38.0%, 38.5%, 39.0%, 39.5%, 40.0%, 40.5%, 41.0%, 41.5%, 42.0%, 42.5%, 43.0%, 43.5%, 44.0%, 44.5%, 45.0%, 45.5%, 46.0%, 46.5%, 47.0%, 47.5%, 48.0%, 48.5%, 49.0%, 49.5%, or 50.0% on a dry weight basis when compared to a control seed not comprising said recombinant DNA construct.
[0113] In another embodiment, the present invention relates to a recombinant DNA construct comprising any of the isolated polynucleotides of the present invention operably linked to at least one regulatory sequence.
[0114] In another embodiment of the present invention, a recombinant construct of the present invention further comprises an enhancer.
[0115] In another embodiment, the present invention relates to a vector comprising any of the polynucleotides of the present invention.
[0116] In another embodiment, the present invention relates to an isolated polynucleotide fragment comprising a nucleotide sequence comprised by any of the polynucleotides of the present invention, wherein the nucleotide sequence contains at least 30, 40, 60, 100, 200, 300, 400, 500 or 600 nucleotides.
[0117] In another embodiment, the present invention relates to a method for transforming a cell comprising transforming a cell with any of the isolated polynucleotides of the present invention, and the cell transformed by this method. Advantageously, the cell is eukaryotic, e.g., a yeast or plant cell, or prokaryotic, e.g., a bacterium.
[0118] In yet another embodiment, the present invention relates to a method for transforming a cell, comprising transforming a cell with a polynucleotide of the present invention.
[0119] In another embodiment, the present invention relates to a method for producing a transgenic plant comprising transforming a plant cell with any of the isolated polynucleotides of the present invention and regenerating a transgenic plant from the transformed plant cell.
[0120] In another embodiment, a cell, plant, or seed comprising a recombinant DNA construct of the present invention.
[0121] In another embodiment, an isolated polynucleotide comprising: (i) a nucleic acid sequence encoding a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; or (ii) a full complement of the nucleic acid sequence of (i), wherein the full complement and the nucleic acid sequence of (i) consist of the same number of nucleotides and are 100% complementary. Any of the foregoing isolated polynucleotides may be utilized in any recombinant DNA constructs (including suppression DNA constructs) of the present invention. Preferably the polypeptide is an ORM protein.
[0122] In another embodiment, an isolated polynucleotide comprising: (i) a nucleic acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) a full complement of the nucleic acid sequence of (i). Any of the foregoing isolated polynucleotides may be utilized in any recombinant DNA constructs (including suppression DNA constructs) of the present invention. Preferably, the polypeptide is an ORM protein.
[0123] In one aspect, the present invention includes recombinant DNA constructs (including suppression DNA constructs).
[0124] In another embodiment, the present invention relates to a method of selecting an isolated polynucleotide that alters, i.e., increases or decreases, the level of expression of an ORM protein gene, protein or enzyme activity in a host cell, preferably a plant cell, the method comprising the steps of: (a) constructing an isolated polynucleotide of the present invention or an isolated recombinant DNA construct of the present invention; (b) introducing the isolated polynucleotide or the isolated recombinant DNA construct into a host cell; (c) measuring the level of the ORM protein RNA, protein or enzyme activity in the host cell containing the isolated polynucleotide or recombinant DNA construct; (d) comparing the level of the ORM protein RNA, protein or enzyme activity in the host cell containing the isolated polynucleotide or recombinant DNA construct with the level of the ORM protein RNA, protein or enzyme activity in a host cell that does not contain the isolated polynucleotide or recombinant DNA construct, and selecting the isolated polynucleotide or recombinant DNA construct that alters, i.e., increases or decreases, the level of expression of the ORM protein gene, protein or enzyme activity in the plant cell.
[0125] In another embodiment, this invention concerns a method for suppressing the level of expression of a gene encoding an ORM protein having ORM protein activity in a transgenic plant, wherein the method comprises: (a) transforming a plant cell with a fragment of the isolated polynucleotide of the invention; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant wherein the level of expression of a gene encoding a polypeptide having ORM protein activity has been suppressed.
[0126] Preferably, the gene encodes a polypeptide having ORM protein activity, and the plant is a soybean plant.
[0127] In another embodiment, the invention concerns a method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114, or (ii) the complement of (i); wherein (i) or (ii) is useful in co-suppression or antisense suppression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces transgenic seeds having an increase in oil content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30% compared to seed obtained from a non-transgenic plant. Preferably, the seed is a soybean seed.
[0128] In another embodiment, a plant comprising in its genome a recombinant DNA construct comprising: (a) a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 or (b) a suppression DNA construct comprising at least one regulatory element operably linked to: (i) all or part of: (A) a nucleic acid sequence encoding a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (B) a full complement of the nucleic acid sequence of (b)(i)(A); or (ii) a region derived from all or part of a sense strand or antisense strand of a target gene of interest, said region having a nucleic acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to said all or part of a sense strand or antisense strand from which said region is derived, and wherein said target gene of interest encodes an ORM protein, and wherein said plant has an altered oil, protein, starch and/or soluble carbohydrate content, when compared to a control plant not comprising said recombinant DNA construct.
[0129] A transgenic seed having an increased oil content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30% when compared to the oil content of a non-transgenic seed, wherein said transgenic seed comprises a recombinant DNA construct comprising: (a) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (b) the full-length complement of (a); wherein (a) or (b) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant and further wherein said seed has an increase in oil content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30% on a dry-weight basis, as compared to seed obtained from a non-transgenic plant.
[0130] Yet another embodiment of the invention concerns a transgenic seed comprising a recombinant DNA construct comprising:
[0131] (a) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (b) the full-length complement of (a); wherein (a) or (b) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant and further wherein said seed has an increase in oil content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30% on a dry-weight basis, as compared to seed obtained from a non-transgenic plant.
[0132] In another embodiment, the invention concerns a method for producing a transgenic plant, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and (b) regenerating a plant from the transformed plant cell.
[0133] Another embodiment of the invention concerns a method for producing transgenic seeds, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a seed obtained from a non-transgenic plant.
[0134] Another embodiment of the invention concerns a method for producing transgenic seeds, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an increased starch content of at least 0.5%, 1%, 1.5%, 2%, 2.5%, 3.0%, 3.5%, 4.0%, 4.5%, 5.0%, 5.5%, 6.0%, 6.5%, 7.0%, 7.5%, 8.0%, 8.5%, 9.0%, 9.5%, 10.0%, 10.5%, 11%, 11.5%, 12.0%, 12.5%, 13.0%, 13.5%, 14.0%, 14.5%, 15.0%, 15.5%, 16.0%, 16.5%, 17.0%, 17.5%, 18.0%, 18.5%, 19.0%, 19.5%, 20.0%, 20.5%, 21.0%, 21.5%, 22.0%, 22.5%, 23.0%, 23.5%, 24.0%, 24.5%, 25.0%, 25.5%, 26.0%, 26.5%, 27.0%, 27.5%, 28.0%, 28.5%, 29%, 29.5%, 30.0%, 30.5%, 31.0%, 31.5%, 32.0%, 32.5%, 33.0%, 33.5%, 34.0%, 35.0%, 35.5%, 36.0%, 36.5%, 37.0%, 37.5%, 38.0%, 38.5%, 39.0%, 39.5%, 40.0%, 40.5%, 41.0%, 41.5%, 42.0%, 42.5%, 43.0%, 43.5%, 44.0%, 44.5%, 45.0%, 45.5%, 46.0%, 46.5%, 47.0%, 47.5%, 48.0%, 48.5%, 49.0%, 49.5%, or 50.0% on a dry weight basis as compared to a seed obtained from a non-transgenic plant.
[0135] In another embodiment, the invention concerns a method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a seed obtained from a non-transgenic plant.
[0136] A method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an increase in oil content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30%, on a dry-weight basis, as compared to a transgenic seed obtained from a non-transgenic plant.
[0137] Soybeans can be processed into a number of products. For example, "soy protein products" can include, and are not limited to, those items listed in Table 2.
TABLE 2
Soy Protein Products Derived from Soybean Seeds (a)

Whole Soybean Products:
  Roasted Soybeans
  Baked Soybeans
  Soy Sprouts
  Soy Milk

Specialty Soy Foods/Ingredients:
  Soy Milk
  Tofu
  Tempeh
  Miso
  Soy Sauce
  Hydrolyzed Vegetable Protein
  Whipping Protein

Processed Soy Protein Products:
  Full Fat and Defatted Flours
  Soy Grits
  Soy Hypocotyls
  Soybean Meal
  Soy Milk
  Soy Protein Isolates
  Soy Protein Concentrates
  Textured Soy Proteins
  Textured Flours and Concentrates
  Textured Concentrates
  Textured Isolates

(a) See Soy Protein Products: Characteristics, Nutritional Aspects and Utilization (1987). Soy Protein Council.
[0138] "Processing" refers to any physical and chemical methods used to obtain the products listed in Table 2 and includes, and is not limited to, heat conditioning, flaking and grinding, extrusion, solvent extraction, or aqueous soaking and extraction of whole or partial seeds. Furthermore, "processing" includes the methods used to concentrate and isolate soy protein from whole or partial seeds, as well as the various traditional Oriental methods for preparing fermented soy food products. Trading Standards and Specifications have been established for many of these products (see National Oilseed Processors Association Yearbook and Trading Rules 1991-1992).
[0139] "White" flakes refer to flaked, dehulled cotyledons that have been defatted and treated with controlled moist heat to have a PDI (AOCS: Ba 10-65) of about 85 to 90. This term can also refer to a flour with a similar PDI that has been ground to pass through a No. 100 U.S. Standard Screen size.
[0140] "Grits" refer to defatted, dehulled cotyledons having a U.S. Standard screen size of between No. 10 and 80.
[0141] "Soy Protein Concentrates" refer to those products produced from dehulled, defatted soybeans by three basic processes: acid leaching (at about pH 4.5), extraction with alcohol (about 55-80%), and denaturing the protein with moist heat prior to extraction with water. Conditions typically used to prepare soy protein concentrates have been described by Pass ((1975) U.S. Pat. No. 3,897,574; Campbell et al., (1985) in New Protein Foods, ed. by Altschul and Wilcke, Academic Press, Vol. 5, Chapter 10, Seed Storage Proteins, pp 302-338).
[0142] "Extrusion" refers to processes whereby material (grits, flour or concentrate) is passed through a jacketed auger using high pressures and temperatures as a means of altering the texture of the material. "Texturing" and "structuring" refer to extrusion processes used to modify the physical characteristics of the material. The characteristics of these processes, including thermoplastic extrusion, have been described previously (Atkinson (1970) U.S. Pat. No. 3,488,770, Horan (1985) In New Protein Foods, ed. by Altschul and Wilcke, Academic Press, Vol. 1A, Chapter 8, pp 367-414). Moreover, conditions used during extrusion processing of complex foodstuff mixtures that include soy protein products have been described previously (Rokey (1983) Feed Manufacturing Technology III, 222-237; McCulloch, U.S. Pat. No. 4,454,804).
TABLE 3
Generalized Steps for Soybean Oil and Byproduct Production

Process Step   Process                       Impurities Removed and/or By-Products Obtained
#1             soybean seed
#2             oil extraction                meal
#3             Degumming                     lecithin
#4             alkali or physical refining   gums, free fatty acids, pigments
#5             water washing                 soap
#6             Bleaching                     color, soap, metal
#7             (hydrogenation)
#8             (winterization)               stearine
#9             Deodorization                 free fatty acids, tocopherols, sterols, volatiles
#10            oil products
[0143] More specifically, soybean seeds are cleaned, tempered, dehulled, and flaked, thereby increasing the efficiency of oil extraction. Oil extraction is usually accomplished by solvent (e.g., hexane) extraction but can also be achieved by a combination of physical pressure and/or solvent extraction. The resulting oil is called crude oil. The crude oil may be degummed by hydrating phospholipids and other polar and neutral lipid complexes that facilitate their separation from the nonhydrating, triglyceride fraction (soybean oil). The resulting lecithin gums may be further processed to make commercially important lecithin products used in a variety of food and industrial products as emulsification and release (i.e., antisticking) agents. Degummed oil may be further refined for the removal of impurities (primarily free fatty acids, pigments and residual gums). Refining is accomplished by the addition of a caustic agent that reacts with free fatty acid to form soap and hydrates phosphatides and proteins in the crude oil. Water is used to wash out traces of soap formed during refining. The soapstock byproduct may be used directly in animal feeds or acidulated to recover the free fatty acids. Color is removed through adsorption with a bleaching earth that removes most of the chlorophyll and carotenoid compounds. The refined oil can be hydrogenated, thereby resulting in fats with various melting properties and textures. Winterization (fractionation) may be used to remove stearine from the hydrogenated oil through crystallization under carefully controlled cooling conditions. Deodorization (principally via steam distillation under vacuum) is the last step and is designed to remove compounds which impart odor or flavor to the oil. Other valuable byproducts such as tocopherols and sterols may be removed during the deodorization process. 
Deodorized distillate containing these byproducts may be sold for production of natural vitamin E and other high-value pharmaceutical products. Refined, bleached, (hydrogenated, fractionated) and deodorized oils and fats may be packaged and sold directly or further processed into more specialized products. A more detailed reference to soybean seed processing, soybean oil production, and byproduct utilization can be found in Erickson, Practical Handbook of Soybean Processing and Utilization, The American Oil Chemists' Society and United Soybean Board (1995). Soybean oil is liquid at room temperature because it is relatively low in saturated fatty acids when compared with oils such as coconut, palm, palm kernel, and cocoa butter.
[0144] For example, plant and microbial oils containing polyunsaturated fatty acids (PUFAs) that have been refined and/or purified can be hydrogenated, thereby resulting in fats with various melting properties and textures. Many processed fats (including spreads, confectionary fats, hard butters, margarines, baking shortenings, etc.) require varying degrees of solidity at room temperature and can only be produced through alteration of the source oil's physical properties. This is most commonly achieved through catalytic hydrogenation.
[0145] Hydrogenation is a chemical reaction in which hydrogen is added to the unsaturated fatty acid double bonds with the aid of a catalyst such as nickel. For example, high oleic soybean oil contains unsaturated oleic, linoleic, and linolenic fatty acids, and each of these can be hydrogenated. Hydrogenation has two primary effects. First, the oxidative stability of the oil is increased as a result of the reduction of the unsaturated fatty acid content. Second, the physical properties of the oil are changed because the fatty acid modifications increase the melting point resulting in a semi-liquid or solid fat at room temperature.
[0146] There are many variables which affect the hydrogenation reaction, which in turn alter the composition of the final product. Operating conditions including pressure, temperature, catalyst type and concentration, agitation, and reactor design are among the more important parameters that can be controlled. Selective hydrogenation conditions can be used to hydrogenate the more unsaturated fatty acids in preference to the less unsaturated ones. Very light or brush hydrogenation is often employed to increase stability of liquid oils. Further hydrogenation converts a liquid oil to a physically solid fat. The degree of hydrogenation depends on the desired performance and melting characteristics designed for the particular end product. Liquid shortenings (used in the manufacture of baking products, solid fats and shortenings used for commercial frying and roasting operations) and base stocks for margarine manufacture are among the myriad of possible oil and fat products achieved through hydrogenation. A more detailed description of hydrogenation and hydrogenated products can be found in Patterson, H. B. W., Hydrogenation of Fats and Oils: Theory and Practice. The American Oil Chemists' Society (1994).
[0147] Hydrogenated oils have become somewhat controversial due to the presence of trans-fatty acid isomers that result from the hydrogenation process. Ingestion of large amounts of trans-isomers has been linked with detrimental health effects including increased ratios of low density to high density lipoproteins in the blood plasma and increased risk of coronary heart disease.
[0148] In another embodiment, the invention concerns a transgenic seed produced by any of the above methods. Preferably, the seed is a soybean seed.
[0149] The present invention concerns a transgenic soybean seed having increased total fatty acid content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30% when compared to the total fatty acid content of a non-transgenic, null segregant soybean seed. It is understood that any measurable increase in the total fatty acid content of a transgenic versus a non-transgenic, null segregant would be useful. Such increases in the total fatty acid content would include, but are not limited to, at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30%.
[0150] Regulatory sequences may include, and are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
[0151] "Tissue-specific" promoters direct RNA production preferentially in particular types of cells or tissues. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg (Biochemistry of Plants 15:1-82 (1989)). It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of some variation may have identical promoter activity.
[0152] A number of promoters can be used to practice the present invention. The promoters can be selected based on the desired outcome. The nucleic acids can be combined with constitutive, tissue-specific (preferred), inducible, or other promoters for expression in the host organism. Suitable constitutive promoters for use in a plant host cell include, for example, the core promoter of the Rsyn7 promoter and other constitutive promoters disclosed in WO 99/43838 and U.S. Pat. No. 6,072,050; the core CaMV 35S promoter (Odell et al., Nature 313:810-812 (1985)); rice actin (McElroy et al., Plant Cell 2:163-171 (1990)); ubiquitin (Christensen et al., Plant Mol. Biol. 12:619-632 (1989) and Christensen et al., Plant Mol. Biol. 18:675-689 (1992)); pEMU (Last et al., Theor. Appl. Genet. 81:581-588 (1991)); MAS (Velten et al., EMBO J. 3:2723-2730 (1984)); ALS promoter (U.S. Pat. No. 5,659,026), and the like. Other constitutive promoters include, for example, those discussed in U.S. Pat. Nos. 5,608,149; 5,608,144; 5,604,121; 5,569,597; 5,466,785; 5,399,680; 5,268,463; 5,608,142; and 6,177,611.
[0153] In choosing a promoter to use in the methods of the invention, it may be desirable to use a tissue-specific or developmentally regulated promoter. A tissue-specific or developmentally regulated promoter is a DNA sequence which regulates the expression of a DNA sequence selectively in particular cells/tissues of a plant. Any identifiable promoter may be used in the methods of the present invention which causes the desired temporal and spatial expression.
[0154] Promoters which are seed- or embryo-specific and may be useful in the invention include patatin (potato tubers) (Rocha-Sosa, M., et al. (1989) EMBO J. 8:23-29), convicilin, vicilin, and legumin (pea cotyledons) (Rerie, W. G., et al. (1991) Mol. Gen. Genet. 259:149-157; Newbigin, E. J., et al. (1990) Planta 180:461-470; Higgins, T. J. V., et al. (1988) Plant Mol. Biol. 11:683-695), zein (maize endosperm) (Schernthaner, J. P., et al. (1988) EMBO J. 7:1249-1255), phaseolin (bean cotyledon) (Sengupta-Gopalan, C., et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:3320-3324), phytohemagglutinin (bean cotyledon) (Voelker, T. et al. (1987) EMBO J. 6:3571-3577), β-conglycinin and glycinin (soybean cotyledon) (Chen, Z-L, et al. (1988) EMBO J. 7:297-302), glutelin (rice endosperm), hordein (barley endosperm) (Marris, C., et al. (1988) Plant Mol. Biol. 10:359-366), glutenin and gliadin (wheat endosperm) (Colot, V., et al. (1987) EMBO J. 6:3559-3564), and sporamin (sweet potato tuberous root) (Hattori, T., et al. (1990) Plant Mol. Biol. 14:595-604). Promoters of seed-specific genes operably linked to heterologous coding regions in chimeric gene constructions maintain their temporal and spatial expression pattern in transgenic plants. Such examples include the Arabidopsis thaliana 2S seed storage protein gene promoter to express enkephalin peptides in Arabidopsis and Brassica napus seeds (Vandekerckhove et al., Bio/Technology 7:929-932 (1989)), bean lectin and bean beta-phaseolin promoters to express luciferase (Riggs et al., Plant Sci. 63:47-57 (1989)), and wheat glutenin promoters to express chloramphenicol acetyl transferase (Colot et al., EMBO J. 6:3559-3564 (1987)).
[0155] A plethora of promoters is described in WO 00/18963, published on Apr. 6, 2000, the disclosure of which is hereby incorporated by reference. Examples of seed-specific promoters include, and are not limited to, the promoter for soybean Kunitz trypsin inhibitor (Kti3, Jofuku and Goldberg, Plant Cell 1:1079-1093 (1989)), β-conglycinin (Chen et al., Dev. Genet. 10:112-122 (1989)), the napin promoter, and the phaseolin promoter.
[0156] In some embodiments, isolated nucleic acids which serve as promoter or enhancer elements can be introduced in the appropriate position (generally upstream) of a non-heterologous form of a polynucleotide of the present invention so as to up- or down-regulate expression of a polynucleotide of the present invention. For example, endogenous promoters can be altered in vivo by mutation, deletion, and/or substitution (see, Kmiec, U.S. Pat. No. 5,565,350; Zarling et al., PCT/US93/03868), or isolated promoters can be introduced into a plant cell in the proper orientation and distance from a cognate gene of a polynucleotide of the present invention so as to control the expression of the gene. Gene expression can be modulated under conditions suitable for plant growth so as to alter the total concentration and/or alter the composition of the polypeptides of the present invention in a plant cell. Thus, the present invention includes compositions, and methods for making, heterologous promoters and/or enhancers operably linked to a native, endogenous (i.e., non-heterologous) form of a polynucleotide of the present invention.
[0157] An intron sequence can be added to the 5' untranslated region or the coding sequence of the partial coding sequence to increase the amount of the mature message that accumulates in the cytosol. Inclusion of a spliceable intron in the transcription unit in both plant and animal expression constructs has been shown to increase gene expression at both the mRNA and protein levels up to 1000-fold (Buchman and Berg, Mol. Cell Biol. 8:4395-4405 (1988); Callis et al., Genes Dev. 1:1183-1200 (1987)). Such intron enhancement of gene expression is typically greatest when the intron is placed near the 5' end of the transcription unit. Use of the maize introns Adh1-S intron 1, 2, and 6, and the Bronze-1 intron, is known in the art. See generally, The Maize Handbook, Chapter 116, Freeling and Walbot, Eds., Springer, New York (1994). A vector comprising the sequences from a polynucleotide of the present invention will typically comprise a marker gene which confers a selectable phenotype on plant cells. Typical vectors useful for expression of genes in higher plants are well known in the art and include vectors derived from the tumor-inducing (Ti) plasmid of Agrobacterium tumefaciens described by Rogers et al., Meth. in Enzymol. 153:253-277 (1987).
[0158] If polypeptide expression is desired, it is generally desirable to include a polyadenylation region at the 3'-end of a polynucleotide coding region. The polyadenylation region can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The 3' end sequence to be added can be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.
[0159] Preferred recombinant DNA constructs include the following combinations: a) a nucleic acid fragment corresponding to a promoter operably linked to at least one nucleic acid fragment encoding a selectable marker, followed by a nucleic acid fragment corresponding to a terminator, b) a nucleic acid fragment corresponding to a promoter operably linked to a nucleic acid fragment capable of producing a stem-loop structure, and followed by a nucleic acid fragment corresponding to a terminator, and c) any combination of a) and b) above. Preferably, in the stem-loop structure at least one nucleic acid fragment that is capable of suppressing expression of a native gene comprises the "loop" and is surrounded by nucleic acid fragments capable of producing a stem.
[0160] Preferred methods for transforming dicots and obtaining transgenic plants have been published, among others, for cotton (U.S. Pat. No. 5,004,863, U.S. Pat. No. 5,159,135); soybean (U.S. Pat. No. 5,569,834, U.S. Pat. No. 5,416,011); Brassica (U.S. Pat. No. 5,463,174); peanut (Cheng et al. (1996) Plant Cell Rep. 15:653-657, McKently et al. (1995) Plant Cell Rep. 14:699-703); papaya (Ling, K. et al. (1991) Bio/technology 9:752-758); and pea (Grant et al. (1995) Plant Cell Rep. 15:254-258). For a review of other commonly used methods of plant transformation see Newell, C. A. (2000) Mol. Biotechnol. 16:53-65. One of these methods of transformation uses Agrobacterium rhizogenes (Tepfer, M. and Casse-Delbart, F. (1987) Microbiol. Sci. 4:24-28). Transformation of soybeans using direct delivery of DNA has been published using PEG fusion (PCT publication WO 92/17598), electroporation (Chowrira, G. M. et al. (1995) Mol. Biotechnol. 3:17-23; Christou, P. et al. (1987) Proc. Natl. Acad. Sci. U.S.A. 84:3962-3966), microinjection, or particle bombardment (McCabe, D. E. et al. (1988) Bio/Technology 6:923; Christou et al. (1988) Plant Physiol. 87:671-674).
[0161] There are a variety of methods for the regeneration of plants from plant tissue. The particular method of regeneration will depend on the starting plant tissue and the particular plant species to be regenerated. The regeneration, development and cultivation of plants from single plant protoplast transformants or from various transformed explants are well known in the art (Weissbach and Weissbach (Eds.) (1988) Methods for Plant Molecular Biology, Academic Press, Inc., San Diego, Calif.). This regeneration and growth process typically includes the steps of selecting transformed cells and culturing those individualized cells through the usual stages of embryonic development through the rooted plantlet stage. Transgenic embryos and seeds are similarly regenerated. The resulting transgenic rooted shoots are thereafter planted in an appropriate plant growth medium such as soil. The regenerated plants may be self-pollinated. Otherwise, pollen obtained from the regenerated plants is crossed to seed-grown plants of agronomically important lines. Conversely, pollen from plants of these important lines is used to pollinate regenerated plants. A transgenic plant of the present invention containing a desired polypeptide(s) is cultivated using methods well known to one skilled in the art.
[0162] In addition to the above discussed procedures, practitioners are familiar with the standard resource materials which describe specific conditions and procedures for the construction, manipulation and isolation of macromolecules (e.g., DNA molecules, plasmids, etc.), generation of recombinant DNA fragments and recombinant expression constructs and the screening and isolating of clones, (see for example, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press; Maliga et al. (1995) Methods in Plant Molecular Biology, Cold Spring Harbor Press; Birren et al. (1998) Genome Analysis: Detecting Genes, 1, Cold Spring Harbor, N.Y.; Birren et al. (1998) Genome Analysis: Analyzing DNA, 2, Cold Spring Harbor, N.Y.; Plant Molecular Biology: A Laboratory Manual, eds. Clark, Springer, New York (1997)).
[0163] Assays to detect proteins may be performed by SDS-polyacrylamide gel electrophoresis or immunological assays. Assays to detect levels of substrates or products of enzymes may be performed using gas chromatography or liquid chromatography for separation and UV or visible spectrometry or mass spectrometry for detection, or the like. Determining the levels of mRNA of the enzyme of interest may be accomplished using northern-blotting or RT-PCR techniques. Once plants have been regenerated, and progeny plants homozygous for the transgene have been obtained, plants will have a stable phenotype that will be observed in similar seeds in later generations.
[0164] In another aspect, this invention includes a polynucleotide of this invention or a functionally equivalent subfragment thereof useful in antisense inhibition or cosuppression of expression of nucleic acid sequences encoding proteins having cytosolic pyrophosphatase activity, most preferably in antisense inhibition or cosuppression of an endogenous ORM protein gene.
[0165] Protocols for antisense inhibition or co-suppression are well known to those skilled in the art.
[0166] The sequences of the polynucleotide fragments used for suppression do not have to be 100% identical to the sequences of the polynucleotide fragment found in the gene to be suppressed. For example, suppression of all the subunits of the soybean seed storage protein β-conglycinin has been accomplished using a polynucleotide derived from a portion of the gene encoding the α subunit (U.S. Pat. No. 6,362,399). β-conglycinin is a heterogeneous glycoprotein composed of varying combinations of three highly negatively charged subunits identified as α, α' and β. The polynucleotide sequences encoding the α and α' subunits are 85% identical to each other while the polynucleotide sequences encoding the β subunit are 75 to 80% identical to the α and α' subunits, respectively. Thus, polynucleotides that are at least 75% identical to a region of the polynucleotide that is the target for suppression have been shown to be effective in suppressing the desired target.
[0167] The polynucleotide may be at least 80% identical, at least 90% identical, at least 95% identical, or about 100% identical to the desired target sequence.
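By way of illustration only, the identity thresholds discussed above can be checked with a short calculation. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length by a separate alignment program; it does not implement the Clustal V method itself, and the convention of skipping gapped columns, the function names and the example sequences are hypothetical choices made for this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns in which either sequence carries a gap are skipped.
    This is one common convention, not the Clustal V scoring scheme.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences, not taken from any SEQ ID NO of this application.
fragment = "ATGGCTAAGC"
target = "ATGGCAAAGC"
print(percent_identity(fragment, target), meets_threshold(fragment, target, 80.0))
```

A suppression fragment could be screened against each member of a gene family in this way before a construct is assembled, using whichever threshold (75%, 80%, 90% or 95%) is of interest.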
[0168] The isolated nucleic acids and proteins and any embodiments of the present invention can be used over a broad range of plant types, particularly dicots such as the species of the genus Glycine.
[0169] It is believed that the nucleic acids and proteins and any embodiments of the present invention can be used with monocots as well including, but not limited to, Gramineae including Sorghum bicolor and Zea mays.
[0170] The isolated nucleic acid and proteins of the present invention can also be used in species from the following dicot genera: Cucurbita, Rosa, Vitis, Juglans, Fragaria, Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium, Helianthus, Lactuca, Antirrhinum, Pelargonium, Ranunculus, Senecio, Salpiglossis, Cucumis, Browallia, Glycine, Pisum, Phaseolus, and from the following monocot genera: Bromus, Asparagus, Hemerocallis, Panicum, Pennisetum, Lolium, Oryza, Avena, Hordeum, Secale, Triticum, Bambusa, Dendrocalamus, and Melocanna.
EXAMPLES
[0171] The present invention is further defined in the following Examples, in which parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Thus, various modifications of the invention in addition to those shown and described herein will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.
[0172] The disclosure of each reference set forth herein is incorporated herein by reference in its entirety.
Example 1
Creation of an Arabidopsis Population with Activation-Tagged Genes
[0175] An 18.49-kb T-DNA based binary construct was created, pHSbarENDs2 (SEQ ID NO:1), that contains four multimerized enhancer elements derived from the Cauliflower Mosaic Virus 35S promoter (corresponding to sequences -341 to -64, as defined by Odell et al., Nature 313:810-812 (1985)). The construct also contains vector sequences (pUC9) and a poly-linker (SEQ ID NO:2) to allow plasmid rescue, transposon sequences (Ds) to remobilize the T-DNA, and the bar gene to allow for glufosinate selection of transgenic plants. In principle, only the 10.8-kb segment from the right border (RB) to left border (LB) inclusive will be transferred into the host plant genome. Since the enhancer elements are located near the RB, they can induce cis-activation of genomic loci following T-DNA integration.
[0176] Arabidopsis activation-tagged populations were created by whole plant Agrobacterium transformation. The pHSbarENDs2 (SEQ ID NO:1) construct was transformed into Agrobacterium tumefaciens strain C58, grown in lysogeny broth medium at 25° C. to an OD600 of approximately 1.0. Cells were then pelleted by centrifugation and resuspended in an equal volume of 5% sucrose/0.05% Silwet L-77 (OSI Specialties, Inc). At early bolting, soil-grown Arabidopsis thaliana ecotype Col-0 plants were top watered with the Agrobacterium suspension. A week later, the same plants were top watered again with the same Agrobacterium strain in sucrose/Silwet. The plants were then allowed to set seed as normal. The resulting T1 seed were sown on soil, and transgenic seedlings were selected by spraying with glufosinate (FINALE®; AgrEvo; Bayer Environmental Science). A total of 100,000 glufosinate-resistant T1 seedlings were selected. T2 seed from each line was kept separate. Small aliquots of T2 seed from independently generated activation-tagged lines were pooled. The pooled seed were planted in soil and plants were grown to maturity, producing T3 seed pools each comprised of seed derived from 96 activation-tagged lines.
Example 2
Identification and Characterization of Mutant Line lo17849
[0177] A method for screening Arabidopsis seed density was developed based on Focks and Benning (1998) with significant modifications. Arabidopsis seeds can be separated according to their density. Density layers were prepared from mixtures of 1,6-dibromohexane (d=1.6), 1-bromohexane (d=1.17) and mineral oil (d=0.84) at different ratios. From the bottom to the top of the tube, 6 layers of organic solvents, each of 2 mL, were added sequentially. The ratios of 1,6-dibromohexane:1-bromohexane:mineral oil for each layer were 1:1:0, 1:2:0, 0:1:0, 0:5:1, 0:3:1, 0:0:1. About 600 mg of T3 seed of a given pool of 96 activation-tagged lines, corresponding to about 30,000 seeds, was loaded onto the surface layer of a 15 mL glass tube containing said step gradient. After centrifugation for 5 min at 2000×g, seeds were separated according to their density. The seeds in the lower two layers of the step gradient and from the bottom of the tube were collected. Organic solvents were removed by sequential washing with 100% and 80% ethanol and seeds were sterilized using a solution of 5% hypochlorite (NaOCl) in water. Seed were rinsed in sterile water and plated on MS-1 media comprised of 0.5×MS salts, 1% (W/V) sucrose, 0.05 MES/KOH (pH 5.8), 200 μg/mL, 10 g/L agar and 15 mg/L glufosinate ammonium (Basta; Sigma Aldrich, USA). A total of 520 T3 pools, each derived from 96 T2 activation-tagged lines, were screened in this manner. Seed pool 475, when subjected to density gradient centrifugation as described above, produced about 25 seed with increased density. These seed were sterilized and plated on selective media containing Basta. Basta-resistant seedlings were transferred to soil and plants were grown in a controlled environment (22° C., 16 h light/8 h dark, 100-200 μE m-2 s-1) to maturity for about 8-10 weeks alongside four untransformed wild type plants of the Columbia ecotype. Oil content of T4 seed and control seed was measured by NMR as follows.
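For illustration, the approximate density of each layer of the step gradient can be estimated from the stated mixing ratios. The following is a minimal sketch, assuming that the ratios are volume ratios, that the component densities are in g/mL, and that the solvents mix ideally (volumes additive); none of these assumptions is stated in the text, and the names used are hypothetical.

```python
# Component densities as given in the text (units assumed to be g/mL),
# in the order 1,6-dibromohexane : 1-bromohexane : mineral oil.
COMPONENT_DENSITIES = (1.6, 1.17, 0.84)

# Mixing ratios for the six layers, listed from the bottom of the tube to the top.
LAYER_RATIOS = [
    (1, 1, 0),
    (1, 2, 0),
    (0, 1, 0),
    (0, 5, 1),
    (0, 3, 1),
    (0, 0, 1),
]


def layer_density(ratio, densities=COMPONENT_DENSITIES):
    """Volume-weighted mean density of one layer (ideal mixing assumed)."""
    total_parts = sum(ratio)
    if total_parts <= 0:
        raise ValueError("ratio must contain at least one non-zero part")
    return sum(r * d for r, d in zip(ratio, densities)) / total_parts


for position, ratio in enumerate(LAYER_RATIOS, start=1):
    print(f"layer {position} (from bottom) {ratio}: {layer_density(ratio):.3f}")
```

Under these assumptions the estimated densities decrease steadily from the bottom layer to the top layer, which is consistent with the denser seed being recovered from the lower two layers and the bottom of the tube.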
[0178] NMR based analysis of seed oil content:
[0179] Seed oil content was determined using a Maran Ultra NMR analyzer (Resonance Instruments Ltd, Witney, Oxfordshire, UK). Samples (e.g., batches of Arabidopsis seed ranging in weight between 5 and 200 mg) were placed into pre-weighed 2 mL polypropylene tubes (Corning Inc, Corning N.Y., USA; Part no. 430917) previously labeled with unique bar code identifiers. Samples were then placed into 96-place carriers and processed through the following series of steps by an ADEPT COBRA 600® SCARA robotic system:
[0180] 1. pick up tube (the robotic arm was fitted with a vacuum pickup device);
[0181] 2. read bar code;
[0182] 3. expose tube to antistatic device (ensured that Arabidopsis seed were not adhering to the tube walls);
[0183] 4. weigh tube (containing the sample), to 0.0001 g precision;
[0184] 5. take NMR reading; measured as the intensity of the proton spin echo 1 msec after a 22.95 MHz signal had been applied to the sample (data was collected for 32 NMR scans per sample);
[0185] 6. return tube to rack; and
[0186] 7. repeat process with next tube.
Bar codes, tube weights and NMR readings were recorded by a computer connected to the system. Sample weight was determined by subtracting the polypropylene tube weight from the weight of the tube containing the sample.
[0187] Seed oil content of soybean seed or soybean somatic embryos was calculated as follows:
% oil (% wt basis)=[(NMR signal/sample wt (g))-70.58]/351.45
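The soybean calculation can be expressed as a small function. This is a sketch only: it reads the equation as (NMR signal per gram of sample, minus 70.58) divided by 351.45, and the function name is hypothetical.

```python
def soybean_oil_percent(nmr_signal: float, sample_wt_g: float) -> float:
    """Percent oil (% wt basis) from an NMR reading of soybean seed.

    Implements: ((NMR signal / sample wt in g) - 70.58) / 351.45
    The constants are the calibration values quoted in the text.
    """
    if sample_wt_g <= 0:
        raise ValueError("sample weight must be positive")
    return (nmr_signal / sample_wt_g - 70.58) / 351.45


if __name__ == "__main__":
    # Illustrative reading for a 0.1500 g seed (not measured data).
    print(round(soybean_oil_percent(1064.937, 0.1500), 2))
```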
[0188] Calibration parameters were determined by precisely weighing samples of soy oil (ranging from 0.0050 to 0.0700 g at approximately 0.0050 g intervals; weighed to a precision of 0.0001 g) into Corning tubes (see above) and subjecting them to NMR analysis. A calibration curve of oil content (% seed wt basis; assuming a standard seed weight of 0.1500 g) to NMR value was established.
[0189] The relationship between seed oil contents measured by NMR and absolute oil contents measured by classical analytical chemistry methods was determined as follows. Fifty soybean seed, chosen to have a range of oil contents, were dried at 40° C. in a forced air oven for 48 h. Individual seeds were subjected to NMR analysis, as described above, and were then ground to a fine powder in a GenoGrinder (SPEX CertiPrep (Metuchen, N.J., U.S.A.); 1500 oscillations per minute, for 1 minute). Aliquots of between 70 and 100 mg were weighed (to 0.0001 g precision) into 13×100 mm glass tubes fitted with Teflon® lined screw caps; the remainder of the powder from each bean was used to determine moisture content, by weight difference after 18 h in a forced air oven at 105° C. Heptane (3 mL) was added to the powders in the tubes and, after vortex mixing, samples were extracted on an end-over-end agitator for 1 h at room temperature. The extracts were centrifuged (1500×g for 10 min), the supernatant was decanted into a clean tube and the pellets were extracted two more times (1 h each) with 1 mL heptane. The supernatants from the three extractions were combined and 50 μL internal standard (triheptadecanoic acid; 10 mg/mL toluene) was added prior to evaporation to dryness at room temperature under a stream of nitrogen gas; standards containing 0, 0.0050, 0.0100, 0.0150, 0.0200 and 0.0300 g soybean oil, in 5 mL heptane, were prepared in the same manner. Fats were converted to fatty acid methyl esters (FAMEs) by adding 1 mL 5% sulfuric acid (v:v in anhydrous methanol) to the dried pellets and heating them at 80° C. for 30 min, with occasional vortex mixing. The samples were allowed to cool to room temperature and 1 mL 25% aqueous sodium chloride was added, followed by 0.8 mL heptane. After vortex mixing, the phases were allowed to separate and the upper organic phase was transferred to a sample vial and subjected to GC analysis.
[0190] Plotting NMR determined oil contents versus GC determined oil contents resulted in a linear relationship between 9.66 and 26.27% oil (GC values; % seed wt basis) with a slope of 1.0225 and an R2 of 0.9744; based on a seed moisture content that averaged 2.6+/-0.8%.
[0191] Seed oil content (on a % seed weight basis) of Arabidopsis seed was calculated as follows:
mg oil=(NMR signal-2.1112)/37.514;
% oil=[(mg oil)/1000]/[g of seed sample weight]×100.
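The two-step Arabidopsis calculation, together with the tare subtraction described for the robotic weighing step, can be sketched as follows. Function and parameter names are hypothetical; the constants 2.1112 and 37.514 are those given in the formula.

```python
def arabidopsis_oil_mg(nmr_signal: float) -> float:
    """Milligrams of oil: (NMR signal - 2.1112) / 37.514."""
    return (nmr_signal - 2.1112) / 37.514


def arabidopsis_oil_percent(nmr_signal: float,
                            tube_with_sample_g: float,
                            empty_tube_g: float) -> float:
    """Percent oil on a seed weight basis.

    Sample weight is the weight of the tube containing the sample minus
    the pre-weighed empty tube, as described for the NMR workflow.
    """
    sample_g = tube_with_sample_g - empty_tube_g
    if sample_g <= 0:
        raise ValueError("sample weight must be positive")
    return (arabidopsis_oil_mg(nmr_signal) / 1000.0) / sample_g * 100.0


if __name__ == "__main__":
    # Illustrative numbers only: a 10 mg seed sample giving a 4 mg oil signal.
    print(round(arabidopsis_oil_percent(152.1672, 1.0100, 1.0000), 1))
```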
[0192] Prior to establishing this formula, Arabidopsis seed oil was extracted as follows. Approximately 5 g of mature Arabidopsis seed (cv Columbia) were ground to a fine powder using a mortar and pestle. The powder was placed into a 33×94 mm paper thimble (Ahlstrom #7100-3394; Ahlstrom, Mount Holly Springs, Pa., USA) and the oil extracted during approximately 40 extraction cycles with petroleum ether (BP 39.9-51.7° C.) in a Soxhlet apparatus. The extract was allowed to cool and the crude oil was recovered by removing the solvent under vacuum in a rotary evaporator. Calibration parameters were determined by precisely weighing 11 standard samples of partially purified Arabidopsis oil (samples contained 3.6, 6.3, 7.9, 9.6, 12.8, 16.3, 20.3, 28.2, 32.1, 39.9 and 60 mg of partially purified Arabidopsis oil, weighed to a precision of 0.0001 g) into 2 mL polypropylene tubes (Corning Inc, Corning N.Y., USA; Part no. 430917) and subjecting them to NMR analysis. A calibration curve of oil content (% seed weight basis) to NMR value was established.
[0193] Table 4 shows that the seed oil content of T4 activation-tagged line with Bar code ID K17849 is only 86% of that of the average of four WT control plants grown in the same flat.
TABLE 4
Oil Content of T4 activation-tagged lines derived from T3 pool 256

BARCODE   % Oil   T3 pool ID #   oil content % of WT
K17835    40.1    256             95.8
K17836    43.0    256            102.7
K17837    42.2    256            100.8
K17838    42.6    256            101.8
K17839    41.7    256             99.6
K17840    42.4    256            101.3
K17841    43.7    256            104.5
K17842    40.9    256             97.6
K17843    42.9    256            102.5
K17844    43.3    256            103.5
K17845    43.6    256            104.1
K17846    41.5    256             99.1
K17847    40.9    256             97.8
K17848    41.7    256             99.7
K17849    36.0    256             86.0
K17851    43.3    256            103.5
K17852    42.8    256            102.3
K17853    43.0    256            102.8
K17854    42.1    256            100.6
K17855    42.8    256            102.2
K17856    41.9    wt
K17857    40.2    wt
K17849 was renamed lo17849. T4 seed were plated on selective media and nine glufosinate-resistant seedlings were planted in the same flat as six untransformed WT plants. Plants were grown to maturity and oil content was determined by NMR.
TABLE 5
Oil Content of T5 seed of activation-tagged line lo17849

BARCODE   T5 activation-tagged line ID   % Oil   Average % oil   oil content % of WT   Average oil content % of WT
K24753    lo17849                        39.3    38.3            95.3                  92.9
K24747    lo17849                        38.9                    94.2
K24752    lo17849                        38.8                    94.1
K24746    lo17849                        38.4                    93.2
K24750    lo17849                        38.4                    93.1
K24751    lo17849                        38.2                    92.7
K24748    lo17849                        38.0                    92.1
K24754    lo17849                        37.8                    91.5
K24749    lo17849                        36.9                    89.5
K24760    wt                             42.9
K24755    wt                             41.7
K24757    wt                             41.6
K24756    wt                             40.9
K24759    wt                             40.7
K24758    wt                             39.7
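The "% of WT" columns of Table 5 appear to be each line's oil content divided by the mean oil content of the WT plants grown in the same flat. The sketch below recomputes them from the rounded "% Oil" values in the table; that the source used this exact normalization is an assumption, and small differences in the last digit are expected because the published percentages were presumably derived from unrounded readings.

```python
from statistics import mean

# "% Oil" values transcribed from Table 5.
LO17849_OIL = [39.3, 38.9, 38.8, 38.4, 38.4, 38.2, 38.0, 37.8, 36.9]
WT_OIL = [42.9, 41.7, 41.6, 40.9, 40.7, 39.7]

wt_mean = mean(WT_OIL)                      # mean oil content of WT controls
percent_of_wt = [100.0 * oil / wt_mean for oil in LO17849_OIL]
line_mean = mean(LO17849_OIL)               # average % oil of lo17849 plants
mean_percent_of_wt = 100.0 * line_mean / wt_mean

if __name__ == "__main__":
    print(f"WT mean: {wt_mean:.2f} % oil")
    print(f"lo17849 mean: {line_mean:.1f} % oil")
    print(f"lo17849 as % of WT: {mean_percent_of_wt:.1f}")
```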
[0194] Table 5 shows that the seed oil content of T5 seed of activation-tagged line lo17849 is between 89.5 and 95.3% of that of WT control plants grown in the same flat. The average seed oil content of all T5 lines of lo17849 was 93% of the WT control plant average. Twenty-four Basta-resistant T5 seedlings of lo17849 were planted in the same flat alongside 12 untransformed WT control plants of the Columbia ecotype. Plants were grown to maturity and seed was bulk-harvested from all 24 lo17849 and 12 WT plants. Oil content of lo17849 and WT seed was measured by NMR (Table 6).
TABLE 6
Oil Content of T6 activation-tagged line lo17849

Barcode   % Oil   Seed ID    oil content % of WT
K37207    39.7    lo17849    92.3
K37208    43.0    WT
[0195] T6 seed of lo17849 and WT seed produced under identical conditions were subjected to compositional analysis as described below. Seed weight was measured by determining the weight of 100 seed. This analysis was performed in triplicate.
[0196] Tissue Preparation:
[0197] Arabidopsis seed (approximately 0.5 g in a 1/2×2'' polycarbonate vial) was ground to a homogeneous paste in a GENOGRINDER® (3×30 sec at 1400 strokes per minute, with a 15 sec interval between each round of agitation). After the second round of agitation, the vials were removed and the Arabidopsis paste was scraped from the walls with a spatula prior to the last burst of agitation.
[0198] Determination of Protein Content:
[0199] Protein contents were estimated by combustion analysis on a Thermo FINNIGAN® Flash 1112EA combustion analyzer running in the NCS mode (vanadium pentoxide was omitted) according to instructions of the manufacturer. Triplicate samples of the ground pastes, 4-8 mg, weighed to an accuracy of 0.001 mg on a METTLER-TOLEDO® MX5 micro balance, were used for analysis. Protein contents were calculated by multiplying % N, determined by the analyzer, by 6.25. Final protein contents were expressed on a % tissue weight basis.
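The nitrogen-to-protein conversion is a single multiplication. A minimal sketch follows; the function name is hypothetical and the example nitrogen value is chosen for illustration rather than taken from the source.

```python
NITROGEN_TO_PROTEIN_FACTOR = 6.25  # conversion factor stated in the text


def protein_percent(nitrogen_percent: float) -> float:
    """Protein content (% tissue weight) from % N measured by combustion."""
    return nitrogen_percent * NITROGEN_TO_PROTEIN_FACTOR


if __name__ == "__main__":
    # Illustrative input: 2.712 % N corresponds to 16.95 % protein.
    print(round(protein_percent(2.712), 2))
```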
[0200] Determination of Non-Structural Carbohydrate Content:
[0201] Sub-samples of the ground paste were weighed (to an accuracy of 0.1 mg) into 13×100 mm glass tubes; the tubes had TEFLON® lined screw-cap closures. Three replicates were prepared for each sample tested.
[0202] Lipid extraction was performed by adding 2 ml aliquots of heptane to each tube. The tubes were vortex mixed and placed into an ultrasonic bath (VWR Scientific Model 750D) filled with water heated to 60° C. The samples were sonicated at full-power (˜360 W) for 15 min and were then centrifuged (5 min×1700 g). The supernatants were transferred to clean 13×100 mm glass tubes and the pellets were extracted 2 more times with heptane (2 ml, second extraction; 1 ml third extraction) with the supernatants from each extraction being pooled. After lipid extraction 1 ml acetone was added to the pellets and after vortex mixing, to fully disperse the material, they were taken to dryness in a Speedvac.
[0203] Non-Structural Carbohydrate Extraction and Analysis:
[0204] Two ml of 80% ethanol was added to the dried pellets from above. The samples were thoroughly vortex mixed until the plant material was fully dispersed in the solvent prior to sonication at 60° C. for 15 min. After centrifugation, 5 min×1700 g, the supernatants were decanted into clean 13×100 mm glass tubes. Two more extractions with 80% ethanol were performed and the supernatants from each were pooled. The extracted pellets were suspended in acetone and dried (as above). An internal standard β-phenyl glucopyranoside (100 μl of a 0.5000+/-0.0010 g/100 ml stock) was added to each extract prior to drying in a Speedvac. The extracts were maintained in a desiccator until further analysis.
[0205] The acetone dried powders from above were suspended in 0.9 ml MOPS (3-N[Morpholino]propane-sulfonic acid; 50 mM, 5 mM CaCl2, pH 7.0) buffer containing 100 U of heat-stable α-amylase (from Bacillus licheniformis; Sigma A-4551).
[0206] Samples were placed in a heat block (90° C.) for 75 min and were vortex mixed every 15 min. Samples were then allowed to cool to room temperature and 0.6 ml acetate buffer (285 mM, pH 4.5) containing 5 U amyloglucosidase (Roche 110 202 367 001) was added to each. Samples were incubated for 15-18 h at 55° C. in a water bath fitted with a reciprocating shaker; standards of soluble potato starch (Sigma S-2630) were included to ensure that starch digestion went to completion.
[0207] Post-digestion the released carbohydrates were extracted prior to analysis. Absolute ethanol (6 ml) was added to each tube and after vortex mixing the samples were sonicated for 15 min at 60° C. Samples were centrifuged (5 min×1700 g) and the supernatants were decanted into clean 13×100 mm glass tubes. The pellets were extracted 2 more times with 3 ml of 80% ethanol and the resulting supernatants were pooled. Internal standard (100 μl β-phenyl glucopyranoside, as above) was added to each sample prior to drying in a Speedvac.
[0208] Sample Preparation and Analysis:
[0209] The dried samples from the soluble and starch extractions described above were solubilized in anhydrous pyridine (Sigma-Aldrich P57506) containing 30 mg/ml of hydroxylamine HCl (Sigma-Aldrich 159417). Samples were placed on an orbital shaker (300 rpm) overnight and were then heated for 1 hr (75° C.) with vigorous vortex mixing applied every 15 min. After cooling to room temperature, 1 ml hexamethyldisilazane (Sigma-Aldrich H-4875) and 100 μl trifluoroacetic acid (Sigma-Aldrich T-6508) were added. The samples were vortex mixed and the precipitates were allowed to settle prior to transferring the supernatants to GC sample vials.
[0210] Samples were analyzed on an Agilent 6890 gas chromatograph fitted with a DB-17MS capillary column (15 m×0.32 mm×0.25 μm film). Inlet and detector temperatures were both 275° C. After injection (2 μl, 20:1 split) the initial column temperature (150° C.) was increased to 180° C. at a rate of 3° C./min and then at 25° C./min to a final temperature of 320° C. The final temperature was maintained for 10 min. The carrier gas was H2 at a linear velocity of 51 cm/sec. Detection was by flame ionization. Data analysis was performed using Agilent ChemStation software. Each sugar was quantified relative to the internal standard and detector responses were applied for each individual carbohydrate (calculated from standards run with each set of samples). Final carbohydrate concentrations were expressed on a tissue weight basis.
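Internal-standard quantification of this kind can be sketched as below. The amount of internal standard follows from the text (100 μl of a 0.5000 g/100 ml stock, i.e. 500 μg per extract); the response-factor definition, the function name and the example peak areas are assumptions for illustration and are not taken from the source.

```python
# Internal standard added per extract: 100 ul of a 0.5000 g / 100 ml stock.
STOCK_G_PER_ML = 0.5000 / 100.0
INTERNAL_STANDARD_UG = STOCK_G_PER_ML * 0.100 * 1e6  # g/ml * ml * ug/g -> 500 ug


def sugar_ug_per_mg(area_sugar: float,
                    area_internal_std: float,
                    response_factor: float,
                    sample_mg: float) -> float:
    """Sugar concentration in ug per mg tissue.

    response_factor is assumed to be defined from standards as
    (area_sugar / area_IS) / (mass_sugar / mass_IS); a value of 1.0 means
    the detector responds equally to sugar and internal standard.
    """
    sugar_ug = (area_sugar / area_internal_std) * INTERNAL_STANDARD_UG / response_factor
    return sugar_ug / sample_mg


if __name__ == "__main__":
    # Hypothetical peak areas for a 50 mg sub-sample.
    print(round(sugar_ug_per_mg(1800.0, 1000.0, 1.1, 50.0), 2))
```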
[0211] Carbohydrates were identified by retention time matching with authentic samples of each sugar run in the same chromatographic set and by GC-MS with spectral matching to the NIST Mass Spectral Library Version 2a, build Jul. 1, 2002.
TABLE 7
Compositional Analysis of lo17849 and WT Control Seed

Genotype     Barcode ID   Oil (%, NMR)   Protein %   Seed Weight (μg)   fructose (μg mg-1 seed)
lo17849      K37207       39.7           16.95       24                 0.66
WT           K37208       43.0           15.49       23.67              0.57
Δ TG/WT %                 -7.7           9.4         1.4                15.8

Genotype     Barcode ID   glucose (μg mg-1 seed)   sucrose (μg mg-1 seed)   raffinose (μg mg-1 seed)   stachyose (μg mg-1 seed)
lo17849      K37207       9.54                     16.07                    1.44                       4.71
WT           K37208       8.02                     17.59                    1.21                       3.48
Δ TG/WT %                 19.0                     -8.6                     19.0                       35.3
Table 7 shows that no change of seed weight is associated with the seed oil reduction in lo17849. There is, however, a 9.4% increase in protein content in lo17849 compared to control seed. The soluble carbohydrate profile of lo17849 differs from that of WT seed. The former shows decreased sucrose and increased levels of fructose, glucose, raffinose and stachyose.
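The "Δ TG/WT %" row of Table 7 is consistent with a relative difference, 100×(transgenic-WT)/WT. The following sketch recomputes it from the tabulated values; the dictionary keys are illustrative labels.

```python
# (lo17849 value, WT value) pairs transcribed from Table 7.
TABLE7 = {
    "oil": (39.7, 43.0),
    "protein": (16.95, 15.49),
    "seed_weight": (24.0, 23.67),
    "fructose": (0.66, 0.57),
    "glucose": (9.54, 8.02),
    "sucrose": (16.07, 17.59),
    "raffinose": (1.44, 1.21),
    "stachyose": (4.71, 3.48),
}


def percent_change(transgenic: float, wild_type: float) -> float:
    """Relative difference of the transgenic value versus WT, in percent."""
    return 100.0 * (transgenic - wild_type) / wild_type


deltas = {name: round(percent_change(tg, wt), 1) for name, (tg, wt) in TABLE7.items()}

if __name__ == "__main__":
    for name, delta in deltas.items():
        print(f"{name}: {delta:+.1f} %")
```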
[0212] In summary, line lo17849 contains a genetic locus that confers glufosinate herbicide resistance. Presence of this transgene is associated with a low oil trait (reduction in oil content of 5-8% compared to WT) that is accompanied by unaltered seed size, increased protein content and a shift in the carbohydrate profile of mature dry seed that consists of decreased sucrose levels and increased levels of fructose, glucose and raffinosaccharides.
Example 3
Identification of Activation-Tagged Genes
[0213] Genes flanking the T-DNA insert in the lo17849 lines were identified using one, or both, of the following two standard procedures: (1) thermal asymmetric interlaced (TAIL) PCR (Liu et al., Plant J. 8:457-63 (1995)); and (2) SAIFF PCR (Siebert et al., Nucleic Acids Res. 23:1087-1088 (1995)). In lines with complex multimerized T-DNA inserts, TAIL PCR and SAIFF PCR may both prove insufficient to identify candidate genes. In these cases, other procedures, including inverse PCR, plasmid rescue and/or genomic library construction, can be employed.
[0214] A successful result is one where a single TAIL or SAIFF PCR fragment contains a T-DNA border sequence and Arabidopsis genomic sequence. Once a tag of genomic sequence flanking a T-DNA insert is obtained, candidate genes are identified by alignment to publicly available Arabidopsis genome sequence. Specifically, the annotated genes nearest the 35S enhancer elements/T-DNA RB are candidates for genes that are activated.
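Once the flanking tag has been placed on the genome, picking candidates reduces to ranking annotated genes by distance from the insertion site. The sketch below illustrates that ranking only; the gene names, coordinates and the 10 kb window are hypothetical, and the annotated gene boundary is used as a stand-in for the start codon.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate
    strand: str  # "+" or "-"

    @property
    def start_codon(self) -> int:
        """Approximate start codon position: the 5' end of the gene model."""
        return self.start if self.strand == "+" else self.end


def rank_candidates(insertion_pos: int, genes: list, window: int = 10_000) -> list:
    """Return (distance, gene name) pairs within `window` bp, nearest first."""
    hits = [(abs(g.start_codon - insertion_pos), g.name) for g in genes]
    return sorted(hit for hit in hits if hit[0] <= window)


if __name__ == "__main__":
    toy_genes = [  # hypothetical annotation, for illustration only
        Gene("geneA", 1_000, 3_000, "+"),
        Gene("geneB", 9_000, 12_000, "+"),
        Gene("geneC", 40_000, 42_000, "-"),
    ]
    print(rank_candidates(3_500, toy_genes))
```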
[0215] To verify that an identified gene is truly near a T-DNA and to rule out the possibility that the TAIL/SAIFF fragment is a chimeric cloning artifact, a diagnostic PCR on genomic DNA is done with one oligo in the T-DNA and one oligo specific for the candidate gene. Genomic DNA samples that give a PCR product are interpreted as representing a T-DNA insertion. This analysis also verifies a situation in which more than one insertion event occurs in the same line, e.g., if multiple differing genomic fragments are identified in TAIL and/or SAIFF PCR analyses.
Example 4
Identification of Activation-Tagged Genes in lo17849
Construction of pKR1478 for Seed Specific Overexpression of Genes in Arabidopsis
[0216] Plasmid pKR85 (SEQ ID NO:3; described in US Patent Application Publication US 2007/0118929 published on May 24, 2007) was digested with HindIII and the fragment containing the hygromycin selectable marker was re-ligated together to produce pKR278 (SEQ ID NO:4).
[0217] Plasmid pKR407 (SEQ ID NO:5; described in PCT Int. Appl. WO 2008/124048 published on Oct. 16, 2008) was digested with BamHI/HindIII and the fragment containing the Gy1 promoter/NotI/LegA2 terminator cassette was effectively cloned into the BamHI/HindIII fragment of pKR278 (SEQ ID NO:4) to produce pKR1468 (SEQ ID NO:6).
[0218] Plasmid pKR1468 (SEQ ID NO:6) was digested with NotI and the resulting DNA ends were filled using Klenow. After filling to form blunt ends, the DNA fragments were treated with calf intestinal alkaline phosphatase and separated using agarose gel electrophoresis. The purified fragment was ligated with cassette frmA containing chloramphenicol resistance and ccdB genes flanked by attR1 and attR2 sites, using the Gateway® Vector Conversion System (Cat. No. 11823-029, Invitrogen Corporation) following the manufacturer's protocol, to produce pKR1475 (SEQ ID NO:7).
[0219] Plasmid pKR1475 (SEQ ID NO:7) was digested with AscI and the fragment containing the Gy1 promoter/NotI/LegA2 terminator Gateway® L/R cloning cassette was cloned into the AscI fragment of binary vector pKR92 (SEQ ID NO:8; described in US Patent Application Publication US 2007/0118929 published on May 24, 2007) to produce pKR1478 (SEQ ID NO:9).
[0220] In this way, genes flanked by attL1 and attL2 sites could be cloned into pKR1478 (SEQ ID NO:9) using Gateway® technology (Invitrogen Corporation) and the gene could be expressed in Arabidopsis from the strong, seed-specific soybean Gy1 promoter.
[0221] The activation-tagged line (lo17849) showing reduced oil content was further analyzed. DNA from the line was extracted, and genes flanking the T-DNA insert in the mutant line were identified using ligation-mediated PCR (Siebert et al., Nucleic Acids Res. 23:1087-1088 (1995)). A single amplified fragment was identified that contained a T-DNA border sequence and Arabidopsis genomic sequence. The sequence of this PCR product, which contains part of the left border of the inserted T-DNA, is set forth as SEQ ID NO:10. Once a tag of genomic sequence flanking a T-DNA insert was obtained, a candidate gene was identified by alignment of SEQ ID NO:10 to the completed Arabidopsis genome (NCBI). Specifically, the SAIFF PCR product generated with PCR primers corresponding to the left border sequence of the T-DNA present in pHSbarENDs2 aligns with sequence of the Arabidopsis genome that is located in the second intron of Arabidopsis gene At5g17270 and 5949 bp upstream of the inferred start codon of At5g17280.
Validation of Candidate Arabidopsis Gene (At5g17280) Via Transformation into Arabidopsis
[0222] The gene At5g17280, specifically its inferred start codon, is 5.5 kb downstream of the SAIFF sequence corresponding to sequence adjacent to the left T-DNA border in lo17849. This gene is annotated as encoding a protein with an oxidoreductase motif (ORM). Primers ORM ORF FWD (SEQ ID NO:11) and ORM ORF REV (SEQ ID NO:12) were used to amplify the At5g17280 ORF from genomic DNA of Arabidopsis plants of the Columbia ecotype. The PCR product was cloned into pENTR (Invitrogen, USA) to give pENTR-ORM (SEQ ID NO:13). The At5g17280 ORF was inserted in the sense orientation downstream of the GY1 promoter in binary plant transformation vector pKR1478 using Gateway LR recombinase (Invitrogen, USA) according to the manufacturer's instructions. The sequence of the resulting plasmid pKR1478-ORM is set forth as SEQ ID NO:14.
[0223] pKR1478-ORM (SEQ ID NO:14) was introduced into Agrobacterium tumefaciens NTL4 (Luo et al, Molecular Plant-Microbe Interactions (2001) 14(1):98-103) by electroporation. Briefly, 1 μg plasmid DNA was mixed with 100 μL of electro-competent cells on ice. The cell suspension was transferred to a 100 μL electroporation cuvette (1 mm gap width) and electroporated using a BIORAD electroporator set to 1 kV, 400Ω and 25 μF. Cells were transferred to 1 mL LB medium and incubated for 2 h at 30° C. Cells were plated onto LB medium containing 50 μg/mL kanamycin. Plates were incubated at 30° C. for 60 h. Recombinant Agrobacterium cultures (500 mL LB, 50 μg/mL kanamycin) were inoculated from single colonies of transformed Agrobacterium cells and grown at 30° C. for 60 h. Cells were harvested by centrifugation (5000×g, 10 min) and resuspended in 1 L of 5% (W/V) sucrose containing 0.05% (V/V) Silwet. Arabidopsis plants were grown in soil at a density of 30 plants per 100 cm2 pot in METRO-MIX® 360 soil mixture for 4 weeks (22° C., 16 h light/8 h dark, 100 μE m-2s-1). Plants were repeatedly dipped into the Agrobacterium suspension harboring the binary vector pKR1478-ORM and kept in a dark, high humidity environment for 24 h. Post dipping, plants were grown for three to four weeks under standard plant growth conditions described above and plant material was harvested and dried for one week at ambient temperatures in paper bags. Seeds were harvested using a 0.425 mm mesh brass sieve.
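The electroporation settings and the dipping solution imply a few derived quantities that are easy to check. The sketch below computes the nominal field strength across the 1 mm cuvette gap, the nominal RC time constant (assuming the 400Ω parallel resistor dominates, i.e. the sample resistance is much higher, which the text does not state), and the amounts of sucrose and Silwet needed for 1 L of dipping solution. All names are illustrative.

```python
def field_strength_kv_per_cm(voltage_kv: float, gap_mm: float) -> float:
    """Nominal electric field across the cuvette gap."""
    return voltage_kv / (gap_mm / 10.0)


def time_constant_ms(resistance_ohm: float, capacitance_uf: float) -> float:
    """Nominal RC time constant; assumes the parallel resistor dominates."""
    return resistance_ohm * capacitance_uf * 1e-6 * 1e3


def dip_solution(volume_l: float, sucrose_pct_wv: float = 5.0,
                 silwet_pct_vv: float = 0.05) -> dict:
    """Grams of sucrose and mL of Silwet for a given volume of solution."""
    volume_ml = volume_l * 1000.0
    return {
        "sucrose_g": sucrose_pct_wv / 100.0 * volume_ml,  # % w/v = g per 100 mL
        "silwet_ml": silwet_pct_vv / 100.0 * volume_ml,   # % v/v = mL per 100 mL
    }


if __name__ == "__main__":
    print(field_strength_kv_per_cm(1.0, 1.0), "kV/cm")
    print(time_constant_ms(400.0, 25.0), "ms")
    print(dip_solution(1.0))
```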
[0224] Cleaned Arabidopsis seeds (2 grams, corresponding to about 100,000 seeds) were sterilized by washes in 45 mL of 80% ethanol, 0.01% TRITON® X-100, followed by 45 mL of 30% (V/V) household bleach in water, 0.01% TRITON® X-100 and finally by repeated rinsing in sterile water. Aliquots of 20,000 seeds were transferred to square plates (20×20 cm) containing 150 mL of sterile plant growth medium comprised of 0.5×MS salts, 0.53% (W/V) sorbitol, 0.05 MES/KOH (pH 5.8), 200 μg/mL TIMENTIN®, and 50 μg/mL kanamycin solidified with 10 g/L agar. Homogeneous dispersion of the seed on the medium was facilitated by mixing the aqueous seed suspension with an equal volume of melted plant growth medium. Plates were incubated under standard growth conditions for ten days. Kanamycin-resistant seedlings were transferred to plant growth medium without selective agent and grown for one week before transfer to soil. T1 plants were grown to maturity alongside wt control plants and T2 seeds were harvested. A total of six wt plants were grown alongside the T1 plants and two bulk samples were generated, each by combining seed from three wt plants. Oil content was measured by NMR and is shown in Table 8.
TABLE 8
Seed oil content of T1 plants generated with binary vector pKR1478-ORM for seed-specific over-expression of At5g17280

Construct      BARCODE   % oil   oil content % of WT   avg. oil content % of WT
pKR1478-ORM    K42329    42.4    104.7
pKR1478-ORM    K42319    41.6    102.8
pKR1478-ORM    K42320    41.0    101.4
pKR1478-ORM    K42326    40.6    100.5
pKR1478-ORM    K42330    40.1     99.1
pKR1478-ORM    K42324    40.0     98.8
pKR1478-ORM    K42333    39.8     98.4
pKR1478-ORM    K42323    39.7     98.1
pKR1478-ORM    K42321    39.3     97.3
pKR1478-ORM    K42332    38.3     94.8
pKR1478-ORM    K42328    38.1     94.1
pKR1478-ORM    K42322    37.8     93.6
pKR1478-ORM    K42327    37.1     91.6
pKR1478-ORM    K42325    35.6     88.0
pKR1478-ORM    K42334    34.1     84.2
pKR1478-ORM    K42331    34.0     84.1                 95.7
wt             K42335    40.4
T2 seed of events K42334 and K42331 were plated on selective media and planted alongside untransformed wt control plants. Plants were grown to maturity. Seeds were harvested and oil content was measured by NMR (Table 9).
TABLE 9
Seed oil content of T2 plants generated with binary vector pKR1478-ORM for seed-specific over-expression of At5g17280

Event ID   Construct      BARCODE   % oil   oil content % of WT   avg. oil content % of WT
K42334     pKR1478-ORM    K44550    40.5    102.0
           pKR1478-ORM    K44537    39.2     98.9
           pKR1478-ORM    K44543    39.2     98.7
           pKR1478-ORM    K44553    39.0     98.2
           pKR1478-ORM    K44535    38.1     96.0
           pKR1478-ORM    K44545    37.9     95.5
           pKR1478-ORM    K44546    37.5     94.5
           pKR1478-ORM    K44551    37.2     93.8
           pKR1478-ORM    K44542    36.9     92.9
           pKR1478-ORM    K44549    36.6     92.1
           pKR1478-ORM    K44538    36.4     91.7
           pKR1478-ORM    K44547    36.2     91.1
           pKR1478-ORM    K44552    36.1     91.1
           pKR1478-ORM    K44540    35.6     89.8
           pKR1478-ORM    K44539    35.4     89.3
           pKR1478-ORM    K44544    35.0     88.1
           pKR1478-ORM    K44534    34.7     87.4
           pKR1478-ORM    K44536    34.4     86.7
           pKR1478-ORM    K44548    33.0     83.2
           pKR1478-ORM    K44541    30.3     76.2                 91.9
           wt             K44563    42.9
           wt             K44555    42.6
           wt             K44558    41.4
           wt             K44559    40.6
           wt             K44554    39.7
           wt             K44557    39.3
           wt             K44564    39.3
           wt             K44561    38.8
           wt             K44556    38.6
           wt             K44562    38.2
           wt             K44565    37.8
           wt             K44560    37.1
K42331     pKR1478-ORM    K46263    40.3     94.0
           pKR1478-ORM    K46264    39.7     92.6
           pKR1478-ORM    K46266    39.7     92.5
           pKR1478-ORM    K46268    38.8     90.4
           pKR1478-ORM    K46262    38.7     90.3
           pKR1478-ORM    K46248    38.7     90.3
           pKR1478-ORM    K46251    38.4     89.6
           pKR1478-ORM    K46269    38.4     89.5
           pKR1478-ORM    K46249    38.3     89.4
           pKR1478-ORM    K46250    38.3     89.2
           pKR1478-ORM    K46258    38.3     89.2
           pKR1478-ORM    K46261    38.1     88.8
           pKR1478-ORM    K46254    38.0     88.7
           pKR1478-ORM    K46255    38.0     88.7
           pKR1478-ORM    K46267    37.9     88.3
           pKR1478-ORM    K46256    37.8     88.1
           pKR1478-ORM    K46253    37.6     87.6
           pKR1478-ORM    K46265    37.3     87.1
           pKR1478-ORM    K46257    37.2     86.7
           pKR1478-ORM    K46259    37.1     86.5
           pKR1478-ORM    K46260    36.9     86.0
           pKR1478-ORM    K46252    35.8     83.6                 89.0
           wt             K46275    44.7
           wt             K46270    43.6
           wt             K46272    43.4
           wt             K46280    43.4
           wt             K46281    43.3
           wt             K46277    43.2
           wt             K46271    43.0
           wt             K46273    42.8
           wt             K46278    42.7
           wt             K46279    42.6
           wt             K46276    42.2
           wt             K46274    39.8
T3 seed of lines K44548 and K44541 derived from event K42334 were plated on selective media and planted alongside untransformed wt control plants. Plants were grown to maturity. Seeds were harvested and oil content was measured by NMR (Table 10).
TABLE 10
Seed oil content of T3 plants generated with binary vector pKR1478-ORM for seed-specific over-expression of At5g17280

Event ID        Construct      BARCODE   % oil   oil content % of WT   avg. oil content % of WT
K42334/K44548   pKR1478-ORM    K49194    39.3    92.9
                pKR1478-ORM    K49193    39.0    92.1
                pKR1478-ORM    K49204    38.9    92.1
                pKR1478-ORM    K49206    38.7    91.5
                pKR1478-ORM    K49197    38.7    91.5
                pKR1478-ORM    K49208    38.7    91.5
                pKR1478-ORM    K49199    38.2    90.3
                pKR1478-ORM    K49207    37.8    89.4
                pKR1478-ORM    K49214    37.7    89.0
                pKR1478-ORM    K49196    37.6    88.9
                pKR1478-ORM    K49191    37.5    88.8
                pKR1478-ORM    K49192    37.3    88.2
                pKR1478-ORM    K49205    37.2    87.8
                pKR1478-ORM    K49209    36.5    86.3
                pKR1478-ORM    K49211    36.5    86.2
                pKR1478-ORM    K49212    36.4    86.0
                pKR1478-ORM    K49200    36.3    85.9                  89.3
                wt             K49223    43.0
                wt             K49219    42.8
                wt             K49221    42.7
                wt             K49222    42.4
                wt             K49220    42.1
                wt             K49216    42.0
                wt             K49218    41.8
                wt             K49217    41.7
K42334/K44541   pKR1478-ORM    K49174    38.8    93.0
                pKR1478-ORM    K49152    38.1    91.3
                pKR1478-ORM    K49173    38.1    91.3
                pKR1478-ORM    K49177    37.7    90.2
                pKR1478-ORM    K49162    37.6    90.1
                pKR1478-ORM    K49176    36.9    88.2
                pKR1478-ORM    K49167    36.8    88.2
                pKR1478-ORM    K49157    36.8    88.2
                pKR1478-ORM    K49163    36.8    88.1
                pKR1478-ORM    K49170    36.7    87.9
                pKR1478-ORM    K49171    36.7    87.8
                pKR1478-ORM    K49178    36.6    87.7
                pKR1478-ORM    K49154    36.5    87.3
                pKR1478-ORM    K49156    35.7    85.5
                pKR1478-ORM    K49165    35.0    83.7
                pKR1478-ORM    K49161    33.8    80.9
                pKR1478-ORM    K49179    33.6    80.5                  87.6
                wt             K49185    43.1
                wt             K49186    42.5
                wt             K49187    42.3
                wt             K49181    42.2
                wt             K49182    42.0
                wt             K49184    41.5
                wt             K49180    40.8
                wt             K49183    39.8
Tables 8-10 demonstrate that seed-specific over-expression of At5g17280 leads to a decrease in oil content of about 10%. The decrease in oil content associated with the transgene is heritable. This finding suggests that the low seed oil phenotype in lo17849 is related to increased expression of At5g17280 resulting from the nearby insertion of the quadruple 35S enhancer sequence present in the pHSbarENDs2-derived T-DNA.
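The size of the effect can be illustrated with the T3 data for K42334/K44548 in Table 10. The sketch below recomputes the average oil content as a percentage of WT and, as an added illustration that is not part of the source analysis, a Welch t statistic comparing transgenic and WT plants. No p-value is computed, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance

# "% oil" values transcribed from Table 10, event K42334/K44548.
TRANSGENIC = [39.3, 39.0, 38.9, 38.7, 38.7, 38.7, 38.2, 37.8, 37.7,
              37.6, 37.5, 37.3, 37.2, 36.5, 36.5, 36.4, 36.3]
WILD_TYPE = [43.0, 42.8, 42.7, 42.4, 42.1, 42.0, 41.8, 41.7]


def welch_t(a, b) -> float:
    """Welch's t statistic for two samples with unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


percent_of_wt = 100.0 * mean(TRANSGENIC) / mean(WILD_TYPE)
t_stat = welch_t(TRANSGENIC, WILD_TYPE)

if __name__ == "__main__":
    print(f"transgenic mean as % of WT: {percent_of_wt:.1f}")
    print(f"Welch t statistic: {t_stat:.1f}")
```

The recomputed percentage matches the 89.3% average reported in the table, and the strongly negative t statistic is consistent with a reduction well outside plant-to-plant variation in this flat.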
Example 5
Seed-Specific RNAi of At5g17280
Generation and Phenotypic Characterization of Transgenic Lines
[0225] A binary plant transformation vector pKR1482 (SEQ ID NO:15) for generation of hairpin constructs facilitating seed-specific RNAi under control of the GY1 promoter derived from the soy gene Glyma03g32030.1 was constructed. The RNAi-related expression cassette can be used for cloning of a given DNA fragment flanked by ATTL sites in antisense and sense orientation downstream of the seed-specific promoter. The two gene fragments are interrupted by a spliceable intron sequence derived from the Arabidopsis gene At2g38080.
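The layout of such a hairpin insert (antisense copy, spliceable intron, sense copy) can be sketched at the sequence level. The example below uses short toy sequences, not the actual At5g17280 fragment or laccase intron, and ignores the attR/attB recombination sites that flank the fragments in the real vector.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def hairpin_insert(fragment: str, intron: str) -> str:
    """Antisense fragment + intron + sense fragment, 5' to 3'.

    After splicing removes the intron, the transcript can fold back on
    itself to form the double-stranded RNA that triggers RNAi.
    """
    return reverse_complement(fragment) + intron + fragment


if __name__ == "__main__":
    toy_fragment = "ATGC"   # hypothetical target fragment
    toy_intron = "GTAAG"    # hypothetical intron placeholder
    print(hairpin_insert(toy_fragment, toy_intron))
```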
[0226] An intron of an Arabidopsis laccase gene (At2g38080) was amplified from genomic Arabidopsis DNA of ecotype Columbia using primers AthLcc IN FWD (SEQ ID NO:16) and AthLcc IN REV (SEQ ID NO:17). PCR products were cloned into pGEM T EASY (Promega, USA) according to manufacturer instructions and sequenced. The DNA sequence of the PCR product containing the laccase intron is set forth as SEQ ID NO:18. The PCR primers introduce an HpaI restriction site at the 5' end of the intron and restriction sites for NruI and SpeI at the 3' end of the intron. A three-way ligation of DNA fragments was performed as follows. XbaI digested, dephosphorylated DNA of pMBL18 (Nakano, Yoshio; Yoshida, Yasuo; Yamashita, Yoshihisa; Koga, Toshihiko. Construction of a series of pACYC-derived plasmid vectors. Gene (1995), 162(1), 157-8.) was ligated to the XbaI, EcoRV DNA fragment of PSM1318 (SEQ ID NO:19), containing ATTR12 sites, a DNA gyrase inhibitor gene (ccdB) and a chloramphenicol acetyltransferase gene, and to an HpaI/SpeI restriction fragment excised from pGEM T EASY Lacc INT (SEQ ID NO:18) containing intron 1 of At2g38080. Ligation products were transformed into the DB 3.1 strain of E. coli (Invitrogen, USA). Recombinant clones were characterized by restriction digests and sequenced. The DNA sequence of the resulting plasmid pMBL18 ATTR12 INT is set forth as SEQ ID NO:20. DNA of pMBL18 ATTR12 INT was linearized with NruI, dephosphorylated and ligated to the XbaI, EcoRV DNA fragment of PSM1789 (SEQ ID NO: 21) containing ATTR12 sites and a DNA gyrase inhibitor gene (ccdB). Prior to ligation, ends of the PSM1789 restriction fragment had been filled in with T4 DNA polymerase (Promega, USA). Ligation products were transformed into the DB 3.1 strain of E. coli (Invitrogen, USA). Recombinant clones were characterized by restriction digests and sequenced. The DNA sequence of the resulting plasmid pMBL18 ATTR12 INT ATTR21 is set forth as SEQ ID NO:22.
[0227] Plasmid pMBL18 ATTR12 INT ATTR21 (SEQ ID NO:22) was digested with XbaI and after filling to blunt the XbaI site generated, the resulting DNA was digested with Ecl136II and the fragment containing the attR cassettes was cloned into the NotI/BsiWI (where the NotI site was completely filled in) fragment of pKR1468 (SEQ ID NO:6), containing the Gy1 promoter, to produce pKR1480 (SEQ ID NO:23).
[0228] pKR1480 (SEQ ID NO:23) was digested with AscI and the fragment containing the Gy1 promoter/attR cassettes was cloned into the AscI fragment of binary vector pKR92 (SEQ ID NO:8) to produce pKR1482 (SEQ ID NO:15).
[0229] 5 μg of plasmid DNA of pENTR-ORM (SEQ ID NO:13) was digested with EcoRV/HpaI. A restriction fragment of 0.7 kb (derived from pENTR-ORM) was excised from an agarose gel. The purified DNA fragment was inserted into vector pKR1482 using LR clonase (Invitrogen) according to the manufacturer's instructions, to give pKR1482-ORM (SEQ ID NO:24).
[0230] pKR1482-ORM (SEQ ID NO:24) was introduced into Agrobacterium tumefaciens NTL4 (Luo et al, Molecular Plant-Microbe Interactions (2001) 14(1):98-103) by electroporation. Briefly, 1 μg plasmid DNA was mixed with 100 μL of electro-competent cells on ice. The cell suspension was transferred to a 100 μL electroporation cuvette (1 mm gap width) and electroporated using a BIORAD electroporator set to 1 kV, 400Ω and 25 μF. Cells were transferred to 1 mL LB medium and incubated for 2 h at 30° C. Cells were plated onto LB medium containing 50 μg/mL kanamycin. Plates were incubated at 30° C. for 60 h. Recombinant Agrobacterium cultures (500 mL LB, 50 μg/mL kanamycin) were inoculated from single colonies of transformed Agrobacterium cells and grown at 30° C. for 60 h. Cells were harvested by centrifugation (5000×g, 10 min) and resuspended in 1 L of 5% (W/V) sucrose containing 0.05% (V/V) Silwet. Arabidopsis plants were grown in soil at a density of 30 plants per 100 cm2 pot in METRO-MIX® 360 soil mixture for 4 weeks (22° C., 16 h light/8 h dark, 100 μE m-2s-1). Plants were repeatedly dipped into the Agrobacterium suspension harboring the binary vector pKR1482-ORM (SEQ ID NO:24) and kept in a dark, high humidity environment for 24 h. Plants were grown for three to four weeks under standard plant growth conditions described above and plant material was harvested and dried for one week at ambient temperatures in paper bags. Seeds were harvested using a 0.425 mm mesh brass sieve.
[0231] Cleaned Arabidopsis seeds (2 grams, corresponding to about 100,000 seeds) were sterilized by washes in 45 mL of 80% ethanol, 0.01% TRITON® X-100, followed by 45 mL of 30% (V/V) household bleach in water, 0.01% TRITON® X-100 and finally by repeated rinsing in sterile water. Aliquots of 20,000 seeds were transferred to square plates (20×20 cm) containing 150 mL of sterile plant growth medium comprised of 0.5×MS salts, 0.53% (W/V) sorbitol, 0.05 MES/KOH (pH 5.8), 200 μg/mL TIMENTIN®, and 50 μg/mL kanamycin solidified with 10 g/L agar. Homogeneous dispersion of the seed on the medium was facilitated by mixing the aqueous seed suspension with an equal volume of melted plant growth medium. Plates were incubated under standard growth conditions for ten days. Kanamycin-resistant seedlings were transferred to plant growth medium without selective agent and grown for one week before transfer to soil. Plants were grown to maturity and T2 seeds were harvested. A total of 15 events were generated with pKR1482-ORM (SEQ ID NO:24). Six wild-type (WT) control plants were grown in the same flat. WT seeds were bulk harvested, thus generating two batches of WT control seed, each derived from three plants. T2 seed of individual transgenic lines were harvested. Oil content was measured by NMR as described above.
TABLE-US-00011 TABLE 11 Seed oil content of T1 plants generated with binary vector pKR1482-ORM for seed specific gene suppression of At5g17280 (Experiment 1)

Construct     BARCODE   % oil   oil content % of WT   avg. oil content % of WT
pKR1482-ORM   K42351    41.4    111.5
pKR1482-ORM   K42355    41.0    110.4
pKR1482-ORM   K42361    40.8    109.8
pKR1482-ORM   K42360    40.5    109.0
pKR1482-ORM   K42359    40.2    108.2
pKR1482-ORM   K42350    40.1    107.8
pKR1482-ORM   K42362    39.5    106.2
pKR1482-ORM   K42353    38.6    103.8
pKR1482-ORM   K42352    38.5    103.7
pKR1482-ORM   K42354    38.3    103.0
pKR1482-ORM   K42356    38.3    102.9
pKR1482-ORM   K42358    37.8    101.8
pKR1482-ORM   K42349    36.7    98.9
pKR1482-ORM   K42357    36.2    97.5
pKR1482-ORM   K42348    36.0    96.8                  104.7
wt            K42363    38.4
wt            K42364    35.9
[0232] Table 11 shows that seed-specific down regulation of At5g17280 leads to increased oil content in Arabidopsis seed.
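The "% of WT" values in Table 11 appear consistent with dividing each event's oil content by the mean oil content of the WT controls grown in the same flat. That normalization is an inference from the tabulated numbers, not a procedure stated in the text. A minimal Python sketch under that assumption, using a few values copied from Table 11 (function and variable names are illustrative):

```python
from statistics import mean


def percent_of_wt(oil_percent, wt_oil_percents):
    """Express an oil content as a percentage of the mean WT oil content."""
    return 100.0 * oil_percent / mean(wt_oil_percents)


# Values copied from Table 11 (Experiment 1).
wt_controls = [38.4, 35.9]
events = {"K42351": 41.4, "K42355": 41.0, "K42348": 36.0}

relative = {
    barcode: round(percent_of_wt(oil, wt_controls), 1)
    for barcode, oil in events.items()
}
for barcode, value in relative.items():
    print(barcode, value)
```

Because the table reports oil content rounded to one decimal place, values recomputed this way can differ from the tabulated figures in the last digit.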
T2 seeds of event K42355 that carries transgene pKR1482-ORM (SEQ ID NO: 24) were plated on plant growth media containing kanamycin. Plants were grown to maturity alongside WT plants of the Columbia ecotype grown in the same flats. Oil content of T3 seed is depicted in Table 12.
TABLE-US-00012 TABLE 12 Seed oil content of T2 plants generated with binary vector pKR1482-ORM for seed specific gene suppression of At5g17280 (Experiment 1)

Event ID   Construct     BARCODE   % oil   oil content % of WT   avg. oil content % of WT
K42335     pKR1482-ORM   K44642    43.3    107.8
           pKR1482-ORM   K44650    43.1    107.3
           pKR1482-ORM   K44643    42.8    106.5
           pKR1482-ORM   K44637    42.6    106.0
           pKR1482-ORM   K44641    42.2    105.1
           pKR1482-ORM   K44647    41.6    103.5
           pKR1482-ORM   K44652    41.3    102.8
           pKR1482-ORM   K44636    41.3    102.7
           pKR1482-ORM   K44639    41.0    102.1
           pKR1482-ORM   K44646    41.0    102.0
           pKR1482-ORM   K44653    40.9    101.7
           pKR1482-ORM   K44649    40.4    100.5
           pKR1482-ORM   K44644    40.3    100.2
           pKR1482-ORM   K44657    39.9    99.2
           pKR1482-ORM   K44654    39.5    98.3
           pKR1482-ORM   K44656    39.0    97.1
           pKR1482-ORM   K44651    38.4    95.6                  102.0
           wt            K44658    41.7
           wt            K44661    41.3
           wt            K44663    41.2
           wt            K44664    41.1
           wt            K44666    40.7
           wt            K44662    40.1
           wt            K44665    38.8
           wt            K44668    38.4
           wt            K44667    38.3
T3 seeds of lines K44650 and K44637 derived from event K42355 that carries transgene pKR1482-ORM were plated on plant growth media containing kanamycin. Plants were grown to maturity alongside WT plants of the Columbia ecotype grown in the same flats. Oil content of T4 seed is depicted in Table 13.
TABLE-US-00013 TABLE 13 Seed oil content of T3 plants generated with binary vector pKR1482-ORM for seed specific gene suppression of At5g17280 (Experiment 1)

Event ID        Construct     BARCODE   % oil   oil content % of WT   avg. oil content % of WT
K42335/K44650   pKR1482-ORM   K49241    43.5    105.7
                pKR1482-ORM   K49231    43.3    105.3
                pKR1482-ORM   K49236    42.9    104.1
                pKR1482-ORM   K49227    42.8    104.0
                pKR1482-ORM   K49239    42.7    103.9
                pKR1482-ORM   K49234    42.7    103.8
                pKR1482-ORM   K49226    42.7    103.8
                pKR1482-ORM   K49249    42.6    103.6
                pKR1482-ORM   K49237    42.6    103.5
                pKR1482-ORM   K49233    42.6    103.4
                pKR1482-ORM   K49225    42.4    103.1
                pKR1482-ORM   K49228    42.4    103.0
                pKR1482-ORM   K49230    42.2    102.5
                pKR1482-ORM   K49244    42.1    102.3
                pKR1482-ORM   K49242    42.1    102.2
                pKR1482-ORM   K49232    42.0    102.1
                pKR1482-ORM   K49224    42.0    102.0
                pKR1482-ORM   K49248    41.8    101.6
                pKR1482-ORM   K49246    41.7    101.3
                pKR1482-ORM   K49238    41.6    101.0
                pKR1482-ORM   K49247    41.5    100.8
                pKR1482-ORM   K49245    41.5    100.7
                pKR1482-ORM   K49240    41.4    100.7
                pKR1482-ORM   K49250    41.3    100.4
                pKR1482-ORM   K49235    41.1    99.9
                pKR1482-ORM   K49229    41.1    99.8
                pKR1482-ORM   K49243    41.0    99.6                  102.4
                wt            K49255    42.2
                wt            K49257    41.8
                wt            K49252    41.7
                wt            K49256    41.5
                wt            K49251    40.9
                wt            K49253    40.3
                wt            K49254    39.6
K42335/K44637   pKR1482-ORM   K49600    42.3    116.5
                pKR1482-ORM   K49595    42.0    115.6
                pKR1482-ORM   K49596    41.9    115.2
                pKR1482-ORM   K49582    41.7    114.8
                pKR1482-ORM   K49598    41.5    114.2
                pKR1482-ORM   K49594    41.5    114.1
                pKR1482-ORM   K49591    41.4    113.9
                pKR1482-ORM   K49583    41.3    113.6
                pKR1482-ORM   K49592    41.1    113.2
                pKR1482-ORM   K49601    40.8    112.4
                pKR1482-ORM   K49576    40.8    112.2
                pKR1482-ORM   K49587    40.7    111.9
                pKR1482-ORM   K49599    40.5    111.4
                pKR1482-ORM   K49597    40.4    111.4
                pKR1482-ORM   K49579    40.4    111.2
                pKR1482-ORM   K49580    40.2    110.6
                pKR1482-ORM   K49578    40.1    110.4
                pKR1482-ORM   K49585    40.1    110.3
                pKR1482-ORM   K49586    40.0    110.3
                pKR1482-ORM   K49590    40.0    110.0
                pKR1482-ORM   K49588    39.6    109.1
                pKR1482-ORM   K49581    39.6    109.0
                pKR1482-ORM   K49584    39.3    108.3
                pKR1482-ORM   K49574    39.2    107.9
                pKR1482-ORM   K49593    39.2    107.8
                pKR1482-ORM   K49589    39.1    107.7
                pKR1482-ORM   K49577    39.0    107.3
                pKR1482-ORM   K49575    35.8    98.5                  111.0
                wt            K49604    39.1
                wt            K49603    37.7
                wt            K49606    36.7
                wt            K49602    34.1
                wt            K49605    33.9
Additional events were generated with pKR1482-ORM in a second experiment henceforth referred to as Experiment 2. Oil content of T1 and T2 plants of pKR1482-ORM events derived from Experiment 2 is shown in Tables 14 and 15.
TABLE-US-00014 TABLE 14 Seed oil content of T1 plants generated with binary vector pKR1482-ORM for seed specific gene suppression of At5g17280 (Experiment 2)

Construct     BARCODE   % oil   oil content % of WT
pKR1482-ORM   K47030    41.8    104.9
pKR1482-ORM   K47021    41.2    103.4
pKR1482-ORM   K47018    41.1    103.2
pKR1482-ORM   K47017    41.0    103.0
pKR1482-ORM   K47013    40.3    101.1
pKR1482-ORM   K47028    40.2    101.0
pKR1482-ORM   K47015    40.2    100.8
pKR1482-ORM   K47007    40.0    100.2
pKR1482-ORM   K47025    39.6    99.3
pKR1482-ORM   K47029    39.5    99.0
pKR1482-ORM   K47008    39.3    98.7
pKR1482-ORM   K47022    38.8    97.5
pKR1482-ORM   K47020    38.8    97.3
pKR1482-ORM   K47014    38.5    96.6
pKR1482-ORM   K47026    38.4    96.2
pKR1482-ORM   K47012    38.2    95.8
pKR1482-ORM   K47023    38.0    95.4
pKR1482-ORM   K47010    37.9    95.1
pKR1482-ORM   K47019    37.3    93.5
pKR1482-ORM   K47011    37.2    93.4
pKR1482-ORM   K47027    37.2    93.3
pKR1482-ORM   K47009    35.6    89.4
pKR1482-ORM   K47024    35.5    89.1
pKR1482-ORM   K47016    32.3    81.1
wt            K47308    40.9
wt            K47312    40.4
wt            K47306    40.3
wt            K47307    40.2
wt            K47302    40.1
wt            K47301    39.9
wt            K47310    39.7
wt            K47305    39.6
wt            K47309    39.5
wt            K47311    39.3
wt            K47304    39.2
wt            K47303    39.1
TABLE-US-00015 TABLE 15 Seed oil content of T2 plants generated with binary vector pKR1482-ORM for seed specific gene suppression of At5g17280 (Experiment 2)

Event ID   Construct     BARCODE   % oil   oil content % of WT   avg. oil content % of WT
K47021     pKR1482-ORM   K50089    44.5    107.6
           pKR1482-ORM   K50087    44.3    107.3
           pKR1482-ORM   K50093    44.3    107.3
           pKR1482-ORM   K50085    44.1    106.7
           pKR1482-ORM   K50086    43.9    106.3
           pKR1482-ORM   K50088    43.8    106.0
           pKR1482-ORM   K50091    43.6    105.6
           pKR1482-ORM   K50090    43.3    104.9
           pKR1482-ORM   K50094    43.0    104.2
           pKR1482-ORM   K50084    42.7    103.3
           pKR1482-ORM   K50092    42.5    102.8                 105.6
           wt            K50097    42.2
           wt            K50099    42.2
           wt            K50100    41.8
           wt            K50095    41.6
           wt            K50098    40.2
           wt            K50096    39.7
K47018     pKR1482-ORM   K50105    44.9    108.7
           pKR1482-ORM   K50102    44.7    108.2
           pKR1482-ORM   K50122    44.2    107.1
           pKR1482-ORM   K50109    44.2    107.0
           pKR1482-ORM   K50104    44.0    106.6
           pKR1482-ORM   K50114    44.0    106.5
           pKR1482-ORM   K50112    43.8    106.0
           pKR1482-ORM   K50111    43.7    105.9
           pKR1482-ORM   K50121    43.7    105.8
           pKR1482-ORM   K50115    43.6    105.7
           pKR1482-ORM   K50101    43.6    105.6
           pKR1482-ORM   K50106    43.6    105.6
           pKR1482-ORM   K50120    43.5    105.3
           pKR1482-ORM   K50123    43.4    105.2
           pKR1482-ORM   K50103    43.2    104.6
           pKR1482-ORM   K50110    43.1    104.4
           pKR1482-ORM   K50117    43.1    104.4
           pKR1482-ORM   K50108    43.0    104.1
           pKR1482-ORM   K50118    42.8    103.7
           pKR1482-ORM   K50119    42.5    103.0
           pKR1482-ORM   K50113    42.2    102.2
           pKR1482-ORM   K50107    42.1    101.9
           pKR1482-ORM   K50116    40.3    97.6                  105.0
           wt            K50129    42.8
           wt            K50132    42.7
           wt            K50130    42.7
           wt            K50133    42.5
           wt            K50134    42.3
           wt            K50124    42.2
           wt            K50127    41.7
           wt            K50128    41.3
           wt            K50125    39.7
           wt            K50126    39.2
           wt            K50131    37.1
Tables 11, 12, 13, 14 and 15 demonstrate that an oil increase of about 2-11% is associated with seed-specific down regulation of At5g17280. The oil increase is observed in multiple events and is heritable.
Example 6
Identification of cDNA Clones
[0233] cDNA clones encoding an ORM motif protein can be identified by conducting BLAST (Basic Local Alignment Search Tool; Altschul et al. (1990) J. Mol. Biol. 215:403-410; see also the explanation of the BLAST algorithm on the world wide web site for the National Center for Biotechnology Information at the National Library of Medicine of the National Institutes of Health) searches for similarity to amino acid sequences contained in the BLAST "nr" database (comprising all non-redundant GenBank CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein Data Bank, the last major release of the SWISS-PROT protein sequence database, EMBL, and DDBJ databases). The DNA sequences from clones can be translated in all reading frames and compared for similarity to all publicly available protein sequences contained in the "nr" database using the BLASTX algorithm (Gish and States (1993) Nat. Genet. 3:266-272) provided by the NCBI. The polypeptides encoded by the cDNA sequences can be analyzed for similarity to all publicly available amino acid sequences contained in the "nr" database using the BLASTP algorithm provided by the National Center for Biotechnology Information (NCBI). For convenience, the P-value (probability) or the E-value (expectation) of observing a match of a cDNA-encoded sequence to a sequence contained in the searched databases merely by chance as calculated by BLAST are reported herein as "pLog" values, which represent the negative of the logarithm of the reported P-value or E-value. Accordingly, the greater the pLog value, the greater the likelihood that the cDNA-encoded sequence and the BLAST "hit" represent homologous proteins.
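The pLog conversion described above can be sketched in a few lines of Python. The text specifies only "the negative of the logarithm"; the base-10 logarithm used here is an assumption, and the function name is illustrative.

```python
import math


def plog(value: float) -> float:
    """Return the negative logarithm of a BLAST P-value or E-value.

    Assumption: base-10 logarithm. A reported value of 0 has no finite
    pLog, so non-positive inputs are rejected.
    """
    if value <= 0.0:
        raise ValueError("P-value or E-value must be positive")
    return -math.log10(value)


# A smaller E-value (a less likely chance match) gives a larger pLog.
for e_value in (1e-3, 1e-50):
    print(f"E-value {e_value:g} -> pLog {plog(e_value):.1f}")
```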
[0234] EST sequences can be compared to the GenBank database as described above. ESTs that contain sequences more 5- or 3-prime can be found by using the BLASTN algorithm (Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402) against the DUPONT® proprietary database comparing nucleotide sequences that share common or overlapping regions of sequence homology. Where common or overlapping sequences exist between two or more nucleic acid fragments, the sequences can be assembled into a single contiguous nucleotide sequence, thus extending the original fragment in either the 5-prime or 3-prime direction. Once the most 5-prime EST is identified, its complete sequence can be determined by Full Insert Sequencing as described above. Homologous genes belonging to different species can be found by comparing the amino acid sequence of a known gene (from either a proprietary source or a public database) against an EST database using the TBLASTN algorithm. The TBLASTN algorithm searches an amino acid query against a nucleotide database that is translated in all 6 reading frames. This search allows for differences in nucleotide codon usage between different species, and for codon degeneracy.
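The six-frame translation that underlies a TBLASTN-style search can be illustrated with a short Python sketch. This is not the BLAST implementation; it only shows how one nucleotide sequence yields six candidate protein sequences using the standard genetic code. The function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unrecognized codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq: str) -> dict:
    """Translate a DNA sequence in three forward and three reverse frames."""
    seq = seq.upper()
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


# Hypothetical six-nucleotide example.
for frame, protein in six_frame_translation("ATGGCC").items():
    print(frame, protein)
```

Comparing a protein query against all six translations is what allows matches to be found regardless of which strand or reading frame of an EST encodes the homologous polypeptide.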
Example 7
Characterization of cDNA Clones Encoding ORM protein Polypeptides
[0235] A cDNA library representing mRNAs from sunflower was prepared and cDNA clones encoding ORM polypeptides were identified. Clone hso1c.pk014.c16 was obtained from a cDNA library prepared from transgenic sunflower plants.
Example 8
Identification of Genes of Brassica napus Closely-Related to At5g17280
[0236] Public DNA sequences (NCBI and Brassica napus EST assembly (N) Brassica napus EST assembly version 3.0 (Jul. 30, 2007) from the Gene Index Project at Dana-Farber Cancer Institute) were searched using the predicted amino acid sequence of At5g17280 and tBLASTn. The assembly encompasses about 558465 public ESTs and has a total of 90310 sequences (47591 assemblies and 42719 singletons). There are three genes encoding proteins with homology to At5g17280. These genes, their % identity to At5g17280 and SEQ ID NOs are listed in Table 16.
TABLE-US-00016 TABLE 16 Brassica napus genes closely related to At5g17280

Gene name   % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
TC44737     51.8                                  25              26
TC52165     53.3                                  27              28
TC52879     48.2                                  29              30
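Percent identity values such as those in Table 16 are derived from a pairwise alignment of two amino acid sequences. The Python sketch below computes percent identity from two sequences that have already been aligned (gaps shown as "-"). It is an illustration only: it does not perform the alignment itself, the choice to exclude gapped columns from the denominator is an assumption (alignment programs differ in this convention), and the two short sequences are hypothetical rather than taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned fragments: 4 identical residues in 6 ungapped columns.
identity = percent_identity("MKT-LLA", "MKSALLG")
print(f"{identity:.1f}% identity")
```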
Example 9
Identification of Genes of Sunflower Closely-Related to At5g17280
[0237] Applicants' sunflower EST libraries were searched using the predicted amino acid sequence of At5g17280 and tBLASTn. There is one EST encoding a protein that shares 47.2% sequence identity to At5g17280. The gene, its % identity to At5g17280 and SEQ ID NOs are listed in Table 17. Clone hso1c.pk014.c16 shares 38.3% sequence identity with the public sequence from Populus trichocarpa (NCBI GI:118481427, SEQ ID NO:64) and 35.7% sequence identity with SEQ ID NO: 36271 of US20060123505 (SEQ ID NO:65).
TABLE-US-00017 TABLE 17 Sunflower (Helianthus annuus) gene closely related to At5g17280

Gene name         % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
hso1c.pk014.c16   39.1                                  31              32
Example 10
Identification of Genes of Castor Closely-Related to At5g17280
[0238] The non-redundant protein data set from NCBI including non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF protein sequences was searched using the predicted amino acid sequence of At5g17280 and tBLASTn. There is one gene, XM_002533611, which shares 50.7% amino acid sequence identity to At5g17280. This gene, its % identity to At5g17280 and SEQ ID NOs are listed in Table 18.
TABLE-US-00018 TABLE 18 Castor (Ricinus communis) gene closely related to At5g17280

Gene name      % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
XM_002533611   50.7                                  33              34
Example 11
Identification of Genes of Soybean (Glycine max) Closely-Related to At5g17280
[0239] Public DNA sequences (Soybean cDNAs Glyma1.01 (JGI) (N) Predicted cDNAs from Soybean JGI Glyma1.01 genomic sequence, FGENESH predictions, and EST PASA analysis.) were searched using the predicted amino acid sequence of At5g17280 and tBLASTn. There are two genes that encode proteins sharing between 30.3 and 38.2% amino acid sequence identity with the predicted protein At5g17280. These genes, their properties and SEQ ID NOs are listed in Table 19.
TABLE-US-00019 TABLE 19 Soybean genes closely related to At5g17280

Gene name       % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
Glyma02g05870   38.2                                  35              36
Glyma16g24560   30.3                                  37              38
Example 12
Identification of Genes of Maize (Zea mays) Closely-Related to At5g17280
[0240] The filtered Gene Set cDNAs of the maize genome sequence in the public maize database were searched using the predicted amino acid sequence of At5g17280 and tBLASTn. In addition, applicant's maize EST database was searched in a similar fashion. These genes, their properties and SEQ ID NOs are listed in Table 20. Maize GRMZM2G132101 shares 94.4% sequence identity with the public sequence from maize, NCBI Gi NO: 195615148 (SEQ ID NO: 66) and 93.3% sequence identity with SEQ ID NO:233249 of US20040214272 (SEQ ID NO:67). Maize cDNA pco642986 shares 95.5% sequence identity with the public sequence from maize, NCBI Gi NO: 195615148 (SEQ ID NO: 66) and 96.6% sequence identity with SEQ ID NO:233249 of US20040214272 (SEQ ID NO:67). Maize cDNA pco597536 shares 99.2% sequence identity with the public sequence from maize, NCBI Gi NO: 195615148 (SEQ ID NO:66) and 100% sequence identity with SEQ ID NO:233249 of US20040214272 (SEQ ID NO:67).
TABLE-US-00020 TABLE 20 Maize genes closely related to At5g17280

Gene name       % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
GRMZM2G132101   33.7                                  39              40
pco642986       33.0                                  41              42
pco597536       30.9                                  43              44
Example 13
Identification of Genes of Rice (Oryza sativa) Closely-Related to At5g17280
[0241] A public database of transcripts from rice gene models (Oryza sativa (japonica cultivar-group) MSU Rice Genome Annotation Project Osa1 release 6 (January 2009)) which includes untranslated regions (UTR) but no introns was searched using the predicted amino acid sequence of At5g17280 and tBLASTn. There is one gene which shares 34.5% amino acid sequence identity to At5g17280. This gene, its % identity to At5g17280 and SEQ ID NOs are listed in Table 21.
TABLE-US-00021 TABLE 21 Rice gene closely related to At5g17280

Gene name    % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
Os09g36120   34.5                                  45              46
Example 14
Identification of Genes of Sorghum (Sorghum bicolor) Closely-Related to At5g17280
[0242] The predicted coding sequences (mRNA) from the Sorghum JGI genomic sequence, version 1.4 were searched using the predicted amino acid sequence of At5g17280 and tBLASTn. There is one gene which shares 30.9% amino acid sequence identity to At5g17280. This gene, its % identity to At5g17280 and SEQ ID NOs are listed in Table 22.
TABLE-US-00022 TABLE 22 Sorghum gene closely related to At5g17280

Gene name     % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
Sb02g030770   30.9                                  47              48
Example 15
Expression of Chimeric Genes in Monocot Cells
[0243] A chimeric gene comprising a cDNA encoding the instant polypeptides in sense orientation with respect to the maize 27 kD zein promoter that is located 5' to the cDNA fragment, and the 10 kD zein 3' end that is located 3' to the cDNA fragment, can be constructed. The cDNA fragment of this gene may be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites (NcoI or SmaI) can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the digested vector pML103 as described below. Amplification is then performed in a standard PCR. The amplified DNA is then digested with restriction enzymes NcoI and SmaI and fractionated on an agarose gel. The appropriate band can be isolated from the gel and combined with a 4.9 kb NcoI-SmaI fragment of the plasmid pML103. Plasmid pML103 has been deposited under the terms of the Budapest Treaty at ATCC (American Type Culture Collection, 10801 University Blvd., Manassas, Va. 20110-2209), and bears accession number ATCC 97366. The DNA segment from pML103 contains a 1.05 kb SalI-NcoI promoter fragment of the maize 27 kD zein gene and a 0.96 kb SmaI-SalI fragment from the 3' end of the maize 10 kD zein gene in the vector pGem9Zf(+) (Promega). Vector and insert DNA can be ligated at 15° C. overnight, essentially as described (Maniatis). The ligated DNA may then be used to transform E. coli XL1-Blue (Epicurian Coli XL-1 Blue®; Stratagene). Bacterial transformants can be screened by restriction enzyme digestion of plasmid DNA and limited nucleotide sequence analysis using the dideoxy chain termination method (Sequenase® DNA Sequencing Kit; U.S. Biochemical). The resulting plasmid construct would comprise a chimeric gene encoding, in the 5' to 3' direction, the maize 27 kD zein promoter, a cDNA fragment encoding the instant polypeptides, and the 10 kD zein 3' region.
[0244] The chimeric gene described above can then be introduced into corn cells by the following procedure. Immature corn embryos can be dissected from developing caryopses derived from crosses of the inbred corn lines H99 and LH132. The embryos are isolated 10 to 11 days after pollination when they are 1.0 to 1.5 mm long. The embryos are then placed with the axis-side facing down and in contact with agarose-solidified N6 medium (Chu et al. (1975) Sci. Sin. Peking 18:659-668). The embryos are kept in the dark at 27° C. Friable embryogenic callus consisting of undifferentiated masses of cells with somatic proembryoids and embryoids borne on suspensor structures proliferate from the scutellum of these immature embryos. The embryogenic callus isolated from the primary explant can be cultured on N6 medium and sub-cultured on this medium every 2 to 3 weeks.
[0245] The plasmid p35S/Ac (obtained from Dr. Peter Eckes, Hoechst AG, Frankfurt, Germany) may be used in transformation experiments in order to provide for a selectable marker. This plasmid contains the pat gene (see European Patent Publication 0 242 236) which encodes phosphinothricin acetyl transferase (PAT). The enzyme PAT confers resistance to herbicidal glutamine synthetase inhibitors such as phosphinothricin. The pat gene in p35S/Ac is under the control of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313:810-812) and the 3' region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens.
[0246] The particle bombardment method (Klein et al. (1987) Nature 327:70-73) may be used to transfer genes to the callus culture cells. According to this method, gold particles (1 μm in diameter) are coated with DNA using the following technique. Ten μg of plasmid DNAs are added to 50 μL of a suspension of gold particles (60 mg per mL). Calcium chloride (50 μL of a 2.5 M solution) and spermidine free base (20 μL of a 1.0 M solution) are added to the particles. The suspension is vortexed during the addition of these solutions. After 10 minutes, the tubes are briefly centrifuged (5 sec at 15,000 rpm) and the supernatant removed. The particles are resuspended in 200 μL of absolute ethanol, centrifuged again and the supernatant removed. The ethanol rinse is performed again and the particles resuspended in a final volume of 30 μL of ethanol. An aliquot (5 μL) of the DNA-coated gold particles can be placed in the center of a Kapton® flying disc (Bio-Rad Labs). The particles are then accelerated into the corn tissue with a Biolistic® PDS-1000/He (Bio-Rad Instruments, Hercules Calif.), using a helium pressure of 1000 psi, a gap distance of 0.5 cm and a flying distance of 1.0 cm.
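The quantities given above allow a rough estimate of how much DNA and gold is delivered per bombardment. The Python sketch below works through that arithmetic. It assumes that all of the DNA binds to the gold and that no particles are lost during the ethanol rinses, so the results are upper bounds rather than measured values; the function and parameter names are illustrative.

```python
def per_shot_load(dna_ug, gold_suspension_ul, gold_mg_per_ml,
                  final_volume_ul, aliquot_ul):
    """Estimate DNA (ug) and gold (mg) in one aliquot loaded for bombardment.

    Assumptions: complete DNA binding and no particle loss during washes.
    """
    gold_mg = gold_suspension_ul / 1000.0 * gold_mg_per_ml
    fraction = aliquot_ul / final_volume_ul
    return dna_ug * fraction, gold_mg * fraction


# Quantities from the particle preparation described above:
# 10 ug DNA, 50 uL of 60 mg/mL gold, 30 uL final volume, 5 uL per flying disc.
dna_per_shot, gold_per_shot = per_shot_load(
    dna_ug=10.0,
    gold_suspension_ul=50.0,
    gold_mg_per_ml=60.0,
    final_volume_ul=30.0,
    aliquot_ul=5.0,
)
print(f"{dna_per_shot:.2f} ug DNA and {gold_per_shot:.2f} mg gold per shot")
```

Under these assumptions each 5 μL aliquot carries at most about 1.7 μg of DNA on about 0.5 mg of gold.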
[0247] For bombardment, the embryogenic tissue is placed on filter paper over agarose-solidified N6 medium. The tissue is arranged as a thin lawn and covers a circular area of about 5 cm in diameter. The petri dish containing the tissue can be placed in the chamber of the PDS-1000/He approximately 8 cm from the stopping screen. The air in the chamber is then evacuated to a vacuum of 28 inches of Hg. The macrocarrier is accelerated with a helium shock wave using a rupture membrane that bursts when the He pressure in the shock tube reaches 1000 psi. Seven days after bombardment the tissue can be transferred to N6 medium that contains glufosinate (2 mg per liter) and lacks casein or proline. The tissue continues to grow slowly on this medium. After an additional 2 weeks the tissue can be transferred to fresh N6 medium containing glufosinate. After 6 weeks, areas of about 1 cm in diameter of actively growing callus can be identified on some of the plates containing the glufosinate-supplemented medium. These calli may continue to grow when sub-cultured on the selective medium.
[0248] Plants can be regenerated from the transgenic callus by first transferring clusters of tissue to N6 medium supplemented with 0.2 mg per liter of 2,4-D. After two weeks the tissue can be transferred to regeneration medium (Fromm et al. (1990) Bio/Technology 8:833-839).
Example 16
Expression of Chimeric Genes in Dicot Cells
[0249] A seed-specific construct composed of the promoter and transcription terminator from the gene encoding the β subunit of the seed storage protein phaseolin from the bean Phaseolus vulgaris (Doyle et al. (1986) J. Biol. Chem. 261:9228-9238) can be used for expression of the instant polypeptides in transformed soybean. The phaseolin construct includes about 500 nucleotides upstream (5') from the translation initiation codon and about 1650 nucleotides downstream (3') from the translation stop codon of phaseolin. Between the 5' and 3' regions are the unique restriction endonuclease sites Nco I (which includes the ATG translation initiation codon), Sma I, Kpn I and Xba I. The entire construct is flanked by Hind III sites.
[0250] The cDNA fragment of this gene may be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the expression vector. Amplification is then performed as described above, and the isolated fragment is inserted into a pUC18 vector carrying the seed construct.
[0251] Soybean embryos may then be transformed with the expression vector comprising sequences encoding the instant polypeptides. To induce somatic embryos, cotyledons, 3-5 mm in length dissected from surface sterilized, immature seeds of the soybean cultivar A2872 can be cultured in the light or dark at 26° C. on an appropriate agar medium for 6-10 weeks. Somatic embryos which produce secondary embryos are then excised and placed into a suitable liquid medium. After repeated selection for clusters of somatic embryos which multiplied as early, globular staged embryos, the suspensions are maintained as described below. Soybean embryogenic suspension cultures can be maintained in 35 mL of liquid media on a rotary shaker, 150 rpm, at 26° C. with fluorescent lights on a 16:8 hour day/night schedule. Cultures are subcultured every two weeks by inoculating approximately 35 mg of tissue into 35 mL of liquid medium.
Soybean embryogenic suspension cultures may then be transformed by the method of particle gun bombardment (Klein et al. (1987) Nature (London) 327:70-73, U.S. Pat. No. 4,945,050). A DuPont Biolistic® PDS1000/HE instrument (helium retrofit) can be used for these transformations.
[0252] A selectable marker gene which can be used to facilitate soybean transformation is a chimeric gene composed of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313:810-812), the hygromycin phosphotransferase gene from plasmid pJR225 (from E. coli; Gritz et al. (1983) Gene 25:179-188) and the 3' region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens. The seed construct comprising the phaseolin 5' region, the fragment encoding the instant polypeptides and the phaseolin 3' region can be isolated as a restriction fragment. This fragment can then be inserted into a unique restriction site of the vector carrying the marker gene. To 50 μL of a 60 mg/mL 1 μm gold particle suspension is added (in order): 5 μL DNA (1 μg/μL), 20 μL spermidine (0.1 M), and 50 μL CaCl2 (2.5M). The particle preparation is then agitated for three minutes, spun in a microfuge for 10 seconds and the supernatant removed. The DNA-coated particles are then washed once in 400 μL 70% ethanol and resuspended in 40 μL of anhydrous ethanol. The DNA/particle suspension can be sonicated three times for one second each. Five μL of the DNA-coated gold particles are then loaded on each macro carrier disk. Approximately 300-400 mg of a two-week-old suspension culture is placed in an empty 60×15 mm petri dish and the residual liquid removed from the tissue with a pipette. For each transformation experiment, approximately 5-10 plates of tissue are normally bombarded. Membrane rupture pressure is set at 1100 psi and the chamber is evacuated to a vacuum of 28 inches of mercury. The tissue is placed approximately 3.5 inches away from the retaining screen and bombarded three times. Following bombardment, the tissue can be divided in half and placed back into liquid and cultured as described above.
[0253] Five to seven days post bombardment, the liquid media may be exchanged with fresh media, and eleven to twelve days post bombardment with fresh media containing 50 mg/mL hygromycin. This selective media can be refreshed weekly. Seven to eight weeks post bombardment, green, transformed tissue may be observed growing from untransformed, necrotic embryogenic clusters. Isolated green tissue is removed and inoculated into individual flasks to generate new, clonally propagated, transformed embryogenic suspension cultures. Each new line may be treated as an independent transformation event. These suspensions can then be subcultured and maintained as clusters of immature embryos or regenerated into whole plants by maturation and germination of individual somatic embryos.
Example 17
Expression of Chimeric Genes in Microbial Cells
[0254] The cDNAs encoding the instant polypeptides can be inserted into the T7 E. coli expression vector pBT430. This vector is a derivative of pET-3a (Rosenberg et al. (1987) Gene 56:125-135) which employs the bacteriophage T7 RNA polymerase/T7 promoter system. Plasmid pBT430 was constructed by first destroying the EcoR I and Hind III sites in pET-3a at their original positions. An oligonucleotide adaptor containing EcoR I and Hind III sites was inserted at the BamH I site of pET-3a. This created pET-3aM with additional unique cloning sites for insertion of genes into the expression vector. Then, the Nde I site at the position of translation initiation was converted to an Nco I site using oligonucleotide-directed mutagenesis. The DNA sequence of pET-3aM in this region, 5'-CATATGG, was converted to 5'-CCCATGG in pBT430.
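The Nde I to Nco I conversion described above can be checked with a short Python sketch. The recognition sequences used (CATATG for Nde I, CCATGG for Nco I) are the standard ones for these enzymes; the helper function names are illustrative.

```python
NDE_I_SITE = "CATATG"
NCO_I_SITE = "CCATGG"

ORIGINAL = "CATATGG"   # sequence in pET-3aM at the translation start
CONVERTED = "CCCATGG"  # sequence in pBT430 after mutagenesis


def has_site(sequence: str, site: str) -> bool:
    """Return True if the recognition site occurs in the sequence."""
    return site in sequence.upper()


def differing_positions(a: str, b: str) -> list:
    """Zero-based positions at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


print("Nde I in original: ", has_site(ORIGINAL, NDE_I_SITE))
print("Nco I in converted:", has_site(CONVERTED, NCO_I_SITE))
print("changed positions: ", differing_positions(ORIGINAL, CONVERTED))
print("ATG retained:      ", ORIGINAL[3:6] == CONVERTED[3:6] == "ATG")
```

The two-nucleotide change removes the Nde I site and creates an Nco I site while leaving the ATG initiation codon in place.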
[0255] Plasmid DNA containing a cDNA may be appropriately digested to release a nucleic acid fragment encoding the protein. This fragment may then be purified on a 1% NuSieve GTG® low melting agarose gel (FMC). Buffer and agarose contain 10 μg/mL ethidium bromide for visualization of the DNA fragment. The fragment can then be purified from the agarose gel by digestion with GELase® (Epicentre Technologies) according to the manufacturer's instructions, ethanol precipitated, dried and resuspended in 20 μL of water. Appropriate oligonucleotide adapters may be ligated to the fragment using T4 DNA ligase (New England Biolabs, Beverly, Mass.). The fragment containing the ligated adapters can be purified from the excess adapters using low melting agarose as described above. The vector pBT430 is digested, dephosphorylated with alkaline phosphatase (NEB) and deproteinized with phenol/chloroform as described above. The prepared vector pBT430 and fragment can then be ligated at 16° C. for 15 hours followed by transformation into DH5 electrocompetent cells (GIBCO BRL). Transformants can be selected on agar plates containing LB media and 100 μg/mL ampicillin. Transformants containing the gene encoding the instant polypeptides are then screened for the correct orientation with respect to the T7 promoter by restriction enzyme analysis. For high level expression, a plasmid clone with the cDNA insert in the correct orientation relative to the T7 promoter can be transformed into E. coli strain BL21(DE3) (Studier et al. (1986) J. Mol. Biol. 189:113-130). Cultures are grown in LB medium containing ampicillin (100 mg/L) at 25° C. At an optical density at 600 nm of approximately 1, IPTG (isopropylthio-β-galactoside, the inducer) can be added to a final concentration of 0.4 mM and incubation can be continued for 3 h at 25° C. Cells are then harvested by centrifugation and re-suspended in 50 μL of 50 mM Tris-HCl at pH 8.0 containing 0.1 mM DTT and 0.2 mM phenyl methylsulfonyl fluoride. 
A small amount of 1 mm glass beads can be added and the mixture sonicated 3 times for about 5 seconds each time with a microprobe sonicator. The mixture is centrifuged and the protein concentration of the supernatant determined. One μg of protein from the soluble fraction of the culture can be separated by SDS-polyacrylamide gel electrophoresis. Gels can be observed for protein bands migrating at the expected molecular weight.
Example 18
Transformation of Somatic Soybean Embryo Cultures
Generic Stable Soybean Transformation Protocol:
[0256] Soybean embryogenic suspension cultures are maintained in 35 ml liquid media (SB55 or SBP6) on a rotary shaker, 150 rpm, at 28° C. with mixed fluorescent and incandescent lights on a 16:8 h day/night schedule. Cultures are subcultured every four weeks by inoculating approximately 35 mg of tissue into 35 ml of liquid medium.
TABLE-US-00023 TABLE 23 Stock Solutions (g/L):

MS Sulfate 100X Stock
  MgSO4·7H2O     37.0
  MnSO4·H2O      1.69
  ZnSO4·7H2O     0.86
  CuSO4·5H2O     0.0025

MS Halides 100X Stock
  CaCl2·2H2O     44.0
  KI             0.083
  CoCl2·6H2O     0.00125
  KH2PO4         17.0
  H3BO3          0.62
  Na2MoO4·2H2O   0.025

MS FeEDTA 100X Stock
  Na2EDTA        3.724
  FeSO4·7H2O     2.784

B5 Vitamin Stock
  10 g m-inositol
  100 mg nicotinic acid
  100 mg pyridoxine HCl
  1 g thiamine

SB55 (per Liter, pH 5.7)
  10 ml each MS stocks
  1 ml B5 Vitamin stock
  0.8 g NH4NO3
  3.033 g KNO3
  1 ml 2,4-D (10 mg/mL stock)
  60 g sucrose
  0.667 g asparagine

SBP6
  same as SB55 except 0.5 ml 2,4-D

SB103 (per Liter, pH 5.7)
  1X MS Salts
  6% maltose
  750 mg MgCl2
  0.2% Gelrite

SB71-1 (per Liter, pH 5.7)
  1X B5 salts
  1 ml B5 vitamin stock
  3% sucrose
  750 mg MgCl2
  0.2% Gelrite
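The final concentration of a component in SB55 or SBP6 follows from the stock concentration and the volume of stock added per liter of medium. The Python sketch below works through that dilution for a few entries of Table 23. It assumes that the volume added is small relative to the one-liter final volume; the function name is illustrative.

```python
def final_mg_per_liter(stock_g_per_l: float, stock_ml_added: float) -> float:
    """Final concentration (mg/L) after adding stock to 1 L of medium.

    A stock at X g/L contains X mg per mL, so adding V mL contributes
    X * V mg to the liter of medium.
    """
    return stock_g_per_l * stock_ml_added


# 100X MS sulfate stock: 10 ml per liter gives the 1X concentration.
mgso4 = final_mg_per_liter(stock_g_per_l=37.0, stock_ml_added=10.0)

# 2,4-D stock at 10 mg/mL (equal to 10 g/L): 1 ml in SB55, 0.5 ml in SBP6.
sb55_24d = final_mg_per_liter(stock_g_per_l=10.0, stock_ml_added=1.0)
sbp6_24d = final_mg_per_liter(stock_g_per_l=10.0, stock_ml_added=0.5)

print(f"MgSO4·7H2O in SB55: {mgso4:.0f} mg/L")
print(f"2,4-D in SB55: {sb55_24d:.0f} mg/L; in SBP6: {sbp6_24d:.0f} mg/L")
```

By this arithmetic SB55 contains 10 mg/L 2,4-D and SBP6 contains 5 mg/L.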
[0257] Soybean embryogenic suspension cultures are transformed with plasmid DNA by the method of particle gun bombardment (Klein et al (1987) Nature 327:70). A DuPont Biolistic PDS1000/HE instrument (helium retrofit) is used for these transformations.
[0258] To 50 μl of a 60 mg/ml 1 μm gold particle suspension is added (in order): 5 μl DNA (1 μg/μl), 20 μl spermidine (0.1 M), and 50 μl CaCl2 (2.5 M). The particle preparation is agitated for 3 min, spun in a microfuge for 10 sec and the supernatant removed. The DNA-coated particles are then washed once in 400 μl 70% ethanol and resuspended in 40 μl of anhydrous ethanol. The DNA/particle suspension is sonicated three times for 1 sec each. Five μl of the DNA-coated gold particles are then loaded on each macro carrier disk. For selection, a plasmid conferring resistance to hygromycin through expression of hygromycin phosphotransferase (HPT) may be co-bombarded with the silencing construct of interest.
[0259] Approximately 300-400 mg of a four week old suspension culture is placed in an empty 60×15 mm petri dish and the residual liquid removed from the tissue with a pipette. For each transformation experiment, approximately 5-10 plates of tissue are normally bombarded. Membrane rupture pressure is set at 1000 psi and the chamber is evacuated to a vacuum of 28 inches of mercury. The tissue is placed approximately 3.5 inches away from the retaining screen and bombarded three times. Following bombardment, the tissue is placed back into liquid and cultured as described above.
[0260] Eleven days post bombardment, the liquid media is exchanged with fresh SB55 containing 50 mg/ml hygromycin. The selective media is refreshed weekly. Seven weeks post bombardment, green, transformed tissue is observed growing from untransformed, necrotic embryogenic clusters. Isolated green tissue is removed and inoculated into individual flasks to generate new, clonally propagated, transformed embryogenic suspension cultures. Thus each new line is treated as an independent transformation event. These suspensions can then be maintained as suspensions of embryos maintained in an immature developmental stage or regenerated into whole plants by maturation and germination of individual somatic embryos.
[0261] Independent lines of transformed embryogenic clusters are removed from liquid culture and placed on a solid agar media (SB103) containing no hormones or antibiotics. Embryos are cultured for four weeks at 26° C. with mixed fluorescent and incandescent lights on a 16:8 h day/night schedule. During this period, individual embryos are removed from the clusters and screened for alterations in gene expression.
[0262] It should be noted that any detectable phenotype resulting from the altered expression of a target gene can be screened at this stage. This would include, but not be limited to, alterations in oil content, protein content, carbohydrate content, growth rate, viability, or the ability to develop normally into a soybean plant.
Example 19
Plasmid DNAs for "Complementary Region" Co-Suppression
[0263] The plasmids in the following experiments are made using standard cloning methods well known to those skilled in the art (Sambrook et al (1989) Molecular Cloning, CSHL Press, New York). A starting plasmid pKS18HH (U.S. Pat. No. 5,846,784, the contents of which are hereby incorporated by reference) contains a hygromycin B phosphotransferase (HPT) obtained from E. coli strain W677 under the control of a T7 promoter and the 35S cauliflower mosaic virus promoter. Plasmid pKS18HH thus contains the T7 promoter/HPT/T7 terminator cassette for expression of the HPT enzyme in certain strains of E. coli, such as NovaBlue (DE3) [from Novagen], that are lysogenic for lambda DE3 (which carries the T7 RNA Polymerase gene under lacUV5 control). Plasmid pKS18HH also contains the 35S/HPT/NOS cassette for constitutive expression of the HPT enzyme in plants, such as soybean. These two expression systems allow selection for growth in the presence of hygromycin to be used as a means of identifying cells that contain the plasmid in both bacterial and plant systems. pKS18HH also contains three unique restriction endonuclease sites suitable for the cloning of other chimeric genes into this vector. Plasmid ZBL100 (PCT Application No. WO 00/11176 published on Mar. 2, 2000) is a derivative of pKS18HH with a reduced NOS 3' terminator. Plasmid pKS67 is a ZBL100 derivative with the insertion of a beta-conglycinin promoter, in front of a NotI cloning site, followed by a phaseolin 3' terminator (described in PCT Application No. WO 94/11516, published on May 26, 1994).
[0264] The 2.5 kb plasmid pKS17 contains pSP72 (obtained from Promega Biosystems) and the T7 promoter/HPT/T7 3' terminator region, and is the original vector into which the 3.2 kb BamHI-SalI fragment containing the 35S/HPT/NOS cassette was cloned to form pKS18HH. The plasmid pKS102 is a pKS17 derivative that is digested with XhoI and SalI, treated with mung-bean nuclease to generate blunt ends, and ligated to insert the linker described in SEQ ID NO:49.
[0265] The plasmid pKS83 has the 2.3 kb BamHI fragment of ML70 containing the Kti3 promoter/NotI/Kti3 3' terminator region (described in PCT Application No. WO 94/11516, published on May 26, 1994) ligated into the BamHI site of pKS17. Additional methods for suppression of endogenous genes are well known in the art, have been described in the detailed description of the instant invention, and can be used to reduce the expression of endogenous ORM protein or enzyme activity in a plant cell.
Example 20
Suppression by ELVISLIVES Complementary Region
[0266] Constructs can be made which have "synthetic complementary regions" (SCR). In this example the target sequence is placed between complementary sequences that are not known to be part of any biologically derived gene or genome (i.e. sequences that are "synthetic" or conjured up from the mind of the inventor). The target DNA would therefore be in the sense or antisense orientation and the complementary RNA would be unrelated to any known nucleic acid sequence. It is possible to design a standard "suppression vector" into which pieces of any target gene for suppression could be dropped. The plasmids pKS106, pKS124, and pKS133 (SEQ ID NO:50) exemplify this. One skilled in the art will appreciate that all of the plasmid vectors contain antibiotic selection genes such as, but not limited to, hygromycin phosphotransferase with promoters such as the T7 inducible promoter.
[0267] pKS106 uses the beta-conglycinin promoter while the pKS124 and pKS133 plasmids use the Kti promoter; both of these promoters exhibit strong tissue specific expression in the seeds of soybean. pKS106 uses a 3' termination region from the phaseolin gene, and pKS124 and pKS133 use a Kti 3' termination region. pKS106 and pKS124 have single copies of the 36 nucleotide EagI-ELVISLIVES sequence surrounding a NotI site (the amino acids given in parentheses are back-translated from the complementary strand):
SEQ ID NO: 51
EagI    E   L   V   I   S   L   I   V   E   S   NotI
CGGCCG  GAG CTG GTC ATC TCG CTC ATC GTC GAG TCG GCGGCCGC
(S) (E) (V) (I) (L) (S) (I) (V) (L) (E) EagI
CGA CTC GAC GAT GAG CGA GAT GAC CAG CTC CGGCCG
pKS133 has 2× copies of ELVISLIVES surrounding the NotI site:
SEQ ID NO: 52
EagI E L V I S L I V E S EagI E L V I S
cggccggagctggtcatctcgctcatcgtcgagtcg gcggccg gagctggtcatctcg
L I V E S NotI (S)(E)(V)(I)(L)(S)(I)(V)(L)(E) EagI
ctcatcgtcgagtcg gcggccgc cgactcgacgatgagcgagatgacc agctc cggccgc
(S)(E)(V)(I)(L)(S)(I)(V)(L)(E) EagI
cgactcgacgatgagcgagatgaccagctc cggccg
[0268] The idea is that the single EL linker (SCR) can be duplicated to increase stem lengths in increments of approximately 40 nucleotides. A series of vectors will cover the SCR lengths between 40 bp and 300 bp. Various target gene lengths can also be evaluated. It is believed that certain combinations of target lengths and complementary region lengths will give optimum suppression of the target; however, it is expected that the suppression phenomenon works well over a wide range of sizes and sequences. It is also believed that the lengths and ratios providing optimum suppression may vary somewhat given different target sequences and/or complementary regions.
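The relationship between the two ELVISLIVES arms in SEQ ID NO:51 can be checked with a short script. The following is a minimal sketch, not part of the original disclosure: the function names are illustrative, and the codon table covers only the six codons that occur in the sense arm. It confirms that the sense arm translates to ELVISLIVES, that the second arm is its reverse complement (so the two arms can pair to form a stem), and that the EagI site plus one arm is 36 nucleotides long.

```python
# Sketch only: helper names are hypothetical; sequences are copied from SEQ ID NO:51.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Only the codons present in the sense arm are listed here.
CODONS = {"GAG": "E", "CTG": "L", "GTC": "V", "ATC": "I", "TCG": "S", "CTC": "L"}

EAGI_SITE = "CGGCCG"
SENSE_ARM = "GAGCTGGTCATCTCGCTCATCGTCGAGTCG"
ANTISENSE_ARM = "CGACTCGACGATGAGCGAGATGACCAGCTC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate a DNA sequence codon by codon using the small table above."""
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3))


peptide = translate(SENSE_ARM)
arms_are_complementary = reverse_complement(SENSE_ARM) == ANTISENSE_ARM
unit_length = len(EAGI_SITE + SENSE_ARM)

print(peptide, arms_are_complementary, unit_length)
```

The same `reverse_complement` check could be applied to any other synthetic complementary region before it is ordered as primers.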
[0269] The plasmid pKS106 is made by putting the EagI fragment of ELVISLIVES (SEQ ID NO:51) into the NotI site of pKS67. The ELVISLIVES fragment is made by PCR using two primers (SEQ ID NO:53 and SEQ ID NO:54) and no other DNA.
[0270] The product of the PCR reaction is digested with EagI (5'-CGGCCG-3') and then ligated into NotI digested pKS67. The terms "ELVISLIVES" and "EL" are used interchangeably herein.
[0271] Additional plasmids can be used to test this example and any synthetic sequence, or naturally occurring sequence, can be used in an analogous manner.
Example 21
Screening of Transgenic Lines for Alterations in Oil, Protein, Starch and Soluble Carbohydrate Content
[0272] Transgenic lines can be selected from soybean transformed with a suppression plasmid, such as those described in Example 19 and Example 20. Transgenic lines can be screened for down regulation of plastidic HpaIL aldolase in soybean, by measuring alteration in oil, starch, protein, soluble carbohydrate and/or seed weight. Compositional analysis including measurements of seed compositional parameters such as protein content and content of soluble carbohydrates of soybean seed derived from transgenic events that show seed-specific down-regulation of ORM genes is performed as follows:
Oil content of mature soybean seed or lyophilized soybean somatic embryos can be measured by NMR as described in Example 2.
Non-structural carbohydrate and protein analysis.
[0273] Dry soybean seed are ground to a fine powder in a GenoGrinder and subsamples are weighed (to an accuracy of 0.0001 g) into 13×100 mm glass tubes; the tubes have Teflon® lined screw-cap closures. Three replicates are prepared for each sample tested. Tissue dry weights are calculated by weighing sub-samples before and after drying in a forced air oven for 18 h at 105 C.
[0274] Lipid extraction is performed by adding 2 ml aliquots of heptane to each tube. The tubes are vortex mixed and placed into an ultrasonic bath (VWR Scientific Model 750D) filled with water heated to 60 C. The samples are sonicated at full-power (˜360 W) for 15 min and are then centrifuged (5 min×1700 g). The supernatants are transferred to clean 13×100 mm glass tubes and the pellets are extracted 2 more times with heptane (2 ml, second extraction, 1 ml third extraction) with the supernatants from each extraction being pooled. After lipid extraction 1 ml acetone is added to the pellets and after vortex mixing, to fully disperse the material, they are taken to dryness in a Speedvac.
Non-structural carbohydrate extraction and analysis.
[0275] Two ml of 80% ethanol is added to the acetone dried pellets from above. The samples are thoroughly vortex mixed until the plant material is fully dispersed in the solvent prior to sonication at 60 C for 15 min. After centrifugation, 5 min×1700 g, the supernatants are decanted into clean 13×100 mm glass tubes. Two more extractions with 80% ethanol are performed and the supernatants from each are pooled. The extracted pellets are suspended in acetone and dried (as above). An internal standard β-phenyl glucopyranoside (100 ul of a 0.5000+/-0.0010 g/100 ml stock) is added to each extract prior to drying in a Speedvac. The extracts are maintained in a desiccator until further analysis.
[0276] The acetone dried powders from above are suspended in 0.9 ml MOPS (3-N[Morpholino]propane-sulfonic acid; 50 mM, 5 mM CaCl2, pH 7.0) buffer containing 100 U of heat stable α-amylase (from Bacillus licheniformis; Sigma A-4551). Samples are placed in a heat block (90 C) for 75 min and are vortex mixed every 15 min. Samples are then allowed to cool to room temperature and 0.6 ml acetate buffer (285 mM, pH 4.5) containing 5 U amyloglucosidase (Roche 110 202 367 001) is added to each. Samples are incubated for 15-18 h at 55 C in a water bath fitted with a reciprocating shaker; standards of soluble potato starch (Sigma S-2630) are included to ensure that starch digestion goes to completion.
[0277] Post-digestion the released carbohydrates are extracted prior to analysis. Absolute ethanol (6 ml) is added to each tube and after vortex mixing the samples are sonicated for 15 min at 60 C. Samples are centrifuged (5 min×1700 g) and the supernatants are decanted into clean 13×100 mm glass tubes. The pellets are extracted 2 more times with 3 ml of 80% ethanol and the resulting supernatants are pooled. Internal standard (100 ul β-phenyl glucopyranoside, as above) is added to each sample prior to drying in a Speedvac.
Sample Preparation and Analysis
[0278] The dried samples from the soluble and starch extractions described above are solubilized in anhydrous pyridine (Sigma-Aldrich P57506) containing 30 mg/ml of hydroxylamine HCl (Sigma-Aldrich 159417). Samples are placed on an orbital shaker (300 rpm) overnight and are then heated for 1 hr (75 C) with vigorous vortex mixing applied every 15 min. After cooling to room temperature 1 ml hexamethyldisilazane (Sigma-Aldrich H-4875) and 100 ul trifluoroacetic acid (Sigma-Aldrich T-6508) are added. The samples are vortex mixed and the precipitates are allowed to settle prior to transferring the supernatants to GC sample vials. Samples are analyzed on an Agilent 6890 gas chromatograph fitted with a DB-17MS capillary column (15 m×0.32 mm×0.25 um film). Inlet and detector temperatures are both 275 C. After injection (2 ul, 20:1 split) the initial column temperature (150 C) is increased to 180 C at a rate of 3 C/min and then at 25 C/min to a final temperature of 320 C. The final temperature is maintained for 10 min. The carrier gas is H2 at a linear velocity of 51 cm/sec. Detection is by flame ionization. Data analysis is performed using Agilent ChemStation software. Each sugar is quantified relative to the internal standard, and detector responses are applied for each individual carbohydrate (calculated from standards run with each set of samples). Final carbohydrate concentrations are expressed on a tissue dry weight basis.
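The internal-standard quantification and the oven program described above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions, not the procedure implemented in the ChemStation software: the function names are hypothetical, and the relative response factor is assumed to be defined as (peak area per μg sugar) divided by (peak area per μg internal standard). The stock concentration, spike volume, ramp rates and hold time are taken from the text.

```python
def internal_standard_ug(stock_g_per_100ml: float = 0.5, volume_ul: float = 100.0) -> float:
    """Mass of beta-phenyl glucopyranoside added per extract, in micrograms."""
    mg_per_ml = stock_g_per_100ml * 1000.0 / 100.0  # 0.5 g/100 ml equals 5 mg/ml
    return mg_per_ml * volume_ul                    # mg/ml is the same as ug/ul


def sugar_ug_per_mg(area_sugar: float, area_is: float, response_factor: float,
                    is_ug: float, dry_weight_mg: float) -> float:
    """Sugar content on a dry weight basis (ug per mg tissue).

    response_factor is an assumed relative response factor (sugar vs. internal
    standard) that would be obtained from standards run with each sample set.
    """
    sugar_ug = (area_sugar / area_is) * is_ug / response_factor
    return sugar_ug / dry_weight_mg


def oven_program_minutes() -> float:
    """Run time implied by the stated temperature program."""
    ramp1 = (180 - 150) / 3.0    # 150 C to 180 C at 3 C/min
    ramp2 = (320 - 180) / 25.0   # 180 C to 320 C at 25 C/min
    hold = 10.0                  # final hold at 320 C
    return ramp1 + ramp2 + hold


is_mass = internal_standard_ug()
example = sugar_ug_per_mg(area_sugar=2000.0, area_is=1000.0,
                          response_factor=1.0, is_ug=is_mass, dry_weight_mg=50.0)
print(is_mass, example, oven_program_minutes())
```

The peak areas and the 50 mg dry weight in the example call are made-up inputs chosen only to show the arithmetic.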
Protein Analysis
[0279] Protein contents are estimated by combustion analysis on a Thermo Finnigan Flash 1112EA combustion analyzer. Samples, 4-8 mg, weighed to an accuracy of 0.001 mg on a Mettler-Toledo MX5 micro balance are used for analysis. Protein contents are calculated by multiplying % N, determined by the analyzer, by 6.25. Final protein contents are expressed on a % tissue dry weight basis. Additionally, the composition of intact single seed and bulk quantities of seed, or powders derived from them, may be measured by near-infrared analysis. Moisture, protein and oil content in soy and moisture, protein, oil and starch content in corn can be measured when combined with the appropriate calibrations.
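The nitrogen-to-protein conversion can be written out as a short calculation. This is a minimal sketch: the function names and example inputs are hypothetical, and the dry-matter correction assumes that % N was measured on powder that still contained moisture, which the text does not state explicitly.

```python
N_TO_PROTEIN = 6.25  # conversion factor given in the text


def dry_matter_fraction(weight_before_mg: float, weight_after_mg: float) -> float:
    """Fraction of the sample remaining after oven drying."""
    return weight_after_mg / weight_before_mg


def protein_percent_dry_basis(nitrogen_percent: float, dry_fraction: float = 1.0) -> float:
    """Protein as % of tissue dry weight from % N measured by combustion."""
    return nitrogen_percent * N_TO_PROTEIN / dry_fraction


# Example with made-up inputs: 6.0 % N on powder that is 94 % dry matter.
fraction = dry_matter_fraction(100.0, 94.0)
print(round(protein_percent_dry_basis(6.0, fraction), 1))
```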
Example 22
Screening of Transgenic Maize Lines for Alterations in Oil, Protein, Starch and Soluble Carbohydrate Content
[0280] Transgenic maize lines prepared by the method described in Example 15 can be screened essentially as described in Example 21. Embryo-specific downregulation of ORM gene expression is expected to lead to an increase in seed oil content. In contrast, endosperm-specific overexpression of ORM genes is expected to lead to an increase in seed starch and/or protein content.
Example 23
Seed-Specific RNAi of ORM Genes in Soybean
[0281] A plasmid vector (pKS433) for generation of transgenic soybean events that show seed specific down-regulation of the soy ORM genes corresponding to Glyma02g05870 and Glyma16g24560 genes was constructed.
[0282] Briefly, plasmid DNA of applicants' EST clone sl1.pk0142.e6 corresponding to Glyma02g05870 (SEQ ID NO:35) was used in PCR reactions with primers SA195 (SEQ ID NO:55) and SA196 (SEQ ID NO:56) and with primers SA200 (SEQ ID NO:57) and SA201 (SEQ ID NO:58). A PCR product of 0.39 kb was generated with SA195 (SEQ ID NO:55) and SA196 (SEQ ID NO:56). It was gel purified and is henceforth known as product A. A PCR product of 0.19 kb was generated with SA200 (SEQ ID NO:57) and SA201 (SEQ ID NO:58). It was gel purified and is henceforth known as product B. PCR products A and B were cloned into pGEM T to give pGEM TA (SEQ ID NO:59) and pGEM TB (SEQ ID NO:60), respectively. pGEM TA (SEQ ID NO:59) was digested with HhaI. The digested DNA was treated with Klenow polymerase (NEB, Ipswich, Mass., USA); specifically, the 3'-5' exonuclease activity of said enzyme was used to create blunt ends. A 0.58 kb DNA fragment was gel-purified. pGEM TB (SEQ ID NO:60) was linearized by digestion with BamHI. Overhanging ends were filled in with Klenow polymerase activity and 3' ends were dephosphorylated using calf intestinal phosphatase (NEB, Ipswich, Mass., USA). The 0.58 kb HhaI fragment was ligated to BamHI-linearized pGEM TB to give rise to pGEM T-ORM-HP (SEQ ID NO:61).
[0283] pGEM T-ORM-HP (SEQ ID NO:61) was digested with NotI. A 0.56 kb fragment was gel-purified. The gel purified product was ligated using T4 ligase and thereby cloned in the sense orientation behind the Kti promoter of soybean expression vector KS126 (PCT Publication No. WO 04/071467) that had previously been linearized with the restriction enzyme NotI to give pKS433 (SEQ ID NO:62).
[0284] Plasmid DNA of pKS433 can be used to generate transgenic somatic embryos or seed of soybean using hygromycin selection as described in Example 14. Composition of transgenic somatic embryos or soybean seed generated with pKS433 can be determined as described in Example 19.
[0285] The plasmid vector pKS123 is described in PCT Application No. WO 02/08269. Plasmid pKS120 (SEQ ID NO: 63) is identical to pKS123 (supra) with the exception that the HindIII fragment containing Bcon/NotI/Phas3' cassette was removed.
Generation of Transgenic Somatic Embryos:
[0286] Soybean somatic embryo tissue was co-bombarded as described below with plasmid DNA of pKS120 or pKS433.
Culture Conditions:
[0287] Soybean embryogenic suspension cultures (cv. Jack) were maintained in 35 mL liquid medium SB196 (infra) on a rotary shaker, 150 rpm, 26° C. with cool white fluorescent lights on 16:8 h day/night photoperiod at light intensity of 60-85 μE/m2/s. Cultures were subcultured every 7 days to two weeks by inoculating approximately 35 mg of tissue into 35 mL of fresh liquid SB196 (the preferred subculture interval is every 7 days).
[0288] Soybean embryogenic suspension cultures were transformed with the soybean expression plasmids by the method of particle gun bombardment (Klein et al., Nature 327:70 (1987)) using a DuPont Biolistic PDS1000/HE instrument (helium retrofit) for all transformations.
Soybean Embryogenic Suspension Culture Initiation:
[0289] Soybean cultures were initiated twice each month with 5-7 days between each initiation. Pods with immature seeds from available soybean plants 45-55 days after planting were picked, removed from their shells and placed into a sterilized magenta box. The soybean seeds were sterilized by shaking them for 15 min in a 5% Clorox solution with 1 drop of ivory soap (i.e., 95 mL of autoclaved distilled water plus 5 mL Clorox and 1 drop of soap, mixed well). Seeds were rinsed using 2 1-liter bottles of sterile distilled water and those less than 4 mm were placed on individual microscope slides. The small end of the seed was cut and the cotyledons pressed out of the seed coat. Cotyledons were transferred to plates containing SB199 medium (25-30 cotyledons per plate) for 2 weeks, then transferred to SB1 for 2-4 weeks. Plates were wrapped with fiber tape. After this time, secondary embryos were cut and placed into SB196 liquid media for 7 days.
Preparation of DNA for Bombardment:
[0290] Plasmid DNA of pKS120 or pKS433 was used for bombardment.
[0291] A 50 μL aliquot of sterile distilled water containing 1 mg of gold particles was added to 5 μL of a 1 μg/μL plasmid DNA solution, 50 μL of 2.5 M CaCl2 and 20 μL of 0.1 M spermidine. The mixture was pulsed 5 times on level 4 of a vortex shaker and spun for 5 sec in a bench microfuge. After a wash with 150 μL of 100% ethanol, the pellet was suspended by sonication in 85 μL of 100% ethanol. Five μL of DNA suspension was dispensed to each flying disk of the Biolistic PDS1000/HE instrument. Each 5 μL aliquot contained approximately 0.058 mg gold particles per bombardment (i.e., per disk).
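The stated gold load per disk follows from the volumes given above, as the following minimal sketch shows. The function and variable names are illustrative, and the DNA-per-shot figure additionally assumes that all of the plasmid DNA binds to the gold and none is lost in the ethanol wash, which the text does not claim.

```python
def per_shot(total_amount: float, resuspension_ul: float = 85.0, shot_ul: float = 5.0) -> float:
    """Amount delivered per disk when a pellet is resuspended and aliquoted."""
    return total_amount * shot_ul / resuspension_ul


gold_mg_per_shot = per_shot(1.0)           # 1 mg gold in the preparation
dna_ug_per_shot = per_shot(5.0 * 1.0)      # 5 uL of 1 ug/uL DNA (upper bound, see assumption)
shots_per_prep = int(85.0 // 5.0)          # number of 5 uL aliquots in 85 uL

print(round(gold_mg_per_shot, 4), round(dna_ug_per_shot, 3), shots_per_prep)
```

The computed gold load, about 0.059 mg, agrees with the approximately 0.058 mg per disk given in the text.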
Tissue Preparation and Bombardment with DNA:
[0292] Approximately 100-150 mg of 7 day old embryonic suspension cultures were placed in an empty, sterile 60×15 mm petri dish and the dish was placed inside of an empty 150×25 mm Petri dish. Tissue was bombarded 1 shot per plate with membrane rupture pressure set at 650 PSI and the chamber was evacuated to a vacuum of 27-28 inches of mercury. Tissue was placed approximately 2.5 inches from the retaining/stopping screen.
Selection of Transformed Embryos:
[0293] Transformed embryos were selected using hygromycin as the selectable marker. Specifically, following bombardment, the tissue was placed into fresh SB196 media and cultured as described above. Six to eight days post-bombardment, the SB196 was exchanged with fresh SB196 containing 30 mg/L hygromycin. The selection media was refreshed weekly. Four to six weeks post-selection, green, transformed tissue was observed growing from untransformed, necrotic embryogenic clusters. Isolated, green tissue was removed and inoculated into multi-well plates to generate new, clonally propagated, transformed embryogenic suspension cultures.
Embryo Maturation:
[0294] Transformed embryogenic clusters were cultured for one to three weeks at 26° C. in SB196 under cool white fluorescent (Philips cool white Econowatt F40/CW/RS/EW) and Agro (Philips F40 Agro) bulbs (40 watt) on a 16:8 hr photoperiod with light intensity of 90-120 μE/m2s. After this time embryo clusters were removed to a solid agar media, SB166, for 1 week. They were then subcultured to medium SB103 for 3 weeks. Alternatively, embryo clusters were removed to SB228 (SHaM) liquid media, 35 mL in 250 mL Erlenmeyer flask, for 2-3 weeks. Tissue cultured in SB228 was maintained on a rotary shaker, 130 rpm, 26° C. with cool white fluorescent lights on 16:8 h day/night photoperiod at light intensity of 60-85 μE/m2/s. During this period, individual embryos were removed from the clusters and screened for alterations in their fatty acid compositions as described supra.
Media Recipes:
[0295] SB 196 - FN Lite Liquid Proliferation Medium (per liter)
  MS FeEDTA - 100x Stock 1                10 mL
  MS Sulfate - 100x Stock 2               10 mL
  FN Lite Halides - 100x Stock 3          10 mL
  FN Lite P, B, Mo - 100x Stock 4         10 mL
  B5 vitamins (1 mL/L)                    1.0 mL
  2,4-D (10 mg/L final concentration)     1.0 mL
  KNO3                                    2.83 gm
  (NH4)2SO4                               0.463 gm
  Asparagine                              1.0 gm
  Sucrose (1%)                            10 gm
  pH 5.8
FN Lite Stock Solutions

Stock Number                          1000 mL      500 mL
1  MS Fe EDTA 100x Stock
     Na2EDTA*                         3.724 g      1.862 g
     FeSO4·7H2O                       2.784 g      1.392 g
2  MS Sulfate 100x Stock
     MgSO4·7H2O                       37.0 g       18.5 g
     MnSO4·H2O                        1.69 g       0.845 g
     ZnSO4·7H2O                       0.86 g       0.43 g
     CuSO4·5H2O                       0.0025 g     0.00125 g
3  FN Lite Halides 100x Stock
     CaCl2·2H2O                       30.0 g       15.0 g
     KI                               0.083 g      0.0415 g
     CoCl2·6H2O                       0.0025 g     0.00125 g
4  FN Lite P, B, Mo 100x Stock
     KH2PO4                           18.5 g       9.25 g
     H3BO3                            0.62 g       0.31 g
     Na2MoO4·2H2O                     0.025 g      0.0125 g
*Add first, dissolve in dark bottle while stirring
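Because every 500 mL amount is simply half of the 1000 mL amount, the stock recipes can be scaled to any batch volume with a few lines of code. The sketch below is illustrative only: the dictionary layout and function name are assumptions, and the per-liter amounts are copied from the 1000 mL column of the FN Lite stock table. Scaling in this way gives 0.0415 g KI for a 500 mL batch of Stock 3.

```python
# Per-liter (1000 mL) amounts in grams, copied from the FN Lite stock table.
STOCKS_G_PER_1000_ML = {
    "MS Fe EDTA 100x": {"Na2EDTA": 3.724, "FeSO4.7H2O": 2.784},
    "MS Sulfate 100x": {"MgSO4.7H2O": 37.0, "MnSO4.H2O": 1.69,
                        "ZnSO4.7H2O": 0.86, "CuSO4.5H2O": 0.0025},
    "FN Lite Halides 100x": {"CaCl2.2H2O": 30.0, "KI": 0.083, "CoCl2.6H2O": 0.0025},
    "FN Lite P, B, Mo 100x": {"KH2PO4": 18.5, "H3BO3": 0.62, "Na2MoO4.2H2O": 0.025},
}


def scale_stock(stock_name: str, volume_ml: float) -> dict:
    """Return grams of each salt needed to prepare volume_ml of the named stock."""
    factor = volume_ml / 1000.0
    return {salt: round(grams * factor, 6)
            for salt, grams in STOCKS_G_PER_1000_ML[stock_name].items()}


for name in STOCKS_G_PER_1000_ML:
    print(name, scale_stock(name, 500.0))
```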
SB1 Solid Medium
Per Liter
[0296] 1 package MS salts (Gibco/BRL--Cat. No. 11117-066)
[0297] 1 mL B5 vitamins 1000× stock
[0298] 31.5 g Glucose
[0299] 2 mL 2,4-D (20 mg/L final concentration)
[0300] pH 5.7
[0301] 8 g TC agar
SB199 Solid Medium
Per Liter
[0302] 1 package MS salts (Gibco/BRL--Cat. No. 11117-066)
[0303] 1 mL B5 vitamins 1000× stock
[0304] 30 g Sucrose
[0305] 4 ml 2,4-D (40 mg/L final concentration)
[0306] pH 7.0
[0307] 2 gm Gelrite
SB 166 Solid Medium
Per Liter
[0308] 1 package MS salts (Gibco/BRL--Cat. No. 11117-066)
[0309] 1 mL B5 vitamins 1000× stock
[0310] 60 g maltose
[0311] 750 mg MgCl2 hexahydrate
[0312] 5 g Activated charcoal
[0313] pH 5.7
[0314] 2 g Gelrite
SB 103 Solid Medium
Per Liter
[0315] 1 package MS salts (Gibco/BRL--Cat. No. 11117-066)
[0316] 1 mL B5 vitamins 1000× stock
[0317] 60 g maltose
[0318] 750 mg MgCl2 hexahydrate
[0319] pH 5.7
[0320] 2 g Gelrite
SB 71-4 Solid Medium
Per Liter
[0321] 1 bottle Gamborg's B5 salts w/sucrose (Gibco/BRL--Cat. No. 21153-036)
[0322] pH 5.7
[0323] g TC agar
2,4-D Stock
[0324] Obtain premade from Phytotech Cat. No. D 295--concentration 1 mg/mL
B5 Vitamins Stock
Per 100 mL
[0325] Store aliquots at -20° C.
[0326] 10 g Myo-inositol
[0327] 100 mg Nicotinic acid
[0328] 100 mg Pyridoxine HCl
[0329] 1 g Thiamine
If the solution does not dissolve quickly enough, apply a low level of heat via the hot stir plate.
SB 228 - Soybean Histodifferentiation & Maturation (SHaM) (per liter)
  DDI H2O                                 600 ml
  FN-Lite Macro Salts for SHaM 10X        100 ml
  MS Micro Salts 1000x                    1 ml
  MS FeEDTA 100x                          10 ml
  CaCl 100x                               6.82 ml
  B5 Vitamins 1000x                       1 ml
  L-Methionine                            0.149 g
  Sucrose                                 30 g
  Sorbitol                                30 g
  Adjust volume to 900 mL
  pH 5.8
  Autoclave
  Add to cooled media (≦30 C.):
  *Glutamine (Final conc. 30 mM) 4%       110 mL
*Note: Final volume will be 1010 mL after glutamine addition.
Because glutamine degrades relatively rapidly, it may be preferable to add immediately prior to using media. Expiration 2 weeks after glutamine is added; base media can be kept longer w/o glutamine.
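The stated final glutamine concentration can be checked from the recipe. This is a minimal sketch with illustrative names; the molar mass of L-glutamine (146.14 g/mol) is a standard value supplied here and is not given in the text.

```python
GLUTAMINE_G_PER_MOL = 146.14  # molar mass of L-glutamine (not stated in the text)


def final_glutamine_mm(stock_percent_wv: float = 4.0, stock_ml: float = 110.0,
                       final_volume_ml: float = 1010.0) -> float:
    """Final glutamine concentration (mM) after adding the 4% stock to the medium."""
    grams_added = stock_percent_wv / 100.0 * stock_ml        # 4% w/v is 4 g per 100 mL
    grams_per_liter = grams_added / (final_volume_ml / 1000.0)
    return grams_per_liter / GLUTAMINE_G_PER_MOL * 1000.0


print(round(final_glutamine_mm(), 1))
```

The result is close to the nominal 30 mM given in the SB 228 recipe.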
FN-lite Macro for SHAM 10X - Stock #1 (per liter)
  (NH4)2SO4 (Ammonium Sulfate)                      4.63 g
  KNO3 (Potassium Nitrate)                          28.3 g
  MgSO4*7H2O (Magnesium Sulfate Heptahydrate)       3.7 g
  KH2PO4 (Potassium Phosphate, Monobasic)           1.85 g
  Bring to volume
  Autoclave

MS Micro 1000X - Stock #2 (per 1 liter)
  H3BO3 (Boric Acid)                                6.2 g
  MnSO4*H2O (Manganese Sulfate Monohydrate)         16.9 g
  ZnSO4*7H2O (Zinc Sulfate Heptahydrate)            8.6 g
  Na2MoO4*2H2O (Sodium Molybdate Dihydrate)         0.25 g
  CuSO4*5H2O (Copper Sulfate Pentahydrate)          0.025 g
  CoCl2*6H2O (Cobalt Chloride Hexahydrate)          0.025 g
  KI (Potassium Iodide)                             0.8300 g
  Bring to volume
  Autoclave
FeEDTA 100X - Stock #3 (per liter)
  Na2EDTA* (Sodium EDTA)                            3.73 g
  FeSO4*7H2O (Iron Sulfate Heptahydrate)            2.78 g
  *EDTA must be completely dissolved before adding iron.
Bring to Volume
[0330] Solution is photosensitive. Bottle(s) should be wrapped in foil to exclude light.
Autoclave
[0331] Ca 100X - Stock #4 (per liter)
  CaCl2*2H2O (Calcium Chloride Dihydrate)           44 g
  Bring to Volume
  Autoclave

B5 Vitamin 1000X - Stock #5 (per liter)
  Thiamine*HCl                                      10 g
  Nicotinic Acid                                    1 g
  Pyridoxine*HCl                                    1 g
  Myo-Inositol                                      100 g
  Bring to Volume
  Store frozen

4% Glutamine - Stock #6 (per liter)
  DDI water heated to 30° C.                        900 ml
  L-Glutamine                                       40 g
  Gradually add while stirring and applying low heat. Do not exceed 35° C.
  Bring to Volume
  Filter Sterilize
  Store frozen*
  *Note: Warm thawed stock in 31° C. bath to fully dissolve crystals.
Oil Analysis:
[0332] Oil content of somatic embryos is measured using NMR. Briefly, lyophilized soybean somatic embryo tissue is pulverized in a GenoGrinder vial as described previously (Example 2). 20-200 mg of tissue powder is transferred to NMR tubes. Oil content of the somatic embryo tissue powder is calculated from the NMR signal as described in Example 2.
Example 24
Compositional Analysis of Arabidopsis Events Transformed with DNA Constructs for Seed-Preferred Silencing of ORM Genes
[0333] The example describes seed composition of transgenic events generated with pKR1482-ORM (SEQ ID NO:24). It demonstrates that transformation with DNA constructs for silencing of genes encoding ORM proteins leads to increased oil content that is accompanied by a reduction in seed storage protein and soluble carbohydrate content.
T4 seed of event K42335 described in Table 13 of Example 5 and T3 seed of events K47021 and K47018 described in Table 15 of Example 5 were used to create three bulk seed samples. Three bulk seed samples of WT control plants grown alongside the T4 and T3 plants described in Tables 13 and 15 of Example 5 were also generated. Oil content of the six seed samples was measured by NMR as described in Example 2. The seed samples were subjected to compositional analysis of protein and soluble carbohydrate content of triplicate samples as described in Example 2. The results of this analysis are summarized in Table 24.
TABLE 24
Seed composition of Arabidopsis events transformed with DNA constructs for silencing of ORM genes
(fructose, glucose, sucrose, raffinose, stachyose and total soluble CHO in μg mg-1 seed)

Genotype       Event ID         Oil (%, NMR)   Protein %   fructose   glucose
pKR1482-ORM    K42335/K44650    44.3           16.7        0.2        3.3
WT                              42.1           18.0        0.3        4.3
Δ TG/WT %                       5.2            -7.2        -29.7      -23.2
pKR1482-ORM    K47021           44.9           16.7        0.3        3.5
WT                              42.5           17.9        0.2        4.0
Δ TG/WT %                       5.6            -6.7        16.1       -12.5
pKR1482-ORM    K47018           44.8           15.7        0.2        2.7
WT                              42.6           17.7        0.3        4.3
Δ TG/WT %                       5.2            -11.1       -16.6      -37.0

Genotype       Event ID         sucrose   raffinose   stachyose   total soluble CHO
pKR1482-ORM    K42335/K44650    11.8      0.1         0.6         16.6
WT                              15.9      0.3         0.2         21.3
Δ TG/WT %                       -25.9     -57.2       167.9       -21.9
pKR1482-ORM    K47021           14.6      0.3         0.3         19.2
WT                              15.9      0.4         0.8         21.6
Δ TG/WT %                       -8.3      -22.1       -65.8       -10.9
pKR1482-ORM    K47018           15.2      0.3         0.8         19.5
WT                              16.1      0.4         1.2         22.5
Δ TG/WT %                       -5.7      -13.4       -32.2       -13.1
Table 24 demonstrates that the oil increase associated with the presence of the pKR1482-ORM transgene (SEQ ID NO:24) is accompanied by a reduction in seed protein content and a reduction in soluble carbohydrate content. The latter was calculated by summing the content of pinitol, sorbitol, fructose, glucose, myo-inositol, sucrose, raffinose and stachyose.
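The Δ TG/WT % rows of Table 24 express the transgenic value relative to the WT value. A minimal sketch of that calculation is shown below; the function name is illustrative, and the inputs are the rounded oil values from Table 24. For small values such as the individual sugars, recomputing from the rounded table entries will not reproduce the tabulated percentages exactly, presumably because those were calculated from unrounded measurements.

```python
def percent_change(transgenic: float, wild_type: float) -> float:
    """Relative difference of the transgenic value versus WT, in percent."""
    return (transgenic - wild_type) / wild_type * 100.0


# Oil content (%, NMR) for the three events in Table 24: (transgenic, WT).
OIL = {
    "K42335/K44650": (44.3, 42.1),
    "K47021": (44.9, 42.5),
    "K47018": (44.8, 42.6),
}

oil_delta = {event: round(percent_change(tg, wt), 1) for event, (tg, wt) in OIL.items()}
print(oil_delta)
```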
Example 25
Compositional Analysis of Arabidopsis Events Transformed with DNA Constructs for Seed-Preferred Over-Expression of ORM Genes
[0334] The example describes seed composition of transgenic events generated with pKR1478-ORM (SEQ ID NO:14). It demonstrates that transformation with DNA constructs for seed-preferred overexpression of ORM genes leads to decreased oil content that is accompanied by increased seed storage protein and a small decrease in soluble carbohydrate content.
[0335] T4 seed of event K42334 described in Table 10 of Example 4 were used to create two bulk seed samples. Bulk seed samples of WT control plants grown alongside the T3 plants described in Table 10 of Example 4 were also generated. Oil content of the four seed samples was measured by NMR as described in Example 2. The seed samples were subjected to compositional analysis of protein and soluble carbohydrate content of triplicate samples as described in Example 2. The results of this analysis are summarized in Table 25.
TABLE 25
Seed composition of Arabidopsis events transformed with DNA constructs for seed-preferred overexpression of ORM genes
(fructose, glucose, sucrose, raffinose, stachyose and total soluble CHO in μg mg-1 seed)

Genotype       Event ID         Oil (%, NMR)   Protein %   fructose   glucose
pKR1478-ORM    K42334/K44548    39.5           19.3        0.2        4.9
WT                              42.3           17.2        0.3        3.4
Δ TG/WT %                       -6.6           12.5        -11.9      41.1
pKR1478-ORM    K42334/K44541    37.0           19.8        0.3        6.2
WT                              42.2           17.8        0.3        3.7
Δ TG/WT %                       -12.3          11.1        11.5       65.9

Genotype       Event ID         sucrose   raffinose   stachyose   total soluble CHO
pKR1478-ORM    K42334/K44548    12.8      0.4         1.6         20.1
WT                              16.4      0.4         1.6         22.4
Δ TG/WT %                       -22.3     -5.1        0.0         -10.2
pKR1478-ORM    K42334/K44541    13.1      0.4         2.1         22.6
WT                              16.6      0.4         1.8         23.2
Δ TG/WT %                       -21.2     0.5         17.4        -2.6
Table 25 shows that the oil reduction associated with seed-specific over-expression of ORM genes such as At5g17280 is accompanied by an increase in seed storage protein and a small decrease in soluble carbohydrate content of the seed.
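The Δ TG/WT % rows of Table 25 can be read as a relative change of the transgenic (TG) value against the WT value. The following is a minimal sketch under that assumption; the formula (TG - WT)/WT x 100 is inferred from the table rather than stated in the text, and the function name `delta_percent` is hypothetical. Because Table 25 reports rounded values, figures recomputed for some columns may differ slightly from those printed; the oil column is used here because it reproduces cleanly.

```python
def delta_percent(tg: float, wt: float) -> float:
    """Relative change of a transgenic (TG) value versus wild type (WT), in percent.

    Assumed formula (not stated in the source): (TG - WT) / WT * 100.
    """
    if wt == 0:
        raise ValueError("WT value must be non-zero")
    return (tg - wt) / wt * 100.0


# Oil content (%, NMR) taken from Table 25
oil_sample_1 = delta_percent(39.5, 42.3)  # K42334/K44548 vs. WT
oil_sample_2 = delta_percent(37.0, 42.2)  # K42334/K44541 vs. WT

print(f"{oil_sample_1:.1f}", f"{oil_sample_2:.1f}")
```

Under this reading, a negative value indicates that the transgenic seed sample contains less of the component than the WT sample grown alongside it.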
Example 25
Characterization of Arabidopsis Events Transformed with a DNA Construct that Contains an Intron-Less Inverted Repeat Construct Derived from Sequences of the At5g17280 (ORM) Gene
[0336] A plasmid vector lo127 for generation of transgenic arabidopsis events that show seed specific down-regulation of the ORM gene corresponding to At5g17280 was constructed.
[0337] Briefly, plasmid DNA isolated from a pooled Arabidopsis cDNA library was used in two PCR reactions with either primers SA311 (SEQ ID NO:71) and SA312 (SEQ ID NO:72) or SA313 (SEQ ID NO:73) and SA314 (SEQ ID NO:74). A PCR product of 0.208 kb was generated with SA311 (SEQ ID NO:71) and SA312 (SEQ ID NO:72). It was gel purified and is henceforth known as product C. A PCR product of 0.183 kb was generated with SA313 (SEQ ID NO:73) and SA314 (SEQ ID NO:74). It was gel purified and is henceforth known as product D. In a similar fashion a PCR product of 0.208 kb was generated with SA316 (SEQ ID NO:75) and SA315 (SEQ ID NO:76). It was gel purified and is henceforth known as product E. PCR products C, D and E were cloned into pGEM T easy using the instructions of the manufacturer, which generated plasmids pGEM T easy C (SEQ ID NO:77), pGEM T easy D (SEQ ID NO:78) and pGEM T easy E (SEQ ID NO:79). A restriction fragment of 215 bp was excised from pGEM T easy C with NotI and BamHI and cloned into pBluescript SK+ (Stratagene, USA). The resulting plasmid pBluescript-C (SEQ ID NO:80) was linearized with BamHI and PstI and ligated to a 193 bp fragment excised from pGEM T easy D with BamHI and PstI. The resulting plasmid pBluescript-CD (SEQ ID NO:81) was linearized with PstI and EcoRI and ligated to a 218 bp fragment excised from pGEM T easy E with PstI and EcoRI to give pBluescript-CDE (SEQ ID NO:82). A fragment of 619 bp was excised from pBluescript-CDE with NotI and ligated to NotI-linearized KS442 (SEQ ID NO:83) to give KS442-CDE (SEQ ID NO:84).
[0338] Prior to this, KS442 was constructed as follows. KS121 (PCT Application No. WO 02/00904) was digested with BamHI and XmnI and ligated to a fragment comprised of the soybean GYI promoter. The GYI promoter was obtained from KS349 (US 20080295204 A1, published Nov. 27, 2008). Briefly, KS349 was digested with NcoI, and overhangs were filled in with Klenow DNA polymerase (NEB, USA) according to the manufacturer's instructions. The linearized KS349 plasmid was digested with BamHI, thus releasing the GYI promoter used for construction of KS442.
[0339] KS442-CDE was digested with AscI and a DNA fragment of 1.558 kb was ligated to AscI-linearized pKR92 (SEQ ID NO:8) to give lo127 (SEQ ID NO:85).
[0340] Plasmid DNA of lo127 was used for Agrobacterium-mediated transformation of Arabidopsis as described in Example 4. A total of 54 events were generated with lo127. T1 plants of these events were grown to maturity alongside WT control plants. Seed were harvested and oil content was measured by NMR as described in Example 2. The results of this analysis are summarized in Table 26.
TABLE-US-00037 TABLE 26
Seed oil content of T1 plants generated with binary vector lo127 for seed-specific silencing of At5g17280

construct/genotype  event ID  % oil  oil content % of WT avg  avg oil content % of WT
ARALO 127           K61385    42.0   116.5
ARALO 127           K61388    41.0   113.7
ARALO 127           K61386    40.6   112.6
ARALO 127           K61389    40.2   111.5
ARALO 127           K61377    40.1   111.2
ARALO 127           K61375    40.0   110.9
ARALO 127           K61379    39.6   109.8
ARALO 127           K61378    39.5   109.5
ARALO 127           K61383    39.3   109.0
ARALO 127           K61367    39.0   108.2
ARALO 127           K61371    38.9   107.9
ARALO 127           K61372    38.8   107.6
ARALO 127           K61394    38.5   106.8
ARALO 127           K61382    38.4   106.5
ARALO 127           K61393    38.2   105.9
ARALO 127           K61391    38.2   105.9
ARALO 127           K61387    38.1   105.7
ARALO 127           K61373    37.9   105.1
ARALO 127           K61381    37.4   103.7
ARALO 127           K61368    37.2   103.2
ARALO 127           K61374    37.2   103.2
ARALO 127           K61392    37.2   103.2
ARALO 127           K61380    37.1   102.9
ARALO 127           K61370    36.6   101.5
ARALO 127           K61384    36.5   101.2
ARALO 127           K61369    35.3   97.9
ARALO 127           K61376    34.8   96.5
ARALO 127           K61390    34.8   96.5                     106.2
col                           37.2
col                           36.9
col                           36.8
col                           35.5
col                           33.9
WT avg                        36.06

ARALO 127           K61403    41.0   118.2
ARALO 127           K61406    39.7   114.4
ARALO 127           K61425    39.4   113.5
ARALO 127           K61405    39.2   113.0
ARALO 127           K61401    39.2   113.0
ARALO 127           K61408    39.1   112.7
ARALO 127           K61416    38.9   112.1
ARALO 127           K61415    38.9   112.1
ARALO 127           K61404    38.5   111.0
ARALO 127           K61420    38.4   110.7
ARALO 127           K61414    38.2   110.1
ARALO 127           K61407    37.8   108.9
ARALO 127           K61402    37.8   108.9
ARALO 127           K61400    37.7   108.6
ARALO 127           K61424    37.4   107.8
ARALO 127           K61421    37.3   107.5
ARALO 127           K61417    37.3   107.5
ARALO 127           K61419    37.2   107.2
ARALO 127           K61411    37.2   107.2
ARALO 127           K61426    36.5   105.2
ARALO 127           K61409    36.3   104.6
ARALO 127           K61413    35.8   103.2
ARALO 127           K61418    35.7   102.9
ARALO 127           K61422    35.5   102.3
ARALO 127           K61410    35.4   102.0
ARALO 127           K61412    35.3   101.7                    108.7
col                           36.7
col                           36.5
col                           34.2
col                           31.4
WT avg                        34.7
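The "oil content % of WT avg" column of Table 26 appears to be each event's oil content divided by the mean oil content of the WT (col) plants grown in the same experiment. A minimal sketch of that calculation follows; the function name `percent_of_wt` is hypothetical, and the grouping of WT plants into two experiments is an assumption read off the layout of the table.

```python
from statistics import mean
from typing import Sequence


def percent_of_wt(event_oil: float, wt_oils: Sequence[float]) -> float:
    """Oil content of one event expressed as a percentage of the WT average."""
    return event_oil / mean(wt_oils) * 100.0


# % oil of the WT (col) control plants listed in Table 26
wt_first_experiment = [37.2, 36.9, 36.8, 35.5, 33.9]
wt_second_experiment = [36.7, 36.5, 34.2, 31.4]

# Top event of each experiment: K61385 (42.0 % oil) and K61403 (41.0 % oil)
k61385 = percent_of_wt(42.0, wt_first_experiment)
k61403 = percent_of_wt(41.0, wt_second_experiment)

print(f"WT avg: {mean(wt_first_experiment):.2f}, K61385: {k61385:.1f} % of WT")
print(f"WT avg: {mean(wt_second_experiment):.1f}, K61403: {k61403:.1f} % of WT")
```

Normalizing against WT plants grown alongside the transgenic plants, rather than against a fixed reference value, compensates for differences in growth conditions between experiments.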
[0341] T2 seed of events K61385, K61388, K61386 and K61403 were germinated on selective plant growth media containing kanamycin, planted in soil alongside WT plants and grown to maturity. T3 seed oil content was measured by NMR. The results of this analysis are summarized in Table 27.
TABLE-US-00038 TABLE 27
Seed oil content of T2 plants generated with binary vector lo127 for seed-preferred silencing of At5g17280

event ID/genotype  Line ID  % oil  oil content % of WT avg  avg oil content % of WT
K61385             K62439   42.7   109.5
                   K62454   42.3   108.5
                   K62447   41.9   107.4
                   K63000   41.9   107.4
                   K63001   41.9   107.4
                   K62441   41.8   107.2
                   K62453   41.4   106.2
                   K62444   41.1   105.4
                   K62440   40.9   104.9
                   K62452   40.7   104.4
                   K62450   40.5   103.8
                   K62442   40.5   103.8
                   K62445   40.5   103.8
                   K62456   39.7   101.8
                   K62443   39.7   101.8
                   K62448   38.5   98.7
                   K62446   38.0   97.4
                   K62455   37.8   96.9
                   K62451   37.5   96.2
                   K62449   37.2   95.4                     103.4
col                         42.5
col                         41.5
col                         40.8
col                         40.0
col                         39.9
col                         39.8
col                         39.0
col                         37.6
col                         36.3
col                         36.0
col                         35.6
WT avg                      39

K61388             K62406   42.6   107.4
                   K62414   42.5   107.2
                   K62410   42.4   106.9
                   K62411   42.2   106.4
                   K62419   42.2   106.4
                   K62413   42.0   105.9
                   K62415   41.7   105.1
                   K62408   41.3   104.1
                   K62412   41.3   104.1
                   K62422   41.2   103.9
                   K62424   41.1   103.6
                   K62404   41.1   103.6
                   K62425   41.1   103.6
                   K62417   40.9   103.1
                   K62409   40.8   102.9
                   K62423   40.7   102.6
                   K62421   40.5   102.1
                   K62416   40.0   100.8
                   K62426   39.9   100.6
                   K62418   39.8   100.3
                   K62427   38.3   96.6
                   K62407   38.0   95.8
                   K62420   37.3   94.0
                   K62405   36.4   91.8                     102.5
col                         41.2
col                         41.2
col                         41.0
col                         40.9
col                         40.6
col                         39.4
col                         38.9
col                         38.7
col                         38.7
col                         38.5
col                         37.2
WT avg                      39.7

K61386             K63580   45.2   110.9
                   K63587   45.1   110.6
                   K63577   44.8   109.9
                   K63575   44.8   109.9
                   K63589   44.3   108.6
                   K63585   43.7   107.2
                   K63578   43.2   105.9
                   K62744   43.2   105.9
                   K63583   43.2   105.9
                   K63576   43.1   105.7
                   K63592   43.1   105.7
                   K63579   43.0   105.5
                   K63593   42.9   105.2
                   K63591   42.7   104.7
                   K63584   41.6   102.0
                   K63586   41.6   102.0
                   K63574   41.5   101.8
                   K63590   41.2   101.0
                   K63581   40.7   99.8
                   K63582   40.1   98.3
                   K63588   39.4   96.6
                   K63595   37.4   91.7
                   K63596   37.3   91.5
                   K63594   36.9   90.5                     103.2
col                K63601   44.6
col                K63600   43.0
col                K63598   42.4
col                K63599   41.1
col                K63604   41.1
col                K63606   41.0
col                K63605   40.9
col                K63608   40.3
col                K63597   39.9
col                K63607   39.4
col                K63602   38.9
col                K63603   36.7
WT avg                      40.8

K61403             K62316   43.1   111.5
                   K62308   43.0   111.3
                   K62321   43.0   111.3
                   K62315   42.1   109.0
                   K62306   41.8   108.2
                   K62318   41.4   107.1
                   K62312   41.4   107.1
                   K62324   41.3   106.9
                   K62305   41.0   106.1
                   K62323   40.7   105.3
                   K62313   40.3   104.3
                   K62310   40.0   103.5
                   K62314   39.6   102.5
                   K62307   39.6   102.5
                   K62322   38.8   100.4
                   K62317   37.4   96.8
                   K62309   37.1   96.0
                   K62320   37.0   95.8
                   K62319   36.7   95.0
                   K62311   28.7   74.3                     102.7
col                         41.6
col                         40.7
col                         40.4
col                         40.0
col                         38.6
col                         38.3
col                         35.8
col                         33.7
WT avg                      38.6
Tables 26 and 27 show that silencing of ORM genes such as At5g17280 using hairpin constructs that contain an intron-less inverted repeat leads to a heritable oil increase. In T3 lines that still segregate for the lo127 derived T-DNA insertion the average oil content was 2.5-3.4% higher than that of WT control plants.
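The 2.5-3.4% range quoted above can be traced to the per-event averages in the last column of Table 27. The sketch below recomputes the average for one event from its individual lines and then derives the range from the four printed averages. Variable and function names are hypothetical, and the interpretation of the last column as the mean of the "% of WT avg" values is an assumption inferred from the table.

```python
from statistics import mean

# "oil content % of WT avg" for the 20 lines of event K61385 (Table 27)
k61385_pct_of_wt = [
    109.5, 108.5, 107.4, 107.4, 107.4, 107.2, 106.2, 105.4, 104.9, 104.4,
    103.8, 103.8, 103.8, 101.8, 101.8, 98.7, 97.4, 96.9, 96.2, 95.4,
]


def mean_increase_over_wt(pct_of_wt):
    """Average oil increase relative to WT, in percent of the WT average."""
    return mean(pct_of_wt) - 100.0


# Per-event averages as printed in the last column of Table 27
table_averages = {"K61385": 103.4, "K61388": 102.5, "K61386": 103.2, "K61403": 102.7}
increases = {event: round(avg - 100.0, 1) for event, avg in table_averages.items()}

print("K61385 recomputed increase:", mean_increase_over_wt(k61385_pct_of_wt))
print("range over events:", min(increases.values()), "to", max(increases.values()))
```

Because the lines still segregate for the T-DNA insertion, each per-event average includes lines without the transgene, so the figures describe the population mean rather than the effect in homozygous lines.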
Example 25
Seed-Preferred Silencing of ORM Genes in Soybean Using Artificial miRNAs
[0342] The example describes the construction of a plasmid vector for soybean transformation. The plasmid provides seed-preferred expression of two artificial microRNAs that target soybean ORM genes Glyma02g05870 and Glyma16g24560, respectively.
[0343] Vectors were made to silence ORM genes using an artificial microRNA largely as described in U.S. patent application Ser. No. 12/335,717, filed Dec. 16, 2008. The following briefly explains the procedure.
Design of Artificial MicroRNA Sequences
[0344] Artificial microRNAs (amiRNAs) that would have the ability to silence the desired target genes were designed largely according to rules described in Schwab R, et al. (2005) Dev Cell 8: 517-27. To summarize, microRNA sequences are 21 nucleotides in length, start at their 5'-end with a "U", display 5' instability relative to their star sequence which is achieved by including a C or G at position 19, and their 10th nucleotide is either an "A" or a "U". An additional requirement for artificial microRNA design was that the amiRNA have a high free delta-G as calculated using the ZipFold algorithm (Markham, N. R. & Zuker, M. (2005) Nucleic Acids Res. 33: W577-W581.) The DNA sequence corresponding to the amiRNA (OX16) that was used to silence Glyma16g24560 is set forth in SEQ ID NO:86. The DNA sequence corresponding to the amiRNA (OX2) that was used to silence the Glyma02g05870 gene is set forth in SEQ ID NO:87.
Design of Artificial Star Sequences
[0345] "Star sequences" are those that base pair with the amiRNA sequences, in the precursor RNA, to form imperfect stem structures. To form a perfect stem structure the star sequence would be the exact reverse complement of the amiRNA. The soybean precursor sequence as described in "Novel and nodulation-regulated microRNAs in soybean roots" Subramanian S, Fu Y, Sunkar R, Barbazuk W B, Zhu J K, Yu O BMC Genomics. 9:160 (2008) and accessed on mirBase ("Conservation and divergence of microRNA families in plants" Dezulian T, Palatnik J F, Huson D H, Weigel D (2005) Genome Biology 6:P13) was folded using mfold (M. Zuker (2003) Nucleic Acids Res. 31: 3406-15; and D. H. Mathews, J. et al. (1999) J. Mol. Biol. 288: 911-940). The miRNA sequence was then replaced with the amiRNA sequence and the endogenous star sequence was replaced with the exact reverse complement of the amiRNA. Changes in the artificial star sequence were introduced so that the structure of the stem would remain the same as the endogenous structure. The altered sequence was then folded with mfold and the original and altered structures were compared by eye. If necessary, further alterations to the artificial star sequence were introduced to maintain the original structure. The first amiRNA star sequence (OX16 star) that was used to silence Glyma16g24560 is set forth as SEQ ID NO:88. The second amiRNA star sequence (OX2 star) that was used to silence Glyma02g05870 is set forth as SEQ ID NO:89.
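The positional design rules summarized above lend themselves to a simple automated check. The following is a minimal sketch covering only the four sequence rules stated in the paragraph (length, 5' nucleotide, position 10 and position 19); it does not compute the free-energy criterion, which the text attributes to the ZipFold algorithm. The function name `check_amirna` and the example sequence are hypothetical; the example is not SEQ ID NO:86 or SEQ ID NO:87.

```python
def check_amirna(seq: str) -> dict:
    """Check a candidate amiRNA against the positional rules stated in the text.

    Accepts RNA or DNA notation (T is read as U). Positions are 1-based
    from the 5' end, as in the text.
    """
    rna = seq.upper().replace("T", "U")
    return {
        "length_21": len(rna) == 21,
        "starts_with_U": rna[:1] == "U",
        "pos10_A_or_U": len(rna) >= 10 and rna[9] in "AU",
        "pos19_C_or_G": len(rna) >= 19 and rna[18] in "CG",
    }


# Hypothetical 21-nt candidate, for illustration only
candidate = "UGAACCUUGAGCAUGGAACGA"
result = check_amirna(candidate)
print(result, "-> passes all rules:", all(result.values()))
```

A candidate that passes these checks would still need to be evaluated for target specificity and folding energy before use.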
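The starting point for the star sequence described above, the exact reverse complement of the amiRNA, can be sketched as follows. The helper `mismatch_positions` then reports where an edited star sequence departs from the perfect complement, which is one way to keep track of the changes introduced to mimic the endogenous stem. Both function names and the example sequence are hypothetical, and the comparison assumes equal-length sequences, so it cannot represent bulges or G:U wobble pairs; structure prediction with a folding program, as the text describes, remains necessary.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def perfect_star(amirna: str) -> str:
    """Exact reverse complement of an amiRNA (RNA notation, 5' to 3')."""
    rna = amirna.upper().replace("T", "U")
    return rna.translate(_COMPLEMENT)[::-1]


def mismatch_positions(amirna: str, star: str) -> list:
    """1-based positions (along the star, 5' to 3') that differ from the perfect star."""
    perfect = perfect_star(amirna)
    candidate = star.upper().replace("T", "U")
    if len(candidate) != len(perfect):
        raise ValueError("star and amiRNA must have the same length")
    return [i + 1 for i, (p, c) in enumerate(zip(perfect, candidate)) if p != c]


# Hypothetical amiRNA, for illustration only (not SEQ ID NO:86 or 87)
amirna = "UGAACCUUGAGCAUGGAACGA"
star = perfect_star(amirna)
edited_star = star[:2] + "A" + star[3:]  # introduce one mismatch at position 3

print(star)
print(mismatch_positions(amirna, edited_star))
```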
Conversion of Genomic MicroRNA Precursors to Artificial MicroRNA Precursors
[0346] Genomic miRNA precursor genes as described in US Patent Publication No. 2009-0155910A1, published Jun. 18, 2009 can be converted to amiRNAs using overlapping PCR and the resulting DNAs are completely sequenced. These DNAs are then cloned downstream of an appropriate promoter in a vector capable of soybean transformation.
[0347] Alternatively, amiRNAs can be synthesized commercially, for example by Codon Devices (Cambridge, Mass.), DNA 2.0 (Menlo Park, Calif.) or GenScript (Piscataway, N.J.). The synthesized DNA is then cloned downstream of an appropriate promoter in a vector capable of soybean transformation.
[0348] Alternatively, amiRNAs can be constructed using In-Fusion® technology (Clontech, Mountain View, Calif.).
Conversion of Genomic MicroRNA Precursors to Artificial MicroRNA Precursors
[0349] Genomic miRNA precursor genes were converted to amiRNA precursors using In-Fusion® as described above. In brief, the microRNA 396b precursor (SEQ ID NO: 90) was altered to include Pme I sites immediately flanking the star and microRNA sequences to form the in-fusion ready microRNA 396b precursorv3 (SEQ ID NO: 91).
[0350] The microRNA 396b precursor (SEQ ID NO:90) was used as a PCR template with the primers shown in SEQ ID NO:92 and SEQ ID NO:93. The primers are designed according to the protocol provided by Clontech (USA) and do not leave any footprint of the Pme I sites after the In-Fusion recombination reaction. The amplified sequence is recombined into the in-fusion ready microRNA 396b (SEQ ID NO:91) cloned into pCR2.1 and digested with Pme I. This was done using protocols provided with the In-Fusion® kit. The resulting plasmid 396b-OX16 is shown in SEQ ID NO:94.
[0351] To construct 159-OX2, the microRNA 159 precursor (SEQ ID NO: 95) was altered to include Pme I sites immediately flanking the star and microRNA sequences to form the in-fusion ready microRNA 159 precursor (SEQ ID NO: 96).
[0352] The microRNA 159 precursor (SEQ ID NO: 95) was used as a PCR template with the primers shown in SEQ ID NO:97 and SEQ ID NO:98. The primers are designed according to the protocol provided by Clontech and do not leave any footprint of the Pme I sites after the In-Fusion recombination reaction. The amplified sequence is recombined into the in-fusion ready microRNA 159 (SEQ ID NO:96) cloned into pCR2.1 and digested with Pme I. This was done using protocols provided with the In-Fusion® kit. The resulting plasmid 159-OX2 is shown in Table 3 (SEQ ID NO: 99).
The 611 bp Not I-Eco RI fragment was removed from 396b-OX16 (SEQ ID NO:94) and a 965 bp Eco RI-Not I fragment was removed from 159-OX2 (SEQ ID NO:99), and both were cloned into the Not I site of KS126 (PCT Publication No. WO 04/071467) to form KS434 (SEQ ID NO:100).
Example 26
Compositional Analysis of Soybean Somatic Embryos Transformed with Constructs for RNAi- or amiRNA-Mediated Suppression of ORM Gene Expression
[0353] DNA of plasmids KS120, KS433 and KS434 were stably transformed into soybean suspension cultures and transgenic somatic embryos were generated as described in Example 23. Oil content was analyzed by NMR as described in Example 2.
TABLE-US-00039 TABLE 30
Oil content of somatic embryos generated with plasmids KS120, KS433 and KS434

experiment name  plasmid  event id  % oil  average % oil
2698             KS120    K57206    6.6
                          K57198    6.2
                          K57195    5.0
                          K57207    5.0
                          K57201    5.0
                          K57211    4.9
                          K57187    4.8
                          K57204    4.6
                          K57189    4.3
                          K57212    4.3
                          K57194    4.2
                          K57188    4.0
                          K57193    3.9
                          K57190    3.9
                          K57200    3.8
                          K57202    3.8
                          K57191    3.7
                          K57210    3.6
                          K57205    3.5
                          K57209    3.5
                          K57208    3.4
                          K57199    3.1
                          K57197    3.1
                          K57192    3.0
                          K57203    2.6
                          K57196    2.4    4.1

2699             KS433    K57232    10.0
                          K57238    9.9
                          K57236    9.8
                          K57224    9.4
                          K57215    8.2
                          K57220    8.2
                          K57225    8.1
                          K57222    8.1
                          K57237    7.5
                          K57221    7.2
                          K57233    7.0
                          K57229    6.9
                          K57234    6.5
                          K57217    6.3
                          K57213    6.1
                          K57230    5.9
                          K57214    5.8
                          K57227    5.3
                          K57226    5.3
                          K57231    5.2
                          K57223    4.9
                          K57219    4.5
                          K57235    4.1
                          K57228    3.9
                          K57218    2.8
                          K57216    1.9    6.5

2700             KS434    K57239    7.6
                          K57247    7.1
                          K57261    6.5
                          K57242    6.3
                          K57243    6.0
                          K57252    5.8
                          K57256    5.7
                          K57260    5.6
                          K57264    5.5
                          K57251    5.2
                          K57255    5.2
                          K57263    5.2
                          K57245    4.7
                          K57249    4.7
                          K57265    4.7
                          K57266    4.6
                          K57246    4.6
                          K57250    4.5
                          K57240    4.4
                          K57257    4.3
                          K57248    4.1
                          K57269    3.6
                          K57259    3.4
                          K57267    3.2
                          K57254    3.1
                          K57268    2.9
                          K57262    2.9
                          K57253    2.9
                          K57258    2.6
                          K57244    2.6
                          K57241    2.5    4.6
Table 30 shows that silencing of the soybean ORM genes Glyma02g05870 and Glyma16g24560 (KS433) using RNAi- or amiRNA-mediated suppression led to an increase in oil compared to the control.
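The averages in Table 30 can be recomputed from the per-event values, as in the following minimal sketch. Treating KS120 as the control is an assumption made here for illustration; the text refers to "the control" without naming the plasmid. The derived relative differences are calculated from the table and are not figures stated in the source.

```python
from statistics import mean

# % oil per event, transcribed from Table 30
oil = {
    "KS120": [6.6, 6.2, 5.0, 5.0, 5.0, 4.9, 4.8, 4.6, 4.3, 4.3, 4.2, 4.0, 3.9,
              3.9, 3.8, 3.8, 3.7, 3.6, 3.5, 3.5, 3.4, 3.1, 3.1, 3.0, 2.6, 2.4],
    "KS433": [10.0, 9.9, 9.8, 9.4, 8.2, 8.2, 8.1, 8.1, 7.5, 7.2, 7.0, 6.9, 6.5,
              6.3, 6.1, 5.9, 5.8, 5.3, 5.3, 5.2, 4.9, 4.5, 4.1, 3.9, 2.8, 1.9],
    "KS434": [7.6, 7.1, 6.5, 6.3, 6.0, 5.8, 5.7, 5.6, 5.5, 5.2, 5.2, 5.2, 4.7,
              4.7, 4.7, 4.6, 4.6, 4.5, 4.4, 4.3, 4.1, 3.6, 3.4, 3.2, 3.1, 2.9,
              2.9, 2.9, 2.6, 2.6, 2.5],
}

averages = {plasmid: mean(values) for plasmid, values in oil.items()}

CONTROL = "KS120"  # assumption: KS120 serves as the control construct
for plasmid, avg in averages.items():
    relative = (avg - averages[CONTROL]) / averages[CONTROL] * 100.0
    print(f"{plasmid}: n={len(oil[plasmid])}, mean % oil={avg:.1f}, "
          f"change vs {CONTROL}={relative:+.0f}%")
```

Somatic embryo oil content varies widely between events within each experiment, so a comparison of means such as this one is best read alongside the full distribution of values in the table.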
Sequence CWU
1
121 1 18491 DNA Artificial Sequence pHSbarEND2s activation tagging vector
1catgaatcaa acaaacatac acagcgactt attcacacga gctcaaatta caacggtata
60tatcctgccg tcgacaacca tggtctagac aggatccccg ggtaccgagc tcgaatttgc
120aggtcgactg cgtcatccct tacgtcagtg gagatatcac atcaatccac ttgctttgaa
180gacgtggttg gaacgtcttc tttttccacg atgctcctcg tgggtggggg tccatctttg
240ggaccactgt cggcagaggc atcttgaacg atagcctttc ctttatcgca atgatggcat
300ttgtaggtgc caccttcctt ttctactgtc cttttgatga agtgacagat agctgggcaa
360tggaatccga ggaggtttcc cgatattacc ctttgttgaa aagtctcaat tgccctttgg
420tcttctgaga ctgttgcgtc atcccttacg tcagtggaga tatcacatca atccacttgc
480tttgaagacg tggttggaac gtcttctttt tccacgatgc tcctcgtggg tgggggtcca
540tctttgggac cactgtcggc agaggcatct tgaacgatag cctttccttt atcgcaatga
600tggcatttgt aggtgccacc ttccttttct actgtccttt tgatgaagtg acagatagct
660gggcaatgga atccgaggag gtttcccgat attacccttt gttgaaaagt ctcagttaac
720ccgcgatcct gcgtcatccc ttacgtcagt ggagatatca catcaatcca cttgctttga
780agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt
840gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca
900tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca
960atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa ttgccctttg
1020gtcttctgag actgttgcgt catcccttac gtcagtggag atatcacatc aatccacttg
1080ctttgaagac gtggttggaa cgtcttcttt ttccacgatg ctcctcgtgg gtgggggtcc
1140atctttggga ccactgtcgg cagaggcatc ttgaacgata gcctttcctt tatcgcaatg
1200atggcatttg taggtgccac cttccttttc tactgtcctt ttgatgaagt gacagatagc
1260tgggcaatgg aatccgagga ggtttcccga tattaccctt tgttgaaaag tctcagttaa
1320cccgcaattc actggccgtc gttttacaac gtcgtgactg ggaaaaccct ggcgttaccc
1380aacttaatcg ccttgcagca catccccctt tcgccagctg gcgtaatagc gaagaggccc
1440gcaccgatcg cccttcccaa cagttgcgca gcctgaatgg cgaatggatc gatccgtcga
1500tcgaccaaag cggccatcgt gcctccccac tcctgcagtt cgggggcatg gatgcgcgga
1560tagccgctgc tggtttcctg gatgccgacg gatttgcact gccggtagaa ctccgcgagg
1620tcgtccagcc tcaggcagca gctgaaccaa ctcgcgaggg gatcgagccc ctgctgagcc
1680tcgacatgtt gtcgcaaaat tcgccctgga cccgcccaac gatttgtcgt cactgtcaag
1740gtttgacctg cacttcattt ggggcccaca tacaccaaaa aaatgctgca taattctcgg
1800ggcagcaagt cggttacccg gccgccgtgc tggaccgggt tgaatggtgc ccgtaacttt
1860cggtagagcg gacggccaat actcaacttc aaggaatctc acccatgcgc gccggcgggg
1920aaccggagtt cccttcagtg aacgttatta gttcgccgct cggtgtgtcg tagatactag
1980cccctggggc cttttgaaat ttgaataaga tttatgtaat cagtctttta ggtttgaccg
2040gttctgccgc tttttttaaa attggatttg taataataaa acgcaattgt ttgttattgt
2100ggcgctctat catagatgtc gctataaacc tattcagcac aatatattgt tttcatttta
2160atattgtaca tataagtagt agggtacaat cagtaaattg aacggagaat attattcata
2220aaaatacgat agtaacgggt gatatattca ttagaatgaa ccgaaaccgg cggtaaggat
2280ctgagctaca catgctcagg ttttttacaa cgtgcacaac agaattgaaa gcaaatatca
2340tgcgatcata ggcgtctcgc atatctcatt aaagcagggg gtgggcgaag aactccagca
2400tgagatcccc gcgctggagg atcatccagc cggcgtcccg gaaaacgatt ccgaagccca
2460acctttcata gaaggcggcg gtggaatcga aatctcgtga tggcaggttg ggcgtcgctt
2520ggtcggtcat ttcgaacccc agagtcccgc tcagaagaac tcgtcaagaa ggcgatagaa
2580ggcgatgcgc tgcgaatcgg gagcggcgat accgtaaagc acgaggaagc ggtcagccca
2640ttcgccgcca agctcttcag caatatcacg ggtagccaac gctatgtcct gatagcggtc
2700cgccacaccc agccggccac agtcgatgaa tccagaaaag cggccatttt ccaccatgat
2760attcggcaag caggcatcgc catgggtcac gacgagatcc tcgccgtcgg gcatgccccc
2820caattcactg gccgtcgttt tacaacgtcg tgactgggaa aaccctggcg ttacccaact
2880taatcgcctt gcagcacatc cccctttcgc cagctggcgt aatagcgaag aggcccgcac
2940cgatcgccct tcccaacagt tgcgcagcct gaatggcgaa tggcgcctga tgcggtattt
3000tctccttacg catctgtgcg gtatttcaca ccgcatatgg tgcactctca gtacaatctg
3060ctctgatgcc gcatagttaa gccagccccg acacccgcca acacccgctg acgcgccctg
3120acgggcttgt ctgctcccgg catccgctta cagacaagct gtgaccgtct ccgggagctg
3180catgtgtcag aggttttcac cgtcatcacc gaaacgcgcg agacgaaagg gcctcgtgat
3240acgcctattt ttataggtta atgtcatgat aataatggtt tcttagacgt caggtggcac
3300ttttcgggga aatgtgcgcg gaacccctat ttgtttattt ttctaaatac attcaaatat
3360gtatccgctc atgagacaat aaccctgata aatgcttcaa taatattgaa aaaggaagag
3420tatgagtatt caacatttcc gtgtcgccct tattcccttt tttgcggcat tttgccttcc
3480tgtttttgct cacccagaaa cgctggtgaa agtaaaagat gctgaagatc agttgggtgc
3540acgagtgggt tacatcgaac tggatctcaa cagcggtaag atccttgaga gttttcgccc
3600cgaagaacgt tttccaatga tgagcacttt taaagttctg ctatgtggcg cggtattatc
3660ccgtattgac gccgggcaag agcaactcgg tcgccgcata cactattctc agaatgactt
3720ggttgagtac tcaccagtca cagaaaagca tcttacggat ggcatgacag taagagaatt
3780atgcagtgct gccataacca tgagtgataa cactgcggcc aacttacttc tgacaacgat
3840cggaggaccg aaggagctaa ccgctttttt gcacaacatg ggggatcatg taactcgcct
3900tgatcgttgg gaaccggagc tgaatgaagc cataccaaac gacgagcgtg acaccacgat
3960gcctgtagca atggcaacaa cgttgcgcaa actattaact ggcgaactac ttactctagc
4020ttcccggcaa caattaatag actggatgga ggcggataaa gttgcaggac cacttctgcg
4080ctcggccctt ccggctggct ggtttattgc tgataaatct ggagccggtg agcgtgggtc
4140tcgcggtatc attgcagcac tggggccaga tggtaagccc tcccgtatcg tagttatcta
4200cacgacgggg agtcaggcaa ctatggatga acgaaataga cagatcgctg agataggtgc
4260ctcactgatt aagcattggt aactgtcaga ccaagtttac tcatatatac tttagattga
4320tttaaaactt catttttaat ttaaaaggat ctaggtgaag atcctttttg ataatctcat
4380gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg tcagaccccg tagaaaagat
4440caaaggatct tcttgagatc ctttttttct gcgcgtaatc tgctgcttgc aaacaaaaaa
4500accaccgcta ccagcggtgg tttgtttgcc ggatcaagag ctaccaactc tttttccgaa
4560ggtaactggc ttcagcagag cgcagatacc aaatactgtc cttctagtgt agccgtagtt
4620aggccaccac ttcaagaact ctgtagcacc gcctacatac ctcgctctgc taatcctgtt
4680accagtggct gctgccagtg gcgataagtc gtgtcttacc gggttggact caagacgata
4740gttaccggat aaggcgcagc ggtcgggctg aacggggggt tcgtgcacac agcccagctt
4800ggagcgaacg acctacaccg aactgagata cctacagcgt gagcattgag aaagcgccac
4860gcttcccgaa gggagaaagg cggacaggta tccggtaagc ggcagggtcg gaacaggaga
4920gcgcacgagg gagcttccag ggggaaacgc ctggtatctt tatagtcctg tcgggtttcg
4980ccacctctga cttgagcgtc gatttttgtg atgctcgtca ggggggcgga gcctatggaa
5040aaacgccagc aacgcggcct ttttacggtt cctggccttt tgctggcctt ttgctcacat
5100gttctttcct gcgttatccc ctgattctgt ggataaccgt attaccgcct ttgagtgagc
5160tgataccgct cgccgcagcc gaacgaccga gcgcagcgag tcagtgagcg aggaagcgga
5220agagcgccca atacgcaaac cgcctctccc cgcgcgttgg ccgattcatt aatgcagctg
5280gcacgacagg tttcccgact ggaaagcggg cagtgagcgc aacgcaatta atgtgagtta
5340gctcactcat taggcacccc aggctttaca ctttatgctt ccggctcgta tgttgtgtgg
5400aattgtgagc ggataacaat ttcacacagg aaacagctat gaccatgatt acgccaagct
5460ttctaggggg ggggtaccga tctgagatcg gtaacgaaaa cgaacgggta gggatgaaaa
5520cggtcggtaa cggtcggtaa aatacctcta ccgttttcat tttcatattt aacttgcggg
5580acggaaacga aaacgggata taccggtaac gaaaacgaac gggataaata cggtaatcga
5640aaaccgatac gatccggtcg ggttaaagtc gaaatcggac gggaaccggt atttttgttc
5700ggtaaaatca cacatgaaaa catatattca aaacttaaaa acaaatataa aaaattgtaa
5760acacaagtct taatgatcac tagtggcgcg cctaggagat ctcgagtagg gataacaggg
5820taatacatag ataaaatcca tataaatctg gagcacacat agtttaatgt agcacataag
5880tgataagtct tgggctcttg gctaacataa gaagccatat aagtctacta gcacacatga
5940cacaatataa agtttaaaac acatattcat aatcacttgc tcacatctgg atcacttagc
6000atgctacagc tagtgcaata ttagacactt tccaatattt ctcaaacttt tcactcattg
6060caacggccat tctcctaatg acaaattttt catgaacaca ccattggtca atcaaatcct
6120ttatctcaca gaaacctttg taaaataaat ttgcagtgga atattgagta ccagatagga
6180gttcagtgag atcaaaaaac ttcttcaaac acttaaaaag agttaatgcc atcttccact
6240cctcggcttt aggacaaatt gcatcgtacc tacaataatt gacatttgat taattgagaa
6300tttataatga tgacatgtac aacaattgag acaaacatac ctgcgaggat cacttgtttt
6360aagccgtgtt agtgcaggct tataatataa ggcatccctc aacatcaaat aggttgaatt
6420ccatctagtt gagacatcat atgagatccc tttagattta tccaagtcac attcactagc
6480acacttcatt agttcttccc actgcaaagg agaagatttt acagcaagaa caatcgcttt
6540gattttctca attgttcctg caattacagc caagccatcc tttgcaacca agttcagtat
6600gtgacaagca cacctcacat gaaagaaagc accatcacaa actagatttg aatcagtgtc
6660ctgcaaatcc tcaattatat cgtgcacagc tacttcattt gcactagcat tatccaaaga
6720caaggcaaac aattttttct caatgttcca cttaaccatg attgcagtga aggtttgtga
6780taacctttgg ccagtgtggc gcccttcaac atgaaaaaag ccaacaattc ttttttggag
6840acaccaatca tcatcaatcc aatggatggt gacacacatg tatgacttat tttgacaaga
6900tgtccacata tccatagttg tactgaagcg agactgaaca tcttttagtt ttccatacaa
6960cttttctttt tcttccaaat acaaatccat gatatatttt ctagcagtga cacgggactt
7020tattggaaag tgagggcgca gagacttaac aaactcaaca aagtactcat gttctacaat
7080attgaaagga tattcatgca tgattattgc caaatgaagc ttctttaggc taaccacttc
7140atcgtactta taaggctcaa tgagatttat gtctttgcca tgatcctttt cactttttag
7200acacaactga cctttaacta aactatgtga tgttctcaag tgatttcgaa atccgcttgt
7260tccatgatga ccctcagccc tatacttagc cttgcaatta ggaaagttgc aatgtcccca
7320tacctgaacg tatttctttc catcgacctc cacttcaatt tccttcttgg tgaaatgctg
7380ccatacatcc gatgtgcact tctttgccct cttctgtggt gcttcttctt cgggttcagg
7440ttgtggctgt ggttgtggtt ctggttgtgg ttgtggttgt ggttgtggtt catgaacaat
7500agccatatca tcttgactcg gatctgtagc tgtaccattt gcattactac tgcttacact
7560ctgaataaaa tgcctctcgg cctcagctgt tgatgatgat ggtgatgtgc ggccacatcc
7620atgcccacgc gcacgtgcac gtacattctg aatccgacta gaagaggctt cagcttttct
7680tttcaaccct gttataaaca gatttttcgt attattctac agtcaatatg atgcttccca
7740atctacaacc aattagtaat gctaatgcta ttgctactgt ttttctaata tataccttga
7800gcatatgcag agaatacgga atttgttttg cgagtagaag gcgctcttgt ggtagacatc
7860aacttggcca atcttatggc tgagcctgag ggaggattat ttccaaccgg aggcgtcatc
7920tgaggaatgg agtcgtagcc ggctagccga agtggagagc agagccctgg acagcaggtg
7980ttcagcaatc agcttggtgc tgtactgctg tgacttgtga gcacctggac ggctggacag
8040caatcagcag gtgttgcaga gcccctggac agcacacaaa tgacacaaca gcttggtgca
8100atggtgctga cgtgctgtac tgctaagtgc tgtgagcctg tgagcagccg tggagacagg
8160gagaccgcgg atggccggat gggcgagcgc cgagcagtgg aggtctggag gaccgctgac
8220cgcagatggc ggatggcgga tgggcggacc gcggatgggc gagcagtgga gtggaggtct
8280gggcggatgg gcggaccgcg gcgcggatgg gcgagtcgcg agcagtggag tggagggcgg
8340accgtggatg gcggcgtctg cgtccggcgt gccgcgtcac ggccgtcacc gcgtgtggtg
8400cctggtgcag cccagcggcc ggccggctgg gagacaggga gagtcggaga gagcaggcga
8460gagcgagacg cgtcgccggc gtcggcgtgc ggctggcggc gtccggactc cggcgtgggc
8520gcgtggcggc gtgtgaatgt gtgatgctgt tactcgtgtg gtgcctggcc gcctgggaga
8580gaggcagagc agcgttcgct aggtatttct tacatgggct gggcctcagt ggttatggat
8640gggagttgga gctggccata ttgcagtcat cccgaattag aaaatacggt aacgaaacgg
8700gatcatcccg attaaaaacg ggatcccggt gaaacggtcg ggaaactagc tctaccgttt
8760ccgtttccgt ttaccgtttt gtatatcccg tttccgttcc gttttcgttt tttacctcgg
8820gttcgaaatc gatcgggata aaactaacaa aatcggttat acgataacgg tcggtacggg
8880attttcccat cctactttca tccctgagat tattgtcgtt tctttcgcag atcggtaccc
8940cccccctaga gtcgacatcg atctagtaac atagatgaca ccgcgcgcga taatttatcc
9000tagtttgcgc gctatatttt gttttctatc gcgtattaaa tgtataattg cgggactcta
9060atcataaaaa cccatctcat aaataacgtc atgcattaca tgttaattat tacatgctta
9120acgtaattca acagaaatta tatgataatc atcgcaagac cggcaacagg attcaatctt
9180aagaaacttt attgccaaat gtttgaacga tctgcttcga cgcactcctt ctttaggtac
9240ggactagatc tcggtgacgg gcaggaccgg acggggcggt accggcaggc tgaagtccag
9300ctgccagaaa cccacgtcat gccagttccc gtgcttgaag ccggccgccc gcagcatgcc
9360gcggggggca tatccgagcg cctcgtgcat gcgcacgctc gggtcgttgg gcagcccgat
9420gacagcgacc acgctcttga agccctgtgc ctccagggac ttcagcaggt gggtgtagag
9480cgtggagccc agtcccgtcc gctggtggcg gggggagacg tacacggtcg actcggccgt
9540ccagtcgtag gcgttgcgtg ccttccaggg gcccgcgtag gcgatgccgg cgacctcgcc
9600gtccacctcg gcgacgagcc agggatagcg ctcccgcaga cggacgaggt cgtccgtcca
9660ctcctgcggt tcctgcggct cggtacggaa gttgaccgtg cttgtctcga tgtagtggtt
9720gacgatggtg cagaccgccg gcatgtccgc ctcggtggca cggcggatgt cggccgggcg
9780tcgttctggg ctcatggatc tggattgaga gtgaatatga gactctaatt ggataccgag
9840gggaatttat ggaacgtcag tggagcattt ttgacaagaa atatttgcta gctgatagtg
9900accttaggcg acttttgaac gcgcaataat ggtttctgac gtatgtgctt agctcattaa
9960actccagaaa cccgcggctg agtggctcct tcaatcgttg cggttctgtc agttccaaac
10020gtaaaacggc ttgtcccgcg tcatcggcgg gggtcataac gtgactccct taattctccg
10080ctcatgatcc ccgggtaccg agctcgaatt gcggctgagt ggctccttca atcgttgcgg
10140ttctgtcagt tccaaacgta aaacggcttg tcccgcgtca tcggcggggg tcataacgtg
10200actcccttaa ttctccgctc atgatcttga tcccctgcgc catcagatcc ttggcggcaa
10260gaaagccatc cagtttactt tgcagggctt cccaacctta ccagagggcg ccccagctgg
10320caattccggt tcgcttgctg tatcgatatg gtggatttat cacaaatggg acccgccgcc
10380gacagaggtg tgatgttagg ccaggacttt gaaaatttgc gcaactatcg tatagtggcc
10440gacaaattga cgccgagttg acagactgcc tagcatttga gtgaattatg tgaggtaatg
10500ggctacactg aattggtagc tcaaactgtc agtatttatg tatatgagtg tatattttcg
10560cataatctca gaccaatctg aagatgaaat gggtatctgg gaatggcgaa atcaaggcat
10620cgatcgtgaa gtttctcatc taagccccca tttggacgtg aatgtagaca cgtcgaaata
10680aagatttccg aattagaata atttgtttat tgctttcgcc tataaatacg acggatcgta
10740atttgtcgtt ttatcaaaat gtactttcat tttataataa cgctgcggac atctacattt
10800ttgaattgaa aaaaaattgg taattactct ttctttttct ccatattgac catcatactc
10860attgctgatc catgtagatt tcccggacat gaagccattt acaattgaat atatcctgcc
10920gccgctgccg ctttgcaccc ggtggagctt gcatgttggt ttctacgcag aactgagccg
10980gttaggcaga taatttccat tgagaactga gccatgtgca ccttcccccc aacacggtga
11040gcgacggggc aacggagtga tccacatggg acttttaaac atcatccgtc ggatggcgtt
11100gcgagagaag cagtcgatcc gtgagatcag ccgacgcacc gggcaggcgc gcaacacgat
11160cgcaaagtat ttgaacgcag gtacaatcga gccgacgttc accgtcaccc tggatgctgt
11220aggcataggc ttggttatgc cggtactgcc gggcctcttg cgggatatcg tccattccga
11280cagcatcgcc agtcactatg gcgtgctgct agcgctatat gcgttgatgc aatttctatg
11340cgcacccgtt ctcggagcac tgtccgaccg ctttggccgc cgcccagtcc tgctcgcttc
11400gctacttgga gccactatcg actacgcgat catggcgacc acacccgtcc tgtggtccaa
11460cccctccgct gctatagtgc agtcggcttc tgacgttcag tgcagccgtc ttctgaaaac
11520gacatgtcgc acaagtccta agttacgcga caggctgccg ccctgccctt ttcctggcgt
11580tttcttgtcg cgtgttttag tcgcataaag tagaatactt gcgactagaa ccggagacat
11640tacgccatga acaagagcgc cgccgctggc ctgctgggct atgcccgcgt cagcaccgac
11700gaccaggact tgaccaacca acgggccgaa ctgcacgcgg ccggctgcac caagctgttt
11760tccgagaaga tcaccggcac caggcgcgac cgcccggagc tggccaggat gcttgaccac
11820ctacgccctg gcgacgttgt gacagtgacc aggctagacc gcctggcccg cagcacccgc
11880gacctactgg acattgccga gcgcatccag gaggccggcg cgggcctgcg tagcctggca
11940gagccgtggg ccgacaccac cacgccggcc ggccgcatgg tgttgaccgt gttcgccggc
12000attgccgagt tcgagcgttc cctaatcatc gaccgcaccc ggagcgggcg cgaggccgcc
12060aaggcccgag gcgtgaagtt tggcccccgc cctaccctca ccccggcaca gatcgcgcac
12120gcccgcgagc tgatcgacca ggaaggccgc accgtgaaag aggcggctgc actgcttggc
12180gtgcatcgct cgaccctgta ccgcgcactt gagcgcagcg aggaagtgac gcccaccgag
12240gccaggcggc gcggtgcctt ccgtgaggac gcattgaccg aggccgacgc cctggcggcc
12300gccgagaatg aacgccaaga ggaacaagca tgaaaccgca ccaggacggc caggacgaac
12360cgtttttcat taccgaagag atcgaggcgg agatgatcgc ggccgggtac gtgttcgagc
12420cgcccgcgca cgtctcaacc gtgcggctgc atgaaatcct ggccggtttg tctgatgcca
12480agctggcggc ctggccggcc agcttggccg ctgaagaaac cgagcgccgc cgtctaaaaa
12540ggtgatgtgt atttgagtaa aacagcttgc gtcatgcggt cgctgcgtat atgatgcgat
12600gagtaaataa acaaatacgc aagggaacgc atgaagttat cgctgtactt aaccagaaag
12660gcgggtcagg caagacgacc atcgcaaccc atctagcccg cgccctgcaa ctcgccgggg
12720ccgatgttct gttagtcgat tccgatcccc agggcagtgc ccgcgattgg gcggccgtgc
12780gggaagatca accgctaacc gttgtcggca tcgaccgccc gacgattgac cgcgacgtga
12840aggccatcgg ccggcgcgac ttcgtagtga tcgacggagc gccccaggcg gcggacttgg
12900ctgtgtccgc gatcaaggca gccgacttcg tgctgattcc ggtgcagcca agcccttacg
12960acatatgggc caccgccgac ctggtggagc tggttaagca gcgcattgag gtcacggatg
13020gaaggctaca agcggccttt gtcgtgtcgc gggcgatcaa aggcacgcgc atcggcggtg
13080aggttgccga ggcgctggcc gggtacgagc tgcccattct tgagtcccgt atcacgcagc
13140gcgtgagcta cccaggcact gccgccgccg gcacaaccgt tcttgaatca gaacccgagg
13200gcgacgctgc ccgcgaggtc caggcgctgg ccgctgaaat taaatcaaaa ctcatttgag
13260ttaatgaggt aaagagaaaa tgagcaaaag cacaaacacg ctaagtgccg gccgtccgag
13320cgcacgcagc agcaaggctg caacgttggc cagcctggca gacacgccag ccatgaagcg
13380ggtcaacttt cagttgccgg cggaggatca caccaagctg aagatgtacg cggtacgcca
13440aggcaagacc attaccgagc tgctatctga atacatcgcg cagctaccag agtaaatgag
13500caaatgaata aatgagtaga tgaattttag cggctaaagg aggcggcatg gaaaatcaag
13560aacaaccagg caccgacgcc gtggaatgcc ccatgtgtgg aggaacgggc ggttggccag
13620gcgtaagcgg ctgggttgtc tgccggccct gcaatggcac tggaaccccc aagcccgagg
13680aatcggcgtg agcggtcgca aaccatccgg cccggtacaa atcggcgcgg cgctgggtga
13740tgacctggtg gagaagttga aggccgcgca ggccgcccag cggcaacgca tcgaggcaga
13800agcacgcccc ggtgaatcgt ggcaagcggc cgctgatcga atccgcaaag aatcccggca
13860accgccggca gccggtgcgc cgtcgattag gaagccgccc aagggcgacg agcaaccaga
13920ttttttcgtt ccgatgctct atgacgtggg cacccgcgat agtcgcagca tcatggacgt
13980ggccgttttc cgtctgtcga agcgtgaccg acgagctggc gaggtgatcc gctacgagct
14040tccagacggg cacgtagagg tttccgcagg gccggccggc atggccagtg tgtgggatta
14100cgacctggta ctgatggcgg tttcccatct aaccgaatcc atgaaccgat accgggaagg
14160gaagggagac aagcccggcc gcgtgttccg tccacacgtt gcggacgtac tcaagttctg
14220ccggcgagcc gatggcggaa agcagaaaga cgacctggta gaaacctgca ttcggttaaa
14280caccacgcac gttgccatgc agcgtacgaa gaaggccaag aacggccgcc tggtgacggt
14340atccgagggt gaagccttga ttagccgcta caagatcgta aagagcgaaa ccgggcggcc
14400ggagtacatc gagatcgagc tagctgattg gatgtaccgc gagatcacag aaggcaagaa
14460cccggacgtg ctgacggttc accccgatta ctttttgatc gatcccggca tcggccgttt
14520tctctaccgc ctggcacgcc gcgccgcagg caaggcagaa gccagatggt tgttcaagac
14580gatctacgaa cgcagtggca gcgccggaga gttcaagaag ttctgtttca ccgtgcgcaa
14640gctgatcggg tcaaatgacc tgccggagta cgatttgaag gaggaggcgg ggcaggctgg
14700cccgatccta gtcatgcgct accgcaacct gatcgagggc gaagcatccg ccggttccta
14760atgtacggag cagatgctag ggcaaattgc cctagcaggg gaaaaaggtc gaaaaggtct
14820ctttcctgtg gatagcacgt acattgggaa cccaaagccg tacattggga accggaaccc
14880gtacattggg aacccaaagc cgtacattgg gaaccggtca cacatgtaag tgactgatat
14940aaaagagaaa aaaggcgatt tttccgccta aaactcttta aaacttatta aaactcttaa
15000aacccgcctg gcctgtgcat aactgtctgg ccagcgcaca gccgaagagc tgcaaaaagc
15060gcctaccctt cggtcgctgc gctccctacg ccccgccgct tcgcgtcggc ctatcgcggc
15120cgctggccgc tcaaaaatgg ctggcctacg gccaggcaat ctaccagggc gcggacaagc
15180cgcgccgtcg ccactcgacc gccggcgccc acatcaaggc accctgcctc gcgcgtttcg
15240gtgatgacgg tgaaaacctc tgacacatgc agctcccgga gacggtcaca gcttgtctgt
15300aagcggatgc cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc
15360ggggcgcagc catgacccag tcacgtagcg atagcggagt gtatactggc ttaactatgc
15420ggcatcagag cagattgtac tgagagtgca ccatatgcgg tgtgaaatac cgcacagatg
15480cgtaaggaga aaataccgca tcaggcgctc ttccgcttcc tcgctcactg actcgctgcg
15540ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa tacggttatc
15600cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc aaaaggccag
15660gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca
15720tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca
15780ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg
15840atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag
15900gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt
15960tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca
16020cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg
16080cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa ggacagtatt
16140tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta gctcttgatc
16200cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg
16260cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg acgctcagtg
16320gaacgaaaac tcacgttaag ggattttggt catgagatta tcaaaaagga tcttcaccta
16380gatcctttta aattaaaaat gaagttttaa atcaatctaa agtatatatg agtaaacttg
16440gtctgacagt taccaatgct taatcagtga ggcacctatc tcagcgatct gtctatttcg
16500ttcatccata gttgcctgac tccccgtcgt gtagataact acgatacggg agggcttacc
16560atctggcccc agtgctgcaa tgataccgcg agacccacgc tcaccggctc cagatttatc
16620agcaataaac cagccagccg gaagggccga gcgcagaagt ggtcctgcaa ctttatccgc
16680ctccatccag tctattaatt gttgccggga agctagagta agtagttcgc cagttaatag
16740tttgcgcaac gttgttgcca ttgctacagg catcgtggtg tcacgctcgt cgtttggtat
16800ggcttcattc agctccggtt cccaacgatc aaggcgagtt acatgatccc ccatgttgtg
16860caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc agaagtaagt tggccgcagt
16920gttatcactc atggttatgg cagcactgca taattctctt actgtcatgc catccgtaag
16980atgcttttct gtgactggtg agtactcaac caagtcattc tgagaatagt gtatgcggcg
17040accgagttgc tcttgcccgg cgtcaacacg ggataatacc gcgccacata gcagaacttt
17100aaaagtgctc atcattggaa aagacctgca gggggggggg ggaaagccac gttgtgtctc
17160aaaatctctg atgttacatt gcacaagata aaaatatatc atcatgaaca ataaaactgt
17220ctgcttacat aaacagtaat acaaggggtg ttatgagcca tattcaacgg gaaacgtctt
17280gctcgaggcc gcgattaaat tccaacatgg atgctgattt atatgggtat aaatgggctc
17340gcgataatgt cgggcaatca ggtgcgacaa tctatcgatt gtatgggaag cccgatgcgc
17400cagagttgtt tctgaaacat ggcaaaggta gcgttgccaa tgatgttaca gatgagatgg
17460tcagactaaa ctggctgacg gaatttatgc ctcttccgac catcaagcat tttatccgta
17520ctcctgatga tgcatggtta ctcaccactg cgatccccgg gaaaacagca ttccaggtat
17580tagaagaata tcctgattca ggtgaaaata ttgttgatgc gctggcagtg ttcctgcgcc
17640ggttgcattc gattcctgtt tgtaattgtc cttttaacag cgatcgcgta tttcgtctcg
17700ctcaggcgca atcacgaatg aataacggtt tggttgatgc gagtgatttt gatgacgagc
17760gtaatggctg gcctgttgaa caagtctgga aagaaatgca taagcttttg ccattctcac
17820cggattcagt cgtcactcat ggtgatttct cacttgataa ccttattttt gacgagggga
17880aattaatagg ttgtattgat gttggacgag tcggaatcgc agaccgatac caggatcttg
17940ccatcctatg gaactgcctc ggtgagtttt ctccttcatt acagaaacgg ctttttcaaa
18000aatatggtat tgataatcct gatatgaata aattgcagtt tcatttgatg ctcgatgagt
18060ttttctaatc agaattggtt aattggttgt aacactggca gagcattacg ctgacttgac
18120gggacggcgg ctttgttgaa taaatcgaac ttttgctgag ttgaaggatc agatcacgca
18180tcttcccgac aacgcagacc gttccgtggc aaagcaaaag ttcaaaatca ccaactggtc
18240cacctacaac aaagctctca tcaaccgtgg ctccctcact ttctggctgg atgatggggc
18300gattcaggcc tggtatgagt cagcaacacc ttcttcacga ggcagacctc agcgcccccc
18360cccccctgca ggtcaattcg gtcgatatgg ctattacgaa gaaggctcgt gcgcggagtc
18420ccgtgaactt tcccacgcaa caagtgaacc gcaccgggtt tgccggaggc catttcgtta
18480aaatgcgcag c
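18491
SEQ ID NO: 2
Length: 50
Type: DNA
Organism: Artificial Sequence
Other information: poly-linker
Sequence 2:
gatcactagt ggcgcgccta ggagatctcg agtagggata acagggtaat 50
SEQ ID NO: 3
Length: 7085
Type: DNA
Organism: Artificial Sequence
Other information: Plasmid pKR85
Sequence 3:
cgcgccaagc ttttgatcca tgcccttcat ttgccgctta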
ttaattaatt tggtaacagt 60ccgtactaat cagttactta tccttccccc atcataatta
atcttggtag tctcgaatgc 120cacaacactg actagtctct tggatcataa gaaaaagcca
aggaacaaaa gaagacaaaa 180cacaatgaga gtatcctttg catagcaatg tctaagttca
taaaattcaa acaaaaacgc 240aatcacacac agtggacatc acttatccac tagctgatca
ggatcgccgc gtcaagaaaa 300aaaaactgga ccccaaaagc catgcacaac aacacgtact
cacaaaggtg tcaatcgagc 360agcccaaaac attcaccaac tcaacccatc atgagccctc
acatttgttg tttctaaccc 420aacctcaaac tcgtattctc ttccgccacc tcatttttgt
ttatttcaac acccgtcaaa 480ctgcatgcca ccccgtggcc aaatgtccat gcatgttaac
aagacctatg actataaata 540gctgcaatct cggcccaggt tttcatcatc aagaaccagt
tcaatatcct agtacaccgt 600attaaagaat ttaagatata ctgcggccgc aagtatgaac
taaaatgcat gtaggtgtaa 660gagctcatgg agagcatgga atattgtatc cgaccatgta
acagtataat aactgagctc 720catctcactt cttctatgaa taaacaaagg atgttatgat
atattaacac tctatctatg 780caccttattg ttctatgata aatttcctct tattattata
aatcatctga atcgtgacgg 840cttatggaat gcttcaaata gtacaaaaac aaatgtgtac
tataagactt tctaaacaat 900tctaacctta gcattgtgaa cgagacataa gtgttaagaa
gacataacaa ttataatgga 960agaagtttgt ctccatttat atattatata ttacccactt
atgtattata ttaggatgtt 1020aaggagacat aacaattata aagagagaag tttgtatcca
tttatatatt atatactacc 1080catttatata ttatacttat ccacttattt aatgtcttta
taaggtttga tccatgatat 1140ttctaatatt ttagttgata tgtatatgaa agggtactat
ttgaactctc ttactctgta 1200taaaggttgg atcatcctta aagtgggtct atttaatttt
attgcttctt acagataaaa 1260aaaaaattat gagttggttt gataaaatat tgaaggattt
aaaataataa taaataacat 1320ataatatatg tatataaatt tattataata taacatttat
ctataaaaaa gtaaatattg 1380tcataaatct atacaatcgt ttagccttgc tggacgaatc
tcaattattt aaacgagagt 1440aaacatattt gactttttgg ttatttaaca aattattatt
taacactata tgaaattttt 1500ttttttatca gcaaagaata aaattaaatt aagaaggaca
atggtgtccc aatccttata 1560caaccaactt ccacaagaaa gtcaagtcag agacaacaaa
aaaacaagca aaggaaattt 1620tttaatttga gttgtcttgt ttgctgcata atttatgcag
taaaacacta cacataaccc 1680ttttagcagt agagcaatgg ttgaccgtgt gcttagcttc
ttttatttta tttttttatc 1740agcaaagaat aaataaaata aaatgagaca cttcagggat
gtttcaacaa gcttggatct 1800cctgcaggat ctggccggcc ggatctcgta cggatccgtc
gacggcgcgc ccgatcatcc 1860ggatatagtt cctcctttca gcaaaaaacc cctcaagacc
cgtttagagg ccccaagggg 1920ttatgctagt tattgctcag cggtggcagc agccaactca
gcttcctttc gggctttgtt 1980agcagccgga tcgatccaag ctgtacctca ctattccttt
gccctcggac gagtgctggg 2040gcgtcggttt ccactatcgg cgagtacttc tacacagcca
tcggtccaga cggccgcgct 2100tctgcgggcg atttgtgtac gcccgacagt cccggctccg
gatcggacga ttgcgtcgca 2160tcgaccctgc gcccaagctg catcatcgaa attgccgtca
accaagctct gatagagttg 2220gtcaagacca atgcggagca tatacgcccg gagccgcggc
gatcctgcaa gctccggatg 2280cctccgctcg aagtagcgcg tctgctgctc catacaagcc
aaccacggcc tccagaagaa 2340gatgttggcg acctcgtatt gggaatcccc gaacatcgcc
tcgctccagt caatgaccgc 2400tgttatgcgg ccattgtccg tcaggacatt gttggagccg
aaatccgcgt gcacgaggtg 2460ccggacttcg gggcagtcct cggcccaaag catcagctca
tcgagagcct gcgcgacgga 2520cgcactgacg gtgtcgtcca tcacagtttg ccagtgatac
acatggggat cagcaatcgc 2580gcatatgaaa tcacgccatg tagtgtattg accgattcct
tgcggtccga atgggccgaa 2640cccgctcgtc tggctaagat cggccgcagc gatcgcatcc
atagcctccg cgaccggctg 2700cagaacagcg ggcagttcgg tttcaggcag gtcttgcaac
gtgacaccct gtgcacggcg 2760ggagatgcaa taggtcaggc tctcgctgaa ttccccaatg
tcaagcactt ccggaatcgg 2820gagcgcggcc gatgcaaagt gccgataaac ataacgatct
ttgtagaaac catcggcgca 2880gctatttacc cgcaggacat atccacgccc tcctacatcg
aagctgaaag cacgagattc 2940ttcgccctcc gagagctgca tcaggtcgga gacgctgtcg
aacttttcga tcagaaactt 3000ctcgacagac gtcgcggtga gttcaggctt ttccatgggt
atatctcctt cttaaagtta 3060aacaaaatta tttctagagg gaaaccgttg tggtctccct
atagtgagtc gtattaattt 3120cgcgggatcg agatcgatcc aattccaatc ccacaaaaat
ctgagcttaa cagcacagtt 3180gctcctctca gagcagaatc gggtattcaa caccctcata
tcaactacta cgttgtgtat 3240aacggtccac atgccggtat atacgatgac tggggttgta
caaaggcggc aacaaacggc 3300gttcccggag ttgcacacaa gaaatttgcc actattacag
aggcaagagc agcagctgac 3360gcgtacacaa caagtcagca aacagacagg ttgaacttca
tccccaaagg agaagctcaa 3420ctcaagccca agagctttgc taaggcccta acaagcccac
caaagcaaaa agcccactgg 3480ctcacgctag gaaccaaaag gcccagcagt gatccagccc
caaaagagat ctcctttgcc 3540ccggagatta caatggacga tttcctctat ctttacgatc
taggaaggaa gttcgaaggt 3600gaaggtgacg acactatgtt caccactgat aatgagaagg
ttagcctctt caatttcaga 3660aagaatgctg acccacagat ggttagagag gcctacgcag
caggtctcat caagacgatc 3720tacccgagta acaatctcca ggagatcaaa taccttccca
agaaggttaa agatgcagtc 3780aaaagattca ggactaattg catcaagaac acagagaaag
acatatttct caagatcaga 3840agtactattc cagtatggac gattcaaggc ttgcttcata
aaccaaggca agtaatagag 3900attggagtct ctaaaaaggt agttcctact gaatctaagg
ccatgcatgg agtctaagat 3960tcaaatcgag gatctaacag aactcgccgt gaagactggc
gaacagttca tacagagtct 4020tttacgactc aatgacaaga agaaaatctt cgtcaacatg
gtggagcacg acactctggt 4080ctactccaaa aatgtcaaag atacagtctc agaagaccaa
agggctattg agacttttca 4140acaaaggata atttcgggaa acctcctcgg attccattgc
ccagctatct gtcacttcat 4200cgaaaggaca gtagaaaagg aaggtggctc ctacaaatgc
catcattgcg ataaaggaaa 4260ggctatcatt caagatgcct ctgccgacag tggtcccaaa
gatggacccc cacccacgag 4320gagcatcgtg gaaaaagaag acgttccaac cacgtcttca
aagcaagtgg attgatgtga 4380catctccact gacgtaaggg atgacgcaca atcccactat
ccttcgcaag acccttcctc 4440tatataagga agttcatttc atttggagag gacacgctcg
agctcatttc tctattactt 4500cagccataac aaaagaactc ttttctcttc ttattaaacc
atgaaaaagc ctgaactcac 4560cgcgacgtct gtcgagaagt ttctgatcga aaagttcgac
agcgtctccg acctgatgca 4620gctctcggag ggcgaagaat ctcgtgcttt cagcttcgat
gtaggagggc gtggatatgt 4680cctgcgggta aatagctgcg ccgatggttt ctacaaagat
cgttatgttt atcggcactt 4740tgcatcggcc gcgctcccga ttccggaagt gcttgacatt
ggggaattca gcgagagcct 4800gacctattgc atctcccgcc gtgcacaggg tgtcacgttg
caagacctgc ctgaaaccga 4860actgcccgct gttctgcagc cggtcgcgga ggccatggat
gcgatcgctg cggccgatct 4920tagccagacg agcgggttcg gcccattcgg accgcaagga
atcggtcaat acactacatg 4980gcgtgatttc atatgcgcga ttgctgatcc ccatgtgtat
cactggcaaa ctgtgatgga 5040cgacaccgtc agtgcgtccg tcgcgcaggc tctcgatgag
ctgatgcttt gggccgagga 5100ctgccccgaa gtccggcacc tcgtgcacgc ggatttcggc
tccaacaatg tcctgacgga 5160caatggccgc ataacagcgg tcattgactg gagcgaggcg
atgttcgggg attcccaata 5220cgaggtcgcc aacatcttct tctggaggcc gtggttggct
tgtatggagc agcagacgcg 5280ctacttcgag cggaggcatc cggagcttgc aggatcgccg
cggctccggg cgtatatgct 5340ccgcattggt cttgaccaac tctatcagag cttggttgac
ggcaatttcg atgatgcagc 5400ttgggcgcag ggtcgatgcg acgcaatcgt ccgatccgga
gccgggactg tcgggcgtac 5460acaaatcgcc cgcagaagcg cggccgtctg gaccgatggc
tgtgtagaag tactcgccga 5520tagtggaaac cgacgcccca gcactcgtcc gagggcaaag
gaatagtgag gtacctaaag 5580aaggagtgcg tcgaagcaga tcgttcaaac atttggcaat
aaagtttctt aagattgaat 5640cctgttgccg gtcttgcgat gattatcata taatttctgt
tgaattacgt taagcatgta 5700ataattaaca tgtaatgcat gacgttattt atgagatggg
tttttatgat tagagtcccg 5760caattataca tttaatacgc gatagaaaac aaaatatagc
gcgcaaacta ggataaatta 5820tcgcgcgcgg tgtcatctat gttactagat cgatgtcgaa
tcgatcaacc tgcattaatg 5880aatcggccaa cgcgcgggga gaggcggttt gcgtattggg
cgctcttccg cttcctcgct 5940cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg
gtatcagctc actcaaaggc 6000ggtaatacgg ttatccacag aatcagggga taacgcagga
aagaacatgt gagcaaaagg 6060ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg
gcgtttttcc ataggctccg 6120cccccctgac gagcatcaca aaaatcgacg ctcaagtcag
aggtggcgaa acccgacagg 6180actataaaga taccaggcgt ttccccctgg aagctccctc
gtgcgctctc ctgttccgac 6240cctgccgctt accggatacc tgtccgcctt tctcccttcg
ggaagcgtgg cgctttctca 6300atgctcacgc tgtaggtatc tcagttcggt gtaggtcgtt
cgctccaagc tgggctgtgt 6360gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc
ggtaactatc gtcttgagtc 6420caacccggta agacacgact tatcgccact ggcagcagcc
actggtaaca ggattagcag 6480agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg
tggcctaact acggctacac 6540tagaaggaca gtatttggta tctgcgctct gctgaagcca
gttaccttcg gaaaaagagt 6600tggtagctct tgatccggca aacaaaccac cgctggtagc
ggtggttttt ttgtttgcaa 6660gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat
cctttgatct tttctacggg 6720gtctgacgct cagtggaacg aaaactcacg ttaagggatt
ttggtcatga cattaaccta 6780taaaaatagg cgtatcacga ggccctttcg tctcgcgcgt
ttcggtgatg acggtgaaaa 6840cctctgacac atgcagctcc cggagacggt cacagcttgt
ctgtaagcgg atgccgggag 6900cagacaagcc cgtcagggcg cgtcagcggg tgttggcggg
tgtcggggct ggcttaacta 6960tgcggcatca gagcagattg tactgagagt gcaccatatg
gacatattgt cgttagaacg 7020cggctacaat taatacataa ccttatgtat catacacata
cgatttaggt gacactatag 7080aacgg
7085
SEQ ID NO: 4
Length: 5303
Type: DNA
Organism: Artificial Sequence
Other information: Plasmid pKR278
Sequence 4:
agcttggatc tcctgcagga tctggccggc cggatctcgt acggatccgt cgacggcgcg
60cccgatcatc cggatatagt tcctcctttc agcaaaaaac ccctcaagac ccgtttagag
120gccccaaggg gttatgctag ttattgctca gcggtggcag cagccaactc agcttccttt
180cgggctttgt tagcagccgg atcgatccaa gctgtacctc actattcctt tgccctcgga
240cgagtgctgg ggcgtcggtt tccactatcg gcgagtactt ctacacagcc atcggtccag
300acggccgcgc ttctgcgggc gatttgtgta cgcccgacag tcccggctcc ggatcggacg
360attgcgtcgc atcgaccctg cgcccaagct gcatcatcga aattgccgtc aaccaagctc
420tgatagagtt ggtcaagacc aatgcggagc atatacgccc ggagccgcgg cgatcctgca
480agctccggat gcctccgctc gaagtagcgc gtctgctgct ccatacaagc caaccacggc
540ctccagaaga agatgttggc gacctcgtat tgggaatccc cgaacatcgc ctcgctccag
600tcaatgaccg ctgttatgcg gccattgtcc gtcaggacat tgttggagcc gaaatccgcg
660tgcacgaggt gccggacttc ggggcagtcc tcggcccaaa gcatcagctc atcgagagcc
720tgcgcgacgg acgcactgac ggtgtcgtcc atcacagttt gccagtgata cacatgggga
780tcagcaatcg cgcatatgaa atcacgccat gtagtgtatt gaccgattcc ttgcggtccg
840aatgggccga acccgctcgt ctggctaaga tcggccgcag cgatcgcatc catagcctcc
900gcgaccggct gcagaacagc gggcagttcg gtttcaggca ggtcttgcaa cgtgacaccc
960tgtgcacggc gggagatgca ataggtcagg ctctcgctga attccccaat gtcaagcact
1020tccggaatcg ggagcgcggc cgatgcaaag tgccgataaa cataacgatc tttgtagaaa
1080ccatcggcgc agctatttac ccgcaggaca tatccacgcc ctcctacatc gaagctgaaa
1140gcacgagatt cttcgccctc cgagagctgc atcaggtcgg agacgctgtc gaacttttcg
1200atcagaaact tctcgacaga cgtcgcggtg agttcaggct tttccatggg tatatctcct
1260tcttaaagtt aaacaaaatt atttctagag ggaaaccgtt gtggtctccc tatagtgagt
1320cgtattaatt tcgcgggatc gagatcgatc caattccaat cccacaaaaa tctgagctta
1380acagcacagt tgctcctctc agagcagaat cgggtattca acaccctcat atcaactact
1440acgttgtgta taacggtcca catgccggta tatacgatga ctggggttgt acaaaggcgg
1500caacaaacgg cgttcccgga gttgcacaca agaaatttgc cactattaca gaggcaagag
1560cagcagctga cgcgtacaca acaagtcagc aaacagacag gttgaacttc atccccaaag
1620gagaagctca actcaagccc aagagctttg ctaaggccct aacaagccca ccaaagcaaa
1680aagcccactg gctcacgcta ggaaccaaaa ggcccagcag tgatccagcc ccaaaagaga
1740tctcctttgc cccggagatt acaatggacg atttcctcta tctttacgat ctaggaagga
1800agttcgaagg tgaaggtgac gacactatgt tcaccactga taatgagaag gttagcctct
1860tcaatttcag aaagaatgct gacccacaga tggttagaga ggcctacgca gcaggtctca
1920tcaagacgat ctacccgagt aacaatctcc aggagatcaa ataccttccc aagaaggtta
1980aagatgcagt caaaagattc aggactaatt gcatcaagaa cacagagaaa gacatatttc
2040tcaagatcag aagtactatt ccagtatgga cgattcaagg cttgcttcat aaaccaaggc
2100aagtaataga gattggagtc tctaaaaagg tagttcctac tgaatctaag gccatgcatg
2160gagtctaaga ttcaaatcga ggatctaaca gaactcgccg tgaagactgg cgaacagttc
2220atacagagtc ttttacgact caatgacaag aagaaaatct tcgtcaacat ggtggagcac
2280gacactctgg tctactccaa aaatgtcaaa gatacagtct cagaagacca aagggctatt
2340gagacttttc aacaaaggat aatttcggga aacctcctcg gattccattg cccagctatc
2400tgtcacttca tcgaaaggac agtagaaaag gaaggtggct cctacaaatg ccatcattgc
2460gataaaggaa aggctatcat tcaagatgcc tctgccgaca gtggtcccaa agatggaccc
2520ccacccacga ggagcatcgt ggaaaaagaa gacgttccaa ccacgtcttc aaagcaagtg
2580gattgatgtg acatctccac tgacgtaagg gatgacgcac aatcccacta tccttcgcaa
2640gacccttcct ctatataagg aagttcattt catttggaga ggacacgctc gagctcattt
2700ctctattact tcagccataa caaaagaact cttttctctt cttattaaac catgaaaaag
2760cctgaactca ccgcgacgtc tgtcgagaag tttctgatcg aaaagttcga cagcgtctcc
2820gacctgatgc agctctcgga gggcgaagaa tctcgtgctt tcagcttcga tgtaggaggg
2880cgtggatatg tcctgcgggt aaatagctgc gccgatggtt tctacaaaga tcgttatgtt
2940tatcggcact ttgcatcggc cgcgctcccg attccggaag tgcttgacat tggggaattc
3000agcgagagcc tgacctattg catctcccgc cgtgcacagg gtgtcacgtt gcaagacctg
3060cctgaaaccg aactgcccgc tgttctgcag ccggtcgcgg aggccatgga tgcgatcgct
3120gcggccgatc ttagccagac gagcgggttc ggcccattcg gaccgcaagg aatcggtcaa
3180tacactacat ggcgtgattt catatgcgcg attgctgatc cccatgtgta tcactggcaa
3240actgtgatgg acgacaccgt cagtgcgtcc gtcgcgcagg ctctcgatga gctgatgctt
3300tgggccgagg actgccccga agtccggcac ctcgtgcacg cggatttcgg ctccaacaat
3360gtcctgacgg acaatggccg cataacagcg gtcattgact ggagcgaggc gatgttcggg
3420gattcccaat acgaggtcgc caacatcttc ttctggaggc cgtggttggc ttgtatggag
3480cagcagacgc gctacttcga gcggaggcat ccggagcttg caggatcgcc gcggctccgg
3540gcgtatatgc tccgcattgg tcttgaccaa ctctatcaga gcttggttga cggcaatttc
3600gatgatgcag cttgggcgca gggtcgatgc gacgcaatcg tccgatccgg agccgggact
3660gtcgggcgta cacaaatcgc ccgcagaagc gcggccgtct ggaccgatgg ctgtgtagaa
3720gtactcgccg atagtggaaa ccgacgcccc agcactcgtc cgagggcaaa ggaatagtga
3780ggtacctaaa gaaggagtgc gtcgaagcag atcgttcaaa catttggcaa taaagtttct
3840taagattgaa tcctgttgcc ggtcttgcga tgattatcat ataatttctg ttgaattacg
3900ttaagcatgt aataattaac atgtaatgca tgacgttatt tatgagatgg gtttttatga
3960ttagagtccc gcaattatac atttaatacg cgatagaaaa caaaatatag cgcgcaaact
4020aggataaatt atcgcgcgcg gtgtcatcta tgttactaga tcgatgtcga atcgatcaac
4080ctgcattaat gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc
4140gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct
4200cactcaaagg cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg
4260tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc
4320cataggctcc gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga
4380aacccgacag gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct
4440cctgttccga ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg
4500gcgctttctc aatgctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag
4560ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat
4620cgtcttgagt ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac
4680aggattagca gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac
4740tacggctaca ctagaaggac agtatttggt atctgcgctc tgctgaagcc agttaccttc
4800ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt
4860tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc
4920ttttctacgg ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg
4980acattaacct ataaaaatag gcgtatcacg aggccctttc gtctcgcgcg tttcggtgat
5040gacggtgaaa acctctgaca catgcagctc ccggagacgg tcacagcttg tctgtaagcg
5100gatgccggga gcagacaagc ccgtcagggc gcgtcagcgg gtgttggcgg gtgtcggggc
5160tggcttaact atgcggcatc agagcagatt gtactgagag tgcaccatat ggacatattg
5220tcgttagaac gcggctacaa ttaatacata accttatgta tcatacacat acgatttagg
5280tgacactata gaacggcgcg cca
5303
SEQ ID NO: 5
Length: 4140
Type: DNA
Organism: Artificial Sequence
Other information: Plasmid pKR407
Sequence 5:
ggccgcattt cgcaccaaat
caatgaaagt aataatgaaa agtctgaata agaatactta 60ggcttagatg cctttgttac
ttgtgtaaaa taacttgagt catgtacctt tggcggaaac 120agaataaata aaaggtgaaa
ttccaatgct ctatgtataa gttagtaata cttaatgtgt 180tctacggttg tttcaatatc
atcaaactct aattgaaact ttagaaccac aaatctcaat 240cttttcttaa tgaaatgaaa
aatcttaatt gtaccatgtt tatgttaaac accttacaat 300tggttggaga ggaggaccaa
ccgatgggac aacattggga gaaagagatt caatggagat 360ttggatagga gaacaacatt
ctttttcact tcaatacaag atgagtgcaa cactaaggat 420atgtatgaga ctttcagaag
ctacgacaac atagatgagt gaggtggtga ttcctagcaa 480gaaagacatt agaggaagcc
aaaatcgaac aaggaagaca tcaagggcaa gagacaggac 540catccatctc aggaaaagga
gctttgggat agtccgagaa gttgtacaag aaattttttg 600gagggtgagt gatgcattgc
tggtgacttt aactcaatca aaattgagaa agaaagaaaa 660gggagggggc tcacatgtga
atagaaggga aacgggagaa ttttacagtt ttgatctaat 720gggcatccca gctagtggta
acatattcac catgtttaac cttcacgtac gtctagagga 780tcccccgggc tgcaggaatt
cactggccgt cgttttacaa cgtcgtgact gggaaaaccc 840tggcgttacc caacttaatc
gccttgcagc acatccccct ttcgccagct ggcgtaatag 900cgaagaggcc cgcaccgatc
gcccttccca acagttgcgc agcctgaatg gcgaatggcg 960cctgatgcgg tattttctcc
ttacgcatct gtgcggtatt tcacaccgca tatggtgcac 1020tctcagtaca atctgctctg
atgccgcata gttaagccag ccccgacacc cgccaacacc 1080cgctgacgcg ccctgacggg
cttgtctgct cccggcatcc gcttacagac aagctgtgac 1140cgtctccggg agctgcatgt
gtcagaggtt ttcaccgtca tcaccgaaac gcgcgagacg 1200aaagggcctc gtgatacgcc
tatttttata ggttaatgtc atgataataa tggtttctta 1260gacgtcaggt ggcacttttc
ggggaaatgt gcgcggaacc cctatttgtt tatttttcta 1320aatacattca aatatgtatc
cgctcatgag acaataaccc tgataaatgc ttcaataata 1380ttgaaaaagg aagagtatga
gtattcaaca tttccgtgtc gcccttattc ccttttttgc 1440ggcattttgc cttcctgttt
ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga 1500agatcagttg ggtgcacgag
tgggttacat cgaactggat ctcaacagcg gtaagatcct 1560tgagagtttt cgccccgaag
aacgttttcc aatgatgagc acttttaaag ttctgctatg 1620tggcgcggta ttatcccgta
ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta 1680ttctcagaat gacttggttg
agtactcacc agtcacagaa aagcatctta cggatggcat 1740gacagtaaga gaattatgca
gtgctgccat aaccatgagt gataacactg cggccaactt 1800acttctgaca acgatcggag
gaccgaagga gctaaccgct tttttgcaca acatggggga 1860tcatgtaact cgccttgatc
gttgggaacc ggagctgaat gaagccatac caaacgacga 1920gcgtgacacc acgatgcctg
tagcaatggc aacaacgttg cgcaaactat taactggcga 1980actacttact ctagcttccc
ggcaacaatt aatagactgg atggaggcgg ataaagttgc 2040aggaccactt ctgcgctcgg
cccttccggc tggctggttt attgctgata aatctggagc 2100cggtgagcgt gggtctcgcg
gtatcattgc agcactgggg ccagatggta agccctcccg 2160tatcgtagtt atctacacga
cggggagtca ggcaactatg gatgaacgaa atagacagat 2220cgctgagata ggtgcctcac
tgattaagca ttggtaactg tcagaccaag tttactcata 2280tatactttag attgatttaa
aacttcattt ttaatttaaa aggatctagg tgaagatcct 2340ttttgataat ctcatgacca
aaatccctta acgtgagttt tcgttccact gagcgtcaga 2400ccccgtagaa aagatcaaag
gatcttcttg agatcctttt tttctgcgcg taatctgctg 2460cttgcaaaca aaaaaaccac
cgctaccagc ggtggtttgt ttgccggatc aagagctacc 2520aactcttttt ccgaaggtaa
ctggcttcag cagagcgcag ataccaaata ctgtccttct 2580agtgtagccg tagttaggcc
accacttcaa gaactctgta gcaccgccta catacctcgc 2640tctgctaatc ctgttaccag
tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt 2700ggactcaaga cgatagttac
cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg 2760cacacagccc agcttggagc
gaacgaccta caccgaactg agatacctac agcgtgagct 2820atgagaaagc gccacgcttc
ccgaagggag aaaggcggac aggtatccgg taagcggcag 2880ggtcggaaca ggagagcgca
cgagggagct tccaggggga aacgcctggt atctttatag 2940tcctgtcggg tttcgccacc
tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg 3000gcggagccta tggaaaaacg
ccagcaacgc ggccttttta cggttcctgg ccttttgctg 3060gccttttgct cacatgttct
ttcctgcgtt atcccctgat tctgtggata accgtattac 3120cgcctttgag tgagctgata
ccgctcgccg cagccgaacg accgagcgca gcgagtcagt 3180gagcgaggaa gcggaagagc
gcccaatacg caaaccgcct ctccccgcgc gttggccgat 3240tcattaatgc agctggcacg
acaggtttcc cgactggaaa gcgggcagtg agcgcaacgc 3300aattaatgtg agttagctca
ctcattaggc accccaggct ttacacttta tgcttccggc 3360tcgtatgttg tgtggaattg
tgagcggata acaatttcac acaggaaaca gctatgacca 3420tgattacgcc aagcttgcat
gcctgcaggc tagcctaagt acgtactcaa aatgccaaca 3480aataaaaaaa aagttgcttt
aataatgcca aaacaaatta ataaaacact tacaacaccg 3540gatttttttt aattaaaatg
tgccatttag gataaatagt taatattttt aataattatt 3600taaaaagccg tatctactaa
aatgattttt atttggttga aaatattaat atgtttaaat 3660caacacaatc tatcaaaatt
aaactaaaaa aaaaataagt gtacgtggtt aacattagta 3720cagtaatata agaggaaaat
gagaaattaa gaaattgaaa gcgagtctaa tttttaaatt 3780atgaacctgc atatataaaa
ggaaagaaag aatccaggaa gaaaagaaat gaaaccatgc 3840atggtcccct cgtcatcacg
agtttctgcc atttgcaata gaaacactga aacacctttc 3900tctttgtcac ttaattgaga
tgccgaagcc acctcacacc atgaacttca tgaggtgtag 3960cacccaaggc ttccatagcc
atgcatactg aagaatgtct caagctcagc accctacttc 4020tgtgacgtgt ccctcattca
ccttcctctc ttccctataa ataaccacgc ctcaggttct 4080ccgcttcaca actcaaacat
tctctccatt ggtccttaaa cactcatcag tcatcaccgc 4140
SEQ ID NO: 6
Length: 6747
Type: DNA
Organism: Artificial Sequence
Other information: Plasmid pKR1468
Sequence 6:
gatccgtcga cggcgcgccc gatcatccgg atatagttcc
tcctttcagc aaaaaacccc 60tcaagacccg tttagaggcc ccaaggggtt atgctagtta
ttgctcagcg gtggcagcag 120ccaactcagc ttcctttcgg gctttgttag cagccggatc
gatccaagct gtacctcact 180attcctttgc cctcggacga gtgctggggc gtcggtttcc
actatcggcg agtacttcta 240cacagccatc ggtccagacg gccgcgcttc tgcgggcgat
ttgtgtacgc ccgacagtcc 300cggctccgga tcggacgatt gcgtcgcatc gaccctgcgc
ccaagctgca tcatcgaaat 360tgccgtcaac caagctctga tagagttggt caagaccaat
gcggagcata tacgcccgga 420gccgcggcga tcctgcaagc tccggatgcc tccgctcgaa
gtagcgcgtc tgctgctcca 480tacaagccaa ccacggcctc cagaagaaga tgttggcgac
ctcgtattgg gaatccccga 540acatcgcctc gctccagtca atgaccgctg ttatgcggcc
attgtccgtc aggacattgt 600tggagccgaa atccgcgtgc acgaggtgcc ggacttcggg
gcagtcctcg gcccaaagca 660tcagctcatc gagagcctgc gcgacggacg cactgacggt
gtcgtccatc acagtttgcc 720agtgatacac atggggatca gcaatcgcgc atatgaaatc
acgccatgta gtgtattgac 780cgattccttg cggtccgaat gggccgaacc cgctcgtctg
gctaagatcg gccgcagcga 840tcgcatccat agcctccgcg accggctgca gaacagcggg
cagttcggtt tcaggcaggt 900cttgcaacgt gacaccctgt gcacggcggg agatgcaata
ggtcaggctc tcgctgaatt 960ccccaatgtc aagcacttcc ggaatcggga gcgcggccga
tgcaaagtgc cgataaacat 1020aacgatcttt gtagaaacca tcggcgcagc tatttacccg
caggacatat ccacgccctc 1080ctacatcgaa gctgaaagca cgagattctt cgccctccga
gagctgcatc aggtcggaga 1140cgctgtcgaa cttttcgatc agaaacttct cgacagacgt
cgcggtgagt tcaggctttt 1200ccatgggtat atctccttct taaagttaaa caaaattatt
tctagaggga aaccgttgtg 1260gtctccctat agtgagtcgt attaatttcg cgggatcgag
atcgatccaa ttccaatccc 1320acaaaaatct gagcttaaca gcacagttgc tcctctcaga
gcagaatcgg gtattcaaca 1380ccctcatatc aactactacg ttgtgtataa cggtccacat
gccggtatat acgatgactg 1440gggttgtaca aaggcggcaa caaacggcgt tcccggagtt
gcacacaaga aatttgccac 1500tattacagag gcaagagcag cagctgacgc gtacacaaca
agtcagcaaa cagacaggtt 1560gaacttcatc cccaaaggag aagctcaact caagcccaag
agctttgcta aggccctaac 1620aagcccacca aagcaaaaag cccactggct cacgctagga
accaaaaggc ccagcagtga 1680tccagcccca aaagagatct cctttgcccc ggagattaca
atggacgatt tcctctatct 1740ttacgatcta ggaaggaagt tcgaaggtga aggtgacgac
actatgttca ccactgataa 1800tgagaaggtt agcctcttca atttcagaaa gaatgctgac
ccacagatgg ttagagaggc 1860ctacgcagca ggtctcatca agacgatcta cccgagtaac
aatctccagg agatcaaata 1920ccttcccaag aaggttaaag atgcagtcaa aagattcagg
actaattgca tcaagaacac 1980agagaaagac atatttctca agatcagaag tactattcca
gtatggacga ttcaaggctt 2040gcttcataaa ccaaggcaag taatagagat tggagtctct
aaaaaggtag ttcctactga 2100atctaaggcc atgcatggag tctaagattc aaatcgagga
tctaacagaa ctcgccgtga 2160agactggcga acagttcata cagagtcttt tacgactcaa
tgacaagaag aaaatcttcg 2220tcaacatggt ggagcacgac actctggtct actccaaaaa
tgtcaaagat acagtctcag 2280aagaccaaag ggctattgag acttttcaac aaaggataat
ttcgggaaac ctcctcggat 2340tccattgccc agctatctgt cacttcatcg aaaggacagt
agaaaaggaa ggtggctcct 2400acaaatgcca tcattgcgat aaaggaaagg ctatcattca
agatgcctct gccgacagtg 2460gtcccaaaga tggaccccca cccacgagga gcatcgtgga
aaaagaagac gttccaacca 2520cgtcttcaaa gcaagtggat tgatgtgaca tctccactga
cgtaagggat gacgcacaat 2580cccactatcc ttcgcaagac ccttcctcta tataaggaag
ttcatttcat ttggagagga 2640cacgctcgag ctcatttctc tattacttca gccataacaa
aagaactctt ttctcttctt 2700attaaaccat gaaaaagcct gaactcaccg cgacgtctgt
cgagaagttt ctgatcgaaa 2760agttcgacag cgtctccgac ctgatgcagc tctcggaggg
cgaagaatct cgtgctttca 2820gcttcgatgt aggagggcgt ggatatgtcc tgcgggtaaa
tagctgcgcc gatggtttct 2880acaaagatcg ttatgtttat cggcactttg catcggccgc
gctcccgatt ccggaagtgc 2940ttgacattgg ggaattcagc gagagcctga cctattgcat
ctcccgccgt gcacagggtg 3000tcacgttgca agacctgcct gaaaccgaac tgcccgctgt
tctgcagccg gtcgcggagg 3060ccatggatgc gatcgctgcg gccgatctta gccagacgag
cgggttcggc ccattcggac 3120cgcaaggaat cggtcaatac actacatggc gtgatttcat
atgcgcgatt gctgatcccc 3180atgtgtatca ctggcaaact gtgatggacg acaccgtcag
tgcgtccgtc gcgcaggctc 3240tcgatgagct gatgctttgg gccgaggact gccccgaagt
ccggcacctc gtgcacgcgg 3300atttcggctc caacaatgtc ctgacggaca atggccgcat
aacagcggtc attgactgga 3360gcgaggcgat gttcggggat tcccaatacg aggtcgccaa
catcttcttc tggaggccgt 3420ggttggcttg tatggagcag cagacgcgct acttcgagcg
gaggcatccg gagcttgcag 3480gatcgccgcg gctccgggcg tatatgctcc gcattggtct
tgaccaactc tatcagagct 3540tggttgacgg caatttcgat gatgcagctt gggcgcaggg
tcgatgcgac gcaatcgtcc 3600gatccggagc cgggactgtc gggcgtacac aaatcgcccg
cagaagcgcg gccgtctgga 3660ccgatggctg tgtagaagta ctcgccgata gtggaaaccg
acgccccagc actcgtccga 3720gggcaaagga atagtgaggt acctaaagaa ggagtgcgtc
gaagcagatc gttcaaacat 3780ttggcaataa agtttcttaa gattgaatcc tgttgccggt
cttgcgatga ttatcatata 3840atttctgttg aattacgtta agcatgtaat aattaacatg
taatgcatga cgttatttat 3900gagatgggtt tttatgatta gagtcccgca attatacatt
taatacgcga tagaaaacaa 3960aatatagcgc gcaaactagg ataaattatc gcgcgcggtg
tcatctatgt tactagatcg 4020atgtcgaatc gatcaacctg cattaatgaa tcggccaacg
cgcggggaga ggcggtttgc 4080gtattgggcg ctcttccgct tcctcgctca ctgactcgct
gcgctcggtc gttcggctgc 4140ggcgagcggt atcagctcac tcaaaggcgg taatacggtt
atccacagaa tcaggggata 4200acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc
caggaaccgt aaaaaggccg 4260cgttgctggc gtttttccat aggctccgcc cccctgacga
gcatcacaaa aatcgacgct 4320caagtcagag gtggcgaaac ccgacaggac tataaagata
ccaggcgttt ccccctggaa 4380gctccctcgt gcgctctcct gttccgaccc tgccgcttac
cggatacctg tccgcctttc 4440tcccttcggg aagcgtggcg ctttctcaat gctcacgctg
taggtatctc agttcggtgt 4500aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc
cgttcagccc gaccgctgcg 4560ccttatccgg taactatcgt cttgagtcca acccggtaag
acacgactta tcgccactgg 4620cagcagccac tggtaacagg attagcagag cgaggtatgt
aggcggtgct acagagttct 4680tgaagtggtg gcctaactac ggctacacta gaaggacagt
atttggtatc tgcgctctgc 4740tgaagccagt taccttcgga aaaagagttg gtagctcttg
atccggcaaa caaaccaccg 4800ctggtagcgg tggttttttt gtttgcaagc agcagattac
gcgcagaaaa aaaggatctc 4860aagaagatcc tttgatcttt tctacggggt ctgacgctca
gtggaacgaa aactcacgtt 4920aagggatttt ggtcatgaca ttaacctata aaaataggcg
tatcacgagg ccctttcgtc 4980tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat
gcagctcccg gagacggtca 5040cagcttgtct gtaagcggat gccgggagca gacaagcccg
tcagggcgcg tcagcgggtg 5100ttggcgggtg tcggggctgg cttaactatg cggcatcaga
gcagattgta ctgagagtgc 5160accatatgga catattgtcg ttagaacgcg gctacaatta
atacataacc ttatgtatca 5220tacacatacg atttaggtga cactatagaa cggcgcgcca
agcttgcatg cctgcaggct 5280agcctaagta cgtactcaaa atgccaacaa ataaaaaaaa
agttgcttta ataatgccaa 5340aacaaattaa taaaacactt acaacaccgg atttttttta
attaaaatgt gccatttagg 5400ataaatagtt aatattttta ataattattt aaaaagccgt
atctactaaa atgattttta 5460tttggttgaa aatattaata tgtttaaatc aacacaatct
atcaaaatta aactaaaaaa 5520aaaataagtg tacgtggtta acattagtac agtaatataa
gaggaaaatg agaaattaag 5580aaattgaaag cgagtctaat ttttaaatta tgaacctgca
tatataaaag gaaagaaaga 5640atccaggaag aaaagaaatg aaaccatgca tggtcccctc
gtcatcacga gtttctgcca 5700tttgcaatag aaacactgaa acacctttct ctttgtcact
taattgagat gccgaagcca 5760cctcacacca tgaacttcat gaggtgtagc acccaaggct
tccatagcca tgcatactga 5820agaatgtctc aagctcagca ccctacttct gtgacgtgtc
cctcattcac cttcctctct 5880tccctataaa taaccacgcc tcaggttctc cgcttcacaa
ctcaaacatt ctctccattg 5940gtccttaaac actcatcagt catcaccgcg gccgcatttc
gcaccaaatc aatgaaagta 6000ataatgaaaa gtctgaataa gaatacttag gcttagatgc
ctttgttact tgtgtaaaat 6060aacttgagtc atgtaccttt ggcggaaaca gaataaataa
aaggtgaaat tccaatgctc 6120tatgtataag ttagtaatac ttaatgtgtt ctacggttgt
ttcaatatca tcaaactcta 6180attgaaactt tagaaccaca aatctcaatc ttttcttaat
gaaatgaaaa atcttaattg 6240taccatgttt atgttaaaca ccttacaatt ggttggagag
gaggaccaac cgatgggaca 6300acattgggag aaagagattc aatggagatt tggataggag
aacaacattc tttttcactt 6360caatacaaga tgagtgcaac actaaggata tgtatgagac
tttcagaagc tacgacaaca 6420tagatgagtg aggtggtgat tcctagcaag aaagacatta
gaggaagcca aaatcgaaca 6480aggaagacat caagggcaag agacaggacc atccatctca
ggaaaaggag ctttgggata 6540gtccgagaag ttgtacaaga aattttttgg agggtgagtg
atgcattgct ggtgacttta 6600actcaatcaa aattgagaaa gaaagaaaag ggagggggct
cacatgtgaa tagaagggaa 6660acgggagaat tttacagttt tgatctaatg ggcatcccag
ctagtggtaa catattcacc 6720atgtttaacc ttcacgtacg tctagag
6747
SEQ ID NO: 7 (8462 bp, DNA, Artificial Sequence, Plasmid pKR1475)
ggccgcattt cgcaccaaat caatgaaagt aataatgaaa agtctgaata agaatactta
60ggcttagatg cctttgttac ttgtgtaaaa taacttgagt catgtacctt tggcggaaac
120agaataaata aaaggtgaaa ttccaatgct ctatgtataa gttagtaata cttaatgtgt
180tctacggttg tttcaatatc atcaaactct aattgaaact ttagaaccac aaatctcaat
240cttttcttaa tgaaatgaaa aatcttaatt gtaccatgtt tatgttaaac accttacaat
300tggttggaga ggaggaccaa ccgatgggac aacattggga gaaagagatt caatggagat
360ttggatagga gaacaacatt ctttttcact tcaatacaag atgagtgcaa cactaaggat
420atgtatgaga ctttcagaag ctacgacaac atagatgagt gaggtggtga ttcctagcaa
480gaaagacatt agaggaagcc aaaatcgaac aaggaagaca tcaagggcaa gagacaggac
540catccatctc aggaaaagga gctttgggat agtccgagaa gttgtacaag aaattttttg
600gagggtgagt gatgcattgc tggtgacttt aactcaatca aaattgagaa agaaagaaaa
660gggagggggc tcacatgtga atagaaggga aacgggagaa ttttacagtt ttgatctaat
720gggcatccca gctagtggta acatattcac catgtttaac cttcacgtac gtctagagga
780tccgtcgacg gcgcgcccga tcatccggat atagttcctc ctttcagcaa aaaacccctc
840aagacccgtt tagaggcccc aaggggttat gctagttatt gctcagcggt ggcagcagcc
900aactcagctt cctttcgggc tttgttagca gccggatcga tccaagctgt acctcactat
960tcctttgccc tcggacgagt gctggggcgt cggtttccac tatcggcgag tacttctaca
1020cagccatcgg tccagacggc cgcgcttctg cgggcgattt gtgtacgccc gacagtcccg
1080gctccggatc ggacgattgc gtcgcatcga ccctgcgccc aagctgcatc atcgaaattg
1140ccgtcaacca agctctgata gagttggtca agaccaatgc ggagcatata cgcccggagc
1200cgcggcgatc ctgcaagctc cggatgcctc cgctcgaagt agcgcgtctg ctgctccata
1260caagccaacc acggcctcca gaagaagatg ttggcgacct cgtattggga atccccgaac
1320atcgcctcgc tccagtcaat gaccgctgtt atgcggccat tgtccgtcag gacattgttg
1380gagccgaaat ccgcgtgcac gaggtgccgg acttcggggc agtcctcggc ccaaagcatc
1440agctcatcga gagcctgcgc gacggacgca ctgacggtgt cgtccatcac agtttgccag
1500tgatacacat ggggatcagc aatcgcgcat atgaaatcac gccatgtagt gtattgaccg
1560attccttgcg gtccgaatgg gccgaacccg ctcgtctggc taagatcggc cgcagcgatc
1620gcatccatag cctccgcgac cggctgcaga acagcgggca gttcggtttc aggcaggtct
1680tgcaacgtga caccctgtgc acggcgggag atgcaatagg tcaggctctc gctgaattcc
1740ccaatgtcaa gcacttccgg aatcgggagc gcggccgatg caaagtgccg ataaacataa
1800cgatctttgt agaaaccatc ggcgcagcta tttacccgca ggacatatcc acgccctcct
1860acatcgaagc tgaaagcacg agattcttcg ccctccgaga gctgcatcag gtcggagacg
1920ctgtcgaact tttcgatcag aaacttctcg acagacgtcg cggtgagttc aggcttttcc
1980atgggtatat ctccttctta aagttaaaca aaattatttc tagagggaaa ccgttgtggt
2040ctccctatag tgagtcgtat taatttcgcg ggatcgagat cgatccaatt ccaatcccac
2100aaaaatctga gcttaacagc acagttgctc ctctcagagc agaatcgggt attcaacacc
2160ctcatatcaa ctactacgtt gtgtataacg gtccacatgc cggtatatac gatgactggg
2220gttgtacaaa ggcggcaaca aacggcgttc ccggagttgc acacaagaaa tttgccacta
2280ttacagaggc aagagcagca gctgacgcgt acacaacaag tcagcaaaca gacaggttga
2340acttcatccc caaaggagaa gctcaactca agcccaagag ctttgctaag gccctaacaa
2400gcccaccaaa gcaaaaagcc cactggctca cgctaggaac caaaaggccc agcagtgatc
2460cagccccaaa agagatctcc tttgccccgg agattacaat ggacgatttc ctctatcttt
2520acgatctagg aaggaagttc gaaggtgaag gtgacgacac tatgttcacc actgataatg
2580agaaggttag cctcttcaat ttcagaaaga atgctgaccc acagatggtt agagaggcct
2640acgcagcagg tctcatcaag acgatctacc cgagtaacaa tctccaggag atcaaatacc
2700ttcccaagaa ggttaaagat gcagtcaaaa gattcaggac taattgcatc aagaacacag
2760agaaagacat atttctcaag atcagaagta ctattccagt atggacgatt caaggcttgc
2820ttcataaacc aaggcaagta atagagattg gagtctctaa aaaggtagtt cctactgaat
2880ctaaggccat gcatggagtc taagattcaa atcgaggatc taacagaact cgccgtgaag
2940actggcgaac agttcataca gagtctttta cgactcaatg acaagaagaa aatcttcgtc
3000aacatggtgg agcacgacac tctggtctac tccaaaaatg tcaaagatac agtctcagaa
3060gaccaaaggg ctattgagac ttttcaacaa aggataattt cgggaaacct cctcggattc
3120cattgcccag ctatctgtca cttcatcgaa aggacagtag aaaaggaagg tggctcctac
3180aaatgccatc attgcgataa aggaaaggct atcattcaag atgcctctgc cgacagtggt
3240cccaaagatg gacccccacc cacgaggagc atcgtggaaa aagaagacgt tccaaccacg
3300tcttcaaagc aagtggattg atgtgacatc tccactgacg taagggatga cgcacaatcc
3360cactatcctt cgcaagaccc ttcctctata taaggaagtt catttcattt ggagaggaca
3420cgctcgagct catttctcta ttacttcagc cataacaaaa gaactctttt ctcttcttat
3480taaaccatga aaaagcctga actcaccgcg acgtctgtcg agaagtttct gatcgaaaag
3540ttcgacagcg tctccgacct gatgcagctc tcggagggcg aagaatctcg tgctttcagc
3600ttcgatgtag gagggcgtgg atatgtcctg cgggtaaata gctgcgccga tggtttctac
3660aaagatcgtt atgtttatcg gcactttgca tcggccgcgc tcccgattcc ggaagtgctt
3720gacattgggg aattcagcga gagcctgacc tattgcatct cccgccgtgc acagggtgtc
3780acgttgcaag acctgcctga aaccgaactg cccgctgttc tgcagccggt cgcggaggcc
3840atggatgcga tcgctgcggc cgatcttagc cagacgagcg ggttcggccc attcggaccg
3900caaggaatcg gtcaatacac tacatggcgt gatttcatat gcgcgattgc tgatccccat
3960gtgtatcact ggcaaactgt gatggacgac accgtcagtg cgtccgtcgc gcaggctctc
4020gatgagctga tgctttgggc cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat
4080ttcggctcca acaatgtcct gacggacaat ggccgcataa cagcggtcat tgactggagc
4140gaggcgatgt tcggggattc ccaatacgag gtcgccaaca tcttcttctg gaggccgtgg
4200ttggcttgta tggagcagca gacgcgctac ttcgagcgga ggcatccgga gcttgcagga
4260tcgccgcggc tccgggcgta tatgctccgc attggtcttg accaactcta tcagagcttg
4320gttgacggca atttcgatga tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga
4380tccggagccg ggactgtcgg gcgtacacaa atcgcccgca gaagcgcggc cgtctggacc
4440gatggctgtg tagaagtact cgccgatagt ggaaaccgac gccccagcac tcgtccgagg
4500gcaaaggaat agtgaggtac ctaaagaagg agtgcgtcga agcagatcgt tcaaacattt
4560ggcaataaag tttcttaaga ttgaatcctg ttgccggtct tgcgatgatt atcatataat
4620ttctgttgaa ttacgttaag catgtaataa ttaacatgta atgcatgacg ttatttatga
4680gatgggtttt tatgattaga gtcccgcaat tatacattta atacgcgata gaaaacaaaa
4740tatagcgcgc aaactaggat aaattatcgc gcgcggtgtc atctatgtta ctagatcgat
4800gtcgaatcga tcaacctgca ttaatgaatc ggccaacgcg cggggagagg cggtttgcgt
4860attgggcgct cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg
4920cgagcggtat cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac
4980gcaggaaaga acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg
5040ttgctggcgt ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca
5100agtcagaggt ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc
5160tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc
5220ccttcgggaa gcgtggcgct ttctcaatgc tcacgctgta ggtatctcag ttcggtgtag
5280gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc
5340ttatccggta actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca
5400gcagccactg gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg
5460aagtggtggc ctaactacgg ctacactaga aggacagtat ttggtatctg cgctctgctg
5520aagccagtta ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct
5580ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa
5640gaagatcctt tgatcttttc tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa
5700gggattttgg tcatgacatt aacctataaa aataggcgta tcacgaggcc ctttcgtctc
5760gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc agctcccgga gacggtcaca
5820gcttgtctgt aagcggatgc cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt
5880ggcgggtgtc ggggctggct taactatgcg gcatcagagc agattgtact gagagtgcac
5940catatggaca tattgtcgtt agaacgcggc tacaattaat acataacctt atgtatcata
6000cacatacgat ttaggtgaca ctatagaacg gcgcgccaag cttgcatgcc tgcaggctag
6060cctaagtacg tactcaaaat gccaacaaat aaaaaaaaag ttgctttaat aatgccaaaa
6120caaattaata aaacacttac aacaccggat tttttttaat taaaatgtgc catttaggat
6180aaatagttaa tatttttaat aattatttaa aaagccgtat ctactaaaat gatttttatt
6240tggttgaaaa tattaatatg tttaaatcaa cacaatctat caaaattaaa ctaaaaaaaa
6300aataagtgta cgtggttaac attagtacag taatataaga ggaaaatgag aaattaagaa
6360attgaaagcg agtctaattt ttaaattatg aacctgcata tataaaagga aagaaagaat
6420ccaggaagaa aagaaatgaa accatgcatg gtcccctcgt catcacgagt ttctgccatt
6480tgcaatagaa acactgaaac acctttctct ttgtcactta attgagatgc cgaagccacc
6540tcacaccatg aacttcatga ggtgtagcac ccaaggcttc catagccatg catactgaag
6600aatgtctcaa gctcagcacc ctacttctgt gacgtgtccc tcattcacct tcctctcttc
6660cctataaata accacgcctc aggttctccg cttcacaact caaacattct ctccattggt
6720ccttaaacac tcatcagtca tcaccgcggc catcacaagt ttgtacaaaa aagctgaacg
6780agaaacgtaa aatgatataa atatcaatat attaaattag attttgcata aaaaacagac
6840tacataatac tgtaaaacac aacatatcca gtcatattgg cggccgcatt aggcacccca
6900ggctttacac tttatgcttc cggctcgtat aatgtgtgga ttttgagtta ggatccgtcg
6960agattttcag gagctaagga agctaaaatg gagaaaaaaa tcactggata taccaccgtt
7020gatatatccc aatggcatcg taaagaacat tttgaggcat ttcagtcagt tgctcaatgt
7080acctataacc agaccgttca gctggatatt acggcctttt taaagaccgt aaagaaaaat
7140aagcacaagt tttatccggc ctttattcac attcttgccc gcctgatgaa tgctcatccg
7200gaattccgta tggcaatgaa agacggtgag ctggtgatat gggatagtgt tcacccttgt
7260tacaccgttt tccatgagca aactgaaacg ttttcatcgc tctggagtga ataccacgac
7320gatttccggc agtttctaca catatattcg caagatgtgg cgtgttacgg tgaaaacctg
7380gcctatttcc ctaaagggtt tattgagaat atgtttttcg tctcagccaa tccctgggtg
7440agtttcacca gttttgattt aaacgtggcc aatatggaca acttcttcgc ccccgttttc
7500accatgggca aatattatac gcaaggcgac aaggtgctga tgccgctggc gattcaggtt
7560catcatgccg tttgtgatgg cttccatgtc ggcagaatgc ttaatgaatt acaacagtac
7620tgcgatgagt ggcagggcgg ggcgtaaacg cgtggatccg gcttactaaa agccagataa
7680cagtatgcgt atttgcgcgc tgatttttgc ggtataagaa tatatactga tatgtatacc
7740cgaagtatgt caaaaagagg tatgctatga agcagcgtat tacagtgaca gttgacagcg
7800acagctatca gttgctcaag gcatatatga tgtcaatatc tccggtctgg taagcacaac
7860catgcagaat gaagcccgtc gtctgcgtgc cgaacgctgg aaagcggaaa atcaggaagg
7920gatggctgag gtcgcccggt ttattgaaat gaacggctct tttgctgacg agaacagggg
7980ctggtgaaat gcagtttaag gtttacacct ataaaagaga gagccgttat cgtctgtttg
8040tggatgtaca gagtgatatt attgacacgc ccgggcgacg gatggtgatc cccctggcca
8100gtgcacgtct gctgtcagat aaagtctccc gtgaacttta cccggtggtg catatcgggg
8160atgaaagctg gcgcatgatg accaccgata tggccagtgt gccggtctcc gttatcgggg
8220aagaagtggc tgatctcagc caccgcgaaa atgacatcaa aaacgccatt aacctgatgt
8280tctggggaat ataaatgtca ggctccctta tacacagcca gtctgcaggt cgaccatagt
8340gactggatat gttgtgtttt acagcattat gtagtctgtt ttttatgcaa aatctaattt
8400aatatattga tatttatatc attttacgtt tctcgttcag ctttcttgta caaagtggtg
8460at
8462
SEQ ID NO: 8 (13268 bp, DNA, Artificial Sequence, Plasmid pKR92)
cgcgcctcga gtgggcggat
cccccgggct gcaggaattc actggccgtc gttttacaac 60gtcgtgactg ggaaaaccct
ggcgttaccc aacttaatcg ccttgcagca catccccctt 120tcgccagctg gcgtaatagc
gaagaggccc gcaccgatcg cccttcccaa cagttgcgca 180gcctgaatgg cgaatggatc
gatccatcgc gatgtacctt ttgttagtca gcctctcgat 240tgctcatcgt cattacacag
taccgaagtt tgatcgatct agtaacatag atgacaccgc 300gcgcgataat ttatcctagt
ttgcgcgcta tattttgttt tctatcgcgt attaaatgta 360taattgcggg actctaatca
taaaaaccca tctcataaat aacgtcatgc attacatgtt 420aattattaca tgcttaacgt
aattcaacag aaattatatg ataatcatcg caagaccggc 480aacaggattc aatcttaaga
aactttattg ccaaatgttt gaacgatctg cttcgacgca 540ctccttcttt actccaccat
ctcgtcctta ttgaaaacgt gggtagcacc aaaacgaatc 600aagtcgctgg aactgaagtt
accaatcacg ctggatgatt tgccagttgg attaatcttg 660cctttccccg catgaataat
attgatgaat gcatgcgtga ggggtagttc gatgttggca 720atagctgcaa ttgccgcgac
atcctccaac gagcataatt cttcagaaaa atagcgatgt 780tccatgttgt cagggcatgc
atgatgcacg ttatgaggtg acggtgctag gcagtattcc 840ctcaaagttt catagtcagt
atcatattca tcattgcatt cctgcaagag agaattgaga 900cgcaatccac acgctgcggc
aaccttccgg cgttcgtggt ctatttgctc ttggacgttg 960caaacgtaag tgttggatcg
atccggggtg ggcgaagaac tccagcatga gatccccgcg 1020ctggaggatc atccagccgg
cgtcccggaa aacgattccg aagcccaacc tttcatagaa 1080ggcggcggtg gaatcgaaat
ctcgtgatgg caggttgggc gtcgcttggt cggtcatttc 1140gaaccccaga gtcccgctca
gaagaactcg tcaagaaggc gatagaaggc gatgcgctgc 1200gaatcgggag cggcgatacc
gtaaagcacg aggaagcggt cagcccattc gccgccaagc 1260tcttcagcaa tatcacgggt
agccaacgct atgtcctgat agcggtccgc cacacccagc 1320cggccacagt cgatgaatcc
agaaaagcgg ccattttcca ccatgatatt cggcaagcag 1380gcatcgccat gggtcacgac
gagatcctcg ccgtcgggca tgcgcgcctt gagcctggcg 1440aacagttcgg ctggcgcgag
cccctgatgc tcttcgtcca gatcatcctg atcgacaaga 1500ccggcttcca tccgagtacg
tgctcgctcg atgcgatgtt tcgcttggtg gtcgaatggg 1560caggtagccg gatcaagcgt
atgcagccgc cgcattgcat cagccatgat ggatactttc 1620tcggcaggag caaggtgaga
tgacaggaga tcctgccccg gcacttcgcc caatagcagc 1680cagtcccttc ccgcttcagt
gacaacgtcg agcacagctg cgcaaggaac gcccgtcgtg 1740gccagccacg atagccgcgc
tgcctcgtcc tgcagttcat tcagggcacc ggacaggtcg 1800gtcttgacaa aaagaaccgg
gcgcccctgc gctgacagcc ggaacacggc ggcatcagag 1860cagccgattg tctgttgtgc
ccagtcatag ccgaatagcc tctccaccca agcggccgga 1920gaacctgcgt gcaatccatc
ttgttcaatc atgcgaaacg atccccgcaa gcttggagac 1980tggtgatttc agcgtgtcct
ctccaaatga aatgaacttc cttatataga ggaagggtct 2040tgcgaaggat agtgggattg
tgcgtcatcc cttacgtcag tggagatatc acatcaatcc 2100acttgctttg aagacgtggt
tggaacgtct tctttttcca cgatgctcct cgtgggtggg 2160ggtccatctt tgggaccact
gtcggcagag gcatcttcaa cgatggcctt tcctttatcg 2220caatgatggc atttgtagga
gccaccttcc ttttccacta tcttcacaat aaagtgacag 2280atagctgggc aatggaatcc
gaggaggttt ccggatatta ccctttgttg aaaagtctca 2340attgcccttt ggtcttctga
gactgtatct ttgatatttt tggagtagac aagcgtgtcg 2400tgctccacca tgttgacgaa
gattttcttc ttgtcattga gtcgtaagag actctgtatg 2460aactgttcgc cagtctttac
ggcgagttct gttaggtcct ctatttgaat ctttgactcc 2520atggcctttg attcagtggg
aactaccttt ttagagactc caatctctat tacttgcctt 2580ggtttgtgaa gcaagccttg
aatcgtccat actggaatag tacttctgat cttgagaaat 2640atatctttct ctgtgttctt
gatgcagtta gtcctgaatc ttttgactgc atctttaacc 2700ttcttgggaa ggtatttgat
ctcctggaga ttattgctcg ggtagatcgt cttgatgaga 2760cctgctgcgt aagcctctct
aaccatctgt gggttagcat tctttctgaa attgaaaagg 2820ctaatcttct cattatcagt
ggtgaacatg gtatcgtcac cttctccgtc gaacttcctg 2880actagatcgt agagatagag
gaagtcgtcc attgtgatct ctggggcaaa ggagtctgaa 2940ttaattcgat atggtggatt
tatcacaaat gggacccgcc gccgacagag gtgtgatgtt 3000aggccaggac tttgaaaatt
tgcgcaacta tcgtatagtg gccgacaaat tgacgccgag 3060ttgacagact gcctagcatt
tgagtgaatt atgtgaggta atgggctaca ctgaattggt 3120agctcaaact gtcagtattt
atgtatatga gtgtatattt tcgcataatc tcagaccaat 3180ctgaagatga aatgggtatc
tgggaatggc gaaatcaagg catcgatcgt gaagtttctc 3240atctaagccc ccatttggac
gtgaatgtag acacgtcgaa ataaagattt ccgaattaga 3300ataatttgtt tattgctttc
gcctataaat acgacggatc gtaatttgtc gttttatcaa 3360aatgtacttt cattttataa
taacgctgcg gacatctaca tttttgaatt gaaaaaaaat 3420tggtaattac tctttctttt
tctccatatt gaccatcata ctcattgctg atccatgtag 3480atttcccgga catgaagcca
tttacaattg aatatatcct gccgccgctg ccgctttgca 3540cccggtggag cttgcatgtt
ggtttctacg cagaactgag ccggttaggc agataatttc 3600cattgagaac tgagccatgt
gcaccttccc cccaacacgg tgagcgacgg ggcaacggag 3660tgatccacat gggactttta
aacatcatcc gtcggatggc gttgcgagag aagcagtcga 3720tccgtgagat cagccgacgc
accgggcagg cgcgcaacac gatcgcaaag tatttgaacg 3780caggtacaat cgagccgacg
ttcacgcgga acgaccaagc aagctagctt taatgcggta 3840gtttatcaca gttaaattgc
taacgcagtc aggcaccgtg tatgaaatct aacaatgcgc 3900tcatcgtcat cctcggcacc
gtcaccctgg atgctgtagg cataggcttg gttatgccgg 3960tactgccggg cctcttgcgg
gatatcgtcc attccgacag catcgccagt cactatggcg 4020tgctgctagc gctatatgcg
ttgatgcaat ttctatgcgc acccgttctc ggagcactgt 4080ccgaccgctt tggccgccgc
ccagtcctgc tcgcttcgct acttggagcc actatcgact 4140acgcgatcat ggcgaccaca
cccgtcctgt ggtccaaccc ctccgctgct atagtgcagt 4200cggcttctga cgttcagtgc
agccgtcttc tgaaaacgac atgtcgcaca agtcctaagt 4260tacgcgacag gctgccgccc
tgcccttttc ctggcgtttt cttgtcgcgt gttttagtcg 4320cataaagtag aatacttgcg
actagaaccg gagacattac gccatgaaca agagcgccgc 4380cgctggcctg ctgggctatg
cccgcgtcag caccgacgac caggacttga ccaaccaacg 4440ggccgaactg cacgcggccg
gctgcaccaa gctgttttcc gagaagatca ccggcaccag 4500gcgcgaccgc ccggagctgg
ccaggatgct tgaccaccta cgccctggcg acgttgtgac 4560agtgaccagg ctagaccgcc
tggcccgcag cacccgcgac ctactggaca ttgccgagcg 4620catccaggag gccggcgcgg
gcctgcgtag cctggcagag ccgtgggccg acaccaccac 4680gccggccggc cgcatggtgt
tgaccgtgtt cgccggcatt gccgagttcg agcgttccct 4740aatcatcgac cgcacccgga
gcgggcgcga ggccgccaag gcccgaggcg tgaagtttgg 4800cccccgccct accctcaccc
cggcacagat cgcgcacgcc cgcgagctga tcgaccagga 4860aggccgcacc gtgaaagagg
cggctgcact gcttggcgtg catcgctcga ccctgtaccg 4920cgcacttgag cgcagcgagg
aagtgacgcc caccgaggcc aggcggcgcg gtgccttccg 4980tgaggacgca ttgaccgagg
ccgacgccct ggcggccgcc gagaatgaac gccaagagga 5040acaagcatga aaccgcacca
ggacggccag gacgaaccgt ttttcattac cgaagagatc 5100gaggcggaga tgatcgcggc
cgggtacgtg ttcgagccgc ccgcgcacgt ctcaaccgtg 5160cggctgcatg aaatcctggc
cggtttgtct gatgccaagc tggcggcctg gccggccagc 5220ttggccgctg aagaaaccga
gcgccgccgt ctaaaaaggt gatgtgtatt tgagtaaaac 5280agcttgcgtc atgcggtcgc
tgcgtatatg atgcgatgag taaataaaca aatacgcaag 5340ggaacgcatg aagttatcgc
tgtacttaac cagaaaggcg ggtcaggcaa gacgaccatc 5400gcaacccatc tagcccgcgc
cctgcaactc gccggggccg atgttctgtt agtcgattcc 5460gatccccagg gcagtgcccg
cgattgggcg gccgtgcggg aagatcaacc gctaaccgtt 5520gtcggcatcg accgcccgac
gattgaccgc gacgtgaagg ccatcggccg gcgcgacttc 5580gtagtgatcg acggagcgcc
ccaggcggcg gacttggctg tgtccgcgat caaggcagcc 5640gacttcgtgc tgattccggt
gcagccaagc ccttacgaca tatgggccac cgccgacctg 5700gtggagctgg ttaagcagcg
cattgaggtc acggatggaa ggctacaagc ggcctttgtc 5760gtgtcgcggg cgatcaaagg
cacgcgcatc ggcggtgagg ttgccgaggc gctggccggg 5820tacgagctgc ccattcttga
gtcccgtatc acgcagcgcg tgagctaccc aggcactgcc 5880gccgccggca caaccgttct
tgaatcagaa cccgagggcg acgctgcccg cgaggtccag 5940gcgctggccg ctgaaattaa
atcaaaactc atttgagtta atgaggtaaa gagaaaatga 6000gcaaaagcac aaacacgcta
agtgccggcc gtccgagcgc acgcagcagc aaggctgcaa 6060cgttggccag cctggcagac
acgccagcca tgaagcgggt caactttcag ttgccggcgg 6120aggatcacac caagctgaag
atgtacgcgg tacgccaagg caagaccatt accgagctgc 6180tatctgaata catcgcgcag
ctaccagagt aaatgagcaa atgaataaat gagtagatga 6240attttagcgg ctaaaggagg
cggcatggaa aatcaagaac aaccaggcac cgacgccgtg 6300gaatgcccca tgtgtggagg
aacgggcggt tggccaggcg taagcggctg ggttgtctgc 6360cggccctgca atggcactgg
aacccccaag cccgaggaat cggcgtgagc ggtcgcaaac 6420catccggccc ggtacaaatc
ggcgcggcgc tgggtgatga cctggtggag aagttgaagg 6480ccgcgcaggc cgcccagcgg
caacgcatcg aggcagaagc acgccccggt gaatcgtggc 6540aagcggccgc tgatcgaatc
cgcaaagaat cccggcaacc gccggcagcc ggtgcgccgt 6600cgattaggaa gccgcccaag
ggcgacgagc aaccagattt tttcgttccg atgctctatg 6660acgtgggcac ccgcgatagt
cgcagcatca tggacgtggc cgttttccgt ctgtcgaagc 6720gtgaccgacg agctggcgag
gtgatccgct acgagcttcc agacgggcac gtagaggttt 6780ccgcagggcc ggccggcatg
gccagtgtgt gggattacga cctggtactg atggcggttt 6840cccatctaac cgaatccatg
aaccgatacc gggaagggaa gggagacaag cccggccgcg 6900tgttccgtcc acacgttgcg
gacgtactca agttctgccg gcgagccgat ggcggaaagc 6960agaaagacga cctggtagaa
acctgcattc ggttaaacac cacgcacgtt gccatgcagc 7020gtacgaagaa ggccaagaac
ggccgcctgg tgacggtatc cgagggtgaa gccttgatta 7080gccgctacaa gatcgtaaag
agcgaaaccg ggcggccgga gtacatcgag atcgagctag 7140ctgattggat gtaccgcgag
atcacagaag gcaagaaccc ggacgtgctg acggttcacc 7200ccgattactt tttgatcgat
cccggcatcg gccgttttct ctaccgcctg gcacgccgcg 7260ccgcaggcaa ggcagaagcc
agatggttgt tcaagacgat ctacgaacgc agtggcagcg 7320ccggagagtt caagaagttc
tgtttcaccg tgcgcaagct gatcgggtca aatgacctgc 7380cggagtacga tttgaaggag
gaggcggggc aggctggccc gatcctagtc atgcgctacc 7440gcaacctgat cgagggcgaa
gcatccgccg gttcctaatg tacggagcag atgctagggc 7500aaattgccct agcaggggaa
aaaggtcgaa aaggtctctt tcctgtggat agcacgtaca 7560ttgggaaccc aaagccgtac
attgggaacc ggaacccgta cattgggaac ccaaagccgt 7620acattgggaa ccggtcacac
atgtaagtga ctgatataaa agagaaaaaa ggcgattttt 7680ccgcctaaaa ctctttaaaa
cttattaaaa ctcttaaaac ccgcctggcc tgtgcataac 7740tgtctggcca gcgcacagcc
gaagagctgc aaaaagcgcc tacccttcgg tcgctgcgct 7800ccctacgccc cgccgcttcg
cgtcggccta tcgcggccgc tggccgctca aaaatggctg 7860gcctacggcc aggcaatcta
ccagggcgcg gacaagccgc gccgtcgcca ctcgaccgcc 7920ggcgcccaca tcaaggcacc
ctgcctcgcg cgtttcggtg atgacggtga aaacctctga 7980cacatgcagc tcccggagac
ggtcacagct tgtctgtaag cggatgccgg gagcagacaa 8040gcccgtcagg gcgcgtcagc
gggtgttggc gggtgtcggg gcgcagccat gacccagtca 8100cgtagcgata gcggagtgta
tactggctta actatgcggc atcagagcag attgtactga 8160gagtgcacca tatgcggtgt
gaaataccgc acagatgcgt aaggagaaaa taccgcatca 8220ggcgctcttc cgcttcctcg
ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 8280cggtatcagc tcactcaaag
gcggtaatac ggttatccac agaatcaggg gataacgcag 8340gaaagaacat gtgagcaaaa
ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 8400tggcgttttt ccataggctc
cgcccccctg acgagcatca caaaaatcga cgctcaagtc 8460agaggtggcg aaacccgaca
ggactataaa gataccaggc gtttccccct ggaagctccc 8520tcgtgcgctc tcctgttccg
accctgccgc ttaccggata cctgtccgcc tttctccctt 8580cgggaagcgt ggcgctttct
catagctcac gctgtaggta tctcagttcg gtgtaggtcg 8640ttcgctccaa gctgggctgt
gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 8700ccggtaacta tcgtcttgag
tccaacccgg taagacacga cttatcgcca ctggcagcag 8760ccactggtaa caggattagc
agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 8820ggtggcctaa ctacggctac
actagaagga cagtatttgg tatctgcgct ctgctgaagc 8880cagttacctt cggaaaaaga
gttggtagct cttgatccgg caaacaaacc accgctggta 8940gcggtggttt ttttgtttgc
aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 9000atcctttgat cttttctacg
gggtctgacg ctcagtggaa cgaaaactca cgttaaggga 9060ttttggtcat gagattatca
aaaaggatct tcacctagat ccttttaaat taaaaatgaa 9120gttttaaatc aatctaaagt
atatatgagt aaacttggtc tgacagttac caatgcttaa 9180tcagtgaggc acctatctca
gcgatctgtc tatttcgttc atccatagtt gcctgactcc 9240ccgtcgtgta gataactacg
atacgggagg gcttaccatc tggccccagt gctgcaatga 9300taccgcgaga cccacgctca
ccggctccag atttatcagc aataaaccag ccagccggaa 9360gggccgagcg cagaagtggt
cctgcaactt tatccgcctc catccagtct attaattgtt 9420gccgggaagc tagagtaagt
agttcgccag ttaatagttt gcgcaacgtt gttgccattg 9480ctacaggcat cgtggtgtca
cgctcgtcgt ttggtatggc ttcattcagc tccggttccc 9540aacgatcaag gcgagttaca
tgatccccca tgttgtgcaa aaaagcggtt agctccttcg 9600gtcctccgat cgttgtcaga
agtaagttgg ccgcagtgtt atcactcatg gttatggcag 9660cactgcataa ttctcttact
gtcatgccat ccgtaagatg cttttctgtg actggtgagt 9720actcaaccaa gtcattctga
gaatagtgta tgcggcgacc gagttgctct tgcccggcgt 9780caacacggga taataccgcg
ccacatagca gaactttaaa agtgctcatc attggaaaag 9840acctgcaggg gggggggggc
gctgaggtct gcctcgtgaa gaaggtgttg ctgactcata 9900ccaggcctga atcgccccat
catccagcca gaaagtgagg gagccacggt tgatgagagc 9960tttgttgtag gtggaccagt
tggtgatttt gaacttttgc tttgccacgg aacggtctgc 10020gttgtcggga agatgcgtga
tctgatcctt caactcagca aaagttcgat ttattcaaca 10080aagccgccgt cccgtcaagt
cagcgtaatg ctctgccagt gttacaacca attaaccaat 10140tctgattaga aaaactcatc
gagcatcaaa tgaaactgca atttattcat atcaggatta 10200tcaataccat atttttgaaa
aagccgtttc tgtaatgaag gagaaaactc accgaggcag 10260ttccatagga tggcaagatc
ctggtatcgg tctgcgattc cgactcgtcc aacatcaata 10320caacctatta atttcccctc
gtcaaaaata aggttatcaa gtgagaaatc accatgagtg 10380acgactgaat ccggtgagaa
tggcaaaagc ttatgcattt ctttccagac ttgttcaaca 10440ggccagccat tacgctcgtc
atcaaaatca ctcgcatcaa ccaaaccgtt attcattcgt 10500gattgcgcct gagcgagacg
aaatacgcga tcgctgttaa aaggacaatt acaaacagga 10560atcgaatgca accggcgcag
gaacactgcc agcgcatcaa caatattttc acctgaatca 10620ggatattctt ctaatacctg
gaatgctgtt ttcccgggga tcgcagtggt gagtaaccat 10680gcatcatcag gagtacggat
aaaatgcttg atggtcggaa gaggcataaa ttccgtcagc 10740cagtttagtc tgaccatctc
atctgtaaca tcattggcaa cgctaccttt gccatgtttc 10800agaaacaact ctggcgcatc
gggcttccca tacaatcgat agattgtcgc acctgattgc 10860ccgacattat cgcgagccca
tttataccca tataaatcag catccatgtt ggaatttaat 10920cgcggcctcg agcaagacgt
ttcccgttga atatggctca taacacccct tgtattactg 10980tttatgtaag cagacagttt
tattgttcat gatgatatat ttttatcttg tgcaatgtaa 11040catcagagat tttgagacac
aacgtggctt tccccccccc ccctgcaggt caattcggtc 11100gatatggcta ttacgaagaa
ggctcgtgcg cggagtcccg tgaactttcc cacgcaacaa 11160gtgaaccgca ccgggtttgc
cggaggccat ttcgttaaaa tgcgcagcca tggctgcttc 11220gtccagcatg gcgtaatact
gatcctcgtc ttcggctggc ggtatattgc cgatgggctt 11280caaaagccgc cgtggttgaa
ccagtctatc cattccaagg tagcgaactc gaccgcttcg 11340aagctcctcc atggtccacg
ccgatgaatg acctcggcct tgtaaagacc gttgatcgct 11400tctgcgaggg cgttgtcgtg
ctgtcgccga cgcttccgat agatggctcg atacctgctt 11460ctgccaaccg ctcggaatag
cgaaaggaca cgtattgaac accgcgatcc gagtgatgca 11520ctaggccgcc atgagcggga
cgccgatcat gatgagcctc ctcgagggca tcgaggacaa 11580agcctgcatg tgctgtccgg
ctcgcccgcc atccgacaat gcgacgggcg aagacgtcga 11640tcacgaaggc cacgtagacg
aagccctccc aagtggcgac ataagtacgg acatgcgcaa 11700aggctttccc ggtttgtcgc
tgatggtgca agagacgctg aagcgcgatc cgatgcgcag 11760gcatctgttc gtcttccgcg
gtcgtggcgg tggcctgatc aaggtcactc gccgaagagc 11820tgcatgattg gctcgaaacc
gagcggggga aattgtcgcg cagttctccc gtcgccgagg 11880cgataaatta catgctcaag
cgatgggatg gcattacgtc attcctcgat gacggcccga 11940tttgcctgac gaacaatgct
gccgaacgaa cgctcagagg ctatgtactc ggcaggaagt 12000catggctgtt tgccggatcg
gatcgttgtg ctgaacgtgc ggcgttcatg gcgacactga 12060tcatgagcgc caagctcaat
aacatcgatc cgcaggcctg gcttgccgac gtccgcgccg 12120accttgcgga cgctccgatc
agcaggcttg agcaacagct gccgtggaac tggacatcca 12180agacactgag tgctcaggcg
gcctgacctg cggccttcac cggatactta ccccattatc 12240gcagattgcg atgaagcatc
agcgtcattc agcaatcttg ccaaagtatg caggctcgcg 12300agaatcgacg tgcgaaaccg
gctggttgcg ccaaagatcc gcttgcggag cggtcgaaca 12360ttcatgctgg gacttcaaga
ggtcgagtag aggaagaacc ggaaaggttg caccggaaaa 12420tatgcgttcc tttggagagc
gcctcatgga cgtgaacaaa tcgcccggac caaggatgcc 12480acggatacaa aagctcgcga
agctcggtcc cgtgggtgtt ctgtcgtctc gttgtacaac 12540gaaatccatt cccattccgc
gctcaagatg gcttcccctc ggcagttcat cagggctaaa 12600tcaatctagc cgacttgtcc
ggtgaaatgg gctgcactcc aacagaaaca atcaaacaaa 12660catacacagc gacttattca
cacgagctca aattacaacg gtatatatcc tgccagtcag 12720catcatcaca ccaaaagtta
ggcccgaata gtttgaaatt agaaagctcg caattgaggt 12780ctacaggcca aattcgctct
tagccgtaca atattactca ccggtgcgat gccccccatc 12840gtaggtgaag gtggaaatta
atgatccatc ttgagaccac aggcccacaa cagctaccag 12900tttcctcaag ggtccaccaa
aaacgtaagc gcttacgtac atggtcgata agaaaaggca 12960atttgtagat gttaacatcc
aacgtcgctt tcagggatcg atccaatacg caaaccgcct 13020ctccccgcgc gttggccgat
tcattaatgc agctggcacg acaggtttcc cgactggaaa 13080gcgggcagtg agcgcaacgc
aattaatgtg agttagctca ctcattaggc accccaggct 13140ttacacttta tgcttccggc
tcgtatgttg tgtggaattg tgagcggata acaatttcac 13200acaggaaaca gctatgacca
tgattacgcc aagcttgcat gcctgcaggt cgactctaga 13260ggatctgg
13268
SEQ ID NO: 9 (16490 bp, DNA, Artificial Sequence, pKR1478)
cgcgccagat cctctagagt cgacctgcag gcatgcaagc ttggcgtaat
catggtcata 60gctgtttcct gtgtgaaatt gttatccgct cacaattcca cacaacatac
gagccggaag 120cataaagtgt aaagcctggg gtgcctaatg agtgagctaa ctcacattaa
ttgcgttgcg 180ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat
gaatcggcca 240acgcgcgggg agaggcggtt tgcgtattgg atcgatccct gaaagcgacg
ttggatgtta 300acatctacaa attgcctttt cttatcgacc atgtacgtaa gcgcttacgt
ttttggtgga 360cccttgagga aactggtagc tgttgtgggc ctgtggtctc aagatggatc
attaatttcc 420accttcacct acgatggggg gcatcgcacc ggtgagtaat attgtacggc
taagagcgaa 480tttggcctgt agacctcaat tgcgagcttt ctaatttcaa actattcggg
cctaactttt 540ggtgtgatga tgctgactgg caggatatat accgttgtaa tttgagctcg
tgtgaataag 600tcgctgtgta tgtttgtttg attgtttctg ttggagtgca gcccatttca
ccggacaagt 660cggctagatt gatttagccc tgatgaactg ccgaggggaa gccatcttga
gcgcggaatg 720ggaatggatt tcgttgtaca acgagacgac agaacaccca cgggaccgag
cttcgcgagc 780ttttgtatcc gtggcatcct tggtccgggc gatttgttca cgtccatgag
gcgctctcca 840aaggaacgca tattttccgg tgcaaccttt ccggttcttc ctctactcga
cctcttgaag 900tcccagcatg aatgttcgac cgctccgcaa gcggatcttt ggcgcaacca
gccggtttcg 960cacgtcgatt ctcgcgagcc tgcatacttt ggcaagattg ctgaatgacg
ctgatgcttc 1020atcgcaatct gcgataatgg ggtaagtatc cggtgaaggc cgcaggtcag
gccgcctgag 1080cactcagtgt cttggatgtc cagttccacg gcagctgttg ctcaagcctg
ctgatcggag 1140cgtccgcaag gtcggcgcgg acgtcggcaa gccaggcctg cggatcgatg
ttattgagct 1200tggcgctcat gatcagtgtc gccatgaacg ccgcacgttc agcacaacga
tccgatccgg 1260caaacagcca tgacttcctg ccgagtacat agcctctgag cgttcgttcg
gcagcattgt 1320tcgtcaggca aatcgggccg tcatcgagga atgacgtaat gccatcccat
cgcttgagca 1380tgtaatttat cgcctcggcg acgggagaac tgcgcgacaa tttcccccgc
tcggtttcga 1440gccaatcatg cagctcttcg gcgagtgacc ttgatcaggc caccgccacg
accgcggaag 1500acgaacagat gcctgcgcat cggatcgcgc ttcagcgtct cttgcaccat
cagcgacaaa 1560ccgggaaagc ctttgcgcat gtccgtactt atgtcgccac ttgggagggc
ttcgtctacg 1620tggccttcgt gatcgacgtc ttcgcccgtc gcattgtcgg atggcgggcg
agccggacag 1680cacatgcagg ctttgtcctc gatgccctcg aggaggctca tcatgatcgg
cgtcccgctc 1740atggcggcct agtgcatcac tcggatcgcg gtgttcaata cgtgtccttt
cgctattccg 1800agcggttggc agaagcaggt atcgagccat ctatcggaag cgtcggcgac
agcacgacaa 1860cgccctcgca gaagcgatca acggtcttta caaggccgag gtcattcatc
ggcgtggacc 1920atggaggagc ttcgaagcgg tcgagttcgc taccttggaa tggatagact
ggttcaacca 1980cggcggcttt tgaagcccat cggcaatata ccgccagccg aagacgagga
tcagtattac 2040gccatgctgg acgaagcagc catggctgcg cattttaacg aaatggcctc
cggcaaaccc 2100ggtgcggttc acttgttgcg tgggaaagtt cacgggactc cgcgcacgag
ccttcttcgt 2160aatagccata tcgaccgaat tgacctgcag gggggggggg gaaagccacg
ttgtgtctca 2220aaatctctga tgttacattg cacaagataa aaatatatca tcatgaacaa
taaaactgtc 2280tgcttacata aacagtaata caaggggtgt tatgagccat attcaacggg
aaacgtcttg 2340ctcgaggccg cgattaaatt ccaacatgga tgctgattta tatgggtata
aatgggctcg 2400cgataatgtc gggcaatcag gtgcgacaat ctatcgattg tatgggaagc
ccgatgcgcc 2460agagttgttt ctgaaacatg gcaaaggtag cgttgccaat gatgttacag
atgagatggt 2520cagactaaac tggctgacgg aatttatgcc tcttccgacc atcaagcatt
ttatccgtac 2580tcctgatgat gcatggttac tcaccactgc gatccccggg aaaacagcat
tccaggtatt 2640agaagaatat cctgattcag gtgaaaatat tgttgatgcg ctggcagtgt
tcctgcgccg 2700gttgcattcg attcctgttt gtaattgtcc ttttaacagc gatcgcgtat
ttcgtctcgc 2760tcaggcgcaa tcacgaatga ataacggttt ggttgatgcg agtgattttg
atgacgagcg 2820taatggctgg cctgttgaac aagtctggaa agaaatgcat aagcttttgc
cattctcacc 2880ggattcagtc gtcactcatg gtgatttctc acttgataac cttatttttg
acgaggggaa 2940attaataggt tgtattgatg ttggacgagt cggaatcgca gaccgatacc
aggatcttgc 3000catcctatgg aactgcctcg gtgagttttc tccttcatta cagaaacggc
tttttcaaaa 3060atatggtatt gataatcctg atatgaataa attgcagttt catttgatgc
tcgatgagtt 3120tttctaatca gaattggtta attggttgta acactggcag agcattacgc
tgacttgacg 3180ggacggcggc tttgttgaat aaatcgaact tttgctgagt tgaaggatca
gatcacgcat 3240cttcccgaca acgcagaccg ttccgtggca aagcaaaagt tcaaaatcac
caactggtcc 3300acctacaaca aagctctcat caaccgtggc tccctcactt tctggctgga
tgatggggcg 3360attcaggcct ggtatgagtc agcaacacct tcttcacgag gcagacctca
gcgccccccc 3420ccccctgcag gtcttttcca atgatgagca cttttaaagt tctgctatgt
ggcgcggtat 3480tatcccgtgt tgacgccggg caagagcaac tcggtcgccg catacactat
tctcagaatg 3540acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg
acagtaagag 3600aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta
cttctgacaa 3660cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat
catgtaactc 3720gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag
cgtgacacca 3780cgatgcctgt agcaatggca acaacgttgc gcaaactatt aactggcgaa
ctacttactc 3840tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca
ggaccacttc 3900tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc
ggtgagcgtg 3960ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt
atcgtagtta 4020tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc
gctgagatag 4080gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat
atactttaga 4140ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt
tttgataatc 4200tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac
cccgtagaaa 4260agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc
ttgcaaacaa 4320aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca
actctttttc 4380cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta
gtgtagccgt 4440agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct
ctgctaatcc 4500tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg
gactcaagac 4560gatagttacc ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc
acacagccca 4620gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcta
tgagaaagcg 4680ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg
gtcggaacag 4740gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt
cctgtcgggt 4800ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg
cggagcctat 4860ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg
ccttttgctc 4920acatgttctt tcctgcgtta tcccctgatt ctgtggataa ccgtattacc
gcctttgagt 4980gagctgatac cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg
agcgaggaag 5040cggaagagcg cctgatgcgg tattttctcc ttacgcatct gtgcggtatt
tcacaccgca 5100tatggtgcac tctcagtaca atctgctctg atgccgcata gttaagccag
tatacactcc 5160gctatcgcta cgtgactggg tcatggctgc gccccgacac ccgccaacac
ccgctgacgc 5220gccctgacgg gcttgtctgc tcccggcatc cgcttacaga caagctgtga
ccgtctccgg 5280gagctgcatg tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgaggc
agggtgcctt 5340gatgtgggcg ccggcggtcg agtggcgacg gcgcggcttg tccgcgccct
ggtagattgc 5400ctggccgtag gccagccatt tttgagcggc cagcggccgc gataggccga
cgcgaagcgg 5460cggggcgtag ggagcgcagc gaccgaaggg taggcgcttt ttgcagctct
tcggctgtgc 5520gctggccaga cagttatgca caggccaggc gggttttaag agttttaata
agttttaaag 5580agttttaggc ggaaaaatcg ccttttttct cttttatatc agtcacttac
atgtgtgacc 5640ggttcccaat gtacggcttt gggttcccaa tgtacgggtt ccggttccca
atgtacggct 5700ttgggttccc aatgtacgtg ctatccacag gaaagagacc ttttcgacct
ttttcccctg 5760ctagggcaat ttgccctagc atctgctccg tacattagga accggcggat
gcttcgccct 5820cgatcaggtt gcggtagcgc atgactagga tcgggccagc ctgccccgcc
tcctccttca 5880aatcgtactc cggcaggtca tttgacccga tcagcttgcg cacggtgaaa
cagaacttct 5940tgaactctcc ggcgctgcca ctgcgttcgt agatcgtctt gaacaaccat
ctggcttctg 6000ccttgcctgc ggcgcggcgt gccaggcggt agagaaaacg gccgatgccg
ggatcgatca 6060aaaagtaatc ggggtgaacc gtcagcacgt ccgggttctt gccttctgtg
atctcgcggt 6120acatccaatc agctagctcg atctcgatgt actccggccg cccggtttcg
ctctttacga 6180tcttgtagcg gctaatcaag gcttcaccct cggataccgt caccaggcgg
ccgttcttgg 6240ccttcttcgt acgctgcatg gcaacgtgcg tggtgtttaa ccgaatgcag
gtttctacca 6300ggtcgtcttt ctgctttccg ccatcggctc gccggcagaa cttgagtacg
tccgcaacgt 6360gtggacggaa cacgcggccg ggcttgtctc ccttcccttc ccggtatcgg
ttcatggatt 6420cggttagatg ggaaaccgcc atcagtacca ggtcgtaatc ccacacactg
gccatgccgg 6480ccggccctgc ggaaacctct acgtgcccgt ctggaagctc gtagcggatc
acctcgccag 6540ctcgtcggtc acgcttcgac agacggaaaa cggccacgtc catgatgctg
cgactatcgc 6600gggtgcccac gtcatagagc atcggaacga aaaaatctgg ttgctcgtcg
cccttgggcg 6660gcttcctaat cgacggcgca ccggctgccg gcggttgccg ggattctttg
cggattcgat 6720cagcggccgc ttgccacgat tcaccggggc gtgcttctgc ctcgatgcgt
tgccgctggg 6780cggcctgcgc ggccttcaac ttctccacca ggtcatcacc cagcgccgcg
ccgatttgta 6840ccgggccgga tggtttgcga ccgctcacgc cgattcctcg ggcttggggg
ttccagtgcc 6900attgcagggc cggcagacaa cccagccgct tacgcctggc caaccgcccg
ttcctccaca 6960catggggcat tccacggcgt cggtgcctgg ttgttcttga ttttccatgc
cgcctccttt 7020agccgctaaa attcatctac tcatttattc atttgctcat ttactctggt
agctgcgcga 7080tgtattcaga tagcagctcg gtaatggtct tgccttggcg taccgcgtac
atcttcagct 7140tggtgtgatc ctccgccggc aactgaaagt tgacccgctt catggctggc
gtgtctgcca 7200ggctggccaa cgttgcagcc ttgctgctgc gtgcgctcgg acggccggca
cttagcgtgt 7260ttgtgctttt gctcattttc tctttacctc attaactcaa atgagttttg
atttaatttc 7320agcggccagc gcctggacct cgcgggcagc gtcgccctcg ggttctgatt
caagaacggt 7380tgtgccggcg gcggcagtgc ctgggtagct cacgcgctgc gtgatacggg
actcaagaat 7440gggcagctcg tacccggcca gcgcctcggc aacctcaccg ccgatgcgcg
tgcctttgat 7500cgcccgcgac acgacaaagg ccgcttgtag ccttccatcc gtgacctcaa
tgcgctgctt 7560aaccagctcc accaggtcgg cggtggccca tatgtcgtaa gggcttggct
gcaccggaat 7620cagcacgaag tcggctgcct tgatcgcgga cacagccaag tccgccgcct
ggggcgctcc 7680gtcgatcact acgaagtcgc gccggccgat ggccttcacg tcgcggtcaa
tcgtcgggcg 7740gtcgatgccg acaacggtta gcggttgatc ttcccgcacg gccgcccaat
cgcgggcact 7800gccctgggga tcggaatcga ctaacagaac atcggccccg gcgagttgca
gggcgcgggc 7860tagatgggtt gcgatggtcg tcttgcctga cccgcctttc tggttaagta
cagcgataac 7920ttcatgcgtt cccttgcgta tttgtttatt tactcatcgc atcatatacg
cagcgaccgc 7980atgacgcaag ctgttttact caaatacaca tcaccttttt agacggcggc
gctcggtttc 8040ttcagcggcc aagctggccg gccaggccgc cagcttggca tcagacaaac
cggccaggat 8100ttcatgcagc cgcacggttg agacgtgcgc gggcggctcg aacacgtacc
cggccgcgat 8160catctccgcc tcgatctctt cggtaatgaa aaacggttcg tcctggccgt
cctggtgcgg 8220tttcatgctt gttcctcttg gcgttcattc tcggcggccg ccagggcgtc
ggcctcggtc 8280aatgcgtcct cacggaaggc accgcgccgc ctggcctcgg tgggcgtcac
ttcctcgctg 8340cgctcaagtg cgcggtacag ggtcgagcga tgcacgccaa gcagtgcagc
cgcctctttc 8400acggtgcggc cttcctggtc gatcagctcg cgggcgtgcg cgatctgtgc
cggggtgagg 8460gtagggcggg ggccaaactt cacgcctcgg gccttggcgg cctcgcgccc
gctccgggtg 8520cggtcgatga ttagggaacg ctcgaactcg gcaatgccgg cgaacacggt
caacaccatg 8580cggccggccg gcgtggtggt gtcggcccac ggctctgcca ggctacgcag
gcccgcgccg 8640gcctcctgga tgcgctcggc aatgtccagt aggtcgcggg tgctgcgggc
caggcggtct 8700agcctggtca ctgtcacaac gtcgccaggg cgtaggtggt caagcatcct
ggccagctcc 8760gggcggtcgc gcctggtgcc ggtgatcttc tcggaaaaca gcttggtgca
gccggccgcg 8820tgcagttcgg cccgttggtt ggtcaagtcc tggtcgtcgg tgctgacgcg
ggcatagccc 8880agcaggccag cggcggcgct cttgttcatg gcgtaatgtc tccggttcta
gtcgcaagta 8940ttctacttta tgcgactaaa acacgcgaca agaaaacgcc aggaaaaggg
cagggcggca 9000gcctgtcgcg taacttagga cttgtgcgac atgtcgtttt cagaagacgg
ctgcactgaa 9060cgtcagaagc cgactgcact atagcagcgg aggggttgga ccacaggacg
ggtgtggtcg 9120ccatgatcgc gtagtcgata gtggctccaa gtagcgaagc gagcaggact
gggcggcggc 9180caaagcggtc ggacagtgct ccgagaacgg gtgcgcatag aaattgcatc
aacgcatata 9240gcgctagcag cacgccatag tgactggcga tgctgtcgga atggacgata
tcccgcaaga 9300ggcccggcag taccggcata accaagccta tgcctacagc atccagggtg
acggtgccga 9360ggatgacgat gagcgcattg ttagatttca tacacggtgc ctgactgcgt
tagcaattta 9420actgtgataa actaccgcat taaagctagc ttgcttggtc gttccgcgtg
aacgtcggct 9480cgattgtacc tgcgttcaaa tactttgcga tcgtgttgcg cgcctgcccg
gtgcgtcggc 9540tgatctcacg gatcgactgc ttctctcgca acgccatccg acggatgatg
tttaaaagtc 9600ccatgtggat cactccgttg ccccgtcgct caccgtgttg gggggaaggt
gcacatggct 9660cagttctcaa tggaaattat ctgcctaacc ggctcagttc tgcgtagaaa
ccaacatgca 9720agctccaccg ggtgcaaagc ggcagcggcg gcaggatata ttcaattgta
aatggcttca 9780tgtccgggaa atctacatgg atcagcaatg agtatgatgg tcaatatgga
gaaaaagaaa 9840gagtaattac caattttttt tcaattcaaa aatgtagatg tccgcagcgt
tattataaaa 9900tgaaagtaca ttttgataaa acgacaaatt acgatccgtc gtatttatag
gcgaaagcaa 9960taaacaaatt attctaattc ggaaatcttt atttcgacgt gtctacattc
acgtccaaat 10020gggggcttag atgagaaact tcacgatcga tgccttgatt tcgccattcc
cagataccca 10080tttcatcttc agattggtct gagattatgc gaaaatatac actcatatac
ataaatactg 10140acagtttgag ctaccaattc agtgtagccc attacctcac ataattcact
caaatgctag 10200gcagtctgtc aactcggcgt caatttgtcg gccactatac gatagttgcg
caaattttca 10260aagtcctggc ctaacatcac acctctgtcg gcggcgggtc ccatttgtga
taaatccacc 10320atatcgaatt aattcagact cctttgcccc agagatcaca atggacgact
tcctctatct 10380ctacgatcta gtcaggaagt tcgacggaga aggtgacgat accatgttca
ccactgataa 10440tgagaagatt agccttttca atttcagaaa gaatgctaac ccacagatgg
ttagagaggc 10500ttacgcagca ggtctcatca agacgatcta cccgagcaat aatctccagg
agatcaaata 10560ccttcccaag aaggttaaag atgcagtcaa aagattcagg actaactgca
tcaagaacac 10620agagaaagat atatttctca agatcagaag tactattcca gtatggacga
ttcaaggctt 10680gcttcacaaa ccaaggcaag taatagagat tggagtctct aaaaaggtag
ttcccactga 10740atcaaaggcc atggagtcaa agattcaaat agaggaccta acagaactcg
ccgtaaagac 10800tggcgaacag ttcatacaga gtctcttacg actcaatgac aagaagaaaa
tcttcgtcaa 10860catggtggag cacgacacgc ttgtctactc caaaaatatc aaagatacag
tctcagaaga 10920ccaaagggca attgagactt ttcaacaaag ggtaatatcc ggaaacctcc
tcggattcca 10980ttgcccagct atctgtcact ttattgtgaa gatagtggaa aaggaaggtg
gctcctacaa 11040atgccatcat tgcgataaag gaaaggccat cgttgaagat gcctctgccg
acagtggtcc 11100caaagatgga cccccaccca cgaggagcat cgtggaaaaa gaagacgttc
caaccacgtc 11160ttcaaagcaa gtggattgat gtgatatctc cactgacgta agggatgacg
cacaatccca 11220ctatccttcg caagaccctt cctctatata aggaagttca tttcatttgg
agaggacacg 11280ctgaaatcac cagtctccaa gcttgcgggg atcgtttcgc atgattgaac
aagatggatt 11340gcacgcaggt tctccggccg cttgggtgga gaggctattc ggctatgact
gggcacaaca 11400gacaatcggc tgctctgatg ccgccgtgtt ccggctgtca gcgcaggggc
gcccggttct 11460ttttgtcaag accgacctgt ccggtgccct gaatgaactg caggacgagg
cagcgcggct 11520atcgtggctg gccacgacgg gcgttccttg cgcagctgtg ctcgacgttg
tcactgaagc 11580gggaagggac tggctgctat tgggcgaagt gccggggcag gatctcctgt
catctcacct 11640tgctcctgcc gagaaagtat ccatcatggc tgatgcaatg cggcggctgc
atacgcttga 11700tccggctacc tgcccattcg accaccaagc gaaacatcgc atcgagcgag
cacgtactcg 11760gatggaagcc ggtcttgtcg atcaggatga tctggacgaa gagcatcagg
ggctcgcgcc 11820agccgaactg ttcgccaggc tcaaggcgcg catgcccgac ggcgaggatc
tcgtcgtgac 11880ccatggcgat gcctgcttgc cgaatatcat ggtggaaaat ggccgctttt
ctggattcat 11940cgactgtggc cggctgggtg tggcggaccg ctatcaggac atagcgttgg
ctacccgtga 12000tattgctgaa gagcttggcg gcgaatgggc tgaccgcttc ctcgtgcttt
acggtatcgc 12060cgctcccgat tcgcagcgca tcgccttcta tcgccttctt gacgagttct
tctgagcggg 12120actctggggt tcgaaatgac cgaccaagcg acgcccaacc tgccatcacg
agatttcgat 12180tccaccgccg ccttctatga aaggttgggc ttcggaatcg ttttccggga
cgccggctgg 12240atgatcctcc agcgcgggga tctcatgctg gagttcttcg cccaccccgg
atcgatccaa 12300cacttacgtt tgcaacgtcc aagagcaaat agaccacgaa cgccggaagg
ttgccgcagc 12360gtgtggattg cgtctcaatt ctctcttgca ggaatgcaat gatgaatatg
atactgacta 12420tgaaactttg agggaatact gcctagcacc gtcacctcat aacgtgcatc
atgcatgccc 12480tgacaacatg gaacatcgct atttttctga agaattatgc tcgttggagg
atgtcgcggc 12540aattgcagct attgccaaca tcgaactacc cctcacgcat gcattcatca
atattattca 12600tgcggggaaa ggcaagatta atccaactgg caaatcatcc agcgtgattg
gtaacttcag 12660ttccagcgac ttgattcgtt ttggtgctac ccacgttttc aataaggacg
agatggtgga 12720gtaaagaagg agtgcgtcga agcagatcgt tcaaacattt ggcaataaag
tttcttaaga 12780ttgaatcctg ttgccggtct tgcgatgatt atcatataat ttctgttgaa
ttacgttaag 12840catgtaataa ttaacatgta atgcatgacg ttatttatga gatgggtttt
tatgattaga 12900gtcccgcaat tatacattta atacgcgata gaaaacaaaa tatagcgcgc
aaactaggat 12960aaattatcgc gcgcggtgtc atctatgtta ctagatcgat caaacttcgg
tactgtgtaa 13020tgacgatgag caatcgagag gctgactaac aaaaggtaca tcgcgatgga
tcgatccatt 13080cgccattcag gctgcgcaac tgttgggaag ggcgatcggt gcgggcctct
tcgctattac 13140gccagctggc gaaaggggga tgtgctgcaa ggcgattaag ttgggtaacg
ccagggtttt 13200cccagtcacg acgttgtaaa acgacggcca gtgaattcct gcagcccggg
ggatccgccc 13260actcgaggcg cgccaagctt gcatgcctgc aggctagcct aagtacgtac
tcaaaatgcc 13320aacaaataaa aaaaaagttg ctttaataat gccaaaacaa attaataaaa
cacttacaac 13380accggatttt ttttaattaa aatgtgccat ttaggataaa tagttaatat
ttttaataat 13440tatttaaaaa gccgtatcta ctaaaatgat ttttatttgg ttgaaaatat
taatatgttt 13500aaatcaacac aatctatcaa aattaaacta aaaaaaaaat aagtgtacgt
ggttaacatt 13560agtacagtaa tataagagga aaatgagaaa ttaagaaatt gaaagcgagt
ctaattttta 13620aattatgaac ctgcatatat aaaaggaaag aaagaatcca ggaagaaaag
aaatgaaacc 13680atgcatggtc ccctcgtcat cacgagtttc tgccatttgc aatagaaaca
ctgaaacacc 13740tttctctttg tcacttaatt gagatgccga agccacctca caccatgaac
ttcatgaggt 13800gtagcaccca aggcttccat agccatgcat actgaagaat gtctcaagct
cagcacccta 13860cttctgtgac gtgtccctca ttcaccttcc tctcttccct ataaataacc
acgcctcagg 13920ttctccgctt cacaactcaa acattctctc cattggtcct taaacactca
tcagtcatca 13980ccgcggccat cacaagtttg tacaaaaaag ctgaacgaga aacgtaaaat
gatataaata 14040tcaatatatt aaattagatt ttgcataaaa aacagactac ataatactgt
aaaacacaac 14100atatccagtc atattggcgg ccgcattagg caccccaggc tttacacttt
atgcttccgg 14160ctcgtataat gtgtggattt tgagttagga tccgtcgaga ttttcaggag
ctaaggaagc 14220taaaatggag aaaaaaatca ctggatatac caccgttgat atatcccaat
ggcatcgtaa 14280agaacatttt gaggcatttc agtcagttgc tcaatgtacc tataaccaga
ccgttcagct 14340ggatattacg gcctttttaa agaccgtaaa gaaaaataag cacaagtttt
atccggcctt 14400tattcacatt cttgcccgcc tgatgaatgc tcatccggaa ttccgtatgg
caatgaaaga 14460cggtgagctg gtgatatggg atagtgttca cccttgttac accgttttcc
atgagcaaac 14520tgaaacgttt tcatcgctct ggagtgaata ccacgacgat ttccggcagt
ttctacacat 14580atattcgcaa gatgtggcgt gttacggtga aaacctggcc tatttcccta
aagggtttat 14640tgagaatatg tttttcgtct cagccaatcc ctgggtgagt ttcaccagtt
ttgatttaaa 14700cgtggccaat atggacaact tcttcgcccc cgttttcacc atgggcaaat
attatacgca 14760aggcgacaag gtgctgatgc cgctggcgat tcaggttcat catgccgttt
gtgatggctt 14820ccatgtcggc agaatgctta atgaattaca acagtactgc gatgagtggc
agggcggggc 14880gtaaacgcgt ggatccggct tactaaaagc cagataacag tatgcgtatt
tgcgcgctga 14940tttttgcggt ataagaatat atactgatat gtatacccga agtatgtcaa
aaagaggtat 15000gctatgaagc agcgtattac agtgacagtt gacagcgaca gctatcagtt
gctcaaggca 15060tatatgatgt caatatctcc ggtctggtaa gcacaaccat gcagaatgaa
gcccgtcgtc 15120tgcgtgccga acgctggaaa gcggaaaatc aggaagggat ggctgaggtc
gcccggttta 15180ttgaaatgaa cggctctttt gctgacgaga acaggggctg gtgaaatgca
gtttaaggtt 15240tacacctata aaagagagag ccgttatcgt ctgtttgtgg atgtacagag
tgatattatt 15300gacacgcccg ggcgacggat ggtgatcccc ctggccagtg cacgtctgct
gtcagataaa 15360gtctcccgtg aactttaccc ggtggtgcat atcggggatg aaagctggcg
catgatgacc 15420accgatatgg ccagtgtgcc ggtctccgtt atcggggaag aagtggctga
tctcagccac 15480cgcgaaaatg acatcaaaaa cgccattaac ctgatgttct ggggaatata
aatgtcaggc 15540tcccttatac acagccagtc tgcaggtcga ccatagtgac tggatatgtt
gtgttttaca 15600gcattatgta gtctgttttt tatgcaaaat ctaatttaat atattgatat
ttatatcatt 15660ttacgtttct cgttcagctt tcttgtacaa agtggtgatg gccgcatttc
gcaccaaatc 15720aatgaaagta ataatgaaaa gtctgaataa gaatacttag gcttagatgc
ctttgttact 15780tgtgtaaaat aacttgagtc atgtaccttt ggcggaaaca gaataaataa
aaggtgaaat 15840tccaatgctc tatgtataag ttagtaatac ttaatgtgtt ctacggttgt
ttcaatatca 15900tcaaactcta attgaaactt tagaaccaca aatctcaatc ttttcttaat
gaaatgaaaa 15960atcttaattg taccatgttt atgttaaaca ccttacaatt ggttggagag
gaggaccaac 16020cgatgggaca acattgggag aaagagattc aatggagatt tggataggag
aacaacattc 16080tttttcactt caatacaaga tgagtgcaac actaaggata tgtatgagac
tttcagaagc 16140tacgacaaca tagatgagtg aggtggtgat tcctagcaag aaagacatta
gaggaagcca 16200aaatcgaaca aggaagacat caagggcaag agacaggacc atccatctca
ggaaaaggag 16260ctttgggata gtccgagaag ttgtacaaga aattttttgg agggtgagtg
atgcattgct 16320ggtgacttta actcaatcaa aattgagaaa gaaagaaaag ggagggggct
cacatgtgaa 16380tagaagggaa acgggagaat tttacagttt tgatctaatg ggcatcccag
ctagtggtaa 16440catattcacc atgtttaacc ttcacgtacg tctagaggat ccgtcgacgg
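16490

SEQ ID NO: 10
Length: 49
Type: DNA
Organism: artificial sequence
Other information: Saiff and genomic DNA of lo17849
Sequence: 10
gaaggctcta agctgtgttg taggcttctt agcattcatt tctgtttgc 49

SEQ ID NO: 11
Length: 30
Type: DNA
Organism: artificial sequence
Other information: primer
Sequence: 11
caccatggtt gttgtgtctc ttcttcctcg 30

SEQ ID NO: 12
Length: 23
Type: DNA
Organism: artificial sequence
Other information: primer
Sequence: 12
tcaaattgat ttagtttctc cag 23

SEQ ID NO: 13
Length: 2988
Type: DNA
Organism: artificial sequence
Other information: vector
Sequence: 13
aagggtgggc gcgccgaccc agctttcttg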
tacaaagttg gcattataag aaagcattgc 60ttatcaattt gttgcaacga acaggtcact
atcagtcaaa ataaaatcat tatttgccat 120ccagctgata tcccctatag tgagtcgtat
tacatggtca tagctgtttc ctggcagctc 180tggcccgtgt ctcaaaatct ctgatgttac
attgcacaag ataaaaatat atcatcatga 240acaataaaac tgtctgctta cataaacagt
aatacaaggg gtgttatgag ccatattcaa 300cgggaaacgt cgaggccgcg attaaattcc
aacatggatg ctgatttata tgggtataaa 360tgggctcgcg ataatgtcgg gcaatcaggt
gcgacaatct atcgcttgta tgggaagccc 420gatgcgccag agttgtttct gaaacatggc
aaaggtagcg ttgccaatga tgttacagat 480gagatggtca gactaaactg gctgacggaa
tttatgcctc ttccgaccat caagcatttt 540atccgtactc ctgatgatgc atggttactc
accactgcga tccccggaaa aacagcattc 600caggtattag aagaatatcc tgattcaggt
gaaaatattg ttgatgcgct ggcagtgttc 660ctgcgccggt tgcattcgat tcctgtttgt
aattgtcctt ttaacagcga tcgcgtattt 720cgtctcgctc aggcgcaatc acgaatgaat
aacggtttgg ttgatgcgag tgattttgat 780gacgagcgta atggctggcc tgttgaacaa
gtctggaaag aaatgcataa acttttgcca 840ttctcaccgg attcagtcgt cactcatggt
gatttctcac ttgataacct tatttttgac 900gaggggaaat taataggttg tattgatgtt
ggacgagtcg gaatcgcaga ccgataccag 960gatcttgcca tcctatggaa ctgcctcggt
gagttttctc cttcattaca gaaacggctt 1020tttcaaaaat atggtattga taatcctgat
atgaataaat tgcagtttca tttgatgctc 1080gatgagtttt tctaatcaga attggttaat
tggttgtaac actggcagag cattacgctg 1140acttgacggg acggcgcaag ctcatgacca
aaatccctta acgtgagtta cgcgtcgttc 1200cactgagcgt cagaccccgt agaaaagatc
aaaggatctt cttgagatcc tttttttctg 1260cgcgtaatct gctgcttgca aacaaaaaaa
ccaccgctac cagcggtggt ttgtttgccg 1320gatcaagagc taccaactct ttttccgaag
gtaactggct tcagcagagc gcagatacca 1380aatactgtcc ttctagtgta gccgtagtta
ggccaccact tcaagaactc tgtagcaccg 1440cctacatacc tcgctctgct aatcctgtta
ccagtggctg ctgccagtgg cgataagtcg 1500tgtcttaccg ggttggactc aagacgatag
ttaccggata aggcgcagcg gtcgggctga 1560acggggggtt cgtgcacaca gcccagcttg
gagcgaacga cctacaccga actgagatac 1620ctacagcgtg agcattgaga aagcgccacg
cttcccgaag ggagaaaggc ggacaggtat 1680ccggtaagcg gcagggtcgg aacaggagag
cgcacgaggg agcttccagg gggaaacgcc 1740tggtatcttt atagtcctgt cgggtttcgc
cacctctgac ttgagcgtcg atttttgtga 1800tgctcgtcag gggggcggag cctatggaaa
aacgccagca acgcggcctt tttacggttc 1860ctggcctttt gctggccttt tgctcacatg
ttctttcctg cgttatcccc tgattctgtg 1920gataaccgta ttaccgcctt tgagtgagct
gataccgctc gccgcagccg aacgaccgag 1980cgcagcgagt cagtgagcga ggaagcggaa
gagcgcccaa tacgcaaacc gcctctcccc 2040gcgcgttggc cgattcatta atgcagctgg
cacgacaggt ttcccgactg gaaagcgggc 2100agtgagcgca acgcaattaa tacgcgtacc
gctagccagg aagagtttgt agaaacgcaa 2160aaaggccatc cgtcaggatg gccttctgct
tagtttgatg cctggcagtt tatggcgggc 2220gtcctgcccg ccaccctccg ggccgttgct
tcacaacgtt caaatccgct cccggcggat 2280ttgtcctact caggagagcg ttcaccgaca
aacaacagat aaaacgaaag gcccagtctt 2340ccgactgagc ctttcgtttt atttgatgcc
tggcagttcc ctactctcgc gttaacgcta 2400gcatggatgt tttcccagtc acgacgttgt
aaaacgacgg ccagtcttaa gctcgggccc 2460caaataatga ttttattttg actgatagtg
acctgttcgt tgcaacaaat tgatgagcaa 2520tgctttttta taatgccaac tttgtacaaa
aaagcaggct ccgcggccgc ccccttcacc 2580atggttgttg tgtctcttct tcctcgaatc
tcgatcgtta catcaccggg ttctagcctt 2640cacgatgtgc ttttgagcat gagatttggt
ttgacgcgac atctccctct caaacgatct 2700ttctccaatt attcaatcac ttccgtatct
ccagaacaac agctcaaatc tccggtgacc 2760atggcgacga ccgagagcaa gaatcttgta
gaagcttcca aggaggagac aaacaagaag 2820gagacagaag ataagaagga ggtgggagtt
tcggttcctc caccgccaga gaaaccagag 2880cctggcgatt gttgcggtag cggttgcgtc
cgatgcgttt gggatgttta ttacgatgag 2940ctcgaagatt acaacaagca gctttctgga
gaaactaaat caatttga 2988

SEQ ID NO: 14
Length: 15279
Type: DNA
Organism: artificial sequence
Other information: vector
Sequence: 14
acccagcttt cttgtacaaa gtggtgatgg ccgcatttcg caccaaatca
atgaaagtaa 60taatgaaaag tctgaataag aatacttagg cttagatgcc tttgttactt
gtgtaaaata 120acttgagtca tgtacctttg gcggaaacag aataaataaa aggtgaaatt
ccaatgctct 180atgtataagt tagtaatact taatgtgttc tacggttgtt tcaatatcat
caaactctaa 240ttgaaacttt agaaccacaa atctcaatct tttcttaatg aaatgaaaaa
tcttaattgt 300accatgttta tgttaaacac cttacaattg gttggagagg aggaccaacc
gatgggacaa 360cattgggaga aagagattca atggagattt ggataggaga acaacattct
ttttcacttc 420aatacaagat gagtgcaaca ctaaggatat gtatgagact ttcagaagct
acgacaacat 480agatgagtga ggtggtgatt cctagcaaga aagacattag aggaagccaa
aatcgaacaa 540ggaagacatc aagggcaaga gacaggacca tccatctcag gaaaaggagc
tttgggatag 600tccgagaagt tgtacaagaa attttttgga gggtgagtga tgcattgctg
gtgactttaa 660ctcaatcaaa attgagaaag aaagaaaagg gagggggctc acatgtgaat
agaagggaaa 720cgggagaatt ttacagtttt gatctaatgg gcatcccagc tagtggtaac
atattcacca 780tgtttaacct tcacgtacgt ctagaggatc cgtcgacggc gcgccagatc
ctctagagtc 840gacctgcagg catgcaagct tggcgtaatc atggtcatag ctgtttcctg
tgtgaaattg 900ttatccgctc acaattccac acaacatacg agccggaagc ataaagtgta
aagcctgggg 960tgcctaatga gtgagctaac tcacattaat tgcgttgcgc tcactgcccg
ctttccagtc 1020gggaaacctg tcgtgccagc tgcattaatg aatcggccaa cgcgcgggga
gaggcggttt 1080gcgtattgga tcgatccctg aaagcgacgt tggatgttaa catctacaaa
ttgccttttc 1140ttatcgacca tgtacgtaag cgcttacgtt tttggtggac ccttgaggaa
actggtagct 1200gttgtgggcc tgtggtctca agatggatca ttaatttcca ccttcaccta
cgatgggggg 1260catcgcaccg gtgagtaata ttgtacggct aagagcgaat ttggcctgta
gacctcaatt 1320gcgagctttc taatttcaaa ctattcgggc ctaacttttg gtgtgatgat
gctgactggc 1380aggatatata ccgttgtaat ttgagctcgt gtgaataagt cgctgtgtat
gtttgtttga 1440ttgtttctgt tggagtgcag cccatttcac cggacaagtc ggctagattg
atttagccct 1500gatgaactgc cgaggggaag ccatcttgag cgcggaatgg gaatggattt
cgttgtacaa 1560cgagacgaca gaacacccac gggaccgagc ttcgcgagct tttgtatccg
tggcatcctt 1620ggtccgggcg atttgttcac gtccatgagg cgctctccaa aggaacgcat
attttccggt 1680gcaacctttc cggttcttcc tctactcgac ctcttgaagt cccagcatga
atgttcgacc 1740gctccgcaag cggatctttg gcgcaaccag ccggtttcgc acgtcgattc
tcgcgagcct 1800gcatactttg gcaagattgc tgaatgacgc tgatgcttca tcgcaatctg
cgataatggg 1860gtaagtatcc ggtgaaggcc gcaggtcagg ccgcctgagc actcagtgtc
ttggatgtcc 1920agttccacgg cagctgttgc tcaagcctgc tgatcggagc gtccgcaagg
tcggcgcgga 1980cgtcggcaag ccaggcctgc ggatcgatgt tattgagctt ggcgctcatg
atcagtgtcg 2040ccatgaacgc cgcacgttca gcacaacgat ccgatccggc aaacagccat
gacttcctgc 2100cgagtacata gcctctgagc gttcgttcgg cagcattgtt cgtcaggcaa
atcgggccgt 2160catcgaggaa tgacgtaatg ccatcccatc gcttgagcat gtaatttatc
gcctcggcga 2220cgggagaact gcgcgacaat ttcccccgct cggtttcgag ccaatcatgc
agctcttcgg 2280cgagtgacct tgatcaggcc accgccacga ccgcggaaga cgaacagatg
cctgcgcatc 2340ggatcgcgct tcagcgtctc ttgcaccatc agcgacaaac cgggaaagcc
tttgcgcatg 2400tccgtactta tgtcgccact tgggagggct tcgtctacgt ggccttcgtg
atcgacgtct 2460tcgcccgtcg cattgtcgga tggcgggcga gccggacagc acatgcaggc
tttgtcctcg 2520atgccctcga ggaggctcat catgatcggc gtcccgctca tggcggccta
gtgcatcact 2580cggatcgcgg tgttcaatac gtgtcctttc gctattccga gcggttggca
gaagcaggta 2640tcgagccatc tatcggaagc gtcggcgaca gcacgacaac gccctcgcag
aagcgatcaa 2700cggtctttac aaggccgagg tcattcatcg gcgtggacca tggaggagct
tcgaagcggt 2760cgagttcgct accttggaat ggatagactg gttcaaccac ggcggctttt
gaagcccatc 2820ggcaatatac cgccagccga agacgaggat cagtattacg ccatgctgga
cgaagcagcc 2880atggctgcgc attttaacga aatggcctcc ggcaaacccg gtgcggttca
cttgttgcgt 2940gggaaagttc acgggactcc gcgcacgagc cttcttcgta atagccatat
cgaccgaatt 3000gacctgcagg gggggggggg aaagccacgt tgtgtctcaa aatctctgat
gttacattgc 3060acaagataaa aatatatcat catgaacaat aaaactgtct gcttacataa
acagtaatac 3120aaggggtgtt atgagccata ttcaacggga aacgtcttgc tcgaggccgc
gattaaattc 3180caacatggat gctgatttat atgggtataa atgggctcgc gataatgtcg
ggcaatcagg 3240tgcgacaatc tatcgattgt atgggaagcc cgatgcgcca gagttgtttc
tgaaacatgg 3300caaaggtagc gttgccaatg atgttacaga tgagatggtc agactaaact
ggctgacgga 3360atttatgcct cttccgacca tcaagcattt tatccgtact cctgatgatg
catggttact 3420caccactgcg atccccggga aaacagcatt ccaggtatta gaagaatatc
ctgattcagg 3480tgaaaatatt gttgatgcgc tggcagtgtt cctgcgccgg ttgcattcga
ttcctgtttg 3540taattgtcct tttaacagcg atcgcgtatt tcgtctcgct caggcgcaat
cacgaatgaa 3600taacggtttg gttgatgcga gtgattttga tgacgagcgt aatggctggc
ctgttgaaca 3660agtctggaaa gaaatgcata agcttttgcc attctcaccg gattcagtcg
tcactcatgg 3720tgatttctca cttgataacc ttatttttga cgaggggaaa ttaataggtt
gtattgatgt 3780tggacgagtc ggaatcgcag accgatacca ggatcttgcc atcctatgga
actgcctcgg 3840tgagttttct ccttcattac agaaacggct ttttcaaaaa tatggtattg
ataatcctga 3900tatgaataaa ttgcagtttc atttgatgct cgatgagttt ttctaatcag
aattggttaa 3960ttggttgtaa cactggcaga gcattacgct gacttgacgg gacggcggct
ttgttgaata 4020aatcgaactt ttgctgagtt gaaggatcag atcacgcatc ttcccgacaa
cgcagaccgt 4080tccgtggcaa agcaaaagtt caaaatcacc aactggtcca cctacaacaa
agctctcatc 4140aaccgtggct ccctcacttt ctggctggat gatggggcga ttcaggcctg
gtatgagtca 4200gcaacacctt cttcacgagg cagacctcag cgcccccccc cccctgcagg
tcttttccaa 4260tgatgagcac ttttaaagtt ctgctatgtg gcgcggtatt atcccgtgtt
gacgccgggc 4320aagagcaact cggtcgccgc atacactatt ctcagaatga cttggttgag
tactcaccag 4380tcacagaaaa gcatcttacg gatggcatga cagtaagaga attatgcagt
gctgccataa 4440ccatgagtga taacactgcg gccaacttac ttctgacaac gatcggagga
ccgaaggagc 4500taaccgcttt tttgcacaac atgggggatc atgtaactcg ccttgatcgt
tgggaaccgg 4560agctgaatga agccatacca aacgacgagc gtgacaccac gatgcctgta
gcaatggcaa 4620caacgttgcg caaactatta actggcgaac tacttactct agcttcccgg
caacaattaa 4680tagactggat ggaggcggat aaagttgcag gaccacttct gcgctcggcc
cttccggctg 4740gctggtttat tgctgataaa tctggagccg gtgagcgtgg gtctcgcggt
atcattgcag 4800cactggggcc agatggtaag ccctcccgta tcgtagttat ctacacgacg
gggagtcagg 4860caactatgga tgaacgaaat agacagatcg ctgagatagg tgcctcactg
attaagcatt 4920ggtaactgtc agaccaagtt tactcatata tactttagat tgatttaaaa
cttcattttt 4980aatttaaaag gatctaggtg aagatccttt ttgataatct catgaccaaa
atcccttaac 5040gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga
tcttcttgag 5100atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg
ctaccagcgg 5160tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact
ggcttcagca 5220gagcgcagat accaaatact gtccttctag tgtagccgta gttaggccac
cacttcaaga 5280actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg
gctgctgcca 5340gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg
gataaggcgc 5400agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga
acgacctaca 5460ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc
gaagggagaa 5520aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg
agggagcttc 5580cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc
tgacttgagc 5640gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc
agcaacgcgg 5700cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt
cctgcgttat 5760cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc
gctcgccgca 5820gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc
ctgatgcggt 5880attttctcct tacgcatctg tgcggtattt cacaccgcat atggtgcact
ctcagtacaa 5940tctgctctga tgccgcatag ttaagccagt atacactccg ctatcgctac
gtgactgggt 6000catggctgcg ccccgacacc cgccaacacc cgctgacgcg ccctgacggg
cttgtctgct 6060cccggcatcc gcttacagac aagctgtgac cgtctccggg agctgcatgt
gtcagaggtt 6120ttcaccgtca tcaccgaaac gcgcgaggca gggtgccttg atgtgggcgc
cggcggtcga 6180gtggcgacgg cgcggcttgt ccgcgccctg gtagattgcc tggccgtagg
ccagccattt 6240ttgagcggcc agcggccgcg ataggccgac gcgaagcggc ggggcgtagg
gagcgcagcg 6300accgaagggt aggcgctttt tgcagctctt cggctgtgcg ctggccagac
agttatgcac 6360aggccaggcg ggttttaaga gttttaataa gttttaaaga gttttaggcg
gaaaaatcgc 6420cttttttctc ttttatatca gtcacttaca tgtgtgaccg gttcccaatg
tacggctttg 6480ggttcccaat gtacgggttc cggttcccaa tgtacggctt tgggttccca
atgtacgtgc 6540tatccacagg aaagagacct tttcgacctt tttcccctgc tagggcaatt
tgccctagca 6600tctgctccgt acattaggaa ccggcggatg cttcgccctc gatcaggttg
cggtagcgca 6660tgactaggat cgggccagcc tgccccgcct cctccttcaa atcgtactcc
ggcaggtcat 6720ttgacccgat cagcttgcgc acggtgaaac agaacttctt gaactctccg
gcgctgccac 6780tgcgttcgta gatcgtcttg aacaaccatc tggcttctgc cttgcctgcg
gcgcggcgtg 6840ccaggcggta gagaaaacgg ccgatgccgg gatcgatcaa aaagtaatcg
gggtgaaccg 6900tcagcacgtc cgggttcttg ccttctgtga tctcgcggta catccaatca
gctagctcga 6960tctcgatgta ctccggccgc ccggtttcgc tctttacgat cttgtagcgg
ctaatcaagg 7020cttcaccctc ggataccgtc accaggcggc cgttcttggc cttcttcgta
cgctgcatgg 7080caacgtgcgt ggtgtttaac cgaatgcagg tttctaccag gtcgtctttc
tgctttccgc 7140catcggctcg ccggcagaac ttgagtacgt ccgcaacgtg tggacggaac
acgcggccgg 7200gcttgtctcc cttcccttcc cggtatcggt tcatggattc ggttagatgg
gaaaccgcca 7260tcagtaccag gtcgtaatcc cacacactgg ccatgccggc cggccctgcg
gaaacctcta 7320cgtgcccgtc tggaagctcg tagcggatca cctcgccagc tcgtcggtca
cgcttcgaca 7380gacggaaaac ggccacgtcc atgatgctgc gactatcgcg ggtgcccacg
tcatagagca 7440tcggaacgaa aaaatctggt tgctcgtcgc ccttgggcgg cttcctaatc
gacggcgcac 7500cggctgccgg cggttgccgg gattctttgc ggattcgatc agcggccgct
tgccacgatt 7560caccggggcg tgcttctgcc tcgatgcgtt gccgctgggc ggcctgcgcg
gccttcaact 7620tctccaccag gtcatcaccc agcgccgcgc cgatttgtac cgggccggat
ggtttgcgac 7680cgctcacgcc gattcctcgg gcttgggggt tccagtgcca ttgcagggcc
ggcagacaac 7740ccagccgctt acgcctggcc aaccgcccgt tcctccacac atggggcatt
ccacggcgtc 7800ggtgcctggt tgttcttgat tttccatgcc gcctccttta gccgctaaaa
ttcatctact 7860catttattca tttgctcatt tactctggta gctgcgcgat gtattcagat
agcagctcgg 7920taatggtctt gccttggcgt accgcgtaca tcttcagctt ggtgtgatcc
tccgccggca 7980actgaaagtt gacccgcttc atggctggcg tgtctgccag gctggccaac
gttgcagcct 8040tgctgctgcg tgcgctcgga cggccggcac ttagcgtgtt tgtgcttttg
ctcattttct 8100ctttacctca ttaactcaaa tgagttttga tttaatttca gcggccagcg
cctggacctc 8160gcgggcagcg tcgccctcgg gttctgattc aagaacggtt gtgccggcgg
cggcagtgcc 8220tgggtagctc acgcgctgcg tgatacggga ctcaagaatg ggcagctcgt
acccggccag 8280cgcctcggca acctcaccgc cgatgcgcgt gcctttgatc gcccgcgaca
cgacaaaggc 8340cgcttgtagc cttccatccg tgacctcaat gcgctgctta accagctcca
ccaggtcggc 8400ggtggcccat atgtcgtaag ggcttggctg caccggaatc agcacgaagt
cggctgcctt 8460gatcgcggac acagccaagt ccgccgcctg gggcgctccg tcgatcacta
cgaagtcgcg 8520ccggccgatg gccttcacgt cgcggtcaat cgtcgggcgg tcgatgccga
caacggttag 8580cggttgatct tcccgcacgg ccgcccaatc gcgggcactg ccctggggat
cggaatcgac 8640taacagaaca tcggccccgg cgagttgcag ggcgcgggct agatgggttg
cgatggtcgt 8700cttgcctgac ccgcctttct ggttaagtac agcgataact tcatgcgttc
ccttgcgtat 8760ttgtttattt actcatcgca tcatatacgc agcgaccgca tgacgcaagc
tgttttactc 8820aaatacacat caccttttta gacggcggcg ctcggtttct tcagcggcca
agctggccgg 8880ccaggccgcc agcttggcat cagacaaacc ggccaggatt tcatgcagcc
gcacggttga 8940gacgtgcgcg ggcggctcga acacgtaccc ggccgcgatc atctccgcct
cgatctcttc 9000ggtaatgaaa aacggttcgt cctggccgtc ctggtgcggt ttcatgcttg
ttcctcttgg 9060cgttcattct cggcggccgc cagggcgtcg gcctcggtca atgcgtcctc
acggaaggca 9120ccgcgccgcc tggcctcggt gggcgtcact tcctcgctgc gctcaagtgc
gcggtacagg 9180gtcgagcgat gcacgccaag cagtgcagcc gcctctttca cggtgcggcc
ttcctggtcg 9240atcagctcgc gggcgtgcgc gatctgtgcc ggggtgaggg tagggcgggg
gccaaacttc 9300acgcctcggg ccttggcggc ctcgcgcccg ctccgggtgc ggtcgatgat
tagggaacgc 9360tcgaactcgg caatgccggc gaacacggtc aacaccatgc ggccggccgg
cgtggtggtg 9420tcggcccacg gctctgccag gctacgcagg cccgcgccgg cctcctggat
gcgctcggca 9480atgtccagta ggtcgcgggt gctgcgggcc aggcggtcta gcctggtcac
tgtcacaacg 9540tcgccagggc gtaggtggtc aagcatcctg gccagctccg ggcggtcgcg
cctggtgccg 9600gtgatcttct cggaaaacag cttggtgcag ccggccgcgt gcagttcggc
ccgttggttg 9660gtcaagtcct ggtcgtcggt gctgacgcgg gcatagccca gcaggccagc
ggcggcgctc 9720ttgttcatgg cgtaatgtct ccggttctag tcgcaagtat tctactttat
gcgactaaaa 9780cacgcgacaa gaaaacgcca ggaaaagggc agggcggcag cctgtcgcgt
aacttaggac 9840ttgtgcgaca tgtcgttttc agaagacggc tgcactgaac gtcagaagcc
gactgcacta 9900tagcagcgga ggggttggac cacaggacgg gtgtggtcgc catgatcgcg
tagtcgatag 9960tggctccaag tagcgaagcg agcaggactg ggcggcggcc aaagcggtcg
gacagtgctc 10020cgagaacggg tgcgcataga aattgcatca acgcatatag cgctagcagc
acgccatagt 10080gactggcgat gctgtcggaa tggacgatat cccgcaagag gcccggcagt
accggcataa 10140ccaagcctat gcctacagca tccagggtga cggtgccgag gatgacgatg
agcgcattgt 10200tagatttcat acacggtgcc tgactgcgtt agcaatttaa ctgtgataaa
ctaccgcatt 10260aaagctagct tgcttggtcg ttccgcgtga acgtcggctc gattgtacct
gcgttcaaat 10320actttgcgat cgtgttgcgc gcctgcccgg tgcgtcggct gatctcacgg
atcgactgct 10380tctctcgcaa cgccatccga cggatgatgt ttaaaagtcc catgtggatc
actccgttgc 10440cccgtcgctc accgtgttgg ggggaaggtg cacatggctc agttctcaat
ggaaattatc 10500tgcctaaccg gctcagttct gcgtagaaac caacatgcaa gctccaccgg
gtgcaaagcg 10560gcagcggcgg caggatatat tcaattgtaa atggcttcat gtccgggaaa
tctacatgga 10620tcagcaatga gtatgatggt caatatggag aaaaagaaag agtaattacc
aatttttttt 10680caattcaaaa atgtagatgt ccgcagcgtt attataaaat gaaagtacat
tttgataaaa 10740cgacaaatta cgatccgtcg tatttatagg cgaaagcaat aaacaaatta
ttctaattcg 10800gaaatcttta tttcgacgtg tctacattca cgtccaaatg ggggcttaga
tgagaaactt 10860cacgatcgat gccttgattt cgccattccc agatacccat ttcatcttca
gattggtctg 10920agattatgcg aaaatataca ctcatataca taaatactga cagtttgagc
taccaattca 10980gtgtagccca ttacctcaca taattcactc aaatgctagg cagtctgtca
actcggcgtc 11040aatttgtcgg ccactatacg atagttgcgc aaattttcaa agtcctggcc
taacatcaca 11100cctctgtcgg cggcgggtcc catttgtgat aaatccacca tatcgaatta
attcagactc 11160ctttgcccca gagatcacaa tggacgactt cctctatctc tacgatctag
tcaggaagtt 11220cgacggagaa ggtgacgata ccatgttcac cactgataat gagaagatta
gccttttcaa 11280tttcagaaag aatgctaacc cacagatggt tagagaggct tacgcagcag
gtctcatcaa 11340gacgatctac ccgagcaata atctccagga gatcaaatac cttcccaaga
aggttaaaga 11400tgcagtcaaa agattcagga ctaactgcat caagaacaca gagaaagata
tatttctcaa 11460gatcagaagt actattccag tatggacgat tcaaggcttg cttcacaaac
caaggcaagt 11520aatagagatt ggagtctcta aaaaggtagt tcccactgaa tcaaaggcca
tggagtcaaa 11580gattcaaata gaggacctaa cagaactcgc cgtaaagact ggcgaacagt
tcatacagag 11640tctcttacga ctcaatgaca agaagaaaat cttcgtcaac atggtggagc
acgacacgct 11700tgtctactcc aaaaatatca aagatacagt ctcagaagac caaagggcaa
ttgagacttt 11760tcaacaaagg gtaatatccg gaaacctcct cggattccat tgcccagcta
tctgtcactt 11820tattgtgaag atagtggaaa aggaaggtgg ctcctacaaa tgccatcatt
gcgataaagg 11880aaaggccatc gttgaagatg cctctgccga cagtggtccc aaagatggac
ccccacccac 11940gaggagcatc gtggaaaaag aagacgttcc aaccacgtct tcaaagcaag
tggattgatg 12000tgatatctcc actgacgtaa gggatgacgc acaatcccac tatccttcgc
aagacccttc 12060ctctatataa ggaagttcat ttcatttgga gaggacacgc tgaaatcacc
agtctccaag 12120cttgcgggga tcgtttcgca tgattgaaca agatggattg cacgcaggtt
ctccggccgc 12180ttgggtggag aggctattcg gctatgactg ggcacaacag acaatcggct
gctctgatgc 12240cgccgtgttc cggctgtcag cgcaggggcg cccggttctt tttgtcaaga
ccgacctgtc 12300cggtgccctg aatgaactgc aggacgaggc agcgcggcta tcgtggctgg
ccacgacggg 12360cgttccttgc gcagctgtgc tcgacgttgt cactgaagcg ggaagggact
ggctgctatt 12420gggcgaagtg ccggggcagg atctcctgtc atctcacctt gctcctgccg
agaaagtatc 12480catcatggct gatgcaatgc ggcggctgca tacgcttgat ccggctacct
gcccattcga 12540ccaccaagcg aaacatcgca tcgagcgagc acgtactcgg atggaagccg
gtcttgtcga 12600tcaggatgat ctggacgaag agcatcaggg gctcgcgcca gccgaactgt
tcgccaggct 12660caaggcgcgc atgcccgacg gcgaggatct cgtcgtgacc catggcgatg
cctgcttgcc 12720gaatatcatg gtggaaaatg gccgcttttc tggattcatc gactgtggcc
ggctgggtgt 12780ggcggaccgc tatcaggaca tagcgttggc tacccgtgat attgctgaag
agcttggcgg 12840cgaatgggct gaccgcttcc tcgtgcttta cggtatcgcc gctcccgatt
cgcagcgcat 12900cgccttctat cgccttcttg acgagttctt ctgagcggga ctctggggtt
cgaaatgacc 12960gaccaagcga cgcccaacct gccatcacga gatttcgatt ccaccgccgc
cttctatgaa 13020aggttgggct tcggaatcgt tttccgggac gccggctgga tgatcctcca
gcgcggggat 13080ctcatgctgg agttcttcgc ccaccccgga tcgatccaac acttacgttt
gcaacgtcca 13140agagcaaata gaccacgaac gccggaaggt tgccgcagcg tgtggattgc
gtctcaattc 13200tctcttgcag gaatgcaatg atgaatatga tactgactat gaaactttga
gggaatactg 13260cctagcaccg tcacctcata acgtgcatca tgcatgccct gacaacatgg
aacatcgcta 13320tttttctgaa gaattatgct cgttggagga tgtcgcggca attgcagcta
ttgccaacat 13380cgaactaccc ctcacgcatg cattcatcaa tattattcat gcggggaaag
gcaagattaa 13440tccaactggc aaatcatcca gcgtgattgg taacttcagt tccagcgact
tgattcgttt 13500tggtgctacc cacgttttca ataaggacga gatggtggag taaagaagga
gtgcgtcgaa 13560gcagatcgtt caaacatttg gcaataaagt ttcttaagat tgaatcctgt
tgccggtctt 13620gcgatgatta tcatataatt tctgttgaat tacgttaagc atgtaataat
taacatgtaa 13680tgcatgacgt tatttatgag atgggttttt atgattagag tcccgcaatt
atacatttaa 13740tacgcgatag aaaacaaaat atagcgcgca aactaggata aattatcgcg
cgcggtgtca 13800tctatgttac tagatcgatc aaacttcggt actgtgtaat gacgatgagc
aatcgagagg 13860ctgactaaca aaaggtacat cgcgatggat cgatccattc gccattcagg
ctgcgcaact 13920gttgggaagg gcgatcggtg cgggcctctt cgctattacg ccagctggcg
aaagggggat 13980gtgctgcaag gcgattaagt tgggtaacgc cagggttttc ccagtcacga
cgttgtaaaa 14040cgacggccag tgaattcctg cagcccgggg gatccgccca ctcgaggcgc
gccaagcttg 14100catgcctgca ggctagccta agtacgtact caaaatgcca acaaataaaa
aaaaagttgc 14160tttaataatg ccaaaacaaa ttaataaaac acttacaaca ccggattttt
tttaattaaa 14220atgtgccatt taggataaat agttaatatt tttaataatt atttaaaaag
ccgtatctac 14280taaaatgatt tttatttggt tgaaaatatt aatatgttta aatcaacaca
atctatcaaa 14340attaaactaa aaaaaaaata agtgtacgtg gttaacatta gtacagtaat
ataagaggaa 14400aatgagaaat taagaaattg aaagcgagtc taatttttaa attatgaacc
tgcatatata 14460aaaggaaaga aagaatccag gaagaaaaga aatgaaacca tgcatggtcc
cctcgtcatc 14520acgagtttct gccatttgca atagaaacac tgaaacacct ttctctttgt
cacttaattg 14580agatgccgaa gccacctcac accatgaact tcatgaggtg tagcacccaa
ggcttccata 14640gccatgcata ctgaagaatg tctcaagctc agcaccctac ttctgtgacg
tgtccctcat 14700tcaccttcct ctcttcccta taaataacca cgcctcaggt tctccgcttc
acaactcaaa 14760cattctctcc attggtcctt aaacactcat cagtcatcac cgcggccatc
acaagtttgt 14820acaaaaaagc aggctccgcg gccgccccct tcaccatggt tgttgtgtct
cttcttcctc 14880gaatctcgat cgttacatca ccgggttcta gccttcacga tgtgcttttg
agcatgagat 14940ttggtttgac gcgacatctc cctctcaaac gatctttctc caattattca
atcacttccg 15000tatctccaga acaacagctc aaatctccgg tgaccatggc gacgaccgag
agcaagaatc 15060ttgtagaagc ttccaaggag gagacaaaca agaaggagac agaagataag
aaggaggtgg 15120gagtttcggt tcctccaccg ccagagaaac cagagcctgg cgattgttgc
ggtagcggtt 15180gcgtccgatg cgtttgggat gtttattacg atgagctcga agattacaac
aagcagcttt 15240ctggagaaac taaatcaatt tgaaagggtg ggcgcgccg
15279

SEQ ID NO: 15; length: 17273; type: DNA; organism: artificial sequence; description: vector
cgcgccagat
cctctagagt cgacctgcag gcatgcaagc ttggcgtaat catggtcata 60gctgtttcct
gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag 120cataaagtgt
aaagcctggg gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg 180ctcactgccc
gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca 240acgcgcgggg
agaggcggtt tgcgtattgg atcgatccct gaaagcgacg ttggatgtta 300acatctacaa
attgcctttt cttatcgacc atgtacgtaa gcgcttacgt ttttggtgga 360cccttgagga
aactggtagc tgttgtgggc ctgtggtctc aagatggatc attaatttcc 420accttcacct
acgatggggg gcatcgcacc ggtgagtaat attgtacggc taagagcgaa 480tttggcctgt
agacctcaat tgcgagcttt ctaatttcaa actattcggg cctaactttt 540ggtgtgatga
tgctgactgg caggatatat accgttgtaa tttgagctcg tgtgaataag 600tcgctgtgta
tgtttgtttg attgtttctg ttggagtgca gcccatttca ccggacaagt 660cggctagatt
gatttagccc tgatgaactg ccgaggggaa gccatcttga gcgcggaatg 720ggaatggatt
tcgttgtaca acgagacgac agaacaccca cgggaccgag cttcgcgagc 780ttttgtatcc
gtggcatcct tggtccgggc gatttgttca cgtccatgag gcgctctcca 840aaggaacgca
tattttccgg tgcaaccttt ccggttcttc ctctactcga cctcttgaag 900tcccagcatg
aatgttcgac cgctccgcaa gcggatcttt ggcgcaacca gccggtttcg 960cacgtcgatt
ctcgcgagcc tgcatacttt ggcaagattg ctgaatgacg ctgatgcttc 1020atcgcaatct
gcgataatgg ggtaagtatc cggtgaaggc cgcaggtcag gccgcctgag 1080cactcagtgt
cttggatgtc cagttccacg gcagctgttg ctcaagcctg ctgatcggag 1140cgtccgcaag
gtcggcgcgg acgtcggcaa gccaggcctg cggatcgatg ttattgagct 1200tggcgctcat
gatcagtgtc gccatgaacg ccgcacgttc agcacaacga tccgatccgg 1260caaacagcca
tgacttcctg ccgagtacat agcctctgag cgttcgttcg gcagcattgt 1320tcgtcaggca
aatcgggccg tcatcgagga atgacgtaat gccatcccat cgcttgagca 1380tgtaatttat
cgcctcggcg acgggagaac tgcgcgacaa tttcccccgc tcggtttcga 1440gccaatcatg
cagctcttcg gcgagtgacc ttgatcaggc caccgccacg accgcggaag 1500acgaacagat
gcctgcgcat cggatcgcgc ttcagcgtct cttgcaccat cagcgacaaa 1560ccgggaaagc
ctttgcgcat gtccgtactt atgtcgccac ttgggagggc ttcgtctacg 1620tggccttcgt
gatcgacgtc ttcgcccgtc gcattgtcgg atggcgggcg agccggacag 1680cacatgcagg
ctttgtcctc gatgccctcg aggaggctca tcatgatcgg cgtcccgctc 1740atggcggcct
agtgcatcac tcggatcgcg gtgttcaata cgtgtccttt cgctattccg 1800agcggttggc
agaagcaggt atcgagccat ctatcggaag cgtcggcgac agcacgacaa 1860cgccctcgca
gaagcgatca acggtcttta caaggccgag gtcattcatc ggcgtggacc 1920atggaggagc
ttcgaagcgg tcgagttcgc taccttggaa tggatagact ggttcaacca 1980cggcggcttt
tgaagcccat cggcaatata ccgccagccg aagacgagga tcagtattac 2040gccatgctgg
acgaagcagc catggctgcg cattttaacg aaatggcctc cggcaaaccc 2100ggtgcggttc
acttgttgcg tgggaaagtt cacgggactc cgcgcacgag ccttcttcgt 2160aatagccata
tcgaccgaat tgacctgcag gggggggggg gaaagccacg ttgtgtctca 2220aaatctctga
tgttacattg cacaagataa aaatatatca tcatgaacaa taaaactgtc 2280tgcttacata
aacagtaata caaggggtgt tatgagccat attcaacggg aaacgtcttg 2340ctcgaggccg
cgattaaatt ccaacatgga tgctgattta tatgggtata aatgggctcg 2400cgataatgtc
gggcaatcag gtgcgacaat ctatcgattg tatgggaagc ccgatgcgcc 2460agagttgttt
ctgaaacatg gcaaaggtag cgttgccaat gatgttacag atgagatggt 2520cagactaaac
tggctgacgg aatttatgcc tcttccgacc atcaagcatt ttatccgtac 2580tcctgatgat
gcatggttac tcaccactgc gatccccggg aaaacagcat tccaggtatt 2640agaagaatat
cctgattcag gtgaaaatat tgttgatgcg ctggcagtgt tcctgcgccg 2700gttgcattcg
attcctgttt gtaattgtcc ttttaacagc gatcgcgtat ttcgtctcgc 2760tcaggcgcaa
tcacgaatga ataacggttt ggttgatgcg agtgattttg atgacgagcg 2820taatggctgg
cctgttgaac aagtctggaa agaaatgcat aagcttttgc cattctcacc 2880ggattcagtc
gtcactcatg gtgatttctc acttgataac cttatttttg acgaggggaa 2940attaataggt
tgtattgatg ttggacgagt cggaatcgca gaccgatacc aggatcttgc 3000catcctatgg
aactgcctcg gtgagttttc tccttcatta cagaaacggc tttttcaaaa 3060atatggtatt
gataatcctg atatgaataa attgcagttt catttgatgc tcgatgagtt 3120tttctaatca
gaattggtta attggttgta acactggcag agcattacgc tgacttgacg 3180ggacggcggc
tttgttgaat aaatcgaact tttgctgagt tgaaggatca gatcacgcat 3240cttcccgaca
acgcagaccg ttccgtggca aagcaaaagt tcaaaatcac caactggtcc 3300acctacaaca
aagctctcat caaccgtggc tccctcactt tctggctgga tgatggggcg 3360attcaggcct
ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgccccccc 3420ccccctgcag
gtcttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat 3480tatcccgtgt
tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg 3540acttggttga
gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag 3600aattatgcag
tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa 3660cgatcggagg
accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc 3720gccttgatcg
ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca 3780cgatgcctgt
agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc 3840tagcttcccg
gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc 3900tgcgctcggc
ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg 3960ggtctcgcgg
tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta 4020tctacacgac
ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag 4080gtgcctcact
gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga 4140ttgatttaaa
acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc 4200tcatgaccaa
aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa 4260agatcaaagg
atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa 4320aaaaaccacc
gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc 4380cgaaggtaac
tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt 4440agttaggcca
ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc 4500tgttaccagt
ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac 4560gatagttacc
ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc acacagccca 4620gcttggagcg
aacgacctac accgaactga gatacctaca gcgtgagcta tgagaaagcg 4680ccacgcttcc
cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag 4740gagagcgcac
gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt 4800ttcgccacct
ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat 4860ggaaaaacgc
cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc 4920acatgttctt
tcctgcgtta tcccctgatt ctgtggataa ccgtattacc gcctttgagt 4980gagctgatac
cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg agcgaggaag 5040cggaagagcg
cctgatgcgg tattttctcc ttacgcatct gtgcggtatt tcacaccgca 5100tatggtgcac
tctcagtaca atctgctctg atgccgcata gttaagccag tatacactcc 5160gctatcgcta
cgtgactggg tcatggctgc gccccgacac ccgccaacac ccgctgacgc 5220gccctgacgg
gcttgtctgc tcccggcatc cgcttacaga caagctgtga ccgtctccgg 5280gagctgcatg
tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgaggc agggtgcctt 5340gatgtgggcg
ccggcggtcg agtggcgacg gcgcggcttg tccgcgccct ggtagattgc 5400ctggccgtag
gccagccatt tttgagcggc cagcggccgc gataggccga cgcgaagcgg 5460cggggcgtag
ggagcgcagc gaccgaaggg taggcgcttt ttgcagctct tcggctgtgc 5520gctggccaga
cagttatgca caggccaggc gggttttaag agttttaata agttttaaag 5580agttttaggc
ggaaaaatcg ccttttttct cttttatatc agtcacttac atgtgtgacc 5640ggttcccaat
gtacggcttt gggttcccaa tgtacgggtt ccggttccca atgtacggct 5700ttgggttccc
aatgtacgtg ctatccacag gaaagagacc ttttcgacct ttttcccctg 5760ctagggcaat
ttgccctagc atctgctccg tacattagga accggcggat gcttcgccct 5820cgatcaggtt
gcggtagcgc atgactagga tcgggccagc ctgccccgcc tcctccttca 5880aatcgtactc
cggcaggtca tttgacccga tcagcttgcg cacggtgaaa cagaacttct 5940tgaactctcc
ggcgctgcca ctgcgttcgt agatcgtctt gaacaaccat ctggcttctg 6000ccttgcctgc
ggcgcggcgt gccaggcggt agagaaaacg gccgatgccg ggatcgatca 6060aaaagtaatc
ggggtgaacc gtcagcacgt ccgggttctt gccttctgtg atctcgcggt 6120acatccaatc
agctagctcg atctcgatgt actccggccg cccggtttcg ctctttacga 6180tcttgtagcg
gctaatcaag gcttcaccct cggataccgt caccaggcgg ccgttcttgg 6240ccttcttcgt
acgctgcatg gcaacgtgcg tggtgtttaa ccgaatgcag gtttctacca 6300ggtcgtcttt
ctgctttccg ccatcggctc gccggcagaa cttgagtacg tccgcaacgt 6360gtggacggaa
cacgcggccg ggcttgtctc ccttcccttc ccggtatcgg ttcatggatt 6420cggttagatg
ggaaaccgcc atcagtacca ggtcgtaatc ccacacactg gccatgccgg 6480ccggccctgc
ggaaacctct acgtgcccgt ctggaagctc gtagcggatc acctcgccag 6540ctcgtcggtc
acgcttcgac agacggaaaa cggccacgtc catgatgctg cgactatcgc 6600gggtgcccac
gtcatagagc atcggaacga aaaaatctgg ttgctcgtcg cccttgggcg 6660gcttcctaat
cgacggcgca ccggctgccg gcggttgccg ggattctttg cggattcgat 6720cagcggccgc
ttgccacgat tcaccggggc gtgcttctgc ctcgatgcgt tgccgctggg 6780cggcctgcgc
ggccttcaac ttctccacca ggtcatcacc cagcgccgcg ccgatttgta 6840ccgggccgga
tggtttgcga ccgctcacgc cgattcctcg ggcttggggg ttccagtgcc 6900attgcagggc
cggcagacaa cccagccgct tacgcctggc caaccgcccg ttcctccaca 6960catggggcat
tccacggcgt cggtgcctgg ttgttcttga ttttccatgc cgcctccttt 7020agccgctaaa
attcatctac tcatttattc atttgctcat ttactctggt agctgcgcga 7080tgtattcaga
tagcagctcg gtaatggtct tgccttggcg taccgcgtac atcttcagct 7140tggtgtgatc
ctccgccggc aactgaaagt tgacccgctt catggctggc gtgtctgcca 7200ggctggccaa
cgttgcagcc ttgctgctgc gtgcgctcgg acggccggca cttagcgtgt 7260ttgtgctttt
gctcattttc tctttacctc attaactcaa atgagttttg atttaatttc 7320agcggccagc
gcctggacct cgcgggcagc gtcgccctcg ggttctgatt caagaacggt 7380tgtgccggcg
gcggcagtgc ctgggtagct cacgcgctgc gtgatacggg actcaagaat 7440gggcagctcg
tacccggcca gcgcctcggc aacctcaccg ccgatgcgcg tgcctttgat 7500cgcccgcgac
acgacaaagg ccgcttgtag ccttccatcc gtgacctcaa tgcgctgctt 7560aaccagctcc
accaggtcgg cggtggccca tatgtcgtaa gggcttggct gcaccggaat 7620cagcacgaag
tcggctgcct tgatcgcgga cacagccaag tccgccgcct ggggcgctcc 7680gtcgatcact
acgaagtcgc gccggccgat ggccttcacg tcgcggtcaa tcgtcgggcg 7740gtcgatgccg
acaacggtta gcggttgatc ttcccgcacg gccgcccaat cgcgggcact 7800gccctgggga
tcggaatcga ctaacagaac atcggccccg gcgagttgca gggcgcgggc 7860tagatgggtt
gcgatggtcg tcttgcctga cccgcctttc tggttaagta cagcgataac 7920ttcatgcgtt
cccttgcgta tttgtttatt tactcatcgc atcatatacg cagcgaccgc 7980atgacgcaag
ctgttttact caaatacaca tcaccttttt agacggcggc gctcggtttc 8040ttcagcggcc
aagctggccg gccaggccgc cagcttggca tcagacaaac cggccaggat 8100ttcatgcagc
cgcacggttg agacgtgcgc gggcggctcg aacacgtacc cggccgcgat 8160catctccgcc
tcgatctctt cggtaatgaa aaacggttcg tcctggccgt cctggtgcgg 8220tttcatgctt
gttcctcttg gcgttcattc tcggcggccg ccagggcgtc ggcctcggtc 8280aatgcgtcct
cacggaaggc accgcgccgc ctggcctcgg tgggcgtcac ttcctcgctg 8340cgctcaagtg
cgcggtacag ggtcgagcga tgcacgccaa gcagtgcagc cgcctctttc 8400acggtgcggc
cttcctggtc gatcagctcg cgggcgtgcg cgatctgtgc cggggtgagg 8460gtagggcggg
ggccaaactt cacgcctcgg gccttggcgg cctcgcgccc gctccgggtg 8520cggtcgatga
ttagggaacg ctcgaactcg gcaatgccgg cgaacacggt caacaccatg 8580cggccggccg
gcgtggtggt gtcggcccac ggctctgcca ggctacgcag gcccgcgccg 8640gcctcctgga
tgcgctcggc aatgtccagt aggtcgcggg tgctgcgggc caggcggtct 8700agcctggtca
ctgtcacaac gtcgccaggg cgtaggtggt caagcatcct ggccagctcc 8760gggcggtcgc
gcctggtgcc ggtgatcttc tcggaaaaca gcttggtgca gccggccgcg 8820tgcagttcgg
cccgttggtt ggtcaagtcc tggtcgtcgg tgctgacgcg ggcatagccc 8880agcaggccag
cggcggcgct cttgttcatg gcgtaatgtc tccggttcta gtcgcaagta 8940ttctacttta
tgcgactaaa acacgcgaca agaaaacgcc aggaaaaggg cagggcggca 9000gcctgtcgcg
taacttagga cttgtgcgac atgtcgtttt cagaagacgg ctgcactgaa 9060cgtcagaagc
cgactgcact atagcagcgg aggggttgga ccacaggacg ggtgtggtcg 9120ccatgatcgc
gtagtcgata gtggctccaa gtagcgaagc gagcaggact gggcggcggc 9180caaagcggtc
ggacagtgct ccgagaacgg gtgcgcatag aaattgcatc aacgcatata 9240gcgctagcag
cacgccatag tgactggcga tgctgtcgga atggacgata tcccgcaaga 9300ggcccggcag
taccggcata accaagccta tgcctacagc atccagggtg acggtgccga 9360ggatgacgat
gagcgcattg ttagatttca tacacggtgc ctgactgcgt tagcaattta 9420actgtgataa
actaccgcat taaagctagc ttgcttggtc gttccgcgtg aacgtcggct 9480cgattgtacc
tgcgttcaaa tactttgcga tcgtgttgcg cgcctgcccg gtgcgtcggc 9540tgatctcacg
gatcgactgc ttctctcgca acgccatccg acggatgatg tttaaaagtc 9600ccatgtggat
cactccgttg ccccgtcgct caccgtgttg gggggaaggt gcacatggct 9660cagttctcaa
tggaaattat ctgcctaacc ggctcagttc tgcgtagaaa ccaacatgca 9720agctccaccg
ggtgcaaagc ggcagcggcg gcaggatata ttcaattgta aatggcttca 9780tgtccgggaa
atctacatgg atcagcaatg agtatgatgg tcaatatgga gaaaaagaaa 9840gagtaattac
caattttttt tcaattcaaa aatgtagatg tccgcagcgt tattataaaa 9900tgaaagtaca
ttttgataaa acgacaaatt acgatccgtc gtatttatag gcgaaagcaa 9960taaacaaatt
attctaattc ggaaatcttt atttcgacgt gtctacattc acgtccaaat 10020gggggcttag
atgagaaact tcacgatcga tgccttgatt tcgccattcc cagataccca 10080tttcatcttc
agattggtct gagattatgc gaaaatatac actcatatac ataaatactg 10140acagtttgag
ctaccaattc agtgtagccc attacctcac ataattcact caaatgctag 10200gcagtctgtc
aactcggcgt caatttgtcg gccactatac gatagttgcg caaattttca 10260aagtcctggc
ctaacatcac acctctgtcg gcggcgggtc ccatttgtga taaatccacc 10320atatcgaatt
aattcagact cctttgcccc agagatcaca atggacgact tcctctatct 10380ctacgatcta
gtcaggaagt tcgacggaga aggtgacgat accatgttca ccactgataa 10440tgagaagatt
agccttttca atttcagaaa gaatgctaac ccacagatgg ttagagaggc 10500ttacgcagca
ggtctcatca agacgatcta cccgagcaat aatctccagg agatcaaata 10560ccttcccaag
aaggttaaag atgcagtcaa aagattcagg actaactgca tcaagaacac 10620agagaaagat
atatttctca agatcagaag tactattcca gtatggacga ttcaaggctt 10680gcttcacaaa
ccaaggcaag taatagagat tggagtctct aaaaaggtag ttcccactga 10740atcaaaggcc
atggagtcaa agattcaaat agaggaccta acagaactcg ccgtaaagac 10800tggcgaacag
ttcatacaga gtctcttacg actcaatgac aagaagaaaa tcttcgtcaa 10860catggtggag
cacgacacgc ttgtctactc caaaaatatc aaagatacag tctcagaaga 10920ccaaagggca
attgagactt ttcaacaaag ggtaatatcc ggaaacctcc tcggattcca 10980ttgcccagct
atctgtcact ttattgtgaa gatagtggaa aaggaaggtg gctcctacaa 11040atgccatcat
tgcgataaag gaaaggccat cgttgaagat gcctctgccg acagtggtcc 11100caaagatgga
cccccaccca cgaggagcat cgtggaaaaa gaagacgttc caaccacgtc 11160ttcaaagcaa
gtggattgat gtgatatctc cactgacgta agggatgacg cacaatccca 11220ctatccttcg
caagaccctt cctctatata aggaagttca tttcatttgg agaggacacg 11280ctgaaatcac
cagtctccaa gcttgcgggg atcgtttcgc atgattgaac aagatggatt 11340gcacgcaggt
tctccggccg cttgggtgga gaggctattc ggctatgact gggcacaaca 11400gacaatcggc
tgctctgatg ccgccgtgtt ccggctgtca gcgcaggggc gcccggttct 11460ttttgtcaag
accgacctgt ccggtgccct gaatgaactg caggacgagg cagcgcggct 11520atcgtggctg
gccacgacgg gcgttccttg cgcagctgtg ctcgacgttg tcactgaagc 11580gggaagggac
tggctgctat tgggcgaagt gccggggcag gatctcctgt catctcacct 11640tgctcctgcc
gagaaagtat ccatcatggc tgatgcaatg cggcggctgc atacgcttga 11700tccggctacc
tgcccattcg accaccaagc gaaacatcgc atcgagcgag cacgtactcg 11760gatggaagcc
ggtcttgtcg atcaggatga tctggacgaa gagcatcagg ggctcgcgcc 11820agccgaactg
ttcgccaggc tcaaggcgcg catgcccgac ggcgaggatc tcgtcgtgac 11880ccatggcgat
gcctgcttgc cgaatatcat ggtggaaaat ggccgctttt ctggattcat 11940cgactgtggc
cggctgggtg tggcggaccg ctatcaggac atagcgttgg ctacccgtga 12000tattgctgaa
gagcttggcg gcgaatgggc tgaccgcttc ctcgtgcttt acggtatcgc 12060cgctcccgat
tcgcagcgca tcgccttcta tcgccttctt gacgagttct tctgagcggg 12120actctggggt
tcgaaatgac cgaccaagcg acgcccaacc tgccatcacg agatttcgat 12180tccaccgccg
ccttctatga aaggttgggc ttcggaatcg ttttccggga cgccggctgg 12240atgatcctcc
agcgcgggga tctcatgctg gagttcttcg cccaccccgg atcgatccaa 12300cacttacgtt
tgcaacgtcc aagagcaaat agaccacgaa cgccggaagg ttgccgcagc 12360gtgtggattg
cgtctcaatt ctctcttgca ggaatgcaat gatgaatatg atactgacta 12420tgaaactttg
agggaatact gcctagcacc gtcacctcat aacgtgcatc atgcatgccc 12480tgacaacatg
gaacatcgct atttttctga agaattatgc tcgttggagg atgtcgcggc 12540aattgcagct
attgccaaca tcgaactacc cctcacgcat gcattcatca atattattca 12600tgcggggaaa
ggcaagatta atccaactgg caaatcatcc agcgtgattg gtaacttcag 12660ttccagcgac
ttgattcgtt ttggtgctac ccacgttttc aataaggacg agatggtgga 12720gtaaagaagg
agtgcgtcga agcagatcgt tcaaacattt ggcaataaag tttcttaaga 12780ttgaatcctg
ttgccggtct tgcgatgatt atcatataat ttctgttgaa ttacgttaag 12840catgtaataa
ttaacatgta atgcatgacg ttatttatga gatgggtttt tatgattaga 12900gtcccgcaat
tatacattta atacgcgata gaaaacaaaa tatagcgcgc aaactaggat 12960aaattatcgc
gcgcggtgtc atctatgtta ctagatcgat caaacttcgg tactgtgtaa 13020tgacgatgag
caatcgagag gctgactaac aaaaggtaca tcgcgatgga tcgatccatt 13080cgccattcag
gctgcgcaac tgttgggaag ggcgatcggt gcgggcctct tcgctattac 13140gccagctggc
gaaaggggga tgtgctgcaa ggcgattaag ttgggtaacg ccagggtttt 13200cccagtcacg
acgttgtaaa acgacggcca gtgaattcct gcagcccggg ggatccgccc 13260actcgaggcg
cgccaagctt gcatgcctgc aggctagcct aagtacgtac tcaaaatgcc 13320aacaaataaa
aaaaaagttg ctttaataat gccaaaacaa attaataaaa cacttacaac 13380accggatttt
ttttaattaa aatgtgccat ttaggataaa tagttaatat ttttaataat 13440tatttaaaaa
gccgtatcta ctaaaatgat ttttatttgg ttgaaaatat taatatgttt 13500aaatcaacac
aatctatcaa aattaaacta aaaaaaaaat aagtgtacgt ggttaacatt 13560agtacagtaa
tataagagga aaatgagaaa ttaagaaatt gaaagcgagt ctaattttta 13620aattatgaac
ctgcatatat aaaaggaaag aaagaatcca ggaagaaaag aaatgaaacc 13680atgcatggtc
ccctcgtcat cacgagtttc tgccatttgc aatagaaaca ctgaaacacc 13740tttctctttg
tcacttaatt gagatgccga agccacctca caccatgaac ttcatgaggt 13800gtagcaccca
aggcttccat agccatgcat actgaagaat gtctcaagct cagcacccta 13860cttctgtgac
gtgtccctca ttcaccttcc tctcttccct ataaataacc acgcctcagg 13920ttctccgctt
cacaactcaa acattctctc cattggtcct taaacactca tcagtcatca 13980ccgcggccct
agacgcccat cacaagtttg tacaaaaaag ctgaacgaga aacgtaaaat 14040gatataaata
tcaatatatt aaattagatt ttgcataaaa aacagactac ataatactgt 14100aaaacacaac
atatccagtc atattggcgg ccgcattagg caccccaggc tttacacttt 14160atgcttccgg
ctcgtataat gtgtggattt tgagttagga tccgtcgaga ttttcaggag 14220ctaaggaagc
taaaatggag aaaaaaatca ctggatatac caccgttgat atatcccaat 14280ggcatcgtaa
agaacatttt gaggcatttc agtcagttgc tcaatgtacc tataaccaga 14340ccgttcagct
ggatattacg gcctttttaa agaccgtaaa gaaaaataag cacaagtttt 14400atccggcctt
tattcacatt cttgcccgcc tgatgaatgc tcatccggaa ttccgtatgg 14460caatgaaaga
cggtgagctg gtgatatggg atagtgttca cccttgttac accgttttcc 14520atgagcaaac
tgaaacgttt tcatcgctct ggagtgaata ccacgacgat ttccggcagt 14580ttctacacat
atattcgcaa gatgtggcgt gttacggtga aaacctggcc tatttcccta 14640aagggtttat
tgagaatatg tttttcgtct cagccaatcc ctgggtgagt ttcaccagtt 14700ttgatttaaa
cgtggccaat atggacaact tcttcgcccc cgttttcacc atgggcaaat 14760attatacgca
aggcgacaag gtgctgatgc cgctggcgat tcaggttcat catgccgttt 14820gtgatggctt
ccatgtcggc agaatgctta atgaattaca acagtactgc gatgagtggc 14880agggcggggc
gtaaacgcgt ggatccggct tactaaaagc cagataacag tatgcgtatt 14940tgcgcgctga
tttttgcggt ataagaatat atactgatat gtatacccga agtatgtcaa 15000aaagaggtat
gctatgaagc agcgtattac agtgacagtt gacagcgaca gctatcagtt 15060gctcaaggca
tatatgatgt caatatctcc ggtctggtaa gcacaaccat gcagaatgaa 15120gcccgtcgtc
tgcgtgccga acgctggaaa gcggaaaatc aggaagggat ggctgaggtc 15180gcccggttta
ttgaaatgaa cggctctttt gctgacgaga acaggggctg gtgaaatgca 15240gtttaaggtt
tacacctata aaagagagag ccgttatcgt ctgtttgtgg atgtacagag 15300tgatattatt
gacacgcccg ggcgacggat ggtgatcccc ctggccagtg cacgtctgct 15360gtcagataaa
gtctcccgtg aactttaccc ggtggtgcat atcggggatg aaagctggcg 15420catgatgacc
accgatatgg ccagtgtgcc ggtctccgtt atcggggaag aagtggctga 15480tctcagccac
cgcgaaaatg acatcaaaaa cgccattaac ctgatgttct ggggaatata 15540aatgtcaggc
tcccttatac acagccagtc tgcaggtcga ccatagtgac tggatatgtt 15600gtgttttaca
gcattatgta gtctgttttt tatgcaaaat ctaatttaat atattgatat 15660ttatatcatt
ttacgtttct cgttcagctt tcttgtacaa agtggtgatg ataaccaagt 15720ttaacgtgag
tttatatatt cacagttcca tttacagatc ttatgctgat tgcagcatat 15780aacatagtcg
caacttaact ttatccctgc ttacgtaaag aaacatacat attgtttgtg 15840gcttcgtagt
ggaacatatg caattatgta atctttatat tatgagcctt tacttacaaa 15900gattacttga
gatttatgta cgtgtgctat tttcactttt caaacatgaa tttcctacgt 15960ttacaatcat
ttaatgtaaa agggatgata taatgtattt acgtacatgt gaacaaccaa 16020gcatgttatt
ttttcctttt ttgttgcaac ttacaatcaa gtaatgatta tggttatgat 16080tatgatattg
gtgtgtgtct tttgccttat atatatattt atccctttcg tttaactttg 16140caatataatt
attactgatc actatatttt ggtttgaaat ggcgcaggtt gtaatgatcg 16200atcatcacca
ctttgtacaa gaaagctgaa cgagaaacgt aaaatgatat aaatatcaat 16260atattaaatt
agattttgca taaaaaacag actacataat gctgtaaaac acaacatatc 16320cagtcactat
ggtcgacctg cagactggct gtgtataagg gagcctgaca tttatattcc 16380ccagaacatc
aggttaatgg cgtttttgat gtcattttcg cggtggctga gatcagccac 16440ttcttccccg
ataacggaga ccggcacact ggccatatcg gtggtcatca tgcgccagct 16500ttcatccccg
atatgcacca ccgggtaaag ttcacgggag actttatctg acagcagacg 16560tgcactggcc
agggggatca ccatccgtcg cccgggcgtg tcaataatat cactctgtac 16620atccacaaac
agacgataac ggctctctct tttataggtg taaaccttaa actgcatttc 16680accagcccct
gttctcgtca gcaaaagagc cgttcatttc aataaaccgg gcgacctcag 16740ccatcccttc
ctgattttcc gctttccagc gttcggcacg cagacgacgg gcttcattct 16800gcatggttgt
gcttaccaga ccggagatat tgacatcata tatgccttga gcaactgata 16860gctgtcgctg
tcaactgtca ctgtaatacg ctgcttcata gcatacctct ttttgacata 16920cttcgggtat
acatatcagt atatattctt ataccgcaaa aatcagcgcg caaatacgca 16980tactgttatc
tggcttttag taagccggat cctaactcaa aatccacaca ttatacgagc 17040cggaagcata
aagtgtaaag cctggggtgc ctaatgcggc cgccaatatg actggatatg 17100ttgtgtttta
cagtattatg tagtctgttt tttatgcaaa atctaattta atatattgat 17160atttatatca
ttttacgttt ctcgttcagc ttttttgtac aaacttgtga tgggcgtcta 17220gcgaactaga
ggatccccgg gtaccgaggt acgtctagag gatccgtcga cgg
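17273

SEQ ID NO: 16; length: 38; type: DNA; organism: artificial sequence; description: primer
cctagggtta accaagttta acgtgagttt atatattc 38

SEQ ID NO: 17; length: 38; type: DNA; organism: artificial sequence; description: primer
actagttcgc gatcattaca acctgcgcca tttcaaac 38

SEQ ID NO: 18; length: 506; type: DNA; organism: artificial sequence; description: PCR product & intron
cctagggtta accaagttta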
acgtgagttt atatattcac agttccattt acagatctta 60tgctgattgc agcatataac
atagtcgcaa cttaacttta tccctgctta cgtaaagaaa 120catacatatt gtttgtggct
tcgtagtgga acatatgcaa ttatgtaatc tttatattat 180gagcctttac ttacaaagat
tacttgagat ttatgtacgt gtgctatttt cacttttcaa 240acatgaattt cctacgttta
caatcattta atgtaaaagg gatgatataa tgtatttacg 300tacatgtgaa caaccaagca
tgttattttt tccttttttg ttgcaactta caatcaagta 360atgattatgg ttatgattat
gatattggtg tgtgtctttt gccttatata tatatttatc 420cctttcgttt aactttgcaa
tataattatt actgatcact atattttggt ttgaaatggc 480gcaggttgta atgatcgcga
actagt 506

SEQ ID NO: 19; length: 1724; type: DNA; organism: artificial sequence; description: synthetic DNA fragment
ctagacgccc atcacaagtt tgtacaaaaa
agctgaacga gaaacgtaaa atgatataaa 60tatcaatata ttaaattaga ttttgcataa
aaaacagact acataatact gtaaaacaca 120acatatccag tcatattggc ggccgcatta
ggcaccccag gctttacact ttatgcttcc 180ggctcgtata atgtgtggat tttgagttag
gatccgtcga gattttcagg agctaaggaa 240gctaaaatgg agaaaaaaat cactggatat
accaccgttg atatatccca atggcatcgt 300aaagaacatt ttgaggcatt tcagtcagtt
gctcaatgta cctataacca gaccgttcag 360ctggatatta cggccttttt aaagaccgta
aagaaaaata agcacaagtt ttatccggcc 420tttattcaca ttcttgcccg cctgatgaat
gctcatccgg aattccgtat ggcaatgaaa 480gacggtgagc tggtgatatg ggatagtgtt
cacccttgtt acaccgtttt ccatgagcaa 540actgaaacgt tttcatcgct ctggagtgaa
taccacgacg atttccggca gtttctacac 600atatattcgc aagatgtggc gtgttacggt
gaaaacctgg cctatttccc taaagggttt 660attgagaata tgtttttcgt ctcagccaat
ccctgggtga gtttcaccag ttttgattta 720aacgtggcca atatggacaa cttcttcgcc
cccgttttca ccatgggcaa atattatacg 780caaggcgaca aggtgctgat gccgctggcg
attcaggttc atcatgccgt ttgtgatggc 840ttccatgtcg gcagaatgct taatgaatta
caacagtact gcgatgagtg gcagggcggg 900gcgtaaacgc gtggatccgg cttactaaaa
gccagataac agtatgcgta tttgcgcgct 960gatttttgcg gtataagaat atatactgat
atgtataccc gaagtatgtc aaaaagaggt 1020atgctatgaa gcagcgtatt acagtgacag
ttgacagcga cagctatcag ttgctcaagg 1080catatatgat gtcaatatct ccggtctggt
aagcacaacc atgcagaatg aagcccgtcg 1140tctgcgtgcc gaacgctgga aagcggaaaa
tcaggaaggg atggctgagg tcgcccggtt 1200tattgaaatg aacggctctt ttgctgacga
gaacaggggc tggtgaaatg cagtttaagg 1260tttacaccta taaaagagag agccgttatc
gtctgtttgt ggatgtacag agtgatatta 1320ttgacacgcc cgggcgacgg atggtgatcc
ccctggccag tgcacgtctg ctgtcagata 1380aagtctcccg tgaactttac ccggtggtgc
atatcgggga tgaaagctgg cgcatgatga 1440ccaccgatat ggccagtgtg ccggtctccg
ttatcgggga agaagtggct gatctcagcc 1500accgcgaaaa tgacatcaaa aacgccatta
acctgatgtt ctggggaata taaatgtcag 1560gctcccttat acacagccag tctgcaggtc
gaccatagtg actggatatg ttgtgtttta 1620cagcattatg tagtctgttt tttatgcaaa
atctaattta atatattgat atttatatca 1680ttttacgttt ctcgttcagc tttcttgtac
aaagtggtga tgat 1724

SEQ ID NO: 20; length: 4934; type: DNA; organism: artificial sequence; description: plasmid
ctagaggatc cccgggtacc gagctcgaat tcgtaatcat ggtcatagct
gtttcctgtg 60tgaaattgtt atccgctcac aattccacac aacatacgag ccggaagcat
aaagtgtaaa 120gcctggggtg cctaatgagt gagctaactc acattaattg cgttgcgctc
actgcccgct 180ttccagtcgg gaaacctgtc gtgccagctg cattaatgaa tcggccaacg
cgcggggaga 240ggcggtttgc gtattgggcg ctagcggagt gtatactggc ttactatgtt
ggcactgatg 300agggtgtcag tgaagtgctt catgtggcag gagaaaaaag gctgcaccgg
tgcgtcagca 360gaatatgtga tacaggatat attccgcttc ctcgctcact gactcgctac
gctcggtcgt 420tcgactgcgg cgagcggaaa tggcttacga acggggcgga gatttcctgg
aagatgccag 480gaagatactt aacagggaag tgagagggcc gcggcaaagc cgtttttcca
taggctccgc 540ccccctgaca agcatcacga aatctgacgc tcaaatcagt ggtggcgaaa
cccgacagga 600ctataaagat accaggcgtt tccccctggc ggctccctcg tgcgctctcc
tgttcctgcc 660tttcggttta ccggtgtcat tccgctgtta tggccgcgtt tgtctcattc
cacgcctgac 720actcagttcc gggtaggcag ttcgctccaa gctggactgt atgcacgaac
cccccgttca 780gtccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg
aaagacatgc 840aaaagcacca ctggcagcag ccactggtaa ttgatttaga ggagttagtc
ttgaagtcat 900gcgccggtta aggctaaact gaaaggacaa gttttggtga ctgcgctcct
ccaagccagt 960tacctcggtt caaagagttg gtagctcaga gaaccttcga aaaaccgccc
tgcaaggcgg 1020ttttttcgtt ttcagagcaa gagattacgc gcagaccaaa acgatctcaa
gaagatcatc 1080ttattaaggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt
ttggtcatga 1140gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt
tttaaatcaa 1200tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc
agtgaggcac 1260ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc
gtcgtgtaga 1320taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata
ccgcgagacc 1380cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg
gccgagcgca 1440gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc
cgggaagcta 1500gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct
acaggcatcg 1560tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa
cgatcaaggc 1620gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt
cctccgatcg 1680ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca
ctgcataatt 1740ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac
tcaaccaagt 1800cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca
atacgggata 1860ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt
tcttcggggc 1920gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc
actcgtgcac 1980ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca
aaaacaggaa 2040ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata
ctcatactct 2100tcctttttca atattattga agcatttatc agggttattg tctcatgagc
ggatacatat 2160ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc
cgaaaagtgc 2220cacctgacgt ctaagaaacc attattatca tgacattaac ctataaaaat
aggcgtatca 2280cgaggccctt tcgtctcgcg cgtttcggtg atgacggtga aaacctctga
cacatgcagc 2340tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa
gcccgtcagg 2400gcgcgtcagc gggtgttggc gggtgtcggg gctggcttaa ctatgcggca
tcagagcaga 2460ttgtactgag agtgcaccat atgcggtgtg aaataccgca cagatgcgta
aggagaaaat 2520accgcatcag gcgccattcg ccattcaggc tgcgcaactg ttgggaaggg
cgatcggtgc 2580gggcctcttc gctattacgc cagctggcga aagggggatg tgctgcaagg
cgattaagtt 2640gggtaacgcc agggttttcc cagtcacgac gttgtaaaac gacggccagt
gccaagcttg 2700catgcctgca ggtcgactct agacgcccat cacaagtttg tacaaaaaag
ctgaacgaga 2760aacgtaaaat gatataaata tcaatatatt aaattagatt ttgcataaaa
aacagactac 2820ataatactgt aaaacacaac atatccagtc atattggcgg ccgcattagg
caccccaggc 2880tttacacttt atgcttccgg ctcgtataat gtgtggattt tgagttagga
tccgtcgaga 2940ttttcaggag ctaaggaagc taaaatggag aaaaaaatca ctggatatac
caccgttgat 3000atatcccaat ggcatcgtaa agaacatttt gaggcatttc agtcagttgc
tcaatgtacc 3060tataaccaga ccgttcagct ggatattacg gcctttttaa agaccgtaaa
gaaaaataag 3120cacaagtttt atccggcctt tattcacatt cttgcccgcc tgatgaatgc
tcatccggaa 3180ttccgtatgg caatgaaaga cggtgagctg gtgatatggg atagtgttca
cccttgttac 3240accgttttcc atgagcaaac tgaaacgttt tcatcgctct ggagtgaata
ccacgacgat 3300ttccggcagt ttctacacat atattcgcaa gatgtggcgt gttacggtga
aaacctggcc 3360tatttcccta aagggtttat tgagaatatg tttttcgtct cagccaatcc
ctgggtgagt 3420ttcaccagtt ttgatttaaa cgtggccaat atggacaact tcttcgcccc
cgttttcacc 3480atgggcaaat attatacgca aggcgacaag gtgctgatgc cgctggcgat
tcaggttcat 3540catgccgttt gtgatggctt ccatgtcggc agaatgctta atgaattaca
acagtactgc 3600gatgagtggc agggcggggc gtaaacgcgt ggatccggct tactaaaagc
cagataacag 3660tatgcgtatt tgcgcgctga tttttgcggt ataagaatat atactgatat
gtatacccga 3720agtatgtcaa aaagaggtat gctatgaagc agcgtattac agtgacagtt
gacagcgaca 3780gctatcagtt gctcaaggca tatatgatgt caatatctcc ggtctggtaa
gcacaaccat 3840gcagaatgaa gcccgtcgtc tgcgtgccga acgctggaaa gcggaaaatc
aggaagggat 3900ggctgaggtc gcccggttta ttgaaatgaa cggctctttt gctgacgaga
acaggggctg 3960gtgaaatgca gtttaaggtt tacacctata aaagagagag ccgttatcgt
ctgtttgtgg 4020atgtacagag tgatattatt gacacgcccg ggcgacggat ggtgatcccc
ctggccagtg 4080cacgtctgct gtcagataaa gtctcccgtg aactttaccc ggtggtgcat
atcggggatg 4140aaagctggcg catgatgacc accgatatgg ccagtgtgcc ggtctccgtt
atcggggaag 4200aagtggctga tctcagccac cgcgaaaatg acatcaaaaa cgccattaac
ctgatgttct 4260ggggaatata aatgtcaggc tcccttatac acagccagtc tgcaggtcga
ccatagtgac 4320tggatatgtt gtgttttaca gcattatgta gtctgttttt tatgcaaaat
ctaatttaat 4380atattgatat ttatatcatt ttacgtttct cgttcagctt tcttgtacaa
agtggtgatg 4440ataaccaagt ttaacgtgag tttatatatt cacagttcca tttacagatc
ttatgctgat 4500tgcagcatat aacatagtcg caacttaact ttatccctgc ttacgtaaag
aaacatacat 4560attgtttgtg gcttcgtagt ggaacatatg caattatgta atctttatat
tatgagcctt 4620tacttacaaa gattacttga gatttatgta cgtgtgctat tttcactttt
caaacatgaa 4680tttcctacgt ttacaatcat ttaatgtaaa agggatgata taatgtattt
acgtacatgt 4740gaacaaccaa gcatgttatt ttttcctttt ttgttgcaac ttacaatcaa
gtaatgatta 4800tggttatgat tatgatattg gtgtgtgtct tttgccttat atatatattt
atccctttcg 4860tttaactttg caatataatt attactgatc actatatttt ggtttgaaat
ggcgcaggtt 4920gtaatgatcg cgaa
4934

SEQ ID NO: 21; length: 1021; type: DNA; organism: artificial sequence; description: synthetic DNA fragment
ctagacgccc atcacaagtt tgtacaaaaa agctgaacga gaaacgtaaa atgatataaa
60tatcaatata ttaaattaga ttttgcataa aaaacagact acataatact gtaaaacaca
120acatatccag tcatattggc ggccgcatta ggcaccccag gctttacact ttatgcttcc
180ggctcgtata atgtgtggat tttgagttag gatccggctt actaaaagcc agataacagt
240atgcgtattt gcgcgctgat ttttgcggta taagaatata tactgatatg tatacccgaa
300gtatgtcaaa aagaggtatg ctatgaagca gcgtattaca gtgacagttg acagcgacag
360ctatcagttg ctcaaggcat atatgatgtc aatatctccg gtctggtaag cacaaccatg
420cagaatgaag cccgtcgtct gcgtgccgaa cgctggaaag cggaaaatca ggaagggatg
480gctgaggtcg cccggtttat tgaaatgaac ggctcttttg ctgacgagaa caggggctgg
540tgaaatgcag tttaaggttt acacctataa aagagagagc cgttatcgtc tgtttgtgga
600tgtacagagt gatattattg acacgcccgg gcgacggatg gtgatccccc tggccagtgc
660acgtctgctg tcagataaag tctcccgtga actttacccg gtggtgcata tcggggatga
720aagctggcgc atgatgacca ccgatatggc cagtgtgccg gtctccgtta tcggggaaga
780agtggctgat ctcagccacc gcgaaaatga catcaaaaac gccattaacc tgatgttctg
840gggaatataa atgtcaggct cccttataca cagccagtct gcaggtcgac catagtgact
900ggatatgttg tgttttacag cattatgtag tctgtttttt atgcaaaatc taatttaata
960tattgatatt tatatcattt tacgtttctc gttcagcttt cttgtacaaa gtggtgatga
1020t
1021
22 5955 DNA artificial sequence plasmid
22 atcatcacca ctttgtacaa
gaaagctgaa cgagaaacgt aaaatgatat aaatatcaat 60atattaaatt agattttgca
taaaaaacag actacataat gctgtaaaac acaacatatc 120cagtcactat ggtcgacctg
cagactggct gtgtataagg gagcctgaca tttatattcc 180ccagaacatc aggttaatgg
cgtttttgat gtcattttcg cggtggctga gatcagccac 240ttcttccccg ataacggaga
ccggcacact ggccatatcg gtggtcatca tgcgccagct 300ttcatccccg atatgcacca
ccgggtaaag ttcacgggag actttatctg acagcagacg 360tgcactggcc agggggatca
ccatccgtcg cccgggcgtg tcaataatat cactctgtac 420atccacaaac agacgataac
ggctctctct tttataggtg taaaccttaa actgcatttc 480accagcccct gttctcgtca
gcaaaagagc cgttcatttc aataaaccgg gcgacctcag 540ccatcccttc ctgattttcc
gctttccagc gttcggcacg cagacgacgg gcttcattct 600gcatggttgt gcttaccaga
ccggagatat tgacatcata tatgccttga gcaactgata 660gctgtcgctg tcaactgtca
ctgtaatacg ctgcttcata gcatacctct ttttgacata 720cttcgggtat acatatcagt
atatattctt ataccgcaaa aatcagcgcg caaatacgca 780tactgttatc tggcttttag
taagccggat cctaactcaa aatccacaca ttatacgagc 840cggaagcata aagtgtaaag
cctggggtgc ctaatgcggc cgccaatatg actggatatg 900ttgtgtttta cagtattatg
tagtctgttt tttatgcaaa atctaattta atatattgat 960atttatatca ttttacgttt
ctcgttcagc ttttttgtac aaacttgtga tgggcgtcta 1020gcgaactaga ggatccccgg
gtaccgagct cgaattcgta atcatggtca tagctgtttc 1080ctgtgtgaaa ttgttatccg
ctcacaattc cacacaacat acgagccgga agcataaagt 1140gtaaagcctg gggtgcctaa
tgagtgagct aactcacatt aattgcgttg cgctcactgc 1200ccgctttcca gtcgggaaac
ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg 1260ggagaggcgg tttgcgtatt
gggcgctagc ggagtgtata ctggcttact atgttggcac 1320tgatgagggt gtcagtgaag
tgcttcatgt ggcaggagaa aaaaggctgc accggtgcgt 1380cagcagaata tgtgatacag
gatatattcc gcttcctcgc tcactgactc gctacgctcg 1440gtcgttcgac tgcggcgagc
ggaaatggct tacgaacggg gcggagattt cctggaagat 1500gccaggaaga tacttaacag
ggaagtgaga gggccgcggc aaagccgttt ttccataggc 1560tccgcccccc tgacaagcat
cacgaaatct gacgctcaaa tcagtggtgg cgaaacccga 1620caggactata aagataccag
gcgtttcccc ctggcggctc cctcgtgcgc tctcctgttc 1680ctgcctttcg gtttaccggt
gtcattccgc tgttatggcc gcgtttgtct cattccacgc 1740ctgacactca gttccgggta
ggcagttcgc tccaagctgg actgtatgca cgaacccccc 1800gttcagtccg accgctgcgc
cttatccggt aactatcgtc ttgagtccaa cccggaaaga 1860catgcaaaag caccactggc
agcagccact ggtaattgat ttagaggagt tagtcttgaa 1920gtcatgcgcc ggttaaggct
aaactgaaag gacaagtttt ggtgactgcg ctcctccaag 1980ccagttacct cggttcaaag
agttggtagc tcagagaacc ttcgaaaaac cgccctgcaa 2040ggcggttttt tcgttttcag
agcaagagat tacgcgcaga ccaaaacgat ctcaagaaga 2100tcatcttatt aaggggtctg
acgctcagtg gaacgaaaac tcacgttaag ggattttggt 2160catgagatta tcaaaaagga
tcttcaccta gatcctttta aattaaaaat gaagttttaa 2220atcaatctaa agtatatatg
agtaaacttg gtctgacagt taccaatgct taatcagtga 2280ggcacctatc tcagcgatct
gtctatttcg ttcatccata gttgcctgac tccccgtcgt 2340gtagataact acgatacggg
agggcttacc atctggcccc agtgctgcaa tgataccgcg 2400agacccacgc tcaccggctc
cagatttatc agcaataaac cagccagccg gaagggccga 2460gcgcagaagt ggtcctgcaa
ctttatccgc ctccatccag tctattaatt gttgccggga 2520agctagagta agtagttcgc
cagttaatag tttgcgcaac gttgttgcca ttgctacagg 2580catcgtggtg tcacgctcgt
cgtttggtat ggcttcattc agctccggtt cccaacgatc 2640aaggcgagtt acatgatccc
ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc 2700gatcgttgtc agaagtaagt
tggccgcagt gttatcactc atggttatgg cagcactgca 2760taattctctt actgtcatgc
catccgtaag atgcttttct gtgactggtg agtactcaac 2820caagtcattc tgagaatagt
gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg 2880ggataatacc gcgccacata
gcagaacttt aaaagtgctc atcattggaa aacgttcttc 2940ggggcgaaaa ctctcaagga
tcttaccgct gttgagatcc agttcgatgt aacccactcg 3000tgcacccaac tgatcttcag
catcttttac tttcaccagc gtttctgggt gagcaaaaac 3060aggaaggcaa aatgccgcaa
aaaagggaat aagggcgaca cggaaatgtt gaatactcat 3120actcttcctt tttcaatatt
attgaagcat ttatcagggt tattgtctca tgagcggata 3180catatttgaa tgtatttaga
aaaataaaca aataggggtt ccgcgcacat ttccccgaaa 3240agtgccacct gacgtctaag
aaaccattat tatcatgaca ttaacctata aaaataggcg 3300tatcacgagg ccctttcgtc
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat 3360gcagctcccg gagacggtca
cagcttgtct gtaagcggat gccgggagca gacaagcccg 3420tcagggcgcg tcagcgggtg
ttggcgggtg tcggggctgg cttaactatg cggcatcaga 3480gcagattgta ctgagagtgc
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag 3540aaaataccgc atcaggcgcc
attcgccatt caggctgcgc aactgttggg aagggcgatc 3600ggtgcgggcc tcttcgctat
tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt 3660aagttgggta acgccagggt
tttcccagtc acgacgttgt aaaacgacgg ccagtgccaa 3720gcttgcatgc ctgcaggtcg
actctagacg cccatcacaa gtttgtacaa aaaagctgaa 3780cgagaaacgt aaaatgatat
aaatatcaat atattaaatt agattttgca taaaaaacag 3840actacataat actgtaaaac
acaacatatc cagtcatatt ggcggccgca ttaggcaccc 3900caggctttac actttatgct
tccggctcgt ataatgtgtg gattttgagt taggatccgt 3960cgagattttc aggagctaag
gaagctaaaa tggagaaaaa aatcactgga tataccaccg 4020ttgatatatc ccaatggcat
cgtaaagaac attttgaggc atttcagtca gttgctcaat 4080gtacctataa ccagaccgtt
cagctggata ttacggcctt tttaaagacc gtaaagaaaa 4140ataagcacaa gttttatccg
gcctttattc acattcttgc ccgcctgatg aatgctcatc 4200cggaattccg tatggcaatg
aaagacggtg agctggtgat atgggatagt gttcaccctt 4260gttacaccgt tttccatgag
caaactgaaa cgttttcatc gctctggagt gaataccacg 4320acgatttccg gcagtttcta
cacatatatt cgcaagatgt ggcgtgttac ggtgaaaacc 4380tggcctattt ccctaaaggg
tttattgaga atatgttttt cgtctcagcc aatccctggg 4440tgagtttcac cagttttgat
ttaaacgtgg ccaatatgga caacttcttc gcccccgttt 4500tcaccatggg caaatattat
acgcaaggcg acaaggtgct gatgccgctg gcgattcagg 4560ttcatcatgc cgtttgtgat
ggcttccatg tcggcagaat gcttaatgaa ttacaacagt 4620actgcgatga gtggcagggc
ggggcgtaaa cgcgtggatc cggcttacta aaagccagat 4680aacagtatgc gtatttgcgc
gctgattttt gcggtataag aatatatact gatatgtata 4740cccgaagtat gtcaaaaaga
ggtatgctat gaagcagcgt attacagtga cagttgacag 4800cgacagctat cagttgctca
aggcatatat gatgtcaata tctccggtct ggtaagcaca 4860accatgcaga atgaagcccg
tcgtctgcgt gccgaacgct ggaaagcgga aaatcaggaa 4920gggatggctg aggtcgcccg
gtttattgaa atgaacggct cttttgctga cgagaacagg 4980ggctggtgaa atgcagttta
aggtttacac ctataaaaga gagagccgtt atcgtctgtt 5040tgtggatgta cagagtgata
ttattgacac gcccgggcga cggatggtga tccccctggc 5100cagtgcacgt ctgctgtcag
ataaagtctc ccgtgaactt tacccggtgg tgcatatcgg 5160ggatgaaagc tggcgcatga
tgaccaccga tatggccagt gtgccggtct ccgttatcgg 5220ggaagaagtg gctgatctca
gccaccgcga aaatgacatc aaaaacgcca ttaacctgat 5280gttctgggga atataaatgt
caggctccct tatacacagc cagtctgcag gtcgaccata 5340gtgactggat atgttgtgtt
ttacagcatt atgtagtctg ttttttatgc aaaatctaat 5400ttaatatatt gatatttata
tcattttacg tttctcgttc agctttcttg tacaaagtgg 5460tgatgataac caagtttaac
gtgagtttat atattcacag ttccatttac agatcttatg 5520ctgattgcag catataacat
agtcgcaact taactttatc cctgcttacg taaagaaaca 5580tacatattgt ttgtggcttc
gtagtggaac atatgcaatt atgtaatctt tatattatga 5640gcctttactt acaaagatta
cttgagattt atgtacgtgt gctattttca cttttcaaac 5700atgaatttcc tacgtttaca
atcatttaat gtaaaaggga tgatataatg tatttacgta 5760catgtgaaca accaagcatg
ttattttttc cttttttgtt gcaacttaca atcaagtaat 5820gattatggtt atgattatga
tattggtgtg tgtcttttgc cttatatata tatttatccc 5880tttcgtttaa ctttgcaata
taattattac tgatcactat attttggttt gaaatggcgc 5940aggttgtaat gatcg
5955
23 9245 DNA artificial sequence vector
23 gtacgtctag aggatccgtc gacggcgcgc ccgatcatcc ggatatagtt
cctcctttca 60gcaaaaaacc cctcaagacc cgtttagagg ccccaagggg ttatgctagt
tattgctcag 120cggtggcagc agccaactca gcttcctttc gggctttgtt agcagccgga
tcgatccaag 180ctgtacctca ctattccttt gccctcggac gagtgctggg gcgtcggttt
ccactatcgg 240cgagtacttc tacacagcca tcggtccaga cggccgcgct tctgcgggcg
atttgtgtac 300gcccgacagt cccggctccg gatcggacga ttgcgtcgca tcgaccctgc
gcccaagctg 360catcatcgaa attgccgtca accaagctct gatagagttg gtcaagacca
atgcggagca 420tatacgcccg gagccgcggc gatcctgcaa gctccggatg cctccgctcg
aagtagcgcg 480tctgctgctc catacaagcc aaccacggcc tccagaagaa gatgttggcg
acctcgtatt 540gggaatcccc gaacatcgcc tcgctccagt caatgaccgc tgttatgcgg
ccattgtccg 600tcaggacatt gttggagccg aaatccgcgt gcacgaggtg ccggacttcg
gggcagtcct 660cggcccaaag catcagctca tcgagagcct gcgcgacgga cgcactgacg
gtgtcgtcca 720tcacagtttg ccagtgatac acatggggat cagcaatcgc gcatatgaaa
tcacgccatg 780tagtgtattg accgattcct tgcggtccga atgggccgaa cccgctcgtc
tggctaagat 840cggccgcagc gatcgcatcc atagcctccg cgaccggctg cagaacagcg
ggcagttcgg 900tttcaggcag gtcttgcaac gtgacaccct gtgcacggcg ggagatgcaa
taggtcaggc 960tctcgctgaa ttccccaatg tcaagcactt ccggaatcgg gagcgcggcc
gatgcaaagt 1020gccgataaac ataacgatct ttgtagaaac catcggcgca gctatttacc
cgcaggacat 1080atccacgccc tcctacatcg aagctgaaag cacgagattc ttcgccctcc
gagagctgca 1140tcaggtcgga gacgctgtcg aacttttcga tcagaaactt ctcgacagac
gtcgcggtga 1200gttcaggctt ttccatgggt atatctcctt cttaaagtta aacaaaatta
tttctagagg 1260gaaaccgttg tggtctccct atagtgagtc gtattaattt cgcgggatcg
agatcgatcc 1320aattccaatc ccacaaaaat ctgagcttaa cagcacagtt gctcctctca
gagcagaatc 1380gggtattcaa caccctcata tcaactacta cgttgtgtat aacggtccac
atgccggtat 1440atacgatgac tggggttgta caaaggcggc aacaaacggc gttcccggag
ttgcacacaa 1500gaaatttgcc actattacag aggcaagagc agcagctgac gcgtacacaa
caagtcagca 1560aacagacagg ttgaacttca tccccaaagg agaagctcaa ctcaagccca
agagctttgc 1620taaggcccta acaagcccac caaagcaaaa agcccactgg ctcacgctag
gaaccaaaag 1680gcccagcagt gatccagccc caaaagagat ctcctttgcc ccggagatta
caatggacga 1740tttcctctat ctttacgatc taggaaggaa gttcgaaggt gaaggtgacg
acactatgtt 1800caccactgat aatgagaagg ttagcctctt caatttcaga aagaatgctg
acccacagat 1860ggttagagag gcctacgcag caggtctcat caagacgatc tacccgagta
acaatctcca 1920ggagatcaaa taccttccca agaaggttaa agatgcagtc aaaagattca
ggactaattg 1980catcaagaac acagagaaag acatatttct caagatcaga agtactattc
cagtatggac 2040gattcaaggc ttgcttcata aaccaaggca agtaatagag attggagtct
ctaaaaaggt 2100agttcctact gaatctaagg ccatgcatgg agtctaagat tcaaatcgag
gatctaacag 2160aactcgccgt gaagactggc gaacagttca tacagagtct tttacgactc
aatgacaaga 2220agaaaatctt cgtcaacatg gtggagcacg acactctggt ctactccaaa
aatgtcaaag 2280atacagtctc agaagaccaa agggctattg agacttttca acaaaggata
atttcgggaa 2340acctcctcgg attccattgc ccagctatct gtcacttcat cgaaaggaca
gtagaaaagg 2400aaggtggctc ctacaaatgc catcattgcg ataaaggaaa ggctatcatt
caagatgcct 2460ctgccgacag tggtcccaaa gatggacccc cacccacgag gagcatcgtg
gaaaaagaag 2520acgttccaac cacgtcttca aagcaagtgg attgatgtga catctccact
gacgtaaggg 2580atgacgcaca atcccactat ccttcgcaag acccttcctc tatataagga
agttcatttc 2640atttggagag gacacgctcg agctcatttc tctattactt cagccataac
aaaagaactc 2700ttttctcttc ttattaaacc atgaaaaagc ctgaactcac cgcgacgtct
gtcgagaagt 2760ttctgatcga aaagttcgac agcgtctccg acctgatgca gctctcggag
ggcgaagaat 2820ctcgtgcttt cagcttcgat gtaggagggc gtggatatgt cctgcgggta
aatagctgcg 2880ccgatggttt ctacaaagat cgttatgttt atcggcactt tgcatcggcc
gcgctcccga 2940ttccggaagt gcttgacatt ggggaattca gcgagagcct gacctattgc
atctcccgcc 3000gtgcacaggg tgtcacgttg caagacctgc ctgaaaccga actgcccgct
gttctgcagc 3060cggtcgcgga ggccatggat gcgatcgctg cggccgatct tagccagacg
agcgggttcg 3120gcccattcgg accgcaagga atcggtcaat acactacatg gcgtgatttc
atatgcgcga 3180ttgctgatcc ccatgtgtat cactggcaaa ctgtgatgga cgacaccgtc
agtgcgtccg 3240tcgcgcaggc tctcgatgag ctgatgcttt gggccgagga ctgccccgaa
gtccggcacc 3300tcgtgcacgc ggatttcggc tccaacaatg tcctgacgga caatggccgc
ataacagcgg 3360tcattgactg gagcgaggcg atgttcgggg attcccaata cgaggtcgcc
aacatcttct 3420tctggaggcc gtggttggct tgtatggagc agcagacgcg ctacttcgag
cggaggcatc 3480cggagcttgc aggatcgccg cggctccggg cgtatatgct ccgcattggt
cttgaccaac 3540tctatcagag cttggttgac ggcaatttcg atgatgcagc ttgggcgcag
ggtcgatgcg 3600acgcaatcgt ccgatccgga gccgggactg tcgggcgtac acaaatcgcc
cgcagaagcg 3660cggccgtctg gaccgatggc tgtgtagaag tactcgccga tagtggaaac
cgacgcccca 3720gcactcgtcc gagggcaaag gaatagtgag gtacctaaag aaggagtgcg
tcgaagcaga 3780tcgttcaaac atttggcaat aaagtttctt aagattgaat cctgttgccg
gtcttgcgat 3840gattatcata taatttctgt tgaattacgt taagcatgta ataattaaca
tgtaatgcat 3900gacgttattt atgagatggg tttttatgat tagagtcccg caattataca
tttaatacgc 3960gatagaaaac aaaatatagc gcgcaaacta ggataaatta tcgcgcgcgg
tgtcatctat 4020gttactagat cgatgtcgaa tcgatcaacc tgcattaatg aatcggccaa
cgcgcgggga 4080gaggcggttt gcgtattggg cgctcttccg cttcctcgct cactgactcg
ctgcgctcgg 4140tcgttcggct gcggcgagcg gtatcagctc actcaaaggc ggtaatacgg
ttatccacag 4200aatcagggga taacgcagga aagaacatgt gagcaaaagg ccagcaaaag
gccaggaacc 4260gtaaaaaggc cgcgttgctg gcgtttttcc ataggctccg cccccctgac
gagcatcaca 4320aaaatcgacg ctcaagtcag aggtggcgaa acccgacagg actataaaga
taccaggcgt 4380ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt
accggatacc 4440tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca atgctcacgc
tgtaggtatc 4500tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc
cccgttcagc 4560ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta
agacacgact 4620tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat
gtaggcggtg 4680ctacagagtt cttgaagtgg tggcctaact acggctacac tagaaggaca
gtatttggta 4740tctgcgctct gctgaagcca gttaccttcg gaaaaagagt tggtagctct
tgatccggca 4800aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa gcagcagatt
acgcgcagaa 4860aaaaaggatc tcaagaagat cctttgatct tttctacggg gtctgacgct
cagtggaacg 4920aaaactcacg ttaagggatt ttggtcatga cattaaccta taaaaatagg
cgtatcacga 4980ggccctttcg tctcgcgcgt ttcggtgatg acggtgaaaa cctctgacac
atgcagctcc 5040cggagacggt cacagcttgt ctgtaagcgg atgccgggag cagacaagcc
cgtcagggcg 5100cgtcagcggg tgttggcggg tgtcggggct ggcttaacta tgcggcatca
gagcagattg 5160tactgagagt gcaccatatg gacatattgt cgttagaacg cggctacaat
taatacataa 5220ccttatgtat catacacata cgatttaggt gacactatag aacggcgcgc
caagcttgca 5280tgcctgcagg ctagcctaag tacgtactca aaatgccaac aaataaaaaa
aaagttgctt 5340taataatgcc aaaacaaatt aataaaacac ttacaacacc ggattttttt
taattaaaat 5400gtgccattta ggataaatag ttaatatttt taataattat ttaaaaagcc
gtatctacta 5460aaatgatttt tatttggttg aaaatattaa tatgtttaaa tcaacacaat
ctatcaaaat 5520taaactaaaa aaaaaataag tgtacgtggt taacattagt acagtaatat
aagaggaaaa 5580tgagaaatta agaaattgaa agcgagtcta atttttaaat tatgaacctg
catatataaa 5640aggaaagaaa gaatccagga agaaaagaaa tgaaaccatg catggtcccc
tcgtcatcac 5700gagtttctgc catttgcaat agaaacactg aaacaccttt ctctttgtca
cttaattgag 5760atgccgaagc cacctcacac catgaacttc atgaggtgta gcacccaagg
cttccatagc 5820catgcatact gaagaatgtc tcaagctcag caccctactt ctgtgacgtg
tccctcattc 5880accttcctct cttccctata aataaccacg cctcaggttc tccgcttcac
aactcaaaca 5940ttctctccat tggtccttaa acactcatca gtcatcaccg cggccctaga
cgcccatcac 6000aagtttgtac aaaaaagctg aacgagaaac gtaaaatgat ataaatatca
atatattaaa 6060ttagattttg cataaaaaac agactacata atactgtaaa acacaacata
tccagtcata 6120ttggcggccg cattaggcac cccaggcttt acactttatg cttccggctc
gtataatgtg 6180tggattttga gttaggatcc gtcgagattt tcaggagcta aggaagctaa
aatggagaaa 6240aaaatcactg gatataccac cgttgatata tcccaatggc atcgtaaaga
acattttgag 6300gcatttcagt cagttgctca atgtacctat aaccagaccg ttcagctgga
tattacggcc 6360tttttaaaga ccgtaaagaa aaataagcac aagttttatc cggcctttat
tcacattctt 6420gcccgcctga tgaatgctca tccggaattc cgtatggcaa tgaaagacgg
tgagctggtg 6480atatgggata gtgttcaccc ttgttacacc gttttccatg agcaaactga
aacgttttca 6540tcgctctgga gtgaatacca cgacgatttc cggcagtttc tacacatata
ttcgcaagat 6600gtggcgtgtt acggtgaaaa cctggcctat ttccctaaag ggtttattga
gaatatgttt 6660ttcgtctcag ccaatccctg ggtgagtttc accagttttg atttaaacgt
ggccaatatg 6720gacaacttct tcgcccccgt tttcaccatg ggcaaatatt atacgcaagg
cgacaaggtg 6780ctgatgccgc tggcgattca ggttcatcat gccgtttgtg atggcttcca
tgtcggcaga 6840atgcttaatg aattacaaca gtactgcgat gagtggcagg gcggggcgta
aacgcgtgga 6900tccggcttac taaaagccag ataacagtat gcgtatttgc gcgctgattt
ttgcggtata 6960agaatatata ctgatatgta tacccgaagt atgtcaaaaa gaggtatgct
atgaagcagc 7020gtattacagt gacagttgac agcgacagct atcagttgct caaggcatat
atgatgtcaa 7080tatctccggt ctggtaagca caaccatgca gaatgaagcc cgtcgtctgc
gtgccgaacg 7140ctggaaagcg gaaaatcagg aagggatggc tgaggtcgcc cggtttattg
aaatgaacgg 7200ctcttttgct gacgagaaca ggggctggtg aaatgcagtt taaggtttac
acctataaaa 7260gagagagccg ttatcgtctg tttgtggatg tacagagtga tattattgac
acgcccgggc 7320gacggatggt gatccccctg gccagtgcac gtctgctgtc agataaagtc
tcccgtgaac 7380tttacccggt ggtgcatatc ggggatgaaa gctggcgcat gatgaccacc
gatatggcca 7440gtgtgccggt ctccgttatc ggggaagaag tggctgatct cagccaccgc
gaaaatgaca 7500tcaaaaacgc cattaacctg atgttctggg gaatataaat gtcaggctcc
cttatacaca 7560gccagtctgc aggtcgacca tagtgactgg atatgttgtg ttttacagca
ttatgtagtc 7620tgttttttat gcaaaatcta atttaatata ttgatattta tatcatttta
cgtttctcgt 7680tcagctttct tgtacaaagt ggtgatgata accaagttta acgtgagttt
atatattcac 7740agttccattt acagatctta tgctgattgc agcatataac atagtcgcaa
cttaacttta 7800tccctgctta cgtaaagaaa catacatatt gtttgtggct tcgtagtgga
acatatgcaa 7860ttatgtaatc tttatattat gagcctttac ttacaaagat tacttgagat
ttatgtacgt 7920gtgctatttt cacttttcaa acatgaattt cctacgttta caatcattta
atgtaaaagg 7980gatgatataa tgtatttacg tacatgtgaa caaccaagca tgttattttt
tccttttttg 8040ttgcaactta caatcaagta atgattatgg ttatgattat gatattggtg
tgtgtctttt 8100gccttatata tatatttatc cctttcgttt aactttgcaa tataattatt
actgatcact 8160atattttggt ttgaaatggc gcaggttgta atgatcgatc atcaccactt
tgtacaagaa 8220agctgaacga gaaacgtaaa atgatataaa tatcaatata ttaaattaga
ttttgcataa 8280aaaacagact acataatgct gtaaaacaca acatatccag tcactatggt
cgacctgcag 8340actggctgtg tataagggag cctgacattt atattcccca gaacatcagg
ttaatggcgt 8400ttttgatgtc attttcgcgg tggctgagat cagccacttc ttccccgata
acggagaccg 8460gcacactggc catatcggtg gtcatcatgc gccagctttc atccccgata
tgcaccaccg 8520ggtaaagttc acgggagact ttatctgaca gcagacgtgc actggccagg
gggatcacca 8580tccgtcgccc gggcgtgtca ataatatcac tctgtacatc cacaaacaga
cgataacggc 8640tctctctttt ataggtgtaa accttaaact gcatttcacc agcccctgtt
ctcgtcagca 8700aaagagccgt tcatttcaat aaaccgggcg acctcagcca tcccttcctg
attttccgct 8760ttccagcgtt cggcacgcag acgacgggct tcattctgca tggttgtgct
taccagaccg 8820gagatattga catcatatat gccttgagca actgatagct gtcgctgtca
actgtcactg 8880taatacgctg cttcatagca tacctctttt tgacatactt cgggtataca
tatcagtata 8940tattcttata ccgcaaaaat cagcgcgcaa atacgcatac tgttatctgg
cttttagtaa 9000gccggatcct aactcaaaat ccacacatta tacgagccgg aagcataaag
tgtaaagcct 9060ggggtgccta atgcggccgc caatatgact ggatatgttg tgttttacag
tattatgtag 9120tctgtttttt atgcaaaatc taatttaata tattgatatt tatatcattt
tacgtttctc 9180gttcagcttt tttgtacaaa cttgtgatgg gcgtctagcg aactagagga
tccccgggta 9240ccgag
9245
24 15500 DNA artificial sequence vector
24 ccgcggccgc
ccccttcacc atggttgttg tgtctcttct tcctcgaatc tcgatcgtta 60catcaccggg
ttctagcctt cacgatgtgc ttttgagcat gagatttggt ttgacgcgac 120atctccctct
caaacgatct ttctccaatt attcaatcac ttccgtatct ccagaacaac 180agctcaaatc
tccggtgacc atggcgacga ccgagagcaa gaatcttgta gaagcttcca 240aggaggagac
aaacaagaag gagacagaag ataagaagga ggtgggagtt tcggttcctc 300caccgccaga
gaaaccagag cctggcgatt gttgcggtag cggttgcgtc cgatgcgttt 360gggatgttta
ttacgatgag ctcgaagatt acaacaagca gctttctgga gaaactaaat 420caatttgaaa
gggtgggcgc gccgacccag ctttcttgta caaagtggtg tgagtttata 480tattcacagt
tccatttaca gatcttatgc tgattgcagc atataacata gtcgcaactt 540aactttatcc
ctgcttacgt aaagaaacat acatattgtt tgtggcttcg tagtggaaca 600tatgcaatta
tgtaatcttt atattatgag cctttactta caaagattac ttgagattta 660tgtacgtgtg
ctattttcac ttttcaaaca tgaatttcct acgtttacaa tcatttaatg 720taaaagggat
gatataatgt atttacgtac atgtgaacaa ccaagcatgt tattttttcc 780ttttttgttg
caacttacaa tcaagtaatg attatggtta tgattatgat attggtgtgt 840gtcttttgcc
ttatatatat atttatccct ttcgtttaac tttgcaatat aattattact 900gatcactata
ttttggtttg aaatggcgca gaccactttg tacaagaaag ctgggtcggc 960gcgcccaccc
tttcaaattg atttagtttc tccagaaagc tgcttgttgt aatcttcgag 1020ctcatcgtaa
taaacatccc aaacgcatcg gacgcaaccg ctaccgcaac aatcgccagg 1080ctctggtttc
tctggcggtg gaggaaccga aactcccacc tccttcttat cttctgtctc 1140cttcttgttt
gtctcctcct tggaagcttc tacaagattc ttgctctcgg tcgtcgccat 1200ggtcaccgga
gatttgagct gttgttctgg agatacggaa gtgattgaat aattggagaa 1260agatcgtttg
agagggagat gtcgcgtcaa accaaatctc atgctcaaaa gcacatcgtg 1320aaggctagaa
cccggtgatg taacgatcga gattcgagga agaagagaca caacaaccat 1380ggtgaagggg
gcggccgcgg agcctgcttt tttgtacaaa cttgtgatgg gcgtctagcg 1440aactagagga
tccccgggta ccgaggtacg tctagaggat ccgtcgacgg cgcgccagat 1500cctctagagt
cgacctgcag gcatgcaagc ttggcgtaat catggtcata gctgtttcct 1560gtgtgaaatt
gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt 1620aaagcctggg
gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc 1680gctttccagt
cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg 1740agaggcggtt
tgcgtattgg atcgatccct gaaagcgacg ttggatgtta acatctacaa 1800attgcctttt
cttatcgacc atgtacgtaa gcgcttacgt ttttggtgga cccttgagga 1860aactggtagc
tgttgtgggc ctgtggtctc aagatggatc attaatttcc accttcacct 1920acgatggggg
gcatcgcacc ggtgagtaat attgtacggc taagagcgaa tttggcctgt 1980agacctcaat
tgcgagcttt ctaatttcaa actattcggg cctaactttt ggtgtgatga 2040tgctgactgg
caggatatat accgttgtaa tttgagctcg tgtgaataag tcgctgtgta 2100tgtttgtttg
attgtttctg ttggagtgca gcccatttca ccggacaagt cggctagatt 2160gatttagccc
tgatgaactg ccgaggggaa gccatcttga gcgcggaatg ggaatggatt 2220tcgttgtaca
acgagacgac agaacaccca cgggaccgag cttcgcgagc ttttgtatcc 2280gtggcatcct
tggtccgggc gatttgttca cgtccatgag gcgctctcca aaggaacgca 2340tattttccgg
tgcaaccttt ccggttcttc ctctactcga cctcttgaag tcccagcatg 2400aatgttcgac
cgctccgcaa gcggatcttt ggcgcaacca gccggtttcg cacgtcgatt 2460ctcgcgagcc
tgcatacttt ggcaagattg ctgaatgacg ctgatgcttc atcgcaatct 2520gcgataatgg
ggtaagtatc cggtgaaggc cgcaggtcag gccgcctgag cactcagtgt 2580cttggatgtc
cagttccacg gcagctgttg ctcaagcctg ctgatcggag cgtccgcaag 2640gtcggcgcgg
acgtcggcaa gccaggcctg cggatcgatg ttattgagct tggcgctcat 2700gatcagtgtc
gccatgaacg ccgcacgttc agcacaacga tccgatccgg caaacagcca 2760tgacttcctg
ccgagtacat agcctctgag cgttcgttcg gcagcattgt tcgtcaggca 2820aatcgggccg
tcatcgagga atgacgtaat gccatcccat cgcttgagca tgtaatttat 2880cgcctcggcg
acgggagaac tgcgcgacaa tttcccccgc tcggtttcga gccaatcatg 2940cagctcttcg
gcgagtgacc ttgatcaggc caccgccacg accgcggaag acgaacagat 3000gcctgcgcat
cggatcgcgc ttcagcgtct cttgcaccat cagcgacaaa ccgggaaagc 3060ctttgcgcat
gtccgtactt atgtcgccac ttgggagggc ttcgtctacg tggccttcgt 3120gatcgacgtc
ttcgcccgtc gcattgtcgg atggcgggcg agccggacag cacatgcagg 3180ctttgtcctc
gatgccctcg aggaggctca tcatgatcgg cgtcccgctc atggcggcct 3240agtgcatcac
tcggatcgcg gtgttcaata cgtgtccttt cgctattccg agcggttggc 3300agaagcaggt
atcgagccat ctatcggaag cgtcggcgac agcacgacaa cgccctcgca 3360gaagcgatca
acggtcttta caaggccgag gtcattcatc ggcgtggacc atggaggagc 3420ttcgaagcgg
tcgagttcgc taccttggaa tggatagact ggttcaacca cggcggcttt 3480tgaagcccat
cggcaatata ccgccagccg aagacgagga tcagtattac gccatgctgg 3540acgaagcagc
catggctgcg cattttaacg aaatggcctc cggcaaaccc ggtgcggttc 3600acttgttgcg
tgggaaagtt cacgggactc cgcgcacgag ccttcttcgt aatagccata 3660tcgaccgaat
tgacctgcag gggggggggg gaaagccacg ttgtgtctca aaatctctga 3720tgttacattg
cacaagataa aaatatatca tcatgaacaa taaaactgtc tgcttacata 3780aacagtaata
caaggggtgt tatgagccat attcaacggg aaacgtcttg ctcgaggccg 3840cgattaaatt
ccaacatgga tgctgattta tatgggtata aatgggctcg cgataatgtc 3900gggcaatcag
gtgcgacaat ctatcgattg tatgggaagc ccgatgcgcc agagttgttt 3960ctgaaacatg
gcaaaggtag cgttgccaat gatgttacag atgagatggt cagactaaac 4020tggctgacgg
aatttatgcc tcttccgacc atcaagcatt ttatccgtac tcctgatgat 4080gcatggttac
tcaccactgc gatccccggg aaaacagcat tccaggtatt agaagaatat 4140cctgattcag
gtgaaaatat tgttgatgcg ctggcagtgt tcctgcgccg gttgcattcg 4200attcctgttt
gtaattgtcc ttttaacagc gatcgcgtat ttcgtctcgc tcaggcgcaa 4260tcacgaatga
ataacggttt ggttgatgcg agtgattttg atgacgagcg taatggctgg 4320cctgttgaac
aagtctggaa agaaatgcat aagcttttgc cattctcacc ggattcagtc 4380gtcactcatg
gtgatttctc acttgataac cttatttttg acgaggggaa attaataggt 4440tgtattgatg
ttggacgagt cggaatcgca gaccgatacc aggatcttgc catcctatgg 4500aactgcctcg
gtgagttttc tccttcatta cagaaacggc tttttcaaaa atatggtatt 4560gataatcctg
atatgaataa attgcagttt catttgatgc tcgatgagtt tttctaatca 4620gaattggtta
attggttgta acactggcag agcattacgc tgacttgacg ggacggcggc 4680tttgttgaat
aaatcgaact tttgctgagt tgaaggatca gatcacgcat cttcccgaca 4740acgcagaccg
ttccgtggca aagcaaaagt tcaaaatcac caactggtcc acctacaaca 4800aagctctcat
caaccgtggc tccctcactt tctggctgga tgatggggcg attcaggcct 4860ggtatgagtc
agcaacacct tcttcacgag gcagacctca gcgccccccc ccccctgcag 4920gtcttttcca
atgatgagca cttttaaagt tctgctatgt ggcgcggtat tatcccgtgt 4980tgacgccggg
caagagcaac tcggtcgccg catacactat tctcagaatg acttggttga 5040gtactcacca
gtcacagaaa agcatcttac ggatggcatg acagtaagag aattatgcag 5100tgctgccata
accatgagtg ataacactgc ggccaactta cttctgacaa cgatcggagg 5160accgaaggag
ctaaccgctt ttttgcacaa catgggggat catgtaactc gccttgatcg 5220ttgggaaccg
gagctgaatg aagccatacc aaacgacgag cgtgacacca cgatgcctgt 5280agcaatggca
acaacgttgc gcaaactatt aactggcgaa ctacttactc tagcttcccg 5340gcaacaatta
atagactgga tggaggcgga taaagttgca ggaccacttc tgcgctcggc 5400ccttccggct
ggctggttta ttgctgataa atctggagcc ggtgagcgtg ggtctcgcgg 5460tatcattgca
gcactggggc cagatggtaa gccctcccgt atcgtagtta tctacacgac 5520ggggagtcag
gcaactatgg atgaacgaaa tagacagatc gctgagatag gtgcctcact 5580gattaagcat
tggtaactgt cagaccaagt ttactcatat atactttaga ttgatttaaa 5640acttcatttt
taatttaaaa ggatctaggt gaagatcctt tttgataatc tcatgaccaa 5700aatcccttaa
cgtgagtttt cgttccactg agcgtcagac cccgtagaaa agatcaaagg 5760atcttcttga
gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa aaaaaccacc 5820gctaccagcg
gtggtttgtt tgccggatca agagctacca actctttttc cgaaggtaac 5880tggcttcagc
agagcgcaga taccaaatac tgtccttcta gtgtagccgt agttaggcca 5940ccacttcaag
aactctgtag caccgcctac atacctcgct ctgctaatcc tgttaccagt 6000ggctgctgcc
agtggcgata agtcgtgtct taccgggttg gactcaagac gatagttacc 6060ggataaggcg
cagcggtcgg gctgaacggg gggttcgtgc acacagccca gcttggagcg 6120aacgacctac
accgaactga gatacctaca gcgtgagcta tgagaaagcg ccacgcttcc 6180cgaagggaga
aaggcggaca ggtatccggt aagcggcagg gtcggaacag gagagcgcac 6240gagggagctt
ccagggggaa acgcctggta tctttatagt cctgtcgggt ttcgccacct 6300ctgacttgag
cgtcgatttt tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc 6360cagcaacgcg
gcctttttac ggttcctggc cttttgctgg ccttttgctc acatgttctt 6420tcctgcgtta
tcccctgatt ctgtggataa ccgtattacc gcctttgagt gagctgatac 6480cgctcgccgc
agccgaacga ccgagcgcag cgagtcagtg agcgaggaag cggaagagcg 6540cctgatgcgg
tattttctcc ttacgcatct gtgcggtatt tcacaccgca tatggtgcac 6600tctcagtaca
atctgctctg atgccgcata gttaagccag tatacactcc gctatcgcta 6660cgtgactggg
tcatggctgc gccccgacac ccgccaacac ccgctgacgc gccctgacgg 6720gcttgtctgc
tcccggcatc cgcttacaga caagctgtga ccgtctccgg gagctgcatg 6780tgtcagaggt
tttcaccgtc atcaccgaaa cgcgcgaggc agggtgcctt gatgtgggcg 6840ccggcggtcg
agtggcgacg gcgcggcttg tccgcgccct ggtagattgc ctggccgtag 6900gccagccatt
tttgagcggc cagcggccgc gataggccga cgcgaagcgg cggggcgtag 6960ggagcgcagc
gaccgaaggg taggcgcttt ttgcagctct tcggctgtgc gctggccaga 7020cagttatgca
caggccaggc gggttttaag agttttaata agttttaaag agttttaggc 7080ggaaaaatcg
ccttttttct cttttatatc agtcacttac atgtgtgacc ggttcccaat 7140gtacggcttt
gggttcccaa tgtacgggtt ccggttccca atgtacggct ttgggttccc 7200aatgtacgtg
ctatccacag gaaagagacc ttttcgacct ttttcccctg ctagggcaat 7260ttgccctagc
atctgctccg tacattagga accggcggat gcttcgccct cgatcaggtt 7320gcggtagcgc
atgactagga tcgggccagc ctgccccgcc tcctccttca aatcgtactc 7380cggcaggtca
tttgacccga tcagcttgcg cacggtgaaa cagaacttct tgaactctcc 7440ggcgctgcca
ctgcgttcgt agatcgtctt gaacaaccat ctggcttctg ccttgcctgc 7500ggcgcggcgt
gccaggcggt agagaaaacg gccgatgccg ggatcgatca aaaagtaatc 7560ggggtgaacc
gtcagcacgt ccgggttctt gccttctgtg atctcgcggt acatccaatc 7620agctagctcg
atctcgatgt actccggccg cccggtttcg ctctttacga tcttgtagcg 7680gctaatcaag
gcttcaccct cggataccgt caccaggcgg ccgttcttgg ccttcttcgt 7740acgctgcatg
gcaacgtgcg tggtgtttaa ccgaatgcag gtttctacca ggtcgtcttt 7800ctgctttccg
ccatcggctc gccggcagaa cttgagtacg tccgcaacgt gtggacggaa 7860cacgcggccg
ggcttgtctc ccttcccttc ccggtatcgg ttcatggatt cggttagatg 7920ggaaaccgcc
atcagtacca ggtcgtaatc ccacacactg gccatgccgg ccggccctgc 7980ggaaacctct
acgtgcccgt ctggaagctc gtagcggatc acctcgccag ctcgtcggtc 8040acgcttcgac
agacggaaaa cggccacgtc catgatgctg cgactatcgc gggtgcccac 8100gtcatagagc
atcggaacga aaaaatctgg ttgctcgtcg cccttgggcg gcttcctaat 8160cgacggcgca
ccggctgccg gcggttgccg ggattctttg cggattcgat cagcggccgc 8220ttgccacgat
tcaccggggc gtgcttctgc ctcgatgcgt tgccgctggg cggcctgcgc 8280ggccttcaac
ttctccacca ggtcatcacc cagcgccgcg ccgatttgta ccgggccgga 8340tggtttgcga
ccgctcacgc cgattcctcg ggcttggggg ttccagtgcc attgcagggc 8400cggcagacaa
cccagccgct tacgcctggc caaccgcccg ttcctccaca catggggcat 8460tccacggcgt
cggtgcctgg ttgttcttga ttttccatgc cgcctccttt agccgctaaa 8520attcatctac
tcatttattc atttgctcat ttactctggt agctgcgcga tgtattcaga 8580tagcagctcg
gtaatggtct tgccttggcg taccgcgtac atcttcagct tggtgtgatc 8640ctccgccggc
aactgaaagt tgacccgctt catggctggc gtgtctgcca ggctggccaa 8700cgttgcagcc
ttgctgctgc gtgcgctcgg acggccggca cttagcgtgt ttgtgctttt 8760gctcattttc
tctttacctc attaactcaa atgagttttg atttaatttc agcggccagc 8820gcctggacct
cgcgggcagc gtcgccctcg ggttctgatt caagaacggt tgtgccggcg 8880gcggcagtgc
ctgggtagct cacgcgctgc gtgatacggg actcaagaat gggcagctcg 8940tacccggcca
gcgcctcggc aacctcaccg ccgatgcgcg tgcctttgat cgcccgcgac 9000acgacaaagg
ccgcttgtag ccttccatcc gtgacctcaa tgcgctgctt aaccagctcc 9060accaggtcgg
cggtggccca tatgtcgtaa gggcttggct gcaccggaat cagcacgaag 9120tcggctgcct
tgatcgcgga cacagccaag tccgccgcct ggggcgctcc gtcgatcact 9180acgaagtcgc
gccggccgat ggccttcacg tcgcggtcaa tcgtcgggcg gtcgatgccg 9240acaacggtta
gcggttgatc ttcccgcacg gccgcccaat cgcgggcact gccctgggga 9300tcggaatcga
ctaacagaac atcggccccg gcgagttgca gggcgcgggc tagatgggtt 9360gcgatggtcg
tcttgcctga cccgcctttc tggttaagta cagcgataac ttcatgcgtt 9420cccttgcgta
tttgtttatt tactcatcgc atcatatacg cagcgaccgc atgacgcaag 9480ctgttttact
caaatacaca tcaccttttt agacggcggc gctcggtttc ttcagcggcc 9540aagctggccg
gccaggccgc cagcttggca tcagacaaac cggccaggat ttcatgcagc 9600cgcacggttg
agacgtgcgc gggcggctcg aacacgtacc cggccgcgat catctccgcc 9660tcgatctctt
cggtaatgaa aaacggttcg tcctggccgt cctggtgcgg tttcatgctt 9720gttcctcttg
gcgttcattc tcggcggccg ccagggcgtc ggcctcggtc aatgcgtcct 9780cacggaaggc
accgcgccgc ctggcctcgg tgggcgtcac ttcctcgctg cgctcaagtg 9840cgcggtacag
ggtcgagcga tgcacgccaa gcagtgcagc cgcctctttc acggtgcggc 9900cttcctggtc
gatcagctcg cgggcgtgcg cgatctgtgc cggggtgagg gtagggcggg 9960ggccaaactt
cacgcctcgg gccttggcgg cctcgcgccc gctccgggtg cggtcgatga 10020ttagggaacg
ctcgaactcg gcaatgccgg cgaacacggt caacaccatg cggccggccg 10080gcgtggtggt
gtcggcccac ggctctgcca ggctacgcag gcccgcgccg gcctcctgga 10140tgcgctcggc
aatgtccagt aggtcgcggg tgctgcgggc caggcggtct agcctggtca 10200ctgtcacaac
gtcgccaggg cgtaggtggt caagcatcct ggccagctcc gggcggtcgc 10260gcctggtgcc
ggtgatcttc tcggaaaaca gcttggtgca gccggccgcg tgcagttcgg 10320cccgttggtt
ggtcaagtcc tggtcgtcgg tgctgacgcg ggcatagccc agcaggccag 10380cggcggcgct
cttgttcatg gcgtaatgtc tccggttcta gtcgcaagta ttctacttta 10440tgcgactaaa
acacgcgaca agaaaacgcc aggaaaaggg cagggcggca gcctgtcgcg 10500taacttagga
cttgtgcgac atgtcgtttt cagaagacgg ctgcactgaa cgtcagaagc 10560cgactgcact
atagcagcgg aggggttgga ccacaggacg ggtgtggtcg ccatgatcgc 10620gtagtcgata
gtggctccaa gtagcgaagc gagcaggact gggcggcggc caaagcggtc 10680ggacagtgct
ccgagaacgg gtgcgcatag aaattgcatc aacgcatata gcgctagcag 10740cacgccatag
tgactggcga tgctgtcgga atggacgata tcccgcaaga ggcccggcag 10800taccggcata
accaagccta tgcctacagc atccagggtg acggtgccga ggatgacgat 10860gagcgcattg
ttagatttca tacacggtgc ctgactgcgt tagcaattta actgtgataa 10920actaccgcat
taaagctagc ttgcttggtc gttccgcgtg aacgtcggct cgattgtacc 10980tgcgttcaaa
tactttgcga tcgtgttgcg cgcctgcccg gtgcgtcggc tgatctcacg 11040gatcgactgc
ttctctcgca acgccatccg acggatgatg tttaaaagtc ccatgtggat 11100cactccgttg
ccccgtcgct caccgtgttg gggggaaggt gcacatggct cagttctcaa 11160tggaaattat
ctgcctaacc ggctcagttc tgcgtagaaa ccaacatgca agctccaccg 11220ggtgcaaagc
ggcagcggcg gcaggatata ttcaattgta aatggcttca tgtccgggaa 11280atctacatgg
atcagcaatg agtatgatgg tcaatatgga gaaaaagaaa gagtaattac 11340caattttttt
tcaattcaaa aatgtagatg tccgcagcgt tattataaaa tgaaagtaca 11400ttttgataaa
acgacaaatt acgatccgtc gtatttatag gcgaaagcaa taaacaaatt 11460attctaattc
ggaaatcttt atttcgacgt gtctacattc acgtccaaat gggggcttag 11520atgagaaact
tcacgatcga tgccttgatt tcgccattcc cagataccca tttcatcttc 11580agattggtct
gagattatgc gaaaatatac actcatatac ataaatactg acagtttgag 11640ctaccaattc
agtgtagccc attacctcac ataattcact caaatgctag gcagtctgtc 11700aactcggcgt
caatttgtcg gccactatac gatagttgcg caaattttca aagtcctggc 11760ctaacatcac
acctctgtcg gcggcgggtc ccatttgtga taaatccacc atatcgaatt 11820aattcagact
cctttgcccc agagatcaca atggacgact tcctctatct ctacgatcta 11880gtcaggaagt
tcgacggaga aggtgacgat accatgttca ccactgataa tgagaagatt 11940agccttttca
atttcagaaa gaatgctaac ccacagatgg ttagagaggc ttacgcagca 12000ggtctcatca
agacgatcta cccgagcaat aatctccagg agatcaaata ccttcccaag 12060aaggttaaag
atgcagtcaa aagattcagg actaactgca tcaagaacac agagaaagat 12120atatttctca
agatcagaag tactattcca gtatggacga ttcaaggctt gcttcacaaa 12180ccaaggcaag
taatagagat tggagtctct aaaaaggtag ttcccactga atcaaaggcc 12240atggagtcaa
agattcaaat agaggaccta acagaactcg ccgtaaagac tggcgaacag 12300ttcatacaga
gtctcttacg actcaatgac aagaagaaaa tcttcgtcaa catggtggag 12360cacgacacgc
ttgtctactc caaaaatatc aaagatacag tctcagaaga ccaaagggca 12420attgagactt
ttcaacaaag ggtaatatcc ggaaacctcc tcggattcca ttgcccagct 12480atctgtcact
ttattgtgaa gatagtggaa aaggaaggtg gctcctacaa atgccatcat 12540tgcgataaag
gaaaggccat cgttgaagat gcctctgccg acagtggtcc caaagatgga 12600cccccaccca
cgaggagcat cgtggaaaaa gaagacgttc caaccacgtc ttcaaagcaa 12660gtggattgat
gtgatatctc cactgacgta agggatgacg cacaatccca ctatccttcg 12720caagaccctt
cctctatata aggaagttca tttcatttgg agaggacacg ctgaaatcac 12780cagtctccaa
gcttgcgggg atcgtttcgc atgattgaac aagatggatt gcacgcaggt 12840tctccggccg
cttgggtgga gaggctattc ggctatgact gggcacaaca gacaatcggc 12900tgctctgatg
ccgccgtgtt ccggctgtca gcgcaggggc gcccggttct ttttgtcaag 12960accgacctgt
ccggtgccct gaatgaactg caggacgagg cagcgcggct atcgtggctg 13020gccacgacgg
gcgttccttg cgcagctgtg ctcgacgttg tcactgaagc gggaagggac 13080tggctgctat
tgggcgaagt gccggggcag gatctcctgt catctcacct tgctcctgcc 13140gagaaagtat
ccatcatggc tgatgcaatg cggcggctgc atacgcttga tccggctacc 13200tgcccattcg
accaccaagc gaaacatcgc atcgagcgag cacgtactcg gatggaagcc 13260ggtcttgtcg
atcaggatga tctggacgaa gagcatcagg ggctcgcgcc agccgaactg 13320ttcgccaggc
tcaaggcgcg catgcccgac ggcgaggatc tcgtcgtgac ccatggcgat 13380gcctgcttgc
cgaatatcat ggtggaaaat ggccgctttt ctggattcat cgactgtggc 13440cggctgggtg
tggcggaccg ctatcaggac atagcgttgg ctacccgtga tattgctgaa 13500gagcttggcg
gcgaatgggc tgaccgcttc ctcgtgcttt acggtatcgc cgctcccgat 13560tcgcagcgca
tcgccttcta tcgccttctt gacgagttct tctgagcggg actctggggt 13620tcgaaatgac
cgaccaagcg acgcccaacc tgccatcacg agatttcgat tccaccgccg 13680ccttctatga
aaggttgggc ttcggaatcg ttttccggga cgccggctgg atgatcctcc 13740agcgcgggga
tctcatgctg gagttcttcg cccaccccgg atcgatccaa cacttacgtt 13800tgcaacgtcc
aagagcaaat agaccacgaa cgccggaagg ttgccgcagc gtgtggattg 13860cgtctcaatt
ctctcttgca ggaatgcaat gatgaatatg atactgacta tgaaactttg 13920agggaatact
gcctagcacc gtcacctcat aacgtgcatc atgcatgccc tgacaacatg 13980gaacatcgct
atttttctga agaattatgc tcgttggagg atgtcgcggc aattgcagct 14040attgccaaca
tcgaactacc cctcacgcat gcattcatca atattattca tgcggggaaa 14100ggcaagatta
atccaactgg caaatcatcc agcgtgattg gtaacttcag ttccagcgac 14160ttgattcgtt
ttggtgctac ccacgttttc aataaggacg agatggtgga gtaaagaagg 14220agtgcgtcga
agcagatcgt tcaaacattt ggcaataaag tttcttaaga ttgaatcctg 14280ttgccggtct
tgcgatgatt atcatataat ttctgttgaa ttacgttaag catgtaataa 14340ttaacatgta
atgcatgacg ttatttatga gatgggtttt tatgattaga gtcccgcaat 14400tatacattta
atacgcgata gaaaacaaaa tatagcgcgc aaactaggat aaattatcgc 14460gcgcggtgtc
atctatgtta ctagatcgat caaacttcgg tactgtgtaa tgacgatgag 14520caatcgagag
gctgactaac aaaaggtaca tcgcgatgga tcgatccatt cgccattcag 14580gctgcgcaac
tgttgggaag ggcgatcggt gcgggcctct tcgctattac gccagctggc 14640gaaaggggga
tgtgctgcaa ggcgattaag ttgggtaacg ccagggtttt cccagtcacg 14700acgttgtaaa
acgacggcca gtgaattcct gcagcccggg ggatccgccc actcgaggcg 14760cgccaagctt
gcatgcctgc aggctagcct aagtacgtac tcaaaatgcc aacaaataaa 14820aaaaaagttg
ctttaataat gccaaaacaa attaataaaa cacttacaac accggatttt 14880ttttaattaa
aatgtgccat ttaggataaa tagttaatat ttttaataat tatttaaaaa 14940gccgtatcta
ctaaaatgat ttttatttgg ttgaaaatat taatatgttt aaatcaacac 15000aatctatcaa
aattaaacta aaaaaaaaat aagtgtacgt ggttaacatt agtacagtaa 15060tataagagga
aaatgagaaa ttaagaaatt gaaagcgagt ctaattttta aattatgaac 15120ctgcatatat
aaaaggaaag aaagaatcca ggaagaaaag aaatgaaacc atgcatggtc 15180ccctcgtcat
cacgagtttc tgccatttgc aatagaaaca ctgaaacacc tttctctttg 15240tcacttaatt
gagatgccga agccacctca caccatgaac ttcatgaggt gtagcaccca 15300aggcttccat
agccatgcat actgaagaat gtctcaagct cagcacccta cttctgtgac 15360gtgtccctca
ttcaccttcc tctcttccct ataaataacc acgcctcagg ttctccgctt 15420cacaactcaa
acattctctc cattggtcct taaacactca tcagtcatca ccgcacaagt 15480ttgtacaaaa
aagcaggcta
15500
25 585 DNA Brassica napus 25
gatattcgta tggttttgtt gcatgtccat caccacggtc
ttcatctcca tcccagaatc 60tcgatcgcca catcgcccga ttacaatcgc ctccggaaaa
gccttaacga tgtgcttttg 120agcatgagat ttggactaac gcgagatctc cctctgaaac
gatcatcatt cgcctattat 180tccggatctc gagaacaaca gcccatcacc atggcgacca
agggcgacaa gacttcgacg 240gaggtgaaag aaaaggtagt ggaggagaag aaggataatg
ataagaagga ggaggtatcg 300ctcccaccgc cgccggagaa accagaggct ggcgattgtt
gcggtagcgg ttgcgtccga 360tgcgtttggg atgtgtatta cgatgaactc gaagaataca
acaagcttac tgctttcgct 420cctggagata ctaaatccaa ttgattgaat tgctttgttc
tctattgttg ttagattcgc 480tcctggagat actaaatcca attgattgaa ttgctttgtt
ctctattgtt gttagaaaaa 540gttaaacaat cgctttgttc gaataaaaag tactgatcga
ccata 585
26 144 PRT Brassica napus 26
Met Val Leu Leu His
Val His His His Gly Leu His Leu His Pro Arg1 5
10 15Ile Ser Ile Ala Thr Ser Pro Asp Tyr Asn Arg
Leu Arg Lys Ser Leu 20 25
30Asn Asp Val Leu Leu Ser Met Arg Phe Gly Leu Thr Arg Asp Leu Pro
35 40 45Leu Lys Arg Ser Ser Phe Ala Tyr
Tyr Ser Gly Ser Arg Glu Gln Gln 50 55
60Pro Ile Thr Met Ala Thr Lys Gly Asp Lys Thr Ser Thr Glu Val Lys65
70 75 80Glu Lys Val Val Glu
Glu Lys Lys Asp Asn Asp Lys Lys Glu Glu Val 85
90 95Ser Leu Pro Pro Pro Pro Glu Lys Pro Glu Ala
Gly Asp Cys Cys Gly 100 105
110Ser Gly Cys Val Arg Cys Val Trp Asp Val Tyr Tyr Asp Glu Leu Glu
115 120 125Glu Tyr Asn Lys Leu Thr Ala
Phe Ala Pro Gly Asp Thr Lys Ser Asn 130 135
140
27 552 DNA Brassica napus 27
ggattcaaaa aagtcttcga ttttcccgaa
gggttttgtt gcatgttcat caccacggtc 60ttcatctcca tcccagaatc tcgatcgcca
catcgcccga ttacaatcgc ctccggaaaa 120gccttaacga tgtgcttttg agcatgagat
ttggactaac gcgagatctc cctctgaaac 180gatcatcatt cgcctattat tccggatctc
gagaacaaca gcccatcacc atggcgacca 240agggcgacaa gacttcgacg gaggtgaaag
aaaaggtagt ggaggagaag aaggataatg 300ataagaagga ggaggtatcg ctcccaccgc
cgccggagaa accagaggct ggcgattgtt 360gcggtagcgg ttgcgtccga tgcgtttggg
atgtgtatta cgatgagctc gaagaataca 420acaagcttac tgcttccgct cctggagata
ctaaatccaa ttgattgaat tgctttgttc 480tctattgttg ttagaaaaag ttaaacaatc
gctttgttcg aataaaaagt actgatcgac 540cattttaaac ga
552
28 106 PRT Brassica napus 28
Met Arg Phe
Gly Leu Thr Arg Asp Leu Pro Leu Lys Arg Ser Ser Phe1 5
10 15Ala Tyr Tyr Ser Gly Ser Arg Glu Gln
Gln Pro Ile Thr Met Ala Thr 20 25
30Lys Gly Asp Lys Thr Ser Thr Glu Val Lys Glu Lys Val Val Glu Glu
35 40 45Lys Lys Asp Asn Asp Lys Lys
Glu Glu Val Ser Leu Pro Pro Pro Pro 50 55
60Glu Lys Pro Glu Ala Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys65
70 75 80Val Trp Asp Val
Tyr Tyr Asp Glu Leu Glu Glu Tyr Asn Lys Leu Thr 85
90 95Ala Ser Ala Pro Gly Asp Thr Lys Ser Asn
100 105
29 600 DNA Brassica napus 29
gagcaaaaga
agtcttcgat atattcgtat ggttttgttc catcaccacg gtcttcatct 60ccatcccaga
atctcgatca ccacatcgcc cggttacaat cgactccgga aaagccttaa 120cgatgtgctt
ctgagcatga gatttggact aacacgagat ctccgtctga aacgaccatc 180attcgcatac
tattccggat ctcgaggaca acagcccatc accatggcga ccaagggcga 240caagacttcg
acagaggtga aagataaggt agtggaggag aagaaggata tggataagga 300taagaaggaa
gaggtatcgc tcccaccgcc gccggagaaa ccagaggctg gcgattgttg 360cggtagcggt
tgcgtccgat gcgtttggga tgtgtattac gatgagctcg aagaatacaa 420caagcttact
gcttccactc ctggagatac taaatccaat tgattgaatt gggattgctt 480tgttctgatt
gttaccctat tgttgctaga aaaagttaaa caattgcttt gttctataat 540aaagactggt
caagaactga tcgaccaata ttaaacgatt tcaatctttt tttcactgtg
600
30 144 PRT Brassica napus 30
Met Val Leu Phe His His His Gly Leu His Leu
His Pro Arg Ile Ser1 5 10
15Ile Thr Thr Ser Pro Gly Tyr Asn Arg Leu Arg Lys Ser Leu Asn Asp
20 25 30Val Leu Leu Ser Met Arg Phe
Gly Leu Thr Arg Asp Leu Arg Leu Lys 35 40
45Arg Pro Ser Phe Ala Tyr Tyr Ser Gly Ser Arg Gly Gln Gln Pro
Ile 50 55 60Thr Met Ala Thr Lys Gly
Asp Lys Thr Ser Thr Glu Val Lys Asp Lys65 70
75 80Val Val Glu Glu Lys Lys Asp Met Asp Lys Asp
Lys Lys Glu Glu Val 85 90
95Ser Leu Pro Pro Pro Pro Glu Lys Pro Glu Ala Gly Asp Cys Cys Gly
100 105 110Ser Gly Cys Val Arg Cys
Val Trp Asp Val Tyr Tyr Asp Glu Leu Glu 115 120
125Glu Tyr Asn Lys Leu Thr Ala Ser Thr Pro Gly Asp Thr Lys
Ser Asn 130 135 140
31 619 DNA Helianthus annuus 31
aagctggagc tccaccgcgg tggcggccgc tctagaacta gtggatcccc
cgggctgcag 60gaattcggca cgagctccga cgccatggca ccgttaaccg tcactcgcct
atgagatcac 120agactctgca ccgactcacc accactttta accgatctca tctcaatcca
attcaacctt 180ctctcagatc tgattcaaat ttcaacctca ccatggctga ttcaggttct
aataataaaa 240tcaagtcaga tgacggttcg agcgccgtta aggacgcaac ggagacgaaa
aagctgccgg 300agatccctcc gccgccggag aaaccgttgc cgggagactg ttgtggcagc
ggttgtgttc 360ggtgcgtttg ggacgtgtat tacgacgagc ttgaagagta taataagatt
tgtaaaggag 420gatctgattc tacagctgga tctaaggttt cgtaaacgtt ttgtagaaat
tgtttgatta 480ttgattgtta tagatcaatt tgattattga ttgttataga tctatttgat
gttcaaataa 540acgaattagt tcgatatctg tgttgtgagt ttcttgtcat gatgtgtctt
tgtttacata 600taatcgatcg aatatgatt
619
32 114 PRT Helianthus annuus 32
Met Arg Ser Gln Thr Leu His
Arg Leu Thr Thr Thr Phe Asn Arg Ser1 5 10
15His Leu Asn Pro Ile Gln Pro Ser Leu Arg Ser Asp Ser
Asn Phe Asn 20 25 30Leu Thr
Met Ala Asp Ser Gly Ser Asn Asn Lys Ile Lys Ser Asp Asp 35
40 45Gly Ser Ser Ala Val Lys Asp Ala Thr Glu
Thr Lys Lys Leu Pro Glu 50 55 60Ile
Pro Pro Pro Pro Glu Lys Pro Leu Pro Gly Asp Cys Cys Gly Ser65
70 75 80Gly Cys Val Arg Cys Val
Trp Asp Val Tyr Tyr Asp Glu Leu Glu Glu 85
90 95Tyr Asn Lys Ile Cys Lys Gly Gly Ser Asp Ser Thr
Ala Gly Ser Lys 100 105 110Val
Ser
33 219 DNA Castor canadensis 33
atggccacca acaaaactga acctctagat
tcaaaaacac acaatataaa taagaaagaa 60gaagaaaaga aattgccgcc gccgccgccg
ccggagaagc cggagcctgg ggattgttgt 120ggaagcggat gtgttaggtg cgtatgggat
gtgtattatg aagagcttga agaatataat 180aagctttatc aatcccattc tgattctaag
cgcccttga 219
34 72 PRT Castor canadensis 34
Met Ala
Thr Asn Lys Thr Glu Pro Leu Asp Ser Lys Thr His Asn Ile1 5
10 15Asn Lys Lys Glu Glu Glu Lys Lys
Leu Pro Pro Pro Pro Pro Pro Glu 20 25
30Lys Pro Glu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys
Val 35 40 45Trp Asp Val Tyr Tyr
Glu Glu Leu Glu Glu Tyr Asn Lys Leu Tyr Gln 50 55
60Ser His Ser Asp Ser Lys Arg Pro65
70
35 771 DNA Glycine max 35
cggcagggtt acaatcttat cttcgtattg gacttcaatt
gatcccaaag aaaaatatag 60agagagagag aatgtggtgg cggcggcgcc cgagaccatg
agaactacag caccttccga 120tttcattttc acccaaaagc ttcacccttt caacatcacc
tccaccaaaa cctccctcca 180acgaacccta ccctattttc tccaactcaa tcgcatggcc
gaggctgcac gaaccgcgca 240taaacccgcg ccgcacccga tccaacccaa acccgacgat
aaaaccccga atccggcgaa 300ggagattccg ccgccgccgg agaagccgga gcccggcgat
tgctgcggca gcgggtgcgt 360ccgatgcgtc tgggatgtgt actacgacga actcgaagaa
tacaataagc gatacaaaca 420ggtcgatccc agccccaaac cttcttcgta atcttcaaca
tcgcttggat tagctttatt 480aatttattta tattacatcc taattttaaa aagctttggg
tatttcttga tttcgtgaat 540tgtccctttt tatcaaaaag gatcgaaatg ttgtatgtgg
aattatacat gtagaataaa 600ctgatttttt taaaaaaaat gccagggcta aaatgtacga
tttatataat cccgaagatt 660aattcggaga tttacttctc agatcgcata attcccaagt
tttttggtaa tagtacgctg 720tgttttttct ttcatgactt tgtttatgta ttttttataa
ccattttgat a 771
36 117 PRT Glycine max 36
Met Arg Thr Thr Ala Pro
Ser Asp Phe Ile Phe Thr Gln Lys Leu His1 5
10 15Pro Phe Asn Ile Thr Ser Thr Lys Thr Ser Leu Gln
Arg Thr Leu Pro 20 25 30Tyr
Phe Leu Gln Leu Asn Arg Met Ala Glu Ala Ala Arg Thr Ala His 35
40 45Lys Pro Ala Pro His Pro Ile Gln Pro
Lys Pro Asp Asp Lys Thr Pro 50 55
60Asn Pro Ala Lys Glu Ile Pro Pro Pro Pro Glu Lys Pro Glu Pro Gly65
70 75 80Asp Cys Cys Gly Ser
Gly Cys Val Arg Cys Val Trp Asp Val Tyr Tyr 85
90 95Asp Glu Leu Glu Glu Tyr Asn Lys Arg Tyr Lys
Gln Val Asp Pro Ser 100 105
110Pro Lys Pro Ser Ser 115
37 664 DNA Glycine max 37
ggacttcaaa
ttcaattgat ccgtttccca acccaaagaa agagagagaa tgtggtggtg 60gcggcggcgg
cgtagatctt gagaacatct ccttcttccg atttcatttt cacccaaaag 120cttctcgctt
tcaacatcac cctcaccaaa acccctcttc aacgagccct actcttcttc 180tttctccatc
ccaatcgaat ggccgagggt gcacgaaccg cgcatgcacc cgccccgcac 240ccgatccaac
ccaaacccga cgataaaacc ccgaatccgg tgaaggagac tccgccgccg 300ccggagaaac
cggagcccgg cgattgctgc ggcagcggat gcgtccggga cgtttactac 360gacgaactcg
aagatacaat aagctataca aacaagacga tcccagcccc aaagcttctt 420catagtcttc
atcatcgcat gggtggaagt gttatgggtc gatgattgtt gggttattgt 480cgtcgtcaat
acaccaggta tgttgttact gggtgagtgt gttaagtgat tcgtaaggca 540aattttaaca
tatagatcaa cttgaattat atggatgaag ttgattcgta agttgataat 600aaactaacgg
atcatgttga tttgtattga ttacagattt tgatttttta aaaatttctt 660aaaa
664
38 88 PRT Glycine max 38
Met Ala Glu Gly Ala Arg Thr Ala His Ala Pro Ala Pro His Pro Ile1
5 10 15Gln Pro Lys Pro Asp
Asp Lys Thr Pro Asn Pro Val Lys Glu Thr Pro 20
25 30Pro Pro Pro Glu Lys Pro Glu Pro Gly Asp Cys Cys
Gly Ser Gly Cys 35 40 45Val Arg
Asp Val Tyr Tyr Asp Glu Leu Glu Asp Thr Ile Ser Tyr Thr 50
55 60Asn Lys Thr Ile Pro Ala Pro Lys Leu Leu His
Ser Leu His His Arg65 70 75
80Met Gly Gly Ser Val Met Gly Arg 85
39 689 DNA Zea mays 39
ggggctgtat gctgggcgcc gtcgtccgtg tcccgggccc gatcctacct ttcctgcccg
60ggccgacgcg cccctctcct ccgccgccgc cactacctcc cgcccgaaac gcccatggcc
120tcggccaccc cttgcgatgg cggcaccggg aagcccgacg ccgcgccggc tcccacgccc
180gcgccaacgc cgctgccgcc cgagaagcct ctcccgggcg actgctgcgg cagcggctgc
240gttcgctgcg tctgggacat atatttcgac gagctcgacg cgtacgacaa ggccctcgcc
300gcgcgcgcgg cctcctcagg ctccggcggc aaggacgact ctgctgatac caagcccaaa
360gaaggcaaga caacaaggtg aaagaaacca aagcgtgagg ccaacctgtt gcagttggaa
420acattgaacc tgtccccggc gacgcattgc cctttccacc gccgcggagc ctcgctcatg
480ccgtcgtctc taaaactggc cgactctggc cagattcctg caaagcgcgg accaccagga
540cacctcagtc ttcgaactga aatgtcagtc ctaatcccag ttgctactga aagaaaagaa
600agtgaaggga aatctcctca ccagtgtcta gcacaccgat aatggaatcc tcaccgaacc
660tactctctgg gaggattcca gccgaatgc
689
40 88 PRT Zea mays 40
Met Ala Ser Ala Thr Pro Cys Asp Gly Gly Thr Gly Lys
Pro Asp Ala1 5 10 15Ala
Pro Ala Pro Thr Pro Ala Pro Thr Pro Leu Pro Pro Glu Lys Pro 20
25 30Leu Pro Gly Asp Cys Cys Gly Ser
Gly Cys Val Arg Cys Val Trp Asp 35 40
45Ile Tyr Phe Asp Glu Leu Asp Ala Tyr Asp Lys Ala Leu Ala Ala Arg
50 55 60Ala Ala Ser Ser Gly Ser Gly Gly
Lys Asp Asp Ser Ala Asp Thr Lys65 70 75
80Pro Lys Glu Gly Lys Thr Thr Arg
85
41 503 DNA Zea mays 41
gaggcgccgt cgtccgtgtc ccgggcccga tcctgccttt
cctgcccggg ccgacgcgcc 60ctttcctccg ccgccgccac tacctcccgg cccgagacgc
ccatggcctc ggccacccct 120tgcgatggcg gcaccgggaa gcccgacgcc gcgccggctc
ccacgcccgc gccaacgccg 180ctgccgcccg agaagcctct cccgggcgac tgctgcggca
gcggctgcgt ccgctgcgtc 240tgggacatat atttcgacga gctcgacgcc tacgacaagg
ccgtcgccgc ccacgcggcc 300tcctcaggct ccggcggcaa ggacgactcc gctgatacca
agcccaacga aggtgccaag 360tcctgaagtg cgcctctcat gtgtaatgac ctcttctgct
ctgaactgaa tttagattac 420tggcgttcac atacgccact accaattctt agcactcgaa
acattacagt accgttgtgc 480ctgctgtctt aatatgctta gac
503
42 87 PRT Zea mays 42
Met Ala Ser Ala Thr Pro Cys
Asp Gly Gly Thr Gly Lys Pro Asp Ala1 5 10
15Ala Pro Ala Pro Thr Pro Ala Pro Thr Pro Leu Pro Pro
Glu Lys Pro 20 25 30Leu Pro
Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp 35
40 45Ile Tyr Phe Asp Glu Leu Asp Ala Tyr Asp
Lys Ala Val Ala Ala His 50 55 60Ala
Ala Ser Ser Gly Ser Gly Gly Lys Asp Asp Ser Ala Asp Thr Lys65
70 75 80Pro Asn Glu Gly Ala Lys
Ser 85
43 674 DNA Zea mays 43
cttgtggcag ttggattctc ctgcgtccca
atcacgcagc ctcttggcct ccgcgaccgt 60cccctcccct cccctttcga cgagcgaggg
gctgtatgct gggcgccgtc gtccgtgtcc 120cgggcccgat cctacctttc ctgcccgggc
cgacgcgccc tctcctccgc cgccgccact 180acctcccgcc cgagacgccc atggcctcgg
ccaccccttg cgatggcggc accgggaagc 240ccgacgccgc gccggctccc acgcccgcgc
caacgccgct gccgcccgag aagcctctcc 300cgggcgactg ctgcggcagc ggctgcgttc
gctgcgtctg ggacatatat ttcgacgagc 360tcgacgcgta cgacaaggcc ctcgccgcgc
acgcggcctc ctcaggctcc ggcggcaagg 420acgactctgc tgataccaag cccaaagaag
gtgccaaatc ctgaagtgcg cctctcatgt 480gtaatgacct cttctgctct gaactgaatt
agattgctgg cgtctcacca gattcacata 540cgctggtacc aattcttagc actcgagaca
ttacaatact cttgtgcctg ctgtgttata 600tgctagatta agatgcttta tcaattcagc
ctccttattg tgtaacagga ggaagttatg 660aaacaaaaaa aaaa
674
44 122 PRT Zea mays 44
Met Leu Gly Ala
Val Val Arg Val Pro Gly Pro Ile Leu Pro Phe Leu1 5
10 15Pro Gly Pro Thr Arg Pro Leu Leu Arg Arg
Arg His Tyr Leu Pro Pro 20 25
30Glu Thr Pro Met Ala Ser Ala Thr Pro Cys Asp Gly Gly Thr Gly Lys
35 40 45Pro Asp Ala Ala Pro Ala Pro Thr
Pro Ala Pro Thr Pro Leu Pro Pro 50 55
60Glu Lys Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys65
70 75 80Val Trp Asp Ile Tyr
Phe Asp Glu Leu Asp Ala Tyr Asp Lys Ala Leu 85
90 95Ala Ala His Ala Ala Ser Ser Gly Ser Gly Gly
Lys Asp Asp Ser Ala 100 105
110Asp Thr Lys Pro Lys Glu Gly Ala Lys Ser 115
120
45 653 DNA Oryza sativa 45
cccttctctt ccgcatgctg gtcgccgccc tccgcgtccc
ggcgccgatc ccctcgtcgc 60tcccctcgcc ggcgcgccct ctcctccgcc gccgcagcag
ccaccgcctg ccccctcccc 120cgccccccgc cgcgtcaatg gccgacgccg gcggcgccac
cacgaacaag cccgctccgg 180ccccggcccc ggagccgccc gagaagccgc tccccggcga
ctgctgcggc agcggctgcg 240tccgctgcgt ctgggacgtc tactacgacg agctcgacgc
ctacaataag gctctcgccg 300cccactcctc gtcggcatcc tccggcagca agcccgctac
cagcgacggc gccaaatcat 360gaggcgaatc aggattcagg agttctgagg acgacttgca
gtatgcgtcc cttcctctct 420tttcattttt tttccccttc cccaaatcgg ggtcttggtg
tggtactcct accagctagt 480agtattaaaa ttactcgttt gattatagtg aaacatttgt
gttatctcat tgtgtatgct 540gcaatttgta ctagagtgga atggttgttg ttccaacgaa
aaattccctg attacataca 600gagaattgtt catggatagt tcttgtgtaa caaacattag
cattttggca gaa 653
46 115 PRT Oryza sativa 46
Met Leu Val Ala Ala
Leu Arg Val Pro Ala Pro Ile Pro Ser Ser Leu1 5
10 15Pro Ser Pro Ala Arg Pro Leu Leu Arg Arg Arg
Ser Ser His Arg Leu 20 25
30Pro Pro Pro Pro Pro Pro Ala Ala Ser Met Ala Asp Ala Gly Gly Ala
35 40 45Thr Thr Asn Lys Pro Ala Pro Ala
Pro Ala Pro Glu Pro Pro Glu Lys 50 55
60Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp65
70 75 80Asp Val Tyr Tyr Asp
Glu Leu Asp Ala Tyr Asn Lys Ala Leu Ala Ala 85
90 95His Ser Ser Ser Ala Ser Ser Gly Ser Lys Pro
Ala Thr Ser Asp Gly 100 105
110Ala Lys Ser 115
47 399 DNA Sorghum bicolor 47
atgctgggcg ccgtcgtccg
tgtcccggcg ccgatcctgc tgcctctcct ccccggaccg 60acgcgccctc tcctcctccg
ccgccgccgc cactgcctcc cgcccgaggc gcccatggcc 120tcggccaccc ctagcgacgg
cggcgccgcg aagcccgatg ccgcgcccgc gcccgtgccc 180gtgcccgcgc ccgcgccaac
gccgctgccg ctgccgcccg agaagcctct cccgggcgac 240tgctgcggca gcggctgcgt
gcgctgcgtc tgggacatat atttcgacga gctcgacgcg 300tacgacaagg cgctcgccgc
ccacgcggcc gcctcctcag gctccggcgc caaggacgac 360tccgccgata ccaagcccag
cgacggcgcc aagtcctga 399
48 132 PRT Sorghum bicolor 48
Met Leu Gly Ala Val Val Arg Val Pro Ala Pro Ile Leu Leu Pro Leu1
5 10 15Leu Pro Gly Pro Thr Arg
Pro Leu Leu Leu Arg Arg Arg Arg His Cys 20 25
30Leu Pro Pro Glu Ala Pro Met Ala Ser Ala Thr Pro Ser
Asp Gly Gly 35 40 45Ala Ala Lys
Pro Asp Ala Ala Pro Ala Pro Val Pro Val Pro Ala Pro 50
55 60Ala Pro Thr Pro Leu Pro Leu Pro Pro Glu Lys Pro
Leu Pro Gly Asp65 70 75
80Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp Ile Tyr Phe Asp
85 90 95Glu Leu Asp Ala Tyr Asp
Lys Ala Leu Ala Ala His Ala Ala Ala Ser 100
105 110Ser Gly Ser Gly Ala Lys Asp Asp Ser Ala Asp Thr
Lys Pro Ser Asp 115 120 125Gly Ala
Lys Ser 130
49 34 DNA artificial sequence linker 49
ggcgcgccaa gcttggatcc
gtcgacggcg cgcc 34
50 4974 DNA artificial sequence vector 50
ggccgccgac tcgacgatga gcgagatgac cagctccggc cgcgacacaa
gtgtgagagt 60actaaataaa tgctttggtt gtacgaaatc attacactaa ataaaataat
caaagcttat 120atatgccttc cgctaaggcc gaatgcaaag aaattggttc tttctcgtta
tcttttgcca 180cttttactag tacgtattaa ttactactta atcatctttg tttacggctc
attatatccg 240tcgacggcgc gcccgatcat ccggatatag ttcctccttt cagcaaaaaa
cccctcaaga 300cccgtttaga ggccccaagg ggttatgcta gttattgctc agcggtggca
gcagccaact 360cagcttcctt tcgggctttg ttagcagccg gatcgatcca agctgtacct
cactattcct 420ttgccctcgg acgagtgctg gggcgtcggt ttccactatc ggcgagtact
tctacacagc 480catcggtcca gacggccgcg cttctgcggg cgatttgtgt acgcccgaca
gtcccggctc 540cggatcggac gattgcgtcg catcgaccct gcgcccaagc tgcatcatcg
aaattgccgt 600caaccaagct ctgatagagt tggtcaagac caatgcggag catatacgcc
cggagccgcg 660gcgatcctgc aagctccgga tgcctccgct cgaagtagcg cgtctgctgc
tccatacaag 720ccaaccacgg cctccagaag aagatgttgg cgacctcgta ttgggaatcc
ccgaacatcg 780cctcgctcca gtcaatgacc gctgttatgc ggccattgtc cgtcaggaca
ttgttggagc 840cgaaatccgc gtgcacgagg tgccggactt cggggcagtc ctcggcccaa
agcatcagct 900catcgagagc ctgcgcgacg gacgcactga cggtgtcgtc catcacagtt
tgccagtgat 960acacatgggg atcagcaatc gcgcatatga aatcacgcca tgtagtgtat
tgaccgattc 1020cttgcggtcc gaatgggccg aacccgctcg tctggctaag atcggccgca
gcgatcgcat 1080ccatagcctc cgcgaccggc tgcagaacag cgggcagttc ggtttcaggc
aggtcttgca 1140acgtgacacc ctgtgcacgg cgggagatgc aataggtcag gctctcgctg
aattccccaa 1200tgtcaagcac ttccggaatc gggagcgcgg ccgatgcaaa gtgccgataa
acataacgat 1260ctttgtagaa accatcggcg cagctattta cccgcaggac atatccacgc
cctcctacat 1320cgaagctgaa agcacgagat tcttcgccct ccgagagctg catcaggtcg
gagacgctgt 1380cgaacttttc gatcagaaac ttctcgacag acgtcgcggt gagttcaggc
ttttccatgg 1440gtatatctcc ttcttaaagt taaacaaaat tatttctaga gggaaaccgt
tgtggtctcc 1500ctatagtgag tcgtattaat ttcgcgggat cgagatctga tcaacctgca
ttaatgaatc 1560ggccaacgcg cggggagagg cggtttgcgt attgggcgct cttccgcttc
ctcgctcact 1620gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat cagctcactc
aaaggcggta 1680atacggttat ccacagaatc aggggataac gcaggaaaga acatgtgagc
aaaaggccag 1740caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag
gctccgcccc 1800cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc
gacaggacta 1860taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt
tccgaccctg 1920ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct
ttctcaatgc 1980tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg
ctgtgtgcac 2040gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct
tgagtccaac 2100ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat
tagcagagcg 2160aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg
ctacactaga 2220aggacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa
aagagttggt 2280agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt
ttgcaagcag 2340cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc
tacggggtct 2400gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg tcatgacatt
aacctataaa 2460aataggcgta tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg
tgaaaacctc 2520tgacacatgc agctcccgga gacggtcaca gcttgtctgt aagcggatgc
cgggagcaga 2580caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct
taactatgcg 2640gcatcagagc agattgtact gagagtgcac catatggaca tattgtcgtt
agaacgcggc 2700tacaattaat acataacctt atgtatcata cacatacgat ttaggtgaca
ctatagaacg 2760gcgcgccaag cttggatcct cgaagagaag ggttaataac acatttttta
acatttttaa 2820cacaaatttt agttatttaa aaatttatta aaaaatttaa aataagaaga
ggaactcttt 2880aaataaatct aacttacaaa atttatgatt tttaataagt tttcaccaat
aaaaaatgtc 2940ataaaaatat gttaaaaagt atattatcaa tattctcttt atgataaata
aaaagaaaaa 3000aaaaataaaa gttaagtgaa aatgagattg aagtgacttt aggtgtgtat
aaatatatca 3060accccgccaa caatttattt aatccaaata tattgaagta tattattcca
tagcctttat 3120ttatttatat atttattata taaaagcttt atttgttcta ggttgttcat
gaaatatttt 3180tttggtttta tctccgttgt aagaaaatca tgtgctttgt gtcgccactc
actattgcag 3240ctttttcatg cattggtcag attgacggtt gattgtattt ttgtttttta
tggttttgtg 3300ttatgactta agtcttcatc tctttatctc ttcatcaggt ttgatggtta
cctaatatgg 3360tccatgggta catgcatggt taaattaggt ggccaacttt gttgtgaacg
atagaatttt 3420ttttatatta agtaaactat ttttatatta tgaaataata ataaaaaaaa
tattttatca 3480ttattaacaa aatcatatta gttaatttgt taactctata ataaaagaaa
tactgtaaca 3540ttcacattac atggtaacat ctttccaccc tttcatttgt tttttgtttg
atgacttttt 3600ttcttgttta aatttatttc ccttctttta aatttggaat acattatcat
catatataaa 3660ctaaaatact aaaaacagga ttacacaaat gataaataat aacacaaata
tttataaatc 3720tagctgcaat atatttaaac tagctatatc gatattgtaa aataaaacta
gctgcattga 3780tactgataaa aaaatatcat gtgctttctg gactgatgat gcagtatact
tttgacattg 3840cctttatttt atttttcaga aaagctttct tagttctggg ttcttcatta
tttgtttccc 3900atctccattg tgaattgaat catttgcttc gtgtcacaaa tacaatttag
ntaggtacat 3960gcattggtca gattcacggt ttattatgtc atgacttaag ttcatggtag
tacattacct 4020gccacgcatg cattatattg gttagatttg ataggcaaat ttggttgtca
acaatataaa 4080tataaataat gtttttatat tacgaaataa cagtgatcaa aacaaacagt
tttatcttta 4140ttaacaagat tttgtttttg tttgatgacg ttttttaatg tttacgcttt
cccccttctt 4200ttgaatttag aacactttat catcataaaa tcaaatacta aaaaaattac
atatttcata 4260aataataaca caaatatttt taaaaaatct gaaataataa tgaacaatat
tacatattat 4320cacgaaaatt cattaataaa aatattatat aaataaaatg taatagtagt
tatatgtagg 4380aaaaaagtac tgcacgcata atatatacaa aaagattaaa atgaactatt
ataaataata 4440acactaaatt aatggtgaat catatcaaaa taatgaaaaa gtaaataaaa
tttgtaatta 4500acttctatat gtattacaca cacaaataat aaataatagt aaaaaaaatt
atgataaata 4560tttaccatct cataagatat ttaaaataat gataaaaata tagattattt
tttatgcaac 4620tagctagcca aaaagagaac acgggtatat ataaaaagag tacctttaaa
ttctactgta 4680cttcctttat tcctgacgtt tttatatcaa gtggacatac gtgaagattt
taattatcag 4740tctaaatatt tcattagcac ttaatacttt tctgttttat tcctatccta
taagtagtcc 4800cgattctccc aacattgctt attcacacaa ctaactaaga aagtcttcca
tagcccccca 4860agcggccgga gctggtcatc tcgctcatcg tcgagtcggc ggccggagct
ggtcatctcg 4920ctcatcgtcg agtcggcggc cgccgactcg acgatgagcg agatgaccag
ctcc 4974
51 80 DNA artificial sequence EagI ELVISLIVES sequence 51
cggccggagc tggtcatctc gctcatcgtc gagtcggcgg ccgccgactc gacgatgagc
60gagatgacca gctccggccg
80
52 118 DNA artificial sequence 2XELVISLIVES 52
cggccggagc tggtcatctc
gctcatcgtc gagtcggcgg ccggagctgg tcatctcgct 60catcgtcgag tcggcggccg
ccgactcgac gatgagcgag atgaccagct ccggccgc 118
53 92 DNA artificial sequence primer 53
gaattccggc cggagctggt catctcgctc atcgtcgagt cggcggccgc
cgactcgacg 60atgagcgaga tgaccagctc cggccggaat tc
92
54 15 DNA artificial sequence primer 54
gaattccggc cggag
15
55 29 DNA artificial sequence primer 55
gcggccgcat gtataattcc acatacaac
29
56 29 DNA artificial sequence primer 56
gcggccgcat gtataattcc
acatacaac 29
57 26 DNA artificial sequence primer 57
tctagaccac atacaacatt tcgatc
26
58 26 DNA artificial sequence primer 58
tctagaccac atacaacatt
tcgatc 26
59 3402 DNA artificial sequence plasmid 59
gcggccgctc aatcgcatgg ccgaggctgc acgaaccgcg cataaacccg
cgccgcaccc 60gatccaaccc aaacccgacg ataaaacccc gaatccggcg aaggagattc
cgccgccgcc 120ggagaagccg gagcccggcg attgctgcgg cagcgggtgc gtccgatgcg
tctgggatgt 180gtactacgac gaactcgaag aatacaataa gcgatacaaa caggtcgatc
ccagccccaa 240accttcttcg taatcttcaa catcgcttgg attagcttta ttaatttatt
tatattacat 300cctaatttta aaaagctttg ggtatttctt gatttcgtga attgtccctt
tttatcaaaa 360aggatcgaaa tgttgtatgt ggaattatac atgcggccgc aatcactagt
gcggccgcct 420gcaggtcgac catatgggag agctcccaac gcgttggatg catagcttga
gtattctata 480gtgtcaccta aatagcttgg cgtaatcatg gtcatagctg tttcctgtgt
gaaattgtta 540tccgctcaca attccacaca acatacgagc cggaagcata aagtgtaaag
cctggggtgc 600ctaatgagtg agctaactca cattaattgc gttgcgctca ctgcccgctt
tccagtcggg 660aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag
gcggtttgcg 720tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg
ttcggctgcg 780gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat
caggggataa 840cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta
aaaaggccgc 900gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa
atcgacgctc 960aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc
cccctggaag 1020ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt
ccgcctttct 1080cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca
gttcggtgta 1140ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg
accgctgcgc 1200cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat
cgccactggc 1260agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta
cagagttctt 1320gaagtggtgg cctaactacg gctacactag aagaacagta tttggtatct
gcgctctgct 1380gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac
aaaccaccgc 1440tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa
aaggatctca 1500agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa
actcacgtta 1560agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt
taaattaaaa 1620atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca
gttaccaatg 1680cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca
tagttgcctg 1740actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc
ccagtgctgc 1800aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa
accagccagc 1860cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc
agtctattaa 1920ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca
acgttgttgc 1980cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat
tcagctccgg 2040ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag
cggttagctc 2100cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac
tcatggttat 2160ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt
ctgtgactgg 2220tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt
gctcttgccc 2280ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc
tcatcattgg 2340aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat
ccagttcgat 2400gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca
gcgtttctgg 2460gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga
cacggaaatg 2520ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg
gttattgtct 2580catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg
ttccgcgcac 2640atttccccga aaagtgccac ctgatgcggt gtgaaatacc gcacagatgc
gtaaggagaa 2700aataccgcat caggaaattg taagcgttaa tattttgtta aaattcgcgt
taaatttttg 2760ttaaatcagc tcatttttta accaataggc cgaaatcggc aaaatccctt
ataaatcaaa 2820agaatagacc gagatagggt tgagtgttgt tccagtttgg aacaagagtc
cactattaaa 2880gaacgtggac tccaacgtca aagggcgaaa aaccgtctat cagggcgatg
gcccactacg 2940tgaaccatca ccctaatcaa gttttttggg gtcgaggtgc cgtaaagcac
taaatcggaa 3000ccctaaaggg agcccccgat ttagagcttg acggggaaag ccggcgaacg
tggcgagaaa 3060ggaagggaag aaagcgaaag gagcgggcgc tagggcgctg gcaagtgtag
cggtcacgct 3120gcgcgtaacc accacacccg ccgcgcttaa tgcgccgcta cagggcgcgt
ccattcgcca 3180ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg cctcttcgct
attacgccag 3240ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg taacgccagg
gttttcccag 3300tcacgacgtt gtaaaacgac ggccagtgaa ttgtaatacg actcactata
gggcgaattg 3360ggcccgacgt cgcatgctcc cggccgccat ggccgcggga tt
3402
60 3204 DNA artificial sequence plasmid 60
aatcactagt
gcggccgcct gcaggtcgac catatgggag agctcccaac gcgttggatg 60catagcttga
gtattctata gtgtcaccta aatagcttgg cgtaatcatg gtcatagctg 120tttcctgtgt
gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata 180aagtgtaaag
cctggggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca 240ctgcccgctt
tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc 300gcggggagag
gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg 360cgctcggtcg
ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 420tccacagaat
caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 480aggaaccgta
aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 540catcacaaaa
atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 600caggcgtttc
cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 660ggatacctgt
ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 720aggtatctca
gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 780gttcagcccg
accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 840cacgacttat
cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 900ggcggtgcta
cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta 960tttggtatct
gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 1020tccggcaaac
aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 1080cgcagaaaaa
aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 1140tggaacgaaa
actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 1200tagatccttt
taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact 1260tggtctgaca
gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt 1320cgttcatcca
tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta 1380ccatctggcc
ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta 1440tcagcaataa
accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc 1500gcctccatcc
agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat 1560agtttgcgca
acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt 1620atggcttcat
tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg 1680tgcaaaaaag
cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca 1740gtgttatcac
tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta 1800agatgctttt
ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg 1860cgaccgagtt
gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact 1920ttaaaagtgc
tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg 1980ctgttgagat
ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt 2040actttcacca
gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga 2100ataagggcga
cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc 2160atttatcagg
gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa 2220caaatagggg
ttccgcgcac atttccccga aaagtgccac ctgatgcggt gtgaaatacc 2280gcacagatgc
gtaaggagaa aataccgcat caggaaattg taagcgttaa tattttgtta 2340aaattcgcgt
taaatttttg ttaaatcagc tcatttttta accaataggc cgaaatcggc 2400aaaatccctt
ataaatcaaa agaatagacc gagatagggt tgagtgttgt tccagtttgg 2460aacaagagtc
cactattaaa gaacgtggac tccaacgtca aagggcgaaa aaccgtctat 2520cagggcgatg
gcccactacg tgaaccatca ccctaatcaa gttttttggg gtcgaggtgc 2580cgtaaagcac
taaatcggaa ccctaaaggg agcccccgat ttagagcttg acggggaaag 2640ccggcgaacg
tggcgagaaa ggaagggaag aaagcgaaag gagcgggcgc tagggcgctg 2700gcaagtgtag
cggtcacgct gcgcgtaacc accacacccg ccgcgcttaa tgcgccgcta 2760cagggcgcgt
ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg 2820cctcttcgct
attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg 2880taacgccagg
gttttcccag tcacgacgtt gtaaaacgac ggccagtgaa ttgtaatacg 2940actcactata
gggcgaattg ggcccgacgt cgcatgctcc cggccgccat ggccgcggga 3000ttggatccac
tcgaagaata caataagcga tacaaacagg tcgatcccag ccccaaacct 3060tcttcgtaat
cttcaacatc gcttggatta gctttattaa tttatttata ttacatccta 3120attttaaaaa
gctttgggta tttcttgatt tcgtgaattg tcccttttta tcaaaaagga 3180tcgaaatgtt
gtatgtggtc taga
3204
61 3790 DNA artificial sequence plasmid
61 gatccactcg aagaatacaa
taagcgatac aaacaggtcg atcccagccc caaaccttct 60tcgtaatctt caacatcgct
tggattagct ttattaattt atttatatta catcctaatt 120ttaaaaagct ttgggtattt
cttgatttcg tgaattgtcc ctttttatca aaaaggatcg 180aaatgttgta tgtggtctag
aaatcactag tgcggccgcc tgcaggtcga ccatatggga 240gagctcccaa cgcgttggat
gcatagcttg agtattctat agtgtcacct aaatagcttg 300gcgtaatcat ggtcatagct
gtttcctgtg tgaaattgtt atccgctcac aattccacac 360aacatacgag ccggaagcat
aaagtgtaaa gcctggggtg cctaatgagt gagctaactc 420acattaattg cgttgcgctc
actgcccgct ttccagtcgg gaaacctgtc gtgccagctg 480cattaatgaa tcggccaacg
cgcggggaga ggcggtttgc gtattgggcg ctcttccgct 540tcctcgctca ctgactcgct
gcgctcggtc gttcggctgc ggcgagcggt atcagctcac 600tcaaaggcgg taatacggtt
atccacagaa tcaggggata acgcaggaaa gaacatgtga 660gcaaaaggcc agcaaaaggc
caggaaccgt aaaaaggccg cgttgctggc gtttttccat 720aggctccgcc cccctgacga
gcatcacaaa aatcgacgct caagtcagag gtggcgaaac 780ccgacaggac tataaagata
ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 840gttccgaccc tgccgcttac
cggatacctg tccgcctttc tcccttcggg aagcgtggcg 900ctttctcata gctcacgctg
taggtatctc agttcggtgt aggtcgttcg ctccaagctg 960ggctgtgtgc acgaaccccc
cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1020cttgagtcca acccggtaag
acacgactta tcgccactgg cagcagccac tggtaacagg 1080attagcagag cgaggtatgt
aggcggtgct acagagttct tgaagtggtg gcctaactac 1140ggctacacta gaagaacagt
atttggtatc tgcgctctgc tgaagccagt taccttcgga 1200aaaagagttg gtagctcttg
atccggcaaa caaaccaccg ctggtagcgg tggttttttt 1260gtttgcaagc agcagattac
gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt 1320tctacggggt ctgacgctca
gtggaacgaa aactcacgtt aagggatttt ggtcatgaga 1380ttatcaaaaa ggatcttcac
ctagatcctt ttaaattaaa aatgaagttt taaatcaatc 1440taaagtatat atgagtaaac
ttggtctgac agttaccaat gcttaatcag tgaggcacct 1500atctcagcga tctgtctatt
tcgttcatcc atagttgcct gactccccgt cgtgtagata 1560actacgatac gggagggctt
accatctggc cccagtgctg caatgatacc gcgagaccca 1620cgctcaccgg ctccagattt
atcagcaata aaccagccag ccggaagggc cgagcgcaga 1680agtggtcctg caactttatc
cgcctccatc cagtctatta attgttgccg ggaagctaga 1740gtaagtagtt cgccagttaa
tagtttgcgc aacgttgttg ccattgctac aggcatcgtg 1800gtgtcacgct cgtcgtttgg
tatggcttca ttcagctccg gttcccaacg atcaaggcga 1860gttacatgat cccccatgtt
gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt 1920gtcagaagta agttggccgc
agtgttatca ctcatggtta tggcagcact gcataattct 1980cttactgtca tgccatccgt
aagatgcttt tctgtgactg gtgagtactc aaccaagtca 2040ttctgagaat agtgtatgcg
gcgaccgagt tgctcttgcc cggcgtcaat acgggataat 2100accgcgccac atagcagaac
tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga 2160aaactctcaa ggatcttacc
gctgttgaga tccagttcga tgtaacccac tcgtgcaccc 2220aactgatctt cagcatcttt
tactttcacc agcgtttctg ggtgagcaaa aacaggaagg 2280caaaatgccg caaaaaaggg
aataagggcg acacggaaat gttgaatact catactcttc 2340ctttttcaat attattgaag
catttatcag ggttattgtc tcatgagcgg atacatattt 2400gaatgtattt agaaaaataa
acaaataggg gttccgcgca catttccccg aaaagtgcca 2460cctgatgcgg tgtgaaatac
cgcacagatg cgtaaggaga aaataccgca tcaggaaatt 2520gtaagcgtta atattttgtt
aaaattcgcg ttaaattttt gttaaatcag ctcatttttt 2580aaccaatagg ccgaaatcgg
caaaatccct tataaatcaa aagaatagac cgagataggg 2640ttgagtgttg ttccagtttg
gaacaagagt ccactattaa agaacgtgga ctccaacgtc 2700aaagggcgaa aaaccgtcta
tcagggcgat ggcccactac gtgaaccatc accctaatca 2760agttttttgg ggtcgaggtg
ccgtaaagca ctaaatcgga accctaaagg gagcccccga 2820tttagagctt gacggggaaa
gccggcgaac gtggcgagaa aggaagggaa gaaagcgaaa 2880ggagcgggcg ctagggcgct
ggcaagtgta gcggtcacgc tgcgcgtaac caccacaccc 2940gccgcgctta atgcgccgct
acagggcgcg tccattcgcc attcaggctg cgcaactgtt 3000gggaagggcg atcggtgcgg
gcctcttcgc tattacgcca gctggcgaaa gggggatgtg 3060ctgcaaggcg attaagttgg
gtaacgccag ggttttccca gtcacgacgt tgtaaaacga 3120cggccagtga attgtaatac
gactcactat agggcgaatt gggcccgacg tcgcatgctc 3180ccggccgcca tggccgcggg
attggatcca acgcaattaa tgtgagttag ctcactcatt 3240aggcacccca ggctttacac
tttatgcttc cggctcgtat gttgtgtgga attgtgagcg 3300gataacaatt tcacacagga
aacagctatg accatgatta cgccaagcta tttaggtgac 3360actatagaat actcaagcta
tgcatccaac gcgttgggag ctctcccata tggtcgacct 3420gcaggcggcc gcactagtga
ttgcggccgc atgtataatt ccacatacaa catttcgatc 3480ctttttgata aaaagggaca
attcacgaaa tcaagaaata cccaaagctt tttaaaatta 3540ggatgtaata taaataaatt
aataaagcta atccaagcga tgttgaagat tacgaagaag 3600gtttggggct gggatcgacc
tgtttgtatc gcttattgta ttcttcgagt tcgtcgtagt 3660acacatccca gacgcatcgg
acgcacccgc tgccgcagca atcgccgggc tccggcttct 3720ccggcggcgg cggaatctcc
ttcgccggat tcggggtttt atcgtcgggt ttgggttgga 3780tcgggtgcgg
3790
62 8113 DNA artificial sequence vector
62 ggccgcgaca caagtgtgag agtactaaat aaatgctttg gttgtacgaa
atcattacac 60taaataaaat aatcaaagct tatatatgcc ttccgctaag gccgaatgca
aagaaattgg 120ttctttctcg ttatcttttg ccacttttac tagtacgtat taattactac
ttaatcatct 180ttgtttacgg ctcattatat ccgtcgacgg cgcgcccgat catccggata
tagttcctcc 240tttcagcaaa aaacccctca agacccgttt agaggcccca aggggttatg
ctagttattg 300ctcagcggtg gcagcagcca actcagcttc ctttcgggct ttgttagcag
ccggatcgat 360ccaagctgta cctcactatt cctttgccct cggacgagtg ctggggcgtc
ggtttccact 420atcggcgagt acttctacac agccatcggt ccagacggcc gcgcttctgc
gggcgatttg 480tgtacgcccg acagtcccgg ctccggatcg gacgattgcg tcgcatcgac
cctgcgccca 540agctgcatca tcgaaattgc cgtcaaccaa gctctgatag agttggtcaa
gaccaatgcg 600gagcatatac gcccggagcc gcggcgatcc tgcaagctcc ggatgcctcc
gctcgaagta 660gcgcgtctgc tgctccatac aagccaacca cggcctccag aagaagatgt
tggcgacctc 720gtattgggaa tccccgaaca tcgcctcgct ccagtcaatg accgctgtta
tgcggccatt 780gtccgtcagg acattgttgg agccgaaatc cgcgtgcacg aggtgccgga
cttcggggca 840gtcctcggcc caaagcatca gctcatcgag agcctgcgcg acggacgcac
tgacggtgtc 900gtccatcaca gtttgccagt gatacacatg gggatcagca atcgcgcata
tgaaatcacg 960ccatgtagtg tattgaccga ttccttgcgg tccgaatggg ccgaacccgc
tcgtctggct 1020aagatcggcc gcagcgatcg catccatagc ctccgcgacc ggctgcagaa
cagcgggcag 1080ttcggtttca ggcaggtctt gcaacgtgac accctgtgca cggcgggaga
tgcaataggt 1140caggctctcg ctgaattccc caatgtcaag cacttccgga atcgggagcg
cggccgatgc 1200aaagtgccga taaacataac gatctttgta gaaaccatcg gcgcagctat
ttacccgcag 1260gacatatcca cgccctccta catcgaagct gaaagcacga gattcttcgc
cctccgagag 1320ctgcatcagg tcggagacgc tgtcgaactt ttcgatcaga aacttctcga
cagacgtcgc 1380ggtgagttca ggcttttcca tgggtatatc tccttcttaa agttaaacaa
aattatttct 1440agagggaaac cgttgtggtc tccctatagt gagtcgtatt aatttcgcgg
gatcgagatc 1500gatccaattc caatcccaca aaaatctgag cttaacagca cagttgctcc
tctcagagca 1560gaatcgggta ttcaacaccc tcatatcaac tactacgttg tgtataacgg
tccacatgcc 1620ggtatatacg atgactgggg ttgtacaaag gcggcaacaa acggcgttcc
cggagttgca 1680cacaagaaat ttgccactat tacagaggca agagcagcag ctgacgcgta
cacaacaagt 1740cagcaaacag acaggttgaa cttcatcccc aaaggagaag ctcaactcaa
gcccaagagc 1800tttgctaagg ccctaacaag cccaccaaag caaaaagccc actggctcac
gctaggaacc 1860aaaaggccca gcagtgatcc agccccaaaa gagatctcct ttgccccgga
gattacaatg 1920gacgatttcc tctatcttta cgatctagga aggaagttcg aaggtgaagg
tgacgacact 1980atgttcacca ctgataatga gaaggttagc ctcttcaatt tcagaaagaa
tgctgaccca 2040cagatggtta gagaggccta cgcagcaggt ctcatcaaga cgatctaccc
gagtaacaat 2100ctccaggaga tcaaatacct tcccaagaag gttaaagatg cagtcaaaag
attcaggact 2160aattgcatca agaacacaga gaaagacata tttctcaaga tcagaagtac
tattccagta 2220tggacgattc aaggcttgct tcataaacca aggcaagtaa tagagattgg
agtctctaaa 2280aaggtagttc ctactgaatc taaggccatg catggagtct aagattcaaa
tcgaggatct 2340aacagaactc gccgtgaaga ctggcgaaca gttcatacag agtcttttac
gactcaatga 2400caagaagaaa atcttcgtca acatggtgga gcacgacact ctggtctact
ccaaaaatgt 2460caaagataca gtctcagaag accaaagggc tattgagact tttcaacaaa
ggataatttc 2520gggaaacctc ctcggattcc attgcccagc tatctgtcac ttcatcgaaa
ggacagtaga 2580aaaggaaggt ggctcctaca aatgccatca ttgcgataaa ggaaaggcta
tcattcaaga 2640tgcctctgcc gacagtggtc ccaaagatgg acccccaccc acgaggagca
tcgtggaaaa 2700agaagacgtt ccaaccacgt cttcaaagca agtggattga tgtgacatct
ccactgacgt 2760aagggatgac gcacaatccc actatccttc gcaagaccct tcctctatat
aaggaagttc 2820atttcatttg gagaggacac gctcgagctc atttctctat tacttcagcc
ataacaaaag 2880aactcttttc tcttcttatt aaaccatgaa aaagcctgaa ctcaccgcga
cgtctgtcga 2940gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct
cggagggcga 3000agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc
gggtaaatag 3060ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat
cggccgcgct 3120cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct
attgcatctc 3180ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc
ccgctgttct 3240gcagccggtc gcggaggcca tggatgcgat cgctgcggcc gatcttagcc
agacgagcgg 3300gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg
atttcatatg 3360cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca
ccgtcagtgc 3420gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc
ccgaagtccg 3480gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg
gccgcataac 3540agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg
tcgccaacat 3600cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact
tcgagcggag 3660gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca
ttggtcttga 3720ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg
cgcagggtcg 3780atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa
tcgcccgcag 3840aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg
gaaaccgacg 3900ccccagcact cgtccgaggg caaaggaata gtgaggtacc taaagaagga
gtgcgtcgaa 3960gcagatcgtt caaacatttg gcaataaagt ttcttaagat tgaatcctgt
tgccggtctt 4020gcgatgatta tcatataatt tctgttgaat tacgttaagc atgtaataat
taacatgtaa 4080tgcatgacgt tatttatgag atgggttttt atgattagag tcccgcaatt
atacatttaa 4140tacgcgatag aaaacaaaat atagcgcgca aactaggata aattatcgcg
cgcggtgtca 4200tctatgttac tagatcgatg tcgaatctga tcaacctgca ttaatgaatc
ggccaacgcg 4260cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact
gactcgctgc 4320gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta
atacggttat 4380ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag
caaaaggcca 4440ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc
cctgacgagc 4500atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta
taaagatacc 4560aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg
ccgcttaccg 4620gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcaatgc
tcacgctgta 4680ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac
gaaccccccg 4740ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac
ccggtaagac 4800acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg
aggtatgtag 4860gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga
aggacagtat 4920ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt
agctcttgat 4980ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag
cagattacgc 5040gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct
gacgctcagt 5100ggaacgaaaa ctcacgttaa gggattttgg tcatgacatt aacctataaa
aataggcgta 5160tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg tgaaaacctc
tgacacatgc 5220agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga
caagcccgtc 5280agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg
gcatcagagc 5340agattgtact gagagtgcac catatggaca tattgtcgtt agaacgcggc
tacaattaat 5400acataacctt atgtatcata cacatacgat ttaggtgaca ctatagaacg
gcgcgccaag 5460cttggatcct cgaagagaag ggttaataac acactttttt aacattttta
acacaaattt 5520tagttattta aaaatttatt aaaaaattta aaataagaag aggaactctt
taaataaatc 5580taacttacaa aatttatgat ttttaataag ttttcaccaa taaaaaatgt
cataaaaata 5640tgttaaaaag tatattatca atattctctt tatgataaat aaaaagaaaa
aaaaaataaa 5700agttaagtga aaatgagatt gaagtgactt taggtgtgta taaatatatc
aaccccgcca 5760acaatttatt taatccaaat atattgaagt atattattcc atagccttta
tttatttata 5820tatttattat ataaaagctt tatttgttct aggttgttca tgaaatattt
ttttggtttt 5880atctccgttg taagaaaatc atgtgctttg tgtcgccact cactattgca
gctttttcat 5940gcattggtca gattgacggt tgattgtatt tttgtttttt atggttttgt
gttatgactt 6000aagtcttcat ctctttatct cttcatcagg tttgatggtt acctaatatg
gtccatgggt 6060acatgcatgg ttaaattagg tggccaactt tgttgtgaac gatagaattt
tttttatatt 6120aagtaaacta tttttatatt atgaaataat aataaaaaaa atattttatc
attattaaca 6180aaatcatatt agttaatttg ttaactctat aataaaagaa atactgtaac
attcacatta 6240catggtaaca tctttccacc ctttcatttg ttttttgttt gatgactttt
tttcttgttt 6300aaatttattt cccttctttt aaatttggaa tacattatca tcatatataa
actaaaatac 6360taaaaacagg attacacaaa tgataaataa taacacaaat atttataaat
ctagctgcaa 6420tatatttaaa ctagctatat cgatattgta aaataaaact agctgcattg
atactgataa 6480aaaaatatca tgtgctttct ggactgatga tgcagtatac ttttgacatt
gcctttattt 6540tatttttcag aaaagctttc ttagttctgg gttcttcatt atttgtttcc
catctccatt 6600gtgaattgaa tcatttgctt cgtgtcacaa atacaattta gntaggtaca
tgcattggtc 6660agattcacgg tttattatgt catgacttaa gttcatggta gtacattacc
tgccacgcat 6720gcattatatt ggttagattt gataggcaaa tttggttgtc aacaatataa
atataaataa 6780tgtttttata ttacgaaata acagtgatca aaacaaacag ttttatcttt
attaacaaga 6840ttttgttttt gtttgatgac gttttttaat gtttacgctt tcccccttct
tttgaattta 6900gaacacttta tcatcataaa atcaaatact aaaaaaatta catatttcat
aaataataac 6960acaaatattt ttaaaaaatc tgaaataata atgaacaata ttacatatta
tcacgaaaat 7020tcattaataa aaatattata taaataaaat gtaatagtag ttatatgtag
gaaaaaagta 7080ctgcacgcat aatatataca aaaagattaa aatgaactat tataaataat
aacactaaat 7140taatggtgaa tcatatcaaa ataatgaaaa agtaaataaa atttgtaatt
aacttctata 7200tgtattacac acacaaataa taaataatag taaaaaaaat tatgataaat
atttaccatc 7260tcataagata tttaaaataa tgataaaaat atagattatt ttttatgcaa
ctagctagcc 7320aaaaagagaa cacgggtata tataaaaaga gtacctttaa attctactgt
acttccttta 7380ttcctgacgt ttttatatca agtggacata cgtgaagatt ttaattatca
gtctaaatat 7440ttcattagca cttaatactt ttctgtttta ttcctatcct ataagtagtc
ccgattctcc 7500caacattgct tattcacaca actaactaag aaagtcttcc atagcccccc
aagcggccgc 7560atgtataatt ccacatacaa catttcgatc ctttttgata aaaagggaca
attcacgaaa 7620tcaagaaata cccaaagctt tttaaaatta ggatgtaata taaataaatt
aataaagcta 7680atccaagcga tgttgaagat tacgaagaag gtttggggct gggatcgacc
tgtttgtatc 7740gcttattgta ttcttcgagt tcgtcgtagt acacatccca gacgcatcgg
acgcacccgc 7800tgccgcagca atcgccgggc tccggcttct ccggcggcgg cggaatctcc
ttcgccggat 7860tcggggtttt atcgtcgggt ttgggttgga tcgggtgcgg gatccactcg
aagaatacaa 7920taagcgatac aaacaggtcg atcccagccc caaaccttct tcgtaatctt
caacatcgct 7980tggattagct ttattaattt atttatatta catcctaatt ttaaaaagct
ttgggtattt 8040cttgatttcg tgaattgtcc ctttttatca aaaaggatcg aaatgttgta
tgtggtctag 8100aaatcactag tgc
8113
63 5267 DNA artificial sequence vector
63 atctgatcaa cctgcattaa
tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg 60ggcgctcttc cgcttcctcg
ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 120cggtatcagc tcactcaaag
gcggtaatac ggttatccac agaatcaggg gataacgcag 180gaaagaacat gtgagcaaaa
ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 240tggcgttttt ccataggctc
cgcccccctg acgagcatca caaaaatcga cgctcaagtc 300agaggtggcg aaacccgaca
ggactataaa gataccaggc gtttccccct ggaagctccc 360tcgtgcgctc tcctgttccg
accctgccgc ttaccggata cctgtccgcc tttctccctt 420cgggaagcgt ggcgctttct
caatgctcac gctgtaggta tctcagttcg gtgtaggtcg 480ttcgctccaa gctgggctgt
gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 540ccggtaacta tcgtcttgag
tccaacccgg taagacacga cttatcgcca ctggcagcag 600ccactggtaa caggattagc
agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 660ggtggcctaa ctacggctac
actagaagga cagtatttgg tatctgcgct ctgctgaagc 720cagttacctt cggaaaaaga
gttggtagct cttgatccgg caaacaaacc accgctggta 780gcggtggttt ttttgtttgc
aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 840atcctttgat cttttctacg
gggtctgacg ctcagtggaa cgaaaactca cgttaaggga 900ttttggtcat gacattaacc
tataaaaata ggcgtatcac gaggcccttt cgtctcgcgc 960gtttcggtga tgacggtgaa
aacctctgac acatgcagct cccggagacg gtcacagctt 1020gtctgtaagc ggatgccggg
agcagacaag cccgtcaggg cgcgtcagcg ggtgttggcg 1080ggtgtcgggg ctggcttaac
tatgcggcat cagagcagat tgtactgaga gtgcaccata 1140tggacatatt gtcgttagaa
cgcggctaca attaatacat aaccttatgt atcatacaca 1200tacgatttag gtgacactat
agaacggcgc gccaagcttg gatccgtcga cggcgcgccc 1260gatcatccgg atatagttcc
tcctttcagc aaaaaacccc tcaagacccg tttagaggcc 1320ccaaggggtt atgctagtta
ttgctcagcg gtggcagcag ccaactcagc ttcctttcgg 1380gctttgttag cagccggatc
gatccaagct gtacctcact attcctttgc cctcggacga 1440gtgctggggc gtcggtttcc
actatcggcg agtacttcta cacagccatc ggtccagacg 1500gccgcgcttc tgcgggcgat
ttgtgtacgc ccgacagtcc cggctccgga tcggacgatt 1560gcgtcgcatc gaccctgcgc
ccaagctgca tcatcgaaat tgccgtcaac caagctctga 1620tagagttggt caagaccaat
gcggagcata tacgcccgga gccgcggcga tcctgcaagc 1680tccggatgcc tccgctcgaa
gtagcgcgtc tgctgctcca tacaagccaa ccacggcctc 1740cagaagaaga tgttggcgac
ctcgtattgg gaatccccga acatcgcctc gctccagtca 1800atgaccgctg ttatgcggcc
attgtccgtc aggacattgt tggagccgaa atccgcgtgc 1860acgaggtgcc ggacttcggg
gcagtcctcg gcccaaagca tcagctcatc gagagcctgc 1920gcgacggacg cactgacggt
gtcgtccatc acagtttgcc agtgatacac atggggatca 1980gcaatcgcgc atatgaaatc
acgccatgta gtgtattgac cgattccttg cggtccgaat 2040gggccgaacc cgctcgtctg
gctaagatcg gccgcagcga tcgcatccat agcctccgcg 2100accggctgca gaacagcggg
cagttcggtt tcaggcaggt cttgcaacgt gacaccctgt 2160gcacggcggg agatgcaata
ggtcaggctc tcgctgaatt ccccaatgtc aagcacttcc 2220ggaatcggga gcgcggccga
tgcaaagtgc cgataaacat aacgatcttt gtagaaacca 2280tcggcgcagc tatttacccg
caggacatat ccacgccctc ctacatcgaa gctgaaagca 2340cgagattctt cgccctccga
gagctgcatc aggtcggaga cgctgtcgaa cttttcgatc 2400agaaacttct cgacagacgt
cgcggtgagt tcaggctttt ccatgggtat atctccttct 2460taaagttaaa caaaattatt
tctagaggga aaccgttgtg gtctccctat agtgagtcgt 2520attaatttcg cgggatcgag
atcgatccaa ttccaatccc acaaaaatct gagcttaaca 2580gcacagttgc tcctctcaga
gcagaatcgg gtattcaaca ccctcatatc aactactacg 2640ttgtgtataa cggtccacat
gccggtatat acgatgactg gggttgtaca aaggcggcaa 2700caaacggcgt tcccggagtt
gcacacaaga aatttgccac tattacagag gcaagagcag 2760cagctgacgc gtacacaaca
agtcagcaaa cagacaggtt gaacttcatc cccaaaggag 2820aagctcaact caagcccaag
agctttgcta aggccctaac aagcccacca aagcaaaaag 2880cccactggct cacgctagga
accaaaaggc ccagcagtga tccagcccca aaagagatct 2940cctttgcccc ggagattaca
atggacgatt tcctctatct ttacgatcta ggaaggaagt 3000tcgaaggtga aggtgacgac
actatgttca ccactgataa tgagaaggtt agcctcttca 3060atttcagaaa gaatgctgac
ccacagatgg ttagagaggc ctacgcagca ggtctcatca 3120agacgatcta cccgagtaac
aatctccagg agatcaaata ccttcccaag aaggttaaag 3180atgcagtcaa aagattcagg
actaattgca tcaagaacac agagaaagac atatttctca 3240agatcagaag tactattcca
gtatggacga ttcaaggctt gcttcataaa ccaaggcaag 3300taatagagat tggagtctct
aaaaaggtag ttcctactga atctaaggcc atgcatggag 3360tctaagattc aaatcgagga
tctaacagaa ctcgccgtga agactggcga acagttcata 3420cagagtcttt tacgactcaa
tgacaagaag aaaatcttcg tcaacatggt ggagcacgac 3480actctggtct actccaaaaa
tgtcaaagat acagtctcag aagaccaaag ggctattgag 3540acttttcaac aaaggataat
ttcgggaaac ctcctcggat tccattgccc agctatctgt 3600cacttcatcg aaaggacagt
agaaaaggaa ggtggctcct acaaatgcca tcattgcgat 3660aaaggaaagg ctatcattca
agatgcctct gccgacagtg gtcccaaaga tggaccccca 3720cccacgagga gcatcgtgga
aaaagaagac gttccaacca cgtcttcaaa gcaagtggat 3780tgatgtgaca tctccactga
cgtaagggat gacgcacaat cccactatcc ttcgcaagac 3840ccttcctcta tataaggaag
ttcatttcat ttggagagga cacgctcgag ctcatttctc 3900tattacttca gccataacaa
aagaactctt ttctcttctt attaaaccat gaaaaagcct 3960gaactcaccg cgacgtctgt
cgagaagttt ctgatcgaaa agttcgacag cgtctccgac 4020ctgatgcagc tctcggaggg
cgaagaatct cgtgctttca gcttcgatgt aggagggcgt 4080ggatatgtcc tgcgggtaaa
tagctgcgcc gatggtttct acaaagatcg ttatgtttat 4140cggcactttg catcggccgc
gctcccgatt ccggaagtgc ttgacattgg ggaattcagc 4200gagagcctga cctattgcat
ctcccgccgt gcacagggtg tcacgttgca agacctgcct 4260gaaaccgaac tgcccgctgt
tctgcagccg gtcgcggagg ccatggatgc gatcgctgcg 4320gccgatctta gccagacgag
cgggttcggc ccattcggac cgcaaggaat cggtcaatac 4380actacatggc gtgatttcat
atgcgcgatt gctgatcccc atgtgtatca ctggcaaact 4440gtgatggacg acaccgtcag
tgcgtccgtc gcgcaggctc tcgatgagct gatgctttgg 4500gccgaggact gccccgaagt
ccggcacctc gtgcacgcgg atttcggctc caacaatgtc 4560ctgacggaca atggccgcat
aacagcggtc attgactgga gcgaggcgat gttcggggat 4620tcccaatacg aggtcgccaa
catcttcttc tggaggccgt ggttggcttg tatggagcag 4680cagacgcgct acttcgagcg
gaggcatccg gagcttgcag gatcgccgcg gctccgggcg 4740tatatgctcc gcattggtct
tgaccaactc tatcagagct tggttgacgg caatttcgat 4800gatgcagctt gggcgcaggg
tcgatgcgac gcaatcgtcc gatccggagc cgggactgtc 4860gggcgtacac aaatcgcccg
cagaagcgcg gccgtctgga ccgatggctg tgtagaagta 4920ctcgccgata gtggaaaccg
acgccccagc actcgtccga gggcaaagga atagtgaggt 4980acctaaagaa ggagtgcgtc
gaagcagatc gttcaaacat ttggcaataa agtttcttaa 5040gattgaatcc tgttgccggt
cttgcgatga ttatcatata atttctgttg aattacgtta 5100agcatgtaat aattaacatg
taatgcatga cgttatttat gagatgggtt tttatgatta 5160gagtcccgca attatacatt
taatacgcga tagaaaacaa aatatagcgc gcaaactagg 5220ataaattatc gcgcgcggtg
tcatctatgt tactagatcg atgtcga 5267
64 142 PRT Populus trichocarpa
64 Met Glu Ala Thr Leu His Asn His Phe Leu Ser Arg Ile Phe Ser
Tyr1 5 10 15Thr Leu Pro
Lys Pro Lys Asn Pro Pro Asn Asp Pro Thr His Phe Ile 20
25 30Phe Ala Met Lys Asn Pro Phe Lys Pro Ile
Phe Ile Ser Pro Lys Thr 35 40
45Ile Thr Phe Asn Ser Arg Ser Gln Asp Pro Lys Ser Cys His Val Thr 50
55 60Ala Asn Phe Val Met Ala Thr Glu Asn
Lys Asn Glu Gln Ile Glu Ser65 70 75
80Thr Val Met Ser Lys Gln Gly Glu Glu Glu Ser Lys Lys Lys
Thr Ala 85 90 95Pro Pro
Pro Pro Pro Pro Pro Glu Lys Pro Glu Pro Gly Asp Cys Cys 100
105 110Gly Ser Gly Cys Val Arg Cys Val Trp
Asp Val Tyr Tyr Glu Glu Leu 115 120
125Glu Glu Tyr Asp Lys Leu Tyr Lys Ser Asp Ser Ser Lys Ser 130
135 140
65 115 PRT Oryza sativa
65 Met Leu Val Ala
Ala Leu Arg Val Pro Ala Pro Ile Pro Ser Ser Leu1 5
10 15Pro Ser Pro Ala Arg Pro Leu Leu Arg Arg
Arg Ser Ser His Arg Leu 20 25
30Pro Pro Pro Pro Pro Pro Ala Ala Ser Met Ala Asp Ala Gly Gly Ala
35 40 45Thr Thr Asn Lys Pro Ala Pro Ala
Pro Ala Pro Glu Pro Pro Glu Lys 50 55
60Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp65
70 75 80Asp Val Tyr Tyr Asp
Glu Leu Asp Ala Tyr Asn Lys Ala Leu Ala Ala 85
90 95His Ser Ser Ser Ala Ser Ser Gly Ser Lys Pro
Ala Thr Ser Asp Gly 100 105
110Ala Lys Ser 115
66 122 PRT Zea mays
66 Met Leu Gly Ala Val Val Arg
Val Pro Gly Pro Ile Leu Pro Phe Leu1 5 10
15Pro Gly Pro Thr Arg Pro Leu Leu Arg Arg Arg His Tyr
Leu Pro Pro 20 25 30Glu Thr
Pro Met Ala Ser Ala Thr Pro Cys Asp Gly Gly Thr Gly Lys 35
40 45Pro Asp Ala Ala Pro Ala Pro Thr Pro Ala
Pro Thr Pro Leu Pro Pro 50 55 60Glu
Lys Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys65
70 75 80Val Trp Asp Ile Tyr Phe
Asp Glu Leu Asp Ala Tyr Asp Lys Ala Leu 85
90 95Ala Ala Arg Ala Ala Ser Ser Gly Ser Gly Gly Lys
Asp Asp Ser Ala 100 105 110Asp
Thr Lys Pro Lys Glu Gly Ala Lys Ser 115
120
67 122 PRT Zea mays
67 Met Leu Gly Ala Val Val Arg Val Pro Gly Pro Ile Leu
Pro Phe Leu1 5 10 15Pro
Gly Pro Thr Arg Pro Leu Leu Arg Arg Arg His Tyr Leu Pro Pro 20
25 30Glu Thr Pro Met Ala Ser Ala Thr
Pro Cys Asp Gly Gly Thr Gly Lys 35 40
45Pro Asp Ala Ala Pro Ala Pro Thr Pro Ala Pro Thr Pro Leu Pro Pro
50 55 60Glu Lys Pro Leu Pro Gly Asp Cys
Cys Gly Ser Gly Cys Val Arg Cys65 70 75
80Val Trp Asp Ile Tyr Phe Asp Glu Leu Asp Ala Tyr Asp
Lys Ala Leu 85 90 95Ala
Ala His Ala Ala Ser Ser Gly Ser Gly Gly Lys Asp Asp Ser Ala
100 105 110Asp Thr Lys Pro Lys Glu Gly
Ala Lys Ser 115 120
68 597 DNA Arabidopsis thaliana
68 acgaaaaaag gaaatagaaa aaaaaagaag agaacaaatc tttttgatct gtgcgtatgg
60ttgttgtgtc tcttcttcct cgaatctcga tcgttacatc accgggttct agccttcacg
120atgtgctttt gagcatgaga tttggtttga cgcgacatct ccctctcaaa cgatctttct
180ccaattattc aatcacttcc gtatctccag aacaacagct caaatctccg gtgaccatgg
240cgacgaccga gagcaagaat cttgtagaag cttccaagga ggagacaaac aagaaggaga
300cagaagataa gaaggaggtg ggagtttcgg ttcctccacc gccagagaaa ccagagcctg
360gcgattgttg cggtagcggt tgcgtccgat gcgtttggga tgtttattac gatgagctcg
420aagattacaa caagcagctt tctggagaaa ctaaatcaat ttgactgatt tttcctcgca
480ttgttaatgg agaaattaaa catttgtctt tgtcgatttg atgatacagt gcttttgttg
540aacaacattt tggatctctc tatgaacttg agctgattta cttgtgaata gaagaaa
597
69 135 PRT Arabidopsis thaliana
69 Met Val Val Val Ser Leu Leu Pro Arg Ile
Ser Ile Val Thr Ser Pro1 5 10
15Gly Ser Ser Leu His Asp Val Leu Leu Ser Met Arg Phe Gly Leu Thr
20 25 30Arg His Leu Pro Leu Lys
Arg Ser Phe Ser Asn Tyr Ser Ile Thr Ser 35 40
45Val Ser Pro Glu Gln Gln Leu Lys Ser Pro Val Thr Met Ala
Thr Thr 50 55 60Glu Ser Lys Asn Leu
Val Glu Ala Ser Lys Glu Glu Thr Asn Lys Lys65 70
75 80Glu Thr Glu Asp Lys Lys Glu Val Gly Val
Ser Val Pro Pro Pro Pro 85 90
95Glu Lys Pro Glu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys
100 105 110Val Trp Asp Val Tyr
Tyr Asp Glu Leu Glu Asp Tyr Asn Lys Gln Leu 115
120 125Ser Gly Glu Thr Lys Ser Ile 130
135
70 27 PRT Artificial sequence misc_feature
70 Pro Glu Lys Pro Xaa Xaa Gly
Asp Cys Cys Gly Ser Gly Cys Val Arg1 5 10
15Xaa Xaa Xaa Asp Xaa Tyr Xaa Xaa Glu Leu Xaa
20 25
71 30 DNA artificial sequence primer
71 gcggccgcgt
tgttgtgtct cttcttcctc
30
72 30 DNA artificial sequence primer
72 ggatccctac aagattcttg ctctcggtcg
30
73 30 DNA artificial sequence ggatccttccaaggaggagacaaacaagaa
73 ggatccttcc aaggaggaga caaacaagaa
30
74 30 DNA artificial sequence primer
74 gctgcagtta gtttctccag aaagctgctt
30
75 38 DNA artificial sequence primer
75 gaattcgcgg ccgcgttgtt gtgtctcttc
ttcctcga 38
76 30 DNA artificial sequence primer
76 ctgcagctac aagattcttg ctctcggtcg
30
77 3239 DNA artificial sequence plasmid
77 taatcactag tgaattcgcg gccgcctgca
ggtcgaccat atgggagagc tcccaacgcg 60ttggatgcat agcttgagta ttctatagtg
tcacctaaat agcttggcgt aatcatggtc 120atagctgttt cctgtgtgaa attgttatcc
gctcacaatt ccacacaaca tacgagccgg 180aagcataaag tgtaaagcct ggggtgccta
atgagtgagc taactcacat taattgcgtt 240gcgctcactg cccgctttcc agtcgggaaa
cctgtcgtgc cagctgcatt aatgaatcgg 300ccaacgcgcg gggagaggcg gtttgcgtat
tgggcgctct tccgcttcct cgctcactga 360ctcgctgcgc tcggtcgttc ggctgcggcg
agcggtatca gctcactcaa aggcggtaat 420acggttatcc acagaatcag gggataacgc
aggaaagaac atgtgagcaa aaggccagca 480aaaggccagg aaccgtaaaa aggccgcgtt
gctggcgttt ttccataggc tccgcccccc 540tgacgagcat cacaaaaatc gacgctcaag
tcagaggtgg cgaaacccga caggactata 600aagataccag gcgtttcccc ctggaagctc
cctcgtgcgc tctcctgttc cgaccctgcc 660gcttaccgga tacctgtccg cctttctccc
ttcgggaagc gtggcgcttt ctcatagctc 720acgctgtagg tatctcagtt cggtgtaggt
cgttcgctcc aagctgggct gtgtgcacga 780accccccgtt cagcccgacc gctgcgcctt
atccggtaac tatcgtcttg agtccaaccc 840ggtaagacac gacttatcgc cactggcagc
agccactggt aacaggatta gcagagcgag 900gtatgtaggc ggtgctacag agttcttgaa
gtggtggcct aactacggct acactagaag 960aacagtattt ggtatctgcg ctctgctgaa
gccagttacc ttcggaaaaa gagttggtag 1020ctcttgatcc ggcaaacaaa ccaccgctgg
tagcggtggt ttttttgttt gcaagcagca 1080gattacgcgc agaaaaaaag gatctcaaga
agatcctttg atcttttcta cggggtctga 1140cgctcagtgg aacgaaaact cacgttaagg
gattttggtc atgagattat caaaaaggat 1200cttcacctag atccttttaa attaaaaatg
aagttttaaa tcaatctaaa gtatatatga 1260gtaaacttgg tctgacagtt accaatgctt
aatcagtgag gcacctatct cagcgatctg 1320tctatttcgt tcatccatag ttgcctgact
ccccgtcgtg tagataacta cgatacggga 1380gggcttacca tctggcccca gtgctgcaat
gataccgcga gacccacgct caccggctcc 1440agatttatca gcaataaacc agccagccgg
aagggccgag cgcagaagtg gtcctgcaac 1500tttatccgcc tccatccagt ctattaattg
ttgccgggaa gctagagtaa gtagttcgcc 1560agttaatagt ttgcgcaacg ttgttgccat
tgctacaggc atcgtggtgt cacgctcgtc 1620gtttggtatg gcttcattca gctccggttc
ccaacgatca aggcgagtta catgatcccc 1680catgttgtgc aaaaaagcgg ttagctcctt
cggtcctccg atcgttgtca gaagtaagtt 1740ggccgcagtg ttatcactca tggttatggc
agcactgcat aattctctta ctgtcatgcc 1800atccgtaaga tgcttttctg tgactggtga
gtactcaacc aagtcattct gagaatagtg 1860tatgcggcga ccgagttgct cttgcccggc
gtcaatacgg gataataccg cgccacatag 1920cagaacttta aaagtgctca tcattggaaa
acgttcttcg gggcgaaaac tctcaaggat 1980cttaccgctg ttgagatcca gttcgatgta
acccactcgt gcacccaact gatcttcagc 2040atcttttact ttcaccagcg tttctgggtg
agcaaaaaca ggaaggcaaa atgccgcaaa 2100aaagggaata agggcgacac ggaaatgttg
aatactcata ctcttccttt ttcaatatta 2160ttgaagcatt tatcagggtt attgtctcat
gagcggatac atatttgaat gtatttagaa 2220aaataaacaa ataggggttc cgcgcacatt
tccccgaaaa gtgccacctg atgcggtgtg 2280aaataccgca cagatgcgta aggagaaaat
accgcatcag gaaattgtaa gcgttaatat 2340tttgttaaaa ttcgcgttaa atttttgtta
aatcagctca ttttttaacc aataggccga 2400aatcggcaaa atcccttata aatcaaaaga
atagaccgag atagggttga gtgttgttcc 2460agtttggaac aagagtccac tattaaagaa
cgtggactcc aacgtcaaag ggcgaaaaac 2520cgtctatcag ggcgatggcc cactacgtga
accatcaccc taatcaagtt ttttggggtc 2580gaggtgccgt aaagcactaa atcggaaccc
taaagggagc ccccgattta gagcttgacg 2640gggaaagccg gcgaacgtgg cgagaaagga
agggaagaaa gcgaaaggag cgggcgctag 2700ggcgctggca agtgtagcgg tcacgctgcg
cgtaaccacc acacccgccg cgcttaatgc 2760gccgctacag ggcgcgtcca ttcgccattc
aggctgcgca actgttggga agggcgatcg 2820gtgcgggcct cttcgctatt acgccagctg
gcgaaagggg gatgtgctgc aaggcgatta 2880agttgggtaa cgccagggtt ttcccagtca
cgacgttgta aaacgacggc cagtgaattg 2940taatacgact cactataggg cgaattgggc
ccgacgtcgc atgctcccgg ccgccatggc 3000ggccgcggga attcgatgcg gccgcgttgt
tgtgtctctt cttcctcgaa tctcgatcgt 3060tacatcaccg ggttctagcc ttcacgatgt
gcttttgagc atgagatttg gtttgacgcg 3120acatctccct ctcaaacgat ctttctccaa
ttattcaatc acttccgtat ctccagaaca 3180acagctcaaa tctccggtga ccatggcgac
gaccgagagc aagaatcttg tagggatcc 3239
78 3213 DNA artificial sequence plasmid
78 taatcactag tgaattcgcg gccgcctgca ggtcgaccat atgggagagc
tcccaacgcg 60ttggatgcat agcttgagta ttctatagtg tcacctaaat agcttggcgt
aatcatggtc 120atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca
tacgagccgg 180aagcataaag tgtaaagcct ggggtgccta atgagtgagc taactcacat
taattgcgtt 240gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt
aatgaatcgg 300ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct
cgctcactga 360ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa
aggcggtaat 420acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa
aaggccagca 480aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc
tccgcccccc 540tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga
caggactata 600aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc
cgaccctgcc 660gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt
ctcatagctc 720acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct
gtgtgcacga 780accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg
agtccaaccc 840ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta
gcagagcgag 900gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct
acactagaag 960aacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa
gagttggtag 1020ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt
gcaagcagca 1080gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta
cggggtctga 1140cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat
caaaaaggat 1200cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa
gtatatatga 1260gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct
cagcgatctg 1320tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta
cgatacggga 1380gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct
caccggctcc 1440agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg
gtcctgcaac 1500tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa
gtagttcgcc 1560agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt
cacgctcgtc 1620gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta
catgatcccc 1680catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca
gaagtaagtt 1740ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta
ctgtcatgcc 1800atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct
gagaatagtg 1860tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg
cgccacatag 1920cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac
tctcaaggat 1980cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact
gatcttcagc 2040atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa
atgccgcaaa 2100aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt
ttcaatatta 2160ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat
gtatttagaa 2220aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg
atgcggtgtg 2280aaataccgca cagatgcgta aggagaaaat accgcatcag gaaattgtaa
gcgttaatat 2340tttgttaaaa ttcgcgttaa atttttgtta aatcagctca ttttttaacc
aataggccga 2400aatcggcaaa atcccttata aatcaaaaga atagaccgag atagggttga
gtgttgttcc 2460agtttggaac aagagtccac tattaaagaa cgtggactcc aacgtcaaag
ggcgaaaaac 2520cgtctatcag ggcgatggcc cactacgtga accatcaccc taatcaagtt
ttttggggtc 2580gaggtgccgt aaagcactaa atcggaaccc taaagggagc ccccgattta
gagcttgacg 2640gggaaagccg gcgaacgtgg cgagaaagga agggaagaaa gcgaaaggag
cgggcgctag 2700ggcgctggca agtgtagcgg tcacgctgcg cgtaaccacc acacccgccg
cgcttaatgc 2760gccgctacag ggcgcgtcca ttcgccattc aggctgcgca actgttggga
agggcgatcg 2820gtgcgggcct cttcgctatt acgccagctg gcgaaagggg gatgtgctgc
aaggcgatta 2880agttgggtaa cgccagggtt ttcccagtca cgacgttgta aaacgacggc
cagtgaattg 2940taatacgact cactataggg cgaattgggc ccgacgtcgc atgctcccgg
ccgccatggc 3000ggccgcggga attcgatgga tccttccaag gaggagacaa acaagaagga
gacagaagat 3060aagaaggagg tgggagtttc ggttcctcca ccgccagaga aaccagagcc
tggcgattgt 3120tgcggtagcg gttgcgtccg atgcgtttgg gatgtttatt acgatgagct
cgaagattac 3180aacaagcagc tttctggaga aactaactgc agc
3213
SEQ ID NO: 79; length: 3245; type: DNA; organism: artificial sequence; other information: plasmid
taatcactag
tgaattcgcg gccgcctgca ggtcgaccat atgggagagc tcccaacgcg 60ttggatgcat
agcttgagta ttctatagtg tcacctaaat agcttggcgt aatcatggtc 120atagctgttt
cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 180aagcataaag
tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt 240gcgctcactg
cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg 300ccaacgcgcg
gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 360ctcgctgcgc
tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 420acggttatcc
acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 480aaaggccagg
aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 540tgacgagcat
cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 600aagataccag
gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 660gcttaccgga
tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc 720acgctgtagg
tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 780accccccgtt
cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 840ggtaagacac
gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 900gtatgtaggc
ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 960aacagtattt
ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 1020ctcttgatcc
ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 1080gattacgcgc
agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 1140cgctcagtgg
aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat 1200cttcacctag
atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga 1260gtaaacttgg
tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg 1320tctatttcgt
tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga 1380gggcttacca
tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc 1440agatttatca
gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac 1500tttatccgcc
tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 1560agttaatagt
ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 1620gtttggtatg
gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc 1680catgttgtgc
aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 1740ggccgcagtg
ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc 1800atccgtaaga
tgcttttctg tgactggtga gtactcaacc aagtcattct gagaatagtg 1860tatgcggcga
ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag 1920cagaacttta
aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 1980cttaccgctg
ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc 2040atcttttact
ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 2100aaagggaata
agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta 2160ttgaagcatt
tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 2220aaataaacaa
ataggggttc cgcgcacatt tccccgaaaa gtgccacctg atgcggtgtg 2280aaataccgca
cagatgcgta aggagaaaat accgcatcag gaaattgtaa gcgttaatat 2340tttgttaaaa
ttcgcgttaa atttttgtta aatcagctca ttttttaacc aataggccga 2400aatcggcaaa
atcccttata aatcaaaaga atagaccgag atagggttga gtgttgttcc 2460agtttggaac
aagagtccac tattaaagaa cgtggactcc aacgtcaaag ggcgaaaaac 2520cgtctatcag
ggcgatggcc cactacgtga accatcaccc taatcaagtt ttttggggtc 2580gaggtgccgt
aaagcactaa atcggaaccc taaagggagc ccccgattta gagcttgacg 2640gggaaagccg
gcgaacgtgg cgagaaagga agggaagaaa gcgaaaggag cgggcgctag 2700ggcgctggca
agtgtagcgg tcacgctgcg cgtaaccacc acacccgccg cgcttaatgc 2760gccgctacag
ggcgcgtcca ttcgccattc aggctgcgca actgttggga agggcgatcg 2820gtgcgggcct
cttcgctatt acgccagctg gcgaaagggg gatgtgctgc aaggcgatta 2880agttgggtaa
cgccagggtt ttcccagtca cgacgttgta aaacgacggc cagtgaattg 2940taatacgact
cactataggg cgaattgggc ccgacgtcgc atgctcccgg ccgccatggc 3000ggccgcggga
attcgatctg cagctacaag attcttgctc tcggtcgtcg ccatggtcac 3060cggagatttg
agctgttgtt ctggagatac ggaagtgatt gaataattgg agaaagatcg 3120tttgagaggg
agatgtcgcg tcaaaccaaa tctcatgctc aaaagcacat cgtgaaggct 3180agaacccggt
gatgtaacga tcgagattcg aggaagaaga gacacaacaa cgcggccgcg 3240aattc
3245
SEQ ID NO: 80; length: 3154; type: DNA; organism: artificial sequence; other information: plasmid
gatcccccgg gctgcaggaa
ttcgatatca agcttatcga taccgtcgac ctcgaggggg 60ggcccggtac ccagcttttg
ttccctttag tgagggttaa tttcgagctt ggcgtaatca 120tggtcatagc tgtttcctgt
gtgaaattgt tatccgctca caattccaca caacatacga 180gccggaagca taaagtgtaa
agcctggggt gcctaatgag tgagctaact cacattaatt 240gcgttgcgct cactgcccgc
tttccagtcg ggaaacctgt cgtgccagct gcattaatga 300atcggccaac gcgcggggag
aggcggtttg cgtattgggc gctcttccgc ttcctcgctc 360actgactcgc tgcgctcggt
cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg 420gtaatacggt tatccacaga
atcaggggat aacgcaggaa agaacatgtg agcaaaaggc 480cagcaaaagg ccaggaaccg
taaaaaggcc gcgttgctgg cgtttttcca taggctccgc 540ccccctgacg agcatcacaa
aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga 600ctataaagat accaggcgtt
tccccctgga agctccctcg tgcgctctcc tgttccgacc 660ctgccgctta ccggatacct
gtccgccttt ctcccttcgg gaagcgtggc gctttctcat 720agctcacgct gtaggtatct
cagttcggtg taggtcgttc gctccaagct gggctgtgtg 780cacgaacccc ccgttcagcc
cgaccgctgc gccttatccg gtaactatcg tcttgagtcc 840aacccggtaa gacacgactt
atcgccactg gcagcagcca ctggtaacag gattagcaga 900gcgaggtatg taggcggtgc
tacagagttc ttgaagtggt ggcctaacta cggctacact 960agaaggacag tatttggtat
ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt 1020ggtagctctt gatccggcaa
acaaaccacc gctggtagcg gtggtttttt tgtttgcaag 1080cagcagatta cgcgcagaaa
aaaaggatct caagaagatc ctttgatctt ttctacgggg 1140tctgacgctc agtggaacga
aaactcacgt taagggattt tggtcatgag attatcaaaa 1200aggatcttca cctagatcct
tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata 1260tatgagtaaa cttggtctga
cagttaccaa tgcttaatca gtgaggcacc tatctcagcg 1320atctgtctat ttcgttcatc
catagttgcc tgactccccg tcgtgtagat aactacgata 1380cgggagggct taccatctgg
ccccagtgct gcaatgatac cgcgagaccc acgctcaccg 1440gctccagatt tatcagcaat
aaaccagcca gccggaaggg ccgagcgcag aagtggtcct 1500gcaactttat ccgcctccat
ccagtctatt aattgttgcc gggaagctag agtaagtagt 1560tcgccagtta atagtttgcg
caacgttgtt gccattgcta caggcatcgt ggtgtcacgc 1620tcgtcgtttg gtatggcttc
attcagctcc ggttcccaac gatcaaggcg agttacatga 1680tcccccatgt tgtgcaaaaa
agcggttagc tccttcggtc ctccgatcgt tgtcagaagt 1740aagttggccg cagtgttatc
actcatggtt atggcagcac tgcataattc tcttactgtc 1800atgccatccg taagatgctt
ttctgtgact ggtgagtact caaccaagtc attctgagaa 1860tagtgtatgc ggcgaccgag
ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca 1920catagcagaa ctttaaaagt
gctcatcatt ggaaaacgtt cttcggggcg aaaactctca 1980aggatcttac cgctgttgag
atccagttcg atgtaaccca ctcgtgcacc caactgatct 2040tcagcatctt ttactttcac
cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc 2100gcaaaaaagg gaataagggc
gacacggaaa tgttgaatac tcatactctt cctttttcaa 2160tattattgaa gcatttatca
gggttattgt ctcatgagcg gatacatatt tgaatgtatt 2220tagaaaaata aacaaatagg
ggttccgcgc acatttcccc gaaaagtgcc acctaaattg 2280taagcgttaa tattttgtta
aaattcgcgt taaatttttg ttaaatcagc tcatttttta 2340accaataggc cgaaatcggc
aaaatccctt ataaatcaaa agaatagacc gagatagggt 2400tgagtgttgt tccagtttgg
aacaagagtc cactattaaa gaacgtggac tccaacgtca 2460aagggcgaaa aaccgtctat
cagggcgatg gcccactacg tgaaccatca ccctaatcaa 2520gttttttggg gtcgaggtgc
cgtaaagcac taaatcggaa ccctaaaggg agcccccgat 2580ttagagcttg acggggaaag
ccggcgaacg tggcgagaaa ggaagggaag aaagcgaaag 2640gagcgggcgc tagggcgctg
gcaagtgtag cggtcacgct gcgcgtaacc accacacccg 2700ccgcgcttaa tgcgccgcta
cagggcgcgt cccattcgcc attcaggctg cgcaactgtt 2760gggaagggcg atcggtgcgg
gcctcttcgc tattacgcca gctggcgaaa gggggatgtg 2820ctgcaaggcg attaagttgg
gtaacgccag ggttttccca gtcacgacgt tgtaaaacga 2880cggccagtga attgtaatac
gactcactat agggcgaatt ggagctccac cgcggtggcg 2940gccgcgttgt tgtgtctctt
cttcctcgaa tctcgatcgt tacatcaccg ggttctagcc 3000ttcacgatgt gcttttgagc
atgagatttg gtttgacgcg acatctccct ctcaaacgat 3060ctttctccaa ttattcaatc
acttccgtat ctccagaaca acagctcaaa tctccggtga 3120ccatggcgac gaccgagagc
aagaatcttg tagg 3154
SEQ ID NO: 81; length: 3331; type: DNA; organism: artificial sequence; other information: plasmid
ggaattcgat atcaagctta tcgataccgt cgacctcgag ggggggcccg
gtacccagct 60tttgttccct ttagtgaggg ttaatttcga gcttggcgta atcatggtca
tagctgtttc 120ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga
agcataaagt 180gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg
cgctcactgc 240ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc
caacgcgcgg 300ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac
tcgctgcgct 360cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata
cggttatcca 420cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa
aaggccagga 480accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct
gacgagcatc 540acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa
agataccagg 600cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg
cttaccggat 660acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca
cgctgtaggt 720atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa
ccccccgttc 780agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg
gtaagacacg 840acttatcgcc actggcagca gccactggta acaggattag cagagcgagg
tatgtaggcg 900gtgctacaga gttcttgaag tggtggccta actacggcta cactagaagg
acagtatttg 960gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc
tcttgatccg 1020gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag
attacgcgca 1080gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac
gctcagtgga 1140acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc
ttcacctaga 1200tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag
taaacttggt 1260ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt
ctatttcgtt 1320catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag
ggcttaccat 1380ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca
gatttatcag 1440caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact
ttatccgcct 1500ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca
gttaatagtt 1560tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg
tttggtatgg 1620cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc
atgttgtgca 1680aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg
gccgcagtgt 1740tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca
tccgtaagat 1800gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt
atgcggcgac 1860cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc
agaactttaa 1920aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc
ttaccgctgt 1980tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca
tcttttactt 2040tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa
aagggaataa 2100gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat
tgaagcattt 2160atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa
aataaacaaa 2220taggggttcc gcgcacattt ccccgaaaag tgccacctaa attgtaagcg
ttaatatttt 2280gttaaaattc gcgttaaatt tttgttaaat cagctcattt tttaaccaat
aggccgaaat 2340cggcaaaatc ccttataaat caaaagaata gaccgagata gggttgagtg
ttgttccagt 2400ttggaacaag agtccactat taaagaacgt ggactccaac gtcaaagggc
gaaaaaccgt 2460ctatcagggc gatggcccac tacgtgaacc atcaccctaa tcaagttttt
tggggtcgag 2520gtgccgtaaa gcactaaatc ggaaccctaa agggagcccc cgatttagag
cttgacgggg 2580aaagccggcg aacgtggcga gaaaggaagg gaagaaagcg aaaggagcgg
gcgctagggc 2640gctggcaagt gtagcggtca cgctgcgcgt aaccaccaca cccgccgcgc
ttaatgcgcc 2700gctacagggc gcgtcccatt cgccattcag gctgcgcaac tgttgggaag
ggcgatcggt 2760gcgggcctct tcgctattac gccagctggc gaaaggggga tgtgctgcaa
ggcgattaag 2820ttgggtaacg ccagggtttt cccagtcacg acgttgtaaa acgacggcca
gtgaattgta 2880atacgactca ctatagggcg aattggagct ccaccgcggt ggcggccgcg
ttgttgtgtc 2940tcttcttcct cgaatctcga tcgttacatc accgggttct agccttcacg
atgtgctttt 3000gagcatgaga tttggtttga cgcgacatct ccctctcaaa cgatctttct
ccaattattc 3060aatcacttcc gtatctccag aacaacagct caaatctccg gtgaccatgg
cgacgaccga 3120gagcaagaat cttgtaggga tccttccaag gaggagacaa acaagaagga
gacagaagat 3180aagaaggagg tgggagtttc ggttcctcca ccgccagaga aaccagagcc
tggcgattgt 3240tgcggtagcg gttgcgtccg atgcgtttgg gatgtttatt acgatgagct
cgaagattac 3300aacaagcagc tttctggaga aactaactgc a
3331
SEQ ID NO: 82; length: 3547; type: DNA; organism: artificial sequence; other information: plasmid
aattcgatat
caagcttatc gataccgtcg acctcgaggg ggggcccggt acccagcttt 60tgttcccttt
agtgagggtt aatttcgagc ttggcgtaat catggtcata gctgtttcct 120gtgtgaaatt
gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt 180aaagcctggg
gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc 240gctttccagt
cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg 300agaggcggtt
tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg 360gtcgttcggc
tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca 420gaatcagggg
ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac 480cgtaaaaagg
ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac 540aaaaatcgac
gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg 600tttccccctg
gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac 660ctgtccgcct
ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat 720ctcagttcgg
tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 780cccgaccgct
gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac 840ttatcgccac
tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt 900gctacagagt
tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt 960atctgcgctc
tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc 1020aaacaaacca
ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga 1080aaaaaaggat
ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac 1140gaaaactcac
gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc 1200cttttaaatt
aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct 1260gacagttacc
aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca 1320tccatagttg
cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct 1380ggccccagtg
ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca 1440ataaaccagc
cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc 1500atccagtcta
ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg 1560cgcaacgttg
ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct 1620tcattcagct
ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa 1680aaagcggtta
gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta 1740tcactcatgg
ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc 1800ttttctgtga
ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg 1860agttgctctt
gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa 1920gtgctcatca
ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg 1980agatccagtt
cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc 2040accagcgttt
ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg 2100gcgacacgga
aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat 2160cagggttatt
gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata 2220ggggttccgc
gcacatttcc ccgaaaagtg ccacctaaat tgtaagcgtt aatattttgt 2280taaaattcgc
gttaaatttt tgttaaatca gctcattttt taaccaatag gccgaaatcg 2340gcaaaatccc
ttataaatca aaagaataga ccgagatagg gttgagtgtt gttccagttt 2400ggaacaagag
tccactatta aagaacgtgg actccaacgt caaagggcga aaaaccgtct 2460atcagggcga
tggcccacta cgtgaaccat caccctaatc aagttttttg gggtcgaggt 2520gccgtaaagc
actaaatcgg aaccctaaag ggagcccccg atttagagct tgacggggaa 2580agccggcgaa
cgtggcgaga aaggaaggga agaaagcgaa aggagcgggc gctagggcgc 2640tggcaagtgt
agcggtcacg ctgcgcgtaa ccaccacacc cgccgcgctt aatgcgccgc 2700tacagggcgc
gtcccattcg ccattcaggc tgcgcaactg ttgggaaggg cgatcggtgc 2760gggcctcttc
gctattacgc cagctggcga aagggggatg tgctgcaagg cgattaagtt 2820gggtaacgcc
agggttttcc cagtcacgac gttgtaaaac gacggccagt gaattgtaat 2880acgactcact
atagggcgaa ttggagctcc accgcggtgg cggccgcgtt gttgtgtctc 2940ttcttcctcg
aatctcgatc gttacatcac cgggttctag ccttcacgat gtgcttttga 3000gcatgagatt
tggtttgacg cgacatctcc ctctcaaacg atctttctcc aattattcaa 3060tcacttccgt
atctccagaa caacagctca aatctccggt gaccatggcg acgaccgaga 3120gcaagaatct
tgtagggatc cttccaagga ggagacaaac aagaaggaga cagaagataa 3180gaaggaggtg
ggagtttcgg ttcctccacc gccagagaaa ccagagcctg gcgattgttg 3240cggtagcggt
tgcgtccgat gcgtttggga tgtttattac gatgagctcg aagattacaa 3300caagcagctt
tctggagaaa ctaactgcag ctacaagatt cttgctctcg gtcgtcgcca 3360tggtcaccgg
agatttgagc tgttgttctg gagatacgga agtgattgaa taattggaga 3420aagatcgttt
gagagggaga tgtcgcgtca aaccaaatct catgctcaaa agcacatcgt 3480gaaggctaga
acccggtgat gtaacgatcg agattcgagg aagaagagac acaacaacgc 3540ggccgcg
3547
SEQ ID NO: 83; length: 3453; type: DNA; organism: artificial sequence; other information: plasmid
tcttccatag ccccccaagc
ggccgcgaca caagtgtgag agtactaaat aaatgctttg 60gttgtacgaa atcattacac
taaataaaat aatcaaagct tatatatgcc ttccgctaag 120gccgaatgca aagaaattgg
ttctttctcg ttatcttttg ccacttttac tagtacgtat 180taattactac ttaatcatct
ttgtttacgg ctcattatat ccgtcgacgg cgcgcccgat 240catccggata tagttcctcc
tttcagcaaa aaacccctca agacccgttt agaggcccca 300aggggttatg ctagttattg
ctcagcggtg gcagcagcca actcagcttc ctttcgggct 360ttgttagcag ccggatcgat
ccaagctgta cctcactatt cctttgccct cggacgagtg 420ctggggcgtc ggtttccact
atcggcgagt acttctacac agccatcggt ccagacggcc 480gcgcttctgc gggcgatttg
tgtacgcccg acagtcccgg ctccggatcg gacgattgcg 540tcgcatcgac cctgcgccca
agctgcatca tcgaaattgc cgtcaaccaa gctctgatag 600agttggtcaa gaccaatgcg
gagcatatac gcccggagcc gcggcgatcc tgcaagctcc 660ggatgcctcc gctcgaagta
gcgcgtctgc tgctccatac aagccaacca cggcctccag 720aagaagatgt tggcgacctc
gtattgggaa tccccgaaca tcgcctcgct ccagtcaatg 780accgctgtta tgcggccatt
gtccgtcagg acattgttgg agccgaaatc cgcgtgcacg 840aggtgccgga cttcggggca
gtcctcggcc caaagcatca gctcatcgag agcctgcgcg 900acggacgcac tgacggtgtc
gtccatcaca gtttgccagt gatacacatg gggatcagca 960atcgcgcata tgaaatcacg
ccatgtagtg tattgaccga ttccttgcgg tccgaatggg 1020ccgaacccgc tcgtctggct
aagatcggcc gcagcgatcg catccatagc ctccgcgacc 1080ggctgcagaa cagcgggcag
ttcggtttca ggcaggtctt gcaacgtgac accctgtgca 1140cggcgggaga tgcaataggt
caggctctcg ctgaattccc caatgtcaag cacttccgga 1200atcgggagcg cggccgatgc
aaagtgccga taaacataac gatctttgta gaaaccatcg 1260gcgcagctat ttacccgcag
gacatatcca cgccctccta catcgaagct gaaagcacga 1320gattcttcgc cctccgagag
ctgcatcagg tcggagacgc tgtcgaactt ttcgatcaga 1380aacttctcga cagacgtcgc
ggtgagttca ggcttttcca tgggtatatc tccttcttaa 1440agttaaacaa aattatttct
agagggaaac cgttgtggtc tccctatagt gagtcgtatt 1500aatttcgcgg gatcgagatc
tgatcaacct gcattaatga atcggccaac gcgcggggag 1560aggcggtttg cgtattgggc
gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 1620cgttcggctg cggcgagcgg
tatcagctca ctcaaaggcg gtaatacggt tatccacaga 1680atcaggggat aacgcaggaa
agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 1740taaaaaggcc gcgttgctgg
cgtttttcca taggctccgc ccccctgacg agcatcacaa 1800aaatcgacgc tcaagtcaga
ggtggcgaaa cccgacagga ctataaagat accaggcgtt 1860tccccctgga agctccctcg
tgcgctctcc tgttccgacc ctgccgctta ccggatacct 1920gtccgccttt ctcccttcgg
gaagcgtggc gctttctcaa tgctcacgct gtaggtatct 1980cagttcggtg taggtcgttc
gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 2040cgaccgctgc gccttatccg
gtaactatcg tcttgagtcc aacccggtaa gacacgactt 2100atcgccactg gcagcagcca
ctggtaacag gattagcaga gcgaggtatg taggcggtgc 2160tacagagttc ttgaagtggt
ggcctaacta cggctacact agaaggacag tatttggtat 2220ctgcgctctg ctgaagccag
ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 2280acaaaccacc gctggtagcg
gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 2340aaaaggatct caagaagatc
ctttgatctt ttctacgggg tctgacgctc agtggaacga 2400aaactcacgt taagggattt
tggtcatgac attaacctat aaaaataggc gtatcacgag 2460gccctttcgt ctcgcgcgtt
tcggtgatga cggtgaaaac ctctgacaca tgcagctccc 2520ggagacggtc acagcttgtc
tgtaagcgga tgccgggagc agacaagccc gtcagggcgc 2580gtcagcgggt gttggcgggt
gtcggggctg gcttaactat gcggcatcag agcagattgt 2640actgagagtg caccatatgg
acatattgtc gttagaacgc ggctacaatt aatacataac 2700cttatgtatc atacacatac
gatttaggtg acactataga acggcgcgcc aagcttggat 2760cctagcctaa gtacgtactc
aaaatgccaa caaataaaaa aaaagttgct ttaataatgc 2820caaaacaaat taataaaaca
cttacaacac cggatttttt ttaattaaaa tgtgccattt 2880aggataaata gttaatattt
ttaataatta tttaaaaagc cgtatctact aaaatgattt 2940ttatttggtt gaaaatatta
atatgtttaa atcaacacaa tctatcaaaa ttaaactaaa 3000aaaaaaataa gtgtacgtgg
ttaacattag tacagtaata taagaggaaa atgagaaatt 3060aagaaattga aagcgagtct
aatttttaaa ttatgaacct gcatatataa aaggaaagaa 3120agaatccagg aagaaaagaa
atgaaaccat gcatggtccc ctcgtcatca cgagtttctg 3180ccatttgcaa tagaaacact
gaaacacctt tctctttgtc acttaattga gatgccgaag 3240ccacctcaca ccatgaactt
catgaggtgt agcacccaag gcttccatag ccatgcatac 3300tgaagaatgt ctcaagctca
gcaccctact tctgtgacgt gtccctcatt caccttcctc 3360tcttccctat aaataaccac
gcctcaggtt ctccgcttca caactcaaac attctctcca 3420ttggtcctta aacactcatc
agtcatcacc atg 3453
SEQ ID NO: 84; length: 4072; type: DNA; organism: artificial sequence; other information: plasmid
ggccgcgaca caagtgtgag agtactaaat aaatgctttg gttgtacgaa
atcattacac 60taaataaaat aatcaaagct tatatatgcc ttccgctaag gccgaatgca
aagaaattgg 120ttctttctcg ttatcttttg ccacttttac tagtacgtat taattactac
ttaatcatct 180ttgtttacgg ctcattatat ccgtcgacgg cgcgcccgat catccggata
tagttcctcc 240tttcagcaaa aaacccctca agacccgttt agaggcccca aggggttatg
ctagttattg 300ctcagcggtg gcagcagcca actcagcttc ctttcgggct ttgttagcag
ccggatcgat 360ccaagctgta cctcactatt cctttgccct cggacgagtg ctggggcgtc
ggtttccact 420atcggcgagt acttctacac agccatcggt ccagacggcc gcgcttctgc
gggcgatttg 480tgtacgcccg acagtcccgg ctccggatcg gacgattgcg tcgcatcgac
cctgcgccca 540agctgcatca tcgaaattgc cgtcaaccaa gctctgatag agttggtcaa
gaccaatgcg 600gagcatatac gcccggagcc gcggcgatcc tgcaagctcc ggatgcctcc
gctcgaagta 660gcgcgtctgc tgctccatac aagccaacca cggcctccag aagaagatgt
tggcgacctc 720gtattgggaa tccccgaaca tcgcctcgct ccagtcaatg accgctgtta
tgcggccatt 780gtccgtcagg acattgttgg agccgaaatc cgcgtgcacg aggtgccgga
cttcggggca 840gtcctcggcc caaagcatca gctcatcgag agcctgcgcg acggacgcac
tgacggtgtc 900gtccatcaca gtttgccagt gatacacatg gggatcagca atcgcgcata
tgaaatcacg 960ccatgtagtg tattgaccga ttccttgcgg tccgaatggg ccgaacccgc
tcgtctggct 1020aagatcggcc gcagcgatcg catccatagc ctccgcgacc ggctgcagaa
cagcgggcag 1080ttcggtttca ggcaggtctt gcaacgtgac accctgtgca cggcgggaga
tgcaataggt 1140caggctctcg ctgaattccc caatgtcaag cacttccgga atcgggagcg
cggccgatgc 1200aaagtgccga taaacataac gatctttgta gaaaccatcg gcgcagctat
ttacccgcag 1260gacatatcca cgccctccta catcgaagct gaaagcacga gattcttcgc
cctccgagag 1320ctgcatcagg tcggagacgc tgtcgaactt ttcgatcaga aacttctcga
cagacgtcgc 1380ggtgagttca ggcttttcca tgggtatatc tccttcttaa agttaaacaa
aattatttct 1440agagggaaac cgttgtggtc tccctatagt gagtcgtatt aatttcgcgg
gatcgagatc 1500tgatcaacct gcattaatga atcggccaac gcgcggggag aggcggtttg
cgtattgggc 1560gctcttccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg
cggcgagcgg 1620tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat
aacgcaggaa 1680agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc
gcgttgctgg 1740cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc
tcaagtcaga 1800ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga
agctccctcg 1860tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt
ctcccttcgg 1920gaagcgtggc gctttctcaa tgctcacgct gtaggtatct cagttcggtg
taggtcgttc 1980gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc
gccttatccg 2040gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg
gcagcagcca 2100ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc
ttgaagtggt 2160ggcctaacta cggctacact agaaggacag tatttggtat ctgcgctctg
ctgaagccag 2220ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc
gctggtagcg 2280gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct
caagaagatc 2340ctttgatctt ttctacgggg tctgacgctc agtggaacga aaactcacgt
taagggattt 2400tggtcatgac attaacctat aaaaataggc gtatcacgag gccctttcgt
ctcgcgcgtt 2460tcggtgatga cggtgaaaac ctctgacaca tgcagctccc ggagacggtc
acagcttgtc 2520tgtaagcgga tgccgggagc agacaagccc gtcagggcgc gtcagcgggt
gttggcgggt 2580gtcggggctg gcttaactat gcggcatcag agcagattgt actgagagtg
caccatatgg 2640acatattgtc gttagaacgc ggctacaatt aatacataac cttatgtatc
atacacatac 2700gatttaggtg acactataga acggcgcgcc aagcttggat cctagcctaa
gtacgtactc 2760aaaatgccaa caaataaaaa aaaagttgct ttaataatgc caaaacaaat
taataaaaca 2820cttacaacac cggatttttt ttaattaaaa tgtgccattt aggataaata
gttaatattt 2880ttaataatta tttaaaaagc cgtatctact aaaatgattt ttatttggtt
gaaaatatta 2940atatgtttaa atcaacacaa tctatcaaaa ttaaactaaa aaaaaaataa
gtgtacgtgg 3000ttaacattag tacagtaata taagaggaaa atgagaaatt aagaaattga
aagcgagtct 3060aatttttaaa ttatgaacct gcatatataa aaggaaagaa agaatccagg
aagaaaagaa 3120atgaaaccat gcatggtccc ctcgtcatca cgagtttctg ccatttgcaa
tagaaacact 3180gaaacacctt tctctttgtc acttaattga gatgccgaag ccacctcaca
ccatgaactt 3240catgaggtgt agcacccaag gcttccatag ccatgcatac tgaagaatgt
ctcaagctca 3300gcaccctact tctgtgacgt gtccctcatt caccttcctc tcttccctat
aaataaccac 3360gcctcaggtt ctccgcttca caactcaaac attctctcca ttggtcctta
aacactcatc 3420agtcatcacc atgtcttcca tagcccccca agcggccgcg ttgttgtgtc
tcttcttcct 3480cgaatctcga tcgttacatc accgggttct agccttcacg atgtgctttt
gagcatgaga 3540tttggtttga cgcgacatct ccctctcaaa cgatctttct ccaattattc
aatcacttcc 3600gtatctccag aacaacagct caaatctccg gtgaccatgg cgacgaccga
gagcaagaat 3660cttgtaggga tccttccaag gaggagacaa acaagaagga gacagaagat
aagaaggagg 3720tgggagtttc ggttcctcca ccgccagaga aaccagagcc tggcgattgt
tgcggtagcg 3780gttgcgtccg atgcgtttgg gatgtttatt acgatgagct cgaagattac
aacaagcagc 3840tttctggaga aactaactgc agctacaaga ttcttgctct cggtcgtcgc
catggtcacc 3900ggagatttga gctgttgttc tggagatacg gaagtgattg aataattgga
gaaagatcgt 3960ttgagaggga gatgtcgcgt caaaccaaat ctcatgctca aaagcacatc
gtgaaggcta 4020gaacccggtg atgtaacgat cgagattcga ggaagaagag acacaacaac
gc 4072
SEQ ID NO: 85; length: 14827; type: DNA; organism: artificial sequence; other information: plasmid
cgcgcctcga
gtgggcggat cccccgggct gcaggaattc actggccgtc gttttacaac 60gtcgtgactg
ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca catccccctt 120tcgccagctg
gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa cagttgcgca 180gcctgaatgg
cgaatggatc gatccatcgc gatgtacctt ttgttagtca gcctctcgat 240tgctcatcgt
cattacacag taccgaagtt tgatcgatct agtaacatag atgacaccgc 300gcgcgataat
ttatcctagt ttgcgcgcta tattttgttt tctatcgcgt attaaatgta 360taattgcggg
actctaatca taaaaaccca tctcataaat aacgtcatgc attacatgtt 420aattattaca
tgcttaacgt aattcaacag aaattatatg ataatcatcg caagaccggc 480aacaggattc
aatcttaaga aactttattg ccaaatgttt gaacgatctg cttcgacgca 540ctccttcttt
actccaccat ctcgtcctta ttgaaaacgt gggtagcacc aaaacgaatc 600aagtcgctgg
aactgaagtt accaatcacg ctggatgatt tgccagttgg attaatcttg 660cctttccccg
catgaataat attgatgaat gcatgcgtga ggggtagttc gatgttggca 720atagctgcaa
ttgccgcgac atcctccaac gagcataatt cttcagaaaa atagcgatgt 780tccatgttgt
cagggcatgc atgatgcacg ttatgaggtg acggtgctag gcagtattcc 840ctcaaagttt
catagtcagt atcatattca tcattgcatt cctgcaagag agaattgaga 900cgcaatccac
acgctgcggc aaccttccgg cgttcgtggt ctatttgctc ttggacgttg 960caaacgtaag
tgttggatcg atccggggtg ggcgaagaac tccagcatga gatccccgcg 1020ctggaggatc
atccagccgg cgtcccggaa aacgattccg aagcccaacc tttcatagaa 1080ggcggcggtg
gaatcgaaat ctcgtgatgg caggttgggc gtcgcttggt cggtcatttc 1140gaaccccaga
gtcccgctca gaagaactcg tcaagaaggc gatagaaggc gatgcgctgc 1200gaatcgggag
cggcgatacc gtaaagcacg aggaagcggt cagcccattc gccgccaagc 1260tcttcagcaa
tatcacgggt agccaacgct atgtcctgat agcggtccgc cacacccagc 1320cggccacagt
cgatgaatcc agaaaagcgg ccattttcca ccatgatatt cggcaagcag 1380gcatcgccat
gggtcacgac gagatcctcg ccgtcgggca tgcgcgcctt gagcctggcg 1440aacagttcgg
ctggcgcgag cccctgatgc tcttcgtcca gatcatcctg atcgacaaga 1500ccggcttcca
tccgagtacg tgctcgctcg atgcgatgtt tcgcttggtg gtcgaatggg 1560caggtagccg
gatcaagcgt atgcagccgc cgcattgcat cagccatgat ggatactttc 1620tcggcaggag
caaggtgaga tgacaggaga tcctgccccg gcacttcgcc caatagcagc 1680cagtcccttc
ccgcttcagt gacaacgtcg agcacagctg cgcaaggaac gcccgtcgtg 1740gccagccacg
atagccgcgc tgcctcgtcc tgcagttcat tcagggcacc ggacaggtcg 1800gtcttgacaa
aaagaaccgg gcgcccctgc gctgacagcc ggaacacggc ggcatcagag 1860cagccgattg
tctgttgtgc ccagtcatag ccgaatagcc tctccaccca agcggccgga 1920gaacctgcgt
gcaatccatc ttgttcaatc atgcgaaacg atccccgcaa gcttggagac 1980tggtgatttc
agcgtgtcct ctccaaatga aatgaacttc cttatataga ggaagggtct 2040tgcgaaggat
agtgggattg tgcgtcatcc cttacgtcag tggagatatc acatcaatcc 2100acttgctttg
aagacgtggt tggaacgtct tctttttcca cgatgctcct cgtgggtggg 2160ggtccatctt
tgggaccact gtcggcagag gcatcttcaa cgatggcctt tcctttatcg 2220caatgatggc
atttgtagga gccaccttcc ttttccacta tcttcacaat aaagtgacag 2280atagctgggc
aatggaatcc gaggaggttt ccggatatta ccctttgttg aaaagtctca 2340attgcccttt
ggtcttctga gactgtatct ttgatatttt tggagtagac aagcgtgtcg 2400tgctccacca
tgttgacgaa gattttcttc ttgtcattga gtcgtaagag actctgtatg 2460aactgttcgc
cagtctttac ggcgagttct gttaggtcct ctatttgaat ctttgactcc 2520atggcctttg
attcagtggg aactaccttt ttagagactc caatctctat tacttgcctt 2580ggtttgtgaa
gcaagccttg aatcgtccat actggaatag tacttctgat cttgagaaat 2640atatctttct
ctgtgttctt gatgcagtta gtcctgaatc ttttgactgc atctttaacc 2700ttcttgggaa
ggtatttgat ctcctggaga ttattgctcg ggtagatcgt cttgatgaga 2760cctgctgcgt
aagcctctct aaccatctgt gggttagcat tctttctgaa attgaaaagg 2820ctaatcttct
cattatcagt ggtgaacatg gtatcgtcac cttctccgtc gaacttcctg 2880actagatcgt
agagatagag gaagtcgtcc attgtgatct ctggggcaaa ggagatctga 2940attaattcga
tatggtggat ttatcacaaa tgggacccgc cgccgacaga ggtgtgatgt 3000taggccagga
ctttgaaaat ttgcgcaact atcgtatagt ggccgacaaa ttgacgccga 3060gttgacagac
tgcctagcat ttgagtgaat tatgtgaggt aatgggctac actgaattgg 3120tagctcaaac
tgtcagtatt tatgtatatg agtgtatatt ttcgcataat ctcagaccaa 3180tctgaagatg
aaatgggtat ctgggaatgg cgaaatcaag gcatcgatcg tgaagtttct 3240catctaagcc
cccatttgga cgtgaatgta gacacgtcga aataaagatt tccgaattag 3300aataatttgt
ttattgcttt cgcctataaa tacgacggat cgtaatttgt cgttttatca 3360aaatgtactt
tcattttata ataacgctgc ggacatctac atttttgaat tgaaaaaaaa 3420ttggtaatta
ctctttcttt ttctccatat tgaccatcat actcattgct gatccatgta 3480gatttcccgg
acatgaagcc atttacaatt gaatatatcc tgccgccgct gccgctttgc 3540acccggtgga
gcttgcatgt tggtttctac gcagaactga gccggttagg cagataattt 3600ccattgagaa
ctgagccatg tgcaccttcc ccccaacacg gtgagcgacg gggcaacgga 3660gtgatccaca
tgggactttt aaacatcatc cgtcggatgg cgttgcgaga gaagcagtcg 3720atccgtgaga
tcagccgacg caccgggcag gcgcgcaaca cgatcgcaaa gtatttgaac 3780gcaggtacaa
tcgagccgac gttcacgcgg aacgaccaag caagctagct ttaatgcggt 3840agtttatcac
agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 3900ctcatcgtca
tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 3960gtactgccgg
gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 4020gtgctgctag
cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 4080tccgaccgct
ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 4140tacgcgatca
tggcgaccac acccgtcctg tggtccaacc cctccgctgc tatagtgcag 4200tcggcttctg
acgttcagtg cagccgtctt ctgaaaacga catgtcgcac aagtcctaag 4260ttacgcgaca
ggctgccgcc ctgccctttt cctggcgttt tcttgtcgcg tgttttagtc 4320gcataaagta
gaatacttgc gactagaacc ggagacatta cgccatgaac aagagcgccg 4380ccgctggcct
gctgggctat gcccgcgtca gcaccgacga ccaggacttg accaaccaac 4440gggccgaact
gcacgcggcc ggctgcacca agctgttttc cgagaagatc accggcacca 4500ggcgcgaccg
cccggagctg gccaggatgc ttgaccacct acgccctggc gacgttgtga 4560cagtgaccag
gctagaccgc ctggcccgca gcacccgcga cctactggac attgccgagc 4620gcatccagga
ggccggcgcg ggcctgcgta gcctggcaga gccgtgggcc gacaccacca 4680cgccggccgg
ccgcatggtg ttgaccgtgt tcgccggcat tgccgagttc gagcgttccc 4740taatcatcga
ccgcacccgg agcgggcgcg aggccgccaa ggcccgaggc gtgaagtttg 4800gcccccgccc
taccctcacc ccggcacaga tcgcgcacgc ccgcgagctg atcgaccagg 4860aaggccgcac
cgtgaaagag gcggctgcac tgcttggcgt gcatcgctcg accctgtacc 4920gcgcacttga
gcgcagcgag gaagtgacgc ccaccgaggc caggcggcgc ggtgccttcc 4980gtgaggacgc
attgaccgag gccgacgccc tggcggccgc cgagaatgaa cgccaagagg 5040aacaagcatg
aaaccgcacc aggacggcca ggacgaaccg tttttcatta ccgaagagat 5100cgaggcggag
atgatcgcgg ccgggtacgt gttcgagccg cccgcgcacg tctcaaccgt 5160gcggctgcat
gaaatcctgg ccggtttgtc tgatgccaag ctggcggcct ggccggccag 5220cttggccgct
gaagaaaccg agcgccgccg tctaaaaagg tgatgtgtat ttgagtaaaa 5280cagcttgcgt
catgcggtcg ctgcgtatat gatgcgatga gtaaataaac aaatacgcaa 5340gggaacgcat
gaagttatcg ctgtacttaa ccagaaaggc gggtcaggca agacgaccat 5400cgcaacccat
ctagcccgcg ccctgcaact cgccggggcc gatgttctgt tagtcgattc 5460cgatccccag
ggcagtgccc gcgattgggc ggccgtgcgg gaagatcaac cgctaaccgt 5520tgtcggcatc
gaccgcccga cgattgaccg cgacgtgaag gccatcggcc ggcgcgactt 5580cgtagtgatc
gacggagcgc cccaggcggc ggacttggct gtgtccgcga tcaaggcagc 5640cgacttcgtg
ctgattccgg tgcagccaag cccttacgac atatgggcca ccgccgacct 5700ggtggagctg
gttaagcagc gcattgaggt cacggatgga aggctacaag cggcctttgt 5760cgtgtcgcgg
gcgatcaaag gcacgcgcat cggcggtgag gttgccgagg cgctggccgg 5820gtacgagctg
cccattcttg agtcccgtat cacgcagcgc gtgagctacc caggcactgc 5880cgccgccggc
acaaccgttc ttgaatcaga acccgagggc gacgctgccc gcgaggtcca 5940ggcgctggcc
gctgaaatta aatcaaaact catttgagtt aatgaggtaa agagaaaatg 6000agcaaaagca
caaacacgct aagtgccggc cgtccgagcg cacgcagcag caaggctgca 6060acgttggcca
gcctggcaga cacgccagcc atgaagcggg tcaactttca gttgccggcg 6120gaggatcaca
ccaagctgaa gatgtacgcg gtacgccaag gcaagaccat taccgagctg 6180ctatctgaat
acatcgcgca gctaccagag taaatgagca aatgaataaa tgagtagatg 6240aattttagcg
gctaaaggag gcggcatgga aaatcaagaa caaccaggca ccgacgccgt 6300ggaatgcccc
atgtgtggag gaacgggcgg ttggccaggc gtaagcggct gggttgtctg 6360ccggccctgc
aatggcactg gaacccccaa gcccgaggaa tcggcgtgag cggtcgcaaa 6420ccatccggcc
cggtacaaat cggcgcggcg ctgggtgatg acctggtgga gaagttgaag 6480gccgcgcagg
ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg tgaatcgtgg 6540caagcggccg
ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc cggtgcgccg 6600tcgattagga
agccgcccaa gggcgacgag caaccagatt ttttcgttcc gatgctctat 6660gacgtgggca
cccgcgatag tcgcagcatc atggacgtgg ccgttttccg tctgtcgaag 6720cgtgaccgac
gagctggcga ggtgatccgc tacgagcttc cagacgggca cgtagaggtt 6780tccgcagggc
cggccggcat ggccagtgtg tgggattacg acctggtact gatggcggtt 6840tcccatctaa
ccgaatccat gaaccgatac cgggaaggga agggagacaa gcccggccgc 6900gtgttccgtc
cacacgttgc ggacgtactc aagttctgcc ggcgagccga tggcggaaag 6960cagaaagacg
acctggtaga aacctgcatt cggttaaaca ccacgcacgt tgccatgcag 7020cgtacgaaga
aggccaagaa cggccgcctg gtgacggtat ccgagggtga agccttgatt 7080agccgctaca
agatcgtaaa gagcgaaacc gggcggccgg agtacatcga gatcgagcta 7140gctgattgga
tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct gacggttcac 7200cccgattact
ttttgatcga tcccggcatc ggccgttttc tctaccgcct ggcacgccgc 7260gccgcaggca
aggcagaagc cagatggttg ttcaagacga tctacgaacg cagtggcagc 7320gccggagagt
tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc aaatgacctg 7380ccggagtacg
atttgaagga ggaggcgggg caggctggcc cgatcctagt catgcgctac 7440cgcaacctga
tcgagggcga agcatccgcc ggttcctaat gtacggagca gatgctaggg 7500caaattgccc
tagcagggga aaaaggtcga aaaggtctct ttcctgtgga tagcacgtac 7560attgggaacc
caaagccgta cattgggaac cggaacccgt acattgggaa cccaaagccg 7620tacattggga
accggtcaca catgtaagtg actgatataa aagagaaaaa aggcgatttt 7680tccgcctaaa
actctttaaa acttattaaa actcttaaaa cccgcctggc ctgtgcataa 7740ctgtctggcc
agcgcacagc cgaagagctg caaaaagcgc ctacccttcg gtcgctgcgc 7800tccctacgcc
ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc aaaaatggct 7860ggcctacggc
caggcaatct accagggcgc ggacaagccg cgccgtcgcc actcgaccgc 7920cggcgcccac
atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg aaaacctctg 7980acacatgcag
ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg ggagcagaca 8040agcccgtcag
ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca tgacccagtc 8100acgtagcgat
agcggagtgt atactggctt aactatgcgg catcagagca gattgtactg 8160agagtgcacc
atatgcggtg tgaaataccg cacagatgcg taaggagaaa ataccgcatc 8220aggcgctctt
ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 8280gcggtatcag
ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca 8340ggaaagaaca
tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg 8400ctggcgtttt
tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt 8460cagaggtggc
gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc 8520ctcgtgcgct
ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct 8580tcgggaagcg
tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc 8640gttcgctcca
agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 8700tccggtaact
atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca 8760gccactggta
acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 8820tggtggccta
actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag 8880ccagttacct
tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt 8940agcggtggtt
tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa 9000gatcctttga
tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg 9060attttggtca
tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga 9120agttttaaat
caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta 9180atcagtgagg
cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc 9240cccgtcgtgt
agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg 9300ataccgcgag
acccacgctc accggctcca gatttatcag caataaacca gccagccgga 9360agggccgagc
gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt 9420tgccgggaag
ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt 9480gctacaggca
tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc 9540caacgatcaa
ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc 9600ggtcctccga
tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca 9660gcactgcata
attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag 9720tactcaacca
agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg 9780tcaacacggg
ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa 9840gacctgcagg
gggggggggg cgctgaggtc tgcctcgtga agaaggtgtt gctgactcat 9900accaggcctg
aatcgcccca tcatccagcc agaaagtgag ggagccacgg ttgatgagag 9960ctttgttgta
ggtggaccag ttggtgattt tgaacttttg ctttgccacg gaacggtctg 10020cgttgtcggg
aagatgcgtg atctgatcct tcaactcagc aaaagttcga tttattcaac 10080aaagccgccg
tcccgtcaag tcagcgtaat gctctgccag tgttacaacc aattaaccaa 10140ttctgattag
aaaaactcat cgagcatcaa atgaaactgc aatttattca tatcaggatt 10200atcaatacca
tatttttgaa aaagccgttt ctgtaatgaa ggagaaaact caccgaggca 10260gttccatagg
atggcaagat cctggtatcg gtctgcgatt ccgactcgtc caacatcaat 10320acaacctatt
aatttcccct cgtcaaaaat aaggttatca agtgagaaat caccatgagt 10380gacgactgaa
tccggtgaga atggcaaaag cttatgcatt tctttccaga cttgttcaac 10440aggccagcca
ttacgctcgt catcaaaatc actcgcatca accaaaccgt tattcattcg 10500tgattgcgcc
tgagcgagac gaaatacgcg atcgctgtta aaaggacaat tacaaacagg 10560aatcgaatgc
aaccggcgca ggaacactgc cagcgcatca acaatatttt cacctgaatc 10620aggatattct
tctaatacct ggaatgctgt tttcccgggg atcgcagtgg tgagtaacca 10680tgcatcatca
ggagtacgga taaaatgctt gatggtcgga agaggcataa attccgtcag 10740ccagtttagt
ctgaccatct catctgtaac atcattggca acgctacctt tgccatgttt 10800cagaaacaac
tctggcgcat cgggcttccc atacaatcga tagattgtcg cacctgattg 10860cccgacatta
tcgcgagccc atttataccc atataaatca gcatccatgt tggaatttaa 10920tcgcggcctc
gagcaagacg tttcccgttg aatatggctc ataacacccc ttgtattact 10980gtttatgtaa
gcagacagtt ttattgttca tgatgatata tttttatctt gtgcaatgta 11040acatcagaga
ttttgagaca caacgtggct ttcccccccc cccctgcagg tcaattcggt 11100cgatatggct
attacgaaga aggctcgtgc gcggagtccc gtgaactttc ccacgcaaca 11160agtgaaccgc
accgggtttg ccggaggcca tttcgttaaa atgcgcagcc atggctgctt 11220cgtccagcat
ggcgtaatac tgatcctcgt cttcggctgg cggtatattg ccgatgggct 11280tcaaaagccg
ccgtggttga accagtctat ccattccaag gtagcgaact cgaccgcttc 11340gaagctcctc
catggtccac gccgatgaat gacctcggcc ttgtaaagac cgttgatcgc 11400ttctgcgagg
gcgttgtcgt gctgtcgccg acgcttccga tagatggctc gatacctgct 11460tctgccaacc
gctcggaata gcgaaaggac acgtattgaa caccgcgatc cgagtgatgc 11520actaggccgc
catgagcggg acgccgatca tgatgagcct cctcgagggc atcgaggaca 11580aagcctgcat
gtgctgtccg gctcgcccgc catccgacaa tgcgacgggc gaagacgtcg 11640atcacgaagg
ccacgtagac gaagccctcc caagtggcga cataagtacg gacatgcgca 11700aaggctttcc
cggtttgtcg ctgatggtgc aagagacgct gaagcgcgat ccgatgcgca 11760ggcatctgtt
cgtcttccgc ggtcgtggcg gtggcctgat caaggtcact cgccgaagag 11820ctgcatgatt
ggctcgaaac cgagcggggg aaattgtcgc gcagttctcc cgtcgccgag 11880gcgataaatt
acatgctcaa gcgatgggat ggcattacgt cattcctcga tgacggcccg 11940atttgcctga
cgaacaatgc tgccgaacga acgctcagag gctatgtact cggcaggaag 12000tcatggctgt
ttgccggatc ggatcgttgt gctgaacgtg cggcgttcat ggcgacactg 12060atcatgagcg
ccaagctcaa taacatcgat ccgcaggcct ggcttgccga cgtccgcgcc 12120gaccttgcgg
acgctccgat cagcaggctt gagcaacagc tgccgtggaa ctggacatcc 12180aagacactga
gtgctcaggc ggcctgacct gcggccttca ccggatactt accccattat 12240cgcagattgc
gatgaagcat cagcgtcatt cagcaatctt gccaaagtat gcaggctcgc 12300gagaatcgac
gtgcgaaacc ggctggttgc gccaaagatc cgcttgcgga gcggtcgaac 12360attcatgctg
ggacttcaag aggtcgagta gaggaagaac cggaaaggtt gcaccggaaa 12420atatgcgttc
ctttggagag cgcctcatgg acgtgaacaa atcgcccgga ccaaggatgc 12480cacggataca
aaagctcgcg aagctcggtc ccgtgggtgt tctgtcgtct cgttgtacaa 12540cgaaatccat
tcccattccg cgctcaagat ggcttcccct cggcagttca tcagggctaa 12600atcaatctag
ccgacttgtc cggtgaaatg ggctgcactc caacagaaac aatcaaacaa 12660acatacacag
cgacttattc acacgagctc aaattacaac ggtatatatc ctgccagtca 12720gcatcatcac
accaaaagtt aggcccgaat agtttgaaat tagaaagctc gcaattgagg 12780tctacaggcc
aaattcgctc ttagccgtac aatattactc accggtgcga tgccccccat 12840cgtaggtgaa
ggtggaaatt aatgatccat cttgagacca caggcccaca acagctacca 12900gtttcctcaa
gggtccacca aaaacgtaag cgcttacgta catggtcgat aagaaaaggc 12960aatttgtaga
tgttaacatc caacgtcgct ttcagggatc gatccaatac gcaaaccgcc 13020tctccccgcg
cgttggccga ttcattaatg cagctggcac gacaggtttc ccgactggaa 13080agcgggcagt
gagcgcaacg caattaatgt gagttagctc actcattagg caccccaggc 13140tttacacttt
atgcttccgg ctcgtatgtt gtgtggaatt gtgagcggat aacaatttca 13200cacaggaaac
agctatgacc atgattacgc caagcttgca tgcctgcagg tcgactctag 13260aggatctggc
gcgccaagct tggatcctag cctaagtacg tactcaaaat gccaacaaat 13320aaaaaaaaag
ttgctttaat aatgccaaaa caaattaata aaacacttac aacaccggat 13380tttttttaat
taaaatgtgc catttaggat aaatagttaa tatttttaat aattatttaa 13440aaagccgtat
ctactaaaat gatttttatt tggttgaaaa tattaatatg tttaaatcaa 13500cacaatctat
caaaattaaa ctaaaaaaaa aataagtgta cgtggttaac attagtacag 13560taatataaga
ggaaaatgag aaattaagaa attgaaagcg agtctaattt ttaaattatg 13620aacctgcata
tataaaagga aagaaagaat ccaggaagaa aagaaatgaa accatgcatg 13680gtcccctcgt
catcacgagt ttctgccatt tgcaatagaa acactgaaac acctttctct 13740ttgtcactta
attgagatgc cgaagccacc tcacaccatg aacttcatga ggtgtagcac 13800ccaaggcttc
catagccatg catactgaag aatgtctcaa gctcagcacc ctacttctgt 13860gacgtgtccc
tcattcacct tcctctcttc cctataaata accacgcctc aggttctccg 13920cttcacaact
caaacattct ctccattggt ccttaaacac tcatcagtca tcaccatgtc 13980ttccatagcc
ccccaagcgg ccgcgttgtt gtgtctcttc ttcctcgaat ctcgatcgtt 14040acatcaccgg
gttctagcct tcacgatgtg cttttgagca tgagatttgg tttgacgcga 14100catctccctc
tcaaacgatc tttctccaat tattcaatca cttccgtatc tccagaacaa 14160cagctcaaat
ctccggtgac catggcgacg accgagagca agaatcttgt agggatcctt 14220ccaaggagga
gacaaacaag aaggagacag aagataagaa ggaggtggga gtttcggttc 14280ctccaccgcc
agagaaacca gagcctggcg attgttgcgg tagcggttgc gtccgatgcg 14340tttgggatgt
ttattacgat gagctcgaag attacaacaa gcagctttct ggagaaacta 14400actgcagcta
caagattctt gctctcggtc gtcgccatgg tcaccggaga tttgagctgt 14460tgttctggag
atacggaagt gattgaataa ttggagaaag atcgtttgag agggagatgt 14520cgcgtcaaac
caaatctcat gctcaaaagc acatcgtgaa ggctagaacc cggtgatgta 14580acgatcgaga
ttcgaggaag aagagacaca acaacgcggc cgcgacacaa gtgtgagagt 14640actaaataaa
tgctttggtt gtacgaaatc attacactaa ataaaataat caaagcttat 14700atatgccttc
cgctaaggcc gaatgcaaag aaattggttc tttctcgtta tcttttgcca 14760cttttactag
tacgtattaa ttactactta atcatctttg tttacggctc attatatccg 14820tcgacgg
148278621DNAartificial sequenceamiRNA 86taacccaaca atcatcgacc c
218721DNAartificial sequenceamiRNA
87ttggagaaaa tagggtaggg t
218821DNAartificial sequenceamiRNA 88gggacgatga ttgttgggtt a
218921DNAartificial sequenceamiRNA
89accctaccct acattctcca t
2190590DNAartificial sequencemicroRNA precursor 90gcggccgcgc gagaaacttt
gtatgggcat ggttatttct cacttctcac cctcctttac 60tttcttatgc taaatcctcc
ttcccctata tctccaccct caaccccttt ttctcattat 120aacttttggt gcctagatgg
tgtgtgtgtg tgcgcgcgag agatctgagc tcaattttcc 180tctctcaagt cctggtcatg
cttttccaca gctttcttga acttcttatg catcttatat 240ctctccacct ccaggatttt
aagccctaga agctcaagaa agctgtggga gaatatggca 300attcaggctt ttaattgctt
tcatttggta ccatcacttg caagatttca gagtacaagg 360tgaacacaca catcttcctc
ttcatcaatt ctctagtttc atccttatct tttcattcac 420ggtaactctc actaccctct
ttcatcttat aagttatacc gggggtgtga tgttgatgag 480tgtaaattaa atatatgtga
tctctttctc tggaaaaatt ttcagtgtga tatacatann 540natctcttaa tctagagatt
ttatggcttt gttatatata aggcggccgc 59091605DNAartificial
sequencemicroRNA precursor 91gcggccgcgc gagaaacttt gtatgggcat ggttatttct
cacttctcac cctcctttac 60tttcttatgc taaatcctcc ttcccctata tctccaccct
caaccccttt ttctcattat 120aacttttggt gcctagatgg tgtgtgtgtg tgcgcgcgag
agatctgagc tcaattttcc 180tctctcaagt cctggtcatg ctgtttaaac cacagctttc
ttgaacttct tatgcatctt 240atatctctcc acctccagga ttttaagccc tagaagctca
agaaagctgt gggagtttaa 300actatggcaa ttcaggcttt taattgcttt catttggtac
catcacttgc aagatttcag 360agtacaaggt gaacacacac atcttcctct tcatcaattc
tctagtttca tccttatctt 420ttcattcacg gtaactctca ctaccctctt tcatcttata
agttataccg ggggtgtgat 480gttgatgagt gtaaattaaa tatatgtgat ctctttctct
ggaaaaattt tcagtgtgat 540atacatannn atctcttaat ctagagattt tatggctttg
ttatatataa ggaattcgcg 600gccgc
6059259DNAartificial sequenceprimer 92tctcaagtcc
tggtcatgct ttaacccaac aatcatcgac cccttatgca tcttatatc
599359DNAartificial sequenceprimer 93cctgaattgc catattctaa cccaacaatc
atcgtcccct agggcttaaa atcctggag 59944536DNAartificial
sequenceplasmid 94aagggcgaat tctgcagata tccatcacac tggcggccgc tcgagcatgc
atctagaggg 60cccaattcgc cctatagtga gtcgtattac aattcactgg ccgtcgtttt
acaacgtcgt 120gactgggaaa accctggcgt tacccaactt aatcgccttg cagcacatcc
ccctttcgcc 180agctggcgta atagcgaaga ggcccgcacc gatcgccctt cccaacagtt
gcgcagcctg 240aatggcgaat ggacgcgccc tgtagcggcg cattaagcgc ggcgggtgtg
gtggttacgc 300gcagcgtgac cgctacactt gccagcgccc tagcgcccgc tcctttcgct
ttcttccctt 360cctttctcgc cacgttcgcc ggctttcccc gtcaagctct aaatcggggg
ctccctttag 420ggttccgatt tagtgcttta cggcacctcg accccaaaaa acttgattag
ggtgatggtt 480cacgtagtgg gccatcgccc tgatagacgg tttttcgccc tttgacgttg
gagtccacgt 540tctttaatag tggactcttg ttccaaactg gaacaacact caaccctatc
tcggtctatt 600cttttgattt ataagggatt ttgccgattt cggcctattg gttaaaaaat
gagctgattt 660aacaaaaatt taacgcgaat tttaacaaaa ttcagggcgc aagggctgct
aaaggaagcg 720gaacacgtag aaagccagtc cgcagaaacg gtgctgaccc cggatgaatg
tcagctactg 780ggctatctgg acaagggaaa acgcaagcgc aaagagaaag caggtagctt
gcagtgggct 840tacatggcga tagctagact gggcggtttt atggacagca agcgaaccgg
aattgccagc 900tggggcgccc tctggtaagg ttgggaagcc ctgcaaagta aactggatgg
ctttcttgcc 960gccaaggatc tgatggcgca ggggatcaag atctgatcaa gagacaggat
gaggatcgtt 1020tcgcatgatt gaacaagatg gattgcacgc aggttctccg gccgcttggg
tggagaggct 1080attcggctat gactgggcac aacagacaat cggctgctct gatgccgccg
tgttccggct 1140gtcagcgcag gggcgcccgg ttctttttgt caagaccgac ctgtccggtg
ccctgaatga 1200actgcaggac gaggcagcgc ggctatcgtg gctggccacg acgggcgttc
cttgcgcagc 1260tgtgctcgac gttgtcactg aagcgggaag ggactggctg ctattgggcg
aagtgccggg 1320gcaggatctc ctgtcatccc accttgctcc tgccgagaaa gtatccatca
tggctgatgc 1380aatgcggcgg ctgcatacgc ttgatccggc tacctgccca ttcgaccacc
aagcgaaaca 1440tcgcatcgag cgagcacgta ctcggatgga agccggtctt gtcgatcagg
atgatctgga 1500cgaagagcat caggggctcg cgccagccga actgttcgcc aggctcaagg
cgcgcatgcc 1560cgacggcgag gatctcgtcg tgacccatgg cgatgcctgc ttgccgaata
tcatggtgga 1620aaatggccgc ttttctggat tcatcgactg tggccggctg ggtgtggcgg
accgctatca 1680ggacatagcg ttggctaccc gtgatattgc tgaagagctt ggcggcgaat
gggctgaccg 1740cttcctcgtg ctttacggta tcgccgctcc cgattcgcag cgcatcgcct
tctatcgcct 1800tcttgacgag ttcttctgaa ttgaaaaagg aagagtatga gtattcaaca
tttccgtgtc 1860gcccttattc ccttttttgc ggcattttgc cttcctgttt ttgctcaccc
agaaacgctg 1920gtgaaagtaa aagatgctga agatcagttg ggtgcacgag tgggttacat
cgaactggat 1980ctcaacagcg gtaagatcct tgagagtttt cgccccgaag aacgttttcc
aatgatgagc 2040acttttaaag ttctgctatg tggcgcggta ttatcccgta ttgacgccgg
gcaagagcaa 2100ctcggtcgcc gcatacacta ttctcagaat gacttggttg agtactcacc
agtcacagaa 2160aagcatctta cggatggcat gacagtaaga gaattatgca gtgctgccat
aaccatgagt 2220gataacactg cggccaactt acttctgaca acgatcggag gaccgaagga
gctaaccgct 2280tttttgcaca acatggggga tcatgtaact cgccttgatc gttgggaacc
ggagctgaat 2340gaagccatac caaacgacga gcgtgacacc acgatgcctg tagcaatggc
aacaacgttg 2400cgcaaactat taactggcga actacttact ctagcttccc ggcaacaatt
aatagactgg 2460atggaggcgg ataaagttgc aggaccactt ctgcgctcgg cccttccggc
tggctggttt 2520attgctgata aatctggagc cggtgagcgt gggtctcgcg gtatcattgc
agcactgggg 2580ccagatggta agccctcccg tatcgtagtt atctacacga cggggagtca
ggcaactatg 2640gatgaacgaa atagacagat cgctgagata ggtgcctcac tgattaagca
ttggtaactg 2700tcagaccaag tttactcata tatactttag attgatttaa aacttcattt
ttaatttaaa 2760aggatctagg tgaagatcct ttttgataat ctcatgacca aaatccctta
acgtgagttt 2820tcgttccact gagcgtcaga ccccgtagaa aagatcaaag gatcttcttg
agatcctttt 2880tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac cgctaccagc
ggtggtttgt 2940ttgccggatc aagagctacc aactcttttt ccgaaggtaa ctggcttcag
cagagcgcag 3000ataccaaata ctgttcttct agtgtagccg tagttaggcc accacttcaa
gaactctgta 3060gcaccgccta catacctcgc tctgctaatc ctgttaccag tggctgctgc
cagtggcgat 3120aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac cggataaggc
gcagcggtcg 3180ggctgaacgg ggggttcgtg cacacagccc agcttggagc gaacgaccta
caccgaactg 3240agatacctac agcgtgagct atgagaaagc gccacgcttc ccgaagggag
aaaggcggac 3300aggtatccgg taagcggcag ggtcggaaca ggagagcgca cgagggagct
tccaggggga 3360aacgcctggt atctttatag tcctgtcggg tttcgccacc tctgacttga
gcgtcgattt 3420ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg ccagcaacgc
ggccttttta 3480cggttcctgg ccttttgctg gccttttgct cacatgttct ttcctgcgtt
atcccctgat 3540tctgtggata accgtattac cgcctttgag tgagctgata ccgctcgccg
cagccgaacg 3600accgagcgca gcgagtcagt gagcgaggaa gcggaagagc gcccaatacg
caaaccgcct 3660ctccccgcgc gttggccgat tcattaatgc agctggcacg acaggtttcc
cgactggaaa 3720gcgggcagtg agcgcaacgc aattaatgtg agttagctca ctcattaggc
accccaggct 3780ttacacttta tgcttccggc tcgtatgttg tgtggaattg tgagcggata
acaatttcac 3840acaggaaaca gctatgacca tgattacgcc aagcttggta ccgagctcgg
atccactagt 3900aacggccgcc agtgtgctgg aattcgccct tgcggccgcg cgagaaactt
tgtatgggca 3960tggttatttc tcacttctca ccctccttta ctttcttatg ctaaatcctc
cttcccctat 4020atctccaccc tcaacccctt tttctcatta taacttttgg tgcctagatg
gtgtgtgtgt 4080gtgcgcgcga gagatctgag ctcaattttc ctctctcaag tcctggtcat
gctgtttaaa 4140ccacagcttt cttgaacttc ttatgcatct tatatctctc cacctccagg
attttaagcc 4200ctagaagctc aagaaagctg tgggagttta aactatggca attcaggctt
ttaattgctt 4260tcatttggta ccatcacttg caagatttca gagtacaagg tgaacacaca
catcttcctc 4320ttcatcaatt ctctagtttc atccttatct tttcattcac ggtaactctc
actaccctct 4380ttcatcttat aagttatacc gggggtgtga tgttgatgag tgtaaattaa
atatatgtga 4440tctctttctc tggaaaaatt ttcagtgtga tatacatann natctcttaa
tctagagatt 4500ttatggcttt gttatatata aggaattcgc ggccgc
453695974DNAartificial sequencemicroRNA precursor 95gcggccgctt
ctagctagct agggtttggg tagtgagtgt aataaagttg caaagttttt 60ggttaggtta
cgttttgacc ttattattat agttcaaagg gaaacattaa ttaaagggga 120ttatgaagtg
gagctccttg aagtccaatt gaggatctta ctgggtgaat tgagctgctt 180agctatggat
cccacagttc tacccatcaa taagtgcttt tgtggtagtc ttgtggcttc 240catatctggg
gagcttcatt tgcctttata gtattaacct tctttggatt gaagggagct 300ctacaccctt
ctcttctttt ctctcataat aatttaaatt tgttatagac tctaaacttt 360aaatgttttt
tttgaagttt ttccgttttt ctcttttgcc atgatcccgt tcttgctgtg 420gagtaacctt
gtccgaggta tgtgcatgat tagatccata cttaatttgt gtgcatcacg 480aaggtgaggt
tgaaatgaac tttgcttttt tgacctttta ggaaagttct tttgttgcag 540taatcaattt
taattagttt taattgacac tattactttt attgtcatct ttgttagttt 600tattgttgaa
ttgagtgcat atttcctagg aaattctctt acctaacatt ttttatacag 660atctatgctc
ttggctcttg cccttactct tggccttgtg ttggttattt gtctacatat 720ttattgactg
gtcgatgaga catgtcacaa ttcttgggct tatttgttgg tctaataaaa 780ggagtgctta
ttgaaagatc aagacggaga ttcggtttta tataaataaa ctaaagatga 840catattagtg
tgttgatgtc tcttcaggat aatttttgtt tgaaataata tggtaatgtc 900ttgtctaaat
ttgtgtacat aattcttact gattttttgg attgttggat ttttataaac 960aaatctgcgg
ccgc
97496990DNAartificial sequencein-fusion ready microRNA 159 precursor
96gcggccgctt ctagctagct agggtttggg tagtgagtgt aataaagttg caaagttttt
60ggttaggtta cgttttgacc ttattattat agttcaaagg gaaacattaa ttaaagggga
120ttatgaagtg tttaaacgga gctccttgaa gtccaattga ggatcttact gggtgaattg
180agctgcttag ctatggatcc cacagttcta cccatcaata agtgcttttg tggtagtctt
240gtggcttcca tatctgggga gcttcatttg cctttatagt attaaccttc tttggattga
300agggagctct agtttaaacc acccttctct tcttttctct cataataatt taaatttgtt
360atagactcta aactttaaat gttttttttg aagtttttcc gtttttctct tttgccatga
420tcccgttctt gctgtggagt aaccttgtcc gaggtatgtg catgattaga tccatactta
480atttgtgtgc atcacgaagg tgaggttgaa atgaactttg cttttttgac cttttaggaa
540agttcttttg ttgcagtaat caattttaat tagttttaat tgacactatt acttttattg
600tcatctttgt tagttttatt gttgaattga gtgcatattt cctaggaaat tctcttacct
660aacatttttt atacagatct atgctcttgg ctcttgccct tactcttggc cttgtgttgg
720ttatttgtct acatatttat tgactggtcg atgagacatg tcacaattct tgggcttatt
780tgttggtcta ataaaaggag tgcttattga aagatcaaga cggagattcg gttttatata
840aataaactaa agatgacata ttagtgtgtt gatgtctctt caggataatt tttgtttgaa
900ataatatggt aatgtcttgt ctaaatttgt gtacataatt cttactgatt ttttggattg
960ttggattttt ataaacaaat ctgcggccgc
9909754DNAartificial sequenceprimer 97attaaagggg attatgaaga ccctacccta
cattctccat tgaggatctt actg 549853DNAartificial sequenceprimer
98agaaaagaag agaagggtga ccctacccta ttttctccaa gaaggttaat act
53994911DNAartificial sequenceplasmid 99ggccgcgaat tcttctagct agctagggtt
tgggtagtga gtgtaataaa gttgcaaagt 60ttttggttag gttacgtttt gaccttatta
ttatagttca aagggaaaca ttaattaaag 120gggattatga agaccctacc ctacattctc
cattgaggat cttactgggt gaattgagct 180gcttagctat ggatcccaca gttctaccca
tcaataagtg cttttgtggt agtcttgtgg 240cttccatatc tggggagctt catttgcctt
tatagtatta accttcttgg agaaaatagg 300gtagggtcac ccttctcttc ttttctctca
taataattta aatttgttat agactctaaa 360ctttaaatgt tttttttgaa gtttttccgt
ttttctcttt tgccatgatc ccgttcttgc 420tgtggagtaa ccttgtccga ggtatgtgca
tgattagatc catacttaat ttgtgtgcat 480cacgaaggtg aggttgaaat gaactttgct
tttttgacct tttaggaaag ttcttttgtt 540gcagtaatca attttaatta gttttaattg
acactattac ttttattgtc atctttgtta 600gttttattgt tgaattgagt gcatatttcc
taggaaattc tcttacctaa cattttttat 660acagatctat gctcttggct cttgccctta
ctcttggcct tgtgttggtt atttgtctac 720atatttattg actggtcgat gagacatgtc
acaattcttg ggcttatttg ttggtctaat 780aaaaggagtg cttattgaaa gatcaagacg
gagattcggt tttatataaa taaactaaag 840atgacatatt agtgtgttga tgtctcttca
ggataatttt tgtttgaaat aatatggtaa 900tgtcttgtct aaatttgtgt acataattct
tactgatttt ttggattgtt ggatttttat 960aaacaaatct gcggccgcaa gggcgaattc
tgcagatatc catcacactg gcggccgctc 1020gagcatgcat ctagagggcc caattcgccc
tatagtgagt cgtattacaa ttcactggcc 1080gtcgttttac aacgtcgtga ctgggaaaac
cctggcgtta cccaacttaa tcgccttgca 1140gcacatcccc ctttcgccag ctggcgtaat
agcgaagagg cccgcaccga tcgcccttcc 1200caacagttgc gcagcctgaa tggcgaatgg
acgcgccctg tagcggcgca ttaagcgcgg 1260cgggtgtggt ggttacgcgc agcgtgaccg
ctacacttgc cagcgcccta gcgcccgctc 1320ctttcgcttt cttcccttcc tttctcgcca
cgttcgccgg ctttccccgt caagctctaa 1380atcgggggct ccctttaggg ttccgattta
gtgctttacg gcacctcgac cccaaaaaac 1440ttgattaggg tgatggttca cgtagtgggc
catcgccctg atagacggtt tttcgccctt 1500tgacgttgga gtccacgttc tttaatagtg
gactcttgtt ccaaactgga acaacactca 1560accctatctc ggtctattct tttgatttat
aagggatttt gccgatttcg gcctattggt 1620taaaaaatga gctgatttaa caaaaattta
acgcgaattt taacaaaatt cagggcgcaa 1680gggctgctaa aggaagcgga acacgtagaa
agccagtccg cagaaacggt gctgaccccg 1740gatgaatgtc agctactggg ctatctggac
aagggaaaac gcaagcgcaa agagaaagca 1800ggtagcttgc agtgggctta catggcgata
gctagactgg gcggttttat ggacagcaag 1860cgaaccggaa ttgccagctg gggcgccctc
tggtaaggtt gggaagccct gcaaagtaaa 1920ctggatggct ttcttgccgc caaggatctg
atggcgcagg ggatcaagat ctgatcaaga 1980gacaggatga ggatcgtttc gcatgattga
acaagatgga ttgcacgcag gttctccggc 2040cgcttgggtg gagaggctat tcggctatga
ctgggcacaa cagacaatcg gctgctctga 2100tgccgccgtg ttccggctgt cagcgcaggg
gcgcccggtt ctttttgtca agaccgacct 2160gtccggtgcc ctgaatgaac tgcaggacga
ggcagcgcgg ctatcgtggc tggccacgac 2220gggcgttcct tgcgcagctg tgctcgacgt
tgtcactgaa gcgggaaggg actggctgct 2280attgggcgaa gtgccggggc aggatctcct
gtcatcccac cttgctcctg ccgagaaagt 2340atccatcatg gctgatgcaa tgcggcggct
gcatacgctt gatccggcta cctgcccatt 2400cgaccaccaa gcgaaacatc gcatcgagcg
agcacgtact cggatggaag ccggtcttgt 2460cgatcaggat gatctggacg aagagcatca
ggggctcgcg ccagccgaac tgttcgccag 2520gctcaaggcg cgcatgcccg acggcgagga
tctcgtcgtg acccatggcg atgcctgctt 2580gccgaatatc atggtggaaa atggccgctt
ttctggattc atcgactgtg gccggctggg 2640tgtggcggac cgctatcagg acatagcgtt
ggctacccgt gatattgctg aagagcttgg 2700cggcgaatgg gctgaccgct tcctcgtgct
ttacggtatc gccgctcccg attcgcagcg 2760catcgccttc tatcgccttc ttgacgagtt
cttctgaatt gaaaaaggaa gagtatgagt 2820attcaacatt tccgtgtcgc ccttattccc
ttttttgcgg cattttgcct tcctgttttt 2880gctcacccag aaacgctggt gaaagtaaaa
gatgctgaag atcagttggg tgcacgagtg 2940ggttacatcg aactggatct caacagcggt
aagatccttg agagttttcg ccccgaagaa 3000cgttttccaa tgatgagcac ttttaaagtt
ctgctatgtg gcgcggtatt atcccgtatt 3060gacgccgggc aagagcaact cggtcgccgc
atacactatt ctcagaatga cttggttgag 3120tactcaccag tcacagaaaa gcatcttacg
gatggcatga cagtaagaga attatgcagt 3180gctgccataa ccatgagtga taacactgcg
gccaacttac ttctgacaac gatcggagga 3240ccgaaggagc taaccgcttt tttgcacaac
atgggggatc atgtaactcg ccttgatcgt 3300tgggaaccgg agctgaatga agccatacca
aacgacgagc gtgacaccac gatgcctgta 3360gcaatggcaa caacgttgcg caaactatta
actggcgaac tacttactct agcttcccgg 3420caacaattaa tagactggat ggaggcggat
aaagttgcag gaccacttct gcgctcggcc 3480cttccggctg gctggtttat tgctgataaa
tctggagccg gtgagcgtgg gtctcgcggt 3540atcattgcag cactggggcc agatggtaag
ccctcccgta tcgtagttat ctacacgacg 3600gggagtcagg caactatgga tgaacgaaat
agacagatcg ctgagatagg tgcctcactg 3660attaagcatt ggtaactgtc agaccaagtt
tactcatata tactttagat tgatttaaaa 3720cttcattttt aatttaaaag gatctaggtg
aagatccttt ttgataatct catgaccaaa 3780atcccttaac gtgagttttc gttccactga
gcgtcagacc ccgtagaaaa gatcaaagga 3840tcttcttgag atcctttttt tctgcgcgta
atctgctgct tgcaaacaaa aaaaccaccg 3900ctaccagcgg tggtttgttt gccggatcaa
gagctaccaa ctctttttcc gaaggtaact 3960ggcttcagca gagcgcagat accaaatact
gttcttctag tgtagccgta gttaggccac 4020cacttcaaga actctgtagc accgcctaca
tacctcgctc tgctaatcct gttaccagtg 4080gctgctgcca gtggcgataa gtcgtgtctt
accgggttgg actcaagacg atagttaccg 4140gataaggcgc agcggtcggg ctgaacgggg
ggttcgtgca cacagcccag cttggagcga 4200acgacctaca ccgaactgag atacctacag
cgtgagctat gagaaagcgc cacgcttccc 4260gaagggagaa aggcggacag gtatccggta
agcggcaggg tcggaacagg agagcgcacg 4320agggagcttc cagggggaaa cgcctggtat
ctttatagtc ctgtcgggtt tcgccacctc 4380tgacttgagc gtcgattttt gtgatgctcg
tcaggggggc ggagcctatg gaaaaacgcc 4440agcaacgcgg cctttttacg gttcctggcc
ttttgctggc cttttgctca catgttcttt 4500cctgcgttat cccctgattc tgtggataac
cgtattaccg cctttgagtg agctgatacc 4560gctcgccgca gccgaacgac cgagcgcagc
gagtcagtga gcgaggaagc ggaagagcgc 4620ccaatacgca aaccgcctct ccccgcgcgt
tggccgattc attaatgcag ctggcacgac 4680aggtttcccg actggaaagc gggcagtgag
cgcaacgcaa ttaatgtgag ttagctcact 4740cattaggcac cccaggcttt acactttatg
cttccggctc gtatgttgtg tggaattgtg 4800agcggataac aatttcacac aggaaacagc
tatgaccatg attacgccaa gcttggtacc 4860gagctcggat ccactagtaa cggccgccag
tgtgctggaa ttcgcccttg c 49111009130DNAartificial
sequenceplasmid 100ggccgcgaca caagtgtgag agtactaaat aaatgctttg gttgtacgaa
atcattacac 60taaataaaat aatcaaagct tatatatgcc ttccgctaag gccgaatgca
aagaaattgg 120ttctttctcg ttatcttttg ccacttttac tagtacgtat taattactac
ttaatcatct 180ttgtttacgg ctcattatat ccgtcgacgg cgcgcccgat catccggata
tagttcctcc 240tttcagcaaa aaacccctca agacccgttt agaggcccca aggggttatg
ctagttattg 300ctcagcggtg gcagcagcca actcagcttc ctttcgggct ttgttagcag
ccggatcgat 360ccaagctgta cctcactatt cctttgccct cggacgagtg ctggggcgtc
ggtttccact 420atcggcgagt acttctacac agccatcggt ccagacggcc gcgcttctgc
gggcgatttg 480tgtacgcccg acagtcccgg ctccggatcg gacgattgcg tcgcatcgac
cctgcgccca 540agctgcatca tcgaaattgc cgtcaaccaa gctctgatag agttggtcaa
gaccaatgcg 600gagcatatac gcccggagcc gcggcgatcc tgcaagctcc ggatgcctcc
gctcgaagta 660gcgcgtctgc tgctccatac aagccaacca cggcctccag aagaagatgt
tggcgacctc 720gtattgggaa tccccgaaca tcgcctcgct ccagtcaatg accgctgtta
tgcggccatt 780gtccgtcagg acattgttgg agccgaaatc cgcgtgcacg aggtgccgga
cttcggggca 840gtcctcggcc caaagcatca gctcatcgag agcctgcgcg acggacgcac
tgacggtgtc 900gtccatcaca gtttgccagt gatacacatg gggatcagca atcgcgcata
tgaaatcacg 960ccatgtagtg tattgaccga ttccttgcgg tccgaatggg ccgaacccgc
tcgtctggct 1020aagatcggcc gcagcgatcg catccatagc ctccgcgacc ggctgcagaa
cagcgggcag 1080ttcggtttca ggcaggtctt gcaacgtgac accctgtgca cggcgggaga
tgcaataggt 1140caggctctcg ctgaattccc caatgtcaag cacttccgga atcgggagcg
cggccgatgc 1200aaagtgccga taaacataac gatctttgta gaaaccatcg gcgcagctat
ttacccgcag 1260gacatatcca cgccctccta catcgaagct gaaagcacga gattcttcgc
cctccgagag 1320ctgcatcagg tcggagacgc tgtcgaactt ttcgatcaga aacttctcga
cagacgtcgc 1380ggtgagttca ggcttttcca tgggtatatc tccttcttaa agttaaacaa
aattatttct 1440agagggaaac cgttgtggtc tccctatagt gagtcgtatt aatttcgcgg
gatcgagatc 1500gatccaattc caatcccaca aaaatctgag cttaacagca cagttgctcc
tctcagagca 1560gaatcgggta ttcaacaccc tcatatcaac tactacgttg tgtataacgg
tccacatgcc 1620ggtatatacg atgactgggg ttgtacaaag gcggcaacaa acggcgttcc
cggagttgca 1680cacaagaaat ttgccactat tacagaggca agagcagcag ctgacgcgta
cacaacaagt 1740cagcaaacag acaggttgaa cttcatcccc aaaggagaag ctcaactcaa
gcccaagagc 1800tttgctaagg ccctaacaag cccaccaaag caaaaagccc actggctcac
gctaggaacc 1860aaaaggccca gcagtgatcc agccccaaaa gagatctcct ttgccccgga
gattacaatg 1920gacgatttcc tctatcttta cgatctagga aggaagttcg aaggtgaagg
tgacgacact 1980atgttcacca ctgataatga gaaggttagc ctcttcaatt tcagaaagaa
tgctgaccca 2040cagatggtta gagaggccta cgcagcaggt ctcatcaaga cgatctaccc
gagtaacaat 2100ctccaggaga tcaaatacct tcccaagaag gttaaagatg cagtcaaaag
attcaggact 2160aattgcatca agaacacaga gaaagacata tttctcaaga tcagaagtac
tattccagta 2220tggacgattc aaggcttgct tcataaacca aggcaagtaa tagagattgg
agtctctaaa 2280aaggtagttc ctactgaatc taaggccatg catggagtct aagattcaaa
tcgaggatct 2340aacagaactc gccgtgaaga ctggcgaaca gttcatacag agtcttttac
gactcaatga 2400caagaagaaa atcttcgtca acatggtgga gcacgacact ctggtctact
ccaaaaatgt 2460caaagataca gtctcagaag accaaagggc tattgagact tttcaacaaa
ggataatttc 2520gggaaacctc ctcggattcc attgcccagc tatctgtcac ttcatcgaaa
ggacagtaga 2580aaaggaaggt ggctcctaca aatgccatca ttgcgataaa ggaaaggcta
tcattcaaga 2640tgcctctgcc gacagtggtc ccaaagatgg acccccaccc acgaggagca
tcgtggaaaa 2700agaagacgtt ccaaccacgt cttcaaagca agtggattga tgtgacatct
ccactgacgt 2760aagggatgac gcacaatccc actatccttc gcaagaccct tcctctatat
aaggaagttc 2820atttcatttg gagaggacac gctcgagctc atttctctat tacttcagcc
ataacaaaag 2880aactcttttc tcttcttatt aaaccatgaa aaagcctgaa ctcaccgcga
cgtctgtcga 2940gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct
cggagggcga 3000agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc
gggtaaatag 3060ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat
cggccgcgct 3120cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct
attgcatctc 3180ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc
ccgctgttct 3240gcagccggtc gcggaggcca tggatgcgat cgctgcggcc gatcttagcc
agacgagcgg 3300gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg
atttcatatg 3360cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca
ccgtcagtgc 3420gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc
ccgaagtccg 3480gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg
gccgcataac 3540agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg
tcgccaacat 3600cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact
tcgagcggag 3660gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca
ttggtcttga 3720ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg
cgcagggtcg 3780atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa
tcgcccgcag 3840aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg
gaaaccgacg 3900ccccagcact cgtccgaggg caaaggaata gtgaggtacc taaagaagga
gtgcgtcgaa 3960gcagatcgtt caaacatttg gcaataaagt ttcttaagat tgaatcctgt
tgccggtctt 4020gcgatgatta tcatataatt tctgttgaat tacgttaagc atgtaataat
taacatgtaa 4080tgcatgacgt tatttatgag atgggttttt atgattagag tcccgcaatt
atacatttaa 4140tacgcgatag aaaacaaaat atagcgcgca aactaggata aattatcgcg
cgcggtgtca 4200tctatgttac tagatcgatg tcgaatctga tcaacctgca ttaatgaatc
ggccaacgcg 4260cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact
gactcgctgc 4320gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta
atacggttat 4380ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag
caaaaggcca 4440ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc
cctgacgagc 4500atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta
taaagatacc 4560aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg
ccgcttaccg 4620gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcaatgc
tcacgctgta 4680ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac
gaaccccccg 4740ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac
ccggtaagac 4800acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg
aggtatgtag 4860gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga
aggacagtat 4920ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt
agctcttgat 4980ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag
cagattacgc 5040gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct
gacgctcagt 5100ggaacgaaaa ctcacgttaa gggattttgg tcatgacatt aacctataaa
aataggcgta 5160tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg tgaaaacctc
tgacacatgc 5220agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga
caagcccgtc 5280agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg
gcatcagagc 5340agattgtact gagagtgcac catatggaca tattgtcgtt agaacgcggc
tacaattaat 5400acataacctt atgtatcata cacatacgat ttaggtgaca ctatagaacg
gcgcgccaag 5460cttggatcct cgaagagaag ggttaataac acactttttt aacattttta
acacaaattt 5520tagttattta aaaatttatt aaaaaattta aaataagaag aggaactctt
taaataaatc 5580taacttacaa aatttatgat ttttaataag ttttcaccaa taaaaaatgt
cataaaaata 5640tgttaaaaag tatattatca atattctctt tatgataaat aaaaagaaaa
aaaaaataaa 5700agttaagtga aaatgagatt gaagtgactt taggtgtgta taaatatatc
aaccccgcca 5760acaatttatt taatccaaat atattgaagt atattattcc atagccttta
tttatttata 5820tatttattat ataaaagctt tatttgttct aggttgttca tgaaatattt
ttttggtttt 5880atctccgttg taagaaaatc atgtgctttg tgtcgccact cactattgca
gctttttcat 5940gcattggtca gattgacggt tgattgtatt tttgtttttt atggttttgt
gttatgactt 6000aagtcttcat ctctttatct cttcatcagg tttgatggtt acctaatatg
gtccatgggt 6060acatgcatgg ttaaattagg tggccaactt tgttgtgaac gatagaattt
tttttatatt 6120aagtaaacta tttttatatt atgaaataat aataaaaaaa atattttatc
attattaaca 6180aaatcatatt agttaatttg ttaactctat aataaaagaa atactgtaac
attcacatta 6240catggtaaca tctttccacc ctttcatttg ttttttgttt gatgactttt
tttcttgttt 6300aaatttattt cccttctttt aaatttggaa tacattatca tcatatataa
actaaaatac 6360taaaaacagg attacacaaa tgataaataa taacacaaat atttataaat
ctagctgcaa 6420tatatttaaa ctagctatat cgatattgta aaataaaact agctgcattg
atactgataa 6480aaaaatatca tgtgctttct ggactgatga tgcagtatac ttttgacatt
gcctttattt 6540tatttttcag aaaagctttc ttagttctgg gttcttcatt atttgtttcc
catctccatt 6600gtgaattgaa tcatttgctt cgtgtcacaa atacaattta gntaggtaca
tgcattggtc 6660agattcacgg tttattatgt catgacttaa gttcatggta gtacattacc
tgccacgcat 6720gcattatatt ggttagattt gataggcaaa tttggttgtc aacaatataa
atataaataa 6780tgtttttata ttacgaaata acagtgatca aaacaaacag ttttatcttt
attaacaaga 6840ttttgttttt gtttgatgac gttttttaat gtttacgctt tcccccttct
tttgaattta 6900gaacacttta tcatcataaa atcaaatact aaaaaaatta catatttcat
aaataataac 6960acaaatattt ttaaaaaatc tgaaataata atgaacaata ttacatatta
tcacgaaaat 7020tcattaataa aaatattata taaataaaat gtaatagtag ttatatgtag
gaaaaaagta 7080ctgcacgcat aatatataca aaaagattaa aatgaactat tataaataat
aacactaaat 7140taatggtgaa tcatatcaaa ataatgaaaa agtaaataaa atttgtaatt
aacttctata 7200tgtattacac acacaaataa taaataatag taaaaaaaat tatgataaat
atttaccatc 7260tcataagata tttaaaataa tgataaaaat atagattatt ttttatgcaa
ctagctagcc 7320aaaaagagaa cacgggtata tataaaaaga gtacctttaa attctactgt
acttccttta 7380ttcctgacgt ttttatatca agtggacata cgtgaagatt ttaattatca
gtctaaatat 7440ttcattagca cttaatactt ttctgtttta ttcctatcct ataagtagtc
ccgattctcc 7500caacattgct tattcacaca actaactaag aaagtcttcc atagcccccc
aagcggccgc 7560gcgagaaact ttgtatgggc atggttattt ctcacttctc accctccttt
actttcttat 7620gctaaatcct ccttccccta tatctccacc ctcaacccct ttttctcatt
ataacttttg 7680gtgcctagat ggtgtgtgtg tgtgcgcgcg agagatctga gctcaatttt
cctctctcaa 7740gtcctggtca tgctttaacc caacaatcat cgacccctta tgcatcttat
atctctccac 7800ctccaggatt ttaagcccta ggggacgatg attgttgggt tagaatatgg
caattcaggc 7860ttttaattgc tttcatttgg taccatcact tgcaagattt cagagtacaa
ggtgaacaca 7920cacatcttcc tcttcatcaa ttctctagtt tcatccttat cttttcattc
acggtaactc 7980tcactaccct ctttcatctt ataagttata ccgggggtgt gatgttgatg
agtgtaaatt 8040aaatatatgt gatctctttc tctggaaaaa ttttcagtgt gatatacata
ataatctctt 8100aatctagaga ttttatggct ttgttatata taagcggcca attctgcaga
tatccatcac 8160actggaattc ttctagctag ctagggtttg ggtagtgagt gtaataaagt
tgcaaagttt 8220ttggttaggt tacgttttga ccttattatt atagttcaaa gggaaacatt
aattaaaggg 8280gattatgaag accctaccct acattctcca ttgaggatct tactgggtga
attgagctgc 8340ttagctatgg atcccacagt tctacccatc aataagtgct tttgtggtag
tcttgtggct 8400tccatatctg gggagcttca tttgccttta tagtattaac cttcttggag
aaaatagggt 8460agggtcaccc ttctcttctt ttctctcata ataatttaaa tttgttatag
actctaaact 8520ttaaatgttt tttttgaagt ttttccgttt ttctcttttg ccatgatccc
gttcttgctg 8580tggagtaacc ttgtccgagg tatgtgcatg attagatcca tacttaattt
gtgtgcatca 8640cgaaggtgag gttgaaatga actttgcttt tttgaccttt taggaaagtt
cttttgttgc 8700agtaatcaat tttaattagt tttaattgac actattactt ttattgtcat
ctttgttagt 8760tttattgttg aattgagtgc atatttccta ggaaattctc ttacctaaca
ttttttatac 8820agatctatgc tcttggctct tgcccttact cttggccttg tgttggttat
ttgtctacat 8880atttattgac tggtcgatga gacatgtcac aattcttggg cttatttgtt
ggtctaataa 8940aaggagtgct tattgaaaga tcaagacgga gattcggttt tatataaata
aactaaagat 9000gacatattag tgtgttgatg tctcttcagg ataatttttg tttgaaataa
tatggtaatg 9060tcttgtctaa atttgtgtac ataattctta ctgatttttt ggattgttgg
atttttataa 9120acaaatctgc
9130101607DNACyamopsis tetragonoloba 101gcacgaggtt gcggctcaca
gtcgttgtgc ttcccaatcc ccgatcccca aaagagagag 60agagaatgag gtggtggcgg
cgccggcgtt gaccatgaga ccagtagcaa ccgatttcac 120ccaaaagctc ctcccttcca
atctcattct ggccaccaac aatcgccttc aacgtacctc 180tcccttcttt ctccatccat
atcgcatggc cgacggcgca gcgacatcca atacacccgc 240gccgcaccag atccaaccca
aactggaccc aaacgccgag aagaaggaga atctaccgaa 300ggagattcct ccgccgccgg
agaagcccga gcccggcgat tgttgcggca gcggatgcgt 360ccgatgcgtt tgggatattt
actatgagga gcttgaacaa tacaataagc tctacaaaca 420cgacgattcc aaccccaaac
cttaattagg atcattcttt tcccaatgta attcacaatt 480caagggttaa aatgacatca
tgattttgtc aatatctcca aagtttatcg ttaatggcaa 540gctcagggtt caccttgcca
aatttgacat tcaaggatgt gtagatctat actaagaaga 600gcttgaa
607102116PRTCyamopsis
tetragonoloba 102Met Arg Pro Val Ala Thr Asp Phe Thr Gln Lys Leu Leu Pro
Ser Asn1 5 10 15Leu Ile
Leu Ala Thr Asn Asn Arg Leu Gln Arg Thr Ser Pro Phe Phe 20
25 30Leu His Pro Tyr Arg Met Ala Asp Gly
Ala Ala Thr Ser Asn Thr Pro 35 40
45Ala Pro His Gln Ile Gln Pro Lys Leu Asp Pro Asn Ala Glu Lys Lys 50
55 60Glu Asn Leu Pro Lys Glu Ile Pro Pro
Pro Pro Glu Lys Pro Glu Pro65 70 75
80Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp
Ile Tyr 85 90 95Tyr Glu
Glu Leu Glu Gln Tyr Asn Lys Leu Tyr Lys His Asp Asp Ser 100
105 110Asn Pro Lys Pro
115103608DNABahia 103cgatggtgtc accaatcaca gcagcctccc cctccccttc
ccctttcggc ggcctgtatg 60ctgggcgccg tcctccgcgt ctcggccccg atcccgtctc
tcctccccgc gccgacgcgc 120cctctcctac tccgccgccg cagccacagc ctcccgcccg
agacgcccat ggccgcggcc 180gccccwcgcg acgccggcgc cacgaagccc gacgccgcgc
cggcgccggc gccagtgccg 240cagccacccg agaagccgct ccctggcgac tgctgcggga
gcggctgcgt ccgctgcgtc 300tgggacatct attacgacga actcgacgcg tacgaaaagg
ccctcgccgc ccacgcggcc 360tccgccggcg gcaaggcctc cccctatccc gctgacakca
agcccagcga cggcgccaag 420tcctgaagca cgtggggcgt catgcgtatc ccttcttctg
ttcccaactg aaatagattt 480tcagatatgc tgctagcaat tgttgacact gagacattac
atatgtgtat gctagattga 540gatgctttgt caattcaacc tcatcgttgt gcaagtgtgt
aacaagagaa agttaatatg 600attattaa
608104122PRTBahiamisc_feature(114)..(114)Xaa can
be any naturally occurring amino acid 104Met Leu Gly Ala Val Leu Arg Val
Ser Ala Pro Ile Pro Ser Leu Leu1 5 10
15Pro Ala Pro Thr Arg Pro Leu Leu Leu Arg Arg Arg Ser His
Ser Leu 20 25 30Pro Pro Glu
Thr Pro Met Ala Ala Ala Ala Pro Arg Asp Ala Gly Ala 35
40 45Thr Lys Pro Asp Ala Ala Pro Ala Pro Ala Pro
Val Pro Gln Pro Pro 50 55 60Glu Lys
Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys65
70 75 80Val Trp Asp Ile Tyr Tyr Asp
Glu Leu Asp Ala Tyr Glu Lys Ala Leu 85 90
95Ala Ala His Ala Ala Ser Ala Gly Gly Lys Ala Ser Pro
Tyr Pro Ala 100 105 110Asp Xaa
Lys Pro Ser Asp Gly Ala Lys Ser 115
120105133PRTArabidopsis lyrata 105Met Val Val Val Ser Leu His Arg Ile Ser
Ile Thr Thr Ser Pro Gly1 5 10
15Ser Ser Leu His Asp Val Leu Leu Ser Met Arg Phe Gly Leu Thr Arg
20 25 30Arg His Leu Pro Leu Lys
Arg Pro Phe Thr Asn Tyr Ser Ile Thr Ser 35 40
45Val Ser Pro Glu Gln Gln Leu Ile Ser Pro Val Thr Met Ala
Thr Thr 50 55 60Glu Ser Gln Asn Leu
Val Gln Ala Ser Lys Glu Glu Thr Asn Lys Lys65 70
75 80Glu Val Glu Asp Thr Lys Glu Ile Leu Ala
Pro Pro Pro Pro Glu Lys 85 90
95Pro Glu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp
100 105 110Asp Val Tyr Tyr Glu
Glu Leu Glu Asp Tyr Asn Lys Lys Leu Ser Gly 115
120 125Glu Thr Lys Ser Val 130106113PRTPicea
sitchensis 106Met Arg Ser Pro Phe Cys Ile Pro Ser Val Val Ser Ala Arg Thr
Arg1 5 10 15Val Cys Phe
Arg Phe Thr Cys Phe Thr Met Ala Thr Val Ser Gly Gly 20
25 30Gly Val Glu Gly Lys Glu Asn Leu Glu Lys
Ser Ile Glu Ala Lys Ala 35 40
45Lys Asp Glu Lys Lys Lys Ala Glu Glu Glu Ile Glu Lys Ile Leu Met 50
55 60Glu Lys Ile Gly Pro Pro Pro Glu Lys
Pro Leu Pro Gly Asp Cys Cys65 70 75
80Gly Ser Gly Cys Glu Ile Cys Val Trp Asp Thr Tyr Phe Asp
Gln Leu 85 90 95Gln Glu
Tyr Lys Lys Glu Lys Asp Ser Ile Leu Lys Ser Ile Ser Pro 100
105 110Pro 107821DNAHordeum vulgare
107ctccgcggcc cgggctctcc gatcccgcct ctcttccccg cgccggggcg ccctctcatc
60cacctatccc gccgcctccc tacggcgccc gccatggccg acgccaagaa gaccgacgcg
120ccggcgaccc cggccccgga gccgcccgag aagccgctcc ccggcgactg ctgcggcagc
180ggctgcgtcc gctgcgtctg ggacatctac tacgacgagc tccaggacta caaggaggcc
240ctcgccgccc acgcggccgc ggccgatccc agcggcgaca aggcatgcgt cgacgagaag
300aagaccgaat gatgagaccc gggaggaggc aggacccggg tgtgtatgct ggaactagta
360ctgggaccaa ataggatgcg cggctcgagt gggatatggg agcatgactc atggaatggc
420ggagcggcgt agctggcgtt gtggcgagaa aaaaaaatac taccaacagg gggggcccga
480gaccgagtga gtcctctaat tataatggaa gcaaaagcgt gaacgggtgt gtgcgcgggc
540gtggtcttga agagctctgg tgaagctgtg ccgaggagca gatgtgtccg tgcgtccata
600cgggtacaga gacgactagg aggtgttgta cgcggcttag tgagcgtggt taggcgggat
660gaaggagaag gggaggggga aggcgtgaga tgatagaaga tgatgggttg acgagatatg
720acgacggtgg agacgtagga ggcatgtgat aacagtaggc tgggctgagg tgggatgcgg
780aaggaggaga gatatatgag gggagggtgc ggttatagac g
821108103PRTHordeum vulgare 108Leu Arg Gly Pro Gly Ser Pro Ile Pro Pro
Leu Phe Pro Ala Pro Gly1 5 10
15Arg Pro Leu Ile His Leu Ser Arg Arg Leu Pro Thr Ala Pro Ala Met
20 25 30Ala Asp Ala Lys Lys Thr
Asp Ala Pro Ala Thr Pro Ala Pro Glu Pro 35 40
45Pro Glu Lys Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys
Val Arg 50 55 60Cys Val Trp Asp Ile
Tyr Tyr Asp Glu Leu Gln Asp Tyr Lys Glu Ala65 70
75 80Leu Ala Ala His Ala Ala Ala Ala Asp Pro
Ser Gly Asp Lys Ala Cys 85 90
95Val Asp Glu Lys Lys Thr Glu 100109547DNARaphanus
sativus 109gggcatggtt tcgttgcatc atatccatcc tcgattctcg accgccgcat
cgtcggaata 60caatcgtcgc cggaaaagct tccacgatgt gcttctgagc atgagatttg
gatttacgcg 120agatctctct ctgaaacggt ccttggtcaa ctactattcc ttatctcgac
aacaacgaca 180cctcaagtcg cccatcacca tggccaccaa gagcgagaag acttccacgg
aggagaagga 240taagaaggag gaggtttcac tccctccgcc tccgccgccg gagaaaccag
agcctggcga 300ctgctgcggt agcggatgcg tgcgatgcgt ttgggatgtg tattacgaag
agctccaaga 360atacaacaag ctttctacat cccttcctgg acaaactaaa tccaattgaa
tgctaaattt 420ttgtgtgcaa atgtactcgt cttcgagttt gagaagtcga agatgatgtt
atgtttgaac 480attattggat cattatcgtt actacttatc tacaaagttt actaaaagaa
aaaaaaaaaa 540aaaaaaa
547110134PRTRaphanus sativus 110Met Val Ser Leu His His Ile
His Pro Arg Phe Ser Thr Ala Ala Ser1 5 10
15Ser Glu Tyr Asn Arg Arg Arg Lys Ser Phe His Asp Val
Leu Leu Ser 20 25 30Met Arg
Phe Gly Phe Thr Arg Asp Leu Ser Leu Lys Arg Ser Leu Val 35
40 45Asn Tyr Tyr Ser Leu Ser Arg Gln Gln Arg
His Leu Lys Ser Pro Ile 50 55 60Thr
Met Ala Thr Lys Ser Glu Lys Thr Ser Thr Glu Glu Lys Asp Lys65
70 75 80Lys Glu Glu Val Ser Leu
Pro Pro Pro Pro Pro Pro Glu Lys Pro Glu 85
90 95Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys
Val Trp Asp Val 100 105 110Tyr
Tyr Glu Glu Leu Gln Glu Tyr Asn Lys Leu Ser Thr Ser Leu Pro 115
120 125Gly Gln Thr Lys Ser Asn
130111800DNADennstaedtia punctilobula 111ggcccctaca atatcccaaa atttcatccg
accaaagaaa ttttgggctg ctgtaacgct 60ggtgaaggta atgaaggtag ctttcttgaa
ctattcattg attccttcct tcttctcgcc 120ctcacctgta ctacaacgag ggttagggtt
tcgcgagact acaagggcgg caatgtccgg 180taacagggag cctgatcccg atcttgtgct
agaaagtact cctcccaagc agaagcagca 240gaatcacaag aaagaagtag atggagaaga
gaagaaagaa gaagatgatg cagagatttt 300gaggaagcag cttggcgagc cccctgagaa
gcctttgcct ggagactgtt gcggcagtgg 360atgtgtccga tgtgtctggg acatttattt
tgacgagctc gagctttata actcccgcaa 420ggatgtcctt gatgcccgcc gtgcttcgtg
atagtaccaa ctcgggatgc ctactattca 480tagctgaaga tttgcaagga ggcccacact
catctctgca gcagctcaac tcatcaattt 540tctgtgtgac ttgtttcaag gttcccctgt
gaccttgcac aatatttttc attgatctgt 600attctttacc atcataaaca ttggaattgg
gggttcctga aaggactaaa tcccctgttt 660ttttcaaggt aaccctgcca tttatgggtt
aatctgtatt gtttccttcc atgtacattt 720gcctagattc taccatatac atcagaaggc
cagaaataaa tccagggctt caattggctg 780tccagatgct tcgttttggg
800112381DNADennstaedtia punctilobula
112atgaaggtag ctttcttgaa ctattcattg attccttcct tcttctcgcc ctcacctgta
60ctacaacgag ggttagggtt tcgcgagact acaagggcgg caatgtccgg taacagggag
120cctgatcccg atcttgtgct agaaagtact cctcccaagc agaagcagca gaatcacaag
180aaagaagtag atggagaaga gaagaaagaa gaagatgatg cagagatttt gaggaagcag
240cttggcgagc cccctgagaa gcctttgcct ggagactgtt gcggcagtgg atgtgtccga
300tgtgtctggg acatttattt tgacgagctc gagctttata actcccgcaa ggatgtcctt
360gatgcccgcc gtgcttcgtg a
381113126PRTDennstaedtia punctilobula 113Met Lys Val Ala Phe Leu Asn Tyr
Ser Leu Ile Pro Ser Phe Phe Ser1 5 10
15Pro Ser Pro Val Leu Gln Arg Gly Leu Gly Phe Arg Glu Thr
Thr Arg 20 25 30Ala Ala Met
Ser Gly Asn Arg Glu Pro Asp Pro Asp Leu Val Leu Glu 35
40 45Ser Thr Pro Pro Lys Gln Lys Gln Gln Asn His
Lys Lys Glu Val Asp 50 55 60Gly Glu
Glu Lys Lys Glu Glu Asp Asp Ala Glu Ile Leu Arg Lys Gln65
70 75 80Leu Gly Glu Pro Pro Glu Lys
Pro Leu Pro Gly Asp Cys Cys Gly Ser 85 90
95Gly Cys Val Arg Cys Val Trp Asp Ile Tyr Phe Asp Glu
Leu Glu Leu 100 105 110Tyr Asn
Ser Arg Lys Asp Val Leu Asp Ala Arg Arg Ala Ser 115
120 125114777DNAOsmunda cinnamomea 114acaatgagat
agcatgaggt tgggtatcct gccctgcccc ttcatcaggc ctctgcttcc 60ctcgccatcc
atcgcccctc cctcctccag cctcctaacc ttccgcgctt cgccacgagc 120catggacaaa
cagcaggttc tccatcccaa gcccgcggat ctccccaaga atgactccaa 180acagaacgac
ctaacgctgc ctgcggatca ggaggaatcg cagctcggtc ctccaccgga 240aaagccgctc
ccaggtgatt gctgtggcag cggttgcgtg cggtgtgtct gggataccta 300tttcgaggag
ctggatagtt acaacgagcg caaagaggcg tttgaatccc gcctgaagaa 360gtcgcctcct
ctgtaatttt ctacattggc ggtagggaaa gggagtaaaa aatttacgag 420gaagaatgtg
caatgttttt gtgaggatga agtatcaggt ggtggggata gttcagaagg 480ctaagaactc
caaagatctt tcaagttgat ggtttgaaac ttattgaatg gactctcatg 540aagtcaagac
tgcactctct ttattgttac agactttcca ttgatatatt ttttcgccat 600attagcggac
atgcagatgt cacttgagat cttcgtccaa gttgtggcca gctgattctt 660tctatctgca
gtggtgcatt tgcccaacca gctaccttct ctaagcattt tgatcagagc 720ttctaaaaga
gcaggctgaa gtgatgatat atggtttctt tacatcaatc atggctg
777115363DNAOsmunda cinnamomea 115atgaggttgg gtatcctgcc ctgccccttc
atcaggcctc tgcttccctc gccatccatc 60gcccctccct cctccagcct cctaaccttc
cgcgcttcgc cacgagccat ggacaaacag 120caggttctcc atcccaagcc cgcggatctc
cccaagaatg actccaaaca gaacgaccta 180acgctgcctg cggatcagga ggaatcgcag
ctcggtcctc caccggaaaa gccgctccca 240ggtgattgct gtggcagcgg ttgcgtgcgg
tgtgtctggg atacctattt cgaggagctg 300gatagttaca acgagcgcaa agaggcgttt
gaatcccgcc tgaagaagtc gcctcctctg 360taa
363116120PRTOsmunda cinnamomea 116Met
Arg Leu Gly Ile Leu Pro Cys Pro Phe Ile Arg Pro Leu Leu Pro1
5 10 15Ser Pro Ser Ile Ala Pro Pro
Ser Ser Ser Leu Leu Thr Phe Arg Ala 20 25
30Ser Pro Arg Ala Met Asp Lys Gln Gln Val Leu His Pro Lys
Pro Ala 35 40 45Asp Leu Pro Lys
Asn Asp Ser Lys Gln Asn Asp Leu Thr Leu Pro Ala 50 55
60Asp Gln Glu Glu Ser Gln Leu Gly Pro Pro Pro Glu Lys
Pro Leu Pro65 70 75
80Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp Thr Tyr
85 90 95Phe Glu Glu Leu Asp Ser
Tyr Asn Glu Arg Lys Glu Ala Phe Glu Ser 100
105 110Arg Leu Lys Lys Ser Pro Pro Leu 115
12011725PRTArtificial sequencemisc_feature 117Glu Lys Pro Xaa Xaa
Gly Asp Cys Cys Gly Ser Gly Cys Xaa Xaa1 5
10 15Cys Val Trp Asp Xaa Tyr Xaa Xaa Xaa Leu
20 25118117PRTGlycine max 118Met Arg Thr Thr Ala Pro
Ser Asp Phe Ile Phe Thr Gln Lys Leu His1 5
10 15Pro Phe Asn Ile Thr Ser Thr Lys Thr Ser Leu Gln
Arg Thr Leu Pro 20 25 30Tyr
Phe Leu Gln Leu Asn Arg Met Ala Glu Ala Ala Arg Thr Ala His 35
40 45Lys Pro Ala Pro His Pro Ile Gln Pro
Lys Pro Asp Asp Lys Thr Pro 50 55
60Asn Pro Ala Lys Glu Ile Pro Pro Pro Pro Glu Lys Pro Glu Pro Gly65
70 75 80Asp Cys Cys Gly Ser
Gly Cys Val Arg Cys Val Trp Asp Val Tyr Tyr 85
90 95Asp Glu Leu Glu Glu Tyr Asn Lys Arg Tyr Lys
Gln Val Asp Pro Ser 100 105
110Pro Lys Pro Ser Ser 115119132PRTSorghum bicolor 119Met Leu Gly
Ala Val Val Arg Val Pro Ala Pro Ile Leu Leu Pro Leu1 5
10 15Leu Pro Gly Pro Thr Arg Pro Leu Leu
Leu Arg Arg Arg Arg His Cys 20 25
30Leu Pro Pro Glu Ala Pro Met Ala Ser Ala Thr Pro Ser Asp Gly Gly
35 40 45Ala Ala Lys Pro Asp Ala Ala
Pro Ala Pro Val Pro Val Pro Ala Pro 50 55
60Ala Pro Thr Pro Leu Pro Leu Pro Pro Glu Lys Pro Leu Pro Gly Asp65
70 75 80Cys Cys Gly Ser
Gly Cys Val Arg Cys Val Trp Asp Ile Tyr Phe Asp 85
90 95Glu Leu Asp Ala Tyr Asp Lys Ala Leu Ala
Ala His Ala Ala Ala Ser 100 105
110Ser Gly Ser Gly Ala Lys Asp Asp Ser Ala Asp Thr Lys Pro Ser Asp
115 120 125Gly Ala Lys Ser
130120135PRTArabidopsis thaliana 120Met Val Val Val Ser Leu Leu Pro Arg
Ile Ser Ile Val Thr Ser Pro1 5 10
15Gly Ser Ser Leu His Asp Val Leu Leu Ser Met Arg Phe Gly Leu
Thr 20 25 30Arg His Leu Pro
Leu Lys Arg Ser Phe Ser Asn Tyr Ser Ile Thr Ser 35
40 45Val Ser Pro Glu Gln Gln Leu Lys Ser Pro Val Thr
Met Ala Thr Thr 50 55 60Glu Ser Lys
Asn Leu Val Glu Ala Ser Lys Glu Glu Thr Asn Lys Lys65 70
75 80Glu Thr Glu Asp Lys Lys Glu Val
Gly Val Ser Val Pro Pro Pro Pro 85 90
95Glu Lys Pro Glu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val
Arg Cys 100 105 110Val Trp Asp
Val Tyr Tyr Asp Glu Leu Glu Asp Tyr Asn Lys Gln Leu 115
120 125Ser Gly Glu Thr Lys Ser Ile 130
135121115PRTOryza sativa 121Met Leu Val Ala Ala Leu Arg Val Pro Ala
Pro Ile Pro Ser Ser Leu1 5 10
15Pro Ser Pro Ala Arg Pro Leu Leu Arg Arg Arg Ser Ser His Arg Leu
20 25 30Pro Pro Pro Pro Pro Pro
Ala Ala Ser Met Ala Asp Ala Gly Gly Ala 35 40
45Thr Thr Asn Lys Pro Ala Pro Ala Pro Ala Pro Glu Pro Pro
Glu Lys 50 55 60Pro Leu Pro Gly Asp
Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp65 70
75 80Asp Val Tyr Tyr Asp Glu Leu Asp Ala Tyr
Asn Lys Ala Leu Ala Ala 85 90
95His Ser Ser Ser Ala Ser Ser Gly Ser Lys Pro Ala Thr Ser Asp Gly
100 105 110Ala Lys Ser 115