Patent application title: Sorghum Centromere Sequences and Minichromosomes
Inventors:
Daphne Preuss (Chicago, IL, US)
Pierluigi Barone (Charleston, IL, US)
Shawn R. Carlson (Bondville, IL, US)
Gregory P. Copenhaver (Chapel Hill, NC, US)
Song Luo (Chicago, IL, US)
Jennifer M. Mach (Chicago, IL, US)
IPC8 Class: AC12N1582FI
USPC Class: 800320
Class name: Plant, seedling, plant seed, or plant part, per se higher plant, seedling, plant seed, or plant part (i.e., angiosperms or gymnosperms) gramineae (e.g., barley, oats, rye, sorghum, millet, etc.)
Publication date: 2016-04-28
Patent application number: 20160115495
Abstract:
The invention is generally related to MCs containing sorghum centromere
sequences. In addition, the invention provides for methods of generating
plants transformed with these MCs. MCs with novel compositions and
structures are used to transform plant cells which are in turn used to
generate the plant. Methods for generating the plant include methods for
delivering the MC into a plant cell to transform the cell, methods for
selecting the transformed cell, and methods for isolating plants
transformed with the MC.
Claims:
1. A sorghum plant comprising a sorghum mini-chromosome comprising a
sorghum centromere, wherein the sorghum centromere comprises at least two
copies of a first repeated nucleotide sequence that is at least 80%
identical to the nucleotide sequence of any one of SEQ ID NOs:22-176 or
hybridizes to the nucleotide sequence of any one of SEQ ID NOs:22-176
under stringent conditions comprising hybridization at 65° C. and
washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at
65° C.
2. A sorghum plant cell comprising a sorghum mini-chromosome comprising at least two copies of a repeated nucleotide sequence that is at least 80% identical to any one of SEQ ID NOs:22-176 or hybridizes to any one of SEQ ID NOs:22-176 under stringent conditions comprising hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C., and a Transgene Expression Cassette.
3. The sorghum plant cell of claim 2, further comprising at least two copies of a second repeated nucleotide sequence that is at least 80% identical over its length to a fragment of the sorghum retrotransposon sequence of SEQ ID NO:21 or hybridizes to a fragment of the sorghum retrotransposon sequence of SEQ ID NO:21 under stringent conditions comprising hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C.
4. The sorghum plant cell of claim 3, wherein the sorghum centromere comprises (a) at least 5 copies of the first repeat within 1 kb of nucleotide sequence, and (b) a transgene expression cassette comprising at least one exogenous nucleic acid.
5-8. (canceled)
9. The sorghum plant cell of claim 4, wherein the sorghum mini-chromosome exhibits a mitotic segregation efficiency in sorghum cells of at least 90%.
10. The sorghum plant cell of claim 4, wherein at least one exogenous nucleic acid is operably linked to a heterologous regulatory sequence functional in sorghum cells.
11. The sorghum plant cell of claim 10, wherein the exogenous nucleic acid is selected from the group consisting of a herbicide resistance gene, a nitrogen fixation gene, an insect resistance gene, a disease resistance gene, a plant stress-induced gene, a nutrient utilization gene, a gene that affects plant pigmentation, a gene that encodes an antisense or ribozyme molecule, a gene encoding a secretable antigen, a toxin gene, a receptor gene, a ligand gene, a seed storage gene, a hormone gene, an enzyme gene, an antibody gene, a growth factor gene, a drought resistance gene, a heat resistance gene, a chilling resistance gene, a freezing resistance gene, an excessive moisture resistance gene, or a salt stress resistance gene or a biofuel gene.
12. (canceled)
13. A sorghum plant cell comprising a transgene expression cassette comprising at least one exogenous nucleic acid not integrated into the plant cell genome, wherein the transgene expression cassette comprises (a) a polynucleotide sequence that is transcribed as a first RNA, (b) a polynucleotide sequence that is transcribed as a second RNA, and (c) a polynucleotide sequence that is transcribed as a third RNA wherein transcription of the polynucleotide sequences results in increased biomass of a sorghum plant compared to the biomass of a wildtype sorghum plant.
14-15. (canceled)
16. The sorghum plant cell of claim 4, wherein the transgene expression cassette comprises at least three exogenous nucleic acids, and wherein the first repeated nucleotide sequence and the transgene expression cassette are not integrated into the genome of the sorghum plant cell.
17. A sorghum plant cell of claim 4 that exhibits an altered phenotype associated with at least one exogenous nucleic acid within the sorghum mini-chromosome.
18. The sorghum plant cell of claim 17, wherein the altered phenotype comprises increased altered expression of a native gene or the expression of an exogenous gene.
19-20. (canceled).
21. A sorghum plant, plant tissue or sorghum plant part comprising the plant cell of claim 2.
22-24. (canceled).
25. A sorghum seed obtained from the plant of claim 21.
26. A sorghum plant progeny comprising a sorghum mini-chromosome, wherein the plant progeny is the result of breeding a plant of claim 21.
27. A method of using a sorghum plant of claim 21, the method comprising growing the plant to produce a recombinant protein encoded by an exogenous nucleic acid of the mini-chromosome, and alternatively, further comprising a step of harvesting or processing the sorghum plant.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to US Provisional Patent Application by D. Preuss et al., U.S. Patent Application Ser. No. 61/228,015, titled, "SORGHUM CENTROMERE SEQUENCES AND MINICHROMOSOMES," filed Jul. 23, 2009, which is incorporated by reference herein in its entirety.
GOVERNMENT SUPPORT
[0002] Not applicable.
COMPACT DISC FOR SEQUENCE LISTINGS AND TABLES
[0003] Not applicable.
FIELD OF THE INVENTION
[0004] The present invention relates to sorghum centromere sequences that are useful, for example, in constructing artificial chromosomes comprising sorghum centromere sequences, and cells and organisms comprising such artificial chromosomes, including Sorghum bicolor and Sorghum sudanese. Methods to make and use the disclosed sorghum centromeres are also disclosed.
BACKGROUND OF THE INVENTION
[0005] Two general approaches are used for introduction of new heritable genetic information ("transformation") into cells. One approach is to introduce the new genetic information as part of another DNA molecule, referred to as an "episomal vector," or "minichromosome" (MC), which can be maintained as an independent unit (an episome) apart from the host chromosomal DNA molecule(s). Episomal vectors contain all the necessary DNA sequence elements required for DNA replication and maintenance of the vector within the cell. Many episomal vectors are available for use in bacterial cells (for example, see Maniatis et al., Molecular Cloning: a Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1982). However, only a few episomal vectors that function in higher eukaryotic cells have been developed. Higher eukaryotic episomal vectors were primarily based on naturally occurring viruses. In higher plant systems gemini viruses are single-stranded DNA viruses that replicate through a double-stranded intermediate upon which an episomal vector could be based, although the gemini virus is limited to an approximately 800 bp insert. Although an episomal plant vector based on the Cauliflower Mosaic Virus has been developed, its capacity to carry new genetic information also is limited (Brisson et al., Nature, 310:511, 1984).
[0006] The other general method of genetic transformation involves integration of introduced DNA sequences into the recipient cell's chromosomes, permitting the new information to be replicated and partitioned to the cell's progeny as a part of the natural chromosomes. The introduced DNA usually can be broken and joined together in various combinations before it is integrated at random sites into the cell's chromosome (see, for example Wigler et al., Cell, 11:223, 1977). Common problems with this procedure are the rearrangement of introduced DNA sequences and unpredictable levels of expression due to the location of the transgene integration site in the host genome or so called "position effect variegation" (Shingo et al., Mol. Cell. Biol., 6:1787, 1986). Further, unlike episomal DNA, integrated DNA cannot normally be precisely removed. A more refined form of integrative transformation can be achieved by exploiting naturally occurring viruses that integrate into the host's chromosomes as part of their life cycle, such as retroviruses (see Chepko et al., Cell, 37:1053, 1984).
[0007] One common genetic transformation method used in higher plants is based on the transfer of bacterial DNA into plant chromosomes that occurs during infection by the phytopathogenic soil bacterium Agrobacterium (see Nester et al., Ann. Rev. Plant Phys., 35:387-413, 1984). By substituting genes of interest for a portion of the naturally transferred bacterial sequences (called T-DNA), investigators have been able to introduce new DNA into plant cells. However, even this more "refined" integrative transformation system is limited in three major ways. First, DNA sequences introduced into plant cells using the Agrobacterium T-DNA system are frequently rearranged (see Jones et al., Mol Gen. Genet., 207:478, 1987). Second, the expression of the introduced DNA sequences varies between individual transformants (see Jones et al., EMBO J., 4:2411-2418, 1985). This variability is presumably caused by rearranged sequences and the influence of surrounding sequences in the plant chromosome (i.e., position effects), as well as methylation of the transgene. Finally, insertion of extra elements into the genome can disrupt the genes, promoters or other genetic elements necessary for normal plant growth and function.
[0008] Another widely used technique to genetically transform plants involves the use of microprojectile bombardment to integrate DNA sequences into the genome. In this process, a nucleic acid containing the desired genetic elements to be introduced into the plant's native chromosome is deposited on or in small metallic particles, e.g., tungsten, platinum, or preferably gold, which are then delivered at a high velocity into the plant tissue or plant cells. However, similar problems arise as with Agrobacterium-mediated gene transfer, and as noted above expression of the inserted DNA can be unpredictable and insertion of extra elements into the genome can disrupt and adversely impact plant processes.
[0009] One attractive alternative to the commonly used methods of transformation is the use of an artificial chromosome. Artificial chromosomes are episomal nucleic acid molecules that exist autonomously from the native chromosomes of the host genome. They can be linear or circular DNA molecules that are comprised of cis-acting nucleic acid sequence elements that provide replication and partitioning activities (see Murray et al., Nature, 305:189-193, 1983). Desired elements include: (1) origins of replication, which are the sites for initiation of DNA replication, (2) centromeres (site of kinetochore assembly and responsible for proper distribution of replicated chromosomes into daughter cells at mitosis or meiosis), and (3) if the chromosome is linear, telomeres (specialized DNA structures at the ends of linear chromosomes that function to stabilize the ends and facilitate the complete replication of the extreme termini of the DNA molecule). An additional desired element is a chromatin organizing sequence. It is well documented that centromere function is crucial for stable chromosomal inheritance in almost all eukaryotic organisms (reviewed in Nicklas, J Cell Sci. 189:283-5, 1988). The centromere accomplishes this by attaching, via centromere binding proteins, to the spindle fibers during mitosis and meiosis, thus ensuring proper gene segregation during cell divisions.
[0010] Artificial chromosomes have been engineered using one of two approaches. The first approach identifies and assembles the desired chromosomal elements into an artificial construct. This approach has been described as "bottom-up" and involves the use of a heterologous system (i.e. bacterial or fungal) to perform the various cloning steps necessary to assemble the artificial chromosome. Artificial chromosomes of this type will be referred to in this application as "minichromosomes" or "MCs". The second approach derives the artificial chromosome from existing chromosomes through chromosome fragmentation and, optionally, subsequent addition of desired elements including transgenes. For example, an existing chromosome can be induced to undergo breakage events that result in chromosomal fragments. Minimal fragments that possess the elements necessary for replication and segregation during cell division (i.e. centromere, origins of replication and telomeres) can be identified. These derived artificial chromosomes can then be used as targets for further manipulation including the addition of one or more transgenes. This approach has been described as "top-down" and does not require the use of a heterologous system (i.e. bacterial or fungal) since it does not require in vitro-based cloning steps. Artificial chromosomes of this type will be referred to in this application as "recombinant chromosomes."
[0011] The essential chromosomal elements for construction of artificial chromosomes have been precisely characterized in lower eukaryotic species, and more recently in mouse and human. Autonomous Replication Sequences (ARSs) have been isolated from unicellular fungi, including Saccharomyces cerevisiae (brewer's yeast) and Schizosaccharomyces pombe (see Stinchcomb et al., Nature 282:39-43, 1979 and Hsiao et al., Proc Natl Acad Sci USA 76:3829-33, 1979). An ARS behaves like an origin of replication allowing DNA molecules that contain the ARS to be replicated in concert with the rest of the genome after introduction into the cell nuclei of these fungi. DNA molecules containing these sequences replicate, but in the absence of a centromere they are not partitioned into daughter cells in a controlled fashion that ensures efficient chromosome inheritance.
[0012] Artificial chromosomes have been constructed in yeast using the three cloned essential chromosomal elements (see Murray et al., Nature, 305:189-193, 1983). None of the essential components identified in unicellular organisms, however, function in higher eukaryotic systems. For example, a yeast centromere sequence will not confer stable inheritance upon vectors transformed into higher eukaryotes.
[0013] In contrast to the detailed studies done in yeast, less is known about the molecular structure of functional centromeric DNA of higher eukaryotes. Ultrastructural studies indicate that higher eukaryotic kinetochores, which are specialized complexes of proteins that form on the centromere during late prophase, are large structures (mammalian kinetochore plates are approximately 0.3 μm in diameter) which possess multiple microtubule attachment sites (reviewed in Rieder, Int Rev Cytol; 79:1-58, 1982). It is therefore possible that the centromeric DNA regions of these organisms will be correspondingly large, although the minimal amount of DNA necessary for centromere function may be much smaller.
[0014] While the above studies have been useful in elucidating the structure and function of centromeres, it was not known whether information derived from lower eukaryotic or mammalian higher eukaryotic organisms would be applicable to sorghum. There exists a need for cloned centromeres from sorghum, which would represent a first step in the production of artificial chromosomes, or in the identification of recombinant chromosomes. There further exists a need for sorghum cells, plants, seeds and progeny containing functional, stable, and autonomous artificial or recombinant chromosomes capable of carrying a large number of different genes and genetic elements.
SUMMARY OF THE INVENTION
[0015] In one aspect, the present invention addresses sorghum MCs comprising a sorghum centromere having one or more repeated nucleotide sequences, described in further detail herein. In some embodiments, such MCs comprise a centromere comprising one or more selected repeated nucleotide sequences derived from sorghum, including those isolated from sorghum genomic DNA and synthetic arrays of repeat sequences. In other embodiments, the invention addresses sorghum recombinant chromosomes.
[0016] In another aspect, the invention provides modified or "adchromosomal" sorghum plants, containing functional, stable, autonomous MCs or recombinant chromosomes.
[0017] The invention provides for isolated sorghum MCs comprising a centromere, wherein the centromere comprises at least two copies of a repeated nucleotide sequence, and wherein the centromere confers the ability to segregate to daughter cells. The repeated nucleotide sequences may be short sorghum satellite sequences such as those sequences set out in SEQ ID NOs:23-176, or the consensus sorghum satellite sequence set out as SEQ ID NO:22. The repeated nucleotide sequences may be longer sequences such as the sorghum retrotransposon CRS sequence, set out as SEQ ID NO:21 or fragments thereof.
[0018] In exemplary embodiments, the invention provides for a sorghum plant cell comprising a sorghum MC comprising a sorghum centromere that comprises at least two repeat nucleotide sequences that have a sequence that hybridizes under conditions comprising hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C. to any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or the consensus sorghum satellite sequence set out as SEQ ID NO:22, or the sorghum retrotransposon sequence of SEQ ID NO:21 or a fragment thereof, and wherein the centromere confers the ability to segregate to daughter cells. Alternatively, the hybridization conditions may comprise hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C.
[0019] In another exemplary embodiment, the invention provides for a sorghum plant cell comprising a sorghum MC comprising a sorghum centromere, wherein the centromere comprises at least two copies of a repeated nucleotide sequence that has a sequence that is at least 80% identical to any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or the consensus sorghum satellite sequence set out as SEQ ID NO:22, or the sorghum retrotransposon sequence of SEQ ID NO:21 or a fragment thereof, and wherein the centromere confers the ability to segregate to daughter cells. The invention also provides for a sorghum plant cell comprising a sorghum MC wherein the repeated nucleotide sequence comprises a sequence that is at least 85% identical, or 90% identical or 95% identical or 98% identical to any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or the consensus sorghum satellite sequence set out as SEQ ID NO:22.
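By way of illustration only, a percent-identity threshold such as "at least 80% identical" can be evaluated computationally. The following is a minimal sketch, assuming a simple global alignment (match +1, mismatch -1, gap -1) and assuming identity is defined as the number of matching columns divided by the alignment length. Neither that scoring scheme nor that definition is prescribed by the passage above, and the function names and short example sequences are hypothetical and are not taken from the Sequence Listing.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Matching columns divided by alignment length, as a percentage (assumed definition)."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    ga, gb = global_align(a.upper(), b.upper())
    matches = sum(1 for x, y in zip(ga, gb) if x == y and x != "-")
    return 100.0 * matches / len(ga)


def meets_identity(a, b, threshold=80.0):
    """True if the two sequences reach the given percent-identity threshold."""
    return percent_identity(a, b) >= threshold


# Hypothetical 10-nt sequences differing at a single position.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

In practice an established alignment tool would normally be used; the sketch only makes the arithmetic behind the threshold explicit.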
[0020] In another embodiment, the invention provides for a sorghum plant cell comprising a sorghum MC comprising at least two copies of a repeated nucleotide sequence that is at least 80% identical to any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or the consensus sorghum satellite sequence set out as SEQ ID NO:22, or the sorghum retrotransposon sequence of SEQ ID NO:21 or a fragment thereof or hybridizes to the nucleotide sequence of any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or the consensus sorghum satellite sequence set out as SEQ ID NO:22, or the sorghum retrotransposon sequence of SEQ ID NO:21 or a fragment thereof under stringent conditions comprising hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C., and a Transgene Expression Cassette.
[0021] In a further embodiment, the invention provides for a sorghum plant cell comprising a sorghum MC comprising a sorghum centromere, wherein the centromere comprises (a) at least two copies of a sorghum satellite nucleotide sequence, and (b) at least two copies of a sorghum CRS nucleotide sequence (SEQ ID NO:21) or fragments thereof, and wherein the centromere confers the ability to segregate to daughter sorghum cells. In another embodiment, the invention provides for a sorghum plant cell comprising a MC comprising a sorghum centromere, wherein the centromere comprises (a) at least one array of sorghum satellite nucleotide sequences, and (b) at least one array of sorghum CRS nucleotide sequence (SEQ ID NO:21) or fragments thereof, and wherein the centromere confers the ability to segregate to daughter sorghum cells. The sorghum satellite nucleotide sequence may be one of the sequences set out in SEQ ID NOs:23-176, or the consensus sorghum satellite sequence set out as SEQ ID NO:22, or a sequence that hybridizes under conditions comprising hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C. to any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or to the consensus sorghum satellite sequence set out as SEQ ID NO:22, or a sequence that is at least 70% identical to any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or to the consensus sorghum satellite sequence set out as SEQ ID NO:22.
[0022] In addition, the invention provides for a sorghum plant cell comprising a sorghum MC comprising a sorghum centromere, wherein the sorghum centromere comprises (a) at least 5 copies of a repeated nucleotide sequence within 1 kb of nucleotide sequence, wherein the repeated nucleotide sequence is at least 80% identical to any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or to the consensus sorghum satellite sequence set out as SEQ ID NO:22, or hybridizes to the nucleotide sequence of any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or to the consensus sorghum satellite sequence set out as SEQ ID NO:22 under stringent conditions comprising hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C., and (b) at least 2 copies of a repeated nucleotide sequence that is at least 80% identical over its length to a fragment of the nucleotide sequence of SEQ ID NO:21 or hybridizes to a fragment of SEQ ID NO:21 under stringent conditions comprising hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C.
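For illustration, the "at least 5 copies within 1 kb" criterion can be expressed as a window count over the coordinates of repeat copies already located in a candidate sequence. The sketch below assumes that each copy is reported as a half-open (start, end) interval and that a copy counts only if it lies entirely inside a 1,000 bp window. The function name, the coordinate convention and the 150 bp example copies are hypothetical and are not taken from this application.

```python
def max_copies_in_window(hits, window=1000):
    """Largest number of repeat copies lying entirely within any window of the given size.

    hits: iterable of (start, end) half-open coordinates, one per repeat copy.
    """
    hits = sorted(hits)
    best = 0
    for i, (start_i, _) in enumerate(hits):
        limit = start_i + window
        # An optimal window can always be shifted to begin at some copy's start,
        # so only windows anchored at each copy start need to be examined.
        count = sum(1 for _, end in hits[i:] if end <= limit)
        best = max(best, count)
    return best


# Hypothetical example: five 150 bp copies placed head to tail.
dense = [(0, 150), (150, 300), (300, 450), (450, 600), (600, 750)]
print(max_copies_in_window(dense) >= 5)
```

The scan is quadratic in the number of copies, which is adequate for a sketch; a two-pointer sweep would be the usual choice for long arrays.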
[0023] In another embodiment, the invention provides a sorghum plant cell comprising (a) a polynucleotide sequence that is transcribed as a first RNA, (b) a polynucleotide sequence that is transcribed as a second RNA, and (c) a polynucleotide sequence that is transcribed as a third RNA, wherein transcription of the polynucleotide sequences results in increased biomass of a sorghum plant.
[0024] In an additional embodiment, the invention provides for a sorghum plant cell comprising a transgene expression cassette not integrated into the plant cell genome, wherein the Transgene Expression Cassette comprises (a) a polynucleotide sequence that is transcribed as a first RNA, (b) a polynucleotide sequence that is transcribed as a second RNA, and (c) a polynucleotide sequence that is transcribed as a third RNA, wherein transcription of the polynucleotide sequences results in increased biomass of a sorghum plant.
[0025] The invention provides for a sorghum plant cell comprising a recombinant chromosome comprising at least two copies of a repeated nucleotide sequence, and wherein the centromere confers the ability to segregate to daughter cells. The repeated nucleotide sequences may be short sorghum satellite sequences such as those sequences set out in SEQ ID NOs:23-176, or the consensus sorghum satellite sequence set out as SEQ ID NO:22. The repeated nucleotide sequences may be longer sequences such as the sorghum retrotransposon sequence CRS, set out as SEQ ID NO:21.
[0026] In exemplary embodiments, the invention provides for a sorghum plant cell comprising a recombinant chromosome comprising a sorghum centromere that comprises at least two repeat nucleotide sequences that have a sequence that hybridizes under conditions comprising hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C. to any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or to the consensus sorghum satellite sequence set out as SEQ ID NO:22, or to the sorghum retrotransposon sequence of SEQ ID NO:21 or a fragment thereof, and wherein the centromere confers the ability to segregate to daughter cells. Alternatively, the hybridization conditions may comprise hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C.
[0027] In another exemplary embodiment, the invention provides for a sorghum plant cell comprising a recombinant chromosome comprising at least two copies of a repeated nucleotide sequence that has a sequence that is at least 80% identical to any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or to the consensus sorghum satellite sequence set out as SEQ ID NO:22, or to the sorghum retrotransposon sequence of SEQ ID NO:21 or a fragment thereof, and a transgene expression cassette comprising at least three exogenous nucleic acids. The invention also provides for sorghum recombinant chromosomes wherein the repeated nucleotide sequence comprises a sequence that is at least 85% identical, or 90% identical or 95% identical or 98% identical to any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or to the consensus sorghum satellite sequence set out as SEQ ID NO:22, or the sorghum retrotransposon sequence of SEQ ID NO:21 or a fragment thereof.
[0028] In a further embodiment, the invention provides for a sorghum plant cell comprising a sorghum recombinant chromosome comprising a sorghum centromere, wherein the centromere comprises (a) at least two copies of a sorghum satellite nucleotide sequence, and (b) at least two copies of a sorghum CRS nucleotide sequence (SEQ ID NO:21) or a fragment thereof, and wherein the centromere confers the ability to segregate to daughter cells. In another embodiment, the invention provides for a sorghum recombinant chromosome comprising a sorghum centromere, wherein the centromere comprises (a) at least one array of sorghum satellite nucleotide sequences, and (b) at least one array of sorghum CRS nucleotide sequence (SEQ ID NO:21) or a fragment thereof, and wherein the centromere confers the ability to segregate to daughter cells. The sorghum satellite nucleotide sequence may be one of the sequences set out in SEQ ID NOs:23-176, or the consensus sorghum satellite sequence set out as SEQ ID NO:22, or a sequence that hybridizes under conditions comprising hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C. to the nucleotide sequence of any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or to the consensus sorghum satellite sequence set out as SEQ ID NO:22, or a sequence that is at least 80% identical to any one of the sorghum satellite sequences set out in SEQ ID NOs:23-176, or to the consensus sorghum satellite sequence set out as SEQ ID NO:22.
[0029] Alternatively, the invention provides for sorghum plant cells comprising a recombinant chromosome that has not been maintained in a cell of a heterologous organism.
[0030] In another embodiment, the invention provides for a sorghum plant cell comprising (a) at least two copies of a repeated nucleotide sequence that is at least 80% identical to any one of the sorghum satellite sequence set out in SEQ ID NOs:23-176, or to the consensus sorghum satellite sequence set out as SEQ ID NO:22, or to the sorghum retrotransposon sequence of SEQ ID NO:21 or a fragment thereof or hybridizes to any one of the sorghum satellite sequence set out in SEQ ID NOs:23-176, or to the consensus sorghum satellite sequence set out as SEQ ID NO:22, or to the sorghum retrotransposon sequence of SEQ ID NO:21 or a fragment thereof under stringent conditions comprising hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C., and (b) a Transgene Expression Cassette comprising at least three exogenous nucleic acids, wherein the nucleotide sequence and the Transgene Expression Cassette are not integrated into the genome of the sorghum plant cell.
[0031] The invention also provides for a sorghum plant cell comprising a sorghum MC comprising a sorghum centromere, wherein the centromere comprises at least two synthetic repeat sequences or a synthetic array of repeated nucleotide sequences, wherein the array comprises at least two copies of a repeated nucleotide sequence, and wherein the centromere confers the ability to segregate to daughter sorghum cells. These artificially synthesized repeated nucleotide sequences may be based on sequence information from natural sorghum centromere sequences, combinations or fragments of natural sorghum centromere sequences including a combination of repeats of different lengths, a combination of different sequences, a combination of both different repeat lengths and different sequences, a combination of different artificially synthesized sequences or a combination of natural sorghum centromere sequence(s) and artificially synthesized sorghum sequence(s). The polynucleotides comprising synthetic arrays of sorghum repeat sequences may be generated using any technique known in the art including PCR from sorghum genomic DNA (or a clone thereof) or by custom oligonucleotide synthesis.
[0032] The invention provides for any of the preceding sorghum MCs or recombinant chromosomes having a centromere comprising an array of repeated nucleotide sequences that ranges from about 1 kb to about 200 kb in length, about 1 kb to about 100 kb in length, about 1 kb to about 10 kb, about 2 kb to about 12 kb, about 5 kb to about 25 kb, about 10 kb to about 50 kb, or about 25 kb to about 100 kb.
[0033] The invention further contemplates any of the preceding sorghum MCs or recombinant chromosomes having centromeres comprising at least 300 bp, 400 bp, 500 bp, 600 bp, 700 bp, 750 bp, 1 kb, 1.5 kb, 2 kb, 2.5 kb, 3 kb, 3.5 kb, 4 kb, 4.5 kb, 5 kb, 5.5 kb, 6 kb, 6.5 kb, 7 kb, 7.5 kb, 8 kb, 8.5 kb, 9 kb, 9.5 kb, 10 kb, 10.5 kb, 11 kb, 11.5 kb, 12 kb, 12.5 kb, 13 kb, 13.5 kb, 14 kb, 14.5 kb, 15 kb, 16 kb, 17 kb, 18 kb, 19 kb, 20 kb, 25 kb, 30 kb, 35 kb, 40 kb, 45 kb, 50 kb, 60 kb, 70 kb, 80 kb, 90 kb, 100 kb, 110 kb, 120 kb, 130 kb, 140 kb, 150 kb, 160 kb, 170 kb, 180 kb, 190 kb, 200 kb, 225 kb, 250 kb, 275 kb, 300 kb, 325 kb, 350 kb or 375 kb.
[0034] In another embodiment, any of the preceding sorghum MCs or recombinant chromosomes comprise centromeres having n copies of a repeated nucleotide sequence, wherein n is less than 2000, less than 1500, less than 1000, less than 500, less than 400, less than 300, less than 250, less than 200, less than 100, less than 90, less than 80, less than 70, less than 60, less than 50, less than 40, less than 30, less than 25, less than 20, less than 15, less than 10, less than 9, less than 8, less than 7, less than 6 or less than 5. In exemplary embodiments, the centromeres of the sorghum MCs of the invention comprise n copies of a repeated nucleotide sequence, wherein n is at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 550, 600, 650, 700, 750, 800, 850, 900 or 1000. In additional exemplary embodiments, the centromeres of the sorghum MCs or recombinant chromosomes of the invention comprise n copies of a repeated nucleotide sequence where n ranges from 2 to 10, 2 to 20, 5 to 15, 5 to 25, 5 to 50, 5 to 100, 5 to 250, 5 to 500, 5 to 1000, 15 to 25, 15 to 50, 15 to 100, 15 to 250, 15 to 500, 15 to 1000, 25 to 50, 25 to 100, 25 to 250, 25 to 500, 25 to 1000, 50 to 100, 50 to 250, 50 to 500, 50 to 1000, 100 to 250, 100 to 500, 100 to 1000, 250 to 500, 250 to 1000, or 500 to 1000.
[0035] In an embodiment of the invention, any of the preceding sorghum MCs or recombinant chromosomes comprise a centromere having at least 5 consecutive repeated nucleotide sequences in "head to tail orientation." In another embodiment of the invention, any of the preceding sorghum MCs or recombinant chromosomes comprise a centromere having at least 5 consecutive repeated nucleotide sequences in "tandem," in which one repeat sequence is immediately adjacent to another repeat sequence in any orientation, e.g. head to tail, tail to tail, or head to head. The invention also provides for any of the preceding sorghum MCs or recombinant chromosomes comprising a centromere having at least 5 repeated nucleotide sequences that are consecutive. The term "consecutive" refers to the same or similar repeated nucleotide sequences (e.g., at least 80% identical) that follow one after another without being interrupted by other significant sequence elements. Consecutive repeated nucleotide sequences may be in any orientation, e.g. head to tail, tail to tail, or head to head, and need not be directly adjacent to each other (e.g., may be 1-50 bp apart).
[0036] The invention further provides for any of the preceding sorghum MCs or recombinant chromosomes comprising a centromere having at least 5 of the consecutive repeated nucleotide sequences separated by less than n number of nucleotides, wherein n ranges from 1 to 10, or 1 to 20, or 1 to 30, or 1 to 40, or 1 to 50, or wherein n is less than 10 bp or n is less than 20 bp or n is less than 30 bp or n is less than 40 bp or n is less than 50 bp.
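The "consecutive" criterion of the two preceding paragraphs can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that repeat copies have already been located on a centromere sequence and are given as (start, end) coordinates in bp; the coordinate representation, the function names and the 100 bp repeat length in the example are hypothetical and are not taken from the sequence listing. Overlapping annotations are treated as breaking a run.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) of one repeat copy, in bp, half-open.
Repeat = Tuple[int, int]

def longest_consecutive_run(repeats: List[Repeat], max_gap: int) -> int:
    """Return the size of the longest run of repeat copies in which each
    neighbouring pair is separated by fewer than max_gap nucleotides."""
    if not repeats:
        return 0
    ordered = sorted(repeats)
    best = run = 1
    for (_, prev_end), (start, _) in zip(ordered, ordered[1:]):
        gap = start - prev_end
        if 0 <= gap < max_gap:
            run += 1          # still within the same run of consecutive repeats
        else:
            run = 1           # a larger gap (or an overlap) starts a new run
        best = max(best, run)
    return best

def has_consecutive_repeats(repeats: List[Repeat],
                            min_copies: int = 5,
                            max_gap: int = 50) -> bool:
    """True if at least min_copies repeats follow one another with gaps < max_gap bp."""
    return longest_consecutive_run(repeats, max_gap) >= min_copies

# Illustrative coordinates only: five copies with gaps of 0, 10, 20 and 0 bp,
# followed by a sixth copy 370 bp further on.
example = [(0, 100), (100, 200), (210, 310), (330, 430), (430, 530), (900, 1000)]
```

With these illustrative coordinates, `has_consecutive_repeats(example, 5, 50)` is true, while tightening the spacing limit to `max_gap=10` leaves no run longer than two copies.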
[0037] The invention also provides for any of the preceding sorghum MCs or recombinant chromosomes comprising a centromere having at least two arrays of consecutive repeated nucleotide sequences, wherein the array comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 or 2000 repeated nucleotide sequences. The repeats within an array may be in tandem in any orientation, e.g. head to tail, tail to tail, or head to head, or consecutive in any orientation, e.g. head to tail, tail to tail, or head to head. The arrays may be separated by less than n number of nucleotides, wherein n ranges from 1 to 10, or 1 to 20, or 1 to 30, or 1 to 40, or 1 to 50, or 1 to 60, or 1 to 70, or 1 to 80, or 1 to 90, or 1 to 100, or wherein n is less than 10 bp or n is less than 20 bp or n is less than 30 bp or n is less than 40 bp or n is less than 50 bp. The two arrays may comprise the same repeated nucleotide sequence or two different repeated nucleotide sequences (i.e. the first array can be comprised of repeat type 1 and the second array can be comprised of repeat type 2--here "type 1" and "type 2" are arbitrary designations).
[0038] In one embodiment, the sorghum MCs or recombinant chromosomes of the invention are 1000 kb or less in length, 900 kb or less in length, 800 kb or less in length or 700 kb or less in length. In exemplary embodiments, the sorghum MC is 600 kb or less in length, 500 kb or less in length, 250 kb or less in length, 100 kb or less in length, 50 kb or less in length, 10 kb or less in length, 5 kb or less in length, or 1 kb or less in length. For example, the sorghum MCs of the invention are 50 to 250 kb in length, 50 to 100 kb in length, 50 to 75 kb in length, 60 to 85 kb in length, 70 to 90 kb in length, 75 to 100 kb in length, 100 to 250 kb in length, 250 to 500 kb in length, or 500 to 1000 kb in length. In an exemplary embodiment, the sorghum MC is 28 kb in length, 42 kb in length, 82 kb in length, 87 kb in length, 88 kb in length, 97 kb in length, 130 kb in length, 150 kb in length, 200 kb in length or ranges from 28-200 kb in length. The MC of the invention preferably has a segregation efficiency during mitotic division of at least 60%, at least 80%, at least 90% or at least 95% and/or a transmission efficiency during meiotic division of, e.g., at least 60%, at least 80%, at least 85%, at least 90% or at least 95%.
[0039] The sorghum MC or recombinant chromosome of the invention preferably has a segregation efficiency during mitotic division of at least 60%, at least 80%, at least 90% or at least 95% and/or a transmission efficiency during meiotic division of, e.g., at least 60%, at least 80%, at least 85%, at least 90% or at least 95%.
[0040] In another embodiment, the sorghum MCs or recombinant chromosomes of the invention comprise a site for site-specific recombination.
[0041] The invention also provides for a sorghum MC, wherein the MC is derived from a donor clone or a centromere clone and has substitutions, deletions, insertions, duplications or arrangements of one or more nucleotides in the MC compared to the nucleotide sequence of the donor clone or centromere clone. In one embodiment, the sorghum MC is obtained by passage of the sorghum MC through one or more hosts. In another embodiment, the MC is obtained by passage of the MC through two or more different hosts. The host may be selected from the group consisting of viruses, bacteria and yeasts. In another embodiment, the sorghum MC is obtained from a donor clone by in vitro methods that introduce sequence variation during template-based replication of the donor clone, or its complementary sequence. In one embodiment this variation may be introduced by a DNA-dependent DNA polymerase. In a further embodiment a sorghum MC derived by an in vitro method may be further modified by passage of the MC through one or more hosts.
[0042] The invention also provides for a sorghum MC or recombinant chromosome, wherein the MC comprises at least one exogenous nucleic acid. In further exemplary embodiments, the sorghum MC or recombinant chromosome comprises at least two or more, at least three or more, at least four or more, at least five or more, at least ten or more, at least 20 or more, at least 30 or more, at least 40 or more, or at least 50 or more exogenous nucleic acids.
[0043] In one embodiment, at least one exogenous nucleic acid of any of the preceding sorghum MCs or recombinant chromosome is operably linked to a heterologous regulatory sequence functional in plant cells, including but not limited to a plant regulatory sequence. The invention also provides for exogenous nucleic acids linked to a non-plant regulatory sequence, such as an arthropod, viral, bacterial, vertebrate or yeast regulatory sequence. The invention also provides for exogenous nucleic acids linked to a regulatory sequence from sorghum.
[0044] The invention also provides for a MC or recombinant chromosome comprising a gene or group of genes that act to improve the total recoverable sugar from sorghum. Such genes may act to increase the sugar concentration of the stem juice, increase the amount of juice, increase the stem strength to improve yield, or increase total biomass of the plant. Such genes may be derived from bacterial sequences such as a sucrose isomerase or from animal, plant, fungal, or protist sequences. Such genes from plants may include genes involved in sugar metabolism or transport or genes of unknown function that have been shown to quantitatively increase total recoverable sugar. Such genes may also include genes that affect plant height, stem diameter, water metabolism or total biomass. Such genes may also include those that regulate the equilibrium between starch and sugar. Several genes have been shown to improve sugar accumulation. For example, expression of a bacterial sucrose isomerase can increase sugarcane sugar content by as much as two-fold (Birch, R. G., and Wu, L. (2007). Doubled sugar content in sugarcane plants modified to produce a sucrose isomer. Plant Biotechnology Journal 5: 109-117.). The lignin-deficient "brown midrib" mutations improve sorghum sugar content via their effects on lignin; this phenotype is caused by mutations in cinnamyl alcohol dehydrogenase (CAD), and 14 CAD-like genes are present in the sorghum genome (Saballos, A et al. Genetics 181:783-95, 2009).
[0045] In another embodiment, the sorghum MC or recombinant chromosome comprises an exogenous nucleic acid that comprises a QTL that confers a desirable trait. QTLs that affect total recoverable sugars have been mapped in sugarcane (Murray, S. C., et al. Crop Sci. 48:2165-2179, 2008).
[0046] In another embodiment, the sorghum MC or recombinant chromosome comprises an exogenous nucleic acid that confers herbicide resistance, insect resistance, disease resistance, or stress resistance on the sorghum plant. The invention provides for sorghum MCs or recombinant chromosomes comprising an exogenous nucleic acid that confers resistance to phosphinothricin or glyphosate herbicide. Nonlimiting examples include an exogenous nucleic acid that encodes a phosphinothricin acetyltransferase, glyphosate acetyltransferase, acetohydroxyacid synthase or a mutant enoylpyruvylshikimate phosphate (EPSP) synthase. Nonlimiting examples of exogenous nucleic acids that confer insect resistance include a Bacillus thuringiensis toxin gene or Bacillus cereus toxin gene. In related embodiments, the sorghum MC or recombinant chromosome comprises an exogenous nucleic acid conferring herbicide resistance, an exogenous nucleic acid conferring insect resistance, and at least one additional exogenous nucleic acid.
[0047] The invention further provides for sorghum MCs or recombinant chromosomes comprising additional copies of genes already found in the sorghum genome. The invention also provides for the additional copies of sorghum genes carried on the sorghum MC or recombinant chromosomes to be operably linked to either their native regulatory sequences or to heterologous regulatory sequences.
[0048] The invention further provides for sorghum MCs or recombinant chromosomes comprising an exogenous nucleic acid that confers resistance to drought, heat, chilling, freezing, excessive moisture, ultraviolet light, ionizing radiation, toxins, pollution, mechanical stress or salt stress. The invention also provides for a sorghum MC that comprises an exogenous nucleic acid that confers resistance to a virus, bacterium, fungus or nematode.
[0049] The invention provides for sorghum MCs or recombinant chromosomes comprising an exogenous nucleic acid selected from the group consisting of a nitrogen fixation gene, a plant stress-induced gene, a nutrient utilization gene, a gene that affects plant pigmentation, a gene that encodes an antisense or ribozyme molecule, a gene encoding a secretable antigen, a toxin gene, a receptor gene, a ligand gene, a seed storage gene, a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, a growth factor gene, a transcription factor gene, a transcriptional repressor gene, a DNA-binding protein gene, a recombination gene, a DNA replication gene, a programmed cell death gene, a kinase gene, a phosphatase gene, a G protein gene, a cyclin gene, a cell cycle control gene, a gene involved in transcription, a gene involved in translation, a gene involved in RNA processing, a gene involved in RNAi, an organellar gene, an intracellular trafficking gene, an integral membrane protein gene, a transporter gene, a membrane channel protein gene, a cell wall gene, a gene involved in protein processing, a gene involved in protein modification, a gene involved in protein degradation, a gene involved in metabolism, a gene involved in biosynthesis, a gene involved in assimilation of nitrogen or other elements or nutrients, a gene involved in controlling carbon flux, a gene involved in respiration, a gene involved in photosynthesis, a gene involved in light sensing, a gene involved in organogenesis, a gene involved in embryogenesis, a gene involved in differentiation, a gene involved in meiotic drive, a gene involved in self incompatibility, a gene involved in development, a gene involved in nutrient, metabolite or mineral transport, a gene involved in nutrient, metabolite or mineral storage, a calcium-binding protein gene, or a lipid-binding protein gene.
[0050] The invention also provides for a sorghum MC or recombinant chromosome comprising an exogenous enzyme gene selected from the group consisting of a gene that encodes an enzyme involved in metabolizing biochemical wastes for use in bioremediation, a gene that encodes an enzyme for modifying pathways that produce secondary plant metabolites, a gene that encodes an enzyme that produces a pharmaceutical, a gene that encodes an enzyme that improves or changes the nutritional content of a plant, a gene that encodes an enzyme involved in vitamin synthesis, a gene that encodes an enzyme involved in carbohydrate, polysaccharide or starch synthesis, a gene that encodes an enzyme involved in mineral accumulation or availability, a gene that encodes a phytase, a gene that encodes an enzyme involved in fatty acid, fat or oil synthesis, a gene that encodes an enzyme involved in synthesis of chemicals or plastics, a gene that encodes an enzyme involved in synthesis of a fuel, a gene that encodes an enzyme involved in synthesis of a fragrance, a gene that encodes an enzyme involved in synthesis of a flavor, a gene that encodes an enzyme involved in synthesis of a pigment or dye, a gene that encodes an enzyme involved in synthesis of a hydrocarbon, a gene that encodes an enzyme involved in synthesis of a structural or fibrous compound, a gene that encodes an enzyme involved in synthesis of a food additive, a gene that encodes an enzyme involved in synthesis of a chemical insecticide, a gene that encodes an enzyme involved in synthesis of an insect repellent, or a gene controlling carbon flux in a plant.
[0051] In another embodiment of the invention, any of the preceding sorghum MCs or recombinant chromosomes comprises a telomere.
[0052] The invention also provides embodiments wherein any of the preceding sorghum MCs or recombinant chromosomes is linear or circular.
[0053] In one embodiment, the invention provides for sorghum plants or plant cells comprising any of the preceding sorghum MCs or recombinant chromosomes. The invention also provides for sorghum plant tissue and sorghum seed obtained from the sorghum plants of the invention.
[0054] In another embodiment, the invention provides for sorghum plants comprising any of the preceding sorghum MCs or recombinant chromosomes, which may be referred to herein as "adchromosomal" sorghum plants. In addition, the invention provides for sorghum plant cells, tissues and seeds obtained from these modified plants.
[0055] In one embodiment, the invention provides for a sorghum plant cell comprising any of the preceding sorghum MCs or recombinant chromosomes that (i) is not integrated into the sorghum plant cell genome and (ii) confers an altered phenotype on the sorghum plant cell associated with at least one structural gene within the sorghum MC. The altered phenotype comprises increased expression of a native gene, decreased expression of a native gene, or expression of an exogenous gene. In a further embodiment, these sorghum plant cells also comprise one or more integrated exogenous structural gene(s).
[0056] Another embodiment of the invention is a part of any of the preceding sorghum plants. Exemplary sorghum plant parts of the invention include a pod, root, sett root, shoot root, root primordium, shoot, primary shoot, secondary shoot, tassel, panicle, arrow, midrib, blade, ligule, auricle, dewlap, blade joint, sheath, node, internode, bud furrow, leaf scar, cutting, tuber, stem, stalk, fruit, berry, nut, flower, leaf, bark, wood, epidermis, vascular tissue, organ, protoplast, crown, callus culture, petiole, petal, sepal, stamen, stigma, style, bud, meristem, cambium, cortex, pith, sheath, silk, ovule or embryo. Other exemplary sorghum plant parts are a meiocyte or gamete or ovule or pollen or endosperm of any of the preceding plants. Other exemplary plant parts are a seed, seed-piece, embryo, protoplast, cell culture, any group of plant cells organized into a structural and functional unit, ratoon, or propagule of any of the preceding sorghum plants.
[0057] An embodiment of the invention is a progeny of any of the preceding sorghum plants of the invention. These progeny of the invention may be the result of self-breeding, cross-breeding, apomixis or clonal propagation. In exemplary embodiments, the invention also provides for progeny that comprise a sorghum MC or recombinant chromosome that is descended from a parental sorghum MC or recombinant chromosome that contained a centromere less than about 1000 kilobases in length, less than about 750 kilobases in length, less than about 600 kilobases in length, less than about 500 kilobases in length, less than about 400 kilobases in length, less than about 300 kilobases in length, less than about 250 kilobases in length, less than about 200 kilobases in length, less than about 150 kilobases in length, less than about 100 kilobases in length, less than about 90 kilobases in length, less than about 85 kilobases in length, less than about 80 kilobases in length, less than about 75 kilobases in length, less than about 70 kilobases in length, less than about 65 kilobases in length, less than about 60 kilobases in length, less than about 55 kilobases in length, less than about 50 kilobases in length, less than about 45 kilobases in length, less than about 40 kilobases in length, less than about 35 kilobases in length, less than about 30 kb in length, less than about 25 kilobases in length, less than about 20 kb in length, less than about 15 kilobases in length, less than about 12 kilobases in length, less than about 10 kb in length, less than about 7 kb in length, less than about 5 kb in length, or less than about 2 kb in length.
[0058] In another aspect, the invention provides for methods of making a sorghum MC for use in any of the preceding sorghum plants of the invention. These methods comprise identifying a centromere nucleotide sequence in a sorghum genomic DNA library using a multiplicity of diverse probes, and constructing a sorghum MC comprising the centromere nucleotide sequence. These methods may further comprise determining hybridization scores for hybridization of the multiplicity of diverse probes to genomic clones within the sorghum genomic nucleic acid library, determining a classification for genomic clones within the sorghum genomic nucleic acid library according to the hybridization scores for at least two of the diverse probes, and selecting one or more genomic clones within one or more classifications for constructing the sorghum MC.
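The classification step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the scoring scheme of the invention: it assumes each genomic clone has one numeric hybridization score per probe, that a single threshold separates positive from negative hybridization, and that a clone's classification is simply the set of probes for which it scores positive. The probe labels, clone names, scores and threshold value are all hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List

def classify_clone(scores: Dict[str, float], threshold: float) -> FrozenSet[str]:
    """Classification of a clone = the set of probes whose score meets the threshold."""
    return frozenset(probe for probe, score in scores.items() if score >= threshold)

def select_clones(library: Dict[str, Dict[str, float]],
                  wanted_classes: Iterable[FrozenSet[str]],
                  threshold: float = 5.0) -> List[str]:
    """Return the names of clones whose classification is among wanted_classes."""
    wanted = set(wanted_classes)
    return sorted(name for name, scores in library.items()
                  if classify_clone(scores, threshold) in wanted)

# Hypothetical scores for three clones against two probes.
library = {
    "clone_A": {"satellite": 9.0, "CRS": 1.0},
    "clone_B": {"satellite": 8.0, "CRS": 7.0},
    "clone_C": {"satellite": 0.0, "CRS": 2.0},
}

# Keep clones that are satellite-positive, with or without CRS hybridization.
wanted = [frozenset({"satellite"}), frozenset({"satellite", "CRS"})]
selected = select_clones(library, wanted)   # ["clone_A", "clone_B"]
```

Using the set of positive probes as the class label keeps the sketch independent of the number of probes; a real screen might instead bin scores into several intensity levels per probe.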
[0059] The invention also contemplates methods of using any of the preceding sorghum plants to produce a recombinant protein, by growing a sorghum plant comprising a sorghum MC or recombinant chromosome that comprises an exogenous nucleic acid encoding the desired recombinant protein. Optionally the sorghum plant is harvested and the desired protein product is isolated from the plant. Exemplary protein products include industrial enzymes such as those useful for biofuel production.
[0060] The invention also contemplates methods of using any of the preceding sorghum plants to produce a chemical product, by growing a sorghum plant comprising a sorghum MC or recombinant chromosome that comprises an exogenous nucleic acid encoding an enzyme involved in the synthesis of the chemical product. Optionally the sorghum plant is harvested and the desired chemical product is isolated from the plant. Exemplary chemical products include sugars, lipids and carbohydrates useful in the production of biofuels.
[0061] Another aspect of the invention provides for methods of using any of the preceding sorghum plants comprising a sorghum MC or recombinant chromosome to produce a food product, a pharmaceutical product or a chemical product, according to which a suitable exogenous nucleic acid is expressed in sorghum plants or plant cells and the plant or plant cells are grown. The plant may secrete the product into its growth environment or the product may be contained within the plant, in which case the plant is harvested and desirable products are extracted.
[0062] Thus, the invention contemplates methods of using any of the preceding sorghum plants comprising a sorghum MC or recombinant chromosome to produce a modified food product, for example, by growing a plant that expresses an exogenous nucleic acid that alters the nutritional content of the plant, and harvesting or processing the sorghum plant.
[0063] The invention also provides for methods of constructing a synthetic array of repeated nucleotide sequence having sorghum centromere function comprising the steps of: (a) PCR amplifying a sorghum satellite sequence, (b) cloning the PCR amplified satellite sequence into a cloning vector, (c) sequencing the cloned satellite DNA, (d) using a restriction enzyme with an asymmetric recognition sequence to excise the cloned satellite sequence from the cloning vector, (e) ligating copies of the satellite sequence to one another to form a synthetic tandem array, and (f) ligating the synthetic array into a sorghum MC backbone vector. The invention also provides for an isolated sorghum MC comprising a synthetic array of repeated nucleotide sequence constructed according to the method of the invention, and sorghum plant cells and plants comprising these MCs.
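One way to see why an enzyme with an asymmetric recognition sequence is useful in steps (d) and (e) is sketched below in Python. The sketch rests on an assumption that is not stated in the method itself: that excision leaves each satellite monomer with a non-palindromic single-stranded overhang at its "head" end and the complementary overhang at its "tail" end. The overhang sequences and function names are hypothetical. Under that assumption, only head-to-tail junctions can base-pair, so ligation yields a uniformly oriented tandem array.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def ends_anneal(overhang_a: str, overhang_b: str) -> bool:
    """Two overhangs, each written 5'->3' on its own strand, can base-pair
    only if one is the reverse complement of the other."""
    return overhang_a == revcomp(overhang_b)

def allowed_junctions(head_overhang: str) -> dict:
    """Which monomer-to-monomer junctions can form, assuming (hypothetically)
    that the tail overhang is the reverse complement of the head overhang."""
    head = head_overhang
    tail = revcomp(head_overhang)
    return {
        "head_to_tail": ends_anneal(tail, head),
        "head_to_head": ends_anneal(head, head),
        "tail_to_tail": ends_anneal(tail, tail),
    }

# Hypothetical non-palindromic overhang: only head-to-tail junctions form.
asymmetric = allowed_junctions("ACTG")
# Hypothetical palindromic overhang: every orientation can ligate.
palindromic = allowed_junctions("AATT")
```

Here `asymmetric` permits only the head-to-tail junction, whereas `palindromic` permits all three, which is the mixed-orientation outcome that the asymmetric cut avoids.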
[0064] In another embodiment, the invention provides for methods of contacting a sorghum cell with a sorghum MC comprising the steps of (a) delivering the MC to immature differentiated leaves of the apical region of the stem of a sorghum plant, wherein the MC comprises a selectable marker gene, and (b) selecting the sorghum cells expressing the marker gene, wherein expression of the marker gene indicates transformation with the MC. The leaves used in this method are immature but are fully differentiated, such as the inner immature leaves of the sorghum stem. In an exemplary embodiment, the MC may be delivered by bombarding the immature leaves with micro-particles comprising the sorghum MC.
[0065] The invention also provides for methods of regenerating a sorghum plant transformed with a sorghum MC comprising the steps of (a) obtaining a callus comprising a sorghum cell that is transformed by any of the methods of the invention, and (b) growing the callus in media that may comprise 1%-3% polyvinylpyrrolidone to form a plantlet, wherein the cells of the plantlet are transformed with the sorghum MC. In a further embodiment, the methods of culturing the callus comprise growing the cells in liquid media for a time period and subsequently culturing the cells in a solid culture media. In an exemplary embodiment, the sorghum MC comprises a growth regulating gene such as a gene in the auxin biosynthesis or perception pathways. Such genes may include iaaM (Trp mono-oxygenase), iaaH (Indole-3-acetamide hydrolase), and ipt (AMP iso-pentenyl transferase). When these three genes are expressed on a MC, IaaM converts Trp into indole-3-acetamide, which IaaH converts into auxin. Ipt converts AMP into a cytokinin. The expression of all three genes allows a cultured cell to grow in the absence of exogenously supplied hormones.
Sequences of the Invention
[0066] The following list indicates the identity of the SEQ ID NOs in the sequence listing:
[0067] SEQ ID NOs:1-20--promoter sequences
[0068] SEQ ID NO:21--sorghum CRS sequence
[0069] SEQ ID NO:22--sorghum consensus satellite repeat sequence
[0070] SEQ ID NOs:23-177--sorghum satellite repeat sequences
[0071] SEQ ID NO:178--previously identified sorghum CRS sequence
[0072] SEQ ID NOs:179-180--forward and reverse primers for amplifying SEQ ID NO:21 of the invention
[0073] SEQ ID NOs:181-182--forward and reverse primers for making sorghum satellite repeat-specific probes for FISH analysis
[0074] SEQ ID NOs:183-275--sorghum centromere sequence contigs from BAC 42NM (identified as CRS-positive)
[0075] SEQ ID NOs:276-326--sorghum centromere sequence contigs from BAC 89F4 (identified as satellite-positive)
DETAILED DESCRIPTION OF THE INVENTION
[0076] While this invention is susceptible of embodiment in many different forms, there will be described herein in detail specific embodiments thereof, with the understanding that the present disclosure is to be considered as an exemplification of the principles of the invention and is not intended to limit the invention to the specific embodiments illustrated.
[0077] The invention provides novel, isolated, functional, stable, autonomous MCs and recombinant chromosomes comprising centromeres that comprise sorghum repeat sequences, including synthetic sequences. The invention also provides for "adchromosomal sorghum plants," described in further detail herein.
[0078] One aspect of the invention is related to plants containing functional, stable, autonomous MCs or recombinant chromosomes, preferably carrying one or more exogenous nucleic acids or carrying extra copies of a nucleic acid that already exists in the plant's genome. Such plants carrying MCs or recombinant chromosomes are contrasted to transgenic plants whose genome has been altered by integrating exogenous nucleic acid transgenes into the native plant chromosomes. Preferably, expression of the exogenous nucleic acid, whether constitutive, in response to a signal (which may be induced by challenge or a stimulus), tissue-specific, or time-specific, results in an altered phenotype of the plant.
[0079] The invention provides for MCs or recombinant chromosomes comprising at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 250, 500, 1000 or more exogenous nucleic acids.
[0080] The invention contemplates that sorghum plants may be used to carry the autonomous MCs as described herein. A related aspect of the invention is a plant part or plant tissue, including a pod, root, sett root, shoot root, root primordium, shoot, primary shoot, secondary shoot, tassel, panicle, arrow, midrib, blade, ligule, auricle, dewlap, blade joint, sheath, node, internode, bud furrow, leaf scar, cutting, tuber, stem, stalk, fruit, berry, nut, flower, leaf, bark, wood, epidermis, vascular tissue, organ, protoplast, crown, callus culture, petiole, petal, sepal, stamen, stigma, style, bud, meristem, cambium, cortex, pith, sheath, silk, ovule or embryo. Other exemplary plant parts are a meiocyte or gamete or ovule or pollen or endosperm of any of the preceding plants. Other exemplary plant parts are a seed, seed-piece, embryo, protoplast, cell culture, any group of plant cells organized into a structural and functional unit, ratoon or propagule of any of the preceding plants.
[0081] In one preferred embodiment, the exogenous nucleic acid is primarily expressed in a specific location or tissue of a plant, for example, stem, epidermis, vascular tissue, meristem, cambium, cortex, pith, leaf, sheath, flower, root or seed. Tissue-specific expression can be accomplished with, for example, localized presence of the MC or recombinant chromosome, selective maintenance of the MC or recombinant chromosomes, or with promoters that drive tissue-specific expression.
[0082] Another related aspect of the invention is meiocytes, pollen, ovules, endosperm, seed, somatic embryos, apomictic embryos, embryos derived from fertilization, vegetative propagules and progeny of the originally adchromosomal plant and of its filial generations that retain the functional, stable, autonomous MC or recombinant chromosome. Such progeny include clonally propagated plants, embryos and plant parts as well as filial progeny from self- and cross-breeding, and from apomixis.
[0083] Preferably the MC or recombinant chromosome is transmitted to subsequent generations of viable daughter cells during mitotic cell division with a transmission efficiency of at least 60%, 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99%.
[0084] During meiotic division, the MC or recombinant chromosome is preferably transmitted to viable gametes with a transmission efficiency of at least 60%, 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% when more than one copy of the MC or recombinant chromosome is present in the gamete mother cells of the plant. Preferably, the MC or recombinant chromosome is transmitted to viable gametes during meiotic cell division with a transmission frequency of at least 1%, 10%, 20%, 30%, 40%, 45%, 46%, 47%, 48%, or 49% when one copy of the MC or recombinant chromosome is present in the gamete mother cells of the plant. For production of seeds via sexual reproduction or by apomixis, the MC or recombinant chromosome is preferably transferred into at least 60%, 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of viable embryos when cells of the plant contain more than one copy of the MC or recombinant chromosome. For production of seeds via sexual reproduction or by apomixis from plants with one MC or recombinant chromosome per cell, the MC or recombinant chromosome is preferably transferred into at least 1%, 10%, 20%, 30%, 40%, 45%, 46%, 47%, 48%, or 49% of viable embryos.
[0085] Preferably, a MC or recombinant chromosome that comprises an exogenous selectable trait or exogenous selectable marker can be employed to increase the frequency in subsequent generations of adchromosomal cells, tissues, gametes, embryos, endosperm, seeds, plants or progeny that comprise the MC or recombinant chromosome. More preferably, the frequency of transmission of MCs or recombinant chromosomes into viable cells, tissues, gametes, embryos, endosperm, seeds, plants or progeny can be at least 95%, 96%, 97%, 98%, 99% or 99.5% after mitosis or meiosis by applying at least one selection that favors the survival of adchromosomal cells, tissues, gametes, embryos, endosperm, seeds, plants or progeny over such cells, tissues, gametes, embryos, endosperm, seeds, plants or progeny lacking the MC or recombinant chromosome.
[0086] Transmission efficiency may be measured as the percentage of progeny cells or plants that carry the MC or recombinant chromosome as measured by one of several assays taught herein including detection of reporter gene fluorescence, PCR detection of a sequence that is carried by the MC or recombinant chromosome, RT-PCR detection of a gene transcript for a gene carried on the MC or recombinant chromosome, Western analysis of a protein produced by a gene carried on the MC or recombinant chromosome, Southern analysis of the DNA (either in total or a portion thereof) carried by the MC or recombinant chromosome, fluorescence in situ hybridization (FISH) or in situ localization by repressor binding, to name a few. Any assay used to detect the presence of the MC (or a portion of the MC) or recombinant chromosome may be used to measure the efficiency with which a parental cell or plant transmits the MC or recombinant chromosome to its progeny. Efficient transmission as measured by some benchmark percentage should indicate the degree to which the MC or recombinant chromosome is stable through the mitotic and meiotic cycles.
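The percentage described above is a simple proportion, and a short Python sketch makes the arithmetic explicit. It assumes that each progeny cell or plant has already been scored as MC-positive or MC-negative by one of the listed assays; the function names and the example counts are hypothetical and do not represent measured data.

```python
from typing import Sequence

def transmission_efficiency(progeny_calls: Sequence[bool]) -> float:
    """Percentage of scored progeny that carry the MC or recombinant chromosome.

    progeny_calls holds one True/False assay result per progeny cell or plant.
    """
    if not progeny_calls:
        raise ValueError("at least one scored progeny is required")
    return 100.0 * sum(progeny_calls) / len(progeny_calls)

def meets_benchmark(progeny_calls: Sequence[bool], benchmark_pct: float) -> bool:
    """True if the observed efficiency is at least the benchmark percentage."""
    return transmission_efficiency(progeny_calls) >= benchmark_pct

# Hypothetical scoring: 17 of 20 progeny test positive, i.e. 85%.
calls = [True] * 17 + [False] * 3
efficiency = transmission_efficiency(calls)
```

With these illustrative counts the efficiency is 85%, which satisfies an "at least 80%" benchmark but not an "at least 90%" benchmark.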
[0087] Plants of the invention may also contain chromosomally integrated exogenous nucleic acid in addition to the autonomous MCs or recombinant chromosome. The modified plants or plant parts, including plant tissues of the invention, may include plants that have chromosomal integration of some portion of the MC (e.g. exogenous nucleic acid or centromere sequences) or recombinant chromosome in some or all cells of the plant. In one aspect of the invention, the autonomous MC or recombinant chromosome can be isolated from integrated exogenous nucleic acid by crossing the modified plant containing the integrated exogenous nucleic acid with plants producing some gametes lacking the integrated exogenous nucleic acid and subsequently isolating offspring of the cross, or subsequent crosses, that are modified but lack the integrated exogenous nucleic acid. This independent segregation of the MC or recombinant chromosome is one measure of the autonomous nature of the MC.
[0088] Another aspect of the invention relates to methods for producing and isolating such modified plants containing functional, stable, autonomous MCs.
[0089] In one embodiment, the invention contemplates improved methods for isolating native centromere sequences. In another embodiment, the invention contemplates methods for generating variants of native or artificial centromere sequences by passage through other host cells such as bacterial or fungal hosts.
[0090] In a further embodiment, the invention contemplates methods for delivering the MC into plant cells or tissues to transform the cells or tissues, optionally detecting MC presence or assessing MC performance, and optionally generating a plant from such cells or tissues.
[0091] Exemplary assays for assessing MC performance include lineage-based inheritance assays, use of chromosome loss agents to demonstrate autonomy, exonuclease digestion, global mitotic MC inheritance assays (sectoring assays) with or without the use of agents inducing chromosomal loss, assays measuring expression levels of genes (including marker genes) carried by the MC over time and space in a plant, physical assays for separation of autonomous MCs or recombinant chromosomes from endogenous nuclear chromosomes of plants, molecular assays demonstrating conserved MC structure, such as PCR, Southern blots, MC rescue, cloning and characterization of MC sequences present in the plant, cytological assays detecting MC presence in the cell's genome (e.g. FISH) and meiotic MC inheritance assays, which measure the levels of MC inheritance into a subsequent generation of plants via meiosis and gametes, embryos, endosperm or seeds.
[0092] Another aspect of the invention relates to methods for using such plants containing a MC or recombinant chromosome for producing food products, pharmaceutical products, biofuels and chemical products by appropriate expression of exogenous nucleic acid(s) contained within the MC(s) or recombinant chromosome(s).
[0093] Yet another aspect of the invention provides novel autonomous MCs with novel compositions and structures which are used to transform plant cells which are in turn used to generate a plant (or multiple plants). Exemplary MCs of the invention are contemplated to be of a size 2000 kb or less in length. Other exemplary sizes of MCs include less than or equal to, e.g., 1500 kb, 1000 kb, 900 kb, 800 kb, 700 kb, 600 kb, 500 kb, 450 kb, 400 kb, 350 kb, 300 kb, 250 kb, 200 kb, 150 kb, 100 kb, 80 kb, 60 kb, 40 kb, 35 kb in length. In an exemplary embodiment, the MC is about 28 kb in length, 42 kb in length, 82 kb in length, 87 kb in length, 88 kb in length, 97 kb in length, 130 kb in length, 150 kb in length, 200 kb in length or ranges from 28 kb to 200 kb in length.
[0094] In a related aspect, novel centromere compositions as characterized by sequence content, size or other parameters are provided. Preferably, the minimal size of centromeric sequence is utilized in MC construction. Exemplary sizes include a centromeric nucleic acid segment derived from a portion of plant genomic DNA or synthesized based on a plant satellite repeat sequence, that is less than or equal to 1000 kb, 900 kb, 800 kb, 700 kb, 600 kb, 500 kb, 400 kb, 300 kb, 200 kb, 190 kb, 150 kb, 100 kb, 95 kb, 90 kb, 85 kb, 80 kb, 75 kb, 70 kb, 65 kb, 60 kb, 55 kb, 50 kb, 45 kb, 40 kb, 35 kb, 30 kb, 28 kb, 25 kb, 20 kb, 17 kb, 15 kb, 12 kb, 10 kb, 7 kb, 6.4 kb, 5 kb, or 2 kb in length. Exemplary inserts may range in size from 80 kb to 100 kb, 7 kb to 190 kb, 7 kb to 12 kb, 5 kb to 10 kb, 3 kb to 10 kb, 3 kb to 7 kb, 5 kb to 7 kb, 10 to 30 kb, 15 to 30 kb, and 15 to 28 kb. Another related aspect is the novel structure of the MC, particularly structures lacking bacterial sequences, e.g., those required for bacterial propagation, referred to as backbone-free MCs.
[0095] In other exemplary embodiments, the invention contemplates MCs or other vectors comprising centromeric nucleotide sequence that when hybridized to 1, 2, 3, 4, 5, 6, 7, 8 or more of the probes described in the examples herein, under hybridization conditions described herein, e.g. low, medium or high stringency, provides relative hybridization scores. Exemplary stringent hybridization conditions comprise hybridization at 65° C. and washing three times for 15 minutes with 0.25×SSC, 0.1% SDS at 65° C. Additional exemplary stringent hybridization conditions comprise hybridization in 0.02 M to 0.15 M NaCl at temperatures of about 50° C. to 70° C. or 0.5×SSC 0.25% SDS at 65° for 15 minutes, followed by a wash at 65° C. for a half hour or hybridization at 65° C. for 14 hours followed by 3 washings with 0.5×SSC, 1% SDS at 65° C. Probe hybridization can be scored visually to determine a binary (positive versus negative) value, or more preferably the probes can be assigned a score based on the relative strength of their hybridization on a 10 point scale. For example, relative hybridization scores of 5 may be used to select clones that hybridize well to the probe. Alternatively, a hybridization signal greater than background for one or more of these probes can be used to select clones. Modified or adchromosomal plants or plant parts containing such MCs are contemplated.
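By way of non-limiting illustration, selection of clones from relative hybridization scores may be sketched as follows. The data layout, the clone and probe names, and the rule that a score at or above the cutoff for any one probe selects the clone are assumptions made for this sketch; only the 10 point scale and the exemplary score of 5 come from the paragraph above.

```python
def select_clones(scores, threshold=5):
    """Return sorted names of clones with at least one probe score >= threshold.

    scores: dict mapping clone name -> dict mapping probe name -> score
    on the 10 point relative hybridization scale (hypothetical layout).
    """
    selected = []
    for clone, probe_scores in scores.items():
        if any(score >= threshold for score in probe_scores.values()):
            selected.append(clone)
    return sorted(selected)


# Hypothetical scores, for illustration only.
example_scores = {
    "cloneA": {"probe1": 7, "probe2": 2},
    "cloneB": {"probe1": 1, "probe2": 3},
}
print(select_clones(example_scores))
```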
[0096] The advantages of the present invention include: provision of an autonomous, independent genetic linkage group for accelerating breeding; lack of disruption of host genome; multiple gene "stacking" of large and potentially unlimited numbers of genes; uniform genetic composition of exogenous DNA sequences in plant cells and plants containing autonomous MCs; defined genetic context for predictable gene expression; higher frequency of occurrence and recovery of plant cells and plants containing stably maintained exogenous DNA due to elimination of the inefficient integration step. In addition, MCs that increase total recoverable sugars, or enhance the utility of modified plants for use in biofuel production are specifically envisioned.
I. Composition of MCs and MC Construction
[0097] The MC vector of the present invention may contain a variety of elements, including (1) sequences that function as plant centromeres, (2) one or more exogenous nucleic acids, including, for example, plant-expressed genes, or genes for non-coding RNAs, (3) sequences that function as an origin of replication, which may be included in the region that functions as plant centromere, (4) optionally, a bacterial plasmid backbone for propagation of the plasmid in bacteria, (5) optionally, sequences that function as plant telomeres, (6) optionally, additional "stuffer DNA" sequences that serve to physically separate the various components on the MC from each other, (7) optionally "buffer" sequences such as MARs or SARs, (8) optionally marker sequences of any origin, including but not limited to plant and bacterial origin, (9) optionally, sequences that serve as recombination sites, and (10) "chromatin packaging sequences" such as cohesin and condensin binding sites.
[0098] The MCs of the present invention may be constructed to include various components which are novel, which include, but are not limited to, the centromere comprising novel repeating centromeric sequences, as described in further detail below.
[0099] Novel Centromere Compositions
[0100] The centromere in the MC of the present invention may comprise novel repeating centromeric sequences.
[0101] Vectors comprising one, two, three, four, five, six, seven, eight, nine, ten, 15 or 20 or more of the elements contained in any of the exemplary vectors described in the examples below are also contemplated.
[0102] The invention specifically contemplates the alternative use of fragments or variants (mutants) of any of the nucleic acids described herein that retain the desired activity, including nucleic acids that function as centromeres, nucleic acids that function as promoters or other regulatory control sequences, or exogenous nucleic acids. Variants may have one or more additions, substitutions or deletions of nucleotides within the original nucleotide sequence or consensus sequence. Variants include nucleic acid sequences that are at least 50%, 55%, 60, 65, 70, 75, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100% identical to the original nucleic acid sequence. Variants also include nucleic acid sequences that hybridize under low, medium, high or very high stringency conditions to the original nucleic acid sequence. Similarly, the specification also contemplates the alternative use of fragments or variants of any of the polypeptides described herein.
[0103] The comparison of sequences and determination of percent identity between two nucleotide sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (1970) J. Mol. Biol. 48:444-453 algorithm which has been incorporated into the GAP program in the GCG software package (available at www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix. Preferably parameters are set so as to maximize the percent identity.
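By way of non-limiting illustration, percent identity from a global alignment may be sketched as follows. This is a simplified stand-in for the GAP program and is not that program: the match, mismatch and gap scores, the linear gap penalty, and the choice of alignment length as the denominator are all assumptions made for this sketch, and other conventions yield different percentages.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch style global alignment with a linear gap penalty.

    The scoring values are illustrative assumptions. Returns the two
    aligned strings, with '-' marking gap positions.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of alignment columns holding identical, non-gap residues."""
    aligned_a, aligned_b = global_align(a.upper(), b.upper())
    if not aligned_a:
        raise ValueError("sequences must not both be empty")
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


print(percent_identity("ACGT", "ACGA"))
```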
[0104] As used herein, the term "hybridizes under low stringency, medium stringency, and high stringency conditions" describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology (1989) John Wiley & Sons, N.Y., 6.3.1-6.3.6, which is incorporated by reference. Aqueous and non-aqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.5×SSC, 0.1% SDS, at least at 50° C.; 2) medium stringency hybridization conditions in 6×SSC at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 55° C.; 3) high stringency hybridization conditions are hybridization at 65° C. for 12-18 hours and washing three times for 15-90 minutes with 0.25×SSC, 0.1% SDS at 65° C. Additional exemplary stringent hybridization conditions comprise 6×SSC at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 65° C. Other exemplary highly selective or stringent hybridization conditions comprise 0.02 M to 0.15 M NaCl at temperatures of about 50° C. to 70° C. or 0.5×SSC 0.25% SDS at 65° for 12-15 hours, followed by three washes at 65° C. for 15-90 minutes each.
[0105] MC Sequence Content and Structure
[0106] Plant-expressed genes from non-plant sources may be modified to accommodate plant codon usage, to insert preferred motifs near the translation initiation ATG codon, to remove sequences recognized in plants as 5' or 3' splice sites, or to better reflect plant GC/AT content. Plant genes typically have a GC content of more than 35%, and coding sequences which are rich in A and T nucleotides can be problematic. For example, ATTTA motifs may destabilize mRNA; plant polyadenylation signals such as AATAAA at inappropriate positions within the message may cause premature truncation of transcription; and monocotyledons, such as sorghum, may recognize AT-rich sequences as splice sites.
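By way of non-limiting illustration, a coding sequence may be screened for the features named above (GC content, ATTTA motifs and AATAAA signals) as sketched below. The function names and the example sequence are hypothetical, and the 35% value is used only as the flagging cutoff suggested by the paragraph above; the sketch does not attempt splice site prediction.

```python
def gc_content(seq):
    """Return the GC content of seq as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq, motif):
    """Return 0-based start positions of all occurrences, overlaps included."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def screen_coding_sequence(seq, min_gc=35.0):
    """Summarize features that may be problematic for plant expression."""
    gc = gc_content(seq)
    return {
        "gc_percent": gc,
        "gc_not_above_cutoff": gc <= min_gc,       # flag: GC not more than min_gc
        "ATTTA": find_motif(seq, "ATTTA"),         # possible mRNA destabilizing motif
        "AATAAA": find_motif(seq, "AATAAA"),       # possible polyadenylation signal
    }


# Hypothetical sequence, for illustration only.
print(screen_coding_sequence("ATGGCATTTAGCAATAAAGC"))
```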
[0107] Each exogenous nucleic acid or plant-expressed gene may include a promoter, a coding region and a terminator sequence, which may be separated from each other by restriction endonuclease sites or recombination sites or both. Genes may also include introns, which may be present in any number and at any position within the transcribed portion of the gene, including the 5' untranslated sequence, the coding region and the 3' untranslated sequence. Introns may be natural plant introns derived from any plant, or artificial introns based on the splice site consensus that has been defined for plant species. Some intron sequences have been shown to enhance expression in plants. Optionally the exogenous nucleic acid may include a plant transcriptional terminator, non-translated leader sequences derived from viruses that enhance expression, a minimal promoter, or a signal sequence controlling the targeting of gene products to plant compartments or organelles.
[0108] The coding regions of the genes can encode any protein, including but not limited to visible marker genes (for example, fluorescent protein genes, other genes conferring a visible phenotype to the plant) or other screenable or selectable marker genes (for example, conferring resistance to antibiotics, herbicides or other toxic compounds or encoding a protein that confers a growth advantage to the cell expressing the protein) or genes which confer some commercial or agronomic value to the modified or adchromosomal plant. Multiple genes can be placed on the same MC vector. The genes may be separated from each other by restriction endonuclease sites, homing endonuclease sites, recombination sites or any combinations thereof. Alternatively, the cloning process can be executed in a manner that destroys the intervening restriction sites. Any number of genes can be present.
[0109] The MC vector may also contain a bacterial plasmid backbone for propagation of the plasmid in bacteria such as E. coli, A. tumefaciens, or A. rhizogenes. The plasmid backbone may be that of a low-copy vector or in other embodiments it may be desirable to use a mid to high level copy backbone. In one embodiment of the invention, this backbone contains the replicon of the F' plasmid of E. coli. However, other plasmid replicons, such as the bacteriophage P1 replicon, or other low-copy plasmid systems such as the RK2 replication origin, may also be used. The backbone may include one or several antibiotic-resistance genes conferring resistance to a specific antibiotic to the bacterial cell in which the plasmid is present. Bacterial antibiotic-resistance genes include but are not limited to kanamycin-, ampicillin-, chloramphenicol-, streptomycin-, spectinomycin-, tetracycline- and gentamycin-resistance genes.
[0110] The MC vector may also contain plant telomeres. An exemplary telomere sequence is TTTAGGG or its complement. Telomeres are specialized DNA structures at the ends of linear chromosomes that function to stabilize the ends and facilitate the complete replication of the extreme termini of the DNA molecule (Richards et al., Cell, 53:127-36, 1988; Ausubel et al., Current Protocols in Molecular Biology, Wiley & Sons, 1997).
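By way of non-limiting illustration, a telomere array built from the exemplary TTTAGGG unit, and the corresponding complementary strand written 5' to 3', may be generated as sketched below. The function names and the repeat count are hypothetical; the number of repeats appropriate for a given MC is not specified by this sketch.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand of seq, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def telomere_array(n_repeats, unit="TTTAGGG"):
    """Return n_repeats tandem copies of the telomere repeat unit."""
    if n_repeats < 1:
        raise ValueError("n_repeats must be at least 1")
    return unit * n_repeats


array = telomere_array(3)  # hypothetical repeat count
print(array)
print(reverse_complement(array))
```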
[0111] Additionally, the MC vector may contain "stuffer DNA" sequences that serve to separate the various components on the MC (centromere, genes, telomeres) from each other. The stuffer DNA may be of any origin, prokaryotic or eukaryotic, and from any genome or species, plant, animal, microbe or organelle, or may be of synthetic origin. The stuffer DNA can range from 100 bp to 10 Mb in length and can be repetitive in sequence, with unit repeats from 10 to 1,000,000 bp. Examples of repetitive sequences that can be used as stuffer DNAs include but are not limited to: rDNA, satellite repeats, retroelements, transposons, pseudogenes, transcribed genes, microsatellites, tDNA genes, short sequence repeats and combinations thereof. Alternatively, the stuffer DNA can consist of unique, non-repetitive DNA of any origin or sequence. The stuffer sequences may also include DNA with the ability to form boundary domains, such as but not limited to scaffold attachment regions (SARs) or matrix attachment regions (MARs). The stuffer DNA may be entirely synthetic, composed of random sequence. In this case, the stuffer DNA may have any base composition, or any A/T or G/C content. For example, the G/C content of the stuffer DNA could resemble that of the plant (~30-40%), or could be much lower (0-30%) or much higher (40-100%). Alternatively, the stuffer sequences could be synthesized to contain an excess of any given nucleotide such as A, C, G or T. Different synthetic stuffers of different compositions may also be combined with each other. For example a fragment with low G/C content may be flanked or abutted by a fragment of medium or high G/C content, or vice versa.
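By way of non-limiting illustration, a synthetic random stuffer sequence with a chosen G/C content may be generated as sketched below. The function name, the seed parameter and the example length and G/C fraction are hypothetical; the sketch fixes the G/C count exactly and shuffles base order, which is only one of many ways such a sequence could be designed.

```python
import random


def random_stuffer(length, gc_fraction, seed=None):
    """Return a random DNA string of the given length and G/C fraction.

    The number of G or C bases is round(length * gc_fraction); their
    positions are shuffled. seed makes the result reproducible.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0.0 <= gc_fraction <= 1.0:
        raise ValueError("gc_fraction must be between 0 and 1")
    rng = random.Random(seed)
    n_gc = round(length * gc_fraction)
    bases = [rng.choice("GC") for _ in range(n_gc)]
    bases += [rng.choice("AT") for _ in range(length - n_gc)]
    rng.shuffle(bases)
    return "".join(bases)


# Hypothetical parameters: 1,000 bp at 35% G/C, within the plant-like range.
stuffer = random_stuffer(1000, 0.35, seed=1)
print(len(stuffer), stuffer.count("G") + stuffer.count("C"))
```

Fragments of differing G/C content, as contemplated above, could be obtained by concatenating the results of separate calls.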
[0112] In one embodiment of the invention, the MC has a circular structure without telomeres. In another embodiment, the MC has a circular structure with telomeres. In a third embodiment, the MC has a linear structure with telomeres, as would result if a "linear" structure were to be cut with a unique endonuclease, exposing the telomeres at the ends of a DNA molecule that contains all of the sequence contained in the original, closed construct with the exception of an antibiotic-resistance gene. In a fourth embodiment of the invention, the telomeres could be placed in such a manner that the bacterial replicon, backbone sequences, antibiotic-resistance genes and any other sequences of bacterial origin and present for the purposes of propagation of the MC in bacteria, can be removed from the plant-expressed genes, the centromere, telomeres, and other sequences by cutting the structure with, for example, a unique endonuclease. This results in a MC from which much of, or preferably all, bacterial sequences have been removed. In this embodiment, bacterial sequence present between or among the plant-expressed genes or other MC sequences would be excised prior to removal of the remaining bacterial sequences by cutting the MC with an endonuclease and re-ligating the structure such that the antibiotic-resistance gene has been lost. The unique endonuclease site may be the recognition sequence of any of a number of endonucleases including but not limited to restriction endonucleases, meganucleases, or homing endonucleases. Alternatively, the endonucleases and their sites can be replaced with any specific DNA cutting mechanism and its specific recognition site such as a rare-cutting endonuclease or recombinase and its specific recognition site, as long as that site is present in the MCs only at the indicated positions.
[0113] Various structural configurations are possible by which MC elements can be oriented with respect to each other. A centromere can be placed on a MC either between genes or outside a cluster of genes next to one telomere or next to the other telomere. Stuffer DNAs can be combined with these configurations to place the stuffer sequences inside the telomeres, around the centromere between genes or any combination thereof. Thus, a large number of alternative MC structures are possible, depending on the relative placement of centromere DNA, genes, stuffer DNAs, bacterial sequences, telomeres, and other sequences. The sequence content of each of these variants is the same, but their structure may be different depending on how the sequences are placed. These variations in architecture are possible both for linear and for circular MCs.
Exemplary Centromere Components
[0114] Centromere components may be isolated or derived from native plant genome, for example, modified through recombinant techniques or through the cell-based techniques described below. Alternatively, wholly artificial centromere components may be constructed using as a general guide the sequence of native centromeres such as native sorghum satellite repeat sequences. Combinations of centromere components derived from natural sources and/or combinations of naturally derived and artificial components are also contemplated. As noted above, centromere sequences from one taxonomic plant species may be functional in another taxonomic plant species, genus and family.
[0115] In one embodiment, the centromere contains n copies of a repeated nucleotide sequence obtained by the methods disclosed herein; wherein n is at least 2. In another embodiment, the centromere contains n copies of interdigitated repeats. An interdigitated repeat is a DNA sequence that consists of two distinct repetitive elements that combine to create a unique permutation. Potentially any number of repeat copies capable of physically being placed on the recombinant construct could be included on the construct, including about 5, 10, 15, 20, 30, 50, 75, 100, 150, 200, 300, 400, 500, 750, 1,000, 1,500, 2,000, 3,000, 5,000, 7,500, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000 and about 100,000, including all ranges in-between such copy numbers. Moreover, the copies, while largely identical, can vary from each other. Such repeat variation is commonly observed in naturally occurring centromeres. The length of the repeat may vary, but will preferably range from about 20 bp to about 360 bp, from about 20 bp to about 250 bp, from about 50 bp to about 225 bp, from 20 bp to 137 bp, from about 75 bp to about 210 bp, such as a 92 bp repeat, a 97 bp repeat and a 100 bp repeat, from about 100 bp to about 205 bp, from about 125 bp to about 200 bp, from about 150 bp to about 195 bp, from about 160 bp to about 190 bp and from about 170 bp to about 185 bp including about 180 bp. Larger repeats including those up to 3,465 bp or 3,500 bp or 3,600 bp or 3,700 bp are also anticipated by the current invention.
[0116] The invention contemplates that two or more of these repeated nucleotide sequences, or similar repeated nucleotide sequences, may be oriented head to tail within the centromere. The term "head to tail" refers to multiple consecutive copies of the same or similar repeated nucleotide sequence (e.g., at least 70% identical) that are in the same 5'-3' orientation. The invention also contemplates that two or more of these repeated nucleotide sequences may be consecutive within the centromere. The term "consecutive" refers to the same or similar repeated nucleotide sequences (e.g., at least 70% identical) that follow one after another without being interrupted by other significant sequence elements. Such consecutive repeated nucleotide sequences may be in any orientation, e.g. head to tail, tail to tail, or head to head, and may be separated by n number of nucleotides, wherein n ranges from 1 to 10, or 1 to 20, or 1 to 30, or 1 to 40, or 1 to 50. Exemplary repeated nucleotide sequences derived from sorghum are set out in SEQ ID NOs:23-176, the consensus sorghum satellite sequence (SEQ ID NO:22), and the sorghum retrotransposon sequence (SEQ ID NO:21) or a fragment thereof.
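By way of non-limiting illustration, an array of consecutive repeat copies in chosen orientations, optionally separated by short spacers, may be assembled in silico as sketched below. The function names, the "+" and "-" orientation codes and the placeholder repeat unit are hypothetical; the unit shown is not any of SEQ ID NOs:21-176. The limits of at least 2 copies and at most 50 spacer nucleotides follow the ranges recited above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand of seq, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_repeat_array(unit, n_copies, orientations=None, spacer=""):
    """Concatenate n_copies of unit, each in the given orientation.

    orientations: sequence of '+' (as given) or '-' (reverse complement),
    one per copy; the default is all '+', i.e. head to tail.
    spacer: nucleotides placed between adjacent copies (0 to 50 nt).
    """
    if n_copies < 2:
        raise ValueError("n_copies must be at least 2")
    if len(spacer) > 50:
        raise ValueError("spacer must be at most 50 nt")
    if orientations is None:
        orientations = "+" * n_copies
    if len(orientations) != n_copies or set(orientations) - {"+", "-"}:
        raise ValueError("orientations must hold one '+' or '-' per copy")
    copies = [unit.upper() if o == "+" else reverse_complement(unit)
              for o in orientations]
    return spacer.upper().join(copies)


# Placeholder unit, for illustration only.
print(build_repeat_array("AAC", 3))                             # head to tail
print(build_repeat_array("AAC", 2, orientations="+-", spacer="T"))
```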
[0117] Modification of Centromeres Isolated from Native Plant Genome
[0118] Modification and changes may be made in the centromeric DNA segments of the current invention and still obtain a functional molecule with desirable characteristics. The following is a discussion based upon changing the nucleic acids of a centromere to create an equivalent, or even an improved, second generation molecule.
[0119] In particular embodiments of the invention, mutated centromeric sequences are contemplated to be useful for increasing the utility of the centromere. It is specifically contemplated that the function of the centromeres of the current invention may be based in part or in whole upon the secondary structure of the DNA sequences of the centromere, modification of the DNA with methyl groups or other adducts, and/or the proteins which interact with the centromere. By changing the DNA sequence of the centromere, one may alter the affinity of one or more centromere-associated protein(s) for the centromere and/or the secondary structure or modification of the centromeric sequences, thereby changing the activity of the centromere. Alternatively, changes may be made in the centromeres of the invention which do not affect the activity of the centromere. Changes in the centromeric sequences which reduce the size of the DNA segment needed to confer centromere activity are contemplated to be particularly useful in the current invention, as would changes which increased the fidelity with which the centromere was transmitted during mitosis or meiosis.
[0120] Modification of Centromeres by Passage through Bacteria, Plant or other Hosts or Processes
[0121] In the methods of the present invention, the resulting MC DNA sequence may also be a derivative of the parental clone or centromere clone having substitutions, deletions, insertions, duplications and/or rearrangements of one or more nucleotides in the nucleic acid sequence. Such nucleotide mutations may occur individually or consecutively in stretches of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000, 2000, 4000, 8000, 10000, 50000, 100000, and about 200000, including all ranges in-between.
[0122] Variations of MCs may arise through passage of MCs through various hosts including virus, bacteria, yeast, plant or other prokaryotic or eukaryotic organism and may occur through passage through multiple hosts or an individual host. Variations may also occur by replicating the MC in vitro.
[0123] Derivatives may be identified through sequence analysis, or variations in MC molecular weight through electrophoresis such as, but not limited to, CHEF gel analysis, column or gradient separation, or any other methods used in the field to determine and/or analyze DNA molecular weight or sequence content. Alternately, derivatives may be identified by the altered activity of a derivative in conferring centromere function to a MC.
[0124] Production or Synthesis of Synthetic Centromere Repeat Sequences
[0125] These artificially synthesized repeated nucleotide sequences of the invention may be derived from natural centromere sequences, combinations or fragments of natural centromere sequences including a combination of repeats of different lengths, a combination of different sequences, a combination of both different repeat lengths and different sequences, a combination of different artificially synthesized sequences or a combination of natural centromere sequence(s) and artificially synthesized sequence(s). The synthetic nucleotide sequences and arrays of these synthetic repeat sequences may be generated using any technique known in the art including PCR from genomic DNA, e.g. the methods described in Example 1, or by custom polynucleotide synthesis.
[0126] Polynucleotide synthesis is the non-biological, chemical synthesis of defined sequences of nucleic acids using automated synthesizers. Oligonucleotides may be chemically synthesized, purified and then these oligonucleotides are connected by specific annealing and standard ligation or polymerase reactions. Exemplary ligation methods include ligation of phosphorylated overlapping oligonucleotides (Gupta et al. PNAS USA, 60, 1338-1344, 1993, Fuhrmann et al. Plant J. 19:353-61, 1999), the FokI method (Mandecki et al. Gene, 68, 101-107) and a modified form of ligase chain reaction for gene synthesis. In addition, PCR assembly approaches may be used which generally employ oligonucleotides 40-50 nt long that overlap each other. These oligonucleotides are designed to cover most of the sequence of both strands, and the full-length molecule is generated progressively by overlap extension PCR (Stemmer et al. Gene, 164, 49-53), thermodynamically balanced inside-out PCR (Gao et al. Nucleic Acids Res. 15;31(22):e143, 2003) or combined approaches (Young et al. Nucleic Acids Res. 15;32(7):e59, 2004).
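By way of non-limiting illustration, a target sequence may be tiled into overlapping oligonucleotides on alternating strands for a PCR assembly approach as sketched below. The function names, the 45 nt oligonucleotide length and the 20 nt overlap are assumptions made for this sketch; the sketch does not balance melting temperatures and is not the method of any of the references cited above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand of seq, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_overlapping_oligos(seq, oligo_len=45, overlap=20):
    """Tile seq with oligos that overlap their neighbors by `overlap` nt.

    Returns a list of (start, strand, oligo) tuples. start is the 0-based
    position on the top strand; strand alternates '+' and '-'; '-' oligos
    are reverse complements, so every oligo is written 5' to 3'.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    seq = seq.upper()
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        piece = seq[start:start + oligo_len]
        strand = "+" if len(oligos) % 2 == 0 else "-"
        oligos.append((start, strand,
                       piece if strand == "+" else reverse_complement(piece)))
        if start + oligo_len >= len(seq):
            break  # this oligo reaches the end of the target
        start += step
    return oligos


# Placeholder 100 nt target, for illustration only.
for start, strand, oligo in design_overlapping_oligos("ACGT" * 25):
    print(start, strand, len(oligo))
```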
[0127] Exemplary Exogenous Nucleic Acids Including Plant-Expressed Genes
[0128] Of particular interest in the present invention are exogenous nucleic acids which when introduced into plants will alter the phenotype of the plant, a plant organ, plant tissue, or a portion of the plant. Exemplary exogenous nucleic acids encode polypeptides. Other exemplary exogenous nucleic acids alter expression of exogenous or endogenous genes, either increasing or decreasing expression, optionally in response to a specific signal or stimulus.
[0129] As used herein, the term "trait" can refer either to the altered phenotype of interest or the nucleic acid which causes the altered phenotype of interest.
[0130] One of the major purposes of transformation of crop plants is to add some commercially desirable, agronomically important traits to the plant. Such traits include, but are not limited to, enhanced production of total recoverable sugars; utility for production of biofuels; herbicide resistance or tolerance; insect (pest) resistance or tolerance; disease resistance or tolerance (viral, bacterial, fungal, nematode or other pathogens); stress tolerance and/or resistance, as exemplified by resistance or tolerance to drought, heat, chilling, freezing, excessive moisture, salt stress, mechanical stress, extreme acidity, alkalinity, toxins, UV light, ionizing radiation or oxidative stress; increased yields, increased biomass, whether in quantity or quality; enhanced or altered nutrient acquisition and enhanced or altered metabolic efficiency; enhanced or altered nutritional content and makeup of plant tissues used for food, feed, fiber or processing; physical appearance; male sterility; drydown; standability; prolificacy; starch quantity and quality; oil quantity and quality; protein quality and quantity; amino acid composition; modified chemical production; altered pharmaceutical or nutraceutical properties; altered bioremediation properties; increased biomass; altered growth rate; altered fitness; altered biodegradability; altered CO2 fixation; presence of bioindicator activity; altered digestibility by humans or animals; altered allergenicity; altered mating characteristics; altered pollen dispersal; improved environmental impact; altered nitrogen fixation capability; the production of a pharmaceutically active protein; the production of a small molecule with medicinal properties; the production of a chemical including those with industrial utility; the production of nutraceuticals, food additives, carbohydrates, RNAs, lipids, fuels, dyes, pigments, vitamins, scents, flavors, vaccines, antibodies, hormones, and the like; and alterations in plant architecture or development, including changes in developmental timing, photosynthesis, signal transduction, cell growth, reproduction, or differentiation. Additionally one could create a library of an entire genome from any organism or organelle including mammals, plants, microbes, fungi, or bacteria, represented on MCs.
[0131] In one embodiment, the sorghum plant comprising a sorghum MC or recombinant chromosome may exhibit increased or decreased expression or accumulation of a product of the plant, which may be a natural product of the plant or a new or altered product of the plant. Exemplary products include an enzyme, an RNA molecule, a nutritional protein, a structural protein, an amino acid, a lipid, a fatty acid, a polysaccharide, a sugar, an alcohol, an alkaloid, a carotenoid, a propanoid, a phenylpropanoid, or terpenoid, a steroid, a flavonoid, a phenolic compound, an anthocyanin, a pigment, a vitamin or a plant hormone. In another embodiment, the sorghum plant comprising a sorghum MC or recombinant chromosome has enhanced or diminished requirements for light, water, nitrogen, or trace elements. In another embodiment the sorghum plant comprising a sorghum MC or recombinant chromosome has an enhanced ability to capture or fix nitrogen from its environment. In yet another embodiment, the sorghum plant comprising a sorghum MC or recombinant chromosome is enriched for an essential amino acid as a proportion of a protein fraction of the plant. The protein fraction may be, for example, total seed protein, soluble protein, insoluble protein, water-extractable protein, and lipid-associated protein. The sorghum plant comprising a sorghum MC or recombinant chromosome may include genes that cause the overexpression, underexpression, antisense modulation, sense suppression, inducible expression, inducible repression, or inducible modulation of another gene.
[0132] A brief summary of exemplary improved properties and polypeptides of interest for either increased or decreased expression is provided below.
[0133] (i) Herbicide Resistance
[0134] A herbicide resistance (or tolerance) trait is a characteristic of a sorghum plant comprising a sorghum MC or recombinant chromosome that is resistant to dosages of an herbicide that is typically lethal to a wild type plant. Exemplary herbicides for which resistance is useful in a plant include glyphosate herbicides, phosphinothricin herbicides, oxynil herbicides, imidazolinone herbicides, dinitroaniline herbicides, pyridine herbicides, sulfonylurea herbicides, bialaphos herbicides, sulfonamide herbicides and glufosinate herbicides. Other herbicides would be useful as would combinations of herbicide genes on the same MC or recombinant chromosome.
[0135] Genes encoding phosphinothricin acetyltransferase (bar), glyphosate-tolerant EPSP synthase, glyphosate acetyltransferase, glyphosate oxidoreductase (the glyphosate-degrading enzyme gene gox), a dehalogenase enzyme that inactivates dalapon (deh), herbicide-resistant (e.g., sulfonylurea- and imidazolinone-resistant) acetolactate synthase, and a nitrilase enzyme that degrades bromoxynil (bxn) are good examples of herbicide resistance genes for use in transformation. The bar gene codes for an enzyme, phosphinothricin acetyltransferase (PAT), which inactivates the herbicide phosphinothricin and prevents this compound from inhibiting glutamine synthetase enzymes. The enzyme 5-enolpyruvylshikimate-3-phosphate synthase (EPSP synthase) is normally inhibited by the herbicide N-(phosphonomethyl)glycine (glyphosate). However, genes are known that encode glyphosate-resistant EPSP synthase enzymes. These genes are particularly contemplated for use in plant transformation. The deh gene encodes the enzyme dalapon dehalogenase and confers resistance to the herbicide dalapon. The bxn gene codes for a specific nitrilase enzyme that converts bromoxynil to a non-herbicidal degradation product. The glyphosate acetyltransferase gene encodes an enzyme that inactivates the herbicide glyphosate and prevents this compound from inhibiting EPSP synthase.
[0136] Polypeptides that may produce plants having tolerance to plant herbicides include polypeptides involved in the shikimate pathway, which are of interest for providing glyphosate tolerant plants. Such polypeptides include polypeptides involved in biosynthesis of chorismate, phenylalanine, tyrosine and tryptophan.
[0137] (ii) Insect Resistance
[0138] Potential insect resistance (or tolerance) genes that can be introduced include Bacillus thuringiensis toxin genes or Bt genes (Watrud et al., In: Engineered Organisms and the Environment, 1985). Bt genes may provide resistance to lepidopteran or coleopteran pests such as European Corn Borer (ECB). Preferred Bt toxin genes for use in such embodiments include the CryIA(b) and CryIA(c) genes. Endotoxin genes from other species of B. thuringiensis which affect insect growth or development also may be employed in this regard.
[0139] It is contemplated that preferred Bt genes for use in the MCs or recombinant chromosomes disclosed herein will be those in which the coding sequence has been modified to effect increased expression in plants, and for example, in monocot plants including sorghum. Means for preparing synthetic genes are well known in the art and are disclosed in, for example, U.S. Pat. No. 5,500,365 and U.S. Pat. No. 5,689,052, the disclosures of each of which are specifically incorporated herein by reference in their entirety. Examples of such modified Bt toxin genes include a synthetic Bt CryIA(b) gene (Perlak et al., PNAS USA, 88:3324-3328, 1991), and the synthetic CryIA(c) gene termed 1800b (PCT Application WO 95/06128). Some examples of other Bt toxin genes known to those of skill in the art are given in Table 1 below.
TABLE 1
Bacillus thuringiensis Endotoxin Genes (a)

New Nomenclature    Old Nomenclature    GenBank Accession
Cry1Aa              CryIA(a)            M11250
Cry1Ab              CryIA(b)            M13898
Cry1Ac              CryIA(c)            M11068
Cry1Ad              CryIA(d)            M73250
Cry1Ae              CryIA(e)            M65252
Cry1Ba              CryIB               X06711
Cry1Bb              ET5                 L32020
Cry1Bc              PEG5                Z46442
Cry1Bd              CryE1               U70726
Cry1Ca              CryIC               X07518
Cry1Cb              CryIC(b)            M97880
Cry1Da              CryID               X54160
Cry1Db              PrtB                Z22511
Cry1Ea              CryIE               X53985
Cry1Eb              CryIE(b)            M73253
Cry1Fa              CryIF               M63897
Cry1Fb              PrtD                Z22512
Cry1Ga              PrtA                Z22510
Cry1Gb              CryH2               U70725
Cry1Ha              PrtC                Z22513
Cry1Hb              -                   U35780
Cry1Ia              CryV                X62821
Cry1Ib              CryV                U07642
Cry1Ja              ET4                 L32019
Cry1Jb              ET1                 U31527
Cry1K               -                   U28801
Cry2Aa              CryIIA              M31738
Cry2Ab              CryIIB              M23724
Cry2Ac              CryIIC              X57252
Cry3A               CryIIIA             M22472
Cry3Ba              CryIIIB             X17123
Cry3Bb              CryIIIB2            M89794
Cry3C               CryIIID             X59797
Cry4A               CryIVA              Y00423
Cry4B               CryIVB              X07423
Cry5Aa              CryVA(a)            L07025
Cry5Ab              CryVA(b)            L07026
Cry6A               CryVIA              L07022
Cry6B               CryVIB              L07024
Cry7Aa              CryIIIC             M64478
Cry7Ab              CryIIICb            U04367
Cry8A               CryIIIE             U04364
Cry8B               CryIIIG             U04365
Cry8C               CryIIIF             U04366
Cry9A               CryIG               X58120
Cry9B               CryIX               X75019
Cry9C               CryIH               Z37527
Cry10A              CryIVC              M12662
Cry11A              CryIVD              M31737
Cry11B              Jeg80               X86902
Cry12A              CryVB               L07027
Cry13A              CryVC               L07023
Cry14A              CryVD               U13955
Cry15A              34kDa               M76442
Cry16A              cbm71               X94146
Cry17A              cbm71               X99478
Cry18A              CryBP1              X99049
Cry19A              Jeg65               Y08920
Cyt1Aa              CytA                X03182
Cyt1Ab              CytM                X98793
Cyt2A               CytB                Z14147
Cyt2B               CytB                U52043

(a) Adapted from Crickmore, N., Zeigler, D. R., Feitelson, J., et al., Microbiol. Molec. Biol. Rev. 62:807-813, 1998.
[0140] Protease inhibitors also may provide insect resistance (Johnson et al., PNAS USA 86:9871-9875, 1989), and will thus have utility in plant transformation. The use of a pinII gene in combination with a Bt toxin gene, the combined effect of which has been discovered to produce synergistic insecticidal activity, is envisioned to be particularly useful. Other genes which encode inhibitors of the insect's digestive system, or those that encode enzymes or cofactors that facilitate the production of inhibitors, also may be useful. This group may be exemplified by oryzacystatin and amylase inhibitors such as those from wheat and barley.
[0141] Amylase inhibitors are found in various plant species and are used to ward off insect predation via inhibition of the digestive amylases of attacking insects. Several amylase inhibitor genes have been isolated from plants and some have been introduced as exogenous nucleic acids, conferring an insect resistant phenotype that is potentially useful (Chrispeels, M J and D E Sadava, Plants, Genes, and Crop Biotechnology Jones and Bartlett Press, 2003).
[0142] Genes encoding lectins may confer additional or alternative insecticidal properties. Lectins are multivalent carbohydrate-binding proteins which have the ability to agglutinate red blood cells from a range of species. Lectins have been identified recently as insecticidal agents with activity against weevils, ECB and rootworm (Murdock et al., Phytochemistry, 29:85-89, 1990; Czapla & Lang, J. Econ. Entomol., 83:2480-2485, 1990). Lectin genes contemplated to be useful include, for example, barley and wheat germ agglutinin (WGA) and rice lectins (Gatehouse et al., J. Sci. Food. Agric., 35:373-380, 1984), with WGA being preferred.
[0143] Genes controlling the production of large or small polypeptides active against insects when introduced into the insect pests, such as, e.g., lytic peptides, peptide hormones and toxins and venoms, form another aspect of the invention. For example, it is contemplated that the expression of juvenile hormone esterase, directed towards specific insect pests, also may result in insecticidal activity, or perhaps cause cessation of metamorphosis (Hammock et al., Nature, 344:458-461, 1990).
[0144] Genes that encode enzymes that affect the integrity of the insect cuticle form yet another aspect of the invention. Such genes include those encoding, e.g., chitinase, proteases, lipases and also genes for the production of nikkomycin, a compound that inhibits chitin synthesis, the introduction of any of which is contemplated to produce insect resistant plants. Genes that code for activities that affect insect molting, such as those affecting the production of ecdysteroid UDP glucosyl transferase, also fall within the scope of the useful exogenous nucleic acids of the present invention.
[0145] Genes that code for enzymes that facilitate the production of compounds that reduce the nutritional quality of the host plant to insect pests also are encompassed by the present invention. It may be possible, for instance, to confer insecticidal activity on a plant by altering its sterol composition. Sterols are obtained by insects from their diet and are used for hormone synthesis and membrane stability. Therefore alterations in plant sterol composition by expression of novel genes, e.g., those that directly promote the production of undesirable sterols or those that convert desirable sterols into undesirable forms, could have a negative effect on insect growth and/or development and hence endow the plant with insecticidal activity. Lipoxygenases are naturally occurring plant enzymes that have been shown to exhibit anti nutritional effects on insects and to reduce the nutritional quality of their diet. Therefore, further embodiments of the invention concern modified plants with enhanced lipoxygenase activity which may be resistant to insect feeding.
[0146] Tripsacum dactyloides is a species of grass that is resistant to certain insects, including rootworm. It is anticipated that genes encoding proteins that are toxic to insects or are involved in the biosynthesis of compounds toxic to insects will be isolated from Tripsacum and that these novel genes will be useful in conferring resistance to insects. It is known that the basis of insect resistance in Tripsacum is genetic, because said resistance has been transferred to Zea mays via sexual crosses (Branson and Guss, Proceedings North Central Branch Entomological Society of America, 27:91-95, 1972). It is further anticipated that other cereal, monocot or dicot plant species may have genes encoding proteins that are toxic to insects which would be useful for producing insect-resistant plants.
[0147] Further genes encoding proteins characterized as having potential insecticidal activity also may be used as exogenous nucleic acids in accordance herewith. Such genes include, for example, the cowpea trypsin inhibitor (CpTI; Hilder et al., Nature, 330:160-163, 1987) which may be used as a rootworm deterrent; genes encoding avermectin (Avermectin and Abamectin; Campbell, W. C., Ed., 1989; Ikeda et al., J. Bacteriol., 169:5615-5621, 1987) which may prove particularly useful as a corn rootworm deterrent; ribosome-inactivating protein genes; and even genes that regulate plant structures. Sorghum plants comprising a sorghum MC or recombinant chromosome comprising anti-insect antibody genes and genes that code for enzymes that can convert a non-toxic insecticide (pro-insecticide) applied to the outside of the plant into an insecticide inside the plant also are contemplated.
[0148] Polypeptides that may improve plant tolerance to the effects of plant pests or pathogens include proteases, polypeptides involved in anthocyanin biosynthesis, polypeptides involved in cell wall metabolism, including cellulases, glucosidases, pectin methylesterase, pectinase, polygalacturonase, chitinase, chitosanase, and cellulose synthase, and polypeptides involved in biosynthesis of terpenoids or indole for production of bioactive metabolites to provide defense against herbivorous insects. It is also anticipated that combinations of different insect resistance genes on the same MC or recombinant chromosomes will be particularly useful.
[0149] Vegetative Insecticidal Proteins (VIP) are another class of proteins, originally found to be produced in the vegetative growth phase of the bacterium Bacillus cereus, that have a spectrum of insect lethality similar to the insecticidal genes found in strains of Bacillus thuringiensis. Both the vip1a and vip3A genes have been isolated and have demonstrated insect toxicity. It is anticipated that such genes may be used in modified plants to confer insect resistance ("Plants, Genes, and Crop Biotechnology" by Maarten J. Chrispeels and David E. Sadava (2003) Jones and Bartlett Press).
[0150] (iii) Environment or Stress Resistance
[0151] Improvement of a plant's ability to tolerate various environmental stresses such as, but not limited to, drought, excess moisture, chilling, freezing, high temperature, salt, and oxidative stress, also can be effected through expression of novel genes. It is proposed that benefits may be realized in terms of increased resistance to freezing temperatures through the introduction of an "antifreeze" protein such as that of the Winter Flounder (Cutler et al., J. Plant Physiol., 135:351-354, 1989) or synthetic gene derivatives thereof. Improved chilling tolerance also may be conferred through increased expression of glycerol-3-phosphate acetyltransferase in chloroplasts (Wolter et al., EMBO J., 4685-4692, 1992). Resistance to oxidative stress (often exacerbated by conditions such as chilling temperatures in combination with high light intensities) can be conferred by expression of superoxide dismutase (Gupta et al., 1993), and may be improved by glutathione reductase (Bowler et al., Ann Rev. Plant Physiol., 43:83-116, 1992). Such strategies may allow for tolerance to freezing in newly emerged fields as well as extending later maturity higher yielding varieties to earlier relative maturity zones. Many sorghum genes are known to be modulated by various stresses including drought stress, salt stress and the application of stress response hormones (Buchanan C D, et al. Plant Mol Biol. 58:699-720, 2005; Srinivas G, et al. Theor Appl Genet. 118:703-17, 2009).
[0152] It is contemplated that the expression of novel genes that favorably affect plant water content, total water potential, osmotic potential, or turgor will enhance the ability of the plant to tolerate drought. As used herein, the terms "drought resistance" and "drought tolerance" are used to refer to a plant's increased resistance or tolerance to stress induced by a reduction in water availability, as compared to normal circumstances, and the ability of the plant to function and survive in lower water environments. In this aspect of the invention it is proposed, for example, that the expression of genes encoding the biosynthesis of osmotically active solutes, such as polyol compounds, may impart protection against drought. Within this class are genes encoding mannitol-1-phosphate dehydrogenase (Lee and Saier, PNAS USA 78:7336-7340, 1981) and trehalose-6-phosphate synthase (Kaasen et al., J. Bacteriology, 174:889-898, 1992). Through the subsequent action of native phosphatases in the cell or by the introduction and coexpression of a specific phosphatase, these introduced genes will result in the accumulation of either mannitol or trehalose, respectively, both of which have been well documented as protective compounds able to mitigate the effects of stress. Mannitol accumulation in transgenic tobacco has been verified and preliminary results indicate that plants expressing high levels of this metabolite are able to tolerate an applied osmotic stress (Tarczynski et al., Science, 259:508-510, 1993; Tarczynski et al., PNAS USA, 89:1-5, 1993).
[0153] Similarly, the efficacy of other metabolites in protecting either enzyme function (e.g., alanopine or propionic acid) or membrane integrity (e.g., alanopine) has been documented (Loomis et al., J. Expt. Zoology 252:9-15, 1989), and therefore expression of genes encoding the biosynthesis of these compounds might confer drought resistance in a manner similar to or complementary to mannitol. Other examples of naturally occurring metabolites that are osmotically active and/or provide some direct protective effect during drought and/or desiccation include fructose, erythritol (Coxson et al., Biotropica 24:121-133, 1992), sorbitol, dulcitol (Karsten et al., Botanica Marina 35:11-19, 1992), glucosylglycerol (Reed et al., J. Gen. Microbiol. 130:1-4, 1984; Erdmann et al., J. Gen. Microbiol. 138:363-368, 1992), sucrose, stachyose (Koster and Leopold, Plant Physiol. 88:829-832, 1988; Blackman et al., Plant Physiol. 100:225-230, 1992), raffinose (Lugo and Leopold, Plant Physiol. 98:1207-1210, 1992), proline (Rensburg et al., J. Plant Physiol. 141:188-194, 1993), glycine betaine, ononitol and pinitol (Vernon and Bohnert, EMBO J. 11:2077-2085, 1992). Continued growth and increased reproductive fitness during times of stress may be augmented by introduction and expression of genes such as those controlling the osmotically active compounds discussed above and other such compounds. Currently preferred genes which promote the synthesis of an osmotically active polyol compound are genes which encode the enzymes mannitol-1-phosphate dehydrogenase, trehalose-6-phosphate synthase and myo-inositol O-methyltransferase.
[0154] It is contemplated that the expression of specific proteins also may increase drought tolerance. Three classes of Late Embryogenesis Abundant (LEA) proteins have been assigned based on structural similarities (see Dure et al., Plant Molec. Biol. 12:475-486, 1989). All three classes of LEAs have been demonstrated in maturing (e.g., desiccating) seeds. Within these three types of LEA proteins, the Type II (dehydrin type) have generally been implicated in drought and/or desiccation tolerance in vegetative plant parts (e.g., Mundy and Chua, EMBO J., 7:2279-2286, 1988; Piatkowski et al., Plant Physiol. 94:1682-1688, 1990; Yamaguchi-Shinozaki et al., Plant Cell Physiol. 33:217-224, 1992). Expression of a Type III LEA (HVA 1) in tobacco was found to influence plant height, maturity and drought tolerance (Fitzpatrick, Gen. Engineering News 22:7, 1993). In rice, expression of the HVA 1 gene influenced tolerance to water deficit and salinity (Xu et al., Plant Physiol. 110:249-257, 1996). Expression of structural genes from any of the three LEA groups may therefore confer drought tolerance. Other types of proteins induced during water stress include thiol proteases, aldolases or transmembrane transporters (Guerrero et al., Plant Molecul. Biol. 15:11-26, 1990), which may confer various protective and/or repair type functions during drought stress. It also is contemplated that genes that affect lipid biosynthesis and hence membrane composition might also be useful in conferring drought resistance on the plant.
[0155] Many of these genes for improving drought resistance have complementary modes of action. Thus, it is envisaged that combinations of these genes might have additive and/or synergistic effects in improving drought resistance in plants. Many of these genes also improve freezing tolerance (or resistance); the physical stresses incurred during freezing and drought are similar in nature and may be mitigated in similar fashion. Benefits may be conferred via constitutive expression of these genes, but the preferred means of expressing these novel genes may be through the use of a turgor induced promoter (such as the promoters for the turgor induced genes described in Guerrero et al., Plant Molecul. Biol. 15:11-26, 1990 and Shagan et al., Plant Physiol. 101:1397-1398, 1993 which are incorporated herein by reference). Spatial and temporal expression patterns of these genes may enable plants to better withstand stress.
[0156] It is proposed that expression of genes that are involved with specific morphological traits that allow for increased water extractions from drying soil would be of benefit. For example, introduction and expression of genes that alter root characteristics may enhance water uptake. It also is contemplated that expression of genes that enhance reproductive fitness during times of stress would be of significant value. For example, expression of genes that improve the synchrony of pollen shed and receptiveness of the female flower parts, e.g., silks, would be of benefit. In addition it is proposed that expression of genes that minimize kernel abortion during times of stress would increase the amount of grain to be harvested and hence be of value.
[0157] Given the overall role of water in determining yield, it is contemplated that enabling plants to utilize water more efficiently, through the introduction and expression of novel genes, will improve overall performance even when soil water availability is not limiting. By introducing genes that improve the ability of plants to maximize water usage across a full range of stresses relating to water availability, yield stability or consistency of yield performance may be realized.
[0158] Polypeptides that may improve stress tolerance under a variety of stress conditions include polypeptides involved in gene regulation, such as serine/threonine-protein kinases, MAP kinases, MAP kinase kinases, and MAP kinase kinase kinases; polypeptides that act as receptors for signal transduction and regulation, such as receptor protein kinases; intracellular signaling proteins, such as protein phosphatases, GTP binding proteins, and phospholipid signaling proteins; polypeptides involved in arginine biosynthesis; polypeptides involved in ATP metabolism, including for example ATPase, adenylate transporters, and polypeptides involved in ATP synthesis and transport; polypeptides involved in glycine betaine, jasmonic acid, flavonoid or steroid biosynthesis; and hemoglobin. Enhanced or reduced activity of such polypeptides in modified plants will provide changes in the ability of a plant to respond to a variety of environmental stresses, such as chemical stress, drought stress and pest stress.
[0159] Other polypeptides that may improve plant tolerance to cold or freezing temperatures include polypeptides involved in biosynthesis of trehalose or raffinose, polypeptides encoded by cold induced genes, fatty acyl desaturases and other polypeptides involved in glycerolipid or membrane lipid biosynthesis, which find use in modification of membrane fatty acid composition, alternative oxidase, calcium-dependent protein kinases, LEA proteins or uncoupling protein.
[0160] Other polypeptides that may improve plant tolerance to heat include polypeptides involved in biosynthesis of trehalose, polypeptides involved in glycerolipid biosynthesis or membrane lipid metabolism (for altering membrane fatty acid composition), heat shock proteins or mitochondrial NDK.
[0161] Other polypeptides that may improve tolerance to extreme osmotic conditions include polypeptides involved in proline biosynthesis.
[0162] Other polypeptides that may improve plant tolerance to drought conditions include aquaporins, polypeptides involved in biosynthesis of trehalose or wax, LEA proteins or invertase.
[0163] (iv) Disease Resistance
[0164] It is proposed that increased resistance (or tolerance) to diseases may be realized through introduction of genes into plants, for example, into monocotyledonous plants such as sorghum. It is possible to produce resistance to diseases caused by viruses, viroids, bacteria, fungi and nematodes. It also is contemplated that control of mycotoxin-producing organisms may be realized through expression of introduced genes. Resistance can be effected through suppression of endogenous factors that encourage disease-causing interactions, expression of exogenous factors that are toxic to or otherwise provide protection from pathogens, or expression of factors that enhance the plant's own defense responses.
[0165] Resistance to viruses may be produced through expression of novel genes. For example, it has been demonstrated that expression of a viral coat protein in a modified plant can impart resistance to infection of the plant by that virus and perhaps other closely related viruses (Hemenway et al., EMBO J. 7:1273-1280, 1988, Abel et al., Science 232:738-743, 1986). It is contemplated that expression of antisense genes targeted at essential viral functions may also impart resistance to viruses. For example, an antisense gene targeted at the gene responsible for replication of viral nucleic acid may inhibit replication and lead to resistance to the virus. It is believed that interference with other viral functions through the use of antisense genes also may increase resistance to viruses. Further, it is proposed that it may be possible to achieve resistance to viruses through other approaches, including, but not limited to the use of satellite viruses.
[0166] It is proposed that increased resistance to diseases caused by bacteria and fungi may be realized through introduction of novel genes. It is contemplated that genes encoding so-called "peptide antibiotics," pathogenesis-related (PR) proteins, toxin resistance, or proteins affecting host-pathogen interactions such as morphological characteristics will be useful. Peptide antibiotics are polypeptide sequences which are inhibitory to growth of bacteria and other microorganisms. For example, the classes of peptides referred to as cecropins and magainins inhibit growth of many species of bacteria and fungi. It is proposed that expression of PR proteins in plants, for example, monocots such as sorghum, may be useful in conferring resistance to bacterial disease. These genes are induced following pathogen attack on a host plant and have been divided into at least five classes of proteins (Bol et al., Annu. Rev. Phytopathol. 28:113-138, 1990). Included amongst the PR proteins are beta-1,3-glucanases, chitinases, and osmotin and other proteins that are believed to function in plant resistance to disease organisms. Other genes have been identified that have antifungal properties, e.g., UDA (stinging nettle lectin), or hevein (Broekaert et al., PNAS USA 87:7633-7, 1989; Barkai-Golan et al., Arch. Microbiol. 116:119-121, 1978). It is known that certain plant diseases are caused by the production of phytotoxins. It is proposed that resistance to these diseases would be achieved through expression of a novel gene that encodes an enzyme capable of degrading or otherwise inactivating the phytotoxin. It also is contemplated that expression of novel genes that alter the interactions between the host plant and pathogen may be useful in reducing the ability of the disease organism to invade the tissues of the host plant, e.g., an increase in the waxiness of the leaf cuticle or other morphological characteristics.
[0167] Polypeptides useful for imparting improved disease responses to plants include polypeptides encoded by cercosporin-induced genes, antifungal proteins and proteins encoded by R-genes or SAR genes.
[0168] Agronomically important diseases in sorghum include but are not limited to those caused by Exserohilum turcicum, Colletotrichum graminicola (Glomerella graminicola), Cercospora sorghi, Gloeocercospora sorghi, Ascochyta sorghi, Pseudomonas syringae pv. syringae, Xanthomonas campestris pv. holcicola, Pseudomonas andropogonis, Puccinia purpurea, Macrophomina phaseolina, Periconia circinata, Fusarium moniliforme, Alternaria alternata, Bipolaris sorghicola, Helminthosporium sorghicola, Curvularia lunata, Phoma insidiosa, Pseudomonas avenae (Pseudomonas alboprecipitans), Ramulispora sorghi, Ramulispora sorghicola, Phyllachora sacchari, Sporisorium reilianum (Sphacelotheca reiliana), Sphacelotheca cruenta, Sporisorium sorghi, Sugarcane mosaic H, Maize Dwarf Mosaic Virus A & B, Claviceps sorghi, Rhizoctonia solani, Acremonium strictum, Sclerophthora macrospora, Peronosclerospora sorghi, Peronosclerospora philippinensis, Sclerospora graminicola, Fusarium graminearum, Fusarium oxysporum, Pythium arrhenomanes, and Pythium graminicola.
[0169] (v) Plant Agronomic Characteristics
[0170] Temperature also influences where crop plants can be grown. Within the areas where it is possible to grow a particular crop, there are varying limitations on the maximal time it is allowed to grow to maturity and be harvested. For example, a variety to be grown in a particular area is selected for its ability to mature within the required period of time with maximum possible yield. It is considered that genes that influence maturity can be identified and introduced into plant lines to create new varieties adapted to different growing locations or the same growing location, but having improved yield at harvest. Expression of genes that are involved in regulation of plant development may be especially useful.
[0171] It is contemplated that genes may be introduced into plants that would improve standability and other plant growth characteristics. Expression of novel genes in plants which confer stronger stalks, improved root systems, or prevent or reduce ear droppage or shattering would be of great value to the farmer. It is proposed that introduction and expression of genes that increase the total amount of photoassimilate available by, for example, increasing light distribution and/or interception would be advantageous. In addition, the expression of genes that increase the efficiency of photosynthesis and/or the leaf canopy would further increase gains in productivity. It is contemplated that expression of a phytochrome gene in crop plants may be advantageous. Expression of such a gene may reduce apical dominance, confer semidwarfism on a plant, or increase shade tolerance (U.S. Pat. No. 5,268,526). Such approaches would allow for increased plant populations in the field.
[0172] (vi) Nutrient Utilization
[0173] The ability to utilize available nutrients may be a limiting factor in growth of crop plants. It is proposed that it would be possible to alter nutrient uptake, tolerance of pH extremes, mobilization through the plant, storage pools, and availability for metabolic activities by the introduction of novel genes. These modifications would allow a plant, for example, sorghum, to more efficiently utilize available nutrients. It is contemplated that an increase in the activity of, for example, an enzyme that is normally present in the plant and involved in nutrient utilization would increase the availability of a nutrient or decrease the availability of an antinutritive factor. An example of such an enzyme would be phytase. It is further contemplated that enhanced nitrogen utilization by a plant is desirable. Expression of a glutamate dehydrogenase gene in plants, e.g., E. coli gdhA genes, may lead to increased fixation of nitrogen in organic compounds. Furthermore, expression of gdhA in plants may lead to enhanced resistance to the herbicide glufosinate by incorporation of excess ammonia into glutamate, thereby detoxifying the ammonia. It also is contemplated that expression of a novel gene may make a nutrient source available that was previously not accessible, e.g., an enzyme that releases a component of nutrient value from a more complex molecule, perhaps a macromolecule.
[0174] Polypeptides useful for improving nitrogen flow, sensing, uptake, storage and/or transport include those involved in aspartate, glutamine or glutamate biosynthesis, polypeptides involved in aspartate, glutamine or glutamate transport, polypeptides associated with the TOR (Target of Rapamycin) pathway, nitrate transporters, nitrate reductases, amino transferases, ammonium transporters, chlorate transporters or polypeptides involved in tetrapyrrole biosynthesis.
[0175] Polypeptides useful for increasing the rate of photosynthesis include phytochrome, ribulose bisphosphate carboxylase-oxygenase, Rubisco activase, photosystem I and II proteins, electron carriers, ATP synthase, NADH dehydrogenase or cytochrome oxidase.
[0176] Polypeptides useful for increasing phosphorus uptake, transport or utilization include phosphatases or phosphate transporters.
[0177] (vii) Male Sterility
[0178] Male sterility is useful in the production of hybrid varieties of sorghum. It is proposed that male sterility may be produced through expression of novel genes. For example, it has been shown that expression of genes that encode proteins, RNAs, or peptides that interfere with development of the male inflorescence and/or gametophyte results in male sterility. Chimeric ribonuclease genes that express in the anthers of transgenic tobacco and oilseed rape have been demonstrated to lead to male sterility (Mariani et al., Nature, 347:737-741, 1990).
[0179] A number of mutations were discovered in maize that confer cytoplasmic male sterility. One mutation in particular, referred to as T cytoplasm, also correlates with sensitivity to Southern corn leaf blight. A DNA sequence, designated TURF 13 (Levings, Science, 250:942-947, 1990), was identified that correlates with T cytoplasm. It is proposed that it would be possible through the introduction of TURF 13 via transformation, to separate male sterility from disease sensitivity. As it is necessary to be able to restore male fertility for breeding purposes and for grain production, it is proposed that genes encoding restoration of male fertility also may be introduced.
[0180] Male sterility systems have also been described in sorghum. These include cytoplasmic male sterility (Van Tang H, et al. Curr Genet. 29:265-74, 1996), gametophytic male sterility (Pring D R, et al. J Hered. 90:386-93, 1999), and nuclear male sterility (J. F. Pedersen and J. J. Toy, Crop Science 41:607, 2001).
[0181] (viii) Altered Nutritional Content
[0182] Genes can be introduced into plants to improve or alter the nutrient quality or content. Introduction of genes that alter the nutrient composition of a crop can greatly enhance the feed, food or forage value. Limiting essential amino acids can include lysine, methionine, tryptophan, threonine, valine, arginine, and histidine. The levels of these essential amino acids can be elevated by mechanisms which include, but are not limited to, the introduction of genes to increase the biosynthesis of the amino acids, decrease the degradation of the amino acids, increase the storage of the amino acids in proteins, or increase transport of the amino acids to particular tissues.
[0183] Polypeptides useful for providing increased protein quantity and/or quality include polypeptides involved in the metabolism of amino acids in plants, particularly polypeptides involved in biosynthesis of methionine/cysteine and lysine, amino acid transporters, amino acid efflux carriers, seed storage proteins, proteases, or polypeptides involved in phytic acid metabolism.
[0184] The protein composition of a crop can be altered to improve the balance of amino acids in a variety of ways including elevating expression of native proteins, decreasing expression of those with poor composition, changing the composition of native proteins, or introducing genes encoding entirely new proteins possessing superior composition.
[0185] The introduction of genes that alter the oil content of a crop plant can also be of value. Increases in oil content can result in increases in metabolizable energy content. The introduced genes can encode enzymes that remove or reduce rate-limitations or regulated steps in fatty acid or lipid biosynthesis. Such genes can include, but are not limited to, those that encode acetyl-CoA carboxylase, ACP-acyltransferase, beta-ketoacyl-ACP synthase, or other well known fatty acid biosynthetic activities. Other possibilities are genes that encode proteins that do not possess enzymatic activity such as acyl carrier protein. Genes can be introduced that alter the balance of fatty acids present in the oil providing a more healthful or nutritive feedstuff. The introduced DNA also can encode sequences that block expression of enzymes involved in fatty acid biosynthesis, altering the proportions of fatty acids present in crops.
[0186] Genes can be introduced that enhance the nutritive value of crops, or of foods derived from crops by increasing the level of naturally occurring phytosterols, or by encoding for proteins to enable the synthesis of phytosterols in crops. The phytosterols from these crops can be processed directly into foods, or extracted and used to manufacture food products.
[0187] Genes can be introduced that enhance the nutritive value or energy value of the starch component of crops, for example by increasing the degree of branching of starch molecules, resulting in improved utilization of the starch in biofuel feedstock applications. Additionally, other major constituents of a crop can be altered, including genes that affect a variety of other nutritive, processing, or other quality aspects. For example, pigmentation can be increased or decreased.
[0188] Carbohydrate metabolism can be altered, for example by increased sucrose production and/or transport. Polypeptides useful for affecting carbohydrate metabolism include polypeptides involved in sucrose or starch metabolism, carbon assimilation or carbohydrate transport, including, for example sucrose transporters or glucose/hexose transporters, enzymes involved in glycolysis/gluconeogenesis, the pentose phosphate cycle, or raffinose biosynthesis, or polypeptides involved in glucose signaling, such as SNF1 complex proteins.
[0189] Feed or food crops can also possess sub-optimal quantities of vitamins, antioxidants or other nutraceuticals, requiring supplementation to provide adequate nutritive value and ideal health value. Introduction of genes that enhance vitamin biosynthesis can be envisioned including, for example, vitamins A, E, B12, choline, or the like. Mineral content can also be sub-optimal. Thus genes that affect the accumulation or availability of compounds containing phosphorus, sulfur, calcium, manganese, zinc, or iron among others would be valuable.
[0190] Numerous other examples of improvements of crops can be used with the invention. Introduction of DNA to accomplish this might include sequences that alter lignin production such as those that result in the "brown midrib" phenotype associated with superior feed value for cattle. Other genes can encode enzymes that alter the structure of extracellular carbohydrates, or that facilitate the degradation of the carbohydrates so that they can be efficiently fermented into ethanol or other useful products.
[0191] It can be desirable to modify the nutritional content of plants by reducing undesirable components such as fats, starches, etc. This can be done, for example, by the use of exogenous nucleic acids that encode enzymes which increase plant use or metabolism of such components so that they are present at lower quantities. Alternatively, it can be done by use of exogenous nucleic acids that reduce expression levels or activity of native plant enzymes that synthesize such components.
[0192] Likewise the elimination of certain undesirable traits can improve the food or feed value of the crop. Many undesirable traits must currently be eliminated by special post-harvest processing steps and the degree to which these can be engineered into the plant prior to harvest and processing would provide significant value. Examples of such traits are the elimination of anti-nutritionals such as phytates and phenolic compounds which are commonly found in many crop species. Also, the reduction of fats, carbohydrates and certain phytohormones can be valuable for the food and feed industries as they can allow a more efficient mechanism to meet specific dietary requirements.
[0193] In addition to direct improvements in feed or food value, genes also can be introduced which improve the processing of crops and improve the value of the products resulting from the processing. Novel genes that increase the efficiency and reduce the cost of such processing, for example by decreasing the time required at a particular step, can also find use. Improving the value of products derived from processed plants, such as sorghum, can include altering the quantity or quality of sugar, starch, oil, fiber, gluten, or other components. Elevation of sugar or starch can be achieved through the identification and elimination of rate limiting steps in sugar and starch biosynthesis by expressing increased amounts of enzymes involved in biosynthesis or by decreasing levels of the other components of crops resulting in proportional increases in sugar or starch. In addition, plants can be modified by introducing or expressing a gene or genes that produce novel products, such as secondary plant metabolites or pharmaceutical products, which can be purified during the processing step. Using MCs or recombinant chromosomes to both introduce genes for new products and optionally for improving processing steps could provide a cost effective option to produce these novel products.
[0194] Oil is another product of processing, the value of which can be improved by introduction and expression of genes. Oil properties can be altered to improve its performance in the production and use of cooking oil, shortenings, lubricants or other oil-derived products or improvement of its health attributes when used in food-related applications. Novel fatty acids also can be synthesized which upon extraction can serve as starting materials for chemical syntheses. The changes in oil properties can be achieved by altering the type, level, or lipid arrangement of the fatty acids present in the oil. This in turn can be accomplished by the addition of genes that encode enzymes that catalyze the synthesis of novel fatty acids (e.g. fatty acid elongases, desaturases) and the lipids possessing them or by increasing levels of native fatty acids while possibly reducing levels of precursors or breakdown products. Alternatively, DNA sequences can be introduced which slow or block steps in fatty acid biosynthesis resulting in the increase in precursor fatty acid intermediates. Genes that might be added include desaturases, epoxidases, hydratases, dehydratases, or other enzymes that catalyze reactions involving fatty acid intermediates. Representative examples of catalytic steps that might be blocked include the desaturations from stearic to oleic acid or oleic to linoleic acid resulting in the respective accumulations of stearic and oleic acids. Another example is the blockage of elongation steps resulting in the accumulation of C8 to C12 saturated fatty acids.
[0195] Polypeptides useful for providing increased oil quantity and/or quality include polypeptides involved in fatty acid and glycerolipid biosynthesis, beta-oxidation enzymes, or enzymes involved in biosynthesis of nutritional compounds, such as carotenoids and tocopherols.
[0196] Polypeptides involved in production of galactomannans or arabinogalactans are of interest for providing plants having increased and/or modified reserve polysaccharides for use in food, pharmaceutical, cosmetic, paper and paint industries.
[0197] Polypeptides involved in modification of flavonoid/isoflavonoid metabolism in plants include cinnamate-4-hydroxylase, chalcone synthase or flavone synthase. Enhanced or reduced activity of such polypeptides in modified plants will provide changes in the quantity and/or speed of flavonoid metabolism in plants and can improve disease resistance by enhancing synthesis of protective secondary metabolites or improving signaling pathways governing disease resistance.
[0198] Polypeptides involved in lignin biosynthesis are of interest for increasing plants' resistance to lodging and for increasing the usefulness of plant materials as biofuels.
[0199] (ix) Production or Assimilation of Chemicals or Biologicals
[0200] It may further be considered that a sorghum plant comprising a sorghum MC or recombinant chromosome prepared in accordance with the invention may be used for the production or manufacturing of useful biological compounds that were either not produced at all, or not produced at the same level, in the plant previously. Alternatively, plants produced in accordance with the invention may be made to metabolize or absorb and concentrate certain compounds, such as hazardous wastes, thereby allowing bioremediation of these compounds.
[0201] The novel plants producing these compounds are made possible by the introduction and expression of one or potentially many genes with the constructs provided by the invention. The vast array of possibilities includes, but is not limited to, any biological compound which is presently produced by any organism such as proteins, nucleic acids, primary and intermediary metabolites, carbohydrate polymers, enzymes for uses in bioremediation, enzymes for modifying pathways that produce secondary plant metabolites such as flavonoids or vitamins, enzymes that could produce pharmaceuticals, and enzymes that could produce compounds of interest to the manufacturing industry such as specialty chemicals and plastics. The compounds may be produced by the plant, extracted upon harvest and/or processing, and used for any presently recognized useful purpose such as pharmaceuticals, fragrances, and industrial enzymes to name a few.
[0202] (x) Other characteristics
[0203] Cell cycle modification: Polypeptides encoding cell cycle enzymes and regulators of the cell cycle pathway are useful for manipulating growth rate in plants to provide early vigor and accelerated maturation. Improvements in quality traits, such as seed oil content, may also be obtained by expression of cell cycle enzymes and cell cycle regulators. Polypeptides of interest for modification of the cell cycle pathway include cyclin and EIF5α pathway proteins, polypeptides involved in polyamine metabolism, polypeptides which act as regulators of the cell cycle pathway, including cyclin-dependent kinases (CDKs), CDK-activating kinases, cell cycle-dependent phosphatases, CDK-inhibitors, Rb and Rb-binding proteins, or transcription factors that activate genes involved in cell proliferation and division, such as the E2F family of transcription factors, proteins involved in degradation of cyclins, such as cullins, and plant homologs of tumor suppressor polypeptides.
[0204] Plant growth regulators: Polypeptides involved in production of substances that regulate the growth of various plant tissues are of interest in the present invention and may be used to provide modified plants having altered morphologies and improved plant growth and development profiles leading to improvements in yield and stress response. Of particular interest are polypeptides involved in the biosynthesis, or degradation of plant growth hormones, such as gibberellins, brassinosteroids, cytokinins, auxins, ethylene or abscisic acid, and other proteins involved in the activity, uptake and/or transport of such polypeptides, including for example, cytokinin oxidase, cytokinin/purine permeases, F-box proteins, G-proteins or phytosulfokines.
[0205] Transcription factors in plants: Transcription factors play a key role in plant growth and development by controlling the expression of one or more genes in temporal, spatial and physiological specific patterns. Enhanced or reduced activity of such polypeptides in modified plants will provide significant changes in gene transcription patterns and provide a variety of beneficial effects in plant growth, development and response to environmental conditions. Transcription factors of interest include, but are not limited to myb transcription factors, including helix-turn-helix proteins, homeodomain transcription factors, leucine zipper transcription factors, MADS transcription factors, transcription factors having AP2 domains, zinc finger transcription factors, CCAAT binding transcription factors, ethylene responsive transcription factors, transcription initiation factors or UV damaged DNA binding proteins.
[0206] Homologous recombination: Increasing the rate of homologous recombination in plants is useful for accelerating the introgression of transgenes into breeding varieties by backcrossing, and to enhance the conventional breeding process by allowing rare recombinants between closely linked genes in phase repulsion to be identified more easily. Polypeptides useful for expression in plants to provide increased homologous recombination include polypeptides involved in mitosis and/or meiosis, DNA replication, nucleic acid metabolism, DNA repair pathways or homologous recombination pathways including for example, recombinases, nucleases, proteins binding to DNA double-strand breaks, single-strand DNA binding proteins, strand-exchange proteins, resolvases, ligases, helicases and polypeptide members of the RAD52 epistasis group.
Enhanced Biofuel Conversion
[0207] Biofuels can be produced from the conversion of biomass into liquid or gaseous fuels by converting the biomass into sugars, or by direct extraction of sugars, that can be fermented or chemically converted to form a biofuel. Biofuels can also be generated by extracting oils from the biomass. Exemplary biofuels are ethanol, propanol, butanol, methanol, methane, 2,5-dimethylfuran, dimethyl ether, biodiesel (short chain acid alkyl esters), biogasoline, syngas, paraffins (alkanes), other hydrocarbons or co-products of hydrogen.
[0208] The invention provides for MCs or recombinant chromosomes expressing at least one gene that enhances or increases sugar production or extractability, enhances or increases biomass, enhances the conversion of biomass to sugars or enhances sugar fermentation to biofuels. It may further be considered that a modified plant prepared in accordance with the invention may be used as biomass for the production of biofuels or the plant may facilitate conversion of biomass to sugars or facilitate fermentation of sugars to biofuels.
[0209] Enzymes that may be useful for biofuel production include those that break down glucans. In some embodiments, the enzymes are selected from the group consisting of: endo-β(1,4)-glucanase, cellobiohydrolase, β-glucosidase, α/β-glucosidase, mixed-linked glucanase, endo-β(1,3)-glucanase, exo-β(1,3)-glucanase and β-(1,6)-glucanase. In other embodiments the enzymes break down xyloglucans, xylans, mannans or lignins.
[0210] The enzyme genes may be controlled by inducible promoters that may be inactive until a desired time, such as at harvest or when the plant is added to the biofuels process (e.g. inactive at physiological conditions, then activated by heat or pH), or sequestered by subcellular localization. The enzymes may also be controlled by a tissue-specific promoter which may be active only in specific tissues (e.g. seeds or leaves).
Non-Protein-Expressing Exogenous Nucleic Acids
[0211] Plants with decreased expression of a gene of interest can also be achieved, for example, by expression of antisense nucleic acids, dsRNA or RNAi, catalytic RNA such as ribozymes, sense expression constructs that exhibit cosuppression effects, aptamers or zinc finger proteins.
[0212] Antisense RNA reduces production of the polypeptide product of the target messenger RNA, for example by blocking translation through formation of RNA:RNA duplexes or by inducing degradation of the target mRNA. Antisense approaches are a way of preventing or reducing gene function by targeting the genetic material as disclosed in U.S. Pat. Nos. 4,801,540, 5,107,065, 5,759,829, 5,910,444, 6,184,439, and 6,198,026, all of which are incorporated herein by reference. In one approach, an antisense gene sequence is introduced that is transcribed into antisense RNA that is complementary to the target mRNA. For example, part or all of the normal gene sequences are placed under a promoter in inverted orientation so that the complementary strand is transcribed into a non-protein expressing antisense RNA. The promoter used for the antisense gene may influence the level, timing, tissue, specificity, or inducibility of the antisense inhibition.
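By way of illustration only, the relationship between a target mRNA and an antisense RNA transcribed from the inverted gene sequence can be sketched in a few lines of Python. The function name and the short target sequence below are hypothetical and are not taken from the sequences of the invention; the sketch merely shows that the antisense transcript is the reverse complement of the target, which is what allows formation of an RNA:RNA duplex.

```python
# Illustrative sketch only: derive the antisense RNA for a target mRNA.
# The sequence used here is a made-up fragment, not a sequence of the invention.

# Watson-Crick pairing for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the antisense RNA (written 5'->3') complementary to an mRNA.

    DNA-style input (T instead of U) is accepted and converted to RNA.
    """
    rna = mrna.upper().replace("T", "U")
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {sorted(unexpected)}")
    # Complement every base, then reverse so the result reads 5'->3'.
    return rna.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    target = "AUGGCUUCAGGA"  # hypothetical target mRNA fragment
    print("target mRNA   5'-" + target + "-3'")
    print("antisense RNA 5'-" + antisense_rna(target) + "-3'")
```

Applying the function twice returns the original sequence, which is a convenient check that the antisense strand is fully complementary to its target.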
[0213] Autonomous MCs or recombinant chromosomes may comprise exogenous DNA flanked by recombination sites, for example lox-P sites, that can be recognized by a recombinase, e.g. Cre, and removed from the MC or recombinant chromosome. In cases where there is a homologous recombination site or sites in the host genomic DNA, the exogenous DNA excised from the MC or recombinant chromosome may be integrated into the genome at one of the specific recombination sites and the DNA flanked by the recombination sites will become integrated into the host DNA. The use of a MC or recombinant chromosome as a platform for DNA excision or for launching such DNA integration into the host genome may include in vivo induction of the expression of a recombinase encoded in the genomic DNA of a transgenic host, or in a MC or recombinant chromosome.
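The excision of DNA flanked by recombination sites may be pictured, purely as a sketch, with a simple string model in Python. The 34 base pair site used below is the commonly cited lox-P sequence and should be treated as an assumption of this example, as should the function name and the toy flanking and cargo sequences; the model ignores site orientation and recombinase kinetics and only illustrates that recombination between two directly repeated sites leaves one site behind and releases the intervening DNA.

```python
# Illustrative sketch only: string model of recombinase-mediated excision.
# LOXP is the commonly cited 34-bp lox-P site (two 13-bp inverted repeats
# around an 8-bp spacer); it is an assumption of this example.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def excise_floxed(chromosome: str, site: str = LOXP):
    """Model excision of the DNA between the first two copies of `site`.

    Returns (remaining, excised). `remaining` keeps a single site; `excised`
    is the released circle written linearly as cargo plus one site.
    If fewer than two sites are present, nothing is excised.
    """
    first = chromosome.find(site)
    if first == -1:
        return chromosome, None
    second = chromosome.find(site, first + len(site))
    if second == -1:
        return chromosome, None
    after_first = first + len(site)
    after_second = second + len(site)
    remaining = chromosome[:after_first] + chromosome[after_second:]
    excised = chromosome[after_first:after_second]
    return remaining, excised


if __name__ == "__main__":
    # Hypothetical MC fragment: flank + site + cargo + site + flank.
    mc = "CCCC" + LOXP + "GATTACA" + LOXP + "TTTT"
    remaining, excised = excise_floxed(mc)
    print("remaining:", remaining)
    print("excised:  ", excised)
```

The excised product retains one recombination site, which is what would permit its subsequent integration at a matching site in the host genome as described above.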
[0214] RNAi gene suppression in plants by transcription of a dsRNA is described in U.S. Pat. No. 6,506,559, US patent application Publication No. 2002/0168707, WO 98/53083, WO 99/53050 and WO 99/61631, all of which are incorporated herein by reference. The double-stranded RNA or RNAi constructs can trigger the sequence-specific degradation of the target messenger RNA. Suppression of a gene by RNAi can be achieved using a recombinant DNA construct having a promoter operably linked to a DNA element comprising a sense and anti-sense element of a segment of genomic DNA of the gene, e.g., a segment of at least about 23 nucleotides, more preferably about 50 to 200 nucleotides where the sense and anti-sense DNA components can be directly linked or joined by an intron or artificial DNA segment that can form a loop when the transcribed RNA hybridizes to form a hairpin structure.
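As a non-limiting sketch, the arrangement of the sense element, loop-forming segment and anti-sense element in such a construct can be expressed in Python as follows. The function names, the example gene fragment and the loop sequence are hypothetical, and the minimum length of 23 nucleotides is taken from the approximate lower bound stated above; a real design would also consider specificity and secondary structure, which this sketch does not.

```python
# Illustrative sketch only: assemble the DNA element of a hairpin RNAi
# construct as sense segment + loop + anti-sense segment.
# All sequences below are made-up placeholders.

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
MIN_SEGMENT_LENGTH = 23  # "at least about 23 nucleotides" per the text


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def build_hairpin_element(gene: str, start: int, length: int, loop: str) -> str:
    """Return sense + loop + anti-sense for a segment of `gene`.

    When transcribed, the anti-sense arm can pair with the sense arm,
    with `loop` (for example an intron or artificial segment) forming the loop.
    """
    if length < MIN_SEGMENT_LENGTH:
        raise ValueError(f"segment should be at least {MIN_SEGMENT_LENGTH} nt")
    sense = gene.upper()[start:start + length]
    if len(sense) < length:
        raise ValueError("requested segment runs past the end of the gene")
    return sense + loop.upper() + reverse_complement(sense)


if __name__ == "__main__":
    gene = "ATGGCTTCAGGATCCAAGCTTGGTACCGAGCTCGGATCC"  # hypothetical fragment
    loop = "TTCAAGAGA"  # hypothetical loop-forming spacer
    print(build_hairpin_element(gene, start=0, length=23, loop=loop))
```

In this arrangement the two arms are exact reverse complements, so the transcribed RNA can fold back on itself to form the double-stranded stem.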
[0215] Catalytic RNA molecules or ribozymes can also be used to inhibit expression of the target gene or genes or facilitate molecular reactions. Ribozymes are targeted to a given sequence by hybridization of sequences within the ribozyme to the target mRNA. Two stretches of homology are required for this targeting, and these stretches of homologous sequences flank the catalytic ribozyme structure. It is possible to design ribozymes that specifically pair with virtually any target mRNA and cleave the target mRNA at a specific location, thereby inactivating it. A number of classes of ribozymes have been identified. One class of ribozymes is derived from a number of small circular RNAs that are capable of self-cleavage and replication in plants. The RNAs replicate either alone (viroid RNAs) or with a helper virus (satellite RNAs). Examples include Tobacco Ringspot Virus (Prody et al., Science, 231:1577-1580, 1986), Avocado Sunblotch Viroid (Palukaitis et al., Virology, 99:145-151, 1979; Symons, Nucl. Acids Res., 9:6527-6537, 1981), and Lucerne Transient Streak Virus (Forster and Symons, Cell, 49:211-220, 1987), and the satellite RNAs from velvet tobacco mottle virus, Solanum nodiflorum mottle virus and subterranean clover mottle virus. The design and use of target RNA-specific ribozymes is described in Haseloff, et al., Nature 334:585-591 (1988). Several different ribozyme motifs have been described with RNA cleavage activity (Symons, Annu. Rev. Biochem., 61:641-671, 1992). Other suitable ribozymes include sequences from RNase P with RNA cleavage activity (Yuan et al., PNAS USA, 89:8006-8010, 1992; Yuan and Altman, Science, 263:1269-1273, 1994; U.S. Pat. Nos. 5,168,053 and 5,624,824), hairpin ribozyme structures (Berzal-Herranz et al., Genes and Devel., 6:129-134, 1992; Chowrira et al., J. Biol. Chem., 269:25856-25864, 1994) and Hepatitis Delta virus based ribozymes (U.S. Pat. No. 5,625,047). 
The general design and optimization of ribozyme directed RNA cleavage activity has been discussed in detail (Haseloff and Gerlach, Nature 334:585-91, 1988; Chowrira et al., J. Biol. Chem., 269:25856-25864, 1994).
[0216] Another method of reducing protein expression utilizes the phenomenon of cosuppression or gene silencing (for example, U.S. Pat. Nos. 6,063,947, 5,686,649, or 5,283,184; each of which is incorporated herein by reference). Cosuppression of an endogenous gene using a full-length cDNA sequence as well as a partial cDNA sequence are known (for example, Napoli et al., Plant Cell 2:279-289, 1990; van der Krol et al., Plant Cell 2:291-299, 1990; Smith et al., Mol. Gen. Genetics 224:477-481, 1990). The phenomenon of cosuppression has also been used to inhibit plant target genes in a tissue-specific manner.
[0217] In some embodiments, nucleic acids from one species of plant are expressed in another species of plant to effect cosuppression of a homologous gene. The introduced sequence generally will be substantially identical to the endogenous sequence intended to be repressed, for example, about 65%, 80%, 85%, 90%, or preferably 95% or greater identical. Higher identity may result in a more effective repression of expression of the endogenous sequence. A higher identity in a shorter than full length sequence compensates for a longer, less identical sequence. Furthermore, the introduced sequence need not have the same intron or exon pattern, and identity of non-coding segments will be equally effective. Generally, where inhibition of expression is desired, some transcription of the introduced sequence occurs. The effect may occur where the introduced sequence contains no coding sequence per se, but only intron or untranslated sequences homologous to sequences present in the primary transcript of the endogenous sequence.
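For illustration, the percent identity between an introduced sequence and the endogenous sequence may be computed as sketched below in Python. This sketch assumes the two sequences have already been aligned to equal length without gaps, which is a simplification; the function names and example sequences are hypothetical, and the thresholds simply restate the identity levels mentioned above.

```python
# Illustrative sketch only: ungapped percent identity between two
# pre-aligned sequences of equal length. Example sequences are made up.

IDENTITY_LEVELS = (65, 80, 85, 90, 95)  # identity levels named in the text


def percent_identity(introduced: str, endogenous: str) -> float:
    """Return the percentage of aligned positions carrying the same base."""
    if not introduced or len(introduced) != len(endogenous):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(introduced.upper(), endogenous.upper()))
    return 100.0 * matches / len(introduced)


def highest_level_met(identity: float):
    """Return the highest listed identity level met, or None if below all."""
    met = [level for level in IDENTITY_LEVELS if identity >= level]
    return max(met) if met else None


if __name__ == "__main__":
    endogenous = "ATGGCTTCAGGATCCAAGCT"  # hypothetical endogenous fragment
    introduced = "ATGGCTTCAGGATCCAAGCA"  # hypothetical introduced fragment
    identity = percent_identity(introduced, endogenous)
    print(f"identity: {identity:.1f}%, level met: {highest_level_met(identity)}")
```

For sequences of unequal length or containing insertions and deletions, a proper alignment step would be needed before identity is calculated.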
[0218] Yet another method of reducing protein activity is by expressing nucleic acid ligands, so-called aptamers, which specifically bind to the protein. Aptamers may be obtained by the SELEX (Systematic Evolution of Ligands by EXponential Enrichment) method. See U.S. Pat. No. 5,270,163, incorporated herein by reference. In the SELEX method, a candidate mixture of single stranded nucleic acids having regions of randomized sequence is contacted with the protein and those nucleic acids having an increased affinity to the target are selected and amplified. After several iterations a nucleic acid with optimal affinity to the polypeptide is obtained and is used for expression in modified plants.
[0219] A zinc finger protein that binds a polypeptide-encoding sequence or its regulatory region is also used to alter expression of the nucleotide sequence. Transcription of the nucleotide sequence may be reduced or increased. Zinc finger proteins are, for example, described in Beerli et al. (1998) PNAS USA 95:14628-14633, or in WO 95/19431, WO 98/54311, or WO 96/06166, all incorporated herein by reference.
[0220] Other examples of non-protein expressing sequences specifically envisioned for use with the invention include: tRNA sequences, for example, to alter codon usage; rRNA variants, for example, which may confer resistance to various agents such as antibiotics.
[0221] It is contemplated that unexpressed DNA sequences, including novel synthetic sequences, could be introduced into cells as proprietary "labels" of those cells and plants and seeds thereof. It would not be necessary for a label DNA element to disrupt the function of a gene endogenous to the host organism, as the sole function of this DNA would be to identify the origin of the organism. For example, one could introduce a unique DNA sequence into a plant and this DNA element would identify all cells, plants, and progeny of these cells as having arisen from that labeled source. It is proposed that inclusion of label DNAs would enable one to distinguish proprietary germplasm or germplasm derived from such, from unlabelled germplasm.
Exemplary Plant Promoters, Regulatory Sequences and Targeting Sequences
[0222] Exemplary classes of plant promoters are described below.
[0223] Constitutive Expression promoters: Exemplary constitutive expression promoters include the ubiquitin promoter (e.g., sunflower--Binet et al. Plant Science 79: 87-94, 1991; maize--Christensen et al. Plant Molec. Biol. 12: 619-632, 1989; and Arabidopsis--Callis et al., J. Biol. Chem. 265: 12486-12493, 1990; and Norris et al., Plant Mol. Biol. 21: 895-906, 1993); the CaMV 35S promoter (U.S. Pat. Nos. 5,858,742 and 5,322,938); or the actin promoter (e.g., rice--U.S. Pat. No. 5,641,876; McElroy et al. Plant Cell 2: 163-171, 1990; McElroy et al. Mol. Gen. Genet. 231: 150-160, 1991, and Chibbar et al. Plant Cell Rep. 12: 506-509, 1993). Exemplary promoters for use in sorghum include the seed-specific SBEIIb promoter (Mutisya J, et al. J Plant Physiol. 163:770-80, 2005). Other promoters that may be useful in sorghum include the maize polyubiquitin 1 (Mubi-1) and the Sugarcane polyubiquitin 9 (SCubi9) promoters (Wang ML, et al. Transgenic Res. 14:167-78, 2005); and the Sugarcane polyubiquitin 4 (ubi4) promoter (Wei H, et al. J Plant Physiol. 160:1241-51, 2003).
[0224] Inducible Expression promoters: Exemplary inducible expression promoters include the chemically regulatable tobacco PR-1 promoter (e.g., tobacco--U.S. Pat. No. 5,614,395; Arabidopsis--Lebel et al., Plant J. 16: 223-233, 1998; maize--U.S. Pat. No. 6,429,362). Various chemical regulators may be employed to induce expression, including the benzothiadiazole, isonicotinic acid, and salicylic acid compounds disclosed in U.S. Pat. Nos. 5,523,311 and 5,614,395. Other promoters inducible by certain alcohols or ketones, such as ethanol, include, for example, the alcA gene promoter from Aspergillus nidulans (Caddick et al. Nat. Biotechnol 16:177-180, 1998). A glucocorticoid-mediated induction system is described in Aoyama and Chua (Plant Journal 11: 605-612, 1997) wherein gene expression is induced by application of a glucocorticoid, for example dexamethasone. Another class of useful promoters is water-deficit-inducible promoters, e.g. promoters which are derived from the 5' regulatory region of genes identified as a heat shock protein 17.5 gene (HSP 17.5), an HVA22 gene (HVA22), and a cinnamic acid 4-hydroxylase (CA4H) gene of Zea mays. Another water-deficit-inducible promoter is derived from the rab-17 promoter as disclosed by Vilardell et al. (Plant Molec. Biol. 17:985-993, 1990). See also U.S. Pat. No. 6,084,089 which discloses cold inducible promoters, U.S. Pat. No. 6,294,714 which discloses light inducible promoters, U.S. Pat. No. 6,140,078 which discloses salt inducible promoters, U.S. Pat. No. 6,252,138 which discloses pathogen inducible promoters, and U.S. Pat. No. 6,175,060 which discloses phosphorus deficiency inducible promoters.
[0225] As another example, numerous wound-inducible promoters have been described (e.g. Xu et al. Plant Molec. Biol. 22: 573-588, 1993; Logemann et al., Plant Cell 1: 151-158, 1989; Rohrmeier & Lehle, Plant Molec. Biol. 22: 783-792, 1993; Firek et al. Plant Molec. Biol. 22: 129-142, 1993; Warner et al. Plant J. 3: 191-201, 1993). Logemann et al. describe 5' upstream sequences of the potato wun1 gene. Xu et al. show that a wound-inducible promoter from the dicotyledon potato (pin2) is active in the monocotyledon rice. Rohrmeier & Lehle describe maize Wip1 cDNA which is wound induced and which can be used to isolate the cognate promoter. Firek et al. and Warner et al. have described a wound-induced gene from the monocotyledon Asparagus officinalis, which is expressed at local wound and pathogen invasion sites.
[0226] Tissue-Specific Promoters: Exemplary promoters that express genes only in certain tissues are useful according to the present invention. For example root specific expression may be attained using the promoter of the maize metallothionein-like (MTL) gene described by de Framond (FEBS 290: 103-106, 1991) and also in U.S. Pat. No. 5,466,785, incorporated herein by reference. U.S. Pat. No. 5,837,848 discloses a root specific promoter. Another exemplary promoter confers pith-preferred expression (see WO 93/07278, herein incorporated by reference, which describes the maize trpA gene and promoter that is preferentially expressed in pith cells). Leaf-specific expression may be attained, for example, by using the promoter for a maize gene encoding phosphoenolpyruvate carboxylase (PEPC) (see Hudspeth & Grula, Plant Molec Biol 12: 579-589 (1989)). Pollen-specific expression may be conferred by the promoter for the maize calcium-dependent protein kinase (CDPK) gene which is expressed in pollen cells (WO 93/07278). US Pat. Appl. Pub. No. 20040016025 describes tissue-specific promoters. Pollen-specific expression may also be conferred by the tomato LAT52 pollen-specific promoter (Bate et al., Plant Mol. Biol. 37:859-69, 1998).
[0227] See also U.S. Pat. No. 6,437,217 which discloses a root-specific maize RS81 promoter, U.S. Pat. No. 6,426,446 which discloses a root specific maize RS324 promoter, U.S. Pat. No. 6,232,526 which discloses a constitutive maize A3 promoter, U.S. Pat. No. 6,177,611 which discloses constitutive maize promoters, U.S. Pat. No. 6,433,252 which discloses a maize L3 oleosin promoter that is aleurone and seed coat-specific, U.S. Pat. No. 6,429,357 which discloses a constitutive rice actin 2 promoter and intron, and US patent application Pub. No. 20040216189 which discloses an inducible constitutive leaf specific maize chloroplast aldolase promoter.
[0228] Optionally a plant transcriptional terminator can be used in place of the plant-expressed gene native transcriptional terminator. Exemplary transcriptional terminators are those that are known to function in plants and include the CaMV 35S terminator, the tml terminator, the nopaline synthase terminator and the pea rbcS E9 terminator. These can be used in both monocotyledons and dicotyledons.
[0229] Various intron sequences have been shown to enhance expression, particularly in monocotyledonous cells. For example, the introns of the maize Adh1 gene have been found to significantly enhance expression. Intron 1 was found to be particularly effective and enhanced expression in fusion constructs with the chloramphenicol acetyltransferase gene (Callis et al., Genes Develop. 1: 1183-1200, 1987). The intron from the maize bronze1 gene also enhances expression. Intron sequences have been routinely incorporated into plant transformation vectors, typically within the non-translated leader. US Patent Application Publication 2002/0192813 discloses 5', 3' and intron elements useful in the design of effective plant expression vectors.
[0230] A number of non-translated leader sequences derived from viruses are also known to enhance expression, and these are particularly effective in dicotyledonous cells. Specifically, leader sequences from Tobacco Mosaic Virus (TMV, the "omega-sequence"), Maize Chlorotic Mottle Virus (MCMV), and Alfalfa Mosaic Virus (AMV) have been shown to be effective in enhancing expression (e.g. Gallie et al. Nucl. Acids Res. 15: 8693-8711, 1987; Skuzeski et al. Plant Molec. Biol. 15: 65-79, 1990). Other leader sequences known in the art include but are not limited to: picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5' noncoding region) (Elroy-Stein, O., et al. PNAS USA 86:6126-6130, 1989); potyvirus leaders, for example, TEV leader (Tobacco Etch Virus) (Allison et al., Virology 154:9-20, 1986); MDMV leader (Maize Dwarf Mosaic Virus) (Virology 154:9-20); human immunoglobulin heavy-chain binding protein (BiP) leader (Macejak, D. G., and Sarnow, P., Nature 353: 90-94, 1991); untranslated leader from the coat protein mRNA of alfalfa mosaic virus (AMV RNA 4) (Jobling, S. A., and Gehrke, L., Nature 325:622-625, 1987); tobacco mosaic virus leader (TMV) (Gallie et al., Molecular Biology of RNA, pages 237-256, 1989); or Maize Chlorotic Mottle Virus leader (MCMV) (Lommel et al., Virology 81:382-385, 1991). See also, Della-Cioppa et al., Plant Physiology 84:965-968 (1987).
[0231] A minimal promoter may also be incorporated. Such a promoter has low background activity in plants when there is no transactivator present or when enhancer or response element binding sites are absent. One exemplary minimal promoter is the Bz1 minimal promoter, which is obtained from the bronze1 gene of maize (Roth et al., Plant Cell 3: 317, 1991). A minimal promoter may also be created by use of a synthetic TATA element. The TATA element allows recognition of the promoter by RNA polymerase factors and confers a basal level of gene expression in the absence of activation (see generally, Mukumoto, Plant Mol Biol 23: 995-1003, 1993; Green, Trends Biochem Sci 25: 59-63, 2000).
[0232] Sequences controlling the targeting of gene products also may be included. For example, the targeting of gene products to the chloroplast is controlled by a signal sequence found at the amino terminal end of various proteins which is cleaved during chloroplast import to yield the mature protein (e.g. Comai et al., J. Biol. Chem. 263: 15104-15109, 1988). These signal sequences can be fused to heterologous gene products to effect the import of heterologous products into the chloroplast (van den Broeck, et al. Nature 313: 358-363, 1985). DNA encoding appropriate signal sequences can be isolated from the 5' end of the cDNAs encoding the RUBISCO protein, the CAB protein, the EPSP synthase enzyme, the GS2 protein or many other proteins which are known to be chloroplast localized. Other gene products are localized to other organelles such as the mitochondrion and the peroxisome (e.g. Unger et al. Plant Molec. Biol. 13: 411-418, 1989). Examples of sequences that target to such organelles are the nuclear-encoded ATPases or specific aspartate amino transferase isoforms for mitochondria. Targeting cellular protein bodies has been described by Rogers et al. (PNAS USA 82: 6512-6516, 1985). In addition, amino terminal and carboxy-terminal sequences are responsible for targeting to the ER, the apoplast, and extracellular secretion from aleurone cells (Koehler & Ho, Plant Cell 2: 769-783, 1990). Additionally, amino terminal sequences in conjunction with carboxy terminal sequences are responsible for vacuolar targeting of gene products (Shinshi et al. Plant Molec. Biol. 14: 357-368, 1990).
[0233] Another possible element which may be introduced is a matrix attachment region element (MAR), such as the chicken lysozyme A element (Stief et al. Nature 341:343-5, 1989), which can be positioned around an expressible gene of interest to effect an increase in overall expression of the gene and diminish position dependent effects upon incorporation into the plant genome (Stief et al., Nature, 341:343, 1989; Phi-Van et al., Mol. Cell. Biol., 10:2302-230, 1990).
Use of Non-Plant Promoter Regions Isolated from Drosophila melanogaster and Saccharomyces cerevisiae to Express Genes in Plants
[0234] The promoter in the sorghum MC or recombinant chromosome of the present invention can be derived from plant or non-plant species. In one embodiment, the nucleotide sequence of the promoter is derived from non-plant species for the expression of genes in plant cells, including but not limited to dicotyledonous plant cells such as tobacco, tomato, potato, soybean, canola, sunflower, alfalfa, cotton and Arabidopsis, or monocotyledonous plant cells, such as wheat, maize, rye, rice, turf grass, oat, barley, sorghum, sugarcane and millet. In one embodiment, the non-plant promoters are constitutive or inducible promoters derived from an insect, e.g., Drosophila melanogaster, or a yeast, e.g., Saccharomyces cerevisiae. Table 2 lists the promoters from Drosophila melanogaster and Saccharomyces cerevisiae that are used to derive the examples of non-plant promoters in the present invention. Promoters derived from any animal, protist, or fungus are also contemplated. SEQ ID NOs:1-20, or fragments, mutants, hybrid or tandem promoters thereof, are examples of promoter sequences derived from Drosophila melanogaster or Saccharomyces cerevisiae. These non-plant promoters can be operably linked to nucleic acid sequences encoding polypeptides or non-protein-expressing sequences including, but not limited to, antisense RNA and ribozymes, to form nucleic acid constructs, vectors, and host cells (prokaryotic or eukaryotic), comprising the promoters.
TABLE-US-00002 TABLE 2 Exemplary Promoters from D. melanogaster and S. cerevisiae

Drosophila melanogaster promoters (adapted from the Drosophila FlyBase, referenced in Grumbling and Strelets, Nucl. Acids Res. 34:D484-8, 2006)

SEQ ID NO | Symbol | FlyBase ID | Standard promoter gene name | Gene product | Chromosome
1 | Pgd | FBgn0004654 | Phosphogluconate dehydrogenase | 6-phosphogluconate dehydrogenase | X
2 | Grim | FBgn0015946 | grim | grim-P138 | 3
3 | Uro | FBgn0003961 | Urate oxidase | Uro-P1 | 2
4 | Sna | FBgn0003448 | Snail | sna-P1 | 2
5 | Rh3 | FBgn0003249 | Rhodopsin 3 | Rh3 | 3
6 | Lsp-1 γ | FBgn0002564 | Larval serum protein 1 | Lsp1γ-P1 | 3

Saccharomyces cerevisiae promoters (adapted from information available from the Saccharomyces Genome Database, referenced in Dwight SS et al. Brief Bioinform. 5:9-22, 2004)

SEQ ID NO | Symbol | Systematic Name | Standard promoter gene name | Gene product | Chromosome
7 | Tef-2 | YBR118W | TEF2 (Translation elongation factor promoter) | Translation elongation factor EF-1 alpha | 2
8 | Leu-1 | YGL009C | LEU1 (LEUcine biosynthesis) | isopropylmalate isomerase | 7
9 | Met16 | YPR167C | METhionine requiring | 3'phosphoadenylylsulfate reductase | 16
10 | Leu-2 | YCL018W | LEU2 (leucine biosynthesis) | beta-IPM (isopropylmalate) dehydrogenase | 3
11 | His-4 | YCL030C | HIS4 (HIStidine requiring) | histidinol dehydrogenase | 3
12 | Met-2 | YNL277W | MET2 (methionine requiring) | L-homoserine-O-acetyltransferase | 14
13 | Ste-3 | YKL178C | STE3 (alias DAF2 Sterile) | a-factor receptor | 11
14 | Arg-1 | YOL058W | ARG1 (alias ARG10 ARGinine requiring) | argininosuccinate synthetase | 15
15 | Pgk-1 | YCR012W | PGK1 (phosphoglycerate kinase) | phosphoglycerate kinase | 3
16 | GPD-1 | YDL022W | GPD1 (alias DAR1/HOR1/OSG1/OSR glycerol-3-phosphate dehydrogenase activity) | glycerol-3-phosphate dehydrogenase | 4
17 | ADH1 | YOL086C | ADH1 (alias ADC1) | alcohol dehydrogenase | 15
18 | GPD-2 | YOL059W | GPD2 (alias GPD3: glycerol-3-phosphate dehydrogenase activity) | glycerol-3-phosphate dehydrogenase | 15
19 | Arg-4 | YHR018C | ARGinine requiring | argininosuccinate lyase | 8
20 | Yat-1 | YAR035W | YAT-1 (carnitine acetyltransferase) | carnitine acetyltransferase | 1

indicates data missing or illegible when filed
[0235] In the MCs or recombinant chromosome of the present invention, the promoter may be a mutant of the promoters having a substitution, deletion, and/or insertion of one or more nucleotides in the nucleic acid sequence of SEQ ID NOs:1 to 20, hybrid or tandem promoters.
[0236] The techniques used to isolate or clone a nucleic acid sequence comprising a promoter of interest are known in the art and include isolation from genomic DNA. The cloning procedures may involve excision or amplification, for example by polymerase chain reaction, and isolation of a desired nucleic acid fragment comprising the nucleic acid sequence encoding the promoter, insertion of the fragment into a vector molecule, and incorporation of the recombinant vector into the plant cell.
Definitions
[0237] The term "adchromosomal" plant or plant part means a plant or plant part that contains functional, stable and autonomous MCs. Adchromosomal plants or plant parts may be chimeric or not chimeric (chimeric meaning that MCs are only in certain portions of the plant, and are not uniformly distributed throughout the plant). An adchromosomal plant cell contains at least one functional, stable and autonomous MC.
[0238] The term "autonomous" means that when delivered to plant cells, at least some MCs are transmitted through mitotic division to daughter cells and are episomal in the daughter plant cells, i.e. are not chromosomally integrated in the daughter plant cells. Daughter plant cells that contain autonomous MCs can be selected for further replication using, for example, selectable or screenable markers. During the introduction into a cell of a MC, or during subsequent stages of the cell cycle, there may be chromosomal integration of some portion or all of the DNA derived from a MC in some cells. The MC is still characterized as autonomous despite the occurrence of such events if a plant may be regenerated that contains episomal descendants of the MC distributed throughout its parts, or if gametes or progeny can be derived from the plant that contain episomal descendants of the MC distributed through its parts.
[0239] A "centromere" is any DNA sequence that confers an ability to segregate to daughter cells through cell division. In one context, this sequence may produce a transmission efficiency to daughter cells ranging from about 1% to about 100%, including to about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or about 95% of daughter cells. Variations in transmission efficiency may find important applications within the scope of the invention; for example, MCs carrying centromeres that confer 100% stability could be maintained in all daughter cells without selection, while those that confer 1% stability could be temporarily introduced into a transgenic organism, but be eliminated when desired. In particular embodiments of the invention, the centromere may confer stable transmission to daughter cells of a nucleic acid sequence, including a recombinant construct comprising the centromere, through mitotic or meiotic divisions, including through both mitotic and meiotic divisions. A plant centromere is not necessarily derived from plants, but has the ability to promote DNA transmission to daughter plant cells.
[0240] The term "circular permutations" refers to variants of a sequence that begin at base n within the sequence, proceed to the end of the sequence, resume with base number one of the sequence, and proceed to base n-1. For this analysis, n may be any number less than or equal to the length of the sequence. For example, circular permutations of the sequence ABCD are: ABCD, BCDA, CDAB, and DABC.
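The definition of circular permutations can be illustrated with a minimal Python sketch. The function name circular_permutations is hypothetical and chosen only for illustration; the sketch simply rotates a string and is not taken from the application.

```python
def circular_permutations(seq: str) -> list[str]:
    """Return every rotation of seq: start at base n, run to the end,
    then resume at base 1 and stop at base n-1 (for n = 1..len(seq))."""
    return [seq[i:] + seq[:i] for i in range(len(seq))]


if __name__ == "__main__":
    # Reproduces the ABCD example given in the definition.
    print(circular_permutations("ABCD"))  # ['ABCD', 'BCDA', 'CDAB', 'DABC']
```

A sequence of length L yields L rotations, so the same loop applies unchanged to a centromere repeat unit of any length.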
[0241] The term "co-delivery" refers to the delivery of two nucleic acid segments to a cell. In co-delivery of plant growth inducing genes and MCs, the two nucleic acid segments are delivered simultaneously using the same delivery method. Alternatively, the nucleic acid segment containing the growth inducing gene, optionally as part of an episomal vector, such as a viral vector or a plasmid vector, may be delivered to the plant cells before or after delivery of the MC, and the MC may carry an exogenous nucleic acid that induces expression of the earlier-delivered growth inducing gene. In this embodiment, the two nucleic acid segments may be delivered separately at different times provided the encoded growth inducing factors are functional during the appropriate time period.
[0242] The term "coding sequence" is defined herein as a nucleic acid sequence that is transcribed into mRNA which is translated into a polypeptide when placed under the control of promoter sequences. The boundaries of the coding sequence are generally determined by the ATG start codon located at the start of the open reading frame, near the 5' end of the mRNA, and TAG, TGA or TAA stop codons at the end of the coding sequence, near the 3' end of the mRNA, and in some cases, a transcription terminator sequence located just downstream of the open reading frame at the 3' end of the mRNA. A coding sequence can include, but is not limited to, genomic DNA, cDNA, semisynthetic, synthetic, or recombinant nucleic acid sequences.
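As a rough illustration of how the ATG start codon and the TAG, TGA or TAA stop codons bound a coding sequence, the following Python sketch scans a sense-strand DNA string for the first complete open reading frame. The function name find_coding_sequence and the example sequence are hypothetical; the sketch ignores introns, the opposite strand and terminator sequences, so it is a simplification rather than a gene-finding method.

```python
STOP_CODONS = {"TAG", "TGA", "TAA"}


def find_coding_sequence(dna: str):
    """Return (start, end) of the first ATG-initiated open reading frame.

    start is the 0-based index of the A in ATG; end is exclusive and
    includes the stop codon. Returns None if no complete frame exists.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this ATG.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop codon: try the next ATG.
        start = dna.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    example = "CCATGGCTTGACC"  # made-up sequence for illustration
    bounds = find_coding_sequence(example)
    print(bounds, example[bounds[0]:bounds[1]])  # (2, 11) ATGGCTTGA
```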
[0243] The term "consensus" refers to a nucleic acid sequence derived by comparing two or more related sequences. A consensus sequence defines both the conserved and variable sites between the sequences being compared. Any one of the sequences used to derive the consensus or any permutation defined by the consensus may be useful in construction of MCs.
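One simple way to derive a consensus is a majority-rule vote at each position of sequences that have already been aligned to equal length. The Python sketch below assumes that approach; the function name consensus, the majority rule and the example sequences are illustrative assumptions, and the application does not specify how its consensus sequences were computed.

```python
from collections import Counter


def consensus(aligned: list[str]) -> tuple[str, list[int]]:
    """Majority-rule consensus of pre-aligned, equal-length sequences.

    Returns the consensus string and the 0-based indices of variable
    sites (positions where the input sequences disagree).
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be pre-aligned to equal length")
    bases, variable = [], []
    for i, column in enumerate(zip(*aligned)):
        counts = Counter(column)
        bases.append(counts.most_common(1)[0][0])  # most frequent base
        if len(counts) > 1:
            variable.append(i)  # more than one base seen at this site
    return "".join(bases), variable


if __name__ == "__main__":
    print(consensus(["ACGT", "ACGA", "ACCT"]))  # ('ACGT', [2, 3])
```

Positions absent from the returned index list are the conserved sites; the listed positions are the variable sites that a consensus-defined permutation could vary.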
[0244] The term "exogenous" when used in reference to a nucleic acid, for example, is intended to refer to any nucleic acid that has been introduced into a recipient cell, regardless of whether the same or similar nucleic acid is already present in such a cell. Thus, as an example, "exogenous DNA" can include an additional copy of DNA that is already present in the plant cell, DNA from another plant, DNA from a different organism, or a DNA generated externally, such as a DNA sequence containing an antisense message of a gene, or a DNA sequence encoding a synthetic or modified version of a gene. An "exogenous gene" can be a gene not normally found in the host genome in an identical context, or an extra copy of a host gene. The gene may be isolated from a different species than that of the host genome, or alternatively, isolated from the host genome but operably linked to one or more regulatory regions which differ from those found in the unaltered, native gene.
[0245] The term "functional" to describe a MC means that when an exogenous nucleic acid is present within the MC the exogenous nucleic acid can function in a detectable manner when the MC is within a plant cell; exemplary functions of the exogenous nucleic acid include transcription of the exogenous nucleic acid, expression of the exogenous nucleic acid, regulatory control of expression of other exogenous nucleic acids, recognition by a restriction enzyme or other endonuclease, ribozyme or recombinase; providing a substrate for DNA methylation, DNA glycolation or other DNA chemical modification; binding to proteins such as histones, helix-loop-helix proteins, zinc binding proteins, leucine zipper proteins, MADS box proteins, topoisomerases, helicases, transposases, TATA box binding proteins, viral protein, reverse transcriptases, or cohesins; providing an integration site for homologous recombination; providing an integration site for a transposon, T-DNA or retrovirus; providing a substrate for RNAi synthesis; priming of DNA replication; aptamer binding; or kinetochore binding. If multiple exogenous nucleic acids are present within the MC, the function of one or preferably more of the exogenous nucleic acids can be detected under suitable conditions permitting function thereof.
[0246] "Library" is a pool of cloned DNA fragments that represents some or all DNA sequences collected, prepared or purified from a specific source. Each library may contain the DNA of a given organism inserted as discrete restriction enzyme generated fragments or as randomly sheared fragments into many thousands of plasmid vectors. For purposes of the present invention, E. coli, yeast, and Salmonella plasmids are particularly useful for propagating the genome inserts from other organisms. In principle, any gene or sequence present in the starting DNA preparation can be isolated by screening the library with a specific hybridization probe (see, for example, Young et al., In: Eukaryotic Genetic Systems ICN-UCLA Symposia on Molecular and Cellular Biology, VII, 315-331, 1977).
[0247] The term "linker" refers to a DNA molecule, generally up to 50 or 60 nucleotides long and composed of two or more complementary oligonucleotides that have been synthesized chemically, or excised or amplified from existing plasmids or vectors. In a preferred embodiment, this fragment contains one, or preferably more than one, restriction enzyme site for a blunt cutting enzyme and/or a staggered cutting enzyme, such as BamHI. One end of the linker is designed to be ligatable to one end of a linear DNA molecule and the other end is designed to be ligatable to the other end of the linear molecule, or both ends may be designed to be ligatable to both ends of the linear DNA molecule.
[0248] A "MC" is a recombinant DNA construct including a centromere that is capable of transmission to daughter cells. A MC may remain separate from the host genome (as episomes) or may integrate into host chromosomes. The stability of this construct through cell division could range from about 1% to about 100%, including about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% and about 95%. The MC construct may be a circular or linear molecule. It may include elements such as one or more telomeres, origin of replication sequences, stuffer sequences, buffer sequences, chromatin packaging sequences, linkers and genes. The number of such sequences included is only limited by the physical size limitations of the construct itself. It could contain DNA derived from a natural centromere, although it may be preferable to limit the amount of DNA to the minimal amount required to obtain a transmission efficiency in the range of 1-100%. The MC could also contain a synthetic centromere composed of tandem arrays of repeats of any sequence, either derived from a natural centromere, or of synthetic DNA. The MC could also contain DNA derived from multiple natural centromeres. The MC may be inherited through mitosis or meiosis, or through both meiosis and mitosis. The term MC specifically encompasses and includes the terms "plant artificial chromosome" or "PLAC," or engineered chromosomes or microchromosomes, and all teachings relevant to a PLAC or plant artificial chromosome specifically apply to constructs within the meaning of the term MC.
[0249] The term "non-protein expressing sequence" or "non-protein coding sequence" is defined herein as a nucleic acid sequence that is not eventually translated into protein. The nucleic acid may or may not be transcribed into RNA. Exemplary sequences include ribozymes or antisense RNA.
[0250] The term "operably linked" is defined herein as a configuration in which a control sequence, e.g., a promoter sequence, directs transcription or translation of another sequence, for example a coding sequence. For example, a promoter sequence could be appropriately placed at a position relative to a coding sequence such that the control sequence directs the production of a polypeptide encoded by the coding sequence.
[0251] "Phenotype" or "phenotypic trait(s)", refers to an observable property or set of properties resulting from the expression of a gene. The set of properties may be observed visually or after biological or biochemical testing, and may be constantly present or may only manifest upon challenge with the appropriate stimulus or activation with the appropriate signal.
[0252] The term "plant" refers to any type of plant. Modified plants of the invention include, for example, dicots, gymnosperms, monocots, mosses, ferns, horsetails, club mosses, liverworts, hornworts, red algae, brown algae, gametophytes and sporophytes of pteridophytes, and green algae.
[0253] One modified crop plant of particular interest in the present invention is Sorghum, including but not limited to Sorghum bicolor (primary cultivated species), Sorghum almum, Sorghum amplum, Sorghum angustum, Sorghum arundinaceum, Sorghum brachypodum, Sorghum bulbosum, Sorghum burmahicum, Sorghum controversum, Sorghum drummondii, Sorghum carinatum, Sorghum exstans, Sorghum grande, Sorghum halepense, Sorghum interjectum, Sorghum intrans, Sorghum laxiflorum, Sorghum leiocladum, Sorghum macrospermum, Sorghum matarankense, Sorghum miliaceum, Sorghum nigrum, Sorghum nitidum, Sorghum plumosum, Sorghum propinquum, Sorghum purpureosericeum, Sorghum stipoideum, Sorghum timorense, Sorghum trichocladum, Sorghum versicolor, Sorghum virgatum, and Sorghum vulgare (including but not limited to the variety Sorghum vulgare var. sudanense, also known as sudangrass). Hybrids of these species are also of interest in the present invention, as are hybrids with other members of the Family Poaceae.
[0254] The term "plant part" includes a pod, root, sett root, shoot root, root primordia, shoot, primary shoot, secondary shoot, tassel, panicle, arrow, midrib, blade, ligule, auricle, dewlap, blade joint, sheath, node, internode, bud furrow, leaf scar, cutting, tuber, stem, stalk, fruit, berry, nut, flower, leaf, bark, wood, epidermis, vascular tissue, organ, protoplast, crown, callus culture, petiole, petal, sepal, stamen, stigma, style, bud, meristem, cambium, cortex, pith, sheath, silk, ovule or embryo. Other exemplary Sugarcane plant parts are a meiocyte or gamete or ovule or pollen or endosperm of any of the preceding plants. Other exemplary plant parts are a seed, seed-piece, embryo, protoplast, cell culture, any group of plant cells organized into a structural and functional unit, ratoon or propagule.
[0255] The term "promoter" is a DNA sequence that allows the binding of RNA polymerase (including but not limited to RNA polymerase I, RNA polymerase II and RNA polymerase III from eukaryotes) and directs the polymerase to a downstream transcriptional start site of a nucleic acid sequence encoding a polypeptide to initiate transcription. RNA polymerase effectively catalyzes the assembly of messenger RNA complementary to the appropriate DNA strand of the coding region.
[0256] A "promoter operably linked to a heterologous gene" is a promoter that is operably linked to a gene that is different from the gene to which the promoter is normally operably linked in its native state. Similarly, an "exogenous nucleic acid operably linked to a heterologous regulatory sequence" is a nucleic acid that is operably linked to a regulatory control sequence to which it is not normally linked in its native state.
[0257] The term "recombinant chromosome" refers to an engineered or artificial chromosome that has been constructed by fragmenting a natural chromosome and identifying fragmentation products that are capable of segregation through mitotic and/or meiotic cell divisions. Recombinant chromosomes are distinct from MCs in that they are not constructed in vitro from constituent parts and have not been passaged through a heterologous cell such as a bacterium or fungus (as is commonly used in standard cloning techniques). Recombinant chromosomes may be used as targets for addition of transgene expression cassettes.
[0258] The term "Basic MC" is defined as a recombinant DNA construct that when present within a cell is capable of mitotic and/or meiotic transmission to daughter cells under appropriate conditions and comprises an Assembled Centromere and, optionally, one or more of the following: (a) one or more telomeres; (b) one or more sequences for regulating, maintaining, or imparting topological or chromatin structure, molecular integrity, or stability of gene expression or inheritance in a cell; (c) the required vector DNA that allows for propagation of MC in and DNA that facilitates the selective removal of unwanted portions of MC prior to or after transformation; or (d) a Transgene Expression Cassette, wherein the Transgene Expression Cassette serves only to regulate, maintain, or impart function or stability to a MC in a cell.
[0259] A "Basic MC" does not include a Transgene Expression Cassette that imparts one or more functions other than those expressly set forth in subsection (d), above.
[0260] An "Assembled Centromere" means a polynucleotide sequence having the properties of a Centromere that is assembled from one or more fragments of native Centromere(s) and/or other polynucleotide sequence, which are (i) isolated from a plant cell, and/or based on plant Centromere sequence motifs, (ii) inserted into a plasmid vector that is propagated and maintained in a cell of a heterologous organism, and (iii) delivered back into a plant cell as part of a Basic or Applied MC. An Assembled Centromere may possibly be modified by an endogenous in vivo process after it is delivered into a plant cell such that its sequence now differs from that contained in the parental Basic or Applied MC as propagated in a cell of a heterologous organism. For the avoidance of doubt an Assembled Centromere does not include derivatives or deletions of native Centromeres that are constructed within the plant cell, and are never maintained in their entirety in a cell of a heterologous organism.
[0261] An "Applied MC" means a genetic construct formed by integrating one or more Transgene Expression Cassettes into a Basic MC, wherein said Transgene Expression Cassettes impart one or more functions other than to regulate, maintain, or impart function or stability to a MC.
[0262] The term "hybrid promoter" is defined herein as parts of two or more promoters that are fused together to generate a sequence that is a fusion of the two or more promoters, which is operably linked to a coding sequence and mediates the transcription of the coding sequence into mRNA.
[0263] The term "tandem promoter" is defined herein as two or more promoter sequences each of which is operably linked to a coding sequence and mediates the transcription of the coding sequence into mRNA.
[0264] The term "constitutive active promoter" is defined herein as a promoter that allows permanent stable expression of the gene of interest.
[0265] The term "inducible promoter" is defined herein as a promoter induced by the presence or absence of a biotic or an abiotic factor.
[0266] The term "polypeptide" does not refer to a specific length of the encoded product and, therefore, encompasses peptides, oligopeptides, and proteins. The term "exogenous polypeptide" is defined as a polypeptide which is not native to the plant cell, a native polypeptide in which modifications have been made to alter the native sequence, or a native polypeptide whose expression is quantitatively altered as a result of a manipulation of the plant cell by recombinant DNA techniques.
[0267] The term "pseudogene" refers to a non-functional copy of a protein-coding gene; pseudogenes found in the genomes of eukaryotic organisms are often inactivated by mutations and are thus presumed to be non-essential to that organism; pseudogenes of reverse transcriptase and other open reading frames found in retroelements are abundant in the centromeric regions of Arabidopsis and other organisms and are often present in complex clusters of related sequences.
[0268] The term "regulatory sequence" refers to any DNA sequence that influences the efficiency of transcription or translation of any gene. The term includes, but is not limited to, sequences comprising promoters, enhancers and terminators.
[0269] The term "repeated nucleotide sequence" refers to any nucleic acid sequence of at least 25 bp present in a genome or a recombinant molecule, other than a telomere repeat, that occurs two or more times, the copies preferably being at least 80% identical, in either head to tail or head to head orientation, with or without intervening sequence between repeat units.
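The length and identity thresholds in this definition can be checked with a short Python sketch. It assumes the two repeat units have already been aligned to equal length and counts identical positions; the function names percent_identity and qualifies_as_repeat and the example sequences are hypothetical, and real comparisons would normally rely on an alignment tool rather than a position-by-position count.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def qualifies_as_repeat(a: str, b: str,
                        min_len: int = 25,
                        min_identity: float = 80.0) -> bool:
    """Apply the 25 bp minimum length and the preferred 80% identity."""
    return (len(a) >= min_len and len(b) >= min_len
            and percent_identity(a, b) >= min_identity)


if __name__ == "__main__":
    unit = "ACGTA" * 5                          # 25 nt, made-up repeat unit
    variant = "TCGTA" + "ACGTA" * 3 + "ACGTT"   # two mismatches
    print(percent_identity(unit, variant))      # 92.0
    print(qualifies_as_repeat(unit, variant))   # True
```

The sketch does not model orientation (head to tail versus head to head) or intervening sequence, both of which the definition permits.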
[0270] The term "retroelement" or "retrotransposon" refers to a genetic element related to retroviruses that disperses through an RNA stage; the abundant retroelements present in plant genomes contain long terminal repeats (LTR retrotransposons) and encode a polyprotein gene that is processed into several proteins including a reverse transcriptase. Specific retroelements (complete or partial sequences) can be found in and around plant centromeres and can be present as dispersed copies or complex repeat clusters. Individual copies of retroelements may be truncated or contain mutations; intact retroelements are rarely encountered.
[0271] The term "satellite DNA" refers to short DNA sequences (typically <1000 bp) present in a genome as multiple repeats, mostly arranged in a tandemly repeated fashion, as opposed to a dispersed fashion. Repetitive arrays of specific satellite repeats are abundant in the centromeres of many higher eukaryotic organisms.
[0272] A "screenable marker" is a gene whose presence results in an identifiable phenotype. This phenotype may be observable under standard conditions, altered conditions such as elevated temperature, or in the presence of certain chemicals used to detect the phenotype. The use of a screenable marker allows for the use of lower, sub-killing antibiotic concentrations and the use of a visible marker gene to identify clusters of transformed cells, and then manipulation of these cells to homogeneity. Preferred screenable markers of the present invention include genes that encode fluorescent proteins that are detectable by a visual microscope such as the fluorescent reporter genes DsRed, ZsGreen, ZsYellow, AmCyan, Green Fluorescent Protein (GFP) and modifications of these reporter genes to excite or emit at altered wavelengths. An additional preferred screenable marker gene is lac.
[0273] Alternative methods of screening for modified plant cells may involve use of relatively low, sub-killing concentrations of a selection agent (e.g. sub-killing antibiotic concentrations), and also involve use of a screenable marker (e.g., a visible marker gene) to identify clusters of modified cells carrying the screenable marker, after which these screenable cells are manipulated to homogeneity. A "selectable marker" is a gene whose presence results in a clear phenotype, and most often a growth advantage for cells that contain the marker. This growth advantage may be present under standard conditions, altered conditions such as elevated temperature, specialized media compositions, or in the presence of certain chemicals such as herbicides or antibiotics. Use of selectable markers is described, for example, in Broach et al. (Gene, 8:121-133, 1979). Examples of selectable markers include the thymidine kinase gene, the cellular adenine phosphoribosyltransferase gene and the dihydrofolate reductase gene, hygromycin phosphotransferase genes, the bar gene, neomycin phosphotransferase genes and phosphomannose isomerase, among others. Preferred selectable markers in the present invention include genes whose expression confers antibiotic or herbicide resistance to the host cell, or proteins allowing utilization of a carbon source not normally utilized by plant cells. Expression of one of these markers should be sufficient to enable the survival of those cells that comprise a vector within the host cell, and facilitate the manipulation of the plasmid into new host cells. Of particular interest in the present invention are proteins conferring cellular resistance to kanamycin, G418, paromomycin, hygromycin, bialaphos, and glyphosate for example, or proteins allowing utilization of a carbon source, such as mannose, not normally utilized by plant cells.
[0274] The term "stable" means that the MC can be transmitted to daughter cells over at least 8 mitotic generations. Some embodiments of MCs may be transmitted as functional, autonomous units for less than 8 mitotic generations, e.g. 1, 2, 3, 4, 5, 6, or 7. Preferred MCs can be transmitted over at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 generations, for example, through the regeneration or differentiation of an entire plant, and preferably are transmitted through meiotic division to gametes. Other preferred MCs can be further maintained in the zygote derived from such a gamete or in an embryo or endosperm derived from one or more such gametes. A "functional and stable" MC is one in which functional MCs can be detected after transmission of the MCs over at least 8 mitotic generations, or after inheritance through a meiotic division. During mitotic division, as occurs occasionally with native chromosomes, there may be some non-transmission of MCs; the MC may still be characterized as stable despite the occurrence of such events if an adchromosomal plant that contains descendants of the MC distributed throughout its parts may be regenerated from cells, cuttings, propagules, or cell cultures containing the MC, or if an adchromosomal plant can be identified in progeny of the plant containing the MC.
[0275] A "structural gene" is a sequence which codes for a polypeptide or RNA and includes 5' and 3' ends. The structural gene may be from the host into which the structural gene is transformed or from another species. A structural gene will preferably but not necessarily include one or more regulatory sequences which modulate the expression of the structural gene, such as a promoter, terminator or enhancer. A structural gene will preferably but not necessarily confer some useful phenotype upon an organism comprising the structural gene, for example, herbicide resistance. In one embodiment of the invention, a structural gene may encode an RNA sequence which is not translated into a protein, for example a tRNA or rRNA gene.
[0276] The term "telomere" or "telomere DNA" refers to a sequence capable of capping the ends of a chromosome, thereby preventing degradation of the chromosome end, ensuring replication and preventing fusion to other chromosome sequences. Telomeres can include naturally occurring telomere sequences or synthetic sequences. Telomeres from one species may confer telomere activity in another species. An exemplary telomere DNA is a heptanucleotide telomere repeat TTTAGGG (and its complement) found in the majority of plants.
[0277] "Transformed," "transgenic," "modified," and "recombinant" refer to a host organism such as a plant into which an exogenous or heterologous nucleic acid molecule has been introduced, and includes meiocytes, seeds, zygotes, embryos, endosperm, or progeny of such plant that retain the exogenous or heterologous nucleic acid molecule but which have not themselves been subjected to the transformation process.
[0278] When the phrase "transmission efficiency" of a certain percent is used, transmission percent efficiency is calculated by measuring MC presence through one or more mitotic or meiotic generations. It is directly measured as the ratio (expressed as a percentage) of the daughter cells or plants demonstrating presence of the MC to parental cells or plants demonstrating presence of the MC. Presence of the MC in parental and daughter cells is demonstrated with assays that detect the presence of an exogenous nucleic acid carried on the MC. Exemplary assays can be the detection of a screenable marker (e.g. presence of a fluorescent protein or any gene whose expression results in an observable phenotype), a selectable marker, or PCR amplification of any exogenous nucleic acid carried on the MC.
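The ratio described in this definition can be illustrated with a minimal sketch. The function name and the example counts are hypothetical and are not taken from the specification; the sketch simply applies the stated ratio to counts of MC-positive parental and daughter cells or plants, however those counts were scored (screenable marker, selectable marker, or PCR).

```python
def transmission_efficiency(parents_with_mc: int, daughters_with_mc: int) -> float:
    """Percent ratio of MC-positive daughters to MC-positive parents.

    Both arguments are counts of cells or plants scored as carrying the MC.
    The names are illustrative only.
    """
    if parents_with_mc <= 0:
        raise ValueError("at least one MC-positive parental cell or plant is required")
    if daughters_with_mc < 0:
        raise ValueError("counts cannot be negative")
    return 100.0 * daughters_with_mc / parents_with_mc


# Hypothetical example: 200 MC-positive parents, 180 MC-positive daughters.
example_efficiency = transmission_efficiency(200, 180)
print(f"transmission efficiency: {example_efficiency:.1f}%")
```

How the parental and daughter populations are sampled (for example, whether daughters are counted per parent or per population) is left open here, as it is in the definition.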
[0279] I. Constructing MCs by Site-Specific Recombination
[0280] Plant MCs may be constructed using site-specific recombination sequences (for example those recognized by the bacteriophage P1 Cre recombinase, or the bacteriophage lambda integrase, or similar recombination enzymes). A compatible recombination site, or a pair of such sites, is present on both the centromere-containing DNA clones and the donor DNA clones. Incubation of the donor clone and the centromere clone in the presence of the recombinase enzyme causes strand exchange to occur between the recombination sites in the two plasmids; the resulting MCs contain centromere sequences as well as MC vector sequences. The DNA molecules formed in such recombination reactions are introduced into E. coli, other bacteria, yeast or plant cells by common methods in the field including, but not limited to, heat shock, chemical transformation, electroporation, particle bombardment, whiskers, or other transformation methods followed by selection for marker genes including chemical, enzymatic, color, or other marker present on either parental plasmid, allowing for the selection of transformants harboring MCs.
[0281] II. Methods of Detecting and Characterizing MCs in Plant Cells or of Scoring MC Performance in Plant Cells
Identification of Candidate Centromere Fragments by Probing BAC Libraries
[0282] Centromere clones are identified from a large genomic insert library such as a Bacterial Artificial Chromosome library. Probes are labeled using nick-translation in the presence of radioactively labeled dCTP, dATP, dGTP or dTTP as in, for example, the commercially available REDIPRIME® kit (GE Healthcare; Piscataway, N.J.; USA) as per the manufacturer's instructions. Other labeling methods familiar to those skilled in the art could be substituted. The libraries are screened and deconvoluted. Genomic clones are screened by probing with small centromere-specific clones. Other embodiments of this procedure would involve hybridizing a library with other centromere sequences. Of the BAC clones identified using this procedure, a representative set are identified as having high hybridization signals to some probes, and optionally low hybridization signals to other probes. These are selected, the bacterial clones grown up in cultures and DNA prepared by methods familiar to those skilled in the art such as alkaline lysis. The DNA composition of purified clones is surveyed using for example fingerprinting by digesting with restriction enzymes such as, but not limited to, HinfI or HindIII. In a preferred embodiment the restriction enzyme cuts within the tandem centromere satellite repeat (see below). A variety of clones showing different fingerprints are selected for conversion into MCs and inheritance testing. It can also be informative to use multiple restriction enzymes for fingerprinting or other enzymes which can cleave DNA.
Fingerprinting Analysis of BACs and MCs
[0283] Centromere function may be associated with large tandem arrays of satellite repeats. To assess the composition and architecture of the centromere BACs, the candidate BACs are digested with a restriction enzyme, such as HindIII, which cuts with known frequency within the consensus sequence of the unit repeat of the tandemly repeated centromere satellite. Digestion products are then separated by agarose gel electrophoresis. Large insert clones containing a large array of tandem repeats will produce a strong band of the unit repeat size, as well as less intense bands at 2× and 3× the unit repeat size, and further multiples of the repeat size. These methods are well-known and there are many possible variations known to those skilled in the art.
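The origin of the band ladder can be illustrated with a small sketch. It assumes a simplified model that is not part of the specification: each unit repeat carries at most one restriction site, some repeats have lost the site through sequence divergence, and fragments joining the array to flanking vector sequence are ignored. The unit length of 140 bp and the pattern of intact sites are hypothetical values chosen for illustration.

```python
from collections import Counter


def ladder_from_sites(unit_len: int, site_intact: list[bool]) -> Counter:
    """Count fragment sizes released from a tandem array.

    site_intact[i] is True when repeat i still carries a cleavable site.
    Fragments between consecutive intact sites span a whole number of
    unit repeats; flanking (array-to-vector) fragments are not modeled.
    """
    cut_positions = [i for i, intact in enumerate(site_intact) if intact]
    sizes = [
        (right - left) * unit_len
        for left, right in zip(cut_positions, cut_positions[1:])
    ]
    return Counter(sizes)


# Hypothetical array of ten repeats, three of which have lost the site.
sites = [True, True, True, False, True, True, False, False, True, True]
ladder = ladder_from_sites(140, sites)
for size in sorted(ladder):
    print(f"{size} bp: {ladder[size]} fragment(s)")
```

In this toy array most fragments are one unit long, with fewer at two and three units, which mirrors the strong unit-size band and fainter higher multiples described above.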
Determining Sequence Composition of MCs by Shotgun Cloning/Sequencing, Sequence Analysis
[0284] To determine the sequence composition of the MC, the centromeric region of the MC is sequenced. To generate DNA suitable for sequencing MCs are fragmented, for example by using a random shearing method (such as sonication, nebulization, etc). Other fragmentation techniques may also be used such as enzymatic digestion. These fragments are then cloned into a plasmid vector and sequenced. The resulting DNA sequence is trimmed of poor-quality sequence and of sequence corresponding to the plasmid vector. The sequence is then compared to known DNA sequences using an algorithm such as BLAST to search a sequence database such as GenBank.
[0285] To determine the consensus of the satellite repeat in the MC, the sequences containing satellite repeat are aligned using a DNA sequence alignment program such as CONTIGEXPRESS® from Vector NTI®. The sequences may also be aligned to previously determined repeats for that species. The sequences are trimmed to unit repeat length using the consensus as a template. Sequences trimmed from the ends of the alignment are realigned with the consensus and further trimmed until all sequences are at or below the consensus length. The sequences are then aligned with each other. The consensus is determined by the frequency of a specific nucleotide at each position; if the most frequent base is three times more frequent than the next most frequent base, it is considered the consensus.
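The frequency rule for calling a consensus base can be sketched as follows. This is a minimal illustration, not the alignment software named above. It assumes the repeats are already aligned and trimmed to equal length, reads "three times more frequent" as a count at least three times that of the runner-up, skips gap characters, and writes "N" where no base meets the threshold; all of these choices are assumptions made for the example.

```python
from collections import Counter


def call_consensus(aligned: list[str], ratio: float = 3.0, ambiguous: str = "N") -> str:
    """Call a per-column consensus from equal-length aligned sequences."""
    if not aligned:
        raise ValueError("no sequences supplied")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned):
        # Tally bases in this column, ignoring alignment gaps.
        counts = Counter(base for base in column if base != "-").most_common()
        if not counts:
            consensus.append(ambiguous)
            continue
        top_base, top_count = counts[0]
        runner_up = counts[1][1] if len(counts) > 1 else 0
        consensus.append(top_base if top_count >= ratio * runner_up else ambiguous)
    return "".join(consensus)


# Hypothetical four-sequence alignment.
print(call_consensus(["ACGT", "ACGA", "ACGT", "ATGT"]))
```

Raising or lowering `ratio` corresponds to the more or less stringent definitions of consensus mentioned in the following paragraph.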
[0286] Methods for determining consensus sequence are well known in the art, see, e.g., US Pat. App. Pub. No. 20030124561; Hall et al. Plant Physiol. 129:1439-1447, 2002. These methods, including DNA sequencing, assembly, and analysis, are well-known and there are many possible variations known to those skilled in the art. Other alignment parameters may also be useful such as using more or less stringent definitions of consensus.
Non-Selective MC Mitotic Inheritance Assays
[0287] The following list of assays and potential outcomes illustrates how various assays can be used to distinguish autonomous events from integrated events.
[0288] Assay #1: Transient Assay
[0289] MCs are tested for their ability to become established as chromosomes and their ability to be inherited in mitotic cell divisions. In this assay, MCs are delivered to plant cells, for example suspension cells in liquid culture. The cells used can be at various stages of growth. In this example, a population in which some cells were undergoing division was used. The MC is then assessed over the course of several cell divisions, by tracking the presence of a screenable marker, e.g. a visible marker gene such as a fluorescent protein. MCs that are established and inherited well may show an initial delivery into many single cells; after several cell divisions, these single cells divide to form clusters of MC-containing cells. Other exemplary embodiments of this method include delivering MCs to other mitotic cell types, including roots and shoot meristems.
[0290] Assay #2: Non-Lineage Based Inheritance Assays on Modified Transformed Cells and Plants
[0291] MC inheritance is assessed on modified cell lines and plants by following the presence of the MC over the course of multiple cell divisions. An initial population of MC-containing cells is assayed for the presence of the MC, by the presence of a marker gene, including but not limited to a fluorescent protein, a colored protein, a protein assayable by histochemical assay, and a gene affecting cell morphology. When a DNA-specific dye is used, all nuclei are stained with a dye including but not limited to DAPI, Hoechst 33258, OliGreen, Giemsa, YOYO, or TOTO, allowing a determination of the number of cells that do not contain the MC. After the initial determination of the percent of cells carrying the MC, the cells are allowed to divide over the course of several cell divisions. The number of cell divisions, n, is determined by a method including but not limited to monitoring the change in total weight of cells, and monitoring the change in volume of the cells or by directly counting cells in an aliquot of the culture. After a number of cell divisions, the population of cells is again assayed for the presence of the MC. The loss rate per generation is calculated by equation (1):
Loss rate per generation = 1 - (F/I)^{1/n}   (1)
[0292] The population of MC-containing cells may include suspension cells, callus, roots, leaves, meristems, flowers, or any other tissue of modified plants, or any other cell type containing a MC.
[0293] These methods are well-known and there are many possible variations known to those skilled in the art; they have been used before with human cells and yeast cells.
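Equation (1) can be worked through with a short sketch. The specification does not define F and I explicitly; the sketch assumes that I is the fraction of cells carrying the MC at the initial assay and F is the fraction at the final assay. It also assumes, for illustration only, that the number of divisions n is estimated from direct cell counts as the number of population doublings. The function names and the numbers are hypothetical.

```python
import math


def generations_from_counts(initial_cells: float, final_cells: float) -> float:
    """Estimate n as the number of population doublings (an assumption)."""
    if initial_cells <= 0 or final_cells <= 0:
        raise ValueError("cell counts must be positive")
    return math.log2(final_cells / initial_cells)


def loss_rate_per_generation(initial_fraction: float, final_fraction: float, n: float) -> float:
    """Equation (1): 1 - (F/I)^(1/n), with I and F as MC-positive fractions."""
    if initial_fraction <= 0 or n <= 0:
        raise ValueError("initial fraction and n must be positive")
    return 1.0 - (final_fraction / initial_fraction) ** (1.0 / n)


# Hypothetical culture: 1e5 cells grow to 3.2e6 cells (five doublings),
# while the MC-positive fraction falls from 0.80 to 0.40.
n_divisions = generations_from_counts(1e5, 3.2e6)
loss = loss_rate_per_generation(0.80, 0.40, n_divisions)
print(f"n = {n_divisions:.1f}, loss rate per generation = {loss:.4f}")
```

As a consistency check, applying the computed loss rate for n generations to the initial fraction, I × (1 − loss)^n, should return the final fraction F.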
[0294] Assay #3: Lineage Based Inheritance Assays on Modified Cells and Plants
[0295] MC inheritance is assessed on cell lines and plants comprising the MCs by following the presence of the MC over the course of multiple cell divisions. In cell types that allow for tracking of cell lineage, including but not limited to root or leaf cell files, trichomes, and leaf stomata guard cells, MC loss per generation does not need to be determined statistically over a population, it can be discerned directly through successive cell divisions. In other manifestations of this method, cell lineage can be discerned from cell position, or methods including but not limited to the use of histological lineage tracing dyes, and the induction of genetic mosaics in dividing cells.
[0296] In one simple example, the two guard cells of the stomata are daughters of a single precursor cell. To assay MC inheritance in this cell type, the epidermis of the leaf of a plant containing a MC is examined for the presence of the MC by the presence of a marker gene, including but not limited to a fluorescent protein, a colored protein, a protein assayable by histochemical assay, and a gene affecting cell morphology. The number of loss events in which one guard cell contains the MC (L) and the number of cell divisions in which both guard cells contain the MC (B) are counted. The loss rate per cell division is determined as L/(L+B). Other lineage-based cell types are assayed in similar fashion. These methods are well-known and there are many possible variations known to those skilled in the art; they have been used before with yeast cells (though, instead of observing the marker in stomates, a color marker was observed in yeast colonies).
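The guard cell calculation can be sketched as below. The data structure is hypothetical: each stoma is recorded as a pair of booleans indicating whether each guard cell shows the marker. Pairs in which neither guard cell shows the marker are left out of both L and B, which is an assumption of this sketch since such pairs do not reveal whether the precursor cell carried the MC.

```python
def score_guard_cell_pairs(pairs: list[tuple[bool, bool]]) -> tuple[int, int]:
    """Return (L, B): pairs with exactly one MC-positive cell, and pairs with both."""
    loss_events = sum(1 for first, second in pairs if first != second)
    both_positive = sum(1 for first, second in pairs if first and second)
    return loss_events, both_positive


def loss_rate_per_division(loss_events: int, both_positive: int) -> float:
    """Loss rate per cell division, L / (L + B)."""
    total = loss_events + both_positive
    if total == 0:
        raise ValueError("no informative guard cell pairs were scored")
    return loss_events / total


# Hypothetical scoring of 200 informative stomata plus 10 uninformative ones.
observed = [(True, True)] * 194 + [(True, False)] * 4 + [(False, True)] * 2 + [(False, False)] * 10
L, B = score_guard_cell_pairs(observed)
print(f"L = {L}, B = {B}, loss rate = {loss_rate_per_division(L, B):.3f}")
```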
[0297] Linear MC inheritance may also be assessed by examining leaf or root files or clustered cells in callus over time. Changes in the percent of cells carrying the MC will indicate the mitotic inheritance.
[0298] Assay #4: Inheritance Assays on Modified Cells and Plants in the Presence of Chromosome Loss Agents
[0299] Any of the above three assays can be done in the presence of chromosome loss agents (including but not limited to colchicine, colcemid, caffeine, etoposide, nocodazole, oryzalin, trifluralin). It is likely that an autonomous MC will prove more susceptible to loss induced by chromosome loss agents; therefore, autonomous MCs should show a lower rate of inheritance in the presence of chromosome loss agents. These methods have been used to study chromosome loss in fruit flies and yeast; there are many possible variations known to those skilled in the art.
III. Transformation of Plant Cells and Plant Regeneration
[0300] Various methods may be used to deliver DNA into plant cells. These include biological methods, such as Agrobacterium, E. coli, and viruses; physical methods such as biolistic particle bombardment, nanocopoea device, the Stein beam gun, silicon carbide whiskers and microinjection; electrical methods such as electroporation; and chemical methods such as the use of polyethylene glycol and other compounds known to stimulate DNA uptake into cells. Examples of these techniques have been described (Paszkowski et al., EMBO J 3:2717-2722, 1984; Potrykus et al., Mol. Gen. Genet. 199:169-177, 1985; Reich et al., Biotechnol. 4:1001-1004, 1986; and Klein et al., Nature 327:70-73, 1987). Transformation using silicon carbide whiskers, e.g. in maize, is described in Brisibe (J. Exp. Bot. 51:187-196, 2000) and Dunwell (Methods Mol. Biol. 111:375-82, 1999) and U.S. Pat. No. 5,464,765.
[0301] Agrobacterium-Mediated Delivery
[0302] Agrobacterium-mediated transformation is one method for introducing a desired genetic element into a plant. Several Agrobacterium species mediate the transfer of a specific DNA known as "T-DNA" that can be genetically engineered to carry a desired piece of DNA into many plant species. Plasmids used for delivery contain the T-DNA flanking the nucleic acid to be inserted into the plant. The major events marking the process of T-DNA mediated pathogenesis are induction of virulence genes, processing and transfer of T-DNA.
[0303] There are three common methods to transform plant cells with Agrobacterium. The first method is co-cultivation of Agrobacterium with cultured isolated protoplasts. This method requires an established culture system that allows culturing protoplasts and plant regeneration from cultured protoplasts. The second method is transformation of cells or tissues with Agrobacterium. This method requires (a) that the plant cells or tissues can be modified by Agrobacterium and (b) that the modified cells or tissues can be induced to regenerate into whole plants. The third method is transformation of seeds, immature or mature embryos, apices or meristems with Agrobacterium. This method requires exposure of the meristematic cells of these tissues to Agrobacterium and micropropagation of the shoots or plant organs arising from these meristematic cells.
[0304] Those of skill in the art are familiar with procedures for growth and suitable culture conditions for Agrobacterium as well as subsequent inoculation procedures. Liquid, solid or semi-solid culture media can be used. The density of the Agrobacterium culture used for inoculation and the ratio of Agrobacterium cells to explant can vary from one system to the next, as can media, growth procedures, timing and lighting conditions.
[0305] Transformation of dicotyledons using Agrobacterium has long been known in the art, and transformation of monocotyledons using Agrobacterium has also been described. See, WO 94/00977 and U.S. Pat. No. 5,591,616, both of which are incorporated herein by reference. See also, Negrotto et al. (Plant Cell Rep. 19:798-803, 2000, incorporated herein by reference).
[0306] A number of wild-type and disarmed strains of Agrobacterium tumefaciens and Agrobacterium rhizogenes harboring Ti or Ri plasmids can be used for gene transfer into plants. Preferably, the Agrobacterium hosts contain disarmed Ti and Ri plasmids that do not contain the oncogenes that cause tumorigenesis or rhizogenesis. Exemplary strains include Agrobacterium tumefaciens strain C58, a nopaline-type strain that is used to mediate the transfer of DNA into a plant cell, octopine-type strains such as LBA4404 or succinamopine-type strains, e.g., EHA101 or EHA105. The use of these strains for plant transformation has been reported and the methods are familiar to those of skill in the art.
[0307] US Application No. 20040244075 published Dec. 2, 2004 describes improved methods of Agrobacterium-mediated transformation. The efficiency of transformation by Agrobacterium may be enhanced by using a number of methods known in the art. For example, the inclusion of a natural wound response molecule such as acetosyringone (AS) to the Agrobacterium culture has been shown to enhance transformation efficiency with Agrobacterium tumefaciens (Shahla et al., (1987) Plant Molec. Biol. 8:291-298). Alternatively, transformation efficiency may be enhanced by wounding the target tissue to be modified or transformed. Wounding of plant tissue may be achieved, for example, by punching, maceration, bombardment with microprojectiles, etc. (See e.g., Bidney et al., Plant Molec. Biol. 18:301-313, 1992).
[0308] In addition, another recent method described by Broothaerts, et al. (Nature 433:629-633, 2005) expands the bacterial genera that can be used to transfer genes into plants. This work involved the transfer of a disarmed Ti plasmid without T-DNA and another vector with T-DNA containing the marker enzyme beta-glucuronidase, into three different bacteria. Gene transfer was successful and this method significantly expands the tools available for gene delivery into plants.
[0309] Microprojectile Bombardment Delivery
[0310] Another widely used technique to genetically transform plants involves the use of microprojectile bombardment. In this process, a nucleic acid containing the desired genetic elements to be introduced into the plant is deposited on or in small dense particles, e.g., tungsten, platinum, or preferably 0.5 to 1.0 micron gold particles, which are then delivered at a high velocity into the plant tissue or plant cells using a specialized biolistics device. Many such devices have been designed and constructed; one in particular, the PDS1000/He sold by Bio-Rad Laboratories (Hercules, Calif.; USA), is the instrument most commonly used for biolistics of plant cells. The advantage of this method is that no specialized sequences need to be present on the nucleic acid molecule to be delivered into plant cells; delivery of any nucleic acid sequence is theoretically possible.
[0311] For the bombardment, cells in suspension are concentrated on filters, petri dishes or solid culture medium. Alternatively, immature embryos, seedling explants, or any plant tissue or target cells may be arranged on filters, petri dishes or solid culture medium. The cells to be bombarded are positioned at an appropriate distance below the microprojectile stopping plate.
[0312] Various biolistics protocols have been described that differ in the type of particle or the manner in which DNA is coated onto the particle. Any technique for coating microprojectiles that allows for delivery of transforming DNA to the target cells may be used. For example, particles may be prepared by functionalizing the surface of a gold particle by providing free amine groups. DNA, having a strong negative charge, will then bind to the functionalized particles.
[0313] Parameters such as the concentration of DNA used to coat microprojectiles may influence the recovery of transformants containing a single copy of the transgene. For example, a lower concentration of DNA may not necessarily change the efficiency of the transformation but may instead increase the proportion of single copy insertion events. In this regard, ranges of approximately 1 ng to approximately 10 μg (10,000 ng), approximately 5 ng to 8 μg or approximately 20 ng, 50 ng, 100 ng, 200 ng, 500 ng, 1 μg, 2 μg, 5 μg, or 7 μg of transforming DNA may be used per each 1.0-2.0 mg of starting gold particles (in the 0.5 to 1.0 micron range).
[0314] Other physical and biological parameters may be varied, such as manipulation of the DNA/microprojectile precipitate, factors that affect the flight and velocity of the projectiles, manipulation of the cells before and immediately after bombardment (including osmotic state, tissue hydration and the subculture stage or cell cycle of the recipient cells), the orientation of an immature embryo or other target tissue relative to the particle trajectory, and also the nature of the transforming DNA, such as linearized DNA or intact supercoiled plasmids. One may also want to use agents to protect the DNA during delivery. One may particularly wish to adjust physical parameters such as DNA concentration, gap distance, flight distance, tissue distance, and helium pressure.
[0315] The particles delivered via biolistics can be "dry" or "wet." In the "dry" method, the MC DNA-coated particles such as gold are applied onto a macrocarrier (such as a metal plate, or a carrier sheet made of a fragile material such as mylar) and dried. The gas discharge then accelerates the macrocarrier into a stopping screen, which halts the macrocarrier but allows the particles to pass through; the particles then continue their trajectory until they impact the tissue being bombarded. For the "wet" method, the droplet containing the MC DNA-coated particles is applied to the bottom part of a filter holder, which is attached to a base which is itself attached to a rupture disk holder used to hold the rupture disk to the helium egress tube for bombardment. The gas discharge directly displaces the DNA/gold droplet from the filter holder and accelerates the particles and their DNA cargo into the tissue being bombarded. The wet biolistics method has been described in detail elsewhere but has not previously been applied in the context of plants (Mialhe et al., Mol Mar Biol Biotechnol. 4(4):275-83, 1995). The concentrations of the various components for coating particles and the physical parameters for delivery can be optimized using procedures known in the art.
[0316] A variety of plant cells/tissues are suitable for transformation, including immature embryos, scutellar tissue, suspension cell cultures, immature inflorescence, shoot meristem, epithelial peels, nodal explants, callus tissue, hypocotyl tissue, cotyledons, roots, leaves, meristem cells, and gametic cells such as microspores, pollen, sperm and egg cells. It is contemplated that any cell from which a fertile plant may be regenerated is useful as a recipient cell. Callus may be initiated from tissue sources including, but not limited to, immature embryos, seedling apical meristems, microspore-derived embryos, roots, hypocotyls, cotyledons and the like. Those cells which are capable of proliferating as callus also are recipient cells for genetic transformation.
[0317] Any suitable plant culture medium can be used. Examples of suitable media would include but are not limited to MS-based media (Murashige and Skoog, Physiol. Plant, 15:473-497, 1962) or N6-based media (Chu et al., Scientia Sinica 18:659, 1975) supplemented with additional plant growth regulators including but not limited to auxins such as picloram (4-amino-3,5,6-trichloropicolinic acid), 2,4-D (2,4-dichlorophenoxyacetic acid), naphthalene-acetic acid (NAA) and dicamba (3,6-dichloroanisic acid), cytokinins such as BAP (6-benzylaminopurine) and kinetin, and gibberellins. Other media additives can include but are not limited to amino acids, macroelements, iron, microelements, vitamins and organics, carbohydrates, undefined media components such as casein hydrolysates, and an appropriate gelling agent such as a form of agar, a low melting point agarose or Gelrite if desired. Those of skill in the art are familiar with the variety of tissue culture media, which when supplemented appropriately, support plant tissue growth and development and are suitable for plant transformation and regeneration. These tissue culture media can either be purchased as a commercial preparation, or custom prepared and modified. Examples of such media would include but are not limited to Murashige and Skoog, N6, Linsmaier and Skoog (Physiol. Plant, 18:100, 1965), Uchimiya and Murashige (Plant Physiol. 15:473, 1962), Gamborg's B5 media (Exp. Cell Res., 50:151, 1968), D medium (Duncan et al., Planta, 165:322-332, 1985), McCown's woody plant media (McCown and Lloyd, HortScience 6:453, 1981), Nitsch and Nitsch (Science 163:85-87, 1969), and Schenk and Hildebrandt (Can. J. Bot. 50:199-204, 1972) or derivations of these media supplemented accordingly.
Those of skill in the art are aware that media and media supplements such as nutrients and growth regulators for use in transformation and regeneration and other culture conditions such as light intensity during incubation, pH, and incubation temperatures can be varied.
[0318] Those of skill in the art are aware of the numerous modifications in selective regimes, media, and growth conditions that can be varied depending on the plant system and the selective agent. Typical selective agents include but are not limited to antibiotics such as geneticin (G418), kanamycin, paromomycin or other chemicals such as glyphosate or other herbicides. Consequently, such media and culture conditions disclosed in the present invention can be modified or substituted with nutritionally equivalent components, or similar processes for selection and recovery of transgenic events, and still fall within the scope of the present invention.
[0319] MC Delivery Without Selection
[0320] The MC is delivered to plant cells or tissues, e.g., plant cells in suspension, to obtain stably modified callus clones for inheritance assay. Suspension cells are maintained in a growth medium, for example MS liquid medium containing an auxin such as 2,4-dichlorophenoxyacetic acid (2,4-D). Cells are bombarded using a particle bombardment process, such as the helium-driven PDS-1000/He system, and propagated in the same liquid medium to permit the growth of modified and non-modified cells. Portions of each bombardment are monitored for formation of fluorescent clusters, which are isolated by micromanipulation and cultured on solid medium. Clones modified with the MC are expanded and homogeneous clones are used in inheritance assays, or assays measuring MC structure or autonomy.
[0321] MC Transformation with Selectable Marker Gene
[0322] Isolation of MC-modified cells in bombarded calluses or explants can be facilitated by the use of a selectable marker gene. The bombarded tissues are transferred to a medium containing an appropriate selective agent for a particular selectable marker gene. Such a transfer usually occurs between 0 and about 7 days after bombardment. The transfer could also take place any number of days after bombardment. The amount of selective agent and timing of incorporation of such an agent in selection medium can be optimized by using procedures known in the art. Selection inhibits the growth of non-modified cells, thus providing an advantage to the growth of modified cells, which can be further monitored by tracking the presence of a fluorescent marker gene or by the appearance of modified explants (modified cells or explants may be green under light in selection medium, while surrounding non-modified cells are weakly pigmented). In plants that develop through shoot organogenesis, the modified cells can form shoots directly, or alternatively, can be isolated and expanded for regeneration of multiple shoots transgenic for the MC. In plants that develop through embryogenesis, additional culturing steps may be necessary. Sorghum can be regenerated through embryogenesis (Wernicke & Brettell, Nature 287:138-139, 1980; and Bhaskaran and Smith, In Vitro Cell. Devel. Biol. Plant 24:65-70, 1987) and can also be regenerated by shoot organogenesis (Nirwan and Kothari, J. Plant Biochem. Biotech., 13:149-152, 2004).
[0323] Useful selectable marker genes are well known in the art and include, for example, herbicide and antibiotic resistance genes including but not limited to neomycin phosphotransferase II (conferring resistance to kanamycin, paromomycin and G418), hygromycin phosphotransferase (conferring resistance to hygromycin), 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS, conferring resistance to glyphosate), phosphinothricin acetyltransferase (conferring resistance to phosphinothricin/bialaphos), and MerA (conferring resistance to mercuric ions). Selectable marker genes may be transformed using standard methods in the art.
[0324] The first step in the production of plants containing novel genes involves delivery of DNA into a suitable plant tissue (described in the previous section) and selection of the tissue under conditions that allow preferential growth of any cells containing the novel genes. Selection is typically achieved with a selectable marker gene present in the delivered DNA, which may be a gene conferring resistance to an antibiotic, herbicide or other killing agent, or a gene allowing utilization of a carbon source not normally metabolized by plant cells. For selection to be effective, the plant cells or tissue need to be grown on selective medium containing the appropriate concentration of antibiotic or killing agent, and the cells need to be plated at a defined and constant density. The concentration of selective agent and cell density are generally chosen to cause complete growth inhibition of wild-type plant tissue that does not express the selectable marker gene, while allowing cells containing the introduced DNA to grow and expand into adchromosomal clones. This critical concentration of selective agent typically is the lowest concentration at which there is complete growth inhibition of wild-type cells, at the cell density used in the experiments. However, in some cases, sub-killing concentrations of the selective agent may be equally or more effective for the isolation of plant cells containing MC DNA, especially in cases where the identification of such cells is assisted by a visible marker gene (e.g., fluorescent protein gene) present on the MC. Such sub-killing concentrations of the selective agent may be administered during part or all of the selection timing.
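The choice of critical concentration can be illustrated with a minimal sketch over a kill curve for wild-type tissue. The data, units, and function name are hypothetical. The sketch assumes growth is recorded relative to an unselected control at a fixed plating density, and it takes the critical concentration to be the lowest tested concentration at which that concentration and every higher tested concentration show complete inhibition.

```python
from typing import Optional


def critical_concentration(kill_curve: dict[float, float], tolerance: float = 0.0) -> Optional[float]:
    """Lowest tested concentration giving complete inhibition of wild-type growth.

    kill_curve maps selective-agent concentration to relative growth of
    wild-type tissue (1.0 = uninhibited, 0.0 = no growth). Returns None when
    the highest tested concentration still permits growth.
    """
    critical = None
    # Walk down from the highest concentration until growth reappears.
    for concentration in sorted(kill_curve, reverse=True):
        if kill_curve[concentration] <= tolerance:
            critical = concentration
        else:
            break
    return critical


# Hypothetical kill curve (concentration in mg/L -> relative growth).
curve = {0: 1.0, 10: 0.6, 25: 0.15, 50: 0.0, 100: 0.0}
print(critical_concentration(curve))
```

A sub-killing regime as described above would correspond to working at a concentration below this value, for example one of the tested concentrations that still permits partial wild-type growth, in combination with a visible marker.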
[0325] In some species (e.g., tobacco or tomato), a homogeneous clone of modified cells can also arise spontaneously when bombarded cells are placed under the appropriate selection. An exemplary selectable marker is the neomycin phosphotransferase II (nptII) gene, which is commonly used in plant biotechnology and confers resistance to the antibiotics kanamycin, G418 (geneticin) and paromomycin. In other species, or in certain plant tissues or when using particular selectable markers, homogeneous clones may not arise spontaneously under selection; in this case the clusters of modified cells can be manipulated to homogeneity using the visible marker genes present on the MCs as an indication of which cells contain MC DNA.
[0326] Regeneration of Modified Plants from Explants to Mature, Rooted Plants
[0327] In instances where shoot organogenesis is less efficient or for other reasons undesirable, an embryogenic step is necessary for regeneration. In these cases explant tissue is cultured on an appropriate media for embryogenesis, and the embryo is cultured until shoots form. The regenerated shoots are cultured in a rooting medium to obtain intact whole plants with a fully developed root system. These plants are potted in soil and grown to maturity in a greenhouse.
[0328] Generally, regeneration and tissue culture of sorghum plant parts and whole plants is challenging as sorghum produces phenolic compounds while in culture. The present invention provides for methods of culturing sorghum cells and tissues in media containing polyvinylpyrrolidone (PVP), see Examples, below. The PVP acts as a sink for the phenolic compounds produced by sorghum and enhances callus growth during selection as well as facilitating callus and plantlet regeneration. Furthermore, generation of sorghum callus can be facilitated by delivering to the plant cells and/or tissues MCs of the invention that contain auxin genes. The presence of the auxin genes will facilitate callus induction of the transformed tissue. The invention also provides for tissue culture methods which cycle between liquid culture media and solid culture media in order to promote the frequency and the morphogenic competence of the regenerable sorghum callus.
[0329] For plants that develop through shoot organogenesis, regeneration of a whole plant involves culturing of regenerable explant tissues taken from sterile organogenic callus tissue, seedlings or mature plants on a shoot regeneration medium for shoot organogenesis, and rooting of the regenerated shoots in a rooting medium to obtain intact whole plants with a fully developed root system. These plants are potted in soil and grown to maturity in a greenhouse.
[0330] Explants are obtained from any tissues of a plant suitable for regeneration. Exemplary tissues include hypocotyls, internodes, roots, cotyledons, petioles, cotyledonary petioles, leaves and peduncles, prepared from sterile seedlings or mature plants.
[0331] Explants are wounded (for example with a scalpel or razor blade) and cultured on a shoot regeneration medium (SRM) containing Murashige and Skoog (MS) medium as well as a cytokinin, e.g., 6-benzylaminopurine (BA), and an auxin, e.g., α-naphthaleneacetic acid (NAA), and an anti-ethylene agent, e.g., silver nitrate (AgNO3). For example, 2 mg/L of BA, 0.05 mg/L of NAA, and 2 mg/L of AgNO3 can be added to MS medium for shoot organogenesis. The most efficient shoot regeneration is obtained from longitudinal sections of internode explants.
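As a worked example of the medium arithmetic, the following Python sketch computes the volume of each additive stock needed to reach the final concentrations given above (2 mg/L BA, 0.05 mg/L NAA, 2 mg/L AgNO3). The stock concentrations are assumptions chosen for illustration; they are not specified in this description.

```python
# Final concentrations in shoot regeneration medium, from the text (mg/L).
SRM_ADDITIVES_MG_PER_L = {"BA": 2.0, "NAA": 0.05, "AgNO3": 2.0}

# Assumed stock concentrations (mg/mL); hypothetical, not from the source.
ASSUMED_STOCKS_MG_PER_ML = {"BA": 1.0, "NAA": 0.1, "AgNO3": 1.0}


def stock_volume_ml(final_mg_per_l, stock_mg_per_ml, batch_volume_l):
    """Volume of stock (mL) that gives the desired final concentration."""
    return final_mg_per_l * batch_volume_l / stock_mg_per_ml


def srm_additions(batch_volume_l):
    """Stock volumes (mL) of each additive for a batch of MS medium."""
    return {
        name: stock_volume_ml(conc, ASSUMED_STOCKS_MG_PER_ML[name], batch_volume_l)
        for name, conc in SRM_ADDITIVES_MG_PER_L.items()
    }


additions = srm_additions(1.0)  # volumes for a 1 L batch
```

With the assumed stocks, a 1 L batch would take about 2 mL of BA stock, 0.5 mL of NAA stock and 2 mL of AgNO3 stock; different stock strengths scale these volumes proportionally.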
[0332] Shoots regenerated via organogenesis are rooted in a MS medium. Plants are potted and grown in a greenhouse to sexual maturity for seed harvest.
[0333] To regenerate a whole plant with a MC, explants are pre-incubated for 1 to 7 days (or longer) on the shoot regeneration medium prior to bombardment with MC (see below). Following bombardment, explants are incubated on the same shoot regeneration medium for a recovery period up to 7 days (or longer), followed by selection for transformed shoots or clusters on the same medium but with a selective agent appropriate for a particular selectable marker gene (see below).
[0334] Method of co-delivering growth inducing genes to facilitate isolation of modified plant cell clones
[0335] Another method used in the generation of cell clones containing MCs involves the co-delivery of DNA containing genes that are capable of activating growth of plant cells, or that promote the formation of a specific organ, embryo or plant structure that is capable of self-sustaining growth. In one embodiment, the recipient cell receives simultaneously the MC, and a separate DNA molecule encoding one or more growth promoting, organogenesis-promoting, embryogenesis-promoting or regeneration-promoting genes. Following DNA delivery, expression of the plant growth regulator genes stimulates the plant cells to divide, or to initiate differentiation into a specific organ, embryo, or other cell types or tissues capable of regeneration. Multiple plant growth regulator genes can be combined on the same molecule, or co-bombarded on separate molecules. Use of these genes can also be combined with application of plant growth regulator molecules into the medium used to culture the plant cells, or of precursors to such molecules that are converted to functional plant growth regulators by the plant cell's biosynthetic machinery, or by the genes delivered into the plant cell.
[0336] The co-bombardment strategy of MCs with separate DNA molecules encoding plant growth regulators transiently supplies the plant growth regulator genes for several generations of plant cells following DNA delivery. During this time, the MC may be stabilized by virtue of its centromere, but the DNA molecules encoding plant growth regulator genes, or organogenesis-promoting, embryogenesis-promoting or regeneration-promoting genes will tend to be lost. The transient expression of these genes, prior to their loss, may give the cells containing MC DNA a sufficient growth advantage, or sufficient tendency to develop into plant organs, embryos or a regenerable cell cluster, to outgrow the non-modified cells in their vicinity, or to form a readily identifiable structure that is not formed by non-modified cells. Loss of the DNA molecule encoding these genes will prevent phenotypes from manifesting themselves that may be caused by these genes if present through the remainder of plant regeneration. In rare cases, the DNA molecules encoding plant growth regulator genes will integrate into the host plant's genome or into the MC.
[0337] Alternatively the genes promoting plant cell growth may be genes promoting shoot formation or embryogenesis, or giving rise to any identifiable organ, tissue or structure that can be regenerated into a plant. In this case, it may be possible to obtain embryos or shoots harboring MCs directly after DNA delivery, without the need to induce shoot formation with growth activators supplied into the medium, or lowering the growth activator treatment necessary to regenerate plants. The advantages of this method are more rapid regeneration, higher transformation efficiency, lower background growth of non-modified tissue, and lower rates of morphologic abnormalities in the regenerated plants (due to shorter and less intense treatments of the tissue with chemical plant growth activators added to the growth medium).
[0338] Determination of MC Structure and Autonomy in Adchromosomal Plants and Tissues
[0339] The structure and autonomy of the MC in adchromosomal plants and tissues can be determined by methods including but not limited to: conventional and pulsed-field Southern blot hybridization to genomic DNA from modified tissue subjected or not subjected to restriction endonuclease digestion, dot blot hybridization of genomic DNA from modified tissue hybridized with different MC specific sequences, MC rescue, exonuclease activity, PCR on DNA from modified tissues with primers specific to the MC, or Fluorescence In Situ Hybridization (FISH) to nuclei of modified cells. Table 3 below summarizes these methods.
TABLE-US-00003 TABLE 3 Examples of methods to determine MC structure and autonomy

Assay: Southern blot
Assay details: Restriction digest of genomic DNA* compared to purified MC
Potential outcome 1: Native sizes and pattern of bands. Interpretation: Autonomous or integrated via CEN fragment
Potential outcome 2: Altered sizes or pattern of bands. Interpretation: Integrated or rearranged

Assay: CHEF gel Southern blot
Assay details: Restriction digest of genomic DNA compared to purified MC
Potential outcome 1: Native sizes and pattern of bands. Interpretation: Autonomous or integrated via CEN fragment
Potential outcome 2: Altered sizes or pattern of bands. Interpretation: Integrated or rearranged
Assay details: Native genomic DNA (no digest)
Potential outcome 1: MC band migrating ahead of genomic DNA. Interpretation: Autonomous circles or linears present in plant
Potential outcome 2: MC band co-migrating with genomic DNA. Interpretation: Integrated
Potential outcome 3: >1 MC bands observed. Interpretation: Various possibilities

Assay: Exonuclease assay
Assay details: Exonuclease digestion of genomic DNA followed by detection of circular MC by PCR, dot blot, or restriction digest (optional), electrophoresis and Southern blot (useful for circular MCs)
Potential outcome 1: Signal strength close to that w/o exonuclease. Interpretation: Autonomous circles present
Potential outcome 2: No signal or signal strength lower than that w/o exonuclease. Interpretation: Integrated

Assay: MC rescue
Assay details: Transformation of plant genomic DNA into E. coli followed by selection for antibiotic resistance genes on MC
Potential outcome 1: Colonies isolated only from MC plants with MCs, not from controls; MC structure matches that of the parental MC. Interpretation: Autonomous circles present, native MC structure
Potential outcome 2: Colonies isolated only from MC plants with MCs, not from controls; MC structure different from parental MC. Interpretation: Autonomous circles present, rearranged MC structure OR MC integrated via centromere fragment
Potential outcome 3: Colonies observed both in MC-modified plants and in controls. Interpretation: Various possibilities

Assay: PCR
Assay details: PCR amplification of various parts of the MC
Potential outcome 1: All MC parts detected by PCR. Interpretation: Complete MC sequences present in plant
Potential outcome 2: Subset of MC parts detected by PCR. Interpretation: Partial MC sequences present in plant

Assay: FISH
Assay details: Detection of MC sequences in mitotic or meiotic nuclei by fluorescence in situ hybridization
Potential outcome 1: MC sequences detected, free of genome. Interpretation: Autonomous
Potential outcome 2: MC sequences detected, associated with genome. Interpretation: Integrated
Potential outcome 3: MC sequences detected, both free and associated with genome. Interpretation: Both autonomous and integrated MC sequences present
Potential outcome 4: No MC sequences detected. Interpretation: MC DNA not visible by FISH

*Genomic DNA refers to total DNA extracted from plants containing a MC
Note: some entries in this table were marked as missing or illegible when filed.
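The outcome-to-interpretation logic of Table 3 lends itself to a simple lookup. The following Python sketch encodes only the FISH rows of the table; the function name and the two boolean inputs are hypothetical conveniences, while the interpretation strings follow the table.

```python
# Keys are (signal free of genome, signal associated with genome).
FISH_INTERPRETATION = {
    (True, False): "Autonomous",
    (False, True): "Integrated",
    (True, True): "Both autonomous and integrated MC sequences present",
    (False, False): "MC DNA not visible by FISH",
}


def interpret_fish(free_signal, associated_signal):
    """Map a FISH observation to the interpretation listed in Table 3."""
    return FISH_INTERPRETATION[(bool(free_signal), bool(associated_signal))]


result = interpret_fish(free_signal=True, associated_signal=False)
```

The other assays in the table could be encoded in the same way, although outcomes listed as "Various possibilities" would still call for follow-up with a second assay.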
[0340] Furthermore, MC structure can be examined by characterizing MCs `rescued` from adchromosomal cells. Circular MCs that contain bacterial sequences for their selection and propagation in bacteria can be rescued from an adchromosomal plant or plant cell and re-introduced into bacteria. If no loss of sequences has occurred during replication of the MC in plant cells, the MC is able to replicate in bacteria and confer antibiotic resistance. Total genomic DNA is isolated from the adchromosomal plant cells by any method for DNA isolation known to those skilled in the art, including but not limited to a standard cetyltrimethylammonium bromide (CTAB) based method (Current Protocols in Molecular Biology. John Wiley & Sons, NY, 1994 et seq.). The purified genomic DNA is introduced into bacteria (e.g., E. coli) using methods familiar to one skilled in the art (for example heat shock or electroporation). The transformed bacteria are plated on solid medium containing antibiotics to select bacterial clones modified with MC DNA. Modified bacterial clones are grown up, the plasmid DNA is purified (by alkaline lysis, for example), and the DNA is analyzed by restriction enzyme digestion and gel electrophoresis or by sequencing. Because plant-methylated DNA containing methylcytosine residues will be degraded by wild-type strains of E. coli, bacterial strains (e.g. DH10B) deficient in the genes encoding methylation-dependent restriction nucleases (e.g. the mcr and mrr gene loci in E. coli) are best suited for this type of analysis. MC rescue can be performed on any plant tissue or clone of plant cells comprising a MC.
[0341] MC Autonomy Demonstration by In Situ Hybridization (ISH)
[0342] To assess whether the MC is autonomous from the native plant chromosomes, or has integrated into the plant genome, In Situ Hybridization is carried out (Fluorescent In Situ Hybridization or FISH is particularly well suited to this purpose). In this assay, mitotic or meiotic tissue, such as root tips or meiocytes from the anther, possibly treated with metaphase arrest agents such as colchicine or nitrous oxide, is obtained, and standard FISH methods are used to label both the centromere and sequences specific to the MC. For example, a sorghum centromere is labeled using a probe from a sequence that labels all sorghum centromeres, attached to one fluorescent tag (Molecular Probes Alexafluor 568, for example), and sequences specific to the MC are labeled with another fluorescent tag (Alexafluor 488, for example). All centromere sequences are detected with the first tag; only MCs are detected with both the first and second tag. Chromosomes are stained with a DNA-specific dye including but not limited to DAPI, Hoechst 33258, OliGreen, Giemsa, YOYO, and TOTO. An autonomous MC is visualized as a body that shows hybridization signal with both centromere probes and MC specific probes and is separate from the native chromosomes.
[0343] Determination of Gene Expression Levels
[0344] The expression level of any gene present on the MC can be determined by methods including but not limited to one of the following. The mRNA level of the gene can be determined by Northern Blot hybridization, Reverse Transcriptase-Polymerase Chain Reaction, binding levels of a specific RNA-binding protein, in situ hybridization, or dot blot hybridization.
[0345] The protein level of the gene product can be determined by Western blot analysis, Enzyme-Linked Immunosorbent Assay (ELISA), fluorescent quantitation of a fluorescent gene product, enzymatic quantitation of an enzymatic gene product, immunohistochemical quantitation, or spectroscopic quantitation of a gene product that absorbs a specific wavelength of light.
[0346] Use of Exonuclease to Isolate Circular MC DNA from Genomic DNA
[0347] Exonucleases may be used to obtain pure MC DNA, suitable for isolation of MCs from E. coli or from plant cells. The method assumes a circular structure of the MC. A DNA preparation containing MC DNA and genomic DNA from the source organism is treated with exonuclease, for example lambda exonuclease combined with E. coli exonuclease I, or the ATP-dependent exonuclease (Qiagen Inc.; Valencia, Calif.; USA). Because the exonuclease is only active on DNA ends, it will specifically degrade the linear genomic DNA fragments, but will not affect the circular MC DNA. The result is MC DNA in pure form. The resultant MC DNA can be detected by a number of methods for DNA detection known to those skilled in the art, including but not limited to PCR, dot blot followed by hybridization analysis, and Southern blot followed by hybridization analysis. Exonuclease treatment followed by detection of the resultant circular MC may be used as a method to determine MC autonomy.
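One way to read out the exonuclease assay is to compare the detection signal obtained with and without exonuclease treatment, as in the exonuclease rows of Table 3. The Python sketch below is a minimal illustration; the function name and the 0.8 cut-off for a signal being "close to" the untreated control are assumptions and are not values given in this description.

```python
def interpret_exonuclease_assay(signal_with_exo, signal_without_exo, close_ratio=0.8):
    """Classify an MC as autonomous-circular or integrated from signal retention.

    signal_with_exo / signal_without_exo are detection signals (e.g. from
    PCR, dot blot or Southern blot) measured on the same DNA preparation.
    close_ratio is an assumed cut-off for "signal close to untreated".
    """
    if signal_without_exo <= 0:
        raise ValueError("untreated control must give a positive signal")
    retained = signal_with_exo / signal_without_exo
    if retained >= close_ratio:
        return "Autonomous circles present"
    return "Integrated"


call = interpret_exonuclease_assay(signal_with_exo=95.0, signal_without_exo=100.0)
```

In practice the cut-off would need to be calibrated against control preparations of known structure.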
[0348] Structural Analysis of MCs by BAC-End Sequencing
[0349] BAC-end sequencing procedures, known to those skilled in the art, can be applied to characterize MC clones for a variety of purposes, such as structural characterization, determination of sequence content, and determination of the precise sequence at a unique site on the chromosome (for example the specific sequence signature found at the junction between a centromere fragment and the vector sequences). In particular, this method is useful to prove the relationship between a parental MC and the MCs descended from it and isolated from plant cells by MC rescue, described above. This method also fosters identification of specific sorghum MCs if more than one unique sorghum MC is present in a plant cell simultaneously.
[0350] Methods for Scoring Meiotic MC Inheritance
[0351] A variety of methods can be used to assess the efficiency of meiotic MC transmission. In one embodiment of the method, gene expression of genes encoded by the MC (marker genes or non-marker genes) can be scored by any method for detection of gene expression known to those skilled in the art, including but not limited to visible methods (e.g. fluorescence of fluorescent protein markers, scoring of visible phenotypes of the plant), scoring resistance of the plant or plant tissues to antibiotics, herbicides or other selective agents, measuring enzyme activity of proteins encoded by the MC, measuring non-visible plant phenotypes, or directly measuring the RNA and protein products of gene expression using microarray, Northern blots, in situ hybridization, dot blot hybridization, RT-PCR, Western blots, immunoprecipitation, Enzyme-Linked Immunosorbent Assay (ELISA), immunofluorescence and radio-immunoassays (RIA). Gene expression can be scored in the post-meiotic stages of microspore, pollen, pollen tube or female gametophyte, or the post-zygotic stages such as embryo, seed, or progeny seedlings and plants. In another embodiment of the method, the MC can be directly detected or visualized in post-meiotic, zygotic, embryonal or other cells by a number of methods for DNA detection known to those skilled in the art, including but not limited to fluorescence in situ hybridization, in situ PCR, PCR, Southern blot, or by MC rescue described above.
[0352] FISH Analysis of MC Copy Number in Meiocytes, Roots or other Tissues of Adchromosomal Plants
[0353] The copy number of the MC can be assessed in any cell or plant tissue by In Situ Hybridization (Fluorescent In Situ Hybridization or FISH is particularly well suited to this purpose). In an exemplary assay, standard FISH methods are used to label the centromere, using a probe which labels all chromosomes with one fluorescent tag (e.g., ALEXA FLUOR® 568; Invitrogen Corp.; Carlsbad, Calif.; USA), and to label sequences specific to the MC with another fluorescent tag (ALEXA FLUOR® 488, for example). All centromere sequences are detected with the first tag; only MCs are detected with both the first and second tag. Nuclei are stained with a DNA-specific dye including but not limited to DAPI, Hoechst 33258, OliGreen, Giemsa, YOYO, and TOTO. MC copy number is determined by counting the number of fluorescent foci per cell that label with both tags.
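The counting rule in this paragraph (foci that label with both tags are scored as MCs) can be sketched in Python as follows. The representation of a focus as a dictionary with two boolean flags, and the example data, are hypothetical; how foci are called from images is outside the scope of the sketch.

```python
from collections import Counter


def mc_copy_number(foci):
    """Count foci in one cell that carry both the centromere and MC-specific tags."""
    return sum(1 for focus in foci if focus["centromere"] and focus["mc_specific"])


def copy_number_distribution(cells):
    """Tally how many cells show each MC copy number."""
    return Counter(mc_copy_number(foci) for foci in cells)


# Hypothetical scored foci for three cells, for illustration only.
example_cells = [
    [{"centromere": True, "mc_specific": False},
     {"centromere": True, "mc_specific": True}],
    [{"centromere": True, "mc_specific": False}],
    [{"centromere": True, "mc_specific": True},
     {"centromere": True, "mc_specific": True}],
]
distribution = copy_number_distribution(example_cells)
```

Foci carrying only the first tag correspond to native centromeres and are not counted.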
[0354] Induction of Callus and Roots from Adchromosomal Plants Tissues for Inheritance Assays
[0355] MC inheritance is assessed using callus and roots induced from transformed plants. To induce roots and callus, tissues such as leaf pieces are prepared from adchromosomal plants and cultured on a MS or N6 medium that may contain a cytokinin, e.g., 6-benzylaminopurine (BA), and an auxin, e.g., α-naphthaleneacetic acid (NAA). Any tissue of an adchromosomal plant can be used for callus and root induction, and the medium recipe for tissue culture can be optimized using procedures known in the art.
[0356] Clonal Propagation of Adchromosomal Plants
[0357] To produce multiple clones of plants from a MC-transformed plant, any tissue of the plant can be tissue-cultured for shoot organogenesis using regeneration procedures described under the section regeneration of plants from explants to mature, rooted plants (see above). Alternatively, multiple axillary buds can be induced from a MC-modified plant by excising the shoot tip, which can be rooted and subsequently be grown into a whole plant; each axillary bud can be rooted and produce a whole plant. Additionally, multiple shoots that result from one plant can be subdivided in culture to produce multiple individual plants.
[0358] Scoring of Antibiotic- or Herbicide Resistance in Seedlings and Plants (Progeny of Self- and Out-Crossed Transformants)
[0359] Progeny seeds harvested from MC-modified plants can be scored for antibiotic- or herbicide resistance by seed germination under sterile conditions on a growth medium (for example MS medium) containing an appropriate selective agent for a particular selectable marker gene. Only seeds containing the MC can germinate on the medium and further grow and develop into whole plants. Alternatively, seeds can be germinated in soil, and the germinating seedlings can then be sprayed with a selective agent appropriate for a selectable marker gene. Seedlings that do not contain the MC do not survive; only seedlings containing the MC can survive and develop into mature plants.
[0360] Genetic Methods for Analyzing MC Performance
[0361] Though sorghum is typically propagated vegetatively, it is possible to use sexual propagation techniques as well. In addition to direct transformation of a plant with a MC, plants containing a MC can be prepared by crossing a first plant containing the functional, stable, autonomous MC with a second plant lacking the MC.
[0362] Fertile plants modified with MCs can be crossed to other plant lines to study MC performance and inheritance. In the first embodiment of this method, pollen from an adchromosomal plant can be used to fertilize the stigma of a non-adchromosomal plant. MC presence is scored in the progeny of this cross using the methods outlined in the preceding section. In the second embodiment, the reciprocal cross is performed by using pollen from a non-adchromosomal plant to fertilize the flowers of an adchromosomal plant. The rate of MC inheritance in both crosses can be used to establish the frequencies of meiotic inheritance in male and female meiosis. In a third embodiment of this method, pollen from an adchromosomal plant is used to fertilize another or the same adchromosomal plant (e.g. self or sibling pollination). In the fourth embodiment of this method, the progeny of one of the crosses just described are back-crossed to a non-adchromosomal parental line, and the progeny of this second cross are scored for the presence of genetic markers in the plant's natural chromosomes as well as the MC. Scoring of a sufficient marker set against a sufficiently large set of progeny allows the determination of linkage or co-segregation of the MC to specific chromosomes or chromosomal loci in the plant's genome. Genetic crosses performed for testing genetic linkage can be done with a variety of combinations of parental lines; such variations of the methods described are known to those skilled in the art.
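The rates of meiotic inheritance from the reciprocal crosses can be computed directly from progeny counts. The Python sketch below is illustrative only: the progeny counts are hypothetical, and the 50% expectation used in the goodness-of-fit test is an assumption corresponding to a single MC copy segregating like a hemizygous locus, not a value stated in this description.

```python
def transmission_frequency(n_with_mc, n_total):
    """Fraction of scored progeny that carry the MC."""
    if n_total <= 0:
        raise ValueError("at least one progeny must be scored")
    return n_with_mc / n_total


def chi_square_1df(n_with_mc, n_without_mc, expected_fraction=0.5):
    """Chi-square statistic (1 degree of freedom) against an expected fraction."""
    total = n_with_mc + n_without_mc
    expected_with = total * expected_fraction
    expected_without = total - expected_with
    return ((n_with_mc - expected_with) ** 2 / expected_with
            + (n_without_mc - expected_without) ** 2 / expected_without)


# Hypothetical progeny counts (with MC, without MC), for illustration only.
male_cross = (40, 60)    # adchromosomal plant used as pollen donor
female_cross = (48, 52)  # adchromosomal plant used as seed parent

male_rate = transmission_frequency(male_cross[0], sum(male_cross))
female_rate = transmission_frequency(female_cross[0], sum(female_cross))
male_chi2 = chi_square_1df(*male_cross)
female_chi2 = chi_square_1df(*female_cross)
```

A statistic above 3.841 (the 5% critical value for one degree of freedom) would suggest that transmission deviates from the assumed expectation; in the hypothetical data this holds for the male cross but not for the female cross.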
[0363] It should be understood that various changes and modifications to the presently preferred embodiments described herein will be apparent to those skilled in the art. Such changes and modifications can be made without departing from the spirit and scope of the present invention and without diminishing its intended advantages. It is therefore intended that such changes and modifications be covered by the appended claims.
EXAMPLES
Example 1
Sorghum Centromere Discovery
[0364] Identification of Sorghum Satellite Repeat Sequences
[0365] The investigators compiled Sorghum repetitive genomic DNA as candidate probes for hybridization with the BAC libraries. Sorghum sequence was extracted from GenBank and analyzed by homology to a known Sorghum satellite sequence (Zwick, MS, et al. Am J Bot 87: 1757-1764, 2000), set out in SEQ ID NO:177:
TABLE-US-00004 tacgtaagct tcgtttcgtc tgtttggaca tagtgctaat ctttatgcaa gatagatgca 60 cggtttacgt ggaacatatg atatgctcag aagcaattta ggacgcacct aatataactc 120 cttgatgatg tgtgtcacat ggaatcttgc ttcggtttct ttagagacag tgttagtttt 180 ggtagaagat atgtgcacag tgtacgccta atgcaccata ggctaaagaa accattttag 240 acgcacccga tggtactcgt agttgaagag gctcaactgg aggctcgatt tggtctgttc 300 ggatatagtg ctaatcttga tgcaagatag ttgcacaatt tgcaggcaac gtaccatatg 360 ttaagaaatc aatttggacg cacccaatgg aactcctaga tgacgtgtgt catatggaac 420 tcgcttcggt ctgtttggtg accatattag tttcactgca ggaaaggtgc atagtttgtg 480 cctaatgcac catagtctaa gaaaaccatt tttgatgcac ctgtttgtac ttctatgaag 540 aggctcaagt ggaagctcgg ttcggtctgt ttggagatag tgctaatctt gatgcaagat 600 aggtgtacgg tttgtatgga acataccata tgcttggaaa tcaatttgga tgcacccgtt 660 ggaactcctt gagaagtgtg tcttatgtac cctcgctttg gtctgtttag aaatagtgtt 720 agtttcagtg caagatatga gcatggtttg cgcctaacgc accatagtct aagaaaccat 780 tttggaagca cctgttggta cttcggtgaa gaagctcaag tggaagctcg gtttgacctg 840 tttggagata gtgctaatct tgatgcaaga tagtgcatga tttgcaagga acataccata 900 tgcttagaaa tcaacttgga cgcacgcccc gcaactccta catcacgtgt gtcatatgga 960 atcttacttc ggtccatttg taacattgta agttttagtg caag 1004
The investigators identified the sequences listed in Table 4 (SEQ ID NOs:23-176). Each sequence represents a different repetitive DNA sequence from S. bicolor.
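To illustrate how closely related such repeats can be, the following Python sketch computes an ungapped, position-by-position percent identity between SEQ ID NO:23 and SEQ ID NO:24 as listed in Table 4. This simple comparison is an assumption made for illustration; it is not the homology method used by the investigators, and sequences of unequal length or shifted register would require a proper alignment instead.

```python
def percent_identity(seq_a, seq_b):
    """Ungapped percent identity between two equal-length sequences."""
    a = seq_a.replace(" ", "").lower()
    b = seq_b.replace(" ", "").lower()
    if len(a) != len(b):
        raise ValueError("sequences must be the same length for ungapped comparison")
    matches = sum(base_a == base_b for base_a, base_b in zip(a, b))
    return 100.0 * matches / len(a)


# SEQ ID NO:23 and SEQ ID NO:24, copied from Table 4 (spaces are ignored).
SEQ_23 = ("aaactgagct tccacttgag cccctttacc caggagtatc atcgggtgca tccaaaatgg "
          "tttcttagcc aatgatgcat taggcacaaa ttgtgtacct atcttgtacc aaaactaact "
          "ctgtctccaa aca")
SEQ_24 = ("aaactgagct tccacttgag cccctttaca caggagtatc atcgggtgca tccaaaatgg "
          "tttcttagcc aatgatgcat taggcgcaaa ttgtgtacct atcttgtacc aaaactaact "
          "ctgtctccaa aca")

identity = percent_identity(SEQ_23, SEQ_24)
meets_claimed_threshold = identity >= 80.0  # 80% identity is the level recited in the claims
```

The two 133-nucleotide sequences differ at two positions, so the computed identity is well above the 80% level recited in the claims.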
TABLE-US-00005 TABLE 4 Sorghum satellite sequences SEQ ID NO: Nucleic acid sequence 23 aaactgagct tccacttgag cccctttacc caggagtatc atcgggtgca tccaaaatgg 60 tttcttagcc aatgatgcat taggcacaaa ttgtgtacct atcttgtacc aaaactaact 120 ctgtctccaa aca 133 24 aaactgagct tccacttgag cccctttaca caggagtatc atcgggtgca tccaaaatgg 60 tttcttagcc aatgatgcat taggcgcaaa ttgtgtacct atcttgtacc aaaactaact 120 ctgtctccaa aca 133 25 caaactgagc ttccacttga gcccctttac ctaggagtat gatcaggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgtac caaaactaac 120 tctgtctcca aac 133 26 caaactgagc ttccacttga gcccctttac ccaggagtat cttcaggtgc atccaaaatt 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtatc tatcttgcac caaaactaac 120 tctgtctcca aac 133 27 caaactgagc ttccacttga gcacctttac ccacgagtat catcgggctc atccaaaatg 60 gtttcttagc ccatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 28 caaactgagc ttccacttga gcccctgtac ccaggagtat catcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac taaaactacc 120 tttgtctcca aac 133 29 gaaccgacct tccacttgag cctctccacc taggattatc atcgggtgct tgcataatgg 60 tttctgagcc tatggtgcat tatgcgcaaa ccatgcacca atcctgcacc taaactaaca 120 ctgtctccaa aca 133 30 caaactgagc ttccacatga gcccctttac ccaggagtat attcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa agtatgtacc gatcttgcac caaaactaac 120 tctgtctcca aac 133 31 caaactgaga ttccacttga gcccctttac cggggagtat catcgtgtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actatgtacc tatcttacac caaaactaac 120 tctgtctcca aac 133 32 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaatg 60 gtttcttagc ctgtgatgca ttaggagaaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 33 aaactgagct tccacttgag cccctttacc caggagtgtt atcgagtgca tccaaaatag 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atcttgcacc aaaactatct 120 ctgtctccaa aca 133 34 aaactgagct tccacttgag cccctttact caggagtatc atcgggttca tccaaaatgg 60 tttcttagcc aatgatgcat ttggcgcaaa ctgtgtacct atctcgcacc aaaactaact 120 
ttgtctccaa aca 133 35 caaactgagc ttccacttga gcccctttac ccaggagtat catcgagtgc atccaaaata 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactgcc 120 tctgtctcca aac 133 36 caaactgagc ttccacttga gcccctttac ccaggagtat catcaggttc atccaaaatt 60 gtttcttagc ctatgatgca ttaggcgtaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 37 caaactgagc ttccacttga gcccctttgc ccaggagtat catcaggttc atccaaaatt 60 gtttcttagc ctttgatgca ttaggcgtaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 38 aaactgagct tccacttgag cccctttacc caggagtatc gtcgggtgca tccaaaatgg 60 tttcttaccc tatgatgcat taggcgcaaa atgtgtacct atcttgaacc aaaactaact 120 ctgtctccaa aca 133 39 caaactgagc ttccacttga gcccctttac ccaggagtat gatcgggtgc atctaaaatg 60 gtttcttagc ctatgatgca ttaggcgaaa actgtgtacc tattttgcac caaaactaac 120 tctgtctcca aac 133 40 caaactgagc ttccacttga gcccctttac ccaggagtat catcgggtgc atccaaaatg 60 gtttcttagg ctatgatgca ttaggcgcaa actgtgcacc tatcctgtac ctaaactaac 120 actgtctcca aac 133 41 caaactgagc ttccacttga gccccattac ctaggagtat gttcgggtgg attcaaaatg 60 gtttcttagc ctatgatgca ttatgcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 42 aaactgagct tccacttgag cccctatacc tagtagtatc atcgggtgca tccaaaatga 60 tttcttatcc tatgatgcat taggcgcaaa ctgtgtacct atcttgcacc aaaactaact 120 ctgtctccaa aca 133 43 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcacag actatgtacc tatcttgcac caaaactaac 120 tccgtctcca aac 133 44 caaaccgacc ttccacttca gcctctttac ctaggattat catcgggtgc ttccataatg 60 gtttttgagc ctatggtgca tattgcgcaa accatgcacc aatcttgcat ctaaactaac 120 actgtctcca aac 133 45 aaactgagct tccaattgag cccctttacc caggagtatc atcgggtgca aacaaaatgg 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atcttgcacc aaaaataact 120 ctgtctccaa aca 133 46 caaactgagc ttccacttga gcccctttac ctaggagtgt catcgggtgc attcaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tttgtctcta aac 133 47 caaactgagc ttccatttga gcccctttac ccacgagtat 
aattgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtatacc tatattgcac caaaactacc 120 tctgtctcca aac 133 48 caaactgagc ttccacttga gcccctatac cgaggagtat catcgggtgc attcaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac tgaaactaac 120 tctgtctcca aac 133 49 caaactgagc tttcacttga gcccctttac ctaggagtat gaacgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcta aac 133 50 aaactgagct tccacttgag cccctttaca caggagtatc atcgggtgca tccaaaatgg 60 tttcttagcc tatgatacat aaggcgcaaa ctgtgtatgt atcttgcacc aaatctaact 120 ctatctccaa aca 133 51 caaactgagc ttccacttga gccottttac ccaggagtat catcgagtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactacc 120 tccgtctcca aac 133 52 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaatg 60 gtttcttcgc ctatgatgca ttaggcgcaa actatgtacc tatcttgcac caaaactaac 120 tttgtctcca aac 133 53 aaactgagct tccacttgag cccctttacc caggtgtatc atcgggtgca tccaaactgg 60 tttcttagcc tatgacgcat taggcgcaaa ctgtgtacct atcttgcacc aaaactacct 120 ctgtctccaa aca 133 54 caaactgagc ttccacttga gcccctttac ccaggagtat attcgggtgc atccaaaatg 60 gtttcttagc ctatgatgcc ttaggcgcaa agtgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 55 caaactgagc ctccacttga gcccctttac ccaggagtat catcaggtgc atccaaaatg 60 gtttcttagc atatgatgca ttaggcgtaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 56 caaactgagc ttccacttga gcccctttac ccaggagaat caacagatgc atccaaaata 60 gtttcttagc ctttgatgca ttaggtgcaa actgtgtagc tatcttgcac caatactaac 120 tctgtctcca aac 133 57 gaactgacct tccacttgag cctctttacc taggattatc atcgggtgct tccataatgg 60 tttctgagcc tatggtgcat tatgcgcaaa ccatgcacca atcttgcacc taaactaaca 120 ctgtctccaa aca 133 58 aaactgagct tccacttgag cccttttacc caggagtatc atcgagtgca tccaaaatga 60 tttcttaccc tatgatgcat taggcgcaaa ctgtgaacct atcttgcacc aaaactacct 120 ctgtctccaa aca 133 59 caaactgagt ttccacttga gcccctttac ccaggagtat catcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttcggcgcaa 
attgtgtacc taacttgcac caaaactaac 120 tctgtctcca aac 133 60 aaactgagct tccacttgag cccctttacc taggattatc atcgggtgca tccaaaatgg 60 tttcttagcc tattatgcat taggcgtaaa ctgtgtacca atcttgcacc aaaactaact 120 ctctctccaa ac 132 61 aaactgagct tccatttgag cccctttacc caggattatc atcgggtgcg tccaaaatgg 60 tttctgagcc tatgatgcat taggtggaaa ctgtgtacct attttgcacc aaaactaact 120 ctgtctccaa aca 133 62 accaaactgt gcttccactt aagcctcttc acctaggatt accatcaagt gcatccaaaa 60 tggtttctta gactatggtg aattaggcaa aaactgtgca cctatcttgc accaaaacta 120 acactatgtc caa 133 63 caaactgagc ttctgcttga gcccctttac ctaggagtat catcgggtgc atccaaaatg 60 gtttctcagc ctttgatgca ttaggcgtaa actgtgtacc tatgttgcac caaaactaac 120 tctatctcca aac 133 64 gatcaaacca agcttccact tgagcccctt ttcctaggag taccattagg tgtgtccaaa 60 aaggttctta gcctatggtg cattaggcgc aaaccattca cctatcttgc acagaaacta 120 atactgtctc aaa 133 65 cgaaccgacc ttccacatga gactcttcac ctaggattat catcgggtgc ttccataatg 60 gtttctgtgc ctatggtgca ttatgcgcaa accatgcacc aatcttgcac ctaaactaac 120 actctctcca aac 133 66 caaactgagc tttcccttga gccccttgag ccaggagtat catcgggtgc atccaaaatg 60 gtttcttagc tgtatgaagc attaggcgca aactgtgtac ttatcttgca ccaaaactaa 120 ctctgtctcc aaa 133 67 caaactgagc ttccacttga gcccctttac cttggagtat caacgggtgc atccaaaatg 60 ttttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tttgtctcca aac 133 68 aaactgagct tccacttgag cccctttacc caggagtatc atcgggtgca tccaaaatgg 60 attcttagcc tatgatgcat taggcgtaaa ctgtgtacct ttcttgtacc aaaactaact 120 ctgtctccaa aca 133 69 caaactgagc ttccacttga gcccctttac ctaggagtat catcggctcc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcct aac 133 70 caaactgagc ttccgcttga gcccctttac ctaggagtat catcgggtgc atccaaaatg 60 gtttctcagc ctatgatgcc ttaggagcaa actgtgtacc tatcttgcac caaaactaag 120 tctgtctcca aac 133 71 aaactgagct tccacttgag cccctttgcc caggagtatc atcaggttca tccaaaatgg 60 tttcttagcc tttgatgcat taggcgtagc ctgtgtacct atcttgcacc ataactaact 120 ctgtctccaa aca 133 72 
accaaactgt gcttccactt gagcctcttc atctaggatt accatcaagt gcatccaaaa 60 tggtttctta gactacggtg aattaggcta aaattgtgca cctatcttgc accaaaacta 120 acactatgtc caa 133 73 caaactgagc ttccacttga gccccgttac ctaggagtat cttcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcacaa actatgtacc tatcttacac taaaactaac 120 tctgtctcca aac 133 74 aaactgagct tccacctgag cccctttacc caggagtatc atcgggtgca tccaaaatgg 60 tttcttagcc tatgatgctt taggcgcaaa ctgtgtacct atctagcacc aaaactagct 120 ctgtctccaa aca 133 75 cgaaccgacc tttcaattga gcctcttcac ctaggattat catcgggtgt ttccataatg 60 gtttctgagc ctatggtgca ttatgcgcaa accatgcacc aatcttgcac ctaaactaac 120 actgtctcca aac 133 76 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggagaaa actgtgtccc tatcttgcac caaaactaac 120 tctgtctcca aac 133 77 aaactgagct tccacttgag cccctttacc caggagtatc attgggtgca tccaaactgg 60 tttcttagcc tatgatgcat ttggcgcaaa ctgtgtacct atcttgcacc aaaactgact 120 ctgtctccaa aca 133 78 aaactgagat tccacttgag cccctttacc caacagtata atcgggtgca tccaaaatgg 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtaccg atcttgcacc aaaactaact 120 ctgtctccaa aca 133 79 caaactgagc ttccacttgg gcccctttac ccaggagtat caacagatgc atccaaaata 60 gtttcttagc ctttgatgca ttaggtgcaa actgtgtagc tatcttgcac caatactaac 120 tctgtctcca aac 133 80 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atacaaaatg 60 gttccttagc ctatgatgca ttaggcgcaa actgtgtact tatcttgcac caaaactaac 120 tctgtctcca aac 133 81 aaactgagct tccacttgag cccctttacc taggagtatc atcgggtgca tccaaaatgg 60 tttcttagcc tatgatgcat ttggcacaaa ctgtgtacct atcctgcacc aaaactaact 120 ctgtctccaa aca 133 82 aaactgagct tccagttgag cccctttacc gaggagtatc atcaggtgca tccaaaatgg 60 tttcttagcc tatgatgcat taggcgcaac ctgtgtacct atcttgcacc aaaactacct 120 ctgtatccaa aca 133 83 caaactgagc ttccacttga gcccctttac ccaggagtat caacagattc atccaaaata 60 gtttcttagc ctttgatgca ttaggtgcaa actgtgtagc tatcttgcac caatactaac 120 tctgtctcca aac 133
84 aaacctacct tccacttgag cctctccacc taggagtatt atcgggtgct tccataatgg 60 tttccgagcc tatggtgcat tatgcgcaaa ccatgcacca atcttgcacc taaactaaca 120 ctgtctccaa aca 133 85 caaactgacc ttccacttga cactcttcac ctaggagtat tatccggtgc ttccataatg 60 gtttctgagc ctatggtgca ttatgcgcaa accatgcacc aatcttgcac ctaaactaac 120 actatctcca aac 133 86 caaactgagc ttccacttga gcccctttac cctggagtat cttcaggtgc atccaaaatt 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 87 caaactgagc ttccacttga gcccctttac ccaggagtat catcaggtgc atcagaaatg 60 gtttcttagc ctatgatgca ttaggtgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 88 caaactgagc ttccacttga gcccctttac ccaagagtat catcgggtgc atccaaaatg 60 gtttcttagc ctatgacgca ttaggcacaa actgtgtacc tatgttgcac caaaactaac 120 tctgtctcca aac 133 89 caaactgagc ttcctcttga gcccctttac ctaggagtat catcggttgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcgtaa actgtgtacc tatcttgcac caaaacttac 120 tctgtctcca aac 133 90 caaactgagc ttccacttga gcccctttac ccaggagtat catcgggtgc atacaaaatg 60 gattcttagc ctatgacgca ttaggcgcaa actatgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 91 aaactgagtt tccacatgag cacctttacc ctggagtatc atcaggtgca tccaaaatgg 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atattgcacc aaaactaact 120 ctttctccaa aca 133 92 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggcgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcacaa actatgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 93 aaactgagct tccagttgag cccctttacc cagcagtatc atcgggtgga tccaaaatgg 60 tttcttcacc tatgatgcat taggcgcaaa ctgtgtacct atcttgcacc aaaactaact 120 ctatctcgaa aca 133 94 aactgagctt ccacttgagc ccctttagcc aggagtatca tcgggtgcat ccaaaatggt 60 ttcttagcct atgaaatcat taggcgcaaa ctgtgtacct atcttgcacc aaaactaact 120 ctgtctctaa aca 133 95 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggagc atccaaaatg 60 gtttcttagc ctatgatgca taaggagcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 96 caaactgagc ttccacttaa gcccctttac ctaggagtat catcgggtgc 
atccaaaatg 60 gtttcttagc ctatgatgca ttacgcgcaa actgtgtacc tatcttgcac caaaactaac 120 tttgtctcca aac 133 97 caaactgagc ttccacttga gcccctttac ctaggagtat aatcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttagacgtaa actatgtacc tatcttgcac caaaactaac 120 tctgtctccg aac 133 98 aaactgagct tccacttgag cccctttacc taggagtatc atcgggtgta tccaaaattg 60 tttcttagcc tacgatgcat taggcgcaaa ctgtgtacct atcttgcacc aaaactaact 120 ctgtctccaa aca 133 99 caaactgagc ttccacttga gcccctttac ctaggagtat cattgggtgc atccaaaatg 60 ctttcttagc ctatgatgca ttaggtgcaa actgtgtagc tatcttgcac caaaactatc 120 tctatctcca aac 133 100 aaactgagct tccacttgag cccgtttacc gaggagtatc atcgagtgca tctaaaatga 60 tttcttagcc tatgatgcat taggcacaaa ctgtgtacct atctagcacc aaaactaact 120 ttctctccaa aca 133 101 accgaccttc cacttgagac tcttcaccta ggattatcat cgggtgcttc cataatggtt 60 tctgagccta tggtgcatta tgcacaaacc atgcaccaat attgcaccga aactaacact 120 gtctccaaac a 131 102 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaatg 60 gtttcttagc caatgatgca ttaggagaaa actgtgtacc aatcttgcac caaaactaac 120 tctgtctcca aac 133 103 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaaag 60 gtctcttagc ctatgatgcc ttaggagaaa actatgtacc tgtcttgcac cataactaac 120 tctgtctcca aac 133 104 aaactgtgct tgcacttgag cccctttacc caggagtatc atcgggtgca tccaaaatgg 60 tttcttagcc tatgatgcat taggcgcata ctgtgtacct atcttgcagt aaaactaact 120 ctgtctccaa aca 133 105 aaactgagct tccacttgag gccctttatc taggagtatc atcgggtgca tccaaaatgg 60 tttcttagcc tatgatgcgt taggcgcaaa ctatgtacct atcttgcacc aaaactaact 120 ctgtctccaa aca 133 106 aaactgagct tccacttgag cccctttacc taggagtatc ttcgggtgca tcagaaatgg 60 tttcttagcc tatcatgcat taggcacaaa ctgtgcacct atcttacatc aaaattaact 120 ctgtctccaa aca 133 107 caaactgagc ttccacttga gcccctttac ccaggagtat attcgggtgc atccaaaatg 60 gtttcttagc ctatgatgcg ttaggcgcaa actgtgtacc tatcttgcac cacaactaaa 120 tctgtctcca aac 133 108 caaactgagc ttccacttga gcccctttac ccaggagtat caacagatgc atccaaaata 60 gtttcttagc ctttgatgca ttaggtgcaa actgtgtagc 
tatcttgcac caatactaac 120 tctgtctgca aac 133 109 caaactgagg ttccgcttga gcccctttac ctaggagtat catcgggttc atccaaaatg 60 gtttctcagc ctatgatgcc ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 110 aaactgagct tccacttgag cccctttacc caggagtatc atcgggtgca tccaaaatgg 60 attcttcgcc tatgatgcat taggcgcaaa ctgtgtacct atcttgcacg aaaactaact 120 ctatctccaa aca 133 111 cgaaccgacc ttccacttga gccccttcac ctaggattat catcgggtgc ttccataatg 60 gtttttgagc ctatggtgca ttatgcacaa accatgcacc aatcttgcac ctaaactaac 120 actgtctcca aac 133 112 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaatg 60 gttaattagc ctatgatgca ttaggcgcta actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 113 caaactgagc ttccacttaa gcccctttac ccaggagtat cttcaggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcacaa actatgtacc tatcttacac caaaactaac 120 tctgtctcca aac 133 114 aaccgagacc tccacttgag gcctcttcac ctaggagata ccatcggatg cgtctaagat 60 ggtttcttat cctatggtgc attatgcgta acccgtgcac atatcttgct ccaaaactaa 120 tgctgtctct aaa 133 115 caaactgagc ttccacttga gcccctttac ccagtagtat catcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttatgcgaaa attgtgtacc tatattgcac caaaactaac 120 tctgtctcca aac 133 116 caaactgagc atccacttga gcccctttac ctaggagtat catcgggtgc atacaaaatg 60 gtttcttaac ctatgatgca ttagacgcaa actgtgtacc tatattgcac caaaactaac 120 tctgtctcca aac 133 117 caaactgagc ttccacttga gcccctttac cttggagtat catcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcacaa actgtgtacc tatcttgaac caaaactaac 120 tctgtctcca aac 133 118 aaactgagct tccacttgag cccctttacc caggagtatc ttcaggtgca tccaaaattg 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atctttcacc aaaactaaca 120 ctgtctccaa aca 133 119 caaactgagc ttccacttga gcccctttac ctagaagtat catcgggtgc atccaaaagg 60 gtttcttagc ctatgatgta ctaggcgtaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 120 caaactgagc taccacttga gcccctttac ctaggagtat catcaggttc atccaaaatt 60 gtttcttagc ctatgatgcg ttaggcgtaa actgtttacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 121 
caaactgagc ttccatttga gcccctttgc ctaggagtat catcgggtgc atccaaaatg 60 gttccttagc ctatgatgca ttaggtgcaa actgtgtacc tatcttgcac caaaactaac 120 tttgtctcca aac 133 122 caaactgagc ttccacctga gccactttaa ccaggagtat catcgggtgc atccaaaatg 60 ttttcttagc ctatgatgct ttaggcgcaa attgtgtacc tatcttgcac caaaactaac 120 tctgcctcca aac 133 123 caaactgagc ttccacttga gcatctttac ccaggagtat catcaggtgc atccaaaata 60 gtttcttagc ctatgatgca ttaggcacaa actgtgtacc tatcttgcaa caaaactaac 120 tctgtctcca tac 133 124 caaactgagc ttccacttga gcccctttac ctaggggtaa catcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcacaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 125 caaactgagc ttccacctga gcccctttac ctaggagtat catcgtgtgc atctaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 126 caaactgagc ttccacttga gcccctttac ctaggagtat catcgtgtgc atcaaaaatg 60 gtttcttagc ctatgaagca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 127 caaactgagc ttccacatga gcccctttac ctaggagtat catcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcacaa actgtgtacc tatcttgcac caaaactaac 120 tctgtatcca aac 133 128 caaactgagc ttccacttga gcccctttac ccaggagtat catcgagtgc atctaaaaag 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaagctacc 120 tctgtctcca aac 133 129 caaactgagc ttccacttga gcccctttac ccaggagtat caacagatgc atccaaaata 60 gtttcttagc ctttcatgca ttaggtgcaa actgtgtagc tatcttgcac caatactaac 120 tctgtctcca aac 133 130 aaactgagct tccacttgag cccctttacc ctggaatatc atcgggtgca tcccaaatgg 60 tttcttagcc tatgatgcat taggcgcaaa gtgtgtacct atcttgcacc aaaactaact 120 ttgtctccaa aca 133 131 aaactgagct tcaacttgag cccctttacc taggagtatc atcgggtgca tccaaaatgg 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct gtcttgcacc aaaactaacc 120 ctgtctccaa aca 133 132 caaactgagc ttccacttga gcccctttac ccaggagtat caacagatgc ttccaaaata 60 gtttcttagc ctttgatgca ttaggtgcaa actgtgtagc tatcttgcac caatactaac 120 tctgtctcca aac 133 133 caaactgagc ttccacttga gcccctttac ccaggagtat caacagatgc 
atccaaaata 60 gtttcttagt ctttgatgca ttaggtgcaa actgtgtagc tatcttgccc caatactaac 120 tctgtctcca aac 133 134 aaactgagct tccacgtgag cccctttacc caggagtata atcgggtgca tccaaaatgg 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atcttgcacc aaagctatct 120 ctgtctccaa aca 133 135 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc attcaaaatg 60 gtttcatagc ctatgatgca ttaggcgcaa actatgtacc tatcttgcac caaaactacc 120 tccgtctcca aac 133 136 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgt atccaaaatg 60 gtttcttagc ctatgatgca ttaggcatag actgtgtacc tatattgcac caaaactaac 120 tccgtctcca aac 133 137 caaactgagc ttccacttga gcccctttac ccaggagtat catcgggtgc atccaaaatt 60 gtttcttagc ttatgatgca ttaggtgtaa actgtgtacc tatcttgcat caaaactcac 120 tctgtctcca aac 133 138 caaactgagc ttccacttga gcccctttac ccaggagtat cttcaggtgc atccaaaatt 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tttcttgcac caaaactaac 120 tctgtctcca aac 133 139 aaactgagct tccacttgag cccctttacc caagagtacc atcgggtgca tccaaaatgg 60 tttcttagcc aatgatgcat taggcgcaaa ttgtgtacct atcttgtacc aaaactaact 120 ttgtctccaa aca 133 140 caaactgagc ttccacttga gcccctttac ctagcagtat aatcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcacaa actatgtact tatcttgcac caaaactaac 120 tctgtctcca aac 133 141 aaactgagct tccacttgag cccctttacc gaggagtatc atcgggtgca ttcaaaatgg 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atcttgcaca aaaactagct 120 ctgtctccaa aca 133 142 caaactgagc ttccacttga gcccctttac ccaggagtat catcaggttc atccaaaatt 60 gtttcttagc ctttgatgca ttaggcgtaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 143 aaactgagtt tccacatgag cacctttacc caggagtatc atcaggtgca tccaaaatgg 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atcttgcacc aaaactaact 120 ctgtctccaa aca 133 144 aaactgagct tccacttgag ccccttttct caggagtatc attgggttca tccaaaatgg 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atcttgtacc aaaactaact 120 ctgtctccaa aca 133 145 caaactgagc ttccacttga gcccctttac ccaggagtat catcgggtgc atccaaaatg 60 gtttctttgc ctatgatgca ttaggcggaa 
actgtgtacc tgttttgcac caaaactaac 120 tctatctcca aac 133 146 aaactgtgct tccacttgag cccctttacc taggagtatc atcagggtgc atccacaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133
147 caaactgagc ttccacttgg gcccctttac ccaggagtat cttcaggtgc atccaaaatt 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 148 aaactgagct tccgcttgtg cccctttacc caagagtatc gacgggtgca tccaaaatgg 60 tttcttagcc tacgatgcat taggcgcaaa cagtgtagct atcttgcacc aaaactaact 120 ttgtctccaa aca 133 149 caaactgagc ttccacttga gcccctttac ctaggagtat catcaggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggagaaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 150 caaactgagc ttccacttga gcccctttac ctaggagtat catcgtgtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcacaa actgtgtacc tatcttgcac caaaagtaac 120 tctgtctcca aac 133 151 caaactgagc ttccacttga gcccctttac caaggagtat catcgggtgc atccaaaatg 60 gtttcttagc ctctggtgca ttaggcacaa gctaggtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 152 aaactgagct tccacttgag cccctttact caggagtatc atcgtgtgcc tccaaaatgg 60 tttcttagcc tatgatgcat taggcgcata ctgtgtacct atcttgcacc aaaactacct 120 ctatctccaa aca 133 153 aaactgagct tccacttgag cccctttaca cacgagtatc atcgggtgca tccaaaatgg 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atcttgtacc aaaactaact 120 ctggctctaa aca 133 154 caaactgagc ttccacttga gtccctttac ccaggagtat catagggtgc atccaaaatg 60 ttttcttagc ctatgatgca ttaggcgtaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 155 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaagatg 60 gtttcttagc ctatgatgca ttagacgcaa actgtgtacc tatcttgcac caaaactaac 120 tttgtctcca aac 133 156 aaactgagct tccacttgag cccctttaca caggagtatc atcgggtgca tccaaaatgg 60 tttcttagca tatgatgcat tagtcgcaaa ctgtgtacct atcttgtacc aaaactaact 120 ctgtctccaa aca 133 157 caaacggagc ttccgcttga gcccctttac ctaagagtat catcgggtgc atccaaaatg 60 gtttgtcagc ctatgatgca ttaggtgcaa actgtgtacc tatcttgccc caaaactaac 120 tctgtctcca aac 133 158 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaaag 60 gtttcttagc ctatgatgct ctaggagaaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcta aac 133 159 caaactgagc ttccacttga gcccctttac ccaggagtat 
catcgggtgc atctaaagtg 60 gtttcttagc ctacgatgca gtaggcgcaa actgtgtaca tatcttgcac caaaactaac 120 tctgtctcca aac 133 160 tgaaacggag ctttcacttg agccccttga cctaggagta ccatcgggtg catccaaaat 60 ggtttcttat cctatggtgc attaggtgta aaccgtgcac ctatcttgca ccgaaactaa 120 cgttgtctct aaa 133 161 caaactgagc ttccacttga gcccctttac ccgggagtat catcgggtgc atccaaaatg 60 gtttcttatc caatgatgcg ttaggcgcaa actatgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 162 caaactgagc ttccagttga gcctctttac ccaggagtat catcgggtgg atccaaaatg 60 gtttgttagc ctatgatgca ttaggagcaa actatgtacc tatcttgcac caaaactaat 120 tctgtctcca aac 133 163 caaactgagc ttccacttga gcccctttac ccaggaggat cttcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tttcatgcac caaaactaac 120 tctgtctcca aac 133 164 caaaccgagc tttcacttta gccccttgac ctaggagtac catcgggtgc gttcaaaacg 60 gtttcttatc ctatggtgca ttaggtgcaa accgtgcacc tatcttgcac tgaaactaac 120 actgtctcta aac 133 165 aaactgagct tcgacttgag cccctttacc caggagtatc atcgggtgca tccaaaaggg 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacgt atcttgcacc aaaactacct 120 ctgtctctaa aca 133 166 aaactgagct tccacttgag cccctttacc caggagtatc atcgggtgca tccaaaagag 60 tttcttagcc tatgatgcat taggcgcaaa ctgtgtacgt atcttgcacc aaaactacct 120 ctgtctccaa aca 133 167 caaactgagc ttccacttca gcccctttaa tcaggaatat catcgggtgc atccaaagta 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 168 caaactgagc ttccacttga gcccctttac ccaggagtat catcgggtgc atccaaaata 60 gtttcttagc ctacgatgca gtaagcgcaa actgtgtacc tatcttgcac caaaactaac 120 tcggtctcca aac 133 169 caaactgagc ttccacttga gcccctttac ctaggagtat gatcaggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcacaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 170 aaactgagct tccacttgag cacctttacc caggagtatc atcaggtgca tccaaaatgg 60 gttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atcttgaacc aaaactaact 120 ctatctccaa aca 133 171 caaactgagc ttccacttga gcccctttac ccaggagtat cttcaggtgc atccaaaatt 60 gtttcttagc ctatgatgca 
ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tttgtctcca aac 133 172 caaactgagc ttccacttga gcccctttac ctaggagtat aatcgggtgc atccaaaatg 60 gtttcttagc ctatgatgca ttaggtgcaa actatgtacc tatcttgcac caaaactaac 120 tttgtctccg aac 133 173 caaacagagc ttccaattga gaccctttac tcaggagtat catcgggttc atccaaaatg 60 gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tattttgcac caaaactaac 120 tctgtctcca aac 133 174 caaactgagc ttccacttga gcccctttac ccacgagtat catcgggctc atccaaaatg 60 atttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120 tctgtctcca aac 133 175 actgcgcttc cacttgagcc ccattaccca ggagtatcat cgggtgcatc caaaatagtt 60 tcatagccga tgatgcatta ggtgtaaact gtgtacctat cttgcaccaa aactaactct 120 gtctccaaac a 131 176 aactgagctt gcacttgagc ccctttaccc aggagtatca tcgagtgcat ccaaaattgt 60 ttcttagcct gtgatgcatt aggcgcaaac tgtgtacctg tcttgcacca aaactaactc 120 tgtctccaaa c 131
[0366] To identify the consensus sorghum satellite sequence from the sequences of SEQ ID NOs:23-176, these sequences were aligned using ALIGN (publicly available software; Altschul, S F, et al., J Mol Biol. 215:403-10, 1990). The sequences were trimmed to unit repeat length using the consensus as a template. Sequences trimmed from the ends of the alignment were realigned with the consensus and further trimmed until all sequences were at or below the consensus length. The consensus was determined by the frequency of a specific nucleotide at each position; if the most frequent base was three times more frequent than the next most frequent base, it was considered the consensus. An exemplary consensus sorghum satellite sequence is set out as SEQ ID NO:22:
TABLE-US-00006 aaactgagct tccacttgag cccctttacc aggagtatca tcgggtgcat ccaaaatggt 60 ttcttagcct atgatgcatt aggcgcaaac tgtgtaccta tcttgcacca aaactaactc 120 tgtctccaaa cc 132
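The consensus rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used: it assumes the repeats have already been aligned and trimmed to equal length, that gap characters ("-") are ignored when counting, that "three times more frequent" means a count at least three times that of the runner-up, and that positions failing the rule are written as "n". The function name and the toy input are hypothetical.

```python
from collections import Counter


def consensus(aligned, ratio=3, ambiguous="n"):
    """Call a consensus from pre-aligned, equal-length sequences.

    A base is accepted at a position only if it is at least `ratio` times
    as frequent as the next most frequent base (assumed reading of the rule).
    """
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    called = []
    for column in zip(*aligned):
        # Count bases in this column, skipping alignment gaps.
        counts = Counter(base for base in column if base != "-").most_common(2)
        if not counts:
            called.append(ambiguous)
            continue
        top_base, top_n = counts[0]
        second_n = counts[1][1] if len(counts) > 1 else 0
        called.append(top_base if top_n >= ratio * second_n else ambiguous)
    return "".join(called)


# Toy alignment (hypothetical data, not taken from SEQ ID NOs:23-176).
toy = ["acgt", "acga", "acgt", "aagt"]
print(consensus(toy))
```

With real data, the input would be the trimmed alignment of the satellite repeats; a stricter or looser threshold can be tried by changing `ratio`.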
[0367] The sorghum centromere-specific retrotransposon sequence (CRS) was amplified by PCR and sequenced using primers designed from published sequence (set forth in SEQ ID NO:178; Presting, G G, et al., Plant J 16: 721-728, 1998 and Miller, J T, et al. Genetics 150: 1615-1623, 1998); the primers are set forth in SEQ ID NOs:179 and 180. The sequence for sorghum CRS is set out as SEQ ID NO:21:
TABLE-US-00007 tggattcgga ctggaaaata actctaactt gtatggatca ccacgacgtc atatggactc 60 caactgggac gttcctatac ttgttggaaa gctcatgaag tctactttcc aatgggtcca 120 accacatatc tatgcggctt atgagtcggg cgcagtcctt gttttcgtgc cgacaccttt 180 ttctgttttg gtgctgcgtc actctatttt ggaccaatgg cccatgtatc aagttgagtc 240 cattagggac gcatcctagg gttggaggac gactctagca cccctttggt cgtcctcccc 300 tctatttatt tacatctaga gccgccatga acaactggat tttgtttaga tcaagtttag 360 ccttcgctac ttgcttgtag gcgcgcgtgc aggatcagcc gcccgcctcc ttgtcttcgg 420 aaccccattg ttgattaaga ttcagtttaa aaccttcaat tcatcttgca aattcagtgc 480 ttgtttcctc gttcttgcta gttcttcgat tgcttgcagg acgggagccc taggggctgg 540 ttgtcgcgct ccacaagatc gtgacggttg ttggacgtgg tgtatcggtt gctaaggcgc 600 ggtcttgagg gctgtagtcg ggccgtgaac gtcatctcca tccactaatc gagttatcca 660 gcgcctctca tcgaaagatc aggccaaaaa ccctagcggg ctcgcatcag ttggtaatca 720 gagcaaggtt cttcggtgag agacttctaa tcctttgctg tttttaatta atttcctata 780 gtccagaaaa gccaaaaaaa tatagtagat tagtttttcc ataatcctat taaacctttg 840 tgccttggct agtaccgttt tagttagggc ttgttgaatt tgcgttgctt cggtttgtgt 900 cgagttgctg gtcttagtgt ctagtccttt agagtttcga gttcttgtca ccatctatac 960 acagccgagt attaccatat cttctctctg tcgaatctgt tgcgaagtct gaattgaact 1020 gtggtcatgg tccggatcga gtagagttcc aatctgagtt caaaaagaaa gataactcta 1080 cttgttcggc cttactctag agagagagag agagtgtgga gcgaaaaaag tgtgtggagc 1140 gaattgctct tttgtattct tttgttcata taatcagttt tggaggttgc ccacaaaaaa 1200 agaaaaaaaa gaaaagagaa aagattcaaa aaaaagaggc tgtttttcat attgatttta 1260 ggtttgtccc accttgtttt cgggggtgtg ctgtggtttt cctttgtgtc caggctcgcg 1320 tctctagcac ggtctagcct aggaccagca cagtaccatc gtcgaacgct tattcagctc 1380 gcttttataa ctaacgtggt gctagttcgt tccttgtttc agcccaccta tagctccaca 1440 tactctacag cttgacaggt cttgtgctgc agcaccgata cacttcgtcc attgctatac 1500 acttgttggc agacgacccc tcctgtcaag caagataaga attggtaaga acttgtgtta 1560 caggttgagt gtgagcgact tgctatagct acatcctagt agttgtaggg attttatttc 1620 ttcacttgct ttttgttgtc tttgtctttg aaccatgcca ggggcagatg atggtaacga 1680 aacaccactt 
acacctcgca ctatgggcat catacaatat tttgaaagga aagtgaagct 1740 gcacacagag ggacttgata acgacttgca ggtgacgaat gaaaagctgg ggcagttgga 1800 ggctacgcag attgccacaa acaacaagct cacaagtttg gaggaatccg ttgctagtgt 1860 ggacaaaagc cttgctgctc tcctaaggcg atttgatgct ttccacaccg aagataaaga 1920 gaagcataaa gaagaaaagg agggagatcg agagcacggt agtcatgaag atgactacac 1980 tggtgatact gaacatgatg atcaagacac tcgtgatcga cgtcgccttc gtcacaaccg 2040 tagaggtatg ggtggcaacc gccgacgcga ggtacacaat aatgatgatg ctttcagtaa 2100 gattaaattt aagatacccc tttttgatgg taaatatgac cctgatgctt acatcacttg 2160 ggagattgct gttgatcaaa agtttgcatg tcatgaattt cctgagacta cacgtgttag 2220 ggctgctact agtgagttta cagattttgc ttctgtttgg tggatagaat atggaaagaa 2280 aaatcataat aacttaccta gaacttggga tgcgctgaaa agggccatga gagctagatt 2340 tgttccatct tactatgcgc gtgatatgat aaataagttg cagcagttaa gacaaggtgc 2400 taaaagtgta gaagaatatt atcaggaatt acaaacgggt atgttgcgtt gtaacctaga 2460 ggaggatgag gaaccggcta tggctagatt tttgggtggg ttaaatcggg aaattcagga 2520 catcctcgct tacaaagaat acaataatgt aacccgtttg tttcatcttg cttgtaaagc 2580 tgaaagggaa gtgcagagac gacgtgctag cacaaggagt aatatttctg cagggaaggc 2640 taattcatgg cagcaacgcg tggcttcaac tccatctaca cgtatttcta ctccatcatc 2700 tagtgacaag actcgagctg cccccaccaa ttcagttgcg aagacgatgc aaaagcctgc 2760 tgcgagtact tcatccgtgg catcgacggg tagaacaagc aacatacaat gtcaccggtg 2820 caagggatat gggcacatga tgcgtgactg tccaaacaag cgagttatga ttgtcaggga 2880 tgatggtgag tactcatctg ctagtgattt tgatgaggat acacttgcac tgcttgcgac 2940 tgaccatgca ggtaatgaag atcaaataga agaatatatt aatgcaggtg aagcggacca 3000 ctatgagagc ttgatcgtgc agcgagtgct tagtgcacaa atggagatgg cggaacaaaa 3060 tcagcgacac attttattcc aaacaaagtg tgtcatcaaa gagcgttctt gtcgcatgat 3120 cattgatgga ggtagctgca acaacttggc aagcagcgat atggtgcaga agcttgccct 3180 caacaccaaa ccacacccgc atccctacta catccaatgg ctgaacaaca gtggtaaggc 3240 aaaggtaact agacttgtga gaattaattt ttccatcgga tcctacaaag atattgttga 3300 atgtgatgtt gtgcctatgc aagcttgtaa cattctgcta ggtagacctt ggcaatttga 3360 tagagattct atgcatcatg 
gtagatcaaa tcagtattct tttctatacc atgatcgcaa 3420 aattgtgttg catcctatat cccctgaaac tattatgcaa actgatgttg ctagggctac 3480 taaagcaaag agcaagagca ataaaaatga taaatctgta attggtaaca aagatgagat 3540 aaaactgaaa ggacattgta tgatagctac caaatcagat attaatgagt tcaatgcatc 3600 cacttctgtt gcttatgctt tgatatgcaa ggatgctttg atttcagttg aggatatgca 3660 atgttctttg ccccctgctg ttgctaacgt tttgcaggag tattctgatg tgtttccaag 3720 tgatgtacca gcggggctgc ctccactacg cgggattgag caccaaattg atcttattcc 3780 tggatcagtt ttgccaaatc gtgcaccata caggacaaac ccggaggaaa caaaggaaat 3840 tcagcgacaa gtgcaagaac tactagacaa aggttatgtc cgagaatctc ttagtccttg 3900 tgctgttcca gtaattttag tgcctaagaa agatggaaca tggcgtatgt gtgttgattg 3960 tagagctatt aataatatca ccattcgata ttgacaccct attccacgat tagatgatat 4020 gctagatgaa ctgagtggtg ctgttgtgtt ttcaaaagtt gatttacgta gtgggtacca 4080 ccagattcgt atgaaattgg gagatgaatg gaaaactgct ttcaaaacta agttcggttt 4140 gtatgagtgg ttagtcatgc cttttgggtt aactaatgca cctagtactt tcatgagatt 4200 aatgaacgag gtcttgcgtg ctttcattgg gaaatttgtt gtcgtatatt ttgatgacat 4260 attgatttac agcaaatcat tggatgaaca tcttgatcat ttacgtgctg tttttaatgc 4320 actacgcgag gcacgtttat ttggtaacct tgagaagtgc accttttgca ccgatcgagt 4380 gtcttttctt ggttatgttg tgactccaca gggaattgag gttgatcaag ccaaggtgga 4440 agctatacag ggatggcctg tcccaaatac tatcacccag gtgcggagtt tcctaggact 4500 tgctagattc tatcgccgtt ttgtgaagga tttcagcacc attgctgcac cattgaatga 4560 gcttacaaag aagggggtgc cttttgattg gggcaaagca caagagaatt cattcaacat 4620 gttgaaagat aagttaactc atgcacctct cctacaactt cctgatttta ataagacttt 4680 tgagcttgaa tgtgatgcta gtggaattgg tttgggaggt gttttattac aagagggaaa 4740 acctattgca tattttagtg agaaattgag tgggcctgtt ctcaaattca acttatgata 4800 aagaactcta tgctcttgtt agaacattag agacatggca gcattatttg tggcccaaag 4860 agtttattat ccattttgat catgaatctt tgaaacatat tcgtagtcaa ggaaaactga 4920 atcgtaggca tgcaaagttg gttgaattta ttgaatcttt tccttatatt attaagcaca 4980 agaaagggaa ggaaaatatt attgctgatg ctttatcacg gagatatact ttgctgaatc 5040 aacttgatta caagatattt gggttagaaa 
caattaaaga ccaatatgtt catgatgctg 5100 attttagaga cgtgttgctg cattgtaaag atggaaaagg gtggaataaa ttcatcgtta 5160 gtgatgggtt tgtgtttaga gctaacaagc tatgcattcc agctagctct gttcgtttgt 5220 tgttgttgca ggaagcgcat ggaggtggct tgatgggaca ttttggagca aagaagaccg 5280 aggacatact tgctggtcat ttcttttggc ccaggatgaa gagagatgtg gagaggtttg 5340 ttgctcgttg cacaacatgt caaaaggcaa agtcacggtt aaatccccac ggtttgtatt 5400 tacctcttcc tgttcctaat gctccttggg aggatatatc tatggatttt gtgttgggac 5460 taccaaggac taggagggga cgtgatagtg tgtttgtggt tgttgataga ttttctaaga 5520 tggcacattt cataccatgt cataaaactg atgatgctac aaatattgct gatttgtttt 5580 ttcgagaaat tgttcgctta catggtgtgc ccaacacaat tgtttctgat cgtgatgcta 5640 aatttcttag tcatttttgg aagactttgt ggttcaaatt ggggactaag cttttatttt 5700 ccaccacctg tcatccccaa actgatggtc aaactgaagt tgttaataga actttatcca 5760 ctatgttaag ggctgtttta aagaagaata ttaagatgtg ggaagaatgt ttgcctcatg 5820 ttgagttcgc ctataatcgt tcattgcatt ctactacaaa aatgtgtcct tttgagattg 5880 tctatggctt cttgccacgt gctcctattg atttaatgcc tttgccaagt tctgaaaaaa 5940 taaattttga tgctaagcaa catgctgaat tgatgttaaa attgcatgaa gccactaaac 6000 aaaacataga gcgcatgaat gctaagtaca aatgcactgg agataaaggt agaaagcaat 6060 tgattctgga acctggggat ttggtttggt tgcatttgcg aaaagataga tttccagaac 6120 tgataaaatc caaattgatg cctagagctg atggtccttt taaagtgctg caacgaatta 6180 atgagaatgc atataagctt gatcttcctg cagattttgg ggttagtccc acatttaaca 6240 ttgcagattt gaagccttat ttgggtgagg aagatgagct tgagtcgagg acgactcaaa 6300 tgcaagaaag ggaggatgat gaggacatca acac 6334 SEQ ID NO: 179 (forward primer): gggaagtaca gggacgaaga gc 22 SEQ ID NO: 180 (reverse primer): tgcaaccaaa ccaaatcacc ag 22
[0368] BAC Library Construction
[0369] A Bacterial Artificial Chromosome (BAC) library was constructed from sorghum genomic DNA. The genomic DNA was isolated from Sorghum bicolor and digested with a methylation-insensitive restriction enzyme to enrich the BAC library for centromere DNA sequences.
[0370] Probe Identification and Selection
[0371] Groups of sorghum repetitive genomic DNA, including specific centromere-localized sequences, were initially compiled as candidate probes for hybridization with the BAC libraries. The satellite sequences set out in SEQ ID NOs:22-176 were used as probes for interrogating BAC libraries. These probes were prepared and labeled with standard molecular methods.
[0372] Library Interrogation and Data Analysis
[0373] The BAC clones from the libraries were spotted onto filters for further analysis. The filters were hybridized with the probes to identify the BAC clones that contained DNA from the group of sequences represented by each probe, including those clones that were positive for satellite sequence (the satellite probe was amplified from sorghum genomic DNA using the forward primer gtcacccagc agttccatcg ggtgc (SEQ ID NO:181) and the reverse primer actgctgggt gacgtggctc aagt (SEQ ID NO:182)). Hybridization was at 65° C. for 12-15 hours, followed by washing three times for 15-90 minutes with 0.25×SSC, 0.1% SDS at 65° C. Other exemplary stringent hybridization conditions comprise 0.5×SSC, 0.25% SDS at 65° C. for 15 minutes, followed by a wash at 65° C. for a half hour. Unique clones that hybridized with one or more of the probes were isolated. Probe hybridization was scored visually to determine a binary (positive versus negative) value, and the signal was assigned a score based on the relative strength of hybridization on a 10 point scale.
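One way to picture the scoring scheme is as a list of per-spot records that are then tallied per probe. The sketch below is illustrative only: the clone names and scores are hypothetical placeholders, the 0-10 range with 0 meaning "no signal" is an assumption, and so is the rule that any nonzero strength counts as positive, since the threshold behind the visual call is not stated.

```python
from dataclasses import dataclass


@dataclass
class SpotScore:
    clone: str      # BAC clone identifier (hypothetical names below)
    probe: str      # e.g. "satellite" or "CRS"
    strength: int   # relative signal strength; assumed range 0-10

    def __post_init__(self):
        if not 0 <= self.strength <= 10:
            raise ValueError("strength must be between 0 and 10")

    @property
    def positive(self):
        # Assumed rule: any visible signal is a positive call.
        return self.strength > 0


def tally(scores):
    """Return {probe: set of clone names scored positive for that probe}."""
    result = {}
    for score in scores:
        result.setdefault(score.probe, set())
        if score.positive:
            result[score.probe].add(score.clone)
    return result


# Hypothetical scores for three spots.
example = [
    SpotScore("cloneA", "satellite", 7),
    SpotScore("cloneA", "CRS", 0),
    SpotScore("cloneB", "CRS", 3),
]
for probe, clones in tally(example).items():
    print(probe, sorted(clones))
```

Keeping the graded strength alongside the binary call makes it possible to rank candidate clones later without rescoring the filters.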
[0374] From this experiment, 211 BACs were CRS-positive and 624 BACs were 137 bp satellite positive, providing at least 624 BACs as centromere candidates for MC construction. Exemplary BACs are shown in Table 5.
TABLE 5 Sorghum centromere sequence-containing BACs
Satellite-positive BACs | CRS-positive BACs
81J23 | 41A11
82O22 | 41D7
84B23 | 42D9
85K22 | 42N11
89A3 | 43C3
89E2 | 43D8
89F4 | 43E5
89H10 | 44B3
89H10 | 44K3
89H8 | 44N12
89I6 | 45C3
89J10 | 45M8
89J9 | 45P2
89N4 | 46F1
89N6 | 46F4
89P4 | 46P12
90C2 | 47F1
90I3 | 47H1
90L2 | 48D6
90L3 | 48F2
[0375] Of the BACs shown in Table 5, 42N11 and 89F4 were sequenced. BAC 42N11, identified as containing the CRS sequence, assembled into 90 contigs (set forth in SEQ ID NOs:183-275), yielding about 65 kb, of which 104 bp aligned with the consensus satellite sequence (SEQ ID NO:22) and 3325 bp aligned with the CRS sequence (SEQ ID NO:21). Further analysis showed that 2852 bp of this BAC also aligned with CRM2, a corn centromere retroelement sequence related to CRS. BAC 89F4, identified as containing the satellite sequence, assembled into 50 contigs (set forth in SEQ ID NOs:276-326), yielding a total of about 47 kb, of which 857 bp aligned with the CRS sequence (SEQ ID NO:21), another 6314 bp aligned with CRM2, and 1632 bp aligned with the consensus satellite sequence of SEQ ID NO:22. These results also demonstrate that BAC clones containing centromere sequence as identified by SEQ ID NO:21 or SEQ ID NO:22 contain both sequences.
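To put the aligned lengths in proportion, the short calculation below expresses each as a percentage of the assembled BAC sequence. It is a sketch under stated assumptions: the assembled sizes are the approximate figures given above ("about 65 kb" and "about 47 kb"), so the percentages are rough, and the dictionary layout and function name are illustrative.

```python
# Aligned lengths (bp) and approximate assembled sizes (bp) as reported above.
BACS = {
    "42N11": {"assembled": 65_000, "satellite": 104, "CRS": 3325, "CRM2": 2852},
    "89F4": {"assembled": 47_000, "satellite": 1632, "CRS": 857, "CRM2": 6314},
}


def percent_aligned(bac, feature):
    """Percentage of the assembled BAC sequence that aligned to `feature`."""
    record = BACS[bac]
    return 100.0 * record[feature] / record["assembled"]


for bac in BACS:
    for feature in ("satellite", "CRS", "CRM2"):
        print(f"{bac} {feature}: {percent_aligned(bac, feature):.2f}%")
```

Because CRS and CRM2 are related retroelements, the regions aligning to each may overlap, so the percentages are best read individually rather than summed.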
Example 2
Construction of Sorghum MCs Containing Genomic DNA (Prophetic)
[0376] A subset of BAC clones, identified as described in Example 1, are grown and their DNA is extracted for MC construction using the NUCLEOBOND® Purification Kit (Clontech). To determine the molecular weight of centromere fragments in the BAC libraries, a frozen sample of bacteria harboring a BAC clone is grown in selective liquid media, and the BAC DNA is harvested using standard alkaline lysis. The recovered BAC DNA is restriction digested and resolved on an agarose gel. Centromere fragment size is determined by comparison to a molecular weight standard.
[0377] The components of an exemplary sorghum MC are described in Table 6. The UBQ10 promoter is used to express DsRed in MCs constructed with the backbone vector CHROM-SB.
TABLE 6 Donor components of CHROM-SB
Element | Size (bp) | Location (bp) | Details
YAT1 yeast Promoter | 2000 | 7110-9109 | PCR amplified YAT1 promoter from chromosome I of S. cerevisiae for expression of NptII in sorghum
A. UBQ10 Intron | 360 | 9123-9482 | PCR amplified A. thaliana intron from UBQ10 gene (At4g05320) for stabilization of NptII gene transcript and increase protein expression level
NPTII | 795 | 9510-10304 | Neomycin phosphotransferase II plant selectable marker
Rps16A terminator | 489 | 10368-10856 | Amplified from A. thaliana 40S ribosomal protein S16 (At2g09990) for termination of NptII gene
Bacterial Kanamycin | 817 | 11039-11855 | Bacterial kanamycin selectable marker
Terminator 6 | 332 | 12000-12331 | Terminator 6
DsRed2 + NLS | 780 | 12466-13245 | Nuclear localized red fluorescent protein from Discosoma sp.
UBQ10 Promoter | 2038 | 13282-15319 | PCR amplified A. thaliana promoter from UBQ10 gene (At4g05320) for stabilization of DsRed gene transcript and increase protein expression level
LoxP | 34 | 7057-7090 and 15335-15368 | Recombination site for Cre mediated recombination
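The Size and Location columns of Table 6 can be cross-checked against each other, since an element spanning positions start through end (inclusive) should have a size of end - start + 1. The sketch below transcribes the table values and runs that check, along with a simple test that no two elements overlap. The inclusive, 1-based reading of the coordinates is an assumption, and the helper names are illustrative.

```python
# (element, size in bp, start, end) transcribed from Table 6.
ELEMENTS = [
    ("LoxP (first site)", 34, 7057, 7090),
    ("YAT1 yeast Promoter", 2000, 7110, 9109),
    ("UBQ10 Intron", 360, 9123, 9482),
    ("NPTII", 795, 9510, 10304),
    ("Rps16A terminator", 489, 10368, 10856),
    ("Bacterial Kanamycin", 817, 11039, 11855),
    ("Terminator 6", 332, 12000, 12331),
    ("DsRed2 + NLS", 780, 12466, 13245),
    ("UBQ10 Promoter", 2038, 13282, 15319),
    ("LoxP (second site)", 34, 15335, 15368),
]


def size_mismatches(elements):
    """Elements whose stated size differs from end - start + 1."""
    return [
        (name, size, end - start + 1)
        for name, size, start, end in elements
        if end - start + 1 != size
    ]


def overlaps(elements):
    """Pairs of coordinate-adjacent elements whose ranges overlap."""
    ordered = sorted(elements, key=lambda e: e[2])
    return [
        (a[0], b[0])
        for a, b in zip(ordered, ordered[1:])
        if b[2] <= a[3]
    ]


print("size mismatches:", size_mismatches(ELEMENTS))
print("overlaps:", overlaps(ELEMENTS))
```

Under the inclusive-coordinate assumption, both lists come back empty for the values in Table 6, and the two LoxP sites bracket all of the other listed elements.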
[0378] The MCs are constructed by following a two-step procedure: Step 1: preparation of donor DNA for retrofitting with BAC centromere vectors; and Step 2: Cre-lox recombination of the BAC and donor DNA to generate the MC. The resulting MCs are subsequently tested in several different sorghum cell lines.
[0379] Preparation of Donor DNA for Retrofitting
[0380] Cre recombinase-mediated exchange is used to construct MCs by combining the centromere fragments cloned in pBeloBAC11 with a donor plasmid (i.e., CHROM-SB, Table 6). The recipient BAC vector carrying the sorghum centromere fragment contains a loxP recombination site; the donor plasmid contains two such sites, flanking the sequences to be inserted into the recipient BAC.
[0381] Sorghum MCs are constructed using a two-step method. First, the donor plasmid is linearized to allow free contact between the two loxP sites, eliminating the backbone of the donor plasmid. Second, the donor molecules are combined with sorghum centromere BACs and treated with Cre recombinase, generating circular sorghum MCs with all the components of the donor and recipient DNA. The MCs are delivered into E. coli and selected on medium containing kanamycin and chloramphenicol. Only vectors that have successfully undergone Cre recombination and contain both selectable markers survive in the medium. The MCs are extracted and restriction digested to verify DNA composition and calculate sorghum centromere insert size.
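The logic of the double selection can be expressed as a simple set test: a molecule survives only if it carries resistance to every antibiotic in the medium. The sketch below is a toy model, not a simulation of the experiment. It assumes that chloramphenicol resistance is contributed by the recipient BAC backbone and kanamycin resistance by the donor cassette (the bacterial kanamycin marker listed in Table 6); the text states only that both markers are required. All names are illustrative.

```python
# Assumed marker sources (see lead-in): recipient BAC backbone and donor cassette.
RECIPIENT_BAC = frozenset({"chloramphenicol"})
DONOR_CASSETTE = frozenset({"kanamycin"})


def survives(markers, antibiotics):
    """True if `markers` confers resistance to every antibiotic in the medium."""
    return set(antibiotics) <= set(markers)


candidates = {
    "unrecombined BAC": RECIPIENT_BAC,
    "donor cassette only": DONOR_CASSETTE,
    "recombined MC": RECIPIENT_BAC | DONOR_CASSETTE,
}

medium = ("kanamycin", "chloramphenicol")
survivors = [name for name, markers in candidates.items() if survives(markers, medium)]
print(survivors)
```

In this model only the recombined product passes, which mirrors the reason for plating on both antibiotics at once.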
[0382] To determine the molecular weight of the sorghum centromere fragments in the sorghum MCs, three bacterial colonies from each transformation event are independently grown in selective liquid media, and the MC DNA is harvested using alkaline lysis methods. The recovered MC is restriction digested and resolved on an agarose gel. Sorghum centromere fragment size is determined by comparison to a molecular weight standard. If variation in sorghum centromere size is noted, the MC with the largest sorghum centromere insert is used for further experimentation.
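The colony-selection rule above (three colonies per event, keep the largest insert when sizes vary) can be sketched as follows. The insert sizes in the usage example are hypothetical values, not data from this application. The tolerance used to decide whether sizes "vary" is likewise an assumption, since the text does not define one.

```python
def choose_colony(insert_sizes_kb, tolerance_kb=1.0):
    """Pick the colony whose MC carries the largest centromere insert.

    insert_sizes_kb maps a colony label to its estimated insert size (kb).
    Returns (colony_label, insert_size_kb, sizes_vary), where sizes_vary is
    True when the spread between colonies exceeds tolerance_kb (an assumed
    threshold for gel-sizing noise).
    """
    if not insert_sizes_kb:
        raise ValueError("at least one colony is required")
    largest = max(insert_sizes_kb, key=insert_sizes_kb.get)
    sizes = list(insert_sizes_kb.values())
    sizes_vary = (max(sizes) - min(sizes)) > tolerance_kb
    return largest, insert_sizes_kb[largest], sizes_vary


if __name__ == "__main__":
    # Hypothetical gel estimates for three colonies from one event.
    estimates = {"colony 1": 52.0, "colony 2": 60.5, "colony 3": 52.3}
    print(choose_colony(estimates))
```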
[0383] Functional Testing of Sorghum MCs Using Transient Assays
[0384] The MCs are tested, for example, in several sorghum cell lines, and the procedure is optimized for antibiotic selection, cell pre-treatments, and bombardment conditions. All assays are transient, and fluorescent cells are counted at several time points.
Example 3
Construction of Sorghum MCs Containing Synthetic Arrays of Repeat Sequence (Prophetic)
[0385] A synthetic array of the sorghum satellite repeat sequences is generated using PCR and directional cloning. A block of sorghum satellite repeats is PCR amplified and sequenced, and this sequence is used as the basis for building the synthetic array.
[0386] For example, MCs containing synthetic arrays of sorghum satellite repeats may also contain either five or eight stacked exogenous genes. The five-gene stack may include the genes NptII, DsRed, Anthocyanin, ZsGreen, and ZsYellow. The eight-gene stack may include those of the five-gene stack plus three additional genes from the A. tumefaciens tumor-inducing (Ti) pathway. These include iaaM (Trp mono-oxygenase), iaaH (Indole-3-acetamide hydrolase), and ipt (AMP iso-pentenyl transferase).
[0387] In order to investigate whether the MCs can carry a large number of genes, MCs containing a gene stack, a synthetic array of sorghum repeat nucleotide sequence, and about 20 kb of peace lily (Spathiphyllum spp.) DNA are constructed. The total size of these MCs ranges between 82 kb and 87 kb.
[0388] In addition, MCs with a gene stack with two genes in addition to a synthetic sorghum centromere repeat array and an approximately 50 kb insertion of peace lily DNA may be constructed using the methods described above. The functionality of this MC demonstrates that the MCs of the invention can accommodate a large payload of genes, as 50 kb of the peace lily DNA includes a wide variety of genes.
Example 4
MC Delivery into Sorghum Cells and Regeneration (Prophetic)
[0389] To enhance the efficiency with which sorghum cells transformed with MCs can be regenerated into sorghum plants, MCs containing the auxin gene pathway are delivered into fully differentiated leaf rolls rather than undifferentiated tissue, e.g., embryos. In addition, growth conditions are modified to enable development and propagation of transformed sorghum callus.
[0390] Sorghum is grown in the greenhouse for up to 6 months without floral initiation due to the growth time as well as the daylength settings on greenhouse supplemental lighting for those varieties that are day-length-sensitive. Stalks from several (clonal) plants are used to generate leaf rolls that do not include any developing meristematic tissue.
[0391] The MCs with a synthetic sorghum centromere are delivered to leaf rolls. For example, the MC may contain an eight-gene stack ("eight-gene MC"), or the MC may contain a five-gene stack ("five-gene MC"). In addition, a control plasmid (lacking a centromere) containing eight genes is also delivered, in which the eight-gene stack is identical to that delivered on the eight-gene MC.
[0392] The eight-gene MC includes A. tumefaciens tumor-inducing (Ti) pathway genes (iaaM, iaaH, and ipt). The inclusion of these genes minimizes the time the transformed cells are in culture. IaaM converts Trp into indole-3-acetamide, which IaaH converts into auxin. Isopentenyl transferase (Ipt) converts 3',5'-adenosine monophosphate (AMP) into a cytokinin. Auxin is used in cell culture to stimulate plant cells to form callus. Media containing auxin promote callus growth from plant cells, whereas plant cells cultured on media lacking auxin either germinate (for embryogenic material) or are unable to grow (non-embryogenic or non-meristematic tissue such as leaf tissue). Thus, the eight-gene MC induces callus formation without supplementing the media with auxin.
[0393] A biolistic delivery method using dry gold particles is used to deliver MCs to the sorghum leaf rolls. MC DNA (in 1× TE) is precipitated onto 2.1 mg of sterilized and washed 0.6 μm gold particles. The DNA-containing gold particles are resuspended in 2.5 M CaCl2 solution, and 0.1 M spermidine (free base) is added to the mixture. The mixture is incubated on ice for 1.5 hours, with gentle finger vortexing (3×) for 45 minutes. The precipitated DNA is then washed with 100% ethanol and resuspended in 100% ethanol, and the ethanol is evaporated prior to bombardment.
[0394] The apical region of the sorghum stem (20-30 cm long) is collected after removing the outermost mature leaves, and the remaining leaves are sterilized by submersion in a solution of 50 ml bleach in 1 liter of water for 10 minutes. The remaining mature leaves are aseptically removed, and the young inner immature leaves are sliced into sections/discs approximately 2-3 mm thick. The leaf rolls are placed on Sorghum Osmotic Medium (SCOM; 4.3 g/l MS salts and vitamins, supplemented with 20 g/l sucrose, 0.5 g/l casein, 3 mg/l 2,4-D, 0.2 M mannitol, and 0.2 M sorbitol; pH adjusted to 5.8 and solidified with 2 g/L Gelrite) at 28° C. for 4-5 hours before the bombardment.
[0395] The three constructs are each initially tested by delivery into the leaf rolls. For delivery, the leaf rolls are bombarded with the MC DNA using the BioRad PDS-1000/He with a rupture disk rating of 900-1800 psi (1350 psi is preferred with one shot per plate). The gap distance (distance from rupture disk to macrocarrier) is 6 mm. The target shelf for tissue is L2-L4; L2 or L3 is preferred. The vacuum pressure is 25-29 in Hg; 27.5 in Hg is preferred. The bombarded leaf rolls are stored at 28° C. (dark) for an additional 16-18 hours on SCOM.
[0396] Subsequently, the bombarded leaf rolls are transferred to MSO (4.3 g/l MS salts and vitamins, supplemented with 20 g/l sucrose and 0.5 g/l casein, with no 2,4-D; pH adjusted to 5.8 and solidified with 2 g/L Gelrite) and stored at 28° C., in the dark, for 2 weeks. The leaf rolls are visually assessed for callus production two and four weeks after bombardment.
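The SCOM recipe above is stated per liter, with mannitol and sorbitol given as molar concentrations. The sketch below scales the recipe to a batch volume and converts the 0.2 M osmotica to grams. It assumes a molar mass of 182.17 g/mol for both mannitol and sorbitol (C6H14O6). The dictionary keys and function name are hypothetical.

```python
# Molar mass of mannitol and sorbitol (isomers, C6H14O6), in g/mol.
HEXITOL_G_PER_MOL = 182.17

# SCOM components per liter, as listed in the text.
SCOM_PER_LITER = {
    "MS salts and vitamins (g)": 4.3,
    "sucrose (g)": 20.0,
    "casein (g)": 0.5,
    "2,4-D (mg)": 3.0,
    "mannitol (g)": 0.2 * HEXITOL_G_PER_MOL,  # 0.2 M
    "sorbitol (g)": 0.2 * HEXITOL_G_PER_MOL,  # 0.2 M
    "Gelrite (g)": 2.0,
}


def batch_amounts(volume_l, recipe=SCOM_PER_LITER):
    """Scale a per-liter recipe to volume_l liters, rounded to 3 decimals."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: round(amount * volume_l, 3) for name, amount in recipe.items()}


if __name__ == "__main__":
    for name, amount in batch_amounts(0.5).items():
        print(f"{name}: {amount}")
```

The pH adjustment to 5.8 is a bench step and is not modeled here.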
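The bombardment parameters above can be captured as a small settings record with range checks. This is a sketch for record keeping only. The class and field names are hypothetical, the defaults are the preferred values stated in the text, and the code does not control any instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BombardmentSettings:
    """Parameters from the text; defaults are the stated preferred values."""
    rupture_disk_psi: float = 1350.0
    gap_distance_mm: float = 6.0
    target_shelf: str = "L2"
    vacuum_in_hg: float = 27.5
    shots_per_plate: int = 1

    def problems(self):
        """Return a list of settings that fall outside the stated ranges."""
        issues = []
        if not 900 <= self.rupture_disk_psi <= 1800:
            issues.append("rupture disk rating outside 900-1800 psi")
        if self.gap_distance_mm != 6.0:
            issues.append("gap distance differs from 6 mm")
        if self.target_shelf not in ("L2", "L3", "L4"):
            issues.append("target shelf outside L2-L4")
        if not 25 <= self.vacuum_in_hg <= 29:
            issues.append("vacuum outside 25-29 in Hg")
        return issues


if __name__ == "__main__":
    print(BombardmentSettings().problems())
    print(BombardmentSettings(rupture_disk_psi=2000, target_shelf="L1").problems())
```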
[0397] Callus arising from the bombarded tissue is phenotypically evaluated for DsRed expression using a fluorescent dissecting microscope. If DsRed is observed in the tissue, then the callus is transferred to Regeneration Sorghum Medium (RSCM; 4.3 g/l MS salts and vitamins supplemented with 20 g/l sucrose, 0.5 g/l casein, and 0.5 mg/l kinetin; pH adjusted to 5.8 and solidified with 2 g/L Gelrite) in low light (e.g., 16 hour day length, 26° C.) to initiate regeneration. This medium does not contain auxin. If, after a couple of additional weeks of culture, callus has also started to differentiate into root (primarily) and shoot material, which is not expected in the presence of auxin, then this result suggests either silencing or loss of the three genes from the A. tumefaciens tumor-inducing (Ti) pathway. After 2 additional weeks on media, PCR evaluation of this material for presence of the DsRed gene is carried out. If PCR results are negative, this further suggests loss of the entire MC and verifies that the genes are not silenced.
[0398] The advantages of including the genes of the Ti pathway on an MC are that the non-meristematic tissues are transformed and the need for callus initiation prior to DNA delivery is eliminated. In addition, the time in culture is reduced and, as a result, somaclonal variation, endogenous chromosome number changes, and the like are also reduced. Furthermore, the inclusion of the Ti pathway genes eliminates the need for selectable marker genes.
[0399] In a separate experiment, five-gene (NptII, DsRed, Anthocyanin, ZsGreen, and ZsYellow) MCs are delivered into sorghum callus. These MCs are delivered to the callus cells using the wet biolistic method as described above.
[0400] Following delivery, callus from the tissue is phenotypically evaluated for DsRed expression using a fluorescent dissecting microscope. If DsRed is observed in the tissue, the calluses are transferred to Selection Sorghum Medium MS3-50 (4.3 g/l MS salts and vitamins supplemented with 20 g/l sucrose, 0.5 g/l casein, 3.0 mg/l 2,4-D, and 0.5 g/l polyvinylpyrrolidone (PVP); pH adjusted to 5.8 and solidified with 2 g/l Gelrite; further supplemented with 50 mg/l G418) for initial selection for 2 weeks. All calluses are subsequently transferred to additional selection on Selection Sorghum Medium MS3-75 (4.3 g/l MS salts and vitamins supplemented with 20 g/l sucrose, 0.5 g/l casein, 3.0 mg/l 2,4-D, and 0.5 g/l polyvinylpyrrolidone (PVP); pH adjusted to 5.8 and solidified with 2 g/l Gelrite; further supplemented with 75 mg/l G418) for 4 additional weeks. Tissue is then visually assessed for sorghum callus tissue that is able to grow. Those identified events are transferred to Regeneration Sorghum Medium RSCM-25 (4.3 g/l MS salts and vitamins supplemented with 20 g/l sucrose, 0.5 g/l casein, and 0.5 mg/l kinetin; pH adjusted to 5.8 and solidified with 2 g/L Gelrite; further supplemented with 25 mg/l G418 after autoclaving) in low light (16 hour day length, 26° C.) to initiate regeneration. Simultaneous with initiating regeneration, this callus tissue is evaluated by PCR for presence of the genes on the MC.
[0401] After an additional 4-6 weeks on regeneration, plantlets (with and without initial root initiation) are transferred to Rooting Medium RtSC-25 (2.15 g MS salts and vitamins supplemented with 20 g/l sucrose, 0.5 g/l casein, and 0.5 mg/l kinetin; pH adjusted to 5.8 and solidified with 2 g/L Gelrite; further supplemented with 25 mg/l G418 after autoclaving). Rooting occurs in 2 to 6 additional weeks of culture with 16 hour day length at 26° C. Rooting occurs in sundae cups (Solo Cup Co.; Joliet, Ill.; USA) for additional plantlet growth and root development. Plantlets with well established root systems are transferred into pre-moistened soil-less mix (LC1, BFG Supply Co.; Burton, Ohio; USA) under a humidome in an 18 well flat in a growth chamber (28° C., 16 hour day length). The dome is opened slightly 3-4 days after transplanting to slowly reduce humidity. The dome is removed completely 2 days later and the plantlets are then transferred to a greenhouse (28° C., 16 hour day length). The plants are watered from trays beneath the pots when the soil begins to dry. The plants are subsequently transplanted and grown to maturity in 1.6 gallon pots with Soil:Peat:Perlite (1:1:1) supplemented with Osmocote fertilizer (Scotts Co.; Marysville, Ohio; USA).
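The selection, regeneration, and rooting stages described above can be summarized as a cumulative timeline. The sketch below uses only the durations and G418 concentrations stated in the text. The stage labels and function name are hypothetical, and the time before selection and after rooting is not included.

```python
# (stage, G418 in mg/l, minimum weeks, maximum weeks), as stated in the text.
STAGES = [
    ("MS3-50 selection", 50, 2, 2),
    ("MS3-75 selection", 75, 4, 4),
    ("RSCM-25 regeneration", 25, 4, 6),
    ("RtSC-25 rooting", 25, 2, 6),
]


def cumulative_weeks(stages):
    """Return (stage, G418 mg/l, earliest end week, latest end week) per stage."""
    timeline = []
    earliest = latest = 0
    for name, g418_mg_per_l, min_weeks, max_weeks in stages:
        earliest += min_weeks
        latest += max_weeks
        timeline.append((name, g418_mg_per_l, earliest, latest))
    return timeline


if __name__ == "__main__":
    for name, g418, earliest, latest in cumulative_weeks(STAGES):
        print(f"{name}: {g418} mg/l G418, ends week {earliest}-{latest}")
```

Summing the stated ranges, rooted plantlets would be expected roughly 12 to 18 weeks after the start of selection.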
[0402] Sorghum callus and tissues produce phenolic compounds while in tissue culture, and these phenolic compounds reduce or inhibit callus growth and plantlet regeneration. In order to promote sorghum plantlet regeneration in culture, the media described above (MSO, MS3 and variants thereof, and RSCM) are supplemented with PVP at a concentration of 1% to 3% w/v according to the intensity of the exudation of the phenolic compounds. The PVP acts as a sink for phenolic compounds and enhances subsequent callus growth and plantlet regeneration.
[0403] In order to promote the frequency and the morphogenetic competence of regenerable sorghum callus, the cells are cycled from a liquid culture to a solid culture. The apical region of the sorghum stem (20-30 cm long) is collected and the mature leaves are removed. The stem is surface sterilized by submerging the tissue in a solution of 50 ml bleach in 1 liter of water for 10 minutes. The remaining outermost mature leaves are aseptically removed and the young inner immature leaves are sliced into sections/discs approximately 2-3 mm thick.
[0404] The resulting leaf roll discs are placed on sorghum MS3 Medium (MS3; 3 g/l MS salts and vitamins with 20 g/l sucrose, 0.5 g/l casein, and 3 mg/l 2,4-D; pH adjusted to 5.8 and solidified with 2 g/L Gelrite) at 28° C. for 2 weeks in the dark. The resulting regenerable sorghum callus (white nodular embryogenic pieces) is then removed and placed into liquid sorghum MS1 Medium (MS1; 4.3 g/l MS salts and vitamins with 20 g/l sucrose, 0.5 g/l casein, and 1 mg/l 2,4-D; pH adjusted to 5.8) at 28° C. for 2 weeks on a rotating orbital shaker (100-150 rpm) in the dark. After the 2 weeks, the regenerable sorghum callus (white nodular embryogenic pieces) is removed and sub-cultured back onto sorghum MS3 Medium (MS3) at 28° C. for 2 additional weeks in the dark. Sorghum callus can be subcultured in 2-week intervals between solid MS medium containing 3 mg/l 2,4-D and liquid MS medium containing 1 mg/l 2,4-D to maintain embryogenic callus.
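The PVP range above is given in percent w/v, whereas the other media components are given per liter. As a small worked conversion, 1% w/v is 1 g per 100 ml, i.e., 10 g/l. The function names below are hypothetical.

```python
def percent_wv_to_g_per_l(percent_wv):
    """Convert percent weight/volume to grams per liter (1% w/v = 10 g/l)."""
    return percent_wv * 10.0


def pvp_grams(percent_wv, volume_l):
    """Grams of PVP for a batch, restricted to the 1%-3% w/v range in the text."""
    if not 1.0 <= percent_wv <= 3.0:
        raise ValueError("PVP concentration outside the stated 1%-3% w/v range")
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return percent_wv_to_g_per_l(percent_wv) * volume_l


if __name__ == "__main__":
    for pct in (1.0, 2.0, 3.0):
        print(f"{pct}% w/v in 1 l: {pvp_grams(pct, 1.0)} g")
```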
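The alternating solid/liquid subculture described above can be laid out as a schedule. The sketch below assumes the cycle starts on solid MS3 at week 0 and alternates every 2 weeks, as the text describes. The tuple layout and function name are hypothetical.

```python
from itertools import cycle, islice

# (medium, physical state, 2,4-D in mg/l), alternating as described in the text.
PHASES = [("MS3", "solid", 3.0), ("MS1", "liquid", 1.0)]
INTERVAL_WEEKS = 2


def subculture_schedule(n_transfers):
    """Return (start week, medium, state, 2,4-D mg/l) for each culture period."""
    return [
        (index * INTERVAL_WEEKS, *phase)
        for index, phase in enumerate(islice(cycle(PHASES), n_transfers))
    ]


if __name__ == "__main__":
    for start_week, medium, state, auxin in subculture_schedule(4):
        print(f"week {start_week}: {medium} ({state}, {auxin} mg/l 2,4-D)")
```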
Example 5
Evaluation of Autonomous MCs (Prophetic)
[0405] To evaluate whether the candidate MCs are maintained autonomously, FISH can be performed on mitotic metaphase chromosome spreads from root tips. FISH can be performed essentially as described in Kato et al. (PNAS USA 101: 13554-13559, 2004).
[0406] For FISH, root tips can be collected approximately 10 days after transplanting regenerated T0 plants to soil or after germination (T1-T4 plants). Sampled roots (3-6 per plant) are moistened and exposed to nitrous oxide at 150 psi for 3 hours to arrest chromosomes in metaphase as described in Kato (Biotech. Histochem 74: 160-166, 1999). Roots are fixed in 90% acetic acid, and spread onto poly-lysine coated glass slides by squashing thin cross sections. Following hybridization with ALEXA FLUOR® 488 and ALEXA FLUOR® 568-labeled probes, slides are counter-stained with DAPI (0.04 mg/ml) and 15 metaphase cells are evaluated per plant using a Zeiss Axio-Imager equipped with rhodamine, FITC, and DAPI filter sets (excitation BP 550/24, emission BP 605/70; excitation BP 470/40, emission BP 525/50; and excitation G 365, emission BP 445/50, respectively).
[0407] Extra-chromosomal signals are only considered to indicate autonomous MCs if ≧70% of the images (n≧15 cells analyzed) show co-localization of the ALEXA FLUOR® 488 and ALEXA FLUOR® 568 signals within 1 nuclear diameter of the endogenous metaphase sorghum chromosomes. Gray-scale images can be captured in each panel, merged and adjusted with pseudo-color using Zeiss AxioVision software (Zeiss; Thornwood, N.Y.; USA); fluorescent signals from doubly labeled MCs can be detected in both the red and green channels.
[0408] Integrated constructs result in two FISH signals, each on a replicated metaphase chromatid. The MCs can be considered autonomous when (i) ≧70% of the cells examined (n≧15) contain signals that are clearly distinct from the DAPI-stained host chromosomes, (ii) integrated signals are not detected, and (iii) the fluorescent probe corresponding to the MC-encoded genes co-localizes with the probe to repetitive centromeric DNA, suggesting an intact construct and making it unlikely that the signal is due to noise.
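The three autonomy criteria above can be written as a scoring rule over per-cell observations. This is a minimal sketch under one interpretation: a cell counts toward the 70% threshold only when it shows a signal distinct from the host chromosomes and the two probes co-localize. The class, field, and function names are hypothetical, and image analysis itself is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellObservation:
    """Manual scoring of one metaphase cell."""
    distinct_signal: bool    # signal clearly separate from DAPI-stained chromosomes
    probes_colocalize: bool  # gene probe overlaps the centromere-repeat probe
    integrated_signal: bool  # paired signal on a host chromosome


def is_autonomous(cells, min_cells=15, min_fraction=0.70):
    """Apply criteria (i)-(iii) from the text to a list of scored cells."""
    if len(cells) < min_cells:
        return False  # too few cells examined
    if any(cell.integrated_signal for cell in cells):
        return False  # criterion (ii): no integrated signals allowed
    positive = sum(
        1 for cell in cells if cell.distinct_signal and cell.probes_colocalize
    )
    # Criteria (i) and (iii), combined per the stated interpretation.
    return positive / len(cells) >= min_fraction


if __name__ == "__main__":
    hit = CellObservation(True, True, False)
    miss = CellObservation(False, False, False)
    print(is_autonomous([hit] * 11 + [miss] * 4))
    print(is_autonomous([hit] * 10 + [miss] * 5))
```

With 15 cells scored, 11 positive cells (about 73%) meet the threshold, whereas 10 positive cells (about 67%) do not.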
Sequence CWU
1
1
32612139DNADrosophila melanogaster 1gttgtccgca gcggagatgc aactgatgca
acccacattt cagatcaccg acaacgtgca 60gcgcggcaac tacgccactc tgaccgacaa
ggatgtggcg catttcgagc agctcctggg 120caagaacttc gtgctcactg aggacctgga
gggatacaac atctgcttcc ttaagaggat 180tcgaggtagg ttgtgtaacc aaattcattc
acattcgtgt gccctttaat gaatttctcc 240gatgaattgc ttcaaccagg caacagcaag
ttggtgctta agcccggaag cacggcggag 300gtggccgcca tcctgaagta ctgcaacgag
cgtcgtttgg cggtggtgcc gcagggcggg 360aacacaggtc tagtgggcgg atccgtgccg
atctgcgacg agattgtcct ttctctagcg 420cgcctgaaca aggtgttatc cgtggacgag
gtcaccggca ttgctgtcgt ggaggcgggc 480tgcatcctgg agaacttcga tcagagggcc
agagaggtgg gcttgacggt gccactggac 540ctgggcgcca aggccagttg ccacatcggg
ggcaatgtgt ccacaaacgc gggcggagtg 600cgggtggtgc gttacggcaa tctgcacggc
tctgttttgg gcgtggaggc ggtgctggcc 660accggtcagg tgctggacct tatgtccaac
ttcaagaagg acaacaccgg ctaccacatg 720aagcacttgt tcataggatc cgagggcact
ctgggcgtgg tcacgaagct ttcgatgctc 780tgcccccatt cctcgcgagc ggtgaacgtg
gccttcatcg gcctgaactc cttcgacgat 840gtgctgaaga cttttgtcag tgccaagcgt
aatctgggcg agattctaag ctcctgcgag 900ctgattgacg agcgggcctt gaacaccgcc
ctcgagcagt tcaagttcct gaagtgagtt 960gcgccacctt tgtcttctct gagcgttacc
aatcctgttc acaaacttat ttcccatagc 1020tcccccattt cgggatttcc cttctacatg
ctcatcgaga cctcgggcag caacggtgac 1080cacgacgagg agaagatcaa ccagttcatt
ggggacggta tggagcgtgg cgagatccag 1140gatggcaccg taaccggtga tcccggcaag
gtgcaggaga tctggaagat ccgcgaaatg 1200gtgccgctgg gtctgatcga gaagagcttc
tgcttcaagt acgacatctc gctgcctctg 1260cgggacttct acaacattgt ggacgtgatg
cgagagaggt gcggtcccct ggccacagtt 1320gtctgcggat acggccatct gggggactct
aatctgcacc tgaacgtctc ctgcgaggag 1380tttaacggcg agatctacaa gcgggtcgaa
cccttcgtct acgagtacac ctccaagctg 1440aagggcagca ttagtgcgga gcacggcatt
ggcttcctga agaaggacta cctgcactac 1500tccaaggacc cggtggccat tggctacatg
cgcgagatga agaagctgct ggaccccaac 1560agcatcctca atccctataa ggtgcttaac
tgaaggcttc tacctaatag attctatttt 1620ttttgtttgt gtgtaatttt cataacctta
taatacagaa atggcattag aagtgaattt 1680tgttaacttg tgaagttaaa aaggaccatc
atatttggca cgaaaccaat gggcaaaact 1740tacttataaa atagtccgaa aaaatagtat
ataccagttt ttacagtacc acattatagg 1800tactcggagg taataataga aaaaacacta
tctttgcatt tactgttaca ctacgaagca 1860ctatatttag tagcagtact cattagagtc
cactcacaaa attagcacca accggcagta 1920attggtcaag gatcggcgat agcttcaaac
tccgaagttc aaagtcaaac tgccgccctg 1980cgaaagcttc gcgagtggag cttttctgca
cttatcgata gctaacattg tggcgcgact 2040atcgatcgac gagctgccgc ttaacagtgc
catatataga ttgtaacatt agaagctcaa 2100atcattgttg gagcacaaac cacaaagaac
acacgaaac 213922191DNADrosophila melanogaster
2aaaatatttc acctcatttt ccgcacacca tttataagca aagttacccc caacccataa
60cttttatggt aagtaataca gaccctccaa gttcggcaaa tcgataccca gcgaccttga
120gcttgacatt tatatatatg ccagaatata acgaccacgt gctgtcaact gtgtcaggaa
180aagctcaccc acactttctt tggaggagct gtgctcccta aacgaatttc attgtcaagg
240tcgcacgcac aaaaatgaag aggaaaagct gaatgtgggt ggaaatgccg gccggcacga
300ccttgaagcc agttgggtga gaaataaaaa gcttttgccg gtaggagact tgtggaacat
360cacccacaag tggcggactt ggccttggcg atggccttgt tggagctccc tcagcaaaaa
420tgttacatag ggggaggaaa taagctcaat tggctttatg ctttccgctc cctggaagtc
480cttttctgga atgttaaagt gttaaatgac atttattgaa catttgggac agaggaggag
540ataatacaat atacttgtct aattaaaaaa aatcgttatt atgatttatt ccatatgtaa
600gattttaatt catcatgatt gtaaataaat tatataaaac aaattcaata aatttacatt
660attgataaaa tttatttttt catgaaatta tacccaaaaa ttattctcaa tttttcttat
720aatcagtttt gcataagtat actttcttca tacccctcta ccacagccac tgctttcttg
780actttgcaac tatccgggaa cagcttatca taatggatga gctgcagcta acggaaaatg
840ggggagctgg gatcaaacat tttccaaggt tgaaattgtc gtcagcataa tgtttgaggg
900agctggattc gcgttagctt gaaggtcaat ccatttgggt gccctttgtt atggtcaagt
960ttaaggctgc aataggggga atcttcaagg accattacgc aaggttttcg catcaaagat
1020ttgccgtgca agctttttga gttgaaggat gcttaacttg aaagcgggtt agtggttcca
1080agagatttta ggtgaaggag actccgctgt tttgaaatat attaagtatg taaagaagta
1140tactataaat aacccaaagt gatacaatgt aagaaaagat ctcgttggtc cctggtataa
1200atttgtttgc cattaatgaa tattgaaaat aataattata ctaataatag gtacaataag
1260caagattaaa ttgcatttaa tcaccaaaaa tcagtttcta tgcgaaccaa aatgtcataa
1320caaacaattg ttgattcatc cgtagtgaaa tccaagttcg aaattcgaaa tgagcatacg
1380acgaccaaac ttcccctcaa aattgctaga ctcagctaga gcaagtacgc ccaagttaac
1440ccctgaaatt cgaaatgaat tcgatgccgc gcttcgaaca acgaaatccc aaagagctta
1500cgttttattt gacgtagcac tcttacgtga aatgattttc cccaattccg ctctcatttc
1560ccgagtctct caccgcttct cagccacttt cccaccccct ttctagttcc gaagtaaagg
1620taacaaaggc agccgtgtct ttggggtggt aaactggcgg tggtggtggc acattgtcag
1680tggtgtgggt tcctgtggtt ggtggttcaa ttggttggtt gttggcataa acaaagcaca
1740cacacaatac acacaaactc ccggggggtg gtggaaattg ggagggtgac attcactgcg
1800agagaggaac tcgcttccta taggaaagta caaagagagc tattttataa atgtgactgc
1860agcaaggata tttacagtca gtccactctg aaacctcgac gagagaacat tgaataacaa
1920gcggaagcga aaagcgcagt tgaaagttcg tcaaaaagcg acaagtttcc tcgttcgttt
1980tcccgccaaa tgagtcagaa aaattttcca agtgctcgat acgaaacata aagacttaca
2040agacttaaag tgcaagcagt gaatggaata tattattcct cagcgatatt gaaatcaaac
2100attaaaaata tatgctacac taaagttata tattttttta aagattcata cgttttgtaa
2160aatcacattt tgtattaaat taaataccgc c
219132035DNADrosophila melanogaster 3tgggtgcgtc gcaggtttca ctggaaaaca
atttgcactt ttgtttgtgg agtcgacaac 60aaaagcattc acttgtctaa gactctctca
ttcataactc gcactttagt tcactgaacc 120gcacgcaaaa ctttggggcg gacaacatgt
tttcgaggtg ccaaaagctt cataaaacta 180ccaatccatt agattaaatt ccaggcggta
catcttttgg ggatgattca tgtggcaggg 240gttctctact cgtttacaat catatcatca
tcttcaagat catatagttt atcatatcag 300tagagtacta caatataatg cataaactaa
gccaaataac tttatgacgc gtgcttatgc 360gaaagtaaac tttattatca aatttactta
accgtgaaat caaaaccttt atataaacac 420gaatattatt atctttgcta aataaaactc
tcgcttaaca aacaatgaca cttcaattcc 480aacatagagt ttatcttaag ccaataacca
aaaacggaac ttacataact tgccaacaaa 540catatgaata tagctatttc ggatcgtggg
agaccattat gcatacaagg cacgctccta 600aaaaccgtgt taaacaaata tatgtcaaat
gtatatctta aaaaagcgcg cacatatctt 660ttgaaatatc ttcacccaga gtatgtatga
gattaaactg gattagcact aagccacagc 720ttctgtagat agaaatttta tgcagagagt
agattatttg gctgctgagc aatttgacca 780ccacaagata gcagagaaca tctgacattt
tctatatcca tataataaaa ctgacttaac 840actaagctga agtggtatgt ttaaatcctc
cagctaataa atcgagacta aacgccctat 900cttatagtga tatataatag tatctatatg
tgtattgtca tttactgttt atgagtattt 960gaaaaaacca ttctatattt tataggttag
ttaataaata ttttgatata catatgtaga 1020ttggctcaca cgtacttatg acccactaca
taataaaatt gttttgtttt ttaatagaat 1080aatggtttat aaaaagttta gactcacacg
gaaatgataa actctttgca aatacagctt 1140tcattttatt acaaattgca ctctttcaga
tctgcagttg ctatgccaac cttttattcc 1200ctttactaaa agggtatact aggcttactg
aacagtatgt aactggtaaa gtaaagcgtt 1260tccgattcta taaattatat atctaaactt
ttgatcagtc gaatccatct gaacacattc 1320tgtcacatta gattattcca gaaactcaac
ttaaacatgt gtatttttta agaccattat 1380caaggatatt aaaaatggtc tcctaaaatt
taataaacaa aagtgtcaca tcaaatttaa 1440gacgtaaatt aatatttttt ttctatggtg
aaataattgt tattttccaa tgttgtgaaa 1500taataaatgt atcttttcaa cgcacacatt
ttcaaggttt taataataat agtgactcgt 1560gcgtgaataa gagagaaatt aagattttaa
aaaagaataa aattcagaga tgtgatctgt 1620aaaaattatt taccaatttt catttacccc
cgaaagtgat gctaatggtt aaaacggcat 1680ttgcgactta tctcctacgt aatattgcaa
aaataaggat ttggttagat gagtgtgaag 1740taaacaagat gcaaagtttt ggagatagaa
aacatagcct tgagtcttgg tcatgtttac 1800ttggcaccag gccgcgatta tcagcgctac
tagtcgtaat ttgagttaga cctttaatac 1860tctaagtgag agtgatgata tacgatttcc
cagccacttg ctttctacga aatgcgctaa 1920aaaaaatccc taactacaca aagatttgtg
ttgttatcca ggtgttctga tataaaaggc 1980ggcaaggaaa ttgatggcat catcagtatc
aaagtgagag tgattgcagt cacac 203542136DNADrosophila melanogaster
4atgggacggt cctattctca gcaaaaattg acaagaacaa caacaatgtc tatggaaaat
60cgaacttcat cccagcacct gcagaaatcc cgagcgagtc ggggaaaaag tatttaaccc
120ccgaaagggt tttccccaaa ataatgaagt aatgaatgaa gcggaaaaca ctggccgcca
180atctacctaa tactaatgag cgggccaacc cgaccaggaa tttttgcaag tcaggtactt
240caacggatat atgggttcga caagtgcgga ttttcccgcg acatcaatga ggacttggcc
300gggttatccg cggtgctcat cgggcaattc cgcggccgag gacttcatcg tagtgatcat
360taggtagata tgtgcatgga tgtgacatgg cgatcattgc gcggaataac acacgtaata
420accgagatat ccgggatgac ccaccaggta ggatgtgagg acatatagaa aacccccagc
480cagtttttcc actcgtcgtg gcttgttttg cttgagtttc gctgactgcg taattggata
540agatgggaaa ttactttaaa tccttcgctg atccacatcc ggacattcgt cgaaggaaaa
600tccattgcag ggaaatacga aatggaaatg cggctgggtt attggctcga catttcccat
660cttccctcac gccattggtt gcaggatcgc ggggaattgg aattccgcgc tggaattttt
720tgtcacctct tgggtttatc aaaacttttg ggtttgctat ggattttttc caattttacc
780accgcgcctg gttttttttt tttgacgacg cggaaaatcg gacttggcta tgcgggcttg
840tctgtttttc cgggtacaaa gtctgcatgt cagcctccat gcgggagtgg gagttgggaa
900agtttcccat cgatagttgg aggggtggct tgaaagtctg gaggtgctag ctgggaaagt
960tgtgtgtgcg cgatgaggca aggagtcaaa gatcagggga gttggaaagc gagaattgtg
1020ggaatcgtcc aggactcagc tggatgctga ggggcagtat gatttttttt acgttatcaa
1080tcgaattgat tttaagacag cagaacttca catactaata agatgaccat gggattagtt
1140aaaatgtgta actcgtattc gaatcgtcat tctttcacgg accaatcgtg ggaacaggag
1200atctcttcga tccaagctca caggagactt gacactcttc gtctattcct tgtcaagttt
1260ttaatgacat ctcctatgcc ctgagctatg ttttcctagc tctcatcgat cgctgccaat
1320gagccactgg agatgatcca taagtcagcg tagagtgcac cccagagttg acacttggtg
1380tctcggaatt cggctcatta tcagtgctat ttttggaaca cctctctgcg aaggtgtcat
1440ttttgtcagt gcgtatcgct caggttcaac tccccaccaa aaaccgaatt tagagcatcg
1500gcagatgtac ttgaagcact caatctaagt gaggaaacca ccccatgaac gaagagtact
1560aggagtccta tttgactcgt gcttaaaaat agaaaattac ttagggtgat ccataggtag
1620ggaggcgata ttgtaacttg catttcggac ccggacctgc acgagttatt acgggtgggt
1680tgtgagcgta tcgggaaatt ggagagccac cagatctgtc ataacttata cgggggatcc
1740ttattcctgg gagggtgcgc ctgcgtctgc tcttccgaga gagaggtggg aaatggagga
1800agagagagag agagagagtg agagagcagg tagagggaag tgagggaaat acgcaataag
1860ggtatgggaa aagtgctgtt gttgttgcta ggtagcgacg cacacgtgcg agtgtttttc
1920tgttttgaag aagaaccacc accaaatggc gacagcggcg tcggcagagg cgcagagttc
1980cgggtataaa agagcgtgct cgactgttga cctgtcacag ccacctcagc tctcgttgag
2040aacgcaacca ccgctctata ctcgatcccg aactatataa ctcgcctctc gatcgccgat
2100ctcccgattt acccatctcg atcagtaccg gaaacc
213652015DNADrosophila melanogaster 5atttggctcc ccatcgccat cggttgctcc
aatgacacta gggaattgtg ggccgccgac 60agctgtcctt aattacatgg aaatccacac
tagattcgtg cccctcgccc cgtactcgca 120gccgaagtcc ccacagagtc attcaccttg
ccaccaccaa aaaaaaaacg aaagcaactg 180aaggaaaagt tcgattcgaa ggctgaggga
tacccttaaa ggcccatttc ccggcttcgt 240aaatcacatt tagttagcca tttagactac
agcaagtctt ttaagataca ctgcaaaata 300aataccatta cattaataga agtgtcatgt
catcggtctg tatttttgtt accacagaat 360agacttacat atatgataaa aaaatgttca
acaataagtt acatcggtag ccaattctat 420agatttaatt ccttacgaat atagtttcgt
tggaatactc aatttgtaat tgtaattaat 480tataattatt ataattttaa gaatttatat
aagtaactaa aagacacggc agacacagaa 540tgaaaacact ctatgttagg gaatgcaaaa
aaacgtggcg gaagccaaaa ggcgcaagca 600aaaatcgaaa ccaagtgaat ataacatatt
atttcaacag gcaactcatt cagcatataa 660tattaccacc catggagctt tatgtagttg
atgtacgtag tctatgatgt ggagcccacg 720ttggcggaac tgggaatggg gattggggtt
tgagagctgt ggtaaattgg ggggttgaag 780tatcaagggt ttgggttctg tagacctgcg
gaatcgaggt gaataagcga agaacacatt 840cacacacact aaaaggcaaa caaagggaaa
tcaatctttg tacatacttt tagcatatgc 900acacgtatga tctccaccca cttttccctc
ccaatgaaac aaacacacac acacatgcaa 960ggccgtacgt ttgtatatgt gtgcggttgt
cggctttgcc gggaattggg gaatatttgc 1020atgcctttgt gtactttttc catatgattt
atgacctaaa ttgttgctgc tcgcgcacat 1080ataattacac acacatcgct gtggccatgt
gtgtgtgtgt cgtcttggga cgcgcgccaa 1140agtatgctac actttttgtt ttatgagtta
ataagtaggc gtggccccag cccaattgct 1200acactctgat tatggcaccg gatacccaga
tagacgccca tccaccccac tgtaagatgg 1260gggaatttcc aaacctatat gtatgtgcag
atcagatagg atagcacaga actttttaaa 1320gtacactttt ggggcacgca atttagaaaa
tgtacctcgg tgtcggagaa attattttaa 1380aagtcgactg aaccacctcg ttccatatgg
agaagtctac gagttcaagt ttaatggagc 1440agctgactgc actgaatttt gtagtttaat
acacaaatcc gcaaattgca tctcacttca 1500aatagcctgg tacatagtat ctactaacat
aactcatatt aaaataaagc aaccaaccag 1560agggccgaag ttctattaat aaaactaata
tttaactatt atatatacat tttatttact 1620tggtacgctt atgataacct tcgaaagaga
accaacacaa tacgctttgt catttgaaaa 1680ataaatatgc tgtaactact ttacaaggtg
aaactcttgt cagaagataa gaggctaggt 1740aagttgatta ttcaatcagt ttacttactg
caacccaaaa tggtcactgc actaaccttc 1800agatgagctg cactacaccc tcaatcgaga
atcaatgcaa acgcagtgcc agcgaaaatg 1860tcagcaaggg attaggccaa tcccaaacgg
gtaatcccgc tgcgacaatg ctaatccaat 1920tccgatgggc cgtataaaag ccccaagctg
ggctggctgt gatttcgtct tggcccgcag 1980accggagcat ggagtccggt aacgtgtcgt
cgagc 201562082DNADrosophila melanogaster
6atcgatgacg gcatcggctt gacctctcgg agtacgtttg attttataga acaagttttc
60tcctttctta tactataagg aaaaattata aaaattgctg aaaatgaaac atggctagaa
120ttcgtttttt aacatttttt caatctgaga aaaaatttcc gattagtctt aaaataacta
180aaccaattcg tatacccgtt aatcgtagaa gaaaaatgaa attcatataa taagtagatg
240gatttgctga cccggtgagg tatatatgta ttcctgaaca tgatcagtaa acgagtcgat
300ctggccttat ccgtatgaac gtcgagatct cgggaaatac aaaagctaga aggttgagat
360taagtatgca gattctagaa gaagacgcag cgcaagtttg cgactacgct gaatctactg
420ctaaaaactg ccacgcccac acttcttaag aatttgattt attttcacaa gctgaggaac
480ggtagggtcg aggaactcga ctacaacgtt ctgccttgtt tatttcttaa caaaaactta
540gtagccgttt gggttggaaa ccacctgacc ttaggtctgg tagcagttat ttaatttatt
600ttttttattt tatacaactt gctcgctgtt tgttccccct agccctgaaa cacaagctgt
660caaacggtgg aggtgataag tctaatgaat gcgataagct ttatttcaat tcgcaatttt
720cgtgtggcat tttggcaaaa aaaaaaactc gtcggacata catgttgcca caaacataaa
780gtgaatacat aatgttgggt gaacgactca tacacgattg tggcaaatca aattctttta
840acacgggacg gggaaaggcg agtgaagata ttttagcata tatttagcac atctgttaaa
900tccatttttt tactctccgt tttcggccag atatggttag aaaagaaaaa aattagtaca
960tacccccata tataataaga aaaaaagaga gagtcagcag aagtacgggg agcttaagtg
1020tagcaatcag aacatcacaa atagtaaata aattaataat aataataatc atatccaaaa
1080atatttttat tcctaaccta tcgcattgtt acatcgaggg tgaaattcaa aatagacaaa
1140aagttgggga ataaaatgtg aaaaaagtgg taaaatgttt aatagtgtgg gcgttactgt
1200tttgtcggtg tgaggtgcgt ggccaccaaa gtgtttttgg tataacgata gaaattggta
1260agacaaacaa tattgcgaag aaaacccgaa gcatttttaa aaagtgcgaa cgtggcagtt
1320ttaagggttt gtgggcgtgg caataatttt tggcaattcg ataaaaatgt acaggaccaa
1380atatatgaag aaatataaaa tatttttcaa aatgacagcc agcaaccata catatatata
1440aataaatgtc ggagaccctt ccttctacct gtaacatact tttccacgaa tctagtattg
1500gttgatatat aattatgctg tgtataagac caaaatcagt gtacatttcc attggattca
1560ccaaccggat ggttccggat ggtaatgcaa aatattcatc taagaaacga aaacacctag
1620aattaaacct gaactgatat gacttatgca catatcagtg aggtgggcag ttcaaagcaa
1680tcacgatgct ccaagttatt atcgcagtgc agtgaaaaat tcacagtcac cgtcgccaat
1740tgccaataaa gatcggccat tatacaacag aaccgcgttg aagacgatcg acgaggtcgt
1800gggtcttatc ttatcaccac ctgaattgag gcatgcctcc agaatgacga gggcatccga
1860agataatgtg gcccgctatt ttcggccggg actggaccta tgcgacgacc tatgctgatg
1920acgggagtct gccgctgata tggtgcaatg caaggctcca gtcgggggta taaaagaccc
1980agtttcggtg cagtcaagac aacagacttt aggtgttggt cgttgagcga accaaagccg
2040gagcagttga ggaaccaaag aatagcagcg agaggaccaa gg
208271999DNASaccharomyces cerevisiae 7tgtagggacc caaatccaat tgtagtagtt
accttgatta tggttggctt gtccttcgat 60agttttgcct tttccaaagc gctagaaatg
gattccatat cgtcgtctcc tttatcgact 120tccatgactt cccaaccata tgcctcgtat
cgcttcaaaa catcttcgtc gaacgagtac 180gaggttttac cgtcaatgga aatgctatta
ctgtcataaa acgtaatcaa gttacccaat 240tgcagatgtc ccgctaagga agaggtctcc
gaagaaacac cctcttgtaa gcaaccatcc 300cctacaatag caaacgtata tgagtcggaa
atgggaaagc catcctcgtt ataagtggcg 360gcaaagttgg cctgcgctat tgccatacca
acagcatttg agataccctg gcctagcgga 420ccggaagtga tttccactcc cgctgagtgg
aattctggat gacccggtgt ccttgagttt 480acttgtctaa attgtctcaa gtcctcgata
gagtaatcgt atcctaatag atggagcatt 540gagtacagaa gagcgcatga gtgaccgttc
gacagaacaa acctgtctct attgatccaa 600tgttcattgt tagggttaca gcgcagttgc
ttgaaaatta catgggcaac tggtgccaat 660cctagtggtg cacctgggtg gccagattgt
gcgctttcca cctggtcaac ggaaagtaat 720cttaaagtgg aaaccgcaag tttatcaatg
tcggagaact gtgccatttt tttgttcttt 780ttttgattag taaggtataa tcgtctacgt
agaggttaca aatcgaagac tacagtaaga 840ggggacaagc caattgaata tacgactgaa
ataaatggaa taattctgca ttattacact 900cgtttatata tccaaacagg tgatctggta
ttctcttgac aacgaatgaa gctccctata 960ttcgacactc cttattcagg actcctccca
acaaggagaa gtaggtgttc cttgagctac 1020cctttaaagc tggggagatg agcttgccct
tcctgtcatc gccattatga cgagaaaagt 1080aaaacatgta gaataaggtc cacccaaaca
tgtccgagca atgacgttat atatcgtgtt 1140ccctgttcaa agcatggcat atgtgccatt
aaaggcgaat ttttgtccct agcaaaggag 1200agacagcgag ccaccattaa gaagtgactt
gaaagcaagc gaaaatagct acacatatat 1260atcaatatat tgacctataa acccaaaatg
tgaaagaaat ttgataggtc aagatcaatg 1320taaacaatta ctttgttatg tagagttttt
ttagctacct atattccacc ataacatcaa 1380tcatgcggtt gctggtgtat ttaccaataa
tgtttaatgt atatatatat atatatatat 1440ggggccgtat acttacatat agtagatgtc
aagcgtaggc gcttcccctg ccggctgtga 1500gggcgccata accaaggtat ctatagaccg
ccaatcagca aactacctcc gtacattcat 1560gttgcaccca cacatttata cacccagacc
gcgacaaatt acccataagg ttgtttgtga 1620cggcgtcgta caagagaacg tgggaacttt
ttaggctcac caaaaaagaa agaaaaaata 1680cgagttgctg acagaagcct caagaaaaaa
aaaattcttc ttcgactatg ctggaggcag 1740agatgatcga gccggtagtt aactatatat
agctaaattg gttccatcac cttcttttct 1800ggtgtcgctc cttctagtgc tatttctggc
ttttcctatt tttttttttc catttttctt 1860tctctctttc taatatataa attctcttgc
attttctatt tttctctcta tctattctac 1920ttgtttattc ccttcaaggt ttttttttaa
ggagtacttg tttttagaat atacggtcaa 1980cgaactataa ttaactaaa
1999
8 2001 DNA Saccharomyces cerevisiae
8 tctgctatta ttgatgcttt gaagacctcc agacaaattt ttcacagaat gtactcttac
60gttgtttacc gtattgcttt gtctctacat ttggaaatct tcttgggtct atggattgct
120attttggata actctttgga cattgatttg attgttttca tcgctatttt cgctgatgtt
180gctactttgg ctattgctta cgataatgct ccttactctc caaagcccgt taaatggaac
240ctaccaagat tatggggtat gtctattatt ttgggcatag ttttagctat aggttcttgg
300attaccttga ctactatgtt cttaccaaag ggtggtatta tccaaaactt cggtgctatg
360aacggtatta tgttcttgca aatttccttg actgaaaact ggttgatttt cattaccaga
420gctgctggtc cattctggtc ttctatccca tcctggcaat tggctggtgc cgtcttcgct
480gtcgacatca tcgctaccat gtttacctta ttcggttggt ggtctgaaaa ctggactgat
540attgttactg tcgtccgtgt ctggatctgg tctatcggta tcttctgtgt tttgggtggt
600ttctactacg aaatgtccac ttctgaagcc tttgacagat tgatgaacgg taagccaatg
660aaggaaaaga agtctaccag aagtgtcgaa gacttcatgg ctgctatgca aagagtctct
720actcaacacg aaaaggaaac ctaatcctgt tgaagtagca tttaatcata atttttgtca
780cattttaatc aacttgattt ttctggttta atttttctaa ttttaatttt aattttttta
840tcaatgggaa ctgatacact aaaaagaatt aggagccaac aagaataagc cgcttatttc
900ctactagagt ttgcttaaaa tttcatctcg aattgtcatt ctaatatttt atccacacac
960acaccttaaa atttttagat taaatggcat caactcttag cttcacacac acacacacac
1020cgaagctggt tgttttattt gatttgatat aattggtttc tctggatggt actttttctt
1080tcttggttat ttcctatttt aaaatatgaa acgcacacaa gtcataatta ttctaataga
1140gcacaattca caacacgcac atttcaactt taatattttt ttagaaacac tttatttagt
1200ctaattctta atttttaata tatataatgc acacacacta atttattcat taatttttta
1260ttgagtagga tttgaaaata tttggtatct ttgcaagatg tttgtataga gggacaaaga
1320atcgtcttta ttatggtcaa ggctttacgt cataatagtt cctgcccagc tcttctataa
1380tactttaaag atctcttctc gtttgctcca tttggaagtc tcgcttacgt ttatgcgccc
1440atacagacac tcaagataca cacttacatg aacgtataca aatttactaa cactacttga
1500aaatatgaac cacagtacat catattaaga cgtagtattc gatgattgaa ggccgcctcc
1560gcgaaatacc tttactgatt ttgccggtta atcgcatcga aatttcttca tcacaagaaa
1620gcaaacaaat cgccaggcca ttctacaagt ttccttttct tatgaagatg taaaagctac
1680taaggcgtca ttactctaga tgactcagtt tagtctgacc ttctatagta tactaccctg
1740gcgctatgat gatgagcggt tcttttattg cggaaacgaa aattccggga ccggcgaaat
1800ttgcccggtt ttgtccgtaa ccggcttcat gagtcggctt caatagtagt tgaatactta
1860tttaaacagc agaacttaac tcactcatca cgctgtttcc gctgaatttt ctcaaaatat
1920ctaagcagtc aacaaatata aagaatattg aaattgacag tttttgtcgc tatcgatttt
1980tattatttgc tgttttaaat c
2001
9 2000 DNA Saccharomyces cerevisiae
9 ccaaatcatt cttattcggt ttccagacgg
taacaatacc ctcgcccatc ccacacaaaa 60gggtatctgc tacttcggga tcgacgaaac
aaccacaaag aacttcgtcc tcctgatcat 120cgctgatcaa aattttacca tcctcgtttc
cagctacgtt cggtttggca tctttgtcgc 180gaacgtcaaa ataagctaac gttgtctggc
ccaaagaaat gaatttatat gcagatcttt 240tatcaaagtg gaaaatatcg ttgatagagt
cgccaaaatg tatcgaacga atggaatttg 300ataatgccaa gttttccgag tttattacgt
gtatattacc ggattcatcg cctattaaaa 360tgaatgggtg agtttgagag gcgcataatt
tcgtaaattt atcatttttt ttctcttcag 420aattaaacag tgagcttaag tttacctttt
tgacgacttt gccggtcata gtattggcct 480tttttaaaac attatccgat ccaacagaaa
aaatattgtc acctttagaa tcaaagcaca 540tggcacggac actaccttta tgtcttttag
tcttccaaag tgtttttacg cccaagtctt 600catcttttcc tgtttgcttt tgttgctgtt
gttcttcaat atcaacaaat ttcaaatctc 660cagtttctag gtctatatct aatctaatcc
aagggcatac acctttcttt gcatccttgc 720ctgtagttgc agtgtcaatt ctacgtctac
gatctaggtg cgattgcaac ttagcggggt 780cataacgatg gcacacaata tgtcctgtac
caaagccagt tattataatg ggcagttcag 840gatgtaaaag agactggaaa atgggagctt
ttaatgatag taattctaga atgggcaggt 900ttgttgaatc gacaacatct gttttttttt
tgctctttgc catagctgat gcgtggattg 960tttctaattt cccagctgct tcctcttcca
attgtggcga tgatgccatg atttctatgt 1020taaaattttt ctaaccatga aatttttttt
ttctagcgag aaaaaaaatc agaaaaatta 1080ctattagtga gtattggaga cattgtcaat
gggagatgtt ctctttataa tatcttcaac 1140aggttctttc aactctggaa attcatccac
aatcttgtca gcaagtgaat ctcttaattg 1200cttcaatcca tgcatcttgc ctctttgata
ttggttggat cttcttatgg cttccacgaa 1260ctctcttgtg taaatatctg gatttctacc
gtcctcaatg tattgaacaa cttccaaggg 1320aatgtccacc ttagacaagc tggattgagg
atcgttgctt ctcacgttca gcttgtacaa 1380gcgatccaca tttctttgca agttggtgat
cattcccttg gtggcttctg gagtaccagg 1440aaaatcatat atcgagacac ctaattcaac
gaaggactca ataatcgaag ccacttggtc 1500ttgagtagtg gccagttctt gctgcaattg
ttcattgtta gtgctgtttc cattcatctt 1560atcggtttat ttttctatat atttgcctct
ttctcaaaca ggagttagta gttaaaagta 1620cgaagttctt gttctttaat gcgcgctgac
aaaagaattg gataaaagag aatggtgggg 1680ggacaagaag gaaatttgtc ctagtttaac
atgaatggca tcttgttacc gggtggacat 1740cacctattga ttctaaatat ctttacggtt
tatcatactg ttctttattc cgtcgttatt 1800ctttttattt ttatcatcat ttcacgtggc
tagtaaaaga aaagccacaa catgactcag 1860caaatctcga caaagtaaaa gctcatagag
atagtattat attgatataa aaaaagtata 1920ctgtactgtt tgtaaccttt tcaatgcttt
aagatcaaaa ctaaggccag caaaggtatc 1980aacccatagc aactcataaa
2000
10 2001 DNA Saccharomyces cerevisiae
10 gaaaccatta aatcatattt aataaattgt tgcgacatgc aagaagttcg cggatggtca
60tgcgtattta agaatagtca agtaacaatt tgcttattcg ttgatgatat gatattattc
120agcaaagact taaatgcaaa taagaaaatc ataacaacac tcaagaaaca atacgataca
180aagataataa atctgggtga aagtgataac gaaattcagt acgacatact tggattagag
240atcaaatatc aaagaagcaa gtacatgaaa ttaggtatgg aaaaatcctt gacagaaaaa
300ttacccaaac taaacgtacc tttgaaccca aaaggaaaga aacttagagc tccaggtcaa
360ccaggtcatt atatagacca ggatgaacta gaaatagatg aagatgaata caaagagaaa
420gtacatgaaa tgcaaaagtt gattggtcta gcttcatatg ttggatataa atttagattt
480gacttactat actacatcaa cacattgctc aaccatatac tattcccctc taggcaagtt
540ttagacatga catatgagtt aatacaattc atgtgggaca ctagagataa acaattaata
600tggcacaaaa acaaacctac caagccagat aataaactag tcgcaataag cgatgcttca
660tatggtaacc aaccatatta caagtcacaa attggtaaca ttttcctact caacggaaaa
720gtgattggag gaaagtcgac aaaggcttcg ttaacatgca cttcaactac agaagcagaa
780atacacgcgg tcagtgaagc tattccgcta ttgaataacc tcagtcacct tgtgcaagaa
840cttaacaaga aaccaattat taaaggctta cttactgata gtagatcaac gatcagtata
900attaagtcta caaatgaaga gaaatttaga aacagatttt ttggcacaaa ggcaatgaga
960cttagagatg aagtatcagg taataattta tacgtatact acatcgagac caagaagaac
1020attgctgatg tgatgacaaa acctcttccg ataaaaacat ttaaactatt aactaacaaa
1080tggattcatt agatctatta cattatgggt ggtatgttgg aataaaaatc aactatcatc
1140tactaactag tatttacgtt actagtatat tatcatatac ggtgttagaa gatgacgcaa
1200atgatgagaa atagtcatct aaattagtgg aagctgaaac gcaaggattg ataatgtaat
1260aggatcaatg aatattaaca tataaaatga tgataataat atttatagaa ttgtgtagaa
1320ttgcagattc ccttttatgg attcctaaat cctcgaggag aacttctagt atatctacat
1380acctaatatt attgccttat taaaaatgga atcccaacaa ttacatcaaa atccacattc
1440tcttcaaaat caattgtcct gtacttcctt gttcatgtgt gttcaaaaac gttatattta
1500taggataatt atactctatt tctcaacaag taattggttg tttggccgag cggtctaagg
1560cgcctgattc aagaaatatc ttgaccgcag ttaactgtgg gaatactcag gtatcgtaag
1620atgcaagagt tcgaatctct tagcaaccat tatttttttc ctcaacataa cgagaacaca
1680caggggcgct atcgcacaga atcaaattcg atgactggaa attttttgtt aatttcagag
1740gtcgcctgac gcatatacct ttttcaactg aaaaattggg agaaaaagga aaggtgagag
1800ccgcggaacc ggcttttcat atagaataga gaagcgttca tgactaaatg cttgcatcac
1860aatacttgaa gttgacaata ttatttaagg acctattgtt ttttccaata ggtggttagc
1920aatcgtctta ctttctaact tttcttacct tttacatttc agcaatatat atatatatat
1980ttcaaggata taccattcta a
2001
11 2001 DNA Saccharomyces cerevisiae
11 cactaccacc actacggttg tccatgacgt
atcctgcgat tttttgaatt aatgattcaa 60tagttgacat ttgctcgtca ttggggttcg
actgagctgc ggatgtcaac ttcgcaacag 120cttctgcatg gtttccttga gaaaaatgag
actcagcctc tgagattaac ttatccgtat 180ccatttcaga tctttgctat acgtttgtat
cgctatatgt acgttctttt aatgaacttt 240ctcctttctt tatcgtgtag cttgcttggg
tatcttttaa tgagttgcgg acagtgagat 300ttttcagaag ggcaattggc caagacacca
aaaacgtttg gacgagacag gcatcaaagg 360acaaggtaaa aggcgttgag ctgtggctgg
ctgtgtatgc gtttgaaata ccatggatag 420atatcaaaga aagataggat gtttcataca
aatcccaaat ttggggcgcg gacaactgaa 480atacgtgggt ccagtggaca cgaaagctgg
aatgtttgct ggtgtagact tacttgccaa 540cattggtaag aacgatggat cattcatggg
gaagaagtat tttcaaacag agtatcctca 600aagtggacta tttatccagt tgcaaaaagt
cgcatcattg atcgagaagg catcgatatc 660gcaaacctcg agaagaacga cgatggaacc
gctatcaata cccaaaaaca gatctattgt 720gaggctcact aaccagttct ctcccatgga
tgatcctaaa tcccccacac ccatgagaag 780tttccggatc accagtcggc acagcggtaa
tcaacagtcg atggaccagg aggcatcgga 840tcaccatcaa cagcaagaat ttggttacga
taacagagaa gacagaatgg aggtcgactc 900tatcctgtca tcagacagaa aggctaatca
caacaccacc agcgattgga aaccggacaa 960tggccacatg aatgacctca atagcagcga
agttacaatt gaattacgag aagcccaatt 1020gaccatcgaa aagctacaaa ggaaacaact
acactacaaa aggctactcg atgaccaaag 1080aatggtcctc gaagaagtgc aaccgacttt
tgataggtat gaagccacaa tacaagaaag 1140agagaaagag atagaccatc tcaagcaaca
attggagctc gaacgcagac agcaagccaa 1200acaaaagcag ttttttgacg ctgagaatga
acagctactt gctgtcgtaa gccaactaca 1260cgaagagatc aaagaaaacg aagagagaaa
tctttctcat aatcaaccca ctggtgccaa 1320cgaagatgtc gaactcctga aaaaacagct
ggaacaatta cgcaacatag aagaccaatt 1380tgagttacac aagacaaagt gggctaaaga
acgcgaacaa ttgaaaatgc ataacgattc 1440gctcagtaaa gaataccaaa atttgagcaa
ggaactattt ttgacaaaac cacaagattc 1500ctcatcggaa gaggtggcat ccttaacgaa
aaaacttgaa gaggctaatg aaaaaatcaa 1560acagttggaa caggctcaag cacaaacagc
cgtggaatcg ttgccaattt tcgacccccc 1620tgcaccagtc gataccacgg caggaagaca
acagtggtgt gagcattgcg atacgatggg 1680tcataataca gcagaatgcc cccatcacaa
tcctgacaac cagcagttct tctaggcagt 1740cgaactgact ctaatagtga ctccggtaaa
ttagttaatt aattgctaaa cccatgcaca 1800gtgactcacg tttttttatc agtcattcga
tatagaaggt aagaaaagga tatgactatg 1860aacagtagta tactgtgtat ataatagata
tggaacgtta tattcacctc cgatgtgtgt 1920tgtacataca taaaaatatc atagcacaac
tgcgctgtgt aatagtaata caatagttta 1980caaaattttt tttctgaata c
2001
12 2001 DNA Saccharomyces cerevisiae
12 acaatgagga agaacatgcc gttttacaag aattaaatag tttaacccaa agaattaatg
60aactaggcat ggaaagtata aattcaaact ccgattcgga cagaataaac gggtcatatt
120cacaagtgga ttttggtaac aataacgacg aggacgatat gaacctgttc gacccagatt
180ttatggcaca agaccaattg cgtgctgaag aaagagacta caacaaggat gatagaacac
240ccttagctaa ggtccctgcg gcctttcaat caactggatt gggcataacc cccgatgacg
300atatcgagag acaatacata acggaacaca gatcacgaca tgaagtgcca aagcggtctc
360ccgagaaacc ctccaacccg ctggaaatag gtaacccata cgcgaaacct ggcacaaggt
420tgaataccac tcacacccac agcaaaactg atcgtagcat tacccctcag aggggccagc
480cagtcccatc aggccagcag atttcctcct acgtgcagcc agcaaacatt aatagtccta
540acaaaatgta tggtgcaaac aactcggcaa tgggttcgcc caggaatcca aagacgagag
600cgccaccagg tccatacaat cagggatgga ataaccgccc ctcgccttca aatatttacc
660aacgtcctca tccctcagat acacaaccac aagcatatca tctccccgga aacccatact
720caacggggaa caggccaaac atgcaagcgc aatatcaccc gcagcaggtg cccatgccta
780tcctgcagca gcccaatcgc ccgtaccaac cttatgcgat gaatacgcac atgggctctc
840ctggcggata tgctggggca gcaccaccat ttcagccagc taacgtcaac tacaatacta
900ggcctcagca gccatggcct acacctaact caccatccgc acactaccgt ccgcccccta
960acctgaacca gcctcaaaac ggtagtgctg gttactatcg tccgccggca ccacaattgc
1020aaaactccca agcccgtcca caaaagaagg acggattctc acagttcatg ccatctgcaa
1080ctacgaagaa cccatatgcc cagtaactcg accgactggt tgtaatttta caaaaagaga
1140gacaattaag aaaagaaaca agcgccaggc ttccgtatcc cagtttttca tctcactttc
1200tgggcacgat tgtaataata cttcatgata ataactaaac tatataagta gtgtctcatc
1260cgtaaatata catttagaca gattcttgta ttttctccgg gcaattttta actttttttc
1320tgttagggca catgacactt gcctattatg gacagccagt aaagatgtgc catatattgc
1380cccctttacg ctctctgcca gtattagtgg gaaaaaaaaa actgaaaaaa aaaaaatcgc
1440agactactaa taatcacgtg atatttcttt tcactctctt cataaagttg ctaaaaacac
1500acaatcgaat gagcctctga gcagtataaa ttgtacttca aagcactagt catgaaaaac
1560gcttacatta gttcagtttg tcaaggttat gctattactt gtacttattt cttgctattg
1620ttagtggctc cccacattga cgtattttca cgtgatgcgc ctcactgcgg aaggcgccac
1680acattgcctg caaaaaattg tggatgcact catttgatag taaactaagt catgttaatc
1740gtttggattt ggcacacacc cacaaatata cacattacat atatatatat attcaaaata
1800cagctgcgtc caatagatga gcttccgctt cgttgtacaa cctacctgct atcttgttca
1860cggatatttc ttgcttttaa taaacaaaag taactctaga acagtcaagt cttcgataat
1920ttttttagtc acagggtccg tctaaagttt ctctttattt ggaataatag aaaagaaaga
1980aaaaaacgta gtataaaagg a
2001
13 2000 DNA Saccharomyces cerevisiae
13 aaggatggca aatacccaat cggaggaact
cgaacacttc agtatctgtg tcttctagtg 60agtctttagc ggaagttatt cagccatctt
ccttcaaaag tgggagtagt tcattgcatt 120atctatcgtc ttctatctca agccaacctg
gttcgtacgg ttcttggttc aacaaaaggc 180caacaatttc tcagttcttt caaccaagcc
cttctttaaa acacaacgag tcgtgggaga 240ggctgcaaac aactgctgga aatatgcaaa
ggacttcaag ttcgtcttct ttgcagcaag 300caacctccag gttatcacta accactccgc
aacaatcacc gtctatcagc gaatatgatg 360agtatccttg gatgggcaca cctggctctc
ctaatgttgg agatgtgtct cacgcacccc 420cattggttaa gaatatatca tataaatttc
cactaaagaa cgttgagttg aaaagagatt 480gccaaaggat ctctcaggat gatcttttgg
atgaggcttt tgaaagaata tgtcagccct 540ctttggctga ccttaattcc acttacgaaa
tttttccagg taactcttct tatgcggata 600ttttgactac tgattctgat attgatgatg
gcttgatgaa taaacctctg gaactattgc 660cgaaatatac aatgtattta acccatttta
acaatttttt ccagttgcaa gcatgtcctg 720ctggtcaaga atcagagagc agaataacaa
attctatgaa gattgacctg ttaaaggcgg 780attacacaag aagtctatta gtatcgttac
gttcaaggga cattagggat gtcgcattga 840aaagagagtt tactggcaat aacaacaata
acagcaacca gaatatctat gatgagaatt 900ttgtcggaaa aaggaagtac gtgttgaaac
agaagaccag aaaaatcttt tcctgtggca 960agattggcaa gctaagtact agtttggaaa
actgcgttaa ttttgttgaa aatagtataa 1020agagtgcaat gatgttatat gatgataatg
gaatagatag tgagcttcgc gattcagaag 1080ctttacggat tttttcatct cttgttcatt
attgtaatgc aggttaatgt tttctccttc 1140tttacatgtt taatatattc caagttacct
aagaggtgta cgatattttt ttcttttata 1200tatatgattt tctctattca ttttttagtt
ttttttgata cataagcgaa tcgcacattg 1260cgcaacttca atttgttgat tcgccaaagt
attcttacca taaaacaacc attcgttgct 1320ttaccctttc gtaatcattt accgtgataa
ccataatcag aaacttatta tttcagccta 1380gtagaccggc caagcaggcc ttgtaatgtt
tctcttgatt gcttgaatct tttaagcagc 1440caaatctttc caaaaaaatg caattatcag
aacaaaacta tttaaggtga cttctccgta 1500tttacaccac cagaagcgtt ctggctcccc
ttttctctaa acgttaaaca ttttacaatt 1560gaaatgttac caatcctata ttattgtacc
acattgccag atttatgaac tctgggtatg 1620ggtgctaatt ttcgttagaa gcgctggtac
aattttctct gtcattgtga cactaattag 1680gaaacttctc gactatcaat gtgtaaatga
aggaataatg gcggaaactt tgaaactttg 1740tcaataattg catcattgga tgcgtttcat
ttggccgtta tcacggagag gcagagttct 1800ctccacaatt tgggcagaag tcttttgaaa
agacatatat atatatatat atgtatatga 1860gtggatgctt aaggtaagaa taatttctga
attcccaagt attcattttg tgcagtattc 1920acatattcta ttttattgct ttttaacttt
agaggcaatt aaatttgtgt aggaaaggca 1980aaatactatc aaaattttcc
2000
14 2001 DNA Saccharomyces cerevisiae
14 ttgccttcaa gatctacttt cctaagaaga tcattattac aaacacaact gcactcaaag
60atgactgctc atactaatat caaacagcac aaacactgtc atgaggacca tcctatcaga
120agatcggact ctgccgtgtc aattgtacat ttgaaacgtg cgcccttcaa ggttacagtg
180attggttctg gtaactgggg gaccaccatc gccaaagtca ttgcggaaaa cacagaattg
240cattcccata tcttcgagcc agaggtgaga atgtgggttt ttgatgaaaa gatcggcgac
300gaaaatctga cggatatcat aaatacaaga caccagaacg ttaaatatct acccaatatt
360gacctgcccc ataatctagt ggccgatcct gatcttttac actccatcaa gggtgctgac
420atccttgttt tcaacatccc tcatcaattt ttaccaaaca tagtcaaaca attgcaaggc
480cacgtggccc ctcatgtaag ggccatctcg tgtctaaaag ggttcgagtt gggctccaag
540ggtgtgcaat tgctatcctc ctatgttact gatgagttag gaatccaatg tggcgcacta
600tctggtgcaa acttggcacc ggaagtggcc aaggagcatt ggtccgaaac caccgtggct
660taccaactac caaaggatta tcaaggtgat ggcaaggatg tagatcataa gattttgaaa
720ttgctgttcc acagacctta cttccacgtc aatgtcatcg atgatgttgc tggtatatcc
780attgccggtg ccttgaagaa cgtcgtggca cttgcatgtg gtttcgtaga aggtatggga
840tggggtaaca atgcctccgc agccattcaa aggctgggtt taggtgaaat tatcaagttc
900ggtagaatgt ttttcccaga atccaaagtc gagacctact atcaagaatc cgctggtgtt
960gcagatctga tcaccacctg ctcaggcggt agaaacgtca aggttgccac atacatggcc
1020aagaccggta agtcagcctt ggaagcagaa aaggaattgc ttaacggtca atccgcccaa
1080gggataatca catgcagaga agttcacgag tggctacaaa catgtgagtt gacccaagaa
1140ttcccattat tcgaggcagt ctaccagata gtctacaaca acgtccgcat ggaagaccta
1200ccggagatga ttgaagagct agacatcgat gacgaataga cactctcccc ccccctcccc
1260ctctgatctt tcctgttgcc tctttttccc ccaaccaatt tatcattata cacaagttct
1320acaactacta ctagtaacat tactacagtt attataattt tctattctct ttttctttaa
1380gaatctatca ttaacgttaa tttctatata tacataacta ccattataca cgctattatc
1440gtttacatat cacatcaccg ttaatgaaag atacgacacc ctgtacacta acacaattaa
1500ataatcgcca taaccttttc tgttatctat agcccttaaa gctgtttctt cgagcttttt
1560cactgcagta attctccaca tgggcccagc cactgagata agagcgctat gttagtcact
1620actgacggct ctccagtcat ttatgtgatt ttttagtgac tcatgtcgca tttggcccgt
1680ttttttccgc tgtcgcaacc tatttccatt aacggtgccg tatggaagag tcatttaaag
1740gcaggagaga gagattactc atcttcattg gatcagattg atgactgcgt acggcagata
1800gtgtaatctg agcagttgcg agacccagac tggcactgtc tcaatagtat attaatgggc
1860atacattcgt actcccttgt tcttgcccac agttctctct ctctttactt cttgtatctt
1920gtctccccat tgtgcagcga taaggaacat tgttctaata tacacggata caaaagaaat
1980acacataatt gcataaaata c
2001
15 1999 DNA Saccharomyces cerevisiae
15 ttttgtaaga aattattcac cgcatcttca
tctggcaaac gaatgggaga ctttgaggaa 60cccaatccat ttctgaataa cggagattta
gaaatgtaaa aggtagcaaa tgtaaaaagt 120gccaggacca tcacagcagt caatgccaac
accaatttcc cttgccatga cactgttgga 180tcttttgaag gagatttgta acctggaatc
tcactataat gaacacattc accggattca 240cacttcaaag taatataagg gtcaccaaac
acggtcaata tcaaatcatt catagaaggc 300tcactgaatt tacattgcct tgtttctaaa
tcacagctga aatctcctgg cccttttatt 360gtctctgtca ggaaatccga gatatctata
gaccccttag caccacacaa cacagtgtcg 420ggaacgcatt tgcattgaac gtcattacac
ttataatggg aggtattctg ttccaagtcg 480tattcaaagg cacaatcact taagccacaa
tagaagcttt ctaactgatc tatccaaaac 540tgaaaattac attcttgatt aggtttatca
caggcaaatg taatttgtgg tattttgccg 600ttcaaaatct gtagaatttt ctcattggtc
acattacaac ctgaaaatac tttatctaca 660atcataccat tcttataaca tgtcccctta
atactaggat caggcatgaa cgcatcacag 720acaaaatctt cttgacaaac gtcacaattg
atccctcccc atccgttatc acaatgacag 780gtgtcatttt gtgctcttat gggacgatcc
ttattaccgc tttcatccgg tgatagaccg 840ccacagaggg gcagagagca atcatcacct
gcaaaccctt ctatacactc acatctacca 900gtgtacgaat tgcattcaga aaactgtttg
cattcaaaaa taggtagcat acaattaaaa 960catggcgggc atgtatcatt gcccttatct
tgtgcagtta gacgcgaatt tttcgaagaa 1020gtaccttcaa agaatggggt cttatcttgt
tttgcaagta ccactgagca ggataataat 1080agaaatgata atatactata gtagagataa
cgtcgatgac ttcccatact gtaattgctt 1140ttagttgtgt atttttagtg tgcaagtttc
tgtaaatcga ttaatttttt tttctttcct 1200ctttttatta accttaattt ttattttaga
ttcctgactt caactcaaga cgcacagata 1260ttataacatc tgcataatag gcatttgcaa
gaattactcg tgagtaagga aagagtgagg 1320aactatcgca tacctgcatt taaagatgcc
gatttgggcg cgaatccttt attttggctt 1380caccctcata ctattatcag ggccagaaaa
aggaagtgtt tccctccttc ttgaattgat 1440gttaccctca taaagcacgt ggcctcttat
cgagaaagaa attaccgtcg ctcgtgattt 1500gtttgcaaaa agaacaaaac tgaaaaaacc
cagacacgct cgacttcctg tcttcctatt 1560gattgcagct tccaatttcg tcacacaaca
aggtcctagc gacggctcac aggttttgta 1620acaagcaatc gaaggttctg gaatggcggg
aaagggttta gtaccacatg ctatgatgcc 1680cactgtgatc tccagagcaa agttcgttcg
atcgtactgt tactctctct ctttcaaaca 1740gaattgtccg aatcgtgtga caacaacagc
ctgttctcac acactctttt cttctaacca 1800agggggtggt ttagtttagt agaacctcgt
gaaacttaca tttacatata tataaacttg 1860cataaattgg tcaatgcaag aaatacatat
ttggtctttt ctaattcgta gtttttcaag 1920ttcttagatg ctttcttttt ctctttttta
cagatcatca aggaagtaat tatctacttt 1980ttacaacaaa tataaaaca
1999
16 1999 DNA Saccharomyces cerevisiae
16 aaacaaatgg caaaaataac gggcttcacc attgttcctg tatggtgtat tagaacatag
60ctgaaaatac ttctgcctca aaaaagtgtt aaaaaaaaga ggcattatat agaggtaaag
120cctacaggcg caagataaca catcaccgct ctcccccctc tcatgaaaag tcatcgctaa
180agaggaacac tgaaggttcc cgtaggttgt ctttggcaca aggtagtaca tggtaaaaac
240tcaggatgga ataattcaaa ttcaccaatt tcaacgtccc ttgtttaaaa agaaaagaat
300ttttctcttt aaggtagcac taatgcatta tcgatgatgt aaccattcac acaggttatt
360tagcttttga tccttgaacc attaattaac ccagaaatag aaattaccca agtggggctc
420tccaacacaa tgagaggaaa ggtgactttt taagggggcc agaccctgtt aaaaaccttt
480gatggctatg taataatagt aaattaagtg caaacatgta agaaagattc tcggtaacga
540ccatacaaat attgggcgtg tggcgtagtc ggtagcgcgc tcccttagca tgggagaggt
600ctccggttcg attccggact cgtccaaatt attttttact ttccgcggtg ccgagatgca
660gacgtggcca actgtgtctg ccgtcgcaaa atgatttgaa ttttgcgtcg cgcacgtttc
720tcacgtacat aataagtatt ttcatacagt tctagcaaga cgaggtggtc aaaatagaag
780cgtcctatgt tttacagtac aagacagtcc atactgaaat gacaacgtac ttgacttttc
840agtattttct ttttctcaca gtctggttat ttttgaaagc gcacgaaata tatgtaggca
900agcattttct gagtctgctg acctctaaaa ttaatgctat tgtgcacctt agtaacccaa
960ggcaggacag ttaccttgcg tggtgttact atggccggaa gcccgaaaga gttatcgtta
1020ctccgattat tttgtacagc tgatgggacc ttgccgtctt catttttttt ttttttcacc
1080tatagagccg ggcagagctg cccggcttaa ctaagggccg gaaaaaaaac ggaaaaaaga
1140aagccaagcg tgtagacgta gtataacagt atatctgaca cgcacgtgat gaccacgtaa
1200tcgcatcgcc cctcacctct cacctctcac cgctgactca gcttcactaa aaaggaaaat
1260atatactctt tcccaggcaa ggtgacagcg gtccccgtct cctccacaaa ggcctctcct
1320ggggtttgag caagtctaag tttacgtagc ataaaaattc tcggattgcg tcaaataata
1380aaaaaagtaa ccccacttct acttctacat cggaaaaaca ttccattcac atatcgtctt
1440tggcctatct tgttttgtcc tcggtagatc aggtcagtac aaacgcaaca cgaaagaaca
1500aaaaaagaag aaaacagaag gccaagacag ggtcaatgag actgttgtcc tcctactgtc
1560cctatgtctc tggccgatca cgcgccattg tccctcagaa acaaatcaaa cacccacacc
1620ccgggcaccc aaagtcccca cccacaccac caatacgtaa acggggcgcc ccctgcaggc
1680cctcctgcgc gcggcctccc gccttgcttc tctccccttc cttttctttt tccagttttc
1740cctattttgt ccctttttcc gcacaacaag tatcagaatg ggttcatcaa atctatccaa
1800cctaattcgc acgtagactg gcttggtatt ggcagtttcg tagttatata tatactacca
1860tgagtgaaac tgttacgtta ccttaaattc tttctccctt taattttctt ttatcttact
1920ctcctacata agacatcaag aaacaattgt atattgtaca ccccccccct ccacaaacac
1980aaatattgat aatataaag
1999
17 2009 DNA Saccharomyces cerevisiae
17 ggatgagaaa cgagtgcggt ttcgagagta
gatattcaac ccacccgaag tagccttcag 60gaactggttc cgttctctct tcctccggaa
tagtctgaat gtccttaaga gaccgtggct 120cgtatactct tctattcttg ggccgcaata
gcaaaaagag ccagacaaac acgacggcgg 180taagaccgta gataatcagg gttgaaatga
acgccgaagt cgaagaactg tcagccatag 240tacgtatgtg ctataaatat ctaacctttc
gctgctttga atatgatgtg ctcaaatata 300acttaatata atagtataac aaaaaggagt
actatttgct aaatatcgta gacgtagtag 360acatagtaaa tacaataaag gatagataac
caagaaccca catcaagcga atacatacat 420atatatatac tcgatgtata catgtttcta
agcacttgcg cacatacgta tttaaagtat 480ttcagggaga ttaacgtatt aaaacaagaa
gagggttgac tacatcacga tgagggggat 540cgaagaaatg atggtaaatg aaataggaaa
tcaaggagca tgaaggcaaa agacaaatat 600aagggtcgaa cgaaaaataa agtgaaaagt
gttgatatga tgtatttggc tttgcggcgc 660cgaaaaaacg agtttacgca attgcacaat
catgctgact ctgtggcgga cccgcgctct 720tgccggcccg gcgataacgc tgggcgtgag
gctgtgcccg gcggagtttt ttgcgcctgc 780attttccaag gtttaccctg cgctaagggg
cgagattgga gaagcaataa gaatgccggt 840tggggttgcg atgatgacga ccacgacaac
tggtgtcatt atttaagttg ccgaaagaac 900ctgagtgcat ttgcaacatg agtatactag
aagaatgagc caagacttgc gagacgcgag 960tttgccggtg gtgcgaacaa tagagcgacc
atgaccttga aggtgagacg cgcataaccg 1020ctagagtact ttgaagagga aacagcaata
gggttgctac cagtataaat agacaggtac 1080atacaacact ggaaatggtt gtctgtttga
gtacgctttc aattcatttg ggtgtgcact 1140ttattatgtt acaatatgga agggaacttt
acacttctcc tatgcacata tattaattaa 1200agtccaatgc tagtagagaa ggggggtaac
acccctccgc gctcttttcc gatttttttc 1260taaaccgtgg aatatttcgg atatcctttt
gttgtttccg ggtgtacaat atggacttcc 1320tcttttctgg caaccaaacc catacatcgg
gattcctata ataccttcgt tggtctccct 1380aacatgtagg tggcggaggg gagatataca
atagaacaga taccagacaa gacataatgg 1440gctaaacaag actacaccaa ttacactgcc
tcattgatgg tggtacataa cgaactaata 1500ctgtagccct agacttgata gccatcatca
tatcgaagtt tcactaccct ttttccattt 1560gccatctatt gaagtaataa taggcgcatg
caacttcttt tctttttttt tcttttctct 1620ctcccccgtt gttgtctcac catatccgca
atgacaaaaa aatgatggaa gacactaaag 1680gaaaaaatta acgacaaaga cagcaccaac
agatgtcgtt gttccagagc tgatgagggg 1740tatctcgaag cacacgaaac tttttccttc
cttcattcac gcacactact ctctaatgag 1800caacggtata cggccttcct tccagttact
tgaatttgaa ataaaaaaaa gtttgctgtc 1860ttgctatcaa gtataaatag acctgcaatt
attaatcttt tgtttcctcg tcattgttct 1920cgttcccttt cttccttgtt tctttttctg
cacaatattt caagctatac caagcataca 1980atcaactatc tcatatacaa tgtctatcc
2009
18 1943 DNA Saccharomyces cerevisiae
18 ggcagtcatc aggatcgtag gagataagca ccctgacaag taacatgccg atgaagttgt
60ttggttcatt gggcaaaaaa atcgggattc tagaaaaccc tgagttgaag attttttcga
120cagttttatc gtctaggatg gtatcggcac tcattgtgaa cacgttttca atcggagtca
180tgatttcctc aaccctcttt gcctttagat ccaaaacagc agagatgatt gtaacttcgt
240ctttagtcaa ccgttccacc cccatggtcc tatgcaaggt gaccaaagtc tttaagccgg
300attttttgta catcgtacca tgatcttcac ccagcatata gtccaggaga gtcgcgatcg
360gatatgcgac tgggtacatc agatacatca gtacaagaac aaaggggcag aagaatgccc
420caacttgcag cccgtattta acacagacac tctgcggaat aatttcaccg aagatcacaa
480ttagaatagt tgacgacact acagcctgcc aaccaccccc aagacacctg tccaaaacaa
540taggcaatgt ttcgttggtt ataacattag aaagcagcag tgtgactaga acccaatgct
600tccccctaga tattaggtca agcacccgct tggccagttt cttttcagaa ttcgagcctg
660aagtgctgat taccttcagg tagacttcat cttgacccat caaccccagc gtcaatcctg
720caaatacacc acccagcagc actaggatga tagagataat atagtacgtg gtaacgcttg
780cctcatcacc tacgctatgg ccggaatcgg caacatccct agaattgagt acgtgtgatc
840cggataacaa cggcagtgaa tatatcttcg gtatcgtaaa gatgtgatat aagatgatgt
900atacccaatg aggagcgcct gatcgtgacc tagaccttag tggcaaaaac gacatatcta
960ttatagtggg gagagtttcg tgcaaataac agacgcagca gcaagtaact gtgacgatat
1020caactctttt tttattatgt aataagcaaa caagcacgaa tggggaaagc ctatgtgcaa
1080tcaccaaggt cgtccctttt ttcccatttg ctaatttaga atttaaagaa accaaaagaa
1140tgaagaaaga aaacaaatac tagccctaac cctgacttcg tttctatgat aataccctgc
1200tttaatgaac ggtatgccct agggtatatc tcactctgta cgttacaaac tccggttatt
1260ttatcggaac atccgagcac ccgcgccttc ctcaacccag gcaccgcccc caggtaaccg
1320tgcgcgatga gctaatcctg agccatcacc caccccaccc gttgatgaca gcaattcggg
1380agggcgaaaa ataaaaactg gagcaaggaa ttaccatcac cgtcaccatc accatcatat
1440cgccttagcc tctagccata gccatcatgc aagcgtgtat cttctaagat tcagtcatca
1500tcattaccga gtttgttttc cttcacatga tgaagaaggt ttgagtatgc tcgaaacaat
1560aagacgacga tggctctgcc attgttatat tacgcttttg cggcgaggtg ccgatgggtt
1620gctgagggga agagtgttta gcttacggac ctattgccat tgttattccg attaatctat
1680tgttcagcag ctcttctcta ccctgtcatt ctagtatttt tttttttttt ttttggtttt
1740actttttttt cttcttgcct ttttttcttg ttactttttt tctagttttt tttccttcca
1800ctaagctttt tccttgattt atccttgggt tcttctttct actcctttag attttttttt
1860tatatattaa tttttaagtt tatgtatttt ggtagattca attctctttc cctttccttt
1920tccttcgctc cccttcctta tca
1943
19 2001 DNA Saccharomyces cerevisiae
19 tgacaacgag taccaggaaa tcagtgcttc
tgctttgaag aaggctcgta agggctgtga 60tggtttgaag aaaaaggcag tcaagcaaaa
ggaacaggag ttgaagaaac aacaaaaaga 120ggcagaaaat gctgccaagc aattgtctgc
tttgaatatc accattaagg aggacgaatc 180gctaccagct gccattaaga ctagaattta
tgactcttat tccaaggtcg gacaaagagt 240taaggtttcc ggttggatcc atagattacg
ttctaacaag aaggttattt tcgtcgtcct 300cagagacgga tctggtttca ttcaatgtgt
cttgtccggt gatttggcat tggctcaaca 360aactttggac ctgactttgg aatccaccgt
tactctgtac ggtaccatag tcaaattgcc 420tgagggtaaa accgctccag gtggtgttga
attgaatgtc gactattacg aagttgtagg 480tttggccccc ggtggtgaag actcctttac
aaacaaaatc gcagagggct cagacccttc 540tttactgttg gaccaacgtc atttggcctt
gagaggagat gccttgtctg cagtcatgaa 600agtccgtgct gctctactga aaagcgttag
acgtgtttat gatgaagaac atttgacaga 660agttacccca ccatgtatgg tgcaaactca
agtcgaaggt ggttccactt tgttcaagat 720gaactattac ggcgaggaag cttacttgac
ccaaagttcc caattatatt tagaaacctg 780tttggcctcc ctaggtgatg tttataccat
ccaagaatct ttcagagctg aaaagtccca 840cacaagaaga catttgtccg aatataccca
tatcgaagct gaattggcct tcttgacttt 900cgacgatcta ttacaacata ttgaaacttt
gatcgtcaaa tccgtgcaat acgttttgga 960agacccaatt gctggcccac tcgtaaaaca
attgaatcca aactttaagg ctccaaaggc 1020tccattcatg agattacagt acaaggatgc
cattacctgg ttgaacgaac acgacatcaa 1080gaacgaagag ggcgaagact ttaaatttgg
tgacgatatt gcagaagctg ctgaaagaaa 1140gatgaccgat accatcggcg tcccaatctt
tttgacgaga ttcccagtag aaatcaagtc 1200tttctacatg aagcgttgtt ctgacgaccc
ccgcgtcact gaatccgtcg acgttttgat 1260gccaaacgtt ggtgaaatca ctggtggttc
tatgagaatc gacgacatgg acgaactaat 1320ggcagggttt aagcgtgagg gtattgatac
cgacgcctac tactggttca ttgaccaaag 1380aaaatacggt acttgcccac atggtggtta
cggtatcggt accgaacgta ttttagcctg 1440gttgtgtgac agattcactg tcagagactg
ttccttgtat ccacgtttca gcggtagatg 1500taagccatga tctttagtta ctgaagagta
cgtgagcgct cacatatata caaatattta 1560taccgattaa tatttacgtt cctccctctc
tctaattatt cattgattta ttcaagaatt 1620agcgttataa caataaatgg ttggcgcagg
caattaattt ttctttactc ttccaaaccc 1680tctgttaacg acaatcaaat aacctgatct
gccaaggctc catcatatct ggcctagaac 1740agtttttttt tttcgattat ttgttcgttc
ttgtggtggt tactcattgg cagaatcccg 1800aaaatcatga ttagtagatg aatgactcac
tttttggata agctggcgca aattgaaaca 1860tgtgaaaaaa aaaaaaaagg attataaaag
gtcagcgaag cacagaactc tgagataaga 1920ctacctttct ttagctaggg gagaatattc
gcaattgaag agctcaaaag caggtaacta 1980tataacaaga ctaaggcaaa c
2001
20 1999 DNA Saccharomyces cerevisiae
20 tcctaaggac atattccgtt cgtacttgag ttattggatc tatgaaatcg ctcgctatac
60accagtcatg attttgtccc tggtaatagg ggttttggtt ttattaatta tattttttaa
120tgacaacgaa gcttgtgttt tcaattctgc aatatttgct tttacttctc ttgtaggttt
180gttaataata ttaagtgatg gtaatccaaa gctagtcagt cgtcgaaatt ttaggaccga
240gcttttagtg gatgtcatca cacgtaaacc ggcggtagaa gggaaagaat ggaggatcat
300cacatacaac atgaaccaat atttgtttaa tcatgggcaa tggcatactc cgtattactt
360ttacagcgat gaggattgct accgttattt tctacgcctt gttgagggag taacccccaa
420gaagcaaaca gccacgtcaa ttggcaattc tccggtcacc gctaagcctg aagatgccat
480cgagtcagct tctcctagtt ccagactgaa ttatcaaaac tttttgctca aggcagcgga
540gatcgaacga caagctcagg aaaattactg gcgaaggcgg catcccaata tcgatgcgct
600tcttaaaaag acggaatagc ttagagacac taccatacgt aaagcgaaca taaactagag
660tatgatatat aatcagcact aactggccgg aaaacggccg aaggaagcct cgaaaagtcg
720attcgtgttg gacccatttg ctgaacaaag tggttcattg cctacctatt atggtagtag
780tcgtgataat cgtgtggttg gttttgtcaa cggtgcattt gcattttcat gacaataaac
840cttgcgtttt cgttctcggg atattacttt ccctccactt ctttcgcctc aatagctcct
900ataagcattc tcagggcgta tgtcggtgat cgagatttcc aagcaagctt ttagtggaaa
960tcatcgcgcg caagccagcg gtaaagggaa aagaacggag gacgattaca tacaagatga
1020acgaataaat aaattaataa taaataataa taaaaagtac agtagcatta aatattatta
1080agtttaatga ttaaaaattg gttaattgtc aagaaaatct aaggtattaa taaataaata
1140atactatgac aacttgcagc gaaagcatca gccccaatga aaattaatca gaattgaatc
1200tgagcgtatt tatttgataa cggtttacgt aactgttgga ataaaaatca actatcatct
1260actaactagt gtttacgtta ctagtatatt atcatatacg gtgttagaag atgacgcaaa
1320tgatgagaaa tagtcatcgt tttcaacgga agctgaaata caaggattga taatgtaata
1380ggatcaatga atatcaacat ataaaacgat gataataata tttatagaat tgtgtagaat
1440tgcagattcc cttttatgga ttcctaaatc ctcgagaaga acttctagta tatctacgta
1500cctaatatta ttgccttatt aaaaatggaa tcccaacaat tatctcaaaa ttcccccaat
1560tctcatcagt aacaccccac cccgtattac ttttaccgtg atgaagattg gcatcgttac
1620tttctaaacg taggacgtgc ggaatgacaa aaccatcagc agtgtcacga tctctccagt
1680cacaatggca atcatgagtg catagtccaa agtaaagggg caaggaaaag catgattgaa
1740aggactcccc atctggactc tatatgtcat cagcggctaa aaaaaagcat atagcacaac
1800atcagcatca gcatcagcac tagagtcatc ggcccggcgg tccgcggtca tccccgcgga
1860ctttccgtcc gcccggcggg ctgtatcagc gtcaactgga acgcgcatat atatacaaga
1920cacacataac atagaagcac acccacgaca ataaccacac gacaataacc acacccgccc
1980acccctcctt tccgtatac
1999
21 6334 DNA Sorghum bicolor
21 tggattcgga ctggaaaata actctaactt
gtatggatca ccacgacgtc atatggactc 60caactgggac gttcctatac ttgttggaaa
gctcatgaag tctactttcc aatgggtcca 120accacatatc tatgcggctt atgagtcggg
cgcagtcctt gttttcgtgc cgacaccttt 180ttctgttttg gtgctgcgtc actctatttt
ggaccaatgg cccatgtatc aagttgagtc 240cattagggac gcatcctagg gttggaggac
gactctagca cccctttggt cgtcctcccc 300tctatttatt tacatctaga gccgccatga
acaactggat tttgtttaga tcaagtttag 360ccttcgctac ttgcttgtag gcgcgcgtgc
aggatcagcc gcccgcctcc ttgtcttcgg 420aaccccattg ttgattaaga ttcagtttaa
aaccttcaat tcatcttgca aattcagtgc 480ttgtttcctc gttcttgcta gttcttcgat
tgcttgcagg acgggagccc taggggctgg 540ttgtcgcgct ccacaagatc gtgacggttg
ttggacgtgg tgtatcggtt gctaaggcgc 600ggtcttgagg gctgtagtcg ggccgtgaac
gtcatctcca tccactaatc gagttatcca 660gcgcctctca tcgaaagatc aggccaaaaa
ccctagcggg ctcgcatcag ttggtaatca 720gagcaaggtt cttcggtgag agacttctaa
tcctttgctg tttttaatta atttcctata 780gtccagaaaa gccaaaaaaa tatagtagat
tagtttttcc ataatcctat taaacctttg 840tgccttggct agtaccgttt tagttagggc
ttgttgaatt tgcgttgctt cggtttgtgt 900cgagttgctg gtcttagtgt ctagtccttt
agagtttcga gttcttgtca ccatctatac 960acagccgagt attaccatat cttctctctg
tcgaatctgt tgcgaagtct gaattgaact 1020gtggtcatgg tccggatcga gtagagttcc
aatctgagtt caaaaagaaa gataactcta 1080cttgttcggc cttactctag agagagagag
agagtgtgga gcgaaaaaag tgtgtggagc 1140gaattgctct tttgtattct tttgttcata
taatcagttt tggaggttgc ccacaaaaaa 1200agaaaaaaaa gaaaagagaa aagattcaaa
aaaaagaggc tgtttttcat attgatttta 1260ggtttgtccc accttgtttt cgggggtgtg
ctgtggtttt cctttgtgtc caggctcgcg 1320tctctagcac ggtctagcct aggaccagca
cagtaccatc gtcgaacgct tattcagctc 1380gcttttataa ctaacgtggt gctagttcgt
tccttgtttc agcccaccta tagctccaca 1440tactctacag cttgacaggt cttgtgctgc
agcaccgata cacttcgtcc attgctatac 1500acttgttggc agacgacccc tcctgtcaag
caagataaga attggtaaga acttgtgtta 1560caggttgagt gtgagcgact tgctatagct
acatcctagt agttgtaggg attttatttc 1620ttcacttgct ttttgttgtc tttgtctttg
aaccatgcca ggggcagatg atggtaacga 1680aacaccactt acacctcgca ctatgggcat
catacaatat tttgaaagga aagtgaagct 1740gcacacagag ggacttgata acgacttgca
ggtgacgaat gaaaagctgg ggcagttgga 1800ggctacgcag attgccacaa acaacaagct
cacaagtttg gaggaatccg ttgctagtgt 1860ggacaaaagc cttgctgctc tcctaaggcg
atttgatgct ttccacaccg aagataaaga 1920gaagcataaa gaagaaaagg agggagatcg
agagcacggt agtcatgaag atgactacac 1980tggtgatact gaacatgatg atcaagacac
tcgtgatcga cgtcgccttc gtcacaaccg 2040tagaggtatg ggtggcaacc gccgacgcga
ggtacacaat aatgatgatg ctttcagtaa 2100gattaaattt aagatacccc tttttgatgg
taaatatgac cctgatgctt acatcacttg 2160ggagattgct gttgatcaaa agtttgcatg
tcatgaattt cctgagacta cacgtgttag 2220ggctgctact agtgagttta cagattttgc
ttctgtttgg tggatagaat atggaaagaa 2280aaatcataat aacttaccta gaacttggga
tgcgctgaaa agggccatga gagctagatt 2340tgttccatct tactatgcgc gtgatatgat
aaataagttg cagcagttaa gacaaggtgc 2400taaaagtgta gaagaatatt atcaggaatt
acaaacgggt atgttgcgtt gtaacctaga 2460ggaggatgag gaaccggcta tggctagatt
tttgggtggg ttaaatcggg aaattcagga 2520catcctcgct tacaaagaat acaataatgt
aacccgtttg tttcatcttg cttgtaaagc 2580tgaaagggaa gtgcagagac gacgtgctag
cacaaggagt aatatttctg cagggaaggc 2640taattcatgg cagcaacgcg tggcttcaac
tccatctaca cgtatttcta ctccatcatc 2700tagtgacaag actcgagctg cccccaccaa
ttcagttgcg aagacgatgc aaaagcctgc 2760tgcgagtact tcatccgtgg catcgacggg
tagaacaagc aacatacaat gtcaccggtg 2820caagggatat gggcacatga tgcgtgactg
tccaaacaag cgagttatga ttgtcaggga 2880tgatggtgag tactcatctg ctagtgattt
tgatgaggat acacttgcac tgcttgcgac 2940tgaccatgca ggtaatgaag atcaaataga
agaatatatt aatgcaggtg aagcggacca 3000ctatgagagc ttgatcgtgc agcgagtgct
tagtgcacaa atggagatgg cggaacaaaa 3060tcagcgacac attttattcc aaacaaagtg
tgtcatcaaa gagcgttctt gtcgcatgat 3120cattgatgga ggtagctgca acaacttggc
aagcagcgat atggtgcaga agcttgccct 3180caacaccaaa ccacacccgc atccctacta
catccaatgg ctgaacaaca gtggtaaggc 3240aaaggtaact agacttgtga gaattaattt
ttccatcgga tcctacaaag atattgttga 3300atgtgatgtt gtgcctatgc aagcttgtaa
cattctgcta ggtagacctt ggcaatttga 3360tagagattct atgcatcatg gtagatcaaa
tcagtattct tttctatacc atgatcgcaa 3420aattgtgttg catcctatat cccctgaaac
tattatgcaa actgatgttg ctagggctac 3480taaagcaaag agcaagagca ataaaaatga
taaatctgta attggtaaca aagatgagat 3540aaaactgaaa ggacattgta tgatagctac
caaatcagat attaatgagt tcaatgcatc 3600cacttctgtt gcttatgctt tgatatgcaa
ggatgctttg atttcagttg aggatatgca 3660atgttctttg ccccctgctg ttgctaacgt
tttgcaggag tattctgatg tgtttccaag 3720tgatgtacca gcggggctgc ctccactacg
cgggattgag caccaaattg atcttattcc 3780tggatcagtt ttgccaaatc gtgcaccata
caggacaaac ccggaggaaa caaaggaaat 3840tcagcgacaa gtgcaagaac tactagacaa
aggttatgtc cgagaatctc ttagtccttg 3900tgctgttcca gtaattttag tgcctaagaa
agatggaaca tggcgtatgt gtgttgattg 3960tagagctatt aataatatca ccattcgata
ttgacaccct attccacgat tagatgatat 4020gctagatgaa ctgagtggtg ctgttgtgtt
ttcaaaagtt gatttacgta gtgggtacca 4080ccagattcgt atgaaattgg gagatgaatg
gaaaactgct ttcaaaacta agttcggttt 4140gtatgagtgg ttagtcatgc cttttgggtt
aactaatgca cctagtactt tcatgagatt 4200aatgaacgag gtcttgcgtg ctttcattgg
gaaatttgtt gtcgtatatt ttgatgacat 4260attgatttac agcaaatcat tggatgaaca
tcttgatcat ttacgtgctg tttttaatgc 4320actacgcgag gcacgtttat ttggtaacct
tgagaagtgc accttttgca ccgatcgagt 4380gtcttttctt ggttatgttg tgactccaca
gggaattgag gttgatcaag ccaaggtgga 4440agctatacag ggatggcctg tcccaaatac
tatcacccag gtgcggagtt tcctaggact 4500tgctagattc tatcgccgtt ttgtgaagga
tttcagcacc attgctgcac cattgaatga 4560gcttacaaag aagggggtgc cttttgattg
gggcaaagca caagagaatt cattcaacat 4620gttgaaagat aagttaactc atgcacctct
cctacaactt cctgatttta ataagacttt 4680tgagcttgaa tgtgatgcta gtggaattgg
tttgggaggt gttttattac aagagggaaa 4740acctattgca tattttagtg agaaattgag
tgggcctgtt ctcaaattca acttatgata 4800aagaactcta tgctcttgtt agaacattag
agacatggca gcattatttg tggcccaaag 4860agtttattat ccattttgat catgaatctt
tgaaacatat tcgtagtcaa ggaaaactga 4920atcgtaggca tgcaaagttg gttgaattta
ttgaatcttt tccttatatt attaagcaca 4980agaaagggaa ggaaaatatt attgctgatg
ctttatcacg gagatatact ttgctgaatc 5040aacttgatta caagatattt gggttagaaa
caattaaaga ccaatatgtt catgatgctg 5100attttagaga cgtgttgctg cattgtaaag
atggaaaagg gtggaataaa ttcatcgtta 5160gtgatgggtt tgtgtttaga gctaacaagc
tatgcattcc agctagctct gttcgtttgt 5220tgttgttgca ggaagcgcat ggaggtggct
tgatgggaca ttttggagca aagaagaccg 5280aggacatact tgctggtcat ttcttttggc
ccaggatgaa gagagatgtg gagaggtttg 5340ttgctcgttg cacaacatgt caaaaggcaa
agtcacggtt aaatccccac ggtttgtatt 5400tacctcttcc tgttcctaat gctccttggg
aggatatatc tatggatttt gtgttgggac 5460taccaaggac taggagggga cgtgatagtg
tgtttgtggt tgttgataga ttttctaaga 5520tggcacattt cataccatgt cataaaactg
atgatgctac aaatattgct gatttgtttt 5580ttcgagaaat tgttcgctta catggtgtgc
ccaacacaat tgtttctgat cgtgatgcta 5640aatttcttag tcatttttgg aagactttgt
ggttcaaatt ggggactaag cttttatttt 5700ccaccacctg tcatccccaa actgatggtc
aaactgaagt tgttaataga actttatcca 5760ctatgttaag ggctgtttta aagaagaata
ttaagatgtg ggaagaatgt ttgcctcatg 5820ttgagttcgc ctataatcgt tcattgcatt
ctactacaaa aatgtgtcct tttgagattg 5880tctatggctt cttgccacgt gctcctattg
atttaatgcc tttgccaagt tctgaaaaaa 5940taaattttga tgctaagcaa catgctgaat
tgatgttaaa attgcatgaa gccactaaac 6000aaaacataga gcgcatgaat gctaagtaca
aatgcactgg agataaaggt agaaagcaat 6060tgattctgga acctggggat ttggtttggt
tgcatttgcg aaaagataga tttccagaac 6120tgataaaatc caaattgatg cctagagctg
atggtccttt taaagtgctg caacgaatta 6180atgagaatgc atataagctt gatcttcctg
cagattttgg ggttagtccc acatttaaca 6240ttgcagattt gaagccttat ttgggtgagg
aagatgagct tgagtcgagg acgactcaaa 6300tgcaagaaag ggaggatgat gaggacatca
acac 6334
22 132 DNA Sorghum bicolor
22 aaactgagct tccacttgag cccctttacc aggagtatca tcgggtgcat ccaaaatggt
60ttcttagcct atgatgcatt aggcgcaaac tgtgtaccta tcttgcacca aaactaactc
120tgtctccaaa cc
132
23 133 DNA Sorghum bicolor
23 aaactgagct tccacttgag cccctttacc caggagtatc
atcgggtgca tccaaaatgg 60tttcttagcc aatgatgcat taggcacaaa ttgtgtacct
atcttgtacc aaaactaact 120ctgtctccaa aca
133
24 133 DNA Sorghum bicolor
24 aaactgagct tccacttgag
cccctttaca caggagtatc atcgggtgca tccaaaatgg 60tttcttagcc aatgatgcat
taggcgcaaa ttgtgtacct atcttgtacc aaaactaact 120ctgtctccaa aca
133
25 133 DNA Sorghum bicolor
25 caaactgagc ttccacttga gcccctttac ctaggagtat gatcaggtgc atccaaaatg
60gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgtac caaaactaac
120tctgtctcca aac
133
26 133 DNA Sorghum bicolor
26 caaactgagc ttccacttga gcccctttac ccaggagtat
cttcaggtgc atccaaaatt 60gtttcttagc ctatgatgca ttaggcgcaa actgtgtatc
tatcttgcac caaaactaac 120tctgtctcca aac
133
27 133 DNA Sorghum bicolor
27 caaactgagc ttccacttga
gcacctttac ccacgagtat catcgggctc atccaaaatg 60gtttcttagc ccatgatgca
ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
28 133 DNA Sorghum bicolor
28 caaactgagc ttccacttga gcccctgtac ccaggagtat catcgggtgc atccaaaatg
60gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac taaaactacc
120tttgtctcca aac
133
29 133 DNA Sorghum bicolor
29 gaaccgacct tccacttgag cctctccacc taggattatc
atcgggtgct tgcataatgg 60tttctgagcc tatggtgcat tatgcgcaaa ccatgcacca
atcctgcacc taaactaaca 120ctgtctccaa aca
133
30 133 DNA Sorghum bicolor
30 caaactgagc ttccacatga
gcccctttac ccaggagtat attcgggtgc atccaaaatg 60gtttcttagc ctatgatgca
ttaggcgcaa agtatgtacc gatcttgcac caaaactaac 120tctgtctcca aac
133
31 133 DNA Sorghum bicolor
31 caaactgaga ttccacttga gcccctttac cggggagtat catcgtgtgc atccaaaatg
60gtttcttagc ctatgatgca ttaggcgcaa actatgtacc tatcttacac caaaactaac
120tctgtctcca aac
133
32 133 DNA Sorghum bicolor
32 caaactgagc ttccacttga gcccctttac ctaggagtat
catcgggtgc atccaaaatg 60gtttcttagc ctgtgatgca ttaggagaaa actgtgtacc
tatcttgcac caaaactaac 120tctgtctcca aac
133
33 133 DNA Sorghum bicolor
33 aaactgagct tccacttgag
cccctttacc caggagtgtt atcgagtgca tccaaaatag 60tttcttagcc tatgatgcat
taggcgcaaa ctgtgtacct atcttgcacc aaaactatct 120ctgtctccaa aca
133
34 133 DNA Sorghum bicolor
34 aaactgagct tccacttgag cccctttact caggagtatc atcgggttca tccaaaatgg
60tttcttagcc aatgatgcat ttggcgcaaa ctgtgtacct atctcgcacc aaaactaact
120ttgtctccaa aca
133
35 133 DNA Sorghum bicolor
35 caaactgagc ttccacttga gcccctttac ccaggagtat
catcgagtgc atccaaaata 60gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc
tatcttgcac caaaactgcc 120tctgtctcca aac
133
36 133 DNA Sorghum bicolor
36 caaactgagc ttccacttga
gcccctttac ccaggagtat catcaggttc atccaaaatt 60gtttcttagc ctatgatgca
ttaggcgtaa actgtgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
37 133 DNA Sorghum bicolor
37 caaactgagc ttccacttga gcccctttgc ccaggagtat catcaggttc atccaaaatt
60gtttcttagc ctttgatgca ttaggcgtaa actgtgtacc tatcttgcac caaaactaac
120tctgtctcca aac
133
38 133 DNA Sorghum bicolor
38 aaactgagct tccacttgag cccctttacc caggagtatc
gtcgggtgca tccaaaatgg 60tttcttaccc tatgatgcat taggcgcaaa atgtgtacct
atcttgaacc aaaactaact 120ctgtctccaa aca
133
39 133 DNA Sorghum bicolor
39 caaactgagc ttccacttga
gcccctttac ccaggagtat gatcgggtgc atctaaaatg 60gtttcttagc ctatgatgca
ttaggcgaaa actgtgtacc tattttgcac caaaactaac 120tctgtctcca aac
133
40 133 DNA Sorghum bicolor
40 caaactgagc ttccacttga gcccctttac ccaggagtat catcgggtgc atccaaaatg
60gtttcttagg ctatgatgca ttaggcgcaa actgtgcacc tatcctgtac ctaaactaac
120actgtctcca aac
133
41 133 DNA Sorghum bicolor
41 caaactgagc ttccacttga gccccattac ctaggagtat
gttcgggtgg attcaaaatg 60gtttcttagc ctatgatgca ttatgcgcaa actgtgtacc
tatcttgcac caaaactaac 120tctgtctcca aac
133
42 133 DNA Sorghum bicolor
42 aaactgagct tccacttgag
cccctatacc tagtagtatc atcgggtgca tccaaaatga 60tttcttatcc tatgatgcat
taggcgcaaa ctgtgtacct atcttgcacc aaaactaact 120ctgtctccaa aca
133
43 133 DNA Sorghum bicolor
43 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaatg
60gtttcttagc ctatgatgca ttaggcacag actatgtacc tatcttgcac caaaactaac
120tccgtctcca aac
133
44 133 DNA Sorghum bicolor
44 caaaccgacc ttccacttca gcctctttac ctaggattat
catcgggtgc ttccataatg 60gtttttgagc ctatggtgca tattgcgcaa accatgcacc
aatcttgcat ctaaactaac 120actgtctcca aac
133
45 133 DNA Sorghum bicolor
45 aaactgagct tccaattgag
cccctttacc caggagtatc atcgggtgca aacaaaatgg 60tttcttagcc tatgatgcat
taggcgcaaa ctgtgtacct atcttgcacc aaaaataact 120ctgtctccaa aca
133
46 133 DNA Sorghum bicolor
46 caaactgagc ttccacttga gcccctttac ctaggagtgt catcgggtgc attcaaaatg
60gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac
120tttgtctcta aac
133
47 133 DNA Sorghum bicolor
47 caaactgagc ttccatttga gcccctttac ccacgagtat
aattgggtgc atccaaaatg 60gtttcttagc ctatgatgca ttaggcgcaa actgtatacc
tatattgcac caaaactacc 120tctgtctcca aac
133
48 133 DNA Sorghum bicolor
48 caaactgagc ttccacttga
gcccctatac cgaggagtat catcgggtgc attcaaaatg 60gtttcttagc ctatgatgca
ttaggcgcaa actgtgtacc tatcttgcac tgaaactaac 120tctgtctcca aac
133
49 133 DNA Sorghum bicolor
49 caaactgagc tttcacttga gcccctttac ctaggagtat gaacgggtgc atccaaaatg
60gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac
120tctgtctcta aac
133
50 133 DNA Sorghum bicolor
50 aaactgagct tccacttgag cccctttaca caggagtatc
atcgggtgca tccaaaatgg 60tttcttagcc tatgatacat aaggcgcaaa ctgtgtatgt
atcttgcacc aaatctaact 120ctatctccaa aca
133
51 133 DNA Sorghum bicolor
51 caaactgagc ttccacttga
gcccttttac ccaggagtat catcgagtgc atccaaaatg 60gtttcttagc ctatgatgca
ttaggcgcaa actgtgtacc tatcttgcac caaaactacc 120tccgtctcca aac
133
52 133 DNA Sorghum bicolor
52 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaatg
60gtttcttcgc ctatgatgca ttaggcgcaa actatgtacc tatcttgcac caaaactaac
120tttgtctcca aac
133
53 133 DNA Sorghum bicolor
53 aaactgagct tccacttgag cccctttacc caggtgtatc
atcgggtgca tccaaactgg 60tttcttagcc tatgacgcat taggcgcaaa ctgtgtacct
atcttgcacc aaaactacct 120ctgtctccaa aca
133
54 133 DNA Sorghum bicolor
54 caaactgagc ttccacttga
gcccctttac ccaggagtat attcgggtgc atccaaaatg 60gtttcttagc ctatgatgcc
ttaggcgcaa agtgtgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
55 133 DNA Sorghum bicolor
55 caaactgagc ctccacttga gcccctttac ccaggagtat catcaggtgc atccaaaatg
60gtttcttagc atatgatgca ttaggcgtaa actgtgtacc tatcttgcac caaaactaac
120tctgtctcca aac
133
56 133 DNA Sorghum bicolor
56 caaactgagc ttccacttga gcccctttac ccaggagaat
caacagatgc atccaaaata 60gtttcttagc ctttgatgca ttaggtgcaa actgtgtagc
tatcttgcac caatactaac 120tctgtctcca aac
133
57 133 DNA Sorghum bicolor
57 gaactgacct tccacttgag
cctctttacc taggattatc atcgggtgct tccataatgg 60tttctgagcc tatggtgcat
tatgcgcaaa ccatgcacca atcttgcacc taaactaaca 120ctgtctccaa aca
133
58 133 DNA Sorghum bicolor
58 aaactgagct tccacttgag cccttttacc caggagtatc atcgagtgca tccaaaatga
60tttcttaccc tatgatgcat taggcgcaaa ctgtgaacct atcttgcacc aaaactacct
120ctgtctccaa aca
133
59 133 DNA Sorghum bicolor
59 caaactgagt ttccacttga gcccctttac ccaggagtat
catcgggtgc atccaaaatg 60gtttcttagc ctatgatgca ttcggcgcaa attgtgtacc
taacttgcac caaaactaac 120tctgtctcca aac
133
60 132 DNA Sorghum bicolor
60 aaactgagct tccacttgag
cccctttacc taggattatc atcgggtgca tccaaaatgg 60tttcttagcc tattatgcat
taggcgtaaa ctgtgtacca atcttgcacc aaaactaact 120ctctctccaa ac
132
61 133 DNA Sorghum bicolor
61 aaactgagct tccatttgag cccctttacc caggattatc atcgggtgcg tccaaaatgg
60tttctgagcc tatgatgcat taggtggaaa ctgtgtacct attttgcacc aaaactaact
120ctgtctccaa aca
133
62 133 DNA Sorghum bicolor
62 accaaactgt gcttccactt aagcctcttc acctaggatt
accatcaagt gcatccaaaa 60tggtttctta gactatggtg aattaggcaa aaactgtgca
cctatcttgc accaaaacta 120acactatgtc caa
133
63 133 DNA Sorghum bicolor
63 caaactgagc ttctgcttga
gcccctttac ctaggagtat catcgggtgc atccaaaatg 60gtttctcagc ctttgatgca
ttaggcgtaa actgtgtacc tatgttgcac caaaactaac 120tctatctcca aac
133
64 133 DNA Sorghum bicolor
64 gatcaaacca agcttccact tgagcccctt ttcctaggag taccattagg tgtgtccaaa
60aaggttctta gcctatggtg cattaggcgc aaaccattca cctatcttgc acagaaacta
120atactgtctc aaa
133
65 133 DNA Sorghum bicolor
65 cgaaccgacc ttccacatga gactcttcac ctaggattat
catcgggtgc ttccataatg 60gtttctgtgc ctatggtgca ttatgcgcaa accatgcacc
aatcttgcac ctaaactaac 120actctctcca aac
133
66 133 DNA Sorghum bicolor
66 caaactgagc tttcccttga
gccccttgag ccaggagtat catcgggtgc atccaaaatg 60gtttcttagc tgtatgaagc
attaggcgca aactgtgtac ttatcttgca ccaaaactaa 120ctctgtctcc aaa
133
67 133 DNA Sorghum bicolor
67 caaactgagc ttccacttga gcccctttac cttggagtat caacgggtgc atccaaaatg
60ttttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac
120tttgtctcca aac
133
68 133 DNA Sorghum bicolor
68 aaactgagct tccacttgag cccctttacc caggagtatc
atcgggtgca tccaaaatgg 60attcttagcc tatgatgcat taggcgtaaa ctgtgtacct
ttcttgtacc aaaactaact 120ctgtctccaa aca
133
69 133 DNA Sorghum bicolor
69 caaactgagc ttccacttga
gcccctttac ctaggagtat catcggctcc atccaaaatg 60gtttcttagc ctatgatgca
ttaggcgcaa actgtgtacc tatcttgcac caaaactaac 120tctgtctcct aac
133
70 133 DNA Sorghum bicolor
70 caaactgagc ttccgcttga gcccctttac ctaggagtat catcgggtgc atccaaaatg
60gtttctcagc ctatgatgcc ttaggagcaa actgtgtacc tatcttgcac caaaactaag
120tctgtctcca aac
133
71 133 DNA Sorghum bicolor
71 aaactgagct tccacttgag cccctttgcc caggagtatc
atcaggttca tccaaaatgg 60tttcttagcc tttgatgcat taggcgtagc ctgtgtacct
atcttgcacc ataactaact 120ctgtctccaa aca
133
72 133 DNA Sorghum bicolor
72 accaaactgt gcttccactt
gagcctcttc atctaggatt accatcaagt gcatccaaaa 60tggtttctta gactacggtg
aattaggcta aaattgtgca cctatcttgc accaaaacta 120acactatgtc caa
133
73 133 DNA Sorghum bicolor
73 caaactgagc ttccacttga gccccgttac ctaggagtat cttcgggtgc atccaaaatg
60gtttcttagc ctatgatgca ttaggcacaa actatgtacc tatcttacac taaaactaac
120tctgtctcca aac
133
74 133 DNA Sorghum bicolor
74 aaactgagct tccacctgag cccctttacc caggagtatc
atcgggtgca tccaaaatgg 60tttcttagcc tatgatgctt taggcgcaaa ctgtgtacct
atctagcacc aaaactagct 120ctgtctccaa aca
133
75 133 DNA Sorghum bicolor
75 cgaaccgacc tttcaattga
gcctcttcac ctaggattat catcgggtgt ttccataatg 60gtttctgagc ctatggtgca
ttatgcgcaa accatgcacc aatcttgcac ctaaactaac 120actgtctcca aac
133
76 133 DNA Sorghum bicolor
76 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaatg
60gtttcttagc ctatgatgca ttaggagaaa actgtgtccc tatcttgcac caaaactaac
120tctgtctcca aac
133
77 133 DNA Sorghum bicolor
77 aaactgagct tccacttgag cccctttacc caggagtatc
attgggtgca tccaaactgg 60tttcttagcc tatgatgcat ttggcgcaaa ctgtgtacct
atcttgcacc aaaactgact 120ctgtctccaa aca
133
78 133 DNA Sorghum bicolor
78 aaactgagat tccacttgag
cccctttacc caacagtata atcgggtgca tccaaaatgg 60tttcttagcc tatgatgcat
taggcgcaaa ctgtgtaccg atcttgcacc aaaactaact 120ctgtctccaa aca
133
79 133 DNA Sorghum bicolor
79 caaactgagc ttccacttgg gcccctttac ccaggagtat caacagatgc atccaaaata
60gtttcttagc ctttgatgca ttaggtgcaa actgtgtagc tatcttgcac caatactaac
120tctgtctcca aac
133
80 133 DNA Sorghum bicolor
80 caaactgagc ttccacttga gcccctttac ctaggagtat
catcgggtgc atacaaaatg 60gttccttagc ctatgatgca ttaggcgcaa actgtgtact
tatcttgcac caaaactaac 120tctgtctcca aac
133
81 133 DNA Sorghum bicolor
81 aaactgagct tccacttgag
cccctttacc taggagtatc atcgggtgca tccaaaatgg 60tttcttagcc tatgatgcat
ttggcacaaa ctgtgtacct atcctgcacc aaaactaact 120ctgtctccaa aca
133
82 133 DNA Sorghum bicolor
82 aaactgagct tccagttgag cccctttacc gaggagtatc atcaggtgca tccaaaatgg
60tttcttagcc tatgatgcat taggcgcaac ctgtgtacct atcttgcacc aaaactacct
120ctgtatccaa aca
133
83 133 DNA Sorghum bicolor
83 caaactgagc ttccacttga gcccctttac ccaggagtat
caacagattc atccaaaata 60gtttcttagc ctttgatgca ttaggtgcaa actgtgtagc
tatcttgcac caatactaac 120tctgtctcca aac
133
84 133 DNA Sorghum bicolor
84 aaacctacct tccacttgag
cctctccacc taggagtatt atcgggtgct tccataatgg 60tttccgagcc tatggtgcat
tatgcgcaaa ccatgcacca atcttgcacc taaactaaca 120ctgtctccaa aca
133
85 133 DNA Sorghum bicolor
85 caaactgacc ttccacttga cactcttcac ctaggagtat tatccggtgc ttccataatg
60gtttctgagc ctatggtgca ttatgcgcaa accatgcacc aatcttgcac ctaaactaac
120actatctcca aac
133
86 133 DNA Sorghum bicolor
86 caaactgagc ttccacttga gcccctttac cctggagtat
cttcaggtgc atccaaaatt 60gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc
tatcttgcac caaaactaac 120tctgtctcca aac
133
87 133 DNA Sorghum bicolor
87 caaactgagc ttccacttga
gcccctttac ccaggagtat catcaggtgc atcagaaatg 60gtttcttagc ctatgatgca
ttaggtgcaa actgtgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
88 133 DNA Sorghum bicolor
88 caaactgagc ttccacttga gcccctttac ccaagagtat catcgggtgc atccaaaatg
60gtttcttagc ctatgacgca ttaggcacaa actgtgtacc tatgttgcac caaaactaac
120tctgtctcca aac
133
89 133 DNA Sorghum bicolor
89 caaactgagc ttcctcttga gcccctttac ctaggagtat
catcggttgc atccaaaatg 60gtttcttagc ctatgatgca ttaggcgtaa actgtgtacc
tatcttgcac caaaacttac 120tctgtctcca aac
133
90 133 DNA Sorghum bicolor
90 caaactgagc ttccacttga
gcccctttac ccaggagtat catcgggtgc atacaaaatg 60gattcttagc ctatgacgca
ttaggcgcaa actatgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
91 133 DNA Sorghum bicolor
91 aaactgagtt tccacatgag cacctttacc ctggagtatc atcaggtgca tccaaaatgg
60tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atattgcacc aaaactaact
120ctttctccaa aca
133
92 133 DNA Sorghum bicolor
92 caaactgagc ttccacttga gcccctttac ctaggagtat
catcgggcgc atccaaaatg 60gtttcttagc ctatgatgca ttaggcacaa actatgtacc
tatcttgcac caaaactaac 120tctgtctcca aac
133
93 133 DNA Sorghum bicolor
93 aaactgagct tccagttgag
cccctttacc cagcagtatc atcgggtgga tccaaaatgg 60tttcttcacc tatgatgcat
taggcgcaaa ctgtgtacct atcttgcacc aaaactaact 120ctatctcgaa aca
133
94 133 DNA Sorghum bicolor
94 aactgagctt ccacttgagc ccctttagcc aggagtatca tcgggtgcat ccaaaatggt
60ttcttagcct atgaaatcat taggcgcaaa ctgtgtacct atcttgcacc aaaactaact
120ctgtctctaa aca
133
95 133 DNA Sorghum bicolor
95 caaactgagc ttccacttga gcccctttac ctaggagtat
catcgggagc atccaaaatg 60gtttcttagc ctatgatgca taaggagcaa actgtgtacc
tatcttgcac caaaactaac 120tctgtctcca aac
133
96 133 DNA Sorghum bicolor
96 caaactgagc ttccacttaa
gcccctttac ctaggagtat catcgggtgc atccaaaatg 60gtttcttagc ctatgatgca
ttacgcgcaa actgtgtacc tatcttgcac caaaactaac 120tttgtctcca aac
133
97 133 DNA Sorghum bicolor
97 caaactgagc ttccacttga gcccctttac ctaggagtat aatcgggtgc atccaaaatg
60gtttcttagc ctatgatgca ttagacgtaa actatgtacc tatcttgcac caaaactaac
120tctgtctccg aac
133
98 133 DNA Sorghum bicolor
98 aaactgagct tccacttgag cccctttacc taggagtatc
atcgggtgta tccaaaattg 60tttcttagcc tacgatgcat taggcgcaaa ctgtgtacct
atcttgcacc aaaactaact 120ctgtctccaa aca
133
99 133 DNA Sorghum bicolor
99 caaactgagc ttccacttga
gcccctttac ctaggagtat cattgggtgc atccaaaatg 60ctttcttagc ctatgatgca
ttaggtgcaa actgtgtagc tatcttgcac caaaactatc 120tctatctcca aac
133
100 133 DNA Sorghum bicolor
100 aaactgagct tccacttgag cccgtttacc gaggagtatc atcgagtgca tctaaaatga
60tttcttagcc tatgatgcat taggcacaaa ctgtgtacct atctagcacc aaaactaact
120ttctctccaa aca
133
101 131 DNA Sorghum bicolor
101 accgaccttc cacttgagac tcttcaccta
ggattatcat cgggtgcttc cataatggtt 60tctgagccta tggtgcatta tgcacaaacc
atgcaccaat attgcaccga aactaacact 120gtctccaaac a
131
102 133 DNA Sorghum bicolor
102 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaatg
60gtttcttagc caatgatgca ttaggagaaa actgtgtacc aatcttgcac caaaactaac
120tctgtctcca aac
133
103 133 DNA Sorghum bicolor
103 caaactgagc ttccacttga gcccctttac
ctaggagtat catcgggtgc atccaaaaag 60gtctcttagc ctatgatgcc ttaggagaaa
actatgtacc tgtcttgcac cataactaac 120tctgtctcca aac
133
104 133 DNA Sorghum bicolor
104 aaactgtgct tgcacttgag cccctttacc caggagtatc atcgggtgca tccaaaatgg
60tttcttagcc tatgatgcat taggcgcata ctgtgtacct atcttgcagt aaaactaact
120ctgtctccaa aca
133
105 133 DNA Sorghum bicolor
105 aaactgagct tccacttgag gccctttatc
taggagtatc atcgggtgca tccaaaatgg 60tttcttagcc tatgatgcgt taggcgcaaa
ctatgtacct atcttgcacc aaaactaact 120ctgtctccaa aca
133
106 133 DNA Sorghum bicolor
106 aaactgagct tccacttgag cccctttacc taggagtatc ttcgggtgca tcagaaatgg
60tttcttagcc tatcatgcat taggcacaaa ctgtgcacct atcttacatc aaaattaact
120ctgtctccaa aca
133
107 133 DNA Sorghum bicolor
107 caaactgagc ttccacttga gcccctttac
ccaggagtat attcgggtgc atccaaaatg 60gtttcttagc ctatgatgcg ttaggcgcaa
actgtgtacc tatcttgcac cacaactaaa 120tctgtctcca aac
133
108 133 DNA Sorghum bicolor
108 caaactgagc ttccacttga gcccctttac ccaggagtat caacagatgc atccaaaata
60gtttcttagc ctttgatgca ttaggtgcaa actgtgtagc tatcttgcac caatactaac
120tctgtctgca aac
133
109 133 DNA Sorghum bicolor
109 caaactgagg ttccgcttga gcccctttac
ctaggagtat catcgggttc atccaaaatg 60gtttctcagc ctatgatgcc ttaggcgcaa
actgtgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
110 133 DNA Sorghum bicolor
110 aaactgagct tccacttgag cccctttacc caggagtatc atcgggtgca tccaaaatgg
60attcttcgcc tatgatgcat taggcgcaaa ctgtgtacct atcttgcacg aaaactaact
120ctatctccaa aca
133
111 133 DNA Sorghum bicolor
111 cgaaccgacc ttccacttga gccccttcac
ctaggattat catcgggtgc ttccataatg 60gtttttgagc ctatggtgca ttatgcacaa
accatgcacc aatcttgcac ctaaactaac 120actgtctcca aac
133
112 133 DNA Sorghum bicolor
112 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaatg
60gttaattagc ctatgatgca ttaggcgcta actgtgtacc tatcttgcac caaaactaac
120tctgtctcca aac
133
113 133 DNA Sorghum bicolor
113 caaactgagc ttccacttaa gcccctttac
ccaggagtat cttcaggtgc atccaaaatg 60gtttcttagc ctatgatgca ttaggcacaa
actatgtacc tatcttacac caaaactaac 120tctgtctcca aac
133
114 133 DNA Sorghum bicolor
114 aaccgagacc tccacttgag gcctcttcac ctaggagata ccatcggatg cgtctaagat
60ggtttcttat cctatggtgc attatgcgta acccgtgcac atatcttgct ccaaaactaa
120tgctgtctct aaa
133
115 133 DNA Sorghum bicolor
115 caaactgagc ttccacttga gcccctttac
ccagtagtat catcgggtgc atccaaaatg 60gtttcttagc ctatgatgca ttatgcgaaa
attgtgtacc tatattgcac caaaactaac 120tctgtctcca aac
133
116 133 DNA Sorghum bicolor
116 caaactgagc atccacttga gcccctttac ctaggagtat catcgggtgc atacaaaatg
60gtttcttaac ctatgatgca ttagacgcaa actgtgtacc tatattgcac caaaactaac
120tctgtctcca aac
133
117 133 DNA Sorghum bicolor
117 caaactgagc ttccacttga gcccctttac
cttggagtat catcgggtgc atccaaaatg 60gtttcttagc ctatgatgca ttaggcacaa
actgtgtacc tatcttgaac caaaactaac 120tctgtctcca aac
133
118 133 DNA Sorghum bicolor
118 aaactgagct tccacttgag cccctttacc caggagtatc ttcaggtgca tccaaaattg
60tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atctttcacc aaaactaaca
120ctgtctccaa aca
133
119 133 DNA Sorghum bicolor
119 caaactgagc ttccacttga gcccctttac
ctagaagtat catcgggtgc atccaaaagg 60gtttcttagc ctatgatgta ctaggcgtaa
actgtgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
120 133 DNA Sorghum bicolor
120 caaactgagc taccacttga gcccctttac ctaggagtat catcaggttc atccaaaatt
60gtttcttagc ctatgatgcg ttaggcgtaa actgtttacc tatcttgcac caaaactaac
120tctgtctcca aac
133
121 133 DNA Sorghum bicolor
121 caaactgagc ttccatttga gcccctttgc
ctaggagtat catcgggtgc atccaaaatg 60gttccttagc ctatgatgca ttaggtgcaa
actgtgtacc tatcttgcac caaaactaac 120tttgtctcca aac
133
122 133 DNA Sorghum bicolor
122 caaactgagc ttccacctga gccactttaa ccaggagtat catcgggtgc atccaaaatg
60ttttcttagc ctatgatgct ttaggcgcaa attgtgtacc tatcttgcac caaaactaac
120tctgcctcca aac
133
123 133 DNA Sorghum bicolor
123 caaactgagc ttccacttga gcatctttac
ccaggagtat catcaggtgc atccaaaata 60gtttcttagc ctatgatgca ttaggcacaa
actgtgtacc tatcttgcaa caaaactaac 120tctgtctcca tac
133
124 133 DNA Sorghum bicolor
124 caaactgagc ttccacttga gcccctttac ctaggggtaa catcgggtgc atccaaaatg
60gtttcttagc ctatgatgca ttaggcacaa actgtgtacc tatcttgcac caaaactaac
120tctgtctcca aac
133
125 133 DNA Sorghum bicolor
125 caaactgagc ttccacctga gcccctttac
ctaggagtat catcgtgtgc atctaaaatg 60gtttcttagc ctatgatgca ttaggcgcaa
actgtgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
126 133 DNA Sorghum bicolor
126 caaactgagc ttccacttga gcccctttac ctaggagtat catcgtgtgc atcaaaaatg
60gtttcttagc ctatgaagca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac
120tctgtctcca aac
133
127 133 DNA Sorghum bicolor
127 caaactgagc ttccacatga gcccctttac
ctaggagtat catcgggtgc atccaaaatg 60gtttcttagc ctatgatgca ttaggcacaa
actgtgtacc tatcttgcac caaaactaac 120tctgtatcca aac
133
128 133 DNA Sorghum bicolor
128 caaactgagc ttccacttga gcccctttac ccaggagtat catcgagtgc atctaaaaag
60gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaagctacc
120tctgtctcca aac
133
129 133 DNA Sorghum bicolor
129 caaactgagc ttccacttga gcccctttac
ccaggagtat caacagatgc atccaaaata 60gtttcttagc ctttcatgca ttaggtgcaa
actgtgtagc tatcttgcac caatactaac 120tctgtctcca aac
133
130 133 DNA Sorghum bicolor
130 aaactgagct tccacttgag cccctttacc ctggaatatc atcgggtgca tcccaaatgg
60tttcttagcc tatgatgcat taggcgcaaa gtgtgtacct atcttgcacc aaaactaact
120ttgtctccaa aca
133
131 133 DNA Sorghum bicolor
131 aaactgagct tcaacttgag cccctttacc
taggagtatc atcgggtgca tccaaaatgg 60tttcttagcc tatgatgcat taggcgcaaa
ctgtgtacct gtcttgcacc aaaactaacc 120ctgtctccaa aca
133
132 133 DNA Sorghum bicolor
132 caaactgagc ttccacttga gcccctttac ccaggagtat caacagatgc ttccaaaata
60gtttcttagc ctttgatgca ttaggtgcaa actgtgtagc tatcttgcac caatactaac
120tctgtctcca aac
133
133 133 DNA Sorghum bicolor
133 caaactgagc ttccacttga gcccctttac
ccaggagtat caacagatgc atccaaaata 60gtttcttagt ctttgatgca ttaggtgcaa
actgtgtagc tatcttgccc caatactaac 120tctgtctcca aac
133
134 133 DNA Sorghum bicolor
134 aaactgagct tccacgtgag cccctttacc caggagtata atcgggtgca tccaaaatgg
60tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atcttgcacc aaagctatct
120ctgtctccaa aca
133
135 133 DNA Sorghum bicolor
135 caaactgagc ttccacttga gcccctttac
ctaggagtat catcgggtgc attcaaaatg 60gtttcatagc ctatgatgca ttaggcgcaa
actatgtacc tatcttgcac caaaactacc 120tccgtctcca aac
133
136 133 DNA Sorghum bicolor
136 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgt atccaaaatg
60gtttcttagc ctatgatgca ttaggcatag actgtgtacc tatattgcac caaaactaac
120tccgtctcca aac
133
137 133 DNA Sorghum bicolor
137 caaactgagc ttccacttga gcccctttac
ccaggagtat catcgggtgc atccaaaatt 60gtttcttagc ttatgatgca ttaggtgtaa
actgtgtacc tatcttgcat caaaactcac 120tctgtctcca aac
133
138 133 DNA Sorghum bicolor
138 caaactgagc ttccacttga gcccctttac ccaggagtat cttcaggtgc atccaaaatt
60gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tttcttgcac caaaactaac
120tctgtctcca aac
133
139 133 DNA Sorghum bicolor
139 aaactgagct tccacttgag cccctttacc
caagagtacc atcgggtgca tccaaaatgg 60tttcttagcc aatgatgcat taggcgcaaa
ttgtgtacct atcttgtacc aaaactaact 120ttgtctccaa aca
133
140 133 DNA Sorghum bicolor
140 caaactgagc ttccacttga gcccctttac ctagcagtat aatcgggtgc atccaaaatg
60gtttcttagc ctatgatgca ttaggcacaa actatgtact tatcttgcac caaaactaac
120tctgtctcca aac
133
141 133 DNA Sorghum bicolor
141 aaactgagct tccacttgag cccctttacc
gaggagtatc atcgggtgca ttcaaaatgg 60tttcttagcc tatgatgcat taggcgcaaa
ctgtgtacct atcttgcaca aaaactagct 120ctgtctccaa aca
133
142 133 DNA Sorghum bicolor
142 caaactgagc ttccacttga gcccctttac ccaggagtat catcaggttc atccaaaatt
60gtttcttagc ctttgatgca ttaggcgtaa actgtgtacc tatcttgcac caaaactaac
120tctgtctcca aac
133
143 133 DNA Sorghum bicolor
143 aaactgagtt tccacatgag cacctttacc
caggagtatc atcaggtgca tccaaaatgg 60tttcttagcc tatgatgcat taggcgcaaa
ctgtgtacct atcttgcacc aaaactaact 120ctgtctccaa aca
133
144 133 DNA Sorghum bicolor
144 aaactgagct tccacttgag ccccttttct caggagtatc attgggttca tccaaaatgg
60tttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atcttgtacc aaaactaact
120ctgtctccaa aca
133
145 133 DNA Sorghum bicolor
145 caaactgagc ttccacttga gcccctttac
ccaggagtat catcgggtgc atccaaaatg 60gtttctttgc ctatgatgca ttaggcggaa
actgtgtacc tgttttgcac caaaactaac 120tctatctcca aac
133
146 133 DNA Sorghum bicolor
146 aaactgtgct tccacttgag cccctttacc taggagtatc atcagggtgc atccacaatg
60gtttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac
120tctgtctcca aac
133
147 133 DNA Sorghum bicolor
147 caaactgagc ttccacttgg gcccctttac
ccaggagtat cttcaggtgc atccaaaatt 60gtttcttagc ctatgatgca ttaggcgcaa
actgtgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
148 133 DNA Sorghum bicolor
148 aaactgagct tccgcttgtg cccctttacc caagagtatc gacgggtgca tccaaaatgg
60tttcttagcc tacgatgcat taggcgcaaa cagtgtagct atcttgcacc aaaactaact
120ttgtctccaa aca
133
149 133 DNA Sorghum bicolor
149 caaactgagc ttccacttga gcccctttac
ctaggagtat catcaggtgc atccaaaatg 60gtttcttagc ctatgatgca ttaggagaaa
actgtgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
150 133 DNA Sorghum bicolor
150 caaactgagc ttccacttga gcccctttac ctaggagtat catcgtgtgc atccaaaatg
60gtttcttagc ctatgatgca ttaggcacaa actgtgtacc tatcttgcac caaaagtaac
120tctgtctcca aac
133
151 133 DNA Sorghum bicolor
151 caaactgagc ttccacttga gcccctttac
caaggagtat catcgggtgc atccaaaatg 60gtttcttagc ctctggtgca ttaggcacaa
gctaggtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
152 133 DNA Sorghum bicolor
152 aaactgagct tccacttgag cccctttact caggagtatc atcgtgtgcc tccaaaatgg
60tttcttagcc tatgatgcat taggcgcata ctgtgtacct atcttgcacc aaaactacct
120ctatctccaa aca
133
153 133 DNA Sorghum bicolor
153 aaactgagct tccacttgag cccctttaca
cacgagtatc atcgggtgca tccaaaatgg 60tttcttagcc tatgatgcat taggcgcaaa
ctgtgtacct atcttgtacc aaaactaact 120ctggctctaa aca
133
154 133 DNA Sorghum bicolor
154 caaactgagc ttccacttga gtccctttac ccaggagtat catagggtgc atccaaaatg
60ttttcttagc ctatgatgca ttaggcgtaa actgtgtacc tatcttgcac caaaactaac
120tctgtctcca aac
133
155 133 DNA Sorghum bicolor
155 caaactgagc ttccacttga gcccctttac
ctaggagtat catcgggtgc atccaagatg 60gtttcttagc ctatgatgca ttagacgcaa
actgtgtacc tatcttgcac caaaactaac 120tttgtctcca aac
133
156 133 DNA Sorghum bicolor
156 aaactgagct tccacttgag cccctttaca caggagtatc atcgggtgca tccaaaatgg
60tttcttagca tatgatgcat tagtcgcaaa ctgtgtacct atcttgtacc aaaactaact
120ctgtctccaa aca
133
157 133 DNA Sorghum bicolor
157 caaacggagc ttccgcttga gcccctttac
ctaagagtat catcgggtgc atccaaaatg 60gtttgtcagc ctatgatgca ttaggtgcaa
actgtgtacc tatcttgccc caaaactaac 120tctgtctcca aac
133
158 133 DNA Sorghum bicolor
158 caaactgagc ttccacttga gcccctttac ctaggagtat catcgggtgc atccaaaaag
60gtttcttagc ctatgatgct ctaggagaaa actgtgtacc tatcttgcac caaaactaac
120tctgtctcta aac
133
159 133 DNA Sorghum bicolor
159 caaactgagc ttccacttga gcccctttac
ccaggagtat catcgggtgc atctaaagtg 60gtttcttagc ctacgatgca gtaggcgcaa
actgtgtaca tatcttgcac caaaactaac 120tctgtctcca aac
133
160 133 DNA Sorghum bicolor
160 tgaaacggag ctttcacttg agccccttga cctaggagta ccatcgggtg catccaaaat
60ggtttcttat cctatggtgc attaggtgta aaccgtgcac ctatcttgca ccgaaactaa
120cgttgtctct aaa
133
161 133 DNA Sorghum bicolor
161 caaactgagc ttccacttga gcccctttac
ccgggagtat catcgggtgc atccaaaatg 60gtttcttatc caatgatgcg ttaggcgcaa
actatgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133
162 133 DNA Sorghum bicolor
162 caaactgagc ttccagttga gcctctttac ccaggagtat catcgggtgg atccaaaatg
60gtttgttagc ctatgatgca ttaggagcaa actatgtacc tatcttgcac caaaactaat
120tctgtctcca aac
133
163 133 DNA Sorghum bicolor
163 caaactgagc ttccacttga gcccctttac
ccaggaggat cttcgggtgc atccaaaatg 60gtttcttagc ctatgatgca ttaggcgcaa
actgtgtacc tttcatgcac caaaactaac 120tctgtctcca aac
133
164 133 DNA Sorghum bicolor
164 caaaccgagc tttcacttta gccccttgac ctaggagtac catcgggtgc gttcaaaacg
60gtttcttatc ctatggtgca ttaggtgcaa accgtgcacc tatcttgcac tgaaactaac
120actgtctcta aac
133165133DNASorghum bicolor 165aaactgagct tcgacttgag cccctttacc
caggagtatc atcgggtgca tccaaaaggg 60tttcttagcc tatgatgcat taggcgcaaa
ctgtgtacgt atcttgcacc aaaactacct 120ctgtctctaa aca
133166133DNASorghum bicolor
166aaactgagct tccacttgag cccctttacc caggagtatc atcgggtgca tccaaaagag
60tttcttagcc tatgatgcat taggcgcaaa ctgtgtacgt atcttgcacc aaaactacct
120ctgtctccaa aca
133167133DNASorghum bicolor 167caaactgagc ttccacttca gcccctttaa
tcaggaatat catcgggtgc atccaaagta 60gtttcttagc ctatgatgca ttaggcgcaa
actgtgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133168133DNASorghum bicolor
168caaactgagc ttccacttga gcccctttac ccaggagtat catcgggtgc atccaaaata
60gtttcttagc ctacgatgca gtaagcgcaa actgtgtacc tatcttgcac caaaactaac
120tcggtctcca aac
133169133DNASorghum bicolor 169caaactgagc ttccacttga gcccctttac
ctaggagtat gatcaggtgc atccaaaatg 60gtttcttagc ctatgatgca ttaggcacaa
actgtgtacc tatcttgcac caaaactaac 120tctgtctcca aac
133170133DNASorghum bicolor
170aaactgagct tccacttgag cacctttacc caggagtatc atcaggtgca tccaaaatgg
60gttcttagcc tatgatgcat taggcgcaaa ctgtgtacct atcttgaacc aaaactaact
120ctatctccaa aca
133171133DNASorghum bicolor 171caaactgagc ttccacttga gcccctttac
ccaggagtat cttcaggtgc atccaaaatt 60gtttcttagc ctatgatgca ttaggcgcaa
actgtgtacc tatcttgcac caaaactaac 120tttgtctcca aac
133172133DNASorghum bicolor
172caaactgagc ttccacttga gcccctttac ctaggagtat aatcgggtgc atccaaaatg
60gtttcttagc ctatgatgca ttaggtgcaa actatgtacc tatcttgcac caaaactaac
120tttgtctccg aac
133173133DNASorghum bicolor 173caaacagagc ttccaattga gaccctttac
tcaggagtat catcgggttc atccaaaatg 60gtttcttagc ctatgatgca ttaggcgcaa
actgtgtacc tattttgcac caaaactaac 120tctgtctcca aac
133174133DNASorghum bicolor
174caaactgagc ttccacttga gcccctttac ccacgagtat catcgggctc atccaaaatg
60atttcttagc ctatgatgca ttaggcgcaa actgtgtacc tatcttgcac caaaactaac
120tctgtctcca aac
133175131DNASorghum bicolor 175actgcgcttc cacttgagcc ccattaccca
ggagtatcat cgggtgcatc caaaatagtt 60tcatagccga tgatgcatta ggtgtaaact
gtgtacctat cttgcaccaa aactaactct 120gtctccaaac a
131176131DNASorghum bicolor
176aactgagctt gcacttgagc ccctttaccc aggagtatca tcgagtgcat ccaaaattgt
60ttcttagcct gtgatgcatt aggcgcaaac tgtgtacctg tcttgcacca aaactaactc
120tgtctccaaa c
1311777572DNASorghum bicolor 177tgatgaagac atccacacta ctgatgcatc
tataccaata caagtaccaa tttctggtcc 60cattactcgc gctcgtgctc gtcaactcaa
ccatcaggtg attacactct tgagttcatg 120tccatcatat ttagaccatg gagacccgtg
cactcttgtt ttgcttagga atcagggaga 180agaccgaaag ggaaaaggat ttgaacatgc
tggattcgga ctgcagaaga acaccaactt 240gtgacggtca ccacggtcag atgcgggctc
ggattggaat gttcaagcac aacatggaaa 300gcttatcaag tctactttca tatggatccg
gaattatagt catatctgtt ctgaggccgc 360cgtaatcatt gttttcttac cgagacattt
cctgcctttt ctgcccatgg tgctgcgtca 420ccctattttg gcccaatggg tcgtgtatca
agttaggtcc attagggacg catcctaggg 480ttgcagcacg accccaatac ccttgtggtc
gtcctcccat gtttataaac cccctagccg 540ccaccaagaa cagcgggttt tgtttagatc
aagtttagct ctcgctactt gcttgcaagc 600gcgcgtgcta gttcagccgc ccgtcttctt
gtcttcggaa ccccaccata ttggagtttg 660atctttaaac ctacatttag atctggtaat
tcagtacttg ttctacttgt tcttgctagt 720tcttcgattg cttgcaggac gagtgcccta
gtggccaggg tgtcacgctc cacaagatcg 780tgacagccat aggaggtggt gtatcggttg
ctaaggcgca gcgtctttgg aaggctgtag 840tcgggccgtg aacgtcgtct cctcccccaa
tcgagttatt ccacaccctc tcatcgaaag 900atcgggcaat cacccaacgg gtgcacatca
gttggtaatc agagcaaggt ttatcggtga 960gagatttact tttcttcgct gttttcttat
ctcctatagt ccagaaaaag ccaaaaaaat 1020agtagattag ttttaccgca atcctataaa
ccattgagca tttactagta ctacttagtt 1080agggcttgtt gagtttttgg ttgcatcggt
tgtgtcgagt tgctggtctt agtttattcc 1140tttagagttt tgagttctac cacgttttgg
tcaccacgag atccaccatc accaaaaaca 1200tctctggttc gtttttgcca ccacggatac
atatcatatc cgatttggaa gtttaaatac 1260aatctggaaa gcttatctta tcttctttcc
aacggatctg accttatctc aaaattcgtt 1320ctgagcgctc cgcaatcatc gtagagattt
ctggactttc tatattaaca agatttgtta 1380aatctgattt aaaggggttg ttagcaatat
ctttattgtt tgggttgtca tagtgaaaaa 1440aagggtttag gcccctgcaa aaaaaaacag
aagaagaaga aaaaaaaaga atagaaaaga 1500aaaaaaagga aaagaagaag aagggggctg
aatctctaaa tctgttgttt ctctttgtgc 1560tgtgctagtt gttctttttc agtgactacc
tttgtgccta ggctcacgtc tctagcctgg 1620tttagcctag gaccagcaca gtaccaccgt
tgaacgatta ttcagcttgc ttttgtaact 1680aacgtggtac tagtgtattc cttgcttcag
cccacctaca actctacata tttcgactac 1740agtttgacag gtcgtgttgc tgcggcaccg
atacacttat tccacggttg cagacttgtt 1800ggttgctgac ccctcctgtt gtgcaaggta
agaattggta agagcttgtg tggcaggttg 1860agagtgagcg ccttgcagta gctacatcct
aatagttgta gagtttttat tccttcacat 1920ttttttttct tgttgcctct gttcgtctaa
ccatggcagg attggaggtt gatgatgctt 1980ctcgtaatat gccacactct cctcgcacca
agggtatcat acaacacttt gtaaggctgg 2040tgaaaacgca cacggaaggt cttgataatg
acatgcaggt gacgaatgaa aagatggggc 2100aattggaggc cacacagatc gacacaaaca
ccaaacttgc aaatgtggaa atgacagttg 2160ctcatattga caagagcctt gtcgcactct
tgaggcgatt tgatgagatg catgctaata 2220ccaatggtgg gcgtgatgag ggcgccgaag
gtaactggga tgactatgtt gctgatactg 2280aacaagatga ccaagaagca cctaatcgcc
ggcgactacg tactaaccgt agaggtatgg 2340gtggttttca ccgacgtgag gtacatggta
atgatgatgc ttttagtaag gttaaattta 2400aaatacctcc ttttgatggt aaatatgacc
ctgatgctta cattacttgg gagattgcgg 2460ttgatcaaaa gtttgcatgc catgaatttc
ctgagaatgc gcgggttaga gctgctacta 2520gtgagtttac tgaatttgct tctgtttggt
ggatagaaca tggtaagaag aatcctaata 2580acatgccaca aacttgggat gcgttgaaac
gggtcatgcg ggctagattt gttccttctt 2640attatgcacg tgatatgtta aacaagttgc
aacaattgag acagggtact aaaagtgtag 2700aagaatatta tcaggaatta caaatgggta
tgctgcgttg taacatagag gagggtgagg 2760aatctgctat ggctagattt ttgggcgggt
taaataggga aattcaggac atccttgctt 2820ataaagatta tgctaatgta acccgattgt
ttcatcttgc ttgcaaagct gaaagggaag 2880tgcagggacg acgtgctagt gcaaggtcta
atgtttctgc aggaaaatct acaccatggc 2940aacagcgcac gactacgtcc atgaccggcc
gtacactagc accaactccc tcgccaagtc 3000gaccagcacc cccgccttcc tccagcgaca
aaccacgtgc atcttccaca aattcagcaa 3060ccaaatctgc ccagaaacca gcaggtagtg
cctcttcagt agcctccacg ggtagaacaa 3120gagatgttct gtgttatcga tgcaagggct
atggacacgt gcagcgtgat tgtcctaatc 3180agcgtgtttt ggtggtaaaa gacgatggtg
ggtattcctc tgctagtgat ttggatgaag 3240ctacacttgc tttgcttgcg gctgatgatg
caggcactaa ggaaccaccc gaagaacaga 3300ttggtgcaga tgatgcagag cattatgaga
gcctcattgt acagcgtgtg cttagtgcac 3360aaatggagaa ggcagagcag aatcagcgac
atacgttgtt tcaaacaaag tgtgtcatta 3420aggagcgttc atgtcgtttg atcattgatg
gaggtagctg caacaacttg gctagcagcg 3480acatggtgga gaagcttgca cttacgacca
aaccgcaccc gcatccatat cacattcaat 3540ggctcaacaa tagtggtaag gtcaaggtaa
ccaagctggt acgaattaat tttgctattg 3600gttcatatcg tgatgttgtt gactgtgatg
ttgtgcctat ggatgcttgt aatattctgc 3660taggtagacc atggcaattt gattcagatt
gtatgcatca tggtagatca aatcaatatt 3720ctctcataca ccatgataag aaaattattt
tgcttcccat gtcccctgag gctattgtgc 3780gtgatgatgt tgctaaagct accaaagcta
aaactgagaa caacaagaat attaaagttg 3840ttggtaataa caaagatggg ataaaattga
aaggacattg cttgcttgca acaaaaactg 3900atgttaatga attatttgct tccactactg
ttgcctacgc cttggtatgc aaggatgctt 3960tgatttcaat tcaagatatg cagcattctt
tgcctcctgt tattactaac attttgcagg 4020agtattctga tgtatttcca agtgagatac
cagaggggct gccacctata cgagggattg 4080agcaccaaat tgatcttatt cctggtgcat
ctttgccgaa tcgtgcgcca tataggacaa 4140atccagagga aacaaaagaa attcagcgac
aagtgcaaga actactcgac aaaggttacg 4200tgcgtgagtc tcttagtccg tgtgctgttc
cggttatttt agtgcctaaa aaagatggaa 4260catggcgtat gtgtgttgat tgtagggcta
ttaataatat cacgatacgt tatcgacacc 4320ctattccacg tttagatgat atgcttgatg
aattgagtgg tgccattgtc ttttctaaag 4380ttgatttgcg tagtgggtac caccagattc
gtatgaaatt gggagatgaa tggaaaactg 4440ctttcaaaac taagttcgga ttgtatgagt
ggttagtcat gccttttggg ttaactaatg 4500cacctagcac tttcatgaga ttaatgaacg
aggttttgcg tgccttcatt ggaaaatttg 4560tggtagtata ctttgatgac atattaatct
acagcaaatc tatggatgaa catgttgatc 4620acatgcgtgc tgtttttaat gctttacgag
atgcacgttt atttggtaac cttgagaagt 4680gcacattttg caccgatcga gtttcgtttc
ttggttatgt tgtgactcca cagggaattg 4740aggttgatca agccaaggta gaagcgatac
atggatggcc tatgccaaag actatcacac 4800aggtgcggag tttcctagga cttgctggct
tctatcgccg ttttgtgaag gactttagca 4860ccattgctgc acctttgaat gagcttacga
agaagggagt gcattttagt tggggcaaag 4920tacaagagca cgctttcaac gtgctgaaag
ataagttgac acatgcacct ctcctccaac 4980ttcctgattt taataagact tttgagcttg
aatgtgatgc tagtggaatt ggattgggtg 5040gtgttttgtt acaagaaggc aaacctgttg
catattttag tgaaaaattg agtgggtctg 5100ttctaaatta ttctacttat gataaggaat
tatatgctct tgtgcgaaca ttagaaacat 5160ggcagcatta tttgtggccc aaagagtttg
ttattcattc tgatcatgaa tctttgaaac 5220atattcgtag tcaaggaaaa ctgaaccgta
gacatgctaa gtgggttgaa tttatcgaat 5280cgtttcctta tgttattaag cacaagaaag
gaaaagagaa tatcattgct gacgctttgt 5340ctaggagata tactttgctg aatcaacttg
actacaaaat ctttggatta gagacgatta 5400aagaccaata tgttcatgat gctgatttta
aagatgtgtt gctgcattgt aaagatggga 5460aaggatggaa caaatatatc gttagtgatg
ggtttgtgtt tagagctaac aagctatgca 5520ttccagctag ctccgttcgt ttgttgttgt
tacaggaagc acatggaggt ggcttaatgg 5580gacattttgg agcaaagaaa acggaggaca
tacttgctgg tcatttcttt tggcccaaga 5640tgagaagaga tgtggtgaga ttggttgctc
gttgcacgac atgccaaaag gcgaagtcac 5700ggttaaatcc acacggtttg tatttgcctc
tacccgttcc tagtgctcct tgggaagata 5760tttctatgga ttttgtgctg ggattgccta
ggactaggaa gggacgtgat agtgtgtttg 5820tggttgttga tagattttct aagatggcac
atttcatacc atgtcataaa actgacgatg 5880ctactcatat tgctgatttg ttctttcgtg
aaattgttcg cttgcatggt gtgcccaaca 5940caatcgtttc tgatcgtgat gctaaatttc
ttagtcattt ttggaggact ttgtgggcaa 6000aattggggac taagctttta ttttctacta
catgtcatcc tcaaactgat ggtcaaactg 6060aagttgtgaa tagaactttg tctactatgt
taagggcagt tctaaagaag aatattaaga 6120tgtgggagga ctgtttgcct catattgaat
ttgcttataa tcgatcattg cattctacta 6180caaagatgtg cccatttcag attgtatatg
gtttgttacc tcgtgctcct attgatttaa 6240tgcctttgcc atcttctgaa aaactaaatt
ttgatgctac taggcgtgct gaattgatgt 6300taaaactgca cgaaactact aaagaaaaca
tagagcgtat gaatgctaga tataagtttg 6360ctagtgataa aggtagaaag gaaataaatt
ttgaacctgg agatttagtt tggttgcatt 6420tgagaaagga aaggtttcct gaattacgaa
aatctaaatt gttgcctcga gccgatggac 6480cgtttaaagt gctagagaaa attaacgaca
atgcatatag gctagatctg cctgcagact 6540ttggggttag ccccacattt aacattgcag
atttaaagcc ctacttggga gaggaagttg 6600agcttgagtc gaggacgact caaatgcaag
aaggggagaa tgatgaagac atccacacta 6660ctgatgcatc tataccaata caagtaccaa
tttctggtcc cattactcgc gctcgtgctc 6720gtcaactcaa ccatcaggtg attacactct
tgagttcatg tccatcatat ttagagccat 6780ggagacccgt gcactcttgt tttgcttagg
aatcagggag aagaccgaaa gggaaaagga 6840tttgaacatg ctggattcgg actgcagaag
aacaccaact tgtgacggtc accacggtca 6900gatgcgggct cggattggaa tgttcaagca
caacatggaa agcttatcaa gtctactttc 6960atatggatcc ggaattatag tcatatctgt
tctgaggccg ccgtaatcat tgttttctta 7020ccgagacatt tcctgccttt tctgcccatg
gtgctgcgtc accctatttt ggcccaatgg 7080gtcgtgtatc aagttaggtc cattagggac
gcatcctagg gttgcagcac gaccccaata 7140cccttgtggt cgtcctccca tgtttataaa
ccccctagcc gccaccaaga acagcgggtt 7200ttgtttagat caagtttagc tctcgctact
tgcttgtaag cgcgcgtgct agttcagccg 7260cccgtcttct tgtcttcgga accccaccat
attggagttt gattttgaaa cctacattta 7320gatctggtaa ttcagtactt gttctacttg
ttcttgctag ttcttcgatt gcttgcagga 7380cgagtgccct agtggccagg gtgtcacgct
ccacaagatc gtgacagcca taggaggtgg 7440tgtatcggtt gctaaggcgc agcgtctttg
gaaggctgta gtcgggccgt gaacgtcgtc 7500tcctccccca atcgagttat tccacaccct
ctcatcgaaa gatcgggcaa tcacccaacg 7560ggtgcacatc ag
75721781004DNASorghum bicolor
178tacgtaagct tcgtttcgtc tgtttggaca tagtgctaat ctttatgcaa gatagatgca
60cggtttacgt ggaacatatg atatgctcag aagcaattta ggacgcacct aatataactc
120cttgatgatg tgtgtcacat ggaatcttgc ttcggtttct ttagagacag tgttagtttt
180ggtagaagat atgtgcacag tgtacgccta atgcaccata ggctaaagaa accattttag
240acgcacccga tggtactcgt agttgaagag gctcaactgg aggctcgatt tggtctgttc
300ggatatagtg ctaatcttga tgcaagatag ttgcacaatt tgcaggcaac gtaccatatg
360ttaagaaatc aatttggacg cacccaatgg aactcctaga tgacgtgtgt catatggaac
420tcgcttcggt ctgtttggtg accatattag tttcactgca ggaaaggtgc atagtttgtg
480cctaatgcac catagtctaa gaaaaccatt tttgatgcac ctgtttgtac ttctatgaag
540aggctcaagt ggaagctcgg ttcggtctgt ttggagatag tgctaatctt gatgcaagat
600aggtgtacgg tttgtatgga acataccata tgcttggaaa tcaatttgga tgcacccgtt
660ggaactcctt gagaagtgtg tcttatgtac cctcgctttg gtctgtttag aaatagtgtt
720agtttcagtg caagatatga gcatggtttg cgcctaacgc accatagtct aagaaaccat
780tttggaagca cctgttggta cttcggtgaa gaagctcaag tggaagctcg gtttgacctg
840tttggagata gtgctaatct tgatgcaaga tagtgcatga tttgcaagga acataccata
900tgcttagaaa tcaacttgga cgcacgcccc gcaactccta catcacgtgt gtcatatgga
960atcttacttc ggtccatttg taacattgta agttttagtg caag
100417922DNAArtificial sequencesynthesized oligonucleotide 179gggaagtaca
gggacgaaga gc
2218022DNAArtificial sequencesynthetic oligonucleotide 180tgcaaccaaa
ccaaatcacc ag
2218125DNAArtificial sequencesynthetic oligonucleotide 181gtcacccagc
agttccatcg ggtgc
2518224DNAArtificial sequencesynthetic oligonucleotide 182actgctgggt
gacgtggctc aagt
241832408DNASorghum bicolor 183cggcactcag ttcacgggca aaagttcttg
gacttctgcg atcagcatca catccgtgtg 60aactggtctg cagtggccca ccctcgaact
aacggccagg tcgagcgtgc caatggcatg 120attttgcaag ggctcaaacc aaggatttac
aatcgcttga agaaatttgg caagaaatgg 180gtcgaggaac tttcctcggt cctatggagc
ctaaggacga cgccaagcag ggccacattc 240atggtctacg gctctgaggc tgttctccca
acagacctcg agtatgggtc ccctcgactc 300aaagcataca acgaacaatc aaataaagag
tctcaagaaa acgtggttga ccaactcgaa 360gaagctcgag acatggccct catcaactct
gctagatacc agcagaaact tcgacgctac 420cacgacaagc acgtgcgcaa gagggacttg
aacgtaggtg acctcgtcct acgacgacgg 480caaaataatc aaggacgcca caagttgact
ccaccttggg agggcccgta cgtggtagcc 540gaggtcttga agccagggac atacaagctc
acggacgaaa agtgggcgat cttcaccaac 600gcgtggaaca ttgaacagct acgttgattc
taccccaaga atttcaaagc tttatgttcc 660tgcgtacatt ctgtaaatga ataaatgaat
aaataaagtc tttttctcga gcgacttacc 720ttttcacagg tctcaacgtt agaagggagt
atcgactatg acccatcata gtcgacaccc 780cctcgggggc tagcaaggga ggtgaccccc
ccaagtgtcg aaaaaaacca agtaatcctt 840tcgttcctat cggcaatctc atgcagtcga
gtagtaaagg tacctcgagc cccttaagga 900ccgagaaacg acgagcctga gaactcctac
gcccccgggc tatggaaact ctactcgtct 960cctcaccctt gaggtaatcg agaccgcctc
gaacaaaaga ccaagtgaga aaaacaaaca 1020taggcgcaaa aaggaataaa ggagcttcga
gaggaaagac agacaaacat ttaacaaacc 1080acttaaagac attgtattac ttaaagacaa
gttaacagag tactatacaa ggggccccag 1140gcacccagag caggctcgca ggccttagtc
cacagcatgg tcctcaccac cctcgcctgc 1200gcctgagcta gtctcagggg gaagcacctc
gggctcaaat agcttggcca gcctctctcc 1260aggaacctcc gtgtcgtcga tcaaagcgtg
aagcctctcc tcgttctccg tgtcagtctt 1320ggagatatcg gtaacaaagc catgagatac
cacctccata tcgtaggaaa agcccgagca 1380cacaaccgcc attgcccgct ttaccccgat
atggagagcg tcacgcaccc gatccctcag 1440cgtcgcgcct aagtagcaca actgatcgac
cagcgcgtcg cctcgagctt cctccttcga 1500ctcgtccata ggctcgactt cccaagaggt
cgagagatcg cttatcaccg tccgcagtca 1560acggttcaaa gcaacctccc gctcgagctg
ggcctgggcg ttgcgagcct cgacttgggc 1620ggccaagagc tcgtctttaa gccctgtgaa
tgccaacaaa aataaaagct ttagataaaa 1680ccaagcacat ctcgaaaaga aaacccgaca
atagaaacgt actgcggaca ctctcctcca 1740gagctgtgtt ctcgccaact aatttagtgt
tggccctctc gatctccctg ttagagcgtg 1800ccagctcagt gttggcaaca cgcagatctt
cgattgcctt gccagcctga gcaatggctc 1860cactcttctc cagcagctct cccttcagac
gatcgatgtc gtcggagagg ctgcgggatc 1920gagtccgctc cgcctcgagg tcctcgaggg
ccttcttctt ggcttcctcc gtggcgcccc 1980tagcgatctc gccatacaca tgctccacct
tcatcctctg aaaggattct cgaagggagt 2040ccaagtcagt cttgaggagg cccttctcct
tgtccagatc agcaacgacg ttcatgtagg 2100acaggacctc ctcccgggca ttggacgcag
cctcctcaac cgtcagagcc cgctccctca 2160gcagcacggc ctcttccata gccttcttcg
aggcctctcg agcctcggcc agggcagcct 2220ccctctcctt cttctcctcc tcgagcgcta
agagatcttt ataggcttgg aggagtttct 2280cctgcgactc caggagttgg tctttgacga
ggggaagcta ctcccagcca cccctcgtgg 2340cgtgaatgaa gcttgactta atgcgggagg
tctctttcaa gtcctgtgaa ccgatgatcg 2400agcagtta
24081842013DNASorghum bicolor
184gaggagccat cttgagggtt cgatgtgcat gtacatctca tgagcattga tcttccccaa
60tatggtggtt ggtgtagcgg tggaaagatc gccttgatgg agcacggtta ctatgtgccc
120atatttctca atgggaagca cacatagtat cttccttgct acatccgccg tactcatttg
180tgtaagccca agtccattta gctcctctac aatgacattg aagcgagaat acatttcatt
240agcattttct ttaggaagca tttcaaatgt attaagcttg ttcatcacta ggtgatagcg
300ttcctcgcgt tcactcttag ttccctcatg gagcgcacaa agttccttcc aaagttcatt
360ggcggttttg tggctccgaa cacggttgaa cacctctttg caaagacctc taaagatgtg
420gtttttggcc tttgcattcc acttctcgtt ttcttgctct tgtggagtaa gagcggtggc
480tcttttcgga gggacaaacc cttcggtcgt ggcttttagg cactttacat cgcatgcctc
540aaggtatgac tccatccgta ttttccaata cgggaagtca ttcccatcga acatgggtgg
600cggtccatcc ccgttagaca tctttctcta ggcggtgaag cctaaataat gagcactagg
660ctctgatacc aattgaaagg atcaagatgc ccaagagggg gggtgaattg ggcttctcta
720aaaatttaag caacctataa gctccaattc aaccccttgt gcctagtgtg acttagagag
780ctaccggata aaagttttgc aacctagttc caatcctatt ctagcatggc aaatctaaga
840atgtaaaagc acaaagtaaa tgctagaaag taaaggagta gtggaagaaa gtgctcggcg
900atgttttgcc gaggtatcgg agagtcgcca ctctccacta gtcctcgttg gagcacccgc
960acaagggtct tgctccccct tggtccgcgc aaggaccaag tgctctctac gggctgattc
1020ttcgacactc cgtcgcggtg aatcgcccaa aaccgctcac aagcttgaca cgtgccaccc
1080acaagaactc cgggtgatct tcgtgcctcc aatcaccacc gaaccgtcta ggtgatggcg
1140atcaccaaga gtaacaagca aagaactctc acttgaccca aacaaggcac tagaaagtgg
1200tggatgcaca cttgactctt ggaactcact agaggaggat tctctcaaga attcactcaa
1260aaactcaatc ctctctaggc ttttgcaact ctcttgctcc acaacaagtt tctctgaagt
1320tcaaatgggc aagagaggtc tcatggacga ggtggaggag tataaatact atccacgaag
1380tccaaaggtc ggccaaccgt tttccactga aaacggggtc accggacgca cattatgttg
1440caccggacgc tctgcaccga gcgtccggtg tgacataacg gctacctgcg tttttctctg
1500acaggtcacc ggacgctaac tcccagcgtc cggtgcaccg tccggtgctt ggggaaactt
1560tacatgctcc ctgcgcatgg gaccggacgc tacccggtgc gtccggtgct taggggaact
1620ttgcaagctc cctgggtaag ggaccggacg ctaccaggtg cgtccggtgc ccctttgggc
1680acccaaactt cgtcgaaacg cgatcgctcc aaaacgaagt ttgatcctct cgatctaagg
1740actatctcta agctgcctag agctaggttt accaagtgtg caccacacct aaacctaaag
1800ccttgcctaa gtcaagctac tagatcaaag cccctcttaa tagtacggtc aaaggaaaaa
1860aaagtcctac caagtgccct tcttcaccat atggcactta gacctagtct agccttgacg
1920atgtccatcc atcctttgaa aaccgaaacg atttctacca ttaagtaggc atgtacgtcc
1980ctgtccatcg agaacctatt taccatgacc tta
20131851546DNASorghum bicolor 185catcctctga cgcctcgtta ccaccagacg
tgtcgctgat ggtgattggg tccgcctcga 60cccccggttg aggaggctcg ggagctcctt
cttgcccctg agtctggggt gcctcgtcac 120gcgtctacgg catcagggta ggctcgggac
gaggctgagc actctcgtcg tctaccggag 180gtcgagagtc ggaggaagct gcaaaacaaa
gaatagaggc cggtcactat aatgcccaac 240ataagaacgt cacaagtccc aaaagtaagc
cacaaaccta gctccttgga ggctgccccc 300gctttgagcc tcttcgccat tgaaggctga
gcttgttcaa gcgtccgctt taatctgaaa 360caaaaagaac cagcgtaaaa agactggtaa
acaaggaggc gcaggacaaa agccacaaca 420tcgaagagct tacccgacgg ggaggacagc
tcgcctgcca gtgccccgac cggcgggtct 480cgacccaccc cgtgtcaaga tcctgggctc
ctcgagggca ggcactggcc ttggctcggc 540accttccgcc aggcgggccc cgggactgat
gcttctgcga ccagtctcgc ccttctcgga 600ggccgcggtg cgaccgcggg tcagcggccc
cgtcgcctgg ggcttagcat ggctatccgc 660ctacggcacc aggggagggc gcggggcggt
ctggcctatc gcgggcacgg gaggagtatc 720ggcacggggg cggtgagatc cgcttggacc
ttccctcggc gttcccgtct gtggggcgtc 780gacgtcacct cgaggcggac cctcgaggat
ccggtcgagg cgagaggcca tcccctcaga 840gtcatcgctg tcttcatccc caccatcatc
gtcgctgggg gactcttcct caggctcccc 900cctctgcctg gatttagccc gacgcgcctc
caacgcttgg cggtcgaggt tcttcttttt 960ctcccctttc ttcgcagagt ccttggcgga
cttctgcttc tcggcggacc ggcgtcgcgc 1020gtcgcgatca acctcgtcct cctttgctgg
aggcctcgag gatcggacgt cgatccgtcc 1080ctgccagaac aaaggtggac ttagaaaaaa
gaaaagaaac cctgaatcaa agctgcgcac 1140gaacagaaag cataccagat cgatcgagcc
cgcatccggc ctcatgggga aaccgttaac 1200gtgctccggc ttgaagtcgc cagcaatggc
agccctgacc cgggccgcta cttcgtcgtt 1260caccggagcc tcgctcgaca tccggcatgc
ctcgaggtcc cgggacgaga cacctgggcc 1320catctcgtcc atcctcagcg ggcgagacat
caaggggaga acccttcgat gatggacggc 1380tgagagaacg agagcggcgg tcaaaccctc
cgagcgtagc ttcttcatga cgtcgaggag 1440ggggtcgagc cgagatggtg agcaacgacg
acgccgtacg tccaattttc cggacgctcc 1500gtgatcagac gtccggtgta ggcgggcagg
aggtcgtcgt cgtttc 15461861161DNASorghum bicolor
186ctcactaaag gaagacgatc agatcaaaac atcttttatc actcctttcg gcgcgtattg
60ctacacgacc atgtccttcg gactcaaaat gctggagcta cttatcaacg ggctattcat
120taatgcctcc acgacgagat tcgtgacgac ctcgtcgagg cttatgtaga cgacgttgtt
180gtcaaaacaa gggacgcgag caccctaatc gacaacctag accgaacctt taaggcgcta
240aataaataca agtggaaact aaaccccaaa aaatgtatct ttggggtccc ctctggacta
300ctactcgaca acgttgtcag ccgcgatggc atacgaccga atccttcaaa agtaaaggta
360gtgctcgacg tgcgaccacc caagaatgtc aaggatattc aaaagctcac cggttgtatg
420gctgctctca gccgttttat ctctagactg ggagaaaaag gcctcccctt cttcaaactt
480ataaaggcat cggaaaaatt ctcctggaca gaggaagctg acgttgcgtt tacccagctc
540aaaactttcc tcacctcacc acccgtcctc acggcgcctc aacctaacga gaacctgctc
600ctttacataa cagcaaccga tcgggtcgtc tccacggcaa tggtggtcga gcgggacgag
660ccaggtcatg tctacaaagt ctagaggcct gtttatttca taagtgaagt cttaaacgaa
720tctaagacca ggtacaaaag ttaatctacg ctatcctgat aacctcaaga aagctgaagc
780attacttcga cggtcattgg gtcttggtaa ccaccagttt ccctctaggg gacattttgc
840gcaacaagga cgctaatggc agaattgtaa aatgggcaat ggaattgtgc ccattctccc
900tagatttcca gagccgcact accatcaagt cccaggccct ggtcgatttc atcgtagaat
960ggacggacct caacgagccc ccccccctcc ggacacttcc gaccactggt caatgttctt
1020cgacgggtcc ctaaacatca atggcgccgg tgctagaata ctcttcgtat cgcctaacaa
1080ggacaaactt cgctatgtcc ttagaatcct cttttcggca tctaacaacg tcgccgagta
1140cgaagcatgc ctacatggta t
1161187474DNASorghum bicolor 187ttgagcctct tcaccaagga gtaccatcgt
gtgcgtccaa aatagtttat tagcctatag 60tgcattaggc acaaaccgtg tacctatctt
gcatcgaaac taacactatc tccaaagaga 120tcgaagtgag attctatatg acacacgtca
tctaggagtt ctattgggtg catccaaata 180tatttcgaag catatggtat gttccatgaa
attcatgcac ctatctttgt gggggtataa 240acccctatac cctttcggct agacttgggc
taggagactt ggcccatcac gaagacagtt 300cgaggcttga tccaacagtt cggagtttca
tgcaaggaaa cgagacgcag aggtcaagca 360ggattctagt cggttagaat aggaattgat
atcgcactat ctatggcaat tgtaaccgac 420taggattagt ttccagattt ataaccctac
cctctggact atataaggag aggt 474188110DNASorghum bicolor
188tctagtcggt tagaatagga attggtatcg cactatctat ggcaattgta accgactagg
60aatagtttct agatttgtaa ccctgccctc tagactatat aaggagaggc
110189211DNASorghum bicolor 189aagggacccc cctaggcaat tcatctcaac
tcaatccaat acaatcagac gcaggacgta 60ggtattacgc ccacgcggcg gccgaacctg
gataaaaacc ttgtctgtgt cttgcgtcac 120catcgagttc gtagcttgcg caccgtctac
cgataaacta ctaccgtggg tataccccaa 180ggtagactgc cgactagctt tcatcgacaa t
2111904946DNASorghum bicolor
190tatttgtctt tccgctcgag gttcttttat ttctcttttg tcgtcctacg tatgtgttcc
60ttgttcgatt ttcgtaaaga cagcttcgat cacctcgagg gtgagggagc gagtagagtt
120tccatagccc gggggcgtag gtgttctcag gctcgttatc cctcggccct taaggtgctc
180gaggcgcctc aactactcga ccatgcaagg tttactaagt aagaaagaga gaaatttttg
240gcttttttcg acacttgggg gggggtcgcc cccctggtag cccccgaggg ggtgttgact
300atgatgggtc atagtcggcg cccccttcta acgtcgagat tataacttgt aaagccttgg
360tacaaaaaga agttgctcga gggaaggact tcatctattc atacattcac ttacaaagtc
420tcggtacaaa atatacgtag gaacataaag ttttgaactt ctaggggtag aacaacgtag
480ctgttcgatg ttccacgcgt tggtgaagac tgcccccttc tcgtccgcta acttgtacgt
540tcctggcttc aaaacctcgg ccaccacgta cgggccttcc caaggcggag tcagcttgtg
600gcgtccttga ttgctttgcc gtcgtcgcag gacaaggtcg cccacgttta agtccctctt
660tcgtacgtgt ttgtcgtggt agcgtcgaag cttctgctgg tatctggcag aattgaggag
720ggccatgtct cgagcctcct cgagctgatc gatcacgttc tcttgagtct ccttattcga
780ttgctcgttg tacgctttga gtcgagggga cccgtattcg aggtctgtgg ggaggacagc
840ctcacagccg tagactatga agaaaggggt aaattttgtg gccctgctcg gtgttgtcct
900taggctccat aggaccgagg aaagttcctc gacccacttc ttgccgaatt tcttcaagcg
960attgtagatc cttggtttga gtccttgcaa aatcatgccg ttggcatgct caacctggcc
1020gttagtttga gggtgggcca ctgccgacca gttcacacgg atgtgatgct gatcgcagaa
1080gttcaagaac tttttgccgg tgaactgagt gccgttgtcg gtgatgatga cgttgggaat
1140cccaaaccga tggatgatgt cggtgaagaa aaggacggct tgctcggagc ggatgttggt
1200gatgggtcga gcctcgatcc acttggagaa tttgtcgatg gccactagca agtgggtgta
1260tcctcctttc gcctttttgt aggggcccta ctaagtcgag cccccacacc acaaatggcc
1320atgtaatggg gatcatctga agagcgtggg cgggcaggtg cgtctgtttg gcgtagaact
1380ggcacccacg acatgagcgt acgagctcga tggcatctgc tacggctgtt ggccagtaga
1440agccttgtcg aaaggcgttc cctacgagag tccgtggagc agcgtggtgg ctgcaagccc
1500ccgagtgtag gtcctcgagt agtttccggc cttcttccac ggtgatgcaa cgctgcaaga
1560cacccgtcgg gcttcgttgg tacagctcat tgccttcgcc atatatcaca tatgacttgg
1620cccgtcgagc gattcggcgg gcttcagatc gatcttctgg cagctcgcag cggattaggc
1680agtcaaggaa cggtgtgcgc cagtcgaacg gtcgaccggg ccgccgttct acctccatca
1740cctcagctgc catcaatagt gcgttgacgg tgtcggttgg ctgggcggga gcggcatcga
1800cgtcagcccc cgagcctaag tcgacggatg gctcatgaag atctctcgag aacgcgtcag
1860gcggtaccgt gcctcgggtc gacgcgattt tggcgagttc gtcggctgct tcgttgtagc
1920gtcgggcgac atggacgagc tcgaggccat ggaacttgtc ctcgagtcga cggacttctt
1980tgcagtacgc ctccattttc gggtcgtggc agcttgaggt ctttattact tgatcgatga
2040cgagttggga gtcgcctcgg acgtcaaggc gtcggactcc tagctcgatg gcgatcttga
2100gaccgttgac gagggcctcg tattctgcga cattgttgga tgcggcaaag tgaatcctga
2160tgacgtacct catatggacg cccaagggtg agatgaacag caggccggcc ccggcccacg
2220ttttcatgag tgacccgtcg aaatacatcg tccacagctc cacctggacc tgagtcgggg
2280ggagctagga gtctgtccat tccgcgacga aatccaccag ggcttgcgat ttgatggcct
2340ttcgaggcgc ataagtgaga gtctcactca taagctcgac cgaccattta gctattcttc
2400ccgtggcctc cttactctgg attatctcgc ccagggggaa agacgagacc accgtgatgg
2460ggtgaccgag gaagtagtgc tgcagcttac ggcgtgcgag gattacggcg tagatcagct
2520tctgaatttg ggggtaacgt gccttggtct ccgagagcac ctcgctgacg aaataaactg
2580gcctctggac tggcagcgca tgcccctcct cctgcctctc gaccacgacc gccgcactga
2640tgacctgggt cgtcgacgcg acgtaaagga gcagtggttc tgcaggttga ggtgggacta
2700ggactggtgc tgaggtcaat gttttcttca gcttttcgag agcttcctcg gcctcggggg
2760tccaagaaaa acgctcgacc ttctttagaa gtcgatacaa tggtaacgcc ttttctccta
2820gtctcgagat gaaacggctc agagccgcca ggcatcccgt gactctttga acacccttga
2880tgtctcggat cggccccatt cttgtgatgg tcgagacttt ctcggggttg gcctcgatgc
2940cgcgctggga aacaatgtaa cccaagagca tgcctcgagg cacgccgaag acgcatttct
3000cgggattgag cttgatctgg ttggtgcgta agcagctaaa agcgatttcg aggtcttgga
3060tcaggtctcc tcgccgcttt gatttgacga caatgtcgtc gacataagcc tcgaccgccg
3120accctatgtg tttgccaaat acgtggagca tgccacgctt tttgcttctt cttgttcttc
3180ttgagccctc gactgggggt tccatcttca tcgacgggcc gcttgccttt ccctccttca
3240ccactgaaga aggcgccgac tgcctcctcc cctgctgcat agtttgccac tacatccatc
3300agctcatcga cggtctgagg gcgatttcgg cccaactccc gcactaagtc tcgactggtg
3360gtaccagaga caaaagccag gatgacgtca tggtcaggga tgtgtggcag ctcagtgcgc
3420tgcttcgaga agcgtcgagc atactctcga agagtttgtc ctgacttctc cttgcatttg
3480ctgaggtccc acgagttccc gggccgtatg taggtccctt tgaaatttcc ctcgaagact
3540ttaaccaaat cgacccagtc gtggatctga tttgctggaa gttcctcgag ccatcgacgg
3600gcggtgtcgg agaggaagag gggtagctgt ctgatgatgg ctcgatcatc ccctcgagcg
3660ccacctagct gacaggccag cctaaaatcg gccagccaca attctggttt ggtctcacca
3720ttgtatttcg cgatgctggt cgggggtcgg aatggactgg gtagaggcgt gctgcggatt
3780gccctgctga acacctgtgg gcttggtggc tcgggggcca cccgatcctc gtcgctgtcg
3840tacctcccac cacgatgtgc gttgtaaccg cgctcttccg cgtctctgtg ccggcggcgt
3900cgattctctt cgatgtcatg ccgcgcgtcg tgtcgacgct gattgtcgac ggggaggacg
3960agagtggtct ttcttcctcg agggggcggc tgttgatgga ctgacacttc ctttttattt
4020tgaggcggct cgtcgcgctt ttcagtggca gctcctcgcc gtcgagaagc tgagctttcg
4080acttgttgaa ctgccgccac ttggagcaga gtttgtacct cgtctcgaat gcgccgagcc
4140tgagagttcg atggtttcgg catgttgcgc agtagcatcg tcgccgccac tacattctgg
4200cccgctcgag ggaaatggct gacaggctgc tctggctctg cgtctccgat gatttgtcga
4260tgtacctctc gagcgcgtct gcagacactt ccgcccggcg ggtagggtag gtcttgctca
4320aggatttgct cgagcaggac gaggcgctca cggtcttcgt cgagcttcat cttgagctcc
4380tgtagctgtg caagctgcgc agccctgtct tcttgtggag gggggtcgac ttgtccgtca
4440ttcccaggcg tcctggggtc ttcccgtgat acaggggctc gaccgcccgc agcagcaacc
4500tgcgtctcgt cggtagtctc taccgccccg tcgacgtgat agcattccct tgtagggtcg
4560tagctatcag tctcggagtc ggagtccggc ttggcggcat agaaacatcc acgtccctcc
4620tccagcagcc gttgcctgac tgaggtgatc gtgtatcgcc cagggcctag gcggcggagg
4680cctcgcacct gggagaaatc cggcagttca gacgtcggct gtagtgtttc agagtcacgt
4740gggaggacca ttggtctctc cacgtacgtc tgggcgtttg ggcatattgc cctggcaagc
4800gacgctgcgg cgttgtctaa tccgaatggc agtgcccttc ttggagtgaa caacccgtgt
4860ggaagtacct tagcagcggt gggggagggc gcgcgagaca cgtccccccc ccgtcgtcgt
4920gcggtgttga gccgtcgtgc ccgtgg
49461911281DNASorghum bicolor 191cgctcgctct atcgagagta cgtcgaaacc
ttcgcacaaa acgagccaat cagaaccctc 60caccacaggt gtcgacggcg tttctgtgaa
ttaggcaata taaccctcga aggagtcaaa 120aactcctcca agggctcgag ggctaccccc
acggggtcgc tcgcgtgccc ccacgaaaac 180tcaacccaag aatacagcct ccactcgagc
gccagcgctc gaatggagac tcgggggcta 240ctgtcgaggg tatcaataag gggtaccctc
accgatgcac ataacaagat tatccgtacg 300caggtcgagg ccctcaactc ggcgctctga
tcatacatac gcacagccat caacaaccgc 360agcctcgaag acagaaatga tgtcgagcga
atcgatcaag ggtcgagcgc cagctatcgt 420cgaatacgga gacaggcccg agcgaattga
gaagcgtcga gcgcaaggac actgtccgcc 480gcctgacgcg cgcacgagag ccaggacatt
taatgcgccc gccgcattcc cacctaacac 540actggtcacg ggaggcgtga taggaaatag
gcacccgtcc catcgttctt tttgcagcct 600tccccaccaa acgacccagg ggtgtcagga
cgcgggaaac gaggatggaa cgtctaatcg 660gaaccccctc gagacaacca aggtcagcgc
tctggacatc agggcattac acggcgtccg 720accctcgacc tgacgtaact ccttcctggg
gacgagctgg gcgtcgaccg acaacaccgc 780aaccactctg ccggatttgt cgccatgtcg
tacaggcgtg cgactggtga accagcccaa 840gacggcgcgc aggagcggga tacaggggca
cgcgtaatca tcaccaggct actaagtcgg 900gacggctcga ggtcacgccg gtatggaggc
ctcgaatagg tcagcgcgcc atgctcctat 960cgacccctac tctgacacct atacatgtac
cctgggtctc tccttgatgc tataaaagga 1020agggctcggg aatagataga catcaggcga
taccacgtcc atacgcagta gaactctcac 1080actccatacc acgcttgtgt tcacccctgt
acaagcactt cggtgcaaga taatacagac 1140tcccctcccc cgctggacgt agggccttct
cttgcccgaa ccaggataaa tttctgtttc 1200ttcttgcatc accatctggg aaagggagca
cgcatacaaa tttactcgtt agtgtgaccc 1260cccaggggga aacaccgaca g
1281192582DNASorghum bicolor
192gcttattttg taagtatctt tcggtcctta attctcatag atcctttggt tttgtgttct
60aaccgttcaa ccccggctcg caggacatga aagagacctc ccgcattaag tcgagcttca
120ttcacgccac aagaggcggt tgggagcagc tccccgtcct caaggatcaa ctcctcgagt
180cgcaagagaa gctccttaag gcctataaag accttatagc gctcgaggag gagaagaagg
240cgagggaggc tgccttggtc aaggctcgag aggccttaag gaaggcagag gaggagacga
300cgttgctgcg ggagcgcgct ctgacggtcg aggaggccgc gtccaaagcc cgggaggagg
360ccctgtccta taagagtgtc attgcagact tagataagga gaagggtctt ctcaagatcg
420acttggactc ccttcgagag tccttccaaa agatgaaggt ggagcacgtg aatggcgaga
480tcaccaggag cgccgcggtg gaaaccaaga agaaggccct cgaagatctc gaggcggagc
540gggctcaatc ccacaggctc tctgacgacg tcgaacgtct ga
5821931194DNASorghum bicolor 193caaaaggaat ttgtatgtat gtatggtacc
actattgttt ctatgatgga ttgatctagt 60ggtagcatat gacatgtttg tgagcttgta
agcctagtgt tgaatctaga atatgagctt 120atgatgtgta attcaacatg gtcaagataa
cccttatact tcatgaggtg tgaaaaagct 180tgtccttgga tcaaaccgaa ttacatgttc
taggcaagtg atctagattg gaccataatt 240tgaccctcac attgattgac ttaattctca
ttaaaaattg aacctttgtg gtcattgatg 300acaaagtggg agagaaacaa agataagtcg
taaaggggga gaaaagtgct taagggggag 360agaaaacttt tgaaaataga aaggggtaaa
ttaaacttga gcacacaaat agggggagca 420agctcatgaa ccatttgttg catttgtatg
tgcactaaca taatgaagtg ttgcattaca 480agtttaaatt cactactctt gtttgattga
tgattgctag aaataaaact tgaatgatgc 540tttgaattct agcatagagt ttttattttc
atgtggtatc tagacatata atgtgatctc 600acgaggtatc ttgagttttt gatgtatgtc
tagctacatt ggtgctaagg atggtatatt 660ggcaactccg attggtatca cgcttcaaag
gtccattcta tataccttag catcattttg 720gtagtaataa atctcccaaa attccaattc
atgcatatgt gcaaacttga accaaactca 780tggaagcaca tatgtagggg gagctagtac
taccaaacgt gaaattaaag tgtttgtcca 840atattggttt catgattaaa attctctgga
caaacaattc aagttcatta tgatttcatt 900tcatatcttt gtgatggttg tcatcaatta
ccaaaaaggg ggagattgaa agcccaagtt 960tggttttggt aattaatgac accaagttgc
taatgctttg tgttcaagtg atttgagtta 1020ggcatagcaa cacattttaa gaaggagcaa
tgtgacatga gtggtggaca catggtcata 1080aagagagaag gcatgaagtg gagatcatgg
tgatggacaa ggagtaaagt gatcaaggca 1140aaggtataaa cataggattt tgcttttgcc
ggtctaagat gagtagagaa gtga 1194194607DNASorghum bicolor
194aagaacggga agcaggacag gcgtgctcac gaagacgaag acgatgacca ggatagggac
60ccacgacacc agtatgtcag tcccatcgac gtggtccact ccatcttcgg aggcaaagtt
120tctatcgagt ctaaacgaga gagaaagctc ctgaagagag cctgcctcaa tgtggacagc
180gcagatggtc tggtcgccga tccgaaattt cctccctggt cacacaggga gatctccttc
240aacaggaagg accaatgggc cgccatcccg gagccaggat gtttccctct gatcctggac
300ccttgcatca acaacgtcag attcgagcgc gtgctcgttg atgggggcag ctccattgac
360atccttttcc gcaacagtct gcctgctctg aagataaccc cggcgcaact aaagtcgtat
420gacgctcaat tttggggagt cctgccaggt caaagttcag tacccctcgg acagataacg
480ctgcctgtcc aatttggaac acctgaccac tttcgaatag agtttgtcaa ctttgtggtc
540gccgacttcg acggcactta ccacgcaatt ctgggccgac catcgctgac aaagttcatg
600gcagttc
607
195 1373 DNA Sorghum bicolor
195ctatctccaa acagaccaaa ctgagctttc
acttgagcat cttcacctag tgaaagccct 60agtttggttt tgtataattg atgaaaccct
agtactaacc tctatactaa gtgtgtgtag 120acttaatgag gttggtacat gccaagtgat
ggagcaagtg atgatcatat tgatgatggt 180gatgactaca agatgatcaa gtgctcaact
tggaaaagaa gaaagagaaa aacaaaaccc 240tatggagatc aaggcaaagg tattgcttag
ggttttggtt ttggtgatca agacaccata 300gagggtgtga tcacatttag gatagagagc
cgtactataa agaggggaat tctttggcta 360aagcggttat caagtgccac taggtgtctt
tgttcatgtg catgcattta gaacctagtg 420agctaactta actccttcaa agaaaatgat
tgtgaaaatg ctaacacacg tgcacttgtt 480ggtttacaca tcgtggtgtt ggcacacttt
gagaaggagg tggagtttga agagtagaga 540gaggatgggt tcctctctcc ctcccaccga
gcttgcgact agggattcgg cgcttttcga 600gaaaatgaag tgcatatttt ctattgcgcc
ggtgggaaat ttggagaagt cgcgggagtg 660ttcctcgcag agaaacactc accggacgct
ggcctatgag gcaccggacg ctgaggctga 720gcgtccggtg tgctgtggtg ctagggttaa
gcaccggacg gtgaacaccg gatgctgggg 780agctcttgtt catgcgtccg gtgctatctg
acttcggtca gagtgtttga ctggaagcac 840cggacgatca gggaccgtcc ggtggttagc
gtccggtgtg cgggcgtttt gcaaccctct 900ctacgcatgg gtccggtgag caccggacgc
taccggtgct tagcgtccgg tgacccacag 960gtttgcggaa ctctgtgcgc ctgagtccgg
tgtgcactgg acgtgtccgg tgctaacttg 1020ctcagcgtcc ggtgcactac aggtgaccgt
tagactctga cacgggaagt tcaaaaggtg 1080acacgtggct gacgttggag caccggacgc
aagggctgag cgtccggtgc ccctttaaga 1140gcgtccggtg accccgagtt tttgcccagt
gaaagagcca acggctctat ttgtttgagg 1200ggctataaat acgtgtttgg ctggcttggg
gctcactctc ttagcattct agcatacttg 1260acattcttgt gagcctaagc aaacacctcc
cactcatctc cttcatagat taaacatctt 1320tgtgagattg ggagtgattt caagtgcatt
tgcttgagtg attgcatcta gtg 1373
196 1100 DNA Sorghum bicolor
196ccctttggca tcaagcgcca aaaacctaag aagacggcgg aggaggcgga gcagaagagt
60ccctcggcgc ggccacgaaa cgagctgcat catctgaagt atcggtcgtc gaggcggagg
120aggtggagtg accctctgaa gctgcaggtg ctggtgaagc ggtccgggtc tgctctggag
180ctgtcacagg ctgtgctggc tgctcgagtc cggctgacga agctggcacg gactcggtcg
240ctgtagcagt cgtctctggt acggtagtag acggtgcaag ctgctcggac gacgctactg
300acgacgacag aagctctgta gtggtaactg gcaccgtaaa gactgcagga gcctgcagag
360gaggaggcgt agggtcacca gtcagagcac tgtaggtgaa gctgaggtgc ggatctgtct
420gcgagtccca cacaagctgt gacgctgaag ctgactgagg ggtgaaacct gagcagagag
480gggtaaactg cggtacctgc tcaaagcctg aagagatggc accctgagag acctggtgct
540gtggcgtcga gaacggctga ctctgagctg gtagctggag cggctgctgt gacgggctct
600gagtctgggg cgcaggagta gggggcctga cggacgacgt gcctgggagc tgtatctgcg
660gtagctgaat cccggaggca gccatgagag cggccatcat cgctgtctgc tgggcctgga
720aagcaagctg ctgctcctga aaggtgagaa gctgacgctg gagctcatcc tgcctagcct
780gcatggctgc attgatggca gctgatccct gtctcgcctc acctgctcct cagctgcgcg
840gtgctggtct gccctcatac cctctagtat agcaagcagg gcggggtctg acgcactcga
900agaaccgctt gcctcgtggt cgtgtgctct gggagggaga tctgtcactg gcggctgata
960gtcgtcatct gagctgtctg aggcgaagtc aaactctccc tcagcctctg catcggcgaa
1020agccccaatg gctatgtcct actgctcctc tgtctctgca acagctgctg tactgcgggt
1080gcccctagga gaaggcgggg
1100
197 519 DNA Sorghum bicolor
197tcattctcgc cttagccctt tcgctgcctt
agttcccttg ccatcccacg cgtgtgtgcc 60ccctcgagta gtagcggatc cgccattctt
ttggcaagga tgcctgtgag gcttgtcgcc 120gctgacgact ggcgcccgtc gtcgatgacg
gagcgccggc tgcaagagct tgagagggag 180ggactcctgc gccaccgcac ctcgctgtcg
tcgccggagt ggatcgcgcc ggcggcggac 240caaagggagc ccaggccgcc taaaggctat
gtggttttgt tcgccaagtt ccaccgccac 300gggctgggcg ctcccccgag ccgcttcatg
cgggcgctct gccaccacta cggggtggag 360cttcagcact tctccccaaa cgccatcacc
gtcgcggcgg tcttcgccgc ggtgtgtgag 420ggctatttgg ggatgatgcc gcactgggag
ctgtggctcc acctctacag gggcgagctc 480ttcaacgccc ctacgggtac caccggcgtg
aggaagccg 519
198 661 DNA Sorghum bicolor
198aatactatct ccaactcgac tgaagcaaga ttccatatga tacacttcat ctagtaattt
60catcgggtgc atctacagtt ccatcgggtg tggccaaatt gctttctgag catgtggtat
120gtaccatgca aaccgtgcag ctatcttgca tcaagattag aactatctcc aaacagacaa
180ccatgattcc acatgagccc cttcaccaag gagtaccatc gcgtgcgtcc aaaatagttt
240cttagcctat ggtgcattag gaacaaaccg tgtaccgatc ttgcaccgaa actaacacta
300tctccaaaga gaccaaagtg agattctatg tgacacacgt catctaggag ttttattagg
360tgcttccaaa tatatttcta agcatatgtt acgttcgatg taattcgtgc acctaccttg
420catcaagatt ggcactatct ccaaacaaag caaactgagc ttccagttga gcccctttac
480ctaggagtat gatcgggttg tcacgcccgg ttttaaagaa caaaaccagg ctagccatat
540gtgtgcccag gaagtccaca catacaacaa caaaaccaat agtatcaaaa caatgttata
600tagcgaaaac atacttataa ttaacactta cattagagaa atcgcggact caggctcaat
660c
661
199 2166 DNA Sorghum bicolor
misc_feature (2149)..(2149) n is a, c, g, or t
199cggaccactg acgttcttgg tacgagccat tgcgaaacaa gtgtccgact aacaaccgac
60taacaatgga gattaacgcg ctgcagggaa atgacaagat aggatatata taagaaaaca
120tatcgactag agatgagaca atcctataga tcgtaagaag tgataacgaa tggacgcgaa
180cggcttacga tccacggacg agacgcgttc cggaagaaag ctatcgatcg acaaccaccg
240gagaacgcaa atctgctcct caagatgctg cagcgagatg acaatcgaga tcggaatcaa
300aacctaaatc aaatgccttt tactcatagg aacaaacatc ttgagggcaa agtccaagga
360tcttacccaa acccttgaat ctcttcgtga tttggagaac cacgcgaaga gaagagagag
420gagaagatca ggggcagatc tggcgcggag agagcttcaa agggcgccgg gaatcgacgc
480acgcacgacg gcggcggcgc tagggcacgg gcgcggctcg ggcgatgttg gcggaagaga
540aagaaaacgc ccgctcaaag cccctcggtg cgggctttat gcgcccgccg cggggtcacc
600ggacactctt aaagttgcac cggacacgtc cggtgaccac cggactcatg cgcagagagg
660attgcaaatc ggcattgcac cggacgatgg gcaccggacg ctggctttag cgtccggtgc
720ttcaggtcac accaggtgag caccggacgc accgcaccgg acgctacagg gatactgttc
780ctgcgtccgg tgcgtgcagc ctggcacact caccacaccg gacgctcaga ggcagcgtcc
840ggtgcatcgt ccggtgctcc tctgagcact ttttggacta agcaccacgt ctgactttga
900cccaaccaag ttccatcttc aaaagcacac aaataaacac caaatggaac tggtatgagt
960gacttctctc aaaccctcaa attttcacaa atatttagcc ataggcttag tagtttttat
1020gaaaatagtt cgagaaaatc accaaggagc atcgtatggc cataaagcta ggggtttgaa
1080tcataatgag ctttgaatgc tccccctatc tatggacgaa cacagcggat gctcaacgaa
1140taaccgaaaa gaaacgatca actaacaagc atgcacatga catgagttgt aatgcaatac
1200ttgaaagaaa actaatgctt gtcaagtttg atccaaggtt aagctttttc acacacaaaa
1260gggggttatc ttaaccatgt tagacaagcc ctacatgcaa gttatatttt tagttttagt
1320atgcatgata tgcaaagcaa aagttattta caagattcaa cacacaactt ttatctttta
1380gtgaagttgg ataggtcaag cacattaagt tcatttctca acatacaaaa ttagcttcat
1440ctaggggttt tgtgaagata tccgccaatt gattttccgt tcccacattc tctaaggata
1500tgtcactttt agcaacgtga tccctaagga aatggtggcg gatgtcaata tgttttgtgc
1560gggtgtgttg aaccgggttg tttgcaagtt ttacggcact ttcgttgtca cacaacaaag
1620gtactttgtc tagtactact ccatagtcta gaaaagtttg tttcatgtag agtatttgtg
1680cacaacaagc accggcggct atgtattccg cctcggccgt tgacaaggca acactatttt
1740gtttctttga catccaagag acaagagatc gacctaggaa gtggcaccct ccggatgtgc
1800tttttctatc aatccgacac ccggcataat ccgaatcgga atagcctaca agttggaatc
1860ttgcaccttt gggataccac aagcctaggc aaggtgtgta ttttagatac ctaagaattc
1920tcttgaccgc aatcaagtga gattccatag gcattgcttg atatcttgca cacatacaca
1980cactaaacat tatatcgggc ctagatgcgg tgaggtagac ttcctatcat ggagcgatag
2040agagtcttgt caatagggtt acctccctca tccaagtcga gatgcccatt tgatgccatg
2100ggagtcttga ttggcttgca gtccatcatc ttgaacctct tgagaagtnt cttgtgtgta
2160cttctc
2166
200 1072 DNA Sorghum bicolor
200gttctctata tgcaagcttc ttgtcttgat
caagtgactc aatcacattg gtcttgtgct 60tggcttgttc ttgcaactct ttcttgagca
tgacattttc atccttgagc ttgatagtgt 120cctcataatt ctcggccact accactttct
tgcccttgca atcaacacat ttgcttgtgc 180tctccacaag taagtcatca caagatgtgg
ctacatcaag cttagcaaca ttgttagtag 240caacatgtgt ttcatcaatt gcaagttcat
aagcaatctc aaggttgtca tagtttgctt 300ttagagttgt tagttcttct ttcattgctc
tatatctagt gataagctca tcatgaacac 360tctcaagttt atcatgtttt tctttgagct
cttttagaga gagtttgagc tccttgagct 420tagtggtcat gatctcattt tggtctctaa
gctcatcact agacttaagc tttgctagaa 480gtgtgtcatt ttcaagttca agcttttcat
tttcacttct agtctttctt atgacttttg 540tgtatttgtt tagcaatttt acaagttcat
cataagaggg tgattcatat tcactatcac 600tttcactact ctcatcctca ctttgtacct
tccggtcacc tttggccaca aggcataggt 660gtgtagagga tgtagatggt ggtgaagatg
atggtgatgg agcatcgatg gcgatagcgg 720caaccttctc atcatcactt tcattgccgg
aggagtcacc acttgagctt tcaatgtcgg 780tgagccaatc accaacaata taagccttgc
cattcttctt cttgtagtgc tccttcttct 840tgccaccttt cttcttgtat tgcttcttgt
catttttctc atcatcgctt gagtcatctt 900gctctttgtt cttcttcttg tacttgtcct
tcttgggctt tggacattga tgagcaagac 960gaccaagctc accacaattg tagcaatcca
tctcggagat gggattcctc ttgctacttg 1020tgaagaactt cttcttcttg gagtcaaatt
tgactccatt cttgttgagc tt 1072
201 1003 DNA Sorghum bicolor
201ccccggccaa cttcaagttc atcctcgtac taatttacaa attctcgaag tggatcgagt
60acatgccact ggtgaaagca acctccgaga aggctgtgga gttcctcaat caaatcatac
120acaaattcgg gatccccaat agcataatca ctgaccaggg cactcagttc actggcacta
180cattttggga cttctgcgtt gacaggggca tagtcataaa gtacgtgtcg gtagctcacc
240cacggggcaa cggtcaagta gagcaagcga atggaatgat tatcgatgct ctaaagaaaa
300ggctgtacac cgagaacgac agagcaccac gacgatggat gaaaaagtta ccggctgtgg
360tctggggtct ccgaacacag gctagtcgca acacgagtgt atcaccctat tcatggttta
420cggcaccgaa gcagtgcttc cgtccgacgt gacctttgga tctccaagag tcgaaaattt
480tgaccagtct tcggccgacc tcgccagaga gctcgaaatc aactacacag aagaaaaggg
540cctaatctca tgccttcgaa cagcaaaata tcttgaagcc attaggcggt accacaacag
600gaacgtcaaa gactgttcgt tcgtgttcgg tgatctggta ctcaagtgga aaacaagcca
660agaaggaacg cacaagttat ccacaccttg ggaaggaccc tttgtggtcg ccgaagtcac
720acgacctaca tcgtacaaac tggcgtaccc agacggaaca cgcctgccta actcatggca
780catagacaaa ctgcgccgtt tctatccata agttccagta catttagttt gtaaatctct
840tatgatcagt agtaataaaa ttttttcttt cttgctatat acttctttta tatgcagaac
900ttccgacaag tgccacaatc ttgttttagg acgaaggtcc ttaagtgtta ttaacccgat
960tcagtgatac atcactcaga caacattgca gcgctacaaa acc
1003
202 689 DNA Sorghum bicolor
202gatatcttga gtaagcctat tgttattgct
tctactaacc cttcatgtag cacatctaca 60tcatcctcat ctagtagtga tggtttcact
tgtgacacca cactaaaagt tgagaatgaa 120attctcaaga aggaggtgaa agagctcaat
cacactttag ctaaggctta tggtggtgag 180gaccgcttgc ttatgtgctt gggtagccaa
agagcttctc tctataaaga gggattgggc 240tataacccca agaaaggcaa ggccgccttt
gctcctcaca agacacgttt tgtgaagaac 300aatggccggt attgcaaagc ttgcaagcaa
gttggtcact tagagcaaca ttacatgaac 360aagaaatcca aagcaaatgt atcctcaatt
aagcttaatt ctttctatgt gcttactaag 420gatacaaaag gtgttcatgc taagttcatc
ggtgcaccat ggatgggctc aaacaagaaa 480gccatttggg taccaaagag cttagttgct
aaccttggag gacccaatca agtttgggca 540cctaaaagga attgatcttg tcttgtaggt
caattacaaa gccggaggaa ggcattgggt 600gcttgatagt gggtgcactc aacatatgat
cggagagtct tgtatgttca aatcaattga 660tactagtcaa aatggtggct ttgattctt
689
203 959 DNA Sorghum bicolor
203ttcttcgacc agcaaggctt gacatcttca atcagcatga aagtcgccca aacagttgca
60acggtatttt aaccaactca aaagatagtt gatcaacatt gacatcgaag aagcaggctg
120ttcgaaaaat gcatggtgct cttaatcaaa aaagattttc ttcctagtga agtgtgccca
180ctttttcttt ttctgaaaag gacagacttt caaaaagcaa gcattttcac cgcaaggtga
240agtgtgccca cttaatcccc gagcctggta gtaggtgtcg tatgcacgtg gtgccaggat
300caaaaaggtc ttcatagaaa ttccaaaact gagatgcatc gctagatgca tcgtaccgat
360gtagtcctcg agcttgctgg aaagcgaatt atgagccttg tagcaaagtc taaaaattga
420atttaccagt ccccgagcat atcatgctcg tctgcaactg aatagactaa tcgaccagtc
480cccgagcata tcatgctcgt ctgtaatgga acagactggt cgaccagtcc ccgagctctc
540aaattggtgg gctggtcgac cagcctccga gcgtatagtg ttcggtggtc ggtacagtcc
600ccaaactctc aaattggtgg gctagtcgac cagtccccga gcgtatagtg ctcggtggtc
660gggacagtcc ccaaactctc aaattggtgg gctggtcgtc cagtccccga gcgtatagtg
720ctcggtggtc ggtacagtcc ccaggtatgt catacacagt cctctaccag gcccacttgt
780cctttggaaa aagcatcttg ccacattaac gcatgatgcg acaaggcatc ctagcgtatt
840gttagacggt tgcgtaaaaa ggtgcgcctg ataactgtgt agccgctctt tgtgtggttc
900aagaaaagga aaccaccaat tatataaaaa agccaccata ttaactgccg gtaattatt
959
204 449 DNA Sorghum bicolor
204gaagccagag cagacaaccg ccatcgcccg
cttcaccccg atgtggaggg catcccgcac 60ccgatctctc agcgtcgcgc ctaagtagca
cagctggtcg accagtgcgt caccttgagc 120gtcctccttc gactcgtcca taggctcgac
ctcccaggag gtcgagaggt cgtttatcgc 180cgtccgcacg cgacggttca aagcaacctc
cgcctcaagc tgggccttag cattacgggc 240ctcgacttgg gcggctagga gctcgtcctt
aagccctgta aatgccaata caagaaacct 300taggaaaaac caagcacacc tcgaaaaaag
aaatccgaca gaggaaacgt accgcggacg 360gtctcctcca gggccgtgtt cttgccgacc
agcctggtgt tggccctctc gatctctctg 420ttggagcgtg ccagctcggt gttggcaac
449
205 418 DNA Sorghum bicolor
205ctcctatgcg gacttccatt catggacccg aacttccggt tatggaccgg aacttctggt
60catggactgg aagttcggtc agacaccgga aatgttttct caaacgcgag cgcatctctc
120acagcccaac cggaacttcc tgtccggcta atcgaaactt ccggtcaggc actggaagta
180ccacccgtcc tgcattttca gcaccaagtc aaaccttgtt aggatgctaa cttttagctc
240cgaactccga attcgatgat cttggacatt ttggaaagct tactcagagg gctatacaat
300ccatatggat actcaatcca agtcataatg tatcaaagca gtatttcaat ctaaaagcca
360tcctaatgtc cggagaacac cgaaaagcct actttctctt ctccaagttg atcaaaat
418
206 590 DNA Sorghum bicolor
206atcgcccagg gcctaggcgt cggaggcctc
gtacctggga gaaatctggc agctcggacg 60tcggctgctg tgcttccgag tcacgtggga
ggaccattgg tctctctacg tacgtctggg 120cgtttgggca tatcgccctg gcgagcgacg
ctgcggcatt gtccaggccg aacggtaagg 180ctcctcttga ggcgaactgc ccatgtggaa
gtagcctagc ggtgggggag agtgcgcggg 240gcgcgtcacc taccgtcatc gtgtgatcat
cacggcactg agccgtcgcg cccgtggtgc 300gaggggctat cggctctcgt cccgcctcct
gcctggcacg gactgccggt cgcggcgagg 360cagatgggcg acggtggtgc tgtccacgcc
tcttcttggg cggggaggcg tggtggtggg 420gacgtcttgg ggccctcatc cgggttggcg
caacgcgggc tcctcccggc cctttgatcg 480agagcaagat catgtcatag ccggaaccag
tggacatgaa ctcgagactt ccaaaccaaa 540tcaccgtgga gtgcggaatc ggtgaagaag
agtggccatc tggaaagagt 590
207 470 DNA Sorghum bicolor
207ttttagtttg tctcattttc acatttgtta actagagagg aatgagcttc ttcaagctta
60gtgtgagcct tttgaagctt cttatgatct tcctttagac tctcataaga tgctctaagc
120gcatcaagtt cttgctcaag ggttttctta tccttaagca aggccttgca ctcctttcta
180gtttctttat agtgatgagt gcaatcttct agcatagtga ttagttcatc tttagttggt
240tcatcatcat cactatcact atcactagca tggtcactct catcattgga tgatacctta
300gtggccttag ccatggagca tgatggagtg tcgaagatgg agctctttcc ttgaattgca
360atgctagcgt gccccttctt cttggtgctc tcatcatcac ttgaggagtc atcactatcc
420caagttgcaa catgggcatc acccttcttc ttgtttgagc cttccttctt
470
208 476 DNA Sorghum bicolor
208atcatgaact tcttctaatg tgccacttgc
caaagttcca aaactactat aatgccttgc 60ttgtggtgga ataaccaagt aagaaccctt
catcacattt cttttcaaac ttgctcaatc 120tagtgccttt cttgagaata tagcatttgt
aaccaaacac tctaaagtat gcaatgttgg 180gctttcttcc attcaatagc tcatatggtg
tcttcccaag cttcccatgg caatagagac 240ggttgctaca atagcaagcc gtgttgatag
cttcactcca aaatgaatgg ctaacattgt 300attcactaag cattgatcta gccatgtcaa
tcaaggttct attattcctt tcaactacac 360cattggattg aggagtgtat gttggagaga
attgatgtcc aatgccaaga tcatcacaca 420attcctcaat tcttgtgttc ttgaattcac
taccattgtc acttctaatc ttcttg 476
209 2823 DNA Sorghum bicolor
209gctcaaggaa gaaaatcaat atctcaagct tggtttgatg tatgacaagc aagaagaaga
60tgagacattc atcttggatg agttagctag caacaatgac ccaatcatca agaagctaac
120tcaagagaac aacaagctca agaaagagaa ggaacaccta accatggggt tagcaaagtt
180cacaaagggg aaggaccttc aaagtgagct ttttatgaac accgtcatga agatggacaa
240gagtgggatt ggctacaagg ctcatcaaac aaagctcatc aagtcactag ccactcatga
300tcaaccaagc aagccaaagc ccaagagatg ctttgagtgt ggtcaagaag gacactttgc
360tcatgagtgt aaggcaccac taccaccacc cttgcccaag catgcaagac catttgcctt
420caatgctcac tacattgtaa ggaaagacaa gagtggcaag gtcaaggtta gcttcatggg
480gccgcccaac aagcaaaggc caaagaagat ttgggtgcca aagcaactag tagagaagct
540caagggccct aagcaaatgt gggtccctaa atctcaagct tgatctcttg tgtgtaggtg
600aactacaaga ccggtggatc acattgggta attgatagtg gttgcactca acatatgacc
660ggagatcccc agatgttcac ctctcttgat gaagatgttg acaaccaaga gaagatcaca
720tttggtgata attcaaaagg caaggtcaaa ggattaggaa aggttgctat atcaaatgac
780aactccatca ccaatgtgct ctatgtgcaa tctttgagtt tcaacttgct ctcggttgga
840caactttgtg atcttggatt tgaatgccta ttcaagaaga aggaagtaat tgtgaccaag
900gaagatgaca atgaagtgat attcaaaggc ttccttcaca acaacttata tgtggttgac
960ttctcatcca atgaagttga tgtcaagact tgcttattca ccaagacttc acttgggtgg
1020ttgtggcata gaaggttagc acatgttgga atgggcacac tcaagaagtt gataagaaga
1080aagaattgat tagaggcttg aaaaggatgc gacatttgaa aaggacaaac tttgtagtgc
1140atgtcaagcg ggcaaacaag ttgcaacact catccaacca aagcctatct ctctacttca
1200agagtgcttg agctacttca catggatttg tttggaccaa ccacatatgc tagtcttgga
1260ggcaataaat attgcttggt catagttgat gatttctccc ggtacacttg gacattcttc
1320ttgcaagaca aggccgaagt tgcatcaata ttcaagaagt ttgcaaagaa tgcccaaaat
1380caatttgatg tgaagatcaa gaaaattaga agtgacaatg gaaaagaatt tgacaacacc
1440aacattgaag agtattgtga tgaagtggga atcaagcatg agttctcctc aacatacaca
1500ccacaacaaa atggggttgt agaaagaaag aaccggacat tgatcacctt ggcaagaaca
1560atgctagatg agtataacac ttcggagaag atgtgggccg aggcaatcaa cacggcatgc
1620tatgcttcaa accggctctt tcctcacaag ttcctagaga agacaccata tgagttgctc
1680aatgggaaga agcccgatgt ctcattcttt agagtgtttg gatgcaagtg ctacatccac
1740aagaagcgcc aacacttggg aaagttccaa agaagatgtg acattggtta cttggttggc
1800tattcatcaa agtccaaagc atatagggtc tttaaccatg ccacaaacat ggttgaagaa
1860acatttgatg ttgaatttga tgaaactaat ggctcccaag gagcaagtga taatcttgat
1920gatgtaggtg gtgaaccatt gagggatgcc atgaagaaca tgccggtggg agacatcaag
1980cctaaagaag atgatgatga tgtgcaagtc attgagccac catccacctc acatgtatca
2040caagatgaag acaaggatgt gagagatgct catgaagaca cccaagtcac tcatgagcaa
2100gcggtggcac aagcacaaga tgttgatgct ccccaaccaa cccctcaagt ggcaccaaga
2160agaacatcac atctcctcca agatcactct caagatctca tcatcgggag tccgtcacgt
2220ggtgtaacta ctcgttctag acatgcttta tttattaaac atcaagcttt tgtgtctctt
2280gaagatgaac caaagactat agaggaagct cttcgtgatg cggattggat catggccatg
2340caagaggagt tgaacaactt cactcgcaac caagtatgga cacttgaaga gcgaccccaa
2400gatgcaagag tgattggaac aaagtgggtc tttcagaaca agaaggatga tcaaggcaaa
2460gtggtgcgca acaaggcaag acttgtggca aaaggctttt cacaagtgga aggtcttgac
2520ttcggtgaaa cctttgcacc ggtggcaaga cttgaagcaa tccgtatcct acttgcatat
2580gcatctagtc atgatatcaa gttatttcaa atggatgtga aaagtgcctt tttaaatggt
2640tatattaatg agcttgtcta tgttgagcaa ccccccggtt ttgaagaccc taggtacccc
2700aagcatgtgt accggttgtc caaggcactc tacggtctca agcaagctcc tagagcttgg
2760tatgagaggc ttagggactt cctcattgag aagggcttca agattgggaa agttgacaca
2820aca
2823
210 554 DNA Sorghum bicolor
210atttggacaa acccgatgga actgtagatg
cacccgatgg aactactaga tgaagtgtat 60catatggaat ctcgcttcag tccagttgga
catagtgatg tgaacccact agggttttcg 120cctgatcttt cgatgagagg cgctggataa
ctcgattagt gaatggagat gacgttcacg 180gcccgactac agccctcaag actgcgcctt
agcaaccgat acaccacctc caatggctgt 240cacgatcttg tggagcgcaa cacccggcca
ctagggcact cgtcctgcaa gcaatcgaag 300aaccagcaag aacaagtaga acaagtacta
aattaccaga tgtaaatgaa ggtttcaaac 360tcaatctcta atgaagtggg gtttcgaaaa
caagaagacg ggcggctgga tcagcacgcg 420cgcttacaag caagtagcga aggctaaact
ttatctaatc aaaacccagt tgttcatggc 480ggctctagat gtaaataaat agaggggagg
atgaccaaag gggtgctaga gtcgtcctcc 540aaccctagga tgcg
554
211 3139 DNA Sorghum bicolor
211gacagatcga gcagaccaac acgaaggtcg tctgcctcct gcggaccaac ctccttggga
60tacccccgct ccagtcggga caagtcaacc agggggtagt gggcgcgcac catgctgagg
120acgtgcgcac cggcaaattc cctggcgtcc ttgacaaact gctgaagcca gtcccacgct
180cattgcgcct tctcgaccgg ggtcagcctc ggtgcgtagg gctggtcttc tgggagttca
240gggctgatca agtcgagaac cggggccact cctttccaga tcttgatgca gttggccttc
300taggagtccc ggtccttgat cagctcgtcg aggtgcttct tagtattgac tttgagcact
360atgaaccaag caggaagtca gtgcacacac aagaaaagga acgttgagta atcaaaaaac
420agggtaacac ggttctcacc agacaactcc cgactcttgt tccttagctc ctcgttagcc
480cggcgcccgg cctccagcgc gccggcacgc tcctggtcga gcttctcctt ttcttgtcgg
540aggcgtgcaa cctcctcctc caagtctgca ggagcaattt ctggtcaaca aaaatgacta
600cgtggtaata tagaactgct cggaatgcat gactgacctt tcctttcccg atctctgtcg
660gctagtcgcg gattgagatc gagcaggcgc tcttcctgag cacccatctc ggccgccttc
720tcctcggcct ctgaccgaag acggtctact tccgcggaca gggccttatt ctcggccatg
780atcccttcaa tctggtcgaa gcaccttttc cggtacttcg ctgttttcat tgtgccctat
840aggaatcaca catattgacg taagcgacaa cgcgtataca tttcaagcag aatggaagac
900cacttacctt aacctcatca acgagccgct tagctgcgcg ctccaccctc acggcctcat
960ccgtctccac gagctcctca tgaccaatga aatggtcccc acgttagcgc cacacgtaaa
1020cgtgctggcg accgtcatgg agcgaccttg gatttcttca acttcgtcgt cggaggccgc
1080ttagtctcct cgccctctac gcttccctca gctgtggtca taggtatcac tagaccttta
1140tcctggccag gggagccctt tggcgaagag gtttccgctc ggggttgagt tgcagccctc
1200ctttcttcag cgggcggggg agtcggggct ctatctccct cctccgcgac actgctcggg
1260ggaggcgtcc tcatagcctc atcatccctt gtaggggtgc tcattccggt cctcgcctga
1320gctgggacct cctcgacagt ctctgccatg ctcactaccg agggctgctc ctcggtctgc
1380tcctgggtgg tctgctgagc ggcatcttcg acgacaacag tcggctcggc cgtcttcagg
1440ctagcgacgt tgtctggatc aacgtcagac gccgtactaa caaaaataca acattttacc
1500aatgtcaaaa gtagcagcaa caaaacaaaa aacaaatatg aggaagtgca aaagtgagat
1560tgaaaaactt acagatcgga gcgcctgtgt gtcgctgtga agaacctcct cctggtccta
1620ccagtcgccg ccgccacccc tgtctctcca agacgcgtcg gctcggcgtg tggagcagga
1680ggatcactgg tcacctgacc agtagtggtc ggcacgacgt ctggacgact gcgcggtctt
1740cggactaaag agggtgcagc ctcctcctcg tcctcgtcgt cgtcagtgat cctaacgagc
1800tgatgatgtt tcggagcttg gctggtcttc gccggcagcg cctccgtagg agatgggtcc
1860accgccattg gccttttccc cgccacccta tcctcggtgg tcggggcggg gcggttctat
1920tcctcttcac tgctggtgta ttgaggacac cgagcagcat cgtcctccgg cgggtccatc
1980ccagcaggat tttccatacc aggtgccagt gacacgtaca ctgcgcaccg gttaatgtcc
2040ccaatctgca atagaacgaa cgacaagtca tcaactagga ggcaacagta ttcatatagc
2100aatactagtc aataacaata tttatttacc ttaggagctg gtcgtcccaa cttgaaggcc
2160tgcatcggat cactgcttcg aacgatgttg tcatctgata agttgaacag ctcgttcatc
2220agctgctgga cctccgcctt ctccaaacca tcagggctca ttctggtcgg gttctggctc
2280cccgcgtact catacgccga gtgtaccctc ctctagcagg gcatcaccca acgcacaatg
2340tagttcccga ccactcctgg cccatccaag cgggaccagt cgatcaggct catcaggtct
2400gcaacctgcc cagcgaattc cgggcggtcc gtccagctga ccctcttctc ggggatgtaa
2460cccacgtcgc agaacgtggc gctgcccggc tcttccctga tgtagaacca cttcttgtac
2520cattcggtga gagaagtgtt ccatgggcag ctgaggtacc ggttcttcat cccatcgcgc
2580agattaaggt atactccacc tgcaatattg gagcctccaa ctgccccttt cttcctcaaa
2640tagaaaagat aacgaaagag gtcgaagtgc ggctctattc aaacataggc ctcgcacaag
2700tgaataaagg tggagatgag aagaatagag ttcgggtgca ggttgcagat cccgatctca
2760tagtaaagaa gaaggccctg gaggaaagga tggataggaa tcccaaagcc cctcttgaag
2820aaatcttaca aaaacgacaa tctcaccggc gcgggggtcg aggtagcttt ccccttccgg
2880tgccctccaa ccgccgagct cttggttatg aagcagaccc attccgacga ggtcgttgag
2940ggacctcgtg ctgctccggg acttcttcca ctccttcgcc ctcagctcgg cccttttctt
3000cgagtcactc ttcgccattg acgccgaaga aaagactttg agctggaagt taaagatgga
3060agttgcagga ggcaggaatg gcaagagcag aaggtttgta agcaaaggca gggtaaagaa
3120tatgggcgct gcattgagc
3139
212 309 DNA Sorghum bicolor
212gatggtagtt cttggtgaag agactcaagt
ggaatcttgg ttctgtctgt ttggagatag 60ttctaatcct gacgcaagat aggtggacag
tttgcatgga acataccata tgctcagaaa 120tcaatttgga caaacccgat ggaactgtag
atgcacccaa tggaactact agatgaagtg 180tatcatatgg aatctcactt cggtccagtt
agagatagta ttagtttcgg tacaagatag 240ttgcatggtt tgtgcccagt gcaccatagg
ctcagaaatt gttgtggatg ttcccgatgg 300tactcctac
309
213 1222 DNA Sorghum bicolor
213aaaaaatgtt ttcatacacg acttttaata attctaaaga taatccgtct taaccctaaa
60cctcctatgg ccaaaatttt tttttgctct ttacactaat ataatggatg gatcaagacc
120tgtgcaactc tttttctcgt tgaatccggt gcatctgttt tttttgtttt gttcgttggg
180cctgcttgac gagagaatga cataatccat gtaagaagct aataaacaag ctaatactac
240tagaacctag caaaccagta ttgtatgcgt acacgataag gaaagctagt cctgcacact
300gtatgtgtcc atctataaat agcgagctcc cctgctgccc tagctagacg catcctatgt
360gccgccgcta atctcgccgt ccgctgctcg gaatctcatc gagctcgcaa acccactgcc
420ggtgctgtag atgccttcca caactgccag ggtgagcaag cgccttcatg tcatcgcctt
480ggatcacacc atcgagaagg agttctctcg tcattcgtca agtgtagcaa attccactgc
540tgaaactatg gcacctcatc tctacagcat gtggcattgt gcagcaaatt cctcaagtgt
600atatgcctct gtacgcccct acaggttctg atcatgaccc cttgcagatt gatcatcaaa
660taaactgact attatcttcc ggcccaataa attcctaaag ctttaatcca ttagctctgc
720ttatctgttc aggttacatt agttttaagt cttggactat gtgataacat tgtcgttcac
780aatatcgtat gtttatgtgt ttattacaat taaattttaa tttgattaat atactaatat
840tagctttcta attaaaagcc aagccacata ggatattttt aatattggtg tacttctgaa
900ttataatttt gtttagtatt aacacatact gtgacatatt agttgctatc accatgtttt
960tgaattatta attgtttagt ttaaaacgac atcataagca gatcgtttag gtgtttcacc
1020tatcctttta tttggcaaag ttagtgtata atttgtttaa atatacaata atatttataa
1080gtaaaatata actagatttt acttggcttt caaactaaat taagatttat ctaatttaaa
1140ttaaacctct ttatgatttt gtttaactaa aaataaaatc aagtttgtag ttgctccttt
1200tataaaataa actcacgata tt
1222
214 1091 DNA Sorghum bicolor
214tataatgacc gacgtaagga cgacacagat
cccatgaata agctgctaac ctgagtcctt 60gaggccgctc ccactttgag ccgttttgcc
atcgaaggct gggcctgctc gagcgtccgc 120ttcaatctga ggaaagaggg acggacataa
agaaaaccgt tatatgagaa aacgcgaaga 180atcaggagaa taaaaatcgc aacgtcgaag
aacttacccc acaggaagaa cggctcgcct 240gccggtacct cgaccggtgg gcctcgaacc
accccgcgtc aaggccccgg gctcctcgag 300agcaggcgca ggcctgggcc cagtgactcc
cgcctggcgg gcgccaggac tggcgcttcg 360gcggccggtc tctcccttct cggaggccgc
agcgcgaccg cgggtcaacg gcctcgtcgc 420ctggggctta gcttggccac ccgcctttgg
cgcaggggga tgttgggggc gcggggcggc 480ctgactggtc gtgggcgcag gaggggtgtc
agtacggggg cggtgagatc cgcctggtcc 540ctcctttggc gctcccatcc gcggggcgtc
gatgtcgcct cgaggcggac cctcgaggat 600ccggtcgagg cgagacgcca tcccctcgga
gtcgtcgcta tcgtcgtcgc cattcccacc 660atcatcctca ctaggggatt cttcttcagg
ctcacccctc tgcctagatt tagctcgacg 720cgcctctagc acttggcggt cgaggttctt
cttccttttc cctttcttct cagagtcctt 780ggtggacttc agtttctcga cggactggcg
ccgtttgtca cggtcgacct cgtcctcctt 840tacagggggc ctcaaggatc ggacatcaat
gcgtccctgc cagaataagg taggcttaaa 900aaggatcgaa agaaatccag aattgaagct
gtgtacgagg gaagagcatg ccaggtcgat 960cgaaccctcg tcgggcctca tagggaagcc
attgacgtgc tccgacttaa agtcgccggc 1020aatggccgcc ctaacccggg cagcggtctc
gttgtccgcc agggcctcgt tagacatccg 1080gcatgcctcg a
1091
215 841 DNA Sorghum bicolor
215cccactgtcc tgctagagcc ctccccaaaa ggtgcacgga agcatggagc cacgaagtca
60accggtggta tctcaccacc atgaggacga cgaaggggat ccacgttccc gtagcacagc
120tggtgaatcc tggtgggaga ggctcgcagt ccaagcacct ctctggccct ctgagctgtc
180actctgtagt ctgtccccgc cagtgcgtag tggatatagc tgtgatctgg agaaacccac
240actgacgcgt agaactctct gacccactgc tcacagtaag tgccggggag tgctagtaac
300tctggaagtc tgtggaggta ggtgaaatac tgacgaatgt ctgctcctac cacgttcccc
360agtatctcta ggtcgatcac tctgtgctcc ctgaactggg cctgagagcg aaccagtgcc
420tcgtagatgt cctcctggac gactgtgtag aacctggcac cggctctagg atccctggca
480gctggaaacc aggtgggaaa cggaacgaac ctcagctgct ggatctctct ggctgacaca
540agagagagat ccctgtggat cctgactggc ctcgggcgtg aagctggtgt gcctctcgct
600ggtgtggcac gagctgtgga agtgtactga cccggcgtgc cggtccgagg aacggcctgg
660gtacgagcac gagtcgacct cctgagtggc tgctcctgct cctgctcctg tgctggaccc
720tctgcctggg tctctgtctc agtgtctgga tcctcctgag ctggatccct gaccacaagc
780ttaacctctg tcctccaatc gtcacggtgc ctggagggga tccgtggaat gcctctgcct
840c
841
216 638 DNA Sorghum bicolor
216acaaatcagc aatattgtag catcatcagt
tttatgacat ggtataaaat gtgccatctt 60agaaaatcta tcaacaacca caaacacact
atcacgtccc ctcctagtcc tcggtagtcc 120caacacaaaa tccatagata tatcctccca
aggagcatta ggaacagaag aggtaaatac 180aaaccatggg gatttaaccg tgactttgcc
ttttgacatg ttgtgcaacg agcaacaaac 240ctctccacat ctctcttcat ccttggccaa
aagaaatgac cagcaagtat gtcctcggtc 300ttctttgctc caaaatgtcc catcaagcca
cctccatgcg cttcctgcaa caacaacaaa 360cgaacagagc tagctggaat gcatagcttg
ttagctctaa acacaaaccc atcactaacg 420atgaatttat tccacccttt tccatcttta
caatgcagca acacgtctct aaaatcagca 480tcatgaacat attggtcttt aattgtttct
aacccaaata tcttgtaatc aagttgattc 540agcaaagtat atctccgtga taaagcatca
gcaataatat tttccttccc tttcttgtgc 600ttaataatat aaggaaaaga ttcaataaat
tcaaccca 638
217 996 DNA Sorghum bicolor
217cggaatcgtc ttgagcggcc cgctgttgtt gttagtattg tcttttgagg acacgacatt
60cttccaccgt gtgattggac ttgggatgga gttggcacgg tcccttcaac gtcttgttgt
120agtcctcctg gtagtctcgt cgcccattgg ggcgcttgac cgtattcacc tcgtggtcat
180ggtcacgagg gcgcttgccc cggaagtcat cacgattgtc acgccgcctg tcccaacctt
240ctctatggtc gcggttgtca gtgcgacgat ggttcctgtt gtcaaaacgt ccacgactgt
300tgtctcgtcg acgaggctgg tcgggacctc tcgcatcttc ttggatgagc ttttctgcgt
360catcagcgtc agcgtagttt tgtcgtggcg aggagctcgg ccatcgtggt cggccttttg
420cacagtagct tattccttag tgctttgtgg aagcatagcc ctttgatgaa ggctgatatg
480gcctcatcct gagaaatttt tggaacctta aggcgcatat ccgagtaccg tcggatgtaa
540tcacgcaaag gctcgttttc ctgtccctta ttctctcggg atcgtacttg ttcctaggct
600gctcgcaggt agcgatgaaa ttgtcaataa acgcctggca cagttcttcc caagagtcaa
660aagagtcctt gggcaggccg aggagccact gatgggcatc ctggtcgagg acaactgaga
720agtaatttgc catgacgtgc tcgtctcctg ctgctgaccg aactgcaatc tcgtagagag
780tgatccagtt ttcagggttt tccttgccgt catatttttt aagcttctcg agtttaaagt
840tgtggggcca tacggacttg gcgaaggtgt gaagaaaatt gtcttagtcc gtggcacgcc
900ggttgatacg gtcgcgggcg tccttgggag ggatgttgta tcgtagatcc ctgtcctggt
960ttacttcctg accacaccca cacttctgct cgcggt
996
218 379 DNA Sorghum bicolor
218ttggcgcgcc aggtaggggt cctgcgtgtt
tttcatcgat ttcccactcc tttccagatg 60gcagctcacg cctcgccgat tccgtgctcc
acgatgattt ggtttgggag tctcgagttc 120atgtctacgg gctccggcta cgacatgatc
ttgctctcag tcaaaggacc aggaggagcc 180cgcgttgcac cagcacggtt gagggccccg
agacgccctc gccaccacgc ctccccgccc 240aaaaagaggc gcggacagca ccacccccgc
ccctatgcct catcacgacc ggcaactcgt 300gcgaggcagg aggcgggcca cgagccggca
gcaccatgtg ccggaaccac gggcacgacg 360gctcaacacc gcacgacgg
379
219 1139 DNA Sorghum bicolor
219acaagatggt catcattcca gctccgcgac cgtggctgaa ggcctcacgt tcaccgtcgg
60ccagattacg tggacgactc atggcagcgg cctcacgacc atgacctcgg aaggaactca
120gatccgatct gggacagcag gggcgtcaat tccgatcacg ccggcaccga gcacccgacc
180tccgctccct cgttacaggg gaaaaaaggt caacgacttg gatcttttct gagcgcttga
240tcgtgtcgat cttaggctcc ttgaagcttc caaactggta gattggatct cgtgcctgtc
300agatcaagcg gcgttgacaa actctttcga ctcctgccgg ccaacttgtg tcatcacaca
360cgaacagctg gggacgtctc tgacaataac gtccaccccc acgggtcggt tcgtcaagct
420gaagacagct caagactaga gtcctcctta tggactaaac aactcggccg atcactactc
480acggcacacc caatcccttt tcaacagggc gagctagtcg ccgagacaga gcagccgagc
540agctcgccat tacgtcaaca tggtatcaat ccaagtagta ccgaacgatg aggctgccag
600tacgaactcc agcacagggt ctgtccctac cgaggttcta aacttcgagg acgaagatta
660cgacctagat ttaacaccgt atcctccggg cttttctcgc ttcccagtct ttccacctcg
720gcggggagat cttattttca atgtcagcaa tgatgccagt tgtcgacgga gaaacagacg
780agcagaagca gctccgtgaa cagcgcaata ccgatcgcgc tcggcggcgc gcggatgagg
840aacgacagct tgcgccgcac aacctcaatg acggtttcga catggtcgga gatcagccag
900tctacaagac gccgagtgct aacgtgggca ttgctatggc aaacctggac cgactccctg
960atactcctga gtgccagggc gtccgatcca gtatacgtgc acacctgatc gccgcgatgg
1020gacagacagc caccttgctc aaaaggatcc aagccatctc ctacacagag gtctcttccg
1080accagactca tcgcatccgg acttcaccac aacccagcga gcgccaccgt agccgctct
1139
220 181 DNA Sorghum bicolor
220ctaatattga tgtaagatag gtgcacgaat
tgcatcgaac gtaccatatg cttagaaata 60tatttggacg cacccaatag aactcctaga
tgacgtgtgt catatagaat ctcacttcga 120tctctttgga gataatgtta gtttcggtgc
aagataggtg cacggtttgt gcctaatgca 180c
181
221 392 DNA Sorghum bicolor
221ctatcatagg gacacttggc gatgagatga tccttgctat ggcatctaaa gcacctcatg
60gaatggtcat tcttcttgga ggagttcttc ttccttcttg cctgatagcc cttcttcttc
120ataaacttgc caaatctctt gacaagaaga gccatctcct catcttcatt actatcacaa
180gagcttgctt cttcttcact tgatgactct atcttggctt tgcccttgga tgaggaggcc
240ttgaatgcca cacttttctt cttctcatca tccttcttct caccatcctt cttcttcttc
300ttgatcaact catcgttctc atcatcttct ctataagcat catcggtcat cacttcttga
360aacacttgtt ggggagttga ttcactaagt gt
392
222 706 DNA Sorghum bicolor
222tttctcaatg gatatataaa tgagttggtt
tatgttgatc aacctccctg tcacacccat 60attttaagaa caaaatagga tgcataaaag
actcatatgt gccccaggaa tagtcacaca 120cataagtaga caaatctcaa atgtaccatt
gcagtgttta ttacatagtg gaacataata 180aataacatag tctcatacaa atatgatagc
ataaaacaaa caacgctctc gacggaagct 240ccacataggg acactgttga ctggttgact
ccaaacctag tactcataac gatagtcctc 300attccagtca ccttcattat catatcctga
ggtgttggga aattgcaaga gtgagcacat 360atcgtactca acaagtataa tcaagggttc
atgaggctca aatagctgac actggtttga 420ctgcatttag cttttaatag tggataacat
gtttaatcat tggatagcaa atatcaaggt 480agcataatta atcccataac cacatgatca
atgtatacaa gaattaagaa taacacatat 540aaacaacata ataaaccatc atttaatatc
attattcacg ttcatcagtg tccatctatt 600ccgtcagttt tctgggccac ccgtatccgt
gggcacggct agtataccag ttttacactc 660tgcagaggtt gtacatcttt acccatgagt
cgtgatttac cctttc 706
223 457 DNA Sorghum bicolor
223gaggccgaat acattagtgc cgatagttgt tgtgcacaat tgctttggat gaaagctacc
60ttgaatgact ttggaatcaa atttaagaat gtgcctttgc tatgtgacaa tgagagtgca
120atcaagatga cacaaaatcc ggttcaacat tcaagaacca agcacattga cataaggcac
180catttcataa gagatcatca acaaaagggt gatataagca ttgaaagcat tggcaccgaa
240gatcaactag ccgacacctt cacaaagcca cttgatgaga agagattttg caagttgaga
300aatgaattga acatacttga cttctccaac ttgagatgag tgcaccccta tatttatata
360tgacatgcct ctcctccaac aaagcaaggt aaagatggtt gacatagcat tcatactttg
420ctaaggacat gtttagaaca tatagacatg cttgcat
457
224 506 DNA Sorghum bicolor
224agccaatcgg gatgatacac ttctttaatg
aatccggctg ctagaagccg cgttatttct 60atcctaatag cctccttttt gtcacgagcg
aaccgtcgta gcttctgctt gataggtctt 120gcggtcttcg agacgtttaa ggagtgctcg
atcaggtttc atgggacgcc gggcatgtct 180gcgggtttcc atgcaaaaac gctcatgttg
tcccttagaa acctgacgag cgtgtcttcc 240tatttagggt ccagattggc cccgatgaga
gccgtcttct cggggctgcc gtcgaccagc 300tgcacctcct tgtgctcttt ggactttgaa
tttttcctcg gggtctccta ctcgggtagt 360ggaagctggt tgggtggaaa ctgcgtagct
tgggctacgg tttttgccat gcgaatcgag 420agatcgattg cctcagtgat cttgaagctt
tcctcctcgc aagtgtatgc gatatagaca 480ttgtctctga cagttaggac gccctt
506225344DNASorghum bicolor
225cgaaacctcc gcacaaaacg agctaatcag aaccttccac cactggtgtc gacagcgttt
60ctgcgaatta ggcaatataa cctttgaagg agtcaacaac tcctccaagg gttcgggggc
120taccccgtgg ggtcgctcgc acgcccccac ggaaacccga ccaaggaata cagcctccat
180tcgagtgcca gcgctcaaat ggagactcgg gggctactgt cgagggtatc agtaaggggt
240atcctaactg atgcacataa cgagattacc tatacacaga ttgaagtcct caactcggcg
300ttctaatcag ccatacggtc atcgacggcc gcagcctcga agtc
344226127DNASorghum bicolor 226gtctttgagg ctgcggccgt cgatgaccgt
atggctgatt agaacgtcga gttgaggact 60tcgatctgtg tataggtaat ctcgttatgt
gcatcagtta ggatacccct tgctgatacc 120ctcgaca
127227596DNASorghum bicolor
227cctctccgac acaacctata gctggtcctc ttactcgtgc tcgtgcccgt caactcaacc
60ttcaagtaag ttcatcttta aactcttgtc aatcatatta ggcaatggag acacgtgcac
120tttcgtgttg ctcaggaata atggacaaga tcagcaaggg aaggttcaac tgcattcaga
180atttgaggca acaccaactt caagggctga ttgcatatgg gaagagtgat aacttaacaa
240aggtgattgg agatcaggtc caagcctcca caacatcctc tatcaagtta ccacgtcgct
300tctaaagcaa ggaaacaaag agtccaaacc caacacgttt tgggagttgg attcggactg
360gaaaataact ctaacttgta tggatcacca cggcgtcata tggactccaa ctgggacgtt
420cctatacttg ttggaaagct catgaagtct actttccaat gggtccaacc acatatctat
480gcagcttatg agtcgggcgc agtccttgtt ttcgtgccga cacctttttc tgttttggtg
540ctgcgtcacc ctattttgga ccaatggccc atgtatcaag ttgagtccat taggga
596228493DNASorghum bicolor 228aagtagatgg attgacgacc aaagaggtgt
tggggttgtc ctccaaccct aggacgcgtc 60cctaatggac ccaacttgat acatgggcca
tcagcccaaa atgaggtgac gcagcaccct 120ggacagcaag acctatcgac ggtaattgga
tcattgtcgc tgctgccgaa caggtattga 180cgtgggactt gattcataag aaagtagact
tgataagctt tccaacaagt cctgaagcgc 240ctgaatctga atccgtatgc aaccgtggta
acggtcacaa gttggcgctt tgctgctgtc 300cgaatccagc gagcgtgagt ccttctccct
ttcggtcttc tccttggttc ctaatcaaaa 360cagagtgcac gcatttccat tgtctaaaca
agatggacac gagcttatca aggaactcac 420ttgattatta agttgacgaa cacgagcacg
agtaagagga ccagctattg gttgtgtcgg 480agaggatgta tcg
493229913DNASorghum bicolor
229ctggtgttga tgtcctcatc atcctccctt tcttgcattt gagtcatcct cgactcaagc
60tcatcttcct cacccaaata aggcttcaaa tctgcaatgt taaatgtggg actaacccca
120aaatctgcag aagatcaagc ttatatgcat tctcattaat tcgttgcagc actttaaaag
180gaccatcagc tctaggcatc aatttggatt ttctcagttc tggaaatcta tcttttcgca
240aatgcaacca aaccaaatcc ccaggttcca gaatcaattg ctttctacct ttatctccag
300cgcatttgta cttagcattc atgcgctcta tgttttgttt agtggcttca tgcaatttta
360acatcaattc agcacgttgc ttagcatcaa aatttatttt ttcagaactt ggcaaaggca
420ttaaatcaat aggagcacgt ggcaagaagc catagacaat ctcaaaagga cacatttttg
480tagtagaatg caatgaacga ttataggcga actcaatatg aggcaaacat tcttcccaca
540tcttaatatt cttctttaaa acagccctta acatagtgga taaagttcta ttaacaactt
600cagtttgacc atcagttttg gatgacaagt ggtggaaaat aaaagcttag tccccaattt
660gaaccacaaa gtcttccaaa aatgactaag aaatttagca tcacgatcag aaacaattgt
720gttgggcaca ccatgtaagc gaacaatttc tcgaaagaac aaatcagcaa tatttgtagc
780atcatcagtt ttatgacatg gtatgaaatg tgtcatctta gaaaatctat caacaaccac
840aaacacacta tcacgtcccc tcctagtcct tggtagtccc aatagaaaat ccatagatat
900atcctcccaa gga
913230943DNASorghum bicolor 230ggacgggttg gactggaggt aatatatgcc
tagcacttct tgtatcctga gaacgtcctt 60cgtaactcct tcgctcctct gataactctt
actagtaaac tggagtggtg ggaaatagca 120agagtgagca catgtcgtat tcaacaagta
taaccagggg ttcatgaggc tcaattagct 180gacactggtt tgactgcatt tagcatttat
ttgtatgtag catatttatc attttgtagc 240atcaacaaca aggtcgctta taatctcata
accacatgat caatgtatac aagaattaag 300aataacacat ataaccaaca taataaacca
tcatttatca ttattaatca tgttcatcag 360agtctatcta ttccgtcagt tttccgggcc
gcccgtatcc gtgggcacgg ctagtatacc 420agatttaaca ctctgcagag gttgtacatc
tttacccacg agtcatgatt taccttttcg 480cccgaggccc gtagacctct tagctcactt
ccaaggaaag ccggcagggt tcactatgaa 540gcctttcaaa gtttcgtcta acaagttagg
gccgcttggt ttcattagtc agtccatgtg 600attcacctgc gggaatccac ggtctgctat
tccccaattg cgccacatgg gtaaccgcta 660acgagctaga aagttactca tacttgacta
aagccagagc catatagccc tcaaggttgt 720acgagttgtc ccagcttttg ccaagggata
agtccttatg gagggtcaag atcattccag 780caaaagctag agttctttcc accctttata
ttcaagttgc tagaaagctt attttattgt 840ttattgtata tccaaacatt catgttacaa
gatcatggat tataatcaag cactagcaag 900aactacccaa atgcatatcc aaataggtaa
caaggaattc agg 9432311130DNASorghum bicolor
231ttcactttcc gatctctgac cggaagttcc gatggaactt ccggtgggtc tgagtctctc
60acaatggtca gatctgcttg gccacaacgg tcagatctgc ttggatcgag gctccaacgg
120tcagatctgg tcagaccgga agttccggtc ccaggcaccg gaacttccgg taccacacta
180taaatactcg tttcccttgt ttccaactgg ggactttgca tagaacacat tttctgactt
240ccggtggctc ttccaaagag tagagaaagc tctcccctcc tcctctctca ctccctaagc
300tttgtgctcc attcaagtga gagattgagc tctagtgata gatctgagag cttcaagagc
360atctcttccc tctcctctcc atccccatag cttggtgctc tttggtgaga ggatttggga
420aaccctagtg ttgtgcattt gtgatttcat tcttgtggca ctaggtggtg attgcaagtg
480tggatttctt gttactcttg ggtgtttccc gacgccctag acggcttggt gcaagggagg
540tgttgagctc gtgattggag attgtttcga gcctcaccaa gtgatttgtg aggggttctt
600gagccttccc cgcaggagat cgcaattggg tactctagtg gattgctcgt ggcttggagg
660atccccatct tatgagtgga tgtgcggcac ccgctgaggg tttggctttg gattgccaat
720tagctcgtga tccatcaagt gggtgtatcg ccataacaag gactagcttg ccgggaagca
780agtgaacctc ggtaataatc ttgtgtcatc tcttgccgag gactctcttg tgattgtgag
840tgattggttg gatatatctc tactctacaa cgttggtata acaatcacta tccactcctt
900tacttacttg tttatcttgc tagttgttta gcttgtttag tttagtcttc tttgtttagg
960agtgtagcaa gtttgtagtt gtgctttctt gttgtaactt gtgtttagct ttcttgctag
1020acttgtgtag gtggcttgca tagcttagtt gtgctagtgc tagaatagct tcaccttttg
1080ttttactaac caacttgtct agttgaagtt tgtagaaatt ttaaataggc
1130232788DNASorghum bicolor 232ctcttgagga gcagaaaaag aaagaagagg
aggaggaaga gcgggagcgc gaggcaacgg 60ggcagccgta ggatcagcag cagcaacagc
agcagcaaga gcagcaatag cagcagcaag 120atcggggggg ccaagaggta ggaccgctcg
agcccccgcc tcaaggtctg gcccaaccag 180tactaccggc gccgcaagag cccggcgagc
aggaggaaga tttgccagta tacggccctc 240cgactccagc gtggctcctt ccgggggcgc
ctgccgcgat ccgggcagag tcgggacgag 300cggacggagt ggtggcactt gcgtgcatga
tgcacacccc cgccgacccc gagtccgaga 360tcaggagggc gatctacggc ttggacggga
tcgccccagg gttcctccgg gagcgccggg 420catgggagga ccactttccc tctgaggaag
acgagatcgg gtcgtcgggg agtaaggacg 480gtacctggaa gacggtggac gacgacgggc
ggttcccgtg cctcgtcgac ctgttcttgc 540gccacggggg ttccctcgag accgtcgaga
acctcatccg cggcatcaag gcgcgggctg 600atcgagagat ggagaactgg tgccccgatc
ggattcgtat ccgggcttcc accgatccgt 660ccgagcccaa catcttcctc gacgacaaga
aggaggccga gaagtgggaa catgtcgagg 720agctccgcct ttggatgaag cacgtggtgg
ggctcgttat cggacatcat caacacgcgc 780tcgggccg
7882331304DNASorghum bicolor
233ctttgcatgc ctacgattca gttttccttg actacgaata tgtttcaaag attcatgatc
60agaatggata ataaactctt tgggccacaa ataatgctgc catgtctcta atgttctaac
120aagagcatcg agttctttat cataagttga ataattgaga acaggcccac tcaatttctc
180actaaaatat gcaataggtt ttccctcttg taataaaaca cctcccaaac caattccact
240agcatcacat tcaagctcaa aagtcttatt aaaatcagga agttgtagga gaggtgcatg
300agttaactta tctttcaaca tattgaatga attctcttgt gctttgcccc aatcaaaagg
360cacccccttc tttgtaagct cattcaatgg tgcagcaatg gtgctgaaat ccttcacaaa
420acggcgatag aatccagcaa gtcctaggaa actccgcacc tgggtgatag tatttgggac
480aggccatccc tgtatagctt ccaccttggc ttgatcaacc tcaattccct gtggagtcac
540aacataacca agaaaagaca ctcgatcggt gcaaaaggtg cacttctcaa ggttaccaaa
600taaacgtgcc tcgcgtagtg cattaaaaac agcacgtaaa tgatcaagat gttcatccaa
660tgatttgctg taaatcaata tgtcatcaaa atatacgaca acaaatttcc caatgaaagc
720acgcaaaacc tcgttcatta atctcatgaa agtactaggt gcattagtta acccaaaagg
780catgactaac cactcataca aaccgaactt agttttgaaa gcagttttcc attcatctcc
840caatttcata cgaatctggt ggtacccact acgtaaatca acttttgaaa acacaacagc
900accactcagt tcatctagca tatcatctaa tcgtggaata gggtgtcgat atcgaatggt
960gatattatta atagctctac aatcaacaca catacgccat gttccatctt tcttaggcac
1020taaaattact ggaacagcac aaggactaag agattctcag acataacctt tgtctagtag
1080ttcttgcact tgtcgctgaa tttcctttgt ttcctccggg tttgtcctgt atggtgcacg
1140atttggcaaa actgctccag gaataagatc aatttggtgc tcaatcccac gtagtggagg
1200cagccccgct ggtacctcac ttggaaacac atcagaatac tcctgcaaaa cgttagcaac
1260agcagggggc aaagaacatt gcatatcctc aattgaaatc aaag
1304234151DNASorghum bicolor 234cgtacttcct aggacggtcc gagtttgggc
ctctcttaag gtagcagata acttcgtata 60atgtatgcta tacgaactta tgcggccgct
ggatcatgaa tctttgaaac atattcgtag 120tcaaggaaaa ctgaatcgta ggcatccaaa a
151235408DNASorghum bicolor
235gcaactccaa cccacctcct aaacaagtcc ttattctaaa tctagggact tgattcaaaa
60aaatcaactc caacccacct actttgatga ggacatcaac accatcaata tatccacacc
120tacaccagtt ttaattgatg cgaacccgct agggtttttg gcctgatctt tcgatgagag
180gcgctggata actcgattag tggaaggaga tgacgttcac ggctcgacta cagccctcaa
240gaccgcgcct tagcaaccga tacaccacgt ccaacagccg tcacgatctt gtggagcgcg
300acaaccagcc cctagggccc ccgtcctcca agcaatcgaa gaactagcaa gaacgaggaa
360acaagcactg aatttgcaag atgaattgac ggttttaaac tgaatctt
408236111DNASorghum bicolor 236caatatctcc aaacagatcg aaccgacctt
ccacttgagc ctcttcacct aggattatca 60tcgggagctt cgataatggt ttctgagcca
tggtgcaata tgcgcaaacc a 111237107DNASorghum bicolor
237aactatgtcc aaacacacag aaccaagatt ccacaaaata gtttcttagc ctatggtgca
60ttaggcacaa accgtgtaca tatcttgcac cgaaactaac actatct
107238182DNASorghum bicolor 238cgatggtact cctaggtgca cggtttgcat
ggaacatatc atatgctcag aaatcaattt 60ggacacaccc gatggaaatg tagatgcacc
cgatggaact actagatgaa gtgtatcata 120tggaatctcg cttcggtcca gttggatata
gtattagttt tggtgcaaga tagttgcatg 180gt
182239245DNASorghum bicolor
239acttcgatct gtttggagat agtgctaatc ttgatgcaag ataggtgcgg gatttgcatg
60gaacatacca tattctcaga aatcaatttg gacgcacctg atacaactcc tagatcatgt
120gtttcatatg gaatcccgct tcaatctgtt tggagatagt gttagtttag ttgcaagatt
180ggtgcatggt ttgcgcataa tgcaccatag gatcagaaac cattatgcaa gcacccgatg
240ataat
245240483DNASorghum bicolor 240tggagtcttg cttcagtcga gttggagata
gtattagttt cagtgcaaga taggtgcata 60gtttgtgccc agtgcaccat atggtcagaa
accattgtgg aagttcacga tggtactcct 120aggtgaaatg tcctaatatg gctagagggg
gggtgaatag cctattcaaa attctacaaa 180ttcactagag cgagaggtta gtaagtaaca
aagcaaagct ttttgctcta gctctaaaag 240gggtgtttgc aagccaccta accaacaatt
ctagttgata taatcactag gcacacaata 300gctatgtcac tacttacaca agagagctaa
ctaaaatttt tatactagta agtaagctac 360tctaacttgc gggaatgtaa gagagatggt
ttgatcttta taccgccgcg tagaggggat 420aaaccaatca ataaaatgaa gtccaatctc
cgggagaaat ccaatcaaca aacacaatgg 480aga
483241470DNASorghum bicolor
241gagaagttat tcttgtgaga tactttctat gtgtgctact tcatcatgca tcatgtttac
60ttttatgcat tgcacttatg tgagatacct tggttgtgag acatgacttg tggttattat
120gatatcttag tgtcatgtgt tttggctcac attacctttg cttccgcgtt tctactccgt
180tgtaactatg agcctttttg tttataccct gttgtatatc actcacatat ttggatgcag
240gatattggtt ttggttgata taagcatgac taatcccttt gttcttattg taaaatgctt
300attgaaacca attctattta aaaacctcaa ctctttttat acacgaggtt gtcatcaatc
360accaaaaagg gggagattga aagagcatct aggcccctag tgatttcggt gattaatgac
420attattgatt actatgacta acgtgtgttt tgcagagaca aagtcatacg
470242748DNASorghum bicolor 242acgacggaat agtttgaatc tcgtgagcag
gcacatgggt cctcttggca aagaacaggc 60atcattcgca gtgtcggacc agtttctcgg
cgtcggcaac cgcggttggc caggaaaagc 120ctgctcggaa ggccttcccg actagtgttc
tggaggcagc gtggttgccg caggacccga 180cgtgtatgtc atctaagagc ttgacgccat
catcttggga tatgcatttc agcaacactt 240ctgagcttgc attctttcgc ataagtttac
catcaaccag tatgtaattc ttagaccgcc 300ggaggaccgc tcgttctccg ttttatcttg
aaaacccgat ccatccgata tatacttgat 360gaagggagca cgccaatcat ggctgctggt
tgtcggattg gtatcgtcta tgagcattgc 420ctcattggat tgcttttcga ccaggtctgt
accaacactt ggcgcgtgta tgtcttgaac 480aaaaacgcct tgcggaatct gggcccgaga
tgaccctatc tttgacaaca catctgctgc 540ctggttccta tcccgaacca cgtggatgta
ttctatgccg tagaaattgg attcccattt 600ccgaatttct ttgcaataga ggtccatctt
ttcgtgggtt atgtcccatt ccttgtgaaa 660gagcatctag gcccctagtg atttcggtga
ttaatgacat tattgattac tatgactaac 720gtgtgttttg cagaggcaaa gtcatagg
748243364DNASorghum bicolor
243ctcggttcgg tcgtgtttgg agatagtgct aaccttgatg aaatataggt gtacggtttg
60catggaacat accatattct tggaaatcaa tttggacgca cccgatagaa ctccaagatc
120atgtgtgtca catggaatct tgcttcaatc tgtttagaga cagtgttagt ttaggtgcaa
180gattggtcca tggtttgcgc ataatgcacc ataggctcgg aaaccgttat ggaagcaccc
240gataatactc ctaggtgaag aggctcaagt ggaaggtcag tttgatctat ttggagatag
300tgttaatctt gaagcaagat aggtgcacgg tttgcatgga acataccata tgctaagaaa
360tcca
3642441493DNASorghum bicolor 244aaataaaatt catcatcagt ttagaaactt
gaattttgca tctcatgctg gagcatattt 60aattctgtgt tgttgtcctt tttgtttgaa
tttatttgat tcaaatttct tttggaaaat 120gctttagaaa gaggaattaa aaaagaaaaa
ggggaaggac cccagccgaa ccccactttc 180ccccccccct cgtgcgcgtg gcccagaccc
ctcggcccag cgagcggccc cgctcgcccc 240cgcgcgtgcc tccccctcct ctcccccacc
gaagactggg ccccactgcg cagcgcctcc 300ttctccccct tcctttcttc tccaccggtc
ggggacgagc cggactcggt ccgagcggta 360acaaactccg gattctcagg gattcgatct
ccctgagtct gttttaagca tcctggagcc 420acccgtcacc tcctcttgcc atcctttgca
ccccgggaac cctagccgcc acttttcgtc 480gagtttcgaa tctcgcagag cttcaaacta
aaccgccgcc gtcgcgggtc acctctgtgc 540cgtctcagct cgagcaaacc atcctagcga
gttcggggta agctcctcca cgcgttggta 600tttttatttc ggagtttggt gttgggaaac
gagaaacccg cgaacgccgg cgagcttcgc 660ggcggtggcc atggcgccac cgaaccggtg
ccgagctccg gccggacgat ggctctgcgt 720gaccaggaag ccgcccagga agcctgccaa
ccgttcaacc gaagatcaac ggccacgatt 780caaacatacc ctttcactgc ggtttttgat
aaagattccc taggtagttt tgtatttcgc 840ccgcggtcct tggcgccctg caccagatta
cgttttccta ttgcgaaagc gtactcctgt 900ccggttgagt tcaaaggcgt tttcacctat
ttacatcttt gccactagat ttgttttgct 960tataaaatgc tcattttaat tccgtttttg
tccattcaaa ttgcgttagg ttcgtaatta 1020tatgctctac atgttagaaa cattagttta
ctgttttgaa actttttatt tctgcagtac 1080tatttaatta attatttctc tataggaaat
cttagaaaat tcatatcttt ctcgttttaa 1140ttctgatttt cgtgaacttt acgttcgtgt
gatcgtagcg ctgcgtagaa tattttcata 1200aacttttatc tttgatttct cactgttggt
gtaatgttct aattatagct tgtttgcttt 1260gtgtatgatt gtctacattg gattgcgtgt
tgttgattga tgattgggat tagacggtga 1320gccgtacgtt ggtgctcaag atcaagcatt
tgaagaccag caggctcagg agagctttga 1380tcaaggcaag tataacttgg gatcatcctt
gttacctata cacaattaat ataacattta 1440tcatatgcat gctatcacct tgaagacctt
agtaaaatca tagatgatta tta 1493245127DNASorghum bicolor
245ataatagtct atgattttgc taaggtcatc aaggtggaca catgcatatg atatatgtat
60taaaattggg taggtaacaa ggataatccc aagttatact tgccttaaac ttgctttaca
120cccaaag
127246467DNASorghum bicolor 246ctcaaaacta gattatatga acagaagaat
acaaaagaag caattcgcta cacacacaca 60cactctctct ctagagtaag gccgaacaag
tatggagtta tctttctttt tggatcacag 120aaggactcac atagacggag ttcaggctat
attaacaagc ggactccaag ttctattcgg 180tttccactca gaaccaacca caaaacggat
tccacctatt gccgagttct actcggtttc 240cttctttttt tatcacgttg gtggaactca
gattggaact ctactcgata cggaccaaga 300ccacagttca attcggactt cgcaacagat
tcgatagaga gaagatatgg taatacttgg 360ctgtgtatag atggtgacaa gaactcgaaa
ctctaaagga ctagacacta agactagcaa 420ctcgacacaa accgaagcaa cgcaaattca
acaagcccta actaaaa 467247411DNASorghum bicolor
247tatctccaac tggactaaag caagattcaa tatgatacac ttcatctagt aattccatcg
60ggtgcatcta cagttccatc gggtgtggct aaattgattt gtgagcatat ggtatgtaca
120atgcaaaccg tgcacctatc ttgcatcaag attagaacta tctccaaaca gacaaccaag
180attccacatg agcctcttca ccaaggagta ccatcgcgtg cgtccaaaat agtttcttag
240cctatggtgt tttaggaaca aaccgtgtac ctatcttgca tcgaaactaa cactatctcc
300aaagagacca aagtgagatt ctatgtgata cacgtcatct aggagttcta ttgggtgctt
360ccaaatatat ttctaagcat atggtacgtt cgatgcaatt catgcaccta t
411248176DNASorghum bicolor 248ttgatttccg agaatatggt atgttccatt
caaaccgtac acctatcttg catcaagatt 60agcactatct cgaaatagac cgaaccaagc
tttcacttga gcctcttcac ctaggagtac 120catcgggaac ttccacaacg tttctgagcc
tatggtgcac tgggcacaaa ccatgc 176249557DNASorghum bicolor
249gattggtgca tggtttgcgc ataatgcacc ataggctcag aaaccattat ggaagcaccc
60gataatactc ctaggtgaag aggctcaagt gaaaggtcgg ttcgatctat ttggagatag
120agataatctt gatgcaagat aggtgcatgg tttgcatgga acaaaccata tgctaagaaa
180tccatttgga tgcatcccaa tagaactcct agatgacatg tgtcatatag aatctcgctt
240tggtctattt ggagacagag ttagttttag tgcaagaaag gtacacagtt tgcgcctaat
300gcatcatagg ctaagaaacc atttaggatg cacttgatga tactcctggg taaaggggct
360caagtggaag ctcagtttgc tttgtctgga gatagtgcta atcttgacgc aagataggcg
420cacgaattgc atcgaacgta ccatatcctt agaaatatat ttggaagcac caaatagaac
480tcctagatga tgtgtgtcat atagaatctc acatcggtct ctttggagat agtgtttgtt
540tcggtgcaag ataggta
557250243DNASorghum bicolor 250ggtttgtttg gagatagtgc taattgatgc
gaacccgcta gggtttttgg cctgatcttt 60cgatgagagg cgctggataa ctcgattagt
ggatggagat gacgatcacg gcccgactac 120agccctcaag accgcgcctt agcaaccgat
acaccacgtc caacagccgt cacgatcttg 180tggagcgcga caaccagccc ctagggctcc
cgtcctgcaa gcaatcgaag aattagcaag 240aac
243251265DNASorghum bicolor
251cctaggtaaa ggggctcaag tggaagcaca gtttgcttct tttggagata gtgccaatct
60tgatgcaaga taggtgcacg aatcgcatcg aacgtaccat atgcttagaa atatatttgg
120aagcacccaa tagaactcct agatgacgtg tgtcacatag aatctcactt cgatctcttt
180ggagatagtg ttagttttgg tacaagatag gtacacgatt tgtgcctaat gcaccatagg
240ctaagaaact attttggacg cacat
2652522236DNASorghum bicolor 252aagaaaagag aaaagattca aaaaaggggc
tgtttttcat attgatttta ggtttgttcc 60accttgtttt tgggggtgtg ctgtggtttt
cctttgtgtc caggctcgcg tctctagcac 120ggtctagcct aggaccagca cagtaccatc
atcgaacgct tattcagctc gcttttataa 180ctaacatggt gctagttcgt tccttgtttc
agcccaccta tagctccaca tactctacag 240cttgacaggt cttgtgctgc agcaccgata
cacttcgtcc attgctgtac acttgttggc 300agacgacccc tcctgtcaag caaggtaaga
attggtaaga acttgtgtta caggttgagt 360gtgagcgact tgctgtagct acatcctagt
agttgtaggg cttttatttc ttcacttgcc 420tttttgttgt ctttgtcttt gaaccatgct
aggggcagat gatggtaacg aaacaccaca 480tacacctcgc actaagggca tcatacaaca
ttttgaaagg aaagtgaggc tgcacacaga 540gggacttgat aacgacttgc aggtgacaaa
tgaaaagctg gggcagttag aggctacgca 600gattgccaca aaaaacaagc tcacacgttt
ggaggaatct gttgctagtg tggacaaaag 660ccttgctgct ctcctaaggc gatttgatga
ttatcaagac actcgtgatc gacgtcgcct 720tcgtcacaac cgtagaggta tgggtggcaa
ccgccgacgc gaggtacaca ataatgatga 780tgctttcagt aagattaaat ttaagatacc
tccttttgat ggtaaatatg accctgatgc 840ttacatcact tgggagattg ctgttgatca
aaagtttgca tgtcatgaat ttcctgagac 900tacacgtgtt agggctgcta ctagtgagtt
tacagatttt gcttctattt ggtggataga 960atatggaaag aaaaatccta ataacttacc
tagaacttgg gatgcgctga aaagggccat 1020gagagctaga tttgttccat cttactatgc
gcgtgatatg ataaataagt tgcagcaatt 1080aagacaaggt gctaaaagtg tagaagaata
ttatcaggaa ttacaaacgg gtatgttgcg 1140ttgtaaccta gaggaggatg aggaaccggc
tatggctaga tttttgggtg ggttaaatca 1200ggaaattcag gacatcctcg cttacaaaga
atacaataat gtaacccgtt tgtttcatct 1260tgcttgtaaa gctgaaaggg aagtgcaggg
acgacgtgct agcacaagga gaaatatttc 1320tgcagggaag gctaattcat ggcagcaaca
cgtggcttca actccatcta cacgtatttc 1380tactccatca tctagtgaca agactcgaac
tgcccccacc aattcagttg cgaagacgat 1440gcaaaagcct gctgcgagta cttcatccgt
ggcatcgaca ggtagaacaa gcaacataca 1500atgtcaccgg tgcaagggat atgggcacat
gatgcgtgac tgtccaaaca agcgagttat 1560gattgtcaga gatgatggtg agtactcatc
tgctagtgat tttgatgagg atacgcttgc 1620actgcttgca ggaccatgca ggtaatgaag
atcaaataga agaacatatt aatgcaggtg 1680aagcggacca ctatgagagc ttgatcgtgc
agcgagtgct tagtgcacaa atggagatgg 1740cggagcaaaa tcagcgacac attttattcc
aaacaaagtg tgtcatcaaa gagcgttctt 1800gtcgcatgat cattgatgga ggtagctgca
acaacttggc aagcagcgat atggtgcaga 1860agcttgccct caacaccaaa ccacacccgc
atccctacta catccaatgg ctgaacaaca 1920gtggtaaggc aaaggtaact agacttgtga
gaattaattt ttccatcgga tcctacaaag 1980atattgtcga atgtgatgtt gtgcctatgc
aagcttgtaa cattctgcta ggtagacctt 2040ggcaatttga tagagattct atgcatcatg
gtagatcaaa tcagtattct tttctatacc 2100atgatctcaa aattgtgttg catcctatgt
cccctgaaac tattatgcaa actaatgttg 2160ctagagctac taaagcaaag agtgagagca
ataaaaatga taaatctgta attggtaaca 2220aagatgagat aaaact
22362531013DNASorghum bicolor
253aaaaggacgt tgtatgatag ctaccaaatc agatattaat gagttcaatg catccacttc
60tgttgcttat gctttgatat gcaaggatgc tttgtgtcgg tgttttcccc acggggggtc
120acaccaacga gtgaatttgt atgcgtgctt ccctttccca gatggtgatg caagaagaca
180ccaagattta tcctggttcg ggcaagagaa ggccctacgt ccagcgaggg aggggagttt
240gtattatctt gcacctaagt gcttgtacag gggcgaatac aagtgtgtat gaactgggat
300ggagtatgga gactcctcta tgtgtgggtg tgctgttttc gtgatgtgtg ttcattcccg
360ggtctcccct tttatagctc caaggagaga cctagggtac atgtataggt gacagagtag
420gggtcgatag aggcatggtg cgctgaccta ctcgaggcct ccgtaccggc gtggtctcga
480gccgtcccgt cctggtagct tgacgatgat tacgcgtgcc ctcatatccc gctctgtacg
540ccgtcttggg ccggtttgtc cgttgcacgc ttgtacgtca tggcggcaga tacgtcagag
600tggtggcagt gccgtcagtc gacgcccaac ccttcccaag aaggaaacac gtccggtcga
660gggtcggacg ccatgtagcg ccttgctatc tagaccattg acttttggcg tctcgagggg
720gtcctgacta gacgttccat cccggctccc cgcgtcctga cacgctcagc gtcatttggt
780ggggaaggct gcaaaaagaa tgacgggacg ggtgcctgtt ccccatcatg cttcccgtga
840ctagcgtgtt aggtgggaaa gcgacaggcg cattaaatgc ccgggttctc gtgcgcgcgt
900caggcggcgg gcggtgtcct tgcgctcgac gcctcctggt tcgctcgagc ccatttccgt
960attcgacggt agttaatgct cgacccctga gcggttcgct cgacgccttt tcc
1013254121DNASorghum bicolor 254gtgtgtccaa aatagtttct taccctatgg
tgcattaggc acaaatcatg tacctatctt 60acaccgaaac taacactatc tccaaaaaga
ccgaagtgag attctatatg acacacgtca 120t
121255181DNASorghum bicolor
255tgcatggaac gtaccatatg cttagaaata tatttgcaag cacccaatag aactcctaga
60tgatgtgtgt catatagaat ctcacttcgg tctctttgga gataatgtta gttttggtgc
120aagataggta cacggtttgt gcctaatgca ccatggggta agaaactatt ttggacacac
180g
181256125DNASorghum bicolor 256accgaaacta acactgtctc caaatagatt
gaatcgagat tccatatgac acacatgatc 60taggagttct atcgggtgtg tccaaattga
tttatgagaa tatggtacgt tctgcaaacc 120ataca
125257332DNASorghum bicolor
257aatcttttct cttttctttt ttttttcttt tttgtttctt tttttttggg caacctcaaa
60aactgattat atgaacaaag gaatacaaaa gagcaattcg ctacacacac tctctctctc
120tctagagtaa ggccgaacaa gtatggagtt atcttttctt ttggatcaca gttcaattcg
180gacttcacaa cagattcgac agagagaaga tatggtaata ctcggctgtg tatagatggt
240gacaagaact cgaaactcta aaggactaga cactaagacc agcaactcga cacaaaccga
300agcaacgcaa attctacaag ccctaactaa at
332258221DNASorghum bicolor 258caaatcttgt acaaatatta aattcatttt
attttgttta gtgtttttct aatgcacaac 60ccaaaattag aaattttggg gtgtgacagg
gtgcatccaa aatggtttct taacccatga 120tgcattaggc gcaaactgtg tacctatctt
gcaccaaaac taactcggtc tccaaacaga 180ccgaagcgag attccatatg acacatgtca
tctaggagtt a 221259170DNASorghum bicolor
259aatggtactc cttggtgaag aggctcaagt ggaatcttga ttctgtctat ttggagatag
60ttctaatctt gatgcaagat agttgcacga tttgcatgga acatacccta tgctcagaaa
120tcaatttgga caaacccgat ggaactgtag atgagatgtg tcatatggaa
170260133DNASorghum bicolor 260tactcgtttt gtgcctaatg caccataggc
taagaaacta ttttagacgc acgcgatgat 60actccttggt gaagaggctc atatggaatc
ttggttctat ctgtttggag atagttccaa 120tcttgatgca aga
133261191DNASorghum bicolor
261aggagtatca tcgagtgcat ccaaaatggt ttcttagcct atgatgcatt aggcgtaaac
60tgtgtaccta tcttgcacca aaactaactc tgtctctaaa tagaccaaag cgagattcca
120tatgacacat gtcatctagg agttctattg ggatgcatcc aaatggattt cttagcatat
180ggtttgttcc a
191262187DNASorghum bicolor 262aatcaacaat ggggttccga agacaaggag
acgggcggct gatcctgcac gcgcgcctac 60aagcaagtag cgaaggctaa acttgatcta
aacaaaaccc agttgttcat ggcggctcta 120gatgtaaata aatagagggg aggacgacca
aaggggtgct agagtcgtcc tccaacccta 180ggacgca
187263121DNASorghum bicolor
263taggtgcacg gtttgcatgg aacataccat atgctcagaa atcaatttgg acacaccgat
60ggaactgtag atgcacccga tggaactact agatgaagtg tatcatatgg aatctcgctt
120c
121264286DNASorghum bicolor 264gtgcattagg cacaaatcgt gtaccaatct
tgcaccgaaa cgaacactat ctccaaagag 60accgaagtga gattctatat gacacacgtc
atctaggagt tctattgggt tgcttccaaa 120tatgtttcga agcaaatggt atgttccatg
caattcatgc acctatcttg catcaagatt 180agcactatct ccaaacaaaa gcaaacggag
cttccactgg agccccttta cccaggagta 240tcatcaggtg catccaaaat ggtttcttag
cctacgatgc attagg 286265119DNASorghum bicolor
265gcaaaccgtg cacctatctt gcatcaagat tagctctatc tccaaataga tcgaaccgac
60cttccacttg agcctcttca cctaggagta ttatcagggt gcttccataa tggtttctg
119266331DNASorghum bicolor 266tggaatctcg cttcggtcca gttggagata
gtattagttt cagtgcaaaa tatgttttgt 60gcctagtgca ccataggctc agaaatcgtt
gtggaagttc ctgatggtac tccttgatgc 120gaacccgcta ggggttttgc ccgatctttt
gatgagagac ggggataact cgattggtgg 180atggagatga cgttcacggc ccgactacag
ccctcaagac cgcgccttag aaaccgatac 240accacgtcca atagccgtca cgatctagtg
gagcgcgaca accagcccct agggctccca 300tcctgcaagc aatcgaagaa ctagcaaaaa c
331267152DNASorghum bicolor
267tttggagata gttctaatct tgatgcaaga taggtgcaca gtttgcatgg aacataccat
60atgctcagaa atcaatttgg acaaacccga tggaactgta gatgcacccg atggaactac
120tagatgaagt gtatcatatg gaatctcgct tt
152268138DNASorghum bicolor 268ttgatttctt agaatatggt acgttccatg
caaaccgtac acctatcttt catcaagatt 60agcactatct ctaaacagac caaaccaagc
tttcagttga gcctcttcac ctaggagtga 120tgaggacatc aacaccag
138269139DNASorghum bicolor
269acacacccga tgtaactgta gatgcaccca atggaactac tagatgaagt atatcatatg
60gaatctcgct tcagtccagt tggagatagt attagtttcg gtgcaagata gttgcatggt
120ttgtgcccag tgcaccata
139270169DNASorghum bicolor 270agattagaac tatcaccaaa cagatcgaac
cgaccttgca cttgagcctt ttaacctagg 60attatcattg ggtgcttcca taatggtttc
tgagcctgtg gtgcattatg cgcaaaccat 120gcaccaatct tgcacctaaa ctaacactat
ctctaaacag attgaagcg 169271465DNASorghum bicolor
271ctaggagtat tatcggatgc ttccataatg gtttccgagc ctatggtaca ttatgcgcaa
60accatgcacc aatcttgcac ctaaactaac actgtctcca aacagattga agaaagattc
120catttgacac acataatcta ggagttctat cgggtgcatc caaatttatt actgagaata
180tggtatgttt catgcaaatc atacacctat cttgcatcaa gattagcact gtctccaaat
240agaccgaacc gagctttcgc ttaagcctct tcacctagga gtaccatcgt ggacttccac
300aacggtttct gagcctatgg tgcagtgggc acaaaccatg cacctatctt gcaccgaaac
360taatactatg tgatgaggac atcaacacca gcgatacata tcctacatcc accagcgata
420catatcctac atcctctccg gcacaaccta tagctggtcc tacat
465272205DNASorghum bicolor 272actgagcttc cacttgagcc cctttaccca
tgagtatcat cgggtgcatc caaaatggtt 60tcttagccta tgatgtatta ggcgcaaatt
gtgtacctat ctcgcaccaa tactaacttt 120gtctccaaac agaccaaagc gaaattccat
atgacacatg tcatctaggg gttctattgg 180ggtgtgtcca aatagatttc ttagc
205273100DNASorghum bicolor
273tgcaaactgt gcacctatct tgcatcaaga ttagctctat ctccaaatag atcgaatcga
60ccttccactt gagcctcttc acctaggagt attatcgggt
100274166DNASorghum bicolor 274atagaactcc aagatcatgt gtgttatatg
gaatctcgct tcaatctgtt tggagaaagt 60gttagtatag gtgcaagatt agtgcatggt
ttgcgcataa tgcaccatac gctcagaaaa 120cattatggaa gcacccgatg ataatcctag
gtgaagaggc tcaagt 166275107DNASorghum bicolor
275cacccaatag aactcctaga tgacgtgtgt cacatagaat ctcacttcga tctctttgga
60gatagtgtta gtttcggtgc aagataggta cgcgatttgt gcctaat
107276159DNASorghum bicolor 276aagattccat atgacacaca tcatcaagga
gttatatcag gtgcgtccta attgatttct 60aagccaagtg gaggcttggt tcggtctgtt
tagagatagt gctaatcttg atgcaagatt 120ggtgcaccat atgctcagaa atcaaattgg
acgcaccag 159277730DNASorghum bicolor
277tagaatttgg catgagattg gattgctttt agtcagcctc ttatagccta aagtctttga
60gtgactagat gacatatcat gtaagttgct gataggtttc cagttttccg ctcctaggtc
120tgcatattgt acttttcctc ttactcgact taaccagtac caacccagct tctcaacgga
180tttataccat ggcactttaa agccagcatc actgacaatg agcggtgtgg tgttactcgg
240tagaatgctc gcaaggtcgg ctagaaattg gtcatgagct ttctttgaac attgctctga
300aagcgggaac gctttctcat aaagagtaac agaacgaccg tgtagtgcga ctgaagctcg
360caataccata agccgttttt gctcacggat atcagaccag tcaacaagta caatgggcat
420cgtattgccc gaacagataa agctagcatg ccaacggtat acagcgagtc gctctttgtg
480gaggtgacga ttacctaaca atcggtcgat tcgtttgatg ttatgttttg ttctcgcttt
540ggttggcagg ttacggccaa gttcggtaag agtgagagtt ttacagtcaa gtaaggcgtg
600gcaagccaac gttaagctgt tgagtcgttt taagtgtaat tcggggcaga attggtaaag
660agagtcgtgt aaaatatcga gttcgcacat tttgttgtct gattattgat ttttcgcgaa
720accatttgat
730278408DNASorghum bicolor 278gaaactaaca ctatctccaa acggaccaaa
gcaaggttac atatgacata caacatctag 60tagttccatc gggcgtgtcc aaattgattt
cgggcatata gcacgttaca tgcaaaccgt 120gcacctatct tgcatcaaga ttagcagtat
ctccaaacag accaaaccaa gcttccactt 180gagtctcttc acctaggagt agcaacaggt
gcatccataa tggtttctta tgctattgtg 240cgttaggcgc aaactgtgca catatcttcc
accaaaatta acactacctc caaacagacc 300aaagtgagat ttcatatgac acacatcatc
taggagttcc atcgggtgtg tccaaattga 360tttccgagca tatagtacgt tcatgcaaac
cgtgcaccta tcttgcat 408279113DNASorghum bicolor
279ggtttgcatg ggacatacca tattcttgac aatcaatttg gatgcacctg atggaactcc
60tagtgacttg tgtcatatgg aatctcgctt cgcccgtttg gagacagtgt tag
113280243DNASorghum bicolor 280agatagtgct aatcttgatg ctagataggt
gcacaatttg catggaacat accatatgct 60cggataaatt tggacgcacc caatggaact
cctagatgat gtgtgtcata tggaatcttg 120cttcggtttg tttagagata gtgttagttt
tggtgcaaga taggtgcacg gtttgcacct 180aatgcaccat actctaagaa aacattttgg
atgcacctga tggtactcct agctaaagag 240gct
243281516DNASorghum bicolor
281aaaccatttt ggatgcactt gttggtactc ctaggtgaag gggctctagt ggatgctcgg
60ttcggtctgt ttggagatag tgctaatctt gatacaagat agatgcacag tttgcatgga
120acataccaaa tgcttggaaa tcaatctaga cgcacttgct ggaactccta gatgacatgt
180gtcatacaaa atcttgcttt ggtctattta gagatagtgt tagttccggt gcaagatagg
240tgcaaggttt gcacataatg cagcataggc aaagaaacca ttttagacgc acccaatggt
300actcctaggt aaaaggctca agtgaaagct cgatttggtc tatttggaga tagtgctaat
360ctggatgcaa gatagatgca cggtctgcat gaaacgtacc atgtgcttag aaattaattt
420ggacacaccc gatacaactc cttgatgatg tgtgtcgtat ggaatcttgc tttggtgtgt
480ttagagatag tattagtttc ggtgcaagat atgtgc
516282314DNASorghum bicolor 282ttgggttcag tctgtttgga ggtattgcta
atcttgatgc aagataggtg catggtttgc 60atggaaggta ctatatgttt ggaaatcaat
ttggacgcat ccgatgaaac tcctagatgt 120catgtgtcat atgtaacctc gtcttggtct
gtttggagat agtgttggtt tcggtgcaag 180ataggtgcac ggtttgtgcc taatgcacca
tactctaaca aaccattttg gacacacctg 240attatactcc tagcagaaga ggcacaagtg
gaagctcggt ttggtctgtt tggagatagt 300gctaatcttg acgc
314283776DNASorghum bicolor
283gatcgaacca agcttccact tgagcctctt catctaggag tacaaacagg tgcgtccaaa
60atggtttctt agactatggt gcattaggca caaactatgc acgtatcttg caccagaact
120aacattgtct ccaaacagac cgaagcgaga ttccatatga cacacgtcat ctaggagttc
180cattgggtgg gtccaaattg atttcttaac atatggtacg ttgcatgcaa actgtgtaac
240tatcttgcat caagattagc actatatccg aacagaccaa atcgagcctc cacttgagcc
300tcttcaacta tgagtatcat cgggtgcgtc taaaatggtt tcttacccta tggtgcatga
360ggcgtatacc gtgcacatat catctataaa aactaacact gtctctaaac gtatcgaagc
420aagattcaat atgacacaca tcatcaagga gttatatcag gtgcgtccta gttaattttt
480gagcatatcg tatgttccat gcaaaccgtg catctatctc gcatcaagat tagcactacc
540tccaaacata ccaaactgag ctttctcttg agtcccttga cctaggagta ccatcgggtg
600cgtccaaaac tgtttcttat cctatggtgc attatctgca aactgtgcac ctatcttgca
660ccgaaactaa cattgtctct aaacaaaccg aagtgagatt ccatatgaca tgtgtcatct
720aggagtttca tctggtgcgt caaaattgat ttctgagtat atggtatgtt ccatgc
776284311DNASorghum bicolor 284tagaccaaac tgagctttca cttgagcccc
ttgacctagg agtaccatcg ggtgtgtcca 60aaatggtttc ttatcctatg gtgcatttgc
tgcaaatcat gcacctatct tgcaccgaaa 120ctaacactgt ctctaaacgg accgaagtga
gattccatat gacacatgtc atctaggagt 180tccatctggt gcatctaaat tgatttctga
gcatatggta tccatgcaaa tcgtgcacct 240atcttgcatc aagattagca ctatctctaa
acagaccgaa ccggggcctc cacttaagcc 300tcttcaccta c
311285433DNASorghum bicolor
285gtaatatagg tgcacggttt gcatggaaca taccatatgc attaaaatca ttttggacgc
60acccgataga actcctagat gatgtgtgtc atctgatatc tcgctttggt ctgtttgaat
120atagttttag tttcggtgca agatggtgaa tggtttgtgc ctaatgcacc atactctaaa
180aagatattgc acggtttatg cctaatgcac catactctaa gaaacggttt tggacgcacc
240cgatggtact cctaggtgaa gggtctcaag tgaaagctca gtttggtcta tttggagata
300ctgctaatct tgatgcaaga tatttgcatg gtttgcgtag aatgtcccat atgctcagaa
360atcaatttgg atgcacctga tggaactcct agatgacgtg tgtcatatgg aatctcggtt
420tggtccgttt gta
433286767DNASorghum bicolor 286cttcctcttg agcctcttca gctaggagta
ccatcgggtg cgtccaaaat ggtttcttag 60tattatggtg catttggcac aaaccgtgca
cctatcttgc accaaaatta acactatctc 120caaacagacc aaagtgagat attagatgtc
ttacacaatc taggagttct atcgggtgcg 180tccaaattga ttttcaacca tatggtatgt
tccatacaaa ccgtgcacct atcttatatg 240aagattagca ctatctccaa acagatcgta
ccgagctttc actttagcct cttcacctag 300gagtaccatc gggtgcttcc acaactgttt
ctaagcctat ggtgcattag gcgcacatca 360tgcacctatc ttacaccgaa actaacacta
tctccaaacg gatcaaagtg agattccata 420tgagacacgt catctaggag ttctatcaag
tgcgtcgaaa ttgatttccg agcatatggt 480acgttccatg cataccgtgc acctatcttg
cgtcaagatt tgtagtatct ccaaacagac 540cgaaccaagc ttccacttga gcctcttcac
ctaggagtac caacaggtgc atctaaaatg 600gtttcttggg ctatggtgca ttaggcataa
actgtgcacc tatcttgcac tgaaactaac 660actgtctcca aatagatgaa gcgagattcc
atatgacaca cgtcatctag gagttccatc 720acgtgcatcc aaatcgattt tcaagcatat
ggtatttccc atgctaa 767287415DNASorghum bicolor
287ttgtatgcaa gataggtgca cagtttgcat ggaatatcgt agatgcttgg aaatcaaatt
60ggatgcaccc gatggagctc ctagatgaag tgtgccatat ggaatctcgc ttcggtccgt
120ttggagacaa tattagtttc agtgcaagat aggtgcacag tttgcgccta atacaccata
180gtctaagaaa ccatttttta cacacctgtt ggcactcctg ggtgaagagg ctcaagtgga
240atctcggttt ggtctatttg tagatagtgc tagtctggat gcaagatagg tgcatgcttt
300gcatggaaca taccatatgc ttagaaatca atatggacgc acctcatgaa actcctagat
360gacgtgtgtc atatgaaatc tcgctttggt ctatttggag atagtgttag tttcg
415288749DNASorghum bicolor 288caccatagcc taagaaacca atttggacgc
acccgattgt actcctagct gaagaggctc 60aagtggaagc tcggtttggt ctgtttggaa
attgtgctaa tcttgacgca agacaggtgc 120acggtttgca tggatgtacc atatactcag
aaataaattt ggacgcaccc gatagaactc 180ctagatgacg tgtgtcatat ggaatctcac
tttgatccgt ttggagatag tgttagtttc 240ggtgtaagat aggtgcacag tgtgcgccta
atgcaccata ggcttagaaa cagttgtgga 300agcacccgat ggtactccta ggtgaagagg
ctcaagtgaa agcttggtaa gatttgtttg 360gagatagtgc taatcttcat gtaagatagg
tgcaaggttt gtatggaaca taccatatgc 420ttgaaaatca atttgcacgc acccgataga
actcctagat ggtgtgtgtc atctaatatc 480gctttggtct atttggagat agtgttagtt
tcggtgcaag ataggtgcac ggtttgtgcc 540taatgtacca tactctaaga aaccattttg
gatgcagcca atggtactcc tagccgaaga 600ggctcaagtg gaaggttggt tcggtctgtt
tggagatagt gctaatcttg atgcaagatt 660gcacagtttg catggaatgt atcatatgct
cagaattcaa tttggatgca cctgatagaa 720ctcctagctg aagaggctca agtggaagc
749289185DNASorghum bicolor
289atggaactcc tagatgacat gtgtcatatg gaatctcact tcggtctgtt tagaggcagt
60gttagtttcg gtgcaagata ggtgcactgt ttgcacctaa tgcaccatag gataagaaac
120cattttggac gcacccgatg gtactcctag gtcaaggggc tcaagtgaaa gatcggtttg
180gtctg
185290149DNASorghum bicolor 290aatggaaaac ctagatgacg tctgtcatat
ggaatctcgc ttcgatttgt ttggagacaa 60tgttagtttt ggtgcaagat aggtgtatag
tttgtgccta atgcaccata gtctaagaaa 120ccatttttga gcacctgttt gtactccta
149291180DNASorghum bicolor
291aatggaactc ctagatgacg tgtgttgtat ggaatctcac ttcggtctgt ttggagacaa
60tgttagtttc ggtgcaagat cggtgcatag tttgtgccta atgcaccata gtctaagaaa
120ccatttttga cgcacctatt tgtactccta gatgaagaag ctcaagtgga agcttagttc
180292167DNASorghum bicolor 292ggtctattca gagatattgc ctaatcttga
tgcaagatag gtgcactatt agtgtggaac 60atatcatatg ctcggaaatc aatttcgaca
cacccaatga aactcctaga tgacgtgtgt 120catatggaat cttgcttcgg tctttttaga
gactgttagt ttcactg 167293301DNASorghum bicolor
293acctaggagt accaataggt gcttccaaaa tggtttctta gactttggtg cattaggcgc
60aaatcgtgta catatcttgc accgaaacta atactctcta tggacaccaa agcgagattc
120catatgacac acgtgatgaa ggagttctat cgggtgtgtc caaattaatt tctaagcata
180tggaacgttc catgcagacc gtgcatctat cttgcatcaa gattaacact atctccaaac
240agatcaaacc gagcttccac ttgagcctct tcaccgaagt accaacaggt gcattcaaaa
300t
301294591DNASorghum bicolor 294ctatgagtac catcgggtgc gtctaaaatg
gtttcttagc ctattgtgca ttaggcgtaa 60accatgcaca tatcttctac caaaactaac
actgtctcta aatcaaacga agcaagattc 120catatgacac acatcatcaa ggagttatat
catgtccgtc ctaattgatt tctgagcata 180tcatatgttc catgcaaacc gtgcatctat
cttgcatcaa gattagcact atctccaaac 240agaccaaagc aagctttcac ttgagcacgt
tgacctggga gtaccatcgg gtgcgtccaa 300aaaggtttat tatcctatgg tgcatttggt
gcaaaccgtg cacctatctt gcaccgaaac 360taacaatatg tctaaatgga tcgaagtgag
attccatatg acacatgtca tctaggagtt 420ccatctggtg cgtccaaatt gatttccgag
catatggtat attccatgca aatcgtgcac 480ctaccttgca tcaagattaa cactatctct
aaacagaatg aaccaagcct ccacttgagc 540ctcttcacct aggagtacca acaggtgctt
caaaaatggt ttcttagact a 591295104DNASorghum bicolor
295gatagtgtta gtttcggtgc aagataggtg gacagtttgc gtctaatgca ccatagtcta
60ggaaaccatt ttggacgcac ctattggtcc gaggtgaaga ggct
1042961077DNASorghum bicolor 296gatctatttg gagatagtgc taatcttgat
gcaagatagg tgcatgattt gcagggaaca 60taccatatgc ttagaaatca atttggacgc
acctaatgga agtcctagat gacgtgtgtc 120atatggaatc ttgcttcagt ccgtttgtag
acattgtaag ttttagtgca agataggtgc 180atagtttgcg cctaatgcac catagtctaa
gaaactattt tggacgcaat atttggtact 240cctaagtgaa gaggctcaag tggaagcttg
gttcggtcag tttggagata gtgctaatcc 300atatgcaagg aaggtgcatg gtttgcatgg
aacataccat atgcttggaa atcaatttgg 360atgcacccga tggaactcct tgatgacgtg
tgtcatatgg aatctcgctt cggtcaattt 420ggagagattg ttagtttcgg tgcaagatag
gtgcacggtt tgcacctaat gcaccatagg 480ataagaaacc attttggacg cacccgatgg
ttctcctaga tcaaggggct caagtgaaag 540cttggtttgg tctgtttgga gatagtgcta
atcttgatgc aagatagatg cacggtttgc 600atggaacata cgatatgctt agaaatcaat
taggacgcac ctgatataac tccttgatga 660tttgtgtcgt atggaatgtt gctttggttc
gtttagagac agtgttagtt ttggtagaag 720atatgtgcac ggtttacgcc taatgcacca
taggctaaga aaccatttta acgcacccga 780tggtactcgt agttgaagag gctcaagtgg
aggctcgatt tggtctgttc ggatatagta 840ctaatcttga tgcaagatag ttgcacagtt
tgcatgcaac gtaccatatc ttaagaaatc 900aaattggacg cacccaatgg aactcctaga
tgacgtctgt catatggaat ctcgcttcgg 960tctgtttgga gacaatgtta gttttggtgc
aagataggtg catagtttgt tcataatgca 1020ccatagtcta agaaaccatt tttgacacac
ttgtttgtac tcctagatga agaggct 10772974657DNASorghum bicolor
297aggcatatcg tatattccat gcaaaccgtg catcgatctt gcatcaagat tagcactatc
60tacaaataga ccgaaccgag agcatccact tgagcccctt caccttggag taccaacagg
120tgcgtccaaa atggtttctt acactatggt gcattaggtg caaactgtgc acctatcttg
180caccaaaact aacaatgtca tgtgttgtgt gcacccacga tcaagtaccc aatgccttcc
240tccggcttta taatttatct acaaaacaaa tcaatgtttc ttaggtaccc atacttgttt
300gggtctttgg aggttagtga ctaagctttt tggtacccaa atggccttct tctttggacc
360cacaattggt gtaccaacaa acttagcatg cacacccttc tcacccttgg taagcacata
420acaagaatca aatagaatgt aagatacatt agcatgtttc ttgttcttgt tcactttatt
480gcaattaagc tctaggtgcc caacttgctt gcatttgttg caataccaac tattgctctt
540cacaaagcta ggcttgtgag ttgcaaaggc cgccttgcct ttcttggggg tatagcctaa
600tccctctttg ttgagagaaa acctttgact acccaagcac tttagcaagc gggcctcacc
660accataggcc ttgcctaggg tgtgagtgag ctcatccacc tctctcttga gagtctcatt
720ctcaaccttt agtgaggtat cacaagtgac actaacacta gaagtagagg tagagttggt
780ggtggatgaa gacctacaag aagagttagt ggcaacaaca ataggtgggt tgggtgattc
840tttaattaag tcacaagttg tgcccacatc acatgataca acaacctttt ccctgttctc
900ttctaacaac aaggagtgag ctttctcaag cttggtgtga gccttgccaa gcttctcatg
960ggcttccact agcctctcat gattagcatt gagctcatca aaggattgct caagagcttt
1020taacttcttg gccaattctt tgcacttctt gtttttagat tgaatgagag attcggcttg
1080ttctaacatg tccatgagtt gttcattagt gaatttattt tcatcatcac tatcatgttc
1140atgttcactt tcactttcat catattgtac cttgtggcac ttggccatga agcaagaagg
1200agtgttgaag agtgatggct tgttgatagc aatgcttgca agtgccttct ttttggatga
1260tttgtcatca tcgcttgagt tatcatccga agaggcatca ctatcccatg tcacaacata
1320ggatccaccc ttcttcttct tcttgaagac catcttcttc tctttctttt ctttcttctc
1380cttcttgttc ttcttgttat gctcatcatt gtcactattg taaggacaat ccgccacaat
1440atgatcgggg ctcttgcact tgtagcatag cctcacatat tccttgttct tggagttgtc
1500ccttctcttc cttatatggt agcccttttt cttcatcatc ttgccaaact tgcggacaaa
1560gagtgctagt gcttcatcat cactatcatc attggaacac tcctcatcac ttgattcttc
1620cttgttcttg ctcttagatg aagagctagc tttgaatgct ccactcttct tcttcttatc
1680atcatcatcc ttcttcatga cctcatcatc atcattgtca ttatattgat cttcggtcat
1740gacatctcca agtacctcat taggagatac acctgtcaat ccacctctaa agatgattgt
1800tctcaatgtc ttgaacctct tgggcaagca catcaagaac ttgtgaatga agtcctcatc
1860cttcactttc tcaccaaggc ccttgaggtc attgatgatg acttggagcc tatagaacat
1920ttctggaatt gactcatcat ccttcatatt gaagtttgat agcttgtcct tgagcatgta
1980caacttggca ccctttaccg ttgtagtgcc ctcatatgtt tcttctagcc ttgtccacac
2040ttcttgcctt ctcaagatct ttgacttgtt caaatacctt tgaatcaatg ccattatagt
2100tgtgttgaga gccattgtat tgcattgctt gtttgctctc tcattgtcgg tggggttagc
2160ggggtcaagg atcacaaaat cattcttcgt gacttctcac actctatcat tgattgatcc
2220aagatacatt ttcatttttc tcttccaata gtcaaaagaa cttgctccat caaagaacgg
2280tggtttgccc ccaacatggt tgaacactat ttgagccata atttgagcat cgaggttgtt
2340aagccttcac aaaactgtga ccacggctcc aataccactt gaaaggtcct aatatggcta
2400gaggggaggt gaatagccta tttaaaaatc tacaaactca tagagcaaga ggttagtaga
2460taacaagcat agctttttga tctagctcta aaggggtgtt tgccagccac ctatccaaca
2520attctagttg ctatgatcac tatgcacaca agagctatgt cactacttac actagagagc
2580tatctaaagt ttctatacaa gtaagtaagc tactctagtt tgcgggaatg taagagagag
2640atggtttgat ctttataccg ccgcatagag gggatgaacc aatcaataat atgaagtcca
2700atcaccagga gaaatccact gaacaatcac aatggagaca cacaattttc tcccgaggtt
2760cacgtgcttg ctggcatgct acgtccctat tgtgtcgacc aacacttggt ggttcggcgg
2820ctaagaggtg tagcatgaac cttgtcctca ctaggacacc gtaagaactg acccacaagt
2880gaggtaactc aatgacacga gcaatccact aaagttacct ttcggctctc cgcagggaag
2940gtacaagacc cctcacaatc actaggagat agcgacgaac aatcactaac tcgtgccaat
3000gctcctccac tgctccaagt cgtctaggtg gcgcaaccac caagagtaac aagaaaaccg
3060cagccaaatc gatccccaag tgccactaga tgcaatcact caagcaaatg cacttggaat
3120cactcccaat ctcacaaaga tgaataatct atgaaggaga tgagagggag gtgtttgctt
3180aggctcacaa ggattcaagt atgctagaat gccaagagag tgagccctaa gtcggccaat
3240aactatttat aagctcctca aaacaaacag agccattggc tctttcactg ggctaaaaac
3300ggggtcactt gatgaaccac agggggcatc ggacgctcaa cccctgtgtc tggtgctcta
3360gaaacagcca cgtgttgccc tatccacttt cgcatgttga tatctaacgg tcacctgcag
3420agcaccggac gcgctcaagt tggcaccgga cgcgtctcgt actcaccgga cttaaactca
3480gggaggtctg caaactcgcg gggtcactgg acgctgagca cggactgtcc ggtgctcact
3540ggtctcatac ctagagagca ttgcaaaaga gtagaacact ggactcaaaa caccggacgc
3600tcaaatagga tctaacctgc gtccggtgtt gggcgtccgg tgcctaaccc tagctgagct
3660agacactgcc tacacaccgg atgcacagac acagcgtctg gtgcctctga gccagcgtcc
3720ggtgagtgtt tctcagcgag aaacacaccc gcaacttctc caattttccc acctgtgcta
3780ttgaaaaatt gcacatcatt ttctctctac ccttcaaact tcacctcctt ttcaaagtgt
3840gccaacacca caatgtgtaa accaatatgt gcacgtgtgt tagcattttc acaaacattt
3900tcttcaaagg agttaagtta gctcactagg ttctaaatgc atgcacatga ataatgacac
3960ctagtggcac ttgataaccg cttagccaaa gaattcccct ctttatagta tggctatcta
4020tcctaaatgt gatcacaccc tctatggtgt cttgatcacc aaaaccaaaa ccctaagcaa
4080tacctttgcc ttgatctcca tagggttttg tttttctctt tcttcttttc taagttgagc
4140acttgatcat cttgtggtca tcaccatcat aatcatgatc atctcttgct ccatcaattg
4200gcatgtacca acctcattaa gtctgcacac acttagtata gaggttagta caagggtttc
4260atcaattatc caaaaccaaa ctagggattt cgacagtttg cgcctaatgc accatagtct
4320aagagaccat tttggacgca cctgttggta ctcctaggtg aagaggctca agtggaagct
4380cagttcggtc tttttggaga tagtgctaat cttgatgcaa gataggtgca cagtttgcat
4440ggaacgtacc atatgctcaa aaatcaattt ggacgcacta gatggaagac ctagatgatg
4500tgtgtcatat ggaatctcgc tttggtctgt ttggagacag tgttggtttt ggtgcaagat
4560atgtgcacaa tttgcaccta atgcaccata ggctgagaaa ccattttaga tgcacctgat
4620cgtactcctt ggtgaagagg ctcaagtgga agctcgg
4657298547DNASorghum bicolor 298tttggagata gtgctaatct tgatgcaaga
tagatgcacg gtttgcatgg aacatacgat 60atggtcagaa atcaattagg acgcacttga
tataactcct tgatgatgtg tgtcatatag 120aatcttgctt cggttcgttt agagactgtg
ttagttttgg tagaagatat gtgcacggtt 180tacgcctaat gcaccatagg caaagaaacc
attttagacg catcagatgg tactcgtagt 240tgaagagcct caagtggagg cttgatttgg
tctgttcaga tatagtgcta tcttgatgca 300agatagttgc acagtttgca tggaacgtgg
catatgttat gaaatcaatt tggatgcacc 360taatggaact cctagatgac gtgtgtcata
tggaatcttg ctttggtctg tttggagata 420atgttagttt cggtgcaaga taggtgcatg
gtttgtgcct aatgcactat agtctaagaa 480accatttttg acgcacctgt ttgtactcct
agatgaagag gctcaagtgg aagctcggtt 540tggtctg
547299535DNASorghum bicolor
299ctatctccaa acaaaccgaa gtgagattcc atatgacaca tgtcgtctag gagttccatc
60tagtgcatcc aaattgattt ttgagcatat ggtatgttcc ttgcaaatcg tgcacctatc
120ttccgtcaag attagcacta tctctaaaca gaccgaaccg agtctccact tgagcctctt
180cacctaggag taccaacagt tgcttccaaa atggtttctt agactttggt gcattacgcg
240caaaccgtgc acatttcttg caccgaaact aatactgtcc ctaagcacac caaagtgaga
300ttccatatga cacacgtcat caaggagatc tatcgggtgt gtccaaatta atttctaatc
360atatggtacg ttccatgtag accgtgcatc tatcttgcat caagattagc actatctcca
420aaaagaacaa accgagcttt cacttgagcc ttttacctag gagtaccatc gggtgcgtcc
480aaaatggttt cttagcctat gctgcattat gtgcaaacct tgcacgtatc tgagc
535300232DNASorghum bicolor 300agatagtgct aatcttgacg caagataggt
gcccggtttg catggacata ccatatgctc 60aaaaatcaat ttggacgcac ccgatagaac
ttctagatga cgtgtgtcat atggaatctc 120acttcggtcc atttagagat agtgttagtt
ttggtgcaag ataagtgtac ggtttgcgcc 180taatgtacca taggctcaaa aaccattgtg
gaagcacccg atggtactcc ga 2323011187DNASorghum bicolor
301tgccagatag atgcacggtc tgcattgaac gtaccatatg cttagaaatt aatttagaca
60cacccaatag aactccttga tgacatgtgt catttggaat ctcgctttag tgtgcttaga
120gacagtatta gcttcggtgc aagatatgtg cacggtttac gcctaatgca ccaaagtcta
180agaaaccatt ttggaagcac ctgttggtac tcctaggtga agaggctcaa gtggaggctc
240ggtttggtct gtttagagat tgtgctaatc ttgatgcaag ataggtgcac gatttgcatg
300gaacgtacca tatgctcata aatcaatttg gacgcaccag atggaactac tagatgacat
360gtgtcgtatg gtatctcact tcggtccatt taatgacagt gttagtttcg gtgcaagata
420ggtaaccggt ttgcacctaa tgcaccatag gataagaaac cattttggac acacccgatg
480gtactaatag gtcaaggggc tcaagtgaaa gcttggtttg gtctatttgt agattagtgc
540taatcttgat gcaagataga tgcacggttt gcatggaaca tacgatatgc ttagaaatca
600attaggacgc accggatata attccttgat gatgtgtgtc atatggaatc ttgcttcgga
660tcgtttagag acagtgttag ttttggtaga agatatgtgc acggtttacg cctaatgcac
720cataggctaa gaaaccattt tagacgcacc tgatggtact cgtagttgaa gaggctcaag
780tggaggctcg atttggtctg ttcggatata gttccaatct tgatgcaaca tagttgcaca
840gtttgcatgc aatgtaccat atgttaagaa atcaatttgg acgcacccaa tggaactcct
900agatgacatg tgtcatatcg aatctcgctt cagtcttatt ggtgacatgt tagtttcggt
960gcaagatagg tgcatatttt gtgcctaatg caccatagtc taagaaacca tttttgatgc
1020gcctgtttgt actcctagat gaagaggcac aagtggaagc tcggttcggt ctatttggag
1080atagtgctaa tcttgatgca agataggtgc acggtttgca tggaacatac catatgcttg
1140gaaatcaatt tggatgcacc cgatggaact ccttgatgac gtgtgtc
1187302402DNASorghum bicolor 302aagtcatatg gtatgttcca tgcaaaccct
acacctacct tgcataagga ttagcactgt 60ctccaaactg accaaaccaa gcttccactt
gagcctcttc acccaggagt actaaatagt 120gcgtccaaaa tggtttctta gactatggtg
cattaagagc aacctgtgca cctatcttgc 180actaaaactt acaatgtcta caaatgaacc
gaagcaagat tcaatatgac acacgtcatc 240taggggttcc attgggtgcg tccaaattga
tttctaagca tatgttatgt tccttgcaaa 300ccatgcacct atcttgcatc aagatttgca
ctatctccaa acaaatcaaa ccgagcttcc 360actgagcctc ttcaccgaag taccaacagg
tgctaccaaa at 402303999DNASorghum bicolor
303agaaatacaa gttttggtac gcacccgata gaactcctag gtgatgtgtg tcatatggaa
60tctcacttca gtccatttgt agatagtgtt agtttcggta caagataggt gcacggtgtg
120cgcgtaatgc accataggct tagaaacagt tgtggaagca cctaatggta cttctaggtg
180aagagactag agtgaaagct cggtacgata tgtttggaga tagtgctaat cttcatgtaa
240gataggtgca cagtttgcat ggaacatacc atatgcttaa aagtcaatat agacgcaccc
300aatagaactc ctagatgatg tgtgtcatct gatatcttgc tttggtctgt ttggagatag
360tgttagtttt ggtgcaagat aggtgcgcgg tttgtgccta atacaccata tctctaaaca
420gaccgaacct cttcacctag gagtaccaac aggtgcttcc aaaatggttt cttagacttt
480ggtgcattag gtgcaaaccg tgcacatatc ttgcaccata actaatactg tctctaagca
540caccaaagca agattccata tgacacacat catcaaggag ttatatgggg tgtgtccaaa
600ttaatttcta agcatatggt acgttccaag cagaccgtgc atctttcagt tgagcctttt
660accaaggacg ggtgcgtcca aatagaccaa actgagcttt cacttgagcc ttttaccaag
720gacgggtgcg tccaaaatgg tttcttagca tatgttgcat tatatgcaaa ccttgcacct
780atcttgcacc gaaactaaca ctgtctccaa acagaccgaa gcaagatttc gtttgacaca
840cgtcatctag gagttccatc atgtgcgtcc agattgattt ccaagcattt ggtatgttcc
900atgcaaactg tgcacctatc ttgcattaag attagcacaa tctccaaaca gactgaaccg
960agcatccact tgagccccct cacctaggag taccaacaa
999304141DNASorghum bicolor 304gatctatttg tagatattgc taatcttgtt
gcaaccatat gcttagaaat taatttggat 60gcacccaatg gaactcctag atgacgtgtg
tcatatggaa tcttgcttcg gtccgtttgt 120agacattgta agtttaagtg t
141305282DNASorghum bicolor
305gaccgaagca agattccata tgacacaagt catctaggag ttccatccga aaagaccaaa
60ttgatttcct aacatatggt acgttccgtg caatatatgc aactatcttg tatcaagctt
120agcactatat ctgaaaagac caaatcaaga ctccacttga gcctcttcaa caacgagtac
180catcggatgc gtctaaaatg gtttcttagc ctatgacgca tttggcgtaa accgtgcaca
240tatcttctgc aaaaataaca ctgtctctaa acgaaccgaa gc
282306118DNASorghum bicolor 306agatagtgct aatcttgatg caagaaaggt
gcatggtttg catggaacat accaaatgct 60tggaaaacaa tctagacgca cataatggaa
ctcctacatg acgtgtgtca taggaaat 1183079853DNASorghum bicolor
307atatggaatc tcgcttcggc ccatttggag acattgttag tttcggtgaa agataggtgc
60acagtttgca cctaatgaac catagtctaa gaaaccattt tggacgcact tgtttgtact
120cctaggtgaa ggggctcaag tggttgctcg gttcggtctg tttggagata gtgctaatct
180tgatgcaaga taggtgcacg atttgcatgg aatataccaa atgcttggaa aacaatctgg
240aagcacatga tggaactcct agatgatgtg tgtcatacga aatcttgctt cggtctgttt
300ggagacagtg tgtcgagggt atcagtaagg ggtatcctaa ctgatacaca taatgagatc
360ccccatacgc aggtcgaggc cctcaactcg acgctctagt ctgtcatacg aacagtcatc
420gacaaccgca gcctcgaaga cagaaaaggc gacgagcgaa tcgatcaggg ctcgagcgcc
480gctaccgtcg aatacggaag taggctcgag cgaaccagga ggcgtcgagc gcaaggactc
540cgcccgcctt gacgcgcgca tgggaagcag ggcatttaat gcgcctgtcg cattctcacc
600taacacgcca gtcacgggaa gcgtgatagg gaacaggcat ccgtcccgtc attctttttg
660cagccttccc cgccaaacaa cccagagcgt gtcaggacac gggaagtagg gatggaacgt
720ctagtcagga ccccctcgag acatccaggg tcaacgctct agacagcgga gcgttacaca
780gcggccaacc ctcgaccaga cccgtttcct tccaggggaa gggttggacg tcgaatggca
840acaaggcaac cactccgacg gatccgctgt tatatcgtac aggcgtgccg ccaatgaacc
900agtccaagac ggcatgcaga gcaggattcc agggcacgcg taatcatcat caagctacga
960ggacgggatg gctcgaaacc acgccggtac ggaggcctcg agtaggtccg cgcgccatgc
1020ctctatcgac ccctactctc acacctatat ttgtaccctg ggcttctcct tggaactata
1080aaaggggaag cccgggagcg gactcaggag gagatcaacc agacaacacc acactcatac
1140gcagtagaac ttccatactc cataccacgc ttgtattcac ccctgtacaa gcacttaggt
1200gcaaaataat acaaactctc tccccccagc tggacgtagg gccttctctt gcccgatcca
1260ggataaatcc ttgtgtcttt ttgcatcacc atcagggaaa gggaagcacg catacaaatt
1320cactcgttgg tgtgaccccc cgtggagaaa acaccgacag ttggcgcgcc aggtaggggt
1380cctgcgtgtt ttttcatcga tttcccattc tttcccagat ggccactctt tcttcgccaa
1440ttctgcgctc ctcggtgatt tggttctgga gtctcgagtt tatgtccact ggctccggct
1500atgacatgat cttgctctcg atcaaaggac cgggaggagc ccgcgtcgcg ccaacccgga
1560tgagggcccc gagacgtcct cgccaccacg cctccccgcc caagaagagg cgtggacagc
1620accatcatcg cccttctgcc tcgccacgac cggcagtccg tgccaggtag gaggcggcac
1680gagagccgat agccttgtgc accacgggca cgacggctca gcgccgagag gatcacacga
1740tgacgggagg tgacgcgccc cgcgcactct cccccaccgc taggctactt ccacatgggt
1800tgttcgcctc aagaggagcc ttgccgatcg gcatggacaa tgccgcagcg tcactcgcta
1860gggcgatatg cccaaacgcc cagacgtacg tggagagacc aatggtcctc ccacgcaacc
1920ctgaagcaca gcaaccgacg tccgagctac cggatttctc ccaggtacga ggcctccgcc
1980gcctgggccc tggacgctac acgattacct cccttagaca gcggctgctg gaggaaggac
2040gcggatgttt ctacgccacc gagccggact ccggctcgga atccgacagc cacgacccta
2100ctagggagtg cttccatatc gacggtgcgg tggaaaccac cgacgagaca caagacgccg
2160ttgtaggtgg gtgggcccct gcagcgaggg aagaccccgg acgcctggga atgacggtca
2220agtcgatccc cctctgcaag aagacagagc cgtgcaaatt gcgcagctac gagaactcaa
2280ggcaaagctt gacgacgacc gcgagcgcct tgtcctgctt gagcagatcc tcgagcagga
2340cctgccttac ccgcctggtg ggagtgtccg cagacgagct cgagaggtat accgacagat
2400cgttggagac acagagccag aacagcccgt cagccatttc cctcgagcag gccagaacgt
2460cgtggcagca acaatgctac tgcgaaacat gccggagccg tcgaactccc aagctcgacg
2520cattcgagac gaggtgcaga cactactcca ggtggcggca gttcaacagg ccgaaagctc
2580ggcttctcga cgtcgaggag ctgccactaa aaagcgcgat gagccagccc aaaacgaaaa
2640ggaggtgtcg gtccatcagc agccgcctcc tcgaggaaaa aagaccatgc tcattctccc
2700cgtcgacaat cagcgtcgac acgacgcgcg atgtgacatt gaagagaatc gacgccgtcg
2760gtacggggac gcggaagagc gcggttacag tgcccatcgc ggtgggaggt acgacagtga
2820tgaagatcgg atggctctag agccaccagg cccacgggta ttcagcaggg caatccgcag
2880cacaccgctg cccagcccgt tccgaccccc gaccagcctc gcgaagtaca acggagagac
2940caacccggag ctgtggctgg cagacttcag gctggcctgt cagctgggag gtgctcgagg
3000agacgatcga gccatcatca gacagctgcc actcttcctt tccgacaccg ctcgcaggtg
3060gcttgaagag ctcccggccg atcagatcca tgactaggtc gatctggtta gagttttcga
3120aggtaacttc aaagggacct acatacggcc tgggaactcg tgggacctca gcaagtacaa
3180gcagaagtca ggagaaactc ttcgagagta tgctcgatgc ttctcaaagc aacacaccga
3240gctaccgcac atccccaatc acgaccgcac atcaccgagg taccaccagt cgagacttgg
3300tacgggaatt aggccgaaat caccctcagt tcgtcgacga gctgatggat atggtggcca
3360actacgcggc aggagaagaa gcggtcggcg ccttcttcag ctgtgaagaa aggaaaggca
3420agcagcccgc cgacgatggt gaaggcccca gtcgagggcc caagaagagc aagaagaaga
3480agaagacccg gccgttccaa cgggaagacc tcgacgacga tctcgtcgct gccatggaat
3540gctaaaagcc tcgaggcccc ccagatgggg gcatctttga taaaatgcta gaagagtcgt
3600gccctttcca taagggagga gccaaccaca agctcaagga ctgtcgtatg ctgagaaagc
3660atttcgacgg tctggggttc aagaaggacg cgcgcgatga cccaaagaaa gagaagggcg
3720gcgaaaagga ggacgacaaa gacgacggtg gtttccctgc cgtccatgac tgctacatga
3780tctacggcgg gccctcgacg cagctgaccg caaggcagtg caagagggaa cgccgtgagg
3840tcttcgcggc gaggatggcg gtgccccagt acctcagctg gtcgagcacc cctatctcct
3900ttgatcgaga ggaccacccc gacaaagtag ctgcccctgg cgtctacccg ctcgtcgtcg
3960accctatcat catcaacacc cggctctcaa aggtaccgat ggacggtggc agcagcctta
4020acatcatcta cctcgagacc ctcgacctcc tcggcatcag cagggcacag ctccaaccaa
4080gcgccggcgg cttccacggc gtcgtactag gaaagaaggc gctgccggtt ggtcgaatcg
4140atctaccggt ctgttttggc acggcggcca acttcagaaa ggagaccctc acctttgagg
4200tggtggggtt ccggggcatg taccacgcca tcatcggatg accgggttac gccaaattca
4260tggctatccc caactacacc tacctaaagc tgaagatgcc cggccccaag ggtgtcatca
4320tagtcagctc ctccttcgag cacgcatatg agtgcgacgt cgagtgcgtc gagtatgggg
4380aggcagtcga gagttccacc gagctcgcct caaaactcga ggccctggcc gctgaggctc
4440cagagcccaa gcgccacgca ggcagcttcg agccggcgga aggaaccaag aagatcccac
4500tcgaccccaa caactccgat ggcaagatgc tgacgatcag cgctgacctt aatcccaaat
4560aggaagccgt gctcgtcgac tttctccgtg caaacgccga catatttgca tggagtcctt
4620tggacatgcc tggcataccg agggaagtcg ccgagcactc cttggaaatt cgagccggtt
4680ccaagccagt gaagcagcgg ttgcgccgat tcaacgagga gaagcgcaag atcattggtg
4740aggagatcca aaagcttttg acggccggat tcatcaagga ggttcaccat cccgactggt
4800tagcaaatcc tgtactagtt aagaaaaaga atgggaaaat gaggatgtgt gtcgattata
4860caagtttaaa taaagcatgt ccgaaagttc cttttccatt acctcgtatc gatcaaattg
4920ttgactcaac tgcgggatgt gaaacccttt ctttccttga tgcatattct ggttaccatc
4980aaataaaaat gaaagagtcc gaaattcatt acaccttttg ggatgtattg ttatgtgacc
5040atgccgttcg ggctttgaaa cgcgggggcc acatatcagc gctgcatgct ccacatgttt
5100ggcaagcaca tagggtcgac agtcgaggcc tatgtcgacg acatcgttgt caagtcgaag
5160cggcagggag acctgatcca ggacctcaaa atcgctttca gctgtttacg cgcaaaccag
5220atcaagctca accctgagaa gtgtgttttt ggcgtacctc ggggcatgct cctgggttac
5280atcgtttccc agcgtggcat cgaggccaac cccaagaaag tctcggccat cacaagaatg
5340gggccgatcc gagacatcaa gggcgtacaa agggtaacgg gatgcctagc ggcgctgagt
5400cgttttatct caaggttggg agaaaaggcg ttgcccctat accgacttct gaagagagag
5460agcgcttctc ttggacccct gaggccgagg aagccctcga aaacctgaag aaaacgttga
5520cctcagcacc agttctggtc ccacctcaac ctagagaacc gctactcttt tatgttgcct
5580cgacgaccca ggtcgtcagc gtagctgtgg tggtcgagag gcaggaggag gggtgtgcat
5640tgcctgtcta gaggccggtc tatttcgtca gcgaggtact ctcggagacc aaagcgcgtt
5700acccacagat ccagaagctg atctacgccg taatcctcgc ccgacgcaag ctgcagcact
5760acttcctcgg ccatcctatc acagtggtct catccttccc cttaggagag atcatccaaa
5820gttgagaagc cacgggaaga atcgccaaat ggtcggtcga gctcatgagt gagactctca
5880cttatgcgcc ccgtaaggct atcaagtcgc aagcccttgt ggattgcgtc gcggaatgga
5940cagactccca gctccccccg gcctaggttc aggcggagct gtggacgatg tacttcgacg
6000ggtctctcat gaagacagga gctggggcgg gcctgctatt catttcgccg ctgggcatcc
6060atatgaggta cgtcgtcagg atacactttg ccgcatccaa caatgttgca gagtacgagg
6120cccttgtcaa tggtctgaag atcgccatcg agctgggagt ccgacgcctc gatgttcgag
6180gcgactccca gctcgtcatc gaccaagtaa tgaaagcctc gaactgtcac gacccaaaaa
6240tggaagcata ctgcaaggag gtccgtcgac tcgaggacaa gttccacggc ctcgatctcg
6300tccacgtcgc ccgatgctaa aacgaggcag ccgacgaact cgccaggatt gcgttgaccc
6360gaggcacggt ccctcctgac gcgttttcaa gagatctaca cgagccatcc atcgacctgg
6420gctcgggggc tgacatcgag accgctcctg cccagcaaac caacgccgtc gaggcactac
6480taatggcggc tgaggtaatg gaagtgcggc ccggtcgacc gttcgattgg cgcacgccgt
6540tcctcgactg cctgatccgc tgcgagctgc tataagatcg atctgaggcc cgccgtattg
6600ctcggcgggc caagtcatat gtaatctatg gcgatgacaa ggagctatat cgacgaagcc
6660cgataggggt cttgcagcgt tgcgtcacca tagaggaagc cggaaactcc tcgaggatct
6720acactcgggg gcttgtgggc accatgctgc tccacggacc cttgtaggga acgccttccg
6780acaaggcttc tactggccaa cggtcgtagc tgacgccatc gagctcgtac gctcatgcca
6840cggatgccaa ttctacgcca agcagacgca cctgcctgcc cacgctctcc agatgttccc
6900gatcacatgg ccgttcgcgg tatgggggct cgacttagta gggcctctac aaaaggcaaa
6960aggggggtac actcacttgc tggtggctat cgacaagttc tccaaatgga ttgaggctcg
7020acccatcacc aacatccgtt ccgagcaagc cgtccttttc ttcaccgaca tcatccaccg
7080gtttgggatt cccaacgtca tcatcaccga caacggcact cagttcaccg gcaaaaagtt
7140cctggccttc tgcgatcagc atcacatcca tgtgaactgg tctgtagttg cccaccctcg
7200aactaacgac caggttgagc atgccaacgg catgattttg caggggctca aaccgagaat
7260ctataatcgc ttgaagaaat tcggcaagaa atgggttgag gagctttcct cagtcctatg
7320gagcttaagg actacgccaa gcagggccac aaaatacacc ccatttttca tggtctatgc
7380tctgaggccg tgctcccgat ggatctcgag tatgggtccc ctcgactcaa agcatacaac
7440gagcaatcaa ataaggagac tcaagagaat gcggtcgacc agctcgagga agctcgagac
7500atggccctcc tcaactctgc caggtaccag cagaagcttc gacgctacca cgacaagcac
7560gtgcgcaaga gggacctgaa cgtggccgac ctcgtcctac gacggcggca aagcaatcaa
7620ggacgccaca agctgactcc accttgggaa ggcccatacg tggtggccga ggtcttgaaa
7680ccgggaacat acaagctcgc agacgaaaag ggagcgatct tcaccaacgc gtggaacatc
7740aaacagctac gtcgattcta cccctagaat ttcaaagctt tatgttccca cgtacattct
7800gtaccgaggt tttgtaaatg aatgcatgaa taaataaagt ctttccctcg agcaatttgc
7860tttttcacga gtcaaaatct tgacgattag aagggggtac cgactatgac ctatcatagt
7920cgatacctcc tcgggggcta gcaggagggc gaccccccca ggtgtcgaaa aaaccaagta
7980accctttcgt tcccatcagt aatctcgtgt ggttgagtag taaaggtacc tcgagcccct
8040tacgggccga gaaacgacga gcctgagatc tcctacgccc ccgggctacg gaaactctac
8100tcgcctcctc acccttaagg caatcgagac cgccttaaac aaaagaccga gcgggaaaaa
8160acaaacatag gcgcaagaag aagtaaagga gcctcgagcg gaaagacaga taaacatttg
8220acaaccattt aaaagacgtc atactactta aagatagagt agcaaagtac tgtacaaagg
8280ggcctaggca cccagagcag gctcgcaggc ctcagtccgc atcatggtcc tcaccgccct
8340cgcctgcgcc cgagctagtc tcaggaggca gtacctctgg ctcgaacagc ctagccaacc
8400tctctccagg aacctctgcg tcgtcgatca aggcgtggag cctctcttcg ttctccacgt
8460cggtcttgga tatgtcggtg acgaagccgt gggacaccac ctccatgtca taggagaagc
8520ctggacagac aacggccatt gcccgcatca ccccgatgtg gagagcgtcc cgcacccgat
8580ccatcagcat cgcgcctaag tagcacagct gatcgaccag cgcgtcgcct cgagcttcct
8640ccatcgactc gtccatgggc tcgacctccc aagaggtcga gagatcgctt atcgccgtcc
8700gaagtcgacg gttcaaaaca acctccccct cgagctgggc tttggcattg cgagcctcga
8760cttgggcggc caagagcttg tctttaaggc ctgtgaatgc aaaaagcttt agataaaacc
8820aagcacatct cgaaaagaaa atctgacaaa agaaacacac cgcggacact ctcctccaga
8880gttgtgttct cgctaactaa tttggtgttg gccctctcga tctctttgtt ggagcgtgcc
8940agctcagtat tggcaacacg aagatcctcg atcgccttgc cagcctgagc aatagctcca
9000ttcttctcca acagctctcc cttcaaacga tcgacgtcgt cagagaggct gtgggatcga
9060gcccgcttcg cctcgaggtc ttcaatggcc tttttctttg cctcctccgc ggcacccctg
9120gcgacctcac tgtccacata ctccaccttt atcctccgga aggattctcg gatggagtcc
9180aggtcggtct tgaggaggcc cttctccttg tccaggtcag caacagtgct cttataggac
9240agagcctcct ctcgggcttt ggacccggcc tcctcgaccg tcagaacccg ctccctcagt
9300agcacggcct cttccacagc cttctttgag aactctcaag cctcggctag ggcagcctcc
9360ctctccttct tctcctcctc gagcgccaag agctctttat aagctttgag gagtttatcc
9420tgcgactcca ggggttggtc tttgaggagg gggagctgct cctagtcgcc tctcgtggca
9480tgaatgaagc ctgacttaat acgggaggtc tctttcaggt cctgcgagcc ggtgatcgag
9540cagttagaac ccaaaaccaa cgagttccct aaaggttaaa agactaggag acttacaaag
9600taggccggcc cgagcccgtt gtttacaacg tccgacaaga gccccaccgt gtgcttcatc
9660caaaggcgga gctcctcgac gtgctcccac ttctcggctt ccttccgatc atccagaaaa
9720atgttgggct tggacggatc cgtggaagca cggatgcgga tccagcggcc gcataagttc
9780gtatagcata cattatacga agttatctgc taccttaaga gaggcccaaa ctcggaccgt
9840cctaggaagt acg
98533081397DNASorghum bicolor 308aggtgaaggg gctcaagtgg atgctcggtt
cggtctcttt ggagatagtg ctaatcttga 60tgcaagatag gtgcacggtt ttcatggaac
ataccaaatg cttggaaatc aatctggacg 120cacatgatgg aactcctaga tgacgtgtgt
catatgaaat cttgcttcgg tctgtttgga 180gatagtgtta gattcggtgc tcagatacgt
gcaaggtttg cacataatgc agcataggct 240aagaaaccat tttggacgca cccgatggta
ctcctaggta aaaggcttaa gtgaaagctt 300ggttaggtct atttggagac agtgctaatc
ttcatgcaag atagatgcac ggtgtgcatg 360gaacgtacca tatgcttaga aattaatttg
gacacacccg atataactcc ttgatgacgt 420gtgtcatatg gaatcttgct tttgtgtgct
tagagagagt tttagttccg gtgcaagata 480tgtgcacggt ttgcgcataa tgcaccaaag
tctaaaaaac cattttggaa gcacctgttg 540gtactcctag gtgaagaggc tcaagtggtg
gctcggttcg gtctatttag agatagtgct 600aatcttgatg caagataggt gcacgatttg
catggaacat accatatgct cagaaatcca 660ttttgacaca ccagatgaaa ctcctagatg
acatatgtca tatggaatct cacttcggtc 720tgtttagaga cagtgttagt ttcggtgcaa
gataggtgca tagtttgcac ttaatgcacc 780ataggataag aaactgtttt ggacgcaccc
gatggtactc ctaggtcaag ggactcaagt 840gaaagctcgg tttggtatgt ttggaggtag
tgctaatctt gatgcgagat agatgcacag 900tttgcatgga acatatgata tgctcaaaaa
ttaactagga cgcacctgat ataactcctt 960gatgatgtgt gtcatattga atcttgcttc
ggtacgttta gagacagtgt tagttttgat 1020agatgatatg tgcacggtat acgcctaatg
caccataggg taagaaacca ttttagacgc 1080acccgatgat actcatagtt gaagaggctc
aagtggaggc tcgatttggt ctgttcggat 1140atagtgctaa tcttgatgta agatagttgc
acaatttgaa tggaacgttc cacttgttaa 1200gaaatcaatt tggacacacc caatggaact
cctagatgat gtgtgtcata tggaatctca 1260ctttggtctg tttggagaca atgttagttt
cggtgcaaga taggtgcata gcttgtgcct 1320aatgcaccat agtctaagaa accattttta
acgcacctgt ttgtactcct agatgaagag 1380gcaagctcgg ttcagtc
1397309262DNASorghum bicolor
309aagctcggtt cggtctattt ggagatagtg ctaatcttga tgcaagatag ttgcacggtt
60tgcatggagc gtaccatatg ctcgaaaatc gatatggacg caactgatgg aactcctaga
120tgacatgtgt catatagaat ctcgttttgg tctatttgga gactgttagt tttggtgcaa
180gatattgcac ggtttgtgcc taatgcacca taggctaaga aaccattttg gagcacctgt
240tggtactcta ggtgaagagg ct
262310671DNASorghum bicolor 310ctaggtgaag aggctcaagt ggaggcttgg
ttcggtctgt ttagagatag tgctagtatt 60gatgcaagat aggtgcacga tttgcatgga
aactattttg gatgcactat ttggtactcc 120tgggtgaaga ggctcaagtg gaagcttggt
tcggtcattt tggagatagt gctaatcctt 180atgcactgta ggtgcatggt ttgcatggaa
cctaccatat gcttggaaat caatttggat 240gcacccgatg gaactccttg ttgacctgtg
tcatttggaa tctcgcttca gtccatttgg 300agacattgtt agtttcggtg aaagataggt
gcatagtttg cacctaatgc accttagtct 360aagaaaccat tttggacaca cttgtgggta
ctcctaggtg aaggggctca agtggatgct 420cggttcgatc tgtttggaga tagtgctaat
cttgatgcaa gatagatgca cagtttgcat 480ggaacatacg atatgcttag aaatcaatta
ggacgcacct gatataactc cttgatgatt 540tgtgtcatat ggaatcttgc ttcagttcgt
ttagagaaag tgttagtttt ggtagaagat 600atgtgcacgg tttacgcctg atgcaccata
ggctaagaaa ccattttaga cgcacccgat 660ggtactcgta g
671311338DNASorghum bicolor
311agaaatcaat taggacgcac ctgatataac tccttgatga tttgtgtcat atggaatctt
60gctacggttc gtttagagac agtgttagtt ttggtagaag atatgtgcac ggtttatgcc
120taatgcacca taggctaaga aacaatttta gacgcacccg atggtactcg tagttgaaga
180ggctcaagtg gaggctcgat ttggtctgtt cggatatagt gctaatcttg atgcaagata
240gttgcacagt ttgcatgcaa cgtaccatat gttaagaaat caatttggac gcacccaatg
300gaactcctag atgatgtgtg tcatatggaa tctcgctt
338312169DNASorghum bicolor 312agtgttagtt tcggtgcaag aaaggtgcac
ggtttgcgcc taatgcacga taggctaaga 60aacaattttg gtcgcaccca atggtattcc
taggtaaagg ggcacaagtg aaagcttggt 120atggtttctt tggagatagt gctaatcttg
atgcaaaata ggtgcacgg 169313102DNASorghum bicolor
313tttgcatgga acgtaccata tgctcagaaa tcaatttgga cgcacctgat ggaactcgta
60gatgacgtgt gtcatatgga atcttgcttt ggtccgtttg ga
102314123DNASorghum bicolor 314ttgaagaggc tcaagtggag gctcgatttg
gtctgttcgg atatagtgct aatcttgatg 60caagatagtt gcacagtttg catggaacgt
accatatgtt aagaaatcaa tttggacgca 120ccc
123315397DNASorghum bicolor
315caagataggt gcacggtttg cgcctaacgt gatgtctgtc acatgcaatc tagctttggt
60ctgtttggag atagtgttag attcagtgca agataggtgc acggtttgcg cctaatgcac
120cataggctaa gaaatgattt tgacccaccc gatggtactc ctaggtaaag gggctcaagt
180gaaagctcgg tttggtctgt ttggagatag tgctaatctt gatgcaagat aggtgcacag
240ttagcatgga atgtaccata tgctcagaaa tcaattagga tgcacccgat agaactccta
300gatgacgtgt gtcgtacgga atctcacttt agtccatttg gagatagtgt tagttttggt
360gaaagattgg tgcacggttt ccgcctaatg caccata
397316288DNASorghum bicolor 316gtgcacctat cttgcatcaa gattagcact
atctccaaac aaactgaacc gagcatccac 60ttgagactct ttagctagga gcacaatcgg
gtgcatccaa aatggtttgt tagagtatgg 120tgcattaggc acaaaccatg cacctatctt
gcactgaaac taacactatc tccaaacaaa 180aaaagtgaga tatcagatga cacacatcac
ctaggagttc tatagggtgt gtccaaattt 240attttcaagc atatggtatg ttccatgcaa
accgtgcacc tatcttac 28831713826DNASorghum bicolor
317cctatcttgc accgaaacta acaatgtctc caaatggacc gaagcgagat tccatatgac
60acacgtcgtc aaggagttcc atcgggtgca tccaaattga tttccaagca gcaagacaag
120gtgaaaacct ttgtcagacg agcacaatag gaatttggtc ttcctatcaa gaaagtaaga
180agtgacaatg ggaccgaatt caaaaacact cactcaagtt gaagagtttc ttgatgatga
240aggcatcaag catgaatttt caaccgcgta caccccacaa caaaatggtg tggtagagag
300aaagaataga acacttatcg acatggcaag aactatgctt gatgaataca agacgtcgga
360tatattttgg tgtgaggcca tcaacaccgc ttgccatgcc atcaatcgcc tctatctaca
420caagaaactc aagaagactt catatgagct tctcaccggt aacaaaccca aggtgtccta
480ctttagagtg tttgggtgta aatgcttcat actaaacaaa agacccaaaa cctctaagtt
540tgcacctaaa gtagatgaag gatttcttct tggctatgga tcaaatgagc acgcctatcg
600agtcttcaac aaaactctag gtagagttga agtgtcgata gatgtgacat ttgatgaatc
660taatggctct caagtggagc aagttgatct aagtgttgta ggaaaggaag atccaccttg
720taaggcaatc aagcaaatgt ccatcggtga cattaggcca ctggaaggat aagtctcaga
780aaaggaggat ccaccagctg ttgctgcaca aatttccact gacgtactcg acaaggatgc
840acaacacaca cctactagaa atcagcaggg cggcagtgcc gccccctcca cctcagcagc
900agaccctcct gcttctatat cacaagttga aggaatcaac ctagagccca tttttgaaca
960agaagaagct gaaggttcag aggagtagaa aaagcttgat gagtatccaa gacttccaca
1020aactatacaa caagatcatc ccatcgacaa cattcttgga agcattcgaa aaggggtaac
1080aactagatct catttggtta acttttgtca attttactcg ttttgtctcc tctttggaac
1140cactcaaggt cgaacaggca cttggagatc cggattgggt catggcaatg caagaggagc
1200ttaacaactt tgagagaaat caagtatgga ccttggttga aaggtcaaat accaatgtta
1260ttggaacaaa atgggtcttt cgcaacaagc aagatgaaca tggtgtggtg acaagaaaca
1320aggcaagatt ggtagctcaa ggatttactc aagtagagag attggatttt gaagaaacat
1380atgcaccggt agcaaggctt gaggcaattc gaatgctctt agcttttgct gcccatcatg
1440acttcaagtt atatcaaatg gatgtcaaga gtgcattcct caatggtcca atacaagaat
1500tggtctacgt tgagcaacca ccgggatttg aagaccccaa gtttccaaac catgtgttca
1560aactccgaaa ggcactctat gggctgaaac aagcaccaag agcatggtat gaatgcctta
1620aggaattctt ggttaaacaa ggctttcaca tagggaaagc cgatcctaca ctcttcactc
1680ataaagttcg tgatgaggac atcaacacca gcgatacata tcctacatcc tctccggtac
1740aacctatagc tggtcctaca tcctctccga cacaacctat agctggtcct cttactcgtg
1800ctcgtgcccg tcaactcaac cttcaagtaa gttcagtttt aaactcttgt caatcatatt
1860tagacaatgg agacacgtgc actttcgtgt tgctcaggaa taatggacaa gatcagcaag
1920ggaaggttca actgcattca gaatttgagg caacaccaac ttcaagggct gattgcatat
1980gggaagagtg ataacttaac aaaggtgatt ggagatcagg tccaagcctc cacaacatcc
2040tctatcaagt taccacgtcg cttctaaagc aaggaaacaa agagtccaaa cccaacacgt
2100tttgggagtt ggattcggac tggaaaataa ctctaacttg tatggatcac cacagcgtca
2160tatggactcc aactgggacg ttcctatact tgttggaaag cacataaagt ctactttcca
2220atgggtccaa ccaaatatct atgcggctta tgagtcgggc gcagtccttg ttttcgtgcc
2280gacacctttt tctgttttgg tgctgcgtca ccctattttg gaccaatggc ccatgtatca
2340agttgagtcc attagggatg cgtcctaagg ttggaggatg actctagcac ccctttggtc
2400gtcctcccct ctatttattt acatctagag ccgccatgaa caactgggtt ttgattagat
2460aaagtttagc cttcgctact tgcttgtaaa cgcgcgtgct gatccagccg cccgtcttct
2520tgttttcgaa accccacttc attagagatt gagtttgaaa ccttcattta catctggtaa
2580tttagtactt gttctacttg ttcttgttgg ttcttcgatt gcttgcagga cgagtgccct
2640agtggccggg tgttgcgctc cacaagatcg tgacagccat tggaggcggt gtatcggttg
2700ctaaggcgca gtcttgaggg ctgtagtcgg gccgtgaacg tcatctccat tcactaatcg
2760agttatccag cgcctctcat cgaaagatca ggcgaaaacc ctagtgggtt cacatcagtt
2820ggtaatcaga gcaaggttta tcggtgagag atttccaatt cttcgtgttt gtttttccta
2880tagtccaaaa aaaaagacaa aaaatatagc agatttgttt tccataatcc tataaatcct
2940ttgtgccgtg gctagtacta cttagttagg gctggttgaa tgagtgtttg cttcggtcgt
3000gtccagtgct ggttttagtt tagtccttta gagttttgag ttcttgtcac catctagtca
3060caactgcgtc caatctattg tgggttggaa tttgagtaga tcttgataat aggtgcagcc
3120acttgttggt ctattctgcg tttccatcaa cgtgatccaa aaggaaagat agcacttcat
3180acttgtgcta gatatattga attttgaggt tcaaccttac tctagtgaga gggtgagagt
3240ggtgaagtga tattttgtac tattttgttt atataatcag gttttgaggt tccccaaaaa
3300aaagaaaaga gaaagaaaaa aataaagaaa aaaaaagaaa gaaagaaaga aaaaagaaaa
3360aaaatcaaaa aaagggattg ttgttccttt tgtttccagt agggctgctc attttgtttc
3420tagtggtgtg atgtgttttc cctttgtgtc caggctcgcg tctctagcac ggtctaggct
3480aggaccagca cagtaccacc gttgagcgtt tattcagctc gcttttataa ctaacgtggt
3540gctagttcgt tccttgtttc agcccaccta tagctccaca tactctacag cttgacaggt
3600cttgtgctgc agcaccgata cacttcgtcc attgctgtac acttgttggc agacgacccc
3660tcctgtcaag caaggtatga attggtaaga acttgtgtta caggttgagt gtgagcgact
3720tgctgtagct acatcctagt agttgtaggg cttttatttc ttcacttgcc tttttgttgt
3780ctttgtcttt gaaccatgcc aggggcagat gatggtaacg aaacaccact tacacctcgc
3840actaagggca tcatacaaca ttttgaaaag aaagtgaagc tgcacacaga gggacttgat
3900aacgacttgc aggtgacaaa tgaaaagctg ggacagttgg aggctacgca gattgccaca
3960aacaacaagc tcacaagttt ggaggaatcc attgctagtg tggacaaaag ccttgctgct
4020ctcctaaggc gatttgatgc tttccacaca atggacaaag agaagcataa ggaagaaaac
4080aaggaggaag accgagtgga tggcaattat gatgatgatt acactgctga tacgaaacga
4140gatgatcaag acactcatca tcgacgtcac ctacgtcaca cccgtagagg tatgggtggc
4200caccaccgac gcgaggtaca caataataat gatgctttca gtaagattaa atttaaaata
4260cctccttttg atggtaaata tgaccctgat gcatacatta cttgggagat tgctgttgac
4320caaaagttta catgtcatga attccctgag gatacacgtg ttagggctgc tactagtgag
4380ttcactgatt ttgcttccgt ttggtggata gaacatggca agaaaaatcc taataacata
4440cctagaactt ggaatgcgtt gaaacaagtc atgagggcta gatttgttcc ttcttactat
4500gcacgtgaca tgattaataa gttgcagcaa ttgagacaag gtgctaaaag tgtagaagaa
4560tattatcagg aattacaaat gggtatgctg cgatgtaatt tagaggagga agaagaacct
4620gctatggcta gatttttggg cgggttaaat cgtgaaatcc aggacattct tgcttataaa
4680gattatacta acgtaacccg tttgtttcat cttgcttgta aagctgaaag ggaagtgcag
4740ggacaacgtt ctagtgccaa atctaacaat tctgcaggga aatcctggca acaacgcaca
4800tctgctacat tgtcgggtgg tgtacctctt ccatcaagcc gatcaatagc tccaccacct
4860tcctacagcg acaaaccaca tgattcttcc acaaatacag caactaaatc agtccagaga
4920ccaatcgcta gtgccacctc ggttaattcc acgggaagaa caagagatgt tcagtgccat
4980cgatgcaagg gatatgggca catgatgcgt gactgcccaa acaagcgagt tatgattgtc
5040agggatgatg gtgagtactc atctgctagt gattttgatg aggatacact tgcactgctt
5100gcgactgacc atgcaggtaa tgaagatcaa atagaagaac atattaatgc aggtgaagcg
5160gaccactatg agagcttaat cgtgcagcga gtgcttagtg cacaaatgga gatggcggag
5220caaaatcagc gacacatttt attccaaaca aagtgtgtca tcaaagagcg ttcttgtcgc
5280atgatcattg atagaggtag ctgcaacaac ttggcaagca gcgatatggt gcagaagctt
5340gccctcaaca ccaagccaca cccgcatccc tactacatcc aatggctaaa caacagtggt
5400aaggcaaagg taactagact tgtgagaatt aatttttcca tcggatccta caaagatatt
5460gttgaatgtg atgttgtgcc tatgcaagct tgtaacattc tgctaggtag accttggcaa
5520tttgatagag attctatgca tcatggtaga tcaaatcagt attcttttct ataccatgat
5580cgcaaaattg tgttgcatcc tatgtcccct gaaactatta tgcaaactga tgttgctagg
5640gctactaaag caaagagcaa gagcaataaa aatgataaat ctgtaattgg taacaaagat
5700gagataaaac tgaaaggacg ttgtatgatc aaatcagata ttaatgagtt caatgcatcc
5760acttctgttg cttatgcttt gatatgcaag ggtgctttga tttcaattga ggatatgcaa
5820tgttctttgc cccctgctgt tgctaacgtt ttgcaggagt attctgatgt gtttccaagt
5880gaggtaccag cggggctgcc tccactacgc gggattgagc accaaattga tcttattcct
5940agagcagttt tgccaaatcg tgcaccatac aggacgaacc cggaggaaac aaaggaaatt
6000cagcgacaag tgcaagaact actagacaaa ggttatgtcc gagaatctct tagtccttgt
6060gctgttccag taattttagt gcctaagaaa gatggaacat ggcgtatgtg tgttgattgt
6120agagctatta ataatatcac cattcgatat cgacacccta ttccacgatt agatgatatg
6180ctagatgaac tgagtggtgc tgttgtgttt tcaaaagttg atttacgtag tgggtaccac
6240cagattcgta tgaaattagg agataaaggt agaaagcaat tgattctgga acctggggat
6300ttggtttggt tgcatttgcg aaaagataga tttccagaac tgagaaaatc caaattgatg
6360cctagagctg atggtccttt taaagtgcag caacgaatta atgagaatgc atataagctt
6420gatcttcctg cagattttgg ggttagtccc acatttaaca ttgcagattt gaagccttat
6480ttgggtgagg aagatgagct tgagtcgagg acgactcaaa tgcaagaaag ggaggatgat
6540gaggacatca acaccagcga tacatatcct acatccacca gcgatacata tcctacatcc
6600tctccggtac aacctatagc tggtcctaca tcctctccga cacaacctat agctggtcct
6660cttactcgtg ctcgtgcccg tcaactcaac cttcaagtaa attcagtttt aaactcttgt
6720caatcatatt tagacaatgg agacacgtgc actttcgtgt tgctcaggaa taatggacaa
6780gattagcaag ggaaagttca actgcattca gaatttgagg caacaccaac ttcaagggtt
6840gattgcatat gggaagagtg ataacttaac aaaggtgatt ggagatcagg tccaagcctc
6900cacaacatcc tctatcaagt taccacgtcg cttctaaagc aaggaaacaa agagtccaaa
6960cccaacacgt tttgggagtt ggatccggac tggaaaataa ctctaacttg tatggatcac
7020cacggcatca tatggacacc aactgggacg ttcctatact tgttggaaag ctcatgaagt
7080ctactttcca atgggtccaa ccacatatct atgcggctta tgagtcgggc gcagtccttg
7140ttttcgtgcc gacacctttt tctattttgg tgctgcgtca ccctattttg gaccaatggc
7200ccatgtatca agttgagtcc attaggaacg cgtcctaggg ttggaggacg actctagcac
7260acctttggtc atcctcccct ctttttattt acatctagag ccgccatgaa caactgggtt
7320ttcattagat aaagtttagc cttggctact tgcttgtaaa cgcgcgtgct gatccagccg
7380cccgtcttct tgttttcgaa accccacttc attagagatt gagtttgaaa ccttcattta
7440catctggtaa tttagtactt gttctacttg ttcttgctgg ttcttcgatt gcttgcagga
7500cgagtgccct agtggtcggg tgttgcgctc cacaagatcg tgacagccat tgcaggcggt
7560gtatcggttg ctaaggcgca gtcttgaggg ctgtagtcga gccgtgaacg tcatctccat
7620tcactaatcg agttatccag cgcctctcat cgaaagatca ggcgaaaacc ctagtgggtt
7680cacatcagtt cgtaatgata tatttgtgtg ccaaatatat gtcaatgaca taatatttgg
7740cagtactaat catttgtatg ttgaagaatt tagtaggacc atgacgaaga gatttgagat
7800gtccatgatg ggtgaattga agttcttcct tggatttcaa atcaaacaag tgaaggaagg
7860aactttcata agtcaaacca actacactca tgatatgctt aagaagtttc acatggtgaa
7920tgccaagcct atcaaaactc ccattccaac taatggacat cttgatctaa atgaagaagg
7980gacagccgtg gatatcaagg tatatcgttc catgattggc tctcttcttt acttatgtgc
8040atctaggccg gacataatgc ttagtgtgtg catgtgtgct agatttcaag ccaacccaaa
8100agagtgtcac ttagtggctg ttaagagaat cttacgatat ctagttcaca cactgaacct
8160tggcttgtgg tatcctaagg gttccaagtt caatctactt ggctattcgg actccgatta
8220cgccggttgc aaagtagata gaaaaagcac ttcggggaca tgtcaattgc ttggacggtc
8280cttattgtct tggagctcta agaagcaaaa ttgtgtagcc ctttccactg cggaggccga
8340gtatgttgta gccggcgcat gctgtgctca actactttgg atgaagcaaa cccttcaaga
8400tttcggatgt cacctcacca aaatcccaat attatgtgac tatgaaagcg ccataaagct
8460tgcaaacaac cccgtaagtc actcaagaac taaacacata gacatccgac atcatttctt
8520gagagaccac gaagctaaag gagatattga aattcgtcat gtgagcaccg aaaagcaact
8580agccgatatt ttcactaaac ccctcgatga gtcaaggttt tgtgagctgc gtagtgaact
8640aaatatcctt gattctcgta acgtgacttg aaatcctgca catatatgtt tgtcaaccta
8700gcgacatagg caaaatcttg aaaagctgat ttacaagtct ttcaaaacat ttacaaaatg
8760tctcttagta ttatgatcat agtatgtatt gttgtttgtt tgctgatcat aatacttagg
8820gattaacctg tccttatcta aagtaaagaa taggaaaagg atttagctgt acccctgcaa
8880gttggacagg gcgacagtgc cgccctttga atccacaatg gattcgggcc caatttggct
8940gaggcgcggg gcccatctcc ctctacctct tttctctctg gctccgcccc tcctctttcc
9000tttctctccc agacgtgcac agcacctccc ttctccttct tctctcctct agggttaggg
9060caaggtgcaa gtgctcatag ccatgttctc cacggccctt ggcgagttcc cagcctctcc
9120tctcgttgct ggtgactgcc cccctgctca agccttgggc attgctcttc tttcctcttc
9180attcaacacg taagatggat cgagaaggat atgatgctag gggaaagagg aaagtgttgg
9240cgaagcaatt ggctcgcaga ggcaggggga ggacagcggg aggtagtggt gcagcacctc
9300gagccactga tagagcagca catgatcaat atgtgtctga ggaggaaatg aatcaaagga
9360ttggatacac catccccatt atgcatggca caccaaatca cctcacaggg tttgatgaca
9420actatatgag ggacaatgag gatgagtttg ttctcacaga gcctcacaac aatgtgagac
9480atacagtggt tgactatggg aggagttgga aggctactgg tgatgccaaa gagattgatc
9540cctatgctgc agataagctg ttggggttga ttatagattt tcgaatgtgt tccactccaa
9600cttctatgcc acagccatca tgacaaagcc taggggcaaa atctgcaaga tgcaatatgt
9660tgatttcaat gagctgcaag atagaaatga gtttgctgct gccataaaga catgtgacag
9720attccagttg actgatatca tgagtttcag gtatgattgg aatagagaaa tacttgcaca
9780gtttcatgcc acatattttt ggaacaggga tgaggatgag atccactgga tgacagatga
9840taggcattat cgcattgatt ttgtcagctt ctgtcatatt cttggttttg gacagattca
9900tagatctttc agtaggatcc atgatgtgcg tcgccttgag ccacatgagg tgagttttat
9960gtgggaagat cctagcaagg ctgatgggag aaggacaggg ttcaaggcaa tttattacac
10020catgaacaac ctgtttagaa tcactctcaa tcccaaggac aatgcaactg atctaaatgg
10080atatatcact aatgtgttat ctagattccc agagtggtga gagattcaat gtaggaagat
10140ttatttgggt tgagttagct tatgccatgg atgatggaag aaggtcgctg ccatatgcac
10200cttatctgat gttcatgatt aagagggttt taggtcagag attccccaat gactgcattc
10260atggtgttta caacataaag aagacacatg ggggtaaggg gagcagcggg gcagcagccg
10320gttcacccac tagagagaca tcctttgctc acagagatgt ttctgagtcc tctaggagtg
10380gcaggaaaaa gaagagcaag aagctgggga agatgagtga atggatcaag gcctacctgt
10440acctatgctg caaacactgc ttatgaggat cgtttggaga acagagaggc agttagggta
10500gctagggagt tggctggtct tccacctctt gccccagtta ggcctcctcc tcaattccct
10560aacctaccta gtctgtcaga cacctcttct gaggatgagc agccccatgg agatgcatag
10620gagcagcact ttgagcagcc tgatggtgat gatagtgatg atgaggggat ctgggcagga
10680gaggaagatc cacaggttca ggagaccttg ctacgtgtct actctcgacg tcctcgtgac
10740cctttagagg cagctgggcc ttcttcttta gctccacctc gacgttcgct tcacctcgac
10800cactcatgtt cagggacgtg ctagagttga ctccgacagc gacgacgagt gatttctctt
10860cttttctctt ctctttttgg tgtttgatgc caaaggggga gaaaattaga ggggtcaaat
10920tagttttgag atctagctgc gcttcatgtc ctatcttttg agagtagtct ttaggttgtg
10980agagaaaacc gttgaaaaac tctattttat gtattatggc tacctcatct cactatttgc
11040tttggatatg gacttgtttt aagaacaacg gttttattat ttagttcagt tcactgtgtt
11100gtctctcctc tctagtttct gctgtgtttt tctggttgtt gtctgactgt tgggtattag
11160gccggcagtg ccgcccttct aggccggcag tgccgcccta tgcctcagca gccaagcagc
11220tctgttgttt aactatctgt ttgcatcacg tcttgtcttg tgacactttg cacaacaccc
11280catagcaggc atggatgtag ggggaggtcc tcctacttga agatgtaaga ccttgcatca
11340taaagcaagc ttttaggatt caatcctcat ttacatcttc agggggaggc ccctataatc
11400ctggctcgaa aatcttaatt cttgtttata ttgttgaaag ctctaattag gttgtcatca
11460atcaccaaaa agggggagat tgtaagtgga atcaagccct attgtgggtt ttggtgttaa
11520tgacaacaaa attagaagac taacaagttt tatcgagtta atgagcaggg gatcaattta
11580tggaaatgat gtacaggttg ctgatattct aaaaatatgt ttgagctgat ccaaactcaa
11640ggatgtgtta cttcatttta ttttctttat ttgagtttag gaaaagccgt actataaagg
11700ggaattctag aattgttggt caactgtgca accagatgct cgtcttcaca aaacaacatc
11760ctctttctca gccaaagcag cacgcaaaac agtttctttc ttaacctgct ctggccaggg
11820tggcagtgcc gccctctcct gaccgttggg acctagggga tatctttacc tctggacgct
11880cctcacaacg gtcatcacag acctcagatg acttatctct tgctcagaca gaccagagct
11940ctctctctct ctcctccatt gttgccctcc atccctcaag catcaatctt ttaagaaaaa
12000aggtagcaaa actcgattgg agagtagatc cactgattcc caaggtctaa gagcattttg
12060ttcacgtttg gtcggaggtt ctagggtttg ttactcttgg agcttgctcc tagccggcta
12120ggcgtcgccc atgagcttgc cctcttgtgt ggcagccttg ggaggtttgt aaacttgttt
12180tgcagctaag aaattacccc tcacttcaag agttcactct cttgacttga gaacgagggt
12240agggcaagcc tttgtggcaa gcctaagcct agtgtggctt cctcaacaac gtggacctag
12300gcaagccttt gtggtgagct gaaccacggg ataaatcact gagtcttgtg tgcttcttgc
12360agattatttc tcaagttata ctcttctagg gtttggtggc cctatctagt cttgagagac
12420ttttctttga catccagtct tcgtactgga tcttatcttt gttttgcagg attgtgttct
12480tcaccaccta agtttacttc ctccaaagtt gtaccacctt tggtatttta tttgaaggct
12540agcagtaccg ccctctgttc taacagagtt ttgagttgaa tttttgcagg cctattcacc
12600ccccctctag gcctctttag cttccaggag atcctacagt atgttccttg caaatgatgc
12660acctatcttg catcaagatt ggcactatct ccaaatagat caaaccgagc tttcacttga
12720gcctcttcac cgaagtgatg aggacatcaa caccatcgat atatccgcac ctacaccagt
12780tggaataatg gacaagatca gcaagggaag gttcaactgc attcggaatt tgaggcaaca
12840ccaacttcaa gggctgattg catatgggaa gagtgataac ttaacaaagg tgattggaga
12900tcaggtccaa gcctccacaa catcctctat caagttacca cgtcgcttct aaagcaagga
12960aacaaagagt ccaaacccaa cacgttttgg gagttggatt cggactggaa aataactcta
13020acttgtatgg atcaccacgg cgtcatatgg actccaactg ggacgttcct atacttgttg
13080gaaagttcat gaagtctact ttccaatggg tccaaccaca tatctatgca gcttatgagt
13140cgggcgcagt ccttgttttc gtgccgatac ctttttctgt tctggtgctg cgtcacccta
13200ttttggacca atggcccatg tatcaagttg agtctattag ggacgcgtcc tcgggttgga
13260ggatgactct agcacccctt tggtcgtcct cccctctatt tatttacatc tagagccgcc
13320atgaacaact gggttttgat tagataaagt ttagccttcg ctacttgctt gtaaacgcgc
13380gtgctgatcc agccgcccgt cttcttgttt tcgaaacccc acttcattag agattgagtt
13440tgaaaccttc atttacatct ggtaatttag tacttgttct acttgttctt gctggttctt
13500cgattgcttg caggacgagt gccctagtgg tcgggtgttg cgctccacaa gatcgtgaca
13560gccattggag gcggtgtatc ggttgctagg gcgcggcctt agaaggctgt agtcgggccg
13620tgaacgtcat ctccattcac taatcgagtt atccagcgcc tctcatcgaa agatcaggca
13680aaaaccctag tgggttcaca tcacgaagta ccaacacgtg cttccaaaat ggtttctgag
13740actatggtgc attaggcgca aaccatgcac atatcttgca ccgaaacgaa cactatttct
13800aaacagacca aagcgagatt ccatac
13826318682DNASorghum bicolor 318cggtctgttt ggagacaatg ttagtttcgg
tgcaagatag gtgcagggtt tgtgcctaat 60gcaccatagt ctaagaaacc attttggatg
cacttgttag tactcctagg tgaaggggct 120caagtggatg ctcggttcgg tctgtttgga
gatagtgtta atcttgatgc aagaaggtgc 180acggtttgca tggaacatac caaattcttg
gaaatcaatc tagacgcaca tgacggaacc 240ctagatggcg tgtgtcatac aaaatcttgc
tttggtctgt ttggagacag tgttagtttt 300ggtgcaacat aggtgcaagg tttgcacata
atgcagcata ggctaagaaa ccattttgga 360tgcaccggtg gtactcctag gtaaaaggct
caagtgaaag cttggtttgg tctatttgaa 420gacagtgcta atcttgatgc aagatagatg
cacggtcagc atggaacgta ccatatgctt 480agaaattaat ttggacacac ccaatagaac
tccttgatgg cgtgtgtcat atggaatctc 540gcttttgtgt gcttagagac agtattagtt
tcggtgcaag atgtgtgcac ggtttgcgcc 600taatgcacca aagtctaaga aaccattttg
gaagcacgtg ttggtactcc taggtgaaga 660ggctcaagtg gaggctcggt tc
682319146DNASorghum bicolor
319ctccaaacag accgaactga gcttgcactt gagcctcttc atctaggagt ataaacaggt
60gcgtcaaaaa tggtttctta gactatggtg cattaggcac aaactatgca cctatcttgc
120accgaaacta acattgtctc caaaca
146320416DNASorghum bicolor 320gcttcggttc gtttagagac agtgttagtt
ttggtagaag atatgtgcac ggtttacgcc 60taatgcacca taggctaaga aaccatttta
gatgcatctg atggtactcg tagttcaaga 120ggctcaagtg gaggctcgat ctggtctgtt
tagatatagt gctaatcttg atgcaagata 180gttgcacagt attcatggaa cgtaccatat
gttaagaaat caatttggat gcaccaaatg 240gaactcctag atgacgtgtg tcatatagaa
tctcgcttcg gtctgtttgg agacaatgtt 300agtttcggtg caagataggt gcatagtttg
tgcctaatgc accatagtct aagaaaccat 360ttttgacgca cctgtttgtc cttctagatg
aagaggctca actggaagct cggttc 416321126DNASorghum bicolor
321atttttgacg cacctgtttg tactcctaga tgaagaggct caagtggaag cttggttcgg
60cgctttggtc tgtttagaaa tagtgttagt tttggtgcaa gatttgtgca tggtttgcgc
120ctaatg
126322147DNASorghum bicolor 322gcatggaaca taccatatgc ttggaaataa
ttttgggtgc acccgatgga attccttcac 60gacgtgtgtc atatcaaatc tcgctttggt
ctgtttagaa atattgttag tttcggtgca 120agatatgtgc atgctttgcg cctaatg
147323128DNASorghum bicolor
323gcatggaaca taccatatcg cttggaaatc aatttggatg cacccgatgg aactccttga
60tgacatgtgt catatggaat ctcgcttcgg tccatttgga gacattgtta gtttcagtgc
120aagataga
128324177DNASorghum bicolor 324gatctgtttc gagatagagc taatcttcat
gcaagaatga tgcacagtct gcatggaatg 60taccatatgc ttagaaatta atttggacac
acccgataga actccttcat gacgtgtcat 120acggaatctc gctttggtgt gcttacagat
agtattagtt tgctgcaaga tatgtgc 177325196DNASorghum bicolor
325gtttggagat aatgctaatc ttgatgcaag atagatgcac ggtctgcatg gaacatacca
60tatgcttaga aattaattta gacacacccg atagaactcc ttcatgacct atgtcatatg
120gaatctcgct ttggtgtgct tagagacagt attagttacg gtgcaagata tgtgcatgtt
180ttgcgcataa tgcacc
196326274DNASorghum bicolor 326ggtctgtttg gagacagtgc taatcttgac
gcaagatagg tgacggtttg catggacgta 60ccatatgcta ggaaatcaat ttggacgcac
ccgatagaac tcctagatga cgtgtgtcat 120atgaaatctc acttcggtcc atttggagat
agtgttagtt tcggtgtaag ataggtgcac 180ggtatgcacc taatgcacca taggcttaga
aacacttgtg gaagcacccg atggtactct 240ttggagacag tgttagtttc ggtgcaagaa
aggt 274